Re: Remove duplicate stations

From: Dennis Shea <shea_at_nyahnyahspammersnyahnyah>
Date: Thu, 28 Dec 2006 09:21:09 -0700 (MST)

> I am working with a large volume of station data that's in an absolute mess!! I
> am talking about the GHCN Daily temperature data. There are several stations
> that are located at exactly the same lat, long coordinate and have overlapping time
> series. I need to find all such stations and flag them or remove them. The
> station data is arranged in a 2D grid of the form (STNID,T) where STNID is
> unique for each station and T is days from 1 to N. Is there any way to achieve
> the kind of flagging that I am talking about without going through tedious for
> loops?
[a] If N is of arbitrary length, it is not a 'nice' 2D array.
[b] You say that the station IDs are unique but you never
    mention lat/lon information.
[c] There is no function that does what you want.
[d] Something like the following must be used:

   id = station_data(:,0) ; station IDs only
   nid = dimsizes(id)
   n_dup = 0

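   do n=0,nid-2
      ; the argument to ind() is truncated in the archived message;
      ; this comparison is an assumed reconstruction
      i = ind(id(n).eq.id(n+1:))   ; offsets (from n+1) of later matching IDs
      if (.not.all(ismissing(i))) then
          ni = dimsizes(i)
          print("id="+id(n)+" has "+ni +" duplicates")

          n_dup = n_dup+1
      end if
      delete(i)                    ; size of i can change between iterations
   end do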

     This is far from perfect. You may wish to preallocate
     space for duplicate IDs and have more code to avoid
     duplicate printing.

   id_dup = new( 1000, typeof(id)) ; 1000 arbitrary
   id_dup = 0

     Then add code to add entries to the above array.
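
     One way this might look is sketched below. It is only an
     illustration, not tested against real GHCN data: the sample
     IDs are made up, the "seen" flag array is a hypothetical
     helper that is not part of the suggestion above, and id_dup
     is sized to nid rather than an arbitrary 1000. Any element
     already matched as a duplicate is skipped, so each repeated
     ID is reported and stored once.

```ncl
   ; hypothetical sample IDs standing in for station_data(:,0)
   id     = (/ 101, 205, 101, 307, 205, 101 /)
   nid    = dimsizes(id)

   id_dup = new( nid, typeof(id))   ; nid is a safe upper bound; filled with _FillValue
   seen   = new( nid, logical)      ; hypothetical helper: True once an element is matched
   seen   = False
   n_dup  = 0

   do n=0,nid-2
      if (.not.seen(n)) then
          i = ind(id(n).eq.id(n+1:))       ; offsets relative to element n+1
          if (.not.all(ismissing(i))) then
              seen(i+n+1)   = True         ; do not report these elements again
              id_dup(n_dup) = id(n)        ; store the repeated ID once
              n_dup = n_dup+1
              print("id="+id(n)+" has "+dimsizes(i)+" duplicates")
          end if
          delete(i)                        ; size of i can change between iterations
      end if
   end do

   if (n_dup.gt.0) then
       print("duplicated ids: "+id_dup(0:n_dup-1))
   end if
```

     With the sample IDs, only 101 and 205 should end up in
     id_dup. The unused tail of id_dup stays at its _FillValue,
     so only id_dup(0:n_dup-1) is meaningful.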

good luck

ncl-talk mailing list
Received on Thu Dec 28 2006 - 09:21:09 MST