From: Vladyslav Lyubartsev <lyubartsev_at_nyahnyahspammersnyahnyah>

Date: Wed, 18 Mar 2009 12:12:28 +0100

Hi Dennis,

Thank you for your script; however, it is too general (and too slow) for
my purposes. Using a trick (the days are integers, and the time interval
is less than 20 years), we can carry out this task much faster: just use
the days as the index into auxiliary arrays. Below is this algorithm,
using the interface of your uniqueDays function
(http://mailman.ucar.edu/pipermail/ncl-talk/attachments/20090317/7fdcb1f2/attachment.pl).
Maybe it will be useful to somebody; it is very simple but quite fast.

; Days are integers in yyyymmdd form
function uniqueDays(Days[*]:integer, Data[*]:numeric)
local iDay,DayA,DayB,nDay,i,n,ii,nums,sums,x
begin
  DayA = min(Days)
  DayB = max(Days)
  nDay = DayB-DayA+1

  ; That's the trick: 1995-2008 (14 years) corresponds to
  ; nDay of about 14*10,000 = 140,000.
  nums = new(nDay,integer)
  nums = 0
  ; Sure, a lot of nums and sums are dead (e.g. 20021399 is not a day),
  ; however, the length of these arrays is not very large.
  sums = new(nDay,float)
  sums = 0.

  n = dimsizes(Days)
  do i = 0,n-1                    ; calculation time is proportional to dimsizes(Days)!
    iDay = Days(i)-DayA           ; use Days(i) as index
    nums(iDay) = nums(iDay)+1
    sums(iDay) = sums(iDay)+Data(i)
  end do

  ii = ind(nums.gt.0)
  ; double rather than float: a float cannot hold every 8-digit
  ; yyyymmdd value exactly (e.g. 20081231 would be rounded)
  x = new((/dimsizes(ii),3/),double)
  x(:,0) = DayA+ii                ; unique days
  x(:,1) = nums(ii)               ; number of duplicates
  x(:,2) = sums(ii)/nums(ii)      ; daily average
  return(x)
end
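
A small driver along the following lines may help when trying the
function. It is only a sketch: it assumes uniqueDays has already been
defined (or loaded) earlier in the same script, and it uses just the six
Days/Data values visible in the original question, leaving out the
elided middle of the Days array.

```ncl
; Sample values copied from the question; the "..." part of Days is omitted.
Days = (/19950308,19950314,19950314,20081228,20081231,20081231/)
Data = (/12.1, 22.5, 32.0, 12.8, 16.0, 32.1/)

; Assumes uniqueDays is already defined earlier in this script.
x = uniqueDays(Days, Data)

; One row per unique day: column 0 = day, 1 = count, 2 = daily average.
print(x)
```

With these inputs one would expect four rows, with counts 1, 2, 1, 2 and
daily averages of roughly 12.1, 27.25, 12.8 and 24.05.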

Best regards,

Slava

-----Original Message-----

From: Dennis Shea [mailto:shea_at_ucar.edu]

Subject: Re: Find and aggregate duplicates

A sample function and test driver are attached.

Quite possibly, others could write more efficient code.

Wall Clock Timings:

8000 elements: less than 1 sec

40000 elements: 6 seconds

80000 elements: 25 sec

Good luck

An embedded and charset-unspecified text was scrubbed...

Name: aggdup.ncl

Url: http://mailman.ucar.edu/pipermail/ncl-talk/attachments/20090317/7fdcb1f2/attachment.pl

Vladyslav Lyubartsev wrote:

> Hello,
>
> We have two huge arrays with the same dimension:
>
> Days = (/19950308,19950314,19950314,...,20081228,20081231,20081231/)
> Data = (/ 12.1, 22.5, 32.0, 12.8, 16.0, 32.1/)
>
> There can be several data values for the same day.
> Days are irregular. Sure we can sort them, however there will be gaps
> and duplicates in this array.
>
> Is it possible to carry out the following tasks efficiently, no loops,
> using only built-in NCL functions:
>
> 1) Construct the list of unique days, no duplicates
> 2) Calculate amount of duplicates for each unique day
> 3) Calculate the data average for each unique day
>
> I see no such solution, only loops.
>
> Am I right?
>
> Thanks,
> Slava

Received on Wed Mar 18 2009 - 05:12:28 MDT
