Re: memory error question

From: David Ian Brown <dbrown_at_nyahnyahspammersnyahnyah>
Date: Fri, 13 Jul 2007 10:57:50 -0600

Hi Jacob,
I'm sorry to hear that your script is still not working. It is
interesting
that the error report is a bit different. Have you been monitoring the
size of the NCL process periodically using 'top' or 'ps' or something
similar? It would be worthwhile to know whether the process continues
to grow significantly over time or stabilizes at some peak size.

We could try running your program with some memory monitoring software.
We would of course need some data. Perhaps one day's worth would
be enough to take a look.

But first can you send me the latest version of your script?
  -dave

On Jul 13, 2007, at 6:55 AM, Jacob.Klee_at_dom.com wrote:

> Thank you again for your thoughtful email.
>
> I implemented the write_matrix solution you suggested, as I found it
> both an interesting test of the failure hypothesis and a workable /
> feasible final solution. Sadly, the failure remains. (Actually, the
> log file did not note the memory error; rather, the segmentation
> fault / core dump occurred at the same time / run position as
> before.) This makes me begin to wonder if there might be something
> beyond just an inefficient script causing problems. Apart from a
> minor bug fix in my handling of freezing level heights that are near
> or below the surface (and commenting out the summary print statement
> and instead creating, writing, and cleaning up the matrix to be
> written out), no changes were made to the script.
>
> I am currently running the script again on a different portion of my
> dataset (2004's data rather than 2000's) after also having installed
> the current version of NCL (I was previously running the prerelease
> of the current version). This should help rule out both any grib
> 'weirdness' and any prerelease 'weirdness'. My previous debugging
> efforts, however, lead me to anticipate that at ~16:00 EDT this
> afternoon the script will fail in the same manner as before.
>
> Lastly, for what it is worth, the interval between successive output
> files
> being written remains fairly constant up to the point of failure,
> running
> between ~22 and 25 minutes.
>
> I look forward to any assistance that can help resolve this problem.
>
>
> -- Jacob Klee
>
>
>
>
>
> David Ian Brown <dbrown_at_ucar.edu>
> Sent by: ncl-talk-bounces@ucar.edu
> To: Jacob.Klee_at_dom.com
> cc: ncl-talk_at_ucar.edu
> Date: 07/09/2007 05:56 PM
> Subject: Re: [ncl-talk] memory error question
>
> Looking at your script, my first thought would be to try using the
> write_matrix procedure to eliminate the big print statement at the end.
> You could create a float array of 183 x <however many values you have
> in your print statement>,
> populate it with the values that your current print statement
> outputs, and then output it to a file. Since, unfortunately,
> write_matrix does not currently have an option for appending to an
> existing file, you could write to a temporary file and then use a
> statement like:
>
> system("cat temp_file >> final_csv_file")
>
> to get all the output into a single file.
>
> Since it looks like almost all of your output is floating-point
> numbers, this should work, provided you can live with a couple of
> integer values being written out as floats.
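>
> For instance, here is a minimal sketch of the idea (the file names,
> the column count, and the format string are placeholders, not taken
> from your script):
>
> nrows = 183
> ncols = 10                            ; however many values per row
> outdata = new((/nrows,ncols/), float)
> ; ... populate outdata with the values the print statement produced ...
> opt = True
> opt@fout = "temp_file"                ; write_matrix creates a new file
> write_matrix(outdata, ncols + "f12.4", opt)
> system("cat temp_file >> final_csv_file")
> delete(outdata)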
>
> If that approach does not seem suitable, the other option is to create
> a shared object from
> Fortran or C code. This approach would allow you to specify every
> detail of how the data is output,
> but would, of course, be a bit more work.
>
> Another thing you could do is try to eliminate the innermost loop
> (do N_INDEX = 0,182). By converting the input data variables to 1D
> arrays, you could use vector subscripting to assign each variable in
> one step.
>
> Convert the 2D indexes into 1D:
>
> Dom1d = DomY * <size of X dimension> + DomX
>
> where presumably size of X dimension would be the size of g3_y_1
> (confusingly) in the GRIB file.
>
> Then if you set
> APCPN1d = ndtooned(APCPN)
>
>
> The values you are looking for would be selected with
>
> APCPN1d(Dom1d)
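>
> Putting these pieces together, a rough sketch (assuming X is the
> fastest-varying dimension of APCPN; 'vals' is just a placeholder
> name):
>
> dims = dimsizes(APCPN)                ; (ny, nx)
> nx = dims(1)                          ; size of the X dimension
> Dom1d = DomY * nx + DomX              ; flatten the 2D (y,x) indexes
> APCPN1d = ndtooned(APCPN)             ; flatten the data the same way
> vals = APCPN1d(Dom1d)                 ; all 183 values in one step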
>
> -dave
>
>
> On Jul 9, 2007, at 7:59 AM, Jacob.Klee_at_dom.com wrote:
>
>> Hi David, and thank you for your reply.
>>
>> So, my troubles are most likely related to the large print statement
>> within
>> my innermost loop?
>>
>> Over the weekend I let a version of the script run in which every
>> variable not already deleted was deleted before the end of the
>> innermost do loop. This had no impact on where the memory error
>> occurred: in the midst of printing the results of the 19th iteration
>> of the innermost calculation loop. (I usually see the failure between
>> ~45 and 80 iterations of the print loop during the 19th iteration of
>> the innermost calc. loop, with somewhere between 61 and 65 being the
>> most common.)
>>
>> As a restriction imposed by my analysis software, I need my results
>> formatted as I have the print statement arranged. Additionally, short
>> of ending the script, I gather there is currently no way to release
>> any of the memory consumed by the multitude of unique strings I am
>> producing. If this is correct, what options do I have for making the
>> calculations I need and still ending up (one way or another) with my
>> data arranged comparably to how I am printing it currently? (e.g.,
>> would writing the results to a single NCL file be more efficient? If
>> so, how would I then go about writing the data out to a CSV file
>> without encountering the same problems again?)
>>
>> Thank you all again for your time.
>>
>>
>>
>> -- Jacob Klee
>>
>>
>>
>>
>>
>>
>>
>> David Ian Brown <dbrown_at_ucar.edu>
>> Sent by: ncl-talk-bounces@ucar.edu
>> To: Dennis Shea <shea_at_ucar.edu>
>> cc: ncl-talk_at_ucar.edu, Jacob.Klee_at_dom.com
>> Date: 07/06/2007 04:45 PM
>> Subject: Re: [ncl-talk] memory error question
>>
>>>> In NCL, is there any reason that, if I delete every variable
>>>> (however that deletion is accomplished), 1,000,000 iterations of a
>>>> script would consume more memory than 1?
>>
>>
>> Hi Jacob,
>> I assume by "1,000,000 iterations" you are talking about some kind of
>> do loop inside a script -- not running a script that many times.
>>
>> There is no significant use of memory in the iteration of a do loop
>> itself. What can cause a problem, as has been discussed recently, is
>> the non-obvious fact that every unique string created during the
>> execution of an NCL script is stored internally for the life of the
>> script. So for instance, the execution of the loop below creates
>> 1,000,000 unique strings, leading to the consumption of around 18
>> Megabytes of memory. This is not that big an amount of memory in the
>> overall scheme of things; the bigger problem is that the hash table
>> lookup takes more and more time the more strings it has to search
>> through. In the case of the loop below it is very noticeable that it
>> gets slower and slower as it progresses.
>>
>> begin
>>   do i = 1,1000
>>     do j = 1,1000
>>       print(i + " " + j)
>>     end do
>>   end do
>> end
>>
>> Changing the loop as below, which reduces the number of unique
>> strings to 2000, eliminates the slowdown and reduces the memory
>> consumption to a scarcely noticeable level.
>>
>> begin
>>   do i = 1,1000
>>     do j = 1,1000
>>       print("i = " + i)
>>       print("j = " + j)
>>     end do
>>   end do
>>   print("end")
>> end
>>
>> We hope to take a look at this to see if we can eliminate the
>> permanent storage of transient strings.
>> -dave
>>
>>
>> On Jul 5, 2007, at 9:21 AM, Dennis Shea wrote:
>>
>>> [1] There is no way to delete multiple variables with
>>> one command. This capability has been requested before.
>>> It is on NCL's priority list. Unfortunately, NCL's request list
>>> is long :-(
>>>
>>> [2] If you have deleted all the temporary variables in a loop [let's
>>> say at the
>>> end of each iteration], then I don't think any extra memory
>>> should be used.
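>>>
>>> A minimal illustration of that pattern (the array size here is
>>> arbitrary):
>>>
>>> do i = 0, 9
>>>   tmp = new((/1000,1000/), float)  ; temporary work array
>>>   ; ... use tmp ...
>>>   delete(tmp)                      ; released at the end of each iteration
>>> end do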
>>>
>>> There is the possibility of a memory leak but, to my knowledge,
>>> NCL has been pretty free of this problem. [One exception was
>>> about 3-4 years ago.]
>>>
>>> Jacob.Klee_at_dom.com wrote:
>>>> I will see what adjustments I can make.
>>>>
>>>> A thought came to me that I have not yet been able to fully
>>>> research: is there a way to delete / destroy all variables with a
>>>> single command?
>>>>
>>>> In NCL, is there any reason that, if I delete every variable
>>>> (however that deletion is accomplished), 1,000,000 iterations of a
>>>> script would consume more memory than 1?
>>>>
>>>> -- Jacob
>>>>
>>>>
>>>>
>>>> Thanks for the information.
>>>>
>>>> It might help if you can deal with one chunk of data at a time, and
>>>> then delete it. For example, if you need to write a bunch of
>>>> variables to a file, then rather than reading them all in at once
>>>> and waiting until the end to write them out, you can write them out
>>>> one at a time, and then delete each when you're done.
>>>>
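>>>> A rough sketch of that pattern (the file names here are
>>>> placeholders):
>>>>
>>>> fin = addfile("in.grb", "r")
>>>> fout = addfile("out.nc", "c")
>>>> varnames = getfilevarnames(fin)
>>>> do v = 0, dimsizes(varnames)-1
>>>>   data = fin->$varnames(v)$       ; read one variable at a time
>>>>   fout->$varnames(v)$ = data      ; write it out right away
>>>>   delete(data)                    ; free it before the next read
>>>> end do
>>>>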
>>>> It might even help if, instead of doing everything in one big loop,
>>>> you duplicate the loop and do a certain set of things in each one.
>>>> That way you are not storing a whole bunch of variables for the
>>>> duration of a single loop iteration.
>>>>
>>>> --Mary
>>>>
>>>> On Tue, 3 Jul 2007 Jacob.Klee_at_dom.com wrote:
>>>>
>>>>
>>>>> Gladly.
>>>>>
>>>>> I am 'brute-force' data mining... I am creating a CSV file of 5
>>>>> years' worth of data from the North American Regional Reanalysis
>>>>> dataset (01 Mar through 30 Sep each year) and extracting /
>>>>> calculating numerous terms which may have some contribution /
>>>>> explanatory value concerning disruptions to our electric
>>>>> distribution network. (I currently have lightning data standing in
>>>>> as a placeholder for our outage data, which has yet to be compiled
>>>>> on the NARR grid.) I am going to feed the resulting file into
>>>>> DTREG and start the fun of analyzing the data, hopefully in the
>>>>> end producing a product which will further estimate the potential
>>>>> impact of upcoming weather on our system.
>>>>>
>>>>>
>>>>> Also, the soil data is not involved in any calculations, but is
>>>>> written out at the end of the file.
>>>>>
>>>>>
>>>>> -- Jacob
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Mary Haley <haley_at_ucar.edu>
>>>>> Sent by: ncl-talk-bounces@ucar.edu
>>>>> To: Jacob.Klee_at_dom.com
>>>>> cc: 'ncl talk' <ncl-talk_at_ucar.edu>
>>>>> Date: 07/03/2007 12:35 PM
>>>>> Subject: Re: [ncl-talk] memory error question
>>>>>
>>>>> Hi Jacob,
>>>>>
>>>>> One of the first things to check if you start seeing memory
>>>>> allocation problems is whether you are creating a lot of arrays
>>>>> but not freeing them up with "delete" after you are done with
>>>>> them. You already have quite a few "delete" calls in your script,
>>>>> but it might be helpful if you could add more.
>>>>>
>>>>> I believe, yes, in this case you are running out of physical
>>>>> memory,
>>>>> and deleting some arrays should help.
>>>>>
>>>>> The killer might be the triple do loop in which you are calling
>>>>> addfile
>>>>> inside the innermost loop, and then reading a bunch of variables
>>>>> off
>>>>> the file.
>>>>>
>>>>> For example, you have code like this:
>>>>>
>>>>> SOILW00_01 = SOILW(0,:,:)
>>>>> SOILW01_04 = SOILW(1,:,:)
>>>>> SOILW04_10 = SOILW(2,:,:)
>>>>> SOILW10_20 = SOILW(3,:,:)
>>>>>
>>>>> in which I think SOILW is not being used after this point, so you
>>>>> could delete it.
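>>>>>
>>>>> For example, right after those assignments you could add:
>>>>>
>>>>> delete(SOILW)    ; free the full array once the slices are taken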
>>>>>
>>>>> It might help us to know the intention of this program, so that
>>>>> we could suggest some other ways to do this kind of heavy file i/o
>>>>> and processing.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> --Mary
>>>>>
>>>>>
>>>>> On Tue, 3 Jul 2007 Jacob.Klee_at_dom.com wrote:
>>>>>
>>>>>
>>>>>> Hello all,
>>>>>>
>>>>>> I have received the error "fatal:NhlMalloc Failed:[errno=12]:Cannot
>>>>>> allocate memory" while running a memory-intensive script. If my
>>>>>> own memory has not yet failed, I remember this translates to
>>>>>> running out of physical memory?
>>>>>>
>>>>>> I am running the prerelease of ....3.0 for CYGWIN on an XP
>>>>>> machine with 4 GB RAM. I have included below the files related to
>>>>>> my error.
>>>>>>
>>>>>> My questions are these: am I truly running out of memory? If so,
>>>>>> any suggestions? If not, what is happening? (If anyone wants to
>>>>>> run this, I would be happy to provide a handful of the subsetted
>>>>>> NARR grib files I am using.)
>>>>>>
>>>>>>
>>>>>>
>>>>>> -- Jacob Klee
>>>>>>
>>>>>> (See attached file: ncl.exe.stackdump)
>>>>>> (See attached file: DCP_trimmed.log)
>>>>>> (See attached file: DomConvectParams_test.ncl)
>>>>>> (See attached file: .hluresfile)
>>>>>>
>>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> ======================================================
>>> Dennis J. Shea                tel: 303-497-1361
>>> P.O. Box 3000                 fax: 303-497-1333
>>> Climate Analysis Section
>>> Climate & Global Dynamics Div.
>>> National Center for Atmospheric Research
>>> Boulder, CO 80307
>>> USA                           email: shea 'at' ucar.edu
>>> ======================================================
>>>
>>
>>
>>
>
>
>

_______________________________________________
ncl-talk mailing list
ncl-talk_at_ucar.edu
http://mailman.ucar.edu/mailman/listinfo/ncl-talk