Re: Segmentation fault

From: <Oliver.Fuhrer_at_nyahnyahspammersnyahnyah>
Date: Mon Jan 18 2010 - 08:37:52 MST

Dear Rick,

Yes, I have a script which crashes (almost always) with the same
stack trace. The segfault always occurs in one of the wind (i.e. vector)
plots, but not reproducibly in the same one. The real problem is that
even a very small change to the script, e.g. inserting a print
statement to narrow down the location of the problem, makes the
segmentation fault disappear. The strangest thing is that even if I
change the name (!) of the script, NCL sometimes doesn't crash. I tested
the script on 3 different machines (all running Linux) and it crashed
with a segfault, so I don't think this is really a machine issue.
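
To check whether the crash really always ends in the same function, the
backtraces from several runs could be collected in gdb batch mode and the
innermost frame tallied. The following is only a rough sketch: it assumes
a POSIX shell (not tcsh) and that gdb and ncl are on the PATH, and the
helper names top_frame and collect_frames are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; assumes gdb and ncl are on PATH.

# Print the function name from frame #0 of a gdb backtrace read on stdin.
# Expects frame lines of the form: "#0 0x00000000007f4eb1 in SomeFunc ()".
# Prints nothing if there is no such frame (e.g. the run did not crash).
top_frame() {
    awk '$1 == "#0" && !seen {
        for (i = 2; i < NF; i++) if ($i == "in") print $(i + 1)
        seen = 1
    }'
}

# Run the given NCL script N times under gdb and count how often each
# function appears as the innermost frame of a crash.
collect_frames() {
    runs=$1
    script=$2
    i=0
    while [ "$i" -lt "$runs" ]; do
        gdb -batch -ex run -ex bt --args ncl "$script" < /dev/null 2>&1 | top_frame
        i=$((i + 1))
    done | sort | uniq -c
}
```

For example, "collect_frames 10 PROG_ncl_f.663macro2" would print one line
per distinct innermost function, prefixed by the number of runs that ended
there. Runs that do not crash contribute nothing to the tally.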

I don't know the inner workings of NCL, but could it be that the problem
comes from the delete(wks) statement? This is typically something users
would not have in their scripts, as the garbage collection at the end of
a script automatically closes and renders an open PostScript file.

I wrapped everything up into a package. I'll send you the FTP server
details offline. To run the example, do the following:

> tcsh
> tar xvfz ncl_crash.tar.gz
> cd ncl_crash
> source NCL_SET_ENV
> ncl PROG_ncl_f.663macro2
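
Since the crash does not happen on every run, it may help to run the
script repeatedly and count the segfaults, for instance before and after
a small change to the script. A minimal sketch, assuming a POSIX shell
rather than tcsh and a Linux system where SIGSEGV is signal 11; the
function name count_segv is a hypothetical helper, not part of NCL.

```shell
#!/bin/sh
# Hypothetical helper: run a command N times and print how many of the
# runs were killed by SIGSEGV.
count_segv() {
    runs=$1
    shift
    crashes=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" > /dev/null 2>&1
        status=$?
        # The shell reports 128 + signal number for a child killed by a
        # signal; SIGSEGV is 11 on Linux, hence 139.
        if [ "$status" -eq 139 ]; then
            crashes=$((crashes + 1))
        fi
        i=$((i + 1))
    done
    echo "$crashes"
}
```

After the environment has been set up as above, a call such as
"count_segv 20 ncl PROG_ncl_f.663macro2" would print the number of runs
out of 20 that ended in a segmentation fault.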

Cheers,
Oli

> -----Original Message-----
> From: Rick Brownrigg [mailto:brownrig@ucar.edu]
> Sent: Friday, 15 January 2010 17:52
> To: Fuhrer Oliver
> Cc: ncl-talk@ucar.edu
> Subject: Re: Segmentation fault
>
> Hi,
>
> Does the stack trace always look the same when you get the SEGV; i.e.,
> does it always occur in NhlClassIsSubclass()? It looks like the fault
> originates from an NCL "delete()" statement, and from one of your
> vector-overlay plots (rather than from the contour plots). There
> were 4 such instances in the script you attached -- is it always a
> certain one? I looked carefully at the delete statements for those
> plots, but don't see anything obviously wrong.
>
> Rick
>
>
> On Jan 15, 2010, at 3:13 AM, <Oliver.Fuhrer@meteoswiss.ch> wrote:
>
> > Dear NCL forum,
> >
> > In a script which produces operational plots and runs 8 times a
> > day, we get a segmentation fault in NCL. The error is _VERY_ sensitive
> > to small changes (print statements in the script, etc.) and does not
> > always occur. Thus, efforts to reduce the error-producing script to a
> > minimum have so far not been successful. I do get a core dump, and upon
> > inspection it gives me the backtrace shown below. I've attached the
> > script that causes the error. Before trying to pack everything (input
> > files, external NCL scripts, shared libraries used) into a huge
> > package, I would like to ask if there is anything else that I could
> > try to narrow down the cause of the segfault.
> >
> > Thanks for any help,
> > Oli
> >
> > GNU debug session...
> >
> > (gdb) r PROG_ncl_f.663macro2
> > Starting program: /nfs/xt3-homes/users/olifu/ncl/bin/ncl PROG_ncl_f.663macro2
> > Copyright (C) 1995-2009 - All Rights Reserved
> > University Corporation for Atmospheric Research
> > NCAR Command Language Version 5.1.1
> > The use of this software is governed by a License Agreement.
> > See http://www.ncl.ucar.edu/ for more details.
> > (0) COSMO Library Version 0.5 loaded
> > (0) Start of macro
> > (0) jmb_getvar: reading HZEROCL named HZEROCL_GDS10_0DEG in file
> > (0) jmb_getvar: reading HSURF named HH_GDS10_SFC in file
> > (0) jmb_getvar: reading HZEROCL named HZEROCL_GDS10_0DEG in file
> > (0) jmb_getvar: reading HSURF named HH_GDS10_SFC in file
> > (0) jmb_getvar: reading HZEROCL named HZEROCL_GDS10_0DEG in file
> > (0) jmb_getvar: reading HSURF named HH_GDS10_SFC in file
> > (0) jmb_getvar: reading HZEROCL named HZEROCL_GDS10_0DEG in file
> > (0) jmb_getvar: reading HSURF named HH_GDS10_SFC in file
> > (0) start of COSMO2_PREC03h_06.gin
> > (0) jmb_getvar: reading TOT_PREC named TOT_PREC_GDS10_SFC_acc3h in file
> > (0) jmb_getvar: reading HSURF named HH_GDS10_SFC in file
> > (0) end of COSMO2_PREC03h_06.gin
> > (0) start of COSMO2_PREC03h_12.gin
> > (0) jmb_getvar: reading TOT_PREC named TOT_PREC_GDS10_SFC_acc3h in file
> > (0) jmb_getvar: reading HSURF named HH_GDS10_SFC in file
> > (0) end of COSMO2_PREC03h_12.gin
> > (0) start of COSMO2_PREC03h_18.gin
> > (0) jmb_getvar: reading TOT_PREC named TOT_PREC_GDS10_SFC_acc3h in file
> > (0) jmb_getvar: reading HSURF named HH_GDS10_SFC in file
> > (0) end of COSMO2_PREC03h_18.gin
> > (0) start of COSMO2_PREC03h_24.gin
> > (0) jmb_getvar: reading TOT_PREC named TOT_PREC_GDS10_SFC_acc3h in file
> > (0) jmb_getvar: reading HSURF named HH_GDS10_SFC in file
> > (0) end of COSMO2_PREC03h_24.gin
> > (0) jmb_getvar: reading TOT_PREC named TOT_PREC_GDS10_SFC_acc6h in file
> > (0) jmb_getvar: reading HSURF named HH_GDS10_SFC in file
> > (0) jmb_getvar: reading TOT_PREC named TOT_PREC_GDS10_SFC_acc6h in file
> > (0) jmb_getvar: reading HSURF named HH_GDS10_SFC in file
> > (0) jmb_getvar: reading TOT_PREC named TOT_PREC_GDS10_SFC_acc12h in file
> > (0) jmb_getvar: reading HSURF named HH_GDS10_SFC in file
> > (0) jmb_getvar: reading SNOWLMT named SNOWLMT_GDS10_0DEG in file
> > (0) jmb_getvar: reading HSURF named HH_GDS10_SFC in file
> > (0) jmb_getvar: reading SNOWLMT named SNOWLMT_GDS10_0DEG in file
> > (0) jmb_getvar: reading HSURF named HH_GDS10_SFC in file
> > (0) jmb_getvar: reading SNOWLMT named SNOWLMT_GDS10_0DEG in file
> > (0) jmb_getvar: reading HSURF named HH_GDS10_SFC in file
> > (0) jmb_getvar: reading SNOWLMT named SNOWLMT_GDS10_0DEG in file
> > (0) jmb_getvar: reading HSURF named HH_GDS10_SFC in file
> > (0) start of COSMO2_WIND_10M_03.gin
> > (0) jmb_getvar: reading U_GDS10_HTGL named U_GDS10_HTGL in file
> > (0) jmb_getvar: reading V_GDS10_HTGL named V_GDS10_HTGL in file
> > (0) jmb_getvar: reading HSURF named HH_GDS10_SFC in file
> > (0) jmb_getvar: reading HSURF named HH_GDS10_SFC in file
> > (0) end of COSMO2_WIND_10M_03.gin
> > (0) start of COSMO2_WIND_10M_09.gin
> > (0) jmb_getvar: reading U_GDS10_HTGL named U_GDS10_HTGL in file
> > (0) jmb_getvar: reading V_GDS10_HTGL named V_GDS10_HTGL in file
> > (0) jmb_getvar: reading HSURF named HH_GDS10_SFC in file
> > (0) jmb_getvar: reading HSURF named HH_GDS10_SFC in file
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > 0x00000000007f4eb1 in NhlClassIsSubclass ()
> > (gdb) bt
> > #0 0x00000000007f4eb1 in NhlClassIsSubclass ()
> > #1 0x00000000007f4ee1 in _NhlIsClass ()
> > #2 0x000000000082966a in _NhlManageOverlay ()
> > #3 0x00000000008676e0 in ManageOverlay ()
> > #4 0x00000000008682ed in VectorPlotSetValues ()
> > #5 0x00000000007cdf04 in CallSetValues ()
> > #6 0x00000000007ceb3b in _NhlSetLayerValues ()
> > #7 0x00000000007cf161 in NhlALSetValues ()
> > #8 0x000000000082d93d in PlotManagerSetValues ()
> > #9 0x00000000007cdf04 in CallSetValues ()
> > #10 0x00000000007ceb3b in _NhlSetLayerValues ()
> > #11 0x00000000007cee9a in SetValuesChild ()
> > #12 0x00000000007cef32 in _NhlALSetValuesChild ()
> > #13 0x0000000000829abc in _NhlManageOverlay ()
> > #14 0x000000000081ac13 in ManageOverlay ()
> > #15 0x000000000081b3cd in ContourPlotSetValues ()
> > #16 0x00000000007cdf04 in CallSetValues ()
> > #17 0x00000000007ceb3b in _NhlSetLayerValues ()
> > #18 0x00000000007f7bd0 in NhlRemoveData ()
> > #19 0x00000000007f85a3 in ReleaseHandles ()
> > #20 0x00000000007f8707 in DataMgrDestroy ()
> > #21 0x00000000007c21ac in CallDestroy ()
> > #22 0x00000000007c2267 in NhlDestroy ()
> > #23 0x00000000007c22db in _NhlDestroyChild ()
> > #24 0x00000000007f82df in DataItemDestroy ()
> > #25 0x00000000007c21ac in CallDestroy ()
> > #26 0x00000000007c21c4 in CallDestroy ()
> > #27 0x00000000007c2267 in NhlDestroy ()
> > #28 0x000000000051c0f0 in HLUObjDestroy ()
> > #29 0x0000000000553e09 in _NclDestroyObj ()
> > #30 0x000000000051c015 in HLUObjDestroy ()
> > #31 0x0000000000553e09 in _NclDestroyObj ()
> > #32 0x000000000051beb5 in HLUObjDelParent ()
> > #33 0x00000000005525f8 in _NclDelParent ()
> > #34 0x00000000005fa0ad in MultiDVal_HLUObj_Destroy ()
> > #35 0x0000000000553e09 in _NclDestroyObj ()
> > #36 0x00000000005f8fc8 in HLUMultiDValDelParent ()
> > #37 0x00000000005525f8 in _NclDelParent ()
> > #38 0x0000000000568e07 in AttDestroyObj ()
> > #39 0x0000000000553e09 in _NclDestroyObj ()
> > #40 0x000000000056885e in AttDelParent ()
> > #41 0x00000000005525f8 in _NclDelParent ()
> > #42 0x000000000059ec39 in VarDestroy ()
> > #43 0x00000000005f67db in HLUVarDestroy ()
> > #44 0x0000000000553e09 in _NclDestroyObj ()
> > #45 0x0000000000614482 in _NclIDelete ()
> > #46 0x00000000005da3af in CallINTRINSIC_PROC_CALL ()
> > #47 0x00000000005e0935 in _NclExecute ()
> > #48 0x000000000051fc69 in yyparse ()
> > #49 0x000000000051b4ce in main ()
> >
> > ________________________________________
> >
> > Oliver Fuhrer
> > Numerical Models
> >
> > Federal Departement of Home Affairs FDHA
> > Federal Office of Meteorology and Climatology MeteoSwiss
> >
> > Kraehbuehlstrasse 58, P.O. Box 514, CH-8044 Zurich, Switzerland
> >
> > Tel. +41 44 256 93 59
> > Fax +41 44 256 92 78
> > oliver.fuhrer@meteoswiss.ch
> > www.meteoswiss.ch - First-hand information
> >
> >
> > <PROG_ncl_f.663macro2>
> > _______________________________________________
> > ncl-talk mailing list
> > List instructions, subscriber options, unsubscribe:
> > http://mailman.ucar.edu/mailman/listinfo/ncl-talk
>
>
Received on Mon Jan 18 08:38:13 2010
