Re: Lazy expression evaluation

From: David Brown <dbrown_at_nyahnyahspammersnyahnyah>
Date: Thu, 18 Jun 2009 18:08:21 -0600

Hi Dave,

More good questions and I realize I am still learning how it all
works. Taking your last question first, since the rest of what I have
to say depends on it: functions with array input are called only as
many times as they are invoked by the script. Any looping over the
individual array elements that may be required takes place inside the
function. When a function occurs in an expression, the function is
evaluated and its results are substituted into the expression.
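
As a rough illustration of that calling convention, here is a small
Python model (not NCL code; 'func' and the call counter are made up
for the example). The function is entered once for the whole array,
and the per-element loop lives inside it:

```python
calls = 0  # counts how many times the function body is entered


def func(values):
    """Hypothetical array function: doubles each element."""
    global calls
    calls += 1
    # The loop over individual elements happens inside the function.
    return [2 * v for v in values]


x = list(range(100))
result = func(x)  # one invocation in the script, one call, 100 elements
```

In this model 'calls' ends up as 1 no matter how long x is.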

In general, operators are also implemented as functions. There is a
'plus' function, an 'and' function, etc. that take as parameters
their left and right operands. A key point is that if the operands
are expressions they must be evaluated prior to calling the operator
function (just as for any function).

It turns out that there *is* a fundamental difference for lazy
evaluation in a scalar context. Modifying your example:

  x = 0
  flags = (x .gt. 0) .and. (1/x .gt. 10)

The evaluation of line 2 starts on the left side of the expression to
the right of the '=' sign. The variable 'x' is evaluated giving the
result 0, which becomes the left side parameter to the GreaterThan
function while the literal value 0 is the right hand parameter. The
GreaterThan function returns False and here is the key point: lazy
evaluation is implemented *in the instruction sequence*. If the
return value from GreaterThan is a scalar value then the instruction
JUMP_IF_FALSE is executed. Since the return value was False, the
"jump" is taken, entirely skipping the evaluation of the right side
operand and the call to the And function. So in the scalar case you
*can* suppress side effects and illegal math on the right hand side;
(1/x .gt. 10) is never evaluated and no error is generated. On the
other hand:

  x = array
  flags = (x .gt. 0) .and. (1/x .gt. 10)

Here x is an array, so x .gt. 0 gives multiple results and a single
JUMP_IF_FALSE is not possible. This means that the right hand expression
(1/x .gt. 10) has to be evaluated prior to calling the And function,
leading to a potential illegal math error. It might be possible to
create an implied do loop at this point, conditionally jumping
depending on each element of the left hand side result, but that
disturbs NCL's array processing paradigm and would have serious
performance implications.
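
To make the contrast concrete, here is a minimal Python sketch of the
two cases. It is only a model under my own assumptions, not the actual
NCL instruction sequence: the helper names are invented, Python lists
stand in for NCL arrays, and an 'if' stands in for JUMP_IF_FALSE.

```python
def greater_than(left, right):
    """Stand-in for the GreaterThan operator function (elementwise on lists)."""
    if isinstance(left, list):
        return [v > right for v in left]
    return left > right


def divide(left, right):
    """Stand-in for division; 1/0 raises ZeroDivisionError (illegal math)."""
    if isinstance(right, list):
        return [left / v for v in right]
    return left / right


def logical_and(left, right):
    """Stand-in for the And function; both operands are already evaluated."""
    if isinstance(left, list):
        return [l and r for l, r in zip(left, right)]
    return left and right


def evaluate_flags(x):
    """Model of: flags = (x .gt. 0) .and. (1/x .gt. 10)"""
    left = greater_than(x, 0)
    # JUMP_IF_FALSE: only possible when the left result is one scalar value.
    if not isinstance(left, list) and left is False:
        return False  # right operand and the And call are skipped entirely
    # Otherwise the whole right operand is evaluated before And is called.
    right = greater_than(divide(1, x), 10)
    return logical_and(left, right)


print(evaluate_flags(0))  # scalar case: the division is never attempted
try:
    evaluate_flags([0, 0.05])  # array case: 1/x is computed for every element
except ZeroDivisionError:
    print("array case: illegal math on the right hand side")
```

The scalar call returns False without touching 1/x, while the array
call fails inside the right hand expression before And is ever reached.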

Lazy evaluation can still be implemented inside the And function for
array input, but there it operates on the already-computed results of
the left and right hand side expressions, so it cannot suppress errors
raised while the right side is being evaluated. Nevertheless, it would
resolve the inconsistency between arrays and scalars in your original
scenario:

  (.not.ismissing(a) .and. (a .gt. 0))
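
Here is a small Python model of what lazy evaluation inside the And
function might look like for that scenario. It is a sketch, not the
test implementation: None stands in for a missing value, all function
names are invented, and the treatment of True combined with a missing
right element is my assumption.

```python
MISSING = None  # stand-in for an element equal to _FillValue


def is_missing(values):
    return [v is MISSING for v in values]


def logical_not(values):
    return [MISSING if v is MISSING else not v for v in values]


def greater_than(values, threshold):
    # Missing elements propagate as missing, per the general rule.
    return [MISSING if v is MISSING else v > threshold for v in values]


def eager_and(left, right):
    """Non-lazy And: a missing operand on either side gives missing."""
    return [MISSING if l is MISSING or r is MISSING else (l and r)
            for l, r in zip(left, right)]


def lazy_and(left, right):
    """Lazy And: a False left element decides the result on its own."""
    out = []
    for l, r in zip(left, right):
        if l is False:
            out.append(False)  # right element is ignored, even if missing
        elif l is MISSING or r is MISSING:
            out.append(MISSING)
        else:
            out.append(l and r)
    return out


a = [1, MISSING, 3]  # models a = (/ 1,2,3 /) with a@_FillValue = 2
left = logical_not(is_missing(a))
right = greater_than(a, 0)
print(eager_and(left, right))  # [True, None, True]
print(lazy_and(left, right))   # [True, False, True]
```

Both operands are fully computed before either And variant runs; the
lazy version only changes how the two result arrays are combined.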

Bottom line:

For scalars:

   False .and. (any expression) always gives False
   True .or. (any expression) always gives True
   In either of these cases, (any expression) will not be evaluated.

For arrays: (assuming lazy evaluation within the And and Or functions
is implemented)

   False .and. (any expression) gives False unless fatal error
   True .or. (any expression) gives True unless fatal error
   (any expression) will be evaluated, possibly resulting in an error
   that prevents a result from being returned

Note: this is based on a test implementation of lazy evaluation
within the And and Or functions.

Regarding other possible operators that could use lazy evaluation: I
don't think there are any. You might want to look at this page on
Wikipedia, which discusses this issue for programming languages in
general: http://en.wikipedia.org/wiki/Short-circuit_evaluation

  -dave

On Jun 18, 2009, at 12:14 PM, Dave Allured wrote:

> Dave B,
>
> I am okay with your strategy, making lazy evaluation the general
> rule for array expressions as well as scalars.
>
> So let's be more specific. Am I correct in thinking that this will
> apply only to these two cases in the language, for both arrays and
> scalars?
>
> False .and. (any expression) always gives False
> True .or. (any expression) always gives True
>
> This will be the rule for *any* left side expression, not just when
> the ismissing function is used?
>
> There are no other NCL operators, logical or otherwise, where this
> could possibly apply?
>
> Also, in these cases, the right hand expression is never computed at
> some array points, i.e. all errors and side effects from the right
> side are suppressed at those positions? For example, suppressing
> illegal math, such as divide by zero?
>
> x = array
> flags = (x .gt. 0) .and. (1/x .gt. 10)
>
> Now, I see a problem when the right side contains a function that
> returns an array result. This might be covered by the rule for
> functions in array expressions -- but I can't find that rule in the
> ref. manual! That function is called with which shape? Is the
> function called once, returning an array result, e.g. func(x(0:99))?
> Or is the function called a hundred times with the argument
> iterated, e.g. func(x(0)), func(x(1)), etc? This is important to
> know in general, not just for lazy evaluation.
>
> Well thanks for talking this over with me. I hope others will chime
> in if they see any related issues about lazy evaluation that we may
> have missed.
>
> --Dave
>
> David Brown wrote:
>> Dave A.,
>>
>> My view is that the 'ismissing' function is an explicit exception to
>> the rule you quote (otherwise it would return a missing value itself
>> instead of True) whose main purpose is to allow you to work around
>> the rule when necessary. I think the 'check1' statement below behaves
>> properly because by using 'ismissing' and lazy evaluation the right
>> hand side of the '.and.' expression never gets evaluated and
>> therefore does not figure into the result of the expression as a
>> whole.
>>
>> I agree with your original premise that 'check2' should work the same
>> way, and I now think it is a bug that it does not.
>>
>> By the way I found another reference to lazy evaluation in the NCL
>> glossary. It is discussed without reference to 'if' statements
>> although it does not talk about array logical expressions either:
>>
>> lazy evaluation
>> NCL: The process whereby relational expressions are assigned a value
>> as soon as it is possible to do so, without necessarily evaluating
>> all of the components in the expression. For example, the expression
>> (1 .lt. 3) .or. (2 .lt. 1) can be assigned the value True immediately
>> after evaluating (1 .lt. 3) without having to evaluate (2 .lt. 1).
>>
>> -dave
>>
>>
>> On Jun 16, 2009, at 7:04 PM, Dave Allured wrote:
>>
>>> Dave B,
>>>
>>> On the other hand, the missing value rule for general expressions,
>>> logical and others, is elegantly stated on the Expressions page:
>>>
>>> "When any NCL expression is being evaluated, NCL ignores
>>> elements that are equal to the value of the "_FillValue"
>>> attribute for each variable. When a missing value is
>>> ignored, the result of the expression will contain a
>>> missing value at the corresponding array index."
>>>
>>> There are other contexts in which I would really not want to have
>>> complications added to this rule. With full knowledge I would
>>> probably vote for keeping logical expressions in conformance with
>>> this simple rule.
>>>
>>> This means that (a) a logical expression within an "if" statement
>>> must explicitly be an exception to this rule, with some
>>> dimensionality and nesting considerations; and (b) the assignment
>>> statement for check1 below is not in compliance, and might be a bug
>>> (which you already said).
>>>
>>> This is reminding me of a saying, "Be careful what you ask for!"
>>>
>>> --Dave
>>>
>>> Dave Allured wrote:
>>>> Dave B,
>>>>
>>>> Thanks for looking at this. I think that lazy expression evaluation
>>>> for array expressions would be beneficial.
>>>>
>>>> It seems to me that lazy evaluation was never implemented in NCL
>>>> just for "if" statements. Lazy evaluation also works in general
>>>> scalar expressions, just not in array expressions. In this example
>>>> for current NCL versions, check1 is a scalar expression assignment
>>>> which could not have the indicated result without lazy evaluation:
>>>>
>>>> a = (/ 1,2,3 /)
>>>> a@_FillValue = 2
>>>> check1 = (.not.ismissing(a(1)) .and. (a(1) .gt. 0))
>>>> check2 = (.not.ismissing(a) .and. (a .gt. 0))
>>>>
>>>> print (check1)
>>>> print (check2(1))
>>>>
>>>> (0) False
>>>> (0) Missing
>>>>
>>>> Also I see in the same documentation under "if statements", there is
>>>> an almost explicit reference to using the ismissing function in
>>>> array mode with lazy expression evaluation. "The function ismissing
>>>> returns an array ..." "Combined with lazy conditional expression
>>>> evaluation..." There are other suggestive statements in the same
>>>> section. This seems like simply an incomplete implementation of
>>>> lazy evaluation.
>>>>
>>>> Meanwhile, here is a workaround that I have started to use. This
>>>> needs only one extra line, and it does not depend on lazy
>>>> evaluation. x is an array; the output vmask is also an array of the
>>>> same dimensionality:
>>>>
>>>> vmask = (x .gt. -130. .and. x .lt. 130.)
>>>> vmask = where (ismissing (vmask), False, vmask)
>>>>
>>>> This is surely not as efficient as true lazy evaluation in a single
>>>> line, but it will do for now.
>>>>
>>>> --Dave
>>>>
>>>> David Brown wrote:
>>>>> Hi Dave,
>>>>>
>>>>> This is a very interesting observation. In the NCL reference manual
>>>>> lazy expression evaluation is only documented in the context of
>>>>> 'if' statements, which require a scalar logical expression. The
>>>>> documentation for '.and.' and '.or.' says only that the operands
>>>>> must be logical, but does not mention lazy evaluation. Apparently
>>>>> lazy evaluation was implemented specifically for 'if' statement
>>>>> evaluation but was never generalized to work for array logical
>>>>> expressions.
>>>>>
>>>>> This leads to the inconsistency that you have pointed out here. The
>>>>> code could easily be updated to use lazy evaluation for .and. and
>>>>> .or. in the context of array logical expressions. I am not totally
>>>>> confident that there might not be some backwards-compatibility
>>>>> issue, but it does seem like a bug of sorts, so my inclination is
>>>>> to go ahead and make the change, noting that the behavior should be
>>>>> clearly documented. The development team will discuss.
>>>>>
>>>>> -dave
>>>>>
>>>>>
>>>>> On Jun 12, 2009, at 7:22 PM, Dave Allured wrote:
>>>>>
>>>>>> NCL team,
>>>>>>
>>>>>> Is lazy expression evaluation supposed to work for array
>>>>>> expressions?
>>>>>>
>>>>>> See attached script. mask1 is scalar and shows the expected result.
>>>>>> This is basically the example in the NCL manual under "If
>>>>>> statements", with assignment rather than if statement.
>>>>>>
>>>>>> http://www.ncl.ucar.edu/Document/Manuals/Ref_Manual/NclStatements.shtml
>>>>>>
>>>>>> For mask2 I expect True, False, True, but NCL returns True, Missing,
>>>>>> True. This creates problems for subsequent usage of the mask.
>>>>>>
>>>>>> I checked this with NCL versions 5.0.1 (pre-release, ca. May 2008)
>>>>>> and 5.1.1 (pre-release). The problem was the same in both.
>>>>>>
>>>>>> uname -a
>>>>>> Darwin mac56.psd.esrl.noaa.gov 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:54:29 PDT 2009; root:xnu-1228.12.14~1/RELEASE_PPC Power Macintosh powerpc PowerMac7,3 Darwin
>>>>>>
>>>>>> Please advise. Thank you for taking a look.
>>>>>>
>>>>>> Dave Allured
>>>>>> CU/CIRES Climate Diagnostics Center (CDC)
>>>>>> http://cires.colorado.edu/science/centers/cdc/
>>>>>> NOAA/ESRL/PSD, Climate Analysis Branch (CAB)
>>>>>> http://www.cdc.noaa.gov/psd1/
>>>>>> ; Test program for lazy expression evaluation.
>>>>>> ; 2009-jun-13 By Dave Allured, NOAA/PSD/CU/CIRES/CDC.
>>>>>>
>>>>>> begin
>>>>>> a = (/ 1,2,3 /)
>>>>>> a@_FillValue = 2
>>>>>>
>>>>>> i=1
>>>>>> mask1 = (.not.ismissing(a(i)) .and. (a(i) .gt. 0))
>>>>>> print (mask1)
>>>>>>
>>>>>> mask2 = (.not.ismissing(a) .and. (a .gt. 0))
>>>>>> print (mask2)
>>>>>> end
>>>>>> _______________________________________________
>>>>>> ncl-talk mailing list
>>>>>> List instructions, subscriber options, unsubscribe:
>>>>>> http://mailman.ucar.edu/mailman/listinfo/ncl-talk

Received on Thu Jun 18 2009 - 18:08:21 MDT

This archive was generated by hypermail 2.2.0 : Tue Jul 07 2009 - 11:13:18 MDT