Re: NCL fortran->Nvidia GPU porting

From: Elmer Joandi <elmersoft_at_nyahnyahspammersnyahnyah>
Date: Thu Aug 18 2011 - 00:49:27 MDT

The gains are impressive, but in this particular case part of them comes
from using single precision and other shortcuts. Also worth noting: Tesla
cards (also in single precision) showed little error (not significant for
CAPE output, at least), but an older consumer-grade GPU gave about 2%
average error (compared to the Fortran original).
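
To give a feel for how single precision by itself can move a result, here is
a toy sketch. It is not the cape2d code: it only sums the same value many
times, once with the running total rounded to single precision after every
add and once in double. The helper names (to_f32, sum_single, sum_double,
relative_error) are made up for this illustration.

```python
import struct


def to_f32(x):
    """Round a Python float (a double) to the nearest IEEE-754 single."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_single(values):
    """Sum with every input and every partial sum rounded to single precision."""
    total = 0.0
    for v in values:
        total = to_f32(total + to_f32(v))
    return total


def sum_double(values):
    """Reference sum in ordinary double precision."""
    total = 0.0
    for v in values:
        total += v
    return total


def relative_error(approx, exact):
    """Relative difference between an approximation and a reference value."""
    return abs(approx - exact) / abs(exact)


if __name__ == "__main__":
    values = [0.1] * 100000  # arbitrary toy input, not model data
    single = sum_single(values)
    double = sum_double(values)
    print("single:", single)
    print("double:", double)
    print("relative error:", relative_error(single, double))
```

The point is only that the error depends on how long the chain of dependent
operations is, so a comparison against the Fortran original, as described
above, is the real test. The sketch says nothing about why one card would
differ from another.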

OpenCL: of course NCL should be portable across hardware. But fine-tuning
for a specific GPU often gives faster results. It is very easy to make code
10x slower there when not keeping the underlying hardware in mind (whatever
the language is). So it may be more useful to target Nvidia first (with
either CUDA or OpenCL) and later copy the code into slightly different
versions for other hardware. That would motivate people to use the full
hardware performance on all platforms. Also, with Nvidia CUDA, certain
algorithms run an order of magnitude faster depending on compute
capability. So such an effort would also need a decision about the minimum
compute capability.

Plans: I did that port to be able to calculate CAPE for a large domain, as
it was nearly the only slow thing in DrJack's RASP blipmaps. So if the NCL
team wants to use it, I would be happy to give it (the cape2d CUDA
version). As for the next pieces, I am generally in the "searching for a
job" business here, however (or if at all) that works out.

The graphical stack is still too fat for me to grasp in a porting context. I
profiled it a bit in the spring, as in general the whole thing is still too
slow for a 500x500 domain with 10-minute dumps. So I do not have any
specific ideas about what to port next. Cape2d was quite a compact piece. To
benefit from a GPU, a piece of code should be compact in relation to its
data flow: little data in and a long calculation. So it cannot be in the
middle of the call stack; it should be at the bottom.
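
The "little data in, long calculation" rule can be put as a rough
back-of-envelope model: the time on the GPU is the kernel time plus the time
to move the data over the bus. This is a minimal sketch under that one
assumption. The function names and every number in the example are
hypothetical, not measurements from cape2d or NCL.

```python
def gpu_total_seconds(cpu_seconds, kernel_speedup, bytes_moved,
                      bus_bytes_per_second):
    """Estimated GPU wall time: kernel time plus host<->device transfer time."""
    kernel_time = cpu_seconds / kernel_speedup
    transfer_time = bytes_moved / bus_bytes_per_second
    return kernel_time + transfer_time


def effective_speedup(cpu_seconds, kernel_speedup, bytes_moved,
                      bus_bytes_per_second):
    """Speedup actually seen by the caller once transfers are included."""
    return cpu_seconds / gpu_total_seconds(
        cpu_seconds, kernel_speedup, bytes_moved, bus_bytes_per_second)


if __name__ == "__main__":
    # Hypothetical figures: 100 MB moved over a 5 GB/s bus, and a kernel
    # that by itself runs 20x faster than the CPU version.
    moved, bus, speedup = 100e6, 5e9, 20.0

    # Long calculation on that data: the transfer cost is a small fraction.
    print("heavy routine:", effective_speedup(10.0, speedup, moved, bus))

    # Short calculation on the same data: the transfer dominates and the
    # "accelerated" version ends up slower than the CPU one (value below 1).
    print("light routine:", effective_speedup(0.01, speedup, moved, bus))
```

A routine in the middle of the call stack tends to look like the second
case, because its inputs and outputs cross the bus on every call.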

Elmer Joandi

Received on Thu Aug 18 00:49:34 2011