bootstrap_correl
Bootstrap estimates of sample cross correlations (i.e., Pearson's correlation coefficient) between two variables.
Available in version 6.4.0 and later.
Prototype
    function bootstrap_correl (
        x         : numeric,
        y         : numeric,
        nBoot [1] : integer,
        nDim  [*] : integer,
        opt   [1] : logical
    )

    return_val [ variable of type 'list' containing multiple estimates ]
Arguments
x
A numeric array of up to four dimensions: x(N), x(N,:), x(N,:,:), x(N,:,:,:) where 'N' represents the original sample size.
y
A numeric array of up to four dimensions: y(N), y(N,:), y(N,:,:), y(N,:,:,:) where 'N' represents the original sample size.
nBoot
An integer specifying the number of bootstrap data samples to be generated.
nDim
The dimension(s) of x and y on which to calculate the statistic. Most commonly, this is set to (/0,0/) or, if both dimension numbers are the same, simply 0.
opt
A logical scalar to which optional attributes may be attached. If opt=False, or if opt=True and no optional attributes are present, default values are used. If opt=True, the following attributes are recognized:
- opt@sample_size: specifies the size of the resampled array to be used for the bootstrapped statistics.
- opt@sample_size=N is the default.
- opt@sample_size=n, where (n .le. N), requests a smaller resample. A typical choice is n=toint(f*N), where 'f' is a fraction of (say) 0.10 to 0.20.
- opt@rseed1=rseed1: allows user to set the first random seed integer value. Default is to use the system initial random seed. (See: random_setallseed)
- opt@rseed2=rseed2: allows user to set the second random seed integer value. Default is to use the system initial random seed. (See: random_setallseed)
- opt@rseed3="clock": tells NCL to use the 'date' clock to set the two random seeds. (See: random_setallseed)
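The optional attributes above might be set as in the following minimal sketch. The sample data, the seed values, the fraction 0.15, and nBoot=500 are arbitrary choices made for illustration only; the attribute names are the ones listed above.

```ncl
; Hypothetical sample data: two correlated series of length N
N = 100
random_setallseed(36484749, 9494848)     ; arbitrary seeds, used only to create x and y
x = random_normal(0.0, 1.0, N)
y = 0.5*x + random_normal(0.0, 1.0, N)

opt             = True
opt@sample_size = toint(0.15*N)          ; resample about 15% of the N pairs (illustrative fraction)
opt@rseed1      = 1234567                ; arbitrary first seed
opt@rseed2      = 7654321                ; arbitrary second seed

BootStrap = bootstrap_correl(x, y, 500, 0, opt)   ; nBoot=500 is an arbitrary choice
printVarSummary(BootStrap[0])            ; the bootstrapped cross correlations
```

Setting both seeds explicitly is one way to make the resampling repeatable from run to run; omitting them leaves the seeds at their defaults.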
Return value
A variable of type 'list'. Members of a list can be accessed directly. However, it is clearer if the members are explicitly extracted and given meaningful names.
    ; typeof(BootStrap) is 'list'
    BootStrap = bootstrap_correl(x, y, nBoot, 0, opt)
    rBoot     = BootStrap[0]    ; All the bootstrapped values
    rBootAvg  = BootStrap[1]    ; Average cross correlation of the bootstrapped samples
    rBootStd  = BootStrap[2]    ; Standard deviation of the bootstrapped cross correlations
    delete(BootStrap)           ; no longer needed

All appropriate metadata are returned. Please use printVarSummary(...) to examine the returned variables.
Description
Bootstrapping is a statistical method that uses data resampling with replacement (see: generate_sample_indices) to estimate the properties of nearly any statistic. It is particularly useful when dealing with small sample sizes. A key feature is that bootstrapping makes no a priori assumption about the distribution of the sample data.
The default version resamples x and y as pairs.
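The idea of resampling pairs with replacement can be sketched with other NCL built-ins. This is an illustration of the general technique under assumed sample data and arbitrary seeds, not the internal implementation of bootstrap_correl.

```ncl
; Hypothetical sample data: two correlated series of length N
N = 100
random_setallseed(36484749, 9494848)     ; arbitrary seeds, for repeatability
x = random_normal(0.0, 1.0, N)
y = 0.5*x + random_normal(0.0, 1.0, N)

nBoot = 1000
rBoot = new(nBoot, typeof(x))            ; one correlation per bootstrap sample

do nb = 0, nBoot-1
   iw        = generate_sample_indices(N, 1)   ; 1 = sample indices with replacement
   rBoot(nb) = escorc(x(iw), y(iw))            ; same indices for x and y keep the pairs intact
end do

qsort(rBoot)                             ; sort in place, ascending order
print("mean of bootstrapped r      = " + avg(rBoot))
print("std. dev. of bootstrapped r = " + stddev(rBoot))
```

Using one index vector for both variables is what keeps each x value with its original y partner; drawing separate indices for x and y would destroy the relationship being measured.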
Some side points to remember about cross correlations (Wikipedia):
- The cross correlation coefficient detects only linear dependencies between two variables.
- For the case of a linear model with a single independent variable (x), the coefficient of determination (r^2) is the square of Pearson's product-moment correlation coefficient, r.
References:
Computer Intensive Methods in Statistics
P. Diaconis and B. Efron
Scientific American (1983), 248:116-130
doi:10.1038/scientificamerican0583-116
http://www.nature.com/scientificamerican/journal/v248/n5/pdf/scientificamerican0583-116.pdf

An Introduction to the Bootstrap
B. Efron and R.J. Tibshirani, Chapman and Hall (1993)

Bootstrap Methods and Permutation Tests: Companion Chapter 18 to the Practice of Business Statistics
Hesterberg, T. et al (2003)
http://statweb.stanford.edu/~tibs/stat315a/Supplements/bootstrap.pdf

Climate Time Series Analysis: Classical Statistical and Bootstrap Methods
M. Mudelsee (2014) Second edition. Springer, Cham Heidelberg New York Dordrecht London
ISBN: 978-3-319-04449-1, e-ISBN: 978-3-319-04450-7
doi: 10.1007/978-3-319-04450-7
xxxii + 454 pp; Atmospheric and Oceanographic Sciences Library, Vol. 51
See Also
bootstrap_stat, bootstrap_diff, bootstrap_estimate, bootstrap_regcoef, generate_sample_indices, ListIndexFromName
Examples
Please see the Bootstrap and Resampling application page.
Example 1: Let x(N) and y(N), with N=100:
    nBoot     = 1000     ; user set
    nDim      = 0        ; or (/0,0/); dimension numbers corresponding to 'N'
    opt       = False    ; use all default options

    BootStrap = bootstrap_correl(x, y, nBoot, nDim, opt)
    rBoot     = BootStrap[0]    ; bootstrapped cross-correlations in ascending order
    rBootAvg  = BootStrap[1]    ; Average of the z-transformed bootstrapped cross correlations
    rBootStd  = BootStrap[2]    ; Std. deviation(s) of the z-transformed bootstrapped cross correlations
    delete(BootStrap)           ; no longer needed

    rBootLow  = bootstrap_estimate(rBoot, 0.025, False)   ;  2.5% lower confidence bound
    rBootMed  = bootstrap_estimate(rBoot, 0.500, False)   ; 50.0% median of bootstrapped estimates

    printVarSummary(rBoot)      ; information only
    printVarSummary(rBootMed)
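As a complement to Example 1, a two-sided interval can be formed by also requesting an upper percentile. The following is a minimal, self-contained sketch; the sample data, the seeds, and the 0.975 level are assumptions chosen for illustration.

```ncl
; Hypothetical sample data: two correlated series of length N
N = 100
random_setallseed(36484749, 9494848)     ; arbitrary seeds, used only to create x and y
x = random_normal(0.0, 1.0, N)
y = 0.5*x + random_normal(0.0, 1.0, N)

nBoot     = 1000
BootStrap = bootstrap_correl(x, y, nBoot, 0, False)   ; default options
rBoot     = BootStrap[0]                 ; bootstrapped cross correlations
delete(BootStrap)                        ; no longer needed

rBootLow  = bootstrap_estimate(rBoot, 0.025, False)   ;  2.5% lower bound
rBootHi   = bootstrap_estimate(rBoot, 0.975, False)   ; 97.5% upper bound (assumed level)

print("95% bootstrap interval for r: [" + rBootLow + ", " + rBootHi + "]")
```

If the interval excludes zero, the data give some support for a nonzero linear relationship at roughly the 5% level.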