
bootstrap_diff
Bootstrap mean differences from two samples.
Available in version 6.4.0 and later.
Prototype
function bootstrap_diff (
    x         : numeric,
    y         : numeric,
    nBoot [1] : integer,
    nDim  [*] : integer,
    opt   [1] : logical
)

return_val [ variable of type 'list' containing multiple estimates ]
Arguments
x
A numeric array of up to four dimensions: x(NX), x(NX,:), x(NX,:,:), x(NX,:,:,:). 'NX' represents the original sample size.
y
A numeric array of up to four dimensions: y(NY), y(NY,:), y(NY,:,:), y(NY,:,:,:). 'NY' represents the original sample size. NOTE: NX and NY may be different.
nBoot
An integer specifying the number of bootstrap data samples to be generated.
nDim
The dimension(s) of x and y on which to calculate the statistic. Most commonly, this is set to (/0,0/) or, if both are the same, simply 0.
opt
A logical scalar to which optional attributes may be attached. If opt=False, all default values are used. If opt=True and no optional attributes are present, default values are also used. If opt=True, the following attributes are recognized:
- opt@sample_size_x and opt@sample_size_y allow the user to specify the sample sizes used to estimate the respective means. The defaults are: opt@sample_size_x=NX and opt@sample_size_y=NY.
- opt@sample_size_x=nx where (nx.le.NX) and/or opt@sample_size_y=ny where (ny.le.NY). When these options are used, nx and ny are typically 10-25% of NX and NY, respectively.
- opt@rseed1=rseed1: allows user to set the first random seed integer value. Default is to use the system initial random seed. (See: random_setallseed)
- opt@rseed2=rseed2: allows user to set the second random seed integer value. Default is to use the system initial random seed. (See: random_setallseed)
- opt@rseed3="clock": tells NCL to use the 'date' clock to set the two random seeds. (See: random_setallseed)
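The attributes above can be combined as in the following minimal sketch. The synthetic inputs, the sample sizes, and the seed values are illustrative assumptions, not recommended settings; the attribute names are those listed above.

```ncl
; Hypothetical inputs: two synthetic 1-D samples of different sizes
NX = 100
NY = 50
x  = random_normal(0.5, 1.0, NX)     ; mean 0.5, std. dev. 1.0
y  = random_normal(0.0, 1.0, NY)     ; mean 0.0, std. dev. 1.0

nBoot = 1000                         ; number of bootstrap samples
nDim  = 0                            ; sample dimension of both x and y

opt               = True
opt@sample_size_x = 20               ; nx <= NX (here 20% of NX)
opt@sample_size_y = 10               ; ny <= NY (here 20% of NY)
opt@rseed1        = 12345            ; illustrative seed values (assumed)
opt@rseed2        = 67890

BootStrap = bootstrap_diff(x, y, nBoot, nDim, opt)
```

Setting both seeds explicitly is one way to make repeated runs resample identically; leaving them unset falls back to the default seeding described above.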
Return value
A variable of type 'list'. Members of a list can be accessed directly. However, it is clearer if the members are explicitly extracted and given meaningful names.
    ; typeof(BootStrap) is 'list'
    BootStrap = bootstrap_diff(x, y, nBoot, 0, opt)

    dBoot     = BootStrap[0]    ; bootstrapped differences in ascending order
    dBootAvg  = BootStrap[1]    ; Average of the bootstrapped differences
    dBootStd  = BootStrap[2]    ; Std. Deviation of the bootstrapped differences

    delete(BootStrap)           ; no longer needed
Description
Bootstrapping is a statistical method that uses data resampling with replacement (see: generate_sample_indices) to estimate the properties of nearly any statistic. It is particularly useful when dealing with small sample sizes. A key feature is that bootstrapping makes no a priori assumption about the distribution of the sample data.
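The resampling idea can be sketched by hand for two 1-D samples. This is an illustration of the general technique under stated assumptions, not the internal implementation of bootstrap_diff: the synthetic inputs, the variable names, and the index-based choice of percentile bounds are all assumptions made for the example.

```ncl
; Hypothetical 1-D samples of different sizes
NX = 40
NY = 25
x  = random_normal(1.0, 2.0, NX)
y  = random_normal(0.0, 2.0, NY)

nBoot = 1000
dBoot = new(nBoot, typeof(x))            ; one mean difference per resample

do nb = 0, nBoot-1
   ix = generate_sample_indices(NX, 1)   ; 1 => sample with replacement
   iy = generate_sample_indices(NY, 1)
   dBoot(nb) = avg(x(ix)) - avg(y(iy))   ; difference of the resampled means
end do

qsort(dBoot)                             ; sort in place, ascending order
dBootAvg = avg(dBoot)                    ; average of the differences
dBootStd = stddev(dBoot)                 ; std. deviation of the differences

; approximate 95% percentile bounds read directly from the sorted values
iLow = toint(0.025*nBoot)
iHi  = toint(0.975*nBoot) - 1
print("95% interval: " + dBoot(iLow) + " to " + dBoot(iHi))
```

Each pass draws NX indices for x and NY indices for y independently, so the two samples do not need to have the same size.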
References:
- Computer Intensive Methods in Statistics. P. Diaconis and B. Efron. Scientific American (1983), 248:116-130. doi:10.1038/scientificamerican0583-116 http://www.nature.com/scientificamerican/journal/v248/n5/pdf/scientificamerican0583-116.pdf
- An Introduction to the Bootstrap. B. Efron and R.J. Tibshirani. Chapman and Hall (1993)
- Bootstrap Methods and Permutation Tests: Companion Chapter 18 to the Practice of Business Statistics. Hesterberg, T. et al (2003) http://statweb.stanford.edu/~tibs/stat315a/Supplements/bootstrap.pdf
- Climate Time Series Analysis: Classical Statistical and Bootstrap Methods. M. Mudelsee (2014). Second edition. Springer, Cham Heidelberg New York Dordrecht London. ISBN: 978-3-319-04449-1, e-ISBN: 978-3-319-04450-7, doi: 10.1007/978-3-319-04450-7. xxxii + 454 pp; Atmospheric and Oceanographic Sciences Library, Vol. 51
See Also
bootstrap_stat, bootstrap_correl, bootstrap_regcoef, bootstrap_estimate, generate_sample_indices, ListIndexFromName
Examples
Please see the Bootstrap and Resampling application page.
Example 1: Let x(NX); y(NY)
    nBoot       = 1000            ; user set
    nDim        = 0               ; (/0,0/) since they refer to the same dimension
    opt         = False

    BootStrap   = bootstrap_diff(x, y, nBoot, nDim, opt)

    diffBoot    = BootStrap[0]    ; All the bootstrapped differences
    diffBootAvg = BootStrap[1]    ; Average of the bootstrapped differences
    diffBootStd = BootStrap[2]    ; Std. Dev. of the bootstrapped differences

    delete(BootStrap)             ; no longer needed

    diffBootLow = bootstrap_estimate(diffBoot, 0.025, False)   ;  2.5% lower confidence bound
    diffBootMed = bootstrap_estimate(diffBoot, 0.500, False)   ; 50.0% median of bootstrapped estimates
    diffBootHi  = bootstrap_estimate(diffBoot, 0.975, False)   ; 97.5% upper confidence bound

    printVarSummary(diffBoot)     ; information only
    printVarSummary(diffBootMed)
Example 2: Let x(NX,:,:); y(NY,:,:) where NX=100 and NY=50. Use subsampling:
    nBoot = 2000                  ; user set
    nDim  = 0
    opt   = True
    opt@sample_size_x = 30        ; subsample size for x (30 of NX=100)
    opt@sample_size_y = 10        ; subsample size for y (10 of NY=50)

    BootStrap = bootstrap_diff(x, y, nBoot, nDim, opt)
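A self-contained variant of this multi-dimensional case, followed by the same extraction steps used in Example 1, might look like the sketch below. The trailing dimension sizes (3 and 4) and the synthetic data are assumptions made only so the script can run on its own; printVarSummary is used to inspect the returned shapes rather than assuming them.

```ncl
; Hypothetical 3-D inputs: leading dimension is the sample dimension
x = random_normal(1.0, 1.0, (/100,3,4/))   ; NX=100
y = random_normal(0.0, 1.0, (/ 50,3,4/))   ; NY=50

nBoot = 2000
nDim  = 0
opt   = True
opt@sample_size_x = 30                     ; subsample sizes, as documented
opt@sample_size_y = 10                     ; under 'opt' above

BootStrap   = bootstrap_diff(x, y, nBoot, nDim, opt)
diffBoot    = BootStrap[0]                 ; bootstrapped differences
diffBootAvg = BootStrap[1]                 ; their average
diffBootStd = BootStrap[2]                 ; their std. deviation
delete(BootStrap)                          ; no longer needed

diffBootLow = bootstrap_estimate(diffBoot, 0.025, False)   ;  2.5% lower bound
diffBootHi  = bootstrap_estimate(diffBoot, 0.975, False)   ; 97.5% upper bound

printVarSummary(diffBoot)                  ; inspect returned dimensions
printVarSummary(diffBootLow)
```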