Resampling, bootstrapping, jackknifing and their use in multivariate (statistical) data analyses
DOI:
https://doi.org/10.19090/pp.2013.3.249-266

Keywords:
resampling, bootstrapping, jackknifing, sampling, canonical discriminant analysis

Abstract
In order to perform any data analysis procedure (analysis of variance, linear correlation, linear regression, etc.), a series of assumptions need to hold, most of which concern the distribution of the variables. If one or more of these assumptions are violated, the obtained parameter estimates are inadequate. Resampling methods are offered as a means to overcome this issue, as they require only one assumption - that the available data is reasonably representative of the population. Resampling methods use the existing, available sample to create a large number of new subsamples. This produces an empirical distribution of the desired statistic, which forms the basis for an adequate parameter estimate. In this paper, we discuss the methods of bootstrapping and jackknifing, which fall under the broader category of resampling. The most common form of the jackknifing procedure is sometimes labeled the 'leave-one-out' procedure, because each new subsample is made by excluding one unit of the original sample at a time. Thus, the number of new subsamples is equal to the number of units in the original sample. In bootstrapping, subsamples are created by randomly picking units from the original sample with replacement, so that all subsamples are equal in size to the original sample. The number of bootstrap subsamples is usually very large (over 1000). A canonical discriminant analysis serves as an example to illustrate how the use of resampling procedures can significantly alter the obtained parameter estimates and the researcher's conclusions. A detailed explanation of how to perform these methods in the IBM SPSS Statistics package is also given.
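The two procedures described in the abstract can be sketched in code. The following is a minimal illustration (not the paper's SPSS implementation): the jackknife builds one subsample per unit by leaving that unit out, and the bootstrap draws resamples with replacement, each the size of the original sample. The sample data and the choice of the mean as the statistic are purely illustrative.

```python
import random
import statistics

def jackknife_means(sample):
    """Leave-one-out jackknife: one subsample per unit, each omitting one unit.
    Returns as many estimates as there are units in the original sample."""
    return [statistics.mean(sample[:i] + sample[i + 1:])
            for i in range(len(sample))]

def bootstrap_means(sample, n_resamples=1000):
    """Bootstrap: draw n_resamples subsamples with replacement, each equal in
    size to the original sample; return the empirical distribution of the mean."""
    return [statistics.mean(random.choices(sample, k=len(sample)))
            for _ in range(n_resamples)]

# Illustrative data only.
data = [4.1, 5.6, 3.9, 7.2, 5.0, 6.3, 4.8, 5.5]

jk = jackknife_means(data)   # len(jk) == len(data)
bs = bootstrap_means(data)   # 1000 bootstrap estimates of the mean
se = statistics.stdev(bs)    # bootstrap estimate of the standard error
```

The spread of the bootstrap distribution (here its standard deviation) serves as the standard-error estimate, with no distributional assumptions beyond the sample being representative of the population.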