bootstrp

 Bootstrap resampling.


 -- Function File: BOOTSTAT = bootstrp (NBOOT, BOOTFUN, D)
 -- Function File: BOOTSTAT = bootstrp (NBOOT, BOOTFUN, D1, ..., DN)
 -- Function File: BOOTSTAT = bootstrp (..., D1, ..., DN, 'match', MATCH)
 -- Function File: BOOTSTAT = bootstrp (..., 'Options', PAROPT)
 -- Function File: BOOTSTAT = bootstrp (..., 'Weights', WEIGHTS)
 -- Function File: BOOTSTAT = bootstrp (..., 'loo', LOO)
 -- Function File: BOOTSTAT = bootstrp (..., 'seed', SEED)
 -- Function File: [BOOTSTAT, BOOTSAM] = bootstrp (...) 

     'BOOTSTAT = bootstrp (NBOOT, BOOTFUN, D)' draws NBOOT bootstrap resamples
     with replacement from the rows of the data D and returns the statistic
     computed by BOOTFUN in BOOTSTAT [1]. BOOTFUN is a function handle (e.g.
     specified with @), a string containing the function name, or a cell
     array whose first cell is a function handle or function name and whose
     remaining cells are additional input arguments passed to that function
     after the data argument(s). The third input argument is the data
     (column vector, matrix or cell array), which is supplied to BOOTFUN. The
     simulation method used by default is bootstrap resampling with first order
     balance [2-3].
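
     As an illustration of first order balance, the sketch below builds a set
     of resampling indices in which every row appears exactly NBOOT times
     across all resamples, by shuffling NBOOT concatenated copies of the row
     indices. It is a minimal sketch of the general idea only: the variable
     names are hypothetical, the n-by-nboot layout is an assumption, and the
     algorithm used internally by bootstrp may differ.

```octave
% Hypothetical sketch of first order balanced resampling (not package code)
n = 5;                                   % number of rows in the data
nboot = 4;                               % number of bootstrap resamples
pool = repmat ((1:n)', nboot, 1);        % each row index repeated nboot times
pool = pool(randperm (n * nboot));       % shuffle the pooled indices
bootsam = reshape (pool, n, nboot);      % one resample per column (assumed)
% Count how often each row index was drawn across all resamples
counts = accumarray (bootsam(:), ones (n * nboot, 1), [n, 1]);
disp (counts');                          % every count equals nboot
```

     Individual resamples still contain repeated and omitted rows; only the
     totals across the whole set of resamples are balanced.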

     'BOOTSTAT = bootstrp (NBOOT, BOOTFUN, D1,...,DN)' is as above except that 
     the third and subsequent input arguments are multiple data objects used
     as input for BOOTFUN.

     'BOOTSTAT = bootstrp (..., D1, ..., DN, 'match', MATCH)' controls the
     resampling strategy when multiple data arguments are provided. When MATCH
     is true, row indices of D1 to DN are the same (i.e. matched) for each
     resample. This is the default strategy when D1 to DN all have the same
     number of rows. If MATCH is set to false, then row indices are resampled
     independently for D1 to DN in each of the resamples. When the data
     arguments D1 to DN do not all have the same number of rows, this input
     argument is ignored and MATCH is forced to false. Note that the MATLAB
     bootstrp function only operates in a mode equivalent to MATCH = true.
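
     The difference between the two strategies can be sketched with plain
     index vectors. The example below uses simple (unbalanced) resampling with
     randi and hypothetical variable names purely to show what matching means;
     it is not how bootstrp generates its indices.

```octave
% Hypothetical illustration of matched versus independent resampling
X = (1:6)';
Y = 10 * (1:6)';                 % Y is paired row-by-row with X
n = numel (X);

% MATCH = true: one set of row indices is shared by both data arguments
idx = randi (n, n, 1);
Xm = X(idx);
Ym = Y(idx);                     % row pairing is preserved: Ym == 10 * Xm

% MATCH = false: each data argument gets its own set of row indices
ix = randi (n, n, 1);
iy = randi (n, n, 1);
Xi = X(ix);
Yi = Y(iy);                      % row pairing is generally broken
```

     Matched resampling suits paired or bivariate data, whereas independent
     resampling suits independent samples.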

     'BOOTSTAT = bootstrp (..., 'Options', PAROPT)' specifies options that
     govern if and how to perform bootstrap iterations using multiple
     processors (if the Parallel Computing Toolbox or Octave Parallel package
     is available). This argument is a structure with the following recognised
     fields:
        o 'UseParallel': If true, use parallel processes to accelerate
                         bootstrap computations on multicore machines. 
                         Default is false for serial computation. In MATLAB,
                         the default is true if a parallel pool
                         has already been started. 
        o 'nproc':       Sets the number of parallel processes (optional).
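
     A minimal sketch of building such a structure follows. The field names
     are those listed above; the data values and the guard around the call are
     illustrative assumptions, and nproc is presumably only relevant when
     UseParallel is true.

```octave
% Options structure using the two recognised fields
paropt = struct ('UseParallel', false, 'nproc', 2);

% Small illustrative dataset (hypothetical choice of values)
data = [48 36 20 29 42 42 20 42 22 41]';

% Only call bootstrp if it is available on the load path
if (exist ('bootstrp', 'file'))
  bootstat = bootstrp (50, @mean, data, 'Options', paropt);
end
```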

     'BOOTSTAT = bootstrp (..., D, 'weights', WEIGHTS)' sets the resampling
     weights. WEIGHTS must be a column vector with the same number of rows as
     the data, D. If WEIGHTS is empty or not provided, the default is uniform
     weighting, i.e. a weight of 1/N for each of the N rows of D.

     'BOOTSTAT = bootstrp (..., D1, ..., DN, 'weights', WEIGHTS)' is as above
     if MATCH is true. If MATCH is false, a 1-by-N cell array of column
     vectors can be provided to specify independent resampling weights for D1
     to DN.
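
     To show what resampling weights mean, the sketch below draws row indices
     with probability proportional to a weight vector using the inverse
     cumulative distribution. The weights and variable names are hypothetical,
     and this simple scheme is an assumption for illustration; bootstrp may
     combine weights with balanced resampling differently.

```octave
% Hypothetical sketch of weighted resampling of row indices
w = [0.1; 0.2; 0.3; 0.4];        % one weight per row of the data
n = numel (w);
w = w / sum (w);                 % normalise so the weights sum to 1
edges = cumsum (w);              % cumulative distribution over the rows
edges(end) = 1;                  % guard against floating point rounding
u = rand (n, 1);                 % uniform random numbers in (0, 1)
% Each index is the first row whose cumulative weight reaches u
idx = arrayfun (@(ui) find (ui <= edges, 1), u);
```

     With uniform weights of 1/n this reduces to ordinary resampling with
     replacement.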

     'BOOTSTAT = bootstrp (..., 'loo', LOO)' sets the simulation method. If 
     LOO is false, the resampling method used is balanced bootstrap resampling.
     If LOO is true, the resampling method used is balanced bootknife
     resampling [4]. The latter involves creating leave-one-out (jackknife)
     samples of size N - 1, and then drawing resamples of size N with
     replacement from the jackknife samples, thereby incorporating Bessel's
     correction into the resampling procedure. LOO must be a scalar logical
     value. The default value of LOO is false.
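
     The bootknife procedure described above can be sketched as follows. This
     is a simple unbalanced version with hypothetical variable names, intended
     only to show the leave-one-out step; the balanced implementation in
     bootstrp may differ.

```octave
% Hypothetical sketch of bootknife resampling (unbalanced, not package code)
data = (1:8)';
n = numel (data);
nboot = 3;
bootsam = zeros (n, nboot);
for b = 1:nboot
  omit = randi (n);                      % row left out of this resample
  keep = [1:omit-1, omit+1:n]';          % jackknife sample of size n - 1
  % Draw n rows with replacement from the n - 1 retained rows
  bootsam(:, b) = keep(randi (n - 1, n, 1));
end
resamples = data(bootsam);               % each column is one bootknife resample
```

     Each resample has n rows but contains at most n - 1 distinct
     observations.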

     'BOOTSTAT = bootstrp (..., 'seed', SEED)' initialises the Mersenne Twister
     random number generator using an integer SEED value so that bootstrp
     results are reproducible.
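
     The effect of seeding can be sketched with the rand function directly.
     The example assumes that the generator is seeded through the 'twister'
     option of rand, which may not be exactly how bootstrp does it.

```octave
% Seeding the Mersenne Twister generator gives repeatable random numbers
rand ('twister', 1);
a = rand (3, 1);
rand ('twister', 1);             % reset to the same state
b = rand (3, 1);
same = isequal (a, b);           % true: both draws are identical
```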

     '[BOOTSTAT, BOOTSAM] = bootstrp (...)' also returns indices used for
     bootstrap resampling. If MATCH is true or only one data argument is
     provided, BOOTSAM is a matrix. If multiple data arguments are provided
     and MATCH is false, BOOTSAM is returned in a 1-by-N cell array of
     matrices, where each cell corresponds to the respective data argument
     D1 to DN.
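
     Resampling indices of this kind can be used to recompute statistics
     outside bootstrp. The sketch below assumes an index matrix with one
     column per resample; the values are made up for illustration and the
     layout is an assumption rather than a statement about BOOTSAM.

```octave
% Hypothetical use of a matrix of resampling indices
data = [48 36 20 29 42]';
bootsam = [1 2; 3 3; 5 1; 2 4; 2 5];    % made-up indices, 5 rows by 2 resamples
resamples = data(bootsam);              % same size as bootsam
bootstat = mean (resamples, 1);         % one mean per resample (column)
```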

  Bibliography:
  [1] Efron and Tibshirani (1993) An Introduction to the
        Bootstrap. New York, NY: Chapman & Hall
  [2] Davison et al. (1986) Efficient Bootstrap Simulation.
        Biometrika, 73: 555-66
  [3] Booth, Hall and Wood (1993) Balanced Importance Resampling
        for the Bootstrap. The Annals of Statistics. 21(1):286-298
  [4] Hesterberg T.C. (2004) Unbiasing the Bootstrap—Bootknife Sampling 
        vs. Smoothing; Proceedings of the Section on Statistics & the 
        Environment. Alexandria, VA: American Statistical Association.

  bootstrp (version 2024.04.23)
  Author: Andrew Charles Penn
  https://www.researchgate.net/profile/Andrew_Penn/

  Copyright 2019 Andrew Charles Penn
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see http://www.gnu.org/licenses/

Demonstration 1

The following code


 % Input univariate dataset
 data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ...
         0 33 28 34 4 32 24 47 41 24 26 30 41]';

 % Compute 50 bootstrap statistics for the mean and calculate the bootstrap
 % standard error of the mean
 bootstat = bootstrp (50, @mean, data, 'seed', 1);
 % Or equivalently
 bootstat = bootstrp (50, @mean, data, 'seed', 1, 'loo', false);
 std (bootstat)

Produces the following output

ans = 2.7156

Demonstration 2

The following code


 % Input univariate dataset
 data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ...
         0 33 28 34 4 32 24 47 41 24 26 30 41]';

 % Compute 50 bootknife statistics for the mean and calculate the unbiased
 % bootstrap standard error of the mean
 bootstat = bootstrp (50, @mean, data, 'seed', 1, 'loo', true);
 std (bootstat)

Produces the following output

ans = 2.4052

Demonstration 3

The following code


 % Input univariate dataset
 data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ...
         0 33 28 34 4 32 24 47 41 24 26 30 41]';
 % Split data into consecutive blocks of two data observations per cell
 data_blocks = mat2cell (data, 2 * (ones (13, 1)), 1);

 % Compute 50 bootknife statistics for the mean and calculate the unbiased
 % bootstrap standard error of the mean
 bootstat = bootstrp (50, @(x) mean (cell2mat (x)), data_blocks, 'seed', 1, ...
                                                                 'loo', true);
 std (bootstat)

Produces the following output

ans = 2.9384

Demonstration 4

The following code


 % Input univariate dataset
 data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ...
         0 33 28 34 4 32 24 47 41 24 26 30 41]';

 % Compute 50 bootknife statistics for the variance and calculate the
 % unbiased standard error of the variance
 bootstat = bootstrp (50, {@var, 1}, data, 'loo', true);
 std (bootstat)

Produces the following output

ans = 39.204

Demonstration 5

The following code


 % Input two-sample dataset
 X = [212 435 339 251 404 510 377 335 410 335 ...
      415 356 339 188 256 296 249 303 266 300]';
 Y = [247 461 526 302 636 593 393 409 488 381 ...
      474 329 555 282 423 323 256 431 437 240]';

 % Compute 50 bootknife statistics for the mean difference between X and Y
 % and calculate the unbiased bootstrap standard error of the mean difference
 bootstat = bootstrp (50, @(x, y) mean (x - y), X, Y, 'loo', true);
 % Or equivalently
 bootstat = bootstrp (50, @(x, y) mean (x - y), X, Y, 'loo', true, ...
                                                      'match', true);
 std (bootstat)

Produces the following output

ans = 14.614

Demonstration 6

The following code


 % Input two-sample dataset
 X = [212 435 339 251 404 510 377 335 410 335 ...
      415 356 339 188 256 296 249 303 266 300]';
 Y = [247 461 526 302 636 593 393 409 488 381 ...
      474 329 555 282 423 323 256 431 437 240]';

 % Compute 50 bootknife statistics for the difference in mean between
 % independent samples X and Y and calculate the unbiased bootstrap
 % standard error of the difference in mean
 bootstat = bootstrp (50, @(x, y) mean (x) - mean(y), X, Y, 'loo', true, ...
                                                            'match', false);
 std (bootstat)

Produces the following output

ans = 32.705

Demonstration 7

The following code


 % Input bivariate dataset
 X = [212 435 339 251 404 510 377 335 410 335 ...
      415 356 339 188 256 296 249 303 266 300]';
 Y = [247 461 526 302 636 593 393 409 488 381 ...
      474 329 555 282 423 323 256 431 437 240]';

 % Compute 50 bootstrap statistics for the correlation coefficient and
 % calculate the bootstrap standard error of the correlation coefficient
 bootstat = bootstrp (50, @cor, X, Y);
 std (bootstat)

Produces the following output

ans = 0.098974

Demonstration 8

The following code


 % Input bivariate dataset
 X = [212 435 339 251 404 510 377 335 410 335 ...
      415 356 339 188 256 296 249 303 266 300]';
 Y = [247 461 526 302 636 593 393 409 488 381 ...
      474 329 555 282 423 323 256 431 437 240]';

 % Compute 50 bootstrap statistics for the coefficient of determination and
 % calculate its bootstrap standard error
 bootstat = bootstrp (50, {@cor,'squared'}, X, Y);
 std (bootstat)

Produces the following output

ans = 0.1265

Demonstration 9

The following code


 % Input bivariate dataset
 X = [212 435 339 251 404 510 377 335 410 335 ...
      415 356 339 188 256 296 249 303 266 300]';
 Y = [247 461 526 302 636 593 393 409 488 381 ...
      474 329 555 282 423 323 256 431 437 240]';

 % Compute 4999 bootstrap statistics for the coefficient of determination and
 % calculate 95% percentile confidence intervals
 bootstat = bootstrp (4999, {@cor,'squared'}, X, Y);
 bootint (bootstat)

Produces the following output

ans =

      0.25642        0.743

Demonstration 10

The following code


 % Input bivariate dataset
 X = [212 435 339 251 404 510 377 335 410 335 ...
      415 356 339 188 256 296 249 303 266 300]';
 Y = [247 461 526 302 636 593 393 409 488 381 ...
      474 329 555 282 423 323 256 431 437 240]';

 % Compute 50 bootstrap statistics for the slope and intercept of a linear
 % regression and calculate their bootstrap standard errors
 bootstat = bootstrp (50, @mldivide, cat (2, ones (20, 1), X), Y);
 std (bootstat)

Produces the following output

ans =

       58.311      0.18007

Package: statistics-resampling