R语言实战：重抽样与自助法

February 06, 2021 (最后修改: June 21, 2026)

本文内容来自《R 语言实战》(R in Action, 2nd)，有部分修改

置换检验

置换检验，随机化检验，重随机化检验

精确检验 vs 蒙特卡洛模拟

使用 `coin` 包做置换检验

响应值与组的分配独立吗？
两个数值变量独立吗？
两个类别型别变量独立吗？

library(coin)

独立两样本和 K 样本检验

score <- c(
  40, 57, 45, 55, 58, 
  57, 64, 55, 62, 65
)
treatment <- factor(
  c(rep("A", 5), rep("B", 5))
)
my_data <- data.frame(
  treatment, score
)
my_data

   treatment score
1          A    40
2          A    57
3          A    45
4          A    55
5          A    58
6          B    57
7          B    64
8          B    55
9          B    62
10         B    65

t 检验，结果显著

t.test(
  score ~ treatment,
  data=my_data,
  var.equal=TRUE
)

	Two Sample t-test

data:  score by treatment
t = -2.345, df = 8, p-value = 0.04705
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -19.0405455  -0.1594545
sample estimates:
mean in group A mean in group B 
           51.0            60.6

置换检验，结果不显著

oneway_test(
  score ~ treatment,
  data=my_data,
  distribution="exact"
)

	Exact Two-Sample Fisher-Pitman Permutation Test

data:  score by treatment (A, B)
Z = -1.9147, p-value = 0.07143
alternative hypothesis: true mu is not equal to 0

library(MASS)

UScrime <- transform(
  UScrime,
  So = factor(So)
)
head(UScrime)

    M So  Ed Po1 Po2  LF  M.F Pop  NW  U1 U2 GDP Ineq     Prob    Time    y
1 151  1  91  58  56 510  950  33 301 108 41 394  261 0.084602 26.2011  791
2 143  0 113 103  95 583 1012  13 102  96 36 557  194 0.029599 25.2999 1635
3 142  1  89  45  44 533  969  18 219  94 33 318  250 0.083401 24.3006  578
4 136  0 121 149 141 577  994 157  80 102 39 673  167 0.015801 29.9012 1969
5 141  0 121 109 101 591  985  18  30  91 20 578  174 0.041399 21.2998 1234
6 121  0 110 118 115 547  964  25  44  84 29 689  126 0.034201 20.9995  682

Wilcoxon 秩和检验

wilcox.test(
  Prob ~ So,
  data=UScrime,
)

	Wilcoxon rank sum exact test

data:  Prob by So
W = 81, p-value = 8.488e-05
alternative hypothesis: true location shift is not equal to 0

wilcox_test(
  Prob ~ So,
  data=UScrime,
  distribution="exact"
)

	Exact Wilcoxon-Mann-Whitney Test

data:  Prob by So (0, 1)
Z = -3.7493, p-value = 8.488e-05
alternative hypothesis: true mu is not equal to 0

近似 K 样本置换检验

library(multcomp)

head(cholesterol)

    trt response
1 1time   3.8612
2 1time  10.3868
3 1time   5.9059
4 1time   3.0609
5 1time   7.7204
6 1time   2.7139

set.seed(1234)
oneway_test(
  response ~ trt,
  data=cholesterol,
  distribution=approximate(nresample=9999)
)

	Approximative K-Sample Fisher-Pitman Permutation Test

data:  response by trt (1time, 2times, 4times, drugD, drugE)
chi-squared = 36.381, p-value < 1e-04

列联表中的独立性

chisq_test()、cmh_test()、lbl_test() 函数

library(vcd)

head(Arthritis)

  ID Treatment  Sex Age Improved
1 57   Treated Male  27     Some
2 46   Treated Male  29     None
3 77   Treated Male  30     None
4 17   Treated Male  32   Marked
5 36   Treated Male  46   Marked
6 23   Treated Male  58   Marked

Improved 是有序因子，进行线性趋势检验

set.seed(1234)
chisq_test(
  Treatment ~ Improved,
  data=Arthritis,
  distribution=approximate(nresample=9999)
)

	Approximative Linear-by-Linear Association Test

data:  Treatment by Improved (None < Some < Marked)
Z = -3.6075, p-value = 0.0006001
alternative hypothesis: two.sided

Arthritis_v2 <- transform(
  Arthritis,
  Improved=as.factor(as.numeric(Improved))
)
head(Arthritis_v2)

  ID Treatment  Sex Age Improved
1 57   Treated Male  27        2
2 46   Treated Male  29        1
3 77   Treated Male  30        1
4 17   Treated Male  32        3
5 36   Treated Male  46        3
6 23   Treated Male  58        3

Improved 是分类因子，进行卡方检验

set.seed(1234)
chisq_test(
  Treatment ~ Improved,
  data=Arthritis_v2,
  distribution=approximate(nresample=9999)
)

	Approximative Pearson Chi-Squared Test

data:  Treatment by Improved (1, 2, 3)
chi-squared = 13.055, p-value = 0.0018

数值变量间的独立性

spearman_test() 函数

states <- as.data.frame(state.x77)
head(states)

           Population Income Illiteracy Life Exp Murder HS Grad Frost
Alabama          3615   3624        2.1    69.05   15.1    41.3    20
Alaska            365   6315        1.5    69.31   11.3    66.7   152
Arizona          2212   4530        1.8    70.55    7.8    58.1    15
Arkansas         2110   3378        1.9    70.66   10.1    39.9    65
California      21198   5114        1.1    71.71   10.3    62.6    20
Colorado         2541   4884        0.7    72.06    6.8    63.9   166
             Area
Alabama     50708
Alaska     566432
Arizona    113417
Arkansas    51945
California 156361
Colorado   103766

set.seed(1234)
spearman_test(
  Illiteracy ~ Murder,
  data=states,
  distribution=approximate(nresample=9999)
)

	Approximative Spearman Correlation Test

data:  Illiteracy by Murder
Z = 4.7065, p-value < 1e-04
alternative hypothesis: true rho is not equal to 0

两样本和 K 样本相关性检验

两配对组的置换检验：wilcoxsign_test() 函数

多于两组：friedman_test() 函数

Wilcoxon 符号秩检验

wilcoxsign_test(
  U1 ~ U2,
  data=UScrime,
  distribution="exact"
)

	Exact Wilcoxon-Pratt Signed-Rank Test

data:  y by x (pos, neg) 
	 stratified by block
Z = 5.9691, p-value = 1.421e-14
alternative hypothesis: true mu is not equal to 0

深入研究

independence_test() 函数

`lmPerm` 包的置换检验

线性模型的置换检验，比如 lmp() 和 aovp() 函数

library(lmPerm)

简单回归和多项式回归

set.seed(1234)
fit <- lmp(
  weight ~ height, 
  data=women,
  perm="Prob"
)

[1] "Settings:  unique SS : numeric variables centered"

summary(fit)

Call:
lmp(formula = weight ~ height, data = women, perm = "Prob")

Residuals:
    Min      1Q  Median      3Q     Max 
-1.7333 -1.1333 -0.3833  0.7417  3.1167 

Coefficients:
       Estimate Iter Pr(Prob)    
height     3.45 5000   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.525 on 13 degrees of freedom
Multiple R-Squared: 0.991,	Adjusted R-squared: 0.9903 
F-statistic:  1433 on 1 and 13 DF,  p-value: 1.091e-14

二次方程

set.seed(1234)
fit <- lmp(
  weight ~ height + I(height^2),
  data=women,
  perm="Prob"
)

[1] "Settings:  unique SS : numeric variables centered"

summary(fit)

Call:
lmp(formula = weight ~ height + I(height^2), data = women, perm = "Prob")

Residuals:
      Min        1Q    Median        3Q       Max 
-0.509405 -0.296105 -0.009405  0.286151  0.597059 

Coefficients:
            Estimate Iter Pr(Prob)    
height      -7.34832 5000   <2e-16 ***
I(height^2)  0.08306 5000   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3841 on 12 degrees of freedom
Multiple R-Squared: 0.9995,	Adjusted R-squared: 0.9994 
F-statistic: 1.139e+04 on 2 and 12 DF,  p-value: < 2.2e-16

多元回归

head(states)

           Population Income Illiteracy Life Exp Murder HS Grad Frost
Alabama          3615   3624        2.1    69.05   15.1    41.3    20
Alaska            365   6315        1.5    69.31   11.3    66.7   152
Arizona          2212   4530        1.8    70.55    7.8    58.1    15
Arkansas         2110   3378        1.9    70.66   10.1    39.9    65
California      21198   5114        1.1    71.71   10.3    62.6    20
Colorado         2541   4884        0.7    72.06    6.8    63.9   166
             Area
Alabama     50708
Alaska     566432
Arizona    113417
Arkansas    51945
California 156361
Colorado   103766

fit <- lmp(
  Murder ~ Population + Illiteracy + Income + Frost,
  data=states,
  perm="Prob"
)

[1] "Settings:  unique SS : numeric variables centered"

summary(fit)

Call:
lmp(formula = Murder ~ Population + Illiteracy + Income + Frost, 
    data = states, perm = "Prob")

Residuals:
     Min       1Q   Median       3Q      Max 
-4.79597 -1.64946 -0.08112  1.48150  7.62104 

Coefficients:
            Estimate Iter Pr(Prob)    
Population 2.237e-04   51    1.000    
Illiteracy 4.143e+00 5000   <2e-16 ***
Income     6.442e-05   51    0.961    
Frost      5.813e-04   51    0.863    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.535 on 45 degrees of freedom
Multiple R-Squared: 0.567,	Adjusted R-squared: 0.5285 
F-statistic: 14.73 on 4 and 45 DF,  p-value: 9.133e-08

fit <- lm(
  Murder ~ Population + Illiteracy + Income + Frost,
  data=states,
)
summary(fit)

Call:
lm(formula = Murder ~ Population + Illiteracy + Income + Frost, 
    data = states)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.7960 -1.6495 -0.0811  1.4815  7.6210 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 1.235e+00  3.866e+00   0.319   0.7510    
Population  2.237e-04  9.052e-05   2.471   0.0173 *  
Illiteracy  4.143e+00  8.744e-01   4.738 2.19e-05 ***
Income      6.442e-05  6.837e-04   0.094   0.9253    
Frost       5.813e-04  1.005e-02   0.058   0.9541    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.535 on 45 degrees of freedom
Multiple R-squared:  0.567,	Adjusted R-squared:  0.5285 
F-statistic: 14.73 on 4 and 45 DF,  p-value: 9.133e-08

注意：在 lm() 中 Population 和 Illiteracy 均显著，而 lmp() 中仅 Illiteracy 显著

单因素方差分析和协方差分析

单因素方差分析的置换检验

set.seed(1234)
fit <- aovp(
  response ~ trt,
  data=cholesterol,
  perm="Prob"
)

[1] "Settings:  unique SS "

anova(fit)

Analysis of Variance Table

Response: response
          Df R Sum Sq R Mean Sq Iter  Pr(Prob)    
trt        4  1351.37    337.84 5000 < 2.2e-16 ***
Residuals 45   468.75     10.42                   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

单因素协方差分析的置换检验

set.seed(1234)
fit <- aovp(
  weight ~ gesttime + dose,
  data=litter,
  perm="Prob"
)

[1] "Settings:  unique SS : numeric variables centered"

anova(fit)

Analysis of Variance Table

Response: weight
          Df R Sum Sq R Mean Sq Iter Pr(Prob)    
gesttime   1   161.49   161.493 5000   0.0006 ***
dose       3   137.12    45.708 5000   0.0392 *  
Residuals 69  1151.27    16.685                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

双因素方差分析

set.seed(1234)
fit <- aovp(
  len ~ supp*dose,
  data=ToothGrowth,
  perm="Prob"
)

[1] "Settings:  unique SS : numeric variables centered"

anova(fit)

Analysis of Variance Table

Response: len
          Df R Sum Sq R Mean Sq Iter Pr(Prob)    
supp       1   205.35    205.35 5000  < 2e-16 ***
dose       1  2224.30   2224.30 5000  < 2e-16 ***
supp:dose  1    88.92     88.92 2032  0.04724 *  
Residuals 56   933.63     16.67                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

双因素方差分析

set.seed(1234)
fit <- aovp(
  len ~ supp*dose,
  data=ToothGrowth,
  perm="Prob"
)

[1] "Settings:  unique SS : numeric variables centered"

anova(fit)

Analysis of Variance Table

Response: len
          Df R Sum Sq R Mean Sq Iter Pr(Prob)    
supp       1   205.35    205.35 5000  < 2e-16 ***
dose       1  2224.30   2224.30 5000  < 2e-16 ***
supp:dose  1    88.92     88.92 2032  0.04724 *  
Residuals 56   933.63     16.67                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

置换检验点评

perm 包

corrperm 包

logregperm 包

glmperm 包

置换检验主要用于生成检验零假设的 p 值，有助于回答“效应是否存在”这样的问题。

自助法

从初始样本重复随机替换抽样，生成一个或一系列待检验统计量的经验分布。无需假设一个特定的理论分布，便可生成统计量的置信区间，并能检验统计假设。

boot 包中的自助法

library("boot")

对单个统计量使用自助法

计算 R 平方的函数

rsq <- function(formula, data, indices) {
  d <- data[indices,]
  fit <- lm(formula, data=d)
  return (summary(fit)$r.square)
}

自助抽样

set.seed(1234)
results <- boot(
  data=mtcars,
  statistic=rsq,
  R=1000,
  formula=mpg ~ wt + disp
)

boot 对象可以输出

print(results)

ORDINARY NONPARAMETRIC BOOTSTRAP


Call:
boot(data = mtcars, statistic = rsq, R = 1000, formula = mpg ~ 
    wt + disp)


Bootstrap Statistics :
     original     bias    std. error
t1* 0.7809306 0.01379126  0.05113904

绘制结果

plot(results)

计算置信区间

ci_results <- boot.ci(
  results,
  type=c("perc", "bca")
)
ci_results

BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 1000 bootstrap replicates

CALL : 
boot.ci(boot.out = results, type = c("perc", "bca"))

Intervals : 
Level     Percentile            BCa          
95%   ( 0.6753,  0.8835 )   ( 0.6344,  0.8561 )  
Calculations and Intervals on Original Scale
Some BCa intervals may be unstable

ci_results$percent

     conf                                 
[1,] 0.95 25.03 975.98 0.6752993 0.8835035

ci_results$bca

     conf                                
[1,] 0.95 3.77 896.91 0.6343659 0.8560896

多个统计量的自助法

返回回归系数向量的函数

bs <- function(formula, data, indices) {
  d <- data[indices,]
  fit <- lm(formula, data=d)
  return(coef(fit))
}

自助抽样

set.seed(1234)
results <- boot(
  data=mtcars,
  statistic=bs,
  R=1000,
  formula=mpg ~ wt + disp
)
print(results)

ORDINARY NONPARAMETRIC BOOTSTRAP


Call:
boot(data = mtcars, statistic = bs, R = 1000, formula = mpg ~ 
    wt + disp)


Bootstrap Statistics :
       original        bias    std. error
t1* 34.96055404  4.715497e-02 2.546106756
t2* -3.35082533 -4.908125e-02 1.154800744
t3* -0.01772474  6.230927e-05 0.008518022

多个统计量使用索引

plot(
  results,
  index=2
)

置信区间

boot.ci(
  results,
  type="bca",
  index=2
)

BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 1000 bootstrap replicates

CALL : 
boot.ci(boot.out = results, type = "bca", index = 2)

Intervals : 
Level       BCa          
95%   (-5.477, -0.937 )  
Calculations and Intervals on Original Scale

boot.ci(
  results,
  type="bca",
  index=3
)

BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 1000 bootstrap replicates

CALL : 
boot.ci(boot.out = results, type = "bca", index = 3)

Intervals : 
Level       BCa          
95%   (-0.0334, -0.0011 )  
Calculations and Intervals on Original Scale