Syntax: var.test(continuous_variable ~ grouping_variable, data = data_frame)
> var.test(bwt~smoke,data=birthwt)

    F test to compare two variances

data:  bwt by smoke
F = 1.3019, num df = 114, denom df = 73, p-value = 0.2254    # p > 0.05: the variances can be treated as equal
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.8486407 1.9589574
sample estimates:
ratio of variances
          1.301927

Syntax: t.test(continuous_variable ~ grouping_variable, var.equal = [TRUE/FALSE], data = data_frame)
Set var.equal according to the result of the variance homogeneity test (the default is FALSE).
> t.test(bwt~smoke,data=birthwt,var.equal=TRUE)

    Two Sample t-test

data:  bwt by smoke
t = 2.6529, df = 187, p-value = 0.008667    # read the p-value here
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  72.75612 494.79735    # 95% CI of the difference, group 0 minus group 1
sample estimates:
mean in group 0 mean in group 1    # estimated population means of group 0 and group 1
       3055.696        2771.919

To get a confidence interval other than 95% (for example 99%), additionally set the conf.level argument.
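The two steps above (variance test, then t-test) can be sketched as a single script that also requests a 99% interval through conf.level. This is a minimal sketch that assumes the birthwt data set shipped with the MASS package and the conventional 0.05 cutoff; the object names vt, equal_var and tt are illustrative only.

```r
library(MASS)  # provides the birthwt data set (assumed to be the data used above)

# Step 1: F test for equality of variances between the two smoke groups
vt <- var.test(bwt ~ smoke, data = birthwt)

# Step 2: treat the variances as equal only if the F test is not significant
# (the 0.05 cutoff is a convention, not a rule)
equal_var <- vt$p.value > 0.05

# Step 3: two-sample t-test with a 99% confidence interval
tt <- t.test(bwt ~ smoke, data = birthwt,
             var.equal = equal_var, conf.level = 0.99)

print(tt$p.value)   # p-value of the t-test
print(tt$conf.int)  # 99% CI of the difference, group 0 minus group 1
```

The 99% interval is wider than the 95% interval shown above, since a higher confidence level requires a larger margin around the same point estimate.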
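This is the paired-sample t-test. The syntax resembles that of the independent-samples t-test: pass the two paired measurements as separate vectors and set paired=TRUE (var.equal has no effect in a paired test).
Syntax: t.test(measurement_1, measurement_2, paired = TRUE)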
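As a minimal sketch of a paired test, the example below uses made-up before/after values for six hypothetical subjects (the vectors before and after are not from the birthwt data). It also shows that a paired t-test is the same as a one-sample t-test on the within-subject differences.

```r
# Hypothetical paired measurements on the same six subjects (made-up values)
before <- c(72, 68, 75, 80, 66, 71)
after  <- c(70, 65, 76, 74, 63, 69)

# Paired t-test: tests whether the mean within-subject difference is zero
paired_res <- t.test(before, after, paired = TRUE)

# Equivalent one-sample t-test on the differences
diff_res <- t.test(before - after, mu = 0)

print(paired_res$p.value)
print(diff_res$p.value)  # same p-value as the paired test
```

The two vectors must have the same length and the same subject order, because each position is treated as one pair.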
First, review the conditions for applying one-way ANOVA:
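- There are three or more groups to compare.
- The data in each group can be regarded as independently sampled from a normal population (normality test).
- The group variances are homogeneous.

Syntax: tapply(data_frame$continuous_variable, data_frame$grouping_variable, shapiro.test)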
> tapply(birthwt$bwt,birthwt$race,shapiro.test)
$white

    Shapiro-Wilk normality test

data:  X[[i]]
W = 0.98727, p-value = 0.4861


$black

    Shapiro-Wilk normality test

data:  X[[i]]
W = 0.97696, p-value = 0.8038


$others

    Shapiro-Wilk normality test

data:  X[[i]]
W = 0.97537, p-value = 0.2046

Syntax: bartlett.test(continuous_variable ~ grouping_variable, data = data_frame)
> bartlett.test(bwt~race,data=birthwt)

    Bartlett test of homogeneity of variances

data:  bwt by race
Bartlett's K-squared = 0.65952, df = 2, p-value = 0.7191

The leveneTest() function is provided by the car package.
Syntax: leveneTest(continuous_variable ~ grouping_variable, data = data_frame)
> leveneTest(bwt~race,data=birthwt)
Levene's Test for Homogeneity of Variance (center = median)
       Df F value Pr(>F)
group   2  0.4684 0.6267
      186

Syntax: aov_model <- aov(continuous_variable ~ grouping_variable, data = data_frame)
Use the summary() function to view the one-way ANOVA result.
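> race.aov<-aov(birthwt$bwt~birthwt$race)
> summary(race.aov)
              Df   Sum Sq Mean Sq F value  Pr(>F)
birthwt$race   2  5015725 2507863   4.913 0.00834 **
Residuals    186 94953931  510505
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

A p-value below 0.05 from the ANOVA means that the group means cannot all be considered equal. The next step is to examine the pairwise relationships between the group means, that is, to run a post-hoc test.
Several post-hoc methods are available, differing in how they control the type I error.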
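The ANOVA step above can also be written with the data argument and the p-value pulled out programmatically to decide whether post-hoc comparisons are warranted. This is a minimal sketch: it assumes the birthwt data from the MASS package, where race is coded 1 = white, 2 = black, 3 = other, and the names bw, race_aov, aov_table and p_value are illustrative.

```r
library(MASS)  # provides the birthwt data set

# Recode race as a labelled factor (MASS coding: 1 = white, 2 = black, 3 = other)
bw <- birthwt
bw$race <- factor(bw$race, levels = 1:3, labels = c("white", "black", "others"))

# Fit the one-way ANOVA using the data argument instead of $ indexing
race_aov <- aov(bwt ~ race, data = bw)

# summary() returns a list; its first element is the ANOVA table
aov_table <- summary(race_aov)[[1]]
p_value <- aov_table[["Pr(>F)"]][1]

# Proceed to post-hoc comparisons only when the overall test is significant
if (p_value < 0.05) {
  print(TukeyHSD(race_aov))
}
```

Fitting with data = bw gives shorter term labels in the output (race rather than birthwt$race) and keeps the model tied to one data frame.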
Syntax: TukeyHSD(aov_model)
> TukeyHSD(race.aov)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = birthwt$bwt ~ birthwt$race)

$`birthwt$race`
                   diff       lwr        upr     p adj
black-white  -383.02644 -756.2363  -9.816581 0.0428037
others-white -297.43517 -566.1652 -28.705095 0.0260124
others-black   85.59127 -304.4521 475.634630 0.8624372

Syntax: pairwise.t.test(data_frame$continuous_variable, data_frame$grouping_variable, p.adjust.method = adjustment_method)
The available adjustment methods are "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr" and "none". An example using the Bonferroni correction:
> pairwise.t.test(birthwt$bwt,birthwt$race,p.adjust.method = "bonferroni")

    Pairwise comparisons using t tests with pooled SD

data:  birthwt$bwt and birthwt$race

       white black
black  0.049 -
others 0.029 1.000

P value adjustment method: bonferroni

Nonparametric tests are called in much the same way as the parametric tests above, so they are not described in detail here.
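Goal | Nonparametric test | Syntax
Two independent samples | Mann-Whitney-Wilcoxon test | wilcox.test(continuous_variable ~ grouping_variable, data = data_frame)
Two paired samples | Wilcoxon signed-rank test | wilcox.test(variable_1, variable_2, paired = TRUE)
Several independent samples | Kruskal-Wallis test | kruskal.test(continuous_variable ~ grouping_variable, data = data_frame)

For post-hoc comparisons, pairwise Wilcoxon rank-sum tests can be used, with a p-value adjustment to control the type I error rate.
Alternatively, bwsAllPairsTest() from the PMCMRplus package performs the BWS (Baumgartner-Weiß-Schindler) test.
Syntax: bwsAllPairsTest(continuous_variable ~ grouping_variable, data = data_frame)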
> bwsAllPairsTest(bwt~race,data=birthwt)

    Pairwise comparisons using BWS All-Pairs Test

data: bwt by race

       white black
black  0.027 -
others 0.030 0.535

P value adjustment method: holm
alternative hypothesis: two.sided
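The nonparametric calls summarized above can be sketched with base R alone, without PMCMRplus. The example assumes the birthwt data from the MASS package with race recoded as a factor. The Holm adjustment is an arbitrary choice made here for illustration, and the names bw, mw, kw and pw are also illustrative.

```r
library(MASS)  # provides the birthwt data set

# Recode race as a labelled factor (MASS coding: 1 = white, 2 = black, 3 = other)
bw <- birthwt
bw$race <- factor(bw$race, levels = 1:3, labels = c("white", "black", "others"))

# Two independent groups: Mann-Whitney-Wilcoxon (rank-sum) test.
# exact = FALSE requests the normal approximation; exact p-values
# cannot be computed when the data contain ties.
mw <- wilcox.test(bwt ~ smoke, data = bw, exact = FALSE)

# Three or more independent groups: Kruskal-Wallis test
kw <- kruskal.test(bwt ~ race, data = bw)

# Post-hoc: pairwise rank-sum tests with a Holm adjustment
pw <- pairwise.wilcox.test(bw$bwt, bw$race,
                           p.adjust.method = "holm", exact = FALSE)

print(mw$p.value)
print(kw$p.value)
print(pw$p.value)  # matrix of adjusted pairwise p-values
```

The pairwise p-values need not match the BWS output above, because the two procedures use different test statistics.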