Corey
by Corey
3 min read

Categories

  • articles

Tags

  • R

今日帮助实验人员处理一组数据,需要计算p值研究不同组别不同项目是否有差异,涉及的比较项目较多,但检验方法可以使用同一种,我决定把所有比较项目处理进一个矩阵,在R中循环计算p value,每一个比较项目为2X2矩阵。由于样本总例数较少,我选用Fisher's Exact Test

首先用最擅长的Python将数据按顺序处理好,输入R中做成一个126X2的矩阵:

data = matrix(c(1, 8, 9, 12, 15, 32, 1, 7, 9, 11, 15, 30, 1, 0, 9, 16, 15, 28, 1, 9, 9, 8, 15, 37, 1, 4, 9, 7, 15, 19, 1, 2, 9, 4, 15, 11, 8, 7, 12, 11, 32, 30, 8, 0, 12, 16, 32, 28, 8, 9, 12, 8, 32, 37, 8, 4, 12, 7, 32, 19, 8, 2, 12, 4, 32, 11, 7, 0, 11, 16, 30, 28, 7, 9, 11, 8, 30, 37, 7, 4, 11, 7, 30, 19, 7, 2, 11, 4, 30, 11, 0, 9, 16, 8, 28, 37, 0, 4, 16, 7, 28, 19, 0, 2, 16, 4, 28, 11, 9, 4, 8, 7, 37, 19, 9, 2, 8, 4, 37, 11, 4, 2, 7, 4, 19, 11, 14, 24, 6, 20, 0, 0, 14, 23, 6, 19, 0, 0, 14, 30, 6, 14, 0, 2, 14, 30, 6, 31, 0, 2, 14, 16, 6, 13, 0, 1, 14, 10, 6, 8, 0, 1, 24, 23, 20, 19, 0, 0, 24, 32, 20, 16, 0, 4, 24, 30, 20, 31, 0, 2, 24, 16, 20, 13, 0, 1, 24, 30, 20, 31, 0, 2, 23, 30, 19, 14, 0, 2, 23, 30, 19, 31, 0, 2, 23, 16, 19, 13, 0, 1, 23, 10, 19, 8, 0, 1, 30, 30, 14, 31, 2, 2, 30, 16, 14, 13, 2, 1, 30, 10, 14, 8, 2, 1, 30, 16, 31, 13, 2, 1, 30, 10, 31, 8, 2, 1, 16, 10, 13, 8, 1, 1), ncol = 2)

然后对矩阵进行一个简单的转置,生成63个2X2矩阵,每一个矩阵就是实验需要的比较项目。

re_data = t(data)

看一看转置完成后的样子,从左至右,每一个2X2矩阵就是一个待计算项目。

	  	[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]    1    8    9   12   15   32    1    7    9
[2,]   14   24    6   20    0    0   14   23    6
     [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17]
[1,]    11    15    30     1     0     9    16    15
[2,]    19     0     0    14    30     6    14     0
     [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25]
[1,]    28     1     9     9     8    15    37     1
[2,]     2    14    30     6    31     0     2    14
     [,26] [,27] [,28] [,29] [,30] [,31] [,32] [,33]
[1,]     4     9     7    15    19     1     2     9
[2,]    16     6    13     0     1    14    10     6
     [,34] [,35] [,36] [,37] [,38] [,39] [,40] [,41]
[1,]     4    15    11     8     7    12    11    32
[2,]     8     0     1    24    23    20    19     0
     [,42] [,43] [,44] [,45] [,46] [,47] [,48] [,49]
[1,]    30     8     0    12    16    32    28     8
[2,]     0    24    32    20    16     0     4    24
     [,50] [,51] [,52] [,53] [,54] [,55] [,56] [,57]
[1,]     9    12     8    32    37     8     4    12
[2,]    30    20    31     0     2    24    16    20
     [,58] [,59] [,60] [,61] [,62] [,63] [,64] [,65]
[1,]     7    32    19     8     2    12     4    32
[2,]    13     0     1    24    30    20    31     0
     [,66] [,67] [,68] [,69] [,70] [,71] [,72] [,73]
[1,]    11     7     0    11    16    30    28     7
[2,]     2    23    30    19    14     0     2    23
     [,74] [,75] [,76] [,77] [,78] [,79] [,80] [,81]
[1,]     9    11     8    30    37     7     4    11
[2,]    30    19    31     0     2    23    16    19
     [,82] [,83] [,84] [,85] [,86] [,87] [,88] [,89]
[1,]     7    30    19     7     2    11     4    30
[2,]    13     0     1    23    10    19     8     0
     [,90] [,91] [,92] [,93] [,94] [,95] [,96] [,97]
[1,]    11     0     9    16     8    28    37     0
[2,]     1    30    30    14    31     2     2    30
     [,98] [,99] [,100] [,101] [,102] [,103] [,104]
[1,]     4    16      7     28     19      0      2
[2,]    16    14     13      2      1     30     10
     [,105] [,106] [,107] [,108] [,109] [,110] [,111]
[1,]     16      4     28     11      9      4      8
[2,]     14      8      2      1     30     16     31
     [,112] [,113] [,114] [,115] [,116] [,117] [,118]
[1,]      7     37     19      9      2      8      4
[2,]     13      2      1     30     10     31      8
     [,119] [,120] [,121] [,122] [,123] [,124] [,125]
[1,]     37     11      4      2      7      4     19
[2,]      2      1     16     10     13      8      1
     [,126]
[1,]     11
[2,]      1

对大矩阵循环,计算每个待比较项目的的p值。

list = seq(1, 126, by = 2)

for(i in list){
  # print(re_inner[,i:(i+1)])
  print(fisher.test(re_data[,i:(i+1)])$p.value)
}

Done!