Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Immune infiltrate estimation: review of methods for deriving cell type profiles from purified cell sample data Maxim Zaslavsky June 2016 1 Introduction Several groups have introduced methods for the deconvolution of bulk tumor gene expression data. Every method includes a unique procedure for isolating a reference profile corresponding to each cell type using sample gene expression data collected from enriched cell lines. For robust deconvolution, it is essential that these reference profiles – taking the form of marker gene lists or characteristic gene expression vectors – are determined carefully. Specifically, we are interested in the following properties of these methods: • • • • Do these training methods extract biological intuition or noise? How well do their selected genes or expression vectors differentiate between similar classes? How well can these methods differentiate all classes, based on metrics like condition number? Do their chosen genes overlap, and are the differences between their training expression vectors or marker genes biologically significant or noise? • How unique are genes to individual cell types? How many are shared between multiple types? We introduce each approach, discuss its theoretical limitations, and examine its output to understand whether there is motivation for new approaches. 2 2.1 Methods for selecting reference profiles Marker gene methods There are three notable methods that produce marker gene lists. First, [1] examined gene expression within immune cell types and in other tissue to produce a list of genes that are specifically expressed in particular immune cell types. To determine whether a certain gene g is exemplary of any cell type(s), the authors find the array with the highest expression level of g and determine which enriched cell type it corresponds to. They multiply this highest expression level by 0.1625 (chosen arbitrarily) and add the maximum expression level of g seen in non-immune tissue samples. If this weighted sum is greater than the next highest expression level of gene g across arrays from other immune cell types, then g is considered characteristic of the cell type in which it was most highly expressed. Finally, if this gene has higher expression in another immune cell type than this weighted sum, the gene is also considered to be characteristic of the other cell type. However, fold change alone is a poor indicator of uniqueness; high expression should not be the only indicator that a gene corresponds to a particular cell type! A more robust method would consider rare expression, even at low levels. [2] pursues the same task with a similar method. To determine whether the expression of gene g is 1 characteristic of cell type t, the authors essentially compute the score: ∆g,t = min (ei (g)) − ei ∈Xt max (meanej ∈Xt′ (ej (g))), t′ ∈T −{t} where Xi is the set of all arrays of cell type i and e(g) is the expression of g in some array. Then, they keep all g ’s with highest ∆g,t . The authors do not specify their filtering cutoff, unfortunately. The authors finally add some cell type-specific genes for populations not sampled, again without much detail (the code is not available). This filtering mechanism ensures that selected genes are unique to their corresponding cell types, but would fail if any two types are very similar. In this case, genes whose differential expression has biological meaning but is small might not pass the filter, whereas genes with seemingly high differential expression – as may be found in the noise from low sample sizes – may pass. Finally, though [3] attempts to estimate tumor purity, which is the absolute fraction of stromal and immune cells in a tumor sample, instead of the relative abundances of specific immune infiltrate cell types, the method also relies on identifying immune signature genes from gene expression in enriched samples and thus deserves investigation. The authors simply divided samples into extremely low and extremely high immune cell infiltration groups (using leukocyte methylation signature scores that are given in many TCGA datasets), removing any samples with medium immune cell infiltration. They computed Significance Analysis of Microarray (SAM) scores on the differential expression of genes between the high-and low-infiltration groups [4]. They selected genes that were significantly differentially expressed to form a gene list. SAM is a straightforward, well-known, and statistically sound method for finding genes that are differentially expressed between two classes. Moreover, the SAM technique can be applied to multi-class situations to determine genes that are significantly differentially expressed in one combination of cell types versus another. I believe SAM would form more robust gene lists in comparison to previous methods that are based solely on fold change. 2.1.1 Analysis Our first measure of whether these marker gene extraction methods are successful is whether known immune pathways are enriched in the gene list. For example, are the genes that these methods believe to be associated with T cells part of the T cell receptor signaling pathway, or are these methods pulling out noise? The two marker gene lists, which we call IRIS [1] and Bindea [2], do not have much agreement on B cells or T cells; on NK cells, there are no intersecting genes at all. We run gene ontology enrichment analysis on the genes they have in common and on the genes unique to each list to see which T cell pathways are found and where. The resulting significant (p < .001) GO terms are contained in the tables below. Though the genes are different, they belong to the same pathways. The IRIS list contains much more noise than the Bindea list. This suggests that the method of ??? is more effective at extracting the unique properties of each immune cell subtype. GO terms in intersection of T cell IRIS and Bindea marker gene lists: 1. T cell receptor signaling pathway 7. T cell aggregation 2. antigen receptor-mediated signaling pathway 8. lymphocyte aggregation 3. T cell costimulation 9. leukocyte aggregation 10. T cell selection 4. lymphocyte costimulation 11. leukocyte cell-cell adhesion 5. immune response-activating cell surface receptor signaling pathway 12. immune response-activating signal transduction 6. T cell activation 13. positive regulation of T cell activation 2 14. homotypic cell-cell adhesion 35. biological adhesion 15. positive regulation of homotypic cell-cell adhesion 36. positive regulation of cell adhesion 37. regulation of lymphocyte activation 16. positive regulation of leukocyte cell-cell adhesion 38. regulation of cell-cell adhesion 17. immune response-regulating cell surface receptor signaling pathway 39. cell-cell adhesion 18. activation of immune response 41. regulation of leukocyte activation 19. positive regulation of cell-cell adhesion 42. cell activation 20. lymphocyte activation 43. regulation of cell activation 21. positive regulation of lymphocyte activation 44. regulation of immune response 22. positive regulation of leukocyte activation 45. thymic T cell selection 23. immune response-regulating signaling pathway 46. positive T cell selection 24. positive regulation of immune response 25. T cell differentiation in thymus 47. positive regulation of calcium-mediated signaling 26. thymocyte aggregation 48. T cell differentiation 27. positive regulation of cell activation 49. regulation of cell adhesion 28. regulation of T cell activation 50. regulation of calcium-mediated signaling 29. regulation of leukocyte cell-cell adhesion 51. regulation of immune system process 30. leukocyte activation 52. immune system process 31. regulation of homotypic cell-cell adhesion 53. lymphocyte differentiation 32. single organismal cell-cell adhesion 54. immune response 33. single organism cell adhesion 55. olfactory bulb axon guidance 34. cell adhesion 56. positive regulation of response to stimulus 40. positive regulation of immune system process GO terms of T cell genes in IRIS but not in Bindea: 1. cell division 12. cell cycle G2/M phase transition 2. nuclear division 3. organelle fission 13. anaphase-promoting complex-dependent proteasomal ubiquitin-dependent protein catabolic process 4. cell cycle process 14. T cell activation 5. mitotic nuclear division 15. T cell aggregation 6. mitotic cell cycle process 16. lymphocyte aggregation 7. mitotic cell cycle 17. leukocyte aggregation 8. cell cycle 18. regulation of cell cycle 9. cell cycle checkpoint 19. spindle organization 10. mitotic cell cycle checkpoint 20. mitotic cell cycle phase transition 11. G2/M transition of mitotic cell cycle 21. leukocyte cell-cell adhesion 3 22. regulation of mitotic cell cycle 56. neuronal stem cell division 23. cell cycle phase transition 57. neuroblast division 24. negative regulation of mitotic cell cycle 58. single organism cell adhesion 25. regulation of spindle organization 59. microtubule cytoskeleton organization 26. homotypic cell-cell adhesion 60. V(D)J recombination 27. mitotic spindle organization 61. immune system development 28. somatic diversification of T cell receptor genes 62. meiotic nuclear division 29. somatic recombination of T cell receptor gene segments 63. mitotic G2 DNA damage checkpoint 64. interleukin-5 production 30. T cell receptor V(D)J recombination 65. regulation of interleukin-5 production 31. spindle stabilization 66. meiotic cell cycle process 32. spindle assembly involved in meiosis 67. DNA integrity checkpoint 33. lymphocyte activation 68. negative regulation of mitotic cell cycle phase transition 34. positive regulation of ubiquitin-protein transferase activity 69. nuclear envelope organization 35. regulation of ubiquitin homeostasis 70. positive regulation of ubiquitin-protein ligase activity involved in regulation of mitotic cell cycle transition 36. free ubiquitin chain polymerization 37. positive regulation of ligase activity 71. positive regulation of proteolysis involved in cellular protein catabolic process 38. meiotic cell cycle 39. mitotic nuclear envelope disassembly 72. regeneration 40. membrane disassembly 73. oogenesis 41. nuclear envelope disassembly 74. spindle assembly 42. forebrain neuroblast division 75. organ regeneration 43. leukocyte activation 76. T cell costimulation 44. sister chromatid segregation 77. positive regulation of protein ubiquitination 45. response to insecticide 78. nuclear chromosome segregation 46. activation of anaphase-promoting complex activity 79. lymphocyte costimulation 80. cell-cell adhesion 47. single organismal cell-cell adhesion 81. centrosome localization 48. neural precursor cell proliferation 82. regulation of mitotic spindle organization 49. regulation of cell cycle process 83. negative regulation of cell cycle phase transition 50. cell proliferation 84. positive regulation of cellular protein catabolic process 51. meiotic spindle organization 52. cell activation 85. negative regulation of cell cycle 53. regulation of ligase activity 86. positive regulation of protein modification by small protein conjugation or removal 54. regulation of ubiquitin-protein transferase activity 87. negative regulation of cell division 55. histone-serine phosphorylation 88. single-organism organelle organization 4 GO terms of T cell genes in Bindea but not in IRIS: 1. T cell receptor signaling pathway 33. positive regulation of cell-cell adhesion 2. antigen receptor-mediated signaling pathway 34. immune response 3. positive regulation of immune system process 35. positive regulation of lymphocyte activation 4. regulation of immune system process 36. T cell costimulation 5. positive regulation of leukocyte activation 37. lymphocyte costimulation 6. regulation of immune response 38. immune system process 7. positive regulation of cell activation 39. lymphocyte activation 8. regulation of T cell activation 40. immune response-regulating signaling pathway 9. regulation of leukocyte cell-cell adhesion 10. regulation of homotypic cell-cell adhesion 41. positive regulation of interleukin-2 biosynthetic process 11. regulation of cell adhesion 42. leukocyte activation 12. immune response-activating cell surface receptor signaling pathway 43. single organismal cell-cell adhesion 13. positive regulation of immune response 45. regulation of interleukin-2 biosynthetic process 14. positive regulation of cell adhesion 46. interleukin-2 biosynthetic process 15. regulation of lymphocyte activation 47. cell-cell adhesion 16. regulation of cell-cell adhesion 48. cell activation 17. regulation of leukocyte activation 49. regulation of defense response to virus by virus 18. T cell activation 50. positive regulation of interleukin-2 production 19. T cell aggregation 51. T cell differentiation 20. lymphocyte aggregation 52. positive regulation of response to stimulus 21. leukocyte aggregation 53. regulation of interleukin-2 production 22. cell adhesion 23. biological adhesion 54. positive regulation of alpha-beta T cell activation 24. regulation of cell activation 55. interleukin-2 production 25. positive regulation of T cell activation 56. positive regulation of cytokine biosynthetic process 44. single organism cell adhesion 26. positive regulation of homotypic cell-cell adhesion 57. positive regulation of myeloid dendritic cell activation 27. positive regulation of leukocyte cell-cell adhesion 58. lymphocyte differentiation 28. leukocyte cell-cell adhesion 59. positive regulation of adaptive immune response based on somatic recombination of immune receptors built from immunoglobulin superfamily domains 29. immune response-activating signal transduction 30. homotypic cell-cell adhesion 60. regulation of alpha-beta T cell activation 31. immune response-regulating cell surface receptor signaling pathway 61. positive regulation of lymphocyte mediated immunity 32. activation of immune response 5 62. Fc-epsilon receptor signaling pathway 64. positive regulation of adaptive immune response 63. positive regulation of signal transduction 2.2 Expression barcodes Storing representative gene expression signatures, as opposed to just marker genes, is key to more robust predictions of immune infiltrate cell type abundances. These distinctive transcriptional profiles are often called unique expression “barcodes” (seemingly named for the heatmaps commonly used to visualize microarray data). We now examine two methods that extract representative expression profiles. [5] introduces the following procedure to select barcodes. For each expressed gene, the authors find the two cell types with highest expression of this gene (perhaps in terms of mean expression across all samples from each cell type, although the details are not given). If the gene is differentially expressed within a 95% fold change confidence interval between those cell types, the gene is flagged as a potential marker for the cell type with higher expression. This approach would clearly fail for very similar subtypes, and may only pull out noise because of low sample sizes. So the authors also compare the cell types with highest and third-highest expression of this gene in case it is hard to tell between the top two groups. They progressively refine their basis matrix with an increasing number of top genes, and report that they minimize the condition number of their matrix with an intermediate number of included genes (360 genes). The authors note that their method produces a well-conditioned matrix. This is an important consideration because the condition number, defined as the ratio of the largest to smallest singular values in the singular value decomposition of the basis matrix, estimates how imprecise solutions to linear systems with this matrix are, and thus is a good proxy for the accuracy of deconvolution under the well-justified biological assumption of linearity [6]. The smaller the condition number, the better conditioned the basis matrix is, meaning the cell types are more distinct. However, more strict statistical testing with a controlled false discovery rate is desired. [7] provides this desired statistical rigor. Like the previous method, this one also iteratively deletes irrelevant genes. The authors find significantly differentially expressed genes between all populations using two-sided unequal variance t-tests, with a (fairly loose) false discovery rate threshold of q < .3 and with log fold change greater than 2.0. The number of selected genes per cell type is reduced from at most the first 150 towards 50 final selected genes in search of the best-conditioned matrix (minimum condition number). Here is an example of the output of these methods. [5] provides raw samples from several populations: T cells, two lines of B cells, and monocytes. Figure 1 and 2 are correlation matrices of the pure samples and of processed basis matrices (via [7] codebase), respectively. Note the poor differentiation in the raw data (especially note the scale), whereas differentiation is much easier in the processed matrix. 2.2.1 Analysis We want to characterize how well expression barcode methods distinguish similar cell types. [5] does not provide code to regenerate their full basis matrix from many samples. However, I was able to reproduce the basis matrix from [7] using their tools and supplied input data, albeit with less filtering: the authors postprocessed their signature matrix to remove some junk genes using annotations from cancer cell lines. Though my basis matrix thus included more genes, I obtained a very similar condition number to their matrix (which they call LM22), and the genes in common all had almost exactly the same expressions throughout. This suggest that the postprocessing that was poorly described and that I was unable to run did not significantly refine the matrix. I performed hierarchical clustering and computed pairwise Pearson correlations between cell typespecific profiles in the signature matrices from [5] and [7]. The pairwise Pearson correlation of the LM22 matrix [7] showed nice differentiation between cell types, and biologically-related cell types were highly correlated (Figure 3). In contrast, the pairwise Pearson correlations from the matrix in [5], hereafter called Abbas, showed very poor differentiation among several B cell types (Figure 4). I also computed pairwise Pearson 6 Figure 1: Pairwise correlation in raw data from [5]. 7 Figure 2: Pairwise correlation in basis matrix created from raw data of [5]. 8 correlations from the combined matrices (Figure 5). Different methods with different datasets still produce nice expected correlations, although are also several unexpected inter-matrix correlations. Figure 3: Pairwise Pearson correlation in LM22 [7]. Hierarchical clustering of genes and cell types in LM22 generally recovers biological similarities between cell types (Figure 6). There is one exception: gamma delta T cells. However, this cell type has been flagged as problematic and may be ignored [8]. Since LM22 has nice differentiation between cell types, it is interesting to examine the most similar cell types in this matrix. The Pearson correlations and the hierarchical clustering reveal that the following classes in LM22 are most similar: • B cells memory, naive • CD4 T cells naive, memory resting When one of each pair of similar cell types is removed, the condition number decreases from 11.38 to 9.30, meaning the resulting matrix is considerably better at deconvolving the more distinct set of cell types. 3 Future directions In total, these papers have 390 microarrays samples. I downloaded and normalized all this array data. We can construct a much richer set of expression profiles from this expanded dataset. In fact, the sample size could potentially allow us to model variance and not just use mean expression profiles, which could be critical for deconvolving the immune contexture of tumors, in which immune cells may have differing activations or other properties depending on the state of the tumor. 9 Figure 4: Pairwise correlation in Abbas [5]. 10 Figure 5: Pairwise pearson correlation in combination of LM22 and Abbas basis matrices, as well as with raw data from [5]. 11 Figure 6: LM22 hierarchical clustering 12 T.cells.CD4.memory.resting T.cells.CD8 T.cells.CD4.naive T.cells.regulatory..Tregs. T.cells.follicular.helper Macrophages.M2 T.cells.CD4.memory.activated Dendritic.cells.resting Macrophages.M0 Eosinophils Mast.cells.activated Mast.cells.resting Neutrophils Monocytes T.cells.gamma.delta NK.cells.resting NK.cells.activated Macrophages.M1 Dendritic.cells.activated B.cells.naive B.cells.memory Plasma.cells SEPT5 CEMP1 EFNA5 KIRREL NPAS1 NTN3 SSX1 PCDHA5 ZNF442 CDH12 KRT18P50 IFNA10 MBL2 PTPRG LHCGR PKD2L2 DACH1 GIPR SMPDL3B MAGEA11 GYPE BRSK2 HIC1 GPR1 CLCA3P GALR1 PDE6C TARDBPP1 BFSP1 MAP9 MAP4K2 CRYBB1 MAP3K13 KCNG2 ATXN8OS GPR25 FRK SLC12A1 UPK3A WNT7A TRAV12−2 CA8 TSHR CDK6 SCN9A PPFIBP1 CCDC102B TEC NOX3 IL5 LINC00597 IL17A IL26 IL4 SKA1 CDC25A ZBTB32 REN MROH7 DENND5B UGT2B17 PAX7 HIST1H2AE ANGPT4 ZNF204P FLJ13197 ZNF324 SERGEF CXorf57 ZNF135 CDHR1 TEP1 KLRC4 NAALADL1 FAM124B STXBP6 PAQR5 SEPT8 RRP9 ORC1 FZD3 SLC7A10 GGT5 TRPM4 KIAA0754 NME8 WNT5B HTR2B GRAP2 COLQ ZBP1 KIR2DL1 S1PR5 TRAF4 MICAL3 IL7 NMBR BEND5 LINC00921 ABCB4 FAM212B STEAP4 C5AR2 CASP5 ABCB9 MAST1 TGM5 CCR10 HIST1H2BG MANEA RASA3 FBXL8 EPB41 RCAN3 CCR6 RPL10L TRAV13−2 HMGB3P30 BARX2 LILRA4 CRISP3 GPR19 TRAV8−6 EPHA1 DSC1 GAL3ST4 VILL BCL7A MEP1A UGT1A8 ZNF286A TMEM156 FRMD8 RYR1 LOC126987 PLCH2 TNFRSF11A SEC31B EPN2 TRPM6 IL5RA P2RY2 DEPDC5 ZNF222 RRP12 DAPK2 TLR7 ARRB1 LILRA3 ACHE APOL6 TNIP3 SLCO5A1 PDCD1LG2 TMEM255A DHRS11 FZD2 GSTT1 COL8A2 GPC4 CSF1 CR2 CD72 FCER2 NLRP3 ASGR1 PADI4 MEFV VNN1 HPSE FES ASGR2 HNMT FAM198B CD33 LTC4S IL1RL1 CMA1 CCL7 SLC12A8 CCL1 NTRK1 FAM174B ADAMTS3 RAB27B HOXA1 BPI AZU1 GFI1 CSF2 IL3 CD40LG TRAV13−1 DPP4 CXCR6 LAG3 PRR5L KLRC3 PLEKHF1 CRTAM ST6GALNAC4 C11orf80 PDK1 KCNA3 TNFRSF13B GNG7 REPS2 TREML2 PGLYRP1 MAK BTNL8 CREB5 CEACAM3 VNN3 CAMP PLEKHG3 MXD1 P2RY10 SMPD3 OSM ST3GAL6 ICA1 PDCD1 CXCR5 ST8SIA1 ZBTB10 IL21 FLT3LG LEF1 UBASH3A LY9 LIME1 ACAP1 ANKRD55 CD70 FOXP3 TYR TRAV21 TRAV9−2 ETS1 CD28 CD8B FASLG PTGDR KIR2DS4 TXK IL12RB2 KIR3DL2 KIR2DL4 TNFSF14 LTA MSC DHX58 PTGIR RASSF4 TLR8 NOD2 PLA1A HESX1 SOCS1 CHST7 ARHGAP22 IL12B ETV3 CCL14 HRH1 RENBP FRMD4A CD68 CD209 FLVCR2 SLAMF8 ALOX15 CCR5 HSPA6 HAL FPR2 CDA NFE2 LILRA2 BST1 ALOX5 CCR2 CD1D CFP DCSTAMP MARCO PPBP BHLHE41 CXCL5 HK3 PSG2 NIPSNAP3B ADAM28 RALGPS2 BRAF NPIPB15 SP140 CD180 GPR18 BACH2 HHEX STAP1 VPREB3 CD22 FCRL2 BLK SPIB FCGR2B PNOC AIM2 IL2RA IFNG IL9 SIT1 CD5 PBXIP1 CLEC2D ZFP36L2 CD96 LAIR2 SKAP1 SPOCK2 DGKA MAP4K1 SH2D1A SIRPG CD6 TRAT1 TNFRSF4 CTLA4 TBX21 GZMM NCR3 DEFA4 TTC38 CD244 CD160 KLRG1 DUSP2 RPL3P7 CD3G TCF7 ATHL1 BMP2K P2RX1 MARCH3 RGS13 MS4A2 IL18R1 ATP8B4 CEACAM8 MS4A3 IL1A MYB FCER1A CTSG ELANE ADRB2 CXCL3 CCL20 MAN1A1 AMPD1 SPAG4 IGHE RASGRP3 EAF2 TNFRSF17 LOC100130100 CD7 PMCH GPR171 PASK ICOS CHI3L2 TRIB2 CXCL13 CD8A GPR97 EMR3 EMR2 CCR3 ZNF165 NR4A3 GPR183 RGS1 GPR65 P2RY14 RNASE2 LRMP FOSB CYP27B1 SLC2A6 SLAMF1 SIGLEC1 CCL8 IFI44L CD80 CD40 CXCL11 APOL3 ADAMDEC1 KYNU RSAD2 CD86 CLIC2 CCL17 BIRC3 TREM2 C1orf54 FPR3 EGR2 CLEC4A CLEC10A NPL CCL23 HLA−DQA1 CCL13 SLC15A3 CHI3L1 CYP27A1 PLA2G7 IL18RAP KLRD1 KLRF1 APOBEC3G CTSW PTGER2 CCND2 CD300A IL1B HDC FFAR2 TNFRSF10C MGAM CXCR1 MMP25 CHST15 QPCT DPEP2 P2RY13 TREM1 CSF3R TLR2 LILRB2 APOBEC3A CLEC7A IGSF6 S100A12 HCK LST1 C5AR1 AQP9 VNN2 CXCR2 CD1B CD1A MMP12 RNASE6 CD1C CD1E ACP5 AIF1 MS4A6A CD4 CCL18 CD19 CD79B KIAA0226L BANK1 HLA−DOB P2RX5 CD79A SIK1 LY86 IL4R TCL1A IRF8 MS4A1 CD37 FAIM3 FAM65B RASGRP2 CD69 FCN1 CCL22 SAMSN1 C3AR1 BCL2A1 EMR1 CLC NCF2 MNDA FPR1 FCGR3B MMP9 TPSAB1 CPA3 HPGDS PRG2 CD3E PIK3IP1 BCL11B LAT PVRIG ZAP70 PTPRCAP CD27 ITK LCK CD2 SELL IL7R TRAC TRBC1 CD3D LTB PRF1 GNLY GZMA IL2RB CD247 GZMB GZMH KLRK1 KLRB1 GZMK CST7 NKG7 TRDC CCL5 IDO1 LAMP3 CCR7 TNFAIP6 EBI3 CD38 CXCL9 CCL19 CXCL10 CCL4 IGKC IGHM IGLL3P GUSBP11 IGHD MZB1 Since RNAseq is popular today for tumor sequencing, it is desirable to obtain enriched immune cell line RNAseq data and produce a new basis matrix. However, online discussion suggests that RNAseq does not support the independence assumptions in microarray analysis: https://www.biostars.org/p/160961/. This context may require different reference profile expression methods. References 1. Abbas A, Baldwin D, Ma Y, Ouyang W, Gurney A, Martin F, et al. Immune response in silico (iRIS): Immune-specific genes identified from a compendium of microarray expression data. Genes and immunity. Nature Publishing Group; 2005;6: 319–331. 2. Bindea G, Mlecnik B, Tosolini M, Kirilovsky A, Waldner M, Obenauf AC, et al. Spatiotemporal dynamics of intratumoral immune cells reveal the immune landscape in human cancer. Immunity. Elsevier; 2013;39: 782–795. 3. Yoshihara K, Shahmoradgoli M, Martínez E, Vegesna R, Kim H, Torres-Garcia W, et al. Inferring tumour purity and stromal and immune cell admixture from expression data. Nature communications. Nature Publishing Group; 2013;4. 4. Storey JD, Xiao W, Leek JT, Tompkins RG, Davis RW. Significance analysis of time course microarray experiments. Proceedings of the National Academy of Sciences of the United States of America. National Acad Sciences; 2005;102: 12837–12842. 5. Abbas AR, Wolslegel K, Seshasayee D, Modrusan Z, Clark HF. Deconvolution of blood microarray data identifies cellular activation patterns in systemic lupus erythematosus. PloS one. Public Library of Science; 2009;4: e6098. 6. Shen-Orr SS, Tibshirani R, Khatri P, Bodian DL, Staedtler F, Perry NM, et al. Cell type–specific gene expression differences in complex tissues. Nature methods. Nature Publishing Group; 2010;7: 287–289. 7. Newman AM, Liu CL, Green MR, Gentles AJ, Feng W, Xu Y, et al. Robust enumeration of cell subsets from tissue expression profiles. Nature methods. Nature Publishing Group; 2015;12: 453–457. 8. Senbabaoglu Y, Winer AG, Gejman RS, Liu M, Luna A, Ostrovnaya I, et al. The landscape of t cell infiltration in human cancer and its association with antigen presenting gene expression. bioRxiv. Cold Spring Harbor Labs Journals; 2015; doi:10.1101/025908 13