Supplementary methods for:
“Gene expression of lung squamous cell carcinoma reflects mode of lymph node
involvement”
Jill E. Larsen, Sandra J. Pavey, Rayleen Bowman, Ian A. Yang, Belinda E. Clarke,
Maree L. Colosimo, Nicholas K. Hayward, Kwun M. Fong.
Outline:
1. Supplementary methods
2. Supplementary Table 1: 125 genes differentially expressed between N0 and
N1/2m SCC samples
Supplementary Methods
Tumor samples
Fifty-nine patients undergoing curative-intent surgical resection of primary lung SCC without
neoadjuvant chemotherapy or radiotherapy were recruited for the study (Table 1),
which had approval from The Prince Charles Hospital Human Research Ethics
Committee. All patients gave informed, written consent. The resected tumor,
dissected lymph nodes and associated lung were examined histologically to determine
the pathological stage of the cancer according to the American Joint Committee on
Cancer criteria [1], and tumor samples were stratified into three groups: N0 (n=35), hilar N1
nodes involved by direct extension (N1d) (n=8), and N1/N2 nodes involved by lymphatic
metastasis (N1/2m) (n=16).
Microarray experiments
To compare the genetic profiles of N0, N1d and N1/2m tumors, we used gene
expression data from the 59 SCCs, which have been previously described [2]. Briefly, total
RNA from each tumor sample was hybridized with a common reference sample to a
commercially available 22K Human V2.0 Oligo Microarray (Operon Biotechnologies,
Cologne, Germany). Microarray experiments conformed to MIAME guidelines. Raw
images were processed in Imagene V5.1 (BioDiscovery, CA, USA) and imported into
BRB-ArrayTools (Version 3.5; developed by Dr. Richard Simon and Amy Peng Lam)
for normalization, filtering on signal intensity and spot morphology, and statistical
analysis. The data discussed in this publication have been made available in NCBI’s
Gene Expression Omnibus (GEO) public repository
(http://www.ncbi.nlm.nih.gov/geo/) through GEO Series accession number GSE5868.
Statistical analysis
Three separate approaches were used to analyse N1d tumors in this cohort of 59
SCCs. Supervised analysis and class prediction analysis were performed on the gene
expression data to examine the biology, and Kaplan-Meier (log-rank) analysis was
performed to compare the clinical outcome of N1d tumors with that of N0 and
N1/2m tumors.
A supervised analysis was performed to identify genes differentially
expressed between N0 and N1/2m tumors (Wilcoxon Mann-Whitney U (WMWU)
statistic, P<0.01).
Class prediction models were developed to predict whether N1d
tumors were biologically more similar to N0 tumors or N1/2m tumors. Given the
reported variability in prediction models [3-6], five independent models were used for
consensus. Models that have demonstrated robustness for microarray data, where the
number of genes far outweighs the number of samples, were selected: Compound
Covariate Predictor [7], Diagonal Linear Discriminant Analysis [4], Nearest Neighbor
Predictor [4], Nearest Centroid Predictor [8], and Support Vector Machine Predictor
with a linear kernel [9]. Usually, prediction models are
developed with samples of known class to identify samples of unknown class.
Classification of the unknown sample is performed by identifying to which expression
profile the unknown sample is most similar. In the current analysis, we were not
attempting to classify N1d tumors as N0 or N1/2m tumors but rather to identify to which
class, N0 or N1/2m, N1d tumors were most similar. Each of the five models was
built from the training set of 35 N0 and 16 N1/2m samples, incorporating genes
differentially expressed at the 0.01 significance level as assessed
by the random variance t-test [10]; the prediction error of each model was estimated
using leave-one-out cross-validation (LOOCV). For each LOOCV training set, the entire model
building process was repeated, including the gene selection process, and N1d tumors
were completely excluded from the model building to ensure no gene selection bias.
Each model was then used to predict to which class (N0 or N1/2m) each of the 8 N1d
samples was most similar.
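The cross-validation scheme described above, in which gene selection is repeated inside every leave-one-out fold and the N1d samples never touch model building, can be sketched roughly as follows. This is an illustrative sketch only, not the BRB-ArrayTools implementation: it uses a plain nearest-centroid rule, swaps the random variance t-test for a simple Welch t-statistic ranking (top_k genes), and runs on simulated expression values. All function names, the top_k parameter and the simulated data are assumptions introduced for illustration.

```python
import random
import statistics

def welch_t(a, b):
    """Welch t-statistic for two lists of expression values."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return 0.0 if se == 0 else (statistics.mean(a) - statistics.mean(b)) / se

def select_genes(samples, labels, top_k):
    """Rank genes by |t| between classes using ONLY the samples passed in."""
    scores = []
    for g in range(len(samples[0])):
        a = [s[g] for s, y in zip(samples, labels) if y == "N0"]
        b = [s[g] for s, y in zip(samples, labels) if y == "N1/2m"]
        scores.append((abs(welch_t(a, b)), g))
    scores.sort(reverse=True)
    return [g for _, g in scores[:top_k]]

def fit_centroids(samples, labels, genes):
    """Mean expression of each class over the selected genes."""
    centroids = {}
    for cls in sorted(set(labels)):
        rows = [s for s, y in zip(samples, labels) if y == cls]
        centroids[cls] = [statistics.mean(r[g] for r in rows) for g in genes]
    return centroids

def predict(sample, centroids, genes):
    """Return the class whose centroid is nearest (squared Euclidean distance)."""
    def dist(c):
        return sum((sample[g] - m) ** 2 for g, m in zip(genes, c))
    return min(centroids, key=lambda cls: dist(centroids[cls]))

def loocv_error(samples, labels, top_k):
    """Leave-one-out error; gene selection is redone within every fold."""
    errors = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        genes = select_genes(train_x, train_y, top_k)
        centroids = fit_centroids(train_x, train_y, genes)
        if predict(samples[i], centroids, genes) != labels[i]:
            errors += 1
    return errors / len(samples)

def make_samples(rng, n, shift, n_genes=20, n_informative=3):
    """Simulated data: the first n_informative genes are shifted by `shift`."""
    return [[rng.gauss(shift if g < n_informative else 0.0, 1.0)
             for g in range(n_genes)] for _ in range(n)]

rng = random.Random(0)
train_x = make_samples(rng, 35, 0.0) + make_samples(rng, 16, 4.0)
train_y = ["N0"] * 35 + ["N1/2m"] * 16
n1d_x = make_samples(rng, 8, 0.0)  # simulated as N0-like, purely for illustration

error = loocv_error(train_x, train_y, top_k=3)

# Final model: built on all 51 training samples; N1d samples are only predicted.
genes = select_genes(train_x, train_y, top_k=3)
centroids = fit_centroids(train_x, train_y, genes)
n1d_calls = [predict(s, centroids, genes) for s in n1d_x]
print(f"LOOCV error: {error:.2f}", n1d_calls)
```

Selecting genes on the full training set before cross-validating would let the held-out sample influence the gene list and give an optimistically biased error estimate, which is why the selection step sits inside the loop.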
To ensure the selected cases were representative in terms of clinical outcomes,
survival estimates of the SCC subsets were analyzed (Kaplan-Meier (log-rank)
analysis) in SPSS for Windows Version 11.5 (SPSS Inc., IL, USA). Hierarchical
clustering was performed using Pearson correlation with bootstrapping of 1,000
iterations in both the sample and feature dimensions. Kaplan-Meier survival plots and
log-rank tests performed in SPSS Version 11.5 (SPSS Inc.) were used to assess the
differences in survival of N0, N1d, and N1/2m tumors.
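For readers who want to see what the Kaplan-Meier estimate and log-rank comparison compute, a minimal standard-library sketch is given here. It is not the SPSS procedure used in the study: it handles only a pairwise two-group comparison (the study compared N0, N1d and N1/2m tumors), and the follow-up times and event indicators are invented for illustration. The function names are assumptions.

```python
import math

def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square with 1 df, p-value)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Invented follow-up times (months) and event flags, for illustration only.
n0_times, n0_events = [60, 58, 55, 50, 48, 40], [0, 0, 1, 0, 0, 1]
n12m_times, n12m_events = [30, 24, 20, 15, 12, 8], [1, 1, 1, 0, 1, 1]

print(kaplan_meier(n0_times, n0_events))
print(logrank_test(n0_times, n0_events, n12m_times, n12m_events))
```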
References
1. Mountain CF. Revisions in the International System for Staging Lung Cancer.
Chest 1997: 111: 1710-1717.
2. Larsen JE, Pavey SJ, Passmore LH, Bowman RV, Clarke BE, Hayward NK,
Fong KM. Expression profiling defines a recurrence signature in lung squamous cell
carcinoma. Carcinogenesis 2006: Nov 1; [Epub ahead of print].
3. Ben-Dor A, Bruhn L, Friedman N, Nachman I, Schummer M, Yakhini Z.
Tissue classification with gene expression profiles. J Comput Biol 2000: 7: 559-583.
4. Dudoit S, Fridlyand J, Speed TP. Comparison of discrimination methods for
the classification of tumors using gene expression data. J Am Stat Assoc 2002: 97: 77-87.
5. Simon R. Diagnostic and prognostic prediction using gene expression profiles
in high-dimensional microarray data. Br J Cancer 2003: 89: 1599-1604.
6. Lee JW, Lee JB, Park M, Song SH. An extensive comparison of recent
classification tools applied to microarray data. Comput Stat Data Anal 2005: 48: 869-885.
7. Radmacher MD, McShane LM, Simon R. A paradigm for class prediction
using gene expression profiles. J Comput Biol 2002: 9: 505-511.
8. Tibshirani R, Hastie T, Narasimhan B, Chu G. Diagnosis of multiple cancer
types by shrunken centroids of gene expression. Proc Natl Acad Sci U S A 2002: 99:
6567-6572.
9. Ramaswamy S, Tamayo P, Rifkin R, Mukherjee S, Yeang CH, Angelo M,
Ladd C, Reich M, Latulippe E, Mesirov JP, Poggio T, Gerald W, Loda M, Lander ES,
Golub TR. Multiclass cancer diagnosis using tumor gene expression signatures. Proc
Natl Acad Sci U S A 2001: 98: 15149-15154.
10. Wright GW, Simon RM. A random variance model for detection of
differential gene expression in small microarray experiments. Bioinformatics 2003:
19: 2448-2455.
Supplementary Table 1: Supervised analysis (Mann-Whitney U) of N0 tumors (n=35) vs. N1/2m tumors (lymphatic metastasis; n=16) identified 125
transcripts significantly differentially expressed (P<0.01). FC, fold change in expression (N1/2m relative to N0 tumors).
Unique id    P-value    FC    Gene symbol    Description    Genbank    Unigene    Map
DNAJB12
DnaJ (Hsp40) homolog, subfamily B, member 12
Human BAC clone CTB-7J15 from 7q31
Hypothetical protein MGC20255
Homo sapiens cDNA: FLJ23252 fis, clone
COL04668
Hypothetical protein MGC4767
Purinergic receptor (family A group 5)
Amiloride-sensitive cation channel 3, testis
Filamin C, gamma (actin binding protein 280)
NAG-7 protein
Sarcoglycan, delta (35kD dystrophin-associated
glycoprotein)
Kinesin-like 4
Homo sapiens, clone MGC:16152
IMAGE:3632546, mRNA
Protein disulfide isomerase related protein
(calcium-binding protein, intestinal-related)
Phosphodiesterase 3A, cGMP-inhibited
Zinc finger protein 106
Homo sapiens isolate sy-4M/12-H1
immunoglobulin heavy chain variable region
mRNA, partial cds
Homo sapiens EST from clone 491476, full insert
Hypothetical protein FLJ13373
Hypothetical protein FLJ10430
Poly(A) binding protein, cytoplasmic 5
CD83 antigen (activated B lymphocytes,
immunoglobulin superfamily)
Low density lipoprotein-related protein 2
Valyl-tRNA synthetase 2
Zinc finger protein
Caudal type homeo box transcription factor 4
GDNF family receptor alpha 2
Homo sapiens PRO2272 mRNA
Human DNA sequence from BAC 15E1 on
chromosome 12. Contains Cytochrome C
Oxidase Polypeptide VIa-liv
Opioid receptor, mu 1
Protein tyrosine phosphatase, receptor type, E
NM_017626
AC003989
NM_052848
Hs.7960
Hs.248069
Hs.334775
10q22.1
7q31.1
19q13.2
AK026905
NM_032314
NM_005767
NM_020321
NM_001458
NM_013343
Hs.306895
Hs.17250
Hs.189999
Hs.98547
Hs.58414
Hs.278951
11p15.3
12q24.31
13q14.2
7q36.1
7q32.1
3p25.3
NM_000337
NM_007317
Hs.151899
Hs.119324
5q33.3
16p11.2
BC013276
Hs.6872
1p36.33
NM_004911
NM_000921
NM_022473
Hs.93659
Hs.777
Hs.15220
10q26.3
12p12.2
15q15.1
AY003854
AL355685
NM_025006
NM_018092
AL122118
Hs.348649
Hs.9042
Hs.287567
Hs.6823
Hs.190614
14q32.33
21q22.11
11p15.4
16q12.1
23q21.31
NM_004233
NM_004525
NM_006295
NM_015871
NM_005193
U97145
AF119872
Hs.79197
Hs.153595
Hs.159637
Hs.102419
Hs.248098
Hs.19317
Hs.283036
6p23
2q31.1
6p21.33
1p36.11
23q13.2
8p21.3
7q33
BE961032
NM_000914
NM_006504
Hs.200400
Hs.2353
Hs.31137
12q24.31
6q25.2
10q26.2
Homo sapiens clone 24670 mRNA sequence
Homo sapiens mRNA; cDNA DKFZp564I083
Nuclear receptor subfamily 1, group D, member
2
Zinc finger protein 197
Hypothetical protein FLJ11011
Carbohydrate (N-acetylglucosamine-6-O)
sulfotransferase 2
Cytochrome P450, subfamily IIIA, polypeptide 7
C-reactive protein, pentraxin-related
Dual specificity phosphatase 9
RD RNA-binding protein
Homo sapiens cDNA FLJ30346 fis, clone
BRACE2007527
Keratin, hair, basic, 1
Hypothetical protein FLJ12606
Homo sapiens cDNA FLJ11443 fis, clone
HEMBA1001330
Solute carrier family 2 (facilitated glucose
transporter), member 10
Poliovirus receptor-related 2 (herpesvirus entry
mediator B)
Sprouty homolog 4 (Drosophila)
Ectonucleoside triphosphate
diphosphohydrolase 6 (putative function)
AD036 protein
Hypothetical protein FLJ11068
Glutathione peroxidase 5 (epididymal androgen-related protein)
AF055019
AL049279
Hs.21906
Hs.350521
7q34
18q21.2
BC015929
NM_006991
NM_018299
Hs.37288
Hs.170341
Hs.21275
3p24.2
3p21.32
8q21.11
AB021124
NM_000765
NM_000567
NM_001395
NM_002904
Hs.8786
Hs.172323
Hs.76452
Hs.144879
Hs.106061
3q23
7q22.1
1q23.2
23q28
6p21.32
AK054908
NM_002281
NM_024804
Hs.349624
Hs.32952
Hs.163754
9q34.3
12q13.13
1q44
AK021505
Hs.297945
13q31.3
NM_030777
Hs.305971
20q13.12
NM_002856
NM_030964
Hs.183986
Hs.285814
19q13.31
5q31.3
NM_001247
BC017701
NM_018314
Hs.12330
Hs.21941
Hs.337778
20p11.21
4q32.1
11p15.1
NM_001509
Hs.248129
6p22.1
H200001540
H200016041
H200019678
0.0002
0.0003
0.0003
1.20
0.77
0.82
H200018661
H200002637
H200014446
H200010894
H200005321
H200017395
0.0004
0.0005
0.0005
0.0007
0.0007
0.0009
0.82
0.71
0.78
1.29
1.20
1.22
MGC4767
P2Y5
ACCN3
FLNC
NAG-7
H200013795
H200012204
0.0013
0.0014
1.17
1.31
SGCD
KNSL4
H200001320
0.0015
1.18
H200010594
H200000198
H200002460
0.0015
0.0016
0.0016
1.19
1.17
0.80
H200020340
H200001728
H200009045
H200001310
H200014461
0.0018
0.0022
0.0022
0.0023
0.0024
2.33
0.69
1.32
0.71
0.74
H200006646
H200013886
H200007484
H200011173
H200016053
H200002813
H200017653
0.0025
0.0026
0.0026
0.0028
0.0029
0.0029
0.0030
1.13
1.42
1.35
1.22
0.87
1.18
1.63
H200014883
H200000598
H200004069
H200001617
H200003086
H200020537
0.0030
0.0030
0.0030
0.0031
0.0031
0.0031
0.90
0.78
1.11
1.28
0.73
1.16
H200004393
H200007937
H200002997
0.0032
0.0032
0.0035
1.33
1.14
0.75
NR1D2
ZNF197
FLJ11011
H200001687
H200008040
H200006291
H200013482
H200011420
0.0036
0.0037
0.0039
0.0039
0.0043
1.22
1.19
1.18
0.81
0.77
CHST2
CYP3A7
CRP
DUSP9
RDBP
H200020457
H200004194
H200007571
0.0044
0.0044
0.0045
1.38
1.12
1.12
KRTHB1
FLJ12606
H200009846
0.0046
1.11
H200018198
0.0046
0.65
SLC2A10
H200014240
H200008849
0.0048
0.0048
1.19
1.12
PVRL2
SPRY4
H200002111
H200003092
H200019853
0.0049
0.0050
0.0050
0.74
0.80
1.22
ENTPD6
AD036
FLJ11068
H200016075
0.0050
1.11
GPX5
MGC20255
ERP70
PDE3A
ZFP106
FLJ13373
FLJ10430
PABPC5
CD83
LRP2
VARS2
LOC51042
CDX4
GFRA2
OPRM1
PTPRE
H200009359
0.0050
1.13
MKLN1
H200000061
H200004887
H200007318
H200009720
H200013504
0.0050
0.0051
0.0052
0.0053
0.0053
1.17
1.13
0.72
1.33
1.15
TGFBR1
ZNF228
HAP1
PLAB
PODLX2
H200007176
0.0054
1.19
H200004794
H200015840
H200019009
H200006876
H200009158
H200012712
H200017427
0.0054
0.0054
0.0056
0.0056
0.0056
0.0057
0.0057
0.88
1.12
0.72
1.17
1.26
1.25
1.18
DELGEF
FLJ23529
H200002416
0.0058
0.78
FLJ10773
H200018953
H200018942
0.0059
0.0060
0.82
0.87
C20orf72
H200019168
0.0061
0.87
H200006704
0.0061
0.71
BASP1
H200002016
H200014107
H200002552
0.0062
0.0062
0.0063
0.89
1.12
0.74
GRHPR
KIAA1373
H200006259
H200006066
H200012188
0.0064
0.0065
0.0065
0.82
1.12
0.87
ATP6L
AIP
ARF3
H200014906
H200003651
H200017068
0.0066
0.0066
0.0066
1.12
1.20
0.85
FLJ21657
LOC51114
H200020810
H200006641
0.0067
0.0068
0.82
1.09
SSRP1
H200004841
H200010802
H200004452
H200013544
H200019805
H200015525
H200002486
H200012483
0.0069
0.0069
0.0069
0.0070
0.0070
0.0070
0.0070
0.0070
0.79
1.18
0.79
1.17
1.61
1.20
1.16
1.10
FLJ11045
MGC15737
FLJ12610
IGLJ3
KIAA0707
MGC3260
SPS
H200018865
H200010466
0.0071
0.0072
0.89
1.14
H200002701
H200020311
H200011726
H200019236
H200009205
H200017487
0.0073
0.0074
0.0075
0.0076
0.0077
0.0077
0.89
0.90
0.64
1.23
1.21
0.89
H200008649
H200003699
H200017986
0.0079
0.0079
0.0080
0.83
1.17
0.70
CLUL1
ZNF-U69274
H200015965
H200004891
H200017308
H200015113
H200011184
H200010534
0.0081
0.0081
0.0083
0.0084
0.0084
0.0084
0.82
1.25
0.77
0.85
0.65
1.21
COX6BP-3
FLJ23093
SNX17
IFNA2
NTN4
SYTL2
H200005027
0.0087
0.92
ITGAX
C5orf5
FLJ23185
BIRC3
DKFZP434I2117
SEPT6
ETS1
GTR2
FLJ13381
PRO1693
Muskelin 1, intracellular mediator containing
kelch motifs
Transforming growth factor, beta receptor I
(activin A receptor type II-like kinase, 53kD)
Zinc finger protein 228
Huntingtin-associated protein 1 (neuroan 1)
Prostate differentiation factor
Endoglycan
Homo sapiens cDNA FLJ13558 fis, clone
PLACE1007743
Deafness locus associated putative guanine
nucleotide exchange factor
Hypothetical protein FLJ23529
Homo sapiens mRNA; cDNA DKFZp586J101
Chromosome 5 open reading frame 5
Hypothetical protein FLJ23185
Baculoviral IAP repeat-containing 3
Hypothetical protein DKFZp434I2117
Likely ortholog of mouse NPC derived proline
rich protein 1
Homo sapiens cDNA: FLJ21586 fis, clone
COL06920
Chromosome 20 open reading frame 72
Homo sapiens clone P1 NTera2D1
teratocarcinoma mRNA
Brain abundant, membrane attached signal
protein 1
Homo sapiens, clone MGC:16395
IMAGE:3939387, mRNA
Glyoxylate reductase/hydroxypyruvate reductase
KIAA1373 protein
ATPase, H+ transporting, lysosomal (vacuolar
proton pump) 16kD
Aryl hydrocarbon receptor interacting protein
ADP-ribosylation factor 3
Homo sapiens cDNA FLJ11079 fis, clone
PLACE1005111
Hypothetical protein FLJ21657
CGI-89 protein
Homo sapiens cDNA FLJ31505 fis, clone
NT2NE2005821
Structure specific recognition protein 1
Homo sapiens cDNA FLJ13561 fis, clone
PLACE1008045
Hypothetical protein
Hypothetical protein MGC15737
Hypothetical protein FLJ12610
Immunoglobulin lambda joining 3
KIAA0707 protein
Hypothetical protein MGC3260
Selenium donor protein
Homo sapiens cDNA FLJ30075 fis, clone
BGGI11000285
Septin 6
V-ets erythroblastosis virus E26 oncogene
homolog 1 (avian)
Homo sapiens, clone IMAGE:4846433, mRNA
Rag C protein
Homo sapiens P231 mRNA
Hypothetical protein FLJ13381
PRO1693 protein
Homo sapiens cDNA FLJ32238 fis, clone
PLACE6004993
Clusterin-like 1 (retinal)
Zinc finger protein
Cytochrome c oxidase subunit VIb pseudogene3
Hypothetical protein FLJ23093
Sorting nexin 17
Interferon, alpha 2
Netrin 4
Synaptotagmin-like 2
Integrin, alpha X (antigen CD11C (p150), alpha
polypeptide)
NM_013255
Hs.288791
7q32.3
NM_004612
NM_013380
NM_003949
NM_004864
NM_015720
Hs.220
Hs.48589
Hs.158300
Hs.296638
Hs.145416
9q22.33
19q13.31
17q21.2
19p13.11
3q21.3
AK023620
Hs.86043
9p13.2
NM_012139
AB051484
AL050376
NM_016603
NM_025056
AF070674
NM_031478
Hs.46735
Hs.246306
Hs.322645
Hs.82035
Hs.287732
Hs.127799
Hs.279307
11p15.1
2p11.2
2p15
5q31.2
22q11.22
11q22.2
16p11.2
NM_018212
Hs.14838
1q42.12
AK025239
AK027503
Hs.321142
Hs.320831
17q21.32
20p11.23
AF279782
Hs.326284
3p26.1
NM_006317
Hs.79516
5p15.1
AK056767
AK024386
AB037794
Hs.11607
Hs.155742
Hs.16229
7p22.3
9p13.2
10q23.31
NM_001694
NM_003977
NM_001659
Hs.76159
Hs.75305
Hs.119177
16p13.3
11q13.2
12q13.12
AK001938
BC013351
AL161962
Hs.201441
Hs.26498
Hs.274351
13q14.11
5p12
23q25
AK056067
NM_003146
Hs.350805
Hs.79162
1q24.2
11q12.1
AK023623
NM_019038
NM_032926
NM_024782
X57812
AB014607
BC000073
NM_012247
Hs.47374
Hs.97464
Hs.39122
Hs.146139
Hs.336946
Hs.234786
Hs.15514
Hs.124027
15q21.2
13q12.12
23q22.2
2q35
22q11.22
1p32.3
20p13
10p13
AK054637
D50918
Hs.314986
Hs.90998
20q13.32
23q24
AK001630
BC015696
NM_022157
AF334590
NM_025068
NM_014097
Hs.18063
Hs.348617
Hs.110950
Hs.326746
Hs.287988
Hs.279778
11q24.3
10p13
1p34.3
AK056800
NM_014410
NM_014415
Hs.183161
Hs.26886
Hs.301956
12q24.31
18p11.32
3q12.3
AL031594
NM_024643
NM_014748
NM_000605
AF278532
NM_032943
Hs.247884
Hs.48642
Hs.278569
Hs.211575
Hs.102541
Hs.92254
22q13.1
14q24.3
2p23.3
9p21.3
12q22
11q14.1
NM_000887
Hs.51077
16p11.2
1q42.2
H200003460
H200008021
H200016016
H200007428
H200006621
H200015246
H200020202
H200012519
H200004635
H200014336
H200005726
0.0087
0.0088
0.0091
0.0091
0.0092
0.0093
0.0093
0.0094
0.0096
0.0096
0.0096
0.84
1.27
1.19
1.15
0.67
0.77
0.83
1.14
0.86
1.15
1.24
RGS5
STK18
IGLV9-49
KIAA0138
MRPL3
KIAA1718
KIAA1920
MGC2601
ATP8B2
GBX2
H3FK
H200001623
0.0097
1.16
ADAMTS1
H200006727
H200004980
0.0097
0.0098
1.29
0.80
MMD
TBX19
H200018074
H200000587
H200003041
0.0099
0.0099
0.0099
1.12
0.88
0.85
EPO
KIAA0644
Regulator of G-protein signalling 5
Serine/threonine kinase 18
Immunoglobulin lambda variable 9-49
KIAA0138 gene product
Mitochondrial ribosomal protein L3
KIAA1718 protein
KIAA1920 protein
Hypothetical protein MGC2601
ATPase, Class I, type 8B, member 2
Gastrulation brain homeo box 2
H3 histone family, member K
A disintegrin-like and metalloprotease (reprolysin
type) with thrombospondin type 1 motif, 1
Monocyte to macrophage differentiation-associated
T-box 19
Human DNA sequence from clone RP5-1182A14
on chromosome 1 Contains part of a gene
similar to rat Esp
Erythropoietin
KIAA0644 gene product
NM_003617
NM_014264
BG340948
NM_014649
NM_007208
AB051505
AB067507
NM_024042
AB032963
NM_001485
NM_003536
Hs.24950
Hs.172052
Hs.248011
Hs.159384
Hs.79086
Hs.222707
Hs.348315
Hs.124915
Hs.43577
Hs.184945
Hs.70937
1q23.3
4q28.1
NM_006988
Hs.8230
21q21.3
NM_012329
NM_005149
Hs.79889
Hs.50403
17q22
1q24.2
AL137798
NM_000799
NM_014817
Hs.302115
Hs.2303
Hs.21572
1p36.13
7q22.1
7p15.1
19p13.3
3q22.1
7q34
15q25.2
16p13.3
1q21.3
2q37.2
6p22.1
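The supervised comparison summarized in Supplementary Table 1 (a per-transcript Mann-Whitney U test at P<0.01, reported with a fold change) can be sketched as follows. This is a hedged illustration, not the BRB-ArrayTools routine: the p-value uses the normal approximation without tie or continuity correction, the fold change is taken as the ratio of group means on a linear scale (an assumption, since the exact calculation is not described here), and the two example genes and their values are invented.

```python
import math
import statistics

def rank_with_ties(values):
    """Average 1-based ranks, so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(a, b):
    """Return (U statistic for group a, two-sided p-value).

    The p-value uses the normal approximation without tie or continuity
    correction, which is a simplification.
    """
    n1, n2 = len(a), len(b)
    ranks = rank_with_ties(list(a) + list(b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mu) / sigma
    return u1, math.erfc(abs(z) / math.sqrt(2))

def supervised_analysis(n0, n12m, alpha=0.01):
    """Return [(gene, p, fold change)] for genes with p < alpha, sorted by p.

    n0 and n12m map gene identifiers to lists of linear-scale expression
    values; fold change is mean(N1/2m) / mean(N0), an assumed definition.
    """
    hits = []
    for gene in n0:
        _, p = mann_whitney_u(n0[gene], n12m[gene])
        if p < alpha:
            fc = statistics.mean(n12m[gene]) / statistics.mean(n0[gene])
            hits.append((gene, p, fc))
    return sorted(hits, key=lambda h: h[1])

# Invented expression values for 35 N0 and 16 N1/2m samples.
n0_expr = {
    "gene_A": [float(v) for v in range(1, 36)],
    "gene_B": [float(v) for v in range(1, 36)],
}
n12m_expr = {
    "gene_A": [float(v) for v in range(101, 117)],  # clearly higher in N1/2m
    "gene_B": [k + 0.5 for k in range(2, 34, 2)],   # interleaved with N0
}

hits = supervised_analysis(n0_expr, n12m_expr)
for gene, p, fc in hits:
    print(gene, f"P={p:.2e}", f"FC={fc:.2f}")
```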