Identification of Important Signaling Proteins and Stimulants for the Production of Cytokines in RAW 264.7 Macrophages
Sylvain Pradervand (1), Mano Ram Maurya (2), Shankar Subramaniam (1,2)
(1) San Diego Supercomputer Center
(2) Department of Bioengineering
University of California, San Diego
AIChE Annual Meeting, Wednesday, November 02, 2005

Outline
• Production and release of cytokines in macrophages
• Identification of significant correlations between signaling proteins and cytokines
• Quantitative input/output modeling using Principal Component Regression (PCR)
• Results
• Summary and conclusions

Cytokine Production and Release in Macrophages
Cytokines
• Proteins for communication between immune cells
• Important players of the immune system
  • Apoptosis of infected cells
  • Initiation of inflammation
  • Control of inflammation
• Secreted by immune cells
• Complex signaling activities followed by gene expression

Cytokine Production and Release in Macrophages
[Diagram: multiple stimuli (ligands); macrophage with a complex signaling network of signaling proteins (phosphoproteins); cytokines (paracrine cytokines, endocrine cytokines).]

Each signaling protein or 2nd messenger is a marker of a pathway; overall, the network is very complex.
[Figure: pathway map. Acknowledgement: http://www.biocarta.com/pathfiles/h_gpcrPathway.asp]

Clustering and Correlation Analysis
• Hierarchical clustering (using R)
  • Reveals which inputs (ligands) have a similar effect on the signaling proteins and on cytokine release
• Correlation analysis (using R)
  • Neyman-Pearson correlation method

Signaling Pathway Activations and Cytokine Release
[Heat map: ligands versus signaling proteins/2nd messengers and cytokines; color scale from decrease to increase; Toll-like receptor (TLR) ligands marked.]
• Toll-like receptors: pattern recognition receptors (PRRs) that bind to pathogen-associated molecular patterns, for immediate action without antibody

Correlation between Signaling Activity and Cytokines
[Figure: significant correlations are displayed in two panels, "With Toll-like receptor (TLR) ligands" and "Without TLR ligands"; axes: cytokines, signaling proteins; legend: positive correlation, negative correlation.]
• With TLRL, the pattern of correlations is similar for many cytokines
  • TLRL are dominant; the effect of the other ligands is less visible
• Without TLRL data, only a few positive correlations, most of them involving TNF
• Without TLRL data, STATs show stronger correlations

Further (Quantitative) Analysis
• Hierarchical clustering and correlation analysis are only qualitative
• A detailed signaling map is not available
• Develop simplified linear input/output models
  • Elucidate common and different signaling modules
  • Predict cytokine release

Further (Quantitative) Analysis
Two-part model
• Part I: capture most of the output as Y1 = X*B1 (PP model)
• Part II: model the residual, Y - Y1, as Y2 = L*B2
• Y = Y1 + Y2
[Diagram: X, B1, Y1; L, B2, Y2; Y = Y1 + Y2]
• The X's are highly correlated
• Use principal component regression (PCR)
• PLS not used since the number of data points >> the number of outputs

PCR-Based Approach
Estimate B such that Y = X*B, using known X and Y
• X (m x n1): input or predictor
• V (n1 x k): matrix of eigenvectors of cov(X)
• T (m x k): matrix of latent variables (k latent variables)
• Q (k x n2)
• Y (m x n2)
Relations: Y = X*B; T = X*V; Y = T*Q; Y = X*V*Q; B = V*Q
1. Calculate T = X*V
2. Calculate Q with least squares
3. Predict Y: Yp = T*Q
4. Repeat the procedure for the residual, Y - Yp, with L as input

Statistical Significance of the Coefficients
• Most coefficients are non-zero
• Identify the significant coefficients
• Estimate the coefficients for many random models
  • Randomly shuffle Y (the data points) to obtain Ys, then calculate the coefficients
  • Calculate the standard deviation of the random coefficients ($\sigma_j$)
  • Calculate the ratio: $r_j = b_j / \sigma_j$
• Significance test
  • 95% confidence level: $r_{th} = 1.96$
  • Null hypothesis true if $r_j < r_{th}$
  • Use a higher threshold for the residuals: $r_{th} = \sqrt{2} \times 1.96 = 2.77$, where $\sqrt{2}$ is the standard deviation of the difference of two samples from N(0,1)

Cytokines Regulatory Signals
• JNK, p38, NF-kB: strongest coefficients
• ERK1/2 and RSK: similar profile
• cAMP: the only significant negative

Cytokines Regulatory Signals Without TLRL Data
• cAMP kept its negative strength
• STATs became more significant
• Remaining positive coefficients: p38 (G-CSF and TNF), RSK (TNF)

Cytokines Regulatory Signals in Residuals
• Only a few ligands are statistically significant

Cytokines Regulatory Signals in Residuals Without TLRL Data
• IL-4 is strong for IL-1a, IL-6 and IL-10
• 2MA is strong for G-CSF and TNFa
• G-CSF and TNFa have a similar pattern of coefficients

Minimal PCR Model
• Many predictors are flagged as significant due to correlation with other important predictors
  • Identifies most known pathways, but with a high false positive rate
• Identify the necessary and sufficient set of signaling pathways that would predict cytokine release
• Generate minimal models
  • Find the least number of predictors with statistically the same fit as the full model
  • Must be better than a zero-predictor (average) model
  • Use an F-test for each of these

Procedure for PCR Minimal Models
F-test
• As good as the full model: $R_1 = e_r^2 / e_d^2 < \mathrm{finv}(p, d_r, d_d)$, with $p = 1 - \alpha$, $\alpha = 0.05$, $p = 0.95$
• Better than the trivial model: $R_2 = e_0^2 / e_r^2 > \mathrm{finv}(p, d_0, d_r)$, with $p = 0.68$ for the residuals
[Diagram: models ordered by decreasing number of predictors: full (detailed) model with all significant predictors ($e_d$); initial minimal model; final minimal model; zero-predictor model ($e_0$).]
• Keep eliminating the least significant predictor: R1 increases, R2 decreases
• If more than one predictor is left, use combinatorial selection (integer programming) for exhaustive testing

Combined Minimal Model and Validation
• Integrate validation with the model development
• Build a network combining the results from the models with and without TLRL data
  • Pathways: p38, cAMP, NF-kB, JNK, STAT1
  • Ligands: others
• 10 regulatory modules
  • JNK/NF-kB translates the TLRL dependency
  • p38/PAF: post-transcriptional controls?
  • STAT1 affects the chemokines
  • cAMP is anti-correlated (inhibitory?)

Validation with the Literature
Cytokine   False Positive   False Negative
G-CSF      0                0
IL-1a      0                0
IL-6       5% (2 extra)     0
IL-10      2.5% (1 extra)   40% (missed 2)
MIP-1a     0                33% (missed 2)
RANTES     0                0
TNFa       0                17% (missed 1)
• With the minimal model: overall 1.2% false positive rate (FPR) and 13% false negative rate (FNR)
• With the full model: overall 11% FPR (10 times higher) and 3% FNR (4 times lower)
• Relative gain with minimization: a factor of 2.5

New Hypothesis for G-CSF from Network Reconstruction
• All known regulatory pathways found
• New hypothesis: is p38 involved in post-transcriptional regulation of G-CSF (which stimulates production of neutrophils)?
[Network diagram: ligands ISO, LPS, P2C, P3C, R-848, 2MA; receptors Adrb2, TLR2/1, TLR2/6, TLR4, TLR7, P2X, P2Y; pathways p38, NF-kB, JNK; output G-CSF.]

Summary
• Ligand screen data set
• Statistical analysis
• Modeling
• Collection of hypotheses
• Design of in vitro assays

Cytokine Production and Release in Macrophages
Cytokines
• Messenger proteins in communication between immune cells
• Important players of the immune system

Glossary of Cytokine Names
• IL: interleukin
• TGF: transforming growth factor
• TNF: tumor necrosis factor
• GM-CSF: granulocyte/macrophage colony-stimulating factor; also M-CSF and G-CSF
• MIP: macrophage inflammatory protein
• RANTES: Regulated on Activation, Normal T Expressed and Secreted (also known as CCL5; binds to CCR5, which is a coreceptor of HIV, and thus blocks HIV from entering the cell)

Glossary of Ligand Names
• GM-CSF: granulocyte/macrophage colony-stimulating factor; also M-CSF and G-CSF
• IL: interleukin
• IFN: interferon (induces cells to resist viral replication)
• C5a: cleavage product from C5 (a protein of the complement pathway/system)
• R-848: resiquimod, a potent antiviral reagent
• LPS: lipopolysaccharide
• P2C: PAM2CSK4 (synthetic diacylated lipopeptide; AfCS)
• 2MA: 2-methylthio-ATP, a synthetic analog of ATP (acts through P2X (ligand-gated) and P2Y (GPCR))
• LPA: lysophosphatidic acid (derived from phospholipid)
• UDP: uridine diphosphate (a nucleotide)
• S1P: sphingosine-1-phosphate
• PAF: platelet activating factor, a proinflammatory phospholipid
• ISO: isoproterenol
• PGE: prostaglandin E2, a lipid product of arachidonic acid metabolism; has an immunosuppressive effect

Glossary of Signaling Proteins
• cAMP
• Akt: protein kinase B
• ERK & JNK: MAPKs (from wikipedia.com: "To date, four distinct groups of MAPKs have been characterized in mammals: (1) extracellular signal-regulated kinases (ERKs), (2) c-Jun N-terminal kinases (JNKs), (3) p38 isoforms, and (4) ERK5")
• RSK: ribosomal S6 kinase
• GSK: glycogen synthase kinase-3 (overexpressed in Alzheimer's disease)
• NF-kB
• p40Phox (neutrophil cytosolic factor 4; an oxidoreductase)
• SMAD: SMAD-1 is the human homologue of Drosophila Mad (Mad = Mothers against decapentaplegic)
• STAT: Signal Transducers and Activators of Transcription
• Rps6: ribosomal protein S6

Measurement of Signaling Proteins and Cytokines
2nd messengers
• Enzyme-linked immunoassay to measure cAMP concentrations
• Fluorescent dye to measure intracellular free calcium
Signaling proteins
• Immunoblots to detect phosphorylation of signaling proteins
Responses
• Agilent inkjet-deposited presynthesized oligo arrays to assess gene expression
• Multiplex suspension array system to measure concentrations of cytokines in the extracellular medium
Data are log-transformed after subtraction of the basal response observed in control data
Stimulation by a single ligand or a pair of ligands at a fixed strength

Procedure for Normalization of Data
Data processing
• Signaling proteins: log2(fold change), where fold change = response/basal response
  • Except for cAMP, for which the basal response was subtracted before taking log2
• Cytokines: log2(response - basal + 1); basal is close to 0
Signal-to-noise ratio (SNR) calculated
• Cytokine not analyzed if SNR < 5

Main Regulation at the mRNA Level
• Most of the regulatory mechanisms act at the gene-transcription level
• Except for IL-1a, good overall correlation, with coefficients > 0.9: from 0.92 (MIP-1a) to 0.99 (IL-10)

Statistical Analysis of Ligand Interactions
• Is there a more than additive effect of ligands on the cytokine release (output)?
• Use of a linear model (a similar model with fewer terms was used to identify significant ligands in single-ligand data):
  $Y_{hijklt} = \mu + L1_h + L2_i + T_j + E_k + G_{l(k)} + (L1L2)_{hi} + \dots + \varepsilon_{hijklt}$
  • $\mu$: constant term
  • $L1_h$: effect of ligand 1
  • $L2_i$: effect of ligand 2
  • $T_j$: time effect
  • $G_{l(k)}$: gel effect
  • $(L1L2)_{hi}$: nonlinear term
  • $\varepsilon_{hijklt}$: random error
  • The ligand terms are either 0 or non-zero but fixed, since the data correspond to a fixed strength of the stimulus
• Null hypothesis: no synergism (more than additive effect) of the ligands on the cytokine release, i.e., $(L1L2)_{hi} = 0$
• Used ANOVA (Analysis of Variance)

An Example of Hypothesis from Interaction Analysis
• IL-4 enhances STAT1a/b activation by IFNg
[Diagram labels: IFNg, IL-4 (+), Pathway A, Pathway B, STAT1a/b]

Examples of Hypothesis from Interaction Analysis
• Gs ligands enhance G-CSF, IL-1, IL-6 and IL-10 release by TLRL
[Diagram labels: TLRL, ISO/PGE (+), Pathway A, Pathway B; G-CSF, IL-1a, IL-6, IL-10]

Examples of Hypothesis from Interaction Analysis
• Synergism between IL-6 and TLRL on IL-10 release is mediated via an ERK1/2-dependent pathway
[Diagram labels: TLRL, IL-6 (+), ERK1/2, Pathway B, IL-10]

PCR-Based Approach
Estimate B such that Y = X*B, using known X and Y
• X (m x n1): input or predictor
• V (n1 x k): matrix of eigenvectors of cov(X)
• T (m x k): matrix of latent variables (k latent variables)
• Q (k x n2)
• Y (m x n2)
Relations: Y = X*B; T = X*V; Y = T*Q; Y = X*V*Q; B = V*Q
1. $V_k = [v_1\ v_2\ \dots\ v_k]$; $\Lambda_k$ = matrix of eigenvalues = $\mathrm{diag}[\lambda_1\ \lambda_2\ \dots\ \lambda_k]$
2. Calculate $T = XV$; $T^T T = \Lambda_k (m-1)$; $\Lambda_k^{-1} = \mathrm{diag}(1/\lambda_1, \dots, 1/\lambda_k)$
3. Calculate Q with the least-squares method: $Q = \frac{\Lambda_k^{-1}}{m-1} (T^T Y)$
4. Calculate $B = VQ$ and the predicted Y: $Y_p = TQ = T \frac{\Lambda_k^{-1}}{m-1} (T^T Y)$
5. Repeat the procedure for the residual, Y - Yp, with L as input

Statistical Significance of the Coefficients
• Most coefficients ($b_j$, with respect to the jth input) are non-zero
• Identify the significant coefficients
• Estimate the coefficients for a random model
  • Randomly shuffle Y (the data points) to obtain Ys, then calculate the coefficients
  • Repeat many times (1000 times)
  • Calculate the standard deviation of the random coefficients ($\sigma_j$)
  • Approximation: $\sigma_j \approx \sqrt{\mathrm{diag}\left(V (\Lambda_k (m-1))^{-1} V^T\right)} \cdot \mathrm{std}(Y_j)$
  • Calculate the ratio: $r_j = b_j / \sigma_j$
• Significance test at a confidence level of 95%: $r_{th} = 1.96$
  • Null hypothesis (coefficient not significant) true if $r_j < r_{th}$
  • Use a higher threshold for the residuals: $r_{th} = \sqrt{2} \times 1.96 = 2.77$, where $\sqrt{2}$ is the standard deviation of the difference of two samples from N(0,1)

Cytokines Regulatory Signals
• Average of the ratios for models with different numbers of predictors, capturing 80% to 95% of the variation in the input data
• JNK, p38, NF-kB: strongest coefficients
• ERK1/2 and RSK: similar profile
• cAMP: the only significant negative

Minimal PCR Model
• Many predictors are flagged as significant because of their correlation with other important predictors
  • Identifies most of the known pathways but results in a high number of false positives
• Identification of the necessary and sufficient set of signaling pathways that would predict cytokine release
• Algorithm to generate minimal models
  • Essential idea: from the list of significant predictors, find the least number of predictors with a fit statistically equal to the fit of the full model
  • The minimal model should be better than a zero-predictor (average of the output) model
  • Use an F-test for each of these

Procedure for PCR Minimal Models
F-test
• As good as the full model: $R_1 = e_r^2 / e_d^2 < \mathrm{finv}(p, d_r, d_d)$, with $p = 1 - \alpha$, $\alpha = 0.05$, $p = 0.95$
• Better than the trivial model: $R_2 = e_0^2 / e_r^2 > \mathrm{finv}(p, d_0, d_r)$, with $p = 0.68$
• If the full model itself is no better than the trivial model: accept the trivial model
[Diagram: models ordered by decreasing number of predictors: full (detailed) model with all significant predictors ($e_d$); intermediate model-1 ($e_1$); intermediate model-2 ($e_2$); initial minimal model; final minimal model; zero-predictor (average output) model ($e_0$).]
• Keep eliminating the least significant predictor: R1 increases, R2 decreases
• If more than one predictor is left, use combinatorial selection (integer programming) for exhaustive testing

New Hypothesis for IL-1 from Network Reconstruction
• All known regulatory pathways found
• New hypothesis: IFNg regulates IL-1a through an IRF-dependent pathway?
• New hypothesis: IL-4 regulates IL-1a through a STAT6 pathway?
[Network diagram: ligands IFNg, LPS, P2C, P3C, R-848, IL-4; receptors IFNGR, TLR2/1, TLR2/6, TLR4, TLR7, IL-4R; pathways JNK, NF-kB; output IL-1.]

New Hypothesis for TNFa from Network Reconstruction
• All known regulatory pathways except ERK1/2 found
• New hypothesis: an M-CSF-specific pathway regulates TNFa?
[Network diagram: ligands M-CSF, LPS, P2C, P3C, R-848, 2MA, UDP, ISO, IFNg; receptors CSF-1R, TLR2/1, TLR2/6, TLR4, TLR7, P2X, P2Y, Adrb2, IFNGR; pathways p38, JNK, NF-kB, cAMP; output TNF.]

New Hypothesis for RANTES from Network Reconstruction
• All known regulatory pathways found
• New hypothesis: synergism between an LPS-specific pathway (IRF-1?) and NF-kB in RANTES regulation?
[Network diagram: ligands IFNb, R-848, P2C, P3C, LPS; receptors IFNAR, TLR2/1, TLR2/6, TLR7, TLR4; pathways STAT1, NF-kB, JNK; output RANTES.]
• Similar hypothesis for IL-6 release

Validation with the Literature
Comparison of our model with the literature, coded as (our model, literature):
• (1,1): true positive, identified by both
• (0,1): false negative
• (1,0): false positive
• (0,0): true negative, identified by both
Counting rules
• Count ERK1/2 as 1, STAT1a/b as 1, JNK sh/lg as 1, GSK 3a/3b as 1
• 18 PP and 22 ligands, for a total of 40 predictors
• Total true negative = (negative identified + not-identified-but-reported-in-literature)
Results
• With the minimal model: overall 1.2% false positive rate (FPR) and 13% false negative rate (FNR)
• With the full model: overall 11% FPR (10 times higher) and 3% FNR (4 times lower)
• Relative gain with minimization: a factor of 2.5

Validation with the Literature
Cytokine   False Positive   False Negative
G-CSF      0                0
IL-1a      0                0
IL-6       5% (2 extra)     0
IL-10      2.5% (1 extra)   40% (missed 2)
MIP-1a     0                33% (missed 2)
RANTES     0                0
TNFa       0                17% (missed 1)
• The full model missed only cAMP for IL-10, but it has more false positives
• FPR (type-I error) = FP/(FP + none), where "none" is the number of true negatives
• FNR (type-II error) = FN/(true positives), where true positives = positives identified + FN

Validation with the Literature
Minimal model (total number of predictors: 40)
Cytokine   Both   Our model only   Literature only   None   False positive   False negative
G-CSF      3      0                0                 37     0                0
IL-1       2      0                0                 38     0                0
IL-6       4      2                0                 34     0.055555556      0
IL-10      3      1                2                 34     0.028571429      0.4
MIP-1a     4      0                2                 34     0                0.333333333
RANTES     3      0                0                 37     0                0
TNFa       5      0                1                 34     0                0.166666667
Mean                                                        0.012018141      0.128571429

Full model (total number of predictors: 40)
Cytokine   Both   Our model only   Literature only   None   False positive   False negative
G-CSF      3      4                0                 33     0.108108108      0
IL-1       2      4                0                 34     0.105263158      0
IL-6       4      3                0                 33     0.083333333      0
IL-10      4      6                1                 29     0.171428571      0.2
MIP-1a     6      3                0                 31     0.088235294      0
RANTES     3      5                0                 32     0.135135135      0
TNFa       6      3                0                 31     0.088235294      0
Mean                                                        0.111391271      0.028571429

In the detailed model, the only false negative would have been cAMP for IL-10. Then, the FNR would be (0.2/7) = 2.9%.
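
The per-cytokine rates in the validation tables follow from the stated definitions, FPR = FP/(FP + none) and FNR = FN/(both + FN), and the overall figures are unweighted means over the seven cytokines. The sketch below recomputes them from the counts in the two tables. The names `Counts`, `fpr`, `fnr`, `mean_rates`, `MINIMAL` and `FULL` are illustrative and do not come from the slides.

```python
from typing import Dict, NamedTuple, Tuple


class Counts(NamedTuple):
    """Predictor counts for one cytokine (hypothetical container, not from the slides)."""
    both: int        # in our model and in the literature (true positive)
    model_only: int  # in our model only (false positive, FP)
    lit_only: int    # in the literature only (false negative, FN)
    none: int        # in neither (true negative)


def fpr(c: Counts) -> float:
    """False positive rate: FP / (FP + none)."""
    return c.model_only / (c.model_only + c.none)


def fnr(c: Counts) -> float:
    """False negative rate: FN / (both + FN)."""
    return c.lit_only / (c.both + c.lit_only)


def mean_rates(table: Dict[str, Counts]) -> Tuple[float, float]:
    """Unweighted mean FPR and FNR over all cytokines in the table."""
    n = len(table)
    mean_fpr = sum(fpr(c) for c in table.values()) / n
    mean_fnr = sum(fnr(c) for c in table.values()) / n
    return mean_fpr, mean_fnr


# Counts copied from the minimal-model table (40 predictors per cytokine).
MINIMAL = {
    "G-CSF": Counts(3, 0, 0, 37),
    "IL-1": Counts(2, 0, 0, 38),
    "IL-6": Counts(4, 2, 0, 34),
    "IL-10": Counts(3, 1, 2, 34),
    "MIP-1a": Counts(4, 0, 2, 34),
    "RANTES": Counts(3, 0, 0, 37),
    "TNFa": Counts(5, 0, 1, 34),
}

# Counts copied from the full-model table.
FULL = {
    "G-CSF": Counts(3, 4, 0, 33),
    "IL-1": Counts(2, 4, 0, 34),
    "IL-6": Counts(4, 3, 0, 33),
    "IL-10": Counts(4, 6, 1, 29),
    "MIP-1a": Counts(6, 3, 0, 31),
    "RANTES": Counts(3, 5, 0, 32),
    "TNFa": Counts(6, 3, 0, 31),
}

for name, table in (("minimal", MINIMAL), ("full", FULL)):
    mean_fpr, mean_fnr = mean_rates(table)
    print(f"{name} model: FPR = {mean_fpr:.1%}, FNR = {mean_fnr:.1%}")
```

Run as written, this prints an FPR of 1.2% and an FNR of 12.9% for the minimal model, and 11.1% and 2.9% for the full model, which agree with the overall 1.2%/13% and 11%/3% figures quoted on the slides after rounding.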