Postdoctoral Research Work
Quantitative Models for Cellular Signaling Pathways
Mano Ram Maurya

Cells respond to their environment through cellular signaling pathways, which process changes in the environment and transmit their effect to the nucleus. Proper functioning of these pathways is required for various cellular- and organ-level activities such as homeostasis, metabolism, growth, learning, and cell death (apoptosis). Malfunctions in these pathways often result in a variety of diseases, ranging from mildly fatal ones such as cholera to the deadliest ones such as cancer. Hence, a deep understanding of these pathways is essential for designing effective drugs for various diseases.

This deeper understanding involves an array of experimental and analysis tools. The experimental tools include genomics for sequence analysis; microarrays for gene expression; X-ray crystallography for protein structure and function prediction and for identification of binding sites and complex molecular interactions; and mass spectrometry and fluorescence microscopy for accurate measurement of species concentrations. Mathematical analyses include analysis of microarray data to identify differentially regulated genes and the effect of gene knockouts, statistical analysis, ab initio methods for protein structure and function prediction, and qualitative and quantitative modeling. My research work focuses on quantitative modeling.

Signaling pathways are composed of several well-defined modules with complex interactions among them. The modules themselves are immensely complex and exhibit nonlinear behavior. Although detailed quantitative models are only slowly emerging for most signaling modules, some modules and pathways are well understood.
This rich understanding can be exploited to develop compact models of these well-studied individual subsystems in a given context (e.g., to predict certain observations), so that computational resources can be directed towards a detailed understanding of those modules (or parts of the pathway) that are not yet well understood. Hence, there is an opportunity for coarse-graining the models of well-understood subsystems.

Every model involves parameters such as rate constants, diffusion coefficients, and structural parameters. In most models discussed in the literature, the chosen parameter values are based upon information about comparable systems in a general context. The parameter values do lie in the physiological range of interest, and the models are capable of making qualitatively accurate predictions, but their ability to make quantitative predictions for a particular system may be limited. Hence, parameter estimation using system-specific experimental data becomes an essential task in modeling.

Based upon the above philosophy for gaining a deeper understanding of signaling pathways, the three interrelated areas of my research are modeling, coarse-graining (model reduction), and parameter estimation. The target biochemical systems in this work are the GTPase cycle module of the m1 muscarinic acetylcholine receptor, Gq, and regulator of G-protein signaling 4 (RGS4, a GTPase-activating protein (GAP)), and the calcium signaling pathway. The GTPase cycle mediates signal transduction from primary messengers such as ligands (stimuli on the cell surface) to secondary messengers such as calcium and to other downstream signaling components such as protein-kinase cascades. Both G-protein signaling and calcium signaling are among the most ubiquitous signaling systems in eukaryotes. I have also developed a novel data-mining approach to reduce the number of false positives.
The first part of my work dealt with parameter estimation for a detailed model of the GTPase cycle module, followed by the development of a reduced-order (simplified) model of the same module. The detailed model contained 48 reaction rate parameters and 17 distinct chemical species. The main focus was on the development of methodologies for model reduction for systems with unknown parameters.

In one methodology, a multiparametric variability analysis (MPVA)-based approach is used to systematically eliminate some of the reactions from the detailed model. The parameters are estimated using a hybrid genetic-algorithm (GA)-based optimizer. An implicit MPVA is performed by utilizing the results available from the GA-based parameter estimation: a GA-based optimization presents the user with several competing (near-best or pseudo-global) solutions (parameter-value sets). In this approach, a parameter is characterized as important if its value does not vary much across the parameter-value sets that fit the data of interest well. Such parameters are less likely to be eliminated in the process of model reduction. Similarly, parameters that vary considerably across these sets are termed less important and are more likely to be eliminated.

I also developed a mixed-integer nonlinear optimization-based approach in which both the reduced network topology and the unknown parameters are determined simultaneously using a GA (more generally, a stochastic search). Thus, no iterations are needed. In this approach, binary variables are used to indicate whether or not a parameter is retained in the model. Complex expressions in which some parameters should be retained or eliminated together can be handled by introducing appropriate constraints. The key idea is to substitute each parameter, say k, by the expression kret*k, and then to optimize with respect to both k and kret to minimize the fit error between experimental data and model predictions.
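As a minimal illustration of the kret*k substitution, the Python sketch below uses a hypothetical three-parameter linear toy model and a simple accept-if-better random search in place of the GA. The small per-parameter penalty (size_penalty) that favors smaller models, the toy data, and all function names are assumptions made for illustration; they are not taken from the actual GTPase model or optimizer.

```python
import random

def effective(k, kret):
    """Apply the substitution k_j -> kret_j * k_j, with kret_j equal to 0 or 1."""
    return [r * v for r, v in zip(kret, k)]

def model(params, x):
    # Hypothetical toy model: a weighted sum of the input terms.
    return sum(p * xi for p, xi in zip(params, x))

def objective(k, kret, data, size_penalty=0.01):
    # Sum-of-squares fit error, plus an ASSUMED small cost per retained
    # parameter so that the search prefers smaller models.
    params = effective(k, kret)
    fit_error = sum((y - model(params, x)) ** 2 for x, y in data)
    return fit_error + size_penalty * sum(kret)

def stochastic_search(data, n_params, iters=5000, seed=0):
    """Search over k (continuous) and kret (binary) at the same time."""
    rng = random.Random(seed)
    best_k = [1.0] * n_params
    best_kret = [1] * n_params
    best_f = objective(best_k, best_kret, data)
    for _ in range(iters):
        # Perturb the continuous parameters, keeping them non-negative.
        k = [max(0.0, v + rng.gauss(0.0, 0.1)) for v in best_k]
        # Occasionally flip one retention variable.
        kret = list(best_kret)
        if rng.random() < 0.2:
            j = rng.randrange(n_params)
            kret[j] = 1 - kret[j]
        f = objective(k, kret, data)
        if f < best_f:  # keep the candidate only if it improves the objective
            best_k, best_kret, best_f = k, kret, f
    return best_k, best_kret, best_f

# Toy data generated with true parameters (2.0, 0.0, 0.5): the second
# term does not contribute, so it is a candidate for elimination.
inputs = [(1.0, 0.5, 2.0), (0.5, 1.5, 1.0), (2.0, 1.0, 0.0),
          (1.5, 2.0, 1.0), (0.0, 1.0, 2.0)]
data = [(x, 2.0 * x[0] + 0.5 * x[2]) for x in inputs]

start_f = objective([1.0] * 3, [1] * 3, data)
best_k, best_kret, best_f = stochastic_search(data, 3)
print(best_kret, [round(v, 2) for v in best_k], round(best_f, 4))
```

Because a parameter with kret = 0 drops out of the model entirely, the topology and the parameter values are chosen in a single search rather than in alternating elimination and re-estimation rounds.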
The relevant constraints are also reformulated appropriately. kret = 1 or 0 means that the parameter is retained or eliminated, respectively. The computational complexity of the overall process is only about twice that of parameter estimation for the detailed model. Thus, this approach is much faster than the MPVA approach. Nevertheless, the MPVA approach has its own utility, since it provides an intuitive and global characterization of the nonlinear parametric variability or sensitivity.

The second project is on data mining. A high false-positive rate is very common in data mining. Based upon the concept of minimal models (in terms of the size of the model), I developed a novel approach for reducing the number of false positives. The model developed is essentially a simple input/output (I/O) model. In this approach, the significant components (inputs or predictors) are first identified using principal component regression (PCR)-based I/O modeling, by comparing the coefficients with the standard deviation of the coefficients of a population of models with random outputs (random models). Then, by using an exhaustive combinatorial search, the model with all the significant predictors is further simplified by excluding some of the predictors, while keeping the fit error of the minimal models statistically the same as that of the detailed model with all the predictors.

The approach is applied to identify the main signaling pathways active during (responsible for) the release of different cytokines in RAW 264.7 macrophages upon stimulation with different ligands. In this case, since not all the signaling pathways were measured, a two-part model was developed. The first part, in which measured signaling activity serves as the input, strives to capture most of the output (cytokine release). In the second part, significant variations in the residuals (from the first part) are further captured by using the ligands as an input.
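The two-part idea can be sketched in a few lines of Python. For brevity, a single-predictor least-squares fit through the origin stands in for the PCR-based model, and the data (one measured pathway signal, one ligand indicator, one cytokine output) are made up; none of the names or numbers come from the RAW 264.7 study.

```python
def slope_through_origin(x, y):
    """Least-squares slope b for the model y ~ b * x (no intercept)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

# Hypothetical data for four experiments.
pathway = [1.0, 2.0, 0.0, 0.0]   # measured signaling activity
ligand = [0.0, 0.0, 1.0, 1.0]    # 1.0 if a hypothetical ligand X was applied
cytokine = [2.0, 4.0, 3.0, 3.0]  # measured output (cytokine release)

# Part 1: explain as much of the output as possible from the measured pathway.
b_pathway = slope_through_origin(pathway, cytokine)
residual = [y - b_pathway * s for s, y in zip(pathway, cytokine)]

# Part 2: explain what is left over using the ligand as the input.
b_ligand = slope_through_origin(ligand, residual)
final_residual = [r - b_ligand * l for l, r in zip(ligand, residual)]

print(b_pathway, b_ligand, final_residual)
```

In this toy case the ligand produces output where the measured pathway is silent, so the first part leaves a residual that the second part attributes to the ligand, which mimics a contribution routed through a pathway that was not measured.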
These significant ligands in the residuals model, which are far fewer in number than the measured pathways, provide a way to estimate the contribution through the unmeasured pathways. In this study, data specific to stimulation by Toll-like receptor (TLR) ligands and by non-TLR ligands were studied separately to unmask the strong effect of the pathways specifically activated through the TLRs. The models were then combined, along with the information gleaned from ANOVA about significant ligands, to prepare a global (simplified) network map for cytokine release.

The third project deals with the development of a detailed model for calcium signaling in RAW 264.7 cells. RAW 264.7 is a macrophage-like, Abelson leukemia virus-transformed cell line derived from BALB/c mice (AfCS data center, http://www.signaling-gateway.org/, protocol ID: PP00000159). Models for the calcium response in other cells, e.g., cardiac muscle cells (myocytes) and neuronal cells, serve as initial models to be explored further. However, since calcium concentrations in macrophages are quite different from those in myocytes and neuronal cells, many parameters are expected to be different. One of the challenges is that different repeats of the experiment (activation by a ligand (stimulus)) result in different quantitative responses, because several factors inside the cell that affect the calcium response cannot be controlled. Hence, the responses from several controls are used for parameter estimation. Knockdown data are also used for parameter estimation, since the knockdowns essentially manifest local perturbations of the network. The only available measurement is the concentration of free calcium in the cytosol. Hence, the size of the model and the number of parameters should be kept small. Towards this end, an expandable simplified model structure has been used in which some reactions/steps are lumped.
To account for unmeasurable variability inside the cell, some of the initial states are allowed to vary within the physiological range, signifying that different cells can be at different states of the cell cycle. Thus, the unknown parameters include both the kinetic parameters and the unknown initial states. Utilization of multiple datasets (from control and knockdown experiments) required the development of a dedicated, data-specific computer model. In this model, the unknown parameters that can vary from cell to cell are instantiated separately for each dataset. This results in an increased number of unknown parameters to be estimated, but it provides a logical way of utilizing the full range of experimental data. To expedite the modeling process, a prototype modeling tool has been developed in MATLAB to semi-automatically generate the model-specific C++ code, which is combined with a stochastic-search-based optimization program to estimate the model parameters. The optimization program has been parallelized using the Message Passing Interface (MPI). The resulting quantitative model can be used to predict novel knockdown phenotypes.

Summary:
Developed a detailed model for the GTPase cycle module of the m1 muscarinic acetylcholine receptor, Gq, and regulator of G-protein signaling 4 (RGS4, a GTPase-activating protein (GAP)). Major emphasis was on parameter estimation while ensuring that all relevant thermodynamic constraints were satisfied.
Developed a multiparametric variability analysis-based methodology for model reduction and applied it to develop a reduced-order model for the above system (GTPase cycle module).
Developed a stochastic-search-based optimization program for parameter estimation. Parallelized this program so that it can be run on a supercomputer or a cluster of workstations to deal with large-scale problems.
Developed a mixed-integer nonlinear-optimization-based approach for model reduction and used it to develop a reduced-order model for the GTPase cycle module.
Developed a data-mining and analysis framework to reduce the number of false positives using principal component regression and model-size minimization. Used the framework to identify important signaling pathways involved in cytokine release in macrophages and to develop an input/output model.
Developed prototype modeling software in MATLAB for utilizing knockdown/knockout data with cell-to-cell variation in kinetic modeling of biochemical reaction networks.
Developed an expandable simplified model for calcium signaling in RAW 264.7 cells using the parameter estimator.