Journal Club Meeting, Sept 13, 2010
Tejaswini Narayanan

Gene expression profiling provides a tremendous amount of information that helps unravel the complexity of cancer. Selecting the most informative genes out of a very noisy background has therefore become important for cancer classification.

Wang and Gotoh [WnG] propose a novel, robust soft computing method rooted in Variable Precision Rough Sets [VPRS], built by introducing an α depended degree. It is a simple, efficient and straightforward method for accurate cancer classification using single genes or gene pairs. They subsequently used it to infer the direct gene regulatory network.

Notation
U -> universe of discourse.
R -> equivalence relation.

The degree of dependency of a set of attributes Q on another set of attributes P is denoted by γP(Q) and is defined as

$$\gamma_P(Q) = \frac{|POS_P(Q)|}{|U|}$$

where |POS_P(Q)| = size of the union of the lower approximations of each equivalence class in U/R(Q) on P in U, and |U| = size of U (the set of samples).

With Q = decision attributes D and P = a subset of condition attributes, γP(D) is the depended degree = the degree to which P can discriminate between the distinct classes of D, i.e. its classification power.
Greater γP(D) => stronger classification ability of P (the basis for selecting informative genes).

Canonical depended degree -> excessively rigid definition => difficult to detect the discriminative features, high computational expense, uncertainty of predictive performance and non-uniqueness.

Method proposed:
1. Filter redundant information and retain the critical information (i.e. the signal).
2. Make decision rules based on the core information and classify the whole dataset.
3. To extract hidden meaningful rules, we sometimes need to relax some rigid definitions -> flexible α depended degree under soft computing considerations.

This allows some single genes [or gene pairs] to have strong class discriminatory power. Interestingly, this also enables us to infer the networks and modules!
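The canonical depended degree described above can be illustrated with a small sketch. This is not WnG's code: the toy decision table, the attribute names (g1, g2, cls) and the helper names are all made up for illustration. It assumes gene expression values have already been discretized.

```python
from collections import defaultdict


def equivalence_classes(samples, attrs):
    """Return U/R(attrs): sets of sample indices sharing the same values on attrs."""
    groups = defaultdict(set)
    for i, sample in enumerate(samples):
        groups[tuple(sample[a] for a in attrs)].add(i)
    return list(groups.values())


def depended_degree(samples, P, D):
    """gamma_P(D): fraction of samples lying in the P-lower approximation of some class of D."""
    p_classes = equivalence_classes(samples, P)
    positive = set()
    for X in equivalence_classes(samples, D):
        for Y in p_classes:
            if Y <= X:  # Y lies entirely inside X, so it belongs to the lower approximation
                positive |= Y
    return len(positive) / len(samples)


# Hypothetical, already-discretized toy table (not data from the paper).
samples = [
    {"g1": "high", "g2": "low", "cls": "tumor"},
    {"g1": "high", "g2": "high", "cls": "tumor"},
    {"g1": "low", "g2": "low", "cls": "normal"},
    {"g1": "low", "g2": "high", "cls": "normal"},
]

print(depended_degree(samples, ["g1"], ["cls"]))  # g1 separates the classes perfectly
print(depended_degree(samples, ["g2"], ["cls"]))  # g2 carries no class information
```

In this toy table g1 gets a depended degree of 1 and g2 gets 0. A single mislabeled sample would drop g1 to a much lower value, which is the rigidity the flexible α depended degree is meant to address.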
All the gene selection, classification and network construction processes in this method correlate well with biologically meaningful decision rules, such as: tumor vs. normal cells, up- vs. down-regulation, and positive vs. negative regulation.

Solution: WnG introduced the α depended degree, a generalized form of the depended degree, in their VPRS model.
The α depended degree, given P and D, is:

$$\gamma_P(D, \alpha) = \frac{|POS_P(D, \alpha)|}{|U|}, \qquad POS_P(D, \alpha) = \bigcup_{X \in U/R(D)} pos(P, X, \alpha), \qquad pos(P, X, \alpha) = \bigcup \left\{ Y \in U/R(P) : \frac{|Y \cap X|}{|Y|} \ge \alpha \right\}$$

where |*| => size of set *, and U/R(•) => set of equivalence classes induced by the equivalence relation R(•).
For the selection of high class-discrimination genes, the lower limit of α is 0.7.

Decision Rule:
A decision rule has the form "A ⇒ B", meaning "if A, then B", where A -> condition attributes and B -> decision attributes.
The confidence of a decision rule A ⇒ B is defined as follows:

$$confidence(A \Rightarrow B) = \frac{support(A \wedge B)}{support(A)}$$

where support(A) -> proportion of samples satisfying A, and support(A ∧ B) -> proportion of samples satisfying A and B simultaneously.
• Confidence indicates the reliability of the rule.
• For each determined α value, only the genes with γP(D, α) = 1 were selected to build decision rules.
• Sufficient reliability was ensured by setting a high threshold for α.

Dataset
• SAGE breast cancer dataset with ~2.7 million tags and 27 samples.
• Each sample is described as a lymph node positive [LN(+)] or lymph node negative [LN(-)] primary breast tumor.

Results
WnG identified 7 highly discriminative (hub) genes.
All identified genes have high classification accuracy (under α = 0.8).
These seven hub genes are interesting and informative because of their biological relevance.
Example: It is well known that the ATF2/AP1 complex and its network are at the hub of tumorigenesis. This is reflected in a high classification accuracy of 88.89%.

Inference of the Gene Regulatory Network
It is expected that a few highly class-discriminative hub genes could greatly enhance the authenticity and confidence of computed gene interaction networks.
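Returning to the α depended degree and the rule confidence defined above, the following is a minimal sketch of both quantities. It assumes the usual VPRS formulation, in which an equivalence class Y of U/R(P) counts toward the positive region when at least a fraction α of it falls inside some class X of D. The toy table and all names are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def partition(samples, attrs):
    """Return U/R(attrs) as a list of sets of sample indices."""
    groups = defaultdict(set)
    for i, sample in enumerate(samples):
        groups[tuple(sample[a] for a in attrs)].add(i)
    return list(groups.values())


def alpha_depended_degree(samples, P, D, alpha):
    """gamma_P(D, alpha): share of samples in classes Y with |Y & X| / |Y| >= alpha for some X."""
    p_classes = partition(samples, P)
    positive = set()
    for X in partition(samples, D):
        for Y in p_classes:
            if len(Y & X) / len(Y) >= alpha:
                positive |= Y
    return len(positive) / len(samples)


def confidence(samples, A, B):
    """confidence(A => B) = support(A and B) / support(A); A and B map attribute -> value."""
    sat_a = [s for s in samples if all(s[k] == v for k, v in A.items())]
    if not sat_a:
        return 0.0  # assumption: a rule whose condition never occurs gets confidence 0
    sat_ab = [s for s in sat_a if all(s[k] == v for k, v in B.items())]
    return len(sat_ab) / len(sat_a)


# Hypothetical toy table: gene g is "high" in 4 of 5 tumors and "low" in 4 of 5 normals.
samples = (
    [{"g": "high", "cls": "tumor"}] * 4
    + [{"g": "high", "cls": "normal"}]
    + [{"g": "low", "cls": "normal"}] * 4
    + [{"g": "low", "cls": "tumor"}]
)

print(alpha_depended_degree(samples, ["g"], ["cls"], 1.0))  # canonical case, alpha = 1
print(alpha_depended_degree(samples, ["g"], ["cls"], 0.7))  # relaxed case, alpha = 0.7
print(confidence(samples, {"g": "high"}, {"cls": "tumor"}))
```

With α = 1 the gene is rejected outright because of two noisy samples, whereas at α = 0.7 it reaches γ = 1 and would be selected. The rule "g is high ⇒ tumor" then has confidence 4/5.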
WnG investigated the gene regulatory network by employing the following:
o One gene [instead of a class] is used as the decision attribute.
o If "GENE-I" is substituted for "Class label" in a decision table, GENE-I is regarded as the decision attribute with two distinct values: up-regulation (UR) and down-regulation (DR), and a new derivative table can be obtained.
o They discretize this derivative table to obtain another newly derived table.
o Decision rule: if GENE-I is DR, then GENE-II is DR; if GENE-I is UR, then GENE-II is UR.
o These rules are not necessarily true in reverse. Therefore, a directed regulatory relation from GENE-I to GENE-II, a positive one, is established.

Modularity of networks:
1. They use the Cytoscape plugin MCODE to analyze the constructed network.
2. They detected two significant modules, one of which forms a feed-forward loop.
3. They conclude that the co-regulation of multiple activators could be at least partly responsible for the occurrence of tumors.

Some more observations:
1. Colon cancer dataset: they identified 18 discriminative hub genes for cancer.
2. 10 of these (e.g. DES and ACTA2) are DR genes in a tumor, while the 8 other genes (e.g. IL8, HSPD1, SRPK1) are UR genes in a tumor.
3. The UR genes are regulated by more genes than the DR ones, while the DR genes regulate more genes than the UR ones.
4. Tumor suppressors inhibit tumor activators and activate as many other tumor suppressors as possible, whereas tumor activators activate other tumor activators and inhibit as few tumor suppressors as possible.

This method is a new option for cancer classification and direct gene regulatory network inference:
• User-friendly and simple
• Biologically interpretable
• Cost-effective in a clinical setting with single genes or gene pairs
• Relatively easy to understand and follow
• Programming code available with either open access or the GNU General Public License (GPL)
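The directed decision rules used for network inference above can be sketched roughly as follows. This is a simplified illustration, not WnG's procedure: it assumes every gene has already been discretized to UR or DR in each sample, and it assumes a positive edge is kept when both rules reach a minimum confidence (the `min_conf` parameter, the gene names G1 and G2, and the toy table are all hypothetical).

```python
def rule_confidence(table, src, src_val, dst, dst_val):
    """Confidence of the rule 'src is src_val => dst is dst_val' over the samples in table."""
    matching = [row for row in table if row[src] == src_val]
    if not matching:
        return 0.0  # assumption: no supporting samples means the rule is not trusted
    return sum(row[dst] == dst_val for row in matching) / len(matching)


def positive_regulation(table, src, dst, min_conf=0.8):
    """True if 'src DR => dst DR' and 'src UR => dst UR' both reach min_conf."""
    return (
        rule_confidence(table, src, "DR", dst, "DR") >= min_conf
        and rule_confidence(table, src, "UR", dst, "UR") >= min_conf
    )


def infer_edges(table, genes, min_conf=0.8):
    """Directed positive edges (src, dst) between all ordered pairs of distinct genes."""
    return [
        (a, b)
        for a in genes
        for b in genes
        if a != b and positive_regulation(table, a, b, min_conf)
    ]


# Hypothetical discretized table of 12 samples (made-up values).
table = (
    [{"G1": "DR", "G2": "DR"}] * 2
    + [{"G1": "UR", "G2": "UR"}] * 9
    + [{"G1": "UR", "G2": "DR"}]
)

print(infer_edges(table, ["G1", "G2"]))
```

In this toy table the rules from G1 to G2 hold with confidences 1.0 and 0.9, while "G2 is DR ⇒ G1 is DR" only reaches 2/3, so only the edge from G1 to G2 is kept. This mirrors the point that the rules need not hold in reverse.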