α depended degree

Journal Club Meeting
Sept 13, 2010
Tejaswini Narayanan



Gene expression profiling provides a wealth of information that helps unravel
the complexity of cancer.
Selecting the most informative genes from a large amount of noise has become
important for cancer classification.
Wang and Gotoh [WnG] -> a novel, robust soft computing method rooted in
Variable Precision Rough Sets [VPRS], built by introducing an α depended
degree.

It is a simple, efficient and straightforward method for accurate cancer
classification using single genes or gene pairs; the direct gene regulatory
network is subsequently inferred.
U -> universe of discourse.
R -> equivalence relation.
The degree of dependency of a set of attributes Q on another set of attributes P,
denoted by γP(Q), is

$$\gamma_P(Q) = \frac{|\mathrm{POS}_P(Q)|}{|U|}$$

where
|POS_P(Q)| = size of the union of the lower approximations of each equivalence
class in U/R(Q) on P in U,
|U| = size of U (the set of samples).
Q = decision attributes D, P = subset of condition attributes.
• γP(D) is the depended degree = the degree to which P can discriminate between
the distinct classes of D = classification power.
• Greater γP(D) => stronger classification ability of P (the basis for selecting
informative genes).
Canonical depended degree -> excessively rigid definition => difficult to detect
the discriminative features, high computational expense, uncertain
predictive performance and non-uniqueness.
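A minimal Python sketch of the canonical depended degree may help make the definition concrete. It assumes the standard rough-set reading in which a P-class belongs to the positive region only if it lies entirely inside one decision class. The toy decision table, the gene names and the function names are hypothetical and are not taken from WnG.

```python
from collections import defaultdict

def equivalence_classes(samples, attributes):
    """Group sample indices by their values on the given attributes, i.e. U/R(attributes)."""
    classes = defaultdict(set)
    for idx, sample in enumerate(samples):
        key = tuple(sample[a] for a in attributes)
        classes[key].add(idx)
    return list(classes.values())

def depended_degree(samples, condition, decision):
    """gamma_P(D): share of samples whose P-class fits entirely inside one D-class."""
    decision_classes = equivalence_classes(samples, decision)
    positive_region = set()
    for block in equivalence_classes(samples, condition):
        # A block joins the lower approximation only if it is a subset of a decision class.
        if any(block <= d_class for d_class in decision_classes):
            positive_region |= block
    return len(positive_region) / len(samples)

# Hypothetical toy table: discretized expression of two genes plus a class label.
toy_samples = [
    {"geneA": "high", "geneB": "low",  "label": "tumor"},
    {"geneA": "high", "geneB": "low",  "label": "tumor"},
    {"geneA": "high", "geneB": "high", "label": "tumor"},
    {"geneA": "low",  "geneB": "high", "label": "normal"},
    {"geneA": "low",  "geneB": "low",  "label": "normal"},
    {"geneA": "low",  "geneB": "low",  "label": "tumor"},
]

gamma_a = depended_degree(toy_samples, ["geneA"], ["label"])            # 3 of 6 samples
gamma_b = depended_degree(toy_samples, ["geneB"], ["label"])            # 0 of 6 samples
gamma_ab = depended_degree(toy_samples, ["geneA", "geneB"], ["label"])  # 4 of 6 samples
print(gamma_a, gamma_b, gamma_ab)
```

In this toy table a single mislabeled-looking sample is enough to exclude a whole P-class from the positive region, which illustrates why the canonical definition can be considered rigid.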
Method proposed:
1. Filter redundant information and retain the critical information (i.e. the
signal).
2. Make decision rules based on the core information and classify the whole
dataset.
3. To extract hidden meaningful rules, we sometimes need to loosen some rigid
definitions -> flexible α depended degree under soft computing
consideration.
This allows some single genes [or gene pairs] to have strong class
discriminatory power.
Interestingly, this also enables us to infer the networks and modules!
All the gene selection, classification and network construction processes in
this method correlate well with biologically meaningful decision rules, such
as:
• tumor vs. normal cells,
• up- vs. down-regulation, and
• positive vs. negative regulation.
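Solution:
WnG introduced the α depended degree, a generalized form of the
depended degree, in their VPRS model.
The α depended degree, given P and D, is:

$$\gamma_P(D, \alpha) = \frac{|\mathrm{POS}_P(D, \alpha)|}{|U|}, \qquad 0 \le \alpha \le 1$$

$$\mathrm{POS}_P(D, \alpha) = \bigcup_{X \in U/R(D)} \; \bigcup \{\, Y \in U/R(P) : |Y \cap X| / |Y| \ge \alpha \,\}$$

where
|*| => size of set *
U/R(•) => set of equivalence classes induced by the equivalence relation R(•).
With α = 1 this reduces to the canonical depended degree.

For the selection of high class-discrimination genes, the lower limit of α = 0.7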
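The effect of relaxing the definition can be sketched in Python. This is an illustration only: it assumes that a P-class counts toward the positive region when at least a fraction α of its samples fall into a single decision class, which is the usual variable-precision reading. The toy data and function names are hypothetical.

```python
from collections import defaultdict

def equivalence_classes(samples, attributes):
    """Group sample indices by their values on the given attributes, i.e. U/R(attributes)."""
    classes = defaultdict(set)
    for idx, sample in enumerate(samples):
        classes[tuple(sample[a] for a in attributes)].add(idx)
    return list(classes.values())

def alpha_depended_degree(samples, condition, decision, alpha):
    """gamma_P(D, alpha) under the assumed variable-precision inclusion rule."""
    decision_classes = equivalence_classes(samples, decision)
    positive_region = set()
    for block in equivalence_classes(samples, condition):
        # Largest fraction of this P-class that falls into any single decision class.
        best_overlap = max(len(block & d) / len(block) for d in decision_classes)
        if best_overlap >= alpha:
            positive_region |= block
    return len(positive_region) / len(samples)

# Hypothetical single-gene table: "high" is mostly tumor (4 of 5), "low" is always normal.
toy_samples = (
    [{"gene": "high", "label": "tumor"}] * 4
    + [{"gene": "high", "label": "normal"}]
    + [{"gene": "low", "label": "normal"}] * 5
)

strict = alpha_depended_degree(toy_samples, ["gene"], ["label"], alpha=1.0)
relaxed = alpha_depended_degree(toy_samples, ["gene"], ["label"], alpha=0.8)
print(strict, relaxed)
```

Under the strict setting the gene covers only half of the samples, while at α = 0.8 it reaches a depended degree of 1 and would pass a selection criterion of that form.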
Decision Rule:
A decision rule has the form “A ⇒ B”, meaning “if A, then B”,
A -> condition attributes and B -> decision attributes.
The confidence of a decision rule A ⇒ B is defined as follows:

$$\mathrm{confidence}(A \Rightarrow B) = \frac{\mathrm{support}(A \wedge B)}{\mathrm{support}(A)}$$

where support (A) -> proportion of samples satisfying A and
support (A ∧ B) -> proportion of samples satisfying A and B simultaneously.
• Confidence indicates the reliability of the rule.
• For each determined α value, only the genes with γP(D,α) = 1 were selected
to build decision rules.
• Sufficient reliability was ensured by setting a high threshold for α.
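As a small illustration, the confidence of a rule can be computed as the ratio of support(A ∧ B) to support(A), which is the standard definition and matches the description of the two support terms. The sample table, the predicates and the function names below are hypothetical.

```python
def support(samples, predicate):
    """Proportion of samples that satisfy the predicate."""
    return sum(1 for s in samples if predicate(s)) / len(samples)

def confidence(samples, antecedent, consequent):
    """Confidence of the rule 'antecedent => consequent'."""
    support_a = support(samples, antecedent)
    if support_a == 0:
        raise ValueError("antecedent matches no samples")
    support_ab = support(samples, lambda s: antecedent(s) and consequent(s))
    return support_ab / support_a

# Hypothetical table: 5 samples with high expression (4 tumor, 1 normal), 5 with low (all normal).
toy_samples = (
    [{"gene": "high", "label": "tumor"}] * 4
    + [{"gene": "high", "label": "normal"}]
    + [{"gene": "low", "label": "normal"}] * 5
)

rule_high_tumor = confidence(
    toy_samples,
    antecedent=lambda s: s["gene"] == "high",
    consequent=lambda s: s["label"] == "tumor",
)
rule_low_normal = confidence(
    toy_samples,
    antecedent=lambda s: s["gene"] == "low",
    consequent=lambda s: s["label"] == "normal",
)
print(rule_high_tumor, rule_low_normal)
```

Here the rule "high ⇒ tumor" holds for 4 of the 5 matching samples, and "low ⇒ normal" holds for all 5.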
Dataset
• SAGE breast cancer dataset with ~2.7 million tags and 27 samples.
• Each sample is described as a lymph node-positive [LN(+)] or lymph
node-negative [LN(-)] primary breast tumor.
Results

WnG have identified 7 highly discriminative (hub) genes.

All identified genes have high classification accuracy (under α = 0.8)

These seven hub genes are interesting and informative because of their
biological relevance.
• Example: It is well known that the ATF2/AP1 complex and its network play a
role at the hub of tumorigenesis.
• This is reflected in a high classification accuracy of 88.89%.
Inference of the Gene Regulatory Network
1. It is expected that a few highly class-discriminative hub genes could greatly
enhance the authenticity and confidence of computed gene interaction
networks.
2. WnG investigated the gene regulatory network by employing the following:
o One gene [instead of a class] is used as the decision attribute.
o If “GENE-I” is substituted for “Class label” in a decision table, GENE-I is
regarded as the decision attribute with two distinct values: up-regulation (UR)
and down-regulation (DR), and a new derivative table can be obtained.
o They discretize this derivative table to obtain another newly derived table.
o Decision rule: if GENE-I is DR, then GENE-II is DR; if GENE-I is UR, then
GENE-II is UR.
o These rules are not necessarily true in reverse. Therefore, a directed
regulatory relation from GENE-I to GENE-II, a positive one, is established.
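The asymmetry of such rules can be sketched in Python. This is not WnG's procedure: it assumes, for illustration, that a positive regulation is accepted when both rules (DR ⇒ DR and UR ⇒ UR) reach a confidence of at least α. The expression states, the threshold and the function names are hypothetical.

```python
def rule_confidence(pairs, state_from, state_to):
    """Confidence of 'source is state_from => target is state_to' over (source, target) pairs."""
    matching = [target for source, target in pairs if source == state_from]
    if not matching:
        return 0.0
    return sum(1 for target in matching if target == state_to) / len(matching)

def positively_regulates(source_states, target_states, alpha=0.8):
    """Assumed criterion: both DR => DR and UR => UR hold with confidence >= alpha."""
    pairs = list(zip(source_states, target_states))
    return (
        rule_confidence(pairs, "DR", "DR") >= alpha
        and rule_confidence(pairs, "UR", "UR") >= alpha
    )

# Hypothetical discretized states of two genes across the same 10 samples.
gene_one = ["DR"] * 2 + ["UR"] * 8
gene_two = ["DR"] * 2 + ["UR"] * 7 + ["DR"]

forward = positively_regulates(gene_one, gene_two)   # GENE-I -> GENE-II
backward = positively_regulates(gene_two, gene_one)  # GENE-II -> GENE-I
print(forward, backward)
```

In this toy example the rules hold from GENE-I to GENE-II (confidences 1.0 and 0.875) but not in reverse (the DR ⇒ DR rule drops to 2/3), so only one directed edge would be drawn.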
Modularity of networks:
1. They use the Cytoscape plugin MCODE [19] to analyze the constructed
network.
2. They detected two significant modules, one of which forms a feed-forward loop.
3. They conclude that the co-regulation of multiple activators could be at least
partly responsible for the occurrence of tumors.
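For readers unfamiliar with the motif, a feed-forward loop is a triple of genes in which A regulates B, B regulates C, and A also regulates C directly. The sketch below enumerates such triples in a small directed edge list. It is not MCODE and does not reproduce the module detection; the gene names and the function name are hypothetical.

```python
from itertools import permutations

def feed_forward_loops(edges):
    """Return all (a, b, c) with edges a->b, b->c and a->c in a directed network."""
    edge_set = set(edges)
    nodes = {node for edge in edge_set for node in edge}
    loops = []
    for a, b, c in permutations(sorted(nodes), 3):
        if (a, b) in edge_set and (b, c) in edge_set and (a, c) in edge_set:
            loops.append((a, b, c))
    return loops

# Hypothetical directed regulatory edges (regulator, target).
toy_edges = [("G1", "G2"), ("G2", "G3"), ("G1", "G3"), ("G3", "G4")]
loops = feed_forward_loops(toy_edges)
print(loops)
```

Brute-force enumeration over all ordered triples is fine for a handful of hub genes; a larger network would call for a neighbor-based search instead.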
Some more observations:
1. Colon cancer dataset: they identified 18 discriminative hub genes for
cancer.
2. 10 of these (e.g. DES and ACTA2) are DR genes in a tumor, while the 8
other genes (e.g. IL8, HSPD1, SRPK1) are UR genes in a tumor.
3. The UR genes are regulated by more genes than the DR ones, while the DR
genes regulate more genes than the UR ones.
4. Tumor suppressors inhibit tumor activators and activate as many other
tumor suppressors as possible, whereas tumor activators activate other tumor
activators and inhibit as few tumor suppressors as possible.
This method is a new option for cancer classification and direct gene regulatory
network inference.






• User-friendly
• Simple
• Biologically interpretable
• Cost-effective in a clinical setting with single genes or gene pairs
• Relatively easy to understand and follow
• Programming code available under either open access or the GNU
General Public License (GPL)