Microarrays
Dr Peter Smooker, [email protected]

Transcription Analysis
• An analysis of transcription rates can tell us about the activity of a gene: its expression levels, the tissues it is expressed in, its developmental expression, etc.
• Traditionally, this was done on a gene-by-gene basis, as the sequence of each particular gene was identified and used as a probe. This was done using Northern blotting (semi-quantitative).

Developments
1. As in almost every field of molecular biology, PCR revolutionised transcript analysis. However, it was still done on a gene-by-gene basis.
2. Genome sequencing projects. These generated a large number of gene probes that can be used to analyse global transcription.

Global transcript analysis
• Theoretically, every gene can be arrayed and its transcription level analysed.
• Often, a subset is used, e.g. immune response genes.

Microarrays are a discovery technique
• Understanding the genes/proteins involved in disease
• Bottom-up approach: single genes are analysed. What does this gene encode? What does the product do? Are defects in the product involved in disease?
• Top-down approach: identify all genes whose expression is altered in a particular disease state. Identify an expression profile.

Microarrays: basic theory
• Spot DNA sequences (genes) onto a chip
• Extract RNA from the samples to be analysed
• Convert to cDNA using reverse transcriptase
• Hybridise to chip
• Quantify hybridisation
[Figure labels: Cy3, Cy5]

Discovery…
• Microarrays used to detect yeast genes regulated in sporulation
• More than 1000 found (many previously unknown)
• Several mutated and phenotype observed: all strains were defective in sporulation
• Discover function by observing expression

Some applications
• Identify and validate drug targets
• Gene expression in pathogens
• Population genetics
• Disease prognosis
• etc.

Fabricating arrays
• The spots on the array are generally oligonucleotides or PCR-generated cDNA. These are arrayed using a robotic arm.
• For RNA expression analysis, glass slides are used.
• Up to 10,000 spots per slide

Oligonucleotide arrays
• Up to 300,0000 oligonucleotides per slide
• Approx. 10 per gene

Scanning
• After hybridisation of the labelled RNA, the slide is scanned.
• A laser excites each spot. The Cy3 and Cy5 dyes emit fluorescence, which is captured by a confocal microscope. The classic array picture is generated (for human perusal).

Data Analysis
• The fluorescence of Cy3 and Cy5 is registered for each spot and normalised, and the ratio between the two is calculated.
• At the simplest level, differences greater than 2-fold are treated as significant.
• Often the standard deviation (SD) is calculated and used as a measure of significance.
• As the most interesting genes are often expressed in low abundance, normalisation and statistics are important.

Expression profile clustering
• Cluster genes that give the same expression pattern over several experiments/conditions.
• Construct a matrix. Each column is an experiment, each row a gene.

Clustering
• Clustering is the division of the elements of a set into subsets, by virtue of a distance metric among the elements.
• From a biological perspective, this might mean clustering all genes that have elevated transcription in tamoxifen-resistant breast cancer.

Clustering
• Some clustering techniques include:
  • Hierarchical clustering
  • Self-organising maps
  • K-means clustering
  • SVM
• Because the elements in a cluster are assigned a distance, phylogenetic techniques can be used to determine relationships. Traditional phylogenetic tools are used (e.g. Phylip).

Cancer profiles
• One area of research is the profiling of tumours. The expression pattern of each tumour is compared, and the clinical history of the patient is also known. This can lead to diagnostic predictions.

An Example
Breast Cancer Res. 2001; 3 (2): 77–80
Molecular profiling of breast cancer: portraits but not physiognomy
James D. Brenton, Samuel A. J. R. Aparicio, and Carlos Caldas
• Breast cancers may have different outcomes despite similar histopathological appearance.
• Want to identify key prognostic markers.
• Used 84 arrays, totalling over 680,000 data points. Tested 65 samples.
• Used hierarchical clustering to reveal groups with similar patterns of gene expression.
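
A worked sketch of the Data Analysis and clustering slides
The two analysis steps described in the slides (a per-spot Cy5/Cy3 ratio with a 2-fold cut-off, then clustering the rows of a gene-by-experiment matrix) can be illustrated in a few lines of Python. This is only a minimal sketch under stated assumptions, not the method used in the lecture or in the cited paper: the slides do not say how the data are normalised, which dye is the numerator, or which distance and linkage are used, so median-centring of log2(Cy5/Cy3), a Pearson correlation distance and average linkage are all choices made here for illustration. All function names, gene names and intensity values are hypothetical.

```python
import math
from statistics import median


def normalised_log_ratios(cy5, cy3):
    """Return log2(Cy5/Cy3) per spot, median-centred (assumed normalisation)."""
    raw = [math.log2(r / g) for r, g in zip(cy5, cy3)]
    centre = median(raw)
    return [x - centre for x in raw]


def two_fold_changed(log_ratios, genes):
    """Genes whose ratio differs by more than 2-fold, i.e. |log2 ratio| > 1."""
    return [g for g, x in zip(genes, log_ratios) if abs(x) > 1.0]


def pearson_distance(a, b):
    """1 - Pearson correlation; assumes neither profile is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(var_a * var_b)


def hierarchical_clusters(profiles, n_clusters):
    """Average-linkage agglomerative clustering of gene expression profiles.

    profiles maps gene name -> list of log ratios (one per experiment),
    i.e. the rows of the gene-by-experiment matrix. Merging stops once
    n_clusters groups remain.
    """
    clusters = [[gene] for gene in profiles]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        total = sum(pearson_distance(profiles[a], profiles[b])
                    for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        # Merge the closest pair of clusters.
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical intensities for five spots on one slide.
genes = ["g1", "g2", "g3", "g4", "g5"]
cy5 = [400.0, 100.0, 100.0, 50.0, 100.0]
cy3 = [100.0, 100.0, 110.0, 200.0, 95.0]
changed = two_fold_changed(normalised_log_ratios(cy5, cy3), genes)

# Hypothetical matrix: each row a gene, each column an experiment.
profiles = {
    "geneA": [0.1, 1.2, 2.3, 3.1],
    "geneB": [0.0, 1.0, 2.1, 2.9],
    "geneC": [3.0, 2.0, 1.1, 0.2],
    "geneD": [2.8, 2.1, 0.9, 0.0],
}
groups = hierarchical_clusters(profiles, n_clusters=2)
print(changed, groups)
```

With these made-up values, the genes that rise together across experiments end up in one group and the genes that fall together in the other. For real data sets an established library routine would normally be preferred over this quadratic-time illustration.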