Written Report
B90901099
劉兆昕
Source: http://www.zbi.uni-saarland.de/cbi/stud/wasist.shtml
Bioinformatics Overview
Researchers in Bioinformatics develop algorithms and software for simulations of biochemical
processes and the analysis of molecular biology data.
The publication of the human genome sequence in February 2001 is considered a milestone of
scientific research, as the reaction of the media and the public reflects. Bioinformatics tools
were essential for this achievement.
Still bigger challenges lie ahead: genes must be identified and their function determined. An
understanding of the interplay of the gene products will be the basis for the development of future
pharmaceutical treatments. These tasks go far beyond the mere assembly of the DNA sequence.
Bioinformatics contributes decisively to tackling these challenges.
From genome to drug
The following pages highlight some of the most important topics within bioinformatics research.
Bioinformatics played a decisive role in one of the most prominent scientific achievements of recent
years: the sequencing of the human genome.
With the sequence known, the annotation of the genome can begin. This means searching for
genes in the DNA, identifying the corresponding gene products (proteins, RNA), and explaining
their structure and function.
To fully understand the function of proteins, one has to consider their interplay. These interactions are
represented by metabolic and regulatory networks which, among other things, can be simulated using
computational methods.
In general, drugs act by influencing the proteins that are involved in metabolism. Based on the
assembled human genome, bioinformatics methods allow researchers to find proteins (targets)
that are better suited for treating certain diseases.
Bioinformatics delivers important contributions to the development of new drugs. Databases make it
possible to search through large amounts of data for new candidate drugs that are effective, have fewer
side effects and are capable of reaching the right destination in the body.
Bioinformatics also supports the optimization of known therapies. Comparing the complete genomes of
different individuals makes it possible to trace differences such as single nucleotide polymorphisms
(SNPs), which may play a role when deciding on the individual therapy for a patient.
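The comparison of genomes described above can be illustrated with a minimal sketch. It assumes that the two sequences have already been aligned and have the same length; the function name find_snps and the example sequences are hypothetical and chosen only for illustration.

```python
def find_snps(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (position, reference_base, sample_base) for every mismatch.

    Assumption: both sequences are already aligned and equally long.
    Positions are 0-based.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, ref_base, sample_base)
        for pos, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Hypothetical toy sequences, not real genomic data.
reference = "ACGTTGCA"
patient = "ACGATGCG"
print(find_snps(reference, patient))  # [(3, 'T', 'A'), (7, 'A', 'G')]
```

Real genomes differ by insertions and deletions as well, so in practice an alignment step would have to precede a position-by-position comparison like this one.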
Viral infections present a great challenge for drug development and therapy. Viruses like
HIV show high genomic variability, which can give rise to viral mutations that confer
resistance to the prescribed drugs. A physician is therefore frequently faced with the problem of
finding a new therapy for a patient infected with a particular strain. Bioinformatics methods have
been developed to understand the relationship between viral mutations and drug resistance, leading to
better therapeutic strategies.
Sequencing
The image sketches the sequencing process for a DNA molecule (chromosome). At the beginning the
sequences of the segments are unknown. Green lines represent pieces that have been read during the
process.
(1) cloning, (2) fragmenting, (3) sequencing, (4) comparison, (5) assembly
The human genome consists of 46 long DNA molecules (chromosomes) contained within the nucleus
of the cell. The chromosomes carry genetic information. Each DNA molecule consists of two strands in
the form of a double helix. Each DNA strand is a linear polymer that consists of similar subunits
(monomers) connected end to end. Each monomer contains a sugar, a phosphate and a base
component. The sequence of bases represents a form of linear information. There are four bases,
denoted by the letters A, C, G and T. The bases are complementary in pairs: A binds to T, and G binds to C.
Because of this base-pair complementarity, a single strand contains the full genetic information.
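The statement that one strand determines the other can be shown in a few lines. This is a minimal sketch; the name complement_strand is hypothetical. The partner strand is returned in reversed order because the two strands of a double helix run in opposite directions.

```python
# Base pairing as described in the text: A binds to T, G binds to C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_strand(strand: str) -> str:
    """Return the partner strand, read in the opposite direction."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


strand = "ATGCGTTA"
partner = complement_strand(strand)
print(partner)                     # TAACGCAT
print(complement_strand(partner))  # ATGCGTTA, the original strand again
```

Applying the function twice gives back the original string, which is the sense in which a single strand carries the full information.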
The goal of sequencing is to obtain the ordered sequence of bases contained in the DNA in the form
of a long string.
Sequencing machines cannot read the whole genome in one step. Therefore, the genome has to be
cut into smaller pieces. To make it possible to reassemble the pieces, they have to overlap. This
can be achieved by generating many copies of a DNA strand and cutting each copy into pieces at random
(for example with high pressure or ultrasound).
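The fragmenting step can be simulated with a short sketch. It rests on simplifying assumptions: the genome is a plain string, every copy is cut completely into consecutive pieces, and piece lengths are drawn uniformly at random. The function name shotgun_fragments and all parameter values are hypothetical.

```python
import random


def shotgun_fragments(genome: str, copies: int, min_len: int,
                      max_len: int, seed: int = 0) -> list[str]:
    """Cut several copies of the genome into pieces at random positions.

    Because every copy is cut at different places, pieces taken from
    different copies overlap. The last piece of a copy may be shorter
    than min_len.
    """
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    fragments = []
    for _ in range(copies):
        pos = 0
        while pos < len(genome):
            length = rng.randint(min_len, max_len)
            fragments.append(genome[pos:pos + length])
            pos += length
    return fragments


genome = "ACGTACGTTGCA"  # toy sequence for illustration
for piece in shotgun_fragments(genome, copies=3, min_len=3, max_len=5):
    print(piece)
```

In a real experiment the order of the pieces and the copy they came from are of course unknown; the sketch keeps them in order only because that is the simplest way to generate them.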
In the process of sequence assembly, the full sequence of nucleotides is reconstructed from overlap
information by searching stepwise for pieces with overlapping ends. Overlapping pieces
are then joined.
Bioinformatics provides suitable algorithms for the assembly step. These algorithms have to be very
efficient, as the number of pieces, and hence the number of pairwise comparisons for overlaps, is large.
In addition, the algorithms have to deal with problems such as repetitive sequences or reading errors in
the genome pieces.
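The stepwise search for overlapping ends can be sketched as a simple greedy procedure: repeatedly join the two pieces with the longest overlap. This is only one possible strategy, not necessarily the one used by production assemblers, and it assumes error-free pieces from a single strand without long repeats. The names overlap, drop_contained and greedy_assemble are hypothetical.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b."""
    for length in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:length]):
            return length
    return 0


def drop_contained(frags: list[str]) -> list[str]:
    """Remove duplicates and pieces that lie entirely inside another piece."""
    unique = list(dict.fromkeys(frags))
    return [f for f in unique if not any(f != g and f in g for g in unique)]


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Merge the pair with the longest overlap until no overlaps remain."""
    frags = drop_contained(fragments)
    while len(frags) > 1:
        best_len, best_i, best_j = 0, None, None
        # Pairwise comparison of all pieces: this is the expensive part.
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    olen = overlap(a, b, min_len)
                    if olen > best_len:
                        best_len, best_i, best_j = olen, i, j
        if best_len == 0:
            break  # remaining pieces do not overlap; return them as they are
        merged = frags[best_i] + frags[best_j][best_len:]
        rest = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags = drop_contained(rest + [merged])
    return frags


print(greedy_assemble(["CGTACCT", "ACCTGA", "ATGCGTA"]))  # ['ATGCGTACCTGA']
```

The nested loop makes the quadratic number of pairwise comparisons visible. Repetitive sequences break the greedy choice, because a repeat produces overlaps between pieces that are not neighbours in the genome.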
Annotation
1 - From Chromosome to Gene
The chromosomes of an organism contain its genome. The genome contains sequence regions
(genes) that code for proteins and other molecular constituents. The proportion of coding sequence in
relation to the total genome is rather small. After a genome has been sequenced, the task is to locate the genes.
This step, part of the annotation process, requires bioinformatics methods such as pattern recognition
and sequence alignment.
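A very simple form of pattern recognition for locating candidate genes is the search for open reading frames: stretches that begin with the start codon ATG and run in steps of three bases to a stop codon. The sketch below is an illustration under strong assumptions (one strand only, no introns, no statistical model); real gene finders are considerably more elaborate. The function name find_orfs and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 3) -> list[tuple[int, int, str]]:
    """Return (start, end, sequence) of open reading frames on one strand.

    start is 0-based, end is exclusive, and the stop codon is included.
    min_codons counts all codons from the start codon to the stop codon.
    """
    orfs = []
    for frame in range(3):  # three possible reading frames
        start = None
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                if (pos + 3 - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, dna[start:pos + 3]))
                start = None
    return orfs


print(find_orfs("CCATGAAATTTTAGGC"))  # [(2, 14, 'ATGAAATTTTAG')]
```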
2 - From Sequence to Structure
After a gene is found, the interest shifts to determining the structure and function of the protein it codes
for. The gene sequence determines the spatial structure of the molecule. The three-dimensional
structure influences the tasks performed by the molecule in the body. Structures are usually
determined experimentally by X-ray crystallography or NMR spectroscopy. Bioinformaticians develop
algorithms to predict the shape of the molecules as well as tools for the analysis of their function.
3 - Similarity Search and Determination of Function
Database sequence searches are a useful method for transferring established knowledge of the
function of known proteins to newly sequenced genes. Comparison of structures can also lead to the
identification of molecular function.
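A database search of this kind can be sketched with a local alignment score in the style of the Smith-Waterman algorithm: the query is compared with every database entry, and the best-scoring entry suggests a function. The scoring values, the function names and the two database entries are assumptions made for this illustration; real searches use substitution matrices and much faster heuristics.

```python
def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score of a and b (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def best_hit(query: str, database: dict[str, str]) -> tuple[str, int]:
    """Return the name and score of the most similar database entry."""
    scores = {name: local_alignment_score(query, seq)
              for name, seq in database.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


# Hypothetical entries; names and sequences are invented for illustration.
database = {
    "kinase_like": "MKTAYIAKQR",
    "transporter_like": "GGSLLPWWNE",
}
print(best_hit("TAYIAK", database))  # ('kinase_like', 12)
```

A high score only suggests a shared function; the transfer of an annotation still has to be judged against the length and quality of the match.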
Metabolic and Regulatory Pathways
Part of a pathway; here enzymes are described by their classification numbers (4.2.1.11 and 2.7.1.40).
Overview of a metabolic pathway (Source: KEGG: Kyoto Encyclopedia of Genes and Genomes)
After the function of a protein has been understood, it is of special interest to identify the metabolic
pathways of an organism. A pathway is a sequence of reactions that, taken together, lead from one
substance to another. The reactions themselves are generally catalysed by an enzyme.
Important examples of metabolic pathways are the citrate cycle and glycolysis. The sum of all metabolic
pathways forms the metabolic network of an organism.
In addition to the metabolic pathways there are the regulatory pathways, where biological processes are
controlled by different signals. The sum of all regulatory pathways determines the regulatory network
of an organism.
The networks can be represented by computational models. These pathway representations
can be used in target identification, drug design and the search for causes of genetic
disease.
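One simple computational representation of a pathway is a directed graph whose nodes are substances and whose edges are enzyme-catalysed reactions. The sketch below uses the two enzyme classification numbers mentioned in the figure caption (4.2.1.11, enolase, and 2.7.1.40, pyruvate kinase) and finds a route between two substances by breadth-first search. Cofactors and reverse reactions are left out for brevity, and the names REACTIONS and find_pathway are hypothetical.

```python
from collections import deque

# Each reaction: (substrate, product, enzyme classification number).
REACTIONS = [
    ("2-phosphoglycerate", "phosphoenolpyruvate", "4.2.1.11"),
    ("phosphoenolpyruvate", "pyruvate", "2.7.1.40"),
]


def find_pathway(reactions, source, target):
    """Return the reaction steps leading from source to target, or None."""
    graph = {}
    for substrate, product, enzyme in reactions:
        graph.setdefault(substrate, []).append((enzyme, product))
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        compound, path = queue.popleft()
        if compound == target:
            return path
        for enzyme, product in graph.get(compound, []):
            if product not in seen:
                seen.add(product)
                queue.append((product, path + [(compound, enzyme, product)]))
    return None


for substrate, enzyme, product in find_pathway(
        REACTIONS, "2-phosphoglycerate", "pyruvate"):
    print(f"{substrate} --[{enzyme}]--> {product}")
```

The same structure extends to a whole network simply by adding more reactions to the list.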
In basic research these networks can be used for the comparison of metabolic processes of different
organisms. E.g., information on the metabolism of one organism can be used to understand the newly
sequenced genome (and, correspondingly, the metabolic pathways) of another organism.
Target Identification
The analysis of the metabolic network yields a target (red X) that is to be inhibited.
Scientists can unravel the metabolism of an organism by studying its metabolic and regulatory
networks. First and foremost, in the human body, deviations from normal function are of interest, since
these are frequently causes of disease.
If the origin of a disease is understood, the metabolic networks can help to perform a more precise
analysis and to find targets for a possible treatment. Targets can be proteins catalysing metabolic
reactions. If a malfunctioning protein has been found, drugs can be developed to influence its activity
and to remove the source of the disease. The network models also make it possible to analyze
possible side effects in the body that a treatment must avoid.
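One way to use a network model for target identification is to block each enzyme in turn and check whether an unwanted product can still be formed. The sketch below does this on an invented toy network; the substance names, the enzyme labels E1 to E5 and the function names are all hypothetical, and a real analysis would also have to consider reaction rates and side effects.

```python
def reachable(reactions, source, target, blocked=None):
    """True if target can be produced from source without the blocked enzyme."""
    graph = {}
    for substrate, product, enzyme in reactions:
        if enzyme != blocked:
            graph.setdefault(substrate, []).append(product)
    stack, seen = [source], {source}
    while stack:
        compound = stack.pop()
        if compound == target:
            return True
        for product in graph.get(compound, []):
            if product not in seen:
                seen.add(product)
                stack.append(product)
    return False


def candidate_targets(reactions, source, target):
    """Enzymes whose inhibition alone interrupts every route to the target."""
    enzymes = sorted({enzyme for _, _, enzyme in reactions})
    return [e for e in enzymes
            if not reachable(reactions, source, target, blocked=e)]


# Invented toy network: two alternative routes from A to D, then one last step.
TOY_NETWORK = [
    ("A", "B", "E1"),
    ("A", "C", "E2"),
    ("B", "D", "E3"),
    ("C", "D", "E4"),
    ("D", "harmful_product", "E5"),
]
print(candidate_targets(TOY_NETWORK, "A", "harmful_product"))  # ['E5']
```

In the toy network, blocking any of E1 to E4 has no effect because an alternative route exists, which is why only E5 is reported.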
Drug Design
Target identification alone is not sufficient for a successful treatment of a disease. An
actual drug needs to be developed.
This drug must influence the target protein in such a way that it does not interfere with normal
metabolism. One way to achieve this is to block the activity of the protein with a small molecule.
Bioinformatics methods have been developed to virtually screen compounds for those that bind to and
inhibit the target protein. Another possibility is to find other proteins that regulate the activity of
the target by binding to it and forming a complex.
Other references:
http://www.ncbi.nlm.nih.gov/About/primer/bioinformatics.html
http://bioinformatics.weizmann.ac.il/cards/bioinfo_intro.html
http://www.ornl.gov/sci/techresources/Human_Genome/home.shtml
http://biotech.icmb.utexas.edu/pages/bioinfo.html
http://avery.rutgers.edu/WSSP/StudentScholars/Session15/Session15.html