Written Report
B90901099 劉兆昕
Source of this material: http://www.zbi.uni-saarland.de/cbi/stud/wasist.shtml

Bioinformatics Overview

Researchers in bioinformatics develop algorithms and software for simulating biochemical processes and for analyzing molecular biology data.

The publication of the draft human genome sequence in February 2001 is considered a milestone of scientific research, as reflected in the reaction of the media and the public. Bioinformatics tools were essential to this achievement.

Still bigger challenges lie ahead: genes must be identified and their functions determined. An understanding of the interplay of the gene products will be the basis for developing future pharmaceutical treatments. These tasks go far beyond the mere assembly of the DNA sequence, and bioinformatics makes decisive contributions to tackling them.

From genome to drug

The following pages highlight some of the most important topics in bioinformatics research.

Bioinformatics played a decisive role in one of the most prominent scientific achievements of recent years, the sequencing of the human genome.

With the sequence known, the annotation of the genome can begin. This means searching for genes in the DNA, identifying the corresponding gene products (proteins, RNA), and explaining their structure and function.

To fully understand protein function, one has to consider how proteins interact. These interactions are represented by metabolic and regulatory networks, which can, among other things, be simulated using computational methods.

In general, drugs act by influencing the proteins that are involved in metabolism. Building on the assembled human genome, bioinformatics methods allow researchers to find proteins (targets) that are better suited for treating certain diseases.

Bioinformatics delivers important contributions to the development of new drugs.
Databases make it possible to search through large amounts of data to find new candidate drugs that are effective, have fewer side effects and are capable of reaching the right destination in the body.

Bioinformatics supports the optimization of known therapies

Comparing the complete genomes of different individuals makes it possible to trace differences (SNPs), which may play a role when deciding on the individual therapy for a patient.

Viral infections present a great challenge for drug development and therapy. Because viruses like HIV show high genomic variability, viral mutations can arise that confer resistance to the prescribed drugs. A physician is therefore frequently faced with the problem of finding a new therapy for each patient infected with a particular strain. Bioinformatics methods have been developed to understand the relationship between viral mutations and drug resistance, leading to better therapeutic strategies.

Sequencing

The image sketches the sequencing process for a DNA molecule (chromosome). At the beginning, the sequences of the segments are unknown. Green lines represent pieces that have been read during the process. (1) cloning, (2) fragmenting, (3) sequencing, (4) comparison, (5) assembly

In a human cell, the genome is distributed over 46 long DNA molecules (chromosomes) contained within the nucleus. The chromosomes carry the genetic information. Each DNA molecule consists of two strands in the form of a double helix. Each DNA strand is a linear polymer that consists of similar subunits (monomers) connected end to end. Each monomer contains a sugar, a phosphate and a base. The sequence of bases represents a form of linear information. There are four bases, denoted by the letters A, C, G and T. The bases are complementary in pairs: A binds to T, and G binds to C. Because of this base-pair complementarity, a single strand already contains the full genetic information.
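The statement that one strand determines the other can be illustrated with a minimal Python sketch. The names `COMPLEMENT`, `complement` and `reverse_complement` are chosen here for illustration and do not come from the source. The reversal step reflects the fact that the two strands of a double helix run in opposite directions, a detail the text above does not spell out.

```python
# Base-pairing rules from the text: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Return the base-by-base complement of a DNA strand."""
    return "".join(COMPLEMENT[base] for base in strand)


def reverse_complement(strand):
    """Return the opposite strand, read in its own direction.

    Assumption (not stated in the source): the two strands are
    antiparallel, so the complement is reversed before it is read.
    """
    return complement(strand)[::-1]


if __name__ == "__main__":
    print(complement("ATGC"))          # TACG
    print(reverse_complement("ATGC"))  # GCAT
```

Applying `reverse_complement` twice returns the original string, which is one way to see that neither strand holds more information than the other.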
The goal of sequencing is to obtain the ordered set of bases contained in the DNA in the form of one long string. Sequencing machines cannot read the whole genome in one step, so the genome has to be cut into smaller pieces. To make reassembly possible, the pieces have to overlap. This can be achieved by generating many copies of a DNA strand and cutting them into pieces at random (for example with high pressure or ultrasound).

In the process of sequence assembly, the full sequence of nucleotides is reconstructed from overlap information: a stepwise search finds pieces with overlapping ends, and the overlapping pieces are then joined. Bioinformatics provides suitable algorithms for the assembly step. These algorithms have to be very efficient, because the number of pieces, and hence the number of pairwise comparisons for overlaps, is large. In addition, the algorithms have to deal with problems such as repetitive sequences and reading errors in the genome pieces.

Annotation:

1 - From Chromosome to Gene

The chromosomes contain the genome of an organism. The genome contains sequence regions (genes) that code for proteins and other molecular constituents. The proportion of coding sequence in relation to the total genome is rather small. After a genome has been sequenced, the task is to locate the genes. This step, part of the annotation process, requires bioinformatics methods such as pattern recognition and sequence alignment.

2 - From Sequence to Structure

After a gene has been found, interest shifts to determining the structure and function of the protein it codes for. The gene sequence determines the spatial structure of the molecule, and the three-dimensional structure influences the tasks the molecule performs in the body. Structures are usually determined experimentally by X-ray crystallography or NMR spectroscopy. Bioinformaticians develop algorithms to predict the shape of these molecules, as well as tools for analyzing their function.
3 - Similarity Search and Determination of Function

Database sequence searches are a well-suited method for transferring established knowledge about the function of known proteins to newly sequenced genes. Comparison of structures can also lead to the identification of molecular function.

Metabolic and Regulatory Pathways

Part of a pathway; here enzymes are described by their classification numbers (4.2.1.11 and 2.7.1.40).

Overview of a metabolic pathway (Source: KEGG: Kyoto Encyclopedia of Genes and Genomes)

After the function of a protein has been understood, it is of special interest to identify the metabolic pathways of an organism. A metabolic pathway is a sequence of reactions that, taken together, lead from one substance to another. The reactions themselves are generally catalyzed by enzymes. Important examples of metabolic pathways are the citrate cycle and glycolysis. The sum of all metabolic pathways forms the metabolic network of an organism.

In addition to the metabolic pathways there are regulatory pathways, in which biological processes are controlled by different signals. The sum of all regulatory pathways determines the regulatory network of an organism.

These networks can be represented as computational models. The computational pathway representation can be used in target identification, in drug design and in the search for causes of genetic disease. In basic research, the networks can be used to compare the metabolic processes of different organisms. For example, information on the metabolism of one organism can be used to understand the newly sequenced genome (and, correspondingly, the metabolic pathways) of another organism.

Target Identification

The analysis of the metabolic network yields a target (red X) that is to be inhibited.

Scientists can unravel the metabolism of an organism by studying its metabolic and regulatory networks. In the human body, deviations from normal function are of primary interest, since these are frequently causes of disease.
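The idea of analyzing a pathway model to find a target can be sketched with a toy graph search in Python. This is an illustration under stated assumptions, not the method used by the source: the metabolite names are filled in from standard biochemistry for the two enzyme classification numbers mentioned above, and the function names `reachable` and `essential_enzymes` are hypothetical. An enzyme counts as a candidate target here if blocking it cuts every route from a start substance to an end substance.

```python
from collections import deque

# Toy pathway fragment as (substrate, product, enzyme classification number).
# The metabolite names are an assumption added for illustration.
REACTIONS = [
    ("2-phosphoglycerate", "phosphoenolpyruvate", "4.2.1.11"),
    ("phosphoenolpyruvate", "pyruvate", "2.7.1.40"),
]


def reachable(reactions, start, goal, blocked=frozenset()):
    """Breadth-first search: can `goal` be produced from `start`
    using only reactions whose enzyme is not in `blocked`?"""
    graph = {}
    for substrate, product, enzyme in reactions:
        if enzyme not in blocked:
            graph.setdefault(substrate, []).append(product)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def essential_enzymes(reactions, start, goal):
    """Return the enzymes whose inhibition alone disconnects start from goal."""
    enzymes = sorted({enzyme for _, _, enzyme in reactions})
    return [e for e in enzymes
            if not reachable(reactions, start, goal, blocked={e})]


if __name__ == "__main__":
    print(essential_enzymes(REACTIONS, "2-phosphoglycerate", "pyruvate"))
    # ['2.7.1.40', '4.2.1.11']
```

In this linear fragment every enzyme is essential. In a larger network with alternative routes, the same check would single out only those enzymes that no bypass can compensate for.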
If the origin of a disease is understood, the metabolic networks can help to perform a more precise analysis and to find targets for a possible treatment. Targets can be proteins that catalyze metabolic reactions. If a malfunctioning protein has been found, drugs can be developed to influence its activity and to remove the source of the disease. The network models also make it possible to analyze potential side effects in the body that a treatment must avoid.

Drug Design

Target identification alone is not sufficient to achieve a successful treatment of a disease. An actual drug needs to be developed. This drug must influence the target protein in such a way that it does not interfere with normal metabolism. One way to achieve this is to block the activity of the protein with a small molecule. Bioinformatics methods have been developed to virtually screen compounds for those that bind to and inhibit the target protein. Another possibility is to find other proteins that regulate the activity of the target by binding to it and forming a complex.

Other references:
http://www.ncbi.nlm.nih.gov/About/primer/bioinformatics.html
http://bioinformatics.weizmann.ac.il/cards/bioinfo_intro.html
http://www.ornl.gov/sci/techresources/Human_Genome/home.shtml
http://biotech.icmb.utexas.edu/pages/bioinfo.html
http://avery.rutgers.edu/WSSP/StudentScholars/Session15/Session15.html