Genomics

1 The Genome

The cattle genome consists of 30 pairs of chromosomes, which are made of DNA. There are about 3 billion base pairs within the DNA of those chromosomes. Each amino acid is coded by a set of 3 bases (a codon), such as TGC for cysteine; a few codons, such as TAA, instead signal the end of a protein. A chain of amino acids forms a protein or enzyme, which influences activities within the body of an individual.

Only about 5% of the genome actually codes for proteins and enzymes, and the remaining 95% appears to be redundant. The purpose of this extra DNA is unknown, but some areas are highly conserved and similar between species, and thus may have been important during evolution up to the present day. Another hypothesis is that the extra DNA is involved in the timing of the activation and shut-off of the coding regions of DNA. The purpose of DNA of all types is being studied actively today, with almost weekly articles in newspapers and magazines about new mysteries of DNA. This area is both very interesting and likely to play an important role in your future lives.

From one individual to the next there are variations in the sequences of base pairs. Variations can be due to

1. A change in one base pair, where A changes to G, or G changes to C.
2. A few base pairs that are present in one animal but missing in another.
3. A few extra base pairs that are present in one animal but not in another.
4. A stretch of base pairs that is inverted or moved to a different part of the chromosome.

Depending on the location of the variations in the genome, there could be different effects on the animal. Some variations (if they are in non-coding regions, for example) may not cause any change in the proteins and enzymes that are produced. Some variations may be in coding regions of the genome, but may still be harmless and result in no changes in functioning. Some variations could cause changes, such as in the height of individuals or the colour of the eyes or hair, which are also harmless.
Finally, variations could be harmful and cause serious and even lethal changes in the individual due to an inability to produce the correct series of amino acids.

2 Single Nucleotide Polymorphism, SNP

The most abundant type of variation in human and cattle genomes is the single nucleotide polymorphism, or SNP, where a single base pair has been changed. To be called a SNP, at least 1% of the population must carry the different base. To find SNPs, one must start at one end of the genome and go through it base by base, comparing two individuals (sequence comparisons). SNPs are discovered by comparing individuals that are greatly different in background, such as different breeds, or very high producers versus very low producers. Millions of SNPs have been found in humans, and there are over 800,000 in cattle, with more being discovered every day. Some of the same SNPs appear in both humans and cattle.

In 2003, a company called Affymetrix (California) produced a 'chip' (also called a 'panel' or 'array') of 10,000 SNPs from human studies. A DNA sample is put on the chip, and the genotypes of the animal for 10,000 SNPs could be determined at a cost of about $350 per animal.

The goal of a USDA-dairy industry project started in 2006 was to discover Quantitative Trait Loci (i.e. genes) that had large, significant effects on various traits in cattle. Researchers went through all of the known SNPs in cattle and deliberately chose which SNPs to put on the panel. The result was the Illumina 50K chip. DNA for the study was collected from semen samples from over 5,000 dairy and beef bulls from North America, including Canada. As of 2010, Illumina is developing a SNP chip with 800,000 SNPs.

3 Genome Wide Selection

For each SNP locus there are just 3 possible genotypes. With the two alleles labelled 1 and 2, these are 11, 12, and 22.
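As a rough illustration of the 1% rule and of the three-genotype coding, the following Python sketch may help. It assumes each SNP has two alleles, with one base arbitrarily labelled "1" and any other base labelled "2". The function names and the example bases are hypothetical and are not taken from any genotyping software.

```python
from itertools import combinations_with_replacement


def possible_genotypes(alleles=("1", "2")):
    """List every unordered pair of alleles an animal can carry at one locus."""
    return ["".join(pair) for pair in combinations_with_replacement(alleles, 2)]


def code_genotype(base_a, base_b, allele_1):
    """Code an animal's two bases at a SNP as '11', '12', or '22'.

    allele_1 is the base arbitrarily labelled '1' (an assumption of this
    sketch); any other base is labelled '2'.
    """
    labels = sorted("1" if base == allele_1 else "2" for base in (base_a, base_b))
    return "".join(labels)


def is_snp(bases_in_population, threshold=0.01):
    """Apply the 1% rule: the less common base must reach the threshold."""
    counts = {}
    for base in bases_in_population:
        counts[base] = counts.get(base, 0) + 1
    if len(counts) < 2:
        return False  # every sampled copy carries the same base
    minor_count = sorted(counts.values())[-2]  # second most common base
    return minor_count / len(bases_in_population) >= threshold


print(possible_genotypes())          # ['11', '12', '22']
print(code_genotype("G", "A", "A"))  # 12
```

Sorting the two labels makes the coding independent of which chromosome each base came from, which is why "12" and "21" are counted as one genotype.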
In 2001, Meuwissen, Hayes, and Goddard published a paper showing that if the SNPs were evenly spread through the genome, then it was possible to estimate the effects of the genotypes at each SNP locus on a trait of interest. The estimates could be put into a table as follows:

Genotype   Locus 1   Locus 2   Locus 3   ...   Locus n
11            0.10      3.60     10.97   ...     -1.12
12            0.50      4.58     12.44   ...     -3.56
22            0.90      5.63     15.33   ...     -5.87

There would be genotype estimates for every SNP locus. Thus, if a 50K chip was used, there would be 50,000 genotypes for one animal.

A Genomic Estimated Breeding Value (GEBV) could be constructed from the table of genotype estimates. Suppose the genotypes of animal X were (11, 12, 22, ..., 12); then the animal's GEBV would be the sum (0.10 + 4.58 + 15.33 + ... - 3.56) = 48.72, for example. That is, given the genotypes, sum the corresponding genotype estimates over all SNP loci.

According to Meuwissen et al. (2001), the correlation between GEBV and an animal's true breeding value (TBV) would be as high as 0.85 or better. Their estimate was based on simulation work in which many assumptions were made. In practice, so far, a correlation of 0.6 to 0.7 is probably the best that can be done. This is slightly more accurate than using a Parent Average EBV. All animals with the same parents would receive the same Parent Average EBV as an estimate of their genetic merit. However, with a GEBV, each offspring would have a different GEBV because their genotypes would most likely be different. Thus, GEBV would allow the best offspring of a sire and dam to be chosen.

Since the early work of Meuwissen et al. (2001), others have proposed different methods of computing GEBV for individuals. As of August 2008 the best method has not yet been found.

An advantage of GEBV is that an animal can be genotyped at birth and a GEBV can be calculated with acceptable accuracy.
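The GEBV sum for animal X described above can be sketched in Python. This is only an illustration under stated assumptions: it uses just the four loci printed in the table, so the partial sum it prints is not the 48.72 quoted in the text, which also includes the loci hidden behind the "...". The dictionary layout, the function name `gebv`, and the full sib's genotypes are hypothetical.

```python
# Genotype estimates copied from the table. Only the four loci that are
# printed there are included; the elided loci are deliberately not filled in.
GENOTYPE_ESTIMATES = {
    "Locus 1": {"11": 0.10, "12": 0.50, "22": 0.90},
    "Locus 2": {"11": 3.60, "12": 4.58, "22": 5.63},
    "Locus 3": {"11": 10.97, "12": 12.44, "22": 15.33},
    "Locus n": {"11": -1.12, "12": -3.56, "22": -5.87},
}


def gebv(genotypes, estimates=GENOTYPE_ESTIMATES):
    """Sum the estimate that matches the animal's genotype at every locus."""
    return sum(estimates[locus][genotype] for locus, genotype in genotypes.items())


# Animal X from the text: genotypes (11, 12, 22, ..., 12).
animal_x = {"Locus 1": "11", "Locus 2": "12", "Locus 3": "22", "Locus n": "12"}

# A made-up full sib: same parents, so the same Parent Average EBV,
# but different genotypes and therefore a different GEBV.
full_sib = {"Locus 1": "12", "Locus 2": "12", "Locus 3": "12", "Locus n": "11"}

print(round(gebv(animal_x), 2))  # 16.45 (partial sum over the four loci shown)
print(round(gebv(full_sib), 2))
```

With a real 50K chip the same lookup-and-sum would simply run over 50,000 loci instead of four.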
There is no need to wait until the animal is mature, or until the animal has some progeny, to select or cull that animal based on its genetic merit. The generation interval can be reduced. How this would work in dairy cattle was described by Schaeffer (2006), where genetic change could be doubled, and the cost of progeny testing could be reduced by two thirds or more. Also, fewer bulls would be needed.

Two countries, New Zealand and the Netherlands, have started to make use of GEBV. France and Canada have been selecting bulls for progeny testing on the basis of genotypes for 14 or so markers (not 50,000). In 2009, the USDA will publish GEBV (combined with the usual EBVs). Thus, the era of Genome Wide Selection is beginning. There will be significant changes in the dairy industry in the next few years because of this technology. The effect of GEBV on the increase in inbreeding will need to be monitored and controlled.

4 Future?

In humans, it is possible to have one's entire genome sequenced, so that the order of the 3 billion base pairs is known. With this information, the sequences of known genetic disorders can be "matched" to your genome to see if they are present or not. Thus, you will know which diseases you may develop in your life, and therefore you might be able to alter your lifestyle to prevent a disease from occurring.

The SNP era will eventually be replaced by the era of complete genome sequencing. There will be billions of base pairs of information available on every human and every breeding animal, and this information will be used to help humans with diseases and to choose animals for breeding. The problems will be 1) the storage of this quantity of information; 2) the manipulation of this data to be useful; and 3) the discovery of genes and their functions. Bioinformatics will be a huge area needing many thousands of workers in the future.
People will be needed for computing, statistics, genetics, and proteomics (study of proteins and gene functions).

REFERENCES

Meuwissen, T. H. E., B. J. Hayes, M. E. Goddard. 2001. Prediction of total genetic value using genome-wide dense marker maps. Genetics 157:1819-1829.

Schaeffer, L. R. 2006. Strategy for applying genome-wide selection in dairy cattle. J. Anim. Breed. Genet. 123:1-6.