ChIP-Seq Data Analysis Tool

Team Members: Natalia Shatokhina
Faculty Advisor: Dr. Russell Abbott, Dr. Sandra Sharp
Liaison: California Institute of Technology
Department of Computer Science
College of Engineering, Computer Science, and Technology
California State University, Los Angeles

Background
The development of computational approaches to interpreting genomic data is a recent research topic for many biology research groups. These methods allow biologists to build large-scale models of transcriptional and genetic regulation in order to study particular biological processes. Myogenesis, the process of muscle development, is the process of interest to Dr. Barbara Wold's lab at Caltech and to Dr. Sandra Sharp at Cal State LA. Their experiments use ultra-high-throughput sequencing methods to identify all the RNAs being expressed in a muscle cell at a given time (RNA-Seq) and all the locations in the genome at which a particular regulatory protein is bound to the DNA (ChIP-Seq). A model of the regulatory network is developed by processing this information.

Objectives
To help analyze the experimental data, a computational biology tool was developed for processing result sets from ChIP-Seq experiments. The goal is to provide an interface for querying specific kinds of information. The input data are result sets from ChIP-Seq experiments and sets of genes. A typical ChIP-Seq result set contains a list of binding sites (regions) for one particular protein, plus additional information relevant to the experiment. A typical set of genes is a list of genes with their IDs, abbreviations, and DNA information (coordinates, chromosome). A set of genes is called a gene model.

User tools:
- to import result sets from a file and store them in the database. The corresponding description of the result set is stored in the database as an experiment description.
- to import gene models from a file and store them in the database. The corresponding description of the gene model is stored in the database as an experiment description.
- to calculate overlaps between different sets of regions, either across a group of result sets or within one result set.
- to calculate distances from a set of regions to a gene model. The set of regions (or the result set from an experiment) and the gene model are specified by the user (selected or uploaded from a file).
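The overlap and distance tools listed above can be illustrated with a minimal sketch. This is not the project's actual implementation, which the poster does not describe. The names Region, Gene, find_overlaps, distance and nearest_gene, the sample coordinates and the gene IDs are all hypothetical. It also assumes half-open coordinates (start inclusive, end exclusive) and defines distance as the gap in bases between a region and a gene, with 0 meaning they overlap.

```python
from typing import List, NamedTuple, Optional, Tuple


class Region(NamedTuple):
    """One binding site from a ChIP-Seq result set (hypothetical layout)."""
    chrom: str
    start: int  # inclusive
    end: int    # exclusive (assumed half-open convention)


class Gene(NamedTuple):
    """One entry of a gene model (hypothetical layout)."""
    gene_id: str
    chrom: str
    start: int
    end: int


def overlaps(a, b) -> bool:
    # Two half-open intervals overlap when each starts before the other ends.
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def find_overlaps(set_a: List[Region], set_b: List[Region]) -> List[Tuple[Region, Region]]:
    # Brute-force pairwise comparison; adequate for a sketch, slow for large sets.
    return [(a, b) for a in set_a for b in set_b if overlaps(a, b)]


def distance(region: Region, gene: Gene) -> Optional[int]:
    # Returns None for different chromosomes, 0 for overlap, else the gap size.
    if region.chrom != gene.chrom:
        return None
    if region.end <= gene.start:
        return gene.start - region.end
    if gene.end <= region.start:
        return region.start - gene.end
    return 0


def nearest_gene(region: Region, gene_model: List[Gene]) -> Optional[Tuple[Gene, int]]:
    # Scan the gene model and keep the gene with the smallest distance.
    best: Optional[Tuple[Gene, int]] = None
    for gene in gene_model:
        d = distance(region, gene)
        if d is not None and (best is None or d < best[1]):
            best = (gene, d)
    return best


# Made-up example data, for illustration only.
peaks_a = [Region("chr1", 100, 200), Region("chr2", 50, 80)]
peaks_b = [Region("chr1", 150, 250), Region("chr1", 300, 400)]
genes = [Gene("geneA", "chr1", 500, 900), Gene("geneB", "chr2", 10, 60)]

shared = find_overlaps(peaks_a, peaks_b)
closest = [nearest_gene(p, genes) for p in peaks_a]
```

The pairwise scan keeps the idea visible. A tool working on full result sets would more likely sort regions by chromosome and start position, or let the database perform the range comparison.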