Kylia Williams
Identifying CRISPR Targets
Clustered regularly interspaced short palindromic repeats (CRISPR) and the CRISPR-associated
protein Cas9 can be used to edit a genome. Cas9 cuts DNA at a target site specified by a small
guide RNA (sgRNA), and the break can then be fixed by homology-directed repair using a
single-stranded donor oligonucleotide (ssODN) as a template. sgRNA target sequences typically
have the form G(N19)NGG. Cas9 cuts just upstream of the NGG, which is known as the protospacer
adjacent motif (PAM). Ideally, the mutation is as close as possible to the sgRNA site without
being within it, so that it does not interfere with sgRNA binding to the target site.
CRISPR/Cas9 can be used to create clinically relevant models of genetic diseases by introducing
a known human mutation into a model organism. For example, point mutations in the
tyrosinase gene, which is involved in dopamine and melanin production, cause oculocutaneous
albinism type 1B (OCA1B). The Albinism Database
(http://www.ifpcs.org/albinism/oca1mut.html) lists mutations that cause OCA1B.
Missense mutations in exon 2, between the copper-binding regions of the tyrosinase protein, are
good candidates for creating a model.
This assignment uses Biopython to identify the best sgRNA sequence to target a specific
mutation site.
70%
Fetch the zebrafish and human tyrosinase genes from the Ensembl database using its
RESTful web service (http://rest.ensembl.org/) and save them to a single FASTA file named tyr.fa.
Use the IDs provided in the inputs.
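As a starting point, the fetch step might be sketched as follows. This is a minimal sketch, not the assignment's reference solution: it uses the standard library instead of Biopython, it assumes the Ensembl REST `sequence/id` endpoint with FASTA output and its default sequence type, and the helper names `build_url`, `fetch_fasta`, and `save_fasta` are hypothetical.

```python
import urllib.request

SERVER = "https://rest.ensembl.org"


def build_url(stable_id):
    """Build the REST URL that returns one Ensembl ID as FASTA text."""
    return f"{SERVER}/sequence/id/{stable_id}?content-type=text/x-fasta"


def fetch_fasta(stable_id):
    """Download one FASTA record (performs a network request)."""
    with urllib.request.urlopen(build_url(stable_id), timeout=60) as resp:
        return resp.read().decode("utf-8")


def save_fasta(stable_ids, path, fetch=fetch_fasta):
    """Write one FASTA record per ID into a single file.

    `fetch` is injectable so the function can be tried without network access.
    """
    records = [fetch(stable_id).strip() for stable_id in stable_ids]
    with open(path, "w") as handle:
        handle.write("\n".join(records) + "\n")
```

Calling `save_fasta(["ENSDARG00000039077", "ENSG00000077498"], "tyr.fa")` would then write both records to tyr.fa, provided the service is reachable.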
80%
Align the two sequences (this may take a few minutes) and save the alignment to a file named
alignment.aln.
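The assignment does not name an alignment program, so the following is only one possible approach. It assumes Clustal Omega is installed and available as `clustalo` on the PATH, and asks it for Clustal-format output to match the .aln extension. The function names `clustalo_command` and `run_alignment` are hypothetical.

```python
import subprocess


def clustalo_command(fasta_path, out_path):
    """Return the Clustal Omega command line as an argument list."""
    return [
        "clustalo",
        "-i", fasta_path,     # input FASTA holding both sequences
        "-o", out_path,       # where the alignment is written
        "--outfmt=clu",       # Clustal format
        "--force",            # overwrite an existing output file
    ]


def run_alignment(fasta_path="tyr.fa", out_path="alignment.aln"):
    """Run the aligner; raises CalledProcessError if it exits with an error."""
    subprocess.run(clustalo_command(fasta_path, out_path), check=True)
```

Building the command separately from running it makes it easy to swap in a different aligner or to inspect the exact call before a long run.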
90%
Find the index in the zebrafish sequence that corresponds to the given mutation in the human
sequence. Use motifs to create a list of PAM sequences and sort the list by distance from the
mutation site.
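The index mapping and PAM search could look roughly like this. The sketch makes several assumptions: the two aligned rows are available as gapped strings (for example after reading alignment.aln), positions are 0-based indices into the ungapped sequences, only the forward strand is searched, and a regular expression stands in for Biopython's motif tools. The names `map_index`, `find_pams`, and `pams_by_distance` are hypothetical.

```python
import re


def map_index(aligned_src, aligned_dst, src_index):
    """Map an ungapped index in one aligned row to the other row.

    Returns None when the other row has a gap in that alignment column.
    """
    src_pos = dst_pos = -1
    for s, d in zip(aligned_src, aligned_dst):
        if s != "-":
            src_pos += 1
        if d != "-":
            dst_pos += 1
        if s != "-" and src_pos == src_index:
            return dst_pos if d != "-" else None
    raise IndexError("src_index lies beyond the aligned source sequence")


def find_pams(seq):
    """Start positions of every NGG on the given strand (overlaps included)."""
    return [m.start() for m in re.finditer(r"(?=[ACGT]GG)", seq.upper())]


def pams_by_distance(seq, site):
    """PAM start positions sorted by absolute distance from `site`."""
    return sorted(find_pams(seq), key=lambda pos: abs(pos - site))
```

For instance, with the toy rows `"AC-GT"` and `"A-TGT"`, source index 2 maps to index 2 in the other row, while source index 1 falls on a gap and yields `None`.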
100%
Print the complete sgRNA target sequence (G(N19)NGG) closest to the mutation site. Keep in
mind that the mutation site should be outside of the PAM sequence.
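One way to pick the target is sketched below, under stated assumptions: the site is a 0-based index into the zebrafish sequence, only the forward strand is searched, distance is measured from the site to the first base of the PAM, and candidates whose PAM contains the site are skipped. The function name `closest_target` is hypothetical.

```python
import re

# G, then 19 arbitrary bases, then the NGG PAM: 23 nt in total.
# The lookahead lets overlapping candidates be reported as well.
TARGET = re.compile(r"(?=(G[ACGT]{19}[ACGT]GG))")


def closest_target(seq, site):
    """Return (start, target) for the G(N19)NGG match nearest to `site`.

    Returns None if no candidate keeps `site` outside its PAM.
    """
    best = None
    for match in TARGET.finditer(seq.upper()):
        start = match.start()
        pam_start = start + 20
        if pam_start <= site < pam_start + 3:
            continue  # mutation site falls inside the PAM
        distance = abs(pam_start - site)
        if best is None or distance < best[0]:
            best = (distance, start, match.group(1))
    return None if best is None else (best[1], best[2])
```

A stricter variant could also reject candidates whose 20-nt guide region contains the site, in line with the preference for a mutation that lies near the sgRNA site but not within it.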
Inputs
Zebrafish tyr ID      Human tyr ID       Human Mutation Site
ENSDARG00000039077    ENSG00000077498    626
ENSDARG00000039077    ENSG00000077498    976
ENSDARG00000039077    ENSG00000077498    826