
Gene Prediction Exercise
Initial concepts to be known:
1) What are ab initio and homology-based methods, and what are the differences between them?
2) What is the structure of a prokaryotic operon?
Possible programs to be used for prediction of genes by ab initio methods:
Glimmer
Implementation of Glimmer is divided into 2 steps:
1) First, a probability model of coding sequences, called an ICM (interpolated context model), is built from known genes or from genes of a similar species.
build-icm [ options ] output-file < input-file
2) The glimmer3 program itself is run to analyze the sequences and make gene predictions.
glimmer3 [ options ] sequence icm tag
Here tag is the prefix used for the names of the output files.
The program can be downloaded from
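To make the idea of a probability model of coding sequences more concrete, here is a minimal Python sketch. It is not Glimmer's ICM: it assumes a much simpler fixed-order Markov model, and the function names (train, log_likelihood, coding_score) are hypothetical. It only illustrates the principle of training on known genes and then scoring a candidate sequence against a background model.

```python
import math
from collections import Counter

BASES = "ACGT"

def train(seqs, order=2):
    """Count how often each base follows each context of length `order`."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(order, len(seq)):
            counts[(seq[i - order:i], seq[i])] += 1
    return counts

def log_likelihood(counts, seq, order=2, pseudo=1.0):
    """Log-probability of seq under the model, with pseudocount smoothing."""
    seq = seq.upper()
    total = 0.0
    for i in range(order, len(seq)):
        ctx = seq[i - order:i]
        num = counts[(ctx, seq[i])] + pseudo
        den = sum(counts[(ctx, b)] for b in BASES) + pseudo * len(BASES)
        total += math.log(num / den)
    return total

def coding_score(coding, background, seq, order=2):
    """Log-odds score: positive values favor the coding model."""
    return (log_likelihood(coding, seq, order)
            - log_likelihood(background, seq, order))
```

A candidate region would be called coding-like when coding_score is positive. Glimmer's ICM goes further by interpolating between contexts of different lengths, which this sketch does not attempt.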
Prodigal uses dynamic programming and performs well with genomes having high GC content.
For command line usage on a Linux system, use the following:
Usage: prodigal [-a trans_file] [-c] [-d nuc_file] [-f output_type] [-g tr_table] [-h] [-i input_file] [-m] [-n] [-o output_file] [-p mode] [-q] [-s start_file] [-t training_file] [-v]
The program can be downloaded from
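The following Python sketch shows, in a simplified form, how dynamic programming can pick a set of genes from scored candidates. It is an assumption-laden analogue, not Prodigal's actual algorithm: it treats candidate ORFs as (start, end, score) tuples with inclusive integer coordinates, forbids any overlap, and maximizes the total score. The function name best_gene_set and the scores are hypothetical.

```python
from bisect import bisect_right

def best_gene_set(candidates):
    """Choose non-overlapping (start, end, score) candidates with maximal total score."""
    cands = sorted(candidates, key=lambda c: c[1])   # sort by end coordinate
    ends = [c[1] for c in cands]
    best = [0.0] * (len(cands) + 1)   # best[i]: best total using the first i candidates
    take = [False] * len(cands)       # whether candidate i is part of best[i + 1]
    prev = [0] * len(cands)           # how many candidates end before candidate i starts
    for i, (start, _end, score) in enumerate(cands):
        p = bisect_right(ends, start - 1, 0, i)
        prev[i] = p
        if best[p] + score > best[i]:
            best[i + 1] = best[p] + score
            take[i] = True
        else:
            best[i + 1] = best[i]
    # Trace back through the table to recover the chosen candidates.
    chosen = []
    i = len(cands)
    while i > 0:
        if take[i - 1]:
            chosen.append(cands[i - 1])
            i = prev[i - 1]
        else:
            i -= 1
    chosen.reverse()
    return best[-1], chosen
```

A real gene finder also has to handle both strands, permitted overlaps, and start-site choice, all of which are left out here.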
GeneMark.hmm uses a hidden Markov model to predict genes in ORFs. GeneMarkS runs a self-training step and then runs GeneMark.hmm for the final gene prediction.
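As a rough illustration of how a hidden Markov model can label a sequence, the Python sketch below runs the Viterbi algorithm on a toy two-state model (coding and noncoding). All probabilities are hypothetical values chosen for the example, and the model is far simpler than the one GeneMark.hmm uses; it only shows the decoding principle.

```python
import math

STATES = ("coding", "noncoding")

# Hypothetical parameters, chosen only for illustration.
START = {"coding": 0.5, "noncoding": 0.5}
TRANS = {
    "coding": {"coding": 0.9, "noncoding": 0.1},
    "noncoding": {"coding": 0.1, "noncoding": 0.9},
}
EMIT = {
    "coding": {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15},
    "noncoding": {"A": 0.35, "C": 0.15, "G": 0.15, "T": 0.35},
}

def viterbi(seq):
    """Return the most probable state path for seq (A, C, G, T only)."""
    seq = seq.upper()
    if not seq:
        return []
    # Log-scores of the best path ending in each state at the first base.
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    back = []
    for base in seq[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Best previous state from which to enter state s.
            p = max(STATES, key=lambda q: score[q] + math.log(TRANS[q][s]))
            pointers[s] = p
            new_score[s] = score[p] + math.log(TRANS[p][s]) + math.log(EMIT[s][base])
        score = new_score
        back.append(pointers)
    # Follow the back-pointers from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

With these toy parameters, an AT-rich stretch followed by a GC-rich stretch is labeled noncoding and then coding, with a single switch at the boundary.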
Prokaryotic GeneMark.hmm (found in the GeneMarkS package)
Usage: gmhmmp [parameters ...] [sequence filename]
The -m parameter is mandatory.
The input sequence file must be in FASTA format and may contain multiple sequences (multi-FASTA).
Usage: [options] <sequence file name>
Input sequence file in FASTA format
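Since these tools expect FASTA or multi-FASTA input, it can help to check a file before running them. The Python sketch below is a minimal reader, assuming plain-text input with header lines that start with ">"; the function name read_fasta is hypothetical and not part of any of the packages above.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)           # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)
```

For a file on disk, the function could be called as list(read_fasta(open(path))), where path is the name of the input file; more than one returned record means the file is multi-FASTA.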
The program can be downloaded from
Possible programs to be used for prediction of RNA genes:
RNAmmer
Usage: perl rnammer -S bac -m lsu,ssu,tsu -gff - < [inputfile]
The input file should be in FASTA format.
The program can be downloaded from
tRNAscan-SE
Identifies transfer RNA (tRNA) genes by integrating and post-processing the outputs of three independent tRNA prediction programs:
1) tRNAscan and the Pavesi algorithm are run to find candidate tRNAs.
2) Candidate tRNAs go through the covariance model search program for confirmation.
3) The bounds of the predicted tRNAs are trimmed, and the tRNAs are run through the covariance model global structure alignment program to get a secondary structure prediction.
Usage: tRNAscan-SE [-options] <FASTA file(s)>
The program can be downloaded from
sRNAscanner
Identifies intergenic small RNA (sRNA) transcriptional units.
Transcriptional signal data are used as positive training data.
Uses a position weight matrix (PWM) to predict sRNAs.
Usage: ./sRNAscanner.exec
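To show what scoring with a position weight matrix involves, here is a minimal Python sketch. It is not sRNAscanner's implementation: the training sites, the uniform background of 0.25, the pseudocount, and the function names (build_pwm, score, scan) are all assumptions made for illustration.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudo=1.0, background=0.25):
    """Build a log-odds PWM from aligned, equal-length training sites."""
    total = len(sites) + pseudo * len(BASES)
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos].upper() for site in sites]
        pwm.append({
            b: math.log2(((column.count(b) + pseudo) / total) / background)
            for b in BASES
        })
    return pwm

def score(pwm, window):
    """Sum the per-position log-odds for one window of PWM width."""
    return sum(col[base] for col, base in zip(pwm, window.upper()))

def scan(pwm, seq, threshold):
    """Return (position, score) for every window scoring at least threshold."""
    width = len(pwm)
    hits = []
    for i in range(len(seq) - width + 1):
        s = score(pwm, seq[i:i + width])
        if s >= threshold:
            hits.append((i, s))
    return hits
```

For example, a PWM built from the hypothetical sites TATAAT, TATAAT, TATGAT and TACAAT, scanned over GGGCGTATAATGCGC with a threshold of 3.0, reports a single hit at 0-based position 5.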