Review of probe designing work
Parent sequence databases
Version 7 of the gene and peptide sequences for Arabidopsis thaliana was
downloaded from TAIR. The gene and peptide sequences were matched to obtain the CDS
regions for A. thaliana (??? Is this correct ???). The Arabidopsis lyrata trace files
available online were assembled into contigs using XXX. These contigs were scanned with
GenScan to predict the CDS regions. We also used the predicted small ORF (sORF)
sequences in this procedure.
Designing probes
The CDS sequences were chopped into smaller sequences, and 60-mer probes were
designed for all the chopped A. thaliana and A. lyrata CDS and sORF sequences using the
probe-designing software PICKY v2.1. Up to 5 probes were designed per chopped CDS.
These probes were designed for the forward strand using a 500 mM salt concentration.
They were constrained to have a minimum temperature separation of 10 °C and a maximum
match length of 20 bases between redundant probes. The probes were then
reverse-complemented to obtain the reverse-strand probes.
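
The reverse-complement step can be illustrated with a minimal Python sketch. It assumes plain A/C/G/T/N sequences; the function name and the toy sequence are hypothetical and are not part of the original pipeline.

```python
# Complement table for A/C/G/T/N in upper and lower case (assumed alphabet).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    # Complement every base, then reverse the string.
    return seq.translate(_COMPLEMENT)[::-1]


# Toy forward-strand sequence; a real probe would be a 60-mer.
forward_probe = "ATGGCGTTC"
reverse_probe = reverse_complement(forward_probe)
print(reverse_probe)  # GAACGCCAT
```

Applying the function twice returns the original sequence, which is a quick sanity check on any probe set processed this way.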
Finding matches between A. thaliana and A. lyrata CDS and sORFs
We performed a reciprocal BLAST between the A. thaliana and A. lyrata database
sequences to find the best-matching pairs between the two. We also used the orthologous
group assignments to find highly related clusters of genes between the two species. The
data from the reciprocal BLAST and the orthologous group assignments were merged to
generate a list of single-single, single-multiple, multiple-single and multiple-multiple
parent sequence matches between the two species.
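
One possible way to merge the two sources of evidence is sketched below in Python. It is an illustration under stated assumptions, not the procedure actually used: best hits are assumed to be available as simple dictionaries, ortholog assignments as (A. thaliana, A. lyrata) ID pairs, and a match class is assumed to be the size of each linked cluster on either side (A. thaliana first, A. lyrata second). All function names and IDs are hypothetical.

```python
from collections import defaultdict


def reciprocal_best_hits(best_a_to_b, best_b_to_a):
    """Pairs (a, b) where a's best hit is b and b's best hit is a."""
    return {(a, b) for a, b in best_a_to_b.items() if best_b_to_a.get(b) == a}


def classify_matches(pairs):
    """Cluster linked IDs and label each cluster by its size on both sides."""
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # Tag IDs by species ("T" or "L") so identical IDs cannot collide.
    for t_id, l_id in pairs:
        root_t, root_l = find(("T", t_id)), find(("L", l_id))
        if root_t != root_l:
            parent[root_t] = root_l

    clusters = defaultdict(lambda: (set(), set()))
    for node in list(parent):
        species, ident = node
        t_ids, l_ids = clusters[find(node)]
        (t_ids if species == "T" else l_ids).add(ident)

    def side(ids):
        return "single" if len(ids) == 1 else "multiple"

    return sorted(
        (f"{side(t)}-{side(l)}", sorted(t), sorted(l)) for t, l in clusters.values()
    )


# Toy inputs (hypothetical IDs).
best_t_to_l = {"T1": "L1", "T2": "L2", "T3": "L2"}
best_l_to_t = {"L1": "T1", "L2": "T2"}
ortholog_pairs = {("T3", "L2")}  # assumed to come from the ortholog groups

merged = reciprocal_best_hits(best_t_to_l, best_l_to_t) | ortholog_pairs
for label, t_ids, l_ids in classify_matches(merged):
    print(label, t_ids, l_ids)
```

In this toy case T1 and L1 form a single-single match, while T2 and T3 both link to L2 and form a multiple-single match.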
Accounting for probe effects
Each probe was aligned with BLAST against its parent sequence to obtain the left and
right boundaries of the probe on the parent sequence. If the probe sequence was
identical across every member of the orthologous group, the probes were defined as
Type1. If the probe sequence differed between members of the same orthologous group,
new probes were defined such that every original PICKY probe was mapped onto a similar
region on its ortholog group member. These probes were either Type2 or Type3,
depending on whether there was a small overlap between original probes, or no overlap at
all. These probes were then reverse-complemented.
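
The boundary lookup and the Type1 test can be sketched as follows. This is a simplified stand-in: an exact string search replaces the BLAST alignment, Type1 is read as "the identical probe sequence occurs in every group member", and the Type2/Type3 distinction is not modelled. Function names and sequences are hypothetical.

```python
def probe_bounds(probe, parent_seq):
    """Return 1-based inclusive (left, right) of the first exact match, or None."""
    start = parent_seq.find(probe)
    if start == -1:
        return None
    return start + 1, start + len(probe)


def is_type1(probe, group_seqs):
    """True when the identical probe sequence occurs in every group member."""
    return all(probe in seq for seq in group_seqs)


# Toy sequences; real probes would be 60-mers on CDS or sORF parents.
probe = "GATTACA"
thaliana_parent = "CCGATTACAGG"
lyrata_same = "TTGATTACATT"
lyrata_diff = "TTGATTTCATT"

print(probe_bounds(probe, thaliana_parent))           # (3, 9)
print(is_type1(probe, [thaliana_parent, lyrata_same]))  # True
print(is_type1(probe, [thaliana_parent, lyrata_diff]))  # False
```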
Finding probe redundancies
All the probes (original, newly designed and reverse-complemented) were searched with
blastn, using a word size of 30, against their parent sequences. Probes that matched
perfectly (100% identity over the full 60-mer length) with more than one sequence were
eliminated.
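
The elimination rule can be sketched as a filter over BLAST results. The sketch assumes the hits are available in tab-separated form with the standard BLAST+ tabular column order (query ID, subject ID, percent identity, alignment length, ...); the original text does not state which output format was used, and the function name and IDs below are hypothetical.

```python
import csv
import io
from collections import defaultdict

PROBE_LENGTH = 60  # probe length stated in the text


def redundant_probes(tabular_handle, probe_length=PROBE_LENGTH):
    """Return probe IDs with perfect full-length hits on more than one subject."""
    perfect_subjects = defaultdict(set)
    for row in csv.reader(tabular_handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        query, subject = row[0], row[1]
        pident, length = float(row[2]), int(row[3])
        if pident == 100.0 and length == probe_length:
            perfect_subjects[query].add(subject)
    return {q for q, subjects in perfect_subjects.items() if len(subjects) > 1}


# Toy hit table (hypothetical IDs and scores).
rows = [
    ("LD0001_P1F", "lyrata_cds_1", "100.000", "60", "0", "0", "1", "60", "101", "160", "1e-25", "111"),
    ("LD0001_P1F", "lyrata_cds_2", "100.000", "60", "0", "0", "1", "60", "41", "100", "1e-25", "111"),
    ("TD0002_P1F", "thaliana_cds_7", "100.000", "60", "0", "0", "1", "60", "11", "70", "1e-25", "111"),
    ("TD0002_P1F", "thaliana_cds_9", "96.667", "60", "2", "0", "1", "60", "5", "64", "1e-20", "100"),
]
example = io.StringIO("\n".join("\t".join(r) for r in rows) + "\n")
print(redundant_probes(example))  # {'LD0001_P1F'}
```

Only LD0001_P1F is flagged here, because its two hits are both perfect and full length; TD0002_P1F has a single perfect hit and is kept.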
Probe names
All the probes were named using the following convention.
E.g.: LD4356_P1F
where
L or T: A. lyrata or A. thaliana, respectively
D or S: CDS or sORF sequence
XXXX: Unique number
P or M: Generated by PICKY or Match
1, 2, 3: Probe type
F or R: Forward or Reverse strand
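
The naming convention can be checked mechanically with a small parser. The sketch below is illustrative only: the regular expression accepts any number of digits for the unique number (an assumption, since the example shows four), and the class and function names are hypothetical.

```python
import re
from typing import NamedTuple

# Species, source, number, underscore, origin, probe type, strand.
PROBE_NAME = re.compile(r"^([LT])([DS])(\d+)_([PM])([123])([FR])$")

SPECIES = {"L": "A. lyrata", "T": "A. thaliana"}
SOURCE = {"D": "CDS", "S": "sORF"}
ORIGIN = {"P": "Picky", "M": "Match"}
STRAND = {"F": "forward", "R": "reverse"}


class ProbeName(NamedTuple):
    species: str
    source: str
    number: int
    origin: str
    probe_type: int
    strand: str


def parse_probe_name(name: str) -> ProbeName:
    """Split a probe name such as LD4356_P1F into its fields."""
    match = PROBE_NAME.match(name)
    if match is None:
        raise ValueError(f"not a valid probe name: {name!r}")
    species, source, number, origin, probe_type, strand = match.groups()
    return ProbeName(
        SPECIES[species], SOURCE[source], int(number),
        ORIGIN[origin], int(probe_type), STRAND[strand],
    )


print(parse_probe_name("LD4356_P1F"))
```

For the example name, the parser reports an A. lyrata CDS probe, number 4356, generated by Picky, Type 1, on the forward strand.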