Review of probe design work

Parent sequence databases

Version 7 of the gene and peptide sequences for Arabidopsis thaliana was downloaded from TAIR. The gene and peptide sequences were matched to obtain the CDS regions for A. thaliana. (??? Is this correct ???) The Arabidopsis lyrata trace files available online were assembled into contigs using XXX. These contigs were scanned with GenScan to predict the CDS regions. We also used the predicted small ORF (sORF) sequences in this procedure.

Designing probes

The CDS sequences were chopped into smaller sequences, and 60-mer probes were designed for all the chopped A. thaliana and A. lyrata CDS and sORFs using the probe design software PICKY v2.1. Up to 5 probes were designed per chopped CDS. These probes were designed for the forward strand using a salt concentration of 500 mM. They were constrained to have a minimum temperature separation of 10 °C and a maximum match length of 20 bases between redundant probes. The probes were then reverse-complemented to obtain the reverse-strand probes.

Finding matches between A. thaliana and A. lyrata CDS and sORFs

We performed a reciprocal BLAST between the A. thaliana and A. lyrata database sequences to find the best-matching pairs between the two. We also used the orthologous group assignments to find highly related clusters of genes between the two species. The data from the reciprocal BLAST and the orthologous group assignment were merged to generate a list of single-single, single-multiple, multiple-single and multiple-multiple parent sequence matches between the two species.

Accounting for probe effects

Each probe was aligned with BLAST against its parent sequence to obtain the left and right boundaries of the probe on the parent sequence. If the probe sequence was the same in every member of the orthologous group, the probe was defined as Type1.
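The boundary and Type1 checks described above can be illustrated with a small Python sketch. It is only an approximation of the procedure: exact substring search stands in for BLAST, "the same in every member" is read as "the probe occurs exactly in every member sequence", and the function names (probe_bounds, is_type1) and toy sequences are hypothetical.

```python
def probe_bounds(probe, parent):
    """Return the 1-based (left, right) boundaries of the first exact
    occurrence of the probe on its parent sequence, or None if absent.
    Exact matching is a stand-in for the BLAST step (assumption)."""
    start = parent.find(probe)
    if start == -1:
        return None
    return start + 1, start + len(probe)


def is_type1(probe, group_members):
    """Return True if the probe occurs exactly in every member of the
    orthologous group (our reading of the Type1 definition).
    group_members maps a sequence ID to its sequence."""
    return all(probe in seq for seq in group_members.values())


# Toy example with a 6-mer instead of a 60-mer, for readability.
members = {
    "thaliana_cds": "AAACCCGGGTTT",
    "lyrata_cds": "TTCCCGGGAA",
}
bounds = probe_bounds("CCCGGG", members["thaliana_cds"])
shared = is_type1("CCCGGG", members)
```

In this toy case the probe spans positions 4 to 9 of the first sequence and is present in both members, so it would count as Type1 under the stated assumption.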
If the probe sequence differed between members of the same orthologous group, new probes were defined such that every original PICKY probe was mapped onto the corresponding region of each of the other members of its orthologous group. These probes were either Type2 or Type3, depending on whether there was a small overlap between the original probes or no overlap at all. These probes were then reverse-complemented.

Finding probe redundancies

All the probes (original, newly designed and reverse-complemented) were aligned against their parent sequences using blastn with a word size of 30. Probes that matched perfectly (100% identity over the full 60-mer length) with more than one sequence were eliminated.

Probe names

All the probes were named using the following convention, e.g. LD4356_P1F, where:

L or T: A. lyrata or A. thaliana, respectively
D or S: CDS or sORF sequence
XXXX: unique number
P or M: generated by Picky or Match
1, 2, 3: probe type
F or R: forward or reverse strand
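As a convenience, the naming convention above can be decoded programmatically. The following Python sketch is one possible parser, not part of the original pipeline: the regular expression, the ProbeName tuple and the parse_probe_name function are hypothetical, and we assume the unique number is a run of digits of any length (the example shows four).

```python
import re
from typing import NamedTuple

# Field order follows the convention: species, source, number, "_",
# origin, probe type, strand. The digit count is an assumption.
PROBE_NAME = re.compile(r"^([LT])([DS])(\d+)_([PM])([123])([FR])$")

SPECIES = {"L": "A. lyrata", "T": "A. thaliana"}
SOURCE = {"D": "CDS", "S": "sORF"}
ORIGIN = {"P": "Picky", "M": "Match"}
STRAND = {"F": "forward", "R": "reverse"}


class ProbeName(NamedTuple):
    species: str
    source: str
    number: str      # kept as text so leading zeros are preserved
    origin: str
    probe_type: int
    strand: str


def parse_probe_name(name):
    """Split a probe name such as 'LD4356_P1F' into its fields."""
    match = PROBE_NAME.match(name)
    if match is None:
        raise ValueError(f"not a valid probe name: {name!r}")
    sp, src, num, orig, ptype, strand = match.groups()
    return ProbeName(SPECIES[sp], SOURCE[src], num,
                     ORIGIN[orig], int(ptype), STRAND[strand])


example = parse_probe_name("LD4356_P1F")
```

Here example describes an A. lyrata CDS probe, number 4356, generated by Picky, Type1, forward strand. Names that do not fit the pattern raise ValueError, so malformed entries are caught early.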