Generation and Analysis of
Penaeus monodon Expressed Sequence Tags
Anchalee Tassanakajon
Shrimp Molecular Biology and Genomics Laboratory,
Department of Biochemistry, Faculty of Science,
Chulalongkorn University
Research Team
Chulalongkorn University
Shrimp Molecular Biology and Genomics Laboratory
Dr. Anchalee Tassanakajon
Dr. Siriporn Pongsomboon
Dr. Premruethai Supungul
Dr. Piti Amparyup
Ms. Sureerat Tang
CE for Marine Biotechnology
Dr. Sirawut Klinbunga
Dr. Narongsak Paunglarp
Advanced Virtual and Intelligent Computing Research Center
Dr. Chidchanok Lursinsap
Mr. Kasemsant Kuphanumat
Mahidol University
Dr. Apinunt Udomkit (IMBG)
Dr. Sarawut Jitrapakdee (Faculty of Sci.)
Dr. Kallaya Dangtip (Centex Shrimp)
Prince of Songkla University
Dr. Amornrat Pongdara
Shrimp ESTs from GenBank
dbEST release 012508
Summary by Organism - January 25, 2008
Number of public entries: 49,284,356
Objectives
• To generate a collection of ESTs from Penaeus monodon
• To establish a database for mining genes and repetitive sequences and for SNP detection
cDNA Libraries
Non-normalized cDNA libraries
• Hemocyte
• Hematopoietic tissue
• Lymphoid organ
• Intestine
• Gill and epipodite
• Hepatopancreas
• Antennal gland
• Eyestalk
• Brain and thoracic ganglia
• Heart
• Ovary
• Testis
Normalized cDNA libraries
• Hemocyte (11,008 ESTs)
• Hepatopancreas (4,122 ESTs)
• Ovary
• Antennal gland
• Gill-epipodite
Subtractive cDNA libraries
• Heat-induced gill subtraction
• Ovary subtraction (different stages of female broodstock)
• Testes subtraction (broodstock / juvenile)
Experimental animals
• Normal shrimp
• Pathogen-infected shrimp
• Heat-induced shrimp
Summary of Penaeus monodon EST analysis
Total no. of ESTs: 40,001
Total no. of mitochondrial (Mt) sequences: 6,858 (17%)
Total ESTs analyzed: 33,143
No. of ESTs in contigs: 25,834
No. of contigs: 3,227
No. of singletons: 7,309
No. of unique transcripts: 10,536
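The summary figures are internally consistent, which a few lines of arithmetic make explicit. The sketch below only recombines the numbers reported on the slide. The variable names are illustrative, and the mean cluster size is a derived value that the slide does not state.

```python
# Figures as reported in the EST analysis summary.
total_ests = 40_001
mito_ests = 6_858
ests_in_contigs = 25_834
contigs = 3_227
singletons = 7_309

# ESTs left for assembly after removing mitochondrial sequences.
analyzed = total_ests - mito_ests

# Every analyzed EST is either assembled into a contig or left as a singleton.
assembled_plus_single = ests_in_contigs + singletons

# Unique transcripts are counted as contigs plus singletons.
unique_transcripts = contigs + singletons

# Derived values (not stated on the slide).
mito_pct = 100 * mito_ests / total_ests
mean_ests_per_contig = ests_in_contigs / contigs

print(f"analyzed ESTs:        {analyzed:,}")
print(f"contig + singleton:   {assembled_plus_single:,}")
print(f"unique transcripts:   {unique_transcripts:,}")
print(f"mitochondrial share:  {mito_pct:.1f}%")
print(f"mean ESTs per contig: {mean_ests_per_contig:.1f}")
```

The first three printed values reproduce the slide's 33,143 analyzed ESTs and 10,536 unique transcripts.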
The distribution of cluster size
No. of ESTs per contig: 2 to 454
Average contig length: 945 bp
Longest assembled sequence: 6,309 bp
Shortest assembled sequence: 109 bp
[Figure: distribution of cluster size; x-axis: cluster size, last bin labelled ">50-495"]
Functional Annotation
BLASTX and BLASTN (match threshold: E-value < 10^-4)
Matched: 5,648 (53.6%)
Unmatched: 4,888 (46.4%)
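The matched/unmatched split follows from applying the E-value cutoff to each transcript's BLAST hits. The slides do not say how the hits were stored, so the sketch below assumes the standard 12-column tab-separated BLAST+ output (`-outfmt 6`), where the query ID is column 1 and the E-value is column 11. The function name and the sample rows are hypothetical.

```python
def matched_queries(blast_lines, max_evalue=1e-4):
    """Return the set of query IDs with at least one hit below max_evalue.

    Assumes tab-separated BLAST tabular rows (12 standard columns):
    column 1 is the query ID and column 11 is the E-value.
    """
    matched = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query_id = fields[0]
        evalue = float(fields[10])
        if evalue < max_evalue:
            matched.add(query_id)
    return matched


# Illustrative rows only; these are not data from the study.
sample = [
    "est_A\tsubj_1\t98.0\t200\t4\t0\t1\t200\t1\t200\t1e-50\t350",
    "est_B\tsubj_2\t35.0\t60\t39\t0\t1\t60\t5\t64\t0.5\t30",
    "est_C\tsubj_3\t80.0\t120\t24\t0\t1\t120\t1\t120\t3e-17\t90",
]
hits = matched_queries(sample)
print(sorted(hits))

# Percentages as on the slide: matched and unmatched out of 10,536 transcripts.
print(f"{100 * 5648 / 10536:.1f}% matched, {100 * 4888 / 10536:.1f}% unmatched")
```

A transcript with no row at all in the BLAST output never appears in the returned set, so it counts as unmatched when the set is compared against the full list of transcript IDs.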
Matched species [pie charts]
Major groups: Arthropods 41% (Insects 32%, Crustacea 8%, Other Arthropods 1%); Chordates 22% (Mammals 11%, Actinopterygii 6%, Other Chordates 5%); Protists 14%; All Others 8%; Platyhelminthes 7%; Echinoderms 4%; Bacteria 4%
Arthropod matches by genus: Tribolium 23%, Apis 21%, Aedes 11%, Drosophila 10%, Other 10%, Anopheles 8%, Penaeus 6%, Litopenaeus 4%, Bombyx 2%, Homarus 2%, Marsupenaeus 2%, Fenneropenaeus 1%
Chordate matches by genus: Danio 19%, Rattus 15%, Homo 12%, Xenopus 11%, Gallus 9%, Mus 8%, Tetraodon 8%, Others 6%, Bos 4%, Canis 3%, Macaca 3%, Pan 2%
Gene Ontology Annotations
Total no. of GO hits: 5,002 (47.5%)
GO hits within “Molecular Function”: 3,859
GO hits within “Biological Process”: 3,427
GO hits within “Cellular Component”: 3,797
Highly represented EST transcripts from P. monodon libraries
Contig number | No. of sequences | Putative gene [closest species] | Accession no. | E-value
CT95 | 454 | hypothetical protein [Rattus norvegicus] | XP_001054782 | 3.00E-17
CT115 | 443 | unknown | - | -
CT19 | 393 | elongation factor 1-alpha [Pocillopora damicornis] | BAE66714 | 0
CT255 | 393 | thrombospondin [Penaeus monodon] | AAN17670 | 0
CT148 | 374 | hypothetical protein [Eimeria tenella str. Houghton] | XP_001238639 | 1.00E-09
CT263 | 275 | conserved hypothetical protein [Aedes aegypti] | EAT47957 | 2.00E-27
CT111 | 260 | penaeidin [Penaeus monodon] | AAQ05769 | 6.00E-39
CT151 | 254 | beta-actin [Litopenaeus vannamei] | AAG16253 | 0
CT283 | 214 | ovarian peritrophin 2 precursor [Penaeus monodon] | AAM44050 | 1.00E-169
CT242 | 161 | similar to secreted nidogen domain protein [Strongylocentrotus purpuratus] | XP_788074 | 2.00E-30
CT82 | 159 | crustin-like peptide type 2 [Marsupenaeus japonicus] | BAD15063 | 7.00E-74
CT42 | 148 | ribosomal protein S26 [Branchiostoma belcheri] | ABK32080 | 6.00E-44
CT48 | 147 | putative senescence-associated protein [Pisum sativum] | BAB33421 | 5.00E-47
CT170 | 132 | hemocyte kazal-type proteinase inhibitor [Penaeus monodon] | AAP92779 | 1.00E-167
CT100 | 131 | profilin [Branchiostoma belcheri] | Q8T938 | 2.00E-23
CT251 | 129 | ovarian peritrophin 2 precursor [Penaeus monodon] | AAM44050 | 1.00E-128
CT169 | 128 | mFLJ00348 protein [Mus musculus] | BAD90390 | 6.00E-16
CT156 | 125 | ovarian peritrophin 1 precursor [Penaeus monodon] | AAM44049 | 1.00E-170
CT219 | 124 | hemocyanin [Litopenaeus vannamei] | CAA57880 | 0
Differentially expressed immune-related genes
Mining for microsatellites
Total clones searched: 10,100
No. of ESTs containing microsatellites: 1,381 (13.7%)
No. of unique ESTs containing microsatellites: 997
Total no. of microsatellite loci: 2,165
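The slides report microsatellite counts but not the search tool or its parameters. As an illustration of the general technique only, the sketch below scans a sequence for perfect di-, tri-, and tetranucleotide repeats with a regular expression. The minimum repeat counts, the function name, and the test sequence are assumptions and are not taken from the study.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def find_microsatellites(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for each perfect tandem repeat found.

    Only perfect repeats of the motif lengths listed in min_repeats are
    reported; mononucleotide runs and imperfect repeats are ignored.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        # One motif followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ATAT"), which belong to a shorter motif class.
            if any(
                motif == motif[:d] * (unit_len // d)
                for d in range(1, unit_len)
                if unit_len % d == 0
            ):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return hits


# Made-up sequence: (AT)7 at position 3 and (GAA)5 at position 20.
example = "GGC" + "AT" * 7 + "CCG" + "GAA" * 5 + "T"
print(find_microsatellites(example))

# Share of searched clones carrying a microsatellite, as on the slide.
print(f"{100 * 1381 / 10100:.1f}%")
```

Because one EST can carry several repeats, the number of loci (2,165) can exceed the number of ESTs containing microsatellites (1,381), which is why the function returns a list of hits per sequence.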
Distribution of microsatellite repeat types
85 new polymorphic microsatellite markers were developed.
No. of alleles per locus: 3–30 (an average of 12.6 alleles per locus)
SNP Prediction
Total clones subjected to prediction: 8,091
No. of clones in contigs: 3,846
No. of contigs: 356
Potential SNP sites: 595
Estimated SNP frequency: 1 per 644 bp
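An SNP frequency of one site per 644 bp is a ratio of sequence length to SNP count. The slides do not give the length used as the denominator, so the sketch below back-calculates the length implied by the two reported figures. This is an inference for illustration, not a reported value, and the variable names are hypothetical.

```python
# Figures as reported on the SNP prediction slide.
clones_total = 8_091
clones_in_contigs = 3_846
contigs = 356
potential_snp_sites = 595
bp_per_snp = 644

# Implied total length screened, assuming frequency = length / SNP sites.
implied_length_bp = potential_snp_sites * bp_per_snp

# Derived values (not stated on the slide).
pct_clones_in_contigs = 100 * clones_in_contigs / clones_total
mean_clones_per_contig = clones_in_contigs / contigs
snps_per_kb = 1000 / bp_per_snp

print(f"implied screened length: {implied_length_bp:,} bp")
print(f"clones in contigs:       {pct_clones_in_contigs:.1f}%")
print(f"mean clones per contig:  {mean_clones_per_contig:.1f}")
print(f"SNP sites per kb:        {snps_per_kb:.2f}")
```

Only clones that assemble into contigs can reveal a candidate SNP, since a site needs at least two overlapping sequences to show a difference.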
SNP Prediction in various putative genes
Contig name | Putative gene | Length (bp) | No. of sequences in contig | No. of SNP sites
Contig 1274 | thrombospondin | 3428 | 139 | 35
Contig 1256 | hemocyanin | 2243 | 32 | 33
Contig 1267 | ovarian peritrophin 2 precursor | 2658 | 62 | 26
Contig 1270 | unknown | 1881 | 85 | 19
Contig 1251 | hemocyte kazal-type proteinase inhibitor | 1554 | 26 | 17
Contig 1255 | unknown | 1690 | 29 | 16
Contig 1271 | anti-lipopolysaccharide factor | 723 | 86 | 14
Contig 1272 | Oryza sativa (japonica cultivar-group) cDNA clone:J023007E09 | 2918 | 96 | 12
Contig 1260 | ovarian peritrophin 2 precursor | 900 | 36 | 12
Contig 1258 | penaeidin | 662 | 34 | 12
Contig 1261 | antimicrobial peptide | 1795 | 39 | 12
Contig 1223 | elongation factor-1 alpha | 2500 | 15 | 11
Contig 1254 | thymosin isoform 1 | 1337 | 29 | 0
Contig 1249 | ribosomal protein L10 | 696 | 23 | 0
Contig 1227 | 40S ribosomal protein | 552 | 15 | 0
Contig 1207 | ATP/ADP translocase | 1302 | 13 | 0
Contig 1198 | eukaryotic initiation factor 4A | 1548 | 12 | 0
Contig 1204 | Rps16 protein | 523 | 12 | 0
Contig 1189 | oncoprotein nm23 | 721 | 11 | 0
Contig 1193 | trypsin | 762 | 11 | 0
Contig 1176 | profilin | 1082 | 10 | 0
Contig 1177 | actin depolymerizing factor | 1353 | 10 | 0
Contig 1180 | vacuolar ATP synthase subunit E | 1985 | 10 | 0
Contig 1124 | fructose 1,6-bisphosphate aldolase | 2315 | 8 | 0
Contig 1117 | ficolin | 1912 | 7 | 0
Contig 1112 | cathepsin A | 2166 | 7 | 0
Contig 1118 | polehole | 2637 | 7 | 0
Contig 1099 | chaperonin | 1875 | 6 | 0
Contig 1037 | calcium-binding protein Calnexin | 2244 | 5 | 0
Microarray Fabrication
Duplicated spots
9,991 genes on the array
7,256 gene spots (72.6%) showed acceptable signal intensity for data analysis.
Outcome and Future Prospects
• 40,001 high-quality EST sequences representing 10,536 unique genes
• P. monodon EST database and a user-friendly web site (http://pmonodon.biotec.or.th)
• A large number of potential genetic markers from microsatellites and potential SNP sites
• A cDNA microarray containing 9,991 unigenes
• 11 international publications
Dr. Prasit Palittapongarnpim
Prof. Boonsirm Withyachamnarnkul
This research received financial support from BIOTEC.