Genomics: leading a transition
Crafty Genomic Shoehorn May Yield Warp-Speed Whole-Genome Sequencing
SANTA CLARA, Calif., Feb. 27 - The technology to sequence a long strand of DNA by sending it
through a 10⁻⁹-meter hole at a speed of 1 billion bases per second may only be two years away,
according to Daniel Branton.
Branton, a professor of molecular and cellular biology at Harvard, together with colleagues in
Cambridge, Mass., and at the University of California, Santa Cruz, has already designed solid-state
nanopores big enough for a DNA molecule to be threaded through but small enough to
let the bases pass only one at a time.
"DNA is a long, floppy, molecule," Branton said during the 2002 Genome Tri-Conference
meeting here. "But as it goes through the nanopore it has to proceed in single file order. We
asked ourselves what kind of probe we could apply to detect the differences in bases."
The answer, it initially turned out, was to use a beam of ions sent across the pore to measure
differences in bases. While this worked to distinguish between groups of bases, the number of
ions that could be sent across the pore was limited by the size of the nanopore, said Branton.
This limited the sensitivity needed to measure distinctions down to the individual base.
By operating 100 electron-tunneling nanopore sequencers in parallel, Branton estimates it
would take at most three hours to sequence an entire human genome, though he pointed out
that more realistic applications would be to focus on sequencing specific polymorphic regions of
human chromosomes and sequencing smaller genomes of other organisms…
Copyright © 2002 GenomeWeb LLC. All Rights Reserved.
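The three-hour estimate quoted in the article can be sanity-checked with simple arithmetic. The sketch below assumes a genome of about 3.2 billion bases and a single pass with no overlap between pores. Both are assumptions made for illustration and are not stated in the article.

```python
# Back-of-the-envelope check of the "100 pores, three hours" estimate.
GENOME_BASES = 3.2e9   # assumed human genome size (not given in the article)
PORES = 100            # parallel sequencers, from the article
HOURS = 3              # upper bound on run time, from the article


def required_rate_per_pore(genome_bases, pores, hours):
    """Bases per second each pore must read (single pass, no overlap)."""
    return genome_bases / (pores * hours * 3600)


rate = required_rate_per_pore(GENOME_BASES, PORES, HOURS)
print(f"{rate:,.0f} bases/s per pore")
```

Under these assumptions each pore needs to read roughly 3,000 bases per second, several orders of magnitude below the 1 billion bases per second quoted in the opening paragraph.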
The HGP: milestones
Milestone              Source, year      Result
Clone map              WICGR, 1995       16,124 clones @ 2 Mb res.
Genetic map            Généthon, 1996    5,264 markers @ 1.9 Mb
Mapped transcripts     GeneMap, 1999     44,094 @ 3.1 Mb
ESTs                   dbEST, 2002       3.97 million
Putative transcripts   UniGene, 2002     33,671 clusters
SNPs                   dbSNP, 2002       2.64 million
Draft sequence         GenBank, 2002     3,127 Mb, 98% of total
Finished sequence      GenBank, 2002     2,015 Mb, 63% of total
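The two sequence milestones each imply a total genome size, and the two figures can be checked against one another. This is a minimal sketch using only the numbers on the slide. The function name is hypothetical.

```python
def implied_total_mb(sequenced_mb, fraction_of_total):
    """Total genome size (Mb) implied by a sequenced amount and its stated share."""
    return sequenced_mb / fraction_of_total


draft_total = implied_total_mb(3127, 0.98)      # draft sequence row
finished_total = implied_total_mb(2015, 0.63)   # finished sequence row

print(f"draft implies    {draft_total:,.0f} Mb")
print(f"finished implies {finished_total:,.0f} Mb")
```

Both rows point to a total of about 3,200 Mb, so the percentages are consistent with each other to within rounding.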
The HGP: cloned human disease genes
A2M AAAS AASS ABAT ABCA1 ABCA4 ABCB1 ABCB11 ABCB4 ABCB7 ABCC2 ABCC6 ABCC8 ABCD1 ABCD3 ABCG5 ABL1 ACAA1 ACACA ACADM ACADS ACADVL ACAT1 ACE ACOX1
ACTA1 ACTC ACTN4 ACVR1B ACVRL1 ADA ADAMTS13 ADAMTS2 ADD1 ADH1B ADRB2 ADSL AFP AGA AGL AGT AGTR1 AGXT AICDA AIPL1 AIRE AK1 AKT2 ALAD ALAS2 ALB
ALDH2 ALDH3A2 ALDH4A1 ALDH5A1 ALDH6A1 ALDOA ALDOB ALPL ALS2 ALX4 AMACR AMELX AMH AMHR2 AMMECR1 AMPD1 AMPD3 AMT ANK1 ANKH AP3B1 APC APCS APOA1
APOA2 APOB APOC2 APOC3 APOE APOH APP APRT APTX AQP1 AQP2 AR ARG1 ARHGEF12 ARHGEF6 ARHI ARSA ARSB ASAH ASL ASPA ASS ATM ATP2A2 ATP2C1 ATP6B1
ATP6N1B ATP7B ATP8B1 ATRX AVP AVPR2 AXIN2 B2M B4GALT7 BAX BBS2 BBS4 BCHE BCKDHA BCKDHB BCL10 BCL2 BCL3 BCL6 BCR BCS1L BFSP2 BLMH BLNK BMPR2 BPGM
BRCA1 BRCA2 BSG BTD BTK BUB1 C10orf2 C1QB C1R C1S C2 C3 C4A C4B C5 C6 C7 C7orf2 C8A C8B C9 CA2 CACNA1A CACNA1F CACNA1S CACNB4 CALCA CALCR CAPN10
CAPN3 CARD15 CASR CAT CAV3 CBFB CBS CCM1 CCND1 CCR2 CCR5 CD36 CD3E CD3G CD3Z CD59 CDH1 CDH23 CDH3 CDK4 CDKN1C CDKN2A CETP CFTR CHAC CHAT CHEK2
CHM CHRNA1 CHRNA4 CHRNA7 CHRNB2 CHRNG CHS1 CHST6 CIAS1 CKN1 CLCN1 CLCN7 CLCNKB CLN2 CLN3 CLN6 CLN8 CNGA1 CNGA3 CNGB1 COCH COL10A1 COL11A1
COL11A2 COL17A1 COL1A1 COL1A2 COL2A1 COL3A1 COL4A3 COL4A4 COL4A5 COL4A6 COL5A1 COL5A2 COL6A1 COL6A2 COL6A3 COL7A1 COL9A1 COL9A2 COL9A3 COLQ
COMP COX10 CP CPO CPS1 CPT2 CR1 CRB1 CREBBP CRH CRX CRYAA CRYAB CRYBA1 CRYBB2 CRYGC CRYGD CSF1R CSF2RA CSF2RB CSF3R CSH1 CST3 CSTB CSX CTH CTNNB1
CTNND2 CTNS CTSC CTSK CX3CR1 CYB5 CYBA CYBB CYLD CYP11B1 CYP11B2 CYP17 CYP19 CYP1B1 CYP21A2 CYP27B1 CYP2A6 CYP2C9 CYP2D6 CYP7B1 DAD1 DBH DBT DCC
DCLRE1C DCX DDB1 DDB2 DDIT3 DDX26 DEK DES DFNA5 DHCR24 DHCR7 DHH DIA1 DIAPH2 DISC1 DKC1 DLD DLL3 DMBT1 DMD DMPK DMRT1 DNAH5 DNAI1 DNMT3B DPYD
DPYS DRD2 DRD3 DRD4 DRPLA DSG1 DSP DSPP DTR DYT1 EBAF EBP ECE1 ECGF1 ED1 EDAR EDN3 EDNRB EGR2 EIF2B2 ELA2 ELAC2 ELAVL4 ELN ELOVL4 EMD EMX2 ENG ENO1
ENPP1 EPB41 EPB42 EPM2A EPOR EPX ERCC2 ERCC3 ERCC4 ERCC5 ESR1 ETFA ETV6 EVC EWSR1 EXT1 EXT2 EYA1 F10 F11 F12 F13A1 F13B F2 F5 F7 F8 F9 FACL6 FAH FANCA
FANCC FANCD2 FANCF FANCG FBN1 FBP1 FCGR1A FCGR2A FCGR2B FCGR3A FCMD FECH FGA FGB FGF23 FGFR1 FGFR2 FGFR3 FGG FH FLNA FLT3 FLT4 FMO3 FMR1 FMR2
FOLR1 FOXC1 FOXE1 FOXE3 FOXO1A FOXP2 FRDA FSHR FST FTH1 FTL FUCA1 FUT6 FVT1 FXYD2 FY G6PC G6PD G6PT1 GAA GABRG2 GALC GALE GALK1 GALNS GALT GAMT
GATA1 GATA3 GATM GBA GCDH GCG GCGR GCH1 GCK GCLC GCNT2 GCSH GDF5 GDI1 GFAP GGCX GGT1 GGT2 GH1 GHR GHRHR GIF GJA3 GJB1 GJB2 GJB3 GK GLA GLB1
GLDC GLI3 GLRA1 GLUD1 GM2A GNAI2 GNAQ GNAS GNB3 GNMT GNPAT GNRHR GP1BA GP1BB GP9 GPC3 GPD2 GPHN GPI GPX1 GRHPR GSR GSS GSTZ1 GUCA1A GUCY2D
GYS1 GYS2 HADHA HAGH HAL HBA1 HBA2 HBB HBD HBG1 HBG2 HD HESX1 HEXA HEXB HF1 HFE HGD HK1 HLA-DPB1 HLA-DRB1 HLCS HLXB9 HMBS HMGCL HMGCS2 HMGIC
HMOX1 HNF4A HNMT HOX11 HOXA11 HOXD13 HP HPD HPRT1 HPS HPS3 HR HRAS HRG HSD11B2 HSD17B4 HSD3B2 HSPG2 HYAL1 ICAM1 IDS IDUA IF IFNA1 IFNG IFNGR1
IFNGR2 IGF1 IGF2R IGHM IGKC IGLL1 IKBKAP IKBKG IL12B IL13 IL1RAPL1 IL2 IL2RA IL2RG IL4R IL6 ING1 INS INSR IPF1 IRF1 IRF4 ITGA2 ITGA2B ITGA6 ITGA7 ITGB2 ITGB3
ITGB4 ITM2B ITPA IVD JAG1 JAK3 JPH3 JUP KAI1 KAL1 KCNE1 KCNH2 KCNJ1 KCNQ1 KCNQ2 KCNQ3 KCNQ4 KERA KHK KIF1B KIT KLKB1 KNG KRAS2 KRT1 KRT10 KRT12 KRT13
KRT14 KRT16 KRT17 KRT18 KRT2A KRT3 KRT4 KRT5 KRT6A KRT6B KRT8 KRT9 KRTHB1 KRTHB6 L1CAM LAMA2 LAMA3 LAMB3 LAMC2 LAMP2 LCAT LCK LDHA LDHB LDLR LEP
LEPR LHCGR LHX3 LIPA LIPC LMAN1 LMNA LMO1 LMX1B LOR LOX LPL LPP LRP5 LTC4S LYL1 LYZ LZTS1 MAD1L1 MADH4 MALT1 MAN2B1 MAOA MAPT MAT1A MATN3 MBL2
MC4R MCC MCCC1 MCCC2 MCOLN1 MCP MDS1 MECP2 MEFV MEN1 MET MGAT2 MGP MHC2TA MID1 MITF MJD MKKS MKL1 MLF1 MLH1 MLH3 MLYCD MMP2 MN1 MOCS2 MPL
MPO MPZ MRE11A MS4A2 MSF MSH2 MSH3 MSH6 MSX1 MSX2 MTHFR MTM1 MTMR2 MTP MTR MTRR MUC3A MUT MUTYH MVK MXI1 MYBPC3 MYC MYF6 MYH2 MYH7 MYH9 MYL2
MYL3 MYLK2 MYO15A MYO5A MYO6 MYOC NAGA NAGLU NBS1 NCF1 NCF2 NDN NDP NDRG1 NDUFS1 NDUFS4 NDUFS7 NDUFS8 NDUFV1 NDUFV2 NEB NEFH NEU1 NEUROD1
NF1 NF2 NME1 NOS3 NOTCH1 NOTCH3 NP NPC2 NPHP1 NPHS2 NPM1 NQO1 NR0B1 NR0B2 NR2E3 NR3C1 NR3C2 NR4A3 NR5A1 NRAS NRL NRTN NTRK1 NUMA1 NUP214 NUP98
NYX OA1 OAT OCA2 OCRL OFD1 OGDH OGG1 OPA1 OPA3 OPHN1 OPN1LW OPN1SW OPTN OTC OTOF OXCT PABPN1 PAFAH1B1 PAH PAPSS2 PAX2 PAX3 PAX6 PAX7 PAX8 PAX9
PC PCBD PCCA PCCB PCDH15 PCSK1 PDE6A PDE6B PDGFB PDGFRL PDHA1 PEPD PER2 PEX1 PEX10 PEX13 PEX3 PEX6 PEX7 PFC PFKL PFKM PGAM2 PGK1 PHB PHEX PHGDH
PHKA2 PHKB PHKG2 PHYH PIGA PITX2 PKD1 PKD2 PKLR PKP1 PLA2G2A PLA2G7 PLAT PLEC1 PLOD PLP1 PML PMM2 PMP22 PMS1 PMS2 PNLIP POLG POLH POMC PON1 PON2
POU1F1 POU3F4 POU4F3 PPARG PPGB PPOX PPP2R1B PPT1 PRCC PRF1 PRKAG2 PRKAR1A PRKWNK1 PRKWNK4 PRNP PROC PRODH PROP1 PROS1 PRPF31 PRPF8 PRPS1
PRSS1 PRX PSAP PSEN1 PSEN2 PTCH PTCH2 PTEN PTH PTHR1 PTPN11 PTPN12 PTPRC PTS PVR PVRL1 PXF PXMP3 PXR1 PYGL PYGM QDPR RAB27A RAC2 RAD51 RAG2
RAP1GDS1 RARA RASA1 RB1 RBM15 RBP4 RDH5 RDS REN RET RFX5 RFXANK RGR RHAG RHO RLBP1 ROM1 RP1 RPE65 RPGR RPGRIP1 RPS19 RPS6KA3 RUNX1 RUNX2 RYR1
SACS SAG SAH SALL1 SARDH SCA1 SCA2 SCA7 SCN1A SCN1B SCN4A SCNN1A SCNN1B SCNN1G SCO1 SCO2 SCYA5 SDF1 SDHA SDHB SDHC SDHD SEDL SELE SELP SEPN1
SERPINA1 SERPINA3 SERPINA5 SERPINA6 SERPINA7 SERPINC1 SERPIND1 SERPINE1 SERPINF2 SERPING1 SERPINI1 SFTPB SFTPC SGCA SGCB SGCD SGCE SGSH SH2D1A SH3BP2
SH3GL1 SHOX SIX3 SLC10A2 SLC11A1 SLC11A3 SLC12A3 SLC19A2 SLC22A1L SLC22A5 SLC25A13 SLC25A15 SLC25A20 SLC25A4 SLC26A4 SLC2A1 SLC2A2 SLC2A4 SLC3A1
SLC4A1 SLC4A4 SLC5A1 SLC5A5 SLC6A2 SLC6A3 SLC6A4 SLC6A8 SLC7A7 SLC7A9 SMARCAL1 SMARCB1 SMN1 SMOH SMPD1 SNCA SNRPN SOD1 SOST SOX10 SOX9 SPG3A SPG4
SPG7 SPINK1 SPR SPTA1 SPTLC1 SRC SRD5A2 SRY SSTR2 SSX1 SSX2 STAR STAT1 STK11 STS TACSTD2 TAF15 TAL1 TAL2 TALDO1 TAP2 TAT TAZ TBP TBX19 TBX22 TBX3 TBX5
TBXA2R TBXAS1 TCAP TCF1 TCF2 TCF3 TCIRG1 TCL1A TCL1B TCN2 TCOF1 TECTA TF TFE3 TFR2 TG TGFB1 TGFBI TGFBR2 TGIF TGM1 TH THBD THPO THRB TIMM8A TIMP3
TITF1 TKT TLR4 TM4SF2 TMPRSS3 TNF TNFRSF10B TNFRSF1A TNFRSF6 TNFSF5 TNFSF6 TNNI3 TNNT1 TNNT2 TNXB TP53 TPI1 TPM1 TPMT TPO TRA@ TRH TRIM33 TRIM37
TRPS1 TSC1 TSC2 TSG101 TSHB TSHR TTID TTN TTPA TTR TULP1 TWIST TYR TYROBP TYRP1 UBE3A UCHL1 UCP2 UMPS UROD UROS USH3A USP9Y VDR VHL VMD2 VWF WAS
WFS1 WHN WISP3 WNT4 WRN WT1 XDH XK XPA XPC XRCC3 ZFHX1B ZNF145 ZNF198 ZNF9 ZNFN1A1
1104 cloned human disease loci
13,382 known human disease loci
??? disease-locus associations
Disease gene hunting is multidimensional
• Genome-wide scan for localization (genetic linkage)
• Find polymorphic markers to narrow region (genetic linkage)
• Determine/characterize cytogenetic position (cytogenetic)
• Find/order DNA clones that map to region (physical)
• Identify genes/ESTs mapping to region (transcript)
• Identify/generate genomic sequence of region (sequence)
• Analyze DNA sequence (functional)
• Gene analysis for candidacy (functional)
• Search for mutations/rearrangements/association (functional)
One disease, one gene, many perspectives
The perfect gene finder
Please enter parameters:
Cytogenetic position: 1p36.3
Flanking markers: D1S468-D1S214
Affected tissue(s): Undifferentiated neuroblast
Type of gene expected: Tumor suppressor
Keywords: Cell cycle, differentiation
[Search]
File Not Found
The file you have requested does not exist on this server.
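No such tool existed, but the query the form describes can be sketched as a filter over an integrated catalogue. The example below is a hypothetical illustration. The `Gene` record, the `find_candidates` function, the toy gene symbols, and all positions are invented, and the flanking markers are assumed to have already been translated into a start and end coordinate.

```python
from dataclasses import dataclass, field


@dataclass
class Gene:
    symbol: str
    position_mb: float                       # position on the chromosome, in Mb
    tissues: set = field(default_factory=set)
    gene_type: str = ""
    keywords: set = field(default_factory=set)


def find_candidates(genes, start_mb, end_mb, tissue, gene_type, keywords):
    """Keep genes inside [start_mb, end_mb]; rank by number of matching criteria."""
    scored = []
    for g in genes:
        if not (start_mb <= g.position_mb <= end_mb):
            continue  # outside the region defined by the flanking markers
        score = (
            (tissue in g.tissues)
            + (g.gene_type == gene_type)
            + len(keywords & g.keywords)
        )
        scored.append((score, g.symbol))
    # Highest score first; ties broken alphabetically by symbol.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored


# Toy catalogue with invented entries (not real genes or positions).
CATALOGUE = [
    Gene("GENE_A", 1.2, {"neuroblast"}, "tumor suppressor", {"cell cycle"}),
    Gene("GENE_B", 3.5, {"liver"}, "enzyme", {"metabolism"}),
    Gene("GENE_C", 2.0, {"neuroblast"}, "kinase", {"differentiation", "cell cycle"}),
    Gene("GENE_D", 9.0, {"neuroblast"}, "tumor suppressor", {"cell cycle"}),
]

hits = find_candidates(
    CATALOGUE, 0.5, 4.0, "neuroblast", "tumor suppressor",
    {"cell cycle", "differentiation"},
)
for score, symbol in hits:
    print(score, symbol)
```

The ranking is only as good as the catalogue behind it: every criterion depends on positions, expression data, and annotations being integrated in one place.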
Cataloguing genomic data
CHALLENGES:
• Lots of data, no integration/standardization
→ Different viewpoints
→ Inefficiency for end-users
SOLUTION:
→ Unification and seamless delivery
• Comprehensive data collection and integration
• Global, simplified views
• Web portaling
Cataloguing procedure
1. Data identification
• Anchoring data sets
• Content data sets
2. Localization of genomic elements
• Cytogenetic
• Genetic linkage
• Radiation hybrid
• DNA sequence
3. Comprehensive integration relative to a single scale
• Conflict annotation and resolution
• Nomenclature management
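One way to place elements from different maps on a single scale is to interpolate between anchoring markers that appear on both maps. The sketch below uses plain linear interpolation and a simple order check. This is an illustrative technique chosen for this example, not necessarily the procedure used in the project, and all function names and coordinates are hypothetical.

```python
from bisect import bisect_right


def to_sequence_scale(map_pos, anchors):
    """Linearly interpolate a map position (e.g. cM or cR) onto sequence Mb.

    anchors: (map_pos, seq_mb) pairs for markers placed on both maps,
    sorted by map_pos. Positions outside the anchored range return None.
    """
    if not anchors:
        return None
    xs = [a[0] for a in anchors]
    if map_pos < xs[0] or map_pos > xs[-1]:
        return None
    i = bisect_right(xs, map_pos)
    if i == len(xs):                 # exactly on the last anchor
        return anchors[-1][1]
    (x0, y0), (x1, y1) = anchors[i - 1], anchors[i]
    return y0 + (y1 - y0) * (map_pos - x0) / (x1 - x0)


def order_conflicts(anchors):
    """Adjacent anchor pairs whose sequence order disagrees with the map order."""
    return [(a, b) for a, b in zip(anchors, anchors[1:]) if b[1] < a[1]]


# Invented anchors: (position on a linkage-style map, position in Mb).
anchors = [(0.0, 0.0), (10.0, 5.0), (20.0, 4.0), (30.0, 12.0)]

print(to_sequence_scale(5.0, anchors))   # halfway between the first two anchors
print(order_conflicts(anchors))          # the 10.0 and 20.0 anchors swap order
```

Flagging the conflicting pair without silently dropping it corresponds to the "conflict annotation" step: the disagreement is recorded so it can be resolved later.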
Cataloguing procedure (continued)
4. Data management
• Relational database
• Embedded data parsing, input, and analysis routines
• 1 GB data file
5. Display of information
• Unrestricted Internet access
• Simplified user interface
• Text-based and graphical viewing options
• Data repository
• Extensive support documentation
• Implementation of data analysis tools
• Object-specific linking to related data
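A relational layout for this kind of catalogue might keep one table of genomic elements and one table of localizations per map type. The schema, table names, and rows below are assumptions made for illustration and do not describe the project's actual database. The sketch uses Python's built-in `sqlite3` module and shows a region query defined by two flanking markers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE element (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE,
    kind TEXT NOT NULL              -- e.g. 'marker', 'gene', 'clone'
);
CREATE TABLE localization (
    element_id INTEGER NOT NULL REFERENCES element(id),
    map_type   TEXT NOT NULL,       -- 'cytogenetic', 'linkage', 'rh', 'sequence'
    position   REAL NOT NULL
);
""")

# Invented rows for illustration only.
conn.executemany(
    "INSERT INTO element (id, name, kind) VALUES (?, ?, ?)",
    [(1, "MARKER_1", "marker"), (2, "GENE_X", "gene"), (3, "CLONE_Y", "clone"),
     (4, "MARKER_2", "marker"), (5, "GENE_Z", "gene")],
)
conn.executemany(
    "INSERT INTO localization (element_id, map_type, position) VALUES (?, ?, ?)",
    [(1, "sequence", 1.0), (2, "sequence", 2.0), (3, "sequence", 3.0),
     (4, "sequence", 4.0), (5, "sequence", 6.0)],
)


def region_between(conn, left_marker, right_marker, map_type="sequence"):
    """Names of all elements lying between two flanking markers on one map."""
    lo, hi = conn.execute(
        """SELECT MIN(l.position), MAX(l.position)
           FROM localization l JOIN element e ON e.id = l.element_id
           WHERE e.name IN (?, ?) AND l.map_type = ?""",
        (left_marker, right_marker, map_type),
    ).fetchone()
    rows = conn.execute(
        """SELECT e.name
           FROM element e JOIN localization l ON e.id = l.element_id
           WHERE l.map_type = ? AND l.position BETWEEN ? AND ?
           ORDER BY l.position""",
        (map_type, lo, hi),
    )
    return [r[0] for r in rows]


print(region_between(conn, "MARKER_1", "MARKER_2"))
```

Storing the map type as a column lets the same element carry cytogenetic, linkage, radiation hybrid, and sequence positions side by side.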
HOME INTRO SEARCH REGION VIEW DATA INFO HELP METHODS
Usage
CompView launch: August, 1999
Page requests: 138,030
Database queries: 28,044
Domains: 9,148
Distribution of users:
• Non-US: 29%
• Commercial: 21%
• US education: 10%
• US government: 1%
• Other/unidentified: 36%
86
118
Countries represented:
Linkers/indices
Queries
Text or graphical display
Specify flanking markers or cytogenetic position
Select chromosome
Region summaries
Region definition
Element name
Transcriptional status
Sequence position
Map position
Cytogenetic location
Position tab
• Official name/title
• DNA sequence position
• RH position
• Genetic linkage position
• Cytolocation
• Other cytogenetic assignments
• Primer sequences
Description tab
• Expression status
• EST clusters
• SNPs
• Alternate names/IDs
Clone/sequence tab
• Sequence information
• BLAST utility
• Sequence viewing
• DNA clones
• Database searches
Totals
RH-based localizations                 51,903
Genetic linkage-based localizations    12,461
Cytogenetic localizations              14,706
Genes/EST clusters represented         36,402
DNA sequence-based localizations       51,334
Large-insert DNA clones                116,608
Single nucleotide polymorphisms        2.5 M
Linked external databases              50
Applications
[Figure: integrated view of chromosome bands 1p33, 1p34, 1p35, 1p36.1, 1p36.2, and 1p36.3, with tracks for ORFs, genes, sequence, clones, and markers; SRO indicated]
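Assuming SRO here stands for smallest region of overlap, the segment shared by all deletions observed in a set of samples, it can be computed as the intersection of the deletion intervals. The function name and coordinates below are hypothetical and are not taken from the slide.

```python
def smallest_region_of_overlap(deletions):
    """Intersect (start, end) deletion intervals.

    Returns the shared (start, end) segment, or None if the deletions
    have no segment in common (or no deletions were given).
    """
    if not deletions:
        return None
    start = max(s for s, _ in deletions)
    end = min(e for _, e in deletions)
    return (start, end) if start <= end else None


# Invented deletion intervals, in Mb from the end of the chromosome arm.
deletions = [(0.0, 8.0), (2.0, 10.0), (1.0, 5.5)]
print(smallest_region_of_overlap(deletions))
```

Each additional sample can only shrink or preserve the shared segment, which is why a well-integrated marker map matters for narrowing a candidate region.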
Summary
eGenome provides:
• Comprehensive inclusion of genomic elements
• Triangulated localization of genomic elements
• True integration
• Improved error detection
• Nomenclature management
• User-friendly Web interface
• Meta-level portal into the human genome
Future
Model organism    Research         Clinic
Genome            Genome           Genome
Transcriptome     Transcriptome    Transcriptome
Proteome          Proteome         Proteome
Cell              Cell             Cell
Tissue            Tissue           Tissue
Organism          Organism         Organism
Acknowledgments
Children’s Hospital of Philadelphia
• Erik Sulman
• Evan Katz
• Kevin Murphy
• Scott Winters
• Yang Jin
• Gonzalo Briceno
• Towfique Raj
• Randall Rose
• Information Systems
• Research Administration
Rutgers University
• Tara Matise
• Shahriar Sabuktagin
• Chungsheng He
Genome Database
• Chris Porter
• Laurie Kramer
Human Genome Centers
eGenome Beta testers
Applications
Known genes
Name                                                       Symbol    OMIM
Caspase 9, apoptosis-related cysteine protease             CASP9     602234
RNA-binding protein regulatory subunit                     DJ1       602533
FK506 binding protein 12-rapamycin associated protein 1    FRAP1     601231
Potassium channel, shaker-related family, beta member 2    KCNAB2    601142
Microfibrillar-associated protein 2                        MFAP2     156790
Period (Drosophila) homolog 3                              PER3      603427
Ribosomal protein L22                                      RPL22     180474
Regulatory solute carrier protein, family 1, member 1      RSC1A1    601966
Solute carrier family 2 member 5                           SLC2A5    138230
Tumor necrosis factor receptor superfamily, member 12      DR3       603366
Vesicle-associated membrane protein 3 (cellubrevin)        VAMP3     603657
ESTs with homology to known genes
AT1 receptor-associated protein-like (mouse)
Bkm sex-determining region protein CS314-like (Drosophila)
Granule cell marker protein-like (mouse)
Hypertension-related protein-like (rat)
Protocadherin-like (rat)
Synaptojanin-like (rat)
UDP-GalNAc acetylgalactosaminyltransferase-like (human)
Eye expression (column values in source order; row alignment not recoverable)
Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Exclusively, Yes, Yes, Yes, Yes, Yes