Download Introduction - Pathway Tools Software

Creating a …
Community Database
Organism-Specific Database
Model-Organism Database
Why Create a PGDB?
SRI International
Bioinformatics
 Perform pathway analyses as part of a genome project
 Analyze omics data
 Create a central information resource for the organism
 Create an FBA model
 Perform comparative analyses
Model Organism Databases

DBs that describe the genome and other information about an organism

Curated by experts for that organism
 No one group can curate all the world’s genomes
 Distribute workload across a community of experts to create a community resource

Every sequenced organism with an active experimental community requires a MOD
 Integrate genome data with information about the biochemical and genetic network of the organism
 Integrate literature-based information with computational predictions
Rationale for MODs
 Each “complete” genome is incomplete in several respects:
 40%-60% of genes have no assigned function
 Roughly 7% of those assigned functions are incorrect
 Many assigned functions are non-specific
 MODs are platforms for global analyses of an organism
 Interpret omics data in a pathway context
 In silico prediction of essential genes
 Characterize systems properties of metabolic and genetic networks
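To give a sense of scale, the percentages on this slide can be applied to a genome of a given size. The sketch below is illustrative only: the 4,000-gene genome is a hypothetical figure, and the function name `curation_backlog` is invented here. It reads "roughly 7%" as applying to the genes that do have an assigned function.

```python
from typing import Tuple


def curation_backlog(total_genes: int, pct_unassigned: int,
                     pct_incorrect: int = 7) -> Tuple[int, int]:
    """Return (genes with no assigned function, assigned functions likely wrong).

    Percentages are whole numbers taken from the slide; integer arithmetic
    keeps the results exact.
    """
    unassigned = total_genes * pct_unassigned // 100
    assigned = total_genes - unassigned
    incorrect = assigned * pct_incorrect // 100
    return unassigned, incorrect


# Hypothetical genome of 4,000 genes, at both ends of the 40%-60% range.
for pct in (40, 60):
    unassigned, incorrect = curation_backlog(4000, pct)
    print(f"{pct}% unassigned: {unassigned} genes without a function, "
          f"about {incorrect} incorrect assignments")
```

Under these assumptions, well over a thousand genes would need attention either way, which is the workload the slide argues should be shared across a community of curators.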
What is Curation?
 Ongoing updating and refinement of a PGDB
 Correct false-positive and false-negative predictions
 Incorporate information from experimental literature
 Update genome sequence
 Update gene functions, gene positions, gene names
 Author comments and citations
 Add new pathways, modify existing pathways
 Enter information about regulatory networks
Issues in Creating Public MODs
 Obtaining funding
 Scoping the project
 Identify user community
 Obtain buy-in and help from scientific community
 IT: Set up database server, Web server
 Hire and train curators
Questions
 Do you intend to make your PGDB public and to update it on an ongoing basis?
 To create a Model Organism Database?
Administering Pathway Tools
Obtaining Pathway Tools

Free to non-commercial organizations

To obtain a license agreement, go to BioCyc.org and click on Software/Database Download

Follow the Installation Guide

ptools-local directory
 Locate in a common directory
 PGDBs created by all users who use this ptools installation
 PGDBs downloaded via the registry
 ptools-init.dat for this ptools installation
New Pathway Tools Releases

Major releases = External software releases
 Twice per year
 Announced on ptools-users mailing list

Minor releases twice per year affect only our BioCyc.org Web site and flatfile distributions

We support one prior release only

Releases announced on [email protected]

Read release notes at
 http://brg.ai.sri.com/ptools/release-notes.html

Install process:
 Upgrade schema of your DB (software assisted)
PGDB Storage: File or Relational Database

File storage:
 Advantages:
    No RDBMS installation and configuration
 Disadvantages:
    Must be loaded and saved in its entirety
    No transaction history
    No concurrent access for multiple users

Oracle/MySQL storage:
 Advantages:
    Faster read access, faster saves
    Concurrent update access for multiple users
    Stores history of all PGDB updates
 Disadvantages:
    RDBMS must be installed and configured
Multiuser Access to PGDBs
 PGDB stored within one Oracle or MySQL server
 Each curator installs PTools on their workstation
 Different curators can use different software platforms
 Workstations query RDBMS server via internet
 Local disk cache speeds access
 For each frame access, PTools queries:
 In-memory cache, disk cache, RDBMS server
 After curator saves changes, all changes made by other users are loaded into curator’s session
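The three-level frame lookup described on this slide can be pictured with a small sketch. This is not Pathway Tools code: the class `TieredFrameStore`, the dictionary standing in for the disk cache, and the `fetch_from_server` callback standing in for an RDBMS query are all hypothetical. The lookup order (memory, then disk, then server) and the copying of fetched frames into the faster tiers are assumptions based on the order in which the slide lists the caches.

```python
from typing import Callable, Dict, Optional


class TieredFrameStore:
    """Illustrative lookup: in-memory cache, then disk cache, then server."""

    def __init__(self, disk_cache: Dict[str, dict],
                 fetch_from_server: Callable[[str], Optional[dict]]) -> None:
        self.memory: Dict[str, dict] = {}
        self.disk = disk_cache                      # stand-in for the local disk cache
        self.fetch_from_server = fetch_from_server  # stand-in for an RDBMS query

    def get(self, frame_id: str) -> Optional[dict]:
        # 1. Fastest tier: frames already held in memory.
        if frame_id in self.memory:
            return self.memory[frame_id]
        # 2. Local disk cache; promote the frame to memory on a hit.
        if frame_id in self.disk:
            frame = self.disk[frame_id]
            self.memory[frame_id] = frame
            return frame
        # 3. Slowest tier: ask the server, then fill both caches.
        frame = self.fetch_from_server(frame_id)
        if frame is not None:
            self.disk[frame_id] = frame
            self.memory[frame_id] = frame
        return frame
```

With this arrangement a frame is fetched over the network at most once per workstation until something invalidates it. The sketch leaves out invalidation entirely; the slide indicates that other users' changes are loaded into the session after a save, which a real implementation would have to reflect in both caches.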
How to Release a PGDB?

Decide on release frequency and schedule
 Don’t wait until it’s perfect to release it!

Freeze curation for 1 week

Quality assurance
 Run consistency checker
    Tools -> Consistency Checker
    Also updates organism-summary statistics

Update publications, authors in organism frame
 Update via Organism editor

Create new version of PGDB
 ptools-local/pgdbs/yeastcyc/1.0/kb/yeastbase.ocelot
 Edit against the new version, release the old version

Author release notes

Register PGDB in SRI PGDB registry
 Will allow SRI to include it in BioCyc
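The versioned layout in the example path above can be handled with a few lines of path arithmetic when scripting a release. The sketch below is an illustration, not part of Pathway Tools: the helper names `kb_file`, `parse_version`, and `sorted_versions` are hypothetical, and it assumes that version directories are named with dot-separated integers such as 1.0.

```python
from pathlib import Path
from typing import List, Tuple


def parse_version(name: str) -> Tuple[int, ...]:
    """Turn a version directory name such as '1.0' into (1, 0) for sorting."""
    return tuple(int(part) for part in name.split("."))


def kb_file(ptools_local: Path, pgdb_dir: str, version: str, kb_name: str) -> Path:
    """Build the path to a versioned KB file, following the layout on the slide."""
    return ptools_local / "pgdbs" / pgdb_dir / version / "kb" / f"{kb_name}.ocelot"


def sorted_versions(names: List[str]) -> List[str]:
    """Sort version names numerically, so that '1.10' comes after '1.9'."""
    return sorted(names, key=parse_version)


# Reproduces the path shown on the slide.
print(kb_file(Path("ptools-local"), "yeastcyc", "1.0", "yeastbase").as_posix())
```

Sorting numerically rather than as plain strings matters once a PGDB passes version 1.9, since a string sort would place 1.10 before 1.9 and a script could pick the wrong directory as the newest one.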
Pathway Tools Data Import/Export

File->Export
File->Import

Export/import to/from tab-delimited files

Export to Genbank, SBML, BioPAX

Export to attribute-value files

Attribute-value files can be imported into BioWarehouse
 Relational database system for bioinformatics database integration
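A tab-delimited export is straightforward to process outside Pathway Tools. The following sketch uses only the Python standard library; the column names `FRAME-ID` and `COMMON-NAME` and the sample row are made up for illustration, and the actual columns depend on what is exported. It assumes the file has a header row.

```python
import csv
import io
from typing import Dict, List, TextIO


def read_tab_delimited(handle: TextIO) -> List[Dict[str, str]]:
    """Read a tab-delimited file with a header row into a list of dicts."""
    # QUOTE_NONE treats quote characters as ordinary text, which is the
    # safer reading for tab-delimited data that is not CSV-quoted.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [dict(row) for row in reader]


# Hypothetical sample standing in for an exported file.
sample = io.StringIO("FRAME-ID\tCOMMON-NAME\nGENE-1\tabcA\nGENE-2\tabcB\n")
rows = read_tab_delimited(sample)
print(rows[0]["COMMON-NAME"])
```

For a real file, open it with `open(path, newline="", encoding="utf-8")` and pass the handle to the same function; the encoding is an assumption and should match the export.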

Napster Comes to Bioinformatics
 Public sharing of Pathway/Genome Databases
 PGDB registry maintained by SRI at URL http://biocyc.org/registry.html
 Registry operations
 List contents of registry
 Download PGDBs listed in the registry
 Register PGDBs you have created
Registry Details

Why register your PGDB?
 Declare existence of your PGDB in a central location
 Facilitate its download by other scientists
 Facilitate its inclusion in BioCyc.org

Why download a PGDB?
 Desktop Navigator provides more functionality than Web
 Comparative operations
 Programmatic querying and processing of PGDB

Registration process
 Registered PGDBs have open availability by default
 Authors can provide their own license agreements
 Registered PGDBs reside in authors’ FTP site or HTTP server
Desktop versus Web Mode
 Pathway Tools runs in two different modes:
 Desktop mode
 Web mode (e.g., BioCyc.org)
 Desktop vs Web functionality in Pathway Tools: http://biocyc.org/desktop-vs-web-mode.shtml
 You can run both desktop and web modes at your site
 Your PTools web server need not be open to the public