InstantJChem: a flexible chemical database system

G. Marcou, D. Horvath
Laboratoire d’infochimie, Université de Strasbourg, 1, rue Blaise Pascal, 67000 Strasbourg

Introduction

The goal is to present InstantJChem for the storage and manipulation of chemical information:
1. General presentation
2. Database search
3. Creation of a database from scratch

What is a database?

• A database stores data on a specific subject in an organized form.
• A relational database stores information in tables that reference one another.
• A relational database management system (RDBMS) is software that manages relational databases.
• InstantJChem is neither a database nor an RDBMS.

What is InstantJChem?

InstantJChem is a user-friendly interface between an RDBMS, chemical information, and the user.

[Diagram: InstantJChem connecting the user, the RDBMS, and chemical information.]

Key concepts of InstantJChem

Projects, Schema, Databases and Tables, Entities, Data Trees, Views.

Exercise 1

Create a new project named IJCExercises…

Key concept: Project

A project contains resources and connections to one or more databases.

Exercise 1

…and import the file SC100.SDF into it…

Key concept: Schema

A schema (database) contains the connection to a database and special tables (JChemProperties).

Key concept: Database and Tables

Databases and tables are managed by the RDBMS. They are what actually stores the information.

What can be stored

Standard table:
• Integer: long integer, 2^32 = 4294967296.
• Text: the user can specify text field widths as large as needed.
• Real: double-precision real number.
• Date: stores dates.
• Boolean: value is True or False.
• List: stores a list of database items.

JChem table:
• Chemical terms: a list of functions evaluated on chemical structures (logD, pKa, tautomers, ...).
• Structure: chemical structure, created automatically with a JChem table.

Key concept: Entities

An entity is a representation of data. It provides a single, uniform interface to conceptually different types of tables (Standard, Chemical, SQL, Extractions, etc.).
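The notion of a relational database introduced above (tables that reference one another, managed by an RDBMS) can be illustrated outside InstantJChem with a minimal sketch. It uses SQLite through Python's standard `sqlite3` module purely as a stand-in RDBMS; the table names `compound` and `annotation`, their columns, and the rows are hypothetical and are not taken from SC100.SDF or from InstantJChem's own schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two tables linked by a foreign key."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE compound ("
        " id INTEGER PRIMARY KEY,"      # Integer field
        " name TEXT NOT NULL)"          # Text field
    )
    conn.execute(
        "CREATE TABLE annotation ("
        " id INTEGER PRIMARY KEY,"
        " compound_id INTEGER NOT NULL REFERENCES compound(id),"  # inter-reference
        " solubility TEXT)"
    )
    # Made-up rows, for illustration only.
    conn.executemany(
        "INSERT INTO compound (id, name) VALUES (?, ?)",
        [(1, "benzene"), (2, "pyridine")],
    )
    conn.executemany(
        "INSERT INTO annotation (compound_id, solubility) VALUES (?, ?)",
        [(1, "Poor"), (2, "Good")],
    )
    conn.commit()
    return conn


def solubility_by_name(conn):
    """Join the two tables through the reference column."""
    query = (
        "SELECT c.name, a.solubility "
        "FROM compound AS c JOIN annotation AS a ON a.compound_id = c.id "
        "ORDER BY c.id"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for name, label in solubility_by_name(build_demo_db()):
        print(name, label)
```

In InstantJChem the user does not write such statements by hand: the RDBMS still stores the tables, while entities and views present them.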
Key concept: Data Trees

A data tree is a collection of entities and views. It organizes information using a hierarchy (parent-child relationships between entities).

Exercise 1

…Customize a browser for it.

Key concept: Views

A view is an interface to data. For simple data, a spreadsheet view is appropriate. For complex relational data, a form is required.

Exercise 2

In the SC100 database, search for molecules containing fluorobenzene and for molecules containing pyridine. Use Substructure or Similarity search.

Results for the two queries:
• Substructure search: 20 hits; Similarity search: 0 hits
• Substructure search: 14 hits; Similarity search: 0 hits

Similarity search uses the Chemical Hashed Fingerprints defined at database creation.

Chemical Hashed Fingerprints (CHF)

An efficient annotation to accelerate structure search (www.chemaxon.com).
• Pattern Length: number of bonds in a pattern
• Fingerprint Length: total number of bits used to store the fingerprint
• Bits per pattern: number of bits that each pattern sets

Exercise 3

Combine molecules 25 and 89 into a pseudo-molecule to perform a superstructure query.

Exercise 4

Use compound 46 as a Full and as a Full fragment query to search the database. Repeat after removing the bromide from the query.

Structure Searches

(www.chemaxon.com)

Exercise 5

Search for benzene-containing compounds whose name contains “pyrimidin” and which are annotated as “Good” for aqueous solubility.

Exercise 6

Search for compounds with at least one aromatic ring containing at least one nitrogen atom.

Exercise 7

Search for compounds with MolWeight > 200 that do not contain a benzene ring.

Exercise 8

Search for compounds with MolWeight > 200, then for compounds without a benzene ring, and search for the union of the two hit lists.

Exercise 9

Search for compounds possessing more than 4 microspecies at pH=4.0…

Exercise 9

…Export your hit list.
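The three fingerprint parameters described above (pattern length, fingerprint length, bits per pattern) can be made concrete with a toy hashed path fingerprint. This is a simplified sketch, not ChemAxon's implementation: molecules are given as plain atom-label lists and bond pairs, bond orders and rings are ignored, and the hashing scheme (SHA-256 of the pattern string) is an assumption chosen only to make the example deterministic. All function names are hypothetical.

```python
import hashlib


def enumerate_paths(atoms, bonds, max_bonds):
    """Return all linear atom-label paths containing at most max_bonds bonds."""
    neighbours = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)
    patterns = set()

    def walk(path):
        labels = [atoms[i] for i in path]
        # Keep one canonical direction so that C-N and N-C are the same pattern.
        patterns.add("-".join(min(labels, labels[::-1])))
        if len(path) - 1 < max_bonds:
            for nxt in neighbours[path[-1]]:
                if nxt not in path:
                    walk(path + [nxt])

    for start in range(len(atoms)):
        walk([start])
    return patterns


def fingerprint(atoms, bonds, pattern_length=3, fp_length=64, bits_per_pattern=2):
    """Hash every pattern onto bits_per_pattern positions of an fp_length-bit integer."""
    fp = 0
    for pattern in enumerate_paths(atoms, bonds, pattern_length):
        for k in range(bits_per_pattern):
            digest = hashlib.sha256(f"{pattern}|{k}".encode()).digest()
            fp |= 1 << (int.from_bytes(digest[:8], "big") % fp_length)
    return fp


def may_contain(target_fp, query_fp):
    """Screening test: every bit of the query must also be set in the target."""
    return (target_fp & query_fp) == query_fp


def tanimoto(fp_a, fp_b):
    """Similarity as shared bits divided by bits set in either fingerprint."""
    union = bin(fp_a | fp_b).count("1")
    return bin(fp_a & fp_b).count("1") / union if union else 1.0


if __name__ == "__main__":
    query = fingerprint(["C", "N"], [(0, 1)])
    target = fingerprint(["C", "C", "N", "O"], [(0, 1), (1, 2), (2, 3)])
    print(may_contain(target, query), tanimoto(target, query))
```

The screening test only rules molecules out: a target that fails it cannot contain the query, while a target that passes still needs an exact atom-by-atom comparison. A small fragment also sets far fewer bits than a full molecule, so its similarity score to that molecule stays low even when it is contained in it, which is one plausible reason a substructure query can return hits while a similarity query on the same fragment returns none.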
Exercise 10

Import the file ISICCRsm.RDF into your project…

Exercise 10

…Create a Browser for this database.

Exercise 11

Search for reactions with an imidazole ring in their reactants, then in their products.

Exercise 12

Add to your Schema a new data tree and structure entity named AlkanBoilingPoint…

Exercise 12

…and add a floating-point value field named BoilingPoint.

Exercise 13

Add the following data to the AlkanBoilingPoint entity.

Exercise 14

Add to the AlkanBoilingPoint entity a new date field named Date and fill it in.

Exercise 15

Add to the AlkanBoilingPoint entity a calculated LogP value using a Chemical Terms field.

Summary

• Create a project and schema
• Import data
• Search by substructure, superstructure, similarity, and exact match
• Search by keyword
• Combine queries and result lists
• Export query results
• Create a new database

Conclusion

InstantJChem is a chemoinformatics layer above a standard database management system (DBMS). It provides many more chemoinformatics services (database overlap, QSPR modeling, plots, enumeration, scripting).

[Diagram: InstantJChem layered above the DBMS.]
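Combining queries and result lists, as practised in Exercises 7 and 8, comes down to set operations on hit lists. The sketch below shows the difference between the two exercises on made-up records; the compound identifiers, the `mol_weight` and `has_benzene` fields, and the `search` helper are hypothetical and do not reproduce SC100 data or any InstantJChem API.

```python
# Made-up records: compound id -> properties (illustration only).
COMPOUNDS = {
    1: {"mol_weight": 250.3, "has_benzene": True},
    2: {"mol_weight": 310.0, "has_benzene": False},
    3: {"mol_weight": 120.1, "has_benzene": False},
    4: {"mol_weight": 150.2, "has_benzene": True},
}


def search(records, predicate):
    """Return the hit list (a set of ids) of records satisfying the predicate."""
    return {cid for cid, rec in records.items() if predicate(rec)}


def combined_and(records):
    """Exercise 7 style: both conditions must hold in a single query."""
    return search(
        records,
        lambda r: r["mol_weight"] > 200 and not r["has_benzene"],
    )


def combined_union(records):
    """Exercise 8 style: run two queries, then take the union of the hit lists."""
    heavy = search(records, lambda r: r["mol_weight"] > 200)
    no_benzene = search(records, lambda r: not r["has_benzene"])
    return heavy | no_benzene


if __name__ == "__main__":
    print(sorted(combined_and(COMPOUNDS)))
    print(sorted(combined_union(COMPOUNDS)))
```

The combined query is the intersection of the two individual hit lists, so it can never return more compounds than the union built in the second function.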