Download Cheminformatics and Pharmacophore Modeling

Application Guide

Cheminformatics and Pharmacophore Modeling, Together at Last

SciTegic Pipeline Pilot: Bridging Accord Database Explorer and Discovery Studio

Carl Colburn
Shikha Varma-O’Brien

Introduction

The integration of Accelrys’ Accord™ Database Explorer 3.0 (ADE 3.0) and Discovery Studio® 1.7 with the SciTegic Pipeline Pilot™ data analysis and mining platform now makes it easy for you to leverage screening data within your molecular modeling analyses. Using a published HTS data set [1] as an example, this document reviews how you can:

• Consolidate compound data within a centralized location using SciTegic Pipeline Pilot
• Create and configure a project- or enterprise-level database from that consolidated data using Accord Database Explorer 3.0
• Query the resulting ADE 3.0 database to identify modeling candidates based on defined criteria using Pipeline Pilot
• Analyze the queried data within the powerful Discovery Studio modeling environment

SciTegic Pipeline Pilot: Bridging Cheminformatics and Molecular Modeling

The SciTegic Pipeline Pilot server platform streamlines the integration and analysis of vast quantities of data, and can retrieve or join data from independent databases, files, and client applications. It can directly read chemistry, sequence, text, and numeric data from all popular formats and analyze data from multiple sources.

As shown in Figure 1, both Accord cheminformatics tools and the Discovery Studio molecular modeling environment can share a common underlying SciTegic Pipeline Pilot server. Additionally, within the graphical client interface to Pipeline Pilot, you can compose data processing networks (known as protocols) using hundreds of different configurable components for operations such as data retrieval, manipulation, computational filtering, and display. There are several Pipeline Pilot Component Collections available that include diverse components for creating such data processing protocols.
A few of these components are included in the example described in this document.

Figure 1. A common Pipeline Pilot server shared by Accord informatics and Discovery Studio modeling tools bridges cheminformatics with molecular modeling.

Step 1: Consolidating and Organizing Data Using Pipeline Pilot

Typically, compound data are not readily available from a centralized location but are contained in various files scattered throughout the organization, sometimes in different formats. With Pipeline Pilot, you have an efficient way to consolidate these data, store them in a secure, searchable, indexed database, and retrieve specific subsets of compounds for further study based on the goals of the project.

To illustrate these capabilities, we took both an SD file and text files from our example HTS screening data and used Pipeline Pilot to create a protocol (shown in Figure 2) that merged the datasets based on identical molecules, standardized the chemical structures, and created a new, merged dataset. Optionally, components from the Pipeline Pilot Chemistry Collection can also be included in the protocol to filter compounds by criteria such as Lipinski’s filters, reactive substructures, and descriptors.

Figure 2. An example Pipeline Pilot protocol to merge disparate datasets

Step 2: Creating a Local Database Using Accord Database Explorer 3.0

Accord Database Explorer 3.0 is a forms-based database client that provides powerful querying and browsing tools for extracting maximum value from local and server-hosted data sources. ADE 3.0 features a Database Set-up Wizard, which lets you easily create an Access format database from an SD file. This creates a stable, secure environment to store your data and allows the compounds to be indexed for faster searching. You can create various project-level databases in this way, and you can customize the column names and set-up of the tables during this process.
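
The idea of loading the data fields of an SD file into a table with customized column names can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how the ADE 3.0 Set-up Wizard works internally: it uses Python's standard library, an in-memory SQLite database stands in for the Access database, and the sample records, the field names (Compound_ID, IC50 (uM)), and the column names (compound_id, ic50_um) are all hypothetical.

```python
import sqlite3

# Hypothetical two-record SD file. The structure blocks are empty (zero atoms)
# because only the data fields are used in this sketch.
SD_TEXT = """cmpd-001
  example

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <Compound_ID>
cmpd-001

> <IC50 (uM)>
0.25

$$$$
cmpd-002
  example

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <Compound_ID>
cmpd-002

$$$$
"""

# Assumed mapping from SD field names to common, query-friendly column names.
COLUMN_MAP = {"Compound_ID": "compound_id", "IC50 (uM)": "ic50_um"}


def parse_sd_fields(sd_text):
    """Return one dict of data fields per SD record; structure blocks are skipped."""
    records, fields, name = [], {}, None
    for line in sd_text.splitlines():
        if line.strip() == "$$$$":            # record terminator
            records.append(fields)
            fields, name = {}, None
        elif line.startswith(">") and "<" in line:
            # Data header such as "> <Compound_ID>"
            name = line[line.index("<") + 1 : line.rindex(">")]
            fields[name] = ""
        elif name is not None:
            if line.strip() == "":            # a blank line ends the field value
                name = None
            else:
                fields[name] = (fields[name] + "\n" + line).strip()
    return records


def load_compounds(conn, records):
    """Create the table with the renamed columns and insert one row per record."""
    conn.execute(
        "CREATE TABLE compounds (compound_id TEXT PRIMARY KEY, ic50_um REAL)"
    )
    rows = []
    for rec in records:
        renamed = {COLUMN_MAP[k]: v for k, v in rec.items() if k in COLUMN_MAP}
        ic50 = renamed.get("ic50_um")
        # A record without an IC50 field is stored with NULL in that column.
        rows.append((renamed.get("compound_id"), float(ic50) if ic50 else None))
    conn.executemany("INSERT INTO compounds VALUES (?, ?)", rows)


conn = sqlite3.connect(":memory:")
load_compounds(conn, parse_sd_fields(SD_TEXT))
rows = conn.execute(
    "SELECT compound_id, ic50_um FROM compounds ORDER BY compound_id"
).fetchall()
print(rows)
```

Using the same column names in every project-level table is what keeps later SQL statements simple, since one query can then be reused across databases.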
Figure 3 shows one step of the ADE 3.0 Set-up Wizard that gives a view onto the table. This allows you to create local, project-level databases with common column names so that they can be more easily queried together in SQL statements within Pipeline Pilot, as described in the following section.

Figure 3. Editable table structure in ADE 3.0 Set-up Wizard

Step 3: Querying the Database to Identify Modeling Candidates Using Pipeline Pilot

Once the database is created with ADE 3.0, it is available as a data source from within Pipeline Pilot, where you can search the data to select qualified compounds for further modeling and analysis with Discovery Studio. As described in the following paragraphs, this requires three steps: connect to the new database; construct the SQL Select statement to perform a specific query; and create a protocol to retrieve desired compounds for modeling.

Connecting to a Database Using ODBC

To connect to a database using ODBC (Open Database Connectivity), you must create a data source name (DSN) on your machine. This is done in the Control Panel | Administrative Tools | Data Sources menu. Since we are using an Access database in this example, select the Microsoft Access Driver. You may use a user name and password if desired.

Creating a Query Protocol

Next, as shown in Figure 4, create a protocol in Pipeline Pilot using the ODBC Select component. Because the database in this example is in Accord format, we use the ODBC component to access the data, but we need to specify the format by including “…as accord_mol…” in the SQL select statement. We then include “Molecule from Accord” and “Minimize Molecule” components in our protocol to convert the data format and optimize the chemistry for Discovery Studio. Enter the DSN, user name, and password in the properties area. Then, specify the columns (parameters) from the compiled ADE 3.0 database that you want included in your dataset.

Figure 4. Pipeline Pilot protocol with ODBC parameters

Modifying the SQL

You may now modify the SQL as needed, as illustrated in Figure 5. In this example, records without an actual IC50 data point were eliminated because we later use the IC50 data to study the structure-activity relationships of the compounds in Discovery Studio.

Figure 5. PilotScript SQL statement builder

Step 4: Modeling the Selected Compounds in Discovery Studio: Conformational Analysis, Pharmacophore Analysis, Etc.

Depending on the objective of the project, various modeling approaches can be pursued once some experimental data are available. Discovery Studio provides both structure- and ligand-based modeling approaches. As shown in Figure 6, the customized protocols that you’ve developed in the Pipeline Pilot client for querying an ADE 3.0 database can be published within a user protocols folder in the Discovery Studio GUI. This means that you can select and execute your desired Pipeline Pilot protocol directly from within the Discovery Studio interface. In this example, we’ve created a protocol in which the compounds’ IC50 data points are pulled from the ADE 3.0 database and used to understand structure-activity relationships (SAR), which can then help guide future synthesis.

Figure 6. Customized workflow: Pull data from ADE 3.0 into Discovery Studio

Additionally, we can generate pharmacophore models based on the common features of the compounds in the ADE 3.0 database, which can then be used in Discovery Studio for scaffold hopping, lead finding by 3-D database mining, understanding binding modes, etc. Figure 7 shows a simple common-feature pharmacophore model of four active CDK2 inhibitors. The model represents the framework of features (acceptors, donors, aromatic rings, etc.) shared among these compounds and also provides the feature-based alignments of these molecules. Such models can be used to prioritize compounds for screening.

Figure 7. Pharmacophore model built in Discovery Studio based on the common features (acceptors, donors, aromatic rings, etc.) of four active CDK2 inhibitors included in an ADE 3.0 database

Choose (or Invent!) Your Workflow

Once in the Discovery Studio environment, modeling can be performed in any desired workflow combining structure-based and ligand-based approaches, as shown in Figure 8. The Pipeline Pilot server allows you to customize any protocol and add external algorithms as well.

Figure 8. Examples of Discovery Studio modeling capabilities

References

1. Bradley, E.K., Miller, J.L., Saiah, E. and Grootenhuis, P.D.J. J. Med. Chem., 2003, 46, 4360-4364.
2. SciTegic website: http://www.scitegic.com/community/downloads/downloads.html
