DWPI Chemistry Resource (DCR)
Brian Larner – 24th September 2015
AGENDA
• THE PROBLEM; WHY IS IT DIFFICULT TO SEARCH CHEMICAL STRUCTURES IN PATENTS?
• SOLUTION; DCR INDEXING
• DCR COVERAGE
• DCR STRUCTURE CONVENTIONS
• SEARCHING IN THE DCR DATABASE
PROBLEMS IN SEARCHING FOR CHEMICAL INFORMATION
• There is no consistent way of representing chemical information in patents
• A chemical compound could be referred to using any of the following
– A systematic chemical name
– A semi-systematic or trivial chemical name
– A trade or proprietary name or, for drugs, an approved name or trial prep code
– A drawn-out chemical structure
– One possibility within a generic chemical structure
– One possibility when only a generic class of compounds is referred to
ALL THE NAMES DICLOFENAC IS KNOWN BY
• ABITREN; ADEFURONIC; AFLAMIN; ALLVORAN; ALMIRAL; AM-DICLOFENAC; ANFENAX;
ANTHRAXITON; ARTHRIFEN; ARTHRODERM; ARTHROPEN; ARTHROTEC; ARTREN; ARTRILAT;
ARTRITAREN; ASPZONE; ASSAREN; ATHROFEN; BA-47210; BATAFIL; BENFOFEN; BIOFNAC; BLESIN;
BOLABOMIN; CATAFLAM; CHLORGY; CIBA-47210; COLIRI; CONTRALG; CORDRALAN; CT-DICLO;
CURINFLAM; CURINFLAM-A.P.; DAISPAS; DEFLAMAT; DELIMON; DELPHIMIX; DELPHIMIX-1; DELPHINAC;
DENBAL; DESINFLAM; DFNA; DFP; DFP-60; DICHRONIC; DICLAC; DICLO; DICLO-ATTRITIN; DICLOBASAN; DICLO-OPT; DICLO-PHLOGONT; DICLO-PUREN; DICLO-REKTAL; DICLO-SPONDRYL; DICLOSPONDYRIL; DICLO-TABLINEN; DICLOBENIN; DICLOD; DICLOFEN; DICLOFENAC; DICLOFENAC SODIUM SALT; DICLOFENAC SODIUM; DICLOFENAC-OPT; DICLOFENAC-SODIUM; DICLOFENACO;
DICLOFLEX; DICLOMAX; DICLOMELAN; DICLOPHLOGONT; DICLOREUM; DICLOSIAN; DICLOWAL;
DICSANAL; DIFENAC; DIGNOFENAC; DIRALON; DOCELL; DOGNOFENAC; DOLOBASAN; DOLOTREN;
DOLOVISANO-DICLO; DONJUST-A; DORAGON; DURAVOLTEN; ECOFENAC; EFFEKTON; EVINOPON;
FELORAN; FENACIDON; FENAMED; FENOFLAM; FENYTAREN; FLAMERIL; FLEFARMINA; FLEFARMINE;
FLOGOFENAC; FORGENAC; GAUTELIN; GP-45840; GROFENAC; GROSALGEN; HIZEMIN; IMBUN;
INFLAMAC; INFLANAC; INFLAREN; IRINATOLON; JAVIPREN; KLAST; KRIPLEX; LINOBOL; MAGLUFEN;
MAGLUPHEN; MILNAC; MIYADREN; MONOFLAM; MP-DICLOFENAC; MYOGIT; N-RHEUMAVINCIN;
NABOAL; NABOAL-SR; NACLOF; NAKLOFEN; NERIODIN; NIFLERIEL; NOVAPIRINA; OLFEN; OLFEN-GEL;
OLPHEN; ORTOPHEN; OXA; PANAMOR; PARSAL; PENNSAID; PENTIATE; POLTAJEN; PRIMOFENAC;
PROPHENATIN; REMETHAN; REOXEL; REWODINA; RHEUFENAC; RHEUMAREN; RHEUMASAN-D;
RHEUMAVEK; RHEUMAVINCIN; RHEUMAVINCIN-N; RHUMALGAN; RUVOMINOX; SAFFRAC; SANNAX;
SAVISMIN; SEECOREN; SGESTONE; SHIGNOL; SILINO; SODIUM DICLOFENAC; SODIUM-DICLOFENAC;
SOFARIN; SOLARAZE; SORELMON; SR-318; SR-318A; SR-318B; TAKS; THICATAREN; TORYXIL; TP-318;
TRABONA; TRATUL; TSUDOHMIN; URIGON; VALETAN; VERAL; VERICE; VIAVOX; VILONIT; VOLDAL;
VOLFENAC; VOLMAGEN; VOLRAMAN; VOLTAREN; VOLTAREN-QS; VOLTAREN-SR; VOLTARENE;
VOLTARN-EMULGEL; VOLTAROL; VOLTAROL-EMULGEL; VOLTAROL-OPHTHA; VONAFEC; VOREN;
VOTAXIL; VURDON; XENID; YOUFENAC
AND OF COURSE
• 2-(2-((2,6-dichlorophenyl)-amino)-phenyl)-acetic acid
• 2-(2-((2,6-dichlorophenyl)-amino)-phenyl)-ethanoic acid
[Structure diagram of diclofenac]
SOLUTION: DWPI CHEMISTRY RESOURCE (DCR)
• This is a database of specific chemical substances mentioned in patents
• Substances are also organised into families of closely related compounds as follows
– basic compound
– salts, isotopes, mixtures, isomers
• Substance records include structure diagrams and substance data, e.g.
– IUPAC name, synonyms
– molecular formula, molecular weight
• The DCR numbers are associated with the relevant fragmentation codes for the substance, so they can be searched in conjunction with non-structural fragmentation codes if desired
• They also have roles associated with them (e.g. produced, detected), so you can limit your answers by the role of the compound
BENEFITS OF DWPI INDEXING - REAL EXAMPLE
• Searching on Diclofenac or its most common synonyms (Voltarol or Voltaren) using keywords in the DWPI title & abstract finds 2748 documents
• Searching on Diclofenac via its DCR record finds 2473 records
• 414 of these were not found by the keyword search
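The three counts above also imply how far the two result sets overlap. A minimal sketch of that arithmetic follows. Only the three input counts come from the slide; the derived figures and variable names are illustrative, and they assume both searches were run over the same set of DWPI records.

```python
# Counts quoted on the slide.
keyword_hits = 2748  # keyword search in DWPI title & abstract
dcr_hits = 2473      # search via the diclofenac DCR record
dcr_only = 414       # DCR hits that the keyword search did not find

# Derived by simple set arithmetic (not stated on the slide).
found_by_both = dcr_hits - dcr_only          # hits common to both searches
keyword_only = keyword_hits - found_by_both  # hits found only by keywords
combined = keyword_hits + dcr_only           # hits found by either search

print(found_by_both, keyword_only, combined)
```

Under those assumptions, combining the two approaches retrieves more records than either search alone.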
SOME INVENTIONS FOUND ONLY BY THE KEYWORD SEARCH ARE LESS RELEVANT
BUT THE ONES FOUND BY DCR ARE HIGHLY RELEVANT
• Multicomponent crystals useful e.g. for treating and preventing acute and chronic pain comprise (2-amino-6-(4-fluoro-benzylamino)-pyridin-3-yl)-carbamic acid ethyl ester and 2-(2-((2,6-dichlorophenyl)-amino)-phenyl)-acetic acid
NOVELTY – Multi-component crystals comprise (2-amino-6-(4-fluoro-benzylamino)-pyridin-3-yl)-carbamic acid ethyl ester and 2-(2-((2,6-dichlorophenyl)-amino)-phenyl)-acetic acid.
ADVANTAGE - The multicomponent crystals are stable and easy to formulate; exhibit physicochemical properties which influence e.g. solubility, stability, hygroscopicity, handling and tabletting; and do not exhibit typical problems of physical mixtures, i.e. different bioavailability or decomposition during production.
DCR COVERAGE
• DCR records are only created for patents that are classified in at least one of the following CPI sections
– B (Pharmaceuticals)
– C (Agrochemicals)
– E (General Chemistry)
• In addition, existing DCR records are cited when the substances they relate to are mentioned in the DWPI abstracts for patents classified in Sections D, F, G, J and K
• DCR numbers are auto-generated from the specific compound codes in polymer indexing and added to the indexing
DCR COVERAGE BY COMPOUND TYPE
• Ordinary organic compounds (eg ethanol, ibuprofen)
• Inorganic compounds (eg sodium chloride, ammonia)
• Complexes and organometallics (eg ferrocene, copper phthalocyanine, diethyl magnesium bromide)
• Peptides with 10 or fewer repeat units
• Proteins and other natural polymers with well-defined names*
• Synthetic polymers from a standard list of around 340 commonly occurring ones
• Plant, animal & microbial extracts*
* These records do not contain structures
WHAT IS NOT COVERED
• Generic classes of compounds
– These are covered by other forms of chemical structure indexing in DWPI, eg fragmentation coding
• Synthetic polymers other than the ones in the predefined list of around 340
– These are covered by polymer indexing
• Any compound of ambiguous structure
– This could be those with ill-defined ratios of ions or components
– Or ones with ambiguous names where we cannot be sure of the correct structure
DCR COVERAGE BY ROLE IN THE INVENTION
• Compounds are indexed in DCR when they meet the following criteria
– All compounds stated to be new, including new intermediates
– Compounds produced by the inventive process
– Compounds purified, removed, or detected by the claimed process
– Catalysts
– Detecting agents & purifying agents
– Starting materials and reagents
DCR COVERAGE RULES
• The following are selected for DCR indexing
– All claimed compounds up to a maximum of ~99. This number is reduced if a Markush is also needed. (The maximum number of DCR + Markush records is 99 because the antiquated "subscriber" feed to hosts only allows a 2-digit record number)
– At least 1 example, which should be the best example illustrating the invention (usually the one in the abstract). In many cases this one is also claimed
– Further examples are input at the analyst's discretion, but more should be selected if only a few compounds are claimed and there are examples which are structurally dissimilar from those claimed but still representative of the Markush
– Selected examples should be "real", not prophetic, i.e. they should have supporting data such as preparative data, activity data etc.
– Compounds from the disclosure can be indexed at the analyst's discretion. Usually this happens if there are no (or few) claimed or exemplified compounds, or if there are novel disclosed compounds that are not claimed (these must have supporting data if they are novel)
WHEN MORE THAN 99 COMPOUNDS ARE CLAIMED
• The ones selected must include the best example
• Others are chosen so as to reflect the full structural diversity of the complete set
– At least one example with each different type of ring system present
– At least one example of each type of functional group present
– Where possible, different substitution patterns on a ring system are also covered
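The selection described above is made by analysts, not by software, but the idea of covering every ring system and functional group within a fixed budget can be sketched as a greedy choice. Everything in this sketch is an assumption made for illustration: the Compound type, the feature labels such as "ring:pyridine", and the select_diverse function are hypothetical and do not describe the actual DCR workflow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    name: str
    features: frozenset  # hypothetical labels, e.g. "ring:pyridine", "group:ester"


def select_diverse(compounds, best, limit=99):
    """Keep the best example, then greedily add whichever compound
    contributes the most ring systems / functional groups not yet covered."""
    selected = [best]
    covered = set(best.features)
    remaining = [c for c in compounds if c != best]
    while remaining and len(selected) < limit:
        candidate = max(remaining, key=lambda c: len(c.features - covered))
        if not candidate.features - covered:
            break  # every feature in the claimed set is already represented
        selected.append(candidate)
        covered |= candidate.features
        remaining.remove(candidate)
    return selected


claimed = [
    Compound("A", frozenset({"ring:benzene", "group:ester"})),
    Compound("B", frozenset({"ring:benzene", "group:ester"})),
    Compound("C", frozenset({"ring:pyridine", "group:amide"})),
    Compound("D", frozenset({"ring:pyridine", "group:ester"})),
]
picked = select_diverse(claimed, best=claimed[0], limit=3)
print([c.name for c in picked])
```

In this toy set, compound C is added because it brings a new ring system and a new functional group, while B and D add nothing that is not already represented. Substitution patterns could be handled by adding further labels.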
DCR COVERAGE
• Pharmaceutical (B), agrochemical (C) and general
chemical (E) patent records
• Comprehensive coverage from 4/1999*
• Selective coverage for approximately
– 20,000 substances from 1/1987 to date
– 2,100 substances from 7/1981 to date
* Except Japanese patents, which are covered from 9/2000 onwards.
DCR STRUCTURE DRAWING CONVENTIONS
• Simple organic structures are drawn using the CAS rules
• Structures are represented as drawn in the document unless they contain a clear error (e.g. 5 valent carbon atom)
• When drawing from a name, however, the following rules are applied
– Keto-enol tautomers are drawn in the keto form
– Sugars are drawn in the ring form
– Cyclic imines are converted into the ene-amine form whenever tautomerism is possible
[Structure diagrams: imine and ene-amine examples labelled "Leave as drawn" and "Preferred"]
SALTS
• Only one of each ion present is drawn, irrespective of how many are present
• Inorganic cations are always shown charged
• Organic cations which are produced by protonation of a base are shown as the uncharged base
• True onium cations (e.g. tetramethylammonium) are shown with the charge on the central atom
• Organic anions produced by deprotonation of an organic acid are always represented as the free acid
• Simple inorganic anions (e.g. nitrate, sulphate, chloride) are shown charged as long as the cation is shown charged
• Tetraorganyl borane type anions are always shown with the charge on the central atom
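The salt rules above can be read as a small lookup keyed on the kind of ion. The sketch below is one way to write that down. The category names and return strings are made up for illustration, and the "uncharged" result for a simple inorganic anion paired with a protonated base is an inference: the slide only states the charged case.

```python
# How each kind of cation is drawn, paraphrased from the slide.
CATION_FORM = {
    "inorganic": "charged",
    "protonated_base": "uncharged base",
    "onium": "charge on central atom",
}


def anion_form(anion_kind, cation_kind):
    """Return how the anion is drawn, given the kind of cation it is paired with."""
    if anion_kind == "deprotonated_organic_acid":
        return "free acid"
    if anion_kind == "tetraorganyl_borane":
        return "charge on central atom"
    if anion_kind == "simple_inorganic":
        cation_charged = CATION_FORM[cation_kind] != "uncharged base"
        # Only the charged case is stated on the slide; "uncharged" is inferred.
        return "charged" if cation_charged else "uncharged"
    raise ValueError(f"unknown anion kind: {anion_kind}")


def salt_forms(cation_kind, anion_kind):
    """Return (cation form, anion form) for one salt."""
    return CATION_FORM[cation_kind], anion_form(anion_kind, cation_kind)


# Sodium 3-nitrobenzenesulphonate: inorganic cation, organic acid anion.
print(salt_forms("inorganic", "deprotonated_organic_acid"))
# Trimethylammonium chloride: protonated base, simple inorganic anion.
print(salt_forms("protonated_base", "simple_inorganic"))
# N-methylpyridinium chloride: true onium cation, simple inorganic anion.
print(salt_forms("onium", "simple_inorganic"))
```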
EXAMPLES OF SALTS
[Structure diagrams: ferric acetate; sodium 3-nitrobenzenesulphonate; trimethylammonium chloride; N-methylpyridinium chloride; lithium tetraphenyl borane; tetramethylammonium benzoate]
ORGANOMETALLIC STRUCTURES
• Metal-carbon σ-bonds are shown as single covalent bonds
• All other bonds between metals and organic moieties are shown disconnected
[Structure diagrams: n-propyl lithium; phenylmagnesium bromide; manganese acetylacetonate]
COMPLEX INORGANIC ANIONS
• Oxoanions of metals are shown with the metal and O atoms disconnected
• Charges are shown if these can be worked out
• If not, elements are shown zero valent
• Silicates, borosilicates, heteropolyacid anions etc. are shown with each element listed and no bonding (all elements zero valent)
[Structure diagrams: sodium aluminate; magnesium borosilicate]
THE DCR RECORD
MEANS OF SEARCHING THE DATABASE
• The DCR database can be searched using the following options
– Chemical structure
– Chemical name
– Molecular formula
– Elements present
– Substance descriptor (only for those classes of structures that are hard to define in terms of a structural query, such as terpenes or alkaloids)
CHEMICAL NAME INFORMATION
• Each DCR record has a preferred chemical name
– This would be the approved name for a drug
– For other substances we use the name it is most commonly known by
– We do not normally use the systematic name unless it is short (eg Phenol or Benzoic acid would be fine)
• The systematic name appears in a separate field
• Finally, we include all the known synonyms we come across in patents for this substance
– Includes brand names and trial prep codes for drugs
– This means you can search by any known name and still find the record
MOLECULAR FORMULA INFORMATION
• The molecular formula in DCR is always presented as follows
– Carbon atoms first, followed by hydrogen atoms, followed by any other elements listed in alphabetical order
– In multi-component structures the formula for each component is listed separately
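The ordering rule for molecular formulae can be sketched as a small helper. This is an illustration only: the function names are hypothetical, the slide does not say how carbon-free compounds are ordered (the sketch simply applies the same rule to them), and treating a sodium salt as a free acid plus a sodium component is an assumption.

```python
def dcr_style_formula(counts):
    """Format element counts with C first, then H, then the remaining
    elements in alphabetical order of their symbols."""
    order = [e for e in ("C", "H") if e in counts]
    order += sorted(e for e in counts if e not in ("C", "H"))
    return "".join(f"{e}{counts[e] if counts[e] > 1 else ''}" for e in order)


def component_formulae(components):
    """One formula per component, kept separate for multi-component structures."""
    return [dcr_style_formula(c) for c in components]


diclofenac = {"O": 2, "N": 1, "Cl": 2, "H": 11, "C": 14}
print(dcr_style_formula(diclofenac))
print(component_formulae([diclofenac, {"Na": 1}]))
```

For diclofenac the helper gives C14H11Cl2NO2, with Cl, N and O following C and H in alphabetical order.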
SEARCHING BY MOLECULAR FORMULA
• In the molecular formula field (/MF) you can search the exact formula
– Eg C6H6/MF
• In the element symbol field (/ELS) you can search for the presence of particular elements in the molecular formula
– Eg Cl/ELS AND Br/ELS AND O/ELS to find compounds containing Cl, Br AND O
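The two query forms shown on this slide are simple enough to assemble programmatically. The helpers below are a hypothetical convenience, not part of STN. They only reproduce the /MF and /ELS strings given as examples above and do no validation of the formula or the element symbols.

```python
def mf_query(formula):
    """Exact molecular formula search, e.g. C6H6/MF."""
    return f"{formula}/MF"


def els_query(elements):
    """Require every listed element to be present, e.g. Cl/ELS AND Br/ELS."""
    return " AND ".join(f"{element}/ELS" for element in elements)


print(mf_query("C6H6"))
print(els_query(["Cl", "Br", "O"]))
```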
SUBSTANCE DESCRIPTORS
• Certain classes of substance have special keywords applied to them, called substance descriptors
• These are applied to every substance that is of that type. In addition, a blank record called a substance descriptor record is indexed for any patent which refers to this class of substance without giving a specific example
• To search for these use the /SD field
– Eg metallocenes/SD
– This will retrieve all DCR records to which the metallocenes substance descriptor has been applied, including the substance descriptor record
COMPLETE LIST OF SUBSTANCE DESCRIPTORS
• ALKALOIDS
• ALLOYS
• ANTHRACYCLINES
• ANTIBODIES
• BARBITURATES
• BENZODIAZEPINES
• BETA LACTAMS
• BORANES
• CARBOHYDRATES
– glycoproteins
– polysaccharides
– cyclodextrins
• CARBORANES
• CROWN ETHERS
• CYCLIC PEPTIDES see PEPTIDES
• CYCLODEXTRINS see CARBOHYDRATES
• DENDRIMERS
• ENZYME see PROTEINS
• FATTY ACID see also UNSATURATED FATTY ACIDS
• FLAVONOIDS
• FULLERENES
• GLYCOPROTEINS see CARBOHYDRATES and PROTEINS
• HALOCARBONS
• HETEROFULLERENES
• HETEROPOLY ACIDS
• LIPOPROTEINS
• METALLOCENES
• NOBLE GASES
• NUCLEOSIDES
• NUCLEOTIDES
• OLIGONUCLEOTIDE
• OTHER NATURAL PRODUCTS
• PEPTIDES
– cyclic peptides
• PHOSPHOLIPIDS
• POLYMERS
• POLYSACCHARIDES see CARBOHYDRATES
• PROSTAGLANDINS
• PROTEINS
– enzymes
– glycoproteins
• RETINOIDS
• SAPONINS
• SILICONES
• STEROIDS see SAPONINS
• TAXANES
• TERPENES
• TETRACYCLINES
• UNSATURATED FATTY ACIDS see also FATTY ACIDS
• ZEOLITES
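The "see" entries in this list point from one term to the descriptor (or descriptors) filed under another heading. A minimal sketch of following those pointers is given below. The mapping is copied from the list above. The function name is hypothetical, it does not check that a term is actually in the list, and how two targets should be combined in a query is left open because the slides do not say.

```python
# "see" references copied from the descriptor list: entry -> descriptor(s) to use.
SEE = {
    "CYCLIC PEPTIDES": ["PEPTIDES"],
    "CYCLODEXTRINS": ["CARBOHYDRATES"],
    "ENZYME": ["PROTEINS"],
    "GLYCOPROTEINS": ["CARBOHYDRATES", "PROTEINS"],
    "POLYSACCHARIDES": ["CARBOHYDRATES"],
    "STEROIDS": ["SAPONINS"],
}


def preferred_descriptors(term):
    """Follow a "see" reference if one exists; otherwise return the term itself."""
    key = term.strip().upper()
    return SEE.get(key, [key])


print(preferred_descriptors("enzyme"))
print(preferred_descriptors("Glycoproteins"))
print(preferred_descriptors("terpenes"))
```

Each returned descriptor can then be searched in the /SD field in the same way as the metallocenes/SD example.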
TAKING THE SEARCH FURTHER
• Having searched to find all the substance descriptor records, you can refine your search in a number of ways
• Refine by elements present, eg find metallocenes containing zirconium as follows
– Metallocenes/sd AND zr/els
• Refine the patent results you find by technology area
– For example, look up all records related to terpenes, find the corresponding DWPI records and then search by antiinflammatory in the activity field to find terpenes used in antiinflammatory compositions
THANK YOU FOR LISTENING
• Any questions please contact
• [email protected]
• Tel +44 0207 433 4656
• www.thomsonscientific.com
DCR on new STN
Agenda
• Search fields in DCR
– Explore database index fields
• STN Classic users – analogous to EXPAND, EXPAND, EXPAND!
• Structure search
• Find DWPI records with DCR index terms
• Rename project
New STN search screen
Query screen
Results screen
History screen
Select database of interest
Query screen
Access search indices
Explore Search Indices or Thesaurus for Terms
Chemical Name vs. Chemical Name Segment indices
• Chemical Name (CN) = Complete Name
‒ Phrase parsed
‒ Alphabetical listing in index
• Chemical Name Segment (CNS)
‒ Word parsed
• Searching by Chemical Name Segment may capture
additional relevant records
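The difference between a phrase-parsed and a word-parsed index can be shown with a toy example. This is a sketch under stated assumptions: the record identifiers are invented, the names are taken from the diclofenac synonym list earlier in this transcript, and splitting on non-alphanumeric characters is only a guess at how segments are formed, since the slides do not describe STN's actual parsing rules.

```python
import re
from collections import defaultdict


def build_indices(records):
    """records: dict mapping a record id to its list of chemical names."""
    cn = defaultdict(set)   # phrase parsed: the complete name is one index term
    cns = defaultdict(set)  # word parsed: each segment of a name is an index term
    for rec_id, names in records.items():
        for name in names:
            cn[name.upper()].add(rec_id)
            for segment in re.findall(r"[A-Z0-9]+", name.upper()):
                cns[segment].add(rec_id)
    return cn, cns


records = {
    "rec-1": ["DICLOFENAC"],
    "rec-2": ["DICLOFENAC SODIUM"],
    "rec-3": ["SODIUM-DICLOFENAC"],
}
cn, cns = build_indices(records)
print(sorted(cn["DICLOFENAC"]))   # records whose complete name is DICLOFENAC
print(sorted(cns["DICLOFENAC"]))  # records with DICLOFENAC as any segment
```

The complete-name lookup matches one record, while the segment lookup matches all three, which is the sense in which a segment search may capture additional relevant records.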
Note: All selected chemical names are added to search query
and are ORed together. This list is specific to this project only.
Submit search query; results are automatically displayed
Search by chemical name segment
Additional compounds found by using Chemical Name Segment
Search by molecular formula
Search by molecular formula (cont.)
Naming the Q term list
Submit Q term list search; results automatically displayed
These first compounds have the correct molecular formula, but are not pantoprazole derivatives.
Note: Unlike the L1 list, which is only available for this project, Q31 is available for all projects associated with this STN ID.
Structure search
Structure search: valency checked automatically
Submit sub-structure search
Notice the highlighting illustrating the match with the drawn structure.
Get complete DCR record by clicking on DCR number
Complete DCR record
Click on X to close the detailed DCR record and return to the complete set.
Get DWPI records with DCR compounds indexed
Get DWPI records with DCR compounds indexed (cont.)
DWPI records that have DCR compounds indexed
Clicking on the Get References button automatically opens DWPI and executes a refx search.
Click on a DWPI record title to display the complete DWPI record.
Complete DWPI record
Complete DWPI record (cont.)
The complete DWPI record includes DCR hit structures in a separate field (DCR Hits). Notice the highlighting in the structure corresponding to the original structure search. All DCRs indexed for this record are in the IT field (not shown).
Search for DWPI records with selected DCRs indexed
If only certain DCRs are of interest, check the boxes for those DCR records and then click on Get References.
Search for DWPI records with selected DCRs indexed (cont.)
Complete DWPI record for selected DCRs. Click on the DWPI title to display the complete record.
Rename Project
Click on the upside-down triangle to rename the project.
Summary
• Search fields in DCR
– Explore database index fields
• STN Classic users – analogous to EXPAND, EXPAND, EXPAND!
– Search examples using database index fields
• Structure search
• Find DWPI records with DCR index terms
• Rename project
For more information …
CAS
[email protected]
Support and Training:
www.cas.org
FIZ Karlsruhe
[email protected]
Support and Training:
www.stn-international.de