Challenges Representing
Phenotype in Pharmacogenomics
Tina Hernandez-Boussard
Understanding how genetic variation leads to variation in
responses to drugs
A promise from the Genome
Personalized Medicine
– Making drug use effective and safe
based on a person’s specific
Pharmacogenomics Flow
PharmGKB: Capturing knowledge
to catalyze pharmacogenomics research
PharmGKB Core Contents
Mission: aggregate, integrate & annotate
pharmacogenomic data and knowledge
PharmGKB Knowledge
– Structured textual summaries of Very Important
Pharmacogenes and their key variants
– Graphical pathway representations built by
consensus, associated with literature evidence
and links to PharmGKB genes, drugs,
Literature Annotations
– PharmGKB curators create data entries that
associate genes with drugs and phenotypes,
based on an interpretation of the literature. They
encode these entries with controlled vocabularies.
Genetic Variation Complexity
Genetic variation and its relation to proteins is
“Gene” exists in the genome
“Gene variations” specify the existence of
– E.g. “There is A/C SNP at Golden Path X.”
– Haplotype variations = collection of simple variations
“Gene alleles” are specific variation options
– E.g. “One allele of the A/C SNP is A at GP X…”
– Haplotype alleles = collection of simple alleles
Genotypes are diploid alleles = “diplotypes”
ASSOCIATIONS can be described to all of
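The layers on this slide (variation, allele, haplotype, diplotype) can be sketched as a small data model. This is only an illustrative sketch: the class names, the `GP:X` / `GP:Y` position labels, and the haplotype names are hypothetical and do not reflect PharmGKB's actual schema.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Variation:
    """A site where variation exists, e.g. an A/C SNP at a genome position."""
    position: str              # hypothetical Golden Path coordinate label
    options: Tuple[str, ...]   # the bases that can occur at this site


@dataclass(frozen=True)
class Allele:
    """One specific option of a variation, e.g. 'A' at the A/C SNP."""
    variation: Variation
    base: str

    def __post_init__(self):
        # An allele must be one of the options its variation declares.
        if self.base not in self.variation.options:
            raise ValueError(
                f"{self.base} is not an option at {self.variation.position}"
            )


@dataclass(frozen=True)
class HaplotypeAllele:
    """A collection of simple alleles carried together."""
    name: str
    alleles: Tuple[Allele, ...]


@dataclass(frozen=True)
class Diplotype:
    """A genotype: the pair of haplotype alleles in a diploid individual."""
    first: HaplotypeAllele
    second: HaplotypeAllele


# Placeholder data only; none of these values come from a real gene.
snp1 = Variation("GP:X", ("A", "C"))
snp2 = Variation("GP:Y", ("G", "T"))
hap_a = HaplotypeAllele("hapA", (Allele(snp1, "A"), Allele(snp2, "G")))
hap_b = HaplotypeAllele("hapB", (Allele(snp1, "C"), Allele(snp2, "G")))
genotype = Diplotype(hap_a, hap_b)
```

Because each layer is its own type, an association could in principle point at a `Variation`, an `Allele`, a `HaplotypeAllele`, or a `Diplotype`.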
Genotype-Phenotype Relations
Knowledge about gene-drug-pheno
interactions comes at different levels of
1. Product of Gene X interacts with Drug Y (in pheno
Z)--in a physical sense
2. Variant of Gene X makes a difference in pheno Z for
Drug Y--in an association sense (can also be a
physical interaction, but that is with product)
3. Specific Allele of Variant of Gene X has a particular
effect on pheno Z for Drug Y--also in an association sense
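The three levels listed above can be sketched as a single record type whose level is derived from how much detail is filled in. The names `KnowledgeLevel` and `Association`, and the placeholder values such as `GENE_X`, are assumptions made for illustration, not PharmGKB structures.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class KnowledgeLevel(Enum):
    GENE_PRODUCT = 1  # product of Gene X interacts with Drug Y (physical sense)
    VARIANT = 2       # variant of Gene X makes a difference in pheno Z for Drug Y
    ALLELE = 3        # a specific allele has a particular effect on pheno Z


@dataclass(frozen=True)
class Association:
    gene: str
    drug: str
    phenotype: str
    variant: Optional[str] = None
    allele: Optional[str] = None

    def __post_init__(self):
        # An allele is an option of a variant, so it cannot stand alone.
        if self.allele is not None and self.variant is None:
            raise ValueError("an allele-level association needs its variant")

    @property
    def level(self) -> KnowledgeLevel:
        if self.allele is not None:
            return KnowledgeLevel.ALLELE
        if self.variant is not None:
            return KnowledgeLevel.VARIANT
        return KnowledgeLevel.GENE_PRODUCT


# Placeholder identifiers only.
level_1 = Association("GENE_X", "DRUG_Y", "PHENO_Z")
level_2 = Association("GENE_X", "DRUG_Y", "PHENO_Z", variant="VARIANT_1")
level_3 = Association("GENE_X", "DRUG_Y", "PHENO_Z",
                      variant="VARIANT_1", allele="A")
```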
Mosaic Challenge: Throughput &
Limited curatorial staff has many duties
Need methods to quickly identify
important knowledge and capture it in
computable form ONCE for multiple
With computable knowledge, can
generate displays appropriate for user
interests: pathways, VIP summaries,
literature summaries.
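As a rough sketch of "capture once, display many ways": one structured record can feed several renderers. The record fields, the function names, and the placeholder values are hypothetical and chosen only to illustrate the idea.

```python
# One curated record, entered once (placeholder values only).
record = {
    "gene": "GENE_X",
    "drug": "DRUG_Y",
    "phenotype": "PHENO_Z",
    "source": "LITERATURE_REF",
}


def literature_summary(rec):
    """Render the record as a one-line literature annotation."""
    return (f"{rec['source']}: {rec['gene']} is associated with "
            f"{rec['phenotype']} for {rec['drug']}")


def vip_summary_line(rec):
    """Render the same record as a line in a gene-centred summary."""
    return f"{rec['gene']}: affects {rec['drug']} ({rec['phenotype']})"


def pathway_edge(rec):
    """Render the same record as a drug-to-gene edge for a pathway diagram."""
    return (rec["drug"], rec["gene"])


views = {
    "literature": literature_summary(record),
    "vip": vip_summary_line(record),
    "pathway": pathway_edge(record),
}
```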
Goals for Representing
Knowledge in PharmGKB
Common platform for entering & curating
Pharmacogenomic knowledge = Protégé-based
– Pathways
– Very important pharmacogenes + variants
– Gene+variant-drug-phenotype associations
Structured entry for computability
– Standard vocabularies
– Automated linkages to existing data
• Genes, drugs, external resources
– Clear semantics
– Usable SOON
– Expandable ALWAYS
Vocabularies Currently Used
HGNC for genes
– Gene families?
MedDRA for adverse events
– Medical dictionary
MeSH for disease, symptoms
– Vocabulary
Gene Ontology for cellular location, molecular
function, cellular biological process
– Cell type vocabulary (MeSH for now)
– Chemical & drug vocabulary (MeSH for now)
• Switch to ChEBI for chemicals?
• Building drug dictionary @ PharmGKB
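The vocabulary assignments on this slide can be written down as a simple lookup so that every term is tagged with its source vocabulary. The concept-type keys and the `tag_term` helper are hypothetical names for illustration; only the vocabulary assignments come from the slide.

```python
# Concept type -> vocabulary, as listed on the slide.
VOCABULARY_BY_CONCEPT = {
    "gene": "HGNC",
    "adverse_event": "MedDRA",
    "disease": "MeSH",
    "symptom": "MeSH",
    "cellular_location": "Gene Ontology",
    "molecular_function": "Gene Ontology",
    "biological_process": "Gene Ontology",
    "cell_type": "MeSH",   # "for now", per the slide
    "chemical": "MeSH",    # "for now"; ChEBI is under consideration
    "drug": "MeSH",        # "for now"; a drug dictionary is being built
}


def tag_term(concept_type, term):
    """Return (vocabulary, term), rejecting concept types with no vocabulary."""
    try:
        vocabulary = VOCABULARY_BY_CONCEPT[concept_type]
    except KeyError:
        raise ValueError(f"no vocabulary assigned for {concept_type!r}") from None
    return (vocabulary, term)


tagged_gene = tag_term("gene", "GENE_X")  # placeholder term, not a real symbol
```

Keeping the mapping in one place means a later switch (for example, chemicals moving to ChEBI) would be a one-line change.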
Knowledge Templates
– Controlled vocabulary of objects
– Logical representation of relationships
– Statement of key “slots” to be filled using objects,
according to logic.
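A minimal sketch of a knowledge template, assuming a template is a named set of slots and each slot accepts only certain object types. `Slot`, `Template`, and the example slot names are hypothetical; the actual Protégé-based templates are not shown in the slides.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Slot:
    name: str
    allowed_types: Tuple[str, ...]  # object types this slot may hold
    required: bool = True


@dataclass(frozen=True)
class Template:
    name: str
    slots: Tuple[Slot, ...]

    def validate(self, filled: Dict[str, Tuple[str, str]]) -> List[str]:
        """Check a filled-in template.

        `filled` maps slot name -> (object_type, term).
        Returns a list of problems; an empty list means the entry is valid.
        """
        problems = []
        for slot in self.slots:
            if slot.name not in filled:
                if slot.required:
                    problems.append(f"missing slot: {slot.name}")
                continue
            object_type, _term = filled[slot.name]
            if object_type not in slot.allowed_types:
                problems.append(f"slot {slot.name} cannot hold a {object_type}")
        known = {slot.name for slot in self.slots}
        for name in filled:
            if name not in known:
                problems.append(f"unknown slot: {name}")
        return problems


association_template = Template(
    "gene-drug-phenotype association",
    (
        Slot("gene", ("gene",)),
        Slot("drug", ("drug", "chemical")),
        Slot("phenotype", ("disease", "symptom", "adverse_event")),
        Slot("variant", ("variant",), required=False),
    ),
)

# Placeholder entry with a deliberate mistake: a gene placed in the drug slot.
problems = association_template.validate(
    {"gene": ("gene", "GENE_X"), "drug": ("gene", "GENE_Y")}
)
```

Here `problems` reports the mistyped drug slot and the missing phenotype slot, which is the kind of check that structured entry makes possible.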
EXAMPLE: Pathway Knowledge
– Pathway Overview template, points to “Steps”
– Pathway Step templates for
Metabolism step (PK)
Transport step (PK)
Inhibition step (PD!)
Downstream phenotype step (PK & PD)
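The pathway overview and its step templates might be modelled as below, assuming each step kind carries the PK/PD labels given on the slide. The class names and the placeholder descriptions are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class StepKind(Enum):
    METABOLISM = "metabolism"
    TRANSPORT = "transport"
    INHIBITION = "inhibition"
    DOWNSTREAM_PHENOTYPE = "downstream phenotype"


# PK/PD labels per step kind, as given on the slide.
STEP_DOMAINS = {
    StepKind.METABOLISM: {"PK"},
    StepKind.TRANSPORT: {"PK"},
    StepKind.INHIBITION: {"PD"},
    StepKind.DOWNSTREAM_PHENOTYPE: {"PK", "PD"},
}


@dataclass
class PathwayStep:
    kind: StepKind
    description: str


@dataclass
class PathwayOverview:
    """The overview template: a named pathway that points to its steps."""
    name: str
    steps: List[PathwayStep] = field(default_factory=list)

    def steps_in_domain(self, domain: str) -> List[PathwayStep]:
        """Return the steps labelled with the given domain ('PK' or 'PD')."""
        return [s for s in self.steps if domain in STEP_DOMAINS[s.kind]]


# Placeholder pathway; descriptions are illustrative only.
pathway = PathwayOverview("DRUG_Y pathway", [
    PathwayStep(StepKind.TRANSPORT, "DRUG_Y transported by product of GENE_A"),
    PathwayStep(StepKind.METABOLISM, "DRUG_Y metabolized by product of GENE_B"),
    PathwayStep(StepKind.INHIBITION, "DRUG_Y inhibits product of GENE_C"),
    PathwayStep(StepKind.DOWNSTREAM_PHENOTYPE, "PHENO_Z observed"),
])
pk_steps = pathway.steps_in_domain("PK")
pd_steps = pathway.steps_in_domain("PD")
```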
Sample metabolism step
Sample Drug Interaction
Sample Phenotype Association
PharmGKB integrates, aggregates and
annotates data and knowledge to serve
the PGx research community
Deep, high quality genotype data
Phenotype data--mostly small studies,
some large ones in the pipeline.
Knowledge services include literature
curations, pathways, VIP gene summaries
Research efforts focus on creating pipeline
to improve efficiency and precision of
curated information
PharmGKB Team
Questions? Thanks.