Machine Learning Misconceptions
May 3rd, 2017
© 2016 Health Catalyst
Proprietary and Confidential
Data Science Team
• Levi Thatcher, PhD, Director of Data Science
• Mike Mastanduno, PhD, Data Scientist
• Taylor Miller, PharmD, Data Scientist
• Taylor Larsen, MS, Data Science Engineer
Purpose of Today’s Chat
• Compare and contrast machine learning and artificial intelligence.
• Discuss techniques that offer feedback into the system and when it’s
necessary to retrain a model.
• Give advice on how to avoid common pitfalls in machine learning
implementation.
• Talk about potential applications of the different classes of machine
learning techniques.
• Q&A
Machine Learning Definition
Machine learning is the subfield of computer science that gives computers the ability to learn
without being explicitly programmed. Such algorithms overcome following strictly static program
instructions by making data-driven predictions or decisions through building a model from sample
inputs.
- Wikipedia
Machine Learning Typical Use
• Movie recommendations on Netflix
• People you may know on Facebook
• Advertising
• Patient likelihood of contracting sepsis, being readmitted…
• Using any tabular data source to predict a Y/N or continuous outcome
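As a rough illustration of the last bullet, the sketch below uses one small table to predict both a Y/N outcome and a continuous one. The rows, the column meanings (age, prior admissions, readmitted, length of stay) and the choice of a k-nearest-neighbors model are all assumptions made for the example, not something taken from the talk.

```python
from collections import Counter
from math import dist

# Hypothetical rows: (age, prior_admissions), readmitted Y/N, length of stay in days.
rows = [
    ((45, 0), "N", 2.0),
    ((50, 1), "N", 3.0),
    ((72, 3), "Y", 7.0),
    ((80, 4), "Y", 9.0),
    ((68, 2), "Y", 6.0),
]

def nearest(features, k=3):
    """Return the k training rows closest to `features` (Euclidean distance)."""
    return sorted(rows, key=lambda r: dist(r[0], features))[:k]

def predict_yes_no(features, k=3):
    """Classification: majority vote over the Y/N labels of the neighbors."""
    votes = Counter(label for _, label, _ in nearest(features, k))
    return votes.most_common(1)[0][0]

def predict_continuous(features, k=3):
    """Regression: mean of the neighbors' continuous outcome."""
    values = [value for _, _, value in nearest(features, k)]
    return sum(values) / len(values)

new_patient = (75, 3)
readmit_guess = predict_yes_no(new_patient)
stay_guess = predict_continuous(new_patient)
```

The same feature columns feed both predictions; only the target column and the way neighbors are combined (vote versus average) change.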
Artificial Intelligence Definition
Artificial intelligence (AI) is intelligence exhibited by machines. In computer science, the field of AI
research defines itself as the study of "intelligent agents": any device that perceives its environment
and takes actions that maximize its chance of success at some goal.
- Wikipedia
These models are limited in their ability to “reason”, i.e. to carry out long chains of inferences, or
optimization procedure to arrive at an answer. The number of steps in a computation is limited by
the number of layers in feed-forward nets, and by the length of time a recurrent net will remember
things.
- Yann LeCun, Director of Facebook AI Research
Artificial Intelligence Typical Use
• Speech translation
• Complex game playing
• Self-driving cars
• Content delivery
• Radiology?
Difference Between ML and AI
• It’s fuzzy
• Is it learning from data? No, not really.
• Is it continuous learning from data? No, not really.
• AI feels more complicated.
• AI should be able to learn a skill and generalize it to
another entirely different thing.
• Many AI ideas get rebranded as ML as time goes on and
we understand them.
Poll #1: Have you ever used machine learning or AI?
148 respondents
• Yes, in my daily work – 21%
• Yes, as a hobby – 17%
• No, but I plan to – 52%
• No, not applicable – 9%
How is machine learning used?
Poll #2: Where is your organization in terms of using
machine learning in regular operations?
138 respondents
• Using machine learning tools daily across many departments and use
cases – 13%
• Daily across a couple of use cases – 17%
• Confined to a research study or two – 49%
• What is machine learning? – 21%
When does a model learn?
• Different algorithms learn at different times
• Only during training
  • Logistic regression
  • Random forest
  • Clustering
• Periodically after new data comes in
  • Any of the above (but more complex implementation)
  • Naïve Bayes
  • Neural networks
  • Deep learning
• Continuously as new data comes in
  • Any of the above (but still more complex implementation)
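One way to picture the "periodically after new data comes in" case is a model whose state is just a set of counts, so a new batch can be folded in without starting over. The sketch below is a minimal categorical naïve Bayes written for this purpose; the class name `CountingNaiveBayes`, the binary features and the toy batches are all hypothetical.

```python
class CountingNaiveBayes:
    """Categorical naive Bayes whose state is only counts, so it can be updated in place."""

    def __init__(self):
        self.class_counts = {}
        self.feature_counts = {}  # label -> {(feature_index, value): count}

    def update(self, rows, labels):
        """Fold a new batch into the existing counts (no full retrain needed)."""
        for row, label in zip(rows, labels):
            self.class_counts[label] = self.class_counts.get(label, 0) + 1
            counts = self.feature_counts.setdefault(label, {})
            for i, value in enumerate(row):
                counts[(i, value)] = counts.get((i, value), 0) + 1

    def predict_proba(self, row):
        """Return P(label | row); add-one smoothing assumes binary feature values."""
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            score = n / total
            for i, value in enumerate(row):
                score *= (self.feature_counts[label].get((i, value), 0) + 1) / (n + 2)
            scores[label] = score
        norm = sum(scores.values())
        return {label: s / norm for label, s in scores.items()}

# Hypothetical batches of (feature_a, feature_b) rows with Y/N labels.
batch_1 = ([(1, 0), (0, 0)], ["Y", "N"])
batch_2 = ([(1, 1), (0, 1)], ["Y", "N"])

incremental = CountingNaiveBayes()
incremental.update(*batch_1)   # initial training
incremental.update(*batch_2)   # later, when new data arrives

one_shot = CountingNaiveBayes()
one_shot.update(batch_1[0] + batch_2[0], batch_1[1] + batch_2[1])
```

Here the incrementally updated model ends up with the same counts as one trained on all the data at once. Algorithms such as random forests do not have this property out of the box, which is one reason their periodic updating needs a more complex implementation.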
When should a model be retrained?
• After significant data turnover
• If performance in production drops over time
•
Seasonality
•
Changing treatment methods
• If new features or techniques are identified
• If the use case changes
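The "performance in production drops over time" trigger could be checked with something as simple as the sketch below. The function name `should_retrain`, the use of AUC as the monitored metric, the 0.05 tolerance and the three-check window are illustrative assumptions; real thresholds depend on the use case.

```python
from statistics import mean

def should_retrain(baseline_auc, recent_aucs, max_drop=0.05, window=3):
    """Flag retraining when the mean AUC of the last `window` production checks
    falls more than `max_drop` below the AUC measured at deployment."""
    if len(recent_aucs) < window:
        return False  # not enough production evidence yet
    return baseline_auc - mean(recent_aucs[-window:]) > max_drop

stable = should_retrain(0.80, [0.79, 0.78, 0.80])
drifting = should_retrain(0.80, [0.79, 0.74, 0.72, 0.70])
```

Averaging over a window rather than reacting to a single check is meant to keep seasonal noise from triggering a retrain on its own.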
Pitfall 1: Poorly Defined Use Case
• Leads to:
  • Incorrect usage of data fields
  • Unavailable data
  • No adoption
• Use case is always the first priority
  • What is the question?
  • Who are the users?
  • When are they using it?
  • How are they using it?
Pitfall 2: Production Environment is Different
• Data might not be available
• Timing of data might lead to target leakage
• Predictions are made multiple times per patient
• Learn how your data is populated over time
• Only train with what’s available at the time of prediction
• Know your use case!
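To make "only train with what's available at the time of prediction" concrete, the sketch below keeps a populated-at timestamp next to each field and drops anything filled in after the prediction time. The record layout, field names and timestamps are hypothetical; a real warehouse would need its own way of recovering when each value was written.

```python
from datetime import datetime

# Hypothetical record: each field carries the time it was populated.
encounter = {
    "age": (67, datetime(2017, 5, 1, 8, 0)),
    "admit_lactate": (2.9, datetime(2017, 5, 1, 9, 30)),
    "discharge_diagnosis": ("sepsis", datetime(2017, 5, 6, 14, 0)),
}

def features_available_at(record, prediction_time):
    """Keep only the fields that were already populated at prediction time."""
    return {
        name: value
        for name, (value, populated_at) in record.items()
        if populated_at <= prediction_time
    }

# Prediction is made at noon on the day of admission.
features = features_available_at(encounter, datetime(2017, 5, 1, 12, 0))
```

The discharge diagnosis is excluded because it is written days after the prediction would be made; training on it would leak the target into the features.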
Pitfall 3: Bad Performance Metrics
• 99% accurate, but didn’t find any sick people
• Imbalanced classes
• Performance changing over time
• AUC or Precision-Recall
• Sampling methods during model training
• Monitor correct performance metric over time
• Know your use case!
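The "99% accurate, but didn't find any sick people" situation can be reproduced in a few lines. The cohort below (1 sick patient in 100) and the always-"healthy" model are made up for illustration, and the metric functions are plain-Python sketches rather than any particular library's implementation.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred):
    """Fraction of truly sick patients that the model actually flagged."""
    flagged = [p for t, p in zip(y_true, y_pred) if t == 1]
    return sum(flagged) / len(flagged)

def auc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical cohort: 1 sick patient in 100; the model calls everyone healthy.
y_true = [1] + [0] * 99
y_pred = [0] * 100
scores = [0.1] * 100  # the same risk score for every patient

print(accuracy(y_true, y_pred))  # 0.99
print(recall(y_true, y_pred))    # 0.0
print(auc(y_true, scores))       # 0.5
```

Accuracy looks excellent only because the classes are imbalanced; recall and AUC show that the model has no ability to separate sick from healthy patients.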
Pitfall 4: Poor Adoption
• Do people know about it?
• Is it answering a relevant question?
• Is visualization done well?
• Do people trust the model?
• Tell people about it
• Know the use case
• Simple is better, shouldn’t affect workflow
• Improve trust with prediction explanations or transparent models
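One simple form of "prediction explanations" with a transparent model is to show each feature's contribution to the score. The sketch below assumes a logistic regression with made-up weights and feature names (`prior_admissions`, `age_over_65`, `on_anticoagulant`); none of these values come from a real model.

```python
from math import exp

# Hypothetical logistic-regression weights; not from any real model.
INTERCEPT = -3.0
WEIGHTS = {"prior_admissions": 0.8, "age_over_65": 0.5, "on_anticoagulant": 0.3}

def explain(patient):
    """Return the predicted risk and each feature's contribution to the log-odds,
    ordered from largest to smallest absolute contribution."""
    contributions = {name: WEIGHTS[name] * patient[name] for name in WEIGHTS}
    log_odds = INTERCEPT + sum(contributions.values())
    risk = 1 / (1 + exp(-log_odds))
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return risk, ranked

risk, top_factors = explain(
    {"prior_admissions": 3, "age_over_65": 1, "on_anticoagulant": 0}
)
```

Showing the top one or two factors next to the risk score gives clinicians a reason for the number without changing their workflow.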
Poll #3: What’s impeding you from moving forward
with machine learning in your organization?
116 respondents
• Available tools are overwhelming OR don’t know what exists – 16%
• Use cases are overwhelming OR don’t know what’s possible – 28%
• Don’t have or can’t afford the technical staff to implement – 23%
• Adoption—clinical team isn’t interested – 9%
• Other – 25%
Potential Applications: ML and EMR
• Clinical
  • Risk scores – readmissions, mortality
  • Risk-adjusted comparisons
  • Replacing clinical rulesets
  • Correct coding
• Operational
  • Staff need forecasting
  • Length of stay prediction
• Financial
  • Propensity to pay
  • Predicted procedure cost
Potential Applications: NLP or Smarter Analytics
• Parsing clinical notes
• Fill in discrete text fields automatically
• Find new features that only come up in
conversation
• Smart retrospective analysis
• Trend analysis
• Exploration across the whole EMR
• Serve up insights automatically
Potential Applications: Image Processing
• Diagnostics of pre-segmented suspicious
regions
• Automatic segmentation of tissue types
• Diagnosis or staging of screening images
• Diagnosis or staging of pathology slides
Poll #4: What’s the most valuable use for ML/AI/Big
Data to your organization?
95 respondents
• Parsing free-form clinical notes – 14%
• Image interpretation – 5%
• Clinical risk scores – 47%
• Operational efficiency – 29%
• These are buzzwords and not worth the time. – 4%
Poll #5: If there was an algorithm that was FDA
approved and read mammographic images on par
with a radiologist, would you use it?
90 respondents
• Yes, I’d trust it completely – 16%
• Yes, but only as an aid to the radiologist – 81%
• No, I wouldn’t trust it – 3%
Before we end…
Questions?