PhUSE 2015 – PD05
Working as a data scientist with real world clinical data
13 October 2015
Berber Snoeijer

Oct-2015 PhUSE 2015 - Data Scientist RW

Video
https://www.youtube.com/watch?v=UoYl7eCesqw&feature=youtu.be&a

Clinical data scientist
• Data preparation
• Collaboration
• Advanced programming
• Statistics
• Scientific rigour
• Visualisations
• Hacker mindset
• Understanding clinical data

Data preparation
• Data from different sources
  – GP database
  – Pharmacy claims database
  – Hospital admissions
  – Hospital pharmacies
  – Laboratories
  – Additional data sources

Data preparation
• Extracted
• Standardized
• Linked
• Coded
• Explored
• Creation of analysis datasets

Collaboration
• Different departments
  – Information Management
  – Research
  – Reporting
• Engage with senior management
• Explore customer needs
• Influence without authority

Advanced programming
• SAS Base
• SQL queries
• SAS Macro
• SAS Graph
  – GTL
• SAS VA
• Other programming languages

Good Programming Practices
• Efficient
  – What is the most straightforward method (less code)
  – What is the fastest method (less time)
• Repeatable
  – Macro language
  – No re-programming of same code
• Clear and transferable
  – Debugging must be easy
  – Easier for colleagues to understand

Example: SAS SQL combined with MACRO
PROC SQL;
CREATE TABLE __allPat AS
SELECT &repmonc COUNT(Distinct &pat) AS TotPat
FROM &DsIn
WHERE &geslacht IN ('M','V') AND age ne . AND age<98
%IF &bymon=Y %THEN %STR(GROUP BY repmon);
;
QUIT;
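
The fragment above uses %IF, so it has to run inside a macro definition. A minimal sketch of how it might be wrapped is shown below. The macro name count_patients, the parameter defaults, the way &repmonc is derived from &bymon, and the dataset work.dispensings are all assumptions for illustration and do not come from the slides.

```sas
/* Hypothetical wrapper; macro name, defaults and input dataset are assumptions. */
%MACRO count_patients(DsIn=, pat=patid, geslacht=geslacht, bymon=N);
  /* Assumed convention: &repmonc holds "repmon," when counting per
     reporting month and is empty otherwise. */
  %LOCAL repmonc;
  %IF &bymon=Y %THEN %LET repmonc = repmon,;
  %ELSE %LET repmonc = ;

  PROC SQL;
    CREATE TABLE __allPat AS
    SELECT &repmonc COUNT(DISTINCT &pat) AS TotPat
    FROM &DsIn
    /* geslacht = sex, coded M (male) or V (female); drop missing or implausible ages */
    WHERE &geslacht IN ('M','V') AND age NE . AND age < 98
    /* The GROUP BY clause is only generated when a per-month count is requested */
    %IF &bymon=Y %THEN %STR(GROUP BY repmon);
    ;
  QUIT;
%MEND count_patients;

/* Example call: one row per reporting month */
%count_patients(DsIn=work.dispensings, bymon=Y);
```

Calling the macro with bymon=N would give a single overall count from the same code, which is the "no re-programming of same code" point on the previous slide. The input dataset is assumed to contain the columns patid, geslacht, age and repmon.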

Statistics
• Descriptive
  – Mean, median, percentiles etc.
• Modelling
  – Influence of surrounding factors
• Forecasting
• Explorative
• Pattern seeking

Diabetes adherence – descriptive
[Figure not reproduced]

COPD drug predictors of indication
[Figure not reproduced; labels: duration of use, age, gender, R03BA, ipratropium, LABA (not combined), tiotropium, R01, montelukast]

Scientific rigour
• Understand data
• Distinguish sense from nonsense
• Understand need of customers (caretakers and pharma companies)
• Be able to substantiate and discuss the obtained results

Visualisations
• Two types
  – Explorative
  – Explanatory
• Give insight at a glance
• Insight in data
  – Understandable / Intuitive
  – Not too much information

Hacker mindset
• Curious about the data
• Problem solver
• Creative
• Out of the box
  – Try to find what is not obvious…
  – Find aberrant patterns…

Using real world clinical data
• Privacy
  – Permission of patient and the owner of the data (caretaker)
  – Data security
  – Anonymizing
  – Documentation of process
• Big data
  – Millions of records
  – At least 7 years history
  – Lot of possible links to other databases

WWW.PHARMO.COM