HOW TO BUILD A CLINICAL DATABASE
Alberto Briganti
Urological Research Institute, Vita Salute San Raffaele University, Dept. of Urology, Milan, Italy

DATABASE DEFINITION
“A systematized collection of data that can be accessed immediately and manipulated by a data-processing system for a specific purpose”

DATABASE DESIGN CONSIDERATIONS
The main design goal for all databases is to store the data accurately

DATABASE DESIGN CONSIDERATIONS
A good database design balances various needs and limitations:
• Clarity, ease, and speed of data entry
• Efficient creation of analysis data sets
• Formats of data transfer files
• Database application software requirements

ENTERING DATA
• Selecting a method to transcribe the data
• Determining how closely the data must match the case report forms
• Making edits and changing data without “jeopardizing” quality
• Quality control of the entire process

DATABASE STRUCTURE
FIELD, RECORD, VALUE

CLINICAL DATABASE
Included data:
• Numeric values (e.g.: age, PSA, time of surgery, …)
• Text (e.g.: drugs, co-morbidities, …)
• Date (e.g.: date of birth, date of surgery, …)
• Complex objects (e.g.: images, …)

INCLUDED DATA: NUMBER

INCLUDED DATA: TEXT

INCLUDED DATA: DATE

TYPES OF VARIABLES
• Continuous (quantitative) variables: measured as a number for which arithmetic operations make sense (height, age, PSA, …)
• Categorical (qualitative) variables: have two or more categories. Categorical values have no numerical meaning.
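
The field, record, and value structure and the numeric, text, and date data types listed above can be sketched with Python's built-in sqlite3 module. This is only an illustration, not the database used by the author: the table name, the field names, and the single patient row are invented for the example, and dates are assumed to be stored as ISO 8601 text.

```python
import sqlite3
from datetime import date

# In-memory database; table and field names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE patient (
        patient_id      INTEGER PRIMARY KEY,  -- numeric identifier
        age_years       INTEGER,              -- numeric value
        psa_ng_ml       REAL,                 -- numeric value
        comorbidities   TEXT,                 -- text
        date_of_surgery TEXT                  -- date, stored as ISO 8601 text
    )
    """
)

# One RECORD (row) holds one VALUE for each FIELD (column).
# The values below are invented for illustration only.
conn.execute(
    "INSERT INTO patient VALUES (?, ?, ?, ?, ?)",
    (1, 64, 6.5, "hypertension", date(2011, 3, 14).isoformat()),
)

# Read three values back from the record.
age_years, psa_ng_ml, surgery_text = conn.execute(
    "SELECT age_years, psa_ng_ml, date_of_surgery "
    "FROM patient WHERE patient_id = 1"
).fetchone()

# Convert the stored text back into a real date object.
surgery_date = date.fromisoformat(surgery_text)
print(age_years, psa_ng_ml, surgery_date)
```

Storing the date in one fixed format keeps it sortable and makes later time calculations straightforward.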

CATEGORICAL VARIABLES
• Nominal variables: have two or more categories but do not have an intrinsic order
  1: Adenocarcinoma
  2: BPH
• Dichotomous variables: have only two categories or levels
  1: Male
  2: Female
• Ordinal variables: have two or more categories just like nominal variables, and the categories can also be ordered or ranked
  1: PSA <4 ng/ml
  2: PSA between 4 and 10 ng/ml
  3: PSA >10 ng/ml

TYPES OF VARIABLES: CONTINUOUS VARIABLE

TYPES OF VARIABLES: CATEGORICAL VARIABLES

QUERY
• To search, to question, to find
• Question to the database (e.g.: “Which department(s) is Joe in?”)
• Usually constructed using SQL (Structured Query Language)
• The answer to any query will be a relation

QUERY TYPES
• Select query: simple data retrieval
• Action query: additional operations on the data (insertion, updating or deletion)

CLINICAL DATA ACQUISITION
• Prospective data acquisition: data of a cohort of subjects are prospectively collected from a certain moment to test a defined hypothesis
• Retrospective data acquisition: data are retrospectively collected to investigate the association between risk/protection factors and clinical outcome

CLINICAL DATA ACQUISITION
• Preoperative questionnaires
• Clinical records
• Follow-up visits
• Phone interviews
• Postoperative questionnaires
• Registry office demographic data

WRITTEN CONSENT
• Is necessary for any kind of clinical data management
• For acquisition and analysis of their data, patients are asked to sign a specific written consent form
• One day before surgery, medical providers explain the importance of data acquisition

QUESTIONNAIRES
• List of questions for the purpose of gathering information from the respondents
• Two main goals:
  • Obtain any information relevant to the purposes of the survey
  • Collect the information with maximal reliability and validity

VALIDATED QUESTIONNAIRES
A validated questionnaire has undergone a validation procedure showing that it accurately measures what it aims to measure, regardless of who responds, when they respond, and to whom they respond

VALIDATED QUESTIONNAIRES
Advantages of validated questionnaires:
• Reduce bias by detecting ambiguities and misinterpretations
• Aim at a high degree of “specific objectivity”
• Collect better quality data
• Offer the opportunity to compare results between different studies

VALIDATED QUESTIONNAIRES: INTERNATIONAL PROSTATE SYMPTOM SCORE

VALIDATED QUESTIONNAIRES: INTERNATIONAL INDEX OF ERECTILE FUNCTION

CLINICAL RECORD
• Patient history
• Type of surgery
• Surgical complications
• Histological findings
• Duration of hospital stay

FOLLOW-UP VISITS
• Every 3 months during the first year after surgery and every 6 months thereafter
• Questionnaire administration (e.g.: IIEF, ICIQ, IPSS)
• Imaging and laboratory data acquisition

REGISTRY OFFICE DEMOGRAPHIC DATA

hSR PROSTATE CANCER DATABASE
• November 2001: hSR Prostate Cancer database created using Excel files
• Clinical and pathological data prospectively collected from 2002
• Data of 1200 patients treated with radical prostatectomy before 2002 retrospectively collected
• 2004: due to the large amount of data included in the PCa database, a FileMaker Pro database was created

hSR PROSTATE CANCER DATABASE
• 6000 patients treated with open RRP
• 850 patients treated with robotic-assisted laparoscopic prostatectomy
• Clinical and pathological data prospectively collected from 2001
• The number of RP has increased over time and today we perform 600 RP every year
• In 2011, 50% of RP were performed with a robotic-assisted laparoscopic approach
• Complete preoperative data from 2050 patients

HOW TO BUILD A CLINICAL DATABASE
PERSONAL SUGGESTIONS…
1. As beginners, please make it as easy as possible
2. Try to be prospective
3. If retrospective, care should be taken to avoid selection biases
4. An Excel file is easy, but be careful…
5. Avoid text as much as possible (numeric conversion)
6. However, try to limit the number of categories within each variable
7. Try to subdivide the database into different, connected sub-structures
8. Do not include too many variables…
9. Use standardized assessments (e.g. TNM) and questionnaires

HOW TO BUILD A CLINICAL DATABASE
PERSONAL SUGGESTIONS…
1. Do not use “subjective variables” (unless they are to be tested in trials)
2. Every time-dependent variable should have the time to event available
3. Restrict your queries
4. Copy patient charts
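
Several of the suggestions above (numeric conversion of text, connected sub-structures, time to event, restricted queries) can be illustrated together in a short sketch using Python's built-in sqlite3 module. Everything named here is an assumption made for the example: the two tables, their fields, the helper functions psa_category and months_between, the approximate month length, and the single invented patient. The numeric codes for sex and PSA category follow the coding shown in the categorical variables slide.

```python
import sqlite3
from datetime import date

def psa_category(psa_ng_ml: float) -> int:
    """Ordinal code: 1 for PSA < 4, 2 for PSA 4 to 10, 3 for PSA > 10 ng/ml."""
    if psa_ng_ml < 4:
        return 1
    if psa_ng_ml <= 10:
        return 2
    return 3

def months_between(start: date, end: date) -> float:
    """Approximate time to event in months (assumes 30.4375 days per month)."""
    return (end - start).days / 30.4375

conn = sqlite3.connect(":memory:")
# Two connected sub-structures, linked through patient_id (hypothetical schema).
conn.executescript(
    """
    CREATE TABLE patient (
        patient_id INTEGER PRIMARY KEY,
        sex        INTEGER,   -- 1: male, 2: female
        psa_cat    INTEGER    -- 1, 2 or 3 (see psa_category)
    );
    CREATE TABLE follow_up (
        visit_id        INTEGER PRIMARY KEY,
        patient_id      INTEGER REFERENCES patient(patient_id),
        months_to_event REAL
    );
    """
)

# Action queries (insertion). All values are invented for illustration.
surgery, event = date(2011, 1, 1), date(2012, 1, 1)
conn.execute("INSERT INTO patient VALUES (?, ?, ?)", (1, 1, psa_category(6.5)))
conn.execute(
    "INSERT INTO follow_up VALUES (?, ?, ?)",
    (1, 1, months_between(surgery, event)),
)

# Restricted select query: only the needed fields, only one PSA category.
rows = conn.execute(
    """
    SELECT p.patient_id, f.months_to_event
    FROM patient AS p
    JOIN follow_up AS f ON f.patient_id = p.patient_id
    WHERE p.psa_cat = 2
    """
).fetchall()
print(rows)
```

Keeping follow-up visits in their own table avoids adding a new column to the patient table for every visit, and the stored time to event is ready for analysis without recomputing it from raw dates.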