Modeling fluctuations in advanced Parkinson's disease using statistical and soft computing methods – data mining for contributing factors

Shahina Begum
2005

Master Thesis
Computer Engineering
Nr: E3166D

DEGREE PROJECT
Computer Engineering

Programme: International Master of Science in Computer Engineering (Specialization Intelligent Systems)
Reg. number: E 3166 D
Extent: 30 ECTS
Name of student: Shahina Begum
Year-Month-Day: 1977-10-01
Supervisor: Mr. Jerker Westin
Examiner: Professor Mark Dougherty
Company/Department: NeoPharma AB
Title: Modeling fluctuations in advanced Parkinson's disease using statistical and soft computing methods – data mining for contributing factors
Keywords: Parkinson's disease; levodopa; fluctuations; data mining; statistical model; fuzzy model; neurofuzzy model

Abstract

The main purpose of this thesis work was to investigate factors that influence fluctuations in a group of patients with advanced Parkinson's disease. Data were taken from two different crossover studies comparing duodenal infusion of a levodopa gel (Duodopa) with oral treatments. One study was 'DireQt' (Duodopa Infusion - Randomized Efficacy and Quality of life Trial), in which 18 patients were involved, and the other was the 'Pharmacokinetic study', which involved 12 patients. Based on these studies, data mining with statistical, fuzzy and neuro-fuzzy modeling techniques was carried out, and the performances of the different models were compared. The models showed that fluctuation was strongly related to treatment (Duodopa or oral) and also related to disease severity. Patients taking oral levodopa had more fluctuations than patients treated with Duodopa, and patients with greater disease severity had more fluctuations than those with lesser disease severity. In addition, fluctuation was more affected by treatment than by severity.
Data from the 'Pharmacokinetic study' also showed that the standard deviation of plasma concentrations of levodopa, disease duration and disease severity were related to fluctuations in advanced Parkinson's disease.

Contents

1 Introduction
1.1 Project Background
1.2 Study Description
1.3 Modeling Techniques
1.4 Aim and Objective
2 Methodology
2.1 Environment
2.2 Data Mining
2.2.1 Data Description
2.2.2 Getting the Data Ready
2.2.3 Mining the Data
2.3 Statistical Techniques
2.3.1 Model Selection
2.4 Fuzzy Logic Techniques
2.4.1 Fuzzy Inference System/Fuzzy Rule-based System/Fuzzy Model
2.4.2 Neuro-fuzzy Model
3 Statistical Techniques
3.1 Statistical Models
3.1.1 Model 1: data from DireQt study
3.1.2 Model 2: data from DireQt study
3.1.3 Model 3: data from Pharmacokinetic study
3.1.4 Model 4: data from Pharmacokinetic study
3.2 Evaluating Statistical Models
3.3 Correlation among Ratings, Diary and UPDRS
3.4 Result Analysis
4 Fuzzy Techniques
4.1 Fuzzy Models
4.1.1 Mamdani Fuzzy Model 1: data from DireQt study
4.1.2 Mamdani Fuzzy Model 2: data from Pharmacokinetic study
4.1.3 Anfis Model 3: data from DireQt and Pharmacokinetic study
4.1.4 Anfis Model 4: data from Pharmacokinetic study
4.2 Evaluating Fuzzy Models
4.3 Result Analysis
5 Performance: Statistical and Fuzzy Models
6 Conclusion
Appendix A
Appendix B
References

Acknowledgements

I want to acknowledge my supervisor Mr. Jerker Westin for his valuable guidance and for helping me with his intuitive ideas. I am also grateful to Professor Mark Dougherty for his support and encouragement and for providing me with an opportunity to work on this project. I am thankful to Mr. Hasan Fleyeh for constantly encouraging me from the beginning of my Master's studies. I also want to thank Mr. Pascal Rebreyend and Mr. Kalid Askar for their cooperation during my thesis period. I wish to thank Mr. Moudud Alam, a student in the Statistics department, for his valuable suggestions for solving the problems related to statistics. Finally, I want to thank and acknowledge my family: my parents and my husband, who always supported and helped me in many ways.

List of Figures

Figure 1: Data mining steps
Figure 2: Block diagram of computations in ANFIS
Figure 3: Adaptive Neuro-fuzzy inference system ([33])
Figure 4: Antecedent and consequent MFs
Figure 5: Antecedent and consequent MFs
Figure 6: Error after 40 epochs
Figure 7: Membership function of severity before and after training
Figure 8: Surface view
Figure 9: Error after 40 epochs
Figure 10: Surface view
Figure 11: Analysis of result from different fuzzy models

List of Tables

Table 1: Assessing Individual Independent Variables
Table 2: Assessing Independent Variables: Effect Size
Table 3: Assessing Individual Independent Variables: Statistical Significance
Table 4: Assessing Independent Variables: Effect Size
Table 5: Assessing Individual Independent Variables: Statistical Significance
Table 6: Assessing Independent Variables: Effect Size
Table 7: Assessing Individual Independent Variables: Statistical Significance
Table 8: Assessing Independent Variables: Effect Size
Table 9: Summary result of the r-square value of the model
Table 10: Summary result of the r-square value of the model
Table 11: Summary result of the r-square value of the model
Table 12: Pearson Correlation matrix
Table 13: Analysis of result for models data from DireQt and Pharmacokinetic study
Table 14: Rules of the Mamdani fuzzy model
Table 15: Rules of the Mamdani fuzzy model
Table 16: The rule-base of the ANFIS model
Table 17: The rule-base of the ANFIS model
Table 18: Comparing R2 for different models

Registration number: E 3166 D Name: Shahina Begum 2017-05-13 Page 1

1 Introduction

Parkinson's disease (PD) is a neurological disease. Dopamine is a chemical in the brain that helps to control movement and activities such as walking and talking. In Parkinson's disease, the brain cells that produce dopamine do not work properly and many of them die. This causes a shortage of dopamine, which in turn causes the key symptoms of the disease. Parkinson's disease affects about 1% of all persons over the age of 60, and 15% of the patients are diagnosed before the age of 50. In the initial stage of the disease, patients are treated with 'artificial dopamine' (levodopa) in tablet form. Treatment of PD with levodopa began in 1960 [34]. More than 30 years after its initial success, levodopa is still routinely used and remains the most effective treatment for the spectrum of PD signs and symptoms [35]. However, long-term use of levodopa is associated with motor fluctuations (wearing-off, start hesitation, unpredictable off, on-off) and dyskinesia (involuntary movements), which are a major problem in PD.
When patients are treated orally with levodopa (L-DOPA), one third suffer from this problem after 3-5 years of treatment, about half after 5 years, and nearly all after 10-12 years [1][2]. Long-term studies using a slow-release formulation of levodopa show a high prevalence of motor fluctuations after 5 years of therapy [3][4]. It has been demonstrated that the motor fluctuations and dyskinesias in advanced Parkinson's disease are at least partially related to variations in blood levodopa concentrations, and that such fluctuations can be markedly reduced by keeping the levodopa plasma concentration constant [5][6]. Motor fluctuations have been shown to increase with longer disease duration and greater disease severity [24][25], and motor complications have been shown to occur more frequently and earlier in patients with younger onset [20]. Initiation of treatment with other antiparkinsonian drugs has resulted in lower rates or delayed appearance of motor fluctuations and dyskinesias [21][22][23]. Higher doses of oral levodopa are associated with higher rates of motor fluctuations and dyskinesia [21][24][26].

1.1 Project Background

This thesis work was part of the project IDOL (Intelligent Duodopa On-Line), a collaboration between Högskolan Dalarna, Uppsala University, NeoPharma Production AB and Clinitrac AB, co-funded by KK-Stiftelsen. The background of the project was the need to individually tune the dosage of medication for patients with advanced Parkinson's disease. These patients require fine adjustments of their dopamine levels to function in daily life. If the levels are too low, they are stiff, shaking and in pain; if the levels are too high, they have problems controlling body movements. Since Parkinson's disease is a progressive disease, the best medication dosage changes over time.

1.2 Study Description

The patient samples of the clinical studies used here may not be representative of the overall population of patients with advanced Parkinson's disease.
For this project, data were taken from two different studies from NeoPharma AB, Uppsala, Sweden.

Högskolan Dalarna Röd vägen 3, 781 88 Borlänge Tel: 023 7780 00 Fax: 023 7780 50 URL: http://www2.du.se

DireQt (Duodopa® infusion - Randomised Efficacy and Quality of life Trial) study: patients with advanced idiopathic PD, suffering from severe fluctuating response despite frequent oral levodopa treatment, were included in the study. This was an open, three + three week crossover study of Duodopa vs. conventional anti-Parkinson medications, with blinded assessment of Parkinsonism and dyskinesias from video recordings of the patients using rating scales. The main objective of the study was to compare continuous intraduodenal infusion of Duodopa as monotherapy with treatment with any antiparkinsonian combination therapy in patients with advanced idiopathic levodopa-responsive PD, suffering from motor fluctuations in spite of individually optimized treatment.

Pharmacokinetic study performed in 1999 – 2000: patients with idiopathic PD and diurnal motor fluctuations in spite of optimized oral treatment were enrolled. Patients were randomized either to continue Sinemet CR tablets (group 1) or to start nasoduodenal infusion of levodopa (group 2) during weeks 1-3. After week 3, patients were crossed over to infusion (group 1) and Sinemet CR tablets (group 2) for the next 3 weeks. This study mainly focused on levodopa pharmacokinetics.

1.3 Modeling Techniques

Extracting knowledge from data is an interesting and important task in information science and technology. The application field here was medicine. Statistical methods play an important role in medical research, and in statistical inference models play an important role. A model is a mathematical way of describing the relationships between a response variable and a set of independent variables [9]. A good model should be simple and at the same time describe most of the information in the data, and it should make sense from a subject-matter point of view. Statistical modeling techniques have been applied in many areas of medical research, for example to model smallpox vaccine dilution [12], to estimate flu-related deaths [13], to detect abnormalities in digital mammography [14] and to predict the outcome in breast cancer [15].

The use of fuzzy logic in medical informatics began in the early 1970s. Fuzzy set theory, which was developed by Zadeh (1965), makes it possible to define inexact medical entities as fuzzy sets. It provides an excellent approach for approximating medical text. Furthermore, fuzzy logic provides reasoning methods for approximate inference [10]. Fuzzy models have some properties that make them particularly interesting, namely the possibility of linguistic interpretation [11]. Fuzzy models have some transparency: their information is interpretable, which permits a deeper understanding of the system under study. Fuzzy logic models can be developed from expert knowledge or from patient input-output data. In the first case, expert knowledge is expressed in linguistic terms, which are sometimes faulty and require the model to be tuned. Identifying the process from data, with the help of expert knowledge, is therefore a more attractive approach. This requires defining the model input variables and determining the fuzzy model type. There are thus two ways to develop a fuzzy model. The first is based on defining the initial parameters of the model (membership functions) and selecting the method for constructing the rules (if-then). A limitation of fuzzy models is the difficulty of quantifying the fuzzy linguistic terms.
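The notions of fuzzy set, membership function and if-then rule used above can be made concrete with a minimal Python sketch. Everything specific in it is an assumption made for illustration only: the 0-100 severity scale, the breakpoints of the three linguistic terms, the function names and the single rule are hypothetical and are not taken from the rule bases of this thesis.

```python
def triangular(x, a, b, c):
    """Membership degree of x in a triangular fuzzy set that rises
    from a to the peak b and falls back to zero at c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical linguistic terms for a severity score on an assumed 0-100 scale.
SEVERITY_TERMS = {
    "low": (-50.0, 0.0, 50.0),
    "medium": (0.0, 50.0, 100.0),
    "high": (50.0, 100.0, 150.0),
}


def fuzzify_severity(score):
    """Map a crisp severity score to its membership degree in each term."""
    return {term: triangular(score, *abc) for term, abc in SEVERITY_TERMS.items()}


def rule_high_fluctuation(score, oral_treatment):
    """Firing strength of the illustrative rule
    IF severity is high AND treatment is oral THEN fluctuation is high.
    AND is taken as the minimum, as in a Mamdani-type system."""
    treatment_degree = 1.0 if oral_treatment else 0.0  # treatment is a crisp input
    return min(fuzzify_severity(score)["high"], treatment_degree)


if __name__ == "__main__":
    print(fuzzify_severity(80.0))
    print(rule_high_fluctuation(80.0, oral_treatment=True))
```

A severity score of 80 receives partial membership in both "medium" and "high". The breakpoints of the terms are chosen by hand here, which illustrates the difficulty of quantifying linguistic terms.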
Therefore, neuro-fuzzy systems appear as an attempt to combine the advantages of fuzzy systems in terms of transparency with the advantages of neural networks regarding learning capabilities. The second method is used if there is no knowledge about the process; the rules and membership functions can then be extracted directly from the data by clustering the input/output space. Fuzzy modeling techniques have been applied in many areas of medical research, for example to determine the severity of respiratory failure [16], to model a symptomatic HIV-infected population [17], to investigate psychosomatic disorders with neuro-fuzzy systems [18], and to evaluate the performance of fuzzy set theoretic models for medical diagnosis [19].

1.4 Aim and Objective

The aim of this work was to model factors influencing fluctuation in advanced Parkinson's disease, based on data from clinical trials with patients on infusion and oral medication. The objective was to investigate the factors that influence patients' motor fluctuations, based on variables such as disease duration in years, age at disease onset, severity, treatment, plasma concentration level of levodopa, antiparkinson medication and age, and to find out which factors were most important for the fluctuation. Investigating these factors will help to find ways to reduce the so-called on-off fluctuations, which are a major problem for patients in the late stage of Parkinson's disease. This knowledge will be helpful for developing a decision support system for patients with advanced PD. Data mining was used to analyze existing data and to deduce the patterns that identify these factors. Statistical modelling techniques (General Linear Model) were applied for mining the data.
The performances of different statistical models were evaluated in order to find the model that best described the dependent variable, fluctuation. A linguistically interpretable rule-based fuzzy model built from data was also presented. The parameters of this model were tuned by training a neural network through back propagation, i.e. using a neuro-fuzzy model. In both cases, statistical and fuzzy, the models were evaluated to see how well they predict on data other than the data used to fit them. Modeling is an art; one cannot say in advance which model is best suited for the data sets at hand, so it is good to test different models. Therefore, different modeling techniques were applied, and the performances of statistical and fuzzy models were compared. The following sections describe the methodology, the statistical techniques, the analysis of the results from different statistical models, the fuzzy and neuro-fuzzy models, the analysis of the results from different fuzzy models, a comparison of the results from statistical and fuzzy logic techniques, and finally the conclusion and suggestions for future work.

2 Methodology

Data were taken from two different studies: the DireQt study and the Pharmacokinetic study. Data mining using statistical and fuzzy modeling techniques was applied to these two datasets. A top-down approach was used for mining the data, that is, starting with some idea or hypothesis. Statistical modeling techniques (General Linear Model) were applied for mining the data. A fuzzy rule-based model built from data was also presented through the Mamdani fuzzy model, and then a neuro-fuzzy model, i.e. a trained neural net architecture capable of representing a fuzzy system, was introduced.
Statistical and fuzzy models were evaluated to see how well they predicted for a sample set from another study than the one used for fitting: one sample set was generated from the DireQt study and the model was fitted to that data; a second sample set was generated from the Pharmacokinetic study, the model was used to predict the values of this second sample set, and the performances were compared. For evaluating the models within the same study, the total available data were divided into two sets, a calibration set and a validation set. The parameters of the model were identified using the calibration data set, and the model was tested for its performance on the validation data set. Finally, the performances of statistical and fuzzy models were evaluated. The correlation among the Rating, Diary and UPDRS was also computed to see whether similar results could be obtained using Diary or UPDRS instead of Rating.

2.1 Environment

For statistical modeling, the SAS (Statistical Analysis System) system for Windows V8.0, a statistical package, was used. For statistical model evaluation, R, a free statistical software, was used. For the Fuzzy Inference System, the NRC FuzzyJ Toolkit [28] was used, a Java(tm) API for representing and manipulating fuzzy information created at the National Research Council of Canada (NRC). The toolkit consists of a set of classes (nrc.fuzzy.*) that allow a user to build fuzzy systems in Java. The IDE (Integrated Development Environment) used was JBuilder 2005 Foundation, a software from Borland Software Corporation. For developing an ANFIS (Adaptive Neuro-fuzzy Inference System) model, the Fuzzy Logic Toolbox of MatLab 7 was used.

2.2 Data Mining

Data mining is a process of posing queries and extracting useful information, patterns and trends, previously unknown, from large quantities of data, possibly stored in databases [28]. Finding patterns in a dataset has become increasingly important.
But data mining is not the answer to all problems; it is only a small step in the entire process of knowledge discovery. Data mining for determining the factors that affect the fluctuations in advanced Parkinson's disease will provide the path for model creation. The steps of data mining followed here were:

Data description
Getting the data ready
Mining the data
Getting useful results

Analyzing patient history and current medical conditions determined the factors that affect fluctuations in advanced Parkinson's disease. The data mining steps are shown in figure 1.

Figure 1: Data mining steps

2.2.1 Data Description

Data sets were taken from two different studies from NeoPharma AB, Uppsala, Sweden, as part of collaborative research.

DireQt (Duodopa® infusion - Randomised Efficacy and Quality of life Trial) study

Patients with advanced idiopathic PD, suffering from severe fluctuating response despite frequent oral levodopa treatment, were planned to be included in the study. Patients fulfilling the criteria for inclusion were randomized into two groups. Group 1 received conventional medication for 3 weeks, followed by 3 weeks of levodopa/carbidopa (Duodopa®) by intraduodenal infusion via a nasoduodenal catheter. Group 2 was treated in the same way, but first with Duodopa® infusion and then with conventional PD medication. The study was open for the patients and the investigators, but the video recordings were evaluated by two independent observers who were unaware of each patient's therapy. The treatment response was determined by blinded assessments of video recordings of the patients. Each video recording was assessed with regard to symptoms of PD, dyskinesias and treatment response.
Description of datasets:

UPDRS: Unified Parkinson Disease Rating Scale [30]. An accepted and validated rating instrument in Parkinson's disease. It has four parts: 1. Mentation, behavior and mood; 2. Activities of daily living; 3. Motor; and 4. Complications of therapy. Here all patients were questioned about the presence of oscillations in motor response. However, the scale was not suitable for assessing fluctuations in the treatment response.

RATING_CLIP: The patient was video recorded every 30 minutes from 9:00 to 17:00. Each recording showed a standardized sequence of motor tasks: finger taps, alternating hand movements, rising from a chair and walking. Each recording was assessed for symptoms of PD, dyskinesias and treatment response. The global treatment response scale, TRS, was graded from -3 (marked bradykinesia) to 0 (normal), and the dyskinesia scale was graded from 0 (normal) to 3 (severe choreic dyskinesia) [31].

DIARY_ALLQ: Designed to be filled out by the patients. There were 10 questions to be answered in the morning and 8 questions during daytime. An increase in the response score (scaled from 1 to 5) of these questions indicated improvement. The diary gave the patients' own assessments of quality of life.

CONMED: This data set contained the concomitant medication (SSRI) for each patient, that is, medication that was not Parkinson's medicine.

PRESMED: Contained the present oral Parkinson's disease medication for each patient.

PUMP_LOG: Patients on treatment with Duodopa® used a portable pump for intraduodenal delivery of Duodopa. Infusion was not to be used at night. The dosage of Duodopa was individualized to each patient's need. Extra doses (0.1 - 2 ml) could be delivered via the CADD-Legacy Duodopa pump. Starting and stopping of the pump, bolus doses (1 to 10 ml), extra doses, and infusion rates (1.3 to 9.8 ml/hr) were recorded.
During test days (video recording days) all data were recorded in the CRF (Case Report Form).

BASELINE: Performed at enrollment, i.e. two weeks before treatment start. Conventional PD medications at baseline were recorded at enrollment for all patients.

Pharmacokinetic study performed in 1999 – 2000

This study mainly focused on levodopa pharmacokinetics. Patients were recruited via an information letter sent to the Swedish PD Association and in the neurology clinic at Uppsala University Hospital. Patients with idiopathic PD and diurnal motor fluctuations in spite of optimized oral treatment were enrolled. There was a "wash-in" week, which allowed for the elimination of prior long-acting antiparkinsonian medication. Patients were randomized either to continue Sinemet CR tablets (group 1) or to start nasoduodenal infusion of levodopa (group 2) during weeks 1-3. After week 3, patients were crossed over to infusion (group 1) and Sinemet CR tablets (group 2) for the next 3 weeks. On the last day of the baseline week, plasma levodopa concentrations were determined every 30 minutes from 8 a.m. to 5 p.m. Two test days involved collection of blood samples (approximately 3 ml) every 30 minutes from 8 a.m. to 5 p.m. and standardized video recordings hourly from 8 a.m. to 8:30 a.m. Video recordings consisted of three different tasks: piano playing, alternating hand movements, rising from a chair, and walking (20). Motor performance was scored by one investigator from -3 (severe Parkinsonism) to +3 (severe dyskinesia).

Data sets: The data files were BASELINE, UPDRS, and RATING.

2.2.2 Getting the Data Ready

The data were in the original data collection format. Hence, cleaning was necessary to reduce noise and error within the dataset.
Preparing for cleaning required transforming the raw datasets into datasets that were suitable for cleaning. MS Access was used for data cleaning and for examining simple descriptive features of the attributes. New attributes containing necessary information from the original datasets were added in order to avoid biased results. Attributes that contained constant information were removed: because they did not change for the majority of records in the dataset, they did not bring any new knowledge. Redundant attributes that contained the same information were also removed. For each study, only the patients in the Per Protocol Set were kept, and the others were removed from the datasets. After this step, the DireQt and Pharmacokinetic studies contained 18 patients and 12 patients respectively. Some unnecessary information was also carefully removed from the data files, based on knowledge from the literature and from experts. For some data files, useful information was extracted using SQL queries. Finally, for each study, a data set in longitudinal format that kept as many data records and attributes as possible was built for supplying knowledge to the model.

2.2.3 Mining the Data

Now the data were ready for analysis. A top-down approach was used for mining the data, that is, starting with some idea, pattern or hypothesis. There are different kinds of techniques in data mining. Statistical modeling techniques were used here.

2.3 Statistical Techniques

Statistical techniques play a major role in data mining. SAS (Statistical Analysis System) version 8.0, a statistical package, was used for the statistical techniques. The data file was imported into the SAS database in SAS7BDAT format using the import procedure.

2.3.1 Model Selection

The response variable for investigating the contributing factors of fluctuation in advanced Parkinson's disease was fluctuation, which was a continuous variable.
Fluctuation was independently and normally distributed. A quantile-quantile (Q-Q) plot is given in figure A.1 (Appendix A). Such data can be modeled with linear models such as analysis of variance and regression analysis, since neither the normality assumption nor the assumption of equal variance was violated. The data could therefore be modeled with General Linear Models. The General Linear Model (GLM) is basically an extension of linear multiple regression. Linear multiple regression quantifies the relationship between a single dependent variable and multiple independent variables. The GLM, however, can also test for interactions between independent variables [9].

In the general linear model a response variable Y is linearly associated with values of the X variables by

$$Y = X\beta + \varepsilon$$

where
Y = dependent variable, i.e. fluctuations,
X = design matrix,
$\beta$ = vector of model parameters $\beta_1, \beta_2, \beta_3, \ldots$,
$\varepsilon$ = vector containing the error terms, with $\varepsilon_i \sim N(0, \sigma^2)$,

that is, the $\varepsilon_i$ are independently and identically distributed as normal with mean zero and variance $\sigma^2$.

The criterion most frequently used to estimate the GLM parameters is the least squares criterion. General linear models can be fitted using e.g. the GLM procedure of the SAS package (SAS Institute Inc., 1999).

DireQt (Duodopa® infusion - Randomised Efficacy and Quality of life Trial) study

The variable fluctuation was the response variable for modelling factors. The response (dependent) variable was taken from the video scoring, that is, from the RATING_CLIP file: the standard deviation of the ratings on the treatment response scale represented the variable fluctuation. To get an idea of the relationship of the response variable with all the other variables before selecting the explanatory variables, Fit (Y X) analysis was done. Some of them are given below.
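The definition of the response variable and a fit of fluctuation against a single variable can be sketched in a few lines of Python. The sketch is an illustration under stated assumptions, not the SAS analysis used in this thesis: the ratings are invented, the sample standard deviation (n - 1 in the denominator) is assumed, treatment is coded 1 for oral and 0 for infusion, and the function names are hypothetical. It fits the simplest case of the general linear model, $Y = \beta_0 + \beta_1 x + \varepsilon$, by least squares and derives the analysis of variance quantities for that fit.

```python
from statistics import mean, stdev


def fluctuation(ratings):
    """Sample standard deviation of a series of treatment-response ratings."""
    return stdev(ratings)


def fit_one_variable(x, y):
    """Least-squares fit of y = b0 + b1 * x + error, with its analysis of variance.

    The model has 1 degree of freedom and the error has n - 2.
    Requires a non-constant x and a fit that is not exact (error SS > 0).
    """
    n = len(y)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_error = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_model = ss_total - ss_error
    f_stat = (ss_model / 1) / (ss_error / (n - 2))
    return {
        "intercept": b0,
        "slope": b1,
        "ss_model": ss_model,
        "ss_error": ss_error,
        "ss_total": ss_total,
        "f_stat": f_stat,
    }


# Invented ratings on a -3..+3 scale for two hypothetical patients per treatment.
RATINGS = {
    ("A", "oral"): [-2, 0, 1, -3, 2, -1],
    ("A", "infusion"): [0, -1, 0, 0, 1, 0],
    ("B", "oral"): [-3, -1, 2, 2, -2, 0],
    ("B", "infusion"): [-1, 0, 0, -1, 0, 0],
}

if __name__ == "__main__":
    x = [1 if treatment == "oral" else 0 for (_, treatment) in RATINGS]
    y = [fluctuation(series) for series in RATINGS.values()]
    print(fit_one_variable(x, y))
```

With a 0/1 treatment code, the fitted slope is the difference between the mean fluctuation under oral treatment and under infusion, and the F statistic compares the model mean square with the error mean square.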
The chosen explanatory variables were: baseline severity, taken from the UPDRS for each patient; treatment (two types, oral and infusion), taken as a classification variable; age; sex; age_on_set; disease duration; the mean of the baseline severity from the diary for each patient; present oral medicine (N04BA); and, for infusion, the number of extra doses and the rate of change of Duodopa on test days. These variables were assumed to have some possible effect on the dependent variable. Only those variables that appeared to be related to the dependent variable were considered for further investigation, and a backward deletion procedure was then used. The variables were normalized by subtracting the mean from each variable and dividing by its standard deviation.

Fit analysis of variables

[Scatter plots of fluctuation (FLU) against TREAT, SUM2, SUM3, SUM4, TOTSUM and BASEDQ1 are not reproduced. The analysis of variance table printed under each plot is summarized below. The model has 1 degree of freedom in every case. The last two rows are listed in source order; their panel titles do not state a treatment subset.]

Fluctuation vs.                           DF error/total  SS model  SS error  SS total  MS model  MS error  F Stat   Prob > F
Treatment (TREAT)                         34/35           1.7472    3.4264    5.1737    1.7472    0.1008    17.3376  0.0002
Sum2 (daily activities)                   34/35           0.5865    4.5872    5.1737    0.5865    0.1349     4.3469  0.0447
Sum3 (motor)                              34/35           0.0051    5.1686    5.1737    0.0051    0.1520     0.0337  0.8554
Sum2 (in oral)                            16/17           0.6855    2.2722    2.9577    0.6855    0.1420     4.8274  0.0431
Sum3 (in oral)                            16/17           0.0284    2.9293    2.9577    0.0284    0.1831     0.1551  0.6989
Sum2 (in infusion)                        16/17           0.0651    0.4037    0.4687    0.0651    0.0252     2.5784  0.1279
Sum3 (in infusion)                        16/17           0.0728    0.3959    0.4687    0.0728    0.0247     2.9414  0.1056
Sum4 (complications of therapy, in oral)  16/17           0.0079    2.9498    2.9577    0.0079    0.1844     0.0430  0.8384
Totsum                                    16/17           0.0515    2.9062    2.9577    0.0515    0.1816     0.2836  0.6017
BaseDQ1                                   16/17           0.0834    0.3854    0.4687    0.0834    0.0241     3.4610  0.0813

Pharmacokinetic study performed in 1999-2000

The response variable was taken from the video scoring (the RATING file): the standard deviation of the ratings on the treatment response scale. A Fit (Y X) analysis was done to get an idea of the relationship between the response variable and all the other variables. Some of the results are given below.
The chosen explanatory variables were: baseline severity, taken from the UPDRS for each patient; medication (two types, oral and infusion), taken as a classification variable; sex; age_on_set; disease duration; and the standard deviation of 'concentration' for each patient under each medication, oral and infusion. Only those variables that appeared to be related to the dependent variable were considered for further investigation, and a backward stepwise deletion procedure was then used. The variables were taken in normalized form.

Fit analysis of variables

[Scatter plots of fluctuation (FLU) against MED, SUM2, SUM3, SUM4, TOTSUM and CONCEN are not reproduced. The analysis of variance table printed under each plot is summarized below. The model has 1 degree of freedom in every case.]

Fluctuation vs.                                   DF error/total  SS model  SS error  SS total  MS model  MS error  F Stat   Prob > F
Treatment (MED)                                   22/23           0.7796    4.7475    5.5271    0.7796    0.2158     3.6126  0.0705
Severity (Sum2, activities of daily living)       22/23           1.5941    3.9329    5.5271    1.5941    0.1788     8.9173  0.0068
Severity (Sum3, motor)                            22/23           2.5036    3.0234    5.5271    2.5036    0.1374    18.2176  0.0003
Severity (Sum4, complications of therapy)         22/23           0.3340    5.1931    5.5271    0.3340    0.2361     1.4148  0.2469
Severity (totsum = sum1 + sum2 + sum3 + sum4)     22/23           2.6771    2.8500    5.5271    2.6771    0.1295    20.6651  0.0002
Concentration                                     22/23           0.6408    4.8863    5.5271    0.6408    0.2221     2.8852  0.1035
Sum2 (in oral)                                    10/11           0.6755    2.1871    2.8626    0.6755    0.2187     3.0885  0.1094
Sum2 (in infusion)                                10/11           0.9287    0.9562    1.8849    0.9287    0.0956     9.7124  0.0109
Sum3 (in oral)                                    10/11           1.4116    1.4510    2.8626    1.4116    0.1451     9.7284  0.0109
Sum3 (in infusion)                                10/11           1.1016    0.7833    1.8849    1.1016    0.0783    14.0646  0.0038
Totsum (in oral)                                  10/11           1.4798    1.3828    2.8626    1.4798    0.1383    10.7013  0.0084
Totsum (in infusion)                              10/11           1.2044    0.6805    1.8849    1.2044    0.0681    17.6974  0.0018

2.4 Fuzzy Logic Techniques

The problem of finding factors for fluctuation in advanced Parkinson's disease, addressed above with statistical methods, was also investigated with computational intelligence techniques.
Soft computing is an integrated approach that can usually utilize specific techniques within subtasks to construct generally satisfactory solutions to real-world problems. [27] A fuzzy inference system (FIS), which represents structured knowledge in the form of if-then rules and can effectively model human expertise, was used here to model fluctuation in advanced PD. Neural network learning concepts, i.e. neuro-fuzzy modeling techniques, were also incorporated to get a better solution.

2.4.1 Fuzzy Inference System/Fuzzy Rule-based System/Fuzzy Model

The basic structure of fuzzy modeling, commonly known as a fuzzy inference system (FIS), is a rule-based or knowledge-based system consisting of three conceptual components: a rule base that consists of a collection of fuzzy IF-THEN rules; a database that defines the membership functions (MFs) used in the fuzzy rules; and a reasoning mechanism that combines these rules into a mapping from the inputs to the outputs of the system, to derive a reasonable output conclusion. There are two main approaches to fuzzy systems: the Mamdani approach and the Takagi-Sugeno approach.

2.4.2 Neuro-fuzzy Model

Merging a neural network with a fuzzy system into one integrated system, i.e. a neuro-fuzzy system, offers a promising approach to building an intelligent system. It is functionally equivalent to a fuzzy inference model. A neural network that is functionally equivalent to a Sugeno fuzzy inference model is called an ANFIS (Adaptive Neuro-Fuzzy Inference System). ANFIS can be trained to develop the IF-THEN rules, and this learning technique provides a method for the fuzzy modelling procedure to learn information about a data set, in order to compute the membership function parameters that best allow the associated fuzzy inference system to track the given input-output data. The basic block diagram of computations in ANFIS is given below.

Initialize the fuzzy system
Give parameters for learning (like epoch, tolerance error etc.)
Start learning process (stop when tolerance is achieved)
Validate with independent data

Figure 2: Block diagram of computations in ANFIS

[Diagram layers, from input to output: input, fuzzification layer, rule layer, normalization layer, defuzzification layer, summation, output.]

Figure 3: Adaptive Neuro-fuzzy inference system ([33])

The neural network maps inputs through input membership functions and their associated parameters, and then through output membership functions and their associated parameters, to outputs. The parameters associated with the membership functions change during the learning process. ANFIS uses either back propagation or a combination of least squares estimation and back propagation (hybrid) for membership function parameter estimation.

3 Statistical Techniques

Statistical techniques were used to mine the data in order to find the influencing factors for fluctuations.

3.1 Statistical Models

Different statistical models using data from the DireQt and Pharmacokinetic studies, and their results, are shown below.

3.1.1 Model 1: data from DireQt study

$$\text{fluctuation} = \alpha + \beta_1 \cdot \text{treat} + \beta_2 \cdot \text{severity} + \varepsilon \qquad (1)$$

where
$\alpha$ = intercept
$\beta_1$ = estimate for treatment 1 (oral) or treatment 2 (infusion)
$\beta_2$ = estimate for severity (sum2, daily activities)
$\varepsilon$ = random error

Assessing Overall Fit: Statistical Significance

The ANOVA (analysis of variance) table from the GLM was examined (Appendix table A.1.1). The ANOVA table contains two rows, one for the Model and one for Error. It applies to the whole model and not to specific parts of the model. This table and its test statistic (i.e., the F statistic) assess whether the model as a whole predicts better than chance.
In this case the P-value (.0001) shows that the model was highly significant.

Assessing Overall Fit: Effect Size

The customary measure of effect size in a GLM is the squared multiple correlation, denoted $R^2$. The results (Appendix table A.1.2) showed that the model could explain 45.10% of the variation in fluctuations.

Assessing Individual Independent Variables: Statistical Significance

Statistical procedures for GLMs give as output a table with a row for each independent variable in the model, along with a statistical test of significance for that variable.

Table 1: Assessing Individual Independent Variables
Source     F Value  Pr > F
TREAT      20.30    0.0001
SEVERITY    6.81    0.0135

Table 1 illustrates the output from an ANOVA procedure for the type III test. The ANOVA procedure uses an F test to assess significance. The resulting F value, 20.30 for TREAT in this case, uses the mean square for TREAT as the numerator and the mean square for error as the denominator. The p value (.0001) suggested that treatment significantly predicted the dependent variable, because it was lower than the customary cutoff of .05. The variable SEVERITY (severity, daily activities) had an F value of 6.81 with an associated p-value of 0.0135, so the null hypothesis could be rejected in this case as well.

Assessing Independent Variables: Effect Size

The change in the dependent variable was interpreted using the standardized regression coefficient. The standardized regression coefficient is the coefficient from a regression in which all variables are standardized (i.e., have a mean of 0 and a standard deviation of 1.0).
Hence, all units were expressed as "standard deviation units."

Table 2: Assessing Independent Variables: Effect Size
Parameter                 Estimate  Pr > |T|
INTERCEPT ($\alpha$)      -0.573    0.0031
TREAT 1                    1.146    0.0001
TREAT 2                    0.000    .
SEVERITY                   0.337    0.0135

The INTERCEPT ($\alpha$) is simply the predicted value of the dependent variable when all the independent variables are 0. Note that the intercept is not required to take on a meaningful real-world value. The estimate for TREAT (treatment) tells us that, if all other conditions remain the same, fluctuation under TREAT1 (oral) is on average 1.146 standard deviations higher than under TREAT2 (infusion). If the values of all the independent variables except SEVERITY are fixed at any set of numbers, an increase of one standard deviation in SEVERITY (standardized value of severity, daily activities) predicts an increase of 0.336 standard deviations in the dependent variable, fluctuation. The low value of $R^2$ for model (1) indicates that there might be other explanatory variables that could explain the variation in fluctuation more precisely.

Considering other variables that might have effects:

3.1.2 Model 2: data from DireQt study

$$\text{fluctuation} = \alpha + \beta_1 \cdot \text{treat} + \beta_2 \cdot \text{severity} + \beta_3 \cdot \text{treat} \cdot \text{severity} + \varepsilon \qquad (2)$$

where
$\alpha$ = intercept
$\beta_1$ = estimate for treatment 1 (oral) or treatment 2 (infusion), depending on the treatment level
$\beta_2$ = estimate for severity (sum2, daily activities)
$\beta_3$ = estimate for the interaction term between treatment and severity
$\varepsilon$ = random error

This model adds one interaction term between treatment and severity. The corresponding table for the type III test is given in Appendix table A.2.
Table 3: Assessing Individual Independent Variables: Statistical Significance
Source            F Value  Pr > F
TREAT             20.89    0.0001
SEVERITY           7.01    0.0125
SEVERITY*TREAT     1.96    0.1708

Table 3 shows that the variables TREAT and SEVERITY were significant. For the interaction term SEVERITY*TREAT, the F value of 1.96 with its p value (0.1708) indicated that the interaction between treatment and severity was not statistically significant, since the p value was higher than the customary cutoff of .05.

Table 4: Assessing Independent Variables: Effect Size
Parameter                 Estimate  Pr > |T|
INTERCEPT ($\alpha$)      -0.573    0.0028
TREAT 1                    1.146    0.0001
TREAT 2                    0.000    .
SEVERITY                   0.159    0.3843
SEVERITY*TREAT 1           0.356    0.1708
SEVERITY*TREAT 2           0.000    .

The estimate for TREAT (treatment) tells us that, if all other conditions remain the same, fluctuation under TREAT1 (oral) is on average 1.146 standard deviations higher than under TREAT2 (infusion). If the values of all the independent variables except SEVERITY are fixed at any set of numbers, an increase of one standard deviation in SEVERITY (standardized value of severity, daily activities) predicted an increase of 0.1585 standard deviations in the dependent variable, fluctuation.

The interaction term between TREAT and SEVERITY shows that severity acts differently depending on which treatment was given. According to the estimates, when TREAT1 (oral) was given, an increase of one standard deviation in severity increased fluctuation by on average 0.3562 standard deviations more than under TREAT2 (infusion).

3.1.3 Model 3: data from Pharmacokinetic study

$$\text{fluctuation} = \alpha + \beta_1 \cdot \text{treat} + \beta_2 \cdot \text{severity} + \varepsilon \qquad (3)$$

where
$\alpha$ = intercept
$\beta_1$ = estimate for treatment 1 (oral) or treatment 2 (infusion)
$\beta_2$ = estimate for severity (sum2, daily activities)
$\varepsilon$ = random error
In this case the P-value (.0028) showed (Appendix A table A.3.1) that the overall model was significant. The $R^2$ value (R-Square is 0.429470) showed that the model could explain 43% of the variation in fluctuations.

Assessing Individual Independent Variables: Statistical Significance

Table 5: Assessing Individual Independent Variables: Statistical Significance
Source     F Value  Pr > F
TREAT       5.19    0.0333
SEVERITY   10.62    0.0038

The F value of 5.19 for TREAT, with its p value (.0333), suggested that treatment significantly predicted the dependent variable. The variable SEVERITY (severity, daily activities) had an F value of 10.62 with an associated p-value of 0.0038, which showed that it was also statistically significant.

Assessing Independent Variables: Effect Size

Table 6: Assessing Independent Variables: Effect Size
Parameter                 Estimate  Pr > |T|
INTERCEPT ($\alpha$)      -0.368    0.1221
TREAT 1                    0.735    0.0333
TREAT 2                    0.000    .
SEVERITY                   0.537    0.0038

The estimate for TREAT (treatment) tells us that, if all other conditions remain the same, fluctuation under TREAT1 (oral) was on average 0.735 standard deviations higher than under TREAT2 (infusion). If the values of all the independent variables except SEVERITY are fixed at any set of numbers, an increase of one standard deviation in SEVERITY (standardized value of severity, daily activities) predicted an increase of 0.537 standard deviations in fluctuation. The low $R^2$ value for model (3) indicates that there might be some other model or variable that could explain the variation in fluctuation more precisely.
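As a worked illustration of how the estimates in Table 6 combine into a prediction, the sketch below evaluates model (3) for a given treatment and standardized severity. It is only a sketch: the function and constant names are hypothetical, and the coefficients are copied from Table 6 rather than re-estimated.

```python
# Estimates copied from Table 6 (Model 3, Pharmacokinetic study).
INTERCEPT = -0.368
# Infusion (TREAT 2) is the reference level, so its estimate is 0.
TREAT_EFFECT = {"oral": 0.735, "infusion": 0.000}
SEVERITY_SLOPE = 0.537


def predict_fluctuation(treatment: str, severity_z: float) -> float:
    """Predicted fluctuation in standard-deviation units.

    severity_z is the standardized severity (mean 0, standard deviation 1).
    The names used here are illustrative, not taken from the thesis.
    """
    return INTERCEPT + TREAT_EFFECT[treatment] + SEVERITY_SLOPE * severity_z


# Same severity, different treatment: the gap is the TREAT 1 estimate.
gap = predict_fluctuation("oral", 0.0) - predict_fluctuation("infusion", 0.0)
print(round(gap, 3))  # 0.735

# Same treatment, one standard deviation more severity: the gap is the slope.
step = predict_fluctuation("oral", 1.0) - predict_fluctuation("oral", 0.0)
print(round(step, 3))  # 0.537
```

The two printed differences correspond to the two interpretations given above: the treatment effect at fixed severity, and the severity effect at fixed treatment.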
3.1.4 Model 4: data from Pharmacokinetic study

$$\text{fluctuation} = \alpha + \beta_1 \cdot \text{concentration} + \beta_2 \cdot \text{severity} + \beta_3 \cdot \text{diseaseDuration} + \varepsilon \qquad (4)$$

where
$\alpha$ = intercept
$\beta_1$ = estimate for the standard deviation of concentration
$\beta_2$ = estimate for severity (sum3, motor)
$\beta_3$ = estimate for disease duration
$\varepsilon$ = random error

The corresponding table for this model is given in Appendix A.3. The $R^2$ value of the model (Appendix A table A.3.2) showed that the model could explain 60% of the variation in fluctuations.

Assessing Individual Independent Variables: Statistical Significance

Table 7: Assessing Individual Independent Variables: Statistical Significance
Source            F Value  Pr > F
CONCENTRATION     5.58     0.0284
SEVERITY          5.45     0.0302
DISEASEDURATION   5.53     0.0290

For this model the F value of 5.58 for CONCENTRATION, with its p value (0.0284), suggested that the standard deviation of the concentration of levodopa significantly predicted fluctuation. The variable SEVERITY (severity, motor) had an F value of 5.45 with an associated p-value of 0.0302, so the null hypothesis could be rejected in this case. For DISEASEDURATION, the F value of 5.53 and the p-value of 0.0290 likewise suggested that disease duration was a statistically significant predictor of fluctuation.

Assessing Independent Variables: Effect Size

Table 8: Assessing Independent Variables: Effect Size
Parameter         Estimate  Pr > |T|
INTERCEPT         -.000     1.0000
CONCENTRATION     0.410     0.0284
SEVERITY          0.405     0.0302
DISEASEDURATION   0.429     0.0290

The estimate for CONCENTRATION in Table 8 shows that, if the values of all the independent variables except CONCENTRATION are fixed at any set of numbers, an increase of one standard deviation in CONCENTRATION (standard deviation of the concentration of plasma levodopa) predicted an increase of 0.410 standard deviations in fluctuation.
With the values of all the independent variables except SEVERITY fixed at any set of numbers, an increase of one standard deviation in SEVERITY (standardized value of severity, motor) predicted an increase of 0.405 standard deviations in the dependent variable, fluctuation. Likewise, with all the independent variables except DISEASEDURATION fixed, an increase of one standard deviation in DISEASEDURATION (standardized value of disease duration) predicted an increase of 0.429 standard deviations in fluctuation.

3.2 Evaluating Statistical Models

Model: Data from DireQt and Pharmacokinetic Study

The first sample was generated from the DireQt study and the model was fitted to it. The second sample was generated from the Pharmacokinetic study, and the same model was used to predict the values of the second sample.

Table 9: Summary of the R-square values of the model $\text{fluctuation} = \alpha + \beta_1 \cdot \text{treat} + \beta_2 \cdot \text{severity} + \varepsilon$
                  Median  Mean   Max
1st sample set    0.451   0.455  0.789
2nd sample set    0.094   0.152  0.919

Table 9 shows that the mean $R^2$ of the model was much lower for the second sample set than for the first. The corresponding histograms are given in Appendix A (Figure A.3.1 and Figure A.3.2). For the first data set (DireQt study) the mean $R^2$ indicates that the model could explain 45% of the variation in fluctuation, whereas the same model applied to the data from the other study (Pharmacokinetic) could explain only 15%. A possible reason for the much lower $R^2$ in the Pharmacokinetic study is that the measure of severity (sum2) was taken in different ways in the two studies.

Model: Data from DireQt Study

A second evaluation was carried out within a single study: 2/3 of the whole data set was chosen at random as the calibration data.
For the remaining 1/3, the validation data set, the same model was used and the $R^2$ was calculated to check the goodness of fit. This whole process was repeated 1000 times. The summary results for the different models are given below.

Table 10: Summary of the R-square values of the models
Model 1: $\text{fluct} = \alpha + \beta_1 \cdot \text{treat} + \beta_2 \cdot \text{severity} + \varepsilon$
Model 2: $\text{fluct} = \alpha + \beta_1 \cdot \text{treat} + \beta_2 \cdot \text{severity} + \beta_3 \cdot \text{treat} \cdot \text{severity} + \varepsilon$
                       Model 1                Model 2
                       Median  Mean   Max     Median  Mean   Max
2/3 of the whole set   0.471   0.469  0.712   0.498   0.507  0.766
1/3 of the whole set   0.432   0.441  0.879   0.462   0.450  0.922

The corresponding histograms are shown in Appendix A (Figure A.3.3 & Figure A.3.4). Model 1 explained on average 47% of the variation (median value in Table 10), with a maximum of 71%. The percentage explained was not much greater than 43% for 6 patients. Model 2 explained 50% of the variation (median value), with a maximum of 77%. The percentage explained was not much greater than 47% for the same number of patients as for model 1.

Model: Data from Pharmacokinetic study

Table 11: Summary of the R-square values of the models
Model 3: $\text{fluct} = \alpha + \beta_1 \cdot \text{treat} + \beta_2 \cdot \text{severity} + \varepsilon$
Model 4: $\text{fluct} = \alpha + \beta_1 \cdot \text{concen} + \beta_2 \cdot \text{severity} + \beta_3 \cdot \text{diseaseDur} + \varepsilon$
                       Model 3                Model 4
                       Median  Mean   Max     Median  Mean   Max
2/3 of the whole set   0.402   0.416  0.756   0.677   0.657  0.888
1/3 of the whole set   0.397   0.391  0.878   0.454   0.467  0.940

The corresponding histograms are shown in Appendix A (Figure 3.5 & Figure 3.6). Model 3 explained on average 40% of the variation (median value), with a maximum of 76%. The percentage explained was similar, about 40%, for 6 patients. Model 4 explained 68% of the variation (median value), with a maximum of 89%. The percentage explained was greater than 40% for the same number of patients as for model 3.
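The repeated calibration/validation procedure summarized in Tables 10 and 11 can be sketched as follows. This is a minimal illustration in plain Python, not the SAS code used in the thesis: the helper names are hypothetical, the data are synthetic (generated from assumed coefficients close to those of model 1), and the printed numbers are therefore not expected to match the tables.

```python
import random
import statistics


def fit_ols(X, y):
    """Least-squares coefficients from the normal equations X'X b = X'y.

    Each row of X must start with 1.0 so that the first coefficient
    is the intercept.
    """
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def r_squared(beta, X, y):
    """R-square of the predictions X*beta against the observed y."""
    pred = [sum(b * v for b, v in zip(beta, row)) for row in X]
    mean_y = statistics.fmean(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot


def repeated_split_r2(X, y, repeats=1000, seed=0):
    """Fit on a random 2/3 of the records, score on the other 1/3, repeatedly."""
    rng = random.Random(seed)
    n = len(y)
    n_cal = (2 * n) // 3
    cal_r2, val_r2 = [], []
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        cal, val = idx[:n_cal], idx[n_cal:]
        Xc, yc = [X[i] for i in cal], [y[i] for i in cal]
        Xv, yv = [X[i] for i in val], [y[i] for i in val]
        beta = fit_ols(Xc, yc)
        cal_r2.append(r_squared(beta, Xc, yc))
        val_r2.append(r_squared(beta, Xv, yv))
    return cal_r2, val_r2


# Synthetic stand-in data: 36 records, treat = 1 for oral and 0 for infusion.
data_rng = random.Random(1)
X, y = [], []
for i in range(36):
    treat = 1.0 if i < 18 else 0.0
    sev = data_rng.gauss(0.0, 1.0)
    X.append([1.0, treat, sev])
    y.append(1.146 * treat + 0.337 * sev + data_rng.gauss(0.0, 0.7))

cal_r2, val_r2 = repeated_split_r2(X, y)
for name, vals in (("2/3 of the whole set", cal_r2), ("1/3 of the whole set", val_r2)):
    print(name, statistics.median(vals), statistics.fmean(vals), max(vals))
```

Note that the validation R-square is computed against the mean of the validation subset, so it can be negative when the calibrated model predicts worse than that mean.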
In both cases (Table 10 and Table 11) the value of $R^2$ is greater than for the model fitted to the whole data set (45%). This may be due to outliers: the outliers in the data set were now split between the subsets, which improved the result.

3.3 Correlation among Ratings, Diary and UPDRS

To see whether the Diary or the UPDRS could be used instead of the Ratings to build these models for explaining fluctuation, the correlation among them was checked. The correlation matrix is given below.

Table 12: Pearson Correlation matrix
                        Fluctuation (Rating)  DQ1 (Diary)  Totsum (UPDRS)
Fluctuation (Rating)    1.000                 0.104        0.072
                                              0.679        0.774
DQ1 (Diary)             0.104                 1.000        -0.304
                        0.679                              0.219
Totsum (UPDRS)          0.072                 -0.304       1.000
                        0.774                 0.219

Table 12 gives, for each pair of variables, the correlation coefficient (top number) and the p-value (second number). The variables totsum, DQ1 and flu were the representative variables for UPDRS, Diary and Rating respectively.

Correlation between flu and DQ1
Table 12 shows that the correlation coefficient of 0.104 (p = 0.679) was not significant. Figure A.2.1 (Appendix A) also supports this.

Correlation between flu and totsum
The correlation coefficient was 0.072 (p = 0.774), which was smaller than the correlation between flu and DQ1 and not significant.

Correlation between DQ1 and totsum
The correlation coefficient was -0.304 (p = 0.219), which indicates a negative correlation that is larger in magnitude than the other two but still not significant. Figure A.2.2 (Appendix A) also supports this result.

Comments: these data did not show any significant relationship (either positive or negative) among the variables concerned.
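For reference, the Pearson coefficient and the t statistic on which its significance test is based can be computed as in the sketch below. The function names are hypothetical, and the sample size of 18 is an assumption (the number of DireQt patients); the text does not state which sample Table 12 is based on. The resulting t values would be compared with a t distribution with n - 2 degrees of freedom to obtain p-values.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


N_PATIENTS = 18  # assumption: one record per DireQt patient

# Coefficients copied from Table 12.
for label, r in (("flu vs DQ1", 0.104),
                 ("flu vs totsum", 0.072),
                 ("DQ1 vs totsum", -0.304)):
    print(label, round(t_statistic(r, N_PATIENTS), 3))
```

All three t values are small in magnitude, which is in line with the conclusion above that none of the correlations is significant.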
3.4 Result Analysis

Table 13: Analysis of results for models using data from the DireQt and Pharmacokinetic studies

Model: $\text{fluctuation} = \alpha + \beta_1 \cdot \text{treat} + \beta_2 \cdot \text{severity} + \varepsilon$

Treatment
DireQt study: If all other conditions remained the same, fluctuation under oral treatment was on average 1.146 standard deviations higher than under infusion.
Pharmacokinetic study: If all other conditions remained the same, fluctuation under oral treatment was on average 0.735 standard deviations higher than under infusion.
Comments: this influential explanatory variable for fluctuation suggested that continuous duodenal infusion of levodopa offers an improvement in fluctuation in advanced PD compared with oral tablets.

Severity
DireQt study: An increase of one standard deviation in Severity (SUM2, daily activities) increased fluctuation on average by 0.336 if all other conditions remained fixed.
Pharmacokinetic study: An increase of one standard deviation in Severity (SUM2, daily activities) increased fluctuation on average by 0.537 if all other conditions remained fixed.
Comments: The more severe the patient's disease was, the higher the fluctuation was.

o Model (1) and model (3) for these two studies did not show any notable difference in $R^2$ value. The model for the DireQt study could explain 45% of the variation in fluctuations, and the model for the Pharmacokinetic study could explain 43%.

o Overall, the results from these two studies indicated that, for PD patients taken from the same population, these two explanatory variables (treatment and severity) had a significant effect on fluctuation.

o Comparatively, the treatment effect (oral or infusion) was larger than the severity effect. If all other conditions remained the same, fluctuation under oral treatment was on average 1.146 standard deviations higher than under infusion in the DireQt study and 0.735 standard deviations higher in the Pharmacokinetic study, whereas an increase of one standard deviation in severity increased fluctuation on average by 0.336 in the DireQt study and 0.537 in the Pharmacokinetic study if all other conditions remained fixed. Similar results can be expected for the same population using this same model.

o For the DireQt study, model (2)

$$\text{fluctuation} = \alpha + \beta_1 \cdot \text{treat} + \beta_2 \cdot \text{severity} + \beta_3 \cdot \text{treat} \cdot \text{severity} + \varepsilon$$

added one more explanatory variable, an interaction term between treatment and severity. The model could explain 48% of the variation in fluctuations. Though not significant at the 5% level, this term was not removed from the model because it showed how the effect of severity on fluctuation differed between treatments. That makes the model more sensible.

o For the Pharmacokinetic study, model (4)

$$\text{fluctuation} = \alpha + \beta_1 \cdot \text{concentration} + \beta_2 \cdot \text{severity} + \beta_3 \cdot \text{diseaseDuration} + \varepsilon$$

was built with other explanatory variables (in the DireQt study there was no direct measurement of the plasma concentration of levodopa). It could explain 60% of the variation in fluctuations, which was much better than the previous model. Instead of treatment, the standard deviation of the concentration of levodopa was used. "This study shows that significantly lower variability in plasma levodopa levels can be achieved with infusion of the stabilized carbidopa/levodopa suspension as compared to oral sustained-release tablets." [32] So the variability of the plasma concentration was lower for infusion and higher for oral treatment. The standardized value of severity (motor) and the duration of disease were the other two explanatory variables. Using the measurement of the plasma concentration of levodopa in model (4) increased the $R^2$ of the model for the Pharmacokinetic study. Model (4) was therefore considered the best within our search for this study.
Although the $R^2$ of model (2) was low, there might exist some other variable which, together with the variables in model (2), could explain the variation in fluctuation more precisely. However, as no data were available on such variables (for example concentration), model (2) was considered the best within our reach for the DireQt study.

4 Fuzzy Techniques

Based on the knowledge gained from the statistical models, fuzzy techniques were applied to see whether better performance could be obtained.

4.1 Fuzzy Models

Different fuzzy models using data from the DireQt and Pharmacokinetic studies, and their results, are shown below.

4.1.1 Mamdani Fuzzy Model 1: data from DireQt study

A Mamdani FIS has four steps: fuzzification of the input variables, rule evaluation, aggregation of the rule outputs and defuzzification. A two-input single-output Mamdani fuzzy model was extracted from expert knowledge, with treatment and severity as input variables and fluctuation as the output. The variable treatment was crisp, and severity and fluctuation were fuzzy variables.

Fuzzy Rules

The parameters of the IF-THEN rules (known as antecedents or premises in fuzzy modeling) define a fuzzy region of the input space, and the output parameters (known as consequents in fuzzy modeling) specify the corresponding output. Hence, the efficiency of the FIS depends on the number of fuzzy IF-THEN rules used for computation. The implemented rules can be written as:

Table 14: Rules of the Mamdani fuzzy model
Rule no.  Antecedent              Consequent (fluctuation)
          Treatment   Severity
1         oral        low         low or medium
2         oral        medium      medium or high
3         oral        high        high
4         infusion    low         low
5         infusion    medium      low or medium
6         infusion    high        medium

Fuzzy Variables and MFs

Treatment, severity and fluctuation were linguistic variables.
Oral and infusion were the linguistic values determined by the fuzzy sets on the universe of discourse of treatment. Low, medium and high were the linguistic values determined by the fuzzy sets on the universe of discourse of severity, and likewise low, medium and high were the linguistic values determined by the fuzzy sets on the universe of discourse of fluctuation. Plots of the membership functions of the inputs treatment and severity and of the output fluctuation, with universes of discourse [0, 2], [0, 50] and [0, 5] respectively, are shown in the figure.

Figure 4: Antecedent and consequent MFs

Defuzzification extracts a crisp value from the fuzzy set. The defuzzified value was taken as the value predicted by the model. The R² calculated from the observed and predicted values was 0.482, showing that the model could describe 48% of the variation in fluctuation.

4.1.2 Mamdani Fuzzy Model 2: data from Pharmacokinetic study

A three-input, single-output Mamdani fuzzy model was extracted from expert knowledge, with concentration, severity and disease_duration as input variables and fluctuation as the output. Inputs and output were all fuzzy variables.

Fuzzy rules

Table 15: Rules of the Mamdani fuzzy model

Rule no.   Antecedent                                        Consequent (fluctuation)
           Concentration   Severity   Disease_duration
1          high            none       none                   high
2          low             none       none                   low
3          none            high       none                   high
4          none            low        none                   low
5          none            none       high                   high
6          none            none       low                    low

Fuzzy Variables and MFs

Concentration, severity, disease_duration and fluctuation were linguistic variables. Low and high were the linguistic values determined by the fuzzy sets on the universe of discourse of each fuzzy variable.
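The four Mamdani steps (fuzzification, rule evaluation, aggregation, defuzzification) can be sketched for the rule base of Table 15. This is a minimal illustration under stated assumptions, not the model built in the thesis: the universes of discourse are those reported for this model ([0, 2], [0, 60], [0, 30] and [0, 3]), but the linear membership functions, the min implication, the max aggregation and the centroid defuzzification are assumed choices, since the actual MF shapes are only shown in the figure. The names `low`, `high` and `mamdani_predict` are hypothetical.

```python
import numpy as np

# Universes of discourse reported for Model 2; MF shapes below are ASSUMED.
UNIVERSE = {
    "concentration": (0.0, 2.0),
    "severity": (0.0, 60.0),
    "disease_duration": (0.0, 30.0),
    "fluctuation": (0.0, 3.0),
}

def low(x, lo, hi):
    """Assumed linear MF: membership 1 at the lower bound, 0 at the upper bound."""
    return np.clip((hi - x) / (hi - lo), 0.0, 1.0)

def high(x, lo, hi):
    """Assumed linear MF: membership 0 at the lower bound, 1 at the upper bound."""
    return np.clip((x - lo) / (hi - lo), 0.0, 1.0)

MF = {"low": low, "high": high}

# Table 15: each rule uses a single antecedent (the other inputs are "none").
RULES = [
    ("concentration", "high", "high"),
    ("concentration", "low", "low"),
    ("severity", "high", "high"),
    ("severity", "low", "low"),
    ("disease_duration", "high", "high"),
    ("disease_duration", "low", "low"),
]

def mamdani_predict(inputs, n_points=301):
    """Return a crisp fluctuation value for a dict of crisp inputs."""
    y = np.linspace(*UNIVERSE["fluctuation"], n_points)
    aggregated = np.zeros_like(y)
    for var, term, out_term in RULES:
        strength = MF[term](inputs[var], *UNIVERSE[var])           # fuzzification
        consequent = MF[out_term](y, *UNIVERSE["fluctuation"])
        clipped = np.minimum(strength, consequent)                 # implication (min)
        aggregated = np.maximum(aggregated, clipped)               # aggregation (max)
    return float((y * aggregated).sum() / aggregated.sum())        # centroid

all_high = mamdani_predict({"concentration": 2.0, "severity": 60.0, "disease_duration": 30.0})
all_low = mamdani_predict({"concentration": 0.0, "severity": 0.0, "disease_duration": 0.0})
midpoint = mamdani_predict({"concentration": 1.0, "severity": 30.0, "disease_duration": 15.0})
print(all_low, midpoint, all_high)
```

With these assumed shapes the output rises monotonically from the all-low to the all-high input case, which mirrors the direction of the rules; the numerical values themselves depend entirely on the assumed membership functions.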
Plots of the membership functions of the input variables concentration, severity and disease_duration and of the output fluctuation, with universes of discourse [0, 2], [0, 60], [0, 30] and [0, 3] respectively, are shown in the figure.

Figure 5: Antecedent and consequent MFs

Output

The calculated R² value of the model was 0.495, so the model could explain 50% of the variation of the dependent variable, fluctuation.

4.1.3 Anfis Model 3: data from DireQt and Pharmacokinetic study

Initialize the FIS: data from DireQt study

A fuzzy inference model structure was defined with a set of rules. The membership function (Gaussian MF) parameters were chosen by looking at the characteristics of the data. Rules for the FIS model:

Table 16: The rule-base of the ANFIS model

Rule no.   Antecedent                 Consequent (fluctuation)
           Treatment    Severity
1          oral         high          very high
2          oral         low           high
3          infusion     high          medium
4          infusion     low           low

Severity and treatment were the two linguistic input variables and fluctuation was the linguistic output variable. Low and high were the linguistic values of the fuzzy variable severity. Oral and infusion were the linguistic values of the linguistic variable treatment. For fluctuation the linguistic values were low, medium, high and very high. All variables were standardized (i.e., had a mean of 0 and a standard deviation of 1.0).

Before training, the universe of discourse for severity was [-1.61 1.8], associated with the fuzzy values low [0.95 -1.53] and high [0.87 1.56]. Treatment had a universe of discourse [1 2], where the fuzzy value oral was [0.2 1] and infusion was [0.2 2]. The universe of discourse for the variable fluctuation was [-1 2.4].
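The initial FIS of Table 16 can be sketched as a zero-order Sugeno system. This is a hedged illustration only: it assumes the bracketed MF parameter pairs follow the [sigma, center] convention used for Gaussian membership functions in the MATLAB Fuzzy Logic Toolbox, it assumes product as the AND operator, and the constant consequent values attached to low, medium, high and very high are hypothetical placeholders chosen inside the fluctuation universe [-1 2.4], since the thesis text does not list them. The names `gaussmf` and `sugeno_predict` are illustrative.

```python
import math

def gaussmf(x, sigma, c):
    """Gaussian membership function with width sigma and center c."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

# Pre-training MF parameters quoted in the text, read as (sigma, center) -- an assumption.
SEVERITY = {"low": (0.95, -1.53), "high": (0.87, 1.56)}
TREATMENT = {"oral": (0.2, 1.0), "infusion": (0.2, 2.0)}

# HYPOTHETICAL constant consequents; the real values are not given in the text.
CONSEQUENT = {"low": -1.0, "medium": 0.0, "high": 1.0, "very high": 2.0}

# Table 16: (treatment term, severity term, fluctuation term)
RULES = [
    ("oral", "high", "very high"),
    ("oral", "low", "high"),
    ("infusion", "high", "medium"),
    ("infusion", "low", "low"),
]

def sugeno_predict(treatment, severity):
    """Weighted average of rule consequents (zero-order Sugeno inference)."""
    num = 0.0
    den = 0.0
    for t_term, s_term, out_term in RULES:
        # Firing strength: product of the antecedent memberships (assumed AND operator).
        w = gaussmf(treatment, *TREATMENT[t_term]) * gaussmf(severity, *SEVERITY[s_term])
        num += w * CONSEQUENT[out_term]
        den += w
    return num / den

oral_severe = sugeno_predict(treatment=1.0, severity=1.56)
infusion_severe = sugeno_predict(treatment=2.0, severity=1.56)
oral_mild = sugeno_predict(treatment=1.0, severity=-1.53)
print(oral_severe, oral_mild, infusion_severe)
```

Under these assumptions the sketch reproduces the ordering encoded in the rule base: oral treatment with high severity yields the largest output, and infusion yields lower output than oral at the same severity.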
Training the FIS: The parameters associated with the membership functions change through the training process. Back propagation was used for membership function parameter estimation. Validation of the model was done with the data from the Pharmacokinetic study. The error tolerance is used to create a training stopping criterion; training stops once the training data error remains within this tolerance. The error tolerance was left at 0 and the number of training epochs was set to 40. The figure shows that the checking error decreases up to a certain point in the training and then increases. This increase marks the point of model overfitting. After training, the membership functions were tuned to give the minimum error.

Figure 6: Error after 40 epochs

Output

Linguistic variable   Universe of discourse   Linguistic value   MF parameters
Severity              [-1.604 1.791]          low                [1.111 -1.433]
                                              high               [0.5122 1.73]
Treatment             [1 2]                   oral               [0.2266 0.9997]
                                              infusion           [0.272 2]
Fluctuation           [-1.427 2.404]

Figure 7: Membership function of severity before and after training

The R² value for the untrained fuzzy inference model using data from the DireQt study was 0.491. After 40 epochs of training using the same data this value increased to 0.524, which shows that the model can now explain 52% of the variation of fluctuation.

Figure 8: Surface view

4.1.4 Anfis Model 4: data from Pharmacokinetic study

Initialize the FIS: data from Pharmacokinetic study

A three-input, single-output Sugeno fuzzy model was defined based on the data from the Pharmacokinetic study. The parameters of the membership functions were determined by looking at the normalized data sets of the Pharmacokinetic study.
The implemented rules were as follows:

Table 17: The rule-base of the ANFIS model

Rule no.   Antecedent                                        Consequent (fluctuation)
           Concentration   Severity   Disease_duration
1          low             low        low                    low
2          high            low        low                    low
3          low             high       low                    low
4          high            high       low                    medium
5          low             low        high                   medium
6          high            low        high                   high
7          low             high       high                   high
8          high            high       high                   high

The linguistic fuzzy variables were disease_duration, severity and concentration for the input and fluctuation for the output.

Training the FIS

To tune the parameters of the membership functions, the predetermined FIS was trained with this set of rules. The figure shows that the error decreased with training.

Figure 9: Error after 40 epochs

Output: After 40 epochs of training, the universes of discourse and the parameters of the (Gaussian) MFs were:

Linguistic variable   Universe of discourse   Linguistic value   After training
Disease_duration      [-0.3861 0.5475]        low                [0.3009 -0.3999]
                                              high               [0.2994 0.6]
Severity              [0.6067 1.136]          low                [0.096 0.6022]
                                              high               [0.1874 1.205]
Concentration         [-1.154 -0.7494]        low                [0.09286 -1.035]
                                              high               [0.03209 -0.7285]
Fluctuation           [-1.427 1.988]

Before training, the R² value of the FIS was only 0.25; after training it increased to 0.71.

Figure 10: Surface view

4.2 Evaluating Fuzzy Models

Mamdani model 1

To evaluate Mamdani model (1), a model was generated from the DireQt study; the input membership functions and rules were defined based on this study. With this predetermined model structure, output was generated for the Pharmacokinetic study to see how well the model predicted data from another study.
The calculated R² (0.311) shows that the fuzzy model could explain 31% of the variability of fluctuation, which was less than the 48% obtained by Mamdani model (1) for the DireQt study.

Mamdani model 2

There were no data for calculating the plasma levodopa concentration in the DireQt study, so this model could not be evaluated on data from a different study.

Anfis Model 3

To see how well the model behaved with data from another study, data from the Pharmacokinetic study were applied to the predefined fuzzy model whose variables were characterized by the DireQt study. The calculated R² in this case shows that the model can only explain 24% of the variability of fluctuation before training. The trained FIS can predict 30% of the variability of fluctuation. This indicates that the training data presented to ANFIS for estimating the membership function parameters were not fully representative of the features of the data that the trained FIS was intended to model.

4.3 Result Analysis

Study             Mamdani models R²   ANFIS models R²
DireQt            0.482               Before train: 0.491; After train: 0.524
  Comments: The Mamdani FIS and the Sugeno FIS have almost the same R² value. Training gives an improvement, but not a large one.
Pharmacokinetic   0.495               Before train: 0.253; After train: 0.708
  Comments: The Mamdani FIS and the Sugeno FIS differ considerably in their results, and the best result was obtained after training the ANFIS.

Figure 11: Analysis of results from different fuzzy models

The Mamdani fuzzy inference model is the most commonly used fuzzy inference technique. It is intuitive, has widespread acceptance and is well suited to human input. Mamdani fuzzy models were implemented using the data from the DireQt and Pharmacokinetic studies.
For ANFIS, a FIS model was implemented by looking at the characteristics of the variables from the DireQt study. The above figure shows that a better result was obtained after training the FIS, that is, after tuning the MFs. Expert knowledge expressed in linguistic terms may sometimes be faulty and requires the model to be tuned. In this case the learning capability of neural networks was used to support the expert knowledge: it changed the parameters of the membership functions and gave a better result.

5 Performance: Statistical and Fuzzy Models

In order to investigate the factors that affect fluctuation in advanced PD, several models were implemented using statistical and fuzzy model building techniques. In both cases the coefficient of determination was calculated to measure the fit of the model to the data. The table shows the R² value for the different models.

Table 18: Comparing R² for different models

                                R²
Study             Statistical   Fuzzy models
                  models        Mamdani   ANFIS before train   ANFIS after train

Model 1: Dependent variable (fluctuation) = treatment + severity (daily activities)
DireQt            0.451
Pharmacokinetic   0.430

Model 2: Dependent variable (fluctuation) = treatment + severity (daily activities) + interaction of treatment and severity
DireQt            0.483         0.483     0.491                0.524
Pharmacokinetic                 0.312     0.237                0.304

Model 3: Dependent variable (fluctuation) = concentration + severity (motor) + disease_duration
Pharmacokinetic   0.598         0.495     0.253                0.708

The table shows that, measured by the fit of the model to the data, the best performance on the data used to build each model was obtained with the ANFIS model, that is, when the parameters of the model were tuned by training a neural network through back propagation.
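Since all models are compared through R², it may help to see how that quantity can be computed from observed and predicted values. The sketch below uses made-up numbers, not thesis data, and hypothetical function names. It contrasts the squared-correlation definition given in Appendix B with the explained-variance form 1 - SSE/SST. The two agree for a least-squares regression with an intercept, but they can differ for other predictors such as a fuzzy model; the text does not state which form was used for the fuzzy models beyond the Appendix B definition.

```python
import numpy as np

def r2_squared_correlation(observed, predicted):
    """R^2 as the squared Pearson correlation between observed and predicted values."""
    r = np.corrcoef(observed, predicted)[0, 1]
    return float(r ** 2)

def r2_explained_variance(observed, predicted):
    """R^2 as 1 - SSE/SST (can be negative for a poorly calibrated predictor)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    sse = ((observed - predicted) ** 2).sum()
    sst = ((observed - observed.mean()) ** 2).sum()
    return float(1.0 - sse / sst)

# Made-up illustrative values (standardized fluctuation scores).
observed = np.array([0.2, -0.5, 1.1, 0.7, -1.3, 0.4])
predicted = np.array([0.1, -0.2, 0.8, 0.9, -0.9, 0.1])
biased = predicted + 0.5          # same pattern, shifted by a constant offset

print(r2_squared_correlation(observed, predicted), r2_explained_variance(observed, predicted))
print(r2_squared_correlation(observed, biased), r2_explained_variance(observed, biased))
```

The constant offset leaves the squared correlation unchanged but lowers 1 - SSE/SST, so the choice of definition matters when a model built on one study is evaluated on another.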
Evaluating model (2) shows that when the validation data were taken from another study, the model explained less of the variation in fluctuation. In most traditional statistical models, the data have to be normally distributed before the model coefficients can be estimated efficiently. If the data are not normally distributed, suitable transformations to normality have to be applied. A statistical model gives the sampling error, that is, the difference, attributable to taking only a sample of values, between what was observed in the sample and what was present in the population. It also gives the estimate, a quantity obtained from a sample that is used to represent a population parameter.

The advantage of the fuzzy model is that it is described by a set of linguistic rules, which makes it easily interpretable. The most distinctive characteristic of fuzzy theory, in contrast to classical mathematics, is that it operates on membership functions (MFs) instead of the crisp real values of the variables. This makes it effective at handling noisy data, especially when the underlying physical relationships are not fully understood.

6 Conclusion

The results obtained from the different models suggested that motor fluctuation in patients with advanced Parkinson’s disease was related to treatment and severity (daily activities). From the estimates of the statistical model it was found that the effect of treatment was larger than that of severity. Motor fluctuation was higher when patients were treated with oral levodopa than when they were receiving infusion of Duodopa. The more severe (daily activities) the patient’s disease, the higher the fluctuation was.
Similar results could be expected for the same population using this same model. The models were evaluated to see how well they predicted when data were taken from the same study or from a different study. With the same study, the validation data produced similar results, but with data from a different study the models always explained less of the variation in fluctuation than they did for the data from which they were obtained. The model using data from the Pharmacokinetic study showed that fluctuation increased with the variability of plasma levodopa concentration, with longer disease duration and with greater disease severity (motor).

Fluctuation did not show any significant relationship with the other explanatory variables, such as sex, age at onset and anti-Parkinson medication, in these data sets. As noted in the introduction, some researchers have shown that these variables influence fluctuation. Nothing can be concluded about these factors here, because the small sample size may be the reason these variables did not show any significant relationship.

Statistical models give the sampling error and the estimate of the parameter as the representative of the population. In fuzzy models the results are easily interpretable, and such models can be used when the underlying physical relationships are not fully understood. In particular, the possibility offered by the fuzzy formalism (linguistic expression of the rules) seems interesting for integrating new features easily into the model. Different statistical and fuzzy models were used in order to obtain a simple model with better performance. As discussed in the earlier sections, the best fit of the model to the data was obtained after tuning the parameters of the fuzzy membership functions using a neural network through back propagation, i.e. using the ANFIS model.
Further evaluation of the models with data from other studies of advanced Parkinson’s disease is possible, to check the model performance. The aim was to build a simple model with better performance. No model is true in the real sense; however, some models are certainly better than others. The results obtained were the better results within this search space and may not be the best. The search process can therefore continue, to see whether other models or a different set of variables can produce better or the best results.

Appendix A

Table A.1: Model 1 Data from DireQt Study

Table A.1.1: Anova table

Dependent Variable: FLUSTD

Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               2   15.78761127      7.89380563    13.56     0.0001
Error              33   19.21239389      0.58219375
Corrected Total    35   35.00000516

Table A.1.2: Assessing Overall Fit: Effect Size

R-Square    C.V.        Root MSE      FLUSTD Mean
0.451075    -9999.99    0.76301622    -0.00000007

Table A.1.3: Assessing Individual Independent Variables

Source     DF   Type III SS    Mean Square    F Value   Pr > F
TREAT       1   11.82009087    11.82009087    20.30     0.0001
SUM2STD     1   3.96752040     3.96752040     6.81      0.0135

Table A.1.4: Assessing Independent Variables: Effect Size

Parameter    Estimate          T for H0: Parameter=0   Pr > |T|   Std Error of Estimate
INTERCEPT    -0.573006057 B    -3.19                   0.0031     0.17984465
TREAT 1      1.146011967 B     4.51                    0.0001     0.25433874
TREAT 2      0.000000000 B     .                       .          .
SUM2STD      0.336686392       2.61                    0.0135     0.12897328

Table A.2: Model 2 Data from DireQt study

Table A.2.1: R-square Value of the Model from Analysis of Variance Table

R Square
0.482797

Table A.2.2: Assessing Individual Independent Variables: Statistical Significance

Source           DF   Type III SS    Mean Square    F Value   Pr > F
TREAT             1   11.82009090    11.82009090    20.89     0.0001
SUM2STD           1   3.96752040     3.96752040     7.01      0.0125
SUM2STD*TREAT     1   1.11028696     1.11028696     1.96      0.1708

Table A.2.3: Assessing Independent Variables: Effect Size

Parameter          Estimate          T for H0: Parameter=0   Pr > |T|   Std Error of Estimate
INTERCEPT          -0.573006058 B    -3.23                   0.0028     0.17727738
TREAT 1            1.146011969 B     4.57                    0.0001     0.25070807
TREAT 2            0.000000000 B     .                       .          .
SUM2STD            0.158578321 B     0.88                    0.3843     0.17979208
SUM2STD*TREAT 1    0.356216142 B     1.40                    0.1708     0.25426439
SUM2STD*TREAT 2    0.000000000 B     .                       .          .

Table A.3: Model 3: Data from Pharmacokinetic Study

Table A.3.1: Anova table

Dependent Variable: FLUSTD

Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               2   9.87780306       4.93890153    7.90      0.0028
Error              21   13.12220134      0.62486673
Corrected Total    23   23.00000440

Table A.3.2: Assessing Individual Independent Variables

Source     DF   Type III SS   Mean Square   F Value   Pr > F
MED         1   3.2440633     3.24406339    5.19      0.0333
SUM2STD     1   6.63373968    6.63373968    10.62     0.0038

Table A.3.3: Assessing Independent Variables: Effect Size

Parameter    Estimate           T for H0: Parameter=0   Pr > |T|   Std Error of Estimate
INTERCEPT    -.3676537953 B     -1.61                   0.1221     0.22819340
MED 1        0.7353075759 B     2.28                    0.0333     0.32271420
MED 2        0.0000000000 B     .                       .          .
SUM2STD      0.5370507083       3.26                    0.0038     0.16482754

Table A.4: Model 4: Data from Pharmacokinetic Study

Table A.4.1: R-square Value of the Model from Analysis of Variance Table

R Square
0.598246

Table A.4.2: Assessing Individual Independent Variables: Statistical Significance

Source      DF   Type III SS   Mean Square   F Value   Pr > F
SCONCEND     1   2.57852362    2.57852362    5.58      0.0284
SUM3STD      1   2.51631734    2.51631734    5.45      0.0302
PD_STD       1   2.55588497    2.55588497    5.53      0.0290

Table A.4.3: Assessing Independent Variables: Effect Size

Parameter    Estimate         T for H0: Parameter=0   Pr > |T|   Std Error of Estimate
INTERCEPT    -.0000000526     -0.00                   1.0000     0.13874688
SCONCEND     0.4095854961     2.36                    0.0284     0.17337556
SUM3STD      0.4054231289     2.33                    0.0302     0.17372194
PD_STD       0.4286012772     2.35                    0.0290     0.18222656

Figure A.1: QQ plot for Fluctuation (FLUSTD plotted against N_FLUS_5)

Figure A.2: Correlation among Ratings, Diary and UPDRS
Figure A.2.1: DQ1 against flu
Figure A.2.2: Plot totsum against flu
Figure A.2.3: Plot totsum against DQ1

Figure A.3: Evaluation of the models (Histogram of R²)
Figure A.3.1: DireQt & Pharmacokinetic study (model 1) (panels: 2/3 of the whole data set; 1/3 of the whole data set)
Figure A.3.2: DireQt study (model 1) (panels: 2/3 of the whole data set; 1/3 of the whole data set)
Figure A.3.3: DireQt study (model 2)
Figure A.3.4: Pharmacokinetic study (model 3)
Figure A.3.5: Pharmacokinetic study (model 4)
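The summary statistics of Model 1 can be cross-checked from the sums of squares in Table A.1.1. The short sketch below only re-derives quantities already printed in the appendix (mean squares, F value, R-square, root MSE) from the tabulated values; the variable names are illustrative, and the p-value is not recomputed.

```python
import math

# Values copied from Table A.1.1 (Model 1, DireQt study).
ss_model, df_model = 15.78761127, 2
ss_error, df_error = 19.21239389, 33
ss_total = 35.00000516

ms_model = ss_model / df_model        # mean square for the model
ms_error = ss_error / df_error        # mean square for error
f_value = ms_model / ms_error         # F = MSmodel / MSerror
r_square = ss_model / ss_total        # proportion of variation explained
root_mse = math.sqrt(ms_error)        # residual standard deviation

print(f"MS model = {ms_model:.8f}, MS error = {ms_error:.8f}")
print(f"F = {f_value:.2f}, R-square = {r_square:.6f}, Root MSE = {root_mse:.8f}")
```

The recomputed values agree with the reported F Value of 13.56, R-Square of 0.451075 and Root MSE of 0.76301622 to the printed precision.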
Appendix B

Abbreviation and definition of terms

PD: Parkinson’s disease.

Dyskinesias: involuntary movements caused by too much levodopa.

Off: symptom of untreated PD.

Motor fluctuations: rapid changes in motor function between off and dyskinetic states.

Per Protocol Set: All patients who were correctly included, fulfilled at least 82% of each video recording day (09:00 to 17:00) per 3-week period, and received study treatment as planned with no prohibited therapies were included in the Per Protocol Set.

Fit Analysis: Provides methods for examining the relationship between a response (dependent) variable and a set of explanatory (independent) variables. Least squares methods can be used for simple and multiple linear regression, with various diagnostic capabilities, when the response is normally distributed.

F value (also called an F ratio): A test statistic for the ratio of two estimates of the same variance. In ANOVA terms, F equals the mean square for the model divided by the mean square for error, or $F = MS_{model} / MS_{error}$. $MS_{model}$ deals with the predicted values of the model; $MS_{error}$ deals with the error, or the residuals, from the model. The p level gives the significance level for the F statistic.

R-Square: The square of the correlation between the predicted values and the observed values of the dependent variable. Hence, it is an estimate of the proportion of variance in the dependent variable explained by the model. Mathematically, R² has a lower bound of 0 (although in practice an R² exactly equal to 0 is implausible) and an upper bound of 1.0. The larger the value of R², the better the model predicts the data.

Quantile-Quantile Plots: A visual check of the fit of a theoretical distribution to the observed data, made by examining the quantile-quantile (or Q-Q) plot (also called a Quantile Plot). In this plot, the observed values of a variable are plotted against the theoretical quantiles.
A good fit of the theoretical distribution to the observed values is indicated if the plotted values fall on a straight line.

Backward stepwise deletion procedure: An iterative process in which, at each step, one independent variable is deleted based on the F statistic.

Correlation Coefficient: Indicates the strength of the linear relationship between two continuous variables. A positive correlation coefficient means that as the values of one variable increase, the values of the other variable also tend to increase. A negative correlation means that as one increases, the other decreases. The correlation coefficient is a number ranging from -1 to +1. A small or zero correlation coefficient indicates that the variables are not linearly related.

Soft Computing: An approach to computing which parallels the remarkable ability of the human mind to reason and learn in an environment of uncertainty and imprecision. (Lotfi A. Zadeh, 1992 [1])

Fuzzy Variable: A fuzzy variable defines the language that will be used to discuss a fuzzy concept such as temperature, pressure or height.

References

[1] Rinne UK. Problems associated with long-term levodopa treatment of Parkinson’s disease. Acta Neurol Scand 1983;95
[2] Markham CH, Diamond SG. Long-term follow-up of early dopa treatment in Parkinson’s disease. Ann Neurol 1986;
[3] Dupont E, Andersen A, Boas J et al. Sustained-release Madopar HBS® compared with standard Madopar in the long-term treatment of de novo Parkinsonian patients. Acta Neurol Scand 1996;93
[4] Block G, Liss C, Reines S, Irr J, Nibelink D and the CR First Study Group. Comparison of immediate release and controlled release carbidopa/levodopa in Parkinson’s disease. Eur Neurol 1997
[5] Bredberg E, Tedroff J, Aquilonius S-M, Paalzow L. Pharmacokinetics and effects of levodopa in advanced Parkinson’s disease. Eur J Clin Pharmacol 1995
[6] Harder S, Baas H, Rietbrock S. Concentration-effect relationship of levodopa in patients with Parkinson’s disease. Clin Pharmacokinet 1995
[7] Kurlan R, Rubin AJ, Miller C, et al. Duodenal delivery of levodopa for on-off fluctuations in parkinsonism: preliminary observations. Ann Neurol 1986;20:262-265.
[8] Syed N, Murphy J, Zimmerman Jr T, et al. Ten years' experience with enteral levodopa infusions for motor fluctuations in Parkinson's disease. Mov Disord 1998;13:336-338.
[9] Olsson U. (2002) Generalized Linear Models. An Applied Approach. Studentlitteratur, Lund. Printed in Sweden. ISBN 91-44-04155-1
[10] Linkens DA, Abbod MF, Mahfouf M, 1988. “Department of Automatic Control and Systems Engineering”. University of Sheffield, Sheffield S1 3JD, United Kingdom.
[11] Zadeh LA. Outline of a new approach to the analysis of complex systems and decision processes. IEEE Trans. Systems, Man Cybernet. 3 (1973) 28–44.
[12] McCulloch JH and Meginniss JR (2002) A Statistical Model of Smallpox Vaccine Dilution
[13] Sibbald B (2003) Estimates of flu-related deaths rise with new statistical models. CMAJ: Canadian Medical Association Journal, Volume 168, Number 6, Page 761
[14] Calder B, Clarke S, Linnett L, Carmichael D (1996) Statistical models for the detection of abnormalities in digital mammography. Digital Mammography, IEE Colloquium, Pages 6/1-6/6
[15] Zylstra S, Bors-Koefoed R, Mondor M, Anti D, Giordano K and Resseguie LJ (1994) A statistical model for predicting the outcome in breast cancer malpractice lawsuits. Obstetrics & Gynecology 1994; 84:392-398
[16] Saletic DZ and Velasevic DM. Improvements in the Fuzzy Model of Determining the Severity of Respiratory Failure. 15th IEEE Symposium on Computer-Based Medical Systems (CBMS'02) p. 353
[17] Jafelice RM, de Barros LC, Bassanezi RC, Gomide F (2004) Fuzzy modeling in symptomatic HIV virus infected population. Bulletin of Mathematical Biology, Volume 66, Number 6, Pages 1597-1620
[18] Aruna P, Puviarasan N, Palaniappan B (2005) An investigation of neuro-fuzzy systems in psychosomatic disorders. Expert Systems with Applications, Volume 28, Number 4, Pages 673-679
[19] Esogbue AO (1983) Performance of a fuzzy set theoretic model for medical diagnosis. Computer Applications in Medical Care, 1983. Proceedings. The Seventh Annual Symposium, Pages 856-858
[20a] Barbeau A. High-level levodopa therapy in severely akinetic Parkinsonian patients: twelve years later. In: Rinne UK, Klinger M, Stammer G. Parkinson’s disease: current progress, problems and management. Amsterdam: Elsevier; 1980. p. 229-39
[20b] Kostic V, Przedborski S, Flaster E, Sternic N. Early development of levodopa-induced dyskinesias and response fluctuations in young-onset Parkinson’s disease. Neurology 1991; 41:202-5
[21] Lees AE, Stern GM. Sustained bromocriptine therapy in previously untreated patients with Parkinson’s disease. J Neurol Neurosurg Psychiatry 1981; 44: 1020-3
[22] Rinne UK. Early combination of bromocriptine and levodopa in the treatment of Parkinson’s disease: a 5-year follow-up. Neurology 1987; 37: 826-8
[23] Brannan T, Yahr MD. Comparative study of selegiline plus L-dopa-carbidopa versus L-dopa-carbidopa alone in the treatment of Parkinson’s disease. Ann Neurol 1995; 37:95-8
[24] Tanner CM, Kinori I, Goetz CG, Carvey PM, Klawans HL. Age at onset and clinical outcome in idiopathic Parkinson’s disease [abstract]. Neurology 1985; 35 [Suppl 1]: 276
[25] Lesser RP, Fahn S, Snider SR, Cote LJ, Tsgreen WP, Barrett RE. Analysis of the clinical problems in Parkinsonism and the complications of long-term levodopa therapy. Neurology 1979; 29:1253-60
[26] Barbeau A. High-level levodopa therapy in severely akinetic Parkinsonian patients: twelve years later. In: Rinne UK, Klinger M, Stammer G. Parkinson’s disease: current progress, problems and management. Amsterdam: Elsevier; 1980. p. 229-39
[27] Jang JSR, Sun CT and Mizutani E. Neuro-fuzzy and Soft Computing. A computational approach to learning and machine intelligence. Prentice Hall, NJ. 1997
[28] The toolkit can be obtained by accessing the web http://ai.iit.nrc.ca/IR_public/fuzzy/
[29] Thuraisingham B. Data Mining Technologies, Techniques, Tools and Trends. CRC Press, New York, USA. 1999. ISBN 0-8493-1815-7
[30] Fahn S, Elton R, Members of the UPDRS Development Committee. In: Fahn S, Marsden CD, Calne DB, Goldstein M, eds. Recent Developments in Parkinson’s disease, Vol 2. Florham Park, NJ. Macmillan Health Care Information 1987, pp 153-163, 293-304
[31] Nyholm D. Pharmacotherapy for Parkinson’s disease - Observations and Innovations. http://www.divaportal.org/diva/getDocument?urn_se_uu_diva-3354-1_fulltext.pdf Last referred on December 2004
[32] Nyholm D. Optimizing levodopa pharmacokinetics -- intestinal infusion versus oral sustained-release tablets. Department of Neuroscience, Neurology. Uppsala University, SE-751 85, Uppsala, Sweden.
[33] Jang JSR, Sun CT and Mizutani E. Neuro-fuzzy and Soft Computing. A computational approach to learning and machine intelligence. Prentice Hall, NJ. 1997
[34] Shefrin SL (1999) Therapeutic advances in idiopathic Parkinsonism. Expert Opin Investig Drugs. 1999 Oct; 8(10):1565-88.
[35] Rascol O, Goetz C, Koller W, Poewe W, Sampaio C. Treatment interventions for Parkinson’s disease: an evidence based assessment. Lancet 2002; 359: 1589-1598.