Generalized Estimating Equations for Depression Dose Regimes
Karen Walker, Walker Consulting LLC, Menifee CA

On average, Generalized Estimating Equations (GEE) produce consistent estimates of the regression coefficients and their variances under weak assumptions about the actual correlation as the number of treatments becomes large.

Use the GEE to:
• Relate the marginal response to the covariates with a link function, for example the log of the odds.
• Specify the variance function.
• Test the data to choose a working correlation matrix.

The model is then fitted as follows:
1. Compute an initial estimate of $\beta$, for example with an ordinary generalized linear model assuming independence.
2. Compute the working correlation matrix $R_i$.
3. Compute an estimate of the covariance matrix.
4. Update $\beta$.
5. Compute residuals and update the working correlation and covariance estimates.
6. Iterate until convergence.

ABSTRACT
While working on a study for a depression drug, I came across drug administration data where subjects had no real treatment definition. Subject dose regimes consisted of scenarios like 10 mg per day, 10 mg + 20 mg + 40 mg per week, or 20 mg + 10 mg + 10 mg, and so on. At first glance, I thought "there has to be missing data and those regimes will have a definition later." However, what if there's a need to process this data just as it is? How can this be done? Would it make sense to define a response level for each of the cumulative logits of the dose values over the course of a study, and then fit them to a proportional odds model? In this paper I will demonstrate how the cumulative logits are affected in the same way, using a parallel slopes test. We will use this information to see if the log cumulative odds are proportional, discover the influence of the explanatory variables, and find the points where regression lines "connect the dots" for a single continuous explanatory variable.

INTRODUCTION
So what is depression? According to WebMD, clinical depression is diagnosed when a change in one or more chemicals in the brain causes abnormal brain function.
Since the cause of depression can be a mix of chemicals, there is no single disease to treat as there is with, say, an infection, chronic pain, or even cancer. To treat depression we must examine all the risk factors that change the brain's chemical balance: for example GENDER, because of chemicals introduced in the brain for women during pregnancy and menopause, and AGE, because as people grow older chemicals are elevated in the brain by the grief and trauma experienced with age. Health conditions like cancer, heart disease, being overweight, or chronic pain are the biggest complaints in people who are diagnosed as clinically depressed. Anyone can suffer clinical depression due to physical or emotional abuse or violence. Other stressful events, such as moving, marriage, divorce, a new baby, or a new job, can cause clinical depression symptoms that are more than just sadness.

What makes treating clinical depression tricky, and what makes this paper so interesting, is that although millions of people suffer from clinical depression, there's no definite way to treat it. How can we know when clinical depression is cured when some subjects have a clear sense of why they became depressed, and other subjects don't know when and where it happened? For the next 20 minutes or so we'll explore the many ways to treat it to see if we can uncover the best way.

DOSE REGIMES FOR DEPRESSION
First we will recognize that depression has to be managed on a daily basis at the very least, to cover those subjects who didn't know when or where it happened. We'll build a Trial Arm dataset (TA) that contains the data points mentioned above and a few others so we can measure the effects of SEX, AGE, OBESITY, SUBSTANCE USE, DRUG, DOSE, and RESPONSE, where response is "Better=4", "Slightly Better=3", "No Change=2", "Slightly Worst=1", "Worst=0".
This depression dataset will contain all the subject records to be analyzed, and will assign a sequential number each time a subject takes a dose, so that the sequential number is both the visit and the number of times the subject had treatment. Note: we also want to deal with overeating, too much sugar, and substance use of alcohol, tobacco, or caffeine, so we'll have obesity and substance use as covariates.

There are over 30 known drugs available for the treatment of clinical depression, and many subjects, at least 200 of those surveyed, admit to taking more than one kind of drug when dealing with the sadness they experienced. We'll consider 12 depression dose regimes for discussion here.

This kind of complex data distribution can be fitted to a generalized linear model because it allows for response variables that have non-normal distributions and for an arbitrary link function as well. The Generalized Linear Model (GLM) relates a mean response to a vector of explanatory variables through a link function. However, where a regular linear model works best for a simple normal distribution, the generalized model allows for other distributions of the response variable, so the link function can have assorted shapes. The GLM consists of three elements:
1. A probability distribution from the exponential family.
2. A linear predictor $\eta = X\beta$.
3. A link function $g$ such that $E(Y) = \mu = g^{-1}(\eta)$.

Generalized Estimating Equations were introduced by Liang and Zeger in 1986 as a method of handling correlated data that can be modeled as a Generalized Linear Model, for example health outcomes (longitudinal studies) or litters (clustered data). Generalized estimating equations are an extension of GLMs to accommodate correlated data; they are an extension of quasi-score equations.
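To make the three GLM elements listed above concrete, here is a minimal sketch of how they could be filled in for a binary response with a logit link, which is the kind of model fitted later in this paper. The Bernoulli notation and the variance function $v(\mu)$ are standard textbook forms added here for illustration; they are not taken from the paper itself.

```latex
% Illustrative sketch: the three GLM elements for a binary response Y
\[
\begin{aligned}
&\text{1. Distribution:}      && Y \sim \mathrm{Bernoulli}(\mu), \qquad v(\mu) = \mu(1-\mu) \\
&\text{2. Linear predictor:}  && \eta = x'\beta \\
&\text{3. Link (logit):}      && g(\mu) = \log\frac{\mu}{1-\mu} = \eta
   \quad\Longleftrightarrow\quad
   \mu = g^{-1}(\eta) = \frac{e^{\eta}}{1+e^{\eta}}
\end{aligned}
\]
```

Under this link, each coefficient in $\beta$ is a log odds ratio, so exponentiating a coefficient gives the multiplicative change in the odds of the response.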
The GEE approach models a known function of the marginal expectation of the dependent variable as a linear function of one or more explanatory variables. With quasi-likelihood, you can pursue statistical models by making assumptions about the link function and the relationship between the first two moments, but without specifying the complete distribution of the response. The GEE describes the random component with a common link and variance function, and it accounts for the covariance structure of the correlated measures.

So let $y_{ij}$ ($j = 1, \dots, n_i$; $i = 1, \dots, K$) represent the jth measurement on the ith subject. For our purpose, j indexes the dose regimes and i indexes the clinical depression subjects. There are $n_i$ dose regimes for subject i and $\sum_{i=1}^{K} n_i$ total measurements.

Here's how to make it work.

Step 1
The generalized estimating equation for $\beta$ is an extension of the GLM estimating equation:

$$\sum_{i=1}^{K} \frac{\partial \mu_i'}{\partial \beta} V_i^{-1} \left(Y_i - \mu_i(\beta)\right) = 0$$

where $Y_i = [y_{i1}, \dots, y_{in_i}]'$, $\mu_i = [\mu_{i1}, \dots, \mu_{in_i}]'$ is the corresponding vector of means, and $V_i$ is an estimate of the covariance matrix of $Y_i$.

Step 2
The working correlation matrix $R_i(\alpha)$ is estimated by using the current value of the parameter vector $\beta$ to compute the appropriate function of the Pearson residual

$$e_{ij} = \frac{y_{ij} - \mu_{ij}}{\sqrt{v(\mu_{ij})}}$$

Step 3
Specify the variance of $Y_i$ by a covariance matrix modeled as

$$V_i = \phi A_i^{1/2} R_i(\alpha) A_i^{1/2}$$

where $\phi$ is the dispersion parameter and $A_i$ is an $n_i \times n_i$ diagonal matrix with $v(\mu_{ij})$ as the jth diagonal element.

Step 4
Test hypotheses of the form $C\beta = 0$ with the Wald statistic

$$Q_C = (C\hat{\beta})' \left[C V_{\beta} C'\right]^{-1} (C\hat{\beta})$$

where $C$ is a contrast matrix and $V_{\beta}$ is the estimated covariance matrix of $\hat{\beta}$. Update $\beta$:

$$\beta_{r+1} = \beta_r + \left[\sum_{i=1}^{K} \frac{\partial \mu_i'}{\partial \beta} V_i^{-1} \frac{\partial \mu_i}{\partial \beta}\right]^{-1} \left[\sum_{i=1}^{K} \frac{\partial \mu_i'}{\partial \beta} V_i^{-1} (Y_i - \mu_i)\right]$$

Step 5
Compute residuals and update the working correlation and covariance estimates.

Step 6
Iterate until convergence.

DEPRESSION DATA

ID   SEX     AGE  OBESITY  Su       VISIT  Treatment         Regime    RESPONSE
101  Female  39   Yes      Alcohol  1      30mg              standard  2
101  Female  39   Yes      Alcohol  2      30mg Second time  standard  1
101  Female  39   Yes      Tobacco  3      30mg              new       1
101  Female  39   Yes      Sugar    4      30mg              standard  2
102  Male    27   No       Tobacco  1      10mg              standard  2

Table 1. The depression dataset looks something like this, except with at least 12 dose regimes.

ID   Baseline  Visit_1  Visit_2  Visit_3  Visit_4  Visit_5  Visit_6  Visit_7  Visit_8  ... up to 12
101  2         2        1        1        2        2        1        1        2
102  2         2        4        4        4        2        4        4        4
103  4         4        4        4        4        4        4        4        4
104  4         4        2        4        4        4        2        4        4
105  2         2        2        3        4        2        2        3        4

Table 2. Visit 1 is set to Baseline, and after PROC TRANSPOSE the depression data (depr_t.sas7bdat) is read in for analysis to the temporary SAS dataset depres.

The depression data can be analyzed with a logistic regression using GEE. Create 12 observations per subject, one for each visit.

   /* Reshape from one row per subject to one row per subject-visit.   */
   /* baseline is kept because the next step dichotomizes it.          */
   data depres(keep=regime id treatment sex age obesity su baseline visit outcome);
      set depr_t;
      visit=1;  outcome=visit_1;  output;
      visit=2;  outcome=visit_2;  output;
      visit=3;  outcome=visit_3;  output;
      visit=4;  outcome=visit_4;  output;
      visit=5;  outcome=visit_5;  output;
      visit=6;  outcome=visit_6;  output;
      visit=7;  outcome=visit_7;  output;
      visit=8;  outcome=visit_8;  output;
      visit=9;  outcome=visit_9;  output;
      visit=10; outcome=visit_10; output;
      visit=11; outcome=visit_11; output;
      visit=12; outcome=visit_12; output;
   run;

   /* Dichotomize: 1 = slightly better or better (3 or 4), 0 = otherwise. */
   data depression;
      set depres;
      if outcome>=3 then dichot=1; else dichot=0;
      if baseline>=3 then di_base=1; else di_base=0;
   run;

GEE ANALYSIS
Using an exchangeable working correlation matrix, patients on either standard or new regimes are assigned to treatment doses, with a response measured as worst, slightly worst, no change, slightly better, and better (0, 1, 2, 3, 4).
Subjects are measured at baseline and 12 visits. Response is slightly better or better versus not.

   /* age is left out of CLASS so that it enters as a continuous covariate, */
   /* matching the single age estimate in the output; su is the substance   */
   /* use variable as named in the dataset.                                 */
   proc genmod data=depression descending;
      class id regime sex obesity su treatment visit;
      model dichot = treatment sex age regime di_base visit
                     visit*treatment treatment*regime
                     / link=logit dist=bin type3;
      repeated subject=id*regime / type=exch;
   run;

Model Information
Correlation Structure          Exchangeable
Subject Effect                 id*regime (levels)
Number of Clusters             Equal to number of subjects
Correlation Matrix Dimension   12
Maximum Cluster Size           12
Minimum Cluster Size           12

The analysis shows the following:
1. The Type 3 analysis shows nonsignificant interaction terms.
2. When interactions are removed, visit remains nonsignificant.
3. Patients on standard treatment have, on average, greater odds of a better or slightly better response.

The GEE procedure is now available in SAS/STAT for SAS 9.4. It supports generalized logits as well as the ESTIMATE, LSMEANS, and OUTPUT statements. It also provides the LOGOR= option in the REPEATED statement for alternating logistic regression with an extension for ordinal data.

THE RESULTS
Patients on the standard regime of 30mg have, on average, $e^{1.2654}$ times the odds of a slightly better or better response compared with patients on the new regime of 10mg, adjusted for the other effects in the model. The parameter estimates appear in Output 1.
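As a quick check on the figure quoted above, the coefficient can be exponentiated by hand. The arithmetic below is an illustration added here, using the Treatment 30mg estimate and its 95% confidence limits as reported in Output 1; the results are rounded to two decimal places.

```latex
% Worked arithmetic: log odds ratio -> odds ratio (values rounded)
\[
\widehat{\mathrm{OR}} = e^{1.2654} \approx 3.54,
\qquad
95\%\ \text{confidence limits: } \left(e^{0.5859},\; e^{1.9448}\right) \approx (1.80,\; 6.99)
\]
```

In other words, under the exchangeable model the odds of a slightly better or better response on the standard 30mg regime are roughly three and a half times the odds on the new 10mg regime, and the interval excludes 1.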
Analysis of GEE Parameter Estimates
Empirical Standard Error Estimates

Parameter  Level     Estimate  Standard Error  95% Confidence Limits  Z      Pr > |Z|
Intercept            -0.2066   0.5776          -1.3388   0.9255       -0.36  0.7206
Regime     Standard  -0.6495   0.3532          -1.3418   0.0428       -1.84  0.0660
Regime     New        0.0000   0.0000           0.0000   0.0000
sex        F          0.1368   0.4402          -0.7261   0.9996        0.31  0.7560
sex        M          0.0000   0.0000           0.0000   0.0000
Treatment  30mg       1.2654   0.3467           0.5859   1.9448        3.65  0.0003
Treatment  10mg       0.0000   0.0000           0.0000   0.0000
age                  -0.0188   0.0130          -0.0442   0.0067       -1.45  0.1480
di_base    9          1.857    0.3460           1.1676   2.5238        5.33  <.0001
obesity    53         0.000    0.0000           0.0000   0.0000

Source: Fictitious data, for illustration purposes only

LET'S DO THAT AGAIN
Using an unstructured working correlation matrix, patients on either standard or new regimes are assigned to treatment doses, with a response measured as worst, slightly worst, no change, slightly better, and better (0, 1, 2, 3, 4). Subjects are measured at baseline and 12 visits. Response is slightly better or better versus not.

   /* Same model as before; only the working correlation changes.      */
   /* age is left out of CLASS so it enters as a continuous covariate. */
   proc genmod data=depression descending;
      class id regime sex obesity su treatment visit;
      model dichot = treatment sex age regime di_base visit
                     visit*treatment treatment*regime
                     / link=logit dist=bin type3;
      repeated subject=id*regime / type=unstr;
   run;

Patients on the standard regime of 30mg have, on average, $e^{1.2442}$ times the odds of a slightly better or better response compared with patients on the new regime of 10mg, adjusted for the other effects in the model. The parameter estimates appear in Output 2.
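The two runs differ only in the TYPE= option of the REPEATED statement, so it may help to write out what each working correlation structure assumes. The sketch below uses the standard textbook definitions; the symbol $\alpha_{jk}$ and the parameter counts are added here for illustration and assume the cluster size of 12 reported in the Model Information.

```latex
% Exchangeable: one common correlation for every pair of visits
\[
\mathrm{Corr}(y_{ij}, y_{ik}) = \alpha \quad (j \neq k),
\qquad
R_i(\alpha) =
\begin{pmatrix}
1      & \alpha & \cdots & \alpha \\
\alpha & 1      & \cdots & \alpha \\
\vdots & \vdots & \ddots & \vdots \\
\alpha & \alpha & \cdots & 1
\end{pmatrix}
\]

% Unstructured: a separate correlation for every pair of visits
\[
\mathrm{Corr}(y_{ij}, y_{ik}) = \alpha_{jk} \quad (j \neq k)
\]

% Number of correlation parameters when each cluster has 12 visits
\[
\text{exchangeable: } 1,
\qquad
\text{unstructured: } \frac{12 \times 11}{2} = 66
\]
```

The unstructured form is the most flexible but asks the data to support 66 correlation estimates instead of one, which is why a simpler structure is often preferred when the number of clusters is small.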
Analysis of GEE Parameter Estimates
Empirical Standard Error Estimates

Parameter  Level     Estimate  Standard Error  95% Confidence Limits  Z      Pr > |Z|
Intercept            -0.2324   0.5763          -1.3620   0.8972       -0.40  0.6868
Regime     Standard  -0.6558   0.3512          -1.3442   0.0326       -1.87  0.0619
Regime     New        0.0000   0.0000           0.0000   0.0000
sex        F          0.1128   0.4408          -0.7512   0.9768        0.26  0.7981
sex        M          0.0000   0.0000           0.0000   0.0000
Treatment  30mg       1.2442   0.3455           0.5669   1.9214        3.60  0.0003
Treatment  10mg       0.0000   0.0000           0.0000   0.0000
age                  -0.0175   0.0129          -0.0427   0.0077       -1.36  0.1728
di_base    9          1.8981   0.3441           1.2237   2.5725        5.52  <.0001
obesity    53         0.000    0.0000           0.0000   0.0000

Source: Fictitious data, for illustration purposes only

CONCLUSION
With GEE, both the "Exchangeable" and "Unstructured" working correlation matrices yield results that are very close. Many statisticians routinely use the independent structure because the parameter estimates and empirical standard errors are consistent even if the correlation structure isn't correctly specified. Here the results from the two working correlation matrices are consistent as well. With a smaller number of treatments it is often better to use a simpler structure because that means fewer parameters to estimate. With GEE even the more complex structures are simplified.

REFERENCES
Stokes, Maura. 2015. "Modeling Longitudinal Categorical Response Data." SAS Global Forum 2015, Dallas, Texas, April 6, 2015.
Diggle, P.J., Liang, K.-Y., and Zeger, S.L. 1994. Analysis of Longitudinal Data. Oxford: Oxford Science.
Stokes, Maura. 2015. "Methods for Massive, Missing or Multifaceted Data." Proceedings of the SAS Global Forum 2015 Conference. Dallas, Texas. Available at http://support.sas.com/resources/papers/proceedings09/TOC.html.

ACKNOWLEDGMENTS
Thank you to all my friends working with SAS year after year; you are most kind. Bless you, and in God I trust.
RECOMMENDED READING
• Base SAS® Procedures Guide
• SAS® For Dummies®

CONTACT INFORMATION
Your comments and questions are valued and encouraged. Contact the author at:
Name: Karen Walker
Enterprise: Walker Consulting LLC
Address: 26175 Sunnywood
City, State ZIP: Menifee, California 92586
Work Phone: (480)206-7196
Fax:
E-mail: [email protected]
Web:

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.