International Biometric Society
COVARIATE SELECTION IN A MULTIVARIATE COX REGRESSION USING A GENETIC ALGORITHM
Dr. Katrin Kupas1, Simon Fink2, Dr. Hendrik Schmidt3
1. Statistical Consultant, Frankfurt, Germany
2. Ludwig-Maximilians-Universität, München, Germany
3. Boehringer Ingelheim Pharma GmbH & Co KG, Biberach, Germany
In clinical trials, time to event is often used as a robust and patient-relevant endpoint. The
identification of meaningful prognostic factors for the time to event is important for the
prediction of the progress of the disease and the outcome of the therapy and can contribute
to the recognition of patient subgroups. Hence, identifying such factors will also play a more
prominent role in health economic evaluations and personalized medicine.
Usually, a number of patient characteristics that can contribute to the patient’s risk of
experiencing the event are collected at baseline. These patient characteristics are therefore
used to identify potential prognostic factors and predictors by adding them as covariates to
a multivariate Cox regression model according to a selection algorithm.
The genetic algorithm makes it possible to search a very large space of possible models
for the best one. Unlike the selection algorithms built into SAS® PROC PHREG, such as
backward and stepwise selection, this algorithm does not use the chi-square statistic with a
p-value threshold for the inclusion and exclusion of variables, but optimizes Akaike’s
Information Criterion (AIC) of the Cox model.
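For reference, the criterion being minimized has the standard form sketched below. The symbols are notation introduced here, not taken from the abstract: $L(\hat{\beta})$ denotes the maximized partial likelihood of the Cox model and $k$ the number of estimated regression coefficients.

```latex
\mathrm{AIC} = -2 \log L(\hat{\beta}) + 2k
```

A covariate therefore stays in the model only if it improves $-2 \log L$ by more than the penalty of 2 per added coefficient, which is what replaces the p-value threshold of the stepwise methods.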
In a first step, a number of randomly combined subsets of covariates are analysed. The
covariate subsets of the best Cox models with respect to AIC are the parents for the next
generation of covariate subsets. By application of mutation and crossover, random
variations of the covariate subsets are included in the next generation, enlarging the search
space of possible models and avoiding local minima of the AIC. A predefined stopping
criterion ends the genetic algorithm once a parsimonious model with respect to the AIC
has been found.
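The generation loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' SAS implementation: the function name `genetic_select`, the per-covariate interpretation of the mutation and crossover rates, the "no improvement for `patience` generations" stopping rule, and the stand-in objective `toy_aic` are all hypothetical. In practice `score` would fit a Cox model on the selected covariates and return its AIC.

```python
import random


def genetic_select(n_covariates, score, pop_size=30, n_parents=10,
                   mutation_rate=0.2, crossover_rate=0.2,
                   patience=10, max_generations=200, seed=0):
    """Return (best_subset, best_score); a lower score (e.g. AIC) is better."""
    rng = random.Random(seed)
    cache = {}

    def cached_score(subset):
        # Score each distinct subset once; a real Cox fit is expensive.
        if subset not in cache:
            cache[subset] = score(subset)
        return cache[subset]

    # First generation: random subsets, one True/False flag per covariate.
    population = [tuple(rng.random() < 0.5 for _ in range(n_covariates))
                  for _ in range(pop_size)]
    best, best_score, stale = None, float("inf"), 0

    for _ in range(max_generations):
        ranked = sorted(set(population), key=cached_score)
        if cached_score(ranked[0]) < best_score:
            best, best_score, stale = ranked[0], cached_score(ranked[0]), 0
        else:
            stale += 1
        if stale >= patience:  # assumed stopping rule: no recent improvement
            break

        parents = ranked[:n_parents]  # best subsets become the parents
        children = list(parents)      # parents are carried over unchanged
        while len(children) < pop_size:
            a, b = rng.choice(parents), rng.choice(parents)
            # Crossover: take each flag from parent b with crossover_rate.
            child = [gb if rng.random() < crossover_rate else ga
                     for ga, gb in zip(a, b)]
            # Mutation: flip each flag with probability mutation_rate.
            child = [(not g) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(tuple(child))
        population = children

    return best, best_score


# Hypothetical stand-in for the AIC of a Cox model: covariates 0, 3 and 5
# each lower the score by 10, and every included covariate costs 2.
TRUE_COVARIATES = {0, 3, 5}


def toy_aic(subset):
    included = {i for i, keep in enumerate(subset) if keep}
    return 100.0 - 10.0 * len(included & TRUE_COVARIATES) + 2.0 * len(included)


if __name__ == "__main__":
    subset, aic = genetic_select(6, toy_aic, patience=15)
    print(subset, aic)
```

The default rates of 0.2 sit inside the 15% to 30% range reported in this abstract, although whether the authors define the rates per covariate or per subset is not stated.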
Due to its flexibility, the genetic selection algorithm is a powerful way to search the space of
possible Cox models. However, convergence of the genetic algorithm within an acceptable
timeframe is not guaranteed if the random selection rate is too high. Several blinded data sets
from clinical trials with different numbers of possible covariates have been modelled using
multivariate Cox regression with the genetic selection algorithm. In most cases the
genetic algorithm converged to a parsimonious model with respect to the AIC value. The
optimal mutation and crossover rate was found to be between 15% and 30%. All results
have been compared to classical covariate selection methods for validation. For each data
set, several runs of the algorithm have been performed to check the robustness of the
algorithm in finding the parsimonious model.
Especially in studies with many potential covariates and borderline significances of
chi-square statistics, the genetic algorithm leads to a robust selection of covariates for the
optimal model in the sense of AIC.
References:
1. Tsai JS. Optimal Model Selection by a Genetic Algorithm Using SAS®. Proceedings of
the Western Users of SAS® Software Conference, 2009, Cary, NC
2. Wiegand RE. Performance of using multiple stepwise algorithms for variable selection.
Statistics in Medicine 2010; 29:1647-1659
International Biometric Conference, Florence, ITALY, 6 – 11 July 2014