Project: Analysis and Display White Papers Project Team
Title: Analyses and Displays Associated with Outliers or Shifts
Working Group: Standard Scripts
for Analysis and Programming
PhUSE
PhUSE Computational Science Development of
Standard Scripts for Analysis and Programming
Working Group
Analysis and Display White Papers Project Team
Analyses and Displays Associated with Outliers or
Shifts from Normal to Abnormal: Focus on Vital
Signs, Electrocardiogram, and Laboratory Analyte
Measurements in Phase 2-4 Clinical Trials and
Integrated Summary Documents
[Version 1.0] – [2015-09-10]
Table of Contents
1. Disclaimer
2. Notice of Current Edition
3. Additions and/or Revisions
4. Overview: Purpose
5. Scope
6. Definitions
7. Problem Statement
8. Background
9. Considerations
10. Recommendations
10.1. General Recommendation
10.2. All Measurement Types
10.3. Laboratory Analyte Measurements
10.4. ECG Quantitative Measurements
10.5. Vital Sign Measurements
11. Tables and Figures for Individual Studies
11.1. Recommended Displays
11.2. Discussion
12. Tables and Figures for Integrated Summaries
12.1. Recommended Displays
12.2. Discussion
13. Example SAP Language
13.1. Individual Study
13.2. Integrated Summary
14. Acknowledgements
15. Project Leader Contact Information
16. References
17. Appendix: Figures and Tables
List of Tables and Figures
Figure 11.1. Scatterplot and Shift Summary for Quantitative Safety Measures Assessing Low Value: Individual Study
Figure 11.2. Scatterplot and Shift Summary for Quantitative Safety Measures Assessing High Value: Individual Study
Figure 11.3. Scatterplot and Shift Summary for Quantitative Safety Measures Assessing Low Value with Change Criteria: Individual Study
Figure 11.4. Scatterplot and Shift Summary for Quantitative Safety Measures Assessing High Value with Change Criteria: Individual Study
Table 11.1. Treatment-Emergent Abnormal Summary for Qualitative Safety Measures: Individual Study
Figure 12.1. Scatterplot and Shift Summary for Quantitative Safety Measures for Low Value: Integrated Database
Figure 12.2. Scatterplot and Shift Summary for Quantitative Safety Measures for High Value: Integrated Database
Table 12.1. Treatment-Emergent Abnormal Summary for Qualitative Safety Measures: Integrated Database
Table 13.1. Selected Categorical Limits for ECG Data
Table 13.2. Categorical Criteria for Abnormal Treatment-Emergent Blood Pressure and Pulse Measurement and Categorical Criteria for Weight and Temperature Changes for Adults
Figure 17.1. Summary for Quantitative Safety Measures: Individual Study
Figure 17.2. Scatterplot and Shift Summary for Quantitative Safety Measures Assessing Low Value: Individual Study
Figure 17.3. Scatterplot and Shift Summary for Quantitative Safety Measures Assessing High Value: Individual Study
Figure 17.4. Scatterplot and Shift Summary for Quantitative Safety Measures Assessing Low Value with Change Criteria: Individual Study
Figure 17.5. Scatterplot and Shift Summary for Quantitative Safety Measures Assessing High Value with Change Criteria: Individual Study
Figure 17.6. Summary of Common Treatment-Emergent Abnormal for Quantitative Safety Measures: Individual Study
Figure 17.7. Scatterplot and Shift Summary for Quantitative Safety Measures: Integrated Database
Figure 17.8. Scatterplot and Shift Summary for Quantitative Safety Measures for Low Value: Integrated Database
Figure 17.9. Scatterplot and Shift Summary for Quantitative Safety Measures for High Value: Integrated Database
Table 17.1. Shift Table Analyses
Table 17.2. Shift from Normal/High to Low and from Normal/Low to High for Laboratory Measures
Table 17.3. Shift from Normal/High to Low and from Normal/Low to High: Integrated Database
1. Disclaimer
The opinions expressed in this document are those of the authors and do not necessarily
represent the opinions of PhUSE, the members’ respective companies or organizations, or
regulatory authorities. The content in this document should not be interpreted as a data standard
and/or information required by regulatory authorities.
2. Notice of Current Edition
This edition of the “Analyses and Displays Associated with Outliers or Shifts from Normal to
Abnormal: Focus on Vital Signs, Electrocardiogram, and Laboratory Analyte Measurements in
Phase 2-4 Clinical Trials and Integrated Summary Documents” is the 1st edition.
3. Additions and/or Revisions
Date          Author            Version   Changes
2015-09-10    See Section 14    v1.0      First edition
4. Overview: Purpose
The purpose of this white paper is to provide advice for displaying, summarizing, and/or analyzing
measures of outliers or shifts in tables, figures, and listings (TFLs), with a focus on vital signs,
electrocardiogram (ECG) quantitative measurements, and laboratory analyte measurements in
Phase 2-4 clinical trials and integrated submission documents. This white paper also provides
advice on data collection if a particular recommended display requires data to be collected in a
certain manner that may differ from current practice. The intent is to begin the process of
developing industry standards with respect to analysis and reporting for measurements that are
common across clinical trials, and even therapeutic areas. In particular, this white paper provides
recommendations for key TFLs for measures of outliers or shifts for a common set of safety
measurements. Separate white papers address other types of data or analytical approaches (e.g.,
central tendency).
The development of standard TFLs and associated analyses will lead to improved standardization
from collection through data storage. (You need to know how you want to analyze and report
results before finalizing how to collect and store data.) The development of standard TFLs will
also lead to improved product lifecycle management by ensuring that reviewers receive the
desired analyses for the consistent and efficient evaluation of patient safety and drug
effectiveness. Although having standard TFLs is an ultimate goal, this white paper reflects
recommendations only and should not be interpreted as “required” by any regulatory agency.
5. Scope
The scope of this white paper is to provide advice when developing the analysis plan for Phase 2-4 clinical trials and integrated summary documents (or other documents in which measures of
outliers or shifts are of interest).
Although the focus of this white paper pertains to specific safety measurements (vital signs, ECG
quantitative measurements, and laboratory analyte measurements), some content may apply to
other measurements (e.g., different safety measurements and efficacy assessments). Similarly,
although the focus of this white paper pertains to Phase 2-4 clinical trials, some of the content
may apply to Phase 1 clinical trials or other types of clinical research (e.g., observational studies).
Detailed specifications for TFLs or dataset development are considered out of scope for this
version of this white paper. However, the hope is that specifications and code (utilizing Study Data
Tabulation Model [SDTM] and Analysis Data Model [ADaM] data structures) will be developed
that are consistent with the concepts outlined in this white paper and placed in the publicly
available Pharmaceuticals Users Software Exchange (PhUSE) Standard Scripts Repository.
6. Definitions
ADaM = Analysis Data Model; ALT = alanine aminotransferase; AST = aspartate
aminotransferase; BMI = body mass index; CBER = Center for Biologics Evaluation and
Research; CDASH = Clinical Data Acquisition Standards Harmonization; CDER = Center for Drug
Evaluation and Research; CS = Computational Science; ECG = electrocardiogram; LLN = lower
limit of normal; PhUSE = Pharmaceuticals Users Software Exchange; SDTM = Study Data
Tabulation Model; TFLs = tables, figures, and listings; ULN = upper limit of normal
7. Problem Statement
Industry standards have evolved over time for data collection (Clinical Data Acquisition Standards
Harmonization [CDASH]), observed data (SDTM), and analysis datasets (ADaM). However,
standards have not been developed for analyses and reports. Lack of standardization leads to
inefficiency in operation (time, cost), unreliable quality, and the creation of displays that may not
be of optimal use to the reviewers.
8. Background
Industry standards have evolved over time for data collection (CDASH), observed data (SDTM),
and analysis datasets (ADaM). There is now recognition that the next step would be to develop
standard TFLs for common measurements across clinical trials and across therapeutic areas.
Some could argue that the industry should have started with creating standard TFLs prior to
creating standards for collection and data storage (consistent with end-in-mind philosophy);
however, having industry standards for data collection and the analysis of datasets provides a
good basis for creating standard TFLs.
The beginning of the effort leading to this white paper came from the initiation of the FDA/PhUSE
Computational Science Collaboration, a yearly conference and ongoing working groups to support
addressing computational needs of the industry. The FDA identified key priorities and teamed up
with PhUSE to tackle various challenges using collaboration, crowdsourcing, and innovation
(Rosario LA, 2012). The FDA and PhUSE created several Computational Science (CS) working
groups to address several of these challenges. The working group, titled “Development of
Standard Scripts for Analysis and Programming,” has led the development of this white paper,
along with the development of a platform for storing shared code.
There are several existing guidance documents (see bulleted list below) that contain suggested
TFLs for common measurements, such as vital signs, ECG quantitative measurements, and
laboratory analyte measurements. However, many of these documents are now relatively
outdated and generally lack sufficient detail to be used as support for the entire standardization
effort. Nevertheless, these documents were used as a starting point in the development of this
white paper. The documents include the following:
• ICH E3: Structure and Content of Clinical Study Reports
• Guideline for Industry: Structure and Content of Clinical Study Reports
• Guidance for Industry: Premarketing Risk Assessment
• Reviewer Guidance: Conducting a Clinical Safety Review of a New Product Application and Preparing a Report on the Review
• ICH M4E: The Common Technical Document for the Registration of Pharmaceuticals for Human Use: Efficacy
• ICH E14: The Clinical Evaluation of QT/QTc Interval Prolongation and Proarrhythmic Potential for Non-Antiarrhythmic Drugs
• Guidance for Industry: ICH E14 Clinical Evaluation of QT/QTc Interval Prolongation and Proarrhythmic Potential for Non-Antiarrhythmic Drugs
The Reviewer Guidance is considered a key document. As discussed in the guidance, there is
generally an expectation that analyses of outliers or shifts are conducted for vital signs, ECG
quantitative measurements, and laboratory analyte measurements. The guidance recognizes
value to both analyses of central tendency and analyses of outliers or shifts from within reference
limits to outside reference limits (below lower reference limit or above upper reference limit). We
assume both will be conducted for safety signal detection. This white paper covers the outliers or
shifts portion with the expectation that an additional TFL or TFLs will also be created with a focus
on central tendency (see the CS white paper pertaining to central tendency).
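As a minimal sketch of what a shift "from within reference limits to outside reference limits" can mean computationally, the Python below classifies a value against a lower and an upper reference limit and flags a shift to low. The function names, the handling of a value exactly at a limit (treated here as within limits), and the example limits and values are illustrative assumptions, not definitions taken from this white paper.

```python
from typing import Sequence


def classify(value: float, lower_limit: float, upper_limit: float) -> str:
    """Classify a value as 'low', 'normal', or 'high' against reference limits.

    Assumption: a value exactly equal to a limit counts as within limits.
    """
    if value < lower_limit:
        return "low"
    if value > upper_limit:
        return "high"
    return "normal"


def shifted_to_low(baseline: float, post_baseline: Sequence[float],
                   lower_limit: float, upper_limit: float) -> bool:
    """Flag a shift from not-low at baseline to low at any post-baseline value."""
    if classify(baseline, lower_limit, upper_limit) == "low":
        return False  # already low at baseline, so not counted as a shift
    return any(classify(v, lower_limit, upper_limit) == "low"
               for v in post_baseline)


# Hypothetical limits and values, for illustration only.
print(shifted_to_low(13.0, [12.5, 11.4], lower_limit=12.0, upper_limit=16.0))
```

A shift to high could be sketched the same way by swapping the roles of the two limits.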
9. Considerations
Members of the Analysis and Display White Papers Project Team reviewed regulatory guidance
and shared ideas and lessons learned from their experience. Draft white papers were developed
and posted in the PhUSE wiki environment for public comment.
Most contributors and reviewers of this white paper are industry statisticians, with input from non-industry statisticians (e.g., FDA and academia) and industry and non-industry clinicians.
Additional input (e.g., from other regulatory agencies) for future versions of this white paper would
be beneficial.
10. Recommendations
10.1. General Recommendation
This section contains some general considerations for the plans of analyses and displays
associated with outliers or shifts from normal to abnormal for laboratory analyte measurements,
vital signs, and ECG quantitative measurements. Section 10.2 discusses general considerations
for all three safety domains. Section 10.3 discusses considerations specific to laboratory analyte
measurements. Section 10.4 discusses considerations specific to ECG quantitative
measurements. Section 10.5 discusses considerations specific to vital signs.
10.2. All Measurement Types
P-values and Confidence Intervals
There has been an ongoing debate about the value (or lack of value) of including p-values and/or confidence intervals in safety assessments (Crowe BJ, 2009). This white paper
does not attempt to resolve this debate. As noted in the Reviewer Guidance, p-values or
confidence intervals can provide some evidence of the strength of the findings, but unless the
trials are designed for the hypothesis testing of safety endpoints, these should be thought of as
descriptive. Throughout this white paper, p-values and measures of spread are included in
several places. Where these are included, they should not be considered as hypothesis testing. If
a company or compound team decides that these are not helpful as a tool for reviewing the data,
they can be excluded from the display. Although certain statistical methods are recommended in
this white paper for p-values and confidence intervals (for teams that choose to include them),
alternative methods can be considered.
Some teams may find p-values and/or confidence intervals useful to facilitate focus but have
concerns that a lack of statistical significance provides unwarranted dismissal of a potential signal.
Conversely, there are concerns that there could be over-interpretation of p-values due to
multiplicity issues, adding potential concern for too many outcomes. Similarly, there are concerns
that the lower or upper bound of confidence intervals will be over-interpreted. (A percentage can
be as high as x, causing undue alarm.) It is important for the users of these TFLs to be educated
on these issues.
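The multiplicity concern can be made concrete with a short calculation. The sketch below assumes a simplified setting that is not stated in this white paper: every test is independent, each is run at the same nominal level, and there is no true treatment difference for any outcome. Under those assumptions, the chance of at least one nominally significant p-value grows quickly with the number of outcomes examined.

```python
def prob_at_least_one_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Probability of one or more p-values below alpha among n_tests tests.

    Assumptions: tests are independent and no true effect exists.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


# Hypothetical numbers of safety outcomes summarized in a set of TFLs.
for n in (1, 10, 20, 50):
    print(n, round(prob_at_least_one_false_positive(n), 3))
```

Safety outcomes are usually correlated, so the exact figures would differ in practice; the sketch only shows the direction and rough size of the problem.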
Importance of Visual Displays
Communicating information effectively and efficiently is crucial in detecting safety signals and
enabling decision making. Current practice, which focuses on tables and listings, has not always
enabled us to communicate information effectively because tables and listings may be long and
repetitive, making it difficult to see trends. Graphics, on the other hand, can provide a more
effective presentation of complex data, increasing the likelihood of detecting key safety signals
and improving the ability to make clinical decisions. They can also facilitate the identification of
unexpected values.
Standardized presentation of visual information is encouraged. The FDA/Industry/Academia
Safety Graphics Working Group was initiated in 2008 and was formed to develop a wiki and to
improve best practice for safety graphics. It has recommendations for the effective use of graphics
for three key safety areas: adverse events, ECGs, and laboratory analytes. The working group
focused on static graphs, and their recommendations were considered while developing this white
paper. In addition, there has also been advancement in interactive visual capabilities. The
interactive capabilities are beneficial but are considered out of scope for this version of the white
paper.
Conservativeness
The focus of this white paper pertains to clinical trials in which there are comparator data. As
such, the concept of “being conservative” is different than when assessing a safety signal within
an individual subject or a single arm. A seemingly conservative approach may end up not being
conservative in the end. For example, for studies that collect safety data during an off-drug follow-up period, one might consider it conservative to include the adverse events reported in the follow-up period. However, this approach may result in smaller odds ratios than including only the
exposed period in the analysis. Another example occurs when choosing cutoffs for shift/outlier
analyses. A conservative approach for defining outcomes, from a single-arm perspective, is one
that would lead to a higher number of patients reaching a threshold. However, a conservative
approach for defining outcomes may actually make it more difficult to identify safety signals with
respect to comparing treatment with a comparator (see Section 7.1.7.3.2 in the Reviewer
Guidance; U.S. Department of Health and Human Services, 2005b). Thus, some of the
approaches recommended in this white paper may appear less conservative than alternatives, but
the intent is to propose methodology that can identify meaningful safety signals for a treatment
relative to a comparator group.
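The follow-up example can be illustrated with hypothetical counts. In the sketch below, all numbers are invented for illustration: 100 patients per arm, with 20 treated and 10 comparator patients having an event during the exposed period, and 10 further patients per arm having a first event during off-drug follow-up (that is, events unrelated to treatment). Adding the follow-up events raises the counts in both arms but pulls the odds ratio toward 1.

```python
def odds_ratio(events_trt: int, n_trt: int, events_cmp: int, n_cmp: int) -> float:
    """Odds ratio of an event for treatment versus comparator."""
    odds_trt = events_trt / (n_trt - events_trt)
    odds_cmp = events_cmp / (n_cmp - events_cmp)
    return odds_trt / odds_cmp


# Hypothetical counts: exposed period only.
exposed_only = odds_ratio(events_trt=20, n_trt=100, events_cmp=10, n_cmp=100)

# Hypothetical counts: exposed period plus 10 follow-up events in each arm.
with_follow_up = odds_ratio(events_trt=30, n_trt=100, events_cmp=20, n_cmp=100)

print(round(exposed_only, 2), round(with_follow_up, 2))
```

The same dilution can occur when an outcome threshold is set so loosely that many patients in both arms meet it for reasons unrelated to treatment.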
Measurements After Stopping Study Medication
Measurements collected after stopping medications under study (e.g., treatment under study and
comparators) are common for various reasons. In some cases, follow-up phases are included to
monitor patients for a period of time after study medication is stopped. In addition, study designs
that keep patients in the study for the entire planned length of time, even after a decision to stop
medication early, are becoming more popular as newer methods are developed for handling
missing data. In these cases, patients can be off study medication for an extended period of time.
Measurements post study medication can also arise not by design. For example, a subject can
decide to stop study medication at any time, and then later attend the planned visit where the
planned measurements are obtained. There is currently no standard approach on how to handle
safety assessments post study medication. Some guidance contains advice on how long to collect
safety measurements post study medication (e.g., 30 days post or x half-lives). Any advice or
decisions related to the collection of safety measurements post study medication should not be
confused with how to include such data in displays and/or analyses. It is extremely important to
document within the database for analysis the best estimate of the last date study treatment was
taken, as well as the dates on which all numerical safety data were collected, so that an accurate
determination can be made of time of data collection relative to the last dose of medication.
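Once both dates are recorded, the timing of a measurement relative to the last dose is a simple date difference. The sketch below is illustrative only: the function names are hypothetical, and the `window_days` parameter is an assumption standing in for whatever rule an analysis plan adopts; this white paper does not define such a window.

```python
from datetime import date


def days_since_last_dose(measurement_date: date, last_dose_date: date) -> int:
    """Days from the last dose to the measurement (negative if before it)."""
    return (measurement_date - last_dose_date).days


def within_window(measurement_date: date, last_dose_date: date,
                  window_days: int) -> bool:
    """True if the measurement falls no later than window_days after last dose.

    window_days is a hypothetical analysis-plan parameter.
    """
    return days_since_last_dose(measurement_date, last_dose_date) <= window_days


# Hypothetical dates, for illustration only.
last_dose = date(2015, 3, 1)
lab_draw = date(2015, 3, 10)
print(days_since_last_dose(lab_draw, last_dose), within_window(lab_draw, last_dose, 7))
```

If the last dose date is only a best estimate, the derived day count inherits that uncertainty, which is one reason the estimate should be documented in the analysis database.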
We recommend that the TFLs in this white paper generally exclude measurements taken during a
follow-up phase. Separate TFLs can be created for the follow-up phase and/or the treatment and
follow-up phases combined. We also recommend that the TFLs in this white paper exclude
measurements taken after the visit that is considered the “study medication discontinuation”
visit. In study designs that keep patients in a study for the entire planned length of time even
after stopping medication, separate TFLs can be created for the “off-medication” time and/or the
treatment and off-medication times combined. This enables the researcher to distinguish between
drug-related safety signals versus safety signals that could be more related to discontinuing a
drug (e.g., return of disease symptoms, introduction of a concomitant medication, and/or
discontinuation or withdrawal effects of the drug) or due to subsequent therapy.
We assume it is important to distinguish among these. Generally, at least some TFLs that include
data from follow-up phases and/or off-medication time will be required, but not usually as many as
are done during treatment and not necessarily in the same format as provided in this white paper.
For some compounds (e.g., compounds with a long half-life compared to the duration of the study,
compounds used for a short time, such as antibiotics), a more complete set of TFLs including
such data may be required. The ease of interpretation from such TFLs will vary depending on the
compound, disease, and design aspects, such as the half-life of the compound, the likelihood of
taking alternative therapy, or the allowed concomitant medications during the observation period.
For the case where a subject decides to stop study medication at any time and then later attends
the planned visit to obtain the planned measurements, we recommend that the measures taken at
the study medication discontinuation visit be included. Although some patients may be off
medication, the time is generally short in these situations. For this example, the inclusion of such
measurements may more accurately reflect the safety profile of a compound versus their
exclusion. In study designs with a long period of time between visits, an alternative approach may
be warranted.
Measurements at a Discontinuation Visit
When creating displays or conducting analyses over time, how to handle data collected at
discontinuation visits should be specified. Because a subject’s discontinuation visit is not always
aligned with planned timing, it is not obvious whether to include these measurements in displays
or analyses over time. Such measurements are “planned” per protocol but not consistent with the
planned timing. We generally recommend including measures taken at the discontinuation visit
toward the next timepoint. For example, if a patient discontinues medication and the study
between Visits 6 and 7 and then goes to the office for their discontinuation visit, we recommend
that the measurements taken at the discontinuation visit are grouped with Visit 7. The inclusion of
such measurements may more accurately reflect trends over time for the compound than their
exclusion. In study designs with a long period of time between visits, an alternative approach may
be warranted.
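The grouping rule in the Visit 6 and Visit 7 example can be sketched as a small lookup. The function name and the representation of visits as integers are assumptions made for illustration; actual analysis datasets may identify visits differently.

```python
from typing import Optional, Sequence


def visit_for_discontinuation(last_scheduled_visit: int,
                              scheduled_visits: Sequence[int]) -> Optional[int]:
    """Return the next scheduled visit after the last one the patient completed.

    Measurements from the discontinuation visit would be grouped with this
    visit. Returns None when no later scheduled visit exists.
    """
    later = [v for v in sorted(scheduled_visits) if v > last_scheduled_visit]
    return later[0] if later else None


# Hypothetical schedule of Visits 1 through 8; the patient's last scheduled
# visit before discontinuing was Visit 6.
print(visit_for_discontinuation(6, range(1, 9)))
```

A design with long gaps between visits might instead need a rule based on elapsed time, consistent with the alternative approach mentioned above.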
Measurements Collected in Reflex Manner
In study designs, it is possible to have some measurements collected only when another
measurement meets certain criteria (i.e., collected in a reflex manner). For example, sometimes
a peripheral smear is only performed when certain complete blood count analytes meet a
specified threshold. How to handle such measurements should be specified in analysis planning,
which requires an understanding of collection practices. Generally, measurements collected in a
reflex manner would be used for individual patient management and possibly for individual patient
listings or individual case descriptions (e.g., as included in patient narratives). Summaries of such
measurements within or between treatment groups tend to be uninterpretable because normal
results cannot generally be assumed for those who did not have the measurement, and a
summary among those meeting the criteria for receiving the measurement (sometimes a small
denominator) tends to not be helpful for signal detection purposes.
Screening Measurements versus Special Topics
The focus of this white paper pertains to measurements as part of normal safety screening. For
many compounds, some measurements are relevant to addressing a priori special topics of
interest. In these cases, it is possible that additional TFLs and/or different TFLs are warranted.
TFLs designed for special topics are out of scope for this white paper. In addition, it is possible
that additional TFLs are warranted when a safety signal is identified using the TFLs
recommended in this white paper and/or the TFLs that focus on central tendency (separate white
paper). Additional TFLs that would be considered post-hoc for further investigation are considered
out of scope.
Number of Therapy Groups
The example TFLs show one treatment arm versus a comparator arm in this version of the white
paper. Most TFLs can be easily adapted to include multiple treatment arms or a single arm.
Multi-phase Clinical Trials
The example TFLs for individual studies show two treatment arms and a comparator arm within a
controlled phase of a study. The example TFLs for integrated summaries show one treatment arm
(assumes all of the treated arms are pooled) and a comparator arm within the controlled phase of
the studies. Discussion around additional phases (e.g., open-label extensions) is considered out
of scope in this version of the white paper. Many of the TFLs recommended in this white paper
can be adapted to display data from additional phases and/or additional treatment arms.
Integrated Analyses
For submission documents, TFLs are generally created using data from multiple clinical
trials. Determining which clinical trials to combine for a particular set of TFLs can be complex.
Section 7.4.1 of the Reviewer Guidance (U.S. Department of Health and Human Services, 2005b)
contains a discussion of points to consider. Generally, when p-values are computed, adjusting for
study is important. Creating visual displays or tables in which timepoints or treatment comparisons
are confounded with study is discouraged. Understanding whether the overall representation
accurately reflects the review across individual clinical trial results is important.
10.3. Laboratory Analyte Measurements
The following topics generally pertain to laboratory analyte measurements, although they may
apply to other measurement types as well. In these cases, the discussion below may or may not
apply.
Planned versus Unplanned Measurements
One topic that tends to be unique to safety (the laboratory analyte measurements in particular) is
the collection of unplanned measurements. Unplanned safety measurements can arise for various
reasons. During a study, the clinical investigator sometimes orders a repeat test, or retest, of a
laboratory test, especially if he/she has received an unexpected value. The investigator may also
request the patient return for a follow-up visit due to clinical concerns. In general, retests are
repeat tests performed because an initial test result had an unexpected value. The repeat result
may either confirm the initial test results or (less commonly) suggest that a laboratory error
occurred in the case of the initial result. Retests are often performed to verify that the action taken
by the investigator (e.g., changing the dose of study drug as allowed by the protocol) has the
desired effect (e.g., test results have returned to within reference limits). If such retests are
conducted until desired measurement results have been reached, analyses from baseline to last
observation would be biased toward normality. Thus, we recommend including only planned
measurements when creating displays or conducting analyses over time and when assessing
change from baseline to endpoint. However, we recommend including planned and unplanned
measurements for analyses that focus on outliers or shifts across an entire period because these
are intended to focus on the most extreme changes.
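The record-selection rule above can be sketched in Python. This is a minimal illustration only: the record layout, the `planned` flag, and the analysis-type labels are hypothetical and are not drawn from any data standard.

```python
from dataclasses import dataclass


@dataclass
class LabRecord:
    patient_id: str
    visit: str
    value: float
    planned: bool  # hypothetical flag: True for protocol-scheduled collections


def select_records(records, analysis_type):
    """Return the records to use for a given (hypothetical) analysis type.

    'over_time' and 'baseline_to_endpoint' use planned measurements only;
    'outlier_shift' uses planned and unplanned measurements.
    """
    if analysis_type in ("over_time", "baseline_to_endpoint"):
        return [r for r in records if r.planned]
    if analysis_type == "outlier_shift":
        return list(records)
    raise ValueError(f"unknown analysis type: {analysis_type}")


# Illustrative data: an unscheduled retest captures the most extreme value.
records = [
    LabRecord("001", "Week 4", 38.0, True),
    LabRecord("001", "Unscheduled", 95.0, False),
    LabRecord("001", "Week 8", 40.0, True),
]
over_time = select_records(records, "over_time")
outlier = select_records(records, "outlier_shift")
```

In this example the unscheduled value of 95.0 is excluded from the over-time summary but remains available to the outlier/shift analysis, which is where the most extreme change matters.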
Analytes Collected Qualitatively
Some laboratory analyte measurements are collected in a qualitative manner that is usually binary
(e.g., Elliptocytes: normal/abnormal) or ordinal (e.g., Spherocytes: 0 [implied by lack of reporting], +,
++, +++, ++++). Some analytes have a numeric value when present but are better treated as
qualitative data (e.g., atypical lymphocytes, a type of abnormal white blood cell seen with some
viral infections, should be treated as present/not present). How to handle such analytes should be
included in analysis planning. In general, a listing of abnormal findings is sufficient.
A summary of those shifting from normal during the pre-treatment period to abnormal during the
treatment period can also be considered. Converting qualitative measurements to abnormal
versus normal categories when they are not collected as abnormal versus normal categories is
usually defined by laboratories and included in routine data transfers but should be confirmed and
well understood by study teams.
Central Versus Local Laboratories
In recent years, most large studies have utilized a central laboratory to ensure consistency in
laboratory assessments across institutions. However, there are times when this is not feasible.
For example, some studies may need to utilize local laboratories due to the nature of the study.
There are also cases where the scheduled laboratory tests are done using a central laboratory,
but ad hoc local laboratory results are done as needed for patient care. Generally, results from
different laboratories should not be combined unless a careful review of laboratory assay methods
and laboratory limit determination methods has deemed them consistent. When feasible, samples
can be split such that the local laboratory results can be provided for urgent patient care, but
results from the central laboratory would also be available. If you adopt such a practice, including
data from the central laboratory only is sufficient.
Reference Limits
Laboratories generally maintain reference limits that can be used to screen for potential
pathology. Methods to develop such limits vary, but many are developed with individual subject
safety monitoring in mind. Thus, the limits from many laboratories tend to be sensitive (reduced
false negatives). Several statistical authors (Copeland KT, 1977) (RT, 1988) (Quade D, 1980)
have presented arguments suggesting that conventional reference limits set at the 2.5th
and 97.5th percentiles (the central 95% reference interval; a commonly used method for reference
limit determination) after removal of outliers (Horowitz G, 2008) might not be optimal for an
outlier/shift categorical analysis of laboratory analytes aimed at detecting differences between
groups. In particular, the impact of misclassification on the estimation of incidence and on the
power of an inferential test to detect a difference between groups when it exists has been
addressed (Quade D, 1980). Translating this into a problem of choosing reference limits
determined by the reference interval, it will be more important to choose a limit that is extreme
enough that specificity remains high, but one that is not so extreme as to decrease sensitivity to a
very low value. The choice of optimal reference limit will be data dependent and is likely to be
variable across analytes; however, using the principle that specificity has a greater effect than
sensitivity, we can make reasonable choices that could be superior to reference limits provided by
the laboratory. Currently, such alternatives are not widely available. When such alternatives are
available, their use is generally recommended.
Another aspect of choosing an optimal reference limit pertains to the population in which
reference limits are developed. Reference individuals (patients) are the individuals from whom
biological samples are collected for measuring an analyte to establish the reference limits for the
analyte. For clinical use, the reference sample group is generally determined to be healthy by
some means. This is appropriate for screening individual patients for presence or lack of health.
Authors have suggested that it might be important to tailor the reference population based on the
purpose for which the derived reference limits will be used (Solberg HE, 1989). This could include
using a reference sample of clinical trial patients or of clinical trial patients with the disease under
study. As with limits developed using higher percentile reference intervals, limits developed using
alternative populations are not widely available. When such alternatives are available, their use is
generally recommended.
For some laboratory analytes, clinical thresholds (e.g., fasting glucose ≥126 mg/dL) have been
published and can be considered for use in outlier/shift summaries and analyses. The use of
clinically derived limits is recommended (likely in addition to use of statistically derived limits)
when the analyte is of special interest.
For purposes of this white paper, it is assumed that a reference limit is chosen that would identify
values as low, normal, or high for quantitative measurements. For qualitative measurements, it is
assumed observations would be identified as normal or abnormal. Providing a specific
recommendation for the reference limits is out of scope for this version of the white paper. The
specific choice of limit should be documented (e.g., protocol, statistical analysis plan, study report
methods section). Reference limits for a laboratory analyte may vary across demographics. For
example, reference limits for a laboratory analyte may be different for patients aged <45 years
and those aged ≥45 years. We recommend using the reference limit according to the patient’s
real age at the time the laboratory measurement was taken instead of using the patient’s age
when entering the study.
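As an illustration of applying age-specific limits at the time of collection, the following Python sketch classifies a value as low, normal, or high. The limits table, the age bands, and the function names are hypothetical and chosen only for the example; whether dates of birth are available at this precision is also an assumption.

```python
from datetime import date

# Hypothetical limits for an illustrative analyte:
# (minimum age, maximum age exclusive, lower limit, upper limit)
LIMITS_BY_AGE = [
    (0, 45, 10.0, 40.0),
    (45, 200, 12.0, 50.0),
]


def age_on(birth_date, on_date):
    """Age in whole years on a given date."""
    years = on_date.year - birth_date.year
    if (on_date.month, on_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def classify(value, birth_date, collection_date):
    """Classify a value using the limits that apply at the collection date."""
    age = age_on(birth_date, collection_date)
    for min_age, max_age, low, high in LIMITS_BY_AGE:
        if min_age <= age < max_age:
            if value < low:
                return "low"
            if value > high:
                return "high"
            return "normal"
    raise ValueError(f"no reference limits for age {age}")


# A patient who turns 45 during the study: the same value is classified
# against different limits before and after the birthday.
born = date(1970, 6, 15)
before_birthday = classify(45.0, born, date(2015, 1, 10))
after_birthday = classify(45.0, born, date(2015, 9, 10))
```

Using the age at study entry for both measurements would have applied the under-45 limits throughout, which is the situation the recommendation above is meant to avoid.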
Above and Below Quantifiable Limits
Values above or below the quantifiable range (e.g., <0.0001) include critical information and should not
be discarded. Such values can generally be categorized as low or high, and their inclusion in
outlier/shift summaries and analyses is recommended.
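One way to retain such values is to parse the qualifier and categorize the result against the reference limits, as in the Python sketch below. The function names are hypothetical, and the "indeterminate" category for censored values that cannot be placed relative to a limit is an assumption added for the example, not a recommendation of this white paper.

```python
def parse_result(raw):
    """Split a reported result into (qualifier, number).

    '<0.0001' -> ('<', 0.0001); '>500' -> ('>', 500.0); '4.2' -> ('', 4.2)
    """
    text = raw.strip()
    if text[:1] in ("<", ">"):
        return text[0], float(text[1:])
    return "", float(text)


def categorize(raw, low_limit, high_limit):
    """Categorize a result, keeping values outside the quantifiable range."""
    qualifier, number = parse_result(raw)
    if qualifier == "<" and number <= low_limit:
        return "low"   # true value is below a bound that is at or below the lower limit
    if qualifier == ">" and number >= high_limit:
        return "high"  # true value is above a bound that is at or above the upper limit
    if qualifier:
        return "indeterminate"  # censored value cannot be placed against the limits
    if number < low_limit:
        return "low"
    if number > high_limit:
        return "high"
    return "normal"
```

For example, `categorize("<0.0001", 0.01, 1.0)` yields "low", so the record contributes to the shift-to-low summary rather than being dropped as non-numeric.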
10.4. ECG Quantitative Measurements
Special considerations for “thorough QT/QTc studies” are considered out of scope for this white
paper.
QT Correction Factors
As noted in the ICH QT/QTc guidance (Section I.A.; Background; U.S. Department of Health and
Human Services, 2005a), because of its inverse relationship to heart rate, the measured QT
interval is routinely corrected by means of various formulae to a less heart rate–dependent value
known as the QTc interval. Section IIIA of the same guidance provides a discussion of some of
the various correction formulas and notes the controversy around appropriate corrections.
Generally, we recommend that the TFLs include the corrected QT interval using Fridericia’s
method (QTcF = QT/RR^{0.333}). We believe the regulatory and medical environments are ready to
accept the exclusion of Bazett’s method from standard TFLs. We believe a second method would
likely be warranted for a more complete evaluation. The second method could be one that is
derived from a linear regression technique (Dmitrienko AA, 2005).
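A minimal Python sketch of the heart rate correction follows. It assumes QT is in milliseconds and RR is in seconds (units are not stated above); the function names are hypothetical. The default exponent of 0.333 mirrors the formula as written in this white paper and approximates the cube root.

```python
def rr_from_heart_rate(hr_bpm):
    """RR interval in seconds from heart rate in beats per minute."""
    if hr_bpm <= 0:
        raise ValueError("heart rate must be positive")
    return 60.0 / hr_bpm


def qtc(qt_ms, rr_s, exponent=0.333):
    """Heart rate-corrected QT in ms: QT / RR**exponent (RR in seconds).

    exponent=0.333 corresponds to Fridericia's correction (QTcF); another
    exponent, such as one derived by regression, can be passed instead.
    """
    if rr_s <= 0:
        raise ValueError("RR must be positive")
    return qt_ms / rr_s ** exponent


# At 60 bpm (RR = 1 s) the correction leaves QT unchanged.
qtcf_at_60 = qtc(400.0, rr_from_heart_rate(60.0))
# At 75 bpm (RR = 0.8 s) the corrected value is longer than the measured QT.
qtcf_at_75 = qtc(400.0, rr_from_heart_rate(75.0))
```

Keeping the exponent as a parameter lets the same function produce a second corrected interval when a regression-derived correction factor is used alongside QTcF.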
Reference Limits
As with laboratory reference limits, the choice for ECG reference limits can be controversial.
Unlike laboratory analytes, clinically derived limits are commonly used for ECG outlier/shift
summaries and analyses. In addition, it is common to include clinically derived limits for both raw
measures and change values for identifying patients of potential concern. Unfortunately, the
specific clinically derived thresholds that are used vary widely, hampering efforts to standardize
analysis data across the industry. For purposes of this white paper, it is assumed a reference limit
is chosen that would identify raw values as low, normal, or high. Providing a specific
recommendation for the reference limits for either raw measures or changes is out of scope for
this version of the white paper. The specific choice of limits should be documented (e.g., protocol,
statistical analysis plan, study report methods section).
JT Interval
QTc is a biomarker with a long-established history of being used to assess the duration of
ventricular repolarization. However, QTc encompasses both ventricular depolarization and
ventricular repolarization. The length of the QRS complex represents ventricular depolarization
and the length of the JT interval, measured from the end of the QRS complex to the end of the
T-wave, specifically represents ventricular repolarization. JT can be corrected for heart rate, as with
QT. Thus, when the QRS is prolonged (e.g., a complete bundle branch block), QTc should not be
used to assess ventricular repolarization. The decision as to which basis for assessing potential
changes in ventricular repolarization will be used should be based on the expected proportion of
patients with widened QRS complexes for any reason in that study. It is worth noting that this
proportion increases with the age of the patient population and the extent to which the population
is expected to experience cardiac disease.
10.5. Vital Sign Measurements
Reference Limits
As with laboratory and ECG reference limits, the choice for vital sign reference limits can be
controversial. As with ECG limits, clinically derived limits are commonly used for vital sign
outlier/shift summaries and analyses, but the specific thresholds vary widely. For the purposes of this white paper, it is
assumed that a reference limit is chosen that would identify raw values as low, normal, or high.
Providing a specific recommendation for the reference limits for either raw measures or changes
is out of scope for this version of the white paper. The specific choice of limits should be
documented (e.g., protocol, statistical analysis plan, study report methods section).
11. Tables and Figures for Individual Studies
11.1. Recommended Displays
For quantitative laboratory analyte measurements, quantitative ECG measurements, and vital
signs in which low and high limits are based on raw values without a change or percentage
change criterion, a 3-panel display that includes a scatterplot, shift table, and a shift to low/high
table is recommended (Figures 11.1 and 11.2). In the scatterplot portion, lines indicating the
reference limits are included to ease the review of the plots. In cases where limits vary across
demographic characteristics and/or laboratories, lines indicating the most common limit can be
displayed, which is an especially good option if the population under study contains a relatively
large percentage of a particular demographic. Alternatively, lines for the lowest of the high limits
and the highest of the low limits can be displayed. Displaying lines for all limits can be considered
but will likely be too confusing to the users of the display.
Figure 11.1 is an example for assessing low values, and Figure 11.2 is an example for assessing
high values. Two sets of visuals, one for low and one for high values for each laboratory analyte,
vital sign, and ECG are generally desired. The summary of shifts from normal/high to low includes
patients whose minimum baseline value is normal or high. The summary of shifts from normal/low
to high includes patients whose maximum baseline value is low or normal.
For quantitative laboratory analyte measurements, quantitative ECG measurements, and vital
signs in which low and high limits are based on a specified change or percentage change value or
a combination of a specified value and a change or percentage change, a 2-panel display that
includes a scatterplot and a shift to low/high table is recommended (Figures 11.3 and 11.4).
For laboratory analyte measurements collected qualitatively, a listing of abnormal findings is
recommended (Table 11.1).
For the shift from normal/high to low and the shift from normal/low to high summaries, a test to
compare treatments (e.g., Fisher’s exact test) can be included, as reflected in Figures 11.1
through 11.4.
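For illustration, a two-sided Fisher's exact test on a 2x2 shift summary can be sketched in plain Python as follows. This is one common definition of the two-sided p-value (summing the probabilities of all tables no more likely than the observed one); the function name and the example counts are hypothetical, and validated statistical software would normally be used in practice.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are treatment arms; columns are shifted / did not shift.
    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that the first row has x shifted patients.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(p, 1.0)


# Hypothetical counts: 8 of 100 treated patients and 2 of 100 placebo
# patients shifted from normal/low to high.
p_value = fisher_exact_two_sided(8, 92, 2, 98)
```

The denominators here are the patients with a normal or low maximum baseline value and at least one treatment-period result, matching the definition of the shift summary above.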
Figure 11.1 Scatterplot and Shift Summary for Quantitative Safety Measures Assessing Low Value: Individual Study
Figure 11.2 Scatterplot and Shift Summary for Quantitative Safety Measures Assessing High Value: Individual Study
Figure 11.3 Scatterplot and Shift Summary for Quantitative Safety Measures Assessing Low Value with Change Criteria: Individual Study
Figure 11.4 Scatterplot and Shift Summary for Quantitative Safety Measures Assessing High Value with Change Criteria: Individual Study
Table 11.1 Treatment-Emergent Abnormal Summary for Qualitative Safety Measures: Individual Study
Laboratory Test (unit)  Treatment  N    n (%)     P value*
Lab Test 1              T1         Xxx  xx(xx.x)  .xxx
                        T2         Xxx  xx(xx.x)  .xxx
                        PL         Xxx  xx(xx.x)
Lab Test 2              T1         Xxx  xx(xx.x)  .xxx
                        T2         Xxx  xx(xx.x)  .xxx
                        PL         Xxx  xx(xx.x)
…                       …          …
Lab Test i              T1         Xxx  xx(xx.x)  .xxx
                        T2         Xxx  xx(xx.x)  .xxx
                        PL         Xxx  xx(xx.x)
…
Abbreviations: N = number of patients with a normal baseline and at least one post-baseline
measure, n = number of patients with abnormal post-baseline result.
*P values are from Fisher’s exact test comparing each treatment with PL.
11.2. Discussion
For the scatterplot in the recommended displays (Figures 11.1 to 11.4), we considered a
transformed scale by using the raw measure divided by the lower limit of normal (LLN) for
assessing low or the raw measure divided by the upper limit of normal (ULN) for assessing high.
This kind of display certainly would be useful for some special laboratory analyte
measurements, such as alanine aminotransferase (ALT) or aspartate aminotransferase (AST).
However, for routine laboratory analyte measurements, quantitative ECG measurements, and
vital signs, we recommend using the raw measurement in the scatterplot for most of the
analytes; it is of clinical interest to observe data in the original scale.
There are certainly multiple ways to display outlier/shift summaries. For quantitative laboratory
analyte measurements, quantitative ECG measurements, and vital signs, we considered only
displaying either scatterplots, shift tables (Table 17.1), or a shift to a low/high table (Table 17.2).
We also considered a display that combined the boxplot (from the central tendency white paper)
with a treatment-emergent table (Figure 17.1).
We quickly discarded only displaying the scatterplots. With just a scatterplot, users of the plots
will likely attempt to manually count and create percentages. We also quickly discarded only
displaying the shift table (Table 17.2) for similar reasons. Users of shift tables tend to manually
count and create grouped percentages for those shifting to high from low/normal (or low from
normal/high). Of note, shift tables become complex to create and are difficult to interpret if the
definition of an outlier/shift includes a specified change or percent change value. Thus, creating
the shift table is not recommended in these cases. We strongly considered only displaying shifts
to low/high with all analytes on the table (Table 17.3). This table has the advantage of being
succinct and still reflecting the data that tends to be the most useful for signal detection.
However, feedback from the working group's experience with medical colleagues has indicated a
desire to sort information by analyte as opposed to by analytical method. Thus, there is a
preference to have the ability to see the central tendency summary followed by the outlier/shift
summary by analyte. Table 17.3 is not suited for this type of presentation. Thus, we strongly
considered the display that has the boxplot and shift to low/high summary on the same page
(Figure 17.1). This has the advantage of being succinct (one page for each analyte) and is
ordered by analyte, consistent with medical preferences. However, the displays shown in
Figures 11.1 to 11.4 are recommended because we believe the additional information is
generally worth the extra pages per analyte. Researchers can visually see the extent of the shift
(how high or how low relative to baseline measurements) and can visually see whether there is
a clustering by treatment. The shift table portion is perhaps of less value, but it is popular across
current practices. Thus, it likely will help researchers who are used to shift tables by having what
they are used to seeing in addition to seeing other useful displays (when limits are based on
specified values without change criterion). The information can still be sorted by analyte in the
clinical study report (likely in the appendix): the boxplot followed by the 3-panel outlier/shift
display, sorted by analyte. This also makes it easy to bring the analytes that end up being
interesting into the body of the clinical study report. If a table such as Table 17.3 is created, a
manually created summary or a new table would be required to discuss the analyte of interest in
the body of the clinical study report.
We also considered a display that includes the scatterplot and the shift tables on one page
(Figures 17.2-17.5). Although this display is convenient to see data on one page, we
recommend separating the data into two pages so the text font size can be larger and more
likely to fulfill regulatory guidance on font size and margins.
For laboratory analyte measurements collected qualitatively, a shift from normal to abnormal
table was considered (Table 17.1). In most cases, we believe that the listing of abnormal
findings is sufficient. For any analyte that is part of a topic of special interest, a shift from normal to
abnormal table will likely be of interest. As noted in Section 10.3, it would be important to
understand data collection to properly create the table.
We also considered another display for summarizing information in a succinct manner within the
body of a clinical study report (Figure 17.6). This display has the advantage of quickly browsing
through all of the analytes that were analyzed, sorting by decreasing odds ratios. However,
given the medical feedback to present information sorted by analyte as opposed to analytical
method, we believe the preferred approach is to bring forward the boxplot and the 3-panel
outlier/shift figure into the body of the clinical study report for those analytes of interest (with all
of the displays in the appendix), discussed by analyte.
12. Tables and Figures for Integrated Summaries
12.1. Recommended Displays
For quantitative laboratory analyte measurements, quantitative ECG measurements, and vital
signs, a display that includes scatterplots by study and a shift to low/high table is recommended
for summaries across studies. See Figure 12.1 (for low values) and Figure 12.2 (for high
values). For cutoff criteria that include a change value, an additional reference line will be added to
the scatterplots in Figures 12.1 and 12.2 (see Figures 11.3 and 11.4 for examples of the
additional reference line).
Utilizing a method suggested by Chuang-Stein (2011) to provide adjusted cumulative
proportions is recommended in the display in addition to the unadjusted proportion.
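The difference between unadjusted and study-size-adjusted proportions can be illustrated with a short Python sketch. The weighting used here (each study's arm-specific proportion weighted by that study's share of all patients) is our reading of a study-size adjustment and is an assumption; it should be checked against the cited reference before use. The data and function names are hypothetical.

```python
def crude_pct(studies, arm):
    """Unadjusted percentage: pooled events over pooled patients."""
    n = sum(s[arm][0] for s in studies)
    total = sum(s[arm][1] for s in studies)
    return 100.0 * n / total


def study_size_adjusted_pct(studies, arm):
    """Weighted average of per-study proportions for one arm.

    Weight for each study = (patients in that study, all arms) /
    (patients in all studies, all arms). This weighting is an assumption.
    """
    grand_total = sum(sum(size for _, size in s.values()) for s in studies)
    adjusted = 0.0
    for s in studies:
        study_size = sum(size for _, size in s.values())
        n, size = s[arm]
        adjusted += (study_size / grand_total) * (n / size)
    return 100.0 * adjusted


# Hypothetical data: arm -> (patients who shifted, patients at risk).
# Within each study the two arms have identical rates, but the
# randomization ratios differ between studies.
studies = [
    {"A": (10, 100), "B": (30, 300)},  # 10% in both arms
    {"A": (60, 300), "B": (20, 100)},  # 20% in both arms
]
crude = {arm: crude_pct(studies, arm) for arm in ("A", "B")}
adjusted = {arm: study_size_adjusted_pct(studies, arm) for arm in ("A", "B")}
```

In this constructed example the unadjusted percentages differ between arms (17.5% versus 12.5%) even though no study shows a difference, while the adjusted percentages agree (15% in each arm).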
For laboratory analyte measurements collected qualitatively, the same listing of abnormal
findings (Table 11.1) for individual studies is recommended for integrated summaries, or a shift
from normal to abnormal table can be considered (Table 12.1).
Figure 12.1 Scatterplot and Shift Summary for Quantitative Safety Measures for Low Value: Integrated Database
Figure 12.2 Scatterplot and Shift Summary for Quantitative Safety Measures for High Value: Integrated Database
Table 12.1 Treatment-Emergent Abnormal Summary for Qualitative Safety Measures: Integrated Database
Laboratory Test (unit)  Treatment  N    n   %     Study Size Adjusted %  OR^a   Heterogeneity P value^b  P value*
Lab Test 1              A          Xxx  xx  xx.x  xx.x
                        B          Xxx  xx  xx.x  xx.x                   xx.xx  .xxx                     .xxx
Lab Test 2              A          Xxx  xx  xx.x  xx.x
                        B          Xxx  xx  xx.x  xx.x                   xx.xx  .xxx                     .xxx
…                       …          …
Lab Test i              A          Xxx  xx  xx.x  xx.x
                        B          Xxx  xx  xx.x  xx.x                   xx.xx  .xxx                     .xxx
…
Abbreviations: N = number of patients with a normal baseline and at least one post-baseline
measure, n = number of patients with abnormal post-baseline result. OR = Mantel-Haenszel odds
ratio; add more as needed (alphabetically).
^a Mantel-Haenszel odds ratio stratified by study. Treatment B is numerator, treatment A is
denominator.
^b Heterogeneity of odds ratios across studies was assessed using the Breslow-Day test.
*P values are from Cochran-Mantel-Haenszel (CMH) test of general association stratified by study.
12.2. Discussion
For quantitative laboratory analyte measurements, quantitative ECG measurements, and vital
signs, utilizing the same display recommended for individual studies was considered (3-panel
display with a single scatterplot, shift table, and shift to low/high table (Figure 11.1) with
meta-analytical methods added). However, due to concerns with potential paradoxes (Crowe B, 2014)
when combining data from multiple studies, use of a single scatterplot (with studies combined)
and a single shift table (with studies combined) was discarded. Instead, a scatterplot by study is
recommended (unless the number of studies prohibits the use of such a display). A shift table
by study is not recommended due to space limitations but would be available in the study
reports of the individual studies. The unadjusted percentages provided in the shift to low/high
table are subject to similar potential paradoxes. However, inclusion of the adjusted cumulative
percentage and the Mantel-Haenszel odds ratio (or alternative method) stratified by study
provides information to assess whether the unadjusted percentages are impacted by such
paradoxes. When the unadjusted percentages are impacted by such paradoxes, the adjusted
cumulative percentages will likely be more appropriate for summaries such as labeling.
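As a sketch of how stratification by study guards against such paradoxes, the Mantel-Haenszel odds ratio can be computed in plain Python as shown below. The function names and the example counts are hypothetical; the data are constructed so that the arms do not differ within either study.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel odds ratio across strata (studies).

    Each stratum is (a, b, c, d):
      a = treatment B with shift,  b = treatment B without shift,
      c = treatment A with shift,  d = treatment A without shift.
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    if den == 0:
        raise ZeroDivisionError("MH odds ratio undefined: denominator is zero")
    return num / den


def crude_or(strata):
    """Odds ratio from the single 2x2 table obtained by pooling all strata."""
    a = sum(s[0] for s in strata)
    b = sum(s[1] for s in strata)
    c = sum(s[2] for s in strata)
    d = sum(s[3] for s in strata)
    return (a * d) / (b * c)


# Hypothetical studies with unequal randomization ratios.
strata = [
    (30, 270, 10, 90),   # study 1: 10% shift in both arms
    (20, 80, 60, 240),   # study 2: 20% shift in both arms
]
stratified = mantel_haenszel_or(strata)
pooled = crude_or(strata)
```

Here the stratified odds ratio equals 1, consistent with the within-study results, whereas the pooled odds ratio is well below 1 purely because of the differing randomization ratios.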
Another display that was considered was a shift to low/high table with a corresponding forest
plot that shows incidence differences by study (Figure 17.7). This display has the advantage of
being practical even when many studies are included in a summary. However, the scatterplot is
recommended when the number of studies is small enough (e.g., ≤6) because it provides insight
into patient-level information by individual study that is often valuable for users of the figure. When
the number of studies is large (e.g., >6), Figure 17.7 can be considered.
A simple shift to low/high table was also considered (Table 17.3). Again, we would strongly
recommend the display shown in Figure 12.1 over this display for integrated analyses when the
number of studies is small enough (e.g., ≤6) for the same reason stated above. When the
number of studies is large (e.g., >6), Table 17.3 can be considered.
Figures 17.8 and 17.9 were discarded for the same reason as Figures 17.2-17.5 (see section
11.2).
13. Example SAP Language
13.1. Individual Study
For quantitative laboratory analyte measurements, 3-panel displays that include a scatterplot, a
shift table, and a shift to high/low table will be created. Specifically, for each measurement, both
a 3-panel display assessing low values and a 3-panel display assessing high values will be
created.
In the 3-panel display to assess low values, the scatterplot will plot the minimum value during
the baseline period versus the minimum value during the treatment period. Lines indicating the
reference limits are included. In cases where limits vary across demographic characteristics,
lines indicating the most common limit will be displayed. The shift table will include the number
and percentage of patients within each baseline category (minimum value is low, normal, high,
or missing) versus each treatment category (minimum value is low, normal, or high) by
treatment. Patients with at least one result in the treatment period will be included in the shift
table. The shift from normal/high to low table will include the number and percentage of patients
by treatment whose minimum baseline result is normal or high and whose minimum treatment
result is low. Patients whose minimum baseline result is normal or high and who have at least one
result during the treatment period are included. Fisher’s exact test will be used to compare
percentages of patients who shift from normal/high to low between treatments.
The 3-panel display to assess high values will be created similarly. The scatterplot will plot the
maximum value during the baseline period versus the maximum value during the treatment
period. The shift table will include the number and percentage of patients within each baseline
category (maximum value is low, normal, high, or missing) versus each treatment category
(maximum value is low, normal, or high) by treatment. The shift from normal/low to high table
will include the number and percentage of patients by treatment whose maximum baseline
result is normal or low and whose maximum treatment result is high. Patients whose maximum
baseline result is normal or low and who have at least one result during the treatment period are
included.
For laboratory analyte measurements collected qualitatively, a listing of abnormal findings will
be created. The listing will include patient ID, treatment group, laboratory collection date,
analyte name, and analyte finding.
For quantitative ECG measurements and vital signs with limits defined using a specified value
without a change criterion, 3-panel displays will be created as described above. For quantitative
ECG measurements and vital signs with limits defined using a specified value and a change
criterion, 2-panel displays will be created. The 2-panel display will include the scatterplot and
the shift to low/high table. To assess increases, change from the maximum value during the
baseline period to the maximum value during the treatment period will be used. To assess
decreases, change from the minimum value during the baseline period to the minimum value
during the treatment period will be used.
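As a rough illustration of a limit combined with a change criterion, the following Python sketch flags a patient using the baseline and treatment extremes described above. The function names and the `inclusive` switch are hypothetical; the example thresholds are the systolic blood pressure and pulse criteria from Table 13.2, and the measurement values are made up.

```python
def low_with_change(baseline_values, treatment_values, limit, min_decrease, inclusive=True):
    """Low criterion: minimum treatment value at or below the limit AND a
    decrease from the minimum baseline value of at least min_decrease."""
    base_min = min(baseline_values)
    trt_min = min(treatment_values)
    below = trt_min <= limit if inclusive else trt_min < limit
    return below and (base_min - trt_min) >= min_decrease

def high_with_change(baseline_values, treatment_values, limit, min_increase, inclusive=True):
    """High criterion: maximum treatment value at or above the limit AND an
    increase from the maximum baseline value of at least min_increase."""
    base_max = max(baseline_values)
    trt_max = max(treatment_values)
    above = trt_max >= limit if inclusive else trt_max > limit
    return above and (trt_max - base_max) >= min_increase

# Systolic BP (mm Hg): low is <=90 with a decrease >=20; high is >=140 with an increase >=20.
sbp_low = low_with_change([118, 112], [95, 88], limit=90, min_decrease=20)
sbp_high = high_with_change([118, 112], [135, 141], limit=140, min_increase=20)

# Pulse (bpm): high is >100 (strict inequality) with an increase >=15.
pulse_high = high_with_change([90], [100], limit=100, min_increase=15, inclusive=False)
```

The `inclusive` flag exists only because the tabulated criteria mix strict and non-strict inequalities (for example ≤90 for systolic blood pressure but <50 for pulse).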
Laboratory tests include all planned analytes as defined in the protocol, excluding those
collected in a reflex manner (only collected under certain circumstances). ALT, AST, and total
bilirubin will not be included in this analysis because they will be analyzed as described in the
hepatotoxicity section. Vital signs include systolic blood pressure, diastolic blood pressure,
pulse, and temperature. Physical characteristics include weight and body mass index (BMI).
ECG parameters include heart rate, PR, QRS, QT, corrected QT using Fridericia’s correction
factor (QTcF = QT/RR^0.333), and corrected QT using a large clinical trial population-based
correction factor (QTcLCTPB = QT/RR^0.413; Dmitrienko et al, 2005). When the QRS is prolonged
(e.g., a complete bundle branch block), QT and QTc should not be used to assess ventricular
repolarization. Thus, for a particular ECG, the following will be set to missing (for analysis
purposes) when QRS is ≥120 msec: QT, QTcF, and QTcLCTPB.
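The two corrections and the QRS rule can be illustrated with a short Python sketch. It assumes QT and QRS are in msec and RR is in seconds (for example, derived as 60 divided by heart rate in bpm); the function name and the dictionary layout are hypothetical.

```python
def corrected_qt(qt_msec, rr_sec, qrs_msec):
    """Return QT, QTcF, and QTcLCTPB in msec for one ECG.

    All three are set to None (missing for analysis purposes) when
    QRS is >= 120 msec. Assumes RR is expressed in seconds.
    """
    if qrs_msec >= 120:
        return {"QT": None, "QTcF": None, "QTcLCTPB": None}
    return {
        "QT": qt_msec,
        "QTcF": qt_msec / rr_sec ** 0.333,      # Fridericia's correction
        "QTcLCTPB": qt_msec / rr_sec ** 0.413,  # population-based correction
    }

# Made-up ECG values: heart rate 75 bpm gives RR = 0.8 seconds.
ecg = corrected_qt(qt_msec=400, rr_sec=60 / 75, qrs_msec=95)
excluded = corrected_qt(qt_msec=400, rr_sec=0.8, qrs_msec=130)
```

At RR = 1 second both corrections leave QT unchanged, which is a convenient sanity check for the units.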
Large clinical trial population-based reference limits will be used to define the low and high limits
for laboratory analyte measurements (Reference x or Attachment x – not shown in this
example). Reference limits for ECGs and vital signs are defined in Tables 13.1 and 13.2,
respectively.
Table 13.1 Selected Categorical Limits for ECG Data

Limits are shown as Age (yrs): Limit.

Parameter             Low, Males        Low, Females      High, Males       High, Females
Heart rate (bpm)      ≥18: <50 and      ≥18: <50 and      ≥18: >100 and     ≥18: >100 and
                      decrease ≥15      decrease ≥15      increase ≥15      increase ≥15
PR interval (msec)    All ages: <120    All ages: <120    All ages: ≥220    All ages: ≥220
QRS interval (msec)   All ages: <60     All ages: <60     All ages: ≥120    All ages: ≥120
QTcF (msec)           All ages: <330    All ages: <340    ≥16: >450         ≥16: >470
QTcLCTPB (msec)¹      All ages: <330    All ages: <340    <18: >444         <18: >445
                                                          18-25: >449       18-25: >455
                                                          26-35: >438       26-35: >455
                                                          36-45: >446       36-45: >459
                                                          46-55: >452       46-55: >464
                                                          56-65: >448       56-65: >469
                                                          >65: >460         >65: >465

NA=Not applicable
1. Dmitrienko AA, Sides GD, Winters KJ, Kovacs RJ, Rebhun DM, Bloom JC, Groh W, Eisenberg PR.
Electrocardiogram reference ranges derived from a standardized clinical trial population. Drug Inf J.
2005;39:395-405.
Table 13.2 Categorical Criteria for Abnormal Treatment-Emergent Blood
Pressure and Pulse Measurement and Categorical Criteria for
Weight and Temperature Changes for Adults

Parameter                        Low                                    High
Systolic BP (mm Hg)              ≤90 and decrease from baseline ≥20     ≥140 and increase from baseline ≥20
(Supine or sitting –
forearm at heart level)
Diastolic BP (mm Hg)             ≤50 and decrease from baseline ≥10     ≥90 and increase from baseline ≥10
(Supine or sitting –
forearm at heart level)
Pulse (bpm)                      <50 and decrease from baseline ≥15     >100 and increase from baseline ≥15
(Supine or sitting)
Temperature                      <96 degrees F and                      ≥101 degrees F and
                                 decrease ≥2 degrees F                  increase ≥2 degrees F

Abbreviation: bpm = beats per minute.
13.2. Integrated Summary
For quantitative laboratory analyte measurements, quantitative ECG measurements, and vital
signs, a display that includes scatterplots by study and a shift to low/high table will be created.
Specifically, for each measurement, both a 2-panel display assessing low values and a 2-panel
display assessing high values will be created.
To assess low values in the 2-panel display, the scatterplots will plot the minimum value during
the baseline period versus the minimum value during the treatment period for each study. Lines
indicating the reference limits are included. For cutoff criteria including a change value, an
additional reference line will be added to the scatterplots. In cases where limits vary across
demographic characteristics, lines indicating the most common limit will be displayed. The shift
from normal/high to low table will include the number and percentage of patients by treatment
whose minimum baseline result is normal or high and whose minimum treatment result is low.
Patients whose minimum baseline result is normal or high and have at least one result during
the treatment period are included. The Cochran-Mantel-Haenszel test stratified by study will be
used to compare the percentages of patients who shift from normal or high to low between
treatments. The Mantel-Haenszel odds ratio and Breslow-Day test for heterogeneity will also be
provided. In addition, the study size-adjusted cumulative percentages suggested by Chuang-Stein and Beltangady (2011) will be provided.
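The stratified statistics named above can be sketched in plain Python as follows. Several assumptions apply: each study contributes one 2x2 table (a, b, c, d) with a = treated patients who shifted, b = treated who did not, c = control patients who shifted, and d = control who did not; the Cochran-Mantel-Haenszel statistic is computed without a continuity correction; and the study size-adjusted percentage is taken to be the average of the study-specific percentages weighted by each study's total size, which is our reading of the cited approach rather than a verified implementation. The Breslow-Day test is not shown. All function names and counts are hypothetical.

```python
from math import erfc, sqrt

def mantel_haenszel_or(strata):
    """Mantel-Haenszel odds ratio pooled over studies."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

def cmh_test(strata):
    """CMH chi-square (1 df, no continuity correction) and its p-value."""
    observed = expected = variance = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        observed += a
        expected += (a + b) * (a + c) / n
        variance += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    chi2 = (observed - expected) ** 2 / variance
    # Survival function of a chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    return chi2, erfc(sqrt(chi2 / 2))

def study_size_adjusted_pct(strata, treated=True):
    """Percentage for one arm, weighting each study by its total size."""
    total = sum(a + b + c + d for a, b, c, d in strata)
    pct = 0.0
    for a, b, c, d in strata:
        events, arm_n = (a, a + b) if treated else (c, c + d)
        pct += (a + b + c + d) / total * (events / arm_n)
    return 100 * pct

# Two made-up studies.
studies = [(10, 90, 5, 95), (8, 42, 4, 46)]
odds_ratio = mantel_haenszel_or(studies)
chi2, p_value = cmh_test(studies)
adj_treated = study_size_adjusted_pct(studies, treated=True)
adj_control = study_size_adjusted_pct(studies, treated=False)
```

Weighting by total study size rather than by arm size keeps the treatment comparison from being distorted when randomization ratios differ across studies.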
The 2-panel display to assess high values will be created similarly. The scatterplots will plot the
maximum value during the baseline period versus the maximum value during the treatment
period for each study. The shift from normal/low to high table will include the number and
percentage of patients by treatment whose maximum baseline result is normal or low and
whose maximum treatment result is high. Patients whose maximum baseline result is normal or
low and have at least one result during the treatment period are included.
For quantitative ECG measurements and vital signs with limits defined using a specified value
and a change criterion, change from the maximum value during the baseline period to the
maximum value during the treatment period will be used to assess increases. Change from the
minimum value during the baseline period to the minimum value during the treatment period will
be used to assess decreases.
For laboratory analyte measurements collected qualitatively, a listing of abnormal findings will
be created. The listing will include patient ID, treatment group, laboratory collection date,
analyte name, and analyte finding.
Laboratory tests include all planned analytes as defined in the protocol, excluding those
collected in a reflex manner (only collected under certain circumstances). ALT, AST, and total
bilirubin will not be included in this analysis because they will be analyzed as described in the
hepatotoxicity section. Vital signs include systolic blood pressure, diastolic blood pressure,
pulse, and temperature. Physical characteristics include weight and BMI. ECG parameters
include heart rate, PR, QRS, QT, corrected QT using Fridericia’s correction factor
(QTcF = QT/RR^0.333), and corrected QT using a large clinical trial population-based correction
factor (QTcLCTPB = QT/RR^0.413; Dmitrienko et al, 2005). When the QRS is prolonged (e.g., a
complete bundle branch block), QT and QTc should not be used to assess ventricular
repolarization. Thus, for a particular ECG, the following will be set to missing (for analysis
purposes) when QRS is ≥120 msec: QT, QTcF, and QTcLCTPB.
Large clinical trial population-based reference limits will be used to define the low and high limits
for laboratory analyte measurements (Reference x or Attachment x – not shown in this
example). Reference limits for ECGs and vital signs are defined in Tables 13.1 and 13.2,
respectively.
14. Acknowledgements
The key contributors include: Wei Wang, Mary E. Nilsson, and Charles M. Beasley.
Additional contributors and members of the Analysis and Display White Papers Project Team
within the PhUSE Computational Science Development of Standard Scripts for Analysis and
Programming Working Group include: David Henry Adams, Sascha Ahrweiler, Michelle A
Barrick, Kirk Bateman, Simin K. Baygani, Nhi Beasley, Gustav Bernard, Karen L Bonifacius,
Adrienne Bonwick, Nancy Brucken, Asa Carlsheimer, Lai Shan Chan, Yi-Lin Chiu, Brenda
Crowe, Jane Diefenbach, Damon P. Disch, Mary Doi, Harprit Dosanjh, Mohammad Fayaz,
Jean-Marc Ferran, Jim Gaiser, Steven P. Gingras, Dany Guerendo, Kristen Harrington, Ray
Harris, Qi Jiang, David Jordan, Kenneth Koury, Karolyn Kracht, Karin LaPann, Fabien Linay,
Rich Manski, Kim Musgrave, Mercidita Navarro, Pierre Nicolas, Raphael Noirfalise, Michele
Norton, Musa Nsereko, Walt Offen, Mithun Ranga, Peter Schaefer, John Schoenfelder, Frank
Senk, Julie Shah, Jack Shostak, Christos Stylianou, Rebeka Tabbey, Sheryl Treichel, Francois
Vandenhende, Terry Walsh, Sharon M. Weller, John Smith, Steve Wilson
15. Project Leader Contact Information
Name: Mary Nilsson
Enterprise: Eli Lilly & Company
City, State ZIP: Indianapolis, IN 46285
Work Phone: 317-651-8041
E-mail: [email protected]
16. References
Chuang-Stein C, Beltangady M. Reporting cumulative proportion of subjects with an adverse
event based on data from multiple studies. Pharm Stat. 2011;10:3-7.
Copeland KT, Checkoway H, McMichael AJ, Holbrook RH. Bias due to misclassification in the
estimation of relative risk. Am J Epidemiol. 1977;105:488-495.
Crowe B, Wang W, Nilsson ME. Advances in collating and using trial data. In Chevret S,
Resche-Rigon M (eds): Advances in Collating and Using Trial Data. London: Future
Science; 2014:20-36.
Crowe BJ, Xia HA, Berlin JA, et al. Recommendations for safety planning, data collection,
evaluation and reporting during drug, biologic and vaccine development: a report of the
safety planning, evaluation, and reporting team. Clin Trials. 2009;6:430-440.
Dmitrienko AA, Sides GD, Winters KJ, et al. Electrocardiogram reference ranges derived from a
standardized clinical trial population. Drug Information Journal. 2005;39:395-405.
Horowitz G, Altaie S, Boyd J, Ceriotti F, Garg U, Horn P, et al. Defining, establishing, and
verifying reference intervals in the clinical laboratory: Approved guideline (3rd ed.).
Wayne, PA: Clinical and Laboratory Standards Institute; 2008.
Quade D, Lachenbruch PA, Whaley FS, McClish DK, Haley RW. Effects of misclassifications on
statistical inferences in epidemiology. Am J Epidemiol. 1980;111:503-513.
Rosario LA, Kropp TJ, Wilson SE, Cooper CK. Joint FDA/PhUSE Working Groups to help
harness the power of computational science. Drug Information Journal. 2012;46:523-524.
O'Neil R. Assessment of safety. In Peace KE (ed.), Biopharmaceutical Statistics for Drug
Development. New York: Marcel Dekker; 1988.
Solberg HE, Grasbeck R. Reference values. Adv Clin Chem. 1989; 27:1-79.
U.S. Department of Health and Human Services. (2005a). Guidance for Industry: E14 Clinical
Evaluation of QT/QTc Interval Prolongation and Proarrhythmic Potential for Non-Antiarrhythmic
Drugs. Retrieved from
http://www.fda.gov/downloads/drugs/guidancecomplianceregulatoryinformation/guidances/ucm073153.pdf
U.S. Department of Health and Human Services. (2005b). Reviewer guidance: Conducting a
clinical safety review of a new product application and preparing a report on the review.
Retrieved from
http://www.fda.gov/downloads/drugs/guidancecomplianceregulatoryinformation/guidances/ucm072974.pdf
17. Appendix: Figures and Tables
Figure 17.1
Summary for Quantitative Safety Measures: Individual Study
Figure 17.2 Scatterplot and Shift Summary for Quantitative Safety Measures
Assessing Low Value: Individual Study
Figure 17.3 Scatterplot and Shift Summary for Quantitative Safety Measures
Assessing High Value: Individual Study
Figure 17.4 Scatterplot and Shift Summary for Quantitative Safety Measures
Assessing Low Value with Change Criteria: Individual Study
Figure 17.5 Scatterplot and Shift Summary for Quantitative Safety Measures
Assessing High Value with Change Criteria: Individual Study
Figure 17.6 Summary of Common Treatment-Emergent Abnormal for
Quantitative Safety Measures: Individual Study
Figure 17.7 Scatterplot and Shift Summary for Quantitative Safety Measures:
Integrated Database
Figure 17.8 Scatterplot and Shift Summary for Quantitative Safety Measures for
Low Value: Integrated Database
Figure 17.9 Scatterplot and Shift Summary for Quantitative Safety Measures for
High Value: Integrated Database
Table 17.1 Shift Table Analyses

Lab Test Name

                            Post-Baseline Result
Treatment    Baseline    Low          Normal       High         Total
             Result      n (%)        n (%)        n (%)        n (%)
T1           Low         xx (xx.x)    xx (xx.x)    xx (xx.x)    xx (xx.x)
(N = xxx)    Normal      xx (xx.x)    xx (xx.x)    xx (xx.x)    xx (xx.x)
             High        xx (xx.x)    xx (xx.x)    xx (xx.x)    xx (xx.x)
             Missing     xx (xx.x)    xx (xx.x)    xx (xx.x)    xx (xx.x)
             Total       xx (xx.x)    xx (xx.x)    xx (xx.x)
T2           Low         xx (xx.x)    xx (xx.x)    xx (xx.x)    xx (xx.x)
(N = xxx)    Normal      xx (xx.x)    xx (xx.x)    xx (xx.x)    xx (xx.x)
             High        xx (xx.x)    xx (xx.x)    xx (xx.x)    xx (xx.x)
             Missing     xx (xx.x)    xx (xx.x)    xx (xx.x)    xx (xx.x)
             Total       xx (xx.x)    xx (xx.x)    xx (xx.x)
PL           Low         xx (xx.x)    xx (xx.x)    xx (xx.x)    xx (xx.x)
(N = xxx)    Normal      xx (xx.x)    xx (xx.x)    xx (xx.x)    xx (xx.x)
             High        xx (xx.x)    xx (xx.x)    xx (xx.x)    xx (xx.x)
             Missing     xx (xx.x)    xx (xx.x)    xx (xx.x)    xx (xx.x)
             Total       xx (xx.x)    xx (xx.x)    xx (xx.x)

Shifts from Last Baseline to Last Post-Baseline Result

Treatment    Decreased    Same         Increased    P value*
             n (%)        n (%)        n (%)
T1           xx (xx.x)    xx (xx.x)    xx (xx.x)    .xxx
T2           xx (xx.x)    xx (xx.x)    xx (xx.x)    .xxx
PL           xx (xx.x)    xx (xx.x)    xx (xx.x)

Abbreviations: N = number of patients with a baseline and post-baseline result; n = number of
patients in category; add more as needed (alphabetically).
*P values are from likelihood-ratio chi-square test, compared with PL.
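For readers less familiar with the likelihood-ratio chi-square test named in the footnote, the following Python sketch shows one way it could be computed. It assumes the comparison is a 2x3 table (one treatment versus PL, across the Decreased, Same, and Increased categories) and that every column has at least one patient, so the statistic has 2 degrees of freedom; the table does not state this layout explicitly, and the function name and counts are hypothetical.

```python
from math import exp, log

def likelihood_ratio_chi_square_2x3(treatment_counts, placebo_counts):
    """Likelihood-ratio (G) statistic and p-value for a 2x3 table.

    Each argument is a sequence of three counts, for example
    (decreased, same, increased). Assumes no column total is zero.
    """
    rows = [treatment_counts, placebo_counts]
    col_totals = [t + p for t, p in zip(treatment_counts, placebo_counts)]
    n = sum(col_totals)
    g = 0.0
    for row in rows:
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            if observed > 0:  # a zero cell contributes nothing to G
                g += 2 * observed * log(observed * n / (row_total * col_total))
    # With 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    return g, exp(-g / 2)

# Made-up counts for T1 versus PL.
g_stat, p_value = likelihood_ratio_chi_square_2x3((20, 70, 10), (10, 80, 10))
```

When the two treatment rows have identical distributions the statistic is zero and the p-value is 1, which gives a quick check of the implementation.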
Table 17.2 Shift from Normal/High to Low and from Normal/Low to High for
Laboratory Measures

Laboratory Tests
Shift from Normal/High to Low and from Normal/Low to High

Laboratory Test    Abnormality    Treatment    N      n (%)        P value*
                   Direction
Lab Test 1         Low            T1           xxx    xx (xx.x)    .xxx
                                  T2           xxx    xx (xx.x)    .xxx
                                  PL           xxx    xx (xx.x)
                   High           T1           xxx    xx (xx.x)    .xxx
                                  T2           xxx    xx (xx.x)    .xxx
                                  PL           xxx    xx (xx.x)
Lab Test 2         Low            T1           xxx    xx (xx.x)    .xxx
                                  T2           xxx    xx (xx.x)    .xxx
                                  PL           xxx    xx (xx.x)
                   High           T1           xxx    xx (xx.x)    .xxx
                                  T2           xxx    xx (xx.x)    .xxx
                                  PL           xxx    xx (xx.x)
…                  …              …            …      …
Lab Test i         Low            T1           xxx    xx (xx.x)    .xxx
                                  T2           xxx    xx (xx.x)    .xxx
                                  PL           xxx    xx (xx.x)
                   High           T1           xxx    xx (xx.x)    .xxx
                                  T2           xxx    xx (xx.x)    .xxx
                                  PL           xxx    xx (xx.x)

Abbreviations: N = number of patients with a normal (i.e., not low if calculating low and not
high if calculating high) baseline and at least one post-baseline measure; n = number of patients
with an abnormal post-baseline result in the specified category.
*P values are from Fisher’s exact test, compared with PL.
Table 17.3 Shift from Normal/High to Low and from Normal/Low to High:
Integrated Database

Laboratory     Direction    Treatment    N      n (%)        OR(a)    Heterogeneity    P value*
Test (unit)                                                           P value(b)
Lab Test 1     High         A            xxx    xx (xx.x)    xx.xx    .xxx             .xxx
                            B            xxx    xx (xx.x)
               Low          A            xxx    xx (xx.x)    xx.xx    .xxx             .xxx
                            B            xxx    xx (xx.x)
…              …
Lab Test i     High         A            xxx    xx (xx.x)    xx.xx    .xxx             .xxx
                            B            xxx    xx (xx.x)
               …

Abbreviations: N = number of patients with a normal baseline and at least one post-baseline
measure; n = number of patients with abnormal post-baseline result; OR = Mantel-Haenszel
odds ratio; add more as needed (alphabetically).
(a) Mantel-Haenszel odds ratio stratified by study. Treatment B is numerator, treatment A is
denominator.
(b) Heterogeneity of odds ratios across studies was assessed using the Breslow-Day test.
*P values are from Cochran-Mantel-Haenszel (CMH) test of general association stratified by
study.