FUTURE DIRECTION OF BIOMEDICAL INFORMATION SYSTEMS IN THE PHARMACEUTICAL INDUSTRY BASED ON THE SAS® SYSTEM

James F. Sattler, Syntex Research

The title of this paper promises a broader subject than the paper actually is intended to present. Therefore, at the outset it is important to state that the following will deal with only one biomedical application: computer-assisted NDA review systems.

COMPUTER-ASSISTED NDA SYSTEMS

In the second half of the 1980s, U.S. pharmaceutical companies and the Food and Drug Administration have begun to apply computer technology to New Drug Applications (NDAs), i.e., to the formal submission of data to a regulatory agency required for approval of a new drug. Medical reviewers at the FDA now are starting to have direct access to clinical trials data which the sponsoring companies present as evidence of a drug's safety and efficacy. These computer-assisted NDA (CANDA) review systems are on the way to becoming a standard feature of the drug development process.

TYPES OF CANDA SYSTEMS

There are at least three conceivable types of CANDA systems.

Type I CANDA

This is the simplest CANDA system. It involves the transfer of data only, from the sponsor to the regulatory agency. The transfer itself can take various forms. First, the data can be sent by means of floppy diskettes or tapes which can then be mounted directly on the computer equipment belonging to the agency. The agency then analyzes the data using its own analytical procedures and software. Another example of data transfer is the use of a direct line which connects the agency to the sponsor's own computer system. In either case, the goal of a Type I CANDA remains a modest one - to provide data only, in electronic form. In a way, it is a misnomer to refer to this type of CANDA as a "system", since only the administrative and technical arrangements for data transfer could be considered to be systematic.

Type II CANDA

The second type of CANDA deserves to be called a "system" because a clinical data review system accompanies the data. A clinical data review system is a set of automated procedures that enables a medical reviewer in a regulatory agency to interact with the data. The sponsor thereby turns over to the regulatory agency a powerful mechanism that the agency hopes will facilitate or improve the quality of the review. It is this type of CANDA that is under the most active development today.

Type III CANDA

As might be imagined, Type I and II CANDA systems raise the prospect of still another, more highly evolved type of CANDA. It is possible to conceive of a Type III system as a kind of super Type II. In addition to providing a data review mechanism, such a CANDA would encompass the entire NDA package. Clinical results, statistical analysis, toxicology, manufacturing, chemistry, and all other information comprising the total NDA package would be available to the regulatory agency in a single integrated system. Such a system would be able to move smoothly among data, text, tables, and graphs. It would also incorporate the powerful querying and reporting tools of newly-emerging relational database software.

Currently, the Type III CANDA remains a goal for the future. However, there is some discussion in the pharmaceutical industry about moving in this direction. As that happens, software requirements will become much more complex than they are in the Type II CANDA systems.

The Syntex CANDA system resembles the clinical data review systems which were described at SUGI meetings in 1986 and 1987. It uses SAS/AF menu and program screens to bridge the gap between reviewer and the complex software needed for review of clinical trials data. SAS/AF has made it possible also to modify prototype functions easily and to tailor the application to the specific needs of the users at FDA.

The Syntex CANDA system provides the following generic functions.

1. Viewing the clinical data in tables or in special screens resembling Case Report Forms. The special screens can be displayed automatically at the end of a complex search for patients with certain characteristics.
2. Manipulating the data - selecting subsets, computing new variables, and cross-referencing data sets (e.g., correlating side-effect and concomitant medication datasets).
3. Electronic mail for communication between sponsor and regulatory agency.
4. Miscellaneous utilities - e.g., saving subsets of data, sorting, printing, etc.

These features are similar to those found in other CANDA systems in use or under development. As the pharmaceutical industry and medical reviewers at the FDA work more closely together, the resulting systems are coming to resemble each other, at least in terms of their functionality. However, while such systems solve many problems, they also create others. Typically, they generate more demanding user requirements. Once accustomed to a computerized review system, a medical reviewer may begin to imagine more powerful procedures and enhancements. The revolution of rising expectations in computer-assisted reviews of drugs places a heavy burden on the developers of the systems. SAS, because of its initial success, has been unquestionably a leading contender for the software of choice. The question is how long such systems can continue to be based on SAS software exclusively.

SOLE RELIANCE ON SAS?

Let us now explore some answers to this question in the real world of clinical data review. This will touch on possible future changes to SAS like those compiled annually in the Software Ballot. But the paper will go beyond a laundry list of suggestions. In particular, it will try to open some discussion on the implications of new 4GL DBMS software for SAS.

First, let us try to understand the application a little better. A medical reviewer uses a CANDA system to see if a drug works and if it is safe. That sounds straightforward enough, but this simple requirement translates into sophisticated systems. For example, in order to see if a drug is safe, a medical reviewer will look at laboratory data. In a SAS-based system, a medical reviewer can use different simple or complex search techniques to isolate the laboratory data of interest.

• Boolean logic can locate female patients older than 65.
• Actual data values can be displayed and selected automatically.
• SAS software is capable of even more sophisticated searches. Recently, it has been possible to generate frequency tables of normalized laboratory data with the cross-tabulation results computed in pre-defined ranges. The physician reviewer can then select a segment of the frequency table representing the patients of interest and thus isolate a data subset. The subset can then be used to select individual patients and then generate custom Case Report Forms for them.

In a word: The SAS system has many powerful tools available for solving the complex requirements of this type of biomedical information system. Nevertheless, it is the simple requirements that pose a challenge to exclusive reliance on SAS software solutions to biomedical requirements. In particular, querying and cross-referencing data are functions which stretch SAS to its limits.

QUERY FUNCTIONS IN SAS

The query function needs to provide the user with an intuitive random search capability. The result of the query then should be cross-referenced easily with other data. Find patients with headaches. What concomitant medications did these patients take? What were their laboratory values? Then, if a patient's liver enzymes show alarming values, the medical reviewer will want to have all available information about the patient in order to draw informed conclusions. What about other studies? Can data be pooled across clinical trials?

Assume that SAS data sets exist for laboratory, demographic, side-effects and other data collected on Case Report Forms during clinical trials. Cross-referencing requires the ability to merge datasets quickly and easily. SAS merges datasets but not with the ease of a relational database. To be sure, with skillful programming techniques, it is possible to simulate many features of relational databases in SAS. However, programming techniques which can simulate relational table lookups and joins are known to relatively few SAS users.

REPLAYING A SESSION

One requirement for a CANDA system is to be able to save the terminal session and to reproduce the steps taken to achieve analytical results. Typically, users of biomedical information systems are expert users, but not expert computer people. With a menu-driven system, a user may arrive at desired results haphazardly and then not remember the steps taken to get them. In the SAS/AF documentation, reference is made to the concept of a "project", defined as a set of one or more tasks involving data. The ability to store such a "project" is not currently implemented in the SAS/AF system.

This is only one example of the kind of Software Ballot suggestion which needs to be incorporated into contemporary biomedical information systems of the CANDA type. As it turns out, this too, like the querying and cross-referencing functions mentioned above, is quite feasible in some database environments. This brings us to the next topic.

SAS - DBMS SYSTEMS

What about combining SAS with a relational database package? This sounds simple. But what kind of combinations are possible and effective? When should the data reside in tables or in SAS datasets? In short: what is the optimal mix of softwares? Let us consider these questions.

The notion that the data should reside in database tables means extraction and conversion. This is the main topic in what usually is said or written about SAS interfaces to databases. When all is said and done, the interface problem is considered solved if SAS PROCs can be written which extract data from a database or load it back into a database. Implicit in this notion of a SAS - DBMS combination is the belief that the data must be put into SAS first before the data can be analyzed. A "PROC DBMS" which extracts data treats the database like a passive resource, and communicates only by passing a few query commands over to the database. The database in such a system is considered to have limited intelligence and is relegated to a completely subordinate position. Not surprisingly, it is this concept of a SAS - DBMS interface which has tended to be discussed at SUGI conferences, where the idea that SAS should be the primary tool for data analysis is not a radical one.

SAS - DBMS SEAMLESS INTEGRATION

While it is true that the SAS system has undergone a remarkable growth in capabilities and sophistication, so too have some DBMS systems. One example of this is the appearance of complex application facilities and other 4GL components of contemporary DBMS systems. Indeed, it begins to appear that a database must have an application facility just as a car must have disk brakes. It is this concurrent evolution of other software products which must now be considered when discussing SAS - DBMS interfaces.

The notion of SAS - DBMS interface which now becomes of interest is the interface between two applications facilities. In other words, the resulting system would appear as a hybrid menu and program screen package, in which some menu choices would invoke the database's programmed functions and some menu choices would invoke SAS functions.

IMAGINING A HYBRID SYSTEM

Such a hybrid system is relatively easy to imagine in the VMS™ environment. The broad contours of the system would be set up as follows: First, the database would be used to select the analysis data. The selection could proceed in the most powerful manner, allowing ad-hoc, random selection, metadata merges, etc. Second, the selected data could be dumped to a flat file and its descriptive characteristics (numeric or character, column position, etc.) similarly written out to a file. Next, a command file could be executed which would invoke the SAS menu system. The first program in the menu system would be a DATA step whose function would be to read in the flat file containing the dumped data.

This type of SAS - DBMS interface is fundamentally unlike the "PROC DBMS" type interface which was discussed earlier. It is based on a role-sharing concept, rather than the one-sided, SAS-dominant one. The notion here is that it is worthwhile to combine the strongest features of the DBMS with the strongest features of SAS.

CONCLUSION

In the specific CANDA application under discussion, there are several roles for SAS. First, the SAS system is required for special statistical tables, because PROC FREQ, PROC MEANS, PROC UNIVARIATE, and PROC TABULATE are able to present results in highly desirable formats. Second, the use of the SAS system in the pharmaceutical industry is a standard and the algorithms used for statistical computations are considered valid. This is an important consideration for a regulated industry.

The future direction of biomedical information systems in the pharmaceutical industry, based on the SAS system, is the continued use of SAS but integrated with a database. This will present a considerable challenge to the programming staffs charged with the task of developing such hybrid systems. As the SAS system becomes more powerful and undergoes future version changes, the requirement that SAS mastery proceed congruently with DBMS mastery is a staggering one. The idea of bilingualism will not be easy to sell to already burdened data processing departments. Nevertheless, the potential payoffs for such hybrid systems should be an incentive to proceed with their development.

The author can be contacted at:
Syntex Research
Mail Stop A4-100
3401 Hillview Ave.
Palo Alto, CA 94303

SAS and SAS/AF are registered trademarks of SAS Institute, Cary, NC, USA. VMS is a trademark of Digital Equipment Corporation.
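
The role-sharing workflow outlined under IMAGINING A HYBRID SYSTEM can be illustrated with a small sketch. It is written in Python rather than SAS, and it stands in for, rather than reproduces, anything in the Syntex system: the table names, column names, sample patients, and the `LAYOUT` descriptor are all hypothetical. An in-memory SQLite database plays the part of the DBMS and answers the query "what concomitant medications did patients with headaches take?" with a join; the selection is then dumped as fixed-width text together with a descriptor (type, column position, width); finally a reader rebuilds the records from the descriptor alone, which is roughly the job the paper assigns to the first DATA step.

```python
import sqlite3

# Hypothetical sample data; table and column names are illustrative only.
SIDE_EFFECTS = [(101, "HEADACHE"), (102, "NAUSEA"), (103, "HEADACHE")]
CON_MEDS = [(101, "ASPIRIN"), (102, "ANTACID"), (103, "IBUPROFEN"), (103, "ASPIRIN")]

# Descriptor for the flat file: (name, type, start column, width).
LAYOUT = [("patient", "num", 0, 6), ("medication", "char", 6, 12)]


def select_analysis_data():
    """Step 1: let the database do the ad hoc selection and the join."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE side_effects (patient INTEGER, event TEXT)")
    con.execute("CREATE TABLE con_meds (patient INTEGER, medication TEXT)")
    con.executemany("INSERT INTO side_effects VALUES (?, ?)", SIDE_EFFECTS)
    con.executemany("INSERT INTO con_meds VALUES (?, ?)", CON_MEDS)
    rows = con.execute(
        "SELECT m.patient, m.medication "
        "FROM side_effects s JOIN con_meds m ON m.patient = s.patient "
        "WHERE s.event = ? ORDER BY m.patient, m.medication",
        ("HEADACHE",),
    ).fetchall()
    con.close()
    return rows


def dump_flat(rows):
    """Step 2: write the selection as fixed-width text lines."""
    lines = []
    for row in rows:
        fields = []
        for value, (_name, kind, _start, width) in zip(row, LAYOUT):
            text = str(value)
            # Numbers are right-justified, character data left-justified.
            fields.append(text.rjust(width) if kind == "num" else text.ljust(width))
        lines.append("".join(fields))
    return lines


def read_flat(lines):
    """Step 3: rebuild records from the flat lines using only the descriptor."""
    records = []
    for line in lines:
        record = {}
        for name, kind, start, width in LAYOUT:
            raw = line[start:start + width].strip()
            record[name] = int(raw) if kind == "num" else raw
        records.append(record)
    return records


flat_lines = dump_flat(select_analysis_data())
records = read_flat(flat_lines)
for record in records:
    print(record["patient"], record["medication"])
```

The sketch keeps the flat lines in memory and assumes every value fits its declared width; a real system would write both the data and the descriptor to files and would need to handle overflow, missing values, and dates.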