Monogr. clin. Cytol., vol. 9, pp. 62-100 (Karger, Basel 1984)

Computed Cell Image Information

Marluce Bibbo, Peter H. Bartels, Harvey E. Dytch, George L. Wied

Section of Cytology, Department of Obstetrics and Gynecology and Pathology, University of Chicago, Chicago, Ill., and Optical Sciences Center and Department of Pathology, University of Arizona, Tucson, Ariz., USA

The quantitative analytical assessment of clinical cytopathology materials had its beginnings in the pioneering work of Caspersson [21] and his co-workers. Almost 50 years ago, Caspersson laid the foundations for quantitative microphotometric methodology and defined the conditions under which exact determinations of cellular constituents, such as DNA and RNA, could be obtained. One of the most consequential results of Caspersson's research arose from the nagging problem of photometric measurement errors due to the inhomogeneity, or granularity, of the absorbing material in the cell. Known as 'distributional errors' [56, 57], these led to erroneous estimates of the total amount of DNA in cell nuclei. Efforts to control these photometric errors led to the development of scanning microscopy [22]: by recording the optical density for a measuring spot dimensioned close to the diffraction limit of the microscope, the effect of object inhomogeneity is greatly reduced. Since optical density relates directly to the amount of absorbing material, summing up all of the raster scan measurements provided much more reliable estimates of cellular DNA content.

Cellular DNA content had, by the 1950s, become of vital diagnostic interest with the discovery that tumor cells exhibit increased amounts of DNA [3, 48, 61]. The somewhat cumbersome determination of DNA content based on UV light absorbance was replaced by visible light microphotometry employing stoichiometric cytochemical staining, the Feulgen reaction [29].
Scanning light microscopes, though, were simply not available commercially until the late 1960s, and even then they were prohibitively expensive. Clinical diagnostic use of the information offered by the DNA distribution in samples of cells collected from a lesion therefore remained restricted to studies where cell samples of very modest size were measured manually, by efforts that often required many hours and even days of work per sample.

In the meantime, though, advances in image sensing and recording techniques had made it practical to abandon the 'analog' sensing and summing of the sequential spot measurements in favor of digital recording. For each spot measurement, the amount of light absorbed is converted to a number. The image is represented as an array of numbers, that is, as a digital image. The recording of scanned cell images in digital form did not, per se, substantially improve the accuracy or precision of the photometric DNA determinations, nor did it add much to the information recorded for a given cell. After all, at the end of the scan, all of the values were simply added up to render the 'total optical density' as a measure of DNA content.

It was at this juncture, though, that a fundamentally new approach to microphotometric assessment of cell images was perceived [71]. The digitized cell images accurately reflected not only the total amount of absorbing substance but also its topographic distribution, its granularity, its central or peripheral tendency, and its distribution among larger or smaller chromatin granules and among condensed and non-condensed chromatin. Summing all the optical density spot measurements appeared like taking a printed page and trying to extract the information from its text by dissolving the printer's ink and measuring its total amount. The digitized cell images offered an abundance of potentially highly specific and sensitive diagnostic information, information that processing by computer could reveal.
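The relation between spot measurements and the 'total optical density' can be illustrated with a minimal sketch. It assumes the usual definition of optical density as the base-10 logarithm of incident over transmitted intensity; the 3 × 3 scan, the clear-field intensity of 100, and all function names are hypothetical and serve only as an illustration.

```python
import math

def optical_density(transmitted, incident):
    """Optical density of one measuring spot: log10(incident / transmitted)."""
    return math.log10(incident / transmitted)

def integrated_optical_density(image, incident):
    """Sum the spot optical densities over the whole raster scan."""
    return sum(optical_density(value, incident) for row in image for value in row)

def single_spot_estimate(image, incident):
    """Treat the whole field as one large measuring spot: average the
    transmitted light first, convert to optical density, scale by the area."""
    pixels = [value for row in image for value in row]
    mean_transmitted = sum(pixels) / len(pixels)
    return len(pixels) * optical_density(mean_transmitted, incident)

# Hypothetical scan of transmitted intensities over an inhomogeneous object;
# the clear-field (incident) intensity is taken to be 100.
scan = [
    [100.0, 50.0, 100.0],
    [50.0, 10.0, 50.0],
    [100.0, 50.0, 100.0],
]

iod = integrated_optical_density(scan, incident=100.0)
coarse = single_spot_estimate(scan, incident=100.0)
print(f"summed spot densities: {iod:.3f}")
print(f"single large spot:     {coarse:.3f}")
```

Because the logarithm is applied after averaging in the single-spot estimate, an inhomogeneous object yields a lower value than the sum of the spot densities. This is one way to picture the distributional error that scanning with a small measuring spot was designed to reduce.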
For the past 15 years, computer assessment of digitized cell imagery has developed into a methodology of great refinement and substantial potential, as discussed elsewhere in this volume. It is the purpose of this article to survey its present and future impact in cytopathology and to illustrate this by means of selected examples.

From the point of view of a practicing clinical cytopathologist, one would have to say that so far only one application of all of this powerful methodology has reached the stage of practical clinical use: the recording of DNA ploidy patterns. All of the other applications, as promising and even exciting as they are, must still be considered to be in the research and development phase. We begin with a discussion of computer-aided DNA cytophotometry.

DNA Cytophotometry

The diagnostic and prognostic evaluation of a number of lesions is greatly assisted by an assessment of the DNA ploidy pattern of their cell nuclei. The presence of a strictly euploid or polyploid DNA distribution generally presents a more favorable clinical outlook than the observation of an aneuploid distribution, which points to the presence of premalignancy or malignancy and therefore has serious implications. DNA microphotometry has been used for this differentiation for many years [4, 5, 48, 61, 62, 67] and has been found particularly helpful in situations where visual/morphologic assessment is inconclusive. There has recently been a greatly renewed interest in this diagnostic method, especially in the assessment of the aneuploidy status of cervical lesions, with efforts to identify precursor lesions for carcinoma in situ and invasive uterine cancer [19, 31-33, 35, 37, 58-60].
Related areas of interest are the differentiation between condylomata and aneuploid lesions of the cervix [47, 63]; the examination of the ploidy status of cervical cells exhibiting post-radiation dysplasia, to differentiate these changes from recurrent carcinoma of the uterine cervix [49, 53]; and the differentiation of immature squamous metaplasia from more significant lesions in diethylstilbestrol (DES)-exposed patients [33]. It is generally agreed that ploidy assessments provide diagnostic and prognostic information which has immediate impact on the clinical management of these patients, especially young patients who consider pregnancy desirable.

However, while the clinical value of DNA ploidy assessments is undisputed, the practical recording of DNA histograms in clinical laboratories is not a routine procedure, and where it is feasible it is still not inexpensive. Existing instrumentation requires the recording of the DNA content nucleus by nucleus, mostly by manual procedures, and the use of either somewhat inaccurate microphotometric methods (such as the 'plug method' or the 'two-wavelength technique' [70]) to control distributional photometric error, or optical scanning microscopes operated on-line to a laboratory computer. The manual techniques require several hours per sample, are labor intensive, and are therefore often restricted in the number of nuclei included in the assessment [32]. This frequently means basing a diagnostic assessment on a sample size of only marginal statistical validity. It leads to low sensitivity for the detection of aneuploidy and implies the possibility of error due to inadequate sampling. On the other hand, scanning microscope/computer systems tend to be quite costly to install and to maintain, and are really much more powerful than is required for the practical goals at hand here.
Technical advances have made it possible to record the DNA contents of up to 300 nuclei automatically within a very reasonable time period [73] and to construct a DNA histogram. Ploidy assessment has thus become practical for clinical samples where only a limited number of cells are available, as well as for tissue sections. This rapid DNA ploidy assessment is based on instrumentation installed for research on the TICAS project; its components are a three-color video camera attached to a Zeiss research microscope, a computer graphics display, and a PDP 11/45 computer with an FPS AP-120B array processor and large disk storage.

The operator selects a microscopic field, and the data are digitized at video rates. The computer-graphic display shows which cell nuclei are accepted for DNA photometry. The operator repeats this procedure with the next field, until a sufficient sampling of nuclei has been recorded. Figure 1 shows a typical field of Feulgen-stained nuclei in normal tissue. Figure 2 shows the same field after a nuclear boundary tracking algorithm has delineated the boundaries of acceptable nuclei and their integrated optical densities have been measured. Touching nuclei are automatically broken apart, and only those nuclei that are not cut by the scan field border, overlapping, or otherwise unacceptable by reasons of shape, etc., are numbered.

Fig. 1. Typical field of Feulgen-stained nuclei in normal tissue.

Fig. 2. Same field as figure 1 after a nuclear boundary tracking algorithm has delineated the boundaries of acceptable nuclei and their integrated optical densities have been measured.

Our research has shown that it is not necessary to restrict the recording of ploidy patterns to Feulgen-stained material. Papanicolaou-stained clinical samples, processed under carefully controlled conditions, allow a reliable recording of such histograms, with nuclei from normal intermediate cells on the same slide providing a built-in standard for the diploid histogram peak.
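The field-by-field procedure described above (delineate nuclei, reject those cut by the field border, integrate the optical density of each accepted nucleus) can be sketched in simplified form. The sketch below is not the boundary tracking algorithm of the TICAS instrumentation: it substitutes plain thresholding with 4-connected component labeling, does not split touching nuclei, and applies no shape criteria. The optical density field, the threshold, and all function names are hypothetical.

```python
from collections import deque

def label_nuclei(od, threshold):
    """Group pixels with optical density above `threshold` into 4-connected
    regions and keep only the regions that do not touch the field border."""
    rows, cols = len(od), len(od[0])
    seen = [[False] * cols for _ in range(rows)]
    nuclei = []
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0] or od[r0][c0] <= threshold:
                continue
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            pixels, touches_border = [], False
            while queue:
                r, c = queue.popleft()
                pixels.append((r, c))
                if r in (0, rows - 1) or c in (0, cols - 1):
                    touches_border = True
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and not seen[nr][nc] and od[nr][nc] > threshold:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if not touches_border:
                nuclei.append(pixels)
    return nuclei

def integrated_od(od, pixels):
    """Integrated optical density of one nucleus."""
    return sum(od[r][c] for r, c in pixels)

# Hypothetical optical density field: one complete nucleus and one nucleus
# cut by the right-hand border of the scan field.
field = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.6, 0.0, 0.0, 0.0],
    [0.0, 0.4, 0.5, 0.0, 0.0, 0.7],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.8],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]

accepted = label_nuclei(field, threshold=0.2)
iods = [integrated_od(field, pixels) for pixels in accepted]
print(len(accepted), [round(v, 3) for v in iods])
```

The threshold plays the role of the interactively chosen boundary threshold; in practice its value would have to be chosen per sample field.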
A sample field for a Papanicolaou smear is shown in figure 3. The digitized image is stored on the disk of the computer. The operator may interactively examine various thresholds for the tracing of the nuclear boundary and choose an appropriate threshold for each sample field. The DNA histogram may be displayed, together with fiducial marks for the diploid DNA levels. This is shown in figures 4-7. Figure 4 shows the DNA ploidy histogram for a benign tissue. Figure 5 shows the DNA ploidy histogram for a case of carcinoma in situ. Figure 6 shows a histogram of total Papanicolaou nuclear extinction for normal intermediate cells. Figure 7 shows a histogram of total Papanicolaou nuclear extinction for invasive carcinoma cells. The vertical scale for all four figures is the percentage of total nuclei, while the horizontal scale is the logarithm of total nuclear extinction (arbitrary units).

Fig. 4. DNA ploidy histogram for benign tissue.

Fig. 5. DNA ploidy histogram for carcinoma in situ.

Fig. 6. Histogram of total Papanicolaou extinction for normal intermediate cells.

Fig. 7. Histogram of total Papanicolaou extinction for invasive carcinoma cells.

The difficulty with mere visual inspection is that differences in ploidy patterns can be appreciated only when they are profound; in fact, in their retrospective study of cervical dysplasia relating DNA analysis to neoplastic progression or regression, Nasiell et al. [51] were unable to demonstrate significant differences between dysplastic cases regressing and progressing towards neoplasia. The number of cells measured per case in this study varied from 14 to 56. The conclusions drawn from this study only confirm the problems mentioned above: existing instrumental limitations often lead to the examination of samples of inadequate size.
While these authors, to their credit, do not conclude that valid prognostic information is offered by their data, one cannot agree that their data support the conclusion that a prognostic potential of DNA ploidy patterns can be ruled out. Calculation of the power of a custom-made test statistic for the detection of aneuploidy, as done by Weber et al. [68], clearly demonstrates that sample sizes in excess of 100 cells are mandatory to make statements concerning diagnosis or prognosis, unless of course one is faced with a massively aneuploid profile.

Heuristic procedures for the numerical assessment of ploidy patterns have been proposed by Boecking et al. [17] to arrive at a consistent method for the classification of a ploidy pattern as aneuploid. In these procedures two indices are computed for each ploidy pattern: the '5N exceeding rate (5N-ER)' and the '2N-deviation index (2N-DI)'. The 5N exceeding rate is defined as the proportion of aneuploid nuclei with DNA contents exceeding 5N; any nucleus which is more than ±25% removed from 2N, 4N, 8N, etc., is considered to be aneuploid. The 2N-DI statistic is defined as the sum of the squared deviations between the ploidy of the measured nuclei and the average ploidy value of a standard population of normal cells, divided by the number of measured nuclei.

Boecking determined empirical thresholds for a less favorable prognosis from a study of 258 cases of histologically confirmed malignant tumors and 74 benign lesions, with correct assessment resulting in 263 cases, 58 cases being called suspicious, and 0 cases falsely called negative. There were no false positives. An example of the type of data on which their decision is based is shown in figure 8. Empirical calibration like this is an acceptable procedure, but it has shortcomings. The procedure is sensitive to 'outlying' observations, and one should also be able to detect the very early onset of aneuploidy.

Fig. 8. a, b.
Examples of data which determine empirical limits for the assessment of ploidy patterns (see text). Reproduced with permission from Boecking et al. [17].

For a formal hypothesis test for the presence of aneuploidy, at known chances for an error of the first kind (i.e., known sensitivity to detect aneuploidy) and known chances for an error of the second kind (i.e., specificity or power of the test), a custom-made test statistic is being developed in cooperation with the statisticians Dr. J. Weber and B. Baldessari [68]. This effort is in progress.

The diagnostic differentiation which one is trying to establish is actually not merely between euploid and aneuploid. Rather, there are 5 ploidy patterns of interest. These are:

1. Euploid with diploid peak, and a very small proportion of cells in S and 4N mode, as in normal tissue.
2. Polyploid, with modes at 2N, 4N and 8N in varying proportions and with different coefficients of variation.
3. Aneuploid: anything not within the tolerance limits of conditions 1 and 2.
4. Combinations of polyploid and aneuploid.
5. Diploid, but with stemline increased in total OD value.

We have found the ploidy pattern very useful in other applications, for instance in the matching of the DNA ploidy pattern between the cytologic sample and the histologic biopsy sample [4]. This is used to establish whether the samples are representative and collected from the appropriate site and whether there is diagnostic agreement. An example in which the measurements on cytologic and histologic samples show a similar ploidy pattern is shown in figures 9-11.

Fig. 9-11. This is a case of a 23-year-old gravida 3, para 3 with a cytologic report of evidence of carcinoma in situ (fig. 9). The histologic sections showed carcinoma in situ (fig. 10) and the ploidy (fig. 11) indicated that the histologic material is representative of the previously detected cytologic atypia.
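Returning to the two heuristic indices attributed to Boecking et al. earlier in this section, the verbal definitions of the 5N exceeding rate and the 2N-deviation index can be written out as a short sketch. Several points are assumptions rather than statements from the text: ploidy values are taken to be already scaled so that normal cells average 2.0, the 25% tolerance is taken relative to each euploid level (2N, 4N, 8N, ...), and the sample values and function names are hypothetical.

```python
def is_aneuploid(ploidy, tolerance=0.25, max_level=64.0):
    """True if the value lies outside the tolerance band of every euploid
    level 2N, 4N, 8N, ... (assumption: the band is relative to the level)."""
    level = 2.0
    while level <= max_level:
        if abs(ploidy - level) <= tolerance * level:
            return False
        level *= 2.0
    return True

def five_n_exceeding_rate(ploidies, tolerance=0.25):
    """Proportion of measured nuclei that are aneuploid and exceed 5N."""
    hits = sum(1 for p in ploidies if p > 5.0 and is_aneuploid(p, tolerance))
    return hits / len(ploidies)

def two_n_deviation_index(ploidies, reference_mean=2.0):
    """Mean squared deviation of the measured ploidies from the average
    ploidy of a standard population of normal cells."""
    return sum((p - reference_mean) ** 2 for p in ploidies) / len(ploidies)

# Hypothetical ploidy values (in units of N) for six nuclei.
sample = [2.0, 2.1, 4.0, 5.5, 11.0, 2.7]

er = five_n_exceeding_rate(sample)
di = two_n_deviation_index(sample)
print(f"5N-ER = {er:.3f}, 2N-DI = {di:.3f}")
```

The 2N-DI grows with the square of the deviation, so a single nucleus far from 2N dominates it; this is consistent with the remark that such empirical indices are sensitive to outlying observations.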
Figures 12-14 show a case where the biopsy material was not representative of the lesion, and figures 15-17 show a case of overcall.

Fig. 12-14. This is a case of a 21-year-old gravida 1, para 1 whose sample (fig. 12) showed evidence of mild dysplasia. The resulting histologic pattern (fig. 13) was interpreted as squamous metaplasia. The ploidy pattern (fig. 14) on the histologic sample indicated that the tissue material is not representative of the cytologic findings, and a subsequently performed second biopsy revealed the presence of mild dysplasia (not shown in this absorption pattern).

Fig. 15-17. The absorption measurements on the cytology (fig. 15) and histologic material (fig. 16) of this 31-year-old gravida 4, para 3 showed predominantly polyploid patterns (fig. 17), although the histologic diagnosis was apparently overcalled as 'moderate dysplasia'. The presence of koilocytotic atypia is consistent with condylomatous changes.

It is possible that a new classification of cervical intraepithelial neoplasia may emerge from ploidy patterns. There is a substantial body of literature pointing to the clinical value of the information offered by ploidy patterns.

For the correspondence between DNA ploidy patterns and precursor lesions for cervical carcinoma, there are the fundamental studies by Fu et al. [32]. In a retrospective study of 100 cases of cervical intraepithelial abnormalities, the nuclear DNA content was correlated with the histologic findings and follow-up data. All cases had initial biopsies and were followed for more than a year with cytologic and/or histologic examinations. Of the 34 cases having a subsequent normal follow-up, 29 (85%) had a euploid or polyploid pattern and 5 (15%) had an aneuploid distribution. Of the 58 cases persisting as cervical intraepithelial neoplasia (CIN), 3 (5%) had a polyploid pattern and 55 (95%) had an aneuploid distribution. Of the 8 cases which progressed to invasive carcinoma, all had an aneuploid pattern.
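The counts just quoted are broken down by outcome. They can be re-tabulated by ploidy pattern instead, which answers the clinically relevant question of how lesions with a given pattern fared. The short sketch below does only this arithmetic; the counts are those given in the text, while the dictionary layout and function name are illustrative.

```python
# Counts from the study of 100 cases: outcome -> ploidy pattern -> cases.
follow_up = {
    "normal follow-up": {"euploid/polyploid": 29, "aneuploid": 5},
    "persisted as CIN": {"euploid/polyploid": 3, "aneuploid": 55},
    "progressed to invasive carcinoma": {"euploid/polyploid": 0, "aneuploid": 8},
}

def outcome_percentages(table, pattern):
    """Percentage of lesions with the given ploidy pattern in each outcome,
    rounded to the nearest whole percent."""
    total = sum(row[pattern] for row in table.values())
    return {
        outcome: round(100.0 * row[pattern] / total)
        for outcome, row in table.items()
    }

n_cases = sum(sum(row.values()) for row in follow_up.values())
benign_pattern = outcome_percentages(follow_up, "euploid/polyploid")
aneuploid_pattern = outcome_percentages(follow_up, "aneuploid")
print(n_cases)
print(benign_pattern)
print(aneuploid_pattern)
```

With 32 euploid or polyploid lesions and 68 aneuploid lesions in total, the first group splits 91% / 9% / 0% and the second 7% / 81% / 12% across normal follow-up, persistence as CIN, and progression to invasive carcinoma.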
These findings suggest that euploid or polyploid lesions are more likely to have normal follow-up studies (91%) and rarely persist (9%). Of the aneuploid lesions, 81% persisted as CIN, 12% progressed to invasive carcinoma, and 7% had a normal follow-up.

For the value of DNA ploidy measurements in cases of dysplasia observed in DES-exposed females, there again are the important studies by Fu et al. [33, 34]. The diagnostic criteria of dysplasia of the cervix and vagina in DES-exposed females are controversial. Studies using Feulgen microspectrophotometry have suggested that measurement of nuclear DNA content can help distinguish between those lesions that are potentially preneoplastic and those that may appear by light microscopy to be dysplastic but are probably benign. By means of DNA studies, it has been demonstrated that many lesions classified as intraepithelial neoplasia are examples of immature squamous metaplasia. Mature and immature squamous metaplasias are euploid, having a normal diploid DNA value. Mild dysplasias are usually polyploid, having multiples of normal diploid DNA content. Moderate and severe dysplasias are usually aneuploid. Follow-up studies by cell samples and biopsies showed that dysplasias having a polyploid or euploid DNA content only rarely persist as such or progress to a more severe form, in contrast to aneuploid dysplasias, in which persistence or recurrence following biopsy or treatment appears to be more common.

Zetterberg and Esposti [76, 77] evaluated the prognostic significance of ploidy patterns in prostatic carcinoma. Material obtained by transrectal fine needle aspiration from prostatic lesions is suitable for cytophotometric DNA analysis. Studies conducted by Zetterberg and Esposti [76] have shown that nuclei from benign lesions (prostatic hyperplasia) exhibit a normal diploid amount of DNA; cell populations from prostatic malignancies are characterized by various degrees of heteroploidy.
An overall correlation between cytological degree of differentiation and nuclear DNA characteristics was found. The majority of the tumor cell nuclei in most cases of well differentiated prostatic carcinoma had a diploid DNA content, whereas the majority of the tumor cell nuclei in most cases of poorly differentiated prostatic carcinoma had considerably increased hyperploid DNA quantities. This suggested that the malignant properties of the tumor cells could be reflected by the nuclear DNA characteristics. Of particular interest was the finding that the group of moderately differentiated prostatic carcinomas, heterogeneous with respect to clinical malignancy, had DNA characteristics similar either to those of well differentiated prostatic carcinomas or to those of poorly differentiated ones.

In another study by Zetterberg and Esposti [77], May-Grünwald-Giemsa-stained smears (up to 15 years old) from 43 patients diagnosed as having prostatic carcinomas were processed and submitted to cytophotometric analysis. The 43 patients were selected on the basis of the clinical response to hormone therapy. Patient group I (21 patients) showed a good response, with survival without clinical evidence of cancer for at least 5 years. Patient group II (22 patients) showed a poor response, with death from cancer within 3 years. Most of the carcinomas belonging to patient group I were characterized by a diploid or a combined diploid-tetraploid DNA distribution pattern, while in carcinomas from patient group II most of the cancer cells generally contained abnormally increased DNA amounts. In this second group, some of the tumors exhibited clearly aneuploid modal distribution patterns, with DNA values ranging from the hypotriploid level to the hypertetraploid level, while in other tumors a large intercellular variability in DNA content was the characteristic feature.

Blöndal and Bengtsson [16] examined the ploidy patterns in squamous cell carcinoma of the lung.
Cytophotometric analysis of nuclear DNA in 30 squamous cell carcinomas showed in most tumors a peak value near 2c (diploid) or 3c (aneuploid) and a wide distribution of other nuclear DNA values. A follow-up study showed that the prognosis was best for patients with diploid tumor cells and worst for patients with aneuploid tumor cells. The DNA histogram patterns give information about the dominating DNA ploidy of the tumor cells; in some tumors, information about the tumor proliferation activity is also provided. These data are useful in prognostic evaluation and may prove useful as adjuncts in the diagnosis and in selection of treatment.

DNA ploidy assessment provides significant information for the management of patients with lung and prostate lesions. There are applications involving tumors from almost any organ site, e.g., mucinous tumors of the ovary [69], malignant teratoma of the testicle [43], and fibrocystic disease and cancer of the breast [18, 30, 38, 45, 66]. There are clinical tests now for pre-operative differentiation of thyroid aspirates to ascertain whether one is dealing with a follicular adenoma or a carcinoma [39, 40, 65]. There are numerous applications in the assessment of lymphoid cell populations [20], of astroglial tumors [25] and of bronchial cell populations [52].

Laboratory tests have been developed for a prognostic differentiation between chondromas, which are diploid, and chondrosarcomas, which may be diploid to hyperploid, with the diploid instances having a more favorable outcome [24]. Such ploidy determinations in chondrosarcomas are considered to provide better prognostic information than even conventional histopathologic grading.

Other areas in which DNA ploidy patterns are valuable are the differentiation of bladder papillomas versus carcinoma in situ of the bladder, follow-up examinations of conservatively treated low stage bladder tumors, and detection of carcinoma in situ of the bladder [23, 26, 27, 41, 42].
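The DNA histograms referred to throughout this section share a simple construction: each nucleus contributes one total extinction value, the diploid level is fixed by reference cells, and the percentage of nuclei is plotted against a logarithmic axis. A minimal sketch follows. The choice of the median of the reference cells as the 2N level, the base-2 logarithmic bins, and all names and values are assumptions made for illustration, not details taken from the instrumentation described here.

```python
import math
from statistics import median

def to_ploidy(sample_values, reference_values):
    """Rescale total extinction so that the reference median equals 2.0 (2N)."""
    diploid_level = median(reference_values)
    return [2.0 * value / diploid_level for value in sample_values]

def log_histogram(ploidies, bins_per_doubling=4, lowest=1.0):
    """Percentage of nuclei per bin on a logarithmic (base 2) ploidy axis.
    Bin k covers lowest * 2**(k / bins_per_doubling) up to the next edge."""
    counts = {}
    for value in ploidies:
        index = math.floor(math.log2(value / lowest) * bins_per_doubling)
        counts[index] = counts.get(index, 0) + 1
    total = len(ploidies)
    return {index: 100.0 * n / total for index, n in sorted(counts.items())}

# Hypothetical total extinction values (arbitrary units).
reference_cells = [98.0, 100.0, 102.0]       # normal cells on the same slide
measured_cells = [100.0, 100.0, 200.0, 150.0]

ploidies = to_ploidy(measured_cells, reference_cells)
histogram = log_histogram(ploidies)
for index, percent in histogram.items():
    lower_edge = 2.0 ** (index / 4)
    print(f"from {lower_edge:5.2f}N: {percent:5.1f}% of nuclei")
```

On a logarithmic axis the 2N, 4N and 8N levels are equally spaced, so with four bins per doubling they fall on bin indices 4, 8 and 12.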
Computer Cytometry and Computer Graphics

Even though DNA cytometry requires very careful attention to sample preparation and instrument calibration, as an information extraction process it is a gross data reduction of the detailed information present in the digitized image into a single global value, the 'total optical density'. Computer assessment of digitized cell images is capable of providing detailed morphometric characterization of cells. Features relating to size, shape, N/C ratio, distribution pattern of the nuclear chromatin, staining properties and spectral contrasts can be extracted [13]. These features may then be used to perform computer classification and diagnostic assessment for each cell, for an entire cell sample from a given patient of a given disease category, and for the patient profiles [8] from different diagnostic categories. The whole spectrum of analytic procedures, such as multivariate statistical analysis of variance, discriminant analysis, and time series analysis, can be brought to bear on the quantification of clinical cytodiagnostic decision making. Much of this has been described in detail before [6, 9, 10, 13, 64, 75].

When asked, though, which methods now under development will most likely have the greatest impact in the clinical laboratory, the answer would unequivocally be 'computer-aided cytometry and computer graphics'. This will be demonstrated with some practical examples.

Computer-Aided Cytometry

Cytologists use a number of broad cytodiagnostic criteria to assess malignancy, such as the N/C ratio, nuclear granularity, and nuclear hyperchromasia. Human visual assessment of cell images involves a learned instantaneous response; it is not based on a laborious estimate of a value. Like all human visual assessment, it is limited in its ability to recognize and distinguish between small differences.
Computer graphics can provide valuable help here by displaying next to the cell image the numerical value of selected cytodiagnostic features. This is shown by way of example in figure 18, where the N/C ratio is displayed for normal and abnormal cells.

Fig. 18. Display of Papanicolaou smear with computer-generated values of N/C ratio superimposed on selected cells.

Computer graphics allow one to 'position' cells based on the values of their cytodiagnostic features, relative to other cells. For example, the cell shown in figure 20 is identified in the scatterplot of figure 19 by the large square. The scatterplot shows this to be a highly hyperchromatic cell of high nuclear total density and with a large N/C ratio, a cell well removed from the normal cells displayed in the lower left of figure 19.

Fig. 19. Scatterplot of nuclear extinction versus N/C ratio for normal and abnormal cells.

Fig. 20. Abnormal cell with high total nuclear density and large N/C ratio, indicated by large square in figure 19.

Computer graphics allow an ordering of cell images according to the values of selected cytodiagnostic features: figure 21 shows a sequence of cell images ordered according to N/C ratio. Since computers allow the storage of large files of cell images in full color, a cytologist could interactively compare diagnostic material to reference images retrieved from file and thus have a valuable training aid.

Fig. 21. Sequence of cell images ordered according to N/C ratio.

Subvisual, Computed Diagnostic Information

Among the most intriguing aspects of computer evaluation of cell images is the extraction of diagnostic information that is not visually perceived [11, 12]. Such information falls into three categories. First, there exist differences in diagnostic features which usually would clearly be noticed by a human observer, for example, differences in nuclear size.
However, even though such differences may be consistent, they may be so small that only precise measurement and statistical evaluation can substantiate them. Second, there are differences which are small and gradual but are expressed in every cell of an entire sample. Since a cytologist would not usually look at and compare two samples simultaneously, they may remain unnoticed for lack of a suitable reference within the sample. However, when an ordered sequence of cell images is formed by the computer, a trend becomes immediately apparent to the trained observer.

One of the most captivating examples of the use of such subvisual diagnostic clues is provided by the marker features for the presence of dysplastic and malignant disease as they are expressed in normal-appearing intermediate cells from the ectocervix [15, 72, 73]. Computer assessment has discriminated remarkably well between normal intermediate cells from the ectocervix of patients with normal cytology and normal-appearing intermediate cells from patients with malignant disease. Figure 22 indicates the chromatin changes in intermediate cells from patients with normal cytology, carcinoma in situ, and invasive cancer, respectively. The extent of change is about the same for both categories of patients with abnormal cytology; however, the direction of change is different and specific. The figure shows a plot of two discriminant functions for cell data from patients with normal cytology (code 1), patients with severe dysplasia/carcinoma in situ (code 2), and patients with invasive cancer (code 3). Figure 23 shows the corresponding confidence regions, tolerance ellipses, and Bayesian boundaries for the discrimination.

Fig. 22. Plot of two discriminant functions for cell data from patients with normal cytology (code 1), patients with severe dysplasia/carcinoma in situ (code 2), and cell data from patients with invasive carcinoma (code 3).

Fig. 23.
Confidence regions, tolerance ellipses, and Bayesian boundaries corresponding to figure 22.

The expression of marker features in intermediate cells from the ectocervix presents an excellent example of image information that had escaped the attention of cytologists, but that can be measured with consistency to provide clues as to the diagnostic condition of the patient. However, when cells are arranged in the order of a discriminant function score, experienced observers can see the trend in staining properties. This is demonstrated in figure 24. A discriminant function was computed that separated intermediate cells from normal patients and intermediate cells from patients with malignant disease. Several cell images were selected and arranged in the order of their scores. In this instance, positive values for the score indicate cells with strong clues to the presence of malignancy in the patient; negative values indicate cells from patients with normal cytology. This is also an excellent example of the creative use of computer graphics in quantitative and diagnostic cytology.

Fig. 24. Intermediate cells arranged in the order of a discriminant function score. The first three are from patients with carcinoma in situ, the second three are from normal patients.

The discovery of the marker features has two important implications. First, there must be a 'field effect' causing the subtle changes in the normal intermediate cells in patients who have malignant cervical disease. Second, if this finding is consistent among a large sample of patients, it will have major impact on the strategy for cervical cancer prescreening. No longer would there be a need to search for possibly rare tumor cells among large samples of 50,000 to 200,000 cells in a clinical preparation, nor would there be a need for substantial computer power to do the image processing.
Instead, a modest sample of readily detected intermediate cells could be examined, and a powerful statistical test could be applied to see whether this patient is likely to require further attention.

It is essential in studies of this kind to establish the statistical significance of differences between categories over differences between individual patients. For this, nested designs in analysis of variance are suitable [7]. For the blue intermediate cells from patients with normal cytology (INMT NRM), with moderate dysplasia (INMT MOD), and with severe dysplasia/carcinoma in situ (INMT CSD), a 3 × 10 × 30 two-level nested design was used, with three diagnostic categories, 10 patients each, and 30 cells measured on each slide. This provides 2 degrees of freedom for the mean square of the diagnostic categories, 3 × 9 = 27 degrees of freedom for the patient-to-patient mean square, and 3 × 10 × 29 = 870 degrees of freedom for the cell-to-cell variability. Table 1 gives the results for the marker feature 'red/green' contrast of nuclear chromatin (feature 106). The mean values for the three categories were NRM 0.0822, MOD 0.125, and CSD 0.112.

Table 1. Statistical significance for the marker feature red/green contrast of nuclear chromatin

Of the total mean square, 90.5% is attributable to the differences between categories, 9.25% to differences between patients, and only 0.25% is due to differences between cells. If one relates the cell-to-cell mean square to the measured value for feature 106 over all observations, a coefficient of variation of 15.9% is obtained. The patient-to-patient mean square reflects changes that have a coefficient of variation of 95%, or roughly a factor of two. The diagnostic category variation by comparison changes by a factor of three.

The numerical representation of the cell imagery allows us to find effective ways of conveying the meaning of nonvisual, computer diagnostic clues to a clinician by the use of computer graphics.
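The bookkeeping of the two-level nested design described above (categories, patients within categories, cells within patients) can be sketched as follows. This is a generic balanced nested analysis of variance, not the authors' program; the tiny data set is invented for illustration, and the function names are hypothetical. Only the 3 × 10 × 30 layout and the resulting degrees of freedom are taken from the text.

```python
def nested_anova(data):
    """Balanced two-level nested ANOVA.
    data[i][j] is the list of cell measurements for patient j of category i.
    Returns (degrees of freedom, mean squares) for each source of variation."""
    a, b, n = len(data), len(data[0]), len(data[0][0])
    patient_means = [[sum(cells) / n for cells in category] for category in data]
    category_means = [sum(means) / b for means in patient_means]
    grand_mean = sum(category_means) / a

    ss = {
        "categories": b * n * sum((m - grand_mean) ** 2 for m in category_means),
        "patients": n * sum(
            (pm - cm) ** 2
            for means, cm in zip(patient_means, category_means)
            for pm in means
        ),
        "cells": sum(
            (x - pm) ** 2
            for category, means in zip(data, patient_means)
            for cells, pm in zip(category, means)
            for x in cells
        ),
    }
    df = {"categories": a - 1, "patients": a * (b - 1), "cells": a * b * (n - 1)}
    ms = {source: ss[source] / df[source] for source in df}
    return df, ms

def mean_square_shares(ms):
    """Each mean square as a percentage of the sum of the three mean squares."""
    total = sum(ms.values())
    return {source: 100.0 * value / total for source, value in ms.items()}

# Degrees of freedom for the 3 x 10 x 30 layout (placeholder values only).
layout = [[[0.0] * 30 for _ in range(10)] for _ in range(3)]
design_df, _ = nested_anova(layout)
print(design_df)

# Invented 2 x 2 x 2 example to show the mean squares and F ratios.
toy = [[[1.0, 3.0], [5.0, 7.0]], [[2.0, 4.0], [6.0, 8.0]]]
toy_df, toy_ms = nested_anova(toy)
f_categories = toy_ms["categories"] / toy_ms["patients"]  # tested against patients
f_patients = toy_ms["patients"] / toy_ms["cells"]         # tested against cells
print(toy_ms, mean_square_shares(toy_ms))
```

In a nested design the category mean square is tested against the patient-to-patient mean square rather than against the cell-to-cell mean square, which is why measuring more cells per slide cannot substitute for including more patients.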
Histologists and cytologists are superbly capable of evaluating visual clues. The computer diagnostic system may present its assessment of a cell in two forms. It may either provide a collection of numbers, such as discriminant function scores, class membership probabilities, or atypicality indices, or it may offer a visual clue. For example, the computer graphic system may draw a frame around the cell image in a color hue anywhere from green to red, depending on how 'normal' the cell is. The diagnostician thus sees the cell as always, as a high-resolution full-color image. In addition, he also sees the cell as the computer does, assessed through a statistical/analytical filter, and color-coded accordingly [28]. Examples are shown in figures 25 and 26. In figure 25 all intermediate cells are surrounded by frames of a dark saturated green color; there is no indication of a dysplastic process or malignant disease. In figure 26 some cells are surrounded by frames encoded in yellow or even red color (shown by arrows in this black and white illustration). Clearly these intermediate cells express the marker features for the presence of ectocervical atypia.

Fig. 25. Intermediate cells from patient with normal cytology, surrounded by dark green frames which indicate absence of malignant disease.

Fig. 26. Intermediate cells from patient with abnormal cytology surrounded by yellow/red frames (indicated by arrows), which indicates the presence of atypia.

There are several examples of the possibility of an exhaustive utilization of information from a clinical sample. A case in point is provided by the marker features for the presence of malignant disease in dysplastic cells [74]. A diagnosis of 'malignancy present' in a cervical sample would usually be made only if tumor cells had been found in the sample and unequivocally identified.
Yet it is well established that variability in the clinical sample taking may fail to provide tumor cells in a small proportion of patients with carcinoma in situ or even with invasive cancer. It has recently been established that dysplastic cells from any of the three cell types (nonkeratinizing dysplastic cells, keratinizing dysplastic cells, and severely dysplastic cells from metaplasia) fall into statistically clearly separated subgroups: those that originated from patients with mere dysplasia, and those that came from patients with malignant disease. If the values of the marker features in an adequate number of dysplastic cells in a patient's sample clearly point to the presence of malignant disease, then such a diagnosis could be made with high reliability, even in the absence of tumor cells in the clinical sample. Figure 27 shows the confidence and the tolerance regions for nonkeratinizing dysplastic cells collected from patients with moderate dysplasia, from patients with severe dysplasia/carcinoma in situ, and from patients with invasive cancer. The axes of the display are formed by two effective marker features, the green/red contrast of the Papanicolaou-stained nuclear chromatin, and the average staining density of the nucleus.

Fig. 27. Confidence and tolerance regions for nonkeratinizing dysplastic cells collected from patients with moderate dysplasia, from patients with severe dysplasia/carcinoma in situ, and from patients with invasive carcinoma.

A typical example based on the discriminant function scores for nonkeratinizing dysplastic cells is shown in figure 28a for a cell from a patient with moderate dysplasia. Figure 28b shows a cell from a patient with carcinoma in situ.

Fig. 28. A typical nonkeratinizing dysplastic cell (a) from a patient with moderate dysplasia and (b) from a patient with carcinoma in situ.
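The distinction between the confidence and tolerance regions of figure 27 can be illustrated with a short sketch. It assumes bivariate normal data and the usual large-sample approximation: a tolerance ellipse is meant to enclose a stated fraction of the cell population, whereas a confidence ellipse bounds the estimate of the population mean and therefore shrinks with the number of cells measured. The function names, the synthetic data, and the two feature axes (contrast and density values) are hypothetical and are not taken from the study.

```python
import math
import random


def fit_ellipse(points, coverage, for_mean=False):
    """Return (center, (major, minor) semi-axes, major-axis angle in radians).

    With for_mean=False the ellipse approximates a tolerance region for
    single cells; with for_mean=True it approximates a confidence region
    for the bivariate mean (squared scale divided by the sample size).
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)

    # Eigenvalues and principal-axis angle of the 2 x 2 covariance matrix.
    half_trace = (sxx + syy) / 2.0
    root = math.hypot((sxx - syy) / 2.0, sxy)
    lam_major = half_trace + root
    lam_minor = max(half_trace - root, 0.0)
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)

    # The chi-square quantile with 2 degrees of freedom has a closed form.
    k2 = -2.0 * math.log(1.0 - coverage)
    if for_mean:
        k2 /= n
    axes = (math.sqrt(k2 * lam_major), math.sqrt(k2 * lam_minor))
    return (mx, my), axes, angle


def inside(point, center, axes, angle):
    """True if the point lies within the ellipse."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    u = dx * math.cos(angle) + dy * math.sin(angle)
    v = -dx * math.sin(angle) + dy * math.cos(angle)
    return (u / axes[0]) ** 2 + (v / axes[1]) ** 2 <= 1.0


# Hypothetical feature pairs (contrast, density) for 500 cells.
random.seed(1)
points = []
for _ in range(500):
    contrast = random.gauss(0.10, 0.02)
    density = 0.5 + 3.0 * (contrast - 0.10) + random.gauss(0.0, 0.05)
    points.append((contrast, density))

tolerance = fit_ellipse(points, 0.50)
confidence = fit_ellipse(points, 0.95, for_mean=True)
fraction = sum(inside(p, *tolerance) for p in points) / len(points)
print("fraction of cells inside the 50% tolerance ellipse:", fraction)
print("tolerance semi-axes:", tolerance[1])
print("confidence semi-axes:", confidence[1])
```

With several hundred cells the confidence ellipse is much smaller than the tolerance ellipse, which is why populations whose tolerance regions overlap heavily can still have clearly separated means.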
One may set up a classifier and assign dysplastic cells into three categories: patients with moderate dysplasia, patients with severe dysplasia/carcinoma in situ, and patients with invasive cancer. The majority of cells from a given patient should be classified into the disease category that gave rise to them. Figure 29 shows such patient profiles.

Fig. 29. Examples of patient profiles for various patient diagnoses.

A third category of subvisual information is the nuclear chromatin distribution, which has always been known to offer a wealth of diagnostic clues [36]. What cell image analysis has revealed, though, is a surprising consistency of chromatin texture on the one hand, and an exquisite sensitivity of the texture to changes in the physiologic state of the cell on the other. Recent research has shown not only that measurable changes in the chromatin texture occur following, for instance, exposure of cells to subtoxic doses of a chemical, but also that these changes show a marked specificity. This is best demonstrated by some examples. In a study by Lockart et al. [44], Daudi cells were incubated with interferon. Such incubation may, in its initial step, activate a very small number of four to six additional genes. Even such a gentle stimulus leads to statistically significant changes in the chromatin condensation, as can be seen in figure 30. Two image properties are plotted: one a discriminant function, that is, a composite feature for the discrimination between controls and incubated Daudi cells, the other a measure of the amount of noncondensed chromatin. Shown are the confidence ellipses for the estimates of the bivariate means of control cells and interferon-treated cells; the 50% tolerance ellipses; and the Bayesian decision boundary for the classification of a cell as a control cell or as an affected cell.

Fig. 30. Changes in the chromatin condensation in Daudi cells incubated with interferon.
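A Bayesian decision boundary of the kind drawn in figure 30 can be sketched as follows. The sketch assumes that both cell populations are bivariate normal with a common (pooled) covariance matrix, in which case the boundary is a straight line; the source does not state which model was actually used. The function names, the synthetic feature values, and the equal prior probabilities are hypothetical.

```python
import math
import random


def centroid(points):
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def fit_boundary(controls, treated, prior_treated=0.5):
    """Fit a linear Bayes rule; a positive score favors the treated class."""
    m0, m1 = centroid(controls), centroid(treated)

    # Pooled within-class covariance matrix (sxx, sxy, syy).
    sxx = sxy = syy = 0.0
    for points, (mx, my) in ((controls, m0), (treated, m1)):
        for x, y in points:
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
            syy += (y - my) ** 2
    dof = len(controls) + len(treated) - 2
    sxx, sxy, syy = sxx / dof, sxy / dof, syy / dof

    # Weight vector w = S^-1 (m1 - m0), using the 2 x 2 inverse.
    det = sxx * syy - sxy ** 2
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    w = ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)

    midpoint = ((m0[0] + m1[0]) / 2.0, (m0[1] + m1[1]) / 2.0)
    bias = -(w[0] * midpoint[0] + w[1] * midpoint[1])
    bias += math.log(prior_treated / (1.0 - prior_treated))
    return {"w": w, "bias": bias, "midpoint": midpoint}


def score(model, point):
    """Log posterior odds of 'treated' versus 'control' for one cell."""
    w = model["w"]
    return w[0] * point[0] + w[1] * point[1] + model["bias"]


# Hypothetical feature pairs for 200 control and 200 treated cells.
random.seed(2)
controls = [(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)) for _ in range(200)]
treated = [(random.gauss(2.5, 1.0), random.gauss(1.5, 1.0)) for _ in range(200)]

model = fit_boundary(controls, treated)
correct = sum(score(model, p) < 0 for p in controls)
correct += sum(score(model, p) > 0 for p in treated)
accuracy = correct / (len(controls) + len(treated))
print("fraction of cells classified correctly:", accuracy)
```

The boundary itself is the set of points where the score equals zero. With equal priors it passes through the midpoint between the two class means; a larger prior for one class shifts the line away from that class.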
A second example is taken from an application in environmental pathology: the detection of exposure to subtoxic doses of a toxicant. The tremendous potential for computer analysis of cells and tissues in environmental pathology is demonstrated in a study by Nair et al. [50]. Experimental rats were exposed to doses of chlordane orders of magnitude below the toxic level. Cytophotometry revealed, as expected, a statistically significant increase in the number of liver cells with doubled DNA contents, indicating the occurrence of repair processes (fig. 31, 32). More interesting, though, was the finding that in the liver cells of treated animals, even the cells with regular (8N) DNA contents showed consistent, measurable changes in their chromatin pattern. There can be no doubt that the ability to measure even more subtle changes will require a new determination of what constitutes toxic exposure. It may be that observed changes of this nature are not lasting, or that they will remain asymptomatic. It may also be that mere visual examination of stained tissue sections may no longer be adequate in the determination of thresholds of toxicity.

Fig. 31. Distribution of hepatocytes as a function of total optical density from control rats (a) and rats administered feedings of CCl4 and chlordane (b).

Fig. 32. Distribution of linear discriminant function values for hepatocytes from control rats and animals fed chlordane.

Relatively few studies have been directed at probing the specificity of changes in the chromatin patterns. The few studies that have been completed indicate that there appears to be a surprising specificity to changes in the chromatin distribution and staining. Lymphocytes exposed to ionizing radiation [1, 2], to chemotherapeutics [46], and to a virus infection [55], all exhibit measurable changes in their chromatin pattern, even though these changes generally remain below the threshold for visual detection [54]. This is shown in figure 33.
Lymphocytes were exposed in vivo to X-ray irradiation, to a dose of cyclophosphamide, or to Friend virus infection. The nuclear chromatin patterns in Feulgen-stained preparations were then evaluated, and two discriminant functions (SDMF1 and SDMF2) were computed. What figure 33 shows is not only that measurable and statistically significant changes occur for each treatment, but that the treated cells deviate in their chromatin pattern in a specific way. This is indicated by the direction, in the discriminant space, in which the mean values of the treated cell populations were displaced.

Fig. 33. These cell populations show distinct changes in different directions.

Discussion

In surveying the analytical capabilities in instrumentation, computer hardware and software, and the applications in clinical cytodiagnosis, two major impressions stand out. First, computer processing of microscopic imagery, numerical assessment and multivariate analysis can indeed provide not only quantitative, but qualitatively novel diagnostic information. The potential impact of our expanded ability to differentiate and to extract diagnostic and prognostic information on the practice of clinical cytology could be substantial. There is no doubt that, in the near future, technologic options will be offered far in excess of what the clinical pathologist is at this point prepared to utilize. The second impression is that the gap between the available analytic/mathematical power and intuitive comprehension in human diagnostic ability can most effectively be bridged by computer graphic methods. It is true that technology has only now made the DNA ploidy assessment a clinically and economically feasible procedure, 30 years after its clinical potential was established. Even today, hardly any analytic procedure is in place for the consistent evaluation of the prognostic clues offered by the ploidy patterns, much less is there a data base correlated with long-term clinical outcome data.
The vastly accelerated pace of technologic development makes it unlikely that lead times of such magnitude will ever occur again. Novel diagnostic procedures are accepted in the laboratory in a conservative manner, and good medical practice has benefited greatly from such a cautious approach. Yet the generation and collection of information may, within the decade, accelerate at an unprecedented pace. Large data bases may allow us to assess the efficacy and value of collected diagnostic and prognostic information within a shorter time span than has been possible in the past, while maintaining conservative standards of judgement.