Automatic Scoring of KI-67 Proliferation Index using Fuzzy C-Means Clustering

Saranya P M, 4th Semester Mtech Student, Caarmel Engineering College, Perunad
Jemily Elsa Rajan, Assistant Professor, Caarmel Engineering College, Perunad

Abstract
Neuroendocrine tumor (NET) is one of the most common types of cancer leading to death worldwide. NETs are classified according to the grade of biological aggressiveness and the extent of differentiation. Recently, the KI-67 proliferation index, defined as the ratio between the number of immunopositive tumor cells and the number of all tumor cells, has increasingly been considered a valid biomarker for evaluating tumor cell progression and predicting therapy responses. We therefore propose an automatic algorithm for scoring the KI-67 proliferation index of NET. To localize a large number of cells accurately and simultaneously, a fuzzy local c-means clustering algorithm is used. An efficient sparse dictionary learning algorithm is applied to select a set of representative training samples. Tumor and nontumor cells are then separated by an SVM classifier that uses both cellular features and regional structure information. Finally, the KI-67 proliferation index is calculated from the classification of immunopositive and immunonegative tumor cells. The automatic KI-67 counting is quite accurate compared with pathologists' manual annotations. The proposed system is more accurate than the existing methods.

Index Terms: Neuroendocrine tumor (NET), KI-67, clustering, histopathology, classification

I. INTRODUCTION
Image processing can be defined as the field of signal processing in which both the input and the output signals are images. An image can be thought of as a two-dimensional signal in matrix form, and image processing can be thought of as applying standard one-dimensional signal processing techniques to two-dimensional signals.
Digital image processing is the processing and display of images. It involves many different techniques, but the goal of most applications is to extract quantitative information from images. Examples of quantitative information relevant to cancer diagnosis are the size and irregularity distribution of cells, or the ratio of cells that are positive for a certain diagnostic biomarker to all cells. This work presents several methods aimed at different types of analyses of histological samples, with a focus on neuroendocrine tumor diagnosis, and with the purpose of complementing the role of clinical and research pathologists and biomarker researchers.

Neuroendocrine tumor (NET) is one of the most common cancers leading to death worldwide. Recently, the KI-67 proliferation index, defined as the ratio between the number of immunopositive tumor cells and the number of all tumor cells, has increasingly been considered a valid biomarker for evaluating tumor cell progression and predicting therapy responses [2][3]. We propose an integrated learning-based framework for accurate automatic KI-67 counting for NET.

Computer-assisted diagnosis (CAD) refers to procedures in medicine where computer algorithms and programs assist physicians in the interpretation of medical images. Within histopathology, CAD systems are becoming more frequent and increasingly crucial in cancer identification and analysis. Working with histological images is considerably different from working with, for example, radiology images. The manual interpretation of histological images is time-consuming and requires a lot of skill and experience. Studies show that the interpretation and scoring of stained specimens using the microscope is not only labour-intensive but also a highly visual and subjective process.
The use of computer-assisted analysis of histological images has been suggested as a promising way to reduce these problems, as employing computers allows both automation and consistent interpretation. Identifying certain histological structures, such as tumors, cell membranes or nuclei, is one of the prerequisites for cancer grading in histological images. Most methods focus on the segmentation of biomarker expression and on quantitative features that can be translated to relevant grading systems. One major challenge lies in making digital image analysis techniques available and incorporating them into the daily workflow of the pathologist or cancer researcher. Another challenge lies in the enormous size of histological images. The available data keeps increasing with the rapid growth of the digital microscopy field, driven by the need for high-throughput systems to accompany recent developments in microscopy imaging, such as automated whole slide scanning systems and tissue micro-arrays. This puts new demands on both data handling and the digital image analysis methods, which must analyze the images within a reasonable time. The methods presented in this work are highly relevant for the analysis of histological images, with a focus on the quantification of the biomarker known as the KI-67 proliferation index, for diagnosing neuroendocrine cancers.

Image segmentation is an important image processing technique, needed whenever we want to analyze the contents of an image. It is the process of dividing an image into parts that have a strong correlation with the corresponding real-world objects. It is one of the most important topics in image analysis, since all further analyses, such as feature extraction and object classification, depend on the result of the segmentation. Classification is not possible without knowledge of the items to classify.
In classification, knowledge of an item is represented by descriptors, often referred to as features. The contextual information of a sample object is often represented by a feature vector that ideally contains discriminative information about the object. Classification methods are often divided into supervised and unsupervised methods. Supervised methods [6] typically construct a classifier by dividing a set of samples, each represented by a feature vector, into a training set and a test set. The purpose of the training set is to construct decision boundaries in the feature space that separate the classes and that are used to classify new objects. The test set is used to evaluate the decision boundaries and the extracted features. Unsupervised methods assume that objects whose feature vectors are close to each other in feature space belong to the same class. The classifier can be constructed by ordering the feature vectors into clusters representing the individual classes.

Manual KI-67 assessment suffers from a low throughput rate and pathologist-dependent bias. Computer-aided pathological image analysis is a promising approach to improve objectivity and reproducibility. However, automatic and accurate KI-67 counting in digitized NET images is difficult to achieve, since the complex nature of histopathological images, such as variations in image texture, color, size, and shape, presents significant challenges. In addition, tumor and nontumor cells are usually clustered together, so that nontumor cells are also counted by many traditional methods, which leads to large counting errors.

The rest of the paper is organized as follows: Section II presents the motivation and overview. Section III discusses related works and Section IV illustrates the system design. Section V presents the experimental results and discussion, and Section VI concludes the paper.

II. MOTIVATION AND OVERVIEW
The neuroendocrine system is made up of nerve and gland cells.
Neuro means nerve and endocrine refers to the cells of the endocrine system. Digital pathology has taken the conventional standard of optical microscopy into the digital era, along with the development of computers and the exponential increase in computing power. The digitization of biological data has enabled the use of computers to assist in diagnosis. Interpreting tissue slides manually is labour-intensive and costly, and involves the risk of human error and inconsistency, while automated image analysis can provide fast and reproducible analyses, assisting pathologists in making an accurate diagnosis. The methods presented in this work are highly relevant for the analysis of histological images, with a focus on the quantification of the biomarker known as the KI-67 proliferation index, for diagnosing neuroendocrine cancers.

Neuroendocrine cancer [9] is a malignant tumor that can spread, or metastasize, to other parts of the body. The World Health Organization (WHO) classification scheme places neuroendocrine tumors into three main categories, which emphasize the tumor grade rather than the anatomical origin: (1) well-differentiated neuroendocrine tumors, which are subdivided into tumors with benign behavior and those with uncertain behavior; (2) well-differentiated (low grade) neuroendocrine carcinomas with low-grade malignant behavior; and (3) poorly differentiated (high grade) neuroendocrine carcinomas, which are the large cell neuroendocrine and small cell carcinomas. Placing a given tumor into one of these categories depends on well-defined histological features: size, shape, KI-67 labeling index, etc. We implement a three-stage learning-based approach to differentiate tumor cells from nontumor cells, and immunopositive from immunonegative tumor cells, for an automatic, accurate, and robust quantification of the KI-67 proliferation index.

III. RELATED WORKS
This section describes previous papers that are related to this work.
Computer-aided pathological image analysis is a promising approach to improve objectivity and reproducibility. The KI-67 proliferation index is increasingly considered a valid biomarker for evaluating tumor cell progression and predicting therapy responses [9]. An image analysis system is utilized in [11] to quantify tumor cells, where color intensity thresholds need to be properly selected. In [13], established image analysis software [14] is applied to quantification of the KI-67 proliferation index, and multiple staining methods are used to discriminate tumor from nontumor cells. The Aperio image analysis software is utilized in [15] and [10] for the assessment of the KI-67 proliferation index, but nontumor cells such as lymphocytes and stromal cells need to be excluded manually, and therefore the approach is not completely automatic. A computationally efficient single-pass voting (SPV) method for cell detection is reported in [7], which applies mean shift clustering instead of iterative voting for final seed localization. Another class of methods uses spatial filters to detect cells or nuclei. A LoG filter-based algorithm to automatically detect cell nuclei is presented in [5].

With these methods, nontumor cells such as lymphocytes, stromal, and/or epithelial cells often need to be excluded manually, and additional steps need to be designed to separate immunonegative and immunopositive tumor cells. The methods cannot precisely differentiate tumor from nontumor cells and separate touching cells simultaneously, which leads to large counting errors. These general cell detection and segmentation algorithms are not specifically designed to calculate the KI-67 proliferation index.

IV. SYSTEM DESIGN
An integrated learning-based algorithm is proposed for automatic scoring of the KI-67 proliferation index of NET, addressing the problems described earlier simultaneously. To localize a large number of cells accurately and simultaneously, the fuzzy local c-means clustering technique is used.
These clusters are used to initialize a repulsive deformable model [12] that extracts touching cell boundaries with known object topology constraints for segmentation. Next, an efficient sparse dictionary learning algorithm is applied to select a set of representative training samples. Finally, tumor and nontumor cells are separated by a trained SVM classifier that uses both cellular features and regional structure information. The KI-67 proliferation index is calculated from the classification of immunopositive and immunonegative tumor cells.

Fig 1: System Architecture

Fig 1 depicts the system architecture of the proposed work for calculating the KI-67 index for the diagnosis of NETs, based on the classification of immunopositive and immunonegative tumor cells. The methods range from segmentation and classification of sub-cellular features to their comparison with a set of representative training samples selected by an efficient sparse dictionary learning algorithm. We propose an adaptive SVM classifier for accurate and automatic KI-67 counting. The adaptive sparse strategy not only restricts pixels from different scales to be represented by training cells from a particular class, but also allows the selected cells for these pixels to vary, thus providing an improved representation. Finally, tumor and nontumor cells are separated using both cellular features and regional structure information. A region is constructed by iteratively incorporating pixels on the region boundary. In addition, active-contour based methods have been proposed to perform image segmentation. Changes in contrast can be detected by operators that calculate the gradient of an image. The fuzzy local c-means clustering technique is applied for boundary delineation and grouping of homogeneous cells. It is followed by segmentation and feature extraction. Finally, the KI-67 proliferation index is calculated.

A. Classification Based on Feature Extraction
After accurate segmentation and dictionary learning of all the cells in NET, a three-stage learning-based scheme combining cellular features and regional structure information is designed to differentiate tumor from nontumor cells, and immunopositive from immunonegative tumor cells, for accurate KI-67 counting. Differentiation between tumor and nontumor cells is the critical step for accurate automatic KI-67 scoring. To extract distinctive morphological features that separate tumor from nontumor cells, accurate cellular boundary delineation is required. This is a challenging problem due to the complex color and intensity variations inside the cells, especially in the case of touching cell clumps. For boundary delineation, the fuzzy local c-means clustering technique is used.

In Stage I, only cellular features are considered, and some nontumor cells (such as lymphocytes) can be classified as tumor cells by mistake. In Stage II, the statistical features are concatenated with cellular probabilities to train an SVM classifier, whose output labels differentiate tumor from nontumor cells. Based on the classification in Stage II, the final stage separates immunopositive and immunonegative tumor cells. This is achieved by training a final classifier for all the KI-67 positive staining cells, using the cell features and the cellular intensity histogram to differentiate the immunopositive and immunonegative tumor cells. The KI-67 proliferation index is finally calculated from the output of this SVM classifier, which differentiates immunopositive and immunonegative tumor cells.
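The final counting step can be illustrated with a minimal sketch. It assumes that every detected cell has already received one of three labels from the classifiers; the label names and the `ki67_index` helper are hypothetical and are not part of the original system. The index is the fraction of immunopositive tumor cells among all tumor cells, with nontumor cells excluded from both counts.

```python
from collections import Counter

# Hypothetical label names; the paper does not define an encoding.
POSITIVE = "tumor_positive"
NEGATIVE = "tumor_negative"
NONTUMOR = "nontumor"


def ki67_index(labels):
    """Return the KI-67 proliferation index as a fraction in [0, 1].

    Index = immunopositive tumor cells / all tumor cells.
    Nontumor cells (e.g. lymphocytes, stromal cells) are ignored.
    """
    counts = Counter(labels)
    tumor_total = counts[POSITIVE] + counts[NEGATIVE]
    if tumor_total == 0:
        raise ValueError("no tumor cells among the classified cells")
    return counts[POSITIVE] / tumor_total


# Illustrative labels only, not data from the paper.
example = [POSITIVE] * 3 + [NEGATIVE] * 17 + [NONTUMOR] * 5
print(f"KI-67 index: {ki67_index(example):.1%}")
```

Counting nontumor cells in the denominator would lower the index, which is why the tumor/nontumor separation in Stage II has to come before this step.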
Fig 2: System workflow of the proposed automatic KI-67 proliferation index counting system. Blocks in the diagram: Read Input Image; Fuzzy C-means Clustering; Gradient Calculation; Masking; Active Contour Segmentation; Feature Extraction (geometric, color, cell shape, probability, intensity, etc.); Training Sample; SVM Classifier; Tumor/Nontumor Cell Classification; Immunopositive/Immunonegative Tumor Cell Classification; KI-67 Count.

Fig 2 shows the block diagram for calculating the KI-67 index for the diagnosis of NETs, based on the classification of immunopositive and immunonegative tumor cells. With the fuzzy local c-means clustering technique [8], homogeneous cells can be grouped and the boundary of each cluster can be separated. Cell boundaries are segmented with an improved deformable model. Features are then extracted to differentiate immunopositive and immunonegative tumor cells. The fuzzy local c-means clustering algorithm used here keeps the time consumed by active contour segmentation low, so segmentation efficiency is expected to be higher than with existing methods.

Fuzzy c-means (FCM) is a clustering method that allows one piece of data to belong to two or more clusters. It is based on the minimization of an objective function. Each data point is assigned a membership to each cluster center, so a data point may belong to more than one cluster. The fuzzy local c-means clustering technique performs well on overlapping data sets and compares favorably with the mean shift clustering algorithm described in [7].

V. EXPERIMENTAL RESULTS AND DISCUSSION
The ROC curve for analysing the classifier performance of the proposed automatic KI-67 counting system is shown in Fig 3. The fuzzy local c-means clustering algorithm proposed here is an efficient algorithm for grouping homogeneous cells. We use it to improve the efficiency of cell segmentation.
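To make the fuzzy c-means idea concrete, the following is a minimal sketch of the standard FCM iteration on scalar intensities. It is the plain algorithm, not the local variant of [8] used in the proposed system. The function name, the fuzzifier `m = 2`, the random initialisation and the sample values are all assumptions chosen for illustration. Each pass alternates between recomputing the cluster centers as membership-weighted means and recomputing the memberships from the relative distances to the centers, which lowers the objective J = sum_i sum_j u_ij^m (x_i - c_j)^2.

```python
import random


def fuzzy_c_means(data, n_clusters=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means on a list of scalar values.

    Returns (centers, memberships), where memberships[i][j] is the degree
    to which data[i] belongs to cluster j (each row sums to 1).
    """
    rng = random.Random(seed)
    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in data:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = [0.0] * n_clusters
    exponent = 2.0 / (m - 1.0)
    for _ in range(n_iter):
        # Center update: mean of the data weighted by membership ** m.
        for j in range(n_clusters):
            weights = [u[i][j] ** m for i in range(len(data))]
            centers[j] = sum(w * x for w, x in zip(weights, data)) / sum(weights)

        # Membership update from relative distances to the centers.
        max_change = 0.0
        for i, x in enumerate(data):
            dists = [abs(x - c) for c in centers]
            if min(dists) == 0.0:
                # Point coincides with a center: full membership there.
                new_row = [1.0 if d == 0.0 else 0.0 for d in dists]
                s = sum(new_row)
                new_row = [v / s for v in new_row]
            else:
                new_row = [
                    1.0 / sum((dists[j] / dists[k]) ** exponent
                              for k in range(n_clusters))
                    for j in range(n_clusters)
                ]
            max_change = max(max_change,
                             max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if max_change < tol:
            break
    return centers, u


# Illustrative intensities only: a dark group and a bright group.
intensities = [0.10, 0.12, 0.15, 0.80, 0.85, 0.90]
centers, memberships = fuzzy_c_means(intensities)
print(sorted(round(c, 2) for c in centers))
```

Because memberships are graded rather than hard, a pixel lying between two touching cells keeps a nonzero membership in both clusters. This is the property the text refers to when it says a data point may belong to more than one cluster.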
As shown in Fig 3, the fuzzy local c-means clustering technique is much more efficient than other clustering techniques such as mean shift clustering, which has high time consumption due to active contour segmentation. It also performs well on overlapping data sets and compares favorably with the mean shift clustering algorithm.

VI. CONCLUSION AND SCOPE OF THE WORK
This paper presents several methods and applications aimed at different types of analyses of histological samples, with a focus on neuroendocrine tumor diagnosis, and with the purpose of complementing the role of clinical and research pathologists and biomarker researchers. Accurate estimation of the KI-67 proliferation index is significant because it determines the tumor grade. NETs are separated into three prognostically significant grades: low, intermediate, and high. Low-grade (grade 1) NETs have a mitotic count of less than two mitoses per ten high-powered fields (HPF) and/or a KI-67 index of less than 3%. Intermediate (grade 2) NETs have a mitotic rate of 2–20 mitoses per ten HPF and/or a KI-67 index of 3–20%. High-grade (grade 3) NETs are morphologically anaplastic, and often show mitotic rates greater than 20 mitoses per ten HPF and a KI-67 index well above 20%. These specific cut-offs (3% and 20%) need to be precisely determined in order to differentiate low, intermediate, and high grades of NETs.

In this work, an efficient fuzzy c-means clustering technique for KI-67 scoring of digitized NET images is proposed. The cell detection algorithm can efficiently and accurately detect thousands of cells on a digitized NET image with KI-67 staining. Furthermore, a three-stage learning-based approach is designed to differentiate tumor cells from nontumor cells, and immunopositive from immunonegative tumor cells, for an automatic, accurate, and robust quantification of the KI-67 proliferation index.
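The 3% and 20% cut-offs quoted above can be expressed as a small lookup, sketched here under stated assumptions. The helper name `net_grade_from_ki67` is hypothetical, only the KI-67 criterion is applied (the mitotic count is ignored), and treating exactly 20% as grade 2 is an assumption about how the boundary is handled.

```python
def net_grade_from_ki67(ki67_percent):
    """Map a KI-67 index (in percent) to a NET grade using the 3% / 20% cut-offs.

    Only the KI-67 criterion is used here; the mitotic count is ignored.
    """
    if not 0.0 <= ki67_percent <= 100.0:
        raise ValueError("KI-67 index must be a percentage between 0 and 100")
    if ki67_percent < 3.0:
        return 1  # low grade: KI-67 index below 3%
    if ki67_percent <= 20.0:
        return 2  # intermediate grade: KI-67 index of 3-20% (boundary assumed inclusive)
    return 3      # high grade: KI-67 index above 20%


# Illustrative values only, not results from the paper.
for value in (1.5, 3.0, 12.0, 35.0):
    print(value, "->", "grade", net_grade_from_ki67(value))
```

A counting error of one or two percentage points near either cut-off changes the grade, which is why nontumor cells must be kept out of the count.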
Differentiation between tumor and nontumor cells is the critical step for accurate automatic KI-67 scoring, because the proposed cell detection algorithm detects both tumor and nontumor cells indiscriminately. The proposed scheme is more accurate than the existing methods and exhibits high performance. Future work is to improve the efficiency of the cell segmentation algorithm. This work can contribute to research in the field of computer-assisted diagnosis in digital histopathology.

REFERENCES
[1] F. Xing, H. Su, J. Neltner, and L. Yang, “Automatic KI-67 counting using robust cell detection and online dictionary learning,” IEEE Trans. Biomed. Eng., vol. 61, no. 3, Mar. 2014.
[2] U. Nabi, A. H. Nagi, and W. Sami, “KI67 proliferating index and histological grade, type and stage of colorectal carcinoma.”
[3] W. Jonat and N. Arnold, “Is the KI-67 labelling index ready for clinical use?” Ann. Oncol., vol. 22, no. 3, pp. 500–502, 2011.
[4] C. G. Loukas, G. D. Wilson, B. Vojnovic, and A. Linney, “An image analysis-based approach for automated counting of cancer cell nuclei in tissue sections,” Cytometry A, vol. 55, no. 1, pp. 30–42, 2003.
[5] J. Byun, M. R. Verardo, B. Sumengen, G. Lewis, B. S. Manjunath, and S. K. Fisher, “Automated tool for the detection of cell nuclei in digital microscopic images: application to retinal images,” Mol. Vis., vol. 12, pp. 949–960, 2006.
[6] B. Potocnik, F. Solina, and D. Zazula, “Classical image processing vs. computer vision techniques in computer assisted and automated detection of follicles in ultrasound images of ovary,” in Image Processing and Its Applications, 6th International Conference, vol. 2, pp. 551–555, 1997.
[7] X. Qi, F. Xing, D. J. Foran, and L. Yang, “Robust segmentation of overlapping cells in histopathology specimens using parallel seed detection and repulsive level set,” IEEE Trans. Biomed. Eng. (TBME), vol. 59, no. 3, pp. 754–765, 2012.
[8] A. Hafiane, F. Bunyak, and K. Palaniappan, “Fuzzy clustering and active contours for histopathology image segmentation and nuclei detection,” in Adv. Concepts Intell. Vis. Syst. (ACIVS), 2008, vol. 5259, pp. 903–914.
[9] A. M. Mougey and D. G. Adler, “Neuroendocrine tumors: review and clinical update,” Nov. 2007.
[10] S. Fasanella, E. Leonardi, C. Cantaloni, C. Eccher, I. Bazzanella, D. Aldovini, E. Bragantini, L. Morelli, L. V. Cuorvo, A. Ferro, F. Gasperetti, G. Berlanda, P. D. Palma, and M. Barbareschi, “Proliferative activity in human breast cancer: KI-67 automated evaluation and the influence of different KI-67 equivalent antibodies,” Diagn. Pathol., vol. 6, no. S1, 2011.
[11] Z. M. Mohammed, D. C. McMillan, B. Elsberger, J. J. Going, C. Orange, E. Mallon, J. C. Doughty, and J. Edwards, “Comparison of visual and automated assessment of KI-67 proliferative activity and their impact on outcome in primary operable invasive ductal breast cancer,” Br. J. Cancer, vol. 106, no. 2, pp. 383–388, 2012.
[12] L. D. Cohen, “On active contour models and balloons,” CVGIP: Image Understand., vol. 53, no. 2, pp. 211–218, 1991.
[13] C. M. van der Loos, O. J. de Boer, C. Mackaaij, L. T. Hoekstra, T. M. van Gulik, and J. Verheij, “Accurate quantitation of Ki67-positive proliferating hepatocytes in rabbit liver by a multicolor immunohistochemical (IHC) approach analyzed with automated tissue and cell segmentation software,” J. Histochem. Cytochem., vol. 61, no. 1, pp. 11–18, 2013.
[14] J. R. Mansfield, “Cellular context in epigenetics: quantitative multicolour imaging and automated per-cell analysis of miRNAs and their putative targets,” Methods, vol. 52, pp. 271–280, 2010.
[15] L. H. Tang, M. Gonen, C. Hedvat, I. M. Modlin, and D. S. Klimstra, “Objective quantification of the Ki67 proliferative index in neuroendocrine tumors of the gastroenteropancreatic system: a comparison of digital image analysis with manual methods,” Am. J. Surg. Pathol., vol. 36, no. 12, pp. 1761–1770, 2012.