Automatic Scoring of KI-67 Proliferation Index using
Fuzzy C-Means Clustering
Saranya P M, 4th Semester Mtech Student, Caarmel Engineering College, Perunad
Jemily Elsa Rajan, Assistant Professor, Caarmel Engineering College, Perunad
Abstract
Neuroendocrine tumor (NET) is one of the most common types of cancer
leading to death worldwide. NETs are classified according to the grade
of biological aggressiveness and the extent of differentiation.
Recently, the KI-67 proliferation index, defined as the ratio of the
number of immunopositive tumor cells to the number of all tumor cells,
has increasingly been considered a valid biomarker for evaluating tumor
cell progression and predicting therapy response. We propose an
automatic algorithm for scoring the KI-67 proliferation index of NETs.
To localize a large number of cells accurately and simultaneously, a
fuzzy local c-means clustering algorithm is used. An efficient sparse
dictionary learning algorithm is applied to select a set of
representative training samples. Tumor and nontumor cells are then
separated by an SVM classifier that uses both cellular features and
regional structure information. Finally, the KI-67 proliferation index
is calculated from the classification results for immunopositive and
immunonegative tumor cells. The automatic KI-67 counts are quite
accurate compared with pathologists' manual annotations. The proposed
system is much more accurate than existing methods.
Index terms– Neuroendocrine tumor (NET), KI-67, clustering, histopathology, classification
I. INTRODUCTION
Image processing can be defined as the field of
signal processing where both the input and
output signals are images. An image can be
thought of as a two-dimensional signal via a
matrix representation, and image processing can
be thought of as applying standard
one-dimensional signal processing techniques to
two-dimensional signals. Digital image
processing is the processing and display of
images and involves many different types of
techniques, but the goal of most applications is
to extract quantitative information from images.
Examples of quantitative information relevant to
cancer diagnosis can be the size and irregularity
distribution of cells, or the ratio of cells that are
positive for a certain diagnostic biomarker to all
cells. This work presents several methods aimed
at different types of analyses of histological
samples, with a focus on neuroendocrine tumor
diagnosis, with the purpose of
complementing the role of clinical and
research pathologists and biomarker researchers.
Neuroendocrine tumor (NET) is one of the most
common cancers leading to death worldwide.
Recently, the KI-67 proliferation index, defined
as the ratio of the number of immunopositive
tumor cells to the number of all tumor cells,
has increasingly been considered a valid
biomarker for evaluating tumor cell progression
and predicting therapy response [2][3]. Here we
propose an integrated learning-based framework
for accurate automatic KI-67 counting for NET.
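
The definition of the index given above can be written out as a short calculation. The following is a minimal sketch in Python; the function name and the example counts are assumptions made for illustration and do not come from the experiments in this paper.

```python
def ki67_index(n_positive: int, n_negative: int) -> float:
    """Return the KI-67 proliferation index as a percentage.

    n_positive: number of immunopositive tumor cells
    n_negative: number of immunonegative tumor cells
    Nontumor cells (e.g. lymphocytes, stromal cells) must already be
    excluded from both counts.
    """
    if n_positive < 0 or n_negative < 0:
        raise ValueError("cell counts must be non-negative")
    total = n_positive + n_negative
    if total == 0:
        raise ValueError("no tumor cells were counted")
    return 100.0 * n_positive / total


# Hypothetical counts, chosen only for illustration.
print(ki67_index(30, 970))  # 3.0
```

Because the denominator contains tumor cells only, every nontumor cell that is wrongly counted changes the result, which is why the tumor/nontumor separation matters for the final score.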
Computer-assisted diagnosis (CAD) refers to
the procedures in medicine where computer
algorithms and programs assist physicians in the
interpretation of medical images. Within
histopathology, CAD systems are getting more
frequent and increasingly crucial in cancer
identification and analysis. Working with
histological images is considerably different
from working with, for example, radiology
images. The manual interpretation of
histological images is time-consuming and
requires a lot of skill and experience. Studies
show that the interpretation and scoring of
stained specimens using the microscope is not
only labour-intensive but also a
highly visual and subjective process. The use of
computer-assisted analysis of histological
images has been suggested as a promising way
to reduce these problems, as employing
computers allows both automation and
consistent interpretation.
Identifying certain histological structures, such
as tumors, cell membranes or nuclei, is one of
the prerequisites to cancer grading in
histological images. Most methods focus on the
segmentation of biomarker expression and
quantitative features that can be translated to
relevant grading systems.
One major challenge lies in making the digital
image analysis techniques available and
incorporated into the daily workflow of the
pathologist or cancer researcher. Another
challenge lies in the enormous size of
histological images. The available data keeps
increasing with the rapid growth of the digital
microscopy field, driven by the need for high-throughput systems to accompany recent
developments in microscopy imaging, such as
the automated whole slide scanning systems and
tissue micro-arrays. This puts new demands on
both data handling and the digital image
analysis methods for analyzing the images
within a reasonable time. The methods
presented in this paper are highly relevant for
the analysis of histological images, with a focus
on the quantification of the biomarker known as
the KI-67 proliferation index, for diagnosing
neuroendocrine cancers.
Image segmentation is an important image
processing technique that is needed whenever
the content of an image is to be analyzed.
Image segmentation is the process of dividing
an image into parts that have a strong
correlation with the equivalent real world
objects. It is one of the most important topics in
image analysis since all further analyses, such
as feature extraction and object classification,
will depend on the result of the segmentation.
Classification is not possible without knowledge
of the items to classify. In classification,
knowledge of an item is represented by
descriptors, often referred to as features. The
contextual information of a sample object is
often represented by a feature vector that ideally
contains discriminative information of the
object.
Classification methods are often divided into
supervised and unsupervised methods.
Supervised methods [6] typically construct a
classifier by dividing a set of samples, each
represented by a feature vector, into a training
set and a test set. The purpose of the training set
is to construct decision lines in the feature space
that separate the classes and that are used to
classify new objects. The test set is used to
evaluate the decision lines and the extracted
features. Unsupervised methods assume that
objects, whose feature vectors are close to each
other in feature space, belong to the same class.
The classifier can be constructed by ordering the
feature vectors into clusters representing the
individual classes.
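
The supervised workflow described above (split the samples, fit on the training set, evaluate on the test set) can be illustrated with a small sketch. A nearest-centroid rule is used here only as a simple stand-in for a trained classifier; it is not the SVM used later in this paper, and the two features per cell (area and mean stain intensity) and their values are hypothetical.

```python
import random
from math import dist


def split_samples(samples, test_fraction=0.25, seed=0):
    """Shuffle labelled samples and divide them into a training and a test set."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_centroids(train):
    """Compute one mean feature vector (centroid) per class from the training set."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc)
            for label, acc in sums.items()}


def predict(centroids, features):
    """Assign the class whose centroid is closest in feature space."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


def accuracy(centroids, test):
    """Fraction of test samples whose predicted label matches the true label."""
    hits = sum(predict(centroids, f) == y for f, y in test)
    return hits / len(test)


# Hypothetical feature vectors: (cell area, mean stain intensity).
samples = [
    ((48.0, 0.80), "tumor"), ((52.0, 0.70), "tumor"),
    ((50.0, 0.90), "tumor"), ((55.0, 0.75), "tumor"),
    ((20.0, 0.30), "nontumor"), ((22.0, 0.20), "nontumor"),
    ((18.0, 0.35), "nontumor"), ((25.0, 0.25), "nontumor"),
]
train, test = split_samples(samples)
centroids = fit_centroids(train)
print(accuracy(centroids, test))
```

The test set is never used while fitting, so the reported accuracy reflects how the decision rule behaves on objects it has not seen.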
Manual KI-67 assessment is subject to a low
throughput processing rate and
pathologist-dependent bias. Computer-aided
pathological image analysis is a promising
approach to improve objectivity and
reproducibility. However, automatic and
accurate KI-67 counting in digitized NET
images is difficult to achieve, since the complex
nature of histopathological images, such as
variations in image texture, color, size, and
shape, presents significant challenges. In
addition, tumor and nontumor cells are usually
clustered together, so many traditional methods
also count the nontumor cells, which leads to
large counting errors.
The rest of the paper is organized as follows:
Section II presents the motivation and overview.
Section III discusses related works and Section
IV illustrates the system design. Section V
presents the experimental results and
discussion, and Section VI concludes the paper.
II. MOTIVATION AND OVERVIEW
The neuroendocrine system is made up of nerve
and gland cells. Neuro means nerve and
endocrine means the cells of the endocrine
system. Digital pathology has taken the
conventional standard of optical microscopy
into the digital era along with the development
of computers and the exponential increase in
computer power. The digitization of biological
data has enabled the use of computers assisting
in the diagnosis. Interpreting tissue slides
manually is labour-intensive, costly and
involves the risk of human error and
inconsistency, while automated image
analysis can provide additional automatic, fast
and reproducible analyses, assisting the
pathologist in making an accurate diagnosis.
The methods presented in this work are highly
relevant for the analysis of histological images,
with a focus on the quantification of the
biomarker known as the KI-67 proliferation
index, for diagnosing neuroendocrine cancers.
Neuroendocrine cancer [9] is a malignant tumor
that can spread, or metastasize, to other parts
of the body. The World Health Organization
(WHO) classification scheme places
neuroendocrine tumors into three main
categories, which emphasize the tumor
grade rather than the anatomical origin.
• Well-differentiated neuroendocrine tumors,
which are subdivided into tumors with
benign behavior and those with uncertain behavior.
• Well-differentiated (low grade)
neuroendocrine carcinomas with low-grade
malignant behavior.
• Poorly differentiated (high grade)
neuroendocrine carcinomas, which are the
large cell neuroendocrine and small cell
carcinomas.
Placing a given tumor into one of these categories
depends on well-defined histological features:
size, shape, KI-67 labeling index, etc. Here we
implement a three-stage learning-based
approach to differentiate tumor cells from
nontumor cells, and immunopositive from
immunonegative tumor cells, for an automatic,
accurate, and robust quantification of the KI-67
proliferation index.
III. RELATED WORKS
This section deals with description of previous
papers that are related to this work. Computer
aided pathological image analysis is a promising
approach to improve the objectivity and
reproducibility. The KI-67 proliferation index is
increasingly considered a valid biomarker for
evaluating tumor cell progression and predicting
therapy response [9]. An image analysis system
is utilized in [11] to quantify tumor cells, where
color intensity thresholds need to be properly
selected. In [13], established image analysis
software [14] is applied to quantification of the
KI-67 proliferation index, and multiple staining
methods are used to discriminate tumor from
nontumor cells. The Aperio image analysis
software is utilized in [15] and [10] for the
assessment of KI-67 proliferation index, but the
nontumor cells such as lymphocytes and stromal
cells need to be excluded manually, and
therefore the procedure is not completely automatic. A
computationally efficient single-pass voting
(SPV) method for cell detection is reported in [7], which
applies mean shift clustering instead of iterative
voting for final seed localization. Another class
of methods uses spatial filters to detect
cells/nuclei. A LoG filter-based algorithm to
automatically detect cell nuclei is presented in
[5]. With these methods, nontumor cells such as
lymphocytes, stromal, and/or epithelial cells
often need to be excluded manually. Meanwhile,
additional steps need to be designed to separate
immunonegative and immunopositive tumor
cells. These methods cannot precisely
differentiate tumor from nontumor cells and
separate touching cells simultaneously, which
leads to large counting errors. The aforementioned
general cell detection and segmentation
algorithms are not specifically designed to
calculate the KI-67 proliferation index.
IV. SYSTEM DESIGN
An integrated learning-based algorithm is
proposed for automatic scoring of the KI-67
proliferation index of NETs, addressing the
problems described earlier simultaneously. In
order to accurately and simultaneously localize
a large number of cells, a fuzzy local c-means
clustering technique is used. These clusters will be used to
initialize a repulsive deformable model [12] to
extract touching cell boundaries with known
object topology constraints for segmentation.
Next, an efficient sparse dictionary learning
algorithm is applied to select a set of
representative training samples. Finally, tumor
and nontumor cells are separated by a trained
SVM classifier with both the cellular features
and regional structure information. The KI-67
proliferation index is calculated based on the
classification results of immunopositive and
immunonegative tumor cells.
Fig 1: System Architecture
Fig 1 depicts the system architecture of the
proposed work for calculating the KI-67 index
for the diagnosis of NETs based on the
classification results for immunopositive and
immunonegative tumor cells. The methods
range from segmentation and classification of
sub-cellular features to their comparison with a
set of representative training samples selected
by an efficient sparse dictionary learning
algorithm. We propose an adaptive SVM
classifier for accurate and automatic KI-67
counting. The adaptive sparse strategy not only
restricts pixels from different scales to be
represented by training cells from a particular
class but also allows the selected cells for these
pixels to vary, thus providing an improved
representation. Finally, tumor and nontumor
cells are separated using both the cellular
features and regional structure information. A
region is constructed by iteratively
incorporating pixels on the region boundary. In
addition, active-contour based methods have
also been proposed to perform image
segmentation. Changes in contrast can be
detected by operators that calculate the gradient
of an image. The fuzzy local c-means clustering
technique is applied for boundary delineation
and grouping of homogeneous cells. It is
followed by segmentation and feature
extraction, after which the KI-67 proliferation
index is calculated.
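
As an illustration of a gradient operator that responds to changes in contrast, the sketch below applies the Sobel kernels to a small grayscale image stored as a list of rows. The choice of the Sobel operator is an assumption made for this example; the text above does not name a specific operator. Border pixels are left at zero for simplicity.

```python
from math import hypot


def sobel_gradient_magnitude(image):
    """Return the Sobel gradient magnitude of a 2-D grayscale image.

    image is a list of equally long rows of numbers. Only interior
    pixels are computed; border pixels stay 0.0.
    """
    kx = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))    # horizontal change
    ky = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))    # vertical change
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    pixel = image[r + dr][c + dc]
                    gx += kx[dr + 1][dc + 1] * pixel
                    gy += ky[dr + 1][dc + 1] * pixel
            out[r][c] = hypot(gx, gy)
    return out


# Toy image with a vertical edge between a dark and a bright region.
step = [[0, 0, 10, 10] for _ in range(4)]
for row in sobel_gradient_magnitude(step):
    print(row)
```

The magnitude is large along the edge and zero in flat regions, which is the property that edge-based and active-contour segmentation methods rely on.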
A. Classification Based on Feature Extraction
After accurate segmentation and dictionary
learning for all the cells in a NET image, a
three-stage learning-based scheme combining
cellular features and regional structure
information is designed to differentiate tumor
from nontumor cells, and immunopositive from
immunonegative tumor cells, for accurate KI-67
counting.
Differentiation between tumor and nontumor
cells is the critical step for accurate automatic
KI-67 scoring. In order to extract distinctive
morphological features to separate tumor from
nontumor cells, accurate cellular boundary
delineation is required. This is a challenging
problem due to the complex color and intensity
variations inside the cells, especially in the case
of touching cell clumps. For boundary
delineation, the fuzzy local c-means clustering
technique is used.
In Stage I, only cellular features are considered,
and some nontumor cells (such as lymphocytes)
can be classified as tumor cells by mistake. In
Stage II, the statistical features are concatenated
with the cellular probabilities to train an SVM
classifier, whose output labels differentiate
tumor from nontumor cells. In the third stage,
building on the classification in Stage II,
immunopositive and immunonegative tumor cells
are separated. This is achieved by training a
final classifier for all the KI-67 positive staining
cells, using the cell features and the cellular
intensity histogram. The KI-67 proliferation
index is finally calculated from the output of
this SVM classifier, which differentiates
immunopositive from immunonegative tumor cells.
[Figure 2 blocks: Read Input Image; Fuzzy C-means Clustering; Gradient Calculation; Masking; Active Contour Segmentation; Feature Extraction (Geometric, Color, Cell Shape, Probability, Intensity etc); Training Sample; SVM Classifier; Tumor / Non-tumor Cell Classification; Feature Extraction; Immunopositive / Immunonegative Tumor Cell Classification; KI-67 Count]
Fig 2: System Workflow of the proposed automatic KI-67 proliferation index counting system
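
The control flow of the three classification stages can be sketched as follows. This is only an illustration under stated assumptions: the `Cell` record, the use of the mean neighbor probability as the regional statistic, the single stain intensity standing in for the intensity histogram, and the threshold rules standing in for the trained SVM classifiers are all hypothetical and are not taken from the implementation described in this paper.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple


@dataclass
class Cell:
    features: Sequence[float]   # cellular features (geometry, color, shape, ...)
    stain_intensity: float      # stand-in for the cellular intensity histogram
    neighbors: List[int] = field(default_factory=list)  # indices of nearby cells


def three_stage_count(
    cells: Sequence[Cell],
    stage1_probability: Callable[[Sequence[float]], float],
    stage2_is_tumor: Callable[[List[float]], bool],
    stage3_is_positive: Callable[[float], bool],
) -> Tuple[int, int]:
    """Return (immunopositive, immunonegative) tumor cell counts."""
    # Stage I: tumor probability from cellular features only.
    probs = [stage1_probability(cell.features) for cell in cells]
    n_positive = n_negative = 0
    for cell, prob in zip(cells, probs):
        # Regional statistic (assumed): mean Stage I probability of neighbors.
        nearby = [probs[j] for j in cell.neighbors]
        regional = sum(nearby) / len(nearby) if nearby else prob
        # Stage II: classify the concatenated [cellular, regional] vector.
        if not stage2_is_tumor([prob, regional]):
            continue  # nontumor cells are excluded from the index
        # Stage III: split the remaining tumor cells by staining.
        if stage3_is_positive(cell.stain_intensity):
            n_positive += 1
        else:
            n_negative += 1
    return n_positive, n_negative


# Toy stand-ins for the trained classifiers; all thresholds are hypothetical.
cells = [
    Cell([0.9], 0.8, [1]),
    Cell([0.8], 0.2, [0]),
    Cell([0.6], 0.9, [3]),  # tumor-like on its own, but its neighbor is nontumor
    Cell([0.1], 0.1, [2]),
]
counts = three_stage_count(
    cells,
    stage1_probability=lambda f: f[0],
    stage2_is_tumor=lambda v: (v[0] + v[1]) / 2 >= 0.5,
    stage3_is_positive=lambda s: s >= 0.5,
)
print(counts)  # (1, 1)
```

In this toy run the third cell would pass a Stage I threshold of 0.5 on its own, but the regional information used in Stage II removes it, which mirrors the role the text assigns to regional structure information.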
Fig 2 above shows the block diagram for
calculating the KI-67 index for the diagnosis of
NETs based on the classification results for
immunopositive and immunonegative tumor
cells. With the fuzzy local c-means clustering
technique [8], homogeneous cells can be
grouped and the boundary of each cluster can be
separated. Cell boundaries are segmented with
an improved deformable model. Features are
then extracted to differentiate immunopositive
and immunonegative tumor cells. The fuzzy local
c-means clustering algorithm used here has low
time consumption for active contour
segmentation, so segmentation efficiency is
expected to be higher than that of existing methods.
Fuzzy c-means (FCM) is a clustering method
that allows one piece of data to belong to two
or more clusters. It is based on minimization of
an objective function. The fuzzy local c-means
clustering technique gives the best result for
overlapped data sets and performs comparatively
better than the mean shift clustering algorithm
described in [7]. Each data point is assigned a
membership to each cluster center, so a data
point may belong to more than one cluster.
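
The membership idea can be made concrete with a minimal sketch of the standard fuzzy c-means iteration on one-dimensional values such as pixel intensities. It minimizes the usual FCM objective, the sum over points and clusters of membership^m times squared distance, by alternating membership and center updates. Note the assumptions: this is the basic FCM algorithm, not the local variant used in this work, and the fuzzifier `m`, the initialization, and the sample intensities are chosen only for illustration.

```python
def memberships_for(x, centers, exponent):
    """Fuzzy membership of value x in every cluster (entries sum to 1)."""
    distances = [abs(x - c) for c in centers]
    if 0.0 in distances:
        # x lies exactly on a center: give that cluster full membership.
        hit = distances.index(0.0)
        return [1.0 if i == hit else 0.0 for i in range(len(centers))]
    return [1.0 / sum((d_i / d_j) ** exponent for d_j in distances)
            for d_i in distances]


def fuzzy_c_means(values, n_clusters=2, m=2.0, tol=1e-6, max_iter=100):
    """Cluster 1-D values; return (centers, membership rows per value)."""
    if m <= 1.0:
        raise ValueError("fuzzifier m must be greater than 1")
    lo, hi = min(values), max(values)
    # Deterministic start: centers spread evenly over the value range.
    centers = [lo + (i + 0.5) * (hi - lo) / n_clusters for i in range(n_clusters)]
    exponent = 2.0 / (m - 1.0)
    for _ in range(max_iter):
        u = [memberships_for(x, centers, exponent) for x in values]
        new_centers = []
        for i in range(n_clusters):
            # Each center is the mean of the data weighted by membership^m.
            weights = [row[i] ** m for row in u]
            total = sum(weights)
            weighted = sum(w * x for w, x in zip(weights, values))
            new_centers.append(weighted / total if total > 0 else centers[i])
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    u = [memberships_for(x, centers, exponent) for x in values]
    return centers, u


# Hypothetical normalized intensities: three dark and three bright pixels.
intensities = [0.10, 0.12, 0.15, 0.80, 0.85, 0.90]
centers, u = fuzzy_c_means(intensities)
print(centers)
print([row.index(max(row)) for row in u])  # hard label per value
```

Unlike a hard clustering, every value keeps a graded membership in both clusters; a hard label is obtained only at the end by taking the cluster with the largest membership.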
V. EXPERIMENTAL RESULTS AND DISCUSSION
The ROC curve for analysing the classifier
performance of the proposed automatic KI-67
counting system is shown in Fig 3. The fuzzy
local c-means clustering algorithm proposed
here is an efficient algorithm for grouping
homogeneous cells. We use the fuzzy local
c-means clustering technique to improve the
efficiency of cell segmentation. As shown in the
figure, the fuzzy local c-means clustering
technique is much more efficient than other
clustering techniques such as mean shift
clustering, which has high time consumption
due to active contour segmentation. The fuzzy
local c-means clustering technique gives the
best result for overlapped data sets and performs
comparatively better than the mean shift
clustering algorithm.
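
For readers who want to reproduce this kind of evaluation, the sketch below shows how the points of an ROC curve and the area under it can be computed from classifier scores and ground-truth labels. The scores and labels are hypothetical and are not the results of this paper.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs.

    scores: classifier outputs, higher means more likely positive
    labels: ground truth, 1 for positive and 0 for negative
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    points = [(0.0, 0.0)]
    # Lower the decision threshold step by step over the observed scores.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical scores for six cells and their manual annotations.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)
print(points)
print(auc(points))
```

A curve that rises quickly toward the top-left corner, and therefore has an area close to 1, indicates that the classifier separates the two classes well at most thresholds.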
VI. CONCLUSION AND SCOPE OF THE WORK
This paper presents several methods and
applications aimed at different types of analyses
of histological samples, with a focus on
neuroendocrine tumor diagnosis, with the
purpose of complementing the role of clinical
and research pathologists and biomarker
researchers. The significance of accurate
estimation of the KI-67 proliferation index
follows from the grading scheme: NETs are
separated into three prognostically significant
grades: low, intermediate, and high. Low-grade
(grade 1) NETs have a mitotic count of less than
two mitoses/ten high-powered fields (HPF)
and/or a KI-67 index of less than 3%.
Intermediate-grade (grade 2) NETs have a mitotic
rate of 2–20 mitoses/ten HPF and/or a KI-67
index of 3–20%. High-grade (grade 3) NETs are
morphologically anaplastic, and often show
mitotic rates greater than 20 mitoses/ten HPF
and a KI-67 index well above 20%. These
specific cut-offs (3% and 20%) need to be
precisely determined in order to differentiate
low, intermediate, and high grades of NETs.
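
The cut-offs above translate directly into a small decision rule. The sketch below is an illustration only: the function name is hypothetical, and it uses the KI-67 index alone, whereas the grading described above also takes the mitotic count into account.

```python
def net_grade_from_ki67(ki67_percent: float) -> int:
    """Map a KI-67 index in percent to NET grade 1, 2 or 3.

    Uses only the KI-67 cut-offs quoted in the text (3% and 20%);
    the mitotic count is deliberately ignored in this sketch.
    """
    if not 0.0 <= ki67_percent <= 100.0:
        raise ValueError("KI-67 index must lie between 0 and 100 percent")
    if ki67_percent < 3.0:
        return 1   # low grade
    if ki67_percent <= 20.0:
        return 2   # intermediate grade
    return 3       # high grade


for value in (2.9, 3.0, 20.0, 20.1):
    print(value, net_grade_from_ki67(value))
```

Values just below and just above a cut-off fall into different grades, so a counting error of a fraction of a percent near 3% or 20% is enough to change the result.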
In this work, an efficient fuzzy c-means
clustering technique for KI-67 scoring of
digitized NET images is proposed. The novel
cell detection algorithm can efficiently and
accurately detect thousands of cells on a
digitized NET image with KI-67 staining.
Furthermore, a three-stage learning-based
approach is designed to differentiate tumor cells
from nontumor cells, and immunopositive from
immunonegative tumor cells, for an automatic,
accurate, and robust quantification of the KI-67
proliferation index. Because the proposed cell
detection algorithm indiscriminately detects
both tumor and nontumor cells, differentiation
between tumor and nontumor cells is the critical
step for accurate automatic KI-67 scoring. This
scheme is much more accurate than existing
methods and exhibits high performance. Future
work is to improve the efficiency of the cell
segmentation algorithm. This work can
contribute to research in the field of
computer-assisted diagnosis in digital
histopathology.
REFERENCES
[1] Fuyong Xing, Hai Su, Janna Neltner, and Lin Yang, “Automatic KI-67
counting using robust cell detection and online dictionary learning,”
IEEE Trans. Biomed. Imag., vol. 61, no. 3, Mar. 2014.
[2] Uzma Nabi, A. H. Nagi, and Waqas Sami, “KI-67 proliferating index
and histological grade, type and stage of colorectal carcinoma.”
[3] W. Jonat and N. Arnold, “Is the KI-67 labelling index ready for
clinical use?” Ann. Oncol., vol. 22, no. 3, pp. 500–502, 2011.
[4] C. G. Loukas, G. D. Wilson, B. Vojnovic, and A. Linney, “An image
analysis-based approach for automated counting of cancer cell nuclei in
tissue sections,” Cytometry A, vol. 55, no. 1, pp. 30–42, 2003.
[5] J. Byun, M. R. Verardo, B. Sumengen, G. Lewis, B. S. Manjunath, and
S. K. Fisher, “Automated tool for the detection of cell nuclei in
digital microscopic images: application to retinal images,” Mol. Vis.,
vol. 12, pp. 949–960, 2006.
[6] B. Potocnik, F. Solina, and D. Zazula, “Classical image processing
vs. computer vision techniques in computer assisted and automated
detection of follicles in ultrasound images of ovary,” Image Processing
and Its Applications, 6th International Conference, vol. 2,
pp. 551–555, 1997.
[7] X. Qi, F. Xing, D. J. Foran, and L. Yang, “Robust segmentation of
overlapping cells in histopathology specimens using parallel seed
detection and repulsive level set,” IEEE Trans. Biomed. Eng. (TBME),
vol. 59, no. 3, pp. 754–765, 2012.
[8] A. Hafiane, F. Bunyak, and K. Palaniappan, “Fuzzy clustering and
active contours for histopathology image segmentation and nuclei
detection,” in Adv. Concepts Intell. Vis. Syst. (ACIVS), 2008,
vol. 5259, pp. 903–914.
[9] Adam M. Mougey and Douglas G. Adler, “Neuroendocrine tumors: review
and clinical update,” Nov. 2007.
[10] S. Fasanella, E. Leonardi, C. Cantaloni, C. Eccher, I. Bazzanella,
D. Aldovini, E. Bragantini, L. Morelli, L. V. Cuorvo, A. Ferro,
F. Gasperetti, G. Berlanda, P. D. Palma, and M. Barbareschi,
“Proliferative activity in human breast cancer: KI-67 automated
evaluation and the influence of different KI-67 equivalent antibodies,”
Diagn. Pathol., vol. 6, no. S1, 2011.
[11] Z. M. Mohammed, D. C. McMillan, B. Elsberger, J. J. Going,
C. Orange, E. Mallon, J. C. Doughty, and J. Edwards, “Comparison of
visual and automated assessment of KI-67 proliferative activity and
their impact on outcome in primary operable invasive ductal breast
cancer,” Br. J. Cancer, vol. 106, no. 2, pp. 383–388, 2012.
[12] L. D. Cohen, “On Active Contour Models and Balloons,” CVGIP: Image
Understand., vol. 53, no. 2, pp. 211–218, 1991.
[13] C. M. Van Der Loos, O. J. De Boer, C. Mackaaij, L. T. Hoekstra,
T. M. Van Gulik, and J. Verheij, “Accurate quantitation of
ki67-positive proliferating hepatocytes in rabbit liver by a multicolor
immunohistochemical (IHC) approach analyzed with automated tissue and
cell segmentation software,” J. Histochem. Cytochem., vol. 61, no. 1,
pp. 11–18, 2013.
[14] J. R. Mansfield, “Cellular context in epigenetics: quantitative
multicolour imaging and automated per-cell analysis of miRNAs and their
putative targets,” Methods, vol. 52, pp. 271–280, 2010.
[15] L. H. Tang, M. Gonen, C. Hedvat, I. M. Modlin, and D. S. Klimstra,
“Objective quantification of the ki67 proliferative index in
neuroendocrine tumors of the gastroenteropancreatic system: a
comparison of digital image analysis with manual methods,” Am. J. Surg.
Pathol., vol. 36, no. 12, pp. 1761–1770, 2012.