SIIM 2016 Scientific Session
The Reading Room: Workflow and Efficiency
Wednesday, June 291:15 pm – 2:45 pm
A Framework for Automatic Detection of Relevant Prior Imaging Studies
Dehgan M. Ehsan, PhD, Philips Healthcare; Yuechen Qian, PhD; Gabe Mankovich; Vadiraj Hombal; Thusitha
Mabotuwana; Paul J. Chang, MD, FSIIM
Hypothesis
The semantic information included in DICOM attributes of an imaging study can be used to estimate the
relevance of a prior imaging study to the current one.
Information contained in the DICOM tags may be used to better infer both the modality and the body-part
corresponding to a study.
Introduction
For a radiologist reading an imaging study, the information contained in the relevant prior imaging studies is
of critical importance. To find the most recent imaging study as baseline, the radiologist must search through
several prior studies, which is a tedious and time-consuming task, especially for oncology patients with
numerous imaging studies. Automatic detection of the most relevant prior imaging study can save significant
time for the radiologist.
What is relevant and informative depends on the current study that the radiologist is reading:
• In some cases the match is exact: the most relevant prior is the most recent study that concerns
exactly the same specific anatomy, affliction, and modality as the current study. This is a strict,
direct lexical match of modalities and anatomies.
• In other cases, the radiologist may be interested in any prior studies that are related to (but not
necessarily identical to) the current body-part, or in studies with different modalities focused on
related anatomies and regions of interest.
Current solutions detect relevant priors based on the naïve matching of the lexical content in DICOM tags
such as MODALITY (0008, 0060) and BODYPARTEXAMINED (0018, 0015). However, in cases where there is
no explicit lexical match between a prior study and the current study, such methods fail to establish
relevancy even if the prior study contains a related body-part or modality.
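The limitation of the naïve lexical baseline can be illustrated with a minimal sketch. It assumes each study is a dictionary keyed by DICOM keyword, which is a hypothetical representation (the source does not specify one), and the function name is illustrative only.

```python
def lexical_match(current, prior):
    """Naive baseline: both tags must be present and identical (case-insensitive)."""
    def norm(value):
        return (value or "").strip().upper()

    for key in ("Modality", "BodyPartExamined"):
        a, b = norm(current.get(key)), norm(prior.get(key))
        if not a or a != b:
            return False
    return True


current = {"Modality": "CT", "BodyPartExamined": "KIDNEY"}
prior_same = {"Modality": "ct", "BodyPartExamined": "Kidney "}
prior_related = {"Modality": "US", "BodyPartExamined": "ABDOMEN"}

print(lexical_match(current, prior_same))     # True
print(lexical_match(current, prior_related))  # False, although the anatomy is related
```

The second prior is rejected even though an abdominal ultrasound may well be informative for a kidney CT, which is the failure case described above.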
We propose a solution that reliably establishes relevance between both identical and related studies. In
contrast to lexical matching, the comparison of studies is based on body-part trees that are inferred from
information in DICOM tags. The body-part tree provides a broader context for comparison between studies.
We propose a body-part similarity score to quantify the body-part match between two given imaging
studies.
In addition to the body-part tree-based similarity score, we propose a statistical lexical matcher as a
surrogate for non-narrative DICOM tags.
We also present a relevance score for prior studies that combines the inferred modality match with the
body-part similarity score.
The relevance scores are then used to rank all prior imaging studies. This ranking system is utilized to identify
the most relevant prior imaging study with respect to the current study.
We compare our algorithm with a baseline method that is based on naïve lexical matching of the MODALITY
(0008, 0060) and BODYPARTEXAMINED (0018, 0015) tags.
Methods
In our approach, study relevance is established based on the inferred modality of an imaging study and its
inferred body-part. First, a body-part similarity score between two studies is computed. Then, the
computed body-part score is combined with the modality match to obtain a relevance score between studies.
Two inferences are critical to our approach: modality and body-part.
In most cases, the modality of a study can be unambiguously identified using the modality DICOM tag.
However, identifying the corresponding body-part is non-trivial due to inconsistent and varied information
recording practices across practitioners and institutions. We use the BODYPARTEXAMINED (0018, 0015)
and STUDYDESCRIPTION (0008, 1030) DICOM tags to illustrate the inference of body-part in the proposed
approach.
STUDYDESCRIPTION Tag: The narrative, unstructured text in the STUDYDESCRIPTION tag is automatically
tokenized, and each token is mapped against an ontology that accounts for synonyms and abbreviations.
This generates a tree of the body part, along with the corresponding SNOMED CT (CT, 2015) and
RadLex (Radlex, 2015) concept IDs if available. For example, Figure 1 depicts the branches of the body part
tree for a case in which the text in the study description is “US Kidney and Liver”. As can be seen in the
figure, the body part tree consists of two main branches of differing lengths, and each branch is ordered so
that the highest node of the branch corresponds to the most generic description of the body part.
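The tokenize-and-map step can be sketched as follows. The ontology here is a tiny hypothetical stand-in (the `PARENT` and `SYNONYMS` tables and all function names are assumptions made for illustration); the actual ontology, and its SNOMED CT and RadLex concept IDs, are not reproduced.

```python
import re

# Hypothetical toy ontology: term -> parent term (None marks the most generic node).
PARENT = {
    "abdomen": None,
    "urinary system": "abdomen",
    "kidney": "urinary system",
    "liver": "abdomen",
}
# Hypothetical synonym/abbreviation table.
SYNONYMS = {"renal": "kidney", "hepatic": "liver", "abd": "abdomen"}


def branch(term):
    """Return the path from the most generic node down to `term`."""
    path = []
    while term is not None:
        path.append(term)
        term = PARENT[term]
    return path[::-1]


def body_part_tree(description):
    """Tokenize a study description and return one branch per recognized body part."""
    tokens = re.findall(r"[a-z]+", description.lower())
    terms = []
    for tok in tokens:
        tok = SYNONYMS.get(tok, tok)
        if tok in PARENT and tok not in terms:
            terms.append(tok)
    return [branch(t) for t in terms]


print(body_part_tree("US Kidney and Liver"))
# [['abdomen', 'urinary system', 'kidney'], ['abdomen', 'liver']]
```

As in the example from the text, the result has two branches of differing lengths, each starting at the most generic node. This sketch only matches single-word tokens; multi-word terms would need phrase matching.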
Two studies are compared by matching the branches of their respective body-part trees. The branch
comparisons are based on similarity scores that are sensitive to the depth of match. For example, the score
for comparison between two branches, both containing Kidney → Urinary system → Abdomen branches, is
higher than a comparison in which one branch contains Renal Artery → Urinary system → Abdomen, while the
other contains Renal Artery → Urinary system → Abdomen or Liver → Abdomen.
The body part similarity score between studies is then computed by aggregating the branch scores.
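The exact scoring and aggregation functions are not given in the text, so the following is only one possible depth-sensitive formulation. It assumes that a branch score is the length of the shared generic-to-specific prefix divided by the length of the longer branch, and that aggregation averages each branch's best match; both choices are assumptions.

```python
def branch_score(a, b):
    """Depth-sensitive score: shared prefix length over the longer branch length."""
    shared = 0
    for x, y in zip(a, b):
        if x != y:
            break
        shared += 1
    return shared / max(len(a), len(b))


def tree_score(tree_a, tree_b):
    """Aggregate branch scores: mean of the best match for each branch of tree_a."""
    if not tree_a or not tree_b:
        return 0.0
    best = [max(branch_score(a, b) for b in tree_b) for a in tree_a]
    return sum(best) / len(best)


# Branches are ordered from the most generic node to the most specific one.
kidney = ["abdomen", "urinary system", "kidney"]
renal_artery = ["abdomen", "urinary system", "renal artery"]
liver = ["abdomen", "liver"]

print(branch_score(kidney, kidney))        # 1.0
print(branch_score(kidney, renal_artery))  # shares two of three levels
print(branch_score(kidney, liver))         # shares only the top level
```

Under these assumptions, an identical branch scores highest, a sibling within the urinary system scores lower, and a branch that only shares the abdomen scores lowest.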
BODYPARTEXAMINED Tag: In studies for which the STUDYDESCRIPTION tag (0008,1030) does not contain
information about the scanned body part, we tokenize the text in the BODYPARTEXAMINED tag (0018,0015)
and use the Cosine Distance between the Term Frequency–Inverse Document Frequency (TF-IDF)
(Rajaraman & Ullman, 2011) representations of the tokens to calculate the body-part similarity score
between studies.
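A minimal pure-Python sketch of this fallback is shown below. It assumes whitespace tokenization and a smoothed IDF, neither of which is specified in the text, and it reports cosine similarity (one minus the cosine distance) so that higher values mean a closer match.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of tag strings."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed IDF (an assumption) keeps every weight strictly positive.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in df}
    return [{t: c * idf[t] for t, c in Counter(toks).items()} for toks in tokenized]


def cosine_similarity(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


tags = ["CHEST", "CHEST ABDOMEN", "KNEE"]
vecs = tfidf_vectors(tags)
print(cosine_similarity(vecs[0], vecs[1]))  # partial overlap, between 0 and 1
print(cosine_similarity(vecs[0], vecs[2]))  # 0.0, no shared tokens
```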
Study Relevance Score: Finally, given the body part similarity score between two given studies A and B, their
relevance score is computed using:
\[
S = \begin{cases} \alpha s & \text{if } \mathrm{MODALITY}(A) = \mathrm{MODALITY}(B) \\ s & \text{otherwise} \end{cases} \qquad (1)
\]
where s is the body part similarity score between A and B, and α > 1 is a modality matching parameter
that is determined experimentally.
The studies are sorted based on their relevancy scores and dates. To minimize false negatives, the top three
studies in the list are chosen as the best prior matches to the current study.
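The relevance score of Equation (1) and the top-three selection can be sketched as follows. The value of α is a placeholder (the experimentally determined value is not reported), and using the study date only to break ties between equal relevance scores is an assumption about how scores and dates are combined.

```python
from datetime import date

ALPHA = 2.0  # placeholder; the source only states that alpha > 1


def relevance(s, modality_a, modality_b, alpha=ALPHA):
    """Equation (1): boost the body-part similarity s when modalities match."""
    return alpha * s if modality_a == modality_b else s


def top_priors(current_modality, priors, k=3):
    """Rank priors by relevance, then by recency (assumed tie-break); keep the top k."""
    scored = [
        (relevance(p["similarity"], current_modality, p["modality"]), p["date"], p["id"])
        for p in priors
    ]
    scored.sort(key=lambda t: (t[0], t[1]), reverse=True)
    return [study_id for _, _, study_id in scored[:k]]


# Hypothetical priors with precomputed body-part similarity to a current CT study.
priors = [
    {"id": "p1", "modality": "CT", "similarity": 0.5, "date": date(2015, 1, 10)},
    {"id": "p2", "modality": "US", "similarity": 1.0, "date": date(2015, 6, 1)},
    {"id": "p3", "modality": "CT", "similarity": 1.0, "date": date(2014, 3, 5)},
    {"id": "p4", "modality": "MR", "similarity": 0.2, "date": date(2016, 1, 1)},
]
print(top_priors("CT", priors))  # ['p3', 'p2', 'p1']
```

Here p3 ranks first because it matches in both body part and modality, while p2 and p1 tie on relevance and the more recent p2 is listed first.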
Results
De-identified exam data and metadata, including radiology reports, were obtained from the University of
Chicago Medical Center under IRB (13-0379). The reports contain a “Comparison” section that refers to
the order date of a prior study that was used to compare the findings of the current study. The prior study to
which the comparison date in the current study refers was marked as the most relevant study. Instances
in which more than one prior study has the same order date were ignored. The resulting 8789 pairs of
current study and corresponding prior study were used as the ground truth.
The performance of the methodology is evaluated using two metrics. We define “Soft Accuracy” as the
percentage of cases in which the ground truth relevant prior is one of the top 3 studies selected by our
algorithm. We define “Hard Accuracy” as the percentage of cases in which the ground truth relevant prior
is the top study among the selected top 3. The algorithm is tested on all the studies with available ground
truth. The algorithm performance is summarized in Table 1. The benchmark presented in that table refers to
lexical matching of content in the Modality and Body Part tags. As can be seen in the table, matching of
studies using the proposed method provides a significant improvement over the lexical matching approach.
Table 1: Performance of our matching algorithm
Accuracy (%)    Benchmark    Our Method
Hard            68.4         79.9
Soft            76.4         91.6
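The hard and soft accuracy definitions used for Table 1 can be expressed as a short sketch. The data structures (a ranked list of prior IDs per current study, and a mapping to the ground truth prior) and the toy values are hypothetical and unrelated to the reported results.

```python
def accuracies(ranked, ground_truth, k=3):
    """Return (hard, soft) accuracy in percent over all studies with ground truth."""
    hard = soft = 0
    for study_id, truth in ground_truth.items():
        top_k = ranked.get(study_id, [])[:k]
        if top_k and top_k[0] == truth:
            hard += 1  # ground truth prior is ranked first
        if truth in top_k:
            soft += 1  # ground truth prior is anywhere in the top k
    n = len(ground_truth)
    return 100.0 * hard / n, 100.0 * soft / n


# Toy example with four current studies.
ranked = {
    "s1": ["a", "b", "c"],
    "s2": ["d", "e", "f"],
    "s3": ["g", "h", "i"],
    "s4": ["j", "k", "l"],
}
ground_truth = {"s1": "a", "s2": "d", "s3": "i", "s4": "z"}
print(accuracies(ranked, ground_truth))  # (50.0, 75.0)
```

By construction, soft accuracy is never lower than hard accuracy, which is consistent with the values in Table 1.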
Discussion
In this study, we used the STUDYDESCRIPTION tag to obtain information about the scanned body parts in the
studies. Although this tag was always available and reliable in our data set, it may be absent or contain
irrelevant information in datasets acquired at other institutions. In such cases, a combination of other DICOM
tags such as SERIES DESCRIPTION (0008, 103E), PROTOCOL NAME (0018,1030), REQUESTED PROCEDURE
DESCRIPTION (0032,1060), etc. can be used to obtain body part information. In our body part detection
algorithm, the input is given as a string. Therefore, the contents of all or some of these tags can be
concatenated and given as input to our algorithm.
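A minimal sketch of this concatenation is given below, assuming that a study's attributes are available as a dictionary keyed by DICOM keyword. The function name, the tag order, and the de-duplication of repeated values are assumptions rather than details from the described system.

```python
# DICOM keywords for the descriptive tags mentioned above.
DESCRIPTIVE_TAGS = (
    "StudyDescription",               # (0008,1030)
    "SeriesDescription",              # (0008,103E)
    "ProtocolName",                   # (0018,1030)
    "RequestedProcedureDescription",  # (0032,1060)
)


def body_part_input(study):
    """Concatenate the non-empty descriptive tags into one input string."""
    parts = []
    for key in DESCRIPTIVE_TAGS:
        value = (study.get(key) or "").strip()
        if value and value not in parts:  # skip empty and repeated values
            parts.append(value)
    return " ".join(parts)


study = {
    "StudyDescription": "",
    "SeriesDescription": "AX T2 LIVER",
    "ProtocolName": "ABDOMEN MRI",
}
print(body_part_input(study))  # AX T2 LIVER ABDOMEN MRI
```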
Conclusion
We presented a framework to detect the most relevant prior imaging study using body part information
extracted from DICOM tags. The proposed framework demonstrated hard and soft accuracy measures of
79.9% and 91.6%, respectively, compared to hard and soft accuracy measures of 68.4% and 76.4% achieved
by matching the body part and modality DICOM tags.
References
1. CT, S. (2015, 09 03). SNOMED. Retrieved from
http://www.nlm.nih.gov/research/umls/Snomed/snomed_main.html
2. Radlex. (2015, 09 03). Radlex.org. Retrieved from http://www.radlex.org/
3. Rajaraman, A., & Ullman, J. D. (2011). "Data Mining", Mining of Massive Datasets. Cambridge:
Cambridge Books Online.
Keywords
Radiology Workflow, DICOM, Relevant Study, Imaging Studies