Annotating and Integrating Pathology Images: Why Bother?!

Jules J. Berman, Ph.D., M.D. and Bruce A. Friedman, M.D.
December 7, 2004

The Association for Pathology Informatics (API) has recently embarked on a project to develop a data exchange specification for pathology images. The effort currently has 30 participants working in 12 task groups and is expected to take 3-5 years to complete [1]. The sole purpose of the project is to provide the pathology community with a uniform way of annotating and exchanging pathology images.

When undertaking a new technical project, it's worthwhile taking stock of reality and asking if it's really worth the bother. After all, there are plenty of standard image formats (maybe too many), we all know how to send images around by email, and many of us have already figured out how to include images in surgical pathology reports.

Basically, the impetus for the project relates to our expectations of the growing importance of data annotation and data integration. In particular, it relates to a critical role for pathologists as the only professionals who can unite research datasets with clinical datasets [related to pathology specimens].

Despite the long-heralded arrival of a revolution in biomedical science, the pace of medical progress has slowed in the past decade. This is the opinion of the FDA and of others, based on counting new tests, therapies, and diagnostic devices [2-5]. There doesn't seem to be any slowdown in fundamental scientific discoveries, but there is a real disconnect between discovery and clinical implementation. What seems to be lacking is a way of quickly and efficiently validating candidate markers and tests, and their therapeutic effects, using datasets of pathology specimens.

When you start talking about connecting research data with clinical data, you're opening the big issue of data integration. Data integration occurs when you can sensibly relate [e.g.
retrieve, compare, analyze, merge] data of one type with data of another type. In order to do these things, the data needs to be annotated. Data annotation involves making sure that every piece of data in a record is accompanied by another set of information that describes the data (so-called metadata, or data about the data). Once data has been annotated, it can be associated with other, related data [data integration], even when the other data is found in a seemingly unrelated database. In the past few years, biomedical informatics has transformed into the science that derives biomedical value from computations performed on annotated databases [6].

This brings us back to pathology image annotation. The data exchange specification will conform to the same kind of data annotation/integration that biologists, rocket scientists and even businessmen have come to trust: eXtensible Markup Language (XML). The specification will include descriptors for the specimen and the manner in which the specimen was prepared, the image acquisition devices, the binary representations of the image, clinical/pathologic information, and information related to confidentiality, intellectual property and authenticity of the image [1]. When an image file describes itself completely [and that's really our goal], the image becomes a database that is keyed to a particular specimen. Miraculously, this image-data-object has properties that are of immense value to image vendors, pathologists, students, and researchers.

Image vendors can keep their proprietary formats and still assure their customers that the captured images can be exported to collaborators who use different systems. All they need to do is write a simple program that ports their image into the image exchange specification. Since the image exchange specification will have well-defined descriptors for the logical parts of any image, including the binary, this should be easy.
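
To make the idea of a self-describing image concrete, here is a minimal sketch in Python of what such a port might look like. The specification itself is still being developed, so every element name, attribute name, and value below (pathology_image, specimen_id, acquisition_device, and so on) is a hypothetical stand-in rather than part of any real schema, and the "image" is just a few placeholder bytes. The sketch wraps the binary and its descriptors in one XML record, sends it through a text round trip, and then relates it to a separate research dataset through the shared specimen identifier.

```python
import base64
import xml.etree.ElementTree as ET


def build_image_record(specimen_id, stain, device, diagnosis, image_bytes):
    """Wrap raw image bytes and their descriptors in a single XML element.

    All element and attribute names are hypothetical; the real
    specification may define entirely different ones.
    """
    record = ET.Element("pathology_image", {"specimen_id": specimen_id})
    specimen = ET.SubElement(record, "specimen")
    ET.SubElement(specimen, "stain").text = stain
    ET.SubElement(record, "acquisition_device").text = device
    ET.SubElement(record, "diagnosis").text = diagnosis
    # XML is text, so the binary is carried as base64.
    binary = ET.SubElement(record, "binary", {"encoding": "base64"})
    binary.text = base64.b64encode(image_bytes).decode("ascii")
    return record


def extract_binary(record):
    """Recover the viewable image bytes from an annotated record."""
    return base64.b64decode(record.find("binary").text)


def integrate(record, research_rows):
    """Return the research rows that share the image's specimen identifier."""
    key = record.get("specimen_id")
    return [row for row in research_rows if row["specimen_id"] == key]


# Placeholder bytes standing in for a vendor's exported image.
fake_image = b"placeholder image bytes"
record = build_image_record(
    "S-0001", "H&E", "hypothetical scanner", "adenocarcinoma", fake_image
)

# Serialize to text and parse again, mimicking exchange between two systems.
xml_text = ET.tostring(record, encoding="unicode")
received = ET.fromstring(xml_text)

# A made-up research dataset kept in a seemingly unrelated place.
research_rows = [
    {"specimen_id": "S-0001", "marker": "marker_A", "level": 2.4},
    {"specimen_id": "S-0002", "marker": "marker_A", "level": 0.3},
]
matches = integrate(received, research_rows)
```

The receiving system needs no knowledge of the vendor's native format: the descriptors travel with the binary, and the specimen identifier is the only thing the two datasets have to agree on.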
If the image exchange specification is widely adopted, pathologists won't need to worry anymore about vendor "lock-in." Pathologists can use the data exchange specification to strip relevant parts [i.e. the viewable binary] from the file and insert them into a pathology report. The pathologist can also send the fully annotated image file, or any part of the file, to other pathologists for the purpose of consultation or collaboration. Real-time messaging of the image file can be used in telepathology.

Collections of image files can be merged into a teaching database. Clinical and pathologic annotations will greatly enhance the didactic value of the images. Databases using the specification can easily be merged into mega-image databases.

Perhaps most importantly, the image file can be integrated with biologic datasets, making it possible to discover or validate relationships between genomic, proteomic or metabolomic expression patterns and morphologic/pathologic/clinical features. The availability of large numbers of annotated images keyed to specimens provides new opportunities for medical advancement.

So, is it worth the bother? It depends on your view of the future.

References:

1. Laboratory Digital Imaging Project. [www.pathologyinformatics.org/ldip.html].
2. Innovation or Stagnation: Challenge and Opportunity on the Critical Path to New Medical Products. U.S. Department of Health and Human Services, Food and Drug Administration (2004).
3. Anderson NL, Anderson NG. The human plasma proteome: history, character and diagnostic prospects. Mol. Cell. Proteomics 1, 845-867 (2002).
4. Benowitz S. Biomarker boom slowed by validation concerns. J. Natl. Cancer Inst. 96(18), 1356-1357 (2004).
5. Evans WE, Relling MV. Pharmacogenomics: translating functional genomics into rational therapeutics. Science 286, 487-491 (1999).
6. Berman JJ. Pathology data integration with XML. In press, Human Pathology.