Indiana 21st Century Research and Development Fund Quarterly Report
Indiana Telemedicine Incubator (ITI)
Purdue University Department of Computer Science
June 1, 2001

1) Estimate of jobs created since your project started to July 1, 2001 (i.e., include planned hiring for the first half of next year):

In total there are 35 people working on the Telemedicine project. The majority of the researchers are funded through other grants. Funded specifically through the 21st Century funds are: three FTE programmers, two staff, a consultant, two researchers, and six graduate research assistants, one of whom is a doctor working on his Masters in Computer Science.

2) Changes in partnerships--new partners, changes in focus, major new approaches, surprising new results:

1. Partnership with Professor Mathur on a project for Assisted Living. The partnership includes Purdue University and, initially, one Lafayette nursing home.
2. Partnership with young Purdue entrepreneur Sid Rao in the development of PhysioChart. PhysioChart is a handheld device that will not only store patient notes and prescriptions but also retrieve updates from patient monitors in real time.
3. In total there have been over 30 company contacts in the last quarter. The contacts range from hospitals, to doctors, to small software companies, to video imaging companies, to entrepreneurs seeking help in advancing their ideas.

3) Submission of Federal or other proposals. Other leveraging activities:

The ITI researchers have been very active in submitting proposals to industrial and governmental agencies. These grants are directly related to our 21st Century funded project. The group also has pending proposals and grant opportunities with the National Science Foundation and other corporations and research activities totaling more than $10,000,000.

4) New patent activity. Papers, publications, presentations:

A provisional patent is pending for the concept of the PhysioChart (Rao and Elmagarmid).
Following is a list of the publications and seminars prepared by members of the ITI research group:

Publications:

1. A. Bougettaya, B. Benatallah, and A. Elmagarmid. A Database Centric Infrastructure for Modeling and Querying Application Web Services. Submitted for publication.
2. J. Fan, D. Yau, W. Aref, and A. Elmagarmid. “Accessing Video Contents through Key Objects over IP,” IEEE Transactions on Multimedia, 2000.
3. Ahmed K. Elmagarmid, Jianping Fan, Mohand-Said Hacid, and Farouk Toumani. Discovering Structural Associations in Video Databases. Submitted to ACM Multimedia Journal.
4. Elisa Bertino, Ahmed K. Elmagarmid, and Mohand-Said Hacid. A Logical Approach to Quality of Service Specification in Video Databases. Submitted to VLDB Journal.
5. Elisa Bertino, Ahmed K. Elmagarmid, and Mohand-Said Hacid. A Knowledge-Based Approach to Visual Information. Submitted to Journal of Intelligent Information Systems.
6. Verykios, V.S., Elmagarmid, A.K., Bertino, E., Dasseni, E., and Saygin, Y. Association Rule Hiding. Submitted to IEEE Transactions on Knowledge and Data Engineering.
7. W.G. Aref, M.G. Elfeky, and A.K. Elmagarmid. “Incremental, Online, Merge Mining of Partial Periodic Patterns in Time-Series Databases.” Submitted to Data Mining and Knowledge Discovery Journal.
8. Elisa Bertino, Tiziana Catarci, Ahmed K. Elmagarmid, and Mohand-Said Hacid. A Database Approach to QoS Management in Video Databases. Submitted to VLDB ’01.
9. Ahmed K. Elmagarmid, Jianping Fan, Mohand-Said Hacid, and Farouk Toumani. Discovering Structural Associations in Video Databases. Submitted to ACM MM ’01.
10. Ahmed K. Elmagarmid and Mohand-Said Hacid. A Constraint System for Ontologies. Submitted to the International Conference on Conceptual Structures (ICCS ’01).
11. Elisa Bertino, Ahmed K. Elmagarmid, and Mohand-Said Hacid. Ordering and Path Constraints over Semistructured Data. Submitted to VLDB ’01.
12. Ahmed K. Elmagarmid, Mohand-Said Hacid, and Farouk Toumani.
An Access and Specification Language for Ontologies. Submitted to VLDB ’01.
13. Edoardo Ardizzone, Ahmed K. Elmagarmid, Jianping Fan, and Mohand-Said Hacid. Semantic Modeling for Video Browsing Systems. Submitted to IEEE TKDE.
14. Ahmed K. Elmagarmid, Mohand-Said Hacid, and Evimaria Terzi. A Framework for Appropriate Query Answers in XML. Submitted to WebDB ’01.
15. V. Verykios, A. Elmagarmid, and G. Moustakides. Cost Optimal Record/Entity Matching. Submitted to KDD-2001.
16. A. Vakali, E. Terzi, and A. Elmagarmid. “Representation and Storage Modeling in Multimedia Systems,” Journal of Applied Systems Studies, Special Issue on Distributed Multimedia Systems with Applications, Volume 2, Number 3, to appear in fall 2001.
17. A. Vakali and E. Terzi. “Video Data Storage Policies: An Access Frequency Based Approach,” Computers & Electrical Engineering Journal, Elsevier, accepted 2001, to appear.
18. A. Vakali and E. Terzi. “A Java-based Model for I/O Scheduling in Tertiary Storage Subsystems,” International Journal of Computers and Applications, ACTA Press, accepted 2001, to appear.
19. Mohand-Said Hacid, Evimaria Terzi, and Athena Vakali. Querying XML with Constraints. Accepted for presentation and publication at the Special Session on XML Data Management and Applications, Proceedings of the 2nd International Conference on Internet Computing, June 2001.
20. A. Vakali and E. Terzi. “A Two-Level Representation Model for Effective Video Data Storage,” MIS ’2000, Proceedings of the Sixth International Workshop on Multimedia Information Systems, Chicago, USA, Oct. 2000.
21. A. Vakali and C. Stupa. “A QoS Based Disk Subsystem,” Proceedings of the 6th International Conference on Computers and Their Applications (CATA 2001), March 2001.
22. R. Chari and S. Prabhakar. Prefix Caching and Replication: Techniques for Large Scale Multimedia Document Storage. Submitted.
23. R. Sion, A. Elmagarmid, S. Prabhakar, and A. Rezgui.
A Database-Centric Approach to Enabling End-to-End QoS for Multimedia Repositories. Submitted.

Seminars and presentations:

ITI sponsors a weekly seminar. The seminar meets on Mondays at 3pm and is attended by all members of the Indiana Telemedicine Incubator. The following is a list of some of the speakers in this seminar series:

1. Professor Ahmed Elmagarmid, Computer Science Department, Purdue University
2. Evimaria Terzi, Computer Science Department, Purdue University
3. Xingquan Zhu, Computer Science Department, Purdue University (post doc on loan from Microsoft China)
4. David Whittinghill, Computer Science Department, Purdue University
5. Junghoo Cho, Stanford University
6. Wu-chi Feng, Department of Computer and Information Science, Ohio State University
7. William Winkler, U.S. Census Bureau

5) Financial breakdown for quarter:

Equipment: $14,954.58
Personnel: $100,779.98
Travel: $4,023.58
Other: $18,886.70
Sub-contracts (partners): $74,768.77
Total: $213,413.61

6) New Science/Technological developments, major steps toward commercializing something, new insights:

Micro Data Base Systems, Inc. (mdbs) of West Lafayette, IN has benefited in the following ways from participation in the Indiana Telemedicine Initiative (ITI):

1. Through ITI, mdbs personnel have learned a great deal about video and multimedia, and have applied this knowledge to mdbs's flagship product, TITANIUM, increasing its market appeal.
2. Specifically through the ITI project, mdbs has developed a video query add-in (“Play” function) to TITANIUM to select clips from within a video stored inside TITANIUM; this is a unique capability not shared by other products in TITANIUM's existing market space.
3. The ITI project has enabled mdbs to add the full-time equivalent of 1 1/3 software engineers to its staff in West Lafayette, IN over the duration of the project.
4.
Via the ITI project, mdbs was able to develop a Sun Solaris UNIX version of its current TITANIUM database product, better enabling competition with UNIX database players such as Oracle and IBM.
5. By providing training to users of TITANIUM at Purdue and other participating organizations, ITI has helped make a larger base of software developers familiar with the TITANIUM product, which helps mdbs's market presence.
6. By providing demonstrations of TITANIUM video technology, the ITI project also helps to publicize TITANIUM's general capabilities, increasing awareness of mdbs products.
7. Knowledge and capabilities gained by mdbs via ITI will be leveraged for other Indiana 21st-Century funded projects, such as ICER, the Indiana Consortium for E-commerce Research.

7) Indications of the importance of the Fund's emphasis on partnerships:

In addition to our current partners, an environment of collaboration has become increasingly evident in the Telemedicine/Medical Informatics field, particularly among 21st Century Fund awardees. Examples of this have manifested in meetings between ourselves and other awardees. Furthermore, the intellectual stimulation produced by the fund, coupled with the opportunity to partner on future 21st Century awards, has created an environment ripe for growth. ITI is excited about the opportunity for future collaborations, and is continually networking and broadening our understanding of the Medical Informatics community in Indiana. As this synergy grows, the Indiana Telemedicine Incubator is poised to take a lead role, especially in the education community, in the high-growth arena of Medical Informatics.

8) Overview of Projects

Below are descriptions of the three areas of the project as outlined in the original proposal to the Indiana 21st Century Research and Development Fund.
The three areas of research applications are:

1) Medical Education - partner(s): Indiana University Medical Education Resource Program, Purdue University School of Veterinary Medicine, and mdbs
2) Clinical Trials - partner: Med Institute
3) Teleconsultation - partner(s): Clarian Health Systems/Methodist Hospital and Greene County Hospital

A portion of the research required to build and develop the aforementioned projects is outlined at the end of this section, following the research applications.

Medical Education: EduMed

Significant progress has been made in the design and implementation of a demonstration prototype system called EduMed. The goal of the EduMed project is to create a trial environment for the ITI Intermed system that targets distance learning. Development of the EduMed framework, system infrastructure, and application interface as a web-based video retrieval system applied to medical education is currently underway. The capabilities of the EduMed system include all features specified in the ITI proposal, including (1) annotation, audio, and content-based video analysis for the indexing of medical video according to semantic content, (2) video and indexing data storage for content-based video management, (3) web-based query, browsing, retrieval, and presentation, and (4) user authorization for secure, customized application access.

The accomplishments of the project are as follows. The design of the EduMed system architecture is complete. The functionality of the system is organized into three major components: (1) the end-user, (2) the front-end servers and associated local video warehouses, and (3) the back-end server with remote video archive. A medical faculty member or student using a web browser on a PC with the RealPlayer video player installed represents the end-user. Our development focuses on providing the services that support components (2) and (3). These components have been designed in a modular fashion.
The operation of each module and the interfaces between the modules are fully specified, and the issues involving their implementation have been fully investigated. The front-end server component is comprised of a security and authorization module, an application and user-specific forms module, and the database interface modules. The database interface modules include the query processing module, the “key frames” presentation module, the video stream interface module, and the remote query module. Application forms and query processing will interface with medical terminology software for user support and “key word” resolution, ensuring compliance with the internal representations of video annotations and indexing.

The test bed for the prototype front-end server is a Sun Solaris machine in Purdue University's Computer Science Department. A dedicated, customized, ITI-specific Apache web server has been installed, along with the mdbs TITANIUM database engine and a RealServer streaming video server with the bitcasting.com MPEG-1 plug-in for MPEG streaming. The local warehouse video and indexing data is stored in TITANIUM. The design and implementation of the database schema to support the front-end component is complete, and test data in the form of medical video segments, key frames, annotations, and key words have been created to populate the database.

The modules that support the functionality of the front-end server are coded as a C application programming interface (API) to the TITANIUM database. The API handles the navigational calls and results handling of the interaction with TITANIUM. Dynamically generated web pages support query submission and results presentation. Secure, user-dependent access modules, based on password and ownership protection, serve as the gateway to the system. User identification determines ‘user profiles’, which are defined as collections of video segments associated with the user; these are presented to the user upon authorized entry to EduMed.
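As a rough sketch of the password-gated profile behavior described above (a minimal illustration; all function names, fields, and data here are hypothetical, not the actual ITI code or the TITANIUM API):

```python
# Sketch of an EduMed-style gateway: password-checked entry that resolves a
# user to a profile (a named collection of video segments). All names and
# data are hypothetical illustrations.
import hashlib

# Hypothetical user store: user -> (salted password hash, profile id)
_USERS = {
    "faculty1": (hashlib.sha256(b"salt:secret").hexdigest(), "cardiology-lectures"),
}

# Hypothetical profile store: profile id -> list of video segment ids
_PROFILES = {
    "cardiology-lectures": ["seg-001", "seg-017", "seg-042"],
}

def login(user: str, password: str):
    """Return the user's profile (segment list) on success, None otherwise."""
    entry = _USERS.get(user)
    if entry is None:
        return None
    pw_hash, profile_id = entry
    if hashlib.sha256(f"salt:{password}".encode()).hexdigest() != pw_hash:
        return None
    return _PROFILES.get(profile_id, [])
```

On authorized entry, the returned segment list would drive the dynamically generated profile page; a failed check keeps the gateway closed.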
Figure 1: High-level diagram of the front-end modules.

The back-end server component is comprised of a security module and the database interface modules. The database interface modules include the query processing module, the “key frames” presentation module, and the video segment transfer module. The security module for the back-end does not manage individual users, but rather is configured to accept connection requests only from certain IP addresses that originate from one of the front-end servers. Query processing for the extraction of video segments from stored video data according to user-specified medical key words is handled entirely by the back-end. This process incorporates the PlayVideo() function developed by mdbs for selection and extraction of video clips from within a video stored inside TITANIUM.

Figure 2: High-level diagram of the back-end modules.

A prototype version of EduMed will be ready by the end of July. The system will (1) provide secure access via user-dependent operation, application, and profile management, (2) enable medical faculty to query the remote archive for video segments associated with medical keywords and store the results in the local warehouse for use in multimedia presentations and lectures, and (3) allow student querying and student access to faculty-designed profile collections to support student research on various medical topics.

Clinical Trials: Med Institute

Introduction: In the area of health care, a large number of images are produced on a daily basis and need to be archived for future reference. Unlike textual data, images are multifaceted and carry a great deal of information. In particular, the contents of a medical image store a myriad of information related to different parts or organs of the human body. The extraction of this information requires robust image processing techniques so that the extracted features are lossless and describe the corresponding object or region in its totality.
The objective is to extract appropriate representations of the contents from a collection of images and to classify the images based on their features and contents. The entire problem of image classification and querying can be divided into two sub-problems: (1) image segmentation and labeling, and (2) feature extraction.

Image segmentation and labeling: Image segmentation involves identifying connected regions that are homogeneous in terms of some features, such as gray level, color, or texture. Prior to segmentation, an image needs to be pre-processed to remove any noise caused by the image capturing system. Various segmentation algorithms, such as histogram thresholding, the SCT/Center segmentation algorithm, and the PCT/median segmentation algorithm, presently exist. These segmentation algorithms are generally application dependent and enhance different regions of the underlying image based on color, texture, and gray level. The segmented image may contain many false objects. To facilitate the search for the objects of interest, morphological filtering is applied to the segmented image. Morphological filtering smoothes out object outlines, fills small holes, and eliminates small projections. Depending on the type of image, the parameters for the morphological filter are selected accordingly; the parameters are chosen so that the geometry of the objects of interest in the resulting image is completely preserved and not distorted. The last phase of image segmentation is labeling the different objects in the image. Labeling corresponds to assigning the same gray-scale value to all pixels within the same object, and different gray-scale values to pixels across different objects. Figure 3 shows the images produced in the process of segmentation and labeling.

Figure 3a. Original Image
Figure 3b. Segmented Image
Figure 3c. Image after applying morphological filtering to segmented image
Figure 3d.
Labeled Image

Feature Extraction: Feature extraction is the most important step toward image classification. An n-dimensional feature vector represents each object in the image. The number and type of features to be extracted are application dependent. In our telemedicine application, we have focused on the following features for a salient object:

1. Maximum and minimum diameter.
2. Centroid.
3. Area.
4. Orientation (axis of least second moment).
5. Perimeter.
6. Thinness.
7. Rotation, scale, translation (RST) invariant features (for rotation, spatial, and translation invariant searches).

Proposed System Architecture: A multi-layered architecture will be developed for this project. Figure 4 illustrates the different layers of the overall system and their interdependencies. Image segmentation is a low-level operation and involves the application of various segmentation techniques to the raw image, as described in the previous section. The feature extraction layer creates a representation of the different objects in the underlying segmented image. The storage plane corresponds to the DBMS that stores the actual images and the corresponding meta-data. The classification and querying layer enables various feature-related searches over the underlying image database; it also facilitates classification of new images based on their feature vectors. The graphical user interface will provide a flexible and user-friendly environment for image classification and querying.

Figure 4. Multi-layered system architecture (layers: Graphical User Interface, Querying/Classification, Storage, Feature Extraction)

Teleconsultation

Dr. James Trippi, Clarian Health Systems and Methodist Hospital, is the principal investigator for the teleconsultation portion of the ITI project. The teleconsultation service has not yet started. The purchase order for the teleconsultation equipment has been sent, with an expected delivery time of 6-8 weeks.
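The labeling and feature-extraction steps described under Clinical Trials above can be illustrated with a small self-contained sketch (pure Python on a toy binary image; the helper names are hypothetical, and the real system operates on gray-scale medical images):

```python
# Sketch: 4-connected component labeling of a binary image, then two of the
# listed features (area and centroid) per labeled object. Toy illustration only.
from collections import deque

def label_objects(img):
    """Label connected regions of a binary image (list of lists of 0/1).
    Returns (label image, number of objects); each object gets an id >= 1."""
    h, w = len(img), len(img[0])
    labels = [[0] * w for _ in range(h)]
    n = 0
    for y in range(h):
        for x in range(w):
            if img[y][x] and not labels[y][x]:
                n += 1                      # found a new object; flood-fill it
                q = deque([(y, x)])
                labels[y][x] = n
                while q:
                    cy, cx = q.popleft()
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w and img[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = n
                            q.append((ny, nx))
    return labels, n

def object_features(labels, n):
    """Area and centroid per labeled object (two of the features listed above)."""
    feats = {k: {"area": 0, "cy": 0.0, "cx": 0.0} for k in range(1, n + 1)}
    for y, row in enumerate(labels):
        for x, k in enumerate(row):
            if k:
                f = feats[k]
                f["area"] += 1
                f["cy"] += y
                f["cx"] += x
    for f in feats.values():               # sum of coordinates -> mean
        f["cy"] /= f["area"]
        f["cx"] /= f["area"]
    return feats
```

The same scan could accumulate perimeter, orientation, and the other listed descriptors to build the full n-dimensional feature vector.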
After the equipment arrives, Clarian IT personnel will make the installations at Methodist Hospital. Debra Pehler, Director of Information Technology for Clarian, will oversee the process. “Patient visits” will begin as soon as the system is operational, with a mid- to late-summer start date anticipated. The preliminary literature search and research for the project "Teleconsultation for Management of Congestive Heart Failure" is complete. The protocol has been reviewed by three other physicians, who all made minor revisions. The bio-statistician has checked the protocol and determined the number of enrolled patients needed for a statistically meaningful study (given previously published results of similar studies). The voluminous information needed to submit an application to the Institutional Review Board (required for any human experimentation) was submitted on Wednesday, May 16th, for the project. It is scheduled for discussion at the July IRB meeting. The "Minnesota Living with Heart Failure Survey" will be used in the research project. The survey is licensed to Methodist Hospital, and the company agrees that the survey can be applied to our study. The hospital administrator of Greene County Hospital, Jonas Uland, has examined the research project and the concept of teleconsultation and has written a letter in support. A nurse for patient care will be hired for Greene County Hospital. The nurse will be trained by a Methodist heart failure nurse practitioner and a Methodist Research Institute research nurse. Dr. Kirlin, Nurse-Practitioner Kari Barron, and Dr. Trippi will be "seeing" patients via teleconsultation in the research protocol. Other doctors in the group will be seeing their previously established patients for revisits. So far the technology has not been a problem; the human elements continue to be the greatest challenge.
Research

Large Scale Multimedia Storage: For physical storage management of multimedia documents, we have designed several novel data placement and scheduling schemes. These schemes are currently being implemented on a Sun E450 server and a Sun A1000 RAID array. Managing large volumes of data necessitates the use of cheap tertiary storage. Due to the very high random access cost of tertiary storage, efficient management of data is critical for performance. We are developing data placement, migration, pre-fetching, caching, and scheduling schemes for the effective retrieval of video from secondary and tertiary storage. Two automated DVD carousels have been acquired to serve as the tertiary storage layer. Each jukebox can hold as many as 200 CD or DVD disks. Integrating these into the storage hierarchy of the prototype is currently underway.

A novel hot prefix-caching scheme has been developed for continuous media placement across the secondary-tertiary boundary. The key idea is to reserve a portion of secondary storage for storing the initial segments of continuous media objects, in lieu of its traditional use as a cache for tertiary storage. These segments serve to mask the extremely high latency of random access to tertiary storage. In order to reduce jitter during playback of documents that are stored on tertiary storage, full replication will be utilized. The proposed schemes are tested using a simulation of the system under conditions of concurrent access. The results show that these two techniques yield significant reductions in startup latency as well as jitter during playback.

Also being investigated are placement schemes for tertiary storage based upon access patterns that show relationships between documents or objects. Popularity-based models have been proposed in which the multimedia (video) data representation guides data placement on a tertiary storage subsystem.
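The hot prefix-caching idea above can be sketched as a simplified model (the latency constants and class names are invented for illustration; the actual implementation and measurements differ):

```python
# Sketch of prefix caching across the secondary/tertiary boundary: the initial
# segment of a "hot" object lives on disk, masking jukebox access latency.
# Latency numbers are illustrative constants, not measured values.
TERTIARY_LATENCY = 10.0   # seconds to position/load a jukebox disk (assumed)
DISK_LATENCY = 0.05       # seconds for a secondary-storage access (assumed)

class PrefixCache:
    def __init__(self, prefix_seconds):
        self.prefix_seconds = prefix_seconds  # length of each cached initial segment
        self.cached = set()                   # objects whose prefix is on disk

    def admit(self, obj_id):
        """Store the initial segment of obj_id on secondary storage."""
        self.cached.add(obj_id)

    def startup_latency(self, obj_id):
        """Playback starts from the disk-resident prefix while the tertiary
        fetch proceeds in the background; otherwise the user waits for the
        jukebox."""
        return DISK_LATENCY if obj_id in self.cached else TERTIARY_LATENCY

    def jitter_free(self, obj_id):
        """Playback avoids a mid-stream stall if the prefix covers the
        tertiary access latency."""
        return obj_id in self.cached and self.prefix_seconds >= TERTIARY_LATENCY
```

This captures the trade-off being simulated: a few seconds of disk space per hot object buys orders-of-magnitude lower startup latency.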
A two-level representation model is considered to capture the frequencies of accesses at the external (video objects) and internal (video clips) levels. The video data placement strategies are evaluated, and the impact of the video data representation model on the overall storage process is investigated. Video data placement is employed on a tertiary storage topology under three well-known placement policies, governed by the Organ-pipe, Camel, and Simulated Annealing algorithms. The latter approach proves to be the most beneficial for overall multimedia system performance.

End-to-End Quality of Service (QoS)

Currently being tested are different approaches for mapping user-specified Quality of Presentation (QoP) parameters to Quality of Service (QoS) requirements for the different system components of the overall VDBMS architecture, including the storage, server, networking, and security subsystems. The implementation of the translation mechanisms will be an integral part of the QoS-based resource scheduling modules, which will be implemented using several dynamic and static approaches. A system architecture, the Quality-of-Service Aware Repository (QuaSAR), that supports user quality-sensitive queries within a database framework has been designed. The proposed architecture relies upon the notion of QoS-aware interfaces to the various components of the system, such as the network layer and the operating system (encompassing CPU, main memory, and disk storage). These interfaces enable real-time determination of the status of the components with respect to the satisfaction of QoS constraints. In addition, these interfaces will support reservation of resources to guarantee the ability to satisfy the user's requested level of quality. A key component of QuaSAR is its enhanced query processing capabilities, in contrast to traditional databases.
Based upon the content component of the query and the content metadata, alternative plans are generated for retrieval of the relevant objects. Each plan is annotated with the QoS parameters relevant to each component, based upon translation of the user's quality parameters for the given plan. Each constraint represented by the annotations is tested through the interfaces, and resources are reserved if necessary. If no feasible plan is found, a negotiation step is invoked to adjust the constraints and re-evaluate the feasibility of the plan.

QoS has also been applied to storage subsystem management for effective disk space utilization and request servicing. We present a QoS-based storage model for effective user negotiation in terms of scheduling, redundancy, and number of storage devices. Users can create their own profiles with respect to certain QoS attributes in order to specify their requirements. A simulation model based on an available disk simulator has been developed and exercised under an artificial request workload to improve system responsiveness, performance, and functionality. A hierarchical storage model has also been simulated, including data elevation among the various levels of the storage hierarchy. Placement algorithms for the different levels of the storage hierarchy and elevation issues have been investigated.

Content-Based Video Retrieval

One important way of accessing video data by content is through extracted visual features. Visual features include color (histograms, color moments, etc.), texture, edge orientation, and motion vectors. In this video-processing task we did the following. We developed algorithms for scene-cut detection to produce meaningful video shots. A shot is the basic unit for accessing video and for feature extraction. Key frames are also extracted from these video shots for querying and fast browsing. We also developed the necessary algorithms for feature extraction from uncompressed and compressed video media.
Frame features are also aggregated to represent per-shot features. Examples of these features include scalable color, dominant color, color layout, Tamura texture, edge orientation, motion vectors, and camera motion. Most of the MPEG-7 standard visual features are included to represent the video, and the standard's format and representation of these features is followed. Moreover, other semantic information about the video data has been integrated with the extracted features. The semantic information includes text annotations and keywords provided by a domain expert. Audio-to-text transformation is used and processed to extract additional semantic information. A hybrid scheme of visual features and semantic features will be utilized to access video contents.

The large number of extracted visual features needs to be indexed for efficient access and querying. Visual features can be viewed as vectors in a high-dimensional space, and hence efficient multidimensional index structures are needed. In our research we did the following tasks. We investigated and compared the performance of different multidimensional index structures; most of these index structures perform poorly as dimensionality increases, a typical case for video features. In our prototype we implemented the SR-tree index structure using GiST (a generalized search tree framework). An index has been implemented for similarity search queries on features with dimensions up to 64. For higher dimensions, sequential nearest neighbor search is still the only practical approach; an investigation into the use of other indexing techniques with less dependence on the space dimensionality is underway. For efficient querying, multiple features should be usable in the same query. Combining similarity search over more than one feature is tricky and needs careful assignment of weights. Also being investigated is the use of the latest algorithms for multiple-feature indexing.
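The report does not specify the exact scene-cut algorithm; one standard approach, histogram differencing between consecutive frames, can be sketched as follows (toy gray-level frames and hypothetical helper names, not the project's implementation):

```python
# Sketch of histogram-based shot-boundary detection: a cut is declared where
# the normalized L1 distance between consecutive frame histograms is large.
def histogram(frame, bins=8):
    """Gray-level histogram of a frame given as a flat list of 0-255 values."""
    h = [0] * bins
    for v in frame:
        h[v * bins // 256] += 1
    return h

def scene_cuts(frames, threshold=0.5):
    """Return indices i where a shot boundary lies between frame i-1 and i."""
    cuts = []
    for i in range(1, len(frames)):
        h1, h2 = histogram(frames[i - 1]), histogram(frames[i])
        n = len(frames[i])
        # normalized L1 distance: 0 for identical histograms, 1 for disjoint
        dist = sum(abs(a - b) for a, b in zip(h1, h2)) / (2 * n)
        if dist > threshold:
            cuts.append(i)
    return cuts
```

Each detected boundary delimits a shot, from which a key frame can then be chosen for querying and browsing.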
Multimedia Presentation (Streaming)

Current database management systems can efficiently store media data types (e.g., audio and video) that require continuous flow of their contents. However, maintaining the rate of media presentation (media streaming) in a DBMS is challenging. The database buffer is generally not optimized for continuous provision of data. For example, failure to prefetch a data item results in a delay that is generally acceptable in traditional database systems, but violates continuity in media streaming. Current research on buffer management addresses the problem of media streaming in a non-database context, which has the effect of limiting the data functionalities provided by these systems. An aggressive pre-fetching technique for the database buffer is proposed, with the goal of supporting media streaming as well as traditional DBMS requests. Also being investigated is the effect of including streaming operations in the query manager functionalities. The target is to incorporate media streaming into the query execution pipeline. This approach provides efficient utilization of system resources and bridges the gap between query processing and media streaming.

Furthermore, we are investigating these approaches using an experimental database system and extending its capabilities to support video requirements. Two systems being utilized are PREDATOR and Shore. PREDATOR (an open-source object-relational database management system) is used to introduce a new video type, its methods, and its meta-data. PREDATOR uses Shore (the storage manager from the University of Wisconsin) as the underlying storage manager. Modification of system components such as storage, buffer management, and query management is necessary to handle the large volume and time-sensitivity of video data. The buffer manager has been extended to support streaming, and these changes are currently being tested with concurrent media as well as traditional database requests.
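The aggressive pre-fetching idea can be sketched with a toy buffer model (page granularity, prefetch depth, and all names are invented for illustration; this is not the PREDATOR/Shore code):

```python
# Sketch of aggressive prefetching: on a miss for a streaming object, fetch
# the requested page plus the next `depth` pages so sequential playback finds
# them resident, instead of paying one I/O round trip per page.
from collections import OrderedDict

class StreamingBuffer:
    def __init__(self, capacity, depth):
        self.capacity = capacity   # buffer pool size in pages
        self.depth = depth         # how far ahead to prefetch for streams
        self.pool = OrderedDict()  # (obj, page) -> None, kept in LRU order
        self.io_requests = 0       # number of I/O round trips issued

    def _install(self, key):
        if key in self.pool:
            self.pool.move_to_end(key)
            return
        if len(self.pool) >= self.capacity:
            self.pool.popitem(last=False)  # evict least-recently-used page
        self.pool[key] = None

    def read(self, obj, page, streaming=False):
        key = (obj, page)
        if key in self.pool:
            self.pool.move_to_end(key)     # buffer hit
            return
        self.io_requests += 1              # miss: one I/O round trip
        self._install(key)
        if streaming:                      # prefetch upcoming pages too
            for p in range(page + 1, page + 1 + self.depth):
                self._install((obj, p))
```

With depth 3, playing 8 consecutive pages costs 2 I/O round trips instead of 8, which is the continuity gain the paragraph above describes; traditional (non-streaming) requests simply skip the prefetch step.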