Facilitating Document Annotation Using Content And Querying Value

ABSTRACT
A large number of organizations today generate and share textual descriptions of their products, services, and actions. Such collections of textual data contain a significant amount of structured information, which remains buried in the unstructured text. While information extraction algorithms facilitate the extraction of structured relations, they are often expensive and inaccurate, especially when operating on top of text that does not contain any instances of the targeted structured information. We present a novel alternative approach that facilitates the generation of structured metadata by identifying documents that are likely to contain information of interest, where this information will subsequently be useful for querying the database. Our approach relies on the idea that humans are more likely to add the necessary metadata at creation time if prompted by the interface, and that it is much easier for humans (and/or algorithms) to identify the metadata when such information actually exists in the document, instead of naively prompting users to fill in forms with information that is not available in the document. As a major contribution of this paper, we present algorithms that identify structured attributes that are likely to appear within the document, by jointly utilizing the content of the text and the query workload. Our experimental evaluation shows that our approach generates superior results compared to approaches that rely only on the textual content or only on the query workload to identify attributes of interest.
Architecture: [diagram not reproduced in this transcript]

1ST FLOOR, ABOVE ANDHRABANK ATM, OPPOSITE TO CHANDANA BROTHERS DILSUKHNAGAR, HYDERABAD-500060 WWW.LOTUSLIVEPROJECTS.COM , EMAIL: [email protected] CONTACT: 040- 40023300, 9700345745

EXISTING SYSTEM:
Many systems do not even have the basic “attribute-value” annotation that would make “pay-as-you-go” querying feasible. Existing work on query forms can be leveraged in creating the CADS adaptive query forms. Prior work proposes an algorithm to extract a query form that represents most of the queries in the database using the “querability” of the columns, and extends this with a discussion of form customization. Other work uses the schema information to auto-complete attribute or value names in query forms. Keyword queries have also been used to select the most appropriate query forms.

PROPOSED SYSTEM:
In this paper, we propose CADS (Collaborative Adaptive Data Sharing platform), an “annotate-as-you-create” infrastructure that facilitates fielded data annotation. A key contribution of our system is the direct use of the query workload to direct the annotation process, in addition to examining the content of the document. In other words, we prioritize the annotation of documents towards generating values for the attributes that querying users use most often.

Modules:
1. Registration
2. Login
3. Document Upload
4. Search Techniques
5. Download Document

Modules Description

Registration: In this module, an author (creator) or user must register first; only then can he or she access the database.

Login: In this module, any of the persons mentioned above must log in by giving their email ID and password.
Document Upload: In this module, the owner uploads an unstructured document as a file (along with its metadata) into the database. The end user later downloads the file with the help of this metadata and the document's contents, by entering either content or a query.

Search Techniques: Two techniques are used for searching documents: 1) Content Search and 2) Query Search.

Content Search: The document is retrieved by giving content that is present in the corresponding document. If the content is present, the corresponding document is downloaded; otherwise it is not.

Query Search: The document is retrieved by using a query of the kind given in the base paper. If the input matches, the document is downloaded; otherwise it is not.

Download Document: The user downloads the document using the query or content values given in the base paper. He or she enters the data in the text boxes; if it is correct, the file is downloaded; otherwise it is not.

System Configuration:

H/W System Configuration:
Processor     - Pentium III
Speed         - 1.1 GHz
RAM           - 256 MB (min)
Hard Disk     - 20 GB
Floppy Drive  - 1.44 MB
Key Board     - Standard Windows Keyboard
Mouse         - Two or Three Button Mouse
Monitor       - SVGA

S/W System Configuration:
Operating System       : Windows 95/98/2000/XP
Application Server     : Tomcat 5.0/6.X
Front End              : HTML, Java, JSP
Scripts                : JavaScript
Server side Script     : Java Server Pages
Database               : MySQL
Database Connectivity  : JDBC
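The two search techniques described under Modules Description can be sketched as follows. This is only a minimal illustration, written in Python for brevity rather than in the project's Java/JSP stack. It assumes that Content Search is a case-insensitive substring match on the document text and that Query Search is an exact match of attribute-value conditions against the uploaded metadata; the source does not specify either. The `Document` class, the function names, and the sample attributes are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Document:
    """Hypothetical in-memory stand-in for an uploaded file and its metadata."""
    name: str
    text: str
    metadata: Dict[str, str] = field(default_factory=dict)


def content_search(docs: List[Document], phrase: str) -> List[Document]:
    """Return documents whose text contains the phrase (case-insensitive)."""
    needle = phrase.strip().lower()
    if not needle:
        return []  # an empty input matches nothing
    return [d for d in docs if needle in d.text.lower()]


def query_search(docs: List[Document], conditions: Dict[str, str]) -> List[Document]:
    """Return documents whose metadata satisfies every attribute=value condition."""
    if not conditions:
        return []  # an empty query matches nothing
    wanted = {a.lower(): v.lower() for a, v in conditions.items()}
    result = []
    for d in docs:
        annotated = {a.lower(): v.lower() for a, v in d.metadata.items()}
        if all(annotated.get(a) == v for a, v in wanted.items()):
            result.append(d)
    return result


if __name__ == "__main__":
    docs = [
        Document("report1.txt", "Hurricane damage reported near the coast.",
                 {"Location": "Miami", "Type": "Hurricane"}),
        Document("report2.txt", "Flooding closed several roads.",
                 {"Location": "Miami"}),
    ]
    print([d.name for d in content_search(docs, "flooding")])
    print([d.name for d in query_search(docs, {"Type": "Hurricane"})])
```

Under these assumptions, a document that lacks an annotation for a queried attribute can never be returned by `query_search`, even if its text mentions the value. That gap is what the annotation suggestions in CADS aim to close.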
Conclusion:
We proposed adaptive techniques to suggest relevant attributes to annotate a document, while trying to satisfy the user querying needs. Our solution is based on a probabilistic framework that considers the evidence in the document content and the query workload. We present two ways to combine these two pieces of evidence, content value and querying value: a model that considers both components conditionally independent, and a linear weighted model. Experiments show that, using our techniques, we can suggest attributes that improve the visibility of the documents with respect to the query workload by up to 50%. That is, we show that using the query workload can greatly improve the annotation process and increase the utility of shared data.
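The two ways of combining content value and querying value can be illustrated with a small sketch. The text here does not give the formulas, so everything below is an assumption made for illustration: both values are taken to be numbers in [0, 1], the conditionally independent model is approximated by a plain product, the linear model by a weighted sum with a hypothetical `weight` parameter, and the querying value is estimated as the fraction of workload queries that mention an attribute. The sketch is in Python for brevity, not the project's Java/JSP stack, and all function names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def querying_value(workload: List[List[str]]) -> Dict[str, float]:
    """Assumed estimate: fraction of workload queries that use each attribute."""
    counts: Dict[str, int] = {}
    for query_attrs in workload:
        for attr in set(query_attrs):  # count an attribute once per query
            counts[attr] = counts.get(attr, 0) + 1
    return {attr: n / len(workload) for attr, n in counts.items()}


def score_product(cv: float, qv: float) -> float:
    """Simplified stand-in for the conditionally independent model."""
    return cv * qv


def score_linear(cv: float, qv: float, weight: float = 0.5) -> float:
    """Linear weighted model; `weight` is the share given to content value."""
    return weight * cv + (1.0 - weight) * qv


def suggest(content_values: Dict[str, float],
            querying_values: Dict[str, float],
            combine: Callable[[float, float], float],
            k: int = 3) -> List[Tuple[str, float]]:
    """Rank attributes by combined score and return the top k."""
    attrs = set(content_values) | set(querying_values)
    scored = [(a, combine(content_values.get(a, 0.0), querying_values.get(a, 0.0)))
              for a in attrs]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))  # high score first, ties by name
    return scored[:k]


if __name__ == "__main__":
    # Hypothetical workload: each query is the list of attributes it uses.
    workload = [["location", "date"], ["location"], ["severity", "location"], ["date"]]
    # Hypothetical content values, e.g. from a text-based estimator.
    cv = {"location": 0.2, "date": 0.9, "severity": 0.8, "author": 0.6}
    qv = querying_value(workload)
    print(suggest(cv, qv, score_product))
    print(suggest(cv, qv, score_linear))
```

In this sketch the product gives an attribute a score of zero whenever either piece of evidence is missing, so an attribute that no query uses (`author` here) is never suggested, while the weighted sum still gives it partial credit for appearing in the content. Which behavior is preferable depends on how much the query workload can be trusted.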