Mining Structured vs. Unstructured Data
Where is the structure and where did the semantics go?

Rahim Yaseen
SAP Labs LLC.

Why mining works for structured data

[Diagram layers, top to bottom: Reports, Queries, Relational Data Model, Data]

Reports and Queries: Rich semantics are usually expressed in queries and reports, which have a priori knowledge of the data models.

Relational Data Model: For relational databases, the data model represents a combination of the data representation specification and its storage as relational data. Sometimes, views can express alternate representational models that differ from the underlying table structures.

For relational data:
- There is no separation of the semantic data model and the logical storage model
- Both are coincident in a single data model, and the data definition has limited semantics
- The semantics are captured in the richness of the queries, which form well-known associations based on expert knowledge of relationships in the data models

SAP AG 2006, xuPA Mid-Term Strategy / Speaker Name / 2 internal/confidential

What will it take to mine unstructured data?

Why free (text) search is not the answer:
- The data has no structural model to which meaningful semantics can be applied
- As a result, queries have limited semantics and are not rich enough to get the desired outcomes
- The limiting nature of ad hoc search (vs. the richness of pre-defined queries based on known structure/semantics) limits the relevance of the output

Converting unstructured data to structured data is also not the answer:
- Applying an ETL-like technique to convert data to a structured form is limiting
- This does not guarantee that all the data of interest can be captured
- It provides for only a single (fixed) interpretation of such unstructured data

Can overlaying a semantic model onto the data be the answer?
- Extract a semantic (meta) model of interest from the unstructured data
- Use the structure/semantics of this model to formulate rich search/query
- E.g., techniques used when searching and comparing products:
  - Relevant attributes from product descriptions are extracted to form a model
  - These attributes are used to formulate rich searches/queries and comparisons

Can mining work for both structured and unstructured data?

[Diagram layers, top to bottom: Reports, Queries, Simple Semantic (Meta) Data Model, Multiple Storage Model (shown twice), Data Storage Model(s), Data]

Reports and Queries: Queries and search that can leverage the structure of the data model to specify queries and searches that are rich in semantics.

Simple Semantic (Meta) Data Model: A simple semantic data representation model for modeling data (structured and unstructured). Meta-data based on ontologies is extracted from the underlying data.

Data Storage Model(s): Multiple storage models, including relational, XML, text, etc.

- A separate logical data (meta) model, distinct from the underlying storage model
- Extracted from the data in a non-intrusive fashion and captured as meta-data
- A single data representation model can map to multiple storage models
- The structure and semantics of the meta-data help structure queries, search, and reports

Are embedded tags in the data a possible approach to defining ontology structures?
Is it feasible to extract such semantic models, and can mining based on them perform?
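
The product search and comparison example mentioned in the slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the approach used in any SAP product: the attribute names, the regular expressions in ATTRIBUTE_PATTERNS, the sample descriptions, and the helper functions extract_metadata, search, and compare are all hypothetical. The sketch extracts a small attribute model from free-text descriptions without modifying the text (the non-intrusive meta-data layer), then uses that model to run an attribute-based query and a side-by-side comparison that plain keyword search could not express.

```python
import re
from dataclasses import dataclass, field

# Hypothetical "semantic (meta) model of interest": each attribute name
# maps to a regex whose first group captures a numeric value.
ATTRIBUTE_PATTERNS = {
    "screen_inches": re.compile(r"(\d+(?:\.\d+)?)[\s-]*inch", re.IGNORECASE),
    "ram_gb": re.compile(r"(\d+)\s*GB\s+RAM", re.IGNORECASE),
    "weight_kg": re.compile(r"(\d+(?:\.\d+)?)\s*kg", re.IGNORECASE),
}

# Made-up unstructured data: free-text product descriptions.
DOCUMENTS = {
    "p1": "Lightweight 13.3-inch laptop with 8 GB RAM, weighs just 1.2 kg.",
    "p2": "Powerful 15.6 inch workstation, 32 GB RAM, 2.4 kg with battery.",
    "p3": "Budget tablet with a bright display and all-day battery life.",
}


@dataclass
class MetaRecord:
    """Meta-data for one document, stored apart from the raw text."""
    doc_id: str
    attributes: dict = field(default_factory=dict)


def extract_metadata(doc_id, text):
    """Extract the attributes of interest; the source text is left untouched."""
    attributes = {}
    for name, pattern in ATTRIBUTE_PATTERNS.items():
        match = pattern.search(text)
        if match:
            attributes[name] = float(match.group(1))
    return MetaRecord(doc_id, attributes)


def search(records, predicate):
    """Return ids of documents whose extracted attributes satisfy the predicate."""
    return [r.doc_id for r in records if predicate(r.attributes)]


def compare(a, b):
    """Pair up the values of every attribute that both records have."""
    shared = sorted(a.attributes.keys() & b.attributes.keys())
    return {name: (a.attributes[name], b.attributes[name]) for name in shared}


records = [extract_metadata(doc_id, text) for doc_id, text in DOCUMENTS.items()]

# A query over the extracted model: at least 8 GB RAM and lighter than 2 kg.
light_and_capable = search(
    records,
    lambda attrs: attrs.get("ram_gb", 0) >= 8
    and attrs.get("weight_kg", float("inf")) < 2,
)
print(light_and_capable)  # ['p1']

# A comparison of two products on the attributes they share.
print(compare(records[0], records[1]))
```

Note that p3 yields an empty attribute set: extraction captures only what the chosen model anticipates. A different set of patterns could be applied to the same text to obtain another interpretation, which is the flexibility the slides contrast with a one-time ETL conversion.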