CS 257 Database Systems
Dr. T Y Lin

Ultimate Goal
Data Science (Big Data)

CS 257 Overview

CS257 and Big Data
+: VLDB (Very Large Database)
+: Unstructured Data, i.e. Text/Web, Image, Multimedia, Video, Vision, Bio, Scientific Data Processing
Light: Cloud Computing
Light: Data Science / Knowledge Engineering, etc.

Major Applications in Big Data
Medical Informatics: VLDB + Image + Cloud + Security (CS286)
Financial Informatics: VLDB + BI + Cloud + Security (CS286)
Web Engineering
Business Intelligence (BI)
Data Science (Knowledge Engineering in Web/Image/Bio/etc. data)

Instructor
IEEE Best Contribution Award in Data Mining (ICDM 2001)
ACM/IEEE Best Service Award, Web Intelligence (WI-2007)
Best Contribution Award, Rough Set (2005)
Pioneer Award in Granular Computing (2008)
http://dl.acm.org/inst_page.cfm?id=60015609

Project Overview
Verification and Validation of the Core Engine of a Concept Based Semantic Search Engine

Main Idea
1) Latent Semantic Index (LSI): a set of documents is associated with a matrix, and the row vectors are treated as points in Euclidean space (point = TFIDF). - Google's approach
2) Topological approach: a polyhedron (combinatorially, a simplicial complex) is built to capture and structure the concepts.

An open segment is a 1-simplex, an open triangle (face) is a 2-simplex, an open tetrahedron is a 3-simplex, and so on up to an n-simplex. A collection of simplexes that satisfies the closed condition (every face of a simplex in the collection also belongs to the collection) is called a simplicial complex. It is a combinatorial representation of a polyhedron, which led to a "new" subject called algebraic topology. The project is an algebraic topology based search engine.
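
The closed condition on a collection of simplexes can be illustrated with a small Python sketch. It assumes a purely combinatorial view in which a simplex is just a set of vertex labels, so an n-simplex has n + 1 vertices. The function names (faces, closure, is_simplicial_complex, dimension) and the labels "a", "b", "c" are hypothetical and chosen only for illustration; this is not the project's core engine.

```python
from itertools import combinations


def faces(simplex):
    """Return every non-empty face of a simplex, including the simplex itself."""
    vertices = sorted(simplex)
    return {
        frozenset(combo)
        for size in range(1, len(vertices) + 1)
        for combo in combinations(vertices, size)
    }


def closure(simplexes):
    """Smallest collection containing the given simplexes and all of their faces."""
    result = set()
    for simplex in simplexes:
        result |= faces(simplex)
    return result


def is_simplicial_complex(simplexes):
    """Check the closed condition: every face of every member is also a member."""
    family = {frozenset(s) for s in simplexes}
    return all(faces(s) <= family for s in family)


def dimension(simplex):
    """An n-simplex has n + 1 vertices."""
    return len(simplex) - 1


# Hypothetical example: one triangle (a 2-simplex) on the vertices a, b, c.
triangle = frozenset({"a", "b", "c"})
complex_from_triangle = closure([triangle])
```

Under these assumptions, the triangle alone is not a simplicial complex because its edges and vertices are missing, while closure([triangle]) adds the three edges and three vertices and then satisfies the closed condition.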