Text Mining

“Text mining seeks to apply some of the same types of analysis, such as knowledge discovery, or trend analysis, to unstructured textual data, that data mining applies to structured data. Text mining combines the disciplines of data mining, information extraction, information retrieval, text categorization, probabilistic modeling, linear algebra, machine learning, and computational linguistics to discover structure, patterns, and knowledge in large textual corpora.” [2]

The linear algebra side of text mining relies on vector spaces. In a vector space model, data is represented as numeric vectors and matrices, and matrix analysis is then used to find the results most relevant to the search item. Most search engines, Google in particular, make heavy use of this. One reason is that it allows for a ranking system: when your list of websites is displayed, you will see a relevance percentage. “The vector space model allows documents to partially match a query by assigning each document a number between 0 and 1, which can be interpreted as the likelihood of relevance to the query.” That number is converted to a percentage, and that percentage is what displays next to each site in the ranking.

Here is a pictorial representation of the vector spaces returned when applied to the word “chair”.

And I’ll leave you with a little quote about Google that I found during my research, which I know you will appreciate and which I hope to find room for on my poster. “It’s not my homepage, but it might as well be. I use it to ego-surf. I use it to read the news. Anytime I want to find out anything, I use it.” - Matt Groening, creator and executive producer, The Simpsons [1]

1. Amy N. Langville and Carl D. Meyer, Google's PageRank and Beyond: The Science of Search Engine Rankings.
2. Visualize Word Meanings, from the infoMap project at Stanford University. http://infomap.stanford.edu./
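
To make the vector space ranking described above more concrete, here is a minimal sketch in Python. It assumes raw word counts as the vector entries and cosine similarity as the matching score. Neither choice is stated in the sources quoted here, and real search engines use far more elaborate weighting. The function names and the three sample documents are made up for illustration.

```python
import math
from collections import Counter


def term_vector(text, vocabulary):
    """Represent a text as a vector of word counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_documents(query, documents):
    """Score every document against the query and sort best match first."""
    # The vocabulary is every distinct word seen in the documents.
    vocabulary = sorted({word for doc in documents for word in doc.lower().split()})
    query_vec = term_vector(query, vocabulary)
    scored = [
        (cosine_similarity(query_vec, term_vector(doc, vocabulary)), doc)
        for doc in documents
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical mini-corpus, invented for this sketch.
documents = [
    "wooden chair with arm rests",
    "office chair and desk",
    "search engine ranking",
]

for score, doc in rank_documents("chair", documents):
    # Word counts are never negative, so each score lies between 0 and 1
    # and can be shown as a percentage.
    print(f"{score:.0%}  {doc}")
```

Both documents that mention "chair" match the query only partially, and the one with fewer unrelated words ranks higher. The document that never mentions the word scores zero.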