Invisible Web
  AKA Hidden Web, Deep Web
  (First labeled as the Invisible Web in 1994 by Jill H. Ellsworth)

Internet vs. World Wide Web
  Internet
    A network created so that computers could talk to each other
    1969
    Funded by the U.S. Defense Advanced Research Projects Agency
  WWW
    Software that runs on the Internet and allows users to access files
    1990
    Created by programmer Tim Berners-Lee at the European Organization for Nuclear Research

What is the Invisible Web?
  Areas of the web that search engine crawlers cannot access.
  Areas of the web that are not directly searchable by the basic search function of general search engines such as Google.

Search Engine Crawler
  The tool used by search engines to index the web.
  A web page that is invisible to one search engine crawler may not be invisible to another.

What is the Surface Web?
  Areas of the web that search engine crawlers can access.
  Areas of the web that are directly searchable by the basic search function of general search engines such as Google.

Web Evolution

Size of the Web
  Surface Web vs. Invisible Web
  A “White Paper” put out by BrightPlanet states that the Invisible Web is about 500 times larger than the Surface Web.

Visible vs. Invisible

A Visible Website?
  Has static HTML web pages
  (Depends on the search engine being used)
    Google Scholar (PDFs)
    Google Video

A Visible Website?
  Can be a database or directory
    Do you have to enter a query?
    Are there links to pages?
    Is a subscription needed?

A Visible Website?
  Needs to be linked from a site that is currently being crawled.
  If not, it is called a “disconnected page”
    There is no way to discover the page
    Even if there are no technical issues

A Visible Website?
  May depend on the number of web pages on a website
  Mark Ludwig (Univ. at Buffalo, State Univ. of New York)
    2.2 million web pages created; Google crawled only 20,000

A Visible Website?
  Depends on whether web pages are static or dynamic
    Weather information
    Stock reports

Opaque Web
  Sites that are physically able to be indexed but that search engines choose not to index, for whatever reason.
    Size of a website
    Dynamic web pages
  Essentially invisible

Why is this important?
  Locating information
  Disseminating information
    If you want a website or digital project to be visible

Locating Information
  When searching:
    Use the word "database," "directory," "search engine," or a similar synonym
    Use a term for your topic

Disseminating Information
  A Visible Website?
    Difficult to say whether a whole website is visible
    Need to make the distinction one web page at a time

Disseminating Information
  Do these websites have invisible or surface web pages?
    JSTOR
    Earth Trends Environmental Information
    Chicago Tribune
    Directory of Open Access Journals
    Western Waters Digital Library

Invisible Web Directories
  CompletePlanet  http://aip.completeplanet.com
  INFOMINE  http://infomine.ucr.edu/
  Librarians’ Internet Index  http://www.lii.org/

Disseminating Information
  (Google, for example)
  The Google search engine crawler will index web pages through:
    Links to them from other web pages
    Submission via the “add URL” form: www.google.com/addurl.html
    Paid indexing
  Google only indexes the first 101 KB of a web page

Future of the Invisible Web
  The size of the Invisible Web will continue to grow.
  There will most likely be an increase in the number of websites that can be crawled. (Increase in speed)
  More indexing of non-HTML formats
    Ex) Google Scholar or Google Video

Conclusion
  Use more than one search tool
  Use invisible web directories
  Remember how web search engine crawlers function, for the purpose of effectively locating and disseminating information

Questions?
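
Supplementary sketch: the slides say that a crawler finds pages by following links, and that a page nothing links to (a "disconnected page") cannot be discovered even when it has no technical problems. The toy Python example below illustrates that idea under simplifying assumptions. The site map `LINKS`, the page names, and the functions `crawl` and `disconnected` are all hypothetical and exist only for illustration; no real search engine is claimed to work exactly this way.

```python
from collections import deque

# Hypothetical toy site: each page maps to the pages it links to.
LINKS = {
    "home": ["about", "catalog"],
    "about": ["home"],
    "catalog": ["item-1"],
    "item-1": [],
    "orphan": ["home"],  # links out, but no other page links to it
}


def crawl(links, seeds):
    """Return every page reachable from the seed pages by following links."""
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def disconnected(links, seeds):
    """Return pages that exist on the site but are never reached by the crawl."""
    return set(links) - crawl(links, seeds)


if __name__ == "__main__":
    print("Surface pages:", sorted(crawl(LINKS, ["home"])))
    print("Disconnected pages:", sorted(disconnected(LINKS, ["home"])))
```

In this sketch, "orphan" stays invisible when the crawl starts from "home". Passing it as an extra seed, roughly analogous to submitting a URL directly, makes it reachable.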