CMP 101: Searching the Internet Learning Assignment
Searching the Internet
Learning Assignment
How do you find information on the Web?
Before you start
This tutorial assumes you have completed the Introduction to the
Internet assignment and that you know how to navigate the web
and use browser commands.
Objectives
Students will be able to:
• Identify current web search tools and describe the benefits and
drawbacks of each.
• Use a directory to narrow a topic; navigate within a directory
using breadcrumbs and cross references.
• Identify key words and create a search query using basic and
advanced keyword search techniques.
• Analyze search results and refine search queries.
• Locate and use specialized sites designed for searching deep web
resources.
Introduction FAQ
What is the World Wide Web?
The World Wide Web refers to the billions of documents stored
and accessed via the Internet and viewed by a special program
called a web browser. Web documents contain hyperlinks that
allow readers to jump from one document to another with a click
of the mouse. Hyperlinks can also be gateways to audio and
video broadcasts, animations, and other types of media.
How do you find information on the Web?
This depends a lot on the type of information you are looking for.
For example, if you need reliable or primary sources, your search
will be very different than if you are looking for stats on your
favorite football team. Today’s common surface web search tools
include search sites, metasearch sites, and directories.
What is the difference between the Surface Web and the
Deep Web?
The surface web (or visible web) refers to material that you can
find using most search sites currently available on the web. The
deep web refers to information sources that cannot be found
using a typical search engine. These may be subscription-based
resources or pages that are created dynamically based on
criteria specified by the user (username and password, table of
data, etc.). Special sites and directory listings are used to access
deep web content.
I already know how to use a search engine. Why do I need
to know about anything else?
Good research involves the use of multiple resources in
multiple formats. In this exercise you will learn more about
performing multiple searches for electronic resources on the
web. Some of what you learn here you will be able to apply to
other formats (other electronic databases, print, audio, etc.).
NOTE: The instructions in this packet are based on the Internet
Explorer installation on campus computers. If you are using a
different computer, your screens may be different.
Revised: 7/29/2017
Page 1 of 19
Directories
Figure 1: Directory Example – Yahoo!
A directory is a categorical list of resources. In the case of web
directories, the categories and the resources presented are the result
of human reviewers and editors rather than computer programs.
When to use:
• Narrow down a topic
• Collect synonyms or keywords for unfamiliar/broad topics
• Seek resources for specialized information
Benefits:
• Sites included in a directory are generally more popular and
more reliable.
• Categorized listing makes it easy to narrow down unfamiliar
topics.
• Very good for generalized questions.
Drawbacks:
• Directories include far fewer web sites than most search
engines.
• Not good for finding specific topics or information.
• Usually offer fewer search capabilities.
Figure 2: Directory Example – Open Directory
Examples: Yahoo! (dir.yahoo.com), Open Directory (dmoz.org), Yahoo!
Kids (kids.yahoo.com).
When you see this icon, write or type your answers on the
attached answer sheet. If you see questions without this
icon, your instructor may ask you to discuss your findings
in class or online.
Page 2 of 19
Using Directories
Figure 3: Yahoo! Directory Results List
1. Log on to a computer, open a web browser, and go to the
Yahoo! directory at dir.yahoo.com.
2. Click on the Science category link. The results list consists
of a list of related sub-categories and a list of related sites.
3. Click on the Agriculture link.
Each click gives a narrower topic. This is referred to as drilling
down, focusing in on, or narrowing a topic.
4. Continue to narrow the topic by clicking on Biotechnology
and then Genetic Engineering (see Figure 3). How many sites
are listed in this category?
Notice the breadcrumbs above the results list. You can use these
links to jump back to a previous category.
5. In the breadcrumbs, click Agriculture to return to the
Agriculture category.
6. Click Food Safety@.
The “@” symbol indicates a cross-reference, which means the
sub-category is listed in multiple places within the directory. Note
the change to the breadcrumbs.
Figure 4: Browser History
7. In the breadcrumbs, click Directory to return to the main
list of categories. Click and hold the Back button until the
history list appears (see Figure 4). Click Biotechnology <
Agriculture… to return to the Biotechnology category.
8. Open a new tab. Go to dmoz.org and use the same category
links from steps 2, 3, and 4 to locate the Biotechnology
sub-category. What are the differences in the results of the
two searches? Close the dmoz.org tab when finished.
Page 3 of 19
Search Engines FAQ
What is a search engine?
A search engine is a collection of software designed to collect words
from web pages, rank and index the words, and create a database
that can be searched. This means that when using a search engine,
you are actually searching the database created from the web.
Clarification: Many people use the term “search engine” when
referring to a search site and/or the software associated with
creating a search database. In this handout the term search site
will refer to the web site you visit to perform a search and search
engine will refer to the software used to create and query the
database.
How does a search engine work?
There are four main parts to a search engine:
• Spiders or crawlers: Small programs that find new and
modified web pages and collect information to send back to
the search engine’s indexing programs.
• Indexing programs: Store and index pages in the search
engine’s database based on words on the page as well as
other meta-information which might include URLs on the
page, metatags or special terms embedded in the page’s
programming code, and other much more sophisticated data.
• The search engine itself: Identifies and retrieves the
information queried by the user. Also ranks pages based on
many factors (popularity, advertisements, location of
searched words, etc.), not on reliability.
• The interface: How the user interacts with the search engine.
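To see how the four parts fit together, here is a minimal sketch in Python. It is only an illustration: the "web" is a small made-up dictionary called PAGES, and the ranking rule (count how often the query words appear) is an assumption chosen for simplicity, not how any real search engine ranks pages.

```python
from collections import defaultdict

# Hypothetical stand-in for the web: page address -> (text, links to other pages).
PAGES = {
    "home": ("welcome to the farm science site", ["crops", "safety"]),
    "crops": ("genetic engineering of farm crops", ["safety"]),
    "safety": ("food safety and genetic engineering rules", []),
    "orphan": ("no page links here so the spider never finds it", []),
}

def crawl(start):
    """Spider: follow links from a start page and collect each page's text."""
    found, queue = {}, [start]
    while queue:
        address = queue.pop(0)
        if address in found or address not in PAGES:
            continue
        text, links = PAGES[address]
        found[address] = text
        queue.extend(links)
    return found

def build_index(collected):
    """Indexing program: map each word to the set of pages that contain it."""
    index = defaultdict(set)
    for address, text in collected.items():
        for word in text.split():
            index[word].add(address)
    return index

def search(index, collected, query):
    """Search engine: find pages containing every query word, then rank them
    by how often the query words occur (a toy ranking rule)."""
    words = query.lower().split()
    if not words:
        return []
    matches = set.intersection(*(index.get(w, set()) for w in words))

    def score(address):
        page_words = collected[address].split()
        return sum(page_words.count(w) for w in words)

    return sorted(matches, key=lambda a: (-score(a), a))

# Interface: here, simply a function call and a printed list.
collected = crawl("home")
index = build_index(collected)
print(search(index, collected, "genetic engineering"))
```

Notice that the "orphan" page never shows up in any search. No other page links to it, so the spider never collects it. This is one reason a search only covers the database the engine has built, not the whole web.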
How do you access a search engine?
There are currently four main large-scale search engines: Yahoo!
(yahoo.com), Google (google.com), Bing (bing.com), and Ask
(ask.com). Other search engines typically use one of these to
perform specialized searches.
How do you use a search engine?
Begin by defining what you are looking for. Searches typically fall
into one of three categories:
• Answer a specific question. Example: In what city will the
2014 Olympics be held?
• Personal research. Example: Research products before
making a large purchase.
• Academic research. Example: Write a paper for science class.
Then determine keywords to use. Keywords are nouns, verbs, and
sometimes adjectives that describe the information you are
searching for.
• Choose words that most clearly define your question or topic.
• Leave out words like “a”, “the” and “for” unless they are part
of a phrase.
• Consider synonyms or related terms.
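The keyword guidelines above can be sketched in a few lines of Python. This is only a rough illustration: the STOP_WORDS list is a small hand-picked assumption, and the function does not know when a small word is part of a phrase, so you still have to use your own judgment.

```python
# A small, hand-picked stop-word list (an assumption; search sites use their own).
STOP_WORDS = {"a", "an", "the", "for", "in", "on", "of", "to", "be",
              "is", "are", "was", "will", "what", "which", "when", "do", "does"}

def extract_keywords(question):
    """Return the words of a question that are not stop words, in order."""
    words = [w.strip("?.,!\"'").lower() for w in question.split()]
    return [w for w in words if w and w not in STOP_WORDS]

print(extract_keywords("In what city will the 2014 Olympics be held?"))
# prints ['city', '2014', 'olympics', 'held']
```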
Finally, enter your search query. The search query is the actual
keywords and search commands submitted to the search site.
• Type words in the order you would expect them to appear.
• Use advanced search techniques like phrase searching,
wildcard searching, and Boolean searching.
• Use focus options such as images, news, or videos.
• Try multiple searches, intermixing synonyms and modifying
word order.
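If you want a checklist of searches to try, the last tip (intermix synonyms and modify word order) can be sketched as a short Python function. The function name and the sample synonyms are made up for illustration.

```python
from itertools import permutations, product

def query_variants(keyword_options):
    """keyword_options is a list of lists: each inner list holds one keyword
    and its synonyms. Returns every combination in every word order."""
    variants = []
    for choice in product(*keyword_options):
        for ordering in permutations(choice):
            variants.append(" ".join(ordering))
    return variants

variants = query_variants([["Shakespeare"], ["birthday", "born"]])
for query in variants:
    print(query)
```

The number of variants grows quickly as you add keywords and synonyms, so in practice you would try only the few that look most promising.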
Page 4 of 19
Using Search Engines
Figure 5: Bing search results
When to use:
• Looking for specific information
• Quick overview of a topic
• Need advanced searching capabilities
Benefits:
• Fast answers (but are they complete/reliable?)
• Advanced / intuitive search capabilities
• Automatic synonym and Boolean searches
Drawbacks:
• May not provide access to complete information on a topic
• May be difficult to distinguish sponsored links and
advertisements from search results
• Ranking can be misleading
Let’s first take a look at the behavior of search engines based
on a simple query. We want to know in what city the 2014
Winter Olympics will be held.
1. Decide on keywords by choosing the most important words
that identify the question. They include, in no specific order:
city, Olympics, Winter, and 2014
2. Use your web browser to go to bing.com.
Enter key words in the order you would expect them to appear in
the results list.
3. Type 2014 Olympics in the search box and press enter.
Review the descriptions for the first two pages of the results
list, and comment on what you find (see Figure 6).
Figure 6: Commenting on search results
When asked to comment on search results, consider the following:
• The number of sites found.
• Do all sites in the results list answer your question?
• Are answers contradictory or are they all the same?
• Do you recognize what might be reliable sources?
4. Change your query to 2014 Olympics city. Does that
change the results? Try 2014 Winter Olympics.
Page 5 of 19
Using Search Engines
1. This time we want to discover Shakespeare’s birthday.
First, decide on the key words (include synonyms or
related terms).
Figure 7: Yahoo! search
2. Go to Yahoo! and perform the search using the keywords
you selected.
Take a moment to review the results list.
3. Try the search using a different search site and compare the
results. When is Shakespeare’s birthday?
4. Practice: On the answer sheet, underline the
keywords that could be used to search for the answer to
each of the following. Then perform the search to locate and
write down the answer.
a. In which stock exchange is the FTSE 100 index used
to measure stock market performance?
b. According to legend, an acorn kept on your window
sill will supposedly keep your house safe from what?
c. In 1856-1857 Mark Twain wrote several letters to the
Keokuk Post in Iowa. What name did he sign on those
letters?
d. Patent #2,026,082 is based on what board game
patented in 1904?
TIPS: leave out symbols and punctuation. For this
question, you may have to perform more than one search.
Write down any extra keywords used.
Page 6 of 19
Advanced search
Advanced searching involves making use of special commands or
forms to force the search engine to provide more specific or more
general results. Advanced search techniques supported by
most search sites include:
• Wildcard searching: Use an asterisk to replace unknown
words or spelling.
• Phrase searching: Type the search query in quotes to have
the search engine find the exact phrase.
• Boolean searching: Use “and”, “or”, “and not”, “+”, and “-”
to broaden or narrow a search.
Figure 8: Wildcard search
1. Go to google.com, type a penny * earned for the search
query and press enter. How many of the results on the first
two pages reference the saying “A penny saved is a penny
earned”?
2. Try again with the query a penny * * * * earned with a
space between each asterisk. Are the results different?
Type one asterisk for each missing word for better results.
Figure 9: Phrase search
3. Try the same searches on at least one other search engine.
Do you get the same results?
4. Phrase searching is used to locate an exact phrase, like the
title of an article. It’s also good for forcing the search engine
to include key words it might otherwise ignore (like “Star
Wars I”). Return to Google and type “Organic foods: Are
they safer? More nutritious?” including the quotes.
What is the original source of the article?
The results list many sites with this phrase. You should always
try to find the original source of the information rather than
relying on an author’s “claim” that an article came from a
particular source.
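The difference between wildcard and phrase searching can be sketched in Python. This is a simplified model, not how any particular search site works: it assumes each asterisk stands for exactly one word, which is why the tip above suggests one asterisk per missing word. Real search sites are usually more forgiving.

```python
import re

def wildcard_to_regex(query):
    """Turn a query such as 'a penny * earned' into a regular expression in
    which each asterisk stands for exactly one word."""
    parts = [r"\w+" if token == "*" else re.escape(token)
             for token in query.split()]
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)

def phrase_match(phrase, text):
    """Phrase search: True only if the exact phrase appears in the text."""
    return phrase.lower() in text.lower()

saying = "A penny saved is a penny earned"

# One asterisk cannot cover the four missing words, so there is no match.
print(wildcard_to_regex("a penny * earned").search(saying) is not None)
# Four asterisks, one per missing word, do match.
print(wildcard_to_regex("a penny * * * * earned").search(saying) is not None)
# A phrase matches only when the words appear together in the same order.
print(phrase_match("penny saved", saying), phrase_match("saved penny", saying))
```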
Page 7 of 19
Boolean search
Some search engines support the use of Boolean operators: AND, OR,
NOT or AND NOT, or symbols: + or – to broaden or narrow a search.
Many search engines perform Boolean searches automatically or
provide a form for users to customize Boolean searches. Many
research database search tools are not yet that sophisticated.
• Use AND or + to narrow a search and force the search engine
to include all keywords in the search results. This is the
default for many search engines.
• Use OR to broaden a search and allow synonyms to be
included in the results.
• Use NOT, AND NOT, or – to narrow a search and force the
search engine to exclude keywords from the search results.
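One way to picture the three operators is as set operations on the documents that contain each keyword. The sketch below uses a made-up collection of four documents (DOCS) and is meant only to show why AND and NOT narrow a search while OR broadens it.

```python
# Hypothetical mini-collection: document name -> text.
DOCS = {
    "doc1": "poverty and crime rates in cities",
    "doc2": "crime statistics for juvenile offenders",
    "doc3": "poverty relief programs for children",
    "doc4": "poverty crime and single parent households",
}

def containing(word):
    """Set of documents whose text contains the word."""
    return {name for name, text in DOCS.items() if word in text.split()}

both = containing("poverty") & containing("crime")        # AND narrows
either = containing("juvenile") | containing("children")  # OR broadens
excluded = both - containing("single")                    # NOT excludes

print(sorted(both), sorted(either), sorted(excluded))
```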
What’s the point?
Knowing how to identify key words and synonyms and
modify searches using Boolean operators can save you a
lot of time when researching a topic. While Internet
search engines have become very adept at interpreting
your search query, online libraries and databases are not
as sophisticated.
As search technologies progress, you will always benefit
from being able to identify key words and modify your
search to make more efficient use of whatever technology
is available.
Figure 10: Boolean search
1. Say, for example, you are writing a paper examining the
relationship between poverty and crime. Return to Google,
search poverty AND crime and review the results. Try
removing the “AND” and just search for poverty crime.
Tip: Use the Back and Forward buttons to compare the results of
both searches. You should see that all of the results are very
similar. Many search engines use the AND operator by default.
2. Now go to Ask.com and search for poverty AND crime
AND kids. Review the results that appear under “More
Answers”.
Notice that the search engine automatically searches for
synonyms for “kids” (example: child and juvenile).
3. Add –“single parent” to the search query (see Figure 10).
Use the minus sign (-) to exclude keywords from the search
results. In this case you are not interested in cases involving
single-parent households so you remove results that include the
phrase “single parent”.
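To show what the minus sign asks the search engine to do, here is a minimal Python sketch that splits a query into required keywords and excluded phrases and then tests a piece of text against both. It is an illustration under assumptions: it uses a plain hyphen and straight quotes, treats AND as the default, and uses simple substring matching, which real search engines do not rely on.

```python
import shlex

def parse_query(query):
    """Split a query into required terms and excluded terms or phrases.
    A leading minus sign marks an exclusion; quotes keep a phrase together."""
    required, excluded = [], []
    for token in shlex.split(query):
        if token.startswith("-") and len(token) > 1:
            excluded.append(token[1:].lower())
        elif token.upper() != "AND":
            required.append(token.lower())
    return required, excluded

def matches(text, required, excluded):
    """True if the text has every required term and no excluded phrase."""
    text = text.lower()
    return (all(term in text for term in required)
            and not any(phrase in text for phrase in excluded))

required, excluded = parse_query('poverty AND crime AND kids -"single parent"')
print(required, excluded)
print(matches("How poverty and crime affect kids", required, excluded))
print(matches("Poverty, crime and kids in single parent homes", required, excluded))
```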
Page 8 of 19
Metasearch Engines
Metasearch engines perform keyword searches on multiple search
engines and provide the results in one list.
When to use:
• Determining the scope of a topic (how much/what kind of
information is available)
• Perform a quick search across several search engines
Benefits:
• Fast method for conducting a broad search
• Some have special features such as clustering
Drawbacks:
• Usually only return the first 10 or 20 results from each
source.
• Some rely heavily on paid listings
• Advanced search operators may not be supported.
Figure 11: Learn more about a search engine
Examples: Dogpile (www.dogpile.com), Metacrawler
(www.metacrawler.com), Ixquick (ixquick.com), Yippy (yippy.com).
There are many, many more.
1. Go to dogpile.com. Click About Dogpile (page bottom).
With any search tool, it’s a good idea to learn something about
how the tool works. You can avoid duplication and save time.
2. In the About page (see Figure 11) you learn (at the time of
this writing) that Dogpile uses Google, Bing, and Yahoo!.
Now you know that if you do a search on Dogpile, you may not
need to repeat the search at these sites separately. However,
remember that Dogpile will limit the results list so if you need to
dig deeper another search may be required.
3. Click the links for Metasearch 101 and FAQs and any other
links you might find useful.
Write down something you
learned about Dogpile or searching in general.
Page 9 of 19
Metasearch Engines
Figure 12: Dogpile search results
1. Click in the search box and type the key words video game
addiction.
In reviewing the results list, sponsored (paid) sites are listed first,
followed by the non-sponsored sites. Each item also displays the
source search engine.
2. Click the News tab above the search box (see Figure 12).
Many search sites and metasearch sites allow you to filter search
results to a specific type.
3. Click the Web tab to return to web results.
In the Are you looking for? section you may find related terms to
help you in your search.
4. Click the Dependence on Technology link.
If you were writing a paper on video game addiction, the information
from this search may help relate video game addiction to other social
behaviors.
5. Now go to ixquick.com. Click About at the top of the page.
Figure 13: Ixquick search results
On the About page, in the More accurate search results section, you
learn, among other things, that stars are used to rank the search
results based on the number of search engines that returned each
result.
6. Use the Back button to return to the Ixquick home page and
search for video game addiction.
7. Point to the stars next to the first non-sponsored site (see
Figure 13).
A screen tip appears showing the search engines used. Note that some
of the search engines are metasearch engines too.
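The idea behind a metasearch results list can be sketched in Python: take the top results from each source, merge them into one list, and give each site one "star" for every engine that returned it. The engine names, addresses, and tie-breaking rule below are made up for illustration and are not taken from Dogpile or Ixquick.

```python
def merge_results(results_by_engine, per_engine_limit=10):
    """Combine result lists from several engines into one list.
    Each address gets one 'star' per engine that returned it."""
    engines_for_url = {}
    for engine, urls in results_by_engine.items():
        # Metasearch sites usually keep only the first few results per source.
        for url in urls[:per_engine_limit]:
            engines_for_url.setdefault(url, []).append(engine)
    # Most stars first; ties are listed alphabetically so the order is stable.
    ranked = sorted(engines_for_url.items(),
                    key=lambda item: (-len(item[1]), item[0]))
    return [(url, len(engines), engines) for url, engines in ranked]

# Made-up engines and addresses, for illustration only.
sample = {
    "EngineA": ["site1.example", "site2.example", "site3.example"],
    "EngineB": ["site2.example", "site4.example"],
    "EngineC": ["site2.example", "site1.example"],
}
merged = merge_results(sample)
for url, stars, engines in merged:
    print(url, "*" * stars, engines)
```

Lowering per_engine_limit shows the drawback mentioned earlier: results that sit further down each source's list never reach the merged list.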
Page 10 of 19
Clustered Search
Figure 14: Yippy search results
When doing research, some metasearch sites can help by displaying
and grouping related terms.
1. Go to Yippy.com. Search for video game addiction.
Yippy not only searches the web, it presents logical “groups” or
“categories” called “clouds” to organize the information. This is
sometimes referred to as clustering.
2. Click the details link above the results list (see Figure 14).
Information about the source of the results is listed.
3. Click the details link again to collapse the list. In the results
list, point to each icon next to a result site to learn its
function. Try using the Preview icon to view the result site
directly in the results list.
4. In the clusters panel on the left, review the “clouds”. Click
the plus sign (+) next to any one of the categories listed.
The search results now focus on the category you selected.
5. Click the Sources tab in the clusters panel.
Now you can see which sites come from which search engines
and can filter your search results to a specific search engine.
6. Explore the other two tabs, Sites and Time, to view changes
to the results.
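Clustering can be pictured with a very simple rule: group results under any extra word that several of them share. The Python sketch below uses made-up result titles and a hand-picked list of words to ignore. It is only a rough model; a real clustering site like Yippy uses far more sophisticated methods.

```python
from collections import defaultdict

# Hypothetical result titles for a search on "video game addiction".
TITLES = [
    "Video game addiction symptoms in teens",
    "Treatment for video game addiction",
    "Teens and video game addiction statistics",
    "Addiction treatment centers near you",
]
QUERY_WORDS = {"video", "game", "addiction"}
IGNORED = {"in", "for", "and", "near", "you"}  # small words that make poor labels

def cluster(titles):
    """Group titles under each extra word (beyond the query) they contain."""
    clusters = defaultdict(list)
    for title in titles:
        for word in sorted(set(title.lower().split()) - QUERY_WORDS - IGNORED):
            clusters[word].append(title)
    # Keep only labels shared by at least two results.
    return {label: items for label, items in clusters.items() if len(items) > 1}

clouds = cluster(TITLES)
for label, items in clouds.items():
    print(label, "->", items)
```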
Page 11 of 19
Specialized Search Tools
Many government and private sector sites provide access to
documents and data that are not available from the most
common search engines. For example, if you wanted to gather
statistics about a particular school, you might search the web
for Education Statistics. You would quickly find the National
Center for Education Statistics where you can build queries on
the data available there. One such query tool is the
Elementary/Secondary Information System (ELSI).
Example: You want to compare pupil/teacher ratios for your
elementary school (or one in your area).
Figure 15: NCES Search
1. Go to nces.ed.gov (National Center for Education Statistics).
Click in the search box in the upper right corner of the
page, type ELSI, and press Enter.
Many web sites allow you to search within the site.
2. Click ELSI - Elementary and Secondary Information
System. Review the page for any notices you should be
aware of (sometimes data is not up-to-date or contains
errors).
3. Click the Begin button next to quickFacts. Use the options
presented to learn more about the pupil/teacher ratio for a
school you attended or one in your area for the most recent
years available. Compare that to the oldest dates available.
Since the page is generated dynamically (when you ask for it),
surface web search tools cannot index it.
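Why can't a spider index a dynamically generated page? The Python sketch below gives one simplified picture. It assumes a spider that only follows links, and everything in it (the page addresses, the school name, and the ratio numbers) is invented for illustration and has nothing to do with the real NCES site or its data.

```python
# Static pages: each one exists ahead of time and lists its links (hypothetical site).
STATIC_PAGES = {
    "/index": ["/about", "/search-form"],
    "/about": [],
    "/search-form": [],   # contains a form, but no link to any result page
}

# Hypothetical data behind the form; the numbers are invented for illustration.
RATIOS = {("Example Elementary", 2010): 15.2, ("Example Elementary", 2000): 18.4}

def result_page(school, year):
    """Build a result page on request; it does not exist until someone asks."""
    return f"{school}, {year}: {RATIOS[(school, year)]} pupils per teacher"

def crawl(start):
    """A spider that only follows links from page to page."""
    seen, queue = set(), [start]
    while queue:
        page = queue.pop()
        if page not in seen:
            seen.add(page)
            queue.extend(STATIC_PAGES[page])
    return seen

reachable = crawl("/index")
print(sorted(reachable))                        # the spider finds only the static pages
print(result_page("Example Elementary", 2010))  # a person filling in the form gets this
```

The spider reaches the page that holds the form, but it never fills the form in, so the result pages stay out of the search engine's database.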
Page 12 of 19
Find Specialized Search Tools
Some deep web resources are indexed in directories maintained by
specialists in the field. To find some of these deep web search tools,
use your favorite search engine and add the following keywords (or
synonyms) to the topic you are interested in. Remember, anybody
can post anything on the Internet, so always be skeptical of your
search results.
• “web directory”, “resources”, “internet resources”, or “web
resources”
• “library” or “portal”
• “pathfinder” for lists of printed resources
Figure 16: Many Search Tools are available on the Web
1. Use bing.com to perform a search for each of the following.
Use different tabs so you can compare results.
agriculture
agriculture portal
agriculture “web resources”
agriculture pathfinder
Are all of the search results the same or different? How
would this help if you were writing a paper about
agriculture?
2. Find your own search tools: Use a search engine of your
choice to search for the following types of search tools.
Write the URL of a site for each and a brief description of how
the search site works or how search results are arranged.
a. Good search engines for students
b. Visual search engines (search results or related topics are
presented visually)
c. Deep web search engine
d. Reference tools (dictionary, thesaurus, encyclopedia,
almanac, etc.)
Page 13 of 19
Use Research Databases
Most colleges now offer electronic databases that allow you to
search periodicals. You can use some of the same techniques to
search these databases.
Figure 17: Academic One File
1. Go to the Wor-Wic web site (www.worwic.edu), point to
Quick Links, click Library Services, and then under
Research Databases, click By Subject.
2. We’ll be looking for information about Video Game Addiction
so let’s start with Social Sciences and choose the first
database on the list, Academic OneFile.
3. Perform a keyword search on Video Game Addiction.
4. When the results are displayed, change the search options
to limit the search results to peer-reviewed publications
published in the last year.
5. Open another tab, repeat step 1 to go to the Library Services
page, select the Health, Medicine & Nursing category and
choose the Health and Wellness… database. Use the
Advanced search to find full text articles on Video Game
Addiction published in the last year.
Figure 18: Health & Wellness Resource Center
6. How do the two searches compare? Practice modifying your
search by removing the date filter (look for “revise search”
or “remove limit”).
7. Click one of the article titles to open the full document.
Look for a “print” link on the page and use it to print page 1
only.
Use the print link to open the article in a format more suitable for printing.
NOTE: Some articles may open in Adobe Reader, which is also suitable
for printing.
Page 14 of 19
More search examples and suggestions
Historical search (chronicled, retrospective, varied opinions)
Identify keywords and use a metasearch site such as Dogpile or
Ixquick. Consider synonyms, plural vs. singular spellings,
different spellings, etc.
Examples: Emergence of smartphones
Consider synonyms for “emergence”: “rise”, “development”, and
“evolution”.
Consider different spellings: “smartphone” (singular), “smart
phones” (two words), “smart phone” (two words, singular)
Personal Research (large purchases, medical information,
restoration project)
Identify keywords and use either a metasearch site or a search
site. With personal research you may rely on “comparisons”
(keyword “compare” or wildcard “compar*”), “consumer
feedback”, and professional “reviews” so include those keywords
in your search. If you know of a reliable source of information on
a product, you can include that too.
Examples: smartphone review, smartphone consumer feedback,
consumer reports smartphone
Perform multiple searches: Make note of other related terms
and keywords you find. Make use of clustering search engines to
help you find more related terms.
Identify keywords: Write your topic in a sentence to identify the
key words. Make a list of synonyms or related topics. You might
add the keywords “history” or “historical” to obtain background
information. Keywords like “trends” or “discoveries” may
provide more current data.
Search for search tools: Remember that experts in the field
sometimes create directories of materials related to your topic.
Use some of the suggestions on the previous page(s) to locate those
resources. You can also add keywords like “tutorial” or
“beginner” or “guide” to lead you towards introductory
information on some topics.
Need reliable sources? Make use of specialized search tools like
sweetsearch.com, ipl2.org, or your school’s research databases.
Feeling overwhelmed? Research is gathering information – all
kinds of information. It can be overwhelming at times. It may
help to organize and categorize the information you have found
(just like a directory) and decide whether or not it fits into the
scope of your project.
More practice. Complete the More Practice questions at the end
of the answer sheet as directed by your instructor.
Page 15 of 19
Resources:
Reference Materials
Columbia Encyclopedia: encyclopedia.com
Wikipedia: wikipedia.org
Dictionary: yourdictionary.com, merriam-webster.com
Spanish Dictionary: diccionarios.com
Reference Material: refdesk.com, infoplease.com
Search Engines
Google: google.com
Yahoo!: yahoo.com
Bing: bing.com
Ask: ask.com
Directories
Internet Public Library: ipl2.org
Open Directory: dmoz.org
Yahoo!: dir.yahoo.com
Yahoo! Kids: kids.yahoo.com
Best of the Web: botw.org
A few specialized directories (there are so many out there on just
about any topic)
Virtual Library: vlib.org
Open Access Books: www.doabooks.org
Queen Victoria’s Journals: www.queenvictoriasjournals.org
Business.com: business.com
Online books: onlinebooks.library.upenn.edu
Research tools & search engines
Sweet Search: sweetsearch.com
Finding Dulcinea: findingdulcinea.com
Jstor: jstor.org
Noodle: noodletools.com (several tools and helpful information on
conducting research)
Page 16 of 19
Searching the Internet – Answer Sheet
Pg, Step | Description | Type or write answers below
3, 4 | Number of sites in Genetic Engineering category |
3, 8 | Compare category listings (Yahoo! and dmoz.org) |
5, 3 | Comment on results for 2014 Olympics keyword search |
6, 1 | Keywords for Shakespeare’s birthday |
6, 4a | In which stock exchange is the FTSE 100 index used to measure stock market performance? |
6, 4b | According to legend, an acorn kept on your window sill will supposedly keep your house safe from what? |
6, 4c | In 1856-1857 Mark Twain wrote several letters to the Keokuk Post in Iowa. What name did he sign on those letters? |
6, 4d | Patent #2,026,082 is based on what board game patented in 1904 (remember to write down any extra keywords used)? |
7, 4 | Source of the article Organic foods: Are they safer? More nutritious? |
Page 17 of 19
9, 3 | Write down something you learned about Dogpile or searching |
13, 1 | Are all of the search results the same or different? How would this help if you were writing a paper about agriculture? |
13, 2a | Search engine for students |
13, 2b | Visual search engine |
13, 2c | Deep web search engine |
13, 2d | Reference site |
Page 18 of 19
More Practice
1. For each of the following, indicate what kind of search (keyword, wildcard, phrase, directory, deep web) would be most appropriate
and why. Try your search strategy to see how well it works.
Searching for | Answer | Reasoning
Ideas for fun things to do on the weekend. | Directory search | You can focus in on activities of interest. With the other searches, many different activities would be included in the search results. It would be harder to weed out the ones you aren’t interested in.
Title of a poem when you only know a few words that aren’t necessarily next to one another in the poem. | Keyword or wildcard | You are looking for something specific and have some of the words in the poem. If you know the order of the words, you can use wildcards; otherwise, just use a keyword search.
Your turn…
1. How to treat a bee sting | |
2. Write a paper for astronomy class | |
3. Obtain education statistics for US schools by state or region. | |
4. Information on Richard I | |
2. Go to www.agoogleaday.com and see how many questions you can answer. Be sure to use the Google a day search site so as not to
give away the answer.
Page 19 of 19