Search Engines & Marketing
Your Web Site
Dr. Soe, Dr. Westfall & CIS Dept.
California State Polytechnic University,
Pomona. Updated August 2010
http://www.csupomona.edu/~rdwestfall/451common/searchplace2.ppt
Agenda





Introductory exercise: Googlewhacks
How do Search Engines work?
How do you get your web site listed?
Search Engine exercise
Internet information issues
Googlewhacks



identify two words, NOT in quotation
marks, that get only one result in
Google
examples (used to work)
Exercise: find another Googlewhack
Search Engine Placement



Search engines return lists of links
based on search words entered by user
Most users only look at 3 - 10 items in
search output before changing words
Placement--how high a web page is in
the listings--is critically important in
generating traffic from search engines
High Search Engine Placement

"We can guarantee you a top 10
ranking"



what's it worth?
how can they do it?
Ixquick search on telecommuting
productivity

does not include Google, but see next page
High Search Engine Placement

Google searches on specified words




telecommuting productivity
Westfall research
Mrs. Westfall (also try Norwalk Brethren
Elementary)
textbook ripoff
notes
Types of Search Engines


Spiders, webcrawlers, robots

automatic indexing of key- and other words

Google, list of others
Web subject directories



built by humans who review web pages
Yahoo!, Open Directory (Yahoo used to look
like this)
Hybrids = spiders + humans
How Web Crawlers Work

Automated index building


crawlers or spiders (indexing robot programs)
go to web sites
examine pages & extract indexing information



may simply locate words
may identify key words, phrases, links
store data in search engine’s database with
URL for page
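
The crawl-and-index steps above can be sketched in a few lines of Python. This is only an illustration: FAKE_WEB is a hypothetical in-memory stand-in for real web sites (a real spider would fetch pages over HTTP), and the names PageParser and crawl are invented for this sketch.

```python
from collections import defaultdict, deque
from html.parser import HTMLParser
import re

# Hypothetical in-memory "web": URL -> HTML source of the page.
FAKE_WEB = {
    "http://example.com/": (
        "<html><title>Telecommuting</title><body>Telecommuting research "
        '<a href="http://example.com/productivity">productivity</a>'
        "</body></html>"
    ),
    "http://example.com/productivity": (
        "<html><title>Productivity</title><body>Telecommuting productivity "
        'notes <a href="http://example.com/">home</a></body></html>'
    ),
}


class PageParser(HTMLParser):
    """Collects the words and the outgoing links found on one page."""

    def __init__(self):
        super().__init__()
        self.words = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # "May simply locate words": lowercase alphanumeric runs.
        self.words.extend(re.findall(r"[a-z0-9]+", data.lower()))


def crawl(start_url, web):
    """Visit pages breadth-first and build a word -> set-of-URLs index."""
    index = defaultdict(set)
    seen = set()
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in seen or url not in web:
            continue
        seen.add(url)
        parser = PageParser()
        parser.feed(web[url])
        for word in parser.words:
            index[word].add(url)      # store the word with the page's URL
        queue.extend(parser.links)    # follow links to other pages
    return index


index = crawl("http://example.com/", FAKE_WEB)
print(sorted(index["notes"]))
```

The index maps each word to the pages it appears on, which is the data a query engine would later search.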
Search Engines Deliver Indexes



User requests information via search page
Query engine searches database
Delivers list of web resources


creates results web page based on search
Listed in order of a calculated index value


index values based on search words, and also
on "popularity" of site
but usually preceded by "paid placements"
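
A rough sketch of how such a results list could be assembled follows. The scoring formula (keyword matches multiplied by a popularity number) is an assumption made for illustration, since search engines do not publish their formulas; the page texts, popularity values and URLs are all made up.

```python
def rank(query, pages, popularity, paid=()):
    """Return URLs for a query: paid placements first, then organic results.

    pages:      dict of URL -> page text (the engine's "database")
    popularity: dict of URL -> popularity weight (assumed, 1.0 if missing)
    paid:       URLs of paid placements, shown ahead of everything else
    """
    terms = query.lower().split()
    scored = []
    for url, text in pages.items():
        words = text.lower().split()
        matches = sum(words.count(term) for term in terms)
        if matches:
            index_value = matches * popularity.get(url, 1.0)
            scored.append((index_value, url))
    # Highest index value first; ties broken alphabetically by URL.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return list(paid) + [url for _, url in scored]


pages = {
    "a.example": "telecommuting productivity study",
    "b.example": "telecommuting telecommuting tips",
    "c.example": "pizza recipes",
}
popularity = {"a.example": 3.0, "b.example": 1.0}

results = rank("telecommuting productivity", pages, popularity,
               paid=["ad.example"])
print(results)
```

Both a.example and b.example match two search words, but the more popular a.example is listed first, and the paid placement precedes both.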
Automated Index Building Problems

No standards

HTML Documents are not structured so that
robots can extract routine information:



Except for <meta> tags: keywords,
description, publication date, author, etc.
Robots index text, not graphics, movies,
etc.
Search turns up inappropriate documents
Some Results Not "Genuine"



Sponsored links at top or side of page
Overture (pay for placement) is now
owned by Yahoo!
Google AdWords

only pay for click-throughs
Web Directories
Built by Human Indexing


Analyze site’s purpose
Classify sites by broad subject area


hierarchical classification schemes
Yahoo! - has many people reviewing web
site submissions


doesn't have to accept submissions
6-week delay unless you pay for priority service?
Meta Search Engines

Don't have their own databases or
indexing


instead, combine results from other search
engines
Examples


Dogpile, Clusty
Ixquick (top 10 listings from other engines)
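
Combining results from several engines could look something like the sketch below. It is not the actual method used by Dogpile, Clusty or Ixquick; it simply assumes that a page listed in the top 10 by more engines should come first, with ties broken by the best position. Engine names and URLs are placeholders.

```python
def merge_results(results_by_engine, top_n=10):
    """Merge ranked URL lists from several engines into one list.

    Assumes each engine's list contains no duplicate URLs.
    """
    votes = {}      # URL -> number of engines listing it in their top_n
    best_rank = {}  # URL -> best (lowest) position seen in any engine
    for urls in results_by_engine.values():
        for position, url in enumerate(urls[:top_n]):
            votes[url] = votes.get(url, 0) + 1
            best_rank[url] = min(best_rank.get(url, position), position)
    # Most votes first, then best position, then URL for a stable order.
    return sorted(votes, key=lambda u: (-votes[u], best_rank[u], u))


merged = merge_results({
    "engine_a": ["x.example", "y.example", "z.example"],
    "engine_b": ["y.example", "x.example", "w.example"],
    "engine_c": ["y.example", "w.example"],
})
print(merged)
```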
Specialized Search Engines






Search Engines & Specialized
Directories
List of search engines (Wikipedia)
Search Engines Directory
Specialized Search Engines and
Directories (many for educators)
Buzz Monitoring: 26 Free Social Media
Tracking Tools
Google: specialized "search engines"
Search Engines Ranked by %
of 16.7 Billion US Searches
Engine        Share    Searches (billions)
Google        65.8%    10.3
Yahoo         17.1%    3.4
Microsoft     11.0%    2.1
Ask Network   3.8%     0.6
AOL LLC       2.3%     0.4
Source: comScore Releases July 2010 U.S. Search Engine Rankings

Global Search Rankings
Engine           Share
Google           69.7%
Yahoo            5.4%
Bing             4.8%
Baidu (China)    4.6%
Yahoo (Japan)    4.4%
Source: Microsoft and Baidu Gain Share on Google. Strategy Analytics, 2010, Q2
Search Engines Ranked by
Pages Indexed (billions)
Engine        Pages (billions)   As of
Yahoo!        19.2*              Aug. 2005
Google**      11.3               Aug. 2005
MSN           5.0                Nov. 2004
Ask Jeeves    2.5                Nov. 2004
* web pages (+1.6 B images, etc.)
** Google Now Knows About 1 Trillion Pages (July 2008)

Get Site Into Directories



Directories (e.g., Yahoo!) require careful
selection of search categories & keywords
Search for your keywords on Yahoo! to find
appropriate categories
Yahoo! asks for a 25-word description of
content

make it really good to impress human indexers
Targeting Spiders




pick "keywords" that people would use to find
a page like yours
make these keywords prominent in your web
pages, especially in the entry page
Top Search Engine Ranking Factors for Google
Search Engine Ranking Factors | SEOmoz
Meta Tags

keywords meta tag used to be important


<meta name="keywords" content=
"telecommuting, research, telecommuting
research, telecommute, telecommutes,
telecommuter, telecommuters">
many search engines ignore them now because
of widespread attempts to use them to
manipulate rankings
Meta Tags - Description

even though not used much in rankings
anymore, contents of following tag are
shown in Google page listing

<meta name="description"
content="Westfall research and papers on
telecommuting, telecommuting
productivity, telecommuting economic
analyses, telecommuting strategies">
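
To see what a spider could read from these tags, here is a small sketch using Python's standard html.parser module. The class name MetaTagParser is made up for this example, and the sample page shortens the keywords and description tags shown on the slides.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects <meta name="..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attrs = dict(attrs)
            name = attrs.get("name")
            content = attrs.get("content")
            if name and content is not None:
                # Collapse line breaks inside the attribute value.
                self.meta[name.lower()] = " ".join(content.split())


page = """<head>
<meta name="keywords" content="telecommuting, research, telecommuting
research, telecommute">
<meta name="description"
content="Westfall research and papers on telecommuting">
</head>"""

parser = MetaTagParser()
parser.feed(page)
keywords = [k.strip() for k in parser.meta["keywords"].split(",")]
print(keywords)
print(parser.meta["description"])
```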
Keywords for Spiders

All keywords are not created equal;
spiders give heavier weights to:






keywords in the <title> (more than once?)
keywords in <h1> and other headers
keywords in other text near top of page
keywords in links (seen by user or in URLs)
bold faced keywords? italics?
How Search Engines Rank Web Pages
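
The idea that keywords count for more in some places than others can be sketched as a weighted count. The weights below are invented for illustration (real engines do not publish theirs), the sketch assumes properly nested tags, and it ignores how near the top of the page a word appears.

```python
from html.parser import HTMLParser
import re

# Hypothetical weights per tag; anything else counts as ordinary text.
WEIGHTS = {"title": 5.0, "h1": 3.0, "h2": 2.0, "a": 2.0, "b": 1.5, "strong": 1.5}
DEFAULT_WEIGHT = 1.0


class KeywordScorer(HTMLParser):
    """Adds up weighted occurrences of one keyword in a page."""

    def __init__(self, keyword):
        super().__init__()
        self.keyword = keyword.lower()
        self.open_tags = []   # weighted tags currently open
        self.score = 0.0

    def handle_starttag(self, tag, attrs):
        if tag in WEIGHTS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()

    def handle_data(self, data):
        hits = re.findall(r"[a-z0-9]+", data.lower()).count(self.keyword)
        if hits:
            weight = max((WEIGHTS[t] for t in self.open_tags),
                         default=DEFAULT_WEIGHT)
            self.score += hits * weight


page = ("<title>Telecommuting research</title>"
        "<h1>Telecommuting</h1>"
        "<p>Notes on telecommuting and <b>telecommuting</b> costs.</p>")

scorer = KeywordScorer("telecommuting")
scorer.feed(page)
print(scorer.score)
```

With these assumed weights the title hit contributes 5, the header 3, the plain text 1 and the bold text 1.5.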
More Keywords for Spiders

Use keywords frequently, but don't repeat
same word more than once in a row

OK: pizza pizza
not good: pizza pizza pizza pizza pizza pizza

Use variations of keywords (plurals)
Use keywords in alternate text for images

<img src="file.jpg" alt="[keywords]">
SEO quizzes
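
A simple check for the "pizza pizza" rule is sketched below. The limit of two in a row is taken from the slide's own example; the function name and the idea of checking your own text this way are assumptions for illustration, not a description of what any search engine runs.

```python
import re

MAX_RUN = 2  # "pizza pizza" is OK; longer runs are not


def longest_repeat_run(text):
    """Return (word, length) for the longest run of one word in a row."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    best_word, best_len = None, 0
    run_word, run_len = None, 0
    for word in words:
        if word == run_word:
            run_len += 1
        else:
            run_word, run_len = word, 1
        if run_len > best_len:
            best_word, best_len = run_word, run_len
    return best_word, best_len


for sample in ("Best pizza pizza in town",
               "pizza pizza pizza pizza pizza pizza"):
    word, length = longest_repeat_run(sample)
    verdict = "OK" if length <= MAX_RUN else "not good"
    print(f"{sample!r}: {word} x{length} -> {verdict}")
```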
Links for Spiders

Number of pages linking to a site has
become extremely important



Google pioneered this
if high ranking pages link to a site on the
same topic, it must be good
Quality of links is also important

need to be relevant both to page they are
on and to linked page
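
The link-counting idea can be illustrated with a simplified version of the publicly described PageRank calculation, in which each page passes a share of its own rank to the pages it links to. This is a teaching sketch, not Google's production algorithm: the damping value 0.85 is the commonly quoted default, the four-page link graph is made up, and topic relevance of links is not modelled.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: dict of page -> list of pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Pages a, b and d all link to c; c links back to a only.
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]})
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

Page c ends up on top because three pages link to it, and page a comes second because its single incoming link is from the highly ranked c.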
Trying to Fool Spiders

Search Engine “Spamming” (16 flavors)


spiders are being programmed to detect it
Examples:

repeat hidden keywords (bottom of page)



like background color, or <font size=1>
keywords not related to site content
irrelevant links: "link farms" or "link
stuffing" (ethical issues)
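
How a spider might spot the hidden-keyword tricks listed above is sketched here. It only looks for the two cases on the slide (font color equal to the page background, and <font size=1>) in old-style HTML attributes; real detection would also have to handle CSS, and the class name is invented for this example.

```python
from html.parser import HTMLParser


class HiddenTextDetector(HTMLParser):
    """Flags text inside <font> tags that is tiny or background-colored."""

    def __init__(self):
        super().__init__()
        self.background = None
        self.font_stack = []    # one flag per open <font>: suspicious or not
        self.hidden_text = []

    def handle_starttag(self, tag, attrs):
        attrs = {k: (v or "").strip().lower() for k, v in attrs}
        if tag == "body":
            self.background = attrs.get("bgcolor")
        elif tag == "font":
            tiny = attrs.get("size") == "1"
            same_color = (self.background is not None
                          and attrs.get("color") == self.background)
            self.font_stack.append(tiny or same_color)

    def handle_endtag(self, tag):
        if tag == "font" and self.font_stack:
            self.font_stack.pop()

    def handle_data(self, data):
        if any(self.font_stack) and data.strip():
            self.hidden_text.append(data.strip())


page = ('<body bgcolor="#ffffff"><p>Welcome to our store.</p>'
        '<font color="#FFFFFF">cheap cheap deals</font>'
        '<font size=1>pizza pizza</font></body>')

detector = HiddenTextDetector()
detector.feed(page)
print(detector.hidden_text)
```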
"Black Hat" SEO Tactics



Black Hat SEO (web page)
Bad SEO example page
Anti-link management


forged emails asking for removal of links to
competitors
"Black Hat" SEO (Google search)
Fake Sites on Search Engines

Security researcher Jim Stickley created
a phony site for a real credit union


redirected visitors to real site
phony site got #2 ranking on Yahoo
and #1 on Bing

ahead of even the credit union's real site
Google Blacklist

It's not nice to try to fool Google!


GoCompare.com had an 87% decrease in
web traffic after being blacklisted by
Google
search on "Google blacklist"
Register your Web Site with
Search Engines

Register individually with top sites


Yahoo!, MSN, Open Directory Project (goes
into Google, etc.)
Try site submission web sites?


Submit Express 75,000 search engines,
$29.95
Change content, resubmit every so often?
"I can guarantee a top 10 …"


Junk mail and web sites
True, but…


not for your 1st choices of key words
Use a relatively unique combination of
several words, and then load them into
key parts of page (<title>, <H1>, etc.)

probably not many people will search for
this combination of words

e.g., telecommuting productivity
Guaranteed Top 10 Listing

Use misspelled words


Use unique combination of words


including 2 words run together (no space
between)
keep adding unrelated words to a search
until you find combination not found on
any other page
Manually submit page and put links to it
on other page(s) already in Google
"Google bombing"

drives traffic to other pages by links and
keywords



Google search on failure
Google's AdWords ("Why these…")
pages linking to new biography URL
Scam Website Clusters

Scam promoters set up hundreds of
search-optimized sites about the scam


when you look for more information, most
search results say good things about it
example: try searching for keywords from
Magic Words that Bring You Riches page
and see how hard it is to find criticisms
Search Engine Exercise


Search for your keywords on any
automated search engine
For top 2-4 sites, look for keywords in:




<meta...>, <title>, <h1>, <a href="…>,
<img… alt="…>, etc. (use View, Source)
words in page, esp. near top
also use Google advanced search (click
Date, etc. then put URL in Find pages that
link to the page:)
Report any patterns you see
Site Submit Exercise

Identify a site to submit



find sites in Google related to Cal Poly
Go through the process of submitting to
a search engine or other submittal site
Take notes, report back on experiences:


how long it took
information required, etc.
Locate Information on Internet





Search Engines, Directories, Meta Search
Engines, On-Line Indexes
Pages with information on specific topics
White & Yellow pages
Usenet News
On-line newspapers, magazines, radio and
TV channels
Evaluating Information Quality

Source of site:





Educational institution (e.g., MIT)
Professional organization (e.g., IEEE)
Government agency (e.g., NASA)
Ratings by independent evaluators
Corroborating evidence: multiple, reliable
sources
Quality of “Did You Know?”


YouTube video that “went viral”
exercise: identify statements that
probably are not true


count passive references e.g., "predictions
are"
2010: Data doubling every 11 hours (do
the math!)
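
One way to "do the math" on the 11-hour doubling claim: count how many doublings fit in a year and see what growth factor that implies. The only assumption is a 365-day year.

```python
import math

HOURS_PER_YEAR = 365 * 24           # 8760 hours
doublings = HOURS_PER_YEAR / 11     # doublings in one year
growth_log10 = doublings * math.log10(2)

print(f"{doublings:.0f} doublings per year")
print(f"growth factor of about 10^{growth_log10:.0f} in one year")
```

That is roughly 796 doublings, or a growth factor near 10^240 in a single year. For comparison, the number of atoms in the observable universe is commonly estimated at around 10^80, so the claim cannot be literally true for long.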
Citing Web Information


Whenever you use someone else’s
ideas, you have to cite them
Format for a research paper

American Psychological Association (APA)
Beckleheimer, J. (1994). How do you cite URL's in a
bibliography? Retrieved [month day, year], from [URL]

Graphics: if owner gives permission,
follow their directions for giving credit
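
The "Retrieved [month day, year], from [URL]" pattern used in the Beckleheimer reference can be filled in mechanically. The sketch below follows that pattern, with a period after the year as APA style uses; the retrieval date and the example.com URL are placeholders, not the real details of that reference.

```python
from datetime import date

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def cite_web(author, year, title, retrieved, url):
    """Format a web reference following the pattern shown on the slide."""
    when = f"{MONTHS[retrieved.month - 1]} {retrieved.day}, {retrieved.year}"
    return f"{author} ({year}). {title} Retrieved {when}, from {url}"


citation = cite_web(
    "Beckleheimer, J.",
    1994,
    "How do you cite URL's in a bibliography?",
    date(2010, 8, 15),                 # placeholder retrieval date
    "http://example.com/cite.html",    # placeholder URL
)
print(citation)
```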