Survey
ECT 250: Survey of e-commerce technology
Searching, tables, and markup languages

Topics
The topics for today’s lecture:
1. Searching the web
2. FrontPage tables
3. A survey of markup languages

Searching the WWW
• Exploring the Web can be very time-consuming.
• Search engines and directories enable you to locate relevant web pages more quickly and efficiently.
• A search engine is software that allows you to type in keywords. The engine scans a database of Web pages and displays a list of pages that meet your criteria.
• A directory organizes Web pages into categories. You can click on appropriate categories until you find a Web page that matches your chosen topic.

Search engines/directories
• Altavista (http://www.altavista.com)
• Excite (http://www.excite.com)
• DirectHit (http://www.directhit.com/)
• Fast Search (http://www.ussc.alltheweb.com/)
• Go (http://www.go.com)
• Google (http://www.google.com)
• HotBot (http://www.hotbot.com)
• Northern Light (http://www.northernlight.com)
• Yahoo (http://www.yahoo.com)
• Web Crawler (http://www.webcrawler.com)

Naïve searches
• A single keyword search can yield thousands of sites, many of which are irrelevant.
  Example: a search on www.northernlight.com for climbing yields 900,000+ hits.
• Multiple keywords can help.
  Example: Illinois, Wisconsin, climbing on www.northernlight.com yields 11,000 hits.
• To save time and effort, it pays to construct a more sophisticated search that will yield fewer hits with a higher percentage of relevant pages.

Searching tips
• Use a directory to find information on a general topic. Use keywords in a search engine for specific information or narrow topics.
• Read the search engine's own searching tips to help you construct a more precise query.
• Use multiple, specific keywords.
• Try using different words and synonyms.
• Use advanced search features to make your query more focused.
• Try multiple search engines/directories.
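As a rough illustration of why extra keywords cut down the number of hits, the following Python sketch builds a toy keyword index and intersects the matches for each keyword. The page collection, the example.org URLs, and the function names build_index and search are all invented for illustration; real search engines index and rank pages in far more elaborate ways.

```python
from collections import defaultdict

# Hypothetical miniature "database of Web pages": URL -> page text.
PAGES = {
    "example.org/climbing-gear": "climbing gear ropes harness",
    "example.org/illinois-climbing": "climbing gyms in illinois",
    "example.org/wisconsin-climbing": "devils lake wisconsin climbing routes",
    "example.org/midwest": "illinois wisconsin climbing trip report",
    "example.org/illinois-history": "illinois state history",
}


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, *keywords):
    """Return the URLs that contain every keyword (an implicit AND)."""
    if not keywords:
        return set()
    results = None
    for keyword in keywords:
        matches = index.get(keyword.lower(), set())
        # Copy on the first keyword so the index itself is never modified.
        results = set(matches) if results is None else results & matches
    return results


INDEX = build_index(PAGES)

# One broad keyword matches most of the collection.
print(len(search(INDEX, "climbing")))  # 4

# Three keywords together leave only the page that mentions all of them.
print(sorted(search(INDEX, "illinois", "wisconsin", "climbing")))  # ['example.org/midwest']
```

Each additional keyword can only shrink the result set, which is the same effect seen in the Northern Light example: more specific queries return fewer, more relevant pages.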
Advanced search options
• Special operators (and, or, not, near)
• Search for phrases, not just keywords
• Domain specific searches: include or exclude pages based on their domain
• Specify the language of the search
• Page specific searches: pages that link to or are similar to a given page
• Give a bound on the most recent update
• Specify whether the site contains images, audio, or visual information
Example: www.google.com

Limitations
Search engines examine only a fraction of the web pages available on the World Wide Web. A study released in 1998 estimated that the best engines indexed only 33% of the publicly indexable Web. The 1999 follow-up study found the coverage had decreased to only 16%. More important may be the techniques used by the search engine in ranking and updating pages.

Tables
Tables are used on Web pages to:
• Place a table of contents or other important information in a specific location
• Keep images and text aligned properly
• Divide the page into columns
Although it is not always apparent, many pages use tables within other tables for layout. Tables consist of rows, columns, and cells.

Creating a table
1. Click on the Insert Table toolbar button.
2. Use the mouse to highlight the desired number of rows and columns on the Insert Table grid.
3. Release the mouse button and the table will appear.
4. The table will have a default size and style that you can modify.
Once the table has been created, text can be typed and images inserted into cells of the table.

Table and cell properties
• The properties of a table determine its alignment, border, width, colors, etc.
• To change table properties, right click on the table and select Table Properties from the menu.
• The properties of cells include layout, colors, width, height, etc.
• Individual cell properties can be changed by right-clicking inside a cell to reach the Cell Properties dialog box.

Table properties
• Alignment: The alignment of the entire table on the Web page. One of Default, Left, Right, Center, or Justify.
• Cell padding: The number of pixels between the cell contents and border. Default is 1 pixel.
• Cell spacing: The number of pixels between the borders of adjacent cells. Default is 2 pixels.
• Width and height: Dimensions in either pixels or percentage of the browser window. Default is to fit the contents of the cells.

Table properties (continued)
• Border size: Width of border in pixels. Default is 0.
• Border colors: One or two colors used for the border of all cells. Only visible if the border size is non-zero.
• Background color or picture: Used in the background of all cells unless otherwise specified. Default is to use the page background.

Formatting with tables
• Tables with border width zero can be used to arrange information on a Web page.
• To use a table to lay out an entire Web page, select a table of the appropriate size and then create additional tables inside the main table.
• Example: Format the header of my sample home page using a table. See also my home page at http://facweb.cs.depaul.edu/asettle/ for more formatting with tables.

Cell properties
• Horizontal alignment: One of Default, Left, Right, Center, or Justify. Default is usually Left.
• Vertical alignment: One of Default, Top, Middle, Baseline, or Bottom. Default is usually Middle.
• Rows and columns spanned: Changes the size of the cell. Default is one row and one column.
• Header cell: Makes the contents of the cell bold and centered. Default is to not be a header cell.
• No wrap: Prevents cell contents from wrapping if they exceed the width of the cell.

Cell properties (continued)
• Width and height: Measured in either pixels or a percentage of the browser window. The default is an even division of the table size between the cells.
Width changes affect the entire column, and height changes affect the entire row.
• Border colors: Specify one or two different colors for the cell border. The automatic setting uses the same colors as the rest of the table.
• Background color or picture: Specifies a different color or image from the rest of the table.

Modifying table structure
It is often necessary to change the structure of a table after it has been created. Modifications include:
• Inserting rows and columns
• Deleting rows and columns
• Merging cells: combine a rectangular group of cells into a single, larger cell
• Splitting cells

Inserting/deleting rows/columns
Inserting rows or columns:
• Move the insertion point inside the row or column closest to where the new row or column should be placed.
• Select Insert Rows or Columns from the Table menu.
• Specify how many rows or columns to insert and where they should be located. Click OK.
Deleting rows or columns:
• Select the appropriate row or column.
• Select Delete Cells from the Table menu.

Merging/splitting cells
• Merging cells combines a rectangular group of cells into one cell. It is used when the contents of a table are not a uniform size.
• To merge cells, select all the appropriate cells, then choose Merge Cells from the Table menu.
• To split a cell into multiple ones, move into the cell, then select Split Cells from the Table menu. At the Split Cells dialog box, indicate the desired number of rows and columns.

Markup languages
• FrontPage is an HTML editor.
• HTML stands for hypertext markup language.
• It is an example of a markup language.
• Historically markup has described annotations and handwritten notes found on manuscript pages that tell a typist how a particular page should be laid out or typeset.
• Electronic markup languages use tags to govern the display, formatting, and organization of text elements.

Three markup languages
Three markup languages are of particular interest:
1. SGML (Standard Generalized Markup Language) is the parent language from which the other two are derived. It is a metalanguage used to define other markup languages.
2. HTML (Hypertext Markup Language)
3. XML (Extensible Markup Language) is another descendant of SGML. It defines data structures important for a wide range of data exchange activities.

HTML
An HTML document contains both document content and tags.
• The content consists of all the information that appears in the browser window, including text, graphics, and video.
• Tags are the HTML codes that specify how the document should be formatted.
Example: http://condor.depaul.edu/~tsettle/ect250/main.html

HTML tags
• Each HTML tag is enclosed in angle brackets.
• Two-sided HTML tags come in pairs. The general form of a two-sided tag is:
  <tagname properties>Content</tagname>
  The opening tag is <tagname properties>. The closing tag is </tagname>.
• Some HTML tags are one-sided, requiring only the opening tag.
• Tags are not case-sensitive.

Types of tags
There are a large number of tags. Some examples:
• Document tags: specify the parts of the document such as the heading, title, body.
  <title></title>, <html></html>
• Text structure tags: determine the layout of the text found in the body of the document.
  <h1></h1>, <p></p>, <br>
• Style tags: specify how text will be shown by the browser.
  <center></center>, <em></em>
• Image tag: <img src="name" other-properties>
• Anchor tag: <a href="URL"></a>

The meta tag
Search engines catalog sites by following links from page to page and saving identification information for each page visited. The main HTML element that interacts with search engines is the Meta tag.
Using the Meta tag, you can list information about your page that helps a search engine classify its contents.

Attributes of the meta tag
The Meta tag has two attributes that should always be used:
1. The Name attribute identifies the type of Meta tag you are including.
2. The Content attribute provides information the search engine will be cataloging about your site.
Example:
<Meta Name="keywords" Content="algorithms, complexity, quantum, information, retrieval, kolmogorov, security, arrays, cryptography, faculty, combinatorics">

History of HTML
• HTML 1.0: Introduced in 1991 by Berners-Lee. At that time there was no standard for HTML.
• HTML 2.0: Released in 1995. Began to move to a standard. Released at the same time were MS IE 2.0 and Netscape’s Navigator 2.0.
Recall that the World Wide Web Consortium (W3C) serves as a leader in maintaining Web standards and common protocols. It was founded in 1994.

History of HTML (continued)
• HTML 3.2: Introduced in 1997 by the W3C. Supported tables, applets, and text flow around images.
• HTML 4.0: Released by W3C in 1997. Included support for cascading style sheets, and added international features such as the ability to render text right to left.
• HTML 4.01: Released by W3C in 1999. Supported more multimedia options and scripting languages, and made documents more accessible to users with disabilities.

History of HTML (continued)
• XHTML 1.0: A reformulation of HTML 4 in XML, published by W3C as a Recommendation in January 2000.
• XHTML Basic: Released in December 2000 by W3C, incorporating elements of XML into HTML to allow development on a wider set of devices such as TVs, PDAs, pagers, and cellular phones.

SGML
• Work on the definition of a Generalized Markup Language for describing electronic documents and their format was begun in the 1960s.
• In 1986, the International Organization for Standardization (ISO) adopted a version of the standard called Standard Generalized Markup Language.
• SGML includes a standard that defines device-independent and machine-independent methods for representing electronic documents.

Advantages of SGML
• SGML is good for organizations with special or complex requirements for the management of documents. Examples: U.S. DOD, HP
• It is stable since it was standardized in 1986.
• It is platform independent and will outlive most current applications.
• It supports user-defined tags and architecture.
Why is SGML not used by everyone?

Disadvantages of SGML
• SGML’s tools are relatively expensive when compared to HTML.
• SGML has a steep learning curve.
• It is costly to set up and maintain, requiring extensive training and expertise.
• Creating document type definitions with SGML can be expensive in terms of human labor.

XML
• Extensible Markup Language is also derived from SGML, although it is newer than HTML.
• It represents an effort to define what information is on a Web page. This contrasts with HTML where the emphasis is on the format of the data.
• XML allows designers to easily describe and deliver structured data from any application in a standard, consistent way.

Idea behind XML
• XML is both a markup language and a meta markup language.
• XML allows you to create new tags for each type of document you are storing.
• In this way, XML stores information in a structured manner.
• It is also interoperable with both HTML and SGML. This allows data stored in XML to be displayed (using HTML) and integrated with SGML documents.

XML example I
<article>
  <title>Some XML</title>
  <date>January 29, 2001</date>
  <author>
    <fname>Amber</fname>
    <lname>Settle</lname>
  </author>
  <summary>Sample XML</summary>
  <content>XML is not for displaying information but for managing information.
  </content>
</article>

XML example II
<list>
  <employee>
    <fname>Simone</fname>
    <lname>Settle</lname>
    <ssn>123-00-5454</ssn>
    <salary>70000</salary>
    <position>network administrator</position>
    <hire-year>1999</hire-year>
  </employee>
  <employee>
    <fname>Joon</fname>
    <lname>Elam</lname>
    <ssn>456-88-7654</ssn>
    <salary>62000</salary>
    <position>web designer</position>
    <hire-year>2000</hire-year>
  </employee>
</list>
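Because the tags in XML example II describe what each value means, a program can pick fields out by name rather than by position. The sketch below assumes Python and its standard xml.etree.ElementTree module and reads the employee list from that example; the function name load_employees and the dictionary keys are hypothetical choices made for illustration, not part of the lecture.

```python
import xml.etree.ElementTree as ET

# The employee list from XML example II, stored as a string.
EMPLOYEE_XML = """
<list>
  <employee>
    <fname>Simone</fname>
    <lname>Settle</lname>
    <ssn>123-00-5454</ssn>
    <salary>70000</salary>
    <position>network administrator</position>
    <hire-year>1999</hire-year>
  </employee>
  <employee>
    <fname>Joon</fname>
    <lname>Elam</lname>
    <ssn>456-88-7654</ssn>
    <salary>62000</salary>
    <position>web designer</position>
    <hire-year>2000</hire-year>
  </employee>
</list>
"""


def load_employees(xml_text):
    """Parse the XML and return one dictionary per <employee> element."""
    root = ET.fromstring(xml_text.strip())
    employees = []
    for node in root.findall("employee"):
        employees.append({
            "name": f"{node.findtext('fname')} {node.findtext('lname')}",
            "position": node.findtext("position"),
            "salary": int(node.findtext("salary")),
            "hire_year": int(node.findtext("hire-year")),
        })
    return employees


EMPLOYEES = load_employees(EMPLOYEE_XML)

for employee in EMPLOYEES:
    print(f"{employee['name']}: {employee['position']}, hired {employee['hire_year']}")

# The structured data can be aggregated directly, for example total salary.
print(sum(employee["salary"] for employee in EMPLOYEES))  # 132000
```

Nothing in the XML says how the data should look on screen; the same file could feed a payroll report, an HTML page, or another application, which is the point of separating structure from display.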