LIS618 lecture 4
Thomas Krichel
2003-02-19

Structure of talk
• Before online searching
• Introduction to online searching
• Introduction to DIALOG
  – Overview
  – bluesheets

before a search I
• What is the purpose?
  – brief overview
  – comprehensive search
• What perspective on the topic?
  – scholarly
  – technical
  – business
  – popular

before a search II
• What type of information?
  – Fulltext
  – Bibliographic
  – Directory
  – Numeric
• Are there any known sources?
  – Authors
  – Journals
  – Papers
  – Conferences

before a search III
• What are the language restrictions?
• What, if any, are the cost restrictions?
• How current do the data need to be?
• How much of each record is required?

DIALOG Literature
http://training.dialog.com/sem_info/courses/pdf_sem/dlg1.pdf
http://training.dialog.com/sem_info/courses/pdf_sem/dlg2.pdf
http://training.dialog.com/sem_info/courses/pdf_sem/dlg3.pdf
http://training.dialog.com/sem_info/courses/pdf_sem/dlg4.pdf

Dialog is a databank
• over 500 databases
• these are also known as files and cover
  – references and abstracts for published literature
  – business information and financial data
  – complete text of articles and news stories
  – statistical tables
  – directories
• DIALOG uses the Boolean model

DIALOG interface
• is still rooted in "traditional" database systems
• dismissed as "dial-a-dog"
• it uses a command-driven interface
• it is very complicated to learn fully
• it is not suitable for the end-user
• it therefore offers a valuable skill to the information professional
• it is a challenge for a professor to teach

Accessing DIALOG
• On the web, go to http://www.dialogweb.com/
• Enter username and password, then click on logon
• When it is all done, click logoff in the top menu.

two steps in DIALOG
• step one: select databases (aka files) to look at
• step two: perform searches on the selected databases
• You may wonder why there is not one single step, as in a search engine. Discuss.
• Today we concentrate on the second step.

working on selected files
• We assume that we have selected a database that we know, and we look at the search interface on the selected database.
• The database selection process is a bit more complicated; it is covered next week.
• First, let us log in and look at the command prompt.
• Then we select the first database (file) with the begin command.

The begin command
• As its name suggests, it is usually the first command.
• begin number, number,…
• selects the files with the given numbers
• Once they are selected they can be searched.
• Now select ERIC: "begin 1"
• "begin 1" can be abbreviated as "b 1"

Substeps in the second step
• Identify search terms
• Use Dialog basic commands to conduct a search
• View records online or print the results

the 's' (select) command
• Once we have issued the "begin" command to select a database, we issue the "s" command on the database.
• "s query_terms" where query_terms are the query terms
• This will search the index of the selected database in full-text view for the query issued.
• It will not find any of the following: "an and by for from of the to with". They are stop words.

connectors
• If you want to use several keywords there are three ways
  – you can truncate search terms
  – you can build an expression by putting several keywords together. This is achieved by DIALOG's connectors.
  – you can combine several expressions with the use of Boolean operators
• we will cover these in turn now

truncation of terms
• Open Truncation
  – "select path?" retrieves all words that begin with path: paths, pathos, pathway, pathology
• Controlled-Length Truncation
  – "select path? ?" retrieves the root and up to one additional character: paths
  – "select path??"
retrieves the root and up to two additional characters: paths, pathos

truncation of terms II
• Embedded Character truncation can be used for variant spellings:
  – "select organi?ation" -> organization organisation
  – "select fib??board" -> fiberboard fibreboard
• This truncation feature is also useful for searching for unusual plural forms:
  – "select wom?n" -> woman women
• You can also match prefixes by putting the ? at the beginning.
  – "?mobile" -> automobile metamobile

Use of connectors
• Connectors are used to put several words together.
• One instance where this is useful is when you have words that on their own mean different things.
• For example "mate" is a herbal beverage consumed in South America. Looking for mate on the Internet retrieves a lot of singles' pages.

terms connected to mate
• What other terms can be used?
  – matear (suck mate)
  – matero (mate sucker)
  – cebar (prepare mate)
  – cebador (mate preparer)
  – yerba (mate herb)
  – bombilla (mate straw)

connectors I
• '(W)' requires the terms to appear directly next to each other, in the order given, e.g. 'yerba(W)mate?' matches "yerba mate".
• '(iW)', where i is an integer, means the second term follows the first with at most i words in between, e.g. 'ceba?(3W)mate?' matches "cebar un maravilloso mate" but not "cebador guapo mirando un buen mate"

connectors II
• '(N)' requires the terms to be next to each other, in either order, e.g. 'yerba(N)mate?' matches "yerba mate" or "mate yerba".
• '(iN)', where i is an integer, means the terms are within at most i words of each other, in either order, e.g. 'ceba?(3N)mate?' matches "cebar mate" or "matear con la cebadora".
• '(S)' searches for the occurrence of connected terms in the same paragraph.

using Boolean operators
• In your query, you can combine several expressions with Boolean operators
• Example: "?SELECT LIBRARY(W)SCHOOL? AND DISTANCE(W)EDUCATION"
• But I usually do not issue such fancy queries.

executing several searches
• Several searches can be done sequentially, and the result sets are saved by the system.
• Each time the system assigns a set number.
• These can be combined in Boolean expressions, e.g. 's S1 or S2 and S3'
• Remember that Boolean operations are set-theoretic!

Boolean operators
• when using Booleans, be aware that "and" has higher precedence than "or".
• Thus: a or b and c is not the same as (a or b) and c but it is a or (b and c)

type command
• type set/format/range
• set is a result set
• format is a format
• range can be
  – start-end
    • start is a record number to start
    • end is a record number to end
  – all

formats are defined
• 2: full record except abstract
• 3 or medium: citation
• 5 or long: full except full text
• 6 or free: title and dialog number
• 8 or short: title plus indexing terms
  – useful to find other indexing terms
• 9 or full: everything
• KWIC or K: keywords in context

http://openlib.org/home/krichel

Thank you for your attention!
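
Backup: Boolean precedence as set operations

The precedence rule from the Boolean operators slide can be checked on small sets. This is only a sketch: Python sets stand in for DIALOG result sets, the names s1, s2, s3 and the record numbers in them are invented for illustration, and Python's "|" and "&" are assumed here to play the roles of "or" and "and".

```python
# Hypothetical result sets; the record numbers are made up for illustration.
s1 = {1, 2, 3}
s2 = {3, 4, 5}
s3 = {5, 6}

# In Python, & (intersection, "and") binds tighter than | (union, "or"),
# which mirrors the rule on the slide.
default = s1 | s2 & s3        # parsed as s1 | (s2 & s3)
explicit = s1 | (s2 & s3)     # the same grouping, written out
regrouped = (s1 | s2) & s3    # a different query with a different result

print(sorted(default))
print(sorted(regrouped))
```

With these sets, the ungrouped expression gives the same records as the "a or (b and c)" reading, while "(a or b) and c" gives a smaller set. When in doubt, write the parentheses out.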