LIS618 lecture 3
Thomas Krichel
2002-09-23

Structure of talk
• The blue sheet
• Working with Dialog
• Nexis.com

using dialog
• go to command search
• pass warning screen
• you get to "dialog web command search"
• http://www.dialogweb.com/cgi/logoff?mode=guided&url=/cgi/dwframe?href=search.html
• searches there do not work well at this level

blue sheet
• each database name is linked to a blueish pop-up window called the blue sheet for the database
• The contents of the bluesheet are covered later
• at this stage we choose a database and hit "begin". We see that there is a command selected: "be numbers" where numbers are the ones for the databases selected, separated by commas.

database types
• full-text databases
• bibliographic databases
• directory databases
• numeric databases
– but they are not classified as such

finding a database
• file 411 contains the database of databases
• 'sf category' selects the files belonging to the category 'category'
• categories are listed at http://library.dialog.com/bluesheets
• 'b ref,ref' will select databases

closer look at the bluesheet
• file description
• subject coverage (free vocabulary)
• format options, lists all formats
– by number (internal)
– by dialog web format (external, i.e. cross-database)
• search options
– basic index, i.e. subject contents
– additional index, i.e. non-subject

search options: basic index
• select without qualifiers searches in all fields in the basic index
• bluesheet lists the field indicators available for a database
• also note whether a field is indexed by word or by phrase. Proximity searching only works with word indices.
When phrases are indexed you don't need proximity indicators.

search in basic index
• basic index is queried through /IN, where IN is a field indicator
• Thomas calls this an appending indicator
• several field indicators can be ORed by giving a comma separated list, example
• mate/ti,de

additional features
• Some databases allow restricting the search with unary expressions
– /ABS require abstract present
– /ENG English language publication
• Some fields are sortable with the sort command, i.e. records can be sorted by the values in the fields. These are database specific.

additional indices
• additional indices lists those terms that can lead a query. Often, these are phrase indexed.
• Such fields are queried by the prefix IN=term where IN is the field abbreviator and term is the search term
• Thomas calls this a pre-pending indicator

the 's' (select) command
• Once we have issued the "be" command to select a database, we issue the "s" command:
• "s keywords" where keywords is a Boolean expression.
• This will search the selected database in full-text view for the Boolean query issued
• probably just searches the main index
• keywords can be added

display
• you are allowed to select a format and a number of items to be displayed.
• formats vary from database to database, some databases cannot display certain formats

Setting additional terms
• It appears that "drinking and mate" is a better search term…
• What other terms could be used?
– matear (suck mate)
– matero (mate sucker)
– cebar (prepare mate)
– cebador (mate preparer)
• prefix queries can be formed by appending a '?' to the query term.

connectors I
• '(W)' requires terms to appear next to each other, one after the other, e.g. 'yerba(W)mate?' matches "yerba mate".
• '(i W)' where i is an integer, means followed by at most i words, e.g. 'ceba?(3W)mate?' matches "cebar un maravilloso mate" but not "cebador guapo mirando un mate"

connectors II
• '(N)' requires terms to be next to each other in either order, e.g. 'yerba(N)mate?'
matches "yerba mate" or "mate yerba".
• '(i N)' where i is an integer, means proximity by at most i words, e.g. 'ceba?(3N)mate?' matches "cebar mate" or "matear con la cebadora".
• '(S)' searches for the occurrence of connected terms in the same paragraph.

connectors III
• (F) words in the same field, no order
• (L) words in the same descriptor field, used to link headings and sub-headings. This is a hierarchical connector.
• Note: connectors are processed left-to-right. Use parentheses whenever in doubt.

Boolean operators
• when using Booleans, be aware that "and" has higher precedence than "or".
• Thus: a or b and c is not the same as (a or b) and c but it is a or (b and c)

executing several searches
• several searches can be done sequentially, and the result sets are saved by the system.
• Each time the system assigns a set number.
• These can be combined in Boolean expressions, e.g. 's S1 or S2 and S3'
• Remember that Boolean operations are set-theoretic!

Reminder: fielded searches
• search terms can be limited to fields by appending '/field_identifier' to the query term, where field_identifier is the identifier of a field.
• identifiers of fields are also important in the "expand" command

common field identifiers
• 'co' company name
• 'de' descriptor
• 'au' author name
• 'df' one-word descriptor
• 'ti' title
• 'cc' classification value
• 'pn' product name
• 'pc' product code
• 'px' company type

narrowing by date
• 'PY=yyyy', where 'yyyy' is the four digit identifier for a year, limits the publication year
• 'PD=yyyymmdd', where 'yyyy' is the four digit identifier for a year, 'mm' is a two-digit identifier for a month and 'dd' is a two-digit identifier for a day, limits the publication date

expanding queries
• names have to be entered as they appear in the database.
• The "expand" command can be used to see varieties of spelling of a name.
• It has to be used in conjunction with a field identifier, example expand au=cruz, b?
to search for misspellings of José Manuel Barrueco Cruz

expanding queries
• search produces results of the form Ref Items Index-term
– Ref is a reference number
– Items is the number of items where the index term appears
– Index-term is the index term
• "s Ref" searches for the reference term.

DS (display sets)
• This command can be executed any time to review the sets that have been formed since the last B (begin) command.

the stop words
• an and by for from of the to with

add/repeat
• 'add number, number' adds databases by file number to the last query, for example 'add 297' to see what the bible says about it
• 'repeat' repeats the previous query with the database added

the target command
• "target set" where set is a search result fixes a subset of the "statistically most relevant results"
• a new result set is formed.
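
Boolean precedence, a sketch
Going back to the Boolean operators slide and the example 's S1 or S2 and S3': since Boolean operations are set-theoretic, the precedence rule can be tried out with ordinary Python sets. This is only an illustration, not Dialog itself. The record numbers in S1, S2 and S3 are invented, and the sketch relies on the fact that Python's & (intersection) binds tighter than | (union), which mirrors "and" binding tighter than "or" as stated in the talk.

```python
# Hypothetical result sets; the record numbers are made up for illustration.
S1 = {1, 2}
S2 = {2, 3, 4}
S3 = {4, 5}

# "S1 or S2 and S3": & binds tighter than |, like "and" over "or".
unparenthesized = S1 | S2 & S3

# The two possible groupings, written out explicitly.
and_first = S1 | (S2 & S3)   # a or (b and c)
or_first = (S1 | S2) & S3    # (a or b) and c

print(sorted(unparenthesized))  # same records as and_first
print(sorted(and_first))
print(sorted(or_first))         # a different, smaller set
```

With these invented sets, the unparenthesized query and the "and first" grouping both keep records 1, 2 and 4, while the "or first" grouping keeps only record 4. This is why parentheses are worth adding whenever in doubt.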