INDEX
accented characters, and text retrieval, 260
access control, 168
accessor methods, 219
Adept, 396, 397
Advogato, 421
aircraft company, shared authoring scenario in, 171
Ælfred parser, 142, 393
algebraic queries, 256-257
AltaVista, 251
anchor, regular expressions, 24
AnyDBM_File module, 241
Apache cache, 295
Apache mod_perl module, 410
search module, 317
Apache parser, 392
Apache Web Server, 392
online documentation, 421
application programmer's interface (API), 37-38
callbacks, 46
example, 38-44
apt-get program, 384
arbitrary field nesting, XML documents, 164-165
architectural forms, 213
archiving, 168
Ardent, 404
arithmetic encoding, 262, 263
arrays, 208
ASCII, 7
Astoria, 405-406
asynchronous networking, 45-47
example, 47-53
ATA 2100, 147, 151
atomic information, in tables, 60
attribute-list declaration, 26-28
attributes, 61, 72
and entities, 16
markup characters not allowed, 123
XML documents, 12-13, 26-28
authors.tbl, 79
AutoLinked glossary, 327-328
implementation, 332-336
use cases, 328-332
AutoLinker, 327, 337
for document conversion, 336
in memory, 275-284
with text retrieval, 284
for textual analysis, 336
automatic part-of-speech detection, 258
automatic text classification, 253
automatic thesaurus generation, 258
backup, 149-150
Balise programming environment, 116, 395
Barefoot License, 348, 374-375
binary mode, FTP, 382-383
binary packages, 377
installation, 383
binary snapshots, 377
binding, objects in object-oriented databases, 200
BitchX, 424
Bladerunner, 406
BLOB data type, 66
paragraphs as, 183-185
XML documents as, 180-183
BODY element (HTML), 302
book catalogue example, 117-120
CGI script for, 101-104, 105
ESIS output, 127-128
reading in Perl, 120-124
books, about XML and related topics, 413-420
book.tbl, 80-81
BookWeb example, 337-338
MySQL implementation, 76-84
representing tables, 87-91
SQL example, 73-75
Boolean ranking, presenting text retrieval results by, 265
/BriefCase, 408
browsers, see Web browsers
brute force and ignorance, classification method, 252
BSD (Berkeley Software Distribution) License, 347, 365-366
byte order mark, Unicode, 7
bzip2, 150
C
entities, 14
example API, 38-44
functions, 38, 43
parsers, 390
size overhead, 209
using expat, 133-137
C++
Is-A relationship, 211
parsers, 390
Standard Template Library, 208
cache manager, ndbm as, 294-297
callback function, 46-47
installing, 135-136
cardinality, 61, 62
Cascading Style Sheets, 33, 154
categorization, of documents in text retrieval, 252-253
The Cathedral and the Bazaar (Eric Raymond), 342
cdb, 247
central access scenario, 169-170
CGI scripts, 56, 96-98
book catalogue example, 101-104, 105
debugging, 98-101
and Perl DBI, 101-104
testing, 98-100
character references, XML documents, 13-14
character search, 254-255
character set, XML documents, 7
CHAR data type, SQL, 66
Citech browsers, 394
class designs, 215-219
classes, 212
classification, of documents in text retrieval, 252-253
clients, 35-36
second-class citizens, 38
client/server systems, 35-36
APIs, 37-44
asynchronous networking, 45-53
cache manager, 296
networks, 36-37
clustering, 321
comments
DTD, 21-22
XML declaration, 8-9
Common Gateway Interface, see CGI scripts
communication, 168
compound document configuration, 172
Comprehensive Perl Archive Network (CPAN), 388
Concurrent Versions System (CVS), 288, 408
configuration management, 168
configure script, 386
console-apt program, 380, 381
Containment relationship, 206-208
and references, 211-212
contamination, of software distributed with licensed software, 345
content models, 22-23, 26
overriding using parameter entities, 30-31
conversion, 175
encyclopedia to XML, 274
existing object-oriented databases to XML, 211-212
cookies, 57
cooperation, 168
Copyleft, 346
CPAN (Comprehensive Perl Archive Network), 388
cross-references, 148
implicit, 157
in ndbm database, 248-249
CVS (Concurrent Versions System), 288, 408
Cyberbolic Map, 273
database applications
requirements, 145-149
XML architectures, 149-158
database implementation
documents as BLOBs, 180-183
elements as fields, 185-187
elements as objects, 189-190
general issues, 178-180
hybrid approaches, 191-192
metadata only, 187-189
paragraphs as BLOBs, 183-185
text retrieval, 191-192
database integrity, 67
DataBase Interface module (Perl), see Perl DBI
databases. See also object-oriented databases; relational databases; SQL; XML
books about, 416
creation with ndbm, 225-229
external file support, 288
files with, 287-291
normal forms, 72-73
session tracking, 56
and text retrieval, 291-294
database scenarios, 167-169
central access, 169-170
distributed access and technology reuse, 174
information reuse, 172-174
revision control, 171-172
shared authoring, 170-171
database-to-database translations, 85, 86
Data Mirror, 149
data mobility, 85, 86
data transfer, 149-150
db, 221, 245-248
features, 248
hashing with, 409
hybrid applications, 296-297
performance, 236
dbm, 221, 243. See also ndbm
hashing with, 409
dbm_close(), 240, 242
dbm_delete(), 231-233
dbm_fetch(), 229-231
dbm_insert(), 226
dbm_open(), 225, 240, 242
dbm_store(), 226-229
dbz, 247
deadlock, 202
Debian Free Software Guidelines, 342, 345
Debian Linux, 377, 380-382
binary packages on, 383
source packages on, 384
debugging
CGI, 98-101
object-oriented databases, 195
DELETE statement, SQL, 71
delta coding, 263
derived works, 343
design agency, central access scenario in, 169-170
Desperate Perl Hacker, 123
dev-xml (mailing list), 423
Dewey decimal classification system, 252, 253
diff program (Unix), 180
discrimination, in licenses, 344
distributed access and technology reuse scenario, 174
distribution
of licenses, 344
objects in object-oriented databases, 203
DocBook DTD, 151
DOCTYPE declaration, 9
document management, 297-298, 324
document management systems, 324, 405-406
Document Object Model (DOM), 217
documents, see XML documents
document similarity, 259
presenting text retrieval results by, 265-266
document type declaration, 9-10
document type definition (DTD), 19-21
comments and spaces, 21-22
conditional sections, 31-32
element declaration, 22-28
parameter entities, 28-31
per-document schemata, 163
document type element, 10
Docuverse DOM SDK, 394
DOS environment, document type declaration, 9
dpkg program (Debian), 380
dselect program (Debian), 380
DSSSL, 395
DTD, see document type definition
dtddoc, 393
DTD subset, 9-10
Dublin Core, 304-306
elements summary, 305-306
dynamic hashing, 224
dynamic hashing libraries, 221-222. See also ndbm
dynamic linking, 37-38
dynamic storage, 208
editors, for XML, 396-400
effectivity, 175
elements. See also content models
declaring, 22-28
as fields, 185-187
as objects, 189-190
representing sequences of, 209-210
XML documents, 10-12
Element Structure Information Set (ESIS), 125-129
embedded markup, 4-5
Empress, 403
EMPTY elements, 123
Encoded Archival Description (EAD), 312
encoding parameters, 134
encyclopedia
conversion to XML, 274
publishing, information reuse scenario in, 173-174
end command (C), 209
end tags, 10
Enhydra, 406
Enlightenment window manager, 154, 155
entities, 60-61
external, 7, 15, 16, 217-218
general, 4-5, 14-17
internal, 218
parameter, 28-31
entity reference, 15
entity relationship diagrams, 60-61, 147
BookWeb example, 88
Entity system, 214
ENUM data type, SQL, 66
error handling, in hybrid systems, 298
escape sequences, 125, 126
ESIS, 125-128
escape sequences, 125, 126
format summary, 125
reading, 129
etext command (C), 209
event models
asynchronous networking, 45-46
parsers, 46, 389
events, 45
eXcelon/Object Design, Inc., 404
Excite search engine, 412
expat, 390
building on Unix, 390-392
sample program, 137-142
using in C, 133-137
using in Java, 133
using in Perl, 130-133
export restrictions, on software, 344
Extended Backus-Naur Form, 6
extended links, 271
eXtensible Markup Language, see XML
external asset management systems, 317
external DTD, 20
external entities, 7, 15, 16, 217-218
external parsing, 124-129
extranet, 146
fields
elements as, 185-187
length, 163-164
nesting, 164-165
sequencing, 165
fields of endeavor, discrimination against prohibited in licenses, 344
find command (Unix), 410, 411
Finding Aids, 312
first normal form, 72
flat files, 287
flexible storage design, 208
FLOAT data type, SQL, 66
FOP, 396
formatting, 396
401 Authentication, 56-57
404 Forbidden error, 100-101
fragment identifier
and CGI, 97
XPath, 269
FrameMaker, 398
FreeBSD, 377, 382
binary packages on, 383
source packages on, 383, 384
FreeCode, 421
free software, see open source projects; software
Free Software Foundation, 346
Free XML Tools Page, 423
Freshmeat, 378, 379, 421, 422
FTP, downloading software with, 382-383
garbage collection (Perl), 94
gdbm, 221, 244-245
features, 248
hashing with, 409
GeekBoys, 421
Gemstone, 404
general entities, 4-5, 14-17
general text entities, 15-16
getParent(), 217
getReadyForPatterns(), 280
glossary, AutoLinker based, see AutoLinked glossary
Gnome, 378-379, 392
gnome-apt program, 380
gnorpm program, 378-379, 380, 383
GNU General Public License (GPL), 345, 346, 348-354
GNU Lesser General Public (Library) License (LGPL), 346, 354-363
GNU Public Virus (GPV), 346
Goxml.com, 423
GPL (GNU General Public License), 345, 346, 348-354
GPV (GNU Public Virus), 346
grep command (Unix), 25
for information retrieval database searching, 410-411
for text retrieval, 254, 291
groups
discrimination against prohibited in licenses, 344
of related documents, 321
gzip, 150
Harvest, 412
hashing
with dbm, 409
dynamic, 221-222, 224
Has relationship, 206, 210
HEAD element (HTML), 302
heterogeneous clients, object-oriented database servers, 196
heterogeneous collections, text retrieval, 259-260
historical archiving, 168
hooks, 46
HTML. See also links
Dublin Core elements with, 304
META and LINK tags, 302-303
page caching, 295
XML contrasted, 3-4, 5, 19
HTML-style links, 270-271
HTML Tidy, 396
HTTP, 53-54
HTTP-based cache, 295
httpd log file, 322
HTTP error logs, 324
hub format, 147, 150-151
hybrid approaches, 287-298
database implementation, 191-192
hypercase searching, 254
hypertext. See also links
books about, 416-417
SQL-like languages for reasoning about, 321
HyperText Markup Language, see HTML
HyperText Transfer Protocol (HTTP), see HTTP
HyTime, 273
Iaijutsu, 406, 407
IBM DB2 Universal Database, 403
IBM XML tools for Java, 394
implementation strategies, see database implementation
implicit cross-references, 157
InDelv browser, 394
indexes
AutoLinked glossary, 328-329, 333-334, 336
creation for text retrieval, 260-264
inverted, 256, 261-264
object-oriented databases, 201-202
information retrieval, 251-252. See also text retrieval
books about, 416-417
information retrieval databases
searching with index, 411-412
searching without index, 410-411
information reuse scenario, 172-174
Informix, 403
Inktomi search engine, 412
inline links, 270-271
inner join, 70
Insight Foundation (Yuri Rubinsky), 146
installing
binary packages, 383
Perl modules, 388
source packages, 383-384
source tarball, 385-387
InstallShield, 377
INT data type, SQL, 66
Integrated Development Environments (IDEs), with object-oriented databases, 195
Interactive PostgreSQL for Windows, 402
Interbase, 402
interchange
between applications, 153
with other organizations, 151-152
interfaces, 215
Interleaf, 406
Interleaf Panorama, 394
internal cache, 295
internal document type definition subset, 9-10, 20
internal entities, 218
internal parsing, 129-142
Internet Explorer 5, 87
XML support, 394-395
Internet Relay Chat (IRC), 424
Internet Relay Chat (IRC) glossary, 274, 322, 333
visualizing relationships, 285
intranet, 146
cookies in, 57
XML-aware Web browsers, 154
inverted index, 256, 261-264
Is-A relationship, 206, 210-211
ISO 8859-1, 7
ISO 8879:1988 SGML, 4
ISO Topic maps, 273-274, 312-313
Jade, 395, 396
Japanese language, text retrieval issues, 260
Java
dynamic hashing libraries with, 221
entities, 14
exporting data with, 112-115
multithreaded connections, 55
object-oriented databases, 199
parsers, 393-394
using expat, 133
with XML-aware browsers, 147
Java API for XML (Sun), 393
Java Component Library, 400
Java DataBase Connection (JDBC), 112
Java servlets, 113-115
for link visualization, 320
JOIN statement, SQL, 70
minimizing number for improved speed, 294
journals, about XML and related topics, 420-421
JUMBO browser, 142, 394
keepalive mechanism, HTTP 1.1, 54
kpackage program, 380, 381
languages, text retrieval in multiple, 259-260
Lark, 394
Latin 1, 7
LGPL (GNU Lesser General Public (Library) License), 346, 354-363
licenses, see open source licenses
ligatures, and text retrieval, 260
line breaks, 179-180
LINK element (HTML), 302, 303
links, 315-316. See also XML links
in AutoLinked glossary, 328, 334-335
automatic, 274-284
checking, 324
extended, 271
between files, 249
incorrect, 274
management and analysis, 324
multiway, 317, 322
simple, 270-271
link visualization, 318-321
link visualization tools, 321
Linux. See also Debian Linux; Mandrake Linux; Red Hat Linux
books about, 417-419
Linux Chix, 423
literal, 309
little language, 256
loading behavior, 208, 209
local area network (LAN), 146
locales, text retrieval from multiple, 260
locate command (Unix), for information retrieval database searching, 410
location, of users, 146
locked programs, in synchronous networks, 45
LONGBLOB data type, SQL, 66
LONGTEXT data type, SQL, 66
LotusXSL, 396
lqaddfile, 333-334
lqkwic, 293, 335
lqsed, 335
lq-text, 347, 378, 379
hypercase searching, 254
for link insertion in AutoLinker glossary, 333-335
query result, 291-294
text retrieval using, 411-412
lqunindex, 293, 334
Lynx, 146
Macintosh environment
Internet Explorer 5 with, 395
MIME types, 55
source repositories, 406
magazines, about XML and related topics, 420-421
mailing lists, 423-424
malloc(), 129
Mandrake Linux, 378-380
source packages on, 383-384
manifests, 307
man -k command, 421
man perl command, 421
many-to-many relationships, 62
MARC version 21, 312
markup, 4
Markup Technologies, 413
metadata, 287-288
database implementation using only, 187-189
defined, 301-302
Dublin Core, 304-306
links as, 315-316
and Resource Description Framework (RDF), 307-312
storing in databases, 316-324
META element (HTML), 302-303
Microsoft, see DOS environment; Internet Explorer; Windows environment
Microsoft XML Notepad, 400
migration, of database objects, 203
MIME types, 317
described, 55
as preferable to namespaces for Web uses, 33
and XML declaration, 8
mIRC, 424
MIT License, 347, 366
mixed content, 148, 161-162, 209-210, 216
mkheader() (HTML), 281
Moby Project thesaurus, 258
modifiers, SQL, 64, 66-67
Mozilla, 85, 394
Enlightenment window manager, 154, 155
Public License, 347-348, 366-374
RDF application, 312
multilingual text retrieval, 259-260
Multipurpose Internet Mail Extension (MIME) types, see MIME types
multithreaded connections, 55
MySQL, 63, 183, 401, 402
documentation, 421
field length, 164
mysqladmin command
for database creation, 64
options, 65
mysql interpreter, 63
namespaces, 33, 214
RDF, 309
XLink, 270
name.tbl, 82
navigation, object-oriented databases, 196-200
navigational aids, 322, 323
ndbm, 221, 243-244
as cache manager, 294-297
database creation and saving a value, 225-229
deleting a value, 231-233
described, 222-225
features, 248
hashing with, 409
iterate over all keys, 223
performance, 236-240
reading a value, 229-231
reading all values, 233-236
retrieve any value by key, 223
store unordered set of key-value pairs, 222-223
and text retrieval, 262
using, 225-236
using in Perl, 240-242
versions, 242-248
when to avoid, 224
and XML, 248-249
nested objects, 5
nested transactions, 202
NetBSD, 382
source packages on, 383, 384
Netscape Mozilla, see Mozilla
Network File System (NFS), 188, 288
Network Information System (NIS), 222
networks, 36-37
asynchronous networking, 45-53
normalization, BookWeb example, 74-75
notations, 33
Not Found error, 100-101
NOT NULL database reference, 324
nsgmls, 129, 390
NSGMLS.pm, 129
object-oriented databases, 195-196, 404-405
converting existing to XML, 211-212
data access, 196-200
indexing, 201-202
object binding, 200
object distribution, 203
object saving, 200
persistence, 195, 196
queries, 200-201
text retrieval with, 291, 297
transactions, 202
and XML relationships, 206-211
object-oriented programming, 205
Object Query Language (OQL), 190, 200
objects
binding in object-oriented databases, 200
distribution in object-oriented databases, 203
elements as, 189-190
properties in relational databases, 60-61
relationships in relational databases, 61-62
saving in object-oriented databases, 200
XML documents as, 297
odbm, 243
features, 248
OLAP, 192
Omnimark, 116, 336, 396
one-to-many relationships, 62
one-to-one relationships, 62
online documentation, 412, 421-424
online newsfeed, information reuse scenario in, 173
OpenJade, 395, 396
open source browsers, 394-395
Open Source Definition (v 1.7), 342-344
open source licenses, 341-342
Barefoot, 348, 374-375
BSD, 347, 365-366
GPL, 346, 348-354
LGPL, 346, 354-363
MIT, 347, 366
Mozilla, 347-348, 366-374
Perl Artistic, 347, 363-365
open source projects
defined, 342-345
learning about, 413
relational databases, 401-402
Web sites, 421, 423
OpenSP, 390
Open Text, 291, 412
Oracle, 403
arbitrary field nesting, 164-165
OSI Certified software, 342, 345
Ovrimos SQL Server, 403
A Package Tool, 380-381
paper-based publishing, 152, 153
paragraphs, as BLOBs, 183-185
parameter entities, 28-31
overriding content models, 30-31
Parlance, 406
parsers, for XML, see XML parsers
PartNo element, 5
patch files, 343
patterns, see regular expressions
#PCDATA keyword, 26
per-document schemata, 163
Perl
AutoLinker example, 275-284, 328
book catalogue example, 120-124
books about, 419-420
Desperate Perl Hacker, 123
documentation, 421
dynamic hashing libraries with, 221
entities, 14
garbage collection, 94
for link visualization, 318-321
module installation, 388
textual analysis application, 336
using expat, 130-133
using ndbm in, 240-242
Web server script, 48-53
Perl Artistic License, 347, 363-365
Perl AutoLinker, 275-284, 328
perl command (Unix), 25
Perl DataBase Interface module, see Perl DBI
Perl DBI, 409-410
and CGI, 101-104
described, 92
generating XML with, 91-95
perldoc command, 421
pernicious mixed-content model, 162
per-paragraph similarity, in text retrieval, 259
perpetual intermediates, 264
persistence, in object-oriented databases, 195, 196
persons, discrimination against prohibited in licenses, 344
PHP
architecture, 105
described, 104-106, 409
documentation, 421
example code, 106-112
phrase-aware systems, 254, 255-256
phrase sequences, in text retrieval, 259
Pinnacles DTD, 151
Platform for Internet Content Selection (PICS), 306-307
Poet Software, 404, 405
poll, 47
PostgreSQL, 183, 401, 402
printing, 396
processing instructions, in XML documents, 18-19, 179
programming conferences, 413
Project Gutenberg thesaurus, 258
Prolog, XML documents, 7-10
properties, in RDF, 308, 309
protocol layers, 37-38
protocols
and APIs, 37-44
networks, 36-37
public domain software, licenses, 345-346
publishers.tbl, 79-80
push technology, 174
PyPointers, 393
Python
dynamic hashing libraries with, 221
parsers, 393, 396
queries
metadata, 316-318
object-oriented databases, 200-201
text retrieval, 254-259
Raima, 403
RCS, see Revision Control System
rcsdiff command, 290
RDF, see Resource Description Framework
RDF Schema Specification, 308, 310, 311
RDF Specification, 307, 310
RDF Visualization tool, 310, 311
RDM, 406
recall, in text retrieval, 255
Red Hat Linux, 377, 378-380
binary packages on, 383
MySQL, 402
RDF application, 311
source packages on, 383-384
redistribution, of open source code, 343
references.tbl, 82-84
referential integrity, 288
checking, 71
Refers To relationship, 206, 211
regular expressions, 23, 24-25
text retrieval, 255
relational databases, 59. See also SQL; tables
commercial, 402-404
creation in SQL, 63-64
deletion in SQL, 64
generating XML from, 85-116
minimizing JOINs for improved speed, 294
and nested objects, 5
object properties, 60-61
object relationships, 61-62
open source and free, 401-402
performance and number of tables, 148
representing tables, 87-91
Revision Control System with, 289-291
string manipulation weakness, 24
text retrieval with, 291
XML documents stored in, 166
relational tables, see tables
relationship types, see roles
repositories, 287-288, 405-406. See also databases
application-specific with wired-in behavior, 214-215
generic XML with external application, 214
source, 406, 408
requirements, 145-149
research project, central access scenario in, 169
Resource Description Framework (RDF), 147, 273
described, 307-312
visualizing relationships, 284-285
resource discovery, 168
resources, in RDF, 308, 309
revision control scenario, 171-172
Revision Control System (RCS), 408
and relational databases, 289-291
revision control tools, 172
RFC 2731, 306
roles, 61
BookWeb example, 75
root, avoiding installing source distributions as, 385
root element, 10
rpm program, 378, 380, 384
rusage command (Unix), 209
RXP, 392
saving, objects in object-oriented databases, 200
SAX, 393
described, 142
saxlib, 393
SAXON XSL processor, 395-396
SCCS, 406, 408
sdbm, 221, 244
features, 248
hashing with, 409
performance, 236, 240
search and replace, 24-25
searches, metadata, 316-318
second-class citizen, 38
second normal form, 73
sed command (Unix), 25
select() function, BSD Networking API, 47-48
SELECT statement, SQL, 68-69
ORDER BY clause, 70
semantic nets, 258
Sequence relationship, 206, 208-210
serialization, 212
Server51, 423
servers, 35-36
SGML, and XML, 4, 5
SGML Open Catalog, 21
SGMLS.pl, 129
sgrep command (Unix)
for information retrieval database searching, 411
for text retrieval, 255
shared authoring scenario, 170-171
signatures, 261
significant comments, 179
similarity algorithms, in text retrieval, 259
Simple API for XML (SAX), see SAX
simple links, 270-271
size overhead, 208-209
slang dictionary, 156-158
SlashDot, 423
slocate command (Unix), for information retrieval database searching, 410
slurp mode (Perl), 122-123
SMART system, 266
software, 377-378
downloading with FTP, 382-383
finding packages, 378-383
installing binary packages, 383
installing Perl modules, 388
installing source packages, 383-384
installing source tarball, 385-387
software production, shared authoring scenario in, 170-171
Solaris environment, 377, 382
binary packages on, 383
Some2XML, 395
SORTBY, 208
source code, 343
choice of programmers for writing, 148-149
integrity of author's, 343-344
Source Code Control System (SCCS), 406, 408
SourceForge, 423
source packages, 377
installation, 383-384
source repositories, 406, 408
source tarballs, 378
installation, 385-387
SP, 390
building on Unix, 390-392
spell-check, context-sensitive, 259
spreadsheets, XML-aware, 87
SQL, 63. See also relational databases
changing data with UPDATE and DELETE, 71
database creation, 63-64
database deletion, 64
inserting data into tables, 68
JOIN statement, 70
limiting queries with WHERE, 69
printing tables with SELECT, 68-69
returning multiple columns, 70-71
for returning text retrieval results to a program, 266
SELECT statement, 68-69
sorting, 70
table creation, 67
text retrieval, 337
WHERE statement, 69
SQL/92, 63
SQL commands, 63
SQL data types, 64-67
SQL expressions, 70
Squid, 295
STAIRS, 262
standard error, 101
Standard Generalized Markup Language, see SGML
Standard Template Library, 208
start tags, 10
statements, RDF, 308-309
statistical ranking, presenting text retrieval results by, 265
stemming, 257-258, 259
streams, 4
strings
manipulation weakness of relational databases, 24
parameter entities for reuse, 29
Structured Query Language, see SQL
stub objects, 199
style sheets, 33
subset, 9-10
substitution, in search and replace, 25
surrogate fields, 210
sxml for emacs, 400
Sybase, 404
synchronous networking, 45
synchronous protocols, 40-41
tables, 60-61, 166
changing data with UPDATE and DELETE, 71
creation, 67
inserting data into, 68
number of, and performance of relational databases, 148
printing with SELECT, 68-69
representing, 87-91
returning multiple columns, 70-71
tags
can't be omitted, 124
META and LINK, 302-303
XML documents, 10-12
tar archive, 383, 385-386
tcl, dynamic hashing libraries with, 221
telephone repair manuals, 174
term replacement, in text retrieval, 258-259
TeX macros
for paper-based publishing, 152
XML parsers in, 394
text, 218-219. See also XML documents
declaring content, 26
representing sequences of, 209-210
transforming non-XML into XML, 395-396
XML as text-based format, 4
TEXT data type, SQL, 66
Text Encoding Initiative, 147, 151
text retrieval, 336, 337. See also information retrieval
AutoLinked glossary, 333-334
and database implementation, 191-192
and databases, 291-294
defined, 251-252
document categorization, 252-253
heterogeneous collections, 259-260
implementation issues, 260-266
index creation, 260-264
multiple languages, 259-260
multiple locales, 260
queries, 254-259
results presentation, 264-266
results returned to a program, 266
uncategorized information, 253-254
textual analysis, 336
thesaurus, in text retrieval, 254, 258-259
third normal form, 73
tie(), 241, 242
Topic maps, 273-274, 312-313
trade shows, 413
transactions, object-oriented databases, 202
tree model, parsers, 389
two-ended inline links, 270-271
types, 212
Ultraseek Server, 412
Unicode, 7, 260
Unified Modeling Language (UML), 205
Uniform Resource Identifiers (URIs), 307
Uniform Resource Locators (URLs), 55-56, 267, 307. See also CGI scripts
Uniform Resource Names (URNs), 267
Unix environment
books about, 417-419
building expat and SP on, 390-392
cache manager, 296
\(... \) idiom, 25
MIME types, 55
ndbm with, 221
nsgmls with, 390
public domain software, 345-346
size overhead, 209
software installation, 377
unrestricted field length, 163-164
untie(), 241
UPDATE statement, SQL, 71
updating, 168
Usenet News, 174
users, 146-147
values, in RDF, 309-310
VARBINARY data type, SQL, 66
VARCHAR data type, SQL, 66
vectors, 208
Verity search engine, 291, 412
Versant, 404
versioning, 168
vi command (Unix), 25
virtual folders, 321
Visual Markup, 400
visual schema designers, with object-oriented databases, 195
Visual XML, 116
Web-based validator, 393
Web browsers, 394-395
XML-aware, 154, 155
and XML generation, 86-87
Web clients, 53
Web Reports tool, 322, 323
Web servers, 53-57
IP port, 54
Perl script for, 48-53
WHERE statement, SQL, 69
white space, 13, 179
wildcards, 255
Windows environment
cache manager, 296
document type declaration, 9
Internet Explorer 5 with, 395
local edit of downloaded files, 286
MIME types, 55
nsgmls with, 390
software installation, 377
source repositories, 406
WordNet package, 258
word processors, XML-aware, 87
word sequences, in text retrieval, 255-256
workflow, 175, 297-298
World Wide Web
architecture, 53-57
401 Authentication, 56-57
automatic site mirroring, 174
cookies, 57
sites about XML and related topics, 421-423
World Wide Web Conference, 304
World Wide Web Consortium, 4, 423
Metadata Activity group, 307
Woven Goods for Linux, 423
writer's revisions, 172
wrote.tbl, 81-82
xargs command (Unix), 411
x-chat program, 424
XED, 396, 398, 399
Xerces-J parser, 393
xhost program, 378
XHTML, 4
XLink, 267, 315-316
overview, 270-272
XMetaL, 396, 397-398
XML. See also databases; SQL
behavior, 212-215
book catalogue example, 117-124
books about, 414-416
class designs, 215-219
defined, 3-5
features reference, 6-32
generating with CGI, 96-104
generating with Java, 112-116
generating with Perl DBI, 91-95
generating with PHP, 104-112
HTML contrasted, 3-4, 5, 19
hybrid approaches, 287-298
namespaces, 33
and ndbm, 248-249
notations, 33
reading into a program, 117-142
reading specification, 6-7
reasons for generating, 85-87
and SGML, 4, 5
style sheets, 33
as text-based format, 4
tutorials, 396
Web sites about, 423
XML-APP (mailing list), 423-424
XML architectures
for backup and data transfer, 149-150
as hub format, 147, 150-151
for interchange between applications, 153
for interchange with other organizations, 151-152
as intermediate format for advanced Web pages, 154, 156-158
for paper-based publishing, 152, 153
XML-aware Web browsers, 154, 155
XML Authority, 400
XML-aware tools, 87
XML-aware Web browsers, 154, 155, 394-395
XML declaration, 8
xml-dev (mailing list), 424
XML documents, 4. See also attributes; document type definition; elements; entities; tags
arbitrary field nesting, 164-165
AutoLinker conversion, 336, 337
as BLOBs, 180-183
categorization, 252-253
character references, 13-14
character set, 7
comments, 8-9
document type declaration, 9-10
management, 297-298, 324
as objects, 297
per-document schemata, 163
processing instructions, 18-19
Prolog, 7-10
stored in relational database, 166
text content, 13-14
as trees, 166
XML declaration, 8
XML dot COM, 423
XML editors, 396-400
XML Extender, for IBM DB2 Universal Database, 403
XML files, 4
considered authoritative in databases, 317
including with parameter entities, 29-30
XML4C++, 393
XMLIO, 393
XML library for Gnome, 392
XML links
and databases, 274-285
functionality, 321-324
XML-L (mailing list), 424
XML Parser for Java (Oracle), 393
XML parsers, 46, 389-394
creating in C, 133
external parsing, 124-129
history, 142
internal parsing, 129-142
with metadata, 317
XML Path Language, see XPath
xmlproc, 393
XML Query Language, see XQuery
XML relationships
Containment, 206-208
Has, 206, 210
Is-A, 206, 210-211
Refers To, 206, 211
Sequence, 206, 208-210
XML Retrieval System (XRS), 412
XML Schema, 27, 32, 205-206
XML specification, 6-7
XML Spy, 399
XML standards, 267-274
XML Style Language (XSL), 33
XML Style Transformation Language (XSLT), see XSLT
XMLWriter, 400
XP, 394
XPath, 267, 315-316
overview, 268-269
XPointer, 267, 268, 271
overview, 269-270
XPublish, 400
XQL, see XQuery
XQL (mailing list), 424
XQuery, 267
overview, 272-273
XRS (XML Retrieval System), 412
XSL, 267
tutorials, 396
XSL-List (mailing list), 423
XSLT, 116, 151, 268, 395
and XML generation, 87
XT, 395
XTech, 413
X Window System License (MIT), 347, 366
Xyvision Parlance, 406
Yahoo!, 252
YEAR data type, SQL, 66
Yuri Rubinsky Insight Foundation, 146