Almost as soon as computers began to be used for solving problems in engineering
and the physical sciences, scholars in the humanities were figuring out
ways to use them in their own disciplines, particularly to deal with the
massive amounts of data in text corpora and reference works that are critical
to scholarship and teaching in the humanities.
Roberto Busa, for instance, began work on the computer-based Index Thomisticus
in 1949. Later important projects included Professor Kucera's "Brown
University Corpus of American English," an analyzed selection of American
English prose which set the standard for natural language corpora, and the
Thesaurus Linguae Graecae, which contains nearly all ancient Greek
texts from Homer to 600 A.D. Soon scholars in every discipline which had
an identifiable body of important textual material were wondering how that
material could be managed, accessed, and analyzed with the computer. For
literature scholars and linguists, as well as dictionary enthusiasts, it
is hard to imagine more tempting material for digital treatment than the
Oxford English Dictionary (OED), the massive source of detailed
information about the English language.
Conceived in 1858, the twelve-volume Oxford English Dictionary began
publication in 1884 and was finished in 1928. A supplement was added in 1933,
which was incorporated into the four supplementary volumes released in the
'70s and '80s. It totaled over 21,000 pages with 600,000 headwords, with
extensive information about derivation, history, usage, pronunciation, and
meanings, and more than 2,400,000 quotations that demonstrate actual word
usage over time.
Maintaining a dictionary of this size by means of traditional scholarly
and publishing methods is nearly impossible. So it was natural that in planning
for the second edition, Oxford University Press turned to the promising
resources of the new information technology and formed a partnership with
computer scientists and linguists at the University of Waterloo, establishing
the "Centre for the New Oxford English Dictionary". However important
the practical usefulness of computers is in the production of the second
print version of the dictionary, it is the computer's potential for supporting
new functionality, such as rapid or complex searches, analysis, and presentation,
that is particularly exciting to scholars. The Centre at Waterloo has been
the focus of an enormous amount of research into text structuring, text
retrieval systems, and indexing, as well as lexicography. Advanced work
in this area also continues at Brown, where Jacque Russom, Consulting Linguist
for the Women Writers Project (a textbase of writing by women between 1330
and 1830), is leading several WWP projects involving the electronic OED.
Responding to faculty interest in bringing the resources of the electronic
OED to the broader University community, the University Library and Computing
and Information Services carried out a joint project to acquire the data
and configure a delivery system that would provide the widest and most functional
access feasible. CIS purchased "PAT," a text retrieval engine
developed at Waterloo, and the Library purchased the OED data from Oxford
University Press. The OED data is structured in a format similar to SGML
(the Standard Generalized Markup Language), the standard for coding textual
information (such as books, articles, corpora) for use on the computer.
This sort of encoding explicitly marks the components of each entry (such
as headword, derivation, pronunciation, quotation, quotation author) so
that they can be used as retrieval fields. PAT, a search engine particularly
well suited to SGML retrieval, processes queries against these fields.
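As a rough illustration of how field-level markup supports retrieval, the sketch below tags two made-up entries and searches within one named field. The tag names (hw, etym, quot, auth, text), the placeholder entries, and the helper functions are assumptions for illustration only; they are not the actual OED tag set, and the plain substring match stands in for PAT's indexing rather than reproducing it.

```python
import re

# Made-up entries with hypothetical tag names (not the real OED tag set).
# All field contents are placeholders, not actual dictionary text.
SAMPLE_ENTRIES = [
    "<entry><hw>headword-one</hw><etym>(etymology)</etym>"
    "<quot><auth>R. M. Chisholm</auth><text>(quotation text)</text></quot></entry>",
    "<entry><hw>headword-two</hw><etym>(etymology)</etym>"
    "<quot><auth>A. N. Other</auth><text>(quotation text)</text></quot></entry>",
]


def field_values(entry, tag):
    """Return the text of every <tag>...</tag> span in one entry."""
    name = re.escape(tag)
    return re.findall(rf"<{name}>(.*?)</{name}>", entry, flags=re.DOTALL)


def search_field(entries, tag, term):
    """Return the headwords of entries whose `tag` field contains `term`."""
    hits = []
    for entry in entries:
        values = field_values(entry, tag)
        if any(term.lower() in value.lower() for value in values):
            hits.append(field_values(entry, "hw")[0])
    return hits


if __name__ == "__main__":
    # Search only the quotation author field, ignoring all other text.
    print(search_field(SAMPLE_ENTRIES, "auth", "Chisholm"))
```

Because the search is restricted to one field, a name that appears only in an etymology or in quotation text would not be returned by an author query.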
Last summer the CIS Scholarly Technology Group (STG) installed the data,
installed and configured PAT, developed the World Wide Web search form interface,
and adapted the programs necessary to coordinate the various parts of the
system: managing queries, retrieval results, and presentation of information.
These coordinating programs were adapted by STG programmer Geoffrey Bilder
and are based on those originally developed
by John Price-Wilkin at the University of Michigan. The Library Reference
and Systems staff, particularly Raynna Bowlby and Helen Schmierer, provided
design consultation on how the system as a whole should function to ensure
that the needs of faculty and students were met.
Here's an example of how the system works. A user accesses the OED World
Wide Web page on a Library workstation and, using a simple, intuitive form
with fields, pull-down menus, and clickable buttons, issues a query, e.g.,
a request to return all entries containing a quotation by Brown Professor
Emeritus Roderick Chisholm. The query is then translated into a language
that the PAT retrieval engine can understand, and PAT interrogates the data
files to identify all entries that quote Chisholm, as indicated
by the presence of "R. M. Chisholm" within the quotation author
field. The content of the selected entries is then converted from the specialized
OED markup language to HTML (Hypertext Markup Language), which is the text
encoding language used on the World Wide Web. This conversion takes place
quickly, behind the scenes, dynamically constructing the WWW result page
that will be seen by the user.
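The steps just described (form input, field query, conversion to HTML) might be sketched as follows. This is a minimal illustration under stated assumptions, not the programs actually used at Brown: the tag names, the tag-to-HTML mapping, the placeholder entries, and the functions entry_to_html and result_page are all hypothetical, and a simple substring filter stands in for the PAT query.

```python
import html
import re

# Hypothetical mapping from entry tags to HTML; unknown tags are dropped.
TAG_TO_HTML = {
    "entry": ("<div>", "</div>"),
    "hw": ("<b>", "</b>"),
    "quot": ("<blockquote>", "</blockquote>"),
    "auth": ("<i>", "</i>"),
}

# Placeholder entries with made-up tags and contents.
ENTRIES = [
    "<entry><hw>headword-one</hw><quot><auth>R. M. Chisholm</auth>"
    " (text & more)</quot></entry>",
    "<entry><hw>headword-two</hw><quot><auth>A. N. Other</auth>"
    " (text)</quot></entry>",
]

TAG_RE = re.compile(r"(</?[a-z]+>)")


def entry_to_html(entry):
    """Translate one tagged entry into an HTML fragment."""
    parts = []
    for token in TAG_RE.split(entry):
        match = re.fullmatch(r"<(/?)([a-z]+)>", token)
        if match:
            closing, name = match.groups()
            open_html, close_html = TAG_TO_HTML.get(name, ("", ""))
            parts.append(close_html if closing else open_html)
        else:
            # Plain text: escape characters that are special in HTML.
            parts.append(html.escape(token))
    return "".join(parts)


def result_page(entries, field, term):
    """Select entries whose `field` contains `term`; build a result page."""
    name = re.escape(field)
    pattern = re.compile(rf"<{name}>(.*?)</{name}>", re.DOTALL)
    hits = [
        entry for entry in entries
        if any(term.lower() in value.lower() for value in pattern.findall(entry))
    ]
    body = "\n".join(entry_to_html(entry) for entry in hits)
    return f"<html><body>\n{body}\n</body></html>"


if __name__ == "__main__":
    # The form's field and search term arrive here as two strings.
    print(result_page(ENTRIES, "auth", "Chisholm"))
```

The page is built fresh for each query, so the stored entries keep their richer markup and only the retrieved selection is reduced to HTML.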
This process illustrates an approach to delivering information over the
network that is likely to become a common strategy at Brown. Complex information
is stored in an encoding system, like SGML, that, because it adequately
reflects the sophistication of the information it is encoding, can support
advanced queries and analysis. However, when selections from the database
are retrieved, they are translated to a simpler, more general, and more
widely available encoding format, like HTML, in order to be easily formatted
and viewed on commonly available workstation viewing software, like World
Wide Web "browsers". But we may someday look forward to workstation
browsers that are themselves capable of managing sophisticated and specialized
encoding systems. This will give users even more capabilities, such as advanced
discipline-specific views and manipulation.
The Brown electronic version of the OED remains very much an evolving project,
as the Library and STG are still in the process of refining the interface
and increasing its functionality. However, basic retrieval by headword,
etymology, author, earliest quotation date, entire entry, and other fields
is implemented, and retrieved selections are displayed in a conveniently
formatted, browsable form. It is already proving to be a very popular electronic resource.
The OED can be easily accessed on the workstations in the Library's Reference
and Information Center, as well as from the main Library Home Page. -A.R.
chowder tSau.d<E> r, sb. Also 8 chouder. App. of French origin, from chaudière pot. In the fishing villages of Brittany (according to a writer in N. & Q. 4 Ser. VII. 85) faire la chaudière means to supply a cauldron in which is cooked a mess of fish and biscuit with some savoury condiments, a hodge-podge contributed by the fishermen themselves, each of whom in return receives his share of the prepared dish. The Breton fishermen probably carried the custom to Newfoundland, long famous for its chowder, whence it has spread to Nova Scotia, New Brunswick, and New England. Another writer in N. & Q. (1870) 4 Ser. V. 261, says `I have frequently heard some of the old inhabitants [of Newfoundland] speak of Commodore John Elliot's chowder pic-nic in 1786, which was given in honour of H.R.H. Prince William Henry [William IV] in command of H.M.S. Pegasus upon the Newfoundland station'.