The Electronic Oxford English Dictionary



Almost as soon as computers began to be used for solving problems in engineering and the physical sciences, scholars in the humanities were figuring out ways to use them in their own disciplines, particularly to deal with the massive amounts of data in text corpora and reference works that are critical to scholarship and teaching in the humanities.

Roberto Busa, for instance, began work on the computer-based Index Thomisticus in 1949. Later important projects included Professor Kucera's "Brown University Corpus of American English," an analyzed selection of American English prose which set the standard for natural language corpora, and the Thesaurus Linguae Graecae, which contains nearly all ancient Greek texts from Homer to 600 A.D. Soon scholars in every discipline that had an identifiable body of important textual material were wondering how that material could be managed, accessed, and analyzed with the computer. For literature scholars and linguists, as well as dictionary enthusiasts, it is hard to imagine more tempting material for digital treatment than the Oxford English Dictionary (OED), the massive source of detailed information about the English language.

Conceived in 1858, the twelve-volume Oxford English Dictionary began publication in 1884 and was completed in 1928. A supplement was added in 1933, which was later incorporated into the four supplementary volumes released in the '70s and '80s. In all, the dictionary totaled over 21,000 pages with 600,000 headwords; extensive information about derivation, history, usage, pronunciation, and meanings; and more than 2,400,000 quotations that demonstrate actual word usage over time.

Maintaining a dictionary of this size by means of traditional scholarly and publishing methods is nearly impossible. So it was natural that, in planning for the second edition, Oxford University Press turned to the promising resources of the new information technology and formed a partnership with computer scientists and linguists at the University of Waterloo, establishing the "Centre for the New Oxford English Dictionary." However important the practical usefulness of computers is in the production of the second print version of the dictionary, it is the computer's potential for supporting new functionality, such as rapid or complex searches, analysis, and presentation, that is particularly exciting to scholars. The Centre at Waterloo has been the focus of an enormous amount of research into text structuring, text retrieval systems, and indexing, as well as lexicography. Advanced work in this area also continues at Brown, where Jacque Russom, Consulting Linguist for the Women Writers Project (a textbase of writing by women between 1330 and 1830), is leading several WWP projects involving the electronic OED.

Responding to faculty interest in bringing the resources of the electronic OED to the broader University community, the University Library and Computing and Information Services carried out a joint project to acquire the data and configure a delivery system that would provide the widest and most functional access feasible. CIS purchased "PAT," a text retrieval engine developed at Waterloo, and the Library purchased the OED data from Oxford University Press. The OED data is structured in a format similar to SGML (the Standard Generalized Markup Language), the standard for coding textual information (such as books, articles, and corpora) for use on the computer. This sort of encoding explicitly marks the components of each entry (such as headword, derivation, pronunciation, quotation, and quotation author) so that they can be used as retrieval fields. The search engine, PAT, which is particularly well suited to SGML retrieval, processes queries against these fields.
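To make the idea of marked-up components serving as retrieval fields more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the tag names (entry, hw, etym, quote, qauthor, qdate, qtext) and the two quotations are invented for the example and are not the actual OED tag set or data, and the sample is written as well-formed XML so that Python's standard parser can read it. It says nothing about how PAT itself indexes the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified entry. The tag names and the quotations are
# invented for illustration; they are not the real OED markup or data.
SAMPLE_ENTRY = """
<entry>
  <hw>chowder</hw>
  <etym>App. of French origin, from chaudière pot.</etym>
  <quote><qdate>1786</qdate><qauthor>A. Writer</qauthor><qtext>An invented quotation.</qtext></quote>
  <quote><qdate>1870</qdate><qauthor>B. Author</qauthor><qtext>Another invented quotation.</qtext></quote>
</entry>
"""


def extract_fields(entry_markup):
    """Map each tag name to the list of text values it encloses."""
    root = ET.fromstring(entry_markup.strip())
    fields = {}
    for element in root.iter():
        text = (element.text or "").strip()
        if text:  # skip container elements that hold only other elements
            fields.setdefault(element.tag, []).append(text)
    return fields


fields = extract_fields(SAMPLE_ENTRY)
print(fields["hw"])
print(fields["qauthor"])
```

Because every component carries an explicit label, a search can be limited to one field (for example, only quotation authors) rather than scanning the full text of every entry.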

Last summer the CIS Scholarly Technology Group (STG) installed the data, installed and configured PAT, and developed the World Wide Web search form interface. The programs that coordinate the various parts of the system (managing queries, retrieval results, and presentation of information) were adapted by STG programmer Geoffrey Bilder from those originally developed by John Price-Wilkin at the University of Michigan. The Library Reference and Systems staff, particularly Raynna Bowlby and Helen Schmierer, provided design consultation on how the system as a whole should function to ensure that the needs of faculty and students were met.

Here's an example of how the system works. A user accesses the OED World Wide Web page on a Library workstation and, using a simple, intuitive form with fields, pull-down menus, and clickable buttons, issues a query, e.g., a request to return all entries containing a quotation by Brown Professor Emeritus Roderick Chisholm. The query is then translated into a language that the PAT retrieval engine can understand, and PAT interrogates the data files to identify all entries that have Chisholm as the author, as indicated by the presence of "R. M. Chisholm" within the quotation author field. The content of the selected entries is then converted from the specialized OED markup language to HTML (Hypertext Markup Language), which is the text encoding language used on the World Wide Web. This conversion takes place quickly, behind the scenes, dynamically constructing the WWW result page that will be seen by the user.
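The three steps just described (form to query, query to matching entries, entries to HTML) can be sketched in a few lines of Python. This is a rough illustration, not the STG or Michigan programs: the in-memory list stands in for PAT and its data files, the field name quotation_author and the sample records are invented, and PAT's real query language is not shown.

```python
from html import escape

# Hypothetical stand-in for the indexed OED data. The records and the field
# name "quotation_author" are invented for illustration only.
ENTRIES = [
    {"headword": "sample-one", "quotation_author": ["R. M. Chisholm"],
     "text": "First sample entry, with a <tag> that must be escaped."},
    {"headword": "sample-two", "quotation_author": ["A. N. Other"],
     "text": "Second sample entry."},
]


def build_query(form):
    """Step 1: translate submitted form values into a (field, term) query."""
    return form["field"], form["term"].strip().lower()


def run_query(entries, query):
    """Step 2: return the entries whose chosen field contains the term."""
    field, term = query
    return [entry for entry in entries
            if any(term in value.lower() for value in entry.get(field, []))]


def to_html(results):
    """Step 3: convert the selected entries into a simple HTML result list."""
    items = "\n".join(
        f"<li><b>{escape(entry['headword'])}</b>: {escape(entry['text'])}</li>"
        for entry in results
    )
    return f"<ul>\n{items}\n</ul>"


form = {"field": "quotation_author", "term": "R. M. Chisholm"}
results = run_query(ENTRIES, build_query(form))
print(to_html(results))
```

Keeping the three steps separate mirrors the design described above: the search form, the retrieval engine, and the HTML presentation can each be changed without rewriting the others.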

This process illustrates an approach to delivering information over the network that is likely to become a common strategy at Brown. Complex information is stored in an encoding system, like SGML, that, because it adequately reflects the sophistication of the information it is encoding, can support advanced queries and analysis. However, when selections from the database are retrieved, they are translated to a simpler, more general, and more widely available encoding format, like HTML, in order to be easily formatted and viewed on commonly available workstation viewing software, like World Wide Web "browsers". But we may someday look forward to workstation browsers that are themselves capable of managing sophisticated and specialized encoding systems. This will give users even more capabilities, such as advanced discipline-specific views and manipulation.

The Brown electronic version of the OED remains very much an evolving project, as the Library and STG are still refining the interface and increasing its functionality. However, basic retrieval by headword, etymology, author, earliest quotation date, entire entry, and other fields is implemented, and retrieved selections are displayed in a conveniently formatted, browsable form. It is already proving to be a very popular electronic resource.

The OED can be easily accessed on the workstations in the Library's Reference and Information Center, as well as from the main Library Home Page. -A.R.


chowder (ˈtʃaʊdə(r)), sb. Also 8 chouder. App. of French origin, from chaudière pot. In the fishing villages of Brittany (according to a writer in N. & Q. 4 Ser. VII. 85) faire la chaudière means to supply a cauldron in which is cooked a mess of fish and biscuit with some savoury condiments, a hodge-podge contributed by the fishermen themselves, each of whom in return receives his share of the prepared dish. The Breton fishermen probably carried the custom to Newfoundland, long famous for its chowder, whence it has spread to Nova Scotia, New Brunswick, and New England. Another writer in N. & Q. (1870) 4 Ser. V. 261, says 'I have frequently heard some of the old inhabitants [of Newfoundland] speak of Commodore John Elliot's chowder pic-nic in 1786, which was given in honour of H.R.H. Prince William Henry [William IV] in command of H.M.S. Pegasus upon the Newfoundland station'.

A Definition of "Chowder" from the Oxford English Dictionary


