§17 Corpora of English
A corpus is a body of textual material (sometimes of spoken data) which has been collected on the basis of some pre-defined criteria and which is available in electronic form (typically on CD-ROM or via the internet). Corpora from academic circles are usually available from the university whose staff compiled them and are intended to assist scholars in backing up their statements about language with statistics drawn from real data.
The criteria used when assembling a corpus can vary. The main division is between synchronic and diachronic corpora. The next factor to consider is the text type: spoken or written medium. If the latter, what kind or kinds of literary genre are to be included? What chronological range is the corpus to cover (in the case of a diachronic corpus)? The final issue is the size of the corpus. With recent developments in computing it is possible to process large quantities of data with ease even on personal computers. There is an adage in electronic data processing: the more data one has, the more reliable one's statistics turn out to be, all other factors being equal. For this reason corpora are getting larger and larger and the results achieved are becoming increasingly accurate.
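The adage about corpus size can be illustrated with a small calculation. The following is only a minimal sketch, assuming that occurrences of a word can be treated as independent trials (a simplification, since real texts are not random samples) and using the normal approximation for a proportion. The function name `relative_frequency_interval` and the rate of 50 occurrences per million words are hypothetical and chosen purely for illustration.

```python
import math


def relative_frequency_interval(hits: int, corpus_size: int, z: float = 1.96):
    """Return (frequency, lower bound, upper bound), all per million words.

    Uses the normal approximation to the binomial distribution; this is an
    illustrative assumption, not a method prescribed by any corpus project.
    """
    p = hits / corpus_size                      # observed relative frequency
    se = math.sqrt(p * (1 - p) / corpus_size)   # standard error of a proportion
    scale = 1_000_000                           # report values per million words
    return p * scale, max(0.0, p - z * se) * scale, (p + z * se) * scale


# The same observed rate (50 per million words) in corpora of two sizes:
# one of a million words and one of a hundred million words.
for size in (1_000_000, 100_000_000):
    hits = size * 50 // 1_000_000
    freq, low, high = relative_frequency_interval(hits, size)
    print(f"{size:>11,} words: {freq:.1f} per million "
          f"(approx. 95% interval {low:.1f} to {high:.1f})")
```

Under these assumptions the interval around the estimate narrows with the square root of the corpus size, so a corpus one hundred times larger gives an interval one tenth as wide for the same observed rate.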
Synchronic corpora
The first electronic corpora were synchronic, i.e. they covered some range of contemporary English. The original corpus in this field is the Brown Corpus, compiled by W. Nelson Francis and Henry Kucera at Brown University, Providence, Rhode Island in 1961, which included about 1,000,000 words from various sources including newspaper texts. Two other major corpora are the London-Lund Corpus of spoken English, connected with the much larger Survey of English Usage project, started in 1959 at the University of London under the directorship of Randolph Quirk, and the Lancaster-Oslo-Bergen Corpus (connected with the University of Lancaster and Geoffrey Leech). The University of Birmingham also compiled a major corpus (under the directorship of John Sinclair) which was used for the COBUILD dictionary and grammar published by Collins in London. Corpora on a larger scale have recently been started. Two major examples of these are the International Corpus of English (until recently under the directorship of the late Sidney Greenbaum) and the British National Corpus, started in 1991 with various publishers and universities and targeting 100,000,000 words, some of which stem from transcriptions of speech. These corpora differ in scope, the former aiming to cover all the national varieties of English. Both are considerably larger than their predecessors and are projected to contain scores of millions of words.
Diachronic corpora
In the sense of collected texts of diachronic English in printed form, diachronic corpora have existed for at least a century, cf. the editions of key historical works published by the Early English Text Society. In the last two decades plans have been made for electronic corpora covering historical English. The best known of these is the Helsinki Corpus of English Texts. The Penn-Helsinki Parsed Corpus of Middle English is based on a part of the Helsinki Corpus and includes grammatical information in the version distributed to researchers. There are also various other, more specialised corpora available from the University of Helsinki, such as a corpus of medical texts and one of early English correspondence.
The information contained in a diachronic corpus might be lexicographical as with the Dictionary of Old English Corpus, the Early Modern English Dictionaries Corpus (both Toronto) and the Historical Thesaurus of English (London). It might be specific to a certain genre as with the Corpus of Early English Correspondence (Helsinki) and the Zurich English Newspaper Corpus or indeed be confined to a single author as with the Oxford Shakespeare Corpus. In yet other cases the limits may be defined by a certain period as with the ICAMET (International Computer Archive of Middle English Texts, Innsbruck) and the Lampeter Corpus of Early Modern English Texts (Chemnitz) or by a given region as with the Helsinki Corpus of Older Scots. Many of these corpora are in the process of completion at the universities of the cities indicated in their titles or in parentheses after the name.
§18 Journals of linguistics
American Speech. A Quarterly of Linguistic Usage, Tuscaloosa, Alabama, 1925/26-
Anglia. Journal for English philology, Tübingen, 1878-
§19 Series, collections and proceedings
Many publishing houses accommodate their books within series which they publish. A series unites books which are thematically related and so helps potential readers to identify them. Moreover, because a series has an independent editor or editors, the responsibility for soliciting and reviewing manuscripts does not rest primarily with the publishers. If anything, the number of linguistic series has been increasing in recent years, particularly with English publishing houses. The titles presented below are intended to give a representative selection rather than an exhaustive list; note also that some of the series are now defunct but are known for many seminal publications, e.g. Janua Linguarum (Mouton).
Articles or papers from conferences are frequently published as collections. This can be a regular feature or a single instance. In many cases the proceedings of a recurring conference are brought out by a single publisher. For example, the proceedings from the International Conference on Historical Linguistics (abbreviated as ICHL) and the equivalent conference for English historical studies, the International Conference on English Historical Linguistics (abbreviated as ICEHL), are usually published by Benjamins (Amsterdam). Another case in point is the historical series published by Mouton de Gruyter (Berlin), whose titles include Historical Phonology, Historical Morphology, Historical Syntax, Historical Semantics and Word Formation, Historical Dialectology and Historical Philology. In general one can say that the organizations which concern themselves with individual levels of linguistics hold conferences and publish their proceedings regularly as conference volumes (for instance on phonetics, phonology, generative syntax in Europe [GLOW, Generative Linguists of the Old World], etc.).
A further form in which groups of articles are published is the festschrift (a German term meaning ‘celebratory publication’), which is usually produced on the occasion of a particular birthday, often the 60th or 65th. Such a collection consists of contributions written by scholars associated with the individual who is to be honoured.
Mention should also be made of so-called working papers or occasional papers. These are as a rule issued by the department of a university and are intended to represent a pre-publication form of work by colleagues which is near completion. The material is usually photocopied rather than typeset, although recent improvements in technology have meant that there is little difference between the two procedures. The standard of the contents is frequently quite high, and some working papers require peer review, which is intended to guarantee consistent quality.
Anglistische Arbeitshefte ‘Anglistic notebooks’ (Tübingen: Niemeyer; Editors: Herbert Brekle and Wolfgang Kühlwein)
Applied Linguistics and Language Study Series (London: Longman; Editor: Christopher N. Candlin)
Blackwell Reference Library (Oxford: Blackwell)
Blackwell Textbooks in Linguistics (Oxford: Blackwell)
Cambridge Language Surveys (Cambridge: University Press; Editors: W. Sidney Allen et al.)
Cambridge Studies in Linguistics (Cambridge: University Press; Editors: W. Sidney Allen et al.)
Cambridge Textbooks in Linguistics (Cambridge: University Press; Editors: Bernard Comrie et al.)
Contributions to the Sociology of Language (Berlin: Mouton de Gruyter; Editor: Joshua Fishman)
Current Issues in Linguistic Theory (Amsterdam: Benjamins; Editor: Konrad Koerner)
Current Studies in Linguistics Series (Cambridge, Mass.: MIT Press; Editor: Samuel Jay Keyser)
English Language Series (London: Longman; Editor: Randolph Quirk)
Fontana Modern Masters (London: Fontana; Editor: Frank Kermode)
Foundations of Language, Supplementary Series (Dordrecht: Reidel; Editors: Morris Halle et al.)
Geschichte der Sprachtheorie ‘History of linguistic theory’ (Tübingen: Narr; Editor: Peter Schmitter)