Corpus Linguistics Bibliography


See Corpus Bibliography below -- pulled from an old cache.

Corpus Bibliography -- alphabetical

Aarts, J. 1991. Intuition-Based and Observation-Based Grammars. In K. Aijmer and B. Altenberg (eds), English Corpus Linguistics. London: Longman.

Andersen, G. & A-B. Stenstrom. Forthcoming. A corpus-based investigation of the discourse items cos and innit. Synchronic Corpus Linguistics. Toronto.

Barlow, M. 1992. Using Concordance Software in Language Teaching and Research. In Shinjo, W. et al. Proceedings of the Second International Conference on Foreign Language Education and Technology. Kasugai, Japan: LLAJ & IALL.

Barlow, M. 1995. A Guide to ParaConc. Houston: Athelstan.

Barlow, M. 1995. ParaConc: A Concordancer for Parallel Texts. Computers and Texts, 10. (CTI Textual Studies).

Barlow, M. Forthcoming. Parallel texts in linguistic analysis. In M. Barlow and S. Kemmer (eds.) Usage-based models of language.

Barlow, M. 1996. Corpora for Theory and Practice. International Journal of Corpus Linguistics, 1, 1.

Biber, D. 1988. Variation across speech and writing. Cambridge: Cambridge University Press.

Biber, D. and E. Finegan. 1991. On the Exploitation of Computerized Corpora in Variation Studies. In K. Aijmer and B. Altenberg (eds.), English Corpus Linguistics. London: Longman.

Brill, Eric and Philip Resnik. 1994. A Rule-Based Approach To Prepositional Phrase Attachment Disambiguation. COLING-94.

Chafe, W., J. DuBois, and S. Thompson. 1991. Towards a New Corpus of Spoken American English. In K. Aijmer and B. Altenberg (eds), English Corpus Linguistics. London: Longman.

Church, K., W. Gale, P. Hanks and D. Hindle. 1991. Using statistics in lexical analysis. In U. Zernik (ed.) Lexical Acquisition. Hillsdale, NJ: Erlbaum. 115-64.

Clear, J. H. 1992. Corpus Sampling. In G. Leitner (ed.) New Directions in English Language Corpora: Methodology, Results, Software Developments. Berlin: Mouton de Gruyter. 21-31.

Clear, J. H. 1993. The British National Corpus. In G. P. Landow and P. Delany (eds.) The Digital Word: Text-Based Computing in the Humanities. Cambridge, Mass.: MIT Press. 163-87.

Clear, J. 1993. From Firth principles: Computational tools for the study of collocation. In M. Baker, G. Francis, and E. Tognini-Bonelli (eds.) Text and Technology: In Honour of John Sinclair. Amsterdam: John Benjamins.

Clear, J. H. 1994. I Can't See the Sense in a Large Corpus. In F. Kiefer, G. Kiss, and J. Pajzs (eds.) Papers in Computational Lexicography: COMPLEX '94. Budapest: Research Institute for Linguistics, Hungarian Academy of Sciences. 33-48.

Davis, Mark and Ted Dunning. 1995. Query Translation Using Evolutionary Programming for Multi-lingual Information Retrieval. Proceedings of Evolutionary Programming 95.

Davis, Mark, Ted Dunning and Bill Ogden. 1995. Text Alignment in the Real World: Improving Alignments of Noisy Translations Using Common Lexical Features, String Matching Strategies and N-Gram Comparisons. European Association for Computational Linguistics.

Dunning, Ted. 1993. Accurate Methods for the Statistics of Surprise and Coincidence. Computational Linguistics 19, 1.

Dunning, Ted. 1992. A Single Language Evaluation of a Multi-lingual Text Retrieval System. In TREC-1 Proceedings. NIST Special Publication 500-207.

Dunning, Ted. 1994. Statistical Identification of Language. CLR Tech Report. (MCCS-94-273)

Dunning, Ted, Jim Cowie and Takahiro Waka. 1991. Analysis of Parallel Japanese and English Corpora. CLR Tech Report. (MCCS-91-233)

Dunning, Ted and Mark Davis. 1993. Multi-lingual Information Retrieval. CLR Tech Report. (MCCS-93-252)

Edwards, J. A. & M. D. Lampert (eds.) 1993. Talking Data: Transcription and Coding in Discourse Research. Hillsdale, NJ: Erlbaum Associates.

Eeg-Olofsson, Mats. 1991. Word-class tagging. Some computational tools. Department of computational linguistics, University of Göteborg (Gothenburg). ISBN 91-628-0297-6. (available through University Microfilms International)

Grefenstette, G. 1994. Explorations in automatic thesaurus discovery. Boston: Kluwer Academic Publishers.

Haslerud, V. & A-B. Stenstrom. 1994. COLT: Mark-up and Trends. Hermes, Journal of Linguistics 13, 55-70.

Haslerud, V. & A-B. Stenstrom. 1995. The Bergen Corpus of London Teenage Language (COLT). In G. Leech, G. Myers & J. Thomas (eds) Spoken English on computer. London: Longman, 235-242.

Johns, T. 1988. Whence and Whither Classroom Concordancing? In T. Bongaerts et al (eds.) Computer Applications in Language Learning. Dordrecht: Foris.

Johns, T. 1991a. Should you be persuaded: Two Examples of Data-Driven Learning Materials. English Language Research Journal (4) 1-16. University of Birmingham.

Johns, T. 1991b. From printout to handout: Grammar and vocabulary learning in the context of data-driven learning. English Language Research Journal (4) 27-45. University of Birmingham.

Jordan, G. 1992. Concordances: Research Findings and Learner Processes. Unpublished M.A. dissertation.

Klavans, Judith L. and Evelyne Tzoukermann. 1989. Movement Verbs in English-French Translation: A Corpus-based Approach. Proceedings of the Sixth Israeli Conference of Artificial Intelligence and Computer Vision. Tel Aviv, Israel.

Klavans, Judith L. and Evelyne Tzoukermann. 1990. Linking Bilingual Corpora and Machine Readable Dictionaries with the BICORD System. Proceedings of the Sixth Conference of the University of Waterloo Centre for the New Oxford English Dictionary and Text Research: Electronic Text Research, University of Waterloo, Canada.

Klavans, Judith L. and Evelyne Tzoukermann. 1990. Combining Lexical Information from Bilingual Corpora and Machine-Readable Dictionaries. Proceedings of the 13th International Conference on Computational Linguistics: COLING. Helsinki, Finland.

Klavans, Judith and Evelyne Tzoukermann. 1996. Dictionaries and Corpora: Combining Corpus and Machine-Readable Dictionary Data for Building Bilingual Lexicons. The Machine Translation Journal. Kluwer.

Nattinger, J. R. and J. S. Decarrico. 1992. Lexical phrases and language teaching. Oxford: Oxford University Press.

Quirk, R. 1992. On Corpus Principles and Design. In J. Svartvik (ed) Directions in Corpus Linguistics. Berlin: Mouton de Gruyter.

Resnik, Philip. WordNet and Distributional Analysis: A Class-based Approach to Lexical Discovery. AAAI Workshop on Statistically-based NLP Techniques.

Resnik, Philip. 1992. Probabilistic Tree-Adjoining Grammar as a Framework for Statistical Natural Language Processing. Proceedings of the Fourteenth International Conference on Computational Linguistics (COLING '92).

Resnik, Philip. 1993. Semantic Classes and Syntactic Ambiguity. Proceedings of the 1993 ARPA Human Language Technology Workshop. Morgan Kaufmann.

Resnik, Philip and Marti Hearst. 1993. Syntactic Ambiguity and Conceptual Relations. Proceedings of the ACL Workshop on Very Large Corpora. 58-64.

Resnik, Philip. 1993. Selection and Information: A Class-Based Approach to Lexical Relationships. Unpublished Dissertation. University of Pennsylvania.

Resnik, Philip. 1995. Using Information Content to Evaluate Semantic Similarity in a Taxonomy. IJCAI-95.

Resnik, Philip. 1995. Disambiguating Noun Groupings with Respect to WordNet Senses. Third Workshop on Very Large Corpora. Association for Computational Linguistics.

Sampson, G. 1987. Evidence Against the "Grammatical"/"Ungrammatical" Distinction. In W. Meijs (ed) Corpus Linguistics and Beyond. Amsterdam: Rodopi.

Sinclair, J. 1991. Corpus, concordance, collocation. Oxford: Oxford University Press.

Stenstrom, A-B. & J. Svartvik. 1994. Imparsable speech: Repeats and nonfluencies in spoken English. In N. Oostdijk & P. de Haan (eds) Corpus-based research into language. Amsterdam: Rodopi, 241-254.

Stenstrom, A-B. 1995a. Taboos in Teenage Talk. In G. Melchers & B. Warren (eds) Studies in Anglistics. Stockholm: Almqvist & Wiksell International, 71-80.

Stenstrom, A-B. 1995b. Some remarks on comment clauses. In B. Aarts & Ch. Meyer (eds) The verb in contemporary English. Cambridge: Cambridge University Press, 290-302.

Stevens, V. 1991. Concordance-based Vocabulary Exercises: A Viable Alternative to Gap-Filling. English Language Research Journal (4) 47-61. University of Birmingham.

Stevens, Vance. 1995. Concordancing with Language Learners: Why? When? What? CAELL Journal 6, 2: 2-10.

Stubbs, M. 1995. Collocations and semantic profiles: On the cause of the trouble with quantitative studies. Functions of Language 2, 1: 23-55.

Tribble, C. and G. Jones. 1990. Concordances in the Classroom. London: Longman.

White, Owen and Ted Dunning. 1993. Computational tools for DNA sequence analysis. In Mark Adams, Chris Fields and J. Craig Venter (eds) Automated DNA Sequencing and Analysis. Academic Press, London.

White, Owen, Ted Dunning, Sutton Granger, Mark Adams, J. Craig Venter and Chris Fields. 1993. A quality control algorithm for DNA sequencing projects. Nucleic Acids Research, 21, 16.

Willis, D. 1990. The Lexical Syllabus. Collins.

Corpus Bibliography -- topic

  • Corpora: Construction and Parsing
  • Corpus Linguistics and Text Analysis
  • Corpora and Language Teaching
  • Parallel Corpora

    Corpora: Construction and Parsing

    Atkins, S., J. H. Clear and N. Ostler. 1992. Corpus Design Criteria. Literary and Linguistic Computing 7, 1: 1-16.

    Brill, Eric and Philip Resnik. 1994. A Rule-Based Approach To Prepositional Phrase Attachment Disambiguation. COLING-94.

    Chafe, W., J. DuBois, and S. Thompson. 1991. Towards a New Corpus of Spoken American English. In K. Aijmer and B. Altenberg (eds), English Corpus Linguistics. London: Longman.

    Clear, J. H. 1992. Corpus Sampling. In G. Leitner (ed.) New Directions in English Language Corpora: Methodology, Results, Software Developments. Berlin: Mouton de Gruyter. 21-31.

    Clear, J. H. 1993. The British National Corpus. In G. P. Landow and P. Delany (eds.) The Digital Word: Text-Based Computing in the Humanities. Cambridge, Mass.: MIT Press. 163-87.

    Davis, Mark and Ted Dunning. 1995. Query Translation Using Evolutionary Programming for Multi-lingual Information Retrieval. Proceedings of Evolutionary Programming 95.

    Dunning, Ted. 1992. A Single Language Evaluation of a Multi-lingual Text Retrieval System. In TREC-1 Proceedings. NIST Special Publication 500-207.

    Dunning, Ted. 1994. Statistical Identification of Language. CLR Tech Report. (MCCS-94-273)

    Dunning, Ted and Mark Davis. 1993. Multi-lingual Information Retrieval. CLR Tech Report. (MCCS-93-252)

    Edwards, J. A. & M. D. Lampert (eds.) 1993. Talking Data: Transcription and Coding in Discourse Research. Hillsdale, NJ: Erlbaum Associates.

    Eeg-Olofsson, Mats. 1991. Word-class tagging. Some computational tools. Department of computational linguistics, University of Göteborg (Gothenburg). ISBN 91-628-0297-6. (available through University Microfilms International)

    Haslerud, V. & A-B. Stenstrom. 1994. COLT: Mark-up and Trends. Hermes, Journal of Linguistics 13, 55-70.

    Haslerud, V. & A-B. Stenstrom. 1995. The Bergen Corpus of London Teenage Language (COLT). In G. Leech, G. Myers & J. Thomas (eds) Spoken English on computer. London: Longman, 235-242.

    Quirk, R. 1992. On Corpus Principles and Design. In J. Svartvik (ed) Directions in Corpus Linguistics. Berlin: Mouton de Gruyter.

    Resnik, Philip. 1992. Probabilistic Tree-Adjoining Grammar as a Framework for Statistical Natural Language Processing. Proceedings of the Fourteenth International Conference on Computational Linguistics (COLING '92).

    Corpus Linguistics and Text Analysis

    Aarts, J. 1991. Intuition-Based and Observation-Based Grammars. In K. Aijmer and B. Altenberg (eds), English Corpus Linguistics. London: Longman.

    Andersen, G. & A-B. Stenstrom. Forthcoming. A corpus-based investigation of the discourse items cos and innit. Synchronic Corpus Linguistics. Toronto.

    Barlow, M. Forthcoming. Parallel texts in linguistic analysis. In M. Barlow and S. Kemmer (eds.) Usage-based models of language.

    Barlow, M. 1996. Corpora for Theory and Practice. International Journal of Corpus Linguistics, 1, 1.

    Biber, D. 1988. Variation across speech and writing. Cambridge: Cambridge University Press.

    Biber, D. and E. Finegan. 1991. On the Exploitation of Computerized Corpora in Variation Studies. In K. Aijmer and B. Altenberg (eds.), English Corpus Linguistics. London: Longman.

    Church, K., W. Gale, P. Hanks and D. Hindle. 1991. Using statistics in lexical analysis. In U. Zernik (ed.) Lexical Acquisition. Hillsdale, NJ: Erlbaum. 115-64.

    Clear, J. 1993. From Firth principles: Computational tools for the study of collocation. In M. Baker, G. Francis, and E. Tognini-Bonelli (eds.) Text and Technology: In Honour of John Sinclair. Amsterdam: John Benjamins.

    Clear, J. H. 1994. I Can't See the Sense in a Large Corpus. In F. Kiefer, G. Kiss, and J. Pajzs (eds.) Papers in Computational Lexicography: COMPLEX '94. Budapest: Research Institute for Linguistics, Hungarian Academy of Sciences. 33-48.

    Dunning, Ted. 1993. Accurate Methods for the Statistics of Surprise and Coincidence. Computational Linguistics 19, 1.

    Grefenstette, G. 1994. Explorations in automatic thesaurus discovery. Boston: Kluwer Academic Publishers.

    Klavans, Judith L. and Evelyne Tzoukermann. 1989. Movement Verbs in English-French Translation: A Corpus-based Approach. Proceedings of the Sixth Israeli Conference of Artificial Intelligence and Computer Vision. Tel Aviv, Israel.

    Resnik, Philip. WordNet and Distributional Analysis: A Class-based Approach to Lexical Discovery. AAAI Workshop on Statistically-based NLP Techniques.

    Resnik, Philip. 1993. Semantic Classes and Syntactic Ambiguity. Proceedings of the 1993 ARPA Human Language Technology Workshop. Morgan Kaufmann.

    Resnik, Philip and Marti Hearst. 1993. Syntactic Ambiguity and Conceptual Relations. Proceedings of the ACL Workshop on Very Large Corpora. 58-64.

    Resnik, Philip. 1993. Selection and Information: A Class-Based Approach to Lexical Relationships. Unpublished Dissertation. University of Pennsylvania.

    Resnik, Philip. 1995. Using Information Content to Evaluate Semantic Similarity in a Taxonomy. IJCAI-95.

    Resnik, Philip. 1995. Disambiguating Noun Groupings with Respect to WordNet Senses. Third Workshop on Very Large Corpora. Association for Computational Linguistics.

    Sampson, G. 1987. Evidence Against the "Grammatical"/"Ungrammatical" Distinction. In W. Meijs (ed) Corpus Linguistics and Beyond. Amsterdam: Rodopi.

    Sinclair, J. 1991. Corpus, concordance, collocation. Oxford: Oxford University Press.

    Stenstrom, A-B. & J. Svartvik. 1994. Imparsable speech: Repeats and nonfluencies in spoken English. In N. Oostdijk & P. de Haan (eds) Corpus-based research into language. Amsterdam: Rodopi, 241-254.

    Stenstrom, A-B. 1995a. Taboos in Teenage Talk. In G. Melchers & B. Warren (eds) Studies in Anglistics. Stockholm: Almqvist & Wiksell International, 71-80.

    Stenstrom, A-B. 1995b. Some remarks on comment clauses. In B. Aarts & Ch. Meyer (eds) The verb in contemporary English. Cambridge: Cambridge University Press, 290-302.

    Stubbs, M. 1995. Collocations and semantic profiles: On the cause of the trouble with quantitative studies. Functions of Language 2, 1: 23-55.

    White, Owen and Ted Dunning. 1993. Computational tools for DNA sequence analysis. In Mark Adams, Chris Fields and J. Craig Venter (eds) Automated DNA Sequencing and Analysis. Academic Press, London.

    White, Owen, Ted Dunning, Sutton Granger, Mark Adams, J. Craig Venter and Chris Fields. 1993. A quality control algorithm for DNA sequencing projects. Nucleic Acids Research, 21, 16.

    Corpora and Language Teaching

    Barlow, M. 1992. Using Concordance Software in Language Teaching and Research. In Shinjo, W. et al. Proceedings of the Second International Conference on Foreign Language Education and Technology. Kasugai, Japan: LLAJ & IALL.

    Barlow, M. 1995. A Guide to ParaConc. Houston: Athelstan.

    Barlow, M. 1995. ParaConc: A Concordancer for Parallel Texts. Computers and Texts, 10. (CTI Textual Studies).

    Johns, T. 1988. Whence and Whither Classroom Concordancing? In T. Bongaerts et al (eds.) Computer Applications in Language Learning. Dordrecht: Foris.

    Johns, T. 1991a. Should you be persuaded: Two Examples of Data-Driven Learning Materials. English Language Research Journal (4) 1-16. University of Birmingham.

    Johns, T. 1991b. From printout to handout: Grammar and vocabulary learning in the context of data-driven learning. English Language Research Journal (4) 27-45. University of Birmingham.

    Jordan, G. 1992. Concordances: Research Findings and Learner Processes. Unpublished M.A. dissertation.

    Nattinger, J. R. and J. S. Decarrico. 1992. Lexical phrases and language teaching. Oxford: Oxford University Press.

    Stevens, V. 1991. Concordance-based Vocabulary Exercises: A Viable Alternative to Gap-Filling. English Language Research Journal (4) 47-61. University of Birmingham.

    Stevens, Vance. 1995. Concordancing with Language Learners: Why? When? What? CAELL Journal 6, 2: 2-10.

    Tribble, C. and G. Jones. 1990. Concordances in the Classroom. London: Longman.

    Willis, D. 1990. The Lexical Syllabus. Collins.

    Parallel Corpora

    Barlow, M. 1995. A Guide to ParaConc. Houston: Athelstan.

    Barlow, M. 1996. ParaConc. Computers and Texts. (CTI Textual Studies).

    Davis, Mark, Ted Dunning and Bill Ogden. 1995. Text Alignment in the Real World: Improving Alignments of Noisy Translations Using Common Lexical Features, String Matching Strategies and N-Gram Comparisons. European Association for Computational Linguistics.

    Dunning, Ted, Jim Cowie and Takahiro Waka. 1991. Analysis of Parallel Japanese and English Corpora. CLR Tech Report. (MCCS-91-233)

    Klavans, Judith L. and Evelyne Tzoukermann. 1990. Linking Bilingual Corpora and Machine Readable Dictionaries with the BICORD System. Proceedings of the Sixth Conference of the University of Waterloo Centre for the New Oxford English Dictionary and Text Research: Electronic Text Research, University of Waterloo, Canada.

    Klavans, Judith L. and Evelyne Tzoukermann. 1990. Combining Lexical Information from Bilingual Corpora and Machine-Readable Dictionaries. Proceedings of the 13th International Conference on Computational Linguistics: COLING. Helsinki, Finland.

    Klavans, Judith and Evelyne Tzoukermann. 1996. Dictionaries and Corpora: Combining Corpus and Machine-Readable Dictionary Data for Building Bilingual Lexicons. The Machine Translation Journal. Kluwer.