Book description
The first textbook of its kind, Quantitative Corpus Linguistics with R demonstrates how to use the open-source programming language R for corpus-linguistic analyses. Computational and corpus linguists will find that R handles, in a single environment, a range of tasks that currently require several separate programs: searching and processing corpora, arranging and outputting the results of corpus searches, statistical evaluation, and graphing.
Table of contents
 Cover Page
 Title Page
 Copyright Page
 Acknowledgments
 1. Introduction
 2. The Three Central Corpus-linguistic Methods

3. An Introduction to R
 3.1 A Few Central Notions: Data Structures, Functions, and Arguments
 3.2 Vectors
 3.3 Factors
 3.4 Data Frames
 3.5 Lists
 3.6 Elementary Programming Functions

3.7 Character/String Processing
 3.7.1 Getting Information from and Accessing (Vectors of) Character Strings
 3.7.2 Elementary Ways to Change (Vectors of) Character Strings
 3.7.3 Merging and Splitting (Vectors of) Character Strings without Regular Expressions
 3.7.4 Searching and Replacing without Regular Expressions
 3.7.5 Searching and Replacing with Regular Expressions
 3.7.6 Merging and Splitting (Vectors of) Character Strings with Regular Expressions
 3.8 File and Directory Operations

4. Using R in Corpus Linguistics

4.1 Frequency Lists
 4.1.1 A Frequency List of an Unannotated Corpus
 4.1.2 A Reverse Frequency List of an Unannotated Corpus
 4.1.3 A Frequency List of an Annotated Corpus
 4.1.4 A Frequency List of Tag-word Sequences from an Annotated Corpus
 4.1.5 A Frequency List of Word Pairs from an Annotated Corpus
 4.1.6 A Frequency List of an Annotated Corpus (with One Word Per Line)
 4.1.7 A Frequency List of Word Pairs of an Annotated Corpus (with One Word Per Line)
 4.2 Concordances
 4.3 Collocations
 4.4 Excursus 1: Processing Multitiered Corpora
 4.5 Excursus 2: Unicode
 4.5.1 Frequency Lists
 4.5.2 Concordancing

5. Some Statistics for Corpus Linguistics
 5.1 Introduction to Statistical Thinking
 5.2 Categorical Dependent Variables

5.3 Interval/Ratio-scaled Dependent Variables
 5.3.1 Descriptive Statistics for Interval/Ratio-scaled Dependent Variables
 5.3.2 One Interval/Ratio-scaled Dependent Variable, One Categorical Independent Variable
 5.3.3 One Interval/Ratio-scaled Dependent Variable, One Interval/Ratio-scaled Independent Variable
 5.3.4 One Interval/Ratio-scaled Dependent Variable, 2+ Independent Variables
 5.4 Customizing Statistical Plots
 5.5 Reporting Results
 6. Case Studies and Pointers to Other Applications
 Appendix
 References
 Endnotes
Product information
 Title: Quantitative Corpus Linguistics with R
 Author(s):
 Release date: March 2009
 Publisher(s): Routledge
 ISBN: 9781135895594