Web As Corpus: Theory and Practice

A&C Black, Feb 13, 2014 - Language Arts & Disciplines - 224 pages
Is the internet a suitable linguistic corpus? How can we use it in corpus techniques? What are the special properties that we need to be aware of? This book answers those questions.

The Web is an exponentially increasing source of language and corpus linguistics data. From gigantic static information resources to user-generated Web 2.0 content, the breadth and depth of information available is breathtaking – and bewildering. This book explores the theory and practice of the “web as corpus”. It looks at the most common tools and methods used and features a plethora of examples based on the author's own teaching experience. This book also bridges the gap between studies in computational linguistics, which emphasize technical aspects, and studies in corpus linguistics, which focus on the implications for language theory and use.
 

Contents

Introduction
Basic Principles
An Introduction to the Web as Corpus
Web Search from a Corpus Perspective
Concordancing the Web
Tools and Methods
Sketches of Language and Culture from Large Web Corpora
The Web as Corpus in the Web 2.0 Era
Conclusion
References
Index


About the author (2014)

Maristella Gatto is a Researcher and Lecturer in English Language and Translation at the Faculty of Modern Languages, University of Bari, Italy.
