Background

A corpus (plural corpora) is a collection of texts. The word is Latin for ‘body’, so a corpus is simply a body of texts. Before the development of computer technology, corpora were made up of handwritten or typed texts, which limited the ease with which they could be manipulated. Nowadays, people building corpora use electronic texts. Instead of looking through documents manually, you can use dedicated software to analyze them.

However, you don’t need to be a computer genius to benefit from the insights that a corpus gives. Nor do you need expensive equipment or specialist knowledge to put together a group of texts to make a corpus. In fact, you may even have a corpus without realizing it.

In this unit you will get the chance to consider whether you already have texts that could be converted into a corpus. In addition, you will be introduced to the key principles of corpus research, and will explore a range of general and more specialist corpora.
