This article is about the British National Corpus. Since many readers may not know what a corpus is, I will start with a few lines on corpora in general, and then focus on the British National Corpus and try to explain how it works.
What is a corpus?
According to the Oxford Dictionary, a corpus is “a collection of written or spoken material in machine-readable form, assembled for the purpose of linguistic research”.
The plural of “corpus” is usually “corpora”.
What are they used for?
They are used to store words, whose features can be analyzed by means of tagging and concordancing programs, and they help in studying linguistic competence. They are also used for statistical analysis and hypothesis testing, for example checking how often a form occurs or validating linguistic rules against a specific universe of texts.
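To make the ideas of concordancing and checking occurrences more concrete, here is a minimal sketch in Python. It is not how the British National Corpus or any real concordancing program is implemented. The three-sentence toy corpus and the function names (`tokenize`, `word_frequencies`, `concordance`) are assumptions made up for illustration, and tagging is left out entirely.

```python
import re
from collections import Counter

# Hypothetical toy corpus for illustration; a real corpus is far larger.
TOY_CORPUS = [
    "The cat sat on the mat.",
    "A dog chased the cat across the garden.",
    "The garden was quiet after the rain.",
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(texts):
    """Count how often each word occurs across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def concordance(texts, keyword, window=2):
    """Return keyword-in-context lines with `window` words on each side."""
    lines = []
    for text in texts:
        tokens = tokenize(text)
        for i, token in enumerate(tokens):
            if token == keyword:
                left = " ".join(tokens[max(0, i - window):i])
                right = " ".join(tokens[i + 1:i + 1 + window])
                lines.append(f"{left} [{token}] {right}".strip())
    return lines


if __name__ == "__main__":
    # Statistical side: which words are most frequent?
    print(word_frequencies(TOY_CORPUS).most_common(3))
    # Concordancing side: every occurrence of "cat" with its context.
    for line in concordance(TOY_CORPUS, "cat"):
        print(line)
```

The frequency count stands in for the statistical use of a corpus, and the keyword-in-context lines show what a concordancing program gives a researcher: every occurrence of a word together with its neighbours, so that its typical usage can be inspected.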