Corpus-based analysis of the partial synonyms "oppress, repress, suppress" with regard to their contextual usage

Table of contents

1. Introduction
1.1. Corpus semantics and its possibilities
1.2. My basis of research - the BNC
1.3. My object of research - “oppress - repress - suppress”
1.4. Means and methods used
1.5. Considerations in advance

2. Word meaning - different layers of contents

3. Synonymy - different shades of parity

4. Oppress, repress, suppress - Numerical Analysis

5. Oppress, repress, suppress - Semantic Analysis
5.1. Possible meanings according to the OED online
5.2. Subdivision with respect to the word meanings

6. Conclusions
6.1. Results of the numerical analysis
6.2. Results of the semantic analysis
6.3. Ideas for further study

7. References

1. Introduction

1.1. Corpus semantics and its possibilities

The term “corpus” refers to a collection of written texts or transcribed speech that may be used as a basis for linguistic analysis and description (cf. Kennedy 1998; p. 1). Its defining features are finite size, machine-readable form, representativeness, and status as a standard reference (cf. McEnery & Wilson 1996; pp. 21-24).

As a general-purpose corpus, especially a large, so-called mega-corpus, is more or less representative of the language variety it is based on, one may generalize from the results of a corpus-based investigation and presume that they show tendencies also present in the variety as a whole. Such an investigation thereby reveals not only the feasibility but also the probability and appropriateness of certain language phenomena: “[...] Evidence for evaluative meanings can be collected by quantitative analysis of large corpora.” (Stubbs 2002, p. 197)

Compiled corpora offer the advantage of an approach that is time-saving and potentially free of oversights or slips, as the decisions made “can be checked by independent observers. Data and methods therefore make possible the replicable and empirical analysis of meaning.” (ibid., p. 50) Finally, corpus linguistics, besides being a means of scientific discovery, has various fields of practical application, among them lexicology and language teaching (cf. McEnery & Wilson 1996; pp. 86-115).

1.2. My basis of research - the BNC

Combining several previous corpora, the British National Corpus (BNC) was completed in 1994 and constitutes one of the largest corpora of British English. It contains 100 million words, of which about 90 per cent come from written registers and the remainder from transcribed speech. The texts are annotated with SGML (Standard Generalized Markup Language) and tagged for word class by means of CLAWS, an automatic tagging system developed at Lancaster University (cf. Kennedy 1998; pp. 1-85, McEnery & Wilson 1996; pp. 22-60/87-115).

1.3. My object of research - “oppress - repress - suppress”

As an object of analysis I chose the triplet of (ostensible) synonyms oppress, repress, and suppress, which all express an action of subjection and therefore share at least a part of their meaning. Also, all three of them are (in most contexts) commonly translated into German as ‘unterdrücken’. The question is to what extent they may be called synonymous.

This choice of words was influenced by my experience with the French verbs opprimer, réprimer, and supprimer. While the first two match their English counterparts, the seemingly equivalent supprimer proves to be something of a false friend: it usually conveys the meaning of ‘to abolish’ or ‘to remove’ and therefore covers only a very restricted part of the meaning expressed by the English suppress.

1.4. Means and methods used

Several programs allow the production of a corpus-based concordance, for example “WordSmith”, “SARA-32”, and “ICECUP”. I chose “WordSmith” as my concordancer because it allows tags to be assigned to concordance lines in order to mark the different semantic fields the words appear in, which makes it extremely useful for my investigation of the contextual usage of the three partial synonyms.

I searched for occurrences of the lemmas oppress, repress, and suppress in all of their morphological forms (infinitive, 3rd person singular, present participle, and past participle) and first analysed their numerical distribution. Then, for the sake of clarity, I made a random selection of about a quarter of the results. I defined sets representing semantic fields, roughly following the possible meanings indicated by the OED Online (see below), and sorted the results first by the set they belonged to and second alphabetically (by the central word and the next word to the right), in order to group the words into word families.
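The procedure described above can be illustrated with a minimal Python sketch. It is not the WordSmith workflow actually used for this paper; it only mirrors the three steps (finding the verbal forms, drawing a random quarter, sorting by set and then alphabetically). The function names, the record fields, and the fixed random seed are assumptions made for illustration, and the semantic set labels would have to be assigned by hand.

```python
import random
import re

# Matches the verbal forms of the three lemmas: base form, -es, -ed, -ing.
# Derived nouns and adjectives (e.g. "oppression", "repressive") are excluded.
FORM_PATTERN = re.compile(
    r"\b(oppress|repress|suppress)(es|ed|ing)?\b", re.IGNORECASE
)


def find_hits(text):
    """Return one record per occurrence: lemma, node word, next word to the right."""
    hits = []
    for match in FORM_PATTERN.finditer(text):
        # Simplification: the right-hand neighbour may cross a sentence boundary.
        right = re.match(r"\W*(\w+)", text[match.end():])
        hits.append({
            "lemma": match.group(1).lower(),
            "node": match.group(0).lower(),
            "right": right.group(1).lower() if right else "",
        })
    return hits


def sample_quarter(hits, seed=0):
    """Draw a reproducible random selection of about a quarter of the hits."""
    if not hits:
        return []
    rng = random.Random(seed)  # hypothetical fixed seed, for replicability
    return rng.sample(hits, max(1, len(hits) // 4))


def sort_hits(hits):
    """Sort first by semantic set, then by node word, then by right neighbour."""
    return sorted(hits, key=lambda h: (h.get("set", ""), h["node"], h["right"]))
```

Each record returned by find_hits would be given a "set" entry by the analyst (for example a label for a political or an emotional context) before sort_hits is called, so that members of the same word family end up next to each other within each set.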

1.5. Considerations in advance

Based on the average non-native speaker's experience with the trio, one might presume that, in terms of numerical distribution, suppress appears more often than repress (which I, for example, had never used myself before).

As for the semantic contexts the words may appear in, relatively little is really known to foreign learners. (And native speakers are probably equally unaware of them.) As all three are based on the Latin PREMERE, ‘to press’, preceded by the prefixes for ‘against’ (OB-), ‘back’ (RE-) and ‘down’ (SUB-), and as they all express some sort of subjection, one might tend to call the three words synonyms, though it remains to be seen in which respects they really prove to be synonymous (Collins, pp. 1092, 1307, 1539).



