Word-formation at the time of COVID-19

Word-formation patterns and studying of COVID-19 neologisms

Academic Paper, 2021

99 Pages, Grade: 2


Table of contents

1 Introduction

2 What are ‘new words’?

3 Word-formation patterns
3.1 Affixation
3.2 Prefixation
3.3 Suffixation
3.4 Compounding
3.5 Conversion
3.6 Back-formation
3.7 Clipping
3.8 Acronyms and abbreviations
3.9 Blending
3.10 Loan words
3.11 Onomatopoeia
3.12 Multi-word expressions

4 Studying of COVID-19 neologisms

5 Methodology
5.1 Collecting and analyzing data
5.2 Limitations

6 Results

7 Conclusion


Appendix A. Table 1

Appendix B. Table 2

List of figures

Figure 1. The classification of word-formation processes (Quirk et al. 1985: 1520; cited in Schmid, 2016)

Figure 2. Parts of Speech

Figure 3. When were the words coined?

Figure 4. Word-formation types

Figure 5. Mixed types

List of tables

Table 1. Parts of speech

Table 2. When were the words coined?

Table 3. Word-formation types

Table 4. Mixed types

1 Introduction

The ongoing COVID-19 pandemic has caused many changes in our everyday life, which are also reflected in the way we speak. New concepts and ideas needed to be named to find their place in the lexicon. This led to the emergence of many new words and expressions in the English language. Some of them name specific things, such as the new virus itself (COVID-19), while others are examples of language creativity and wordplay, for example, the word cornteen, which imitates the American pronunciation of the widely used word quarantine.

Currently, it is almost impossible to say which new words will gain a permanent place in the vocabulary and which are just occasional coinages that will disappear once the pandemic is over. Answering this question requires more time, but what is possible now is to trace the development of the English lexicon. For this purpose, the new COVID-19 words have to be documented and analyzed. As the pandemic has not yet ended and further words can still be coined, multi-step research is required. The topic has already gained some attention from the scientific community, but only a few studies analyze the new COVID-19 words. The present study therefore aims to contribute to the documentation and analysis of the new coinages.

The research presented in this paper was conducted in several steps. The words were collected from various sources and then filtered so that only coinages for which example sentences were found would be analyzed. The analysis revealed whether the words were coined before or during the pandemic, whether they were reintroduced into the vocabulary (if they had been coined before the pandemic), what parts of speech they belong to, and what word-formation patterns were used to create them. The results of the analysis are presented in several tables and pie charts and can support future research on the topic.

This paper consists of several parts, namely an introduction, theoretical background, methodology, results, conclusion, references, and an appendix.

2 What are ‘new words’?

The vocabulary of a language is constantly developing. New items enter the language with the emergence of novel ideas, concepts, social practices, and things. Some items are replaced or fall out of use because the objects they named have disappeared. Others have to adapt to new realities and develop new meanings.

Lexis constitutes the most flexible linguistic domain (Robinson 2019) which is open to almost all kinds of manifestations of language creativity. However, not every newly coined word becomes a part of vocabulary as there are certain factors influencing the establishment of new words and the stages they have to go through. This section is aimed at addressing this issue and providing an overview of the development of new lexemes in a language.

According to Schmid (2016: 69-70), there are several ways to add a word to the lexicon. First of all, word-formation enables the creation of a new word from morphological material already existing in the language. An example is the word preoccupation, which originates from the verb occupy and has additional morphemes, namely the prefix pre- and the suffix -ion. Another possible procedure is word creation, whereby new lexemes are invented entirely from scratch, for example, Kodak, frisbee, or yo-yo. However, the chances of survival for absolutely novel words are low, as lexemes coined in this way do not resemble any other words in the language; such a resemblance would usually make new items considerably easier to memorize. Words can also be borrowed from other languages, e.g. Laptop, Notebook (from German), skepticism, biology (from Ancient Greek), bonus, allusion (from Latin), etc. These words do not have to be identical to the words of the language of origin, and sometimes only certain parts of words are borrowed (e.g. prefixes of Latin origin such as inter-, ambi-, ultra-, etc.). As a rule, borrowed words are also accommodated and integrated into the new language on different levels. This can result in different pronunciations (e.g. kindergarten in English and German) or changes of affixes (e.g. the Latin word externus and the English variant external), amongst others (Platzer 2021). New objects and ideas can also be named with the help of a semantic shift, which sometimes occurs in the form of a metaphor (e.g. the word mouse meaning a computer accessory).

The establishment of lexemes in language can be observed from structural, sociopragmatic, and cognitive perspectives (Schmid 2016: 71). The structural perspective focuses on the internal structure of the word, its form, meaning, and dependence on linguistic context. The sociopragmatic perspective views words in the speech community considering their spread and familiarity for the speakers of this community. The cognitive perspective deals with words entrenched in the individual mental lexicons of the speakers and their conceptual status (Schmid 2016: 71).

Before a word becomes a member of the general word stock, it has to go through three stages, namely, creation, consolidation, and establishment (Schmid 2016: 71). These stages can be described in terms of the three perspectives mentioned above.

The stage of creation marks the appearance of a new word. The initially used word is called an ad-hoc formation or nonce formation. From the sociopragmatic perspective, speakers coin absolutely new words when the existing word stock cannot help them to describe a novel concept, idea, or thing. New coinages can also be used for brevity and economy, e.g. the many abbreviations which help to encompass large concepts with just one word (Schmid 2016: 73). The need to mark a certain style can be a reason for the creation of new words too (Fischer 1998: 8). That is why media and newspaper language, for which expressivity plays a very important role, often becomes a source of novel formations. From the structural perspective, nonce words “constitute a previously non-existent combination of existing morphemes” (Schmid 2016: 73) and they are type-familiar as “we can identify them as a new instance of a known word-formation type” (Schmid 2016: 74). Ad-hoc formations are often vague and ambiguous, so context has to be provided in order to understand their meaning (Schmid 2016: 74). From the cognitive perspective, it might be argued that novel formations have no entry in the mental lexicon yet, while the morphemes they consist of are already stored and accessible there (Schmid 2016: 74).

Before turning to the next stage, the difference between the terms nonce formation and neologism should be discussed, as they are often confused and provoke much discussion. According to Fischer (1998: 7-8), they can be seen as two stages in the development of a new word. The nonce formation is an intentional linguistic action; then, “when the formation is used by a number of speakers with similar intentions, the process of institutionalization is started” (Fischer 1998: 7). Following that, the nonce formation can develop into a neologism, which, according to Schmid (2016: 75), happens at the stage of consolidation discussed below. Fischer (1998: 4-5) also draws a connection with the ideas of Saussure and assigns nonce formations to the level of parole, i.e. linguistic performance, neologisms to the level of norm, which is the level of language in use, and, finally, lexical elements of the common vocabulary to the level of langue, i.e. the language system.

The stage of consolidation, from the sociopragmatic perspective, is characterized by the spread and diffusion of the new word in the speech community. At this stage, as already mentioned, the word usually becomes a neologism, i.e. it is still perceived as new, but the number of speakers using it is growing (Schmid 2016: 75). From the structural perspective, at the stage of consolidation, the form and meaning of the word are stabilized. Its original ambiguity is reduced, as is its context-dependence. From the cognitive perspective, “the word is tentatively assigned an entry in the mental lexicon of an individual speaker and begins to be linked with other entries in a multitude of associative connections” (Schmid 2016: 76). Besides that, “syntagmatic relations such as collocations start to develop when the possible combinations of, e.g., a new noun with verbs or adjectives become fixed” (Schmid 2016: 76).

Finally, the stage of establishment, from the sociopragmatic perspective, marks the full institutionalization of the lexeme, which means it is known and used by the majority of speakers. The level of institutionalization can vary. There are both more and less institutionalized lexemes: the latter are used in specialized areas, which limits the number of speakers familiar with them, while the former are known by almost every adult speaker of the speech community (Schmid 2016: 77). From the structural perspective, the lexeme at this stage is also fully lexicalized, which means it exhibits formal properties that “are not explicable using the rules of word-formation and/or has semantic features that cannot be deduced from the meaning of its components” (Schmid 2016: 77). From the cognitive perspective, the lexeme is entrenched in the mental lexicon and constitutes a holistic concept and an autonomous conceptual gestalt (Schmid 2016: 80-81).

However, “the three stages do not always run in parallel with respect to the three perspectives” (Schmid 2016: 81). For instance, the lexeme apple juice seems to be fully institutionalized and entrenched, but it is not fully lexicalized. Complex lexemes from the field of psychiatry, such as mechanoreceptor, denervation, axon death, etc., are only partially institutionalized for the majority of speakers, but they are fully lexicalized and entrenched for people who work in the field of psychiatry and are, therefore, familiar with them (Schmid 2016: 81). Thus, as far as the analysis of new coinages is concerned, various aspects of their development should be taken into account. The structure, connections with other words and concepts, number of speakers, and area of usage are all important for identifying the current state of a new lexeme.

In the current paper, the focus is on the stage of creation, as the paper deals with very new lexical items, many of which have only recently entered the English language. Besides that, at this stage of the research, only the structural perspective is employed to analyze the word-formation processes behind the new words chosen for this analysis.

3 Word-formation patterns

There are different approaches to describing word-formation patterns in English. Most current research is based on a traditional approach whereby word-formation patterns are first divided into two groups, morphemic and non-morphemic patterns (Schmid 2016: 86).

Morphemic word-formation patterns can be described “by specifying (a) the types of morphemes involved and (b) the way they are ordered” (Schmid 2016: 86). Patterns included in this group are prefixation, suffixation, compounding, conversion, and back-formation. The last of these is considered a borderline case between morphemic and non-morphemic word-formation patterns, as not only morphemes but also word components can be affected by this process (Schmid 2016: 86-87).

The second group, non-morphemic patterns, includes clipping, blending, acronymy, abbreviation, and reduplication, i.e. those processes which affect only word components (Schmid 2016: 88).

Another traditional approach to classifying the patterns was presented by Quirk et al. in their Comprehensive Grammar (1985), who suggested distinguishing the main types of word-formation processes (Schmid 2016: 88). The approach can be summarized in the following way:

[Figure not included in this excerpt]

Figure 1. The classification of word-formation processes (Quirk et al. 1985: 1520; cited in Schmid, 2016).

According to Quirk et al., both prefixation and suffixation are types of affixation as they consist of a base and an affix (Schmid 2016: 88), a short morpheme with an abstract meaning (Haspelmath & Sims 2010: 18). Other major word-formation patterns are conversion and compounding. Word-formation patterns such as clipping, blending, acronymy, abbreviation, and reduplication are not considered major ones, so they can be called minor processes.

In the following sections, every pattern is discussed individually starting from the major types and ending with minor ones.

3.1 Affixation

“Affixation involves adding bound morphemes to existing roots, which result in newly created derivatives” (Al-Salman & Haider 2021: 31). For example, the word slacker, which means "a person who shirks work, or avoids exertion, exercise, etc." (Oxford English Dictionary 2021), consists of the base slack- and an affix, namely the suffix -er.

According to Quirk et al., affixation encompasses two subtypes, prefixation and suffixation (Quirk et al. 1985: 1520; cited in Schmid, 2016).

3.2 Prefixation

Prefixation is “the combination of free lexical morphemes (bases) with preceding bound lexical morphemes (prefixes)” (Schmid 2016: 147). While the head of the word typically determines its word class, prefixes can modify the bases maintaining this word class (Schmid 2016: 147).

There are different types of prefixes. In general, prefixes can express certain attitudes and carry different meanings, e.g. prefixes with negative meaning (dis-, de-, un-, etc.), with locative or spatial meaning (co-, extra-, inter-, etc.) and many more (Schmid 2016: 152-158).

An example of prefixation is the word antibody which means "any of the proteins, naturally present in the body or produced in response to the introduction of an antigen, which reacts with specific antigens" (Oxford English Dictionary 2021). The word is a noun and consists of the head/base body which also determines the word class and the prefix anti-. The prefix maintains the word class and it carries the meaning “opposite”.

3.3 Suffixation

Suffixation is the attachment of a bound lexical morpheme to the end of a base (Schmid 2016: 163). The suffix, in contrast to prefixes, “determines the word class of the whole derivation and, thus, functions as head in spite of the fact that it is a bound morpheme” (Schmid 2016: 163).

Thus, suffixes can be divided into groups according to the word class they form: adjective-forming suffixes (-ive, -able, -ary, etc.), verb-forming suffixes (-ate, -ify, -ize/ise), and adverb-forming suffixes (-ly, -wards, -wise) (Schmid 2016: 168-179).

An example of a new word coined by suffixation is masklessness, which means “a lack of mask” and consists of the base ‘maskless’ and the deadjectival noun-forming suffix -ness.
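The segmentations described in sections 3.1 to 3.3 can be illustrated with a minimal sketch. The affix inventories below are a tiny hand-picked subset based on the examples in this paper, and the function name segment as well as the greedy stripping strategy are assumptions made purely for illustration. Real morphological analysis needs a lexicon and cannot rely on string matching alone.

```python
# Illustrative only: tiny affix inventories based on examples in this paper.
PREFIXES = ("anti", "dis", "de", "un", "co", "extra", "inter")
SUFFIXES = ("ness", "less", "er")


def segment(word):
    """Greedily strip known affixes; return (prefixes, base, suffixes)."""
    prefixes, suffixes = [], []
    base = word
    # Strip at most one prefix, keeping a base of at least three letters.
    for p in PREFIXES:
        if base.startswith(p) and len(base) - len(p) >= 3:
            prefixes.append(p + "-")
            base = base[len(p):]
            break
    # Strip suffixes from the right, outermost suffix first.
    stripped = True
    while stripped:
        stripped = False
        for s in SUFFIXES:
            if base.endswith(s) and len(base) - len(s) >= 3:
                suffixes.insert(0, "-" + s)
                base = base[: -len(s)]
                stripped = True
                break
    return prefixes, base, suffixes


print(segment("antibody"))      # (['anti-'], 'body', [])
print(segment("slacker"))       # ([], 'slack', ['-er'])
print(segment("masklessness"))  # ([], 'mask', ['-less', '-ness'])
```

Note that the sketch splits masklessness down to the root mask, whereas the analysis above treats maskless as the base of the last derivational step. A purely string-based approach also produces false positives (under would be split into un- and der), which is why the classification in this study was carried out manually.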

3.4 Compounding

Compounds usually consist of two elements, at least one of which is a free lexical morpheme. Compounds consist of a head and modifier. The former defines the word class and the latter specifies the head. The meaning of compounds cannot simply be derived from the meaning of the constituents. Typical compounds are nouns and adjectives so usually two groups are distinguished: nominal and adjectival compounds (Schmid 2016: 121-122).

Compounding, in general, is considered to be the most productive word-formation pattern in English (Plag 2018: 131). From the sociopragmatic perspective, the more informal registers, such as conversations, personal letters, and fictional texts, contain a higher number of compounds than other registers (Schmid 2016: 145).

An example of compounding is the new word zoombombing, which means “the unwanted, disruptive intrusion, generally by Internet trolls, into a video-conference call” (Wikipedia 2021). The compound is a noun and consists of two free lexical morphemes. One of them, bombing, can be defined as the head, and zoom is the modifier. The head determines the word class of the whole word, and zoom can be seen as specifying the location; although it is a proper noun, in this particular case it stands for all kinds of video-conference programs and applications.

3.5 Conversion

Conversion is a word-formation process whereby “a lexeme changes from one word class to another without formal marking while at the same time remaining in the original word class” (Schmid 2016: 187). Sweet (1900: 38 ff.; cited in Schmid 2016: 187) also points out that a lexeme is considered to be fully converted when “it accepts all the formal characteristics of the new word class”.

An example of conversion is the word to zoom which originated from the proper noun (Zoom) and has become a verb meaning "to video-chat via a web application" (The Philadelphia Inquirer 2021).

The concept of conversion is not supported by all linguists, and some of them think that the word-formation process that conversion stands for should be called zero-derivation. However, this study follows Plag, who applies the overt analogue criterion whereby a zero affix can be justified only if an overt form with the same meaning is found. In other words, conversion can be called zero-affixation only when affixes with the same range of meanings are found. Based on his findings, Plag argues that overt suffixes express much more restricted meanings than conversion, so the overt analogue criterion cannot be met, and conversion is a separate word-formation pattern (Plag 2018: 110-111).

3.6 Back-formation

“Back-formation typically deletes morphemes or morpheme-like units at the ends of base-lexemes” (Schmid 2016: 212). In comparison to other word-formation patterns, back-formation is quite difficult to identify. A lexeme has to be analyzed from the diachronic perspective; only then is it possible “to trace” when back-formation happens (Schmid 2016: 212).

For example, the verb to isolate originates from the adjective isolated which, in turn, comes from Italian with its usage as an adjective dating back to 1890 (Online Etymology Dictionary 2021).

3.7 Clipping

Clipping constitutes the word-formation process whereby a part of the word is clipped. The word can be reduced at the beginning, at the end, or even at both ends. In contrast to back-formation, clipped words preserve their meanings and the word class (Schmid 2016: 213).

An example of this word-formation process is the word corona which is clipped from the more official term coronavirus.

It is interesting to mention that clippings can signal familiarity and closeness of the concept to both a speaker and listener. Clipped words are usually used when longer forms seem to be too complex or formal, so they are often found in colloquial settings (Schmid 2016: 215). They can also identify “social closeness and mutual membership of a group” (Schmid 2016: 215).
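The three positions at which a word can be reduced can be sketched as a simple string comparison. The function name clipping_position and its return labels are hypothetical. Only the pair coronavirus/corona comes from this paper; telephone/phone and influenza/flu are familiar textbook examples added for illustration.

```python
def clipping_position(full, clipped):
    """Say where `full` was reduced to obtain `clipped`, or None if unrelated."""
    if not clipped or clipped == full or clipped not in full:
        return None
    if full.startswith(clipped):
        return "end"        # material after the clipped form was removed
    if full.endswith(clipped):
        return "beginning"  # material before the clipped form was removed
    return "both ends"      # material was removed on both sides


print(clipping_position("coronavirus", "corona"))  # end
print(clipping_position("telephone", "phone"))     # beginning
print(clipping_position("influenza", "flu"))       # both ends
```

The check covers only the formal side of clipping. Whether meaning and word class are preserved, which is what separates clipping from back-formation, still has to be judged by the analyst.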

3.8 Acronyms and abbreviations

Acronymy can be seen as an extreme form of clipping, as only initial letters or very short parts of words are retained. Acronyms can be written in three different ways: unseparated capital letters (BBC, UK), capital letters with full stops in between (M.O.T.), or one capital letter at the beginning followed by small letters (Nato).

At the level of phonetics, acronyms are pronounced like words, e.g. NATO, COVID, etc. Words pronounced as a series of individual letters (e.g. BBC, UK, TV, etc.) are called abbreviations (Schmid 2016: 217).

Both word-formation types are found above all in public communication in the fields of politics, public institutions, the media, and the sciences. The Internet and social media are also well-known sources of newly coined acronyms and abbreviations, which is explained by the strong tendency of electronic communication towards brevity and language economy (Schmid 2016: 218).
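The formal side of both shortening types, keeping initial letters or very short parts of words, can be sketched as follows. The function name shorten and its lengths parameter are hypothetical, and the expansions used in the examples (British Broadcasting Corporation, United Kingdom, and corona, virus, disease for COVID) are supplied here for illustration rather than quoted from the sources above.

```python
def shorten(words, lengths=None):
    """Keep the first n letters of each word (default: one) and join them in capitals."""
    if lengths is None:
        lengths = [1] * len(words)
    if len(lengths) != len(words):
        raise ValueError("need exactly one length per word")
    return "".join(w[:n].upper() for w, n in zip(words, lengths))


print(shorten(["British", "Broadcasting", "Corporation"]))    # BBC
print(shorten(["United", "Kingdom"]))                         # UK
print(shorten(["corona", "virus", "disease"], [2, 2, 1]))     # COVID
```

The written form alone does not show whether the result is an acronym or an abbreviation in the sense used above, since that distinction depends on pronunciation.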

3.9 Blending

Blending is similar to compounding as, in both cases, words are “mixed”. However, in blends, lexemes are not incorporated in unaltered form; every lexeme is pushed into another and, as a rule, shortened or transformed in a certain way (Schmid 2016: 219).

An example of blending is the word safecation, which refers to vacation during the pandemic provoked by some tension in personal relationships because of a lockdown and subsequent frustration (Urban Dictionary 2021). The word is a mix of safe and vacation. The first two letters of the second lexeme are simply omitted so that both words can be blended, forming a new word.

This word-formation pattern is extremely popular as a source of language creativity in computer-mediated communication and in the press. Words coined in this way are usually easily recognized by many speakers of the language community due to their similarity to the original words and the characteristics they share with them (Schmid 2016: 221).
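The formation of safecation described above can be written as a small string operation. The function name blend and its parameters kept and dropped are hypothetical. The examples smog and brunch are familiar textbook blends added for illustration and are not taken from the data of this study.

```python
def blend(first, second, kept=None, dropped=0):
    """Join the first `kept` letters of `first` (default: all of it)
    to `second` without its first `dropped` letters."""
    head = first if kept is None else first[:kept]
    return head + second[dropped:]


print(blend("safe", "vacation", dropped=2))           # safecation
print(blend("smoke", "fog", kept=2, dropped=1))       # smog
print(blend("breakfast", "lunch", kept=2, dropped=1)) # brunch
```

The cut points are passed in by hand because, as noted above, the way lexemes are shortened in a blend is not fixed by a general rule.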

3.10 Loan words

Another way to enrich vocabulary is to borrow words from other languages. These new coinages are subsequently called loan words. They usually have a complex morphological structure and can stand out in the vocabulary, especially when they are borrowed from languages that do not belong to the same group as a target language (Haspelmath & Sims 2010: 107). In order to stay in a new language, loan words get accommodated to a certain extent. The accommodation can occur at different levels including morphology, phonology, syntax, semantics (Platzer 2021) so that new coinages enter the system of another language and sound more or less natural to the speakers of this language.

An example of a loan word is coronaspeck, borrowed from German, which means "weight gained during lockdown as a result of eating more than usual because of working from home" (Collins Online Dictionary 2021).

3.11 Onomatopoeia

The process whereby words are produced in the way that they imitate actual sounds is called onomatopoeia (Kestler 2017).

According to Kestler (2017), four types of onomatopoeia can be distinguished:

- “Real words that sound like real things”, e.g. the word meow imitating the sound produced by a cat;
- “Real words made to evoke the sound of real things”, e.g. Edgar Allan Poe repeats the word bell 62 times in his poem “The Bells” (Poe 1850; cited in Kestler 2017) to recall the sound of a bell ringing despite the fact that the word itself does not imitate this sound;
- “Made-up words that sound like real things”, e.g. the invented word tattarrattat from the novel “Ulysses” by Joyce (Joyce 1922; cited in Kestler 2017) for conveying the sound of someone knocking on a door;
- “A series of letters that mimic a raw sound”, e.g. the word hachoo imitating the sound of sneezing.

Thus, onomatopoeia can be defined as a word-formation process as well as a figure of speech, depending on the perspective and on what is seen as its main goal. In the present study, which lies in the field of morphology, onomatopoeia is treated as a word-formation process, which does not deny the variety of its other meanings.

3.12 Multi-word expressions

Multi-word expressions are not a type of word formation. However, this topic plays an important role in morphology as it demonstrates how words can interact, organizing themselves in separate units, expressions encoding human experiences, and naming new concepts (Masini 2019).

Multi-word expressions are “linguistic objects formed by two or more words that behave like a ‘unit’ by displaying formal and/or functional idiosyncratic properties with respect to free word combinations” (Masini 2019). There are many subtypes of multi-word expressions, such as proverbs, metaphorical expressions, idioms, phrasal and particle verbs, collocations, “frozen” forms, etc. (Hüning & Schlücker 2015: 450-467). In the framework of the present study, only the three most relevant subtypes, namely idioms, collocations, and fixed expressions, are discussed.

Idioms are expressions whose meaning cannot be derived from the meaning of their components. They tend to be culturally specific and cannot be easily translated from one language to another (Swan 2005: 231). An example of an idiom is the elephant in the zoom, which means "the thing no one is talking about, but everyone is thinking about, during an online meeting" (IUPUI 2021) and paraphrases the better-known idiom the elephant in the room, which has the same meaning but does not imply an online setting.

Collocations are conventional combinations of words. In comparison to idioms, their meanings are more transparent as they do not convey any culturally specific knowledge, but at the same time they are still idiomatic in the sense that the words have to appear in a certain form that cannot be changed without a loss in meaning (Swan 2015: 231). An example is the word combination to flatten the curve, which means "to take measures designed to reduce the rate at which infection spreads during an epidemic, with the aim of lowering the peak daily number of new cases and extending the period over which new cases occur" (Oxford English Dictionary 2021).

Another phenomenon, ‘fixed expressions’, is quite difficult to distinguish from the other two types. Fixed expressions are also idiomatic, but the way they express ideas is less strict and conventional than in the two types discussed above (Swan 2015: 232). An example of a fixed expression is to quarantine and chill, which is a paraphrase of another popular expression, Netflix and chill. Both expressions are easier to understand than idioms or collocations and more open to interpretation, so their meanings are not completely restricted.

Thus, the types of multi-word expressions can be placed on a scale showing their degree of idiomaticity and fixation. Among the three types discussed in this section, idioms have the highest degree of both parameters, followed by collocations and then fixed expressions, which show the lowest level of fixation and idiomaticity.

4 Studying of COVID-19 neologisms

COVID-19 neologisms are quite a new topic. Society is still going through the pandemic and, despite the vaccination campaigns effectively rolled out in many countries, no one can say that humanity has finally defeated the virus and that we can return to our normal lives. Unfortunately, new variants of the virus keep appearing, the vaccination campaign is far from over, and travel and many other restrictions have not been lifted yet. For linguistics, this means that word manufacturing is still in progress and there is a lot to observe and study.

There are already a few studies on the problem of COVID-19 neologisms. Some of them served as the starting point for this paper and are discussed in this section.

The first two papers that were chosen for the research are Linguistic Analysis of Neologisms related to Coronavirus (COVID-19) by Asif et al. (2020) and Corona Virus Disease (COVID-19) Effects on Language Use: An Analysis of Neologisms by Mweri (2021). Both studies aim at describing and analyzing COVID-19 neologisms.

Asif et al. identified some of the word-formation patterns behind newly coined words such as abbreviations, acronyms, and blendings. They created lists of the most popular neologisms and provided charts demonstrating the frequency of words relating to the pandemic based on data from the Oxford Corpus (Asif et al. 2020).

Mweri focused on the problem of semantic shift and semantic extension. The researcher discussed some words which were coined long before the pandemic but were reintroduced, and explained how the meaning of these words has changed. Word-formation processes also received some attention in the study. Mweri provided some examples of acronyms and blends, but this topic was not the focus of the research (Mweri 2021: 36-47).

In general, both studies are mostly descriptive and do not provide any statistical data. In both cases, the researchers mentioned that they collected samples using corpora, social media, newspapers, and other sources, but did not specify how many words they analyzed or what tools they used. Apart from that, contexts that could demonstrate how the words functioned were not provided in either paper.

Another study that served as a basis for the present paper is COVID-19 trending neologisms and word formation processes in English by Al-Salman and Haider (2021). The aim of the study was to analyze the word-formation patterns behind new words relating to the pandemic. The researchers compiled a corpus of 218 new words inspired by COVID-19. For this purpose, they used various sources such as TV, newspapers, social media, and the results of the project by Thorne (2020), who was collecting COVID-19 coinages and publishing his findings on his website. Following that, the samples were analyzed and classified in accordance with their socio-pragmatic functions (nicknames, homeworking & teleconferencing, demographics & safety/security measures, describing new realities, and others). As a result, they found that many words had been recently coined, but there were also a number of reintroduced words that had experienced semantic shift or semantic extension during the pandemic. The most popular word-formation pattern turned out to be compounding. However, the scholars do not seem to have paid much attention to the comparison of different patterns and were mostly interested in language creativity in connection with large social changes such as the global pandemic. A further limitation of this study is that the researchers do not provide the contexts in which the analyzed words appear and do not explain their meanings. Both are an important part of working with neologisms and, especially, with ad-hoc formations, as their meanings are often not transparent, and contexts as well as definitions considerably help with their understanding.

Thus, COVID-19 neologisms have already gained some attention from linguists whose works became an inspiration for this study and demonstrated what else could be done in this area. The current research is aimed at contributing to the analysis of trending COVID-19 neologisms and addressing the gaps such as the absence of statistical data, the definitions of new words, and contexts documenting their existence.

5 Methodology

The present research aims at analyzing new COVID-19 words and identifying the word-formation patterns behind them. It is important to mention that the term ‘new words’ in the framework of the study refers to ad-hoc formations, neologisms, and words that were reintroduced in the COVID-19 discourse while experiencing semantic shift or extension.

The research questions are the following:

1. What parts of speech do most COVID-19 words belong to?
2. How many words were coined during the pandemic and how many words were coined before the pandemic?
3. What word-formation patterns stand behind words relating to the pandemic and which of them are the most popular?
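A minimal sketch of how the first and third research questions could be answered once every word has been annotated is given below. The record layout and the function name share are hypothetical. The six records reuse the examples from section 3, with part-of-speech labels that are assumptions where the text does not state them explicitly, so the resulting figures are not the results of this study.

```python
from collections import Counter

# Example annotations based on section 3 of this paper; the layout
# (word, part of speech, word-formation type) is a hypothetical choice.
RECORDS = [
    ("masklessness", "noun", "suffixation"),
    ("zoombombing", "noun", "compounding"),
    ("zoom", "verb", "conversion"),
    ("corona", "noun", "clipping"),
    ("safecation", "noun", "blending"),
    ("coronaspeck", "noun", "loan word"),
]


def share(records, field):
    """Return {category: percentage} for column `field` (1 = part of speech, 2 = type)."""
    counts = Counter(record[field] for record in records)
    total = sum(counts.values())
    return {cat: round(100 * n / total, 1) for cat, n in counts.most_common()}


print(share(RECORDS, 1))  # {'noun': 83.3, 'verb': 16.7}
```

The same function applied to column 2 gives the share of each word-formation type, which is the kind of figure a pie chart of word-formation types would display.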

5.1 Collecting and analyzing data

The prime source of new COVID-19 words is certainly the Internet. Many words were coined by Internet users and then shared on social networks such as Facebook and Instagram, where they could reach other members of the speech community very quickly. Some linguists and language enthusiasts found this process particularly interesting and decided to document as many new words as possible. This resulted in lists of trending COVID-19 words which are now available to any user. These lists became the first source of data for the present study.

The first data source used in the present study was the project #CORONASPEAK - the language of Covid-19 goes viral by Tony Thorne (Thorne 2021), a British linguist and lexicographer who has been collecting new coinages since the pandemic started. His lists comprise 259 words and are published on his website. At the time of writing, he continues to work on the topic, so new entries can be expected.

Another list, consisting of 28 new COVID-19 words, was published on Dictionary.com, a digital dictionary that provides definitions of words as well as their spelling, pronunciation, synonyms, example sentences, and word origins (Dictionary.com 2021).

The last source used for collecting data was the list of updates to the Oxford English Dictionary (Oxford English Dictionary 2021). Every month, the new entries added to the online dictionary are published there. All updates from the beginning of 2020 until the time of writing were analyzed, and words and expressions related to the pandemic were extracted. In total, there were 21 COVID-19 words, but all of them coincided with words from the other two lists mentioned above. Notably, most of these words were added to the Oxford English Dictionary in April 2020, i.e. shortly after the start of the global pandemic, which demonstrates how fast the language community can react and enrich the vocabulary with new words.

During data collection, some words that were encountered by chance on Facebook and Instagram were also added to the list.

In the end, the combined list consisted of around 300 new words. The next step was to sort them and decide which of them were to be analyzed. The quotations by different politicians from the list by Thorne (e.g. “...not take the foot off the neck of the beast (Boris Johnson)” (Thorne 2021)) were excluded because they did not relate to the topic of this research. Other questionable entries were words and expressions with a very broad or an unrelated meaning (e.g. Big Bang, Tsunami, etc.). Unfortunately, the lists chosen for this study do not provide the contexts in which the words were used, so it was impossible to tell whether the words related to the pandemic, how they functioned, and what they meant. Example sentences are extremely important for the analysis of neologisms and, especially, ad-hoc formations, as they help to extract the meaning of a word (Schmid 2016: 74), to understand its role in a sentence, and to define its characteristics. Thus, the following step was to collect example sentences, for which the Coronavirus Corpus was used.

The Coronavirus Corpus is part of the English Corpora (English Corpora 2021) and “is designed to be the definitive record of the social, cultural, and economic impact of the coronavirus (COVID-19) in 2020 and beyond” (The Coronavirus Corpus 2021). It contains around 1136 million words and is still growing by 3-4 million words every day. The corpus provides information about the frequency of words, their collocations, and the patterns in which they occur. Besides that, it is also possible to compare different periods and trace the usage of a word over time (The Coronavirus Corpus 2021).

The corpus made it possible to collect the contexts in which the words from the list were used. Not all of them were found in the corpus, which can be explained by its novelty and still relatively small size. In such cases, Google was used, and samples were collected from different news websites and social networks such as Twitter. It is important to mention that only samples related to the COVID-19 pandemic were included, so neologisms that were not found in such contexts were left out. There were also words which did not appear anywhere other than in the source list. They were removed from the final list, as the analysis cannot be done properly without example sentences, which not only provide the information needed for the analysis but also document the word and prove its existence.
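The selection step described above can be sketched as a small filter. The following Python fragment is only an illustration under stated assumptions: the function name `select_documented`, the two lookup tables, and the placeholder words are hypothetical and are not part of the study, which performed this step manually.

```python
def select_documented(candidates, corpus_frequency, web_examples):
    """Keep only words with at least one pandemic-related example sentence.

    candidates       -- list of candidate words (hypothetical input)
    corpus_frequency -- dict: word -> frequency in the Coronavirus Corpus
    web_examples     -- dict: word -> list of pandemic-related example
                        sentences found through a web search

    Returns a dict mapping each retained word to its corpus frequency.
    A word documented only on the web is kept with frequency 0.
    """
    selected = {}
    for word in candidates:
        frequency = corpus_frequency.get(word, 0)
        has_web_example = bool(web_examples.get(word))
        if frequency > 0 or has_web_example:
            selected[word] = frequency
    return selected


# Invented placeholder data, for illustration only.
kept = select_documented(
    ["word_a", "word_b", "word_c"],
    corpus_frequency={"word_a": 12},
    web_examples={"word_b": ["An example sentence found on a news site."]},
)
print(kept)
```

In this toy run, `word_c` is dropped because it is documented neither in the corpus nor on the web.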

After the data had been sorted, 179 words were left in the final list. In addition to the example sentences, the frequency of each word was checked in the corpus. Words that were not found in the corpus were assigned a frequency of zero. The meanings were explained on the basis of the example sentences or with the help of other sources in which the new words were documented. Once the meaning of a word was known, the part of speech it belonged to was identified, and then its structure was studied. In total, 12 groups of word-formation processes were determined: compounding, blending, abbreviation, acronyms, clipping, conversion, back-formation, prefixation, suffixation, loan words, multi-word expressions, and mixed types. The last group covers combinations of two or more other types. In this study, only combinations of two types were found: compounding and blending, compounding and clipping, compounding and suffixation, compounding and conversion, conversion and prefixation, prefixation and suffixation, clipping and suffixation, and blending and clipping.

The next step was to find out which words were novel and which ones had been coined before the pandemic. For this purpose, the Oxford English Dictionary (Oxford English Dictionary 2021) was used, which enables the study of words from a diachronic perspective. Neologisms cannot be found there, but it is an extremely helpful tool for analyzing the samples reintroduced during the pandemic, e.g. self-isolate, flatten the curve, etc. The dictionary provides not only the definition but also the contexts in which a word or phrase was used, starting from the earliest documented one, so that every user can trace the history of the word.

Other online dictionaries used for the analysis are the Merriam-Webster Dictionary (Merriam-Webster 2021), Collins Dictionary (Collins Online Dictionary 2021), and Macmillan Dictionary (Macmillan Dictionary 2021).

Merriam-Webster is an American company that focuses mostly on publishing language dictionaries. Its online dictionary provides the definitions of words, their pronunciation, and etymology (Merriam-Webster 2021). However, the etymology is not as detailed as in the Oxford English Dictionary. Another difference is that some words enter the Merriam-Webster Dictionary earlier than the Oxford English Dictionary, which also helped with the analysis.

The Collins English Dictionary initially existed only in a printed version. Today, the online version is more than a dictionary: it also provides a translation service and works as a media platform on which various materials on linguistics are posted. One interesting feature of the website is that any user can suggest a new word. Each suggestion is accompanied by a definition, the date of submission, and, sometimes, an example sentence (Collins Online Dictionary 2021).

Macmillan Dictionary is slightly different from the previous two as it is aimed at learners of English rather than at native speakers. For this reason, the website provides information not only about word definitions but also about collocations, synonyms, and the expressions in which words are used, i.e. everything that English learners tend to have most difficulties with. As in Collins Dictionary Online, any user can suggest a new word, which is subsequently stored in the section called Open Dictionary. Every suggestion consists of a definition, an example sentence, and, sometimes, information about register (Macmillan Dictionary 2021).

The other two sources used for the present study are Wikipedia (Wikipedia 2021) and the Urban Dictionary (Urban Dictionary 2021). One can argue that these sources are not academic enough for conducting scientific research, and this is true. However, studying neologisms and ad-hoc formations often involves non-academic sources, in which new entries appear faster than in academic ones and which can sometimes be the sources of new words themselves. Dictionaries like the Oxford English Dictionary or Merriam-Webster document words which are at least at the stage of establishment, i.e. which are lexicalized and institutionalized (Schmid 2016: 77). Words which were coined only recently are not yet widespread and not yet accepted by the majority of the speech community, so they cannot be included in the above-mentioned dictionaries. In this case, the use of non-academic sources is warranted as they provide information about the use of new coinages which is often not found anywhere else.

Wikipedia is a free online encyclopedia available in a great number of languages. The main idea of the website is that any user can add and edit pages of the encyclopedia. On the one hand, this is a great opportunity to share knowledge fast and for free; on the other hand, it makes the source rather unreliable, as not every Wikipedia entry is checked for credibility (Wikipedia 2021).

The Urban Dictionary is an online dictionary whose main idea is to collect slang words and expressions used in the English language. New suggestions can be added by any user. A suggestion consists of a word, expression, or phrase, an explanation, and an example sentence. After an entry is published, other users can vote for or against it. Based on the results of the voting, the word of the day is identified. If there are several suggestions for the same word from different users, they are ranked in a list, with the suggestion that received the most votes placed at the top (Urban Dictionary 2021).

All the above-mentioned sources provided much information about new COVID-19 words. Users have been actively suggesting new coinages in the online dictionaries, explaining how they can be used and providing example sentences. Besides that, an interesting phenomenon was encountered during the analysis. Some words were not yet included in the Oxford English Dictionary as they were still neologisms, but they were also too old to count as novel COVID-19 words. In this case, the other dictionaries considerably helped to identify such ‘old neologisms’ because they record the dates on which words were suggested. If a certain word could not be found in the Oxford English Dictionary but was found in other dictionaries or Wikipedia, and the date of its suggestion was before December 2019, it was assigned to the group of ‘old neologisms’ and, therefore, to the words coined before the pandemic.
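The dating rule described above can be summarized as a small decision procedure. The following Python sketch is only an illustration under stated assumptions: the function name, the parameter names, and the exact cut-off date of 1 December 2019 (one possible reading of "before December 2019") are hypothetical. The treatment of words that are in the Oxford English Dictionary, namely comparing the earliest quotation date with the same cut-off, is likewise an assumption, as the study performed this step manually.

```python
from datetime import date
from typing import Iterable, Optional

# Assumed cut-off: the text only says "before December 2019".
PANDEMIC_CUTOFF = date(2019, 12, 1)


def coinage_period(oed_first_attested: Optional[date],
                   other_source_dates: Iterable[date] = ()) -> str:
    """Classify a word as coined 'before' or 'during' the pandemic.

    oed_first_attested -- date of the earliest OED quotation, or None
                          if the word is not in the OED
    other_source_dates -- dates on which the word was suggested or
                          documented in other dictionaries or Wikipedia
    """
    if oed_first_attested is not None:
        # Assumption: for OED words the earliest quotation decides.
        if oed_first_attested < PANDEMIC_CUTOFF:
            return "before"
        return "during"
    if any(d < PANDEMIC_CUTOFF for d in other_source_dates):
        # 'Old neologism': absent from the OED but documented earlier.
        return "before"
    return "during"


# Invented dates, for illustration only.
print(coinage_period(None, [date(2018, 5, 1)]))   # before
print(coinage_period(None, [date(2020, 4, 1)]))   # during
```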

All data was sorted and recorded in an Excel table. Initially, it was one big table, but for convenience it was divided and is presented as two tables in this paper (see Appendix A, Appendix B). The first table contains definitions and example sentences. In its second column, the meaning of a word is explained. Besides that, it indicates whether a word was found in the Oxford English Dictionary. For the words coined before the pandemic, the date of the earliest example sentence is given, which makes it possible to count these words. The second table provides information about frequency in the corpus, part of speech, and word-formation processes. The words are presented in descending order of frequency, so that the most frequent words in the corpus are placed at the top and the coinages with zero frequency are found at the bottom of the table.

At the final stage, the data was quantified, and the results are presented in tables and pie charts which enable a comparison of the data. The results are discussed in greater detail in the next chapter.
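The counting behind such tables and pie charts can be illustrated with a short Python sketch. It is a minimal example under stated assumptions: the function name `category_shares` is hypothetical, and the labels in the usage example are invented toy data, not the results of this study.

```python
from collections import Counter


def category_shares(labels):
    """Return {label: (count, percentage)} for a list of category labels.

    Percentages are rounded to one decimal place. An empty input
    yields an empty dictionary.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    return {
        label: (n, round(100 * n / total, 1))
        for label, n in counts.most_common()
    }


# Invented toy labels, for illustration only.
toy_labels = ["blending", "blending", "compounding", "clipping"]
shares = category_shares(toy_labels)
print(shares)
```

The same function can be applied to any of the annotated columns, for example part of speech or word-formation type, since it only counts labels.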

5.2 Limitations

At the current stage of the research, certain limitations were found which could, to some extent, affect the final results. In order to avoid misinterpretation, the limitations and the way they influenced the research are discussed in this section.

According to Biermeier, “the bigger a corpus, the more valid the results” (Biermeier 2008: 16). The Coronavirus Corpus is part of the English Corpora and contains fewer words (1146 million) than some other corpora in the collection such as, for instance, the Intelligent Web-based Corpus (14 billion words) or News on the Web (13.2 billion words). This is explained by the fact that the corpus was created in January 2020, while the others had appeared much earlier. Besides that, only 20 countries are included in the corpus: the USA, Canada, Great Britain, Ireland, Australia, New Zealand, India, Sri Lanka, Pakistan, Bangladesh, Malaysia, Singapore, the Philippines, Hong Kong, South Africa, Nigeria, Ghana, Kenya, Tanzania, and Jamaica (The Coronavirus Corpus 2021). All of this imposed certain limitations on the research.

First of all, not every word from the list could be found in the corpus. Secondly, the restricted number of countries makes the results approximate, as they would likely be different if more countries were included.

Another important point is that the only genre represented in the Coronavirus Corpus is web news, which is characterized by media and newspaper language (The Coronavirus Corpus 2021). This means that the use of the words in other genres such as, for example, scientific papers cannot be studied, although it could theoretically provide very interesting results. Apart from that, due to the choice of genre, social media networks are not covered either. Since social media networks were the most productive source of the new coinages, their absence from the corpus also explains why some of the words were not found there during the analysis.


Excerpt out of 99 pages


Word-formation at the time of COVID-19
Word-formation patterns and studying of COVID-19 neologisms
University of Regensburg
COVID-19, morphology, wordformation, linguistics, language, English, languagechange
Quote paper
Aleksandra Martin (Author), 2021, Word-formation at the time of COVID-19, Munich, GRIN Verlag, https://www.grin.com/document/1165467

