Table of contents

1 Introduction

2 Literature Overview

3 Code-Switching (CS)
3.1 Terminology and definitions
3.1.1 Definition of CS
3.1.2 CS vs. borrowing
3.1.3 Types of code-switching
3.1.4 Insertion vs. alternation
3.2 Theoretical Models of CS
3.2.1 The Matrix Language-Frame Model (MLF)
3.2.2 The Markedness Model (MM)
3.2.3 Intersection of the MLF and the MM
3.3 Summary

4 Discussion
4.1 Supporting evidence for the MLF and the MM
4.1.1 Bosnian-Turkish CS
4.1.2 CS in Creole languages
4.1.3 Chinese-English CS
4.2 Problematic issues in the MLF and the MM
4.2.1 Matrix Language definition
4.2.2 System vs. content morpheme distinction
4.2.3 Points of critique in the MM
4.3 Summary

5 Conclusion

1 Introduction

Bilingual people have more than one language at their disposal to express themselves. When bilingual speakers communicate with other bilinguals they often alternate between two languages, as in the example below. This phenomenon is called code-switching (CS) (Boztepe 2003: 4).

(1) Sometimes I’ll start a sentence in English y termino en español

(Sometimes I’ll start a sentence in English and finish it in Spanish.)

(Poplack 1980)

While monolinguals can vary their utterances by changing styles within a language or a dialect, or through speech rate or intonation, bilinguals can do the same in both languages and can, in addition, switch between the languages. This makes the speech of bilingual speakers particularly interesting for research. Many researchers have examined this language contact phenomenon and have tried to explain how and why people code-switch (Boztepe 2003: 4).

The study of CS has developed in two main directions: the structural (grammatical) and the sociolinguistic (also called pragmatic). Originally, CS was considered to be the result of poor competence in both languages. Later, in the 1970s, linguists such as Gumperz (1972) and Pfaff (1979) suggested that mixing between languages does not occur randomly but rather follows certain grammatical rules. Many grammatical models have been proposed to account for the grammatical constraints on CS. One of the most influential models, the Matrix Language-Frame Model (MLF), was introduced by Myers-Scotton (1993b). The model is based on two asymmetries: matrix language vs. embedded language, and system vs. content morphemes. However, not only structural factors but also social and psychological factors influence the speakers’ motivation to engage in CS. One of the biggest challenges in research on CS is to link all these factors in order to provide a better understanding of the phenomenon. Therefore, Myers-Scotton (1993a) went further and developed the Markedness Model (MM) in an attempt to explain why bilingual speakers code-switch and how the social environment influences the type of CS present in the community.

The purpose of this paper is to look at the relation between structural and social factors in the formation of CS patterns in bi-/multilingual communities. Furthermore, the general applicability of the MLF and the MM to the CS data from different bilingual communities will be explored.

At the beginning, the concept of code-switching and its typology will be introduced. In section 3.2, the MLF will be presented in order to define morphological and syntactic constraints on CS. Then, within the framework of the MM, possible motivations for CS will be examined. This is followed by a perspective on the relation of the two models to each other. In Chapter 4, CS data from several bilingual communities will be analysed using the MLF and the MM, and finally, problematic issues in both models will be discussed.

2 Literature Overview

At the beginning of CS research, some linguists, such as Weinreich, believed that CS is a result of a lack of linguistic competence and that "the ideal bilingual switches from one language to the other according to appropriate changes in the speech situation . . . , but not in an unchanged speech situation, and certainly not within a single sentence" (Weinreich 1953: 73).

It was not until the 1970s that CS was recognized to be skilled and meaningful linguistic behaviour (Woolard 2006: 74). Blom and Gumperz (1972) claimed that it is a communicative option of bilinguals that carries social and pragmatic meaning, which made the phenomenon interesting for research.

CS is an interdisciplinary phenomenon, which is why it is studied from different perspectives. The grammatical or structural approach deals with the analysis of grammatical constraints in mixed sentences. Pfaff (1979) and Poplack (1980) looked at the syntactic constraints on CS. Constraints based on government-binding theory were proposed by Di Sciullo et al. (1986). Myers-Scotton (1993b) took an insertional approach to CS, which assumes that there is a matrix language into which elements from an embedded language are inserted. Myers-Scotton’s MLF will be explained in detail in section 3.2.1. Muysken (2000) offers an integration of existing approaches under one grammatical perspective.

The sociolinguistic approach to CS seeks to explain why bilingual speakers switch between languages and what functions the language choice may serve in the conversation. Various linguists have tried to answer these questions. Blom and Gumperz (1972) proposed the situational (or conversational) and metaphorical functions of CS; Auer (1995) developed a sequential analysis approach to CS; and Giles (1991) presented a listener-oriented model. The most influential model, the Markedness Model (MM), was proposed by Myers-Scotton (1993a). It is a speaker-oriented model and will be presented in section 3.2.2.

The psycholinguistic analysis of CS focuses on such questions as the organisation of the mental lexicon of bilingual speakers and the influence of bilingual proficiency on CS patterns (Clyne 1991; Grosjean 1995).

3 Code-Switching (CS)

Since there is no agreement among scholars as to how to define CS, it is first necessary to look at different definitions of CS and to distinguish it from other language contact phenomena. Then, the MLF will be introduced. The MM will illustrate the social motivations of speakers to engage in CS, which play a role in the grammatical structure of mixed utterances. Finally, the influence of social and psychological factors on CS patterns will be discussed.

3.1 Terminology and definitions

3.1.1 Definition of CS

Since CS can be studied from different approaches, there is no standard definition of this phenomenon in research. Instead, many definitions have been proposed, which differ according to the aspect of CS that researchers want to highlight. Generally, the term code or language code refers not only to different languages but also to different varieties, as well as to different styles of one language (Romaine 2006: 121).

One of the most often cited definitions of CS was proposed by Gumperz: "Conversational code-switching can be defined as the juxtaposition within the same speech exchange of passages of speech belonging to two different grammatical systems or subsystems" (Gumperz 1982: 59). In his definition, Gumperz highlights the grammatical aspect of CS, which prevailed in the research at that time, and stresses that a switch can happen not only between two different languages (grammatical systems) but also between dialects (grammatical subsystems) of one language.

Another definition of CS was offered by Poplack, a pioneer in the grammatical approach to CS. She defines CS as the "alternation of two languages within a single discourse, sentence or constituent" (Poplack 1980: 583). With this definition, she names the different grammatical levels at which CS can occur and stresses the alternational rather than insertional character of language mixing (Poplack 1980: 594).

Generally, Myers-Scotton (2006: 239) defines CS as "the use of two language varieties in the same conversation". In terms of her MLF and its matrix vs. embedded language dichotomy, CS is understood as "the selection by bilinguals or multilinguals of forms from an embedded variety (or varieties) in utterances of a matrix variety during the same conversation" (Myers-Scotton 1993b: 3). Both of these definitions form the basis for this paper.

3.1.2 CS vs. borrowing

In order to give a complete definition of CS, this phenomenon should be differentiated from other similar language contact phenomena. Here, lexical borrowing will be considered. There is no agreement in the research as to which items from L2 can be considered CS, especially in the case of singly occurring items. While some researchers treat singly occurring lexemes as borrowings and regard only a phrase or a larger constituent as a case of CS, others consider single lexemes to be instances of CS as well. Traditionally, borrowings are words from one language which are well established and recurrent in another language. They are morphologically, syntactically and phonologically integrated into the host language (Poplack et al. 1988: 220). In (2), the word magazine preserves the English phonological pattern and is, therefore, an example of English CS in a Spanish sentence, while magazine in (3) is adapted to Puerto Rican Spanish phonology and is an instance of borrowing occurring in Spanish monolingual discourse.

(2) Leo un magazine. [mӕɡə'ziyn]

(I read a magazine).

(3) Leo un magazine. [maya'siŋ]

(I read a magazine). (Poplack 1980: 583)

Another difference between borrowings and CS is that borrowings are mainly used by monolinguals or by non-fluent bilinguals, whereas CS can be practiced by bilinguals with a good command of both languages (Holmes 2002: 42). Moreover, there are instances of borrowings from L2 (language 2) into L1 (language 1) which have characteristics of both borrowing and CS. They are grammatically integrated into the recipient language but are not widespread or well established in the community’s linguistic repertoire, and they presuppose a certain level of bilingualism. Poplack et al. (1988: 50) call such forms nonce borrowings. Other researchers propose other criteria to distinguish between these language contact phenomena. According to Myers-Scotton (1993b: 192), the main difference lies in the relationship of both concepts to the mental lexicon of the speaker. In her MLF, she distinguishes between the base or matrix language and the embedded language, whose elements are inserted into the matrix language. Borrowings are elements of the mental lexicon of the matrix language or L1, because monolinguals also use them, whereas CS elements are stored in the lexicon of the embedded language or L2, which is available only to bilinguals. What Poplack et al. call nonce borrowing, Myers-Scotton terms single-lexeme CS.

3.1.3 Types of code-switching

Shana Poplack (1980), a pioneer in the grammatical approach to the study of CS, differentiates between three types of CS according to where it occurs in the discourse: tag-switching, intersentential and intrasentential code-switching.

Tag-switching occurs when a tag (I mean, you know, no way) or a set phrase from one language is inserted into an utterance in another language. For example, the following Finnish sentence contains the English tag no way: Mutta en mä viittinyt, no way! (But I’m not bothered, no way!) (Romaine 2006: 122). Tag-switching can occur at any place in the sentence without destroying its syntactic structure; therefore, it does not demand a high level of proficiency in both languages.

Intersentential CS involves a language switch at a clause or sentence boundary (Romaine 2006: 123). Consider the following dialogue in Swahili and English. The English translation of the Swahili is given in parentheses.

(4) Stallholder: Habari, mheshimiwa. (Hello, respected sir.) Have some vegetables.

Customer: Mboga gani? Nipe kabeji hizi. (Which vegetables? Give me these cabbages.) How much is that? (Myers-Scotton 1993a: 40)

Here, the switch happens at the sentence boundary. Both speakers switch between full sentences in English and Swahili. This type of CS demands higher bilingual competence, for there must be a good command of both grammatical systems to construct long utterances.

Intrasentential CS involves switching within the clause or sentence boundary, as in the following example from Tok Pisin-English: Otherwise, yu bai go long kot. (Otherwise you’ll go to court) (Romaine 2006: 123). Word-internal CS, a switch within a single word, also falls under this type, for example shoppã (shops), an English word with a Panjabi inflectional morpheme. Intrasentential CS demands the highest level of bilingual competence because two grammatical systems must be combined within one sentence. All three types of CS can occur within a single utterance (Romaine 2006: 23).

In line with Poplack (1980), Myers-Scotton differentiates between intersentential and intrasentential CS. She then specifies three types of constituents which may occur within intrasentential CS, illustrated here with English-Swahili examples from her Swahili CS corpus: (a) constituents which contain elements from both languages, such as niko sure (I am sure) or ni-me-decide (I have decided); (b) those entirely in the matrix language (ML), such as kwa wingi (in abundance); and (c) those entirely in the embedded language (EL), such as white clothes (Myers-Scotton 1993b: 4).

In this work, the term code-switching (CS) will be used to refer to all types of CS.

3.1.4 Insertion vs. alternation

Muysken (2000) uses the term code-mixing to refer to intrasentential CS and code-switching to refer to intersentential switches. Within code-mixing, he then distinguishes three processes which are at work in language mixing: Insertion, Alternation and Congruent Lexicalisation.

Insertion. Single items or entire constituents can be inserted into the frame provided by the matrix language. In (5), the English PP in a state of shock modifies the Spanish verb anduve and is inserted into the overall Spanish structure.

(5) Yo anduve in a state of shock por dos dias.

(I walked in a state of shock for two days) (Pfaff quoted in Muysken 2000: 5)

Alternation. This process takes place between utterances in a turn or between turns and implies a change of both the ML and the lexicon.

(6) Andale pues and do come again.

(That’s all right then, and do come again.)

(Gumperz quoted in Muysken 2000: 5)

Congruent Lexicalisation. This process refers to a situation where there is a shift between two typologically similar languages; they share a grammatical structure which can be filled with lexical elements from either language (Muysken 2000: 1ff.).

Different approaches to the structural analysis of CS depart from these processes. Myers-Scotton (1993b), in her MLF, assumes that code-switched elements are inserted from an embedded language into a matrix language; Poplack (1980), in the Equivalence Constraint model, sees CS as the alternation of two language systems and studies their compatibility at the switch point; and Labov (1972) explains the shift from one ML to a shared grammatical structure (Muysken 2000: 4).

3.2 Theoretical Models of CS

3.2.1 The Matrix Language-Frame Model (MLF)

Many researchers, for example Poplack (1980), Pfaff (1979) and Muysken (1995), have tried to explain the grammatical structure of utterances in which two or more languages occur together. While intersentential CS is the main focus of the sociolinguistic approaches to CS, intrasentential CS must be examined in order to describe the grammatical structure of CS.

Myers-Scotton (1993b) proposed her own model, the Matrix Language-Frame Model [1] (MLF), to explain grammatical constraints on intra-clausal or intrasentential CS. This model is based on a Swahili-English corpus of recorded conversations which she collected in Nairobi, Kenya. The MLF is not intended to apply to all types of CS. Intra-clausal CS, where two grammars are in force, is a good backdrop for discussing the grammar of CS (Myers-Scotton 2006: 241). The model predicts which utterances containing CS are grammatically well formed and may therefore occur in speech. Ungrammatical utterances are not supposed to occur unless they are stylistically marked and have some socio-pragmatic function, for example emphasis (Myers-Scotton 1997: 75).

The main assumption of the MLF is that, during the switch, the participating languages stand in an asymmetrical relationship to each other. One of the languages, the matrix language (ML), is dominant and supplies the morphosyntactic frame of the bilingual clause or sentence. The other language has an auxiliary function and supplies content morphemes which are embedded into the ML. This language is called the embedded language (EL) (Myers-Scotton 1993b: 35). Another asymmetrical relationship is present in the different morpheme types: system vs. content morphemes. This morpheme distinction will be discussed later in this section.

Either L1 or L2 can be the ML, and there are several criteria which may help to determine it. From the psycholinguistic point of view, the ML is the dominant language of the speaker. However, it may change in different situations. Therefore, this criterion is not reliable and can only be used in combination with sociolinguistic data. From the sociolinguistic point of view, the ML is the language which is used more often in interactions. Referring to her Markedness Model [2], Myers-Scotton argues that the unmarked language, which is the expected language in a community, is used more often than the marked, unexpected one. Consequently, the ML is the unmarked language in a bilingual community. However, this criterion is also problematic, since in one interaction both languages, L1 and L2, can be the unmarked code (Myers-Scotton 1993b: 67ff.). To solve this problem, Myers-Scotton suggests using a linguistic criterion. Linguistically, the ML is the language which supplies more morphemes in a discourse of at least two sentences (Myers-Scotton 1993c: 486).
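The linguistic (frequency-based) criterion can be sketched as a simple count. The following is a minimal illustration, not Myers-Scotton's procedure: the function names, the (morpheme, language) tagging scheme and the handling of ties are assumptions, and the segmentation merely follows the hyphenation of examples (7) and (9) given later in this section.

```python
from collections import Counter

# Hypothetical hand-tagged data: each sentence is a list of
# (morpheme, language) pairs. The segmentation follows the hyphens in
# examples (7) and (9) and is illustrative only.
discourse = [
    [("leo", "swahili"), ("si", "swahili"), ("ku", "swahili"),
     ("come", "english"), ("na", "swahili"), ("books", "english"),
     ("z", "swahili"), ("angu", "swahili")],
    [("mimi", "swahili"), ("ni", "swahili"), ("ta", "swahili"),
     ("try", "english"), ("kuwa", "swahili"), ("nyumba", "swahili"),
     ("ni", "swahili"), ("throughout", "english"), ("the", "english"),
     ("day", "english")],
]


def morpheme_counts(sentences):
    """Count how many morphemes each language supplies."""
    return Counter(lang for sentence in sentences for _, lang in sentence)


def matrix_language(sentences):
    """Return the language supplying the most morphemes, or None on a tie.

    The criterion is stated for a discourse of at least two sentences,
    so shorter samples are rejected.
    """
    if len(sentences) < 2:
        raise ValueError("need a discourse of at least two sentences")
    ranked = morpheme_counts(sentences).most_common(2)
    if not ranked:
        return None
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # assumption: a tie leaves the ML undetermined
    return ranked[0][0]


print(morpheme_counts(discourse))
print(matrix_language(discourse))
```

On this small sample Swahili supplies twelve morphemes and English six, so the count agrees with the treatment of Swahili as the ML in the corpus examples.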

The MLF differentiates between three types of constituents which may occur in intrasentential CS: ML+EL constituents, ML Islands and EL Islands (Myers-Scotton 1993b: 77ff.). The examples which illustrate the model are taken mainly from Myers-Scotton’s Swahili-English CS corpus, where Swahili is the ML and English is the EL.

1. ML+EL constituents. These usually consist of morphemes from both the ML and the EL, therefore they are also called mixed constituents. The typical constituent of this type contains a singly occurring EL lexeme embedded in any number of ML morphemes.

(7) Leo si-ku-come na books z-angu.

today 1SG/NEG-PAST/NEG-come with CL10-my

(Today I didn’t come with my books.) (Myers-Scotton 1993b: 80)

Example (7) contains not only intrasentential CS but also intraword CS in si-ku-come (I didn’t come). This VP is an ML+EL constituent.

2. ML Islands. These constituents consist entirely of ML morphemes and must be well formed according to the grammar of the ML. In example (8), ni-me-maliz-a ku-tengenez-a vi-tanda (I have finished making the beds) is an ML Island.

(8) Ni-me-maliz-a ku-tengenez-a vi-tanda

1SG-PERF-finish-INDIC INFIN-fix-INDIC CL 8-beds

ni-ka-wash all the clothing na wewe bado maliza na kitchen.

1SG-CONSEC-wash all the clothing and you still [to] finish with

(I have finished making the beds and I washed all the clothing and you haven’t yet finished with the kitchen.) (Myers-Scotton 1993b: 80)

3. EL Islands. These islands are made up of EL morphemes and must be well formed according to the EL grammar. In (8) all the clothing and in (9) throughout the day are EL Islands.

(9) Mimi ni-ta-try kuwa nyumba-ni throughout the day. . .


(As for me, I will try to be at home throughout the day . . .)

(Myers-Scotton 1993b: 146)
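The three constituent types can be told apart by the languages of the morphemes they contain. The sketch below is only an illustration under stated assumptions: the function name and the (morpheme, language) tagging are hypothetical, and the check covers language membership only, not the well-formedness conditions that the model additionally imposes on ML and EL Islands.

```python
def classify_constituent(morphemes, ml, el):
    """Label a constituent by the languages of its morphemes.

    `morphemes` is a list of (morpheme, language) pairs (a hypothetical
    tagging scheme). Well-formedness under the ML or EL grammar is NOT
    checked here.
    """
    languages = {language for _, language in morphemes}
    if languages == {ml}:
        return "ML Island"
    if languages == {el}:
        return "EL Island"
    if languages == {ml, el}:
        return "ML+EL constituent"
    raise ValueError("constituent is empty or contains an unexpected language")


# Constituents taken from examples (7), (8) and (9); segmentation follows
# the hyphenation used there.
mixed_vp = [("si", "swahili"), ("ku", "swahili"), ("come", "english")]
ml_island = [("vi", "swahili"), ("tanda", "swahili")]
el_island = [("throughout", "english"), ("the", "english"), ("day", "english")]

for constituent in (mixed_vp, ml_island, el_island):
    print(classify_constituent(constituent, ml="swahili", el="english"))
```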

Additionally, the EL Island Trigger Hypothesis predicts when EL Islands must appear. When an EL system morpheme occurs in a constituent, the other components of this constituent must also come from the EL and form an EL Island (Myers-Scotton 1993b: 139). For example, the English-Swahili sentence *if hainyeshi mvua (if it does not rain) is ungrammatical because it contains the EL system morpheme if, which is not allowed according to the System Morpheme Principle [3]. The only possibility for the EL system morpheme to appear in an ML frame is to form an EL Island such as if it does not rain. In this sentence, if triggers the occurrence of an EL Island (Myers-Scotton 1993b: 142).
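The prediction of the EL Island Trigger Hypothesis can be expressed as a simple check on a constituent. This is a rough sketch under assumptions: the function name is hypothetical, tokens are tagged as (form, language) pairs at word level, and the set of EL system morphemes is passed in by hand (here only if, the one item the text identifies as such). EL well-formedness of the island is not tested.

```python
def violates_el_island_trigger(constituent, el, el_system_morphemes):
    """Return True if an EL system morpheme occurs outside an EL Island.

    `constituent` is a list of (form, language) pairs and
    `el_system_morphemes` is a hand-supplied set of EL forms treated as
    system morphemes (both are assumptions of this sketch).
    """
    has_el_system_morpheme = any(
        language == el and form in el_system_morphemes
        for form, language in constituent
    )
    entirely_el = all(language == el for _, language in constituent)
    return has_el_system_morpheme and not entirely_el


# The starred mixed clause and the EL Island discussed above.
mixed_clause = [("if", "english"), ("hainyeshi", "swahili"), ("mvua", "swahili")]
island = [("if", "english"), ("it", "english"), ("does", "english"),
          ("not", "english"), ("rain", "english")]

print(violates_el_island_trigger(mixed_clause, "english", {"if"}))
print(violates_el_island_trigger(island, "english", {"if"}))
```

The check flags the starred mixed clause and accepts the all-English island, which matches the judgements reported for these two forms.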

Further, Myers-Scotton (1993b: 82) proposed the ML Hypothesis to account for the asymmetric relationship between the two languages in the production of mixed constituents. It contains two principles: the Morpheme-Order Principle and the System Morpheme Principle. They explain the role of the ML in the morphosyntactic frame of mixed constituents.

1. Morpheme-Order Principle: In ML+EL constituents consisting of singly occurring EL lexemes and any number of ML morphemes, the surface word order will be that of the ML.


[1] Since its first publication in 1993, the MLF has been amended (see the Afterword in Myers-Scotton 1997). Another model by Myers-Scotton (2002), the 4-M model, is a morpheme classification model which elaborates on the distinction between content and system morphemes and thus helps to explain the MLF.

[2] The Markedness Model will be explained in section 3.2.2.

[3] The System Morpheme Principle will be explained later in this section.

