How can corpora be used to improve vocabulary learning in language acquisition? This thesis focuses on the use of word frequencies by teachers of English.
Teaching vocabulary to young learners is one of the most challenging responsibilities that teachers face. The methodology chosen for presenting vocabulary is crucial to students' learning success. Many studies have examined how computers can facilitate the learning of English as a foreign language (EFL), and with the development of very large corpora both teachers and students now have access to hundreds of millions of words and can explore their patterns of occurrence. This advantage is, however, rarely used in practice, partly because the discipline is relatively young but mainly because information about corpora is lacking in English language teaching (ELT).
This thesis presents the concept of coursebook vocabulary and the treatment of word frequency in learner's dictionaries. The research part is a linguistic analysis of data extracted from coursebooks and a comparison of these data with the Oxford 3000 list of essential words. The aim of the thesis is to investigate the linguistic attributes of the texts that make up the coursebooks and to examine the relations between them.
Table of Contents
1 Introduction
2 Coursebook vocabulary
3 Project coursebooks in primary education
3.1 Project home study program
4 Word frequency in current learner’s dictionaries
4.1 Oxford defining vocabulary
4.2 Longman defining vocabulary
4.3 Corpus linguistics in language learning
5 Research
5.1 Goal of research
5.2 Development of methods and tasks
5.2.1 Coursebook series selection
5.2.2 Acquisition of data
5.2.3 Questionnaires
5.2.4 Mistakes and errors
5.3 Compilation of the corpus
5.3.1 Corpus compilation/builder programs
5.3.2 Corpus preparation and importation
5.3.3 Part of speech tagging and lemmatization
5.3.4 Co-occurrence analysis
5.3.5 Thematic analysis
5.3.6 Comparative analysis
5.4 Oxford 3000 corpus
6 Output interpretation
6.1 Questionnaires
6.2 Project 1
6.2.1 Word list derived analysis
6.2.2 Co-occurrence analysis
6.2.3 Thematic analysis and data mining
6.2.4 Comparative analysis
6.2.5 Mistakes and errors
6.3 Project 2
6.3.1 Word list derived analysis
6.3.2 Co-occurrence analysis
6.3.3 Thematic analysis and data mining
6.3.4 Comparative analysis
6.3.5 Mistakes
6.4 Project 3
6.4.1 Word list derived analysis
6.4.2 Co-occurrence analysis
6.4.3 Thematic analysis and data mining
6.4.4 Comparative analysis
6.4.5 Mistakes
6.5 Project 4
6.5.1 Word list derived analysis
6.5.2 Co-occurrence analysis
6.5.3 Thematic analysis and data mining
6.5.4 Comparative analysis
6.5.5 Mistakes
6.6 Project 5
6.6.1 Word list derived analysis
6.6.2 Co-occurrence analysis
6.6.3 Thematic analysis and data mining
6.6.4 Comparative analysis
6.6.5 Mistakes
6.7 Project 4th edition series evaluation
6.8 Analysis and results of the comparison with the Oxford 3000
6.8.1 Project 1
6.8.2 Project 2
6.8.3 Project 3
6.8.4 Project 4
6.8.5 Project 5
6.9 Conclusion of research
Research Objectives and Themes
The primary objective of this thesis is to create specific linguistic corpora representing the fourth edition of the Project coursebook series, analyze them through word frequency, co-occurrence, thematic, and comparative lenses, and evaluate the findings against the Oxford 3000 essential vocabulary list to understand how vocabulary is presented to young learners.
- Corpus-based analysis of the Project coursebook series.
- Evaluation of pedagogical approaches to vocabulary presentation.
- Comparison of coursebook vocabulary with the Oxford 3000 lexicon.
- Linguistic analysis of word frequency and co-occurrence patterns.
- Teacher insights regarding coursebook usage and challenges.
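The comparison of coursebook vocabulary with a reference list such as the Oxford 3000 can be illustrated as a simple coverage calculation. The following Python lines are a minimal sketch under stated assumptions, not the procedure used in the thesis (which relies on T-LAB and lemmatized corpora): the function name `coverage`, the regular-expression tokenizer, and the tiny sample text and word list are all hypothetical, and no lemmatization is performed.

```python
import re
from collections import Counter


def coverage(text, reference_words):
    """Compare the words in a text with a reference word list.

    Returns (type_coverage, token_coverage, off_list), where off_list
    counts the words of the text that are missing from the reference list.
    Assumption: plain lowercase word forms, no lemmatization.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    reference = {word.lower() for word in reference_words}

    on_list_types = [word for word in counts if word in reference]
    on_list_tokens = sum(counts[word] for word in on_list_types)

    type_cov = len(on_list_types) / len(counts) if counts else 0.0
    token_cov = on_list_tokens / len(tokens) if tokens else 0.0
    off_list = Counter({w: c for w, c in counts.items() if w not in reference})
    return type_cov, token_cov, off_list


# Hypothetical sample data, for illustration only.
sample_text = "Listen and read. Listen again and look at the picture."
sample_reference = ["listen", "and", "read", "look", "at", "the", "picture"]

type_cov, token_cov, off_list = coverage(sample_text, sample_reference)
print(f"type coverage: {type_cov:.3f}, token coverage: {token_cov:.3f}")
print("not on the reference list:", dict(off_list))
```

Type coverage treats every distinct word equally, while token coverage weights words by how often they occur, so the two figures can differ noticeably for a text dominated by a few frequent instruction words.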
Excerpt from the Book
6.2.2 Co-occurrence analysis
Elementary context analysis with the occurrence threshold set to 1 produced a list of 101 words. This list was checked and personal names were omitted, producing a final list of 94 words with available co-occurrences. The most frequent word was Listen, with 174 occurrences in the corpus, owing to its frequent use in exercise instructions. Play followed in second place with 74 occurrences, and the following words each occurred more than 60 times: town, look, complete, picture, read, question, name and school. The most frequent multiword items were the contracted forms it’s, with 104 occurrences, and I’m, with 60 occurrences in the corpus.
As an example of associated lemmas, a representative sample of items with a frequency coefficient of 0.20 and above was chosen and is presented in the form of a table. The key to the table is as follows:
LEMMA_B = lemma associated with listen
COEFF = value of the selected index
CE_B = total number of elementary contexts that contain the selected lemma
CE_AB = total number of elementary contexts in which lemmas A and B are associated
CHI2 = chi square value concerning the co-occurrence significance
(p) = probability associated with the chi square value
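The quantities in this key can be related to one another with a small worked example. The Python sketch below is an illustration under assumptions and does not reproduce T-LAB's internal computation: the cosine coefficient is only one possible choice for the "selected index", the chi-square statistic is the standard Pearson test on a 2x2 table of context counts without continuity correction, and the function name `cooccurrence_stats`, the parameter `n_contexts`, and all counts are hypothetical.

```python
import math


def cooccurrence_stats(ce_a, ce_b, ce_ab, n_contexts):
    """Association statistics for lemmas A and B.

    ce_a, ce_b  -- elementary contexts containing A and B respectively
    ce_ab       -- elementary contexts containing both A and B
    n_contexts  -- total number of elementary contexts (assumed known)
    Returns (coeff, chi2, p).
    """
    # Cells of the 2x2 contingency table of context counts.
    both = ce_ab
    only_a = ce_a - ce_ab
    only_b = ce_b - ce_ab
    neither = n_contexts - ce_a - ce_b + ce_ab
    if min(both, only_a, only_b, neither) < 0:
        raise ValueError("inconsistent context counts")

    # Assumed association index: cosine coefficient.
    coeff = ce_ab / math.sqrt(ce_a * ce_b) if ce_a and ce_b else 0.0

    # Pearson chi-square for a 2x2 table, no continuity correction.
    denom = ce_a * (n_contexts - ce_a) * ce_b * (n_contexts - ce_b)
    cross = both * neither - only_a * only_b
    chi2 = n_contexts * cross ** 2 / denom if denom else 0.0

    # Upper-tail probability of chi-square with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return coeff, chi2, p


# Hypothetical counts, for illustration only.
coeff, chi2, p = cooccurrence_stats(ce_a=20, ce_b=10, ce_ab=8, n_contexts=100)
print(f"COEFF={coeff:.3f}  CHI2={chi2:.2f}  p={p:.2g}")
```

A high coefficient with a small p value indicates that the two lemmas share elementary contexts more often than independent occurrence would predict; with very small counts the chi-square approximation becomes unreliable.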
Summary of Chapters
Introduction: Provides a background on the challenges of teaching English vocabulary to young learners and introduces the thesis's focus on utilizing corpus linguistics to evaluate the Project coursebook series.
Coursebook vocabulary: Discusses the significance of vocabulary as a component of language learning and addresses current inefficiencies in how vocabulary is presented in school settings.
Project coursebooks in primary education: Examines the role of coursebooks in primary education and the institutional context of the Project series in the Czech Republic.
Word frequency in current learner’s dictionaries: Reviews the importance of frequency information in vocabulary acquisition, focusing on the Oxford 3000 and Longman defining vocabularies.
Research: Outlines the research methodology, including data collection from the Project series, questionnaire administration, and the use of T-LAB software for corpus analysis.
Output interpretation: Presents the detailed corpus-based findings for each of the five Project coursebooks, including word list analysis, thematic clusters, and comparative results.
Keywords
Corpus linguistics, Project coursebooks, vocabulary, English language teaching, word frequency, co-occurrence analysis, thematic analysis, Oxford 3000, primary education, data mining, learner dictionaries, comparative analysis, pedagogical approaches, lemmatization, T-LAB.
Frequently Asked Questions
What is the core focus of this research?
The research focuses on a corpus-based linguistic analysis of the fourth edition of the Project coursebook series to investigate how vocabulary is presented to young learners.
What are the primary thematic fields covered in the work?
The key themes include corpus linguistics, the evaluation of vocabulary presentation methods in primary education, word frequency analysis, and the comparison of pedagogical materials with standardized lists like the Oxford 3000.
What is the primary research goal?
The primary goal is to determine if current coursebooks effectively prioritize essential vocabulary and to identify patterns in how this vocabulary is presented to students.
What methodology is employed to conduct the analysis?
The study uses corpus-linguistic methods, employing T-LAB software to analyze word frequency, co-occurrence, thematic clusters, and comparative data extracted from the digitized texts of the coursebooks.
What content is addressed in the main body of the text?
The main body details the theoretical background of corpus-based vocabulary learning, explains the specific data collection and analytical processes, and provides a book-by-book interpretation of the results for the Project series.
Which keywords best characterize this work?
The study is best characterized by keywords such as Corpus Linguistics, Project coursebooks, vocabulary, English language teaching, word frequency, and comparative analysis.
How were the teachers involved in this study?
English teachers from ZŠ náměstí Míru in Nový Bor participated in a questionnaire survey to express their opinions, experiences, and encountered difficulties regarding the use of Project coursebooks in teaching vocabulary.
Why did the author analyze all five levels of the Project series (Project 1 through Project 5)?
The analysis covers the entire fourth edition series to provide a comprehensive evaluation of how vocabulary coverage and presentation evolve across the different proficiency levels represented by these coursebooks.
- Quote paper
- Karin Dietiová (Author), 2016, How can the use of frequency information from corpora be used in foreign language teaching? A corpus-based study on vocabulary in course books, Munich, GRIN Verlag, https://www.grin.com/document/454836