This book explores the lexical richness of certain well-known literary texts using a statistical gauge called the lexical richness curve.
The analysis conducted throughout this study is corpus-based, and a recent version of WordSmith Tools (7.0) is used to compute the basic frequencies of types and tokens. The study relies chiefly on a wordlist tool to analyze digital samples of six novels by three major novelists: Virginia Woolf's The Waves and To the Lighthouse, James Joyce's Ulysses and A Portrait of the Artist as a Young Man, and William Faulkner's Light in August and The Sound and the Fury.
Fifteen samples are taken randomly from each novel at 1,000-token intervals, giving 90 samples in total. Each sample is then analyzed statistically to determine its lexical richness. The number of types (distinct vocabulary words) and the number of tokens (running words) are counted for each sample. The ratio of types to tokens is presented visually in Microsoft Office Excel charts, which makes it easier to compare the lexical richness of the novels rigorously.
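The counting step described above can be sketched in a few lines of Python. This is only an illustration, not the procedure carried out in WordSmith Tools: the regular-expression tokenizer, the function names, and the reading of each sample as a contiguous 1,000-token window drawn at a random position are all assumptions made for the example.

```python
import random
import re


def tokenize(text):
    # Assumption: a token is a lowercase run of letters, optionally with
    # internal apostrophes. WordSmith's own tokenization rules may differ.
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())


def type_token_counts(tokens):
    """Return (number of types, number of tokens, type-token ratio)."""
    n_tokens = len(tokens)
    n_types = len(set(tokens))
    ratio = n_types / n_tokens if n_tokens else 0.0
    return n_types, n_tokens, ratio


def random_samples(tokens, n_samples=15, size=1000, seed=0):
    """Draw n_samples contiguous windows of `size` tokens at random starts.

    Assumption: this is one possible reading of "fifteen random samples
    at 1,000-token intervals"; windows may overlap.
    """
    rng = random.Random(seed)
    max_start = len(tokens) - size
    if max_start < 0:
        raise ValueError("text is shorter than one sample")
    starts = [rng.randint(0, max_start) for _ in range(n_samples)]
    return [tokens[s:s + size] for s in starts]
```

Applying `type_token_counts` to each window returned by `random_samples` would give fifteen type, token, and ratio values per novel, which could then be charted in a spreadsheet.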
The results show that Joyce's Ulysses has the highest rate of lexical richness, while Faulkner's Light in August has the lowest lexical richness curve. Woolf's novels fall somewhere in the middle, although in some textual samples The Waves comes unusually close to Ulysses. Moreover, the type-token curves for Joyce's A Portrait of the Artist as a Young Man and Woolf's To the Lighthouse nearly coincide, indicating an exceptional similarity in their lexical repertoires.
Table of Contents
Chapter One Introduction
Chapter Two Stylistics vs. Corpus Linguistics
2.1 What is Stylistics all about?
2.1.1 The Need for Stylistics
2.1.2 Major Areas of Stylistics
2.2 Corpus Linguistics: What is it all about?
2.2.1 Features of Corpus Linguistics
2.2.2 The Goals of Corpus Linguistics
2.2.3 What is Corpus?
2.2.4 Types of Corpora
2.2.5 The Use of Computer to Study Language
2.2.6 The Corpus-Based Approach vs. The Intuition-Based Approach
2.2.7 Corpus Linguistics: A Methodology or an Independent Discipline
2.2.8 Corpus-Based and Corpus-Driven Approaches
2.3 Corpus Stylistics
2.3.1 Goals of Corpus Stylistics
2.3.2 Features of Corpus Stylistics
2.3.3 Limitations of Corpus Stylistics
2.3.4 The Circle of Corpus Linguistics Description and Literary Appreciation
Chapter Three Lexical Richness as a Stylistic Feature
3.1 Lexical Richness
3.2 Measuring Style
3.3 The Problem of Measuring Style
3.4 Style as Recurrence
3.5 The Relationship Between Frequency and Significance in a Corpus
3.6 The Quantitative Features of Style
3.6.1 Lexical Features
3.6.2 Character Specific Features (N-gram feature)
3.6.3 Syntactic Features
3.6.4 Semantic Features
3.7 Lexical Richness and Type-Token Curve
3.8 Creativity and Literary Vocabulary
Chapter Four Text Corpora and Methodology
4.1 Design Consideration
4.2 Corpus Design
4.3 Technical Preparation
4.3.1 Planning a Storage System
4.3.2 Copyright
4.3.3 Electronic Version
4.4 Text Corpora
4.5 Corpus Features
4.6 Methods Used in Analyzing the Corpus
4.6.1 WordSmith Tools Version (7.0)
4.6.2 Microsoft Office Excel
4.7 The Length of Individual Text Samples
4.8 Representativeness and Balance
4.9 Sampling Methodology
4.10 Summary of the Analysis Procedures
Chapter Five Analysis and Results
5.1 Type-Token Curve Analysis
5.1.1 Type-Token Curve Analysis of Virginia Woolf's Novels
5.1.2 Type-Token Curve Analysis of William Faulkner's Novels
5.1.3 Type-Token Curve Analysis of James Joyce's Ulysses and A Portrait of the Artist as a Young Man
5.1.4 Type-Token Curves of the Three Authors
5.2 The Results
Chapter Six Conclusions and Suggestions for Further Studies
6.1 Conclusions
Research Objective & Key Topics
The primary objective of this book is to demonstrate how corpus-driven methods, specifically type-token curves, can be utilized to perform rigorous, objective stylistic analysis of lexical richness in literary works, thereby bridging the gap between qualitative literary appreciation and quantitative linguistic description.
- Application of corpus stylistics to analyze literary texts.
- Methodological use of type-token curves to measure lexical diversity.
- Comparative statistical analysis of six novels by Woolf, Faulkner, and Joyce.
- Evaluation of lexical richness, recurrence, and stylistic development.
Excerpt from the Book
3.3 The Problem of Measuring Style
According to the "style as recurrence" approach, style is concerned with the frequencies of linguistic features in a given context, and thus with textual probabilities. To measure the style of a passage or corpus, the frequencies of its linguistic items at different levels must be compared with the corresponding features in another text or corpus. The aim of stylistic analysis is therefore to create an inventory of style markers, that is, linguistic items that appear in only one group of contexts (Corbett, 1987: 99). The first difficulty in measuring style, then, is deciding which technique the researcher will use.
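The comparison of frequencies across two texts can be illustrated with a minimal Python sketch. It is not taken from the book: the function names and the choice to normalize counts per 1,000 tokens are assumptions for the example, and the inputs are assumed to be lists of already tokenized words.

```python
from collections import Counter


def relative_frequencies(tokens, per=1000):
    """Map each word to its frequency per `per` tokens of the text."""
    total = len(tokens)
    if total == 0:
        return {}
    counts = Counter(tokens)
    return {word: count * per / total for word, count in counts.items()}


def compare(tokens_a, tokens_b, words, per=1000):
    """For each word of interest, pair its rate in text A with its rate in text B.

    A word that is absent from a text gets a rate of 0.0 there.
    """
    freq_a = relative_frequencies(tokens_a, per)
    freq_b = relative_frequencies(tokens_b, per)
    return {w: (freq_a.get(w, 0.0), freq_b.get(w, 0.0)) for w in words}
```

A word with a non-zero rate in one text and a rate of 0.0 in the other is a candidate style marker in the sense described above, although two short texts are far too little evidence to settle the matter.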
The second problem is the impossibility of accounting for all the linguistic features that may be found in a text (Leech, 2007: 35). Although English is a well-studied language, no complete description of it exists, because language is a complex and open-ended system. This problem can be overcome by specifying in advance the particular linguistic features that the research will address (ibid.).
To discover what is special about the style of a text or corpus, the researcher has to find the frequencies of the relevant features it contains. Studies demonstrate that some writers use certain features that distinguish them from other writers. For example, one might claim that writer A is "fond of" or "tends to use" feature B. Such a conclusion should be based on empirical evidence in order to be reliable and objective (ibid.: 36).
Another problem in measuring style concerns studies that focus on words that appear only once in the entire text, called "hapax legomena" (Xiao, 2008: 383). The problem with these words as discriminators is that their low individual rate of occurrence makes them difficult to handle statistically, and they also tend to be obscure (ibid.: 385). To overcome this problem, quantitative studies of style that use lexical discriminators must rely on words that appear frequently in the texts (ibid.).
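The distinction between hapax legomena and frequently occurring words can be made concrete with a short Python sketch. The function names and the `min_count` threshold are hypothetical and serve only to illustrate the idea; the input is assumed to be a list of already tokenized words.

```python
from collections import Counter


def hapax_legomena(tokens):
    """Return, in alphabetical order, the words that occur exactly once."""
    counts = Counter(tokens)
    return sorted(word for word, count in counts.items() if count == 1)


def frequent_words(tokens, min_count=2):
    """Return (word, count) pairs occurring at least min_count times,
    ordered from most to least frequent."""
    counts = Counter(tokens)
    return [(w, c) for w, c in counts.most_common() if c >= min_count]
```

In a sketch like this, the hapax list grows long while each entry carries a single observation, which is why the frequent words are the more tractable discriminators.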
Summary of Chapters
Chapter One Introduction: Outlines the scope of corpus stylistics and establishes the research goal of using type-token curves to analyze lexical richness in six specific stream-of-consciousness novels.
Chapter Two Stylistics Vs. Corpus Linguistics: Explores the theoretical relationship between stylistics and corpus linguistics, detailing how these fields intersect to enable empirical text analysis.
Chapter Three Lexical Richness as a Stylistic Feature: Discusses the quantitative components of style, specifically defining lexical richness and the role of type-token ratios in measuring lexical diversity.
Chapter Four Text Corpora and Methodology: Details the design, technical preparation, and sampling procedures used to compile the digital corpus for the study.
Chapter Five Analysis and Results: Presents the statistical findings and visual type-token curve analyses for the novels by Virginia Woolf, William Faulkner, and James Joyce.
Chapter Six Conclusions and Suggestions for Further Studies: Summarizes the findings regarding the effectiveness of type-token curves and offers perspectives on future research in corpus stylistics.
Keywords
Corpus stylistics, Lexical richness, Type-token curve, Literary analysis, Corpus linguistics, Style as recurrence, Quantitative features of style, Textual probability, Stream of consciousness, Statistical linguistics, Vocabulary diversity, Linguistic markers.
Frequently Asked Questions
What is the core focus of this publication?
The work focuses on applying corpus-driven methods to perform stylistic analysis on literary texts, particularly stream-of-consciousness novels.
What are the primary thematic fields covered?
It covers corpus linguistics, stylistic theory, lexical richness measurement, and computational methodology for textual analysis.
What is the primary research goal?
The goal is to determine the statistical reliability of type-token curves in measuring lexical diversity and to describe the lexical profiles of six major literary works.
Which scientific methods are utilized?
The study employs corpus-based quantitative analysis, utilizing WordSmith Tools (7.0) for word frequency and type calculations, and Microsoft Excel for graphical representation.
What does the main body discuss?
The main body discusses the theoretical background of style, the technicalities of building a representative corpus, and the empirical analysis of vocabulary usage in specific novels.
How can this book be characterized by its keywords?
The book is characterized by keywords like "Corpus Stylistics," "Lexical Richness," "Type-Token Curve," and "Quantitative Stylistics," which reflect its focus on statistical linguistic approaches to literature.
Why are James Joyce's novels analyzed separately in Chapter 5?
The analysis of Joyce's novels differs from the others because they exhibit unique statistical behaviors in their type-token curves, showing extreme divergence compared to the more consistent paths found in other authors' works.
What conclusion is drawn regarding Faulkner's writing?
The study suggests a potential decline or deterioration in Faulkner's lexical resources between his early works and his later novels, based on the type-token curve analysis.
- Cite this work
- Khalid Shakir Hussein (Author), Ali Hussein Abdul-Ameer (Co-author), 2017, A Corpus-Driven Approach to Stylistic Analysis of a Lexical Richness Curve, Munich, GRIN Verlag, https://www.grin.com/document/353181