Natural language is complex. When processing a sentence, the human parser has to keep track of its structure: it must remember the input string, integrate new words into structures already built, and much more, all of it on-line. If a sentence becomes too difficult, the parser loses control, and processing slows down or may eventually break down.

A number of complexity measures have been proposed for natural language; the most influential one at the moment is Gibson’s (2000) Dependency Locality Theory (DLT). However, in a recent experiment, Konieczny and Döring (2003) found that reading times on clause-final verbs were faster, not slower, when the number of verb arguments was increased. This was taken as evidence against DLT’s integration cost hypothesis and for the anticipation hypothesis originally developed by Konieczny (1996): during language processing, a listener or reader anticipates what is about to come and is thus “ahead of time”.
This paper presents a series of simulations modeling anticipation. Because Simple Recurrent Networks (SRNs; Elman 1990) seem to be the most adequate device for modeling verbal working memory (MacDonald & Christiansen 2002), neural networks were used for the simulations.

In seven series of simulations, I managed to model the anticipation effect. Besides a deeper understanding of anticipation, the simulations yielded insights into the way SRNs function.

The paper is organized as follows. First, I will give an overview of different complexity measures; second, the experiment mentioned above will be described. Third, I will briefly discuss existing models of verbal working memory. After a short introduction to neural network modeling, the core part of the paper, the seven simulation series, will be presented in detail.
In the Discussion, I will argue that SRNs represent a good model for anticipation; implications for the anticipation hypothesis as well as for SRNs in general will be considered. Finally, predictions for further experiments will be discussed.

Excerpt

1. Introduction

2. Complexity Measures

2.1. Why complexity measures?

2.2. Yngve

2.3. Bottom-up parsing

2.4. Left-corner parsing

2.5. Chomsky and Miller

2.6. Fodor and Garrett

2.7. Other complexity measures

2.8. Gibson

3. The idea of anticipation

3.1. Shannon and Weaver

3.2. Entropy and anticipation

3.3. SOUL

3.4. New developments

3.5. My concept of anticipation

4. Empirical evidence for anticipation

5. Language processing and Working Memory

5.1. Baddeley’s model of Working Memory

5.2. Individual differences: reading span

5.3. The one-resource model

5.3.1. King and Just (1991)

5.3.2. Just and Carpenter (1992): CCREADER

5.4. The two-resource model

5.4.1. Waters and Caplan (1996)

5.4.2. Caplan and Waters (1999)

5.5. The alternative: Neural Networks

5.5.1. The model by MacDonald and Christiansen (2002)

5.5.2. Postscript to MacDonald and Christiansen’s model

6. A short break: Integration – anticipation – neural networks

7. The simulations

7.1. Introduction to Neural Network modeling

7.1.1. Simple Recurrent Networks

7.1.2. Training and testing the network

7.1.3. Evaluating performance

7.1.4. The grammars

7.1.5. Details of the simulations

7.1.6. Overview of the seven series

7.2. First series

7.3. Second series: Anticipation is modeled

7.4. Third series: Even distribution of verbs

7.5. Fourth series: Easy versus difficult context

7.5.1. Subject and object relative clauses

7.5.2. Simply- and doubly-embedded sentences

7.5.2.1. With determiners

7.5.2.2. Without determiners

7.6. Fifth series: Positive and negative evidence

7.6.1. The Restricted GPE

7.6.2. A simple grammar

7.6.3. A complicated grammar

7.7. Sixth series: Early and late evidence

7.7.1. A simple grammar

7.7.2. A complicated grammar

7.7.2.1. Was the grammar too easy?

7.7.2.2. Two effects?

7.8. Seventh series: “pure” anticipation

8. General Discussion

8.1. Series 1 to 3

8.2. Series 4

8.3. Series 5 and 6

8.4. Series 7

8.5. Implications for the anticipation hypothesis

8.6. Connection to other theories

9. Conclusion

10. References

Objectives & Themes

The primary goal of this research is to investigate the anticipation hypothesis in human language processing, which suggests that readers and listeners anticipate upcoming syntactic structures, thereby facilitating processing. This work models this phenomenon using neural network simulations, specifically Simple Recurrent Networks (SRNs), to evaluate if connectionist architectures can successfully capture anticipation effects that are not explained by traditional integration cost theories.

Investigation of syntactic anticipation of clause-final heads.
Evaluation of neural networks (Simple Recurrent Networks) as models for verbal working memory.
Comparison of anticipation versus traditional integration cost hypotheses.
Analysis of the impact of positive vs. negative evidence and early vs. late evidence on processing performance.

Summary of Chapters

1. Introduction: Presents the motivation for the study, focusing on the anticipation hypothesis as an alternative to traditional complexity metrics like Gibson’s Dependency Locality Theory.

2. Complexity Measures: Reviews traditional parsing models and complexity metrics, discussing their strengths and limitations in predicting human processing difficulties.

3. The idea of anticipation: Defines the anticipation hypothesis, traces its origins to communication theory (Shannon and Weaver), and explores the concept of incremental prediction refinement.

4. Empirical evidence for anticipation: Describes the eye-tracking study by Konieczny and Döring (2003) which provides the empirical basis for the subsequent modeling.

5. Language processing and Working Memory: Discusses various symbolic and connectionist models of working memory, highlighting the shift toward neural network approaches like those of MacDonald and Christiansen.

6. A short break: Integration – anticipation – neural networks: Acts as an intermediate summary before the simulation series, restating the research gap and the methodological approach.

7. The simulations: This central chapter covers the technical implementation of SRNs, the methodology of the seven simulation series, and the detailed results of each, from simple modeling of anticipation to complex experiments with positive/negative and early/late evidence.

8. General Discussion: Synthesizes the results of all simulation series, assesses the implications for the anticipation hypothesis, and compares the findings with other theoretical frameworks.

9. Conclusion: Summarizes the findings and proposes future research directions, advocating for more comprehensive, global models of language processing.

Keywords

Anticipation hypothesis, Language processing, Simple Recurrent Networks, SRN, Working memory, Dependency Locality Theory, Neural networks, Parsing, Grammaticality Prediction Error, GPE, Connectionism, Syntactic complexity, Clause-final verbs, Verb arguments, Empirical linguistics.

Frequently Asked Questions

What is this research primarily about?

This research investigates the cognitive mechanisms behind human language processing, specifically focusing on whether and how humans "anticipate" upcoming syntactic heads in a sentence, and how this relates to processing difficulty.

What are the central thematic fields?

The study centers on the intersection of syntactic complexity theories, verbal working memory, and connectionist modeling, particularly testing the validity of the anticipation hypothesis against traditional integration cost models.

What is the primary objective of this work?

The main objective is to determine if neural network simulations, specifically Simple Recurrent Networks (SRNs), can effectively model the anticipation effect observed in human eye-tracking experiments.

Which scientific methodology is utilized?

The work employs a connectionist methodology: Simple Recurrent Networks (SRNs) are trained on stochastic context-free grammars and evaluated using the Grammaticality Prediction Error (GPE) and a restricted version of it (RGPE).
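
To make the term concrete, the following is a minimal sketch of the forward pass of an Elman-style SRN, written in plain Python. It is not the author's implementation: the toy vocabulary, the layer sizes, the random untrained weights, the zero-initialized context and the softmax output layer are all assumptions chosen for illustration, and no training or GPE computation is shown. The point it illustrates is only the recurrence: the hidden activations of one step are copied into a context layer and fed back in at the next step, so the prediction for the next word depends on the words seen so far.

```python
import math
import random

# Hypothetical toy vocabulary (an assumption, not taken from the thesis).
VOCAB = ["the", "dog", "cat", "sees", "sleeps", "."]


def one_hot(index, size):
    """Localist input coding: one unit per word."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(values):
    """Turn output activations into a distribution over next words."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


class SimpleRecurrentNetwork:
    """Untrained Elman-style network; forward pass only (illustrative)."""

    def __init__(self, n_words, n_hidden, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.n_words = n_words
        self.w_in = init(n_hidden, n_words)        # input   -> hidden
        self.w_context = init(n_hidden, n_hidden)  # context -> hidden
        self.w_out = init(n_words, n_hidden)       # hidden  -> output
        self.context = [0.0] * n_hidden            # assumed start state

    def step(self, word_index):
        """Process one word and return a next-word distribution."""
        x = one_hot(word_index, self.n_words)
        from_input = matvec(self.w_in, x)
        from_context = matvec(self.w_context, self.context)
        hidden = [sigmoid(a + b) for a, b in zip(from_input, from_context)]
        self.context = list(hidden)  # copy hidden layer for the next step
        return softmax(matvec(self.w_out, hidden))


net = SimpleRecurrentNetwork(len(VOCAB), n_hidden=4)
for word in ["the", "dog", "sees"]:
    prediction = net.step(VOCAB.index(word))
```

After the loop, `prediction` holds one probability per vocabulary item for the word following "sees". With untrained weights these values are meaningless; in a trained network, a distribution of this kind is what an error measure such as the GPE would be computed on.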

What is covered in the main section of the paper?

The core section details seven distinct series of neural network simulations that test various aspects of anticipation, including the influence of context complexity, evidence timing, and positive versus negative grammatical evidence.

Which keywords characterize this work?

Key terms include the anticipation hypothesis, Simple Recurrent Networks (SRNs), working memory, connectionism, syntactic complexity, and language processing models.

How do SRNs model human working memory differently from symbolic models?

Unlike symbolic models that often posit a separate, capacity-limited working memory, SRNs demonstrate that processing constraints emerge directly from the network architecture and its accumulated experience with language patterns.

What conclusion does the author draw regarding the anticipation effect in neural networks?

The author concludes that SRNs can successfully model the anticipation effect, particularly by leveraging positive evidence to refine predictions of upcoming verbs, although the specific influence of "early" evidence remains complex due to the networks' inherent local bias.

Details

Title: Being ahead of time. A number of neural network simulations exploring the anticipation of clause-final heads
College: University of Freiburg (Germanistik)
Grade: 1,0
Author: Philipp Döring (Author)
Publication Year: 2004
Pages: 133
Catalog Number: V303808
ISBN (eBook): 9783668022997
ISBN (Book): 9783668023000
Language: English
Tags: Linguistics Linguistik Sprachwissenschaft Neural networks neuronale Netzwerke computer model relativsatz language processing sprachverabeitung anticipation grammar grammatik modellierung
Product Safety: GRIN Publishing GmbH

Quote paper: Philipp Döring (Author), 2004, Being ahead of time. A number of neural network simulations exploring the anticipation of clause-final heads, Munich, GRIN Verlag, https://www.grin.com/document/303808
