From a practical perspective, given the abundance of digital content available today, a technological solution that summarizes written text without losing its message, coherence, and cohesion of ideas is essential. Such technology saves readers time and lets them focus on the content that matters most.

This is one of the research areas in natural language processing and information retrieval to which the dissertation tries to contribute. It adapts tools and technologies developed for other languages to the automatic summarization of Xhosa news articles. Specifically, the dissertation aims to develop an extraction-based text summarizer for Xhosa news articles.

In doing so, it reviews the literature on the techniques and technologies used to analyze, transform, and synthesize the contents of a written text, examines the phonology and morphology of the Xhosa language, and finally designs, implements, and tests an extraction-based automatic news article summarizer for the Xhosa language. Given comprehension and relevance of the literature review, the research design, the methods and tools and technologies used to design, implement and test the pilot system.

Two features were used to select relevant sentences: term frequency and sentence position. The Xhosa summarizer was evaluated on a test set using both subjective and objective evaluation methods, and the results of both were satisfactory.

Keywords: Xhosa, Automatic Text Summarization, Term Frequency, Sentence Position.
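The two extraction features named in the abstract, term frequency and sentence position, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names, the naive regex-based sentence splitter and tokenizer, the linear weighting, and the `position_weight` parameter are all assumptions made for illustration, and no Xhosa-specific morphological processing is attempted.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercased runs of letters; no stemming or stop-word removal."""
    return re.findall(r"[^\W\d_]+", sentence.lower())


def score_sentences(sentences, position_weight=0.5):
    """Blend a normalized term-frequency score with a sentence-position score."""
    tokens_per_sentence = [tokenize(s) for s in sentences]
    # Document-wide term counts.
    term_freq = Counter(t for toks in tokens_per_sentence for t in toks)
    # Average term frequency per sentence, so length alone is not rewarded.
    raw_tf = [
        sum(term_freq[t] for t in toks) / len(toks) if toks else 0.0
        for toks in tokens_per_sentence
    ]
    max_tf = max(raw_tf, default=0.0) or 1.0
    n = len(sentences)
    # Position score: 1.0 for the first sentence, decreasing to 1/n for the last.
    return [
        (1 - position_weight) * (tf / max_tf) + position_weight * (n - i) / n
        for i, tf in enumerate(raw_tf)
    ]


def summarize(text, max_sentences=2, position_weight=0.5):
    """Return the top-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    scores = score_sentences(sentences, position_weight)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen = sorted(ranked[:max_sentences])
    return [sentences[i] for i in chosen]
```

Setting `position_weight` to 1.0 reduces the sketch to a lead-sentence baseline, while 0.0 ranks by term frequency alone; how the thesis actually combines the two features is not stated in this excerpt.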

Excerpt

Chapter 1: Introduction

1.1: Background
1.2: Problem Statement
1.3: Research Objectives
1.4: Research Questions
1.5: Significance of the Study
1.6: Scope of the Study
1.7: Limitations of the Study
1.8: Structure of the Thesis

Chapter 2: Literature Review

2.1: Introduction
2.2: Text Summarization

2.2.1: Types of Text Summarization
2.2.2: Techniques for Extractive Summarization
2.2.3: Evaluation Metrics for Text Summarization

2.3: Natural Language Processing

2.3.1: Morphological Analysis
2.3.2: Lexical Analysis
2.3.3: Syntactic Analysis
2.3.4: Semantic Analysis

2.4: isiXhosa Language

2.4.1: Characteristics of the isiXhosa Language
2.4.2: Resources Available for isiXhosa

2.5: Related Work
2.6: Conclusion

Chapter 3: Research Methodology

3.1: Introduction
3.2: Research Design
3.3: Data Collection and Preparation

3.3.1: Data Source
3.3.2: Data Pre-Processing

3.4: Development of the Automatic News Summarizer

3.4.1: System Architecture
3.4.2: Summarization Algorithm

3.5: Evaluation Metrics
3.6: Ethical Considerations
3.7: Conclusion

Chapter 4: Implementation and Evaluation

4.1: Introduction
4.2: Implementation of the Summarizer
4.3: Evaluation of the Summarizer

4.3.1: Experimental Setup
4.3.2: Evaluation Results

4.4: Discussion of Results
4.5: Conclusion

Chapter 5: Conclusion and Future Work

Objectives and Key Themes

This thesis focuses on developing an automatic news summarizer for the isiXhosa language. The main goal is to address the lack of such systems for this language, enabling efficient information extraction and dissemination. The study aims to achieve this by implementing a system that leverages text summarization techniques and natural language processing approaches. It will be evaluated against standard metrics to assess its effectiveness.

Automatic Text Summarization for isiXhosa
Natural Language Processing Techniques
Development and Evaluation of a Summarizer System
isiXhosa Language Resources and Challenges
Contributions to Information Access and Dissemination in isiXhosa

Chapter Summaries

The thesis is structured into five chapters, each exploring different aspects of the research.

Chapter 1 provides a comprehensive introduction to the topic, outlining the background, problem statement, research objectives, and significance of the study.
Chapter 2 delves into a thorough literature review, discussing existing research on text summarization, natural language processing, and resources available for isiXhosa.
Chapter 3 focuses on the research methodology, outlining the design, data collection and preparation, the development of the summarizer system, and evaluation metrics.
Chapter 4 details the implementation and evaluation of the summarizer, including experimental setup, results, and a discussion of the findings.
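As an illustration of what an objective evaluation of an extractive summarizer can look like, the sketch below compares the sentences a system selected against a human reference extract using precision, recall, and F1. This excerpt does not say which metrics the thesis uses, so the choice of metric and the function name `extract_overlap_scores` are assumptions.

```python
def extract_overlap_scores(system_sentences, reference_sentences):
    """Precision, recall and F1 over exactly matching selected sentences."""
    system = set(system_sentences)
    reference = set(reference_sentences)
    overlap = len(system & reference)
    # Share of system-selected sentences that the reference also selected.
    precision = overlap / len(system) if system else 0.0
    # Share of reference sentences that the system recovered.
    recall = overlap / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Exact sentence matching only makes sense when both the system output and the reference are extracts drawn from the same article.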

Keywords

The key terms that characterize the research are automatic text summarization, isiXhosa language, natural language processing, extractive summarization, evaluation metrics, and information access. The research explores the development and evaluation of a system for extracting concise summaries from isiXhosa news articles, utilizing natural language processing techniques. The focus is on contributing to the field of information retrieval and dissemination for the isiXhosa language.

End of excerpt (115 pages)

Details

Title: Development of an automatic news summarizer for isiXhosa language
Course: Computer Science
Grade: 75
Author: Zukile Ndyalivana (Author)
Year of publication: 2017
Pages: 115
Catalog number: V442361
ISBN (ebook): 9783668861718
ISBN (book): 9783668861725
Language: English
Keywords: IsiXhosa, Python, NLTK
Product safety: GRIN Publishing GmbH

Cite this text: Zukile Ndyalivana (Author), 2017, Development of an automatic news summarizer for isiXhosa language, Munich, GRIN Verlag, https://www.grin.com/document/442361
