Machine Translation. Analyzing and Classifying Errors and Comparing Performances

Final-year project/dissertation, 2018, 78 pages, Grade: 9 out of 10

Author: Carmen Moreno

Computer Science - Computational Linguistics

This undergraduate dissertation examines machine translation (MT) tools for English into Spanish, together with computer-assisted translation tools. Its main goal is to establish how important these tools are in translators' current working environment and to explore their potential.

The first section of this dissertation consists of an introduction in which I justify the chosen subject. Next, the different types of MT are analysed and some aspects of the main online MT systems, such as Google, Systran and DeepL, are explained. Four documents of different kinds have been selected and translated with these same MT engines. The goal is to identify, analyse and classify the errors made by each MT system and to compare their performance. The processes of pre- and post-editing are explained through practical examples. Finally, the advantages and disadvantages of MT are presented, along with an explanation of the computer-assisted translation tools that already allow translators to use MT in their work environment.

Excerpt


Table of Contents

1. Types of MT

1.1. Rule-based MT

1.1.1. Direct systems

1.1.2. Indirect systems

1.1.2.1. Transfer systems

1.1.2.2. Interlingua

1.2. MT based on the analysis of linguistic corpora

1.2.1. Example-based MT

1.2.2. Statistical MT

1.2.3. Neural MT

1.3. Hybrid translation

2. MT tools

2.1. Systran

2.2. Google

2.3. DeepL

3. Text excerpts translated with MT

3.1. The legal text

3.2. The scientific text

3.3. The technical text

3.4. The press article

4. Analysing and classifying the errors

5. Results

6. Pre-editing and post-editing

7. Advantages and disadvantages of MT

8. Conclusions

Research Objectives and Key Topics

This undergraduate dissertation examines the utility and accuracy of machine translation (MT) and computer-assisted translation (CAT) tools in the contemporary translation industry. The study focuses on translating diverse English texts into Spanish using three major MT engines to assess performance, classify common error patterns, and explore the necessity of human intervention through pre- and post-editing.

  • Comparative performance analysis of Systran, Google, and DeepL.
  • Categorization and assessment of translation errors using an adapted DQF-MQM model.
  • Evaluation of pre-editing and post-editing processes to enhance output quality.
  • Investigation into the integration of MT within professional CAT software environments.
  • Discussion on the evolution of MT architectures, including rule-based, statistical, and neural models.

Excerpt from the Book

1.2.1. Example-based MT

Example-based MT was introduced during the 1980s by Makoto Nagao (Poibeau, 2017, p. 109), a computer scientist who has contributed to various fields, including MT and natural language processing (Wikipedia no date d).

The translation process using example-based MT consists of three phases:

• Corpus query. First, the system tries to find fragments of the sentence to be translated in the corpora for the source language. All relevant fragments are identified and stored.

• Search for equivalents. Second, the system searches for translation equivalents in the target language using aligned bilingual texts.

• Fragment combination. Finally, the system tries to combine the translation fragments to obtain a correct sentence in the target language (Poibeau, 2017, p. 110).
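The three phases above can be sketched in a few lines of Python. This is only a minimal illustration, not the method described in the dissertation or in Poibeau (2017): the aligned fragment table, the sample sentence, the greedy longest-match strategy and all function names are assumptions made for this sketch, and real example-based systems also handle reordering and agreement, which this one does not.

```python
# Hypothetical aligned fragments (English -> Spanish), invented for this sketch.
ALIGNED_FRAGMENTS = {
    "the weather": "el tiempo",
    "is going to be": "va a ser",
    "very good": "muy bueno",
    "tomorrow": "mañana",
}


def corpus_query(sentence, fragments):
    """Phase 1: scan left to right, keeping the longest known source fragment."""
    words = sentence.lower().split()
    found = []
    i = 0
    while i < len(words):
        # Try the longest candidate starting at position i first.
        for j in range(len(words), i, -1):
            candidate = " ".join(words[i:j])
            if candidate in fragments:
                found.append(candidate)
                i = j
                break
        else:
            # No fragment starts here: keep the single word as an unknown.
            found.append(words[i])
            i += 1
    return found


def find_equivalents(source_fragments, fragments):
    """Phase 2: look up each fragment; mark unknown ones with angle brackets."""
    return [fragments.get(frag, f"<{frag}>") for frag in source_fragments]


def combine(target_fragments):
    """Phase 3: naive combination that simply joins fragments in source order."""
    return " ".join(target_fragments)


def translate(sentence, fragments=ALIGNED_FRAGMENTS):
    """Run the three phases in sequence."""
    return combine(find_equivalents(corpus_query(sentence, fragments), fragments))


print(translate("The weather is going to be very good tomorrow"))
```

Joining fragments in source order happens to work for this sentence because English and Spanish share the same word order here; the combination phase is where a real system needs the most linguistic knowledge.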

This process is well illustrated by the following practical example. Let us assume that we want to translate the sentence "Today is going to be a good day" into Spanish and that a bilingual English-Spanish corpus is available with the following pairs of sentences:

Summary of Chapters

1. Types of MT: This chapter categorizes different machine translation architectures, detailing the operational differences between rule-based, corpus-based (example-based, statistical, and neural), and hybrid approaches.

2. MT tools: This section provides an overview of three prominent translation technologies (Systran, Google, and DeepL), describing their specific services and their market history as relevant to professional translators.

3. Text excerpts translated with MT: The author introduces a methodology for testing MT performance by translating four distinct text types (legal, scientific, technical, and press) and presents a framework for classifying translation errors.

4. Analysing and classifying the errors: This chapter defines the criteria and specific approach used for identifying and highlighting the errors produced by the MT engines during the testing phase.

5. Results: This chapter visualizes the findings through comparative graphs, demonstrating the frequency and type of errors across the tested documents for each MT engine.

6. Pre-editing and post-editing: This section explores how source text modification (pre-editing) and human correction (post-editing) are essential processes to bridge the gap between machine-generated output and professional quality standards.

7. Advantages and disadvantages of MT: The author discusses the practical benefits of MT for communication and productivity, while addressing significant challenges such as confidentiality, low-resource language limitations, and polysemy.

8. Conclusions: The final chapter summarizes the dissertation's findings, highlighting the superiority of recent neural developments and reaffirming the indispensable role of human post-editors in the translation workflow.

Keywords

Machine Translation, computer-assisted translation, Google, Systran, DeepL, translation industry, pre-editing, post-editing, neural MT, linguistic corpora, translation quality, error classification, natural language processing, multilingualism, terminology management.

Frequently Asked Questions

What is the primary focus of this dissertation?

The work focuses on the role of machine translation (MT) and computer-assisted translation (CAT) tools within the modern professional working environment for translators.

What are the central themes discussed in this research?

Key themes include the technical evolution of MT systems (from rule-based to neural), the critical assessment of translation quality across different text genres, and the vital importance of human post-editing.

What is the primary research goal or question?

The goal is to identify the importance of current MT tools, learn about their potential for translators, and analyze their performance through error classification and comparison.

Which scientific methodology is applied in this study?

The author performs an empirical analysis by translating four distinct documents (legal, scientific, technical, and journalistic) with three different MT engines, then identifies and classifies the errors using a simplified DQF-MQM (Dynamic Quality Framework - Multidimensional Quality Metrics) model.
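As a rough sketch of how error annotations of this kind might be tallied for comparison, the snippet below counts errors per engine and per category. Everything in it is a placeholder: the engine labels, the category names and the annotation rows are invented for illustration and are not the dissertation's categories or results.

```python
from collections import Counter

# Placeholder annotations as (engine, text_type, error_category) tuples.
# Labels and rows are invented for illustration; they are not study data.
annotations = [
    ("engine_a", "legal", "terminology"),
    ("engine_a", "legal", "accuracy"),
    ("engine_b", "legal", "terminology"),
    ("engine_b", "press", "fluency"),
    ("engine_b", "press", "fluency"),
]


def tally_errors(rows):
    """Count errors per engine and per (engine, category) pair."""
    per_engine = Counter(engine for engine, _, _ in rows)
    per_category = Counter((engine, category) for engine, _, category in rows)
    return per_engine, per_category


per_engine, per_category = tally_errors(annotations)
for engine, total in sorted(per_engine.items()):
    print(engine, total)
```

Counts like these are what a comparative bar chart per engine and error type would be drawn from.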

What topics are covered in the main body of the text?

The main body covers the classification of MT systems, an analysis of market-leading MT tools, a practical evaluation of translations, and a discussion on the necessary processes of pre- and post-editing.

Which keywords best characterize this research?

This work is characterized by terms like Machine Translation, computer-assisted translation, neural MT, error classification, post-editing, and linguistic corpora.

How does the performance of the three MT engines compare in this study?

The results show that DeepL generally produced the fewest errors and highest fluency, while Google Translate and Systran showed varying levels of performance, particularly regarding word disambiguation and technical terminology.

Why does the author argue that pre-editing is necessary?

The author argues that pre-editing, such as simplifying complex syntactic structures or avoiding ambiguous terminology, significantly improves the quality of the final output provided by MT engines.

What is the conclusion regarding the future of the human translator?

The dissertation concludes that while MT technology is advancing rapidly, human proofreaders and post-editors remain essential for ensuring high-quality, culturally appropriate translations that meet specific professional requirements.


Details

Title
Machine Translation. Analyzing and Classifying Errors and Comparing Performances
University
University of Granada  (Facultad de Traducción e Interpretación)
Academic year
2017 - 2018
Grade
9 out of 10
Author
Carmen Moreno (Author)
Publication year
2018
Pages
78
Catalog No.
V1035989
ISBN (Ebook)
9783346536419
ISBN (Book)
9783346536426
Language
English
Tags
machine translation analyzing classifying errors comparing performances
Product safety
GRIN Publishing Ltd.
Cite this work
Carmen Moreno (Author), 2018, Machine Translation. Analyzing and Classifying Errors and Comparing Performances, Munich, GRIN Verlag, https://www.grin.com/document/1035989