Voice recognition refers to a computer program or hardware device that can decode the human voice. It also provides a secure method of authenticating speakers: during the enrollment phase, the system generates a speaker model based on the speaker's characteristics. The testing phase typically involves making a claim about the identity of an unknown speaker using the given speech characteristics and the trained models.

Speaker identification is one of the two categories of speaker recognition; the other is speaker verification. The main difference is that speaker verification checks whether the person speaking is who they claim to be, whereas speaker identification makes multiple decisions, comparing the voice of the person speaking with the voices trained or stored in a database in an attempt to identify the speaker. This assignment is concerned with speaker identification, which is therefore the main focus of this study.
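The difference between the two tasks can be sketched in a few lines. This is only an illustration under stated assumptions: each speaker is represented by a single feature vector, Euclidean distance is the similarity measure, and the names `identify`, `verify`, `enrolled`, the speakers "alice" and "bob", and the threshold value are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(features, enrolled):
    """Identification: compare against every enrolled model, return the nearest name."""
    return min(enrolled, key=lambda name: euclidean(features, enrolled[name]))


def verify(features, claimed, enrolled, threshold):
    """Verification: compare only against the claimed identity's model."""
    return euclidean(features, enrolled[claimed]) <= threshold


# Hypothetical enrolled models and an unknown utterance (illustrative values only).
enrolled = {"alice": [1.0, 2.0, 3.0], "bob": [4.0, 0.0, -1.0]}
unknown = [1.1, 2.2, 2.9]

best_match = identify(unknown, enrolled)
claim_ok = verify(unknown, "alice", enrolled, threshold=0.5)
```

Identification performs one comparison per enrolled speaker, while verification performs a single comparison against a threshold that would have to be tuned on real data.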

Excerpt

Table of Contents

Introduction
Theoretical Concepts
- Speaker Recognition
- Classification of Automatic Speaker Recognition
- Speech Feature Extraction
Objectives
Design implementation
- Vocal Activity Detection (VAD)
- Speaker Identification
  - Frame Blocking
  - Windowing
  - Mel-frequency Wrapping
  - Cepstrum and Feature Extraction
  - Distance Calculation
  - GUI
Design innovativeness
Simulation results
- Train/Enrollment Result
- Recognition
- GUI Result
- Euclidean distance between voices
Discussion

Objectives and Key Themes

This assignment aims to develop a speaker identification system that can accurately identify speakers from a group of people in a recorded audio track. The system utilizes voice activity detection (VAD) to improve speech intelligibility and recognition. Both speaker identification and VAD employ the Mel Frequency Cepstrum Coefficient (MFCC) for voice feature extraction. The main objective is to create a reliable system that allows for speaker identification based on their voice.

Speaker identification using voice analysis
Voice activity detection for speech enhancement
Mel Frequency Cepstrum Coefficient (MFCC) for feature extraction
Text-independent speaker recognition
Comparison of speaker characteristics for identification

Chapter Summaries

The introduction establishes the significance of voice recognition as a social behavior and a key element in speaker identification systems. It outlines the challenges of identifying individual voices within a group recording and introduces the concept of VAD as a solution for improving speech intelligibility. The chapter further explains how MFCCs are used to extract speech features and quantize them for speaker recognition.
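To make the idea of voice activity detection concrete, here is a minimal sketch. It uses a simple short-time energy threshold, which is a swapped-in technique chosen for brevity and not the MFCC-based VAD the assignment describes. The function names, frame length, and threshold are assumptions for illustration.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def simple_vad(signal, frame_len, threshold):
    """Return one flag per non-overlapping frame: True where energy exceeds threshold."""
    flags = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        flags.append(frame_energy(frame) > threshold)
    return flags


# Synthetic example: silence, a burst of activity, then silence again.
signal = [0.0] * 8 + [0.5, -0.5] * 4 + [0.0] * 8
flags = simple_vad(signal, frame_len=8, threshold=0.01)
```

Only the frames flagged `True` would be passed on to feature extraction, so silence does not dilute the speaker's features.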

The "Theoretical Concepts" chapter delves into the fundamentals of voice recognition, categorizing it into speaker verification and speaker identification. It also introduces the distinction between text-dependent and text-independent voice recognition, explaining the rationale for focusing on text-independent recognition in this assignment.

The "Design Implementation" chapter describes the VAD algorithm and its role in enhancing speech recognition. It then explores the process of speaker identification, including stages like frame blocking, windowing, Mel-frequency wrapping, Cepstrum and feature extraction, distance calculation, and GUI design.
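The feature-extraction stages named above (frame blocking, windowing, Mel-frequency wrapping, cepstrum) can be sketched with the Python standard library alone. This is not the assignment's implementation. The Hamming window, the common 2595/700 mel formula, triangular filters, a naive DFT, and every parameter value (frame length, hop, filter and coefficient counts) are assumptions chosen to keep the example small.

```python
import cmath
import math


def frame_blocking(signal, frame_len, hop):
    """Split the signal into overlapping frames of frame_len samples."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def hamming(frame):
    """Taper the frame edges with a Hamming window."""
    n = len(frame)
    return [s * (0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)))
            for k, s in enumerate(frame)]


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..n/2 (slow, but dependency-free)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spec.append(abs(acc) ** 2 / n)
    return spec


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    # Filter edges expressed as fractional FFT bin positions.
    edges = [mel_to_hz(top * i / (n_filters + 1)) * n_fft / sample_rate
             for i in range(n_filters + 2)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def dct2(values, n_coeffs):
    """Unnormalized DCT-II; turns log filter energies into cepstral coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * c * (i + 0.5) / n) for i, v in enumerate(values))
            for c in range(n_coeffs)]


def mfcc(signal, sample_rate, frame_len=64, hop=32, n_filters=10, n_coeffs=6):
    """Return one list of n_coeffs cepstral coefficients per frame."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    feats = []
    for frame in frame_blocking(signal, frame_len, hop):
        spec = power_spectrum(hamming(frame))
        energies = [sum(w * p for w, p in zip(row, spec)) for row in bank]
        log_e = [math.log(e + 1e-10) for e in energies]  # small floor avoids log(0)
        feats.append(dct2(log_e, n_coeffs))
    return feats


# Synthetic test tone: 440 Hz sampled at 8 kHz (illustrative values only).
rate = 8000
tone = [math.sin(2 * math.pi * 440 * t / rate) for t in range(256)]
features = mfcc(tone, rate)
```

For the distance-calculation stage, the per-frame vectors in `features` could be averaged or quantized into a speaker model and compared with a Euclidean distance. A practical system would replace the naive DFT with an FFT routine.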

Keywords

The key terms and concepts central to this work include speaker identification, voice activity detection (VAD), Mel Frequency Cepstrum Coefficient (MFCC), speech feature extraction, text-independent speaker recognition, and GUI design. These concepts represent the primary focus areas and research themes explored within the assignment.

End of the excerpt from 26 pages

Details

Title: Speaker Recognition
University: National University of Malaysia (Apu)
Course: Mechatronics
Grade: A
Author: Bandar Hezam
Year of publication: 2019
Pages: 26
Catalog number: V1420967
ISBN (PDF): 9783346980229
ISBN (Book): 9783346980236
Language: English
Tags: Designing a speaker,
Product safety: GRIN Publishing GmbH

Cite this work: Bandar Hezam (Author), 2019, Speaker Recognition, Munich, GRIN Verlag, https://www.grin.com/document/1420967
