The aim of this project was to build a system that identifies the gender of a
speaker from speech. In this dissertation I explain the signal processing
background, such as the Fourier transform and the discrete cosine transform (DCT),
that is needed to understand how digital devices process speech signals.
I also investigated classification techniques such as AdaBoost and Gaussian
Mixture Models, and the main families of methods used in gender identification:
pitch-based methods, acoustic methods and fusion methods.
From this perspective I implemented three types of models (four models in total)
that are explained in the literature and introduced a new method for gender
recognition that combines the Shifted Delta Cepstral (SDC) feature with pitch.
All models were trained and tested on the same amount of speech. The SDC and SDC
fused models gave satisfactory results on the VoxForge dataset. Finally, I tested
the acoustic and fused models on YouTube videos, where they reached almost 90%
accuracy. The results of my implementations are shown in chapter 6.
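To make the SDC feature mentioned above more concrete, the following is a minimal sketch of the usual shifted delta cepstra construction: for each frame, k delta vectors taken p frames apart are stacked, each delta being the difference between the cepstral vectors d frames ahead and d frames behind. The function name and the default values d=1, p=3, k=7 are assumptions chosen for illustration (a configuration often seen in the literature), not necessarily the settings used in the dissertation.

```python
from typing import List, Sequence


def shifted_delta_cepstra(
    frames: Sequence[Sequence[float]], d: int = 1, p: int = 3, k: int = 7
) -> List[List[float]]:
    """Stack k delta blocks per frame.

    Block i for frame t is c(t + i*p + d) - c(t + i*p - d), where c(t) is the
    cepstral vector (for example MFCCs) of frame t. The defaults are
    illustrative assumptions, not values taken from the dissertation.
    """
    n_frames = len(frames)
    sdc = []
    # Emit only frames whose full context [t - d, t + (k - 1)*p + d] exists.
    for t in range(d, n_frames - (k - 1) * p - d):
        stacked: List[float] = []
        for i in range(k):
            ahead = frames[t + i * p + d]
            behind = frames[t + i * p - d]
            stacked.extend(a - b for a, b in zip(ahead, behind))
        sdc.append(stacked)
    return sdc


# Toy input: 30 frames of 2 made-up "cepstral" coefficients that grow linearly.
toy_frames = [[float(t), 2.0 * t] for t in range(30)]
toy_sdc = shifted_delta_cepstra(toy_frames)
```

Each output vector has k times as many entries as an input frame, so 2 coefficients per frame become 14 here. A pitch value per frame could be appended to each vector to form a fused feature, although how the dissertation performs the fusion is not stated in this summary.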
Table of Contents
- Introduction
- Background
- Speech
- Speech Signal
- Speech Signal Processing
- Fourier Transform
- Discrete Cosine Transform
- Digital Filters
- Nyquist-Shannon Sampling Theorem
- Window Functions
- Speech Enhancement
- Signal to Noise Ratio
- Spectral Subtraction
- Cepstral Mean Normalization
- RASTA Filtering
- Voice Activity Detector
- The Empirical Mode Decomposition Method
- The Hilbert Spectrum Analysis
- Voice Activity Detection
- Gender Identification Systems
- Acoustic Features
- Mel Frequency Cepstral Coefficients (MFCC)
- Shifted Delta Cepstral (SDC)
- Pitch Extraction Method
- Pitch Based Models
- Models based on Acoustic Features
- Fused Models
- Learning Techniques for Gender Identification
- Overview
- AdaBoost
- Gaussian Mixture Model (GMM)
- GMM Training
- GMM Testing
- Decision Making
- Likelihood Ratio
- Universal Background Model
- UBM Training
- System Design and Implementation
- Toolboxes
- Signal Processing Toolbox
- Machine Learning Toolbox
- System Design
- Requirement
- Initial Approach
- Algorithm
- Feature Selection
- Experiments and Results
- Pitch Based Models
- Models Based on Acoustic Features
- Fused Model
- YouTube Videos
- Conclusion
- Summary
- Future Recommendation
Objectives and Key Themes
This dissertation investigates and develops a robust gender identification system based on acoustic features extracted from speech signals. The system uses machine learning techniques, including Gaussian Mixture Models (GMMs) and AdaBoost, to identify the gender of a speaker with high accuracy. The dissertation also examines how different acoustic features and model combinations affect gender identification performance.
- Speech signal processing and analysis for gender identification
- Exploration of different acoustic features and their impact on gender classification
- Comparison and evaluation of various machine learning techniques for gender identification
- System design and implementation of a robust and accurate gender identification system
- Analysis of experimental results and evaluation of system performance
Chapter Summaries
The dissertation begins with an introduction that gives background on gender identification and its applications and outlines the motivation and objectives of the research. Chapter 2 covers the fundamental concepts of speech signal processing and the key techniques used in the research. Chapter 3 covers speech enhancement techniques and explains why they matter for improving the quality of speech signals before gender identification. Chapter 4 focuses on gender identification systems based on acoustic features, pitch extraction, and model fusion. Chapter 5 explores learning techniques for gender identification, including Gaussian Mixture Models (GMMs) and AdaBoost, along with the implementation of a Universal Background Model (UBM). Chapter 6 presents the system design and implementation, outlining the tools, algorithms, and feature selection used to develop the gender identification system. Finally, Chapter 7 summarizes the findings of the research, discusses the system performance, and provides recommendations for future work.
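The GMM-based decision step named in the summary above can be illustrated with a small sketch: each gender is represented by a Gaussian mixture, and an utterance is labelled by the average per-frame log-likelihood ratio between the two mixtures. This is a sketch under stated assumptions, not the dissertation's implementation: the function names, the diagonal covariances, the zero threshold, and the toy one-dimensional model parameters are all hypothetical, and in practice the mixtures would be trained on features such as MFCC or SDC.

```python
import math
from typing import List, Sequence, Tuple

# One mixture component: (weight, mean vector, diagonal variance vector).
Component = Tuple[float, Sequence[float], Sequence[float]]


def log_gaussian_diag(x: Sequence[float], mean: Sequence[float], var: Sequence[float]) -> float:
    """Log density of a Gaussian with diagonal covariance."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x: Sequence[float], gmm: List[Component]) -> float:
    """Log-likelihood of one frame under a mixture, via log-sum-exp."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def classify_utterance(
    frames: Sequence[Sequence[float]],
    male_gmm: List[Component],
    female_gmm: List[Component],
    threshold: float = 0.0,  # assumed decision threshold
) -> Tuple[str, float]:
    """Label an utterance by its average log-likelihood ratio over frames."""
    if not frames:
        raise ValueError("at least one feature frame is required")
    llr = sum(
        gmm_log_likelihood(f, male_gmm) - gmm_log_likelihood(f, female_gmm)
        for f in frames
    ) / len(frames)
    return ("male" if llr > threshold else "female"), llr


# Hypothetical single-component, one-dimensional models for illustration only.
toy_male = [(1.0, [0.0], [1.0])]
toy_female = [(1.0, [4.0], [1.0])]
label, score = classify_utterance([[0.0], [0.5]], toy_male, toy_female)
```

Averaging over frames keeps the score comparable across utterances of different lengths. A Universal Background Model would typically serve as a shared starting point from which both gender models are adapted.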
Keywords
This dissertation focuses on gender identification, speech signal processing, acoustic features, machine learning, Gaussian Mixture Models (GMMs), AdaBoost, the Universal Background Model (UBM), and system design and implementation. The research analyzes the effectiveness of various techniques for gender identification and aims to develop a robust and accurate system for practical applications.
- Cite this paper
- Hassam Sheikh (Author), 2013, Who is Speaking? Male or Female, Munich, GRIN Verlag, https://www.grin.com/document/265700