Arabic Handwritten Text Recognition and Writer Identification


Doctoral Thesis / Dissertation, 2017

154 Pages, Grade: 3


Excerpt


List of Contents

Abstract

List of Contents

List of Tables

List of Figures

List of Abbreviations

List of Algorithms

Chapter One: General Introduction
1.1 Introduction
1.2 Handwritten Recognition
1.2.1 Security Applications of Handwritten Recognition
1.2.2 Handwritten Text Recognition and Writer Identification
1.2.3 Handwritten Text Dependent and Text Independent
1.2.4 Arabic Language and Handwritten Recognition System
1.2.5 Handwritten Recognition Approaches
1.3 Literature Survey
1.4 Aim of Thesis
1.5 Thesis Contributions
1.6 Organization of Thesis

Chapter Two: Theoretical Background
2.1 Introduction
2.2 Handwritten Recognition
2.3 Classification of Text Recognition
2.3.1 Online Text Recognition
2.3.2 Offline Text Recognition
2.4 General Structure of Handwritten Text Recognition System
2.4.1 Image Acquisition
2.4.2 Preprocessing
2.4.3 Segmentation
2.4.4 Features Extraction
2.4.5 Classification
2.5 Arabic Handwritten Text Recognition
2.5.1 Features of Arabic Language
2.5.2 Arabic Handwritten Text Recognition Databases
2.6 Handwritten Recognition Applications
2.6.1 Offline Handwritten Recognition
2.6.2 Online Handwritten Recognition
2.7 Evaluation Measures of Handwritten Recognition System

Chapter Three: Proposed Arabic Handwritten Text Recognition System
3.1 Introduction
3.2 Architecture of the Proposed System
3.3 Arabic Handwritten Text Recognition (Module 1)
3.3.1 Image Acquisition Stage
3.3.2 Segmentation Stage
3.3.3 Preprocessing Stage
3.3.4 Features Base Construction Stage
3.3.5 Classification Stage
3.3.6 Post-processing Stage
3.4 Arabic Handwritten Text Writer Identification (Module 2)
3.4.1 Features Base Construction (Module 2)
3.4.2 Classification Stage (Module 2)
3.4.3 Post-processing Stage (Module 2)
3.5 Proposed Handwritten Database

Chapter Four: Experiments and Results Discussion
4.1 Introduction
4.2 Evaluation of the HTRSA System (Module 1)
4.2.1 Arabic Handwritten Database
4.2.2 Handwritten Text Segmentation
4.2.3 Handwritten Text Image Preprocessing
4.2.4 Features Extraction
4.2.5 Features Normalization (FN)
4.2.6 Classification
4.2.7 Number of Images Set
4.3 Evaluation of the HTRSA System (Module 2)
4.3.1 Features Extraction
4.3.2 Classification
4.4 Discussion

Chapter Five: Conclusions and Suggestions
5.1 Conclusions
5.2 Suggestions for Future Work

References

List of Tables

2.1 SVM kernels

2.2 Arabic characters and their forms

2.3 Arabic letters with diacritical points

2.4 AHTR databases

3.1 The energy contained in different DCT coefficients number

3.2 Sample of the proposed Arabic lexicon

3.3 Sample of proposed writers’ lexicon

4.1 Arabic handwritten images from different databases

4.2 Segmentation Results

4.3 Evaluation of proposed thresholding algorithm on IESK-arDB database

4.4 Recognition accuracy after adding noise

4.5 Experimental results of applying the proposed noise removal algorithm

4.6 Experimental results of applying BSE algorithm

4.7 Experimental results with various image sizes

4.8 Comparison of results for different edge detection filters

4.9 Comparison of results for different MHOG1 values

4.10 Experiment results for overlapping approach

4.11 Experiment results for different dividing approaches

4.12 Ordering techniques of selecting coefficients

4.13 Features extraction times of DCT and FCT methods

4.14 Comparison of results for different features extraction methods

4.15 Comparison results of applying FN algorithm

4.16 The recognition accuracy of different Arabic Databases and SVM kernels

4.17 Experiment results of different training and testing images numbers

4.18 Experiment results for different MHOG2 values

4.19 The identification accuracy of the extracted features

4.20 The identification accuracy of different SVM kernels

List of Figures

2.1 Handwritten recognition system

2.2 Classifications of Text Recognition

2.3 Acquisition of the offline and online handwritten text

2.4 General process flow of HTR system

2.5 Sobel convolution kernels

2.6 Image thinning

2.7 The four templates of Stentiford thinning algorithm

2.8 Image gradient

2.9 Cell division

2.10 Histogram of oriented gradient for all image cells

2.11 Image transformation using DCT

2.12 The hyperplane H that separates the two sets of points

2.13 Support vectors

2.14 Example maximum margin (optimal hyperplane)

2.15 linear SVM with soft margin

2.16 Changing the data space

2.17 Cursiveness of Arabic language

2.18 Example of semi-words constituting an Arabic sub-word

2.19 Example of AHDB database

2.20 Example of IESK-arDB database

3.1 Proposed HTRSA system architecture

3.2 Architecture of module1

3.3 Arabic handwritten text image

3.4 Distances feature of the Arabic handwritten text

3.5 Applying the proposed text segmentation Algorithm

3.6 The proposed preprocessing stage of module1

3.7 Image thresholding

3.8 Noise Removal

3.9 Black space elimination

3.10 Image thinning

3.11 Edge image using Sobel detector

3.12 Image scaling

3.13 Proposed features base construction of module1

3.14 Different Arabic descriptors

3.15 Image blocking

3.16 Edge detection

3.17 Image gradient

3.18 Cells division

3.19 Histogram of oriented gradients

3.20 Interpolation votes of gradient orientation

3.21 Histograms concatenation

3.22 Feature vector

3.23 Architecture of proposed SVM one against all approach

3.24 Classification process of module1

3.25 Architecture of writer identification (module2)

3.26 Preprocessing stage of writer identification (module2)

3.27 Proposed features base construction of module2

3.28 Blocks dividing

3.29 Histogram of gradient orientation of MHOG2

3.30 Width, height and centroid of the handwritten sub-image

3.31 Classification approach of module2

3.32 Handwritten character example of the proposed database

3.33 Handwritten sub-word example of the proposed database

3.34 Sample of Arabic text written by same writer

3.35 Handwritten text image example of the proposed database

4.1 Arabic handwritten text documents

4.2 Arabic handwritten images of the text (العام)

4.3 Image segmentation

4.4 Error segmentation

4.5 Mean and Standard deviation of handwritten image for the same Arabic text

4.6 Image thresholding by the proposed thresholding method

4.7 The recognition accuracy of different SVM kernels

List of Abbreviations

illustration not visible in this excerpt

List of Algorithms

2.1 Stentiford

2.2 Fast Cosine Transform (FCT)

3.1 Text Segmentation

3.2 Arabic Handwritten Image Thresholding

3.3 Noise Removal

3.4 Black Space Elimination

3.5 Statistical Features

3.6 DCT Features Extraction

3.7 MHOG1 Features

3.8 Features Normalization

3.9 SVM Training

3.10 SVM Testing

3.11 MHOG2 Features

3.12 Shape Features

Dedication

This thesis is lovingly dedicated to my aunt “Montaha Jasim” (may Allah rest her soul in peace). Her support, encouragement and constant love have sustained me throughout my life.

Mustafa

Supervisor Certification

I certify that this thesis entitled “Arabic Handwritten Text Recognition and Writer Identification” by “Mustafa Salam Kadhm” was prepared under my supervision at the Department of Computer Science of the University of Technology, in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science.

illustration not visible in this excerpt

Linguistic Certification

This is to certify that this thesis entitled “Hand Written Text Recognition for Security Application”, prepared by “Mustafa Salam Kadhm” at the University of Technology / Department of Computer Science, has been reviewed linguistically. Its language was amended to meet the style of the English language.

illustration not visible in this excerpt

Examination Committee Certification

We certify that we have read this thesis entitled “Arabic Handwritten Text Recognition and Writer Identification” and, as examining committee, examined the student “Mustafa Salam Kadhm” in its contents and in what is related to it, and that in our opinion it meets the standards of a thesis for the degree of Doctor of Philosophy in Computer Science.

Acknowledgements

All my thanks are addressed first of all to Almighty Allah, who has guided my steps towards the path of knowledge; without His help and blessing, this thesis would not have progressed or seen the light.

My sincere appreciation is expressed to my supervisor Dr. Alia K. Abdul Hassan for providing me with support, ideas and inspiration.

I am extremely grateful to all members of the Computer Science Department of the University of Technology for their general support.

Finally, I would never have been able to finish my thesis without the help of friends and the support of my family and my wife.

illustration not visible in this excerpt

Abstract

Most governments and organizations hold a huge number of handwritten documents generated by their daily processes. It is imperative to use computers to read the generated handwritten texts and make them editable and searchable. Handwritten recognition has therefore lately become a very popular research topic, and the number of its possible applications is very large. It is capable of resolving complex problems and simplifying human activities by converting handwritten documents into digital form. However, Arabic handwritten text recognition is a complex process compared with other languages because Arabic handwriting is cursive by nature.

Therefore, this thesis proposes an Arabic handwritten text recognition and writer identification system based on segmenting the input handwritten text into handwritten sub-words. The system has two main modules, used for recognizing the handwritten text and for identifying the text’s writer. The first module has six stages that work together to recognize the Arabic handwritten text and convert it into editable text: image acquisition, segmentation, preprocessing, features base construction, classification and post-processing. The second module identifies the writer of the desired text through several stages similar to those of the first module. The system proposes an efficient and accurate segmentation algorithm that segments the input handwritten text into a number of handwritten sub-images, each containing an Arabic handwritten sub-word. Besides that, an image thresholding algorithm based on the fuzzy c-means clustering method is proposed to convert the sub-images into binary form. Furthermore, the binary sub-images pass through a proposed noise removal algorithm in order to remove undesired pixels. After that, two groups of features are extracted from the handwritten sub-images. The first group, used for module 1, includes structural, statistical, Discrete Cosine Transform (DCT) and proposed Modified Histogram of Oriented Gradient (MHOG1) features. The second group, used for module 2, includes proposed MHOG2 and shape features. In addition, the best classification results are obtained by using a Support Vector Machine (SVM) classifier. An Arabic lexicon is proposed for the first module to convert the classified classes into editable Arabic text, and a writers’ lexicon is proposed to assign the classified label to the desired writer. In order to test the system performance, three Arabic handwritten databases are used: the AHDB database, the IESK-arDB database and a proposed Arabic handwritten database. The recognition accuracies obtained from the first module were 96.317% for AHDB, 82% for IESK-arDB and 98% for the proposed database using the SVM polynomial kernel. The identification accuracies of the second module on the proposed handwritten database were 85% at the handwritten sub-word level and 100% at the handwritten text level. Finally, the processing time of the proposed system is 6.2 seconds.

Chapter One General Introduction

1.1 Introduction

Pattern recognition today offers a very wide range of methods supporting the development of numerous applications in many different areas of activity. Its methods and techniques are generally at the heart of simulating "intelligent" tasks, which have certainly infiltrated our daily lives. Information processing techniques are currently undergoing very active development and have growing importance in the field of human-machine interaction. Humans want to communicate with the computer easily, to facilitate and accelerate interaction and information exchange. They seek machines that can be addressed by voice and that are able to read, see, manipulate and quickly analyze the received information [1].

Writing as a means of communication has always been a primary concern of humans. Writing was, and will remain, one of the great foundations of civilization and a mode of excellence in the conservation and transmission of knowledge. Indeed, many objects around us carry writing on paper: signs, product notices, newspapers, books, forms, etc. Enabling the machine to read would allow more information to be captured easily and documents to be processed faster. With the advent of new information technologies, electronic computers and the further increase in the power of machines, the automated processing (editing, searching and archiving) of such documents appears unavoidable. Therefore, a system that makes the machine understand human writing is needed [2].

1.2 Handwritten Recognition

Handwritten recognition is the most crucial part of converting handwritten documents or characters into computer-editable text. It focuses on large repetitive applications with rather large databases, namely: automatic processing of administrative files, automatic sorting of mail, reading of amounts on bank checks, processing of addresses, forms processing, keyboard-free interfaces, analysis of the written gesture, reading of heritage documents, and indexing and searching of library archives in a database. Automating any of these examples is an extremely difficult problem in view of the large variability associated with writers' habits and with the styles and forms of writing (handwritten and cursive). Indeed, the reading activity that is simple for a human is not an easy task for a computer. Thus, accomplishing this task requires that the machine acquire a prior knowledge base of the domain and use a powerful mathematical formalism [2].

1.2.1 Security Applications of Handwritten Recognition

Most security systems and applications use various techniques to achieve higher security against any type of threat. In the pattern recognition field, several common methods have been considered to identify the required user, such as fingerprint, iris and face recognition [3]. Handwritten signature identification has recently been used to identify a user based on his/her signature. Each person has a distinct signature, and every signature has its own physiological or behavioral characteristics. Handwritten signature identification is commonly used for bank checks [4]. On the other hand, handwritten text is also considered a good characteristic for identifying the text's writer.

Like a signature, each person's handwritten text is distinct, and every handwritten text has its own physiological or behavioral characteristics. Since each person can have only one or two signatures, handwritten text gives more features about the writer than a handwritten signature does [4].

Therefore, writer identification from handwritten text can be considered a very satisfying method for security systems and applications. One of the main applications of handwritten text is its use in forensic science and e-government systems. Identifying a person based on an arbitrary handwritten text sample is a useful application. Handwritten text writer identification allows suspects to be determined in conjunction with the inherent characteristics of a crime (as in the case of threat letters). This is different from other biometric techniques, where the relation between the evidence material and the details of an offense can be quite remote [5]. In addition to forensic applications of handwritten text, other applications exist, including forgery detection [6] and writer identification on handwritten musical scores [7].

1.2.2 Handwritten Text Recognition and Writer Identification

Although both handwritten text recognition and handwritten text writer identification are considered parts of the pattern recognition field, they differ in that they seek to maximize opposite characteristics. The objective of handwritten text writer identification is to find the variations in the handwritten text and recognize the uniqueness of each writer, with little regard to the text content. The objective of handwritten text recognition, in contrast, is to seek the features shared by all instances of the same text and identify the text content [8]. Nevertheless, the two fields use quite similar techniques in the features extraction and classification stages, as will be described in chapter three.

1.2.3 Handwritten Text Dependent and Text Independent

Handwritten text recognition and identification can be divided into two categories: text dependent and text independent. Text-dependent recognition and identification systems require a certain known handwritten text to be written, while text-independent systems can work on any given handwritten text [8]. This thesis deals with text-dependent recognition and identification of handwritten text.

1.2.4 Arabic Language and Handwritten Recognition System

Arabic is written and spoken by more than 250 million people. Arabic text is cursive by nature, which leads to lower recognition accuracy than for other languages. Because it is the language of the Muslims' Book (Al-Quran), all Muslims can read Arabic. Besides that, the Arabic language is also important to other languages in the Middle East: its script serves as the main script for languages such as Persian, Urdu, and Kurdish. Thus, the ability to automate the interpretation of written Arabic would have widespread benefits. Moreover, Arabic is the official language in all the institutions of Arab countries, which makes it very important for different applications. Besides, most historical books and documents are written in the Arabic language. Consequently, Arabic handwritten recognition is a much-needed system in many countries, especially countries that want to convert their handwritten works into digital form as part of applying an e-government system [9].

1.2.5 Handwritten Recognition Approaches

Handwritten recognition has two main approaches for recognizing the handwritten text, which differ in how the handwritten text is treated and recognized. The two approaches are:

- The holistic approach
- The analytic approach

The holistic approach generally utilizes shape features extracted from the handwritten text image in an attempt to recognize the entire handwritten text. On the other hand, the analytic approach segments the handwritten text image into primitive components (typically characters) [9].

This thesis deals with handwritten text, not only handwritten words or characters. The holistic approach is adopted: the handwritten text document is segmented into sub-words, the entire handwritten sub-words are recognized without segmenting them into characters, and the output editable text is then built from these recognized sub-words.

1.3 Literature Survey

This section reviews the various methods and approaches that have been used for developing Arabic handwritten recognition systems:

The first Arabic handwritten recognition systems started with recognizing Arabic digits. Since there are only 10 classes for the Arabic digits (٠–٩), researchers achieved high accuracy in recent years. Besides, handwritten digit recognition systems are commonly used for recognizing the numbers on bank checks [10]:

In 2012, Mohd A. developed an Arabic handwritten digit recognition system based on zoning features and Majority Voting (MV) as a classifier. The system achieved 82.7% recognition accuracy [11]. In 2013, Gita S. et al. proposed a digit recognition system using an SVM classifier. The authors used Image Centroid Zone (ICZ) and Zone Centroid Zone (ZCZ) features and obtained 97.7% recognition accuracy [12]. In 2014, Mohamed H. Ghaleb et al. proposed a recognition system using horizontal and vertical moment features and a minimum distance classifier to recognize printed and handwritten Arabic digits using 4500 image samples. The accuracy of the system was 74.9% [13].

In 2014, Mohsen B. et al. proposed a handwritten recognition system for Arabic digits using Local Binary Patterns (LBP) as the base feature extraction method and a Multi-Layer Perceptron (MLP) for classification. The obtained accuracy was 99.59% [14]. In addition, Pawan K. et al. in 2016 proposed an accurate digit recognition system based on moment features and a deep Multi-Layer Perceptron (MLP) classifier. The recognition accuracy of the proposed system was 99.3% [15].

After digit recognition systems, Arabic handwritten character recognition systems followed, and they have been considered in many works. Since Arabic has 28 characters, such recognition systems use 28 classes for isolated characters and more for the different character positions. A handwritten recognition system depends on the accuracy of segmentation. As with handwritten digit recognition, researchers achieved high recognition accuracy for isolated Arabic handwritten characters [16]:

In 2011, Lawgali A. et al. proposed an Arabic handwritten character recognition system using the Discrete Cosine Transform (DCT) and Discrete Wavelet Transform (DWT) for features extraction and an Artificial Neural Network (ANN) for classification. The system used 5600 Arabic handwritten character images and achieved 79.8% and 40.7% recognition accuracy for DCT and DWT respectively [17]. In 2012, Manal A. et al. developed a recognition system that recognizes Arabic characters by segmenting a set of handwritten words into characters and then recognizing the segmented characters. The authors recognized each character by matching the segmented characters with the characters in the used database and obtained an 81% recognition rate [18]. In 2014, Lawgali A. et al. proposed a framework for Arabic handwritten recognition based on character segmentation. DCT was used for features extraction and an ANN for classification. The proposed framework achieved 90.7% recognition accuracy [19].

In 2015, Farah M. used the Haar wavelet transform (HWT), zoning features and a Mahalanobis Distance (MD) classifier to develop a handwritten recognition system for Arabic characters. The recognition rate achieved by the proposed system was 73% [20]. Mohamed E. et al. in 2016 achieved 98.3% recognition accuracy using a Convolutional Neural Network (CNN) based Support Vector Machine (SVM) model to develop an offline Arabic character recognition system [21].

In recent years, researchers started working with the holistic approach, recognizing the whole Arabic handwritten word directly without segmenting it into characters. This type of system allows researchers to avoid the character segmentation problems, which remain a big challenge and an open research area. Besides, Arabic handwritten words give more features than characters, which leads to better features extraction results. Other researchers, however, segment the handwritten word into characters, recognize the characters and then recover the original words [22]:

In 2014, Moftah E. et al. presented offline handwritten recognition based on Gabor Wavelets (GW) and explicit segmentation. About 600 handwritten word images taken from the IESK-arDB database and about 200 handwritten word images from the IFN/ENIT (Institute of Communications Technology (IFN) / National School of Engineers of Tunis (ENIT)) database were used to evaluate the system. In preprocessing, the Hough Transform (HT) and Local Minima Regression (LMR) were used to correct the skew of the handwritten word images. Besides, each handwritten word image was segmented into a number of segments. GW was used to extract the features of the segmented images, and a Support Vector Machine (SVM) was applied to classify each handwritten word segment into its desired class. The recognition rates achieved by the proposed system for the IFN/ENIT and IESK-arDB databases were 71% and 55% respectively [23].

In 2015, Moftah E. et al. presented an offline handwritten recognition system based on three classifiers. Their system investigates the application of the probabilistic discriminative Conditional Random Fields (CRFs) and their extension, the hidden-state CRFs (HCRFs), to the problem of offline Arabic handwritten recognition. First, the word images are segmented into characters, and a shape descriptor is used to extract the features of each handwritten character image. After that, HMMs, CRFs and HCRFs were used for classification. The system used 800 handwritten word images from the IESK-arDB database, and the overall recognition rates obtained were 72%, 73%, and 75% when testing HMMs, CRFs, and HCRFs on the samples of the testing set respectively [24].

In 2016, Khaoula J. et al. proposed an Arabic handwritten recognition system based on Dynamic Bayesian Networks (DBN). The authors selected several handwritten images from the IFN/ENIT database. The system preprocessed and normalized the handwritten word images and segmented them into characters; then Zernike moment invariants and Hu moment features were extracted for each character. The used feature descriptors generate continuous features; therefore, the k-means method was used to quantize these feature vectors. Finally, the DBN was used for classification, and the recognition rates achieved ranged from 63% to 78.5% [25].

In addition, Arabic handwritten text recognition is the most recent issue in the handwritten recognition field. It is the most difficult recognition task, since it deals not only with handwritten words and characters but with handwritten text that contains characters, words, and sub-words. The newest research that recognizes Arabic handwritten text is illustrated in the following:

In 2016, Hicham E. et al. proposed an Arabic handwritten recognition system based on the Hidden Markov Model Toolkit (HTK). The proposed system was applied to an "Arabic-Numbers" data corpus which contains 47 handwritten words from the AHDB database and 1905 sentences. These sentences were written by five different people. The input to the system is a handwritten text image that has three lines, each line containing three or four handwritten words. The proposed system segments the input handwritten text into separate line images using a technique based on the horizontal projection profile of the input image, then extracts the features from each line image using a sliding window technique. The system achieved a rate of 80.33%. The authors mentioned that their system can be used in text recognition for bank checks or other domains [26].

The results published in the literature show that the obtained recognition rates are restricted to limited writing classes and constrained to particular aspects of the handwritten databases used. Besides, the existing works focus on handwritten word or handwritten character recognition, either recognizing the handwritten word directly, segmenting the handwritten word into characters, or segmenting the handwritten text into lines. In addition, all the used handwritten databases contain only gray character or word images. Moreover, the existing works address only one or at most two handwritten databases with a limited number of classes. On the other hand, no work has been done on handwritten text recognition by segmenting the text into sub-words and then recognizing the segmented sub-words. Also, there is no system that both recognizes Arabic handwritten text and identifies the text's writer. Therefore, Arabic handwritten text recognition and writer identification are still subjects of active research in various areas.

1.4 Aim of Thesis

The aim of the thesis is to develop an accurate handwritten text recognition system based on multi-scale features extraction methods that also identifies the writer of the input handwritten text, such that the overall system may be considered an e-services unit and a step in developing e-government. A further aim is to develop an Arabic handwritten database with colored and gray handwritten images that supports character, word and text recognition systems and can be used for security applications.

1.5 Thesis Contributions

The main contributions of this thesis can be summarized as follows:

1. An Arabic handwritten database has been proposed. The proposed database has characters, words and texts written by several writers of different ages and educational backgrounds.

2. A simple and practical segmentation algorithm for segmenting the Arabic handwritten text into a set of sub-words is proposed.

3. An efficient preprocessing stage has been proposed for the Arabic handwritten text and successfully applied to the proposed database. The proposed phases of the preprocessing include:

- A thresholding algorithm is proposed for converting the gray handwritten image into binary form, based on a proposed binarization algorithm that combines the intensity of the image pixels with the Fuzzy C-Means (FCM) clustering method.
- A noise removal algorithm is proposed for removing unwanted pixels from the binary handwritten image, based on two thresholds derived from the characteristics of the Arabic language and without losing any desired information about the handwritten text shape.
- A black space elimination algorithm is proposed for removing the undesired pixels in the handwritten image background, which do not give any feature of the handwritten text.

4. Several methods are used for extracting the most appropriate features to recognize the Arabic handwritten text and identify its writer. The following features extraction algorithms are proposed:

- To extract suitable features from the handwritten sub-images, statistical and structural features extraction algorithms are proposed. The proposed algorithms extract the features based on the structure of the Arabic text and on the intensity distribution of the pixels in the images, using mathematical computations.
- Features extraction algorithms based on modifying the Histogram of Oriented Gradient (HOG) method are proposed for text recognition and writer identification. Besides, edge detection filters for the diagonal and anti-diagonal directions are proposed to detect the handwritten image edges.

5. A features scaling algorithm has been proposed for reducing the system processing time by scaling all the extracted features into the range [-1, 1], which makes the computation process simple.

6. Employing the Support Vector Machine (SVM) for the multi-class classification process to classify the Arabic handwritten texts and writers using several kernels.

7. An Arabic lexicon and a writers' lexicon are proposed. The Arabic lexicon is used for mapping the classified handwritten texts to their corresponding editable Arabic texts, while the writers' lexicon is used for mapping the classified labels to the desired writers of the handwritten text.

1.6 Organization of Thesis

The thesis is structured in five chapters; a brief description of their contents is given here:

Chapter two describes the handwritten recognition system and its types, the concept of handwritten recognition and its applications, and gives an overview of Arabic handwritten recognition and the characteristics of the Arabic language.

Chapter three gives a presentation of the proposed recognition and identification algorithms that are used to design the proposed system and the implementation of each one.

Chapter four discusses the experimental results obtained from implementation of the proposed system.

Chapter five highlights the conclusions and lists a number of suggestions for future work.

Chapter Two Theoretical Background

2.1 Introduction

Handwritten Recognition (HR) is an active research area in artificial intelligence, pattern recognition and computer vision. The field of text recognition has achieved great success in real-world target applications, especially in e-government systems, security applications and other fields [27]. In this chapter, the classifications of handwritten recognition and their general process flow are explained. For each step, a brief overview of the techniques and methods used is given.

2.2 Handwritten Recognition

Handwritten recognition is the process of converting a handwritten text image into a text file that is understandable by the computer and usable for many purposes. Advances in handwritten recognition have aided the automation of many demanding tasks in our daily life. Many applications depend on handwriting recognition, such as postal address reading for mail sorting purposes, cheque recognition and word spotting on a handwritten text page. Naturally, Arabic handwritten text is cursive and more difficult to recognize than printed text due to several factors: the writer's style, the quality of the paper, and geometric factors controlled by the writing conditions, such as tracing that is very unsteady in shape and quality. Moreover, there are several types of recognition, which are [27]:

- Numeral (digit) recognition
- Character recognition
- Word recognition
- Text recognition

This thesis is concerned with the handwritten text recognition system. Handwritten text recognition is the most difficult type of recognition system: it deals with text documents or pages that contain several handwritten words and characters, which makes the recognition process more difficult and challenging. An example of a handwritten recognition system is shown in Figure 2.1.

illustration not visible in this excerpt

Figure 2.1: Handwritten recognition system.

2.3 Classification of Text Recognition

Text recognition systems are mainly classified into offline text and online text. Subsequently, offline text is classified into two subcategories of handwritten and typed text recognition. Figure 2.2 shows general text recognition classification [28].

illustration not visible in this excerpt

Figure 2.2: Classifications of text recognition.

2.3.1 Online Text Recognition

In online text recognition, the handwritten text is collected and recognized in real time. A special digitizer tablet and pen are used to generate this type of input. A digitizer is an electromagnetic tablet which transfers the coordinates of the pen position to the computer at a constant rate. Personal Digital Assistants (PDAs) and tablets are clear examples of devices generating text for online recognition. Online text recognition is less difficult than offline text recognition, since dynamic information is usually available, such as the number of strokes, the order and direction of each stroke, and the writing speed within each stroke. This valuable information assists in the recognition of documents and frequently leads to better performing systems compared with offline text recognition [29].

2.3.2 Offline Text Recognition

In offline text recognition, the text is produced using an ordinary pen and paper. Thus, offline recognition methods use scanned images of the handwriting. The images are normally first enhanced, and then features are extracted from them by means of digital image processing techniques. Offline text recognition is more difficult than online text recognition due to the noise created by the input devices and the great variability found in human writing. Personal writing characteristics have an important influence, leading to very different visual appearances of the same handwritten character. Moreover, handwritten text is more difficult than printed text: printed text has a stable writing style, while handwritten text is cursive, which makes the recognition process tougher [29]. Figure 2.3 illustrates the common ways to acquire offline and online handwritten text.

illustration not visible in this excerpt

Figure 2.3: Acquisition of the offline and online handwritten text.

2.4 General Structure of Handwritten Text Recognition System

In this part, the general structure of a Handwritten Text Recognition (HTR) system is described. The input to the system is a handwritten text image, and the output is the class labels that represent the desired handwritten text. A typical text recognition system has several stages: image acquisition, preprocessing, segmentation, features extraction and classification. However, some researchers omit or merge these stages [30]. Figure 2.4 shows the general process flow of the HTR system.

illustration not visible in this excerpt

Figure 2.4: General process flow of HTR system.

2.4.1 Image Acquisition

The first step in an HTR system is to digitize the handwritten text document into a form suited to the digital processing system with as few impairments as possible. In offline mode, depending on the acquisition tool used (scanner or camera), a color or gray-level image is obtained.

2.4.2 Preprocessing

Preprocessing is an essential stage of any recognition system and is performed after the acquisition process. Generally, the preprocessing is not specific to the recognition of handwritten text but consists of conventional operations from the image processing field. Preprocessing prepares the image for the next stage of analysis. Its essential purpose is to reduce the noise superimposed on the data while keeping as much significant information as possible. The noise may be due to the acquisition device, the acquisition conditions (lighting, incorrect document formatting) or the quality of the original document. The preprocessing operations generally used include thresholding, noise removal, edge detection, image thinning, and image normalization [31].

A. Binarization

Binarization is the process of converting a gray image into a binary one composed of the two values 0 and 1, which makes the image easier to process. In general, an appropriate binarization threshold reflects the limits of high and low contrast in the image. For images of low or variable contrast, it is difficult to set the threshold to a specific value. A classical method determines a binarization threshold by calculating the grayscale histogram of the image. The threshold value is set to the gray level lying in the valley between the two peaks of the histogram. The pixels having a gray level above this threshold belong to the background, and those with a lower value belong to the object (foreground) [32].

- Fuzzy C-Means Clustering (FCM)

Fuzzy c-means (FCM) is a clustering method which allows one piece of data to belong to two or more clusters. This method (developed by Dunn in 1973 and improved by Bezdek in 1981) is frequently used in pattern recognition. There are two main processes in fuzzy c-means clustering: the calculation of the cluster centers and the assignment of points to these centers using the Euclidean distance. The process is repeated until the cluster centers are stable. For each item of the data, FCM assigns a membership value to each cluster within the range 0 to 1; it thus uses the fuzzy-set concept of partial membership and forms overlapping clusters. A fuzzification parameter m in the range [1, n] is needed to indicate the degree of fuzziness of the clusters. FCM depends on minimizing the objective function described in Equation 2.1 [33].

$J_m = \sum_{i=1}^{N} \sum_{j=1}^{C} u_{ij}^{m} \, \lVert x_i - c_j \rVert^{2} \quad (2.1)$

Where m is any real number greater than 1, C is the number of clusters, N is the number of data points, x_i is the i-th d-dimensional measured datum, c_j is the d-dimensional center of cluster j, u_ij is the degree of membership of x_i in cluster j, and ||·|| is any norm expressing the similarity between a measured datum and a center. Through an iterative optimization of the objective function in Equation 2.1, fuzzy partitioning is carried out with the update of the memberships u_ij and the cluster centers c_j by Equation 2.2.

$u_{ij} = \dfrac{1}{\sum_{k=1}^{C} \left( \dfrac{\lVert x_i - c_j \rVert}{\lVert x_i - c_k \rVert} \right)^{\frac{2}{m-1}}} \quad (2.2)$

Where ||x_i − c_j|| is the distance between point i and the current cluster center c_j, and ||x_i − c_k|| is the distance between point i and the other cluster centers k. Equation 2.3 is used to find the d-dimensional center c_j of the cluster for the memberships u_ij:

$c_j = \dfrac{\sum_{i=1}^{N} u_{ij}^{m} \, x_i}{\sum_{i=1}^{N} u_{ij}^{m}} \quad (2.3)$

The fuzzy partitioning process stops when $\max_{ij} \lvert u_{ij}^{(k+1)} - u_{ij}^{(k)} \rvert < \varepsilon$, where ε is between 0 and 1 and k is the iteration stage [34].
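To make the update rules of Equations 2.1–2.3 concrete for binarization, the following is a minimal Python sketch that clusters the pixel intensities into C = 2 clusters (ink and background) and thresholds on the membership. It is an illustration only, not the thesis's proposed thresholding algorithm (Algorithm 3.2), and the function name, initialization and parameter defaults are assumptions made for this sketch.

```python
import numpy as np

def fcm_binarize(gray, m=2.0, eps=1e-3, max_iter=100):
    """Binarize a gray image with fuzzy c-means, C = 2 clusters.

    Implements the update loop of Equations 2.1-2.3: centers c_j and
    memberships u_ij are updated until max|u^(k+1) - u^(k)| < eps.
    """
    x = gray.astype(np.float64).ravel()              # N pixel intensities
    u = np.random.default_rng(0).random((2, x.size))
    u /= u.sum(axis=0)                               # memberships sum to 1
    p = 2.0 / (m - 1.0)
    for _ in range(max_iter):
        um = u ** m
        c = (um @ x) / um.sum(axis=1)                # Eq. 2.3: center update
        d = np.abs(x[None, :] - c[:, None]) + 1e-12  # ||x_i - c_j|| (1-D data)
        inv = d ** -p
        u_new = inv / inv.sum(axis=0)                # Eq. 2.2: membership update
        done = np.abs(u_new - u).max() < eps         # stopping criterion
        u = u_new
        if done:
            break
    ink = int(np.argmin(c))                          # darker center = ink cluster
    return (u[ink] > 0.5).reshape(gray.shape).astype(np.uint8)
```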

B. Noise Removal

The image of the handwritten text may be subject to noise introduced during acquisition and transmission. This noise consists of undesired pixels that may appear in the binary image after the thresholding process. Noise removal examines the neighborhood of each pixel and eliminates isolated pixels (cleaning) [35]. A frequently used noise removal method is the median filter, which is applied by sliding a 3 × 3 window over the image and changing the value of each pixel based on the values of its 8 neighbors. The median filter sorts the neighborhood pixels in ascending order and then takes the middle value as the median [36].
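As a concrete illustration of the 3 × 3 median filter just described, a minimal sketch using SciPy follows; the function name is an assumption for this sketch, and this generic filter is distinct from the thesis's proposed noise removal algorithm (Algorithm 3.3).

```python
import numpy as np
from scipy.ndimage import median_filter

def clean_binary(binary_img):
    """3x3 median filtering: each pixel is replaced by the median of its
    3x3 neighborhood, so isolated noise pixels are removed."""
    return median_filter(binary_img, size=3)

# example: a lone foreground pixel in a background region disappears
img = np.zeros((5, 5), dtype=np.uint8)
img[2, 2] = 1
assert clean_binary(img)[2, 2] == 0
```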

C. Edge Detection

Edge detection is a research field that belongs to image processing and computer vision. It identifies areas of a digital image corresponding to rapid changes in light intensity. These changes include discontinuities in depth, in the orientation of a surface, in the properties of a material and in scene illumination. Detecting the edges of an image significantly reduces the amount of data and eliminates less relevant information while preserving the important structural properties. Several edge detection operators are used to detect image edges; the most common are the Canny, Sobel, Roberts and Prewitt detectors [37]. Sobel is used in the proposed work of this thesis.

Sobel Edge Detector

The Sobel operator is used in image processing for edge detection. Basically, the operator calculates the gradient of the intensity at each pixel. This indicates the direction of the largest change from light to dark and the rate of change in that direction. From this, the points of sudden change in brightness, which possibly correspond to edges, and the orientation of these edges become known [38]. The operator uses matrix convolution: each kernel undergoes a convolution with the image to calculate the horizontal and vertical derivatives, as shown in Figure 2.5.

illustration not visible in this excerpt

Figure 2.5: Sobel convolution kernels. (a) Vertical direction, (b) Horizontal direction.

At each point, the approximations of the horizontal and vertical gradients can be combined as in Equations 2.4 and 2.5 to get approximations of the gradient magnitude and direction [38]:

$g = \sqrt{g_x^{2} + g_y^{2}} \quad (2.4)$

where g is the gradient magnitude and g_x, g_y are the horizontal and vertical derivatives.

$\theta = \tan^{-1}\!\left(\dfrac{g_y}{g_x}\right) \quad (2.5)$

where θ is the gradient direction.
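As a concrete illustration of Equations 2.4 and 2.5, the following minimal sketch applies the Sobel kernels of Figure 2.5 with SciPy; the binarization threshold and function name are assumptions for this sketch, not values from the thesis.

```python
import numpy as np
from scipy.ndimage import convolve

# Sobel kernels (Figure 2.5): KX approximates the horizontal derivative
# (responding to vertical edges), KY = KX.T the vertical derivative.
KX = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=np.float64)
KY = KX.T

def sobel_edges(gray, threshold=100.0):
    gx = convolve(gray.astype(np.float64), KX)  # horizontal derivative
    gy = convolve(gray.astype(np.float64), KY)  # vertical derivative
    g = np.hypot(gx, gy)                        # Eq. 2.4: gradient magnitude
    theta = np.arctan2(gy, gx)                  # Eq. 2.5: gradient direction
    return (g > threshold).astype(np.uint8), theta
```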

D. Image Thinning

A thinning algorithm is a morphological operation that is used to remove selected foreground pixels from a binary image. It preserves the topology (extent and connectivity) of the original region while discarding most of the original foreground pixels [39]. Figure 2.6 shows the result of a thinning operation on a simple binary image. In image thinning, the template-based mark-and-delete thinning algorithms are very popular because of their reliability and effectiveness. This type of thinning uses templates, where a match of the template in the image deletes the centre pixel. These are iterative algorithms which erode the outer layers of pixels until no more layers can be removed. Almost all iterative thinning algorithms use mark-and-delete templates, including the Stentiford, Zhang-Suen and Guo-Hall algorithms [40].

illustration not visible in this excerpt

Figure 2.6: Image thinning. (a) Original image, (b) Thinned image.

In this thesis, the Stentiford thinning algorithm, which uses connectivity numbers to mark and delete pixels, is used. The connectivity number is a measure of how many objects are connected to a particular pixel. Equation 2.6 is used to calculate the connectivity number.

$C_n = \sum_{k \in \{1,3,5,7\}} \left( \overline{N_k} - \overline{N_k}\,\overline{N_{k+1}}\,\overline{N_{k+2}} \right), \qquad \overline{N} = 1 - N, \quad N_9 = N_1 \quad (2.6)$

Where N_k is the color value of the k-th of the eight neighbours of the analysed pixel, N_0 is the centre pixel, N_1 is the color value of the pixel to the right of the central pixel, and the rest are numbered in counterclockwise order around the centre. The Stentiford algorithm uses a set of four 3 × 3 templates to scan the image. Figure 2.7 shows the templates of the Stentiford thinning algorithm.

illustration not visible in this excerpt

Figure 2.7: The four templates of Stentiford thinning algorithm.

The white circle represents a white pixel with a value of 255, and the black circle represents a black pixel with a value of zero. These templates scan the image in the following order:

- T1 - from left to right and top to bottom.
- T2 - from bottom to top and from left to right.
- T3 - from right to left and from bottom to top.
- T4 - from top to bottom and from right to left.

The endpoint pixels are the pixels that are connected to only one other pixel; that is, a black pixel is an endpoint if it has only one black neighbour out of the eight possible neighbours. Algorithm 2.1 shows the main processing steps of the Stentiford thinning algorithm [40]; a minimal sketch of its per-pixel tests is given below.

illustration not visible in this excerpt
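Since Algorithm 2.1 is not visible in this excerpt, the following sketch shows the two per-pixel tests it relies on: the connectivity number of Equation 2.6 (written here in the common Yokoi form, an assumption about the exact variant used) and the endpoint test.

```python
def connectivity_number(nbrs):
    """Connectivity number of a pixel from its 8 neighbours (Eq. 2.6).

    nbrs: list [N1..N8] of 0/1 values (1 = foreground), N1 the pixel to
    the right of the centre, the rest counterclockwise. A thinning pass
    may delete a pixel only if its connectivity number equals 1.
    """
    n = [1 - v for v in nbrs]       # complemented (background) values
    n.append(n[0])                  # wrap around: N9 = N1
    c = 0
    for k in (0, 2, 4, 6):          # N1, N3, N5, N7 (0-based indices)
        c += n[k] - n[k] * n[k + 1] * n[k + 2]
    return c

def is_endpoint(nbrs):
    """A foreground pixel with exactly one foreground neighbour."""
    return sum(nbrs) == 1

# a pixel whose only neighbour is N1 has connectivity 1 and is an endpoint
assert connectivity_number([1, 0, 0, 0, 0, 0, 0, 0]) == 1
assert is_endpoint([1, 0, 0, 0, 0, 0, 0, 0])
```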

E. Image Scaling

The handwritten images in the databases have various sizes and resolutions. Recognition systems are sensitive to small variations in size and position, as is the case for template matching and correlation methods. Scaling the images seeks to reduce variations between images due to the size of the handwritten text and thereby improve the performance of the recognizer. Therefore, adjusting the handwritten text images to a standard size such as 128×128, 64×64 or 32×32 is needed. The common method is image scaling that preserves the aspect ratio [41].
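A minimal sketch of aspect-ratio-preserving scaling using Pillow follows; the target size, padding color and function name are assumptions for this sketch, and a grayscale ('L') input image is assumed.

```python
from PIL import Image

def scale_preserve_aspect(img, target=(64, 64), fill=255):
    """Scale a grayscale ('L') handwritten sub-image to a standard size,
    preserving its aspect ratio and padding with the background color."""
    w, h = img.size
    s = min(target[0] / w, target[1] / h)        # largest scale that still fits
    resized = img.resize((max(1, round(w * s)), max(1, round(h * s))))
    canvas = Image.new("L", target, fill)        # white background canvas
    canvas.paste(resized, ((target[0] - resized.width) // 2,
                           (target[1] - resized.height) // 2))
    return canvas
```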

2.4.3 Segmentation

Segmentation is a critical and decisive stage in several recognition systems. It is defined as the operation that breaks down the handwritten text image into isolated forms, which could be handwritten words, sub-words, characters or sub-characters. However, this separation is not always possible. Generally, the performance of the segmentation directly affects the reliability of the overall handwritten text recognition system. Two techniques are used in handwritten text recognition systems [42]:

A. Implicit segmentation: segments the handwritten text into lower-level parts called graphemes (character and word components) and finds characters and words by combining those graphemes.

B. Explicit segmentation: segments the handwritten text exactly into sub-words or characters using general writing properties of the sub-words and characters. Explicit segmentation is used in this thesis.

2.4.4 Features Extraction

For decision making, the handwritten recognition system only needs the information relevant for differentiating one object from another. For this purpose, a features extraction step is performed. This is a critical step in the construction of a handwritten text recognition system. However, features extraction in handwritten text recognition faces the major problem of intra-class variability. Indeed, a character can take different forms depending on its position in the text [43].

On the other hand, the largest changes are introduced by the writer. Writing is personal to each individual, and the traces produced by two persons writing the same text can be quite different. Moreover, for the same writer, a number of constraints affect the realization of his/her writing. The best recognition depends on successful features extraction methods, and many such methods have been proposed for handwritten recognition [44]. The features are generally classified into three main categories: structural features, statistical features and global transformations [45].

A. Structural Features

Structural features describe the geometrical and topological properties of a pattern, both global and local. The structural features depend on the kind of pattern to be classified. For Arabic handwritten text, the features include [46]:

- The number of strokes, their sizes, directions and slopes.
- The extreme points (end points).
- The number of loops.
- The number of dots.
- The number of intersection.
- The number of connected components.

In general, structural features are challenging to extract from Arabic handwritten text images, and many errors occur because of the small differences between Arabic characters [47].

B. Statistical Features

The statistical features are extracted from the statistical distribution of pixels and describe the characteristic measurements of the input image pattern. Statistical features provide low complexity and high speed. The major statistical features can be summarized as: histograms of chain code directions, pixel densities, moments, Fourier descriptors and the histogram of oriented gradient [48].

1. Histogram of Oriented Gradient [49]

The Histogram of Oriented Gradient (HOG) was first introduced by Dalal and Triggs for human body detection, but it is now one of the most successful and popular descriptors in computer vision and pattern recognition. It describes the overall texture of the image by dividing the image into a grid of regions and then concatenating the histograms of oriented gradients of these regions into a single vector. Given an image intensity I, computing the HOG involves five main steps:

1) Calculating the image gradient.
2) Dividing the image into cells.
3) Calculating a HOG for all cells.
4) Normalizing the HOG of each cell.
5) Concatenating the HOGs of all cells into one vector.

The gradient of an image is a vector representing the variation in intensity relative to movement in the horizontal and vertical directions. Two filters are used to calculate the gradient of the image in the horizontal and vertical directions:

$H_x = \begin{bmatrix} -1 & 0 & 1 \end{bmatrix} \quad (2.7), \qquad H_y = \begin{bmatrix} -1 & 0 & 1 \end{bmatrix}^{T} \quad (2.8)$

By applying these two filters, the gradients GH(x, y) and GV(x, y) of the image I are given by Equations 2.9 and 2.10 respectively.

$G_H(x, y) = I(x+1, y) - I(x-1, y) \quad (2.9)$

$G_V(x, y) = I(x, y+1) - I(x, y-1) \quad (2.10)$

illustration not visible in this excerpt

Figure 2.8: Image gradient. (a) Original image, (b) Horizontal gradients, (c) Vertical gradients, (d) Magnitude of gradient.

As for any vector, the gradient magnitude NG(x, y) and its orientation θG(x, y) are found from GH(x, y) and GV(x, y) by Equations 2.4 and 2.5. Figure 2.8 (d) shows the calculated image gradient. After calculating the image gradient, the image is divided into cells that cover it entirely. All the cells have equal dimensions, such as 4×4 or 8×8 pixels. An example of the cell division process is illustrated in Figure 2.9.

illustration not visible in this excerpt

Figure 2.9: Cell division.

For each cell, a HOG is calculated using the gradients of all pixels. Each pixel position (x, y) is involved in computing the HOG of n components by Equation 2.11.

$\mathrm{HOG}(k) = \sum_{(x,y)\,\in\,\text{cell}} N_G(x, y)\,\delta\!\left(k,\ \operatorname{bin}\!\left(\theta_G(x, y)\right)\right), \quad k = 1, \dots, n \quad (2.11)$

The number of HOG components n is configurable; it sets the orientation precision of the gradients. Figure 2.10 shows the configuration used for the HOG: the gradient image is divided into a 3×3 grid of cells, and for each cell a HOG of 9 bins is extracted [49].

illustration not visible in this excerpt

Figure 2.10: Histogram of oriented gradient for all image cells.

The next step of the HOG descriptor is normalization. This step normalizes the HOG of each region independently of the others. Dalal and Triggs mention normalizations based on the L1 and L2 norms:

$\mathrm{HOG}_{L1} = \dfrac{\mathrm{HOG}}{\lVert \mathrm{HOG} \rVert_1 + \epsilon} \quad (2.12)$

$\mathrm{HOG}_{L1\text{-}sqrt} = \sqrt{\dfrac{\mathrm{HOG}}{\lVert \mathrm{HOG} \rVert_1 + \epsilon}} \quad (2.13)$

$\mathrm{HOG}_{L2} = \dfrac{\mathrm{HOG}}{\sqrt{\lVert \mathrm{HOG} \rVert_2^{2} + \epsilon^{2}}} \quad (2.14)$

Where HOG denotes the standard histogram of oriented gradient and ε is a regularization term (constant). After calculating the normalized HOGs of the M cells of the image, a vector HOGv that concatenates all the HOGs is built as in Equation 2.15:

$\mathrm{HOG}_v = \left[ \mathrm{HOG}_1, \mathrm{HOG}_2, \dots, \mathrm{HOG}_M \right] \quad (2.15)$
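To make the five steps concrete, here is a plain-HOG sketch covering gradient, cells, per-cell histogram, normalization and concatenation. It is not the thesis's MHOG1/MHOG2 modification; the cell size, the 9-bin unsigned orientation and the L2 normalization are assumptions for this sketch, and block overlapping and vote interpolation are omitted.

```python
import numpy as np

def hog_descriptor(gray, cell=8, n_bins=9, eps=1e-6):
    """Minimal HOG: per-cell orientation histograms of gradient magnitude."""
    img = gray.astype(np.float64)
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]        # [-1 0 1] filter (Eq. 2.9)
    gy[1:-1, :] = img[2:, :] - img[:-2, :]        # [-1 0 1]^T filter (Eq. 2.10)
    mag = np.hypot(gx, gy)                        # gradient magnitude (Eq. 2.4)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0  # unsigned orientation (Eq. 2.5)
    H, W = img.shape
    feats = []
    for cy in range(0, H - cell + 1, cell):       # divide the image into cells
        for cx in range(0, W - cell + 1, cell):
            m = mag[cy:cy + cell, cx:cx + cell].ravel()
            a = ang[cy:cy + cell, cx:cx + cell].ravel()
            bins = np.minimum((a / (180.0 / n_bins)).astype(int), n_bins - 1)
            hist = np.bincount(bins, weights=m, minlength=n_bins)  # Eq. 2.11
            hist /= np.sqrt((hist ** 2).sum() + eps ** 2)          # L2 norm (Eq. 2.14)
            feats.append(hist)
    return np.concatenate(feats)                  # HOGv (Eq. 2.15)
```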

2. Aspect Ratio (AR)

The AR of an image describes the proportional relationship between its width and its height. The AR is commonly used in image normalization, features extraction and evaluation. It is defined by Equation 2.16 [50]:

$AR = \dfrac{W}{H} \quad (2.16)$

where W and H are the image width and height.

C. Global Transformation

Transformation schemes convert the pixel representation of the pattern into a more compact form, which reduces the dimensionality of the features [50]. In this thesis, the Discrete Cosine Transform (DCT) is used for extracting the handwritten text features.

Discrete Cosine Transform Features (DCT)

The DCT converts the pixel values of an image in the spatial domain into its elementary frequency components in the frequency domain [51]. Given an image f (x, y), its 2D DCT transform is defined in Equation 2.17.

$F(u, v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x, y) \cos\!\left[ \frac{(2x+1)u\pi}{2N} \right] \cos\!\left[ \frac{(2y+1)v\pi}{2N} \right] \quad (2.17)$

$\alpha(u) = \begin{cases} \sqrt{1/N} & u = 0 \\ \sqrt{2/N} & u \neq 0 \end{cases} \quad (2.18)$

Figure 2.11 illustrates the DCT transformation bands. The image is decomposed into 8×8 blocks to make the computation fast, resulting in DCT blocks of dimension 8×8. In each DCT block, the Low Frequency (LF) band represents the lowest frequency coefficients, the High Frequency (HF) band represents the higher frequency coefficients of the block, and the Middle Frequency (MF) band represents the middle frequency coefficients [51].

illustration not visible in this excerpt

Figure 2.11: Image transformation using DCT. (a) Transformation of spatial domain to frequency domain (b) DCT bands.

Due to its strong energy compaction capability, the DCT is a useful tool for pattern recognition applications. The DCT can contribute to a successful recognition system together with classification techniques such as the Support Vector Machine (SVM) and Artificial Neural Network (ANN) [52]. Furthermore, the main advantage of using the DCT is the removal of the redundancy among neighboring pixels, which provides uncorrelated transform coefficients that can be used independently [53].
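As a sketch of turning DCT coefficients into features, the following transforms each 8×8 block and keeps its first low-frequency coefficients in zig-zag order; the number of coefficients kept is an assumption for this sketch (the thesis's own selection is studied in Table 3.1 and Algorithm 3.6).

```python
import numpy as np
from scipy.fft import dctn

def dct_block_features(gray, block=8, keep=10):
    """2D DCT of each 8x8 block (Eq. 2.17), keeping the first low-frequency
    coefficients in zig-zag order, where most of the energy concentrates."""
    h, w = gray.shape
    h -= h % block
    w -= w % block                                # crop to a multiple of 8
    # zig-zag index order within one block (JPEG-style traversal)
    idx = sorted(((x, y) for x in range(block) for y in range(block)),
                 key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]))
    feats = []
    for by in range(0, h, block):
        for bx in range(0, w, block):
            coeffs = dctn(gray[by:by + block, bx:bx + block].astype(np.float64),
                          norm='ortho')           # orthonormal 2D DCT
            feats.extend(coeffs[x, y] for x, y in idx[:keep])
    return np.asarray(feats)
```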

J. Makhoul [54] proposed a fast computation of the DCT via the Fast Fourier Transform (FFT), called the Fast Cosine Transform (FCT). The DCT of an N-point real signal is derived by taking the Discrete Fourier Transform (DFT) of a 2N-point even extension of the signal, using only an N-point DFT of a reordered version of the original signal. The implementation of the FCT is illustrated in Algorithm 2.2.

illustration not visible in this excerpt
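Since Algorithm 2.2 is not visible in this excerpt, here is a minimal sketch of Makhoul's method as described above: the N-point DCT-II obtained from a single N-point FFT of a reordered copy of the signal.

```python
import numpy as np

def fct(x):
    """Fast cosine transform (Makhoul): the unnormalized N-point DCT-II
    C[k] = 2 * sum_n x[n] * cos(pi*k*(2n+1)/(2N)) via one N-point FFT."""
    x = np.asarray(x, dtype=np.float64)
    N = x.size
    v = np.concatenate([x[0::2], x[1::2][::-1]])   # even indices, then odd reversed
    V = np.fft.fft(v)                              # single N-point FFT
    k = np.arange(N)
    return 2.0 * np.real(np.exp(-1j * np.pi * k / (2 * N)) * V)

# agrees with the direct transform, e.g. scipy.fft.dct(x, type=2)
```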

2.4.5 Classification

In the entire process of a pattern recognition system, classification plays an important role by deciding which class a form belongs to. The main idea of classification is to assign an unknown example (a form), given its description as a set of parameters, to one of the predefined classes. Several classifiers have been applied to text recognition systems, such as K-Nearest Neighbors (KNN), ANN, Hidden Markov Models (HMM) and SVM [55].

A. K-Nearest Neighbors [56]

The KNN algorithm is one of the simplest machine learning algorithms. In the context of classifying a new observation X, the basic idea is simply to consider the nearest neighbors of this observation: X is assigned the class determined by the majority class among its K nearest neighbors. To find the K nearest neighbors, one can use the Euclidean Distance (ED). For two data items represented by two vectors x_i and x_j, the distance between them is given by Equation 2.19.

$ED(x_i, x_j) = \sqrt{ \sum_{l=1}^{d} \left( x_{il} - x_{jl} \right)^{2} } \quad (2.19)$

Where d is the length of the vectors, x_i is the first vector, and x_j is the second vector.

The main advantage of this algorithm is its simplicity and the fact that it does not require a learning phase: the model consists only of a distance function and a rule that chooses the class from the classes of the nearest neighbors. KNN therefore falls into the category of non-parametric models. The introduction of new data can improve the performance of the algorithm without requiring the reconstruction of a model, which is a major difference from algorithms such as the Artificial Neural Network (ANN).
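As an illustration of the description above, the following minimal Python sketch classifies a new observation by the majority class among its K nearest neighbors under the Euclidean distance of Equation 2.19; the toy data are placeholders, not thesis features.

```python
# Minimal KNN classifier: Euclidean distance + majority vote.
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    # Euclidean distance (Equation 2.19) between x and every training vector.
    dists = np.sqrt(((X_train - x) ** 2).sum(axis=1))
    nearest = y_train[np.argsort(dists)[:k]]        # labels of the k nearest
    values, counts = np.unique(nearest, return_counts=True)
    return values[np.argmax(counts)]                # majority vote

X_train = np.array([[0, 0], [0, 1], [5, 5], [6, 5]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([5, 6])))   # -> 1
```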

B. Support Vector Machine (SVM)

SVM is a binary classification method for supervised learning that was introduced by Vapnik in 1995, making it a relatively recent alternative for classification. It is based on the existence of a linear classifier in a suitable space. Since it addresses a two-class classification problem, the method uses a training set to learn the model parameters. It relies on so-called kernel functions, which allow an optimal separation of the data. SVM is particularly effective in that it can deal with problems involving large numbers of descriptors, provides a unique solution (no local-minimum problems as in neural networks) and gives good results on real problems [57]. The algorithm in its original form searches for a linear decision boundary between two classes, but the model can be greatly enriched by projecting the data into another space to increase their separation. The same algorithm applied in the new space then yields a non-linear decision boundary in the initial space. The main advantages of the SVM are that it produces very accurate classifiers with less overfitting and is robust to noise [58].

- General Principles [59]

The simplest case is where the training data come from only two different classes (+1 or -1); this is called binary classification. The idea of SVMs is to find a hyperplane which separates these two classes. If such a hyperplane exists, that is to say if the data are linearly separable, the classifier is called a hard-margin SVM, as in Figure 2.12.

illustration not visible in this excerpt

Figure 2.12: The hyperplane H that separates the two sets of points.

The nearest points, which alone are used for the determination of the hyperplane, are called support vectors, as shown in Figure 2.13.

illustration not visible in this excerpt

Figure 2.13: Support vectors.

The separating hyperplane is represented by Equation 2.20:

illustration not visible in this excerpt

Where w is the normal vector perpendicular to the hyperplane, and b is a constant. The decision function for an example x can be expressed as follows:

illustration not visible in this excerpt

Since the two classes are linearly separable, no example lies exactly on the hyperplane satisfying H(x) = 0. It is then appropriate to use the following decisions:

illustration not visible in this excerpt

The values +1 and -1 on the right of the inequalities may be replaced by any constants +a and -a; dividing both sides of the inequalities by a recovers the previous inequalities, which are equivalent to Equation 2.21:

illustration not visible in this excerpt

The hyperplane wT.x + b = 0 separates the two classes, and the distance between the hyperplane and the closest example is called the margin. The region that lies between the two hyperplanes wT.xi + b = -1 and wT.xi + b = +1 is called the generalization region of the learning machine.

Over this region, the generalization capability of the machine is greater. Maximizing this region is the objective of the training phase. The SVM method therefore seeks the hyperplane that maximizes the margin; such a hyperplane is called the "optimal hyperplane", as shown in Figure 2.14. Assuming that the training data contain no noisy (poorly-labeled) examples and that the test data follow the same probability distribution as the training data, the maximum-margin hyperplane will maximize the generalization ability of the learning machine.

illustration not visible in this excerpt

Figure 2.14: Example maximum margin (optimal hyperplane) [59].

In the case where the data are not linearly separable or contain noise (outliers: mislabeled data), the constraints cannot all be satisfied and need to be relaxed a little. This can be done by admitting a certain classification error on the data, which is called the soft-margin SVM, as illustrated in Figure 2.15.

illustration not visible in this excerpt

Figure 2.15: Linear SVM with soft margin [59].

Slack variables ξi are then introduced to relax the constraints, giving Equation 2.22:

illustration not visible in this excerpt

During testing, the input sample is associated with the class whose output is positive, according to the rule:

illustration not visible in this excerpt

However, it is possible that several outputs are positive for a given test sample. This is particularly true for ambiguous data located near the separation borders between classes. In this case, a majority vote is used to assign the instance x to the class Ck according to the decision rule in Equation 2.23:

illustration not visible in this excerpt

On the other hand, when the training data are not linearly separable, the SVM uses a kernel function (K), as in Table 2.1, to map the data into a higher-dimensional space (feature space) where they can be linearly separated, as shown in Figure 2.16.

illustration not visible in this excerpt

Figure 2.16: Changing the data space. (a) Input space, (b) Feature Space.

Several SVM kernel functions can be used to map the input space to the feature space and give good classification accuracy when classifying a new example. The most common kernel functions are shown in Table 2.1.

Table 2.1: SVM kernels.

illustration not visible in this excerpt
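Since Table 2.1 is not visible in this excerpt, the sketch below assumes the kernels most commonly listed in such tables (linear, polynomial, RBF, sigmoid) and shows how a soft-margin SVM is trained with each of them using scikit-learn; the toy data are placeholders, not the thesis features.

```python
# Training binary SVMs with the common kernel functions on toy data
# that are not linearly separable in the input space.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1).astype(int)   # circular class boundary

for kernel in ("linear", "poly", "rbf", "sigmoid"):
    clf = SVC(kernel=kernel, C=1.0).fit(X, y)       # C controls the soft margin
    print(kernel, clf.score(X, y))                  # training accuracy
```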

2.5 Arabic Handwritten Text Recognition

Arabic Handwritten Text Recognition (AHTR) has been a very challenging field in recent years. Unlike Chinese and Latin scripts, Arabic is considered a very difficult language for the recognition process due to its complex features [60]. In the next section, a brief description of the Arabic language features and handwritten databases is presented [61].

2.5.1 Features of Arabic Language

The main features of Arabic writing can be summarized as follows:

A. The Arabic alphabet has 28 basic letters. Unlike Latin letters, each Arabic letter comes in several forms depending on its place in the text: initial (In), middle (M), final (F) and isolated (I), as shown in Table 2.2, which lists the 28 characters of the alphabet with their four written forms.

Table 2.2: Arabic characters and their forms.

illustration not visible in this excerpt

B. There is no difference between the shapes of handwritten and printed letters, and the notions of capital and lowercase letters do not exist.

C. Most Arabic letters are linked, even in printed text, which gives Arabic writing its cursive characteristic. Figure 2.17 illustrates the cursiveness of the Arabic language.

illustration not visible in this excerpt

Figure 2.17: Cursiveness of Arabic language.

D. An Arabic character can contain a vertical stroke (TAA (ﻁ)), an oblique stroke (KAF (ك)) or a zigzag (HAMZA (ء)).

E. Arabic characters do not have a fixed size (height and width); the height varies from one character to another and from one form to another of the same character.

F. In the Arabic language, a single word can be interspersed with one or more spaces, giving several semi-words (also called related components) within a sub-word, as is the case for the word represented in Figure 2.18. In handwriting, the spacing between the different semi-words of the same sub-word is smaller than the spacing between two different sub-words.

illustration not visible in this excerpt

Figure 2.18: Example of semi-words constituting an Arabic sub-word. (a) 4 semi-words, (b) 1 sub-word.

G. In the Arabic alphabet, 15 of the 28 letters have one or more points. These points are located either above or below the shape with which they are associated, but never both at once. The maximum number of points a letter may have is three points above the character or two points below it. Table 2.3 presents the letters with points, their numbers and positions.

Table 2.3: Arabic letters with diacritical points.

illustration not visible in this excerpt

2.5.2 Arabic Handwritten Text Recognition Databases

In order to evaluate a handwritten recognition system, its accuracy and speed should be measured and compared with those of an average human reader. Although some works have been conducted on Arabic handwriting, they generally use small databases of their own or present results on databases that are unavailable to the public. There are handwritten databases intended for recognition purposes, but unfortunately most of them are not accessible because they were developed for a specific research work.

The first common handwritten database is the Al-Isra handwritten database, proposed by Nawwaf K., et al. [62]. The data of the Al-Isra database were collected at the University of Amman. This database includes grayscale images of Arabic handwritten words (37,000), numbers (10,000 Arabic and Indian), signatures (2,500) and texts (500 paragraphs).

The second handwritten database is CENPARMI, published by Al-Ohali Y., et al. [63] in 2000. It has 7,000 gray images of handwritten Saudi checks, divided into several parts: the first part has 1,547 handwritten word images, the second part has 1,547 printed word images, the third part has 23,325 semi-word images, and the last part has 9,865 images of Indian numbers. Alma'adeed S., et al. [64] presented an Arabic handwritten database (AHDB). A hundred writers were invited to write words from the vocabulary of digital amounts. In addition, the AHDB contains the most popular words in Arabic handwriting and numbers, comprising 4,700 gray handwritten images. An example of the AHDB database is illustrated in Figure 2.19.

illustration not visible in this excerpt

Figure 2.19: Example of AHDB database [64].

A database for off-line Arabic handwriting, IESK-ArDB, was presented by Mostafa E., et al. [65]. The database contains 280 pages of 14th-century historical manuscripts, more than 4,000 gray handwritten word images and 6,000 segmented character images. The word database vocabulary covers most Arabic parts of speech: nouns, verbs, country/city names, security terms and words used for writing bank amounts. An example of the IESK-ArDB database is illustrated in Figure 2.20.

illustration not visible in this excerpt

Figure 2.20: Example of IESK-ArDB database [65].

Furthermore, the last database presented is the IFN/ENIT database, developed by the Institute of Communication Technologies (IFN) in cooperation with the National School of Engineers of Tunis (ENIT) in 2002. The database consists of 5 subsets, containing in total 32,492 images of Tunisian city/village names. These names were collected from more than 1,000 writers of different ages and professions [66]. In this thesis, the AHDB and IESK-ArDB databases will be used for training and testing the system; these databases have been used by several works in the literature, which provides accuracy results to compare with. The Arabic handwritten databases are summarized in Table 2.4.

Table 2.4: Arabic handwritten databases.

illustration not visible in this excerpt

Most of the existing databases are unavailable online, and the others are not free to use. Besides, all of these databases were created only for recognition systems and cannot be employed for security applications. In addition, the existing handwritten databases contain different text images written by the same writer but no similar text images written by the same writer, which restricts the identification process to text-independent identification only. Furthermore, the available handwritten databases contain many handwritten images written by unknown writers, which makes the identification task impossible to implement. Therefore, to overcome the problems of the existing handwritten databases, a new database is needed.

In this thesis, an Arabic handwritten database is proposed to satisfy the recognition system’s requirements.[67]

2.6 Handwritten Recognition Applications

By specializing to particular problems, handwriting recognition has enabled the development of specific and effective applications for both offline and online handwritten recognition.

2.6.1 Offline Handwritten Recognition

Offline handwritten recognition has experienced significant growth in areas associated with economic interests and e-government services.

A. Reading postal addresses

Reading handwritten postal codes, associated with reading the names of cities, has driven the development of automatic mail sorting machines (letters).

B. Banking

Recognition of handwritten literal (worded) amounts, associated with recognition of the digital amounts, is used to validate checks; check recognition may also be associated with the corresponding coupons. Machines capable of reading several thousand checks per hour are already in use. Recognition of digital amounts has also allowed the creation of ATMs that accept bank checks: at these machines, the client is identified by his/her bank card and enters the check amount on the keyboard. If the entered amount coincides with the recognized digital amount, the deposit of the check is validated immediately.

C. Forms and Schedules

These are mainly OCR applications for reading survey forms, order forms and insurance declarations.

D. User Authentication: Handwritten Verification and Identification

Handwritten user authentication can be achieved in two alternative classification modes: verification and identification. In user verification, the authentication system decides whether a set of handwritten features is similar enough to a set of reference templates of a person of claimed identity to confirm the claim. The result of handwritten user verification is always a binary yes/no decision, confirming an identity or not. Identification, however, describes the process of determining the identity of a writer based on handwritten features. Here, the classification assigns the handwritten features to one out of all the classes of persons registered with a particular system. One possibility to implement identification is exhaustive verification, where the actual handwritten features are compared with all registered references and the identity is determined by the origin of the references of greatest similarity.

In this view, identification can be accomplished by a systematic verification of an actual handwritten sample against the references of all known (registered) users in the authentication system. The result of this process yields the identity linked to the references showing the greatest similarity. Consequently, the process of identification can be modeled as a sequence of one-to-all verifications [68].

2.6.2 Online Handwritten Recognition

Recognition of online writing is experiencing significant growth in areas related to natural, friendly and ergonomic human-machine communication, as well as portability.

Personal Digital Assistant (PDA)

Recognition of online writing aims to replace the keyboard and mouse of a computer with a pen. This change is intended to make computers more user-friendly, allowing their use in very diverse situations (taking notes, writing orders, accident reports, teaching, etc.). This is, first of all, a matter of making computing mobile.

2.7 Evaluation Measures of Handwritten Recognition System

In order to evaluate the handwritten recognition system, several measures are used, each in one of the handwritten recognition system stages. First of all, in the segmentation stage, the pixel-based Matching Score (MS) criterion [69] is used to evaluate the segmentation results. The MS criterion is defined by Equation 2.24:

illustration not visible in this excerpt

Where T is the segmented image produced by the segmentation algorithm and TG is the ground-truth image of the handwritten database. MS is a real number between 0 and 1 that represents the matching score between the resultant image and the original one. A higher MS value indicates a better segmentation.

In the preprocessing stage, the Misclassification Error (ME) evaluates the binary images according to the similarity between the output images and the ground-truth images. The ME counts the background pixels wrongly assigned to the foreground and, conversely, the foreground pixels wrongly assigned to the background [70]. Equation 2.25 is used to compute the ME.

illustration not visible in this excerpt

Where BO and FO denote, respectively, the background and foreground of the original ground-truth image, and BT and FT denote the background and foreground pixels of the test image. The ME varies from 0 for a perfectly thresholded image to 1 for a totally wrongly thresholded image.

The performance of the noise removal is measured by the Mean Square Error (MSE) and the Peak Signal to Noise Ratio (PSNR) [71]. A smaller MSE value and a higher PSNR value both indicate better noise removal performance. The MSE is computed using Equation 2.26.

illustration not visible in this excerpt

Where I(i,j) is the original image, I'(i,j) is the reconstructed image after removing the noise, and M, N are the dimensions of the image. The PSNR is found by Equation 2.27.

illustration not visible in this excerpt
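A minimal sketch of Equations 2.26 and 2.27, assuming 8-bit images (maximum pixel value 255):

```python
# MSE between the original and denoised image, and the corresponding PSNR.
import numpy as np

def mse(I, I2):
    return np.mean((I.astype(float) - I2.astype(float)) ** 2)

def psnr(I, I2, max_val=255.0):
    m = mse(I, I2)
    return float("inf") if m == 0 else 10.0 * np.log10(max_val ** 2 / m)
```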

On the other hand, to evaluate the accuracy of any handwritten recognition system, several performance measures are used. A True Positive (TP) corresponds to a correct recognition of the test handwritten image as being a certain member of the testing set, while a True Negative (TN) corresponds to a correct rejection. A False Positive (FP) is an incorrect recognition of the test handwritten image as being a certain member of the testing set while it is not. Finally, a False Negative (FN) corresponds to the error of failing to recognize the test handwritten image as being a certain member of the testing set while it is [72].

Based on the above terminology, the True Positive Rate (TPR) is defined as the ratio between the number of TP and the total number of TP and FN as in Equation 2.28.

illustration not visible in this excerpt

The False Positive Rate (FPR) is obtained by Equation 2.29.

illustration not visible in this excerpt

These two measures are employed for class discrimination, which is the focus of this thesis. The recognition rate (accuracy) can be defined as in Equation 2.30.

illustration not visible in this excerpt

However, the error rate can be defined as in Equation 2.31.

illustration not visible in this excerpt

Finally, the F1 score is used for computing the test accuracy based on precision and recall (Equations 2.28 and 2.29). The F1 score can be found using Equation 2.32.

illustration not visible in this excerpt
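The following sketch computes Equations 2.28-2.32 from the four counts; the precision term in F1 uses its standard definition, TP/(TP+FP), and the example counts are illustrative only.

```python
# Performance measures derived from the TP/TN/FP/FN counts.
def rates(tp, tn, fp, fn):
    tpr = tp / (tp + fn)                          # Eq. 2.28 (recall)
    fpr = fp / (fp + tn)                          # Eq. 2.29
    accuracy = (tp + tn) / (tp + tn + fp + fn)    # Eq. 2.30
    error = (fp + fn) / (tp + tn + fp + fn)       # Eq. 2.31
    precision = tp / (tp + fp)                    # standard definition
    f1 = 2 * precision * tpr / (precision + tpr)  # Eq. 2.32
    return tpr, fpr, accuracy, error, f1

print(rates(tp=90, tn=85, fp=15, fn=10))
```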

Chapter Three Proposed Arabic Handwritten Text Recognition System

3.1 Introduction

This chapter presents the proposed Arabic Handwritten Text Recognition System (AHTRS). It describes the system architecture in detail, together with the proposed algorithms in the various stages of the system. Moreover, a proposed database of Arabic handwritten text is presented and discussed.

3.2 Architecture of the Proposed System

The main tasks of the proposed system are recognizing the input Arabic handwritten text by converting it into editable text and identifying the writer of the document. The AHTR system has two modules, one for the recognition of the Arabic handwritten text and the other for identifying the writer of that text. The input to the system is an Arabic handwritten text image, which passes into both modules; the output of the first module is editable Arabic text, and the output of the second is the writer of the input Arabic handwritten text document. Figure 3.1 shows the proposed system architecture.

Module1 works when a handwritten text image is entered into the system; the Arabic text lexicon is used to map the output class labels to their desired Arabic editable text. Module2 works on the preprocessed handwritten sub-images obtained from module1; the list of known authorship is used to map the output class labels to their desired writers. Furthermore, the output of the system depends on the manager query, namely the editable Arabic text or the handwritten text's writer.

illustration not visible in this excerpt

Figure 3.1: Proposed AHTR system architecture.

3.3 Arabic Handwritten Text Recognition (Module1)

The aim of module1 in the AHTR system is to recognize the Arabic handwritten text. The input to module1 is a handwritten text image, and the output is editable Arabic text. Module1 has several main stages, and each stage has several phases that work together to achieve the system goals. The main stages of module1 are: image acquisition, segmentation, preprocessing, features base construction, classification and post-processing, as shown in Figure 3.2.

illustration not visible in this excerpt

Figure 3.2: Architecture of module1.

3.3.1 Image Acquisition Stage

In this stage the handwritten text image is acquired via camera or scanner. The image may be in any of several formats, such as JPEG, BMP or PNG, and may be captured in gray or color form. Input images are converted into gray form in order to reduce the image size and then passed to the next stage of the system. An example of an Arabic handwritten text image is illustrated in Figure 3.3.

illustration not visible in this excerpt

Figure 3.3: Arabic handwritten text image. (a), (b) Color image, (c) Gray image.

3.3.2 Segmentation Stage

The input text image is segmented into several Arabic segments (sub-words), and each segment then passes to the next stage of the system. In order to perform the segmentation, two features of Arabic handwritten text are considered. The first feature is that the longest Arabic word has eleven characters and the shortest Arabic word has only two characters. The second feature is that the distances between the handwritten sub-words in the Arabic text are greater than the distances between the semi-words belonging to the same Arabic handwritten sub-word, as shown in Figure 3.4.

illustration not visible in this excerpt

Figure 3.4: Distances feature of the Arabic handwritten text.

Therefore, based on these features, a segmentation algorithm is proposed that draws a rectangle around each Arabic handwritten segment in the input handwritten text image and crops each segment to save it as a single handwritten sub-image. Algorithm 3.1 explains the steps of the proposed segmentation stage.

illustration not visible in this excerpt

Applying Algorithm 3.1 yields several handwritten segments (sub-words). Each segment is given a name and then passed to the preprocessing stage. The proposed segmentation algorithm is simple and does not need to process the histograms of the input images in order to segment them into sub-images. Figure 3.5 presents an example of applying the main stages of the proposed segmentation algorithm (Algorithm 3.1).

illustration not visible in this excerpt

Figure 3.5: Applying the proposed text segmentation Algorithm. (a) Input image, (b) applying Sobel filter, (c) applying dilation and filling methods, (d) drawing rectangles around the labeled objects, (e) Drawing the obtained rectangles on original image, (f) handwritten sub-images.
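A hedged sketch of the pipeline named in Figure 3.5, using generic SciPy/scikit-image operations; the edge threshold, dilation iterations and input file name are assumptions, not the values of Algorithm 3.1, which is not shown in this excerpt.

```python
# Sobel edges -> dilation -> hole filling -> one bounding box per object.
import numpy as np
from scipy import ndimage
from skimage import filters, io

image = io.imread("text_page.png", as_gray=True)    # hypothetical input file
edges = filters.sobel(image) > 0.05                 # binarized Sobel edges
blobs = ndimage.binary_dilation(edges, iterations=3)
blobs = ndimage.binary_fill_holes(blobs)
labels, n = ndimage.label(blobs)                    # connected components
sub_images = [image[sl] for sl in ndimage.find_objects(labels)]
print(f"{n} handwritten segments cropped")          # one rectangle per segment
```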

3.3.3 Preprocessing Stage

The role of preprocessing is to prepare the sub-images of the handwritten text for the next recognition stages, basically by reducing the noise superimposed on the data and keeping only the desired information. The proposed preprocessing stage has six steps:

A. Image thresholding,
B. Noise removal,
C. Black space elimination,
D. Image thinning,
E. Edge detection, and
F. Image scaling.

Each step has a different effect on the AHTR system, and together these steps increase the recognition accuracy of the AHTR system. Figure 3.6 shows the steps of the proposed preprocessing stage.

illustration not visible in this excerpt

Figure 3.6: The proposed preprocessing stage of module1.

The proposed stage produces three output handwritten images. The first is the thinned handwritten image, the second is a binary handwritten image, and the third is a handwritten edge image. Each of these images will be used by specific features extraction methods to build the feature vector used for training and testing the proposed system.

A. Image Thresholding

The first step of the proposed preprocessing stage is image thresholding, which converts the grayscale image into binary form. Approaches based on computing the intensity histogram generally rely on two peaks for finding the threshold value, but many images do not have two such peaks in their histogram. Besides, common algorithms such as global thresholding give unsatisfactory results for handwritten text documents. Global thresholding converts the grayscale image into binary form based on a single global threshold value, and finding such a threshold value is computationally complex. This technique also has side effects on the handwritten text, excluding some of its parts or including some noise, and vice versa.

Therefore, the proposed thresholding algorithm overcomes this problem by classifying pixels into foreground and background correctly, with fewer misclassified pixels. The proposed thresholding algorithm starts by calculating the maximum and minimum intensity values (Imax and Imin) of the gray handwritten sub-image, then finding their mean and the difference between them. The result is then input to the Fuzzy C-Means clustering (FCM) presented in section 2.4.2(A), which attracts the nearest similar pixels through a clustering operation and produces the best threshold level to convert the image into binary form. Algorithm 3.2 describes the main steps of the proposed image thresholding.

illustration not visible in this excerpt

The intensity of the pixel values determines the black and white pixels in the image; the FCM then uses the black/white pixels as input and selects several of them as centroids for the clustering operation. After testing several values between 0 and 200 on 50 images to find the best variation ranges, the offsets 20 and 110 were chosen experimentally as the best offset values for the handwritten images. If the difference between foreground and background intensity is larger than the mean (Idiff > Imean), the possible variation range of the text content is large and the offset (Io) is set to 110; for a small difference, the variation range of the text content is also small, so the offset (Io) is set to the small value of 20. Experimental results on 50 Arabic handwritten images show that offset values of 20 for low intensity variation and 110 for large variation cater for general pen pressure and color variations, and hence give very satisfactory results.

On the other hand, the threshold value (T) is used to speed up the FCM processing by minimizing the selected clustering points and the number of iterations. The FCM gives the best results for overlapped data sets and performs comparatively better than the k-means algorithm: unlike k-means, where each data point must belong exclusively to one cluster center, in FCM each data point is assigned a membership to every cluster center, so a data point may belong to more than one cluster. Figure 3.7 shows the Arabic handwritten sub-image after applying the proposed thresholding algorithm.

illustration not visible in this excerpt

Figure 3.7: Image thresholding. (a) Original image, (b) Image after applying the proposed algorithm.
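Since Algorithm 3.2 is not visible in this excerpt, the sketch below pairs the offset rule described above with a generic two-cluster fuzzy C-means on the pixel intensities; the FCM update equations are the standard ones, not necessarily the exact thesis implementation.

```python
# Intensity thresholding via a small 1-D fuzzy C-means (two clusters).
import numpy as np

def fcm_threshold(gray, m=2.0, iters=50):
    I = gray.astype(float).ravel()
    i_max, i_min = I.max(), I.min()
    i_mean, i_diff = I.mean(), i_max - i_min
    offset = 110 if i_diff > i_mean else 20        # variation-dependent offset
    centers = np.array([i_min + offset, i_max - offset], dtype=float)
    for _ in range(iters):                         # standard FCM updates
        d = np.abs(I[:, None] - centers[None, :]) + 1e-9
        u = 1.0 / (d ** (2.0 / (m - 1.0)))
        u /= u.sum(axis=1, keepdims=True)          # fuzzy memberships
        centers = (u ** m * I[:, None]).sum(axis=0) / (u ** m).sum(axis=0)
    T = centers.mean()                             # threshold between clusters
    return gray > T                                # binary image
```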

B. Noise Removal

During the acquisition and thresholding steps, some false pixels are added to the handwritten text image and sub-images. These pixels represent noise that affects the image quality by creating irregularities on the outline of the handwritten text. In order to remove this noise, a simple and practical noise removal algorithm is performed.

This algorithm eliminates the noise of the binary handwritten image by removing isolated pixels. Given a binary image of Arabic handwritten text and two thresholds, the algorithm removes the components whose pixel counts fall below or above the thresholds by assigning zero (0) to their pixels. The proposed noise removal is described in Algorithm 3.3.

illustration not visible in this excerpt

The proposed noise removal algorithm uses two thresholds, T1=3 and T2=300, which are respectively smaller and larger than the pixel counts of the smallest and biggest handwritten Arabic components. The values of the thresholds were obtained after many tests on several handwritten images of the used handwritten databases. The noise removal algorithm labels each component in the image and then eliminates all the unwanted pixels around these components depending on the chosen thresholds.

These steps make the proposed algorithm simple and efficient: it removes only the unwanted pixels without affecting the original shape of the handwritten text in the input image, unlike existing methods which depend on sets of filters and remove some of the handwritten text shape. Figure 3.8 shows an example of applying the proposed noise removal algorithm.

illustration not visible in this excerpt

Figure 3.8: Noise Removal. (a) Input image before applying proposed Noise Removal algorithm, (b) Output image after applying proposed Noise Removal algorithm.
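A minimal sketch of this component-size filter with the stated thresholds T1 = 3 and T2 = 300:

```python
# Connected components smaller than t1 or larger than t2 pixels are
# set to background; mid-sized components (the text) are kept.
import numpy as np
from scipy import ndimage

def remove_noise(binary, t1=3, t2=300):
    labels, n = ndimage.label(binary)                   # label each component
    sizes = ndimage.sum(binary, labels, range(1, n + 1))
    keep = np.zeros(n + 1, dtype=bool)                  # index 0 = background
    keep[1:] = (sizes >= t1) & (sizes <= t2)            # keep mid-sized only
    return keep[labels]                                 # cleaned binary image
```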

C. Black Space Elimination

The third step of the proposed preprocessing stage is removing the black space around the handwritten text in the background of the image. The black space, represented by the value 0, does not carry any desired information about the handwritten text and does not help the recognition system; it may affect the features extraction results and make them inefficient. The proposed approach for removing the black space is based on counting the black pixels from all image directions. From each side of the binary image, the closest foreground pixel of the written text is obtained. This produces four points which form the boundaries of the bounding box; the black area around this box can then be eliminated using these four values. The main steps of the black space elimination are described in Algorithm 3.4.

illustration not visible in this excerpt

By applying the Black Space Elimination (Algorithm 3.4), many unwanted background pixels are removed without any effect on the shape of the handwritten text, as shown in Figure 3.9.

illustration not visible in this excerpt

Figure 3.9: Black space elimination. (a) Input image, (b) Output image of Black Space Elimination Algorithm.
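A minimal sketch of Algorithm 3.4's cropping idea: locate the outermost foreground pixels in each direction and keep only the bounding box.

```python
# Crop a binary sub-image to the tight bounding box of its foreground.
import numpy as np

def crop_black_space(binary):
    rows = np.any(binary, axis=1)                  # rows containing text
    cols = np.any(binary, axis=0)                  # columns containing text
    r0, r1 = np.where(rows)[0][[0, -1]]            # first/last text row
    c0, c1 = np.where(cols)[0][[0, -1]]            # first/last text column
    return binary[r0:r1 + 1, c0:c1 + 1]
```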

D. Image Thinning

In order to extract the image skeleton, the Stentiford thinning algorithm (Algorithm 2.1 in section 2.4.2(D)) is used. This algorithm is important for the structural features extraction. The skeleton image is clear and uses very few pixels to represent the shape of the handwritten text. An example of applying the Stentiford thinning algorithm is illustrated in Figure 3.10.

illustration not visible in this excerpt

Figure 3.10: Image Thinning. (a) Input image, (b) Thinned image.

E. Edge Detection

The fifth step of the proposed preprocessing stage is edge detection. In order to extract the edges of the image, the Sobel edge detector described in section 2.4.2(C) is used in the proposed system, as it gives the best results for Arabic handwritten images. The binary image obtained from the black space elimination algorithm is converted to gray form, then the image edges are detected: first the vertical and horizontal Sobel convolution kernels are applied to the image, then the magnitude is calculated using Equation 2.4. Figure 3.11 shows the edge image obtained by applying the Sobel detector.

illustration not visible in this excerpt

Figure 3.11: Edge image using Sobel detector. (a) Input image, (b) Output image.

F. Image Scaling

Each writer writes text in a different style and size. Therefore, it is important to make all the images the same size, which also makes the feature extraction process faster. Scaling is often used to bring text images to a fixed size; with all images at the same size, the extracted features are more efficient for the classification stage. In the proposed system, all images are normalized to the same size (128x128), as shown in Figure 3.12, which is the final output of the preprocessing stage. The scaling size was chosen after many tests on 500 images with various sizes.

illustration not visible in this excerpt

Figure 3.12: Image scaling. (a,b,c) Images with various sizes (214x376) and (117x278), (d,e,f) Images after scaling to single size (128x128).

3.3.4 Features Base Construction Stage

The objective of the features base construction stage is to capture the most relevant and discriminative characteristics of the handwritten text image and store them in a database. The selection of good features can strongly affect the classification performance and reduce the computational time. In other words, it is possible to choose a set of features that captures the significant differences from one class to another; these selected features consequently make the classification task easier. The features used must be suitable for the application and for the applied classifier. In the proposed system two features bases are constructed: features base1 for module1 and features base2 for module2.

Features base1 is used in module1 for recognition, by finding the most similar features for the same handwritten text, while features base2 is used in module2 for identification, by finding the most similar handwritten text features for the same writer. Moreover, all the features extracted in module1 are saved in features base1, which contains a set of vectors representing the handwritten text features, as shown in Figure 3.13.

illustration not visible in this excerpt

Figure 3.13: Proposed features base construction of module1.

There are four main steps in the features base construction of module1: features extraction, feature vector construction, features normalization and features base1 construction.

A. Features Extraction

The first group of features, used for recognition in module1, is extracted by several methods, as described below:

1. Structural Features

The first set of features is the structural features. The primitive visual elements of Arabic writing form structural descriptors connected to the shape of the writing. Various descriptors that are specific to Arabic text can be extracted from the writing: the number of points, the number of loops, the number of end points and the number of junctions, as shown in Figure 3.14.

illustration not visible in this excerpt

Figure 3.14: Different Arabic descriptors.

The first feature extracted in the proposed system is the number of points. Points are very important features due to their variance in Arabic text; they are the smallest items in the image and contain only a few pixels. The second extracted feature is the number of loops: nine of the Arabic letters have one loop and only one character has two loops. The loops feature has a better effect on the system when extracted from an Arabic handwritten sub-word than from a single character. The third extracted feature is the number of end points: each Arabic handwritten sub-word has a unique number of endpoints that distinguishes it from the other sub-words. The last extracted feature is the number of junctions.

2. Statistical Features

The second set of features is extracted by a proposed statistical features extraction method in the AHTR system. The proposed method divides the edge image obtained from the preprocessing stage into various numbers of blocks and then extracts the required features from each block individually. The handwritten text image is divided into four square blocks, eight horizontal blocks and eight vertical blocks, as illustrated in Figure 3.15(b, c and d).

illustration not visible in this excerpt

Figure 3.15: Image blocking. (a) Original image, (b) Four blocks divided, (c) eight blocks divided, (d) Eight vertical blocks divided, (e) Diagonal pixels of the original image, (f) Diagonal pixels of the four divided blocks.

The first features are extracted by dividing the handwritten text image into four blocks as in Figure 3.15(b) and then applying a specific mathematical operation, such as the summation of the diagonal pixels only, as in Figure 3.15(e and f), to get the features of each block. The second features are obtained by dividing the images into eight equal blocks as in Figure 3.15(c). The last features are obtained by dividing the images into eight vertical blocks and computing the summation of the white pixels in each block, as in Figure 3.15(d). All the extraction steps of the proposed statistical features are illustrated in Algorithm 3.5.

illustration not visible in this excerpt

3. The Discrete Cosine Transform (DCT) Features

In the proposed system the DCT is applied to the whole handwritten sub-image produced by the preprocessing stage. The output of the DCT is an array of DCT coefficients. The features are extracted as a vector by arranging the DCT coefficients in zigzag order, so that most of the coefficients away from the beginning of the vector are small or zero. In order to choose an appropriate number of coefficients to represent the features, 500 handwritten images were selected for testing: the DCT was applied to all the images, and the energy retained was measured by reconstructing the original images from the selected coefficients. Examples of the energy contained in different numbers of DCT coefficients are illustrated in Table 3.1.

Table 3.1: The energy contained in different DCT coefficients number.

illustration not visible in this excerpt

The first 50 coefficients retain enough energy to reconstruct the original images well. By further testing, it was found that the best number of DCT coefficients to represent the handwritten sub-word as a feature vector with a minimum number of features is the first 10 coefficients, which still carry good energy. In order to compute the DCT features, the Fast Cosine Transform (FCT) of section 2.4.4(C) is used. The main steps to get the DCT features are presented in Algorithm 3.6.
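A hedged sketch of this feature step: a 2-D DCT of the whole (assumed 128x128) sub-image, a zigzag scan, and the first 10 coefficients kept as features. The zigzag helper is a generic implementation, since Algorithm 3.6 is not shown in this excerpt.

```python
# 2-D DCT of the sub-image, zigzag ordering, first 10 coefficients kept.
import numpy as np
from scipy.fftpack import dct

def dct2(img):
    return dct(dct(img, axis=0, norm="ortho"), axis=1, norm="ortho")

def zigzag(a):
    h, w = a.shape
    # Traverse anti-diagonals, alternating direction on each diagonal.
    idx = sorted(((i, j) for i in range(h) for j in range(w)),
                 key=lambda p: (p[0] + p[1],
                                p[1] if (p[0] + p[1]) % 2 else p[0]))
    return np.array([a[i, j] for i, j in idx])

img = np.random.rand(128, 128)                     # placeholder sub-image
features = zigzag(dct2(img))[:10]                  # first 10 DCT coefficients
```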

4. Modified Histogram of Oriented Gradient (MHOG1) Features

An important set of extracted features in the proposed system is MHOG. Several proposed steps are performed to extract them. First, the binary image obtained from the previous stages is converted to gray form (multiplied by 255); then, to find the gradient of the image, edge detector filters are proposed as in Equations 3.1 and 3.2. The resultant images of applying the proposed filters are shown in Figure 3.16.

illustration not visible in this excerpt

Figure 3.16: Edge detection. (a) Original image, (b) X-axis, (c) Y-axis.

The next step is computing the image gradient magnitude and orientation using Equations 2.4 and 2.5 respectively. The gradient magnitude and direction are shown in Figure 3.17.

illustration not visible in this excerpt

Figure 3.17: Image gradient. (a) Image magnitude, (b) image direction.

After that, the gradient image is divided into 6x6 cells, and the image is scanned from left to right with 2x2 overlapped blocks, as in Figure 3.18.

illustration not visible in this excerpt

Figure 3.18: Cells division.

For each block, the histogram of the oriented gradients is obtained based on a weighted vote into orientation bins over the spatial cells. Since the gradient orientations fall in the range [-π, +π], which gives a large number of directions, in the proposed system the gradient orientation is quantized into 9 bins within the range [-π, +π], and the distance between the bin directions is (2π/bins), as shown in Figure 3.19.

illustration not visible in this excerpt

Figure 3.19: Histogram of oriented gradients. (a) Gradient orientations, (b) Histogram of oriented gradients for one cell.

If the gradient orientation of a pixel does not match any of the quantized orientations, its vote is interpolated linearly between the neighboring bin centers. For example, if theta = 75, which falls between the bin centers 60 and 100, the distances to bin 60 and bin 100 are 15 and 25 degrees respectively, and the difference between 60 and 100 is 40. Hence, the ratios are 15/40 = 0.375 and 25/40 = 0.625, as illustrated in Figure 3.20.

illustration not visible in this excerpt

Figure 3.20: Interpolation votes of gradient orientation.

After that, all the output histograms are concatenated into a 1D MHOG1 feature vector that represents the image features, as shown in Figure 3.21. All the extraction steps are illustrated in Algorithm 3.7.

illustration not visible in this excerpt

Figure 3.21: Histograms concatenation.

illustration not visible in this excerpt
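Since Equations 3.1-3.2 and Algorithm 3.7 are not visible in this excerpt, the sketch below conveys the MHOG1 idea with stand-in central-difference filters, hard (non-interpolated) voting into 9 orientation bins over [-π, +π], and per-cell histograms concatenated into one 1-D vector; the linear vote interpolation described above is omitted for brevity.

```python
# Simplified modified-HOG: per-cell orientation histograms, concatenated.
import numpy as np

def mhog(gray, cells=(6, 6), bins=9):
    gx = np.gradient(gray.astype(float), axis=1)   # stand-in for Eq. 3.1
    gy = np.gradient(gray.astype(float), axis=0)   # stand-in for Eq. 3.2
    mag = np.hypot(gx, gy)                         # gradient magnitude
    ang = np.arctan2(gy, gx)                       # orientations in [-pi, pi]
    ch, cw = gray.shape[0] // cells[0], gray.shape[1] // cells[1]
    hists = []
    for i in range(cells[0]):
        for j in range(cells[1]):
            a = ang[i*ch:(i+1)*ch, j*cw:(j+1)*cw].ravel()
            m = mag[i*ch:(i+1)*ch, j*cw:(j+1)*cw].ravel()
            h, _ = np.histogram(a, bins=bins, range=(-np.pi, np.pi), weights=m)
            hists.append(h)
    return np.concatenate(hists)                   # 1-D MHOG feature vector
```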

B. Feature Vector

Each features extraction method extracts a different number of features based on its process, so four types of features are extracted for each input handwritten image. In the feature vector step, all the features extracted by Algorithms 3.5, 3.6 and 3.7 are combined and saved in a one-dimensional array (1x120) called the feature vector. The created feature vector represents the input Arabic handwritten sub-image by a set of features that will be used for classification. Figure 3.22 shows the building of the feature vector.

illustration not visible in this excerpt

Figure 3.22: Feature vector.

C. Features Normalization

An important step to make the mathematical computations simple and fast is normalizing the numbers: large values lead to complex calculations and long processing times. Therefore, in the proposed work features normalization (scaling) is used to bring the features to the same scale. Since the feature vectors produced by the proposed features extraction algorithms contain both signed and unsigned features, all the features are scaled into the range [-1,1] by applying Algorithm 3.8.

illustration not visible in this excerpt

Several scaling ranges were tried in order to choose the most appropriate one. The best scaling range was [-1, 1], which reduced the training and classification time of the proposed system.
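A minimal sketch of this per-feature scaling into [-1, 1], standing in for Algorithm 3.8, which is not shown in this excerpt:

```python
# Column-wise min-max scaling of a feature matrix into [-1, 1].
import numpy as np

def scale_features(F):
    """F: 2-D array, rows = feature vectors. Scales each column to [-1, 1]."""
    f_min, f_max = F.min(axis=0), F.max(axis=0)
    span = np.where(f_max > f_min, f_max - f_min, 1.0)  # avoid divide-by-zero
    return 2.0 * (F - f_min) / span - 1.0
```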

D. Features Base1 Construction

The final step in the features base construction stage is to create features base1. Each feature vector is first given a label, an integer that represents the desired class of the feature vector. All the feature vectors that belong to the same class (same sub-word) therefore have the same label, which makes the classification process more efficient. Moreover, all the feature vectors are stored in a two-dimensional array (rows represent the classes and columns the features) and then saved as a base in a file.

3.3.5 Classification Stage

In the proposed system the SVM is employed in a one-vs-all multi-class scheme. Since the SVM is a binary classifier, each Arabic handwritten sub-image (class) has its own SVM model representing the handwritten text in the sub-image. For example, the first SVM model separates the class "ﺪﺒﻋ" from the remaining classes (sub-images): the "ﺪﺒﻋ" handwritten sub-image class is considered the positive class and all remaining handwritten sub-images form the negative class. Figure 3.23 shows an example of applying the one-against-all approach in the proposed system.

illustration not visible in this excerpt

Figure 3.23: Architecture of proposed SVM one against all approach.

The proposed system uses three different Arabic handwritten databases. From the AHDB database, 2730 handwritten images were used for training and 1365 for testing; from the IESK-ArDB database, 420 handwritten images were used for training and 180 for testing; and from the proposed database, 780 handwritten images were used for training and 520 for testing. During the training process each handwritten image goes through image acquisition (section 3.3.1), preprocessing (section 3.3.3) and features base construction (section 3.3.4), and all the extracted feature vectors are saved.

During the testing process, images go through the same stages as in training, plus the segmentation stage (section 3.3.2). All the features extracted in the training and testing processes are used to train and test the classifier, in order to get the best accuracy in recognizing the desired class labels. The overall SVM training and testing processes are shown in Figure 3.24.

illustration not visible in this excerpt

Figure 3.24: Classification process of module1.

During training, all the examples of the class under consideration are labeled positively (+1) and all examples not belonging to this class are labeled negatively (-1). During testing, the input Arabic handwritten sub-image is associated with the class whose output is positive. The classification process based on the SVM classifier is illustrated in Algorithm 3.9 for training and Algorithm 3.10 for testing.

illustration not visible in this excerpt
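A hedged scikit-learn sketch of the one-against-all scheme described above: one binary SVM per sub-word class, trained with +1/-1 labels, with the test sample assigned to the class whose model output is most positive. The kernel choice is an assumption, since Algorithms 3.9-3.10 are not shown in this excerpt.

```python
# One-vs-all SVM: one binary model per class; most positive output wins.
import numpy as np
from sklearn.svm import SVC

def train_one_vs_all(X, y):
    models = {}
    for c in np.unique(y):
        t = np.where(y == c, 1, -1)                # +1 for class c, -1 for rest
        models[c] = SVC(kernel="rbf").fit(X, t)
    return models

def classify(models, x):
    scores = {c: m.decision_function([x])[0] for c, m in models.items()}
    return max(scores, key=scores.get)             # most positive output wins
```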

3.3.6 Post-processing Stage

The final stage of module1 is post-processing. The output of the classification stage is a class label for each input Arabic handwritten sub-image. In this step these labels are used to construct the output text and display it in the system interface. Most computer applications do not support the Arabic language; therefore, in the post-processing stage an Arabic lexicon is built. The lexicon contains all the required texts together with the Unicode of each text, to simplify dealing with that text. A sample of the proposed Arabic lexicon is presented in Table 3.2.

Table 3.2: Sample of the proposed Arabic lexicon.

illustration not visible in this excerpt

The Arabic lexicon gives the proposed recognition system the ability to output editable recognized Arabic text. In the literature surveyed in chapter one (section 1.3), previous systems display the recognition results by showing the accuracy only, and some systems give ASCII codes as the output for the desired handwritten text. In both cases the end users cannot see any text as an output of the system; they see only the system accuracy, the class labels or the ASCII codes of the handwritten text. In addition, ASCII codes of the Arabic language give only single characters, without joining them into complete sub-words or sentences. The proposed Arabic lexicon overcomes all these problems and provides the end users with the required editable Arabic text.

3.4 Arabic Handwritten Text Writer Identification (Module2)

Module2 is used to identify the writer of the input Arabic handwritten text images in the AHTR system. The main stages of module2 are: image acquisition, segmentation, preprocessing, features extraction, classification and post-processing, as illustrated in Figure 3.25.

illustration not visible in this excerpt

Figure 3.25: Architecture of writer identification (module2).

Most of the stages in module2 are similar to those of module1, with a few differences. Image acquisition (section 3.3.1) prepares the handwritten text image for the following processes by converting the input handwritten text image to gray form. Algorithm 3.1 is used to segment the input handwritten image into sub-words. Algorithm 3.2 converts each handwritten sub-image into binary form in the preprocessing stage, and the noise is then removed by Algorithm 3.3. Moreover, Algorithm 3.4 removes the black space around the handwritten image, as shown in Figure 3.26.

illustration not visible in this excerpt

Figure 3.26: Preprocessing stage of writer identification (module2).

3.4.1 Features Base Construction (module2)

The features used in module2 identify the writer of the Arabic handwritten text image by capturing the differences between the writers' styles and characteristics. These features are extracted from each segmented image, and all the extracted features are saved in features base2, which contains a set of vectors representing the writers' features, as shown in Figure 3.27.

illustration not visible in this excerpt

Figure 3.27: Proposed features base construction of module2.

Two types of features are extracted for module2. The first type is extracted by a modified HOG descriptor, and the second type consists of shape features extracted by several methods. The features are extracted from the handwritten sub-images output by the preprocessing stage.

A. Modified Histogram of Oriented Gradient (MHOG2) Features

In order to extract the first type of features, the binary image is first converted back to gray form and then normalized to size 128x128. After that, the proposed filters of Equations 3.1 and 3.2 are applied to the image to get the edges in the X and Y directions, as in Figure 3.16. The proposed MHOG for identification uses only the gradient directions of the image, illustrated in Figure 3.17(b), for extracting the required features. The obtained image directions are divided into four blocks, as shown in Figure 3.28.

illustration not visible in this excerpt

Figure 3.28: Blocks dividing.

For each block, the histogram of the oriented gradients is obtained based on a weighted vote into orientation bins. The gradient orientation is quantized into 10 bins within the range [-π, +π], and the distance between the bin directions is (2π/bins). The weight of each bin in the histogram depends on the number of directions appearing in each block, as in Figure 3.29.

illustration not visible in this excerpt

Figure 3.29: Histogram of gradient orientation of MHOG2.

However, if the gradient orientation does not match any of the quantized orientations, the vote is interpolated linearly between the neighboring bin centers. Finally, all the output histograms are concatenated to form the 1D vector that represents the MHOG2 features. All the extraction steps of MHOG2 are illustrated in Algorithm 3.11.

illustration not visible in this excerpt

B. Shape Features

The second type of extracted features is the shape features. Several features that depend on the shape of the handwritten text are extracted from the handwritten sub-images. The first extracted feature is the aspect ratio (Equation 2.16) of the handwritten text in the sub-image, computed from the width and height of the handwritten text as in Figure 3.30. The width of the text in the sub-image is found by successively scanning each column of the binary image to find the first and last foreground pixels and storing their column numbers; the width is the difference between the column numbers of the last and first pixels.

The height of the text in the sub-image is found analogously by scanning each row of the binary image: the first and last foreground pixels are found and their row numbers are stored, and the height is computed by subtracting the row number of the first pixel from that of the last pixel. The aspect ratios of all the sub-images are stored as features. The second extracted feature is the centroid, which represents the center of the handwritten sub-image: the centers of the width and height define the centroid point, as shown in Figure 3.30.

illustration not visible in this excerpt

Figure 3.30: Width, height and centroid of the handwritten sub-image.

The third feature is the area of the handwritten text in the sub-image, calculated as the number of foreground pixels. The fourth extracted feature is the perimeter: the number of "1" pixels that have "0" pixels as neighbors is counted, and the result represents the perimeter. If the sub-image contains more than one handwritten segment, the perimeter is found as the sum of the results of all segments. Algorithm 3.12 shows the steps of extracting the proposed shape features.

illustration not visible in this excerpt
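A minimal sketch of the four shape features; the centroid follows the bounding-box-center reading of Figure 3.30 described above, and the perimeter counts foreground pixels with at least one background 4-neighbor, which is one common way to realize the "1-pixels with 0-neighbors" rule.

```python
# Aspect ratio, centroid, area and perimeter of a boolean binary sub-image.
import numpy as np

def shape_features(binary):
    ys, xs = np.nonzero(binary)
    width = xs.max() - xs.min() + 1
    height = ys.max() - ys.min() + 1
    aspect_ratio = width / height                          # Equation 2.16
    centroid = ((ys.min() + ys.max()) / 2,                 # box center,
                (xs.min() + xs.max()) / 2)                 # as in Figure 3.30
    area = int(binary.sum())                               # foreground pixels
    # Interior pixels have all four 4-neighbors in the foreground.
    padded = np.pad(binary, 1)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    perimeter = int((binary & ~interior).sum())            # boundary pixels
    return aspect_ratio, centroid, area, perimeter
```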

In addition, the features normalization of section 3.3.4(C) is used to bring the features into the range [-1,1] by applying Algorithm 3.8. The final step is to create the features base: each feature vector is first given a label, and all the feature vectors that belong to the same class have the same label to simplify the classification process. Moreover, all the feature vectors are stored in a two-dimensional array and then saved in features base2.

3.4.2 Classification Stage (module2)

The SVM is used to make the decision by assigning each input handwritten text image to its writer. The one-vs-all approach is used, separating each writer class from the other classes, and Algorithms 3.9 and 3.10 are used to train and test the classifier. The SVM classifies the whole handwritten text image to its writer: each handwritten sub-image is classified into its desired class, and a voting process is then applied over all the classes in order to choose the most frequent class. The voting process depends on a threshold in order to get the best identification accuracy: if the appearance percentage of the most frequent class is greater than the selected threshold (85%, obtained with the sub-word level identification approach), that class is taken as the right writer. The classification approach of the writer identification process is illustrated in Figure 3.31.

illustration not visible in this excerpt

Figure 3.31: Classification approach of module2.
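A minimal sketch of this voting rule with the stated 85% threshold; returning None on rejection is an assumption of the sketch, not stated thesis behavior.

```python
# Majority vote over per-sub-image writer labels with an acceptance threshold.
from collections import Counter

def identify_writer(sub_image_labels, threshold=0.85):
    winner, count = Counter(sub_image_labels).most_common(1)[0]
    if count / len(sub_image_labels) >= threshold:
        return winner                              # confident identification
    return None                                    # reject / unknown writer
```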

3.4.3 Post-processing Stage (module2)

The output of the identification process is the class label of the desired writer. A second part of the proposed lexicon, the writers' lexicon, is created to store the writers' information. Each class label of the input text is assigned to its writer so that the output of the system displays the writer's name. A sample of the proposed writers' lexicon is presented in Table 3.3.

Table 3.3: Sample of proposed writers’ lexicon.

illustration not visible in this excerpt

3.5 Proposed Handwritten Database

An Arabic database for character, sub-word and text recognition and writer identification is proposed in this work. The first part of the database contains all Arabic characters, written in different sizes and styles; each character has 20 JPG images, for 560 images in total. Samples of the proposed handwritten character database are illustrated in Figure 3.32. The second part of the proposed database has 1,300 handwritten sub-word images written by several writers of different ages and educational backgrounds. Some writers used a black pen and others a blue pen.

illustration not visible in this excerpt

Figure 3.32: Handwritten character example of the proposed database.

The database can be used both for handwritten recognition systems and in security systems and applications such as writer identification and verification, whereas the other standard databases can be used only for recognition systems. Figure 3.33 shows examples from the proposed database.

illustration not visible in this excerpt

Figure 3.33: Handwritten sub-word example of the proposed database.

Furthermore, the proposed database contains both similar and different texts written by the same writer, which makes the writer identification process possible in both text-dependent and text-independent modes, as shown in Figure 3.34.

illustration not visible in this excerpt

Figure 3.34: Sample of Arabic text written by same writer.

Finally, the last part of the database consists of handwritten text documents. The database has several Arabic handwritten text documents written by several writers. These documents are used to test the performance of the proposed work. An example of the Arabic handwritten text documents in the proposed system is illustrated in Figure 3.35. All these features are missing from the standard databases reviewed in section 2.5.2 of chapter two.

illustration not visible in this excerpt

Figure 3.35: Handwritten text image example of the proposed database.

Chapter Four Experiments and Results Discussion

4.1 Introduction

The results obtained from applying the proposed algorithms, and the effect of each proposed algorithm on the system, are presented in this chapter. In the following sections, the test setup and the experimental results obtained for the segmentation, preprocessing, multiscale features, and classifier are discussed. The proposed system is implemented in MATLAB 2015a and Visual Studio 2013. The experiments were performed on an Intel Core i5 2.50 GHz processor with 6 GB RAM, running a 64-bit operating system.

4.2 Evaluation of the AHTRS System (module1)

In order to evaluate the proposed system, a number of metrics are considered. The experimental results of the proposed system are given in detail in the following sections.

4.2.1 Arabic Handwritten Database

The essential material needed to evaluate any system is the database. Unlike the existing handwritten systems in the literature (section 1.3), the proposed system uses three different handwritten databases and more Arabic handwritten images. The first database is AHDB, used for the recognition purpose. The proposed system uses 4095 images of Arabic handwritten numbers and the most common Arabic words from the AHDB database, randomly selecting 70% of the database (2730 handwritten images) for training and 30% (1365 handwritten images) for testing. The second database is IESK-ArDB, also used for handwritten recognition; the training set uses 420 images and the testing set 180 images. The last database is the proposed dataset, used both for recognition and for identifying the writers of the handwritten text. The proposed system uses 1300 handwritten word images from it, 60% for training and 40% for testing. Examples of the used handwritten databases are shown in Table 4.1.

Table 4.1: Arabic handwritten images from different databases.

illustration not visible in this excerpt

As seen in Table 4.1, each handwritten database has images of different colors and types, with different shapes and sizes. These differences make recognition a very difficult task, since the system must overcome them to achieve high accuracy. For testing the system, Arabic handwritten text documents of different types, sizes and colors are used. Examples of the Arabic handwritten document images used as input for the proposed system are illustrated in Figure 4.1.

illustration not visible in this excerpt

Figure 4.1: Arabic handwritten text documents.

In addition, each Arabic text appears with several orientations and shapes in the handwritten images of the used databases; even a single writer shows some variation in his/her writing. As an example, the Arabic text “ﻡﺎﻌﻟﺍ” takes several orientations and shapes in the handwritten text images of the AHDB database, as shown in Figure 4.2.

illustration not visible in this excerpt

Figure 4.2: Arabic handwritten images of the text (ﻡﺎﻌﻟﺍ).

4.2.2 Handwritten Text Segmentation

The first process of the proposed system is to segment the handwritten text into sub-words. These sub-words are then saved for further processing in the next stages. The results of applying the proposed segmentation are shown in Figure 4.3. The proposed segmentation algorithm is applied to several images from the AHDB database and the proposed database, and the obtained segmentation rates are presented in Table 4.2.

illustration not visible in this excerpt

Figure 4.3: Image segmentation. (a) AHDB database, (b) Proposed database.

Table 4.2: Segmentation Results.

illustration not visible in this excerpt

The correct segmentation rates for the AHDB and proposed databases are 89% and 92% respectively, obtained using Equation 2.30. Equation 2.31 was used to calculate the segmentation errors, which are 11% for the AHDB database and 8% for the proposed database and are caused by variation in handwriting, such as the unstable spaces between sub-words and semi-words. Different types of error, listed in Table 4.2 and illustrated in Figure 4.4, occurred during the segmentation process:

- Over segmentation: the number of segments is larger than the actual number.
- Under segmentation: the number of segments is less than the actual number.
- Misplaced segmentation: the number of segments is correct but the limits are wrong.

The output sub-images from the segmentation algorithm go through an evaluation to check whether they were segmented correctly. First, the mean and standard deviation (SD) are calculated for the handwritten text images of each Arabic text in the used databases, as shown in Figure 4.5.

illustration not visible in this excerpt

Figure 4.4: Segmentation errors. (a) Over segmentation, (b) Under segmentation, (c) Misplaced segmentation.

In addition, the height, width and area are calculated. The same calculations are performed for each output handwritten sub-image of the segmentation algorithm. After that, Equation 2.24 is applied to both sets of results to indicate whether the sub-image is segmented correctly or not.

illustration not visible in this excerpt

Figure 4.5: Mean and SD of handwritten image for same Arabic text.
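The acceptance test can be sketched as follows. Since Equation 2.24 is not reproduced in this excerpt, a simple tolerance test against the per-text reference statistics stands in for it, and the statistics vector (mean, height, width, area) is an assumption based on the description above.

```python
import numpy as np

def is_valid_segment(sub_img: np.ndarray,
                     ref_stats: np.ndarray,
                     ref_sd: np.ndarray,
                     k: float = 2.0) -> bool:
    """Accept a segmented sub-image when its statistics lie within k
    standard deviations of the reference statistics gathered for the
    same Arabic text. This tolerance test is only a stand-in for
    Equation 2.24, which is not reproduced in this excerpt."""
    h, w = sub_img.shape
    stats = np.array([sub_img.mean(), h, w, h * w], dtype=float)
    return bool(np.all(np.abs(stats - ref_stats) <= k * ref_sd))
```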

4.2.3 Handwritten Text Image Preprocessing

In this stage several processes are performed on the handwritten sub-images in order to make them ready for the next stages. These processes are performed using the algorithms explained in chapter three (section 3.3). The evaluation of these processes is presented in the following sections:

A. Image Thresholding

To evaluate the proposed image thresholding algorithm, several handwritten text images from various handwritten text databases are used. The output of applying the thresholding algorithm is a binary image that contains two types of pixels: black (0) for background and white (1) for foreground. By applying the proposed image thresholding algorithm to the AHDB, IESK-ArDB and proposed databases, the output images become clear, readable and noiseless, as shown in Figure 4.6.

illustration not visible in this excerpt

Figure 4.6: Image thresholding by the proposed thresholding method. (a) Input images, (b) Output images.
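For orientation, a minimal binarization sketch is given below. Otsu's global threshold merely stands in for the proposed intensity/FCM-based algorithm, whose steps are not reproduced in this excerpt; only the output convention (background 0, foreground 1) is taken from the text above.

```python
import numpy as np
from skimage.filters import threshold_otsu

def binarize(gray: np.ndarray) -> np.ndarray:
    """Binarize a grayscale handwriting scan: background -> 0 (black),
    ink (foreground) -> 1 (white), matching the convention above.
    Otsu's global threshold is used here as a stand-in for the
    proposed FCM-based thresholding algorithm."""
    t = threshold_otsu(gray)
    return (gray < t).astype(np.uint8)  # ink is darker than the paper
```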

Another evaluation of the image thresholding algorithm can be performed using the ground-truth images provided with a handwritten database such as IESK-ArDB. The IESK-ArDB database provides handwritten text images produced with Adobe Photoshop by painting in black those pixels that are expected to be assigned to the background by the thresholding algorithm, and painting in white those expected to be assigned to the foreground.

More precisely, pixels in the resulting binary image can be classified as pixels correctly assigned to the foreground, pixels correctly assigned to the background, pixels incorrectly assigned to the foreground, and pixels incorrectly assigned to the background. The Misclassification Error (ME) from chapter two (section 2.7) is used to count the misclassified pixels between the binary image and the ground-truth image, as in Table 4.3. The optimal ME value is zero, so results close to zero indicate the best thresholding and vice versa. Table 4.3 shows the strength of the proposed image thresholding algorithm through the number of misclassified pixels between the binary images and the ground-truth images. A low number of misclassified pixels makes the features extraction methods work better and also increases the recognition accuracy. Handwritten text images with few or no misclassified pixels (ME = 0) keep each handwritten text clear and unique and preserve the original shape of the handwriting without any change, which lets the features extraction methods extract the features properly, without conflicts between the handwritten texts.

Table 4.3: Evaluation of proposed thresholding algorithm on IESK-ArDB database.

illustration not visible in this excerpt
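Assuming the standard definition of ME from the binarization literature, which matches the four pixel categories described above, the measure can be computed as follows.

```python
import numpy as np

def misclassification_error(result: np.ndarray, truth: np.ndarray) -> float:
    """Standard ME between a binary result and its ground truth:
    ME = 1 - (|B_gt & B_res| + |F_gt & F_res|) / (|B_gt| + |F_gt|),
    where F/B are the foreground/background pixel sets. 0 means a
    perfect match with the ground truth; 1 means every pixel is
    misclassified."""
    res = result.astype(bool)
    gt = truth.astype(bool)
    correct = np.sum(gt & res) + np.sum(~gt & ~res)
    return 1.0 - correct / gt.size
```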

On the other hand, more misclassified pixels reduce the recognition accuracy and degrade the features extraction methods. They force the recognition system to use several features extraction methods in order to extract unique features for each handwritten text without conflicts, which leads to more processing time and a very long feature vector.

B. Noise Removal

To assess the robustness of the system and the proposed noise removal method, noise models have been added to the handwritten sub-images. These noise models were used to generate a set of sub-images covering different levels of degradation found in real-life scanned handwritten text document images, allowing the performance of the proposed system to be studied carefully. Salt-and-pepper noise and Gaussian noise have been added to the handwritten images. Table 4.4 shows the added noise and the accuracy of the proposed system with this noise before applying the proposed noise removal algorithm. The recognition accuracy of the system is computed using Equation 2.30 in chapter two (section 2.7).

Table 4.4: Recognition accuracy after adding noise.

illustration not visible in this excerpt

The PSNR and MSE results in Table 4.4 show the effect of adding noise to the handwritten images: a small PSNR and a large MSE value indicate a noisy, poor-quality image. Although noise is added to the handwritten images, the proposed system still achieves good accuracy, as shown in Table 4.4.
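Both quality measures are standard. A minimal sketch, assuming 8-bit images with a peak value of 255, is:

```python
import numpy as np

def mse_psnr(a: np.ndarray, b: np.ndarray, peak: float = 255.0):
    """MSE and PSNR (in dB) between two images of equal size.
    Lower MSE / higher PSNR mean the images are closer, which is how
    Tables 4.4 and 4.5 are read: noise lowers the PSNR, and a good
    noise-removal result restores it."""
    mse = float(np.mean((a.astype(float) - b.astype(float)) ** 2))
    psnr = float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
    return mse, psnr
```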

On the other hand, for performance evaluation of the proposed noise removal algorithm, MSE and PSNR are computed on the images before and after applying the proposed noise removal method. Furthermore, Table 4.5 shows the recognition accuracy of the proposed system after applying the noise removal algorithm to the noisy handwritten images. The recognition accuracy of the system is again computed using Equation 2.30 in chapter two (section 2.7).

Table 4.5: Experimental results of applying the proposed noise removal algorithm.

illustration not visible in this excerpt

The results in Table 4.5 show the benefit of using the proposed noise removal algorithm. The MSE is computed from the differences between the image before and after applying the proposed noise removal method. The MSE results in the table are small, which means better noise removal, while the PSNR results are high, and a higher PSNR value means better noise removal. The system also achieved a high recognition accuracy after applying the proposed noise removal algorithm.

C. Black Space Elimination

The proposed Black Space Elimination (BSE) algorithm has a strong impact on the recognition accuracy of the proposed system. Removing the black space around the handwritten text avoids the redundancy of the background pixels, which are represented by the value 0. This makes the features extraction methods work well and keeps undesired pixels from being used for extraction. Table 4.6 shows the accuracy of the recognition system before and after applying the proposed BSE algorithm (Algorithm 3.4).

Table 4.6: Experimental results of applying BSE algorithm.

illustration not visible in this excerpt

As seen in Table 4.6, the recognition accuracy increased by 3.317% when the BSE algorithm is used. Removing the unwanted background pixels lets the features extraction methods extract the features directly from the shape of the handwritten text. This makes features extraction fast, because fewer image pixels are used to extract the required features, and it avoids the uninformative pixels that lead to redundant features and reduce the recognition accuracy.
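One plausible reading of the BSE step, sketched under the assumption that it crops the sub-image to the bounding box of its foreground pixels (Algorithm 3.4 itself is not reproduced in this excerpt):

```python
import numpy as np

def black_space_elimination(binary_img: np.ndarray) -> np.ndarray:
    """Crop a binary sub-image (foreground = 1) to the bounding box of
    its foreground pixels, removing the uninformative background margin
    around the handwritten text. Assumes the image contains at least
    one foreground pixel."""
    rows = np.any(binary_img, axis=1)
    cols = np.any(binary_img, axis=0)
    r0, r1 = np.where(rows)[0][[0, -1]]
    c0, c1 = np.where(cols)[0][[0, -1]]
    return binary_img[r0:r1 + 1, c0:c1 + 1]

# Example: a 3x2 patch of ink inside a 6x8 background.
img = np.zeros((6, 8), dtype=np.uint8)
img[2:4, 3:6] = 1
print(black_space_elimination(img).shape)  # (2, 3)
```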

D. Image Scaling

The proposed work tested various image sizes. Increasing the image size increases the recognition accuracy but slows down the recognition process, while reducing the image size lowers the recognition accuracy and loses some image information, but makes the recognition process faster. Several sizes were tested in order to find the best unified image size, and the best accuracy was obtained with the 128x128, 128x64, and 64x128 image sizes for the recognition process in module1. The experimental results with different image sizes are illustrated in Table 4.7.

Table 4.7: Experimental results with various image sizes.

illustration not visible in this excerpt
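A minimal scaling sketch follows, assuming nearest-neighbour resampling so the image stays binary; the resampling choice is not specified in this excerpt.

```python
import numpy as np
from PIL import Image

def scale_to(binary_img: np.ndarray, size=(128, 128)) -> np.ndarray:
    """Resize a binary sub-image to one of the unified sizes used in
    module1 (128x128, 128x64 or 64x128). `size` is (width, height) in
    PIL terms; nearest-neighbour resampling keeps the image binary."""
    im = Image.fromarray((binary_img * 255).astype(np.uint8))
    im = im.resize(size, Image.NEAREST)
    return (np.asarray(im) > 127).astype(np.uint8)

img = np.zeros((50, 70), dtype=np.uint8)
img[10:40, 20:60] = 1
print(scale_to(img, (128, 128)).shape)  # (128, 128)
```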

4.2.4 Features Extraction

Various features extraction methods are used in the proposed system in order to extract the required features. Each method has its strength in different aspects. All the methods are applied to the images to obtain the best features that represent the desired class (handwritten sub-word) and distinguish it from the other classes in module1. These methods extract different numbers of features: a large number of features makes the recognition and identification process slow but increases the accuracy, while a small number of features makes the process fast but reduces the accuracy of the system.

For the MHOG1 features (section 3.3.4 D), more than one edge detection filter is applied in order to get the best accuracy. The HOG descriptor has its own default filter for edge detection, which was reviewed in chapter two (section 2.4.4 B (I)). Furthermore, the Sobel, Canny and Roberts filters and the proposed filters (Equations 3.1 and 3.2) are used, and the best recognition accuracy is obtained with the proposed filters. Table 4.8 compares the different edge detection filters and the proposed one on the AHTRS system.

Table 4.8: Comparison of results for different edge detection filters.

illustration not visible in this excerpt

In addition, several numbers of bins are tested to obtain the optimal one, and both signed and unsigned gradient orientations are tested for better performance. The cell and block numbers and sizes are also chosen after hundreds of tests on the used handwritten databases. The results of testing all the mentioned values are shown in Table 4.9.

Table 4.9: Comparison of results for different MHOG1 values.

illustration not visible in this excerpt

The proposed MHOG1 features use the overlapped blocks explained in chapter three (Figure 3.18) to compute the histograms. Overlapping increases the recognition accuracy compared with non-overlapped blocks, because it makes each histogram depend on the previous and next ones, which gives very useful details about the appearance of the handwritten text. The effects of applying the overlapping process in MHOG1 are illustrated in Table 4.10.

Table 4.10: Experiment results for overlapping approach.

illustration not visible in this excerpt
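To make the overlapping-block computation concrete, the following sketch uses the standard HOG from scikit-image, where the block window slides one cell at a time. The parameter values are illustrative, and the proposed edge-detection filters (Equations 3.1 and 3.2) are not reproduced in this excerpt, so the library's default gradient filter is used instead.

```python
import numpy as np
from skimage.feature import hog

def mhog_like_features(img: np.ndarray) -> np.ndarray:
    """Baseline HOG descriptor with overlapping blocks, standing in for
    MHOG1. scikit-image slides the block window one cell at a time, so
    each cell histogram contributes to several neighbouring blocks,
    which is the overlapping effect discussed above."""
    return hog(img.astype(float),
               orientations=9,          # number of bins (tuned in Table 4.9)
               pixels_per_cell=(8, 8),
               cells_per_block=(2, 2),  # overlapping 2x2-cell blocks
               block_norm="L2-Hys",
               feature_vector=True)

# A 128x128 input gives 15x15 block positions x 2x2 cells x 9 bins.
vec = mhog_like_features(np.random.rand(128, 128))
print(vec.shape)  # (8100,)
```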

For the DCT (Algorithm 3.6), there are different approaches to extracting the features. The features can be extracted directly from the handwritten sub-image as one block, or by dividing the handwritten sub-image into blocks and finding the DCT coefficients for each block. Four approaches were attempted in the proposed work: finding the DCT coefficients from the sub-word directly, and dividing the sub-image into 4x4, 6x6 and 8x8 blocks. The experimental results of the various approaches for finding the DCT coefficients are shown in Table 4.11.

Table 4.11: Experiment results for different dividing approaches.

illustration not visible in this excerpt

Applying the DCT to the image directly gives the best recognition accuracy after testing 500 Arabic handwritten sub-images. The output of applying the DCT is an array of coefficients with the same size as the handwritten sub-image. The final step is choosing the appropriate coefficients as features. There are two ordering techniques for choosing these coefficients: taking the coefficients sequentially, or taking them in zig-zag order. The results of applying both ordering techniques are shown in Table 4.12.

Table 4.12: Ordering techniques of selecting coefficients.

illustration not visible in this excerpt
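A minimal sketch of the whole-image DCT with zig-zag coefficient selection follows; the number of retained coefficients is an illustrative assumption, not the value used in the thesis.

```python
import numpy as np
from scipy.fft import dctn

def zigzag_indices(n: int):
    """JPEG-style zig-zag traversal order of an n x n array: walk the
    anti-diagonals, alternating direction on each one."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def dct_features(img: np.ndarray, n_coeffs: int = 100) -> np.ndarray:
    """2-D DCT of the whole sub-image (the best approach in Table 4.11),
    keeping the first n_coeffs coefficients in zig-zag order so the
    low-frequency energy is taken first. Assumes a square sub-image,
    e.g. the unified 128x128 size."""
    coeffs = dctn(img.astype(float), norm="ortho")
    order = zigzag_indices(coeffs.shape[0])[:n_coeffs]
    return np.array([coeffs[r, c] for r, c in order])

print(dct_features(np.random.rand(128, 128)).shape)  # (100,)
```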

In the proposed system, the DCT features are extracted both with the common DCT algorithm and with the FCT algorithm. The experiments show that extracting the DCT features with the FCT algorithm is faster than with the DCT algorithm, with the same accuracy. Table 4.13 shows the features extraction times of the DCT and FCT algorithms.

Table 4.13: Features extraction times of DCT and FCT methods.

illustration not visible in this excerpt

The results of the individual extracted features and of combining the extracted features are displayed in Table 4.14.

Table 4.14: Comparison of results for different features extraction methods.

illustration not visible in this excerpt

Each of these features outperforms the others on some set of handwritten classes. By combining them, the weakness of each single feature type is compensated by the other feature types, and the whole proposed system is thereby improved. The results verify that an image can be modeled by its unique features, taking advantage of the corresponding sort of features to represent it; recognition and identification based on the combined features give higher results than those of the individual features.

4.2.5 Features Normalization (FN)

In the proposed work, the features are normalized into the range [-1, 1] using the proposed features normalization algorithm (Algorithm 3.8). The proposed algorithm reduces the training and testing time of the system by simplifying the computations. The impact of the proposed features normalization algorithm on the system is illustrated in Table 4.15.

Table 4.15: Comparison results of applying FN algorithm.

illustration not visible in this excerpt
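A minimal normalization sketch, written under the assumption that Algorithm 3.8 is a per-feature min-max scaling into [-1, 1] (the algorithm's exact steps are not reproduced in this excerpt):

```python
import numpy as np

def normalize_features(X: np.ndarray) -> np.ndarray:
    """Min-max scaling of each feature column into [-1, 1]. Constant
    columns are mapped to 0 to avoid division by zero. X holds one
    feature vector per row."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    return np.where(hi > lo, 2.0 * (X - lo) / span - 1.0, 0.0)

X = np.array([[0.0, 5.0], [10.0, 5.0]])
print(normalize_features(X))  # [[-1.  0.] [ 1.  0.]]
```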

4.2.6 Classification

Different SVM kernels are used in the proposed system; SVM is commonly used with linear, polynomial and RBF kernels. A multiclass SVM classifier has been used in the proposed system, and it achieved a very high recognition accuracy using the polynomial kernel for both modules. The recognition accuracies achieved in module1 with the linear, polynomial and RBF kernels are shown in Figure 4.7.

illustration not visible in this excerpt

Figure 4.7: The recognition accuracy of different SVM kernels.
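A minimal sketch of the classifier configuration, using scikit-learn's explicit one-vs-all wrapper around a polynomial-kernel SVM; the kernel degree, the C value and the synthetic data are illustrative assumptions, not the tuned values from the thesis.

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

# One-vs-all SVM with a polynomial kernel, the combination that gave
# the best accuracy in Figure 4.7.
clf = OneVsRestClassifier(SVC(kernel="poly", degree=3, C=1.0))

# Synthetic stand-in for the normalized feature base in [-1, 1]:
# 3 classes, 20 feature vectors each, 20 features per vector.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(60, 20))
y = np.repeat(np.arange(3), 20)
perm = rng.permutation(60)
X, y = X[perm], y[perm]

clf.fit(X[:45], y[:45])
print(clf.score(X[45:], y[45:]))  # accuracy on the held-out 15 samples
```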

The proposed system uses three Arabic handwritten databases for evaluation. Each database has a different number and type of Arabic handwritten images, and the differences in writing style and in the number of images in each database lead to different recognition accuracies. Table 4.16 presents the comparison results of the different databases with the SVM kernels.

Table 4.16: The recognition accuracy of different Arabic Databases and SVM kernels.

illustration not visible in this excerpt

4.2.7 Number of Images Set

The accuracy of any pattern recognition system is directly affected by the number of images used for training and testing. When the machine is trained on more data samples, it is able to predict the result more accurately: increasing the number of training images increases both the accuracy and the training time, while decreasing the training images reduces the training time. The experimental results of using the AHDB database are shown in Table 4.17.

Table 4.17: Experiment results of different training and testing images numbers.

illustration not visible in this excerpt

4.3 Evaluation of the AHTRS System (module2)

To evaluate module2, the focus is on the features extraction and classification stages. This section discusses the results of applying the proposed features extraction algorithms for writer identification and examines the effects of applying different classification kernels and approaches.

4.3.1 Features Extraction

As mentioned in chapter three (section 3.4.1), two types of features are extracted from the handwritten sub-images. MHOG2 is the first; it gives unique orientation details of the handwritten text for each writer. The shape features are the second; they give a distinctive shape representation of the same handwritten text for different writers.

Table 4.18: Experiment results for different MHOG2 values.

illustration not visible in this excerpt

For MHOG2, the handwritten sub-images are normalized to several sizes in order to choose the best one; the sub-images are also divided into different numbers of blocks, and several numbers of bins are tested to obtain the optimal one. Both signed and unsigned gradient orientations are tested for better performance. Table 4.18 shows the results of testing all the mentioned values.

The best accuracy for the MHOG2 features is obtained with images normalized to 128x128, divided into 4x4 blocks, with 10 bins, as shown in Table 4.18. The results of applying MHOG2 (Algorithm 3.11) and the shape features extraction (Algorithm 3.12) are illustrated in Table 4.19.

Table 4.19: The identification accuracy of the extracted features.

illustration not visible in this excerpt

4.3.2 Classification

In module2, two approaches are used to recognize the writer of the handwritten text: the first depends on the handwritten sub-word level and the second on the handwritten text level. The writer recognition results obtained for both approaches with various SVM kernels are illustrated in Table 4.20.

Table 4.20: The identification accuracy of different SVM kernels.

illustration not visible in this excerpt

4.4 Discussion

The performance of the proposed system is evaluated in order to check the effectiveness of the proposed algorithms, which cover image preprocessing, text segmentation, features extraction, classification and post-processing. The proposed system has two modules: the first module is tested on three Arabic handwritten databases, and the second module on the proposed Arabic handwritten database. The system used 4095 handwritten images from the AHDB database, 600 handwritten images from the IESK-ArDB database and 1300 handwritten images from the proposed database. The proposed segmentation method segments the handwritten text images into a form suitable for the system to work properly. The efficient proposed preprocessing methods give the best results in converting the used handwritten images into binary form and normalizing them to the most appropriate size, which is 128x128. The features extraction methods extract the best features that represent each Arabic text according to its characteristics. Moreover, SVM, the best classifier for dealing with large data samples, is used and gives better accuracy results. Finally, the system processing time was 6.2 seconds.

The proposed Arabic handwritten database has character, word and text images. The proposed system achieved 99.8% for handwritten character recognition, 99.7% for handwritten word recognition and 98% for handwritten text recognition. Additionally, the system was evaluated with three classifiers, SVM, KNN and ANN, which obtained recognition accuracies of 98%, 93% and 94% respectively for handwritten text recognition. The identification accuracy was 85% using SVM, 81% using KNN and 82.3% using ANN for the sub-word level approach, and 100% using SVM, 95% using KNN and 98% using ANN for the text level approach.

Chapter Five Conclusions and Suggestions for Future Work

5.1 Conclusions

This chapter summarizes the evaluation of the thesis results and shows the main contributions of the proposed work. Based on the implementation of the proposed work, the following conclusions are drawn:

1. The proposed system relies on a handwritten sub-word segmentation approach: a simple, practical and efficient proposed segmentation algorithm that achieved a high segmentation rate, as shown in Table 4.2, and thus accurate recognition.

2. Several methods and algorithms in the proposed preprocessing stage have an ideal effect on the proposed system:

- The proposed thresholding algorithm, based on calculating the intensity values of pixels in the gray-scale image and on Fuzzy C-Means Clustering (FCM), selects essential thresholding points that assign the background and foreground pixels correctly, with few misclassification error pixels, as illustrated in Table 4.3.
- Unwanted pixels are removed by a proposed algorithm that depends on two thresholds derived from the characteristics of the Arabic language, without removing any important pixels from the binary image. The proposed algorithm retains the best MSE and PSNR results after removing the unwanted pixels of the image, as shown in Table 4.5.
- One of the important algorithms of the preprocessing stage that increased the recognition accuracy is the Black Space Elimination (BSE) algorithm, evaluated in Table 4.6. The algorithm increased the system accuracy by removing the unwanted pixels in the image background, which makes the features extraction methods work fast and well by extracting the features only from the important part of the image.

The used image normalization method brings all the handwritten images to the same size, 128x128, in order to create similarity in the shape size of the text across different images. The appropriate choice of the image size in the proposed system is shown in Table 4.7.

3. The main key to the success of the proposed system is the features extraction stage.

The proposed features extraction methods obtain the most useful features representing the handwritten text in the image, making the recognition and identification results efficient. Table 4.14 shows the results obtained with these methods and how accurate each one is. Besides, the employment of MHOG1 in the proposed system is the main successful part of this thesis, and the results in Table 4.8 show the strength of the proposed edge detection filter for MHOG1 over the other filters.

4. The results in Table 4.19 show the strength of the proposed MHOG2 and shape features extraction algorithms, which give unique features for each text’s writer.

5. The training and classification times are reduced by the features normalization (FN) algorithm, subsequently reducing the system processing time.

6. The use of the one-vs-all approach with the polynomial kernel of the Support Vector Machine (SVM) classification algorithm, shown in Figure 4.7, yields more robust recognition and identification performance than the other approaches, kernels and classifiers.

7. The proposed system achieved better accuracy on the three different handwritten databases, presented in Table 4.16, than all the previous works discussed in section 1.3 of chapter one.

8. The proposed handwritten text database gives better accuracy results than the other handwritten databases in Table 4.16; it also works for the identification process and gives ideal accuracy, as shown in Table 4.20, for both the sub-word level and text level identification approaches. Besides, the database supports character and word recognition, as discussed in section 3.5 of chapter three.

9. The training and testing set sizes in Table 4.17 show that the accuracy results and the classification time of the proposed system were not affected much by increasing the testing set and decreasing the training set; on the other hand, increasing the training set leads to optimal results.

10. All the stages of the proposed system execute in only 6.2 seconds.

5.2 Suggestions for Future Work

The proposed system can be improved in various directions:

1. Improving the proposed system to work in real time by applying it to online recognition and identification on mobile and tablet devices.
2. Employing the proposed system as a retrieval system that takes a user keyword query and returns the required handwritten documents.
3. Modifying the proposed system to work with printed Arabic document recognition.



