Convolutional Neural Network in classifying scanned documents

Internship Report, 2016, 33 pages

Author: Tai Doan

Computer Science - Applied Computer Science
In this project, I created and augmented a dataset from a set of given images in order to train and test a convolutional neural network that classifies images of scanned documents into five classes. To generate the dataset, I applied several image processing techniques: sliding window, rotation, flipping, and pyramid-sizing. The result of this phase is a set of images that all have the same size, 244x224x3. After being labeled, these images were divided into three datasets for training, validating, and testing the network.
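The four augmentation techniques named above can be sketched in a few lines of NumPy. This is only an illustration, not the report's own code: the function names, the stride, the number of pyramid levels, and the blank input scan are all assumptions, and the stated size 244x224x3 is read here as height x width x channels.

```python
import numpy as np

def sliding_windows(image, window_h, window_w, stride):
    """Yield fixed-size crops taken at regular steps across the image."""
    height, width = image.shape[:2]
    for top in range(0, height - window_h + 1, stride):
        for left in range(0, width - window_w + 1, stride):
            yield image[top:top + window_h, left:left + window_w]

def pyramid(image, levels):
    """Return the image at successively halved resolutions (naive subsampling)."""
    scaled = [image]
    for _ in range(levels - 1):
        image = image[::2, ::2]  # keep every second row and column
        scaled.append(image)
    return scaled

def flips_and_rotations(patch):
    """Return the patch, its two mirror images, and its 180-degree rotation."""
    return [patch, np.fliplr(patch), np.flipud(patch), np.rot90(patch, 2)]

# Hypothetical blank 488x448 RGB scan standing in for a real document image.
scan = np.zeros((488, 448, 3), dtype=np.uint8)

patches = []
for level in pyramid(scan, levels=2):
    for window in sliding_windows(level, 244, 224, stride=122):
        patches.extend(flips_and_rotations(window))
```

Only the 180-degree rotation is used in this sketch because it preserves the non-square patch shape; a 90-degree rotation would swap height and width and require resizing afterwards.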

The network is a simple convolutional neural network, also called LeNet. It has three convolutional layers and one fully connected layer. After training and validation, the best state of the network was identified and tested on the testing dataset and on some real images. The results showed that LeNet was able to classify document images with fairly high accuracy. At the end of the project, I modified the network and discussed the effect of those changes, with the aim of creating a similar network that performs better than the original. The results showed that the new network performed slightly better than the original version.
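To make the layer structure concrete, the following sketch traces how the feature-map size would shrink through three convolution-plus-pooling stages before the single fully connected layer. The abstract does not give kernel sizes, pooling windows, or filter counts, so the 5x5 kernels, 2x2 pooling, and channel counts (16, 32, 64) below are assumptions chosen purely for illustration.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

def trace_shapes(height, width, conv_channels, kernel=5, pool=2):
    """Return (height, width, channels) after each conv + max-pool stage."""
    shapes = []
    for channels in conv_channels:
        height = conv_output_size(height, kernel) // pool
        width = conv_output_size(width, kernel) // pool
        shapes.append((height, width, channels))
    return shapes

# Input size taken from the abstract; channel counts are hypothetical.
shapes = trace_shapes(244, 224, conv_channels=(16, 32, 64))

# The fully connected layer maps the flattened features to five classes.
last_h, last_w, last_c = shapes[-1]
flat_features = last_h * last_w * last_c
fc_parameters = flat_features * 5 + 5  # weights plus one bias per class
```

Under these assumptions the fully connected layer dominates the parameter count, which is one reason changes to that layer and to the convolutional layers can affect performance differently.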

Excerpt


Table of Contents

  • Introduction
    • Context
      • About ICTLab
      • ARCHIVES project
      • Internship context
    • Report organization
  • State of the art
    • Artificial intelligence & machine learning
    • Artificial neural network (ANN)
      • History
      • Regular neural network
      • Convolutional neural network (LeNet)
      • Training and evaluating
  • Contribution
    • Data creation and augmentation
      • ARCHIVES dataset
      • Creating data
      • Augmenting the data
      • Preparing data
    • Constructing the convolution neural network (LeNet)
      • The model
      • Training
      • Validation and testing
    • Developing the network
  • Results
    • The basic network
      • Testing on the dataset
      • Testing on real images
    • The network modifications
      • Fully connected layer
      • Convolutional layers
    • The new network

Objectives and Key Themes

This internship report focuses on the application of convolutional neural networks (CNNs) for document classification. The primary objective is to develop and evaluate a CNN model capable of accurately classifying scanned documents into five distinct categories.

  • Image processing techniques for dataset generation
  • CNN architecture and training methodology
  • Evaluation and analysis of model performance
  • Network optimization and development
  • Application of CNNs in document classification
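One common way to evaluate a five-class classifier is a confusion matrix together with overall accuracy and per-class recall. The sketch below shows the idea; the labels are made up for illustration and are not results from the report, and the function names are hypothetical.

```python
import numpy as np

NUM_CLASSES = 5

def confusion_matrix(true_labels, predicted_labels, num_classes=NUM_CLASSES):
    """Count how often each true class (row) was predicted as each class (column)."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[true, predicted] += 1
    return matrix

def per_class_recall(matrix):
    """Fraction of each true class that was classified correctly."""
    totals = matrix.sum(axis=1)
    return np.divide(np.diag(matrix), totals,
                     out=np.zeros(len(totals)), where=totals > 0)

# Made-up labels for ten test images, two per class.
true_labels = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
predicted_labels = [0, 1, 1, 1, 2, 2, 3, 0, 4, 4]

matrix = confusion_matrix(true_labels, predicted_labels)
accuracy = np.trace(matrix) / matrix.sum()
recall = per_class_recall(matrix)
```

The off-diagonal entries show which document classes are confused with each other, which overall accuracy alone does not reveal.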

Chapter Summaries

The report begins with an introduction to the project's context, highlighting the ARCHIVES project and its significance in document classification. Chapter 2 provides a comprehensive overview of artificial intelligence, machine learning, and particularly convolutional neural networks. This chapter delves into the history of neural networks, the structure of regular neural networks, and the specific architecture of LeNet, the chosen CNN model for this project. Chapter 3 details the creation and augmentation of the dataset, including image processing techniques like sliding window, rotating, flipping, and pyramid-sizing. The chapter also elaborates on the construction of the LeNet network, its training process, and validation and testing methods. Finally, Chapter 4 presents the results of the network's performance, both on the generated dataset and on real images. It further explores the impact of modifications to the network, including changes to the fully connected and convolutional layers, leading to the development of a new, improved network.
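The division into training, validation, and testing sets, and the selection of the best network state, might look like the sketch below. The split fractions, the random seed, the sample count, and the validation accuracies are all assumptions for illustration; the summary does not state the values the report used.

```python
import numpy as np

def split_indices(num_samples, train_fraction=0.7, validation_fraction=0.15, seed=0):
    """Shuffle sample indices and cut them into train, validation, and test parts."""
    order = np.random.default_rng(seed).permutation(num_samples)
    train_end = round(num_samples * train_fraction)
    validation_end = train_end + round(num_samples * validation_fraction)
    return order[:train_end], order[train_end:validation_end], order[validation_end:]

def best_epoch(validation_accuracies):
    """Index of the epoch whose saved state scored highest on validation data."""
    return int(np.argmax(validation_accuracies))

train_idx, validation_idx, test_idx = split_indices(1000)

# Made-up validation accuracy after each training epoch.
validation_accuracies = [0.61, 0.74, 0.83, 0.88, 0.86, 0.85]
chosen = best_epoch(validation_accuracies)
```

Choosing the state by validation accuracy and reporting results only on the held-out test set keeps the final measurement independent of the selection step.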

Keywords

This internship report focuses on the application of convolutional neural networks (CNNs), image processing techniques, document classification, dataset creation, and model optimization for achieving high accuracy in document classification tasks. The project employs a LeNet architecture for training and evaluation, utilizing techniques like sliding window, rotating, flipping, and pyramid-sizing for data augmentation. The research explores the impact of network modifications, aiming to improve the performance of the CNN model.

Details

Title
Convolutional Neural Network in classifying scanned documents
University
University of Science and Technology of Hanoi (Trường Đại học Khoa học và Công nghệ Hà Nội)
Course
Internship
Author
Tai Doan
Publication Year
2016
Pages
33
Catalog Number
V349852
ISBN (eBook)
9783668371675
ISBN (Book)
9783668371682
Language
English
Keywords
machine learning, deep learning, classification, internship, computer science, neural network, convolutional neural network, LeNet
Product Safety
GRIN Publishing GmbH
Cite this work
Tai Doan (Author), 2016, Convolutional Neural Network in classifying scanned documents, Munich, GRIN Verlag, https://www.grin.com/document/349852