
Data Mining Unveiled. Definition, Evolution, and Key Processes

Term Paper, 2014, 10 Pages, Grade: 1.7

Author: Anonymous (Author)

Computer Science - Internet, New Technologies

The objective of this work is to provide a comprehensive understanding of data mining: defining the concept, tracing its evolution, explaining its methods and tasks, and describing its typical process.

Excerpt


Table of Contents

1 Introduction

1.1 Problem definition

1.2 Aim of the work

2 Methods and tasks

2.1 Clustering

2.2 Association

2.3 Decision trees

3 Data mining process

4 Conclusion

Objectives and Themes

This work aims to define the term data mining, describe its origin and development, clarify its fundamental methods and tasks, and outline a typical data mining process within a business management context.

  • Evolution and definition of data mining
  • Core methodologies: Clustering, Association, and Decision trees
  • Technical components of decision trees
  • Quantitative evaluation metrics (Support and Confidence)
  • Challenges in the data mining process (data quality and protection)

Excerpt from the Book

2.2 Association

The second method is association. With the help of association rules, one can discover previously unknown dependencies in customer behaviour. The classic question is: which customers who have bought product A are also likely to buy product B? I explain the connections using the example of fruit and vegetable sales.

We have 6 transactions; Table 1 shows what was bought in each of them. We now want to formulate a rule that says: if apples are bought, then pears are bought. The quality of such a rule is determined by its degree of uncertainty, which is characterised by two numbers called support and confidence.

Support is the number of transactions that contain all products from both the condition (the "if" part) and the consequence (the "then" part) of the rule, expressed as a percentage of all transactions. In our example, the support is 4/6 = 2/3, i.e. about 67%: in two thirds of the transactions, apples and pears are bought together.
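The support calculation can be sketched in a few lines of Python. Table 1 is not reproduced in this excerpt, so the six baskets below are hypothetical; they were chosen only so that apples and pears occur together in 4 of 6 transactions, as stated above. The function name `support` and the item names other than apples and pears are likewise assumptions made for illustration.

```python
# Hypothetical transactions (NOT the original Table 1), chosen so that
# apples and pears occur together in 4 of 6 baskets, as stated in the text.
transactions = [
    {"apples", "pears"},
    {"apples", "pears", "carrots"},
    {"apples", "pears", "tomatoes"},
    {"apples", "pears"},
    {"apples", "carrots"},
    {"tomatoes"},
]


def support(transactions, condition, consequence):
    """Share of all transactions that contain every item of both rule parts."""
    items = set(condition) | set(consequence)
    hits = sum(1 for basket in transactions if items <= basket)
    return hits / len(transactions)


rule_support = support(transactions, {"apples"}, {"pears"})
print(f"support(apples -> pears) = {rule_support:.1%}")
```

Because support counts baskets that contain both parts of the rule, it is symmetric: the rule "if pears, then apples" has the same support on the same data.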

The second value is the confidence of the rule. It is the quotient of the number of transactions that contain both the "if" and the "then" part and the number of transactions that contain the "if" part. It expresses the strength of the dependency and lies between 0 and 1. A value of 1 expresses a mandatory dependency: whenever apples are bought, pears are bought as well.
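A minimal sketch of the confidence calculation follows. The baskets are again hypothetical stand-ins for Table 1, which is not part of this excerpt, so the resulting confidence values illustrate the formula only and are not figures from the original work. The helper names `count_containing` and `confidence` are assumptions.

```python
# Hypothetical transactions (NOT the original Table 1).
baskets = [
    {"apples", "pears"},
    {"apples", "pears", "carrots"},
    {"apples", "pears", "tomatoes"},
    {"apples", "pears"},
    {"apples", "carrots"},
    {"tomatoes"},
]


def count_containing(baskets, items):
    """Number of baskets that contain every item in `items`."""
    items = set(items)
    return sum(1 for basket in baskets if items <= basket)


def confidence(baskets, condition, consequence):
    """Baskets with both rule parts divided by baskets with the 'if' part."""
    with_condition = count_containing(baskets, condition)
    if with_condition == 0:
        raise ValueError("the condition never occurs, confidence is undefined")
    with_both = count_containing(baskets, set(condition) | set(consequence))
    return with_both / with_condition


apples_to_pears = confidence(baskets, {"apples"}, {"pears"})
pears_to_apples = confidence(baskets, {"pears"}, {"apples"})
print(f"confidence(apples -> pears) = {apples_to_pears:.2f}")
print(f"confidence(pears -> apples) = {pears_to_apples:.2f}")
```

Unlike support, confidence depends on the direction of the rule. In this made-up data, every basket with pears also contains apples, so the reverse rule reaches the value 1, while the rule from apples to pears does not.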

Summary of Chapters

1 Introduction: Provides an overview of the growing importance of data as a business raw material and introduces the development of KDD and data mining.

2 Methods and tasks: Explains key data mining techniques including customer segmentation (Clustering), association rule discovery, and predictive modeling using decision trees.

3 Data mining process: Describes the practical workflow of extracting information from stored data and identifies potential pitfalls like data protection and quality issues.

4 Conclusion: Summarizes the role of exploratory statistics as a bridge for companies to derive useful insights from large volumes of customer data.

Keywords

Data mining, KDD, Knowledge Discovery, Clustering, Association, Decision trees, Support, Confidence, Customer segmentation, Exploratory statistics, Business administration, Data quality, Data protection, Marketing, Pattern recognition

Frequently Asked Questions

What is the core focus of this publication?

The work focuses on explaining the foundations of data mining, its evolution, and its application as a tool for extracting knowledge from large business data sets.

What are the primary thematic areas covered?

The main themes include the definition of data mining, specific methods such as clustering and association analysis, the mechanics of decision trees, and the overall data mining process.

What is the primary goal of this research?

The primary goal is to define the term data mining, describe its development, clarify its methodologies, and outline the typical processes involved in data analysis.

Which scientific methods are analyzed in the work?

The work examines clustering, association rules (using metrics like support and confidence), and decision trees as predictive models.

What content is addressed in the main body?

The main body details the methodology of clustering, provides a practical calculation example for association rules, and breaks down the structure of decision trees including nodes and branches.

Which keywords characterize this work?

Key terms include data mining, KDD, exploratory statistics, clustering, association, and decision trees.

How is the quality of an association rule determined?

The quality is determined by two measures: "Support," the share of all transactions that contain both the condition and the consequence of the rule, and "Confidence," which measures the strength of the dependency as the share of transactions containing the condition that also contain the consequence.

What are the components of a decision tree according to this work?

A decision tree consists of nodes (where attributes are queried), branches (representing decisions), hierarchy levels, and leaves (which embody groups of property values).
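The components named above can be sketched as a small data structure. This is an illustrative assumption, not the representation used in the work: the class names `Node` and `Leaf`, the attributes `age_group` and `income`, and the group labels are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Leaf:
    """End point of the tree: the group a record is assigned to."""
    label: str


@dataclass
class Node:
    """Inner node: queries one attribute and has one branch per answer."""
    attribute: str
    branches: Dict[str, Union["Node", Leaf]] = field(default_factory=dict)


def classify(tree, record):
    """Follow the branch matching the record's value until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.branches[record[tree.attribute]]
    return tree.label


def depth(tree):
    """Number of hierarchy levels of nodes above the deepest leaf."""
    if isinstance(tree, Leaf):
        return 0
    return 1 + max(depth(child) for child in tree.branches.values())


# Hypothetical tree: the root queries age_group, a second level queries income.
tree = Node("age_group", {
    "young": Leaf("unlikely buyer"),
    "adult": Node("income", {
        "high": Leaf("likely buyer"),
        "low": Leaf("unlikely buyer"),
    }),
})

print(classify(tree, {"age_group": "adult", "income": "high"}))
print(depth(tree))
```

The sketch only shows how a finished tree is traversed. How the attributes queried at each node are selected from data is a separate step that this example does not cover.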

What common problems can arise during the data mining process?

Common issues include data protection concerns on the internet, the presence of underrepresented or missing information, incorrect data entry, and outliers in databases.
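Two of these problems, missing information and outliers, can be checked for before the actual analysis. The following is a rough sketch under stated assumptions: the records and the field name `basket_value` are invented, and the interquartile-range rule with a factor of 1.5 is a common convention swapped in here for illustration, not a method described in the work.

```python
import statistics

# Hypothetical customer records; None marks a missing value.
records = [
    {"customer": "A", "basket_value": 23.5},
    {"customer": "B", "basket_value": 19.0},
    {"customer": "C", "basket_value": 27.0},
    {"customer": "D", "basket_value": None},
    {"customer": "E", "basket_value": 21.5},
    {"customer": "F", "basket_value": 950.0},
    {"customer": "G", "basket_value": 25.0},
]


def find_missing(records, field_name):
    """Records in which the field is absent or None."""
    return [r for r in records if r.get(field_name) is None]


def find_outliers(records, field_name, k=1.5):
    """Records whose value lies more than k * IQR outside the quartiles."""
    present = [r for r in records if r.get(field_name) is not None]
    values = [r[field_name] for r in present]
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [r for r in present if not low <= r[field_name] <= high]


missing = [r["customer"] for r in find_missing(records, "basket_value")]
outliers = [r["customer"] for r in find_outliers(records, "basket_value")]
print("missing:", missing)
print("outliers:", outliers)
```

A flagged value is only a candidate for review. Whether it is an incorrect entry or a genuine but unusual purchase has to be decided with knowledge of the business context.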


Details

Title
Data Mining Unveiled. Definition, Evolution, and Key Processes
University
University of Applied Sciences Braunschweig / Wolfenbüttel; Salzgitter
Grade
1.7
Author
Anonymous (Author)
Publication Year
2014
Pages
10
Catalog Number
V1361647
ISBN (PDF)
9783346886866
Language
English
Keywords
Clustering, Explorative Statistics, Data, Knowledge Discovery, Association, Decision trees
Product Safety
GRIN Publishing GmbH
Cite this text
Anonymous (Author), 2014, Data Mining Unveiled. Definition, Evolution, and Key Processes, Munich, GRIN Verlag, https://www.grin.com/document/1361647