The e-Home project at the Vienna University of Technology is an R&D project that aims to provide assistive technologies for the private households of older people, enabling them to live longer and more independently in their own homes. The e-Home system consists of an adaptive intelligent network of wireless sensors for activity monitoring, together with a central context-aware embedded system.
The primary goal of this thesis is to investigate the possibilities for unsupervised prediction and clustering of user behaviour based on time-series data collected from infrared temperature sensors in the e-Home environment.
Three different prediction approaches are described. The Hourly Based Event Binning approach is compared with two clustering algorithms, Hierarchical Clustering and the Dirichlet Process Gaussian Mixture Model (GMM). Prediction rates are measured on data from three different test persons.
This thesis first examines two different approaches to event detection from infrared signal data. In a second stage, three different methods for unsupervised predictive analytics are discussed and tested on selected data sets. Parameter settings of the clustering algorithms for time-series data are also discussed and tested in detail. Finally, the prediction performance results are compared and the advantages and disadvantages of each method are discussed.
The practical part of this thesis is implemented in IPython Notebook, using Python 2.7 on 64-bit Ubuntu Linux 12.04 LTS. Data analysis is implemented with Python’s Pandas library. Visualisations are made with the Matplotlib and Seaborn libraries.
The results reveal that prediction accuracy depends on the quantity of data and on the spread of the data points. The simplest method in the comparison, Hourly Based Binning, nevertheless gave the best prediction rates overall.
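The idea behind hourly binning can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the 0.5 cut-off and the definition of the prediction rate are assumptions made for illustration, and the code targets Python 3 with the standard library only, rather than the Python 2.7 and Pandas setup described above.

```python
from collections import Counter
from datetime import datetime


def hourly_bin_model(timestamps):
    """For each hour 0..23, return the fraction of training days
    on which at least one event fell into that hour."""
    days = {ts.date() for ts in timestamps}
    if not days:
        return {h: 0.0 for h in range(24)}
    # Count each (day, hour) pair once, so a burst of events
    # on a single day does not dominate the bin.
    active = {(ts.date(), ts.hour) for ts in timestamps}
    per_hour = Counter(hour for _, hour in active)
    return {h: per_hour.get(h, 0) / len(days) for h in range(24)}


def predict_hours(model, min_fraction=0.5):
    """Hours predicted to contain an event (hypothetical cut-off)."""
    return sorted(h for h, frac in model.items() if frac >= min_fraction)


def prediction_rate(predicted_hours, test_timestamps):
    """Share of test events that fall into a predicted hour."""
    if not test_timestamps:
        return 0.0
    hits = sum(1 for ts in test_timestamps if ts.hour in predicted_hours)
    return hits / len(test_timestamps)


# Made-up events from three training days and one test day.
train = [
    datetime(2016, 1, 1, 7, 15), datetime(2016, 1, 1, 7, 40),
    datetime(2016, 1, 1, 22, 5), datetime(2016, 1, 2, 7, 30),
    datetime(2016, 1, 2, 13, 0), datetime(2016, 1, 3, 7, 5),
    datetime(2016, 1, 3, 22, 30),
]
test = [
    datetime(2016, 1, 4, 7, 20), datetime(2016, 1, 4, 12, 0),
    datetime(2016, 1, 4, 22, 10),
]

model = hourly_bin_model(train)
predicted = predict_hours(model)
rate = prediction_rate(predicted, test)
print(predicted)        # [7, 22]
print(round(rate, 2))   # 0.67
```

Because the bins are fixed in advance, the only thing this approach has to learn is a count per bin, which may be one reason such a simple method holds up well when data are scarce.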
Dirichlet Process Gaussian Mixture Model clustering shows the best prediction performance on smaller training data sets and on well-spread data. With further parameter tuning of the Dirichlet Process GMM clustering, the prediction rates could be improved further, coming very close to or even outperforming Hourly Based Binning.
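One parameter that typically matters when tuning a Dirichlet Process mixture is the concentration parameter, often written alpha. The sketch below shows only the stick-breaking construction of the mixture weights, not a full DP-GMM fit, and it is not taken from the thesis: the function names, the truncation at 50 components and the 0.01 weight cut-off are assumptions chosen for illustration.

```python
import random


def stick_breaking_weights(alpha, n_components, seed=0):
    """Draw truncated stick-breaking weights for a Dirichlet Process.

    Each step breaks off a Beta(1, alpha) fraction of the stick
    that is still left; the broken-off piece is one mixture weight.
    """
    rng = random.Random(seed)
    weights = []
    remaining = 1.0
    for _ in range(n_components):
        frac = rng.betavariate(1.0, alpha)
        weights.append(remaining * frac)
        remaining *= 1.0 - frac
    return weights


def effective_components(weights, min_weight=0.01):
    """Number of components with non-negligible weight (hypothetical cut-off)."""
    return sum(1 for w in weights if w > min_weight)


for alpha in (0.5, 10.0):
    counts = [
        effective_components(stick_breaking_weights(alpha, 50, seed=s))
        for s in range(100)
    ]
    print(alpha, sum(counts) / len(counts))
```

A small alpha concentrates most of the weight on a few components, while a large alpha spreads it over many, so the number of clusters is shaped by this parameter rather than fixed in advance.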
Due to the unknown distribution and the well-spread data, choosing the right threshold parameter for Hierarchical Clustering was trickier than initially assumed. Nevertheless, the method was at least applicable to unsupervised predictive analytics on the data sets used.
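The sensitivity to the threshold can be seen in a small, self-contained sketch. It assumes single linkage on one-dimensional event times, where cutting the dendrogram at a distance threshold is equivalent to splitting the sorted values wherever the gap between neighbours exceeds that threshold. The source does not state which linkage or distance measure the thesis uses, so the function name, the linkage choice and the sample data are illustrative assumptions.

```python
def single_linkage_1d(values, threshold):
    """Cluster 1-D values by splitting the sorted sequence at every
    gap larger than `threshold` (single-linkage cut, 1-D case only)."""
    clusters = []
    for v in sorted(values):
        if clusters and v - clusters[-1][-1] <= threshold:
            clusters[-1].append(v)   # close enough: extend current cluster
        else:
            clusters.append([v])     # gap too large: start a new cluster
    return clusters


# Made-up event times as hours of the day (midnight wrap-around is ignored).
event_hours = [6.9, 7.1, 7.4, 12.8, 13.1, 21.7, 22.0, 22.6]

for threshold in (0.1, 0.5, 1.0, 6.0):
    n_clusters = len(single_linkage_1d(event_hours, threshold))
    print(threshold, n_clusters)
# 0.1 8
# 0.5 4
# 1.0 3
# 6.0 2
```

The same eight events yield anywhere from two to eight clusters depending on the cut, and without knowledge of the underlying distribution there is no obvious way to choose among these values.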
Table of Contents
- Introduction
- Motivation
- Outline
- The Goal
- Methodology
- Theory Section
- Gaussian Mixture Model
- Dirichlet Process GMM
- Hierarchical Clustering
- Implementation
- Software Specification
- Data and Sensors
- Data Visualisation and Inspection
- Sensor Values Discretisation and Extraction
- Unsupervised Event Extraction
- Data Structure for Event Analysis
- Data Quality and Quantity
- Predictive Analysis
- Hourly Binning Analysis
- Clustering Analysis
- Hierarchical Clustering Analysis
- Dirichlet Process Gaussian Mixture Model Clustering Analysis
- Summary
- Results
- Discussion
Objectives and Key Themes
This thesis explores the potential of unsupervised prediction and clustering techniques for analyzing user behavior patterns in the e-Home system, a project aimed at providing assistive technologies for elderly individuals. The primary objective is to investigate the effectiveness of different prediction approaches, including hourly binning and clustering algorithms, for analyzing time-series data from infrared temperature sensors.
- Unsupervised prediction and clustering of user behavior patterns in the e-Home system.
- Analysis of time-series data from infrared temperature sensors.
- Comparison of different prediction approaches, including hourly binning, hierarchical clustering, and Dirichlet Process GMM.
- Evaluation of the effectiveness of these methods for predicting user behavior.
- Assessment of the impact of data quantity and distribution on prediction accuracy.
Chapter Summaries
The thesis begins with an introduction that outlines the motivation for this research, the goals of the project, and the methodology used. Chapter 2 provides a theoretical background on Gaussian Mixture Models, Dirichlet Process GMM, and hierarchical clustering, which are the key algorithms explored in the thesis.
Chapter 3 delves into the implementation details of the project, covering the software specification, data and sensors used, data visualization and inspection, sensor value discretization and extraction, unsupervised event extraction, data structure for event analysis, data quality and quantity, and predictive analysis.
The chapter on predictive analysis explores three different methods: hourly binning analysis, hierarchical clustering analysis, and Dirichlet Process Gaussian Mixture Model clustering analysis. Each method is explained in detail and evaluated on different data sets. The chapter also discusses the advantages and disadvantages of each approach.
Keywords
The main keywords and focus topics of this thesis include assistive technologies, e-Home system, infrared temperature sensors, unsupervised prediction, clustering, hourly binning analysis, hierarchical clustering, Dirichlet Process GMM, time-series data, user behavior patterns, data quantity, and prediction accuracy.
- Cite this paper
- Dzenan Hamzic (Author), 2016, AAL Data Cluster Analysis. Theory and Implementation, Munich, GRIN Verlag, https://www.grin.com/document/340200