Research Challenges

The dDM project serves as a metaproject for different kinds of simulation and machine learning applications. The computational power of our dDM members is used to support the research of our scientific partners. Currently, we are facing research challenges in the areas of Time Series Analysis and Biological Data Analysis. A former subproject dealt with Social Network Analysis. Below you will find an overview of our subprojects and the related scientific publications.

Our machine learning applications use the open source framework RapidMiner. This data mining suite provides various machine learning methods for data analysis purposes. The RapidMiner framework uses a convenient plug-in mechanism that makes it easy to add newly developed algorithms. This flexibility and the processing power of BOINC are an ideal foundation for scientific distributed Data Mining.


Time Series Analysis

The research area called Time Series Analysis comprises methods for analyzing time series data in order to extract meaningful statistics, rules and patterns. These rules and patterns might be used to build forecasting models that are able to predict future developments.

Stock Price Prediction (on-going)

In this case study, we try to improve time series forecasting methods. In 2006, we started the analysis of Stock Market Data by using Artificial Neural Networks [6]. The results were published as a book [7]. Later on, we were able to improve our results by applying Support Vector Machines [8]. In 2009, we started testing standard machine learning algorithms to build forecast models for the Dow Jones, S&P500, German Stock Index and NIKKEI index. In addition, we try to extend existing approaches and develop new forecasting methods. In doing so, we will focus on aspects of temporal pattern evolution.
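Forecasting methods such as neural networks and support vector machines are commonly trained on pairs of a past window and the value that follows it. The following is a minimal sketch of that framing, not the project's actual pipeline: the function names, the window width and the price values are all made up for illustration, and the naive "last value" predictor only serves as a baseline that a learned model would have to beat.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[List[float], float]


def make_windows(series: Sequence[float], width: int) -> List[Pair]:
    """Turn a series into (past window, next value) training pairs."""
    if width < 1:
        raise ValueError("width must be at least 1")
    pairs = []
    for start in range(len(series) - width):
        window = list(series[start:start + width])
        target = series[start + width]
        pairs.append((window, target))
    return pairs


def persistence_forecast(window: Sequence[float]) -> float:
    """Naive baseline: predict that the next value equals the last one seen."""
    return window[-1]


def mean_absolute_error(pairs: List[Pair],
                        predict: Callable[[Sequence[float]], float]) -> float:
    """Average absolute difference between predictions and true next values."""
    errors = [abs(predict(window) - target) for window, target in pairs]
    return sum(errors) / len(errors)


# Hypothetical closing prices, invented for this example.
closing_prices = [100.0, 101.5, 101.0, 102.5, 103.0, 102.0]
pairs = make_windows(closing_prices, width=3)
baseline_mae = mean_absolute_error(pairs, persistence_forecast)
print(len(pairs), baseline_mae)
```

Any regression model can be dropped in place of `persistence_forecast`, as long as it maps a window to a single predicted value.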



Biological Data Analysis

Laryngeal high-speed video classification (on-going)

The automatic identification of voice disorders is one particular field of interest of Daniel Voigt's work. Audio recordings of the acoustical voice signal are analysed with specialized software quantifying the amount of perturbation (noise) in the signal. Through automated feature extraction from the recordings and subsequent machine learning analysis, laryngeal movement patterns can be quantitatively captured and automatically classified according to different diagnostic classes [9],[10],[11],[12].

Multi-Agent Simulation of Evolution (finished)

In this subproject we investigated the biological phenomenon of aposematism (also referred to as warning coloration). This term describes the evolutionary strategy of certain animal species to indicate their unpalatability or toxicity to potential predators by developing skin colors and patterns that predators can easily perceive. Prominent examples of toxic animals with distinct warning coloration are poison dart frogs, coral snakes and fire salamanders.


Social Network Analysis

In 2007, Tanja Falkowski proposed DenGraph, a density-based graph clustering algorithm. Among other things, this algorithm can be used for Social Network Analysis. The following studies were powered by our distributedDataMining project. The results are published as part of her PhD thesis, which is also available as a book [1].
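To give a rough idea of what density-based clustering on a graph means, here is a simplified sketch in the spirit of DBSCAN applied to a weighted graph. It is not the DenGraph algorithm as published: the function name, the parameters `eps` and `min_neighbors`, the rule that each node joins only the first cluster that reaches it, and the toy graph are all assumptions made for illustration.

```python
from collections import deque
from typing import Dict, List, Set

# node -> {neighbor: distance}; the graph is assumed to be symmetric.
Graph = Dict[str, Dict[str, float]]


def density_clusters(graph: Graph, eps: float, min_neighbors: int) -> List[Set[str]]:
    """Group nodes that are reachable through chains of dense ("core") nodes."""

    def close(node: str) -> Set[str]:
        # Neighbors whose distance does not exceed eps.
        return {nb for nb, dist in graph[node].items() if dist <= eps}

    # A core node has at least min_neighbors close neighbors.
    core = {node for node in graph if len(close(node)) >= min_neighbors}

    assigned: Set[str] = set()
    clusters: List[Set[str]] = []
    for seed in sorted(core):
        if seed in assigned:
            continue
        cluster = {seed}
        assigned.add(seed)
        queue = deque([seed])
        while queue:
            node = queue.popleft()
            for nb in close(node):
                if nb in assigned:
                    continue
                cluster.add(nb)
                assigned.add(nb)
                # Only core nodes keep expanding the cluster;
                # non-core nodes join as border nodes.
                if nb in core:
                    queue.append(nb)
        clusters.append(cluster)
    return clusters  # nodes in no cluster are treated as noise


# Hypothetical interaction graph with two dense groups and one outlier "g".
graph: Graph = {
    "a": {"b": 0.2, "c": 0.3, "g": 0.95},
    "b": {"a": 0.2, "c": 0.1},
    "c": {"a": 0.3, "b": 0.1, "d": 0.9},
    "d": {"c": 0.9, "e": 0.2, "f": 0.2},
    "e": {"d": 0.2, "f": 0.3},
    "f": {"d": 0.2, "e": 0.3},
    "g": {"a": 0.95},
}
clusters = density_clusters(graph, eps=0.5, min_neighbors=2)
print(clusters)
```

In this toy graph the weak links c-d and a-g exceed `eps`, so the two triangles end up as separate clusters and "g" is left unassigned.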

Temporal Dynamics of the Last.fm Music Platform (finished)

Last.fm is a social networking platform which has over 20 million visitors per month from more than 200 countries. In this case study we applied DenGraph-IO to detect and observe changes in the music listening behaviour of Last.fm users during a period of two years. The aim was to see whether the proposed clustering technique detects meaningful communities and evolutions [2],[3],[4].


Temporal Evolution of Communities in the Enron Email Data Set (finished)

The collapse of Enron, a U.S. company honored for six consecutive years by "Fortune" as "America's Most Innovative Company", caused one of the biggest bankruptcy cases in U.S. history. To investigate the case, a data set of approximately 1.5 million e-mails sent or received by Enron employees was published by the Federal Energy Regulatory Commission. We have used the processing power of dDM to analyze the temporal evolution of communities extracted from this e-mail correspondence [5].


References

  1. Falkowski T. Community Analysis in Dynamic Social Networks. Goettingen: Sierke Verlag; 2009.
  2. Schlitter N, Falkowski T. Mining the Dynamics of Music Preferences from a Social Networking Site. In: Proceedings of the 2009 International Conference on Advances in Social Network Analysis and Mining. Athens: IEEE Computer Society; 2009. p. 243-8.
  3. Falkowski T, Schlitter N. Analyzing the Music Listening Behavior and its Temporal Dynamics Using Data from a Social Networking Site. Zurich; 2008.
  4. Schlitter N, Falkowski T, Lässig J. DenGraph-HO: Density-based Hierarchical Community Detection for Explorative Visual Network Analysis. In: Research and Development in Intelligent Systems XXVIII Incorporating Applications and Innovations in Intelligent Systems XIX Proceedings of AI-2011, the Thirty-first SGAI International Conference on Innovative Techniques and Applications of Artificial Int. Cambridge: Springer; 2011. p. 283-96.
  5. Falkowski T. Community Analysis in Dynamic Social Networks. Goettingen: Sierke Verlag; 2009.
  6. Schlitter N. A Case Study of Time Series Forecasting with Backpropagation Networks. In: Steinmüller J, Langner H, Ritter M, Zeidler J, editors. 15 Jahre Künstliche Intelligenz an der TU Chemnitz. Chemnitz: Techn. Univ. Chemnitz, Fak. für Informatik; 2008. p. 203-17. (Chemnitzer Informatik-Berichte).
  7. Schlitter N. Analyse und Prognose ökonomischer Zeitreihen: Neuronale Netze zur Aktienkursprognose. Saarbrücken: VDM Verlag Dr. Müller; 2008.
  8. Möller M, Schlitter N. Analyse und Prognose ökonomischer Zeitreihen mit Support Vector Machines. In: Steinmüller J, Langner H, Ritter M, Zeidler J, editors. 15 Jahre Künstliche Intelligenz an der Fakultät für Informatik. Chemnitz: Techn. Univ. Chemnitz, Fak. für Informatik; 2008. p. 189-201. (Chemnitzer Informatik-Berichte).
  9. Voigt D. Objective Analysis and Classification of Vocal Fold Dynamics from Laryngeal High-Speed Recordings. Aachen: Shaker Verlag GmbH; 2010.
  10. Voigt D, Döllinger M, Braunschweig T, Yang A, Eysholdt U, Lohscheller J. Classification of functional voice disorders based on phonovibrograms. Artificial Intelligence in Medicine. 2010;49(1):51-9.
  11. Voigt D, Lohscheller J, Döllinger M, Yang A, Eysholdt U. Automatic diagnosis of vocal fold paresis by employing phonovibrogram features and machine learning methods. Comput Methods Programs Biomed. 2010;99(3):275-88.
  12. Voigt D, Eysholdt U. Identifying relevant analysis parameters for the classification of vocal fold dynamics. J Acoust Soc Am. 2011;130:2550.