7. Objectives, learning outcomes and obtained knowledge
The rapidly escalating challenges in data science with respect to data
size, dimensionality or heterogeneity have highlighted the importance of the
whole process of data analysis, including study design, data collection, data
engineering, the combination of a priori knowledge and data, the combination of
multiple inductive modules into a complex system, and the derivation of optimal
interventions. In parallel, these unprecedented challenges have also renewed
interest in complex inductive schemes, such as the learning of overall network
models and causal systems models, and active and reinforcement learning.

The course provides a systematic overview of both the intelligent methods
used throughout the data analysis process and the intelligent, complex
machine learning schemes used in modern data analysis. The unifying themes of
this dual approach are the Bayesian decision-theoretic framework, network and
systems-based approaches, data and knowledge fusion, the use of ontologies and
semantic technologies, and active, online (reinforcement) learning, which
integrate the various phases and aspects of data analysis. The course also
presents and discusses real-world applications from the fields of biomedicine,
pharmaceutical research and system diagnostics.

The course is at the crossroads of statistics, big data analytics,
artificial intelligence and machine learning. It is self-contained, but ideally
complements earlier studies in these directions.

After completing this course, you will be familiar with the following:

(1) Theoretical bases of induction. The engineering workflow of data analysis.

(2) Optimization, Bayesian model averaging and sensitivity analysis using
resampling methods in data analysis.

(3) Semantic data repositories, data visualization, dimensionality reduction,
data engineering/transformations using ontologies, data cleaning and imputation.

(4) Unsupervised learning: clustering, module learning, self-organizing maps,
network science, metric learning.

(5) Supervised learning: decision trees, regression, kernel methods, multilayer
perceptrons, deep neural networks.

(6) Probabilistic graphical models: Bayesian networks, dynamic/temporal
Bayesian networks.

(7) Reinforcement, active, budgeted and online learning.

(8) Knowledge and data fusion: ontologies, semantic technologies, linked open
data.