Data Mining Algorithms
Name of the subject in Hungarian: Adatbányászati algoritmusok
Last updated: 27 January 2017
- linear algebra
- basic programming techniques in any programming language
- Introduction and important assets of data mining and data science
- Practical application of data science through important tools
- Linear and polynomial regression in one and multiple dimensions; optimization by gradient descent and least squares
- Supervised learning (classification): nearest neighbour methods, decision trees, logistic regression, non-linear classification, neural networks, support vector networks, time series classification and dynamic time warping
- Advanced classification methods: semi-supervised learning, multi-class classification, multi-task learning, ensemble methods (bagging, boosting, stacking, ensemble of classifiers by Dietterich)
- Evaluation of classifiers: cross-validation, bias-variance trade-off
- Clustering: k-means (k-medoid, FurthestFirst), hierarchical clustering, Kleinberg's impossibility theorem, internal and external evaluation, convergence speed
- Principal component analysis, low-rank approximation, collaborative filtering and applications (recommender systems, drug-target prediction)
- Density estimation and anomaly detection
- Frequent itemset mining
- Biomedical data processing and mining (next-generation sequencing, gene expression, biomedical time series)
- Additional applications and problems: preprocessing, scaling, overfitting, hyperparameter optimization, imbalanced classification
- Tools: Octave/Matlab, Python, R, Hadoop
- during the semester: 5 homework assignments
- final: oral exam
Pang-Ning Tan, Michael Steinbach, Vipin Kumar:
Introduction to Data Mining
http://www-users.cs.umn.edu/~kumar/dmbook/index.php
Bodon Ferenc, Buza Krisztián: Adatbányászat, electronic lecture notes
http://www.cs.bme.hu/~buza/pdfs/adatbanyaszat-cover.pdf
Dr. Krisztián Buza, research fellow, MTA-TTK
Bálint Daróczy, HAS Institute for Computer Science and Control