
    Data Mining Algorithms

    Name of the subject in Hungarian: Adatbányászati algoritmusok

    Last updated: 27 January 2017

    Budapest University of Technology and Economics
    Faculty of Electrical Engineering and Informatics
    Course ID   Semester   Assessment   Credit   Subject semester
    VISZD308               4/0/0/v      5
    3. Course coordinator and department: Dr. Katona Gyula
    Web page of the course: http://www.cs.bme.hu/adatalg
    4. Instructors
    Bálint Daróczy, HAS Institute for Computer Science and Control
    Dr. Gyula Katona, Department of Computer Science and Information Theory
    5. Required knowledge

    - linear algebra

    - basic programming techniques in any programming language 

    7. Objectives, learning outcomes and obtained knowledge
    - Introduction to data mining and data science and their key methods
    - Practical application of data science using widely used tools
    8. Synopsis

    - Linear and polynomial regression in one and several dimensions; optimization: gradient descent and least squares

    - Supervised learning (classification): nearest neighbour methods, decision trees, logistic regression, non-linear classification, neural networks, support vector networks, time series classification and dynamic time warping

    - Advanced classification methods: semi-supervised learning, multi-class classification, multi-task learning, ensemble methods (bagging, boosting, stacking, Dietterich's ensembles of classifiers)

    - Evaluation of classifiers: cross-validation, bias-variance trade-off

    - Clustering: k-means (k-medoid, FurthestFirst), hierarchical clustering, Kleinberg's impossibility theorem, internal and external evaluation, convergence speed

    - Principal component analysis, low-rank approximation, collaborative filtering and applications (recommender systems, drug-target prediction)

    - Density estimation and anomaly detection

    - Frequent itemset mining

    - Biomedical data processing and mining (next-generation sequencing, gene expression, biomedical time series)

    - Additional applications and problems: preprocessing, scaling, overfitting, hyperparameter optimization, imbalanced classification

    - Tools: Octave/Matlab, Python, R, Hadoop

    9. Method of instruction: two 2-hour lectures per week
    10. Assessment

    - during the semester: 5 homework assignments

    - final: oral exam 

    12. Consultations: during office hours or by appointment.
    13. References, textbooks and resources

    Pang-Ning Tan, Michael Steinbach, Vipin Kumar: Introduction to Data Mining
    http://www-users.cs.umn.edu/~kumar/dmbook/index.php

    Bodon Ferenc, Buza Krisztián: Adatbányászat (electronic lecture notes)
    http://www.cs.bme.hu/~buza/pdfs/adatbanyaszat-cover.pdf

    14. Required learning hours and assignment
    Contact hours: 56
    Preparation for classes during the semester: 14
    Preparation for midterm test: 25
    Homework: 25
    Study of assigned written material: 0
    Exam preparation: 30
    Total: 150
    15. Syllabus prepared by

    Dr. Buza Krisztián, research fellow, MTA-TTK

    Bálint Daróczy, HAS Institute for Computer Science and Control