Budapest University of Technology and Economics, Faculty of Electrical Engineering and Informatics


    Machine Learning Use-case Laboratory

    Name of the subject in Hungarian: Gépi tanulási esettanulmányok

    Last updated: 4 July 2023

    MSc in Computer Engineering,
    Data Science and Artificial Intelligence specialization
    Course ID   Semester   Assessment   Credit   Subject semester
    VITMMA18               0/0/3/f      5
    3. Course coordinator and department Dr. Szűcs Gábor,
    4. Instructors
    Csaba Gáspár, assistant lecturer (TMIT)
    István Nagy-Rácz, technical assistant (TMIT)
    5. Required knowledge: Theoretical foundations of machine learning, basic programming skills, basic knowledge of probability theory
    6. Pre-requisites
    (TárgyEredmény( "BMEVIMIMB02", "jegy" , _ ) >= 2
    TárgyEredmény("BMEVIMIMB02", "FELVETEL", AktualisFelev()) > 0)

    The form above is Neptun's own notation; for technical reasons we have not changed it.

    The mandatory prerequisite order can be found on the website of the given programme and in its curriculum.

    7. Objectives, learning outcomes and obtained knowledge
    The key to applying machine learning and data science knowledge is the ability to map a real data set and an actual business problem onto our machine learning and data analysis toolset. The aim of the subject is to give students deeper practical experience in this data analysis process by working through several real case studies that demonstrate the order and manner in which the methods should be applied.

    Methodologically, during the lab sessions the students build their own Notebook in parallel with the lecturer. To leave room for the advanced parts of each task, every session starts from a prepared initial Notebook, which is then developed further. The semester also includes a designated data mining ladder (ranked) competition, in which students must solve a supervised machine learning task as effectively as possible.
    8. Synopsis
    1. Introduction, methods, technologies, overview of the programming languages and technologies used (e.g. Python) - Tabular data management, DataFrame-based problem solving, code efficiency issues
    2. Supervised learning - Advanced regression methods over a real estate data set, advanced methods in data preparation, management of time trends, special data preparation methods and their impact on forecasting
    3. Supervised learning – Advanced classification task for solving a credit assessment task
    4. Supervised learning – Complex classification evaluation methods, custom objective functions, optimization with custom objective functions. Special properties of the ROC curve and the AUC value, the relationship between error detection and evaluation functions, how evaluation changes in light of business needs
    5. Clustering procedures - Challenges of customer segmentation based on clustering, data preparation difficulties, selection of clustering methods, explainability of clustering results, recognition of trivial clustering situations, storytelling related to clustering
    6. The relationship between storytelling and the interpretability of models, algorithmic issues of model explainability
    7. Anomaly detection – Solving a complex anomaly detection task over a time-varying data set
    8. Anomaly detection – Combining anomaly score values, incorporating a feedback process into the entire analysis sequence
    9. Advanced methods of variable generation, their relationship with variable selection methods - Presentation of the methods and challenges of variable selection, the feature engineering process that utilizes the results of the selection
    10. Description of the large homework task, preparation of its initial solution, description of the pitfalls of the data analysis task
    9. Method of instruction: The students meet the lecturers 10 times, each time in a four-hour time slot. Assignments are issued every two weeks. In addition, the students must complete a large homework task during the semester, in which they compete with each other (data science competition).
    10. Assessment
    - Five assignments (small homeworks), to be submitted through the Moodle system. Assignments are issued every two weeks, and two weeks are available to complete each one.
    - Participation in the data science competition (large homework) related to the subject, where the baseline specified during the semester must be reached.

    Small homework scores count for 40% of the grade, and the data science competition score counts for 60%. In the competition, a baseline level must be reached for the student to receive at least a pass (2) grade; this level is fixed when the competition is announced. The competition requires solving a supervised learning task, and the exact metric is determined by the competition task announced in the given semester. The students work on the competition independently, and the available score depends on the results achieved by all students participating in the competition in the given year.

    At least 40% of the total score (small homeworks plus competition) must be achieved to receive credit.
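
The weighting and pass conditions described above can be sketched as a short calculation. This is only an illustration under stated assumptions: both components are taken to be expressed as percentages (0 to 100) of their maximum score, and the names `evaluate`, `Result`, and the constants are hypothetical. The mapping from the total score to the grades 2 to 5 is not given in this syllabus, so it is deliberately not modeled.

```python
from dataclasses import dataclass

# Weights and threshold taken from the assessment rules above.
HOMEWORK_WEIGHT = 0.40      # small homeworks
COMPETITION_WEIGHT = 0.60   # data science competition (large homework)
MIN_TOTAL_PERCENT = 40.0    # minimum total score required for credit


@dataclass
class Result:
    total_percent: float  # weighted total, 0 to 100
    passed: bool          # True if both pass conditions hold


def evaluate(homework_percent: float,
             competition_percent: float,
             baseline_reached: bool) -> Result:
    """Combine the two components and check the pass conditions.

    Assumption: each component is already normalized to 0-100.
    """
    for value in (homework_percent, competition_percent):
        if not 0.0 <= value <= 100.0:
            raise ValueError("scores must be percentages between 0 and 100")

    total = (HOMEWORK_WEIGHT * homework_percent
             + COMPETITION_WEIGHT * competition_percent)

    # Credit requires reaching the competition baseline AND
    # achieving at least 40% of the total score.
    passed = baseline_reached and total >= MIN_TOTAL_PERCENT
    return Result(total_percent=total, passed=passed)


if __name__ == "__main__":
    print(evaluate(80.0, 30.0, baseline_reached=True))
    print(evaluate(90.0, 90.0, baseline_reached=False))
```

Note that a high total alone is not sufficient in this sketch: a student who misses the competition baseline does not pass regardless of the homework score, which mirrors the rule stated above.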

    11. Recaps
    Small homework assignments can be made up within 2 weeks after the submission deadline. If this extended deadline falls after the end of the teaching period, they must be submitted within one week after the end of the teaching period.

    Submissions for the large homework (competition) are accepted continuously from the middle of the semester. There is no make-up opportunity for it, and the competition closes at the end of the teaching period.
    13. References, textbooks and resources: After each lesson, an example solution for the analysis tasks will be published.
    14. Required learning hours and assignment
    Contact hours                                  42
    Preparation for classes during the semester    28
    Preparation for midterm test                   -
    Homework                                       80
    Study of assigned written material             -
    15. Syllabus prepared by Csaba Gáspár, assistant lecturer (TMIT)