Budapest University of Technology and Economics, Faculty of Electrical Engineering and Informatics



    Machine Learning 

    Name of the subject in Hungarian: Gépi tanulás

    Last updated: 1 March 2024

    MSc in Computer Engineering
    Data Science and Artificial Intelligence specialization
    Course ID   Semester   Assessment   Credit   Subject semester
    VIMIMA27               2/1/0/v      5
    3. Course coordinator and department: Dr. Péter Antal
    4. Instructors: Dr. Bence Bolgár, lecturer
    6. Pre-requisites
    (TárgyEredmény( "BMEVIMIMA05", "jegy" , _ ) >= 2
    TárgyEredmény("BMEVIMIMA05", "FELVETEL", AktualisFelev()) > 0)

    The form above is Neptun's own notation; for technical reasons we have not changed it.

    The mandatory prerequisite order can be found on the website and in the curriculum of the given degree programme.

    Probability theory, Python programming
    7. Objectives, learning outcomes and obtained knowledge

    The course examines how one of the fundamental abilities of intelligent systems, learning, can be implemented on computers. It introduces the types of machine learning, summarizes its theoretical foundations, and analyses the most important learning architectures in detail. Machine learning is treated within a unified probabilistic framework that touches on mathematical, philosophical, and programming aspects. Beyond the theoretical foundations, the course aims to develop practical problem-solving skills through this unified approach and through complex application examples. The methods covered serve as a foundation for solving research and development tasks.

    8. Synopsis

    Detailed topics of the lectures:

    1. Introduction. Artificial intelligence, machine learning, and data science. Machine learning as inference. Learning from observations and interventions. Trustworthy and explainable machine learning.
    2. Basic concepts of Bayesian probability theory. Probability, prior, likelihood, posterior. Maximum likelihood (ML), maximum a posteriori (MAP), fully Bayesian inference, model averaging. The difficulties of fully Bayesian inference (examples when there is an analytical solution). Conjugate priors (examples of their use).
    3. Basic concepts of machine learning. Generative and discriminative models, discriminant functions in machine learning (examples). Bias-variance decomposition, underfitting, overfitting, regularization. Probabilistic derivation of commonly used loss functions and regularization schemes. Evaluation (CV, AUC, AUPR).
    4. Regression. The basic task, the probabilistic model of linear regression, ML and MAP estimation, derivation of analytical formulas for these estimations, the solution process, numerical aspects. Fully Bayesian inference. Non-linear extensions: application of basis functions, commonly used basis functions.
    5. Classification. The basic task, the probabilistic model of logistic regression. Derivation of the perceptron using Bayes' theorem, ML and MAP estimation, derivation of iterative formulas (sigmoid function, gradient), the solution process, numerical aspects.
    6. Neural networks. MLP architecture, ML and MAP estimation, derivation of the backpropagation algorithm. Activation functions used in neural models, methods of regularization. Convolutional and recurrent architectures, the types of layers used in them, example applications.
    7. Optimization in neural models. The difficulties of optimization, analytical and numerical aspects. Basic principles of optimization algorithms (batch, momentum, adaptive learning rate, higher-order methods). Notable algorithms.
    8. Variational methods. Approximate Bayesian inference, ELBO+KL decomposition, the basic principle of variational methods. BBVI, stochastic gradient-based optimization. Reparametrization trick, VAE. The idea of adversarial training, the basic principle of GAN architectures.
    9. MCMC. The basic principle of MCMC methods. Properties of Markov chains. Sufficient condition for the existence of the equilibrium distribution. Metropolis, Metropolis-Hastings algorithm. Gibbs sampling, conjugate priors. Example: Bayesian linear regression with Gibbs sampling.
    10. Probabilistic Graphical Models: Covers Bayesian Networks and Markov Random Fields for modelling conditional dependencies among variables. Includes inference, network structure learning, and parameter estimation, emphasizing practical applications and inference techniques.
    11. Transformers: Focuses on the transformer architecture's impact on deep learning, especially in NLP. Discusses self-attention mechanisms, positional encoding, and advancements in machine translation and text summarization, highlighting recent developments.
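
    As a minimal sketch of the estimators named in lecture topic 2, the snippet below compares the ML, MAP, and posterior-mean estimates in a Beta-Bernoulli conjugate model. The function name, the Beta(2, 2) prior, and the coin-toss counts are illustrative assumptions and are not taken from the course material.

```python
from fractions import Fraction


def beta_bernoulli_estimates(heads, tails, alpha=2, beta=2):
    """Estimate a coin's head probability under a Beta(alpha, beta) prior.

    Returns (ML estimate, MAP estimate, posterior mean) as exact fractions.
    Assumes heads + tails > 0 and alpha, beta > 1 so the posterior mode exists.
    """
    n = heads + tails
    # Maximum likelihood: the relative frequency of heads.
    ml = Fraction(heads, n)
    # Conjugacy: a Beta prior with a Bernoulli likelihood gives a Beta posterior,
    # Beta(alpha + heads, beta + tails).
    post_a = alpha + heads
    post_b = beta + tails
    # MAP: the mode of the Beta posterior.
    map_estimate = Fraction(post_a - 1, post_a + post_b - 2)
    # Fully Bayesian predictive probability of heads: the posterior mean.
    posterior_mean = Fraction(post_a, post_a + post_b)
    return ml, map_estimate, posterior_mean


ml, map_estimate, posterior_mean = beta_bernoulli_estimates(heads=3, tails=0)
print(ml, map_estimate, posterior_mean)  # prints: 1 4/5 5/7
```

    With only three observations, the ML estimate jumps to certainty, while the prior pulls the MAP estimate and the posterior mean back toward 1/2.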

    Detailed topics of the exercises:

    1. Bayesian Thinking. Maximum likelihood estimation, posterior calculation in conjugate models.
    2. Linear Models. Bayesian models of regression and classification, calculating posterior and predictive distributions, numerical stability of implementation.
    3. Neural Networks. Implementation of MLPs and convolutional networks using PyTorch/TensorFlow, optimization, evaluating predictive performance.
    4. Variational Inference. Inference in non-conjugate models, generative modelling with variational autoencoders.
    5. MCMC. Gibbs sampling in hierarchical models, time series analysis, changepoint models.
    6. Bayesian Networks. Constructing, querying, learning Bayesian Networks: Students will learn how to build Bayesian Networks to represent probabilistic relationships among variables. Exercises include constructing networks from real-world data, performing inference to calculate conditional probabilities, and using software tools to query the networks and perform diagnostic reasoning.
    7. Transformers. Implementing a Transformer Model: Students will gain hands-on experience with the transformer architecture by implementing a simple transformer model. The exercise will cover key components of transformers, including self-attention mechanisms, and use frameworks to build and train the model on a dataset. Further, students will evaluate the model's performance and explore the impact of different hyperparameters.
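
    The self-attention mechanism mentioned in exercise 7 can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions: the function names are hypothetical, queries, keys, and values are passed in directly (a real transformer layer would first compute them from the input with learned linear projections), and there is a single attention head with no masking.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Single-head attention: softmax(Q K^T / sqrt(d_k)) V.

    queries, keys: lists of vectors of length d_k
    values: list of vectors, one per key
    Returns one output vector per query.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


# Identical keys give uniform weights, so the output is the mean of the values.
result = scaled_dot_product_attention(
    queries=[[1.0, 0.0]],
    keys=[[0.0, 0.0], [0.0, 0.0]],
    values=[[1.0, 2.0], [3.0, 4.0]],
)
print(result)  # prints: [[2.0, 3.0]]
```

    In self-attention, the queries, keys, and values all derive from the same input sequence; frameworks such as PyTorch provide batched, multi-head versions of the same computation.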
    9. Method of instruction

    2 hours of lectures per week, 1 hour of practice (computational exercise and computer laboratory exercise).

    10. Assessment

    During the semester: Completion of exercises bi-weekly (submission of reports).

    During the exam period: Oral exam.

    11. Retakes

    Two reports can be submitted until the end of the makeup week.

    12. Consultations

    Consultation is available by prior arrangement.
    13. References, textbooks and resources

    C. M. Bishop, H. Bishop: Deep Learning: Foundations and Concepts. Springer, 2024.

    14. Required learning hours and assignment
    Contact hours (lectures)        42
    Study during the semester       28
    Preparation for midterm test    -
    Preparation of homework         40
    Study of written material       -
    Preparation for exams           40
    15. Syllabus prepared by: Dr. Péter Antal, associate professor, MIT
    Dr. Bence Bolgár, senior lecturer, MIT