# Budapest University of Technology and Economics, Faculty of Electrical Engineering and Informatics

Belépés
címtáras azonosítással

vissza a tantárgylistához   nyomtatható verzió

A tantárgy neve magyarul / Name of the subject in Hungarian: Sztochasztikus modellek és adaptív algoritmusok

Last updated: 2017. december 8.

 Budapest University of Technology and Economics Faculty of Electrical Engineering and Informatics PhD coursefree selective
 Course ID Semester Assessment Credit Tantárgyfélév VISZDV06 2/0/0/V 3
3. Course coordinator and department Dr. Katona Gyula,
Web page of the course http://www.cs.bme.hu/smaa
4. Instructors

Dr. Balázs Csanád Csáji, senior research fellow,

Institute for Computer Science and Control,

5. Required knowledge probability theory, linear algebra, (multivariable) calculus
7. Objectives, learning outcomes and obtained knowledge
The course aims at providing an introduction to typical stochastic models and adaptive algorithms applied in a wide range of fields including machine learning, data mining, system identification, control theory, signal processing, and financial mathematics.
The course has three main parts: (i) the first one is (parametric and nonparametric) regression, regularization, and the statistical properties of such methods (cf. supervised learning); (ii) the second one is the theory of sequential decision making, mainly focusing on controlled Markov processes (cf. reinforcement learning); (iii) finally, the course ends with the general theory of (recursive) adaptive algorithms (also known as stochastic approximation), discussing some of the arch-typical algorithms and related theoretical results of the field.
8. Synopsis
1. Regression:
1.1 Review of the fundamental results related to the least-squares (LS) method: orthogonal projection property, unbiasedness, its covariance matrix, Gauss-Markov theorem, connection to the Maximum Likelihood (ML) estimate, strong consistency, asymptotic efficiency, and its confidence ellipsoids.
1.2 Generalizations of the LS theory and related results from linear algebra: Tikhonov regularization, least-norm problem, singular value decomposition (SVD), low rank approximations, recursive LS, matrix inversion lemma, generalized LS and its special cases: weighted LS and forgetting factors.
1.3 Kernel methods: nonparametric regression, reproducing kernel Hilbert spaces (RKHS), regularization in RKHS, representer theorem, Moore-Aronszajn theorem, typical kernels and their induced learning machines, important special cases: support vector classification and regression.
1.4 Time-series analysis: strictly and weakly stationary stochastic processes, Wold representation, linear filters and their main properties, general linear systems, typical (such as autoregressive) models and their estimates: prediction error-, correlation-, ML-, and instrumental variable methods.
2. Markov decision processes (MDPs):
2.1 Review of Markov chains (for countable state spaces), including initial distributions, transition functions, stationary distributions, and ergodicity.
2.2 Central concepts of MDPs and their main types, such as stochastic shortest path, total discounted and average (ergodic) cost problems. Control policies, value functions, Bellman operators and their core properties, monotonicity, contraction, optimality equations, and approximation possibilities.
2.3 Main solution directions of MDPs: iterative approximations of the optimal value function (value iteration, Q-learning); direct search in the space of policies (policy iteration, policy gradient); linear programming methods, complexity. Generalizations: unbounded costs, partial observability.
2.4 Temporal difference (TD) learning: Monte Carlo evaluations, eligibility traces, TD(0), TD(1), TD(lambda) and their variants (online, offline, first-visit, every-visit), convergence theorems, optimistic policy improvements.
3.1 General iterative algorithms and stochastic approximation, fixed point and root finding problems, examples of reformulating known algorithms.
3.2 Convergence analysis based on Lyapunov functions. Famous examples: stochastic gradient descent and its variants, the Kiefer-Wolfowitz algorithm and the simultaneous perturbation stochastic approximation (SPSA) method.
3.3 Convergence analysis based on contraction and monotonicity properties, their illustration through the example of the Bellman operator in MDPs.
3.4 Generalizations: time-dependent updates and tracking changing parameters.
9. Method of instruction 2 hours of lectures per week.
10. Assessment
Signature: numerical experiments
Final: Oral exam
12. Consultations In office hours or by appointment.
13. References, textbooks and resources
- Jerome Friedman, Trevor Hastie, Robert Tibshirani. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. Springer. 2009.
- Dimitri P. Bertsekas, John Tsitsiklis. Neuro-Dynamic Programming. Athena Sci. 1996.
- Albert Benveniste, Michel Métivier, Pierre Priouret. Adaptive Algorithms and Stochastic Approximations. Springer. 1990.
- Bernhard Schölkopf, Alexaner J. Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization and Beyond. The MIT Press. 2002.
- Harold J. Kushner, G. George Yin. Stochastic Approximation and Recursive Algorithms and Applications. 2nd Edition. Springer. 2008.
- Simon Haykin. Neural Networks and Learning Machines. 3rd ed. Prentice Hall, 2008.
14. Required learning hours and assignment
 Kontakt óra 28 Félévközi készülés órákra 14 Felkészülés zárthelyire Házi feladat elkészítése 14 Kijelölt írásos tananyag elsajátítása 10 Vizsgafelkészülés 24 Összesen 90