
    Stochastic models and adaptive algorithms

Name of the subject in Hungarian: Sztochasztikus modellek és adaptív algoritmusok

    Last updated: 8 December 2017

    Budapest University of Technology and Economics
    Faculty of Electrical Engineering and Informatics

    PhD course

    Free elective

    Course ID: VISZDV06
    Assessment: 2/0/0/V (2 hours of lectures, 0 practice, 0 laboratory per week; V = exam)
    Credit: 3
    3. Course coordinator and department: Dr. Katona Gyula
    Web page of the course: http://www.cs.bme.hu/smaa
    4. Instructors
    Dr. Balázs Csanád Csáji, senior research fellow, Institute for Computer Science and Control, Hungarian Academy of Sciences
    5. Required knowledge: probability theory, linear algebra, (multivariable) calculus
    7. Objectives, learning outcomes and obtained knowledge
    The course aims to provide an introduction to typical stochastic models and adaptive algorithms applied in a wide range of fields, including machine learning, data mining, system identification, control theory, signal processing, and financial mathematics.
    The course has three main parts: (i) the first covers (parametric and nonparametric) regression, regularization, and the statistical properties of such methods (cf. supervised learning); (ii) the second covers the theory of sequential decision making, focusing mainly on controlled Markov processes (cf. reinforcement learning); (iii) finally, the course ends with the general theory of (recursive) adaptive algorithms, also known as stochastic approximation, discussing some of the archetypal algorithms and related theoretical results of the field.
    8. Synopsis
    1. Regression:
    1.1 Review of the fundamental results related to the least-squares (LS) method: orthogonal projection property, unbiasedness, its covariance matrix, Gauss-Markov theorem, connection to the Maximum Likelihood (ML) estimate, strong consistency, asymptotic efficiency, and its confidence ellipsoids.
    1.2 Generalizations of the LS theory and related results from linear algebra: Tikhonov regularization, least-norm problem, singular value decomposition (SVD), low-rank approximations, recursive LS (see the first sketch after this part), matrix inversion lemma, generalized LS and its special cases: weighted LS and forgetting factors.
    1.3 Kernel methods: nonparametric regression, reproducing kernel Hilbert spaces (RKHS), regularization in RKHS, representer theorem (see the second sketch after this part), Moore-Aronszajn theorem, typical kernels and their induced learning machines, important special cases: support vector classification and regression.
    1.4 Time-series analysis: strictly and weakly stationary stochastic processes, Wold representation, linear filters and their main properties, general linear systems, typical (such as autoregressive) models and their estimates: prediction error, correlation, ML, and instrumental variable methods.
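    Below, and after the other two parts, short Python sketches illustrate selected items. First, a minimal sketch of item 1.2's recursive LS with a forgetting factor, written via the matrix inversion lemma so that no explicit matrix inverse is ever formed; the function name, the toy data-generating model, and all numerical values are illustrative assumptions rather than course material.

        import numpy as np

        def rls_update(theta, P, x, y, lam=1.0):
            """One recursive least-squares step with forgetting factor lam.

            P tracks the (scaled) inverse of the regressor Gram matrix; the
            matrix inversion lemma turns its update into a rank-one
            correction, so no matrix is ever inverted.
            """
            x = x.reshape(-1, 1)
            Px = P @ x
            gain = Px / (lam + x.T @ Px)        # Kalman-style gain vector
            theta = theta + gain.flatten() * (y - x.T @ theta)
            P = (P - gain @ Px.T) / lam         # rank-one downdate of P
            return theta, P

        # Toy usage: recover theta_true from noisy linear observations.
        rng = np.random.default_rng(0)
        theta_true = np.array([2.0, -1.0, 0.5])
        theta, P = np.zeros(3), 1e3 * np.eye(3)
        for _ in range(500):
            x = rng.normal(size=3)
            y = x @ theta_true + 0.1 * rng.normal()
            theta, P = rls_update(theta, P, x, y)
        print(theta)                            # close to theta_true

    With lam = 1 this is ordinary recursive LS; a lam slightly below 1 discounts old data exponentially, which is the forgetting-factor mechanism of item 1.2.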
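    A companion sketch for item 1.3: kernel ridge regression in an RKHS, where the representer theorem reduces the infinite-dimensional problem to a linear system in n coefficients. The Gaussian kernel, its bandwidth, and the data are again illustrative assumptions.

        import numpy as np

        def gauss_kernel(A, B, sigma=0.5):
            """Gaussian (RBF) kernel matrix between the rows of A and B."""
            d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
            return np.exp(-d2 / (2 * sigma ** 2))

        # By the representer theorem, the regularized RKHS minimizer is
        # f(x) = sum_i alpha_i k(x, x_i), where alpha = (K + n*lam*I)^{-1} y.
        rng = np.random.default_rng(1)
        X = rng.uniform(-3, 3, size=(60, 1))
        y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=60)
        K = gauss_kernel(X, X)
        alpha = np.linalg.solve(K + 60 * 1e-3 * np.eye(60), y)
        X_test = np.linspace(-3, 3, 7).reshape(-1, 1)
        print(gauss_kernel(X_test, X) @ alpha)  # approximates sin on the grid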
    2. Markov decision processes (MDPs):
    2.1 Review of Markov chains (for countable state spaces), including initial distributions, transition functions, stationary distributions, and ergodicity.
    2.2 Central concepts of MDPs and their main types, such as stochastic shortest path, total discounted and average (ergodic) cost problems. Control policies, value functions, Bellman operators and their core properties, monotonicity, contraction, optimality equations, and approximation possibilities. 
    2.3 Main solution directions for MDPs: iterative approximations of the optimal value function (value iteration, Q-learning; see the first sketch after this part); direct search in the space of policies (policy iteration, policy gradient); linear programming methods, complexity. Generalizations: unbounded costs, partial observability.
    2.4 Temporal difference (TD) learning: Monte Carlo evaluations, eligibility traces, TD(0), TD(1), TD(λ) and their variants (online, offline, first-visit, every-visit), convergence theorems, optimistic policy improvements (see the second sketch after this part).
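    A minimal sketch of item 2.3's value iteration for a finite discounted-cost MDP; the random transition model is an illustrative assumption. Since the Bellman operator is a gamma-contraction in the sup-norm, the iteration converges geometrically to the optimal value function.

        import numpy as np

        def value_iteration(P, c, gamma=0.9, tol=1e-8):
            """Value iteration for a finite discounted-cost MDP.

            P[a] is the |S| x |S| transition matrix of action a and c[a] its
            immediate-cost vector; the loop applies the (minimizing) Bellman
            operator until the sup-norm change drops below tol.
            """
            v = np.zeros(P.shape[1])
            while True:
                q = c + gamma * P @ v            # q[a, s]: take a, then act optimally
                v_new = q.min(axis=0)            # Bellman update
                if np.abs(v_new - v).max() < tol:
                    return v_new, q.argmin(axis=0)  # values and a greedy policy
                v = v_new

        # Toy usage on a random 2-action, 4-state MDP.
        rng = np.random.default_rng(2)
        P = rng.random((2, 4, 4))
        P /= P.sum(axis=-1, keepdims=True)       # make the rows distributions
        c = rng.random((2, 4))
        v_star, policy = value_iteration(P, c)
        print(v_star, policy)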
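    A second sketch, for item 2.4: tabular TD(0) evaluation of a fixed policy from sampled transitions only, with decreasing Robbins-Monro step sizes 1/(number of visits); the chain and cost values are illustrative assumptions.

        import numpy as np

        def td0_evaluate(P, c, gamma=0.9, n_steps=100_000, seed=3):
            """Tabular TD(0): estimate the value function of the policy that
            induces transition matrix P and per-state cost c, from samples."""
            rng = np.random.default_rng(seed)
            v = np.zeros(len(c))
            visits = np.zeros(len(c))
            s = 0
            for _ in range(n_steps):
                s_next = rng.choice(len(c), p=P[s])
                visits[s] += 1
                delta = c[s] + gamma * v[s_next] - v[s]  # temporal difference
                v[s] += delta / visits[s]                # step size 1/visits
                s = s_next
            return v

        # Toy usage: a 3-state chain, compared with the exact solution.
        P = np.array([[0.5, 0.5, 0.0],
                      [0.1, 0.6, 0.3],
                      [0.2, 0.0, 0.8]])
        c = np.array([1.0, 0.0, 2.0])
        print(td0_evaluate(P, c))
        print(np.linalg.solve(np.eye(3) - 0.9 * P, c))  # exact value function

    Replacing the one-step target with lambda-weighted multi-step returns (via eligibility traces) yields the TD(λ) family of item 2.4.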
    3. Adaptive algorithms:
    3.1 General iterative algorithms and stochastic approximation, fixed point and root finding problems, examples of reformulating known algorithms.
    3.2 Convergence analysis based on Lyapunov functions. Famous examples: stochastic gradient descent and its variants, the Kiefer-Wolfowitz algorithm, and the simultaneous perturbation stochastic approximation (SPSA) method (see the sketch after this part).
    3.3 Convergence analysis based on contraction and monotonicity properties, their illustration through the example of the Bellman operator in MDPs.
    3.4 Generalizations: time-dependent updates and tracking changing parameters.
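    To illustrate items 3.1-3.2, a minimal sketch of SPSA, which estimates the whole gradient from only two noisy function evaluations per step using a random ±1 perturbation direction and Kiefer-Wolfowitz-type decreasing gains; the gain constants and the noisy quadratic test function are illustrative assumptions.

        import numpy as np

        def spsa_minimize(f, theta0, n_iter=2000, a=0.1, c=0.1, seed=4):
            """SPSA: stochastic approximation driven by a two-measurement
            simultaneous-perturbation gradient estimate."""
            rng = np.random.default_rng(seed)
            theta = np.asarray(theta0, dtype=float)
            for k in range(1, n_iter + 1):
                a_k = a / k ** 0.602              # commonly used gain exponents
                c_k = c / k ** 0.101
                delta = rng.choice([-1.0, 1.0], size=theta.shape)
                diff = f(theta + c_k * delta) - f(theta - c_k * delta)
                g_hat = diff / (2 * c_k * delta)  # gradient estimate, elementwise
                theta -= a_k * g_hat              # stochastic gradient step
            return theta

        # Toy usage: noisy quadratic with minimum at (1, -2).
        rng = np.random.default_rng(5)
        f = lambda t: (t[0] - 1) ** 2 + (t[1] + 2) ** 2 + 0.01 * rng.normal()
        print(spsa_minimize(f, [0.0, 0.0]))       # close to [1, -2]

    The same decreasing-gain template, analyzed through Lyapunov functions as in item 3.2, covers stochastic gradient descent whenever an unbiased gradient estimate is available directly.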
    9. Method of instruction: 2 hours of lectures per week.
    10. Assessment
    Signature: numerical experiments
    Final: Oral exam
    12. Consultations: in office hours or by appointment.
    13. References, textbooks and resources
    - Jerome Friedman, Trevor Hastie, Robert Tibshirani. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. Springer, 2009.
    - Dimitri P. Bertsekas, John Tsitsiklis. Neuro-Dynamic Programming. Athena Scientific, 1996.
    - Albert Benveniste, Michel Métivier, Pierre Priouret. Adaptive Algorithms and Stochastic Approximations. Springer, 1990.
    - Bernhard Schölkopf, Alexander J. Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. The MIT Press, 2002.
    - Harold J. Kushner, G. George Yin. Stochastic Approximation and Recursive Algorithms and Applications. 2nd ed. Springer, 2008.
    - Simon Haykin. Neural Networks and Learning Machines. 3rd ed. Prentice Hall, 2008.
    14. Required learning hours and assignment
    Contact hours: 28
    Preparation for classes during the semester: 14
    Preparation for midterm tests: 0
    Homework: 14
    Study of assigned written material: 10
    Preparation for the exam: 24
    Total: 90