
    Media Informatics Systems

    Name of the subject in Hungarian: Médiainformatikai rendszerek

    Last updated: 26 August 2018

    Budapest University of Technology and Economics
    Faculty of Electrical Engineering and Informatics
    Course ID   Semester   Assessment   Credit   Subject semester
    VITMMA08    2          2/1/0/v      4
    3. Course coordinator and department Dr. Mihajlik Péter,
    4. Instructors

    Name                     Position              University, Department
    Dr. Gábor MAGYAR PhD     Associate Professor   BME-TMIT
    Dr. Péter MIHAJLIK PhD   Lecturer              BME-TMIT
    Dr. Gábor SZŰCS PhD      Associate Professor   BME-TMIT

    5. Required knowledge Media information technologies and tools
    7. Objectives, learning outcomes and obtained knowledge
    The aim of the course is to present the fundamentals of digital multimedia content management and to introduce applied pattern recognition and analytic techniques. Students learn about recognition and categorization problems and about the standards for descriptive attributes of multimedia content. By the end of the semester, students will be able to understand and carry out engineering tasks related to media informatics systems using the required technologies and tools.
    8. Synopsis

     

    Week 1: Basic definitions. Media content management. Main topics of multimedia (image, voice, video) processing.

     

    Week 2: Processing audio content. Short-Time Fourier Spectrum, windowing, spectrogram computation. The fundamentals of signal detection.

     

    Week 3: Music recognition. Challenges in real-time pattern matching: additive noise, linear and non-linear distortions. Audio fingerprinting. Case study: Shazam.

     

    Week 4: Recognition of variable sound and image signals. Statistics-based general classification. Probability density function, likelihood, training and testing. Bayes' theorem.

     

    Week 5: Multimedia classification tasks based on multivariate Gaussian Mixture Models.

     

    Week 6: Maximum Likelihood vs. Discriminative Training: theoretical and practical issues regarding audio and image data. The application of Multilayer Perceptrons to media informatics tasks.

     

    Week 7: Processing time-varying signals: Dynamic Time Warping, Hidden Markov Models and their practical implementations. Fundamentals of speech recognition.

     

    Week 8: Contemporary speech recognition technologies. Acoustic, pronunciation and language models. Subtitling methods and standards. Case studies: BBC and the Hungarian National Television subtitling approaches.

     

    Week 9: Deep learning: deep feed-forward neural nets, convolutional nets and recurrent networks, and their application to the automatic annotation of media content (text, sound, image and video).

     

    Week 10: Complex tasks based on multimedia technologies: shape detection in images, object tracking in videos. Face detection and recognition solutions. Video processing methods in practice: hand gesture recognition in videos.

     

    Week 11: Metadata: semantic and descriptive metadata. Multimedia metadata standards. EBU/SMPTE metadata, Dublin Core, Material Exchange Format (MXF).

     

    Week 12: Multimedia databases. Multimedia information retrieval. Search modes, types, algorithms. Quality measurement of an information retrieval system.

     

    Week 13: Digital Media Management Systems (DMMS) / Multimedia Asset Management. Structure of the systems: gathering, storing and displaying subsystems. Lifecycle attributes. Integration tools. Architecture of media content management systems and their types: DAM, DM, KM, Web CMS, ECM. Overall model of the systems.

     

    Week 14: Digital archiving: tasks, approaches, techniques. Archiving strategies: on-line, near-line, off-line and off-site accessibility levels. Record management.
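
As an illustration of the Week 12 topic "Quality measurement of an information retrieval system", the sketch below computes precision, recall and F1 for a single query. The synopsis does not say which measures the course uses, so this choice is an assumption. The function names and the item IDs are hypothetical and serve only as an example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, given retrieved and relevant item IDs."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)  # relevant items that were actually returned
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Hypothetical query result: 4 items returned, 3 of them among the 6 relevant ones.
retrieved = ["img01", "img02", "img03", "img07"]
relevant = ["img01", "img02", "img03", "img04", "img05", "img06"]
p, r = precision_recall(retrieved, relevant)
f1 = f1_score(p, r)
```

Here precision is the share of returned items that are relevant and recall is the share of relevant items that were returned. F1 combines the two into a single figure.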

     

    In the practice sessions, the following topics will be discussed (the list is neither exclusive nor covered in full): audio feature extraction, audio fingerprinting, GMM, automatic speech recognition, language modeling, applied deep neural nets in Keras, image annotation tasks, multimedia retrieval, visualization tools, video processing.
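
To make the audio feature extraction topic concrete, the following is a minimal sketch of the windowing and spectrogram computation named in Week 2. It uses only the Python standard library and a naive DFT, which is adequate for short frames. It is not taken from the course material: the frame length, hop size, Hann window and test signal are all assumptions chosen for illustration.

```python
import cmath
import math


def hann(n):
    """Hann window of length n (periodic form)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def dft_magnitudes(frame):
    """Magnitudes of the non-negative-frequency DFT bins of one frame (naive O(n^2))."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags


def spectrogram(signal, frame_len=64, hop=32):
    """Magnitude spectrogram: one list of bin magnitudes per windowed frame."""
    window = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        frames.append(dft_magnitudes(frame))
    return frames


# Illustrative input: a sine whose frequency falls exactly on DFT bin 8.
sample_rate = 8000
frame_len = 64
freq = 8 * sample_rate / frame_len  # 1000 Hz
signal = [math.sin(2 * math.pi * freq * t / sample_rate) for t in range(256)]

spec = spectrogram(signal, frame_len=frame_len, hop=32)
# Index of the strongest frequency bin in each frame.
peak_bins = [max(range(len(row)), key=row.__getitem__) for row in spec]
```

Each row of `spec` is the short-time spectrum of one frame. For this stationary test tone the strongest bin is bin 8 in every frame. In practice an FFT routine would replace the naive DFT.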

    9. Method of instruction 2×45 min lecture per week and 1×45 min seminar per week (held as one 90-minute session every two weeks).
    10. Assessment
    Requirements:
    - Mid-term (written) test
    Exam period:
    - Exam
    11. Recaps The mid-term test can be retaken once during the teaching period, and there is one final opportunity in the official recap period.
    12. Consultations In person, by appointment arranged via e-mail
    13. References, textbooks and resources
    David Austerberry: Digital Asset Management, Focal Press, 2006.
    Serkan Kiranyaz, Moncef Gabbouj: Content-Based Management of Multimedia Databases: Advanced Techniques for Multimedia Analysis and Retrieval, LAP LAMBERT Academic Publishing, 2012.
    Altrichter Márta, Horváth Gábor, Pataki Béla, Strausz György, Takács Gábor, Valyon József: Neurális hálózatok (in Hungarian), Panem Könyvkiadó Kft., Budapest, 2006.
    Michael Nielsen: Neural Networks and Deep Learning, 2016. Online: http://neuralnetworksanddeeplearning.com/
    Lawrence Rabiner, Biing-Hwang Juang: Fundamentals of Speech Recognition, Prentice Hall, New Jersey, 1993.

    14. Required learning hours and assignment
    Lessons                      42
    Preparing for the lessons    18
    Preparing for the test       25
    Homework
    Preparing for the exam       35
    Total                        120
    15. Syllabus prepared by

    Name                         Position              University, Department
    Dr. Magyar Gábor PhD         Associate Professor   BME-TMIT
    Dr. Szűcs Gábor PhD          Associate Professor   BME-TMIT
    Dr. Mihajlik Péter PhD       Lecturer              BME-TMIT
    Dr. Gyires-Tóth Bálint PhD   Lecturer              BME-TMIT