Budapest University of Technology and Economics, Faculty of Electrical Engineering and Informatics



    GPGPU Applications

    Name of the subject in Hungarian: GPGPU alkalmazások

    Last updated: January 28, 2019

    Computer Engineering MSc, Cloud and parallel systems specialization
    Course ID   Semester   Assessment   Credit   Subject semester
    VIIIMB01    3          2/1/0/v      4
    3. Course coordinator and department Dr. Magdics Milán, Department of Control Engineering and Information Technology
    Web page of the course
    4. Instructors

    Dr. Magdics Milán, Department of Control Engineering and Information Technology

    Tóth Márton, Department of Control Engineering and Information Technology 

    5. Required knowledge Programming, data structures, algorithms, mathematics
    6. Pre-requisites
    C++ programming knowledge is required.
    7. Objectives, learning outcomes and obtained knowledge The course demonstrates how the computing power of modern graphics cards can be used for general-purpose computation through their generalized programming model. It introduces the architecture of the graphics card and the OpenCL general-purpose computing environment, and presents various algorithms designed for massively parallel architectures through practical examples.
    8. Synopsis

    1. Overview of the architecture of the GPU

    The lecture discusses the massively parallel architecture of the graphics hardware and its limitations, as well as the fundamentals of parallel programming. An overview of environments for graphics hardware programming is presented.


    2. Introduction to OpenCL

    OpenCL, a general-purpose environment for GPU programming, is presented, including an overview of its virtual machine platform, its memory and program models, the OpenCL C language, and the related CPU-side API.


    3. GPGPU support for computations on large data sets

    Vector processing operations for large data sets are discussed. An overview of the parallelization of scattering and gathering type algorithms, including their limitations and implementation details, is presented. 
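
    To illustrate the difference between the two access patterns, here is a minimal sequential C++ sketch (not taken from the course materials; the function names gather and scatter are hypothetical). Each loop iteration stands in for one GPU work-item. Gather writes are conflict-free, whereas scatter writes can collide when two indices coincide.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Gather: output element i READS from a computed input location.
// Every iteration writes only out[i], so iterations never conflict.
std::vector<int> gather(const std::vector<int>& in,
                        const std::vector<std::size_t>& idx) {
    std::vector<int> out(idx.size());
    for (std::size_t i = 0; i < idx.size(); ++i) {  // one work-item per i
        assert(idx[i] < in.size());
        out[i] = in[idx[i]];
    }
    return out;
}

// Scatter: input element i WRITES to a computed output location.
// If two indices coincide, parallel work-items would race on the same
// slot; this sequential loop simply lets the last write win.
std::vector<int> scatter(const std::vector<int>& in,
                         const std::vector<std::size_t>& idx,
                         std::size_t out_size) {
    assert(idx.size() == in.size());
    std::vector<int> out(out_size, 0);
    for (std::size_t i = 0; i < in.size(); ++i) {  // one work-item per i
        assert(idx[i] < out_size);
        out[idx[i]] = in[i];
    }
    return out;
}
```

    On a real device the scatter variant would typically need atomic operations or a reformulation as a gather to stay correct when indices repeat.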


    4. Implementation issues of basic parallel primitives in OpenCL

    Introduction to the parallel programming primitives, the most fundamental building blocks of scalable algorithms. Discussed primitives include map, reduce, amplify, scan and compact operators.
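
    As a rough illustration of how two of these primitives fit together, the following sequential C++ sketch builds compact on top of scan. It assumes a Hillis-Steele style scan, which is one common formulation and not necessarily the one used in the course; the names inclusive_scan_steps and compact are hypothetical. Each inner-loop iteration corresponds to one independent work-item.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Inclusive prefix sum in the Hillis-Steele style: about log2(n) passes,
// and within a pass every element i is computed independently.
std::vector<int> inclusive_scan_steps(std::vector<int> data) {
    std::vector<int> next(data.size());
    for (std::size_t offset = 1; offset < data.size(); offset *= 2) {
        for (std::size_t i = 0; i < data.size(); ++i) {  // one work-item per i
            next[i] = data[i] + (i >= offset ? data[i - offset] : 0);
        }
        std::swap(data, next);  // double buffering avoids read/write hazards
    }
    return data;
}

// Compact: keep the elements whose flag is 1, preserving their order.
// The flags (0 or 1) would normally be produced by a map step.
std::vector<int> compact(const std::vector<int>& in,
                         const std::vector<int>& keep) {
    assert(in.size() == keep.size());
    if (in.empty()) return {};
    const std::vector<int> pos = inclusive_scan_steps(keep);
    std::vector<int> out(static_cast<std::size_t>(pos.back()));
    for (std::size_t i = 0; i < in.size(); ++i) {  // one work-item per i
        // pos[i] - 1 is the exclusive scan value, i.e. the output slot.
        if (keep[i]) out[static_cast<std::size_t>(pos[i] - 1)] = in[i];
    }
    return out;
}
```

    The scan turns the per-element flags into unique output positions, so every kept element can be written without coordination between work-items.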


    5. Solution of linear equation systems 

    Parallel algorithms for solving linear equation systems are presented. We discuss implementation issues of matrix and vector operations, as well as storage issues and operations with sparse matrices. 
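
    One widely used sparse storage scheme is compressed sparse row (CSR); the synopsis does not say which formats the course covers, so CSR is an assumption here. The sequential C++ sketch below shows a sparse matrix-vector product, with the hypothetical names CsrMatrix and spmv, where each row can be processed by an independent work-item.

```cpp
#include <cstddef>
#include <vector>

// Compressed sparse row storage: only the non-zero entries are kept.
struct CsrMatrix {
    std::size_t rows;
    std::vector<std::size_t> row_ptr;  // size rows + 1; row r spans [row_ptr[r], row_ptr[r+1])
    std::vector<std::size_t> col_idx;  // column index of each stored value
    std::vector<double> values;        // the non-zero values, row by row
};

// y = A * x. Rows are independent, so each row maps to one work-item.
std::vector<double> spmv(const CsrMatrix& a, const std::vector<double>& x) {
    std::vector<double> y(a.rows, 0.0);
    for (std::size_t r = 0; r < a.rows; ++r) {  // one work-item per row
        double sum = 0.0;
        for (std::size_t k = a.row_ptr[r]; k < a.row_ptr[r + 1]; ++k) {
            sum += a.values[k] * x[a.col_idx[k]];
        }
        y[r] = sum;
    }
    return y;
}
```

    Iterative solvers spend most of their time in products of this kind, which is why the storage layout matters for performance.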


    6. Physical simulations on the GPU

    Examples of physical models that can be evaluated efficiently on GPUs. The efficiency and scalability of the presented algorithms are also discussed.


    7. Parallel sorting methods

    Sorting algorithms designed for parallel architectures are presented: brick sort, radix sort, merge sort and the parallel variant of quick sort. The parallel algorithms are compared to traditional sorting algorithms with respect to complexity and time cost.
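
    Brick sort, also known as odd-even transposition sort, is the simplest of these to sketch. The sequential C++ version below is an illustration only (the function name brick_sort is hypothetical). Within a phase, all compare-exchange steps touch disjoint pairs and could therefore run as parallel work-items.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Odd-even transposition ("brick") sort: n phases are enough to sort
// n elements. Pairs inside one phase are disjoint, hence independent.
std::vector<int> brick_sort(std::vector<int> a) {
    const std::size_t n = a.size();
    for (std::size_t phase = 0; phase < n; ++phase) {
        // Even phases compare (0,1), (2,3), ...; odd phases (1,2), (3,4), ...
        for (std::size_t i = phase % 2; i + 1 < n; i += 2) {  // one work-item per pair
            if (a[i] > a[i + 1]) std::swap(a[i], a[i + 1]);
        }
    }
    return a;
}
```

    The total work is quadratic, so the method pays off only when the pairs of a phase really are processed in parallel.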


    8. Breadth-first traversal of graphs and its applications

    Discussion of graph traversal algorithms for parallel architectures and their applications. Complexity and time cost of the presented algorithms are compared to those of the traditional CPU graph traversal methods.
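
    A common way to expose parallelism in breadth-first traversal is to process one frontier (level) at a time. The sequential C++ sketch below assumes a CSR adjacency layout and uses the hypothetical names CsrGraph and bfs_levels; it is not taken from the course materials. All vertices of a frontier could be expanded by separate work-items.

```cpp
#include <cstddef>
#include <vector>

// Adjacency lists stored back to back, as in a CSR matrix.
struct CsrGraph {
    std::vector<std::size_t> row_ptr;  // size = vertex count + 1
    std::vector<std::size_t> col_idx;  // concatenated neighbor lists
};

// Returns the BFS level of every vertex, or -1 if it is unreachable.
std::vector<int> bfs_levels(const CsrGraph& g, std::size_t source) {
    const std::size_t n = g.row_ptr.empty() ? 0 : g.row_ptr.size() - 1;
    std::vector<int> level(n, -1);
    if (source >= n) return level;

    level[source] = 0;
    std::vector<std::size_t> frontier{source};
    for (int depth = 0; !frontier.empty(); ++depth) {
        std::vector<std::size_t> next;
        for (std::size_t v : frontier) {  // one work-item per frontier vertex
            for (std::size_t e = g.row_ptr[v]; e < g.row_ptr[v + 1]; ++e) {
                const std::size_t u = g.col_idx[e];
                // On a GPU this check-and-set would need an atomic operation.
                if (level[u] == -1) {
                    level[u] = depth + 1;
                    next.push_back(u);
                }
            }
        }
        frontier = std::move(next);
    }
    return level;
}
```

    Building the next frontier in parallel is the hard part in practice; it is usually done with scan and compact rather than push_back.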


    9. Parallel hash-based algorithms

    Parallel hash algorithms and their possible applications are presented. Complexity and time cost of the presented algorithms are compared to those of the traditional CPU hash-based methods.
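
    As one possible illustration, the C++ sketch below implements a fixed-size hash set with open addressing and linear probing, where a slot is claimed with an atomic compare-and-swap so that concurrent inserts cannot overwrite each other. The scheme, the class name HashSet, the hash constant and the sentinel kEmpty are all assumptions made for this example, not the course's method.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint32_t kEmpty = 0xFFFFFFFFu;  // sentinel; must not be used as a key

class HashSet {
public:
    explicit HashSet(std::size_t capacity) : slots_(capacity) {
        for (auto& s : slots_) s.store(kEmpty);
    }

    // Returns true if the key is in the set afterwards, false if the table is full.
    bool insert(std::uint32_t key) {
        const std::size_t n = slots_.size();
        if (n == 0) return false;
        std::size_t pos = hash(key) % n;
        for (std::size_t probe = 0; probe < n; ++probe) {
            std::uint32_t expected = kEmpty;
            // Atomically claim the slot only if it is still empty.
            if (slots_[pos].compare_exchange_strong(expected, key)) return true;
            if (expected == key) return true;  // already stored
            pos = (pos + 1) % n;               // linear probing
        }
        return false;
    }

    bool contains(std::uint32_t key) const {
        const std::size_t n = slots_.size();
        if (n == 0) return false;
        std::size_t pos = hash(key) % n;
        for (std::size_t probe = 0; probe < n; ++probe) {
            const std::uint32_t v = slots_[pos].load();
            if (v == key) return true;
            if (v == kEmpty) return false;  // an empty slot ends the probe chain
            pos = (pos + 1) % n;
        }
        return false;
    }

private:
    // Multiplicative hash; the constant is an arbitrary illustrative choice.
    static std::uint32_t hash(std::uint32_t k) { return k * 2654435761u; }
    std::vector<std::atomic<std::uint32_t>> slots_;
};
```

    The same claim-by-compare-and-swap idea carries over to OpenCL kernels through its atomic functions, with one work-item per inserted key.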


    10. Monte Carlo methods on the GPU

    Introduction to Monte Carlo methods, their implementation issues and possible applications. As the main issue of these algorithms is the generation of high-quality random numbers, we discuss pseudo-random and quasi-random number generators that can be used on parallel architectures.
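
    To make the quasi-random idea concrete, the C++ sketch below estimates pi with a two-dimensional Halton sequence. Halton is chosen here only as a well-known example of a quasi-random generator; the synopsis does not name specific generators, and the function names radical_inverse and estimate_pi are hypothetical. Sample i depends only on its index, so every work-item can generate its own point without shared state.

```cpp
#include <cstddef>

// Radical inverse (van der Corput): mirror the base-b digits of index
// around the radix point, giving a value in [0, 1).
double radical_inverse(std::size_t index, std::size_t base) {
    double result = 0.0;
    double digit_weight = 1.0 / static_cast<double>(base);
    while (index > 0) {
        result += static_cast<double>(index % base) * digit_weight;
        index /= base;
        digit_weight /= static_cast<double>(base);
    }
    return result;
}

// Estimate pi as 4 times the fraction of Halton points (bases 2 and 3)
// that fall inside the unit quarter circle.
double estimate_pi(std::size_t samples) {
    if (samples == 0) return 0.0;
    std::size_t inside = 0;
    for (std::size_t i = 1; i <= samples; ++i) {  // one work-item per sample
        const double x = radical_inverse(i, 2);
        const double y = radical_inverse(i, 3);
        if (x * x + y * y < 1.0) ++inside;
    }
    return 4.0 * static_cast<double>(inside) / static_cast<double>(samples);
}
```

    In a parallel version the per-sample hits would be combined with a reduce step instead of the shared counter used here.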

    11. Adjoint Monte Carlo methods

    Fundamental issues of adjoint Monte Carlo methods. We discuss theoretical requirements and implementation issues of a gathering type algorithm in the context of GPU based PET reconstruction.


    12. Optimization issues of GPGPU applications

    Optimization and performance measurement issues of parallel algorithms are discussed. The possibilities of algorithmic optimization using theoretical and practical metrics are presented.


    13. Efficient interoperability with the graphics API (OpenGL)

    The basic tools for connecting the OpenCL environment with the OpenGL graphics API are discussed. These can be used to create efficient visualizations for general-purpose computing applications, which can be extremely helpful in evaluating computation results.


    14. Special requirements for multi GPU and distributed systems

    We discuss the main issues of programming multi GPU systems, the limitations due to the distributed architecture and generally applicable techniques.

    9. Method of instruction Lectures and practicals.
    13. References, textbooks and resources Lecture slides and course materials are available on the course webpage.
    14. Required learning hours and assignment
    Contact hours: 42
    Preparation for classes during the semester: 14
    Preparation for the midterm test: 0
    Homework assignment: 34
    Study of assigned written material: 0
    15. Syllabus prepared by Dr. Tóth Balázs, Department of Control Engineering and Information Technology