GPGPU Applications

Name of the subject in Hungarian: GPGPU alkalmazások

Last updated: 28 January 2019

Budapest University of Technology and Economics
Faculty of Electrical Engineering and Informatics
Computer Engineering MSc, Cloud and parallel systems specialization
Course ID    Semester    Assessment    Credit    Subject semester
VIIIMB01     3           2/1/0/v       4
3. Course coordinator and department Dr. Magdics Milán, Department of Control Engineering and Information Technology
Web page of the course
4. Instructors

Dr. Magdics Milán, Department of Control Engineering and Information Technology

Tóth Márton, Department of Control Engineering and Information Technology 

5. Required knowledge Programming, data structures, algorithms, mathematics
6. Pre-requisites
C++ programming knowledge is required.
7. Objectives, learning outcomes and obtained knowledge The course demonstrates how the computing power of modern graphics cards can be used for general-purpose computation through their generalized programming model. It introduces the architecture of the graphics card and the OpenCL general-purpose computing environment. Various algorithms designed for massively parallel architectures are presented through practical examples.
8. Synopsis

1. Overview of the architecture of the GPU

The lecture discusses the massively parallel architecture of the graphics hardware and its limitations, as well as the fundamentals of parallel programming. An overview of environments for graphics hardware programming is presented.


2. Introduction to OpenCL

OpenCL, a general-purpose environment for GPU programming, is presented, including an overview of its virtual machine platform, its memory and programming models, the OpenCL C language, and the related CPU-side API.


3. GPGPU support for computations on large data sets

Vector processing operations for large data sets are discussed. An overview of the parallelization of scatter-type and gather-type algorithms, including their limitations and implementation details, is presented.
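
To illustrate the difference between the two algorithm types, the following is a minimal sequential sketch in Python, not taken from the course material. Each loop iteration stands in for one GPU work-item; the 3-tap weights and the function names are assumptions made for this example only.

```python
# Sequential sketch of scatter vs. gather. Each outer-loop iteration
# stands in for one GPU work-item. The 3-tap kernel is an assumption.
WEIGHTS = {-1: 0.25, 0: 0.5, 1: 0.25}

def blur_scatter(src):
    """Input-driven: each input element adds its share to several outputs."""
    dst = [0.0] * len(src)
    for i, value in enumerate(src):          # one work-item per input
        for offset, weight in WEIGHTS.items():
            j = i + offset
            if 0 <= j < len(dst):
                # Several work-items may target the same dst[j], so a
                # parallel version would need an atomic add here.
                dst[j] += weight * value
    return dst

def blur_gather(src):
    """Output-driven: each output element reads the inputs it depends on."""
    dst = [0.0] * len(src)
    for j in range(len(dst)):                # one work-item per output
        acc = 0.0
        for offset, weight in WEIGHTS.items():
            i = j - offset
            if 0 <= i < len(src):
                acc += weight * src[i]
        dst[j] = acc                         # single write, no conflict
    return dst
```

Both functions compute the same result. The gather form writes each output exactly once, which is why it usually maps more easily to a massively parallel device.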


4. Implementation issues of basic parallel primitives in OpenCL

Introduction to parallel programming primitives, the fundamental building blocks of scalable algorithms. The primitives discussed include the map, reduce, amplify, scan and compact operators.
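
As a hedged illustration of how these primitives compose, the sketch below builds compact from map and scan. It is plain sequential Python, not OpenCL. The scan follows the Hillis-Steele scheme, which is one possible choice and not necessarily the variant used in the course, and all function names are assumptions.

```python
def inclusive_scan(values):
    """Hillis-Steele prefix sum: about log2(n) passes, each data-parallel."""
    data = list(values)
    stride = 1
    while stride < len(data):
        # Double buffering: every "work-item" reads the old array and
        # writes one element of the new array.
        data = [data[i] + data[i - stride] if i >= stride else data[i]
                for i in range(len(data))]
        stride *= 2
    return data

def exclusive_scan(values):
    """Shift the inclusive scan right by one and start from zero."""
    inc = inclusive_scan(values)
    return [0] + inc[:-1] if inc else []

def compact(values, predicate):
    """Keep the elements that satisfy the predicate, preserving order."""
    flags = [1 if predicate(v) else 0 for v in values]    # map
    offsets = exclusive_scan(flags)                       # scan
    total = offsets[-1] + flags[-1] if values else 0      # output size
    out = [None] * total
    for i, v in enumerate(values):                        # conflict-free scatter
        if flags[i]:
            out[offsets[i]] = v
    return out
```

The scan gives every surviving element a unique output slot, so the final write step needs no synchronization between work-items.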


5. Solution of linear equation systems 

Parallel algorithms for solving linear equation systems are presented. We discuss implementation issues of matrix and vector operations, as well as storage issues and operations with sparse matrices. 
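
One common sparse storage scheme is compressed sparse row (CSR); the syllabus does not state which formats the course covers, so its use here is an assumption. The sketch below is sequential Python with illustrative names, showing a matrix-vector product in which each row could be handled by an independent work-item.

```python
def csr_matvec(values, col_index, row_ptr, x):
    """Compute y = A * x for a matrix A stored in CSR form.

    values    : nonzero entries, stored row by row
    col_index : column of each nonzero entry
    row_ptr   : row r occupies values[row_ptr[r]:row_ptr[r + 1]]
    """
    y = []
    for row in range(len(row_ptr) - 1):      # one work-item per row
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_index[k]]
        y.append(acc)
    return y

# Example matrix:
# [[4, 0, 1],
#  [0, 3, 0],
#  [2, 0, 5]]
EXAMPLE_VALUES = [4.0, 1.0, 3.0, 2.0, 5.0]
EXAMPLE_COLS = [0, 2, 1, 0, 2]
EXAMPLE_ROW_PTR = [0, 2, 3, 5]
```

Rows with very different numbers of nonzeros lead to unbalanced work-items, which is one of the storage and implementation issues such a format raises.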


6. Physical simulations on the GPU

Examples of physical models that can be evaluated efficiently on GPUs. The efficiency and scalability of the presented algorithms are also discussed.


7. Parallel sorting methods

Sorting algorithms designed for parallel architectures are presented: brick sort, radix sort, merge sort and the parallel variant of quick sort. The parallel algorithms are compared to traditional sorting algorithms with respect to complexity and time cost.
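
As an example of the first algorithm in the list, brick sort (also known as odd-even transposition sort) can be sketched as follows. This is sequential Python with an assumed function name; on a GPU, the inner loop would run as independent work-items with a barrier between phases.

```python
def brick_sort(values):
    """Odd-even transposition sort on a copy of the input."""
    data = list(values)
    n = len(data)
    for phase in range(n):                   # n phases are sufficient
        start = phase % 2                    # even phase: (0,1),(2,3),...
                                             # odd phase:  (1,2),(3,4),...
        for i in range(start, n - 1, 2):     # disjoint pairs, no conflicts
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data
```

The total work is O(n^2) comparisons, worse than a good sequential sort, but with about n/2 processors the n phases take O(n) parallel steps. This trade-off between work and parallel time is the kind of comparison the paragraph above refers to.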


8. Breadth-first traversal of graphs and its applications

Discussion of graph traversal algorithms for parallel architectures and their applications. Complexity and time cost of the presented algorithms are compared to those of the traditional CPU graph traversal methods.
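
A level-synchronous formulation is one common way to expose parallelism in breadth-first traversal; whether the course uses exactly this scheme is an assumption. The sketch below is sequential Python with illustrative names, where each vertex of the current frontier stands in for one work-item.

```python
def bfs_levels(adjacency, source):
    """Return the BFS depth of every vertex reachable from source.

    adjacency maps each vertex to a list of its neighbours.
    """
    level = {source: 0}
    frontier = [source]
    depth = 0
    while frontier:
        depth += 1
        next_frontier = []
        for u in frontier:                   # one work-item per frontier vertex
            for v in adjacency[u]:
                # In a parallel version this check-and-set must be atomic,
                # because two frontier vertices may share a neighbour.
                if v not in level:
                    level[v] = depth
                    next_frontier.append(v)
        frontier = next_frontier             # barrier between levels
    return level
```

The number of synchronization steps equals the depth of the traversal, so graphs with a large diameter offer little parallelism per level.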


9. Parallel hash-based algorithms

Parallel hash algorithms and their possible applications are presented. Complexity and time cost of the presented algorithms are compared to those of the traditional CPU hash-based methods.


10. Monte Carlo methods on the GPU

Introduction to Monte Carlo methods, their implementation issues and possible applications. As the main issue of these algorithms is the generation of high-quality random numbers, we discuss pseudo-random and quasi-random number generators that can be used on parallel architectures.
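
To make the distinction between the two generator families concrete, the sketch below estimates pi from points in the unit square, once with Python's built-in pseudo-random generator and once with a Halton sequence (bases 2 and 3). The choice of the Halton sequence, the pi example and all function names are assumptions for illustration, not course content.

```python
import random

def radical_inverse(index, base):
    """Mirror the base-b digits of index around the radix point."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result

def quasi_points(count):
    """First `count` points of the 2D Halton sequence (bases 2 and 3)."""
    return [(radical_inverse(i, 2), radical_inverse(i, 3))
            for i in range(1, count + 1)]

def pseudo_points(count, seed):
    """Pseudo-random points from a seeded generator."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(count)]

def estimate_pi(points):
    """Four times the fraction of points inside the unit quarter circle."""
    inside = sum(1 for x, y in points if x * x + y * y <= 1.0)
    return 4.0 * inside / len(points)
```

A quasi-random point depends only on its index, so each work-item can compute its own sample without shared generator state. A pseudo-random generator typically needs a separate, carefully seeded stream per work-item.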

11. Adjoint Monte Carlo methods

Fundamental issues of adjoint Monte Carlo methods. We discuss the theoretical requirements and implementation issues of a gather-type algorithm in the context of GPU-based PET reconstruction.


12. Optimization issues of GPGPU applications

Optimization and performance measurement issues of parallel algorithms are discussed. Possibilities for algorithmic optimization guided by theoretical and practical metrics are presented.


13. Efficient interoperability with the graphics API (OpenGL)

The basic tools for connecting the OpenCL environment with the OpenGL graphics API are discussed. These can be used to create efficient visualizations for general-purpose computing applications, which can be extremely helpful when evaluating computation results.


14. Special requirements for multi-GPU and distributed systems

We discuss the main issues of programming multi-GPU systems, the limitations imposed by the distributed architecture, and generally applicable techniques.

9. Method of instruction Lectures and practicals.
13. References, textbooks and resources Lecture slides and course materials are available on the course webpage.
14. Required learning hours and assignment
Contact hours: 42
Preparation for classes during the semester: 14
Preparation for midterm test: 0
Homework assignment: 34
Study of assigned written material: 0
15. Syllabus prepared by Dr. Tóth Balázs, Department of Control Engineering and Information Technology