Computer Vision Systems
Name of the subject in Hungarian: Számítógépes látórendszerek
Last updated: May 7, 2025
The form above is specific to Neptun; for technical reasons we have not changed it.
The mandatory prerequisite order can be found on the website and in the curriculum of the given degree programme.
1. Introduction, basic tasks and challenges of computer vision, semantic gap. Fundamentals of image sensing, human vision, photodiode, CCD, CMOS, color vision. Sources of image noise and defects, blurriness, focus, image storage techniques. Role of color components, color spaces. Image enhancement techniques, intensity transformations, histograms, histogram transformations.
2. Filtering in the image domain, convolution, smoothing, sharpening, and edge detection filters, nonlinear filters. Edge detection, Canny algorithm. Image arithmetic, interpolation techniques, fittings.
3. Image processing in the frequency domain, 2D Fourier transform, analysis of image spectrum. Filtering in the frequency domain, properties of ideal and other filters. Classification based on spectrum, analysis of periodic noise. DCT, JPEG compression, Wiener deconvolution.
4. Types and extraction of image features. Template matching, similarity metrics. Corner detection, local structure matrix, KLT, Harris. Invariances to transformations, SIFT, ORB. Classification methods: Haar features, Viola-Jones, Bag of Visual Words, Deformable Parts. Tracking solutions: Pixel-based tracking, Optical Flow, LK and Farneback methods. Iterative and pyramid optical flow. Application of HMM and Kalman Filter, object matching based on affinity.
5. Categorization of segmentation methods. Intensity-based segmentation, thresholding, histogram-based methods. Clustering techniques: k-Means, MoG, Mean-shift. Region growing, Split & Merge, SRM. Watershed, graph cuts, motion segmentation.
6. Processing of binary images, basic morphological operations, opening, closing, contour detection. Distance and adjacency, Jordan property. Skeletonization. Binary object descriptors: Euler number, fingerprint, position, orientation. Object counting and labeling. Hough transform.
7. Basics of machine learning, structure of learning systems, types of learning. Examples of learning systems, kNN. Neural networks, fundamental learning challenges, overfitting, data quality. Steps of supervised learning. Perceptron model, decision function. Error functions, gradient method, higher-order methods. MLP and backpropagation.
8. Structure of convolutional networks. Well-known architectures: VGG, Inception, ResNet, DenseNet, EfficientNet. Neural network visualization, adversarial attacks.
9. Deep learning in practice, ensuring convergence, avoiding overfitting. Hyperparameter search, model compression, pruning and ensembles.
10. Detection architectures: R-CNN variants, YOLO. Key metrics and databases, anchor-based and anchor-free solutions. Mask and other R-CNN extensions. Segmentation methods: U-Net, upscaling techniques. ASPP and CRF extensions.
11. Video processing, levels of fusion, 3D convolution. Recurrent architectures: RNN, BPTT, vanishing gradients. LSTM and GRU, soft attention mechanisms. Self-attention and vision transformer solutions.
12. Basics of projective geometry, types of transformations and their properties. Imaging geometry, pinhole camera model, extrinsic and intrinsic parameters. Camera calibration methods: 3D marker-based and chessboard-based solutions, self-calibration.
13. Stereo setup, epipolar geometry, essential and fundamental matrix. Stereo calibration, rectification. Concept of disparity and methods for its determination: BM, SGBM, BP. 3D reconstruction and its invariances, practical applications. SLAM and SfM, multi-view reconstruction.
During the semester: To obtain the course signature, the following requirement must be met:
Summary assessment: Completion of one midterm test with a minimum score of 40%.
During the exam period: Students earn their final grade by completing a written exam. The score from the midterm test contributes 20% to the final exam grade. The final grade is determined based on the following point scale:
0–39%: Fail
40–54%: Pass
55–69%: Satisfactory
70–84%: Good
85–100%: Excellent
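The grading rules above can be sketched as a short calculation. This is only an illustration under stated assumptions, not the official grading procedure: it assumes the midterm's 20% contribution means a weighted average (20% midterm, 80% written exam), that the scale is applied to this combined percentage, and that each band's lower bound is inclusive. The names `has_signature`, `final_grade`, and `GRADE_BANDS` are hypothetical.

```python
# Illustrative sketch of the grading rules; the weighting interpretation
# (20% midterm + 80% written exam) is an assumption, not an official formula.

# Lower bound (inclusive, assumed) of each band, highest first.
GRADE_BANDS = [
    (85.0, "Excellent"),
    (70.0, "Good"),
    (55.0, "Satisfactory"),
    (40.0, "Pass"),
    (0.0, "Fail"),
]


def has_signature(midterm_pct: float) -> bool:
    """The course signature requires at least 40% on the midterm test."""
    return midterm_pct >= 40.0


def final_grade(midterm_pct: float, exam_pct: float,
                midterm_weight: float = 0.2) -> tuple[float, str]:
    """Combine midterm and exam percentages and map the result to a grade."""
    for value in (midterm_pct, exam_pct):
        if not 0.0 <= value <= 100.0:
            raise ValueError("percentages must lie between 0 and 100")
    # Assumed weighted average of the two scores.
    total = midterm_weight * midterm_pct + (1.0 - midterm_weight) * exam_pct
    for lower_bound, label in GRADE_BANDS:
        if total >= lower_bound:
            return total, label
    return total, "Fail"  # not reached for valid input; kept for completeness


if __name__ == "__main__":
    midterm, exam = 50.0, 90.0
    if has_signature(midterm):
        total, label = final_grade(midterm, exam)
        print(f"combined score: {total:.1f}% -> {label}")
    else:
        print("no signature: midterm below 40%")
```

Under these assumptions, a student with 50% on the midterm and 90% on the written exam would reach a combined score of 82%, which falls in the "Good" band.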
1. Lecture notes and slides
2. John C. Russ, The Image Processing Handbook, CRC Press, 2017, https://doi.org/10.1201/b18983
3. Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning, MIT Press, 2016, https://www.deeplearningbook.org/