Assignments

A1

(a) Explain the three approaches of mean, median and Gaussian based image noise suppression. Compare and contrast the three approaches.
Lecture 2
Mean
  • Each pixel in the image is replaced with the average of its neighboring pixels within a defined window.
  • It is simple and fast, but it blurs edges and fine detail. It also handles salt-and-pepper noise poorly, because each impulse is spread over its neighborhood rather than removed.
Median
  • Each pixel is replaced by the median value of its neighboring pixels.
  • It is excellent at removing salt-and-pepper noise without blurring edges. But it is slower than mean filtering, since the values in each window must be ranked, and it is less effective on Gaussian-distributed noise, where averaging filters do better.
Gaussian
  • Use a Gaussian kernel to weigh neighboring pixels, with closer pixels having higher weights.
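  • It smooths noise while giving the center of the window the most influence, so it blurs edges less than a mean filter of the same window size. But it is still a low-pass filter and blurs edges to some extent, and it is computationally more expensive than mean filtering, although the kernel is separable, which keeps the cost low in practice.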
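The comparison above can be checked on a toy image. The following is a minimal NumPy sketch, not a reference implementation: the helper names, the 3 x 3 window, sigma = 1 and the single "salt" pixel are all assumptions made for illustration.

```python
import numpy as np

def filter_window(image, size, reducer):
    """Apply `reducer` to every size x size neighborhood (edge-padded)."""
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")
    # windows has shape (H, W, size, size): one neighborhood per pixel
    windows = np.lib.stride_tricks.sliding_window_view(padded, (size, size))
    return reducer(windows)

def gaussian_kernel(size, sigma):
    """Normalized 2D Gaussian weights; closer pixels get higher weights."""
    ax = np.arange(size) - size // 2
    g = np.exp(-(ax ** 2) / (2 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()

def mean_filter(image, size=3):
    return filter_window(image, size, lambda w: w.mean(axis=(-2, -1)))

def median_filter(image, size=3):
    return filter_window(image, size, lambda w: np.median(w, axis=(-2, -1)))

def gaussian_filter(image, size=3, sigma=1.0):
    k = gaussian_kernel(size, sigma)
    return filter_window(image, size, lambda w: (w * k).sum(axis=(-2, -1)))

# A flat gray image with one "salt" pixel (hypothetical test data)
image = np.full((5, 5), 100.0)
image[2, 2] = 255.0

mean_out = mean_filter(image)        # impulse is spread over its 3 x 3 neighborhood
median_out = median_filter(image)    # impulse is removed entirely
gauss_out = gaussian_filter(image)   # impulse is spread, but stays concentrated at the center
```

On this input the median output is flat again, while both linear filters only dilute the impulse, which matches the salt-and-pepper remarks in the notes.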
(b) Outline the steps involved in the Canny edge detection process and briefly explain the contribution of each step in edge detection.
Lecture 3
Steps
  1. Gaussian filter smoothing to reduce noise in the image, so that noise does not produce spurious gradient responses.
  2. Gradient magnitude and orientation calculation to detect intensity changes, which indicate potential edges.
  3. Non-maximum suppression to thin the edges to a single-pixel width.
  4. Linking and hysteresis thresholding to finalize the edges: strong edges (above the high threshold) are kept, and weak edges (above the low threshold) are kept only if they connect to strong ones.
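The four steps can be sketched in NumPy as follows. This is a simplified illustration under stated assumptions, not a full Canny implementation: central differences stand in for the derivative filters, the gradient direction is quantized to four sectors, and the function names and the thresholds (0.05 and 0.15) are chosen only for this toy step edge.

```python
import numpy as np

def smooth(image, sigma=1.0):
    """Step 1: separable Gaussian smoothing with edge padding."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x ** 2 / (2 * sigma ** 2))
    g /= g.sum()
    padded = np.pad(image, radius, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, g, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, g, mode="valid")

def gradient(image):
    """Step 2: gradient magnitude and orientation."""
    gy, gx = np.gradient(image)
    return np.hypot(gx, gy), np.arctan2(gy, gx)

def non_max_suppression(mag, angle):
    """Step 3: keep a pixel only if it is a maximum along its gradient direction."""
    out = np.zeros_like(mag)
    # Quantize the direction to 0, 45, 90 or 135 degrees
    sector = (np.round(angle / (np.pi / 4)) % 4).astype(int)
    offsets = {0: (0, 1), 1: (1, 1), 2: (1, 0), 3: (1, -1)}  # (dy, dx)
    for y in range(1, mag.shape[0] - 1):
        for x in range(1, mag.shape[1] - 1):
            dy, dx = offsets[int(sector[y, x])]
            # Strict on one side so that an exact tie keeps a single pixel
            if mag[y, x] > mag[y + dy, x + dx] and mag[y, x] >= mag[y - dy, x - dx]:
                out[y, x] = mag[y, x]
    return out

def hysteresis(nms, low, high):
    """Step 4: keep strong pixels plus weak pixels connected to them."""
    weak = nms >= low
    edges = nms >= high
    h, w = edges.shape
    while True:
        # Grow the edge set into its 8-connected neighbors, restricted to weak pixels
        padded = np.pad(edges, 1)
        grown = np.zeros_like(edges)
        for dy in range(3):
            for dx in range(3):
                grown |= padded[dy:dy + h, dx:dx + w]
        grown &= weak
        if np.array_equal(grown, edges):
            return edges
        edges = grown

# Hypothetical test image: a vertical step edge between columns 5 and 6
image = np.zeros((12, 12))
image[:, 6:] = 1.0

mag, angle = gradient(smooth(image))
edges = hysteresis(non_max_suppression(mag, angle), low=0.05, high=0.15)
```

The smoothed step produces a gradient ridge several pixels wide. After non-maximum suppression and hysteresis, only one edge pixel per interior row remains; the border rows and columns are skipped by this sketch.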

A2

(a) Detail the steps involved in detecting linear features in an image using the Hough transform. How can the gradient orientation be utilized to improve the efficiency of the standard Hough transform?
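Lecture 5
  1. Detect edge points using a technique like Canny.
  2. Represent each line in polar coordinates: ρ = x cos θ + y sin θ.
  3. Initialize the accumulator H(θ, ρ) to all zeros.
    1. For each edge point (x, y) and each θ from 0° to 180°, compute ρ = x cos θ + y sin θ and increment H(θ, ρ) by 1.
  4. Detect peaks by finding the values of (θ, ρ) where H(θ, ρ) is a local maximum.
      • The detected line in the image is given by ρ = x cos θ + y sin θ.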

By analyzing the gradient direction at each edge point, the possible θ values can be limited to directions close to the gradient direction. This reduces the number of angles to check (and votes to cast) per edge point and improves efficiency.
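A minimal NumPy sketch of the voting scheme, with and without the gradient restriction, is given here. The function name, the 1-degree θ sampling, the rounding of ρ to integer bins and the `tolerance_deg` parameter are assumptions made for this illustration.

```python
import numpy as np

def hough_lines(points, shape, gradient_deg=None, tolerance_deg=2):
    """Vote in a (rho, theta) accumulator; theta is sampled at 0..179 degrees.

    If gradient_deg is given (one angle per point), each point votes only for
    theta values within tolerance_deg of its gradient direction.
    """
    theta_deg = np.arange(180)
    thetas = np.deg2rad(theta_deg)
    rho_max = int(np.ceil(np.hypot(*shape)))
    # Row index = rho + rho_max, so that negative rho values fit in the array
    accumulator = np.zeros((2 * rho_max + 1, 180), dtype=int)
    for i, (x, y) in enumerate(points):
        if gradient_deg is None:
            idx = theta_deg
        else:
            # Angular distance to the gradient direction, modulo 180 degrees
            diff = np.abs((theta_deg - gradient_deg[i] + 90) % 180 - 90)
            idx = theta_deg[diff <= tolerance_deg]
        rho = np.round(x * np.cos(thetas[idx]) + y * np.sin(thetas[idx])).astype(int)
        accumulator[rho + rho_max, idx] += 1
    return accumulator, rho_max

# Hypothetical edge points on the vertical line x = 7; their gradient points along x (0 degrees)
points = [(7, y) for y in range(20)]
full, rho_max = hough_lines(points, (20, 20))
fast, _ = hough_lines(points, (20, 20), gradient_deg=[0] * 20)

votes_full = int(full.sum())   # 20 points x 180 angles
votes_fast = int(fast.sum())   # 20 points x 5 angles
```

Both accumulators collect all 20 votes in the cell (θ = 0°, ρ = 7), but the restricted version casts 100 votes instead of 3600.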
(b) The figure below shows an image of a checkerboard on the left and its corresponding Hough space on the right. Demonstrate your understanding of the Hough transform by explaining how the geometric structures in the input image are manifested in the resultant Hough space.
[Figure: checkerboard image (left) and its Hough space (right)]
Lecture 5
The polar representation of a line, ρ = x cos θ + y sin θ, is shown below.
[Figure: polar representation of a line]
  • Therefore, with ρ = x cos θ + y sin θ, the vertical lines in the checkerboard are represented by intersections (peaks) whose θ = 0°, and the difference in ρ between adjacent intersections equals the width of a grid cell in the checkerboard.
  • Similarly, the horizontal lines are transformed to intersections whose θ = 90°, and the difference in ρ between adjacent intersections equals the height of a grid cell in the checkerboard.
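The two bullet points can be verified numerically. This small sketch assumes the parameterization ρ = x cos θ + y sin θ and a hypothetical cell size of 8 pixels. It checks that all points of one grid line share a single ρ at the stated θ.

```python
import math

def rho(x, y, theta_deg):
    """rho = x cos(theta) + y sin(theta)."""
    t = math.radians(theta_deg)
    return x * math.cos(t) + y * math.sin(t)

cell = 8                                           # hypothetical cell size in pixels
line_positions = [cell * i for i in range(1, 5)]   # grid lines at 8, 16, 24, 32

# Vertical lines x = c: every point (c, y) gives the same rho at theta = 0
vertical = [{round(rho(c, y, 0), 6) for y in range(0, 40, 5)} for c in line_positions]

# Horizontal lines y = c: every point (x, c) gives the same rho at theta = 90
horizontal = [{round(rho(x, c, 90), 6) for x in range(0, 40, 5)} for c in line_positions]
```

Each set contains a single value (8.0, 16.0, 24.0, 32.0), so every grid line maps to one peak, and adjacent peaks are one cell size apart along the ρ axis.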
(c) Please give the differences between least squares, total least squares, and robust estimator for line fitting, from the perspectives of solutions and applications.
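Lecture 4
  • Least squares minimizes the vertical (y) errors and has a closed-form solution via the normal equations.
    • Best when errors are small and in y only, but sensitive to outliers.
  • Total least squares minimizes both x and y errors (the perpendicular distances to the line). The solution is the eigenvector of the scatter matrix with the smallest eigenvalue.
    • Works well when errors exist in both variables, but is more computationally intensive.
  • Robust estimators minimize the effect of outliers using techniques like M-estimators, which are usually solved iteratively.
    • Resilient to outliers, suitable for noisy data.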
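To make the three solution styles concrete, here is a hedged NumPy sketch. The function names are hypothetical, the robust fit uses iteratively reweighted least squares with Huber weights as one possible M-estimator (an assumption, the notes do not name one), and the data with a single outlier is made up.

```python
import numpy as np

def least_squares(x, y):
    """Closed form: minimize vertical residuals of y = m x + b."""
    A = np.column_stack([x, np.ones_like(x)])
    (m, b), *_ = np.linalg.lstsq(A, y, rcond=None)
    return m, b

def total_least_squares(x, y):
    """Minimize perpendicular distances to the line a x + b y = d, a^2 + b^2 = 1."""
    pts = np.column_stack([x, y])
    centroid = pts.mean(axis=0)
    # The normal (a, b) is the right singular vector with the smallest singular value
    _, _, vt = np.linalg.svd(pts - centroid, full_matrices=False)
    a, b = vt[-1]
    return a, b, a * centroid[0] + b * centroid[1]

def robust_fit(x, y, iterations=20, delta=1.0):
    """Iterative: reweighted least squares with Huber weights."""
    m, b = least_squares(x, y)
    A = np.column_stack([x, np.ones_like(x)])
    for _ in range(iterations):
        r = np.abs(y - (m * x + b))
        # Residuals larger than delta get a weight below 1
        w = np.where(r <= delta, 1.0, delta / np.maximum(r, 1e-12))
        sw = np.sqrt(w)
        (m, b), *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return m, b

# Hypothetical data: y = 2x + 1, plus a copy with one gross outlier at x = 9
x = np.arange(10, dtype=float)
y = 2 * x + 1
y_out = y.copy()
y_out[9] += 30

m_ls, b_ls = least_squares(x, y_out)     # slope is pulled toward the outlier
m_rob, b_rob = robust_fit(x, y_out)      # slope stays closer to 2
a, b, d = total_least_squares(x, y)      # normal form of the clean line
```

The single outlier moves the least squares slope well away from 2, while the reweighted fit stays much closer. Total least squares returns the line in normal form, which also handles vertical lines that y = m x + b cannot represent.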
(d) List the steps involved in RANSAC for line fitting. Given the outlier ratio to be 0.2, please choose the number of samples so that, with probability 99%, at least one random sample is free from outliers.
Lecture 4
Steps
Repeat N times:
  • Draw s points uniformly at random
  • Fit line to these s points
  • Find inliers to this line among the remaining points (i.e., points whose distance from the line is less than threshold t)
  • If there are d or more inliers, accept the line and refit using all inliers
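The loop above could look like this in NumPy. It is a sketch under assumptions rather than a definitive implementation: s = 2, the refit uses total least squares, the best accepted line is kept (a common variant), the sample points are counted among the inliers, and the function names, the seed and the test data are hypothetical.

```python
import numpy as np

def fit_line(points):
    """Total least squares line n . p = d through a set of 2D points."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    normal = vt[-1]            # unit normal: direction of least variance
    return normal, normal @ centroid

def ransac_line(points, n_iters, threshold, min_inliers, seed=0):
    rng = np.random.default_rng(seed)
    best = (None, None, 0)     # (normal, d, inlier count)
    for _ in range(n_iters):
        # Draw s = 2 distinct points uniformly at random and fit a line to them
        sample = points[rng.choice(len(points), size=2, replace=False)]
        normal, d = fit_line(sample)
        # Inliers: points whose distance from the line is below the threshold
        inliers = np.abs(points @ normal - d) < threshold
        count = int(inliers.sum())
        if count >= min_inliers and count > best[2]:
            # Accept the line and refit using all of its inliers
            normal, d = fit_line(points[inliers])
            best = (normal, d, count)
    return best

# Hypothetical data: 20 points on y = x plus 5 outliers (outlier ratio 0.2)
inlier_pts = np.array([[v, v] for v in range(20)], dtype=float)
outlier_pts = np.array([[2, 15], [5, 17], [12, 1], [16, 3], [9, 18]], dtype=float)
points = np.vstack([inlier_pts, outlier_pts])

normal, d, count = ransac_line(points, n_iters=50, threshold=0.5, min_inliers=10)
```

With this data any sample of two inliers recovers the line y = x, so 50 iterations are far more than needed; the iteration count is left generous on purpose.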
Calculation
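A worked version of the calculation, assuming s = 2 points per sample (the minimum needed to define a line; the notes do not state s explicitly). With outlier ratio e and target probability p, a sample is outlier-free with probability (1 - e)^s, so N must satisfy (1 - (1 - e)^s)^N <= 1 - p, that is, N >= log(1 - p) / log(1 - (1 - e)^s). For e = 0.2 and p = 0.99 this gives 0.36^N <= 0.01, so N >= 4.51 and N = 5.

```python
import math

def ransac_num_samples(p, outlier_ratio, s):
    """Smallest N such that P(at least one outlier-free sample) >= p."""
    good_sample = (1 - outlier_ratio) ** s      # all s points are inliers
    return math.ceil(math.log(1 - p) / math.log(1 - good_sample))

n = ransac_num_samples(p=0.99, outlier_ratio=0.2, s=2)   # 5
```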
(e) Briefly detail the application differences of Hough transform and RANSAC fitting.
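Lecture 4 & 5
  • Hough Transform: Best for detecting well-defined shapes with few parameters (such as lines and circles) in images, including several instances at once. But it is computationally expensive, since the accumulator grows exponentially with the number of model parameters, and sensitive to noise.
  • RANSAC: Ideal for fitting models to noisy data with outliers, robust and flexible. But it can be slow for large datasets or high outlier ratios, since the required number of iterations grows quickly with the outlier ratio.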

A3

(a) List the steps involved in SIFT feature detection.
Lecture 7
  1. Scale-space extrema detection using DoG at multiple scales.
  2. Keypoint localization, discarding weak (low-contrast or edge-like) keypoints.
  3. Orientation assignment to ensure rotation invariance.
  4. Keypoint descriptor creation by sampling the gradients of pixels in a local neighborhood around the keypoint.
  5. Matching keypoints from different images.
(b) Explain how SIFT achieves scale invariance.
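The Difference of Gaussians (DoG) is used to approximate the Laplacian. Scale selection is based on the fact that the magnitude of the Laplacian response achieves a maximum at the center of a blob, provided the scale of the Laplacian is “matched” to the scale of the blob. SIFT therefore searches for extrema over both position and scale, so each keypoint is assigned a characteristic scale, and its descriptor is computed in a neighborhood sized relative to that scale.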
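The "matched scale" property can be checked numerically. The sketch below uses the scale-normalized Laplacian of Gaussian directly rather than the DoG approximation, and a synthetic Gaussian-shaped blob with a hypothetical scale of 4 pixels; both choices are assumptions made for illustration.

```python
import numpy as np

def log_kernel(sigma, radius):
    """Laplacian of Gaussian sampled on a (2r+1) x (2r+1) grid."""
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx ** 2 + yy ** 2
    gauss = np.exp(-r2 / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)
    return (r2 - 2 * sigma ** 2) / sigma ** 4 * gauss

radius = 50
ax = np.arange(-radius, radius + 1)
xx, yy = np.meshgrid(ax, ax)
blob_scale = 4.0                                   # hypothetical blob size
blob = np.exp(-(xx ** 2 + yy ** 2) / (2 * blob_scale ** 2))

sigmas = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 8.0]
# Scale-normalized response (sigma^2 * Laplacian) at the blob center for each filter scale
responses = [abs(s ** 2 * np.sum(blob * log_kernel(s, radius))) for s in sigmas]
best_sigma = sigmas[int(np.argmax(responses))]     # 4.0, the scale of the blob
```

The response magnitude peaks at the filter scale that equals the blob scale. If the blob is resized, the peak moves with it, which is what lets the detector assign a consistent characteristic scale.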
(c) Based on the difference of Gaussian scale-space pyramids given below, please explain how SIFT improves the implementation efficiency of blob detection. Finally, how many scales can be given in the image below?
Lecture 7
  1. Approximate the Laplacian by the difference of Gaussians. A large-scale Gaussian filtering is achieved by applying two small-scale Gaussian filters one after another, so each layer can be built from the previous one.
  2. Applying a large-scale Gaussian filter to the original image is equivalent to applying a small-scale Gaussian filter to the downsampled image, which greatly reduces the amount of computation. Therefore, the Gaussian pyramid is divided into several octaves. The first Gaussian-filtered layer of each octave is obtained by downsampling the third-from-last layer of the previous octave. Blobs at four scales can be provided by the image: , where 
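The cascade property in the first point follows from the fact that convolving two Gaussians gives a Gaussian whose variance is the sum of the two variances. A short NumPy check, with sigmas 3 and 4 (combined sigma 5) chosen only for illustration:

```python
import numpy as np

def gaussian_1d(sigma, radius):
    """Sampled, normalized 1D Gaussian kernel."""
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x ** 2 / (2 * sigma ** 2))
    return g / g.sum()

# Filtering with sigma 3 and then sigma 4 acts like one filter with sigma sqrt(3^2 + 4^2) = 5
cascade = np.convolve(gaussian_1d(3.0, 40), gaussian_1d(4.0, 40))
direct = gaussian_1d(5.0, 80)
max_error = np.max(np.abs(cascade - direct))
```

The two kernels agree to within floating-point error, so each pyramid layer can be produced by applying a small incremental blur to the previous layer instead of a large blur to the original image.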
(d) How do SIFT features provide a greater degree of robustness when matching two images captured by cameras at different orientations and positions (i.e., as compared to matching raw pixel values)?
Two images captured by cameras at different orientations and positions differ by shift, rotation, scale and illumination variation. Raw pixel values change under all of these, whereas SIFT provides features that are covariant or even invariant under these conditions.
(e) Outline the steps involved in the Harris detector. Explain why Harris corner detector is not scale invariant, and how this affects the applicability of the algorithm.
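Lecture 6
  1. Compute Gaussian derivatives at each pixel.
  2. Compute the second moment matrix M in a Gaussian window around each pixel.
  3. Compute the corner response function R = det(M) - α trace(M)².
  4. Threshold R.
  5. Find local maxima of the response function (non-maximum suppression).
The Harris detector is not scale invariant because the derivative and window sizes are fixed. When the image is magnified, a corner that fit inside one window spreads over several windows, each of which sees only an edge, so the same corner may no longer be detected. This limits the algorithm when matching images taken at different scales, unless it is combined with a scale selection step.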
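The five steps can be sketched in NumPy as below. This is an illustration under assumptions, not a reference implementation: central differences of a Gaussian-blurred image stand in for Gaussian derivatives, alpha = 0.05 and sigma = 1 are typical but arbitrary values, and the function names and the white-square test image are hypothetical.

```python
import numpy as np

def gaussian_blur(image, sigma):
    """Separable Gaussian smoothing with edge padding."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x ** 2 / (2 * sigma ** 2))
    g /= g.sum()
    padded = np.pad(image, radius, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, g, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, g, mode="valid")

def harris_response(image, sigma=1.0, alpha=0.05):
    # 1. Derivatives of the smoothed image (approximate Gaussian derivatives)
    iy, ix = np.gradient(gaussian_blur(image, sigma))
    # 2. Entries of the second moment matrix M, accumulated in a Gaussian window
    sxx = gaussian_blur(ix * ix, sigma)
    syy = gaussian_blur(iy * iy, sigma)
    sxy = gaussian_blur(ix * iy, sigma)
    # 3. Corner response R = det(M) - alpha * trace(M)^2
    return sxx * syy - sxy ** 2 - alpha * (sxx + syy) ** 2

def harris_corners(image, threshold, **kwargs):
    r = harris_response(image, **kwargs)
    # 4. Threshold, then 5. keep only 3 x 3 local maxima (non-maximum suppression)
    padded = np.pad(r, 1, mode="constant", constant_values=-np.inf)
    windows = np.lib.stride_tricks.sliding_window_view(padded, (3, 3))
    is_max = r == windows.max(axis=(-2, -1))
    return np.argwhere((r > threshold) & is_max)

# Hypothetical test image: a white square on a black background
image = np.zeros((20, 20))
image[5:15, 5:15] = 1.0

r = harris_response(image)
corners = harris_corners(image, threshold=0.5 * r.max())   # rows of (y, x)
```

On this image R is positive near the four corners of the square, negative along its edges and close to zero in flat regions. Because the blur and window use one fixed sigma, the same code applied to a much larger version of the square would respond to a different neighborhood structure.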
 