Advanced Computer Vision

Course Description

This course introduces the fundamental techniques used in computer vision, which is the analysis of patterns in visual images to understand the objects and scenes that generated them. Topics covered include image formation and representation, camera geometry and calibration, computational imaging, multi-view geometry, stereo, 3D reconstruction from images, motion analysis, physics-based vision, image segmentation and object recognition. Homework assignments involve Python programming exercises.
This course is modeled on 16-720 but moves at a somewhat faster pace. We will also have a number of guest lectures on cutting-edge research in computer vision.

Educational Outcomes

Implement the Hough Transform to detect lines in an image
Detect Harris Corners and implement the RANSAC algorithm to find the homography between two images
Perform object recognition using a convolutional neural network
Perform 3D reconstruction and stereo rectification to implement stereo block matching using two images
Implement a gradient-descent-based image alignment algorithm to track objects in a video
Perform 3D segmentation
Learn to use Python and PyTorch through the programming assignments

Prerequisites

Linear Algebra, Multivariate Calculus, Probability Theory, Programming.
Python programming experience and previous exposure to image processing are desirable, but not required.
However, your ability to code in Python will be a crucial factor in your success.
If you are not familiar with Python, you will need to put in extra effort at the beginning to learn it quickly.

Recommended Books

Computer Vision: Algorithms and Applications, by Richard Szeliski (available online for free)
Multiple View Geometry in Computer Vision, by Richard Hartley and Andrew Zisserman
Computer Vision: A Modern Approach, by David Forsyth and Jean Ponce
Digital Image Processing, by Rafael Gonzalez and Richard Woods