16-820: Advanced Computer Vision

Fall 2024



Monday and Wednesday, 12:30PM-1:50PM, DH [Doherty Hall] 1212


Course Description

    This course introduces the fundamental techniques used in computer vision, which is the analysis of patterns in visual images to understand the objects and scenes that generated them. Topics covered include image formation and representation, camera geometry and calibration, computational imaging, multi-view geometry, stereo, 3D reconstruction from images, motion analysis, physics-based vision, image segmentation, and object recognition. Homework assignments involve Python programming exercises.
    This course is modeled on 16-720 but moves at a somewhat faster pace. We will also have a number of guest lectures on cutting-edge research in computer vision.

Educational Outcomes

  • Implement the Hough Transform to detect lines in an image
  • Detect Harris Corners and implement the RANSAC algorithm to find the homography between two images
  • Perform object recognition using a convolutional neural network
  • Perform 3D reconstruction and stereo rectification to implement stereo block matching using two images
  • Implement a gradient-descent-based image alignment algorithm to track objects in a video
  • Perform 3D segmentation
  • Use Python and PyTorch through the programming assignments

Prerequisites

  • Linear Algebra, Multivariate Calculus, Probability Theory, Programming.
  • Python programming experience and previous exposure to image processing are desirable, but not required.
    However, your ability to code in Python will be a crucial factor in your success.
    If you are not familiar with Python, you will need to put in extra effort at the beginning to learn it quickly.

Recommended Books

  • Computer Vision: Algorithms and Applications, by Richard Szeliski (available online for free)
  • Multiple View Geometry in Computer Vision, by Richard Hartley and Andrew Zisserman
  • Computer Vision: A Modern Approach, by David Forsyth and Jean Ponce
  • Digital Image Processing, by Rafael Gonzalez and Richard Woods

Grading

Grades are based on six homework assignments (each with considerable Python implementation):

  • HW 1-5 are worth 18% each
  • HW 6 (the last homework) is worth 10% (it is somewhat smaller).

Extra credit (worth up to 3% of your final grade):
  • Class participation (Piazza / lecture)
  • Organizing study groups
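
As an unofficial illustration of the weighting above, the short Python sketch below combines six homework scores into a final grade. The function name `final_grade`, the use of fractional scores in [0, 1], and the assumption that extra credit is simply added on top of the homework total (capped at 3%) are all hypothetical choices made for this example; the course staff determine how grades are actually computed.

```python
# Hypothetical helper for illustration only; not part of the course materials.

# HW 1-5 are worth 18% each and HW 6 is worth 10%, as stated in the syllabus.
HW_WEIGHTS = [0.18] * 5 + [0.10]

# Extra credit is worth up to 3% of the final grade.
MAX_EXTRA_CREDIT = 0.03


def final_grade(hw_scores, extra_credit=0.0):
    """Return the weighted final grade as a fraction.

    hw_scores: six homework scores, each a fraction in [0, 1].
    extra_credit: extra credit earned, as a fraction of the final grade.
    Assumption: extra credit is added to the homework total and capped at 3%.
    """
    if len(hw_scores) != len(HW_WEIGHTS):
        raise ValueError("expected six homework scores")
    base = sum(w * s for w, s in zip(HW_WEIGHTS, hw_scores))
    bonus = min(max(extra_credit, 0.0), MAX_EXTRA_CREDIT)
    return base + bonus


if __name__ == "__main__":
    # Example: 90% on HW 1-5, 80% on HW 6, full extra credit.
    print(f"{final_grade([0.9] * 5 + [0.8], extra_credit=0.03):.2%}")
```

Under these assumptions the homework weights sum to 100%, so extra credit can only raise the total beyond what the homeworks alone provide.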

Course Staff

Please use the course Piazza page for all communication with course staff

Course Instructor

Matthew O'Toole


Teaching Assistants

Nikhil Keetha
OH: (refer to the Google Sheet)
Ayush Jain
OH: (refer to the Google Sheet)
Yuyao Shi
OH: (refer to the Google Sheet)