EECS 545: Machine Learning
Winter 2009

When/Where: TTh 12:00 - 1:30 pm, CSE 1690
Professor Benjamin Kuipers ([email protected])
Office hours: TTh 2:00 - 3:00 pm, CSE 3741
GSI: Gyemin Lee ([email protected])
Office hours: MW 1:00 - 2:30 pm, EECS 2420
Prerequisites:
  EECS 492: Introduction to Artificial Intelligence
  Mathematical maturity (see Q+A below)
Final exam: Thursday, April 30, 1:30 - 3:30 pm (if we have one).

This page is updated frequently. Check back here and in CTools. Most recent update: 4-6-09.

Course Description

The catalog says, "Survey of recent research on learning in artificial intelligent systems. Topics include learning based on examples, instructions, analogy, discovery, experimentation, observation, problem solving and explanation. The cognitive aspects of learning will also be studied."

Actually, we will focus on statistical machine learning and reinforcement learning methods that have had a major impact on the machine learning field over the past decade or so. We will also attend to the problem of connecting these representations to the symbolic knowledge representation methods that have been at the core of Artificial Intelligence.

The course textbooks are:
  Christopher M. Bishop, Pattern Recognition and Machine Learning, Second edition, Springer, 2006.
  Richard Sutton and Andrew Barto, Reinforcement Learning: An Introduction, MIT Press, 1998.

The content of the course will be organized in two parallel tracks, Theory and Practice, that will run throughout the semester.

In theory, there's no difference between theory and practice. But in practice, there is.

In the Theory track, we will work our way through much of the Bishop text, and then through parts of Sutton and Barto. There will be homework problems, and an exam or two. The majority of lecture time will be spent on the Theory track.

In the Practice track, we will implement learning algorithms in MATLAB and apply them to large bodies of data. Machine Learning is applicable to a huge range of problems, but this semester we will focus on problems of learning from visual data. We will spend some lecture time on the Practice track, but much of this will be done by you outside of class.

The following two excellent papers will be assigned early in the course, so reading these would be a good way to get a head start.
  Paul Viola and Michael Jones. Robust real-time face detection. International Journal of Computer Vision 57(2): 137-154, 2004.
  Chris Stauffer and Eric Grimson. Learning patterns of activity using real-time tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence 22(8): 747-757, 2000.

Questions and Answers

Q: How can I tell if I have enough mathematical background?
A: In the Bishop textbook, read sections 1.1, 1.2, and 1.3 carefully. This should take a fair amount of care, time, and effort. You can't skim this kind of book. But if you can't follow it, even after spending a lot of time and effort on it, then you're probably not ready for this course.

Q: Will there be programming projects? In what language(s)?
A: Yes, in MATLAB.

Class Structure

Each class meeting will be important. Don't miss any of them. In each 80-minute class, the plan is to spend:
  30-40 minutes on my lecture on the topic of the day's reading assignment;
  20-30 minutes on student presentation of the solution to a problem;
  20-30 minutes on topics related to the programming assignments.

Lecture Topics and Readings (Theory)

Here is the plan for the lecture topics during the semester.

Date         Lecture topic                                         Read before class                                   Problems
1/8          Course introduction
             Curve fitting and probability theory
             Decision theory and information theory
             Distributions: binary, multinomial, conjugate priors  Bishop 2.1-2.2                                      2.6, 2.7*
             Distributions: Gaussians                              Bishop 2.3-2.5                                      2.36, 2.39*
             Linear models for regression (1)                      Bishop 3.1-3.2                                      3.4*
             Linear models for regression (2)                      Bishop 3.3-3.5                                      3.8*
             Linear models for classification (1)                  Bishop 4.1-4.2                                      4.5, 4.6*
             Linear models for classification (2)                  Bishop 4.3-4.5                                      4.12, 4.13, 4.14*
             Kernel methods: radial basis functions                Bishop 6.1-6.3
             Gaussian processes                                    Bishop 6.4
             Support vector machines                               Bishop 7.1
2/21 - 3/1   Winter break
3/3          Bayesian networks                                     Bishop 8.1-8.2
3/5          Markov random fields                                  Bishop 8.3
3/10         Inference in graphical models                         Bishop 8.4
3/12         Mixture models and EM (1)                             Bishop 9.1-9.2
3/17         Mixture models and EM (2)                             Bishop 9.3-9.4
3/19         Sampling methods                                      Bishop 11.1
3/24         Monte Carlo algorithms                                Bishop 11.2-11.6
3/27         Principal component analysis                          Bishop 12.1-12.2
3/31         Nonlinear PCA                                         Bishop 12.3-12.4
4/2          Hidden Markov models                                  Bishop 13.1-13.2
4/7          Linear dynamical systems                              Bishop 13.3
             Reinforcement learning (1)                            Sutton-Barto 1; 2-2.2; 3-3.3, 3.6, 3.7; 10
             Reinforcement learning (2)                            Sutton-Barto 3.8-end; 4; 5
             Reinforcement learning (3)                            Sutton-Barto 6, 7, except 6.6, 6.7, 7.7, 7.8, 7.10
             Reinforcement learning (4)                            Sutton-Barto 8; 7.8; 9-9.2, 9.8; 10
             Final exam (1:30-3:30 pm)

Other lecture dates before the winter break: 1/22, 1/27, 1/29, 2/3, 2/5, 2/10, 2/12.

Homework Problems, Solutions, and Student Presentations

Bishop emphasizes the importance of solving the problems at the end of the chapter, in order to understand the material in the readings.

We will organize the class into groups of 2-4 students each. For each class, there will be a small number (1-3) of homework problems (to be announced) from the reading assignment. Each person should do the problems, and write up the solutions (in pencil or blue or black ink). Another member of the same group will correct the solution (in red ink). Both sign the homework paper, review it, correct it as needed, and hand the marked-up assignment in. (You get credit for handing this in. We'll do spot checks for bogosity.)

On working together. I encourage students to work together, to increase each other's understanding of the material. But when you ask for help, or help someone else, remember that it is for the purpose of increasing both of your understanding, not to avoid understanding. Think like a teacher!

For each class, one of the groups will be responsible for presenting the solution to a specified problem to the class.

During the first class, we will select groups who are able to meet together to discuss the material between classes and to grade each other's homework. We will also identify which groups are presenting in the next few classes.

Programming Assignments (Practice)

There will be four programming assignments with a common theme: learning to identify meaningful structure in video sequences.

Assignment 1: Background modeling, fixed camera
Due: 1/22
When a fixed camera views a static scene, each pixel always has the same value. Typical real scenes are mostly static, with dynamic foreground objects occasionally obscuring the static background. In this assignment, you will treat the dynamic foreground as noise, and attempt to recover a useful model of the static background. In the simplest approach, you will model each pixel with its most common (modal) value. To handle slightly more complex scenes, for example with moving tree branches, you will learn a distribution in color space for each pixel.
Result: A clean image of the background, with foreground "noise" eliminated. Do this both with the video provided, and with one you collect with your own camera.

Assignment 2: Foreground object individuation
Due: 2/12
A pixel belongs to the foreground if it is not "explained" by the background model. Once you have a good background model, you can check each pixel in the video stream, to see whether it is explained by the background, or whether it is part of the foreground. Cluster and track the unexplained pixels, to define individual foreground objects. Note that these "objects" will not persist, but will merge and split according to how well the clustering process is doing.
Result: The static background model, plus dynamic foreground object descriptions, each with the history of its own time-varying position and extent in the image.

Assignment 3: Background modeling, hand-held camera
Due: 3/27
The pixel-based method above fails for a hand-held video camera, since the pose (position and orientation) of the camera changes continuously. To build a background model successfully, we must identify the camera pose trajectory as well as the model itself. We will simplify the problem by assuming that everything visible is distant (eliminating parallax) and that the camera operator is attempting to hold the camera fixed (small deviations from constant). Fortunately, we can exploit the small motions of the camera to create a "super-resolution" model of the background: higher resolution in the model than in any individual image. This exploits the redundancy in the multiple frames of the video, but depends on them not being aligned in every frame. Can you use this to read distant license plates (on parked cars)?
Result: A clean, super-resolution image of the background, taken from a hand-held video stream.

Assignment 4: Foreground object modeling
Due: 4/16
Foreground objects are, by definition, moving against the static background. By individuating and tracking them (Assignment 2), they are somewhat stabilized in the tracker frame of reference. Now apply the methods you developed for Assignment 3 to further stabilize the image and build super-resolution models of the foreground objects. Note that we are still treating objects as two-dimensional regions in the image, so this method will only work when three-dimensional objects present the same two-dimensional image as they move. The general problem, including creating a suitable three-dimensional model of the object and determining the object's pose, is much harder, and is not part of this assignment. The example video we provide will include lateral traffic as moving objects. Try more difficult examples, to see what happens. The case of a vehicle moving toward or away from the camera is an interesting intermediate case, where the two-dimensional image varies by scale.
Result: Clean, super-resolution images of each foreground object, extracted from a handheld video stream.
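
As a rough illustration of the simplest ideas in Assignments 1 and 2, here is a minimal sketch of a per-pixel modal background model and a foreground test. It is written in Python for brevity, although the assignments themselves are to be done in MATLAB. The sketch makes several assumptions: frames are small grayscale images stored as nested lists of integers, the function names `modal_background` and `foreground_mask` are hypothetical, and the fixed `threshold` of 10 is an arbitrary choice. It does not cover the color-space distribution or the clustering and tracking steps.

```python
from collections import Counter


def modal_background(frames):
    """Return the per-pixel modal value over a list of grayscale frames.

    Each frame is a list of rows, and each row is a list of integer
    intensities. All frames are assumed to have the same size.
    """
    height, width = len(frames[0]), len(frames[0][0])
    background = []
    for r in range(height):
        row = []
        for c in range(width):
            values = [frame[r][c] for frame in frames]
            # most_common(1) returns [(value, count)] for the most frequent value.
            row.append(Counter(values).most_common(1)[0][0])
        background.append(row)
    return background


def foreground_mask(frame, background, threshold=10):
    """Mark pixels not "explained" by the background model.

    A pixel counts as foreground when it differs from the background
    value by more than `threshold` (an assumed, hand-picked tolerance).
    """
    return [
        [abs(p - b) > threshold for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


# Toy example: a 2x3 static scene with one bright "object" passing
# through the top row, one pixel per frame.
frames = [
    [[10, 10, 10], [20, 20, 20]],
    [[200, 10, 10], [20, 20, 20]],
    [[10, 200, 10], [20, 20, 20]],
    [[10, 10, 200], [20, 20, 20]],
]
background = modal_background(frames)
mask = foreground_mask(frames[1], background)
print(background)
print(mask)
```

In the toy example the moving object covers each top-row pixel in only one frame out of four, so the mode recovers the static value everywhere. On real video, a pixel covered by foreground in most frames would defeat the mode, which is one reason to move to a learned per-pixel distribution.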

Handouts

Slides from lectures
  For slides and other resources, log in to CTools.

Relevant video files
  Creekview: hand-held camera, mostly static with moving branches and two cars.
  German Traffic: static camera, many cars driving and turning through an intersection.
  Indian Traffic: hand-held camera, many cars driving and turning through an intersection.

Grading

The grades will be determined as follows:

Course element                 Weight
Programming assignments (4)    40%
Homework                       15%
Class presentation             10%
Mid-term exam                  15%
Final exam                     15%
Class participation            5%

BJK