CSE 690: Internet Vision

Instructor: Tamara Berg  (tlberg -at- cs.sunysb.edu)
Office: 1411 Computer Science
Lectures: Mondays 10:00am-12:30pm, Rm 2311 CS
Office Hours: Mondays 12:30-1:30pm, Wednesdays 10:00-11:00am

Dec 10
Reminder: project presentations are in class on Monday!

Here is a basic outline of approximately what you should cover in your presentations, but feel free to add more as time allows. Presentations should be approximately 20 minutes and similar to what you would present if you were giving this talk at a conference.

Topics to cover:

1.) Description of the general idea and why it is important.
2.) Background and related research.
3.) Implementation details, including descriptions of any machine learning or other methods used, including those you did not implement yourself.
4.) Results.
5.) Analysis of results and project (pros, cons, etc.).
6.) Future work.

I am available today 10:00am-12:00pm and on Friday 1:00-3:00pm for last-minute project questions, and I may be available at other times on Friday by appointment. Please come see me if you are having trouble with your project or if you want to run your presentation by me for input. Your grades in the class so far are also available for pick-up in my office.

Please email me your final presentation, code, and images (if applicable) BEFORE class on Monday. If they are too large to email, please transfer them to me via a thumb drive.

Oct 13
Reminder! Project status updates will be next week, Monday Oct 20. Please prepare a 5-10 minute presentation about the progress you've made on your projects. Come by office hours on Wednesday if you need help.

Sept 23
Reminder! Project proposal presentations will be Monday, Sept 29. Please prepare a 5-10 minute presentation about your topic, challenges you will work on, and a rough timeline. If you're having trouble deciding on a topic, please come see me in office hours tomorrow, Sept 24.

Sept 15
Useful links have been added below as resources to get you started on your projects.
If you haven't picked a paper for your paper presentation, please email me ASAP.

Sept 9
Lectures will remain as scheduled tomorrow (Wednesday 9/10/08, 9:30-10:30, 2129 CS). Starting next week we will move to Mondays 10:00am-12:30pm in Rm 2311 CS.
Please look over the Reading List and choose one paper you would like to present. We will choose papers on Monday, Sept 15.

Sept 8
Lecture times for this course may change. Meetings will proceed as scheduled this week, but could change next week. Please check back for updates.


With the explosion of images and video on the web, dealing with the large amount of unorganized visual data available has become immensely challenging. This course will focus on exploring various types and sources of visual data and how to extract information for effective web search, browsing and other interaction. One of the guiding questions in the course will be: "What can we do with a billion images?" Through reading current papers, we will explore the new problems and approaches proposed by leading researchers in the fields of computer vision, information retrieval, and multi-media. Students will have a chance to define their own problems and work on solutions through a course project.

  • Visual and Multi-Media Data
  • Computational Photography
  • Image Retrieval
  • Photo Quality Estimation
  • Combining Words & Pictures
  • Places
  • Objects & People
  • The role of Social Networks & Human Interaction

Tentative Schedule

Date | Topic & Presenter | Readings
Sept 2 | Organizational Meeting (Slides) |
Sept 3 | Introduction & Data (Tamara - Slides) | Tiny Images
Sept 8 | Data & Computational Photography (Tamara - Slides) | Flickr: Who is Looking?, and Scene Completion Using Millions of Photographs
Sept 10 | Computational Photography (Tamara - Slides) | Photo Clip Art
Sept 15 | Computer Vision Review (Tamara - Slides) | Choose one paper from the Reading List that you would like to present later in the course
Sept 22 | Guest Lecture! Rob Fergus, Asst Professor NYU | Small codes and Large Image Databases for Recognition
Sept 29 | Project Proposals due (all) & Computational Photography (Tamara - Slides) | Please prepare a 5-10 minute presentation on your proposed project. Interactive Digital Photomontage, and Creating and Exploring a Large Photorealistic Virtual Space
Oct 6 | Guest Lecture! Alex Berg, Research Scientist Columbia | Classification using Intersection Kernel Support Vector Machines is Efficient, and Parsing Images of Architectural Scenes
Oct 13 | Computational Photography (Juntian - Slides & Kshitij - Slides) | Auto-Collage, and Seam Carving for Content-Aware Image Resizing
Oct 20 | Project Status Updates (all) & Photo Quality (Tamara - Slides) | Photo Quality Assessment, and Studying Aesthetics in Photographic Images Using a Computational Approach
Oct 27 | Image Retrieval (Anupama & Shrijeet - Slides) | Image Retrieval: Ideas, Influences, and Trends of the New Age, and PageRank for Product Image Search
Nov 3 | Image Retrieval (Janit & Ram) | Learning Object Categories from Google's Image Search, and SIMPLIcity: Semantics-sensitive Integrated Matching for Picture LIbraries
Nov 10 | Useful Matlab tricks (Tamara), Image Retrieval & Human Labeling (Praveena & Reema) | Towards scalable dataset construction: An active learning approach, and Peekaboom
Nov 17 | Project Status Updates (all) |
Nov 24 | Words & Pictures, Social Networks (Rohit & Villa) | Clustering Art, and Autotagging Facebook
Dec 1 | Places (Tamara - Slides) | Photo Tourism, Automatic Photo Pop-Up, and Im2GPS
Dec 8 | Words & Pictures (Tamara) & Project Questions | Names & Faces, and Animals on the Web
Dec 15 | Project Presentations (all) |

Reading List
The current reading list is available here. We will choose some subset of these to read based on students' interests.

There will be a final project due at the end of term, with various checkpoints during the semester. In addition, students will be expected to prepare and lead a class discussion summarizing and critiquing a few recent relevant research papers. Lively participation and discussions are encouraged. Since this class focuses on a new and exciting research area, there are no required prerequisites, although some previous experience with Computer Vision or Machine Learning would be helpful. A brief summary of related algorithms and techniques will be presented at the beginning of the semester.

Useful links

Matlab tutorial by Hany Farid and Eero Simoncelli - Link
A more comprehensive Matlab tutorial by David Griffiths - Link

Label Me - Link
Tiny Images - Link
Code for downloading Flickr images - Link

Computing Features
SIFT features - Link
Scale Invariant Interest Points - Link
Affine Covariant Regions - Link
Shape Contexts - Link
Gist - Link

Other Useful Software
Various Code from INRIA - Link
Various Code from Oxford - Link
Various useful machine learning tools - Link

Reference Books
Forsyth, David A., and Ponce, J. Computer Vision: A Modern Approach, Prentice Hall, 2003.
Hartley, R. and Zisserman, A. Multiple View Geometry in Computer Vision, Cambridge University Press, 2002.

Fun Examples of Internet Vision in Action
  • Photo Synth - a streaming, multi-resolution, Web-based service developed from the Photo Tourism research project, which allows users to browse large collections of location-based photographs in 3D.
  • Like.com - visual product search for shopping.
  • Tin Eye - extremely fast image duplicate detection and similarity search.
  • Retrievr - sketch-based interface for browsing Flickr photographs.
  • Photo Pop-Up - automatically create pop-up 3D models from a single photograph.
  • Photo Clipart - a system for inserting new objects into existing photographs in a context-sensitive manner.
  • Video Google - fast exemplar object search in feature length movies.
  • Tiny Images - a collection of over 80 million thumbnail-sized photographs collected from the web and free for research use.