(Samples from the George W. Bush cluster)

Faces in the Wild

Our dataset, Faces in the Wild, consists of 30,281 faces collected from news photographs. The faces were labeled automatically using the system described in Who's in the Picture; the labels are approximately 80% accurate. The file faceData.tar.gz contains a MATLAB file, FacesInTheWild.mat, and the face images, stored as year/month/day/imgname.ppm. FacesInTheWild.mat contains two variables: metaData (metaData{i} gives the file name of face i and its label id) and lexicon (lexicon{i} gives the actual name of label i).
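
The relationship between metaData and lexicon can be sketched in a few lines of Python. This is only an illustration under assumptions: it supposes the two variables have already been converted to plain Python lists, that each metaData entry is a (file name, label id) pair, and that label ids are 1-based as in MATLAB indexing. The function name and the sample values are hypothetical and are not taken from the dataset.

```python
def label_name(meta_entry, lexicon):
    """Return (file_name, person_name) for one metaData-style entry.

    Assumes meta_entry is a (file_name, label_id) pair and that label_id
    is 1-based, mirroring MATLAB's lexicon{label_id}.
    """
    file_name, label_id = meta_entry
    # Python lists are 0-indexed, so shift the MATLAB-style id down by one.
    return file_name, lexicon[label_id - 1]


# Hypothetical stand-in values, not real dataset entries.
lexicon = ["Person A", "Person B"]
meta_data = [("2002/01/01/example_face.ppm", 2)]

file_name, name = label_name(meta_data[0], lexicon)
print(file_name, "->", name)
```

Because the labels are only approximately 80% accurate, a name returned this way should be treated as a noisy label rather than ground truth.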


Faces in the Wild Dataset

Original News Photographs

Original Captions

To unpack any of these files into your current directory, use the command:
tar zxvf filename.tar.gz
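
Where the tar command is not available, the same unpacking step can be approximated with Python's standard tarfile module. This is a minimal sketch, not part of the dataset distribution; the helper name unpack is an assumption made for illustration.

```python
import tarfile


def unpack(archive_path, dest_dir="."):
    """Extract a gzip-compressed tar archive into dest_dir.

    Returns the list of member names, roughly what the `v` flag of
    `tar zxvf` would print.
    """
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(dest_dir)
        return archive.getnames()


# Example call (assumes the archive is in the current directory):
# unpack("faceData.tar.gz")
```

As with the tar command, files are written relative to the destination directory, which defaults to the current one.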


This dataset is for academic research purposes only. If you use our dataset, please cite:

  • Who's in the Picture [pdf] [ps]
    Tamara L. Berg, Alexander C. Berg, Jaety Edwards, David A. Forsyth
    Neural Information Processing Systems (NIPS), 2004