Attention: you must agree to the MSR-LA before downloading.

The development data set can be downloaded here.

Purpose: This data set has 6 files. Each file contains 500 face images and labels, which can be used to develop and verify your recognition algorithms. The files may also be used during the dry-run stage.

  • DevSet1: “Hard Set” can be used to test the robustness of an algorithm to variations in illumination, pose, resolution, etc.
  • DevSet2: “Random Set” can be used to test the coverage of an algorithm across the 1M entities. Both sets include cropped and aligned face images.
  • Two file formats (format1 is extracted from format2):
    • File format1: two zip files, one for DevSet1 and one for DevSet2. Each zip file contains two folders of 500 JPEG face images, cropped or aligned. Each .jpg file is named by its Freebase MID.
    • File format2: four text files. Each line is an image record containing 6 columns, delimited by TAB.
      • Column1: Freebase MID
      • Column2: EntityNameString
      • Column3: ImageURL
      • Column4: FaceID
      • Column5: FaceRectangle_Base64Encoded (four floats, relative coordinates of UpperLeft and BottomRight corner)
      • Column6: FaceData_Base64Encoded
  • Some Statistics:
    • # of lines: 500 in each file
    • # of unique MIDs: 500
    • Total file size: tsv files: 44 MB; zip files: 32 MB
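
As an illustration of file format2, the following Python sketch splits one record into its 6 columns and decodes the two Base64 fields. It is not an official reader. The names FaceRecord and parse_record are hypothetical, and the byte layout of the face rectangle (four little-endian 32-bit floats) is an assumption, since the description above only states "four floats".

```python
import base64
import struct
from typing import NamedTuple, Tuple


class FaceRecord(NamedTuple):
    """One line of a format2 text file (hypothetical container type)."""
    mid: str                 # Column1: Freebase MID
    entity_name: str         # Column2: EntityNameString
    image_url: str           # Column3: ImageURL
    face_id: str             # Column4: FaceID
    face_rect: Tuple[float, float, float, float]  # Column5, decoded
    face_data: bytes         # Column6, decoded raw bytes


def parse_record(line: str) -> FaceRecord:
    """Parse one TAB-delimited image record."""
    cols = line.rstrip("\r\n").split("\t")
    if len(cols) != 6:
        raise ValueError(f"expected 6 columns, got {len(cols)}")
    mid, name, url, face_id, rect_b64, data_b64 = cols

    rect_bytes = base64.b64decode(rect_b64)
    # ASSUMPTION: four little-endian float32 values (16 bytes in total),
    # holding the relative upper-left and bottom-right corners.
    if len(rect_bytes) != 16:
        raise ValueError(f"expected 16 rectangle bytes, got {len(rect_bytes)}")
    rect = struct.unpack("<4f", rect_bytes)

    # The face data is returned as raw bytes; its image encoding is not
    # specified above, so no further decoding is attempted here.
    return FaceRecord(mid, name, url, face_id, rect, base64.b64decode(data_b64))
```

A whole file can then be read by calling parse_record on each line of a text file opened with encoding="utf-8" (the encoding is also an assumption). If the decoded rectangle values do not fall in the range 0 to 1, the assumed float layout is probably wrong for that file.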

Disclaimers

  1. The data is released for non-commercial research purposes only. You must read and agree to the MSR Data License Agreement before downloading the data;
  2. Please contact us if you are a celebrity but do not want to be included in this data set. We will remove related entries upon request;
  3. In all the related publications, please cite the paper "MS-Celeb-1M: A Dataset and Benchmark for Large Scale Face Recognition" and provide the link to http://msceleb.org.
    @INPROCEEDINGS { guo2016msceleb,
        author = {Guo, Yandong and Zhang, Lei and Hu, Yuxiao and He, Xiaodong and Gao, Jianfeng},
        title = {M{S}-{C}eleb-1{M}: A Dataset and Benchmark for Large Scale Face Recognition},
        booktitle = {European Conference on Computer Vision},
        year = {2016},
        organization={Springer}}