Attention: you must agree to the MSR-LA before downloading.

Purpose: This data set supports the low-shot learning challenge here. We provide the aligned version (the face region is cropped and aligned) here. If the non-aligned version is needed, please use ImageURL to fetch from the cropped version here.

  • Download link
    • Base Set: 22GB: Download
    • Novel Set: 26MB: One image per celebrity. We have removed the sets that contained two or five images per celebrity, since they are no longer needed.
    • Backup download link for both of the above data sets: OneDrive.
  • File format: text files. Each line is an image record containing 5 columns, delimited by TAB.
    • Column1: Image ID
    • Column2: FaceData_Base64Encoded
    • Column3: Freebase MID
    • Column4: ImageSearchRank
    • Column5: ImageURL
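
As an illustration of the file format above, here is a minimal Python sketch that reads such a file line by line. It assumes the columns appear in exactly the order listed and that Column2 is standard Base64. The names FaceRecord and parse_records are hypothetical and not part of any official tooling. The image format of the decoded bytes is not stated here, so the sketch returns them raw.

```python
import base64
from typing import Iterable, Iterator, NamedTuple


class FaceRecord(NamedTuple):
    """One image record; field order follows the column list above (assumed)."""
    image_id: str
    face_data: bytes        # Column2 after Base64 decoding, left as raw bytes
    freebase_mid: str
    image_search_rank: str  # kept as text; its exact format is not specified
    image_url: str


def parse_records(lines: Iterable[str]) -> Iterator[FaceRecord]:
    """Yield one FaceRecord per non-empty, TAB-delimited line."""
    for line_no, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        cols = line.split("\t")
        if len(cols) != 5:
            raise ValueError(
                f"line {line_no}: expected 5 columns, got {len(cols)}"
            )
        image_id, face_b64, mid, rank, url = cols
        yield FaceRecord(image_id, base64.b64decode(face_b64), mid, rank, url)
```

Because the Base Set is about 22GB, it is probably best to stream the file rather than load it whole, for example `with open(path, encoding="utf-8") as f: for rec in parse_records(f): ...`, where `path` is wherever you saved the download.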


  1. The data is released for non-commercial research purposes only. You must read and agree to the MSR Data License Agreement before downloading the data;
  2. Please contact us if you are a celebrity but do not want to be included in this data set. We will remove the related entries upon request;
  3. In all the related publications, please cite the paper "MS-Celeb-1M: A Dataset and Benchmark for Large Scale Face Recognition" and provide the link to
    @INPROCEEDINGS { guo2016msceleb,
        author = {Guo, Yandong and Zhang, Lei and Hu, Yuxiao and He, Xiaodong and Gao, Jianfeng},
        title = {M{S}-{C}eleb-1{M}: A Dataset and Benchmark for Large Scale Face Recognition},
        booktitle = {European Conference on Computer Vision},
        year = {2016},