Open Dataset for ML Training: Diverse Human Faces

The Diverse Human Faces Dataset contains images of a diverse set of people, framed on the head and shoulders, across multiple environments and camera angles. This makes it particularly well suited to machine learning tasks such as facial segmentation, matting/background removal, and landmark estimation, among others.

*Free, open non-commercial license
ABOUT THE SET
Specifications
  • 10k images
  • 100 diverse identities spanning gender, skin tone, and age
  • 1024 x 1024 RGB images
  • Rich set of pixel-perfect labels including segmentation maps, depth, surface normals, and 3D landmarks
  • Variation of expressions from neutral to extreme
  • Variation in eye gaze spanning 10° in all directions
  • Variation of head turn spanning 15° in all directions
  • Diverse hairstyles, colors, and facial hair
  • Accessories: hats, glasses, headphones, and face masks
  • Variation in outfits
  • Variation in background over a wide range of lighting and confounding elements
  • Many different camera angles around the person
KEY MACHINE LEARNING TASKS
This data can be used to train...
Facial Segmentation
  • Variation in identities, facial expressions, and confounding elements challenges models that segment facial regions.
  • Granular sub-segmentation maps include 35 face regions such as “ear_left”, “eyelashes”, “eyelid”, and “hair”.
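As a rough illustration of how granular region labels might be used, the sketch below merges a few named regions into a single binary mask. It assumes the segmentation map is available as an integer array with one label ID per pixel. The ID values in `REGION_IDS` and the helper `regions_to_mask` are hypothetical; the dataset's actual name-to-ID mapping and file format are not described here.

```python
import numpy as np

# Hypothetical label IDs for illustration only. The dataset's real mapping
# from region names (e.g. "ear_left", "hair") to integer IDs is an assumption.
REGION_IDS = {"background": 0, "hair": 1, "ear_left": 2, "eyelid": 3, "eyelashes": 4}


def regions_to_mask(label_map, region_names, region_ids=REGION_IDS):
    """Return a boolean mask that is True where a pixel belongs to any named region."""
    ids = [region_ids[name] for name in region_names]
    return np.isin(label_map, ids)


# Tiny synthetic 2 x 3 label map standing in for a 1024 x 1024 segmentation map.
label_map = np.array([[0, 1, 1],
                      [2, 3, 0]])

# Merge the eye-related regions into one mask.
eye_mask = regions_to_mask(label_map, ["eyelid", "eyelashes"])
```

Merging regions this way lets one set of fine-grained labels serve coarser tasks, for example training a model on a combined "eye area" class.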
Matting/Background Removal
  • Variation in background, lighting, and hairstyles mimics hard-to-segment real-world conditions.
  • Pixel-accurate alpha masks as ground truth.
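To show how an alpha mask is typically used for background replacement, here is a minimal sketch of standard alpha compositing with NumPy. It assumes the alpha mask is a float array in [0, 1] with the same height and width as the image. The function name `composite` and the synthetic arrays are illustrative and not part of the dataset.

```python
import numpy as np


def composite(foreground, background, alpha):
    """Blend foreground over background: out = alpha * fg + (1 - alpha) * bg."""
    fg = foreground.astype(np.float32)
    bg = background.astype(np.float32)
    a = alpha.astype(np.float32)[..., None]  # add a channel axis for broadcasting
    out = a * fg + (1.0 - a) * bg
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)


# Tiny synthetic 2 x 2 RGB example standing in for a 1024 x 1024 image.
fg = np.full((2, 2, 3), 200, dtype=np.uint8)   # uniform grey "person"
bg = np.zeros((2, 2, 3), dtype=np.uint8)       # black replacement background
alpha = np.array([[1.0, 0.5],
                  [0.0, 0.25]], dtype=np.float32)

result = composite(fg, bg, alpha)
```

Fractional alpha values matter most around hair and soft edges, where a hard binary mask would leave visible halos.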
Landmark Estimation
  • Wide variation in camera perspectives and facial expressions shows the face from many viewpoints, which helps improve landmark estimation.
  • 2D and 3D anatomical and contour landmarks available.
To download the dataset, fill out the form.
Synthesis AI provides programmatic generation of vast amounts of photorealistic image data with an accompanying set of never-before-available rich 3D labels. Learn more about our platform.

Learn more about Synthetic Data in our guide
