With the surge in remote work, we are dependent on high quality video conferencing solutions. However, low-bandwidth connections, poor image quality and lighting, and lack of engagement tools significantly degrade the experience. Leading companies are leveraging our synthetic facial data to train new machine learning models to improve the video quality and teleconferencing experience.
Portrait Segmentation & Matting
Background blurring or replacement is currently the most used feature in video conferencing. As we have all experienced, the segmentation performance of today's systems is not robust across lighting conditions and environments. To develop better models, a broader set of scenarios is needed, and for this reason we provide a variety of datasets to address the issues our faces face.
See things through their eyes by implementing gaze detection. Understanding gaze lets your application gauge attentiveness levels as well as deduce spatial relationships between important objects in meetings. Gaze correction to adjust for camera offsets also helps meetings feel more natural and connected.
Not being able to see because of atrocious lighting can kill the productivity of any meeting. Lag can stop it in its tracks. With upscaling and re-lighting techniques, attendees will never even know there was a bandwidth hiccup or challenging lighting to begin with.
The average amount spent on a single image for full segmentation is $6.40*, and any additional labels cost more on top of that. Our synthetic data provides full segmentation, landmarks, surface normals, and more, for as little as $0.03 per image.
Of course, that’s only the labeling cost. Procuring the images to label is incredibly time-consuming as well. It can take weeks or months to legally collect diverse images of individuals’ faces for most companies. Our datasets are available immediately, and our programmatic API returns generated images and labels in minutes to hours.
*Based on scale.ai pricing, January 2021.
Not only do our datasets provide training data affordably and nearly instantaneously, they also offer far more than human-collected and human-labeled data, so you can build more advanced, more ethical computer vision models.
100% accurate ground truth, every time. Eliminate the QA step on every label.
Get peace of mind: with non-real humans, privacy concerns are history.
Even sampling across skin tones, ethnicities, and ages, for more ethical machine vision.
New Label Types
Use cutting edge models with depth, normals, dense 3D landmarks, & subsegmentation.
Combine identities, hair styles, facial hair, makeup, hats, glasses, face masks, lighting conditions, and camera angles for trillions of possibilities – all at the speed of writing JSON with our API.
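For illustration, a generation request of this sort might be assembled as a plain JSON document. Every field name and value below is hypothetical, sketched to convey the idea rather than to document our actual API schema:

```python
import json

# Hypothetical request body -- all field names are illustrative,
# not the real API schema.
request = {
    "num_images": 5000,
    "identities": {"count": 200, "sampling": "uniform"},
    "accessories": ["glasses", "face_mask", "hat"],
    "hair": {"styles": "all", "facial_hair": True},
    "lighting": ["indoor_dim", "backlit", "daylight"],
    "camera": {"angle_jitter_deg": 15},
    "labels": ["segmentation", "dense_landmarks", "surface_normals", "depth"],
}

payload = json.dumps(request, indent=2)  # ready to send to a generation endpoint
```

Because each axis (identity, accessory, lighting, camera) multiplies the others, even a short request like this spans an enormous combinatorial space.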
Get Going Fast
Check out our snippets to jump-start the training process.
As with all synthetic data, there's a domain shift from our rendered images to the ones captured by real cameras. Although there's no universal domain adaptation approach for every use case, we stand on the shoulders of giants to get great results.
Adaptive Batch Normalization
Adaptive Batch Normalization is a simple technique that can be applied to any network with batch normalization layers and combined with all the other techniques here, with surprisingly good results.
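The core idea can be sketched in a few lines: keep each batch-norm layer's learned scale and shift, but swap its source-domain running statistics for statistics re-estimated on target-domain data. This NumPy sketch assumes per-channel activations have already been extracted at a BN layer; the function names are illustrative, not from any particular framework:

```python
import numpy as np

def estimate_target_stats(activations):
    """Re-estimate per-channel BN statistics from target-domain activations.

    activations: array of shape (N, C) -- N target samples, C channels.
    """
    return activations.mean(axis=0), activations.var(axis=0)

def batchnorm_inference(x, mean, var, gamma, beta, eps=1e-5):
    """Standard batch-norm inference using the (adapted) statistics."""
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

# Toy target-domain activations with a mean/scale unlike the source domain
rng = np.random.default_rng(0)
target_acts = rng.normal(loc=2.0, scale=3.0, size=(1024, 8))

# AdaBN: normalize target data with its own statistics, keeping gamma/beta
mean_t, var_t = estimate_target_stats(target_acts)
normalized = batchnorm_inference(target_acts, mean_t, var_t,
                                 gamma=np.ones(8), beta=np.zeros(8))
```

After adaptation, the target activations entering each subsequent layer are approximately zero-mean and unit-variance, just as the source activations were during training; no gradient updates are needed.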
Image-to-image translation methods coupled with a self-regularization loss allow dataset-level refinement. While these methods require an additional pipeline to train, that pipeline is completely independent and requires no modifications to the main training pipeline.
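A minimal sketch of the self-regularization term, in the spirit of SimGAN-style refinement (the function name and weighting are illustrative): an L1 penalty ties each refined image to its synthetic source, so the ground-truth labels remain valid, while a separate adversarial term (omitted here) pushes the refined images toward realism:

```python
import numpy as np

def self_regularization_loss(refined, synthetic, lam=0.1):
    """L1 penalty keeping refined images close to their synthetic sources.

    Preserves the annotations (segmentation, landmarks, ...) attached to the
    synthetic image; the full refiner objective adds an adversarial term.
    """
    return lam * float(np.mean(np.abs(refined - synthetic)))

# Toy example: the "refiner" shifts every pixel by 0.5
synthetic = np.zeros((4, 32, 32, 3))
refined = synthetic + 0.5
loss = self_regularization_loss(refined, synthetic, lam=0.1)  # 0.1 * 0.5 = 0.05
```

Identical inputs give zero loss, and the penalty grows linearly with the per-pixel change, which is what keeps the refiner from drifting away from the labeled geometry.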