Applications
- Biometrics & security
  
  ID verification
  
  Facial identification and verification for consumer and security applications.
  
  Security
  
  Activity recognition and threat detection across camera views.
  
  Consumer devices & applications
  
  AR/VR/XR
  
  Spatial computing, gesture recognition, and gaze estimation for headsets.
  
  Virtual try-on
  
  Millions of identities and clothing options to train best-in-class models.
  
  Biometrics & security
  
  Driver monitoring
  
  Simulate driver and occupant behavior captured with multi-modal cameras.
  
  Pedestrian detection
  
  Simulate edge cases and rare events to ensure the robust performance of autonomous vehicles.
Resources
- AI Safety IV: Sparks of Misalignment
  
  This is the last, fourth post in our series...
  
  Read More
  
  Visit the Blog
  
  Gen AI
  
  Gen AI book
  
  HumanAPI
  
  Data visualizer
  
  API documentation
  
  Synthetic data
  
  Synthetic data book
  
  Synthetic data guide
  
  Synthetic data whitepaper
  
  Industry survey
  
  OpenSynthetics
Company
- About us
  
  Overview
  
  Ethics
  
  Press
  
  Newsroom
  
  Press Kit
  
  Join Our Team
  
  Together, we’re building the future of computer vision & machine learning
  
  Explore Careers

Category: Research Topics

June 19, 2025

AI Safety IV: Sparks of Misalignment

This is the fourth and final post in our series on AI safety (first, second, third). We have discussed the main definitions of the field, introduced goodharting as a key component of possible misalignment, and talked about interpretability as a possible constructive way forward. Today, we will consider a few examples of AI system behaviours that are, to say the least, highly disturbing. We will see how LLMs can turn into sleeper agents, fake alignment to avoid retraining, and earn millions of dollars on cryptocurrency markets. Let the sparks of misalignment fly!

June 3, 2025

AI Safety III: Interpretability

In the third post on AI safety (first, second), we turn to interpretability, which has emerged as one of the most promising directions in AI safety research, offering some real hope for understanding the “giant inscrutable matrices” of modern AI models. We will discuss the recent progress from early feature visualization to cutting-edge sparse autoencoders that can isolate individual concepts like “unsafe code”, “sycophancy”, or “the Golden Gate bridge” within frontier models. We also move from interpreting individual neurons to mapping entire computational circuits and even show how LLMs can spontaneously develop RL algorithms. In my opinion, recent breakthroughs in interpretability represent genuine advances towards the existentially important goal of building safe AI systems.

May 8, 2025

AI Safety II: Goodharting and Reward Hacking

In this post, the second in the series (after “Concepts and Definitions”), we embark on a comprehensive exploration of Goodhart’s law: how optimization processes can undermine their intended goals by optimizing proxy metrics. Goodharting lies at the heart of what is so terrifying about making AGI, so this is a key topic for AI safety. We start with the classic taxonomy of regressional, extremal, causal, and adversarial goodharting. We then trace these patterns from simple mathematical models and toy RL environments to the behaviours of state-of-the-art reasoning LLMs, showing how goodharting manifests in modern machine learning through shortcut learning, reward hacking, goal misgeneralization, and even reward tampering, with striking examples from current RL agents and LLMs.

April 17, 2025

AI Safety I: Concepts and Definitions

In October 2023, I wrote a long post on the dangers of AGI and why we as humanity might not be ready for the upcoming AGI revolution. A year and a half is an eternity in current AI timelines—so what is the current state of the field? Are we still worried about AGI? Instead of talking about how perception of the risks has shifted over the last year (it has not, not that much, and most recent scenarios such as AI 2027 still warn about loss of control and existential risks), today we begin to review the positive side of this question: the emerging research fields of AI safety and AI alignment. This is still a very young field, and a field much smaller than it should be. Most research questions are wide open or not even well-defined yet, so if you are an AI researcher, please take this series as an invitation to dive in!

March 21, 2025

The Creativity Scale: Can AI Do Science?

Today, I want to discuss two recently developed AI systems that can help with one of the holy grails of AI: doing research automatically. Google’s AI Co-Scientist appears to be a tireless research partner that can read thousands of papers overnight and brainstorm ideas with you… actually, it can brainstorm ideas internally and give you only the best of the best. Sakana AI’s AI Scientist-v2 doesn’t need you at all, it just writes new papers from scratch, and its papers are getting accepted to some very good venues. To contextualize these novelties, I also want to discuss where current AI models are, creatively speaking—and what this question means, exactly.

February 25, 2025

Large Reasoning Models: How o1 Replications Turned into Real Competition

Some of the most important AI advances in 2024 were definitely test-time reasoning LLMs, or large reasoning models (LRMs), that is, LLMs that are trained to write down and reuse their chains of thought for future reference. Reasoning LLMs started with the o1 family of models by OpenAI (I wrote a short capabilities post in September, when it appeared). Since then, they have opened up a new scaling paradigm for test-time compute and significantly advanced areas such as mathematical reasoning and programming. OpenAI is already boasting its new o3 family, but we still don’t have a definitive source on how OpenAI’s models work. In this post, we discuss how attempts to replicate o1 have progressed to date, including the current state-of-the-art open model, DeepSeek R1, which seems to be a worthy rival even for OpenAI’s offerings.

January 28, 2025

Attack of the Titans: Transformers 2.0?

We interrupt your regularly scheduled programming to discuss a paper released on New Year’s Eve: on December 31, 2024, Google researchers Ali Behrouz et al. published a paper called “Titans: Learning to Memorize at Test Time”. It is already receiving a lot of attention, with some reviewers calling it the next big thing after Transformers. Since we have already discussed many different approaches to extending the context size in LLMs, in this post we can gain a deeper understanding of Titans by putting it in a wider context. Also, there are surprisingly many neurobiological analogies here…

January 17, 2025

LLMs for Programming: Is Coding Dying?

It is time to discuss some applications. Today, I begin with using LLMs for programming. There is at least one important aspect of programming that makes it easier than writing texts: source code is formal, and you can design tests that cover at least most of the requirements in a direct and binary pass/fail way. So today, we begin with evaluation datasets and metrics and then proceed to fine-tuning approaches for programming: RL-based, instruction tuning, and others. Next, we will discuss LLM-based agents for code and a couple of practical examples—open LLMs for coding—and then I will conclude with a discussion of where we are right now and what the future may hold.

November 20, 2024

Linear Attention and Mamba: New Power to Old Ideas

We have already discussed how to extend the context size for modern Transformer architectures, but today we explore a different direction of this research. In the quest to handle longer sequences and larger datasets, Transformers are turning back to the classics: the memory mechanisms of RNNs, associative memory, and even continuous dynamical systems. From linear attention to Mamba, modern models are blending old and new ideas to bring forth a new paradigm of sequence modeling, and this paradigm is exactly what we discuss today.

November 7, 2024

Kolmogorov-Arnold Networks: KAN You Make It Work?

Although deep learning is a very new branch of computer science, foundations of neural networks have been in place since the 1950s: we have been training directed graphs composed of artificial neurons (perceptrons), and each individual neuron has always looked like a linear combination of inputs followed by a nonlinear function like ReLU. In April 2024, a new paradigm emerged: Kolmogorov-Arnold networks (KAN) work on a different theoretical basis and promise not only a better fit for the data but also much improved interpretability and an ability to cross over to symbolic discoveries. In this post, we discuss this paradigm, what the main differences are, and where KAN can get us right now.
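To make the classical picture concrete, here is a minimal sketch of a single artificial neuron as described above: a linear combination of inputs followed by ReLU. The function names, weights, and inputs are hypothetical and chosen only for illustration; this shows the standard perceptron-style unit, not a Kolmogorov-Arnold network.

```python
from typing import Sequence


def relu(x: float) -> float:
    """Rectified linear unit: pass positive values through, clamp negatives to zero."""
    return max(0.0, x)


def neuron(inputs: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """A classical neuron: weighted sum of the inputs plus a bias, then a nonlinearity."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return relu(z)


# Illustrative values only (assumptions, not taken from the post).
active = neuron([1.0, 2.0], [0.5, -0.25], 0.5)  # pre-activation is positive
silent = neuron([1.0, 2.0], [0.5, -1.0], 0.0)   # pre-activation is negative, so ReLU outputs zero
print(active, silent)
```

In this unit the learnable parameters are the weights and the bias, while the nonlinearity is fixed in advance.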

Embrace Synthetic Data.