Synthesis Blog

Together, we’re building the future of computer vision and machine learning.
Featured Post

AI Safety IV: Sparks of Misalignment

This is the fourth and final post in our series on AI safety (first, second, third). We have discussed the main definitions of the field, introduced goodharting as a key component of possible misalignment, and talked about interpretability as a possible constructive way forward. Today, we will consider a few examples of AI system behaviours that are, to say the least, highly disturbing. We will see how LLMs can turn into sleeper agents, fake alignment to avoid retraining, and earn millions of dollars on cryptocurrency markets. Let the sparks of misalignment fly!

All Posts
June 19, 2025

This is the fourth and final post in our series on AI safety (first, second, third). We have discussed the main…

June 3, 2025

In the third post on AI safety (first, second), we turn to interpretability, which has emerged as one of the…

May 8, 2025

In this post, the second in the series (after “Concepts and Definitions”), we embark on a comprehensive exploration of Goodhart's…

April 17, 2025

In October 2023, I wrote a long post on the dangers of AGI and why we as humanity might not…

March 21, 2025

Today, I want to discuss two recently developed AI systems that can help with one of the holy grails of…

February 25, 2025

Some of the most important AI advances in 2024 were definitely test-time reasoning LLMs, or large reasoning models (LRMs), that…
