model_detected

Every quick byte tagged with this topic, from launches and funding to rumors and company moves.

model release · Official · Updated: 18h ago

A year ago, we introduced AlphaEvolve — our Gemini-powered coding agent. Today, it's being used across fields from improving Google's AI infrastructure and enabling complex molecular simulations, to better predicting the risk of natural disasters. Here's a look at the impact so https://t.co/xrYpJy2qZE

Google · 5/8/2026, 4:56:56 PM
model release · Official · Updated: 14h ago

This system helped us identify that this happened for some of our prior Instant and mini models. It additionally affected GPT-5.4 Thinking in less than 0.6% of samples. Out of an abundance of caution, we did an in-depth analysis of these cases: they did not seem to reduce

OpenAI · 5/8/2026, 8:19:05 PM
model release · Official · Updated: 17h ago

High-quality documents based on Claude’s constitution, combined with fictional stories that portray an aligned AI, can reduce agentic misalignment by more than a factor of three—despite being unrelated to the evaluation scenario. https://t.co/JORhSuY4N7

Anthropic · 5/8/2026, 5:52:12 PM
model release · Official · Updated: 17h ago

We experimented with training Claude on examples of safe behavior in scenarios like our evaluation. This had only a small effect, despite being similar to our evaluation. We got further by rewriting the responses to portray admirable reasons for acting safely.

Anthropic · 5/8/2026, 5:52:10 PM
model release · Official · Updated: 17h ago

New Anthropic research: Teaching Claude why. Last year we reported that, under certain experimental conditions, Claude 4 would blackmail users. Since then, we’ve completely eliminated this behavior. How?

Anthropic · 5/8/2026, 5:52:08 PM
model release · Official · Updated: 17h ago

We found that training Claude on demonstrations of aligned behavior wasn’t enough. Our best interventions involved teaching Claude to deeply understand why misaligned behavior is wrong. Read more: https://t.co/0YaRlXhVZb

Anthropic · 5/8/2026, 5:52:09 PM
model release · Official · Updated: 17h ago

We started by investigating why Claude chose to blackmail. We believe the original source of the behavior was internet text that portrays AI as evil and interested in self-preservation. Our post-training at the time wasn’t making it worse—but it also wasn’t making it better.

Anthropic · 5/8/2026, 5:52:09 PM