OpenAI has launched a new AI Safety Hub to give the public clearer insight into how its models perform against risks such as harmful content, hallucinations, and jailbreak attempts. The move marks a major step toward transparency, especially as scrutiny of AI safety continues to grow.
The hub offers a regularly refreshed snapshot of safety test results, and OpenAI says it will update the page after major model changes. Until now, most of this data appeared only in system cards at launch; going forward, users and researchers alike will have access to updates in one central location.
Responding to Criticism with Open Access
In its blog post, OpenAI explained that the hub will evolve alongside its safety research. The goal is to make safety metrics more scalable and easier to compare over time. The company also wants this move to push the broader AI community to improve safety transparency.
This change didn’t come out of nowhere. OpenAI has recently faced backlash for skipping safety disclosures for some models and allegedly rushing internal reviews. Reports last year suggested that Sam Altman, OpenAI’s CEO, misled executives about safety concerns ahead of his brief removal in November 2023.
More recently, OpenAI had to roll back an update to GPT-4o, the default ChatGPT model. Users complained that the chatbot had become too agreeable, even validating dangerous or problematic behavior. Social media platforms were quickly flooded with examples. In response, OpenAI promised to tighten guardrails and introduced a new “alpha phase” where early users can test models before public release.
What’s Next for the Safety Hub?
OpenAI says the AI Safety Hub will continue to grow. Over time, it plans to add more evaluation categories and improve the clarity of the shared metrics. The company hopes the tool will help users understand how safe its systems really are, and help rebuild trust after recent stumbles.
With more public eyes on AI safety than ever, this initiative could become a key benchmark for the industry.