In its newly published Frontier AI Framework, Meta categorizes risky AI models into two levels: high risk and critical risk.
According to the company, both classifications apply to AI systems capable of aiding cybersecurity breaches, as well as chemical and biological attacks.
However, critical-risk models are those that could lead to “catastrophic outcomes” with no viable safeguards, while high-risk systems might make such attacks easier but not reliably so.
If an AI system is classified as high-risk, Meta says it will limit internal access and delay release until mitigations are in place. If a system is deemed critical-risk, development will be halted entirely until it can be made safer.
Meta’s framework does not rely on strict empirical tests to classify AI risks. Instead, risk levels will be determined through input from internal and external researchers, subject to review by senior leadership.
The company states that current AI evaluation methods are not robust enough to provide definitive risk assessments.
The framework arrives as Meta faces scrutiny over its open approach to AI development. Unlike companies such as OpenAI, which gate access to their models behind an API, Meta has made versions of its Llama models openly available for public use.
While this openness has driven widespread adoption, it has also raised concerns about misuse. Reports suggest at least one US adversary has used Llama to develop AI-driven defense tools.
With this policy, Meta appears to be drawing a distinction between its approach and that of DeepSeek, a Chinese AI company that also makes its models publicly available but with fewer safety measures.
Meta asserts that responsible AI development requires balancing openness with security, stating: “It is possible to deliver that technology to society while also maintaining an appropriate level of risk.”