International AI Safety Report 2026: Global Risks and Governance Implications

By Aryamehr Fattahi | 4 February 2026


Summary

  • The International AI Safety Report 2026, released on 3 February 2026 and backed by over 30 countries, documents significant advances in AI capabilities, including gold-medal performance in mathematics and PhD-level performance on science benchmarks, while warning that safeguards are not keeping pace with these developments.

  • Critical risk findings include 23% of the highest-performing biological AI tools carrying high misuse potential, AI agents reaching the top 5% in cybersecurity competitions, and models learning to distinguish between safety evaluations and deployment contexts, which undermines testing reliability.

  • The India AI Impact Summit (16-20 February 2026) offers the first opportunity for Global South leadership on AI governance; however, the report's findings on the widening capability-safeguard gap are likely to intensify pressure on both voluntary industry frameworks and international coordination efforts throughout 2026.


Context

The International AI Safety Report 2026 represents the most comprehensive international scientific assessment of general-purpose AI risks to date. Published on 3 February 2026, the 220-page document contains 1,451 references and was authored by over 100 independent AI experts. It is backed by an Expert Advisory Panel comprising nominees from more than 30 countries alongside the European Union (EU), the Organisation for Economic Co-operation and Development (OECD), and the United Nations (UN).

Turing Award winner Yoshua Bengio chairs the report. This is the second edition in a series established following the 2023 Bletchley Park AI Safety Summit, with the inaugural report published in January 2025. The release timing is deliberate. The findings will inform discussions at the India AI Impact Summit, scheduled for 16-20 February 2026 in New Delhi.

The report documents that 700 million people now use leading AI systems weekly, representing faster adoption than the personal computer achieved. However, distribution remains highly uneven: usage exceeds 50% of the population in some countries but falls below 10% across much of Africa, Asia, and Latin America. During 2025, leading systems achieved gold-medal performance on International Mathematical Olympiad questions and exceeded PhD-level expert performance on science benchmarks.


Implications and Analysis

The report's findings carry significant consequences for governments, AI developers, civil society organisations, and Global South nations navigating an increasingly complex governance landscape. The following sections examine the strategic implications across key risk categories and stakeholder groups.

The Widening Capability-Safeguard Gap

The report's central concern is the growing distance between AI capabilities and mechanisms to manage associated risks. AI systems can now autonomously complete software engineering tasks requiring multiple hours of human programmer time. OpenAI's o3 model outperforms 94% of domain experts at troubleshooting virology laboratory protocols. Natural language interfaces make these capabilities accessible to users without specialist expertise, a development with profound dual-use implications.

While the number of companies publishing Frontier AI Safety Frameworks has increased, sophisticated attackers can often bypass current defences, and users can obtain harmful outputs by rephrasing requests or decomposing them into smaller steps. The gap between capability advancement and safeguard implementation is highly likely to remain a defining feature of AI governance debates throughout 2026.

For AI developers, this creates strategic tension between competitive pressure to deploy increasingly capable systems and reputational exposure if those systems cause harm. Companies face difficult trade-offs between speed-to-market and the thoroughness of safety testing, particularly as evaluation gaming complicates the reliability of pre-deployment assessments. For governments, the voluntary nature of existing frameworks raises questions about regulatory adequacy. The European Commission's enforcement powers under the EU AI Act take effect on 2 August 2026; whether that framework can address the capability-safeguard gap identified in the report remains uncertain.

Biological Misuse: The Open-Source Dilemma

The biological misuse findings are likely to dominate near-term policy discussions. Of the highest-performing biological AI tools, 23% carry high misuse potential, with 61.5% of these being fully open-source. Only 3% of 375 surveyed biological AI tools have any safeguards whatsoever. In 2025, major AI developers released models with heightened safeguards after pre-deployment testing could not rule out that systems could meaningfully assist novices in developing biological weapons.

This creates a governance paradox that policymakers have yet to resolve. Open-source biological AI tools drive legitimate scientific progress, enabling researchers to accelerate drug discovery, understand disease mechanisms, and develop new therapies. Yet their accessibility makes meaningful restrictions nearly impossible without hampering beneficial research. The first demonstration of genome-scale generative AI design (researchers using foundation models to generate a significantly modified virus from scratch) illustrates capabilities that exist regardless of what safeguards frontier model developers implement.

For biosecurity policymakers, the findings suggest that focusing exclusively on frontier model safeguards is unlikely to address risks from already-distributed open-source tools. A realistic possibility exists that biological AI governance will emerge as a distinct policy track, separate from broader AI regulation, given the specific expertise required and the existing international frameworks (such as the Biological Weapons Convention) that could be leveraged. For the scientific community, tensions between openness norms and dual-use concerns are almost certain to intensify. Researchers may face growing pressure to justify why particular tools or datasets should remain publicly accessible.

Cybersecurity: Offensive Capabilities Accelerating

Cybersecurity findings follow a similar pattern to biological risks but with more immediate operational implications. The UK AI Security Institute found AI models can complete apprentice-level cyber tasks 50% of the time, up from 10% in early 2024. Underground marketplaces now sell pre-packaged AI tools that lower the skill threshold for attacks.

Whether AI ultimately advantages attackers or defenders remains uncertain, but the offensive capability curve is currently steeper. Defensive applications require integration into complex organisational systems and processes, while offensive tools can be deployed by individuals or small groups with minimal infrastructure. This asymmetry suggests that near-term cyber risk exposure is likely to increase even as defensive AI capabilities improve.

For organisations managing cyber risk, the realistic possibility that AI-augmented threats will outpace defensive adaptation warrants attention. Traditional security assumptions about attacker skill requirements may need revision. For governments considering critical infrastructure protection, the findings reinforce concerns about cascading vulnerabilities if AI-enabled attacks target interconnected systems.

Evaluation Reliability: A Structural Challenge

Perhaps the most consequential finding for AI governance concerns safety testing itself. Some models can now distinguish between evaluation and deployment contexts, altering their behaviour accordingly. This means dangerous capabilities could go undetected during pre-deployment assessments. The report notes early signs of deception, cheating, and situational awareness in frontier models, behaviours that could be directly incentivised by training processes that reward performance on benchmarks.

This undermines the evaluation paradigm underpinning current governance frameworks. Both voluntary industry commitments and emerging regulations rely heavily on pre-deployment capability assessments to determine what safeguards are necessary. If models can identify when they are being tested and modify their responses, the reliability of these assessments degrades significantly. For governments relying on developer-reported evaluations to inform regulatory decisions, this represents a structural vulnerability that existing frameworks are poorly equipped to address.

The implications extend to loss of control scenarios: situations where AI systems operate outside human oversight with no clear path to regaining control. While current systems lack the capabilities to pose such risks, the report notes modest advances toward relevant capabilities: autonomous computer use, programming, gaining unauthorised access to digital systems, and identifying ways to evade oversight. Expert opinion on the likelihood of such scenarios emerging within several years varies considerably, from implausible to likely. The evaluation gaming findings suggest that even if such capabilities emerged, current testing methodologies might fail to detect them reliably.

Deepfakes and Synthetic Media Harms

The deepfake findings illustrate AI-enabled harms materialising at scale despite existing frameworks. Some 15% of UK adults have now encountered pornographic deepfakes. The range of misuse has also expanded significantly: non-consensual intimate imagery, political disinformation, financial fraud through voice cloning, corporate impersonation scams, and reputational attacks against individuals and organisations.

For civil society organisations and platform governance bodies, the trajectory suggests that reactive content moderation is unlikely to address harms at scale. The ease of generating synthetic media, combined with the difficulty of detection, creates enforcement challenges that existing approaches struggle to meet. For technology executives, reputational and regulatory exposure from hosting or enabling such content is highly likely to increase.

This category of harm also illustrates the limitations of focusing governance efforts solely on frontier models. Many deepfake applications use older, widely available models that fall outside the scope of frontier AI regulations. Addressing these harms may require different policy instruments, including platform liability frameworks, criminal sanctions for creators and distributors, and technological interventions such as content authentication and provenance tracking.

Global South Inclusion and Governance Tensions

The India AI Impact Summit represents the first major AI governance gathering hosted in the Global South. India's framing of "impact" rather than "safety" or "action" signals intent to move from dialogue to demonstrable outcomes, with particular emphasis on inclusion, fair access to compute and data, and international cooperation. The summit's three foundational pillars, described as "Sutras" (People, Planet, and Progress), reflect a development-oriented approach distinct from the risk-focused framing that has characterised previous gatherings.

However, the report's findings create tension with inclusive governance aspirations. The capability-safeguard gap is widening at the frontier, driven by a small number of laboratories in a handful of countries (primarily the United States and, increasingly, China). Global South nations face the prospect of managing deployment-phase risks from systems whose development they had limited influence over, while their domestic AI sectors remain nascent. The report explicitly notes that risks may manifest differently across countries and regions, given variation in AI adoption rates, infrastructure, and institutional contexts.

For international institutions, the challenge is whether governance mechanisms can evolve quickly enough to remain relevant as capabilities advance. The gap between annual summit cycles and the pace of AI development creates structural difficulties for international coordination. For Global South governments, the strategic question is whether to prioritise participation in frontier governance discussions or focus resources on managing near-term deployment risks within their jurisdictions. The report's findings suggest both are necessary, but capacity constraints may force difficult choices.

Labour Market and Economic Uncertainties

The report documents significant uncertainty about AI's labour market effects. AI systems can now perform tasks previously requiring years of specialist training, yet the implications for employment remain contested. Some analyses suggest AI will augment human productivity and create new job categories; others warn of significant displacement, particularly in knowledge work.

This uncertainty complicates policy responses. Governments face pressure to address potential displacement but lack reliable forecasts of which sectors and occupations face the greatest exposure. The report's documentation of capabilities that match or exceed PhD-level expert performance on specific tasks suggests that even highly skilled occupations may face disruption, contrary to earlier assumptions that automation would primarily affect routine work.


Forecast

  • Short-term (Now - 3 months)

    • The India AI Impact Summit is almost certain to produce a Leaders' Declaration addressing capability-safeguard gaps, though binding commitments remain unlikely given the voluntary nature of previous summit outcomes. Deepfake and biological misuse findings are likely to generate immediate calls for regulatory acceleration in jurisdictions with pending AI legislation, including the UK and several US states considering frontier AI bills.

  • Medium-term (3-12 months)

    • The start of European Commission enforcement under the EU AI Act in August 2026 will test whether formalised safety frameworks can address the evaluation reliability concerns raised in the report. There is a realistic possibility that open-source biological AI tool governance will emerge as a distinct policy track, potentially through amendments to existing biosecurity frameworks rather than AI-specific regulation.

  • Long-term (>1 year)

    • If evaluation gaming becomes more prevalent, the current model-level testing paradigm is likely to face fundamental revision, potentially shifting toward continuous monitoring of deployed systems rather than pre-deployment assessment. International coordination on frontier AI safety, analogous to nuclear risk management frameworks, remains a realistic possibility but faces significant geopolitical obstacles given US-China technology competition and divergent regulatory philosophies between major jurisdictions.

BISI Probability Scale