Multi-Agent Systems for Cybersecurity Defense

Distributed AI Architectures for Autonomous Threat Detection and Response

Overview of Multi-Agent Approaches to Cybersecurity

Multi-Agent Reinforcement Learning (MARL) has emerged as a transformative approach to modern cybersecurity challenges, enabling decentralized, adaptive, and collaborative defense strategies that provide automated mechanisms to combat dynamic, coordinated, and sophisticated threats. As 2025 is being called "the year of multi-agent systems," organizations are increasingly deploying swarms of autonomous AI agents that work together to tackle complex security operations, with each agent addressing unique tasks such as phishing detection, malware analysis, or insider threat monitoring.

Multi-agent systems built on the collective interaction of intelligent agents offer a promising approach to autonomous, adaptive, and scalable protection. In the digital era where industrial environments handle massive data volumes, intrusion detection systems based on artificial intelligence algorithms are providing critical answers to cybersecurity challenges that traditional signature-based approaches cannot address. Agentic AI is transforming cybersecurity by automating critical tasks within Security Operations Centers (SOCs), including decision-making, incident response, and threat detection, with intelligent agents managing repetitive work, reasoning through complex scenarios, and coordinating across tools and systems to accelerate response.

Threat Detection and Response Coordination

Multi-agent intrusion detection systems (IDS) are designed as distributed attack detection platforms in which agents work in coordination to provide a scalable, fault-tolerant, multi-view security architecture. Each mobile agent analyzes traffic and detects threats independently, avoiding the single point of failure inherent in centralized architectures. Multi-sensor architectures yield more accurate detections and build collective attack knowledge from the observations of different sensors.

A major challenge for practical IDS deployment is traffic concept drift, which manifests as zero-day attacks and changing behavior of benign users and applications. Recent research demonstrates that multi-agent IDS frameworks using federated distillation can handle the data volumes of real deployments, with agents learning novel attacks and updating models asynchronously while maintaining acceptable results on both known and new anomalies. Deep reinforcement learning-based IDS that run Deep Q-Network logic in multiple distributed agents and use attention mechanisms have achieved F1-scores of 0.76 while efficiently detecting and classifying advanced network attacks.
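The per-agent Q-learning logic described above can be illustrated with a minimal tabular stand-in. This is a deliberate simplification: there is no neural network or attention mechanism, and the state names, action set, and reward values are all illustrative assumptions rather than any paper's actual design.

```python
import random

# Illustrative sketch: a tabular Q-learning stand-in for the per-agent
# DQN logic. States are coarse traffic labels, actions are defensive
# responses. All names and reward values are assumptions.
class QDefenseAgent:
    def __init__(self, actions=("ignore", "flag", "block"),
                 alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = {}                      # (state, action) -> estimated value
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, state):
        # Epsilon-greedy: explore occasionally, otherwise pick the best action.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q.get((state, a), 0.0))

    def learn(self, state, action, reward, next_state):
        # Standard Q-learning update toward reward + discounted best next value.
        best_next = max(self.q.get((next_state, a), 0.0) for a in self.actions)
        old = self.q.get((state, action), 0.0)
        self.q[(state, action)] = old + self.alpha * (
            reward + self.gamma * best_next - old)

# Toy training loop: reward +1 for blocking malicious flows, -1 for ignoring
# them. epsilon=0 makes the demo deterministic.
agent = QDefenseAgent(epsilon=0.0)
for _ in range(200):
    agent.learn("malicious", "block", 1.0, "idle")
    agent.learn("malicious", "ignore", -1.0, "idle")
print(agent.act("malicious"))  # learns to prefer "block"
```

A production system would replace the lookup table with a deep network over flow features; the control loop (observe, act, learn from reward) is the same.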

Agentic AI can narrow down 10,000 alerts to 50 critical incidents, explain why they matter, and in some cases already take first response actions autonomously. Future advancements include autonomous incident handling with AI-driven decision-making in real-time, threat containment by isolating affected systems automatically, and response coordination to assist security teams. Effective use of AI agents requires balancing autonomy with human oversight, with many organizations starting with AI-driven recommendations and low-risk automation before expanding autonomy.
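The triage pattern above — narrowing thousands of alerts to a critical handful — can be sketched as a scoring-and-ranking step. The scoring fields and weights below are illustrative assumptions, not any vendor's actual model.

```python
# Hypothetical sketch of alert triage: score raw alerts, keep the top
# slice for analysts. Field names and weights are assumptions.
def triage(alerts, keep=50):
    def score(a):
        s = a.get("severity", 0)                  # 0-10 from the detector
        s += 5 if a.get("asset_critical") else 0  # touches a crown-jewel asset
        s += 3 * len(a.get("correlated", []))     # linked related alerts
        return s
    ranked = sorted(alerts, key=score, reverse=True)
    return ranked[:keep]

# 10,000 synthetic alerts narrowed to 50 critical incidents.
alerts = [{"id": i, "severity": i % 10,
           "asset_critical": i % 100 == 0,
           "correlated": []} for i in range(10_000)]
critical = triage(alerts)
print(len(critical))  # 50
```

Real agentic triage adds the "explain why they matter" step on top of ranking, typically by attaching reasoning to each surfaced incident.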

Distributed Defense Architectures

NATO has developed the AICA (Autonomous Intelligent Cyber-defense Agent) Reference Architecture for Multiple Autonomous Intelligent Cyber-defense Agents (MAICA), which presents the rationale, concept, and future research directions for autonomous intelligent cyber defense in military systems. In contested battlefield scenarios where enemy software cyber agents (malware) infiltrate friendly networks to attack command, control, communications, computers, intelligence, surveillance, and reconnaissance systems, NATO needs artificial cyber hunters—intelligent, autonomous, mobile agents specialized in active cyber defense. These agents work in cohorts or swarms capable of detecting cyber-attacks, devising appropriate countermeasures, and executing and tactically adapting those countermeasures.

Hybrid AI architectures integrating deep reinforcement learning (DRL), augmented Large Language Models (LLMs), and rule-based systems are being implemented in software-defined network (SDN) controllers, enabling automated defensive actions such as monitoring, analysis, decoy deployment, service removal, and recovery. A modular, flexible, and scalable SDN-based framework integrates a deep learning-based intrusion detection system (IDS) and a DRL-based intrusion prevention system (IPS) to address slow-rate DDoS threats, with the LSTM-based IDS detecting suspicious flows and informing the DRL-based IPS, which uses deep Q-learning agents per bidirectional connection to make mitigation decisions in real time.
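The detect-then-mitigate pipeline described above can be sketched with stubs: a detector flags suspicious flows and a per-connection agent chooses a mitigation. The thresholds, field names, and actions are assumptions; the paper's LSTM detector and deep Q-learning agents are replaced with rule-based stand-ins to show the control flow only.

```python
# Minimal sketch of an IDS-informs-IPS pipeline for slow-rate DDoS.
def detector(flow):
    # Stand-in for the LSTM-based IDS: flag slow-rate patterns
    # (many long-lived, low-byte-rate requests on one connection).
    return flow["avg_bytes_per_s"] < 50 and flow["open_requests"] > 20

class ConnectionAgent:
    # Stand-in for the per-connection deep Q-learning IPS agent.
    def decide(self, flow):
        if flow["open_requests"] > 100:
            return "drop"        # hard mitigation
        return "rate_limit"      # soft mitigation

agents = {}  # one agent per bidirectional connection, as in the framework

def handle(flow):
    if not detector(flow):
        return "allow"
    agent = agents.setdefault(flow["conn_id"], ConnectionAgent())
    return agent.decide(flow)

print(handle({"conn_id": 1, "avg_bytes_per_s": 10, "open_requests": 150}))
# -> "drop"
```

Keeping one agent per connection mirrors the paper's design: each agent's decision space stays small, and mitigation is applied at flow granularity inside the SDN controller.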

Multi-Agent Coordination in Network Defense

Agent Specialization and Role Distribution

Multi-Agent Deep Reinforcement Learning (MADRL) presents a promising approach to enhancing the efficacy and resilience of autonomous cyber operations, leveraging collaborative interactions among multiple agents to detect, mitigate, and respond to cyber threats. By applying MARL to automated cyber defense (ACD), the defense of a large enterprise network can be distributed across individual subnets, with each defensive agent tasked with defending a single subnet, reducing state space while contributing to the defense of the entire network.
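The subnet decomposition described above can be sketched as routing each observation only to the agent defending that subnet, so each policy covers a fraction of the global state space. Subnet ranges and the observation shape are illustrative assumptions.

```python
import ipaddress

# Sketch: each defender agent owns one subnet and sees only local events.
class SubnetDefender:
    def __init__(self, cidr):
        self.net = ipaddress.ip_network(cidr)

    def observes(self, host_ip):
        return ipaddress.ip_address(host_ip) in self.net

defenders = [SubnetDefender(c) for c in
             ("10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24")]

def route_alert(host_ip):
    # Deliver the alert only to the agent defending that subnet.
    return [d for d in defenders if d.observes(host_ip)]

print(len(route_alert("10.0.2.17")))  # 1 — only the 10.0.2.0/24 agent
```

In a MARL setting each `SubnetDefender` would wrap a trained policy; the point here is only the partitioning, which keeps every agent's observation and action spaces local while the agents jointly cover the whole network.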

Cyber Gyms are dynamic, interactive environments for the training of MARL agents, addressing the limitations of static datasets that quickly become outdated or biased in rapidly evolving cybersecurity settings. CybORG (Cyber Operations Research Gym) is a cyber security research environment for the training and development of both human and autonomous security agents, providing a common interface to emulated (cloud-based virtual machine) and simulated network environments. The CAGE Challenge 4 extends the CybORG code base with an environment that replicates a large enterprise network composed of numerous subnets, where each subnet contains a unique defender agent.

Novel hierarchical multi-agent reinforcement learning (H-MARL) strategies decompose cyber defense into multiple sub-tasks, train sub-policies for each sub-task guided by domain expertise, and train a master policy to coordinate sub-policy selection. Extensive experiments using CybORG CAGE 4, the state-of-the-art MARL environment for cyber defense, show that hierarchical learning approaches achieve top performance in terms of convergence speed, episodic return, and several interpretable metrics relevant to cybersecurity, including the fraction of clean machines on the network, precision, and false positives.
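The hierarchical scheme above — sub-policies per sub-task plus a master policy that selects among them — can be sketched as follows. The sub-task names and the rule-based selection logic are assumptions standing in for the trained policies.

```python
# Sketch of H-MARL control flow: a master policy picks which sub-policy
# acts on the current observation. Names and rules are illustrative.
SUB_POLICIES = {
    "investigate": lambda obs: ("analyse", obs["suspicious_host"]),
    "recover":     lambda obs: ("restore", obs["compromised_host"]),
    "monitor":     lambda obs: ("sleep", None),
}

def master_policy(obs):
    # A trained master policy would score sub-policies from the observation;
    # this rule-based stand-in mirrors the same two-level control flow.
    if obs.get("compromised_host"):
        return "recover"
    if obs.get("suspicious_host"):
        return "investigate"
    return "monitor"

def step(obs):
    choice = master_policy(obs)
    return SUB_POLICIES[choice](obs)

print(step({"suspicious_host": "10.0.1.5"}))  # ('analyse', '10.0.1.5')
```

The benefit claimed for this decomposition is that each sub-policy trains on a narrower task (guided by domain expertise), while the master policy only learns the coordination problem, which is what speeds convergence.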

Agent Role Specialization

  • Detection Agents: Specialized in identifying anomalies and suspicious patterns in network traffic
  • Analysis Agents: Deep investigation of potential threats and correlation of events
  • Response Agents: Automated containment and mitigation actions
  • Coordination Agents: Orchestrating multi-agent activities and ensuring coherent defense strategies
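The four roles above can be sketched as a minimal pipeline in which a coordination agent routes an event through detection, analysis, and response in turn. All class names, method names, and thresholds are illustrative assumptions.

```python
# Sketch of role specialization: detection -> analysis -> response,
# orchestrated by a coordinator. Thresholds and verdicts are assumptions.
class DetectionAgent:
    def detect(self, event):
        return event.get("bytes_out", 0) > 1_000_000  # crude anomaly rule

class AnalysisAgent:
    def analyse(self, event):
        # Deep investigation / correlation would happen here.
        return {"event": event, "verdict": "exfiltration_suspected"}

class ResponseAgent:
    def respond(self, finding):
        return f"isolate {finding['event']['host']}"  # containment action

class Coordinator:
    def __init__(self):
        self.d, self.a, self.r = DetectionAgent(), AnalysisAgent(), ResponseAgent()

    def handle(self, event):
        if not self.d.detect(event):
            return "no_action"
        return self.r.respond(self.a.analyse(event))

print(Coordinator().handle({"host": "db-01", "bytes_out": 5_000_000}))
# -> "isolate db-01"
```

In a real deployment each role would be an independently running agent communicating over a message bus, with the coordinator enforcing the human-oversight boundary before any response action executes.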

Recent Deployments and Case Studies

CrowdStrike announced in October 2025 a collaboration with NVIDIA to bring always-on, continuously learning AI agents for cybersecurity to the edge through Charlotte AI AgentWorks, NVIDIA Nemotron open models, NVIDIA NeMo Data Designer synthetic data, NVIDIA NeMo Agent Toolkit, and NVIDIA NIM microservices. This architecture enables defenders to feed enriched telemetry directly into locally hosted AI models and agents built and optimized with the NVIDIA NeMo Agent Toolkit, operating at the edge and allowing systems to learn safely, reason accurately, and act within enterprise guardrails. When running NVIDIA NIM microservices internally, CrowdStrike Charlotte AI Detection Triage enabled automated detection triage at 2x the speed of its initial launch with 50% fewer compute resources, reducing alert fatigue and maximizing SOC efficiency.

Palo Alto Networks unveiled Cortex Cloud 2.0 in October 2025, which includes a workforce of autonomous AI agents, a reimagined Cloud Command Center, and a performance-optimized CDR agent. The company also launched Cortex AgentiX, built on a decade of security automation leadership and trained on 1.2 billion real-world playbook executions, offering up to 98% reduction in mean time to respond (MTTR) with 75% less manual work. The platform comes with over 1,000 prebuilt integrations and native Model Context Protocol (MCP) support.

Google launched the Agent2Agent (A2A) protocol on November 13, 2025, with support from more than 50 technology partners including Atlassian, Box, Cohere, Intuit, Langchain, MongoDB, PayPal, Salesforce, SAP, and ServiceNow. The protocol includes built-in auth with short-lived tokens scoped per task and expiring in minutes, eliminating long-lived secrets while integrating with enterprise SSO. Under the Linux Foundation's governance, A2A will remain vendor neutral and emphasize inclusive contributions while continuing the protocol's focus on extensibility, security, and real-world usability across industries.
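The short-lived, task-scoped token pattern described above can be sketched with stdlib HMAC signing. The claim fields, the 5-minute lifetime, and the signing scheme are assumptions for illustration; A2A itself specifies its own auth formats and layers them on enterprise SSO.

```python
import hmac, hashlib, json, time, base64

SECRET = b"demo-signing-key"  # in practice, held by the identity provider

def mint_token(agent_id, task_id, ttl_s=300):
    # Token is scoped to one task and expires in minutes — no long-lived secret.
    claims = {"sub": agent_id, "task": task_id, "exp": time.time() + ttl_s}
    body = json.dumps(claims, sort_keys=True).encode()
    sig = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return base64.b64encode(body).decode(), sig

def verify(token, sig, task_id):
    body = base64.b64decode(token)
    expected = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):  # constant-time comparison
        return False
    claims = json.loads(body)
    return claims["task"] == task_id and claims["exp"] > time.time()

tok, sig = mint_token("billing-agent", "task-42")
print(verify(tok, sig, "task-42"))   # True
print(verify(tok, sig, "task-99"))   # False — token is scoped to one task
```

The security property being illustrated is the one the protocol emphasizes: a stolen token is useful only for a single task and only for a few minutes.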

SentinelNet is the first decentralized framework for proactively detecting and mitigating malicious behaviors in multi-agent collaboration, published in October 2025. It equips each agent with a credit-based detector trained via contrastive learning on augmented adversarial debate trajectories, enabling autonomous evaluation of message credibility and dynamic neighbor ranking via bottom-k elimination to suppress malicious communications. Experiments show SentinelNet achieves near-perfect detection of malicious agents within two debate rounds and recovers 95% of system accuracy from compromised baselines, with accuracy ranging from 85.9% to 92.1% across different benchmarks.
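The credit-and-prune mechanism described above — rank neighbors by message credibility, then drop the bottom k — can be sketched as below. The credit scores are given directly here as an assumption; in SentinelNet they come from the contrastively trained detector.

```python
# Sketch of bottom-k elimination over per-neighbor credit scores.
def prune_bottom_k(credits, k=1):
    # credits: {neighbor_id: credit score}; suppress the k lowest-credited.
    ranked = sorted(credits, key=credits.get)  # ascending by credit
    return {n: c for n, c in credits.items() if n not in ranked[:k]}

credits = {"agent_a": 0.91, "agent_b": 0.88, "agent_m": 0.12}
kept = prune_bottom_k(credits, k=1)
print(sorted(kept))  # ['agent_a', 'agent_b'] — agent_m suppressed
```

Because every agent runs this locally each debate round, a malicious agent loses its audience quickly even without any central authority, which is what makes the framework decentralized.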

Challenges and Attack Vectors

The rise of AI agents and multi-agent systems introduces new challenges in cybersecurity, including new attack vectors and vulnerabilities requiring security teams to protect against data poisoning, prompt injection, and social engineering attacks. Prompt injection remains one of the most potent and versatile attack vectors, capable of leaking data, misusing tools, or subverting agent behavior. Attack vectors shift beyond CVEs to target agentic behavior itself through prompt injection, memory poisoning, and tool abuse.

Delegating tasks to AI agents extends a principal's attack surface to its software proxies, and by compromising an agent, attackers can extract highly sensitive data ranging from credentials to proprietary documents. An attack on one AI assistant's RAG (Retrieval-Augmented Generation) memory can compromise downstream decisions, which is particularly dangerous in a multi-agent system where agents share data, amplifying the attack. Once poisoned data enters the system, downstream agents automatically retrieve and act upon it without detecting the compromise, resulting in invisible and persistent compromise where both agents appear to function normally while providing false outputs.
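One illustrative mitigation for the propagation problem described above is to have downstream agents verify the provenance of retrieved memory entries before acting on them. The hash-allowlist scheme below is a sketch of the general idea under stated assumptions, not a known product feature.

```python
import hashlib

trusted_hashes = set()

def ingest(doc):
    # Called only on the vetted write path into shared agent memory.
    trusted_hashes.add(hashlib.sha256(doc.encode()).hexdigest())

def retrieve_checked(doc):
    # Downstream agents reject entries that bypassed the vetted path,
    # stopping poisoned data from propagating silently between agents.
    if hashlib.sha256(doc.encode()).hexdigest() not in trusted_hashes:
        raise ValueError("untrusted memory entry")
    return doc

doc = "Q3 firewall policy: default-deny on subnet 10.0.3.0/24"
ingest(doc)
print(retrieve_checked(doc) == doc)  # True — vetted entry passes
try:
    retrieve_checked("attacker-injected instruction")
except ValueError:
    print("poisoned entry rejected")
```

Provenance checks of this kind address only out-of-band injection into the store; content that was poisoned before entering the vetted path still requires the detection approaches discussed elsewhere in this section.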

Coordinated fleets of specialized agents can launch thousands of subtle, context-aware interactions that are far more likely to sway or manipulate individuals than a single adversary could. Agents might establish secret collusion channels through steganographic communication, engage in coordinated attacks that appear innocuous individually, or exploit information asymmetries, with network effects potentially amplifying vulnerabilities through cascading privacy leaks and proliferating jailbreaks.

Future Directions

In 2025, significant advancements in agentic artificial intelligence systems are driving new AI-based cyber defenses, with solutions helping organizations carry out specific goals, make decisions, and take mitigation action with minimal human intervention. Nearly 40% of companies expect agentic AI to augment or assist teams over the next 12 months, especially in cybersecurity, though security teams will be slower to adopt these systems than adversaries because of the need to put in place proper security guardrails and build trust over time.

The agentic security operations center (SOC), powered by multiple connected and use-case driven agents, can execute semi-autonomous and autonomous security operations workflows on behalf of defenders. In 2026 and beyond, these systems will evolve to understand threat patterns and attack methodologies of the adversary, sharing insights across networks and organizations to create a collective defense mechanism. Potential early use cases for multi-agent systems in cyber defense include incident response, application testing, and vulnerability discovery.

Addressing emerging vulnerabilities requires a comprehensive, multi-layered security architecture, treating trust as an inherent design principle rather than an afterthought. Critical to this defense is a zero trust approach to cybersecurity, which assumes no user or device can be inherently trusted, with continuous verification enabling constant monitoring and ensuring that attempts to exploit vulnerabilities are quickly detected and addressed in real time.
