Autonomous Cyber Defense Using RL in Distributed Networks
DOI: https://doi.org/10.15662/IJEETR.2025.0706034

Keywords: Reinforcement Learning, Autonomous Cyber Defense, Distributed Networks, MITRE ATT&CK, Multi-Agent RL, Intrusion Detection, Incident Response, Zero-Day, APT, DDoS Mitigation, NIST CSF, Network Security

Abstract
The escalating sophistication of cyber threats against distributed networks - spanning cloud data centers, edge gateways, IoT endpoints, enterprise LANs, operational technology environments, and remote workforces - has rendered manual Security Operations Center (SOC) workflows and rule-based Security Information and Event Management (SIEM) systems fundamentally inadequate. This paper presents an autonomous cyber defense system comprising six specialized reinforcement learning agents - DDoS Mitigator, APT Hunter, Lateral Movement Blocker, Crypto Defender, Exfiltration Guard, and Reconnaissance Detector - each employing a purpose-selected RL algorithm (PPO, SAC, MAPPO, TD3, A3C, DQN) optimized for the unique characteristics of its threat domain. The agents operate under a Centralized Training with Decentralized Execution (CTDE) paradigm across six network segments, processing 2.4 million packets per second and making autonomous defense decisions in real time. Through an 18-month deployment protecting a distributed network of 63,000 nodes across cloud, edge, IoT, enterprise, OT/SCADA, and remote worker segments, the system achieves an overall detection rate of 94.6% across 10 attack vectors mapped to MITRE ATT&CK (up from 62.4% with manual SOC), reduces mean time to respond from 4.2 hours to 2.4 seconds, decreases the false positive rate from 18.5% to 1.9%, and reduces per-incident cost from $18,400 to $420. Four progressive red team exercises using 52 ATT&CK techniques validate the system's ability to autonomously prevent all 18 simulated breaches in the final exercise. The system raises the organization's NIST Cybersecurity Framework score from 2.8 to 4.7 out of 5.0.
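The abstract's one-agent-per-threat-domain design can be sketched as a simple registry. This is an illustrative Python sketch only: the agent-to-algorithm pairing is inferred from the abstract's listing order, and all class and function names here are hypothetical, not the authors' implementation.

```python
# Illustrative sketch of the paper's six-agent architecture.
# Agent/algorithm pairing inferred from the abstract's listing order;
# names and structure are hypothetical, not the authors' code.
from dataclasses import dataclass

@dataclass(frozen=True)
class DefenseAgent:
    name: str       # specialized agent named in the abstract
    algorithm: str  # RL algorithm the paper pairs with that threat domain

# Under CTDE, each agent would be trained centrally but execute
# decentralized decisions on its own network segment.
AGENTS = [
    DefenseAgent("DDoS Mitigator", "PPO"),
    DefenseAgent("APT Hunter", "SAC"),
    DefenseAgent("Lateral Movement Blocker", "MAPPO"),
    DefenseAgent("Crypto Defender", "TD3"),
    DefenseAgent("Exfiltration Guard", "A3C"),
    DefenseAgent("Reconnaissance Detector", "DQN"),
]

def agent_for(threat_domain: str) -> DefenseAgent:
    """Route a detected threat domain to its specialized agent."""
    by_name = {a.name: a for a in AGENTS}
    return by_name[threat_domain]
```

For example, a flagged lateral-movement event would be routed to the MAPPO-based agent, reflecting the paper's use of a multi-agent algorithm for a threat that spans multiple hosts.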