The Risk of Stateful Anti-Patterns in Enterprise Internet Architecture
Excessive statefulness hurts the ability to scale networks, applications, and ancillary supporting infrastructure, thus affecting an entire service delivery chain’s ability to withstand a DDoS attack.
In this era of accelerated digital transformation, organizations have come to rely on increasingly complex application and service delivery chains to seamlessly and consistently deliver goods and services across the Internet. In turn, they expect similar levels of service and consistency from business partners and suppliers — and all at Internet speed and scale.
This is why, of the three elements of information security — confidentiality, integrity, and availability — it is availability that is at the forefront of the organization’s ability to conduct business and attain its goals. The growing reliance on remote work and education has only served to increase the criticality of availability across all verticals, at all levels of contribution.
As a result of this wholesale shift in operational models, threat actors can now disrupt not only an organization’s public-facing applications and services, which is bad enough in terms of both revenue and brand reputation, but also the ability of front-line workers to carry out their responsibilities. Causing this kind of disruption is the goal of distributed denial-of-service (DDoS) attacks.
Scaling Defenses as DDoS Attacks Increase
Threat actors launch DDoS attacks for a variety of reasons, including extortion, contracted attacks from business competitors, ideological motivations, disputes related to online gaming, and even simple nihilism. DDoS attacks against an organization’s supply chain partners or external services vendors can be just as disruptive as a direct attack against the organization’s own assets. The record-breaking number of DDoS attacks observed during 2021 exhibited significant increases in pre-attack reconnaissance, the introduction of multiple new DDoS vectors, and unprecedented growth in multivector DDoS attacks targeted across multiple verticals.
While metrics such as attack volume (bits-per-second, or bps), throughput (packets-per-second, or pps), and application-layer load (transactions-per-second [tps] or queries-per-second [qps]) are necessary for understanding attack dynamics and scaling DDoS defenses, it is important to realize that DDoS attacks are attacks against both capacity and state.
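The relationship between the first two metrics is simple arithmetic, and a short sketch makes it concrete. The function name and the figures below are illustrative assumptions, and link-layer overhead is ignored.

```python
def packets_per_second(bits_per_second: float, packet_bytes: int) -> float:
    """Packet rate implied by a bit rate at a fixed packet size.

    Simplification: counts only the packet bytes given and ignores
    link-layer framing overhead.
    """
    return bits_per_second / (8 * packet_bytes)


# The same 1 Gbps of traffic is a very different per-packet workload
# depending on packet size.
small = packets_per_second(1e9, 64)     # many small packets
large = packets_per_second(1e9, 1500)   # fewer large packets
```

At 64 bytes per packet, 1 Gbps works out to roughly 1.95 million packets per second; at 1,500 bytes it is about 83,000. Devices that do per-packet or per-connection work feel the first case far more heavily, which is one reason bandwidth alone is a poor guide to sizing defenses.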
In the networking context, maintaining state means tracking the current status or condition of a given network communication session. In terms of applications and services, it means doing so for discrete transactions or processes. While stateful operation can be desirable in some specific circumstances and for short time frames, excessive instantiations of state impose significant constraints on the ability to scale networks, applications, and ancillary supporting infrastructure, thus affecting the ability of the entire service delivery chain to withstand DDoS attacks.
How DDoS Attacks Overcome Stateful Firewalls, IPSes, and Load-Balancers
Placing a stateful firewall — a category that encompasses Web application firewalls (WAFs) — on an enterprise network enhances security by dropping all incoming network traffic not directly related to outgoing user-initiated network requests. However, it does not help secure public-facing Web servers, authoritative DNS servers, application servers, and the like because incoming packets to those servers and services are unsolicited.
Also, low-volume DDoS attacks can overwhelm even the highest-capacity stateful firewalls. This is due to the significant memory and processing overhead consumed in tracking connection state for all incoming Internet traffic; it simply isn’t possible to do so at Internet scale. When stateful firewalls, or the applications, services, and servers sited behind them, are subjected to a DDoS attack, the firewall state tables are quickly exhausted. Either the firewalls themselves are rendered inoperable under the increased traffic load, or the programmatically generated attack traffic crowds out legitimate incoming connections by exhausting the firewall’s ability to track state.
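The crowding-out effect can be illustrated with a toy model. The sketch below is not how any particular firewall is implemented; the `StateTable` class, its capacity, the idle timeout, and the attack rate are all hypothetical values chosen to keep the example small.

```python
from dataclasses import dataclass, field


@dataclass
class StateTable:
    """Toy connection-tracking table with a fixed capacity (hypothetical)."""
    capacity: int
    timeout: float                               # seconds an idle entry is kept
    entries: dict = field(default_factory=dict)  # flow tuple -> last-seen time

    def expire(self, now: float) -> None:
        # Remove entries that have been idle for longer than the timeout.
        stale = [k for k, seen in self.entries.items() if now - seen > self.timeout]
        for k in stale:
            del self.entries[k]

    def admit(self, flow: tuple, now: float) -> bool:
        """Return True if the flow is tracked, False if the table is full."""
        self.expire(now)
        if flow in self.entries or len(self.entries) < self.capacity:
            self.entries[flow] = now
            return True
        return False


def simulate(capacity=1000, timeout=30.0, attack_rate=500, seconds=10):
    """Return (entries in use, legitimate connections refused)."""
    table = StateTable(capacity, timeout)
    legit_dropped = 0
    for t in range(seconds):
        # Each attack flow has a unique source, so each needs a new entry.
        for i in range(attack_rate):
            table.admit(("attacker", t, i), float(t))
        # One legitimate new connection arrives each second.
        if not table.admit(("client", t), float(t) + 0.5):
            legit_dropped += 1
    return len(table.entries), legit_dropped
```

With these defaults the table fills during the second simulated second, and every legitimate connection after that is refused until entries time out. The attack needs only 500 new flows per second to achieve this, which is a trivial amount of bandwidth.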
This allows attackers to successfully disrupt the organization’s public-facing services, including e-commerce, high-demand content, customer service and support applications, and DNS, as well as the VPN infrastructure for the remote workforce.
Stateful load-balancers, intrusion prevention systems (IPSes), and the applications and services behind them are also susceptible to state exhaustion as a result of DDoS attacks. The same is true of applications that carry excessive state at key points in the service delivery chain. Accordingly, state minimization and state distribution should be key in network and application design.
Best Current Practices for Network Infrastructure
Industry coalition Mutually Agreed Norms for Routing Security (MANRS) suggests a set of network infrastructure self-protection best current practices (BCPs) that help ensure the network itself is resilient and can maintain availability even in the face of attack. Critical service delivery elements, such as authoritative and recursive DNS servers and application and content farms, must also be configured and deployed in a scalable, distributed, and resilient manner. Stateless access-control lists (ACLs) should be implemented to enforce situationally appropriate network access control policies for servers, services, and applications, reducing the options available to attackers.
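What makes an ACL stateless is that each packet is judged only on its own header fields, with nothing remembered between packets. A minimal sketch follows, assuming a hypothetical policy for one public Web server and one authoritative DNS server; the `Rule` type, the addresses (taken from documentation ranges), and the rule set are illustrative and do not correspond to any vendor's syntax.

```python
import ipaddress
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    action: str               # "permit" or "deny"
    protocol: str             # "tcp", "udp", or "any"
    dst: str                  # destination prefix in CIDR notation
    dst_port: Optional[int]   # None matches any port


# Hypothetical policy: HTTPS to the Web server, DNS to the DNS server.
ACL = [
    Rule("permit", "tcp", "192.0.2.10/32", 443),
    Rule("permit", "udp", "192.0.2.53/32", 53),
    Rule("permit", "tcp", "192.0.2.53/32", 53),
    Rule("deny", "any", "0.0.0.0/0", None),   # explicit default deny
]


def evaluate(protocol: str, dst_ip: str, dst_port: int, acl=ACL) -> str:
    """First-match evaluation using only fields from the packet itself."""
    addr = ipaddress.ip_address(dst_ip)
    for rule in acl:
        if rule.protocol not in ("any", protocol):
            continue
        if addr not in ipaddress.ip_network(rule.dst):
            continue
        if rule.dst_port is not None and rule.dst_port != dst_port:
            continue
        return rule.action
    return "deny"  # nothing matched: fall back to deny
```

For example, `evaluate("tcp", "192.0.2.10", 443)` is permitted, while `evaluate("udp", "192.0.2.10", 123)` is denied. Because the function keeps no per-connection memory, its cost per packet does not grow with the number of sources an attacker uses.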
Out-of-band (OOB) management capabilities and edge-to-edge visibility into all network traffic are crucial to maintaining situational awareness and control when under attack.
Flow telemetry, such as NetFlow and IPFIX, should be exported from edge routers and layer-3 switches to provide visibility into all traffic ingressing, egressing, and traversing the network. All network edges should be instrumented. Flow telemetry collection and analysis allows network operators to detect, classify, and trace back DDoS attack traffic in real time.
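As a rough sketch of the detect-and-trace-back step, the code below aggregates flow records per destination over one collection interval and flags destinations whose packet or bit rate exceeds a limit. The `FlowRecord` fields are a simplified, hypothetical subset of what NetFlow or IPFIX exporters provide, and the fixed thresholds stand in for the baselining a real analysis system would do.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class FlowRecord(NamedTuple):
    src: str       # source IP address
    dst: str       # destination IP address
    packets: int   # packets seen in the interval
    octets: int    # bytes seen in the interval


def detect(flows: Iterable[FlowRecord], interval_s: float,
           pps_limit: float, bps_limit: float) -> dict:
    """Aggregate flows per destination and flag those over either limit."""
    totals = defaultdict(lambda: {"packets": 0, "octets": 0, "sources": set()})
    for f in flows:
        t = totals[f.dst]
        t["packets"] += f.packets
        t["octets"] += f.octets
        t["sources"].add(f.src)

    alerts = {}
    for dst, t in totals.items():
        pps = t["packets"] / interval_s
        bps = t["octets"] * 8 / interval_s
        if pps > pps_limit or bps > bps_limit:
            # Keep the contributing sources to support trace-back.
            alerts[dst] = {"pps": pps, "bps": bps,
                           "sources": sorted(t["sources"])}
    return alerts
```

Because the records name both endpoints, the same aggregation that detects an attack also yields the list of sources contributing to it, which is the starting point for classification and trace-back.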
Network infrastructure-based DDoS mitigation techniques such as flowspec and source-based remote triggered blackholing (S/RTBH) allow edge routers and layer-3 switches to be leveraged against DDoS attacks. Along with flow telemetry export, these mechanisms should be supported in all peering- and customer aggregation-edge network infrastructure elements.
Intelligent DDoS mitigation systems (IDMSes) are intended to protect against volumetric, application-layer, and state-exhaustion DDoS attacks. To differentiate between DDoS attack traffic and legitimate user/partner traffic, they incorporate DDoS-specific countermeasures that are either fully stateless or instantiate only minimal, ephemeral state that is quickly shed. IDMSes can typically evaluate all contents of the packet header and payload, reassemble fragmented packets and application-layer messages, and evaluate incoming requests to ensure that they are sourced from legitimate clients rather than DDoS-capable botnets.
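One way to validate a client without holding state is a keyed challenge, similar in spirit to TCP SYN cookies. The sketch below illustrates the general idea only and is not a description of how any IDMS product works; the function names, the 30-second window, and the token length are assumptions made for the example.

```python
import hashlib
import hmac
import os

SECRET = os.urandom(32)   # per-device key (assumed to be rotated in practice)
WINDOW_S = 30             # hypothetical validity window for a token, in seconds


def _mac(client_ip: str, client_port: int, epoch: int, key: bytes) -> str:
    # Bind the token to the client's address, port, and a coarse time window.
    msg = f"{client_ip}|{client_port}|{epoch}".encode()
    return hmac.new(key, msg, hashlib.sha256).hexdigest()[:16]


def issue_token(client_ip: str, client_port: int, now: float,
                key: bytes = SECRET) -> str:
    """Derive a token from the packet and a secret; nothing is stored."""
    return _mac(client_ip, client_port, int(now // WINDOW_S), key)


def verify_token(client_ip: str, client_port: int, token: str, now: float,
                 key: bytes = SECRET) -> bool:
    """Recompute the token for the current and previous window and compare."""
    epoch = int(now // WINDOW_S)
    return any(
        hmac.compare_digest(token, _mac(client_ip, client_port, e, key))
        for e in (epoch, epoch - 1)
    )
```

The token is sent to the claimed source address. A client at a spoofed address never receives it and so cannot echo it back, while a legitimate client can. The defender stores nothing per client until the echoed token verifies, so a flood of unanswered challenges consumes no table space.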
By implementing these BCPs and ensuring that they have the ability to detect, classify, trace back, and mitigate DDoS attacks, organizations can ensure that their public-facing applications, services, and content remain available — even in the face of attack.