Feb 25, 2026 · 6 min read
The `zae-limiter` library (< v0.10.1) creates a single DynamoDB partition key per entity. Because a single DynamoDB partition is physically limited to ~1,000 Write Capacity Units (WCU) per second, an attacker can knock out the rate-limiting service for a specific user (and potentially neighbors on the same shard) simply by flooding it with high-frequency requests.
A deep dive into an architectural flaw in the `zae-limiter` library where the promise of 'infinite scale' collides with the hard reality of DynamoDB physical partition limits. By funneling all rate-limiting state for a single entity into one partition key, the library inadvertently created a 'hot partition' bottleneck. This allows attackers to trigger a denial of service (DoS) simply by exceeding 1,000 write units per second, turning the rate limiter—the very tool designed to prevent floods—into the point of failure.
We love Amazon DynamoDB. It's the serverless darling of the cloud world, promising near-infinite scalability. You just throw JSON at it, and Jeff Bezos catches it, right? Well, not exactly. Beneath the marketing fluff of 'seamless scaling' lies a physical reality: data has to live on actual hard drives. DynamoDB organizes this data into partitions, and these partitions have hard, physical limits. Specifically, a single partition can only handle about 1,000 Write Capacity Units (WCU) per second.
Enter zae-limiter, a Python library designed to handle rate limiting using the Token Bucket algorithm. It stores the state of your buckets (how many tokens you have left) in DynamoDB. The developers, likely assuming DynamoDB handles all the heavy lifting, made a classic architectural blunder: they designed their schema based on logical purity rather than physical constraints.
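For context, here is a minimal in-memory token bucket. It is a sketch of the algorithm itself, not zae-limiter's DynamoDB-backed implementation; the class and parameter names are illustrative:

```python
import time

class TokenBucket:
    def __init__(self, capacity: float, refill_rate: float):
        self.capacity = capacity        # maximum tokens the bucket can hold
        self.refill_rate = refill_rate  # tokens added back per second
        self.tokens = capacity
        self.last_refill = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost  # consume a token: request allowed
            return True
        return False  # bucket empty: request rate limited

# e.g. bursts of up to 100 requests, refilling 10 tokens per second
bucket = TokenBucket(capacity=100, refill_rate=10)
print(bucket.allow())  # True until the bucket drains
```

zae-limiter keeps this same state in DynamoDB instead of memory, which is exactly what drags DynamoDB's physical partition limits into the picture.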
In versions prior to 0.10.1, the library maps all rate-limiting operations for a specific entity—say, a user ID or an API key—to a single Partition Key (PK). This creates what we in the database world call a 'Hot Partition.' It’s like building a stadium with 50,000 seats but only opening one turnstile. It doesn't matter how big the stadium is; if everyone tries to enter at once, people are going to get crushed against the gate.
To understand why this breaks, you have to look at how DynamoDB hashes data. When you write an item, DynamoDB hashes the Partition Key to determine which physical server holds that data. In the vulnerable version of zae-limiter, the schema looked something like this:
```
PK = {namespace}/ENTITY#{entity_id}
```
This is clean. It's logical. It allows you to query everything about user_123 easily. But it is also a death sentence for high-throughput applications. Every time user_123 makes a request, zae-limiter performs a write operation (UpdateItem or TransactWriteItems) to decrement the token count. This consumes WCU.
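To see why one key means one physical home, here is a toy model of hash-based routing. DynamoDB's internal partitioning scheme is not exposed, so `hashlib.md5` and the fixed partition count below are illustrative stand-ins:

```python
import hashlib

# Toy model of hash-based partition routing. DynamoDB's real scheme is
# internal; md5 and a fixed partition count are stand-ins for illustration.
def partition_for(pk: str, num_partitions: int = 16) -> int:
    digest = hashlib.md5(pk.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions

# Every request for user_123 routes to the same physical partition:
print(partition_for("prod/ENTITY#user_123"))  # same index every time
print(partition_for("prod/ENTITY#user_123"))  # ...no matter how often you ask
```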
If user_123—or an attacker impersonating them—pushes traffic beyond 1,000 requests per second, they saturate the physical partition dedicated to that key. DynamoDB doesn't just queue these requests; it rejects them with a ProvisionedThroughputExceededException. The application catches this exception and, usually, fails closed (denying the request) or fails open (ignoring the rate limit). In zae-limiter's case, this results in a targeted Denial of Service. The irony is palpable: the traffic meant to be rate-limited actually destroys the rate-limiter itself.
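In boto3 terms, the hot write path looks roughly like this. This is a sketch, not zae-limiter's actual code; the table name, key schema, and attribute names are assumptions:

```python
import boto3
from botocore.exceptions import ClientError

# "rate_limits" and the PK/SK schema are placeholders for illustration.
table = boto3.resource("dynamodb").Table("rate_limits")

def try_consume(namespace: str, entity_id: str) -> bool:
    """Conditionally decrement the token count for one entity's bucket."""
    pk = f"{namespace}/ENTITY#{entity_id}"
    try:
        table.update_item(
            Key={"PK": pk, "SK": "BUCKET"},
            UpdateExpression="SET tokens = tokens - :one",
            ConditionExpression="tokens >= :one",
            ExpressionAttributeValues={":one": 1},
        )
        return True  # token consumed, request allowed
    except ClientError as err:
        code = err.response["Error"]["Code"]
        if code == "ConditionalCheckFailedException":
            return False  # bucket empty: rate limit enforced
        # ProvisionedThroughputExceededException lands here once the
        # partition saturates; the limiter itself is now the outage.
        raise
```

Note that even a denied request is a write: DynamoDB still charges WCU for failed conditional writes, so every single call, allowed or not, pounds the same partition key.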
Let's look at the architectural diff. The vulnerability isn't a buffer overflow or a missing semicolon; it's a schema design flaw. The fix involved moving from a static, entity-based key to a dynamic, sharded key system.
The Vulnerable Design (Logical View):
All buckets for an entity lived under one roof. If you hit the limit, you hit the wall.
```python
# pseudo-code representation of the PK generation
def get_pk(namespace, entity_id):
    # ALL traffic for this entity hits this single hash
    return f"{namespace}/ENTITY#{entity_id}"
```

The Fix (v0.10.1 - Pre-Shard Buckets):
The developers introduced 'Adaptive Sharding'. Instead of one bucket, the state is split across multiple 'shards'. Clients pick a random shard to write to. If a shard is exhausted or throttled, they try another. This spreads the WCU load across multiple physical partitions.
```python
# The fixed approach distributes load
def get_pk(namespace, entity_id, resource, shard_id):
    # Now we have N shards to write to
    return f"{namespace}/BUCKET#{entity_id}#{resource}#{shard_id}"

# Proactive Sharding Logic
if current_wcu_usage > 0.80 * limit:
    double_shard_count()  # 1 -> 2 -> 4 -> 8
```

This change is massive. It introduces an internal WCU tracker bucket that decrements 1,000 millitokens per write. If this internal tracker nears exhaustion, a background Lambda aggregator (triggered by DynamoDB Streams) proactively doubles the shard count, essentially creating new lanes on the highway before traffic jams occur.
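On the client side, the random-shard-with-fallback behavior described above might look like this (hypothetical helper names; the library's internals may differ):

```python
import random

def try_decrement(pk: str) -> bool:
    """Placeholder for the conditional UpdateItem sketched earlier."""
    raise NotImplementedError

def consume_token(namespace: str, entity_id: str, resource: str,
                  shard_count: int) -> bool:
    # Visit shards in a random order so concurrent writers spread their
    # WCU across multiple physical partitions.
    shards = list(range(shard_count))
    random.shuffle(shards)
    for shard_id in shards:
        pk = f"{namespace}/BUCKET#{entity_id}#{resource}#{shard_id}"
        if try_decrement(pk):
            return True  # found a shard with tokens left
    return False  # every shard empty: request is rate limited
```

The randomized visit order matters: if every client tried shard 0 first, shard 0 would simply become the new hot partition.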
Exploiting this is trivial if you have a valid API key or user session. You don't need fancy shellcode; you just need curl or a Python script.
The Attack Chain:
1. Identify an endpoint protected by zae-limiter. You can often tell by headers like X-RateLimit-Limit.
2. Hammer it with cheap concurrent requests until the backing partition throttles:

```python
import requests
from concurrent.futures import ThreadPoolExecutor

def hammer():
    while True:
        try:
            # Send requests faster than 1/ms
            requests.get("https://target-api.com/resource")
        except Exception:
            pass

# Ignite the threads
with ThreadPoolExecutor(max_workers=50) as executor:
    for _ in range(50):
        executor.submit(hammer)
```

The Result: The backend DynamoDB partition glows red hot. The application starts throwing 503 Service Unavailable or internal server errors as the ProvisionedThroughputExceededException bubbles up. Valid traffic for that user is dropped. Even worse, due to how DynamoDB colocates partitions, this creates a "Noisy Neighbor" effect. Unrelated data that happens to live on the same physical storage node might also see increased latency or throttling.
You might think, "So what? The spammer gets blocked. Working as intended." Not quite. A Denial of Service vulnerability in a security control is a critical failure. If the rate limiter is the first line of defense, knocking it over can leave the gates wide open (fail-open) or permanently shut (fail-closed).
If the system fails open (catches the DB exception and lets the request through), the attacker has successfully bypassed the rate limit entirely, opening the door for scraping, brute-forcing, or further resource exhaustion of the application servers.
If the system fails closed (the default behavior for zae-limiter), the attacker has successfully DoS'd the entity. In a multi-tenant SaaS environment, if an attacker targets a shared tenant ID (like a company-wide API key), they can take down operations for an entire organization. Furthermore, the sheer volume of UpdateItem retries can inflate the victim's AWS bill, since retried writes that eventually succeed still consume paid write units.
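The difference between the two failure modes usually comes down to a single branch in the exception handler. Here is an illustrative sketch; `allow_request`, `limiter`, and `limiter.acquire` are hypothetical names, not zae-limiter's actual API:

```python
from botocore.exceptions import ClientError

FAIL_OPEN = False  # True: bypass the limit on DB errors; False: deny requests

def allow_request(user_id: str, limiter) -> bool:
    # limiter.acquire is a stand-in for the library's token check.
    try:
        return limiter.acquire(user_id)
    except ClientError as err:
        if err.response["Error"]["Code"] == "ProvisionedThroughputExceededException":
            # Hot partition: the rate limiter itself is down. Failing open
            # lets the attacker through; failing closed DoS'es the victim.
            return FAIL_OPEN
        raise
```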
The fix is strictly architectural. You cannot configuration-tune your way out of a hot partition if your schema is fundamentally designed to hot-spot.
Immediate Action: Upgrade to zae-limiter v0.10.1 or higher (`pip install --upgrade 'zae-limiter>=0.10.1'`). This version introduces the "Pre-Shard Buckets" architecture.
How it works: the library pre-creates multiple shard keys per bucket and spreads writes across them, while the internal WCU tracker and the Streams-triggered Lambda aggregator double the shard count before any single partition saturates (see the sharding logic above).

Monitoring: watch CloudWatch for throttled writes on the limiter's table, specifically ProvisionedThroughputExceededException errors and the WriteThrottleEvents metric. If you see this metric spiking even after the patch, your shard_count might not be scaling fast enough, or you are hitting global table limits.
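Below is a minimal CloudWatch check for that signal, assuming the limiter's table is named `rate_limits` (a placeholder; substitute your own table name):

```python
from datetime import datetime, timedelta, timezone

import boto3

cloudwatch = boto3.client("cloudwatch")

# Sum write throttle events on the limiter's table over the last hour.
now = datetime.now(timezone.utc)
resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/DynamoDB",
    MetricName="WriteThrottleEvents",
    Dimensions=[{"Name": "TableName", "Value": "rate_limits"}],  # placeholder
    StartTime=now - timedelta(hours=1),
    EndTime=now,
    Period=300,  # 5-minute buckets
    Statistics=["Sum"],
)
for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"].isoformat(), int(point["Sum"]))
```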
CVSS Vector: `CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:L`

| Product | Affected Versions | Fixed Version |
|---|---|---|
| zae-limiter (zeroae) | < 0.10.1 | 0.10.1 |
| Attribute | Detail |
|---|---|
| CWE ID | CWE-770 (Allocation of Resources Without Limits or Throttling) |
| Attack Vector | Network |
| CVSS | 4.3 (Medium) |
| Impact | Denial of Service |
| Exploit Status | PoC Available |
| Architecture | Serverless / DynamoDB |