process.env.RATE_LIMIT_WINDOW_MS already exists for this. Pull from config so we can tune per-environment without a redeploy.
Same — pull from config. Bonus: per-route overrides.
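A minimal sketch of what I mean — the names (`rateLimitConfig`, `limitFor`) are illustrative, not existing code in this repo:

```javascript
// Centralize limits in config, read from env with sane defaults,
// and allow per-route overrides.
const rateLimitConfig = {
  windowMs: Number(process.env.RATE_LIMIT_WINDOW_MS) || 60_000,
  max: Number(process.env.RATE_LIMIT_MAX) || 100,
  routes: {
    // per-route overrides, e.g. a much stricter login limit
    'POST /login': { max: 5 },
  },
};

function limitFor(route) {
  // route-specific settings win over the global defaults
  return { ...rateLimitConfig, ...rateLimitConfig.routes[route] };
}
```

Tunable per environment, no redeploy needed.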
The get + set below is non-atomic — two parallel requests both read count = 99, both pass the check, and both write back. Under load you'll exceed the limit by however many requests are in flight at the boundary.
Use redis.incr(key) and redis.expire(key, ttl) in a pipeline — INCR itself is atomic, which closes the race. (Note that a plain EXPIRE on every hit refreshes the TTL each request; if you want the expiry set only on the first hit in a window, use a single Lua script or EXPIRE's NX flag on Redis ≥ 7.)
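Rough sketch of the atomic version, assuming an ioredis-style client whose `multi().exec()` returns `[err, result]` pairs. The `FakeRedis` stub is only here to make the snippet self-contained — swap in the real client:

```javascript
// Illustration-only in-memory stand-in for a Redis client.
class FakeRedis {
  constructor() { this.store = new Map(); }
  multi() {
    const ops = [];
    const store = this.store;
    return {
      incr(key) {
        ops.push(() => {
          const n = (store.get(key) || 0) + 1;
          store.set(key, n);
          return n;
        });
        return this;
      },
      expire() { ops.push(() => 1); return this; }, // TTL omitted in the fake
      async exec() { return ops.map(fn => [null, fn()]); },
    };
  }
}

async function hitRateLimit(redis, key, limit, windowSec) {
  const [[, count]] = await redis.multi()
    .incr(key)               // atomic increment — no read-then-write race
    .expire(key, windowSec)  // note: refreshes TTL on every hit; use a Lua
    .exec();                 // script if you need first-hit-only expiry
  return count > limit;      // true → reject with 429
}
```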
Retry-After header.
RFC 6585 says a 429 response may include Retry-After: <seconds> (the header itself is defined in RFC 7231). Set it to the seconds remaining in the window so well-behaved clients back off correctly.
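Something like this, assuming an Express-style `res` and that the window's end time is known at the rejection site (names are illustrative):

```javascript
// Reject with 429 and tell the client when to retry.
function rejectWithRetryAfter(res, windowEndMs) {
  const retryAfterSec = Math.max(1, Math.ceil((windowEndMs - Date.now()) / 1000));
  res.set('Retry-After', String(retryAfterSec)); // seconds until the window resets
  res.status(429).send('Too Many Requests');
}
```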
If Redis is unavailable, redis.get throws and the request 500s — surprising and bad. Wrap in try/catch and pick a strategy:
• Fail-open (allow on Redis error) — simpler, but lets through unbounded traffic during an outage.
• Fail-closed (deny on Redis error) — safer, but ties uptime of the API to Redis.
Pick one, document it, and emit a metric so we know when it's happening.
A fixed window lets through up to 2× the limit at the window boundary by design (a full budget at the end of one window plus a full budget at the start of the next). A sliding window is ~30 lines more code and far smoother. Not blocking, but worth a follow-up ticket.
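For the follow-up ticket, the shape of a sliding-window limiter — in-memory here so the sketch is self-contained; the same idea maps onto a Redis sorted set (ZADD the timestamp, ZREMRANGEBYSCORE to drop old entries, ZCARD to count):

```javascript
// Sliding window: count only the hits inside the trailing windowMs.
class SlidingWindowLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.hits = new Map(); // key -> array of hit timestamps
  }
  allow(key, now = Date.now()) {
    // drop hits older than the window, then check the remaining count
    const recent = (this.hits.get(key) || []).filter(t => now - t < this.windowMs);
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(key, recent);
    return true;
  }
}
```

No boundary burst: the budget drains continuously instead of resetting all at once.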