From 1.4 MB PNGs to 6 KB WebPs: Cutting Mobile LCP With Lambda@Edge and Origin Shield


We just spun up dailybedtimestory.com — a sister site to kidsgamesapp.com that publishes a new bedtime story every night, with a phone-first cinematic carousel. About a week after launch I reloaded it on my phone and it was painfully slow.

I checked the network tab. The hero image was 1.4 MB. The thumbnail rail rendered each card at 64 × 84 pixels — and was loading those same 1.4 MB PNGs. We were shipping ~10 MB of pixels to a phone for a screen that needed maybe 200 KB.

This is the story of how we cut that down to 6 KB per thumbnail and ~50 KB for the hero, and the two non-obvious CloudFront things that almost made the fix backfire.


What we started with

The puzzle illustrations live in S3 (aws-platform-puzzle-images) behind a CloudFront distribution at images.kidsgamesapp.com. They’re generated as PNGs at 1024 × 1024 — beautiful, but huge. The original architecture was straight pass-through: <img src="https://images.kidsgamesapp.com/puzzles/dogs/abc.png"> on every page.

The mobile homepage rendered seven of those at thumbnail size in the hero rail, plus one full-bleed in the cinematic. That’s eight images × 1.4 MB = ~11 MB just to look at the hero.

We had loading="lazy" on most of them. It didn't matter: Safari's lazy-load distance threshold is roughly two viewports below the fold, so they all fired anyway.

A Lighthouse run on a phone-emulated profile showed:

  • LCP: 4.8s
  • Total transfer: 12.3 MB (image: 11.6 MB)
  • Time to Interactive: 6.1s

The fix was obvious in shape — serve smaller, modern-format variants on demand — but had to keep one constraint: we didn’t want to touch the puzzle-image upload pipeline. The original PNGs are immutable artifacts referenced by other parts of the system. Whatever we built had to be additive, on top of the existing CDN.

The plan: Lambda@Edge image transform

CloudFront has a hook for exactly this: a Lambda@Edge function attached to origin-response. The flow we wanted:

Browser → CloudFront edge → Lambda@Edge (resize + WebP via sharp) → return

URL contract: <original-url>?w=NNN&fmt=webp. URLs without query params keep returning the original PNG, so the change is fully backward-compatible — anyone consuming the CDN today (kidsgamesapp.com itself, the puzzle game, blog post heroes) sees no behavior change unless they opt in.

The Lambda is small. Node 20, sharp 0.33, @aws-sdk/client-s3:

import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3';
import sharp from 'sharp';

const s3 = new S3Client({ region: 'us-east-1' });
const BUCKET = 'aws-platform-puzzle-images';

export const handler = async (event) => {
  const { request, response } = event.Records[0].cf;
  if (!response.status?.startsWith('2')) return response;

  const params = new URLSearchParams(request.querystring || '');
  const w = parseInt(params.get('w') || '0', 10);
  const fmt = params.get('fmt')?.toLowerCase();
  if (!w && !fmt) return response;

  const key = decodeURIComponent(request.uri).replace(/^\//, '');
  const obj = await s3.send(new GetObjectCommand({ Bucket: BUCKET, Key: key }));
  const buffer = Buffer.from(await obj.Body.transformToByteArray());

  let pipeline = sharp(buffer);
  if (w) pipeline = pipeline.resize({ width: w, withoutEnlargement: true });
  if (fmt === 'webp') pipeline = pipeline.webp({ quality: 82 });

  const transformed = await pipeline.toBuffer();
  // Lambda@Edge caps generated origin-response bodies at 1 MB, and the cap
  // applies to the base64-encoded body — so stay under ~750 KB of binary.
  // Oversized variants fall back to the original object.
  if (transformed.length > 750_000) return response;

  return {
    status: '200',
    headers: {
      // Resize-only requests stay PNG, so don't hard-code image/webp.
      'content-type': [
        { key: 'Content-Type', value: fmt === 'webp' ? 'image/webp' : 'image/png' },
      ],
      'cache-control': [
        { key: 'Cache-Control', value: 'public, max-age=31536000, immutable' },
      ],
    },
    body: transformed.toString('base64'),
    bodyEncoding: 'base64',
  };
};
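
Before wiring it into the distribution, a local smoke test caught most of my mistakes. A minimal sketch, assuming the handler lives in index.mjs and you have credentials for the real bucket; the origin-response event shape is the part that matters:

// smoke-test.mjs (hypothetical): run with `node smoke-test.mjs` on Node 20.
import { handler } from './index.mjs';

// Minimal origin-response event: only the fields the handler reads.
const event = {
  Records: [{
    cf: {
      request: { uri: '/puzzles/dogs/abc.png', querystring: 'w=160&fmt=webp' },
      response: { status: '200', headers: {} },
    },
  }],
};

const out = await handler(event);
console.log(out.status, out.headers['content-type'][0].value,
  Buffer.from(out.body, 'base64').length, 'bytes');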

Three gotchas in shipping it:

  1. Lambda@Edge requires the function to be in us-east-1, published (versioned), and the IAM role must trust both the lambda and edgelambda service principals (a Terraform sketch of this wiring follows the cache-policy snippet below). Get any of these wrong and Terraform applies cleanly but CloudFront refuses the association at deploy time.

  2. Sharp’s binary needs to match the Lambda runtime. npm install sharp on a Mac gets you the wrong binary. The right incantation is npm install --include=optional --os=linux --libc=glibc --cpu=x64 --no-save sharp@0.33.5. We bundle the native bits in the deployment zip — about 11 MB compressed, well under the 50 MB Lambda@Edge limit for origin events.

  3. The cache policy needs to whitelist w and fmt in the cache key. Otherwise CloudFront strips them before they reach the cache key calculation, and every variant collides on the same cache entry. The default cache policy doesn’t do this.

# Enclosing resource sketched for context; the query-string block is the part that matters.
resource "aws_cloudfront_cache_policy" "image_variants" {
  name = "image-variants" # name illustrative

  parameters_in_cache_key_and_forwarded_to_origin {
    cookies_config { cookie_behavior = "none" }
    headers_config { header_behavior = "none" }

    query_strings_config {
      query_string_behavior = "whitelist"
      query_strings {
        items = ["w", "fmt"]
      }
    }
  }
}
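
For gotcha 1, the wiring looks roughly like this: a minimal Terraform sketch, with illustrative resource names and an assumed aws.us_east_1 provider alias.

# Execution role trusting BOTH service principals (gotcha 1).
data "aws_iam_policy_document" "edge_assume" {
  statement {
    actions = ["sts:AssumeRole"]
    principals {
      type        = "Service"
      identifiers = ["lambda.amazonaws.com", "edgelambda.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "image_transform_edge" {
  name               = "image-transform-edge" # illustrative
  assume_role_policy = data.aws_iam_policy_document.edge_assume.json
}

resource "aws_lambda_function" "image_transform" {
  provider      = aws.us_east_1     # Lambda@Edge must live in us-east-1
  publish       = true              # associations need a published version
  function_name = "image-transform" # illustrative
  role          = aws_iam_role.image_transform_edge.arn
  runtime       = "nodejs20.x"
  handler       = "index.handler"
  filename      = "lambda.zip"      # the zip with the linux-x64 sharp binary
}

The distribution's cache behavior then has to reference aws_lambda_function.image_transform.qualified_arn (the versioned ARN), because Lambda@Edge associations only accept published versions.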

We deployed it. Verified on a sample image:

Variant              Bytes        vs original
Original PNG         1,409,364    —
?w=160&fmt=webp      6,206        −99.6%
?w=480&fmt=webp      26,932       −98.1%
?w=900&fmt=webp      52,424       −96.3%
?w=1280&fmt=webp     59,606       −95.8%

I shipped the call sites — every <img> and background-image URL in the bedtime site got piped through an optimizedImage(url, { w, fmt }) helper. Built. Deployed. Reloaded the homepage on my phone.

It was still slow.

The cold-start surprise

Lambda@Edge runs near whichever edge handled the request: origin-facing functions execute at one of CloudFront's 13 regional edge caches, and each one has its own cache. The first time anyone in a region requests a particular (image, w, fmt) tuple, Lambda has to spin up sharp, fetch from S3, transform, and return. That's a 1.6-second cold start, per region, per variant.

For a homepage with seven thumbnails plus a hero, that's eight unique tuples. If even half are cold for the user's nearest edge, that's ~5 seconds of stacked cold-Lambda time before any image renders. Browsers fetch only around six resources per host in parallel, so the wall time was more like 2-3 seconds, but that's still painfully visible on the LCP image.

The naive answer is “warm them after deploy.” We added a script:

# scripts/warm-edges.sh
SLUGS=$(curl -s "$API/recent?days=14" | jq -r '.data.editions[].stories[].slug' | head -60)
for img in $(... extract today's image URLs ...); do
  for w in 192 200 360 560 640 900 1024; do
    curl -s -o /dev/null "${img}?w=${w}&fmt=webp" &
  done
done
wait

Run after every S3 sync. Helped — but only for the regional edge that actually handled my deploy host’s requests (us-west). Visitors at other edges still saw cold starts.

Origin Shield

This is where the architecture changed. CloudFront has a feature called Origin Shield: you designate one regional edge as a shared cache between all global edges and your origin. The flow becomes:

Browser → Edge (cache miss) → Origin Shield (us-east-1) → S3 origin
                              └─ Lambda@Edge runs HERE, and all other
                                 global edges share this cache

The critical detail: Lambda@Edge attached to origin-response runs at the shield, not at every edge. Each (image, variant) tuple cold-starts once globally — at the shield — instead of once per region. After that, the variant is cached at the shield and any edge that asks gets it in ~150 ms.

In Terraform it’s two lines on the existing distribution’s origin block:

origin {
  domain_name              = aws_s3_bucket.puzzle_images.bucket_regional_domain_name
  origin_id                = "puzzle-images-s3"
  origin_access_control_id = aws_cloudfront_origin_access_control.images.id

  origin_shield {
    enabled              = true
    origin_shield_region = "us-east-1"
  }
}

The same change went on the dailybedtimestory.com distribution (its origin is the static-site S3 bucket — same cold-edge problem, different content).

After applying it: cold-edge story-page TTFB went from 1.4-3.7 seconds to 175-370 ms in our test sweep across five never-visited story pages. The warming script now warms the shield once and every edge benefits.

The image still wouldn’t load

I'd shipped the pipeline. Numbers looked great. The site was demonstrably faster on my desktop. Then a user opened it on a phone and reported a broken image: seemingly random, never the same one, a different image on each refresh.

Console:

Access to image at 'https://images.kidsgamesapp.com/.../abc.png?w=900&fmt=webp'
from origin 'https://dailybedtimestory.com' has been blocked by CORS policy:
No 'Access-Control-Allow-Origin' header is present on the requested resource.

A preload for '...' is found, but is not used because the request credentials
mode does not match. Consider taking a look at crossorigin attribute.

Two errors that turn out to be the same bug.

The CORS function on images.kidsgamesapp.com was working: when a request arrives with an Origin header, it echoes the origin if it’s allowlisted. We had https://dailybedtimestory.com in the allowlist. Server-side, every request tested green.

But here’s the failure pattern:

  1. The browser sees <link rel="preload" as="image" href="..."> with no crossorigin attribute, so it sends the preload request without an Origin header.
  2. The CORS function sees no Origin and doesn’t add Access-Control-Allow-Origin. The response comes back without CORS headers.
  3. The browser caches that response in its local HTTP cache, keyed by URL.
  4. The page then renders <img src="..." crossorigin="anonymous"> — same URL.
  5. Browser tries to reuse the local cache entry. CORS check: cached response has no ACAO header. Fail. Image breaks.

The whole sequence is invisible from the server side, because every individual request, taken alone, looks legitimate.

The fix is two complementary changes.

Change 1: match the preload’s credentials mode to the <img>. If your <img> has crossorigin="anonymous", your preload needs it too:

-<link rel="preload" as="image" href={ogImage} fetchpriority="high" />
+<link rel="preload" as="image" href={ogImage} crossorigin="anonymous" fetchpriority="high" />

This stops the bug for new visitors. But existing browser caches are still poisoned — there’s no way to reach into someone’s phone and clear their HTTP cache.

Change 2: harden the CORS function so cacheable responses always carry CORS headers. Echo allowlisted origins; fall back to * for missing/unknown:

function handler(event) {
  var response = event.response;
  // Read defensively: the cloudfront-js-1.0 runtime doesn't support optional chaining.
  var originHeader = event.request.headers.origin;
  var origin = originHeader && originHeader.value;

  var allowed = {
    'https://kidsgamesapp.com': true,
    'https://dailybedtimestory.com': true,
    // ...
  };

  // Default permissive — prevents browser-cache poisoning.
  var allowOrigin = '*';
  if (origin && allowed[origin]) {
    allowOrigin = origin;
  }

  response.headers['access-control-allow-origin'] = { value: allowOrigin };
  response.headers['access-control-allow-methods'] = { value: 'GET, HEAD' };
  response.headers['vary'] = { value: 'Origin' };
  return response;
}

* is safe here — these are public images, no credentials, no PII. Allowlisted origins still get a specific echo so CORS-with-credentials would work if we ever needed it.
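
To verify both paths from the shell (image URL illustrative):

# No Origin header: expect Access-Control-Allow-Origin: *
curl -sI 'https://images.kidsgamesapp.com/puzzles/dogs/abc.png?w=160&fmt=webp' \
  | grep -iE '^(access-control|vary)'

# Allowlisted Origin: expect it echoed back verbatim
curl -sI -H 'Origin: https://dailybedtimestory.com' \
  'https://images.kidsgamesapp.com/puzzles/dogs/abc.png?w=160&fmt=webp' \
  | grep -iE '^(access-control|vary)'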

Busting the browser cache without a CDN purge

Even with both fixes deployed, anyone who visited during the broken window had a poisoned local cache entry. CloudFront invalidations don’t clear browser caches. We needed to force a refetch from the browser side.

The trick: add a query param the cache policy ignores.

const IMAGE_VERSION = '2';

export function optimizedImage(url, { w, fmt = 'webp' }) {
  const params = new URLSearchParams();
  if (w) params.set('w', String(w));
  params.set('fmt', fmt);
  params.set('v', IMAGE_VERSION); // ← cache-bust for browsers
  return `${url}?${params.toString()}`;
}
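
A call site then builds every variant from one original; a hypothetical example with illustrative widths:

// Hypothetical call site: one original PNG, a responsive srcset of variants.
const hero = 'https://images.kidsgamesapp.com/puzzles/dogs/abc.png';
const srcset = [160, 480, 900]
  .map((w) => `${optimizedImage(hero, { w })} ${w}w`)
  .join(', ');
// → .../abc.png?w=160&fmt=webp&v=2 160w, .../abc.png?w=480&fmt=webp&v=2 480w, ...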

The cache policy whitelists only w and fmt. CloudFront strips everything else from the cache key before it gets stored. So:

  • From the browser’s perspective: ?w=900&fmt=webp&v=1 and ?w=900&fmt=webp&v=2 are different URLs → fresh fetch.
  • From CloudFront’s perspective: both map to the same cache key (w=900&fmt=webp) → edge serves the already-warm response → no cold-Lambda penalty on the deploy.

This is the cleanest cache-bust pattern I’ve found. Bump IMAGE_VERSION on any future change that needs to invalidate browser caches; the CDN side is unaffected. Best of both worlds.

Where we landed

Real numbers from the live site, measured 2026-05-08:

Metric                                                 Before              After
Hero image bytes (mobile)                              1,409,364           52,424
Total above-the-fold image bytes (homepage cinematic)  ~11 MB              ~120 KB
Cold-edge HTML TTFB                                    1.4-3.7 s           175-370 ms
Warm-edge image TTFB                                   n/a (didn't exist)  150-300 ms
LCP on a phone-emulated cold profile                   4.8 s               0.9 s

Plus the things you don’t see in numbers:

  • Originals stay clean in S3. No upload-pipeline change. The whole transform is additive at the CDN layer.
  • New widths cost nothing — the next time we want a w=400 thumbnail variant, we just add ?w=400 to the URL. Lambda generates it on first request, shield caches it, edges serve it forever.
  • IMAGE_VERSION = '2' sits ready to bust browser caches the next time we ship a response-header change.

What I’d do differently

Add Origin Shield from day one. We shipped the Lambda first, measured, then realized the cold-start problem and added the shield as a follow-up. The shield is a small config block with no downside we've seen; there's no reason not to ship it together with the Lambda.

Audit <link rel="preload"> and <img crossorigin> together. I had crossorigin="anonymous" on every <img> for legitimate reasons (canvas access elsewhere in the app). I added <link rel="preload"> later, didn’t think about credentials mode, and quietly poisoned every browser that visited during that window. If you have one with crossorigin, the other needs it too.

Default CORS responses to *, not silence. “No header” is the worst answer for a cacheable response. Always set something. * is fine for public images; echo for credentialed responses; never empty.

The full source for the Lambda and the Terraform that wires it up lives in our aws-platform repo. The site-side helper is two functions in one file. Most of the win came from configuration choices, not code — knowing which CloudFront knob solves which problem is the actual skill.