Frequently asked questions

What does WCAG 1.2.5 actually require?

It requires audio description for all prerecorded video in synchronized media at Level AA. A second narration track describes important visual details — actions, scene changes, on-screen text — during pauses in the dialogue, so people who can't see the screen still get the full story. If the soundtrack already explains everything on screen, no extra description is needed.

How is 1.2.5 different from 1.2.3?

SC 1.2.3 (Level A) lets you choose either audio description or a full text alternative. SC 1.2.5 (Level AA) removes the choice: it specifically requires audio description. If you satisfied 1.2.3 with a transcript instead of description, you have new work to do for AA.

Do captions or a transcript satisfy 1.2.5?

No. Captions serve deaf and hard-of-hearing users (that's SC 1.2.2). A transcript can satisfy the Level A criterion but not 1.2.5 at AA. 1.2.5 is specifically about narrating visual information for blind and low-vision users, which is a different need.

What if my video has no pauses to fit description into?

When natural dialogue pauses are too short, you have options: write the script so the speaker narrates key visuals, provide an alternative described version, or use extended audio description (which briefly pauses the video). Extended description is its own Level AAA criterion (1.2.7), but it is one accepted way to meet 1.2.5.

Does 1.2.5 apply to live video or audio-only content?

No. 1.2.5 covers prerecorded video with a synchronized soundtrack. Live video is handled elsewhere, and audio-only content has no visual track to describe. Decorative or purely background video with no meaningful visual information also falls outside the requirement.

WCAG 1.2.5 Audio Description (Prerecorded) (Level AA)

WCAG 1.2.5 Audio Description (Prerecorded) requires that every prerecorded video with a soundtrack include an audio description — extra narration, slotted into the pauses in dialogue, that describes the important visual details a blind or low-vision viewer can’t see: actions, characters, scene changes, and any on-screen text. It is a Level AA requirement.

What 1.2.5 actually requires

The official wording is short: “Audio description is provided for all prerecorded video content in synchronized media.” Synchronized media just means video with a matching soundtrack — a product demo, a testimonial, a how-to clip, a homepage hero video with narration.

The key word is describe. A sighted viewer watching your explainer sees the presenter point at a chart, watches a price appear on screen, sees a hand swipe a card through a reader. A blind viewer hears only the soundtrack. If the soundtrack doesn’t mention those visuals, that information is simply lost. Audio description fills the gap with a second narration track that speaks the visuals aloud during the natural quiet moments between lines of dialogue.

There’s an important escape hatch the W3C makes explicit: if all the information in the video track is already conveyed by the existing audio, no separate description is necessary. A talking-head clip where the speaker says everything that matters may already pass.

Who it affects

This criterion exists for people who are blind or have low vision and cannot perceive the visual track. Many of them navigate your site with screen readers like NVDA, JAWS, or VoiceOver, and when they reach a video they hear the soundtrack but nothing else. People with certain cognitive disabilities who struggle to interpret fast-moving visual scenes also benefit from a clear spoken description.

Think about what that means for a small business. A blind customer reaches your “How our service works” video. The narrator says, “It’s this easy.” On screen, three steps and a phone number appear — but they’re never spoken. To that customer, the video conveyed nothing.

1.2.5 vs. the rest of the 1.2 family

The Guideline 1.2 criteria are easy to confuse, so here’s the map:

Criterion	Level	Serves	What it requires
1.2.2 Captions	A	Deaf / hard of hearing	Synchronized captions of the audio
1.2.3 AD or Media Alternative	A	Blind / low vision	Audio description or a full text alternative (your choice)
1.2.5 Audio Description	AA	Blind / low vision	Audio description required — no text-only shortcut
1.2.7 Extended Audio Description	AAA	Blind / low vision	Description even when pauses are too short

The trap is 1.2.3. At Level A you can satisfy it cheaply with a transcript. But the moment you target WCAG 2.1 AA — the level courts and demand letters point to — 1.2.5 kicks in and the transcript no longer counts. You owe an actual described track. Per the W3C Understanding document, if you chose the description option for 1.2.3 you’re already done; if you chose text, this is new work.

Concrete failures and how to fix them

Failure 1 — On-screen text that’s never spoken. A testimonial video flashes the customer’s name, title, and a “$2,400 saved” stat on screen. None of it is in the audio. The fix is to add narration in the gap: “On screen: Maria Lopez, owner, Lopez Bakery — saved $2,400 in year one.”

Failure 2 — Silent visual action. A how-to video shows hands assembling a product with no commentary. A blind viewer hears nothing meaningful. The cleanest fix for small teams is integrated description — re-record or script the voiceover so the steps are spoken as they happen, removing the need for a separate track entirely.

Failure 3 — No pauses to fit description into. Wall-to-wall dialogue leaves no room. Use extended audio description, which briefly freezes the video while the narration plays.
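A rough way to tell whether a video falls into this last case is to measure the silences between its caption cues. The sketch below is illustrative only: the function name `find_description_gaps`, the two-second `min_gap` threshold, and the sample timings are assumptions, not anything WCAG specifies. It takes dialogue intervals in seconds and returns the pauses long enough to hold a short description.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_seconds, end_seconds)


def find_description_gaps(
    dialogue: List[Interval],
    video_length: float,
    min_gap: float = 2.0,  # assumed minimum pause worth describing in
) -> List[Interval]:
    """Return the pauses between dialogue cues that last at least min_gap seconds."""
    gaps = []
    cursor = 0.0  # end of the speech heard so far
    for start, end in sorted(dialogue):
        if start - cursor >= min_gap:
            gaps.append((cursor, start))
        cursor = max(cursor, end)  # tolerate overlapping cues
    # Also consider the stretch between the last line and the end of the video.
    if video_length - cursor >= min_gap:
        gaps.append((cursor, video_length))
    return gaps


# Hypothetical caption timings for a 20-second clip.
dialogue = [(0.0, 4.0), (7.5, 12.0), (12.5, 19.0)]
gaps = find_description_gaps(dialogue, video_length=20.0)
print(gaps)  # [(4.0, 7.5)]
```

An empty result suggests a standard description track will not fit and that integrated or extended description is the more realistic route. A person still has to confirm each gap is genuinely quiet, since caption timings do not always account for music or sound effects.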

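Technically, you can deliver a described version a few ways. One low-friction approach is a WebVTT description track wired to a player that can read it aloud (browsers' built-in players generally do not voice description tracks on their own, so confirm your player supports them):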

<video controls>
  <source src="demo.mp4" type="video/mp4">
  <track kind="captions" src="demo-captions.vtt" srclang="en" label="English">
  <track kind="descriptions" src="demo-descriptions.vtt" srclang="en" label="Descriptions">
</video>

WEBVTT

00:00:04.000 --> 00:00:07.000
On screen: three steps appear — Book, Build, Done.

00:00:12.500 --> 00:00:15.000
A phone number, 555-0100, displays at the bottom.

Other accepted methods from the W3C media guidance include a second selectable audio track with descriptions mixed in, or a separate described version. Write descriptions in present tense, active voice, and describe objectively — say what’s on screen, don’t interpret it.
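Before recording or publishing a description file, it can help to check that each cue is short enough to be spoken inside its time window. The following is a minimal sketch, not a full WebVTT parser: it handles only simple cues like the ones above, and the 160 words-per-minute speaking rate, the function names, and the sample cues are assumptions chosen for illustration.

```python
import re

# Matches "hh:mm:ss.ttt --> hh:mm:ss.ttt"; the hours part is optional in WebVTT.
CUE_TIMING = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
)


def _seconds(hours, minutes, seconds, millis):
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_cues(vtt_text):
    """Return (start, end, text) for each cue in a simple WebVTT file."""
    cues = []
    for block in re.split(r"\n\s*\n", vtt_text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = CUE_TIMING.match(line.strip())
            if match:
                parts = match.groups()
                text = " ".join(lines[i + 1:]).strip()
                cues.append((_seconds(*parts[:4]), _seconds(*parts[4:]), text))
                break
    return cues


def cues_too_dense(cues, words_per_minute=160):  # assumed narration speed
    """Return cues whose text probably cannot be spoken within their window."""
    flagged = []
    for start, end, text in cues:
        needed = len(text.split()) * 60 / words_per_minute
        if needed > end - start:
            flagged.append((start, end, text))
    return flagged


# Hypothetical description file.
sample = """WEBVTT

00:00:04.000 --> 00:00:07.000
Three steps appear: Book, Build, Done.

00:00:12.500 --> 00:00:14.000
The presenter points at a chart showing sales doubling over two years.
"""

cues = parse_cues(sample)
flagged = cues_too_dense(cues)
print([start for start, _, _ in flagged])  # [12.5]
```

A flagged cue is a prompt to shorten the wording or move it to a longer pause, not a verdict. Numbers, names, and synthetic voices can all take more or less time than a plain word count suggests.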

How to test for 1.2.5

You can’t catch this with a scanner alone — no automated tool can judge whether your video’s visuals are conveyed by its audio. It takes a person. Here’s the manual check:

1. Inventory every prerecorded video with sound on the site.
2. Close your eyes and listen to each one start to finish. Anything you miss (text, gestures, scene changes, results) is unaddressed visual information.
3. Check for a described track or version. Look for a kind="descriptions" track, a selectable “Described” audio option, or a clearly linked described version.
4. Confirm the description actually fits. If the dialogue never pauses, a standard description track can’t work; you need integrated or extended description.
5. Flag exemptions honestly. A talking-head clip with no meaningful on-screen visuals can pass, but be sure that’s truly the case, not wishful thinking.

This is exactly the kind of judgment-based testing a thorough accessibility audit covers and an automated overlay cannot.
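A script can still shorten the inventory and track-check steps, even though it cannot make the judgment call. The sketch below uses Python's standard `html.parser` to list `<video>` elements that lack a `kind="descriptions"` track. The class name `DescriptionTrackAudit` and the sample markup are hypothetical, and the result is only a list of videos to review by ear: a video with no such track may still pass through integrated narration, a described audio option, or a separately linked described version.

```python
from html.parser import HTMLParser


class DescriptionTrackAudit(HTMLParser):
    """Collect each <video> and whether it contains a <track kind="descriptions">."""

    def __init__(self):
        super().__init__()
        self.videos = []      # one dict per <video> element found
        self._current = None  # the <video> currently open, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "video":
            self._current = {"src": attrs.get("src"), "has_descriptions": False}
            self.videos.append(self._current)
        elif self._current is not None:
            if tag == "source" and self._current["src"] is None:
                # Use the first <source> as the video's identifier.
                self._current["src"] = attrs.get("src")
            elif tag == "track" and (attrs.get("kind") or "").lower() == "descriptions":
                self._current["has_descriptions"] = True

    def handle_endtag(self, tag):
        if tag == "video":
            self._current = None


# Hypothetical page markup: one video with a description track, one without.
page = """
<video controls>
  <source src="demo.mp4" type="video/mp4">
  <track kind="captions" src="demo-captions.vtt" srclang="en" label="English">
  <track kind="descriptions" src="demo-descriptions.vtt" srclang="en" label="Descriptions">
</video>
<video controls src="testimonial.mp4">
  <track kind="captions" src="testimonial-captions.vtt" srclang="en" label="English">
</video>
"""

audit = DescriptionTrackAudit()
audit.feed(page)
needs_review = [v["src"] for v in audit.videos if not v["has_descriptions"]]
print(needs_review)  # ['testimonial.mp4']
```

The reverse also holds: finding a description track says nothing about whether its cues cover the visuals that matter, so every video on the list and off it still needs the eyes-closed listen.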

Why this matters legally

Video accessibility is no longer a fringe issue in ADA web litigation. Inaccessible video is a recurring claim in demand letters, and analysts have tracked the share of cases mentioning video climbing year over year — UsableNet documented a stretch where nine of twenty-one ADA suits in one period involved inaccessible video, and a single plaintiff firm filed dozens of video-focused cases in a few months. The landmark precedents — the National Association of the Deaf cases against Harvard and MIT over online video — put video squarely inside ADA reach.

There’s also a parallel regulatory track. Under the Twenty-First Century Communications and Video Accessibility Act, the FCC requires major broadcasters and large cable systems to air 87.5 hours of audio-described programming per quarter in covered markets, with the rules phasing into more markets through 2035. That mandate covers TV, not your website — but it shows audio description is a settled, expected accommodation, not a novel ask.

Courts and the DOJ treat WCAG 2.1 AA as the practical yardstick for an accessible site, and 1.2.5 is part of that bar. This is general information, not legal advice — talk to a qualified attorney about your specific exposure.

Getting it fixed

Most small businesses have a handful of videos, not a library, which makes 1.2.5 very tractable. Often the smartest fix is the cheapest: script future videos so the narrator speaks the visuals, and you never need a separate track. For existing videos, we produce description scripts, record the narration, and wire up the track by hand.

Curbcut is deliberately anti-overlay: a widget cannot watch your video and narrate it, so overlays do nothing for 1.2.5. This is real production and code work — the kind our remediation team does directly. Want to know which of your videos fall short? Start with a free scan and we’ll map your media against WCAG 2.1 AA, then fix what’s missing.
