Introduced in the HTML5 specification (now the HTML Living Standard), the figure and figcaption elements are meant to create a meaningful markup structure that:

  • presents content that is relevant to the current document, but not vital to its understanding
  • and is optionally, though often, presented with a descriptive “caption” or “legend” that provides more context to the primary content of the figure.

The figure element

Often thought of as an element to present images or diagrams with captions, the figure element can represent any sort of flow content (code snippets, quotes, tabular data, audio, video, etc.). The content the figure represents is meant to be referred to from the primary content of the document. The information it represents is not meant to be redundant. A figure should be able to be completely removed from the flow of a document (e.g., web page) without any detriment to the understanding of the primary content.

For example, a document outlining the airspeed velocity of an unladen Swallow may have a section explaining the differences between a South African Swallow and a European Swallow. Accompanying this content could be a figure showcasing a side-by-side visual comparison of these birds, so as to complement the information the document outlines. But if this figure were removed, then LOL who cares?

Screenshot of an article using an image, offset from the primary content, to indicate the visual differences between a South African and European Swallow. Text in the image notes the South African Swallow is on the left, and the European Swallow is on the right.

This meta example of a figure and its caption works in reference to the content that preceded it. If I didn't include it, all the information up to this point would still have described the purpose of a figure.

Image source (2003).
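
As a rough sketch, the hypothetical swallow article could mark up its comparison along these lines. The file name, alt text, and caption wording are made up for illustration; only the overall structure matters.

```html
<!-- Primary content that refers to the figure -->
<p>
  South African and European Swallows differ in a few visible ways,
  as the accompanying comparison shows.
</p>

<!-- The figure supplements the text; removing it would not
     hurt the understanding of the paragraph above. -->
<figure>
  <img src="swallow-comparison.jpg"
       alt="Two swallows side by side. The South African Swallow is on the left and the European Swallow is on the right.">
  <figcaption>
    A visual comparison of the two swallows discussed in this section.
  </figcaption>
</figure>
```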

A figure can be used with or without a figcaption. However, without a caption or an accessible name provided some other way (e.g., with the aria-label attribute), the “figureness” of the content may not be overtly communicated by assistive technology. In some cases, AT may not convey any semantics at all if the figure is given no accessible name.

This is not necessarily a bad thing… as there are unfortunately still many developers - and even those that should know better - that use HTML incorrectly.

For instance:

<h2>My Favorite Movies</h2>
<ul>
  <li>
    <figure>
      <picture>
        <img src=fightclub.jpg alt="">
      </picture>
      <figcaption>Fight Club</figcaption>
    </figure>
  </li>
  <!-- more wrong goes here -->
</ul>

The above is an example of how not to use the figure element. The following will hopefully enlighten you on why.

The figcaption element

A figcaption provides a caption, or legend, for the content the figure represents. The figcaption used to provide the accessible name of the figure element, but this often created overly verbose and redundant content. Figure captions can be rather detailed; in some cases they may even contain entire transcripts if audio or video serves as the primary content of the figure. Having an entire transcript as the accessible name of a figure element was an incredibly horrible experience.

So, figcaption is no longer meant to name its parent figure element… though some browsers may still use it as an accessible name source (as of June 2025). But they’re working on it…

If one needs to name a figure element (and not all need names, mind you), aria-label or aria-labelledby attributes would be a more robust and practical method to do so.
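
A minimal sketch of both naming approaches follows. The label text, the id value, and the figure contents are hypothetical; the point is only that the name stays short while the caption is free to be as long as it needs to be.

```html
<!-- Named directly with aria-label -->
<figure aria-label="Swallow comparison">
  <!-- contents of the figure -->
  <figcaption>
    A longer caption that adds context to the contents of the figure.
  </figcaption>
</figure>

<!-- Named with aria-labelledby, pointing at a short title
     inside the caption rather than at the whole caption -->
<figure aria-labelledby="fig-2-title">
  <!-- contents of the figure -->
  <figcaption>
    <strong id="fig-2-title">Figure 2</strong>
    A longer caption that adds context to the contents of the figure.
  </figcaption>
</figure>
```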

The figcaption may be placed before or after the primary contents of the figure; however, it must be a direct child of the figure element.

For instance:

<figure>
  <figcaption>...</figcaption>
  <!-- contents of the figure -->
</figure>

<figure>
  <!-- contents of the figure -->
  <figcaption>...</figcaption>
</figure>

<figure>
  <figcaption>
    <div>
      ...
    </div>
  </figcaption>
  <!-- contents of the figure -->
</figure>

However, the following are invalid per the HTML specification:

<figure>
  <div> <!-- or any other element -->
    <figcaption>...</figcaption>
  </div>
  <!-- content of the figure -->
</figure>

<figure>
  <!-- content of the figure -->
  <div> <!-- or any other element -->
    <figcaption>...</figcaption>
  </div>
</figure>

Should an intervening div really be considered a breakage to the semantic relationship between a figure and its figcaption element? Meh… I’m just relaying the rules, I’m not saying I think this one makes much sense. Anyway…

A figcaption may contain flow content, a category that covers most elements allowed as children of the body element. But, as the element is meant to provide a caption to the figure’s content, brief but descriptive text is preferable. The figcaption should not repeat the content of the figure, or other content within the primary document. And by “should not repeat the content of the figure…” that also means the figcaption is not meant to be used to re-state an image’s alternative text.

A figcaption is not a place to repeat, or a substitute for, image alternative text

Regarding images used within figures, one of the biggest misunderstandings of using a figcaption is that it can be used in lieu of providing alternative text for an image or graphic. I’m going to repeat - if you think it’s OK to use alt="" on your image and then use the figcaption to convey the image’s meaning, you are wrong.

A figcaption is meant to provide a caption or summary to what the figure represents, often relating it back to the document the figure is contained within. It is commonly supposed to convey additional information that may not be directly apparent from just reviewing the content of the figure itself.

For instance, consider a photograph of three people standing in front of a building on a sunny day. The person in the middle is holding a giant pair of scissors.

That’d be a decent description of what people might be able to infer when looking at this hypothetical photograph.

But the caption of this photograph could read: “1996 Ribbon cutting ceremony for the opening of Super Corp USA. From left to right, Donald MacDonald, Sir Robert Cutsalot, and Melody Singer.”

This caption information goes beyond what most people would know from just looking at the photo alone. Thus it wouldn’t be appropriate as simply alternative text - because then only those who would read the alt text would have a full understanding of the image. And it wouldn’t serve to have this text be the alt text and the caption - because that’s redundantly redundant.
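
Put together, the hypothetical photograph might be marked up something like this. The file name is an assumption; the alt text and caption are the ones described above.

```html
<figure>
  <!-- alt describes what can be seen in the photograph -->
  <img src="ribbon-cutting.jpg"
       alt="Three people standing in front of a building on a sunny day. The person in the middle is holding a giant pair of scissors.">

  <!-- the caption adds what cannot be known from looking alone -->
  <figcaption>
    1996 Ribbon cutting ceremony for the opening of Super Corp USA.
    From left to right, Donald MacDonald, Sir Robert Cutsalot,
    and Melody Singer.
  </figcaption>
</figure>
```

The two pieces of text do different jobs, so neither one repeats the other.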

“But why not make the image have an empty alt, and then just write the caption once?” you say, with an innocence that is both endearing, and yet tiring as we should all be aware at this point what providing an empty alt means for images…

If an image is given an empty alt, then the image is treated as decorative and hidden from the accessibility tree - and thus assistive technology. This is not news.

If one were to have a figure, containing an image marked as decorative, then any figcaption provided would effectively be describing nothing. Does that make sense?

Let’s look at a non-image example. Here’s a figure that contains a Sass code snippet:

<figure>
  <pre><code>
    $align-list: center, left, right;

    @each $align in $align-list {
      .txt-#{$align} {
        text-align: $align;
      }
    }
  </code></pre>
  <figcaption>
    This use of Sass's <code>@each</code> control directive will compile to
    three CSS classes: .txt-center, .txt-left, and .txt-right.
  </figcaption>
</figure>

Now, imagine putting an aria-hidden="true" on the pre element that wraps the code snippet. Doing so would remove the ability for screen readers and other assistive technology to parse the content the caption refers to.

That’s what it’s like when putting an empty alt on an image in a figure.
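
To make the comparison concrete, here is a sketch of the two anti-patterns side by side. The file name and the caption text are placeholders; neither block is something to copy.

```html
<!-- Anti-pattern: the code the caption refers to is
     hidden from assistive technology -->
<figure>
  <pre aria-hidden="true"><code>/* code snippet */</code></pre>
  <figcaption>
    A caption describing code that assistive technology cannot reach.
  </figcaption>
</figure>

<!-- Anti-pattern: the same mistake, with an image -->
<figure>
  <img src="photo.jpg" alt="">
  <figcaption>
    A caption describing an image that is treated as decorative.
  </figcaption>
</figure>
```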

The only time a figcaption can be used to name an image

There is a single scenario where one should be able to rely on a figcaption to name an image. That scenario is when an img serves as the only content of a figure and the image has no alt attribute at all.

The HTML Living Standard states this in the algorithm for determining alternative text of an image:

  1. If the image is a descendant of a figure element that has a child figcaption element, and, ignoring the figcaption element and its descendants, the figure element has no flow content descendants other than inter-element whitespace and the img element, then return the contents of the first such figcaption element.

In reality, this is a portion of the spec that no browser ever actually implemented - that is, until this WebKit PR that was merged in June 2025. But for other browsers, at least in this case the image would be exposed as an unnamed image, and one could then infer that the caption at least provides some level of understanding of what the image was meant to convey. I know that’ll get your automated checkers all miffed - finding an image without an alt attribute. This is not the best practice by any means, but at least one can understand the image is present and what the figure’s caption is referring to. Hell, some browsers even allow for AI descriptions of images - so long as someone knows an image is even available.
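
For reference, the scenario the spec describes would look roughly like the following. The file name and caption are hypothetical. Note the image has no alt attribute at all, which is different from alt="", and that the img is the only content of the figure besides the figcaption.

```html
<!-- Not a best practice: shown only to illustrate the
     one case where a figcaption may name an image. -->
<figure>
  <img src="unknown-photo.jpg">
  <figcaption>
    Caption text that would be used as the image's alternative text
    in a browser that implements this part of the spec.
  </figcaption>
</figure>
```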

tldr; stop putting alt="" on images in your figure elements. You’re doing it wrong.

Figures and screen readers

This is the part of the blog post where I used to have a bunch of testing results with browsers and screen readers. All this content was outdated. I suggest you check out Adrian Roselli’s more recent (2025) testing results.

The previous content can be found in the following disclosure widget:

Old content is old

Now that we have an idea of how figures and their captions should be used, how do these elements get exposed to screen readers?

The following testing results were accurate at the time of writing in 2019. But that was before COVID, so does that time even exist? Fortunately, Adrian Roselli has new 2025 testing results. Go look at those if you want up-to-date stuff. Look at the following if you enjoy looking at snapshots in time, which might be fun to compare with, but are otherwise irrelevant if no longer accurate.

Ideally a figure should announce its role and the content of the figcaption as its accessible name. A person should then be able to navigate into the figure and interact with the contents of the figure and figcaption independently. For browsers that don’t fully support figures, like Internet Explorer 11, ARIA’s role="figure" and an aria-label can be used to help increase the likelihood the markup will be recognized by certain screen readers.

Here is a roundup of how tested screen readers, with default settings, expose this information (or don’t) in different browsers:

JAWS 18, 2018, and 2019

JAWS has the best support for announcing native figures and their captions, though support is not perfect or consistent depending on the browser and JAWS’s verbosity settings.

IE11 requires the use of role="figure" and an aria-label or aria-labelledby pointing to the figcaption to mimic native announcements. It’s not surprising that IE11 doesn’t support the native element, as HTML5 Accessibility’s IE11 browser rating will never improve. But at least ARIA can provide the semantics.
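
A sketch of the aria-labelledby variant described here might look like the following. The id value, file name, and alt text are hypothetical, and this reflects the 2019 results only; it has not been re-tested.

```html
<!-- role="figure" supplies the semantics IE11 lacks;
     aria-labelledby reuses the caption as the accessible name -->
<figure role="figure" aria-labelledby="fig-caption">
  <img src="diagram.png" alt="Description of the diagram.">
  <figcaption id="fig-caption">
    Caption for the figure.
  </figcaption>
</figure>
```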

Edge won’t announce the presence of a figure role at all, regardless of whether ARIA is used. This will likely change once Edge switches over to Chromium.

Chrome and Firefox offer similar support, however JAWS (with default verbosity settings) + Chrome will completely ignore a figure (including the content of its figcaption) if an image has an empty alt or is lacking an alt attribute.

This means that the captions that accompany images in various Medium articles are all completely ignored by JAWS paired with Chrome. If JAWS settings are updated to announce all images (e.g., images where an alt attribute or value is not provided), then JAWS should announce these figure captions with Chrome.

Unlike with Chrome, JAWS paired with Firefox will still announce the figure and figcaption if an image has an empty or missing alt, but as the image will be completely ignored, the person using the screen reader will just have to infer that the primary content of the figure was an image.

NVDA

Testing NVDA version 2018.4.1 with IE11, Edge, Firefox 64.0.2 and Chrome 71, there was no trace of figures. The closest indication that something might be there was NVDA + IE11 announcing “edit” prior to the announcement of an image or contents of a figcaption (not that “edit” made any sense though…). Testing patterns with role="figure" did not change the lack of announcements. The contents of figures are still accessible, but no relationship of content and caption will be conveyed.

VoiceOver (macOS)

Testing was performed with Safari (12.0.2) and Chrome (71.0.3578.98) on macOS 10.14.2, with VoiceOver 9.

Safari

When testing with Safari, a figure will have its role announced. A figure’s role will not be announced if it has no accessible name (e.g. no figcaption, aria-label, etc.).

VoiceOver can navigate into the figure and interact with the primary content of the figure and figcaption individually.

Chrome

Though Chrome’s accessibility inspector notes that the semantics of the figure are being exposed, and that the accessible name is provided by its caption, VoiceOver does not locate or announce the presence of the figure as it does with Safari. That is, unless the figure specifically has an aria-label. Using an aria-describedby or aria-labelledby on the figure to point to the figcaption does not expose the figure to VoiceOver. To properly convey figures to VoiceOver, with Chrome, the following markup would be necessary:

<!-- 
  aria-label would need to repeat the content of the figcaption 
  to announce the figure as expected. 
-->
<figure aria-label="Figcaption content here.">
  <!-- figure content -->
  <figcaption>
    Figcaption content here.
  </figcaption>
</figure>

Adding a role="figure" to the figure element, or another element in place of a <figure>, will still require an aria-label to make the role discoverable to VoiceOver with Chrome.
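
As an illustration of the "another element in place of a figure" case, a sketch could look like this. The paragraph standing in for the caption is an assumption, since a figcaption is only valid as a child of a figure element.

```html
<!-- role="figure" on a generic element. Per the 2019 results,
     VoiceOver with Chrome still needed aria-label to find it. -->
<div role="figure" aria-label="Caption for the figure.">
  <!-- figure content -->
  <p>Caption for the figure.</p>
</div>
```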

VoiceOver (iOS 12.1.2)

Testing both Safari and Chrome with VoiceOver, there is no announcement of figures, or a relationship of the figure’s content to its caption. Both <figure> and role="figure" patterns yielded the same results.

TalkBack (7.2 on Android 8.1)

Testing both Chrome (70) and Firefox (63.0.2), there is no announcement of figures, or a relationship of the figure’s content to its caption. Both <figure> and role="figure" patterns yielded the same results.

Narrator & Edge 42 / EdgeHTML 17

Narrator does not announce a figure role at all. However, the native element and role="figure" do have an effect on the manner in which a figure’s content is announced. When a figure has an accessible name, the contents of the figure (e.g. an image’s alt text) and the accessible name of the figure (figcaption content or aria-label) will be announced together. If an image has an empty alt, a figure and its figcaption will be completely ignored.

Wrapping up

Based on the intended use cases for figures and their captions, as well as the current screen reader support for these elements, the following markup pattern should be considered if you want to ensure the semantics are conveyed to as wide an audience as possible:

<figure role="figure" aria-label="repeat figcaption content here">
  <!-- figure content. if an image, provide alt text -->
  <figcaption>
    Caption for the figure.
  </figcaption>
</figure>

<!--
  aria-label for macOS VoiceOver + Chrome
  role="figure" for IE11.

  IE11 needs an accessible name (provided by aria-label).
  If not for the fact VO + Chrome doesn't support an
  accessible name from aria-labelledby, that attribute
  would have been preferred / pointed to an ID on 
  the <figcaption>.
-->

This pattern will ensure that the following pairings will announce the figure role and its caption:

  • JAWS with Chrome, Firefox and IE11.
  • macOS VoiceOver with Safari and Chrome.
  • Edge and Narrator will create a relationship, but won’t announce the figure role.

Presently, mobile screen readers won’t announce figures, nor will Edge unless paired with Narrator (sort of), nor will any browser paired with NVDA. But don’t let these gaps deter you from using the elements as intended by their specifications.

With Edge changing over to Chromium, better support will likely become a reality in the near future. And while NVDA and mobile screen readers don’t announce the semantics, the content remains accessible. Filing bugs is the best we can do for now to usher in change for these gaps.

Thank you to Steve Faulkner for reviewing my tests and this article.

The end(ish)

I’m sure this will need to be updated (again) in the future, but for now…

Dennis from Always Sunny In Philadelphia has some parting words.
Transcript

Dennis: You know that's life. That's just sort of how shit goes. Ha ha ha. Sometimes things just sort of... end.

Further reading

The following are additional resources to learn more about figures and figcaptions, the test page / results I used, as well as bugs that have been filed with JAWS and NVDA: