For the time being, I would still advocate people use robust custom dialogs, such as a11y-dialog, or at the very least ensure their <dialog> elements can fall back to custom dialogs in the event people are not using the most up-to-date browsers. That is until usage stats for browsers that support <dialog> outweigh those that don’t. For instance, it’s awesome that Safari now supports the <dialog> element, but since Safari releases are so tightly coupled with OS updates, not everyone is going to get this update nearly as quickly as those using Firefox, Chrome and Edge.
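As a rough sketch of that kind of fallback, the function below checks whether the element actually exposes showModal() before using it. The name openDialogWithFallback and the openCustomDialog callback are hypothetical; the callback stands in for whatever custom dialog script (a11y-dialog or similar) a page already relies on.

```javascript
// Minimal sketch, not a definitive implementation.
// `openCustomDialog` is a hypothetical callback that opens the page's
// existing custom dialog when native <dialog> support is missing.
function openDialogWithFallback(dialogEl, openCustomDialog) {
  // Native support is inferred from the presence of showModal() on the element.
  if (dialogEl && typeof dialogEl.showModal === 'function') {
    dialogEl.showModal();
    return 'native';
  }
  // No native support: hand the element over to the custom dialog script.
  openCustomDialog(dialogEl);
  return 'custom';
}

// Possible usage in a page (the selector is an assumption):
// openDialogWithFallback(document.querySelector('dialog'), myCustomOpen);
```

Returning which path was taken is only there to make the behavior easy to log or test; it is not required for the fallback itself.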
Concerning initial focus placement of dialogs - this topic was one of the reasons that the <dialog> element had been stuck in limbo for so long. Fortunately, per the mentioned ongoing discussions about the <dialog> element, there is a proposal for initial focus placement and suggested updates to the dialog focus algorithm which give authors more control over, and reasonable defaults for, where focus is placed. Essentially, the best initial focus placement for a dialog will depend on the purpose of that dialog, and browsers should allow for reasonable fallbacks if web developers do not explicitly indicate where focus should be placed.
For instance, a confirmation dialog may contain a short sentence or two of text followed by an “OK” button. In this instance it makes sense to focus the button by default, so long as the content prior to the button is automatically announced by a screen reader (and with the native dialog element, this is largely what occurs).
Alternatively, a content-heavy dialog may contain a number of focusable elements within it, but no focusable element that is close to the top of the dialog. In such a situation it would be best if web developers used the autofocus attribute to specify where initial focus should land whether it be on an initial heading, a paragraph, or even the dialog element itself if that would be the most logical placement.
The point being that sometimes setting focus to the <dialog> might be exactly what you need to do, while at other times it would make far more sense to focus the primary or most commonly used control within a dialog. But, as the author of that dialog, you should know best where focus needs to be set to provide users with the best and most logical UX for that dialog. The browsers will do what they can to ensure a logical fallback is provided in the instances where explicit focus placement is not specified.
The remaining content of this blog post has had a few updates, but is largely kept for archival purposes.
Incoming <dialog> (Feb 2022)
Since March of 2019, and particularly recently, there has been a great deal of effort by WebKit and Firefox engineers to implement the <dialog> element. Things are looking promising, and most (but not all) of the previous issues I mentioned with the <dialog> behavior have even been polished in Chromium browsers – Microsoft Edge now being one such browser and thus also having native support for the <dialog> element.
Firefox still requires the <dialog> element be enabled in the browser’s configs: dom.dialog_element.enabled. If you enable it you’ll be treated to a generally well performing <dialog> element. In regards to presently expected behaviors for a modal dialog, and testing with the examples I had created when I first published this post in 2019, I have no quirks to report on.
While this won’t help the average user in experiencing the native <dialog> element in a website, things are on the right track there.
As of the October 27th Safari Tech Preview, Safari supports the <dialog> element without a flag of any kind. Giving it a test with VoiceOver, it is working really well!
Looking again at Chromium browsers, so long as there is a focusable element within the dialog to receive initial focus, the dialogs behave well. There is no more auto-scrolling of the page behind a long scrollable dialog anymore, as the default UA style for the <dialog> has changed to position: fixed. Note: a non-modal dialog has a default of position: absolute, which seems reasonable and can be overwritten by developers pretty easily, if need be.
With Firefox and Chromium browsers, in an instance where a <dialog> has not been provided a focusable element to initially send keyboard focus to, JAWS no longer announces anything when the modal dialog has been invoked. Screen reader focus (JAWS, NVDA, VoiceOver) continues to remain on the invoking element. Different screen readers may be able to actually find the dialog or not (for instance, I could not reach it with VoiceOver and Safari TP, but I could with NVDA and Firefox if I searched the page for a dialog announcement with the screen reader’s virtual cursor, and then pressed Enter to force enter it).
There are still some unresolved discussions regarding initial focus in a dialog.
I, as well as others, have advocated for focusing the dialog by default, as the one thing you can consistently count on with any implementation of a proper dialog is that there will be a containing dialog element. Focusing it by default allows for consistency in focus placement, and allows for people using screen readers to start at the top of dialogs, rather than wherever focus is randomly set on a per-dialog basis.
I realize there are some people who have misgivings about this, and I understand and respect those. Focusing a modal dialog, particularly a custom made one, has required very specific markup, CSS and scripting gymnastics over the years. Additionally, there are times where it can actually be detrimental for the overall usability of a dialog to focus it by default. Particularly in instances where the dialog is frequently used and it would be far more practical to move focus to the most important focusable element (such as an OK button, or an important text field) than to have everyone always start at the top of the dialog over and over again.
But, per the topic of focusing the dialog itself, as a quick test you can try out how focusing a native dialog element works via this CodePen (or just read the results I provided in the pen):
tldr; focusing the dialog works largely well. Content is read from top to bottom, logically. There’s no skipping around from a focus point to the top of the dialog. There’s no duplicate announcements. There are instances where the first element might get passed over by the virtual cursor… but that’s solvable…
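For anyone wanting to try focusing the dialog itself, a minimal sketch might look like the following. The function name is hypothetical, and adding tabindex="-1" is an assumption made so that focus() can target the dialog whether or not a given browser treats it as focusable by default.

```javascript
// Minimal sketch: open a modal dialog and move focus to the dialog element itself.
function showModalAndFocusDialog(dialogEl) {
  // Assumption: tabindex="-1" makes the dialog a valid target for focus()
  // without adding it to the Tab order.
  if (!dialogEl.hasAttribute('tabindex')) {
    dialogEl.setAttribute('tabindex', '-1');
  }
  dialogEl.showModal();
  // Override whatever initial focus the browser chose.
  dialogEl.focus();
}

// Possible usage (the selector is an assumption):
// showModalAndFocusDialog(document.querySelector('dialog'));
```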
So we can start using the <dialog> element now with its polyfill, RIGHT?!
OK, hold on there. Have you tried testing what this polyfill actually does? No? Let’s continue…
Note that the ReadMe for the polyfill calls out some limitations with the script, and specifically notes that by default it does not return focus to the element which had invoked the dialog. A snippet is available to add this functionality.
There are additional issues here though, and I will describe a few of those momentarily. But I want to pause for a second and state that this is not meant to be a trashing of the polyfill. It notes its limitations, and there are extensions that can be made to this to have it work better (though still likely not perfect without also extending support for inert as well).
You could put in the effort to add in those extensions, or you could use a robust plugin like a11y-dialog and ensure that your dialogs will have a pretty consistent experience across all browsers.
Again, consistency is a pretty great thing. At the end of the day, being able to effectively use a website/product/what-have-you is what truly delights users, rather than frivolous CSS tricks or other embellishments that are added in the name of said ‘user delight’.
But, back to the polyfill – it does standard keyboard trapping for the Tab key. There is no specific function to try and contain screen reader’s virtual cursors to the modal dialog itself. You might notice, if testing the provided demo page, that in some cases a screen reader’s virtual cursor does appear to be contained to the dialog. However, again this is not specifically the polyfill script, but rather how screen readers are treating the existence of the role=dialog which is added to the <dialog> element by this script.
Animated gif of JAWS escaping a polyfilled dialog
As the animated gif demonstrates, when opening the modal dialog (‘basic modal’) with Firefox and JAWS, JAWS’s virtual cursor can escape the modal dialog (I pressed the Tab key and then the down arrow a few times when reaching the end of the dialog).
Further extending the script to add aria-modal=true to the modal dialog instances would likely help here, though support for aria-modal is still incomplete. However, in regards to polyfilling the <dialog> element, the most glaring gaps at this time (Feb 2022) would be on Android with Firefox, and on iOS.
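One way such an extension could be sketched is a small wrapper that sets aria-modal while the dialog is shown modally and removes it again on close. This is not part of the polyfill; the function name is hypothetical, and it assumes the dialog fires a close event the way the native element does.

```javascript
// Minimal sketch: add aria-modal="true" for the lifetime of a modal dialog.
function showModalWithAriaModal(dialogEl) {
  dialogEl.setAttribute('aria-modal', 'true');
  // Assumption: a "close" event is dispatched when the dialog is dismissed.
  dialogEl.addEventListener(
    'close',
    () => dialogEl.removeAttribute('aria-modal'),
    { once: true }
  );
  dialogEl.showModal();
}
```

Removing the attribute on close keeps a later non-modal .show() of the same element from being announced as modal.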
To end this
<dialog> is almost here. It’s been a long road, and some last bits still need to be worked out in the HTML spec. This is very promising… and there are a lot of people who need some big thank yous in their work to get this over the finish line.
But, until the <dialog> is actually fully delivered, I personally suggest continuing to use trusted and robust custom dialogs. Or, if you polyfill the <dialog> element itself, you absolutely need to make sure it fully performs as expected for all users.
I know we all want to use the latest and greatest thing to pop-up content on our web pages. But, we should make sure that we annoy our users equally.
Here are some additional articles on the <dialog> element.
If you would like to see the original version of this article for more context, then here you have it:
Article (with updates from Oct 7, 2021)
Oct 7 2021 update: this post was updated with some wording changes, typo fixes, and minor additions of information. Testing for macOS/iOS was updated, but further retests will be performed and reported on in another update.
I’ve written about building accessible modal dialogs a few times over the past five-ish years. Most recently I dissected the current state of modal dialog accessibility where I outlined UX expectations and accessibility gotchas when building custom modal dialogs.
Since publishing that breakdown in June of 2018, some things have changed, which is expected. Technology is constantly evolving, improving and often providing better support and features. However, each time I’ve written about modal dialogs I’ve briefly discussed the native <dialog> element. And each time I bring up that element it’s typically to mention its continued lackluster level of implementation from browsers (this is not to say that there aren’t people working on implementing this element. But time, resources, priorities, etc., progress has been unfortunately very slow).
tldr; I’m just going to say right now that the <dialog> element and its polyfill are not suitable for use in production. It’s been that way since the <dialog>’s earliest implementation in Chrome, six-ish many years ago (“many” is far more evergreen than a specific number that keeps increasing…).
For example, adding a <dialog> to a web document would then require another element (such as a button) to serve as the user interface control to invoke the <dialog> element’s .show() or .showModal() methods. Otherwise the <dialog> would remain hidden:
Have you tried
<a href="https://duckduckgo.com"><code>Duck Duck Go</code></a>?
A <dialog> recognizes the open attribute when set to it. e.g., <dialog open>. When manually applied in the HTML, the open attribute will cause a <dialog> to be auto-revealed on document load, as a non-modal dialog. Authors would need to write a function to allow the .close() method to be invoked by the user (again likely via a button within the <dialog>). Otherwise there would be no method to dismiss the non-modal dialog. A non-modal <dialog> does not close when the Esc is pressed.
Just a quick aside, but I won’t be talking about open or the .show() method from here on out, as they do not invoke modal dialogs. Also, per this Twitter thread about open, seems I could go down a whole other rabbit hole.
Unlike a non-modal dialog, a <dialog> that is invoked with the .showModal() method has the added functionality of allowing the Esc key to close it. This doesn’t mean that authors should leave out an explicit element (button) to close the modal dialog, especially since not all devices have keyboards, nor is clicking or tapping outside of the modal dialog a native means to dismiss it. However, the Esc key serves as a fail safe for keyboard users in the event an author were to leave out a more explicit control to dismiss the modal.
As mentioned, a separate interactive control (again button) is necessary to allow users to open a modal dialog from the base document.
Open my dialog!
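The wiring between those controls and the dialog can be sketched roughly as follows. The function name is hypothetical, and the elements are assumed to have already been looked up in the page.

```javascript
// Minimal sketch: connect an "open" button and a "close" button to a <dialog>.
function wireDialogControls(dialogEl, openButton, closeButton) {
  // showModal() opens the dialog as a modal; Esc can then also dismiss it.
  openButton.addEventListener('click', () => dialogEl.showModal());
  // An explicit close control is still needed for users without a keyboard.
  closeButton.addEventListener('click', () => dialogEl.close());
}

// Possible usage (the selectors are assumptions):
// wireDialogControls(
//   document.querySelector('dialog'),
//   document.querySelector('#open_dialog'),
//   document.querySelector('#close_dialog')
// );
```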
The following <iframe> contains the minimal outlined example. If you’re using a Chromium browser, you should be able to invoke the dialog. If using another browser, check to see if there’s a flag you can enable to view the dialog.
What may be the most visible quirk of the current <dialog>’s implementation is if a dialog contains content that is long enough which would cause it to scroll. For instance, a terms of service (TOS) within a modal dialog.
The issue being the <dialog>’s focus algorithm. It states that unless a <dialog> contains a control with the autofocus attribute, then the first focusable element within the dialog will receive keyboard focus.
In the TOS example, the first focusable element is near the end of the dialog’s content. This means the modal dialog will automatically scroll so the focused element will be in view, visually skipping over the content that came before it. Now, I’m sure there are many sighted users who might not mind this behavior, but from experience I’d wager many companies’ legal departments would be none too happy about the majority of their legal document being skipped over.
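The selection logic described above can be approximated in a few lines. This is a simplified sketch and not the specification’s full algorithm; the function name and the FOCUSABLE_SELECTOR list are assumptions made for illustration.

```javascript
// Simplified approximation of "autofocus first, otherwise first focusable".
// The selector list is an assumption and does not cover every focusable case.
const FOCUSABLE_SELECTOR = [
  'a[href]',
  'button:not([disabled])',
  'input:not([disabled])',
  'select:not([disabled])',
  'textarea:not([disabled])',
  '[tabindex]:not([tabindex="-1"])',
].join(', ');

function pickInitialFocusTarget(dialogEl) {
  // An element carrying the autofocus attribute wins.
  const autofocused = dialogEl.querySelector('[autofocus]');
  if (autofocused) return autofocused;
  // Otherwise the first focusable element in DOM order is chosen,
  // which in the TOS example sits near the end of the content.
  return dialogEl.querySelector(FOCUSABLE_SELECTOR) || null;
}
```

A null result corresponds to the case where the dialog contains nothing focusable at all.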
Auto-scrolled dialog gif
Furthermore, if an author does not provide a max-height and overflow to their modal dialog, then the entire document will scroll to ensure the focused element is within the viewport. Coupled with the fact that the current dialog implementation doesn’t return focus to the element that invoked it, this means that all users will have to re-scroll the document to return to their original position.
Granted, these are not just failings of the native <dialog>, but of its contents and use. Ideally long form content and complex UI would go on its own page, allowing for a direct link to it and thus bypassing the need for a modal dialog altogether. But “ideally” and “reality” do not always align, and one cannot deny how often products/websites overload dialogs with content that should be part of, if not their own, web page. A TOS doc within a modal dialog is a real world use case. I’m sure many have encountered one, at least once in their web browsing. Privacy statements, long-form product comparison docs… there are tons of instances of mostly-static dialog content that behave like this TOS dialog example.
Shifting gears from that of a sighted mouse user to that of a keyboard user, let’s look at how the <dialog> presently behaves. Upon invoking a modal dialog, keyboard focus will immediately shift from the element that opens the dialog to an element with the autofocus attribute applied to it. If no such element exists, then focus will be placed on the first focusable element within the dialog.
If a modal dialog lacks a focusable element, focus visually appears to be lost (which is much more prevalent if using a screen reader, but more on that in a bit). Granted, this sort of scenario should be avoided as it creates a situation where people cannot close the modal dialog without a keyboard.
While a modal <dialog> is open, keyboard focus cannot return to the document beneath it. However, unlike many custom modal dialogs which typically create a focus trap – forcing an endless loop of the focusable elements within the dialog – a native modal <dialog> allows keyboard focus to escape the modal and return to the browser’s chrome (e.g., the address bar and other such controls). This is good behavior, not a bug. Completely trapping keyboard focus within a custom dialog, thus not allowing focus to return to the browser chrome, was always more of a “it’s either this, or focus can move back to the document”. If this behavior seems confusing or undesired to you, that’s unfortunately due to the fact that you’ve had to experience custom dialog behavior for so long…
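For contrast, the looping focus trap that custom dialogs typically script can be sketched like this. It is an illustration of the custom pattern, not of native <dialog> behavior; the function name is hypothetical and the list of focusable elements is assumed to be collected by the caller.

```javascript
// Minimal sketch of a custom dialog's Tab-key loop.
// `focusableEls` is the ordered list of focusable elements inside the dialog,
// `activeEl` is the element that currently has focus.
// Returns true when the handler redirected focus, false otherwise.
function trapTabKey(event, focusableEls, activeEl) {
  if (event.key !== 'Tab' || focusableEls.length === 0) return false;
  const first = focusableEls[0];
  const last = focusableEls[focusableEls.length - 1];
  if (event.shiftKey && activeEl === first) {
    // Shift + Tab on the first element wraps to the last.
    event.preventDefault();
    last.focus();
    return true;
  }
  if (!event.shiftKey && activeEl === last) {
    // Tab on the last element wraps to the first.
    event.preventDefault();
    first.focus();
    return true;
  }
  return false;
}

// Possible usage inside a keydown listener on the custom dialog container:
// container.addEventListener('keydown', (e) =>
//   trapTabKey(e, focusableEls, document.activeElement));
```

Because focus is always redirected at both ends, it never reaches the browser chrome, which is the difference from the native modal behavior described above.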
An acknowledged gap in a modal dialog’s UX is that closing one doesn’t explicitly return keyboard focus to the element that invoked it. Instead focus will linger wherever the modal dialog appeared in the DOM order. Consider the following markup:
<button>Opens Dialog</button>
<dialog>Goes here</dialog>
<p>More content and <a href="...">a link</a>.</p>
As the <dialog> exists between the button that invoked it and a link, keyboard users hitting Tab or Shift + Tab, after closing the modal dialog, will find themselves in an expected location. However, if dialogs are inserted towards the top or bottom of the DOM, when closing the modal dialog a user’s focus could wind up in unexpected locations. For instance, the browser’s address bar.
Without a more explicit re-focusing of the invoking element, it means that keyboard users will have a much higher likelihood of having to reorientate themselves and re-navigate portions of the document to get back to where they previously left off. We don’t ask sighted mouse users to re-read or scroll documents after they close modals, but providing such functional equality would be on authors to script.
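A rough sketch of what that scripting could look like is below. The function name is hypothetical, and the invoking element is assumed to be passed in by whatever click handler opens the dialog.

```javascript
// Minimal sketch: return keyboard focus to the invoking element on close.
function showModalAndRestoreFocus(dialogEl, invokerEl) {
  // The dialog's "close" event fires for both .close() and Esc dismissal.
  dialogEl.addEventListener('close', () => invokerEl.focus(), { once: true });
  dialogEl.showModal();
}

// Possible usage (the selectors are assumptions):
// const button = document.querySelector('#open_dialog');
// button.addEventListener('click', () =>
//   showModalAndRestoreFocus(document.querySelector('dialog'), button));
```

Registering the listener with once: true means each opening restores focus to whichever element invoked it that time, regardless of where the dialog sits in the DOM.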
Screen readers follow similar functionality to default keyboard functionality. Keyboard and screen reader focus are restricted to the modal dialog and cannot access the base document.
However, it was the variations in how each screen reader paired with Chrome announced the modal dialog and its contents that exposed some quirks.
Opening the minimal test dialog, “modal dialog” is announced, the text and role of the focused element is announced, and then general announcements of the number of important elements on the “page” are announced. e.g., number of headings and links.
After these initial announcements, JAWS appears to rely on some heuristics to determine what will happen next. For example, JAWS will begin announcing content starting at the focused link within the test modal dialog. The content prior to the link is ignored.
In the TOS modal dialog, JAWS will announce the link at the bottom of the dialog, and then return to the top of the dialog to begin announcing the content that was skipped over. However, JAWS starts announcing with the “Please read these…” paragraph, skipping over the heading and first paragraph (date).
If a modal dialog contains no focusable element JAWS will begin reading content at a point dependent on the length of the dialog’s contents. For instance, the smaller dialog will start announcing at the heading. The long dialog will start announcing somewhere in the middle.
In all situations, it would be on the JAWS user to recognize content may not have been announced. They’d then need to make the decision to search the dialog for any missed details.
As with standard keyboard focus, the location of the <dialog> in the DOM is going to determine where focus is placed when the modal dialog is closed. Though, testing showed that JAWS will often place the virtual cursor back to the element that invoked the modal, even if keyboard focus was returned to a different location in the DOM. This is both a benefit and a quirk, as depending on the next key press (arrow keys or Tab), a JAWS user could find themselves in very different locations in the document.
The manner in which the contents of the modal dialog is announced is determined behind the scenes by NVDA, similarly to JAWS.
Opening the minimal test dialog, the first link is auto focused and the contents of the dialog are read multiple times, except for the heading which is not announced at all.
NVDA's announcements for the minimal test dialog
Native test file document
Have you tried Duck Duck go
Duck Duck go?
Duck Duck go, link
link, Duck Duck go?
Checkbox not checked I have button Neat!
Regarding the TOS modal dialog, NVDA begins reading the content of the dialog starting with “Last updated” and announces all paragraph content, skipping over headings and other semantics. After completing, NVDA will start re-announcing the content from the initially focused element. This time NVDA will announce the headings and other semantics it left out in the initial pass.
When closing the modal dialog NVDA’s focus, as well as keyboard focus, were returned to the button that opened the modal dialog. This allowed for either the Tab key or virtual cursor to be used to start re-navigating the document.
Chrome and VoiceOver don’t skip over announcing any content of the modal dialog, and are generally succinct in those announcements. Visually focus is set to the link, but VoiceOver begins auto-reading from the top of the dialog, and will move through its entirety until completion, so long as a user doesn’t intervene. VoiceOver behavior with Chrome has improved since the 2019 test. A quirk from the 2019 test is contained within the following disclosure widget, but this behavior does not exist when testing in October 2021.
macOS 10.14.3 VoiceOver behavior
Some content did not have its role announced when VoiceOver was auto-reading all content of the modal dialog, by default (specifically the checkbox or the button in the minimal test). VoiceOver will also add an announcement of "group" after announcing the content of the modal dialog, which was unexpected.
Another longstanding quirk (but is not unique to the content of dialogs), is if reviewing “articles” in the VoiceOver rotor, you’ll be presented with the following:
Closing the modal dialog returns VoiceOver focus to the top of the page. VoiceOver will begin re-announcing the content of the page. Focus does appear to be reset to the button that invoked the dialog though, so hitting the Tab key will move to the following link, in the test page I put together. But navigating with VO focus, a user will find themselves having to re-navigate the document to return to where they had left off, before opening the dialog.
Android 8.1 with TalkBack 7.2
Opening the dialog in the minimal test, the first focusable element within the <dialog> is focused and announced. However, TalkBack doesn’t announce the dialog’s role when focus shifts, which may cause some confusion as to what’s happened for some users. Additionally, none of the content preceding the focused element is automatically announced, as other screen readers behave. A user would have to discover the content manually, and again, infer for themselves that they’ve entered a modal dialog.
When closing the modal dialog, TalkBack will send focus to the web view container (essentially the document root). The user will still need to re-navigate the document to get back to their previous position.
Does the polyfill help?
As noted, only Chromium-based browsers support the <dialog> right now. There is a <dialog> polyfill, but while it is better than nothing, it does not provide a fully robust, accessible user experience.
When the modal dialog is opened, the virtual cursor is trapped within the modal; however, the Tab key appears to be able to escape and focus elements within the primary document. Opening JAWS’s dialog of links (Insert + F7), all of the links of the primary document are still available and accessible to JAWS (this should not be the case). Opening JAWS dialogs to quickly navigate to other types of elements also lists elements that should be inaccessible, so long as a modal dialog is open.
JAWS with IE11 (2019)
The polyfill has some quirks in general, but most importantly doesn’t send JAWS into the dialog element itself. Virtual cursor remains on the button that launches the dialog, and navigating with it will walk through the document beneath the modal dialog. Hitting Tab to try and quickly navigate to the dialog consistently crashed IE11 (regardless if JAWS was on or not).
NVDA with Firefox (2019)
Using the Tab key, NVDA can escape an opened modal dialog. Nothing will be announced by NVDA, but after exiting, the virtual cursor can be used to navigate the base document, and NVDA will announce the content it is highlighting.
If within the modal dialog, NVDA’s virtual cursor cannot escape.
If opening NVDA’s dialog of elements (Insert + F7) within the modal dialog, no elements but those within the dialog will be exposed. However, if a user has tabbed outside of the modal dialog and opens the same NVDA dialog, all elements of the base document will be exposed.
macOS VoiceOver with Safari (2019 & 2021 retest)
Whether VoiceOver is enabled or not, one cannot leave the modal dialog if using the Tab. However, there have been no steps to mitigate users leaving the modal when navigating by VoiceOver’s cursor (Control + Option + left or right arrows). The rotor menu will also continue to list all elements within the primary document.
iOS 12.1.4 (2019) & 15.0.1 (2021) VoiceOver with Safari
A VoiceOver user will find it very easy to escape polyfilled modal dialogs. VoiceOver does not get trapped inside the modal dialog when opened. Swiping or navigating by rotor will allow easy access out of the modal dialog.
TalkBack and Android Firefox (2019)
TalkBack focus does not move to the modal dialog when using Firefox. Once you do get TalkBack to enter the exposed modal dialog, it works similarly to VoiceOver and Safari, in that it’s very easy to swipe out of the modal dialog.
So what to do?
Modal dialogs continue to be hard, and that was a lot of information to take in. So let’s wrap this all up at a high-level:
In my opinion, Chrome’s current implementation, and the manner in which screen readers presently interact with modal dialogs range from:
Adequate (Chrome with macOS VoiceOver).
Quirky (Chrome with JAWS and NVDA).
Kinda awful (Chrome with TalkBack).
Presently the specification seems geared towards dialogs that contain short form content, and where a focusable element will always be one of the first child elements.
For right or wrong, in the real world modal dialogs are used for a range of purposes. From marketing ads, informative messages, error alerts, forms, sub-applications (e.g., media managers in CMSes), to long form content (again TOS).
Hopefully Edge moving to Chromium can kick start work on this element. At the very least allowing for conversations to be re-opened, and more attention given to the accessibility gaps of the <dialog> element.
The <dialog> element could have the potential to replace the many inaccessible custom dialogs across the web, and mitigate future ones from being built. But with its current issues, and continued Chromium-only default support, it presently doesn’t make sense to use the <dialog> element.
I hope this changes soon. If only because I selfishly don’t want to write about the <dialog> element’s lack of implementation and quirks the next time I get an itch to write about modal dialogs.
Oct 2021 update: this post was updated with some wording changes, typo fixes, and minor additions of information. Some testing was updated, but further retests will be performed and reported on in another update.