Why color contrast is not as black and white as it seems

Roger Attrill on 2021-11-29

This article takes a look at one of the most fundamental tools in the designer’s toolbox, the color contrast checker, and why it might be problematic. I’ll describe the problem, a possible upcoming improvement, and what we might be able to do to put things right, if anything.

We are biased by our tools

I’ve given a talk about cognitive biases at a few events over the last couple of years. In those talks I stress the importance of looking critically at the tools we use and considering whether (or rather how) they bias the work we do and what the onward impact is for the users that we’re designing for.

I have tried applying that scrutiny to one tool I have been using for years, and which I suspect just about every designer out there has been using as well. I’m talking about the humble color contrast checker, of which there are any number to be found on the web. You probably have your favorite one, but I typically turn to the one at snook.ca. Pop in your foreground and background colors and out comes a contrast ratio. Depending on whether that ratio reaches an acceptable level, you can use it as evidence for defining your color palette, or for telling your team that some part of a design needs to be adjusted to meet accessibility guidelines.

Color problems

A great many organizations adopt the Web Content Accessibility Guidelines (WCAG), a wonderful set of guidance that helps us as designers to keep accessibility in mind. WebAIM (Web Accessibility In Mind) also adopts WCAG 2, and its WebAIM Million project reports that 86.4% of the home pages of the top 1 million websites have some form of low contrast failure against the guidelines.

One of the guidelines, WCAG 2 1.4.3 (and 1.4.6), is very likely the source of the calculations that are being used in the color contrast checkers you use.

Requirement 1.4.3 reads like this:

The visual presentation of text and images of text has a contrast ratio of at least 4.5:1, except for the following:

Large Text: Large-scale text and images of large-scale text have a contrast ratio of at least 3:1;

Incidental: Text or images of text that are part of an inactive user interface component, that are pure decoration, that are not visible to anyone, or that are part of a picture that contains significant other visual content, have no contrast requirement.

Logotypes: Text that is part of a logo or brand name has no contrast requirement.
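For reference, the ratio in that requirement comes from the WCAG 2 definitions of relative luminance and contrast ratio. Here is a minimal Python sketch of that calculation. The function names are illustrative, and colors are assumed to be 8-bit sRGB triples.

```python
def channel_to_linear(value):
    """Convert one 8-bit sRGB channel to linear light, per the WCAG 2 definition."""
    c = value / 255
    # WCAG 2 publishes 0.03928 as the threshold. For 8-bit input it selects
    # the same branch as the sRGB standard's 0.04045.
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance: 0.0 for black, 1.0 for white."""
    r, g, b = (channel_to_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(rgb_a, rgb_b):
    """WCAG 2 contrast ratio, from 1 (identical colors) to 21 (black vs white)."""
    lum_a, lum_b = relative_luminance(rgb_a), relative_luminance(rgb_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 2))
```

Notice that the function never asks which color is the text and which is the background. It only cares which of the two is lighter, so swapping them gives the same ratio.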

What if I told you that contrast ratio algorithm was wrong, or at best not telling the whole truth? What if the logic behind meeting WCAG 2 1.4.3 with its contrast ratios based purely on the relative luminance of each color was deeply flawed? Or at least would be deeply flawed if it weren’t so shallow in the first place!

What if all those color contrast checks you’ve done over the last however many years, and the countless hours spent tuning color palettes and combinations, are at risk?

The thing is, I don’t think you’d be surprised. I think you knew something was up really. But WCAG 2 is… well, it’s the WCAG: it’s often regarded as the single source of truth when it comes to meeting accessibility guidelines. That might be for your in-house design principles, your shared design system, or your corporate or public service design requirements. It might even be for your VPAT, voluntarily showing how you meet those guidelines, or for meeting federal purchase order requirements (VPAT in recent years has adopted the WCAG 2 guidelines). The WCAG 2 contrast requirement may even be law in your location, and WCAG 2 1.4.3 comes part and parcel, like it or not.

Let’s get visual

Let’s have a look at the problem. It’s popularly illustrated by the following example of various orange and blue backgrounds. The contrast ratio is shown below the box for both white and black text. The calculation uses the WCAG 2 contrast ratio calculations mentioned and linked above.

In order for text to meet WCAG 2 1.4.3 AA requirements, the contrast ratio should be at least 4.5:1 for text smaller than 18pt (or 14pt bold). The figures returned from many contrast checkers (I can’t quantify how many) will indicate very clearly that black text is best on these backgrounds due to the higher contrast ratio.

I asked in 2019, why does Twitter use white-on-blue rather than black-on-blue? What is the reason for this being OK?

I don’t know about you, but something doesn’t feel right. More to the point, it doesn’t look right! The white text appears to me to be more readable than the black, and yet the evidence can’t lie. Can it? Well perhaps it doesn’t lie, but it does avoid a rather inconvenient complexity — human beings.

I should note here that not everyone will find the white text easier to read than the black. People are diverse and our vision systems vary, not just with age, condition, or impairment, but we have different equipment, different environments, different corrective devices, and different impressions of normal.

How can we cater for real people?

Enter: the Advanced Perceptual Contrast Algorithm (APCA), independently developed as part of a future WCAG 3 standard. It is in beta at the time of writing and is subject to change at any time, but it does explain the anomalies I’ve been trying to understand for years. I’m not going to go into detail about APCA because the link can take you to a colorful world of vision science and demos that may be more than you could ever need.

If you want to see the algorithm, there’s a JavaScript sample on the GitHub page (check the license). If you’re just after the APCA contrast checker, you can find it at https://www.myndex.com/APCA/ which currently looks like this:

Suffice it to say that APCA takes into account the way the human vision system perceives color on self-illuminated displays when calculating a contrast level between two colors.

How does APCA compare to WCAG 2?

Rather than return to the somewhat infamous issue with orange and blue backgrounds above, let’s first consider the basic greyscale problem: whether to use black or white text on a grey background.

From here on I’ll use the following symbols:

Here’s what the WCAG 2 contrast calculations tell us with a white foreground and then a black foreground — contrast varies between 1 and 21.

white foreground on greyscale (WCAG 2)
black foreground on greyscale (WCAG 2)

If we want to always use a foreground color with the highest possible contrast, then the best choice of foreground color will change roughly halfway along the scale. Black works best on white and white works best on black, and they get progressively worse on shades of grey until the preferred color flips midway.

the higher contrast (black or white) on greyscale (WCAG 2)
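To see where that flip lands under WCAG 2, we can scan all 256 grey levels and pick whichever of black or white gives the higher ratio. This is a rough sketch using the WCAG 2 luminance formula. The helper names are illustrative, and the grey levels are assumed to be 8-bit sRGB values.

```python
def grey_luminance(level):
    """WCAG 2 relative luminance of the grey (level, level, level).

    For a grey all three channels are equal and the channel weights sum
    to 1, so the luminance is just the linearized channel value.
    """
    c = level / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def best_foreground(level):
    """Return ('black' or 'white', ratio) for the higher WCAG 2 contrast."""
    lum = grey_luminance(level)
    white_ratio = 1.05 / (lum + 0.05)   # white text (luminance 1.0) on the grey
    black_ratio = (lum + 0.05) / 0.05   # black text (luminance 0.0) on the grey
    if black_ratio >= white_ratio:
        return "black", black_ratio
    return "white", white_ratio


# First grey level (counting up from black) where black text takes over.
flip = next(level for level in range(256) if best_foreground(level)[0] == "black")
print(flip, f"{flip / 255:.0%} of the way from black to white")
```

With these formulas the switch happens a little below the midpoint of the 0–255 range, which is close enough to call it halfway.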

This symmetric model of color contrast is the underlying problem. The human vision system mostly does not perceive dark on light with the same intensity as light on dark. Color perception is not symmetric — it is biased towards light on dark for the same color ‘distance’.

The APCA takes human color perception on self-lit displays into account when calculating contrast between two colors. APCA also returns a different range of values:

Here’s what the APCA contrast calculations tell us for a white foreground or a black foreground on the same greyscale range. White gets more high contrast slots (thin rings) further through the scale, and black gets fewer high contrast slots:

white foreground on greyscale (APCA)
black foreground on greyscale (APCA)

As before, if we want to always use the foreground color with the highest contrast, the preferred choice of foreground color will change somewhere along the scale, but not halfway! In fact it’s about 40% of the way along from the lighter end.

the higher contrast (black or white) on greyscale (APCA)

White is generally perceived as being of greater contrast more of the time. The APCA algorithm is ‘polarity aware’.
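To make ‘polarity aware’ concrete, here is a simplified Python sketch of an APCA-style calculation. Treat it as an illustration only: the constants are an assumption, taken from one published beta version of the APCA reference script, and may have changed since. The function names are illustrative, and this is not the official implementation, so check the current reference and its license before relying on it.

```python
# Constants assumed from one beta version of the APCA reference script.
EXPONENT = 2.4
COEFFS = (0.2126729, 0.7151522, 0.0721750)
NORM_BG, NORM_TXT = 0.56, 0.57    # dark text on a lighter background
REV_BG, REV_TXT = 0.65, 0.62      # light text on a darker background
BLACK_THRESHOLD, BLACK_CLAMP = 0.022, 1.414
SCALE, OFFSET = 1.14, 0.027
DELTA_Y_MIN, LOW_CLIP = 0.0005, 0.1


def screen_luminance(rgb):
    """Estimated screen luminance of an 8-bit sRGB color, with a soft clamp near black."""
    y = sum(k * (v / 255) ** EXPONENT for k, v in zip(COEFFS, rgb))
    if y < BLACK_THRESHOLD:
        y += (BLACK_THRESHOLD - y) ** BLACK_CLAMP
    return y


def apca_lc(text_rgb, bg_rgb):
    """Signed lightness contrast: positive for dark-on-light, negative for light-on-dark."""
    y_txt, y_bg = screen_luminance(text_rgb), screen_luminance(bg_rgb)
    if abs(y_bg - y_txt) < DELTA_Y_MIN:
        return 0.0
    if y_bg > y_txt:
        sapc = (y_bg ** NORM_BG - y_txt ** NORM_TXT) * SCALE
        return 0.0 if sapc < LOW_CLIP else (sapc - OFFSET) * 100
    sapc = (y_bg ** REV_BG - y_txt ** REV_TXT) * SCALE
    return 0.0 if sapc > -LOW_CLIP else (sapc + OFFSET) * 100


WHITE, BLACK = (255, 255, 255), (0, 0, 0)
print(round(apca_lc(BLACK, WHITE), 1), round(apca_lc(WHITE, BLACK), 1))

# First grey level (counting up from black) where black text beats white text.
flip = next(
    g for g in range(256)
    if abs(apca_lc(BLACK, (g, g, g))) >= abs(apca_lc(WHITE, (g, g, g)))
)
print(flip, f"{1 - flip / 255:.0%} of the way in from the lighter end")
```

Two things stand out under these assumptions. Swapping text and background changes the magnitude of the result, not just its sign (roughly 106 for black on white against roughly 108 for white on black). And the greyscale switch point moves well past the midpoint towards the lighter greys, in line with the ‘about 40%’ figure above.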

What does this mean for our troublesome orange and blue from earlier? Unlike WCAG 2, the APCA contrast is higher for the white text on both backgrounds, and by some margin.

The bigger picture

In an earlier article I looked at visualizing how the WCAG 2 color contrast varies across a slice of the HSL color space where hue, saturation, and lightness vary. I’m going to use the same types of visualization here.

For a pure black or pure white foreground color, the following image shows the best choice of black or white for the highest contrast. The symbols (as used above) indicate the level of contrast: thin rings mean the highest contrast at WCAG 2 AAA level, and thick rings mean an AA level acceptance.

HSL Slice 1 — WCAG 2

Here hue varies left to right, saturation increases downwards (0–255), and lightness is a constant mid value (128). In WCAG 2 contrast calculations, black or white will give a suitable contrast of >4.5 against any of these colors (there are no solid circles). Black works well in the yellows, greens, and cyan. White only works well in the deeper blues.

WCAG 2 contrast calculations — varying hue and saturation
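The absence of solid circles is not an accident of this particular slice. It falls out of the WCAG 2 formula. As a short derivation, write L for the relative luminance of the background (between 0 and 1), and take white as luminance 1 and black as luminance 0:

```latex
\begin{aligned}
C_{\text{white}}(L) &= \frac{1.05}{L + 0.05},
\qquad
C_{\text{black}}(L) = \frac{L + 0.05}{0.05}, \\[4pt]
C_{\text{white}}(L)\, C_{\text{black}}(L) &= \frac{1.05}{0.05} = 21
\;\;\Longrightarrow\;\;
\max\bigl(C_{\text{white}}(L),\, C_{\text{black}}(L)\bigr) \;\ge\; \sqrt{21} \approx 4.58 .
\end{aligned}
```

Since the two ratios always multiply to 21, at least one of them is never below about 4.58, so under WCAG 2 either black or white text clears 4.5:1 on any background. The worst case sits at L ≈ 0.18, where the two ratios are equal.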

HSL Slice 1 — APCA

How do the APCA contrast calculations compare? You can see below that white takes over as the dominant preference. White looks to be very high contrast in that great swathe of blues, pinks and reds. Black isn’t as good across that large region of yellows and greens as previously thought, and a block of solid circles marks colors where the minimum required contrast can’t be met by either black or white.

APCA contrast calculations — varying hue and saturation

HSL Slice 2 — WCAG 2

Now what if we vary lightness rather than saturation?

In the next picture hue varies left to right as before, lightness increases downwards (0–255), and saturation is a constant high value (255).

In WCAG 2 contrast calculations it’s a similar story to before. Black works well in the yellows, greens, and cyan. White works better in the blues. Unlike the constant lightness and varying saturation in the first image, a vast majority of the ratios get above 7 (the thin circles) for these saturated backgrounds.

WCAG 2 contrast calculations — varying hue and lightness

HSL Slice 2 — APCA

And the saturated slice with APCA contrast calculations? White pushes deeper as expected, but significantly deeper into the red, blue and pink range. Some colors can’t get the equivalent of AA acceptance but generally the preferred foreground is just more likely to be white than with WCAG 2.

APCA contrast calculations — varying hue and lightness

If it’s broken, can we fix it?

Where does this leave us? We appear to understand the problem. We potentially have a new tool to help us calculate contrast taking human perception into account.

The APCA explains, and it corrects, but is it the solution?

That’s where things get complicated. Right now we can’t all start using this new tool because… well WCAG 2 is still here; the laws are still here; VPATs are still here; design requirements and policies are still here. It’s still a bit too early.

It’s neither practical nor desirable to go back and fix all the work we’ve done before, but for a while many of us are still not even going to be able to improve things going forward either, and that hurts!

Those of us not bound (legally or otherwise) to follow WCAG 2 (or who choose to flout item 1.4.3) may have the flexibility to choose better. Twitter for example has been using white on blue forever — were they right? WCAG 2 said no, APCA says…yes.

Here are the numbers for ‘Twitter blue’ using the WCAG 2 and APCA models:

‘Twitter blue’
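If you want to reproduce this kind of comparison yourself, the sketch below runs both models against one background color. It rests on some loud assumptions: ‘Twitter blue’ is taken to be #1DA1F2, the APCA-style constants come from one beta version of the reference script and may have changed, and the function names are illustrative. It is not an official implementation of either model.

```python
TWITTER_BLUE = (29, 161, 242)   # assumed value: #1DA1F2
WHITE, BLACK = (255, 255, 255), (0, 0, 0)


def wcag2_ratio(rgb_a, rgb_b):
    """WCAG 2 contrast ratio between two 8-bit sRGB colors."""
    def lum(rgb):
        lin = [c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
               for c in (v / 255 for v in rgb)]
        return 0.2126 * lin[0] + 0.7152 * lin[1] + 0.0722 * lin[2]
    lighter, darker = sorted((lum(rgb_a), lum(rgb_b)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)


def apca_lc(text_rgb, bg_rgb):
    """APCA-style signed contrast (constants assumed from a beta version)."""
    def y(rgb):
        total = sum(k * (v / 255) ** 2.4
                    for k, v in zip((0.2126729, 0.7151522, 0.0721750), rgb))
        # Soft clamp for very dark colors.
        return total + (0.022 - total) ** 1.414 if total < 0.022 else total
    y_txt, y_bg = y(text_rgb), y(bg_rgb)
    if abs(y_bg - y_txt) < 0.0005:
        return 0.0
    if y_bg > y_txt:  # dark text on a lighter background
        sapc = (y_bg ** 0.56 - y_txt ** 0.57) * 1.14
        return 0.0 if sapc < 0.1 else (sapc - 0.027) * 100
    sapc = (y_bg ** 0.65 - y_txt ** 0.62) * 1.14  # light text on a darker background
    return 0.0 if sapc > -0.1 else (sapc + 0.027) * 100


for name, text in (("white", WHITE), ("black", BLACK)):
    print(name,
          "WCAG 2:", round(wcag2_ratio(text, TWITTER_BLUE), 2),
          "APCA-style Lc:", round(apca_lc(text, TWITTER_BLUE), 1))
```

Under these assumptions the two models disagree about the winner. WCAG 2 puts white text below 4.5:1 and black text comfortably above it, while the APCA-style value has the larger magnitude for white text.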

The rest of us (for now) armed with new tools can really only constrain ourselves further by ensuring that wherever possible we conform to both WCAG 2 and APCA. A kind of futureproofing exercise that can’t let go of the past.

The end result might at least end up being better for users due to fewer poor color combinations being used.

What might supporting both models look like?

In the interests of curiosity, let’s take a look at our HSL color slices if we opted to ensure our color choices met the requirements of both models.

HSL Slice 1 — WCAG 2 and APCA

Firstly with constant lightness and varying saturation. This shows large areas with solid circles — low saturation colors unable to get an AA compatible contrast from either model, as well as those awkward blues, oranges and a chunk of vibrant pinks and reds.

Meeting both WCAG 2 and APCA contrast requirements — varying hue and saturation

HSL Slice 2 — WCAG 2 and APCA

With constant saturation and varying lightness, you can see there will be a river of colors to avoid, pushing us to use combinations with a greater color distance, i.e. white on a darker background, or black on a lighter background. This is probably a good thing anyway.

Meeting both WCAG 2 and APCA contrast requirements — varying hue and lightness

Conclusion

In my talks on bias I mention one takeaway being not so much a checklist of all the biases to watch out for in our decision making (because who’s got time for that), but more of a generally heightened awareness of bias everywhere. We have to accept (for the most part) human beings being… just human! But bias is also encapsulated in our tools, our processes, and the same old reusable patterns that we fall back on every day. The more we’re aware of that potential bias the more we realize how we are nudged down certain paths.

The humble contrast checker appears to be one such tool.

In the longer term, if/when APCA becomes part of WCAG 3 (and that seems to be the trajectory), there’s a long waterfall of change required: from WCAG 3 release to early adoption, to broader awareness driving new tools and adapting existing tools, to rewriting policies through organizational design groups, to what becomes the new normal. Unfortunately many contrast tools in use today will likely not be maintained. Contrast checkers embedded into desktop software can’t be changed without updates. Modern online tools such as browser-based extensions or server-side design frameworks like Figma, Sketch, etc. can probably drive change along pretty quickly. We’ll almost certainly have to support both contrast models for some time to come.

Meanwhile — watch this color space.