Edit (2023-03-30): Added a couple extra images to test for high-quality encoding and illustrations. Introduce SSIM and MSE measures.
Edit (2023-03-03): Fixed mistake, as AVIF does in fact NOT support progressive decoding.
This is a subjective look at JPEG XL and AVIF. All comparisons in this post (except the ones using SSIM and MSE) are perceptual and based on my own opinion only. This is not a scientific study! I just grabbed a bunch of images, encoded them and looked at the results. For a more careful and methodical comparison between these formats (and more), have a look at CID22.
JPEG XL and AVIF are arguably the two main contenders in the battle to replace JPEG as the next-generation image format. There are other formats in the race, like HEIC and WebP 2, but the former is subject to licensing patents (and possibly not royalty-free), and the second is still in development and seems that it may never see the light of day as a production-ready image format anyway. The original WebP is not even a contender as it is inferior to AVIF in all aspects,1 and you should probably never use it for photography anyway2, or at all if you are not ok with mediocre image quality.3
First, a quick browser support test:
If you are browsing this page around 2023, chances are that your browser supports AVIF but does not support JPEG XL. This is mainly due to the Chrome team dropping support for JPEGL XL against the opinion the community at large. In this post, I hope to convince you why this is a bad move. Below, I perform a quick analysis of lossless and lossy compression with JPEG XL and AVIF, and evaluate how they fare in terms of file size and visual quality.
I have selected five different test images to run through both the JPEG XL and the AVIF encoders:
Flag — a sunset with a smooth red-yellow gradient in the sky and a quiet sea with lots of detail. There is a flag in the foreground, which is mostly black due to it being lit from behind.
BW cityscape — a black-and-white cityscape at night. It has a lot of contrast.
Red panda — a photo of a red panda with some green leaves on the foreground and a green bokeh in the background.
Fireman wedding — a vector illustration of a newly wed couple.
Plot — a simple plot created with matplotlib.
The first four images (flag, cityscape, red panda and vector wedding) are in the public domain, CC0, and have been downloaded from free-images.com. For the first two, the originals are in 4K (or close) resolution and JPEG format. I loaded them in GIMP, downscaled them to full HD and saved them using lossless PNG. I kept the red panda image at the original resolution to test high quality JPEG XL and AVIF encoding. The fireman wedding image was encoded in lossless PNG from the SVG at 1500x1380. The final image, plot, was directly produced by matplotlib with a high resolution, which I kept, in lossless PNG.
All the images used in this post (source
.avif) are available in this repository for you to inspect.
In this post we use upscaled crops of the images. The percentage by which they are upscaled is mentioned when they appear. The cropped versions themselves are, unless stated otherwise, encoded in JPEG with a 100% quality setting.
In this section we analyze how well JXL and AVIF compress images, in relation to file size and visual quality compared to the PNG original inputs.
In order to encode the JXL versions, I have used the
cjxl provided with
libjxl. In general, I have not used options other than the quality. For instance, all JXL images use effort 7, which is the default. Increasing the effort setting would surely improve results at the expense of speed for the JXL encoder, but would add an extra dimensionality to the analysis, which I intend to keep as simple and default as possible. This is an unscientific first look, after all.
cjxl input.png output.jxl -q 60 --num_threads=4
Playing around with the quality setting (
-q) produces larger, higher quality images for higher quality settings, and smaller, lower quality ones for lower quality settings. The
-q goes from 0 to 100.
To encode the AVIF images I used
avifenc provided by
libavif, the reference AVIF open-source implementation. Here, I have used the following command line arguments:
avifenc -j 4 --min 0 --max 63 -a end-usage=q -a cq-level=12 input.png output.avif
--max set the minimum and maximum quantizer for color. We use the full range here. When paired with
-a end-usage=q, the encoder uses the rate control mode to constant quality, given by
cq-level. So, adjusting
cq-level, in 0-63, adjust the quality for the color channels. Playing around with this quality, we can get images which are the same size as their JXL counterparts for comparison.
Additionally, this analysis is purely subjective. I encode the images and then look at them, trying to compare them to the original PNGs. I could have used a formal comparison measure, like the structural similarity (SSMI), but my understanding is that this is purely based on the image levels and disregards perceptual factors, which are very important in compression. A good overview on image quality assessment is provided by Wang et al.4 in their 2004 IEEE Image Processing paper. A Python utility which takes in two images and outputs the SSIM and the MSE (mean squared error) is included in the repository for this post (
compare.py). Whenever I can’t visually find substantial perceptual differences in the image files, I’ll be resorting back to SSIM and MSE.
It would have been easier to use
libvips, which offers a unified interface for JXL and AVIF encoding with
vips jxlsave -Q QUALITY and
vips heifsave -Q QUALITY, but I wanted to eliminate any hiccups or additional hidden processing introduced by vips itself, and thus went to the ‘official’ encoders directly.
For the flag image I have used a crop around the small boat in the distance, to the very left of the scene. The crop is upscaled by 200%. Here are the results.
|Image (200% crop)||Description|
|Original, 6.9 Mb|
|JXL, 146 Kb|
|AVIF, 146 Kb|
I used some pretty heavy compression in this one. This means that the quality parameter was rather low. We compressed the image from the original 6.9 Mb to only 146 Kb. Right off the bat, we see that the JXL version is more true to the original, being able to retain more of the detail. For instance, the sky looks washed-out in the AVIF version when compared to the JXL, which retains some grain.
Also, I find it very funny that in the AVIF version the mast at the front end of the boat is completely gone, and a new, non-existing mast has appeared out of nowhere on top of the bridge! This is not the case in the JXL version, which is able to retain the original mast.
JXL wins this round, as it is able to provide a higher fidelity representation of the original image at the same size.
In this case I did not upscale the crops, so they are shown at the original resolution. Compare cityscape black-and-white image.
|Image (100% crop)||Description|
|Original, 7.8 Mb|
|JXL, 224 Kb|
|AVIF, 235 Kb|
In this case the AVIF version is 11 Kb larger than the JXL version. I couldn’t get it at the exact same size as the JXL version, so I erred on the side that favours AVIF. Here the results are closer, but we can see some artifacts on the sky surrounding the pointy structure in the AVIF, which are not present in the JXL. After carefully reviewing several spots I can say that this behaviour is consistent across the whole image. This, coupled with the fact that the AVIF image is larger, also gives the edge to JXL.
I used a
cjxl quality of 62, producing
redpanda.jxl at 563 Kb. The
avifenc uses a
cq-level of 34, producing a slightly smaller image at 552 Kb. Both versions produce visible artifacts, but I could not decide on a single crop-in to compare, as both did pretty well with this image. Instead, I computed the SSIM and MSE for each image.
- AVIF – MSE: 35.33327184406321, SSIM: 0.7737482075763144
- JXL – MSE: 39.86481836605985, SSIM: 0.7504872295500076
In this metric, the AVIF version is 0.02 percent closer to the original lossless image. Practically, this does not translate to a perceptible visual improvement in my opinion.
I used a quality setting of 50 for the JXL version, producing a file of 78 Kb, and a setting of 44 for the AVIF version, producing a 77 Kb file. This is an interesting one. The AVIF version presents some bad fuzziness in the transitions from the black outlines to the dark blue uniform, but the JXL version contains a weird gradient in the yellow part. The crops are at 200%.
|Image (200% crop)||Description|
|Original, 312 Kb|
|JXL, 78 Kb|
|AVIF, 77 Kb|
Curiously, the structural similarity in this case is a little bit in favour of JXL, when it is supposed to be worse than AVIF for illustrations. This only goes to show that the SSIM index is not very reliable for these kinds of comparisons.
- JXL – MSE: 18.794290338164252, SSIM: 0.9873632725975594
- AVIF – MSE: 11.44317922705314, SSIM: 0.9813691392049559
This is a very simple image. I used a 500% upscale crop around a specific site which has a little hole at the top left. This hole is retained in both versions (see white pixels to the top-left).
|Image (500% crop)||Description|
|Original, 399 Kb|
|JXL, 173 Kb|
|AVIF, 176 Kb|
Again, the AVIF version is a bit larger. Achieving a good compression ratio is much harder with this image. In general, both the AVIF and JXL present very few artifacts. I would say that the color artifacts in AVIF are a bit worse, but this is kind of subjective. I think this is a draw.
Finally, I also took a look at lossless compression to evaluate the differences in file size when encoding mathematically identical images without any information loss.
|Flag||6.9 Mb||2.0 Mb||2.7 Mb|
|BW cityscape||7.8 Mb||1.2 Mb||4.3 Mb|
|Red panda||23 Mb||15 Mb||14 Mb|
|Fireman wedding||312 Kb||145 Kb||270 Kb|
|Plot||399 Kb||113 Kb||853 Kb|
As you can see, there is no contest. JXL wins every time but one, with some abysmal results, like the BW cityscape, where the lossless JXL image is four times smaller than its AVIF counterpart. Interestingly, the lossless AVIF is twice the size of the original PNG. I’m not sure what the deal with this is, but it is for sure bad. It may be obvious, but let’s state it: If I had to store lots of lossless images I would definitely go for JXL.
What about encoding speed? Well, I did not capture that data (reasons later), but my totally subjective and unscientific impression is that, again JXL has the edge here as well. However, there are many parameters that affect encoding speed, like effort, multithreading, etc., and they are different for the different encoders. So finding a common comparison ground is not always easy. This is a multi-dimensional problem which requires a careful analysis and probably lots of time, so I’ll leave it for another post. Cloudinary and Google both run similar tests concerning encoding speed (here and here respectively), and they came up with contradicting results, so the battle’s still on.
According to the results presented above, my subjective opinion is that JXL is the superior format for both lossy and lossless operations. It is indeed very hard to visually compare image formats by just compressing a bunch of images and looking at them, so we can also compare them by the features they offer. In this aspect, JXL is the clear winner. Let’s see:
- Max image size is limited to 4K (3840x2160) in AVIF, which is a deal breaker to me. You can tile images, but seams are visible at the edges, which makes this unusable. JPEG XL supports image sizes of up to 1,073,741,823x1,073,741,824. You won’t run out of image space anytime soon.
- JXL offers lossless recompression of JPEG images. This is important for compatibility, as you can re-encode JPEG images into JXL for a 30% reduction in file size for free. AVIF has no such feature.
- JXL has a maximum of 32 bits per channel. AVIF supports up to 10.
- JXL is more resilient to generation loss.5
- JXL supports progressive decoding, which is essential in web delivery, IMO. AVIF has no such feature.
- AVIF is notoriously based on the AV1 video encoder. That makes it far superior for animated image sequences, outperforming JXL in this department by a wide margin. However, JXL also supports this feature.
- AVIF is supported in most major browsers. This includes Chrome (and derivatives) and Firefox (and forks). JXL is supported by almost nobody right now. Only Thorium, Pale Moon, LibreWolf, Waterfox, Basilisk and Firefox Nightly incorporate it. Most of these are community-maintained forks of Firefox. That is a big downside for adoption, as I already ranted about in this post.
- Both formats support transparency and wide gamut (HDR).
If I had to choose a format to re-encode all of my photos, I would for sure choose JXL. My image viewers of choice support it (Gnome image viewer, nsxiv, feh) and my image processors of choice support it (GIMP, DarkTable). As seen, it provides better image quality and better compression ratios than AVIF, more features and is generally faster. That said, I don’t think AVIF is completely useless. It is a better option to replace animated GIFs than JXL, and it is quite good at very lossy compressions targeting the smallest file sizes possible, which makes it a good contender for web delivery.