libjpeg-turbo | About / Part 2: Quality vs. Size

The previous section examined the use of SmartScale as a means of generating mathematically lossless JPEG files. This section examines whether SmartScale does a better job than baseline JPEG of generating perceptually lossless images. DCT scaling and SmartScale are then evaluated as potential alternatives to low-quality JPEG.

Test Images

artificial.ppm: Ray-traced scene from http://www.imagecompression.info/test_images/ (8-bit version, 3072 x 2048 pixels.)

nightshot_iso_100.ppm: Photographic content from http://www.imagecompression.info/test_images/ (8-bit version, 3136 x 2352 pixels.)

vgl_6434_0018.ppm: A frame capture from the Pro/ENGINEER Viewperf test. This frame represents an exploded rendering of a race car with smooth shading and lighting (1240 x 960 pixels.)

Test Platform

Dell Precision T3500
2.8 GHz quad-core Intel Xeon W3530
4 GB memory
CentOS Enterprise Linux 5.8

Test Methodology

For the performance metrics, the execution time of cjpeg was measured using the Un*x time command, and the result represents the average of 5 runs in which no significant variations were observed.

To obtain the most consistent performance, only 1 CPU core was used.

jpeg-8d was used to perform all tests. It was built with 64-bit code and maximum compiler optimizations (-O3.)
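The timing procedure described above could be approximated with a small harness such as the following. This is only a sketch under stated assumptions: it measures wall-clock time from Python rather than using the time command, the function name average_runtime is made up for illustration, and the command shown is a placeholder rather than the cjpeg invocation used in the study.

```python
import subprocess
import sys
import time


def average_runtime(command, runs=5):
    """Return the mean wall-clock time, in seconds, of `runs` executions of `command`."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the command fails,
        # so a failed run never contributes a bogus timing.
        subprocess.run(command, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings)


if __name__ == "__main__":
    # Placeholder command (an assumption): substitute the encoder
    # invocation that is actually being measured.
    placeholder = [sys.executable, "-c", "pass"]
    print(f"{average_runtime(placeholder):.3f} s")
```

The study does not say how the tests were restricted to a single core. On Linux, one option would be to call os.sched_setaffinity(0, {0}) before launching the runs, but that is an assumption about the setup, not a description of it.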

Perceptually Lossless JPEG

DCT scaling cannot ever be perceptually lossless, and thus it is unlikely that anyone would want to use it as a means of still image storage or as a means of image transfer for quality-critical content. However, decreasing the DCT block size does improve quality, so this section evaluates whether SmartScale can be used to provide better quality at the same compression ratios, or better compression ratios at the same quality, relative to perceptually lossless baseline JPEG.

For remote 3D display applications, quality=95 with no chroma subsampling has been experimentally determined to be perceptually lossless under most viewing conditions, and therefore it is the default setting in both VirtualGL and TurboVNC. Thus, those compression settings served as a reference in these tests. Each image was compressed using the reference settings, then the same image was compressed with DCT block sizes of 1, 2, and 4, and trial and error was used to determine the quality level at which the same compression ratio as the reference case could be achieved. In all cases, the DSSIM of the output image was measured (higher DSSIM = more perceptual loss.)
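The trial-and-error search for a size-equivalent quality level can be pictured as follows. This is a minimal sketch, not the procedure actually used: it assumes that the compression ratio is the uncompressed size of the 8-bit RGB image divided by the compressed file size (the study does not define it explicitly), and the names compression_ratio, closest_quality, and fake_ratio are hypothetical. The fake_ratio function stands in for "encode at this quality and measure the result".

```python
def compression_ratio(width, height, compressed_bytes, channels=3):
    """Uncompressed size of an 8-bit image divided by the compressed size."""
    return (width * height * channels) / compressed_bytes


def closest_quality(ratio_at, target_ratio, qualities=range(1, 101)):
    """Return the quality whose compression ratio is nearest to target_ratio.

    ratio_at is a caller-supplied function mapping quality -> ratio.
    Every candidate quality is tried, so no monotonicity is assumed.
    """
    return min(qualities, key=lambda q: abs(ratio_at(q) - target_ratio))


if __name__ == "__main__":
    # Hypothetical stand-in for a real encode-and-measure step.
    def fake_ratio(q):
        return 40.0 - 0.3 * q

    print(closest_quality(fake_ratio, 16.54))
```

An exhaustive scan over 100 quality levels is slower than a bisection, but it avoids assuming that the compression ratio falls strictly as the quality rises.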

Perceptually Lossless JPEG - Closest Size Equivalents
	artificial.ppm				nightshot_iso_100.ppm				vgl_6434_0018.ppm
	Qual.	Comp. ratio	DSSIM	Time (s)	Qual.	Comp. ratio	DSSIM	Time (s)	Qual.	Comp. ratio	DSSIM	Time (s)
Reference (Huffman coding)	95	14.37	0.0048	0.40	95	10.99	0.0113	0.55	95	10.47	0.0086	0.10
Reference (Arithmetic coding)	95	16.54	0.0048	0.48	95	12.44	0.0113	0.71	95	12.34	0.0086	0.13
Block size=1	51	16.83	0.0024	3.19	15	12.47	0.0780	3.84	75	12.35	0.0022	0.65
Block size=2	61	16.36	0.0028	1.11	37	12.59	0.0254	1.42	76	12.64	0.0023	0.25
Block size=4	83	16.39	0.0027	0.59	78	12.29	0.0128	0.86	86	12.51	0.0048	0.15

As explained in the introduction, reducing the block size decreases or eliminates quantization artifacts, but it also decreases the compression ratio, so in order to make up for that, it was necessary to decrease the quality level. For the cases in which it was necessary to use quality levels less than 75, color banding was observed in the output image. This was true for artificial.ppm as well, despite the fact that the DSSIM metric seemed to indicate that the visual quality was better with SmartScale relative to the reference case. Color banding can clearly be observed around the sides of the mouse in the block size=1 and block size=2 output images. DSSIM was apparently not very good at detecting that kind of non-localized error.

For artificial.ppm and vgl_6434_0018.ppm, the reference compression settings produced some very minor artifacts around a few of the sharp features. These were not visible at 100% zoom, but they typically became visible at about 300%. Reducing the block size either significantly reduced or eliminated these artifacts, but when maintaining an equal compression ratio required reducing the quality beyond a certain level, the result was to trade an invisible artifact for a very visible one. The block size=4 case for artificial.ppm appeared to have equivalent quality to the reference case at 100% zoom. None of the equivalent-size cases for nightshot_iso_100.ppm achieved the same quality as the reference case. For that image, the block size=1 and block size=2 cases produced significant color banding, and the block size=4 case produced visible noise relative to the reference case. For vgl_6434_0018.ppm, all three SmartScale cases appeared to have the same quality as the reference. As with artificial.ppm, the reference case had minor artifacts around some of the sharp features, and the SmartScale cases significantly reduced or eliminated these, but the artifacts were invisible at 100% zoom, so the difference was not perceptible.

Overall, none of the SmartScale modes proved to be an adequate substitute for the existing perceptually lossless JPEG mode. DCT block sizes of 1 and 2 often required using a low JPEG quality to maintain the same compression ratio, and this created visible color banding. Furthermore, the performance of those modes was unacceptable. A DCT block size of 4 achieved performance in the ballpark of the reference case, but on one of the images, it necessitated a low enough quality that visible noise was generated in the output.

This doesn't tell the whole story, though. Since reducing the block size reduces quantization artifacts, all else being equal, the next experiment compared the maximum quality achievable with the SmartScale and baseline JPEG formats. For each test case, the DSSIM was measured at quality=100, then the quality was reduced until the DSSIM changed, and the quality level immediately above this transition point was recorded.
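The search for the transition point can be sketched as a downward scan from quality=100. The names lowest_equivalent_quality and dssim_at are hypothetical, and comparing DSSIM values after rounding to four decimal places is an assumption based on the precision reported in the tables, not something the study states.

```python
def lowest_equivalent_quality(dssim_at, top_quality=100, lowest_quality=1, digits=4):
    """Scan downward from top_quality and return the last quality level
    whose rounded DSSIM still equals the rounded DSSIM at top_quality.

    dssim_at is a caller-supplied function mapping quality -> DSSIM.
    """
    baseline = round(dssim_at(top_quality), digits)
    result = top_quality
    for q in range(top_quality - 1, lowest_quality - 1, -1):
        if round(dssim_at(q), digits) != baseline:
            break  # DSSIM changed: the previous level is the answer
        result = q
    return result


if __name__ == "__main__":
    # Hypothetical DSSIM curve: flat down to quality 76, then rising.
    def fake_dssim(q):
        return 0.0007 if q >= 76 else 0.0007 + (76 - q) * 0.001

    print(lowest_equivalent_quality(fake_dssim))
```

If the DSSIM already differs at the first step below top_quality, the function returns top_quality itself, which corresponds to the rows in which quality=100 was recorded.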

Perceptually Lossless JPEG - Maximum Achievable Quality
	artificial.ppm			nightshot_iso_100.ppm			vgl_6434_0018.ppm
	Qual.	Comp. ratio	DSSIM	Qual.	Comp. ratio	DSSIM	Qual.	Comp. ratio	DSSIM
Reference (Huffman coding)	100	6.705	0.0010	100	3.140	0.0051	100	5.063	0.0018
Reference (Arithmetic coding)	100	7.148	0.0010	100	3.684	0.0051	100	5.902	0.0018
Block size=1	76	12.68	0.0007	76	3.621	0.0028	76	12.08	0.0016
Block size=2	93	8.327	0.0007	93	2.742	0.0028	93	8.020	0.0016
Block size=4	99	6.849	0.0008	100	2.525	0.0028	99	6.140	0.0016

In no case were any of these images observed to have any perceptible loss at any zoom level, and only for nightshot_iso_100.ppm did the DSSIM indicate that a significantly better quality was achieved using SmartScale.

For artificial.ppm and vgl_6434_0018.ppm, it was possible to achieve maximum quality at a higher compression ratio using SmartScale, but not so for nightshot_iso_100.ppm. Furthermore, achieving a significantly higher compression ratio required sacrificing a lot of performance. As with the size-equivalent cases above, only a DCT block size of 4 produced performance in the same ballpark as the reference cases. Also, the higher compression ratios achieved using a block size of 1 were not much better than those that could be achieved using libpng, which has already been established to be much faster than encoding a block size=1 SmartScale file.

Conclusion

For high-quality JPEG, it is possible in some cases to achieve a lower DSSIM with the same compression ratio by using SmartScale rather than baseline, but this is not true for all image types. Furthermore, it is possible in some cases to achieve maximum quality at a higher compression ratio using SmartScale, but only by significantly sacrificing performance, and this is also not true for all image types. Also, in the cases for which the DSSIM indicated that the quality was better, the improvement was not obvious at 100% zoom, and in some cases, the DSSIM metric failed to pick up on the non-localized color banding introduced by decreasing the quality level.

In the case of streaming and remote display, perceptually lossless JPEG would rarely be used in a bandwidth-constrained environment, so the encoder performance tends to be a much more relevant metric than the compression ratio. Even if the same 2-4x speedup could be achieved with 1x1 or 2x2 SmartScale as was achieved with baseline in libjpeg-turbo, the performance would still be way too slow for streaming and remote display over high-speed networks.

In summary, SmartScale does not generally accomplish anything vis-a-vis encoding perceptually lossless JPEGs that can't already be accomplished using other means.

Low-Quality JPEG

VirtualGL and TurboVNC use quality=30 with 4:2:0 chroma subsampling as a "low-quality" preset, both because the compression ratio of this mode has been experimentally determined to be about 4 times that of the "perceptually lossless JPEG" preset and because, while the artifacts incurred by this mode are significant, they are generally deemed to be usable. This section evaluates whether SmartScale or DCT scaling can be used to provide better compression than "low-quality JPEG" without sacrificing quality.

For each image, the compression ratio and DSSIM were measured using the reference settings of quality=30 and 4:2:0 subsampling. Then, for DCT block sizes of 1, 2, and 4, as well as scaling factors of 1/2 and 8/11, trial and error was used to determine the quality at which the same DSSIM as the reference case could be achieved.

Low-Quality JPEG - Closest Quality Equivalents
	artificial.ppm				nightshot_iso_100.ppm				vgl_6434_0018.ppm
	Qual.	Comp. ratio	DSSIM	Time (s)	Qual.	Comp. ratio	DSSIM	Time (s)	Qual.	Comp. ratio	DSSIM	Time (s)
Reference (Huffman coding)	30	75.37	0.0635	0.26	30	89.27	0.0774	0.32	30	68.11	0.1008	0.08
Reference (Arithmetic coding)	30	105.7	0.0635	0.26	30	132.1	0.0774	0.31	30	98.60	0.1008	0.08
Block size=1	6	53.80	0.0650	1.62	16	18.76	0.0742	2.06	6	44.74	0.1021	0.34
Block size=2	7	67.30	0.0660	0.58	17	35.04	0.0758	0.81	7	54.02	0.0884	0.14
Block size=4	12	89.62	0.0639	0.33	20	73.45	0.0754	0.45	10	77.25	0.1012	0.09
Scale=1/2	90	73.32	0.0623	0.21	75	157.2	0.0775	0.21	90	79.36	0.1002	0.07
Scale=8/11	66	89.59	0.0632	0.22	44	155.1	0.0770	0.24	72	81.38	0.1018	0.07

The point of this exercise was to improve the compression ratio, and in almost no case did SmartScale or DCT scaling succeed, so little further discussion is needed. However, it is useful to briefly examine how these tests highlighted the trade-offs of both DCT scaling and SmartScale.

With nightshot_iso_100.ppm, DCT scaling did produce better compression ratios and much more accurate color reproduction relative to the reference case. However, the buildings (arguably the most important part of the image) were not rendered as accurately in either of the scaling cases due to ringing and block edge artifacts. In the SmartScale cases, the color reproduction was less accurate, causing more banding artifacts around the streetlights and, in general, a less smooth appearance, but arguably, those cases did do a slightly better job of rendering the buildings.

With artificial.ppm, comparing the output images really highlights the trade-off between quantization level and block size. In the SmartScale cases, the colors were represented much less accurately because of the lower quality level, but the blocking artifacts due to quantization were greatly reduced or eliminated. The DCT scaling cases showed a very clear trade-off between blocking artifacts and ringing/block edge artifacts. The reference settings tended to do a better job of representing detail, such as the text on the 3D glasses and the area around the mouse button, but the DCT scaling settings did a better job of representing the colors and the edges of the maze pattern.

With vgl_6434_0018.ppm, the trend was similar to that of artificial.ppm, in that the SmartScale cases had fewer artifacts but worse color reproduction. As with nightshot_iso_100.ppm, using 1/2 DCT scaling reproduced the background and the colors more accurately, but it introduced many more artifacts into the detail section of the image, which in the case of a CAD drawing is definitely a bad thing. The 8/11 scaling case did a better job of balancing the different types of artifacts, and its overall visual quality was very similar to that of the reference image.

Conclusion

For all of the images, SmartScale reduced or eliminated the blocking artifacts, which was expected, but on the computer-generated images, it required such a low quality level that the color reproduction was very inaccurate, producing images that looked a lot like indexed color. On the photographic content, the trade-off between color accuracy and blocking artifacts was more of a wash.

As with the Perceptually Lossless cases, using a block size of 1 or 2 significantly decreased performance.

For the photographic image, DCT scaling was shown to produce better compression ratios at the same measured quality, relative to high quantization. However, hidden in these numbers was the dirty little secret of DCT scaling, which is that it tends to concentrate error in the high-frequency areas of the image, precisely where that error is least desired. This was the case when using 1/2 scaling with the CAD drawing as well, although for both that image and artificial.ppm, the error introduced by 8/11 scaling was more of a wash relative to the error introduced by the reference settings.

In general, the biggest piece of low-hanging fruit for streaming/remote display applications is arithmetic coding, since that was shown to significantly increase the compression ratio relative to Huffman-coded JPEG.
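To put a number on that gain, the two reference rows of each table can be converted into a file-size reduction. The ratios below are copied from the tables above. The conversion assumes that the compression ratio is uncompressed size divided by compressed size, so that for a fixed image the compressed size is inversely proportional to the ratio. The names REFERENCE_RATIOS and size_reduction are illustrative.

```python
# (Huffman ratio, arithmetic ratio) from the reference rows of the tables.
REFERENCE_RATIOS = {
    "perceptually lossless (quality=95)": {
        "artificial.ppm": (14.37, 16.54),
        "nightshot_iso_100.ppm": (10.99, 12.44),
        "vgl_6434_0018.ppm": (10.47, 12.34),
    },
    "low quality (quality=30, 4:2:0)": {
        "artificial.ppm": (75.37, 105.7),
        "nightshot_iso_100.ppm": (89.27, 132.1),
        "vgl_6434_0018.ppm": (68.11, 98.60),
    },
}


def size_reduction(huffman_ratio, arithmetic_ratio):
    """Fraction by which the file shrinks when switching to arithmetic coding.

    For a fixed image, compressed size is proportional to 1 / ratio.
    """
    return 1.0 - huffman_ratio / arithmetic_ratio


if __name__ == "__main__":
    for preset, images in REFERENCE_RATIOS.items():
        print(preset)
        for name, (huffman, arithmetic) in images.items():
            print(f"  {name}: {size_reduction(huffman, arithmetic):.1%} smaller")
```

With these figures, the arithmetic-coded files come out roughly 12 to 15 percent smaller at quality=95 and roughly 29 to 32 percent smaller at quality=30.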

In summary, SmartScale and DCT scaling do not generally accomplish anything vis-a-vis encoding low-quality JPEGs that can't already be accomplished using other means.
