What About mozjpeg?
On March 5, 2014, Mozilla announced the mozjpeg project, a JPEG encoder that is designed to provide better compression for web images, at the expense of performance. Since mozjpeg builds upon the libjpeg-turbo source code, quite a few people have asked, "why didn't Mozilla just integrate their changes into libjpeg-turbo?" The words "at the expense of performance" provide the simplest answer to that question, but I felt that a more detailed review was in order.
For starters, Mozilla did approach me regarding the possibility of integrating their code into libjpeg-turbo, but we both basically agreed that the project goals are incompatible. As the README file for mozjpeg says:
'mozjpeg' is not intended to be a general JPEG library replacement. It makes tradeoffs that are intended to benefit Web use cases and focuses solely on improving encoding. It is best used as part of a Web encoding workflow. For a general JPEG library (e.g. your system libjpeg), especially if you care about decoding, we recommend libjpeg-turbo.
mozjpeg's sole purpose is to reduce the size of JPEG files that are served up on the web. Thus, Mozilla wants their solution to compress as tightly as possible out of the box. We, on the other hand, want our solution to compress as quickly as possible out of the box. There is really no way to reconcile those two goals.
mozjpeg, which as of this writing is at version 2.1, relies on three methods to reduce the size of JPEG images: progressive entropy encoding, jpgcrush (an algorithm that optimizes the configuration of the progressive scans), and trellis quantization.
Each of these methods provides an incremental improvement in compression ratio, but with an extreme loss of compression speed.
Probably the best way to illustrate the above statement is to put some numbers to it. The attached spreadsheet compares the performance of various compression modes in libjpeg v6b, libjpeg-turbo v1.3.x, and mozjpeg v2.1 using the same images and benchmarks that were used in the libjpeg-turbo Performance Study. I do not necessarily claim that these images are canonical or representative of the types of images that mozjpeg will normally be tasked with. However, the test images I use were chosen in part because, over the years, I have discovered that they each have unique performance characteristics that cover a fairly broad cross-section of images, some of which compress well with JPEG and some of which don't. At the end of the day, though, mozjpeg will probably be used mostly on photographic content.
The following table breaks down the incremental performance and compression ratio gains/losses of the individual features that mozjpeg uses to reduce the size of JPEG files:
libjpeg-turbo's performance optimizations primarily benefit baseline JPEGs, so while progressive encoding is fully supported in libjpeg-turbo, the speedup it gives relative to libjpeg is only about 25-40%, not the 200-400% that can be achieved when producing baseline JPEGs. To put this another way, compressing progressive JPEGs in libjpeg-turbo is about 1/8 as fast as compressing baseline JPEGs, as the table above indicates (0.121 ~= 1/8.) libjpeg-turbo can decompress progressive JPEGs about 40-65% faster than libjpeg, but this is still in the neighborhood of 1/3 as fast as decompressing baseline JPEGs. In short, progressive encoding is not a "turbo" feature.
jpgcrush (which has been part of mozjpeg from the beginning) produces only a very small improvement in compression ratio (a few percent at most.) Meanwhile, it decreases performance by another 70%. Trellis quantization, which was introduced in mozjpeg v2.0, improves the compression ratio by as much as 15% and actually improves decompression performance significantly as well, but it decreases compression performance by another 30-50%. Thus, trellis quantization provides the best trade-off of the three technologies used, but that trade-off is still not particularly great.
It should be re-emphasized that, unlike jpgcrush and progressive encoding, trellis quantization does not produce images that are mathematically equivalent to baseline JPEG images. Thus, comparing the compression ratio of images generated with and without trellis quantization is technically comparing apples to oranges. In this case, the settings we used (no subsampling, quality=95) resulted in almost the same DSSIM with and without trellis quantization enabled, but the PSNR with trellis quantization enabled was measurably lower. I tend to trust DSSIM more than PSNR, but it is possible that, if I were to normalize for equivalent quality in the output images, the compression ratio improvements cited below would not be quite as good. For a more thorough analysis of compression ratio vs. quality trade-offs between mozjpeg and libjpeg-turbo, I refer the reader to Josh Aas's article on the topic.
The overall performance of mozjpeg v2.1 relative to the best-compressing mode in libjpeg-turbo is as follows:
Mozilla claims 5% better compression, on average, than libjpeg-turbo. The images I used actually showed an 8% average gain. This is definitely more compelling than what mozjpeg v1.0.0 was able to produce, but the 8% gain in compression ratio looks a bit less compelling when juxtaposed with the 4-6x drop in compression performance.
Compared to baseline, the overall performance of mozjpeg's default mode is as follows:
Again, 20% is a compelling improvement in compression ratio, but the 50x drop in compression performance definitely limits its usefulness, as does the 2x drop in decompression performance. Also, it's important to note that arithmetic entropy coding can produce about the same improvement in compression ratio while achieving 8 times the compression performance of mozjpeg:
It should also be noted that arithmetic coding produces images that are mathematically equivalent to baseline. However, the JPEG images generated with arithmetic coding decompress more slowly than those generated by mozjpeg. Further, because there were some patent concerns around arithmetic coding until the patent expired in the mid-2000s, arithmetic coding is still not widely supported by applications, despite being an official part of the JPEG spec.
One interesting aspect of trellis quantization is that, unlike jpgcrush, it does not require progressive encoding. The following table demonstrates its benefits when added to baseline encoding:
The overall compression performance achieved here is about twice that of mozjpeg's default settings (0.041 ~= 1/25 of baseline compared with 0.022 ~= 1/50 of baseline), and the compression ratio improvement is still significant (13% average vs. 20% average.) Further, this mode produces images that decompress faster than baseline, so if encoding performance is not an issue but disk space, bandwidth, and decompression performance are, then using trellis quantization along with baseline JPEG is a potential solution.
NOTE: I also wanted to test trellis quantization without jpgcrush (in the progressive case) and without optimized entropy coding (in the baseline case), but neither mode apparently works. My hope was that trellis encoding alone might provide a better performance vs. compression ratio trade-off than when it is combined with the other features.
In addition to the conflicting project goals, the other show-stopping issue at the moment is that the jpgcrush and trellis quantization features break ABI compatibility. The code that implements these features adds new fields to the exposed libjpeg compress structure (jpeg_compress_struct), and thus programs that link against mozjpeg will not be able to use libjpeg-turbo at run time (and vice versa.) This is exactly the same problem that was introduced by jpeg-7 and later (jpeg-7, jpeg-8, and jpeg-9 also extended the exposed libjpeg structures in order to support the SmartScale and forward DCT scaling features, and in doing so, every one of those releases broke ABI compatibility with the previous release.) However, Mozilla has agreed to work with us in developing a framework for extending the libjpeg API without breaking backward ABI compatibility, so fortunately, this problem is only temporary. Once the ABI compatibility issues are resolved, then we could at least consider adopting some of the mozjpeg features, if there were a compelling reason to do so. In general, though, mozjpeg is in a very high-velocity phase of development right now. Even if I felt like there was a high demand for improving compression ratio at the expense of performance, I would be loath to merge any of the mozjpeg features into libjpeg-turbo until the features have stabilized somewhat and the community has had a chance to process them.
When Mozilla first approached me about this, my first reaction was to ask why they went to the trouble of switching to libjpeg-turbo in Firefox if they were going to turn right around and encourage the use of a JPEG format that doesn't benefit from libjpeg-turbo's performance enhancements. Their answer was that CPU time is not the primary bottleneck when loading web pages. Perhaps they're right. Personally, though, I have my doubts as to whether a 20% average reduction in bandwidth relative to baseline, or an 8% average reduction in bandwidth relative to progressive, is going to matter much. According to Akamai, the global average Internet connection speed increased 10% in the third quarter of 2013 alone. The early adopters of this technology seem to be more interested in saving disk space on their servers than anything else. Facebook, for instance, already resizes and re-compresses every image you upload, to the point that the images are barely usable by anyone you share them with. Now it appears that Facebook may re-compress them all with mozjpeg in order to further reduce their disk footprint. Does this make sense? It depends on what costs more in the long term: 20% more disk space or using 50 times more CPU power to do the recompression. My other concern is that, as Internet speeds increase, the underlying assumption behind mozjpeg (that decompression speed is not the bottleneck) may no longer be valid. Per the above figures, decompressing images compressed with mozjpeg takes about twice as long as decompressing baseline JPEG images.
I can understand why some large-scale web hosts might be interested in re-encoding their images losslessly using mozjpeg, but I don't have a very good grasp of why a site like Facebook, which has already re-compressed my photos 3 or 4 times in the last 5 years, wouldn't just reduce the default quality for new images if disk space is that much of a concern. Further, it seems like it would be much more fruitful in the long term to develop an algorithm that figures out, based on perceptual metrics (DSSIM, for instance), what the "acceptable" amount of loss is for a given image (roughly similar in concept to what DCTune does.) That would have the potential to reduce the size of uploaded images by a lot more than 20%. By way of example, most digital cameras use JPEG quality 98 when storing JPEG images (this is also the equivalent of the highest-quality JPEG setting in Photoshop.) Irrespective of any resizing that may also occur, Facebook currently re-compresses your images using JPEG quality 80. That's easily a 3-4x difference in size relative to JPEG quality 98. My point is that tweaking the quality to meet a particular perceptual target has a lot more potential for size reduction than futzing around with the quantization tables.
In a broader sense, as libjpeg-turbo gains more popularity as a project, I have to field more and more requests from the community to integrate new features, many of which are not very "project-friendly" (that is, they break ABI compatibility or create some sort of maintenance burden or they don't perform well or they simply aren't written very elegantly.) I want to support the community as much as I can, but I also have to be a bit of a mercenary about this stuff. I am an independent developer, so I do not earn a salary for my work on open source projects. I do things this way because it allows me to more easily respond to the needs of a variety of different organizations rather than being confined to the narrow agenda of just one company. The open source projects I maintain are ultimately healthier because of this. However, being independent also means that, unless an organization is specifically paying for me to work on a project, any work I do on that project is pro bono. In the case of libjpeg-turbo, I only get paid when I'm writing code for a paying customer. I don't generally get paid to answer questions on the mailing lists or to integrate patches from the community. I also don't get paid when I take lunch breaks or vacations or sick days. I am also not getting paid to write this. I contribute a significant amount of pro bono labor to libjpeg-turbo already, and it has reached the point at which trying to make everyone happy is cutting into my ability to pay rent. At some point, I have to step back and acknowledge that libjpeg-turbo cannot be all things to all people. At the end of the day, 100% of the income I have made from my work on this project has come from organizations who are interested in its high performance, either directly or indirectly (via VirtualGL and TurboVNC, my other pet projects.) 
Thus, whereas it's pretty easy to get me excited about a feature that significantly improves the performance of libjpeg-turbo, or a feature that improves the compression ratio of JPEG images without affecting performance, it is very hard to build a business case for working on the mozjpeg features until/unless the compression performance improves dramatically.
Despite the fact that my personal interest in libjpeg-turbo is 100% confined to baseline encoding at the moment, I maintain libjpeg-turbo as a general-purpose project-- including donating a lot of free labor to it-- because I believe in what it can do for the community. I am proud of what it has done already. I admit that I take a conservative approach when it comes to maintaining this project, and I understand that that makes it more difficult for people to contribute experimental features to the project. That's kind of the point, though. I want libjpeg-turbo to always be stable and performant, not experimental. Thus, whenever I add features to the code, I try my hardest to always maintain at least a beta level of quality, even in the subversion trunk. I take special care to ensure that neither stability nor performance regress when significant new features are added. I take special care to ensure that ABI compatibility is maintained and that any code that is accepted into the project is code that will not create maintenance problems for me or others down the road. I stand by my anal retentiveness, but I totally understand why a research project such as mozjpeg would find it to be an impediment to their progress. They're trying to build an isolated, fit-for-purpose toolkit. I'm trying to avoid breaking the JPEG library in every Linux distro on the planet.
I think a lot of the confusion about mozjpeg stems from the misuse of the word "fork" by some in the community. mozjpeg is not really a fork per se, as that would imply that they are seeking to supplant libjpeg-turbo. Rather, they are a special-purpose JPEG encoder that just happens to build upon the libjpeg-turbo source code. Their goals are 180 degrees different from ours, not to mention being much narrower in scope and (at least for now) much higher in velocity. We can debate whether or not the end result of mozjpeg will induce cheering or yawning, but no one is claiming that mozjpeg is a replacement for libjpeg-turbo. In fact, Mozilla's own product (Firefox) will continue to use libjpeg-turbo.
In short, even though I have my doubts about whether mozjpeg will achieve what it is setting out to achieve, it seems "mostly harmless" from my point of view, as long as people understand what that project is and what it isn't. Ultimately, the community will decide whether it's important, not me.
All content on this web site is licensed under the Creative Commons Attribution 2.5 License. Any works containing material derived from this web site must cite The libjpeg-turbo Project as the source of the material and list the current URL for the libjpeg-turbo web site.