Re: [Libjpeg-turbo-devel] Optimizing JPEG compression speed
SIMD-accelerated libjpeg-compatible JPEG codec library
From: Hao Hu <ih...@gm...> - 2016-12-21 05:51:02
It operates on a single image with multiple threads. I shared my design on this mailing list a while ago, but here it is once more. (I guess DRC was right about using RST markers, but the details are different.)

I can't provide the code, since it lives in a proprietary code base, but the entire implementation should be less than 300 lines of code.

HOWEVER, it only works when you control both sides. Also, I only implemented the decoding side, since that's where I need good performance, but I guess a similar approach could be applied to the compression side.

[image: inline image 1]

2016-12-20 8:42 GMT-08:00 Siddharth Bidasaria <sid...@ci...>:

> Thank you for your responses, they are much appreciated!
>
> I meant that libjpeg-turbo was faster than Pegasus by 40% for decoding JPEGs, but slower by 30% when used for encoding RGB images to JPEGs.
>
> By multi-threading, do you mean encoding multiple images in parallel, or splitting up 1 image into multiple parts, encoding them in parallel, and rejoining them? If it's the second case, could you point me to an example implementation or resource that describes such a process? I tried looking for it myself, but found nothing so far.
>
> Thanks,
>
> *From:* Hao Hu [mailto:ih...@gm...]
> *Sent:* Tuesday, December 20, 2016 12:04 AM
> *To:* libjpeg-turbo Developers <lib...@li...>
> *Subject:* Re: [Libjpeg-turbo-devel] Optimizing JPEG compression speed
>
> If you search for Pegasus on this page, https://sourceforge.net/p/freeimage/discussion/36111/thread/fb51778d/, you'll see that your result is aligned with others' observations.
>
> If you control the compression/decompression, then there are ways to make libjpeg-turbo many times faster by designing a multi-threaded wrapper around it.
>
> With 4 cores, you usually get at least a 3x speedup, but you'll need to come up with the design based on your specific case.
>
> 2016-12-19 15:21 GMT-08:00 DRC <dco...@us...>:
>
> Nothing obvious stands out in your code. I don't quite understand your claims below that "libjpeg-turbo came out behind by 30%" but "libjpeg-turbo seemed to be 40% faster while decoding images at the same quality level". Those seem to be contradictory claims. Please clarify.
>
> You can get probably 10-15% better performance from the dev (1.6 evolving) builds, if you're using an AVX2-equipped CPU (I expect that to be more like 25% when libjpeg-turbo 1.6 ships, but I'm seeking funding to finish out the AVX2 acceleration project.)
>
> I'm not really surprised that Pegasus is faster. They do use SIMD instructions, and they also use multi-threading. The latter is something we cannot do within our library because of the libjpeg architecture, although applications are of course free to implement multi-threaded compression or decompression on their own. Our initial performance goal was to match the performance of the Intel Performance Primitives, which we pretty much have done, but Pegasus has always been a notch faster than IPP in my experience.
>
> Back in the day, I had a TurboJPEG API implementation for Pegasus, which can be found here:
> https://github.com/VirtualGL/virtualgl/tree/029f870488d251b48c3be0b17804e3002f7602cd/jpeg
>
> You should be able to build turbojpegp.c and jpgtest.cpp against Pegasus and obtain benchmark data that closely resembles that of tjbench in libjpeg-turbo. That would be a good sanity check.
>
> On 12/19/16 2:56 PM, Siddharth Bidasaria wrote:
> > Hi all,
> >
> > I am currently using a proprietary jpeg encoding library called Pegasus, and was looking to switch to libjpeg-turbo.
> >
> > I have been conducting some benchmarks between the two libraries and libjpeg-turbo came out behind by 30%.
> > I was quite surprised by this because Pegasus was last updated 4 years ago, and I believe it doesn't make use of SIMD instructions either. Moreover, libjpeg-turbo seemed to be 40% faster while decoding images at the same quality level.
> >
> > Hence, I wanted to rule out that I am using the libjpeg-turbo API incorrectly or inefficiently. If you could have a look at my compression code below and let me know if there are any glaring things I should or shouldn't be doing, I would appreciate it!
>
> _______________________________________________
> Libjpeg-turbo-devel mailing list
> Lib...@li...
> https://lists.sourceforge.net/lists/listinfo/libjpeg-turbo-devel
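
For anyone looking for a starting point on the restart-marker idea mentioned at the top of this thread, here is a minimal sketch in C. It is not the proprietary implementation described above (that code was not shared, and its details are said to differ). It only shows one generic first step: locating RST0..RST7 markers (bytes 0xFFD0 through 0xFFD7) in entropy-coded scan data, so that each restart interval could in principle be handed to a separate thread. The function name `find_rst_markers` is hypothetical, and the sketch assumes that the encoder was configured to emit restart markers and that `data` points at the entropy-coded segment (after the SOS header), not at the whole file.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: scan entropy-coded JPEG data for restart markers
 * (RST0..RST7 = 0xFF 0xD0 .. 0xFF 0xD7).
 *
 * Stores the byte offset of each marker's 0xFF byte in offsets[], up to
 * max_offsets entries, and returns the total number of markers found
 * (which may exceed max_offsets). */
size_t find_rst_markers(const uint8_t *data, size_t len,
                        size_t *offsets, size_t max_offsets)
{
    size_t count = 0;

    for (size_t i = 0; i + 1 < len; i++) {
        if (data[i] != 0xFF)
            continue;

        uint8_t code = data[i + 1];
        if (code >= 0xD0 && code <= 0xD7) {
            /* Restart marker: record where it starts. */
            if (count < max_offsets)
                offsets[count] = i;
            count++;
            i++; /* skip the marker code byte */
        } else if (code == 0x00) {
            /* Byte-stuffed 0xFF: this is image data, not a marker. */
            i++;
        }
        /* Any other code (e.g. 0xFF fill bytes or EOI) is left alone. */
    }
    return count;
}
```

The returned offsets only mark where independently decodable intervals begin. Actually decoding them in parallel still needs a wrapper that sets up per-thread decoder state, which is the part that depends on your specific case and on controlling both the encoder and the decoder.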