Opened 9 years ago

Last modified 8 months ago

#4638 open enhancement

Multithreaded FLAC encoding

Reported by: xtemp09
Owned by:
Priority: wish
Component: avcodec
Version: git-master
Keywords: flac
Cc: dqeswn@gmail.com
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

I found that FLAC encoding uses only 100% of CPU instead of, e.g., 1200% (I have a 12-core computer). Encoding large files at the maximum compression level takes a very long time.

Could you add multithreading to FLAC encoding? Maybe OpenCL would help ordinary users too. flaCCL and FlaCuda are multithreaded and have better compression than ffmpeg's encoder. Could you include them in ffmpeg?

I hope this request won't hang here for 2-3 years like many others. :'(

Change History (10)

comment:1 by gjdfgh, 9 years ago

Run multiple encoders at the same time, concatenate the result.
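
The suggestion above can be sketched roughly as follows. This is only an illustration, not something proposed in the ticket: the helper names (split_points, encode_cmd, concat_cmd, encode_parallel) and the chunk file names are hypothetical, and it assumes that ffmpeg's -ss/-t input options cut PCM input cleanly at the chunk boundaries and that the concat demuxer with stream copy yields a usable FLAC file. Both assumptions should be verified on real data.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def split_points(total_seconds, parts):
    """Return (start, duration) pairs that cover the input in equal chunks."""
    step = total_seconds / parts
    return [(i * step, step) for i in range(parts)]


def encode_cmd(src, dst, start, duration, level):
    """Build an ffmpeg command that encodes one time slice of src to FLAC."""
    return ["ffmpeg", "-y", "-ss", str(start), "-t", str(duration), "-i", src,
            "-c:a", "flac", "-compression_level", str(level), dst]


def concat_cmd(list_file, dst):
    """Build an ffmpeg command that joins the chunks without re-encoding."""
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0", "-i", list_file,
            "-c", "copy", dst]


def encode_parallel(src, dst, total_seconds, parts, level=8):
    """Encode `parts` slices concurrently, then concatenate them (hypothetical helper)."""
    chunks = [f"chunk_{i:03d}.flac" for i in range(parts)]
    cmds = [encode_cmd(src, chunk, start, duration, level)
            for chunk, (start, duration)
            in zip(chunks, split_points(total_seconds, parts))]
    # Threads are enough here: each worker only waits on an ffmpeg process.
    with ThreadPoolExecutor(max_workers=parts) as pool:
        list(pool.map(lambda cmd: subprocess.run(cmd, check=True), cmds))
    # The concat demuxer reads a text file with one "file '...'" line per chunk.
    with open("chunks.txt", "w") as f:
        f.writelines(f"file '{chunk}'\n" for chunk in chunks)
    subprocess.run(concat_cmd("chunks.txt", dst), check=True)
```

For a one-hour file on a 12-core machine, a call such as encode_parallel("source.wav", "output.flac", 3600, 12) would start twelve encoders at once. Header fields of the joined file (total sample count, MD5) and the frame numbering may not match what a single-pass encode would produce, so the result should be checked by decoding it.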

comment:2 by gjdfgh, 9 years ago

Also, what's the use-case? Do you have to encode dozens of CDs per second? Won't the output of the decoder be too much for your storage speed or network bandwidth to handle?

This makes little sense.

in reply to:  description comment:3 by Carl Eugen Hoyos, 9 years ago

Component: undetermined → avcodec
Keywords: flac added
Priority: normal → wish
Status: new → open
Version: unspecified → git-master

Replying to xtemp09:

flaCCL, FlaCuda [...] have better compression than ffmpeg's encoder.

This sounds important: Could you provide samples?

I hope this request won't hang here for 2-3 years like many others. :'(

I think it is safe to say that this will hang until you send a patch.

comment:4 by xtemp09, 9 years ago

I'm sorry, of course it's a mistake; I thought that ffmpeg uses libflac. I meant that flaCCL has better compression than libflac. Besides better compression, it's ~7 times faster than libflac, since both OpenCL and CUDA increase encoding speed considerably.

gjdfgh, I started converting a large WAV file (1 hour long) into FLAC last Saturday on a 12-core Xeon cluster at the maximum compression level. It's about 80% done. The thing that makes me sad is that only 100% of CPU is used and my kick-ass video card is of no use here. :(

If I could write a patch, I wouldn't have made the feature request; I would have done it myself. :)

comment:5 by gjdfgh, 9 years ago

I meant that flaCCL has better compression than libflac. Besides better compression, it's ~7 times faster than libflac, since both OpenCL and CUDA increase encoding speed considerably.

This is questionable and might just be due to different settings. Also someone else said libflac got some SIMD optimizations recently.

at maximum compression level.

These things tend to get exponentially slower at higher levels for very little gain. Meaning you're trading minutes of encoding time for a few kilobytes of compression gain.
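
One way to check this trade-off on a given file is to time the encoder at several levels and compare output sizes. The following is a minimal sketch under stated assumptions: flac_cmd, timed_run and sweep are hypothetical helper names, the output file names are made up, and the only ffmpeg option relied on is -compression_level, as used elsewhere in this ticket.

```python
import os
import subprocess
import time


def flac_cmd(src, dst, level):
    """Build an ffmpeg command that encodes src to FLAC at the given level."""
    return ["ffmpeg", "-y", "-i", src, "-compression_level", str(level), dst]


def timed_run(cmd):
    """Run cmd to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def sweep(src, levels):
    """Return {level: (seconds, output_bytes)} for each compression level."""
    results = {}
    for level in levels:
        dst = f"level_{level}.flac"  # hypothetical output name
        seconds = timed_run(flac_cmd(src, dst, level))
        results[level] = (seconds, os.path.getsize(dst))
    return results
```

Calling sweep("source.wav", [5, 8, 12]) and comparing the (seconds, bytes) pairs shows how much extra time each level costs for how many bytes saved on that particular input.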

in reply to:  5 comment:6 by xtemp09, 9 years ago

Replying to gjdfgh:

Also someone else said libflac got some SIMD optimizations recently.

Libflac and not ffmpeg?

What if the audio is part of a video file? Multithreading can use all the cores, and GPU acceleration can significantly reduce the CPU workload (not to mention the considerable increase in speed).

comment:7 by gjdfgh, 9 years ago

Libflac and not ffmpeg?

Well, the flaccl website didn't compare with ffmpeg.

comment:8 by xtemp09, 9 years ago

I just ran a benchmark to show the importance of hardware acceleration and multithreading.

I used: CUETools.FLACCL.cmd.exe 0.3 on a Windows machine and ffmpeg N-47387-g178ba1f- on a Linux cluster.

The configuration of the machines:

The Windows machine: 
NVIDIA GeForce GTX 650 Ti
Phenom II X3 720 2.8 GHz
SSD OCZ

The Linux cluster:
12 cores of Intel Xeon CPU X5650 @ 2.67GHz

I executed the following commands:

.\time.exe .\CUETools.FLACCL.cmd.exe -11 --no-md5 --cpu-threads 3 --opencl-platform 'NVIDIA Cuda' source.wav -o output_flacCL.flac

time ffmpeg -i source.wav -compression_level 9 output_ffmpeg.flac

In a nutshell, I set maximum compression.

The results are:

flacCL  15.65 s
ffmpeg  62.444 s

I believe that IO takes a few seconds, but the difference is huge anyway.
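
For reference, the ratio implied by the two timings above can be worked out as below. This is only arithmetic on the reported numbers; the two runs used different machines and different encoders, so it is not a like-for-like measurement. The io_seconds parameter is a hypothetical allowance for the "few seconds" of IO mentioned above, and the value 3.0 in the usage note is an assumption.

```python
FLACCL_SECONDS = 15.65   # reported flacCL wall-clock time
FFMPEG_SECONDS = 62.444  # reported ffmpeg wall-clock time


def speedup(slow_seconds, fast_seconds, io_seconds=0.0):
    """Ratio of the two run times after subtracting a shared IO allowance."""
    return (slow_seconds - io_seconds) / (fast_seconds - io_seconds)


raw_ratio = speedup(FFMPEG_SECONDS, FLACCL_SECONDS)  # about 3.99
print(f"{raw_ratio:.2f}x")
```

If both runs spent the same time on IO, subtracting it only widens the gap: speedup(FFMPEG_SECONDS, FLACCL_SECONDS, 3.0) comes out at roughly 4.7.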

The entire Intel Xeon node of a Linux cluster utilized only 100% of CPU. Meanwhile, flacCL didn't use CPU at all, only the GPU. (I'm not sure about CPU workload on that Windows machine)

Last edited 9 years ago by xtemp09

in reply to:  2 comment:9 by mzso, 7 years ago

Cc: dqeswn@gmail.com added

Replying to gjdfgh:

Also, what's the use-case? Do you have to encode dozens of CDs per second? Won't the output of the decoder be too much for your storage speed or network bandwidth to handle?

This makes little sense.

Utilizing all cores and encoding at ~4-12 times the speed doesn't make sense?
What strange world do you live in, where people want to keep encoding as slow as possible?

I for one transcoded some DTS-HD MA movie streams to FLAC (level 12), roughly halving the space they took, and I would have been really happy if it had taken about a fourth of the time on my quad-core CPU.

comment:10 by wondras, 8 months ago

Apologies for waking up such an old ticket, but I wanted to mention an obscure use-case where FLAC encoding speed becomes quite important:

The "Domesday Duplicator" project has a set of hardware/software tools for capturing and archiving analog laserdiscs in a raw form, directly from the laser pickup. The "ld-decode" software stack is later used to extract and process the audio, video and data contained in the capture.

Standard Domesday captures use 10-bit resolution at 40 MHz. A 30-minute side of a CAV laserdisc takes up 144 GB when stored as signed 16-bit integers (two bytes per sample), though the 10-bit values can be packed down to 90 GB in what are now called ".lds" files.
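
The sizes quoted above follow directly from the sample rate and the duration. A quick check, using decimal gigabytes (1 GB = 10^9 bytes); the variable names are only for illustration:

```python
SAMPLE_RATE_HZ = 40_000_000  # 40 MHz capture rate
SIDE_SECONDS = 30 * 60       # one 30-minute CAV side

samples = SAMPLE_RATE_HZ * SIDE_SECONDS  # total samples in one side
bytes_16bit = samples * 2                # two bytes per sample
bytes_packed = samples * 10 // 8         # 10 bits per sample, tightly packed

print(samples)             # 72000000000
print(bytes_16bit / 1e9)   # 144.0 (GB)
print(bytes_packed / 1e9)  # 90.0 (GB)
```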

By guesswork and luck, we discovered that FLAC does a surprisingly good job of compressing the 16-bit raw samples as audio. (The signal is modulated on a 9 MHz carrier, so when slowed down by a factor of 1000, it sounds like a warbly whistle.) The final file size is usually around 50 GB -- nearly half the size of the simple packing. These are now known as ".ldf" files.
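
As a rough check of the figures in this comment: slowing a 9 MHz carrier by a factor of 1000 puts it at 9 kHz, well inside the audible range, and the 40 MHz capture then plays back at 40 kHz. The size ratios below use the approximate 50 GB figure, so they are ballpark values only.

```python
SLOWDOWN = 1000
CARRIER_HZ = 9_000_000       # 9 MHz carrier
SAMPLE_RATE_HZ = 40_000_000  # 40 MHz capture rate

audible_hz = CARRIER_HZ / SLOWDOWN        # carrier pitch when slowed down
playback_hz = SAMPLE_RATE_HZ / SLOWDOWN   # effective audio sample rate

ratio_vs_packed = 50 / 90   # .ldf size relative to the 10-bit packed .lds
ratio_vs_raw = 50 / 144     # .ldf size relative to raw 16-bit samples

print(audible_hz, playback_hz)     # 9000.0 40000.0
print(round(ratio_vs_packed, 3))   # 0.556
print(round(ratio_vs_raw, 3))      # 0.347
```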

These compressed files are extremely useful for reducing storage space/cost when archiving multiple titles. Naturally, encoding of 144 GB takes a very long time. FFmpeg is the "gold standard" for this purpose, due to its reliability, and its willingness to deal with streams of this size. (The reference FLAC encoder flat-out refuses, I think due to a 32-bit stream position pointer that overflows beyond a certain length. FFmpeg can wrap it in an .oga container, which avoids this problem.)
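
To put the suspected overflow in numbers, the sketch below compares one 30-minute side against two counter widths. The 32-bit limit is the guess made in this comment and has not been confirmed here; the 36-bit figure is the width of the total-samples field in the FLAC STREAMINFO block. Which limit, if either, the reference encoder actually hits is not established by this ticket.

```python
samples = 40_000_000 * 30 * 60  # samples in one 30-minute side at 40 MHz
raw_bytes = samples * 2         # stored as 16-bit integers

UINT32_MAX = 2**32 - 1               # largest value of a 32-bit unsigned counter
STREAMINFO_MAX_SAMPLES = 2**36 - 1   # largest value of a 36-bit sample count

print(samples > UINT32_MAX)              # True
print(raw_bytes > UINT32_MAX)            # True
print(samples > STREAMINFO_MAX_SAMPLES)  # True
```

Either counter would be too small for a capture of this length, which is consistent with the refusal described above.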

There is a customized build of FlacCL called FlaLDF that makes it compatible as well, and it is much faster, but I find results to be inconsistent; this is especially true at the higher compression settings, which is where the speed is most needed.

This may be a niche use by a handful of crazed data hoarders, but I wanted to mention it just in case it piques someone's interest. If not, I'll just say thanks for all the hard work you've already done and continue to do. Cheers!
