Opened 3 years ago

Last modified 17 months ago

#4638 open enhancement

Multithreaded FLAC encoding

Reported by: xtemp09
Owned by:
Priority: wish
Component: avcodec
Version: git-master
Keywords: flac
Cc: dqeswn@gmail.com
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

I found that FLAC encoding uses only 100% of CPU (a single core) instead of, e.g., 1200% (I have a 12-core computer). Encoding large files at the maximum compression level takes a very long time.

Could you add multithreading to FLAC encoding? OpenCL support might help ordinary users too. flaCCL and FlaCuda are multithreaded and have better compression than ffmpeg's encoder. Could you include them in ffmpeg?

I hope this request won't hang here for 2-3 years like many others. :'(

Change History (9)

comment:1 Changed 3 years ago by gjdfgh

Run multiple encoders at the same time, concatenate the result.
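A minimal sketch of this workaround in Python, under several assumptions: `ffmpeg` is on the PATH, the source duration is known in advance, and the segment and list file names are hypothetical. The segments are joined with ffmpeg's concat demuxer and `-c copy` rather than by raw byte concatenation. Whether the joined file is sample-exact at the segment boundaries has not been verified here.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def split_points(duration_s, parts):
    """Return (start, length) pairs that cover [0, duration_s) in `parts` pieces."""
    step = duration_s / parts
    return [(i * step, step) for i in range(parts)]


def encode_cmd(src, start, length, out, level=8):
    """Build one ffmpeg command that encodes a single time slice to FLAC."""
    return [
        "ffmpeg", "-y",
        "-ss", f"{start:.3f}", "-t", f"{length:.3f}",  # input-side seek and duration
        "-i", src,
        "-compression_level", str(level),
        out,
    ]


def concat_cmd(list_file, out):
    """Build the ffmpeg command that joins the segments without re-encoding."""
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", list_file, "-c", "copy", out]


def parallel_flac(src, duration_s, parts, out, level=8):
    """Encode `parts` slices concurrently, then join them (file names are hypothetical)."""
    segments = [f"segment_{i:03d}.flac" for i in range(parts)]
    cmds = [encode_cmd(src, start, length, seg, level)
            for (start, length), seg in zip(split_points(duration_s, parts), segments)]
    # Threads are enough here: each worker only waits on an external ffmpeg process.
    with ThreadPoolExecutor(max_workers=parts) as pool:
        list(pool.map(lambda cmd: subprocess.run(cmd, check=True), cmds))
    with open("segments.txt", "w") as f:
        f.writelines(f"file '{seg}'\n" for seg in segments)
    subprocess.run(concat_cmd("segments.txt", out), check=True)
```

For a one-hour file on a 12-core machine, a call might look like `parallel_flac("source.wav", 3600, 12, "output.flac", level=9)`.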

comment:2 follow-up: Changed 3 years ago by gjdfgh

Also, what's the use-case? Do you have to encode dozens of CDs per second? Won't the output of the decoder be too much for your storage speed or network bandwidth to handle?

This makes little sense.

comment:3 in reply to: ↑ description Changed 3 years ago by cehoyos

  • Component changed from undetermined to avcodec
  • Keywords flac added
  • Priority changed from normal to wish
  • Status changed from new to open
  • Version changed from unspecified to git-master

Replying to xtemp09:

flaCCL, FlaCuda [...] have better compression than ffmpeg's encoder.

This sounds important: Could you provide samples?

I hope this request won't hang here for 2-3 years like many others. :'(

I think it is safe to say that this will hang until you send a patch.

comment:4 Changed 3 years ago by xtemp09

I'm sorry, that was my mistake; I thought that ffmpeg used libflac. I meant that flaCCL has better compression than libflac. Besides better compression, it's ~7 times faster than libflac, since both OpenCL and CUDA increase encoding speed considerably.

gjdfgh, I started converting a large WAV file (1 hour long) into FLAC last Saturday on a 12-core Xeon cluster at the maximum compression level. It's about 80% done. What makes me sad is that only 100% of the CPU is used and my kick-ass video card is of no use here. :(

If I could write a patch, I wouldn't have made the feature request; I would have done it myself. :)

comment:5 follow-up: Changed 3 years ago by gjdfgh

I meant that flaCCL has better compression than libflac. Besides better compression, it's ~7 times faster than libflac, since both OpenCL and CUDA increase encoding speed considerably.

This is questionable and might just be due to different settings. Also someone else said libflac got some SIMD optimizations recently.

at maximum compression level.

These things tend to get exponentially slower at higher levels for very little gain, meaning you're trading minutes of encoding time for a few kilobytes of compression gain.
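One way to check this trade-off on a given file is to time the encoder at several levels and compare the output sizes. The sketch below assumes `ffmpeg` is on the PATH; the function names and output file names are hypothetical, and the `run`, `clock` and `size` parameters exist only so the measurement loop can be exercised without a real encode.

```python
import os
import subprocess
import time


def flac_cmd(src, level, out):
    """Build the ffmpeg command for one compression level."""
    return ["ffmpeg", "-y", "-i", src, "-compression_level", str(level), out]


def measure_levels(src, levels, run=subprocess.run,
                   clock=time.perf_counter, size=os.path.getsize):
    """Return {level: (seconds, bytes)} for each requested compression level."""
    results = {}
    for level in levels:
        out = f"level_{level}.flac"  # hypothetical output name
        start = clock()
        run(flac_cmd(src, level, out), check=True)
        elapsed = clock() - start
        results[level] = (elapsed, size(out))
    return results
```

Calling `measure_levels("source.wav", [5, 8, 12])` would then show how much extra time each level costs against how many bytes it saves on that particular input.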

comment:6 in reply to: ↑ 5 Changed 3 years ago by xtemp09

Replying to gjdfgh:

Also someone else said libflac got some SIMD optimizations recently.

Libflac and not ffmpeg?

What if the audio is part of a video file? Multithreading can utilize all the cores, and GPU acceleration can significantly reduce the CPU workload (not to mention the considerable increase in speed).

comment:7 Changed 3 years ago by gjdfgh

Libflac and not ffmpeg?

Well, the flaccl website didn't compare with ffmpeg.

comment:8 Changed 3 years ago by xtemp09

I just ran a benchmark to show the importance of hardware acceleration and multithreading.

I used: CUETools.FLACCL.cmd.exe 0.3 on a Windows machine and ffmpeg N-47387-g178ba1f- on a Linux cluster.

The configuration of the machines:

The Windows machine: 
NVIDIA GeForce GTX 650 Ti
Phenom II X3 720 2.8 GHz
SSD OCZ

The Linux cluster:
12 cores of Intel Xeon CPU X5650 @ 2.67GHz

I executed the following commands:

.\time.exe .\CUETools.FLACCL.cmd.exe -11 --no-md5 --cpu-threads 3 --opencl-platform 'NVIDIA Cuda' source.wav -o output_flacCL.flac

time ffmpeg -i source.wav -compression_level 9 output_ffmpeg.flac

In a nutshell, I set maximum compression.

The results are:

flacCL  15.65 s
ffmpeg  62.444 s

I believe that I/O takes a few seconds, but the difference is huge anyway.
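For reference, the ratio between the two timings above can be worked out as follows. This is only arithmetic on the reported numbers: the two runs used different machines, encoders and settings, so it is not a controlled comparison.

```python
def speedup(baseline_s, candidate_s):
    """Return how many times faster the candidate run was than the baseline."""
    return baseline_s / candidate_s


ffmpeg_s = 62.444   # ffmpeg on the Linux cluster, as reported above
flaccl_s = 15.65    # flacCL on the Windows machine, as reported above

ratio = speedup(ffmpeg_s, flaccl_s)
print(f"flacCL finished {ratio:.2f}x faster than ffmpeg")  # prints 3.99x
```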

On the Linux cluster, ffmpeg used only 100% of CPU (one core) out of the entire Intel Xeon node. Meanwhile, flacCL appeared to use only the GPU and not the CPU (though I'm not sure about the CPU workload on that Windows machine).

Last edited 3 years ago by xtemp09

comment:9 in reply to: ↑ 2 Changed 17 months ago by mzso

  • Cc dqeswn@gmail.com added

Replying to gjdfgh:

Also, what's the use-case? Do you have to encode dozens of CDs per second? Won't the output of the decoder be too much for your storage speed or network bandwidth to handle?

This makes little sense.

Utilizing all cores and encoding at ~4-12 times the speed doesn't make sense?
What strange world do you live in, where people want to keep encoding as slow as possible?

I, for one, transcoded some DTS-HD MA movie streams to FLAC (level 12), roughly halving the space they took, and I would have been really happy if it had taken about a fourth of the time on my quad-core CPU.
