Opened 8 years ago
Last modified 6 years ago
#4638 open enhancement
Multithreaded FLAC encoding
Reported by: | xtemp09 | Owned by: | |
---|---|---|---|
Priority: | wish | Component: | avcodec |
Version: | git-master | Keywords: | flac |
Cc: | dqeswn@gmail.com | Blocked By: | |
Blocking: | | Reproduced by developer: | no |
Analyzed by developer: | no | | |
Description
I found that FLAC encoding uses only 100% of CPU (a single core) instead of, e.g., 1200% (I have a 12-core computer). Encoding large files at the maximum compression level takes a very long time.
Could you add multithreading to FLAC encoding? Maybe OpenCL would help ordinary users too. flaCCL and FlaCuda are multithreaded and compress better than ffmpeg's encoder. Could you include them in ffmpeg?
I hope this request won't hang here for 2-3 years like many others. :'(
Change History (9)
comment:1 by , 8 years ago
follow-up: 9 comment:2 by , 8 years ago
Also, what's the use-case? Do you have to encode dozens of CDs per second? Won't the output of the decoder be too much for your storage speed or network bandwidth to handle?
This makes little sense.
comment:3 by , 8 years ago
Component: | undetermined → avcodec |
---|---|
Keywords: | flac added |
Priority: | normal → wish |
Status: | new → open |
Version: | unspecified → git-master |
comment:4 by , 8 years ago
I'm sorry, of course it's a mistake; I thought that ffmpeg used libflac. I meant that flaCCL has better compression than libflac. Besides better compression, it's ~7 times faster than libflac, since both OpenCL and CUDA increase encoding speed considerably.
gjdfgh, I started converting a large WAV file (1 hour long) into FLAC last Saturday on a 12-core Xeon cluster at the maximum compression level. It's about 80% done. The thing that makes me sad is that only 100% of CPU is used and my kick-ass video card is of no use here. :(
If I could write a patch, I wouldn't have made the feature request; I would have done it myself. :)
follow-up: 6 comment:5 by , 8 years ago
I meant that flaCCL has better compression than libflac. Besides better compression, it's ~7 times faster than libflac, since both OpenCL and CUDA increase encoding speed considerably.
This is questionable and might just be due to different settings. Also someone else said libflac got some SIMD optimizations recently.
at maximum compression level.
These things tend to get exponentially slower at higher levels for very little gain, meaning you're trading minutes of encoding time for a few kilobytes of size reduction.
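The speed-versus-size trade-off across compression levels can be measured instead of argued. Below is a minimal sketch, assuming `ffmpeg` is on the PATH and a local WAV file is available. The helper names (`sweep_cmd`, `measure`, `marginal`) are made up for illustration and are not part of ffmpeg; only the `-c:a flac` and `-compression_level` options come from ffmpeg itself.

```python
import subprocess
import time
from pathlib import Path


def sweep_cmd(src, dst, level):
    """Build an ffmpeg command that encodes src to FLAC at one compression level."""
    return ["ffmpeg", "-y", "-i", str(src), "-c:a", "flac",
            "-compression_level", str(level), str(dst)]


def measure(src, levels, workdir="."):
    """Encode once per level and return a list of (level, seconds, bytes)."""
    rows = []
    for level in levels:
        dst = Path(workdir) / f"level_{level}.flac"
        t0 = time.perf_counter()
        subprocess.run(sweep_cmd(src, dst, level), check=True)
        rows.append((level, time.perf_counter() - t0, dst.stat().st_size))
    return rows


def marginal(rows):
    """For each step to the next level: (level, extra seconds, bytes saved)."""
    return [(b[0], b[1] - a[1], a[2] - b[2]) for a, b in zip(rows, rows[1:])]
```

Calling `marginal(measure("source.wav", [5, 8, 12]))` would show how many extra seconds each step costs and how many bytes it saves on that particular file; the numbers will depend on the input and the machine.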
comment:6 by , 8 years ago
Replying to gjdfgh:
Also someone else said libflac got some SIMD optimizations recently.
Libflac and not ffmpeg?
What if the audio is part of a video file? Multithreading can use all the cores, and GPU acceleration can take much of the workload off the CPU (not to mention the considerable increase in speed).
comment:7 by , 8 years ago
Libflac and not ffmpeg?
Well, the flaccl website didn't compare with ffmpeg.
comment:8 by , 8 years ago
I just ran a benchmark to show the importance of hardware acceleration and multithreading.
I used: CUETools.FLACCL.cmd.exe 0.3 on a Windows machine and ffmpeg N-47387-g178ba1f- on a Linux cluster.
The configuration of the machines:
The Windows machine:
NVIDIA GeForce GTX 650 Ti
Phenom II X3 720 2.8 GHz
SSD OCZ
The Linux cluster:
12 cores of Intel Xeon CPU X5650 @ 2.67GHz
I executed the following commands:
.\time.exe .\CUETools.FLACCL.cmd.exe -11 --no-md5 --cpu-threads 3 --opencl-platform 'NVIDIA Cuda' source.wav -o output_flacCL.flac
time ffmpeg -i source.wav -compression_level 9 output_ffmpeg.flac
In a nutshell, I set maximum compression.
The results are:
flacCL: 15.65 s
ffmpeg: 62.444 s
I believe that IO takes a few seconds, but the difference is huge anyway.
The entire Intel Xeon node of a Linux cluster utilized only 100% of CPU. Meanwhile, flacCL didn't use CPU at all, only the GPU. (I'm not sure about CPU workload on that Windows machine)
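For reference, the ratio implied by the two timings above can be worked out directly. This is only arithmetic on the reported numbers; since the two runs were on different machines, it is a rough indication rather than a like-for-like comparison. The function name `speedup` is just an illustrative helper.

```python
def speedup(baseline_s, candidate_s):
    """Return how many times faster the candidate run is than the baseline run."""
    return baseline_s / candidate_s


# Timings reported in this comment: ffmpeg 62.444 s, flacCL 15.65 s.
ratio = speedup(62.444, 15.65)
print(f"{ratio:.2f}x")  # prints 3.99x
```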
comment:9 by , 6 years ago
Cc: | added |
---|
Replying to gjdfgh:
Also, what's the use-case? Do you have to encode dozens of CDs per second? Won't the output of the decoder be too much for your storage speed or network bandwidth to handle?
This makes little sense.
Utilizing all cores and encoding at ~4-12 times the speed doesn't make sense?
What strange world do you live in, where people want to keep encoding as slow as possible?
I for one transcoded some DTS-HD MA movie streams to FLAC (level 12), roughly halving the space they took. I would have been really happy if it had taken about a fourth of the time on my quad-core CPU.
Run multiple encoders at the same time, concatenate the result.
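The "run multiple encoders, then concatenate" idea could look roughly like the sketch below. It assumes `ffmpeg` is on the PATH and that the total number of samples in the input is already known (for example from `ffprobe`). The function names are hypothetical; the ffmpeg pieces used are the `atrim` filter (`start_sample`, `end_sample`), the native `flac` encoder with `-compression_level`, and the concat demuxer with stream copy. Whether the joined file is accepted by every FLAC decoder is an assumption to verify, for instance by decoding it and comparing against the source.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def plan_chunks(total_samples, workers):
    """Split [0, total_samples) into `workers` contiguous sample ranges."""
    base, extra = divmod(total_samples, workers)
    ranges, start = [], 0
    for i in range(workers):
        end = start + base + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def encode_cmd(src, dst, start, end, level=9):
    """Build (not run) an ffmpeg command encoding samples [start, end) to FLAC."""
    return ["ffmpeg", "-y", "-i", str(src),
            "-af", f"atrim=start_sample={start}:end_sample={end}",
            "-c:a", "flac", "-compression_level", str(level), str(dst)]


def encode_parallel(src, out, total_samples, workers=4, level=9):
    """Encode chunks in parallel ffmpeg processes, then join them with stream copy."""
    out = Path(out)
    parts = [out.with_name(f"{out.stem}.part{i:03d}.flac") for i in range(workers)]
    cmds = [encode_cmd(src, part, s, e, level)
            for part, (s, e) in zip(parts, plan_chunks(total_samples, workers))]
    # Threads only wait on child processes; the encoding runs inside ffmpeg.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda cmd: subprocess.run(cmd, check=True), cmds))
    # File list for the concat demuxer (paths containing quotes are not handled).
    listing = out.with_name(out.stem + ".parts.txt")
    listing.write_text("".join(f"file '{p.resolve()}'\n" for p in parts))
    subprocess.run(["ffmpeg", "-y", "-f", "concat", "-safe", "0",
                    "-i", str(listing), "-c", "copy", str(out)], check=True)
```

Splitting on sample counts rather than seconds avoids gaps or overlaps at chunk boundaries. Each worker still reads the input from the start up to its chunk, which is cheap for PCM but would cost more for a compressed source.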