Opened 5 years ago

Last modified 5 years ago

#2640 new defect

When dropping most frames, conversion from yuvj420p to gbrp becomes faster if an rgb24 intermediate is explicitly requested

Reported by: b_jonas
Owned by:
Priority: minor
Component: undetermined
Version: unspecified
Keywords:
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:

If I decode a video from native yuvj420p to raw video output in the gbrp pixel format (plane-major RGB with the planes in G, B, R order), ffmpeg warns that no accelerated color space conversion was found, and decoding becomes very slow. However, if I do the same while explicitly converting through rgb24 to gbrp, the conversion is fast.

The strange part of the bug is that there is a pronounced time difference only if I drop most of the frames (because I write at a much lower frame rate than the input), in which case the color space conversion isn't actually needed for most frames. If I output all frames, the time difference becomes much less pronounced. This may be because writing the output is slow on Windows.

I originally found this bug when decoding an h264-compressed video, but I can reproduce it below by reading a raw video. Direct (unaccelerated) conversion to gbrp is so slow that it dominates the video decompression time. Both the direct and the indirect conversion appear to work: they give visually similar but not bitwise equal output.

I believe that direct color space conversion to gbrp should not add this much overhead if I am dropping most of the output frames anyway, because the color space conversion should be needed only for the frames kept. Further, ffmpeg should automatically select an indirect conversion through rgb24 if that is faster, or ideally support a fast direct conversion.

For context, I am usually reading the raw video into a program through a pipe, and I would actually like an rgbp (plane-major rgb) output because that's easier to handle than rgb24 (pixel-major rgb), but ffmpeg does not currently support rgbp format at all, so I ask for gbrp and reorder the three planes.
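The plane reordering described above can be sketched as follows. This is only an illustration, assuming 8-bit gbrp frames arrive from the pipe as three tightly packed full-resolution planes in G, B, R order; the function name `gbrp_to_rgbp` is hypothetical and not part of ffmpeg.

```python
def gbrp_to_rgbp(frame: bytes, width: int, height: int) -> bytes:
    """Reorder one 8-bit gbrp frame (planes G, B, R) into R, G, B plane order."""
    plane = width * height  # one byte per sample, no chroma subsampling
    if len(frame) != 3 * plane:
        raise ValueError("expected exactly three full-size planes")
    g = frame[0:plane]
    b = frame[plane:2 * plane]
    r = frame[2 * plane:3 * plane]
    return r + g + b
```

For a 640x480 stream, each call would consume 921600 bytes read from the pipe.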

How to reproduce:

I ran all the following commands on Windows 7 x86_64 with an Intel Core 2 Quad CPU and 4 GB of RAM. I'm running the Zeranoe build of ffmpeg version N-53307-g5a65fea built on May 20 2013 22:46:15 with gcc 4.7.3 (GCC) (this is a newer version than in my previous report).

Create a large all zeros raw video input (1.4 gigabytes, 3000 frames) with the following command.

perl -we "binmode STDOUT or die; print pack qq(x).(640*480*1.5) for 1..3000" > zeros.dat
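As a cross-check of the sizes in that command, here is a small sketch of the arithmetic, assuming 8-bit 4:2:0 planar data with even dimensions (one full-resolution luma plane plus two quarter-resolution chroma planes). The helper name `yuv420p_frame_size` is made up for illustration.

```python
def yuv420p_frame_size(width: int, height: int) -> int:
    """Bytes in one 8-bit 4:2:0 planar frame (even dimensions assumed)."""
    luma = width * height
    chroma = (width // 2) * (height // 2)  # size of each of the two chroma planes
    return luma + 2 * chroma

frame_bytes = yuv420p_frame_size(640, 480)  # 640*480*1.5 bytes per frame
total_bytes = frame_bytes * 3000            # whole file, roughly 1.4 gigabytes
print(frame_bytes, total_bytes)
```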

Now run the following two ffmpeg commands.

ffmpeg -report -v 99 -f rawvideo -pix_fmt yuvj420p -s 640x480 -r 30 -i zeros.dat -r 1 -f rawvideo -pix_fmt gbrp -y dump.dat 2>nul

ffmpeg -report -v 99 -f rawvideo -pix_fmt yuvj420p -s 640x480 -r 30 -i zeros.dat -r 1 -f rawvideo -vf format=pix_fmts=rgb24 -pix_fmt gbrp -y dump.dat 2>nul

The first command (the direct conversion) took 29.7 seconds to run and gave the warnings; the second (the indirect conversion) took 6.9 seconds. I attach the log files from both runs.

(I use the 2>nul part because the Windows console is slow and writing the debug messages there would overwhelm it.)
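To put the reported numbers in proportion, the following sketch works through the arithmetic. It uses only the figures from this report (3000 input frames at 30 fps, output at 1 fps, 29.7 s versus 6.9 s); the exact number of frames ffmpeg writes may differ by one because of rounding.

```python
input_frames = 3000
input_fps, output_fps = 30, 1

# About this many frames survive the frame-rate reduction.
kept_frames = input_frames * output_fps // input_fps

direct_seconds, indirect_seconds = 29.7, 6.9
slowdown = direct_seconds / indirect_seconds  # direct run relative to indirect run

print(kept_frames, round(slowdown, 1))
```

If the conversion were applied only to the kept frames, it would run on roughly one frame in thirty.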

Attachments (2)

ffmpeg-20130604-184440.log (438.7 KB) - added by b_jonas 5 years ago.
ffmpeg-20130604-184528.log (437.3 KB) - added by b_jonas 5 years ago.


Change History (7)

Changed 5 years ago by b_jonas


comment:1 follow-up: Changed 5 years ago by b_jonas

I (the original reporter) would like to note that this problem is not exclusive to Windows. I have reproduced this on Linux amd64 with vanilla ffmpeg version N-51215-g428e9da built on Mar 22 2013 23:36:22 with gcc 4.7.1 (GCC). I have an all-zero input created with the same command as in the report, and decoded with commands very similar to above, only I had to change -vf format=pix_fmts=rgb24 to -vf format=rgb24.

comment:2 in reply to: ↑ 1 ; follow-up: Changed 5 years ago by cehoyos

Replying to b_jonas:

I (the original reporter) would like to note that this problem is not exclusive to Windows.

Why should this be exclusive to Windows? (It is of course exclusive to x86 with SIMD.)

What I would like to know is why do you think there is a bug if FFmpeg writes on the console that it will run slow because no optimized conversion routines are available?

comment:3 in reply to: ↑ description Changed 5 years ago by cehoyos

Replying to b_jonas:

I believe that direct color space conversion to gbrp should not add this much overhead if I am dropping most of the output frames anyway, because the color space conversion should be needed only for the frames kept.

I strongly suspect you can achieve that higher performance if you put the fps filter in front of the scale filter in your filter chain.

comment:4 in reply to: ↑ 2 ; follow-up: Changed 5 years ago by b_jonas

Replying to cehoyos:

Why should this be exclusive to Windows? (It is of course exclusive to x86 with SIMD.)

I did not think it was exclusive, but I had not tested it before.

What I would like to know is why do you think there is a bug if FFmpeg writes on the console that it will run slow because no optimized conversion routines are available?

There are two reasons why I think there is a bug. The first is that there is an optimized conversion, which you can get by converting yuvj420p to rgb24 and that to gbrp, but ffmpeg does not find this automatically. The second is that the direct color space conversion slows down the pipeline even when I drop most frames, even though in that case only the video decompression should be performed on all frames and the color space conversion only on the frames output.

Replying to cehoyos:

Replying to b_jonas:
I strongly suspect you can achieve that higher performance if you put the fps filter in front of the scale filter in your filter chain.

In the example commands I gave, there is no scale filter. The -s option is used as an input option to tell the rawvideo input format the size of the images to read, because raw video input carries no such metadata.

comment:5 in reply to: ↑ 4 Changed 5 years ago by cehoyos

Replying to b_jonas:

In the example commands I gave, there is no scale filter.

The part of FFmpeg that does colour space conversion is called the scale filter (or in other words: colour space conversion means using swscale).
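One way the suggestion from comment:3 might be tried is sketched below. It builds the argument list for a run where the fps filter precedes the format filter, on the assumption that the automatically inserted conversion then sits after the frame drop. The function name and its parameters are hypothetical, the filter chain is an untested guess at what was meant, and no timing was measured for it.

```python
def build_fps_first_command(src, dst, out_fps=1, out_pix_fmt="gbrp"):
    """Return an ffmpeg argument list that drops frames before converting pixels."""
    # fps comes first in the chain, so the format conversion that follows
    # should only see the frames that survive the frame-rate reduction.
    filter_chain = f"fps={out_fps},format={out_pix_fmt}"
    return [
        "ffmpeg",
        "-f", "rawvideo", "-pix_fmt", "yuvj420p", "-s", "640x480", "-r", "30",
        "-i", src,
        "-vf", filter_chain,
        "-f", "rawvideo", "-y", dst,
    ]

command = build_fps_first_command("zeros.dat", "dump.dat")
print(" ".join(command))
```

The resulting list could be passed to `subprocess.run(command, check=True)` and timed against the two commands from the report.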
