Opened 11 years ago
Closed 10 years ago
#3625 closed defect (worksforme)
Unscaled conversion from yuv420p to gray is slow
Reported by: | Andreas Girgensohn | Owned by: | |
---|---|---|---|
Priority: | normal | Component: | swscale |
Version: | git-master | Keywords: | |
Cc: | | Blocked By: | |
Blocking: | | Reproduced by developer: | no |
Analyzed by developer: | no | | |
Description
Summary of the bug:
Using sws_scale to convert from yuv420p to gray without scaling is very slow. It was still fast in FFmpeg 1.0.1 but has been slow since FFmpeg 2.0, including the current git head. The fast version needs 0.93 ms per frame (1920x1088); the slow version needs 6.99 ms.
How to reproduce:
ffmpeg version 2.2.git-996fffb Copyright (c) 2000-2014 the FFmpeg developers
built on May 8 2014 13:51:28 with gcc 4.8.2 (GCC) 20131212 (Red Hat 4.8.2-7)
gray_convert_ctx = sws_getContext (w, h, codec_ctx->pix_fmt, w, h, PIX_FMT_GRAY8, SWS_POINT, 0, 0, 0);
sws_scale (gray_convert_ctx, frame->data, frame->linesize, 0, h, gray_frame->data, gray_frame->linesize);
This can be fixed by adding dstFormat==AV_PIX_FMT_GRAY8 here: libswscale/utils.c:1638:
/* unscaled special cases */
if (unscaled && !usesHFilter && !usesVFilter &&
    (c->srcRange == c->dstRange || isAnyRGB(dstFormat) ||
     dstFormat == AV_PIX_FMT_GRAY8)) {
    ff_get_unscaled_swscale(c);
The patch produces the following output:
[swscaler @ 0x25d2240] nearest neighbor / point scaler, from yuv420p to gray using MMXEXT
[swscaler @ 0x25d2240] using unscaled yuv420p -> gray special converter
That is probably not the correct way to patch it for all situations but some sort of test is needed here.
Patches should be submitted to the ffmpeg-devel mailing list and not this bug tracker.
Change History (12)
comment:1 by , 11 years ago
comment:2 by , 11 years ago
Converting from yuv420p to grayscale without scaling pretty much just requires copying the prefix bytes (the Y plane, which is stored first): http://en.wikipedia.org/wiki/YUV#Y.27UV420p_.28and_Y.27V12_or_YV12.29_to_RGB888_conversion
I haven't looked at the implementation to see if that's what it's doing. I also haven't checked the quality of the old output. I would like to point out that converting from yuv420p to rgb is almost three times faster than the current conversion to grayscale. I could write my own implementation from YUV to grayscale but I would prefer not to have to deal with all the different variants.
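For reference, the plain plane copy described in this comment could look roughly like the sketch below. It assumes 8-bit planar input whose first plane is Y (as in yuv420p). The function name `copy_luma_plane` and its parameters are made up for illustration and are not part of FFmpeg. It copies sample values as-is and does no range conversion.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not an FFmpeg API): copy the Y plane of a planar
 * YUV frame into an 8-bit gray buffer. Each row has `width` visible bytes,
 * but rows may be padded, so source and destination strides (linesize)
 * are handled separately. */
static void copy_luma_plane(const uint8_t *src, int src_linesize,
                            uint8_t *dst, int dst_linesize,
                            int width, int height)
{
    for (int y = 0; y < height; y++) {
        /* Copy only the visible bytes of this row; padding is left alone. */
        memcpy(dst + (size_t)y * dst_linesize,
               src + (size_t)y * src_linesize,
               (size_t)width);
    }
}
```

With an AVFrame this would be called along the lines of copy_luma_plane(frame->data[0], frame->linesize[0], gray_frame->data[0], gray_frame->linesize[0], w, h).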
comment:3 by , 11 years ago
Maybe setting the swscale to output yuv-range would pick the old conversion? At least this is how it should work.
follow-up: 5 comment:4 by , 11 years ago
I verified that the values in frame->data[0] and gray_frame->data[0] are the same when considering the respective linesize[0]. Thus, the fast version produces correct output.
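A check of this kind can be sketched as follows. This is only an illustration of comparing two planes row by row while ignoring row padding, not the code that was actually used for the verification. `planes_equal` is a hypothetical name.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return 1 if the visible width x height region of
 * two 8-bit planes is identical, 0 otherwise. Padding bytes beyond
 * `width` in each row are ignored, since the two buffers may use
 * different linesize values. */
static int planes_equal(const uint8_t *a, int a_linesize,
                        const uint8_t *b, int b_linesize,
                        int width, int height)
{
    for (int y = 0; y < height; y++) {
        if (memcmp(a + (size_t)y * a_linesize,
                   b + (size_t)y * b_linesize,
                   (size_t)width) != 0)
            return 0;
    }
    return 1;
}
```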
As ff_get_unscaled_swscale only sets swscale if it can handle the transformation, it might be possible to simplify the test above to the following:
if (unscaled && !usesHFilter && !usesVFilter) {
    ff_get_unscaled_swscale(c);
planarCopyWrapper in swscale_unscaled.c seems to do a fine and fast job in this situation.
comment:5 by , 11 years ago
comment:7 by , 11 years ago
Thanks for explaining this. The value range is indeed expanded when using the version without my patch. Unfortunately, it takes 7 times longer to convert the image.
comment:11 by , 11 years ago
I don't really understand the question about setting the range. I still kind of consider it to be a bug that converting YUV to gray takes that long with sws_scale. However, this conversation has been very helpful for me in understanding some FFmpeg internals so that I can work around this issue.
You can close the bug if you think that this is the intended performance.
comment:12 by , 10 years ago
Resolution: | → worksforme |
---|---|
Status: | new → closed |
The original behaviour was buggy (and was fixed): it produced gray output for both white and black input.
There are now at least four possibilities to get the "old" behaviour for a "conversion" from yuv420p to gray:
- Use frame->data[0] instead of involving libswscale at all (this is, if anything, slightly faster than the "old" behaviour)
- Force yuvj420p as input colour space for libswscale (this option is deprecated)
- Force the needed input color_range when using libswscale
- Use the extractplanes filter
The old (fast) version produced incorrect output which was not full-scale. Are you sure the output of the old version was useful?