#6736 closed defect (fixed)
vidstab fails with yuv444 input
| Reported by: | Ochi | Owned by: | |
|---|---|---|---|
| Priority: | normal | Component: | avfilter |
| Version: | git-master | Keywords: | libvidstab |
| Cc: | | Blocked By: | |
| Blocking: | | Reproduced by developer: | no |
| Analyzed by developer: | no | | |
Description
Summary of the bug:
Trying to apply vidstab{detect,transform} filters on a yuv444 encoded input video fails with a pixel-format error.
This appears to be an incorrect assertion about the pixel formats in ffmpeg and/or the vid.stab library rather than a technical limitation: commenting out the failing check seems to help, and vidstab appears to run fine on yuv444 input. This still needs to be validated.
How to reproduce:
# Create test video
% ffmpeg -t 10 -s 1280x720 -f rawvideo -pix_fmt rgb24 -r 30 -i /dev/zero -pix_fmt yuv444p input.mkv
# Try to run vidstabdetect
% ffmpeg -i input.mkv -vf vidstabdetect -f null -
# Result:
[Parsed_vidstabdetect_0 @ 0x55a34dc71840] pixel-format error: wrong bits/per/pixel, please report a BUG
# Try to run vidstabtransform
% ffmpeg -i input.mkv -vf vidstabtransform -f null -
# Similar problem:
pixel-format error: bpp 1<>3 chroma_subsampl: w: 0<>0 h: 0<>0
ffmpeg version 3.3.4, Arch Linux
Change History (5)
comment:1 by , 7 years ago
Keywords: | libvidstab added; vidstab yuv444 removed |
---|
comment:2 by , 7 years ago
comment:3 by , 7 years ago
In https://github.com/georgmartius/vid.stab/blob/master/src/frameinfo.h#L57, you can see the variable is only meant to be set for packed formats.
typedef struct vsframeinfo {
  ...
  int bytesPerPixel; // number of bytes per pixel (for packed formats)
} VSFrameInfo;
So, in libavfilter/vf_vidstab*, it should only be checked for packed formats.
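A minimal sketch of that idea, not the actual FFmpeg patch: the struct and function names below are hypothetical stand-ins for the values that vf_vidstab* obtains from vid.stab and libavutil. The comparison only runs for packed formats; planar formats skip it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few values the check needs.
 * These are not FFmpeg or vid.stab types. */
typedef struct FormatFacts {
    bool is_planar;          /* true for e.g. yuv444p, false for e.g. rgb24 */
    int  vs_bytes_per_pixel; /* value vid.stab stores in bytesPerPixel */
    int  av_bits_per_pixel;  /* average bits per pixel reported by FFmpeg */
} FormatFacts;

/* Returns true when the format passes the sanity check.
 * bytesPerPixel is documented as meaningful only for packed formats,
 * so planar formats are accepted without comparing it. */
static bool bytes_per_pixel_check_ok(const FormatFacts *f)
{
    assert(f != NULL);
    if (f->is_planar)
        return true;
    return f->vs_bytes_per_pixel == f->av_bits_per_pixel / 8;
}
```

With this shape, yuv444p (planar, 1 byte vs. 24 bits) is accepted, while rgb24 (packed, 3 bytes vs. 24 bits) is still compared and passes.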
comment:4 by , 7 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Fixed by Gyan Doshi in e1e89c0695b430ca1f0f869ac8a2b6b46be9e2fa
comment:5 by , 7 years ago
Version: | unspecified → git-master |
---|
The bit-depth check that is being done in vf_vidstabdetect.c and vf_vidstabtransform.c is completely broken because FFMPEG and libvidstab mean different things when they report how many bits are in a pixel.
Here's the check from vf_vidstabdetect.c:
fi.bytesPerPixel is populated by libvidstab by calling vsFrameInfoInit in frameinfo.c in the external library. For this function, bytesPerPixel is PER PLANE, and is set to 1 for all YUV formats.
On the other hand, av_get_bits_per_pixel reports the average number of bits for all planes. The average part is important since many YUV formats share components across multiple pixels.
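To make the averaging concrete, here is a simplified model of such a figure for a three-component YUV format. It is an illustration written for this comment, not the libavutil source; the function name and parameters are assumptions.

```c
#include <assert.h>

/* Simplified model of an averaged bits-per-pixel value for a
 * three-component YUV format (illustrative, not FFmpeg code).
 * depth:          bits per stored sample
 * log2_chroma_w:  log2 of the horizontal chroma subsampling factor
 * log2_chroma_h:  log2 of the vertical chroma subsampling factor */
static int avg_bits_per_pixel(int depth, int log2_chroma_w, int log2_chroma_h)
{
    assert(depth > 0 && log2_chroma_w >= 0 && log2_chroma_h >= 0);

    int log2_pixels = log2_chroma_w + log2_chroma_h;

    /* Count the bits in a block of 2^log2_pixels pixels: every pixel
     * has its own luma sample, and the block shares one U and one V. */
    int block_bits = (depth << log2_pixels) + 2 * depth;

    /* Average over the pixels in the block. */
    return block_bits >> log2_pixels;
}
```

For 8-bit samples this yields 24 with no subsampling (4:4:4) and 12 when chroma is halved in both directions (4:2:0).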
For example, for the YUV444P pixel format libvidstab reports that there is 1 byte per pixel. FFMPEG says there are 24 bits per pixel since it counts all planes. This gets divided by 8 for a final value of 3. The check sees that 1 != 3 so an error gets thrown.
Things get even crazier when you look at YUV420, since that format shares UV data between pixels. Again, libvidstab sets fi.bytesPerPixel to 1, since from libvidstab's perspective that value is how many bytes each sample takes up after it has been unpacked into the frame buffer. libvidstab is tracking that each YUV channel uses a single byte. It's all about memory layout. Contrast this with FFMPEG's av_get_bits_per_pixel, which returns the theoretical, average value of 12 bits per pixel. That has nothing to do with memory layout; it describes the ideal pixel format. To make things worse, that 12 gets divided by 8 to become 1 byte. This passes the check, but only because the integer division truncates 1.5 to 1.
The difference in purpose for the values means there isn't an easy fix for this check.
You can't just multiply fi.bytesPerPixel by fi.planes to compare the total number of bytes, for two reasons. The first is that libvidstab sets fi.planes to 0 for RGB and RGBA, so the total number of bytes would be 0. It uses a planes value of 0 to mark the format as non-planar. In addition, if we look at YUV420 again we get a bytes per pixel value of 3 since there are 3 channels of 1 full byte each in memory. This still fails the check since 12/8=1 on the FFMPEG side and 1!=3. You can't remove the division on the FFMPEG side and multiply the bytes by 8 on the libvidstab side either, because 3*8=24 bits instead of the correct 12 and 12!=24.
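The arithmetic can be checked with a small model. The struct below is a hypothetical stand-in holding only the two fields discussed here, with the values described in this comment (planes is 0 for packed RGB, 3 for planar YUV); it is not the vid.stab type.

```c
#include <assert.h>

/* Hypothetical model of the two vid.stab fields discussed above. */
typedef struct VsLayout {
    int bytes_per_pixel; /* 1 for planar YUV, 3 for packed RGB24 */
    int planes;          /* 3 for planar YUV, 0 marks a packed format */
} VsLayout;

/* The tempting "total bytes per pixel" computation. */
static int naive_total_bytes(VsLayout l)
{
    assert(l.bytes_per_pixel > 0 && l.planes >= 0);
    return l.bytes_per_pixel * l.planes;
}
```

For packed RGB24 the product is 3 * 0 = 0, which can never match 24/8 = 3. For an 8-bit planar format with a 12-bit average (4:2:0), the product is 1 * 3 = 3, which matches neither 12/8 = 1 in bytes nor 12 in bits once multiplied by 8.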
It really all comes down to the fact that the FFMPEG data and the libvidstab data are used for two completely different things. FFMPEG is reporting on the ideal, average pixel which is why you get values like 12 bpp, which isn't evenly divisible by 8. libvidstab is using the data to describe how the pixels are laid out in memory. There are no partial pixels when the data is unpacked, every YUV sample is contained in 1 byte.
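The two views can be reconciled by counting whole bytes per plane. The helper below is a sketch under simplifying assumptions (8-bit samples, three planes, no stride or padding), and its name is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed for one 8-bit, three-plane YUV frame, ignoring any
 * stride or alignment padding (illustrative helper, not a library call). */
static size_t planar_yuv_frame_bytes(int width, int height,
                                     int log2_chroma_w, int log2_chroma_h)
{
    assert(width > 0 && height > 0);
    assert(log2_chroma_w >= 0 && log2_chroma_h >= 0);

    /* One byte per luma sample, one luma sample per pixel. */
    size_t luma = (size_t)width * (size_t)height;

    /* Chroma plane dimensions, rounded up for odd sizes. */
    size_t cw = ((size_t)width  + ((size_t)1 << log2_chroma_w) - 1) >> log2_chroma_w;
    size_t ch = ((size_t)height + ((size_t)1 << log2_chroma_h) - 1) >> log2_chroma_h;

    /* Two chroma planes (U and V), one byte per stored sample. */
    return luma + 2 * cw * ch;
}
```

For a 1280x720 frame with 4:2:0 subsampling this gives 1382400 bytes, which averages to 12 bits per pixel even though every stored sample occupies one whole byte.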