Opened 2 years ago

Closed 2 years ago

#9675 closed defect (invalid)

-vf overlay is not accurate with RGB input using embedded RGBA alpha channel

Reported by: pdr0
Owned by:
Priority: normal
Component: avfilter
Version: unspecified
Keywords:
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:
-vf overlay is not accurate with RGB input using embedded RGBA alpha channel

How to reproduce:

"D:\\_DOWNLOADS\\ffmpeg-master-latest-win64-gpl_20220302\\bin\\ffmpeg" -report -i black.png -i white_withalpharamp.png -filter_complex "[0:0][1:0]overlay" ffmpegoverlay_20220302.png -y
ffmpeg version N-105822-g4b72bca6ca-20220302 Copyright (c) 2000-2022 the FFmpeg developers
  built with gcc 11.2.0 (crosstool-NG 1.24.0.533_681aaef)
  configuration: --prefix=/ffbuild/prefix --pkg-config-flags=--static --pkg-config=pkg-config --cross-prefix=x86_64-w64-mingw32- --arch=x86_64 --target-os=mingw32 --enable-gpl --enable-version3 --disable-debug --disable-w32threads --enable-pthreads --enable-iconv --enable-libxml2 --enable-zlib --enable-libfreetype --enable-libfribidi --enable-gmp --enable-lzma --enable-fontconfig --enable-libvorbis --enable-opencl --disable-libpulse --enable-libvmaf --disable-libxcb --disable-xlib --enable-amf --enable-libaom --enable-avisynth --enable-libdav1d --enable-libdavs2 --disable-libfdk-aac --enable-ffnvcodec --enable-cuda-llvm --enable-frei0r --enable-libgme --enable-libass --enable-libbluray --enable-libmp3lame --enable-libopus --enable-librist --enable-libtheora --enable-libvpx --enable-libwebp --enable-lv2 --enable-libmfx --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg --enable-libopenmpt --enable-librav1e --enable-librubberband --enable-schannel --enable-sdl2 --en
  libavutil      57. 22.100 / 57. 22.100
  libavcodec     59. 21.103 / 59. 21.103
  libavformat    59. 17.102 / 59. 17.102
  libavdevice    59.  5.100 / 59.  5.100
  libavfilter     8. 27.100 /  8. 27.100
  libswscale      6.  5.100 /  6.  5.100
  libswresample   4.  4.100 /  4.  4.100
  libpostproc    56.  4.100 / 56.  4.100
Splitting the commandline.
Reading option '-report' ... matched as option 'report' (generate a report) with argument '1'.
Reading option '-i' ... matched as input url with argument 'black.png'.
Reading option '-i' ... matched as input url with argument 'white_withalpharamp.png'.
Reading option '-filter_complex' ... matched as option 'filter_complex' (create a complex filtergraph) with argument '[0:0][1:0]overlay'.
Reading option 'ffmpegoverlay_20220302.png' ... matched as output url.
Reading option '-y' ... matched as option 'y' (overwrite output files) with argument '1'.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option report (generate a report) with argument 1.
Applying option filter_complex (create a complex filtergraph) with argument [0:0][1:0]overlay.
Applying option y (overwrite output files) with argument 1.
Successfully parsed a group of options.
Parsing a group of options: input url black.png.
Successfully parsed a group of options.
Opening an input file: black.png.
[NULL @ 000000c64e433b40] Opening 'black.png' for reading
[file @ 000000c64c90a040] Setting default whitelist 'file,crypto,data'
[png_pipe @ 000000c64e433b40] Format png_pipe probed with size=2048 and score=99
[png_pipe @ 000000c64e433b40] Before avformat_find_stream_info() pos: 0 bytes read:285 seeks:0 nb_streams:1
[png_pipe @ 000000c64e433b40] After avformat_find_stream_info() pos: 285 bytes read:285 seeks:0 frames:1
Input #0, png_pipe, from 'black.png':
  Duration: N/A, bitrate: N/A
  Stream #0:0, 1, 1/25: Video: png, rgb24(pc), 256x256, 25 fps, 25 tbr, 25 tbn
Successfully opened the file.
Parsing a group of options: input url white_withalpharamp.png.
Successfully parsed a group of options.
Opening an input file: white_withalpharamp.png.
[NULL @ 000000c64c90b0c0] Opening 'white_withalpharamp.png' for reading
[file @ 000000c64c90c740] Setting default whitelist 'file,crypto,data'
[png_pipe @ 000000c64c90b0c0] Format png_pipe probed with size=2048 and score=99
[png_pipe @ 000000c64c90b0c0] Before avformat_find_stream_info() pos: 0 bytes read:972 seeks:0 nb_streams:1
[png_pipe @ 000000c64c90b0c0] After avformat_find_stream_info() pos: 972 bytes read:972 seeks:0 frames:1
Input #1, png_pipe, from 'white_withalpharamp.png':
  Duration: N/A, bitrate: N/A
  Stream #1:0, 1, 1/25: Video: png, rgba(pc), 256x256, 25 fps, 25 tbr, 25 tbn
Successfully opened the file.
Parsing a group of options: output url ffmpegoverlay_20220302.png.
Successfully parsed a group of options.
Opening an output file: ffmpegoverlay_20220302.png.
Successfully opened the file.
detected 8 logical cores
Stream mapping:
  Stream #0:0 (png) -> overlay
  Stream #1:0 (png) -> overlay
  overlay:default -> Stream #0:0 (png)
Press [q] to stop, [?] for help
cur_dts is invalid st:0 (0) [init:0 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
cur_dts is invalid st:0 (0) [init:0 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
cur_dts is invalid st:0 (0) [init:0 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
cur_dts is invalid st:0 (0) [init:0 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
[graph 0 input from stream 0:0 @ 000000c64e46c600] Setting 'video_size' to value '256x256'
[graph 0 input from stream 0:0 @ 000000c64e46c600] Setting 'pix_fmt' to value '2'
[graph 0 input from stream 0:0 @ 000000c64e46c600] Setting 'time_base' to value '1/25'
[graph 0 input from stream 0:0 @ 000000c64e46c600] Setting 'pixel_aspect' to value '0/1'
[graph 0 input from stream 0:0 @ 000000c64e46c600] Setting 'frame_rate' to value '25/1'
[graph 0 input from stream 0:0 @ 000000c64e46c600] w:256 h:256 pixfmt:rgb24 tb:1/25 fr:25/1 sar:0/1
[graph 0 input from stream 1:0 @ 000000c64e46ca00] Setting 'video_size' to value '256x256'
[graph 0 input from stream 1:0 @ 000000c64e46ca00] Setting 'pix_fmt' to value '26'
[graph 0 input from stream 1:0 @ 000000c64e46ca00] Setting 'time_base' to value '1/25'
[graph 0 input from stream 1:0 @ 000000c64e46ca00] Setting 'pixel_aspect' to value '0/1'
[graph 0 input from stream 1:0 @ 000000c64e46ca00] Setting 'frame_rate' to value '25/1'
[graph 0 input from stream 1:0 @ 000000c64e46ca00] w:256 h:256 pixfmt:rgba tb:1/25 fr:25/1 sar:0/1
[format @ 000000c64e46c700] Setting 'pix_fmts' to value 'rgb24|rgba|rgb48be|rgba64be|pal8|gray|ya8|gray16be|ya16be|monob'
[auto_scale_0 @ 000000c64e46d700] w:iw h:ih flags:'' interl:0
[Parsed_overlay_0 @ 000000c64e46d900] auto-inserting filter 'auto_scale_0' between the filter 'graph 0 input from stream 0:0' and the filter 'Parsed_overlay_0'
[auto_scale_1 @ 000000c64e46c800] w:iw h:ih flags:'' interl:0
[Parsed_overlay_0 @ 000000c64e46d900] auto-inserting filter 'auto_scale_1' between the filter 'graph 0 input from stream 1:0' and the filter 'Parsed_overlay_0'
[auto_scale_2 @ 000000c64e46c900] w:iw h:ih flags:'' interl:0
[format @ 000000c64e46c700] auto-inserting filter 'auto_scale_2' between the filter 'Parsed_overlay_0' and the filter 'format'
[AVFilterGraph @ 000000c64e475380] query_formats: 5 queried, 4 merged, 3 already done, 0 delayed
[auto_scale_2 @ 000000c64e46c900] picking rgba out of 10 ref:yuva420p alpha:1
[auto_scale_0 @ 000000c64e46d700] w:256 h:256 fmt:rgb24 sar:0/1 -> w:256 h:256 fmt:yuva420p sar:0/1 flags:0x0
[auto_scale_1 @ 000000c64e46c800] w:256 h:256 fmt:rgba sar:0/1 -> w:256 h:256 fmt:yuva420p sar:0/1 flags:0x0
[Parsed_overlay_0 @ 000000c64e46d900] main w:256 h:256 fmt:yuva420p overlay w:256 h:256 fmt:yuva420p
[Parsed_overlay_0 @ 000000c64e46d900] [framesync @ 000000c64e4767e8] Selected 1/25 time base
[Parsed_overlay_0 @ 000000c64e46d900] [framesync @ 000000c64e4767e8] Sync level 2
[auto_scale_2 @ 000000c64e46c900] w:256 h:256 fmt:yuva420p sar:0/1 -> w:256 h:256 fmt:rgba sar:0/1 flags:0x0
[Parsed_overlay_0 @ 000000c64e46d900] n:1.000000 t:0.000000 pos:0.000000 x:0.000000 xi:0 y:0.000000 yi:0
Output #0, image2, to 'ffmpegoverlay_20220302.png':
  Metadata:
    encoder         : Lavf59.17.102
  Stream #0:0, 0, 1/25: Video: png, rgba(pc, gbr/unknown/unknown, progressive), 256x256, q=2-31, 200 kb/s, 25 fps, 25 tbn
    Metadata:
      encoder         : Lavc59.21.103 png
Clipping frame in rate conversion by 0.000008
frame=    1 fps=0.0 q=0.0 size=N/A time=00:00:00.00 bitrate=N/A speed=   0x    
cur_dts is invalid st:0 (0) [init:1 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
[Parsed_overlay_0 @ 000000c64e46d900] [framesync @ 000000c64e4767e8] Sync level 1
cur_dts is invalid st:0 (0) [init:1 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
[Parsed_overlay_0 @ 000000c64e46d900] [framesync @ 000000c64e4767e8] Sync level 0
cur_dts is invalid st:0 (0) [init:1 i_done:0 finish:0] (this is harmless if it occurs once at the start per stream)
[out_0_0 @ 000000c64e46c100] EOF on sink link out_0_0:default.
No more output streams to write to, finishing.
[image2 @ 000000c64e434740] Opening 'ffmpegoverlay_20220302.png' for writing
[file @ 000000c64e450440] Setting default whitelist 'file,crypto,data'
[AVIOContext @ 000000c652d47ac0] Statistics: 2411 bytes written, 0 seeks, 1 writeouts
frame=    1 fps=0.0 q=-0.0 Lsize=N/A time=00:00:00.04 bitrate=N/A speed=0.251x    
video:2kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
Input file #0 (black.png):
  Input stream #0:0 (video): 1 packets read (285 bytes); 1 frames decoded; 
  Total: 1 packets (285 bytes) demuxed
Input file #1 (white_withalpharamp.png):
  Input stream #1:0 (video): 1 packets read (972 bytes); 1 frames decoded; 
  Total: 1 packets (972 bytes) demuxed
Output file #0 (ffmpegoverlay_20220302.png):
  Output stream #0:0 (video): 1 frames encoded; 1 packets muxed (2411 bytes); 
  Total: 1 packets (2411 bytes) muxed
2 frames successfully decoded, 0 decoding errors
[AVIOContext @ 000000c64e43c040] Statistics: 285 bytes read, 0 seeks
[AVIOContext @ 000000c64c90c980] Statistics: 972 bytes read, 0 seeks

black.png is RGB 0,0,0

white_withalpharamp.png is RGB 255,255,255 with a perfect 0-255 ramp for the alpha channel, 256x256

The expected output is a perfect 0-255 gradient in which every x position has a single R=G=B value. The attached "greyscaleramp_expectedoutput.png" is the expected output, and it is what other programs produce.

The observed ffmpeg output has banding: some values are dropped and some are repeated (although 0 and 255 are at least preserved, so a binarized mask is unaffected). This has implications for compositing, greenscreen work, and non-binarized masks.

This test used RGB inputs, but the log shows both inputs being auto-scaled to "fmt:yuva420p". My guess is that this is a swscale issue: perhaps RGB is being converted to YUV and back with limited range at 8 bits, causing the quantization errors, instead of the overlay operating in RGB. Alternatively, the intermediate YUV operation could be performed at a higher bit depth to avoid those issues.
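The guess above can be checked with a little arithmetic. The following is a minimal sketch, not FFmpeg's actual swscale or overlay code: it assumes grey pixels only (so luma alone matters), a plain 8-bit limited-range mapping of 0..255 onto 16..235, and rounded integer alpha blending. All function names are hypothetical and exist only for this illustration.

```python
def blend(fg, bg, alpha):
    """Rounded integer alpha blend; alpha is 0..255."""
    return (fg * alpha + bg * (255 - alpha) + 127) // 255


def to_limited_y(v):
    """Map a full-range grey value 0..255 to limited-range luma 16..235 (assumed model)."""
    return 16 + (219 * v + 127) // 255


def from_limited_y(y):
    """Map limited-range luma 16..235 back to full-range 0..255, with clipping."""
    v = ((y - 16) * 255 + 109) // 219
    return max(0, min(255, v))


def rgb_ramp():
    """White over black with alpha 0..255, blended directly in RGB."""
    return [blend(255, 0, a) for a in range(256)]


def yuv_ramp():
    """Same blend, but done on 8-bit limited-range luma and converted back."""
    y_fg, y_bg = to_limited_y(255), to_limited_y(0)
    return [from_limited_y(blend(y_fg, y_bg, a)) for a in range(256)]


rgb = rgb_ramp()
yuv = yuv_ramp()
missing = sorted(set(range(256)) - set(yuv))
print("distinct levels, RGB blend:", len(set(rgb)))
print("distinct levels, 8-bit limited-range blend:", len(set(yuv)))
print("first few dropped levels:", missing[:5])
```

In this simplified model the RGB blend reproduces every level from 0 to 255, while the limited-range path has only 220 luma codes available, so some output levels are dropped and some are repeated, with 0 and 255 still preserved. That matches the symptom described in this ticket, but it does not prove that this is the exact arithmetic FFmpeg performs.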

Attachments (4)

black.png (285 bytes) - added by pdr0 2 years ago.
white_withalpharamp.png (972 bytes) - added by pdr0 2 years ago.
ffmpegoverlay_20220302.png (2.4 KB) - added by pdr0 2 years ago.
greyscaleramp_expectedoutput.png (790 bytes) - added by pdr0 2 years ago.


Change History (5)

by pdr0, 2 years ago

Attachment: black.png added

by pdr0, 2 years ago

Attachment: white_withalpharamp.png added

by pdr0, 2 years ago

Attachment: ffmpegoverlay_20220302.png added

comment:1 by Elon Musk, 2 years ago

Resolution: invalid
Status: new → closed

Try setting the overlay filter's format option (overlay=format) next time.
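The comment does not say which value to pass. As a hedged sketch, the snippet below builds the same command line as in the report with the overlay filter's format option set explicitly. The value rgb is an assumption (it is one of the documented values of that option), and the helper name and the output filename ffmpegoverlay_rgb.png are hypothetical.

```python
def build_overlay_command(main, overlay, output, fmt="rgb"):
    """Return an ffmpeg argument list that overlays `overlay` on `main`.

    `fmt` is passed to the overlay filter's `format` option; "rgb" is an
    assumed choice for RGB/RGBA inputs, not something stated in the ticket.
    """
    filtergraph = f"[0:0][1:0]overlay=format={fmt}"
    return [
        "ffmpeg",
        "-i", main,
        "-i", overlay,
        "-filter_complex", filtergraph,
        output,
        "-y",
    ]


# Hypothetical output name; the inputs are the files attached to this ticket.
cmd = build_overlay_command(
    "black.png", "white_withalpharamp.png", "ffmpegoverlay_rgb.png"
)
print(" ".join(cmd))
```

The printed line can be pasted into a shell, or the list can be handed to subprocess.run. Whether the result matches the expected ramp exactly still has to be verified against greyscaleramp_expectedoutput.png.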
