#7825 closed defect (fixed)
Malfunctioning `ssim` filter?..
Reported by: | gdgsdg123 | Owned by: | |
---|---|---|---|
Priority: | normal | Component: | avfilter |
Version: | git-master | Keywords: | ssim |
Cc: | Blocked By: | ||
Blocking: | Reproduced by developer: | no | |
Analyzed by developer: | no |
Description
Build from: https://zeranoe.com/builds/win64/static/ffmpeg-20190402-6aeaac3-win64-static.zip
C:\>ffmpeg -i "bt709.avi" -i "bt601.avi" -lavfi ssim -f null -
ffmpeg version N-93515-g6aeaac3e1c Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 8.2.1 (GCC) 20190212
  configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-amf --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt
  libavutil 56. 26.100 / 56. 26.100
  libavcodec 58. 48.100 / 58. 48.100
  libavformat 58. 26.101 / 58. 26.101
  libavdevice 58. 7.100 / 58. 7.100
  libavfilter 7. 48.100 / 7. 48.100
  libswscale 5. 4.100 / 5. 4.100
  libswresample 3. 4.100 / 3. 4.100
  libpostproc 55. 4.100 / 55. 4.100
[avi @ 00000000005218c0] decoding for stream 0 failed
Input #0, avi, from 'bt709.avi':
  Duration: 00:00:00.03, start: 0.000000, bitrate: 206981 kb/s
    Stream #0:0: Video: h264 (High 4:4:4 Predictive) (H264 / 0x34363248), yuv420p10le(tv, bt709, progressive), 1440x836, 30 fps, 30 tbr, 30 tbn, 60 tbc
[avi @ 0000000002e965c0] decoding for stream 0 failed
Input #1, avi, from 'bt601.avi':
  Duration: 00:00:00.03, start: 0.000000, bitrate: 206981 kb/s
    Stream #1:0: Video: h264 (High 4:4:4 Predictive) (H264 / 0x34363248), yuv420p10le(tv, smpte170m, progressive), 1440x836, 30 fps, 30 tbr, 30 tbn, 60 tbc
Stream mapping:
  Stream #0:0 (h264) -> ssim:main
  Stream #1:0 (h264) -> ssim:reference
  ssim -> Stream #0:0 (wrapped_avframe)
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
  Metadata:
    encoder : Lavf58.26.101
    Stream #0:0: Video: wrapped_avframe, yuv420p10le, 1440x836, q=2-31, 200 kb/s, 30 fps, 30 tbn, 30 tbc (default)
    Metadata:
      encoder : Lavc58.48.100 wrapped_avframe
frame= 1 fps=0.0 q=-0.0 Lsize=N/A time=00:00:00.03 bitrate=N/A speed=0.189x
video:1kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
[Parsed_ssim_0 @ 0000000002df0f40] SSIM Y:1.000000 (inf) U:1.000000 (inf) V:1.000000 (inf) All:1.000000 (inf)
"bt709.avi" and "bt601.avi" are from the same source and encoded using the same parameters, but tagged differently.
And apparently they don't look the same...
Attachments (3)
Change History (17)
follow-up: 2 comment:1 by , 6 years ago
Keywords: | colorspace removed |
---|---|
Resolution: | → invalid |
Status: | new → closed |
Type: | enhancement → defect |
comment:2 by , 6 years ago
Replying to cehoyos:
Apart from the attachment, I believe the filter acts as specified.
I don't think SSIM is supposed to be color-blind... (and I'm not talking about the `colorspace` filter)
Workaround
Due to the colorspace awareness issue of the `ssim` filter, it's recommended to convert both inputs to a colorspace without these color management hazards (e.g. RGB).
Or make sure that both inputs use exactly the same color management scheme.
by , 5 years ago
Attachment: | tainted.png added |
---|
Sheer black for all pixels, except the top-left pixel, which is tainted to "#010000".
comment:3 by , 5 years ago
Resolution: | invalid |
---|---|
Status: | closed → reopened |
Summary: | `-lavfi ssim` is not colorspace aware?.. → Malfunctioning `ssim` filter?.. |
I fear even the core of the `ssim` filter may not be functioning properly...
I purposely made 2 PNG files of "sheer black" content ("black.png", "tainted.png"), where "tainted.png" has its top-left pixel tainted to a different color, and:
ffmpeg -i "black.png" -i "tainted.png" -lavfi "ssim;[0][1]psnr" -f null -
ffmpeg version git-2020-01-26-5e62100 Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 9.2.1 (GCC) 20200122
  configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt --enable-amf
  libavutil 56. 38.100 / 56. 38.100
  libavcodec 58. 67.100 / 58. 67.100
  libavformat 58. 36.100 / 58. 36.100
  libavdevice 58. 9.103 / 58. 9.103
  libavfilter 7. 71.100 / 7. 71.100
  libswscale 5. 6.100 / 5. 6.100
  libswresample 3. 6.100 / 3. 6.100
  libpostproc 55. 6.100 / 55. 6.100
Input #0, png_pipe, from 'black.png':
  Duration: N/A, bitrate: N/A
    Stream #0:0: Video: png, rgb24(pc), 3840x2160, 25 tbr, 25 tbn, 25 tbc
Input #1, png_pipe, from 'tainted.png':
  Duration: N/A, bitrate: N/A
    Stream #1:0: Video: png, rgb24(pc), 3840x2160, 25 tbr, 25 tbn, 25 tbc
Stream mapping:
  Stream #0:0 (png) -> ssim:main
  Stream #0:0 (png) -> psnr:main
  Stream #1:0 (png) -> ssim:reference
  Stream #1:0 (png) -> psnr:reference
  ssim -> Stream #0:0 (wrapped_avframe)
  psnr -> Stream #0:1 (wrapped_avframe)
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
  Metadata:
    encoder : Lavf58.36.100
    Stream #0:0: Video: wrapped_avframe, gbrp(progressive), 3840x2160, q=2-31, 200 kb/s, 25 fps, 25 tbn, 25 tbc
    Metadata:
      encoder : Lavc58.67.100 wrapped_avframe
    Stream #0:1: Video: wrapped_avframe, gbrp, 3840x2160, q=2-31, 200 kb/s, 25 fps, 25 tbn, 25 tbc
    Metadata:
      encoder : Lavc58.67.100 wrapped_avframe
frame= 1 fps=0.0 q=-0.0 Lq=-0.0 size=N/A time=00:00:00.04 bitrate=N/A speed=0.122x
video:1kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
[Parsed_ssim_0 @ 0000000002a16b40] SSIM R:1.000000 (inf) G:1.000000 (inf) B:1.000000 (inf) All:1.000000 (inf)
[Parsed_psnr_1 @ 0000000002a1f2c0] PSNR r:117.318653 g:inf b:inf average:122.089866 min:122.089866 max:122.089866
...You sure this fits the definition?
Structural similarity - Wikipedia
The resultant SSIM index is a decimal value between -1 and 1, and value 1 is only reachable in the case of two identical sets of data and therefore indicates perfect structural similarity.
Also check:
AV1 vs VP9 vs AVC (h.264) vs HEVC (h.265) Part I - Lossless
SSIM Y:1.000000 (73.043867) U:1.000000 (70.134668) V:1.000000 (69.880162) All:1.000000 (72.541141) PSNR y:97.786557 u:99.494677 v:99.335577 average:98.264431 min:79.206295 max:inf
What does the value in parentheses for ffmpeg ssim log denote - Video Production Stack Exchange
It is dB representation of All value, calculated with following formula:
10 * log10(1 / (1 - ssim))
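To make the quoted formula concrete, here is a minimal Python sketch of it. The function name `ssim_db` is made up for illustration and this is not FFmpeg's code; it only shows why an SSIM of exactly 1 prints as "(inf)" while any value even slightly below 1 gives a finite dB figure.

```python
import math

def ssim_db(ssim: float) -> float:
    """dB form of an SSIM value, per the quoted formula 10 * log10(1 / (1 - ssim))."""
    if ssim >= 1.0:
        # 1 - ssim is zero, so the ratio is unbounded; report infinity.
        return math.inf
    return 10.0 * math.log10(1.0 / (1.0 - ssim))

print(ssim_db(0.99))  # about 20 dB
print(ssim_db(1.0))   # inf, i.e. the "(inf)" seen in the logs
```

So "(inf)" in the log means the computed SSIM was exactly 1.0 after all rounding, not merely close to it.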
follow-ups: 6 14 comment:4 by , 5 years ago
Resolution: | → invalid |
---|---|
Status: | reopened → closed |
Feel free to open a new ticket instead of changing an existing one but please understand that there will always be a resolution high enough to compensate for one different pixel.
comment:6 by , 5 years ago
Replying to cehoyos:
Feel free to open a new ticket instead of changing an existing one.
The 2 topics are closely related so I believe it would be better to have them merged.
Replying to richardpl:
SSIM does not work like that, if you need one pixel difference use PSNR.
So you mean that what's on Wikipedia is plainly wrong?..
If so, it would be very kind of you to edit that page and provide sufficient proof. The rest of the world would appreciate it.
follow-up: 8 comment:7 by , 5 years ago
Look, if you can prove your statement, feel free to reopen the bug; otherwise keep calm so as not to reveal big ignorance. The page you linked nowhere mentions that a single pixel changed by one gives different results.
comment:8 by , 5 years ago
Replying to richardpl:
Look, if you can prove your statement, feel free to reopen the bug; otherwise keep calm so as not to reveal big ignorance.
Huh?.. Why would you take a polite request as the sign of big ignorance?
Replying to richardpl:
The page you linked nowhere mentions that a single pixel changed by one gives different results.
Check the comment:3, I believe it's clear enough.
comment:9 by , 5 years ago
Let's try in other words: find an SSIM implementation that gives the results you expect.
comment:10 by , 5 years ago
Could it be a rounding issue (decimal places / precision)? If the value in parentheses is the dB representation of the All value, and values are rounded to 6 decimals, could that be the issue?
Repeating the tests at different resolutions, e.g. 1280x720, 1920x1080, etc., all with 1 pixel @0,0 as RGB [1,0,0]:
At some point between 1280x800 and 1200x1000, ffmpeg ssim stops detecting the difference (inf). It also reports the same values for the 1280x720 and 1280x800 versions, whereas PSNR distinguishes them.
ffmpeg -i "1280x720_1.png" -i "1280x720_0.png" -lavfi "ssim;[0][1]psnr" -f null -
[Parsed_ssim_0 @ 0000006362f0f900] SSIM R:1.000000 (72.247199) G:1.000000 (inf) B:1.000000 (inf) All:1.000000 (inf)
[Parsed_psnr_1 @ 0000006362f16ac0] PSNR r:107.776228 g:inf b:inf average:112.547441 min:112.547441 max:112.547441
ffmpeg -i "1280x800_1.png" -i "1280x800_0.png" -lavfi "ssim;[0][1]psnr" -f null -
[Parsed_ssim_0 @ 0000004e6a813700] SSIM R:1.000000 (72.247199) G:1.000000 (inf) B:1.000000 (inf) All:1.000000 (inf)
[Parsed_psnr_1 @ 0000004e6a83e780] PSNR r:108.233803 g:inf b:inf average:113.005016 min:113.005016 max:113.005016
ffmpeg -i "1200x1000_1.png" -i "1200x1000_0.png" -lavfi "ssim;[0][1]psnr" -f null -
[Parsed_ssim_0 @ 00000076083dc340] SSIM R:1.000000 (inf) G:1.000000 (inf) B:1.000000 (inf) All:1.000000 (inf)
[Parsed_psnr_1 @ 00000076083dcfc0] PSNR r:108.922616 g:inf b:inf average:113.693829 min:113.693829 max:113.693829
ffmpeg -i "1440x1080_1.png" -i "1440x1080_0.png" -lavfi "ssim;[0][1]psnr" -f null -
[Parsed_ssim_0 @ 000000930607e6c0] SSIM R:1.000000 (inf) G:1.000000 (inf) B:1.000000 (inf) All:1.000000 (inf)
[Parsed_psnr_1 @ 0000009306071340] PSNR r:110.048666 g:inf b:inf average:114.819879 min:114.819879 max:114.819879
ffmpeg -i "1920x1080_1.png" -i "1920x1080_0.png" -lavfi "ssim;[0][1]psnr" -f null -
[Parsed_ssim_0 @ 0000003f50d34e80] SSIM R:1.000000 (inf) G:1.000000 (inf) B:1.000000 (inf) All:1.000000 (inf)
[Parsed_psnr_1 @ 0000003f50d3cd80] PSNR r:111.298053 g:inf b:inf average:116.069266 min:116.069266 max:116.069266
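As a sanity check on the PSNR figures in these logs, the arithmetic for a single differing sample can be sketched as follows. It assumes 8-bit samples (peak 255), one sample differing by 1, and an "average" taken over the mean of three equally sized planes; the function name `psnr_single_pixel` is hypothetical and this is not FFmpeg's code.

```python
import math

PEAK = 255  # assumption: 8-bit samples, as in the rgb24 PNG inputs

def psnr_single_pixel(width: int, height: int, diff: int = 1, channels: int = 1) -> float:
    """PSNR when exactly one sample out of width*height differs by `diff`.

    With channels=3 the squared error is averaged over three equally sized
    planes, only one of which contains the changed sample.
    """
    mse = diff * diff / (width * height * channels)
    return 10.0 * math.log10(PEAK * PEAK / mse)

for w, h in [(1280, 720), (1280, 800), (1200, 1000), (1440, 1080), (1920, 1080)]:
    r = psnr_single_pixel(w, h)
    avg = psnr_single_pixel(w, h, channels=3)
    print(f"{w}x{h}: r={r:.6f} average={avg:.6f}")
```

Under those assumptions the printed values agree with the `r:` and `average:` figures above to within rounding, which suggests the PSNR filter is tracking the one-pixel difference at every resolution tested.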
vapoursynth ssim can detect the differences, but it carries more decimal places
vapoursynth ssim (no downsampling; enabling downsampling also detects the difference)
3840x2160: 0.9999999975397562135270845828927122056484222412109375
1920x1080: 0.99999999015902474308603586905519478023052215576171875
1280x720: 0.99999997785780581072145878351875580847263336181640625
self test @ 1920x1080 (to check validity): 1
There are different variations of SSIM calculations - some use different window sizes, some downsample (as suggested in original ssim paper), some apply gaussian filter (slower), some a box blur (faster)
https://github.com/FFmpeg/FFmpeg/blob/master/libavfilter/vf_ssim.c
* To improve speed, this implementation uses the standard approximation of
* overlapped 8x8 block sums, rather than the original gaussian weights.
Could this "speed" implementation be contributing to the issue?
follow-up: 12 comment:11 by , 5 years ago
No, it's float usage in places where double should be used instead.
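The effect of single versus double precision can be illustrated with a short Python sketch. It is not FFmpeg's code: it takes the vapoursynth SSIM values reported in comment:10, rounds them to single precision with `struct`, and compares the dB figures. The helper names `to_float32` and `ssim_db` are made up for this example.

```python
import math
import struct

def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def ssim_db(ssim: float) -> float:
    """10 * log10(1 / (1 - ssim)), with exactly 1.0 mapped to infinity."""
    return math.inf if ssim >= 1.0 else 10.0 * math.log10(1.0 / (1.0 - ssim))

# SSIM values from the vapoursynth test in comment:10 (truncated to double precision).
reported = {
    "3840x2160": 0.9999999975397562,
    "1920x1080": 0.9999999901590247,
    "1280x720": 0.9999999778578058,
}

for name, ssim in reported.items():
    print(name, "double:", ssim_db(ssim), "float32:", ssim_db(to_float32(ssim)))

# The largest single-precision value below 1.0 is 1 - 2**-24.
print(ssim_db(1.0 - 2.0 ** -24))
```

All three values are closer to 1.0 than to the nearest single-precision number below it, so they round to exactly 1.0 and come out as inf, while in double precision they stay finite (roughly 86 dB for the 3840x2160 case). The last line prints about 72.2472, the same figure as the "(72.247199)" in the 1280x720 and 1280x800 logs, which would be consistent with a single-precision limit; that reading is an inference, not something stated in the thread.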
comment:12 by , 5 years ago
Resolution: | invalid → fixed |
---|
Replying to richardpl:
No, it's float usage in places where double should be used instead.
Does fcc0424c933742c8fc852371e985d16b6eb4bfe9 fix this problem? Yes, together with the follow-up fix in commit 0815a22dccbb67970ea84559f22afacee4219192.
comment:13 by , 5 years ago
Repeating those tests above with a binary that includes those commits - differences are detected now (no longer "(inf)") when they were not before
comment:14 by , 5 years ago
Replying to cehoyos:
...please understand that there will always be a resolution high enough to compensate for one different pixel.
I do understand that but also realize: the error is certainly avoidable.
By distinguishing the case of zero absolute difference from all other cases (using a flag), the filter could conditionally adjust the original algorithm's result to a slightly less accurate but still practically accurate value, so that 1 is reported only for identical inputs (hence the distinction, and a better fit to the definition).
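A minimal Python sketch of that idea, under the assumption that the caller already knows whether the two inputs are bit-identical. The function `report_ssim` and its `identical` flag are hypothetical and do not correspond to anything in FFmpeg.

```python
import sys

# Largest double strictly below 1.0 (1 - 2**-53).
LARGEST_BELOW_ONE = 1.0 - sys.float_info.epsilon / 2

def report_ssim(ssim: float, identical: bool) -> float:
    """Return an SSIM value that is exactly 1.0 only for identical inputs.

    `identical` is the flag from the proposal: True when the absolute
    difference between the two inputs is zero everywhere.
    """
    if identical:
        return 1.0
    # Inputs differ: never report a perfect score, even if rounding produced one.
    return min(ssim, LARGEST_BELOW_ONE)

print(report_ssim(1.0, identical=True))   # 1.0
print(report_ssim(1.0, identical=False))  # just below 1.0
print(report_ssim(0.75, identical=False)) # unchanged
```

The clamp only changes results that rounding had pushed to exactly 1.0, so ordinary scores are left untouched.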