Opened 7 years ago
Closed 7 years ago
#7226 closed defect (fixed)
Generic vf_blend implementation contains unaligned memory access
Reported by: | Lastique | Owned by: | |
---|---|---|---|
Priority: | minor | Component: | avfilter |
Version: | git-master | Keywords: | blend |
Cc: | Blocked By: | ||
Blocking: | Reproduced by developer: | no | |
Analyzed by developer: | no |
Description
Summary of the bug:
The generic C implementation of the vf_blend filter (libavfilter/vf_blend.c, see macros and functions blend_*_16bit
) for 16-bit components performs possibly unaligned loads and stores of uint16_t
components. This is done as a result of casting uint8_t*
pointers to uint16_t*
and then reading and writing data through the casted pointers.
On some architectures unaligned memory access is not permitted. Additionally, given the casts the compiler is free to falsely assume the pointers are aligned and apply optimizations that are not valid if the pointers are not aligned.
In my case, I'm building ffmpeg from sources with gcc 7.3.0 and vectorization enabled (I modified configure
script to not disable vectorization) for x86-64. The compiler vectorizes the loops in the blend_*_16bit
functions and performs aligned loads of the input data. However, the checkasm --test=vf_blend
then crashes as it passes unaligned pointers to the blend algorithm implementation. Here is the backtrace:
Program received signal SIGSEGV, Segmentation fault. 0x0000555555736130 in blend_addition_16bit (_top=<optimized out>, top_linesize=<optimized out>, _bottom=<optimized out>, bottom_linesize=<optimized out>, _dst=<optimized out>, dst_linesize=<optimized out>, width=128, height=256, param=0x7fffffffd130, values=0x0, starty=-134463424) at src/libavfilter/vf_blend.c:281 281 src/libavfilter/vf_blend.c: No such file or directory. (gdb) bt #0 0x0000555555736130 in blend_addition_16bit (_top=<optimized out>, top_linesize=<optimized out>, _bottom=<optimized out>, bottom_linesize=<optimized out>, _dst=<optimized out>, dst_linesize=<optimized out>, width=128, height=256, param=0x7fffffffd130, values=0x0, starty=-134463424) at src/libavfilter/vf_blend.c:281 #1 0x00005555557057a6 in checkasm_check_blend () at src/tests/checkasm/vf_blend.c:126 #2 0x00005555556ebc39 in check_cpu_flag (flag=<optimized out>, name=0x0) at src/tests/checkasm/checkasm.c:572 #3 0x00005555556ebc39 in main (argc=<optimized out>, argv=<optimized out>) at src/tests/checkasm/checkasm.c:680
The fault instruction is movdqa (%r12,%rax,1),%xmm0
which attempts to load aligned 16 bytes from effective address 0x7ffff7fd4041. The compiler also generated pre-alignment code, which assumes that the initial pointer is aligned to 2 bytes boundary, which is why the compiler used the aligned load.
I understand that the unmodified ffmpeg on x86 is likely not affected because x86 allows unaligned 16-bit loads and stores and vectorization is disabled by default. However, I still think the code in question is not correct and can fail on other architectures.
How to reproduce:
- Compile ffmpeg with vectorization enabled. For that comment the
check_optflags -fno-tree-vectorize
line inconfigure
(I'm attaching the actual patch I used). I don't believe the actualconfigure
flags are important but here is what I used:
./configure --prefix=/usr --extra-version="1mind1" --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libaom --enable-libwavpack --enable-libwebp --enable-libopenh264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libmysofa --enable-libdc1394 --enable-libdrm --enable-libiec61883 --disable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared --enable-gnutls
The compiler used: gcc 7.3.0 on Kubuntu 18.04. Compile for x86-64.
- Run the vf_blend test:
tests/checkasm/checkasm --test=vf_blend
. The test will crash.
I found this problem with ffmpeg 4.0, but the current git master seems to have the same code in libavfilter/vf_blend.c, so I assume it is affected as well.
Attachments (3)
Change History (15)
by , 7 years ago
Attachment: | 00_compile_flags.patch added |
---|
comment:1 by , 7 years ago
I've also attached a patch to fix the unaligned memory access. With it applied, I can no longer reproduce the crash (the compiler still vectorizes the code but uses unaligned memory loads and stores).
follow-up: 4 comment:3 by , 7 years ago
I'm not sure it will be reproducible with ffmpeg
command line tool because there are asm versions that will likely be used on modern CPUs. The problem is in the generic C version, which is explicitly used by the test. Maybe it'll reproduce if you build ffmpeg without asm optimizations.
Having the tests not fail is important for me because running tests is a step in the packaging process. I wouldn't want to disable tests because of this problem.
comment:4 by , 7 years ago
We have AV_R* and AV_W* macros in libavutil/intreadwrite.h for unaligned reads and writes. You can see examples of their usage in several other filters.
All those memcpys in your patch are pretty ugly and can be avoided with the aforementioned convenience macros.
That being said, the fact the build system forcefully disables vectorization should serve as a hint that it's done for a reason.
Replying to Lastique:
I'm not sure it will be reproducible with
ffmpeg
command line tool because there are asm versions that will likely be used on modern CPUs. The problem is in the generic C version, which is explicitly used by the test. Maybe it'll reproduce if you build ffmpeg without asm optimizations.
The -cpuflags option lets you choose which intruction sets you want to enable, so "-cpuflags none" will make sure no asm optimized function is used, outside of stuff hardcoded at compile time.
by , 7 years ago
Attachment: | 11_fix_vf_blend_unaligned_crash.patch added |
---|
The patch to fix unaligned memory access in vf_blend
comment:5 by , 7 years ago
Patch is not good, you killed macro. Besides checkasm test thing is wrong if data is not 16 byte aligned, for 16 bit input it must always be.
comment:6 by , 7 years ago
We have AV_R* and AV_W* macros in libavutil/intreadwrite.h for unaligned reads and writes.
Thanks, I didn't know about the macros. I've attached an updated patch. Unfortunately, I had to expand DEFINE_BLEND_EXPR
as I could not use the macros from it.
I also corrected a minor oversight in that A
and B
macros were still referencing the data through pointers instead of using the loaded values. I guess, the compiler optimized away these references.
That being said, the fact the build system forcefully disables vectorization should serve as a hint that it's done for a reason.
The only reason I found was this thread from back 2013. Things could have change since then.
Anyway, auto-vectorization is not the problem, it merely exposed the bug.
The -cpuflags option lets you choose which intruction sets you want to enable, so "-cpuflags none" will make sure no asm optimized function is used, outside of stuff hardcoded at compile time.
Ok, but ffmpeg will probably align the buffers. The test specifically tests with misaligned pointers.
comment:7 by , 7 years ago
Patch is not good, you killed macro.
I couldn't use the AV_*
macros otherwise.
Besides checkasm test thing is wrong if data is not 16 byte aligned, for 16 bit input it must always be.
You mean "2-byte aligned"? Maybe so. I don't know what use case it verifies. Would you like a patch that fixes the test instead?
by , 7 years ago
Attachment: | 11_fix_vf_blend_unaligned_test.patch added |
---|
The patch to fix vf_blend test
comment:8 by , 7 years ago
Here's the patch that fixes the test instead. Aside from fixing 2-byte misalignment, I also changes how the misalignment is applied to avoid source buffer overruns.
comment:9 by , 7 years ago
Could you submit this (or any) patch to the mailing list? That's the place for contributions and where it will get the most attention. See https://ffmpeg.org/contact.html#MailingLists
follow-up: 11 comment:10 by , 7 years ago
Keywords: | blend added; vf_blend removed |
---|
Do I understand correctly that you can only reproduce a crash with checkasm, not with an actual command line?
comment:11 by , 7 years ago
Replying to cehoyos:
Do I understand correctly that you can only reproduce a crash with checkasm, not with an actual command line?
Yes.
comment:12 by , 7 years ago
Priority: | normal → minor |
---|---|
Resolution: | → fixed |
Status: | new → closed |
Tests changed by you in d7eb8d84757155c1fb0fe1a8269025e22e9bf0ec, thank you for the report and the fix!
The patch used to enable auto-vectorization