Opened 6 years ago

Closed 6 years ago

#7226 closed defect (fixed)

Generic vf_blend implementation contains unaligned memory access

Reported by: Lastique Owned by:
Priority: minor Component: avfilter
Version: git-master Keywords: blend
Cc: Blocked By:
Blocking: Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:

The generic C implementation of the vf_blend filter (libavfilter/vf_blend.c, see macros and functions blend_*_16bit) for 16-bit components performs possibly unaligned loads and stores of uint16_t components. This is done as a result of casting uint8_t* pointers to uint16_t* and then reading and writing data through the casted pointers.

On some architectures unaligned memory access is not permitted. Additionally, given the casts the compiler is free to falsely assume the pointers are aligned and apply optimizations that are not valid if the pointers are not aligned.

In my case, I'm building ffmpeg from sources with gcc 7.3.0 and vectorization enabled (I modified configure script to not disable vectorization) for x86-64. The compiler vectorizes the loops in the blend_*_16bit functions and performs aligned loads of the input data. However, the checkasm --test=vf_blend then crashes as it passes unaligned pointers to the blend algorithm implementation. Here is the backtrace:

Program received signal SIGSEGV, Segmentation fault.
0x0000555555736130 in blend_addition_16bit (_top=<optimized out>, top_linesize=<optimized out>, _bottom=<optimized out>, bottom_linesize=<optimized out>, _dst=<optimized out>, dst_linesize=<optimized out>, width=128, height=256, param=0x7fffffffd130, values=0x0, starty=-134463424)
    at src/libavfilter/vf_blend.c:281
281     src/libavfilter/vf_blend.c: No such file or directory.
(gdb) bt
#0  0x0000555555736130 in blend_addition_16bit (_top=<optimized out>, top_linesize=<optimized out>, _bottom=<optimized out>, bottom_linesize=<optimized out>, _dst=<optimized out>, dst_linesize=<optimized out>, width=128, height=256, param=0x7fffffffd130, values=0x0, starty=-134463424)
    at src/libavfilter/vf_blend.c:281
#1  0x00005555557057a6 in checkasm_check_blend () at src/tests/checkasm/vf_blend.c:126
#2  0x00005555556ebc39 in check_cpu_flag (flag=<optimized out>, name=0x0) at src/tests/checkasm/checkasm.c:572
#3  0x00005555556ebc39 in main (argc=<optimized out>, argv=<optimized out>) at src/tests/checkasm/checkasm.c:680

The fault instruction is movdqa (%r12,%rax,1),%xmm0 which attempts to load aligned 16 bytes from effective address 0x7ffff7fd4041. The compiler also generated pre-alignment code, which assumes that the initial pointer is aligned to 2 bytes boundary, which is why the compiler used the aligned load.

I understand that the unmodified ffmpeg on x86 is likely not affected because x86 allows unaligned 16-bit loads and stores and vectorization is disabled by default. However, I still think the code in question is not correct and can fail on other architectures.

How to reproduce:

  1. Compile ffmpeg with vectorization enabled. For that comment the check_optflags -fno-tree-vectorize line in configure (I'm attaching the actual patch I used). I don't believe the actual configure flags are important but here is what I used:
./configure --prefix=/usr --extra-version="1mind1" --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libaom --enable-libwavpack --enable-libwebp --enable-libopenh264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libmysofa --enable-libdc1394 --enable-libdrm --enable-libiec61883 --disable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared --enable-gnutls

The compiler used: gcc 7.3.0 on Kubuntu 18.04. Compile for x86-64.

  1. Run the vf_blend test: tests/checkasm/checkasm --test=vf_blend. The test will crash.

I found this problem with ffmpeg 4.0, but the current git master seems to have the same code in libavfilter/vf_blend.c, so I assume it is affected as well.

Attachments (3)

00_compile_flags.patch (558 bytes ) - added by Lastique 6 years ago.
The patch used to enable auto-vectorization
11_fix_vf_blend_unaligned_crash.patch (7.8 KB ) - added by Lastique 6 years ago.
The patch to fix unaligned memory access in vf_blend
11_fix_vf_blend_unaligned_test.patch (1.1 KB ) - added by Lastique 6 years ago.
The patch to fix vf_blend test

Download all attachments as: .zip

Change History (15)

by Lastique, 6 years ago

Attachment: 00_compile_flags.patch added

The patch used to enable auto-vectorization

comment:1 by Lastique, 6 years ago

I've also attached a patch to fix the unaligned memory access. With it applied, I can no longer reproduce the crash (the compiler still vectorizes the code but uses unaligned memory loads and stores).

comment:2 by Elon Musk, 6 years ago

Are you sure crash can be produced with blend filter wih ffmpeg?

comment:3 by Lastique, 6 years ago

I'm not sure it will be reproducible with ffmpeg command line tool because there are asm versions that will likely be used on modern CPUs. The problem is in the generic C version, which is explicitly used by the test. Maybe it'll reproduce if you build ffmpeg without asm optimizations.

Having the tests not fail is important for me because running tests is a step in the packaging process. I wouldn't want to disable tests because of this problem.

in reply to:  3 comment:4 by James, 6 years ago

We have AV_R* and AV_W* macros in libavutil/intreadwrite.h for unaligned reads and writes. You can see examples of their usage in several other filters.
All those memcpys in your patch are pretty ugly and can be avoided with the aforementioned convenience macros.

That being said, the fact the build system forcefully disables vectorization should serve as a hint that it's done for a reason.

Replying to Lastique:

I'm not sure it will be reproducible with ffmpeg command line tool because there are asm versions that will likely be used on modern CPUs. The problem is in the generic C version, which is explicitly used by the test. Maybe it'll reproduce if you build ffmpeg without asm optimizations.

The -cpuflags option lets you choose which intruction sets you want to enable, so "-cpuflags none" will make sure no asm optimized function is used, outside of stuff hardcoded at compile time.

by Lastique, 6 years ago

The patch to fix unaligned memory access in vf_blend

comment:5 by Elon Musk, 6 years ago

Patch is not good, you killed macro. Besides checkasm test thing is wrong if data is not 16 byte aligned, for 16 bit input it must always be.

comment:6 by Lastique, 6 years ago

We have AV_R* and AV_W* macros in libavutil/intreadwrite.h for unaligned reads and writes.

Thanks, I didn't know about the macros. I've attached an updated patch. Unfortunately, I had to expand DEFINE_BLEND_EXPR as I could not use the macros from it.

I also corrected a minor oversight in that A and B macros were still referencing the data through pointers instead of using the loaded values. I guess, the compiler optimized away these references.

That being said, the fact the build system forcefully disables vectorization should serve as a hint that it's done for a reason.

The only reason I found was this thread from back 2013. Things could have change since then.

Anyway, auto-vectorization is not the problem, it merely exposed the bug.

The -cpuflags option lets you choose which intruction sets you want to enable, so "-cpuflags none" will make sure no asm optimized function is used, outside of stuff hardcoded at compile time.

Ok, but ffmpeg will probably align the buffers. The test specifically tests with misaligned pointers.

comment:7 by Lastique, 6 years ago

Patch is not good, you killed macro.

I couldn't use the AV_* macros otherwise.

Besides checkasm test thing is wrong if data is not 16 byte aligned, for 16 bit input it must always be.

You mean "2-byte aligned"? Maybe so. I don't know what use case it verifies. Would you like a patch that fixes the test instead?

by Lastique, 6 years ago

The patch to fix vf_blend test

comment:8 by Lastique, 6 years ago

Here's the patch that fixes the test instead. Aside from fixing 2-byte misalignment, I also changes how the misalignment is applied to avoid source buffer overruns.

comment:9 by James, 6 years ago

Could you submit this (or any) patch to the mailing list? That's the place for contributions and where it will get the most attention. See https://ffmpeg.org/contact.html#MailingLists

comment:10 by Carl Eugen Hoyos, 6 years ago

Keywords: blend added; vf_blend removed

Do I understand correctly that you can only reproduce a crash with checkasm, not with an actual command line?

in reply to:  10 comment:11 by Lastique, 6 years ago

Replying to cehoyos:

Do I understand correctly that you can only reproduce a crash with checkasm, not with an actual command line?

Yes.

comment:12 by Carl Eugen Hoyos, 6 years ago

Priority: normalminor
Resolution: fixed
Status: newclosed

Tests changed by you in d7eb8d84757155c1fb0fe1a8269025e22e9bf0ec, thank you for the report and the fix!

Note: See TracTickets for help on using tickets.