Opened 3 months ago

Closed 3 months ago

#7226 closed defect (fixed)

Generic vf_blend implementation contains unaligned memory access

Reported by: Lastique Owned by:
Priority: minor Component: avfilter
Version: git-master Keywords: blend
Cc: Blocked By:
Blocking: Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:

The generic C implementation of the vf_blend filter (libavfilter/vf_blend.c, see macros and functions blend_*_16bit) for 16-bit components performs possibly unaligned loads and stores of uint16_t components. This is done as a result of casting uint8_t* pointers to uint16_t* and then reading and writing data through the casted pointers.

On some architectures unaligned memory access is not permitted. Additionally, given the casts the compiler is free to falsely assume the pointers are aligned and apply optimizations that are not valid if the pointers are not aligned.

In my case, I'm building ffmpeg from sources with gcc 7.3.0 and vectorization enabled (I modified configure script to not disable vectorization) for x86-64. The compiler vectorizes the loops in the blend_*_16bit functions and performs aligned loads of the input data. However, the checkasm --test=vf_blend then crashes as it passes unaligned pointers to the blend algorithm implementation. Here is the backtrace:

Program received signal SIGSEGV, Segmentation fault.
0x0000555555736130 in blend_addition_16bit (_top=<optimized out>, top_linesize=<optimized out>, _bottom=<optimized out>, bottom_linesize=<optimized out>, _dst=<optimized out>, dst_linesize=<optimized out>, width=128, height=256, param=0x7fffffffd130, values=0x0, starty=-134463424)
    at src/libavfilter/vf_blend.c:281
281     src/libavfilter/vf_blend.c: No such file or directory.
(gdb) bt
#0  0x0000555555736130 in blend_addition_16bit (_top=<optimized out>, top_linesize=<optimized out>, _bottom=<optimized out>, bottom_linesize=<optimized out>, _dst=<optimized out>, dst_linesize=<optimized out>, width=128, height=256, param=0x7fffffffd130, values=0x0, starty=-134463424)
    at src/libavfilter/vf_blend.c:281
#1  0x00005555557057a6 in checkasm_check_blend () at src/tests/checkasm/vf_blend.c:126
#2  0x00005555556ebc39 in check_cpu_flag (flag=<optimized out>, name=0x0) at src/tests/checkasm/checkasm.c:572
#3  0x00005555556ebc39 in main (argc=<optimized out>, argv=<optimized out>) at src/tests/checkasm/checkasm.c:680

The fault instruction is movdqa (%r12,%rax,1),%xmm0 which attempts to load aligned 16 bytes from effective address 0x7ffff7fd4041. The compiler also generated pre-alignment code, which assumes that the initial pointer is aligned to 2 bytes boundary, which is why the compiler used the aligned load.

I understand that the unmodified ffmpeg on x86 is likely not affected because x86 allows unaligned 16-bit loads and stores and vectorization is disabled by default. However, I still think the code in question is not correct and can fail on other architectures.

How to reproduce:

  1. Compile ffmpeg with vectorization enabled. For that comment the check_optflags -fno-tree-vectorize line in configure (I'm attaching the actual patch I used). I don't believe the actual configure flags are important but here is what I used:
./configure --prefix=/usr --extra-version="1mind1" --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libaom --enable-libwavpack --enable-libwebp --enable-libopenh264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libmysofa --enable-libdc1394 --enable-libdrm --enable-libiec61883 --disable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared --enable-gnutls

The compiler used: gcc 7.3.0 on Kubuntu 18.04. Compile for x86-64.

  1. Run the vf_blend test: tests/checkasm/checkasm --test=vf_blend. The test will crash.

I found this problem with ffmpeg 4.0, but the current git master seems to have the same code in libavfilter/vf_blend.c, so I assume it is affected as well.

Attachments (3)

00_compile_flags.patch (558 bytes) - added by Lastique 3 months ago.
The patch used to enable auto-vectorization
11_fix_vf_blend_unaligned_crash.patch (7.8 KB) - added by Lastique 3 months ago.
The patch to fix unaligned memory access in vf_blend
11_fix_vf_blend_unaligned_test.patch (1.1 KB) - added by Lastique 3 months ago.
The patch to fix vf_blend test

Download all attachments as: .zip

Change History (15)

Changed 3 months ago by Lastique

The patch used to enable auto-vectorization

comment:1 Changed 3 months ago by Lastique

I've also attached a patch to fix the unaligned memory access. With it applied, I can no longer reproduce the crash (the compiler still vectorizes the code but uses unaligned memory loads and stores).

comment:2 Changed 3 months ago by richardpl

Are you sure crash can be produced with blend filter wih ffmpeg?

comment:3 follow-up: Changed 3 months ago by Lastique

I'm not sure it will be reproducible with ffmpeg command line tool because there are asm versions that will likely be used on modern CPUs. The problem is in the generic C version, which is explicitly used by the test. Maybe it'll reproduce if you build ffmpeg without asm optimizations.

Having the tests not fail is important for me because running tests is a step in the packaging process. I wouldn't want to disable tests because of this problem.

comment:4 in reply to: ↑ 3 Changed 3 months ago by jamrial

We have AV_R* and AV_W* macros in libavutil/intreadwrite.h for unaligned reads and writes. You can see examples of their usage in several other filters.
All those memcpys in your patch are pretty ugly and can be avoided with the aforementioned convenience macros.

That being said, the fact the build system forcefully disables vectorization should serve as a hint that it's done for a reason.

Replying to Lastique:

I'm not sure it will be reproducible with ffmpeg command line tool because there are asm versions that will likely be used on modern CPUs. The problem is in the generic C version, which is explicitly used by the test. Maybe it'll reproduce if you build ffmpeg without asm optimizations.

The -cpuflags option lets you choose which intruction sets you want to enable, so "-cpuflags none" will make sure no asm optimized function is used, outside of stuff hardcoded at compile time.

Changed 3 months ago by Lastique

The patch to fix unaligned memory access in vf_blend

comment:5 Changed 3 months ago by richardpl

Patch is not good, you killed macro. Besides checkasm test thing is wrong if data is not 16 byte aligned, for 16 bit input it must always be.

comment:6 Changed 3 months ago by Lastique

We have AV_R* and AV_W* macros in libavutil/intreadwrite.h for unaligned reads and writes.

Thanks, I didn't know about the macros. I've attached an updated patch. Unfortunately, I had to expand DEFINE_BLEND_EXPR as I could not use the macros from it.

I also corrected a minor oversight in that A and B macros were still referencing the data through pointers instead of using the loaded values. I guess, the compiler optimized away these references.

That being said, the fact the build system forcefully disables vectorization should serve as a hint that it's done for a reason.

The only reason I found was this thread from back 2013. Things could have change since then.

Anyway, auto-vectorization is not the problem, it merely exposed the bug.

The -cpuflags option lets you choose which intruction sets you want to enable, so "-cpuflags none" will make sure no asm optimized function is used, outside of stuff hardcoded at compile time.

Ok, but ffmpeg will probably align the buffers. The test specifically tests with misaligned pointers.

comment:7 Changed 3 months ago by Lastique

Patch is not good, you killed macro.

I couldn't use the AV_* macros otherwise.

Besides checkasm test thing is wrong if data is not 16 byte aligned, for 16 bit input it must always be.

You mean "2-byte aligned"? Maybe so. I don't know what use case it verifies. Would you like a patch that fixes the test instead?

Changed 3 months ago by Lastique

The patch to fix vf_blend test

comment:8 Changed 3 months ago by Lastique

Here's the patch that fixes the test instead. Aside from fixing 2-byte misalignment, I also changes how the misalignment is applied to avoid source buffer overruns.

comment:9 Changed 3 months ago by jamrial

Could you submit this (or any) patch to the mailing list? That's the place for contributions and where it will get the most attention. See https://ffmpeg.org/contact.html#MailingLists

comment:10 follow-up: Changed 3 months ago by cehoyos

  • Keywords blend added; vf_blend removed

Do I understand correctly that you can only reproduce a crash with checkasm, not with an actual command line?

comment:11 in reply to: ↑ 10 Changed 3 months ago by Lastique

Replying to cehoyos:

Do I understand correctly that you can only reproduce a crash with checkasm, not with an actual command line?

Yes.

comment:12 Changed 3 months ago by cehoyos

  • Priority changed from normal to minor
  • Resolution set to fixed
  • Status changed from new to closed

Tests changed by you in d7eb8d84757155c1fb0fe1a8269025e22e9bf0ec, thank you for the report and the fix!

Note: See TracTickets for help on using tickets.