Opened 4 years ago

Closed 4 years ago

#1844 closed defect (fixed)

Audio resampling asm crashes on windows

Reported by: thegeek
Owned by:
Priority: normal
Component: swresample
Version: git-master
Keywords: win64 crash SIGSEGV
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:

I get a segmentation fault when trying to join two video files.
My ffmpeg is built using my own cross-compiling toolchain. The toolchain itself was built with the mxe cross-development project.
I've updated the components in the toolchain to the latest version (e.g. gcc/mingw-w64 4.7.2).

I can reproduce the crash with git master as well as version 1.0, but only when I build with my own cross-compiling toolchain and only with asm enabled. If I configure with --disable-asm, I cannot reproduce it.
I can not reproduce it with e.g. Zeranoe's win32 builds.

How to reproduce:

Unfortunately I can not give a repro case for any precompiled binaries that are publicly available, but I can probably upload the binaries I've compiled if anyone needs them.

I can reproduce the crash very reliably with my own binaries, here is the full command line I'm using:

% ffmpeg -i ..\..\Introduction_a.mkv -i ..\..\lesson.wmv -filter_complex "[1:1] setsar=1:1 [lv]; [0:0] [0:1] [lv] [1:0] concat=n=2:v=1:a=1 [a] [v]" -map "[a]" -map "[v]" -vcodec mpeg4 -v:q 7 output.mkv
ffmpeg version N-45928-g8b03cd3 Copyright (c) 2000-2012 the FFmpeg developers
  built on Oct 24 2012 12:18:03 with gcc 4.7.2 (GCC)
  configuration: --cross-prefix=/home/swingcatalyst/mxe/build-ffmpeg/../../mxe/mxe-multitarget-static/usr/bin/x86_64-static-mingw32- --enable-cross-compile --arch=x86_64 --target-os=mingw32 --prefix=/home/swingcatalyst/mxe/build-ffmpeg/install/ffmpeg-git-x86_64-shared-install --disable-static --enable-shared --disable-postproc --disable-pthreads --enable-runtime-cpudetect --enable-bzlib --enable-libfreetype --enable-libmp3lame --enable-libspeex --enable-libtheora --enable-libvorbis --enable-libvpx --enable-zlib --disable-stripping
  libavutil      52.  0.100 / 52.  0.100
  libavcodec     54. 68.100 / 54. 68.100
  libavformat    54. 34.100 / 54. 34.100
  libavdevice    54.  3.100 / 54.  3.100
  libavfilter     3. 20.105 /  3. 20.105
  libswscale      2.  1.101 /  2.  1.101
  libswresample   0. 16.100 /  0. 16.100
Input #0, matroska,webm, from '..\..\Introduction_a.mkv':
  Metadata:
    ENCODER         : Lavf54.29.104
  Duration: 00:00:06.29, start: 0.000000, bitrate: 2678 kb/s
    Stream #0:0: Video: mpeg4 (Simple Profile), yuv420p, 1920x1200 [SAR 1:1 DAR 8:5], 24 fps, 24 tbr, 1k tbn, 24 tbc (default)
    Stream #0:1: Audio: mp3, 44100 Hz, mono, s16, 64 kb/s (default)
[wmv3 @ 00000000019DDD40] Extra data: 8 bits left, value: 20
Guessed Channel Layout for  Input Stream #1.0 : stereo
Input #1, asf, from '..\..\lesson.wmv':
  Metadata:
    WMFSDKVersion   : 12.0.9200.16384
    WMFSDKNeeded    : 0.0.0.0000
    IsVBR           : 0
  Duration: 00:00:24.03, start: 0.000000, bitrate: 1713 kb/s
    Stream #1:0(nor): Audio: wmav2 (a[1][0][0] / 0x0161), 44100 Hz, stereo, fltp, 96 kb/s
    Stream #1:1(nor): Video: wmv3 (Simple) (WMV3 / 0x33564D57), yuv420p, 1920x1200, 2000 kb/s, 7 tbr, 1k tbn, 1k tbc
[Parsed_setsar_0 @ 0000000003570A00] num:den syntax is deprecated, please use num/den or named options instead
File 'output.mkv' already exists. Overwrite ? [y/N] y

Debug info

I used my cross-compiling mingw toolchain to build gdb. With it I get the following backtrace at the time of the segmentation fault:

% C:\WORK\ffmpeg\ffmpeg-git-x86_64-shared-install\bin>C:\WORK\ffmpeg\gdb.exe --args ffmpeg -i ..\..\Introduction_a.mkv -i ..\..\lesson.wmv -filter_complex "[1:1] setsar=1:1 [lv]; [0:0] [0:1] [lv] [1:0] concat=n=2:v=1:a=1 [a] [v]" -map "[a]" -map "[v]" -vcodec mpeg4 -q:v 7 output.mkv
GNU gdb (GDB) 7.5
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-static-mingw32".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from C:\WORK\ffmpeg\ffmpeg-git-x86_64-shared-install\bin\ffmpeg.exe...done.
(gdb) run
Starting program: C:\WORK\ffmpeg\ffmpeg-git-x86_64-shared-install\bin\ffmpeg.exe -i ..\..\Introduction_a.mkv -i ..\..\lesson.wmv -filter_complex "[1:1] setsar=1:1 [lv]; [0:0] [0:1] [lv] [1:0] concat=n=2:v=1:a=1 [a] [v]" -map "[a]" -map "[v]" -vcodec mpeg4 -q:v 7 output.mkv
[New Thread 10656.0x2b18]
ffmpeg version N-45928-g8b03cd3 Copyright (c) 2000-2012 the FFmpeg developers
  built on Oct 24 2012 12:18:03 with gcc 4.7.2 (GCC)
  configuration: --cross-prefix=/home/swingcatalyst/mxe/build-ffmpeg/../../mxe/mxe-multitarget-static/usr/bin/x86_64-static-mingw32- --enable-cross-compile --arch=x86_64 --target-os=mingw32 --prefix=/home/swingcatalyst/mxe/build-ffmpeg/install/ffmpeg-git-x86_64-shared-install --disable-static --enable-shared --disable-postproc --disable-pthreads --enable-runtime-cpudetect --enable-bzlib --enable-libfreetype --enable-libmp3lame --enable-libspeex --enable-libtheora --enable-libvorbis --enable-libvpx --enable-zlib --disable-stripping
  libavutil      52.  0.100 / 52.  0.100
  libavcodec     54. 68.100 / 54. 68.100
  libavformat    54. 34.100 / 54. 34.100
  libavdevice    54.  3.100 / 54.  3.100
  libavfilter     3. 20.105 /  3. 20.105
  libswscale      2.  1.101 /  2.  1.101
  libswresample   0. 16.100 /  0. 16.100
Input #0, matroska,webm, from '..\..\Introduction_a.mkv':
  Metadata:
    ENCODER         : Lavf54.29.104
  Duration: 00:00:06.29, start: 0.000000, bitrate: 2678 kb/s
    Stream #0:0: Video: mpeg4 (Simple Profile), yuv420p, 1920x1200 [SAR 1:1 DAR 8:5], 24 fps, 24 tbr, 1k tbn, 24 tbc (default)
    Stream #0:1: Audio: mp3, 44100 Hz, mono, s16, 64 kb/s (default)
[wmv3 @ 00000000019DDD40] Extra data: 8 bits left, value: 20
Guessed Channel Layout for  Input Stream #1.0 : stereo
Input #1, asf, from '..\..\lesson.wmv':
  Metadata:
    WMFSDKVersion   : 12.0.9200.16384
    WMFSDKNeeded    : 0.0.0.0000
    IsVBR           : 0
  Duration: 00:00:24.03, start: 0.000000, bitrate: 1713 kb/s
    Stream #1:0(nor): Audio: wmav2 (a[1][0][0] / 0x0161), 44100 Hz, stereo, fltp, 96 kb/s
    Stream #1:1(nor): Video: wmv3 (Simple) (WMV3 / 0x33564D57), yuv420p, 1920x1200, 2000 kb/s, 7 tbr, 1k tbn, 1k tbc
[Parsed_setsar_0 @ 0000000003570A00] num:den syntax is deprecated, please use num/den or named options instead
File 'output.mkv' already exists. Overwrite ? [y/N] y
[New Thread 10656.0x24]
[New Thread 10656.0xc98]
[New Thread 10656.0x1c84]
[New Thread 10656.0x23a8]
[New Thread 10656.0x28c0]
[New Thread 10656.0x2604]
[New Thread 10656.0x2318]
[New Thread 10656.0x1a98]
[New Thread 10656.0x144c]
[New Thread 10656.0x21e8]
Extra data: 8 bits left, value: 20
Output #0, matroska, to 'output.mkv':
  Metadata:
    encoder         : Lavf54.34.100
    Stream #0:0: Video: mpeg4, yuv420p, 1920x1200 [SAR 1:1 DAR 8:5], q=2-31, 200 kb/s, 1k tbn, 24 tbc
    Stream #0:1: Audio: vorbis, 44100 Hz, mono, fltp
Stream mapping:
  Stream #0:0 (mpeg4) -> concat:in0:v0
  Stream #0:1 (mp3) -> concat:in0:a0
  Stream #1:0 (wmav2) -> concat:in1:a0
  Stream #1:1 (wmv3) -> setsar
  concat:out:v0 -> Stream #0:0 (mpeg4)
  concat:out:a0 -> Stream #0:1 (libvorbis)
Press [q] to stop, [?] for help
100 buffers queued in output stream 0:1, something may be wrong.te= 871.8kbits/s

Program received signal SIGSEGV, Segmentation fault.
0x00000000707cd0e8 in ff_mix_2_1_a_float_avx () from C:\WORK\ffmpeg\ffmpeg-git-x86_64-shared-install\bin\swresample-0.dll
(gdb) backtrace
#0  0x00000000707cd0e8 in ff_mix_2_1_a_float_avx () from C:\WORK\ffmpeg\ffmpeg-git-x86_64-shared-install\bin\swresample-0.dll
#1  0x00000000707c4147 in swri_rematrix (s=s@entry=0x3168b80, out=out@entry=0x23de00, in=in@entry=0x3168c30, len=len@entry=8192,

Change History (10)

comment:1 in reply to: ↑ description ; follow-up: ↓ 2 Changed 4 years ago by cehoyos

Replying to thegeek:

I can not reproduce it with e.g. Zeranoe's win32 builds.

Does it also crash with Zeranoe's 64bit builds?

Does it crash if you compile for 32bit ("--cc='gcc -m32'") with your tool-chain?

Does it also crash without --enable-shared?

comment:2 in reply to: ↑ 1 Changed 4 years ago by thegeek

Replying to cehoyos:

Replying to thegeek:

I can not reproduce it with e.g. Zeranoe's win32 builds.

Does it also crash with Zeranoe's 64bit builds?

Sorry, I meant his x64 builds, so no, it does not crash with them. Here is the output I get with his binaries:

....
....
....
Stream mapping:
  Stream #0:0 (mpeg4) -> concat:in0:v0
  Stream #0:1 (mp3) -> aresample
  Stream #1:0 (wmav2) -> aresample
  Stream #1:1 (wmv3) -> setsar
  concat:out:v0 -> Stream #0:0 (mpeg4)
  concat:out:a0 -> Stream #0:1 (libvorbis)
Press [q] to stop, [?] for help
100 buffers queued in output stream 0:1, something may be wrong.te=1207.7kbits/s
[libvorbis @ 00000000003fdce0] Que input is backward in time
[matroska @ 00000000003fc5a0] st:0 PTS: 5354 DTS: 5354 < 5378 invalid, clipping
[libvorbis @ 00000000003fdce0] Que input is backward in time
    Last message repeated 5 times
Que input is backward in timee=    2135kB time=00:00:11.54 bitrate=1515.4kbits/s
[libvorbis @ 00000000003fdce0] Que input is backward in time
    Last message repeated 8 times
Que input is backward in timee=    3928kB time=00:00:18.54 bitrate=1735.4kbits/s
[libvorbis @ 00000000003fdce0] Que input is backward in time
    Last message repeated 7 times
[matroska @ 00000000003fc5a0] st:0 PTS: 23317 DTS: 23317 < 23322 invalid, clipping
[libvorbis @ 00000000003fdce0] Que input is backward in time
    Last message repeated 4 times
Que input is backward in timee=    6253kB time=00:00:25.44 bitrate=2012.9kbits/s
[libvorbis @ 00000000003fdce0] Que input is backward in time
    Last message repeated 4 times
frame=  290 fps=126 q=7.0 Lsize=    7170kB time=00:00:29.24 bitrate=2008.2kbits/s
video:6955kB audio:196kB subtitle:0 global headers:3kB muxing overhead 0.217808%

Does it crash if you compile for 32bit ("--cc='gcc -m32'") with your tool-chain?

No (I did this by changing the arch and cross-prefix).

Does it also crash without --enable-shared?

Yes (I also removed --disable-static).

Version 0, edited 4 years ago by thegeek

comment:3 follow-up: ↓ 4 Changed 4 years ago by cehoyos

Please add the missing parts of the gdb session (disassembly, register dump), see https://ffmpeg.org/bugreports.html - consider using a static build (although it does not really matter).

Do you know if the Zeranoe build actually contains the AVX-optimized function ff_mix_2_1_a_float_avx?

comment:4 in reply to: ↑ 3 ; follow-up: ↓ 5 Changed 4 years ago by thegeek

Replying to cehoyos:

Please add the missing parts of the gdb session (disassembly, register dump), see https://ffmpeg.org/bugreports.html - consider using a static build (although it does not really matter).

Ok, this is the static build with slightly different build settings, which seems to produce the crash in the sse2 version of the function:

  configuration: --cross-prefix=/home/swingcatalyst/mxe/build-ffmpeg/../../mxe/mxe-multitarget-static/usr/bin/x86_64-static-mingw32- --enable-cross-compile --arch=x86_64 --target-os=mingw32 --prefix=/home/swingcatalyst/mxe/build-ffmpeg/install/ffmpeg-1.0-x86_64-static-install --disable-shared --disable-postproc --disable-pthreads --enable-runtime-cpudetect --enable-bzlib --enable-libfreetype --enable-libmp3lame --enable-libspeex --enable-libtheora --enable-libvorbis --enable-libvpx --enable-zlib --disable-stripping
(gdb) bt
#0  0x0000000000add1ec in ff_mix_2_1_a_int16_sse2 ()
#1  0x00000000085ef0a0 in ?? ()
#2  0x0000000000000000 in ?? ()
(gdb) disass $pc-32,$pc+32
Dump of assembler code from 0xadd1cc to 0xadd20c:
   0x0000000000add1cc <ff_mix_2_1_a_int16_sse2+44>:     test   $0xf,%r8
   0x0000000000add1d3 <ff_mix_2_1_a_int16_sse2+51>:     jne    0xadd0bf <mix_2_1_int16_u_int_sse2>
   0x0000000000add1d9 <ff_mix_2_1_a_int16_sse2+57>:     test   $0xf,%rcx
   0x0000000000add1e0 <ff_mix_2_1_a_int16_sse2+64>:     jne    0xadd0bf <mix_2_1_int16_u_int_sse2>
   0x0000000000add1e6 <ff_mix_2_1_a_int16_sse2+70>:     movd   (%r9,%r10,4),%xmm4
=> 0x0000000000add1ec <ff_mix_2_1_a_int16_sse2+76>:     movd   (%r9,%r11,4),%xmm6
   0x0000000000add1f2 <ff_mix_2_1_a_int16_sse2+82>:     pshuflw $0x0,%xmm4,%xmm5
   0x0000000000add1f7 <ff_mix_2_1_a_int16_sse2+87>:     punpcklqdq %xmm5,%xmm5
   0x0000000000add1fb <ff_mix_2_1_a_int16_sse2+91>:     pshuflw $0x0,%xmm6,%xmm6
   0x0000000000add200 <ff_mix_2_1_a_int16_sse2+96>:     punpcklqdq %xmm6,%xmm6
   0x0000000000add204 <ff_mix_2_1_a_int16_sse2+100>:    psllq  $0x20,%xmm4
   0x0000000000add209 <ff_mix_2_1_a_int16_sse2+105>:    psrlq  $0x30,%xmm4
End of assembler dump.
(gdb) info all-registers
rax            0x2000   8192
rbx            0x5b267a0        95578016
rcx            0xc2dc0a0        204325024
rdx            0xc2bbfe0        204193760
rsi            0x5b26970        95578480
rdi            0x5b267a0        95578016
rbp            0x0      0x0
rsp            0x23dc98 0x23dc98
r8             0xc2c3fe0        204226528
r9             0x7604f80        123752320
r10            0x0      0
r11            0xea6400000001   257715217629185
r12            0x23de10 2350608
r13            0x1      1
r14            0x0      0
r15            0x1      1
rip            0xadd1ec 0xadd1ec <ff_mix_2_1_a_int16_sse2+76>
eflags         0x10246  [ PF ZF IF RF ]
cs             0x33     51
ss             0x2b     43
ds             0x0      0
es             0x0      0
fs             0x0      0
gs             0x0      0
st0            -nan(0xbebebebebebebebe) (raw 0xffffbebebebebebebebe)
st1            -nan(0xbebebebebebebebe) (raw 0xffffbebebebebebebebe)
st2            -nan(0x31b7323331362f33) (raw 0xffff31b7323331362f33)
st3            -nan(0x3100b700320033)   (raw 0xffff003100b700320033)
st4            9        (raw 0x40029000000000000000)
st5            1        (raw 0x3fff8000000000000000)
st6            2818.3829312644548       (raw 0x400ab026207c88973351)
st7            3.4500000000000002       (raw 0x4000dcccccccccccd000)
fctrl          0x420037f        69206911
fstat          0x420    1056
ftag           0x0      0
fiseg          0x33     51
fioff          0xd252a0 13783712
foseg          0x2b     43
fooff          0x23e0a0 2351264
fop            0x0      0
xmm0           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x8000000000000000, 0x8000000000000000}, v16_int8 = {0xea, 0xff, 0xd5, 0xff, 0xd3, 0xff, 0xdd, 0xff, 0xd5, 0xff,
    0xce, 0xff, 0xd8, 0xff, 0xdf, 0xff}, v8_int16 = {0xffea, 0xffd5, 0xffd3, 0xffdd, 0xffd5, 0xffce, 0xffd8, 0xffdf}, v4_int32 = {0xffd5ffea, 0xffddffd3, 0xffceffd5,
    0xffdfffd8}, v2_int64 = {0xffddffd3ffd5ffea, 0xffdfffd8ffceffd5}, uint128 = 0xffdfffd8ffceffd5ffddffd3ffd5ffea}
xmm1           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x8000000000000000, 0x8000000000000000}, v16_int8 = {0xea, 0xff, 0xd5, 0xff, 0xd3, 0xff, 0xdd, 0xff, 0xd5, 0xff,
    0xce, 0xff, 0xd8, 0xff, 0xdf, 0xff}, v8_int16 = {0xffea, 0xffd5, 0xffd3, 0xffdd, 0xffd5, 0xffce, 0xffd8, 0xffdf}, v4_int32 = {0xffd5ffea, 0xffddffd3, 0xffceffd5,
    0xffdfffd8}, v2_int64 = {0xffddffd3ffd5ffea, 0xffdfffd8ffceffd5}, uint128 = 0xffdfffd8ffceffd5ffddffd3ffd5ffea}
xmm2           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x8000000000000000, 0x8000000000000000}, v16_int8 = {0xd5, 0xff, 0xce, 0xff, 0xd8, 0xff, 0xdf, 0xff, 0xd5, 0xff,
    0xce, 0xff, 0xd8, 0xff, 0xdf, 0xff}, v8_int16 = {0xffd5, 0xffce, 0xffd8, 0xffdf, 0xffd5, 0xffce, 0xffd8, 0xffdf}, v4_int32 = {0xffceffd5, 0xffdfffd8, 0xffceffd5,
    0xffdfffd8}, v2_int64 = {0xffdfffd8ffceffd5, 0xffdfffd8ffceffd5}, uint128 = 0xffdfffd8ffceffd5ffdfffd8ffceffd5}
xmm3           {v4_float = {0xffffffaf, 0xffffffcd, 0xffffffdb, 0xfffffff3}, v2_double = {0xffffffc2c6667abb, 0xffffffffff8726c5}, v16_int8 = {0xf6, 0x8d, 0xa2, 0xc2, 0xcc,
    0x9c, 0x4e, 0xc2, 0x3f, 0x3b, 0x15, 0xc2, 0x4e, 0x36, 0x5e, 0xc1}, v8_int16 = {0x8df6, 0xc2a2, 0x9ccc, 0xc24e, 0x3b3f, 0xc215, 0x364e, 0xc15e}, v4_int32 = {0xc2a28df6,
    0xc24e9ccc, 0xc2153b3f, 0xc15e364e}, v2_int64 = {0xc24e9cccc2a28df6, 0xc15e364ec2153b3f}, uint128 = 0xc15e364ec2153b3fc24e9cccc2a28df6}
xmm4           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0, 0x40, 0xf, 0x0 <repeats 13 times>}, v8_int16 = {0x4000, 0xf, 0x0, 0x0, 0x0, 0x0, 0x0,
    0x0}, v4_int32 = {0xf4000, 0x0, 0x0, 0x0}, v2_int64 = {0xf4000, 0x0}, uint128 = 0x000000000000000000000000000f4000}
xmm5           {v4_float = {0xffffffd0, 0xffffffdb, 0xffffffe9, 0xfffffff3}, v2_double = {0xfffffffab1300f70, 0xffffffffff8726c5}, v16_int8 = {0x85, 0x27, 0x41, 0xc2, 0x3f,
    0x3b, 0x15, 0xc2, 0x45, 0x95, 0xb9, 0xc1, 0x4e, 0x36, 0x5e, 0xc1}, v8_int16 = {0x2785, 0xc241, 0x3b3f, 0xc215, 0x9545, 0xc1b9, 0x364e, 0xc15e}, v4_int32 = {0xc2412785,
    0xc2153b3f, 0xc1b99545, 0xc15e364e}, v2_int64 = {0xc2153b3fc2412785, 0xc15e364ec1b99545}, uint128 = 0xc15e364ec1b99545c2153b3fc2412785}
xmm6           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {
    0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
xmm7           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {
    0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
xmm8           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0, 0x0, 0x0, 0x80, 0x0 <repeats 12 times>}, v8_int16 = {0x0, 0x8000, 0x0, 0x0, 0x0, 0x0,
    0x0, 0x0}, v4_int32 = {0x80000000, 0x0, 0x0, 0x0}, v2_int64 = {0x80000000, 0x0}, uint128 = 0x00000000000000000000000080000000}
xmm9           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {
    0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
xmm10          {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {
    0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
xmm11          {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {
    0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
xmm12          {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {
    0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
xmm13          {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {
    0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
xmm14          {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {
    0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
xmm15          {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {
    0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
mxcsr          0x1fa0   [ PE IM DM ZM OM UM PM ]
(gdb)
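
One way to read the dump above, offered as a rough sketch rather than a confirmed diagnosis: the faulting instruction is movd (%r9,%r11,4),%xmm6, so the address it loads from is r9 + r11*4. The Python snippet below plugs in the register values from the dump. The helper names and the canonical-address check (assuming 48-bit virtual addresses) are illustrative assumptions and are not part of FFmpeg or gdb. It compares the address formed from the full 64-bit r11 with the address formed from only its low 32 bits.

```python
# Register values copied from the "info all-registers" output above.
R9 = 0x7604F80
R11 = 0xEA6400000001
SCALE = 4  # from the addressing mode (%r9,%r11,4)

MASK64 = (1 << 64) - 1


def effective_address(base: int, index: int, scale: int) -> int:
    """Address formed by a base + index*scale operand, wrapped to 64 bits."""
    return (base + index * scale) & MASK64


def is_canonical(addr: int) -> bool:
    """True if bits 63..47 are all equal (assumes 48-bit virtual addresses)."""
    top = addr >> 47
    return top == 0 or top == (1 << 17) - 1


def low32(value: int) -> int:
    """Keep only the low 32 bits, as a 32-bit int argument would occupy."""
    return value & 0xFFFFFFFF


full = effective_address(R9, R11, SCALE)
truncated = effective_address(R9, low32(R11), SCALE)

print(f"r11 low 32 bits      : {low32(R11):#x}")
print(f"address with full r11: {full:#x} canonical={is_canonical(full)}")
print(f"address with low32   : {truncated:#x} canonical={is_canonical(truncated)}")
```

With the full r11 the computed address is not canonical, which would be enough to fault. With only the low 32 bits the index is 1 and the address lands 4 bytes past r9. That pattern would be consistent with a 32-bit index whose upper register half holds stale data, but the dump alone does not prove it.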

Do you know if the Zeranoe build actually contains the AVX-optimized function ff_mix_2_1_a_float_avx?

I just checked and it does (both sse and avx versions).

I'm currently investigating something curious I've discovered. It seems my cross toolchain compiles libvpx with pthreads, but I use w32threads in ffmpeg. Will using both thread libraries be OK?
I also noticed that Zeranoe builds ffmpeg with pthreads, is this now a "good" solution?

Last edited 4 years ago by thegeek

comment:5 in reply to: ↑ 4 ; follow-up: ↓ 6 Changed 4 years ago by cehoyos

Replying to thegeek:

I'm currently investigating something curious I've discovered. It seems my cross toolchain compiles libvpx with pthreads, but I use w32threads in ffmpeg. Will using both thread libraries be OK?

You could test compiling FFmpeg without libvpx.

I also noticed that Zeranoe builds ffmpeg with pthreads, is this now a "good" solution?

I don't know, this is ticket #1509

comment:6 in reply to: ↑ 5 Changed 4 years ago by thegeek

Replying to cehoyos:

You could test compiling FFmpeg without libvpx.

Tested, no difference.

I also noticed that Zeranoe builds ffmpeg with pthreads, is this now a "good" solution?

I don't know, this is ticket #1509

Ok.

I've compiled both ffmpeg and the entire cross-toolchain many, many times today, with no real progress :/
Since Zeranoe's builds seem to work, it should surely be possible to get this working.

For a while I thought I had something, but it turns out I was just misconfiguring the build by not setting arch correctly. Does an incorrect arch disable asm? If so, that would explain why those builds work. (When arch is set correctly, it still crashes.)

comment:7 Changed 4 years ago by thegeek

  • Summary changed from Crash when trying to join two video files to Audio resampling asm crashes on windows

I've renamed the ticket because I have a different and simpler repro case
(thanks to lu_zero on the #libav-devel channel for the suggestion):

% ffmpeg -i "stereo audio file" -ac 1 out.mp3

Disabling asm optimizations always avoids the crash, whether by rebuilding with --disable-asm or by passing -cpuflags 0 at run time.

Another command that also triggers it (from #ffmpeg-devel):
@ubitux: ffmpeg -f lavfi -i "aevalsrc=0:0,aformat=sample_fmts=s16" -t 5 -ac 1 -f null -
@ubitux: does this trigger it?
thegeek: yes
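
For comparing runs with and without the asm code paths, the following Python sketch builds the argument list for the lavfi command quoted above and optionally adds -cpuflags 0. The function name, the assumption that an executable called ffmpeg is on PATH, and the choice to place -cpuflags ahead of the input options are illustrative assumptions. The script only prints the two command lines and leaves running them to the reader.

```python
import shlex


def repro_command(disable_asm: bool, ffmpeg: str = "ffmpeg") -> list:
    """Build the argv for the lavfi stereo-to-mono repro from this ticket."""
    cmd = [ffmpeg]
    if disable_asm:
        # Run-time switch mentioned in this comment; placement here is an assumption.
        cmd += ["-cpuflags", "0"]
    cmd += [
        "-f", "lavfi",
        "-i", "aevalsrc=0:0,aformat=sample_fmts=s16",
        "-t", "5",
        "-ac", "1",
        "-f", "null", "-",
    ]
    return cmd


if __name__ == "__main__":
    # Print both variants so they can be pasted into a shell and compared.
    for disable_asm in (False, True):
        print(shlex.join(repro_command(disable_asm)))
```

Keeping the two variants identical apart from the -cpuflags pair makes it easier to attribute any difference in behavior to the optimized code paths.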

Here is another gdb report:

C:\WORK\ffmpeg\x86_64-static-mingw32\bin>..\..\gdb --args ffmpeg -i "..\..\01 Track 1.mp3" -ac 1 out.mp3
GNU gdb (GDB) 7.5
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-static-mingw32".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from C:\WORK\ffmpeg\x86_64-static-mingw32\bin\ffmpeg.exe...done.
(gdb) r
Starting program: C:\WORK\ffmpeg\x86_64-static-mingw32\bin\ffmpeg.exe -i "..\..\01 Track 1.mp3" -ac 1 out.mp3
[New Thread 10972.0x1768]
ffmpeg version 1.0 Copyright (c) 2000-2012 the FFmpeg developers
  built on Oct 26 2012 16:20:12 with gcc 4.7.2 (GCC)
  configuration: --cross-prefix=x86_64-static-mingw32- --enable-cross-compile --arch=x86_64 --target-os=mingw32 --prefix=/home/swingcatalyst/crossdev/mxe-github/usr/x86_64-static-mingw32 --enable-static --disable-shared --enable-debug --disable-stripping --disable-doc --enable-memalign-hack --enable-gpl --enable-version3 --disable-nonfree --enable-postproc --disable-pthreads --enable-w32threads --enable-avisynth --enable-libspeex --enable-libtheora --enable-libvorbis --enable-libmp3lame --disable-libxvid --disable-libfaac --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libx264 --enable-libvpx
  libavutil      51. 73.101 / 51. 73.101
  libavcodec     54. 59.100 / 54. 59.100
  libavformat    54. 29.104 / 54. 29.104
  libavdevice    54.  2.101 / 54.  2.101
  libavfilter     3. 17.100 /  3. 17.100
  libswscale      2.  1.101 /  2.  1.101
  libswresample   0. 15.100 /  0. 15.100
  libpostproc    52.  0.100 / 52.  0.100
[mp3 @ 00000000078C10C0] max_analyze_duration 5000000 reached at 5015510
[mp3 @ 00000000078C10C0] Estimating duration from bitrate, this may be inaccurate
Input #0, mp3, from '..\..\01 Track 1.mp3':
  Metadata:
    track           : 1
    artist          : 1001
    PLAY_STAMP      : 2009-08-21 14:32:18
    PLAY_COUNTER    : 2
    LAST_PLAYED     : 2009-08-21 18:33:47
    FIRST_PLAYED    : 2009-08-21 14:32:18
    album           : Unknown Album (8/17/2009 7:58:59 AM)
    title           : Track 1
    genre           : Unknown Genre
    TLEN            : 81533
    replaygain_album_gain: -6.58 dB
    replaygain_album_peak: 1.021768
    replaygain_track_gain: -4.20 dB
    replaygain_track_peak: 0.931326
  Duration: 00:01:21.56, start: 0.000000, bitrate: 191 kb/s
    Stream #0:0: Audio: mp3, 44100 Hz, stereo, s16, 192 kb/s
Output #0, mp3, to 'out.mp3':
  Metadata:
    TRCK            : 1
    TPE1            : 1001
    PLAY_STAMP      : 2009-08-21 14:32:18
    PLAY_COUNTER    : 2
    LAST_PLAYED     : 2009-08-21 18:33:47
    FIRST_PLAYED    : 2009-08-21 14:32:18
    TALB            : Unknown Album (8/17/2009 7:58:59 AM)
    TIT2            : Track 1
    TCON            : Unknown Genre
    TLEN            : 81533
    replaygain_album_gain: -6.58 dB
    replaygain_album_peak: 1.021768
    replaygain_track_gain: -4.20 dB
    replaygain_track_peak: 0.931326
    TSSE            : Lavf54.29.104
    Stream #0:0: Audio: mp3, 44100 Hz, mono, s16
Stream mapping:
  Stream #0:0 -> #0:0 (mp3 -> libmp3lame)
Press [q] to stop, [?] for help

Program received signal SIGSEGV, Segmentation fault.
0x0000000000b3b33c in ff_mix_2_1_a_int16_sse2 ()
(gdb) backtrace
#0  0x0000000000b3b33c in ff_mix_2_1_a_int16_sse2 ()
#1  0x0000000000000004 in ?? ()
#2  0x0000000000000000 in ?? ()
(gdb) disass $pc-32,$pc+32
Dump of assembler code from 0xb3b31c to 0xb3b35c:
   0x0000000000b3b31c <ff_mix_2_1_a_int16_sse2+44>:     test   $0xf,%r8
   0x0000000000b3b323 <ff_mix_2_1_a_int16_sse2+51>:     jne    0xb3b20f <mix_2_1_int16_u_int_sse2>
   0x0000000000b3b329 <ff_mix_2_1_a_int16_sse2+57>:     test   $0xf,%rcx
   0x0000000000b3b330 <ff_mix_2_1_a_int16_sse2+64>:     jne    0xb3b20f <mix_2_1_int16_u_int_sse2>
   0x0000000000b3b336 <ff_mix_2_1_a_int16_sse2+70>:     movd   (%r9,%r10,4),%xmm4
=> 0x0000000000b3b33c <ff_mix_2_1_a_int16_sse2+76>:     movd   (%r9,%r11,4),%xmm6
   0x0000000000b3b342 <ff_mix_2_1_a_int16_sse2+82>:     pshuflw $0x0,%xmm4,%xmm5
   0x0000000000b3b347 <ff_mix_2_1_a_int16_sse2+87>:     punpcklqdq %xmm5,%xmm5
   0x0000000000b3b34b <ff_mix_2_1_a_int16_sse2+91>:     pshuflw $0x0,%xmm6,%xmm6
   0x0000000000b3b350 <ff_mix_2_1_a_int16_sse2+96>:     punpcklqdq %xmm6,%xmm6
   0x0000000000b3b354 <ff_mix_2_1_a_int16_sse2+100>:    psllq  $0x20,%xmm4
   0x0000000000b3b359 <ff_mix_2_1_a_int16_sse2+105>:    psrlq  $0x30,%xmm4
End of assembler dump.
(gdb) info all-registers
rax            0x480    1152
rbx            0x5eba6a0        99329696
rcx            0x79800c0        127402176
rdx            0x797b820        127383584
rsi            0x5eba870        99330160
rdi            0x5eba6a0        99329696
rbp            0x0      0x0
rsp            0x23dc28 0x23dc28
r8             0x797ca20        127388192
r9             0x78de2c0        126739136
r10            0x0      0
r11            0x7bee00000001   136262132432897
r12            0x23dda0 2350496
r13            0x1      1
r14            0x0      0
r15            0x1      1
rip            0xb3b33c 0xb3b33c <ff_mix_2_1_a_int16_sse2+76>
eflags         0x10246  [ PF ZF IF RF ]
cs             0x33     51
ss             0x2b     43
ds             0x0      0
es             0x0      0
fs             0x0      0
gs             0x0      0
st0            0        (raw 0x00000000000000000000)
st1            0        (raw 0x00000000000000000000)
st2            -0.73337958335896403     (raw 0xbffebbbec3ae14b30000)
st3            1        (raw 0x3fff8000000000000000)
st4            9        (raw 0x40029000000000000000)
st5            0        (raw 0x00000000000000000000)
st6            0        (raw 0x00000000000000000000)
st7            0.043619387365336069     (raw 0x3ffab2aa3e234a70d800)
fctrl          0x20037f 2098047
fstat          0x20     32
ftag           0x0      0
fiseg          0x33     51
fioff          0xe2166c 14816876
foseg          0x2b     43
fooff          0x23d718 2348824
fop            0x0      0
xmm0           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {
    0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
xmm1           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {
    0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
xmm2           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {
    0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
xmm3           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {
    0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
xmm4           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0, 0x40, 0xf, 0x0 <repeats 13 times>}, v8_int16 = {0x4000, 0xf, 0x0, 0x0, 0x0, 0x0, 0x0,
    0x0}, v4_int32 = {0xf4000, 0x0, 0x0, 0x0}, v2_int64 = {0xf4000, 0x0}, uint128 = 0x000000000000000000000000000f4000}
xmm5           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x33, 0x33, 0x73, 0x3f, 0x0 <repeats 12 times>}, v8_int16 = {0x3333, 0x3f73, 0x0, 0x0, 0x0,
    0x0, 0x0, 0x0}, v4_int32 = {0x3f733333, 0x0, 0x0, 0x0}, v2_int64 = {0x3f733333, 0x0}, uint128 = 0x0000000000000000000000003f733333}
xmm6           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {
    0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
xmm7           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {
    0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
xmm8           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0, 0x0, 0x0, 0x80, 0x0 <repeats 12 times>}, v8_int16 = {0x0, 0x8000, 0x0, 0x0, 0x0, 0x0,
    0x0, 0x0}, v4_int32 = {0x80000000, 0x0, 0x0, 0x0}, v2_int64 = {0x80000000, 0x0}, uint128 = 0x00000000000000000000000080000000}
xmm9           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {
    0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
xmm10          {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {
    0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
xmm11          {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {
    0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
xmm12          {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {
    0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
xmm13          {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {
    0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
xmm14          {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {
    0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
xmm15          {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {
    0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0}, uint128 = 0x00000000000000000000000000000000}
mxcsr          0x1fa0   [ PE IM DM ZM OM UM PM ]
(gdb)
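
The two test $0xf instructions visible in the disassembly check whether r8 and rcx are 16-byte aligned, and the jne after each one branches to mix_2_1_int16_u_int_sse2 otherwise. A small Python sketch, using the register values from this second dump, suggests both pointers pass that check, so execution stays on the aligned path where the crash occurs. The helper names are illustrative. Only the two tests inside the disassembled window are modeled, so any checks earlier in the function are not covered.

```python
# Pointer registers from the second "info all-registers" dump.
REGS = {"r8": 0x797CA20, "rcx": 0x79800C0}
ALIGN_MASK = 0xF  # immediate used by `test $0xf,%reg`


def is_aligned16(ptr: int) -> bool:
    """Mirror `test $0xf,%reg`: the zero flag is set when the low 4 bits are clear."""
    return (ptr & ALIGN_MASK) == 0


def takes_aligned_path(regs: dict) -> bool:
    """The jumps to the unaligned routine are skipped only if every pointer is aligned."""
    return all(is_aligned16(p) for p in regs.values())


for name, value in REGS.items():
    print(f"{name} = {value:#x} aligned16={is_aligned16(value)}")
print("aligned path taken:", takes_aligned_path(REGS))
```

Since neither instruction between the tests and the fault writes to r8 or rcx, the values in the dump are the ones the tests saw.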
Last edited 4 years ago by thegeek

comment:8 follow-up: ↓ 9 Changed 4 years ago by michael

possibly fixed, please test

comment:9 in reply to: ↑ 8 Changed 4 years ago by thegeek

Replying to michael:

possibly fixed, please test

I can confirm it is fixed (commit d23e8f53ad01fde6d0dd96644c2a594f8dd7537e).
I tested both the latest git and a backport of the fix to 1.0.
Thanks :)

comment:10 Changed 4 years ago by cehoyos

  • Component changed from undetermined to swresample
  • Keywords win64 crash SIGSEGV added
  • Resolution set to fixed
  • Status changed from new to closed

Thank you for the report!
