Opened 4 years ago

Closed 4 years ago

Last modified 4 years ago

#4384 closed defect (fixed)

H.264 dxva2 hardware decoding fails as of 22 Mar 2015

Reported by: JohnWarburton Owned by:
Priority: important Component: avcodec
Version: git-master Keywords: H264 DXVA2 regression
Cc: Blocked By:
Blocking: Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:

Platform: Windows 8.1. Intel G45 graphics chip reported as "Mobile Intel(R) 4 Series Express Chipset Family v2

Until yesterday, on my nightly compiles, DXVA2 hardware-accelerated decoding of H264 worked perfectly. It now fails. I see a window full of semi-random green blocks on opengl and SDL output devices. Hardware decoding worked reliably and quickly until 22 March 2015.

The error messages repeat this:

[h264 @ 000000000516fd60] Failed to execute: 0x80070057
[h264 @ 000000000516fd60] hardware accelerator failed to decode picture

How to reproduce:

> ffmpeg -v debug -hwaccel auto -i http://wpc.c1a9.edgecastcdn.net/hls-live/20C1A9/bbc_world/ls_satlink/b_828.m3u8 -f opengl OUTPUT

ffmpeg version N-71042-g83020f8
built on 22nd March 2015 from Git source

Although this command line uses an H.264 stream (BBC World News), the effect is the same with a local H.264 file.

I get this output when using OpenGL. The SDL driver gives similar output.

ffmpeg version N-71042-g83020f8-COMPILED_BY_JohnWarburton Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 4.9.2 (GCC)
  configuration: --arch=x86_64 --target-os=mingw32 --cross-prefix=/home/John/source/ffmpegbuild/sandbox/mingw-w64-x86_64/bin/x86_64-w64-mingw32- --pkg-config=pkg-config --disable-doc --enable-gpl --enable-libx264 --enable-avisynth --enable-libxvid --enable-libmp3lame --enable-version3 --enable-zlib --enable-librtmp --enable-libvorbis --enable-libtheora --enable-libspeex --enable-libopenjpeg --enable-gnutls --enable-libgsm --enable-libfreetype --enable-libopus --disable-w32threads --enable-frei0r --enable-filter=frei0r --enable-libvo-aacenc --enable-bzlib --enable-libxavs --extra-cflags=-DPTW32_STATIC_LIB --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libvo-amrwbenc --enable-libschroedinger --enable-libvpx --enable-libilbc --enable-libbs2b --prefix=/home/John/source/ffmpegbuild/sandbox/mingw-w64-x86_64/x86_64-w64-mingw32 --enable-static --disable-shared --enable-libsoxr --enable-fontconfig --enable-libass --enable-libutvideo --enable-libbluray --enable-iconv --enable-libtwolame --extra-cflags=-DLIBTWOLAME_STATIC --enable-libzvbi --enable-libcaca --enable-libmodplug --extra-libs=-lstdc++ --enable-opengl --extra-libs=-lpng --enable-libvidstab --enable-libx265 --enable-decklink --extra-libs=-loleaut32 --enable-libcdio --enable-libbluray --extra-cflags= --extra-version=COMPILED_BY_JohnWarburton --extra-cflags= --enable-nonfree --enable-libfdk-aac --disable-libfaac --disable-decoder=aac --enable-runtime-cpudetect
  libavutil      54. 20.100 / 54. 20.100
  libavcodec     56. 29.100 / 56. 29.100
  libavformat    56. 26.101 / 56. 26.101
  libavdevice    56.  4.100 / 56.  4.100
  libavfilter     5. 13.101 /  5. 13.101
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  1.100 /  1.  1.100
  libpostproc    53.  3.100 / 53.  3.100
Splitting the commandline.
Reading option '-v' ... matched as option 'v' (set logging level) with argument 'debug'.
Reading option '-hwaccel' ... matched as option 'hwaccel' (use HW accelerated decoding) with argument 'auto'.
Reading option '-i' ... matched as input file with argument 'http://wpc.c1a9.edgecastcdn.net/hls-live/20C1A9/bbc_world/ls_satlink/b_828.m3u8'.
Reading option '-f' ... matched as option 'f' (force format) with argument 'opengl'.
Reading option 'OUTPUT' ... matched as output file.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option v (set logging level) with argument debug.
Successfully parsed a group of options.
Parsing a group of options: input file http://wpc.c1a9.edgecastcdn.net/hls-live/20C1A9/bbc_world/ls_satlink/b_828.m3u8.
Applying option hwaccel (use HW accelerated decoding) with argument auto.
Successfully parsed a group of options.
Opening an input file: http://wpc.c1a9.edgecastcdn.net/hls-live/20C1A9/bbc_world/ls_satlink/b_828.m3u8.

Then, after the stream is parsed, this happens:

Input #0, hls,applehttp, from 'http://wpc.c1a9.edgecastcdn.net/hls-live/20C1A9/bbc_world/ls_satlink/b_828.m3u8':
  Duration: N/A, start: 51932.388044, bitrate: N/A
  Program 0
    Metadata:
      variant_bitrate : 0
    Stream #0:0, 21, 1/90000: Video: h264 (Constrained Baseline), 2 reference frames ([27][0][0][0] / 0x001B), yuv420p(tv, bt470bg, left), 720x404 (720x416) [SAR 404:405 DAR 16:9], 1/50, 25 fps, 25 tbr, 90k tbn, 50 tbc
    Stream #0:1, 39, 1/90000: Audio: aac ([15][0][0][0] / 0x000F), 44100 Hz, stereo, s16, 134 kb/s
Successfully opened the file.
Parsing a group of options: output file OUTPUT.
Applying option f (force format) with argument opengl.
Successfully parsed a group of options.
Opening an output file: OUTPUT.
Successfully opened the file.
detected 2 logical cores
[graph 0 input from stream 0:0 @ 00000000044fe620] Setting 'video_size' to value '720x404'
[graph 0 input from stream 0:0 @ 00000000044fe620] Setting 'pix_fmt' to value '0'
[graph 0 input from stream 0:0 @ 00000000044fe620] Setting 'time_base' to value '1/90000'
[graph 0 input from stream 0:0 @ 00000000044fe620] Setting 'pixel_aspect' to value '404/405'
[graph 0 input from stream 0:0 @ 00000000044fe620] Setting 'sws_param' to value 'flags=2'
[graph 0 input from stream 0:0 @ 00000000044fe620] Setting 'frame_rate' to value '25/1'
[graph 0 input from stream 0:0 @ 00000000044fe620] w:720 h:404 pixfmt:yuv420p tb:1/90000 fr:25/1 sar:404/405 sws_param:flags=2
[graph 0 input from stream 0:0 @ 00000000050a8840] Setting 'video_size' to value '720x404'
[graph 0 input from stream 0:0 @ 00000000050a8840] Setting 'pix_fmt' to value '25'
[graph 0 input from stream 0:0 @ 00000000050a8840] Setting 'time_base' to value '1/90000'
[graph 0 input from stream 0:0 @ 00000000050a8840] Setting 'pixel_aspect' to value '404/405'
[graph 0 input from stream 0:0 @ 00000000050a8840] Setting 'sws_param' to value 'flags=2'
[graph 0 input from stream 0:0 @ 00000000050a8840] Setting 'frame_rate' to value '25/1'
[graph 0 input from stream 0:0 @ 00000000050a8840] w:720 h:404 pixfmt:nv12 tb:1/90000 fr:25/1 sar:404/405 sws_param:flags=2
[scaler for output stream 0:0 @ 00000000050a86c0] Setting 'w' to value '720'
[scaler for output stream 0:0 @ 00000000050a86c0] Setting 'h' to value '404'
[scaler for output stream 0:0 @ 00000000050a86c0] Setting 'flags' to value '0x4'
[scaler for output stream 0:0 @ 00000000050a86c0] w:720 h:404 flags:'0x4' interl:0
[format @ 00000000050a8f00] compat: called with args=[yuv420p]
[format @ 00000000050a8f00] Setting 'pix_fmts' to value 'yuv420p'
[AVFilterGraph @ 00000000050ace00] query_formats: 5 queried, 4 merged, 0 already done, 0 delayed
[scaler for output stream 0:0 @ 00000000050a86c0] w:720 h:404 fmt:nv12 sar:404/405 -> w:720 h:404 fmt:yuv420p sar:404/405 flags:0x4
[AVFilterGraph @ 00000000050cf320] query_formats: 3 queried, 2 merged, 0 already done, 0 delayed
[opengl outdev @ 00000000045adea0] SDL driver: 'windib'.
[opengl outdev @ 00000000045adea0] OpenGL version: 2.1.0 - Build 8.15.10.2869
[opengl outdev @ 00000000045adea0] Non Power of 2 textures support: Yes
[opengl outdev @ 00000000045adea0] Unpack Subimage extension support: Yes
[opengl outdev @ 00000000045adea0] Max texture size: 4096x4096
[opengl outdev @ 00000000045adea0] Max viewport size: 4096x4096
Output #0, opengl, to 'OUTPUT':
  Metadata:
    encoder         : Lavf56.26.101
    Stream #0:0, 0, 1/25: Video: rawvideo, 1 reference frame (I420 / 0x30323449), yuv420p(left), 720x404 [SAR 404:405 DAR 16:9], 1/25, q=2-31, 200 kb/s, 25 fps, 25 tbn, 25 tbc
    Metadata:
      encoder         : Lavc56.29.100 rawvideo
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> rawvideo (native))


Press [q] to stop, [?] for help
[h264 @ 000000000516fd60] Frame num gap 2 255
[h264 @ 000000000516fd60] Frame num gap 2 0
[h264 @ 000000000516fd60] Failed to execute: 0x80070057
[h264 @ 000000000516fd60] hardware accelerator failed to decode picture
[h264 @ 00000000051b5be0] Failed to execute: 0x80070057
[h264 @ 00000000051b5be0] hardware accelerator failed to decode picture
[h264 @ 00000000051b5be0] Failed to execute: 0x8007000e

And those last two error messages repeat indefinitely.
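
For readers unfamiliar with the two codes in the log above, here is a minimal sketch that splits a Windows HRESULT into its standard bit fields. The helper name `decode_hresult` and the small `KNOWN` table are illustrative and not part of FFmpeg. Both values carry facility 7 (Win32); 0x80070057 is the generic E_INVALIDARG and 0x8007000e is E_OUTOFMEMORY. What triggers them inside the DXVA2 path is not established by this snippet.

```python
def decode_hresult(hr: int) -> dict:
    """Split a 32-bit HRESULT into its standard bit fields."""
    hr &= 0xFFFFFFFF  # treat the value as unsigned 32-bit
    return {
        "failure": bool(hr >> 31),       # bit 31: severity (1 = failure)
        "facility": (hr >> 16) & 0x7FF,  # bits 16-26: facility code
        "code": hr & 0xFFFF,             # bits 0-15: error code
    }


# Hypothetical lookup table holding only the two values seen in this log.
KNOWN = {
    0x80070057: "E_INVALIDARG",
    0x8007000E: "E_OUTOFMEMORY",
}

for value in (0x80070057, 0x8007000E):
    fields = decode_hresult(value)
    print(f"0x{value:08X} {KNOWN.get(value, 'unknown')} {fields}")
```

The low 16 bits of the two values are 0x57 (87, ERROR_INVALID_PARAMETER) and 0x0E (14, ERROR_OUTOFMEMORY), which is where the symbolic names come from.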

I first spotted this when using my nightly mpv build to play an online stream, and then traced the problem to something possibly within FFmpeg.

Attachments (1)

0001-dxva2_h264-fix-slice-offset-in-long-slice-structs.patch (1016 bytes) - added by heleppkes 4 years ago.


Change History (8)

comment:1 follow-up: Changed 4 years ago by heleppkes

Can you test if the attached patch fixes the problem?

comment:2 in reply to: ↑ 1 Changed 4 years ago by JohnWarburton

Thank you for replying. I am cleaning the tree, applying your patch, and rebuilding right now.

Replying to heleppkes:

Can you test if the attached patch fixes the problem?

comment:3 Changed 4 years ago by JohnWarburton

Using mpv linked to the FFmpeg libraries built using your patch and using hardware decoding (I have checked for this in mpv's stdout report), the problem is fixed on my test system.

But there is still a major problem using FFmpeg alone with hardware decoding. Instead of a constantly green window, I now see two or three flashes per second of the correct picture, but much of the output is still a picture of randomly changing blocks in shades of green. The error message is still "hardware accelerator failed to decode picture". So this is odd: mpv with the patched library works; FFmpeg (opengl and SDL output, I tested both) is slightly better, but still broken.

comment:4 Changed 4 years ago by heleppkes

You should disable multi-threaded decoding when using a hwaccel (i.e. -threads 1); the combination is broken.
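
As a rough illustration of that advice, the sketch below rebuilds the reproduction command from the description with `-threads 1` added. It assumes ffmpeg is launched from a script; the helper `build_ffmpeg_args` is hypothetical. The option is placed before `-i` on the assumption that it should apply to the decoder of that input rather than to the output.

```python
import shlex
from typing import List


def build_ffmpeg_args(source: str, threads: int = 1) -> List[str]:
    """Return an ffmpeg argument list with the decoder thread count set."""
    return [
        "ffmpeg",
        "-v", "debug",
        "-hwaccel", "auto",
        "-threads", str(threads),  # input option: single-threaded decoding
        "-i", source,
        "-f", "opengl",
        "OUTPUT",
    ]


args = build_ffmpeg_args(
    "http://wpc.c1a9.edgecastcdn.net/hls-live/20C1A9/bbc_world/ls_satlink/b_828.m3u8"
)
# Print a shell-quoted command line for inspection; nothing is executed here.
print(shlex.join(args))
```

Passing the list to `subprocess.run(args)` would start the process; the sketch only prints the command so it can be checked first.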

comment:5 Changed 4 years ago by JohnWarburton

Thank you. That was, indeed, the problem with using pure FFmpeg. Multithreaded CPU decoding using FFmpeg is much faster than DXVA2, but for mpv I'm going to carry on using GPU acceleration for ordinary file playback synced to sound.

comment:6 Changed 4 years ago by heleppkes

  • Resolution set to fixed
  • Status changed from new to closed

comment:7 Changed 4 years ago by cehoyos

  • Component changed from undetermined to avcodec
  • Keywords regression added; hwaccel removed
  • Priority changed from normal to important