Opened 7 years ago

Closed 2 years ago

#6073 closed defect (fixed)

QSV (h264) encodes via CPU instead of GPU

Reported by: Milan Cizek
Owned by:
Priority: important
Component: avcodec
Version: git-master
Keywords: qsv regression
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:
The h264_qsv encoder runs on the CPU instead of the GPU (i.e. without full HW acceleration).

The output contains "[h264_qsv @ 0x352d1e0] Encoder will work with partial HW acceleration".
A test via the Intel Media SDK passed OK.

The problem occurs in version N-83042-g107b306 (and some older ones).
In the older version N-82007-g1a9513b, however, QSV works fine (I see load in intel_gpu_top).

How to reproduce:

% ffmpeg -y -v debug -i "udp://@239.255.0.103:1234?fifo_size=1000000&overrun_nonfatal=1" -map v:0 -profile:v main -c:v h264_qsv -look_ahead 0 -an /a.mp4

Full output:

/# ffmpeg -y -v debug -i "udp://@239.255.0.103:1234?fifo_size=1000000&overrun_nonfatal=1" -map v:0 -profile:v main -c:v h264_qsv -look_ahead 0 -an /a.mp4
ffmpeg version N-83042-g107b306 Copyright (c) 2000-2017 the FFmpeg developers
  built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.4) 20160609
  configuration: --prefix=./ffmpeg-build --pkg-config-flags=--static --bindir=/root/bin --enable-gpl --enable-nonfree --enable-libfdk-aac --enable-libvorbis --enable-libmp3lame --enable-libx264 --enable-libx265 --enable-libvpx --enable-nvenc --enable-libmfx --enable-version3 --enable-pthreads --enable-runtime-cpudetect --disable-ffserver --enable-libfreetype --enable-filter=drawtext
  libavutil      55. 43.100 / 55. 43.100
  libavcodec     57. 71.101 / 57. 71.101
  libavformat    57. 62.100 / 57. 62.100
  libavdevice    57.  2.100 / 57.  2.100
  libavfilter     6. 68.100 /  6. 68.100
  libswscale      4.  3.101 /  4.  3.101
  libswresample   2.  4.100 /  2.  4.100
  libpostproc    54.  2.100 / 54.  2.100
Splitting the commandline.
Reading option '-y' ... matched as option 'y' (overwrite output files) with argument '1'.
Reading option '-v' ... matched as option 'v' (set logging level) with argument 'debug'.
Reading option '-i' ... matched as input url with argument 'udp://@239.255.0.103:1234?fifo_size=1000000&overrun_nonfatal=1'.
Reading option '-map' ... matched as option 'map' (set input stream mapping) with argument 'v:0'.
Reading option '-profile:v' ... matched as option 'profile' (set profile) with argument 'main'.
Reading option '-c:v' ... matched as option 'c' (codec name) with argument 'h264_qsv'.
Reading option '-look_ahead' ... matched as AVOption 'look_ahead' with argument '0'.
Reading option '-an' ... matched as option 'an' (disable audio) with argument '1'.
Reading option '/a.mp4' ... matched as output url.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option y (overwrite output files) with argument 1.
Applying option v (set logging level) with argument debug.
Successfully parsed a group of options.
Parsing a group of options: input url udp://@239.255.0.103:1234?fifo_size=1000000&overrun_nonfatal=1.
Successfully parsed a group of options.
Opening an input file: udp://@239.255.0.103:1234?fifo_size=1000000&overrun_nonfatal=1.
[udp @ 0x3338a80] No default whitelist set
[udp @ 0x3338a80] end receive buffer size reported is 131072
[mpegts @ 0x3338160] Format mpegts probed with size=2048 and score=50
[mpegts @ 0x3338160] stream=0 stream_type=2 pid=301 prog_reg_desc=
[mpegts @ 0x3338160] stream=1 stream_type=3 pid=311 prog_reg_desc=
[mpegts @ 0x3338160] stream=2 stream_type=3 pid=313 prog_reg_desc=
[mpegts @ 0x3338160] stream=3 stream_type=6 pid=321 prog_reg_desc=
[mpegts @ 0x3338160] Before avformat_find_stream_info() pos: 0 bytes read:123704 seeks:0 nb_streams:4
[mpegts @ 0x3338160] parser not found for codec dvb_teletext, packets or times may be invalid.
    Last message repeated 1 times
[mpeg2video @ 0x335d080] Invalid frame dimensions 0x0.
    Last message repeated 9 times
[mpegts @ 0x3338160] max_analyze_duration 5000000 reached at 5000000 microseconds st:3
[mpegts @ 0x3338160] After avformat_find_stream_info() pos: 1932640 bytes read:1933204 seeks:0 frames:663
Input #0, mpegts, from 'udp://@239.255.0.103:1234?fifo_size=1000000&overrun_nonfatal=1':
  Duration: N/A, start: 4999.876978, bitrate: N/A
  Program 259
    Stream #0:0[0x301], 128, 1/90000: Video: mpeg2video (Main), 1 reference frame ([2][0][0][0] / 0x0002), yuv420p(tv, top first, left), 720x576 [SAR 64:45 DAR 16:9], 0/1, 25 fps, 25 tbr, 90k tbn, 50 tbc
    Stream #0:1[0x311](cze), 208, 1/90000: Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz, stereo, s16p, 192 kb/s
    Stream #0:2[0x313](cze), 200, 1/90000: Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz, mono, s16p, 64 kb/s (visual impaired)
    Stream #0:3[0x321](cze), 127, 1/90000: Subtitle: dvb_teletext ([6][0][0][0] / 0x0006)
Successfully opened the file.
Parsing a group of options: output url /a.mp4.
Applying option map (set input stream mapping) with argument v:0.
Applying option profile:v (set profile) with argument main.
Applying option c:v (codec name) with argument h264_qsv.
Applying option an (disable audio) with argument 1.
Successfully parsed a group of options.
Opening an output file: /a.mp4.
[file @ 0x352dc20] Setting default whitelist 'file,crypto'
Successfully opened the file.
detected 8 logical cores
[graph 0 input from stream 0:0 @ 0x356e340] Setting 'video_size' to value '720x576'
[graph 0 input from stream 0:0 @ 0x356e340] Setting 'pix_fmt' to value '0'
[graph 0 input from stream 0:0 @ 0x356e340] Setting 'time_base' to value '1/90000'
[graph 0 input from stream 0:0 @ 0x356e340] Setting 'pixel_aspect' to value '64/45'
[graph 0 input from stream 0:0 @ 0x356e340] Setting 'sws_param' to value 'flags=2'
[graph 0 input from stream 0:0 @ 0x356e340] Setting 'frame_rate' to value '25/1'
[graph 0 input from stream 0:0 @ 0x356e340] w:720 h:576 pixfmt:yuv420p tb:1/90000 fr:25/1 sar:64/45 sws_param:flags=2
[format @ 0x3592d00] compat: called with args=[nv12|qsv]
[format @ 0x3592d00] Setting 'pix_fmts' to value 'nv12|qsv'
[auto-inserted scaler 0 @ 0x3596de0] Setting 'flags' to value 'bicubic'
[auto-inserted scaler 0 @ 0x3596de0] w:iw h:ih flags:'bicubic' interl:0
[format @ 0x3592d00] auto-inserting filter 'auto-inserted scaler 0' between the filter 'Parsed_null_0' and the filter 'format'
[AVFilterGraph @ 0x3591980] query_formats: 4 queried, 2 merged, 1 already done, 0 delayed
[auto-inserted scaler 0 @ 0x3596de0] w:720 h:576 fmt:yuv420p sar:64/45 -> w:720 h:576 fmt:nv12 sar:64/45 flags:0x4
[h264_qsv @ 0x352d1e0] Initialized an internal MFX session using hardware accelerated implementation
[h264_qsv @ 0x352d1e0] Using the average variable bitrate (AVBR) ratecontrol method
[h264_qsv @ 0x352d1e0] Encoder will work with partial HW acceleration
[h264_qsv @ 0x352d1e0] profile: main; level: 30
[h264_qsv @ 0x352d1e0] GopPicSize: 250; GopRefDist: 4; GopOptFlag: closed ; IdrInterval: 0
[h264_qsv @ 0x352d1e0] TargetUsage: 4; RateControlMethod: AVBR
[h264_qsv @ 0x352d1e0] TargetKbps: 1000; Accuracy: 0; Convergence: 0
[h264_qsv @ 0x352d1e0] NumSlice: 0; NumRefFrame: 3
[h264_qsv @ 0x352d1e0] RateDistortionOpt: OFF
[h264_qsv @ 0x352d1e0] RecoveryPointSEI: unknown IntRefType: 0; IntRefCycleSize: 0; IntRefQPDelta: 0
[h264_qsv @ 0x352d1e0] MaxFrameSize: 0; MaxSliceSize: 0;
[h264_qsv @ 0x352d1e0] BitrateLimit: unknown; MBBRC: unknown; ExtBRC: unknown
[h264_qsv @ 0x352d1e0] Trellis: auto
[h264_qsv @ 0x352d1e0] RepeatPPS: unknown; NumMbPerSlice: 0; LookAheadDS: unknown
[h264_qsv @ 0x352d1e0] AdaptiveI: unknown; AdaptiveB: unknown; BRefType: auto
[h264_qsv @ 0x352d1e0] MinQPI: 0; MaxQPI: 0; MinQPP: 0; MaxQPP: 0; MinQPB: 0; MaxQPB: 0
[h264_qsv @ 0x352d1e0] Entropy coding: CABAC; MaxDecFrameBuffering: 0
[h264_qsv @ 0x352d1e0] NalHrdConformance: unknown; SingleSeiNalUnit: ON; VuiVclHrdParameters: OFF VuiNalHrdParameters: OFF
Output #0, mp4, to '/a.mp4':
  Metadata:
    encoder         : Lavf57.62.100
    Stream #0:0, 0, 1/12800: Video: h264 (h264_qsv), 1 reference frame ([33][0][0][0] / 0x0021), nv12(left), 720x576 [SAR 64:45 DAR 16:9], 0/1, q=2-31, 1000 kb/s, 25 fps, 12800 tbn, 25 tbc
    Metadata:
      encoder         : Lavc57.71.101 h264_qsv
    Side data:
      cpb: bitrate max/min/avg: 0/0/1000000 buffer size: 0 vbv_delay: -1
Stream mapping:
  Stream #0:0 -> #0:0 (mpeg2video (native) -> h264 (h264_qsv))
Press [q] to stop, [?] for help
cur_dts is invalid (this is harmless if it occurs once at the start per stream)
    Last message repeated 3 times
[mpegts @ 0x3338160] Correcting start time by 656000
cur_dts is invalid (this is harmless if it occurs once at the start per stream)
    Last message repeated 65 times
*** 13 dup!
frame=  202 fps= 63 q=-0.0 Lsize=     931kB time=00:00:07.96 bitrate= 957.6kbits/s dup=13 drop=0 speed= 2.5x
video:927kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.387599%
Input file #0 (udp://@239.255.0.103:1234?fifo_size=1000000&overrun_nonfatal=1):
  Input stream #0:0 (video): 202 packets read (2361857 bytes); 190 frames decoded;
  Input stream #0:1 (audio): 208 packets read (120172 bytes);
  Input stream #0:2 (audio): 200 packets read (38750 bytes);
  Input stream #0:3 (subtitle): 128 packets read (113792 bytes);
  Total: 738 packets (2634571 bytes) demuxed
Output file #0 (/a.mp4):
  Output stream #0:0 (video): 202 frames encoded; 202 packets muxed (949177 bytes);
  Total: 202 packets (949177 bytes) muxed
190 frames successfully decoded, 0 decoding errors
[AVIOContext @ 0x3364800] Statistics: 34 seeks, 230 writeouts
[AVIOContext @ 0x3358fa0] Statistics: 3026800 bytes read, 0 seeks
Exiting normally, received signal 2.

Another user reported the same issue here:
http://ffmpeg.gusari.org/viewtopic.php?f=11&t=3236

Change History (4)

comment:1 by Carl Eugen Hoyos, 7 years ago

Keywords: qsv regression added
Priority: normal → important

Please find the change introducing the regression.
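
One possible way to do this, sketched here as an assumption rather than anything prescribed in the ticket, is `git bisect` between the two builds named in the description (1a9513b as good, 107b306 as bad). The helper below is hypothetical: the function name `qsv_verdict` and the log file name `run.log` are made up for illustration. It only maps an ffmpeg debug log to the exit codes that `git bisect run` expects, using the "partial HW acceleration" message from the report as the marker for a bad revision.

```shell
#!/bin/sh
# Hypothetical `git bisect run` helper; the function name and log path are
# illustrative and do not come from the ticket.

# Map one ffmpeg debug log to a git-bisect exit code:
#   0 = good, 1 = bad, 125 = skip (this revision cannot be tested).
qsv_verdict() {
    log="$1"
    # Missing or empty log: the build or the run failed, so skip the commit.
    [ -s "$log" ] || return 125
    # The message quoted in the report marks a regressed revision.
    if grep -q 'Encoder will work with partial HW acceleration' "$log"; then
        return 1
    fi
    return 0
}

# Intended use inside an FFmpeg checkout (needs QSV hardware, so left as comments):
#   git bisect start 107b306 1a9513b    # bad revision first, then good
#   ... build, run the command from the report with 2> run.log ...
#   qsv_verdict run.log; exit $?
```

Note that this sketch treats any non-empty log without the marker message as good, so a run where the encoder fails to initialize for a different reason would need an extra check.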

comment:2 by jkqxz, 7 years ago

Can you offer more detail about the platform you are using?

The command as-is doesn't work for me with current libmfx on Skylake because AVBR without look-ahead isn't supported there. Either re-adding the look-ahead or adding a maxrate both work (with AVBR+LA and VBR respectively), and have full hardware acceleration.
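
For reference, the two variants described above could look roughly like the following. This is only a sketch: it reuses the input URL and output path from the original report, the helper name `qsv_cmd` is hypothetical, and the 1500k maxrate is an arbitrary illustrative value that does not appear in the ticket. The script prints the command lines instead of running them, so they can be inspected before use.

```shell
#!/bin/sh
# Sketch only: prints the two command variants suggested in comment:2.
# INPUT and OUTPUT come from the original report; the 1500k maxrate is an
# arbitrary illustrative value, not from the ticket.
INPUT='udp://@239.255.0.103:1234?fifo_size=1000000&overrun_nonfatal=1'
OUTPUT=/a.mp4

# Print the ffmpeg command line for one variant ("la" or "vbr").
qsv_cmd() {
    case "$1" in
        la)  extra="-look_ahead 1" ;;                 # re-add the look-ahead
        vbr) extra="-look_ahead 0 -maxrate 1500k" ;;  # add a maxrate instead
        *)   echo "unknown variant: $1" >&2; return 2 ;;
    esac
    printf '%s\n' "ffmpeg -y -v debug -i \"$INPUT\" -map v:0 -profile:v main -c:v h264_qsv $extra -an $OUTPUT"
}

qsv_cmd la
qsv_cmd vbr
```

Whether either variant reaches full hardware acceleration on a given machine would still have to be confirmed from the encoder's log output.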

comment:3 by Milan Cizek, 7 years ago

Linux stream 4.4.0 #1 SMP Fri Oct 14 22:04:02 CEST 2016 x86_64 x86_64 x86_64 GNU/Linux

Intel Media SDK 2017
nvidia-370; cuda_8.0-44; nvidia_video_sdk_7.0.1; intel-opencl-16.5

# vainfo
libva info: VA-API version 0.99.0
libva info: va_getDriverName() returns 0
libva info: User requested driver 'iHD'
libva info: Trying to open /opt/intel/mediasdk/lib64/iHD_drv_video.so
libva info: Found init function __vaDriverInit_0_32
libva info: va_openDriver() returns 0
vainfo: VA-API version: 0.99 (libva 1.67.0.pre1)
vainfo: Driver version: 16.5.55964-ubit
vainfo: Supported profile and entrypoints
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
      VAProfileH264ConstrainedBaseline: <unknown entrypoint>
      VAProfileH264ConstrainedBaseline: <unknown entrypoint>
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointEncSlice
      VAProfileH264Main               : <unknown entrypoint>
      VAProfileH264Main               : <unknown entrypoint>
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointEncSlice
      VAProfileH264High               : <unknown entrypoint>
      VAProfileH264High               : <unknown entrypoint>
      VAProfileMPEG2Simple            : VAEntrypointEncSlice
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointEncSlice
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileVC1Advanced            : VAEntrypointVLD
      VAProfileVC1Main                : VAEntrypointVLD
      VAProfileVC1Simple              : VAEntrypointVLD
      VAProfileJPEGBaseline           : VAEntrypointVLD
      VAProfileJPEGBaseline           : VAEntrypointEncPicture
      VAProfileVP8Version0_3          : VAEntrypointEncSlice
      VAProfileVP8Version0_3          : VAEntrypointVLD
      VAProfileVP8Version0_3          : <unknown entrypoint>
      VAProfileHEVCMain               : VAEntrypointVLD
      VAProfileHEVCMain               : VAEntrypointEncSlice
      VAProfileVP9Profile0            : <unknown entrypoint>
      <unknown profile>               : VAEntrypointVideoProc
      VAProfileNone                   : VAEntrypointVideoProc
      VAProfileNone                   : <unknown entrypoint>
# ./sys_analyzer_linux.py
--------------------------
Hardware readiness checks:
--------------------------
 [ OK ] Processor name: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
--------------------------
OS readiness checks:
--------------------------
 [ OK ] GPU visible to OS
--------------------------
Intel(R) Media Server Studio Install:
--------------------------
 [ OK ] user is root
 [ OK ] libva.so.1 found
 [ OK ] vainfo reports valid codec entry points
 [ OK ] /dev/dri/renderD128 connects to Intel i915
 [ ERROR ] could not open /dev/dri/renderD129

--------------------------
Media SDK Plugins available:
(for more info see /opt/intel/mediasdk/plugins/plugins.cfg)
--------------------------
    H264LA Encoder      = 588f1185d47b42968dea377bb5d0dcb4
    VP8 Decoder         = f622394d8d87452f878c51f2fc9b4131
    HEVC Decoder        = 33a61c0b4c27454ca8d85dde757c6f8e
    HEVC Encoder        = 6fadc791a0c2eb479ab6dcd5ea9da347
--------------------------
Component Smoke Tests:
--------------------------
 [ OK ] Media SDK HW API level:1.19
 [ OK ] Media SDK SW API level:1.19
 [ OK ] OpenCL check:./oclcheck: /usr/local/cuda/lib64/libOpenCL.so.1: no version information available (required by ./oclcheck)
platform:Intel(R) OpenCL GPU OK CPU OK
platform:NVIDIA CUDA GPU OK CPU FAIL

comment:4 by wenbin,chen, 2 years ago

Resolution: fixed
Status: new → closed

This bug can no longer be reproduced, so I am closing it.
