Opened 14 months ago

Closed 4 months ago

#8849 closed defect (fixed)

sub2video does not work with overlay_cuda

Reported by: Bogdan Ilisei
Owned by:
Priority: normal
Component: undetermined
Version: unspecified
Keywords:
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:

sub2video fails when overlaying (burning in) a DVB subtitle onto a video with overlay_cuda. The run aborts with "Internal bug, should not have happened" (see the log under "How to reproduce").

The same chain works when a transparent PNG is used as the second input. A similar chain on the same source also works with the normal overlay filter, using hwdownload/hwupload while retaining hardware decoding and encoding.
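The ticket does not include the exact command for the software-overlay variant, so the following is only a rough sketch of what such a chain might look like. The filtergraph, the hwupload_cuda choice, and the in.ts/out.ts names are assumptions for illustration. The script prints the command and runs it only when RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: software-overlay fallback, assuming an FFmpeg build with
# CUDA decoding and NVENC. in.ts / out.ts are placeholder file names.

# Download decoded CUDA frames to system memory, burn in the subtitle stream
# with the software overlay filter, then upload the result again for NVENC.
FILTER='[0:v] hwdownload,format=nv12 [base]; [base][0:s] overlay,hwupload_cuda [v]'

set -- ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i in.ts \
  -filter_complex "$FILTER" \
  -map '[v]' -map 0:a -c:v h264_nvenc -acodec copy -f mpegts -y out.ts

# Print the command for inspection; execute it only when RUN=1 is set.
printf '%s\n' "$*"
if [ "${RUN:-0}" = 1 ]; then "$@"; fi
```

Keeping the filtergraph in its own variable makes it easy to swap the software overlay for overlay_cuda when comparing the two paths.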

Sample file: https://0x0.st/iYuU.ts

I based the filter logic on these examples: https://patchwork.ffmpeg.org/project/ffmpeg/patch/20200318071955.2329-1-yyyaroslav@gmail.com/

How to reproduce:

# ./ffmpeg_npp -v verbose -report -dump_filtergraph fmt=dot:filename=./graph.dot -nostats -vsync 0 -init_hw_device cuda=cuda -filter_hw_device cuda -hwaccel cuvid -c:v h264_cuvid -i in.ts -filter_complex "[0:s] format=yuva420p,hwupload [0s]; [0:v] scale_npp=format=yuv420p [0v]; [0v][0s] overlay_cuda [v]" -map "[v]" -map 0:a -c:v h264_nvenc -preset medium -b:v 5M -bufsize 10M -profile:v main -temporal-aq 1 -acodec copy -copy_unknown -f mpegts -y out.ts
ffmpeg started on 2020-08-14 at 02:34:38
Report written to "ffmpeg-20200814-023438.log"
Log level: 48
ffmpeg version N-98725-gcfc6552032 Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
  configuration: --pkg-config=pkg-config --pkg-config-flags=--static --disable-libxcb --disable-debug --enable-cuda-llvm --enable-cuvid --enable-nvenc --enable-libnpp --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64 --extra-cflags='-mtune=generic' --extra-cflags=-O3 --enable-static --disable-shared --prefix=/home/ibm86/ffmpeg-windows-build-helpers/sandbox/cross_compilers/native --enable-nonfree --enable-libfdk-aac
  libavutil      56. 58.100 / 56. 58.100
  libavcodec     58.100.100 / 58.100.100
  libavformat    58. 50.100 / 58. 50.100
  libavdevice    58. 11.101 / 58. 11.101
  libavfilter     7. 87.100 /  7. 87.100
  libswscale      5.  8.100 /  5.  8.100
  libswresample   3.  8.100 /  3.  8.100
[h264 @ 0x564b2f71bb00] Reinit context to 1920x1088, pix_fmt: yuv420p
[h264 @ 0x564b2f71bb00] Increasing reorder buffer to 2
[mpegts @ 0x564b2f7156c0] max_analyze_duration 5000000 reached at 5016000 microseconds st:1
WARNING: defaulting hwaccel_output_format to cuda for compatibility with old commandlines. This behaviour is DEPRECATED and will be removed in the future. Please explicitly set "-hwaccel_output_format cuda".
Input #0, mpegts, from 'in.ts':
  Duration: 00:00:15.93, start: 1.400000, bitrate: 8561 kb/s
  Program 1
    Metadata:
      service_name    : Service01
      service_provider: FFmpeg
    Stream #0:0[0x100]: Video: h264 (High), 1 reference frame ([27][0][0][0] / 0x001B), yuv420p(tv, bt709, top first, left), 1920x1080 (1920x1088) [SAR 1:1 DAR 16:9], 25 fps, 50 tbr, 90k tbn, 50 tbc
    Stream #0:1[0x101](rum): Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz, stereo, fltp, 256 kb/s
    Stream #0:2[0x102](qaa): Audio: ac3 ([129][0][0][0] / 0x0081), 48000 Hz, 5.1(side), fltp, 640 kb/s
    Stream #0:3[0x103](rum): Subtitle: dvb_subtitle ([6][0][0][0] / 0x0006)
[h264_mp4toannexb @ 0x564b2f7eef40] The input looks like it is Annex B already
[h264_cuvid @ 0x564b315982c0] CUVID capabilities for h264_cuvid:
[h264_cuvid @ 0x564b315982c0] 8 bit: supported: 1, min_width: 48, max_width: 4096, min_height: 16, max_height: 4096
[h264_cuvid @ 0x564b315982c0] 10 bit: supported: 0, min_width: 0, max_width: 0, min_height: 0, max_height: 0
[h264_cuvid @ 0x564b315982c0] 12 bit: supported: 0, min_width: 0, max_width: 0, min_height: 0, max_height: 0
Stream mapping:
  Stream #0:0 (h264_cuvid) -> scale_npp
  Stream #0:3 (dvbsub) -> format
  overlay_cuda -> Stream #0:0 (h264_nvenc)
  Stream #0:1 -> #0:1 (copy)
  Stream #0:2 -> #0:2 (copy)
Press [q] to stop, [?] for help
[h264_cuvid @ 0x564b315982c0] Formats: Original: cuda | HW: cuda | SW: nv12
[mpegts @ 0x564b2f7156c0] sub2video: using 1920x1080 canvas
[graph 0 input from stream 0:3 @ 0x564b2f902ac0] w:1920 h:1080 pixfmt:bgra tb:1/90000 fr:0/1 sar:0/1
[graph 0 input from stream 0:0 @ 0x564b2f903740] w:1920 h:1080 pixfmt:cuda tb:1/90000 fr:25/1 sar:1/1
[auto_scaler_0 @ 0x564b2f906a00] w:iw h:ih flags:'bilinear' interl:0
[Parsed_format_0 @ 0x564b30f98540] auto-inserting filter 'auto_scaler_0' between the filter 'graph 0 input from stream 0:3' and the filter 'Parsed_format_0'
[Parsed_scale_npp_2 @ 0x564b30f99600] w:1920 h:1080 -> w:1920 h:1080
[auto_scaler_0 @ 0x564b2f906a00] w:1920 h:1080 fmt:bgra sar:0/1 -> w:1920 h:1080 fmt:yuva420p sar:0/1 flags:0x2
[Parsed_overlay_cuda_3 @ 0x564b2f901ac0] [framesync @ 0x564b2f901bf8] Sync level 2
[h264_nvenc @ 0x564b2f8125c0] Using input frames context (format cuda) with h264_nvenc encoder.
[h264_nvenc @ 0x564b2f8125c0] Loaded Nvenc version 10.0
[h264_nvenc @ 0x564b2f8125c0] Nvenc initialized successfully
[h264_nvenc @ 0x564b2f8125c0] Temporal AQ enabled.
[mpegts @ 0x564b2f8c8d80] service 1 using PCR in pid=256, pcr_period=80ms
[mpegts @ 0x564b2f8c8d80] muxrate VBR, sdt every 500 ms, pat/pmt every 100 ms
Output #0, mpegts, to 'out.ts':
  Metadata:
    encoder         : Lavf58.50.100
    Stream #0:0: Video: h264 (h264_nvenc) (Main), 1 reference frame, cuda, 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 5000 kb/s, 25 fps, 90k tbn, 25 tbc (default)
    Metadata:
      encoder         : Lavc58.100.100 h264_nvenc
    Side data:
      cpb: bitrate max/min/avg: 0/0/5000000 buffer size: 10000000 vbv_delay: N/A
    Stream #0:1(rum): Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz, stereo, fltp, 256 kb/s
    Stream #0:2(qaa): Audio: ac3 ([129][0][0][0] / 0x0081), 48000 Hz, 5.1(side), fltp, 640 kb/s
Error while add the frame to buffer source(Internal bug, should not have happened).
Error while filtering: Internal bug, should not have happened
Failed to inject frame into filter network: Internal bug, should not have happened
Error while processing the decoded data for stream #0:0
[AVIOContext @ 0x564b2f7fc100] Statistics: 0 seeks, 0 writeouts
[h264_nvenc @ 0x564b2f8125c0] Nvenc unloaded
[AVIOContext @ 0x564b2f71e580] Statistics: 5525648 bytes read, 2 seeks
Conversion failed!

The same chain works with a transparent PNG as the overlay input, for example:

# ./ffmpeg_npp -v verbose -nostats -vsync 0 -init_hw_device cuda=cuda -filter_hw_device cuda -hwaccel cuvid -c:v h264_cuvid -i in.ts -i t.png -filter_complex "[1:v] format=yuva420p,hwupload [0s]; [0:v] scale_npp=format=yuv420p [0v]; [0v][0s] overlay_cuda=shortest=false [v]" -map "[v]" -map 0:a -c:v h264_nvenc -preset medium -b:v 5M -bufsize 10M -profile:v main -temporal-aq 1 -acodec copy -copy_unknown -f mpegts -y out.ts
ffmpeg version N-98725-gcfc6552032 Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
  configuration: --pkg-config=pkg-config --pkg-config-flags=--static --disable-libxcb --disable-debug --enable-cuda-llvm --enable-cuvid --enable-nvenc --enable-libnpp --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64 --extra-cflags='-mtune=generic' --extra-cflags=-O3 --enable-static --disable-shared --prefix=/home/ibm86/ffmpeg-windows-build-helpers/sandbox/cross_compilers/native --enable-nonfree --enable-libfdk-aac
  libavutil      56. 58.100 / 56. 58.100
  libavcodec     58.100.100 / 58.100.100
  libavformat    58. 50.100 / 58. 50.100
  libavdevice    58. 11.101 / 58. 11.101
  libavfilter     7. 87.100 /  7. 87.100
  libswscale      5.  8.100 /  5.  8.100
  libswresample   3.  8.100 /  3.  8.100
[h264 @ 0x555fe4920b00] Reinit context to 1920x1088, pix_fmt: yuv420p
[h264 @ 0x555fe4920b00] Increasing reorder buffer to 2
[mpegts @ 0x555fe491a640] max_analyze_duration 5000000 reached at 5016000 microseconds st:1
WARNING: defaulting hwaccel_output_format to cuda for compatibility with old commandlines. This behaviour is DEPRECATED and will be removed in the future. Please explicitly set "-hwaccel_output_format cuda".
Input #0, mpegts, from 'in.ts':
  Duration: 00:00:15.93, start: 1.400000, bitrate: 8561 kb/s
  Program 1
    Metadata:
      service_name    : Service01
      service_provider: FFmpeg
    Stream #0:0[0x100]: Video: h264 (High), 1 reference frame ([27][0][0][0] / 0x001B), yuv420p(tv, bt709, top first, left), 1920x1080 (1920x1088) [SAR 1:1 DAR 16:9], 25 fps, 50 tbr, 90k tbn, 50 tbc
    Stream #0:1[0x101](rum): Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz, stereo, fltp, 256 kb/s
    Stream #0:2[0x102](qaa): Audio: ac3 ([129][0][0][0] / 0x0081), 48000 Hz, 5.1(side), fltp, 640 kb/s
    Stream #0:3[0x103](rum): Subtitle: dvb_subtitle ([6][0][0][0] / 0x0006)
Input #1, png_pipe, from 't.png':
  Duration: N/A, bitrate: N/A
    Stream #1:0: Video: png, 1 reference frame, rgba(pc), 1024x721, 25 tbr, 25 tbn, 25 tbc
[h264_mp4toannexb @ 0x555fe4a37400] The input looks like it is Annex B already
[h264_cuvid @ 0x555fe4a3d580] CUVID capabilities for h264_cuvid:
[h264_cuvid @ 0x555fe4a3d580] 8 bit: supported: 1, min_width: 48, max_width: 4096, min_height: 16, max_height: 4096
[h264_cuvid @ 0x555fe4a3d580] 10 bit: supported: 0, min_width: 0, max_width: 0, min_height: 0, max_height: 0
[h264_cuvid @ 0x555fe4a3d580] 12 bit: supported: 0, min_width: 0, max_width: 0, min_height: 0, max_height: 0
Stream mapping:
  Stream #0:0 (h264_cuvid) -> scale_npp
  Stream #1:0 (png) -> format
  overlay_cuda -> Stream #0:0 (h264_nvenc)
  Stream #0:1 -> #0:1 (copy)
  Stream #0:2 -> #0:2 (copy)
Press [q] to stop, [?] for help
[h264_cuvid @ 0x555fe4a3d580] Formats: Original: cuda | HW: cuda | SW: nv12
[graph 0 input from stream 1:0 @ 0x555fe674b980] w:1024 h:721 pixfmt:rgba tb:1/25 fr:25/1 sar:0/1
[graph 0 input from stream 0:0 @ 0x555fe674c740] w:1920 h:1080 pixfmt:cuda tb:1/90000 fr:25/1 sar:1/1
[auto_scaler_0 @ 0x555fe4b22440] w:iw h:ih flags:'bilinear' interl:0
[Parsed_format_0 @ 0x555fe4a32540] auto-inserting filter 'auto_scaler_0' between the filter 'graph 0 input from stream 1:0' and the filter 'Parsed_format_0'
[Parsed_scale_npp_2 @ 0x555fe4a0bf40] w:1920 h:1080 -> w:1920 h:1080
[auto_scaler_0 @ 0x555fe4b22440] w:1024 h:721 fmt:rgba sar:0/1 -> w:1024 h:721 fmt:yuva420p sar:0/1 flags:0x2
[Parsed_overlay_cuda_3 @ 0x555fe674a9c0] [framesync @ 0x555fe674aaf8] Sync level 2
[h264_nvenc @ 0x555fe69e3e40] Using input frames context (format cuda) with h264_nvenc encoder.
[h264_nvenc @ 0x555fe69e3e40] Loaded Nvenc version 10.0
[h264_nvenc @ 0x555fe69e3e40] Nvenc initialized successfully
[h264_nvenc @ 0x555fe69e3e40] Temporal AQ enabled.
[mpegts @ 0x555fe4acd9c0] service 1 using PCR in pid=256, pcr_period=80ms
[mpegts @ 0x555fe4acd9c0] muxrate VBR, sdt every 500 ms, pat/pmt every 100 ms
Output #0, mpegts, to 'out.ts':
  Metadata:
    encoder         : Lavf58.50.100
    Stream #0:0: Video: h264 (h264_nvenc) (Main), 1 reference frame, cuda, 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 5000 kb/s, 25 fps, 90k tbn, 25 tbc (default)
    Metadata:
      encoder         : Lavc58.100.100 h264_nvenc
    Side data:
      cpb: bitrate max/min/avg: 0/0/5000000 buffer size: 10000000 vbv_delay: N/A
    Stream #0:1(rum): Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz, stereo, fltp, 256 kb/s
    Stream #0:2(qaa): Audio: ac3 ([129][0][0][0] / 0x0081), 48000 Hz, 5.1(side), fltp, 640 kb/s
[Parsed_overlay_cuda_3 @ 0x555fe674a9c0] [framesync @ 0x555fe674aaf8] Sync level 0
No more output streams to write to, finishing.
frame=  354 fps=0.0 q=15.0 Lsize=   10349kB time=00:00:15.84 bitrate=5352.1kbits/s speed=16.8x
video:8383kB audio:1628kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 3.378023%
Input file #0 (in.ts):
  Input stream #0:0 (video): 714 packets read (14014289 bytes); 354 frames decoded;
  Input stream #0:1 (audio): 620 packets read (476160 bytes);
  Input stream #0:2 (audio): 465 packets read (1190400 bytes);
  Input stream #0:3 (subtitle): 0 packets read (0 bytes);
  Total: 1799 packets (15680849 bytes) demuxed
Input file #1 (t.png):
  Input stream #1:0 (video): 1 packets read (10935 bytes); 1 frames decoded;
  Total: 1 packets (10935 bytes) demuxed
Output file #0 (out.ts):
  Output stream #0:0 (video): 354 frames encoded; 354 packets muxed (8584346 bytes);
  Output stream #0:1 (audio): 620 packets muxed (476160 bytes);
  Output stream #0:2 (audio): 465 packets muxed (1190400 bytes);
  Total: 1439 packets (10250906 bytes) muxed
[AVIOContext @ 0x555fe4a011c0] Statistics: 0 seeks, 41 writeouts
[h264_nvenc @ 0x555fe69e3e40] Nvenc unloaded
[AVIOContext @ 0x555fe4923580] Statistics: 21886488 bytes read, 2 seeks
[AVIOContext @ 0x555fe49f3880] Statistics: 10935 bytes read, 0 seeks

Filter Graph - https://bit.ly/33ZjUE8

digraph G {
node [shape=box]
rankdir=LR
"Parsed_format_0\n(format)" -> "Parsed_hwupload_1\n(hwupload)" [ label= "inpad:default -> outpad:default\nfmt:yuva420p w:1920 h:1080 tb:1/90000" ];
"Parsed_hwupload_1\n(hwupload)" -> "Parsed_overlay_cuda_3\n(overlay_cuda)" [ label= "inpad:default -> outpad:overlay\nfmt:cuda w:1920 h:1080 tb:1/90000" ];
"Parsed_scale_npp_2\n(scale_npp)" -> "Parsed_overlay_cuda_3\n(overlay_cuda)" [ label= "inpad:default -> outpad:main\nfmt:cuda w:1920 h:1080 tb:1/90000" ];
"Parsed_overlay_cuda_3\n(overlay_cuda)" -> "format\n(format)" [ label= "inpad:default -> outpad:default\nfmt:cuda w:1920 h:1080 tb:1/90000" ];
"graph 0 input from stream 0:3\n(buffer)" -> "auto_scaler_0\n(scale)" [ label= "inpad:default -> outpad:default\nfmt:bgra w:1920 h:1080 tb:1/90000" ];
"graph 0 input from stream 0:0\n(buffer)" -> "Parsed_scale_npp_2\n(scale_npp)" [ label= "inpad:default -> outpad:default\nfmt:cuda w:1920 h:1080 tb:1/90000" ];
"format\n(format)" -> "out_0_0\n(buffersink)" [ label= "inpad:default -> outpad:default\nfmt:cuda w:1920 h:1080 tb:1/90000" ];
"auto_scaler_0\n(scale)" -> "Parsed_format_0\n(format)" [ label= "inpad:default -> outpad:default\nfmt:yuva420p w:1920 h:1080 tb:1/90000" ];
}
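To look at a dump like the one above as a picture, it can be rendered with Graphviz. This is a small sketch that assumes Graphviz is installed as `dot` and that the dump was saved to ./graph.dot, the path used in the failing command. The GRAPH and OUT variables are illustrative.

```shell
#!/bin/sh
# Sketch: render a filtergraph dump with Graphviz (assumed installed as `dot`).
# GRAPH defaults to the ./graph.dot path used in the failing command.
GRAPH="${GRAPH:-./graph.dot}"
OUT="${GRAPH%.dot}.svg"

if ! command -v dot >/dev/null 2>&1; then
  echo "Graphviz 'dot' not found; skipping render" >&2
elif [ ! -f "$GRAPH" ]; then
  echo "$GRAPH not found; skipping render" >&2
else
  # Write an SVG next to the dump file.
  dot -Tsvg "$GRAPH" -o "$OUT"
fi
```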

Attachments (1)

ffmpeg-20200814-023438.log (88.6 KB ) - added by Bogdan Ilisei 14 months ago.


Change History (3)

by Bogdan Ilisei, 14 months ago

Attachment: ffmpeg-20200814-023438.log added

comment:1 by Bogdan Ilisei, 14 months ago

Follow up on this:

I'm not sure that sub2video is actually the main culprit here.

I copied the subtitle stream to an external file:

ffmpeg -i input.ts -vn -an -scodec copy subs.ts

Then I retried using overlay_cuda:

./ffmpeg -v verbose -nostats -vsync 0 -hwaccel nvdec -hwaccel_output_format cuda -fix_sub_duration -i input.ts -i subs.ts -init_hw_device cuda=cuda:0 -filter_hw_device cuda -filter_complex "[1:s] hwupload=derive_device=cuda [sub]; [0:v] scale_npp=format=yuv420p [video]; [video][sub] overlay_cuda [v]" -map "[v]" -map 0:a -c:v h264_nvenc -acodec copy -copy_unknown -f mpegts -y out.ts
ffmpeg version N-98994-g939f4b35b8 Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
  configuration: --pkg-config-flags=--static --disable-libxcb --disable-debug --enable-cuda-llvm --enable-cuvid --enable-nvenc --enable-libnpp --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64 --enable-static --disable-shared --enable-nonfree --enable-libfdk-aac --enable-vulkan --enable-libglslang
  libavutil      56. 58.100 / 56. 58.100
  libavcodec     58.105.100 / 58.105.100
  libavformat    58. 53.100 / 58. 53.100
  libavdevice    58. 11.101 / 58. 11.101
  libavfilter     7. 87.100 /  7. 87.100
  libswscale      5.  8.100 /  5.  8.100
  libswresample   3.  8.100 /  3.  8.100
[h264 @ 0x559ea6e14e80] Reinit context to 1920x1088, pix_fmt: yuv420p
[h264 @ 0x559ea6e14e80] Increasing reorder buffer to 2
[mpegts @ 0x559ea6e0e380] max_analyze_duration 5000000 reached at 5016000 microseconds st:1
Input #0, mpegts, from 'input.ts':
  Duration: 00:00:15.93, start: 1.400000, bitrate: 8561 kb/s
  Program 1
    Metadata:
      service_name    : Service01
      service_provider: FFmpeg
    Stream #0:0[0x100]: Video: h264 (High), 1 reference frame ([27][0][0][0] / 0x001B), yuv420p(tv, bt709, top first, left), 1920x1080 (1920x1088) [SAR 1:1 DAR 16:9], 25 fps, 50 tbr, 90k tbn, 50 tbc
    Stream #0:1[0x101](rum): Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz, stereo, fltp, 256 kb/s
    Stream #0:2[0x102](qaa): Audio: ac3 ([129][0][0][0] / 0x0081), 48000 Hz, 5.1(side), fltp, 640 kb/s
    Stream #0:3[0x103](rum): Subtitle: dvb_subtitle ([6][0][0][0] / 0x0006)
Input #1, mpegts, from 'subs.ts':
  Duration: 00:00:11.06, start: 1.400000, bitrate: 22 kb/s
  Program 1
    Metadata:
      service_name    : Service01
      service_provider: FFmpeg
    Stream #1:0[0x100](rum): Subtitle: dvb_subtitle ([6][0][0][0] / 0x0006)
Stream mapping:
  Stream #0:0 (h264) -> scale_npp
  Stream #1:0 (dvbsub) -> hwupload
  overlay_cuda -> Stream #0:0 (h264_nvenc)
  Stream #0:1 -> #0:1 (copy)
  Stream #0:2 -> #0:2 (copy)
Press [q] to stop, [?] for help
[h264 @ 0x559ea6eff680] NVDEC capabilities:
[h264 @ 0x559ea6eff680] format supported: yes, max_mb_count: 65536
[h264 @ 0x559ea6eff680] min_width: 48, max_width: 4096
[h264 @ 0x559ea6eff680] min_height: 16, max_height: 4096
[h264 @ 0x559ea6eff680] Reinit context to 1920x1088, pix_fmt: cuda
[mpegts @ 0x559ea6f1a340] sub2video: using 720x576 canvas
[graph 0 input from stream 1:0 @ 0x559ea8c7c600] w:720 h:576 pixfmt:bgra tb:1/90000 fr:0/1 sar:0/1
[graph 0 input from stream 0:0 @ 0x559ea6eedb40] w:1920 h:1080 pixfmt:cuda tb:1/90000 fr:25/1 sar:1/1
[auto_scaler_0 @ 0x559ea74b7280] w:iw h:ih flags:'bilinear' interl:0
[Parsed_hwupload_0 @ 0x559ea8b8fe80] auto-inserting filter 'auto_scaler_0' between the filter 'graph 0 input from stream 1:0' and the filter 'Parsed_hwupload_0'
[Parsed_scale_npp_1 @ 0x559ea6fe9a00] w:1920 h:1080 -> w:1920 h:1080
[auto_scaler_0 @ 0x559ea74b7280] w:720 h:576 fmt:bgra sar:0/1 -> w:720 h:576 fmt:yuva420p sar:0/1 flags:0x2
[Parsed_overlay_cuda_2 @ 0x559ea8b250c0] [framesync @ 0x559ea6f18fb8] Sync level 2
[h264_nvenc @ 0x559ea6f32b80] Using input frames context (format cuda) with h264_nvenc encoder.
[h264_nvenc @ 0x559ea6f32b80] Loaded Nvenc version 10.0
[h264_nvenc @ 0x559ea6f32b80] Nvenc initialized successfully
[mpegts @ 0x559ea6f12900] service 1 using PCR in pid=256, pcr_period=80ms
[mpegts @ 0x559ea6f12900] muxrate VBR, sdt every 500 ms, pat/pmt every 100 ms
Output #0, mpegts, to 'out.ts':
  Metadata:
    encoder         : Lavf58.53.100
    Stream #0:0: Video: h264 (h264_nvenc) (Main), 1 reference frame, cuda, 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 2000 kb/s, 25 fps, 90k tbn, 25 tbc (default)
    Metadata:
      encoder         : Lavc58.105.100 h264_nvenc
    Side data:
      cpb: bitrate max/min/avg: 0/0/2000000 buffer size: 4000000 vbv_delay: N/A
    Stream #0:1(rum): Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz, stereo, fltp, 256 kb/s
    Stream #0:2(qaa): Audio: ac3 ([129][0][0][0] / 0x0081), 48000 Hz, 5.1(side), fltp, 640 kb/s
[Parsed_overlay_cuda_2 @ 0x559ea8b250c0] [framesync @ 0x559ea6f18fb8] Sync level 1
[Parsed_overlay_cuda_2 @ 0x559ea8b250c0] [framesync @ 0x559ea6f18fb8] Sync level 0
No more output streams to write to, finishing.
frame=  355 fps=244 q=21.0 Lsize=    4561kB time=00:01:11.04 bitrate= 526.0kbits/s speed=48.8x
video:2717kB audio:1628kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 4.979304%
Input file #0 (input.ts):
  Input stream #0:0 (video): 714 packets read (14014289 bytes); 354 frames decoded;
  Input stream #0:1 (audio): 620 packets read (476160 bytes);
  Input stream #0:2 (audio): 465 packets read (1190400 bytes);
  Input stream #0:3 (subtitle): 0 packets read (0 bytes);
  Total: 1799 packets (15680849 bytes) demuxed
Input file #1 (subs.ts):
  Input stream #1:0 (subtitle): 6 packets read (28670 bytes); 3 frames decoded;
  Total: 6 packets (28670 bytes) demuxed
Output file #0 (out.ts):
  Output stream #0:0 (video): 355 frames encoded; 355 packets muxed (2782576 bytes);
  Output stream #0:1 (audio): 620 packets muxed (476160 bytes);
  Output stream #0:2 (audio): 465 packets muxed (1190400 bytes);
  Total: 1440 packets (4449136 bytes) muxed
[AVIOContext @ 0x559ea6ef5100] Statistics: 0 seeks, 18 writeouts
[h264_nvenc @ 0x559ea6f32b80] Nvenc unloaded
[AVIOContext @ 0x559ea6e17280] Statistics: 21886488 bytes read, 2 seeks
[AVIOContext @ 0x559ea6ee7f40] Statistics: 31396 bytes read, 0 seeks

comment:2 by Bogdan Ilisei, 4 months ago

Resolution: fixed
Status: new → closed

I just tested with the latest builds, against the same CUDA version, and it now seems to work fine:

ffmpeg -threads 1 -v verbose -nostats -init_hw_device cuda=cuda:0 -hwaccel_device cuda -filter_hw_device cuda -extra_hw_frames 3 -hwaccel cuda -hwaccel_output_format cuda \
  -reinit_filter 1 -filter_threads 1 -filter_complex_threads 1 \
  -i "${INPUT}" \
  -filter_complex "[0:s] scale=1920:1080,format=yuva420p,hwupload_cuda [sub]; [0:v] scale_npp=format=yuv420p [main]; [main][sub] overlay_cuda [v]" \
  -c:v h264_nvenc -preset medium \
  -map "[v]" -b:v 5M -minrate 2.5M -maxrate 7.5M -bufsize 10M -profile:v main -temporal-aq 1 \
  -map 0:a -acodec libfdk_aac -vbr 5 \
  -copy_unknown \
  -f mpegts -y -
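For readability, the filtergraph from the command above can be split into its three chains. This is only a sketch: the W, H, SUB, MAIN, and MIX variable names are illustrative and not part of the original command, and the ticket does not state which change resolved the failure.

```shell
#!/bin/sh
# Sketch: the filtergraph from the working command, split into three chains.
# W and H are illustrative variables, not part of the original command.
W="${W:-1920}"
H="${H:-1080}"

# Subtitle branch: resize the sub2video frames, convert them to a planar
# format with alpha, and upload them to CUDA memory.
SUB="[0:s] scale=${W}:${H},format=yuva420p,hwupload_cuda [sub]"
# Video branch: convert the decoded CUDA frames to yuv420p on the GPU.
MAIN="[0:v] scale_npp=format=yuv420p [main]"
# Composite the subtitle frames over the video on the GPU.
MIX="[main][sub] overlay_cuda [v]"

FILTER="${SUB}; ${MAIN}; ${MIX}"
printf '%s\n' "$FILTER"
```

The resulting string can be passed as `-filter_complex "$FILTER"`. With the default W and H it is identical to the filtergraph in the command above.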
