Opened 3 years ago

Closed 3 years ago

#9430 closed defect (invalid)

Audio out of sync when using select filter with lots of cuts

Reported by: Kuba Orlik
Owned by:
Priority: normal
Component: undetermined
Version: unspecified
Keywords:
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug: When using the select/aselect filters with a combination of many between(t, start, end) selectors, the audio and video in the resulting file drift progressively further out of sync over time.

I ran into this issue while working on a set of tools that lets the user open the audio of a movie in Audacity, records which parts the user deleted from the audio, and then deletes the same parts from the video.

The problem seems to be with the audio track: if I replace the audio that ffmpeg cut with the audio that Audacity exported, the video is in sync with the audio.

How to reproduce:

I used exactly the parameters below, although I think the specific numbers don't really matter; it's the number of cuts that seems to confuse something in the audio pipeline. Using fewer cuts (even around 20) also results in a delay, but a less noticeable one.

% ffmpeg -i input.mp4                -vf $'select=\'between(t,576.76000,632.80000)+between(t,667.80000,822.72000)+between(t,845.76000,897.40000)+between(t,903.48000,933.96000)+between(t,936.28000,987.76000)+between(t,991.48000,991.96000)+between(t,1004.52000,1018.84000)+between(t,1019.44000,1020.36000)+between(t,1021.08000,1039.60000)+between(t,1041.00000,1042.88000)+between(t,1043.32000,1044.76000)+between(t,1045.36000,1060.88000)+between(t,1062.60000,1075.12000)+between(t,1076.24000,1076.64000)+between(t,1078.08000,1109.64000)+between(t,1110.08000,1118.00000)+between(t,1120.08000,1127.96000)+between(t,1128.60000,1133.68000)+between(t,1135.96000,1145.00000)+between(t,1146.00000,1146.36000)+between(t,1149.04000,1162.80000)+between(t,1163.92000,1164.84000)+between(t,1167.32000,1236.04000)+between(t,1236.52000,1244.28000)+between(t,1246.40000,1360.92000)+between(t,1363.20000,1553.00000)+between(t,1553.64000,1559.00000)+between(t,1560.40000,1573.92000)+between(t,1576.00000,1699.04000)+between(t,1701.08000,1774.64000)+between(t,1776.28000,1777.92000)+between(t,1778.96000,2079.48000)+between(t,2255.08000,2327.96000)+between(t,2328.76000,2329.76000)+between(t,2330.68000,2354.48000)+between(t,2358.84000,2399.28000)+between(t,2401.76000,2465.84000)+between(t,2482.32000,2505.96000)+between(t,2506.76000,2511.88000)+between(t,2512.48000,2619.24000)+between(t,2621.00000,2638.60000)+between(t,2639.56000,2640.48000)+between(t,2640.96000,2641.32000)+between(t,2642.00000,2652.36000)+between(t,2652.92000,2654.36000)+between(t,2654.88000,2657.80000)+between(t,2658.36000,7195.16000)\',setpts=N/FRAME_RATE/TB'                -af 
$'aselect=\'between(t,576.76000,632.80000)+between(t,667.80000,822.72000)+between(t,845.76000,897.40000)+between(t,903.48000,933.96000)+between(t,936.28000,987.76000)+between(t,991.48000,991.96000)+between(t,1004.52000,1018.84000)+between(t,1019.44000,1020.36000)+between(t,1021.08000,1039.60000)+between(t,1041.00000,1042.88000)+between(t,1043.32000,1044.76000)+between(t,1045.36000,1060.88000)+between(t,1062.60000,1075.12000)+between(t,1076.24000,1076.64000)+between(t,1078.08000,1109.64000)+between(t,1110.08000,1118.00000)+between(t,1120.08000,1127.96000)+between(t,1128.60000,1133.68000)+between(t,1135.96000,1145.00000)+between(t,1146.00000,1146.36000)+between(t,1149.04000,1162.80000)+between(t,1163.92000,1164.84000)+between(t,1167.32000,1236.04000)+between(t,1236.52000,1244.28000)+between(t,1246.40000,1360.92000)+between(t,1363.20000,1553.00000)+between(t,1553.64000,1559.00000)+between(t,1560.40000,1573.92000)+between(t,1576.00000,1699.04000)+between(t,1701.08000,1774.64000)+between(t,1776.28000,1777.92000)+between(t,1778.96000,2079.48000)+between(t,2255.08000,2327.96000)+between(t,2328.76000,2329.76000)+between(t,2330.68000,2354.48000)+between(t,2358.84000,2399.28000)+between(t,2401.76000,2465.84000)+between(t,2482.32000,2505.96000)+between(t,2506.76000,2511.88000)+between(t,2512.48000,2619.24000)+between(t,2621.00000,2638.60000)+between(t,2639.56000,2640.48000)+between(t,2640.96000,2641.32000)+between(t,2642.00000,2652.36000)+between(t,2652.92000,2654.36000)+between(t,2654.88000,2657.80000)+between(t,2658.36000,7195.16000)\',asetpts=N/SR/TB'                output.cut.mp4
ffmpeg version n4.4 Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 11.1.0 (GCC)
  configuration: --prefix=/usr --disable-debug --disable-static --disable-stripping --enable-amf --enable-avisynth --enable-cuda-llvm --enable-lto --enable-fontconfig --enable-gmp --enable-gnutls --enable-gpl --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libdav1d --enable-libdrm --enable-libfreetype --enable-libfribidi --enable-libgsm --enable-libiec61883 --enable-libjack --enable-libmfx --enable-libmodplug --enable-libmp3lame --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-librav1e --enable-librsvg --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxcb --enable-libxml2 --enable-libxvid --enable-libzimg --enable-nvdec --enable-nvenc --enable-shared --enable-version3
  libavutil      56. 70.100 / 56. 70.100
  libavcodec     58.134.100 / 58.134.100
  libavformat    58. 76.100 / 58. 76.100
  libavdevice    58. 13.100 / 58. 13.100
  libavfilter     7.110.100 /  7.110.100
  libswscale      5.  9.100 /  5.  9.100
  libswresample   3.  9.100 /  3.  9.100
  libpostproc    55.  9.100 / 55.  9.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'input.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.76.100
  Duration: 01:59:55.20, start: 0.000000, bitrate: 706 kb/s
  Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 854x480 [SAR 1280:1281 DAR 16:9], 573 kb/s, 25 fps, 25 tbr, 12800 tbn, 50 tbc (default)
    Metadata:
      handler_name    : ?Mainconcept Video Media Handler
      vendor_id       : [0][0][0][0]
  Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 126 kb/s (default)
    Metadata:
      handler_name    : #Mainconcept MP4 Sound Media Handler
      vendor_id       : [0][0][0][0]
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> h264 (libx264))
  Stream #0:1 -> #0:1 (aac (native) -> aac (native))
Press [q] to stop, [?] for help
[libx264 @ 0x5595b7cfa7c0] using SAR=1280/1281
[libx264 @ 0x5595b7cfa7c0] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0x5595b7cfa7c0] profile High, level 3.0, 4:2:0, 8-bit
[libx264 @ 0x5595b7cfa7c0] 264 - core 161 r3039 544c61f - H.264/MPEG-4 AVC codec - Copyleft 2003-2021 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=12 lookahead_threads=2 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'output.cut.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.76.100
  Stream #0:0(eng): Video: h264 (avc1 / 0x31637661), yuv420p(progressive), 854x480 [SAR 1280:1281 DAR 16:9], q=2-31, 25 fps, 12800 tbn (default)
    Metadata:
      handler_name    : ?Mainconcept Video Media Handler
      vendor_id       : [0][0][0][0]
      encoder         : Lavc58.134.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
  Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      handler_name    : #Mainconcept MP4 Sound Media Handler
      vendor_id       : [0][0][0][0]
      encoder         : Lavc58.134.100 aac
frame=157384 fps=229 q=-1.0 Lsize=  499971kB time=01:44:55.24 bitrate= 650.6kbits/s speed=9.18x    
video:397876kB audio:97408kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.946321%
[libx264 @ 0x5595b7cfa7c0] frame I:635   Avg QP:16.78  size: 59678
[libx264 @ 0x5595b7cfa7c0] frame P:40551 Avg QP:21.32  size:  6594
[libx264 @ 0x5595b7cfa7c0] frame B:116198 Avg QP:27.10  size:   879
[libx264 @ 0x5595b7cfa7c0] consecutive B-frames:  0.7%  2.2%  0.8% 96.3%
[libx264 @ 0x5595b7cfa7c0] mb I  I16..4: 15.5% 41.9% 42.6%
[libx264 @ 0x5595b7cfa7c0] mb P  I16..4:  0.9%  2.3%  0.4%  P16..4: 29.1% 13.1%  8.6%  0.0%  0.0%    skip:45.5%
[libx264 @ 0x5595b7cfa7c0] mb B  I16..4:  0.0%  0.1%  0.0%  B16..8: 23.4%  2.3%  0.4%  direct: 0.7%  skip:73.1%  L0:42.9% L1:51.4% BI: 5.7%
[libx264 @ 0x5595b7cfa7c0] 8x8 transform intra:57.3% inter:64.6%
[libx264 @ 0x5595b7cfa7c0] coded y,uvDC,uvAC intra: 43.6% 51.2% 25.4% inter: 5.6% 6.4% 0.5%
[libx264 @ 0x5595b7cfa7c0] i16 v,h,dc,p: 48% 16% 13% 22%
[libx264 @ 0x5595b7cfa7c0] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 24% 11% 34%  3%  6%  8%  5%  5%  4%
[libx264 @ 0x5595b7cfa7c0] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 26% 16% 12%  5% 10% 10%  9%  7%  5%
[libx264 @ 0x5595b7cfa7c0] i8c dc,h,v,p: 59% 16% 21%  5%
[libx264 @ 0x5595b7cfa7c0] Weighted P-Frames: Y:0.4% UV:0.1%
[libx264 @ 0x5595b7cfa7c0] ref P L0: 60.7% 15.3% 15.8%  8.2%  0.0%
[libx264 @ 0x5595b7cfa7c0] ref B L0: 88.8%  8.3%  2.9%
[libx264 @ 0x5595b7cfa7c0] ref B L1: 96.2%  3.8%
[libx264 @ 0x5595b7cfa7c0] kb/s:517.75
[aac @ 0x5595b7c8b380] Qavg: 5688.512
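
One way to see why the number of cuts could matter is a simplified model in which select and aselect keep or drop whole frames, as comment:2 below states. The sketch assumes video frames on a regular 1/25 s grid (25 fps, from the log above), audio frames of 1024 samples at 48000 Hz (the usual AAC-LC frame size, an assumption about this file), and timestamps that start at 0. The names `kept_frames` and `drift` are hypothetical helpers, not part of FFmpeg, and the model ignores encoder delay and priming samples.

```python
from fractions import Fraction
import math

VIDEO_FRAME = Fraction(1, 25)        # 25 fps video, as reported in the log
AUDIO_FRAME = Fraction(1024, 48000)  # assumed: 1024-sample frames at 48000 Hz


def kept_frames(start, end, frame_dur):
    """Count frames whose start timestamp k * frame_dur lies in [start, end].

    Pass start/end as strings or ints so Fraction parses them exactly.
    """
    first = math.ceil(Fraction(start) / frame_dur)
    last = math.floor(Fraction(end) / frame_dur)
    return max(0, last - first + 1)


def drift(segments, video_dur=VIDEO_FRAME, audio_dur=AUDIO_FRAME):
    """Kept video duration minus kept audio duration, summed over segments."""
    total = Fraction(0)
    for start, end in segments:
        video = kept_frames(start, end, video_dur) * video_dur
        audio = kept_frames(start, end, audio_dur) * audio_dur
        total += video - audio
    return total


if __name__ == "__main__":
    # First two segments from the command above; prints the modelled
    # audio/video mismatch in seconds after these two cuts.
    segments = [("576.76", "632.80"), ("667.80", "822.72")]
    print(float(drift(segments)))
```

In this model every segment contributes its own rounding error on each stream, so the mismatch can grow with the number of segments rather than with their length, which would match the observation that fewer cuts give a smaller delay.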

Change History (2)

comment:1 by Kuba Orlik, 3 years ago

Summary: Audio out of sync when using select filter → Audio out of sync when using select filter with lots of cuts

comment:2 by Elon Musk, 3 years ago

Resolution: invalid
Status: new → closed

The aselect filter's selection is not sample accurate (it is only frame accurate, and an audio frame can contain multiple samples), so you need to use atrim filters instead. Each atrim takes the same input, split with the asplit filter: asplit=(number of selections);atrim=(each selection's parameters)...; and as the last step use the concat filter (with v=0:a=1 for audio) to join the trimmed segments.
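
The suggestion above could be sketched as a small generator for the filtergraph string. This is only an illustration under assumptions: the function name `build_audio_cut_filter` and the pad labels (`a0`, `s0`, `aout`) are hypothetical, and the `asetpts=PTS-STARTPTS` step is added here because the concat filter expects every segment to start at timestamp 0. The filters used (`asplit`, `atrim`, `asetpts`, `concat`) are standard FFmpeg filters.

```python
def build_audio_cut_filter(segments, input_label="0:a", output_label="aout"):
    """Build an asplit/atrim/concat filtergraph for the given (start, end) pairs.

    Times are in seconds; pass them as strings to keep their exact digits.
    """
    if not segments:
        raise ValueError("at least one segment is required")

    n = len(segments)

    # One asplit output per segment, so every atrim reads the same input.
    split_labels = "".join(f"[a{i}]" for i in range(n))
    parts = [f"[{input_label}]asplit={n}{split_labels}"]

    # Trim each copy sample-accurately and reset its timestamps to start at 0.
    for i, (start, end) in enumerate(segments):
        parts.append(
            f"[a{i}]atrim=start={start}:end={end},asetpts=PTS-STARTPTS[s{i}]"
        )

    # Join the trimmed segments back into a single audio stream.
    concat_inputs = "".join(f"[s{i}]" for i in range(n))
    parts.append(f"{concat_inputs}concat=n={n}:v=0:a=1[{output_label}]")

    return ";".join(parts)


if __name__ == "__main__":
    # First two segments from the command in the description.
    print(build_audio_cut_filter([("576.76", "632.80"), ("667.80", "822.72")]))
```

The resulting string would be passed to ffmpeg with `-filter_complex`, and the audio output selected with `-map "[aout]"`. With many segments the asplit copies may need a lot of buffering, so this is a starting point rather than a tuned solution.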
