Opened 2 months ago

#6722 new defect

XMA/WMAPro decoder: gapless problems

Reported by: bnnm
Owned by:
Priority: normal
Component: avcodec
Version: git-master
Keywords: xma, wmapro
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

The first and last parts of the decoded XMA output are slightly incorrect, which notably breaks gapless files. I haven't tested WMAPro files as much, but I believe the same applies to them.


From tests with Microsoft's XMA encoder (xmaencode.exe):

  • FFmpeg's output is 128(?) samples late, whereas xmaencode adds 128 samples at the start of its output (1 subframe of "setup_samples"?), which makes the last subframe in a file incorrect. In short, the first and last parts of the output are off; samples in the middle look fine.
  • FFmpeg ignores "start skip" (samples to discard at the beginning) and "end skip" (samples to discard at the end); see wmaprodec.c@1443. xmaencode uses both when decoding and applies them after the first 128 "extra" samples. start_skip is usually set to frame_samples and end_skip to less than frame_samples, but if they are changed manually (e.g. to 100 or 0), xmaencode honors the new values. A value greater than frame_size (512 in XMA) is clamped, and end_skip seems to always be included, even if it is 0.

E.g. total samples output for 10 frames: FFmpeg = 10*512; xmaencode = 128 + 1*512 - start_skip + 9*512 - end_skip
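
The sample counts above can be sketched as a small calculation. This is only an illustration of the arithmetic in this report, not FFmpeg or xmaencode code; the function and constant names are made up here, and clamping both skip values to the frame size is an assumption based on the observation above.

```python
FRAME_SAMPLES = 512  # samples per XMA frame, as stated in this report
SETUP_SAMPLES = 128  # extra leading samples xmaencode was observed to emit


def ffmpeg_output_samples(num_frames):
    """Samples FFmpeg currently outputs: whole frames, no skips applied."""
    return num_frames * FRAME_SAMPLES


def xmaencode_output_samples(num_frames, start_skip, end_skip):
    """Samples xmaencode outputs, following the formula in this report."""
    # Assumption: both skip values are clamped to the frame size.
    start_skip = min(start_skip, FRAME_SAMPLES)
    end_skip = min(end_skip, FRAME_SAMPLES)
    return SETUP_SAMPLES + num_frames * FRAME_SAMPLES - start_skip - end_skip


# 10 frames with the usual start_skip (one frame) and an end_skip of 108
print(ffmpeg_output_samples(10))
print(xmaencode_output_samples(10, 512, 108))
```

With start_skip = 512 and end_skip = 108 this gives 5120 samples for FFmpeg and 4628 for xmaencode, so the two decoders disagree on length as well as on alignment.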


Example files: https://mega.nz/#!DBAFGY4C!Jb0Y8gtDpm_V12DSqz5LP63k7xkqq_L9fMNn0Fc0Qv4

test_20.xma (20 PCM samples)

  • xmaencode encodes a 1-frame (512 samples) file
  • xmaencode decodes 128 + 512 - 512 (start_skip, leaving the last 128 samples of the frame) - 108 (end_skip) = 20 samples
  • FFmpeg just outputs 512 samples, and the waveform isn't correct (the last 128 are wrong)
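
As a rough sketch of what honoring the skips would look like for test_20.xma, the snippet below trims a dummy decoded buffer. It follows one reading of the behavior described above: the decoder produces 128 + 512 samples, start_skip is dropped from the front and end_skip from the back. The function name is hypothetical and the buffer holds sample indices rather than real audio.

```python
def trim_gapless(decoded, start_skip, end_skip, frame_samples=512):
    """Drop start_skip samples from the front and end_skip from the back."""
    # Assumption: skip values larger than one frame are clamped.
    start_skip = min(start_skip, frame_samples)
    end_skip = min(end_skip, frame_samples)
    end = len(decoded) - end_skip
    # Never let the end cross the start if the buffer is very short.
    return decoded[start_skip:max(end, start_skip)]


# Dummy output for test_20.xma: 128 "setup" samples plus one 512-sample frame.
decoded = list(range(128 + 512))
kept = trim_gapless(decoded, start_skip=512, end_skip=108)
print(len(kept), kept[0], kept[-1])
```

Under these assumptions the 20 surviving samples are indices 512 to 531 of the 640-sample stream, i.e. the start of the final 128 samples, which is the region FFmpeg currently gets wrong.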

Screenshot: original PCM vs xmaencode vs ffmpeg vs xmaencode manually removing the skips from the file.

test_322.xma: the same with 2 frames. Note how FFmpeg starts "late", and how the second frame now decodes the way the first frame should in test_20.xma.

Both files are smaller than one packet, so this isn't a bug in the bit reservoir.
I encoded these files myself, but as far as I know the issues are present in all "real" files.


Console output:

% ffmpeg -i test_20.xma test_20.wav -v debug
ffmpeg version N-87353-g183fd30 Copyright (c) 2000-2017 the FFmpeg developers
  built with gcc 7.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-cuda --enable-cuvid --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-nvenc --enable-avisynth --enable-bzlib --enable-fontconfig --enable-frei0r --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libfreetype --enable-libgme --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg --enable-libopus --enable-librtmp --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxvid --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib
  libavutil      55. 76.100 / 55. 76.100
  libavcodec     57.106.101 / 57.106.101
  libavformat    57. 82.101 / 57. 82.101
  libavdevice    57.  8.101 / 57.  8.101
  libavfilter     6.105.100 /  6.105.100
  libswscale      4.  7.103 /  4.  7.103
  libswresample   2.  8.100 /  2.  8.100
  libpostproc    54.  6.100 / 54.  6.100
Splitting the commandline.
Reading option '-i' ... matched as input url with argument 'test_20.xma'.
Reading option 'test_20.wav' ... matched as output url.
Reading option '-v' ... matched as option 'v' (set logging level) with argument 'debug'.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option v (set logging level) with argument debug.
Successfully parsed a group of options.
Parsing a group of options: input url test_20.xma.
Successfully parsed a group of options.
Opening an input file: test_20.xma.
[NULL @ 00367800] Opening 'test_20.xma' for reading
[file @ 00367f20] Setting default whitelist 'file,crypto'
[wav @ 00367800] Format wav probed with size=2048 and score=99
[wav @ 00367800] Before avformat_find_stream_info() pos: 60 bytes read:2128 seeks:0 nb_streams:1
[wav @ 00367800] parser not found for codec xma1, packets or times may be invalid.
[xma1 @ 03231580] extradata:
[xma1 @ 03231580] [d6] [10] [0] [0] [1] [0] [0] [2] [80] [2d] [7] [0] [44] [ac] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [1] [1] [0] 
[wav @ 00367800] parser not found for codec xma1, packets or times may be invalid.
[wav @ 00367800] After avformat_find_stream_info() pos: 2128 bytes read:2128 seeks:0 frames:1
Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, wav, from 'test_20.xma':
  Duration: N/A, bitrate: N/A
    Stream #0:0, 1, 1/44100: Audio: xma1 (e[1][0][0] / 0x0165), 44100 Hz, mono, fltp
Successfully opened the file.
Parsing a group of options: output url test_20.wav.
Successfully parsed a group of options.
Opening an output file: test_20.wav.
[file @ 0036df00] Setting default whitelist 'file,crypto'
Successfully opened the file.
[xma1 @ 03231c00] extradata:
[xma1 @ 03231c00] [d6] [10] [0] [0] [1] [0] [0] [2] [80] [2d] [7] [0] [44] [ac] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [0] [1] [1] [0] 
Stream mapping:
  Stream #0:0 -> #0:0 (xma1 (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
cur_dts is invalid (this is harmless if it occurs once at the start per stream)
detected 2 logical cores
[graph_0_in_0_0 @ 032b5260] Setting 'time_base' to value '1/44100'
[graph_0_in_0_0 @ 032b5260] Setting 'sample_rate' to value '44100'
[graph_0_in_0_0 @ 032b5260] Setting 'sample_fmt' to value 'fltp'
[graph_0_in_0_0 @ 032b5260] Setting 'channels' to value '1'
[graph_0_in_0_0 @ 032b5260] tb:1/44100 samplefmt:fltp samplerate:44100 chlayout:(null)
[format_out_0_0 @ 032b56e0] Setting 'sample_fmts' to value 's16'
[format_out_0_0 @ 032b56e0] auto-inserting filter 'auto_resampler_0' between the filter 'Parsed_anull_0' and the filter 'format_out_0_0'
[AVFilterGraph @ 032b4a00] query_formats: 4 queried, 6 merged, 3 already done, 0 delayed
[auto_resampler_0 @ 032b5e20] [SWR @ 032b6ea0] Using fltp internally between filters
[auto_resampler_0 @ 032b5e20] ch:1 chl:1 channels fmt:fltp r:44100Hz -> ch:1 chl:1 channels fmt:s16 r:44100Hz
Output #0, wav, to 'test_20.wav':
  Metadata:
    ISFT            : Lavf57.82.101
    Stream #0:0, 0, 1/44100: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, 1 channels, s16, 705 kb/s
    Metadata:
      encoder         : Lavc57.106.101 pcm_s16le
[out_0_0 @ 032b55a0] EOF on sink link out_0_0:default.
No more output streams to write to, finishing.
size=       1kB time=00:00:00.01 bitrate= 759.3kbits/s speed= 5.8x    
video:0kB audio:1kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 7.617188%
Input file #0 (test_20.xma):
  Input stream #0:0 (audio): 1 packets read (2048 bytes); 1 frames decoded (512 samples); 
  Total: 1 packets (2048 bytes) demuxed
Output file #0 (test_20.wav):
  Output stream #0:0 (audio): 1 frames encoded (512 samples); 1 packets muxed (1024 bytes); 
  Total: 1 packets (1024 bytes) muxed
1 frames successfully decoded, 0 decoding errors
[AVIOContext @ 03290060] Statistics: 4 seeks, 4 writeouts
[AVIOContext @ 0036ca40] Statistics: 2128 bytes read, 0 seeks

Attachments (4)

test_20.xma (2.1 KB) - added by bnnm 2 months ago.
test_20.png (20.2 KB) - added by bnnm 2 months ago.
test_322.xma (2.1 KB) - added by bnnm 2 months ago.
test_322.png (18.4 KB) - added by bnnm 2 months ago.

