Opened 18 months ago

Last modified 4 weeks ago

#6367 open defect

ogg vorbis decode results in too many sample frames

Reported by: markbuer
Owned by:
Priority: normal
Component: undetermined
Version: git-master
Keywords: ogg vorbis
Cc:
Blocked By:
Blocking:
Reproduced by developer: yes
Analyzed by developer: no

Description

Decoding Ogg Vorbis with ffmpeg produces more sample frames than are present in the original (pre-encoding) raw file or in the output of the oggdec tool.

Chromium issue https://bugs.chromium.org/p/chromium/issues/detail?id=456252


Simple reproduction steps:

# Create 125 sample frames of raw (44.1kHz stereo pcm_s16le) silence
dd if=/dev/zero of=./silence.raw count=1 bs=500

# Encode raw to ogg vorbis
oggenc --raw silence.raw --output=silence.ogg

# Decode ogg vorbis to raw using oggdec
oggdec --raw --output silence.oggdec.raw silence.ogg

# Decode ogg vorbis to raw using ffmpeg libvorbis decoder
ffmpeg -codec:a libvorbis -i silence.ogg -f s16le -codec:a pcm_s16le silence.libvorbis.ffmpeg.raw

# Decode ogg vorbis to raw using ffmpeg native decoder
ffmpeg -i silence.ogg -f s16le -codec:a pcm_s16le silence.native.ffmpeg.raw

# Examine the results of decoding
# We expect all raw files to have identical size,
# but instead we see that the ffmpeg decoded files are too big
ls -l *.raw

# Repeat with a differently sized initial raw file to see that the
# problem occurs for any number of initial sample frames (small or large)
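
To repeat the steps for several input lengths, the commands above can be wrapped in small helper functions. This is only a sketch: the names `make_silence` and `compare_decoders` and the `silence.N.*` file naming are made up for illustration, the ffmpeg options `-y` and `-loglevel error` are added only to keep the run non-interactive and quiet, and it assumes `oggenc`, `oggdec` and `ffmpeg` are on the PATH.

```shell
#!/bin/sh
# Bytes per sample frame for 16-bit stereo PCM: 2 channels * 2 bytes per sample.
BYTES_PER_FRAME=4

# make_silence FRAMES FILE (hypothetical helper)
# Write FRAMES sample frames of raw pcm_s16le stereo silence to FILE.
make_silence() {
    dd if=/dev/zero of="$2" bs="$BYTES_PER_FRAME" count="$1" 2>/dev/null
}

# compare_decoders FRAMES (hypothetical helper)
# Encode FRAMES frames of silence, decode with oggdec and with ffmpeg's
# native decoder, then print the byte size of each raw file.
compare_decoders() {
    n=$1
    make_silence "$n" "silence.$n.raw"
    oggenc --raw "silence.$n.raw" --output="silence.$n.ogg"
    oggdec --raw --output "silence.$n.oggdec.raw" "silence.$n.ogg"
    ffmpeg -y -loglevel error -i "silence.$n.ogg" \
        -f s16le -codec:a pcm_s16le "silence.$n.ffmpeg.raw"
    for f in "silence.$n.raw" "silence.$n.oggdec.raw" "silence.$n.ffmpeg.raw"; do
        printf '%s %s\n' "$f" "$(wc -c < "$f")"
    done
}
```

Calling, for example, `compare_decoders 125` followed by `compare_decoders 44100` prints three sizes per run; if the problem is present, the ffmpeg size is larger than the other two.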

Output from ls -l *.raw:

-rw-r--r--  1 mark  staff      2304  3 May 07:15 silence.libvorbis.ffmpeg.raw
-rw-r--r--  1 mark  staff      2304  3 May 07:16 silence.native.ffmpeg.raw
-rw-r--r--  1 mark  staff       500  3 May 07:14 silence.oggdec.raw
-rw-r--r--  1 mark  staff       500  3 May 07:12 silence.raw
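
The byte sizes in the listing can be converted to sample frames to make the discrepancy explicit. A minimal sketch, assuming 16-bit stereo PCM (4 bytes per sample frame) as in the reproduction steps; `frames_from_bytes` is a hypothetical helper name, not part of any tool.

```shell
# frames_from_bytes BYTES [CHANNELS] [BYTES_PER_SAMPLE] (hypothetical helper)
# Print the number of sample frames contained in BYTES of interleaved raw PCM.
# Defaults match the test case: 2 channels, 2 bytes per sample (pcm_s16le).
frames_from_bytes() {
    bytes=$1
    channels=${2:-2}
    sample_bytes=${3:-2}
    echo $(( bytes / (channels * sample_bytes) ))
}

frames_from_bytes 500     # silence.raw and silence.oggdec.raw: prints 125
frames_from_bytes 2304    # both ffmpeg outputs: prints 576
```

So both ffmpeg decodes contain 576 sample frames where 125 are expected, an excess of 451 frames.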

-report output for libvorbis decode invocation:

ffmpeg started on 2017-05-03 at 08:30:15
Report written to "ffmpeg-20170503-083015.log"
Command line:
ffmpeg -report -codec:a libvorbis -i silence.ogg -f s16le -codec:a pcm_s16le silence.libvorbis.ffmpeg.raw
ffmpeg version git-2017-05-02-20da413 Copyright (c) 2000-2017 the FFmpeg developers
  built with Apple LLVM version 8.1.0 (clang-802.0.42)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/HEAD-20da413 --enable-shared --enable-pthreads --enable-gpl --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-libxvid --enable-opencl --disable-lzma --enable-nonfree --enable-vda
  libavutil      55. 61.100 / 55. 61.100
  libavcodec     57. 93.100 / 57. 93.100
  libavformat    57. 72.101 / 57. 72.101
  libavdevice    57.  7.100 / 57.  7.100
  libavfilter     6. 88.100 /  6. 88.100
  libavresample   3.  6.  0 /  3.  6.  0
  libswscale      4.  7.101 /  4.  7.101
  libswresample   2.  8.100 /  2.  8.100
  libpostproc    54.  6.100 / 54.  6.100
Splitting the commandline.
Reading option '-report' ... matched as option 'report' (generate a report) with argument '1'.
Reading option '-codec:a' ... matched as option 'codec' (codec name) with argument 'libvorbis'.
Reading option '-i' ... matched as input url with argument 'silence.ogg'.
Reading option '-f' ... matched as option 'f' (force format) with argument 's16le'.
Reading option '-codec:a' ... matched as option 'codec' (codec name) with argument 'pcm_s16le'.
Reading option 'silence.libvorbis.ffmpeg.raw' ... matched as output url.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option report (generate a report) with argument 1.
Successfully parsed a group of options.
Parsing a group of options: input url silence.ogg.
Applying option codec:a (codec name) with argument libvorbis.
Successfully parsed a group of options.
Opening an input file: silence.ogg.
[file @ 0x7fe038e00520] Setting default whitelist 'file,crypto'
[ogg @ 0x7fe039801000] Format ogg probed with size=2048 and score=100
[ogg @ 0x7fe039801000] Before avformat_find_stream_info() pos: 4025 bytes read:4025 seeks:0 nb_streams:1
[ogg @ 0x7fe039801000] All info found
[ogg @ 0x7fe039801000] After avformat_find_stream_info() pos: 4025 bytes read:4025 seeks:0 frames:1
Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, ogg, from 'silence.ogg':
  Duration: 00:00:00.00, start: 0.000000, bitrate: 11362 kb/s
    Stream #0:0, 1, 1/44100: Audio: vorbis, 44100 Hz, stereo, s16, 112 kb/s
Successfully opened the file.
Parsing a group of options: output url silence.libvorbis.ffmpeg.raw.
Applying option f (force format) with argument s16le.
Applying option codec:a (codec name) with argument pcm_s16le.
Successfully parsed a group of options.
Opening an output file: silence.libvorbis.ffmpeg.raw.
[file @ 0x7fe038e020e0] Setting default whitelist 'file,crypto'
Successfully opened the file.
Stream mapping:
  Stream #0:0 -> #0:0 (vorbis (libvorbis) -> pcm_s16le (native))
Press [q] to stop, [?] for help
cur_dts is invalid (this is harmless if it occurs once at the start per stream)
cur_dts is invalid (this is harmless if it occurs once at the start per stream)
detected 4 logical cores
[graph_0_in_0_0 @ 0x7fe038f002a0] Setting 'time_base' to value '1/44100'
[graph_0_in_0_0 @ 0x7fe038f002a0] Setting 'sample_rate' to value '44100'
[graph_0_in_0_0 @ 0x7fe038f002a0] Setting 'sample_fmt' to value 's16'
[graph_0_in_0_0 @ 0x7fe038f002a0] Setting 'channel_layout' to value '0x3'
[graph_0_in_0_0 @ 0x7fe038f002a0] tb:1/44100 samplefmt:s16 samplerate:44100 chlayout:0x3
[format_out_0_0 @ 0x7fe038f00840] Setting 'sample_fmts' to value 's16'
[AVFilterGraph @ 0x7fe038e0bf20] query_formats: 4 queried, 9 merged, 0 already done, 0 delayed
Output #0, s16le, to 'silence.libvorbis.ffmpeg.raw':
  Metadata:
    encoder         : Lavf57.72.101
    Stream #0:0, 0, 1/44100: Audio: pcm_s16le, 44100 Hz, stereo, s16, 1411 kb/s
    Metadata:
      encoder         : Lavc57.93.100 pcm_s16le
No more output streams to write to, finishing.
size=       2kB time=00:00:00.01 bitrate=1154.6kbits/s speed=21.3x    
video:0kB audio:2kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000000%
Input file #0 (silence.ogg):
  Input stream #0:0 (audio): 2 packets read (2 bytes); 1 frames decoded (576 samples); 
  Total: 2 packets (2 bytes) demuxed
Output file #0 (silence.libvorbis.ffmpeg.raw):
  Output stream #0:0 (audio): 1 frames encoded (576 samples); 1 packets muxed (2304 bytes); 
  Total: 1 packets (2304 bytes) muxed
1 frames successfully decoded, 0 decoding errors
[AVIOContext @ 0x7fe038e02160] Statistics: 0 seeks, 1 writeouts
[AVIOContext @ 0x7fe038e00660] Statistics: 4025 bytes read, 0 seeks

-report output for native decode invocation:

ffmpeg started on 2017-05-03 at 08:22:13
Report written to "ffmpeg-20170503-082213.log"
Command line:
ffmpeg -report -i silence.ogg -f s16le -codec:a pcm_s16le silence.native.ffmpeg.raw
ffmpeg version git-2017-05-02-20da413 Copyright (c) 2000-2017 the FFmpeg developers
  built with Apple LLVM version 8.1.0 (clang-802.0.42)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/HEAD-20da413 --enable-shared --enable-pthreads --enable-gpl --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-libxvid --enable-opencl --disable-lzma --enable-nonfree --enable-vda
  libavutil      55. 61.100 / 55. 61.100
  libavcodec     57. 93.100 / 57. 93.100
  libavformat    57. 72.101 / 57. 72.101
  libavdevice    57.  7.100 / 57.  7.100
  libavfilter     6. 88.100 /  6. 88.100
  libavresample   3.  6.  0 /  3.  6.  0
  libswscale      4.  7.101 /  4.  7.101
  libswresample   2.  8.100 /  2.  8.100
  libpostproc    54.  6.100 / 54.  6.100
Splitting the commandline.
Reading option '-report' ... matched as option 'report' (generate a report) with argument '1'.
Reading option '-i' ... matched as input url with argument 'silence.ogg'.
Reading option '-f' ... matched as option 'f' (force format) with argument 's16le'.
Reading option '-codec:a' ... matched as option 'codec' (codec name) with argument 'pcm_s16le'.
Reading option 'silence.native.ffmpeg.raw' ... matched as output url.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option report (generate a report) with argument 1.
Successfully parsed a group of options.
Parsing a group of options: input url silence.ogg.
Successfully parsed a group of options.
Opening an input file: silence.ogg.
[file @ 0x7fadd4d0f580] Setting default whitelist 'file,crypto'
[ogg @ 0x7fadd5010200] Format ogg probed with size=2048 and score=100
[ogg @ 0x7fadd5010200] Before avformat_find_stream_info() pos: 4025 bytes read:4025 seeks:0 nb_streams:1
[ogg @ 0x7fadd5010200] All info found
[ogg @ 0x7fadd5010200] After avformat_find_stream_info() pos: 4025 bytes read:4025 seeks:0 frames:1
Input #0, ogg, from 'silence.ogg':
  Duration: 00:00:00.00, start: 0.000000, bitrate: 11362 kb/s
    Stream #0:0, 1, 1/44100: Audio: vorbis, 44100 Hz, stereo, fltp, 112 kb/s
Successfully opened the file.
Parsing a group of options: output url silence.native.ffmpeg.raw.
Applying option f (force format) with argument s16le.
Applying option codec:a (codec name) with argument pcm_s16le.
Successfully parsed a group of options.
Opening an output file: silence.native.ffmpeg.raw.
[file @ 0x7fadd4d10540] Setting default whitelist 'file,crypto'
Successfully opened the file.
Stream mapping:
  Stream #0:0 -> #0:0 (vorbis (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
cur_dts is invalid (this is harmless if it occurs once at the start per stream)
cur_dts is invalid (this is harmless if it occurs once at the start per stream)
detected 4 logical cores
[graph_0_in_0_0 @ 0x7fadd4d13ee0] Setting 'time_base' to value '1/44100'
[graph_0_in_0_0 @ 0x7fadd4d13ee0] Setting 'sample_rate' to value '44100'
[graph_0_in_0_0 @ 0x7fadd4d13ee0] Setting 'sample_fmt' to value 'fltp'
[graph_0_in_0_0 @ 0x7fadd4d13ee0] Setting 'channel_layout' to value '0x3'
[graph_0_in_0_0 @ 0x7fadd4d13ee0] tb:1/44100 samplefmt:fltp samplerate:44100 chlayout:0x3
[format_out_0_0 @ 0x7fadd4d14480] Setting 'sample_fmts' to value 's16'
[format_out_0_0 @ 0x7fadd4d14480] auto-inserting filter 'auto_resampler_0' between the filter 'Parsed_anull_0' and the filter 'format_out_0_0'
[AVFilterGraph @ 0x7fadd4d13980] query_formats: 4 queried, 6 merged, 3 already done, 0 delayed
[auto_resampler_0 @ 0x7fadd4d14c80] [SWR @ 0x7fadd506f400] Using fltp internally between filters
[auto_resampler_0 @ 0x7fadd4d14c80] ch:2 chl:stereo fmt:fltp r:44100Hz -> ch:2 chl:stereo fmt:s16 r:44100Hz
Output #0, s16le, to 'silence.native.ffmpeg.raw':
  Metadata:
    encoder         : Lavf57.72.101
    Stream #0:0, 0, 1/44100: Audio: pcm_s16le, 44100 Hz, stereo, s16, 1411 kb/s
    Metadata:
      encoder         : Lavc57.93.100 pcm_s16le
No more output streams to write to, finishing.
size=       2kB time=00:00:00.01 bitrate=1154.6kbits/s speed=19.1x    
video:0kB audio:2kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000000%
Input file #0 (silence.ogg):
  Input stream #0:0 (audio): 2 packets read (2 bytes); 1 frames decoded (576 samples); 
  Total: 2 packets (2 bytes) demuxed
Output file #0 (silence.native.ffmpeg.raw):
  Output stream #0:0 (audio): 1 frames encoded (576 samples); 1 packets muxed (2304 bytes); 
  Total: 1 packets (2304 bytes) muxed
1 frames successfully decoded, 0 decoding errors
[AVIOContext @ 0x7fadd4d10600] Statistics: 0 seeks, 1 writeouts
[AVIOContext @ 0x7fadd4d0f6c0] Statistics: 4025 bytes read, 0 seeks
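
Both reports state the decoded length directly in the line "1 frames decoded (576 samples)". As a rough sketch, that number can be pulled out of a report file with sed; `decoded_samples` is a hypothetical helper name, and the pattern assumes the log wording shown in the reports above.

```shell
# decoded_samples REPORT_FILE (hypothetical helper)
# Print the per-channel sample count from the "frames decoded (N samples)"
# line of an ffmpeg -report log.
decoded_samples() {
    sed -n 's/.*frames decoded (\([0-9][0-9]*\) samples).*/\1/p' "$1"
}
```

For the reports above, `decoded_samples ffmpeg-20170503-083015.log` would print 576, which matches the 2304-byte output (576 frames * 4 bytes per frame).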

Change History (2)

comment:1 Changed 18 months ago by cehoyos

  • Keywords ogg vorbis added
  • Reproduced by developer set
  • Status changed from new to open
  • Version changed from unspecified to git-master

Not a regression afaict.

comment:2 Changed 4 weeks ago by evesira

Any update on this? I feel that this is important.

This issue makes it impossible to seamlessly and correctly loop layered, synced ogg tracks of exactly the same length once they have been decoded by ffmpeg. Vorbis was designed to be capable of sample-perfect looping, so reconstructing the file exactly is both possible and necessary.
