Opened 2 months ago

Last modified 8 weeks ago

#8083 new defect

Matroska demuxer fails to parse big attachments

Reported by: Zenitram
Owned by:
Priority: normal
Component: avformat
Version: git-master
Keywords: mkv
Cc:
Blocked By:
Blocking:
Reproduced by developer: yes
Analyzed by developer: yes

Description

Summary of the bug:

If an attachment larger than 256 MiB is present, the FFmpeg demuxer treats it as a transmission error: it reports an error even though there is none (AFAIK, a length > 0x10000000 for the element with ID 0x465C is not invalid) and, in practice, tries to resync inside the attachment.
Big attachments may look crazy, but the Matroska spec explicitly mentions e.g. "error recovery files", which may be big, and AFAIK the spec does not limit the attachment size, so IMO this is a valid file (besides, FFmpeg creates the file, and FFmpeg should be able to read the files it creates).

How to reproduce:

Notes:
The first FFmpeg command just quickly creates a file > 256 MiB; you can replace it with a 257 MiB file full of zeroes if you prefer to avoid the bad resync inside the attachment during parsing.
The second FFmpeg command can be any command containing the -attach part.
Full dumps of the first and second FFmpeg commands, as well as the libx264 logs, are not relevant and are omitted to keep the report short.
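
The zero-filled replacement file mentioned in the notes could be produced with something like the following. This is a minimal sketch, not part of the original report: the helper name write_zero_file and its parameters are hypothetical, and it writes in bounded chunks so the script itself never holds the whole file in memory.

```python
import os

MIB = 1024 * 1024


def write_zero_file(path, size_bytes, chunk_bytes=MIB):
    """Write size_bytes zero bytes to path in bounded chunks.

    Returns the resulting file size as reported by the filesystem.
    (Hypothetical helper, not part of FFmpeg.)
    """
    chunk = bytes(chunk_bytes)  # a reusable buffer of zero bytes
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(remaining, chunk_bytes)
            f.write(chunk[:n])
            remaining -= n
    return os.path.getsize(path)
```

Calling write_zero_file("attachment.mkv", 257 * MIB) would give a file just over the 256 MiB (0x10000000) threshold that can be passed to -attach in the second command.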

ffmpeg -f lavfi -i mandelbrot -t 40 -c:v ffv1 attachment.mkv
ffmpeg -f lavfi -i mandelbrot -t 0.040 -c:v ffv1 -attach attachment.mkv -metadata:s:1 mimetype=application/octet-stream output.mkv
./ffmpeg -i output.mkv output2.mkv
ffmpeg version N-94563-g3aeb681f07 Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 9.1.1 (GCC) 20190807
  configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libdav1d --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-amf --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth --enable-libopenmpt
  libavutil      56. 33.100 / 56. 33.100
  libavcodec     58. 55.100 / 58. 55.100
  libavformat    58. 30.100 / 58. 30.100
  libavdevice    58.  9.100 / 58.  9.100
  libavfilter     7. 58.100 /  7. 58.100
  libswscale      5.  6.100 /  5.  6.100
  libswresample   3.  6.100 /  3.  6.100
  libpostproc    55.  6.100 / 55.  6.100
[matroska,webm @ 00000173da0d9040] Invalid length 0x149af295 > 0x10000000 for element with ID 0x465C at 0x22b
[matroska,webm @ 00000173da0d9040] Duplicate element
[matroska,webm @ 00000173da0d9040] 0x00 at pos 178 (0xb2) invalid as first byte of an EBML number
[matroska,webm @ 00000173da0d9040] Duplicate element
[matroska,webm @ 00000173da0d9040] Element at 0x59 ending at 0x354dbb8c53abe6 exceeds containing master element ending at 0x1405
[matroska,webm @ 00000173da0d9040] Duplicate element
[matroska,webm @ 00000173da0d9040] Element at 0x68 ending at 0xb24dbb8e53abf5 exceeds containing master element ending at 0x1414
[matroska,webm @ 00000173da0d9040] Element at 0x77 ending at 0x49af516 exceeds containing master element ending at 0x1423
[matroska,webm @ 00000173da0d9040] Element at 0x88 ending at 0x49e447e exceeds containing master element ending at 0x1434
[matroska,webm @ 00000173da0d9040] Invalid length 0x149af295 > 0x10000000 for element with ID 0x465C at 0x22b
[matroska,webm @ 00000173da0d9040] Duplicate element
[matroska,webm @ 00000173da0d9040] 0x00 at pos 740 (0x2e4) invalid as first byte of an EBML number
[matroska,webm @ 00000173da0d9040] Duplicate element
[matroska,webm @ 00000173da0d9040] Element at 0x28b ending at 0x354dbb8c53ae18 exceeds containing master element ending at 0x1637
[matroska,webm @ 00000173da0d9040] Duplicate element
[matroska,webm @ 00000173da0d9040] Element at 0x29a ending at 0xb24dbb8e53ae27 exceeds containing master element ending at 0x1646
[matroska,webm @ 00000173da0d9040] Duplicate element
[matroska,webm @ 00000173da0d9040] Element at 0x2a9 ending at 0x49aee7b exceeds containing master element ending at 0x1655
[matroska,webm @ 00000173da0d9040] Duplicate element
    Last message repeated 2 times
[matroska,webm @ 00000173da0d9040] incomplete attachment
    Last message repeated 1 times
[matroska,webm @ 00000173da0d9040] Could not find codec parameters for stream 1 (Video: ffv1 (FFV1 / 0x31564646), none, 640x480): unspecified pixel format
Consider increasing the value for the 'analyzeduration' and 'probesize' options
[matroska,webm @ 00000173da0d9040] Could not find codec parameters for stream 2 (Video: ffv1 (FFV1 / 0x31564646), none, 640x480): unspecified pixel format
Consider increasing the value for the 'analyzeduration' and 'probesize' options
Input #0, matroska,webm, from 'output.mkv':
  Metadata:
    ENCODER         : Lavf58.30.100
  Duration: 00:00:40.00, start: 0.000000, bitrate: 69183 kb/s
    Stream #0:0: Video: ffv1 (FFV1 / 0x31564646), bgr0, 640x480, SAR 1:1 DAR 4:3, 25 fps, 25 tbr, 1k tbn, 1k tbc (default)
    Metadata:
      ENCODER         : Lavc58.55.100 ffv1
      DURATION        : 00:00:40.000000000
    Stream #0:1: Video: ffv1 (FFV1 / 0x31564646), none, 640x480, SAR 1:1 DAR 4:3, 25 fps, 25 tbr, 1k tbn, 1k tbc (default)
    Metadata:
      ENCODER         : Lavc58.55.100 ffv1
      DURATION        : 00:00:40.000000000
    Stream #0:2: Video: ffv1 (FFV1 / 0x31564646), none, 640x480, SAR 1:1 DAR 4:3, 25 fps, 25 tbr, 1k tbn, 1k tbc (default)
    Metadata:
      ENCODER         : Lavc58.55.100 ffv1
      DURATION        : 00:00:40.000000000
Stream mapping:
  Stream #0:0 -> #0:0 (ffv1 (native) -> h264 (libx264))
Press [q] to stop, [?] for help
[...]
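
For readers unfamiliar with EBML, the first log line can be checked by hand. The sketch below decodes an EBML variable-length size and compares it with the 0x10000000 limit quoted in the message. The ID 0x465C and both hex values come from the log above; the five-byte encoding used in the usage note is an assumption (the report does not show the raw bytes), and read_ebml_size is a hypothetical helper, not FFmpeg's parser. Unknown-length (all-ones) sizes are not handled.

```python
MIB = 1024 * 1024
MAX_BINARY_LENGTH = 0x10000000  # limit quoted in the "Invalid length" message


def read_ebml_size(data, pos=0):
    """Decode an EBML variable-length size starting at data[pos].

    Returns (value, number_of_bytes_used).
    """
    first = data[pos]
    if first == 0:
        # Same condition as the "invalid as first byte" message in the log.
        raise ValueError("0x00 is invalid as first byte of an EBML number")
    # The position of the highest set bit gives the total width in bytes.
    width = 9 - first.bit_length()
    if pos + width > len(data):
        raise ValueError("truncated EBML number")
    # Strip the width marker bit, then append the remaining bytes.
    value = first & ((0x80 >> (width - 1)) - 1)
    for b in data[pos + 1 : pos + width]:
        value = (value << 8) | b
    return value, width


def exceeds_limit(length, limit=MAX_BINARY_LENGTH):
    """True if a binary element of this length would be rejected."""
    return length > limit
```

With the assumed encoding bytes([0x08, 0x14, 0x9A, 0xF2, 0x95]), read_ebml_size returns the length 0x149af295 from the log, i.e. 345698965 bytes or about 330 MiB, which exceeds_limit flags because 0x10000000 is exactly 256 MiB.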

Reference to initial bug report (from the tool creating the big attachment):
https://github.com/MediaArea/RAWcooked/issues/244

Change History (8)

comment:1 Changed 2 months ago by mkver

  • Analyzed by developer set
  • Reproduced by developer set

This is because of the restrictions in max_lengths (line 1174 in libavformat/matroskadec.c). These exist so that (potentially intentionally) damaged data does not cause arbitrarily big allocations.
Is there any limit on the size of the attachments that you intend to store in Matroska?
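
The idea behind such a table can be illustrated as follows. This is a simplified sketch, not the code in matroskadec.c: the names EbmlType, MAX_LENGTHS, check_length and read_payload are hypothetical, and only the 0x10000000 limit for binary data is taken from the log in this ticket (the 8-byte limit for integers follows from the EBML integer format). The length is validated before any payload is read, so a damaged header claiming a huge size never triggers a huge allocation.

```python
import io
from enum import Enum


class EbmlType(Enum):
    UINT = "uint"
    BINARY = "binary"
    MASTER = "master"


# Per-type upper bounds; types missing from the table are unlimited.
MAX_LENGTHS = {
    EbmlType.UINT: 8,             # EBML integers are at most 8 bytes
    EbmlType.BINARY: 0x10000000,  # 256 MiB, the value shown in the log
}


def check_length(elem_type, length):
    """Return True if length is acceptable for this element type."""
    limit = MAX_LENGTHS.get(elem_type)
    return limit is None or length <= limit


def read_payload(stream, elem_type, length):
    """Read an element payload, refusing oversized lengths up front."""
    if not check_length(elem_type, length):
        raise ValueError(
            f"Invalid length {length:#x} > {MAX_LENGTHS[elem_type]:#x}"
        )
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("element payload is truncated")
    return data
```

With this table, the attachment from the report (binary, length 0x149af295) is rejected by check_length, which is the behaviour the ticket is about.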

comment:2 Changed 2 months ago by Zenitram

> Is there any limit on the size of the attachments that you intend to store in Matroska?

I actually did not plan to have such a big attachment; the size is due to messy input (invalid? I would say yes, because the content is not parsable by any spec-conformant DPX parser; it is just "random" content appended to the end of the DPX files), and I need to store this content in the MKV file for the purpose of the tool.
So there is effectively no limit (if I have 1 MB of real DPX content + 1 GB of garbage, I need to store this 1 GB per DPX file in the resulting MKV).

> These exist so that (potentially intentionally) damaged data does not cause arbitrarily big allocations.

I see two cases to handle:

  • a legitimate big element, which should not make FFmpeg allocate a large amount of RAM; in that case, the element should be skipped with a warning (not an error) and the stream synced to the next expected element (rather than trying to sync inside the "bad" element, as is currently done).
  • damaged data; I totally understand the idea of trying to sync inside the "bad" element, but here I don't see a good method for detecting damaged data (the FileData element should not be bigger than the enclosing Attachments element, but I guess that this test is already done).

So I suggest removing MATROSKA_ID_FILEDATA from the list of tests. This element occurs only at the beginning of the file, so dropping the test should not hurt much here; and if the MATROSKA_ID_FILEDATA element is too big, we skip it without allocating RAM.

Note: MATROSKA_ID_BLOCKADDITIONAL could also legitimately be big (there is no limit on opaque data), but IMO that is more problematic because it can occur in every block.
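
The behaviour proposed in this comment could look roughly like the sketch below. It is an illustration only, not a patch: handle_binary_element is a hypothetical function, the ID 0x465C and the 0x10000000 limit are taken from the log in the description, and a seekable input is assumed. An oversized FileData element is skipped with a warning and without reading its payload, while any other oversized binary element is still treated as an error.

```python
import io
import warnings

FILEDATA_ID = 0x465C            # element ID from the log in the description
MAX_BINARY_LENGTH = 0x10000000  # 256 MiB limit from the same log line


def handle_binary_element(stream, elem_id, length, limit=MAX_BINARY_LENGTH):
    """Return the payload, or None if an oversized FileData was skipped.

    The stream must be positioned at the start of the payload and be
    seekable (an assumption of this sketch).
    """
    if length <= limit:
        return stream.read(length)
    if elem_id == FILEDATA_ID:
        warnings.warn(
            f"skipping attachment of {length:#x} bytes (limit {limit:#x})"
        )
        # Jump to the next expected element without allocating the payload.
        stream.seek(length, io.SEEK_CUR)
        return None
    raise ValueError(
        f"Invalid length {length:#x} > {limit:#x} "
        f"for element with ID {elem_id:#X}"
    )
```

After a skip, the stream sits at the first byte following the attachment, so parsing resumes at the next element instead of resyncing inside the attachment data.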

comment:3 Changed 2 months ago by mkver

I'll think about how to implement this. There is one thing that unfortunately complicates the situation: If the input is not seekable, skipping gigantic elements would be permanent.

PS: That elements are not allowed to exceed their parent master element is of course already checked.
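
The complication with non-seekable input can be made concrete with a small sketch. The names skip_bytes and ForwardOnly are hypothetical; ForwardOnly merely stands in for a pipe or network stream. On seekable input a skip is a cheap position change that can be undone later; on forward-only input the bytes have to be read and discarded, and once that is done they cannot be revisited.

```python
import io


class ForwardOnly(io.BytesIO):
    """Stand-in for a pipe: readable, but neither seekable nor rewindable."""

    def seekable(self):
        return False

    def seek(self, *args, **kwargs):
        raise io.UnsupportedOperation("seek")


def skip_bytes(stream, count, chunk_size=64 * 1024):
    """Advance stream by count bytes.

    Returns True if the skipped data can still be revisited (seekable
    input) and False if it was consumed for good.
    """
    if stream.seekable():
        stream.seek(count, io.SEEK_CUR)
        return True
    remaining = count
    while remaining > 0:
        # Read in bounded chunks so memory use stays small.
        chunk = stream.read(min(remaining, chunk_size))
        if not chunk:
            raise EOFError("input ended inside the skipped element")
        remaining -= len(chunk)
    return False
```

A False return value is the "permanent" case described above: if the skipped element later turns out to have been a damaged header rather than a genuine big attachment, there is no way to go back and resync inside it.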

comment:4 Changed 2 months ago by cehoyos

  • Keywords mkv added

comment:5 follow-up: Changed 2 months ago by marillat

Apparently this bug also happens with smaller files, with the same error message:

https://github.com/HandBrake/HandBrake/issues/2248

See the HandBrake log file in

https://github.com/HandBrake/HandBrake/files/3511850/HB-20190801141537-vs-HB-20190816180347.logs.zip

comment:6 in reply to: ↑ 5 Changed 2 months ago by Zenitram

Replying to marillat:

> Apparently this bug happens with more small files and with the same error message

The "exceeds containing master element ending at" error without an initial "Invalid length" error is a different issue; in this ticket, the "exceeds containing master element ending at" error is just a consequence of the first error.

The "exceeds containing master element ending at" error on its own is fixed in the latest FFmpeg build AFAIK (I tested the command line in this post that reproduces the error, and it works with FFmpeg git-master).

comment:7 follow-up: Changed 2 months ago by marillat

The right bug is #8084 (originally written as #8804), and it is already fixed in the 4.2 branch.

Last edited 8 weeks ago by cehoyos

comment:8 in reply to: ↑ 7 Changed 8 weeks ago by robUx4

Replying to marillat:

> The right bug is #8804 and already fixed in 4.2 branch

There isn't such a bug yet. Did you mean another number?
