Opened 4 weeks ago

Last modified 31 hours ago

#6775 new defect

gst-libav can no longer decode video (3.4 regression)

Reported by: heftig Owned by:
Priority: critical Component: avcodec
Version: git-master Keywords: regression
Cc: jcowgill+ffmpeg@jcowgill.uk, dominik@greysector.net, negativo17@gmail.com Blocked By:
Blocking: Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:

gst-libav attempts to drain the decoder (by feeding avcodec_decode_video2 empty packets) before every discontinuity in the stream. The compat implementation used since 3.4 doesn't seem to handle this case correctly. It does not consume any data from the following (real) packets and further calls echo a message: "Got unexpected packet size after a partial decode"

Since the start of the stream is also considered a discontinuity, the decode can't even start. Avoiding draining at the start of the stream doesn't solve the problem for any further discontinuity.

Downstream bug: https://bugzilla.gnome.org/show_bug.cgi?id=789193

How to reproduce:

gst-launch-1.0 videotestsrc ! x264enc ! rtph264pay ! identity drop-probability=0.1 ! rtph264depay ! h264parse ! avdec_h264 ! videoconvert ! autovideosink

This creates a h264 stream which is missing 10% of frames.

(1) With FFmpeg 3.3, it manages a best-effort (glitchy) decode.

(2) With FFmpeg 3.4, it does not start decoding at all.

(3) Reverting gst-libav commit 67e55e47 (or inserting a call to avcodec_flush_buffers after draining) allows decoding to start, but causes the decoder state to be cleared after each missing frame, creating much worse output than (1).

Change History (15)

comment:1 Changed 4 weeks ago by heftig

Seems to be related to #6446.

comment:2 Changed 4 weeks ago by jcowgill

  • Cc jcowgill+ffmpeg@jcowgill.uk added

comment:3 Changed 4 weeks ago by cehoyos

Please do a bisect to find the FFmpeg change that introduced the issue (and if possible please verify if the same change broke Kodi).

comment:4 Changed 4 weeks ago by cehoyos

And please confirm that the issue is reproducible with current FFmpeg git head.

comment:5 Changed 4 weeks ago by cehoyos

  • Priority changed from normal to critical

comment:6 Changed 4 weeks ago by heftig

The first bad commit is bddb2343b6e594e312dadb5d21b408702929ae04 (which merges 061a0c14bb "decode: restructure the core decoding code").

Couldn't test master as gst-libav does not build against it (yet? Problem seems to be renamed defines; would attempt a port if time allows).

comment:7 Changed 3 weeks ago by rathann

  • Cc dominik@greysector.net added

comment:8 Changed 3 weeks ago by heleppkes

Requiring avcodec_flush_buffers after draining is an expected requirement in the decode API. There is no way to otherwise signal that the draining is done.

comment:9 follow-ups: Changed 2 weeks ago by jcowgill

I think Kodi 17 is also affected by this (for people who don't use Kodi's bundled ffmpeg).
See: https://bugs.gentoo.org/634802

I have this hack which forces the buffers to be flushed after a drain has completed. It allows gst-libav and kodi to work with ffmpeg 3.4, but I expect it has some side effect I haven't thought of :) It should have the same effect as doing (3) above.

--- a/libavcodec/decode.c
+++ b/libavcodec/decode.c
@@ -780,6 +780,10 @@ static int compat_decode(AVCodecContext *avctx, AVFrame *frame,
     while (ret >= 0) {
         ret = avcodec_receive_frame(avctx, frame);
         if (ret < 0) {
+            if (ret == AVERROR_EOF) {
+                av_assert0(avci->draining);
+                avcodec_flush_buffers(avctx);
+            }
             if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF)
                 ret = 0;
             goto finish;

I think there is a misunderstanding here about what a NULL packet sent via avcodec_video_decode2 actually does, probably because the documentation is a bit lacking in this area. The documentation says you need a NULL packet at the end of a stream, but says nothing about NULL packets sent at any other time. There are at least a few comments I found in the ffmpeg source which imply that sending a NULL packet means EOF (as opposed to draining, and allowing extra data later). Clearly the developer who wrote the compat code in 3.4 thought this as well.

Maybe I'm missing something obvious, but why does gst-libav need to drain the codec when a packet is dropped?

comment:10 Changed 2 weeks ago by cehoyos

  • Version set to git-master

comment:11 Changed 11 days ago by cehoyos

Please send your patch to the development mailing list for discussion.

comment:12 Changed 34 hours ago by slaanesh

  • Cc negativo17@gmail.com added

comment:13 in reply to: ↑ 9 ; follow-up: Changed 33 hours ago by cehoyos

Replying to jcowgill:

I have this hack which forces the buffers to be flushed after a drain has completed. It allows gst-libav and kodi to work with ffmpeg 3.4, but I expect it has some side effect I haven't thought of :) It should have the same effect as doing (3) above.

Does the patch completely break error concealment?

comment:14 in reply to: ↑ 9 Changed 33 hours ago by heleppkes

Replying to jcowgill:

I think there is a misunderstanding here about what a NULL packet sent via avcodec_video_decode2 actually does, probably because the documentation is a bit lacking in this area. The documentation says you need a NULL packet at the end of a stream, but says nothing about NULL packets sent at any other time. There are at least a few comments I found in the ffmpeg source which imply that sending a NULL packet means EOF (as opposed to draining, and allowing extra data later). Clearly the developer who wrote the compat code in 3.4 thought this as well.

The key there is that nowhere in the source or the docs it does say you can send a NULL packet at any other time then EOF. All mentions of NULL drain packets include a wording similar to "at the end" (of the stream) etc.

So while it may have worked somewhat in the past (there are reports of this usage actually crashing with some decoders, though), the documentation never actually told you that this was a valid call, or suggested to do so in any way.

"Fixing" this by flushing after every drain will likely cause further issues, since a flush also discards any reference frames and the likes.

Did we ever get an answer *why* draining all the time is being performed? I don't see the need for that in any circumstances.

Last edited 33 hours ago by heleppkes (previous) (diff)

comment:15 in reply to: ↑ 13 Changed 31 hours ago by jcowgill

Replying to cehoyos:

Replying to jcowgill:

I have this hack which forces the buffers to be flushed after a drain has completed. It allows gst-libav and kodi to work with ffmpeg 3.4, but I expect it has some side effect I haven't thought of :) It should have the same effect as doing (3) above.

Does the patch completely break error concealment?

Yes in the case of gst-libav. It is however an improvement over the current situation where nothing works at all. I'm not intending for this to be a long term solution - I am hoping gst-libav will fix their code so workarounds don't have to be used.

Note: See TracTickets for help on using tickets.