Opened 12 years ago

Closed 12 years ago

Last modified 12 years ago

#958 closed defect (fixed)

matroska seek problem file (3 bugs)

Reported by: DonMoir Owned by:
Priority: normal Component: undetermined
Version: git-master Keywords: mkv h264 seek
Cc: Blocked By:
Blocking: Reproduced by developer: yes
Analyzed by developer: no

Description

In reference to libavformat/matroskadec.c

Mostly, when you seek on a matroska file, matroska_read_seek will create an index table that is appropriate for seeking. With this particular file it does not create the index table.

http://sms.pangolin.com/temp/bad_seek_h264_inf-rtwm_sample.zip

Handling the creation of the seek index table is done with first 3 lines of code in matroska_read_seek (Parse the CUES). This is just done once on the first seek.

static int matroska_read_seek(AVFormatContext *s, int stream_index,
                              int64_t timestamp, int flags)
{
    .....
    /* Parse the CUES now since we need the index data to seek. */
    if (matroska->cues_parsing_deferred) {
        matroska_parse_cues(matroska);
        matroska->cues_parsing_deferred = 0;
    }

    if (!st->nb_index_entries)
        return 0;
    ....
}

matroska_parse_cues calls matroska_parse_seekhead_entry which ends up calling ebml_parse. ebml_parse calls ebml_read_num which fails because avio_r8(pb) returns 0.

static int ebml_read_num(MatroskaDemuxContext *matroska, AVIOContext *pb,
                         int max_size, uint64_t *number)
{
    int read = 1, n = 1;
    uint64_t total = 0;

    ....

    /* The first byte tells us the length in bytes - avio_r8() can normally
     * return 0, but since that's not a valid first ebmlID byte, we can
     * use it safely here to catch EOS. */
    if (!(total = avio_r8(pb))) {
        /* we might encounter EOS here */
        if (!url_feof(pb)) {
            int64_t pos = avio_tell(pb);
            av_log(matroska->ctx, AV_LOG_ERROR,
                   "Read error at pos. %"PRIu64" (0x%"PRIx64")\n",
                   pos, pos);
        }
        return AVERROR(EIO); /* EOS or actual I/O error */
    }
    ....
}

(bug 1) Since it thinks if failed, no index entries will be built, but 0 appears to be a valid return value for this particular file. If I just let it proceed with total == 0, it creates the index entries normally. So I believe this is a bug in the way it's checking for 0 here and failing because of it.

Also note this code in matroska_read_seek:

    if (!st->nb_index_entries)
        return 0;

(bug 2) If that code is executed, no seeking has been done, yet it returns 0 instead of -1. A 0 return value for read_seek means success and -1 means failure. I have not run into a case where this is a problem though because all the files I have, have a least one index_entry. But this is also a bug I believe.

ffmpeg -ss gives no useful results. There are no errors reported. (It does get a read error at very end but it's not relevant to the problem)

ffplay is the best way for you to see the problem because we know it doesn't do any special handling and just calls avformat_seek_file and uses the results of that.

If you use:

ffplay -ss 24 bad_seek_h264_inf-rtwm_sample.mkv

You will notice it always starts from the beginning. This is because matroska_read_seek has failed to create the index entries for the above file.

This is kind of a tough work around to deal with. I have to be able to recognize the problem and then create the index entries using a read packet method if the problem exist. You don't want to have to do this. It's a pain to recognize but mostly the read packet method is slower. If I do recognize there is a problem, 2 seeks may be required instead of one.

Make sure you use ffplay to test it. Other programs may do some preprocessing and that will change the results.

The following doesn't show much in the way of useful results. No errors etc. (bug 3) But you can see it has 5.1 audio and as a side note ffplay/SDL fails to play back the center speech channel on my machine. That is, it does not convert the 5.1 to stereo correctly.

ffplay -ss 24 bad_seek_h264_inf-rtwm_sample.mkv

ffplay version N-37063-g14d94a1 Copyright (c) 2003-2012 the FFmpeg developers

built on Jan 23 2012 17:40:00 with gcc 4.6.2
configuration:
--disable-static --enable-shared --enable-gpl --enable-version3
--disable-w32threads --enable-runtime-cpudetect --enable-avisynth
--enable-bzlib --enable-frei0r --enable-libopencore-amrnb
--enable-libopencore-amrwb --enable-libfreetype --enable-libgsm
--enable-libmp3lame --enable-libopenjpeg --enable-librtmp
--enable-libschroedinger --enable-libspeex --enable-libtheora
--enable-libvo-aacenc --enable-libvo-amrwbenc --enable-libvorbis
--enable-libvpx --enable-libx264 --enable-libxavs --enable-libxvid
--enable-zlib
libavutil 51. 34.101 / 51. 34.101
libavcodec 53. 57.105 / 53. 57.105
libavformat 53. 30.100 / 53. 30.100
libavdevice 53. 4.100 / 53. 4.100
libavfilter 2. 59.101 / 2. 59.101
libswscale 2. 1.100 / 2. 1.100
libswresample 0. 6.100 / 0. 6.100
libpostproc 52. 0.100 / 52. 0.100

Input #0, matroska,webm, from 'c:\flashfiles\movies\bad_seek_h264_inf-rtwm_sample.mkv':

Duration: 00:01:04.77, start: 0.000000, bitrate: 6406 kb/s

Stream #0:0: Video: h264 (High), yuv420p, 1280x528, SAR 1:1 DAR 80:33, 23.98

fps, 23.98 tbr, 1k tbn, 47.95 tbc (default)

Stream #0:1(eng): Audio: dts (DTS), 48000 Hz, 5.1(side), s16, 1536 kb/s (default)
Stream #0:2(eng): Subtitle: text (default)

12.76 A-V: -0.001 fd= 44 aq= 320KB vq= 1058KB sq= 0B f=0/0 f=0/0


Change History (11)

comment:1 by Carl Eugen Hoyos, 12 years ago

Keywords: mkv h264 seek added
Reproduced by developer: set
Status: newopen
Version: unspecifiedgit-master

ffplay -ss 24 bad_seek_h264_inf-rtwm_sample.mkv

Not reproducible with ffmpeg -ss 24 -i bad_seek_h264_inf-rtwm_sample.mkv out.png, mplayer -ss 24 bad_seek_h264_inf-rtwm_sample.mkv also works fine.

comment:2 by Carl Eugen Hoyos, 12 years ago

Component: avformatundetermined

comment:3 by reimar, 12 years ago

I think your ebml_read_num analysis must be wrong.
I do not see any "Read error at pos." message in the output, so it is not the case that an actual error is read from the file.
It is possible it runs into EOF, possibly a part of the file (of the index more specifically) was cut away?

comment:4 by DonMoir, 12 years ago

Don't depend on any output messages etc. It only gets a read error at the very end of the file.

If you trace into the code on a first seek and set a break point, you should see it fail. It's failing because total = avio_r8(pb) is zero. Why this is zero for this file I do not know. It can create the index table if you let this pass but what other ramifications that might have I don't know.

static int ebml_read_num(MatroskaDemuxContext *matroska, AVIOContext *pb,
                         int max_size, uint64_t *number)
{
    int read = 1, n = 1;
    uint64_t total = 0;

    ....

    /* The first byte tells us the length in bytes - avio_r8() can normally
     * return 0, but since that's not a valid first ebmlID byte, we can
     * use it safely here to catch EOS. */
//******************************************
// SET BREAK POINT on following if statement
    if (!(total = avio_r8(pb))) {
        /* we might encounter EOS here */
        if (!url_feof(pb)) {
            int64_t pos = avio_tell(pb);
            av_log(matroska->ctx, AV_LOG_ERROR,
                   "Read error at pos. %"PRIu64" (0x%"PRIx64")\n",
                   pos, pos);
        }
        return AVERROR(EIO); /* EOS or actual I/O error */
    }
    ....
}

So trace directly into the code and don't depend on anything else.

You could also place an av_log message if total == 0 but just for testing since ebml_read_num is called in several places and mostly you won't need the log message.


Last edited 12 years ago by DonMoir (previous) (diff)

comment:5 by DonMoir, 12 years ago

One additional point that I forgot to mention is while total will be equal to zero, url_feof (pb) will succeed and the reason you won't see any "Read error at pos" message. But since total is 0, it will return AVERROR(EIO).

Last edited 12 years ago by DonMoir (previous) (diff)

comment:6 by Carl Eugen Hoyos, 12 years ago

Resolution: fixed
Status: openclosed

Fixed by Reimar.

comment:7 by reimar, 12 years ago

I am not sure what you are trying to say, but I said before that I expect it to run into EOF while parsing the index.
Which means the index is broken and it's better to just not use it.
If you look at it with mkvinfo you will see that two parts of the index are near the start of the file and two other parts are around position 53 MB. However the file is only 50 MB large, so part of the index is simply missing.
Yes, if you just remove the error check it will probably add something from the existing index parts, but since something is obviously wrong with the index it should be safer to just not use it at all.

comment:8 by DonMoir, 12 years ago

Yeah, fine if it's broken. Just bringing it to your attention since this is an odd case. I don't know how the index is read, but I was assuming it was at the beginning of the file. Maybe some start up code detects a bad EOF but don't know.

I just needed to make sure that total == 0 is completely valid. It's strange to me that the file plays at all if EOF is being detected just on a seek and not elsewhere. That is the file plays fine and seems EOF should be detected on normal playback. I suppose there is some justification for this somewhere.

comment:9 by DonMoir, 12 years ago

There have been several times when I was told either it's a bad file, or ffmpeg doesn't work that way, or just don't use that format. Most of these issues were later fixed. Bad files tend to show you problems and good files mostly take care of themselves.

When someone tells me something like "it's a bad file", it's like some support guy saying: "just reboot, take 2 aspirins, and call me in the morning".

If it is a bad file and nothing can be done about it, then thats fine, but be sure nothing can be done about it.

I tested the above file in VLC, ffplay, SMPlayer 0.7.0, WMP using ffdshow, and my own app.

VLC 1.1.11 mostly gets seeking right but not without distorted frames which do clear up.

SMplayer 0.7.0 gets it right. EDIT: It sort of gets it right but will start to fail if you keep playing with it. It also less than accurate.

WMP using ffdshow gets it right but somewhat slow.

My own app gets it right.

ffplay only calls avformat_seek_file for the most part, and is not able to seek on the file on start up. It will be able to seek as the index table is built up as packets are read in.

So the thing is to create the expected user experience as much as possible and who knows what might be wrong with their files. I assume that WMP, VLC, and SMPlayer take care of this in their own way.

For my own app, I detect the problem and proceed with a solution. I would not have to do this if it could be handled in mastroska_read_seek but sometimes a higher level app just needs to have work arounds in place.

So the level at which this is taken care of for me and others has to be moved up. I don't wish to cop out and do nothing.

I am assuming that your recent patches for matroska have not changed this as I have not had a chance to test that out but maybe they have helped in some way related to this problem.

Last edited 12 years ago by DonMoir (previous) (diff)

comment:10 by reimar, 12 years ago

I am assuming that your recent patches for matroska have not changed this

Seeking should be working perfectly fine since those (though a bit slow, since it involves scanning the whole file). The only thing I was disagreeing with is that there is a bug at the point you pointed out and I also think that the index should not be used even if that means slower seeking that what might be possible when using anything from the non-broken index parts.
I admit I am not completely sure about that last one. I seem to remember discussion about tricks where you would reserve a few kB at the start of a file for an index and then there put as many entries as fit (from all over the file) and put the rest at the end.
This would then allow rather nice seeking even over http without having to ever jump to the index at the end. If such a thing is actually done it might be worth trying to support it over the current all-or-nothing handling.
I also proposed a patch that introduces a resync function which would allow implementing a read_timestamp function which should speed up seeking without an index a lot.
So there's lots that could be done, but that should all be on the "optimizing", not "bug fixing" level.

comment:11 by DonMoir, 12 years ago

Just getting a chance to test your recent matroska patches.

Perfect for above file and thanks again.

Note: See TracTickets for help on using tickets.