Opened 11 years ago

Closed 11 years ago

Last modified 10 years ago

#1876 closed defect (invalid)

vc1dec.c multi-threading decode crash issue

Reported by: DonMoir Owned by:
Priority: important Component: avcodec
Version: unspecified Keywords: vc1 crash
Cc: onemda@gmail.com Blocked By:
Blocking: Reproduced by developer: no
Analyzed by developer: no

Description

In vc1_decode_frame there is:

if (ff_msmpeg4_decode_init(avctx) < 0

ff_vc1_decode_init_alloc_tables(v) < 0)

If this entire statement is not locked down in a multi-threaded multi-video environment, then crash and burn.

No way I know of to reproduce crash using ffmpeg command line tools.

The way I produce crash is to pump more than one WMV3 files at ffmpeg simultaneously. Multiple instances of a file are using for playback, thumbnails, and other things. Then when calling avcodec_decode_video2 you will get spurious crashes depending on the current state of things.

All other file formats I have tested seem to be ok. Just any format that makes use of vc1_decode_frame is suspect.

The file I used for testing is located here: (283 MB)

http://sms.pangolin.com/temp/WMV3_vc1_decode_frame_crash.wmv

Change History (40)

comment:1 by Carl Eugen Hoyos, 11 years ago

Priority: criticalimportant

Please provide a backtrace etc. as explained on http://ffmpeg.org/bugreports.html

comment:2 by Carl Eugen Hoyos, 11 years ago

Keywords: vc1 crash added

comment:3 by DonMoir, 11 years ago

There is not going to be any backtrace to look at as you can't get it to crash with ffmpeg command line tools and I am using a windows app to create the multi-threaded situation.

Beyond that, crashes are spurious and can be anywhere. This is do to memory overwrites or similar.

Mostly though, the point of crash is at avcodec_decode_video2 and for reasons I have already stated.

Last edited 11 years ago by DonMoir (previous) (diff)

comment:4 by DonMoir, 11 years ago

If you think you have of a way I can reproduce the problem with your tools then let me know.

in reply to:  3 comment:5 by Carl Eugen Hoyos, 11 years ago

Replying to DonMoir:

There is not going to be any backtrace to look at as you can't get it to crash with ffmpeg command line tools and I am using a windows app to create the multi-threaded situation.

Could you explain why "using a windows app" makes providing a backtrace not going to happen?

comment:6 by DonMoir, 11 years ago

Can you explain how I am suppose to create a backtrace with a windows app using MSVC where the MinGW debug symbols are not compatible? The only backtrace I get outside of a few app calling functions is a crash in avcodec_decode_video2 but can happen elsewhere and spurious as mentioned.

A thread problem has already been acknowledged in the mentioned code as well.

If you want an asm listing I can give that but it is probably not necessary.

Last edited 11 years ago by DonMoir (previous) (diff)

in reply to:  6 comment:7 by Carl Eugen Hoyos, 11 years ago

Replying to DonMoir:

Can you explain how I am suppose to create a backtrace with a windows app using MSVC where the MinGW debug symbols are not compatible?

Compile with MSVC to get debug symbols.

comment:8 by DonMoir, 11 years ago

ffmpeg c99 code is not compatible with MSVC c++ code. That is MSVC won't compile c99 code.

I have seen some mention of c99 to c89 but not dealing with that right now.

So I take it you have no tools for testing multi-threaded issues ?

But look at it this way. The crash can essentially happen anywhere and just all depends. So you want a worthless crash report when we already know where the problem is ?

Last edited 11 years ago by DonMoir (previous) (diff)

in reply to:  8 comment:9 by Carl Eugen Hoyos, 11 years ago

Replying to DonMoir:

ffmpeg c99 code is not compatible with MSVC c++ code.

FFmpeg is tested regularly with MSVC compiler, please see http://fate.ffmpeg.org/
(And of course http://ffmpeg.org/platform.html#Microsoft-Visual-C_002b_002b)

in reply to:  8 comment:10 by Carl Eugen Hoyos, 11 years ago

Replying to DonMoir:

So you want a worthless crash report when we already know where the problem is ?

Patches that fix bugs are even more welcome than bug reports, you can send them directly to ffmpeg-devel, no need to open a ticket, just include an explanation in your mail.

comment:11 by DonMoir, 11 years ago

Knowing where the problem is and getting in the optimal fix are a couple different things. I was hoping for more input on mailing list but only minimal response. I can kind of understand that since most do not use multi-video and multi-threaded in combination.

If there is a consensus on locking the mentioned code statement then the fix is very easy.

If there is not a consensus then we are back to square one wondering what the best fix is.

in reply to:  11 comment:12 by Carl Eugen Hoyos, 11 years ago

Replying to DonMoir:

Knowing where the problem is and getting in the optimal fix are a couple different things. I was hoping for more input on mailing list but only minimal response.

A safe way of getting more input is sending a patch that fixes your crash.

comment:13 by DonMoir, 11 years ago

What you call my crash has already been fixed locally and just trying to relate to you that crash is there waiting to happen to someone else. As you have said many times, "Crashes are important to us".

When will I have the time to read thru the patch rules, the submission rules, and the FATE rules? I just don't know.

Several days behind now tracking this issue.

comment:14 by Carl Eugen Hoyos, 11 years ago

Resolution: needs_more_info
Status: newclosed

Please reopen this ticket if you can explain how the crash you are seeing can be reproduced or if you can provide a backtrace.

comment:15 by DonMoir, 11 years ago

It can be hard to reproduce and it may take me about 10 tries. It came to me as a bug from a beta tester where it would happen on occasion. So then I wrote some code to trigger it more often. Running, opening multiple instances of the above file at the same time is the best way I can reproduce it. It does not happen anywhere else except in code that calls vc1_decode_frame. The crash can be anywhere because of memory overwrites etc.

My only current way to fix this is to have a unique build. After looking around some more and hints from Hendrik, it appears the fix is best put in vc1_decode_frame.

The problem code in vc1_decode_frame is:

    if (!s->context_initialized) {
        if (ff_msmpeg4_decode_init(avctx) < 0 || ff_vc1_decode_init_alloc_tables(v) < 0)
            goto err;
        s->low_delay = !avctx->has_b_frames || v->res_sprite;
        if (v->profile == PROFILE_ADVANCED) {
            s->h_edge_pos = avctx->coded_width;
            s->v_edge_pos = avctx->coded_height;
        }
    }

The offending statement is:

        if (ff_msmpeg4_decode_init(avctx) < 0 || ff_vc1_decode_init_alloc_tables(v) < 0)

After adding avpriv_lock_codec and avpriv_unlock_codec since they don't exist.

A couple ways to organize the fix but this way produces no additional testing or assignments:

    if (!s->context_initialized) {
        avpriv_lock_avcodec ();
        if (ff_msmpeg4_decode_init(avctx) < 0 || ff_vc1_decode_init_alloc_tables(v) < 0) {
            avpriv_unlock_avcodec ();
            goto err;
        }
        avpriv_unlock_avcodec ();
        s->low_delay = !avctx->has_b_frames || v->res_sprite;
        if (v->profile == PROFILE_ADVANCED) {
            s->h_edge_pos = avctx->coded_width;
            s->v_edge_pos = avctx->coded_height;
        }
    }

I don't like this way with the 2 unlocks and I would probably restructure to an initialize_context function but trying to keep it straight forward here.

If there is any way else to fix it with less impact, I don't know.

Last edited 11 years ago by DonMoir (previous) (diff)

comment:16 by DonMoir, 11 years ago

Resolution: needs_more_info
Status: closedreopened

comment:17 by Hendrik, 11 years ago

I would suggest to simply submit a patchset which introduces the new lock functions, and uses them here (separate patches). We can discuss alternative cleanup ideas on the mailing list better then here.

comment:18 by Carl Eugen Hoyos, 11 years ago

Resolution: needs_more_info
Status: reopenedclosed

comment:19 by DonMoir, 11 years ago

There is not much more info I can provide and it appears being arrogant is more important than a crash. For anyone who actually understands the issue, the detail provided is more than enough.

comment:20 by Elon Musk, 11 years ago

Analyzed by developer: set
Cc: onemda@gmail.com added
Reproduced by developer: set
Resolution: needs_more_info
Status: closedreopened

Please stop keep trying to close bugs which are not yet fixed.

in reply to:  19 comment:21 by Carl Eugen Hoyos, 11 years ago

Replying to DonMoir:

There is not much more info I can provide

Is there a backtrace that I missed?
Or any code a developer wanting to fix the problem could test?

To me it appears as if you did not provide any relevant information at all.

and it appears being arrogant is more important than a crash.

Yes, this is exactly the feeling I have about your reports.
You never provide any necessary information (including falsely pretending this isn't possibly, see above), instead you expect the developers to fill in the missing details. Combined with the fact that you refuse to send patches (that you say are easy to produce or you have even already tested them but time does not allow you to send them) and your constant ignoring of mailing list rules makes your attitude truly a seldom example of arrogance!

Last edited 11 years ago by Carl Eugen Hoyos (previous) (diff)

comment:22 by DonMoir, 11 years ago

Yeah Carl, to you it probably does seem like there is not enough detail. For a programmer who understands it is more than enough.

If you had sufficient tools so I could show you the problem then even you could understand it.

in reply to:  22 comment:23 by Carl Eugen Hoyos, 11 years ago

Replying to DonMoir:

Yeah Carl, to you it probably does seem like there is not enough detail. For a programmer who understands it is more than enough.

Thank you!

If you had sufficient tools so I could show you the problem then even you could understand it.

Could you tell me which tools I need?

comment:24 by DonMoir, 11 years ago

By the way, I have many more things to do then dealing with ffmpeg. Finding bugs in ffmpeg can take a lot of time. Reporting them sometimes takes even more time then it took to find them.

I could spend all the time working just with ffmpeg but I can't afford to do that.

It seems quite reasonable to me that if I point you to the actual bug which may have taken days to find, someone could spend an hour on it rather than wasting time on this ridiculous correspondence.

You would need tools that can spin off multiple instances of a file in multiple threads and play them at the same time for this particular problem.

Even then it can be hard to reproduce. Sometimes I can get it to crash right away and sometimes not. But I can always get it to crash.

Pretending my ass. The crash report is not very relevant but I don't expect you to understand that either.

Last edited 11 years ago by DonMoir (previous) (diff)

in reply to:  24 comment:25 by Carl Eugen Hoyos, 11 years ago

Replying to DonMoir:

By the way, I have many more things to do then dealing with ffmpeg.

Guess what: You are not the only one.

Finding bugs in ffmpeg can take a lot of time. Reporting them sometimes takes even more time then it took to find them.

I already told you numerous times: It is ok if you add an analysis to a report, it is not ok if the analysis replaces the actual report.

I could spend all the time working just with ffmpeg but I can't afford to do that.

It seems quite reasonable to me that if I point you to the actual bug which may have taken days to find, someone could spend an hour on it rather than wasting time on this ridiculous correspondence.

Yes, I agree: Instead of sharing insults, please provide a little more information!
And apart from that: What should the developer do this hour? Don't you think it would make much more sense to work on a ticket that contains all necessary information? Or one where the reporter makes us believe he is actually interested in helping?

You would need tools that can spin off multiple instances of a file in multiple threads and play them at the same time for this particular problem.

Names?
Links?

Even then it can be hard to reproduce. Sometimes I can get it to crash right away and sometimes not. But I can always get it to crash.

Then please provide the backtrace when it crashes, consider reading http://ffmpeg.org/bugreports.html if you believe I invented the idea of a backtrace because you provide such useful reports.

Pretending my ass. The crash report is not very relevant but I don't expect you to understand that either.

See above.

comment:26 by DonMoir, 11 years ago

I have no clue what tools you can use. Someone in the ffmpeg group should probably add it to the list of things to do.

The developer will know what to do as outlined above.

comment:27 by DonMoir, 11 years ago

Which crash report would you like ? pick a number 1 thru 10. It can crash anywhere do to memory overwrites and so not relevant.

I have also made steps in the direction of providing patches. I now have a linux machine with all necessary build tools. But still patches from me will have to wait.

comment:28 by Øystein Krog, 11 years ago

I suggest using a mingw-compiled gdb to produce backtraces, this has worked for me before.
I think you should be able to run your custom reproduction tools under gdb and then have gdb catch the exception (hopefully with backtrace) when it occurs.

comment:29 by DonMoir, 11 years ago

Can't do that geek, because the app is a windows app and complex. Again stack trace in this scenario is not useful anyway.

comment:30 by Carl Eugen Hoyos, 11 years ago

MSVC compilation is regularly tested, you can provide Windows backtraces now.

Which of course brings me to the next question: Why do you need a Linux box?

comment:31 by DonMoir, 11 years ago

Yeah I suppose if I wanted to mess around with c99-to-c89 and spend more time on dealing with that I could do that. But if c99-to-c89 is so bullet proof why don't you just convert over all of ffmpeg to c89.

Most people I know went to c++ from c89 and c99 did not even exist back then. Even if you did not want to use classes etc, c++ is a better c.

The ffmpeg group chose to use the syntax version of c that is used less often and so all hassle with that has been passed on to everyone else.

linux gives me a cleaner and more well defined environment, grabbing libraries, etc for dealing with ffmpeg rather than dealing with msys.

Last edited 11 years ago by DonMoir (previous) (diff)

comment:32 by DonMoir, 11 years ago

Appears to be fixed. I don't have a WMV3 file where the width and height changes so could not test very well in vc1_decode_frame when it has to be reinitialized but I faked it and seems ok.

comment:33 by Carl Eugen Hoyos, 11 years ago

Resolution: invalid
Status: reopenedclosed

comment:34 by Michael Niedermayer, 10 years ago

Resolution: invalidfixed

Ticket reporter confirmed that this has been fixed
The ticket may be invalid too but we should care more about what is fixed and what not.

in reply to:  34 comment:35 by Carl Eugen Hoyos, 10 years ago

Replying to michael:

Ticket reporter confirmed that this has been fixed

How can I backport the (important looking) fix to our release branches?

comment:36 by DonMoir, 10 years ago

The problem was, it was known to be a problem and god knows I spent a lot of time in mailing list and here on it and tracking it down, but was never closed properly and so difficult to tell exactly who fixed it and what release. You can probably search the mailing list to find more info if you want. It is indeed fixed and I beat the hell out of it with hundreds of files.

comment:37 by DonMoir, 10 years ago

By the way, I already mentioned this above, but in this case, neither command line applied and a back trace was useless. Command line tools were not capable of reproducing it because of their single file instance nature. Back trace was useless because it could crash anywhere in totally unrelated ffmpeg code or in my own code. It varied every time and it was nuts.

While it was hard to reproduce, it would happen every now and then with normal application usage. I ended up having to automate it so I could get it to happen at will. I wrote some code that would open up many instances of an offending file at the same time and just kept beating on it.

comment:38 by Carl Eugen Hoyos, 10 years ago

Analyzed by developer: unset
Reproduced by developer: unset
Resolution: fixedinvalid

Sorry but this ticket always costs me time when scanning through old reports.

It should be set to "fixed" if either:

  • The commit fixing the issue is known

or

  • A way to reproduce the issue with old versions of FFmpeg is explained

or

comment:39 by DonMoir, 10 years ago

The mailing list discussion starts here with comments from Michael, Hendrik, Reimar, myself, and possibly others.

http://ffmpeg.org/pipermail/ffmpeg-devel/2012-October/133043.html

Toward the end of that discussion, it points to this ticket. The problem was understood by the people involved.

Last edited 10 years ago by DonMoir (previous) (diff)

comment:40 by DonMoir, 10 years ago

o - The person who made the commit fixing the issue should have posted that here. This ticket was created so it could be made official and documented

o - A way to reproduce the issue was stated above in the initial comment, but your tools are too weak.

o - The source of the problem was pointed to in the initial comment.

o - A backtrace was useless and stated many times.

Note: See TracTickets for help on using tickets.