Opened 8 years ago

Closed 5 years ago

#6066 closed enhancement (needs_more_info)

Handling HTTP 500 errors for input files

Reported by: Derek Prestegard
Owned by:
Priority: wish
Component: avformat
Version: unspecified
Keywords: http
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:
When working with cloud object storage systems like AWS S3, it's normal to experience occasional 500 errors, as S3 only guarantees 99.9% availability. As it currently operates, ffmpeg appears to interpret a 500 error as the end of the input file. When working with large source files (100+ GB), this can lead to truncated transcodes.

It would be ideal if ffmpeg could handle a 500 error gracefully by retrying the request up to a certain number of tries.
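As a rough illustration of the behavior requested here (not ffmpeg's actual implementation), the following Python sketch reads a remote object in byte ranges and retries a range when the server answers with a 5xx error, resuming at the current offset instead of treating the error as end of file. `fetch_range`, `TransientHTTPError`, and all parameter names are hypothetical and exist only for this example.

```python
import time


class TransientHTTPError(Exception):
    """Hypothetical error a fetcher raises for a retryable 5xx response."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def read_all_with_retries(fetch_range, total_size, chunk_size=1 << 20,
                          max_tries=5, base_delay=0.5, sleep=time.sleep):
    """Read total_size bytes through fetch_range(offset, length).

    A failed range is retried up to max_tries times with exponential
    backoff; the read then continues from the same byte offset.
    """
    data = bytearray()
    offset = 0
    while offset < total_size:
        length = min(chunk_size, total_size - offset)
        for attempt in range(1, max_tries + 1):
            try:
                chunk = fetch_range(offset, length)
                break
            except TransientHTTPError:
                if attempt == max_tries:
                    raise  # give up: report the error, do not signal EOF
                sleep(base_delay * 2 ** (attempt - 1))
        if not chunk:
            raise EOFError(f"unexpected end of data at offset {offset}")
        data += chunk
        offset += len(chunk)
    return bytes(data)
```

The key design point in this sketch is that an exhausted retry budget surfaces as an error rather than as a normal end of input, so a caller cannot mistake a server failure for a complete file.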

I have tried setting the -reconnect 1 flag, but this had no effect.

Simply re-trying the transcode always succeeds, but this is wasteful.

When the error occurs, the following will show up in the transcode report:

[https @ 0x55e3cde7e140] request: GET /foo.mov?AWSAccessKeyId=bar&Expires=baz&Signature=qux HTTP/1.1
User-Agent: Lavf/57.56.100
Accept: */*
Range: bytes=44435210752-
Connection: close
Host: foo.s3.amazonaws.com
Icy-MetaData: 1

[https @ 0x5636385ed140] HTTP error 500 Internal Server Error
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x5636385ec7c0] stream 0, offset 0xe65761200: partial file
https://foo.s3.amazonaws.com/foo.mov?AWSAccessKeyId=bar&Expires=baz&Signature=qux: Invalid data found when processing input

I experienced this with multiple versions, including the latest master from a few weeks ago.

How to reproduce:
This is difficult to reproduce, since S3 usually works fine and only occasionally returns a 500 error for a request. In my testing, encoding feature-length content, this occurs approximately 10-15% of the time. In other words, if I encode 10 movies in series, 1 or 2 of them will be truncated.

ffmpeg -y -report -reconnect 1 -i "https://foo.s3.amazonaws.com/foo.mov?AWSAccessKeyId=bar&Expires=baz&Signature=qux" -an -pix_fmt yuv420p -c:v libx264 -b:v 8M -preset superfast -tune film -fastfirstpass 0 -pass 1 -force_key_frames "expr:gte(t,n_forced*4)" -x264opts vbv-maxrate=8800:vbv-bufsize=16000 10test.mp4

Change History (11)

comment:1 by Carl Eugen Hoyos, 8 years ago

Keywords: http added

Did you test reconnect_at_eof?

Please provide command line including complete, uncut console output (or report output) to make this a valid ticket.

comment:2 by Derek Prestegard, 8 years ago

I will try this and report back. Isn't reconnect_at_eof designed for live encoding scenarios?

Providing a full uncut console output is challenging because I'm using feature film content and can't provide it directly. I may be able to reproduce this with some open content but will have to do some prep work to concatenate it into full feature length.

comment:3 by Derek Prestegard, 8 years ago

Confirmed: using reconnect_at_eof encodes an endless output file.

comment:4 by Carl Eugen Hoyos, 8 years ago

Resolution: invalid
Status: new → closed

Thank you for testing again!

comment:5 by Derek Prestegard, 8 years ago

Why was this closed? I verified that using reconnect_at_eof does not resolve the issue.

The 500 error handling could still be drastically improved with some retry logic. With storage systems increasingly moving toward object storage and more workloads moving to cloud services this type of issue will only happen more often in the future.

Is there anything else I can provide? I'll work on generating a source file and procedure for reproducing the issue.

comment:6 by Carl Eugen Hoyos, 8 years ago (in reply to comment:5)

Priority: normal → wish
Resolution: invalid
Status: closed → reopened

Replying to dprestegard:

> Why was this closed? I verified that using reconnect_at_eof does not resolve the issue.

What does "encodes an endless output file" mean?
(I am not a native speaker.)

> Is there anything else I can provide?

Please provide the FFmpeg command line that reproduces the issue, together with its complete, uncut console output, to make this a valid ticket.

comment:7 by Derek Prestegard, 8 years ago

Ah! Sorry about that :)

When I add reconnect_at_eof, ffmpeg loops the input file over and over, since it disregards the end-of-file marker and starts again at the beginning. This is great for transcoding a live source, but for file transcoding it's not helpful.

I will work on producing a file that I can share along with commands to reproduce.

comment:8 by Carl Eugen Hoyos, 8 years ago (in reply to comment:7)

Replying to dprestegard:

> When I add reconnect_at_eof, ffmpeg loops the input file over and over, since it disregards the end-of-file marker and starts again at the beginning. This is great for transcoding a live source, but for file transcoding it's not helpful.

Does reconnect_at_eof fix your issue, with the sole disadvantage that it never stops, making it unusable for your use case?

> I will work on producing a file that I can share along with commands to reproduce.

The console output should be sufficient.

comment:9 by Derek Prestegard, 8 years ago

I did only brief testing, but yes, it seems that reconnect_at_eof makes ffmpeg able to recover from an HTTP 500 error. However, as I've mentioned, it never stops running, which makes it unsuitable for my use case (or any other file transcoding use case).

I will reproduce this and upload a full encoding report.

comment:10 by brewerja, 5 years ago

We too have struggled with using ffmpeg against files in S3. In our case, we are clipping and transcoding HLS streams that are fed into ffmpeg as an input.txt file listing tracks. When we do this for long clips where the input file contains >3K .ts files, we very often see the 500 Internal Server Error message from ffmpeg.

The assumption has been that we are getting 500 errors sporadically from S3, but having tested a sequential curl of the same input file and even doing that in parallel, we are unable to reproduce a 500 error from S3.

We can reproduce the error in a test if we use a mock web server and set up the server to return a single 500 error. We've not had any luck with the reconnect flags.
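A mock server of the kind described above might look like the following Python sketch. It is not the commenter's actual test setup: `PAYLOAD`, `make_handler`, `start_server`, and the `fail_on_request` parameter are names invented for this example, and only the open-ended `Range: bytes=N-` form seen in the report is handled.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAYLOAD = b"0123456789" * 100  # stand-in for a media file; purely illustrative


def make_handler(fail_on_request=1):
    """Build a handler class that answers the Nth GET with HTTP 500."""
    lock = threading.Lock()
    state = {"count": 0}

    class FlakyHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            with lock:
                state["count"] += 1
                request_number = state["count"]
            if request_number == fail_on_request:
                self.send_error(500, "Internal Server Error")
                return

            # Only the open-ended form "bytes=N-" is understood here.
            start, partial = 0, False
            range_header = self.headers.get("Range", "")
            if range_header.startswith("bytes="):
                first = range_header[len("bytes="):].split("-")[0]
                if first.isdigit():
                    start, partial = int(first), True
            if start >= len(PAYLOAD):
                self.send_error(416, "Range Not Satisfiable")
                return

            body = PAYLOAD[start:]
            self.send_response(206 if partial else 200)
            if partial:
                last = len(PAYLOAD) - 1
                self.send_header("Content-Range",
                                 f"bytes {start}-{last}/{len(PAYLOAD)}")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the console quiet

    return FlakyHandler


def start_server(fail_on_request=1):
    """Serve on an OS-assigned local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), make_handler(fail_on_request))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

After `server = start_server()`, the port is available as `server.server_address[1]`, and any HTTP client can be pointed at `http://127.0.0.1:<port>/` to see how it reacts to a single 500 followed by normal responses.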

I feel our options are either to implement some sort of retry outside of ffmpeg (via a proxy) or to download the .ts files locally before running ffmpeg.
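The second option, downloading each file locally with retries before ffmpeg runs, could be sketched in Python roughly as follows. This is an illustration under assumptions, not a tested workflow from this ticket: `download_with_retry` and its parameters are hypothetical names, and only 5xx responses are treated as retryable.

```python
import shutil
import time
import urllib.error
import urllib.request
from pathlib import Path


def download_with_retry(url, dest, max_tries=5, base_delay=1.0,
                        opener=urllib.request.urlopen, sleep=time.sleep):
    """Download url to dest, retrying on HTTP 5xx with exponential backoff.

    Data is written to a temporary ".part" file and renamed only after a
    complete download, so a failed attempt never leaves a truncated dest.
    """
    dest = Path(dest)
    tmp = dest.with_name(dest.name + ".part")
    for attempt in range(1, max_tries + 1):
        try:
            with opener(url) as resp, open(tmp, "wb") as out:
                shutil.copyfileobj(resp, out)
            tmp.replace(dest)
            return dest
        except urllib.error.HTTPError as err:
            # Client errors (4xx) will not fix themselves; do not retry them.
            if err.code < 500 or attempt == max_tries:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

Once every listed file has been fetched this way, the input list handed to ffmpeg can reference the local copies, so ffmpeg itself never sees the intermittent 500 responses.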

comment:11 by Carl Eugen Hoyos, 5 years ago

Resolution: needs_more_info
Status: reopened → closed