Opened 2 years ago

Last modified 2 years ago

#6066 reopened enhancement

Handling HTTP 500 errors for input files

Reported by: dprestegard
Owned by:
Priority: wish
Component: avformat
Version: unspecified
Keywords: http
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:
When working with cloud object storage systems such as AWS S3, occasional HTTP 500 errors are normal, since S3 only guarantees 99.9% availability. As it currently operates, ffmpeg appears to interpret a 500 error as the end of the input file. With large source files (100+ GB), this can result in truncated transcodes.

It would be ideal if ffmpeg could handle a 500 error gracefully by retrying the request up to a configurable number of times.
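To illustrate the kind of retry behaviour I have in mind, here is a minimal sketch in Python. It is not ffmpeg code and does not reflect how libavformat is implemented. The names read_with_retries, open_range and TransientHTTPError are hypothetical: open_range(offset) stands in for issuing a GET with "Range: bytes=<offset>-" and yielding the body in chunks, raising TransientHTTPError on a 5xx response. The idea is to resume from the last byte received instead of treating the error as end of file.

```python
import time
from typing import Callable, Iterator


class TransientHTTPError(Exception):
    """Hypothetical stand-in for an HTTP 5xx response from the server."""

    def __init__(self, status: int):
        super().__init__(f"HTTP error {status}")
        self.status = status


def read_with_retries(
    open_range: Callable[[int], Iterator[bytes]],
    total_size: int,
    max_retries: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[bytes]:
    """Yield the whole resource, re-requesting from the current offset
    whenever a request fails or ends before total_size bytes arrived."""
    offset = 0
    failures = 0  # consecutive attempts that did not complete the read
    while offset < total_size:
        try:
            for chunk in open_range(offset):
                offset += len(chunk)
                failures = 0  # progress was made, so reset the counter
                yield chunk
        except TransientHTTPError:
            pass  # handled by the retry accounting below
        if offset >= total_size:
            return
        failures += 1
        if failures > max_retries:
            raise OSError(f"giving up at byte {offset} after {max_retries} retries")
        # Exponential backoff before the next ranged request.
        sleep(base_delay * 2 ** (failures - 1))
```

With something like this, a single 500 in the middle of a 100+ GB source would cost one extra ranged request rather than a full re-transcode. The retry count and backoff values here are arbitrary placeholders.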

I have tried setting the -reconnect 1 flag, but this had no effect.

Simply re-trying the transcode always succeeds, but this is wasteful.

When the error occurs, the following will show up in the transcode report:

[https @ 0x55e3cde7e140] request: GET /foo.mov?AWSAccessKeyId=bar&Expires=baz&Signature=qux HTTP/1.1
User-Agent: Lavf/57.56.100
Accept: */*
Range: bytes=44435210752-
Connection: close
Host: foo.s3.amazonaws.com
Icy-MetaData: 1

[https @ 0x5636385ed140] HTTP error 500 Internal Server Error
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x5636385ec7c0] stream 0, offset 0xe65761200: partial file
https://foo.s3.amazonaws.com/foo.mov?AWSAccessKeyId=bar&Expires=baz&Signature=qux: Invalid data found when processing input

I experienced this with multiple versions, including the latest master from a few weeks ago.

How to reproduce:
This is difficult to reproduce, since S3 usually works fine and only occasionally returns a 500 error for a request. In my testing, encoding feature-length content, this occurs approximately 10-15% of the time. In other words, if I encode 10 movies in series, 1 or 2 of them will be truncated.

ffmpeg -y -report -reconnect 1 -i "https://foo.s3.amazonaws.com/foo.mov?AWSAccessKeyId=bar&Expires=baz&Signature=qux" -an -pix_fmt yuv420p -c:v libx264 -b:v 8M -preset superfast -tune film -fastfirstpass 0 -pass 1 -force_key_frames "expr:gte(t,n_forced*4)" -x264opts vbv-maxrate=8800:vbv-bufsize=16000 10test.mp4

Change History (9)

comment:1 Changed 2 years ago by cehoyos

  • Keywords http added

Did you test reconnect_at_eof?

Please provide command line including complete, uncut console output (or report output) to make this a valid ticket.

comment:2 Changed 2 years ago by dprestegard

I will try this and report back. Isn't reconnect_at_eof designed for live encoding scenarios?

Providing a full uncut console output is challenging because I'm using feature film content and can't provide it directly. I may be able to reproduce this with some open content but will have to do some prep work to concatenate it into full feature length.

comment:3 Changed 2 years ago by dprestegard

Confirmed, using reconnect_at_eof encodes an endless output file

comment:4 Changed 2 years ago by cehoyos

  • Resolution set to invalid
  • Status changed from new to closed

Thank you for testing again!

comment:5 follow-up: Changed 2 years ago by dprestegard

Why was this closed? I verified that using reconnect_at_eof does not resolve the issue.

The 500 error handling could still be drastically improved with some retry logic. With storage systems increasingly moving toward object storage and more workloads moving to cloud services this type of issue will only happen more often in the future.

Is there anything else I can provide? I'll work on generating a source file and procedure for reproducing the issue.

comment:6 in reply to: ↑ 5 Changed 2 years ago by cehoyos

  • Priority changed from normal to wish
  • Resolution invalid deleted
  • Status changed from closed to reopened

Replying to dprestegard:

Why was this closed? I verified that using reconnect_at_eof does not resolve the issue.

What does "encodes an endless output file" mean?
(I am not a native speaker.)

Is there anything else I can provide?

Please provide the FFmpeg command line that reproduces the issue, together with its complete, uncut console output, to make this a valid ticket.

comment:7 follow-up: Changed 2 years ago by dprestegard

Ah! Sorry about that :)

When I add reconnect_at_eof, ffmpeg loops the input file over and over, since it disregards the end-of-file marker and starts again at the beginning. This is great for transcoding a live source, but for file transcoding it's not helpful.

I will work on producing a file that I can share, along with commands to reproduce.

comment:8 in reply to: ↑ 7 Changed 2 years ago by cehoyos

Replying to dprestegard:

When I add reconnect_at_eof, ffmpeg loops the input file over and over, since it disregards the end-of-file marker and starts again at the beginning. This is great for transcoding a live source, but for file transcoding it's not helpful.

Does reconnect_at_eof fix your issue, with the one disadvantage that it never stops, making it unusable for your use case?

I will work on producing a file that I can share, along with commands to reproduce.

The console output should be sufficient.

comment:9 Changed 2 years ago by dprestegard

I did only brief testing, but yes, it seems that reconnect_at_eof lets ffmpeg recover from an HTTP 500 error. However, as I mentioned, it never stops running, which makes it unsuitable for my use case (or any other file transcoding use case).

I will reproduce this and upload a full encoding report.
