Opened 6 years ago

Last modified 4 months ago

#7768 new enhancement

ffmpeg does not handle HTTP read errors (e.g. from cloud storage)

Reported by: Derek Prestegard
Owned by:
Priority: normal
Component: undetermined
Version: unspecified
Keywords: http
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:
When reading large source files via HTTPS (e.g. a signed URL on AWS S3 or similar object storage), ffmpeg does not handle HTTP errors gracefully. Transient errors are occasionally expected when using services like S3, and applications are expected to handle them with a retry mechanism.

ffmpeg seems to interpret an error as the end of the input file. In other words, if an error is encountered it simply stops encoding and finishes writing the output. This means the output file will be truncated.

Ideally ffmpeg would be able to retry when it hits such errors. This would enable reliable processing of large files in cloud storage without resorting to a "split and stitch" or chunked-encoding methodology. Although those approaches are feasible and widely used, they add complexity and can affect quality.
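To illustrate the failure mode, here is a minimal, hypothetical sketch using only the public libavformat avio API (not ffmpeg's actual internal code path): a naive read loop over an HTTPS input gets a negative return both at real end-of-file and when the connection drops mid-transfer, so without extra handling the stream simply ends early and the output is truncated.

    /* Hypothetical illustration only: read an HTTPS URL with the public
     * avio API and show how a transient network error looks just like EOF
     * to a naive read loop. */
    #include <inttypes.h>
    #include <stdio.h>
    #include <libavformat/avformat.h>
    #include <libavformat/avio.h>

    int main(int argc, char **argv)
    {
        AVIOContext *io = NULL;
        unsigned char buf[65536];
        int64_t total = 0;
        int n;

        if (argc < 2)
            return 1;

        avformat_network_init();
        if (avio_open2(&io, argv[1], AVIO_FLAG_READ, NULL, NULL) < 0)
            return 1;

        /* avio_read() returns the number of bytes read, or a negative
         * AVERROR. AVERROR_EOF and e.g. a dropped TCP connection are both
         * negative, so a loop like this stops either way and the stream is
         * silently truncated; a retry/resume (HTTP Range request) would
         * have to happen here. */
        while ((n = avio_read(io, buf, sizeof(buf))) > 0)
            total += n;

        if (n < 0 && n != AVERROR_EOF)
            fprintf(stderr, "network/read error %d after %" PRId64 " bytes\n", n, total);

        avio_closep(&io);
        avformat_network_uninit();
        return 0;
    }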

How to reproduce:
Perform any transcode of a large (100 GB+) file via an S3 signed URL. It will most likely produce a truncated output.


Change History (5)

comment:1 by Carl Eugen Hoyos, 6 years ago

Please provide the command line you tested, including the complete, uncut console output (an excerpt of the debug output may also be helpful), to make this a valid ticket; don't forget that only current FFmpeg git head is supported here.

comment:2 by Derek Prestegard, 6 years ago

Here's a sample command:

ffmpeg -report -loglevel debug -y -i "https://s3.amazonaws.com/download.opencontent.netflix.com/Meridian/MER_SHR_C_EN-XX_US-NR_51_LTRT_UHD_20160913_OV/MER_SHR_C_EN-XX_US-NR_51_LTRT_UHD_20160913_OV/MER_SHR_C_EN-XX_US-NR_51_LTRT_UHD_20160913_OV_01.mxf" -pix_fmt yuv420p10 -c:v libx265 -threads 1 -preset veryslow -crf 12 test_01.mp4

This reads a very large (715 GB) j2k mxf file from one of Netflix's public S3 buckets and encodes it into HEVC. I'm running this now and expect it to time out in the next few hours. I'll provide the full debug report from that.

Probing the source reveals the following:

Duration: 00:11:58.93, start: 0.000000, bitrate: 8548764 kb/s

Stream #0:0, 1, 1001/60000: Video: jpeg2000, 1 reference frame, rgb48le(12 bpc, progressive), 3840x2160, 0/1, lossless, SAR 1:1 DAR 16:9, 59.94 tbr, 59.94 tbn, 59.94 tbc
Metadata:

file_package_umid: 0x060A2B340101010501010F2013000000FE44076823AB4CAC8F059C5EDECED827
file_package_name: File Package: PROTOTYPE SMPTE ST 422 / ST 2067-5 frame wrapping of JPEG 2000 codestreams with HDR metadata
track_name : PHDR Image Track

So, we should get a file that's 00:11:58.93 in duration.

comment:3 by Derek Prestegard, 6 years ago

Confirmed. After quite a while:

[tls @ 00000159a52d7740] Error in the pull function.0:24.17 bitrate=100805.6kbits/s speed=0.00143x
[tcp @ 0000015ac96e2940] Original list of addresses:
[tcp @ 0000015ac96e2940] Address 52.216.236.229 port 443
[tcp @ 0000015ac96e2940] Interleaved list of addresses:
[tcp @ 0000015ac96e2940] Address 52.216.236.229 port 443
[tcp @ 0000015ac96e2940] Starting connection attempt to 52.216.236.229 port 443
[tcp @ 0000015ac96e2940] Successfully connected to 52.216.236.229 port 443
[https @ 00000159a489dc40] request: GET /download.opencontent.netflix.com/Meridian/MER_SHR_C_EN-XX_US-NR_51_LTRT_UHD_20160913_OV/MER_SHR_C_EN-XX_US-NR_51_LTRT_UHD_20160913_OV/MER_SHR_C_EN-XX_US-NR_51_LTRT_UHD_20160913_OV_01.mxf HTTP/1.1
User-Agent: Lavf/58.26.101
Accept: */*
Range: bytes=27217019339-
Connection: close
Host: s3.amazonaws.com
Icy-MetaData: 1

https://s3.amazonaws.com/download.opencontent.netflix.com/Meridian/MER_SHR_C_EN-XX_US-NR_51_LTRT_UHD_20160913_OV/MER_SHR_C_EN-XX_US-NR_51_LTRT_UHD_20160913_OV/MER_SHR_C_EN-XX_US-NR_51_LTRT_UHD_20160913_OV_01.mxf: corrupt input packet in stream 0
[jpeg2000 @ 00000159a55272c0] Psot 1970142 too big
[jpeg2000 @ 00000159a55272c0] error during processing marker segment ff90
[tls @ 0000015ac96e1500] Error in the pull function.0:24.19 bitrate=101169.5kbits/s speed=0.0014x
[tls @ 0000015ac96e1500] The specified session has been invalidated for some reason.
    Last message repeated 1 times
[tls @ 0000015ac96e1500] The specified session has been invalidated for some reason.speed=0.00136x
    Last message repeated 1 times
[tls @ 0000015ac96e1500] The specified session has been invalidated for some reason.speed=0.00132x
    Last message repeated 1 times
[tls @ 0000015ac96e1500] The specified session has been invalidated for some reason.speed=0.00129x
    Last message repeated 1 times
[tls @ 0000015ac96e1500] The specified session has been invalidated for some reason.speed=0.00127x
    Last message repeated 1 times
[tls @ 0000015ac96e1500] The specified session has been invalidated for some reason.speed=0.00124x
    Last message repeated 1 times
[tls @ 0000015ac96e1500] The specified session has been invalidated for some reason.speed=0.00122x
    Last message repeated 1 times
[tls @ 0000015ac96e1500] The specified session has been invalidated for some reason.speed=0.00119x
    Last message repeated 1 times
Error while decoding stream #0:0: Invalid data found when processing input
[tls @ 0000015ac96e1500] The specified session has been invalidated for some reason.speed=0.00116x
    Last message repeated 1 times

[out_0_0 @ 00000159bd978580] EOF on sink link out_0_0:default.
No more output streams to write to, finishing.
frame= 1511 fps=0.1 q=-0.0 Lsize= 336740kB time=00:00:25.15 bitrate=109648.0kbits/s speed=0.000963x
video:336716kB audio:0kB subtitle:0kB other streams:0kB global headers:2kB muxing overhead: 0.007196%
Input file #0 (https://s3.amazonaws.com/download.opencontent.netflix.com/Meridian/MER_SHR_C_EN-XX_US-NR_51_LTRT_UHD_20160913_OV/MER_SHR_C_EN-XX_US-NR_51_LTRT_UHD_20160913_OV/MER_SHR_C_EN-XX_US-NR_51_LTRT_UHD_20160913_OV_01.mxf):
  Input stream #0:0 (video): 1512 packets read (27191430359 bytes); 1511 frames decoded;
  Total: 1512 packets (27191430359 bytes) demuxed
Output file #0 (test_01.mp4):
  Output stream #0:0 (video): 1511 frames encoded; 1511 packets muxed (344797328 bytes);
  Total: 1511 packets (344797328 bytes) muxed
1511 frames successfully decoded, 1 decoding errors
[AVIOContext @ 00000159a556a880] Statistics: 2 seeks, 1319 writeouts
x265 [info]: frame I: 10, Avg QP:15.01 kb/s: 294753.33
x265 [info]: frame P: 400, Avg QP:15.28 kb/s: 210073.99
x265 [info]: frame B: 1101, Avg QP:20.19 kb/s: 71169.26
x265 [info]: Weighted P-Frames: Y:3.5% UV:3.0%
x265 [info]: Weighted B-Frames: Y:1.4% UV:1.1%
x265 [info]: consecutive B-frames: 26.6% 0.2% 1.5% 33.7% 31.0% 3.4% 2.9% 0.0% 0.7%

encoded 1511 frames in 26120.40s (0.06 fps), 109420.58 kb/s, Avg QP:18.85
[AVIOContext @ 00000159a4e60600] Statistics: 27193009947 bytes read, 31 seeks

Here's the full encoding report at loglevel debug:

https://genie-public-streaming-test.s3.us-west-1.amazonaws.com/ffmpeg-20190306-173420.log

comment:4 by Derek Prestegard, 6 years ago

Oh, and the resulting mp4 file is only 25 seconds long.
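
A quick duration check along these lines (hypothetical command, not part of the original report) makes the truncation easy to confirm:

ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 test_01.mp4

which reports roughly 25 seconds instead of the expected 00:11:58.93.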

comment:5 by Colin Leroy-Mira, 4 months ago

Hello,

Since this ticket's creation, reconnect options have been added. Either via the command line:

ffmpeg -reconnect 1 -reconnect_at_eof 1 -reconnect_on_network_error 1

or via the API:

    /* enable the http protocol's automatic reconnection options */
    av_dict_set(&video_options, "reconnect_on_network_error", "1", 0);
    av_dict_set(&video_options, "reconnect_at_eof", "1", 0);
    av_dict_set(&video_options, "reconnect", "1", 0);

    ret = avformat_open_input(&video_fmt_ctx, filename, NULL, &video_options);
    ...

Hope this helps.
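
For reference, applied to the sample command from comment 2 this would look roughly as follows; the reconnect options are input options and therefore go before -i, the URL is abbreviated here, and the -reconnect_delay_max 30 value is only an illustrative assumption, not something tested against this ticket:

ffmpeg -reconnect 1 -reconnect_at_eof 1 -reconnect_on_network_error 1 -reconnect_delay_max 30 -i "https://s3.amazonaws.com/..." -pix_fmt yuv420p10 -c:v libx265 -threads 1 -preset veryslow -crf 12 test_01.mp4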