Opened 4 months ago

Closed 3 months ago

#6241 closed defect (needs_more_info)

hls_flags delete_segments – file descriptors not freeing, ffmpeg segfaults when system limit is reached

Reported by: rafamiga
Owned by: stevenliu
Priority: important
Component: avformat
Version: git-master
Keywords: hls crash regression
Cc: liuqi@gosun.com
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:
When the delete_segments HLS flag is enabled, file descriptors of deleted segments are not released, and the FFmpeg binary segfaults once the max open files limit is reached.

How to reproduce:

/usr/local/bin/ffmpeg -loglevel info -thread_queue_size 1024 -f decklink -i 'DeckLink 4K Extreme@2' -audio_input embedded -video_input sdi -threads 0 -fflags +genpts -flags +global_header -c:a libfdk_aac \
-map 0:v -s:v 1280x720 -b:v 3600k -minrate:v 3000k -maxrate:v 3700k -bufsize:v 3600k -pix_fmt yuv420p -vf yadif \
-map 0:a -af aresample=48000 -b:a 128k \
-c:v libx264 -preset slow -profile:v main -x264opts keyint=100:min-keyint=100:scenecut=-1 -hls_time 4 -hls_list_size 5 -hls_start_number_source epoch -hls_flags delete_segments /var/www/html/live/test1/live3600k.m3u8 \
-map 0:v -s:v 1024x576 -b:v 2400k -minrate:v 2000k -maxrate:v 2500k -bufsize:v 2400k -pix_fmt yuv420p -vf yadif \
-map 0:a -af aresample=44100 -b:a 128k \
-c:v libx264 -preset slow -profile:v main -x264opts keyint=100:min-keyint=100:scenecut=-1 -hls_time 4 -hls_list_size 5 -hls_start_number_source epoch -hls_flags delete_segments /var/www/html/live/test1/live2400k.m3u8 \
-map 0:v -s:v 720x404 -b:v 1700k -minrate:v 1000k -maxrate:v 1800k -bufsize:v 1700k -pix_fmt yuv420p -vf yadif \
-map 0:a -af aresample=44100 -b:a 96k -c:v libx264 -preset slow -profile:v main -x264opts keyint=100:min-keyint=100:scenecut=-1 -hls_time 4 -hls_list_size 5 -hls_start_number_source epoch -hls_flags delete_segments /var/www/html/live/test1/live1700k.m3u8 \
-map 0:v -s:v 512x288 -b:v 900k -minrate:v 800k -maxrate:v 1000k -bufsize:v 900k -pix_fmt yuv420p -vf yadif \
-map 0:a -af aresample=44100 -b:a 64k \
-c:v libx264 -preset slow -profile:v main -x264opts keyint=100:min-keyint=100:scenecut=-1 -hls_time 4 -hls_list_size 5 -hls_start_number_source epoch -hls_flags delete_segments /var/www/html/live/test1/live900k.m3u8 \
-map 0:v -s:v 512x288 -b:v 450k -minrate:v 400k -maxrate:v 500k -bufsize:v 450k -pix_fmt yuv420p -vf yadif \
-map 0:a -af aresample=44100 -b:a 64k \
-c:v libx264 -preset slow -profile:v main -x264opts keyint=100:min-keyint=100:scenecut=-1 -hls_time 4 -hls_list_size 5 -hls_start_number_source epoch -hls_flags delete_segments /var/www/html/live/test1/live450k.m3u8 \
-map 0:v -s:v 512x288 -b:v 120k -minrate:v 100k -maxrate:v 150k -bufsize:v 120k -pix_fmt yuv420p -vf yadif \
-map 0:a -af aresample=32000 -b:a 32k \
-c:v libx264 -preset slow -profile:v main -x264opts keyint=100:min-keyint=100:scenecut=-1 -hls_time 4 -hls_list_size 5 -hls_start_number_source epoch -hls_flags delete_segments /var/www/html/live/test1/live120k.m3u8

Build:
ffmpeg version N-83663-g7e9ba78 Copyright (c) 2000-2017 the FFmpeg developers
  built with gcc 4.9.2 (Debian 4.9.2-10)
  configuration: --prefix=/home/vagrant/ffmpeg-src/ffmpeg/ --bindir=/usr/local --pkg-config-flags=--static --extra-cflags='-I/home/vagrant/decklink-include -static' --extra-ldflags=-L/home/vagrant/decklink-include --enable-gpl --disable-shared --disable-doc --enable-static --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libtheora --enable-libvorbis --enable-libvpx --enable-libx264 --enable-nonfree --enable-decklink
  libavutil      55. 47.100 / 55. 47.100
  libavcodec     57. 81.100 / 57. 81.100
  libavformat    57. 66.102 / 57. 66.102
  libavdevice    57.  2.100 / 57.  2.100
  libavfilter     6. 74.100 /  6. 74.100
  libswscale      4.  3.101 /  4.  3.101
  libswresample   2.  4.100 /  2.  4.100
  libpostproc    54.  2.100 / 54.  2.100
Hyper fast Audio and Video encoder

How to check:

... after a few minutes of encoding ...
lsof -p `pidof ffmpeg`|grep '(deleted)'|wc -l
1812
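
The same count can be obtained without lsof by reading the symlinks under /proc/<pid>/fd, where Linux appends " (deleted)" to the target of a descriptor whose file has been unlinked. The following is a minimal sketch assuming a Linux /proc filesystem; count_deleted_fds is a hypothetical helper name, not part of FFmpeg.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: counts descriptors of `pid` whose /proc symlink
 * target ends in " (deleted)". Returns -1 if /proc/<pid>/fd is unreadable. */
int count_deleted_fds(pid_t pid)
{
    static const char suffix[] = " (deleted)";
    const size_t suffix_len = sizeof(suffix) - 1;
    char dir_path[64];
    struct dirent *entry;
    int count = 0;

    assert(pid > 0);
    snprintf(dir_path, sizeof(dir_path), "/proc/%ld/fd", (long)pid);

    DIR *dir = opendir(dir_path);
    if (!dir)
        return -1;

    while ((entry = readdir(dir)) != NULL) {
        char link_path[PATH_MAX], target[PATH_MAX];

        if (entry->d_name[0] == '.')
            continue; /* skip "." and ".." */

        snprintf(link_path, sizeof(link_path), "%s/%s", dir_path, entry->d_name);
        ssize_t len = readlink(link_path, target, sizeof(target) - 1);
        if (len < 0)
            continue; /* descriptor vanished while scanning */
        target[len] = '\0';

        if ((size_t)len >= suffix_len &&
            strcmp(target + len - suffix_len, suffix) == 0)
            count++;
    }

    closedir(dir);
    return count;
}
```

Calling count_deleted_fds() with the ffmpeg PID at intervals should show whether the number keeps growing, which is what the lsof output above suggests.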

Temporary fix:

# grep NOFILE /etc/systemd/system/ffhls_decklink.service
LimitNOFILE=65500
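
Raising LimitNOFILE only postpones the crash if descriptors keep leaking. To confirm which limit a process actually runs with, the soft RLIMIT_NOFILE value can be queried with getrlimit(). This is a small sketch; nofile_soft_limit is a hypothetical helper name.

```c
#include <limits.h>
#include <sys/resource.h>

/* Hypothetical helper: returns the soft limit on open file descriptors
 * for the calling process, LONG_MAX if unlimited, or -1 on error. */
long nofile_soft_limit(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return -1;
    if (lim.rlim_cur == RLIM_INFINITY || lim.rlim_cur > (rlim_t)LONG_MAX)
        return LONG_MAX;
    return (long)lim.rlim_cur;
}
```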

Comment:
I've looked into the source and it seems that the problem is in the hls_delete_old_segments function in libavformat/hlsenc.c:

proto = avio_find_protocol_name(s->filename);
        if (hls->method || (proto && !av_strcasecmp(proto, "http"))) {
            av_dict_set(&options, "method", "DELETE", 0);
            if ((ret = hls->avf->io_open(hls->avf, &out, path, AVIO_FLAG_WRITE, &options)) < 0)
                goto fail;
            ff_format_io_close(hls->avf, &out);
        } else if (unlink(path) < 0) {
            av_log(hls, AV_LOG_ERROR, "failed to delete old segment %s: %s\n",
                                     path, strerror(errno));
        }

I'm no expert, in fact I'm a rookie when it comes to debugging the ffmpeg source, but unlink() without some sort of close() [is it ff_format_io_close()?] may be the cause of this problem. [The same applies to the unlink(sub_path) call found later in that function.]
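
For context, POSIX unlink() only removes the directory entry; the file's storage and any open descriptor stay alive until the last descriptor is closed, which is why lsof lists such files as "(deleted)". The sketch below demonstrates that behaviour in isolation. It does not show where FFmpeg keeps a descriptor open; fd_survives_unlink is a hypothetical name used for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical demo: creates `path`, unlinks it while the descriptor is
 * still open, then checks that the descriptor is still usable.
 * Returns 1 if the descriptor outlived the unlink, 0 if not, -1 on error. */
int fd_survives_unlink(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);
    if (fd < 0)
        return -1;

    if (unlink(path) < 0) {
        close(fd);
        return -1;
    }

    /* The name is gone (link count 0), yet the descriptor still works. */
    struct stat st;
    int alive = fstat(fd, &st) == 0 &&
                st.st_nlink == 0 &&
                write(fd, "x", 1) == 1;

    close(fd); /* only now can the kernel release the file */
    return alive;
}
```

If a process repeats the open and unlink steps without the final close(), each segment costs one descriptor until the open-files limit is hit.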

Change History (9)

comment:1 Changed 4 months ago by stevenliu

  • Owner set to stevenliu
  • Status changed from new to open

Please use git pull to update to the newest commit.
This problem was fixed several days ago.

commit id: 4507f29e4a6a4363e0179c02bdb78d55e4d9a12c
Refer to: https://trac.ffmpeg.org/ticket/6204

comment:2 Changed 4 months ago by stevenliu

  • Resolution set to duplicate
  • Status changed from open to closed

duplicate: #6204

comment:3 follow-up: Changed 3 months ago by rafamiga

  • Resolution duplicate deleted
  • Status changed from closed to reopened

That's not it. I'm aware of the 4507f29e4a6a4363e0179c02bdb78d55e4d9a12c patch. I've applied it to my source tree and it's still segfaulting.

ff_format_io_close(s, &oc->pb) only happens under this condition:

    if (can_split && av_compare_ts(pkt->pts - hls->start_pts, st->time_base,
                                   end_pts, AV_TIME_BASE_Q) >= 0) {

My setup is rather strange, with 6 concurrent outputs. Please test my invocation; it only takes a minute or so to see whether the process holds on to those deleted segments.
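
The condition quoted above can be modelled outside FFmpeg to see when it fires. The following is a simplified sketch: compare_ts is a stand-in for av_compare_ts() that cross-multiplies without FFmpeg's overflow handling, and Rational and segment_boundary_reached are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef struct Rational { int num, den; } Rational;

/* Simplified stand-in for av_compare_ts(): compares ts_a * tb_a with
 * ts_b * tb_b. Returns -1, 0 or 1. No 64-bit overflow protection. */
static int compare_ts(int64_t ts_a, Rational tb_a, int64_t ts_b, Rational tb_b)
{
    assert(tb_a.den > 0 && tb_b.den > 0);
    int64_t a = ts_a * tb_a.num * tb_b.den;
    int64_t b = ts_b * tb_b.num * tb_a.den;
    return (a > b) - (a < b);
}

/* Model of the quoted check: a segment is closed only if the packet may
 * split AND the elapsed stream time has reached end_pts (microseconds). */
int segment_boundary_reached(int can_split, int64_t pts, int64_t start_pts,
                             Rational stream_tb, int64_t end_pts_us)
{
    const Rational microseconds = { 1, 1000000 }; /* same value as AV_TIME_BASE_Q */
    return can_split &&
           compare_ts(pts - start_pts, stream_tb, end_pts_us, microseconds) >= 0;
}
```

With a 1/90000 time base and a 4 second target, the check first succeeds at a pts offset of 360000, and never succeeds while can_split is 0, regardless of how much time has passed.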

comment:4 Changed 3 months ago by rafamiga

I've found one cause for can_split being false.

    if (hls->has_video) {
        can_split = st->codecpar->codec_type == AVMEDIA_TYPE_VIDEO &&
                    ((pkt->flags & AV_PKT_FLAG_KEY) || (hls->flags & HLS_SPLIT_BY_TIME));
        is_ref_pkt = st->codecpar->codec_type == AVMEDIA_TYPE_VIDEO;
    }

The variable is true (non-zero) when the packet belongs to a video stream and is either a keyframe or a forced time-based split. But right after that, this check happens:

    if (pkt->pts == AV_NOPTS_VALUE)
        is_ref_pkt = can_split = 0;

Maybe my setup is somewhat broken and AV_NOPTS_VALUE happens? As you can see, I use DeckLink for input.
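
The two checks above can be condensed into a small truth-table model. This is a sketch with booleans in place of the FFmpeg flags; PacketInfo and can_split_on are hypothetical names, and the default of true for outputs without video is an assumption, since the ticket only shows the has_video branch.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the packet properties the checks use. */
typedef struct PacketInfo {
    bool is_video;    /* codec_type == AVMEDIA_TYPE_VIDEO */
    bool is_keyframe; /* pkt->flags & AV_PKT_FLAG_KEY */
    bool has_pts;     /* pkt->pts != AV_NOPTS_VALUE */
} PacketInfo;

/* Returns whether a segment may be split at this packet. */
bool can_split_on(PacketInfo pkt, bool has_video, bool split_by_time)
{
    bool can_split = true; /* assumed default when the output has no video */

    if (has_video)
        can_split = pkt.is_video && (pkt.is_keyframe || split_by_time);

    /* A packet without a pts can never trigger a split. */
    if (!pkt.has_pts)
        can_split = false;

    return can_split;
}
```

In this model a video keyframe without a pts yields false, so if the input delivered such packets consistently, no split would ever be triggered.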

comment:5 in reply to: ↑ 3 Changed 3 months ago by stevenliu

Replying to rafamiga:

That's not it. I'm aware of the 4507f29e4a6a4363e0179c02bdb78d55e4d9a12c patch. I've applied it to my source tree and it's still segfaulting.

ff_format_io_close(s, &oc->pb) only happens under this condition:

    if (can_split && av_compare_ts(pkt->pts - hls->start_pts, st->time_base,
                                   end_pts, AV_TIME_BASE_Q) >= 0) {

My setup is rather strange, with 6 concurrent outputs. Please test my invocation; it only takes a minute or so to see whether the process holds on to those deleted segments.

Please use the newest version and check it. I see that your version is the same as the one in ticket #6204.

comment:6 Changed 3 months ago by stevenliu

  • Cc liuqi@gosun.com added

comment:7 Changed 3 months ago by rafamiga

Sure, the version string is the same, but the source is patched. I may pull the newest source, but I'm pretty sure the problem won't disappear. And since it's trivial to run the test, maybe it's easier to do so?

Last edited 3 months ago by rafamiga

comment:8 Changed 3 months ago by stevenliu

Also, please dump the input stream to a file and upload the file here so that we can try to reproduce it.

comment:9 Changed 3 months ago by cehoyos

  • Keywords crash regression added; descriptors hls_flags delete_segments removed
  • Resolution set to needs_more_info
  • Status changed from reopened to closed

Please reopen this ticket if the issue is reproducible with current FFmpeg git head. Please do not test patched FFmpeg versions.