Opened 7 years ago

Closed 7 years ago

#5878 closed defect (fixed)

x11grab slow with smaller -video_size

Reported by: lolilolicon Owned by:
Priority: normal Component: avdevice
Version: git-master Keywords: x11grab xcb
Cc: Blocked By:
Blocking: Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug

When using x11grab to take a PNG screenshot, smaller -video_size values take
significantly longer than larger values:

size      real  user  sys
100x100   0.724 0.020 0.003
200x300   0.731 0.023 0.007
800x100   0.558 0.023 0.000
200x500   0.461 0.023 0.003
1000x100  0.459 0.020 0.003
200x700   0.327 0.020 0.007
1400x100  0.327 0.020 0.003
200x900   0.265 0.023 0.007
800x300   0.230 0.017 0.010
600x500   0.198 0.023 0.007
1000x300  0.197 0.020 0.010
1000x500  0.137 0.027 0.007
1000x700  0.110 0.037 0.007
1600x900  0.095 0.050 0.007

For perspective, the ImageMagick command import takes consistantly around
0.350 second, which is not good, but still better than FFmpeg for video_size
smaller than around 373x373.

How to reproduce

This is the command I used to get the time values above:

% time ffmpeg -nostdin -loglevel error -f x11grab -draw_mouse 0 -show_region 0 -video_size 100x100 -i :0+0\,0 -f image2 -codec:v png -frames:v 1 - >/dev/null

real  0.724
user  0.020
sys   0.003

A simple strace shows most of the time is spent by shmget.

% strace -r ffmpeg -nostdin -loglevel error -f x11grab -draw_mouse 0 -show_region 0 -video_size 400x500 -i :0+0\,0 -f image2 -codec:v png -frames:v 1 - >/dev/null |& grep shmget
     0.033343 shmget(IPC_PRIVATE, 800032, IPC_CREAT|0777) = 505544790
     0.032619 shmget(IPC_PRIVATE, 800032, IPC_CREAT|0777) = 505577559
     0.032847 shmget(IPC_PRIVATE, 800032, IPC_CREAT|0777) = 505610328
     0.032514 shmget(IPC_PRIVATE, 800032, IPC_CREAT|0777) = 505643097
     0.032756 shmget(IPC_PRIVATE, 800032, IPC_CREAT|0777) = 505675866
     0.032881 shmget(IPC_PRIVATE, 800032, IPC_CREAT|0777) = 505708635
     0.032862 shmget(IPC_PRIVATE, 800032, IPC_CREAT|0777) = 505741404

Total time spend for by shmget: 0.229822 (out of 0.262 real time)

% strace -r ffmpeg -nostdin -loglevel error -f x11grab -draw_mouse 0 -show_region 0 -video_size 100x100 -i :0+0\,0 -f image2 -codec:v png -frames:v 1 - >/dev/null |& grep shmget
     0.033303 shmget(IPC_PRIVATE, 40032, IPC_CREAT|0777) = 504758357
     0.032914 shmget(IPC_PRIVATE, 40032, IPC_CREAT|0777) = 504791127
     0.033090 shmget(IPC_PRIVATE, 40032, IPC_CREAT|0777) = 504823896
     0.032941 shmget(IPC_PRIVATE, 40032, IPC_CREAT|0777) = 504856665
     0.033167 shmget(IPC_PRIVATE, 40032, IPC_CREAT|0777) = 504889434
     0.033067 shmget(IPC_PRIVATE, 40032, IPC_CREAT|0777) = 504922203
     0.033127 shmget(IPC_PRIVATE, 40032, IPC_CREAT|0777) = 504954972
     0.033105 shmget(IPC_PRIVATE, 40032, IPC_CREAT|0777) = 504987741
     0.033197 shmget(IPC_PRIVATE, 40032, IPC_CREAT|0777) = 505020510
     0.033091 shmget(IPC_PRIVATE, 40032, IPC_CREAT|0777) = 505053279
     0.033189 shmget(IPC_PRIVATE, 40032, IPC_CREAT|0777) = 505086048
     0.032985 shmget(IPC_PRIVATE, 40032, IPC_CREAT|0777) = 505118817
     0.032978 shmget(IPC_PRIVATE, 40032, IPC_CREAT|0777) = 505151586
     0.033139 shmget(IPC_PRIVATE, 40032, IPC_CREAT|0777) = 505184355
     0.033065 shmget(IPC_PRIVATE, 40032, IPC_CREAT|0777) = 505217124
     0.033025 shmget(IPC_PRIVATE, 40032, IPC_CREAT|0777) = 505249893
     0.032777 shmget(IPC_PRIVATE, 40032, IPC_CREAT|0777) = 505282662
     0.032937 shmget(IPC_PRIVATE, 40032, IPC_CREAT|0777) = 505315431
     0.032944 shmget(IPC_PRIVATE, 40032, IPC_CREAT|0777) = 505348200
     0.033133 shmget(IPC_PRIVATE, 40032, IPC_CREAT|0777) = 505380969
     0.033121 shmget(IPC_PRIVATE, 40032, IPC_CREAT|0777) = 505413738

Total time spend for by shmget: 0.694295 (out of 0.724 real time).


ffmpeg version 3.1.3 Copyright (c) 2000-2016 the FFmpeg developers
built with gcc 6.2.1 (GCC) 20160830
configuration: --prefix=/usr --disable-debug --disable-static --disable-stripping --enable-avisynth --enable-avresample --enable-fontconfig --enable-gmp --enable-gnutls --enable-gpl --enable-ladspa --enable-libass --enable-libbluray --enable-libfreetype --enable-libfribidi --enable-libgsm --enable-libiec61883 --enable-libmodplug --enable-libmp3lame --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-libschroedinger --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-netcdf --enable-shared --enable-version3 --enable-x11grab
libavutil      55. 28.100 / 55. 28.100
libavcodec     57. 48.101 / 57. 48.101
libavformat    57. 41.100 / 57. 41.100
libavdevice    57.  0.101 / 57.  0.101
libavfilter     6. 47.100 /  6. 47.100
libavresample   3.  0.  0 /  3.  0.  0
libswscale      4.  1.100 /  4.  1.100
libswresample   2.  1.100 /  2.  1.100
libpostproc    54.  0.100 / 54.  0.100

Change History (13)

comment:1 by lolilolicon, 7 years ago

Component: avformatavdevice

comment:2 by lolilolicon, 7 years ago

Interestingly, the worst time I get with import is 0.360, while with ffmpeg it's 0.738. That is roughly double the time. Coincidence?

comment:3 by Carl Eugen Hoyos, 7 years ago

Keywords: video_size removed

Please test current FFmpeg git head and provide command line and complete, uncut console output for one small resolution to make this a valid ticket.

comment:4 by lolilolicon, 7 years ago

Done.
I should also mention the root window size is 1920x1200.

% ffmpeg -loglevel 99 -nostdin -f x11grab -draw_mouse 0 -show_region 0 -video_size 48x64 -i :0+0\,0 -f image2 -codec:v png -frames:v 1 - >/dev/null
ffmpeg version 3.0.git Copyright (c) 2000-2016 the FFmpeg developers
  built with gcc 6.2.1 (GCC) 20160830
  configuration: --prefix=/usr --enable-debug --disable-static --disable-stripping --enable-avisynth --enable-avresample --enable-fontconfig --enable-gmp --enable-gnutls --enable-gpl --enable-ladspa --enable-libass --enable-libbluray --enable-libfreetype --enable-libfribidi --enable-libgsm --enable-libiec61883 --enable-libmodplug --enable-libmp3lame --enable-libopencore_amrnb --enable-libopencore_amrwb --disable-libopenjpeg --enable-libopus --enable-libpulse --enable-libschroedinger --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-netcdf --enable-shared --enable-version3 --enable-x11grab
  libavutil      55. 32.100 / 55. 32.100
  libavcodec     57. 61.100 / 57. 61.100
  libavformat    57. 51.103 / 57. 51.103
  libavdevice    57.  0.102 / 57.  0.102
  libavfilter     6. 63.100 /  6. 63.100
  libavresample   3.  0.  0 /  3.  0.  0
  libswscale      4.  1.100 /  4.  1.100
  libswresample   2.  2.100 /  2.  2.100
  libpostproc    54.  0.100 / 54.  0.100
Splitting the commandline.
Reading option '-loglevel' ... matched as option 'loglevel' (set logging level) with argument '99'.
Reading option '-nostdin' ... matched as option 'stdin' (enable or disable interaction on standard input) with argument 0.
Reading option '-f' ... matched as option 'f' (force format) with argument 'x11grab'.
Reading option '-draw_mouse' ... matched as AVOption 'draw_mouse' with argument '0'.
Reading option '-show_region' ... matched as AVOption 'show_region' with argument '0'.
Reading option '-video_size' ... matched as AVOption 'video_size' with argument '48x64'.
Reading option '-i' ... matched as input file with argument ':0+0,0'.
Reading option '-f' ... matched as option 'f' (force format) with argument 'image2'.
Reading option '-codec:v' ... matched as option 'codec' (codec name) with argument 'png'.
Reading option '-frames:v' ... matched as option 'frames' (set the number of frames to output) with argument '1'.
Reading option '-' ... matched as output file.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option loglevel (set logging level) with argument 99.
Applying option nostdin (enable or disable interaction on standard input) with argument 0.
Successfully parsed a group of options.
Parsing a group of options: input file :0+0,0.
Applying option f (force format) with argument x11grab.
Successfully parsed a group of options.
Opening an input file: :0+0,0.
[x11grab @ 0x55e0d6e12ba0] All info found
[x11grab @ 0x55e0d6e12ba0] rfps: 29.250000 0.019178
[x11grab @ 0x55e0d6e12ba0] rfps: 29.333333 0.014994
[x11grab @ 0x55e0d6e12ba0] rfps: 29.416667 0.011324
[x11grab @ 0x55e0d6e12ba0] rfps: 29.500000 0.008169
[x11grab @ 0x55e0d6e12ba0] rfps: 29.583333 0.005528
[x11grab @ 0x55e0d6e12ba0] rfps: 29.666667 0.003401
[x11grab @ 0x55e0d6e12ba0] rfps: 29.750000 0.001788
[x11grab @ 0x55e0d6e12ba0] rfps: 29.833333 0.000689
    Last message repeated 1 times
[x11grab @ 0x55e0d6e12ba0] rfps: 29.916667 0.000104
    Last message repeated 1 times
[x11grab @ 0x55e0d6e12ba0] rfps: 30.000000 0.000034
    Last message repeated 1 times
[x11grab @ 0x55e0d6e12ba0] rfps: 60.000000 0.000136
    Last message repeated 1 times
[x11grab @ 0x55e0d6e12ba0] rfps: 120.000000 0.000544
    Last message repeated 1 times
[x11grab @ 0x55e0d6e12ba0] rfps: 240.000000 0.002174
    Last message repeated 1 times
[x11grab @ 0x55e0d6e12ba0] rfps: 29.970030 0.000000
    Last message repeated 1 times
[x11grab @ 0x55e0d6e12ba0] rfps: 59.940060 0.000000
    Last message repeated 1 times
[x11grab @ 0x55e0d6e12ba0] stream 0: start_time: 1475851055.803 duration: -9223372036854.775
[x11grab @ 0x55e0d6e12ba0] format: start_time: 1475851055.803 duration: -9223372036854.775 bitrate=0 kb/s
Input #0, x11grab, from ':0+0,0':
  Duration: N/A, start: 1475851055.802931, bitrate: N/A
    Stream #0:0, 21, 1/1000000: Video: rawvideo, 1 reference frame (BGR[0] / 0x524742), bgr0, 48x64, 0/1, 29.97 fps, 29.97 tbr, 1000k tbn, 1000k tbc
Successfully opened the file.
Parsing a group of options: output file -.
Applying option f (force format) with argument image2.
Applying option codec:v (codec name) with argument png.
Applying option frames:v (set the number of frames to output) with argument 1.
Successfully parsed a group of options.
Opening an output file: -.
Successfully opened the file.
detected 4 logical cores
[graph 0 input from stream 0:0 @ 0x55e0d6e228e0] Setting 'video_size' to value '48x64'
[graph 0 input from stream 0:0 @ 0x55e0d6e228e0] Setting 'pix_fmt' to value '298'
[graph 0 input from stream 0:0 @ 0x55e0d6e228e0] Setting 'time_base' to value '1/1000000'
[graph 0 input from stream 0:0 @ 0x55e0d6e228e0] Setting 'pixel_aspect' to value '0/1'
[graph 0 input from stream 0:0 @ 0x55e0d6e228e0] Setting 'sws_param' to value 'flags=2'
[graph 0 input from stream 0:0 @ 0x55e0d6e228e0] Setting 'frame_rate' to value '30000/1001'
[graph 0 input from stream 0:0 @ 0x55e0d6e228e0] w:48 h:64 pixfmt:bgr0 tb:1/1000000 fr:30000/1001 sar:0/1 sws_param:flags=2
[format @ 0x55e0d6e22e60] compat: called with args=[rgb24|rgba|rgb48be|rgba64be|pal8|gray|ya8|gray16be|ya16be|monob]
[format @ 0x55e0d6e22e60] Setting 'pix_fmts' to value 'rgb24|rgba|rgb48be|rgba64be|pal8|gray|ya8|gray16be|ya16be|monob'
[auto-inserted scaler 0 @ 0x55e0d6e23ec0] Setting 'flags' to value 'bicubic'
[auto-inserted scaler 0 @ 0x55e0d6e23ec0] w:iw h:ih flags:'bicubic' interl:0
[format @ 0x55e0d6e22e60] auto-inserting filter 'auto-inserted scaler 0' between the filter 'Parsed_null_0' and the filter 'format'
[AVFilterGraph @ 0x55e0d6e1f100] query_formats: 4 queried, 2 merged, 1 already done, 0 delayed
[auto-inserted scaler 0 @ 0x55e0d6e23ec0] picking rgb24 out of 9 ref:bgr0 alpha:0
[swscaler @ 0x55e0d6e24080] Forcing full internal H chroma due to input having non subsampled chroma
[auto-inserted scaler 0 @ 0x55e0d6e23ec0] w:48 h:64 fmt:bgr0 sar:0/1 -> w:48 h:64 fmt:rgb24 sar:0/1 flags:0x4
Output #0, image2, to 'pipe:':
  Metadata:
    encoder         : Lavf57.51.103
    Stream #0:0, 0, 1001/30000: Video: png, 1 reference frame, rgb24, 48x64, 0/1, q=2-31, 200 kb/s, 29.97 fps, 29.97 tbn, 29.97 tbc
    Metadata:
      encoder         : Lavc57.61.100 png
Stream mapping:
  Stream #0:0 -> #0:0 (rawvideo (native) -> png (native))
cur_dts is invalid (this is harmless if it occurs once at the start per stream)
[rawvideo @ 0x55e0d6e1a920] PACKET SIZE: 12288, STRIDE: 192
Clipping frame in rate conversion by 0.000008
No more output streams to write to, finishing.
[pipe @ 0x55e0d6f85020] Setting default whitelist 'crypto'
[AVIOContext @ 0x55e0d6f85100] Statistics: 0 seeks, 1 writeouts
frame=    1 fps=0.0 q=-0.0 Lsize=N/A time=00:00:00.03 bitrate=N/A speed=94.8x    
video:1kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
Input file #0 (:0+0,0):
  Input stream #0:0 (video): 1 packets read (12288 bytes); 1 frames decoded; 
  Total: 1 packets (12288 bytes) demuxed
Output file #0 (pipe:):
  Output stream #0:0 (video): 1 frames encoded; 1 packets muxed (1269 bytes); 
  Total: 1 packets (1269 bytes) muxed
1 frames successfully decoded, 0 decoding errors
Last edited 7 years ago by lolilolicon (previous) (diff)

in reply to:  4 ; comment:5 by Carl Eugen Hoyos, 7 years ago

Replying to lolilolicon:

ffmpeg version 3.0.git Copyright (c) 2000-2016 the FFmpeg developers

Can you explain how you downloaded the source for this build?
I thought this should never happen, there should always be a useful version information...

I believe that --disable-stripping is never a good idea, debug is the default.

Please test if performance is better if you build with --disable-libxcb --enable-x11grab --enable-gpl.

comment:6 by Cigaes, 7 years ago

If I read correctly, you are benchmarking the capture of a single image: libavdevice is not optimized for that case, and ffmpeg, the command-line tool, even less.

in reply to:  5 comment:7 by lolilolicon, 7 years ago

Replying to cehoyos:

Replying to lolilolicon:

ffmpeg version 3.0.git Copyright (c) 2000-2016 the FFmpeg developers

Can you explain how you downloaded the source for this build?
I thought this should never happen, there should always be a useful version information...

Ouch, sorry. I grabbed https://github.com/FFmpeg/FFmpeg/archive/master.zip
The comments in the archive confirms it's commit 572f16e10041c9af131709f88a53750a4b3db802

I believe that --disable-stripping is never a good idea, debug is the default.

I used the build script from my distro repo, which came with this; the build tool strips binaries later, IIRC.

Please test if performance is better if you build with --disable-libxcb --enable-x11grab --enable-gpl.

Added --disable-libxcb, performance of ffmpeg is improved somewhat, e.g. for -video_size 100x100 time is 0.690 compared to 0.724. But it's still bad at smaller sizes.

Last edited 7 years ago by lolilolicon (previous) (diff)

in reply to:  description ; comment:8 by lolilolicon, 7 years ago

Replying to lolilolicon:

For perspective, the ImageMagick command import takes consistantly around
0.350 second

This is completely wrong, PEBKAC. I used import -window root -crop WxH +repage png:-, which dumps a lot of images, not one; the correct command is import -window root -crop WxH+X+Y +repage png:- and it's much faster at smaller sizes. Compare now:

  video_size     ffmpeg import
20000   200x100   0.692 0.123
40000   400x100   0.689 0.086
60000   200x300   0.703 0.090
60000   600x100   0.691 0.088
80000   800x100   0.524 0.092
100000  1000x100  0.424 0.095
100000  200x500   0.424 0.094
120000  1200x100  0.358 0.096
120000  400x300   0.359 0.097
140000  1400x100  0.295 0.098
140000  200x700   0.296 0.100
160000  1600x100  0.259 0.101
180000  1800x100  0.227 0.104
180000  200x900   0.227 0.104
180000  600x300   0.226 0.103
200000  400x500   0.230 0.106
240000  800x300   0.199 0.113
280000  400x700   0.169 0.114
300000  1000x300  0.164 0.122
300000  600x500   0.163 0.114
360000  1200x300  0.133 0.127
360000  400x900   0.134 0.122
400000  800x500   0.135 0.132
420000  1400x300  0.107 0.134
420000  600x700   0.100 0.125
480000  1600x300  0.107 0.145
500000  1000x500  0.104 0.145
540000  1800x300  0.106 0.150
540000  600x900   0.103 0.137
560000  800x700   0.106 0.150
600000  1200x500  0.105 0.153
700000  1000x700  0.076 0.168
700000  1400x500  0.075 0.164
720000  800x900   0.076 0.166
800000  1600x500  0.079 0.180
840000  1200x700  0.080 0.180
900000  1000x900  0.081 0.187
900000  1800x500  0.081 0.191
980000  1400x700  0.084 0.197
1080000 1200x900  0.085 0.202
1120000 1600x700  0.090 0.216
1260000 1400x900  0.058 0.221
1260000 1800x700  0.062 0.228
1440000 1600x900  0.064 0.246
1620000 1800x900  0.069 0.266

in reply to:  8 ; comment:9 by Carl Eugen Hoyos, 7 years ago

Replying to lolilolicon:

the correct command is import -window root -crop WxH+X+Y +repage png:-

Isn't using the crop filter with FFmpeg also faster?

in reply to:  9 ; comment:10 by lolilolicon, 7 years ago

Replying to cehoyos:

Replying to lolilolicon:

the correct command is import -window root -crop WxH+X+Y +repage png:-

Isn't using the crop filter with FFmpeg also faster?

You're damn right, man! (Now I wonder how that works for screencasts...)
But this ticket is still valid, do you think? I mean, there may be room for improvement :)

in reply to:  10 ; comment:11 by Carl Eugen Hoyos, 7 years ago

Replying to lolilolicon:

But this ticket is still valid, do you think?

I don't think so (you are describing an issue somewhere in X iiuc) but since there is a regression over the old x11grab code, this should be evaluated.

in reply to:  11 comment:12 by lolilolicon, 7 years ago

Replying to cehoyos:

but since there is a regression over the old x11grab code, this should be evaluated.

You mean the part where libxcb performs worse? Yeah I agree.

comment:13 by Carl Eugen Hoyos, 7 years ago

Keywords: xcb added
Resolution: fixed
Status: newclosed
Version: unspecifiedgit-master

Overall speed with xcb should have increased with 0bd1be65e88d6a4f367e698d7a2b105424eb1905
I suspect that further fixes are outside the scope of FFmpeg.

Note: See TracTickets for help on using tickets.