Opened 15 months ago

Last modified 15 months ago

#10143 new defect

Very slow on Apple Silicon

Reported by: Gabriel
Owned by:
Priority: normal
Component: undetermined
Version: unspecified
Keywords: apple silicon, slow performance
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:
ffmpeg is much slower on M1 and M2 processors than on Intel, with the same file and comparable ffmpeg versions (4.4.x and 5.1.2), all on MacBooks from 2018 to 2022.

How to reproduce:

/opt/local/bin/ffmpeg -loglevel info -hide_banner -nostats -nostdin -strict -2 -probesize 50M -analyzeduration 100M -progress /tmp/convert_to_h265_progress.log -stats_period 10 -i Dragon_Women_-_Topmanagerinnen_in_der_Finanzwelt.avi -map 0 -map -0:v:1 -c:s copy -map_metadata 0 -map_metadata:s:v 0:s:v -dn -map_metadata:s:a 0:s:a -c:a aac_at -b:a 128k -filter:v crop=in_w-mod(in_w\,2):in_h-mod(in_h\,2) -codec:v libx265 -tag:v hvc1 -max_muxing_queue_size 1024 -vf format=yuv420p10le -preset faster -crf 25 -x265-params me=hex:qcomp=0.5:ref=4:psy-rd=2.0:psy-rdoq=1.0:rd=6:log-level=2 -ignore_unknown -f mp4 Dragon_Women_-_Topmanagerinnen_in_der_Finanzwelt.h265.mp4_converting

ffmpeg version 4.4.2 Copyright (c) 2000-2021 the FFmpeg developers
built with Apple clang version 14.0.0 (clang-1400.0.29.202)

The same happens with:
ffmpeg version 5.1.2-tessus Copyright (c) 2000-2022 the FFmpeg developers

Test case: conversion from H264 to H265.

You can download the test video, and the results of my benchmark here:

https://owncloud.informatik.uni-bremen.de/index.php/s/McQB67Yynr4MCpN
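
For anyone repeating the comparison, the wall-clock time of the command above can be captured with a small wrapper. This is a minimal Python sketch and not the method used for the numbers in this ticket; the function name `time_command` is hypothetical, and the ffmpeg arguments are simply passed through unchanged.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv to completion and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    # stdin is detached, matching the -nostdin flag used in the ffmpeg call.
    completed = subprocess.run(argv, stdin=subprocess.DEVNULL)
    elapsed = time.perf_counter() - start
    return completed.returncode, elapsed


# Only runs when a command is given, e.g.:
#   python time_command.py /opt/local/bin/ffmpeg -i input.avi ... output.mp4
if __name__ == "__main__" and len(sys.argv) > 1:
    code, seconds = time_command(sys.argv[1:])
    print(f"exit code {code}, wall-clock time {seconds / 60:.1f} min")
```

Running the same wrapper on each machine keeps the measurement method identical, so differences in the reported minutes come from the encode itself.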

Results in a nutshell:

MacBook Pro 2019, 2.3 GHz 8-core Intel i9:
Runtime: 38 min
ffmpeg version 4.4.1 Copyright (c) 2000-2021 the FFmpeg developers
built with Apple clang version 12.0.0 (clang-1200.0.32.28)

MacBook Pro 2021, Apple M1:
Runtime: 130 min
ffmpeg version 5.1.2-tessus Copyright (c) 2000-2022 the FFmpeg developers

MacBook Air 2022, Apple M2:
Runtime: 150 min
ffmpeg version 4.4.2 Copyright (c) 2000-2021 the FFmpeg developers
built with Apple clang version 14.0.0 (clang-1400.0.29.202)
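
To put the runtimes above in proportion, the slowdown relative to the Intel machine can be worked out directly. The following is a small Python sketch; the dictionary keys and the helper name `slowdown` are illustrative, and the minutes are the ones reported in this ticket.

```python
# Runtimes in minutes, as reported in this ticket.
runtimes_min = {
    "MacBook Pro 2019, Intel i9": 38,
    "MacBook Pro 2021, M1": 130,
    "MacBook Air 2022, M2": 150,
}

baseline = runtimes_min["MacBook Pro 2019, Intel i9"]


def slowdown(minutes, reference=baseline):
    """Return how many times longer a run took than the reference run."""
    return minutes / reference


for machine, minutes in runtimes_min.items():
    print(f"{machine}: {minutes} min, {slowdown(minutes):.1f}x the Intel runtime")
```

With these numbers the M1 run takes about 3.4 times as long as the Intel run, and the M2 run about 3.9 times as long.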

Change History (2)

comment:1 by Balling, 15 months ago

> Test case: conversion from H264 to H265.

Is that a joke? Both x265 and h264 are only lightly optimised for NEON compared to AVX2. And of course, SVE2 for ARMv9 is not there yet.

comment:2 by Gabriel, 15 months ago, in reply to: comment:1

It is, of course, not a joke. I did not know that x265/H264 are barely optimized for NEON.

After all, there are comparisons between SSE and NEON, for instance in ray tracing
(see, e.g., https://blog.yiningkarlli.com/2021/09/neon-vs-sse.html ).
These comparisons suggest that NEON is not bad: some things are easier and some are harder to do in NEON, but overall there is no difference in principle.

Also, it looks like automatically translating the SSE code with sse2neon produces relatively good results.
