Opened 3 months ago

Last modified 3 months ago

#11306 new defect

fate-rv40 fails on RVV

Reported by: Rémi Denis-Courmont
Owned by:
Priority: normal
Component: undetermined
Version: unspecified
Keywords:
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description (last modified by Rémi Denis-Courmont)

Summary of the bug:
How to reproduce:

% make fate-rv40

Bisection leads to this:

95d1052fba671d6c4ab6727a6905a637d03211c7 is the first bad commit
commit 95d1052fba671d6c4ab6727a6905a637d03211c7 (HEAD)
Author: Rémi Denis-Courmont <remi@remlab.net>
Date:   Sat Nov 18 22:09:57 2023 +0200

    lavu/riscv: add hwprobe() for CPU detection
    
    This adds the Linux-specific function call to detect CPU features. Unlike
    the more portable auxillary vector, this supports extensions other than
    single lettered ones. At this point, FFmpeg already needs this to detect
    Zba and Zbb at run-time, and probably will need it for Zvbb in the near
    future.
    
    Support will be available in glibc 2.40 onward.

 configure             |  3 +++
 libavutil/riscv/cpu.c | 25 +++++++++++++++++++++++++
 2 files changed, 28 insertions(+)

This commit is evidently a scapegoat. The broken optimisation presumably depends on the bit-manipulation (B) extension, and so was never enabled at run time in earlier commits.

Change History (3)

comment:1 by Rémi Denis-Courmont, 3 months ago

Description: modified (diff)

comment:2 by Rémi Denis-Courmont, 3 months ago

With B enabled in CFLAGS, bisection points the finger at:

5bc3b7f51308b8027e5468ef60d8336a960193e2 is the first bad commit
commit 5bc3b7f51308b8027e5468ef60d8336a960193e2 (HEAD)
Author: sunyuechi <sunyuechi@iscas.ac.cn>
Date:   Tue Apr 30 18:24:00 2024 +0800

    lavc/rv40dsp: R-V V chroma_mc
    
    This is similar to h264, but here we use manual_avg instead of vaaddu
    because rv40's OP differs from h264. If we use vaaddu,
    rv40 would need to repeatedly switch between vxrm=0 and vxrm=2,
    and switching vxrm is very slow.
    
    C908:
    avg_chroma_mc4_c: 2330.0
    avg_chroma_mc4_rvv_i32: 602.7
    avg_chroma_mc8_c: 1211.0
    avg_chroma_mc8_rvv_i32: 602.7
    put_chroma_mc4_c: 1825.0
    put_chroma_mc4_rvv_i32: 414.7
    put_chroma_mc8_c: 932.0
    put_chroma_mc8_rvv_i32: 414.7
    
    Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>

 libavcodec/riscv/Makefile       |   2 +
 libavcodec/riscv/rv40dsp_init.c |  51 ++++++
 libavcodec/riscv/rv40dsp_rvv.S  | 371 ++++++++++++++++++++++++++++++++++++++++
 libavcodec/rv34dsp.h            |   1 +
 libavcodec/rv40dsp.c            |   2 +
 5 files changed, 427 insertions(+)
 create mode 100644 libavcodec/riscv/rv40dsp_init.c
 create mode 100644 libavcodec/riscv/rv40dsp_rvv.S

...which seems a much likelier culprit.
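
For readers unfamiliar with the vxrm remark in that commit message, here is a minimal C sketch of what the RVV averaging add (vaaddu) computes on 8-bit values under the two fixed-point rounding modes mentioned: vxrm=0 (round-to-nearest-up) and vxrm=2 (round-down, i.e. truncate). The helper names avg_rnu and avg_rdn are hypothetical and do not appear in FFmpeg. This only models the rounding behaviour; it is not the code from rv40dsp_rvv.S, and it is not meant to suggest that rounding is the cause of this failure.

```c
#include <stdint.h>

/* Scalar model of RVV vaaddu on unsigned 8-bit elements.
 * These helper names are hypothetical, for illustration only. */

/* vxrm = 0 (rnu): round to nearest, ties rounded up. */
static uint8_t avg_rnu(uint8_t a, uint8_t b)
{
    /* Operands promote to int, so the sum cannot overflow. */
    return (uint8_t)((a + b + 1) >> 1);
}

/* vxrm = 2 (rdn): round down, i.e. drop the low bit of the sum. */
static uint8_t avg_rdn(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b) >> 1);
}
```

The two modes differ only when a + b is odd: for example, averaging 1 and 2 gives 2 with rnu and 1 with rdn. Code that needs both results in the same loop would have to rewrite vxrm between them, which is the cost the commit message says it avoids by averaging manually.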

comment:3 by Rémi Denis-Courmont, 3 months ago

The same functions seem to be responsible for fate-filter-codecview-mvs failing.
