[x264-devel] [PATCH 1/1] arm64: optimize various intra_predict asm functions
Martin Storsjö
martin at martin.st
Mon Aug 24 19:42:56 CEST 2015
On Mon, 17 Aug 2015, Janne Grunau wrote:
> Make them at least as fast as the compiled C version (tested on
> cortex-a53 vs. gcc 4.9.2).
>
>                          C     NEON (before)  NEON (after)
> intra_predict_4x4_dc:    260   335            260
> intra_predict_4x4_dct:   210   265            200
> intra_predict_8x8c_dc:   497   548            493
> intra_predict_8x8c_v:    232   309            179 (arm64)
> intra_predict_8x16c_dc:  795   830            790
> ---
> common/aarch64/predict-a.S | 132 +++++++++++++++++++++++++--------------------
> common/aarch64/predict-c.c | 7 ++-
> common/aarch64/predict.h | 3 +-
> 3 files changed, 82 insertions(+), 60 deletions(-)
Ok
// Martin