[x264-devel] [PATCH 1/1] arm64: optimize various intra_predict asm functions

Martin Storsjö martin at martin.st
Mon Aug 24 19:42:56 CEST 2015


On Mon, 17 Aug 2015, Janne Grunau wrote:

> Make them at least as fast as the compiled C version (tested on
> cortex-a53 vs. gcc 4.9.2).
>
>                        C     NEON (before)   NEON (after)
> intra_predict_4x4_dc:   260   335             260
> intra_predict_4x4_dct:  210   265             200
> intra_predict_8x8c_dc:  497   548             493
> intra_predict_8x8c_v:   232   309             179 (arm64)
> intra_predict_8x16c_dc: 795   830             790
> ---
> common/aarch64/predict-a.S | 132 +++++++++++++++++++++++++--------------------
> common/aarch64/predict-c.c |   7 ++-
> common/aarch64/predict.h   |   3 +-
> 3 files changed, 82 insertions(+), 60 deletions(-)

Ok

// Martin

