[x264-devel] arm: Use aligned loads in x264_coeff_last15_neon

Martin Storsjö git at videolan.org
Sun Oct 11 19:01:02 CEST 2015


x264 | branch: master | Martin Storsjö <martin at martin.st> | Thu Aug 13 23:59:25 2015 +0300| [d2b04a26b26d02c41ffb05cf1a605dafe9e6fa59] | committer: Henrik Gramner

arm: Use aligned loads in x264_coeff_last15_neon

After subtracting 2 from r0, the pointer is 128-bit aligned, so the
size-15 case can use the same aligned load as the other sizes.

checkasm timing      Cortex-A7    A8    A9
coeff_last15_c              423   375   230
coeff_last15_neon           350   420   404  (before)
coeff_last15_neon           350   400   394  (after)
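
The alignment argument above can be checked with a small C sketch. It
assumes (this is not stated in the commit itself) that the 15-coefficient
case is handed a pointer to element 1 of a 16-byte-aligned block of
16-bit coefficients. The names coeff_block, is_aligned and
load_address_for_last15 are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if p is aligned to `align` bytes
 * (align must be a power of two). */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Assumed layout: sixteen 16-bit coefficients in a 16-byte-aligned block.
 * The size-15 case is assumed to start at element 1. */
typedef struct {
    _Alignas(16) int16_t coef[16];
} coeff_block;

/* Mirrors `sub r0, r0, #2`: step back 2 bytes (one 16-bit coefficient)
 * to get the address the subsequent vld1.64 would load from. */
static const int16_t *load_address_for_last15(const int16_t *ac)
{
    return (const int16_t *)((const char *)ac - 2);
}
```

Under that assumed layout, &blk.coef[1] is only 2-byte aligned, while
load_address_for_last15(&blk.coef[1]) is the start of the block and so
satisfies the [r0,:128] alignment hint.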

> http://git.videolan.org/gitweb.cgi/x264.git/?a=commit;h=d2b04a26b26d02c41ffb05cf1a605dafe9e6fa59
---

 common/arm/quant-a.S |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/common/arm/quant-a.S b/common/arm/quant-a.S
index 4b2129a..ad8d8f8 100644
--- a/common/arm/quant-a.S
+++ b/common/arm/quant-a.S
@@ -337,10 +337,8 @@ endfunc
 function x264_coeff_last\size\()_neon
 .if \size == 15
     sub         r0,  r0,  #2
-    vld1.64     {d0-d3}, [r0]
-.else
-    vld1.64     {d0-d3}, [r0,:128]
 .endif
+    vld1.64     {d0-d3}, [r0,:128]
     vtst.16     q0,  q0
     vtst.16     q1,  q1
     vshrn.u16   d0,  q0,  #8


