[x264-devel] aarch64: Fix integral_init4/8h_neon

Martin Storsjö git at videolan.org
Sun Oct 11 19:01:01 CEST 2015


x264 | branch: master | Martin Storsjö <martin at martin.st> | Thu Aug 13 23:59:22 2015 +0300| [5c4728d8dd82ba46901824470db1609ae0f2521d] | committer: Henrik Gramner

aarch64: Fix integral_init4/8h_neon

The stride is given as a number of uint16_t elements, so it needs
to be shifted left by one to get a byte offset before it is
subtracted from the pointer.

This issue went unnoticed because checkasm didn't actually
verify the output of these functions.

> http://git.videolan.org/gitweb.cgi/x264.git/?a=commit;h=5c4728d8dd82ba46901824470db1609ae0f2521d
---
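
Note: the pointer arithmetic behind the fix can be sketched in C. This
is only an illustration, not code from x264. It assumes, as the diff
suggests, that x0 holds a uint16_t pointer and x2 holds the stride in
elements. The helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: byte offset back to the previous row when the
 * stride counts uint16_t elements. Mirrors "x2, lsl #1" in the fixed
 * instruction, since sizeof(uint16_t) == 2. */
static intptr_t prev_row_byte_offset(intptr_t stride)
{
    return stride << 1;
}

/* Hypothetical helper: element index reached after stepping back
 * byte_offset bytes from element idx of a uint16_t array. */
static intptr_t element_after_step_back(intptr_t idx, intptr_t byte_offset)
{
    return idx - byte_offset / (intptr_t)sizeof(uint16_t);
}
```

With a stride of 16 elements and a starting index of 32, the shifted
offset (32 bytes) lands on element 16, exactly one row above. Using
the stride unshifted (16 bytes) lands on element 24, only half a row
back, which corresponds to the old "sub x3, x0, x2".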

 common/aarch64/mc-a.S |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/common/aarch64/mc-a.S b/common/aarch64/mc-a.S
index a4488a4..b6b588e 100644
--- a/common/aarch64/mc-a.S
+++ b/common/aarch64/mc-a.S
@@ -1403,7 +1403,7 @@ endfunc
 .endm
 
 function integral_init4h_neon, export=1
-    sub         x3,  x0,  x2
+    sub         x3,  x0,  x2, lsl #1
     ld1        {v6.8b,v7.8b}, [x1], #16
 1:
     subs        x2,  x2,  #16
@@ -1438,7 +1438,7 @@ endfunc
 .endm
 
 function integral_init8h_neon, export=1
-    sub         x3,  x0,  x2
+    sub         x3,  x0,  x2, lsl #1
     ld1        {v16.8b,v17.8b}, [x1], #16
 1:
     subs        x2,  x2,  #16


