[vlc-devel] [PATCH] arm neon i420_yuyv/i420_uyvy/s32_s16 : a bit less cycles (untested)

Rémi Denis-Courmont remi at remlab.net
Fri Jun 24 08:51:37 CEST 2011

On Friday 24 June 2011 04:10:34 Rafaël Carré, you wrote:
> i420_*: Do not push lr on the stack and use bx lr
> should be the same number of cycles but with less memory usage
> (can't check as I can't find the PDF with the number of cycles per
> instruction)

The Cortex-A8 can load/store two registers per cycle. So your patch is 
actually slower.

Rémi Denis-Courmont
