[x265] [PATCH Update] asm: interp_8tap_hv_pp_8x8() for InterpolateHV_8x8

chen chenm003 at 163.com
Sat Oct 26 05:37:08 CEST 2013









At 2013-10-26 00:56:40, "Steve Borho" <steve at borho.org> wrote:






On Fri, Oct 25, 2013 at 7:25 AM, Min Chen <chenm003 at 163.com> wrote:
# HG changeset patch
# User Min Chen <chenm003 at 163.com>
# Date 1382703678 -28800
# Node ID 2221e3abb479b1e9a586d80d769373d13c7f7980
# Parent  4ca4da7bdd36fbef00b9eefe54c0a56bf11633f3
asm: interp_8tap_hv_pp_8x8() for InterpolateHV_8x8



How does this compare, performance wise, to the combined h_ps + v_sp intrinsic functions?
 
[MC] I will write some code to compare them later.
 
diff -r 4ca4da7bdd36 -r 2221e3abb479 source/common/ipfilter.cpp
--- a/source/common/ipfilter.cpp        Fri Oct 25 12:11:31 2013 +0530
+++ b/source/common/ipfilter.cpp        Fri Oct 25 20:21:18 2013 +0800
@@ -401,6 +401,17 @@
         dst += dstStride;
     }
 }
+typedef void (*ipfilter_ps_t)(pixel *src, intptr_t srcStride, short *dst, intptr_t dstStride, int width, int height, const short *coeff);
+typedef void (*ipfilter_sp_t)(short *src, intptr_t srcStride, pixel *dst, intptr_t dstStride, int width, int height, const short *coeff);
+
+template<int N, int width, int height>
+void interp_hv_pp_c(pixel *src, intptr_t srcStride, pixel *dst, intptr_t dstStride, int idxX, int idxY)
+{
+    short m_immedVals[(64 + 8) * (64 + 8)];
+    filterHorizontal_ps_c<N>(src - 3 * srcStride, srcStride, m_immedVals, width, width, height + 7, g_lumaFilter[idxX]);
+    filterVertical_sp_c<N>(m_immedVals + 3 * width, width, dst, dstStride, width, height, g_lumaFilter[idxY]);
+}



the intermediate buffer should be an argument
 
[MC] This function outputs the final result of the HV interpolation, so I think keeping the intermediate buffer on the stack is thread-safe.
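
To make the two options in this exchange concrete, here is a minimal, self-contained sketch of the same two-pass scheme (horizontal pixel-to-short pass over height + 7 rows, then vertical short-to-pixel pass). It is not the x265 code: the names filterHorizontalPS, filterVerticalSP, interpHV, interpHVStack and lumaFilter are hypothetical, and the rounding constants assume 8-bit pixels with a 14-bit intermediate format. The coefficient table is the HEVC 8-tap luma filter. interpHV takes the intermediate buffer as an argument; interpHVStack keeps it on the stack as in the patch.

```cpp
#include <cassert>
#include <cstdint>

typedef uint8_t pixel; // assumption: 8-bit build

// HEVC 8-tap luma filter, quarter-sample positions 0..3 (each row sums to 64).
static const short lumaFilter[4][8] = {
    {  0, 0,   0, 64,  0,   0, 0,  0 },
    { -1, 4, -10, 58, 17,  -5, 1,  0 },
    { -1, 4, -11, 40, 40, -11, 4, -1 },
    {  0, 1,  -5, 17, 58, -10, 4, -1 }
};

// Horizontal pass: pixel in, short out. Assumed constants for 8-bit input:
// no shift, and an offset of -8192 to centre the intermediate values.
static void filterHorizontalPS(const pixel *src, intptr_t srcStride, short *dst,
                               intptr_t dstStride, int width, int height, const short *coeff)
{
    const int offset = -8192;
    src -= 3; // taps cover columns col-3 .. col+4
    for (int row = 0; row < height; row++)
    {
        for (int col = 0; col < width; col++)
        {
            int sum = 0;
            for (int i = 0; i < 8; i++)
                sum += src[col + i] * coeff[i];
            dst[col] = (short)(sum + offset);
        }
        src += srcStride;
        dst += dstStride;
    }
}

// Vertical pass: short in, pixel out. Undoes the -8192 offset (scaled by 64),
// rounds, shifts by 12 and clamps to the 8-bit range.
static void filterVerticalSP(const short *src, intptr_t srcStride, pixel *dst,
                             intptr_t dstStride, int width, int height, const short *coeff)
{
    const int shift = 12;
    const int offset = (1 << (shift - 1)) + (8192 << 6);
    src -= 3 * srcStride; // taps cover rows row-3 .. row+4
    for (int row = 0; row < height; row++)
    {
        for (int col = 0; col < width; col++)
        {
            int sum = 0;
            for (int i = 0; i < 8; i++)
                sum += src[col + i * srcStride] * coeff[i];
            int val = (sum + offset) >> shift;
            dst[col] = (pixel)(val < 0 ? 0 : (val > 255 ? 255 : val));
        }
        src += srcStride;
        dst += dstStride;
    }
}

// Variant 1: the caller supplies the scratch buffer (width * (height + 7) shorts).
void interpHV(const pixel *src, intptr_t srcStride, pixel *dst, intptr_t dstStride,
              int width, int height, int idxX, int idxY, short *tmp)
{
    // Filter 3 extra rows above and 4 below so the vertical pass has its taps.
    filterHorizontalPS(src - 3 * srcStride, srcStride, tmp, width, width, height + 7, lumaFilter[idxX]);
    filterVerticalSP(tmp + 3 * width, width, dst, dstStride, width, height, lumaFilter[idxY]);
}

// Variant 2: scratch buffer on the stack, sized as in the patch.
void interpHVStack(const pixel *src, intptr_t srcStride, pixel *dst, intptr_t dstStride,
                   int width, int height, int idxX, int idxY)
{
    assert(width <= 64 && height <= 64);
    short tmp[(64 + 8) * (64 + 8)];
    interpHV(src, srcStride, dst, dstStride, width, height, idxX, idxY, tmp);
}
```

Both variants are safe to call from several threads as long as no two concurrent calls share a scratch buffer. The stack version guarantees that by construction, at a cost of 10368 bytes of stack per call with this buffer size; the argument version leaves buffer placement and reuse to the caller.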
 

