[vlc-commits] chroma: copy: favor uswc copy with SSE4.1

Thomas Guillem git at videolan.org
Mon Mar 19 12:15:19 CET 2018


vlc/vlc-3.0 | branch: master | Thomas Guillem <thomas at gllm.fr> | Thu Mar 15 08:11:33 2018 +0100| [eb97c600321e2ce92548ee0eb69b6cf30044be1f] | committer: Thomas Guillem

chroma: copy: favor uswc copy with SSE4.1

This commit improves the Y plane copy speed from GPU images.

(cherry picked from commit 3a0f600339464b609a8b82bd1ed021e28cb882bd)
Signed-off-by: Thomas Guillem <thomas at gllm.fr>

> http://git.videolan.org/gitweb.cgi/vlc/vlc-3.0.git/?a=commit;h=eb97c600321e2ce92548ee0eb69b6cf30044be1f
---

 modules/video_chroma/copy.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/modules/video_chroma/copy.c b/modules/video_chroma/copy.c
index 446ebb7eac..973df6f10e 100644
--- a/modules/video_chroma/copy.c
+++ b/modules/video_chroma/copy.c
@@ -418,7 +418,8 @@ static void SSE_CopyPlane(uint8_t *dst, size_t dst_pitch,
     const unsigned hstep = cache_size / w16;
     assert(hstep > 0);
 
-    if (src_pitch == dst_pitch)
+    /* If SSE4.1: CopyFromUswc is faster than memcpy */
+    if (!vlc_CPU_SSE4_1() && src_pitch == dst_pitch)
         memcpy(dst, src, src_pitch * height);
     else
     for (unsigned y = 0; y < height; y += hstep) {
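
The branch changed above can be illustrated outside of VLC with a minimal sketch. It assumes a boolean `has_sse41` in place of `vlc_CPU_SSE4_1()`, and a plain row-by-row `memcpy` in place of the cached USWC copy loop, so it shows only which path is selected, not the speed gain. The names `copy_rows` and `copy_plane` are hypothetical and do not exist in copy.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the staged copy loop in SSE_CopyPlane.
 * The real code goes through a cache with CopyFromUswc; this sketch
 * only copies each row with memcpy. */
static void copy_rows(uint8_t *dst, size_t dst_pitch,
                      const uint8_t *src, size_t src_pitch,
                      unsigned width, unsigned height)
{
    for (unsigned y = 0; y < height; y++)
        memcpy(dst + y * dst_pitch, src + y * src_pitch, width);
}

/* Returns true when the single-memcpy path was taken.
 * has_sse41 is an assumed flag replacing vlc_CPU_SSE4_1(). */
static bool copy_plane(uint8_t *dst, size_t dst_pitch,
                       const uint8_t *src, size_t src_pitch,
                       unsigned width, unsigned height,
                       bool has_sse41)
{
    assert(width <= dst_pitch && width <= src_pitch);

    /* Same condition as the patched code: with SSE4.1 available,
     * skip memcpy even when both pitches match. */
    if (!has_sse41 && src_pitch == dst_pitch) {
        memcpy(dst, src, src_pitch * height);
        return true;
    }

    copy_rows(dst, dst_pitch, src, src_pitch, width, height);
    return false;
}
```

With equal pitches, `copy_plane` takes the `memcpy` path only when `has_sse41` is false. Before this commit, equal pitches alone were enough to select it.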


