[vlc-commits] chroma: copy: favor uswc copy with SSE4.1
Thomas Guillem
git at videolan.org
Mon Mar 19 12:15:19 CET 2018
vlc/vlc-3.0 | branch: master | Thomas Guillem <thomas at gllm.fr> | Thu Mar 15 08:11:33 2018 +0100| [eb97c600321e2ce92548ee0eb69b6cf30044be1f] | committer: Thomas Guillem
chroma: copy: favor uswc copy with SSE4.1
This commit improves the Y plane copy speed from GPU images.
(cherry picked from commit 3a0f600339464b609a8b82bd1ed021e28cb882bd)
Signed-off-by: Thomas Guillem <thomas at gllm.fr>
> http://git.videolan.org/gitweb.cgi/vlc/vlc-3.0.git/?a=commit;h=eb97c600321e2ce92548ee0eb69b6cf30044be1f
---
modules/video_chroma/copy.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/modules/video_chroma/copy.c b/modules/video_chroma/copy.c
index 446ebb7eac..973df6f10e 100644
--- a/modules/video_chroma/copy.c
+++ b/modules/video_chroma/copy.c
@@ -418,7 +418,8 @@ static void SSE_CopyPlane(uint8_t *dst, size_t dst_pitch,
     const unsigned hstep = cache_size / w16;
     assert(hstep > 0);

-    if (src_pitch == dst_pitch)
+    /* If SSE4.1: CopyFromUswc is faster than memcpy */
+    if (!vlc_CPU_SSE4_1() && src_pitch == dst_pitch)
         memcpy(dst, src, src_pitch * height);
     else
     for (unsigned y = 0; y < height; y += hstep) {
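
The effect of the new condition can be illustrated with a small standalone sketch. It is not the VLC implementation: `copy_plane_sketch`, `enum copy_path` and the `has_sse41` flag (standing in for `vlc_CPU_SSE4_1()`) are hypothetical names, the `w16` rounding and the copied row width are assumptions since the hunk does not show them, and plain row-wise `memcpy` calls stand in for the SSE routines such as `CopyFromUswc`. It only shows which branch is taken before and after the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical labels for the two code paths; not part of VLC. */
enum copy_path { PATH_MEMCPY, PATH_BLOCKED };

/*
 * Sketch of the branch selection in the patched SSE_CopyPlane().
 * has_sse41 stands in for vlc_CPU_SSE4_1(). The row-wise memcpy calls
 * stand in for the SSE copy routines, which are not reproduced here.
 */
static enum copy_path copy_plane_sketch(uint8_t *dst, size_t dst_pitch,
                                        const uint8_t *src, size_t src_pitch,
                                        uint8_t *cache, size_t cache_size,
                                        unsigned height, bool has_sse41)
{
    /* Assumption: cache rows are the source pitch rounded up to 16 bytes. */
    const size_t w16 = (src_pitch + 15) & ~(size_t)15;
    const unsigned hstep = (unsigned)(cache_size / w16);
    assert(hstep > 0);

    /* Patched condition: a single memcpy only when SSE4.1 is unavailable. */
    if (!has_sse41 && src_pitch == dst_pitch) {
        memcpy(dst, src, src_pitch * height);
        return PATH_MEMCPY;
    }

    /* Assumption: copy the smaller of the two pitches per row. */
    const size_t width = src_pitch < dst_pitch ? src_pitch : dst_pitch;

    /* Blocked path: move hstep rows at a time through the cache buffer. */
    for (unsigned y = 0; y < height; y += hstep) {
        const unsigned hblock = (height - y < hstep) ? height - y : hstep;

        /* Stage 1: source rows into the cache. */
        for (unsigned r = 0; r < hblock; r++)
            memcpy(cache + (size_t)r * w16,
                   src + (size_t)(y + r) * src_pitch, width);

        /* Stage 2: cache rows into the destination. */
        for (unsigned r = 0; r < hblock; r++)
            memcpy(dst + (size_t)(y + r) * dst_pitch,
                   cache + (size_t)r * w16, width);
    }
    return PATH_BLOCKED;
}
```

With `has_sse41` set to `false` and equal pitches, the sketch takes the single `memcpy` path, which matches the behaviour before the patch. With `has_sse41` set to `true`, the same input goes through the blocked loop instead, which is the change this commit makes.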