[x264-devel] Shrink the i4x4_mode cost_table array
Henrik Gramner
git at videolan.org
Mon Dec 25 20:40:02 CET 2017
x264 | branch: master | Henrik Gramner <henrik at gramner.com> | Sat Oct 14 14:11:26 2017 +0200| [06c8f6bab0fc8fa9b2df9a1af5d10c87c515edb4] | committer: Anton Mitrofanov
Shrink the i4x4_mode cost_table array
Only 17 elements are actually used. It was originally padded to 64 bytes to
avoid cache line splits in the x86 assembly, but those haven't really been
an issue on x86 CPUs made in the past decade or so.
Benchmarking shows no performance impact from dropping the padding, so
might as well remove it and save some cache.
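
To put a rough number on the cache saving, the two layouts can be compared with a small sketch. This is illustrative only: the value 69 for QP_MAX is an assumption (the real value depends on the build's bit depth), C11 _Alignas stands in for the ALIGNED_64 macro, and the type and function names are hypothetical rather than taken from x264.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 69 is an illustrative QP_MAX, not necessarily the value
 * used by any particular x264 build. */
#define QP_MAX 69

/* Layout before the change: 32 entries per QP row, 64-byte aligned.
 * _Alignas(64) is a stand-in for x264's ALIGNED_64 macro. */
typedef struct
{
    uint16_t ref[QP_MAX+1][3][33];
    _Alignas(64) uint16_t i4x4_mode[QP_MAX+1][32];
} cost_table_old_t;

/* Layout after the change: only the 17 used entries, no forced alignment. */
typedef struct
{
    uint16_t ref[QP_MAX+1][3][33];
    uint16_t i4x4_mode[QP_MAX+1][17];
} cost_table_new_t;

/* Each padded row filled exactly one 64-byte cache line; the new row is 34 bytes. */
static_assert( sizeof( ((cost_table_old_t*)0)->i4x4_mode[0] ) == 64, "old row size" );
static_assert( sizeof( ((cost_table_new_t*)0)->i4x4_mode[0] ) == 34, "new row size" );

/* Total bytes occupied by the i4x4_mode array in each layout (hypothetical helpers). */
static size_t i4x4_bytes_old( void )
{
    return sizeof( ((cost_table_old_t*)0)->i4x4_mode );
}

static size_t i4x4_bytes_new( void )
{
    return sizeof( ((cost_table_new_t*)0)->i4x4_mode );
}

/* Bytes saved in the i4x4_mode array alone, ignoring struct padding. */
static size_t i4x4_bytes_saved( void )
{
    return i4x4_bytes_old() - i4x4_bytes_new();
}
```

Under the assumed QP_MAX, calling i4x4_bytes_saved() from any test program gives the reduction for the array itself. The struct as a whole may shrink slightly more, since the 64-byte alignment requirement on the member also goes away.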
> http://git.videolan.org/gitweb.cgi/x264.git/?a=commit;h=06c8f6bab0fc8fa9b2df9a1af5d10c87c515edb4
---
common/common.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/common/common.h b/common/common.h
index fe2b1c7f..27a56fbd 100644
--- a/common/common.h
+++ b/common/common.h
@@ -349,7 +349,7 @@ struct x264_t
     struct
     {
         uint16_t ref[QP_MAX+1][3][33];
-        ALIGNED_64( uint16_t i4x4_mode[QP_MAX+1][32] );
+        uint16_t i4x4_mode[QP_MAX+1][17];
     } *cost_table;
     const uint8_t *chroma_qp_table; /* includes both the nonlinear luma->chroma mapping and chroma_qp_offset */