[x264-devel] [Git][videolan/x264][stable] 35 commits: x86: Always use PIC in x86-64 asm
Anton Mitrofanov
gitlab at videolan.org
Wed Jul 17 20:25:38 CEST 2019
Anton Mitrofanov pushed to branch stable at VideoLAN / x264
Commits:
275ef533 by Henrik Gramner at 2019-03-06T19:45:50Z
x86: Always use PIC in x86-64 asm
Most x86-64 operating systems nowadays don't even allow .text relocations
in object files any more, and there is no measurable overall performance
difference from using RIP-relative addressing in x264 asm.
Enforcing PIC reduces complexity and simplifies testing.
- - - - -
8f6ac77f by Luca Barbato at 2019-03-06T19:45:50Z
ppc: Cleanup quant
- - - - -
4dd83955 by Luca Barbato at 2019-03-06T19:45:50Z
ppc: Add quant_4x4x4
4x faster than C.
- - - - -
6e74eb5a by Luca Barbato at 2019-03-06T19:45:50Z
ppc: Rework the adds in satd8x8
10% faster.
- - - - -
e0d846a6 by Luca Barbato at 2019-03-06T19:45:51Z
ppc: Factor out the sum of absolute
And use it on the other satd > 8.
5-10% faster depending on the size.
- - - - -
83acefef by Luca Barbato at 2019-03-06T19:45:51Z
ppc: Rework satd_4* likewise
Now 4x4 is as slow as C and 4x8 is 2% faster than before.
- - - - -
28fb2661 by Luca Barbato at 2019-03-06T19:45:51Z
ppc: Use xxpermdi to halve the computation in sad_x4_8x8
About 20% faster.
- - - - -
18262ee3 by Luca Barbato at 2019-03-06T19:45:51Z
ppc: Use a single store to write the scores for sad_x4_8x8
Yet another use of xxpermdi, another 10% gain.
- - - - -
0d111333 by Luca Barbato at 2019-03-06T19:45:51Z
ppc: Use xxpermdi in VEC_STORE8
Around a 2% speedup to the overall encoding for --slow.
- - - - -
40688108 by Luca Barbato at 2019-03-06T19:45:51Z
ppc: Use the vec_xst_len for partial stores
Seems to give about a 1-2% overall speedup on --slow.
- - - - -
69dfb289 by Luca Barbato at 2019-03-06T19:45:51Z
ppc: Use vec_splats in mc
No overall speedup, just tidier code.
- - - - -
de380f4a by Luca Barbato at 2019-03-06T19:45:51Z
ppc: Use the vec_xst_len for partial stores in mc
Around a 1% speedup to the overall encoding for --slow.
- - - - -
57baac4e by Alexandra Hájková at 2019-03-06T19:45:51Z
ppc: Use xxpermdi in sad_x3/x4 and use macros to avoid redundant code
- - - - -
92d36908 by Yusuke Nakamura at 2019-03-06T19:45:51Z
Signal Progressive and Constrained profiles
Progressive High, Constrained High, and Progressive High 10.
Even in Main profile, constraint_set4_flag is now set to 1 if progressive,
and constraint_set5_flag is set to 1 if no B-slices are present.
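The rule stated in this commit message can be sketched in C as follows. This is only an illustration of the described behaviour, not the code from the commit: the struct `sps_constraints_t`, the function `derive_constraints`, and its parameters are hypothetical names, and "no B-slices" is approximated here by a maximum B-frame count of zero.

```c
#include <stdbool.h>

/* Hypothetical container for the two SPS constraint bits (not x264's own struct). */
typedef struct {
    bool constraint_set4_flag; /* 1 when the stream is progressive */
    bool constraint_set5_flag; /* 1 when the stream contains no B-slices */
} sps_constraints_t;

/* Derive both flags from two encoder settings.
 * Assumption: max_bframes == 0 is taken to mean "no B-slices are present". */
static sps_constraints_t derive_constraints(bool interlaced, int max_bframes)
{
    sps_constraints_t c;
    c.constraint_set4_flag = !interlaced;
    c.constraint_set5_flag = (max_bframes == 0);
    return c;
}
```

Under these assumptions, a progressive encode with B-frames enabled would set only constraint_set4_flag, while an interlaced encode without B-frames would set only constraint_set5_flag.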
- - - - -
74c051f2 by Henrik Gramner at 2019-03-06T19:45:52Z
cli: Bash autocomplete support
Allows for automatic command line completion for both options and values.
Options such as --input-csp and --input-fmt will dynamically retrieve
supported values from libavformat when compiled with lavf support.
Execute 'source tools/bash-autocomplete.sh' in bash to enable.
- - - - -
ec1d3230 by Henrik Gramner at 2019-03-06T19:45:52Z
Bump dates to 2019
- - - - -
b7e9935c by Henrik Gramner at 2019-03-06T19:45:53Z
x86inc: Turn 'movsxd' into 'movifnidn' on x86-32
- - - - -
82721eae by Henrik Gramner at 2019-03-06T19:45:53Z
x86inc: Add x86-32 PIC support macros
- - - - -
6f85b3c4 by Henrik Gramner at 2019-03-06T19:45:53Z
x86inc: Make 'non-adjacent' default in the TAIL_CALL macro
- - - - -
101bd27d by Henrik Gramner at 2019-03-06T19:45:53Z
x86inc: Support N_PEXT bit on Mach-O
Allows for marking symbols as having limited global scope, similar to
using 'hidden' symbol visibility on ELF.
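For comparison, the ELF-side notion of 'hidden' visibility mentioned here is what C code usually requests through a compiler attribute. A minimal sketch, assuming GCC or Clang; the macro `HIDDEN` and both function names are hypothetical and do not come from x264 or x86inc:

```c
/* GCC/Clang-specific attribute. On ELF it keeps the symbol out of the
 * dynamic symbol table, while other object files linked into the same
 * library can still call it. */
#define HIDDEN __attribute__((visibility("hidden")))

/* Internal helper: usable across the library, not exported from it. */
HIDDEN int internal_helper(int x)
{
    return x * 2;
}

/* Exported entry point with default visibility. */
int public_entry(int x)
{
    return internal_helper(x) + 1;
}
```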
- - - - -
d3fa8b97 by Henrik Gramner at 2019-03-06T19:45:53Z
x86inc: Improve warnings for use of unsupported instructions
Warn when the following are used without the appropriate cpuflag:
* YMM and ZMM registers
* 'pextrw' with a memory operand
* GPR instruction set extensions
- - - - -
3e5aed95 by Henrik Gramner at 2019-03-06T19:45:53Z
x86inc: Add support for GFNI instructions
- - - - -
120ed3af by Anton Mitrofanov at 2019-03-06T19:45:53Z
Remove h->rc dereferencing where possible
- - - - -
d4099dd4 by Anton Mitrofanov at 2019-03-06T19:45:53Z
Remove compatibility workarounds
This will break decoding with older versions of FFmpeg/Libav.
- - - - -
5493be84 by Henrik Gramner at 2019-03-14T13:31:22Z
Fix warning in autocomplete.c when compiled with lavf
- - - - -
98ee9d2f by Konstantin Pavlov at 2019-07-16T11:34:18Z
Added gitlab CI
Supported targets:
- debian amd64
- debian aarch64
- windows 32 bit
- windows 64 bit
- macos 64bit
The tests are run on all supported targets (via wine on windows).
The release jobs are only available on master/stable branches in
videolan/x264 repository, and must be run manually when a developer
wishes to upload the artifacts.
- - - - -
352c0263 by Konstantin Pavlov at 2019-07-16T21:06:24Z
CI: Use a newer aarch64 image
It now includes pkg-config, so lavf can be detected.
- - - - -
bd8a88be by Konstantin Pavlov at 2019-07-16T21:06:53Z
CI: Bump macos target to darwin18
- - - - -
6381798d by Anton Mitrofanov at 2019-07-17T17:15:34Z
Fix heap-buffer-overflow read detected by ASan with interlaced encoding
Bug report by Hongxu Chen.
- - - - -
3147fa43 by Anton Mitrofanov at 2019-07-17T17:15:34Z
checkasm: Fix heap-buffer-overflow read detected by ASan
- - - - -
f06062f5 by Anton Mitrofanov at 2019-07-17T17:15:34Z
Fix integer overflow detected by UBSan in --weightp analysis
Bug report by Xuezhi Yan.
- - - - -
6b1170cb by Anton Mitrofanov at 2019-07-17T17:15:34Z
Shut up UBSan about uninitialized data read
Result was never used in that case.
- - - - -
6d494708 by Anton Mitrofanov at 2019-07-17T17:15:34Z
Fix x264_picture_alloc with X264_CSP_I400 colorspace
- - - - -
f9af2a0f by Anton Mitrofanov at 2019-07-17T17:15:34Z
Revert r2959: Signal Progressive and Constrained profiles
Some hardware decoders refuse to decode streams with non-zero
constraint_set4_flag/constraint_set5_flag.
- - - - -
34c06d1c by Anton Mitrofanov at 2019-07-17T17:15:34Z
Strip git-hash from version in x264.pc
pkg-config doesn't like spaces in version string.
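The idea of dropping everything after the first space can be sketched in C. This illustrates the string handling only and is not the actual change, which lives in the build scripts; the function `strip_after_space` is hypothetical, and the version string in the usage note is an assumed example of a "version, space, git hash" layout.

```c
#include <stddef.h>
#include <string.h>

/* Copy src into dst up to (not including) the first space, truncating
 * if dst is too small. Always NUL-terminates when dst_size > 0.
 * Returns the number of characters written, excluding the terminator. */
static size_t strip_after_space(const char *src, char *dst, size_t dst_size)
{
    if (dst_size == 0)
        return 0;
    size_t n = strcspn(src, " ");   /* length of the prefix before any space */
    if (n >= dst_size)
        n = dst_size - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

With a 32-byte buffer, an input such as "0.157.2969 d4099dd" would leave "0.157.2969" in the buffer, which contains no spaces.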
- - - - -
30 changed files:
- + .gitlab-ci.yml
- Makefile
- + autocomplete.c
- common/aarch64/asm-offsets.c
- common/aarch64/asm-offsets.h
- common/aarch64/asm.S
- common/aarch64/bitstream-a.S
- common/aarch64/bitstream.h
- common/aarch64/cabac-a.S
- common/aarch64/dct-a.S
- common/aarch64/dct.h
- common/aarch64/deblock-a.S
- common/aarch64/deblock.h
- common/aarch64/mc-a.S
- common/aarch64/mc-c.c
- common/aarch64/mc.h
- common/aarch64/pixel-a.S
- common/aarch64/pixel.h
- common/aarch64/predict-a.S
- common/aarch64/predict-c.c
- common/aarch64/predict.h
- common/aarch64/quant-a.S
- common/aarch64/quant.h
- common/arm/asm.S
- common/arm/bitstream-a.S
- common/arm/bitstream.h
- common/arm/cpu-a.S
- common/arm/dct-a.S
- common/arm/dct.h
- common/arm/deblock-a.S
The diff was not included because it is too large.
View it on GitLab: https://code.videolan.org/videolan/x264/compare/72db437770fd1ce3961f624dd57a8e75ff65ae0b...34c06d1c17ad968fbdda153cb772f77ee31b3095