[x264-devel] [PATCH 00/32] Bitdepth selection at runtime

Vittorio Giovara vittorio.giovara at gmail.com
Fri Jan 20 15:20:25 CET 2017


Hello,

this set of patches modifies the x264 codebase and build system in order to
allow bit depth selection at runtime rather than at build time. The set is
coauthored with Luca Barbato, and would not have been possible without the
help and support of several people from the x264 community.

The main idea is to compile libx264 multiple times, once per bit depth, and
to rename the bitdepth-specific functions with a template operation for C
and all the supported architectures. Since x264cli is tightly entwined with
libx264 internals, a couple of its modules have been duplicated as well.
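
As a rough illustration of the renaming idea (not the code from the patches),
the sketch below gives each bit depth its own suffixed copy of a function and
picks one at runtime. All names here (TEMPLATE, demo_pixel_max, and so on) are
hypothetical, and a single file stands in for what the set does by compiling
the same sources several times.

```c
#include <stdint.h>

/* Hypothetical helpers: paste a bit depth suffix onto a function name. */
#define GLUE(a, b, c) a##b##c
#define TEMPLATE(name, depth) GLUE(name, _, depth)

/*
 * Stand-in for "compile the same source once per bit depth": each
 * expansion emits a separate function, e.g. demo_pixel_max_8().
 */
#define DEFINE_PIXEL_MAX(depth)                      \
    static int TEMPLATE(demo_pixel_max, depth)(void) \
    {                                                \
        return (1 << (depth)) - 1;                   \
    }

DEFINE_PIXEL_MAX(8)
DEFINE_PIXEL_MAX(10)

/* Runtime wrapper: forward to the copy built for the requested depth. */
int demo_pixel_max(int bit_depth)
{
    switch (bit_depth) {
    case 8:
        return demo_pixel_max_8();
    case 10:
        return demo_pixel_max_10();
    default:
        return -1; /* depth not compiled in */
    }
}
```

With these definitions, demo_pixel_max(8) returns 255, demo_pixel_max(10)
returns 1023, and any other depth returns -1.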

The set is split roughly into five parts:
- introductory work so that function signatures are the same across the board
- refactoring of the common module into multiple submodules, so that
  bitdepth-dependent defines from common.h are not pulled into specific parts
- creation of the headers that will take care of the templating
- modifications to the cli behavior
- final edits to configure and checkasm

There are a few open questions and tasks left in this project:
- the rename used to enforce symbol versioning in encoder_open had to be
  removed since it interfered with the template renames
- bit depth of 9 is not supported out of the box any more
- checkasm target had to be duplicated and split in two separate modules,
  for each supported bitdepth
- the way to list all functions to be templated might not be the most
  efficient or cleanest one
- OpenCL has to be always-off instead of always-on since the preferred
  behavior is to allow multiple bitdepths and OpenCL only supports 8bit
- the software has been tested and run on OS X and Linux with gcc and clang,
  for every supported architecture. However, no tests on Windows or non-x86
  archs have been run.

A fully ready snapshot is available at my github repository
https://github.com/kodabb/x264.git (bitdepth branch).

I tried to describe the more complex patches in the commit log, but I'm open
to requests for additional details or changes from the review.
Cheers,
Vittorio

Anton Mitrofanov (1):
  Use the correct private prefix defined in x86inc

Vittorio Giovara (31):
  Add the standard x264 prefix to integral functions in arm and aarch64
  aarch64: Move the standard function prefix to a single place
  arm: Move the standard function prefix to a single place
  ppc: Adjust altivec function suffix
  common: Move a function declaration to the appropriate header
  x264.h: Disable renaming x264_encoder_open()
  common: Move log helper functions to a separate file
  log: Add an internal log function and use it where needed
  osdep: Decouple module from common.h
  cpu: Decouple module from common.h
  common: Move memory functions to a separate file
  common: Move picture functions to a separate file
  common: Move mathematics functions to a separate file
  common: Move parameter functions to a separate file
  common: Move shared tables to a common file
  param: Modify default qp parameters to maximum allowable
  Adjust headers to make x264cli independent of common.h
  Add API to set bitdepth at runtime
  Generate a header listing every symbol that needs to be duplicated
  Include an implementation file to wrap the appropriate call
  Move global symbols to the implementation file
  Generate a header listing every assembly symbol
  Enable assembly templating for x86 architecture
  Enable assembly templating for arm/aarch64 architectures
  Enable assembly templating for ppc/mips architectures
  x264cli: Duplicate depth and cache filters
  x264cli: Duplicate threaded input module
  x264: Add --output-depth command line option
  Modify configure to enable selecting bitdepth at build time
  Duplicate checkasm targets
  Bump X264_BUILD version

 .gitignore                            |    2 +
 Makefile                              |  143 +++-
 common/aarch64/asm.S                  |   13 +-
 common/aarch64/bitstream-a.S          |    2 +-
 common/aarch64/cabac-a.S              |   14 +-
 common/aarch64/dct-a.S                |  106 +--
 common/aarch64/deblock-a.S            |   26 +-
 common/aarch64/mc-a.S                 |   88 +--
 common/aarch64/mc-c.c                 |   16 +-
 common/aarch64/pixel-a.S              |   92 +--
 common/aarch64/predict-a.S            |   66 +-
 common/aarch64/quant-a.S              |   36 +-
 common/api.c                          |  197 +++++
 common/arm/asm.S                      |   13 +-
 common/arm/bitstream-a.S              |    2 +-
 common/arm/cpu-a.S                    |   12 +-
 common/arm/dct-a.S                    |   94 +--
 common/arm/deblock-a.S                |   30 +-
 common/arm/mc-a.S                     |  118 +--
 common/arm/mc-c.c                     |   16 +-
 common/arm/pixel-a.S                  |  118 +--
 common/arm/predict-a.S                |   60 +-
 common/arm/quant-a.S                  |   32 +-
 common/common.c                       | 1400 ---------------------------------
 common/common.h                       |   95 +--
 common/cpu.c                          |    7 +-
 common/cpu.h                          |    2 +
 common/frame.c                        |   16 +-
 common/{ppc/pixel.h => log.c}         |   42 +-
 common/{ppc/pixel.h => log.h}         |   21 +-
 common/{ppc/pixel.h => mathematics.c} |   35 +-
 common/mathematics.h                  |  112 +++
 common/mc.c                           |    2 +-
 common/mem.c                          |  126 +++
 common/{ppc/mc.h => mem.h}            |   30 +-
 common/osdep.c                        |    2 +-
 common/osdep.h                        |    3 +
 common/{common.c => param.c}          |  296 +------
 common/picture.c                      |  105 +++
 common/pixel.c                        |    2 +-
 common/ppc/mc.c                       |    2 +-
 common/ppc/mc.h                       |    2 +-
 common/ppc/pixel.c                    |    2 +-
 common/ppc/pixel.h                    |    2 +-
 common/set.h                          |    1 +
 common/tables.c                       |   49 ++
 common/{ppc/pixel.h => tables.h}      |   16 +-
 common/x86/cabac-a.asm                |    8 +-
 common/x86/mc-a.asm                   |    4 +-
 common/x86/pixel-a.asm                |    2 +-
 common/x86/trellis-64.asm             |    4 +-
 configure                             |   33 +-
 encoder/encoder.c                     |    8 +-
 encoder/rdo.c                         |    2 +
 encoder/set.c                         |   29 +-
 filters/filters.c                     |    2 +
 filters/video/cache.c                 |   10 +-
 filters/video/depth.c                 |   22 +-
 filters/video/fix_vfr_pts.c           |    1 +
 filters/video/resize.c                |    2 +
 filters/video/select_every.c          |    6 +-
 filters/video/video.c                 |   10 +-
 input/input.c                         |    1 +
 input/input.h                         |    8 +-
 input/lavf.c                          |    2 +
 input/thread.c                        |    5 +
 input/timecode.c                      |    3 +
 input/y4m.c                           |    2 +
 output/flv.c                          |    3 +
 output/flv_bytestream.c               |    1 +
 output/flv_bytestream.h               |    2 +
 output/matroska_ebml.c                |    3 +
 output/mp4_lsmash.c                   |    1 +
 output/raw.c                          |    1 +
 tools/api.list                        |  449 +++++++++++
 tools/asm.list                        |  288 +++++++
 tools/duplicate-asm.sh                |   26 +
 tools/duplicate.sh                    |   15 +
 x264.c                                |   56 +-
 x264.h                                |   11 +-
 x264cli.h                             |    5 +-
 81 files changed, 2287 insertions(+), 2404 deletions(-)
 create mode 100644 common/api.c
 copy common/{ppc/pixel.h => log.c} (53%)
 copy common/{ppc/pixel.h => log.h} (71%)
 copy common/{ppc/pixel.h => mathematics.c} (55%)
 create mode 100644 common/mathematics.h
 create mode 100644 common/mem.c
 copy common/{ppc/mc.h => mem.h} (63%)
 copy common/{common.c => param.c} (81%)
 create mode 100644 common/picture.c
 create mode 100644 common/tables.c
 copy common/{ppc/pixel.h => tables.h} (79%)
 create mode 100644 tools/api.list
 create mode 100644 tools/asm.list
 create mode 100755 tools/duplicate-asm.sh
 create mode 100755 tools/duplicate.sh

-- 
2.10.0


