[x264-devel] [OpenCL] Please whitelist mesa's OpenCL implementation

Niccolò Belli darkbasic at linuxsystems.it
Thu Feb 13 16:09:49 CET 2014


Hi,
I'm using Gentoo Linux amd64 with the latest graphics stack from git (mesa 
git, llvm git, linux 3.14, etc.) and x264 git. My graphics card is an AMD 
Radeon HD 7950.

Test file (~90s/~200MB): 
https://mega.co.nz/#!eQhSjJQR!EEe8-taN5IspIu-RW0WQzmvKzc5fkCn282kS5ugZ_as

ffmpeg -i PlanetEarthBirds.mkv -c:v libx264 -preset slower -crf 18 
-x264opts opencl -c:a copy slower-cfr18-ocl.mkv

This is what I get when encoding with OpenCL enabled under Catalyst 14.1_beta:

[libx264 @ 0x257e360] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
[libx264 @ 0x257e360] OpenCL acceleration enabled with Advanced Micro 
Devices, Inc. Tahiti (SI)

Unfortunately, when I use mesa's OpenCL implementation, I get:

[libx264 @ 0x1409360] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
[libx264 @ 0x1409360] OpenCL: Unable to find a compatible device
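When switching back and forth between the two drivers, the outcome can be checked automatically from the encoder log. This is only a minimal sketch: it matches the two message strings quoted above, and the function name `opencl_status` is hypothetical, not part of x264 or ffmpeg.

```python
def opencl_status(log_text):
    """Classify a libx264 log by the two OpenCL messages quoted in this mail.

    Returns "enabled", "no-device", or "unknown" (neither message found).
    """
    for line in log_text.splitlines():
        if "OpenCL acceleration enabled" in line:
            return "enabled"
        if "OpenCL: Unable to find a compatible device" in line:
            return "no-device"
    return "unknown"


# Log lines copied from the two runs above.
catalyst_log = (
    "[libx264 @ 0x257e360] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX\n"
    "[libx264 @ 0x257e360] OpenCL acceleration enabled with Advanced Micro "
    "Devices, Inc. Tahiti (SI)\n"
)
mesa_log = (
    "[libx264 @ 0x1409360] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX\n"
    "[libx264 @ 0x1409360] OpenCL: Unable to find a compatible device\n"
)

print("Catalyst:", opencl_status(catalyst_log))
print("mesa:", opencl_status(mesa_log))
```

The log text would normally come from ffmpeg's stderr; how it is captured is left out here.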


This is my clinfo with radeonsi (mesa):

~ $ clinfo

clinfo: /usr/lib64/libOpenCL.so.1: no version information available 
(required by clinfo)

clinfo: /usr/lib64/libOpenCL.so.1: no version information available 
(required by clinfo)

Number of platforms:				 1
   Platform Profile:				 FULL_PROFILE
   Platform Version:				 OpenCL 1.1 MESA 10.2.0-devel
   Platform Name:				 Default
   Platform Vendor:				 Mesa
   Platform Extensions:				 cl_khr_icd


   Platform Name:				 Default
Number of devices:				 1
   Device Type:					 CL_DEVICE_TYPE_GPU
   Device ID:					 4098
   Max compute units:				 1
   Max work items dimensions:			 3
     Max work items[0]:				 256
     Max work items[1]:				 256
     Max work items[2]:				 256
   Max work group size:				 256
   Preferred vector width char:			 16
   Preferred vector width short:			 8
   Preferred vector width int:			 4
   Preferred vector width long:			 2
   Preferred vector width float:			 4
   Preferred vector width double:		 2
   Native vector width char:			 16
   Native vector width short:			 8
   Native vector width int:			 4
   Native vector width long:			 2
   Native vector width float:			 4
   Native vector width double:			 2
   Max clock frequency:				 0Mhz
   Address bits:					 32
   Max memory allocation:			 500000000
   Image support:				 Yes
   Max number of images read arguments:		 32
   Max number of images write arguments:		 32
   Max image 2D width:				 32768
   Max image 2D height:				 32768
   Max image 3D width:				 32768
   Max image 3D height:				 32768
   Max image 3D depth:				 32768
   Max samplers within kernel:			 0
   Max size of kernel argument:			 1024
   Alignment (bits) of base address:		 128
   Minimum alignment (bytes) for any datatype:	 128
   Single precision floating point capability
     Denorms:					 Yes
     Quiet NaNs:					 Yes
     Round to nearest even:			 Yes
     Round to zero:				 No
     Round to +ve and infinity:			 No
     IEEE754-2008 fused multiply-add:		 No
   Cache type:					 None
   Cache line size:				 0
   Cache size:					 0
   Global memory size:				 2000000000
   Constant buffer size:				 0
   Max number of constant args:			 0
   Local memory type:				 Scratchpad
   Local memory size:				 32768
   Kernel Preferred work group size multiple:	 1
   Error correction support:			 0
   Unified memory for Host and Device:		 1
   Profiling timer resolution:			 0
   Device endianess:				 Little
   Available:					 Yes
   Compiler available:				 Yes
   Execution capabilities:				
     Execute OpenCL kernels:			 Yes
     Execute native function:			 No
   Queue properties:				
     Out-of-Order:				 No
     Profiling :					 Yes
   Platform ID:					 0x00007f04160dcfc0
   Name:						 AMD TAHITI
   Vendor:					 X.Org
   Device OpenCL C version:			 OpenCL C 1.1
   Driver version:				 10.2.0-devel
   Profile:					 FULL_PROFILE
   Version:					 OpenCL 1.1 MESA 10.2.0-devel
   Extensions:



And this is my clinfo with Catalyst 14.1_beta:

~ $ clinfo | head -83

Number of platforms:				 1
   Platform Profile:				 FULL_PROFILE
   Platform Version:				 OpenCL 1.2 AMD-APP (1411.4)
   Platform Name:				 AMD Accelerated Parallel Processing
   Platform Vendor:				 Advanced Micro Devices, Inc.
   Platform Extensions:				 cl_khr_icd cl_amd_event_callback 
cl_amd_offline_devices cl_amd_hsa


   Platform Name:				 AMD Accelerated Parallel Processing
Number of devices:				 2
   Device Type:					 CL_DEVICE_TYPE_GPU
   Device ID:					 4098
   Board name:					 AMD Radeon HD 7900 Series
   Device Topology:				 PCI[ B#3, D#0, F#0 ]
   Max compute units:				 28
   Max work items dimensions:			 3
     Max work items[0]:				 256
     Max work items[1]:				 256
     Max work items[2]:				 256
   Max work group size:				 256
   Preferred vector width char:			 4
   Preferred vector width short:			 2
   Preferred vector width int:			 1
   Preferred vector width long:			 1
   Preferred vector width float:			 1
   Preferred vector width double:		 1
   Native vector width char:			 4
   Native vector width short:			 2
   Native vector width int:			 1
   Native vector width long:			 1
   Native vector width float:			 1
   Native vector width double:			 1
   Max clock frequency:				 810Mhz
   Address bits:					 32
   Max memory allocation:			 1073741824
   Image support:				 Yes
   Max number of images read arguments:		 128
   Max number of images write arguments:		 8
   Max image 2D width:				 16384
   Max image 2D height:				 16384
   Max image 3D width:				 2048
   Max image 3D height:				 2048
   Max image 3D depth:				 2048
   Max samplers within kernel:			 16
   Max size of kernel argument:			 1024
   Alignment (bits) of base address:		 2048
   Minimum alignment (bytes) for any datatype:	 128
   Single precision floating point capability
     Denorms:					 No
     Quiet NaNs:					 Yes
     Round to nearest even:			 Yes
     Round to zero:				 Yes
     Round to +ve and infinity:			 Yes
     IEEE754-2008 fused multiply-add:		 Yes
   Cache type:					 Read/Write
   Cache line size:				 64
   Cache size:					 16384
   Global memory size:				 2923429888
   Constant buffer size:				 65536
   Max number of constant args:			 8
   Local memory type:				 Scratchpad
   Local memory size:				 32768
   Kernel Preferred work group size multiple:	 64
   Error correction support:			 0
   Unified memory for Host and Device:		 0
   Profiling timer resolution:			 1
   Device endianess:				 Little
   Available:					 Yes
   Compiler available:				 Yes
   Execution capabilities:				
     Execute OpenCL kernels:			 Yes
     Execute native function:			 No
   Queue properties:				
     Out-of-Order:				 No
     Profiling :					 Yes
   Platform ID:					 0x00007f59736ee500
   Name:						 Tahiti
   Vendor:					 Advanced Micro Devices, Inc.
   Device OpenCL C version:			 OpenCL C 1.2
   Driver version:				 1411.4 (VM)
   Profile:					 FULL_PROFILE
   Version:					 OpenCL 1.2 AMD-APP (1411.4)
   Extensions:					 cl_khr_fp64 cl_amd_fp64 
cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics 
cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics 
cl_khr_int64_base_atomics cl_khr_int64_extended_atomics 
cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing
cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 
cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt 
cl_khr_image2d_from_buffer cl_khr_spir
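To make the differences between the two clinfo dumps easier to spot, here is a rough Python sketch that parses the "Key: value" lines and lists the fields that differ. It is an illustration only: the helper names `parse_clinfo` and `diff_fields` are hypothetical, the parsing is a heuristic tuned to the output shown above (wrapped extension lines are recognised by their `cl_` prefix), and the two inputs are short excerpts rather than the full dumps. I don't know which of these fields, if any, x264 actually checks.

```python
def parse_clinfo(text):
    """Parse 'Key: value' lines of clinfo output into a dict.

    Heuristic: wrapped extension lists (lines without a colon that start
    with 'cl_') are appended to the previous key. Loader warnings that
    start with 'clinfo:' are ignored.
    """
    fields = {}
    last_key = None
    for raw in text.splitlines():
        line = raw.strip()
        key, sep, value = line.partition(":")
        if sep:
            key = key.strip()
            if key == "clinfo":
                last_key = None
                continue
            fields[key] = value.strip()
            last_key = key
        elif last_key is not None and line.startswith("cl_"):
            fields[last_key] = (fields[last_key] + " " + line).strip()
    return fields


def diff_fields(a, b):
    """Return {key: (a_value, b_value)} for keys that differ or exist on one side only."""
    out = {}
    for key in sorted(set(a) | set(b)):
        if a.get(key) != b.get(key):
            out[key] = (a.get(key), b.get(key))
    return out


# Excerpts (not the full output) copied from the two dumps above.
MESA_EXCERPT = """
   Max compute units:                 1
   Image support:                     Yes
   Max samplers within kernel:        0
   Extensions:
"""
CATALYST_EXCERPT = """
   Max compute units:                 28
   Image support:                     Yes
   Max samplers within kernel:        16
   Extensions:                        cl_khr_fp64 cl_amd_fp64
cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics
"""

differences = diff_fields(parse_clinfo(MESA_EXCERPT), parse_clinfo(CATALYST_EXCERPT))
for name, (mesa_value, catalyst_value) in differences.items():
    print(f"{name}: mesa={mesa_value!r} catalyst={catalyst_value!r}")
```

Feeding it the complete output of both `clinfo` runs should work the same way, though only the first value is kept if a key repeats.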


Is it possible to whitelist mesa's OpenCL implementation? Can someone 
please write a patch to force-enable OpenCL so that I can at least test 
if it works with mesa?

Thanks,
Niccolò
-- 
http://www.linuxsystems.it

