[x265] why not PME slow down the x265 encoding process?

steve at borho.org steve at borho.org
Tue May 19 06:07:06 CEST 2015


On 05/19, Zhuo Li wrote:
> Hi,
> 
> I ran the x265 encoder on a workstation with 120 threads (CPU4890 V2, 128 GB RAM, NUMA on).
> 
> 
> 
> The command lines are:
> 
> 1) ./x265 --input-res 1920x1080 --input source_500frame_1080p_20fps.yuv --bitrate 1200 --vbv-maxrate 1380 --vbv-bufsize 1000 --psnr --tune psnr --ctu 32 --fps 20  -ref 8 --preset slow -o out.hevc
> 2) ./x265 --input-res 1920x1080 --input source_500frame_1080p_20fps.yuv --bitrate 1200 --vbv-maxrate 1380 --vbv-bufsize 1000 --psnr --tune psnr --ctu 32 --fps 20 --pme   -ref 8 --preset slow -o out.hevc
> 3) ./x265 --input-res 1920x1080 --input source_500frame_1080p_20fps.yuv --bitrate 1200 --vbv-maxrate 1380 --vbv-bufsize 1000 --psnr --tune psnr --ctu 32 --fps 20 --pmode   -ref 8 --preset slow -o out.hevc
> 4) ./x265 --input-res 1920x1080 --input source_500frame_1080p_20fps.yuv --bitrate 1200 --vbv-maxrate 1380 --vbv-bufsize 1000 --psnr --tune psnr --ctu 32 --fps 20 --pmode --pme   -ref 8 --preset slow -o out.hevc
> 
> Results:
> 
>     pmode  pme  -ref  speed (frames/s)  PSNR    CPU occupation
> 1)  0      0    8     7.94              40.195  2200%
> 2)  0      1    8     8.08              40.206  4800%
> 3)  1      0    8     11.70             40.265  4000%
> 4)  1      1    8     10.86             40.230  6300%?
> 
> Note: 1 = on, 0 = off.
> 
> The results show that with PME on, the encoding speed barely changes without --pmode and drops with it. But the documentation says that the encoder will distribute motion estimation across multiple worker threads when more than two references require motion searches for a given CU.
> 
> With PME on, the whole process should be faster, but it is not.
> 
> Is it a bug?

No. The main issue is that motion estimation is memory bandwidth bound,
so doing 4 searches simultaneously on 4 different cores does not make it
go any faster. The CPUs are all fighting to reach the same memory
through the same caches. This fighting over scarce resources is half the
reason it goes slower; the other half is the overhead of using more
threads.

--pme is almost never a good idea on a normal CPU. I would rip it out
but I figure it might be helpful on some future platform.
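
The numbers in the quoted results table make the point concrete. The
following is only a rough sketch that redoes the arithmetic on those
reported figures. The names `runs` and `summarize` are made up for
illustration, "fps per busy core" is an informal measure (speed divided
by CPU occupation / 100), and the 6300% value was marked uncertain in
the original report.

```python
# Figures copied from the results table in the quoted message.
# cpu_percent is the reported total CPU occupation; the 6300% value
# was marked with "?" in the report, so treat that row as approximate.
runs = {
    "baseline":  {"fps": 7.94,  "cpu_percent": 2200},
    "pme":       {"fps": 8.08,  "cpu_percent": 4800},
    "pmode":     {"fps": 11.70, "cpu_percent": 4000},
    "pmode+pme": {"fps": 10.86, "cpu_percent": 6300},
}


def summarize(runs, baseline="baseline"):
    """Return speedup over the baseline and fps per busy core for each run."""
    base_fps = runs[baseline]["fps"]
    summary = {}
    for name, r in runs.items():
        cores_busy = r["cpu_percent"] / 100.0  # 100% == one fully busy core
        summary[name] = {
            "speedup": r["fps"] / base_fps,
            "fps_per_core": r["fps"] / cores_busy,
        }
    return summary


if __name__ == "__main__":
    for name, s in summarize(runs).items():
        print(f"{name:10s} speedup {s['speedup']:.2f}x  "
              f"fps per busy core {s['fps_per_core']:.3f}")
```

On these figures, --pme alone more than doubles CPU occupation for
roughly a 2% speed gain, and adding it to --pmode makes the encode about
7% slower than --pmode alone.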

-- 
Steve Borho

