[x264-devel] Re: Defining video quality

Thu Nov 30 20:27:30 CET 2006

Hello Axel,

As Guillaume wrote, x264 can calculate SSIM as well as PSNR.  I
recommend using that for most work.  SSIM is a very good metric, and
probably the only one that is both fast and better than PSNR.

If you are looking for the absolute best metric as opposed to a merely
very good metric, and also for how to compare metrics, you may wish to
read the VQEG FR-TV Phase I and Phase II reports:
http://www.its.bldrdoc.gov/vqeg/projects/frtv_phaseI/COM-80E_final_report.pdf
http://www.its.bldrdoc.gov/vqeg/projects/frtv_phaseII/downloads/VQEGII_Final_Report.pdf

In those tests, the best metric is NTIA (ANSI T1.801.03-2003, ITU
BT.1683).  SSIM also does quite well.  Interestingly, many well-known
metrics such as JND (Tektronix/Sarnoff), DVQ (Watson/NASA) and PDM
(EPFL) are either about the same as, or worse than, plain Y-PSNR (!).

About PSNR: if you encode the same continuous (no scenecuts) sequence
with the same algorithm with several different settings, the version
with the higher PSNR is almost always visually better.  Thus PSNR does a
reasonably good job of allowing you to optimise codec parameters that
you can vary during an encode (like RDO or rate control).
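
For reference, here is a minimal sketch of how Y-PSNR can be computed,
assuming 8-bit luma planes held as flat sequences of integers.  The
function name and the plane layout are illustrative assumptions, not
x264's code:

```python
import math

def y_psnr(ref, enc, peak=255):
    """Y-PSNR in dB between two equally sized luma planes (flat sequences)."""
    if len(ref) != len(enc) or not ref:
        raise ValueError("planes must be non-empty and the same size")
    # Mean squared error over all luma samples.
    mse = sum((a - b) ** 2 for a, b in zip(ref, enc)) / len(ref)
    if mse == 0:
        return math.inf  # identical planes
    return 10.0 * math.log10(peak * peak / mse)
```

Higher is better, and only the difference between two encodes of the
same source is meaningful.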

However, PSNR fails on several counts:
1) The absolute value of PSNR means nothing.  In other words, looking at
two pairs of original/encoded images both with PSNR of 40 (for example),
one pair may look identical and the other may have gross artifacts.  It
all depends on the source footage and on the type of algorithm.  This
means that PSNR is useless for comparing significantly different
algorithms.  It also means that using PSNR to allocate bits across
scenecuts will result in bad decisions.
2) Some artifacts which are essentially invisible have a huge PSNR
impact (try scaling Y uniformly or adding a small amount of Gaussian
noise to each pixel for example).  Other things which are extremely
visible have an unreasonably low PSNR impact (like blocking).
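
Point 2 can be illustrated with a toy experiment.  This is only a
sketch: the 64x64 gradient frame, the 5% scale factor and the crude
"flatten each 8x8 block to its mean" stand-in for blocking are all
assumptions chosen for illustration, not real codec output:

```python
import math

W = H = 64  # toy frame size (assumption)

def psnr(ref, enc, peak=255):
    mse = sum((a - b) ** 2 for a, b in zip(ref, enc)) / len(ref)
    return math.inf if mse == 0 else 10.0 * math.log10(peak * peak / mse)

# Smooth diagonal gradient as a stand-in luma plane, row-major.
ref = [64 + x + y for y in range(H) for x in range(W)]

# Hard-to-see change: scale Y uniformly by 5%.
scaled = [round(0.95 * v) for v in ref]

def flatten_blocks(plane, n=8):
    """Replace every n x n block by its mean value (crude blocking)."""
    out = list(plane)
    for by in range(0, H, n):
        for bx in range(0, W, n):
            idx = [(by + j) * W + bx + i for j in range(n) for i in range(n)]
            mean = round(sum(plane[k] for k in idx) / len(idx))
            for k in idx:
                out[k] = mean
    return out

# Easy-to-see change: the gradient turns into a staircase of flat blocks.
blocky = flatten_blocks(ref)

psnr_scaled = psnr(ref, scaled)
psnr_blocky = psnr(ref, blocky)
print(f"uniform 5% dimming:   {psnr_scaled:.1f} dB")
print(f"8x8 block flattening: {psnr_blocky:.1f} dB")
```

On this toy frame the blocky version scores several dB higher than the
uniformly dimmed one, even though the blocks are by far the more
visible defect.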

SSIM fixes these problems to a significant degree.  It is not *the best*
video/image quality metric, but it is the best that can be computed
quickly.  SSIM absolute values can be reasonably compared: as a very
rough guide, below 0.7 is barely watchable, 0.8-0.85 has some visible
distortion but ok for most people, and 0.9 and higher are
indistinguishable from the original.

Please note that SSIM in x264 is not calculated the way that the "real"
SSIM algorithm (and papers) say it should be.  There is no windowing, no
luma masking, no motion masking, and only a sample of all possible
positions is calculated.  However, it is within a few percent of the
"real" SSIM in most cases, while being about a hundred times faster to
calculate.  "Most cases" here pretty much means any natural footage; it
may be possible to construct synthetic footage so as to defeat the
approximations used, but you are extremely unlikely to ever encounter
such.  If you want real SSIM, bug me to release my code for that ;)
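
To give an idea of what an un-windowed, sampled SSIM looks like, here
is a rough sketch.  It is not x264's implementation: the 8x8 block
size, the step of 4 and the function name are assumptions made for
illustration.  Only the SSIM formula itself and the usual constants
(K1 = 0.01, K2 = 0.03) are standard:

```python
def block_ssim(ref, enc, width, height, block=8, step=4, peak=255):
    """Mean SSIM of two luma planes over a grid of plain (un-windowed) blocks."""
    c1 = (0.01 * peak) ** 2
    c2 = (0.03 * peak) ** 2
    n = block * block
    total, count = 0.0, 0
    # Visit only a grid of block positions rather than every pixel.
    for by in range(0, height - block + 1, step):
        for bx in range(0, width - block + 1, step):
            idx = [(by + j) * width + bx + i
                   for j in range(block) for i in range(block)]
            a = [ref[k] for k in idx]
            b = [enc[k] for k in idx]
            mu_a = sum(a) / n
            mu_b = sum(b) / n
            # Plain block statistics: every sample has equal weight.
            var_a = sum(v * v for v in a) / n - mu_a * mu_a
            var_b = sum(v * v for v in b) / n - mu_b * mu_b
            cov = sum(p * q for p, q in zip(a, b)) / n - mu_a * mu_b
            num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
            den = (mu_a * mu_a + mu_b * mu_b + c1) * (var_a + var_b + c2)
            total += num / den
            count += 1
    if count == 0:
        raise ValueError("frame is smaller than one block")
    return total / count
```

Identical planes give 1.0; anything else gives a lower score.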

Regards,
--Alex

On Thu, 2006-11-30 at 10:16 -0800, Axel Gunter wrote:
> While comparing x264 with JM (10.2) I had been using PSNR for comparing 
> the qualities of the two encoders. I understand JM is an encoder 
> designed to achieve better PSNR rather than track true motion, or 
> increase texture detail. I was wondering if x264 was based on a similar 
> theory?
> Also, what other metrics are usually being used in the industry to 
> define video quality, since PSNR can be misleading at times. Any 
> pointers/comments?
> 
> Thanks
> Axel
> 

-- 
This is the x264-devel mailing-list
To unsubscribe, go to: http://developers.videolan.org/lists.html