[x265] patch for faster intra

Steve Borho steve at borho.org
Mon Aug 4 04:52:52 CEST 2014


On 08/02, dave wrote:
> A few other things...
> 
> In my testing of encoding a single frame, where I only examined processing of
> the first few CUs, this search method always found the lowest cost, but the
> cost values never followed a consistent curve with a single low point.  Without
> further testing or deeper knowledge of the angular intra modes, I wouldn't
> guarantee the lowest cost will always be found.

It's understood there will be a trade-off between compression and
performance, so this is OK.  It would be interesting to measure the
average "error", i.e. the difference in cost between the lowest cost
found by the fast scan and the lowest cost found by the exhaustive scan.
If nothing else, we can advertise this number in the documentation for
the option.
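
A minimal sketch of how such a number could be computed, assuming the
best cost from each search has been logged per block. The `CostPair`
struct and `meanCostGap` function are hypothetical names for
illustration, not part of x265:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-block record: the best cost found by each search.
struct CostPair
{
    uint32_t fastCost;        // lowest cost found by the fast scan
    uint32_t exhaustiveCost;  // lowest cost found by testing every mode
};

// Mean relative excess cost of the fast scan, as a fraction
// (0.0 means the fast scan always matched the exhaustive scan).
double meanCostGap(const std::vector<CostPair>& blocks)
{
    double sum = 0.0;
    size_t counted = 0;
    for (const CostPair& b : blocks)
    {
        if (b.exhaustiveCost == 0)
            continue; // skip blocks where a relative gap is undefined
        sum += (double(b.fastCost) - double(b.exhaustiveCost)) / double(b.exhaustiveCost);
        ++counted;
    }
    return counted ? sum / double(counted) : 0.0;
}
```

Reporting the gap relative to the exhaustive cost keeps blocks of
different sizes comparable; an absolute difference would work as well if
only one block size is measured.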

> Also, unfortunately my system is old and doesn't support SSE4, and that is
> the only level of assembly supported for intra mode predictions, so I was
> only able to develop and test with the C primitives.  I need to upgrade...

That's unfortunate; x265 really prefers SSE4.

> Yes, there was a small performance increase that seemed larger when encoding
> a single frame than a short video.  Neither produced consistent results on
> my system but the new search method was slightly faster most of the time
> when encoding a single frame and always faster when encoding a short video.

Looking at the code, I had forgotten about the details of picking
filtered or unfiltered sources based on the size and angle. This is
something else that the 'all-angs' function does for you implicitly.

I'm pretty sure the fast scan would be faster if it ran the 'all-angs'
function and then only measured satd/sa8d in the same pattern you have
now.  The only gotcha will be to make sure you compare with transposed
source pixels for those modes that require it.
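
A rough sketch of that idea, assuming the predictions for all angular
modes (2..34) have already been generated in one pass. `cost` is a
hypothetical callback that returns the satd of the stored prediction for
a mode and is assumed to compare against the transposed source for the
modes that need it. The step pattern (every fourth mode, then +/-2, then
+/-1) is only an illustration and is not necessarily the pattern used in
the patch:

```cpp
#include <cstdint>
#include <functional>
#include <limits>

// Hypothetical coarse-to-fine scan over the angular modes 2..34.
// Only the cost measurement is skipped for untested modes; the
// predictions themselves are assumed to exist already.
int fastAngularScan(const std::function<uint32_t(int)>& cost, int& evaluated)
{
    const int first = 2, last = 34;
    int best = first;
    uint32_t bestCost = std::numeric_limits<uint32_t>::max();
    evaluated = 0;

    auto tryMode = [&](int m) {
        if (m < first || m > last)
            return;                 // outside the angular range
        ++evaluated;
        uint32_t c = cost(m);
        if (c < bestCost) { bestCost = c; best = m; }
    };

    // Coarse pass: modes 2, 6, 10, ..., 34.
    for (int m = first; m <= last; m += 4)
        tryMode(m);

    // Refine around the coarse winner at distance 2.
    int center = best;
    tryMode(center - 2);
    tryMode(center + 2);

    // Final refinement at distance 1 around the current winner.
    center = best;
    tryMode(center - 1);
    tryMode(center + 1);

    return best;
}
```

This measures at most 13 of the 33 angular modes. As noted earlier in
the thread, the cost curve is not guaranteed to have a single low point,
so the result may not be the true minimum.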

> On 08/01/2014 09:50 PM, Steve Borho wrote:
> >On 08/01, dave wrote:
> >>I am submitting a patch to implement the faster intra search suggested by
> >>Steve Borho here:
> >>
> >>https://mailman.videolan.org/pipermail/x265-devel/2014-July/004873.html
> >nice!
> >
> >>The patch implements the faster search in slicetype.cpp.  The same
> >>approach can also be used in analysis.cpp, but it's a little more than a
> >>simple cut and paste, though it shouldn't take long.
> >yep
> >
> >>TEncSearch.cpp also calls intra_pred_allangs for which this is the faster
> >>alternative but at first look, TEncSearch.cpp is more complex.  Depending on
> >>what is desired, either this search could greatly simplify this part of
> >>TEncSearch.cpp or it might not be applicable to what TEncSearch.cpp is
> >>doing.  I will be looking into it.
> >it should also apply here; this function is a little more complicated
> >because it keeps a "best N" list of modes, and then performs
> >rate-distortion measurements (encodes each option and records the actual
> >distortion and bit cost) to select the final intra mode.  The same 'fast
> >scan' mode could be used to build the 'best N' list. But I imagine most
> >presets that use this RDO version of intra will not want a fast scan.
> >
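
To illustrate the "best N" list mentioned above, here is a minimal
sketch of a container that keeps the N lowest-cost modes seen so far.
The `ModeCost` and `BestN` names are hypothetical and this is not the
TEncSearch.cpp implementation; it only shows how a fast scan could feed
such a list before the rate-distortion pass:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical candidate record: an intra mode and its estimated cost.
struct ModeCost
{
    int mode;
    uint32_t cost;
};

// Keeps the N lowest-cost candidates seen so far, sorted by ascending cost.
class BestN
{
public:
    explicit BestN(size_t n) : m_max(n) {}

    void add(int mode, uint32_t cost)
    {
        ModeCost mc = { mode, cost };
        // Find the first entry with a strictly higher cost and insert before it.
        auto pos = std::upper_bound(m_list.begin(), m_list.end(), mc,
            [](const ModeCost& a, const ModeCost& b) { return a.cost < b.cost; });
        m_list.insert(pos, mc);
        if (m_list.size() > m_max)
            m_list.pop_back();      // drop the worst candidate
    }

    const std::vector<ModeCost>& candidates() const { return m_list; }

private:
    size_t m_max;
    std::vector<ModeCost> m_list;
};
```

Each mode the scan measures would be passed to `add()`, and the
surviving candidates would then go through the full rate-distortion
measurement.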

-- 
Steve Borho

