[x265] Weekly summary for week ending 20/5/16

Ramya Sriraman ramya at multicorewareinc.com
Mon May 23 06:07:45 CEST 2016


*Highlights*


*Details*
I designed and implemented an ARM NEON algorithm for DCT16x16. Because ARM
registers are very limited, the algorithm processes a 16x4 block at a time
and loops 4 times to cover all of the DCT-1D rows. The DCT-2D pass is
similar but works on 32-bit intermediates. The 32-bit multiplication is the
bottleneck here: it takes 4 cycles, compared to a single cycle for 16-bit
multiplication.
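
To illustrate the 16x4 strip structure described above, here is a minimal scalar sketch in C++. It is not the NEON code and not x265's integer transform: the function names (basis, dct16_rows_in_strips, check_flat_input) are hypothetical, and the basis is a floating-point orthonormal DCT-II, assumed here only to show the loop shape (4 rows per strip, 4 strips per block).

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: orthonormal 16-point DCT-II basis value.
// Assumption: floating point is used for clarity; the real transform
// uses an integer coefficient table.
static double basis(int k, int n)
{
    const double pi = 3.14159265358979323846;
    const double scale = (k == 0) ? std::sqrt(1.0 / 16.0) : std::sqrt(2.0 / 16.0);
    return scale * std::cos((2 * n + 1) * k * pi / 32.0);
}

// Row pass over a 16x16 block, handled as 4 strips of 4 rows each.
// src and dst are row-major 16x16 arrays.
void dct16_rows_in_strips(const int16_t* src, double* dst)
{
    for (int strip = 0; strip < 4; ++strip)      // 4 iterations cover 16 rows
    {
        for (int r = 0; r < 4; ++r)              // one 16x4 strip
        {
            const int row = strip * 4 + r;
            for (int k = 0; k < 16; ++k)
            {
                double acc = 0.0;
                for (int n = 0; n < 16; ++n)
                    acc += basis(k, n) * src[row * 16 + n];
                dst[row * 16 + k] = acc;
            }
        }
    }
}

// Usage example: a flat block should produce only a DC term in each row.
void check_flat_input()
{
    int16_t src[256];
    double dst[256];
    for (int i = 0; i < 256; ++i)
        src[i] = 1;
    dct16_rows_in_strips(src, dst);
    assert(std::fabs(dst[0] - 4.0) < 1e-9);   // DC = 16 * sqrt(1/16)
    assert(std::fabs(dst[1]) < 1e-9);         // AC terms vanish
}
```

In a NEON version, the inner strip would presumably be held in vector registers, which is why the strip height is tied to the register budget.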

*Plans*
Write an example for psyCost_pp<2> (psyCost_pp_4x4).
I need about 2 more weeks to finish DCT16x16. The function is large and
complex, so I need more time to debug and adjust my algorithm and code.
Each run of the debug top (modified from our Testbench) in the simulation
environment takes about 20 minutes on average.

Thank you
Regards
Ramya