September 2014 Archives by thread
Starting: Mon Sep 1 10:26:25 CEST 2014
Ending: Tue Sep 30 17:59:13 CEST 2014
Messages: 391
- [x265] [PATCH] analysis: CU structure now holds CU-specific information,
ashok at multicorewareinc.com
- [x265] [PATCH] Entropy: Replaced getCtxQtCbf() with table
ashok at multicorewareinc.com
- [x265] [PATCH] asm: avx2 asm code for dct4
dnyaneshwar at multicorewareinc.com
- [x265] HEVC Daily Log - Murugan/Yuvaraj 9/2/2014
Murugan Vairavel
- [x265] [PATCH] count_nonzero primitive optimization, downscaling quantCoef from int32_t* to int16_t*
praveen at multicorewareinc.com
- [x265] [PATCH] dequant_normal optimization, downscaling quantCoef from int32_t* to int16_t*
praveen at multicorewareinc.com
- [x265] [PATCH] dequant_normal asm code optimization as per new interface
praveen at multicorewareinc.com
- [x265] [PATCH] dequant_scaling optimization, downscaling quantCoef from int32_t* to int16_t*
praveen at multicorewareinc.com
- [x265] [PATCH] quant.cpp, cleaned redundant code
praveen at multicorewareinc.com
- [x265] [PATCH] nquant optimization, downscaling qCoef from int32_t* to int16_t*
praveen at multicorewareinc.com
- [x265] [PATCH] rdoQuant optimization, downscaling dstCoeff from int32_t* to int16_t*
praveen at multicorewareinc.com
- [x265] [PATCH] signBitHidingHDQ optimization, downscaling coeff from int32_t* to int16_t*
praveen at multicorewareinc.com
- [x265] [PATCH] cvt16to32_cnt optimization
praveen at multicorewareinc.com
- [x265] [PATCH] conv16to32_count C interface modification, downscaling coeff from int32_t* to int16_t*
praveen at multicorewareinc.com
- [x265] [PATCH] conv16to32_count renamed to copy_count as per new interface
praveen at multicorewareinc.com
- [x265] [PATCH] temporarily disable avx2 version of copy_cnt primitive, need to update as per new interface
praveen at multicorewareinc.com
- [x265] [PATCH] added copy_shr primitive
praveen at multicorewareinc.com
- [x265] [PATCH] added copy_shl primitive
praveen at multicorewareinc.com
- [x265] [PATCH] optimize cvt32to16_shl by replacing copy_shl
praveen at multicorewareinc.com
- [x265] [PATCH] quant_c optimization, downscaling qCoef from int32_t* to int16_t*
praveen at multicorewareinc.com
- [x265] [PATCH] quant path cleanup
praveen at multicorewareinc.com
- [x265] [PATCH] TComDataCU: Reduced repeated function call to calculate depth range
ashok at multicorewareinc.com
- [x265] change index of m_buOffsetY[] from raster to zscan
Satoshi Nakagawa
- [x265] [PATCH] asm: enable 16bpp primitives of cvt32to16 and cvt16to32 for all block sizes
murugan at multicorewareinc.com
- [x265] [PATCH] fix: hash/binary mismatch for new CU structure holds CU-specific info
ashok at multicorewareinc.com
- [x265] [PATCH] Resolve gcc warnings
dtyx265 at gmail.com
- [x265] [PATCH] Cleaned up TComDataCU::getQuadtreeTULog2MinSizeInCU for clarity and a bit of performance
dtyx265 at gmail.com
- [x265] [PATCH 1 of 3] testbench(nquant): the Round value must be less than (2 ^ qbits)
Min Chen
- [x265] [PATCH] x86asm: warn when inappropriate instruction used in function with specified cpuflags
murugan at multicorewareinc.com
- [x265] fix cbf context
Satoshi Nakagawa
- [x265] 1.3+94-0e0d0309e616 - GCC error at 28%
JMK
- [x265] divide, encode, combine?
Michael Nordberg
- [x265] note: history rewrite
Steve Borho
- [x265] [PATCH] asm: replace ssse3 instruction in pixel_ssd_ss_*_sse2
Min Chen
- [x265] [PATCH 1 of 3] asm: optimize nquant by PSIGND, improve 11k cycles -> 9.8k cycles
Min Chen
- [x265] [PATCH 2 of 3] asm: reenable IACA support, it was removed by 'inappropriate instruction...' patch
Min Chen
- [x265] [PATCH 1 of 2] asm: fix output mistake in pixel_ssd_ss_4xN
Min Chen
- [x265] fix getQuadtreeTULog2MinSizeInCU()
Satoshi Nakagawa
- [x265] [PATCH] count_nonzero asm code, reduced code size by combining mova and packsswb
praveen at multicorewareinc.com
- [x265] [PATCH] copy_cnt 4x4, eliminated move instructions, +1x improvement
praveen at multicorewareinc.com
- [x265] [PATCH] copy_cnt: nits
praveen at multicorewareinc.com
- [x265] [PATCH] copy_cnt 4x4 AVX2 asm code, as per new interface
praveen at multicorewareinc.com
- [x265] [PATCH] copy_cnt_4 asm code, corrected register uses
praveen at multicorewareinc.com
- [x265] [PATCH] asm: avx2 assembly code for dct16
yuvaraj at multicorewareinc.com
- [x265] [PATCH] frameencoder: remove second encodeCU() pass over CTUs when SAO is disabled
Steve Borho
- [x265] [PATCH] x86inc.asm: fix vpbroadcastd bug on Mac platform
Min Chen
- [x265] [PATCH 1 of 2] asm: reduce number of movd in dequant_normal
Min Chen
- [x265] [PATCH] Add iteration-skip to subpel refine
shevaxu
- [x265] fix CHECKED_BUILD
Satoshi Nakagawa
- [x265] fix sao
Satoshi Nakagawa
- [x265] [PATCH] Analysis: compressIntraCU clean up
ashok at multicorewareinc.com
- [x265] [PATCH] analysis: modified compressInterCU_rd5_6() with CU-specific information
ashok at multicorewareinc.com
- [x265] target processor detection message in CMakeLists.txt
djcj
- [x265] [PATCH 1 of 4] testbench(quant): the Round value must be less than (2 ^ qbits)
Min Chen
- [x265] [PATCH] dpb: select best TMVP candidate from among all of the reference frames
gopu at multicorewareinc.com
- [x265] [PATCH] copy_cnt replaced align load with unaligned load to avoid code crash, we are not sure about alignment of dst buffer
praveen at multicorewareinc.com
- [x265] [PATCH] copy_cnt_4: enable fast non zero coefficient count path
praveen at multicorewareinc.com
- [x265] [PATCH] search.cpp: fixed type conversion warning
praveen at multicorewareinc.com
- [x265] [PATCH] copy_cnt_4: combine mova and paddb to reduce code size, same speedup
praveen at multicorewareinc.com
- [x265] [PATCH] copy_cnt_4: faster AVX2 code
praveen at multicorewareinc.com
- [x265] [PATCH] vps: vps_timing_info is always present
deepthi at multicorewareinc.com
- [x265] [PATCH] vps: general_frame_only_constraint_flag is true in progressive videos
deepthi at multicorewareinc.com
- [x265] [PATCH 1 of 4] add intra-inter data structures and param options
sagar at multicorewareinc.com
- [x265] [PATCH] api: use generic names for analysis api
sagar at multicorewareinc.com
- [x265] [PATCH] copy_cnt_8 AVX2 asm code, as per new interface
praveen at multicorewareinc.com
- [x265] [PATCH] analysis: modified compressInterCU_rd0_4() with CU-specific information
ashok at multicorewareinc.com
- [x265] [PATCH] rc: use m_frameDuration instead of rce->frameDuration to derive complexity for each frame in 2nd pass
aarthi at multicorewareinc.com
- [x265] [PATCH] copy_cnt_8, AVX2 asm code as per new interface, performance improved from 5.13x to 7.59x on HASWELL-I5
praveen at multicorewareinc.com
- [x265] [PATCH] copy_cnt_16, AVX2 asm code as per new interface, performance improved from 14.22x to 23.57x on HASWELL-I5
praveen at multicorewareinc.com
- [x265] [PATCH] copy_cnt_32, AVX2 asm code as per new interface, performance improved from 16.81x to 32.16x on HASWELL-I5
praveen at multicorewareinc.com
- [x265] [PATCH] copy_cnt: enable avx2 version of asm code
praveen at multicorewareinc.com
- [x265] [PATCH 1 of 4] add analysis data structures and param options
sagar at multicorewareinc.com
- [x265] [PATCH] removed copy_cnt_4 avx2 asm code: SSE version is equally fast
praveen at multicorewareinc.com
- [x265] [PATCH] search: measure RDO of intra modes within 25% of least cost [CHANGES OUTPUTS]
Steve Borho
- [x265] [PATCH] copy_cnt_4 avx2 asm code: nit, same speedup by sse version
praveen at multicorewareinc.com
- [x265] [PATCH] copy_cnt_16: avx2 asm code as per new interface, improved 514.32 cycles -> 313.66 cycles
praveen at multicorewareinc.com
- [x265] [PATCH] copy_cnt_32: avx2 asm code as per new interface, improved 1521.17 cycles -> 934.46 cycles
praveen at multicorewareinc.com
- [x265] [PATCH] Resolved gcc compiler error of mismatched type
dtyx265 at gmail.com
- [x265] sao: some cleanups
Satoshi Nakagawa
- [x265] [PATCH] asm: avx2 assembly code for dct32x32
murugan at multicorewareinc.com
- [x265] [PATCH RFC] analysis: add CU specific details to encodeCU()
santhoshini at multicorewareinc.com
- [x265] [PATCH] Search: remove redundant encode coefficients in intra for performance
ashok at multicorewareinc.com
- [x265] [PATCH] analysis: remove redundant arguments, clean up variable names
deepthi at multicorewareinc.com
- [x265] [PATCH] asm: fix mismatch due to dct32 avx2 assembly code
murugan at multicorewareinc.com
- [x265] [PATCH] analysis: Intra picture estimation information sharing
gopu at multicorewareinc.com
- [x265] [PATCH] denoiseDct: unit test code
praveen at multicorewareinc.com
- [x265] [PATCH] rc: bug fix for 2 pass when bframes = 0. fixes Issue #77
aarthi at multicorewareinc.com
- [x265] [PATCH] rc: check for changes in scenecut input between multiple passes
aarthi at multicorewareinc.com
- [x265] [PATCH] add fanout validation module to check param compatibility
sagar at multicorewareinc.com
- [x265] [PATCH] param: preset tuning changes
Steve Borho
- [x265] [PATCH] analysis: add CU specific details to encodeCU()
santhoshini at multicorewareinc.com
- [x265] [PATCH] denoiseDct: test bench code
praveen at multicorewareinc.com
- [x265] [PATCH] rc: fixes for 2 pass + vbv to calculate frameSizePlanned accurately
aarthi at multicorewareinc.com
- [x265] [PATCH] analysis: intra picture estimation (mode and split decision)information sharing
gopu at multicorewareinc.com
- [x265] [PATCH] denoiseDct test code: fixed typo
praveen at multicorewareinc.com
- [x265] [PATCH] api: do not reuse the analysisData buffer for more than one picture, set it NULL
gopu at multicorewareinc.com
- [x265] inline simple functions
Satoshi Nakagawa
- [x265] [PATCH] Analysis: fix for binary mismatch for pass 2 in compressIntraCU()
ashok at multicorewareinc.com
- [x265] [PATCH] Analysis: fix for binary mismatch for pass 2 in compressSharedIntraCTU()
ashok at multicorewareinc.com
- [x265] [PATCH] denoiseDct unit test code: fixed bound value problem
praveen at multicorewareinc.com
- [x265] [PATCH] denoiseDct asm code: nit faulty code, need a new SSE version
praveen at multicorewareinc.com
- [x265] [PATCH] denoiseDct: nit unused asm function declarations
praveen at multicorewareinc.com
- [x265] [PATCH] denoiseDct: SSE version of asm code
praveen at multicorewareinc.com
- [x265] [PATCH] rc: fix bugs in using boundary condition for cu while encoding each frame
aarthi at multicorewareinc.com
- [x265] [PATCH] search: cleanup and remove redundant variable in checkintra
gopu at multicorewareinc.com
- [x265] [PATCH] search: dump best motion statistics for P and B slices into analysisdata file
gopu at multicorewareinc.com
- [x265] [PATCH] asm: avx2 assembly code for idct16x16
murugan at multicorewareinc.com
- [x265] [PATCH] denoise_dct asm code: SSE version
praveen at multicorewareinc.com
- [x265] [PATCH] denoise_dct: avx2 asm code
praveen at multicorewareinc.com
- [x265] [PATCH] search: remove redundant local variables in encodeResAndCalcRdSkipCU
gopu at multicorewareinc.com
- [x265] [PATCH] copy_cnt_16: avx2 asm code, improved 514.32 cycles -> 313.66 cycles
praveen at multicorewareinc.com
- [x265] [PATCH] copy_cnt_32: avx2 asm code, improved 1521.17 cycles -> 934.46 cycles
praveen at multicorewareinc.com
- [x265] [PATCH] search: simplify and remove redundant variables in getBestIntraModeChroma
gopu at multicorewareinc.com
- [x265] [PATCH] denoiseDct: align performance data while reporting speedup
praveen at multicorewareinc.com
- [x265] primitives: intra_pred[4][35] => intra_pred[35][4] (avoid *35)
Satoshi Nakagawa
- [x265] [PATCH] blockcopy_pp: 32x8, 32x16, 32x24, 32x32, 32x48, 32x64 AVX version of asm code, approx double speedup compared to SSE
praveen at multicorewareinc.com
- [x265] [PATCH] blockcopy_pp: 64x16, 64x32, 64x48, 64x64 AVX version of asm code, approx double speedup compared to SSE
sagar at multicorewareinc.com
- [x265] [PATCH 01 of 14] motion: avoid extra iterations when no subpel motion found
Steve Borho
- [x265] [PATCH] psy-rd: fix bug in chroma psyEnergy for intra 4x4
deepthi at multicorewareinc.com
- [x265] [PATCH] TComDataCU: replace getZorderIdxInCU() with encodeIdx of CU structure
santhoshini at multicorewareinc.com
- [x265] [PATCH] remove getNumPartInCU() and replace it with constant value
santhoshini at multicorewareinc.com
- [x265] [PATCH] remove getNumPartInCU() and replace it with macro
santhoshini at multicorewareinc.com
- [x265] [PATCH] TComDataCU: replace getTotalNumPart() with CU structure details
santhoshini at multicorewareinc.com
- [x265] [PATCH] asm: avx2 code for dct8x8
yuvaraj at multicorewareinc.com
- [x265] [PATCH] testbench.cpp: temporary fix for testbench crash
praveen at multicorewareinc.com
- [x265] [PATCH] add avx version for chroma_copy_ss 16x4, 16x8, 16x12, 16x16, 16x24, 16x32, 16x64 based on csp, approx 1.5x-2x speedup over SSE
sagar at multicorewareinc.com
- [x265] [PATCH] add avx version for chroma_copy_ss 16x4, 16x8, 16x12, 16x16, 16x24, 16x32, 16x64 based on csp, approx 1.5x-2x speedup over SSE
chen
- [x265] [PATCH] add avx version for chroma_copy_ss 16x4, 16x8, 16x12, 16x16, 16x24, 16x32, 16x64 based on csp, approx 1.5x-2x speedup over SSE
sagar at multicorewareinc.com
- [x265] [PATCH] blockcopy_ss: 64x16, 64x32, 64x48, 64x64 AVX version of asm code, approx double speedup compared to SSE
sagar at multicorewareinc.com
- [x265] x265 --> YASM 1.3.0 update
Michal Powalko
- [x265] [PATCH] asm: replace mova by movu to avoid AVX2 testbench crash in dct16, dct32, denoise_dct, its same speed on Haswell
Min Chen
- [x265] refine deblocking filter
Satoshi Nakagawa
- [x265] [PATCH] blockcopy_pp_32x8: avx asm code, improved 281.20 cycles -> 165.47
praveen at multicorewareinc.com
- [x265] [PATCH] asm-primitives.cpp: nits
praveen at multicorewareinc.com
- [x265] [PATCH] blockcopy_pp_32x16: avx asm code, improved 477.74 cycles -> 309.99
praveen at multicorewareinc.com
- [x265] [PATCH] blockcopy_pp_32x24: avx asm code, improved 621.84 cycles -> 371.94
praveen at multicorewareinc.com
- [x265] [PATCH] blockcopy_pp avx asm code: 32x32, 32x48, 32x64 improved by 803.69 -> 514.90, 1126.36 -> 655.24, 1454.09 -> 835.76 cycles
praveen at multicorewareinc.com
- [x265] [PATCH] blockcopy_pp: avx asm code indentation
praveen at multicorewareinc.com
- [x265] [PATCH] Changed FrameEncoder::m_tld to a pointer and set it to one of Encoder's ThreadLocalData instances
dtyx265 at gmail.com
- [x265] Regarding changeset 8154 "vec: remove idct8, we have SSSE3 assembly for it"
Steve Borho
- [x265] [PATCH 1 of 5] predict: inline single call of predInterBi()
Steve Borho
- [x265] [PATCH] encoder: rename cuCoder to analysis for better clarity
Steve Borho
- [x265] [PATCH] asm: avx2 assembly code for idct8x8
yuvaraj at multicorewareinc.com
- [x265] [PATCH] asm: avx2 asm code for idct32x32
murugan at multicorewareinc.com
- [x265] [PATCH] api: rename SAO options and params for clarity
Steve Borho
- [x265] [PATCH 1 of 4] analysis: hoist local function into anonymous namespace (file local)
Steve Borho
- [x265] [PATCH] search: give each Search instance an Entropy encoder (no output changes)
Steve Borho
- [x265] Recent Frames/second benchmarks per platform and per clip?
Raul Lopez
- [x265] [PATCH] Removed unnecessary call to loadCTUData
dtyx265 at gmail.com
- [x265] [PATCH] asm: avx2 assembly code for idct32x32
murugan at multicorewareinc.com
- [x265] [PATCH] Changes for loadCTUData
dtyx265 at gmail.com
- [x265] [PATCH] convert c++ reference to pointer on m_scalingList
Min Chen
- [x265] [PATCH] blockfill_s_16x16 avx2 asm code, performance improved 389.21 cycles -> 204.38 cycles
praveen at multicorewareinc.com
- [x265] [PATCH] blockfill_s_32x32 avx2 asm code, performance improved 1354.05 cycles -> 705.81 cycles
praveen at multicorewareinc.com
- [x265] [PATCH 0 of 2] TComDataCU: replace with more CU structure details
santhoshini at multicorewareinc.com
- [x265] [PATCH] blockfill_s_16x16 avx2 asm code: performance improved from 389.21 cycles to 204.38 cycles, over sse version of asm code
praveen at multicorewareinc.com
- [x265] [PATCH] blockfill_s_32x32 avx2 asm code: performance improved from 1354.05 cycles to 705.81 cycles, over sse version of asm code
praveen at multicorewareinc.com
- [x265] [PATCH] rd: move lambda and analysis qp init to rdcost.h
Steve Borho
- [x265] [PATCH] rc: apply maxAU size restrictions while encoding each frame
aarthi at multicorewareinc.com
- [x265] [PATCH] asm: avx2 assembly code for idct4x4
murugan at multicorewareinc.com
Last message date: Tue Sep 30 17:59:13 CEST 2014
Archived on: Thu Dec 11 23:20:13 CET 2014