[x265-commits] [x265] asm: testbench code for costCoeffRemain()
Sumalatha at videolan.org
Mon Aug 3 21:06:31 CEST 2015
details: http://hg.videolan.org/x265/rev/adc769232ccc
branches:
changeset: 10849:adc769232ccc
user: Sumalatha Polureddy
date: Thu Jul 23 18:14:49 2015 +0530
description:
asm: testbench code for costCoeffRemain()
Subject: [x265] rc: fix rate factor calculation after updating m_avgQpRc in rateControlEnd
details: http://hg.videolan.org/x265/rev/db59e6d9b85b
branches:
changeset: 10850:db59e6d9b85b
user: Divya Manivannan <divya at multicorewareinc.com>
date: Thu Jul 23 17:45:47 2015 +0530
description:
rc: fix rate factor calculation after updating m_avgQpRc in rateControlEnd
Subject: [x265] asm: add missing prefix, remove TODO comments
details: http://hg.videolan.org/x265/rev/b015514a9386
branches:
changeset: 10851:b015514a9386
user: Deepthi Nandakumar <deepthi at multicorewareinc.com>
date: Sun Jul 26 16:12:32 2015 +0530
description:
asm: add missing prefix, remove TODO comments
Subject: [x265] doc: example is for a D65, P3 color space
details: http://hg.videolan.org/x265/rev/eb6b04fc06a3
branches:
changeset: 10852:eb6b04fc06a3
user: Deepthi Nandakumar <deepthi at multicorewareinc.com>
date: Mon Jul 27 13:00:17 2015 +0530
description:
doc: example is for a D65, P3 color space
Subject: [x265] threadpool: fix calculation of JobProviders
details: http://hg.videolan.org/x265/rev/dc446bc5df50
branches:
changeset: 10853:dc446bc5df50
user: Deepthi Nandakumar <deepthi at multicorewareinc.com>
date: Mon Jul 27 14:07:16 2015 +0530
description:
threadpool: fix calculation of JobProviders
The frame threads are assigned in round-robin fashion to all thread pools. The
lookahead is always assigned to thread pool 0. This patch fixes crashes due to
under-allocation of job providers to thread pool 0.
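To see why the old bound could fall short, the following is a minimal sketch, not the encoder's code. It compares the two maxProviders formulas from the threadpool.cpp hunk in this diff against a direct count of providers per pool. The helper names (providersPerPool, maxProvidersOld, maxProvidersNew) are hypothetical, and the sketch assumes the round-robin assignment starts at pool 0.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: count the job providers each pool must host when
// frame threads are dealt out round-robin (assumed to start at pool 0)
// and the lookahead is always placed in pool 0.
std::vector<int> providersPerPool(int frameNumThreads, int numPools)
{
    std::vector<int> count(numPools, 0);
    for (int i = 0; i < frameNumThreads; i++)
        count[i % numPools]++;   // frame encoder i goes to pool (i mod numPools)
    count[0]++;                  // lookahead always lands in pool 0
    return count;
}

// Bound before the patch: the lookahead is averaged across all pools.
int maxProvidersOld(int frameNumThreads, int numPools)
{
    return (frameNumThreads + 1 + numPools - 1) / numPools;
}

// Bound after the patch: ceil(frameNumThreads / numPools), plus one slot
// reserved for the lookahead.
int maxProvidersNew(int frameNumThreads, int numPools)
{
    return (frameNumThreads + numPools - 1) / numPools + 1;
}
```

With 3 frame threads and 2 pools, pool 0 hosts two frame encoders plus the lookahead, so it needs 3 slots. The old formula yields 2 and the new one yields 3.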
Subject: [x265] update Main12 asm value range check on dequant_normal_c
details: http://hg.videolan.org/x265/rev/f15a1a89c434
branches:
changeset: 10854:f15a1a89c434
user: Min Chen <chenm003 at 163.com>
date: Mon Jul 27 13:03:03 2015 -0700
description:
update Main12 asm value range check on dequant_normal_c
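The updated check can be read as a standalone predicate. The sketch below restates the high-bit-depth condition from the dct.cpp hunk in this diff, with the bit depth passed as a parameter instead of the X265_DEPTH macro. The function names are hypothetical and the code is illustrative only.

```cpp
#include <cassert>

// Hypothetical standalone form of the updated high-bit-depth check:
// a scale of 32768 or more is accepted only if its two low bits are
// clear and the shift exceeds (depth - 8).
bool dequantArgsValid(int scale, int shift, int depth)
{
    return scale < 32768 || ((scale & 3) == 0 && shift > (depth - 8));
}

// The previous rule, which hard-coded the 10-bit threshold of 2.
bool dequantArgsValidOld(int scale, int shift)
{
    return scale < 32768 || ((scale & 3) == 0 && shift > 2);
}
```

For a 10-bit build the two rules agree. For a 12-bit build, a large scale with shift 3 passes the old rule but fails the new one, which requires a shift greater than 4.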
Subject: [x265] asm: fix Main12 fault on AVX2 dequant
details: http://hg.videolan.org/x265/rev/c54c3663fe1c
branches:
changeset: 10855:c54c3663fe1c
user: Min Chen <chenm003 at 163.com>
date: Mon Jul 27 13:03:06 2015 -0700
description:
asm: fix Main12 fault on AVX2 dequant
Subject: [x265] asm: fix Main12 fault on AVX2 weight_pp
details: http://hg.videolan.org/x265/rev/04beacc5cc49
branches:
changeset: 10856:04beacc5cc49
user: Min Chen <chenm003 at 163.com>
date: Mon Jul 27 13:03:08 2015 -0700
description:
asm: fix Main12 fault on AVX2 weight_pp
Subject: [x265] stats: average and maximum luma level per frame
details: http://hg.videolan.org/x265/rev/e08a24505443
branches:
changeset: 10857:e08a24505443
user: Divya Manivannan <divya at multicorewareinc.com>
date: Mon Jul 27 15:15:35 2015 +0530
description:
stats: average and maximum luma level per frame
Subject: [x265] analysis: fix for rd-0 non-deterministic output
details: http://hg.videolan.org/x265/rev/dbe8c629ccc7
branches:
changeset: 10858:dbe8c629ccc7
user: Ashok Kumar Mishra <ashok at multicorewareinc.com>
date: Tue Jul 28 16:10:30 2015 +0530
description:
analysis: fix for rd-0 non-deterministic output
Subject: [x265] stats: fix loss of precision in average luma level per frame
details: http://hg.videolan.org/x265/rev/7c83f7755422
branches:
changeset: 10859:7c83f7755422
user: Divya Manivannan <divya at multicorewareinc.com>
date: Wed Jul 29 11:41:57 2015 +0530
description:
stats: fix loss of precision in average luma level per frame
Subject: [x265] info: add qg-size to info SEI
details: http://hg.videolan.org/x265/rev/a271502d2ba8
branches:
changeset: 10860:a271502d2ba8
user: Deepthi Nandakumar <deepthi at multicorewareinc.com>
date: Mon Aug 03 09:43:21 2015 +0530
description:
info: add qg-size to info SEI
Subject: [x265] param: set qgsize to default 32 for medium and all slower presets
details: http://hg.videolan.org/x265/rev/dc5d58411210
branches:
changeset: 10861:dc5d58411210
user: Deepthi Nandakumar <deepthi at multicorewareinc.com>
date: Mon Aug 03 09:46:43 2015 +0530
description:
param: set qgsize to default 32 for medium and all slower presets
This changes outputs for command lines that use the medium preset or any slower preset.
Subject: [x265] vui: add support for transfer characteristic std-b67
details: http://hg.videolan.org/x265/rev/a3b72e2a25a7
branches:
changeset: 10862:a3b72e2a25a7
user: Deepthi Nandakumar <deepthi at multicorewareinc.com>
date: Mon Aug 03 10:28:34 2015 +0530
description:
vui: add support for transfer characteristic std-b67
Subject: [x265] vui: update help
details: http://hg.videolan.org/x265/rev/8b1ce9d894d2
branches:
changeset: 10863:8b1ce9d894d2
user: Deepthi Nandakumar <deepthi at multicorewareinc.com>
date: Mon Aug 03 16:53:46 2015 +0530
description:
vui: update help
Subject: [x265] threadpool: nit
details: http://hg.videolan.org/x265/rev/d5278c76d341
branches:
changeset: 10864:d5278c76d341
user: Steve Borho <steve at borho.org>
date: Mon Aug 03 10:18:46 2015 -0500
description:
threadpool: nit
diffstat:
doc/reST/cli.rst | 3 +-
source/CMakeLists.txt | 2 +-
source/common/dct.cpp | 2 +-
source/common/framedata.h | 3 ++
source/common/param.cpp | 7 +++--
source/common/threadpool.cpp | 2 +-
source/common/x86/asm-primitives.cpp | 9 +-----
source/common/x86/pixel-util8.asm | 15 ++++++----
source/encoder/analysis.cpp | 3 +-
source/encoder/encoder.cpp | 2 +
source/encoder/frameencoder.cpp | 15 +++++++++++
source/encoder/ratecontrol.cpp | 39 +++++++++++++++--------------
source/test/pixelharness.cpp | 47 ++++++++++++++++++++++++++++++++++++
source/test/pixelharness.h | 2 +-
source/x265-extras.cpp | 4 +-
source/x265.h | 4 ++-
source/x265cli.h | 2 +-
17 files changed, 116 insertions(+), 45 deletions(-)
diffs (truncated from 458 to 300 lines):
diff -r 24c1ee516d13 -r d5278c76d341 doc/reST/cli.rst
--- a/doc/reST/cli.rst Wed Jul 22 15:42:15 2015 +0530
+++ b/doc/reST/cli.rst Mon Aug 03 10:18:46 2015 -0500
@@ -1583,6 +1583,7 @@ VUI fields must be manually specified.
15. bt2020-12
16. smpte-st-2084
17. smpte-st-428
+ 18. std-b67
.. option:: --colormatrix <integer|string>
@@ -1616,7 +1617,7 @@ VUI fields must be manually specified.
integers. The SEI includes X,Y display primaries for RGB channels,
white point X,Y and max,min luminance values. (HDR)
- Example for P65D3 1000-nits:
+ Example for D65P3 1000-nits:
G(13200,34500)B(7500,3000)R(34000,16000)WP(15635,16450)L(10000000,1)
diff -r 24c1ee516d13 -r d5278c76d341 source/CMakeLists.txt
--- a/source/CMakeLists.txt Wed Jul 22 15:42:15 2015 +0530
+++ b/source/CMakeLists.txt Mon Aug 03 10:18:46 2015 -0500
@@ -30,7 +30,7 @@ option(STATIC_LINK_CRT "Statically link
mark_as_advanced(FPROFILE_USE FPROFILE_GENERATE NATIVE_BUILD)
# X265_BUILD must be incremented each time the public API is changed
-set(X265_BUILD 67)
+set(X265_BUILD 68)
configure_file("${PROJECT_SOURCE_DIR}/x265.def.in"
"${PROJECT_BINARY_DIR}/x265.def")
configure_file("${PROJECT_SOURCE_DIR}/x265_config.h.in"
diff -r 24c1ee516d13 -r d5278c76d341 source/common/dct.cpp
--- a/source/common/dct.cpp Wed Jul 22 15:42:15 2015 +0530
+++ b/source/common/dct.cpp Mon Aug 03 10:18:46 2015 -0500
@@ -612,7 +612,7 @@ static void idct32_c(const int16_t* src,
static void dequant_normal_c(const int16_t* quantCoef, int16_t* coef, int num, int scale, int shift)
{
#if HIGH_BIT_DEPTH
- X265_CHECK(scale < 32768 || ((scale & 3) == 0 && shift > 2), "dequant invalid scale %d\n", scale);
+ X265_CHECK(scale < 32768 || ((scale & 3) == 0 && shift > (X265_DEPTH - 8)), "dequant invalid scale %d\n", scale);
#else
// NOTE: maximum of scale is (72 * 256)
X265_CHECK(scale < 32768, "dequant invalid scale %d\n", scale);
diff -r 24c1ee516d13 -r d5278c76d341 source/common/framedata.h
--- a/source/common/framedata.h Wed Jul 22 15:42:15 2015 +0530
+++ b/source/common/framedata.h Mon Aug 03 10:18:46 2015 -0500
@@ -55,6 +55,8 @@ struct FrameStats
double avgLumaDistortion;
double avgChromaDistortion;
double avgPsyEnergy;
+ double avgLumaLevel;
+ double lumaLevel;
double percentIntraNxN;
double percentSkipCu[NUM_CU_DEPTH];
double percentMergeCu[NUM_CU_DEPTH];
@@ -73,6 +75,7 @@ struct FrameStats
uint64_t cntIntra[NUM_CU_DEPTH];
uint64_t cuInterDistribution[NUM_CU_DEPTH][INTER_MODES];
uint64_t cuIntraDistribution[NUM_CU_DEPTH][INTRA_MODES];
+ uint16_t maxLumaLevel;
FrameStats()
{
diff -r 24c1ee516d13 -r d5278c76d341 source/common/param.cpp
--- a/source/common/param.cpp Wed Jul 22 15:42:15 2015 +0530
+++ b/source/common/param.cpp Mon Aug 03 10:18:46 2015 -0500
@@ -206,6 +206,7 @@ void x265_param_default(x265_param* para
param->rc.rateControlMode = X265_RC_CRF;
param->rc.qp = 32;
param->rc.aqMode = X265_AQ_VARIANCE;
+ param->rc.qgSize = 32;
param->rc.aqStrength = 1.0;
param->rc.cuTree = 1;
param->rc.rfConstantMax = 0;
@@ -219,7 +220,6 @@ void x265_param_default(x265_param* para
param->rc.zones = NULL;
param->rc.bEnableSlowFirstPass = 0;
param->rc.bStrictCbr = 0;
- param->rc.qgSize = 64; /* Same as maxCUSize */
/* Video Usability Information (VUI) */
param->vui.aspectRatioIdc = 0;
@@ -1114,11 +1114,11 @@ int x265_check_params(x265_param* param)
"Color Primaries must be undef, bt709, bt470m,"
" bt470bg, smpte170m, smpte240m, film or bt2020");
CHECK(param->vui.transferCharacteristics < 0
- || param->vui.transferCharacteristics > 17
+ || param->vui.transferCharacteristics > 18
|| param->vui.transferCharacteristics == 3,
"Transfer Characteristics must be undef, bt709, bt470m, bt470bg,"
" smpte170m, smpte240m, linear, log100, log316, iec61966-2-4, bt1361e,"
- " iec61966-2-1, bt2020-10, bt2020-12, smpte-st-2084 or smpte-st-428");
+ " iec61966-2-1, bt2020-10, bt2020-12, smpte-st-2084, smpte-st-428 or std-b67");
CHECK(param->vui.matrixCoeffs < 0
|| param->vui.matrixCoeffs > 10
|| param->vui.matrixCoeffs == 3,
@@ -1431,6 +1431,7 @@ char *x265_param2string(x265_param* p)
BOOL(p->bEnableWeightedPred, "weightp");
BOOL(p->bEnableWeightedBiPred, "weightb");
s += sprintf(s, " aq-mode=%d", p->rc.aqMode);
+ s += sprintf(s, " qg-size=%d", p->rc.qgSize);
s += sprintf(s, " aq-strength=%.2f", p->rc.aqStrength);
s += sprintf(s, " cbqpoffs=%d", p->cbQpOffset);
s += sprintf(s, " crqpoffs=%d", p->crQpOffset);
diff -r 24c1ee516d13 -r d5278c76d341 source/common/threadpool.cpp
--- a/source/common/threadpool.cpp Wed Jul 22 15:42:15 2015 +0530
+++ b/source/common/threadpool.cpp Mon Aug 03 10:18:46 2015 -0500
@@ -310,7 +310,7 @@ ThreadPool* ThreadPool::allocThreadPools
ThreadPool *pools = new ThreadPool[numPools];
if (pools)
{
- int maxProviders = (p->frameNumThreads + 1 + numPools - 1) / numPools; /* +1 is Lookahead */
+ int maxProviders = (p->frameNumThreads + numPools - 1) / numPools + 1; /* +1 is Lookahead, always assigned to threadpool 0 */
int node = 0;
for (int i = 0; i < numPools; i++)
{
diff -r 24c1ee516d13 -r d5278c76d341 source/common/x86/asm-primitives.cpp
--- a/source/common/x86/asm-primitives.cpp Wed Jul 22 15:42:15 2015 +0530
+++ b/source/common/x86/asm-primitives.cpp Mon Aug 03 10:18:46 2015 -0500
@@ -1541,19 +1541,16 @@ void setupAssemblyPrimitives(EncoderPrim
p.quant = PFX(quant_avx2);
p.nquant = PFX(nquant_avx2);
-#if X265_DEPTH <= 10
p.dequant_normal = PFX(dequant_normal_avx2);
p.dequant_scaling = PFX(dequant_scaling_avx2);
-#endif
p.dst4x4 = PFX(dst4_avx2);
p.idst4x4 = PFX(idst4_avx2);
p.denoiseDct = PFX(denoise_dct_avx2);
p.scale1D_128to64 = PFX(scale1D_128to64_avx2);
p.scale2D_64to32 = PFX(scale2D_64to32_avx2);
-#if X265_DEPTH <= 10
+
p.weight_pp = PFX(weight_pp_avx2);
-#endif
p.weight_sp = PFX(weight_sp_avx2);
p.sign = PFX(calSign_avx2);
p.planecopy_cp = PFX(upShift_8_avx2);
@@ -2553,11 +2550,9 @@ void setupAssemblyPrimitives(EncoderPrim
ALL_LUMA_CU(psy_cost_pp, psyCost_pp, sse4);
ALL_LUMA_CU(psy_cost_ss, psyCost_ss, sse4);
- // TODO: it is passed smoke test, but we need testbench, so temporary disable
p.costCoeffNxN = PFX(costCoeffNxN_sse4);
#endif
- // TODO: it is passed smoke test, but we need testbench to active it, so temporary disable
- //p.costCoeffRemain = x265_costCoeffRemain_sse4;
+ p.costCoeffRemain = PFX(costCoeffRemain_sse4);
}
if (cpuMask & X265_CPU_AVX)
{
diff -r 24c1ee516d13 -r d5278c76d341 source/common/x86/pixel-util8.asm
--- a/source/common/x86/pixel-util8.asm Wed Jul 22 15:42:15 2015 +0530
+++ b/source/common/x86/pixel-util8.asm Mon Aug 03 10:18:46 2015 -0500
@@ -1048,8 +1048,8 @@ cglobal dequant_normal, 5,5,7
%if HIGH_BIT_DEPTH
cmp r3d, 32767
jle .skip
- shr r3d, 2
- sub r4d, 2
+ shr r3d, (BIT_DEPTH - 8)
+ sub r4d, (BIT_DEPTH - 8)
.skip:
%endif
movd xm0, r4d ; m0 = shift
@@ -1407,14 +1407,16 @@ cglobal weight_pp, 6,7,6
%if HIGH_BIT_DEPTH
INIT_YMM avx2
cglobal weight_pp, 6, 7, 7
- shl r5d, 4 ; m0 = [w0<<4]
+%define correction (14 - BIT_DEPTH)
mov r6d, r6m
- shl r6d, 16
- or r6d, r5d ; assuming both (w0<<4) and round are using maximum of 16 bits each.
+ shl r6d, 16 - correction
+ or r6d, r5d ; assuming both w0 and round are using maximum of 16 bits each.
vpbroadcastd m0, r6d
- movd xm1, r7m
+ mov r5d, r7m
+ sub r5d, correction
+ movd xm1, r5d
vpbroadcastd m2, r8m
mova m5, [pw_1]
mova m6, [pw_pixel_max]
@@ -1453,6 +1455,7 @@ cglobal weight_pp, 6, 7, 7
dec r4d
jnz .loopH
+%undef correction
RET
%else
INIT_YMM avx2
diff -r 24c1ee516d13 -r d5278c76d341 source/encoder/analysis.cpp
--- a/source/encoder/analysis.cpp Wed Jul 22 15:42:15 2015 +0530
+++ b/source/encoder/analysis.cpp Mon Aug 03 10:18:46 2015 -0500
@@ -1106,7 +1106,8 @@ uint32_t Analysis::compressInterCU_rd0_4
/* Copy best data to encData CTU and recon */
X265_CHECK(md.bestMode->ok(), "best mode is not ok");
md.bestMode->cu.copyToPic(depth);
- md.bestMode->reconYuv.copyToPicYuv(*m_frame->m_reconPic, cuAddr, cuGeom.absPartIdx);
+ if (m_param->rdLevel)
+ md.bestMode->reconYuv.copyToPicYuv(*m_frame->m_reconPic, cuAddr, cuGeom.absPartIdx);
return refMask;
}
diff -r 24c1ee516d13 -r d5278c76d341 source/encoder/encoder.cpp
--- a/source/encoder/encoder.cpp Wed Jul 22 15:42:15 2015 +0530
+++ b/source/encoder/encoder.cpp Mon Aug 03 10:18:46 2015 -0500
@@ -1168,6 +1168,8 @@ void Encoder::finishFrameStats(Frame* cu
frameStats->avgChromaDistortion = curFrame->m_encData->m_frameStats.avgChromaDistortion;
frameStats->avgLumaDistortion = curFrame->m_encData->m_frameStats.avgLumaDistortion;
frameStats->avgPsyEnergy = curFrame->m_encData->m_frameStats.avgPsyEnergy;
+ frameStats->avgLumaLevel = curFrame->m_encData->m_frameStats.avgLumaLevel;
+ frameStats->maxLumaLevel = curFrame->m_encData->m_frameStats.maxLumaLevel;
for (uint32_t depth = 0; depth <= g_maxCUDepth; depth++)
{
frameStats->cuStats.percentSkipCu[depth] = curFrame->m_encData->m_frameStats.percentSkipCu[depth];
diff -r 24c1ee516d13 -r d5278c76d341 source/encoder/frameencoder.cpp
--- a/source/encoder/frameencoder.cpp Wed Jul 22 15:42:15 2015 +0530
+++ b/source/encoder/frameencoder.cpp Mon Aug 03 10:18:46 2015 -0500
@@ -591,6 +591,10 @@ void FrameEncoder::compressFrame()
m_frame->m_encData->m_frameStats.lumaDistortion += m_rows[i].rowStats.lumaDistortion;
m_frame->m_encData->m_frameStats.chromaDistortion += m_rows[i].rowStats.chromaDistortion;
m_frame->m_encData->m_frameStats.psyEnergy += m_rows[i].rowStats.psyEnergy;
+ m_frame->m_encData->m_frameStats.lumaLevel += m_rows[i].rowStats.lumaLevel;
+
+ if (m_rows[i].rowStats.maxLumaLevel > m_frame->m_encData->m_frameStats.maxLumaLevel)
+ m_frame->m_encData->m_frameStats.maxLumaLevel = m_rows[i].rowStats.maxLumaLevel;
for (uint32_t depth = 0; depth <= g_maxCUDepth; depth++)
{
m_frame->m_encData->m_frameStats.cntSkipCu[depth] += m_rows[i].rowStats.cntSkipCu[depth];
@@ -604,6 +608,7 @@ void FrameEncoder::compressFrame()
m_frame->m_encData->m_frameStats.avgLumaDistortion = (double)(m_frame->m_encData->m_frameStats.lumaDistortion / m_frame->m_encData->m_frameStats.totalCtu);
m_frame->m_encData->m_frameStats.avgChromaDistortion = (double)(m_frame->m_encData->m_frameStats.chromaDistortion / m_frame->m_encData->m_frameStats.totalCtu);
m_frame->m_encData->m_frameStats.avgPsyEnergy = (double)(m_frame->m_encData->m_frameStats.psyEnergy / m_frame->m_encData->m_frameStats.totalCtu);
+ m_frame->m_encData->m_frameStats.avgLumaLevel = (double)(m_frame->m_encData->m_frameStats.lumaLevel / m_frame->m_encData->m_frameStats.totalCtu);
m_frame->m_encData->m_frameStats.percentIntraNxN = (double)(m_frame->m_encData->m_frameStats.cntIntraNxN * 100) / m_frame->m_encData->m_frameStats.totalCu;
for (uint32_t depth = 0; depth <= g_maxCUDepth; depth++)
{
@@ -983,6 +988,16 @@ void FrameEncoder::processRowEncoder(int
for (int n = 0; n < INTRA_MODES; n++)
curRow.rowStats.cuIntraDistribution[depth][n] += frameLog.cuIntraDistribution[depth][n];
}
+ uint64_t ctuLumaLevel = 0;
+ uint64_t ctuNoOfPixels = 0;
+ for (uint32_t i = 0; i < (best.reconYuv.m_size * best.reconYuv.m_size); i++)
+ {
+ ctuLumaLevel += *(best.reconYuv.m_buf[0] + i);
+ ctuNoOfPixels++;
+ if ((*(best.reconYuv.m_buf[0] + i)) > curRow.rowStats.maxLumaLevel)
+ curRow.rowStats.maxLumaLevel = *(best.reconYuv.m_buf[0] + i);
+ }
+ curRow.rowStats.lumaLevel += (double)(ctuLumaLevel / ctuNoOfPixels);
curEncData.m_cuStat[cuAddr].totalBits = best.totalBits;
x265_emms();
diff -r 24c1ee516d13 -r d5278c76d341 source/encoder/ratecontrol.cpp
--- a/source/encoder/ratecontrol.cpp Wed Jul 22 15:42:15 2015 +0530
+++ b/source/encoder/ratecontrol.cpp Mon Aug 03 10:18:46 2015 -0500
@@ -2151,25 +2151,6 @@ int RateControl::rateControlEnd(Frame* c
FrameData& curEncData = *curFrame->m_encData;
int64_t actualBits = bits;
Slice *slice = curEncData.m_slice;
- if (m_isAbr)
- {
- if (m_param->rc.rateControlMode == X265_RC_ABR && !m_param->rc.bStatRead)
- checkAndResetABR(rce, true);
-
- if (m_param->rc.rateControlMode == X265_RC_CRF)
- {
- if (int(curEncData.m_avgQpRc + 0.5) == slice->m_sliceQp)
- curEncData.m_rateFactor = m_rateFactorConstant;
- else
- {
- /* If vbv changed the frame QP recalculate the rate-factor */
- double baseCplx = m_ncu * (m_param->bframes ? 120 : 80);
- double mbtree_offset = m_param->rc.cuTree ? (1.0 - m_param->rc.qCompress) * 13.5 : 0;
- curEncData.m_rateFactor = pow(baseCplx, 1 - m_qCompress) /
- x265_qp2qScale(int(curEncData.m_avgQpRc + 0.5) + mbtree_offset);
- }
- }
- }
if (m_param->rc.aqMode || m_isVbv)
{
@@ -2195,6 +2176,26 @@ int RateControl::rateControlEnd(Frame* c
curEncData.m_avgQpAq = curEncData.m_avgQpRc;
}
+ if (m_isAbr)
+ {
+ if (m_param->rc.rateControlMode == X265_RC_ABR && !m_param->rc.bStatRead)
+ checkAndResetABR(rce, true);
+
+ if (m_param->rc.rateControlMode == X265_RC_CRF)