[vlc-devel] [WIP] sharpen: reimplement with SKIPSM
Tristan Matthews
le.businessman at gmail.com
Fri May 23 10:12:18 CEST 2014
On Thu, May 22, 2014 at 3:42 PM, Felix Abecassis
<felix.abecassis at gmail.com> wrote:
>
> Interesting.
>
> Did you benchmark the two implementations?
Yup, though only crudely so far (top, rdtsc, clock(), kcachegrind). The
new implementation has consistently run faster and executed fewer
instructions. If you have any tips for useful metrics/results, I could
post them as well.
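For anyone wanting to reproduce the clock() part of such a measurement, a minimal sketch in C could look like the following. The names avg_cpu_seconds, plane_fn and copy_plane are made up for illustration and are not part of VLC or of the patch; copy_plane is only a stand-in for the filter under test, and the plane is assumed to be tightly packed (pitch == width).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical signature for a function that processes one 8-bit plane. */
typedef void (*plane_fn)(const uint8_t *src, uint8_t *dst, int w, int h);

/* Average CPU seconds per call of fn over `runs` calls, measured with clock(). */
static double avg_cpu_seconds(plane_fn fn, const uint8_t *src, uint8_t *dst,
                              int w, int h, int runs)
{
    assert(fn != NULL && runs > 0);
    const clock_t start = clock();
    for (int i = 0; i < runs; ++i)
        fn(src, dst, w, h);
    const clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC / runs;
}

/* Stand-in workload (assumption): copies the plane instead of filtering it. */
static void copy_plane(const uint8_t *src, uint8_t *dst, int w, int h)
{
    for (int i = 0; i < w * h; ++i)
        dst[i] = src[i];
}
```

Passing each implementation through the same avg_cpu_seconds() call with the same input plane and a large run count gives a rough per-frame comparison. clock() measures CPU time at coarse resolution, so this complements rather than replaces instruction counts from kcachegrind.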
>
> Can the implementation be easily extended to a larger kernel width?
Possibly, but the paper I referenced was specifically for 3x3 kernels
and since this is the current behaviour of the sharpen filter I didn't
dig much further. In another paper, "Efficient algorithm for Gaussian
blur using finite-state machines", the same author does discuss 3x5
and 5x5 Gaussian blur implementations, and compares the algorithmic
complexity of these vs. an NxN SKIPSM. He also mentions that an NxN
SKIPSM can be decomposed into several 3x3 SKIPSMs for comparable
performance.
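To make the 3x3 case concrete, here is a standalone sketch (not the patch code itself) that follows the row machine and column machine listed in the comment above Filter() in the patch, next to a direct convolution for comparison. direct_3x3 and skipsm_3x3 are hypothetical helper names. The sketch assumes a tightly packed 8-bit plane (pitch == width), writes only interior pixels, and leaves out clipping and sigma scaling.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Direct 3x3 convolution with {-1,-1,-1, -1,8,-1, -1,-1,-1}; interior only. */
static void direct_3x3(const uint8_t *src, int *out, int w, int h)
{
    for (int y = 1; y < h - 1; ++y)
        for (int x = 1; x < w - 1; ++x) {
            int acc = 9 * src[y * w + x];   /* 9*centre minus all nine taps */
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                    acc -= src[(y + dy) * w + (x + dx)];
            out[y * w + x] = acc;
        }
}

/* Same response via the two state machines; returns -1 if allocation fails. */
static int skipsm_3x3(const uint8_t *src, int *out, int w, int h)
{
    assert(w >= 3 && h >= 3);
    int *cs0 = calloc((size_t)w, sizeof *cs0);   /* column state 0, zeroed */
    int *cs1 = calloc((size_t)w, sizeof *cs1);   /* column state 1, zeroed */
    if (!cs0 || !cs1) { free(cs0); free(cs1); return -1; }

    for (int y = 0; y < h; ++y) {
        int rs0 = 0, rs1 = 0;                /* row state, reset per row */
        for (int x = 0; x < w; ++x) {
            /* row machine */
            int tmp1 = src[y * w + x];
            const int tmp2 = tmp1 + rs1;     /* p[x] + p[x-1] + p[x-2] */
            const int tmp3 = 9 * rs0;
            rs1 = rs0 + tmp1;
            rs0 = tmp1;
            tmp1 = tmp3 - tmp2;              /* 8*p[x-1] - p[x] - p[x-2] */

            /* column machine: the result is centred on (y-1, x-1) */
            if (y >= 2 && x >= 2)
                out[(y - 1) * w + (x - 1)] = cs1[x] - tmp2;
            cs1[x] = tmp1 - cs0[x];
            cs0[x] = tmp2;
        }
    }
    free(cs0);
    free(cs1);
    return 0;
}
```

Each input pixel is read once and costs a handful of additions plus one multiply by 9, whereas the direct version reads nine pixels per output. Running both on the same plane and comparing the interior pixels is a quick sanity check that the state machines reproduce the kernel.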
>
>
>
> 2014-05-22 17:36 GMT+02:00 Tristan Matthews <le.businessman at gmail.com>:
>>
>> On Thu, May 22, 2014 at 11:30 AM, Tristan Matthews <le.businessman at gmail.com> wrote:
>>>
>>> SKIPSM (Separated-Kernel Image Processing using finite-State Machines) allows
>>> sharpening with fewer repeated operations. Two finite-state machines
>>> (a 2-element row FSM, and a width-element column FSM) are used to avoid
>>> duplicate reads/arithmetic.
>>>
>>> This is a WIP. sharpen2 is meant to replace sharpen but both are included here
>>> for ease of live comparison.
>>>
>>> Reference:
>>> http://www-personal.engin.umd.umich.edu/~jwvm/ece488588/Papers/skipsm/17_Misc3x3.pdf
>>>
>>> Maybe refs #9458
>>> ---
>>> modules/MODULES_LIST | 1 +
>>> modules/gui/qt4/components/extended_panels.cpp | 3 +
>>> modules/gui/qt4/ui/video_effects.ui | 46 ++++
>>> modules/video_filter/Modules.am | 2 +
>>> modules/video_filter/sharpen2.c | 298 +++++++++++++++++++++++++
>>> 5 files changed, 350 insertions(+)
>>> create mode 100644 modules/video_filter/sharpen2.c
>>>
>>> diff --git a/modules/MODULES_LIST b/modules/MODULES_LIST
>>> index 61ad62b..bc60143 100644
>>> --- a/modules/MODULES_LIST
>>> +++ b/modules/MODULES_LIST
>>> @@ -309,6 +309,7 @@ $Id$
>>> * sepia: Sepia video filter
>>> * sftp: SFTP network access module
>>> * sharpen: Sharpen video filter
>>> + * sharpen2: Sharpen2 video filter
>>> * shine: MP3 encoder using Shine, a fixed point implementation
>>> * shm: Shared memory framebuffer access module
>>> * sid: Sidplay demuxer
>>> diff --git a/modules/gui/qt4/components/extended_panels.cpp b/modules/gui/qt4/components/extended_panels.cpp
>>> index 84d16ae..9583196 100644
>>> --- a/modules/gui/qt4/components/extended_panels.cpp
>>> +++ b/modules/gui/qt4/components/extended_panels.cpp
>>> @@ -150,6 +150,9 @@ ExtVideo::ExtVideo( intf_thread_t *_p_intf, QTabWidget *_parent ) :
>>> SETUP_VFILTER( sharpen )
>>> SETUP_VFILTER_OPTION( sharpenSigmaSlider, valueChanged( int ) )
>>>
>>> + SETUP_VFILTER( sharpen2 )
>>> + SETUP_VFILTER_OPTION( sharpen2SigmaSlider, valueChanged( int ) )
>>> +
>>> SETUP_VFILTER( ripple )
>>>
>>> SETUP_VFILTER( wave )
>>> diff --git a/modules/gui/qt4/ui/video_effects.ui b/modules/gui/qt4/ui/video_effects.ui
>>> index 6284e22..a6564d7 100644
>>> --- a/modules/gui/qt4/ui/video_effects.ui
>>> +++ b/modules/gui/qt4/ui/video_effects.ui
>>> @@ -316,6 +316,50 @@
>>> </layout>
>>> </widget>
>>> </item>
>>> + <item row="3" column="1">
>>> + <widget class="QGroupBox" name="sharpen2Enable">
>>> + <property name="title">
>>> + <string>Sharpen2</string>
>>> + </property>
>>> + <property name="checkable">
>>> + <bool>true</bool>
>>> + </property>
>>> + <property name="checked">
>>> + <bool>false</bool>
>>> + </property>
>>> + <layout class="QGridLayout">
>>> + <item row="0" column="0">
>>> + <widget class="QLabel" name="label_29">
>>> + <property name="text">
>>> + <string>Sigma</string>
>>> + </property>
>>> + <property name="buddy">
>>> + <cstring>sharpen2SigmaSlider</cstring>
>>> + </property>
>>> + </widget>
>>> + </item>
>>> + <item row="0" column="1">
>>> + <widget class="QSlider" name="sharpen2SigmaSlider">
>>> + <property name="maximum">
>>> + <number>200</number>
>>> + </property>
>>> + <property name="pageStep">
>>> + <number>10</number>
>>> + </property>
>>> + <property name="orientation">
>>> + <enum>Qt::Horizontal</enum>
>>> + </property>
>>> + <property name="tickPosition">
>>> + <enum>QSlider::TicksBelow</enum>
>>> + </property>
>>> + <property name="tickInterval">
>>> + <number>50</number>
>>> + </property>
>>> + </widget>
>>> + </item>
>>> + </layout>
>>> + </widget>
>>> + </item>
>>> </layout>
>>> </widget>
>>> <widget class="QWidget" name="tab_3">
>>> @@ -1950,6 +1994,8 @@
>>> <tabstop>gradfunRadiusSlider</tabstop>
>>> <tabstop>grainEnable</tabstop>
>>> <tabstop>grainVarianceSlider</tabstop>
>>> + <tabstop>sharpen2Enable</tabstop>
>>> + <tabstop>sharpen2SigmaSlider</tabstop>
>>> <tabstop>cropTopPx</tabstop>
>>> <tabstop>cropBotPx</tabstop>
>>> <tabstop>topBotCropSync</tabstop>
>>> diff --git a/modules/video_filter/Modules.am b/modules/video_filter/Modules.am
>>> index 3bb8cdb..ae0b63c 100644
>>> --- a/modules/video_filter/Modules.am
>>> +++ b/modules/video_filter/Modules.am
>>> @@ -78,6 +78,7 @@ video_filter_LTLIBRARIES += librotate_plugin.la
>>> SOURCES_colorthres = colorthres.c
>>> SOURCES_extract = extract.c
>>> SOURCES_sharpen = sharpen.c
>>> +SOURCES_sharpen2 = sharpen2.c
>>> SOURCES_erase = erase.c
>>> SOURCES_bluescreen = bluescreen.c
>>> SOURCES_alphamask = alphamask.c
>>> @@ -153,6 +154,7 @@ video_filter_LTLIBRARIES += \
>>> libscene_plugin.la \
>>> libsepia_plugin.la \
>>> libsharpen_plugin.la \
>>> + libsharpen2_plugin.la \
>>> libsubsdelay_plugin.la \
>>> libtransform_plugin.la \
>>> libwave_plugin.la \
>>> diff --git a/modules/video_filter/sharpen2.c b/modules/video_filter/sharpen2.c
>>> new file mode 100644
>>> index 0000000..cdabc20
>>> --- /dev/null
>>> +++ b/modules/video_filter/sharpen2.c
>>> @@ -0,0 +1,298 @@
>>> +/*****************************************************************************
>>> + * sharpen2.c: Sharpen video filter
>>> + *****************************************************************************
>>> + * Copyright (C) 2003-2007 VLC authors and VideoLAN
>>> + * $Id$
>>> + *
>>> + * Author: Jérémy DEMEULE <dj_mulder at djduron dot no-ip dot org>
>>> + * Jean-Baptiste Kempf <jb at videolan dot org>
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify it
>>> + * under the terms of the GNU Lesser General Public License as published by
>>> + * the Free Software Foundation; either version 2.1 of the License, or
>>> + * (at your option) any later version.
>>> + *
>>> + * This program is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>>> + * GNU Lesser General Public License for more details.
>>> + *
>>> + * You should have received a copy of the GNU Lesser General Public License
>>> + * along with this program; if not, write to the Free Software Foundation,
>>> + * Inc., 51 Franklin Street, Fifth Floor, Boston MA 02110-1301, USA.
>>> + *****************************************************************************/
>>> +
>>> +/* The sharpen filter. */
>>> +/*
>>> + * static int filter[] = { -1, -1, -1,
>>> + * -1, 8, -1,
>>> + * -1, -1, -1 };
>>> + */
>>> +
>>> +/*****************************************************************************
>>> + * Preamble
>>> + *****************************************************************************/
>>> +
>>> +#ifdef HAVE_CONFIG_H
>>> +# include "config.h"
>>> +#endif
>>> +
>>> +#include <vlc_common.h>
>>> +#include <vlc_plugin.h>
>>> +
>>> +#include <vlc_filter.h>
>>> +#include "filter_picture.h"
>>> +
>>> +#define SIG_TEXT N_("Sharpen strength (0-2)")
>>> +#define SIG_LONGTEXT N_("Set the Sharpen strength, between 0 and 2. Defaults to 0.05.")
>>> +
>>> +/*****************************************************************************
>>> + * Local prototypes
>>> + *****************************************************************************/
>>> +static int Create ( vlc_object_t * );
>>> +static void Destroy ( vlc_object_t * );
>>> +
>>> +static picture_t *Filter( filter_t *, picture_t * );
>>> +static int SharpenCallback( vlc_object_t *, char const *,
>>> + vlc_value_t, vlc_value_t, void * );
>>> +
>>> +#define SHARPEN2_HELP N_("Augment contrast between contours.")
>>> +#define FILTER_PREFIX "sharpen2-"
>>> +
>>> +/*****************************************************************************
>>> + * Module descriptor
>>> + *****************************************************************************/
>>> +vlc_module_begin ()
>>> + set_description( N_("Sharpen2 video filter") )
>>> + set_shortname( N_("Sharpen2") )
>>> + set_help(SHARPEN2_HELP)
>>> + set_category( CAT_VIDEO )
>>> + set_subcategory( SUBCAT_VIDEO_VFILTER )
>>> + set_capability( "video filter2", 0 )
>>> + add_float_with_range( "sharpen2-sigma", 0.05, 0.0, 2.0,
>>> + SIG_TEXT, SIG_LONGTEXT, false )
>>> + add_shortcut( "sharpen2" )
>>> + set_callbacks( Create, Destroy )
>>> +vlc_module_end ()
>>> +
>>> +static const char *const ppsz_filter_options[] = {
>>> + "sigma", NULL
>>> +};
>>> +
>>> +/*****************************************************************************
>>> + * filter_sys_t: Sharpen video filter descriptor
>>> + *****************************************************************************
>>> + * This structure is part of the video output thread descriptor.
>>> + * It describes the Sharpen specific properties of an output thread.
>>> + *****************************************************************************/
>>> +
>>> +struct filter_sys_t
>>> +{
>>> + vlc_mutex_t lock;
>>> + int tab_precalc[512];
>>> + int16_t *column_state[2];
>>> +};
>>> +
>>> +/*****************************************************************************
>>> + * clip: avoid negative value and value > 255
>>> + *****************************************************************************/
>>> +inline static uint8_t clip( int32_t a )
>>> +{
>>> + return (a > 255) ? 255 : (a < 0) ? 0 : a;
>>> +}
>>> +
>>> +static void init_precalc_table(filter_sys_t *p_filter, float sigma)
>>> +{
>>> + for(int i = 0; i < 512; ++i)
>>> + {
>>> + p_filter->tab_precalc[i] = (i - 256) * sigma;
>>> + }
>>> +}
>>> +
>>> +/*****************************************************************************
>>> + * Create: allocates Sharpen video thread output method
>>> + *****************************************************************************
>>> + * This function allocates and initializes a Sharpen vout method.
>>> + *****************************************************************************/
>>> +static int Create( vlc_object_t *p_this )
>>> +{
>>> + filter_t *p_filter = (filter_t *)p_this;
>>> +
>>> + const vlc_fourcc_t fourcc = p_filter->fmt_in.video.i_chroma;
>>> + const vlc_chroma_description_t *p_chroma = vlc_fourcc_GetChromaDescription( fourcc );
>>> + if( !p_chroma || p_chroma->plane_count != 3 || p_chroma->pixel_size != 1 ) {
>>> + msg_Err( p_filter, "Unsupported chroma (%4.4s)", (char*)&fourcc );
>>> + return VLC_EGENERIC;
>>> + }
>>> +
>>> + /* Allocate structure */
>>> + p_filter->p_sys = malloc( sizeof( filter_sys_t ) );
>>> + if( p_filter->p_sys == NULL )
>>> + return VLC_ENOMEM;
>>> +
>>> + for( int i = 0; i < 2; ++i) {
>>> + p_filter->p_sys->column_state[i] = malloc( sizeof(*p_filter->p_sys->column_state[i]) *
>>> + p_filter->fmt_in.video.i_visible_width );
>>> + if( p_filter->p_sys->column_state[i] == NULL )
>>> + return VLC_ENOMEM;
>>> + }
>>> +
>>> + p_filter->pf_video_filter = Filter;
>>> +
>>> + config_ChainParse( p_filter, FILTER_PREFIX, ppsz_filter_options,
>>> + p_filter->p_cfg );
>>> +
>>> + float sigma = var_CreateGetFloatCommand( p_filter, FILTER_PREFIX "sigma" );
>>> + init_precalc_table(p_filter->p_sys, sigma);
>>> +
>>> + vlc_mutex_init( &p_filter->p_sys->lock );
>>> + var_AddCallback( p_filter, FILTER_PREFIX "sigma",
>>> + SharpenCallback, p_filter->p_sys );
>>> +
>>> + return VLC_SUCCESS;
>>> +}
>>> +
>>> +
>>> +/*****************************************************************************
>>> + * Destroy: destroy Sharpen video thread output method
>>> + *****************************************************************************
>>> + * Terminate an output method created by SharpenCreateOutputMethod
>>> + *****************************************************************************/
>>> +static void Destroy( vlc_object_t *p_this )
>>> +{
>>> + filter_t *p_filter = (filter_t *)p_this;
>>> + filter_sys_t *p_sys = p_filter->p_sys;
>>> +
>>> + var_DelCallback( p_filter, FILTER_PREFIX "sigma", SharpenCallback, p_sys );
>>> + vlc_mutex_destroy( &p_sys->lock );
>>> + for (int i = 0; i < 2; ++i)
>>> + free( p_sys->column_state[i] );
>>> + free( p_sys );
>>> +}
>>> +
>>> +/*****************************************************************************
>>> + * Filter: sharpens the Y plane with a 3x3 SKIPSM convolution
>>> + *****************************************************************************
>>> + * This function convolves the Y plane with the 3x3 sharpen kernel using a
>>> + * row machine and a column machine, mixes the result with the original
>>> + * signal, and copies the U and V planes unchanged.
>>> + *
>>> + * Reference:
>>> + * http://www-personal.engin.umd.umich.edu/~jwvm/ece488588/Papers/skipsm/17_Misc3x3.pdf
>>> + *
>>> + * Row Machine
>>> + * 1 Tmp1 = Input[row j][col i];
>>> + * 2 Tmp2 = Tmp1 + RS1;
>>> + * 3 Tmp3 = 9*RS0;
>>> + * 4 RS1 = RS0 + Tmp1;
>>> + * 5 RS0 = Tmp1;
>>> + * 6 Tmp1 = Tmp3 - Tmp2;
>>> + *
>>> + * Column Machine
>>> + * (Division by 8 omitted to get same behaviour as current sharpen filter)
>>> + * Out[row j-1][col i-1] = (CS1[col i] - Tmp2)/8;
>>> + * CS1[col i] = Tmp1 - CS0[col i];
>>> + * CS0[col i] = Tmp2;
>>> + *
>>> + *****************************************************************************/
>>> +static picture_t *Filter( filter_t *p_filter, picture_t *p_pic )
>>> +{
>>> + picture_t *p_outpic;
>>> + unsigned i, j;
>>> + uint8_t *p_src = NULL;
>>> + uint8_t *p_out = NULL;
>>> + int i_src_pitch;
>>> + int i_out_pitch;
>>> + uint8_t pix;
>>> + int16_t row_state0, row_state1;
>>> + filter_sys_t *sys = p_filter->p_sys;
>>> + int16_t *column_state[2] = {sys->column_state[0], sys->column_state[1]};
>>> +
>>> + if( !p_pic ) return NULL;
>>> +
>>> + const unsigned i_visible_lines = p_pic->p[Y_PLANE].i_visible_lines;
>>> + const unsigned i_visible_pitch = p_pic->p[Y_PLANE].i_visible_pitch;
>>> + p_outpic = filter_NewPicture( p_filter );
>>> + if( !p_outpic )
>>> + {
>>> + picture_Release( p_pic );
>>> + return NULL;
>>> + }
>>> +
>>> + /* process the Y plane */
>>> + p_src = p_pic->p[Y_PLANE].p_pixels;
>>> + p_out = p_outpic->p[Y_PLANE].p_pixels;
>>> + i_src_pitch = p_pic->p[Y_PLANE].i_pitch;
>>> + i_out_pitch = p_outpic->p[Y_PLANE].i_pitch;
>>> +
>>> + /* reset column state at beginning of operation */
>>> + for (unsigned c = 0; c < 2; ++c)
>>> + memset(column_state[c], 0, sizeof(*column_state[c]) *
>>> + p_filter->fmt_in.video.i_visible_width);
>>> +
>>> + /* perform convolution only on Y plane. Avoid border line. */
>>> + vlc_mutex_lock( &p_filter->p_sys->lock );
>>> +
>>> + /* copy first row */
>>> + memcpy(p_out, p_src, i_visible_pitch);
>>> +
>>> + for( i = 1; i < i_visible_lines - 1; i++ )
>>> + {
>>> + /* row state must be initialized for each row */
>>> + row_state0 = row_state1 = 0;
>>> +
>>> + /* copy first pixel in row */
>>> + p_out[i * i_out_pitch] = p_src[i * i_src_pitch];
>>> +
>>> + for( j = 1; j < i_visible_pitch - 1; j++ )
>>> + {
>>> + /* row machine */
>>> + int16_t tmp1 = p_src[i * i_src_pitch + j];
>>> + const int16_t tmp2 = tmp1 + row_state1;
>>> + const int16_t tmp3 = 9 * row_state0;
>>> + row_state1 = row_state0 + tmp1;
>>> + row_state0 = tmp1;
>>> + tmp1 = tmp3 - tmp2;
>>> +
>>> + /* column machine */
>>> + pix = clip(column_state[1][j] - tmp2);
>>> +
>>> + /* mix with original signal and write to output */
>>> + p_out[(i - 1) * i_out_pitch + j - 1] =
>>> + clip( p_src[(i - 1) * i_src_pitch + j - 1] +
>>> + p_filter->p_sys->tab_precalc[pix + 256]);
>>> +
>>> + column_state[1][j] = tmp1 - column_state[0][j];
>>> + column_state[0][j] = tmp2;
>>> + }
>>> +
>>> + /* copy last pixel */
>>> + p_out[i * i_out_pitch + i_visible_pitch - 1] =
>>> + p_src[i * i_src_pitch + i_visible_pitch - 1];
>>> + }
>>> +
>>> + /* copy last row */
>>> + for( j = 0; j < i_visible_pitch; j++ )
>>> + p_out[(i_visible_lines - 1) * i_out_pitch + j] =
>>> + p_src[(i_visible_lines - 1) * i_src_pitch + j];
>>> +
>>> + vlc_mutex_unlock( &p_filter->p_sys->lock );
>>> +
>>> + plane_CopyPixels( &p_outpic->p[U_PLANE], &p_pic->p[U_PLANE] );
>>> + plane_CopyPixels( &p_outpic->p[V_PLANE], &p_pic->p[V_PLANE] );
>>> +
>>> + return CopyInfoAndRelease( p_outpic, p_pic );
>>> +}
>>> +
>>> +static int SharpenCallback( vlc_object_t *p_this, char const *psz_var,
>>> + vlc_value_t oldval, vlc_value_t newval,
>>> + void *p_data )
>>> +{
>>> + VLC_UNUSED(p_this); VLC_UNUSED(oldval); VLC_UNUSED(psz_var);
>>> + filter_sys_t *p_sys = (filter_sys_t *)p_data;
>>> +
>>> + vlc_mutex_lock( &p_sys->lock );
>>> + init_precalc_table( p_sys, VLC_CLIP( newval.f_float, 0., 2. ) );
>>> + vlc_mutex_unlock( &p_sys->lock );
>>> + return VLC_SUCCESS;
>>> +}
>>> --
>>> 1.9.0
>>>
>>
>> Just to clarify, if the new algo is ok, a proper patch will be sent that will only modify sharpen.c. The GUI won't change.
>> This was sent together just for testing/comparison purposes.
>>
>> -t
>>
>>
>> _______________________________________________
>> vlc-devel mailing list
>> To unsubscribe or modify your subscription options:
>> https://mailman.videolan.org/listinfo/vlc-devel
>>
>
>
>
> --
> Félix Abecassis
> http://felix.abecassis.me
>
>