<div dir="ltr"><div><div><div><div>I made a quick test on an Intel i7-4771 CPU with a 3840x2160 video sample. Here are the approximate results:<br><br>- gcc 4.8.2 sharpen (original): 55ms<br></div>- gcc 4.8.2 sharpen (modified): 20ms<br>

</div>- gcc 4.8.2 sharpen2 SKIPSM: 30ms<br>- clang 3.5 sharpen (original): 50ms<br>
</div>- clang 3.5 sharpen (modified): 50ms :(<br>- clang sharpen2 SKIPSM: 30ms<br><br></div>Apparently clang couldn't vectorize the code, but I couldn't find an equivalent of -ftree-vectorizer-verbose to confirm it.<br>

The SKIPSM implementation seems interesting if the target does not support SIMD or if the compiler cannot vectorize the code. However, I don't know how we could properly dispatch to the fastest implementation for a given compiler and architecture.<br>
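<br>As a point of comparison for these numbers, the row and column machines described in the quoted patch can be sketched outside of VLC. This is a minimal illustration only, not the code of sharpen2.c: the names sharpen_direct and sharpen_skipsm, the int32_t output buffer and the caller-provided state arrays cs0/cs1 are assumptions made for the example. It computes the raw 3x3 response (8 times the center minus its 8 neighbours) for interior pixels, without the sigma mixing or clipping, so that the SKIPSM result can be compared with a direct convolution.<br>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reference: direct 3x3 convolution with the kernel
 * { -1,-1,-1, -1,8,-1, -1,-1,-1 }. Border pixels of dst are left untouched. */
static void sharpen_direct(const uint8_t *src, int32_t *dst, int w, int h)
{
    assert(w >= 3 && h >= 3);
    for (int y = 1; y < h - 1; y++)
        for (int x = 1; x < w - 1; x++) {
            int32_t sum = 0;
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                    sum += src[(y + dy) * w + (x + dx)];
            /* 9*center - (center + 8 neighbours) == 8*center - neighbours */
            dst[y * w + x] = 9 * src[y * w + x] - sum;
        }
}

/* SKIPSM sketch: each input pixel is read once. cs0 and cs1 are
 * hypothetical caller-provided column state arrays of w elements each.
 * Border pixels of dst are left untouched. */
static void sharpen_skipsm(const uint8_t *src, int32_t *dst, int w, int h,
                           int32_t *cs0, int32_t *cs1)
{
    assert(w >= 3 && h >= 3);
    memset(cs0, 0, sizeof(*cs0) * (size_t)w);
    memset(cs1, 0, sizeof(*cs1) * (size_t)w);

    for (int y = 0; y < h; y++) {
        int32_t rs0 = 0, rs1 = 0; /* row state, reset for each row */
        for (int x = 0; x < w; x++) {
            /* row machine */
            const int32_t in = src[y * w + x];
            const int32_t row_sum = in + rs1;   /* in[x-2] + in[x-1] + in[x] */
            const int32_t nine_mid = 9 * rs0;   /* 9 * in[x-1] */
            rs1 = rs0 + in;
            rs0 = in;
            const int32_t row_out = nine_mid - row_sum;

            /* column machine: the output lags one row and one column
             * behind, and is only valid once two rows and two columns
             * have been consumed */
            if (y >= 2 && x >= 2)
                dst[(y - 1) * w + (x - 1)] = cs1[x] - row_sum;
            cs1[x] = row_out - cs0[x];
            cs0[x] = row_sum;
        }
    }
}
```

Running both functions on the same buffer and comparing the interior pixels is a cheap way to validate a SKIPSM variant before benchmarking it.<br>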

</div><div class="gmail_extra"><br><br><div class="gmail_quote">2014-05-23 22:46 GMT+02:00 Felix Abecassis <span dir="ltr"><<a href="mailto:felix.abecassis@gmail.com" target="_blank">felix.abecassis@gmail.com</a>></span>:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div>If you plan to do some benchmarks, could you also test with the attached patch for the original implementation? After a few changes gcc was able to automatically vectorize the inner loop. You need to compile with -O3 or -ftree-vectorize.<br>


</div>You can check if the loop was vectorized using -ftree-vectorizer-verbose=2:<br>../../../modules/video_filter/sharpen.c:209: note: LOOP VECTORIZED.<br>../../../modules/video_filter/sharpen.c:170: note: vectorized 1 loops in function<br>
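<br>For illustration, here is the general shape of an inner loop that compilers can often auto-vectorize. This is a hedged sketch and not the attached patch: sharpen_row, clamp_int and the fixed-point sigma_q8 parameter (256 meaning a strength of 1.0) are hypothetical names chosen for the example. The assumptions are that the three input rows and the output row do not overlap (hence restrict), that the loop body has no data-dependent table lookup, and that clamping is written as simple conditional expressions. Whether a given compiler really vectorizes it still has to be checked with its vectorization report.<br>

```c
#include <assert.h>
#include <stdint.h>

/* Clamp v to [lo, hi]; simple ternaries usually compile to min/max. */
static inline int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Sharpen the interior pixels of one row.
 * up/mid/down are the previous, current and next source rows.
 * sigma_q8 is a hypothetical fixed-point strength (256 == 1.0).
 * out[0] and out[width - 1] are not written. */
static void sharpen_row(const uint8_t *restrict up,
                        const uint8_t *restrict mid,
                        const uint8_t *restrict down,
                        uint8_t *restrict out,
                        int width, int sigma_q8)
{
    assert(width >= 3);
    for (int x = 1; x < width - 1; x++) {
        const int sum = up[x - 1]   + up[x]   + up[x + 1]
                      + mid[x - 1]  + mid[x]  + mid[x + 1]
                      + down[x - 1] + down[x] + down[x + 1];
        /* 8*center - neighbours, limited to [-255, 255] */
        const int edge = clamp_int(9 * mid[x] - sum, -255, 255);
        /* mix with the original signal; division keeps the result
         * well defined for negative values */
        out[x] = (uint8_t)clamp_int(mid[x] + edge * sigma_q8 / 256, 0, 255);
    }
}
```

The caller would invoke sharpen_row once per interior line, passing pointers to three consecutive source lines, and copy the border pixels separately.<br>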


<br></div>On my home desktop this version seems slightly faster than SKIPSM and should be even faster with wider SIMD like AVX.<br><br>Thanks!<br></div><div class="gmail_extra"><br><br><div class="gmail_quote">2014-05-23 11:25 GMT+02:00 Felix Abecassis <span dir="ltr"><<a href="mailto:felix.abecassis@gmail.com" target="_blank">felix.abecassis@gmail.com</a>></span>:<div>

<div class="h5"><br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><br><div class="gmail_quote">2014-05-23 10:12 GMT+02:00 Tristan Matthews <span dir="ltr"><<a href="mailto:le.businessman@gmail.com" target="_blank">le.businessman@gmail.com</a>></span>:<div>


<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>On Thu, May 22, 2014 at 3:42 PM, Felix Abecassis<br>
<<a href="mailto:felix.abecassis@gmail.com" target="_blank">felix.abecassis@gmail.com</a>> wrote:<br>
><br>
> Interesting.<br>
><br>
> Did you benchmark the two implementations?<br>
<br>
</div>Yup, somewhat crudely thus far though (top, rdtsc, clock(),<br>
kcachegrind). It seems that the new implementation consistently<br>
performs faster/with fewer instructions so far. If you have any tips<br>
for useful metrics/results, I could post them as well.<br>
<div><br></div></blockquote></div><div>This should be fine in order to get a first approximation of the performance. Please share your results :).<br><br></div><div>I'm wondering how this SKIPSM implementation compares to a simple but vectorized implementation. The inner loop of the original filter might be automatically vectorized if we help the compiler a bit.<br>



</div><div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>
><br>
> Can the implementation be easily extended to a larger kernel width?<br>
<br>
</div>Possibly, but the paper I referenced was specifically for 3x3 kernels<br>
and since this is the current behaviour of the sharpen filter I didn't<br>
dig much further. In another paper, "Efficient algorithm for Gaussian<br>
blur using finite-state machines", the same author does discuss 3x5<br>
and 5x5 gaussian blur implementations, and compares the algorithmic<br>
complexity of these vs. an NxN SKIPSM.</blockquote></div><div>For large kernels, it is probably better to use a horizontal pass and a vertical pass if the filter is separable. These two passes can also be vectorized. But you are right that this is out of the scope of this patch.<br>
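<br>To make the two-pass idea concrete, here is a small sketch of a separable filter. It is an illustration under stated assumptions rather than anything from the patch: pass_1d and filter_separable are hypothetical names, the image is stored as int32_t without padding, the same odd-length 1-D kernel is used in both directions, borders are handled by replicating the edge pixel, and the result is not normalized. For an n x n kernel this costs about 2n multiplications per pixel instead of n*n.<br>

```c
#include <assert.h>
#include <stdint.h>

static int clampi(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Apply the 1-D kernel k[0..n-1] (n odd) along rows if horizontal is
 * non-zero, along columns otherwise. Edge pixels are replicated. */
static void pass_1d(const int32_t *src, int32_t *dst, int w, int h,
                    const int32_t *k, int n, int horizontal)
{
    const int r = n / 2;
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            int32_t acc = 0;
            for (int t = -r; t <= r; t++) {
                const int sx = horizontal ? clampi(x + t, 0, w - 1) : x;
                const int sy = horizontal ? y : clampi(y + t, 0, h - 1);
                acc += k[t + r] * src[sy * w + sx];
            }
            dst[y * w + x] = acc;
        }
}

/* Separable 2-D filter: horizontal pass into tmp, then vertical pass
 * into dst. tmp and dst are caller-provided buffers of w*h elements. */
static void filter_separable(const int32_t *src, int32_t *tmp, int32_t *dst,
                             int w, int h, const int32_t *k, int n)
{
    assert(n % 2 == 1 && w > 0 && h > 0);
    pass_1d(src, tmp, w, h, k, n, 1);
    pass_1d(tmp, dst, w, h, k, n, 0);
}
```

With k = {1, 2, 1} this yields an unnormalized 3x3 binomial blur; with k = {1, 1, 1} it yields the 3x3 box sum that also appears inside the sharpen kernel.<br>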



</div><div><div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">He also mentions that an NxN<br>
SKIPSM can be decomposed into several 3x3 SKIPSMs for comparable<br>
performance.<br>
<div><div><br></div></div></blockquote><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div>
><br>
><br>
><br>
> 2014-05-22 17:36 GMT+02:00 Tristan Matthews <<a href="mailto:le.businessman@gmail.com" target="_blank">le.businessman@gmail.com</a>>:<br>
>><br>
>> On Thu, May 22, 2014 at 11:30 AM, Tristan Matthews <<a href="mailto:le.businessman@gmail.com" target="_blank">le.businessman@gmail.com</a>> wrote:<br>
>>><br>
>>> SKIPSM (Separated-Kernel Image Processing using finite-State Machines) allows<br>
>>> sharpening with fewer repeated operations. Two finite-state machines<br>
>>> (a 2 element row FSM, and a width-element column FSM) are used to avoid<br>
>>> duplicate reads/arithmetic.<br>
>>><br>
>>> This is a WIP. sharpen2 is meant to replace sharpen but both are included here<br>
>>> for ease of live comparison.<br>
>>><br>
>>> Reference:<br>
>>> <a href="http://www-personal.engin.umd.umich.edu/~jwvm/ece488588/Papers/skipsm/17_Misc3x3.pdf" target="_blank">http://www-personal.engin.umd.umich.edu/~jwvm/ece488588/Papers/skipsm/17_Misc3x3.pdf</a><br>
>>><br>
>>> Maybe refs #9458<br>
>>> ---<br>
>>>  modules/MODULES_LIST                           |   1 +<br>
>>>  modules/gui/qt4/components/extended_panels.cpp |   3 +<br>
>>>  modules/gui/qt4/ui/video_effects.ui            |  46 ++++<br>
>>>  modules/video_filter/Modules.am                |   2 +<br>
>>>  modules/video_filter/sharpen2.c                | 298 +++++++++++++++++++++++++<br>
>>>  5 files changed, 350 insertions(+)<br>
>>>  create mode 100644 modules/video_filter/sharpen2.c<br>
>>><br>
>>> diff --git a/modules/MODULES_LIST b/modules/MODULES_LIST<br>
>>> index 61ad62b..bc60143 100644<br>
>>> --- a/modules/MODULES_LIST<br>
>>> +++ b/modules/MODULES_LIST<br>
>>> @@ -309,6 +309,7 @@ $Id$<br>
>>>   * sepia: Sepia video filter<br>
>>>   * sftp: SFTP network access module<br>
>>>   * sharpen: Sharpen video filter<br>
>>> + * sharpen2: Sharpen2 video filter<br>
>>>   * shine: MP3 encoder using Shine, a fixed point implementation<br>
>>>   * shm: Shared memory framebuffer access module<br>
>>>   * sid: Sidplay demuxer<br>
>>> diff --git a/modules/gui/qt4/components/extended_panels.cpp b/modules/gui/qt4/components/extended_panels.cpp<br>
>>> index 84d16ae..9583196 100644<br>
>>> --- a/modules/gui/qt4/components/extended_panels.cpp<br>
>>> +++ b/modules/gui/qt4/components/extended_panels.cpp<br>
>>> @@ -150,6 +150,9 @@ ExtVideo::ExtVideo( intf_thread_t *_p_intf, QTabWidget *_parent ) :<br>
>>>      SETUP_VFILTER( sharpen )<br>
>>>      SETUP_VFILTER_OPTION( sharpenSigmaSlider, valueChanged( int ) )<br>
>>><br>
>>> +    SETUP_VFILTER( sharpen2 )<br>
>>> +    SETUP_VFILTER_OPTION( sharpen2SigmaSlider, valueChanged( int ) )<br>
>>> +<br>
>>>      SETUP_VFILTER( ripple )<br>
>>><br>
>>>      SETUP_VFILTER( wave )<br>
>>> diff --git a/modules/gui/qt4/ui/video_effects.ui b/modules/gui/qt4/ui/video_effects.ui<br>
>>> index 6284e22..a6564d7 100644<br>
>>> --- a/modules/gui/qt4/ui/video_effects.ui<br>
>>> +++ b/modules/gui/qt4/ui/video_effects.ui<br>
>>> @@ -316,6 +316,50 @@<br>
>>>        </layout><br>
>>>       </widget><br>
>>>      </item><br>
>>> +    <item row="3" column="1"><br>
>>> +     <widget class="QGroupBox" name="sharpen2Enable"><br>
>>> +      <property name="title"><br>
>>> +       <string>Sharpen2</string><br>
>>> +      </property><br>
>>> +      <property name="checkable"><br>
>>> +       <bool>true</bool><br>
>>> +      </property><br>
>>> +      <property name="checked"><br>
>>> +       <bool>false</bool><br>
>>> +      </property><br>
>>> +      <layout class="QGridLayout"><br>
>>> +       <item row="0" column="0"><br>
>>> +        <widget class="QLabel" name="label_29"><br>
>>> +         <property name="text"><br>
>>> +          <string>Sigma</string><br>
>>> +         </property><br>
>>> +         <property name="buddy"><br>
>>> +          <cstring>sharpen2SigmaSlider</cstring><br>
>>> +         </property><br>
>>> +        </widget><br>
>>> +       </item><br>
>>> +       <item row="0" column="1"><br>
>>> +        <widget class="QSlider" name="sharpen2SigmaSlider"><br>
>>> +         <property name="maximum"><br>
>>> +          <number>200</number><br>
>>> +         </property><br>
>>> +         <property name="pageStep"><br>
>>> +          <number>10</number><br>
>>> +         </property><br>
>>> +         <property name="orientation"><br>
>>> +          <enum>Qt::Horizontal</enum><br>
>>> +         </property><br>
>>> +         <property name="tickPosition"><br>
>>> +          <enum>QSlider::TicksBelow</enum><br>
>>> +         </property><br>
>>> +         <property name="tickInterval"><br>
>>> +          <number>50</number><br>
>>> +         </property><br>
>>> +        </widget><br>
>>> +       </item><br>
>>> +      </layout><br>
>>> +     </widget><br>
>>> +    </item><br>
>>>     </layout><br>
>>>    </widget><br>
>>>    <widget class="QWidget" name="tab_3"><br>
>>> @@ -1950,6 +1994,8 @@<br>
>>>    <tabstop>gradfunRadiusSlider</tabstop><br>
>>>    <tabstop>grainEnable</tabstop><br>
>>>    <tabstop>grainVarianceSlider</tabstop><br>
>>> +  <tabstop>sharpen2Enable</tabstop><br>
>>> +  <tabstop>sharpen2SigmaSlider</tabstop><br>
>>>    <tabstop>cropTopPx</tabstop><br>
>>>    <tabstop>cropBotPx</tabstop><br>
>>>    <tabstop>topBotCropSync</tabstop><br>
>>> diff --git a/modules/video_filter/Modules.am b/modules/video_filter/Modules.am<br>
>>> index 3bb8cdb..ae0b63c 100644<br>
>>> --- a/modules/video_filter/Modules.am<br>
>>> +++ b/modules/video_filter/Modules.am<br>
>>> @@ -78,6 +78,7 @@ video_filter_LTLIBRARIES += <a href="http://librotate_plugin.la" target="_blank">librotate_plugin.la</a><br>
>>>  SOURCES_colorthres = colorthres.c<br>
>>>  SOURCES_extract = extract.c<br>
>>>  SOURCES_sharpen = sharpen.c<br>
>>> +SOURCES_sharpen2 = sharpen2.c<br>
>>>  SOURCES_erase = erase.c<br>
>>>  SOURCES_bluescreen = bluescreen.c<br>
>>>  SOURCES_alphamask = alphamask.c<br>
>>> @@ -153,6 +154,7 @@ video_filter_LTLIBRARIES += \<br>
>>>         <a href="http://libscene_plugin.la" target="_blank">libscene_plugin.la</a> \<br>
>>>         <a href="http://libsepia_plugin.la" target="_blank">libsepia_plugin.la</a> \<br>
>>>         <a href="http://libsharpen_plugin.la" target="_blank">libsharpen_plugin.la</a> \<br>
>>> +       <a href="http://libsharpen2_plugin.la" target="_blank">libsharpen2_plugin.la</a> \<br>
>>>         <a href="http://libsubsdelay_plugin.la" target="_blank">libsubsdelay_plugin.la</a> \<br>
>>>         <a href="http://libtransform_plugin.la" target="_blank">libtransform_plugin.la</a> \<br>
>>>         <a href="http://libwave_plugin.la" target="_blank">libwave_plugin.la</a> \<br>
>>> diff --git a/modules/video_filter/sharpen2.c b/modules/video_filter/sharpen2.c<br>
>>> new file mode 100644<br>
>>> index 0000000..cdabc20<br>
>>> --- /dev/null<br>
>>> +++ b/modules/video_filter/sharpen2.c<br>
>>> @@ -0,0 +1,298 @@<br>
>>> +/*****************************************************************************<br>
>>> + * sharpen2.c: Sharpen video filter<br>
>>> + *****************************************************************************<br>
>>> + * Copyright (C) 2003-2007 VLC authors and VideoLAN<br>
>>> + * $Id$<br>
>>> + *<br>
>>> + * Author: Jérémy DEMEULE <dj_mulder at djduron dot no-ip dot org><br>
>>> + *         Jean-Baptiste Kempf <jb at videolan dot org><br>
>>> + *<br>
>>> + * This program is free software; you can redistribute it and/or modify it<br>
>>> + * under the terms of the GNU Lesser General Public License as published by<br>
>>> + * the Free Software Foundation; either version 2.1 of the License, or<br>
>>> + * (at your option) any later version.<br>
>>> + *<br>
>>> + * This program is distributed in the hope that it will be useful,<br>
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of<br>
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the<br>
>>> + * GNU Lesser General Public License for more details.<br>
>>> + *<br>
>>> + * You should have received a copy of the GNU Lesser General Public License<br>
>>> + * along with this program; if not, write to the Free Software Foundation,<br>
>>> + * Inc., 51 Franklin Street, Fifth Floor, Boston MA 02110-1301, USA.<br>
>>> + *****************************************************************************/<br>
>>> +<br>
>>> +/* The sharpen filter. */<br>
>>> +/*<br>
>>> + * static int filter[] = { -1, -1, -1,<br>
>>> + *                         -1,  8, -1,<br>
>>> + *                         -1, -1, -1 };<br>
>>> + */<br>
>>> +<br>
>>> +/*****************************************************************************<br>
>>> + * Preamble<br>
>>> + *****************************************************************************/<br>
>>> +<br>
>>> +#ifdef HAVE_CONFIG_H<br>
>>> +# include "config.h"<br>
>>> +#endif<br>
>>> +<br>
>>> +#include <vlc_common.h><br>
>>> +#include <vlc_plugin.h><br>
>>> +<br>
>>> +#include <vlc_filter.h><br>
>>> +#include "filter_picture.h"<br>
>>> +<br>
>>> +#define SIG_TEXT N_("Sharpen strength (0-2)")<br>
>>> +#define SIG_LONGTEXT N_("Set the Sharpen strength, between 0 and 2. Defaults to 0.05.")<br>
>>> +<br>
>>> +/*****************************************************************************<br>
>>> + * Local prototypes<br>
>>> + *****************************************************************************/<br>
>>> +static int  Create    ( vlc_object_t * );<br>
>>> +static void Destroy   ( vlc_object_t * );<br>
>>> +<br>
>>> +static picture_t *Filter( filter_t *, picture_t * );<br>
>>> +static int SharpenCallback( vlc_object_t *, char const *,<br>
>>> +                            vlc_value_t, vlc_value_t, void * );<br>
>>> +<br>
>>> +#define SHARPEN2_HELP N_("Augment contrast between contours.")<br>
>>> +#define FILTER_PREFIX "sharpen2-"<br>
>>> +<br>
>>> +/*****************************************************************************<br>
>>> + * Module descriptor<br>
>>> + *****************************************************************************/<br>
>>> +vlc_module_begin ()<br>
>>> +    set_description( N_("Sharpen2 video filter") )<br>
>>> +    set_shortname( N_("Sharpen2") )<br>
>>> +    set_help(SHARPEN2_HELP)<br>
>>> +    set_category( CAT_VIDEO )<br>
>>> +    set_subcategory( SUBCAT_VIDEO_VFILTER )<br>
>>> +    set_capability( "video filter2", 0 )<br>
>>> +    add_float_with_range( "sharpen2-sigma", 0.05, 0.0, 2.0,<br>
>>> +        SIG_TEXT, SIG_LONGTEXT, false )<br>
>>> +    add_shortcut( "sharpen2" )<br>
>>> +    set_callbacks( Create, Destroy )<br>
>>> +vlc_module_end ()<br>
>>> +<br>
>>> +static const char *const ppsz_filter_options[] = {<br>
>>> +    "sigma", NULL<br>
>>> +};<br>
>>> +<br>
>>> +/*****************************************************************************<br>
>>> + * filter_sys_t: Sharpen video filter descriptor<br>
>>> + *****************************************************************************<br>
>>> + * This structure is part of the video output thread descriptor.<br>
>>> + * It describes the Sharpen specific properties of an output thread.<br>
>>> + *****************************************************************************/<br>
>>> +<br>
>>> +struct filter_sys_t<br>
>>> +{<br>
>>> +    vlc_mutex_t lock;<br>
>>> +    int tab_precalc[512];<br>
>>> +    int16_t *column_state[2];<br>
>>> +};<br>
>>> +<br>
>>> +/*****************************************************************************<br>
>>> + * clip: avoid negative value and value > 255<br>
>>> + *****************************************************************************/<br>
>>> +inline static uint8_t clip( int32_t a )<br>
>>> +{<br>
>>> +    return (a > 255) ? 255 : (a < 0) ? 0 : a;<br>
>>> +}<br>
>>> +<br>
>>> +static void init_precalc_table(filter_sys_t *p_filter, float sigma)<br>
>>> +{<br>
>>> +    for(int i = 0; i < 512; ++i)<br>
>>> +    {<br>
>>> +        p_filter->tab_precalc[i] = (i - 256) * sigma;<br>
>>> +    }<br>
>>> +}<br>
>>> +<br>
>>> +/*****************************************************************************<br>
>>> + * Create: allocates Sharpen video thread output method<br>
>>> + *****************************************************************************<br>
>>> + * This function allocates and initializes a Sharpen vout method.<br>
>>> + *****************************************************************************/<br>
>>> +static int Create( vlc_object_t *p_this )<br>
>>> +{<br>
>>> +    filter_t *p_filter = (filter_t *)p_this;<br>
>>> +<br>
>>> +    const vlc_fourcc_t fourcc = p_filter->fmt_in.video.i_chroma;<br>
>>> +    const vlc_chroma_description_t *p_chroma = vlc_fourcc_GetChromaDescription( fourcc );<br>
>>> +    if( !p_chroma || p_chroma->plane_count != 3 || p_chroma->pixel_size != 1 ) {<br>
>>> +        msg_Err( p_filter, "Unsupported chroma (%4.4s)", (char*)&fourcc );<br>
>>> +        return VLC_EGENERIC;<br>
>>> +    }<br>
>>> +<br>
>>> +    /* Allocate structure */<br>
>>> +    p_filter->p_sys = malloc( sizeof( filter_sys_t ) );<br>
>>> +    if( p_filter->p_sys == NULL )<br>
>>> +        return VLC_ENOMEM;<br>
>>> +<br>
>>> +    for( int i = 0; i < 2; ++i) {<br>
>>> +        p_filter->p_sys->column_state[i] = malloc( sizeof(*p_filter->p_sys->column_state[i]) *<br>
>>> +                                        p_filter->fmt_in.video.i_visible_width );<br>
>>> +        if( p_filter->p_sys->column_state[i] == NULL )<br>
>>> +            return VLC_ENOMEM;<br>
>>> +    }<br>
>>> +<br>
>>> +    p_filter->pf_video_filter = Filter;<br>
>>> +<br>
>>> +    config_ChainParse( p_filter, FILTER_PREFIX, ppsz_filter_options,<br>
>>> +                   p_filter->p_cfg );<br>
>>> +<br>
>>> +    float sigma = var_CreateGetFloatCommand( p_filter, FILTER_PREFIX "sigma" );<br>
>>> +    init_precalc_table(p_filter->p_sys, sigma);<br>
>>> +<br>
>>> +    vlc_mutex_init( &p_filter->p_sys->lock );<br>
>>> +    var_AddCallback( p_filter, FILTER_PREFIX "sigma",<br>
>>> +                     SharpenCallback, p_filter->p_sys );<br>
>>> +<br>
>>> +    return VLC_SUCCESS;<br>
>>> +}<br>
>>> +<br>
>>> +<br>
>>> +/*****************************************************************************<br>
>>> + * Destroy: destroy Sharpen video thread output method<br>
>>> + *****************************************************************************<br>
>>> + * Terminate an output method created by SharpenCreateOutputMethod<br>
>>> + *****************************************************************************/<br>
>>> +static void Destroy( vlc_object_t *p_this )<br>
>>> +{<br>
>>> +    filter_t *p_filter = (filter_t *)p_this;<br>
>>> +    filter_sys_t *p_sys = p_filter->p_sys;<br>
>>> +<br>
>>> +    var_DelCallback( p_filter, FILTER_PREFIX "sigma", SharpenCallback, p_sys );<br>
>>> +    vlc_mutex_destroy( &p_sys->lock );<br>
>>> +    for (int i = 0; i < 2; ++i)<br>
>>> +        free( p_sys->column_state[i] );<br>
>>> +    free( p_sys );<br>
>>> +}<br>
>>> +<br>
>>> +/*****************************************************************************<br>
>>> + * Filter: sharpen the Y plane of a picture<br>
>>> + *****************************************************************************<br>
>>> + * This function applies the 3x3 sharpen kernel to the Y plane using SKIPSM<br>
>>> + * row and column state machines, and copies the U and V planes unchanged<br>
>>> + * into the output picture.<br>
>>> + *<br>
>>> + * Reference:<br>
>>> + * <a href="http://www-personal.engin.umd.umich.edu/~jwvm/ece488588/Papers/skipsm/17_Misc3x3.pdf" target="_blank">http://www-personal.engin.umd.umich.edu/~jwvm/ece488588/Papers/skipsm/17_Misc3x3.pdf</a><br>




>>> + *<br>
>>> + * Row Machine<br>
>>> + * 1 Tmp1 = Input[row j][col i];<br>
>>> + * 2 Tmp2 = Tmp1 + RS1;<br>
>>> + * 3 Tmp3 = 9*RS0;<br>
>>> + * 4 RS1 = RS0 + Tmp1;<br>
>>> + * 5 RS0 = Tmp1;<br>
>>> + * 6 Tmp1 = Tmp3 - Tmp2;<br>
>>> +<br>
>>> + * Column Machine<br>
>>> + * (Division by 8 omitted to get same behaviour as current sharpen filter)<br>
>>> + * Out[row j-1][col i-1] = (CS1[col i] - Tmp2)/8;<br>
>>> + * CS1[col i] = Tmp1 - CS0[col i];<br>
>>> + * CS0[col i] = Tmp2<br>
>>> + *<br>
>>> + *****************************************************************************/<br>
>>> +static picture_t *Filter( filter_t *p_filter, picture_t *p_pic )<br>
>>> +{<br>
>>> +    picture_t *p_outpic;<br>
>>> +    unsigned i, j;<br>
>>> +    uint8_t *p_src = NULL;<br>
>>> +    uint8_t *p_out = NULL;<br>
>>> +    int i_src_pitch;<br>
>>> +    int i_out_pitch;<br>
>>> +    uint8_t pix;<br>
>>> +    int16_t row_state0, row_state1;<br>
>>> +    filter_sys_t *sys = p_filter->p_sys;<br>
>>> +    int16_t *column_state[2] = {sys->column_state[0], sys->column_state[1]};<br>
>>> +<br>
>>> +    if( !p_pic ) return NULL;<br>
>>> +    const unsigned i_visible_lines = p_pic->p[Y_PLANE].i_visible_lines;<br>
>>> +    const unsigned i_visible_pitch = p_pic->p[Y_PLANE].i_visible_pitch;<br>
>>> +<br>
>>> +    p_outpic = filter_NewPicture( p_filter );<br>
>>> +    if( !p_outpic )<br>
>>> +    {<br>
>>> +        picture_Release( p_pic );<br>
>>> +        return NULL;<br>
>>> +    }<br>
>>> +<br>
>>> +    /* process the Y plane */<br>
>>> +    p_src = p_pic->p[Y_PLANE].p_pixels;<br>
>>> +    p_out = p_outpic->p[Y_PLANE].p_pixels;<br>
>>> +    i_src_pitch = p_pic->p[Y_PLANE].i_pitch;<br>
>>> +    i_out_pitch = p_outpic->p[Y_PLANE].i_pitch;<br>
>>> +<br>
>>> +    /* reset column state at beginning of operation */<br>
>>> +    for (unsigned c = 0; c < 2; ++c)<br>
>>> +        memset(column_state[c], 0, sizeof(*column_state[c]) *<br>
>>> +                p_filter->fmt_in.video.i_visible_width);<br>
>>> +<br>
>>> +    /* perform convolution only on Y plane. Avoid border line. */<br>
>>> +    vlc_mutex_lock( &p_filter->p_sys->lock );<br>
>>> +<br>
>>> +    /* copy first row */<br>
>>> +    memcpy(p_out, p_src, i_visible_pitch);<br>
>>> +<br>
>>> +    for( i = 1; i < i_visible_lines - 1; i++ )<br>
>>> +    {<br>
>>> +        /* row state must be initialized for each row */<br>
>>> +        row_state0 = row_state1 = 0;<br>
>>> +<br>
>>> +        /* copy first pixel in row */<br>
>>> +        p_out[i * i_out_pitch] = p_src[i * i_src_pitch];<br>
>>> +<br>
>>> +        for( j = 1; j < i_visible_pitch - 1; j++ )<br>
>>> +        {<br>
>>> +            /* row machine */<br>
>>> +            int16_t tmp1 = p_src[i * i_src_pitch + j];<br>
>>> +            const int16_t tmp2 = tmp1 + row_state1;<br>
>>> +            const int16_t tmp3 = 9 * row_state0;<br>
>>> +            row_state1 = row_state0 + tmp1;<br>
>>> +            row_state0 = tmp1;<br>
>>> +            tmp1 = tmp3 - tmp2;<br>
>>> +<br>
>>> +            /* column machine */<br>
>>> +            pix = clip(column_state[1][j] - tmp2);<br>
>>> +<br>
>>> +            /* mix with original signal and write to output */<br>
>>> +            p_out[(i - 1) * i_out_pitch + j - 1] =<br>
>>> +            clip( p_src[(i - 1) * i_src_pitch + j - 1] +<br>
>>> +                  p_filter->p_sys->tab_precalc[pix + 256]);<br>
>>> +<br>
>>> +            column_state[1][j] = tmp1 - column_state[0][j];<br>
>>> +            column_state[0][j] = tmp2;<br>
>>> +        }<br>
>>> +<br>
>>> +        /* copy last pixel */<br>
>>> +        p_out[i * i_out_pitch + i_visible_pitch - 1] =<br>
>>> +            p_src[i * i_src_pitch + i_visible_pitch - 1];<br>
>>> +    }<br>
>>> +<br>
>>> +    /* copy last row */<br>
>>> +    for( j = 0; j < i_visible_pitch; j++ )<br>
>>> +        p_out[(i_visible_lines - 1) * i_out_pitch + j] =<br>
>>> +            p_src[(i_visible_lines - 1) * i_src_pitch + j];<br>
>>> +<br>
>>> +    vlc_mutex_unlock( &p_filter->p_sys->lock );<br>
>>> +<br>
>>> +    plane_CopyPixels( &p_outpic->p[U_PLANE], &p_pic->p[U_PLANE] );<br>
>>> +    plane_CopyPixels( &p_outpic->p[V_PLANE], &p_pic->p[V_PLANE] );<br>
>>> +<br>
>>> +    return CopyInfoAndRelease( p_outpic, p_pic );<br>
>>> +}<br>
>>> +<br>
>>> +static int SharpenCallback( vlc_object_t *p_this, char const *psz_var,<br>
>>> +                            vlc_value_t oldval, vlc_value_t newval,<br>
>>> +                            void *p_data )<br>
>>> +{<br>
>>> +    VLC_UNUSED(p_this); VLC_UNUSED(oldval); VLC_UNUSED(psz_var);<br>
>>> +    filter_sys_t *p_sys = (filter_sys_t *)p_data;<br>
>>> +<br>
>>> +    vlc_mutex_lock( &p_sys->lock );<br>
>>> +    init_precalc_table( p_sys,  VLC_CLIP( newval.f_float, 0., 2. ) );<br>
>>> +    vlc_mutex_unlock( &p_sys->lock );<br>
>>> +    return VLC_SUCCESS;<br>
>>> +}<br>
>>> --<br>
>>> 1.9.0<br>
>>><br>
>><br>
>> Just to clarify, if the new algo is ok, a proper patch will be sent that will only modify sharpen.c. The GUI won't change.<br>
>> This was sent together just for testing/comparison purposes.<br>
>><br>
>> -t<br>
>><br>
>><br>
>> _______________________________________________<br>
>> vlc-devel mailing list<br>
>> To unsubscribe or modify your subscription options:<br>
>> <a href="https://mailman.videolan.org/listinfo/vlc-devel" target="_blank">https://mailman.videolan.org/listinfo/vlc-devel</a><br>
>><br>
><br>
><br>
><br>
> --<br>
> Félix Abecassis<br>
> <a href="http://felix.abecassis.me" target="_blank">http://felix.abecassis.me</a><br>
><br>
> _______________________________________________<br>
> vlc-devel mailing list<br>
> To unsubscribe or modify your subscription options:<br>
> <a href="https://mailman.videolan.org/listinfo/vlc-devel" target="_blank">https://mailman.videolan.org/listinfo/vlc-devel</a><br>
><br>
_______________________________________________<br>
vlc-devel mailing list<br>
To unsubscribe or modify your subscription options:<br>
<a href="https://mailman.videolan.org/listinfo/vlc-devel" target="_blank">https://mailman.videolan.org/listinfo/vlc-devel</a><br>
</div></div></blockquote></div></div></div><div><div><br><br clear="all"><br>-- <br>Félix Abecassis<div><a href="http://felix.abecassis.me" target="_blank">http://felix.abecassis.me</a></div>
</div></div></div></div>
</blockquote></div></div></div><div><div class="h5"><br><br clear="all"><br>-- <br>Félix Abecassis<div><a href="http://felix.abecassis.me" target="_blank">http://felix.abecassis.me</a></div>
</div></div></div>
</blockquote></div><br><br clear="all"><br>-- <br>Félix Abecassis<div><a href="http://felix.abecassis.me" target="_blank">http://felix.abecassis.me</a></div>
</div>