[vlc-devel] GSoC 2011: Stereoscopic 3D Playback

Fri Apr 8 15:53:46 CEST 2011

I've submitted my formal proposal through Google at the
http://www.google-melange.com/gsoc/dashboard/google/gsoc2011 website, and
I'll also paste a copy here for anybody who is interested in reviewing it:
-----

*1. Abstract*

For Google Summer of Code 2011, I, Andrew Price, am proposing to add
stereoscopic playback to the VideoLAN project and specifically to the VLC
media player. Stereoscopy, informally known as 3D, refers to the process
where two separate images are displayed to the viewer simultaneously, one
image assigned to each eye, giving the illusion of depth and volume. Upon
completion of my Summer of Code project, a user with the correct equipment
will be able to play back stereoscopic video through the VLC media player.

*2. My Background*

I am a 22-year-old student currently undertaking my Honours Bachelor of
Computer Science at the University of Adelaide in Adelaide, South Australia
in 2011. My adventures in programming started when I was 11 on an old
Commodore 128. I shortly moved on to a modern PC and taught myself C++. My
experience ranges from low-level operating system development, to real-time
computer graphics, to networking. More recently, I have taught myself C# and
I have been exposed to Java through my university education.

My experimentation with stereoscopic rendering began when I developed a
wrapper for the game Minecraft (http://www.minecraft.net) earlier this year
which enabled a player to play the game in stereoscopic 3D. To accomplish
this I modified the native OpenGL backend of LWJGL so that it forwarded the
rendered image through Direct3D, which merged the images of the two separate
eyes into the output format based on the stereoscopic method being used. My
Minecraft wrapper supports the traditional anaglyph glasses and the nVidia
3D Vision shutter glasses. More information on my Minecraft wrapper is
available through this link:
http://www.minecraftforum.net/viewtopic.php?f=1022&t=201762

At present I am working on a colour calibration video filter module to
familiarise myself with the VideoLAN coding standard and build system.

*3. Reasoning*

With the recent rise of the 3D film industry, the availability of 3D
televisions, and the affordability of home 3D solutions such as the nVidia
3D Vision Kit, the amount of stereoscopic video content out there continues
to increase. There are already media playing programs available that support
the playback of stereoscopic video including 3dtv.at’s Stereoscopic Player,
Cyberlink PowerDVD, and nVidia 3D Vision Stereoscopic 3D Player. My project,
which aims to add stereoscopic video playback to the VideoLAN project, will
make VLC media player a stronger competitor and the first open source media
player with this capability.

*4. Implementation*

Supporting stereoscopic playback will involve changes in the VLC rendering
pipeline. Two new stages will be added to the rendering pipeline. The first
stage occurs directly after receiving the video stream, and this involves
separating the video stream into two separate left and right images. Video
filters and text overlays such as subtitles will be applied to each image
individually. The final stage just before displaying the image will be to
combine the two images into a format suitable for display.
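
The flow described above can be sketched roughly as follows. This is only
an illustrative Python model, not VLC code: the `Frame` type,
`render_stereo_frame`, and the stand-in split, filter, and combine functions
are all hypothetical names chosen for the example, and a frame is assumed to
be a simple list of pixel rows.

```python
from typing import Callable, List, Tuple

# Hypothetical frame type for illustration: rows of 8-bit pixel values.
Frame = List[List[int]]


def render_stereo_frame(
    source: Frame,
    split: Callable[[Frame], Tuple[Frame, Frame]],
    filters: List[Callable[[Frame], Frame]],
    combine: Callable[[Frame, Frame], Frame],
) -> Frame:
    # New first stage: separate the decoded frame into per-eye images.
    left, right = split(source)
    # Existing stages: every filter is applied to each eye individually.
    for apply_filter in filters:
        left = apply_filter(left)
        right = apply_filter(right)
    # New final stage: merge both images into one displayable frame.
    return combine(left, right)


# Stand-in stages, used only to exercise the pipeline.
def split_side_by_side(frame: Frame) -> Tuple[Frame, Frame]:
    half = len(frame[0]) // 2
    return [row[:half] for row in frame], [row[half:] for row in frame]


def invert(frame: Frame) -> Frame:
    return [[255 - pixel for pixel in row] for row in frame]


def join_side_by_side(left: Frame, right: Frame) -> Frame:
    return [l_row + r_row for l_row, r_row in zip(left, right)]


if __name__ == "__main__":
    frame = [[0, 10, 200, 255]]
    print(render_stereo_frame(frame, split_side_by_side, [invert],
                              join_side_by_side))
```

The point of passing the split and combine steps in as parameters is that
the filter stages in the middle do not need to know which input format or
display technology is in use.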

I recommend adding two new categories of modules. Each category will
correspond to one of the new stages introduced above: the stereoscopic input
modules, which take a source and extract a left and right image, and the
stereoscopic output modules, which take a processed left and right image and
combine them into a displayable format. I believe implementing these as new
categories of modules is beneficial since it will keep the implementation
clean and gives developers the opportunity to support new input and display
formats as technology evolves.
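
To illustrate what the two categories might look like to a module author,
here is a minimal sketch in Python. It is not the VLC module API: the class
names, the `register` helper, and the two registries are hypothetical and
exist only to show how new formats could be plugged in without touching the
rest of the pipeline.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple, Type

# Hypothetical frame type for illustration: rows of pixel values.
Frame = List[List[int]]


class StereoInputModule(ABC):
    """Category 1: takes a source and extracts a left and right image."""

    @abstractmethod
    def extract(self, source) -> Tuple[Frame, Frame]:
        ...


class StereoOutputModule(ABC):
    """Category 2: combines a left and right image for display."""

    @abstractmethod
    def combine(self, left: Frame, right: Frame) -> Frame:
        ...


# Hypothetical registries, one per module category.
INPUT_MODULES: Dict[str, Type[StereoInputModule]] = {}
OUTPUT_MODULES: Dict[str, Type[StereoOutputModule]] = {}


def register(registry: dict, name: str):
    """Class decorator that files a module under a name in a registry."""
    def decorator(cls):
        registry[name] = cls
        return cls
    return decorator


@register(INPUT_MODULES, "dual-stream")
class DualStreamInput(StereoInputModule):
    def extract(self, source):
        # Assumes the source is already a (left, right) pair of frames.
        left, right = source
        return left, right


@register(OUTPUT_MODULES, "cross-eyed")
class CrossEyedOutput(StereoOutputModule):
    def combine(self, left, right):
        # Right image on the left side, left image on the right side.
        return [r_row + l_row for l_row, r_row in zip(left, right)]
```

Supporting a new format would then mean adding one more registered class in
the relevant category.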

*5. Stereoscopic Input Modules*

The stereoscopic input modules I am aiming to develop as part of this
project are:

# Side by side rendering - where a single video stream has double the
width of the image shown to each eye, and the image for each eye is
stored side by side. This is the most common format of stereoscopic video
found online.

# Dual streams in one file - where a single file contains multiple video
streams, and each stream is destined for a specific eye.

# Dual streams in multiple files - where there are multiple files with a
single video stream in each and each file is destined for a specific eye.
This is common in film production where multiple cameras have been used to
record a scene in stereoscopic 3D that has not yet been converted to another
format.

# 2D plus depth - where there is a single colour video stream containing the
2D image, but a secondary monochrome video stream that contains the depth
value of each pixel. The depth and colour of each pixel is projected into a
3D space which the image for the left and right eye is extracted from. The
advantage of this method is that the viewer can adjust the intensity of the
stereoscopy. The limitations of this format are that it is computationally
heavy at high resolutions to project and stretch each pixel, and that, with
only a single depth value for each pixel, it is impossible to represent
transparency or reflection. Because of these limitations this format is
considered deprecated and the amount of content being produced in it is
declining.

# 2D plus delta - where a single video stream contains the complete image
for one eye and a second video stream contains the colour offset to add to
each pixel to construct the image for the other eye. This is a method of
compressing two images by relying on the fact that, because the two eyes
are close together, the colour of the same pixel in each eye's image has a
high chance of being similar. Because of this the secondary or ‘delta’ video
stream may be stored with lower precision.
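
Two of the formats above involve per-pixel arithmetic, so a small sketch may
help. The Python below is illustrative only and works on a single row of
8-bit values. It makes several assumptions that are not part of the
proposal: the delta is quantised with a hypothetical `step` parameter to
model "lower precision", a depth value of 255 means nearest to the viewer,
and holes left by shifted pixels are filled from the neighbour to the left.

```python
from typing import List, Optional

Row = List[int]  # one row of 8-bit pixel values (hypothetical type)


def clamp(value: int) -> int:
    return max(0, min(255, value))


def encode_delta(base: Row, other: Row, step: int = 4) -> Row:
    # 2D plus delta: store the offset between the eyes at reduced precision.
    return [round((o - b) / step) for b, o in zip(base, other)]


def decode_delta(base: Row, delta: Row, step: int = 4) -> Row:
    # Rebuild the second eye by adding the rescaled offset to the first.
    return [clamp(b + d * step) for b, d in zip(base, delta)]


def project_row(colour: Row, depth: Row, max_shift: int, direction: int) -> Row:
    # 2D plus depth: shift each pixel horizontally in proportion to depth.
    # direction is +1 or -1 (one per eye); max_shift is the viewer-adjustable
    # intensity of the effect, in pixels.
    width = len(colour)
    out: List[Optional[int]] = [None] * width
    # Draw far pixels first so that nearer pixels overwrite them.
    for x in sorted(range(width), key=lambda i: depth[i]):
        target = x + direction * ((depth[x] * max_shift) // 255)
        if 0 <= target < width:
            out[target] = colour[x]
    # Fill holes exposed by the shift from the neighbour on the left.
    for x in range(width):
        if out[x] is None:
            out[x] = out[x - 1] if x > 0 else colour[0]
    return out


if __name__ == "__main__":
    left = [100, 100, 250]
    right = [108, 97, 255]
    delta = encode_delta(left, right)
    print(delta, decode_delta(left, delta))
    print(project_row([1, 2, 3, 4], [0, 0, 255, 0], max_shift=1, direction=1))
```

With `step` set to 1 the delta round trip is exact, and with `max_shift` set
to 0 the projected row equals the 2D row, which corresponds to the viewer
turning the stereoscopic effect off.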

*6. Stereoscopic Output Modules*

The stereoscopic output modules I plan to develop for the different display
technologies are:

# nVidia 3D Vision - a proprietary shutter glasses solution from nVidia and
currently one of the most affordable solutions for full-colour stereoscopy.
The consumer version of the nVidia 3D Vision shutter glasses functions only
in full screen, is limited to Windows Vista and higher, and requires a
supported nVidia graphics card and a supported monitor. However, the large
install base makes it an attractive technology to support.

# Anaglyph - glasses with tinted lenses filter the colours to each eye, so
each eye receives a different colour channel. It is an affordable
alternative to other stereoscopic technologies and it works on any type of
colour display or hardware. The disadvantage is that, depending on the
colour of the glasses, the viewer either loses a colour channel or sees only
a monochrome image.

# Side by side/cross eyed - the image for each eye is output next to the
image for the other eye. If the right image is shown on the left side, and
the left image is shown on the right side, then it is possible to achieve a
stereoscopic effect without any glasses by simply looking at the image
cross-eyed.

# Quad-buffered stereo - an OpenGL format supported by the professional
nVidia 3D Vision glasses and AMD video cards.

# Row interleaved stereo - successive rows alternate between the image for
the left eye and the image for the right eye. The disadvantage of this
method is that the vertical resolution of the final image is halved;
however, when dealing with an interleaved display it becomes possible to
store two interleaved video streams in a single progressive video stream
without any loss of detail.
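
Two of the output formats above are simple enough to sketch directly. The
Python below is illustrative only and is not the proposed module code. It
assumes pixels are `(r, g, b)` tuples, that the anaglyph glasses are
red-cyan (red channel from the left image, green and blue from the right),
and that even rows carry the left eye; the function names are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (r, g, b), assumed for this sketch
Image = List[List[Pixel]]


def anaglyph_red_cyan(left: Image, right: Image) -> Image:
    # Each eye receives different colour channels: red from the left image,
    # green and blue from the right image (red-cyan glasses assumed).
    return [
        [(l[0], r[1], r[2]) for l, r in zip(l_row, r_row)]
        for l_row, r_row in zip(left, right)
    ]


def row_interleave(left: Image, right: Image) -> Image:
    # Even rows come from the left image, odd rows from the right image,
    # so each eye keeps only half of its vertical resolution.
    return [
        list(l_row) if y % 2 == 0 else list(r_row)
        for y, (l_row, r_row) in enumerate(zip(left, right))
    ]


if __name__ == "__main__":
    red = (255, 0, 0)
    blue = (0, 0, 255)
    left = [[red], [red]]
    right = [[blue], [blue]]
    print(anaglyph_red_cyan(left, right))
    print(row_interleave(left, right))
```

Both functions expect the left and right images to have the same
dimensions; a real module would need to validate that.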

*7. Deliverables*

By the end of the Google Summer of Code program I intend to deliver the
following:

# The code required to support stereoscopy modules and video playback.

# The code for the modules supporting the various input formats and
stereoscopic display technologies mentioned above.

# The developer documentation on changes that needed to be made to
existing VideoLAN code to support stereoscopic playback.

# The developer documentation to assist other developers in developing
modules which extend the capability to support different input formats and
stereoscopic display technologies.

# The user documentation on how to watch stereoscopic movies through the
VLC media player.

The documentation will either be in the form of standalone documents or
(preferably) integrated into the VideoLAN wiki.

*8. Timeline*

The timeline for my project will be split up into four quarters, with
quarter two and quarter three separated by the mid-term evaluation. The
project commences on 23 May. In quarter one I will design the stereoscopic
playback architecture and experiment with adding stereoscopy to the VLC
media player.

By the end of quarter two (the mid-term evaluation) I will have the
side-by-side stereoscopic input module and nVidia 3D Vision output module
implemented. It will be possible to view side-by-side stereoscopic video
through the nVidia 3D Vision shutter glasses with the VLC media player.
Side-by-side stereoscopic video is chosen as the first input module to
develop because it’s the most popular storage format of stereoscopic 3D
content found online. nVidia 3D Vision was chosen as the first output module
to develop because of the large install base.

By the end of quarter three I will have implemented a variety of stereoscopic
input and output modules, with focus on those listed above. This should be a
very straightforward process because most of the minor details will have
been sorted out during quarter two. Quarter four will be spent working on
documentation for both other developers and the end user, and user support.

*9. Future Work*

Having stereoscopic playback in the VLC media player opens up the possibility
of future extensions to the VideoLAN project which are beyond the scope of
my Google Summer of Code project. Some of the possible ideas beyond this
project include a stereoscopic input module that automatically extracts a
stereoscopic image from a 2D source, using stereoscopy to emulate looking
through a computer monitor to view a video on a large virtual theatre
screen, support for the playback of 3D Blu-ray titles, streaming
stereoscopic video over a network, and producing modules to support
further input formats and 3D display technologies.

On Fri, Apr 8, 2011 at 3:39 PM, Jean-Baptiste Kempf <jb at videolan.org> wrote:

> ... of all hope.
>
> Nah, seriously, deadline for applications for GSoC 2011 is TODAY at
> 19:00 UTC.
>
> Please don't submit them at the last minute.
>
> Best Regards,
>
> --
> Jean-Baptiste Kempf
> http://www.jbkempf.com/ - +33 672 704 734
> Sent from my Electronic Device