SwePub
Search the SwePub database

  Advanced search

Result list for search "WFRF:(Olsson Roger 1973 )"

Search: WFRF:(Olsson Roger 1973 )

  • Results 1-46 of 46
1.
  • Boström, Lena, 1960-, et al. (author)
  • Digital visualisering i skolan : Mittuniversitetets slutrapport från förstudien
  • 2018
  • Report (other academic/artistic) abstract
    • The aim of this study was twofold: to test alternative learning methods via a digital teaching aid in mathematics in a quasi-experimental study, and to apply methods for assessing user experiences of interactive visualizations, thereby increasing knowledge of how perceived quality depends on the technology used. The pilot study also highlights several pressing areas in school development, both regionally and nationally, as well as important aspects of the link between technology, pedagogy, and evaluation methods within "the technical part". The former concerns declining mathematics results in schools, practice-based school research, strengthened digital competence, visualization and learning, and research on visualization and evaluation. The latter answers questions about which technical solutions have previously been used and for what purpose they were created, and how visualizations have been evaluated in textbooks and in the research literature. Regarding student results, one of the major research questions of the study, we found no significant differences between traditional teaching and teaching with the visualization teaching aid (3D). Regarding students' attitudes toward the mathematics unit, the attitude in the year 6 control group improved significantly, but not in year 8. Concerning girls' and boys' results and attitudes, the girls in both classes had better prior knowledge than the boys, and in year 6 the girls in the control group were more positive toward the mathematics unit than the boys. Beyond that, we discern no significant differences. Other important findings were that the test design was not optimal and that the time of day at which the test was administered mattered greatly. Further results from the qualitative analysis point to positive attitudes and behaviors among the students when working with the visual teaching aid. The students' collaboration and communication improved during the lessons. The teachers also noted that the 3D teaching aid offered greater opportunities to stimulate several senses during the learning process. A clear conclusion is that the 3D teaching aid is an important complement in teaching, but cannot be used entirely on its own. We can side neither with the researchers who consider 3D visualization a superior teaching aid for student results, nor with those who warn of its effects on students' cognitive overload. Our results are more in line with the conclusions drawn by the Swedish Institute for Educational Research (Skolforskningsinstitutet, 2017), namely that teaching with digital aids in mathematics can have positive effects, but equally effective teaching can possibly be designed in other ways. However, the results of our study point to several disturbance factors that may have affected the possible results, and to the need for good technology and well-developed software. In the study, we analyzed the results using two overarching frameworks for integrating technology support in learning, SAMR and TPACK. The former framework contributed a taxonomy for discussing how well the technology's possibilities were exploited by the teaching aids and in the learning activities; the latter supported a discussion of the didactic questions with a focus on the role of technology. Both aspects are highly topical given the increasing digitalization of schools. Based on previous research and this pilot study, we understand that it is important to design the research methods carefully. A randomization of groups would be desirable. Performance measures can also be difficult to choose.
Tests in which people evaluate usability and user experience (UX), based on both qualitative and quantitative methods, are important for the actual use of the technology, but further evaluations are needed to link the technology and the visualization to the quality of learning and teaching. Several methods are thus needed, and collaboration between different subjects and disciplines becomes important.
2.
  • Ahmad, Waqas, et al. (author)
  • Compression scheme for sparsely sampled light field data based on pseudo multi-view sequences
  • 2018
  • In: Optics, Photonics, and Digital Technologies for Imaging Applications V, Proceedings of SPIE - The International Society for Optical Engineering. - : SPIE - International Society for Optical Engineering.
  • Conference paper (peer-reviewed) abstract
    • With the advent of light field acquisition technologies, the captured information of a scene is enriched by having both angular and spatial information. The captured information provides additional capabilities in the post-processing stage, e.g. refocusing, 3D scene reconstruction, synthetic aperture, etc. Light field capturing devices are classified into two categories. In the first category, a single plenoptic camera is used to capture a densely sampled light field, and in the second category, multiple traditional cameras are used to capture a sparsely sampled light field. In both cases, the size of the captured data increases with the additional angular information. The recent call for proposals on compression of light field data by JPEG, also called "JPEG Pleno", reflects the need for a new and efficient light field compression solution. In this paper, we propose a compression solution for sparsely sampled light field data. In a multi-camera system, each view depicts the scene from a single perspective. We propose to interpret each single view as a frame of a pseudo video sequence. In this way, the complete MxN views of the multi-camera system are treated as M pseudo video sequences, where each pseudo video sequence contains N frames. The central pseudo video sequence is taken as the base view, and the first frame in all the pseudo video sequences is taken as the base picture order count (POC). The frame contained in the base view and base POC is labeled the base frame. The remaining frames are divided into three predictor levels. Frames placed in each successive level can take prediction from previously encoded frames; however, the frames assigned the last prediction level are not used for prediction of other frames. Moreover, the rate allocation for each frame is performed by taking into account its predictor level, its frame distance, and its view-wise decoding distance relative to the base frame. The multi-view extension of high efficiency video coding (MV-HEVC) is used to compress the pseudo multi-view sequences. The MV-HEVC compression standard enables the frames to take prediction in both directions (horizontal and vertical), and MV-HEVC parameters are used to implement the proposed 2D prediction and rate allocation scheme. A subset of four light field images from the Stanford dataset is compressed using the proposed compression scheme at four bitrates, in order to cover the low to high bitrate scenarios. The comparison is made with the state-of-the-art reference encoder HEVC and its real-time implementation x265. The 17x17 grid is converted into a single pseudo sequence of 289 frames by following the order explained in the JPEG Pleno call for proposals and given as input to both reference schemes. The rate-distortion analysis shows that the proposed compression scheme outperforms both reference schemes in all tested bitrate scenarios for all test images. The average BD-PSNR gain is 1.36 dB over HEVC and 2.15 dB over x265. (A toy sketch of the pseudo-sequence arrangement follows this entry.)
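The pseudo-video-sequence arrangement described above maps naturally onto a small helper. The sketch below is a minimal, hypothetical illustration (not the authors' implementation): it treats an M x N grid of views as M pseudo sequences of N frames, marks the central view and first POC as the base frame, and assigns each remaining frame a predictor level by its distance from the base.

```python
import numpy as np

def arrange_pvs(views_mn):
    """Interpret an M x N grid of views as M pseudo video sequences.

    views_mn: list of M rows, each a list of N view images (numpy arrays);
    each row becomes one pseudo video sequence. The central row is the base
    view and POC 0 is the base picture order count, as in the abstract.
    """
    m = len(views_mn)
    base_view = m // 2          # central pseudo sequence
    base_poc = 0                # first frame in every sequence
    return views_mn, base_view, base_poc

def predictor_level(view_idx, poc, base_view, base_poc, num_levels=3):
    """Toy rate-allocation helper: frames farther from the base frame get a
    higher (lower-priority) predictor level, capped at num_levels."""
    distance = abs(view_idx - base_view) + abs(poc - base_poc)
    if distance == 0:
        return 0                # the base frame itself
    return min(distance, num_levels)

# Example: a 17x17 light field grid, as used for the Stanford images.
views = [[np.zeros((8, 8)) for _ in range(17)] for _ in range(17)]
seqs, bv, bp = arrange_pvs(views)
print(bv, bp, predictor_level(0, 16, bv, bp))
```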
3.
  • Ahmad, Waqas, et al. (author)
  • Computationally Efficient Light Field Image Compression Using a Multiview HEVC Framework
  • 2019
  • In: IEEE Access. - 2169-3536. ; 7, pp. 143002-143014
  • Journal article (peer-reviewed) abstract
    • The acquisition of the spatial and angular information of a scene using light field (LF) technologies supplements a wide range of post-processing applications, such as scene reconstruction, refocusing, virtual view synthesis, and so forth. The additional angular information possessed by LF data increases the size of the overall data captured while offering the same spatial resolution. The main contributor to the size of the captured data (i.e., the angular information) contains a high correlation that is exploited by state-of-the-art video encoders by treating the LF as a pseudo video sequence (PVS). The interpretation of the LF as a single PVS restricts the encoding scheme to utilizing only the single-dimensional angular correlation present in the LF data. In this paper, we present an LF compression framework that efficiently exploits the spatial and angular correlation using a multiview extension of high-efficiency video coding (MV-HEVC). The input LF views are converted into multiple PVSs and are organized hierarchically. The rate-allocation scheme takes into account the assigned organization of frames and distributes quality/bits among them accordingly. Subsequently, the reference picture selection scheme prioritizes the reference frames based on the assigned quality. The proposed compression scheme is evaluated by following the common test conditions set by JPEG Pleno. The proposed scheme performs 0.75 dB better compared to state-of-the-art compression schemes and 2.5 dB better compared to the x265-based JPEG Pleno anchor scheme. Moreover, an optimized motion-search scheme is proposed in the framework that reduces the computational complexity (in terms of the sum of absolute difference [SAD] computations) of motion estimation by up to 87% with a negligible loss in visual quality (approximately 0.05 dB). (A toy illustration of counting SAD operations follows this entry.)
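The complexity figure above counts sum-of-absolute-difference (SAD) operations in motion estimation. The toy sketch below (an assumption-laden illustration, not the paper's optimized scheme) shows why pruning the candidate set cuts cost: a full search over a window of radius R evaluates (2R+1)^2 candidates per block, so any scheme that restricts the candidates reduces SAD computations proportionally.

```python
import numpy as np

def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return np.abs(block_a.astype(np.int32) - block_b.astype(np.int32)).sum()

def motion_search(cur, ref, bx, by, bs=8, radius=4, candidates=None):
    """Find the displacement minimizing SAD for the block at (bx, by).

    candidates: optional iterable of (dx, dy) to test; None = full search.
    Returns (best_dx, best_dy, num_sad_ops) so the saving of a restricted
    candidate set can be measured directly.
    """
    if candidates is None:
        candidates = [(dx, dy) for dx in range(-radius, radius + 1)
                               for dy in range(-radius, radius + 1)]
    block = cur[by:by + bs, bx:bx + bs]
    best, ops = (0, 0, np.inf), 0
    for dx, dy in candidates:
        x, y = bx + dx, by + dy
        if 0 <= x and 0 <= y and x + bs <= ref.shape[1] and y + bs <= ref.shape[0]:
            cost = sad(block, ref[y:y + bs, x:x + bs])
            ops += 1
            if cost < best[2]:
                best = (dx, dy, cost)
    return best[0], best[1], ops

rng = np.random.default_rng(0)
cur = rng.integers(0, 256, (64, 64), dtype=np.uint8)
ref = np.roll(cur, (1, 2), axis=(0, 1))            # shift scene by (dy=1, dx=2)
print(motion_search(cur, ref, 16, 16))             # full search: 81 SADs
print(motion_search(cur, ref, 16, 16, candidates=[(2, 1)]))  # pruned: 1 SAD
```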
4.
  • Ahmad, Waqas (author)
  • High Efficiency Light Field Image Compression : Hierarchical Bit Allocation and Shearlet-based View Interpolation
  • 2021
  • Doctoral thesis (other academic/artistic) abstract
    • Over the years, the pursuit of capturing the precise visual information of a scene has resulted in various enhancements in digital camera technology, such as high dynamic range, extended depth of field, and high resolution. However, traditional digital cameras only capture the spatial information of the scene and cannot provide an immersive presentation of it. Light field (LF) capturing is a new-generation imaging technology that records the spatial and angular information of the scene. In recent years, LF imaging has become increasingly popular among the industry and research community, mainly for two reasons: (1) the advancements made in optical and computational technology have facilitated the process of capturing and processing LF information, and (2) LF data have the potential to offer various post-processing applications, such as refocusing at different depth planes, synthetic aperture, 3D scene reconstruction, and novel view generation. Generally, LF-capturing devices acquire large amounts of data, which poses a challenge for storage and transmission resources. Off-the-shelf image and video compression schemes, built on assumptions drawn from natural images and video, tend to exploit spatial and temporal correlations. However, 4D LF data inherit different properties, and hence there is a need to advance the current compression methods to efficiently address the correlation present in LF data.
In this thesis, compression of LF data captured using a plenoptic camera and a multi-camera system (MCS) is considered. Perspective views of a scene captured from different positions are interpreted as frames of multiple pseudo-video sequences and given as input to a multi-view extension of high-efficiency video coding (MV-HEVC). A 2D prediction and hierarchical coding scheme is proposed in MV-HEVC to improve the compression efficiency of LF data. To further increase the compression efficiency of views captured using an MCS, an LF reconstruction scheme based on the shearlet transform is introduced in LF compression. A sparse set of views is coded using MV-HEVC and later used to predict the remaining views by applying the shearlet transform. The prediction error is also coded to further increase the compression efficiency. Publicly available LF datasets are used to benchmark the proposed compression schemes. The anchor scheme specified in the JPEG Pleno common test conditions is used to evaluate the performance of the proposed scheme. Objective evaluations show that the proposed scheme outperforms state-of-the-art schemes in the compression of LF data captured using a plenoptic camera and an MCS. Moreover, the introduction of the shearlet transform in LF compression further improves the compression efficiency at low bitrates, at which the human vision system is sensitive to the perceived quality.
The work presented in this thesis has been published in four peer-reviewed conference proceedings and two scientific journals. The proposed compression solutions outlined in this thesis significantly improve the rate-distortion efficiency for LF content, which reduces the transmission and storage resources. The MV-HEVC-based LF coding scheme is made publicly available, which can help researchers to test novel compression tools, and it can serve as an anchor scheme for future research studies. The shearlet-transform-based LF compression scheme presents a comprehensive framework for testing LF reconstruction methods in the context of LF compression.
5.
6.
  • Ahmad, Waqas, et al. (author)
  • Interpreting Plenoptic Images as Multi-View Sequences for Improved Compression
  • 2017
  • In: ICIP 2017. - : IEEE. - 9781509021758 ; pp. 4557-4561
  • Conference paper (peer-reviewed) abstract
    • Over the last decade, advancements in optical devices have made it possible for novel image acquisition technologies to appear. Angular information for each spatial point is acquired in addition to the spatial information of the scene, which enables 3D scene reconstruction and various post-processing effects. The current generation of plenoptic cameras spatially multiplexes the angular information, which implies an increase in image resolution to retain the level of spatial information gathered by conventional cameras. In this work, the resulting plenoptic image is interpreted as a multi-view sequence that is efficiently compressed using the multi-view extension of high efficiency video coding (MV-HEVC). A novel two-dimensional weighted prediction and rate allocation scheme is proposed to adapt the HEVC compression structure to the plenoptic image properties. The proposed coding approach is a response to the ICIP 2017 Grand Challenge: Light Field Image Coding. The proposed scheme outperforms all ICME contestants, and improves on the JPEG anchor of ICME with an average PSNR gain of 7.5 dB and on the HEVC anchor of the ICIP 2017 Grand Challenge with an average PSNR gain of 2.4 dB. (A minimal PSNR helper follows this entry.)
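The dB figures quoted above are average PSNR gains over the anchors. As a reference for how such numbers are obtained, here is a minimal, generic PSNR helper (assuming 8-bit images; the Grand Challenge used its own evaluation scripts):

```python
import numpy as np

def psnr(reference, reconstruction, peak=255.0):
    """Peak signal-to-noise ratio in dB between two 8-bit images."""
    mse = np.mean((reference.astype(np.float64) -
                   reconstruction.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

# The "average PSNR gain" of scheme A over scheme B at a common bitrate is
# then simply mean(psnr_A) - mean(psnr_B) over the test images.
```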
7.
  • Ahmad, Waqas, et al. (author)
  • Shearlet Transform-Based Light Field Compression under Low Bitrates
  • 2020
  • In: IEEE Transactions on Image Processing. - : IEEE. - 1057-7149 .- 1941-0042. ; 29, pp. 4269-4280
  • Journal article (peer-reviewed) abstract
    • Light field (LF) acquisition devices capture spatial and angular information of a scene. In contrast with traditional cameras, the additional angular information enables novel post-processing applications, such as 3D scene reconstruction, the ability to refocus at different depth planes, and synthetic aperture. In this paper, we present a novel compression scheme for LF data captured using multiple traditional cameras. The input LF views were divided into two groups: key views and decimated views. The key views were compressed using the multi-view extension of high-efficiency video coding (MV-HEVC) scheme, and the decimated views were predicted using the shearlet-transform-based prediction (STBP) scheme. Additionally, the residual information of the predicted views was also encoded and sent along with the coded stream of key views. The proposed scheme was evaluated over benchmark multi-camera-based LF datasets, demonstrating that incorporating the residual information into the compression scheme increased the overall peak signal-to-noise ratio (PSNR) by 2 dB. The proposed compression scheme performed significantly better at low bitrates compared to the anchor schemes, which have a better level of compression efficiency in high-bitrate scenarios. The sensitivity of the human vision system to compression artifacts, specifically at low bitrates, favors the proposed compression scheme over the anchor schemes.
8.
  • Ahmad, Waqas, et al. (author)
  • Shearlet Transform Based Prediction Scheme for Light Field Compression
  • 2018
  • Conference paper (peer-reviewed) abstract
    • Light field acquisition technologies capture angular and spatial information of the scene. The spatial and angular information enables various post-processing applications, e.g. 3D scene reconstruction, refocusing, synthetic aperture, etc., at the expense of an increased data size. In this paper, we present a novel prediction tool for compression of light field data acquired with a multiple-camera system. The captured light field (LF) can be described using a two-plane parametrization as L(u, v, s, t), where (u, v) represents each view's image plane coordinates and (s, t) represents the coordinates of the capturing plane. In the proposed scheme, the captured LF is uniformly decimated by a factor d in both directions (in the s and t coordinates), resulting in a sparse set of views also referred to as key views. The key views are converted into a pseudo video sequence and compressed using high efficiency video coding (HEVC). The shearlet-transform-based reconstruction approach, presented in [1], is used at the decoder side to predict the decimated views with the help of the key views.
Four LF images (Truck and Bunny from the Stanford dataset, Set2 and Set9 from the High Density Camera Array dataset) are used in the experiments. The input LF views are converted into a pseudo video sequence and compressed with HEVC to serve as the anchor. Rate-distortion analysis shows an average PSNR gain of 0.98 dB over the anchor scheme. Moreover, at low bitrates, the compression efficiency of the proposed scheme is higher compared to the anchor, while on the other hand the performance of the anchor is better at high bitrates. The different compression responses of the proposed and anchor schemes are a consequence of their utilization of the input information. In the high-bitrate scenario, high-quality residual information enables the anchor to achieve efficient compression. On the contrary, the shearlet transform relies on the key views to predict the decimated views without incorporating residual information; hence, it has an inherent reconstruction error. In the low-bitrate scenario, the bit budget of the proposed compression scheme allows the encoder to achieve high quality for the key views. The HEVC anchor scheme distributes the same bit budget among all the input LF views, which results in degradation of the overall visual quality. The sensitivity of the human vision system toward compression artifacts in low-bitrate cases favours the proposed compression scheme over the anchor scheme. (A sketch of the two-plane indexing and decimation follows this entry.)
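The two-plane representation L(u, v, s, t) and the uniform decimation by a factor d map naturally onto a 4D array. The sketch below is a minimal illustration under that assumption (array axes ordered (s, t, u, v), toy sizes): it selects the key views that would go to the video encoder and lists the decimated views that the shearlet-based reconstruction must predict.

```python
import numpy as np

# A toy light field: 9 x 9 capturing positions (s, t), 16 x 16 pixel views (u, v).
lf = np.random.default_rng(1).random((9, 9, 16, 16))

d = 4  # decimation factor in both s and t, as in the abstract

key_views = lf[::d, ::d]            # sparse set coded with (MV-)HEVC
key_coords = {(s, t) for s in range(0, 9, d) for t in range(0, 9, d)}
decimated = [(s, t) for s in range(9) for t in range(9)
             if (s, t) not in key_coords]  # predicted via the shearlet transform

print(key_views.shape)   # (3, 3, 16, 16) -> 9 key views
print(len(decimated))    # 72 views to be predicted at the decoder
```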
9.
10.
  • Ahmad, Waqas, et al. (author)
  • Towards a generic compression solution for densely and sparsely sampled light field data
  • 2018
  • In: Proceedings of the 25th IEEE International Conference on Image Processing. - 9781479970612 ; pp. 654-658
  • Conference paper (peer-reviewed) abstract
    • Light field (LF) acquisition technologies capture the spatial and angular information present in scenes. The angular information paves the way for various post-processing applications such as scene reconstruction, refocusing, and synthetic aperture. The light field is usually captured by a single plenoptic camera or by multiple traditional cameras. The former captures a dense LF, while the latter captures a sparse LF. This paper presents a generic compression scheme that efficiently compresses both densely and sparsely sampled LFs. A plenoptic image is converted into sub-aperture images, and each sub-aperture image is interpreted as a frame of a multi-view sequence. Likewise, each view of the multi-camera system is treated as a frame of a multi-view sequence. The multi-view extension of high efficiency video coding (MV-HEVC) is used to encode the pseudo multi-view sequence. This paper proposes an adaptive prediction and rate allocation scheme that efficiently compresses LF data irrespective of the acquisition technology used. (A sketch of the sub-aperture extraction step follows this entry.)
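For the plenoptic branch, converting the raw lenslet image into sub-aperture images is a pure re-indexing step when the microlens grid is ideal (square, axis-aligned, integer pixel pitch; real decoding pipelines must first rectify the grid). A minimal sketch under those assumptions:

```python
import numpy as np

def to_subaperture(lenslet, ml):
    """Rearrange an ideal lenslet image into sub-aperture images.

    lenslet: (H, W) raw image with H, W divisible by the microlens pitch ml.
    Returns an (ml, ml, H/ml, W/ml) array: view (u, v) collects the pixel at
    offset (u, v) under every microlens.
    """
    h, w = lenslet.shape
    # (rows of lenses, u, cols of lenses, v) -> (u, v, rows, cols)
    return lenslet.reshape(h // ml, ml, w // ml, ml).transpose(1, 3, 0, 2)

raw = np.arange(36.0).reshape(6, 6)   # toy 6x6 sensor, 2x2 pixels per lens
views = to_subaperture(raw, 2)
print(views.shape)                    # (2, 2, 3, 3): 4 sub-aperture views
```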
11.
  • Conti, Caroline, et al. (author)
  • Light Field Image Compression
  • 2018
  • In: 3D Visual Content Creation, Coding and Delivery. - Cham : Springer. - 9783319778426 ; pp. 143-176
  • Book chapter (peer-reviewed)
12.
  • Damghanian, Mitra, et al. (author)
  • Depth and Angular Resolution in Plenoptic Cameras
  • 2015
  • In: 2015 IEEE International Conference on Image Processing (ICIP), September 2015. - : IEEE. ; pp. 3044-3048
  • Conference paper (peer-reviewed) abstract
    • We present a model-based approach to extract the depth and angular resolution in a plenoptic camera. The obtained results for the depth and angular resolution are validated against Zemax ray-tracing results. The model-based approach gives the location and number of the resolvable depth planes in a plenoptic camera, as well as the angular resolution with regard to disparity in pixels. The model-based approach is straightforward compared to practical measurements and can reflect plenoptic camera parameters such as the microlens f-number, in contrast with the principal-ray-model approach. Easy and accurate quantification of the different resolution terms forms the basis for designing the capturing setup and choosing a reasonable system configuration for plenoptic cameras. Results from this work will accelerate customization of plenoptic cameras for particular applications without the need for expensive measurements.
13.
  • Damghanian, Mitra, 1978-, et al. (author)
  • Investigating the lateral resolution in a plenoptic capturing system using the SPC model
  • 2013
  • In: Proceedings of SPIE - The International Society for Optical Engineering. - : SPIE - International Society for Optical Engineering. - 9780819494337 ; pp. 86600T-
  • Conference paper (peer-reviewed) abstract
    • Complex multidimensional capturing setups such as plenoptic cameras (PC) introduce a trade-off between various system properties. Consequently, established capturing properties, like image resolution, need to be described thoroughly for these systems. Therefore, models and metrics that assist in exploring and formulating this trade-off are highly beneficial for studying as well as designing complex capturing systems. This work demonstrates the capability of our previously proposed sampling pattern cube (SPC) model to extract the lateral resolution of plenoptic capturing systems. The SPC carries both ray information and focal properties of the capturing system it models. The proposed operator extracts the lateral resolution from the SPC model throughout an arbitrary number of depth planes, giving a depth-resolution profile. This operator utilizes the focal properties of the capturing system as well as the geometrical distribution of the light containers, which are the elements of the SPC model. We have validated the lateral resolution operator for different capturing setups by comparing the results with those from Monte Carlo numerical simulations based on the wave optics model. The lateral resolution predicted by the SPC model agrees with the results from the more complex wave optics model better than both the ray-based model and our previously proposed lateral resolution operator. This agreement strengthens the conclusion that the SPC fills the gap between ray-based models and the real system performance by including the focal information of the system as a model parameter. The SPC is proven to be a simple yet efficient model for extracting the lateral resolution as a high-level property of complex plenoptic capturing systems.
14.
  • Damghanian, Mitra, 1978-, et al. (author)
  • Performance analysis in Lytro camera: Empirical and model based approaches to assess refocusing quality
  • 2014
  • In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. - : IEEE conference proceedings. ; pp. 559-563
  • Conference paper (peer-reviewed) abstract
    • In this paper we investigate the performance of the Lytro camera in terms of its refocusing quality. The refocusing quality of the camera is related to the spatial resolution and the depth of field as the contributing parameters. We quantify the spatial resolution profile as a function of depth using empirical and model-based approaches. The depth of field is then determined by thresholding the spatial resolution profile. In the model-based approach, the previously proposed sampling pattern cube (SPC) model for representation and evaluation of plenoptic capturing systems is utilized. For the experimental resolution measurements, camera evaluation results are extracted from images rendered by the Lytro full reconstruction rendering method. Results from both the empirical and model-based approaches assess the refocusing quality of the Lytro camera consistently, highlighting the usability of model-based approaches for performance analysis of complex capturing systems. (A sketch of the thresholding step follows this entry.)
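The abstract determines the depth of field by thresholding the spatial-resolution-versus-depth profile. A minimal sketch of that step, with made-up profile values (the paper derives the profile empirically and from the SPC model):

```python
import numpy as np

def depth_of_field(depths, resolution, threshold):
    """Return the depth range over which resolution stays above threshold.

    depths: monotonically increasing depth samples (e.g., in mm).
    resolution: measured or modeled spatial resolution at each depth.
    """
    ok = np.asarray(resolution) >= threshold
    if not ok.any():
        return None
    idx = np.flatnonzero(ok)
    return depths[idx[0]], depths[idx[-1]]

depths = np.linspace(100, 1000, 10)                    # hypothetical depth planes
profile = np.array([2, 5, 9, 14, 15, 13, 8, 4, 2, 1])  # made-up resolution values
print(depth_of_field(depths, profile, threshold=8))    # -> (300.0, 700.0)
```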
15.
  • Dima, Elijs, 1990- (author)
  • Augmented Telepresence based on Multi-Camera Systems : Capture, Transmission, Rendering, and User Experience
  • 2021
  • Doctoral thesis (other academic/artistic) abstract
    • Observation and understanding of the world through digital sensors is an ever-increasing part of modern life. Systems of multiple sensors acting together have far-reaching applications in automation, entertainment, surveillance, remote machine control, and robotic self-navigation. Recent developments in digital camera, range sensor and immersive display technologies enable the combination of augmented reality and telepresence into Augmented Telepresence, which promises to enable more effective and immersive forms of interaction with remote environments. The purpose of this work is to gain a more comprehensive understanding of how multi-sensor systems lead to Augmented Telepresence, and how Augmented Telepresence can be utilized for industry-related applications. On the one hand, the conducted research is focused on the technological aspects of multi-camera capture, rendering, and end-to-end systems that enable Augmented Telepresence. On the other hand, the research also considers the user experience aspects of Augmented Telepresence, to obtain a more comprehensive perspective on the application and design of Augmented Telepresence solutions. This work addresses multi-sensor system design for Augmented Telepresence regarding four specific aspects, ranging from sensor setup for effective capture to the rendering of outputs for Augmented Telepresence. More specifically, the following problems are investigated: 1) whether multi-camera calibration methods can reliably estimate the true camera parameters; 2) what the consequences are of synchronization errors in a multi-camera system; 3) how to design a scalable multi-camera system for low-latency, real-time applications; and 4) how to enable Augmented Telepresence from multi-sensor systems for mining, without prior data capture or conditioning. The first problem was solved by conducting a comparative assessment of widely available multi-camera calibration methods. A special dataset was recorded, enforcing known constraints on camera ground-truth parameters to use as a reference for calibration estimates. The second problem was addressed by introducing a depth uncertainty model that links the pinhole camera model and synchronization error to the geometric error in the 3D projections of recorded data. The third problem was addressed empirically, by constructing a multi-camera system based on off-the-shelf hardware and a modular software framework. The fourth problem was addressed by proposing a processing pipeline of an augmented remote operation system for augmented and novel view rendering. The calibration assessment revealed that target-based and certain target-less calibration methods are relatively similar in their estimations of the true camera parameters, with one specific exception. For high-accuracy scenarios, even commonly used target-based calibration approaches are not sufficiently accurate with respect to the ground truth. The proposed depth uncertainty model was used to show that converged multi-camera arrays are less sensitive to synchronization errors. The mean depth uncertainty of a camera system correlates to the rendered result in depth-based reprojection as long as the camera calibration matrices are accurate. The presented multi-camera system demonstrates a flexible, de-centralized framework where data processing is possible in the camera, in the cloud, and on the data consumer's side.
The multi-camera system is able to act as a capture testbed and as a component in end-to-end communication systems, because of the general-purpose computing and network connectivity support coupled with a segmented software framework. This system forms the foundation for the augmented remote operation system, which demonstrates the feasibility of real-time view generation by employing on-the-fly lidar de-noising and sparse depth upscaling for novel and augmented view synthesis. In addition to the aforementioned technical investigations, this work also addresses the user experience impacts of Augmented Telepresence. The following two questions were investigated: 1) What is the impact of camera-based viewing position in Augmented Telepresence? 2) What is the impact of depth-aiding augmentations in Augmented Telepresence? Both are addressed through a quality of experience study with non-expert participants, using a custom Augmented Telepresence test system for a task-based experiment. The experiment design combines in-view augmentation, camera view selection, and stereoscopic augmented scene presentation via a head-mounted display to investigate both the independent factors and their joint interaction. The results indicate that between the two factors, view position has a stronger influence on user experience. Task performance and quality of experience were significantly decreased by viewing positions that force users to rely on stereoscopic depth perception. However, position-assisting view augmentations can mitigate the negative effect of sub-optimal viewing positions; the extent of such mitigation is subject to the augmentation design and appearance. In aggregate, the works presented in this dissertation cover a broad view of Augmented Telepresence. The individual solutions contribute general insights into Augmented Telepresence system design, complement gaps in the current discourse of specific areas, and provide tools for solving challenges found in enabling the capture, processing, and rendering in real-time-oriented end-to-end systems.
16.
  • Dima, Elijs, et al. (author)
  • Estimation and Post-Capture Compensation of Synchronization Error in Unsynchronized Multi-Camera Systems
  • 2021
  • Report (other academic/artistic) abstract
    • Multi-camera systems are used in entertainment production, computer vision, industry and surveillance. The benefit of using multi-camera systems is the ability to recover the 3D structure, or depth, of the recorded scene. However, various types of cameras, including depth cameras, cannot be reliably synchronized during recording, which leads to errors in depth estimation and scene rendering. The aim of this work is to propose a method for compensating synchronization errors in already recorded sequences, without changing the format of the recorded sequences. We describe a depth uncertainty model for parametrizing the impact of synchronization errors in a multi-camera system, and propose a method for synchronization error estimation and compensation. The proposed method is based on interpolating an image at a desired point in time from adjacent non-synchronized images in a single camera's sequence, using an array of per-pixel distortion vectors. This array is generated by using the difference between adjacent images to locate and segment the recorded moving objects, and does not require any object texture or distinguishing features beyond the observed difference in adjacent images. The proposed compensation method is compared with optical-flow-based interpolation and sparse-correspondence-based morphing, and the proposed synchronization error estimation is compared with a state-of-the-art video alignment method. The proposed method shows better synchronization error estimation accuracy and compensation ability, especially in cases of low-texture, low-feature images. The effect of using data with synchronization errors is also demonstrated, as is the improvement gained by using compensated data. The compensation of synchronization errors is useful in scenarios where the recorded data is expected to be used by other processes that expect sub-frame synchronization accuracy, such as depth-image-based rendering. (A deliberately crude interpolation sketch follows this entry.)
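As a rough intuition for the compensation step, the sketch below detects the moving region from the difference of two adjacent frames and synthesizes an intermediate frame inside it. Note this is a deliberately crude stand-in: the report builds per-pixel distortion vectors and warps along them, whereas this toy version merely cross-fades within the detected moving region.

```python
import numpy as np

def interpolate_frame(prev, nxt, t, diff_threshold=10):
    """Very crude intermediate-frame synthesis at fraction t in (0, 1).

    Static pixels are copied from prev; pixels flagged as moving (by the
    absolute frame difference) are cross-faded. A per-pixel distortion
    field, as in the report, would warp these pixels instead.
    """
    prev_f = prev.astype(np.float64)
    nxt_f = nxt.astype(np.float64)
    moving = np.abs(nxt_f - prev_f) > diff_threshold  # segmented moving region
    out = prev_f.copy()
    out[moving] = (1.0 - t) * prev_f[moving] + t * nxt_f[moving]
    return out.astype(prev.dtype)

rng = np.random.default_rng(2)
a = rng.integers(0, 256, (48, 48), dtype=np.uint8)
b = a.copy(); b[10:20, 10:20] = 255                 # a "moving object"
mid = interpolate_frame(a, b, t=0.5)
print(mid[15, 15], a[15, 15])                       # blended vs original pixel
```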
17.
  • Dima, Elijs, et al. (author)
  • LIFE: A Flexible Testbed For Light Field Evaluation
  • 2018
  • Conference paper (peer-reviewed) abstract
    • Recording and imaging the 3D world has led to the use of light fields. Capturing, distributing and presenting light field data is challenging, and requires an evaluation platform. We define a framework for real-time processing, and present the design and implementation of a light field evaluation system. In order to serve as a testbed, the system is designed to be flexible, scalable, and able to model various end-to-end light field systems. This flexibility is achieved by encapsulating processes and devices in discrete framework systems. The modular capture system supports multiple camera types, general-purpose data processing, and streaming to network interfaces. The cloud system allows for parallel transcoding and distribution of streams. The presentation system encapsulates rendering and display specifics. The real-time ability was tested in a latency measurement; the capture and presentation systems process and stream frames within a 40 ms limit.
18.
  • Dima, Elijs, et al. (author)
  • Modeling Depth Uncertainty of Desynchronized Multi-Camera Systems
  • 2017
  • In: 2017 International Conference on 3D Immersion (IC3D). - : IEEE. - 9781538646557
  • Conference paper (peer-reviewed) abstract
    • Accurately recording motion from multiple perspectives is relevant for recording and processing immersive multimedia and virtual reality content. However, synchronization errors between multiple cameras limit the precision of scene depth reconstruction and rendering. In order to quantify this limit, a relation between camera desynchronization, camera parameters, and scene element motion has to be identified. In this paper, a parametric ray model describing depth uncertainty is derived and adapted for the pinhole camera model. A two-camera scenario is simulated to investigate the model behavior and how camera synchronization delay, scene element speed, and camera positions affect the system's depth uncertainty. Results reveal a linear relation between synchronization error, element speed, and depth uncertainty. View convergence is shown to affect mean depth uncertainty by up to a factor of 10. Results also show that depth uncertainty must be assessed on the full set of camera rays instead of a central subset. (A simplified numerical illustration follows this entry.)
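The reported linear relation can be reproduced in a simplified setting. The paper derives a general parametric ray model; the sketch below assumes plain parallel pinhole stereo with focal length f (pixels) and baseline B (meters), where depth is Z = fB/d for disparity d. An object moving laterally at speed v, observed with a synchronization delay dt in one camera, shifts that camera's image point by approximately f*v*dt/Z, and the induced depth error grows linearly with both v and dt. All parameter values here are assumptions for illustration.

```python
def depth_uncertainty(z, v, dt, f=1000.0, baseline=0.2):
    """Depth error (m) caused by a sync delay dt (s) for a point at depth z (m)
    moving laterally at v (m/s), in a parallel pinhole stereo pair."""
    d_true = f * baseline / z                  # true disparity (pixels)
    d_err = f * v * dt / z                     # extra disparity in the late camera
    z_est = f * baseline / (d_true - d_err)    # depth computed from wrong disparity
    return abs(z_est - z)

for dt in (0.001, 0.002, 0.004):               # doubling the delay ...
    print(dt, depth_uncertainty(z=5.0, v=2.0, dt=dt))
# ... approximately doubles the depth error, matching the linear relation.
```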
19.
  • Dima, Elijs (author)
  • Multi-Camera Light Field Capture : Synchronization, Calibration, Depth Uncertainty, and System Design
  • 2018
  • Licentiate thesis (other academic/artistic) abstract
    • The digital camera is the technological counterpart to the human eye, enabling the observation and recording of events in the natural world. Since modern life increasingly depends on digital systems, cameras and especially multiple-camera systems are being widely used in applications that affect our society, ranging from multimedia production and surveillance to self-driving robot localization. The rising interest in multi-camera systems is mirrored by the rising activity in Light Field research, where multi-camera systems are used to capture Light Fields - the angular and spatial information about light rays within a 3D space. The purpose of this work is to gain a more comprehensive understanding of how cameras collaborate and produce consistent data as a multi-camera system, and to build a multi-camera Light Field evaluation system. This work addresses three problems related to the process of multi-camera capture: first, whether multi-camera calibration methods can reliably estimate the true camera parameters; second, what are the consequences of synchronization errors in a multi-camera system; and third, how to ensure data consistency in a multi-camera system that records data with synchronization errors. Furthermore, this work addresses the problem of designing a flexible multi-camera system that can serve as a Light Field capture testbed. The first problem is solved by conducting a comparative assessment of widely available multi-camera calibration methods. A special dataset is recorded, giving known constraints on camera ground-truth parameters to use as reference for calibration estimates. The second problem is addressed by introducing a depth uncertainty model that links the pinhole camera model and synchronization error to the geometric error in the 3D projections of recorded data. The third problem is solved for the color-and-depth multi-camera scenario, by using a proposed estimation of the depth camera synchronization error and correction of the recorded depth maps via tensor-based interpolation. The problem of designing a Light Field capture testbed is addressed empirically, by constructing and presenting a multi-camera system based on off-the-shelf hardware and a modular software framework. The calibration assessment reveals that target-based and certain target-less calibration methods are relatively similar at estimating the true camera parameters. The results imply that for general-purpose multi-camera systems, target-less calibration is an acceptable choice. For high-accuracy scenarios, even commonly used target-based calibration approaches are insufficiently accurate. The proposed depth uncertainty model is used to show that converged multi-camera arrays are less sensitive to synchronization errors. The mean depth uncertainty of a camera system correlates to the rendered result in depth-based reprojection, as long as the camera calibration matrices are accurate. The proposed depth map synchronization method is used to produce a consistent, synchronized color-and-depth dataset for unsynchronized recordings without altering the depth map properties. Therefore, the method serves as a compatibility layer between unsynchronized multi-camera systems and applications that require synchronized color-and-depth data. Finally, the presented multi-camera system demonstrates a flexible, de-centralized framework where data processing is possible in the camera, in the cloud, and on the data consumer's side.
The multi-camera system is able to act as a Light Field capture testbed and as a component in Light Field communication systems, because of the general-purpose computing and network connectivity support for each sensor, small sensor size, flexible mounts, hardware and software synchronization, and a segmented software framework.
20.
  • Li, Yongwei, et al. (author)
  • An analysis of demosaicing for plenoptic capture based on ray optics
  • 2018
  • In: Proceedings of 3DTV Conference 2018. - 9781538661253
  • Conference paper (peer-reviewed) abstract
    • The plenoptic camera is gaining more and more attention as it captures the 4D light field of a scene with a single shot and enables a wide range of post-processing applications. However, the pre-processing steps for captured raw data, such as demosaicing, have been overlooked. Most existing decoding pipelines for plenoptic cameras still apply demosaicing schemes which were developed for conventional cameras. In this paper, we analyze the sampling pattern of microlens-based plenoptic cameras by ray-tracing techniques and ray phase space analysis. The goal of this work is to demonstrate guidelines and principles for demosaicing plenoptic captures by taking the unique microlens array design into account. We show that the sampling of the plenoptic camera behaves differently from that of a conventional camera and that the desired demosaicing scheme is depth-dependent.
21.
  • Li, Yun, et al. (author)
  • Coding of focused plenoptic contents by displacement intra prediction
  • 2016
  • In: IEEE Transactions on Circuits and Systems for Video Technology (Print). - 1051-8215 .- 1558-2205. ; 26:7, pp. 1308-1319
  • Journal article (peer-reviewed) abstract
    • A light field is commonly described by a two-plane representation with four dimensions. Refocused three-dimensional contents can be rendered from light field images. One method for capturing these images is to use cameras with microlens arrays. A dense sampling of the light field results in large amounts of redundant data, so efficient compression is vital for practical use of these data. In this paper, we propose a displacement intra prediction scheme with a maximum of two hypotheses for the compression of plenoptic contents from focused plenoptic cameras. The proposed scheme is further implemented in HEVC. The work aims at coding plenoptic captured contents efficiently without knowledge of the underlying camera geometries. In addition, a theoretical analysis of displacement intra prediction for plenoptic images is explained; the relationship between the compressed captured images and their rendered quality is also analyzed. Evaluation results show that plenoptic contents can be efficiently compressed by the proposed scheme. A bit rate reduction of up to 60 percent over HEVC is obtained for plenoptic images, and more than 30 percent is achieved for the tested video sequences. (A schematic sketch of two-hypothesis displacement prediction follows this entry.)
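The displacement intra prediction idea with a maximum of two hypotheses can be sketched as follows: a block is predicted from one or two previously decoded blocks displaced within the already-reconstructed area, and the bi-hypothesis candidate averages the two best matches. This is a schematic illustration under simplified assumptions (a crude causality test, SAD as the sole criterion), not the HEVC integration described in the paper.

```python
import numpy as np

def best_displacements(recon, bx, by, block, bs=8, radius=8, k=2):
    """Return the k best displacement candidates (dy, dx, sad), searching only
    blocks that lie fully above or fully to the left of the current block
    (a simplified stand-in for the decoded-region constraint)."""
    cands = []
    for dy in range(-radius, 1):
        for dx in range(-radius, radius + 1):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + bs > recon.shape[0] or x + bs > recon.shape[1]:
                continue
            if not (y + bs <= by or x + bs <= bx):
                continue  # not yet decoded in raster order (simplified test)
            ref = recon[y:y + bs, x:x + bs].astype(np.float64)
            cands.append((dy, dx, np.abs(ref - block).sum()))
    return sorted(cands, key=lambda c: c[2])[:k]

def predict_block(recon, bx, by, block, bs=8):
    """Uni- vs bi-hypothesis prediction: use the average of the two best
    matches if it beats the single best match."""
    (dy1, dx1, _), (dy2, dx2, _) = best_displacements(recon, bx, by, block, bs)
    p1 = recon[by + dy1:by + dy1 + bs, bx + dx1:bx + dx1 + bs]
    p2 = recon[by + dy2:by + dy2 + bs, bx + dx2:bx + dx2 + bs]
    bi = 0.5 * (p1 + p2)
    return bi if np.abs(bi - block).sum() < np.abs(p1 - block).sum() else p1

rng = np.random.default_rng(3)
img = rng.random((32, 32))
img[16:24, 16:24] = img[8:16, 8:16]            # repeat a microlens-like patch
pred = predict_block(img, 16, 16, img[16:24, 16:24])
print(np.abs(pred - img[16:24, 16:24]).sum())  # near-zero prediction error
```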
22.
  • Li, Yun, et al. (author)
  • Coding of plenoptic images by using a sparse set and disparities
  • 2015
  • In: Proceedings - IEEE International Conference on Multimedia and Expo. - : IEEE conference proceedings. - 9781479970827 ; Art. no. 7177510
  • Conference paper (peer-reviewed) abstract
    • A focused plenoptic camera captures not only the spatial information of a scene but also the angular information. The capture results in a plenoptic image consisting of multiple microlens images, with a large resolution. In addition, the microlens images are similar to their neighbors. Therefore, an efficient compression method that utilizes this pattern of similarity can reduce the coding bit rate and further facilitate the usage of the images. In this paper, we propose an approach for coding focused plenoptic images by using a representation which consists of a sparse plenoptic image set and disparities. Based on this representation, a reconstruction method using interpolation and inpainting is devised to reconstruct the original plenoptic image. As a consequence, instead of coding the original image directly, we encode the sparse image set plus the disparity maps and use the reconstructed image as a prediction reference to encode the original image. The results show that the proposed scheme performs better than HEVC intra, with more than 5 dB PSNR gain or over 60 percent bit rate reduction.
23.
  • Li, Yun, et al. (author)
  • Compression of Unfocused Plenoptic Images using a Displacement Intra prediction
  • 2016
  • In: 2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016. - : IEEE Signal Processing Society. - 9781509015528
  • Conference paper (peer-reviewed) abstract
    • Plenoptic images are one type of light field content, produced by using a combination of a conventional camera and an additional optical component in the form of a microlens array, positioned in front of the image sensor surface. This camera setup can capture a sub-sampling of the light field with high spatial fidelity over a small range, and with a more coarsely sampled angular range. The earliest applications that leverage plenoptic image content are image refocusing, non-linear distribution of out-of-focus areas, SNR vs. resolution trade-offs, and 3D image creation, all provided by using post-processing methods. In this work, we evaluate a compression method that we previously proposed for a different type of plenoptic image (focused, or plenoptic camera 2.0, contents) than the unfocused, or plenoptic camera 1.0, contents used in this Grand Challenge. The method is an extension of the state-of-the-art video compression standard HEVC, in which we have brought the capability of bi-directional inter-frame prediction into the spatial prediction. The method is evaluated according to the scheme set out by the Grand Challenge, and the results show a high compression efficiency compared with JPEG, i.e., up to 6 dB improvement for the tested images.
24.
  • Li, Yongwei, 1990- (author)
  • Computational Light Field Photography : Depth Estimation, Demosaicing, and Super-Resolution
  • 2020
  • Doctoral thesis (other academic/artistic) abstract
    • The transition of camera technology from film-based cameras to digital cameras has been witnessed in the past twenty years, along with impressive technological advances in processing massively digitized media content. Today, a new evolution has emerged -- the migration from 2D content to immersive perception. This rising trend has a profound and long-term impact on our society, fostering technologies such as teleconferencing and remote surgery. The trend is also reflected in the scientific research community, where more attention has been drawn to the light field and its applications. The purpose of this dissertation is to develop a better understanding of the light field structure by analyzing its sampling behavior, and to address three problems concerning the light field processing pipeline: 1) How to address the depth estimation problem when there is limited color and texture information. 2) How to improve the rendered image quality by using the inherent depth information. 3) How to solve the interdependence conflict of demosaicing and depth estimation. The first problem is solved by a hybrid depth estimation approach that combines advantages of correspondence matching and depth-from-focus, where occlusion is handled by involving multiple depth maps in a voting scheme. The second problem is divided into two specific tasks -- demosaicing and super-resolution -- where depth-assisted light field analysis is employed to surpass the competence of traditional image processing. The third problem is tackled with an inferential graph model that encodes the connections between demosaicing and depth estimation explicitly and jointly performs a global optimization for both tasks. The proposed depth estimation approach shows a noticeable improvement in point clouds and depth maps, compared with reference methods. Furthermore, the objective metrics and visual quality are compared with classical sensor-based demosaicing and multi-image super-resolution to show the effectiveness of the proposed depth-assisted light field processing methods. Finally, a multi-task graph model is proposed to challenge the performance of the sequential light field image processing pipeline. The proposed method is validated with various kinds of light fields, and outperforms the state-of-the-art in both demosaicing and depth estimation tasks. The works presented in this dissertation raise a novel view of the light field data structure in general, and provide tools to solve image processing problems in particular. The impact of the outcome can be manifold: to support scientific research with light field microscopes, to stabilize the performance of range cameras for industrial applications, and to provide individuals with a high-quality immersive experience.
25.
  • Li, Yun, et al. (author)
  • Depth Map Compression with Diffusion Modes in 3D-HEVC
  • 2013
  • In: MMEDIA 2013 - 5th International Conference on Advances in Multimedia. - : International Academy, Research and Industry Association (IARIA). - 9781612082653 ; pp. 125-129
  • Conference paper (peer-reviewed) abstract
    • For three-dimensional television, multiple views can be generated by using the Multi-view Video plus Depth (MVD) format. The depth maps of this format can be compressed efficiently by the 3D extension of High Efficiency Video Coding (3D-HEVC), which exploits the correlations between its two components: the texture and the associated depth map. In this paper, we introduce two diffusion-based modes for depth map coding into HEVC. The framework for inter-component prediction of the Depth Modeling Modes (DMM) is utilized for the proposed modes. They detect edges from the textures and then diffuse an entire block from known adjacent blocks by using the Laplace equation constrained by the detected edges. The experimental results show that depth maps can be compressed more efficiently with the proposed diffusion modes, where the bit rate saving can reach 1.25 percent of the total depth bit rate at a constant quality of synthesized views. (A toy diffusion sketch follows this entry.)
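The diffusion described above, filling a block by solving the Laplace equation from known neighbors while respecting detected edges, can be illustrated with a few Jacobi iterations. A minimal sketch with made-up boundary values (illustrative only; the actual modes operate inside the 3D-HEVC DMM framework):

```python
import numpy as np

def diffuse_block(block, known, edges, iters=500):
    """Fill unknown pixels of `block` by Laplace diffusion (Jacobi iterations).

    known: bool mask of pixels fixed as boundary conditions (e.g., pixels from
           already-decoded adjacent blocks).
    edges: bool mask of edge pixels (detected in the texture) across which
           diffusion is suppressed by simply keeping them fixed as well.
    """
    f = block.astype(np.float64).copy()
    fixed = known | edges
    for _ in range(iters):
        # 4-neighbour average, computed with edge-replicating padding
        p = np.pad(f, 1, mode="edge")
        avg = 0.25 * (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:])
        f[~fixed] = avg[~fixed]
    return f

block = np.zeros((8, 8))
known = np.zeros((8, 8), bool)
known[0, :] = known[:, 0] = True         # top row and left column are decoded
block[0, :] = 100.0; block[:, 0] = 50.0  # hypothetical boundary depth values
edges = np.zeros((8, 8), bool)           # no edge constraint in this toy case
print(diffuse_block(block, known, edges).round(1))
```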
26.
  • Li, Yun, et al. (author)
  • Scalable coding of plenoptic images by using a sparse set and disparities
  • 2016
  • In: IEEE Transactions on Image Processing. - 1057-7149 .- 1941-0042. ; 25:1, pp. 80-91
  • Journal article (peer-reviewed) abstract
    • One of the light field capturing techniques is focused plenoptic capturing. By placing a microlens array in front of the photosensor, focused plenoptic cameras capture both spatial and angular information of a scene within each microlens image and across microlens images. The capture results in a significant amount of redundant information, and the captured image usually has a large resolution. A coding scheme that removes the redundancy before coding can be advantageous for efficient compression, transmission and rendering. In this paper, we propose a lossy coding scheme to efficiently represent plenoptic images. The format contains a sparse image set and its associated disparities. The reconstruction is performed by disparity-based interpolation and inpainting, and the reconstructed image is later employed as a prediction reference for the coding of the full plenoptic image. As an outcome of the representation, the proposed scheme inherits a scalable structure with three layers. The results show that plenoptic images are compressed efficiently, with over 60 percent bit rate reduction compared to HEVC intra, and with over 20 percent compared to the HEVC block copying mode.
27.
  • Muddala, Suryanarayana Murthy, et al. (author)
  • Depth-Included Curvature Inpainting for Disocclusion Filling in View Synthesis
  • 2013
  • In: International Journal on Advances in Telecommunications. - : International Academy, Research and Industry Association (IARIA). - 1942-2601. ; 6:3&4, pp. 132-142
  • Journal article (peer-reviewed) abstract
    • Depth-image-based rendering (DIBR) is commonly used for generating additional views for 3DTV and FTV using 3D video formats such as video plus depth (V+D) and multi-view video plus depth (MVD). When DIBR is used, the synthesized views suffer from artifacts, mainly from disocclusions. Depth-based inpainting methods can solve these problems plausibly. In this paper, we analyze the influence of the depth information at the various steps of the depth-included curvature inpainting method. The depth-based inpainting method relies on the depth information at every step of the inpainting process: boundary extraction for the missing areas, data term computation for structure propagation, and patch matching to find the best-matching data. The importance of depth at each step is evaluated using objective metrics and visual comparison. Our evaluation demonstrates that the depth information in each step plays a key role. Moreover, to what degree depth can be used in each step of the inpainting process depends on the depth distribution.
28.
  • Muddala, Suryanarayana M., et al. (author)
  • Virtual View Synthesis Using Layered Depth Image Generation and Depth-Based Inpainting for Filling Disocclusions and Translucent Disocclusions
  • 2016
  • In: Journal of Visual Communication and Image Representation. - : Elsevier BV. - 1047-3203 .- 1095-9076. ; 38, pp. 351-366
  • Journal article (peer-reviewed) abstract
    • View synthesis is an efficient solution for producing content for 3DTV and FTV. However, proper handling of the disocclusions is a major challenge in view synthesis. Inpainting methods offer solutions for handling disocclusions, though limitations in foreground-background classification cause the holes to be filled with inconsistent textures. Moreover, the state-of-the-art methods fail to identify and fill disocclusions at intermediate distances between foreground and background, through which the background may be visible in the virtual view (translucent disocclusions). Aiming at improved rendering quality, we introduce a layered depth image (LDI) in the original camera view, in which we identify and fill the occluded background so that, when the LDI data is rendered to a virtual view, no disocclusions appear and views with consistent data are produced, also handling translucent disocclusions. Moreover, the proposed foreground-background classification and inpainting fill the disocclusions consistently with neighboring background texture. Based on the objective and subjective evaluations, the proposed method outperforms the state-of-the-art methods at the disocclusions. (A sketch of the LDI per-pixel structure follows this entry.)
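A layered depth image stores, per pixel, a list of (color, depth) layers rather than a single sample, which is what lets the occluded background be filled once and reused for any virtual view. A minimal sketch of the data structure (illustrative only; the paper's LDI generation uses depth-based foreground/background classification and inpainting):

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class LDIPixel:
    # Each layer is (color, depth); front-most layer first.
    layers: List[Tuple[float, float]] = field(default_factory=list)

    def insert(self, color: float, depth: float) -> None:
        """Keep layers sorted by depth so layer 0 is always the visible one."""
        self.layers.append((color, depth))
        self.layers.sort(key=lambda layer: layer[1])

    def visible(self) -> float:
        return self.layers[0][0]

    def behind(self) -> List[Tuple[float, float]]:
        """Occluded background layers, available for disocclusion filling."""
        return self.layers[1:]

px = LDIPixel()
px.insert(color=0.9, depth=10.0)   # background (inpainted once, offline)
px.insert(color=0.2, depth=2.0)    # foreground object
print(px.visible(), px.behind())   # 0.2 [(0.9, 10.0)]
```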
29.
  • Muddala, Suryanarayana, et al. (author)
  • Spatio-Temporal Consistent Depth-Image Based Rendering Using Layered Depth Image and Inpainting
  • 2016
  • In: EURASIP Journal on Image and Video Processing. - Springer : Springer Science and Business Media LLC. - 1687-5176 .- 1687-5281. ; 9:1, pp. 1-19
  • Journal article (peer-reviewed) abstract
    • Depth-image-based rendering (DIBR) is a commonly used method for synthesizing additional views using the video-plus-depth (V+D) format. A critical issue with DIBR-based view synthesis is the lack of information behind foreground objects. This lack is manifested as disocclusions, i.e., holes next to the foreground objects in rendered virtual views, as a consequence of the virtual camera "seeing" behind the foreground object. The disocclusions are larger in the extrapolation case, i.e. the single-camera case. Texture synthesis methods (inpainting methods) aim to fill these disocclusions by producing plausible texture content. However, virtual views inevitably exhibit both spatial and temporal inconsistencies in the filled disocclusion areas, depending on the scene content. In this paper we propose a layered depth image (LDI) approach that improves the spatio-temporal consistency. In the process of LDI generation, depth information is used to classify the foreground and background in order to form a static scene sprite from a set of neighboring frames. Occlusions in the LDI are then identified and filled using inpainting, such that no disocclusions appear when the LDI data is rendered to a virtual view. In addition to the depth information, optical flow is computed to extract the stationary parts of the scene and to classify the occlusions in the inpainting process. Experimental results demonstrate that spatio-temporal inconsistencies are significantly reduced using the proposed method. Furthermore, subjective and objective qualities are improved compared to state-of-the-art reference methods.
30.
  • Navarro-Fructuoso, Hector, et al. (författare)
  • Extended depth-of-field in integral imaging by depth-dependent deconvolution
  • 2013
  • Konferensbidrag (refereegranskat)abstract
    • Integral Imaging is a technique to obtain true color 3D images that can provide full and continuous motion parallax for several viewers. The depth of field of these systems is mainly limited by the numerical aperture of each lenslet of the microlens array. A digital method has been developed to increase the depth of field of Integral Imaging systems in the reconstruction stage. By means of the disparity map of each elemental image, it is possible to classify the objects of the scene according to their distance from the microlenses and apply a selective deconvolution for each depth of the scene. Topographical reconstructions with enhanced depth of field of a 3D scene are presented to support our proposal.
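    • The per-depth classification step can be sketched as simple disparity binning; the bin count and the uniform bin edges are assumptions for illustration only:

          import numpy as np

          def depth_layers(disparity, n_layers):
              """Classify pixels into depth layers from a disparity map.

              Returns one boolean mask per layer; each layer can then be processed
              (e.g. deconvolved) with a PSF matched to its distance from the lenslets.
              """
              edges = np.linspace(disparity.min(), disparity.max(), n_layers + 1)
              edges[-1] += 1e-6                       # include the maximum in the last bin
              return [(disparity >= lo) & (disparity < hi)
                      for lo, hi in zip(edges[:-1], edges[1:])]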
  •  
31.
  • Navarro, Hector, et al. (författare)
  • Depth-of-field enhancement in integral imaging by selective depth-deconvolution
  • 2014
  • Ingår i: IEEE/OSA Journal of Display Technology. - : IEEE/OSA. - 1551-319X .- 1558-9323. ; 10:3, s. 182-188
  • Tidskriftsartikel (refereegranskat)abstract
    • One of the major drawbacks of the integral imaging technique is its limited depth of field. Such limitation is imposed by the numerical aperture of the microlenses. In this paper, we propose a method to extend the depth of field of integral imaging systems in the reconstruction stage. The method is based on the combination of deconvolution tools and depth filtering of each elemental image using disparity map information. We demonstrate our proposal by presenting digital reconstructions of a 3D scene focused at different depths with extended depth of field.
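    • Combining such depth layers with non-blind deconvolution might look as follows; the use of skimage's Wiener filter and the per-layer PSFs are an illustrative stand-in for the paper's deconvolution tools, with each PSF in practice derived from the lenslet optics at that depth:

          import numpy as np
          from skimage.restoration import wiener

          def selective_deconvolution(image, layer_masks, psfs, balance=0.1):
              """Deconvolve each depth layer with its own defocus PSF, then composite.

              layer_masks: boolean masks from a disparity-based depth classification.
              psfs: one normalized PSF per layer, modelling defocus at that depth.
              """
              result = np.zeros_like(image, dtype=float)
              for mask, psf in zip(layer_masks, psfs):
                  restored = wiener(image, psf, balance)   # non-blind deconvolution per depth
                  result[mask] = restored[mask]            # keep only this layer's pixels
              return result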
  •  
32.
  • Olsson, Roger, 1973-, et al. (författare)
  • A modular cross-platform GPU-based approach for flexible 3D video playback
  • 2011
  • Ingår i: Proceedings of SPIE - The International Society for Optical Engineering. - : SPIE - International Society for Optical Engineering. - 9780819484000 ; , s. Art. no. 78631E-
  • Konferensbidrag (refereegranskat)abstract
    • Different compression formats for stereo- and multiview-based 3D video are being standardized, and software players capable of decoding and presenting these formats on different display types are a vital part of the commercialization and evolution of 3D video. However, the number of publicly available software video players capable of decoding and playing multiview 3D video is still quite limited. This paper describes the design and implementation of a GPU-based real-time 3D video playback solution, built on top of cross-platform, open source libraries for video decoding and hardware-accelerated graphics. A software architecture is presented that efficiently processes and presents high-definition 3D video in real time and in a flexible manner supports both current 3D video formats and emerging standards. Moreover, a set of bottlenecks in the processing of 3D video content in a GPU-based real-time 3D video playback solution is identified and discussed.
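    • The modular idea — decoding, format conversion, and display mapping as swappable stages — can be sketched in a few lines; the stage names and the Frame type are hypothetical, and all GPU specifics are abstracted away:

          from dataclasses import dataclass
          from typing import Callable, List
          import numpy as np

          @dataclass
          class Frame:
              views: List[np.ndarray]   # one image per decoded view
              pts: float                # presentation timestamp, seconds

          class Playback3D:
              """Toy modular pipeline: decode -> format conversion -> display mapping.

              Each stage is a plug-in callable, so a new 3D format or display type
              is supported by swapping a single stage rather than the whole player.
              """
              def __init__(self, decode: Callable[[], Frame],
                           convert: Callable[[Frame], Frame],
                           present: Callable[[Frame], None]):
                  self.stages = (decode, convert, present)

              def run_once(self):
                  decode, convert, present = self.stages
                  present(convert(decode()))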
  •  
33.
  • Olsson, Roger, 1973-, et al. (författare)
  • Converting conventional stereo pairs to multi-view sequences using morphing
  • 2012
  • Ingår i: Proceedings of SPIE - The International Society for Optical Engineering. - Burlingame, CA : SPIE - International Society for Optical Engineering. - 9780819489357 ; , s. Art. no. 828828-
  • Konferensbidrag (refereegranskat)abstract
    • Autostereoscopic multi-view displays require multiple views of a scene to provide motion parallax. When an observer changes viewing angle, different stereoscopic pairs are perceived. This allows new perspectives of the scene to be seen, giving a more realistic 3D experience. However, capturing an arbitrary number of views is at best cumbersome and on some occasions impossible. Conventional stereo video (CSV) operates on two video signals captured using two cameras at two different perspectives. Generation and transmission of two views is more feasible than that of multiple views. It would therefore be more efficient if the multiple views required by an autostereoscopic display could be synthesized from this sparse set of views. This paper addresses the conversion of stereoscopic video to multiview video using the video effect known as morphing. Different morphing algorithms are implemented and evaluated. Contrary to traditional conversion methods, these algorithms disregard explicit physical depth and instead generate intermediate views using sparse sets of correspondence features and image morphing. A novel morphing algorithm is also presented that uses the scale invariant feature transform (SIFT) and segmentation to construct robust correspondence features and high-quality intermediate views. All algorithms are evaluated on a subjective and objective basis, and the comparison results are presented.
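    • A compact sketch of correspondence-based morphing between a stereo pair, assuming matched keypoints (e.g. from SIFT) are already available; the piecewise-affine warp is one reasonable choice, not necessarily the paper's:

          import numpy as np
          from skimage.transform import PiecewiseAffineTransform, warp

          def morph_intermediate(left, right, pts_left, pts_right, alpha):
              """Synthesize an intermediate view at position alpha in [0, 1].

              pts_left/pts_right: matched (x, y) correspondences as (N, 2) arrays.
              Both inputs are warped toward the interpolated point set and blended.
              """
              pts_mid = (1.0 - alpha) * pts_left + alpha * pts_right
              tf_l = PiecewiseAffineTransform()
              tf_l.estimate(pts_mid, pts_left)          # map mid-view coords -> left image
              tf_r = PiecewiseAffineTransform()
              tf_r.estimate(pts_mid, pts_right)         # map mid-view coords -> right image
              warped_l = warp(left, tf_l)
              warped_r = warp(right, tf_r)
              return (1.0 - alpha) * warped_l + alpha * warped_r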
  •  
34.
  • Olsson, Roger, 1973-, et al. (författare)
  • Multiview image coding scheme transformations: artifact characteristics and effects on perceived 3D quality
  • 2010
  • Ingår i: Stereoscopic Displays and Applications XXI 2010. - : SPIE - International Society for Optical Engineering. - 9780819479174
  • Konferensbidrag (refereegranskat)abstract
    • Compression schemes for 3D images and video gain much of their efficiency from transformations that convert the signal into forms suitable for quantization. The achieved compression efficiency is principally determined by rate-distortion analysis using objective quality evaluation metrics. In 2D, quality evaluation metrics operating in the pixel domain implicitly assume an ideal display modelled as a unity transformation. Similar simplifications are not feasible in 3D analysis, and different coding schemes introduce significantly different compression artefacts even when operating at the same rate-distortion ratio. In this paper we have performed a subjective assessment of the quality of compressed 3D images presented on an autostereoscopic display. In the qualitative part of the assessment, different properties of the induced coding artefacts were identified with respect to image depth, pixelation, and zero-parallax distortion. The quantitative part was conducted using a group of non-expert observers that assessed the 3D quality. In the results we show how the compression schemes introduce specific groups of artefacts manifesting with significantly different characteristics. In addition, each characteristic is derived from the transformation domains, and the relationships between coding scheme and distortion property are presented. Moreover, the characteristics are related to the image quality assessment produced by the observation group.
  •  
35.
  • Olsson, Roger, 1973- (författare)
  • Synthesis, Coding, and Evaluation of 3D Images Based on Integral Imaging
  • 2008
  • Doktorsavhandling (övrigt vetenskapligt/konstnärligt)abstract
    • In recent years, camera prototypes based on Integral Imaging (II) have emerged that are capable of capturing three-dimensional (3D) images. When viewed on a 3D display, these II-pictures convey depth and content that realistically change perspective as the viewer changes viewing position. The dissertation concentrates on three factors restraining II-picture progress. Firstly, there is a lack of digital II-pictures available for, inter alia, comparative research and coding scheme development. Secondly, there is an absence of objective quality metrics that explicitly measure distortion with respect to the II-picture properties: depth and view-angle dependency. Thirdly, low coding efficiencies are achieved when existing image coding standards are applied to II-pictures. A computer synthesis method has been developed which enables the production of different II-picture types. An II-camera model forms a basis and is combined with a scene description language that allows for the description of arbitrarily complex virtual scenes. The light transport within the scene and into the II-camera is simulated using ray-tracing and geometrical optics. A number of II-camera models, scene descriptions, and II-pictures are produced using the presented method. Two quality evaluation metrics have been constructed to objectively quantify the distortion contained in an II-picture with respect to its specific properties. The first metric models how the distortion is perceived by a viewer watching an II-display from different viewing angles. The second metric estimates the depth distribution of the distortion. New aspects of coding-induced artifacts within the II-picture are revealed using the proposed metrics. Finally, a coding scheme for II-pictures has been developed that, inter alia, utilizes the video coding standard H.264/AVC by first transforming the II-picture into a pseudo video sequence. The properties of the coding scheme have been studied in detail and compared with other coding schemes using the proposed evaluation metrics. The proposed coding scheme achieves the same quality as JPEG2000 at approximately 1/60th of the storage or distribution requirements.
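    • The pseudo-video transform at the heart of such a coding scheme can be sketched as follows, assuming a regular lens grid and a raster scan order (the actual traversal order is a design choice, not taken from the dissertation):

          import numpy as np

          def integral_to_pseudo_video(ii_picture, lens_h, lens_w):
              """Rearrange an integral image into a pseudo video sequence.

              The II-picture is assumed to be an exact grid of lens_h x lens_w
              elemental images; each elemental image becomes one "frame", so an
              ordinary video codec (e.g. H.264/AVC) can exploit inter-view redundancy.
              """
              H, W = ii_picture.shape[:2]
              eh, ew = H // lens_h, W // lens_w          # elemental image size
              frames = []
              for i in range(lens_h):                    # raster-scan order over the lens grid
                  for j in range(lens_w):
                      frames.append(ii_picture[i * eh:(i + 1) * eh,
                                               j * ew:(j + 1) * ew])
              return np.stack(frames)                    # (lens_h * lens_w, eh, ew, ...)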
  •  
36.
  • Paudyal, Pradip, et al. (författare)
  • SMART: a Light Field image quality dataset
  • 2016
  • Ingår i: Proceedings of the 7th International Conference on Multimedia Systems, MMSys 2016. - New York, NY, USA : Association for Computing Machinery (ACM). ; , s. 374-379
  • Konferensbidrag (refereegranskat)abstract
    • In this article, the design of a Light Field image dataset is presented. The availability of an image dataset is useful for designing, testing, and benchmarking Light Field image processing algorithms. As a first step, the image content selection criteria have been defined based on selected image quality key-attributes, i.e. spatial information, colorfulness, texture key features, depth of field, etc. Next, image scenes have been selected and captured by using the Lytro Illum Light Field camera. The performed analysis shows that the considered set of images is sufficient for addressing a wide range of attributes relevant to assess Light Field image quality.
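    • Two of the named key-attributes are standard, easily computed quantities; a minimal sketch using their usual definitions (ITU-T P.910 spatial information and the Hasler-Süsstrunk colorfulness metric):

          import numpy as np
          from scipy import ndimage

          def spatial_information(gray):
              """ITU-T P.910 spatial information: std-dev of the Sobel gradient magnitude."""
              gx = ndimage.sobel(gray.astype(float), axis=1)
              gy = ndimage.sobel(gray.astype(float), axis=0)
              return np.sqrt(gx ** 2 + gy ** 2).std()

          def colorfulness(rgb):
              """Hasler-Suesstrunk colorfulness metric on an RGB image in [0, 255]."""
              r, g, b = (rgb[..., c].astype(float) for c in range(3))
              rg, yb = r - g, 0.5 * (r + g) - b
              return (np.hypot(rg.std(), yb.std())
                      + 0.3 * np.hypot(rg.mean(), yb.mean()))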
  •  
37.
  • Paudyal, Pradip, et al. (författare)
  • Towards the Perceptual Quality Evaluation of Compressed Light Field Images
  • 2017
  • Ingår i: IEEE transactions on broadcasting. - 0018-9316 .- 1557-9611. ; 63:3, s. 507-522
  • Tidskriftsartikel (refereegranskat)abstract
    • Evaluation of the perceived quality of light field images, as well as testing new processing tools, or even assessing the effectiveness of objective quality metrics, relies on the availability of test datasets and corresponding quality ratings. This article presents the SMART light field image quality dataset. The dataset consists of source images (raw data without optical corrections), compressed images, and annotated subjective quality scores. Furthermore, an analysis of the perceptual effects of compression on the SMART dataset is presented. Next, the impact of image content on the perceived quality is studied with the help of image quality attributes. Finally, the performances of 2D image quality metrics when applied to light field images are analyzed.
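    • Metric performance against subjective scores is conventionally reported as correlation and error indices; a minimal sketch (function and variable names illustrative):

          import numpy as np
          from scipy import stats

          def metric_performance(objective_scores, mos):
              """Standard agreement indices between an objective metric and MOS.

              PLCC measures prediction accuracy, SROCC measures monotonicity,
              RMSE measures the absolute error of the raw scores.
              """
              objective_scores = np.asarray(objective_scores, float)
              mos = np.asarray(mos, float)
              plcc, _ = stats.pearsonr(objective_scores, mos)
              srocc, _ = stats.spearmanr(objective_scores, mos)
              rmse = np.sqrt(np.mean((objective_scores - mos) ** 2))
              return plcc, srocc, rmse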
  •  
38.
  • Schwarz, Sebastian, 1980-, et al. (författare)
  • Depth Map Upscaling Through Edge Weighted Optimization
  • 2012
  • Ingår i: Proceedings of SPIE - The International Society for Optical Engineering. - : SPIE - International Society for Optical Engineering. - 9780819489371 ; , s. Art. no. 829008-
  • Konferensbidrag (refereegranskat)abstract
    • Accurate depth maps are a prerequisite in three-dimensional television, e.g. for high-quality view synthesis, but this information is not always easily obtained. Depth information gained by correspondence matching from two or more views suffers from disocclusions and low-textured regions, leading to erroneous depth maps. These errors can be avoided by using depth from dedicated range sensors, e.g. time-of-flight sensors. Because these sensors only have restricted resolution, the resulting depth data need to be adjusted to the resolution of the corresponding texture frame. Standard upscaling methods provide only limited quality results. This paper proposes a solution for upscaling low-resolution depth data to match high-resolution texture data. We introduce the Edge Weighted Optimization Concept (EWOC) for fusing low-resolution depth maps with corresponding high-resolution video frames by solving an overdetermined linear equation system. Similar to other approaches, we take information from the high-resolution texture, but we additionally validate this information against the low-resolution depth to accentuate correlated data. Objective tests show an improvement in depth map quality in comparison to other upscaling approaches. This improvement is subjectively confirmed in the resulting view synthesis.
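    • The overdetermined-system formulation can be sketched as sparse least squares; this is a simplified reading of the general concept (horizontal smoothness only, a plain equality data term), not EWOC itself:

          import numpy as np
          from scipy import sparse
          from scipy.sparse.linalg import lsqr

          def upscale_depth(depth_lr, edge_weight, scale):
              """Edge-weighted least-squares depth upscaling (concept sketch).

              The upscaled depth must agree with the sparse low-res samples, and
              neighboring pixels must be similar except where edge_weight (taken
              from the high-res texture) is small, i.e. across texture edges.
              """
              H, W = edge_weight.shape
              n = H * W
              idx = lambda y, x: y * W + x
              rows, cols, vals, rhs = [], [], [], []
              eq = 0
              # Data term: pin pixels that have a low-resolution depth sample.
              for y in range(depth_lr.shape[0]):
                  for x in range(depth_lr.shape[1]):
                      rows.append(eq); cols.append(idx(y * scale, x * scale))
                      vals.append(1.0); rhs.append(float(depth_lr[y, x])); eq += 1
              # Smoothness term, horizontal neighbors only to keep the sketch short.
              for y in range(H):
                  for x in range(W - 1):
                      w = float(edge_weight[y, x])
                      rows += [eq, eq]; cols += [idx(y, x), idx(y, x + 1)]
                      vals += [w, -w]; rhs.append(0.0); eq += 1
              A = sparse.csr_matrix((vals, (rows, cols)), shape=(eq, n))
              return lsqr(A, np.asarray(rhs))[0].reshape(H, W)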
  •  
39.
  • Schwarz, Sebastian, 1980-, et al. (författare)
  • Depth or disparity map upscaling
  • 2016
  • Patent (populärvet., debatt m.m.)abstract
    • Method and arrangement for increasing the resolution of a depth or disparity map related to multi-view video. The method comprises deriving a high-resolution depth map based on a low-resolution depth map and a masked texture image edge map. The masked texture image edge map comprises information on edges in a high-resolution texture image, which edges have a correspondence in the low-resolution depth map. The texture image and the depth map are associated with the same frame.
  •  
40.
  • Schwarz, Sebastian, et al. (författare)
  • Multivariate Sensitivity Analysis of Time-of-Flight Sensor Fusion
  • 2014
  • Ingår i: 3D Research. - : Springer Publishing Company. - 2092-6731. ; 5:3
  • Tidskriftsartikel (refereegranskat)abstract
    • Obtaining three-dimensional scenery data is an essential task in computer vision, with diverse applications in areas such as manufacturing and quality control, security and surveillance, or user interaction and entertainment. Dedicated Time-of-Flight sensors can provide detailed scenery depth in real time and overcome shortcomings of traditional stereo analysis. Nonetheless, they do not provide texture information and have limited spatial resolution. Therefore, such sensors are typically combined with high-resolution video sensors. Time-of-Flight sensor fusion is a highly active field of research, and over recent years there have been multiple proposals addressing important topics such as texture-guided depth upsampling and depth data denoising. In this article we take a step back and look at the underlying principles of ToF sensor fusion. We derive the ToF sensor fusion error model and evaluate its sensitivity to inaccuracies in camera calibration and depth measurements. In accordance with our findings, we propose certain courses of action to ensure high-quality fusion results. With this multivariate sensitivity analysis of the ToF sensor fusion model, we provide an important guideline for designing, calibrating and running sophisticated Time-of-Flight sensor fusion capture systems.
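    • The fusion step whose sensitivity is analyzed is essentially a reprojection from the ToF camera into the video camera; a sketch in standard pinhole notation (all variable names assumed, not taken from the article):

          import numpy as np

          def tof_to_rgb(depth, K_tof, K_rgb, R, t):
              """Map ToF depth pixels into the RGB camera image (core of sensor fusion).

              K_tof, K_rgb: 3x3 intrinsics; R, t: rotation/translation from the ToF
              to the RGB camera frame. Errors in any of these propagate directly
              into the fused result, which is what a sensitivity analysis quantifies.
              """
              h, w = depth.shape
              ys, xs = np.mgrid[0:h, 0:w]
              pix = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])   # homogeneous pixels
              rays = np.linalg.inv(K_tof) @ pix                          # back-project to rays
              pts = rays * depth.ravel()                                 # 3D points, ToF frame
              pts_rgb = R @ pts + t.reshape(3, 1)                        # into the RGB frame
              proj = K_rgb @ pts_rgb
              uv = proj[:2] / proj[2]                                    # perspective divide
              return uv.reshape(2, h, w), pts_rgb[2].reshape(h, w)       # pixel coords + depth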
  •  
41.
  • Schwarz, Sebastian, et al. (författare)
  • Time-of-Flight Sensor Fusion with Depth Measurement Reliability Weighting
  • 2014
  • Ingår i: 3DTV-Conference. - : IEEE Computer Society. - 9781479947584 ; , s. Art. no. 6874759-
  • Konferensbidrag (refereegranskat)abstract
    • Accurate scene depth capture is essential for the success of three-dimensional television (3DTV), e.g. for high-quality view synthesis in autostereoscopic multiview displays. Unfortunately, scene depth is not easily obtained and is often of limited quality. Dedicated Time-of-Flight (ToF) sensors can deliver reliable depth readings where traditional methods, such as stereo-vision analysis, fail. However, since ToF sensors provide only limited spatial resolution and suffer from sensor noise, sophisticated upsampling methods are sought after. A multitude of ToF solutions have been proposed over recent years. Most of them achieve ToF super-resolution (TSR) by sensor fusion between ToF and additional sources, e.g. video. We recently proposed a weighted error energy minimization approach for ToF super-resolution, incorporating texture, sensor noise and temporal information. In this article, we take a closer look at the sensor noise weighting related to the Time-of-Flight active brightness signal. We determine a depth measurement reliability function by optimizing free parameters on test data and verifying the result with independent test cases. In the presented double-weighted TSR proposal, depth readings are weighted into the upsampling process with regard to their reliability, removing erroneous influences from the final result. Our evaluations prove the desired effect of depth measurement reliability weighting, decreasing the depth upsampling error by almost 40% in comparison to competing proposals.
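    • An illustrative reliability weight driven by active brightness might look like the following; the actual function and its parameters in the article are fitted to measurement data, so the clipped-linear form and the thresholds here are placeholders. Such a weight would scale the data term of the upsampling system, cf. the least-squares sketch after entry 38:

          import numpy as np

          def reliability_weight(active_brightness, b_min=100.0, b_sat=2000.0):
              """Illustrative depth-reliability weight from ToF active brightness.

              A low returned amplitude means noisy depth (weight -> 0); the weight
              saturates at 1 once the amplitude is comfortably above the noise floor.
              """
              w = (active_brightness - b_min) / (b_sat - b_min)
              return np.clip(w, 0.0, 1.0)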
  •  
42.
  • Sjöström, Mårten, 1967-, et al. (författare)
  • A digital 3D signage system and its effect on customer behavior
  • 2011
  • Ingår i: The International Conference on 3D Imaging (IC3D). - : IEEE conference proceedings. - 9781479915774
  • Konferensbidrag (refereegranskat)abstract
    • The use of digital signs simplifies distribution. Importantly, it draws more attention than static signs. A way to increase attention further is to add perceived depth. The paper discusses possible alternatives for extending an existing digital signage system to display stereoscopic 3D content, comparing a decentralized distribution solution and a centralized solution. A functional prototype system was implemented. A new 3D player was developed to render views from different formats. The implemented system was used to study customer behavior when exposed to digital stereoscopic 3D signage in a direct sales situation. The proportion of sales of the selected products, relative to the total number of sold products, was approximately the same before and during the tests. An interview study suggests that the sign did not influence customer decisions: customers were lost at different stages of the sales process, among them a stage related to sign placement.
  •  
43.
  •  
44.
  • Takhtardeshir, Soheib, 1991-, et al. (författare)
  • A Deep Learning based Light Field Image Compression as Pseudo Video Sequences with Additional in-loop Filtering
  • 2024
  • Ingår i: 3D Imaging and Applications 2024-Electronic Imaging. - San Francisco Airport in Burlingame, California : Society for Imaging Science & Technology. ; , s. 1-6
  • Konferensbidrag (refereegranskat)abstract
    • In recent years, several deep learning-based architectures have been proposed to compress Light Field (LF) images as pseudo video sequences. However, most of these techniques employ conventional compression-focused networks. In this paper, we introduce a version of a previously designed deep learning video compression network, adapted and optimized specifically for LF image compression. We enhance this network by incorporating an in-loop filtering block, along with additional adjustments and fine-tuning. By treating LF images as pseudo video sequences and deploying our adapted network, we address challenges presented by the unique features of LF images, such as high resolution and large data sizes. Our method compresses these images competently, preserving their quality and unique characteristics. With thorough fine-tuning and the inclusion of the in-loop filtering network, our approach shows improved performance in terms of Peak Signal-to-Noise Ratio (PSNR) and Mean Structural Similarity Index Measure (MSSIM) compared to other existing techniques. Our method provides a feasible path for LF image compression and may contribute to the emergence of new applications and advancements in this field.
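    • The pseudo-video view ordering that such schemes start from can be sketched directly; the serpentine scan is a common choice in the LF coding literature, not necessarily the one used in this paper:

          import numpy as np

          def lf_to_pseudo_video(views):
              """Order light field sub-aperture views as a pseudo video sequence.

              views: array of shape (U, V, H, W, C) with the angular grid (U, V).
              A serpentine (boustrophedon) scan keeps consecutive "frames" angularly
              adjacent, which helps the video codec's motion compensation.
              """
              U, V = views.shape[:2]
              frames = []
              for u in range(U):
                  cols = range(V) if u % 2 == 0 else range(V - 1, -1, -1)
                  frames.extend(views[u, v] for v in cols)
              return np.stack(frames)                     # (U * V, H, W, C)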
  •  
45.
  • Tourancheau, Sylvain, 1982-, et al. (författare)
  • Evaluation of quality of experience in interactive 3D visualization: methodology and results
  • 2012
  • Ingår i: Proceedings of SPIE - The International Society for Optical Engineering. - : SPIE - International Society for Optical Engineering. - 9780819489357 ; , s. Art. no. 82880O-
  • Konferensbidrag (refereegranskat)abstract
    • Human factors are of high importance in 3D visualization, but subjective evaluation of 3D displays is not easy because of high variability among users. This study aimed to evaluate and compare two different 3D visualization systems (a commercial stereoscopic display and a state-of-the-art multi-view display) in terms of task performance and quality of experience (QoE) in the context of interactive visualization. An adapted methodology was designed in order to focus on 3D differences and to reduce the influence of all other factors. 36 subjects took part in an experiment during which they were asked to solve different tasks in a synthetic 3D scene. After the experiment, they were asked to judge the quality of their experience according to specific features. Results showed that scene understanding and precision were significantly better on the multi-view display. Concerning the quality of experience, visual comfort was judged significantly better on the multi-view display, and visual fatigue was reported by 52% of the subjects on the stereoscopic display. This study made it possible to identify some factors influencing QoE, such as prior experience and stereopsis threshold.
  •  
46.
  • Tourancheau, Sylvain, 1982-, et al. (författare)
  • Subjective evaluation of user experience in interactive 3D-visualization in a medical context
  • 2012
  • Ingår i: Proceedings of the SPIE, vol 8318: Conference on Image Perception, Observer Performance, and Technology Assessment, San Diego, CA, USA, 4 - 9 February 2012. - : SPIE - International Society for Optical Engineering. ; , s. Art. no. 831814-
  • Konferensbidrag (refereegranskat)abstract
    • New display technologies enable the use of 3D visualization in a medical context. Even though user performance seems to be enhanced with respect to 2D thanks to the recreated depth cues, human factors, and more particularly visual comfort and visual fatigue, can still be an obstacle to the widespread use of these systems. This study aimed at evaluating and comparing two different 3D visualization systems (a commercial stereoscopic display and a state-of-the-art multi-view display) in terms of quality of experience (QoE) in the context of interactive medical visualization. An adapted methodology was designed in order to subjectively evaluate the experience of users. 14 medical doctors and 15 medical students took part in the experiment. After solving different tasks using the 3D reconstruction of a phantom object, they were asked to judge the quality of their experience according to specific features. They were also asked to give their opinion about the influence of 3D systems on their work conditions. Results suggest that medical doctors are open to 3D visualization techniques and are confident concerning their beneficial influence on their work. However, visual comfort and visual fatigue are still an issue for 3D displays. Results obtained with the multi-view display suggest that the use of continuous horizontal parallax might be the future response to these current limitations.
  •  