SwePub


Hit list for search "WFRF:(Aksoy Eren 1982 ) srt2:(2024)"


  • Results 1-8 of 8
1.
  • Cortinhal, Tiago, 1990- (author)
  • Semantics-aware Multi-modal Scene Perception for Autonomous Vehicles
  • 2024
  • Doctoral thesis (other academic/artistic) abstract
    • Autonomous vehicles represent the pinnacle of modern technological innovation, navigating complex and unpredictable environments. To do so effectively, they rely on a sophisticated array of sensors. This thesis explores two of the most crucial sensors: LiDARs, known for their accuracy in generating detailed 3D maps of the environment, and RGB cameras, essential for processing visual cues critical for navigation. Together, these sensors form a comprehensive perception system that enables autonomous vehicles to operate safely and efficiently. However, the reliability of these vehicles has yet to be tested when key sensors fail. The abrupt failure of a camera, for instance, disrupts the vehicle’s perception system, creating a significant gap in sensory input. This thesis addresses this challenge by introducing a novel multi-modal domain translation framework that integrates LiDAR and RGB camera data while ensuring continuous functionality despite sensor failures. At the core of this framework is an innovative model capable of synthesizing RGB images and their corresponding segment maps from raw LiDAR data by exploiting the scene semantics. The proposed framework stands out as the first of its kind, demonstrating for the first time that scene semantics can bridge the gap across different domains with distinct data structures, such as unorganized sparse 3D LiDAR point clouds and structured 2D camera data. Thus, this thesis represents a significant leap forward in the field, offering a robust solution to the challenge of RGB data recovery without camera sensors. The practical application of this model is thoroughly explored in the thesis. It involves testing the model’s capability to generate pseudo point clouds from RGB depth estimates, which, when combined with LiDAR data, create an enriched perception dataset. This enriched dataset is pivotal in enhancing object detection capabilities, a fundamental aspect of autonomous vehicle navigation. The quantitative and qualitative evidence reported in this thesis demonstrates that the synthetic generation of data not only compensates for the loss of sensory input but also considerably improves the performance of object detection systems compared to using raw LiDAR data only. By addressing the critical issue of sensor failure and presenting viable solutions, this thesis contributes to enhancing the safety, reliability, and efficiency of autonomous vehicles. It paves the way for further research and development, setting a new standard for autonomous vehicle technology in scenarios of sensor malfunctions or adverse environmental conditions.
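The thesis abstract above mentions generating pseudo point clouds from RGB depth estimates. As a minimal sketch of the underlying geometry only (not the thesis's actual pipeline), the standard pinhole unprojection below turns a depth map plus camera intrinsics into a 3D point cloud; the depth values and intrinsics are placeholders.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Unproject a depth map (H x W, meters) into an N x 3 point cloud
    using the standard pinhole camera model."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx  # right
    y = (v - cy) * z / fy  # down
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]  # keep only valid (positive-depth) pixels

# Placeholder example: a synthetic 4x4 depth map and made-up intrinsics.
depth = np.full((4, 4), 10.0)
cloud = depth_to_point_cloud(depth, fx=500.0, fy=500.0, cx=2.0, cy=2.0)
print(cloud.shape)  # (16, 3)
```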
2.
  • Cortinhal, Tiago, 1990-, et al. (author)
  • Depth- and semantics-aware multi-modal domain translation : Generating 3D panoramic color images from LiDAR point clouds
  • 2024
  • In: Robotics and Autonomous Systems. - Amsterdam : Elsevier. - 0921-8890 .- 1872-793X. ; 171, pp. 1-9
  • Journal article (peer-reviewed) abstract
    • This work presents a new depth- and semantics-aware conditional generative model, named TITAN-Next, for cross-domain image-to-image translation in a multi-modal setup between LiDAR and camera sensors. The proposed model leverages scene semantics as a mid-level representation and is able to translate raw LiDAR point clouds to RGB-D camera images by solely relying on semantic scene segments. We claim that this is the first framework of its kind and it has practical applications in autonomous vehicles such as providing a fail-safe mechanism and augmenting available data in the target image domain. The proposed model is evaluated on the large-scale and challenging Semantic-KITTI dataset, and experimental findings show that it considerably outperforms the original TITAN-Net and other strong baselines by a 23.7% margin in terms of IoU. © 2023 The Author(s). 
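The abstract describes a two-stage idea: scene semantics extracted from the LiDAR scan act as a mid-level representation that conditions an image generator. The sketch below is only a structural illustration of that flow, not the published TITAN-Net/TITAN-Next architecture; the class count, module sizes, and segment-map encoding are all invented for the example.

```python
import torch
import torch.nn as nn

NUM_CLASSES = 20  # placeholder class count for the semantic segments

class SegmentToRGBD(nn.Module):
    """Toy conditional generator: one-hot semantic panorama -> RGB-D image.
    Stands in for the translation module; the real model uses adversarial
    training and a far deeper network."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(NUM_CLASSES, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 4, 3, padding=1),  # 3 RGB channels + 1 depth channel
        )
    def forward(self, seg_onehot):
        return self.net(seg_onehot)

# Stage 1 (assumed): a LiDAR segmentation network labels each point, and the
# labels are projected into a 2D panorama. Here we fake that stage's output.
seg_panorama = torch.randint(0, NUM_CLASSES, (1, 64, 256))
seg_onehot = nn.functional.one_hot(seg_panorama, NUM_CLASSES).permute(0, 3, 1, 2).float()

# Stage 2: translate the semantic panorama into a panoramic RGB-D image.
rgbd = SegmentToRGBD()(seg_onehot)
print(rgbd.shape)  # torch.Size([1, 4, 64, 256])
```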
3.
  • Cortinhal, Tiago, 1990-, et al. (author)
  • Semantics-aware LiDAR-Only Pseudo Point Cloud Generation for 3D Object Detection
  • 2024
  • Conference paper (peer-reviewed) abstract
    • Although LiDAR sensors are crucial for autonomous systems due to providing precise depth information, they struggle with capturing fine object details, especially at a distance, due to sparse and non-uniform data. Recent advances introduced pseudo-LiDAR, i.e., synthetic dense point clouds, using additional modalities such as cameras to enhance 3D object detection. We present a novel LiDAR-only framework that augments raw scans with denser pseudo point clouds by solely relying on LiDAR sensors and scene semantics, omitting the need for cameras. Our framework first utilizes a segmentation model to extract scene semantics from raw point clouds, and then employs a multi-modal domain translator to generate synthetic image segments and depth cues without real cameras. This yields a dense pseudo point cloud enriched with semantic information. We also introduce a new semantically guided projection method, which enhances detection performance by retaining only relevant pseudo points. We applied our framework to different advanced 3D object detection methods and report up to a 2.9% performance improvement. We also obtained comparable results on the KITTI 3D object detection dataset, compared to other state-of-the-art LiDAR-only detectors. 
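The key filtering step described here, retaining only semantically relevant pseudo points before merging them with the raw scan, can be illustrated with a few lines of array masking. This is a hedged sketch: the class IDs, the "relevant" set, and the random data are placeholders, not the paper's configuration.

```python
import numpy as np

# Placeholder inputs: raw LiDAR points and denser pseudo points, each pseudo
# point carrying a semantic label predicted by a segmentation model.
raw_points = np.random.rand(1000, 3)
pseudo_points = np.random.rand(5000, 3)
pseudo_labels = np.random.randint(0, 20, size=5000)

# Hypothetical IDs of detection-relevant classes (e.g. car, pedestrian, cyclist).
RELEVANT_CLASSES = {1, 6, 7}

# Semantically guided selection: keep only pseudo points whose predicted class
# matters for the downstream 3D object detector, then merge with the raw scan.
mask = np.isin(pseudo_labels, list(RELEVANT_CLASSES))
enriched_scan = np.concatenate([raw_points, pseudo_points[mask]], axis=0)
print(raw_points.shape[0], "->", enriched_scan.shape[0], "points after enrichment")
```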
4.
  • Inceoglu, Arda, et al. (author)
  • Multimodal Detection and Classification of Robot Manipulation Failures
  • 2024
  • In: IEEE Robotics and Automation Letters. - Piscataway, NJ : IEEE. - 2377-3766. ; 9:2, pp. 1396-1403
  • Journal article (peer-reviewed) abstract
    • An autonomous service robot should be able to interact with its environment safely and robustly without requiring human assistance. Unstructured environments are challenging for robots since the exact prediction of outcomes is not always possible. Even when the robot behaviors are well-designed, the unpredictable nature of the physical robot-object interaction may lead to failures in object manipulation. In this letter, we focus on detecting and classifying both manipulation and post-manipulation phase failures using the same exteroception setup. We cover a diverse set of failure types for primary tabletop manipulation actions. In order to detect these failures, we propose FINO-Net (Inceoglu et al., 2021), a deep multimodal sensor fusion-based classifier network architecture. FINO-Net accurately detects and classifies failures from raw sensory data without any additional information on task description and scene state. In this work, we use our extended FAILURE dataset (Inceoglu et al., 2021) with 99 new multimodal manipulation recordings and annotate them with their corresponding failure types. FINO-Net achieves 0.87 failure detection and 0.80 failure classification F1 scores. Experimental results show that FINO-Net is also appropriate for real-time use. © 2016 IEEE.
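FINO-Net is described as a deep multimodal sensor-fusion classifier over raw sensory data. As a hedged sketch of the general pattern (per-modality encoders whose features are fused before a classification head), not the published architecture, the toy model below fuses an RGB stream and a depth stream; adding further modalities such as audio would follow the same scheme, and the channel widths and class count are invented.

```python
import torch
import torch.nn as nn

class MultimodalFailureNet(nn.Module):
    """Toy late-fusion classifier: encode each modality separately, then
    concatenate features and classify the failure type."""
    def __init__(self, num_failure_types=5):  # placeholder class count
        super().__init__()
        def encoder(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 16, 5, stride=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
        self.rgb_enc = encoder(3)
        self.depth_enc = encoder(1)
        self.head = nn.Linear(16 + 16, num_failure_types)

    def forward(self, rgb, depth):
        fused = torch.cat([self.rgb_enc(rgb), self.depth_enc(depth)], dim=1)
        return self.head(fused)

# Placeholder batch: one RGB frame and one aligned depth frame.
logits = MultimodalFailureNet()(torch.rand(1, 3, 120, 160), torch.rand(1, 1, 120, 160))
print(logits.shape)  # torch.Size([1, 5])
```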
5.
  • Raisuddin, Abu Mohammed, 1989-, et al. (author)
  • 3D-OutDet : A Fast and Memory Efficient Outlier Detector for 3D LiDAR Point Clouds in Adverse Weather
  • 2024
  • In: 2024 IEEE Intelligent Vehicles Symposium (IV). - : IEEE. - 9798350348811 - 9798350348828 ; , pp. 2862-2868
  • Conference paper (peer-reviewed) abstract
    • Adverse weather conditions such as snow, rain, and fog are natural phenomena that can impair the performance of the perception algorithms in autonomous vehicles. Although LiDARs provide accurate and reliable scans of the surroundings, their output can be substantially degraded by precipitation (e.g., snow particles), leading to an undesired effect on the downstream perception tasks. Several studies have been performed to battle this undesired effect by filtering out precipitation outliers; however, these approaches have large memory consumption and long execution times, which are not desired for onboard applications. To that end, we introduce a novel outlier detector for 3D LiDAR point clouds captured under adverse weather conditions. Our proposed detector 3D-OutDet is based on a novel convolution operation that processes nearest neighbors only, allowing the model to capture the most relevant points. This reduces the number of layers, resulting in a model with a low memory footprint and fast execution time, while producing competitive performance compared to state-of-the-art models. We conduct extensive experiments on three different datasets (WADS, SnowyKITTI, and SemanticSpray) and show that with a sacrifice of 0.16% mIOU performance, our model reduces the memory consumption by 99.92%, number of operations by 96.87%, and execution time by 82.84% per point cloud on the real-scanned WADS dataset. Our experimental evaluations also showed that the mIOU performance of the downstream semantic segmentation task on WADS can be improved by up to 5.08% after applying our proposed outlier detector. We release our source code, supplementary material and videos at https://sporsho.github.io/3DOutDet. Upon clicking the link you will have the option to go to the source code, see supplementary information, and view videos generated with our 3D-OutDet. © 2024 IEEE.
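The core idea reported for 3D-OutDet is a convolution that operates only over each point's nearest neighbors, which keeps the network shallow and fast. The sketch below shows one hedged reading of such an operation (k-NN gathering followed by a shared MLP over local offsets); the neighborhood size, feature widths, scoring head, and threshold are all invented for the example.

```python
import torch
import torch.nn as nn

K = 8  # placeholder neighborhood size

class KNNOutlierScorer(nn.Module):
    """Toy nearest-neighbor 'convolution': for every point, gather its K
    nearest neighbors' offsets and map them to an outlier score."""
    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 16), nn.ReLU(), nn.Linear(16, 1))

    def forward(self, points):                        # points: (N, 3)
        dists = torch.cdist(points, points)           # (N, N) pairwise distances
        knn_idx = dists.topk(K + 1, largest=False).indices[:, 1:]  # drop self
        neighbors = points[knn_idx]                   # (N, K, 3)
        offsets = neighbors - points.unsqueeze(1)     # local geometry only
        scores = self.mlp(offsets).mean(dim=1)        # aggregate over neighbors
        return scores.squeeze(-1)                     # (N,) outlier logits

points = torch.rand(256, 3)               # placeholder LiDAR scan
keep = KNNOutlierScorer()(points) < 0.0   # arbitrary threshold on an untrained net
print("kept", int(keep.sum()), "of", points.shape[0], "points")
```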
6.
  • Rosberg, Felix, 1995- (author)
  • Anonymizing Faces without Destroying Information
  • 2024
  • Licentiate thesis (other academic/artistic) abstract
    • Anonymization is a broad term, meaning that personal data, or rather data that identifies a person, is redacted or obscured. In the context of video and image data, the most palpable information is the face. Faces barely change compared to other aspects of a person, such as clothes, and we as people already have a strong sense for recognizing faces. Computers are also adroit at recognizing faces, with facial recognition models being exceptionally powerful at identifying and comparing faces. It is therefore generally considered important to obscure the faces in video and image data when aiming to keep it anonymized. Traditionally this is done simply through blurring or masking. But this destroys useful information such as eye gaze, pose, expression and the fact that it is a face. This is a particular issue, as our society today is data-driven in many aspects. One obvious such aspect is autonomous driving and driver monitoring, where necessary algorithms such as object detectors rely on deep learning to function. Due to the data hunger of deep learning, in conjunction with society’s call for privacy and integrity through regulations such as the General Data Protection Regulation (GDPR), anonymization that preserves useful information becomes important. This Thesis investigates the potential and possible limitations of anonymizing faces without destroying the aforementioned useful information. The base approach to achieve this is through face swapping and face manipulation, where the current research focuses on changing the face (or identity) while keeping the original attribute information, all while being incorporated and consistent in an image and/or video. Specifically, this Thesis demonstrates how target-oriented and subject-agnostic face swapping methodologies can be utilized for realistic anonymization that preserves attributes. Through this, this Thesis points out several approaches that are: 1) controllable, meaning the proposed models do not naively change the identity; the kind and magnitude of the identity change are adjustable, and thus tunable to guarantee anonymization; 2) subject-agnostic, meaning that the models can handle any identity; 3) fast, meaning that the models are able to run efficiently, and thus have the potential of running in real-time. The end product consists of an anonymizer that achieved state-of-the-art performance on identity transfer, pose retention and expression retention while providing realism. Apart from identity manipulation, the Thesis demonstrates potential security issues, specifically reconstruction attacks, where a bad-actor model learns convolutional traces/patterns in the anonymized images in such a way that it is able to completely reconstruct the original identity. The bad-actor network is able to do this with simple black-box access to the anonymization model, by constructing a pair-wise dataset of unanonymized and anonymized faces. To alleviate this issue, different defense measures that disrupt the traces in the anonymized images were investigated. The main takeaway from this is that what qualitatively looks convincing at hiding an identity is not necessarily so, making robust quantitative evaluations important.
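The reconstruction attack described above needs only black-box access: query the anonymizer to build (anonymized, original) pairs, then train a network to invert it. A minimal sketch of that training loop follows; the anonymizer stub, the attacker architecture, and the random stand-in data are all placeholders, not the thesis's actual models.

```python
import torch
import torch.nn as nn

def anonymizer(x):
    """Stand-in for the black-box anonymization model; a real attack would
    query the deployed model on face images instead."""
    return x.flip(-1) + 0.05 * torch.randn_like(x)

# Attacker: a small conv net trained to invert the anonymizer from pairs.
attacker = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),
)
opt = torch.optim.Adam(attacker.parameters(), lr=1e-3)

for step in range(100):                       # toy training loop
    originals = torch.rand(8, 3, 64, 64)      # placeholder face images
    anonymized = anonymizer(originals)        # pair-wise dataset construction
    loss = nn.functional.mse_loss(attacker(anonymized), originals)
    opt.zero_grad(); loss.backward(); opt.step()

print("final reconstruction MSE:", loss.item())
```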
7.
  • Tzelepis, Georgies, et al. (author)
  • Semantic State Estimation in Robot Cloth Manipulations Using Domain Adaptation from Human Demonstrations
  • 2024
  • In: Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4. - Setúbal : SciTePress. - 9789897586798 ; , pp. 172-182
  • Conference paper (peer-reviewed) abstract
    • Deformable object manipulations, such as those involving textiles, present a significant challenge due to their high dimensionality and complexity. In this paper, we propose a solution for estimating semantic states in cloth manipulation tasks. To this end, we introduce a new, large-scale, fully-annotated RGB image dataset of semantic states featuring a diverse range of human demonstrations of various complex cloth manipulations. This effectively transforms the problem of action recognition into a classification task. We then evaluate the generalizability of our approach by employing domain adaptation techniques to transfer knowledge from human demonstrations to two distinct robotic platforms: Kinova and UR robots. Additionally, we further improve performance by utilizing a semantic state graph learned from human manipulation data. © 2024 by SCITEPRESS – Science and Technology Publications, Lda.
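The abstract mentions domain adaptation from human demonstrations to robot platforms without naming a specific technique. One standard illustrative choice (assumed here, not necessarily the paper's method) is DANN-style adversarial adaptation with a gradient reversal layer, which pushes the shared encoder toward features that classify cloth states but cannot distinguish human-video from robot-video inputs. All dimensions and labels below are placeholders.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; reverses gradients on the backward pass,
    so the shared encoder is trained to produce domain-invariant features."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return -grad

features = nn.Sequential(nn.Linear(128, 64), nn.ReLU())   # shared encoder
state_head = nn.Linear(64, 10)    # semantic cloth-state classes (placeholder)
domain_head = nn.Linear(64, 2)    # human-demo domain vs. robot domain

human_x = torch.rand(16, 128)     # placeholder features from human videos
robot_x = torch.rand(16, 128)     # placeholder features from robot videos
ce = nn.CrossEntropyLoss()

f_h, f_r = features(human_x), features(robot_x)
state_loss = ce(state_head(f_h), torch.randint(0, 10, (16,)))  # labels: human domain only
domain_logits = domain_head(GradReverse.apply(torch.cat([f_h, f_r])))
domain_labels = torch.cat([torch.zeros(16), torch.ones(16)]).long()
loss = state_loss + ce(domain_logits, domain_labels)
loss.backward()  # encoder receives reversed domain gradients -> invariance
```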
8.
  • Tzelepis, Georgies, et al. (author)
  • Semantic State Prediction in Robotic Cloth Manipulation
  • 2024
  • In: Lecture Notes in Networks and Systems. - Cham : Springer Nature. ; , pp. 205-219
  • Conference paper (peer-reviewed) abstract
    • State estimation of deformable objects such as textiles is notoriously difficult due to their extremely high dimensionality and complexity. Lack of data and benchmarks is another challenge impeding progress in robotic cloth manipulation. In this paper, we make a first attempt to solve the problem of semantic state estimation through RGB-D data only in an end-to-end manner with the help of deep neural networks. Since neural networks require large amounts of labeled data, we introduce a novel Mujoco simulator to generate a large-scale fully annotated robotic textile manipulation dataset including bimanual actions. Finally, we provide a set of baseline deep neural networks and benchmark them on the problem of semantic state prediction on our proposed dataset. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
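For an end-to-end RGB-D baseline of the kind the abstract benchmarks, one simple construction (assumed here for illustration, not taken from the paper) is early fusion: stack RGB and depth into a 4-channel input and train a standard CNN classifier on the simulated, annotated frames. The class count and network shape are placeholders.

```python
import torch
import torch.nn as nn

NUM_STATES = 8  # placeholder number of semantic cloth states

baseline = nn.Sequential(                      # minimal early-fusion CNN
    nn.Conv2d(4, 32, 3, stride=2), nn.ReLU(),  # 4 = RGB (3) + depth (1)
    nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(64, NUM_STATES),
)

rgb = torch.rand(2, 3, 128, 128)    # placeholder simulator frames
depth = torch.rand(2, 1, 128, 128)
logits = baseline(torch.cat([rgb, depth], dim=1))
print(logits.shape)  # torch.Size([2, 8])
```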