  • Editors
    • Roth, Peter M.
    • Steinbauer, Gerald
    • Fraundorfer, Friedrich
    • Brandstötter, Mathias
    • Perko, Roland
  • Title: Proceedings of the Joint Austrian Computer Vision and Robotics Workshop 2020
  • DOI: 10.3217/978-3-85125-752-6
  • Licence: CC BY
  • ISBN: 978-3-85125-752-6
  • Access rights: CC BY

Chapters

  • Preface
    Authors: Roth, Peter M.; Steinbauer, Gerald; Fraundorfer, Friedrich; Brandstötter, Mathias; Perko, Roland
    DOI: 10.3217/978-3-85125-752-6-00
  • Semi-Automatic Generation of Training Data for Neural Networks for 6D Pose Estimation and Robotic Grasping
    Authors: Rauer, Johannes; Aburaia, Mohamed; Wöber, Wilfried
    DOI: 10.3217/978-3-85125-752-6-01
    Machine-learning-based approaches for pose estimation are trained using annotated ground-truth data: images showing the object and information about its pose. In this work an approach to semi-automatically generate 6D pose-annotated data, using a movable marker and an articulated robot, is presented. A neural network for pose estimation is trained using datasets varying in size and type. The evaluation shows that small datasets recorded in the target domain and supplemented with augmented images lead to more robust results than larger synthetic datasets. The results demonstrate that a mobile manipulator using the proposed pose-estimation system could be deployed in real-life logistics applications to increase the level of automation.
  • Feasibility study of a certifiable production environment using safe environmental sensor systems
    Authors: Papa, Maximilian; Sattinger, Vinzenz; Kubinger, Wilfried
    DOI: 10.3217/978-3-85125-752-6-02
    Safe robot development is based on three factors: safety, performance and economy. Currently, however, only two of these can be maximized at once, which is why an alternative for maximizing all three factors has been worked on. In particular, the topic of a safe environment has been discussed, where sensors are relocated from individual robots into the environment. A sensor thus monitors more than one robot, which leads to an increase in efficiency. Because the approach is novel, the technical and legal requirements of such a system were clarified first. The required components were then determined in order to plan a possible implementation. Finally, an adapted concept for the Digital Factory of the UAS Technikum Vienna showed the feasibility of a safety-certifiable environmental sensor system.
  • Vision-based Docking of a Mobile Robot
    Authors: Kriegler, Andreas; Wöber, Wilfried
    DOI: 10.3217/978-3-85125-752-6-03
    For mobile robots to be considered autonomous they must reach target locations in a required pose, a procedure referred to as docking. Popular current solutions use LiDARs combined with sizeable docking stations, but these systems struggle because they incorrectly detect dynamic obstacles. This paper instead proposes a vision-based framework for docking a mobile robot. Faster R-CNN is used for detecting arbitrary visual markers. The pose of the robot is estimated using the solvePnP algorithm relating 2D-3D point pairs. Following exhaustive experiments, it is shown that solvePnP gives systematically inaccurate pose estimates in the x-axis pointing to the side. Pose estimates are off by ten to fifty centimeters and could therefore not be used for docking the robot. Insights are provided to circumvent similar problems in future applications.
  • Autonomous Grasping of Known Objects Using Depth Data and the PCA
    Authors: Steigl, Dominik; Wöber, Wilfried; Aburaia, Mohamed
    DOI: 10.3217/978-3-85125-752-6-04
    Two main goals for automated object manipulation processes are cost reduction and flexibility. Time-consuming, costly object-specific fixtures can be replaced by vision systems, whereby the manipulators are extended with cameras so that multiple objects in the environment can be precisely identified. To be able to manipulate an object, it must first be recognized in the world, and then its pose must be calculated. Neural network approaches recognize and estimate the pose of an object in a single step and yield superior results, but rely on vast amounts of training data. This work describes an approach for estimating the pose of identified objects without pre-trained pose data. Template matching is used to recognize objects in depth images, and the pose is estimated through principal component analysis (PCA). The input to the algorithm is reduced to the template. Pre-existing knowledge about the object further improves accuracy. A maximum deviation of 0.2 cm from the ground truth has been achieved, which suffices for the industrial grasping task. The system was evaluated with real measurements taken with an RGB-D camera. This work represents a first step towards estimating an object's pose with linear statistical methods.
  • EDLRIS: European Driving License for Robots and Intelligent Systems
    Authors: Menzinger, Manuel; Kandlhofer, Martin; Steinbauer, Gerald; Bieber, Ronald; Baumann, Wilfried; Ehardt-Schmiederer, Margit; Winkler, Thomas
    DOI: 10.3217/978-3-85125-752-6-05
    EDLRIS is a professional and standardized system for training and certifying people in fundamental topics of Robotics and Artificial Intelligence. It was developed, implemented and evaluated within the course of an international 3-year project. This paper provides an overview of goals, methodology, training modules and preliminary results of the EDLRIS project.
  • Design and Implementation of a Mobile Search and Rescue Robot
    Authors: Novotny, Georg; Kubinger, Wilfried
    DOI: 10.3217/978-3-85125-752-6-06
    For public emergencies such as nuclear accidents or natural disasters, an urgent and reliable description as well as an evaluation of the environment form the basis of all organized search and rescue (S&R) team plans and actions. If this information is not available, the risks for the rescue services increase dramatically. Mobile robots help to minimize these risks by providing information about the disaster site to rescue teams. This paper discusses the needs and requirements of mobile robots in S&R application areas such as nuclear disasters and evaluates results achieved during the ENRICH 2019 trial based on the system architecture of the mobile S&R robot "Robbie" of UAS Technikum Vienna. The successful participation in ENRICH 2019 shows that the mobile robot is capable of performing S&R actions during emergencies.
  • Automatic Ontology-based Plan Generation for an Industrial Robotics System
    Authors: Hoebert, Timon; Lepuschitz, Wilfried; Merdan, Munir
    DOI: 10.3217/978-3-85125-752-6-07
    Programming and re-configuration of robots are associated with high costs, especially for small- and medium-sized enterprises. We present an ontology-driven solution that can automate the configuration as well as the generation of process plans and schedules, thereby significantly lowering the effort required when changes occur. The presented approach is demonstrated in a laboratory environment with an industrial pilot test case.
  • How does explicit exploration influence Deep Reinforcement Learning?
    Authors: Hollenstein, Jakob; Renaudo, Erwan; Saveriano, Matteo; Piater, Justus
    DOI: 10.3217/978-3-85125-752-6-08
    Most Deep Reinforcement Learning (DRL) methods perform local search and are therefore prone to getting stuck in non-optimal solutions. To overcome this issue, we exploit simulation models and kinodynamic planners as an exploration mechanism in a model-based reinforcement learning method. We show that, even on a simple toy domain, DRL methods are not immune to local optima and require additional exploration mechanisms. In contrast, our planning-based exploration exhibits better state space coverage, which translates into better policies than the ones learned via standard DRL methods.
  • UGV Radiation Mapping using a Particle Filter
    Authors: Permann, Alexander; Hettegger, Daniel; Steinbauer, Gerald
    DOI: 10.3217/978-3-85125-752-6-09
    We present and evaluate a particle-filter-based approach to predict the location and emission intensity of an arbitrary and unknown number of stationary nuclear radiation sources from measurement data taken by an autonomously navigating unmanned ground vehicle (UGV).
  • Towards ASP-based Scheduling for Industrial Transport Vehicles
    Authors: Fabricius, Felicitas; De Bortoli, Marco; Maximilian, Selmair; Reip, Michael; Steinbauer, Gerald; Gebser, Martin
    DOI: 10.3217/978-3-85125-752-6-10
    The increasing number of robots and autonomous vehicles involved in logistics applications poses new challenges for the Artificial Intelligence community. Web-shop giants such as Amazon or Alibaba have brought this problem to a new level, with huge warehouses and a huge number of orders to deliver under strict deadlines. Coordinating and scheduling such a high quantity of tasks over a fleet of autonomous robots is a highly complex problem: neither simple imperative greedy algorithms, which compromise on the quality of the solution, nor precise enumeration techniques, which compromise on the solving time, are feasible any longer for tackling such problems. In this work, we use Answer Set Programming to tackle real-world logistics problems, involving both dynamic task assignment and planning, at the BMW Group and Incubed IT. Different strategies are tried and compared to the original imperative approach.
  • Learning Manipulation Tasks from Vision-based Teleoperation
    Authors: Hirschmanner, Matthias; Jamadi, Ali; Neuberger, Bernhard; Patten, Timothy; Vincze, Markus
    DOI: 10.3217/978-3-85125-752-6-11
    Learning from demonstration is an approach to directly teach robots new tasks without explicit programming. Prior methods typically collect demonstration data through kinesthetic teaching or teleoperation. This is challenging because the human must physically interact with the robot or use specialized hardware. This paper presents a teleoperation system based on tracking the human hand to alleviate the requirement of specific tools for robot control. The data recorded during the demonstration is used to train a deep imitation learning model that enables the robot to imitate the task. We conduct experiments with a KUKA LWR IV+ robotic arm for the task of pushing an object from a random start location to a goal location. Results show the successful completion of the task by the robot after only 100 collected demonstrations. In comparison to the baseline model, the introduction of regularization and data augmentation leads to a higher success rate.
  • Reactive motion planning framework inspired by hybrid automata
    Authors: Hajdu, Csaba; Ballagi, Áron
    DOI: 10.3217/978-3-85125-752-6-12
    This paper presents a motion planning framework controlled by reactive events and producing feedback data suitable to be processed by various learning and verification methods (e.g. reinforcement learning, runtime monitoring). Our architecture decomposes subtasks of motion planning into separate perception and trajectory planner parts. These distributed parts interact through discrete-timed events controlled by timed state machines, in addition to the classical continuous state flow. Our research primarily focuses on autonomous vehicles, so this framework is intended to satisfy the requirements of this field. The motion planner framework interfaces with a widely used robotic middleware.
  • Automated Log Ordering through Robotic Grasper
    Authors: Weiss, Stephan; Ainetter, Stefan; Arneitz, Fred; Arronde, Dailys; Dhakate, Rohit; Fraundorfer, Friedrich; Gietler, Harald; Gubensäk, Wolfgang; Medeiros, Mylena
    DOI: 10.3217/978-3-85125-752-6-13
    This work focuses on retrofitting a crane model in the wood industry for automated log grasping. AI-inspired, vision-based approaches are used to categorize and segment the logs and their geometry to subsequently define optimal grasping poses. Retrofittable sensors and robust control strategies are developed for cost-efficient upgrading of existing manually operated cranes towards autonomous systems.
  • Introducing a Morphological Box for an Extended Risk Assessment of Human-Robot Work Systems Considering Prospective System Modifications
    Authors: Komenda, Titanilla; Steiner, Martin; Rathmair, Michael; Brandstötter, Mathias
    DOI: 10.3217/978-3-85125-752-6-14
    The concept of human-machine collaboration is regarded as a key enabler for agile production systems, as collaborative robots offer new forms of flexibility. Due to inherent safety functionalities, these robots can operate without physically separating safety devices and thus provide flexibility in task allocation and execution. However, under the present normative regulations, changes to the work system require a new risk assessment, which is a tedious task as feasible changes are usually not considered in the implementation phase. This paper presents the impact of modifications on collaborative robotic cells and how they influence the risk assessment. Furthermore, a method of considering work system variants based on desired future modifications is presented so that implications can already be identified in an early design phase of the system.
  • Several Approaches for the Optimization of Arm Motions of Humanoids
    Authors: Lichtenecker, Daniel; Krög, Gabriel; Gattringer, Hubert; Müller, Andreas
    DOI: 10.3217/978-3-85125-752-6-15
    This paper presents several point-to-point optimization tasks of humanoid arm motions. The focus lies on the optimization of elementary arm motions. Several cost functions for optimization tasks are defined. Tasks with respect to time-optimal control, minimizing joint loads and maximizing the vertical torque of the torso are presented. The dynamic optimal control problem is transformed into a static parametric optimization problem by using B-spline curves. The optimization is carried out with the Sequential Quadratic Programming algorithm.
  • Presentation Attacks and Their Detection in Finger and Hand Vein Recognition
    Authors: Debiasi, Luca; Kauba, Christof; Hofbauer, Heinz; Prommegger, Bernhard; Uhl, Andreas
    DOI: 10.3217/978-3-85125-752-6-16
    Biometric recognition systems, especially those based on vascular patterns, are becoming more popular. However, these systems are still susceptible to so-called presentation attacks, where a forged representation of the original biometric is presented to the system in an attempt to mimic the original biometric and fool the system. We propose a presentation attack approach for finger- and hand-vein recognition systems using paper prints as well as wax and silicone artefacts. We further develop a suitable presentation attack detection (PAD) scheme based on natural scene statistics and acquire a corresponding hand vein presentation attack dataset. Evaluating the PAD scheme on the dataset confirmed its success in the detection of the forged samples.
  • HPS: Holistic End-to-End Panoptic Segmentation Network with Interrelations
    Authors: Kniewasser, Günther; Grabner, Alexander; Roth, Peter M.
    DOI: 10.3217/978-3-85125-752-6-17
    To provide a complete 2D scene segmentation, panoptic segmentation unifies the tasks of semantic and instance segmentation. For this purpose, existing approaches independently address semantic and instance segmentation and merge their outputs in a heuristic fashion. However, this simple fusion has two limitations in practice. First, the system is not optimized for the final objective in an end-to-end manner. Second, the mutual information between the semantic and instance segmentation tasks is not fully exploited. To overcome these limitations, we present a novel end-to-end trainable architecture that generates a full pixel-wise image labeling with resolved instance information. Additionally, we introduce interrelations between the two subtasks by providing instance segmentation predictions as feature input to our semantic segmentation branch. This inter-task link eases the semantic segmentation task and increases the overall panoptic performance by providing segmentation priors. We evaluate our method on the challenging Cityscapes dataset and show significant improvements compared to previous panoptic segmentation architectures.
  • Frame-To-Frame Consistent Semantic Segmentation
    Authors: Rebol, Manuel; Knöbelreiter, Patrick
    DOI: 10.3217/978-3-85125-752-6-18
    In this work, we aim for temporally consistent semantic segmentation throughout frames in a video. Many semantic segmentation algorithms process images individually, which leads to an inconsistent scene interpretation due to illumination changes, occlusions and other variations over time. To achieve a temporally consistent prediction, we train a convolutional neural network (CNN) which propagates features through consecutive frames in a video using a convolutional long short-term memory (ConvLSTM) cell. Besides the temporal feature propagation, we penalize inconsistencies in our loss function. We show in our experiments that the performance improves when utilizing video information compared to single frame prediction. The mean intersection over union (mIoU) metric on the Cityscapes validation set increases from 45.2% for the single frames to 57.9% for video data after implementing the ConvLSTM to propagate features through time on the ESPNet. Most importantly, inconsistency decreases from 4.5% to 1.3%, which is a reduction by 71.1%. Our results indicate that the added temporal information produces a frame-to-frame consistent and more accurate image understanding compared to single frame processing.
  • Ground Control Point Retrieval From SAR Satellite Imagery
    Authors: Perko, Roland; Raggam, Hannes; Gutjahr, Karlheinz; Koppe, Wolfgang; Janoth, Jürgen
    DOI: 10.3217/978-3-85125-752-6-19
    For many applications, such as autonomous driving or geo-referencing of optical satellite data, highly accurate reference coordinates are of importance. This work demonstrates that such Ground Control Points can automatically be derived from multi-beam Synthetic Aperture Radar satellite images with high accuracy.
  • Classification and Segmentation of Scanned Library Catalogue Cards using Convolutional Neural Networks
    Authors: Wödlinger, Matthias; Sablatnig, Robert
    DOI: 10.3217/978-3-85125-752-6-20
    The library of the TU Wien has been documenting changes in its inventory in the form of physical library archive cards. To make these archive cards digitally accessible, the cards and the text regions therein need to be categorized and the text must be made machine-readable. In this paper we present a pipeline consisting of classification, page segmentation and automated handwriting recognition that, given a scan of a library card, returns the category this card belongs to and an XML file containing the extracted and classified text.
  • Visual Odometry For Industrial Cable Laying
    Authors: Gregorac, Ana; Gutjahr, Karlheinz; Ladstädter, Richard; Perko, Roland; Höppl, Wolfgang
    DOI: 10.3217/978-3-85125-752-6-21
    In order to support broadband network expansion in rural areas, the LAYJET Micro-Rohr Verlegegesellschaft has developed a highly automated cable laying technology based on a Fendt 936 tractor as the carrier vehicle and a milling machine with an integrated cable laying unit [3]. Operating at a speed of approximately 1 km/h, LAYJET is able to lay several kilometres of cable per day along existing roads. The position of the cable needs to be precisely surveyed for documentation purposes, which is a time-consuming and costly process. LAYJET is therefore equipped with a high-end GNSS RTK positioning system (TRIMBLE NetR9). In areas with bad GNSS signal reception or even complete GNSS outage (e.g., roads through a forest) an alternative positioning method is needed. JOANNEUM RESEARCH and the surveying office Höppl / Graz have therefore developed a calibrated stereo camera setup triggered by an odometer, which allows reconstructing the trajectory of the GNSS antenna using visual odometry (VO).
  • Few-shot Object Detection Using Online Random Forests
    Authors: Bailer, Werner; Fassold, Hannes
    DOI: 10.3217/978-3-85125-752-6-22
    We propose an approach for few-shot object detection, consisting of a CNN-based generic object detector and feature extractor, and an online random forest as a classifier. This enables incremental training of the classifier, which reaches similar performance with around 20 samples as when using 50+ training samples in batch learning.
  • The Problem of Fragmented Occlusion in Object Detection
    Authors: Pegoraro, Julian; Pflugfelder, Roman
    DOI: 10.3217/978-3-85125-752-6-23
    Object detection in natural environments is still a very challenging task, even though deep learning has brought a tremendous improvement in performance over the last years. A fundamental problem of object detection based on deep learning is that neither the training data nor the suggested models are intended for the challenge of fragmented occlusion. Fragmented occlusion is much more challenging than ordinary partial occlusion and occurs frequently in natural environments such as forests. A motivating example of fragmented occlusion is object detection through foliage, which is an essential requirement in green border surveillance. This paper presents an analysis of state-of-the-art detectors with imagery of green borders and proposes to train Mask R-CNN on new training data which explicitly captures the problem of fragmented occlusion. The results show clear improvements of Mask R-CNN with this new training strategy (also against other detectors) for data showing slight fragmented occlusion.
  • A Centerline-Guided Approach for Aorta and Stent-Graft Segmentation
    Authors: Sabrowsky-Hirsch, Bertram; Thumfart, Stefan; Hofer, Richard; Fenz, Wolfgang; Schmit, Pierre; Fellner, Franz
    DOI: 10.3217/978-3-85125-752-6-24
    Monitoring of patients after endovascular aortic repair (EVAR) is a clinical necessity due to the high re-intervention rate associated with the treatment. The risk assessment could be greatly enhanced by the inclusion of metrics based on the aortic blood flow and stent-graft changes. A preliminary step to this endeavour is, however, the automatic reconstruction of the relevant structures: the aortic blood lumen and the stent-graft wire frame. In this paper we present a centerline-guided approach that leverages knowledge about the target structures through a combination of two 3D U-Nets for efficient automated segmentation of both structures. We evaluate our approach on a real-world clinical dataset, yielding Dice similarity coefficients of 0.942 and 0.841 for the blood lumen and stent-graft metal wire, respectively.
  • Image Synthesis in SO(3) by Learning Equivariant Feature Spaces
    Authors: Peer, Marco; Thalhammer, Stefan; Vincze, Markus
    DOI: 10.3217/978-3-85125-752-6-25
    Equivariance is a desired property for feature spaces designed to make transformations between samples, such as object views, predictable. Encoding this property in two-dimensional feature spaces for 3D transformations is beneficial for tasks such as image synthesis and object pose refinement. We propose the Trilinear Interpolation Layer that applies SO(3) transformations to the bottleneck feature map of an encoder-decoder network. By employing a 3D grid to trilinearly interpolate in the feature map we create models suited for view synthesis with three degrees of rotational freedom. We quantitatively and qualitatively evaluate on image synthesis in SO(3), providing evidence of the suitability of our approach.
  • Frame Border Detection for Digitized Historical Footage
    Authors: Helm, Daniel; Pointner, Bernhard; Kampel, Martin
    DOI: 10.3217/978-3-85125-752-6-26
    Automatic video analysis of digitized historical analog films is influenced by video quality, composition and scan artifacts called overscanning. This paper provides a first pipeline to crop the main frame window by detecting sprocket holes and interpreting the geometric hole layout to distinguish between two different film reel types (16mm and 9.5mm). To this end, a heuristic approach based on histogram features is explored. Finally, our results demonstrate a first baseline for future research.
  • Highly Accurate Binary Image Segmentation for Cars
    Authors: Heitzinger, Thomas; Kampel, Martin
    DOI: 10.3217/978-3-85125-752-6-27
    We study methods for the generation of highly accurate binary segmentation masks with application to images of cars. The goal is the automated separation of cars from their background. A fully convolutional network (FCN) based on the UNet architecture is trained on a private dataset consisting of over 7000 samples. The main contributions of the paper include a series of modifications to common loss functions as well as the introduction of a novel Gradient Loss that outperforms standard approaches. In a specialized postprocessing step the generated masks are further refined to better match the inherent curvature bias typically found in the outline of cars. In direct comparison to previous implementations our method reduces the segmentation error measured by the Jaccard index by over 65%.
  • Powder Bed Analysis in Additive Manufacturing Using Image Processing
    Authors: Recla, Florian; Welk, Martin
    DOI: 10.3217/978-3-85125-752-6-28
    Systems for additive manufacturing are experiencing an enormous upswing in the industry. In this paper a method for the optical control of powder beds is presented. The system is based on a camera and directional lighting and is suitable for detecting two types of defects, including (i) areas where too little/too much powder has been applied, and (ii) areas with different porosity. The system is evaluated for both types of errors.
  • Grasping Point Prediction in Cluttered Environment using Automatically Labeled Data
    Authors: Ainetter, Stefan; Fraundorfer, Friedrich
    DOI: 10.3217/978-3-85125-752-6-29
    We propose a method to automatically generate high quality ground truth annotations for grasping point prediction and show the usefulness of these annotations by training a deep neural network to predict grasping candidates for objects in a cluttered environment. First, we acquire sequences of RGBD images of a real world picking scenario and leverage the sequential depth information to extract labels for grasping point prediction. Afterwards, we train a deep neural network to predict grasping points, establishing a fully automatic pipeline from acquiring data to a trained network without the need of human annotators. We show in our experiments that our network trained with automatically generated labels delivers high quality results for predicting grasping candidates, on par with a trained network which uses human annotated data. This work lowers the cost/complexity of creating specific datasets for grasping and makes it easy to expand the existing dataset without additional effort.
  • The Difficulties of Detecting Deformable Objects Using Deep Neural Networks
    Authors: Djukic, Nikola; Kropatsch, Walter G.; Vincze, Markus
    DOI: 10.3217/978-3-85125-752-6-30
    Object detectors based on deep neural networks have revolutionized the way we look for objects in an image, outperforming traditional image processing techniques. These detectors are often trained on huge datasets of labelled images and are used to detect objects of different classes. We explore how they perform at detecting custom objects and show how shape and deformability of an object affect the detection performance. We propose an automated method for synthesizing the training images and target the real-time scenario using YOLOv3 as the baseline for object detection. We show that rigid objects have a high chance of being detected with an AP (average precision) of 87.38%. Slightly deformable objects like scissors and headphones show a drop in detection performance with precision averaging at 49.54%. Highly deformable objects like a chain or earphones show an even further drop in AP to 26.58%.
  • Border Propagation: A Novel Approach To Determine Slope Region Decompositions
    Authors: Bogner, Florian; Palmrich, Alexander; Kropatsch, Walter G.
    DOI: 10.3217/978-3-85125-752-6-31
    Slope regions are a useful tool in pattern recognition. We review theory about slope regions and prove a theorem linking monotonic paths and the connectedness of level sets. Unexpected behavior of slope regions in higher dimensions is illustrated by two examples. We introduce the border propagation (BP) algorithm, which decomposes a d-dimensional array (d ∈ ℕ) of scalar values into slope regions. It is novel in that it handles data with more than two dimensions.
  • How High is the Tide? Estimation of Flood Level from Social Media
    Authors: Strebl, Julia; Slijepcevic, Djordje; Kirchknopf, Armin; Sakeena, Muntaha; Seidl, Markus
    DOI: 10.3217/978-3-85125-752-6-32
    The availability of social media data represents an opportunity to automatically detect and assess disasters to better guide emergency forces. We propose a method for flood level estimation from user-generated images to support assessing the severity of flooding events. Furthermore, we provide labeled data for water detection. Results on a public benchmark dataset are promising and motivate further research.
  • Real-World Video Restoration using Noise2Noise
    Authors: Zach, Martin; Kobler, Erich
    DOI: 10.3217/978-3-85125-752-6-33
    Restoration of real-world analog video is a challenging task due to the presence of very heterogeneous defects. These defects are hard to model, such that creating training data synthetically is infeasible and instead time-consuming manual editing is required. In this work we explore whether reasonable restoration models can be learned from data without explicitly modeling the defects or manual editing. We adopt Noise2Noise techniques, which eliminate the need for ground truth targets by replacing them with corrupted instances. To compensate for temporal mismatches between the frames and ensure meaningful training, we apply motion correction. Our experiments show that video restoration can be learned using only corrupted frames, with performance exceeding that of conventional learning.
  • Asymptotic Analysis of Bivariate Half-Space Median Filtering
    Authors: Welk, Martin
    DOI: 10.3217/978-3-85125-752-6-34
    Median filtering is well established in signal and image processing as an efficient and robust denoising filter with favourable edge-preserving properties, capable of removing some types of heavy-tailed noise such as impulse noise. For multichannel images such as colour images, flow fields or diffusion tensor fields, multivariate median filters have been considered in the literature. Whereas the L1 median filter so far dominates in image processing applications, other multivariate concepts from statistics may be used, such as the half-space median, which is the focus of this work. In the understanding of discrete image filters, a central question is always how these relate to the space-continuous physical reality underlying discrete images. For the univariate median filter, a milestone in answering this question is an asymptotic approximation result that links median filtering to the mean curvature motion evolution. We will present an analogous result for half-space median filtering in the bivariate (two-channel) case, which contributes to the theoretical understanding of multivariate median filtering and provides the basis for further generalisations in future work.
  • 360° monitoring for robots using Time-of-Flight sensors
    Authors: Maier, Thomas; Hasenberger, Birgit
    DOI: 10.3217/978-3-85125-752-6-35
    In this paper, we present a system based on multiple Time-of-Flight (ToF) 3D sensors paired with a central processing hub for integration into robots or mobile machines. This system can produce a 360° view from the robot's perspective and enables tasks ranging from navigation and obstacle avoidance to human-robot collaboration.
  • Towards Identification of Incorrectly Segmented OCT Scans
    Authors: Renner, Verena; Hladůvka, Jiří
    DOI: 10.3217/978-3-85125-752-6-36
    Precise thickness measurements of retinal layers are crucial to decide whether the subject requires subsequent treatment. As optical coherence tomography (OCT) is becoming a standard imaging method in hospitals, the number of retinal scans increases rapidly, automated segmentation algorithms are being deployed, and methods to assess their performance are in demand. In this work we propose a semi-supervised framework to detect incorrectly segmented OCT retina scans: ground-truth segmentations are (1) embedded in 2D feature space and (2) used to train an outlier scoring function and the corresponding decision boundary. We evaluate a selection of five outlier detection methods and find the results to be a promising starting point to address the given problem. While this work and its results are centred around one concrete segmentation algorithm, we sketch how the framework can be generalized to more recent or more precise segmentation methods.
  • Evaluating Counter Measures against SIFT Keypoint Forensics
    Authors: Salman, Muhammad; Uhl, Andreas
    DOI: 10.3217/978-3-85125-752-6-37
    Forensic analysis is used to detect image forgeries, e.g. copy-move forgery and object removal forgery. Counter-forensic techniques (also known as anti-forensic techniques: methods to fool the forensic analyst by concealing traces of manipulation) have become popular in the game of cat and mouse between the analyst and the attacker. This paper analyses methods to counter forensic techniques based on SIFT keypoints, with particular emphasis on keypoint removal in the context of copy-move forgery detection. Local smoothing is suggested in this paper and turns out to be a highly attractive alternative to the techniques investigated in the literature so far.
  • Automated Generation of 3D Garments in Different Sizes from a Single Scan
    Authors: Hauswiesner, Stefan; Grasmug, Philipp
    DOI: 10.3217/978-3-85125-752-6-38
    We describe a method to generate additional sizes of a garment from a single scanned size and grading tables. The method helps retailers and manufacturers to efficiently capture their entire product range, which in turn enables advanced AR applications such as virtual fashion try-on.
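
Several abstracts in this list report overlap metrics (mean intersection over union, Dice similarity coefficient, Jaccard index) and one reports a relative reduction. As a reader aid, the following is a minimal Python sketch of how such figures are commonly computed for binary masks. It is not taken from any of the listed papers; the function names `jaccard`, `dice` and `relative_reduction` and the toy masks are assumptions made purely for illustration.

```python
def jaccard(pred, truth):
    """Jaccard index (intersection over union) of two flat binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    # Two empty masks are treated as a perfect match (a convention, not a rule).
    return inter / union if union else 1.0


def dice(pred, truth):
    """Dice similarity coefficient of two flat binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    return 2 * inter / total if total else 1.0


def relative_reduction(before, after):
    """Fraction by which a quantity shrinks when going from `before` to `after`."""
    return (before - after) / before


# Hypothetical toy masks: 2 pixels overlap, 4 pixels are set in at least one mask.
pred = [1, 1, 1, 0, 0, 0]
truth = [0, 1, 1, 1, 0, 0]

j = jaccard(pred, truth)
d = dice(pred, truth)

# The two metrics are related by Dice = 2J / (1 + J).
assert abs(d - 2 * j / (1 + j)) < 1e-12

# Inconsistency figures quoted in the frame-to-frame consistency abstract.
print(round(relative_reduction(4.5, 1.3) * 100, 1))  # prints 71.1
```

Because Dice and Jaccard are monotonically related, rankings of methods agree under either metric, but the numeric values are not directly comparable across abstracts that use different ones.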