Bachelorarbeiten (Bachelor's Theses)
Evaluation of Inverse Rendering using Multi-View RGB-D data
Description
The core idea is to estimate the illumination in a pre-defined scene (digital twin) and to adapt the appearance of moving objects in the simulation accordingly.
In this work, the student would have to:
- Create a sensor setup
- Perform inverse rendering
- Demonstrate lighting changes in the room
- Estimate novel views
Prerequisites
Preferred:
- Experience with Git
- Python (PyTorch)
- Nvidia Omniverse
Contact
driton.salihu@tum.de
Supervisor:
Multiband Evaluation of Passive Signals for Human Activity Recognition
CSI; HAR; AI
Collect samples for human activity recognition.
Description
The student must use an RF system to collect samples of different activities.
The student will then implement classification algorithms to distinguish different activities from the CSI, either directly or via time-frequency (T-F) transforms.
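To make the T-F idea concrete, here is a minimal sketch of one possible baseline, assuming a per-window CSI amplitude array `csi` (one subcarrier, fixed window length, sampled at fs Hz) and integer activity labels `y`; all names are illustrative:

```python
# Minimal baseline sketch: magnitude spectrograms of CSI windows as features
# for an SVM activity classifier. `csi` has shape (n_windows, n_samples).
import numpy as np
from scipy.signal import stft
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def tf_features(csi, fs=100):
    feats = []
    for window in csi:
        f, t, Z = stft(window, fs=fs, nperseg=64)  # time-frequency transform
        feats.append(np.abs(Z).ravel())            # magnitude spectrogram
    return np.asarray(feats)

X = tf_features(csi)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = SVC(kernel="rbf").fit(X_tr, y_tr)
print("accuracy:", clf.score(X_te, y_te))
```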
Prerequisites
Contact
fabian.seguel@tum.de
Office 2940
Supervisor:
Vital Sign Monitoring Using Multi-resolution Analysis and Machine Learning
CSI; HAR; AI
Collect samples for vital sign monitoring.
Description
The student must use a radar system to obtain the vital signs of a patient.
The system must be embedded in a hospital bed.
Vital signs such as breathing rate and heart rate (HR) will be targeted; other applications can be discussed.
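As a rough illustration of the signal processing involved, the following sketch estimates the breathing rate as the dominant spectral peak of a radar displacement signal; the sampling rate, frequency band, and variable names are assumptions:

```python
# Sketch: estimate breathing rate from an unwrapped radar phase (chest
# displacement) signal sampled at fs Hz, as the dominant spectral peak in a
# plausible breathing band.
import numpy as np

def breathing_rate_bpm(phase, fs, band=(0.1, 0.5)):
    x = phase - np.mean(phase)                       # remove DC component
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    mask = (freqs >= band[0]) & (freqs <= band[1])   # breathing band in Hz
    peak = freqs[mask][np.argmax(spectrum[mask])]    # dominant frequency
    return 60.0 * peak                               # breaths per minute
```

Heart rate could be estimated analogously in a band of roughly 0.8-3 Hz, and multi-resolution (e.g., wavelet-based) analysis can replace the plain FFT for non-stationary signals.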
Prerequisites
Contact
fabian.seguel@tum.de
Office 2940
Supervisor:
Learning-based human-robot shared autonomy
robot learning, shared control
Description
In shared-control teleoperation, robot intelligence and human input can be blended to improve task performance and reduce the human workload. In this topic, we would like to investigate how to combine human input and robot intelligence effectively so as to eventually reach full robot autonomy. We will employ robot learning from demonstration approaches, where task demonstrations are provided via teleoperation.
We aim to test the developed algorithms in simulation and on a Franka Emika robot arm.
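The simplest form of such blending is linear arbitration between the two command signals; a minimal sketch with illustrative names, where alpha = 1 corresponds to full robot autonomy:

```python
# Sketch of linear arbitration in shared control: the executed command is a
# blend of the human's input and the robot policy's output.
import numpy as np

def blend_commands(u_human, u_robot, alpha):
    """alpha in [0, 1] is the authority given to the robot."""
    return (1.0 - alpha) * np.asarray(u_human) + alpha * np.asarray(u_robot)

# Example: mostly-autonomous behavior for a 3-DoF velocity command.
u = blend_commands([0.10, 0.00, 0.05], [0.12, 0.02, 0.00], alpha=0.8)
```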
Prerequisites
- Basic experience in C/C++
- ROS is a plus
- High motivation to learn and conduct research
Supervisor:
Maximizing Success on the Streaming Platform YouTube
Description
...
Supervisor:
Optimization of Saliency Map Creation
Saliency maps, deep learning, computer vision
Description
Saliency maps can be interpreted as probability maps that assess a scene's attractiveness and highlight the regions a user is likely to look at. The objective of this thesis is to help create a novel dataset that records the head motions and gaze directions of participants watching 360° videos with varying scene dynamics. This dataset is then to be used to improve state-of-the-art saliency map creation algorithms and make them soft real-time capable. Deep learning has proven to be a robust technique for creating saliency maps. The student is expected either to use pruning techniques to boost the performance of state-of-the-art methods or to develop a new approach that delivers a trade-off between accuracy and computational complexity.
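As a pointer to what the pruning route could look like, here is a minimal sketch using PyTorch's built-in pruning utilities; the model and the 30% sparsity level are illustrative assumptions:

```python
# Sketch: L1-magnitude pruning of all Conv2d layers of a saliency network,
# one possible route toward soft real-time capability.
import torch.nn as nn
import torch.nn.utils.prune as prune

def prune_conv_layers(model: nn.Module, amount: float = 0.3) -> nn.Module:
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            prune.l1_unstructured(module, name="weight", amount=amount)
            prune.remove(module, "weight")  # make the pruning permanent
    return model
```

Note that unstructured pruning alone mainly reduces model size; actual speed-ups usually require structured pruning or sparse inference kernels.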
Prerequisites
Computer Vision, Machine Learning, C++, Python
Supervisor:
Masterarbeiten (Master's Theses)
Video Coding of Natural Video Sequences Using Estimated Auxiliary Data
video coding, codec, JVET
Description
As part of the AhG15 (gaming content compression) activity at JVET, additional data is available for various sequences: depth maps, motion vectors, camera parameters, and trajectories. These were directly extracted from a gaming engine, ensuring their superior quality. AhG15 aims to investigate how this data can be leveraged for the compression of gaming content.
In this master’s thesis, we want to explore whether similar data for regular sequences can also be estimated and utilized for coding. Auxiliary data can be obtained by leveraging existing approaches, such as SfM and SLAM. The final goal is to achieve higher coding gain (BD-Rate) or faster coding by simplifying the motion search process.
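For reference, BD-rate is commonly computed by fitting the two rate-distortion curves in the log-rate domain and comparing their average rates over the overlapping quality range; a minimal sketch of this standard calculation:

```python
# Sketch of the Bjontegaard delta rate (BD-rate) between a test codec and an
# anchor, given four rate/PSNR points each (the usual JVET setting).
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    p_a = np.polyfit(psnr_anchor, np.log(rate_anchor), 3)  # cubic fits of
    p_t = np.polyfit(psnr_test, np.log(rate_test), 3)      # PSNR -> log(rate)
    lo = max(min(psnr_anchor), min(psnr_test))             # overlap interval
    hi = min(max(psnr_anchor), max(psnr_test))
    avg_a = np.polyval(np.polyint(p_a), hi) - np.polyval(np.polyint(p_a), lo)
    avg_t = np.polyval(np.polyint(p_t), hi) - np.polyval(np.polyint(p_t), lo)
    avg_a, avg_t = avg_a / (hi - lo), avg_t / (hi - lo)    # mean log-rates
    return (np.exp(avg_t - avg_a) - 1.0) * 100.0           # percent rate delta
```

A negative BD-rate means the test configuration needs less bitrate than the anchor at equal quality.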
Supervisors: Dr. Johannes Sauer (Huawei Munich), Hongjie You
Prerequisites
Knowledge of video coding, H.266 (VVC) and H.265 (HEVC)
Coding skills in C++, Python, and shell scripting for working with the JVET AhG verification software
Independent problem-solving skills
Contact
Please get in touch with Hongjie You directly.
Supervisor:
Open-Vocabulary Surface Material Segmentation using Object Category Prior with Vision Language Model
Description
Recognizing the surface material of objects in an environment is essential for signal propagation simulation in a 6G digital twin. Due to the lack of large surface material datasets, the material categories are often limited to specific scenarios. Recently, in the object detection domain, open-vocabulary approaches have become popular as they can detect any object class by leveraging a vision-language model. We would therefore like to bring the open-vocabulary idea to surface material segmentation to make the model independent of a specific dataset.
In our first test, directly applying a CLIP-based open-vocabulary object segmentation method to material segmentation resulted in poor performance. In this master's thesis, we want to explore how to improve this model. First, we need to adapt the mask generation algorithm to the shapes of surface materials. In addition, we need to fine-tune the vision-language model on material-level information. Moreover, we could introduce prior knowledge about the object category into the prompt, e.g., "a wooden surface of a chair." The final goal is to develop a model that can segment objects and surface materials at the same time.
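To illustrate the prompt idea, a minimal zero-shot sketch with the openai/CLIP package is shown below; the material list, crop/label inputs, and prompt template are assumptions to be refined in the thesis:

```python
# Sketch: zero-shot material classification of a segmented region with CLIP,
# injecting the already-detected object category into the prompt as a prior.
import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)
materials = ["wooden", "metallic", "plastic", "glass", "fabric", "concrete"]

def classify_material(crop: Image.Image, obj: str) -> str:
    prompts = [f"a {m} surface of a {obj}" for m in materials]
    image = preprocess(crop).unsqueeze(0).to(device)
    text = clip.tokenize(prompts).to(device)
    with torch.no_grad():
        logits_per_image, _ = model(image, text)   # image-text similarities
    return materials[logits_per_image.softmax(dim=-1).argmax().item()]
```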
References:
F. Liang et al., Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP, CVPR 2023, https://jeff-liangf.github.io/projects/ovseg/
P. Upchurch et al., A Dense Material Segmentation Dataset for Indoor and Outdoor Scene Parsing, ECCV 2022, https://github.com/apple/ml-dms-dataset
Prerequisites
Basic programming knowledge is required, preferably in Python.
Experience with PyTorch and popular object detection models (YOLOv8, Detectron2, ...) is a plus.
Interest in Computer Vision and Multimodal Models.
Contact
zhifan.ni@tum.de
Supervisor:
3D Person and Object Detection Using Multiview Cameras for Safety Critical Scenarios
Description
For this thesis, the student will focus on the 3D localization of objects and people in safety-critical scenarios, such as when people are positioned near cageless robots. The research will leverage multiview detection, using multiple RGB cameras that capture the scene from various angles to enable precise 3D localization. The thesis will analyze multiview detection model performance with respect to factors such as camera count and coverage area. It will also examine the effects of abnormalities, including pose variations and occlusions, through performance tests conducted on an in-house dataset. If significant performance gaps are identified, additional strategies for enhancing system reliability in these safety-critical environments will be investigated.
Supervisor:
Validation of Pose Estimation Algorithm with Synthetic Data Generated in Game Engines
Description
The aim of this thesis is to examine whether synthetic test data generated with game engines provide the necessary level of detail and realism to be meaningfully utilized in the context of a critical evaluation of algorithms, such as those used for body pose estimation.
Your tasks:
- Research on state-of-the-art pose estimation algorithms
- Creation of virtual test scenarios relevant to occupant safety in Unreal Engine 5
- Generation of photorealistic synthetic test data including ground truth
- Quantitative investigation/validation of an established pose estimation algorithm using the above test data
Contact
Zhifan Ni (zhifan.ni@tum.de)
Supervisor:
Diffusion Model-based Imitation Learning for Robot Manipulation Task
Description
Diffusion models are powerful generative models behind many successful applications, such as image, video, and 3D generation from text. Inspired by non-equilibrium thermodynamics, they define a Markov chain of diffusion steps that slowly adds random noise to the data, and then learn to reverse the diffusion process to construct the desired data samples from noise.
In this work, we aim to explore the application of diffusion models or their variants to imitation learning and to evaluate them on a real Franka robot arm.
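For orientation, the forward process has a closed form that makes training simple: with a variance schedule beta_t and alpha_bar_t the cumulative product of (1 - beta_t), a noisy sample is x_t = sqrt(alpha_bar_t) x_0 + sqrt(1 - alpha_bar_t) eps, and the network is trained to predict eps. A minimal sketch (the model interface is an assumption):

```python
# Sketch of the DDPM forward noising step and noise-prediction loss.
import torch
import torch.nn.functional as F

T = 1000
betas = torch.linspace(1e-4, 0.02, T)               # linear variance schedule
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)  # alpha_bar_t

def diffusion_loss(model, x0):
    t = torch.randint(0, T, (x0.shape[0],))                  # random timesteps
    a_bar = alphas_cumprod[t].view(-1, *([1] * (x0.dim() - 1)))
    eps = torch.randn_like(x0)                               # injected noise
    xt = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * eps      # q(x_t | x_0)
    return F.mse_loss(model(xt, t), eps)                     # predict the noise
```

In diffusion-policy-style imitation learning, x_0 would be an action sequence conditioned on observations rather than an image.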
Prerequisites
- Good Programming Skills (Python, C++)
- Knowledge about Ubuntu/Linux/ROS
- Motivation to learn and conduct research
Contact
dong.yang@tum.de
(Please attach your CV and transcript.)
Supervisor:
Real-time registration of noisy, incomplete and partially occluded 3D point clouds
Description
This topic is about the registration of 3D point clouds belonging to specific objects in the scene, rather than about registering different point clouds of the scene itself.
State-of-the-art (SOTA) point cloud registration models/algorithms should first be reviewed, and promising candidates should be selected for evaluation based on the criteria listed below.
- The method must work in real time (at least 25 frames per second) for at least 5 different objects at the same time.
- The method must be robust to noise in the point clouds, which come from an Intel RealSense D435 RGB-D camera.
- The method must be able to robustly track the objects of interest even if they are partially occluded by other objects.
The best-suited method must then be extended or improved in a novel way, or a completely novel method should be developed.
Both classical and deep-learning-based methods must be considered; a classical baseline sketch is shown below.
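As a classical starting point for the evaluation, point-to-plane ICP as implemented in Open3D could serve as a baseline; a minimal sketch, with the 2 cm correspondence threshold being an illustrative value for D435 data:

```python
# Sketch: point-to-plane ICP registration of two object point clouds.
import numpy as np
import open3d as o3d

def register(source, target, threshold=0.02, init=np.eye(4)):
    for pcd in (source, target):
        pcd.estimate_normals(
            o3d.geometry.KDTreeSearchParamHybrid(radius=0.05, max_nn=30))
    result = o3d.pipelines.registration.registration_icp(
        source, target, threshold, init,
        o3d.pipelines.registration.TransformationEstimationPointToPlane())
    return result.transformation  # 4x4 pose of source in the target frame
```

ICP alone will not satisfy the robustness criteria above, which is exactly why the learned methods in the related work are of interest.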
Related work:
- DeepGMR: https://github.com/wentaoyuan/deepgmr
- 3D Object Tracking with Transformer: https://github.com/3bobo/lttr
Prerequisites
- First experience with 3D data processing / computer vision
- Python programming, ideally also familiarity with C++
- Familiarity with Linux and the command line
Supervisor:
Learning 3D skeleton animations of animals from videos
Description
Under this topic, the student should investigate how to learn 3D skeleton animations of animals from videos. The 2D skeleton should first be extracted automatically from a video. A state-of-the-art 3D animal shape-and-pose model (SMAL; see references below) should then be fitted to the skeleton.
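The fitting step usually amounts to minimizing a 2D reprojection error over the model's pose and shape parameters; a minimal sketch, where `smal_model` (returning 3D joints) and `project` (the camera model) are placeholders for the actual SMAL implementation:

```python
# Sketch: fit pose/shape parameters so projected 3D joints match the detected
# 2D skeleton, weighted by per-keypoint confidences `conf`.
import torch

def fit(smal_model, project, keypoints_2d, conf, n_iters=500):
    pose = torch.zeros(1, 105, requires_grad=True)   # SMAL pose (35 joints x 3)
    shape = torch.zeros(1, 41, requires_grad=True)   # SMAL shape coefficients
    opt = torch.optim.Adam([pose, shape], lr=0.01)
    for _ in range(n_iters):
        opt.zero_grad()
        err = project(smal_model(pose, shape)) - keypoints_2d  # 2D residuals
        loss = (conf * err.pow(2).sum(dim=-1)).mean()
        loss = loss + 1e-3 * shape.pow(2).sum()      # simple shape prior
        loss.backward()
        opt.step()
    return pose, shape
```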
References
- https://smal.is.tue.mpg.de/index.html
- https://smalr.is.tue.mpg.de/
- https://github.com/silviazuffi/smalr_online
- https://github.com/silviazuffi/gloss_skeleton
- https://github.com/silviazuffi/smalst
- https://github.com/benjiebob/SMALify
- https://github.com/benjiebob/SMALViewer
- https://bmvc2022.mpi-inf.mpg.de/0848.pdf
Dataset
- https://research.google.com/youtube8m/explore.html
- https://youtube-vos.org/dataset/vos/
- https://data.vision.ee.ethz.ch/cvl/youtube-objects/
- https://blog.roboflow.com/youtube-video-computer-vision/
- https://github.com/gtoderici/sports-1m-dataset/ (this dataset seems to provide raw videos from YT)
- https://github.com/pandorgan/APT-36K
- https://calvin-vision.net/datasets/tigdog/: contains all the videos, the behavior labels, the landmarks, and the segmentation masks for all three object classes (dog, horse, tiger)
- https://github.com/hellock/WLD (raw videos)
- https://sutdcv.github.io/Animal-Kingdom/
- https://sites.google.com/view/animal-pose/
Prerequisites
- Background in computer vision, optimization techniques, and deep learning
- Python programming
Supervisor:
Real-time Multi-View Visual SLAM
Description
How can a SLAM system utilize a multi-camera rig efficiently, robustly, and quickly?
Supervisor:
Deep Learning models for zero-shot object detection and segmentation
Description
In the world of computer vision, data labeling holds immense significance for training powerful machine learning models. Accurate annotations provide the foundation for teaching algorithms to understand visual information effectively. However, data labeling in computer vision poses unique challenges, including the complexity of visual data, the need for precise annotations, and handling large-scale datasets. Overcoming these challenges is crucial for enabling computer vision systems to extract valuable insights, identify objects, and revolutionize a wide range of industries.
Therefore, the development of automatic annotation pipelines for 2D and 3D labeling in various tasks is crucial, leveraging recent advancements in computer vision to enable automatic, efficient and accurate labeling of visual data.
This master's thesis will focus on automatically labeling images and videos, specifically generating 2D/3D labels (i.e., 2D/3D bounding boxes and segmentation masks). The automatic labeling pipeline has to generalize to any type of images and videos, such as household objects, toys, and indoor/outdoor environments.
The automatic labeling pipeline will be developed based on zero-shot detection and segmentation models such as GroundingDINO and Segment Anything, in addition to similar methods (see Awesome Segment Anything). Additionally, the labeling pipeline, including the used models, will be implemented in the autodistill code base, and the performance will be tested by training and evaluating smaller target models for specific tasks.
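A rough sketch of how that autodistill workflow could look is shown below; the package and class names follow the autodistill documentation, but the exact signatures should be treated as assumptions to verify against the current code base:

```python
# Sketch: a zero-shot base model (GroundedSAM) labels raw images, then a small
# target model (YOLOv8) is distilled from the resulting dataset.
from autodistill.detection import CaptionOntology
from autodistill_grounded_sam import GroundedSAM
from autodistill_yolov8 import YOLOv8

# Map natural-language prompts to the class names used in the labels.
ontology = CaptionOntology({"toy car": "car", "coffee mug": "mug"})

base_model = GroundedSAM(ontology=ontology)
base_model.label(input_folder="./images", output_folder="./dataset")

target_model = YOLOv8("yolov8n.pt")
target_model.train("./dataset/data.yaml", epochs=50)
```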
Sub-tasks:
- Automatic generation of 2D labels for images and videos, such as 2D bounding boxes and segmentation masks (see Grounded-Segment-Anything, segment-any-moving, Segment-and-Track-Anything).
- Automatic generation of 3D labels for images and videos, such as 3D bounding boxes and segmentation masks (see 3D-Box-Segment-Anything, SegmentAnything3D, segment-any-moving, Segment-and-Track-Anything).
- Implement a 2D/3D labeling tool to modify and improve the automatic 2D/3D labels (see DLTA-AI).
- The automatic labeling pipeline, in addition to the used base models and some target models, has to be implemented in the autodistill code base to enable easy end-to-end labeling, training, and deployment for various tasks such as 2D/3D object detection and segmentation.
- Comprehensive overview of the performance and limitations of current zero-shot models for automatic labeling in tasks such as 2D/3D object detection and segmentation.
- Suggestions for future work to overcome the limitations of the used methods.
Bonus tasks:
- Adding image augmentation and editing methods to the labeling pipeline and tool to generate more data (see EditAnything).
- Implement one-shot labeling methods to generate labels for unique objects (see Personalize-SAM and Matcher).
Prerequisites
Interest and first experience in computer vision, deep learning, Python programming, and 3D data.
Supervisor:
iOS app for tracking objects using RGB and depth data
Description
This topic is about the development of an iPhone app for tracking objects in the environment using data from the device's RGB and depth sensors.
Prerequisites
- Good programming experience with C++ and Python
- Ideally, experience building iOS apps with Swift and/or Unity AR Foundation
- This topic is only suitable for you if you have a recent personal Mac development device (ideally at least a MacBook Pro with Apple Silicon M1) and at least an iPhone 12 Pro with a LiDAR depth sensor
Supervisor:
Hand Pose Estimation Using Multi-View RGB-D Sequences
Hand Object Interaction, Pose Estimation, Deep Learning
Description
In this project, the task is to fit a parametric hand mesh model and a set of rigid objects to sequences from multiple RGB-D cameras. Models for hand keypoint detection and for 6-DoF pose estimation of rigid objects have evolved significantly in recent years. Our goal is to utilize such models to estimate the hand and object poses.
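Since the cameras are calibrated, per-view 2D keypoints can be lifted to 3D by triangulation before (or while) fitting the hand model; a minimal DLT sketch with illustrative variable names:

```python
# Sketch: direct linear transform (DLT) triangulation of one keypoint from
# several calibrated views. `proj_mats` are 3x4 projection matrices and
# `pts_2d` the corresponding pixel detections.
import numpy as np

def triangulate(proj_mats, pts_2d):
    rows = []
    for P, (u, v) in zip(proj_mats, pts_2d):
        rows.append(u * P[2] - P[0])   # each view adds two linear
        rows.append(v * P[2] - P[1])   # constraints on the 3D point
    _, _, vt = np.linalg.svd(np.stack(rows))
    X = vt[-1]                         # null-space solution (homogeneous)
    return X[:3] / X[3]
```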
Related Work
- https://dex-ycb.github.io/
- https://www.tugraz.at/institute/icg/research/team-lepetit/research-projects/hand-object-3d-pose-annotation/
- https://github.com/hassony2/obman
- https://github.com/ylabbe/cosypose
Prerequisites
- Knowledge of computer vision
- Experience with segmentation models (e.g., Detectron2)
- Experience with the deep learning frameworks PyTorch or TensorFlow (2.x)
- Experience with PyTorch3D is a plus
Contact
marsil.zakour@tum.de
Supervisor:
Attentive observation using intensity-assisted segmentation for SLAM in a dynamic environment
SLAM, ROS, Deep Learning, Segmentation
Description
Attentive observation using intensity-assisted segmentation for SLAM in a dynamic environment.
Supervisor:
Illumination of Augmented Reality Content using a Digital Environment Twin
Description
...
Supervisor:
Classification of Wafer Patterns Using Machine Learning Methods to Automate Pattern Recognition
Description
...
Supervisor:
Solid-State LiDAR and Stereo-Camera based SLAM for unstructured planetary-like environments
Solid-State LiDAR; Stereo-Camera; SLAM
Description
New developments in solid-state LiDAR technology open the possibility of integrating range sensors into potentially space-qualifiable perception setups, thanks to mechanical designs with fewer movable parts. The development of a hybrid stereo-camera/LiDAR sensor setup might thereby overcome the disadvantages each technology comes with, such as the limited range of stereo camera setups or the minimum range LiDARs require. This thesis investigates the possibilities of such a new solid-state LiDAR by incorporating it, along with a stereo camera setup and an IMU sensor, into a SLAM system. Foreseen activities might include, but are not limited to: the design and construction of a portable/handheld sensor setup for recording and testing in planetary-like environments, extrinsic calibration of the sensors, integration into a software pipeline, development of a ROS interface, and preliminary mapping tests.
Supervisor:
Deep Predictive Attention Controller for LiDAR-Inertial localization and mapping
SLAM, Sensor Fusion, Deep Learning
Description
Multidimensional sensory data is computationally expensive to process for localization algorithms in autonomous drone navigation. Research shows that not all sensory data is equally important throughout the SLAM process for producing reliable output. An attention control scheme is one effective way to filter out the most valuable sensory data for such a system. A predictive attention model, for instance, can improve the result of sensor fusion algorithms by concentrating on the most valuable sensory data based on the dynamics of the vehicle motion or a semantic understanding of the environment. The aim of this work is to investigate state-of-the-art attention control models that can be adapted to a multidimensional sensory data acquisition system and to compare them across different modalities.
Prerequisites
- Strong background in Python and C++ programming
- Solid background in robot control theory
- Familiarity with deep learning frameworks (TensorFlow)
- Familiarity with the Robot Operating System (ROS)
Contact
leox.karimi@tum.de
Supervisor:
Model based Collision Identification for Real-Time Jaco2 Robot Manipulation
ROS, Haptics, Teleoperation, Jaco2
Description
With the advancement of robotics and of communication networks such as 5G, telemedicine has become a critical application for remote diagnosis and treatment.
In this project, we want to perform robotic teleoperation using a Sigma 7 haptic master and a Jaco 2 robotic manipulator.
Tasks:
- State of the art review and mathematical modeling
- Jaco2 haptic controller implementation
- Fault-tolerant (delay, network disconnect) controller design
- System evaluation with external force-torque sensor
Prerequisites
- Strong background in C++ programming
- Solid background in control theory
- Familiarity with robot dynamics and kinematics
- Familiarity with the Robot Operating System (ROS) and ROS Control (optional)
Contact
edwin.babaians@tum.de
Supervisor:
Interdisziplinäre Projekte (Interdisciplinary Projects)
Extension of an Open-source Autonomous Driving Simulation for German Autobahn Scenarios
Description
This work can be done in German or English in a team of 2-4 members.
Self-driving cars need to be safe in their interaction with other road users such as motorists, cyclists, and pedestrians. But how can car manufacturers ensure that their self-driving cars are safe around us humans? The only realistic and economical way to test this is simulation.
cogniBIT is a Munich-based startup founded by alumni of TUM and LMU that provides realistic models of all kinds of road users. These models are based on state-of-the-art neurocognitive and sensorimotor research and reproduce human perception, cognition, and action with all their limitations.
In this project the objective is to extend the open-source simulator CARLA (www.carla.org) such that German Autobahn-like scenarios can be simulated.
Tasks:
• Design an Autobahn scenario using the road description format OpenDRIVE.
• Adapt the CARLA OpenDRIVE standalone mode (requires C++ knowledge).
• Design an environment for the scenario using the Unreal Engine 4 Editor.
• Perform a simulation-based experiment using the German Autobahn scenario and the cogniBIT driver model.
Prerequisites
• C++ knowledge
• Experience with Python is helpful
• Experience with the UE4 editor is helpful
• Interest in autonomous driving and cognitive models
Supervisor:
Forschungspraxis (Research Internships)
DT-based Human-robot Teleoperation with Haptic Codecs Standard
Digital Twin, Teleoperation, Haptic Codecs Standard
Our project aims to build a DT-based human-robot teleoperation system that follows the haptic codecs standard and uses multiple sensors under a Linux system.
Description
For the system, the main achievements should be:
1. A complete human-in-the-loop haptic teleoperation system: You will port the teleoperation system, which currently interacts with Unity on Windows, to a Linux system using the Robot Operating System (ROS). You can use Gazebo to create the remote environment. It should contain a robotic arm (the follower device) and an operational platform that simulates a real remote environment. You will use a Phantom device as the leader device to manipulate the virtual robot arm and gather information during the interaction in order to capture environment updates, such as a newly added object, thus building a digital twin (DT) in the virtual environment on the leader side.
2. Multiple sensors for data collection on the follower side: You will use visual and haptic devices to collect environment-update data and reconstruct the environment. Visual information is usually captured using 2D and depth cameras, and haptic information is expressed by the remote position and force feedback.
3. Haptic codecs for data transmission: The transmission of velocity, position, visual, and haptic information needs to follow the haptic codecs standard.
4. Optional function, plug-and-play: When a haptic device is temporarily disconnected and then reconnected, the teleoperation system should automatically restore normal operation, resuming synchronization between both sides. Mechanisms for detecting disconnection and for resuming after reconnection should be designed for both the leader side and the follower side.
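A bare-bones sketch of the leader-follower coupling as a ROS node is shown below; the topic names and the use of PoseStamped messages are illustrative placeholders, not the actual device drivers' interfaces:

```python
#!/usr/bin/env python
# Sketch: forward Phantom (leader) poses to the simulated arm's command topic.
import rospy
from geometry_msgs.msg import PoseStamped

class LeaderFollowerRelay:
    def __init__(self):
        self.pub = rospy.Publisher("/follower/arm/command_pose",
                                   PoseStamped, queue_size=1)
        rospy.Subscriber("/leader/phantom/pose", PoseStamped, self.on_pose)

    def on_pose(self, msg):
        # Workspace scaling / coordinate mapping between the devices goes here.
        self.pub.publish(msg)

if __name__ == "__main__":
    rospy.init_node("leader_follower_relay")
    LeaderFollowerRelay()
    rospy.spin()
```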
Prerequisites
Our requirements (preferred):
- Familiarity with teleoperation systems, Linux systems, and visual and haptic sensors
- A good understanding of ROS (Robot Operating System)
Supervisor:
Refining 3D Hand-Object Reconstruction via Elastomer Model
Description
To model the interaction of a hand and an object, not only is a separate estimation of hand and object required, but the contact between them must also be taken into account. Significant progress has been made in modeling isolated hands and objects from RGB images. However, modeling the contact between a human hand and an object within a single image is challenging because of occlusions. In this work, we propose a method for the 3D reconstruction of hands and objects based on elastomer models. The method simulates the hand-object (HO) contact based on the elastic energy of the elastomer model. At the same time, it imitates the deformation of soft hand tissue using the concept of the elastic modulus, such that a more physically plausible grasp can be formed. In addition, an optimizer is applied to improve the HO interaction under the supervision of ground truth. The whole framework is constructed in an end-to-end manner. Several commonly used benchmarks show that the method leads to better reconstruction results and produces more physically plausible hand and object estimates.
Supervisor:
Surface Material Recognition using Object Category Prior with Vision Language Model
Description
In our 6G digital twin setup, we need to recognize the surface material of objects in the environment. Currently, we are using a YOLOv8 model to perform object segmentation. If we can extend such an object segmentation model to material segmentation tasks, we save computational resources. Of course, we could simply fine-tune the classification head of such an object segmentation model on a surface material dataset. However, this would ignore the object category we have already detected. Another possible approach is to apply a vision-language model, such as CLIP, and use prompts like "a wooden surface of a table" to leverage the prior knowledge of object categories.
In this Forschungspraxis, we will first explore the state-of-the-art works in open-vocabulary semantic segmentation and understand how they deal with segmentation masks. Then, we will adapt such models to material segmentation, which may involve mask adaptation, feature alignment, prompt engineering, etc.
Example: F. Liang et al., Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP, CVPR 2023, https://jeff-liangf.github.io/projects/ovseg/
Prerequisites
Basic programming knowledge is required, preferably in Python.
Experience with PyTorch and popular object detection models (YOLOv8, Detectron2, ...) is a plus.
Interest in Computer Vision and Multimodal Models.
Contact
zhifan.ni@tum.de
Supervisor:
Feature enhancement based human-object detection
Human-object interaction, Feature pre-processing, VAE
Description
Human-object interaction (HOI) detection is currently a popular research topic. It requires spatially localizing human-object interactions in images. However, the current feature extraction stage can be further optimized. This task will explore how to improve the performance of HOI detection, starting from feature extraction and optimization.
Prerequisites
- Computer vision
- Human-object interaction prediction
- Deep Learning
- Transformer
Contact
yuankai.wu@tum.de
Supervisor:
Monocular RGB-based Digital Twin
Description
Using monocular RGB data to reconstruct a 3D indoor environment via CAD-based reconstruction.
Prerequisites
Git, Python, PyTorch
Contact
driton.salihu@tum.de
Supervisor:
Human-robot interaction using vision-based human-object interaction prediction
Human-object interaction, human-robot interaction
Description
We use a vision-based solution to locate the target object for the robot and send the desired object back to the operator, completing the whole human-robot interaction process.
Prerequisites
- Panda arm
- Computer vision
- Human-object interaction prediction
- Grasping
Supervisor:
GAN-based subjective haptic signal quality assessment database augmentation and enlargement methods
Description
In this project, the student will research and implement a novel GAN-based approach for augmenting and enlarging a database for subjective haptic signal quality assessment. Subjective experiments will also be conducted to evaluate the result of the automatic data expansion.
Supervisor:
Inverse Rendering in a Digital Twin for Augmented Reality
Digital Twin, Illumination, HDR
Description
The task is to build an end-to-end pipeline for illumination estimation inside a digital twin.
Finally, an AR application can also be created.
Possible References
[1] https://arxiv.org/pdf/1905.02722.pdf
[2] https://arxiv.org/pdf/1906.07370.pdf
[3] https://arxiv.org/pdf/2011.10687.pdf
Prerequisites
- Python (PyTorch)
- Experience with Git
Contact
driton.salihu@tum.de
Supervisor:
Network Aware Shared Control
Teleoperation, Learning from Demonstration
Description
In this thesis, we would like to make the best out of demonstrations of varying quality. We will test the developed approach in a shared-control setting.
Prerequisites
- Experience in C/C++
- ROS is a plus
- High motivation to learn and conduct research
Supervisor:
Optimization of 3D Object Detection Procedures for Indoor Environments
3D Object Detection, 3D Point Clouds, Digital Twin, Optimization
Description
3D object detection has been a major task for the point cloud-based 3D reconstruction of indoor environments. Current research has focused on achieving low inference times for 3D object detection. While this is preferable, many use cases do not profit from it; in particular, a pre-defined static digital twin for AR and robotics applications reduces the incentive to trade accuracy for low inference time.
This thesis will therefore follow the approach of [1] (in this work based only on point cloud data): generate proposals for the layout and objects of a scene, for example through [2]/[3], and use some form of optimization algorithm (reinforcement learning, a genetic algorithm) to converge to the correct solution; a toy sketch of the latter is shown below.
Furthermore, for more geometrically reasonable results, a relationship graph neural network, as in [4], would be applied in the pipeline.
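To make the optimization idea concrete, the toy sketch below selects a subset of object proposals with a small genetic algorithm; the per-proposal scores and pairwise overlap penalties are placeholders for the outputs of the proposal network:

```python
# Toy sketch: genetic algorithm choosing the proposal subset that maximizes
# summed confidence minus an overlap/collision penalty.
import numpy as np

rng = np.random.default_rng(0)

def fitness(mask, scores, overlap, penalty=1.0):
    sel = mask.astype(bool)
    return scores[sel].sum() - penalty * overlap[np.ix_(sel, sel)].sum()

def select_proposals(scores, overlap, pop=50, gens=100, p_mut=0.05):
    n = len(scores)
    population = rng.integers(0, 2, size=(pop, n))       # random subsets
    for _ in range(gens):
        fit = np.array([fitness(m, scores, overlap) for m in population])
        parents = population[np.argsort(fit)[-pop // 2:]]        # selection
        cut = rng.integers(1, n, size=pop // 2)
        children = np.array([np.concatenate((a[:c], b[c:]))      # crossover
                             for a, b, c in zip(parents, parents[::-1], cut)])
        children ^= (rng.random(children.shape) < p_mut)         # mutation
        population = np.vstack([parents, children])
    best = max(population, key=lambda m: fitness(m, scores, overlap))
    return best.astype(bool)
```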
References
[1] S. Hampali et al., "Monte Carlo Scene Search for 3D Scene Understanding." CVPR 2021: 13799-13808. https://arxiv.org/abs/2103.07969
[2] X. Chen, H. Zhao, G. Zhou, and Y.-Q. Zhang, "PQ-Transformer: Jointly Parsing 3D Objects and Layouts From Point Clouds." IEEE Robotics and Automation Letters 7 (2022): 2519-2526. https://arxiv.org/abs/2109.05566
[3] C. Qi, O. Litany, K. He, and L. J. Guibas, "Deep Hough Voting for 3D Object Detection in Point Clouds." ICCV 2019: 9276-9285. https://arxiv.org/abs/1904.09664
[4] A. Avetisyan, T. Khanova, C. B. Choy, D. Dash, A. Dai, and M. Nießner, "SceneCAD: Predicting Object Alignments and Layouts in RGB-D Scans." arXiv:2003.12622 (2020). https://arxiv.org/abs/2003.12622
Prerequisites
- Python (PyTorch)
- Experience with Git
- Knowledge of working with 3D point clouds (preferable)
- Knowledge of optimization methods (preferable)
Contact
driton.salihu@tum.de
Supervisor:
Learning Temporal Knowledge Graphs with Neural Ordinary Differential Equations
Description
...
Contact
zhen.han@campus.lmu.de
Supervisor:
Sim-to-Real Gap in Liquid Pouring
sim-to-real
Description
We want to investigate the simulation bottlenecks in learning the pouring task and how we can tackle this problem. This project is mostly paper reading; the field of research is skill refinement and domain adaptation. In addition, we will try to implement one of the state-of-the-art methods of teaching by demonstration in order to transfer the simulated skill to the real-world scenario.
Prerequisites
- Creativity
- Motivation
- Strong C++ background
- Strong Python background
Contact
edwin.babaians@tum.de
Supervisor:
"Pouring Liquids" dataset development
Nvidia Flex, Unity3D, Nvidia PhysX 4.0
Using Unity3D and the Nvidia Flex plugin, develop a learning environment and model different fluids for teaching pouring tasks to robots.
Description
The student will model different liquid characteristics using Nvidia Flex and add different containers and a particle collision checking system. In addition, the student will build a ground-truth system for later use in robot teaching.
References:
https://developer.nvidia.com/flex
https://developer.nvidia.com/physx-sdk
Prerequisites
- Strong Unity3D background
- Familiarity with the Nvidia PhysX and Nvidia Flex libraries
Contact
edwin.babaians@tum.de
Supervisor:
Analysis and evaluation of DynaSLAM for dynamic object detection
Description
Investigation of DynaSLAM in terms of real-time capabilities and dynamic object detection.
Supervisor:
Comparison of Driver Situation Awareness with an Eye Tracking based Decision Anticipation Model
Situation Awareness, Autonomous Driving, Region of Interest Prediction, Eye Tracking
Description
This work can be done in German or English
The transfer of control to the human driver in autonomous driving requires observing the human driver. The vehicle has to guarantee that the human driver is aware of the current driving situation. One input source for observing the human driver is the driver's gaze.
The objective of this project is to compare two existing approaches for driver observation [1,2]. While [1] measures the driver's situation awareness (SA), [2] anticipates the driver's decisions. As part of a user study, [2] published a gaze dataset. An interesting cross-validation would be the comparison of the SA score generated by [1] and the predicted decision correctness of [2].
Tasks
- Generate ROI predictions [3] from the dataset of [2]
- Estimate the driver SA with the model of [1]
- Compare [1] and [2]
- (Optional) Extend driving experiments
References
[1] Markus Hofbauer, Christopher Kuhn, Lukas Puettner, Goran Petrovic, and Eckehard Steinbach. Measuring driver situation awareness using region-of-interest prediction and eye tracking. In 22nd IEEE International Symposium on Multimedia (ISM), Naples, Italy, Dec 2020.
[2] Pierluigi Vito Amadori, Tobias Fischer, Ruohan Wang, and Yiannis Demiris. Decision Anticipation for Driving Assistance Systems. June 2020.
[3] Markus Hofbauer, Christopher Kuhn, Jiaming Meng, Goran Petrovic, and Eckehard Steinbach. Multi-view region of interest prediction for autonomous driving using semisupervised labeling. In IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), Rhodes, Greece, Sep 2020.
Prerequisites
- Experience with ROS and Python
- Basic knowledge of Linux
Supervisor:
3D object model reconstruction from RGB-D scenes
Description
Robots should be able to discover their environments and learn new objects in order to become a part of daily human life. Detecting and recognizing objects in unstructured environments, such as households, remains challenging. For robotic grasping and manipulation, knowing the 3D models of objects is beneficial; hence, the robot needs to infer the 3D shape of an object upon observation. In this project, we will investigate methods that can infer or produce 3D models of novel objects by observing RGB-D scenes. We will analyze methods to reconstruct 3D information with different arrangements of an RGB-D camera.
Prerequisites
- Basic knowledge of digital signal processing / computer vision
- Experience with ROS, C++, Python
- Experience with artificial neural network libraries, or the motivation to learn them
- Motivation to produce successful work
Contact
furkan.kaynar@tum.de
Supervisor:
Research on the implementation of an (automated) solution for the analysis of surface impurities on endoscope tubes
Description
...
Supervisor:
AI-Enhanced Tool for Desk Research – Smart Analytical Engine
Description
...
Supervisor:
Algorithm evaluation for robot grasping with compliant jaws
Python, ROS, robot grasping
Apply a state-of-the-art contact model to robot grasp planning with a customized physical setup comprising a KUKA robot arm and a parallel-jaw gripper with compliant materials.
Description
Model-based grasp planning algorithms depend on friction analysis, since the friction between objects and gripper jaws strongly affects grasp robustness. A state-of-the-art friction analysis algorithm for grasp planning was evaluated with plastic robot fingers and achieved promising results, but will it still work, compared to more advanced contact models, if the gripper jaws are covered with compliant materials such as rubber and silicone?
The task of this work is to create a new dataset and retrain an existing deep network by applying a state-of-the-art contact model for grasp planning.
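At the core of such friction analysis is the friction cone: under the simplest point-contact model, a parallel-jaw grasp is antipodal, and thus force-closure, if the grasp axis lies within arctan(mu) of both contact friction cones. A minimal sketch, assuming outward surface normals as inputs:

```python
# Sketch: antipodal grasp test for a parallel-jaw gripper under a Coulomb
# point-contact model. p1/p2 are contact points, n1/n2 outward surface
# normals, and mu the friction coefficient (typically larger for compliant jaws).
import numpy as np

def is_antipodal(p1, n1, p2, n2, mu):
    axis = np.asarray(p2, float) - np.asarray(p1, float)
    axis /= np.linalg.norm(axis)            # grasp axis between the contacts
    half_angle = np.arctan(mu)              # friction cone half-angle
    a1 = np.arccos(np.clip(np.dot(axis, -np.asarray(n1)), -1.0, 1.0))
    a2 = np.arccos(np.clip(np.dot(-axis, -np.asarray(n2)), -1.0, 1.0))
    return a1 <= half_angle and a2 <= half_angle
```

Compliant jaw materials mainly enter this model through a larger effective friction coefficient and contact area, which is where the more advanced contact models come in.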
Supervisor:
Adaptive LiDAR data update rate control based on motion estimation
SLAM, Sensor Fusion, ROS
Description
...
Supervisor:
UWB localization by Kalman filter and particle filter
Description
...
Supervisor:
Investigating the Potential of Machine Learning to Map Changes in Forests Based on Earth Observation
Description
...
Supervisor:
Ingenieurpraxis (Engineering Internships)
Recording of Robotic Grasping Failures
Description
The aim of this project is to collect data from robotic grasping experiments and create a large-scale labeled dataset. We will conduct experiments in which a robot attempts to grasp known or unknown objects autonomously. The complete pipeline includes:
- Estimating grasp poses via computer vision
- Robotic motion planning
- Executing the grasp physically
- Recording necessary data
- Organizing the recorded data into a well-structured dataset
Most of the data collection pipeline has already been developed; additions and modifications may be needed.
Prerequisites
Useful background:
- Digital signal processing
- Computer vision
- Dataset handling
Requirements:
- Experience with Python and ROS
- Motivation to produce a good outcome
Contact
furkan.kaynar@tum.de
(Please provide your CV and transcript in your application.)
Supervisor:
Studentische Hilfskräfte (Student Assistants)
Student Assistant for the Software Engineering Lab
Software Engineering, Unit Testing, TDD, C++
Description
We are looking for a student teaching assistant for our new Software Engineering Lab. In this course we explain basic principles of software engineering, such as unit testing, test-driven development, and collaborating in teams [1].
You will act as a teaching assistant, supervising students during the lab sessions while they work on their practical homework. The homework tasks are generally C++ coding exercises in which the students contribute to a common codebase. This means you should have good experience with C++, unit testing, and git, as these are an essential part of the homework.
References
[1] Winters, Titus, Tom Manshreck, and Hyrum Wright, eds. Software Engineering at Google: Lessons Learned from Programming Over Time. O'Reilly Media, Incorporated, 2020
Prerequisites
- Very good knowledge in C++
- Experience with unit testing
- Good understanding of git and collaborative software development
Supervisor:
MATLAB tutor for Digital Signal Processing lecture in summer semester 2022
Description
Tasks:
- Help students with the basics of MATLAB (e.g. matrix operations, filtering, image processing, runtime errors)
- Correct some of the homework problems
- Understand the DSP coursework material
We offer:
- Payment according to the working hours and academic qualification
- The workload is approximately 6 hours per week from May 2022 to August 2022
- Technische Universität München especially welcomes applications from female applicants
Application:
- Please send your application with a CV and transcript per e-mail to basak.guelecyuez@tum.de
- Students who have taken the DSP course are preferred.
Supervisor: