We have exciting new projects, not only for SA and MA students, but also for other team members looking for a fun and interesting project alongside their ETH obligations. If you are interested in an SA or MA, please contact one of our PhDs to discuss project ideas. For other projects, check out the list of current projects below and contact one of our TAs for more information.
If you are interested in previous projects, visit our Project Report site, where you can download the corresponding reports.
If you are from a different team and your team works with the B-Human code framework, make sure to check our YouTube channel for some tutorials on the framework by Filippo Martinoni.
Bachelor / Semester / Master Projects
For HS22, there are no more projects available. Please get in contact with us if you are interested in working on a project during FS23. New proposals will be published soon.
- Pose detection by Image Translation [closed]
- Objects and field detection in Robocup [closed]
- Design of a Dynamic Omni-directional Kick Engine for NAO Bipedal Robots in RoboCup [closed]
If you are interested in a thesis, please reach out directly to the supervisors mentioned in the proposals, or send an email to firstname.lastname@example.org for general questions about projects.
Here is a glimpse of the projects we work on as members of the team. Every year new challenges arise, and our job is to find solutions to them, exploiting our capabilities and implementing the newest technological advancements to make our robots work better and better.
System Identification for Stabilization
Our walking controller is based on the inverted-pendulum assumption. Given this first-order model approximation, a PID controller produces joint commands for the AnkleRoll and AnklePitch joints (check here for a visual reference); however, this is a purely model-free controller with heuristically tuned gains. The goal of this project is to improve the control effort by including model information.
Using both geometrical assumptions and system identification techniques, we plan to obtain an accurate model that is not only useful for control but also for building a simulation based on it. A good mathematical model will then allow us to move from a model-free controller (PID) to a model-based one (inverse dynamics + PID correction).
Ground truth is obtained either with filtered IMU data or with an external MoCap system.
The model can depend on additional factors such as temperature, battery level, leg extension, and so on, so that it can be adapted online to varying conditions.
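As a minimal sketch of the model-free baseline described above, the following shows a discrete PID controller applied to a single ankle joint. The gains, cycle time, and the error signal are illustrative assumptions, not the framework's actual values.

```python
# Minimal sketch of a discrete PID controller for one ankle joint.
# All gains and the 0.012 s cycle time are hypothetical placeholders.

class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, error):
        """Return a corrective joint command for the current error."""
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Example: drive the measured tilt error toward zero.
controller = PID(kp=0.8, ki=0.1, kd=0.05, dt=0.012)  # ~83 Hz motion cycle
command = controller.step(error=0.02)  # 0.02 rad tilt error
```

In a model-based version, an inverse-dynamics feedforward term computed from the identified model would be added to this PID correction.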
A dedicated kick (fastKick) executed from a stationary standing position can be more powerful and accurate, but in practice, during 5v5 matches, there is often not enough time for the robot to stop, balance, swing back, and finally execute the kick motion.
Therefore, a good kick during walking is crucial: it can be used for short-distance passing, as well as to replace dribbling, which is slow and prone to errors.
Neural Network Modeling
In order to have a physically meaningful simulation, it is important to correctly model the mechatronic assemblies. On our robot, we cannot directly access the actuators, so obtaining a good model is hard. Furthermore, the low-level PID controller is likewise hidden from us.
In this project, we will use data-driven modeling techniques to understand how each individual electric actuator works, aiming to identify non-linearities, delays, backlash, and damage. We will define a look-up table and train a neural network, following current trends in research. The results will be used in simulations to close the sim-to-real gap in future control and reinforcement learning projects.
The main idea to be pursued is to model both the controller and the actuator at the same time, relying only on the information we have available: the commanded position and the measured output position.
commanded position → | controller + actuator | → measured position
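To illustrate the idea, here is a toy sketch of the look-up-table variant: a binned table learned from logged commanded/measured position pairs. The "logged data" is synthetic (a saturating, slightly biased actuator) purely for illustration; a neural network would replace the table in the actual project.

```python
import numpy as np

# Learn the commanded-position -> measured-position map of one actuator
# from logged data, using a simple binned look-up table.
rng = np.random.default_rng(0)
commanded = rng.uniform(-2.0, 2.0, 5000)                         # synthetic log
measured = np.clip(commanded, -1.5, 1.5) * 0.95 + rng.normal(0, 0.01, 5000)

# Build the table: average measured output per command bin.
bins = np.linspace(-2.0, 2.0, 41)
idx = np.digitize(commanded, bins) - 1
table = np.array([measured[idx == i].mean() for i in range(len(bins) - 1)])

def predict(u):
    """Predict the actuator output for a commanded position u."""
    i = np.clip(np.digitize(u, bins) - 1, 0, len(table) - 1)
    return table[i]

# The table captures the saturation: commands beyond ~1.5 rad flatten out.
print(predict(0.0), predict(1.9))
```

The same data would be used to train a small network, which additionally captures delays and backlash that a memoryless table cannot represent.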
Whole-Body Control for Kicking
We want to develop a robust algorithm to produce the kicking motion, able to deal with stability issues that could arise from any unwanted contact with the floor or other robots.
Also, this approach could lead to an improvement in control over kicking direction.
Behavior & Decision-Making
New Offensive and Defensive Behavior
The current behavior is relatively old and contains many unresolved issues, but until now other, more pressing problems took priority.
Now, with the SelfLocator, Ball Detection, Robot Detection, and most basic motions working reasonably well, the implementation of more advanced behavior starts to make sense.
The final goal of this project is to enable the robots to pass the ball to each other and to incorporate this skill into offensive and defensive gameplay. In theory this is fairly simple, but in practice it is very hard to execute: most kicks are inaccurate, opponents are usually all over the place during the maneuver, and, most importantly, none of the estimated states and information is 100% correct.
Improved Obstacle Avoidance
When a robot is walking to a specific destination, other robots might be in the way. It should then plan and follow a path around these obstacles so as to reach the destination efficiently.
Based on the current computer vision pipeline, a robot can detect other robots and estimate their locations with its own vision. Note that a robot can also obtain information about other robots from shared knowledge (teammates' detections). Due to the limits imposed by the SPL Rules 2022 (Chapter 2.5.2 – Wireless Communications), however, the shared knowledge might not be fully up to date, making it less reliable than the robot's own vision.
The current implementation works as follows: whenever a robot is detected between the current position and the destination, a waypoint is set. The robot first moves to the waypoint (including the location and the intended direction) and then heads for the destination. This raises two questions: 1. Is the waypoint optimal? (We will call this part obstacle avoidance.) 2. How does a robot walk to the waypoint efficiently? (See the figure below; we will call this part path following.)
This project aims to use all the available sensors (upper and lower cameras, sonars, touch sensors) to get a global and accurate description of the environment, in order to efficiently compute solutions.
A possible expansion would be to build a global map, with occupancy grid, uncertainty, safe positions, target positions, … and provide optimal behaviors based on that.
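The waypoint idea above can be sketched in a few lines: if a detected robot lies close to the straight line from the current position to the destination, place a waypoint offset sideways from the obstacle. The 2D geometry and the clearance radius are illustrative assumptions, not the framework's actual code.

```python
import numpy as np

def waypoint(start, goal, obstacle, clearance=0.5):
    """Return the goal if the path is free, else a sidestep waypoint."""
    start, goal, obstacle = map(np.asarray, (start, goal, obstacle))
    d = goal - start
    d_hat = d / np.linalg.norm(d)
    # Project the obstacle onto the start->goal segment.
    t = np.clip(np.dot(obstacle - start, d_hat), 0.0, np.linalg.norm(d))
    closest = start + t * d_hat
    gap = obstacle - closest
    if np.linalg.norm(gap) >= clearance:
        return goal  # path is already free
    # Step sideways, away from the obstacle, by the required clearance.
    side = np.array([-d_hat[1], d_hat[0]])  # unit normal to the path
    if np.dot(side, gap) > 0:
        side = -side
    return closest + side * clearance

print(waypoint([0, 0], [4, 0], [2, 0.1]))  # obstacle on path -> sidestep
```

A global occupancy-grid map would replace this purely local rule with a proper path planner.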
RL for Optimal Cooperative Positioning
Our current logic is based on multiple state-machines addressing different game conditions via predefined transitions.
It is reliable, but not powerful enough when multi-robot cooperation is needed. We want to train RL agents that enable our robots to position themselves correctly in each game situation. A basic RL framework for single agents and fully observable MDPs has already been developed; we need to extend it to multi-agent problems. Once positioning works, gameplay will be addressed.
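As a toy sketch of the single-agent, fully observable setting, the following trains tabular Q-learning on a 1D "positioning" grid where the agent must reach a target cell. The grid, rewards, and hyperparameters are illustrative; the team's actual RL framework is more involved.

```python
import numpy as np

# Tabular Q-learning on a 7-cell chain; the agent must reach cell 5.
N, target = 7, 5
Q = np.zeros((N, 2))        # actions: 0 = left, 1 = right
rng = np.random.default_rng(1)

for _ in range(500):                       # training episodes
    s = rng.integers(N)
    for _ in range(20):                    # step limit per episode
        # Epsilon-greedy action selection.
        a = rng.integers(2) if rng.random() < 0.2 else int(Q[s].argmax())
        s2 = max(0, min(N - 1, s + (1 if a == 1 else -1)))
        r = 1.0 if s2 == target else -0.05
        # Standard Q-learning update (lr 0.5, discount 0.9).
        Q[s, a] += 0.5 * (r + 0.9 * Q[s2].max() - Q[s, a])
        s = s2
        if s == target:
            break

# Greedy policy: from cell 2 the agent should move right toward the target.
print(int(Q[2].argmax()))
```

Multi-agent positioning replaces the single state with joint observations, which is exactly where the framework still needs work.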
Detecting robots is one of the main vision tasks in a robot soccer perception stack. Currently, our framework uses a single-stage object detector based on the SSD meta-architecture to detect robots, but at the moment we cannot determine the team a detected robot belongs to, i.e. our own team or the opponent's. While this is already useful for obstacle avoidance, it is not sufficient for implementing collaborative behaviors.
In this project, we are going to design and implement an approach to determine the team of a detected player based on its jersey color. We will start with classic computer vision techniques using color-based heuristics and eventually move to a more robust learning-based approach.
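A minimal sketch of such a color-based heuristic: classify a detected robot's team from the dominant hue inside its jersey region. The hue ranges for "own" and "opponent" jerseys are assumptions that would be calibrated per game.

```python
import colorsys
import numpy as np

def team_from_jersey(rgb_patch):
    """Classify a (H, W, 3) uint8 jersey crop by its mean hue."""
    pixels = rgb_patch.reshape(-1, 3) / 255.0
    hues = np.array([colorsys.rgb_to_hsv(*p)[0] for p in pixels])
    mean_hue = hues.mean()
    # Assumed calibration: own team wears blue-ish, opponents red-ish.
    if 0.5 < mean_hue < 0.75:
        return "own"
    if mean_hue < 0.08 or mean_hue > 0.92:
        return "opponent"
    return "unknown"

blue_patch = np.full((8, 8, 3), (20, 40, 220), dtype=np.uint8)
print(team_from_jersey(blue_patch))  # a saturated blue patch -> "own"
```

Such heuristics break down under lighting changes, which is why a learning-based classifier is the eventual goal.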
Deep learning models have become the standard for perception in the SPL, especially for tasks such as object detection and scene understanding. However, collecting high-quality data with the hardware of the NAO and annotating it is a tedious and time-consuming task.
One possible workaround is to use synthetic data. State-of-the-art game engines such as Unreal are capable of rendering highly photorealistic scenes, which can be used to generate synthetic datasets. However, training only on synthetic data might result in poor generalization.
In recent years, it has been shown that deep learning can be used to transfer the properties of one dataset to another with the same labels but a different domain, e.g. real and synthetic robot soccer scenes. In particular, models belonging to the CycleGAN family have shown very promising results.
The goal of this project is to set up a pipeline that automatically generates annotated synthetic training data for computer vision models from robot soccer scenes rendered e.g. in Unreal Engine, and makes it more "realistic" using a style transfer model trained on publicly available datasets.
Instead of object detection, we want to try out a more holistic approach to scene understanding: semantic segmentation.
The goal of this project is to develop a deep learning model for semantic segmentation that can run in real-time on the embedded CPU of the NAO V6.
Light Agnostic Detection
Domain adaptation is an important task in computer vision.
In robot soccer, this becomes even more important, since lighting conditions can change and a model trained under artificial light also needs to perform well in natural lighting conditions.
Domain adaptation using the Fourier transform is one of the simplest methods for making a model agnostic to lighting conditions.
The method only requires an image from the source dataset on which the model was trained.
The lines and robots are fine details that live in the high-frequency domain, while lighting and other global properties are low-frequency signals. Imposing the low-frequency information of the source image onto the target image should yield a result that keeps the target's lines and robots but has the lighting conditions of the source dataset.
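A minimal sketch of this idea (in the spirit of Fourier Domain Adaptation): replace the low-frequency amplitude of a target-domain image with that of a source-domain image, keeping the target's phase and thus its fine structures. Grayscale images and the band size `beta` are simplifying assumptions.

```python
import numpy as np

def fda_transfer(target, source, beta=0.05):
    """Impose the source's low-frequency amplitude onto the target image."""
    ft_t = np.fft.fftshift(np.fft.fft2(target))
    ft_s = np.fft.fftshift(np.fft.fft2(source))
    amp_t, phase_t = np.abs(ft_t), np.angle(ft_t)
    amp_s = np.abs(ft_s)

    h, w = target.shape
    bh, bw = max(1, int(beta * h)), max(1, int(beta * w))
    cy, cx = h // 2, w // 2
    # Swap only the centered low-frequency band of the amplitude spectrum.
    amp_t[cy - bh:cy + bh, cx - bw:cx + bw] = \
        amp_s[cy - bh:cy + bh, cx - bw:cx + bw]

    mixed = amp_t * np.exp(1j * phase_t)
    return np.real(np.fft.ifft2(np.fft.ifftshift(mixed)))

rng = np.random.default_rng(0)
target = rng.random((64, 64))          # stands in for a new-lighting image
source = rng.random((64, 64)) * 0.5    # stands in for a training image
out = fda_transfer(target, source)
print(out.shape)
```

In practice the transform would be applied per color channel, and `beta` tuned so that only illumination, not content, is transferred.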
Another idea would be to automatically calibrate the cameraSettings. There is a trade-off between exposure and the noise present in the image. By tweaking the exposure and gain in the cameraSettings, it might be possible to find a configuration that produces images similar to those the models were trained on. A good metric to check whether two images are similar is the KL divergence between their histograms (normalized into probability distributions).
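That similarity metric can be sketched directly: turn two image histograms into probability distributions and compute their KL divergence. The bin count and the epsilon for empty bins are illustrative choices.

```python
import numpy as np

def histogram_kl(img_a, img_b, bins=64, eps=1e-8):
    """KL divergence between the intensity histograms of two images."""
    p, _ = np.histogram(img_a, bins=bins, range=(0, 256))
    q, _ = np.histogram(img_b, bins=bins, range=(0, 256))
    p = p / p.sum() + eps   # normalize to probabilities; eps avoids log(0)
    q = q / q.sum() + eps
    return float(np.sum(p * np.log(p / q)))

rng = np.random.default_rng(0)
a = rng.integers(0, 256, (32, 32))
b = rng.integers(0, 256, (32, 32))
print(histogram_kl(a, a), histogram_kl(a, b))  # 0.0 vs. a positive value
```

An auto-calibration loop would sweep exposure/gain settings and keep the configuration minimizing this divergence against the training images.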
Automatic Data Labelling
The ball and line detection models are highly dependent on the field and lighting conditions. It is thus necessary to fine-tune our models on a new dataset for each new field.
However, creating a new dataset in every field is not possible, if done manually.
In this project, we look into labelling a dataset using large computer vision models, and into knowledge distillation for training a smaller model. Since labelling is done offline, there is no restriction on the number of parameters the teacher model may have; it can focus on accuracy and metrics rather than latency.
The output of the pre-trained teacher would be used as pseudo-labels for the new dataset, with some (ideally no) human intervention to correct errors in the labels.
The smaller model, which needs to be deployed on the robots, would then be trained on the data generated from these pseudo-labels.
The teacher we are considering is a semantic segmentation model for pixel-wise classification of images obtained from the robots on the new field. The deployed student can be either an object detection model or a semantic segmentation model; object detection boxes can also be derived from the segmentation masks.
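The distillation objective implied above can be sketched as follows: the student is trained on the teacher's softened per-pixel class probabilities instead of human annotations. Logit shapes, class names, and the temperature are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Cross-entropy of student vs. temperature-softened teacher targets."""
    p_teacher = softmax(teacher_logits / T)
    log_p_student = np.log(softmax(student_logits / T))
    return float(-(p_teacher * log_p_student).sum(axis=-1).mean())

# Per-pixel logits for a 4x4 image and 3 classes (e.g. ball/line/background).
rng = np.random.default_rng(0)
teacher = rng.normal(size=(4, 4, 3))
student = rng.normal(size=(4, 4, 3))
print(distillation_loss(student, teacher))   # untrained student: larger loss
print(distillation_loss(teacher, teacher))   # perfect match: minimal loss
```

In training, this soft-target term is typically combined with a standard loss on whatever hard pseudo-labels survive the human correction pass.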
Automatic Camera Calibration
Camera calibration in our framework refers to calibrating the joint offsets of the kinematic chain connecting the head to the torso – so e.g. the 3 revolute joints of the neck.
At the moment, the process of calibrating a robot is still very manual – tedious, time-consuming, and error-prone.
The goal of this project is to automate this process by leveraging field feature detection to determine the error between the perceived field position and the actual one.
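A toy sketch of the underlying idea: estimate a single joint offset by minimizing the error between where field features are perceived and where the known field model says they should be. The 1-DOF "camera" below stands in for the real kinematic chain, and the grid search is purely illustrative.

```python
import numpy as np

TRUE_OFFSET = 0.03  # rad, the unknown head-pitch offset to recover

def perceive(points, pitch_offset):
    """Rotate 2D field points as seen by a camera with a pitch offset."""
    c, s = np.cos(pitch_offset), np.sin(pitch_offset)
    R = np.array([[c, -s], [s, c]])
    return points @ R.T

# Known field-model points and their (simulated) perceived positions.
field_points = np.array([[1.0, 0.0], [2.0, 0.5], [3.0, -0.5]])
observed = perceive(field_points, TRUE_OFFSET)

# Calibrate: find the offset whose predicted view best matches observation.
candidates = np.linspace(-0.1, 0.1, 2001)
errors = [np.sum((perceive(field_points, o) - observed) ** 2)
          for o in candidates]
estimate = candidates[int(np.argmin(errors))]
print(round(estimate, 3))  # recovers ~0.03
```

The real problem optimizes several joint offsets at once over detected field lines and intersections, but the residual-minimization structure is the same.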