Self-Supervised Learning for Visual Obstacle Avoidance

Tom van Dijk

Control & Simulation

Research output: Book/Report › Report › Scientific

Abstract

With a growing number of drones, the risk of collision with other air traffic or fixed obstacles increases. New safety measures are required to keep the operation of Unmanned Aerial Vehicles (UAVs) safe. One of these measures is the use of a Collision Avoidance System (CAS), a system that helps the drone autonomously detect and avoid obstacles. The design of a CAS is a complex task with many smaller subproblems, as illustrated by Albaker and Rahim [1]. How should the drone sense nearby obstacles? When is there a risk of collision? What should the drone do when a conflict is detected? All of these questions need to be answered to develop a functional CAS. However, all of these subproblems, except the sensing of obstacles, only concern the behavior of the vehicle. They can be solved independently of the target platform as long as it can perform the required maneuvers; it does not matter whether it is a UAV or a larger vehicle. The sensing of the environment, on the other hand, is the only subproblem that places requirements on the hardware, specifically the sensors that should be carried by the UAV. It is the hardware that sets UAVs apart from other vehicles. Unlike autonomous cars, other ground-based vehicles or larger aircraft, UAVs have only a small payload capacity. It is therefore not practical to carry large or heavy sensors such as LIDAR or radar for obstacle avoidance. Instead, obstacle avoidance on UAVs requires clever use of lightweight sensors: cameras, microphones or antennae. This research will therefore focus on the sensing of the environment. Of these sensors, cameras are the only ones that can detect nearly all ground-based obstacles and other air traffic; microphones and antennae are limited to detecting sources of noise or radio signals. Therefore, this research will focus on the visual detection of obstacles.
The field of computer vision is well-developed; it may already be possible to find an adequate solution for visual obstacle detection using existing stereo vision methods like Semi-Global Matching (SGM) [23]. These methods, however, use only a fraction of the information present in the images to estimate depth: the disparity. Other cues, such as the apparent size of known objects, are completely ignored. The use of appearance cues for depth estimation is a relatively new development driven largely by the advent of Deep Learning, which allows these cues to be learned from large, labeled datasets. As long as the UAV’s operational environment is similar to this training dataset, it should be possible to use appearance cues in a CAS. However, this similarity is difficult to guarantee and may require a prohibitively large training set. Self-Supervised Learning may provide a solution to this problem. After training on an initial dataset, the UAV will continue to collect new training samples during operation. This allows it to ‘adapt’ to its operational environment and to learn new depth cues that are relevant in that environment. Self-Supervised Learning for depth map estimation is a young field; the first practical examples started to appear around 2016 (e.g. [17]). Most of the current literature is focused on automotive applications.

Original language	English
Publisher	Micro Air Vehicle Lab (MAVLab), TU Delft
Number of pages	39
Publication status	Published - Mar 2020

Keywords

Obstacle Avoidance
Stereo Vision
Depth perception
self-supervised learning
Neural Networks
micro air vehicle
Benchmark study
Survey
Computer Vision
Unmanned Aerial Vehicle

Access to Document

TechReport_TvanDijk_March2020 (Final published version, 13.2 MB)

Cite this

@book{bf982743f04349c1a50212f3a91b739e,

title = "Self-Supervised Learning for Visual Obstacle Avoidance",

keywords = "Obstacle Avoidance, Stereo Vision, Depth perception, self-supervised learning, Neural Networks, micro air vehicle, Benchmark study, Survey, Computer Vision, Unmanned Aerial Vehicle",

author = "{van Dijk}, Tom",

year = "2020",

month = mar,

language = "English",

publisher = "Micro Air Vehicle Lab (MAVLab), TU Delft",

}

TY - BOOK

T1 - Self-Supervised Learning for Visual Obstacle Avoidance

AU - van Dijk, Tom

PY - 2020/3

Y1 - 2020/3

KW - Obstacle Avoidance

KW - Stereo Vision

KW - Depth perception

KW - self-supervised learning

KW - Neural Networks

KW - micro air vehicle

KW - Benchmark study

KW - Survey

KW - Computer Vision

KW - Unmanned Aerial Vehicle

UR - https://github.com/tomvand/2020-techreport-visual-obstacle-avoidance/blob/master/report.pdf

M3 - Report

BT - Self-Supervised Learning for Visual Obstacle Avoidance

PB - Micro Air Vehicle Lab (MAVLab), TU Delft

ER -
