thesis update (cumulative)
@@ -170,13 +170,26 @@
 % mainmatter (=content)
 
 \newchapter{Introduction}{chap:introduction}
-\todo[inline, color=green!40]{its a master thesis where we try to know how trustworthy the sensor data for robot navigation is}
+%\todo[inline, color=green!40]{its a master thesis where we try to know how trustworthy the sensor data for robot navigation is}
-\newsection{Motivation and Problem Statement}{sec:motivation}
+%\newsection{Motivation and Problem Statement}{sec:motivation}
-\todo[inline]{lidar and its role in robot navigation. discuss sensor degradation and its effects on navigation.}
+%\todo[inline]{lidar and its role in robot navigation. discuss sensor degradation and its effects on navigation.}
 
-\todo[inline, color=green!40]{autonomous robots have many sensors for understanding the world around them, especially visual sensors (lidar, radar, ToF, ultrasound, optical cameras, infrared cameras), they use that data for navigation mapping, SLAM algorithms, and decision making. these are often deep learning algorithms, oftentimes only trained on good data}
+Autonomous robots have become increasingly prevalent in search and rescue missions: they do not endanger additional human lives, yet can still fulfil the difficult tasks of navigating hazardous environments such as collapsed structures, identifying and locating victims, and assessing the environment's safety for human rescue teams. To understand the environment, robots employ multiple sensor systems such as lidar, radar, ToF, ultrasound, optical cameras, or infrared cameras, of which lidar is the most prominently used due to its accuracy. The robots use the sensors' data to map their environments, navigate their surroundings, and make decisions such as which paths to prioritize. Many of the algorithms involved are deep learning-based and are trained on large amounts of data whose characteristics the models learn.
-\todo[inline, color=green!40]{difficult environments for sensors to produce good data quality (earthquakes, rescue robots), produced data may be unreliable, we don't know how trustworthy that data is (no quantification, confidence), since all navigation and decision making is based on input data, this makes the whole pipeline untrustworthy/problematic}
-\todo[inline, color=green!40]{contribution/idea of this thesis is to calculate a confidence score which describes how trustworthy input data is. algorithms further down the pipeline (slam, navigation, decision) can use this to make more informed decisions - examples: collect more data by reducing speed, find alternative routes, signal for help, do not attempt navigation, more heavily weight input from other sensors}
+The environments of search and rescue missions provide challenging conditions for sensor systems to produce reliable data. One of the most prominent examples is aerosol particles from smoke and dust, which can obstruct the view and lead sensors to produce erroneous data. If such degraded data was not present in the training data of the robots' algorithms, these errors may lead to unexpected outputs and potentially endanger the robot or even the human rescue targets. This is especially critical for autonomous robots, whose decisions are based entirely on their sensor data without any human intervention. To safeguard against these problems, robots need a way to assess the trustworthiness of their sensor systems' data.
 
 
+For remote controlled robots a human operator can make these decisions, but many search and rescue missions do not permit remote control due to environmental factors such as radio signal attenuation or the sheer size of the search area, and therefore demand autonomous robots. When designing such robots, we thus arrive at the following critical question:
 
+\begin{quote} Can autonomous robots quantify the reliability of lidar sensor data in hazardous environments to make more informed decisions? \end{quote}
 
+In this thesis we aim to answer this question by assessing a deep learning-based anomaly detection method and its performance in quantifying the degradation of sensor data. The employed algorithm is a semi-supervised anomaly detection method which uses manually labeled training data to improve its performance over unsupervised methods, and we show how much the introduction of these labeled samples improves the method's performance. The model's output is an anomaly score which quantifies the data's reliability and can be used by algorithms that rely on the sensor data. Such algorithms may decide, for example, to slow the robot down to collect more data, choose alternative routes, signal for help, or weight other sensors' input data more heavily.
 
+\todo[inline]{discuss results (we showed X)}
 
+%\todo[inline, color=green!40]{autonomous robots have many sensors for understanding the world around them, especially visual sensors (lidar, radar, ToF, ultrasound, optical cameras, infrared cameras), they use that data for navigation mapping, SLAM algorithms, and decision making. these are often deep learning algorithms, oftentimes only trained on good data}
+%\todo[inline, color=green!40]{difficult environments for sensors to produce good data quality (earthquakes, rescue robots), produced data may be unreliable, we don't know how trustworthy that data is (no quantification, confidence), since all navigation and decision making is based on input data, this makes the whole pipeline untrustworthy/problematic}
+%\todo[inline, color=green!40]{contribution/idea of this thesis is to calculate a confidence score which describes how trustworthy input data is. algorithms further down the pipeline (slam, navigation, decision) can use this to make more informed decisions - examples: collect more data by reducing speed, find alternative routes, signal for help, do not attempt navigation, more heavily weight input from other sensors}
 
 \newsection{Scope of Research}{chap:scope_research}
 \todo[inline]{output is score, thresholding (yes/no), maybe confidence in sensor/data? NOT how this score is used in navigation/other decisions further down the line}
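+A minimal sketch of the thresholding idea above (the symbol $\tau$ and the decision rule are illustrative placeholders, not results of this thesis): given an anomaly score $s(x)$ for a scan $x$, a binary degradation flag could be derived as
+\begin{equation*}
+d(x) = \begin{cases} 1 & \text{if } s(x) > \tau \\ 0 & \text{otherwise,} \end{cases}
+\end{equation*}
+while how $d(x)$ is used in navigation or other downstream decisions remains out of scope.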
@@ -237,9 +250,11 @@
 
 
 \newchapter{DeepSAD: Semi-Supervised Anomaly Detection}{chap:deepsad}
-\todo[inline, color=green!40]{DeepSAD is a semi-supervised anomaly detection method proposed in cite, which is based on an unsupervised method (DeepSVDD) and additionally allows for providing some labeled data which is used during the training phase to improve the method's performance}
+Deep Semi-Supervised Anomaly Detection (DeepSAD)~\cite{deepsad} is a deep learning-based anomaly detection method whose performance in quantifying sensor degradation we explore in this thesis. It is a semi-supervised method: in addition to the unlabeled training data, it allows the introduction of manually labeled samples, which improves its performance over its unsupervised predecessor Deep One-Class Classification~\cite{deepsvdd}. The working principle of the method is to encode the input data into a latent space and train the network to cluster normal data close together while anomalies are mapped further away in that latent space.
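+As a sketch of this working principle (notation following~\cite{deepsad}), the anomaly score of an input $x$ is its latent distance from a fixed hypersphere center $c$,
+\begin{equation*}
+s(x) = \lVert \phi(x; \mathcal{W}) - c \rVert^2 ,
+\end{equation*}
+where $\phi(\cdot\,; \mathcal{W})$ denotes the trained encoder network with weights $\mathcal{W}$.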
+%\todo[inline, color=green!40]{DeepSAD is a semi-supervised anomaly detection method proposed in cite, which is based on an unsupervised method (DeepSVDD) and additionally allows for providing some labeled data which is used during the training phase to improve the method's performance}
 \newsection{Algorithm Description}{sec:algorithm_description}
-\todo[inline]{explain deepsad in detail}
+%\todo[inline]{explain deepsad in detail}
 
 \todo[inline, color=green!40]{Core idea of the algorithm is to learn a transformation which maps input data into a latent space where normal data clusters close together and anomalous data gets mapped further away. To achieve this, the method first includes a pretraining step of an autoencoder to extract the most relevant information; second, it fixes a hypersphere center in the autoencoder's latent space as a target point for normal data; third, it trains the network to map normal data closer to that hypersphere center. Fourth, the resulting network can map new data into this latent space and interpret its distance from the hypersphere center as an anomaly score which is larger the more anomalous the data point is}
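+Written out (a sketch following the objective in~\cite{deepsad}; notation to be checked against the paper), given $n$ unlabeled samples $x_i$ and $m$ labeled samples $(\tilde{x}_j, \tilde{y}_j)$ with $\tilde{y}_j = +1$ for labeled normal and $\tilde{y}_j = -1$ for labeled anomalous data, the network weights $\mathcal{W}$ are trained to minimize
+\begin{equation*}
+\frac{1}{n+m} \sum_{i=1}^{n} \lVert \phi(x_i; \mathcal{W}) - c \rVert^2
++ \frac{\eta}{n+m} \sum_{j=1}^{m} \left( \lVert \phi(\tilde{x}_j; \mathcal{W}) - c \rVert^2 \right)^{\tilde{y}_j}
++ \frac{\lambda}{2} \sum_{\ell=1}^{L} \lVert \mathcal{W}^{\ell} \rVert_F^2 ,
+\end{equation*}
+so normal data is pulled towards $c$ while labeled anomalies are pushed away (the exponent $\tilde{y}_j = -1$ inverts their distance term); $\eta$ weights the labeled samples and $\lambda$ controls weight decay.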
 \todo[inline, color=green!40]{explanation pre-training step: the architecture of the autoencoder depends on the shape of the input data, but any data shape is generally permissible. For the autoencoder we do not need any labels since the optimization target is always the input itself. The latent space dimensionality can be chosen based on the input data's complexity (search citations); generally, a higher-dimensional latent space has more learning capacity but tends to overfit more easily (find cite). The pre-training step is used to find weights for the encoder which generally extract robust and critical information from the input because TODO read deepsad paper (cite deepsad). As training data, typically all data (normal and anomalous) is used during this step.}
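+Since the optimization target of the autoencoder is the input itself, no labels are needed; as a sketch, the usual reconstruction loss for an encoder $\phi$ and decoder $\psi$ reads
+\begin{equation*}
+\mathcal{L}_{\mathrm{AE}} = \frac{1}{n} \sum_{i=1}^{n} \lVert x_i - \psi(\phi(x_i)) \rVert^2 .
+\end{equation*}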
 \todo[inline, color=green!40]{explanation hypersphere center step: an additional positive ramification of the pretraining is that the mean of the latent representations from pre-training can be used as the hypersphere center around which normal data is supposed to cluster. This is advantageous because it allows the main training to converge faster than choosing a random point in the latent space as hypersphere center. From this point onward the center $c$ is fixed for the main training and inference and does not change anymore.}
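+As a sketch of this step, the center is obtained as the mean of the pretrained encoder's latent representations over the training data,
+\begin{equation*}
+c = \frac{1}{n} \sum_{i=1}^{n} \phi(x_i; \mathcal{W}_0),
+\end{equation*}
+where $\mathcal{W}_0$ denotes the encoder weights after pretraining; $c$ is then kept fixed.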
@@ -267,6 +282,17 @@
 
 \newsection{Data}{sec:data}
 
+%BEGIN missing points
+As we can see in figure~\ref{fig:data_missing_points}, the artificial smoke introduced as explicit degradation during some experiments results in more missing measurements per scan. This can be explained by measurement rays hitting airborne particles without being reflected back to the sensor in a way it can measure.
+
+\begin{figure}
+\begin{center}
+\includegraphics[width=0.9\textwidth]{figures/data_missing_points.png}
+\end{center}
+\caption{Density histogram showing the percentage of missing measurements per scan for normal experiments without degradation and for anomalous experiments with artificial smoke introduced as degradation.}\label{fig:data_missing_points}
+\end{figure}
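+For clarity, the quantity plotted in figure~\ref{fig:data_missing_points} can be sketched as the per-scan percentage of missing measurements (the symbols below are our own naming, not taken from the sensor's documentation),
+\begin{equation*}
+m = \frac{N_{\mathrm{emitted}} - N_{\mathrm{returned}}}{N_{\mathrm{emitted}}} \cdot 100\,\% ,
+\end{equation*}
+where $N_{\mathrm{emitted}}$ is the number of measurement rays per scan and $N_{\mathrm{returned}}$ the number of valid range returns.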
+%END missing points
 
 \todo[inline]{describe data sources, limitations}
 \todo[inline]{screenshots of camera/3d data?}
 \todo[inline]{difficulties: no ground truth, different lidar sensors/settings, different data shapes, available metadata, ...}
@@ -317,6 +343,7 @@
 
 
 
 
 % end mainmatter
 % **************************************************************************************************
 