sensor platform photo, deepsad chapter progress
@@ -200,7 +200,7 @@
\newsection{thesis_structure}{Structure of the Thesis}
\todo[inline]{brief overview of thesis structure}
\todo[inline, color=green!40]{in section X we discuss anomaly detection and semi-supervised learning, since such an algorithm was used as the chosen method; we also discuss how lidar works and the data it produces. then we discuss the chosen method Deep SAD in detail in section X; in section 4 we discuss the training and evaluation data; in sec 5 we describe our setup for training and evaluation (whole pipeline). results are presented and discussed in section 6. section 7 contains a conclusion and discusses future work}

\newchapter{background}{Background and Related Work}

@@ -227,19 +227,19 @@
\fi

\newsection{semi_supervised}{Semi-Supervised Learning Algorithms}
\todo[inline]{Quick overview of the Deep SAD method}
\todo[inline, color=green!40]{deep learning based (neural networks with hidden layers) which get trained using backpropagation to learn to solve a novel task by defining some target}
\todo[inline, color=green!40]{data labels decide the training setting (supervised, unsupervised, semi-supervised incl. explanation); supervised is often classification based, but not possible if no labels are available; unsupervised has no well-defined target and is often used to find common hidden factors in the data (distribution); semi-supervised is more like a sub-method of unsupervised which additionally uses a little (often hand-labelled) data to improve method performance}
\todo[inline, color=green!40]{include figure unsupervised, semi-supervised, supervised}
\todo[inline, color=green!40]{find easy illustrative example with figure of semi-supervised learning and include + explain here}
\todo[inline, color=green!40]{our chosen method Deep SAD is a semi-supervised deep learning method whose workings will be discussed in more detail in section X}
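
A first stand-in for the illustrative example requested above: the following minimal sketch (hypothetical NumPy code, not taken from our experiments) shows that the three settings differ only in how much label information accompanies the training data; the +1/-1/unlabeled convention mirrors the one used by Deep SAD.
\begin{verbatim}
import numpy as np

# Toy dataset: 1000 samples with 16 features each.
X = np.random.randn(1000, 16)

# Supervised: every sample carries a ground-truth label.
y_supervised = np.random.randint(0, 2, size=1000)

# Unsupervised: no labels at all; only the data distribution is used.
y_unsupervised = None

# Semi-supervised: mostly unlabeled (0) plus a handful of
# hand-labeled samples (+1 = known normal, -1 = known anomalous).
y_semi = np.zeros(1000, dtype=int)
y_semi[:10] = +1    # a few labeled normal samples
y_semi[10:15] = -1  # a few labeled anomalies
\end{verbatim}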

\newsection{autoencoder}{Autoencoder}
\todo[inline]{autoencoder explanation}
\todo[inline, color=green!40]{autoencoders are a neural network architecture archetype (words) whose training target is to reproduce the input data itself - hence the name. the architecture is most commonly a mirrored one consisting of an encoder which transforms input data into a representation in a latent space and a decoder which transforms the latent space representation back into the same data format as the input data (phrasing). this method typically results in the encoder learning to extract the most robust and critical information of the data (todo maybe something about the decoder + citation for both). it is used in many domains: translation, LLMs, something with images (search example + citations)}
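
To make the mirrored encoder-decoder structure concrete, here is a minimal sketch of such an architecture (assuming a PyTorch implementation; the layer sizes are illustrative and not those used in our experiments):
\begin{verbatim}
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        # Encoder: compresses the input into the latent space.
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim))
        # Decoder: mirrors the encoder back to the input shape.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, input_dim))

    def forward(self, x):
        return self.decoder(self.encoder(x))

# The training target is the input itself, e.g.
# loss = ((model(x) - x) ** 2).mean()
\end{verbatim}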
\todo[inline, color=green!40]{typical encoder decoder mirrored figure}
\todo[inline, color=green!40]{explain figure}
\todo[inline, color=green!40]{our chosen method Deep SAD uses an autoencoder to translate input data into a latent space, in which it can more easily differentiate between normal and anomalous data}

\newsection{lidar_related_work}{Lidar - Light Detection and Ranging}
\todo[inline]{related work in lidar}
@@ -249,18 +249,29 @@
\todo[inline, color=green!40]{because of the given advantages of lidar it is most commonly used nowadays on robot platforms for environment mapping and navigation - so we chose to demonstrate our method based on degraded data collected by a lidar sensor as discussed in more detail in section (data section)}

\newchapter{deepsad}{Deep SAD: Semi-Supervised Anomaly Detection}

%In this chapter we explore the method \emph{Deep Semi-Supervised Anomaly Detection}~\cite{deepsad} which we employed during our experiments to quantify the degradation of lidar scans caused by artificially introduced water vapor from a theater smoke machine. The same approach of modeling a degradation quantification problem as an anomaly detection task was successfully used in \cite{degradation_quantification_rain} to quantify the degradation caused to lidar scans by bad weather conditions such as rain, fog and snow for autonomous driving tasks. Deep SAD is characterized by it being a deep-learning approach to anomaly detection, which enables it to learn more complex anomalous data patterns than more classic statistical approaches, and by its capability of employing hand-labeled data samples (both normal and anomalous) during its training step to better teach the model to differentiate between known anomalies and normal data than if only an unsupervised approach was used, which basically just learns the most common patterns in the implicitly more common normal data and differentiates anything from that.

In this chapter, we explore the method \emph{Deep Semi-Supervised Anomaly Detection} (Deep SAD)~\cite{deepsad}, which we employed to quantify the degradation of LiDAR scans caused by artificially introduced water vapor from a theater smoke machine. A similar approach—modeling degradation quantification as an anomaly detection task—was successfully applied in \cite{degradation_quantification_rain} to assess the impact of adverse weather conditions on LiDAR data for autonomous driving applications. Deep SAD leverages deep learning to capture complex anomalous patterns that classical statistical methods might miss. Furthermore, by incorporating a limited amount of hand-labeled data (both normal and anomalous), it can more effectively differentiate between known anomalies and normal data compared to purely unsupervised methods, which typically learn only the most prevalent patterns in the dataset~\cite{deepsad}.

%Deep Semi-Supervised Anomaly Detection~\cite{deepsad} is a deep-learning based anomaly detection method whose performance with regard to sensor degradation quantification we explore in this thesis. It is a semi-supervised method which allows the introduction of manually labeled samples in addition to the unlabeled training data to improve the algorithm's performance over its unsupervised predecessor Deep One-Class Classification~\cite{deepsvdd}. The working principle of the method is to encode the input data onto a latent space and train the network to cluster normal data close together while anomalies get mapped further away in that latent space.
%\todo[inline, color=green!40]{Deep SAD is a semi-supervised anomaly detection method proposed in cite, which is based on an unsupervised method (Deep SVDD) and additionally allows for providing some labeled data which is used during the training phase to improve the method's performance}
\newsection{algorithm_description}{Algorithm Description}
%\todo[inline]{explain deepsad in detail}

%Deep SAD is a typical clustering based anomaly detection technique which is described in \cite{anomaly_detection_survey} to generally have a two step approach to anomaly detection. First a clustering algorithm is used to cluster data closely together around a centroid and secondly the distances from data to that centroid are calculated and interpreted as an anomaly score. This general idea can also be found in the definition of the Deep SAD algorithm, which uses the encoder part of an autoencoder architecture which is trained to cluster data around a centroid in the latent space of its output. The data's geometric distance to that centroid in the latent space is defined as an anomaly score. Deep SAD is a semi-supervised training based method which can work completely unsupervised (no labeled data available), in which case it falls back to its predecessor method Deep SVDD, but additionally allows the introduction of labeled data samples during training to more accurately map known normal samples near the centroid and known anomalous samples further away from it.

Deep SAD is an anomaly detection method that belongs to the category of clustering-based methods, which according to~\cite{anomaly_detection_survey} typically follow a two-step approach. First, a clustering algorithm groups data points around a centroid; then, the distances of individual data points from this centroid are calculated and used as an anomaly score. In Deep SAD, this concept is implemented by employing the encoder part of an autoencoder architecture, which is jointly trained to map data into a latent space and to minimize the volume of a data-encompassing hypersphere whose center is the aforementioned centroid. The geometric distance in the latent space to the hypersphere center is used as the anomaly score, where a higher score corresponds to a higher probability of a sample being anomalous according to the method.
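
Expressed formally: if $\phi(\cdot\,;\mathcal{W}^*)$ denotes the trained encoder and $\mathbf{c}$ the fixed hypersphere center, the anomaly score of a sample $\mathbf{x}$ is its squared distance to the center in the latent space~\cite{deepsad}:
\[
s(\mathbf{x}) = \lVert \phi(\mathbf{x};\mathcal{W}^*) - \mathbf{c} \rVert^2
\]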

Deep SAD is semi-supervised, though it can operate in a fully unsupervised mode—effectively reverting to its predecessor, Deep SVDD~\cite{deepsvdd}—when no labeled data are available. However, it also allows for the incorporation of labeled samples during training. This additional supervision helps the model better position known normal samples near the centroid and push known anomalies farther away, thereby enhancing its ability to differentiate between normal and anomalous data.
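
This is captured directly in the Deep SAD objective as given in~\cite{deepsad}: with $n$ unlabeled samples $\mathbf{x}_i$ and $m$ labeled samples $(\tilde{\mathbf{x}}_j, \tilde{y}_j)$, where $\tilde{y}_j = +1$ marks known normal and $\tilde{y}_j = -1$ known anomalous data, the network weights $\mathcal{W}$ are trained via
\[
\min_{\mathcal{W}} \quad \frac{1}{n}\sum_{i=1}^{n} \lVert \phi(\mathbf{x}_i;\mathcal{W}) - \mathbf{c} \rVert^2
+ \frac{\eta}{m}\sum_{j=1}^{m} \left( \lVert \phi(\tilde{\mathbf{x}}_j;\mathcal{W}) - \mathbf{c} \rVert^2 \right)^{\tilde{y}_j}
+ \frac{\lambda}{2}\sum_{\ell=1}^{L} \lVert \mathbf{W}^{\ell} \rVert_F^2 .
\]
The exponent $\tilde{y}_j$ pulls labeled normal samples toward the center and pushes labeled anomalies away from it, while the last term is a standard weight decay regularizer.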

%As a pre-training step an autoencoder architecture is trained and its weights are used to initialize its encoder part before training of the method itself begins. \citeauthor{deepsad} argue in~\cite{deepsad} that this pre-training step, which was already present in~\cite{deepsvdd}, allows them to not only interpret the method in geometric terms as minimum volume estimation but also in probabilistic terms as entropy minimization over the latent distribution, since the autoencoding objective implicitly maximizes the mutual information between the data and its latent space representation. This insight - that the method follows the Infomax principle with the additional objective of the latent distribution having minimal entropy - allowed \citeauthor{deepsad} to introduce an additional term in Deep SAD's objective, over Deep SVDD's, which incorporates labeled data to better model the nature of normal and anomalous data. They show that Deep SAD's objective can be interpreted as normal data's distribution in the latent space being modeled to have low entropy and anomalous data's distribution in that latent space being modeled as having high entropy, which they argue captures the nature of the difference between normal and anomalous data by interpreting anomalies ``as being generated from an infinite mixture of distributions that are different from normal data distribution''~\cite{deepsad}.

As a pre-training step, an autoencoder is trained and its encoder weights are used to initialize the model before beginning the main training phase. \citeauthor{deepsad} argue in \cite{deepsad} that this pre-training step—originally introduced in \cite{deepsvdd}—not only provides a geometric interpretation of the method as minimum volume estimation but also a probabilistic one as entropy minimization over the latent distribution. The autoencoding objective implicitly maximizes the mutual information between the data and its latent representation, aligning the approach with the Infomax principle while encouraging a latent space with minimal entropy. This insight enabled \citeauthor{deepsad} to introduce an additional term in Deep SAD's objective, beyond that of Deep SVDD, which incorporates labeled data to better capture the characteristics of normal and anomalous data. They demonstrate that Deep SAD's objective effectively models the latent distribution of normal data as having low entropy, while that of anomalous data is characterized by higher entropy. In this framework, anomalies are interpreted as being generated from an infinite mixture of distributions that differ from the normal data distribution.

\todo[inline, color=green!40]{Core idea of the algorithm is to learn a transformation to map input data into a latent space where normal data clusters close together and anomalous data gets mapped further away. to achieve this the method first includes a pretraining step of an autoencoder to extract the most relevant information, second it fixes a hypersphere center in the autoencoder's latent space as a target point for normal data and third it trains the network to map normal data closer to that hypersphere center. fourth, the resulting network can map new data into this latent space and interpret its distance from the hypersphere center as an anomaly score which is larger the more anomalous the datapoint is}
\todo[inline, color=green!40]{explanation pre-training step: architecture of the autoencoder is dependent on the input data shape, but any data shape is generally permissible. for the autoencoder we do not need any labels since the optimization target is always the input itself. the latent space dimensionality can be chosen based on the input data's complexity (search citations). generally a higher dimensional latent space has more learning capacity but tends to overfit more easily (find cite). the pre-training step is used to find weights for the encoder which generally extract robust and critical data from the input because TODO read deepsad paper (cite deepsad). as training data typically all data (normal and anomalous) is used during this step.}
\todo[inline, color=green!40]{explanation hypersphere center step: an additional positive ramification of the pretraining is that the mean of all latent representations from pre-training can be used as the hypersphere target around which normal data is supposed to cluster. this is advantageous because it allows the main training to converge faster than choosing a random point in the latent space as hypersphere center. from this point onward the center C is fixed for the main training and inference and does not change anymore.}
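
A minimal sketch of this initialization step (assuming a PyTorch encoder and data loader; the names are illustrative, not taken from the reference implementation):
\begin{verbatim}
import torch

@torch.no_grad()
def init_center(encoder, loader):
    # Hypersphere center c: mean of the latent representations of
    # all training data under the pre-trained encoder. c stays
    # fixed during the main training phase and at inference time.
    latents = [encoder(x) for x, _ in loader]
    return torch.cat(latents).mean(dim=0)
\end{verbatim}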
@@ -337,7 +348,7 @@ For training, explicit labels are generally not required because the semi-superv
%\todo[inline, color=green!40]{list sensors on the platform}
%Based on the previously discussed requirements and labeling difficulties we decided to train and evaluate the methods on \emph{Multimodal Dataset from Harsh Sub-Terranean Environment with Aerosol Particles for Frontier Exploration}~\cite{subter}. The dataset is comprised of data from multiple sensors on a moving sensor platform which was driven through tunnels and rooms in a subterranean setting. What makes it especially fitting for our use case is that during some of the experiments, an artificial smoke machine was employed to simulate aerosol particles.
%The sensors employed during capture of the dataset include:
Based on the previously discussed requirements and the challenges of obtaining reliable labels, we selected the \emph{Multimodal Dataset from Harsh Sub-Terranean Environment with Aerosol Particles for Frontier Exploration}~\cite{subter} for training and evaluation. This dataset comprises multimodal sensor data collected from a moving platform navigating tunnels and rooms in a subterranean environment. Notably, some experiments incorporated an artificial smoke machine to simulate aerosol particles, making the dataset particularly well-suited to our use case. The sensors used during data capture include:\todo[inline, color=green!40]{refer to sketch with numbers}

\begin{itemize}
\item Lidar - Ouster OS1-32
@@ -350,8 +361,18 @@ Based on the previously discussed requirements and the challenges of obtaining r

%We mainly utilize the data from the \emph{Ouster OS1-32} lidar sensor, which produces 10 frames per second with a resolution of 32 vertical channels by 2048 measurements per channel, both equiangularly spaced over the vertical and horizontal fields of view of 42.4° and 360° respectively. Every measurement of the lidar therefore results in a point cloud with a maximum of 65536 points. Every point contains the \emph{X}, \emph{Y} and \emph{Z} coordinates in meters with the sensor location as origin, as well as values for the \emph{range}, \emph{intensity} and \emph{reflectivity} which are typical data measured by lidar sensors. The data is dense, meaning missing measurements are still present in the data of each point cloud with zero values for most fields.

\todo[inline, color=green!40]{short description of sensor platform and refer to photo}

We use data from the \emph{Ouster OS1-32} LiDAR sensor, which was configured to capture 10 frames per second with a resolution of 32 vertical channels and 2048 measurements per channel. These settings yield equiangular measurements across a vertical field of view of 42.4° and a complete 360° horizontal field of view. Consequently, every LiDAR scan can generate up to 65,536 points. Each point contains the \emph{X}, \emph{Y}, and \emph{Z} coordinates (in meters, with the sensor location as the origin) along with values for \emph{range}, \emph{intensity}, and \emph{reflectivity}—typical metrics measured by LiDAR sensors. Although the dataset is considered dense, each point cloud still contains missing measurements, with fields of these missing measurements registering as zero.
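
Because missing measurements are zero-filled, they can be masked out before any statistics are computed. A short sketch (hypothetical NumPy code; the per-point \texttt{range} field name is an assumption about the dataset layout):
\begin{verbatim}
import numpy as np

def valid_mask(ranges):
    # In the dense point clouds a zero range marks a missing
    # measurement (no return), so a nonzero range means valid.
    return np.asarray(ranges) > 0.0

# e.g. fraction of valid returns in a scan of up to 65536 points:
# valid_mask(scan["range"]).mean()
\end{verbatim}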

\begin{figure}
\centering
\subfigure{\includegraphics[width=0.45\textwidth]{figures/data_subter_platform_photo.jpg}\label{fig:subter_platform_photo}}%
\hfill
\subfigure{\includegraphics[width=0.45\textwidth]{figures/data_subter_platform_sketch.png}\label{fig:subter_platform_sketch}}%
\caption{\todo[inline, color=green!40]{better caption} 1-OS1-32, 2-mmWave RADARs, 3-M1600, 4-OAK-D Pro, 5-LED, 6-IMU, and 7-Intel NUC. Reproduced from~\cite{subter}.}\label{fig:subter_platform}
\end{figure}

%During the measurement campaign 14 experiments were conducted, of which 10 did not contain the utilization of the artificial smoke machine and 4 which did contain the artificial degradation, henceforth referred to as normal and anomalous experiments respectively. During 13 of the experiments the sensor platform was in near constant movement (sometimes translation - sometimes rotation) with only 1 anomalous experiment having the sensor platform stationary. This means we do not have 2 stationary experiments to directly compare the data from a normal and an anomalous experiment, where the sensor platform was not moved, nonetheless the general experiments are similar enough for direct comparisons. During anomalous experiments the artificial smoke machine appears to have been running for some time before data collection, since in camera images and lidar data alike, the water vapor appears to be distributed quite evenly throughout the closer perimeter of the smoke machine. The stationary experiment is also unique in that the smoke machine is quite close to the sensor platform and actively produces new smoke, which is dense enough for the lidar data to see the surface of the newly produced water vapor as a solid object.

During the measurement campaign, 14 experiments were conducted—10 without the artificial smoke machine (hereafter referred to as normal experiments) and 4 with it (anomalous experiments). In 13 of these experiments, the sensor platform was in near-constant motion (either translating or rotating), with only one anomalous experiment conducted while the platform remained stationary. Although this means we do not have two stationary experiments for a direct comparison between normal and anomalous conditions, the overall experiments are similar enough to allow for meaningful comparisons.
@@ -401,11 +422,11 @@ While the density of these near-sensor returns might be used to estimate data qu
%\todo{describe how 3d lidar data was preprocessed (2d projection), labeling}
%\todo[inline]{screenshots of 2d projections?}

%\todo[inline, color=green!40]{while as described in sec X the method Deep SAD is not dependent on any specific type/structure of data it requires to train an auto encoder in the pretraining step. such autoencoders are better understood in the image domain since there are many use cases for this such as X (TODO citation needed), there are also 3d data auto encoders such as X (todo find example). same as the reference paper (rain cite) we chose to transform the 3d data to 2d by using a spherical projection to map each of the 3d points onto a 2d plane where the range of each measurement can be expressed as the brightness of a single pixel. this leaves us with a 2d image of resolution 32x2048 (channels by horizontal measurements), which is helpful for visualization as well as for choosing a simpler architecture for the autoencoder of deepsad. the data in the rosbag is sparse meaning that measurements of the lidar which did not produce any value (no return ray detected before sensor specific timeout) are simply not present in the lidar scan. meaning we have at most 65xxx measurements per scan but mostly fewer than this, (maybe statistic about this? could also be interesting to show smoke experiment stuff)}

%As described in section~\ref{sec:algorithm_description} the method we want to evaluate is datatype agnostic and can be adjusted to work with any kind of data. The data from~\cite{subter} that we will train on is a point cloud per scan created by the lidar sensor which contains up to 65536 points with \emph{X}, \emph{Y}, and \emph{Z} coordinates (in meters) per point. To adjust the architecture of Deep SAD to work with a specific datatype, we have to define an autoencoder architecture that works for the given datatype. While autoencoders can be created for any datatype, as~\cite{autoencoder_survey} points out over 60\% of research papers pertaining autoencoders in recent years look at image classification and reconstruction, so we have a better understanding of their architectures for two dimensional images than for three dimensional point clouds.

As described in Section~\ref{sec:algorithm_description}, the method under evaluation is data type agnostic and can be adapted to work with any kind of data. In our case, we train on point clouds from~\cite{subter}, where each scan produced by the LiDAR sensor contains up to 65,536 points, with each point represented by its \emph{X}, \emph{Y}, and \emph{Z} coordinates. To tailor the Deep SAD architecture to this specific data type, we would need to design an autoencoder suitable for processing three-dimensional point clouds. Although autoencoders can be developed for various data types, as noted in~\cite{autoencoder_survey}, over 60\% of recent research on autoencoders focuses on two-dimensional image classification and reconstruction. Consequently, there is a more established understanding of architectures for images compared to those for three-dimensional point clouds.

%\todo[inline, color=green!40]{to achieve this transformation we used the helpful measurement index and channel present in each measurement point of the dataset which allowed a perfect reconstruction of the 2d projection without calculating the pixel position in the projection of each measurement via angles, which in our experience typically leads to some ambiguity in the projection (multiple measurements mapping to the same pixel due to precision loss/other errors). the measurement index increases even for unavailable measurements (no ray return) so we can simply create the 2d projection by mapping the normalized range (FIXME really normalized) value to the pixel position y = channel, x = measurement index. by initializing the array to NaN values originally we have a 2d data structure with the range values and NaN on pixel positions where originally no measurement took place (missing measurements in scans due to no ray return)}
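
A minimal sketch of the projection described in the note above (hypothetical NumPy code; the per-point field names \texttt{channel}, \texttt{measurement\_id} and \texttt{range} are assumptions about the dataset layout):
\begin{verbatim}
import numpy as np

H, W = 32, 2048  # vertical channels x measurements per revolution

def project_scan(points):
    # Project one (sparse) lidar scan onto a dense 2D range image;
    # pixels without a return stay NaN.
    image = np.full((H, W), np.nan, dtype=np.float32)
    rows = points['channel'].astype(int)
    cols = points['measurement_id'].astype(int)
    image[rows, cols] = points['range']
    return image
\end{verbatim}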
@@ -442,7 +463,7 @@ Since an objective measure of degradation is unavailable, we explored alternativ
%\todo[inline]{TODO maybe evaluate based on different thresholds? missing datapoints, number of detected outliers, number of particles in phantom circle around sensor?}

\newchapter{experimental_setup}{Experimental Setup}
\newsection{autoencoder_architecture}{Deep SAD Autoencoder Architecture}
\newsection{data_setup}{Training/Evaluation Data Distribution}
\todo[inline]{which data was used how in training/evaluation}
\todo[inline]{explain concept of global/local application for global-/window quantification}

@@ -1,4 +1,4 @@
@article{anomaly_detection_survey,
title = {Anomaly detection: A survey},
author = {Varun Chandola and Arindam Banerjee and Vipin Kumar},
journal = {ACM Comput. Surv.},

BIN thesis/figures/data_subter_platform_photo.jpg (new file, 747 KiB; binary file not shown)
BIN thesis/figures/data_subter_platform_sketch.png (new file, 85 KiB; binary file not shown)