reworked AD background section

This commit is contained in:
Jan Kowalczyk
2025-05-01 10:26:48 +02:00
parent 6b0aed4da2
commit 09b474fa2d


@@ -273,7 +273,7 @@ This thesis tackles a broad, interdisciplinary challenge at the intersection of
Because anomalies are, by nature, unpredictable in form and structure, unsupervised learning methods are often preferred since they do not require pre-assigned labels, a significant advantage when dealing with unforeseen data patterns. However, these methods can be further refined through the integration of a small amount of labeled data, giving rise to semi-supervised approaches. The method evaluated in this thesis, DeepSAD, is a semi-supervised deep learning approach that also leverages an autoencoder architecture. Autoencoders have gained widespread adoption in deep learning for their ability to extract features from unlabeled data, which is particularly useful for handling complex data types such as LiDAR scans.
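To make the feature-extraction role of autoencoders more concrete, the following is a minimal sketch of an autoencoder in PyTorch, assuming a generic vector input; the layer sizes \texttt{in\_dim} and \texttt{code\_dim} are placeholders and this is not the encoder architecture used later in this thesis. The encoder compresses the input into a low-dimensional code and the decoder reconstructs the input from it, so the network can be trained on unlabeled data alone and the code can serve as an extracted feature representation.
\begin{verbatim}
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    """Minimal illustrative autoencoder (placeholder dimensions)."""
    def __init__(self, in_dim=1024, code_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, code_dim))
        self.decoder = nn.Sequential(
            nn.Linear(code_dim, 256), nn.ReLU(),
            nn.Linear(256, in_dim))

    def forward(self, x):
        code = self.encoder(x)     # extracted low-dimensional features
        return self.decoder(code)  # reconstruction of the input

# Training minimizes the reconstruction error on unlabeled samples, e.g.
# loss = nn.functional.mse_loss(model(x), x)
\end{verbatim}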
LiDAR sensors function by projecting lasers in multiple directions simultaneously, measuring the time it takes for each reflected ray to return. Using the angles and travel times, the sensor constructs a point cloud that is often dense enough to accurately map its surroundings. In the following sections, we will delve into these technologies, review how they work and their use cases, and describe how they are employed in this thesis.
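As a rough illustration of this time-of-flight principle, the following sketch converts round-trip times and beam angles into Cartesian points; the timing and angle values are made up for demonstration and do not correspond to any particular sensor used in this thesis.
\begin{verbatim}
import numpy as np

C = 299_792_458.0  # speed of light in m/s

def to_point_cloud(t_return, azimuth, elevation):
    """Convert round-trip times (s) and beam angles (rad) into x/y/z.
    The range is half the round-trip distance (spherical -> Cartesian)."""
    r = C * t_return / 2.0
    x = r * np.cos(elevation) * np.cos(azimuth)
    y = r * np.cos(elevation) * np.sin(azimuth)
    z = r * np.sin(elevation)
    return np.stack([x, y, z], axis=-1)

# Example with made-up measurements for three beams (roughly 10/20/30 m):
t = np.array([66.7e-9, 133.4e-9, 200.1e-9])
az = np.radians([0.0, 45.0, 90.0])
el = np.radians([0.0, 5.0, -5.0])
points = to_point_cloud(t, az, el)  # shape (3, 3): one x/y/z row per beam
\end{verbatim}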
\todo[inline, color=green!40]{mention related work + transition to anomaly detection}
@@ -298,23 +298,21 @@ Figure~\ref{fig:anomaly_detection_overview} depicts a simple but illustrative ex
\caption{An illustrative example of 2-dimensional data with clusters of normal data $N_1$ and $N_2$, two single anomalies $o_1$ and $o_2$, and a cluster of anomalies $O_3$. Reproduced from~\cite{anomaly_detection_survey}.}\label{fig:anomaly_detection_overview}
\end{figure}
By their very nature, anomalies are rare occurrences and oftentimes unpredictable in form, which makes it hard to define all possible anomalies in any system. It also makes it very challenging to create an algorithm capable of detecting anomalies which may never have occurred before and may not have been known to exist when the detection algorithm was created. There are many possible approaches to this problem, though they can be roughly grouped into six distinct categories based on the techniques used~\cite{anomaly_detection_survey}:
\begin{enumerate}
\item \textbf{Classification Based} - A classification technique such as an SVM or a suitable neural network is used to classify samples as either normal or anomalous based on labeled training data. Alternatively, if not enough labeled training data is available, a one-class classification algorithm can be employed. In that case, the algorithm assumes all training samples to be normal and learns a boundary around them to differentiate them from anomalous samples, which lie outside the learnt boundary.
\item \textbf{Clustering Based} - Clustering techniques such as K-Means or DBSCAN aim to group similar data into clusters, differentiating them from dissimilar data, which may belong to a different cluster or to no cluster at all. Anomaly detection methods from this category employ such a technique with the assumption that normal data will assemble into one or more clusters due to its similar properties, while anomalies may form their own smaller clusters, belong to no cluster at all, or at least lie an appreciable distance from the closest normal cluster's center.
\item \textbf{Nearest Neighbor Based} - Similar to the clustering based category, these techniques assume normal data is more closely clustered than anomalies and therefore use either a sample's distance to its $k^{th}$ nearest neighbor or the density of its local neighborhood to judge whether the sample is anomalous; a short sketch following this list illustrates this idea.
\item \textbf{Statistical} - These methods try to fit a statistical model of the normal behaviour to the data. Once the distribution from which normal data originates is defined, samples can be judged normal or anomalous based on their likelihood of arising from said distribution.
\item \textbf{Information Theoretic} - The main assumption behind information theoretic anomaly detection methods is that anomalies differ in their information content from normal data. An information theoretic measure is therefore used to determine irregularities in the data's information content, enabling the detection of anomalous samples.
\item \textbf{Spectral} - Spectral approaches assume that the data can be mapped into a lower-dimensional subspace in which normal data appears significantly different from anomalous data. To this end, a dimensionality reduction technique such as PCA is used to embed the data into such a subspace. Although it may be easier to differentiate normal and anomalous data there, a technique capable of this differentiation is still required, so spectral methods are oftentimes used as a pre-processing step followed by an anomaly detection method operating on the data's subspace, as also illustrated in the sketch following this list.
\end{enumerate}
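To make two of these categories more tangible, the following is a minimal sketch of a nearest neighbor based score and a PCA-based spectral pre-processing step, using scikit-learn on synthetic Gaussian data; the data and parameters are arbitrary and unrelated to the methods evaluated in this thesis.
\begin{verbatim}
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X_train = rng.normal(0.0, 1.0, size=(500, 10))  # assumed normal data
x_query = rng.normal(5.0, 1.0, size=(1, 10))    # an obvious outlier

# Nearest neighbor based: the anomaly score is the distance to the
# k-th nearest neighbor among the (assumed normal) training samples.
k = 5
knn = NearestNeighbors(n_neighbors=k).fit(X_train)
dist, _ = knn.kneighbors(x_query)
knn_score = dist[0, -1]  # larger distance -> more anomalous

# Spectral pre-processing: project the data onto a lower-dimensional
# PCA subspace; here the reconstruction error serves as a simple score.
pca = PCA(n_components=2).fit(X_train)
recon = pca.inverse_transform(pca.transform(x_query))
pca_score = np.linalg.norm(x_query - recon)
\end{verbatim}
In both cases a larger score indicates a sample that deviates more strongly from the assumed normal training data.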
In this thesis we use an anomaly detection method, namely Deep Semi-Supervised Anomaly Detection (DeepSAD)~\cite{deepsad}, to model our problem (how to quantify the degradation of LiDAR sensor data) as an anomaly detection problem. We do this by treating good quality data as normal and degraded data as anomalous, and rely on a method which can express each sample's likelihood of being anomalous as an analog anomaly score, which enables us to interpret that score as the data's degradation quantification value.
Chapter~\ref{chp:deepsad} describes DeepSAD in more detail and shows that it is a clustering based approach with a spectral pre-processing component: it uses a neural network to reduce the input's dimensionality while simultaneously clustering normal data closely around a given centroid. It then produces an anomaly score by calculating the geometric distance between a data sample and the aforementioned cluster centroid, assuming this distance is shorter for normal than for anomalous data. Since our data is high-dimensional, it makes sense to use a spectral method to reduce the data's dimensionality, and an approach which yields an analog value rather than a binary classification suits our use-case, since we want to quantify, not merely classify, the data degradation.
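Stated as a formula in generic notation (the precise definitions follow in Chapter~\ref{chp:deepsad}), the anomaly score of a sample $x$ is the distance between its network embedding and the fixed centroid $\mathbf{c}$:
\[
    s(x) = \| \phi(x; \mathcal{W}) - \mathbf{c} \|^2 ,
\]
where $\phi(\,\cdot\,; \mathcal{W})$ denotes the trained network with weights $\mathcal{W}$ mapping a sample into the lower-dimensional subspace, so a larger $s(x)$ corresponds to a stronger suspected degradation.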
%\todo[inline, color=green!40]{data availability leading into semi-supervised learning algorithms}