work on bg section, new references
@@ -316,7 +316,7 @@ Chapter~\ref{chp:deepsad} describes DeepSAD in more detail, which shows that it

%\todo[inline, color=green!40]{data availability leading into semi-supervised learning algorithms}

Modeling problems as anomaly detection tasks has proven successful in many domains similar to the one we research in this paper. The degradation of point clouds produced by an industrial 3D sensor is modeled as an anomaly detection task in~\cite{bg_ad_pointclouds_scans}. \citeauthor{bg_ad_pointclouds_scans} propose a student-teacher model capable of inferring a pointwise anomaly score for degradation in point clouds. The teacher network is trained on an anomaly-free dataset to extract dense features of the point clouds' local geometries, after which an identical student network is trained to emulate the teacher network's outputs. For degraded point clouds, the regression error between the teacher's and student's outputs is computed and interpreted as the anomaly score, the rationale being that the student network has not observed features produced by anomalous geometries during training, leaving it incapable of producing an output similar to the teacher's for those regions. Another example is~\cite{bg_ad_pointclouds_poles}, which proposes a method to detect and classify pole-like objects in urban point cloud data for autonomous driving purposes, differentiating between natural and man-made objects such as street signs. First, an anomaly detection method identifies vertical pole-like objects in the point clouds; the preprocessed objects are then grouped by similarity using a clustering algorithm and classified as either trees or man-made poles.

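To make the student-teacher scoring idea concrete, the following minimal sketch uses hypothetical stand-ins (not the networks from~\cite{bg_ad_pointclouds_scans}): a fixed, bounded non-linear "teacher" and a linear "student" fitted on anomaly-free data only, with the output discrepancy serving as the anomaly score:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the teacher: a fixed random non-linear
# feature extractor (bounded by tanh).
W = rng.normal(size=(3, 8))

def teacher(x):
    return np.tanh(x @ W)

# "normal" training data: points clustered near the origin
x_normal = rng.normal(scale=0.3, size=(500, 3))

# fit a linear student to imitate the teacher on normal data only
S, *_ = np.linalg.lstsq(x_normal, teacher(x_normal), rcond=None)

def anomaly_score(x):
    # pointwise discrepancy between teacher and student outputs
    return np.sum((teacher(x) - x @ S) ** 2, axis=1)

# far from the training data the student extrapolates poorly,
# so the discrepancy (anomaly score) grows
x_anomalous = rng.normal(scale=0.3, size=(50, 3)) + 5.0
print(anomaly_score(x_normal).mean() < anomaly_score(x_anomalous).mean())  # True
```

The student matches the teacher well only where it has seen data; on unseen (anomalous) regions the two disagree, which is exactly the scoring rationale described above.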
As briefly mentioned at the beginning of this section, anomaly detection methods are oftentimes challenged by the limited availability of anomalous data, owing to the very nature of anomalies as rare occurrences. Oftentimes the intended use case is even to find anomalies in a given dataset which have not yet been identified. In addition, it can be challenging to classify anomalies correctly for complex data, since the very definition of an anomaly depends on many factors, such as the type of data, the intended use case, or even how the data evolves over time. For these reasons, most anomaly detection approaches limit their reliance on anomalous data during training, and many of them do not differentiate between normal and anomalous data at all. DeepSAD is a semi-supervised method, characterized by using a mixture of labeled and unlabeled data.

@@ -351,7 +351,7 @@ As already shortly mentioned at the beginning of this section, anomaly detection

{explain what ML is, how the different approaches work, why to use semi-supervised}

{autoencoder special case (un-/self-supervised) used in DeepSAD $\rightarrow$ explain autoencoder}

Machine learning (ML) describes types of algorithms capable of learning from existing data to perform tasks on previously unseen data without being explicitly programmed to do so~\cite{machine_learning_first_definition}. Many kinds of machine learning methods exist, but neural networks are among the most commonly used and researched, owing to their versatility and domain-independent success over the last decades. They are composed of connected artificial neurons, modeled roughly after neurons and synapses in the brain.

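As a minimal illustration (all values chosen arbitrarily): an artificial neuron computes a weighted sum of its inputs plus a bias and passes the result through a non-linear activation function:

```python
import numpy as np

def sigmoid(z):
    # squashing activation, loosely analogous to a neuron's firing rate
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    # weighted sum of the incoming signals plus a bias,
    # passed through the non-linear activation
    return sigmoid(np.dot(w, x) + b)

x = np.array([0.5, -1.0, 2.0])   # incoming signals from connected neurons
w = np.array([0.8, 0.2, -0.4])   # connection ("synapse") weights
b = 0.1
y = neuron(x, w, b)
print(round(y, 3))               # 0.378
```

Networks stack many such neurons in layers; training adjusts the weights and biases iteratively to minimize an error on the training data.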
\todo[inline, color=green!40]{talk about neural networks, deep learning, backwards propagation, optimization goals, iterative process, then transition to the categories}
One way to categorize machine learning algorithms is by the nature of the feedback provided for the algorithm to learn. The most prominent of those categories are supervised learning, unsupervised learning and reinforcement learning.
@@ -372,8 +372,8 @@ A more interactive approach to learning is taken by reinforcement learning, whic
Semi-supervised learning algorithms form an intermediate category between supervised and unsupervised algorithms, in that they use a mixture of labeled and unlabeled data. Typically, vastly more unlabeled than labeled data is used during training of such algorithms, due to the effort and expertise required to label large quantities of data correctly. Semi-supervised methods are oftentimes an effort to improve a machine learning algorithm belonging to either the supervised or the unsupervised category. Supervised methods such as classification tasks are enhanced by using large amounts of unlabeled data to augment the supervised training without additional labeling work. Alternatively, unsupervised methods like clustering algorithms may not only use unlabeled data but also improve their performance by considering some hand-labeled data during training.

Machine learning based anomaly detection methods can utilize techniques from all of the aforementioned categories, although their usability varies depending on the available training data; as described in section~\ref{sec:anomaly_detection}, some methods do not use machine learning at all. While supervised anomaly detection methods exist, their suitability depends mostly on the availability of labeled training data and on a reasonable proportion between normal and anomalous samples. Both requirements can be challenging to meet, since labeling is often labour-intensive and anomalies, by their intrinsic rarity compared to normal data, are hard to capture in sufficient quantity. Semi-supervised anomaly detection methods are of special interest in that they may overcome these difficulties inherent to many anomaly detection tasks~\cite{semi_ad_survey}. These methods typically share the goal of unsupervised anomaly detection, which is to model the behaviour of the normal class and delimit it from anomalies, but they can incorporate some hand-labeled examples of normal and/or anomalous behaviour to improve their performance over fully unsupervised methods. DeepSAD is such a semi-supervised method, extending its unsupervised predecessor Deep SVDD by including some labeled samples during training. Both DeepSAD and Deep SVDD also utilize an autoencoder in a pre-training step, a machine learning architecture frequently grouped with unsupervised algorithms, even though that categorization can be contested when scrutinized in more detail, which we will do next.

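In short, DeepSAD augments Deep SVDD's one-class objective with an additional term over the $m$ labeled samples; the following sketch states the objective in the notation of the original DeepSAD publication (Chapter~\ref{chp:deepsad} introduces all quantities in detail):

```latex
% \phi(\cdot;\mathcal{W}) is the network, c the hypersphere center,
% x_1,\dots,x_n the unlabeled samples and
% (\tilde{x}_1,\tilde{y}_1),\dots,(\tilde{x}_m,\tilde{y}_m) the labeled
% samples with \tilde{y} = +1 (normal) or \tilde{y} = -1 (anomalous).
\begin{equation*}
  \min_{\mathcal{W}} \;
  \frac{1}{n+m} \sum_{i=1}^{n} \left\lVert \phi(x_i; \mathcal{W}) - c \right\rVert^2
  + \frac{\eta}{n+m} \sum_{j=1}^{m}
    \left( \left\lVert \phi(\tilde{x}_j; \mathcal{W}) - c \right\rVert^2 \right)^{\tilde{y}_j}
  + \frac{\lambda}{2} \sum_{\ell=1}^{L} \left\lVert \mathcal{W}^{\ell} \right\rVert_F^2
\end{equation*}
```

Here $\eta$ weights the labeled term: labeled normal samples ($\tilde{y}_j = +1$) are pulled towards the center $c$ like the unlabeled ones, while for labeled anomalies ($\tilde{y}_j = -1$) the inverse distance is penalized, pushing them away from $c$.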
\newsection{autoencoder}{Autoencoder}
@@ -390,7 +390,9 @@ Autoencoders are a type of neural network architecture, whose main goal is learn
\todo[inline, color=green!40]{explain figure}
\todo[inline, color=green!40]{Paragraph about Variational Autoencoders? generative models vs discriminative models, enables other common use cases such as generating new data by changing parameterized generative distribution in latent space - VAES are not really relevant, maybe leave them out and just mention them shortly, with the hint that they are important but too much to explain since they are not key knowledge for this thesis}
One key use case of autoencoders is to employ them as a dimensionality reduction technique. In that case, the latent space between the encoder and decoder is of lower dimensionality than the input data itself. Due to the aforementioned reconstruction goal, the shared information between the input data and its latent space representation is maximized, which is known as following the infomax principle. After training, such an autoencoder may be used to generate lower-dimensional representations of the given data type, enabling more performant computations which may have been infeasible on the original data. DeepSAD uses an autoencoder in a pre-training step to achieve this goal, among others.

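As an illustrative sketch of such a bottleneck (with synthetic data, and the closed-form linear special case, PCA via SVD, standing in for a trained deep autoencoder):

```python
import numpy as np

rng = np.random.default_rng(1)

# synthetic high-dimensional data that actually lies near a 5-dimensional
# subspace, loosely mimicking redundant sensor measurements
latent = rng.normal(size=(200, 5))
mixing = rng.normal(size=(5, 100))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 100))

# the optimal *linear* autoencoder is given in closed form by PCA:
# the encoder projects onto the top-k principal directions
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
encode = lambda x: (x - mean) @ Vt[:5].T   # 100-D input -> 5-D latent code
decode = lambda z: z @ Vt[:5] + mean       # 5-D code   -> 100-D reconstruction

Z = encode(X)
X_hat = decode(Z)
print(Z.shape)                             # (200, 5)
# reconstruction from the 5-D code is nearly lossless here,
# i.e. the code retains almost all information about the input
print(np.abs(X - X_hat).max() < 0.1)       # True
```

A deep autoencoder replaces the linear projections with non-linear encoder and decoder networks, but the role of the low-dimensional latent code is the same.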
Autoencoders have been shown to be useful in the anomaly detection domain under the assumption that an autoencoder trained on more normal than anomalous data is better at reconstructing normal behaviour than anomalous behaviour. This assumption allows methods to utilize the reconstruction error as an anomaly score. Examples of this are the outlier detection method in~\cite{bg_autoencoder_ad} and the anomaly detection method in~\cite{bg_autoencoder_ad_2}, which both employ an autoencoder together with the aforementioned assumption. Autoencoders have also been shown to be a suitable dimensionality reduction technique for lidar data, which is oftentimes high-dimensional and sparse, making feature extraction and dimensionality reduction popular preprocessing steps. As an example,~\cite{bg_autoencoder_lidar} shows the feasibility and advantages of using an autoencoder architecture to reduce the dimensionality of fused lidar-orthophoto features for a building detection method that recognizes buildings in visual data taken from an airplane. Similarly, we can make use of the dimensionality reduction in DeepSAD's pre-training step, since our method is intended to work with high-dimensional lidar data.

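The reconstruction-error scoring idea can be sketched as follows; for simplicity, a linear autoencoder fitted in closed form (PCA) on normal data only stands in for a trained deep autoencoder, and the data is synthetic:

```python
import numpy as np

rng = np.random.default_rng(2)

# "normal" training data lives near a 3-D subspace of a 20-D space
basis = np.linalg.qr(rng.normal(size=(20, 3)))[0]    # orthonormal subspace basis
X_train = rng.normal(size=(300, 3)) @ basis.T + 0.01 * rng.normal(size=(300, 20))

# fit a linear autoencoder (closed form via PCA) on normal data only
mean = X_train.mean(axis=0)
_, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
V = Vt[:3]                                           # learned 3-D bottleneck

def reconstruction_error(x):
    z = (x - mean) @ V.T                             # encode
    x_hat = z @ V + mean                             # decode
    return np.sum((x - x_hat) ** 2, axis=1)          # anomaly score

# normal samples reconstruct well, samples off the learned
# structure do not, so their score is higher
x_normal = rng.normal(size=(50, 3)) @ basis.T
x_anomalous = rng.normal(size=(50, 20))              # ignores the subspace
print(reconstruction_error(x_normal).mean()
      < reconstruction_error(x_anomalous).mean())    # True
```

Thresholding this score separates normal from anomalous samples, which is precisely how the cited reconstruction-based methods flag anomalies.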
%Another way to employ autoencoders is to use them as a generative technique. The decoder in autoencoders is trained to reproduce the input state from its encoded representation, which can also be interpreted as the decoder being able to generate data of the input type, from an encoded representation. A classic autoencoder trains the encoder to map its input to a single point in the latent space-a distriminative modeling approach, which can succesfully learn a predictor given enough data. In generative modeling on the other hand, the goal is to learn the distribution the data originates from, which is the idea behind variational autoencoders (VAE). VAEs have the encoder produce an distribution instead of a point representation, samples from which are then fed to the decoder to reconstruct the original input. The result is the encoder learning to model the generative distribution of the input data, which enables new usecases, due to the latent representation
@@ -116,7 +116,7 @@
               on MNIST and CIFAR-10 image benchmark datasets as well as on the
               detection of adversarial examples of GTSRB stop signs.},
},
@inproceedings{deep_svdd,
  title = {Deep One-Class Classification},
  author = {Ruff, Lukas and Vandermeulen, Robert and Goernitz, Nico and Deecke, Lucas and Siddiqui, Shoaib Ahmed and Binder, Alexander and M{\"u}ller
@@ -368,5 +368,93 @@
  year = {2015},
  month = sep,
  pages = {12680–12703},
},

@article{semi_ad_survey,
  title = {Semi-supervised anomaly detection algorithms: A comparative summary and future research directions},
  volume = {218},
  ISSN = {0950-7051},
  url = {http://dx.doi.org/10.1016/j.knosys.2021.106878},
  DOI = {10.1016/j.knosys.2021.106878},
  journal = {Knowledge-Based Systems},
  publisher = {Elsevier BV},
  author = {Villa-Pérez, Miryam Elizabeth and Álvarez-Carmona, Miguel Á. and Loyola-González, Octavio and Medina-Pérez, Miguel Angel and Velazco-Rossell, Juan Carlos and Choo, Kim-Kwang Raymond},
  year = {2021},
  month = apr,
  pages = {106878},
},

@inbook{bg_autoencoder_ad,
  title = {Outlier Detection with Autoencoder Ensembles},
  ISBN = {9781611974973},
  url = {http://dx.doi.org/10.1137/1.9781611974973.11},
  DOI = {10.1137/1.9781611974973.11},
  booktitle = {Proceedings of the 2017 SIAM International Conference on Data Mining},
  publisher = {Society for Industrial and Applied Mathematics},
  author = {Chen, Jinghui and Sathe, Saket and Aggarwal, Charu and Turaga, Deepak},
  year = {2017},
  month = jun,
  pages = {90–98},
},

@inproceedings{bg_autoencoder_ad_2,
  title = {Memorizing Normality to Detect Anomaly: Memory-Augmented Deep Autoencoder for Unsupervised Anomaly Detection},
  url = {http://dx.doi.org/10.1109/ICCV.2019.00179},
  DOI = {10.1109/iccv.2019.00179},
  booktitle = {2019 IEEE/CVF International Conference on Computer Vision (ICCV)},
  publisher = {IEEE},
  author = {Gong, Dong and Liu, Lingqiao and Le, Vuong and Saha, Budhaditya and Mansour, Moussa Reda and Venkatesh, Svetha and Van Den Hengel, Anton},
  year = {2019},
  month = oct,
  pages = {1705–1714},
},

@article{bg_autoencoder_lidar,
  title = {Deep Learning Approach for Building Detection Using LiDAR–Orthophoto Fusion},
  volume = {2018},
  ISSN = {1687-7268},
  url = {http://dx.doi.org/10.1155/2018/7212307},
  DOI = {10.1155/2018/7212307},
  journal = {Journal of Sensors},
  publisher = {Wiley},
  author = {Nahhas, Faten Hamed and Shafri, Helmi Z. M. and Sameen, Maher Ibrahim and Pradhan, Biswajeet and Mansor, Shattri},
  year = {2018},
  month = aug,
  pages = {1–12},
},

@article{lidar_denoising_survey,
  title = {LiDAR Denoising Methods in Adverse Environments: A Review},
  volume = {25},
  ISSN = {2379-9153},
  url = {http://dx.doi.org/10.1109/JSEN.2025.3526175},
  DOI = {10.1109/jsen.2025.3526175},
  number = {5},
  journal = {IEEE Sensors Journal},
  publisher = {Institute of Electrical and Electronics Engineers (IEEE)},
  author = {Park, Ji-Il and Jo, SeungHyeon and Seo, Hyung-Tae and Park, Jihyuk},
  year = {2025},
  month = mar,
  pages = {7916–7932},
},

@inproceedings{lidar_subt_dust_removal,
  title = {Efficient Real-time Smoke Filtration with 3D LiDAR for Search and Rescue with Autonomous Heterogeneous Robotic Systems},
  url = {http://dx.doi.org/10.1109/IECON51785.2023.10312303},
  DOI = {10.1109/iecon51785.2023.10312303},
  booktitle = {IECON 2023 - 49th Annual Conference of the IEEE Industrial Electronics Society},
  publisher = {IEEE},
  author = {Kyuroson, Alexander and Koval, Anton and Nikolakopoulos, George},
  year = {2023},
  month = oct,
  pages = {1–7},
}