reworked background semi-supervised section
@@ -269,11 +269,11 @@ The method we employ produces an analog score that reflects the confidence in th
{state which background subsections will follow + why and mention related work}
{necessity of knowledge and order of subsections are explained $\rightarrow$ essential background}
This thesis tackles a broad, interdisciplinary challenge at the intersection of robotics, embedded systems, and data science. In this chapter, we introduce the background of anomaly detection—the framework we use to formulate our degradation quantification problem. Anomaly detection has its roots in statistical analysis and has been successfully applied in various domains. Recently, the incorporation of learning-based techniques, particularly deep learning, has enabled more efficient and effective analysis of large datasets.
This thesis tackles a broad, interdisciplinary challenge at the intersection of robotics, computer vision, and data science. In this chapter, we introduce the background of anomaly detection, the framework we use to formulate our degradation quantification problem. Anomaly detection has its roots in statistical analysis and has been successfully applied in various domains. Recently, the incorporation of learning-based techniques, particularly deep learning, has enabled more efficient and effective analysis of large datasets.
Because anomalies are, by nature, unpredictable in form and structure, unsupervised learning methods are often preferred since they do not require pre-assigned labels—a significant advantage when dealing with unforeseen data patterns. However, these methods can be further refined through the integration of a small amount of labeled data, giving rise to semi-supervised approaches. The method evaluated in this thesis, DeepSAD, is a semi-supervised deep learning approach that also leverages an autoencoder architecture. Autoencoders have gained widespread adoption in deep learning for their ability to extract features from unlabeled data, which is particularly useful for handling complex data types such as LiDAR scans.
Because anomalies are, by nature, often unpredictable in form and structure, unsupervised learning methods are widely used since they do not require pre-assigned labels—a significant advantage when dealing with unforeseen data patterns. However, these methods can be further refined through the integration of a small amount of labeled data, giving rise to semi-supervised approaches. The method evaluated in this thesis, DeepSAD, is a semi-supervised deep learning approach that also leverages an autoencoder architecture in its design. Autoencoders have gained widespread adoption in deep learning for their ability to extract features from unlabeled data, which is particularly useful for handling complex data types such as LiDAR scans.
LiDAR sensors function by projecting lasers in multiple directions simultaneously, measuring the time it takes for each reflected ray to return. Using the angles and travel times, the sensor constructs a point cloud that is often dense enough to accurately map its surroundings. In the following sections, we will delve into these technologies, review how they work and their use cases, and describe how they are employed in this thesis.
LiDAR sensors function by projecting lasers in multiple directions near-simultaneously, measuring the time it takes for each reflected ray to return. Using the angles and travel times, the sensor constructs a point cloud that is often accurate enough to map the sensor's surroundings. In the following sections, we will delve into these technologies, review how they work, how they are generally used, and how they are employed in this thesis. We will also review research from these fields that relates to our thesis.
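To make the geometry concrete, the following minimal Python sketch converts a single idealized LiDAR return (beam angles plus round-trip travel time) into a 3D point. The names and the spherical-coordinate convention are illustrative choices for this example; real sensors additionally apply per-beam calibration, timing corrections, and filtering.

\begin{verbatim}
import math

C = 299_792_458.0  # speed of light in m/s

def lidar_return_to_point(azimuth_rad, elevation_rad, time_of_flight_s):
    """Convert beam angles and round-trip travel time to an (x, y, z) point."""
    distance = C * time_of_flight_s / 2.0        # halve: the ray travels out and back
    planar = distance * math.cos(elevation_rad)  # projection onto the x-y plane
    return (planar * math.cos(azimuth_rad),      # x
            planar * math.sin(azimuth_rad),      # y
            distance * math.sin(elevation_rad))  # z

# A full scan repeats this over all beams and angles to build the point cloud.
\end{verbatim}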
\todo[inline, color=green!40]{mention related work + transition to anomaly detection}
@@ -351,28 +351,45 @@ As already shortly mentioned at the beginning of this section, anomaly detection
{explain what ML is, how the different approaches work, why to use semi-supervised}
{autoencoder special case (un-/self-supervised) used in DeepSAD $\rightarrow$ explain autoencoder}
Machine learning (ML) defines types of algorithms capable of learning from existing data to perform tasks on previously unseen data without being explicitly programmed to do so~\cite{machine_learning_first_definition}. Many kinds of machine learning methods exist, but neural networks are among the most commonly used and researched, due to their versatility and domain-independent success over the last decades. They are composed of connected artificial neurons, modeled roughly after neurons and synapses in the brain.
\todo[inline, color=green!40]{talk about neural networks, deep learning, backwards propagation, optimization goals, iterative process, then transition to the categories}
One way to categorize machine learning algorithms is by the nature of the feedback provided for the algorithm to learn. The most prominent of those categories are supervised learning, unsupervised learning and reinforcement learning.
%Machine learning defines types of algorithms capable of learning from existing data to perform tasks on previously unseen data without being explicitly programmed to do so~\cite{machine_learning_first_definition}. Among the techniques employed in machine learning algorithms, neural networks have become especially prominent over the past few decades due to their flexibility and ability to achieve state-of-the-art results across a wide variety of domains. They are most commonly composed of layers of interconnected artificial neurons. Each neuron computes a weighted sum of its inputs, adds a bias term, and then applies a nonlinear activation function (such as ReLU, sigmoid, or tanh) to produce its output. These layers are typically organized into three types:
\todo[inline, color=green!40]{rewrite last paragraph to be more generally about ML first, talk about neural networks, deep learning, backwards propagation, optimization goals, iterative process, then transition to the categories}
Machine learning refers to algorithms capable of learning patterns from existing data to perform tasks on previously unseen data, without being explicitly programmed to do so~\cite{machine_learning_first_definition}. Central to many approaches is the definition of an objective function that measures how well the model is performing. The model’s parameters are then adjusted to optimize this objective. By leveraging these data-driven methods, machine learning can handle complex tasks across a wide range of domains.
For supervised learning each data sample is augmented by including a label depicting the ideal output the algorithm can produce for the given input. During the learning step these algorithms can compare their generated output with the one provided by an expert and calculate the error between them, minimizing the error to improve performance. Such labels are typically either a categorical or continuous target which are most commonly used for classification and regression tasks respectively.
Among the techniques employed in machine learning, neural networks have become especially prominent over the past few decades due to their ability to achieve state-of-the-art results across a wide variety of domains. They are most commonly composed of layers of interconnected artificial neurons. Each neuron computes a weighted sum of its inputs, adds a bias term, and then applies a nonlinear activation function, enabling the network to model complex non-linear relationships. These layers are typically organized into three types, illustrated by a short code sketch after the following list:
\begin{itemize}
\item Input layer, which receives raw data.
\item Hidden layers, where the network transforms and extracts complex features by combining signals through successive nonlinear operations. Networks with at least two hidden layers are typically called deep learning networks.
\item Output layer, which produces the network’s final prediction.
\end{itemize}
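As a toy illustration of this neuron model (weighted sum, bias, nonlinearity) and of stacking the three layer types, consider the following NumPy sketch; the layer sizes and random weights are arbitrary choices for the example rather than part of any particular architecture.

\begin{verbatim}
import numpy as np

def layer(x, W, b, activation=np.tanh):
    # one layer: weighted sum of the inputs plus bias, then a nonlinearity
    return activation(W @ x + b)

rng = np.random.default_rng(0)
x = rng.normal(size=3)                               # input layer: raw data
h = layer(x, rng.normal(size=(4, 3)), np.zeros(4))   # hidden layer: features
y = layer(h, rng.normal(size=(1, 4)), np.zeros(1))   # output layer: prediction
\end{verbatim}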
As outlined above, neural network training is formulated as an optimization problem: we define an objective function that measures how well the model is achieving its task and then we adjust the network’s parameters to optimize that objective. The most common approach is stochastic gradient descent (SGD) or one of its variants (e.g., Adam). In each training iteration, the network first performs a forward pass to compute its outputs and evaluate the objective, then a backward pass—known as backpropagation—to calculate gradients of the objective with respect to every weight in the network. These gradients indicate the direction in which each weight should change to improve performance, and the weights are updated accordingly. Repeating this process over many iterations (or epochs) allows the network to progressively refine its parameters and better fulfill its task.
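The following sketch shows this loop for the simplest possible case, a linear model with two parameters trained on synthetic data under a mean-squared-error objective. The gradients are written out analytically here for transparency; an actual neural network obtains them via backpropagation through all of its layers.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=100)
y = 3.0 * X + 0.5 + 0.1 * rng.normal(size=100)   # synthetic targets

w, b, lr = 0.0, 0.0, 0.1                         # parameters, learning rate
for epoch in range(200):
    pred = w * X + b                             # forward pass
    err = pred - y                               # objective: mean squared error
    grad_w = 2.0 * np.mean(err * X)              # backward pass (analytic here)
    grad_b = 2.0 * np.mean(err)
    w -= lr * grad_w                             # gradient descent update
    b -= lr * grad_b
\end{verbatim}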
%To train neural networks, we express the task as an optimization problem: we define a loss function that quantifies the discrepancy between the network’s predictions and the ground-truth labels (or target values), and we seek to minimize this loss across a dataset. The most common method for doing so is stochastic gradient descent (SGD) or one of its variants (e.g., Adam). During each training iteration, the network performs a forward pass to compute its predictions and the associated loss and then a backward pass—known as backpropagation—to calculate gradients of the loss with respect to each weight in the network. These gradients indicate how to adjust the weights to reduce the loss, and the weights are updated accordingly. Over many iterations also known as epochs, the network progressively refines its parameters, improving its ability to optimally fulfil the given task.
Aside from the underlying technique, one can also categorize machine learning algorithms by the type of feedback provided during learning. Broadly speaking, there are three main categories: supervised, unsupervised, and reinforcement learning. Many other approaches do not exactly fit any of these categories and have spawned less common categories such as semi-supervised or self-supervised learning.
In supervised learning, each input sample is paired with a “ground-truth” label representing the desired output. During training, the model makes a prediction and a loss function quantifies the difference between the prediction and the true label. The learning algorithm then adjusts its parameters to minimize this loss, improving its performance over time. Labels are typically categorical (used for classification tasks, such as distinguishing “cat” from “dog”) or continuous (used for regression tasks, like predicting a temperature or distance).
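As a small numerical illustration of the two label types, the snippet below evaluates a squared-error loss for a continuous target and a cross-entropy loss for a categorical one; all values are made up for the example.

\begin{verbatim}
import numpy as np

# Regression: continuous target, squared-error loss.
y_true, y_pred = 21.5, 19.0              # e.g. a temperature prediction
squared_error = (y_pred - y_true) ** 2

# Classification: categorical target, cross-entropy loss.
probs = np.array([0.8, 0.2])             # predicted class probabilities
label = 0                                # ground truth: class 0 (e.g. "cat")
cross_entropy = -np.log(probs[label])
\end{verbatim}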
\fig{ml_supervised_learning}{figures/ml_supervised_learning_placeholder.jpg}{PLACEHOLDER - An illustration of supervised learning: the training data is augmented to include the algorithm's optimal output for each data sample, called a label.}
Unsupervised learning algorithms use raw data without a target label that can be used during the learning process. These types of algorithms are often utilized to identify underlying patterns in data which may be hard to discover using classical data analysis due to for example large data size or high data complexity. Cluster analysis depicts one common use case, in which data is grouped into clusters such that data from one cluster resembles other data from the same cluster more closely than data from other clusters, according to some predesignated criteria. Another important use case are dimensionality reduction tasks which transform high-dimensional data into a lower-dimensional subspace while retaining meaningful information of the original data.
In unsupervised learning, models work directly with raw data, without any ground-truth labels to guide the learning process. Instead, they optimize an objective that reflects the discovery of useful structure—whether that is grouping similar data points together or finding a compact representation of the data. For example, cluster analysis partitions the dataset into groups so that points within the same cluster are more similar to each other (according to a chosen similarity metric) than to points in other clusters. Dimensionality reduction methods, on the other hand, project high-dimensional data into a lower-dimensional space, optimizing for minimal loss of the original data’s meaningful information. By focusing purely on the data itself, unsupervised algorithms can reveal hidden patterns and relationships that might be difficult to uncover with manual analysis.
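To illustrate the clustering case, the following is a minimal k-means sketch in NumPy; it alternates between assigning points to their nearest center and moving each center to the mean of its points, and it deliberately ignores the empty-cluster and convergence handling a production implementation would need.

\begin{verbatim}
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign every point to its nearest center (squared Euclidean metric)
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        # move each center to the mean of its assigned points
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centers

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(50, 2)), rng.normal(size=(50, 2)) + 4.0])
labels, centers = kmeans(X, k=2)
\end{verbatim}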
\fig{ml_unsupervised_learning}{figures/ml_unsupervised_learning_placeholder.png}{PLACEHOLDER - An illustration of unsupervised learning-the training data does not contain any additional information like a label. The algorithm learns to group similar input data together.}
A more interactive approach to learning is taken by reinforcement learning, which provides the algorithm with an environment and an interpreter of the environment's state. During training the algorithm explores new possible actions and their impact on the provided environment. The interpreter can then reward or punish the algorithm based on the outcome of its actions. To improve the algorithm's capability it will try to maximize the rewards received from the interpreter, retaining some randomness so as to enable the exploration of different actions and their outcomes. Reinforcement learning is usually used for cases where an algorithm has to make sequences of decisions in complex environments e.g., autonomous driving tasks.
In reinforcement learning, the model—often called an agent—learns by interacting with an environment that provides feedback in the form of rewards or penalties. At each step, the agent observes the environment's state, selects an action, and an interpreter judges the action's outcome based on how the environment changed, providing a scalar reward or penalty that reflects the desirability of that outcome. The agent's objective is to adjust its decision-making strategy to maximize the cumulative reward over time, balancing exploration of new actions with exploitation of known high-reward behaviors. This trial-and-error approach is well suited to sequential decision problems in complex settings, such as autonomous navigation or robotic control, where each choice affects both the immediate state and future possibilities.
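The sketch below makes this loop concrete with tabular Q-learning on a hypothetical five-state corridor in which only the rightmost state yields a reward; the environment and hyperparameters are invented for the example and unrelated to the tasks in this thesis.

\begin{verbatim}
import numpy as np

n_states, n_actions = 5, 2              # corridor states; actions: left, right
Q = np.zeros((n_states, n_actions))     # value estimate per (state, action)
alpha, gamma, epsilon = 0.5, 0.9, 0.2   # learning rate, discount, exploration
rng = np.random.default_rng(0)

for episode in range(200):
    s = 0
    while s != n_states - 1:
        # explore a random action sometimes, otherwise exploit the best known one
        a = rng.integers(n_actions) if rng.random() < epsilon else int(np.argmax(Q[s]))
        s_next = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
        reward = 1.0 if s_next == n_states - 1 else 0.0   # interpreter's feedback
        # update the estimate toward reward plus discounted future value
        Q[s, a] += alpha * (reward + gamma * Q[s_next].max() - Q[s, a])
        s = s_next
\end{verbatim}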
\todo[inline, color=green!40]{illustration reinforcement learning}
Semi-supervised learning algorithms are an in-between category of supervised and unsupervised algorithms, in that they use a mixture of labeled and unlabeled data. Typically, vastly more unlabeled than labeled data is used during training of such algorithms, due to the effort and expertise required to label large quantities of data correctly. Semi-supervised methods are often an effort to improve a machine learning algorithm belonging to either the supervised or the unsupervised category. Supervised methods such as classification tasks are enhanced by using large amounts of unlabeled data to augment the supervised training without additional labeling work. Alternatively, unsupervised methods like clustering algorithms may not only use unlabeled data but improve their performance by considering some hand-labeled data during training.
%Semi-Supervised learning algorithms are an in-between category of supervised and unsupervised algorithms, in that they use a mixture of labeled and unlabeled data. Typically vastly more unlabeled data is used during training of such algorithms than labeled data, due to the effort and expertise required to label large quantities of data correctly. The type of task performed by semi-supervised methods can originate from either the supervised learning or unsupervised learning domain. For classification tasks which are oftentimes achieved using supervised learning the additional unsupervised data is added during training with the hope to achieve a better outcome than when training only with the supervised portion of the data. In contrast for unsupervised learning use cases such as clustering algorithms, the addition of labeled samples can help guide the learning algorithm to improve performance over fully unsupervised training.
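One simple way to see how unlabeled data can augment a supervised learner is self-training (pseudo-labeling), sketched below with a nearest-centroid classifier on made-up two-cluster data. This is only an illustrative scheme, not the semi-supervised method evaluated in this thesis.

\begin{verbatim}
import numpy as np

def fit_centroids(X, y):
    # one centroid per class: the mean of that class's samples
    return np.array([X[y == c].mean(axis=0) for c in np.unique(y)])

def predict(X, centroids):
    return np.argmin(((X[:, None] - centroids) ** 2).sum(-1), axis=1)

rng = np.random.default_rng(0)
X_lab = rng.normal(size=(10, 2)) + np.repeat([[0, 0], [4, 4]], 5, axis=0)
y_lab = np.repeat([0, 1], 5)                      # few labeled samples
X_unlab = rng.normal(size=(200, 2)) + rng.choice([[0, 0], [4, 4]], size=200)

centroids = fit_centroids(X_lab, y_lab)           # supervised step
pseudo = predict(X_unlab, centroids)              # pseudo-label unlabeled data
X_all = np.vstack([X_lab, X_unlab])               # retrain on the union
y_all = np.concatenate([y_lab, pseudo])
centroids = fit_centroids(X_all, y_all)
\end{verbatim}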
Machine learning based anomaly detection methods can utilize techniques from all of the aforementioned categories, although their usability varies depending on the available training data. While supervised anomaly detection methods exist, their suitability depends mostly on the availability of labeled training data and on a reasonable proportionality between normal and anomalous data. Both requirements can be challenging due to labeling often being labour intensive and the anomalies' intrinsic property to occur rarely when compared to normal data, making capture of enough anomalous behaviour a hard problem. Semi-supervised anomaly detection methods are of special interest in that they may overcome these difficulties inherently present in many anomaly detection tasks~\cite{semi_ad_survey}. These methods typically have the same goal as unsupervised anomaly detection methods, which is to model the normal class behaviour and delimit it from anomalies, but they can incorporate some hand-labeled examples of normal and/or anomalous behaviour to improve their performance over fully unsupervised methods. DeepSAD is a semi-supervised method which extends its unsupervised predecessor Deep SVDD by including some labeled samples during training. Both DeepSAD and Deep SVDD also utilize an autoencoder in a pre-training step, a machine learning architecture frequently grouped with unsupervised algorithms, even though that definition can be contested when scrutinizing it in more detail, which we will do next.
Machine learning based anomaly detection methods can utilize techniques from all of the aforementioned categories, although their usability varies depending on the available training data. While supervised anomaly detection methods exist, their suitability depends mostly on the availability of labeled training data as well as a reasonable proportionality between normal and anomalous data. Both requirements can be challenging: labeling is often labour intensive, and anomalies intrinsically occur rarely compared to normal data, making it hard to capture enough anomalous behaviour. Semi-supervised anomaly detection methods are of special interest in that they may overcome these difficulties inherently present in many anomaly detection tasks~\cite{semi_ad_survey}. These methods typically share the goal of unsupervised anomaly detection methods, which is to model the normal class behaviour and delimit it from anomalies, but they can incorporate some hand-labeled examples of normal and/or anomalous behaviour to improve their performance over fully unsupervised methods. DeepSAD is a semi-supervised method which extends its unsupervised predecessor Deep SVDD by including some labeled samples during training. Both DeepSAD and Deep SVDD also utilize an autoencoder in a pre-training step. The autoencoder is a machine learning architecture frequently grouped with unsupervised algorithms, even though that classification can be contested when scrutinized in more detail, which we will do next.
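For reference, the training objective of DeepSAD, as introduced by Ruff et al. and restated here to the best of our understanding, combines an unsupervised term over the $n$ unlabeled samples $x_i$ with a term over the $m$ labeled samples $(\tilde{x}_j, \tilde{y}_j)$, where $\tilde{y}_j = +1$ marks labeled normal and $\tilde{y}_j = -1$ labeled anomalous data:

\begin{equation*}
\min_{\mathcal{W}} \; \frac{1}{n+m} \sum_{i=1}^{n} \left\lVert \phi(x_i; \mathcal{W}) - c \right\rVert^2
+ \frac{\eta}{n+m} \sum_{j=1}^{m} \left( \left\lVert \phi(\tilde{x}_j; \mathcal{W}) - c \right\rVert^2 \right)^{\tilde{y}_j}
+ \frac{\lambda}{2} \sum_{\ell=1}^{L} \left\lVert W^{\ell} \right\rVert_F^2
\end{equation*}

Intuitively, unlabeled and labeled-normal samples are pulled toward the center $c$ of the learned representation $\phi$, labeled anomalies are pushed away from it, $\eta$ weights the labeled term, and $\lambda$ controls the network weight regularization.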
\newsection{autoencoder}{Autoencoder}