Reworked some semi-supervised section paragraphs

This commit is contained in:
Jan Kowalczyk
2025-04-30 17:09:48 +02:00
parent b2386910ec
commit 6b0aed4da2


@@ -359,9 +359,10 @@ Unsupervised learning algorithms use raw data without a target label that can be
A more interactive approach to learning is taken by reinforcement learning, which provides the algorithm with an environment and an interpreter of the environment's state. During training the algorithm explores new possible actions and their impact on the provided environment. The interpreter can then reward or punish the algorithm based on the outcome of its actions. To improve its capability, the algorithm tries to maximize the rewards received from the interpreter, while retaining some randomness so as to enable the exploration of different actions and their outcomes. Reinforcement learning is usually used for cases where an algorithm has to make sequences of decisions in complex environments, e.g., autonomous driving tasks.
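The reward-and-exploration loop described above can be sketched with tabular Q-learning on a toy chain environment. The environment, the reward of 1 at the rightmost state, and all hyperparameters are illustrative assumptions, not drawn from this text:

```python
import random

# Illustrative toy chain: states 0..4; action 1 moves right, action 0 moves left.
# Reaching state 4 yields reward 1 and ends the episode (assumed setup).
N_STATES, ACTIONS = 5, (0, 1)
Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, eps = 0.5, 0.9, 0.1  # learning rate, discount factor, exploration rate

def step(state, action):
    nxt = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

def greedy(state):
    best = max(Q[state])  # break ties randomly so untrained states still get explored
    return random.choice([a for a in ACTIONS if Q[state][a] == best])

random.seed(0)
for _ in range(500):  # training episodes
    s = random.randrange(N_STATES - 1)
    while s != N_STATES - 1:
        # epsilon-greedy: mostly exploit, but retain some randomness to explore
        a = random.choice(ACTIONS) if random.random() < eps else greedy(s)
        s_next, reward = step(s, a)
        # Q-learning update: nudge the estimate towards reward + discounted future value
        Q[s][a] += alpha * (reward + gamma * max(Q[s_next]) - Q[s][a])
        s = s_next

policy = [greedy(s) for s in range(N_STATES - 1)]  # learned action per state
```

The epsilon-greedy choice is the "retained randomness" from the paragraph above: without it the agent could settle on its first estimate and never discover the rewarding rightmost state.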
Semi-supervised learning algorithms are an in-between category of supervised and unsupervised algorithms, in that they use a mixture of labeled and unlabeled data. Typically, vastly more unlabeled than labeled data is used during the training of such algorithms, due to the effort and expertise required to label large quantities of data correctly. Semi-supervised methods are oftentimes an effort to improve a machine learning algorithm belonging to either the supervised or the unsupervised category. Supervised methods such as classification tasks are enhanced by using large amounts of unlabeled data to augment the supervised training without the need for additional labeling work. Alternatively, unsupervised methods like clustering algorithms may not only use unlabeled data but improve their performance by considering some hand-labeled data during training.
%Semi-Supervised learning algorithms are an inbetween category of supervised and unsupervised algorithms, in that they use a mixture of labeled and unlabeled data. Typically vastly more unlabeled data is used during training of such algorithms than labeled data, due to the effort and expertise required to label large quantities of data correctly. The type of task performed by semi-supervised methods can originate from either supervised learningor unsupervised learning domain. For classification tasks which are oftentimes achieved using supervised learning the additional unsupervised data is added during training with the hope to achieve a better outcome than when training only with the supervised portion of the data. In contrast for unsupervised learning use cases such as clustering algorithms, the addition of labeled samples can help guide the learning algorithm to improve performance over fully unsupervised training.
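A minimal sketch of the classification case described above, using scikit-learn's self-training wrapper; the synthetic two-cluster data and the choice to hide 90% of the labels are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

rng = np.random.default_rng(0)
# Two well-separated Gaussian clusters standing in for two classes.
X = np.concatenate([rng.normal(-2.0, 1.0, (100, 2)), rng.normal(2.0, 1.0, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

# Hide 90% of the labels: scikit-learn marks unlabeled samples with -1.
y_semi = y.copy()
y_semi[rng.choice(200, size=180, replace=False)] = -1

# Self-training fits the base classifier on the few labeled points, then
# iteratively pseudo-labels unlabeled points it is confident about.
model = SelfTrainingClassifier(LogisticRegression())
model.fit(X, y_semi)
accuracy = (model.predict(X) == y).mean()
```

Only 20 hand-labeled samples seed the training here; the remaining 180 points contribute through pseudo-labeling, which mirrors the labeled/unlabeled imbalance discussed above.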
For anomaly detection methods, the underlying techniques can belong to any of these or other categories of machine learning algorithms. As described in section~\ref{sec:anomaly_detection}, they may not even use any machine learning at all. While supervised anomaly detection methods exist, their suitability depends mostly on the availability of labeled training data and on a reasonable proportion between normal and anomalous data. Both requirements can be challenging, due to labeling often being labour intensive and due to the anomalies' intrinsic property of occurring rarely compared to normal data. DeepSAD is a semi-supervised method which extends its unsupervised predecessor Deep SVDD by including some labeled samples during training, with the intention of improving the algorithm's performance.
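For context, the two training objectives can be sketched as follows; the notation is assumed from the original Deep SVDD and Deep SAD formulations rather than drawn from this section. Deep SVDD minimizes the mean squared distance of the network embeddings $\phi(x;\mathcal{W})$ to a fixed center $\mathbf{c}$, and Deep SAD adds a term over the $m$ labeled samples $\tilde{x}_j$ with labels $\tilde{y}_j \in \{-1,+1\}$:

```latex
% Unsupervised Deep SVDD objective (sketch):
\min_{\mathcal{W}} \; \frac{1}{n}\sum_{i=1}^{n}\bigl\lVert\phi(x_i;\mathcal{W})-\mathbf{c}\bigr\rVert^2
  + \frac{\lambda}{2}\sum_{l=1}^{L}\bigl\lVert \mathbf{W}^{l}\bigr\rVert_F^2

% Semi-supervised Deep SAD objective (sketch): the second sum handles the
% m labeled samples, with eta weighting their influence.
\min_{\mathcal{W}} \; \frac{1}{n+m}\sum_{i=1}^{n}\bigl\lVert\phi(x_i;\mathcal{W})-\mathbf{c}\bigr\rVert^2
  + \frac{\eta}{n+m}\sum_{j=1}^{m}\Bigl(\bigl\lVert\phi(\tilde{x}_j;\mathcal{W})-\mathbf{c}\bigr\rVert^2\Bigr)^{\tilde{y}_j}
  + \frac{\lambda}{2}\sum_{l=1}^{L}\bigl\lVert \mathbf{W}^{l}\bigr\rVert_F^2
```

Because the labeled term raises the squared distance to the power $\tilde{y}_j$, labeled normal samples ($\tilde{y}_j=+1$) are pulled towards the center while labeled anomalies ($\tilde{y}_j=-1$) are pushed away from it, which is the sense in which the labeled data guides the otherwise unsupervised objective.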
Both DeepSAD and Deep SVDD include the training of an autoencoder as a pre-training step. The autoencoder is a machine learning architecture that is frequently grouped with unsupervised algorithms, even though that categorization can be contested when scrutinized in more detail, which we will look at next.
\newsection{autoencoder}{Autoencoder}