setup chapter progress
@@ -525,7 +525,7 @@ The pre-training results are used in two more key ways. First, the encoder weigh

In the main training step, DeepSAD's network is trained using SGD backpropagation. The unlabeled training data is used with the goal of minimizing a data-encompassing hypersphere. Since one of the preconditions of training was a significant prevalence of normal data over anomalies in the training set, normal samples collectively cluster tightly around the centroid, while the rarer anomalous samples contribute less to the optimization and therefore remain further from the hypersphere center. The labeled data carries binary class labels marking each sample as either normal or anomalous. Labeled anomalies are pushed away from the center by defining their optimization target as maximizing their distance to $\mathbf{c}$. Labeled normal samples are treated similarly to unlabeled samples, with the difference that DeepSAD includes a hyperparameter controlling the proportion with which labeled and unlabeled data contribute to the overall optimization. The resulting network has learned to map normal data samples closer to $\mathbf{c}$ in the latent space and anomalies further away.
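
For reference, this optimization can be written compactly. The objective below follows the formulation of the original DeepSAD publication; the notation (encoder $\phi(\cdot;\mathcal{W})$, $n$ unlabeled samples $\mathbf{x}_i$, $m$ labeled samples $\tilde{\mathbf{x}}_j$ with labels $\tilde{y}_j \in \{-1,+1\}$, label weighting $\eta$, and weight decay $\lambda$) is introduced here purely for illustration:
\begin{equation}
\min_{\mathcal{W}} \; \frac{1}{n+m} \sum_{i=1}^{n} \big\lVert \phi(\mathbf{x}_i;\mathcal{W}) - \mathbf{c} \big\rVert^2
+ \frac{\eta}{n+m} \sum_{j=1}^{m} \Big( \big\lVert \phi(\tilde{\mathbf{x}}_j;\mathcal{W}) - \mathbf{c} \big\rVert^2 \Big)^{\tilde{y}_j}
+ \frac{\lambda}{2} \sum_{\ell=1}^{L} \big\lVert \mathbf{W}^{\ell} \big\rVert_F^2
\end{equation}
With $\tilde{y}_j = +1$ for labeled normal samples and $\tilde{y}_j = -1$ for labeled anomalies, the exponent inverts the distance term for anomalies, so pushing them away from $\mathbf{c}$ lowers the loss, while $\eta$ controls the relative weight of the labeled terms.
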
\fig{deepsad_procedure}{diagrams/deepsad_procedure/deepsad_procedure}{(WORK IN PROGRESS) Depiction of DeepSAD's training procedure, including data flows and tweakable hyperparameters.}
\threadtodo
{how to use the trained network?}

@@ -988,8 +988,20 @@ We pass instances of this \texttt{Dataset} to PyTorch’s \texttt{DataLoader}, e
%\todo[inline]{semi-supervised training labels, k_fold, inference dataloader}
DeepSAD supports both unsupervised and semi-supervised training by optionally incorporating a small number of labeled samples. To control this, our custom PyTorch \texttt{Dataset} accepts two integer parameters, \texttt{num\_labelled\_normal} and \texttt{num\_labelled\_anomalous}, which specify how many samples of each class should retain their labels during training. All other samples are assigned a label of 0 (``unknown'') and treated as unlabeled.
When using semi-supervised mode, we begin with the manually defined evaluation labels. We then randomly un-label (set to 0) samples of each class until exactly \texttt{num\_labelled\_normal} normal and \texttt{num\_labelled\_anomalous} anomalous samples remain labeled. This mechanism allows us to systematically compare the unsupervised mode, where \texttt{num\_labelled\_normal} = \texttt{num\_labelled\_anomalous} = 0, with semi-supervised modes under varying label budgets.
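
A minimal sketch of this un-labeling step is shown below. The function name, the label convention ($+1$ normal, $-1$ anomalous, $0$ unknown), and the seed handling are illustrative assumptions rather than the actual implementation:
\begin{verbatim}
import numpy as np

def mask_labels(labels, num_labelled_normal, num_labelled_anomalous, seed=0):
    # Illustrative sketch, not the actual implementation.
    # Keeps at most the requested number of labels per class and sets the
    # remaining labels to 0 ("unknown"). Convention: +1 normal, -1 anomalous.
    rng = np.random.default_rng(seed)
    masked = labels.copy()
    for cls, budget in ((+1, num_labelled_normal), (-1, num_labelled_anomalous)):
        idx = np.flatnonzero(masked == cls)
        rng.shuffle(idx)
        masked[idx[budget:]] = 0   # un-label everything beyond the budget
    return masked
\end{verbatim}
Setting both budgets to zero reproduces the fully unsupervised configuration described above.
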
To obtain robust performance estimates on our relatively small dataset, we implement $k$-fold cross-validation. A single integer parameter, \texttt{num\_folds}, controls the number of splits. We use scikit-learn’s \texttt{KFold} (from \texttt{sklearn.model\_selection}) with \texttt{shuffle=True} and a fixed random seed to partition each experiment’s frames into \texttt{num\_folds} disjoint folds. Training then proceeds across $k$ rounds, each time training on $(k-1)/k$ of the data and evaluating on the remaining $1/k$. In our experiments, we set \texttt{num\_folds=5}, yielding an 80/20 train/evaluation split per fold.
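
The fold assignment can be reproduced with a few lines of scikit-learn; the frame count and the seed value below are placeholders:
\begin{verbatim}
import numpy as np
from sklearn.model_selection import KFold

num_folds = 5                    # num_folds parameter from the text
frame_indices = np.arange(1000)  # placeholder for one experiment's frame count

kf = KFold(n_splits=num_folds, shuffle=True, random_state=42)  # fixed seed
for fold, (train_idx, eval_idx) in enumerate(kf.split(frame_indices)):
    # with num_folds=5, each round trains on 80% of the frames
    # and evaluates on the remaining 20%
    train_frames = frame_indices[train_idx]
    eval_frames = frame_indices[eval_idx]
\end{verbatim}
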
For inference (i.e.\ model validation on held-out experiments), we provide a second \texttt{Dataset} class that loads the full NumPy file for a single experiment (no k-fold splitting), assigns no labels to the frames, and does not shuffle them, preserving temporal order. This setup enables seamless, frame-by-frame scoring of complete runs, which is crucial for analyzing degradation dynamics over an entire traversal.
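
A possible shape of this second class is sketched below; the class name and the assumption that each experiment is stored as a single NumPy array of frames are illustrative:
\begin{verbatim}
import numpy as np
import torch
from torch.utils.data import Dataset

class InferenceExperimentDataset(Dataset):
    # Illustrative sketch: serves all frames of one experiment in temporal
    # order, without labels and without any k-fold splitting.
    def __init__(self, npy_path):
        self.frames = np.load(npy_path)

    def __len__(self):
        return len(self.frames)

    def __getitem__(self, idx):
        # returning the index alongside the frame keeps scores alignable in time
        return torch.from_numpy(self.frames[idx]).float(), idx
\end{verbatim}
Wrapping such a dataset in a \texttt{DataLoader} with \texttt{shuffle=False} then yields one anomaly score per frame in traversal order.
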
\section{Model Configuration \& Evaluation Protocol}
Since the neural network architecture trained in the DeepSAD method is not fixed, as described in section~\ref{sec:algorithm_details}, but rather chosen based on the input data, we also had to choose an autoencoder architecture suited to our preprocessed lidar data projections. Since \citetitle{degradation_quantification_rain}~\cite{degradation_quantification_rain} reported success in training DeepSAD on similar data, we first adapted their network architecture, which is based on the simple and well-understood LeNet architecture, to our use case. Additionally, we were interested in evaluating how much a well-suited network architecture contributes to DeepSAD's performance and therefore designed a second network architecture, henceforth called the ``efficient architecture'', which incorporates a few more modern techniques suited to our use case.
\newsubsubsectionNoTOC{Network architectures (LeNet variant, custom encoder) and how they suit the point‑cloud input}
\threadtodo