exp setup section work
@@ -923,9 +923,7 @@ In the following sections, we detail our adaptations to this framework:
{codebase, github, dataloading, training, testing, baselines}
{codebase understood $\rightarrow$ how was it adapted}
The PyTorch implementation of the DeepSAD framework includes the MNIST, Fashion-MNIST, and CIFAR-10 image datasets, the arrhythmia, cardio, satellite, satimage-2, shuttle, and thyroid datasets from \citetitle{odds}~\cite{odds}, as well as suitable autoencoder and DeepSAD network architectures for the corresponding data types. The framework can train and test DeepSAD as well as a number of baseline algorithms, namely SSAD, OCSVM, Isolation Forest, KDE, and SemiDGM, on the loaded data, and evaluates their performance by computing the area under the ROC curve for each algorithm. We ported this implementation, originally developed for Python 3.7, to Python 3.12, and changed or added functionality in several places: dataloading for our chosen dataset, DeepSAD models that operate on the lidar projection data type, additional evaluation methods, and an inference module.
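Since all evaluated methods reduce to ranking per-sample anomaly scores, the ROC-AUC evaluation can be sketched in a few lines. The following is a minimal illustration using scikit-learn, with hypothetical scores and labels standing in for the output of any of the algorithms above; it is not the framework's actual evaluation code.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical per-sample anomaly scores and ground-truth labels
# (1 = anomalous, 0 = normal); higher scores should rank anomalies first.
labels = np.array([0, 0, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8])

auc = roc_auc_score(labels, scores)  # fraction of correctly ordered pairs
```

The ROC-AUC is threshold-free, which is why it is a natural common metric for methods whose raw score scales differ (distance to a hypersphere center for DeepSAD, path length for Isolation Forest, density for KDE, and so on).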
%\todo[inline]{data preprocessed (2d projections, normalized range)}
\threadtodo
@@ -934,6 +932,9 @@ The PyTorch implementation of the DeepSAD implementation and framework was origi
{preprocessed numpy (script), load, labels/meta, split, k-fold}
{k-fold $\rightarrow$ also adapted in training/testing}
The dataset, provided in rosbag format (one bag file per experiment), was preprocessed as mentioned in chapter X by projecting the 3D lidar data (xyz point clouds) onto 2D range images using a spherical projection implemented in a Python script. The result was stored with numpy's save method as a .npy array of shape frames $\times$ height $\times$ width, with each value holding the normalized distance $1/\sqrt{d}$. Precomputing and storing the projections keeps dataloading simple and avoids repeating the preprocessing for every experiment.

The projection itself uses the meta information stored in the bag, namely the channel (the sensor row) and the measurement index. This is possible because the recorded data is dense: the bag contains an entry for every possible measurement, even when the sensor registered no return ray for it, whereas a sparse format would omit such entries to save file size. Density permits a direct mapping of all measurements onto the spherical projection, with the channel as the height index and the measurement index modulo the number of measurements per channel as the width index. Without this, the mapping would have to be reconstructed geometrically, i.e.\ from the angles between the sensor origin and each point in the cloud. We initially tried this geometric approach, but it produced many ambiguous mappings in which multiple measurements were erroneously assigned to the same pixel, with no clear way to decide which of them was mapped incorrectly. This is most likely due to quantization errors, systematic and sporadic measurement errors, and other unforeseen problems. For these reasons, the index-based mapping is a boon for this dataset.

It should also be mentioned that lidar sensors fundamentally measure the distance to an object by timing an emitted ray's return (bg chapter lidar ref); the point cloud is only computed afterwards from these measurements and the known beam angles. For this reason it is typically possible to configure lidar sensors to output this original range data directly, which is essentially the same as the 2D projection, without having to reconstruct it from the point cloud.
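Assuming per-measurement channel and index metadata as described above, the index-based spherical projection can be sketched as follows. Function and argument names are illustrative, not the actual preprocessing script's interface:

```python
import numpy as np

def project_scan(distances, channels, indices, n_channels=32, width=1024):
    """Map one dense lidar scan to a 2D range image via sensor metadata.

    distances: per-measurement range in meters (0 where no return ray),
    channels:  per-measurement sensor row (ring) id,
    indices:   per-measurement global index within the scan.
    """
    img = np.zeros((n_channels, width), dtype=np.float32)
    rows = channels            # channel id maps directly to the image row
    cols = indices % width     # measurement index modulo measurements-per-channel
    valid = distances > 0      # skip measurements without a return ray
    # store the normalized distance 1/sqrt(d) at each valid pixel
    img[rows[valid], cols[valid]] = 1.0 / np.sqrt(distances[valid])
    return img
```

Pixels without a return ray remain zero, so the dense input format guarantees an unambiguous one-to-one mapping with no geometric reconstruction; stacking the per-frame images yields the frames $\times$ height $\times$ width array that is saved to disk.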
%\todo[inline]{k-fold data loading, training, testing}
\threadtodo
{how was training/testing adapted (networks overview), inference, ae tuning}
@@ -973,7 +974,7 @@ The PyTorch implementation of the DeepSAD implementation and framework was origi
{LR, eta, epochs, latent space size (hyper param search), semi labels}
{everything that goes into training known $\rightarrow$ what experiments were actually done?}
\newsection{setup_matrix_hardware_runtime}{Experiment Matrix, Hardware and Runtimes}
%\todo[inline]{what experiments were performed and why (table/list containing experiments)}
\threadtodo
@@ -982,8 +983,6 @@ The PyTorch implementation of the DeepSAD implementation and framework was origi
{explanation of what was searched for (ae latent space first), other hyperparams and why}
{all experiments known $\rightarrow$ how long do they take to train}
\threadtodo
{give overview about hardware setup and how long things take to train}
{we know what we trained but not how long that takes}