added common thread macro
@@ -64,7 +64,22 @@
% \usepackage[first,bottom,light,draft]{draftcopy}
% \draftcopyName{ENTWURF}{160}
\usepackage{xcolor}
\usepackage[colorinlistoftodos]{todonotes}
\DeclareRobustCommand{\threadtodo}[4]{%
  \todo[inline,
        backgroundcolor=red!20,
        bordercolor=red!50,
        textcolor=black!80,
        size=\small,
        caption={Common Thread Note}]{%
    \textbf{Goal:} #1 \newline
    \textbf{Context:} #2 \newline
    \textbf{Evidence/Method:} #3 \newline
    \textbf{Transition:} #4
  }%
}
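% Minimal usage sketch (placeholder arguments for illustration only):
% \threadtodo{key claim of this passage}{why it must appear here}{how it is demonstrated}{how it leads to the next section}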
% correct bad hyphenation
\hyphenation{}
@@ -175,8 +190,8 @@
%\todo[inline, color=green!40]{it's a master's thesis where we investigate how trustworthy the sensor data for robot navigation is}
%\newsection{motivation}{Motivation and Problem Statement}
%\todo[inline]{lidar and its role in robot navigation. discuss sensor degradation and its effects on navigation.}
Autonomous robots have gained more and more prevalence in search and rescue missions because they do not endanger another human being while still being able to fulfil the difficult tasks of navigating hazardous environments like collapsed structures, identifying and locating victims, and assessing the environment's safety for human rescue teams. To understand the environment, robots employ multiple sensor systems such as lidar, radar, time-of-flight (ToF) sensors, ultrasound, optical cameras or infrared cameras, of which lidar is the most prominently used due to its accuracy. The robots use the sensors' data to map their environments, navigate their surroundings and make decisions like which paths to prioritize. Many of the algorithms used for these tasks are deep learning-based and are trained on large amounts of data whose characteristics the models learn.
\threadtodo{\textit{``What's the one key claim or insight here?''}}{\textit{``Why must it appear right now?''}}{\textit{``How am I proving or demonstrating this?''}}{\textit{``How does it naturally lead to the next question or section?''}}
\threadtodo{Create interest in topic, introduce main goal of thesis, summarize results}{}{}{}

Environments of search and rescue situations provide challenging conditions for the sensor systems to produce reliable data. One of the most prominent examples are aerosol particles from smoke and dust, which can obstruct the view and lead sensors to produce erroneous data. If such degraded data was not present in the training data of the robots' algorithms, these errors may lead to unexpected outputs and potentially endanger the robot or even the human rescue targets. This is especially important for autonomous robots, whose decisions are based entirely on their sensor data without any human intervention. To safeguard against these problems, robots need a way to assess the trustworthiness of their sensor systems' data.
@@ -185,7 +200,7 @@ For remote controlled robots a human operator can make these decisions but many
\begin{quote} Can autonomous robots quantify the reliability of lidar sensor data in hazardous environments to make more informed decisions? \end{quote}

In this thesis we aim to answer this question by assessing a deep learning-based anomaly detection method and its performance in quantifying the sensor data's degradation. The employed algorithm is a semi-supervised anomaly detection algorithm which uses manually labeled training data to improve its performance over unsupervised methods. We show how much the introduction of these labeled samples improves the method's performance. The model's output is an anomaly score which quantifies the data reliability and can be used by algorithms that rely on the sensor data. These reliant algorithms may decide, for example, to slow down the robot to collect more data, choose alternative routes, signal for help, or rely more heavily on other sensors' input data.
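Concretely, in Deep SAD, the method examined in this thesis, such a score is the distance of a sample's embedding from a fixed center (shown here as a generic sketch, with \(\phi(\,\cdot\,;\mathcal{W})\) denoting the trained network and \(\mathbf{c}\) the center in its output space):
\[
s(\mathbf{x}) = \big\| \phi(\mathbf{x};\mathcal{W}) - \mathbf{c} \big\|^{2},
\]
so samples mapped far away from \(\mathbf{c}\) receive a high anomaly score and are treated as less reliable.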
\todo[inline]{discuss results (we showed X)}
@@ -303,16 +318,6 @@ The third category -reinforcement learning- takes a more interactive approach to
Semi-supervised learning algorithms are, as the name implies, an in-between category of supervised and unsupervised algorithms in that they use a mixture of labeled and unlabeled data. Typically, vastly more unlabeled than labeled data is used during the training of such algorithms, oftentimes due to the effort and expertise required to correctly label large quantities of data for supervised training methods. The target tasks of semi-supervised methods can come from both the domain of supervised and that of unsupervised algorithms. For classification tasks, which are typically achieved using supervised learning, the additional unlabeled data is included during training in the hope of achieving a better outcome than when training only with the supervised portion of the data. In contrast, for typical unsupervised learning tasks such as clustering, the addition of labeled samples can help guide the learning algorithm to improve performance over fully unsupervised training.
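One generic way to formalize this mixture (a sketch of the general pattern, not the objective of any specific method) is a training loss with an unsupervised term over the \(n\) unlabeled samples \(\mathbf{x}_{i}\) and a supervised term over the \(m \ll n\) labeled samples \((\tilde{\mathbf{x}}_{j}, \tilde{y}_{j})\), weighted by a hyperparameter \(\eta\):
\[
\mathcal{L} = \frac{1}{n} \sum_{i=1}^{n} \ell_{u}\big(f(\mathbf{x}_{i})\big) + \frac{\eta}{m} \sum_{j=1}^{m} \ell_{s}\big(f(\tilde{\mathbf{x}}_{j}), \tilde{y}_{j}\big),
\]
where the unsupervised loss \(\ell_{u}\) uses only the data itself and the supervised loss \(\ell_{s}\) additionally exploits the provided labels.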
\todo[inline]{learning-based methods categorized into supervised, unsupervised and semi-supervised, based on whether labels are provided for all, none or some of the data, hinting at the correct output the method should generate for these samples. historically, machine learning started with unsupervised learning and then progressed to supervised learning. unsupervised learning was oftentimes used to look for emergent patterns in data, but supervised deep learning was then successful in classification and regression tasks and it became clear how important dataset size was; for this reason supervised methods were challenging to train since they required a lot of expert labeling. unsupervised methods were explored for these use cases as well and oftentimes adapted to become semi-supervised by providing partially labeled datasets, which were easier to produce but more performant when trained compared to unsupervised methods}
%Broadly speaking, all learning-based methods can be categorized into three categories, based on whether or not the training data contains additional information about the expected output of the finalized algorithm for each given data sample. This additional information comes in the form of labels, which are assigned to each individual training sample prior to training by an expert in the field. For example, in the case of image classification, each training image may be labeled by a human expert with one of the target class labels. The algorithm is then supposed to learn to differentiate images from each other during training and classify them into one of the trained target classes. Methods that require such a target label for each training sample are called supervised methods, owing to the fact that an expert guides the algorithm or helps it learn the expected outputs by providing it with samples and their corresponding correct solutions, like teaching a student by providing the correct solution to a stated problem.
%Alternatively, there are algorithms that learn using only the training data itself, without any labels produced by experts for the data. These algorithms are therefore called unsupervised methods and are oftentimes used to find emergent patterns in the data itself without the need for prior knowledge of these patterns by an expert. Unsupervised methods
\todo[inline, color=green!40]{deep learning based (neural networks with hidden layers); neural networks get trained using backpropagation to learn to solve a novel task by defining some target}
\todo[inline, color=green!40]{data labels decide training setting (supervised, unsupervised, semi-supervised incl explanation), supervised often classification based, but not possible if no labels available, unsupervised has no well-defined target, often used to find common hidden factors in data (distribution). semi-supervised more like a sub method of unsupervised which additionally uses little (often hand-labelled) data to improve method performance}
\todo[inline, color=green!40]{include figure unsupervised, semi-supervised, supervised}
\todo[inline, color=green!40]{find easy illustrative example with figure of semi-supervised learning and include + explain here}
\todo[inline, color=green!40]{our chosen method Deep SAD is a semi-supervised deep learning method whose workings will be discussed in more detail in section X}
\newsection{autoencoder}{Autoencoder}
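As a reference point for this section, a standard autoencoder with encoder \(f\) and decoder \(g\) is trained to minimize a reconstruction error over the training data (a generic sketch; the exact formulation used later in the thesis may differ):
\[
\mathcal{L}_{\mathrm{rec}}(\mathbf{x}) = \big\| \mathbf{x} - g\big(f(\mathbf{x})\big) \big\|^{2}.
\]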
@@ -579,6 +584,7 @@ This simplistic labeling approach has both advantages and disadvantages. On the
Since an objective measure of degradation is unavailable, we explored alternative labeling methods, such as using statistical properties like the number of missing measurements per point cloud or the higher incidence of erroneous measurements near the sensor in degraded environments. Ultimately, we were concerned that these statistical approaches might lead the method to simply mimic the statistical evaluation rather than to quantify degradation in a generalized and robust manner. Notably, our labeling strategy, based on the presence or absence of smoke, is fundamentally an environmental indicator, independent of the intrinsic data properties recorded during the experiments.
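To illustrate the kind of statistical label we decided against (the notation is ours, purely for illustration): with \(N_{\mathrm{expected}}\) emitted beams and \(N_{\mathrm{valid}}\) returned measurements per point cloud, one could assign the missing-return ratio
\[
r = 1 - \frac{N_{\mathrm{valid}}}{N_{\mathrm{expected}}}
\]
as a degradation pseudo-label; a model trained on such labels, however, might simply learn to reproduce this statistic rather than a generalized notion of degradation.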
%\todo[inline]{TODO maybe evaluate based on different thresholds? missing datapoints, number of detected outliers, number of particles in phantom circle around sensor?}
\todo[inline]{maybe also mention that we considered labeling using output of down-the-pipeline algorithm (e.g., SLAM) and how it performs/how confident it is and retrospectively label the quality of the data based on that}
\newchapter{experimental_setup}{Experimental Setup}
\newsection{autoencoder_architecture}{Deep SAD Autoencoder Architecture}