Towards the Future!

01 March 2022

As you can tell from our code, this is still just a proof-of-concept! There is much left to do to make the algorithm applicable in real-world situations. In this section, we describe some possible ideas and advancements to consider for future work on this project.

While the proposed methodology is able to identify and segment data in both offline and online environments, computational feasibility is still an issue. Our input data is recorded on the scale of seconds (each 8192 data points of input corresponds to only 5 seconds of recording time), yet running our non-optimised implementation on 2 minutes of data from 24 sensors takes more than 20 minutes. This is clearly not feasible for live environments, so future efforts could focus on streamlining the algorithm for fast, effective computation. This includes parallelising the computation for each sensor over multiple CPUs, taking advantage of just-in-time compilation (e.g. with Python's Numba), and using cloud and GPU clusters to accelerate computations. Some back-of-the-envelope calculations show that such improvements (particularly parallelising the computations) would make the proposed methodology far more practical in production environments.
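To make the first idea concrete, here is a minimal sketch of per-sensor parallelism using Python's built-in multiprocessing. Note that `compute_cac_for_sensor` is a hypothetical stand-in for whatever wraps our single-sensor pipeline, not a function from our codebase:

```python
from multiprocessing import Pool

import numpy as np


def compute_cac_for_sensor(signal):
    # Stand-in for the real single-sensor pipeline that produces the CAC;
    # here it just returns a dummy array so the sketch runs end to end.
    return np.zeros_like(signal)


def compute_all_sensors(signals, n_workers=8):
    # Each sensor's recording is processed independently, so the work is
    # embarrassingly parallel and should scale roughly with the core count.
    with Pool(processes=n_workers) as pool:
        return pool.map(compute_cac_for_sensor, signals)


if __name__ == "__main__":
    # 24 sensors, ~2 minutes each, at 8192 points per 5 seconds of recording
    recordings = [np.random.randn(24 * 8192) for _ in range(24)]
    cacs = compute_all_sensors(recordings)
```

Since the per-sensor computations share no state, this kind of process-level parallelism is the lowest-effort speed-up available before touching the algorithm itself.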

From the results of the online algorithm, another question presents itself: visually speaking, each peak in the animations clearly corresponds to exactly one damage transition. How can we adapt our model to detect these 'peaks in time' automatically, in real time? A simple solution would be to test whether the CAC window has an 'active' peak (a localised maximum above a fixed threshold). To account for noise, we can require that this peak persists for a certain number of frames. If detected, a visual alert for a potential anomalous shift is displayed so that shutdown, inspection and repair decisions can be made. Figure 8 shows a flowchart of what such a detector would look like. The anomaly 'persisting' across multiple frames was one of the key indicators separating true anomalous transitions from noise in our experiments, the second feature being the magnitude of the peaks. Tuning would have to be performed to remove false positives from the results. Machines wrongly categorising xx as yy: how many times have we heard this before? Too many false positives would not be good for the engineer's morale... too many false negatives and the building comes down! Thankfully, our model is able to capture *all* true positives - a good sign!

[algo extension flowchart]

Figure 8: A simple extension of the proposed methodology. If not much noise is expected (all sensors operational), then we can issue an alert whenever the CAC exceeds a certain threshold. Otherwise, we can increase robustness to noise by requiring that a certain number of consecutive frames have a peak past the threshold before issuing an anomaly alert.
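A minimal sketch of this peak-persistence detector is shown below; it assumes the CAC arrives as one window per animation frame, and the threshold and frame-count values are illustrative placeholders rather than tuned parameters:

```python
import numpy as np


def persistent_peak_alerts(cac_frames, threshold=0.7, min_frames=10):
    """Yield the indices of frames where an anomaly alert should be raised.

    An alert fires once the CAC window has shown a peak above `threshold`
    for at least `min_frames` consecutive frames, filtering out the
    short-lived peaks caused by noise.
    """
    consecutive = 0
    for frame_idx, cac in enumerate(cac_frames):
        if np.max(cac) > threshold:   # the frame has an 'active' peak
            consecutive += 1
        else:
            consecutive = 0           # the peak did not persist; treat as noise
        if consecutive >= min_frames:
            yield frame_idx           # flag this frame for inspection
            consecutive = 0           # reset so one event raises one alert
```

In practice the threshold, the persistence requirement and (possibly) the peak magnitude, the second feature we observed, would all need tuning against the false-positive/false-negative trade-off described above.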

Finally, we point out the following: because we only segment the final output, which is in the frequency domain, we can only make predictions accurate to one experiment's worth of data. For our analysis, this means a time resolution of 5 seconds; for certain purposes this is sufficient, but in high-risk situations, reducing this input-to-prediction lag would be paramount for better outcomes.