As networks grow in size and complexity, the amount of information they produce can become overwhelming. This is especially true of network monitoring. Traditional service checks and SNMP data collection have given way to streaming telemetry data such as flows, and even on small networks the volume can exceed tens of thousands of data points a second.
Because no human can consume and parse this data in real time, the OpenNMS project turned to machine learning to deal with it. This presentation will discuss ALEC, an Architecture for Learning Enabled Correlation.
The ALEC system correlates alarm data into “situations”: actionable tasks that a Network Operations Center can use to address issues with its network. ALEC is designed to support multiple correlation engines and currently supports two: DBSCAN for unsupervised machine learning and TensorFlow for supervised “deep learning”.
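To give a feel for the unsupervised approach, here is a minimal Python sketch of DBSCAN-style clustering that groups alarms into candidate situations. It is an illustration only, not ALEC's implementation: the alarm list, the `dbscan_1d` function, and the `eps` and `min_pts` values are assumptions made for this example, and it clusters on alarm time alone.

```python
from collections import deque

NOISE = -1  # label for alarms that belong to no cluster


def dbscan_1d(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE.

    points  -- list of numbers (here: alarm timestamps in seconds)
    eps     -- maximum distance between two neighboring points
    min_pts -- minimum neighborhood size (including the point itself)
               for a point to count as a core point
    """
    labels = [None] * len(points)

    def neighbors(i):
        # All points within eps of point i, including i itself.
        return [j for j, p in enumerate(points) if abs(p - points[i]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # core point: keep growing the cluster
    return labels


# Hypothetical alarms: (timestamp in seconds, description).
alarms = [
    (0, "linkDown sw1"),
    (5, "nodeDown r1"),
    (9, "bgpPeerDown r2"),
    (300, "highCpu db1"),
    (600, "linkDown sw7"),
    (604, "nodeDown r9"),
    (607, "ospfNbrDown r8"),
]

labels = dbscan_1d([t for t, _ in alarms], eps=10, min_pts=3)

# Group clustered alarms into "situations"; noise alarms stay on their own.
situations = {}
for (_, description), label in zip(alarms, labels):
    if label != NOISE:
        situations.setdefault(label, []).append(description)

for sid, members in situations.items():
    print(f"situation {sid}: {members}")
```

With these sample values, the two bursts of alarms each form one situation, while the isolated `highCpu db1` alarm is left as noise. A realistic correlation engine would need a richer distance measure than time alone.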
This talk should prove interesting for anyone tasked with monitoring a large network or interested in a real-world application of machine learning.