MyObservability

Designing AIOps

Machine learning

Machine learning is a core feature of AIOps. The main types are:

  1. Supervised learning: Predicting a category based on past data.
  2. Unsupervised learning: Finding inherent groupings in data without known labels, such as grouping related alerts or surfacing potential correlations.
    • Anomaly Detection: Identifying unusual data points that deviate from established norms.
  3. Reinforcement learning: An agent learns optimal actions through trial and error in a given environment.
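As a rough illustration of anomaly detection, the sketch below flags data points that deviate from an established norm using a z-score. It is a minimal example under stated assumptions, not how any particular AIOps platform works: the function name `find_anomalies`, the baseline size, the threshold of 3 standard deviations, and the latency values are all hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(values, baseline_size=10, threshold=3.0):
    """Return (index, value) pairs that deviate from the baseline norm.

    Assumption: the first `baseline_size` points represent normal behaviour.
    """
    baseline = values[:baseline_size]
    mu = mean(baseline)
    sigma = stdev(baseline)
    anomalies = []
    for i, v in enumerate(values[baseline_size:], start=baseline_size):
        # A point is anomalous when it lies more than `threshold`
        # standard deviations away from the baseline mean.
        if sigma > 0 and abs(v - mu) / sigma > threshold:
            anomalies.append((i, v))
    return anomalies


# Hypothetical response-time samples in milliseconds.
latencies_ms = [101, 99, 100, 102, 98, 100, 101, 99, 100, 100, 102, 250, 99]
print(find_anomalies(latencies_ms))
```

A fixed baseline keeps the example short. A real deployment would more likely use a rolling window or a seasonal model so that the norm adapts over time.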

Challenges for traditional root cause analysis:

How Machine Learning (ML) tackles root cause analysis (RCA) within AIOps.

Example:

  1. A customer reports that a web application is inaccessible.
  2. The AIOps platform collects the data: alerts from the web server, middleware, database server, and infrastructure.
  3. The platform groups the alerts based on timing and component overlap.
  4. Anomaly detection models simultaneously show unusual traffic patterns associated with a specific network device.
  5. IT teams focus on that network device to investigate, rather than troubleshooting everything downstream of the outage.
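The grouping step in the example above can be sketched as follows. This is a simplified illustration, assuming each alert carries a timestamp and a set of related components. The `Alert` class, the 120-second window, and the component names such as `switch-07` are hypothetical and not taken from any specific platform.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    source: str
    timestamp: float          # seconds since some reference point
    components: frozenset     # components the alert is related to


def group_alerts(alerts, window_s=120):
    """Group alerts that are close in time and share at least one component."""
    groups = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        for group in groups:
            close_in_time = alert.timestamp - group[-1].timestamp <= window_s
            shared = any(alert.components & other.components for other in group)
            if close_in_time and shared:
                group.append(alert)
                break
        else:
            # No existing group matched, so start a new one.
            groups.append([alert])
    return groups


def common_components(group):
    """Components that appear in every alert of a group."""
    return frozenset.intersection(*(a.components for a in group))


alerts = [
    Alert("web server", 0, frozenset({"web-01", "switch-07"})),
    Alert("middleware", 20, frozenset({"mw-01", "switch-07"})),
    Alert("database server", 45, frozenset({"db-01", "switch-07"})),
    Alert("infrastructure", 4000, frozenset({"backup-01"})),
]

groups = group_alerts(alerts)
print(len(groups), sorted(common_components(groups[0])))
```

Here the three alerts that arrive within the window all touch the same hypothetical switch, so the component they have in common becomes the natural place to start investigating.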

ML algorithms & pattern recognition

Types of Anomalies:

Consider what you want to detect:

Knowledge Graphs:

A knowledge graph is a network of interconnected data points representing IT components, dependencies, historical incidents, alerts, and other relevant operational information.
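To make the idea concrete, here is a minimal sketch that models only the dependency part of such a graph as a plain dictionary and walks it in both directions. The topology, the component names, and the helper functions `upstream` and `impacted_by` are hypothetical. A fuller knowledge graph would also link historical incidents and alerts to these nodes.

```python
from collections import deque

# Hypothetical topology: each key depends on the components in its list.
DEPENDS_ON = {
    "web-app": ["web-server"],
    "web-server": ["middleware", "switch-07"],
    "middleware": ["database", "switch-07"],
    "database": ["switch-07"],
    "switch-07": [],
}


def upstream(graph, node):
    """Return every component that `node` depends on, directly or indirectly."""
    seen = set()
    queue = deque(graph.get(node, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        queue.extend(graph.get(current, []))
    return seen


def impacted_by(graph, failed):
    """Return every component that directly or indirectly depends on `failed`."""
    return {node for node in graph if failed in upstream(graph, node)}


print(sorted(upstream(DEPENDS_ON, "database")))
print(sorted(impacted_by(DEPENDS_ON, "switch-07")))
```

Walking the graph upstream answers "what could have caused this alert?", while walking it downstream answers "what else is affected if this component fails?".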

A knowledge graph is built using:

Predictive Model

Predictive models use machine learning algorithms to analyze historical data (logs, metrics, past incidents) and identify patterns that anticipate potential issues, breakdowns, or performance degradations.
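One very simple way to illustrate this is to fit a straight line to a historical metric and estimate when it will cross a threshold. The sketch below assumes a linear trend and uses ordinary least squares. The function names, the disk-usage readings, and the 90% threshold are hypothetical, and production predictive models are typically far more sophisticated.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept with ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def hours_until(threshold, xs, ys):
    """Estimate hours from the last sample until the metric reaches `threshold`.

    Returns None when the trend is flat or decreasing.
    """
    slope, intercept = fit_line(xs, ys)
    if slope <= 0:
        return None
    crossing = (threshold - intercept) / slope
    return max(0.0, crossing - xs[-1])


# Hypothetical hourly disk-usage readings in percent.
hours = [0, 1, 2, 3, 4]
disk_used_pct = [70.0, 72.0, 74.0, 76.0, 78.0]

print(hours_until(90.0, hours, disk_used_pct))
```

With a forecast like this, an alert can be raised before the threshold is reached rather than after the disk is already full.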

