What role does AI play in monitoring and observability practices?
Discussions around AI in IT monitoring and digital experience monitoring can quickly become difficult to follow. Terms are often used loosely, some capabilities are grouped together, and different approaches are sometimes presented as though they were equivalent.
Within observability platforms, monitoring covers a wide range of functions. While the overall goal of observability is to ensure the proper operation of critical applications, monitoring contributes by observing user experiences, detecting issues within the systems that support user journeys, and providing visibility across different technical layers.
It also helps automate certain tasks and provides the data required for analysis and investigation.
AI is now part of this landscape, but it is not a single capability. Some capabilities are involved in defining and configuring what should be monitored. Others support teams in interpreting signals, managing incidents, and using the platform on a daily basis.
Treating all of these capabilities as if they served the same purpose makes it harder to understand the value they actually provide.
AI Upstream
AI can play a role when teams determine what should be monitored and how. It helps clarify requirements, structure monitoring strategies, and prepare critical environments.
AI in Operations
AI also plays a role when issues need to be detected, analyzed, and resolved. It supports teams in interpreting signals, managing incidents, and using the platform effectively on a daily basis.
Define
Identify what needs to be monitored.
Configure
Set up journeys, scenarios, and key monitoring elements.
Analyze
Interpret signals, data, and anomalies.
Act
Respond more effectively to incidents and ongoing changes.
Together, these steps make teams more effective at protecting services, supporting users, and managing ongoing change.
Understanding the Role of AI in Monitoring
In general, generative AI helps define what should be monitored and evolve its scope over time. Agentic AI, on the other hand, helps interpret what is happening and respond to situations as they unfold. This simple distinction reflects the way work is actually carried out in a monitoring environment, even though the underlying AI technologies are often presented together.
Define and Evolve the Monitoring Scope
Part of the work involves defining what should be monitored and how it should be measured, for example by creating synthetic user journeys or business transactions that are tracked over time.
Interpret, Understand and Respond
In contrast, interpreting signals and resolving issues that arise during monitoring represent a different type of work.
Two Complementary Dimensions of Monitoring
Define and structure something that does not yet exist or needs to evolve.
Understand what is happening within the monitored environment.
Respond to situations, incidents, and observed changes.
AI does not address the same challenges when it helps structure monitoring as it does when it helps teams make sense of observed signals.
These are not simply different stages of the same process. Both dimensions involve distinct tasks and different requirements for both tools and teams.
Making this distinction explicit helps explain why AI can appear both powerful and, at times, inconsistent across different monitoring tools within the same observability platform.
This distinction also helps clarify what organizations can realistically expect from AI, how different capabilities support different aspects of monitoring work, and when AI enables a shift from efficiency gains toward more effective monitoring that is better aligned with business needs.
How Does Generative AI Facilitate the Definition and Evolution of Monitoring?
Most users are already familiar with generative AI in their daily lives. It is the type of AI used to draft an email, summarize a document, or quickly obtain an explanation on a given topic. Users simply describe what they need, and the system produces a structured result that can then be reviewed and refined.
Describe an Intent, Receive a Structured Foundation
In other words, generative AI responds to a request by generating an output based on the information provided.
In the context of monitoring, the same principle applies. Generative AI comes into play when defining or configuring what should be monitored.
Describe
Teams describe the business transactions, application workflows, or user journeys that need to be monitored.
Structure
Based on that description, AI can generate structured elements that serve as the foundation for synthetic monitoring.
Refine
Teams review, refine, and adapt the generated output to align it with their actual requirements.
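The describe, structure, refine loop above can be sketched in code. This is a minimal illustration, not any platform's actual format: the `Step` and `Journey` schema, the `refine` helper, and the checkout example are all hypothetical, and the draft is written by hand to stand in for what a generative model might return from a plain-language description.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class Step:
    """One action in a synthetic journey (hypothetical schema)."""
    action: str               # e.g. "open" or "click"
    target: str               # page path or element label
    max_seconds: float = 5.0  # response-time threshold for this step


@dataclass(frozen=True)
class Journey:
    """A named, ordered list of steps to replay on a schedule."""
    name: str
    steps: tuple[Step, ...] = field(default_factory=tuple)


def refine(journey: Journey, thresholds: dict[str, float]) -> Journey:
    """Return a copy of the draft with team-reviewed thresholds applied.

    Thresholds are keyed by step target; steps without an override
    keep the value from the draft.
    """
    steps = tuple(
        replace(s, max_seconds=thresholds.get(s.target, s.max_seconds))
        for s in journey.steps
    )
    return replace(journey, steps=steps)


# Structure: a draft standing in for generated output (assumed, hand-written).
draft = Journey(
    name="checkout",
    steps=(
        Step("open", "/cart"),
        Step("click", "Pay now"),
        Step("open", "/confirmation"),
    ),
)

# Refine: the team tightens one threshold after review.
final = refine(draft, {"/confirmation": 2.0})
```

Keeping the draft immutable and returning a refined copy makes it easy to compare what was generated with what the team finally approved.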
Monitoring Does Not Remain Static
Applications evolve, user behavior changes, and new workflows must be taken into account. Defining and updating monitoring manually can quickly become time-consuming, especially as environments grow in complexity.
For teams that need to keep monitoring aligned with constantly evolving applications, AI’s ability to generate and refine monitoring definitions is particularly valuable.
Generative AI reduces the effort required to create and update monitoring journeys while helping maintain consistent coverage over time. Instead of building everything from scratch, teams can rely on an initial structured version that reflects their intent.
How Does Agentic AI Facilitate Monitoring Operations?
Agentic AI is less widely known, but its role becomes clear when looking at what happens once monitoring is actively running. Rather than waiting for a request, it operates in real time, relying on signals as they emerge and evolve.
Situations Continuously Evolve
Within a monitoring environment, agentic AI operates in a context where data is continuously generated and situations can change from one moment to the next. Alerts, events, logs, and other signals provide valuable information, but they do not always offer a complete picture.
Signals Processed in Real Time
Detecting events that require attention.
Analyzing technical traces to understand what is happening.
Connecting information to build a clearer understanding of the situation.
Connect Information
Agentic AI helps teams access context and connect information without having to manually piece everything together.
Reduce Noise
In environments with large numbers of alerts, it can group related events, highlight what is most likely to matter, and reduce the time spent sorting through noise.
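As a rough illustration of grouping related events, the sketch below clusters alerts from the same service that arrive close together in time. The `Alert` fields, the five-minute window, and the "largest group first" ranking are assumptions chosen for the example. Real platforms are likely to use richer signals such as topology or historical patterns.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    service: str
    timestamp: float  # seconds since some reference point
    message: str


def group_alerts(alerts: list[Alert], window: float = 300.0) -> list[list[Alert]]:
    """Group alerts per service when each arrives within `window`
    seconds of the previous alert in the same group."""
    groups: list[list[Alert]] = []
    latest: dict[str, list[Alert]] = {}  # most recent group per service
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        group = latest.get(alert.service)
        if group and alert.timestamp - group[-1].timestamp <= window:
            group.append(alert)
        else:
            # Too far from the previous alert (or first one seen): new group.
            group = [alert]
            latest[alert.service] = group
            groups.append(group)
    return groups


alerts = [
    Alert("payments", 0, "latency high"),
    Alert("payments", 60, "error rate high"),
    Alert("search", 90, "timeout"),
    Alert("payments", 1000, "latency high"),
]

groups = group_alerts(alerts)
# Largest groups first, as a crude proxy for "most likely to matter".
ranked = sorted(groups, key=len, reverse=True)
```

Here the two early `payments` alerts collapse into one group, so a reviewer sees three items instead of four.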
Clarify Incidents
During incident analysis, it can correlate information from multiple sources such as recent changes, dependencies, and user impact to provide a clearer view of the situation.
In each case, agentic AI helps clarify what is happening without replacing the underlying tools or operational processes.
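Correlating an incident with recent changes and dependencies can be sketched as a simple lookup. The `Change` record, the dependency map, and the one-hour lookback are hypothetical choices for illustration. The function only surfaces candidates for a human to review and does not claim to identify a root cause.

```python
from dataclasses import dataclass


@dataclass
class Change:
    service: str
    timestamp: float  # seconds since some reference point
    description: str


def related_changes(
    incident_service: str,
    incident_time: float,
    changes: list[Change],
    dependencies: dict[str, list[str]],
    lookback: float = 3600.0,
) -> list[Change]:
    """Return changes to the affected service or its direct dependencies
    made within `lookback` seconds before the incident, newest first."""
    scope = {incident_service, *dependencies.get(incident_service, ())}
    hits = [
        c for c in changes
        if c.service in scope and 0 <= incident_time - c.timestamp <= lookback
    ]
    return sorted(hits, key=lambda c: c.timestamp, reverse=True)


dependencies = {"checkout": ["payments", "inventory"]}
changes = [
    Change("payments", 9000, "deploy v2"),
    Change("search", 9500, "config change"),      # not a dependency
    Change("checkout", 1000, "old deploy"),       # outside the lookback
    Change("inventory", 9900, "schema migration"),
]

candidates = related_changes("checkout", 10000, changes, dependencies)
```

In this example the result contains the `inventory` migration and the `payments` deploy, in that order, while the unrelated and the stale changes are filtered out.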
Interaction with the monitoring platform itself is another key aspect. Agentic AI can not only surface information that would otherwise require multiple investigation steps, but can also help teams navigate monitoring environments more efficiently.
Through context-aware assistance and natural language interactions, the nature of monitoring operations evolves: less time spent rebuilding context, more time spent taking action.
The objective is to reduce effort and improve clarity in situations where time and context are critical.
Generative AI and Agentic AI: Towards More Effective Monitoring in Support of Observability
As we have seen from a monitoring perspective, generative AI helps define what should be monitored and evolve those definitions as systems change. Agentic AI focuses on what is actively being monitored, helping teams interpret signals, place them into context, and address situations as they unfold.
Prepare and Evolve Monitoring
Generative AI helps define what should be monitored, structure monitoring journeys, and adapt monitoring coverage as systems evolve.
Interpret and Act in Production
Agentic AI helps teams interpret signals, place them into context, and respond to situations as they emerge.
The Value of AI Depends on How Well It Matches the Work to Be Done
Looking at monitoring from preparation through to production operations makes it easier to understand how different forms of AI contribute. More importantly, it clarifies what can realistically be expected from each capability and in which contexts each provides the greatest value.
Less Time
Reduce the time required to move from visibility to understanding.
Less Effort
Reduce manual work involved in structuring, analyzing, and rebuilding context.
Lower Costs
Optimize monitoring operations by allowing teams to focus on high-value actions.
Greater Impact
Detect issues earlier, act with better context, and reduce the impact of incidents.
When AI is matched to the work at hand, it helps teams improve operational effectiveness and better protect the services and digital experiences on which users depend.
It reduces the gap between what teams have available (dashboards, alerts, signals, and logs) and what they are ultimately seeking: faster understanding, clearer context, and the ability to act in ways that protect digital services while delivering differentiated value to users and customers.