To answer the famous question “Can we trust an AI agent for network operations?”, to which I have been dedicating more time recently, let’s start with the basics: can we trust an AI agent with text queries, as opposed to structured data? This is what LLMs are good at, right?
Actually, I ran into a situation yesterday that surprised me so much that I had to share it.
Here is an apparently simple question:

And that answer did not feel right to me, so I double-checked.

WTF?

So this is a clear example of hallucination, but a hallucination about protocol specifications, which come from rigorous IETF RFCs and drafts: I was not expecting that!
Obviously, I reran the test multiple times, and with different models (Sonnet 4.6, Haiku 4.5, Opus 4.6). Admittedly, the answers got better over time.
Bottom line: for this particular task, AI saved me time and gave me some pointers, which I double-checked to finally find my answer while discarding the hallucinations. Note that, in this particular case, I was using a chatbot and not an AI agent.
To the question “Would AI agents help for network operations?”, we can assert: absolutely! However, let’s walk before we run. The first question to ask ourselves is: how do we go from operational tools based on deterministic workflows to probabilistic tools, which might not always provide the same answer for a given input? Will network administrators accept this shift? I would say yes for network anomaly and incident detection, but maybe not yet for closed loop. Let’s first wait until AI agents improve.
For sure, agenticOPS has a lot of traction these days, but what’s behind the hype?