I have been dedicating more time recently to the famous “Can we trust an AI agent for network operations?” question. Let’s start with some basics: can we trust the AI agent with text queries, as opposed to structured data? This is what LLMs are good at, right? So let’s test with IETF drafts and RFCs.
Actually, I was so surprised by the result that I had to share my experience. Here is an apparently simple question:

And that answer did not feel right to me, so I double-checked.

WTF?

So, a clear example of hallucination.
Obviously, I reran the test multiple times: the answers got better from one run to the next. I also tested with different models (Sonnet 4.6, Haiku 4.5, Opus 4.6).
Bottom line: for this particular task, the AI agent saved me time and gave me some pointers, which I double-checked to finally find my answer, discarding the hallucinations along the way.
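Part of that double-checking can be scripted. Here is a minimal sketch, assuming the agent's answer is available as plain text and that I keep a small set of document identifiers I have already verified by hand. The function names (`extract_references`, `unverified_references`), the regular expressions, and the sample data are all hypothetical, for illustration only. It only tells me which citations still need a manual lookup; it does not prove that a cited document says what the agent claims.

```python
import re

# Illustrative patterns (an assumption, not an official grammar):
# "RFC 2119", "RFC2119", "rfc-2119", and Internet-Draft names
# such as "draft-example-netops-agent-00".
RFC_PATTERN = re.compile(r"\bRFC[\s-]?(\d{1,5})\b", re.IGNORECASE)
DRAFT_PATTERN = re.compile(r"\bdraft(?:-[a-z0-9]+)+\b", re.IGNORECASE)


def extract_references(answer: str) -> set[str]:
    """Return the normalized RFC and draft identifiers cited in the answer."""
    refs = {f"RFC {int(num)}" for num in RFC_PATTERN.findall(answer)}
    refs |= {name.lower() for name in DRAFT_PATTERN.findall(answer)}
    return refs


def unverified_references(answer: str, verified: set[str]) -> list[str]:
    """Return cited identifiers that are not in the hand-verified set."""
    return sorted(extract_references(answer) - verified)


# Hypothetical agent answer; the draft name below is made up.
SAMPLE_ANSWER = (
    "See RFC 2119 and RFC8174 for the keywords, "
    "and draft-example-netops-agent-00 for the details."
)
# Identifiers I have personally checked.
VERIFIED = {"RFC 2119", "RFC 8174"}

for ref in unverified_references(SAMPLE_ANSWER, VERIFIED):
    print(f"Needs a manual check: {ref}")
```

The agent's answer goes in, a to-do list of references to check comes out. The final judgment stays with the human.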
How would these AI agents help with network operations? The first question to ask is: how do we go from deterministic workflows in operations to probabilistic tools? Are the two even compatible? For sure, agenticOPS has a lot of traction these days. What’s behind the hype?
Stay tuned.