
Why confident AI agents are still the most dangerous ones in enterprise operations

Many enterprises already have RPA, BPM tools, and ERP systems. How does Skan position AI agents as complementary to, rather than competing with, those existing investments?

I should say upfront: these are my own views, shaped by patterns I see across the industry, not a position from Skan.

That said, the framing of “complement versus compete” is itself the wrong starting point. The more honest question is: what problem were those investments actually solving, and has it actually been solved?

RPA was sold as automation but delivered fragility. It works until the UI changes, the process shifts, or an exception walks in. BPM gave organizations the visibility they desperately needed but required so much upfront modeling that by the time a workflow was deployed, the business had already moved. ERPs consolidated data but created their own rigidity — customizations that cost millions and still could not accommodate how work actually gets done on the ground.

Rajesh Gupta, Head of Agentic AI, Skan

AI agents do not replace any of that. They sit on top of it and handle what those systems were never designed for — variability, judgment, and the messy middle between defined process steps. The RPA bot handles the structured transaction. The AI agent handles the exception that would have gone to a human supervisor.

The practical pitch is simple: you do not need to rip anything out. You need something that makes your existing stack more resilient by handling what falls through the gaps. The organizations that get this right treat AI agents as connective tissue between existing tools — not a new system competing for budget.

What is the current ceiling of agentic AI reliability in high-stakes enterprise operations, and what technical advances are needed to push past it?

We are somewhere between 85% and 92% reliable on well-scoped tasks. In high-stakes operations, that gap to 100% is everything. A 10% failure rate in customer service is annoying. In underwriting, claims adjudication, or financial operations, it is a liability.

Agents handle structured, repeatable tasks well. Reliability drops the moment things get ambiguous, long-running, or operationally messy. Most enterprises still keep humans in the loop for approvals and compliance-heavy decisions — and for good reason.

The core tension is that LLMs are probabilistic systems operating inside deterministic businesses. An agent can sound completely confident while missing context, misreading policy, or quietly compounding errors across a multi-step workflow. The system around it often cannot tell the difference between a confident wrong decision and a correct one.
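A back-of-envelope sketch can make the compounding concrete. It assumes every step in a workflow succeeds independently with the same probability, which is a simplifying assumption for illustration and not a measurement from any deployment; the function name `chain_success_rate` is hypothetical.

```python
def chain_success_rate(per_step: float, steps: int) -> float:
    """Probability that every step in a chain succeeds.

    Assumes steps fail independently and share one success rate,
    which real workflows rarely satisfy exactly.
    """
    if not 0.0 <= per_step <= 1.0:
        raise ValueError("per_step must be a probability between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step ** steps


# Illustrative per-step rates taken from the 85% to 92% range quoted above.
for steps in (1, 5, 10):
    for per_step in (0.85, 0.92):
        rate = chain_success_rate(per_step, steps)
        print(f"{steps:>2} steps at {per_step:.0%} per step: {rate:.1%} end to end")
```

Under these assumptions, a five-step chain at 85% per step completes cleanly only about 44% of the time, which is why per-task reliability figures can overstate how a long-running workflow behaves.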

Three things set the ceiling today: context drift over long task chains, poor failure transparency, and audit trails that regulators do not actually trust. Agents do not just hallucinate facts — they hallucinate actions.

Breaking through requires reliable action verification, structured behavioral logging that captures why a decision was made rather than just what was decided, and standardized action taxonomies so unexpected behavior has a shared language to describe and learn from. The technology exists — in pieces. What is missing is the infrastructure layer that ties it together and builds accountability into the architecture from the start.
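As a rough illustration of those three pieces together, the sketch below logs one agent decision with a rationale, tags it with an entry from a small action taxonomy, and verifies the action against the state observed afterwards. Every name here (`ActionType`, `DecisionRecord`, `verify`, `to_log_line`, the field names, and the sample claim ID) is hypothetical and not drawn from any Skan product or existing standard.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Any, Dict, List, Optional


class ActionType(Enum):
    """A tiny, made-up action taxonomy; a real one would be far larger."""
    READ_RECORD = "read_record"
    UPDATE_RECORD = "update_record"
    ESCALATE_TO_HUMAN = "escalate_to_human"


@dataclass
class DecisionRecord:
    action: ActionType
    target: str
    rationale: str                   # why the agent chose this action
    evidence: List[str]              # inputs the rationale relied on
    expected_state: Dict[str, Any]   # what should be true if the action happened
    verified: Optional[bool] = None  # None until checked against reality
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def verify(record: DecisionRecord, observed_state: Dict[str, Any]) -> DecisionRecord:
    """Mark the record verified only if every expected field matches what was observed."""
    record.verified = all(
        observed_state.get(key) == value
        for key, value in record.expected_state.items()
    )
    return record


def to_log_line(record: DecisionRecord) -> str:
    """Serialize the record as one JSON line for an audit log."""
    payload = asdict(record)
    payload["action"] = record.action.value  # store the taxonomy label, not the enum object
    return json.dumps(payload, sort_keys=True)


record = DecisionRecord(
    action=ActionType.UPDATE_RECORD,
    target="claim-0001",
    rationale="Documents match policy terms; amount is under the auto-approval limit.",
    evidence=["policy_terms", "claim_form"],
    expected_state={"status": "approved"},
)
# The system of record still says "pending", so the claimed action did not land.
verify(record, observed_state={"status": "pending"})
print(to_log_line(record))
```

The design choice worth noting is that `verified` comes from the downstream system's observed state rather than from the agent's own report, which is one way to catch an action the agent claims it took but did not.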

Enterprise AI adoption is often described as a change management problem more than a technology problem. Based on your deployments at Fortune 500 companies, how accurate is that, and where does technology still fall short?

Very accurate. In most Fortune 500 environments, the hardest part is rarely getting a demo to work — it is getting an organization comfortable enough to operationalize it. The friction comes from trust, ownership, and accountability. Teams worry about reliability, leaders worry about risk, and employees worry about whether the system will actually make their work easier.

Enterprises do not adopt AI because it is intelligent. They adopt it when it becomes predictable, measurable, and easy to integrate into existing operations.

That said, technology still has real gaps. Current systems struggle with long-running workflows, fragmented enterprise context, inconsistent data quality, and cross-system reasoning. A model can perform extremely well in controlled evaluations and still fail in production because enterprise environments are messy in ways benchmarks rarely capture. Observability and orchestration tooling is still catching up to the pace of model development.

Which enterprise functions, such as finance, HR, contact centre, and operations, have proven to be the most fertile ground for AI adoption, and which ones are harder than they look?

Contact centers are the clearest early win. The ROI is immediate — reduced handle time, automated summaries, multilingual support, 24/7 coverage — and the workflows are observable enough to optimize over time.

Operations and back-office functions are another strong area: claims processing, invoice reconciliation, onboarding, compliance checks, and document handling. AI is especially valuable where employees spend their time navigating multiple tools and moving information between them.

Finance has emerged as a solid domain for FP&A, procurement, audit preparation, and risk analysis — though most deployments are still assistive rather than autonomous when real financial decisions are on the line.

HR looks easier than it is. Candidate screening, onboarding support, and internal knowledge assistants work reasonably well. But once AI touches performance evaluation, hiring fairness, or sensitive employee communication, the risk profile changes fast. Human nuance matters more than people initially expect.

The hardest category is any workflow combining ambiguity, fragmented context, and real accountability — healthcare operations, underwriting, legal review, enterprise procurement. These look automatable in demos. Production environments expose edge cases very quickly.

AI pilots rarely die from bad technology; they die from bad adoption. What is the adoption playbook you run with every new enterprise customer to ensure a pilot becomes a production deployment?

The biggest mistake enterprises make is chasing a transformational AI strategy before proving value in a narrow workflow. The most successful deployments start with a specific pain point where ROI is measurable in weeks, not quarters.

Start with a workflow that is painful, repetitive, and already measured. You need clear baseline metrics before introducing AI — cycle time, error rates, escalation volume. Without that, every pilot becomes subjective.

Design for the existing operational reality rather than asking teams to change behavior overnight. Adoption accelerates when AI fits into current workflows, not when it demands a new operating model from day one.

Build in trust. Early users need visibility into what the system is doing, where confidence is high or low, and when human intervention is expected. Black box behavior kills adoption even when the underlying model is strong.
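One simple way to surface "where confidence is high or low, and when human intervention is expected" is an explicit routing rule that users can read. The sketch below is a minimal illustration under stated assumptions: the `Proposal` type, the `route` function, and the 0.9 threshold are all hypothetical, and a model's self-reported confidence is treated as only one input, since a confident answer can still be wrong.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    summary: str
    confidence: float   # model-reported score in [0, 1]; not a guarantee of correctness
    high_stakes: bool   # e.g. approvals or compliance-heavy decisions


def route(proposal: Proposal, auto_threshold: float = 0.9) -> str:
    """Return 'auto_execute' or 'human_review' for a proposed action.

    High-stakes proposals always go to a person, regardless of confidence.
    The threshold is an illustrative default, not a recommended value.
    """
    if not 0.0 <= proposal.confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if proposal.high_stakes:
        return "human_review"
    if proposal.confidence >= auto_threshold:
        return "auto_execute"
    return "human_review"


examples = [
    Proposal("Summarize support call", confidence=0.97, high_stakes=False),
    Proposal("Categorize invoice", confidence=0.62, high_stakes=False),
    Proposal("Approve claim payout", confidence=0.99, high_stakes=True),
]
for proposal in examples:
    print(f"{proposal.summary}: {route(proposal)}")
```

Keeping the rule this explicit means an early user can see in advance which cases will reach them, rather than discovering the boundary after the fact.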

Avoid over-automating too early. The best pilots start in assistive mode before moving toward autonomous execution. That builds organizational comfort while generating operational data to improve the system safely.

And do not underestimate executive sponsorship. Successful production deployments almost always have a business owner accountable for operational outcomes — not just an innovation team experimenting in isolation.
