How To Prevent AI Coding Agents From Going Rogue

As software engineers continue to deploy artificial intelligence (AI) for ever greater amounts of coding work, a new industry is emerging. Tools and platforms that give AI coding agents a better chance of functioning effectively are growing in popularity, as enterprises seek to avoid the damaging and unpredictable failures caused by flawed coding implementations.

San Francisco-based start-up Causal Dynamics Labs is one competitor in this novel field. The company, which has so far raised $8 million of seed funding, says its Cielara platform can help businesses avoid falling into many of the traps that AI coding exposes them to.

“When an engineer or an AI changes a company's software, no one really knows what will break until it's already live and customers are complaining,” explains co-founder and CEO Hasibul Haque, a former head of platform engineering at Uber. “Cielara reads a company's software the way a doctor reads an MRI, builds a living picture of how everything connects, and tests every change against that picture before it goes out.”

It’s exactly the type of approach that a good software engineer would take to updating the enterprise’s software, Haque adds. And just as an engineer would keep track of any changes made – and the reason for those changes – Cielara also keeps records of all its updates, providing the traceability and transparency often missing from AI solutions.

The emergence of solutions such as Cielara is welcome. The recent DORA report from Google found that while the use of AI in coding is now almost ubiquitous, this trend has been accompanied by a 7.2% decline in deployment stability. Almost a third of software professionals are uncomfortable trusting AI to develop new code on their behalf, the research warns.

Causal Dynamics’ own research suggests the inability of AI coding solutions to understand context is at the heart of the problem. It analyzed the work of different AI coding agents across thousands of sessions and concluded that more than half the actions taken by agents were searches for files within the system rather than actual coding edits; in other words, the agents were struggling to connect the instructions they were given with the existing structures of the organization’s software.

The start-up’s solution is therefore to build a digital twin of the enterprise. “We create a world model – a picture of the organization’s current state – so that every code change can be planned and tested,” Haque adds.

The company has made good progress in commercializing its research since its launch last year, with more than 40 Fortune 500 companies already piloting the Cielara platform. Around a quarter of these companies have become paying customers.

The direction of travel in regulation is helping the company – as well as the broader code review sector, where rivals include Codacy, SonarQube and Snyk Code. Organizations in many industries face increasing pressure to prove they’ve implemented AI coding solutions safely and securely.

“Board and auditor expectations for proactive risk management have risen sharply,” says the chief information security officer of a law firm that has implemented Cielara. “Leaders now demand evidence that security can anticipate risks from rapid AI and automation, rather than depending on post-incident response.”

“AI has already changed how people access information; the next step is changing how people make decisions,” adds Matt Fisher, the co-founder and former CTO of ecommerce search engine Daydream, who is now an adjunct professor at Brown University. “Instead of only asking what is true right now, teams should be able to explore what could happen next, compare possible paths, and understand the consequences of action before committing.”
