Government teams are under pressure to improve services faster without compromising trust, accessibility or quality. At our recent Experiment conference, we explored whether vibe coding, where you describe what you want in natural language and let an AI model generate the code, could help designers and researchers move from idea to testable prototype sooner.

For public services, the real question is not whether AI can generate code. It is whether it can help teams learn faster, reduce setup effort and produce prototypes realistic enough for research, while still staying anchored in user needs.

Starting with user needs, not tools

We deliberately began with familiar user-centred design (UCD) questions rather than asking which AI tool was best. What do designers and researchers actually need? Where does prototyping slow delivery down? Which parts of the journey are repetitive, and which parts demand human judgement?

That framing mattered because the aim was not to chase hype. It was to understand whether AI-assisted prototyping could strengthen existing practice in government, not replace it.

Why the government context matters

Government digital service delivery is a useful environment for this kind of experiment because it is already highly structured. The GOV.UK Design System is opinionated, patterns are consistent and the GOV.UK Prototype Kit gives teams a familiar starting point. That structure can make AI-generated output more predictable, but it does not make it trustworthy by default.

As we discussed at Experiment, vibe coding is a trendy label, not a quality mark. In public services, quality still comes from research, testing, accessibility and governance. A polished prototype is not the same as a validated service.

What we found in practice

Loose prompting quickly led to drift. Unless the constraints were clear, the prototype became less reliable with each prompt and moved away from the intended user journey. In government, that is a real issue because repeatability and rationale matter.

A more structured approach worked better. Using GitHub Copilot on top of a GOV.UK-style foundation, with around six initial prompts and known components, produced more reliable results. With around ten prompts, we were able to get a pre-registration flow working from an initial journey and set of user actions.

We then pushed the prototype further. At first, it only stored data while the prototype was running in the terminal. After prompting Copilot to create a database, the prototype kept user input after the session closed and behaved more like a realistic service.
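The difference between those two stages can be sketched in a few lines. This is a minimal illustration in Python using the standard library's sqlite3 module, not the code Copilot generated for our prototype. The SessionStore and DatabaseStore classes, the registrations table and its columns are all hypothetical names chosen for the example.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


class SessionStore:
    """Hypothetical in-memory store: answers vanish when the process stops."""

    def __init__(self):
        self.answers = {}

    def save(self, user_id, email):
        self.answers[user_id] = email

    def load(self, user_id):
        return self.answers.get(user_id)


class DatabaseStore:
    """Hypothetical file-backed store: answers survive a restart."""

    def __init__(self, path):
        self.path = path
        with closing(sqlite3.connect(self.path)) as conn, conn:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS registrations "
                "(user_id TEXT PRIMARY KEY, email TEXT NOT NULL)"
            )

    def save(self, user_id, email):
        # The inner "conn" context commits the transaction on success.
        with closing(sqlite3.connect(self.path)) as conn, conn:
            conn.execute(
                "INSERT OR REPLACE INTO registrations VALUES (?, ?)",
                (user_id, email),
            )

    def load(self, user_id):
        with closing(sqlite3.connect(self.path)) as conn:
            row = conn.execute(
                "SELECT email FROM registrations WHERE user_id = ?",
                (user_id,),
            ).fetchone()
        return row[0] if row else None


def demo():
    """Save one answer in each store, then simulate restarting the prototype."""
    path = os.path.join(tempfile.mkdtemp(), "prototype.db")
    SessionStore().save("user-1", "sam@example.com")
    DatabaseStore(path).save("user-1", "sam@example.com")

    # New objects stand in for a fresh process after the terminal is closed.
    restarted_memory = SessionStore()
    restarted_database = DatabaseStore(path)
    return restarted_memory.load("user-1"), restarted_database.load("user-1")
```

Calling demo() returns the in-memory result first and the database result second: only the file-backed store still holds the answer after the simulated restart, which is what lets a research participant leave and come back to a journey.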

We also asked the model to carry out an initial accessibility check. It flagged issues with sizing and colour, brought the prototype closer to GOV.UK conventions, and suggested extra code to support assistive technologies. That was helpful, but only as an early check: accessibility still needs human review and testing.
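Colour is one of the few parts of that check that can be computed directly. The sketch below implements the WCAG contrast ratio formula in Python as an illustration of the kind of automated check involved. It is not the check the model ran, and the function names are our own. The thresholds used are the WCAG AA minimums of 4.5:1 for normal text and 3:1 for large text.

```python
def _linearise(channel):
    """Convert one 8-bit sRGB channel to its linear-light value."""
    c = channel / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_colour):
    """WCAG relative luminance of a colour written as '#rrggbb'."""
    value = hex_colour.lstrip("#")
    red, green, blue = (int(value[i:i + 2], 16) for i in (0, 2, 4))
    return (
        0.2126 * _linearise(red)
        + 0.7152 * _linearise(green)
        + 0.0722 * _linearise(blue)
    )


def contrast_ratio(foreground, background):
    """WCAG contrast ratio, from 1.0 (identical) to 21.0 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground, background, large_text=False):
    """True if the pair meets the WCAG AA minimum contrast for its text size."""
    minimum = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= minimum
```

For example, passes_aa("#777777", "#ffffff") is False for normal text but True with large_text=True. A passing ratio says nothing about focus order, screen reader behaviour or content clarity, which is why a check like this is only a starting point.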

The end result was a stable working prototype, built in roughly four hours using around ten prompts. The main value was not speed for its own sake. It was reaching something testable much sooner than we otherwise would have.

The real opportunity

AI-assisted prototyping can help teams get to a believable prototype earlier, run usability sessions sooner and spot confusion or failure points faster. It can also reduce repetitive setup work, leaving more time for the parts of design that matter most: refining user flows, improving content, handling edge cases and strengthening accessibility.

At its best, this is not about replacing UCD. It is about accelerating the loop between ideas, prototypes, evidence and iteration.

Risks and guardrails

The biggest risk is mistaking polished output for trustworthy output. A prototype can look convincing while still missing edge cases, weakening validation, confusing users or introducing accessibility issues that are not obvious at first glance.

There is also a risk of stakeholder overconfidence, and a risk that design judgement erodes if prompting starts to replace critical thinking.

That is why the guardrails matter. Our non-negotiables are simple: user needs still drive the work, accessibility still needs proper human validation, usability testing still has to happen and humans still sign things off. Done does not mean generated. Done means validated.

In practice, that means starting from established GOV.UK patterns, using a prompt playbook, asking explicitly for validation and error states, and keeping an audit trail of prompts and decisions. AI can help surface likely issues and draft fixes, but people still need to verify them through testing and evidence.
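An audit trail does not need special tooling. As a minimal sketch, assuming a team is happy to keep one JSON record per line in a plain text file, it could look like the Python below. The PromptRecord fields, the function names and the file layout are illustrative assumptions, not a format we have standardised on.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class PromptRecord:
    """Hypothetical audit entry: what was asked, what came back, who decided."""

    prompt: str
    outcome: str
    decision: str
    reviewed_by: str


def append_record(path, record):
    """Append one record to the trail as a single timestamped JSON line."""
    entry = asdict(record)
    entry["timestamp"] = datetime.now(timezone.utc).isoformat()
    with open(path, "a", encoding="utf-8") as trail:
        trail.write(json.dumps(entry) + "\n")


def read_trail(path):
    """Return every entry in the order it was recorded."""
    with open(path, encoding="utf-8") as trail:
        return [json.loads(line) for line in trail if line.strip()]


def demo():
    """Record two example prompts and read the trail back."""
    path = os.path.join(tempfile.mkdtemp(), "prompt-trail.jsonl")
    append_record(path, PromptRecord(
        prompt="Add validation and error states to the email question",
        outcome="Error summary and inline error message added",
        decision="Accepted after manual check against the design pattern",
        reviewed_by="Designer",
    ))
    append_record(path, PromptRecord(
        prompt="Run an initial accessibility check",
        outcome="Flagged colour and sizing issues",
        decision="Fixes drafted, pending human accessibility review",
        reviewed_by="Researcher",
    ))
    return read_trail(path)
```

Appending rather than overwriting keeps the history intact, and recording the decision and the reviewer alongside each prompt makes it visible that a person, not the model, signed the change off.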

Where we’ve landed

Looking back, we started in the right place. This work helped us explore the space properly, test tools in practice and form a clearer view of where AI-assisted prototyping can genuinely add value.

It can help us get testable prototypes faster, reduce setup effort and speed up learning. But it cannot replace user needs, discovery, usability testing or the judgement required to build trustworthy public services.

The takeaway is simple: vibe coding may be useful for prototyping in government, but only when it sits inside a strong UCD and governance framework. Used well, it can accelerate learning. Used carelessly, it can shortcut the very checks that protect users.

As this starts to move beyond prototyping, it may also have implications for software engineering, including the potential for parallel run testing. From a UCD perspective, the next step is to keep comparing approaches, testing prototypes and evaluating the evidence.

So the real question is not whether government should use AI-assisted prototyping, but how it should use it responsibly to improve services without lowering the bar.

Talk to CGI’s Government and Justice design team about how AI-assisted prototyping can be used responsibly in your service journey.

About this author

David Bolokan

David Bolokan is a User-Centred Design Consultant in CGI’s Government and Justice business unit, specialising in accessible service design and AI-assisted prototyping.