Federal agencies have been using AI for some time now, but there have been persistent concerns about an inconsistent approach to adoption.

Government AI systems will be interacting with personal and private information. Agencies will have to develop informed consent models for this, as well as keep private information private.

Harms caused by wrong answers, model collapse, data and model bias, and the unpredictability of how humans and AI interact are ever-present.

GovAI’s official launch last month, together with the new AI technical standards, marks the government’s first holistic attempt to get every agency on the same page.

Government representatives told The Mandarin they were wary of how previous tech bungles had not only set the government back practically but also affected public trust.


The government’s approach reflects a desire not only to deliver on the productivity promises of AI, but to persuade the public it can be trusted to use the technology in the public interest.

Agencies working with GovAI were eager to tell us how they were approaching this.

What is GovAI?

Marcel Gabriel discussed use cases for GovAI at the AI in Government Showcase last month.

GovAI is a platform for developing and testing AI for government use, allowing agencies to build and trial generative AI-based applications.

The Department of Finance kicked off the “proof of concept” project in May last year.

Assistant secretary for ICT in government services Marcel Gabriel said his branch’s previous success with GovTEAMS, the Parliamentary Document Management System (PDMS), and encrypted communication service GovLINK made them the obvious choice for developing a whole-of-government AI platform.

He said events like robodebt had made both the public service and the public at large cautious of government AI use.

“Trust from the outside and confidence on the inside were low. That’s a people problem, and we are absolutely working through that now,” he said.

“So it started slowly and I think the call came to us to say, ‘let’s start learning by doing’, which is where we fit in really well.

“We’re very well aligned with the Digital Transformation Agency. They’re in the portfolio, and I think there’s an opportunity for the compliance frameworks to grow as well. 

“Policy and implementation need to feed off each other. So we want to create a little bit of a feedback loop — let agencies apply the frameworks and tell us how that goes.

“Using our collaboration services, I think, is a no-brainer. Bringing people together, [to] support each other, telling stories. Our use case library is essentially our stories.

“We are trying to build up to even include stories that include failures … People try to avoid talking about that. But it’s a fundamental part of learning.”

Privacy and security

Agencies are testing new systems in GovAI’s local sandbox. That means models are stored on government systems, thereby eliminating the need for information sharing with the model-makers.

Levels of security vary depending on the specific application, but data is never transmitted in plain text.

It’s a stark contrast to the approaches of the US and UK governments, which are developing information-sharing programs with model makers like OpenAI and Google.

Agencies whose systems will ultimately handle citizens’ data are testing first on publicly available or synthetic data, until developers are confident they know how the systems will respond in the real world.

Department of Home Affairs

The Department of Home Affairs presented the results of internal testing at the AI in Government Showcase last month.

Home Affairs has been using AI for service delivery, law enforcement intelligence and security, administrative tasks, and compliance and fraud protection.

The department is primarily running “off the shelf” models, focusing on developing guardrails and prompt engineering.

In practice, this means constraining existing models to ensure they provide answers in clean bureaucratese appropriate to the use case, and ensuring users know how to “prompt” the model to give them the information they are looking for.

Home Affairs uses retrieval-augmented generation (RAG) to constrain models to particular indexes and datasets for particular use cases. This ensures information retrieved by the model is from a credible internal source.
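The department hasn’t published its implementation, but the RAG pattern it describes can be sketched roughly as follows: rank passages from an approved internal index against the user’s question, then build a prompt containing only those passages. The index contents, scoring method and prompt wording here are illustrative, not Home Affairs’ actual system.

```python
# Minimal RAG sketch. Real systems use vector embeddings rather than
# keyword overlap; everything here is illustrative.

def retrieve(query: str, index: list[dict], top_k: int = 2) -> list[dict]:
    """Rank indexed passages by naive keyword overlap with the query."""
    terms = set(query.lower().split())
    scored = sorted(
        index,
        key=lambda doc: len(terms & set(doc["text"].lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(query: str, passages: list[dict]) -> str:
    """Constrain the model to answer only from the retrieved passages."""
    context = "\n".join(f"[{p['source']}] {p['text']}" for p in passages)
    return (
        "Answer using only the sources below, and cite them.\n"
        f"Sources:\n{context}\n\nQuestion: {query}"
    )

# Hypothetical internal index entries.
index = [
    {"source": "Policy-01", "text": "Visa applicants must lodge form 1234 online."},
    {"source": "HR-07", "text": "Leave requests are approved by the branch manager."},
]
query = "Which form must visa applicants lodge?"
prompt = build_prompt(query, retrieve(query, index))
```

Because the prompt contains only vetted passages, the model’s answer can be traced back to a named internal source — the “credible internal source” guarantee described above.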

A departmental spokesperson said that while “fine-tuning” models for specific use cases will likely provide them with better results in the long run, this is still too expensive to be practical.

“It requires a lot of hardware and computational power to let us train the model … fine-tuning the parameters to suit a specific need.

“We [currently] focus on prompt engineering and RAG … but we really hope we can get a powerful computational infrastructure to do some fine tuning.”

In the interests of keeping it simple, Home Affairs tested AI on small tasks with limited to no public consequences. Operating these within the GovAI sandbox allowed testing without information being transferred offshore through the open internet.

In one case, an APS6 delivered an Ollama chatbot to update some of the Home Affairs legacy Java codebase. This was completed in two weeks with minimal oversight from an EL1.
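Ollama runs open models entirely on local hardware, which is what keeps code and data inside government systems. A minimal sketch of calling its local HTTP API — the model name and prompt are illustrative, and `ask` assumes a server already running on Ollama’s default port:

```python
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_request(model: str, prompt: str) -> dict:
    # "stream": False asks Ollama for one complete JSON response
    # instead of a stream of chunks.
    return {"model": model, "prompt": prompt, "stream": False}

def ask(payload: dict) -> str:
    """Send the prompt to a locally running Ollama server."""
    req = request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

payload = build_request(
    "codellama",  # illustrative model name; any locally pulled model works
    "Rewrite this Java 7 loop using the streams API: "
    "for (String s : xs) { out.add(s.trim()); }",
)
# answer = ask(payload)  # requires an Ollama server running locally
```

Nothing in this loop touches the open internet, which is the point of running the model in the sandbox rather than calling a hosted API.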

Another helps categorise responses from APS census data as positive or negative and provides contextual information for time-poor executives. Home Affairs is using similar tools to help measure sentiment around change at the agency.
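A common guardrail for this kind of triage is to force the model to choose from a fixed label set and reject anything else. The labels and prompt wording below are assumptions for illustration, not the department’s actual configuration:

```python
# Illustrative sentiment-triage guardrail: constrain the model to a
# fixed label set, and flag any other output for human review.

LABELS = ("positive", "negative")

def triage_prompt(comment: str) -> str:
    """Build a prompt that asks for exactly one allowed label."""
    return (
        "Classify the staff comment below as exactly one word, "
        f"either {LABELS[0]!r} or {LABELS[1]!r}.\n\nComment: {comment}"
    )

def validate(model_output: str) -> str:
    """Guardrail: accept only an exact label, otherwise flag for review."""
    label = model_output.strip().lower()
    return label if label in LABELS else "needs-review"
```

The validation step matters more than the prompt: whatever the model returns, only a recognised label reaches the executive summary.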

These tasks were chosen because they don’t use personal data, and can therefore be considered “low risk”.

A potentially higher-stakes use case involves a question-and-answer bot designed to provide expert information on immigration.

The RAG-enabled bot prompts users through a series of questions to determine eligibility and requirements for a given visa, or which visas someone might be eligible for. 
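One way to structure such a flow is to let scripted questions narrow the candidate visa list before any free-text generation happens. The questions, visa names and eligibility rules below are invented for the example — real visa criteria are far more complex:

```python
# Illustrative eligibility-narrowing flow. Visa names and rules are
# made up for the sketch; they are not real immigration criteria.

QUESTIONS = {
    "working": "Do you intend to work in Australia?",
    "studying": "Are you enrolled with an Australian education provider?",
}

RULES = {
    "Student visa (example)": lambda a: a["studying"],
    "Work visa (example)": lambda a: a["working"] and not a["studying"],
}

def eligible_visas(answers: dict[str, bool]) -> list[str]:
    """Return the visas whose rules are satisfied by the user's answers."""
    return [visa for visa, rule in RULES.items() if rule(answers)]
```

In the department’s design, a RAG step would then attach the legislative references for each surviving candidate, so the officer can verify the answer against the source.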

Another departmental spokesperson said this would potentially save the agency a lot of time.

“While answering questions, [the bot] can give you references to where in the legislation and the policy that the answers are coming from, allowing you to verify your answers.

“In some ways it’s a quicker search and a more accurate search … but also giving lists like ‘here’s the five requirements someone must meet to have this visa’.”

The Department of Veterans’ Affairs

Alicja Mosbauer at the AI in Government Showcase last month.

Veterans’ Affairs (DVA) has developed “Poppy” to help claims agents identify relevant regulations for reports.

Conceptually, it’s designed to do the same job as Home Affairs’ chatbot. That is, identify policy and legislation relevant to a given claim or enquiry.

Recently appointed chief data officer Alicja Mosbauer said it was showing great potential to speed up the claims process and reduce the backlog.

“In our claims environment, you get these massive documents, and it might take somebody you know days to weeks to go through that particular claim. 

“We have a massive number of people who are trying to do these claims processing pieces … [but] they’re not trained medically and they have to manually tab a series of different PDFs to try and understand if the claim is about a shoulder or a knee, or does the claim contain different particular body parts?

“There’s lots of personal information in that, so as part of the GovAI environment, we wanted to look at how we might be able to do that in a way that was really, really safe.

“We’ve created a whole lot of synthetic data, made up a whole lot of content, so we can test this … and we’ve been able to very quickly and easily prove we can do this a whole lot faster.

“This is not something that’s going into production tomorrow, but I think the GovAI environment has really demonstrated how we can go from May to now … into a really well-developed concept and have an understanding of what our road map is.”