Proving Grounds: Governments Build Sandboxes to Test AI

Separated from live systems and sensitive public data, sandboxes let states and cities test drive artificial intelligence use cases without impacting services.

In Massachusetts, CIO Jason Snyder has concerns about artificial intelligence. Many publicly available AI applications, for example, retain user data for training. “There’s significant risk there: That data is not ours to share,” he said. There are other concerns as well. Can the AI do what it promises? Will it deliver accurate outputs?

Snyder is looking to explore emerging AI use cases, but he wants to do it safely. With this in mind, Massachusetts has ramped up an “AI sandbox” — a cordoned-off space where it’s possible to test-drive emerging AI capabilities without impacting live systems and data.

For both state IT organizations and, increasingly, city technology departments, sandboxes offer a lower-risk way to embrace AI and accelerate innovation.

WHY A SANDBOX?


New Jersey Chief Innovation Officer Dave Cole described his state’s sandbox as a response to growing interest in AI as a support for various mission sets.

“Folks have a lot of ideas about how AI can help improve the work that we’re doing, how it can deliver better services,” he said. “In October of 2023, our governor put out an executive order tasking the state with finding responsible, effective ways to deploy this new technology.”

That led to the creation of a task force aimed in part at ensuring responsible and ethical use of AI. This in turn spurred creation of the sandbox. “We didn’t want to let folks just sort of figure out what tools to use on their own,” Cole said.

In Massachusetts, Snyder turned to a sandbox as a way to bring consistency to AI efforts. “We wanted to make sure that we set clear guardrails for everybody,” he said. “By creating a sandbox, not only did we retain ownership of the data, but we also could insert terms and conditions that everybody would agree upon.”

At the city level, too, some IT leaders are embracing sandboxes as a way to mitigate risk while still driving innovation.

Given the rapid pace of change in AI, “we realized that the standard linear approach to upgrades in technology just wasn’t going to get us where we need to go,” said Jeff Auker, director of Development Services in Hartford, Conn.

The city wants to accelerate AI adoption, but with so many AI tools still largely untested, any information on their effectiveness “is going to be anecdotal at best,” he said. Hartford’s sandbox offers a place to gather real-world insight into AI’s capabilities across a range of use cases, from planning and zoning to 311 response.

In Washington, D.C., meanwhile, Chief Technology Officer (CTO) Stephen Miller is looking to sandboxes as a way for the IT department to help mission leaders safely explore AI’s potential.

Sandboxes offer “safe environments for our teams to work with our customer agencies, so we can get in there and play with the tools, try interesting things, in a risk-free environment,” he said.

“It also gives us a better idea of how the tool’s going to perform, what it’s going to look like when it’s live,” he said.

Washington, D.C., CTO Stephen Miller said a city’s small scale is an asset in getting AI work done, as they can “work as a whole government” to align priorities.

BRINGING SANDBOXES TO LIFE


Across the board, these IT leaders are leveraging commercial cloud to deliver the isolated environment in which AI experimentation can take place.

Massachusetts, for instance, taps the Amazon Web Services (AWS) ecosystem. The sandbox incorporates Amazon Bedrock, a service that helps users build generative AI applications, along with Amazon Kendra, a machine learning-powered search service that helps users find information across their organization’s content.

With those tools in hand, the state worked with AWS “to wall off an area within our overall AWS account system for AI use, and AI use only,” Snyder said. The team is running its experiments in this safe area.

By putting the sandbox in a commercial cloud, the state was able to access needed support, with AWS providing training to those who would be running experiments. That includes both state agencies and university students: Researchers from Northeastern University, for example, are using the sandbox to look at AI use cases related to transportation, health care and grant distribution.

In New Jersey, the IT team put the sandbox in a secure, isolated environment within Microsoft Azure. “The application doesn’t have access to state data, systems or anything like that. It’s a standalone, isolated application,” Cole said.

Access to the sandbox is secured through the same authentication platform that employees already use to access other work systems. “That allows us to operate with a level of confidence, knowing that these are the same systems we use for document management, for email and for other cases where you might have sensitive information,” he said.

The sandbox incorporates a Microsoft open source, chat-based client that helps facilitate the user experience. “We took that client and added in things like document uploads, [enabling] people to attach files to their prompts,” Cole said. That chat client makes it easy to experiment in the sandbox, “and we’ve made that available to other states who are building out similar sandboxes.”

At the city level, Washington, D.C., likewise is using the Microsoft cloud environment to host its sandbox. This provides “a safe, isolated environment that’s apart from our actual production environment. We’re isolating the data, we’re isolating the models,” Miller said. Within the sandbox, experiments leverage no-code solutions for simplicity. “We want to make sure that we’re making things as seamless as possible.”

In terms of use cases, the district is looking especially at chatbots for customer support on city websites. “We want to make sure we’re helping these chatbots [deliver] a self-service model, by improving response times, improving customer satisfaction, improving the quality of customer support that we’re giving out,” Miller said.

Sandboxing helps to minimize the risk that arises when a municipal chatbot makes AI-generated answers publicly available. “We want to make sure that they’re doing what they’re doing in a way that we understand,” and that outputs are safe and equitable, he said.

Beyond chatbots, early sandbox efforts include a look at how AI can support more efficient procurement processes — a topic near and dear for Miller. “We’re the technology agency, so we do a lot of procurement around technology. But procurement is a very sensitive space,” he said.

“We’re utilizing a sandbox to see how AI is going to make it easier, how we’re going to get statements of work done faster,” he said. “These are sensitive models with sensitive outcomes, and we need to make sure through these sandboxes that we are doing this in the right way, that we are increasing quality, that we are shrinking the time to deliver.”

Hartford likewise is using a walled-off space within a major public cloud provider to host its sandbox. With test data segregated from production data, the city is interfacing with tools like Accela and OpenGov to explore AI-assisted capabilities.

“Within our internal IT teams and processes, we are working to define the use cases, to be very clear about who’s going to get access to the tools and to set up well-defined scripts to start testing,” Auker said. With the right protocols in place, “we can compare what we know are the right answers to what the tools are printing out. That’s going to get us comfortable.”
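The scripted comparison Auker describes, checking tool output against answers staff already know to be right, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Hartford's actual test harness: the `ScriptedCase` type, the `run_script` helper, the `fake_tool` stand-in and the sample questions and answers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ScriptedCase:
    prompt: str
    expected: str  # the answer staff already know to be correct


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences are ignored."""
    return " ".join(text.lower().split())


def run_script(cases: list[ScriptedCase], ask_tool: Callable[[str], str]) -> dict:
    """Send each scripted prompt to the tool and collect any mismatches."""
    failures = []
    for case in cases:
        answer = ask_tool(case.prompt)
        # Pass if the known-correct answer appears somewhere in the tool's reply.
        if normalize(case.expected) not in normalize(answer):
            failures.append((case.prompt, case.expected, answer))
    return {
        "total": len(cases),
        "passed": len(cases) - len(failures),
        "failures": failures,
    }


def fake_tool(prompt: str) -> str:
    """Hypothetical stand-in for the AI tool under test (made-up answers)."""
    canned = {
        "Which zone allows mixed apartment and retail use?": "That would be the  MX-1 zone.",
        "How many days does a permit review take?": "Usually 45 days.",
    }
    return canned.get(prompt, "I don't know.")


# Made-up test script: the second expected answer deliberately disagrees with the tool.
cases = [
    ScriptedCase("Which zone allows mixed apartment and retail use?", "MX-1 zone"),
    ScriptedCase("How many days does a permit review take?", "30 days"),
]
report = run_script(cases, fake_tool)
```

A substring match is a deliberately crude check. A real evaluation would likely need human review or a richer scoring method for free-text answers.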

WHO IS INVOLVED?


Whether at the state or city level, an AI sandbox effort goes well beyond just the IT team.

In New Jersey, the Office of Innovation is situated within the governor’s cabinet and is leading the sandbox effort as part of its overall push to improve digital service delivery across the state. To launch the sandbox, “we worked very closely with the Office of Information Technology, OIT, which is a peer agency or organization. And this was a true partnership,” Cole said.

“OIT traditionally provides the platforms and the infrastructure for state technology. We focus more on the human-centered side of things: How do we use technology to deliver better results?” he said. The sandbox “required both teams coming together. We worked to build out the website, the interface design, the system, while OIT provided access to cloud-based platforms.”

In Massachusetts, CTO Bill Cole and a team of architects got the ball rolling on AI innovation, with Snyder’s team joining the effort in July 2024. “We recognized the need for what we describe as a center of excellence, to provide that overall AI governance,” Snyder said. That center of excellence now runs on the talents of the state’s CISO/chief risk officer and deputies; the chief technology officer and deputy; the chief privacy officer/general counsel and deputies; the chief IT accessibility officer; the chief of staff; director of contract management; and enterprise cloud architects. 

The center of excellence provides both access to the sandbox and guidance — for example, in putting limits around what might be tested. “We had one use case that involved essentially trying to map the state Capitol, but we don’t want to do that, for physical security reasons,” Snyder said. “We could see how there would be benefit to that, but we also recognize the risk.”

The center of excellence also collaborates with mission leaders as AI applications emerge from the sandbox. “Operationally, who’s supporting this code that you’ve created? There’s the risk of having all of these great innovations, and no one supports it in production,” he said. Part of the sandbox effort includes not just developing the applications, but “ensuring that they have the access, the environment and the operating plans going forward to support that new code.”

In Connecticut, the sandbox is organized under a larger statewide effort known as the Connecticut AI Alliance, which brings together not just state IT leaders but also area colleges and corporations.


The sandbox effort itself draws support from Auker and the state’s head of IT, as well as mission leaders from areas such as licensing, inspections, and planning and zoning — the people likely to benefit from AI innovations.

“If somebody wants to go stand up 400,000 square feet of mixed apartment and retail space on a lot that the city owns, that needs some environmental abatement — that process is going to touch a lot of people,” Auker said, and with mission leaders informing the AI-related requirements around that process, the end product is more likely to meet the actual need.

In Washington, D.C., meanwhile, Miller is looking to involve not just city workers but also private-sector partners in the sandbox effort.

“The key thing to bringing those sandboxes to life is working with your partners to understand what’s going to be available” in terms of emerging AI capabilities, he said. “Our strategic partners understand their tools, and they work with us to get a level-set of what’s going to be possible.”

Hartford, Conn.’s sandbox is a place to get concrete data on AI use cases when so much current information “is going to be anecdotal at best,” said Director of Development Services Jeff Auker.

CITY-SPECIFIC CONSIDERATIONS


AI sandboxing is just beginning to emerge as a consideration at the municipal level, and it’s likely that cities will need to approach this with strategies that differ from those being used at the state level.

Given limitations around staffing and funding, “I could see them having scale problems,” Snyder said. “But I think you can offset that with consultative help, or vendor-partner help.”

Auker said Hartford is able to leverage its ties to the Connecticut AI Alliance to address scalability questions. “The state has a rich set of IT resources available to them,” he said. In addition, the city’s own ecosystem is helpfully robust: School districts, for example, have been proactive in exploring AI, and they bring that talent to the table in support of innovation.

It helps, too, to have corporate partners at the table. Big companies can support AI sandboxing efforts, and probably should. “It benefits them to have a much more streamlined city to do business with,” Auker said.

As applications emerge for production, municipal IT should be well-situated to bring them to life. Because city IT teams are laser-focused on practical outcomes, “we do have some advantages: We know what it’s supposed to look like in the end,” Auker said. “In the cities, we’re really good at the nitty-gritty of implementation.”

In Washington, D.C., Miller too sees cities having some plusses in their corner as they look to ramp up AI sandboxes. For example, a smaller bureaucratic footprint can make it easier to get things done.

On a municipal scale, officials from departments ranging from transportation to health and human services can connect easily with the IT team to bring AI-driven applications into the testing environment. “We can work as a whole government to overcome obstacles as we identify them, and make sure that they’re aligning with the way that we want to put AI in place,” he said.

BEST PRACTICES


For state and city IT teams looking toward a sandbox approach in support of AI innovation, a few best practices already have emerged.

Snyder describes training as key to success. “We have 150 users already in the sandbox today — it supports quite a large group of people,” he said. Even for those with some AI experience, “just becoming familiar with the environment is critical.”

Rather than set people loose in the sandbox without any formal guidance, “you really have to provide the learning to get them to be able to use it,” he said, noting that both internal teams and vendors can play a role here. AWS, for example, has delivered training in support of the Massachusetts sandbox.

Cost factors in as well. With 14,000 active users in the New Jersey sandbox, Cole is attuned to the potential for cloud expenses to spiral. With this in mind, he’s implemented a per-use cost structure in the sandbox.

With a per-use model, “we focus on building on APIs that are transactional: We are paying per batches of tokens on request,” he said. This approach “radically reduces the costs. When people aren’t using it, we’re not paying for it.”

As a safeguard, he also keeps an automated eye on the bottom line. Suppose there was suddenly a 300 percent increase in sandbox usage. “We have monitoring and alerting, a sort of trip switch that would control for those costs,” he said.

In a pinch, the IT team could dial down the usage, although right now the sandbox is performing well within its expected budget, he said.
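To make the per-use model and the “trip switch” concrete, here is a minimal Python sketch. It is not New Jersey's implementation: the `UsageGuard` class, the baseline figure and the per-token price are hypothetical values chosen for illustration. The one fixed piece of arithmetic is that a 300 percent increase means usage at four times the baseline.

```python
from dataclasses import dataclass


@dataclass
class UsageGuard:
    baseline_tokens_per_day: float  # assumed typical daily usage (hypothetical)
    price_per_1k_tokens: float      # assumed per-use price (hypothetical)
    trip_ratio: float = 4.0         # a 300 percent increase is 4x the baseline
    tokens_today: int = 0
    tripped: bool = False

    def record(self, tokens: int) -> bool:
        """Count a request's tokens; return False once the switch has tripped."""
        if self.tripped:
            return False
        self.tokens_today += tokens
        if self.tokens_today >= self.baseline_tokens_per_day * self.trip_ratio:
            # Later requests are refused until someone resets the guard.
            self.tripped = True
        return True

    def cost_today(self) -> float:
        """Per-use billing: cost scales with tokens consumed, zero when idle."""
        return self.tokens_today / 1000 * self.price_per_1k_tokens


guard = UsageGuard(baseline_tokens_per_day=100_000, price_per_1k_tokens=0.01)
idle_cost = guard.cost_today()       # nothing used yet, nothing owed
first = guard.record(50_000)         # normal traffic is allowed
second = guard.record(350_000)       # allowed, but pushes usage to 4x baseline
third = guard.record(1_000)          # refused: the switch has tripped
```

In practice the alerting would more likely live in the cloud provider's budgeting and monitoring tools than in application code. The sketch only shows the logic of paying per token and cutting off usage at a threshold.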

As government IT shops advance sandbox efforts, Miller said, data should be the starting point. “You need to understand what data you plan on providing these AI tools. That helps you understand the risks that you’re taking,” he said.

“If it’s health information or student information, that’s where those sandboxes will help you to know: How is this data transiting into this tool? What is this tool doing with that data after we close that window? That’s really where those sandboxes are going to help you,” he said.

Finally, he encourages government IT leaders to learn from others in this space. “We all have to make sure that those tools are safe and equitable, that they’re not being misused,” he said.
Adam Stone is a contributing writer for Government Technology magazine.