The findings point to a clear tension, one that two of the authors discussed with the Center for Digital Education: AI can improve student performance while in use, but those gains appear to fade when the technology is removed.
CONTEXT AND SCALE OF RESEARCH
Despite the swift adoption of AI in schools, rigorous research on its impact on teaching and learning remains limited, the report emphasized: Most studies examine whether students using AI tools perform better, yet they do not isolate whether the technology itself caused those gains.
While the report analyzed more than 800 studies on AI in education, Stanford researchers ultimately relied upon approximately 20 high-quality causal studies, which are designed to determine whether AI directly drives changes in student outcomes.
“Causal research is the best way to tell how a tool, like an AI tool, impacts students and educators, and so that’s a different question than what you get by looking at survey data or just kind of descriptive outcomes,” said Lily Fesler, a senior researcher at the AI Hub for Education at Stanford University and a co-author of the report. “So that’s why we focused on causal research studies.”
That distinction, she continued, matters for school leaders making decisions about adopting AI. Without causal evidence, improvements could be driven by factors like teacher experience, student motivation or classroom context, rather than the tool itself.
But even among hundreds of studies, only a small number meet that bar. The result, researchers said, is a fast-moving market with limited proof of impact.
“I think the headline is, it’s still too early to know if this is doing school faster ... or if it’s reimagining. I think research shows hints of both,” said another co-author of the report, Chris Agnew, who serves as managing director of the AI Hub for Education. “The headline is mixed, and it requires real intention and much further research.”
Agnew added that much of the existing research reflects tools that are currently available, which are largely one-to-one, chatbot-style products. Moreover, that data often does not capture how AI is actually implemented in classrooms.
EVIDENT UPSIDES
According to both Agnew and Fesler, though evidence of AI's benefits in education exists, it points mainly to specific, limited gains, particularly when students have access to step-by-step support and immediate feedback.
Fesler said studies show AI tools can help students improve performance on structured tasks, such as working through math practice problems and receiving feedback on their writing.
The report also found that AI can meaningfully shift how teachers spend their time: Tools can reduce the time spent on tasks like grading, lesson planning and feedback generation — sometimes by as much as 30 percent — without lowering lesson quality.
However, Fesler noted that time saved does not inevitably mean teachers are working less.
“We know that it does save teachers time, but that doesn’t necessarily mean that teachers are spending less time [working] at home,” Fesler said. “So that means they’re reinvesting that time and spending it on other things that they think can support student learning in their classrooms.”
Agnew pointed to broader potential in expanding access to AI and giving students more control over their learning.
“I think there’s a real sense of agency that it can build in individuals,” he said, adding that AI could help support students with diverse learning needs.
PROMINENT DOWNSIDES
The report’s most consistent concern is what happens when AI is no longer available.
Across studies, students often perform better while using AI tools. However, they struggle to replicate those results independently, and in many cases, gains appear to weaken or disappear once the tool is removed.
“There are a number of different applications where AI can improve student performance while students have access to the tools, but importantly, a lot of times that does not translate to when the tools are taken away,” Fesler said. “And evidence is more mixed on whether students continue to perform well when the tools are taken away.”
Fesler described this as cognitive offloading — when students rely on AI to do thinking for them instead of developing their own understanding.
“If students are really relying on the AI tools to just do the assignment for them, they’re thinking a little less, engaging less, during the task,” Fesler stated. “Their performance on the task can improve … but they’re not necessarily internalizing and learning from that experience.”
Agnew highlighted that those risks extend beyond academic work.
“Areas that I’m concerned about are ... the offloading of fundamental skill development, the erosion of critical thinking and a changing sense of self because of technology in young people,” he said.
Design also plays a critical role. Tools that guide students through reasoning tend to produce better outcomes than those that generate answers outright.
“It really depends on the pedagogical design of the AI tool,” Fesler said. “There are certainly ways that students can be cognitively engaged when using an AI tool, but it’s not a given. It needs to be designed into the tool.”
POLICY RELEVANCE
For district leaders, the findings surface a key consideration when it comes to purchasing AI tools: When does the technology function as a useful tool, and when does it become a crutch in the learning process?
“I think any product that is making strong outcome claims should be heard cautiously, because, as we know, the research is very early,” Agnew said, emphasizing that leaders should press vendors on whether tools improve learning independently and not just performance while students are using them.
Researchers also cautioned against replacing core elements of teaching with technology.
“AI, being a tool that has promise, does not mean it’s better for kids to spend more time just in front of computers or their phone,” Fesler said, emphasizing that relationships and in-person learning remain central.