Deployed Before Proven
Governing AI in Schools: The Developmental Stakes
The debate about screens in schools is being conducted almost entirely in the language of outcomes. Test scores, engagement metrics, effect sizes, procurement accountability: these matter, but they are proxies for something more fundamental that the policy conversation is not discussing.
The question is not whether EdTech improves learning outcomes. The question is what kind of brain forms in what kind of environment, and whether the answer is reversible.
It isn’t; not fully. And that changes everything about how we should be thinking about AI entering the same space.
Development Is Not a Minor Variable
Human cognitive capacity is not a fixed endowment that environmental conditions either support or interfere with at the margins. It is built through cellular environment, gene expression, and the progressive consolidation of neural architecture during windows that are open once and then close. The attentional systems that allow sustained reasoning, deep reading, and complex argument-following are not present at birth. They form at the intersection of nutrition, stimulation, and adaptation. The environment during formation is not background noise: it is a primary determinant of what capacity emerges.
This is not a metaphor; it is how neurodevelopment proceeds. The brain's anatomical differentiation depends on stimulation from the environment — what genes a cell expresses, what connections it maintains or prunes, where in the cortical architecture it ultimately fits. During sensitive and critical periods, experience instructs neural circuits to process and represent information in ways that then become foundational. During critical periods specifically, the literature is now clear that this instruction produces configurations that cannot be replaced by alternative connectivity patterns afterward: consequences that are, in the precise scientific sense, irreversible.
Attention is the capacity most directly at stake during the developmental windows when children now encounter screens. The research literature is explicit: attention is the gateway to higher-order cognition, a prerequisite for working memory, reasoning, decision-making, and inhibitory control — the ability to filter irrelevant information and sustain focus on what matters. A child whose attentional architecture forms in an environment optimized for rapid switching, novelty, and continuous engagement capture does not simply have worse attention than a child whose architecture formed under different conditions. They have reduced access to the downstream capacities that attention permits. The ability to hold an argument in mind long enough to evaluate it, to track the variables in a problem, to follow an evidence chain to its conclusion — these depend on attentional infrastructure that was either built or wasn't.
Longitudinal evidence now supports this directly. Screen time at age two predicts executive function deficits at age three, controlling for verbal ability and other covariates. A larger study following more than two thousand children from ages three to five found that greater screen time at one time point predicted lower executive function a year later. Critically, the effect was unidirectional: executive function did not predict later screen use. The environment was shaping the developing system; the developing system was not correcting for the environment.
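To make the unidirectional claim concrete, here is a minimal sketch of the cross-lagged logic on synthetic data. Every detail in it (sample size, coefficients, variable names) is an illustrative assumption, not a value from the cited study; the point is only the shape of the test: each later measure is regressed on both earlier measures, and the two lagged paths are compared.

```python
# Cross-lagged sketch on synthetic data (illustrative assumptions throughout).
import numpy as np

rng = np.random.default_rng(0)
n = 2000
screen_t1 = rng.normal(size=n)   # screen time at time 1 (standardized units)
ef_t1 = rng.normal(size=n)       # executive function at time 1

# Assumed generative structure, for illustration only:
# screen time depresses later EF; EF has no effect on later screen time.
ef_t2 = 0.5 * ef_t1 - 0.3 * screen_t1 + rng.normal(size=n)
screen_t2 = 0.5 * screen_t1 + rng.normal(size=n)

def cross_lagged(outcome, predictor, outcome_lag):
    """OLS coefficient on `predictor`, controlling for the outcome's own lag."""
    X = np.column_stack([np.ones(n), outcome_lag, predictor])
    beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    return beta[2]

# Path 1: earlier screen time -> later executive function.
print("screen(t1) -> EF(t2):", round(cross_lagged(ef_t2, screen_t1, ef_t1), 2))
# Path 2: earlier executive function -> later screen time.
print("EF(t1) -> screen(t2):", round(cross_lagged(screen_t2, ef_t1, screen_t1), 2))
```

Run as written, the first path recovers a clearly negative coefficient and the second hovers near zero: the one-way pattern the study reports.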
The population-level rise in ADHD diagnoses over the same period that screen exposure became pervasive in early childhood is worth examining carefully. ADHD diagnostic rates reflect multiple overlapping factors: broadened criteria, increased clinical awareness, reduced stigma, and variation in how schools identify and refer children. Attributing the trend to any single cause overstates what the epidemiological record supports. But from a neuroscience standpoint the mechanistic question is pointed: if attentional architecture forms during specific developmental windows, if the environment during formation determines what gets built, and if screen environments are optimized precisely for the attentional pattern ADHD describes (distractibility, novelty-seeking, inability to sustain focus on low-stimulation tasks), then the hypothesis is not a leap. It is the parsimonious explanation, and the burden of proof arguably runs in the other direction. What the evidence does clearly support is this: screen environments during development produce attentional profiles consistent with ADHD symptomatology. Whether a given child receives a diagnosis depends on clinicians, criteria, and context. What their attentional architecture actually is depends on the environment that built it. The diagnostic label is variable; the underlying structure is not.
This matters because it reveals what we are actually deciding when we put screens in front of children during developmental windows. We are not making a pedagogical choice about delivery format. We are making a decision about the environment in which human beings will form, and about what those humans will therefore be capable of for the rest of their lives.
A note on neuroplasticity. The brain's capacity for change does not dissolve after childhood; adult neuroplasticity is real and well-documented. But neuroplasticity is not a recovery guarantee, and it is not symmetric across development. During sensitive and critical periods, the brain is maximally responsive to specific inputs in ways that produce lasting structural effects. Outside those periods, the mechanisms of plasticity differ: learning is possible but does not proceed through the same experience-dependent circuit formation. A child who missed the developmental window for robust attentional infrastructure can continue to develop, and targeted intervention has value. But the brain reorganizing in response to a screen-saturated environment is neuroplasticity working exactly as described: optimizing for the inputs present, not for the inputs that were absent. Recovery requires not just different inputs later, but sustained effort against architecture that was built for something else. The literature does not support the idea that this is equivalent to formation under better conditions.
That is not a minor variable. It is, in a real sense, the whole ball game.
What the EdTech Evidence Shows
The empirical record on EdTech in schools is by now substantial and consistent. International assessments that track millions of students across decades (PISA, TIMSS, PIRLS) show a monotonic inverse relationship between classroom screen exposure and performance in reading, mathematics, and science: more screen time, lower performance, across income levels, grade levels, and national contexts.
Meta-analyses tell a similar story when effect sizes are benchmarked honestly. Most EdTech research compares digital interventions against zero rather than against ordinary classroom instruction: a sleight of hand that inflates apparent benefit. When re-centered against the average impact of typical teaching, one-to-one laptop programs, online instruction, and general classroom technology integration all fall below the effectiveness of standard practice. The one partial exception is narrowly constrained adaptive tools for foundational skill drilling, which work because they automate repetition in well-defined domains, not because they enhance deep learning.
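To make the re-centering concrete, here is a minimal sketch of the arithmetic. The 0.40 baseline follows the common hinge-point convention for the average effect of ordinary teaching, and every intervention value is a hypothetical placeholder, not a figure from the meta-analyses cited below; only the comparison logic is the point.

```python
# Re-centering effect sizes: an intervention measured against "no treatment"
# looks good until compared with what ordinary instruction already achieves.
# All numbers below are illustrative assumptions, not results from the literature.

AVERAGE_TEACHING_EFFECT = 0.40  # assumed hinge-point-style benchmark (Cohen's d)

interventions = {                # hypothetical raw effect sizes vs. zero
    "one-to-one laptops": 0.16,
    "online instruction": 0.25,
    "general tech integration": 0.30,
    "adaptive skill drilling": 0.45,
}

for name, d in interventions.items():
    recentered = d - AVERAGE_TEACHING_EFFECT
    verdict = "above" if recentered > 0 else "below"
    print(f"{name}: d = {d:.2f} vs. zero -> {recentered:+.2f} vs. typical "
          f"teaching ({verdict} standard practice)")
```

On these assumed numbers, only the drilling tool clears the baseline, which mirrors the pattern the meta-analytic record describes.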
The structural explanation is not complicated. Human attention systems did not evolve for the behavioral patterns digital platforms train. Frequent task-switching, novelty-seeking, and engagement capture are platform design features — and they condition habits that directly conflict with the sustained focus learning requires. This is not a discipline problem. It is a conditioning problem, operating through the same mechanisms by which any repeated environmental pattern shapes neural architecture.
But here is what the test-score perspective fails to reveal: the children showing up in these assessments with lower reading comprehension and weaker mathematical reasoning are not simply students who learned less. They are people whose cognitive development proceeded in a particular environment, and who will carry the results of that environment forward. The score is a snapshot; the architecture is permanent.
Scaling Before Knowing
EdTech deployment happened without requiring independent efficacy evidence. Vendors made claims that districts lacked the capacity to evaluate. Equity rhetoric (device access as opportunity) substituted for evidence of learning. And the populations most harmed, the data now show, were the disadvantaged students the programs claimed to serve: they show the largest negative effects in the EdTech evidence base.
AI is entering the same procurement pipelines under worse epistemic conditions. The longitudinal evidence base for AI tutoring systems at K-12 scale is sparse. The systems are more opaque than laptops; it is harder to audit what an AI tutoring system is optimizing for than what a one-to-one device program is doing. And the deployment pressure is higher, because AI carries an urgency that Chromebooks never did. In April 2026 the U.S. Department of Education proposed priorities favoring AI adoption in federal grant programs, a move set to accelerate procurement across thousands of districts before outcome evidence exists.
The structural failure mode is identical to the first wave of tech in classrooms. AI systems in educational contexts are being optimized for engagement metrics. Engagement and learning are not the same thing; they are often incompatible. A system that keeps a child interacting is not necessarily building durable understanding, transferable reasoning, or the attentional capacity that makes future learning possible. Given what we now know about how attentional infrastructure forms, a system optimized for engagement during developmental windows may be doing something considerably worse than failing to teach the intended content. It may be shaping, at the level of neural architecture, the kind of mind the child will have.
Where AI Belongs in Schools
None of this means AI has no role in educational institutions. The criterion that matters is whether human cognitive engagement is the point of the task, or whether the task is administrative overhead that consumes time and attention that could go elsewhere.
AI deployed behind the student-facing wall has genuine potential: reducing the paperwork burden of IEP documentation and compliance reporting; drafting and stress-testing emergency protocols; flagging scheduling conflicts or resource allocation inefficiencies; supporting curriculum design as a drafting tool that teachers then evaluate, revise, and own. In these contexts, AI is relieving cognitive load on adults whose professional judgment remains the decision point. The human is not being replaced. The human is being freed to do the work that requires a human.
Teacher-facing AI tools for lesson planning and materials development fall into a similar category, with an important caveat: the teacher must retain enough engagement with the curriculum to actually evaluate what AI produces. A teacher who outsources lesson design to AI without genuine review is not saving time; they are degrading their own professional capacity while delivering content they do not fully understand. The tool is appropriate, and the workflow determines whether it is used well.
What does not belong in front of students, particularly young students in developmental windows, is AI as a primary instructional interface. AI tutoring systems, automated essay scoring that substitutes for teacher feedback, adaptive learning platforms as a replacement for human instruction: these apply the technology precisely where the EdTech evidence base is most damning, and where the engagement-optimization problem is most dangerous. The child sitting with an AI tutor is not just getting a different pedagogical delivery. They are spending developmental time in an environment engineered for engagement capture rather than for the conditions under which human cognition actually forms. Whether AI tutoring could be designed around the actual conditions of cognitive formation rather than engagement capture is an open question, but the current incentive structure does not reward that design, and no vendor is being asked to demonstrate it. The governance question precedes the design question.
One question the current evidence does not resolve deserves explicit acknowledgment. Structured technical education (teaching children the logic underlying computational systems, the principles by which algorithms operate, the reasoning required to identify and correct errors in a rule-based system) makes a legitimate claim on developmental time that passive consumption and engagement-optimized platforms do not. The cognitive demands are meaningfully different: debugging requires sustained attention, logical sequencing, hypothesis testing, and tolerance for failure. These are not incidental to the attentional infrastructure we are concerned about; they may actively develop it. Whether structured technical education of this kind produces cognitive benefits that outweigh the developmental costs of the medium itself is a question the longitudinal research has not answered. The studies distinguishing task-specific screen use from general screen exposure during developmental windows have not been done with the scale or rigor the question requires. Before drawing conclusions in either direction, we need that evidence: not to delay indefinitely, but because decisions made during irreversible developmental windows require the standard of proof the stakes demand.
The Governance Question
The EdTech record is clear. The relationship between deployment scale, evidence requirements, and outcome is documented. The question for AI in education is whether policymakers will demand independent efficacy evidence, longitudinal outcome measurement, and transparency about optimization targets before scale, or whether institutional enthusiasm and vendor pressure will again move faster than the research.
What makes this moment different is what we understand about the stakes. The EdTech conversation was conducted largely in terms of test scores and learning outcomes. But the developmental lens tells us what cognitive architecture forms in what environment, and what that means for the humans who will live with the results.
We are making decisions now about the environments in which children will develop. The procurement timelines are short. The developmental windows are not.
References
EdTech outcomes and international assessments
OECD (2023). PISA 2022 Results (Volume I): The State of Learning and Equity in Education. OECD Publishing. https://doi.org/10.1787/53f23881-en
Mullis, I. V. S., von Davier, M., Foy, P., Fishbein, B., Reynolds, K. A., & Wry, E. (2023). PIRLS 2021 International Results in Reading. Boston College, TIMSS & PIRLS International Study Center. https://doi.org/10.6017/lse.tpisc.tr2103.kb5342
OECD (2016). PISA 2015 Results (Volume I): Excellence and Equity in Education. OECD Publishing. https://doi.org/10.1787/9789264266490-en
Salmerón, L., Vargas, C., Delgado, P., & Baron, N. (2023). Relation between digital tool practices in the language arts classroom and reading comprehension scores. Reading and Writing, 36(1), 175–194.
Fittes, E. K. (2022). How much time are students spending using EdTech? Education Week, March 1, 2022. https://marketbrief.edweek.org/meeting-district-needs/how-much-time-are-students-spending-using-ed-tech/2022/03
Ragan, E. D., Jennings, S. R., Massey, J. D., & Doolittle, P. E. (2014). Unregulated use of laptops over time in large lecture classes. Computers & Education, 78, 78–86.
Effect size benchmarking and meta-analysis
Hattie, J. (2023). Visible Learning: The Sequel. Routledge.
Horvath, J. C. (2026). The Digital Delusion. LME Global Press.
Neurodevelopment, critical periods, and neuroplasticity
Knudsen, E. I. (2004). Sensitive periods in the development of the brain and behavior. Journal of Cognitive Neuroscience, 16(8), 1412–1425.
Black, J. E., Jones, T. A., Nelson, C. A., & Greenough, W. T. (1998). Neuronal plasticity and the developing brain. In N. E. Alessi et al. (Eds.), Handbook of Child and Adolescent Psychiatry. Wiley.
Cicchetti, D., & Tucker, D. (1994). Development and self-regulatory structures of the mind. Development and Psychopathology, 6(4), 533–549.
Reh, R. K., Bhatt, D. L., Bhatt, M., & Bhatt, R. (2020). Shifting developmental trajectories during critical periods of brain formation. Frontiers in Neuroscience, 14, 1–15. [Critical vs. sensitive period distinction; irreversibility of critical period configurations.]
Werker, J. F., & Hensch, T. K. (2015). Critical periods in speech perception: New directions. Annual Review of Psychology, 66, 173–196. [Sensitive periods can be opened and closed by experience; multiple overlapping windows throughout development.]
Bhatt, D. (2023). Adult neuroplasticity employs developmental mechanisms. PMC/Frontiers. [Adult plasticity real but mechanistically distinct from developmental plasticity.]
Screen time, executive function, and longitudinal evidence
McHarg, G., Ribner, A. D., Devine, R. T., & Hughes, C. (2020). Screen time and executive function in toddlerhood: A longitudinal study. Frontiers in Psychology, 11, 570392.
Ma, S., Wang, J., et al. (2025). A longitudinal analysis of screen time and executive function in young children. ScienceDirect/Early Childhood Research. [Screen time ages 3–5 predicts lower EF one year later; effect unidirectional.]
Attention as gateway to higher-order cognition
Cowan, N. (2009). Working Memory Capacity. Psychology Press.
Diamond, A. (2013). Executive functions. Annual Review of Psychology, 64, 135–168.
Posner, M. I., Rothbart, M. K., Sheese, B. E., & Voelker, P. (2014). Developing attention: Behavioral and brain mechanisms. Advances in Neuroscience.
Attention, task-switching, and memory
Kirschner, P. A., & De Bruyckere, P. (2017). The myths of the digital native and the multitasker. Teaching and Teacher Education, 67, 135–142.
Foerde, K., Knowlton, B. J., & Poldrack, R. A. (2006). Modulation of competing memory systems by distraction. Proceedings of the National Academy of Sciences, 103(31), 11778–11783.
Jolicoeur, P., Dell'Acqua, R., & Crebolder, J. (2000). Multitasking performance deficits: forging links between the attentional blink and the psychological refractory period. In Control of Cognitive Processes: Attention and Performance (pp. 309–330). MIT Press.
AI in education — policy and deployment
U.S. Department of Education (2026). Proposed Priorities, Requirements, Definitions, and Selection Criteria — Artificial Intelligence. Federal Register, April 2026.
Holmes, W., Porayska-Pomsta, K., Holstein, K., Luckin, R., et al. (2022). Ethics of AI in education: Towards a community-wide framework. International Journal of Artificial Intelligence in Education, 32, 504–526.
Note: Longitudinal outcome data for AI tutoring systems at K-12 scale remains sparse as of this writing. The absence of that evidence is itself part of the argument.