In 1892, a group of American educators calling themselves the Committee of Ten gathered to solve an administrative problem. Secondary schools had proliferated across the United States in the decades following the Civil War, each with its own curriculum and standards. The committee, chaired by Harvard president Charles W. Eliot, sought to impose order on this chaos.
Their 1893 report established the template that still shapes schooling today: discrete subjects taught in fixed periods, standardised assessments, and a clear division between those destined for university and those headed for industrial labour. As historian David Labaree has noted, this design emerged alongside the factory system, reflecting a society that needed workers who could follow instructions and perform specialised tasks reliably.
One hundred and thirty-three years later, that framework remains fundamentally unchanged. The bells still ring. Students still move between subjects as if knowledge came in silos. We still assess understanding through high-stakes examinations that reward memorisation and penalise collaboration.
The world these structures were built to serve no longer exists.
In January 2023, researchers at the University of Pennsylvania's Wharton School reported that ChatGPT had passed the final examination of a core operations management course in the school's Master of Business Administration programme, earning a grade between a B- and a B (Terwiesch, 2023). Around the same time, a separate study found that ChatGPT could perform at or near the passing threshold on all three steps of the United States Medical Licensing Examination (USMLE) without any specialised training (Kung et al., 2023). By 2024, subsequent research confirmed that GPT-4 could achieve scores sufficient to pass the bar examination in every US jurisdiction using the Uniform Bar Examination (Bommarito & Katz, 2024).
These are not isolated achievements. A study published in Scientific Reports found that GPT-4 received passing scores on graduate-level biomedical science examinations, with performance improving substantially as the models evolved (McLachlan et al., 2024). The pattern is clear: artificial intelligence systems can now perform at the level expected of professionals who have completed years of intensive training.
Meanwhile, the labour market outcomes for those who have undertaken that training are deteriorating. OECD data reveal significant graduate underemployment across developed economies, with the share of workers employed in jobs below their qualification level ranging from 13% in Italy to 31% in Japan (OECD, 2013). In Greece, nearly 20% of tertiary-educated workers are unemployed; in Spain, the figure approaches 15% (Statista, 2024). The promise that education guarantees employment – a foundational assumption of the past century – has grown increasingly threadbare.
Yet public discourse remains preoccupied with a narrower question: are students using AI to cheat? This framing, while understandable, mistakes the symptom for the disease. The issue is not that students might use machines to pass examinations designed for humans. It is that these examinations now better measure a skill – information recall in standardised conditions – that machines have already mastered than the capabilities that might actually matter in an age of artificial intelligence.
Some education systems have recognised that incremental reform will not suffice. In July 2021, China's Ministry of Education implemented what it termed the "Double Reduction" policy, formally known as the Opinions on Further Reducing the Homework Burden and Off-Campus Training Burden of Students in Compulsory Education (Chen & Lin, 2024). The policy represents a deliberate restructuring away from rote learning and the examination-driven culture that had come to dominate Chinese schooling. Early assessments indicate that the proportion of students completing written assignments within specified timeframes rose from 46% before implementation to over 90% by late 2021 (Ministry of Education of the PRC, 2021).
India's National Education Policy (NEP) 2020 adopts a similar trajectory. The policy explicitly aims to shift "from rote memorization to formative assessment, focusing on conceptual understanding, critical thinking, and analysis" (Government of India, 2020). It proposes a holistic, multi-disciplinary approach designed to foster "higher-order skills such as critical thinking and problem solving" alongside social and emotional capabilities including "cultural awareness and empathy, perseverance and grit, teamwork, leadership, [and] communication" (Ministry of Education, India, 2020).
Perhaps most significantly, Finland's National Core Curriculum for Basic Education, in force since 2016, mandated what it calls phenomenon-based learning (PhenoBL), requiring schools to incorporate multidisciplinary learning modules that examine real-world phenomena across subject boundaries (Symeonidis & Schwarz, 2016). Unlike reforms that merely adjust content within existing structures, phenomenon-based learning fundamentally reconfigures the relationship between student, teacher, and knowledge. The Finnish National Agency for Education describes this approach as preparation for "21st-century competencies" – a tacit acknowledgment that the discrete subjects codified by the Committee of Ten are no longer adequate to the complexities of contemporary life.
The education technology sector has largely understood AI as a tool for personalisation within existing structures – adaptive systems that adjust problem difficulty, tutoring programmes that supplement classroom instruction, or automated grading that frees teacher time. These applications, while useful, treat AI as an enhancement to the 1892 model rather than a challenge to it.
Autonomous AI represents something categorically different. Where generative AI produces content in response to prompts, autonomous AI agents can design and execute tasks independently – setting learning objectives, identifying knowledge gaps, sourcing materials, and adapting instructional approaches in real time based on learner responses. Research published in Procedia Computer Science describes frameworks where AI agents dynamically adjust "instructional difficulty" through continuous assessment of learner performance, creating personalised pathways that respond to individual needs without human intervention (Al-Ramahi et al., 2025).
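The core loop behind such dynamic difficulty adjustment is simple to illustrate. The sketch below is a hypothetical, minimal version of the idea – not the framework described by Al-Ramahi et al. – in which a tutor tracks a rolling window of recent answers and raises or lowers the difficulty level when accuracy crosses a threshold; all class names, parameters, and thresholds are illustrative assumptions.

```python
from collections import deque

class AdaptiveTutor:
    """Toy sketch of dynamic difficulty adjustment (hypothetical, for
    illustration only): raise the difficulty when recent answers are
    mostly correct, lower it when they are mostly wrong, hold otherwise."""

    def __init__(self, levels=10, window=5, raise_at=0.8, lower_at=0.4):
        self.level = 1                      # current difficulty, 1..levels
        self.levels = levels
        self.recent = deque(maxlen=window)  # rolling record of correctness
        self.raise_at = raise_at            # accuracy needed to step up
        self.lower_at = lower_at            # accuracy that triggers a step down

    def record(self, correct: bool) -> int:
        """Log one answer; return the (possibly adjusted) difficulty level."""
        self.recent.append(correct)
        if len(self.recent) == self.recent.maxlen:
            accuracy = sum(self.recent) / len(self.recent)
            if accuracy >= self.raise_at and self.level < self.levels:
                self.level += 1
                self.recent.clear()         # restart the window at the new level
            elif accuracy <= self.lower_at and self.level > 1:
                self.level -= 1
                self.recent.clear()
        return self.level
```

Real agent frameworks replace the rolling-accuracy heuristic with richer learner models and generate the content itself, but the feedback structure – assess continuously, adjust the pathway, repeat – is the same.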
The implications for educational equity are profound. A student in rural Bangladesh with a smartphone and an autonomous learning agent could theoretically access educational infrastructure comparable to that available at MIT – not in terms of physical facilities or social networks, but in the quality of instruction and the sophistication of learning design. The constraint would no longer be proximity to elite institutions but the availability of basic connectivity.
Whether this potential is realised depends on policy choices, not technology alone. But the technical capacity now exists to decouple educational quality from geographic location, institutional prestige, and the supply of trained teachers. For the majority of the world's students, who will never attend a university ranked in global league tables, this represents a potential discontinuity with historical patterns of educational stratification.
If autonomous AI systems can learn, reason, and produce work at graduate level across professional domains, what purpose remains for education?
This is not a hypothetical philosophical exercise. It is the question that every education system will confront as AI capabilities advance. The traditional answer – that education prepares individuals for employment – assumes a labour market that rewards the skills taught in schools and universities. But if machines can perform the cognitive tasks that have constituted professional work, this rationale appears increasingly hollow. We train people to do things that machines can already do, then express surprise when employers prefer the machines.
The uncomfortable possibility is that education's purpose must be relocated elsewhere: in human development as an end in itself, in the cultivation of judgment and discernment that machines do not possess, in the formation of citizens capable of navigating complex ethical landscapes. These have always been legitimate educational aims, but they have typically been secondary to economic preparation. The rise of autonomous AI may force a reordering.
Finland's phenomenon-based learning, India's emphasis on "social and emotional skills," and China's "Double Reduction" policy all gesture toward this recalibration. They recognise that the value of human learning cannot be reduced to test scores or job placement rates. What remains uncertain is whether these reforms can be implemented at scale within institutional structures designed for very different purposes.
There is a particular irony in this moment. The education system designed to produce factory workers – those who would operate the machines of industrial capitalism – may prove to be the first casualty of the technology that those workers eventually built. The digital revolution was created by people trained in the disciplines established by the Committee of Ten: mathematics, physics, engineering. Their education rewarded the precise, structured thinking that enables algorithmic development. And now that development threatens to render obsolete the very system that produced it.
This need not be a catastrophe. The Committee of Ten was responding to genuine needs: the coordination of standards, the efficient transmission of knowledge, the sorting of students into appropriate educational tracks. These problems have not disappeared, but their solutions must be reimagined for a world where information is abundant and machines can process it faster than any human.
The education systems now emerging – less centralised, more interdisciplinary, focused on understanding rather than memorisation – suggest possible futures. Whether they can be implemented widely, and whether they can do so quickly enough to outpace technological change, remains an open question. History offers little precedent for institutional transformation at this velocity.
What is clear is that the debate over AI in education has been too small. The question is not whether students will cheat on essays, or whether professors can detect machine-generated work. It is whether the architecture we inherited from 1892 can survive contact with machines that render its assumptions obsolete. The evidence suggests that for many countries, it cannot – and that reconstruction, not renovation, is the task ahead.
---
Bommarito, M. J., & Katz, D. M. (2024). GPT-4 passes the bar exam. PLoS ONE, 19(5), e0303882. https://doi.org/10.1371/journal.pone.0303882
Chen, L., & Lin, S. (2024). Examining China's "Double Reduction" Policy: Promises and challenges for balanced and quality development in compulsory education. ECNU Review of Education. https://doi.org/10.1177/20965311241265123
Government of India. (2020). National Education Policy 2020. Ministry of Education. https://www.education.gov.in/en/nep/about-nep
Kung, T. H., Cheatham, M., Medenilla, A., Sillos, C., De Leon, L., Elepaño, C., Madriaga, M., Aggabao, R., Diaz-Candido, G., Maningo, J., & Tseng, V. (2023). Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models. PLOS Digital Health, 2(2), e0000198. https://doi.org/10.1371/journal.pdig.0000198
Labaree, D. F. (2010). Someone has to fail: The zero-sum game of public schooling. Harvard University Press.
McLachlan, J., et al. (2024). The model student: GPT-4 performance on graduate biomedical science exams. Scientific Reports, 14, 6860. https://doi.org/10.1038/s41598-024-55568-7
Ministry of Education of the People's Republic of China. (2021, December 30). 'Double reduction' policy adds strength to China's education reform. http://english.scio.gov.cn/in-depth/2021-12/30/content_77960470.htm
Ministry of Education, India. (2020). National Education Policy 2020. https://www.education.gov.in/sites/upload_files/mhrd/files/NEP_Final_English_0.pdf
OECD. (2013). OECD skills outlook 2013: First results from the survey of adult skills. OECD Publishing.
Symeonidis, S., & Schwarz, J. (2016). Phenomenon-based teaching in practice. Nordic Studies in Education, 36(4), 317–329.
Terwiesch, C. (2023). Would ChatGPT get a Wharton MBA? A prediction based on its performance in the operations management course. Wharton School Working Paper.