A recent clarification from the Caribbean Examinations Council (CXC) on the role of artificial intelligence in school-based assessments (SBAs) has brought a measure of relief to regional students, teachers, and parents who have grown increasingly anxious about the impact of AI detection tools on academic outcomes. CXC representative Dr Nicole Manning’s timely statement, which emphasized that AI originality reports are not meant to serve as the sole proof of academic misconduct, addresses growing concerns over inconsistent similarity scores, unfair penalties, and overreliance on flawed detection technology.
It is a widely accepted conclusion in educational technology research that AI detection tools cannot definitively confirm authorship. These systems operate solely on the basis of probability, statistical pattern matching, and predictive language modeling, meaning even fully original student work can be incorrectly flagged as AI-generated. This fundamental limitation is why education policy experts consistently argue that human oversight must remain the core of any credible assessment framework — a principle CXC has now formally acknowledged.
Despite this clarification, critical questions remain unanswered about the exact operational role of AI originality reports within the CXC SBA structure, especially as anecdotal reports of unfair penalties stemming from false AI flags continue to mount among students and educators across the region. Any tool that shapes assessment outcomes, even indirectly, demands a clear, consistently applied, and transparently communicated role within the system.
Long before generative AI entered mainstream education, the CXC SBA model was built on a foundation of robust human supervision. Teachers guide students through project development, monitor progress step-by-step, evaluate submissions, and participate in cross-institutional moderation processes designed to protect assessment fairness. The integration of AI detection tools has only added an extra layer of procedural responsibility to this existing framework.
If AI originality reports are intended to act primarily as a deterrent to misuse, a documentation tool, a transparency measure, or an early warning system for potential academic misconduct, their inclusion in the framework is reasonable. No regional examining body can afford to ignore the rise of generative AI or assume it will never be used improperly, as protecting academic integrity is non-negotiable for upholding the value of CXC qualifications. The challenge emerges when tools that are widely acknowledged to be imperfect are embedded into high-stakes assessment processes that shape student outcomes.
The core contradiction remains: if AI detection results are not definitive, why are numerical similarity scores still being used in high-stakes assessment contexts at all? When human interpretation becomes the final safeguard against false flags, the bulk of new responsibility shifts directly onto overstretched regional teaching workforces. Teachers are now expected to analyze AI originality reports, cross-reference submissions with students’ past work, compare drafts, evaluate contextual evidence of original work, and distinguish between statistical false flags and intentional misconduct. All of this comes on top of workloads that already include classroom instruction, administrative duties, and core SBA supervision.
In practice, CXC’s new AI policy has significantly expanded the interpretive labor required from teachers, fitting into a broader pattern in regional education where new procedural expectations are rolled out without corresponding increases in resourcing, adjusted workload allocations, or additional compensation. Teachers are not direct employees of CXC; they support the regional assessment system while fulfilling their core roles in individual schools. If the entire integrity of the assessment framework now depends on this extra layer of interpretive work, issues of workload sustainability and fair remuneration can no longer be treated as afterthoughts — they are core to successful implementation.
Beyond teacher workload, a pressing question remains: can consistent fairness be maintained across regional schools and territories that operate with wildly different levels of infrastructure and resourcing? Some well-resourced institutions boast strong technological infrastructure and dedicated time for teachers to conduct detailed reviews of flagged submissions, while many under-resourced schools operate under severe capacity constraints that leave little time for extra procedural work. Variations in available time and institutional support for teachers directly impact how thoroughly they can investigate AI flags, creating uneven application of the policy across the region. Fairness cannot be achieved when the rigor of review depends entirely on a school’s resource level, and any policy that relies heavily on human judgment must account for the uneven distribution of time, resources, and support across Caribbean education systems.
Another unaddressed gap is the lack of standardization for AI detection tools across the CXC system. Currently, different schools are permitted to use different AI originality checkers, and it is well documented that these tools produce wildly different similarity scores for the exact same student submission. If one tool flags a submission with a 12% similarity score and another flags the same work at 28%, there is no clear rule for which result takes precedence. Without system-wide standardization, consistent assessment outcomes are impossible to guarantee. If AI detection is to remain part of the SBA framework, systemic coordination rather than fragmented, school-by-school tool selection is essential. Standardization would also require coordinated support from regional ministries of education and CXC, so that access to approved tools does not depend on a school’s independent budget and implementation does not vary from one institution to the next.
This lack of standardized resourcing also raises concerns that AI integration could widen existing educational inequalities across the region. Access to reliable technology, stable high-speed internet, digital literacy training, and institutional resources is far from uniform across Caribbean schools. Better-resourced institutions are naturally positioned to navigate new AI-related requirements far more easily than under-resourced schools, and technology never operates neutrally within unequal systems. Without targeted safeguards, AI integration risks reinforcing pre-existing achievement gaps between more and less advantaged institutions.
There is also the risk of unintended harm to student writing development. If students internalize the message that polished, sophisticated academic work increases the risk of being flagged as AI-generated, they may begin to alter their writing unnecessarily: simplifying their language, avoiding complex syntactical structures, and abandoning formal academic tone to avoid suspicion. This would turn a policy designed to protect academic integrity into one that pushes students to prioritize avoiding false flags over demonstrating their actual understanding of course material.
At its core, this debate over AI detection in SBAs raises a much deeper question: are regional assessment systems structured appropriately for the age of generative AI? For decades, written assignments have served as the primary evidence of independent student thinking, but generative AI has blurred the once-clear lines between individual authorship, external assistance, and collaborative work.
Educational researchers have long advocated for alternative assessment models that prioritize authentic demonstration of understanding, including oral defenses, supervised in-person drafting, practical skill demonstrations, and real-time evaluation of mastery. These approaches existed long before the rise of generative AI, but they have gained new urgency as AI complicates traditional written assessment. The open question now is whether Caribbean assessment systems can adapt quickly enough to meet this new context.
If CXC continues to center AI detection despite its well-documented limitations, the assessment system will rely on fundamentally unreliable tools. If it shifts fully to human interpretation as the primary safeguard, fairness becomes dependent on inconsistent institutional capacity and teacher workload. Neither path is simple, and balancing competing priorities remains the central challenge for the council. Academic integrity must be protected, and misuse of AI must be addressed — but honest, original student work should not be penalized by systems that policymakers themselves admit are fallible. Ultimately, the question that remains unanswered is whether Caribbean education systems are prepared to meet the new demands of authentic assessment, authentic learning, and authentic authorship at a moment when the very nature of student writing is being redefined.
This commentary is contributed by Dr Zhane Bridgeman-Maxwell, a Barbados-based science educator, researcher, and education reform advocate focused on redesigning outdated learning systems through policy change and pedagogical innovation. Her work centers on amplifying the voices of students, teachers, and parents as she reimagines the purpose and structure of regional schooling.
