A laboratory professional operating an automated analyzer. Photo: Tima Miroshnichenko via Pexels

The Best AI Model on Paper Doesn't Always Make It to the Laboratory. What's Missing in Between?

A new EFLM framework identifies exactly where the gap between publication and deployment lies, and what laboratory professionals should be asking before trusting any AI diagnostic tool.

Author: Dr. Abhinaya. K
Edited by: M Subha Maheswari

Key Takeaways

  • Only about 25 percent of laboratories in Europe report an active AI project. Most published AI models for laboratory medicine remain research prototypes.

  • In January 2026, EFLM's Committee on Digitalisation and AI published a six-item checklist extension identifying why laboratory AI models fail to reach clinical deployment.

  • A model can perform well in its home laboratory and still fail on a different platform, a different population, or a different unit of measurement.

  • Indian laboratories running multiple analytical platforms under a single NABL/ISO 15189:2022 accreditation are a real-world example of exactly this challenge.

  • Across the literature, the consensus is augmentation, not replacement: AI is expected to support laboratory professionals, not substitute for their clinical judgment.

Published AI studies in laboratory medicine frequently report excellent diagnostic performance, yet only about one in four European laboratories currently reports an active AI project. Most published models never progress beyond the research stage, highlighting a persistent gap between publication and routine clinical implementation.¹

A 2026 position paper from the European Federation of Clinical Chemistry and Laboratory Medicine (EFLM) examines that gap and outlines what laboratory professionals should evaluate before trusting an AI tool.¹

Why Is AI Adoption Still Limited in Clinical Laboratories?

The reasons most commonly cited in the literature are familiar: implementation cost, unclear liability when an algorithm contributes to a wrong result, and a workforce that has not yet been trained to critically evaluate AI claims.² But laboratory medicine also has a structural problem that goes deeper than unfamiliarity.

A laboratory result is not a free-standing number. It is the end product of a chain that runs from sample collection through analytical calibration to clinical interpretation. A model trained without accounting for that chain tends to break when the chain gets complicated.

General-purpose medical AI checklists may not be equipped to catch this.

What Is the EFLM AI Checklist for Laboratory Medicine?

In January 2026, EFLM’s Committee on Digitalisation and Artificial Intelligence (C-AI) published a six-item extension to ChAMAI, the existing Checklist for the Assessment of Medical AI by Cabitza and Campagner.¹ The extension is built specifically to catch what general frameworks may miss when applied to laboratory data.

These six items form a clear diagnosis of why a model that performed well in its original paper often cannot survive the move to a second laboratory.

Why Analyte Specification Matters for AI Models

The same analyte reported on two different platforms, or in two different unit systems, is not necessarily the same value. For example, HbA1c is reported as a percentage under NGSP (National Glycohemoglobin Standardization Program) and in mmol/mol under IFCC (International Federation of Clinical Chemistry and Laboratory Medicine). The two scales are linked by a linear equation with a non-zero offset, so one cannot be obtained from the other by simple scaling.³ A model trained on one platform’s output has no built-in awareness that it is being handed another platform’s numbers.
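To make the unit problem concrete, here is a minimal Python sketch of the IFCC-NGSP master equation that is widely used to convert HbA1c between the two scales. The function names are illustrative and do not come from the EFLM paper. The coefficients are the commonly published master-equation values, and any real conversion should follow the laboratory's own reporting rules.

```python
def ifcc_to_ngsp(ifcc_mmol_mol: float) -> float:
    """Convert HbA1c from IFCC units (mmol/mol) to NGSP units (%)."""
    return 0.09148 * ifcc_mmol_mol + 2.152


def ngsp_to_ifcc(ngsp_percent: float) -> float:
    """Algebraic inverse of ifcc_to_ngsp: NGSP (%) back to IFCC (mmol/mol)."""
    return (ngsp_percent - 2.152) / 0.09148


# 48 mmol/mol corresponds to roughly 6.5 % on the NGSP scale.
print(round(ifcc_to_ngsp(48.0), 1))
```

A model that expects percentages and silently receives 48 instead of 6.5 will not raise an error. It will simply produce a wrong answer, which is why the unit has to travel with the value.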

Why Clinical Context Matters in Laboratory AI

The brain-to-brain loop in laboratory medicine, showing how raw test data is shaped by both metadata and peridata

The checklist distinguishes between metadata (how the result was generated) and peridata (the clinical context required to interpret it).¹ A fasting glucose and a postprandial one carry different clinical meaning. A troponin from a point-of-care device in the emergency department is not directly comparable to a high-sensitivity troponin from the central laboratory. Models trained on bare numbers inherit none of this context.
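One way to picture the distinction is a record that carries both kinds of information alongside the value. The sketch below is only an illustration: the class names, field names, and the strict `comparable` rule are assumptions made for this example, not structures defined by the EFLM checklist.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Metadata:
    """How the result was generated (hypothetical fields)."""
    platform: str
    method: str
    unit: str


@dataclass
class Peridata:
    """Clinical context needed to interpret the result (hypothetical fields)."""
    fasting: Optional[bool] = None
    care_setting: Optional[str] = None


@dataclass
class LabResult:
    analyte: str
    value: float
    metadata: Metadata
    peridata: Peridata = field(default_factory=Peridata)


def comparable(a: LabResult, b: LabResult) -> bool:
    """Deliberately strict rule: analyte, metadata, and peridata must all match."""
    return (a.analyte, a.metadata, a.peridata) == (b.analyte, b.metadata, b.peridata)


meta = Metadata(platform="analyzer-A", method="hexokinase", unit="mg/dL")
fasting = LabResult("glucose", 110.0, meta, Peridata(fasting=True))
postprandial = LabResult("glucose", 110.0, meta, Peridata(fasting=False))

# Same number, same instrument, different clinical meaning.
print(comparable(fasting, postprandial))
```

A real pipeline would likely use a softer rule than exact equality, but the point stands: two identical numbers are not interchangeable once their context differs.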

Why Analyte Standardization Is Essential for Laboratory AI

Well-standardized analytes such as glucose, creatinine, and cholesterol are relatively safe for multi-site model training. Lipoprotein(a), by contrast, is causally linked to cardiovascular risk but still lacks international standardization and is reported in two non-interconvertible unit systems.⁴ If the underlying test does not generalize cleanly across laboratories, no model trained on it will either.

Why External Validation Matters for AI Models

External validation is frequently reported but rarely interrogated. The checklist asks whether the validation dataset is semantically similar (same analyte codes, same units), procedurally similar (same instruments, same pre-analytical handling), and comparable in size and structure.¹ A validation dataset that is too similar to the training set is not a real test. One that differs for uncontrolled reasons will unfairly penalize a sound model. Neither produces actionable evidence of generalizability.
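The semantic part of that question can be checked mechanically before any performance metric is computed. The following sketch assumes each dataset can be summarized as a mapping from analyte code to reporting unit. The function name and the codes are hypothetical placeholders, and procedural similarity (instruments, pre-analytical handling) would need a separate check.

```python
def semantic_mismatches(train_units: dict[str, str],
                        valid_units: dict[str, str]) -> dict[str, str]:
    """Return analyte codes that are missing or use a different unit in validation."""
    problems = {}
    for code, unit in train_units.items():
        if code not in valid_units:
            problems[code] = "missing from validation set"
        elif valid_units[code] != unit:
            problems[code] = f"unit {valid_units[code]!r} != training unit {unit!r}"
    return problems


# Placeholder analyte codes and units, for illustration only.
train = {"HBA1C": "%", "GLU": "mg/dL", "CREA": "mg/dL"}
valid = {"HBA1C": "mmol/mol", "GLU": "mg/dL"}

for code, issue in semantic_mismatches(train, valid).items():
    print(code, "->", issue)
```

An empty result does not prove the datasets are comparable. It only rules out the most basic reason a sound model might appear to fail.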

The remaining two checklist items cover analytical and biological variability (the unavoidable noise in repeated laboratory measurements) and FAIR (Findable, Accessible, Interoperable, and Reusable) data sharing, which makes preprocessing steps and feature transformations available for audit.¹ Together, they make the other four items verifiable by anyone outside the original research group.

Why AI Validation Is More Challenging in Indian Clinical Laboratories

Laboratories accredited under NABL against ISO 15189:2022 regularly run chemiluminescent immunoassay platforms alongside colorimetric methods and calculated parameters on the same patient requisition. That platform heterogeneity reflects the reality of a diverse laboratory network. The EFLM checklist was designed with this type of analytical heterogeneity in mind. In India, however, such variation often exists within individually accredited facilities, not just between institutions.¹

A 2026 expert panel from the Asia-Pacific Federation for Clinical Biochemistry and Laboratory Medicine, moderated by a NABL technical assessor for ISO 15189:2022, identified two additional layers of complexity specific to India. EHR (Electronic Health Record) adoption remains limited, which constrains the peridata that any AI model could access. And the equity stakes are higher: over-relying on automated interpretation in a system built around universal healthcare carries different risks than the same step taken in a well-resourced single-payer network.

The IFCC’s ongoing harmonization effort for TSH immunoassays is a concrete illustration of how far even a single, extremely common test still is from platform-independent comparability.³ Laboratory AI will inherit that gap unless it is explicitly accounted for.

Can AI Replace Laboratory Professionals?

Two laboratory medicine professionals reviewing AI-assisted diagnostic output.

The consistent answer across the published literature is no. Instead, the professional’s role is expected to shift from operating instruments to supervising algorithmic performance: validating models, monitoring for drift, and contributing clinical context that a trained model alone cannot supply.⁵

For laboratory professionals evaluating a vendor’s AI diagnostic tool, the EFLM checklist offers a practical filter. The right questions to ask are not limited to sensitivity and specificity. They include which platform and reagent system the model was trained on, whether the analyte is genuinely standardized across sites, and how the external validation dataset actually compared with the training data.

What Laboratory Professionals Should Know About AI

The real question for AI in laboratory medicine is not whether a model can read a laboratory value. It is whether it can read a given laboratory’s value, on that laboratory’s platform, for that laboratory’s patient population, and have that claim independently verified. The EFLM checklist is the most rigorous attempt so far to make that question answerable, not just askable.

References

  1. Carobene A, Cadamuro J, Frans G, et al., on behalf of the EFLM Committee on Digitalisation and Artificial Intelligence. EFLM checklist for the assessment of AI/ML studies in laboratory medicine: enhancing general medical AI frameworks for laboratory-specific applications. Clin Chem Lab Med. 2026;64(1):27-40. doi:10.1515/cclm-2025-0841 https://pubmed.ncbi.nlm.nih.gov/40966119/

  2. Dodig S, Čepelak I, Dodig M. Are we ready to integrate advanced artificial intelligence models in clinical laboratory? Biochemia Medica. 2025;35(1):010501. doi:10.11613/BM.2025.010501 https://pubmed.ncbi.nlm.nih.gov/39703759/

  3. International Federation of Clinical Chemistry and Laboratory Medicine, Committee for Standardization of Thyroid Function Tests. Harmonization of thyroid-stimulating hormone immunoassays, ongoing multicenter effort. Accessed June 2026.

  4. Spies NC, Farnsworth CW, Wheeler S, McCudden CR. Validating, implementing, and monitoring machine learning solutions in the clinical laboratory safely and effectively. Clinical Chemistry. 2024;70(11):1334-1343. doi:10.1093/clinchem/hvae126 https://academic.oup.com/clinchem/article/70/11/1334/7754656

  5. Cabitza F, Campagner A. The need to separate the wheat from the chaff in medical informatics: introducing a comprehensive checklist for the (self)-assessment of medical AI studies. Int J Med Inf. 2021;153:104510. https://pubmed.ncbi.nlm.nih.gov/34108105/

  6. NABL India. Introduction to NABL accreditation and ISO 15189:2022 medical laboratory standards. Accessed June 2026. https://nabl-india.org/accreditation/medical-testing-laboratories/
