In our previous articles, "Why Toxicology Cannot Afford AI Hype" and "AI in Safety Assessment: Technological Opportunities Paired with Scientific Responsibility," we established a clear position. Toxicology does not reward speed without certainty. Scientific conclusions in this field extend far beyond reports. They influence product development, regulatory decisions, and ultimately public health.
That position still stands. What has changed is not the need for caution, but the practical understanding of where artificial intelligence fits.
The discussion has moved forward. The real question is no longer whether AI has a role in toxicology. It is far more specific.
What does AI actually do inside toxicology workflows today, and where does it stop being reliable?
AI in toxicology is not one technology
One of the most persistent misconceptions is treating AI as a single capability. In reality, toxicology workflows interact with three distinct categories of AI.
Predictive modeling includes approaches such as QSAR, machine learning, and deep learning used to estimate toxicity or biological activity from chemical structure and experimental data. The OECD provides an overview of QSAR principles and regulatory expectations through the OECD QSAR Toolbox.
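To make the predictive-modeling category concrete, the sketch below shows a read-across-style estimate: a query compound's endpoint is predicted from its nearest neighbours in a simple descriptor space. All compound IDs, descriptor values, and endpoint values are illustrative, not real chemistry, and a production QSAR model would use validated descriptors and a defined applicability domain.

```python
# Hypothetical read-across sketch: estimate an endpoint for a query
# compound from its nearest neighbours in descriptor space.
# Descriptors and LD50 values below are invented for illustration.
import math

# Each record: (compound_id, descriptor_vector, observed_LD50_mg_per_kg)
training_set = [
    ("cmpd_A", (2.1, 0.30, 110.0), 320.0),
    ("cmpd_B", (2.4, 0.28, 118.0), 290.0),
    ("cmpd_C", (0.5, 0.90, 60.0), 1500.0),
    ("cmpd_D", (2.0, 0.35, 105.0), 350.0),
]

def euclidean(a, b):
    """Distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def predict_knn(query_descriptors, k=3):
    """Average the endpoint of the k most similar training compounds."""
    ranked = sorted(training_set,
                    key=lambda rec: euclidean(rec[1], query_descriptors))
    neighbours = ranked[:k]
    return sum(rec[2] for rec in neighbours) / len(neighbours)

estimate = predict_knn((2.2, 0.31, 112.0))
```

The point of the sketch is the workflow shape, not the model: structure-derived descriptors in, an endpoint estimate out, with the estimate only as good as the curated training data behind it.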
Natural language processing focuses on extracting structured information from unstructured sources such as scientific literature. This includes endpoints, study design, and experimental conditions.
Generative AI and large language models introduce a different layer. They generate summaries, draft reports, and increasingly orchestrate multi-step workflows by connecting search, retrieval, and extraction systems.
Each of these technologies contributes to the workflow. Each also fails in a different way. In toxicology, failure modes matter as much as performance.
Where AI fits in toxicology workflows
The most useful way to understand AI is to map it directly onto the safety assessment process.
Finding information
Toxicology depends on comprehensive and unbiased evidence gathering. AI can support this step by expanding search queries, identifying synonyms, and improving retrieval consistency across large datasets.
The European Food Safety Authority has explored this through its AI@EFSA initiatives, particularly in evidence extraction and integration.
However, AI must remain grounded in retrieved documents. Systems that generate answers without traceable sources introduce unacceptable risk. Hallucinated evidence is not a minor issue in toxicology. It is a fundamental failure of the workflow.
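The grounding requirement can be expressed as a simple contract: no claim is accepted without a traceable source identifier. The sketch below assumes hypothetical field names (`text`, `source_id`) and is a minimal illustration of that gate, not a real retrieval system.

```python
# Minimal sketch of source-grounded answering: an extracted claim is
# only accepted when it carries a traceable document identifier.
# Field names and document IDs are hypothetical.

def grounded_answer(claims):
    """Reject any result set containing a claim without a source."""
    for claim in claims:
        if not claim.get("source_id"):
            raise ValueError(f"Ungrounded claim rejected: {claim['text']!r}")
    return claims

ok = grounded_answer([
    {"text": "NOAEL 50 mg/kg bw/day (90-day rat study)",
     "source_id": "doc_0042"},
])
```

Enforcing this at the system boundary, rather than trusting the model's prose, is what turns "the model usually cites sources" into an auditable guarantee.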
Sorting and structuring evidence
This is where AI is already demonstrating measurable value.
Toxicology data is fragmented across formats, databases, and publications. AI can standardize this information by populating structured templates with data and removing duplication.
EFSA’s work on AI to support systematic reviews highlights this potential, particularly in automating parts of the evidence-handling process.
The key insight is that performance improves significantly when models are constrained to curated datasets rather than open-ended sources.
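The template-and-deduplication step can be sketched with a small structured record type. The field names below (`substance`, `endpoint`, `value`, `unit`, `source_id`) are illustrative; a real template would follow whatever reporting schema the organisation has defined.

```python
# Sketch of populating a structured evidence template from fragmented
# records and removing duplicates. Field names are illustrative.
from dataclasses import dataclass

@dataclass(frozen=True)
class StudyRecord:
    substance: str
    endpoint: str
    value: float
    unit: str
    source_id: str

raw = [
    StudyRecord("substance_X", "NOAEL", 50.0, "mg/kg bw/day", "doc_0042"),
    StudyRecord("substance_X", "NOAEL", 50.0, "mg/kg bw/day", "doc_0042"),
    StudyRecord("substance_X", "LOAEL", 150.0, "mg/kg bw/day", "doc_0042"),
]

# frozen dataclasses are hashable, so deduplication is a single set pass
deduplicated = sorted(set(raw), key=lambda r: r.endpoint)
```

Forcing extraction into a fixed schema like this is also what makes the "constrained to curated datasets" observation actionable: the model fills fields, and anything that does not fit the template is surfaced for review rather than silently absorbed.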
Analysis and relevance assessment
This step remains the boundary of reliable AI use.
Relevance in toxicology is not a single-variable decision. It requires evaluating study quality, exposure conditions, biological plausibility, and applicability to real-world scenarios.
The European Medicines Agency and the US Food and Drug Administration emphasize this in their joint principles for AI use in medicines regulation, which highlight transparency, explainability, and human oversight.
AI can support pattern recognition and highlight relationships. It cannot yet replace expert judgment in determining relevance.
Summarization of scientific evidence
Large language models perform well in summarization tasks, but toxicology imposes stricter requirements.
A valid toxicology summary must preserve numerical values, experimental conditions, and limitations while maintaining full traceability to source material.
Research on large language models in systematic reviews shows both potential and risk, particularly around consistency and reproducibility.
This reinforces a critical principle. AI-generated summaries must be treated as drafts and verified against sources.
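One concrete verification check follows directly from the traceability requirement: every numerical value in a summary must appear in the source it cites. The sketch below implements only that single check, with invented example text; a real pipeline would also verify units, conditions, and stated limitations.

```python
# Sketch of one verification check for an AI-generated summary:
# every number in the summary must appear verbatim in the source text.
# Example sentences are invented for illustration.
import re

def unverified_numbers(summary, source):
    """Return numbers present in the summary but absent from the source."""
    def nums(text):
        return set(re.findall(r"\d+(?:\.\d+)?", text))
    return nums(summary) - nums(source)

source = "A 90-day study in rats identified a NOAEL of 50 mg/kg bw/day."
good = "NOAEL: 50 mg/kg bw/day (90-day rat study)."
bad = "NOAEL: 500 mg/kg bw/day (90-day rat study)."
```

Here `unverified_numbers(good, source)` is empty, while the transcription error in `bad` is flagged as the unmatched value. Cheap mechanical checks like this catch exactly the class of error a fluent summary hides best.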
Endpoint selection and risk assessment
The most critical decisions in toxicology remain human-driven.
Selecting the most sensitive endpoint requires integrating multiple lines of evidence, understanding biological mechanisms, and interpreting uncertainty.
The US Environmental Protection Agency highlights the importance of expert judgment in risk assessment frameworks.
AI can assist by organizing inputs and highlighting options. It cannot assume responsibility for the final decision.
What regulators are actually doing
Regulatory bodies are not approaching AI as a replacement for toxicologists. They are integrating it as a controlled support layer.
The US Food and Drug Administration has established the AI4TOX program within its National Center for Toxicological Research, focusing on predictive modeling, document analysis, and digital pathology.
Similarly, EFSA’s AI strategy emphasizes human-centric adoption to improve evidence management rather than to automate scientific judgment.
This alignment across regulators is significant. It demonstrates that AI adoption is being shaped by governance and scientific rigor rather than technological enthusiasm.
Industry is converging with regulators
Industry discussions, particularly in the chemicals sector, reflect the same priorities.
The European Center for Ecotoxicology and Toxicology of Chemicals has emphasized data quality, governance, and trust as prerequisites for AI integration in safety assessment.
At the same time, broader industry analysis highlights that AI delivers value only when embedded into structured workflows rather than deployed as standalone tools.
A recent McKinsey analysis describes this as a shift from experimentation to integration, in which AI becomes part of core operational processes.
This mirrors what is happening in toxicology. AI is not transforming the science itself. It is transforming how the work is executed.
The real constraints that define progress
Across regulators, industry, and research, the same constraints appear consistently.
Data quality and provenance
AI systems depend on curated and well-described datasets. Without this foundation, scaling AI leads to scaling errors.
The FAIR data principles provide a widely accepted framework for improving data quality and interoperability.
Explainability and applicability domain
Regulatory acceptance depends on understanding how a model works and when it should not be trusted.
The OECD guidance on QSAR validation emphasizes applicability domain and transparency as core requirements.
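A minimal form of the applicability-domain idea is a descriptor-range check: a prediction is flagged when the query compound falls outside the space spanned by the training set. The descriptor values below are illustrative, and real applicability-domain methods are considerably more sophisticated than a min/max box.

```python
# Sketch of a simple applicability-domain check: flag any query compound
# whose descriptors fall outside the range spanned by the training set.
# Descriptor values are invented for illustration.

training_descriptors = [
    (2.1, 0.30, 110.0),
    (2.4, 0.28, 118.0),
    (0.5, 0.90, 60.0),
]

def in_applicability_domain(query):
    """True only if every descriptor lies within the training min/max."""
    for i, value in enumerate(query):
        column = [row[i] for row in training_descriptors]
        if not (min(column) <= value <= max(column)):
            return False
    return True
```

The output of such a check is not a better prediction; it is an honest "this model should not be trusted here" signal, which is precisely what regulatory transparency asks for.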
Governance over automation
The most consistent message across all stakeholders is clear.
AI should support decisions, not replace accountability.
Overstating automation capabilities remains one of the fastest ways to lose regulatory and scientific trust.
What good looks like in practice
A credible AI-enabled toxicology workflow follows a consistent structure.
It begins with curated and governed datasets.
It constrains AI systems to those datasets and defines clear output structures.
It ensures all outputs are traceable, reproducible, and auditable.
It maintains human oversight at every decision point.
This model aligns with both regulatory expectations and emerging industry best practices.
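The workflow shape described above can be sketched as a pipeline in which every step writes an audit entry and no conclusion is finalised without a named human reviewer. Step names, payloads, and the reviewer identifier are all hypothetical.

```python
# Sketch of an auditable, human-gated workflow: each step is logged,
# and finalisation requires expert sign-off. Names are illustrative.

audit_log = []

def run_step(name, fn, payload):
    """Execute one workflow step and record it for auditability."""
    result = fn(payload)
    audit_log.append({"step": name, "input": payload, "output": result})
    return result

def finalise(conclusion, reviewed_by=None):
    """Human oversight gate: refuse to finalise without a reviewer."""
    if not reviewed_by:
        raise PermissionError("Conclusion requires expert sign-off")
    audit_log.append({"step": "sign_off", "reviewer": reviewed_by})
    return conclusion

evidence = run_step("retrieve", lambda substance: ["doc_0042"],
                    "substance_X")
conclusion = finalise("draft conclusion", reviewed_by="toxicologist_01")
```

The design choice is that traceability and oversight are properties of the pipeline itself, not behaviours requested from the model.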
A continuation of the same principle
This blog builds directly on the argument established in our previous pieces.
AI does not change the scientific responsibility of toxicology. It reinforces it.
When applied correctly, AI expands the evidence base, improves consistency, and reduces manual burden.
When applied incorrectly, it introduces opacity, bias, and risk.
The difference is not in the technology itself. It lies in how it is governed and integrated.
Final perspective
Toxicology has never been about speed alone. It has always been about being right under conditions of uncertainty.
AI does not remove that requirement.
It allows teams to reach well-supported conclusions more efficiently, provided transparency, traceability, and expert judgment remain intact.
That is what AI actually means in toxicology workflows today.
Talk to One of Our Experts
Get in touch today to find out how Evalueserve can help you improve your processes, making you better, faster, and more efficient.

