CONTEXT

A healthcare client needed to analyze free-text narratives from adverse event (AE) reports in a large Phase IV rare-disease trial. Standard MedDRA coding captures each event as a structured term, so subtle safety signals described only in investigator comments were often overlooked. This limited visibility into emerging risks and delayed proactive intervention.

RESOLUTION

Our team implemented a natural language processing (NLP) pipeline tailored for pharmacovigilance. We used spaCy for text preprocessing (tokenization, lemmatization, stopword removal) and Gensim’s LDA topic modeling to surface AE themes that the coded terms did not capture. To detect recurring rare events that had been coded inconsistently, we embedded the narratives with Sentence-BERT and grouped them with HDBSCAN clustering. The resulting topics and clusters were joined to the structured AE datasets and visualized as interactive safety signal graphs built with NetworkX and Plotly.
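
The idea behind the clustering step can be illustrated with a minimal sketch that uses only the Python standard library. It is not the production pipeline: bag-of-words vectors stand in for Sentence-BERT embeddings, a cosine-similarity threshold with single-linkage grouping stands in for HDBSCAN, and the reports, stopword list, function names, and threshold below are all hypothetical and chosen purely for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; spaCy ships a far larger one).
STOPWORDS = {"a", "an", "and", "in", "of", "the", "with", "after"}

# Invented (coded term, narrative) pairs, NOT real trial data.
REPORTS = [
    ("Paraesthesia", "Tingling and numbness in both hands"),
    ("Hypoaesthesia", "Numbness and tingling in hands and feet"),
    ("Neuropathy peripheral", "Persistent tingling, numbness of feet"),
    ("Nausea", "Mild nausea after morning dose"),
    ("Headache", "Headache resolved without treatment"),
]


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(vectors, threshold):
    """Group vectors whose similarity reaches the threshold (single linkage).

    Implemented as connected components via union-find; a simple stand-in
    for the density-based clustering that HDBSCAN performs.
    """
    parent = list(range(len(vectors)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(vectors)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def inconsistent_clusters(reports, threshold=0.4):
    """Return the coded terms of each cluster that mixes more than one term."""
    vectors = [Counter(tokenize(text)) for _, text in reports]
    flagged = []
    for members in cluster(vectors, threshold):
        terms = sorted({reports[i][0] for i in members})
        if len(terms) > 1:
            flagged.append(terms)
    return flagged


print(inconsistent_clusters(REPORTS))
```

On this toy data the three tingling/numbness narratives fall into one cluster even though each carries a different coded term, which is the kind of pattern the real pipeline was built to surface. In practice the threshold and minimum cluster size would need tuning against reviewed cases.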

RESULT

The NLP pipeline identified a previously undetected cluster of neurotoxic symptoms in a small patient subgroup. This early detection allowed for timely safety actions, streamlined pharmacovigilance operations, and demonstrated the value of NLP-based narrative analysis for post-marketing safety monitoring.