4 - My PhD
Objectives
My PhD thesis in clinical NLP analyzes clinical texts and how well state-of-the-art methods cope with them.
I focus in particular on:
- Synthetic clinical text generation and evaluation metrics
- Sublanguage derivation through linguistic, terminological, and stylistic features
- More suitable methods for classic biomedical NLP tasks such as named entity recognition, entity linking, and relation extraction, with an emphasis on non-English texts and SNOMED CT
Relevant Publications
GerMedIQ: A Resource for Simulated and Synthesized Anamnesis Interview Responses in German (ACL 2025)
We release the first fully accessible medical anamnesis interview corpus in German and compare LLM-generated synthetic responses with human-produced simulated responses to standardized anamnesis questions. We found that Gemma 3 (4B) produced the responses most similar to those of humans, and that both LLM judges and human raters rated the responses of Mistral (124B) as more acceptable than the human responses.
Don't miss the following resources:
- A detailed description of the findings: The GerMedIQ Corpus
- The paper: ACL Anthology
- The poster: Zenodo
- The corpus: GitHub & Zenodo