4 - My PhD
Objectives
My PhD thesis in clinical NLP focuses on the linguistic nature of clinical texts and how well state-of-the-art methods cope with them.
In particular, I'm interested in whether universal clinical sublanguages can be identified by grammatical, stylistic, terminological, or semantic features derived from clinical routine texts. Given that German clinical text is scarce due to privacy constraints, the question arises of how well Large Language Models (LLMs) can process, analyze, summarize, or generate it. I'd like to investigate whether explicit knowledge about clinical sublanguage features can improve the performance of LLMs in this regard.
Relevant Publications
GerMedIQ: A Resource for Simulated and Synthesized Anamnesis Interview Responses in German
We release the first fully accessible medical anamnesis interview corpus in German and compare LLM-generated synthetic responses with human-produced simulated responses to standardized anamnesis questions. We found that the responses from Mistral (124B) were rated as more acceptable than the human responses by both LLM judges and human raters.
If you want to know more:
- Take Home Messages: The GerMedIQ Corpus
- The paper: https://aclanthology.org/2025.acl-srw.84/