The GerMedIQ Corpus

Take Home Messages

Problem: Clinical data in non-English languages is scarce $\Rightarrow$ Solution: data simulation with laypersons and synthetic data generation with LLMs
The GerMedIQ Corpus:
- 39 laypersons each answered 116 standardized anamnesis interview questions in German
- 18 LLMs produced 5 responses each to the same 116 questions
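
As a quick sanity check, the counts above can be multiplied out. The sketch below assumes that every participant answered every question and that every model returned all five responses per question. Under that assumption the human total matches the 4,524 question-response pairs stated in the abstract. The LLM total is an upper bound derived here and is not a figure reported on this page. All names in the snippet are illustrative.

```python
# Counts taken from the corpus description above.
N_PARTICIPANTS = 39        # laypersons
N_QUESTIONS = 116          # standardized anamnesis interview questions
N_MODELS = 18              # LLMs prompted with the same questions
RESPONSES_PER_MODEL = 5    # responses per model and question


def corpus_sizes():
    """Return (human question-response pairs, LLM responses).

    Assumes complete coverage: no skipped questions and no
    discarded model outputs (an assumption, not a reported fact).
    """
    human_pairs = N_PARTICIPANTS * N_QUESTIONS
    llm_responses = N_MODELS * RESPONSES_PER_MODEL * N_QUESTIONS
    return human_pairs, llm_responses


human_pairs, llm_responses = corpus_sizes()
print(f"human question-response pairs: {human_pairs}")
print(f"LLM responses (upper bound):   {llm_responses}")
```
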
Evaluation:
- Structural Evaluation: LLM responses are longer and more complex than human responses
- Semantic Evaluation: Gemma 3 produced the responses closest to the human responses
- Acceptability Study: Both human raters and LLM judges rated Mistral (124B)'s responses as most appropriate, surpassing human responses

Abstract

Due to strict privacy regulations, text corpora in non-English clinical contexts are scarce. Consequently, synthetic data generation using Large Language Models (LLMs) emerges as a promising strategy to address this data gap. To evaluate the ability of LLMs in generating synthetic data, we applied them to our novel German Medical Interview Questions Corpus (GerMedIQ), which consists of 4,524 unique, simulated question-response pairs in German. We augmented our corpus by prompting 18 different LLMs to generate responses to the same questions. Structural and semantic evaluations of the generated responses revealed that large-sized language models produced responses comparable to those provided by humans. Additionally, an LLM-as-a-judge study, combined with a human baseline experiment assessing response acceptability, demonstrated that human raters preferred the responses generated by Mistral (124B) over those produced by humans. Nonetheless, our findings indicate that using LLMs for data augmentation in non-English clinical contexts requires caution.

Cite the paper

Please use the following citation to cite our paper:

@InProceedings{hofenbitzer2025germediq,
  author           = {Hofenbitzer, Justin and Sch{\"o}ning, Sebastian and Belle, Sebastian and Lammert, Jacqueline and Modersohn, Luise and Boeker, Martin and Frassinelli, Diego},
  booktitle        = {Proceedings of the 63rd {Annual} {Meeting} of the {Association} for {Computational} {Linguistics} ({Volume} 4: {Student} {Research} {Workshop})},
  title            = {{GerMedIQ}: {A} {Resource} for {Simulated} and {Synthesized} {Anamnesis} {Interview} {Responses} in {German}},
  year             = {2025},
  address          = {Vienna, Austria},
  editor           = {Zhao, Jin and Wang, Mingyang and Liu, Zhu},
  month            = jul,
  pages            = {1064--1078},
  publisher        = {Association for Computational Linguistics},
  doi              = {10.18653/v1/2025.acl-srw.84},
  isbn             = {9798891762541},
  url              = {https://aclanthology.org/2025.acl-srw.84/},
}

Please use the following citation to cite the GerMedIQ Corpus (the resource):

@Misc{hofenbitzer2025jhofenbitzer,
  author           = {Hofenbitzer, Justin and Sch{\"o}ning, Sebastian and Belle, Sebastian and Lammert, Jacqueline and Modersohn, Luise and Boeker, Martin and Frassinelli, Diego},
  title            = {{Jhofenbitzer/GerMedIQ-Corpus: Official Github Repository of the GerMedIQ Corpus}},
  year             = {2025},
  copyright        = {Creative Commons Attribution 4.0 International},
  doi              = {10.5281/zenodo.16460622},
  publisher        = {Zenodo},
}

If you use the GerMedIQ Corpus, please cite both our paper and the resource itself!
