Craig Thomson
Title
Cited by
Year
A gold standard methodology for evaluating accuracy in data-to-text systems
C Thomson, E Reiter
arXiv preprint arXiv:2011.03992, 2020
Cited by: 43; Year: 2020
Underreporting of errors in NLG output, and what to do about it
E Van Miltenburg, MA Clinciu, O Dušek, D Gkatzia, S Inglis, L Leppänen, ...
arXiv preprint arXiv:2108.01182, 2021
Cited by: 27; Year: 2021
Missing information, unresponsive authors, experimental flaws: The impossibility of assessing the reproducibility of previous human evaluations in NLP
A Belz, C Thomson, E Reiter, G Abercrombie, JM Alonso-Moral, M Arvan, ...
arXiv preprint arXiv:2305.01633, 2023
Cited by: 24; Year: 2023
SportSett: Basketball - A robust and maintainable data-set for natural language generation
C Thomson, E Reiter, S Sripada
Proceedings of the Workshop on Intelligent Information Processing and …, 2020
Cited by: 21; Year: 2020
Generation challenges: Results of the accuracy evaluation shared task
C Thomson, E Reiter
arXiv preprint arXiv:2108.05644, 2021
Cited by: 14; Year: 2021
Shared task on evaluating accuracy
E Reiter, CA Thomson
Cited by: 11; Year: 2020
Evaluating factual accuracy in complex data-to-text
C Thomson, E Reiter, B Sundararajan
Computer Speech & Language 80, 101482, 2023
Cited by: 8; Year: 2023
GEMv2: Multilingual NLG benchmarking in a single line of code
S Gehrmann, A Bhattacharjee, A Mahendiran, A Wang, A Papangelis, ...
arXiv preprint arXiv:2206.11249, 2022
Cited by: 8; Year: 2022
Non-repeatable experiments and non-reproducible results: The reproducibility crisis in human evaluation in NLP
A Belz, C Thomson, E Reiter, S Mille
Findings of the Association for Computational Linguistics: ACL 2023, 3676-3687, 2023
Cited by: 6; Year: 2023
Barriers and enabling factors for error analysis in NLG research
E Van Miltenburg, M Clinciu, O Dušek, D Gkatzia, S Inglis, L Leppänen, ...
Northern European Journal of Language Technology, 2023
Cited by: 4; Year: 2023
Studying the impact of filling information gaps on the output quality of neural data-to-text
CA Thomson, Z Zhao, SG Sripada
Cited by: 4; Year: 2020
Comprehension driven document planning in natural language generation systems
C Thomson, E Reiter, S Sripada
Proceedings of The 11th International Natural Language Generation Conference, 2018
Cited by: 4; Year: 2018
Common Flaws in Running Human Evaluation Experiments in NLP
C Thomson, E Reiter, A Belz
Computational Linguistics, 1-10, 2024
Cited by: 1; Year: 2024
The accuracy evaluation shared task as a retrospective reproduction study
C Thomson, E Reiter
Proceedings of the 15th International Conference on Natural Language …, 2022
Cited by: 1; Year: 2022
Enhancing factualness and controllability of Data-to-Text Generation via data Views and constraints
C Thomson, C Rebuffel, E Reiter, L Soulier, S Sripada, P Gallinari
Proceedings of the 16th International Natural Language Generation Conference …, 2023
Year: 2023
The 2023 ReproNLP Shared Task on Reproducibility of Evaluations in NLP: Overview and Results
A Belz, C Thomson
Proceedings of the 3rd Workshop on Human Evaluation of NLP Systems, 35-48, 2023
Year: 2023