arXiv preprint - Artificial Artificial Artificial Intelligence: Crowd Workers Widely Use Large Language Models for Text Production Tasks

In this episode, we discuss "Artificial Artificial Artificial Intelligence: Crowd Workers Widely Use Large Language Models for Text Production Tasks" by Veniamin Veselovsky, Manoel Horta Ribeiro, and Robert West. The paper explores how large language models (LLMs) affect the reliability of human-generated data collected through crowdsourcing. The authors ran a case study on Amazon Mechanical Turk to determine how often crowd workers used LLMs when performing an abstract summarization task. Combining keystroke detection with synthetic text classification, they estimated that 33-46% of crowd workers used LLMs while completing the task, which suggests that data labeled as human may not be exclusively human. In response, the paper calls for new ways of ensuring that human data are truly human-generated.
