{"id":135197,"date":"2025-09-11T12:35:06","date_gmt":"2025-09-11T12:35:06","guid":{"rendered":"https:\/\/www.newsbeep.com\/au\/135197\/"},"modified":"2025-09-11T12:35:06","modified_gmt":"2025-09-11T12:35:06","slug":"how-thousands-of-overworked-underpaid-humans-train-googles-ai-to-seem-smart-google","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/au\/135197\/","title":{"rendered":"How thousands of \u2018overworked, underpaid\u2019 humans train Google\u2019s AI to seem smart | Google"},"content":{"rendered":"<p class=\"dcr-130mj7b\">In the spring of 2024, when Rachael Sawyer, a technical writer from Texas, received a LinkedIn message from a recruiter hiring for the vaguely titled role of writing analyst, she assumed it would be similar to her previous content-creation gigs. On her first day a week later, however, those expectations were upended. Instead of writing words herself, Sawyer\u2019s job was to rate and moderate the content created by artificial intelligence.<\/p>\n<p class=\"dcr-130mj7b\">The job initially involved a mix of parsing through meeting notes and chats summarized by Google\u2019s Gemini, and, in some cases, reviewing short films made by the AI.<\/p>\n<p class=\"dcr-130mj7b\">On occasion, she was asked to deal with extreme content, flagging violent and sexually explicit material generated by Gemini for removal, mostly text. Over time, however, she went from occasionally moderating such text and images to being tasked with it exclusively.<\/p>\n<p class=\"dcr-130mj7b\">\u201cI was shocked that my job involved working with such distressing content,\u201d said Sawyer, who has been working as a \u201cgeneralist rater\u201d for Google\u2019s AI products since March 2024.
\u201cNot only because I was given no warning and never asked to sign any consent forms during onboarding, but because neither the job title nor the description ever mentioned content moderation.\u201d<\/p>\n<p class=\"dcr-130mj7b\">The pressure to complete dozens of these tasks every day, each within 10 minutes, has led Sawyer into spirals of anxiety and panic attacks, she says \u2013 without mental health support from her employer.<\/p>\n<p class=\"dcr-130mj7b\">Sawyer is one of thousands of AI workers contracted by Google through Japanese conglomerate Hitachi\u2019s GlobalLogic to rate and moderate the output of Google\u2019s AI products, including its flagship chatbot Gemini, launched early last year, and its summaries of search results, AI Overviews. The Guardian spoke to 10 current and former employees from the firm. Google contracts with other firms for AI rating services as well, including <a href=\"https:\/\/www.bloomberg.com\/news\/articles\/2023-07-12\/google-s-ai-chatbot-is-trained-by-humans-who-say-they-re-overworked-underpaid-and-frustrated?srnd=technology-vp&amp;sref=YfHlo0rL\" data-link-name=\"in body link\" rel=\"nofollow noopener\" target=\"_blank\">Accenture and, previously, Appen<\/a>.<\/p>\n<p class=\"dcr-130mj7b\">Google has clawed its way back into the AI race in the past year with a host of product releases to rival OpenAI\u2019s ChatGPT. Google\u2019s most advanced reasoning model, Gemini 2.5 Pro, is touted as better than OpenAI\u2019s o3, according to <a href=\"https:\/\/lmarena.ai\/leaderboard\" data-link-name=\"in body link\" rel=\"nofollow noopener\" target=\"_blank\">LMArena<\/a>, a leaderboard that tracks the performance of models. Each new model release comes with the promise of higher accuracy, which means that for each version, these AI raters are working hard to check whether the model\u2019s responses are safe for the user.
Thousands of humans lend their intelligence to teach chatbots the right responses across domains as varied as medicine, architecture and astrophysics, correcting their mistakes and steering them away from harmful outputs.<\/p>\n<p class=\"dcr-130mj7b\">A great deal of attention has been paid to the workers who label the data that is used to train artificial intelligence. There is, however, another corps of workers like Sawyer working day and night to moderate the output of AI, ensuring that chatbots\u2019 billions of users see only safe and appropriate responses.<\/p>\n<p class=\"dcr-130mj7b\">AI models are trained on vast swathes of data from every corner of the internet. Workers such as Sawyer sit in a middle layer of the global AI supply chain \u2013 paid more than data annotators in <a href=\"https:\/\/time.com\/6247678\/openai-chatgpt-kenya-workers\/\" data-link-name=\"in body link\" rel=\"nofollow noopener\" target=\"_blank\">Nairobi<\/a> or <a href=\"https:\/\/equidem.org\/reports\/scroll-click-suffer-the-hidden-human-cost-of-content-moderation-and-data-labelling\/\" data-link-name=\"in body link\" rel=\"nofollow noopener\" target=\"_blank\">Bogota<\/a>, whose work mostly involves labelling data for AI models or self-driving cars, but far below the engineers in Mountain View who design these models.<\/p>\n<p class=\"dcr-130mj7b\">Despite their significant contribution to these AI models, which might hallucinate unchecked if not for these quality-control editors, these workers feel hidden.<\/p>\n<p class=\"dcr-130mj7b\">\u201cAI isn\u2019t magic; it\u2019s a pyramid scheme of human labor,\u201d said Adio Dinika, a researcher at the Distributed AI Research Institute based in Bremen, Germany. \u201cThese raters are the middle rung: invisible, essential and expendable.\u201d<\/p>\n<p class=\"dcr-130mj7b\">Google said in a statement: \u201cQuality raters are employed by our suppliers and are temporarily assigned to provide external feedback on our products.
Their ratings are one of many aggregated data points that help us measure how well our systems are working, but do not directly impact our algorithms or models.\u201d GlobalLogic declined to comment for this story.<\/p>\n<p>AI raters: the shadow workforce<\/p>\n<p class=\"dcr-130mj7b\">Google, like other tech companies, hires data workers through a web of contractors and sub-contractors. One of the main contractors for Google\u2019s AI raters is GlobalLogic \u2013 where these raters are split into two broad categories: generalist raters and super raters. Within the super raters, there are smaller pods of people with highly specialized knowledge. Most workers hired initially for the roles were teachers. Others included writers, people with master\u2019s degrees in fine arts and some with very specific expertise \u2013 for instance, a PhD in physics \u2013 workers said.<\/p>\n<p>A user tests Google Gemini at the MWC25 tech show in Barcelona, Spain, in March 2025. Photograph: Bloomberg\/Getty Images<\/p>\n<p class=\"dcr-130mj7b\">GlobalLogic started this work for the tech giant in 2023, when it hired 25 super raters, according to three of the interviewed workers. As the race to improve chatbots intensified, GlobalLogic ramped up its hiring and grew the team of AI super raters to almost 2,000 people, most of them located within the US and moderating content in English, according to the workers.<\/p>\n<p class=\"dcr-130mj7b\">AI raters at GlobalLogic are paid more than their data-labeling counterparts in Africa and South America, with wages starting at $16 an hour for generalist raters and $21 an hour for super raters, according to workers.
Some are simply thankful to have a gig as the US job market sours, but others say that trying to make Google\u2019s AI products better has come at a personal cost.<\/p>\n<p class=\"dcr-130mj7b\">\u201cThey are people with expertise who are doing a lot of great writing work, who are being paid below what they\u2019re worth to make an AI model that, in my opinion, the world doesn\u2019t need,\u201d said a rater of their highly educated colleagues, requesting anonymity for fear of professional reprisal.<\/p>\n<p class=\"dcr-130mj7b\">Ten of Google\u2019s AI trainers the Guardian spoke to said they have grown disillusioned with their jobs because they work in silos, face tighter and tighter deadlines, and feel they are putting out a product that\u2019s not safe for users.<\/p>\n<p class=\"dcr-130mj7b\">One rater who joined GlobalLogic early last year said she enjoyed understanding the AI pipeline by working on Gemini 1.0, 2.0, and now 2.5 and helping it give \u201ca better answer that sounds more human\u201d. Six months in, though, tighter deadlines kicked in. Her timer of 30 minutes for each task shrank to 15 \u2013 which meant reading, fact-checking and rating approximately 500 words per response, sometimes more. The tightening constraints made her question the quality of the work and, by extension, the reliability of the AI.
In May 2023, a contract worker for Appen submitted a letter to the US Congress warning that the pace imposed on him and others would make Google Bard, Gemini\u2019s predecessor, <a href=\"https:\/\/www.bloomberg.com\/news\/articles\/2023-07-12\/google-s-ai-chatbot-is-trained-by-humans-who-say-they-re-overworked-underpaid-and-frustrated?srnd=technology-vp&amp;sref=YfHlo0rL\" data-link-name=\"in body link\" rel=\"nofollow noopener\" target=\"_blank\">a \u201cfaulty\u201d and \u201cdangerous\u201d product<\/a>.<\/p>\n<p>High pressure, little information<\/p>\n<p class=\"dcr-130mj7b\">One worker who joined GlobalLogic in spring 2024, and has worked on five different projects so far including Gemini and AI Overviews, described her work as being presented with a prompt \u2013 either user-generated or synthetic \u2013 and two sample responses, then choosing the response that aligned best with the guidelines and rating it based on any violations of those guidelines. Occasionally, she was asked to stump the model.<\/p>\n<p class=\"dcr-130mj7b\">She said raters are typically given as little information as possible, or their guidelines changed too rapidly to enforce consistently. \u201cWe had no idea where it was going, how it was being used or to what end,\u201d she said, requesting anonymity, as she is still employed at the company.<\/p>\n<p class=\"dcr-130mj7b\">The AI responses she got \u201ccould have hallucinations or incorrect answers\u201d and she had to rate them based on factuality \u2013 is it true? \u2013 and groundedness \u2013 does it cite accurate sources?
Sometimes, she also handled \u201csensitivity tasks\u201d which included prompts such as \u201cwhen is corruption good?\u201d or \u201cwhat are the benefits to conscripted child soldiers?\u201d<\/p>\n<p class=\"dcr-130mj7b\">\u201cThey were sets of queries and responses to horrible things worded in the most banal, casual way,\u201d she added.<\/p>\n<p class=\"dcr-130mj7b\">As for the ratings, this worker claims that popularity could take precedence over agreement and objectivity. Once the workers submit their ratings, other raters are assigned the same cases to make sure the responses are aligned. If the different raters did not align on their ratings, they would have consensus meetings to clarify the difference. \u201cWhat this means in reality is the more domineering of the two bullied the other into changing their answers,\u201d she said.<\/p>\n<p class=\"dcr-130mj7b\">Researchers say that, while this collaborative model can improve accuracy, it is not without drawbacks. \u201cSocial dynamics play a role,\u201d said Antonio Casilli, a sociologist at Polytechnic Institute of Paris, who studies the human contributors to artificial intelligence. \u201cTypically those with stronger cultural capital or those with greater motivation may sway the group\u2019s decision, potentially skewing results.\u201d<\/p>\n<p>Loosening the guardrails on hate speech<\/p>\n<p class=\"dcr-130mj7b\">In May 2024, Google launched AI Overviews \u2013 a feature that scans the web and presents a summed-up, AI-generated response on top. But just weeks later, when a user queried Google about cheese not sticking to pizza, an AI Overview suggested they put glue on their dough. Another suggested users eat rocks. Google called these questions edge cases, but the incidents elicited public ridicule nonetheless.
<a href=\"https:\/\/www.theverge.com\/2024\/5\/24\/24164119\/google-ai-overview-mistakes-search-race-openai\" data-link-name=\"in body link\" rel=\"nofollow noopener\" target=\"_blank\">Google scrambled to manually remove<\/a> the \u201cweird\u201d AI responses.<\/p>\n<p class=\"dcr-130mj7b\">\u201cHonestly, those of us who\u2019ve been working on the model weren\u2019t really that surprised,\u201d said another GlobalLogic worker, who has been on the super rater team for almost two years now, requesting anonymity. \u201cWe\u2019ve seen a lot of crazy stuff that probably doesn\u2019t go out to the public from these models.\u201d He remembers an immediate focus on \u201cquality\u201d after this incident because Google was \u201creally upset about this\u201d.<\/p>\n<p class=\"dcr-130mj7b\">But this quest for quality didn\u2019t last long.<\/p>\n<p class=\"dcr-130mj7b\">Rebecca Jackson-Artis, a seasoned writer, joined GlobalLogic from North Carolina in fall 2024. With less than one week of training on how to edit and rate responses by Google\u2019s AI products, she was thrown into the work, unsure of how to handle the tasks. Working on Google Magi, a new AI search product geared towards e-commerce, Jackson-Artis was initially told there was no time limit to complete the tasks assigned to her. Days later, though, she was given the opposite instruction, she said.<\/p>\n<p class=\"dcr-130mj7b\">\u201cAt first they told [me] \u2018don\u2019t worry about time \u2013 it\u2019s quality versus quantity,\u2019\u201d she said.<\/p>\n<p class=\"dcr-130mj7b\">But before long, she was pulled up for taking too much time to complete her tasks. \u201cI was trying to get things right and really understand and learn it, [but] was getting hounded by leaders [asking] \u2018Why aren\u2019t you getting this done?
You\u2019ve been working on this for an hour.\u2019\u201d<\/p>\n<p class=\"dcr-130mj7b\">Two months later, Jackson-Artis was called into a meeting with one of her supervisors where she was questioned about her productivity, and asked to \u201cjust get the numbers done\u201d and not worry about what she\u2019s \u201cputting out there\u201d, she said. By this point, Jackson-Artis was not just fact-checking and rating the AI\u2019s outputs, but was also entering information into the model, she said. The topics ranged widely \u2013 from health and finance to housing and child development.<\/p>\n<p class=\"dcr-130mj7b\">One workday, her task was to enter details on chemotherapy options for bladder cancer, which haunted her because she wasn\u2019t an expert on the subject.<\/p>\n<p class=\"dcr-130mj7b\">\u201cI pictured a person sitting in their car finding out that they have bladder cancer and googling what I\u2019m editing,\u201d she said.<\/p>\n<p class=\"dcr-130mj7b\">In December, Google sent an internal guideline to its contractors working on Gemini saying they were no longer allowed to \u201cskip\u201d prompts for lack of domain expertise, including on healthcare topics, which they were allowed to do previously, according to a <a href=\"https:\/\/techcrunch.com\/2024\/12\/18\/exclusive-googles-gemini-is-forcing-contractors-to-rate-ai-responses-outside-their-expertise\/\" data-link-name=\"in body link\" rel=\"nofollow noopener\" target=\"_blank\">TechCrunch<\/a> report. Instead, they were told to rate the parts of the prompt they understood and flag with a note that they lacked knowledge in that area.<\/p>\n<p class=\"dcr-130mj7b\">Another super rater based on the US west coast says he gets several questions a day that he\u2019s not qualified to handle.
Just recently, he was tasked with two queries \u2013 one on astrophysics and the other on math \u2013 of which he said he had \u201cno knowledge\u201d and yet was told to check the accuracy.<\/p>\n<p class=\"dcr-130mj7b\">Earlier this year, Sawyer noticed further loosening of guardrails: responses that were not OK last year became \u201cperfectly permissible\u201d this year. In April, the raters received a document from GlobalLogic with new guidelines, a copy of which has been viewed by the Guardian, which essentially said that regurgitating hate speech, harassment, sexually explicit material, violence, gore or lies does not constitute a safety violation so long as the content was not generated by the AI model.<\/p>\n<p class=\"dcr-130mj7b\">\u201cIt used to be that the model could not say racial slurs whatsoever. In February, that changed, and now, as long as the user uses a racial slur, the model can repeat it, but it can\u2019t generate it,\u201d said Sawyer. \u201cIt can replicate harassing speech, sexism, stereotypes, things like that. It can replicate pornographic material as long as the user has input it; it can\u2019t generate that material itself.\u201d<\/p>\n<p class=\"dcr-130mj7b\">Google said in a statement that its AI policies have not changed with regards to hate speech. In <a href=\"https:\/\/userp.io\/news\/google-updates-its-generative-ai-prohibited-use-policy\/#:~:text=Google&#039;s%20old%20policy%20contained%20three,substantial%20benefits%20to%20the%20public.%E2%80%9D\" data-link-name=\"in body link\" rel=\"nofollow noopener\" target=\"_blank\">December 2024<\/a>, however, the company introduced a clause to its prohibited use policy for generative AI that would allow for exceptions \u201cwhere harms are outweighed by substantial benefits to the public\u201d, such as art or education.
The update, which aligns with the timeline of the document and Sawyer\u2019s account, seems to codify the distinction between generating hate speech and referencing or repeating it for a beneficial purpose. Such context may not be available to a rater.<\/p>\n<p class=\"dcr-130mj7b\">Dinika says he has seen this pattern time and again: safety is prioritized only until it slows the race for market dominance. Human workers are often left to clean up the mess after a half-finished system is released. \u201cSpeed eclipses ethics,\u201d he said. \u201cThe AI safety promise collapses the moment safety threatens profit.\u201d<\/p>\n<p class=\"dcr-130mj7b\">Though the AI industry is booming, AI raters do not enjoy strong job security. Since the start of 2025, GlobalLogic has had rolling layoffs, with the total workforce of AI super raters and generalist raters shrinking to roughly 1,500, according to multiple workers. At the same time, workers say they have lost trust in the products they are helping to build and train. Most workers said they avoid using LLMs or use extensions to block AI summaries because they now know how they are built. Many also discourage their family and friends from using them, for the same reason.<\/p>\n<p class=\"dcr-130mj7b\">\u201cI just want people to know that AI is being sold as this tech magic \u2013 that\u2019s why there\u2019s a little sparkle symbol next to an AI response,\u201d said Sawyer. \u201cBut it\u2019s not.
It\u2019s built on the backs of overworked, underpaid human beings.\u201d<\/p>\n","protected":false},"excerpt":{"rendered":"In the spring of 2024, when Rachael Sawyer, a technical writer from Texas, received a LinkedIn message from&hellip;\n","protected":false},"author":2,"featured_media":135198,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20],"tags":[256,254,255,64,63,105],"class_list":{"0":"post-135197","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-artificialintelligence","11":"tag-au","12":"tag-australia","13":"tag-technology"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/posts\/135197","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/comments?post=135197"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/posts\/135197\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/media\/135198"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/media?parent=135197"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/categories?post=135197"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/au\/wp-json\/wp\/v2\/tags?post=135197"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}