{"id":244970,"date":"2025-10-23T02:17:10","date_gmt":"2025-10-23T02:17:10","guid":{"rendered":"https:\/\/www.newsbeep.com\/us\/244970\/"},"modified":"2025-10-23T02:17:10","modified_gmt":"2025-10-23T02:17:10","slug":"reddit-sues-ai-search-engine-perplexity-for-scraping-its-data","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/us\/244970\/","title":{"rendered":"Reddit sues AI search engine Perplexity for scraping its data"},"content":{"rendered":"<p>Social media platform Reddit has filed a copyright lawsuit against Perplexity, accusing the artificial intelligence company of illegally scraping its data in order to train the model powering its search engine.<\/p>\n<p>The complaint filed in New York federal court on Wednesday marks the latest legal tussle between <a href=\"https:\/\/www.ft.com\/artificial-intelligence\" data-trackable=\"link\" rel=\"nofollow noopener\" target=\"_blank\">AI groups<\/a> over alleged copyrighted material.<\/p>\n<p><a href=\"https:\/\/www.ft.com\/stream\/85aa227a-f4af-4fe0-ae31-f2cdac88054a\" data-trackable=\"link\" rel=\"nofollow noopener\" target=\"_blank\">Reddit<\/a> also sued three smaller groups: Lithuanian data scraper Oxylabs, \u201cformer Russian botnet\u201d AWMProxy, and Texas start-up SerpApi. 
<\/p>\n<p>Reddit claims the three groups provided data-scraping services for hoovering up its copyrighted content \u201cby masking their identities, hiding their locations and disguising their web scrapers as regular people\u201d.<\/p>\n<p>Ben Lee, chief legal officer at Reddit, said on Wednesday: \u201cAI companies are locked in an arms race for quality human content \u2014 and that pressure has fuelled an industrial-scale \u2018data laundering\u2019 economy.\u201d<\/p>\n<p><a href=\"https:\/\/www.ft.com\/stream\/759ba02a-eb9f-42de-8879-d44d2a41bc98\" data-trackable=\"link\" rel=\"nofollow noopener\" target=\"_blank\">Perplexity<\/a> was \u201ca willing customer of at least one of its co-defendants\u201d, the social media company wrote in the filing, alleging the San Francisco-based AI group \u201cdesperately\u201d needed to fuel its \u201canswer engine\u201d by scraping data through Google search results.<\/p>\n<p>Perplexity on Wednesday said it had not received the lawsuit. <\/p>\n<p>\u201cWe will always fight vigorously for users\u2019 rights to freely and fairly access public knowledge,\u201d it added. \u201cOur approach remains principled and responsible as we provide factual answers with accurate AI, and we will not tolerate threats against openness and the public interest.\u201d<\/p>\n<p>Oxylabs and SerpApi each said they had also not been served, but plan to defend themselves in court. <\/p>\n<p>Denas Grybauskas, Oxylabs chief governance and strategy officer, added Reddit has made \u201cno attempt to speak with us directly or communicate any potential concerns\u201d.<\/p>\n<p>\u201cOxylabs has always been and will continue to be a pioneer and an industry leader in public data collection, and it will not hesitate to defend itself against these allegations,\u201d Grybauskas added. 
<\/p>\n<p>Two people familiar with the matter told the Financial Times that Reddit had confronted Perplexity about its alleged theft and suggested they enter discussions about a paid partnership, but that its founder Aravind Srinivas was not interested.<\/p>\n<p>Reddit had also contacted Google with its concerns, asking the tech giant to investigate if Perplexity was scraping Reddit\u2019s proprietary data through its search engine and if so, to work out how to prevent this, the people added.<\/p>\n<p>Google declined to comment. <\/p>\n<p>The suit adds to dozens of copyright lawsuits that have been filed against AI companies since the advent of generative AI systems, which are trained using vast amounts of text data, including content from the internet. Copyright holders have claimed their content has been used without consent or fair compensation.<\/p>\n<p>Reddit, which went public in March 2024 and is known for hosting devoted online communities, has struck multimillion-dollar partnerships with Google and OpenAI allowing them to train their large language models on its content.<\/p>\n<p>By contrast, Reddit alleged in the complaint that the defendants had circumvented their data protection measures to obtain its copyrighted material without permission. <\/p>\n<p>Lee said Reddit was \u201ca prime target because it\u2019s one of the largest and most dynamic collections of human conversation ever created\u201d.<\/p>\n<p>In June, Reddit sued <a href=\"https:\/\/www.ft.com\/content\/07611b74-3d69-4579-9089-f2fc2af61baa\" data-trackable=\"link\" rel=\"nofollow noopener\" target=\"_blank\">Anthropic<\/a>, alleging the AI start-up had scraped its platform more than 100,000 times since July 2024. Anthropic responded at the time that it \u201cdisagreed\u201d with Reddit\u2019s claims and would \u201cdefend ourselves vigorously\u201d.<\/p>\n<p>Perplexity and Oxylabs did not immediately respond to a request for comment. 
AWMProxy could not be reached for comment.<\/p>\n","protected":false},"excerpt":{"rendered":"Stay informed with free updates Simply sign up to the Artificial intelligence myFT Digest &#8212; delivered directly to&hellip;\n","protected":false},"author":2,"featured_media":244971,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[45],"tags":[182,181,507,74],"class_list":{"0":"post-244970","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-artificialintelligence","11":"tag-technology"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts\/244970","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/comments?post=244970"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/posts\/244970\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/media\/244971"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/media?parent=244970"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/categories?post=244970"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/us\/wp-json\/wp\/v2\/tags?post=244970"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}