{"id":63736,"date":"2025-08-12T10:27:09","date_gmt":"2025-08-12T10:27:09","guid":{"rendered":"https:\/\/www.newsbeep.com\/ca\/63736\/"},"modified":"2025-08-12T10:27:09","modified_gmt":"2025-08-12T10:27:09","slug":"the-internet-is-about-to-get-a-little-worse-as-reddit-moves-to-block-the-internet-archive-so-ai-companies-cant-scrape-its-content","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/ca\/63736\/","title":{"rendered":"The internet is about to get a little worse as Reddit moves to block the Internet Archive so AI companies can&#8217;t scrape its content"},"content":{"rendered":"<p class=\"mb-4 text-lg md:leading-8 break-words\">When you buy through links on our articles, Future and its syndication partners may earn a commission.<\/p>\n<p><img alt=\" Reddit logo. \" loading=\"lazy\" width=\"960\" height=\"541\" decoding=\"async\" data-nimg=\"1\" class=\"rounded-lg\" style=\"color:transparent\" src=\"https:\/\/www.newsbeep.com\/ca\/wp-content\/uploads\/2025\/08\/90f4b12c1c6443a85916df1f38ec7057.jpeg\"\/><\/p>\n<p>Credit: SOPA Images (Getty Images)<\/p>\n<p class=\"mb-4 text-lg md:leading-8 break-words\">The internet, which was once a useful thing, is about to become a little less so: A new report from <a href=\"https:\/\/www.theverge.com\/news\/757538\/reddit-internet-archive-wayback-machine-block-limit\" rel=\"nofollow noopener\" target=\"_blank\" data-ylk=\"slk:The Verge;elm:context_link;itc:0;sec:content-canvas\" class=\"link \">The Verge<\/a> says Reddit is going to start blocking the Wayback Machine from indexing most of its content.<\/p>\n<p class=\"mb-4 text-lg md:leading-8 break-words\">The Wayback Machine, part of the Internet Archive, takes &#8220;snapshots&#8221; of websites as they exist at various points through their history\u2014even if those websites don&#8217;t exist anymore. 
Want to know what the old <a href=\"https:\/\/web.archive.org\/web\/20150812192936\/https:\/\/forum.bioware.com\/\" rel=\"nofollow noopener\" target=\"_blank\" data-ylk=\"slk:BioWare forums;elm:context_link;itc:0;sec:content-canvas\" class=\"link \">BioWare forums<\/a> looked like before they were closed in 2016? Wayback Machine&#8217;s got you. It&#8217;s also incredibly handy for tracking things like <a href=\"https:\/\/tech.yahoo.com\/gaming\/articles\/tencent-survival-game-being-sued-221855139.html\" data-ylk=\"slk:Steam page changes;elm:context_link;itc:0;sec:content-canvas;outcm:mb_qualified_link;_E:mb_qualified_link;ct:story;\" class=\"link  yahoo-link\" rel=\"nofollow noopener\" target=\"_blank\">Steam page changes<\/a> and answering questions like, &#8220;Hey, did the CIA ever run a Star Wars fan site?&#8221; (And <a href=\"https:\/\/www.yahoo.com\/news\/cia-operated-network-gaming-sites-175736887.html\" data-ylk=\"slk:yes, it did;elm:context_link;itc:0;sec:content-canvas;outcm:mb_qualified_link;_E:mb_qualified_link;ct:story;\" class=\"link  yahoo-link\" rel=\"nofollow noopener\" target=\"_blank\">yes, it did<\/a>.)<\/p>\n<p class=\"mb-4 text-lg md:leading-8 break-words\">The Internet Archive&#8217;s ability to do this depends on crawling and indexing websites, and that&#8217;s what Reddit is going to block: In future, the Wayback Machine will only be able to index the <a href=\"http:\/\/reddit.com\" rel=\"nofollow noopener\" target=\"_blank\" data-ylk=\"slk:reddit.com;elm:context_link;itc:0;sec:content-canvas\" class=\"link \">reddit.com<\/a> homepage, meaning individual subreddits and posts will be out of reach\u2014effectively rendering it useless for archiving Reddit. 
Reddit spokesperson Tim Rathschmidt said the block is being imposed because &#8220;we\u2019ve been made aware of instances where AI companies violate platform policies, including ours, and scrape data from the Wayback Machine.&#8221;<\/p>\n<p class=\"mb-4 text-lg md:leading-8 break-words\">The report says limits on the Wayback Machine&#8217;s ability to scrape Reddit will start &#8220;ramping up&#8221; today. Rathschmidt said Reddit had been in touch with the Internet Archive in advance, to &#8220;inform them of the limits before they go into effect.&#8221;<\/p>\n<p class=\"mb-4 text-lg md:leading-8 break-words\">I&#8217;m generally all for anything that makes life more difficult for AI companies, but I can&#8217;t really hand it to Reddit in this case because the principle in question here appears to be, well, not principle, but money: Reddit <a href=\"https:\/\/redditinc.com\/blog\/reddit-and-google-expand-partnership\" rel=\"nofollow noopener\" target=\"_blank\" data-ylk=\"slk:made a deal with Google;elm:context_link;itc:0;sec:content-canvas\" class=\"link \">made a deal with Google<\/a> in 2024 to make its content available for AI training. Another <a href=\"https:\/\/redditinc.com\/blog\/reddit-and-oai-partner\" rel=\"nofollow noopener\" target=\"_blank\" data-ylk=\"slk:deal with OpenAI;elm:context_link;itc:0;sec:content-canvas\" class=\"link \">deal with OpenAI<\/a> followed a few months later.<\/p>\n<p class=\"mb-4 text-lg md:leading-8 break-words\">Reddit&#8217;s thing isn&#8217;t so much about preventing the abuses of AI training, then, as it is charging top dollar for the privilege. 
In that light, this really sucks: The Internet Archive is a <a href=\"https:\/\/archive.org\/donate\" rel=\"nofollow noopener\" target=\"_blank\" data-ylk=\"slk:non-profit organization;elm:context_link;itc:0;sec:content-canvas\" class=\"link \">non-profit organization<\/a>, and the Wayback Machine\u2014in sharp contrast to AI-powered chatbots\u2014is genuinely useful, even vital given how quickly <a href=\"https:\/\/www.pcgamer.com\/more-of-the-internet-could-disappear-as-load-bearing-image-host-imgur-announces-deletion-of-old-content-and-nsfw-images\/\" rel=\"nofollow noopener\" target=\"_blank\" data-ylk=\"slk:working links turn into dead ones;elm:context_link;itc:0;sec:content-canvas\" class=\"link \">working links turn into dead ones<\/a>. The Internet Archive provides a valuable service, <a href=\"https:\/\/www.pcgamer.com\/software\/ai\/those-erroneous-search-results-were-just-the-ai-doing-its-job-says-googleprior-to-these-screenshots-going-viral-practically-no-one-asked-google-that-question\/\" rel=\"nofollow noopener\" target=\"_blank\" data-ylk=\"slk:accurately;elm:context_link;itc:0;sec:content-canvas\" class=\"link \">accurately<\/a> and without unprompted <a href=\"https:\/\/www.pcgamer.com\/software\/ai\/elon-musk-claims-grok-was-manipulated-into-praising-hitler-then-makes-wild-claims-about-it-discovering-new-technologies-and-new-physics-within-the-next-year-just-let-that-sink-in\/\" rel=\"nofollow noopener\" target=\"_blank\" data-ylk=\"slk:racist slurs;elm:context_link;itc:0;sec:content-canvas\" class=\"link \">racist slurs<\/a>. 
Cutting the Wayback crawler off from Reddit, a massive trove of information on just about every subject imaginable, is a loss for us all.<\/p>\n<p class=\"mb-4 text-lg md:leading-8 break-words\">There does seem to be some faint hope for a better outcome than the Wayback Machine simply no longer working on Reddit: In a statement provided to PC Gamer, Mark Graham, director of the Wayback Machine, said, &#8220;We have a longstanding relationship with Reddit and continue to have ongoing discussions about this matter.&#8221;<\/p>\n","protected":false},"excerpt":{"rendered":"When you buy through links on our articles, Future and its syndication partners may earn a commission. Credit:&hellip;\n","protected":false},"author":2,"featured_media":63737,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[18],"tags":[49,48,244,40985,2522,61,40984],"class_list":{"0":"post-63736","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-internet","8":"tag-ca","9":"tag-canada","10":"tag-internet","11":"tag-internet-archive","12":"tag-reddit","13":"tag-technology","14":"tag-wayback-machine"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/posts\/63736","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/comments?post=63736"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/posts\/63736\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/media\/63737"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/media?parent=63736"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/categories?post=63736"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/ca\/wp-json\/wp\/v2\/tags?post=63736"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}