{"id":293703,"date":"2026-02-20T16:16:08","date_gmt":"2026-02-20T16:16:08","guid":{"rendered":"https:\/\/www.newsbeep.com\/nz\/293703\/"},"modified":"2026-02-20T16:16:08","modified_gmt":"2026-02-20T16:16:08","slug":"microsoft-deletes-blog-telling-users-to-train-ai-on-pirated-harry-potter-books","status":"publish","type":"post","link":"https:\/\/www.newsbeep.com\/nz\/293703\/","title":{"rendered":"Microsoft deletes blog telling users to train AI on pirated Harry Potter books"},"content":{"rendered":"<p>\u201cI think that the regurgitation and the creation of fan fiction, they both could flag copyright issues, in that fan fiction often has to take from the expressive elements, a copyrighted character, a character that\u2019s famous enough to be protected by a copyright law or plot stories or sequences,\u201d Smith said. \u201cIf these things are copied and reproduced, then that output could be potentially infringing.\u201d<\/p>\n<p>But it\u2019s also still a gray area. Looking at the blog, Smith said, \u201cI would be concerned,\u201d but \u201cI wouldn\u2019t say it\u2019s automatically infringement.\u201d<\/p>\n<p>Smith told Ars that, in pulling the blog, Microsoft \u201cwas probably smart,\u201d since courts have only generally said that training AI on copyrighted books is fair use. But courts continue to probe questions about pirated AI training materials.<\/p>\n<p>On the deleted Kaggle dataset page, Maindola previously explained that to source the data, he \u201cdownloaded the ebooks and then converted them to txt files.\u201d<\/p>\n<p>Microsoft may have infringed copyrights<\/p>\n<p>If Microsoft ever faced questions as to whether the company knowingly used pirated books to train the example models, fair use \u201ccould be a difficult argument,\u201d Smith said.<\/p>\n<p>Hacker News commenters suggested the blog could be considered fair use, since the training guide was for \u201ceducational purposes,\u201d and Smith said that Microsoft could raise some \u201cgood arguments\u201d in its defense.<\/p>\n<p>However, she also suggested that Microsoft could be deemed liable for contributing to infringement on some level after leaving the blog up for a year. Before it was removed, the Kaggle dataset was downloaded more than 10,000 times.<\/p>\n<p>\u201cThe ultimate result is to create something infringing by saying, \u2018Hey, here you go, go grab that infringing stuff and use that in our system,\u2019\u201d Smith said. \u201cThey could potentially have some sort of secondary contributory liability for copyright infringement, downloading it, as well as then using it to encourage others to use it for training purposes.\u201d<\/p>\n","protected":false},"excerpt":{"rendered":"\u201cI think that the regurgitation and the creation of fan fiction, they both could flag copyright issues, in&hellip;\n","protected":false},"author":2,"featured_media":293704,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20],"tags":[365,363,364,111,139,69,145],"class_list":{"0":"post-293703","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-artificial-intelligence","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-artificialintelligence","11":"tag-new-zealand","12":"tag-newzealand","13":"tag-nz","14":"tag-technology"},"_links":{"self":[{"href":"https:\/\/www.newsbeep.com\/nz\/wp-json\/wp\/v2\/posts\/293703","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.newsbeep.com\/nz\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.newsbeep.com\/nz\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/nz\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/nz\/wp-json\/wp\/v2\/comments?post=293703"}],"version-history":[{"count":0,"href":"https:\/\/www.newsbeep.com\/nz\/wp-json\/wp\/v2\/posts\/293703\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.newsbeep.com\/nz\/wp-json\/wp\/v2\/media\/293704"}],"wp:attachment":[{"href":"https:\/\/www.newsbeep.com\/nz\/wp-json\/wp\/v2\/media?parent=293703"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.newsbeep.com\/nz\/wp-json\/wp\/v2\/categories?post=293703"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.newsbeep.com\/nz\/wp-json\/wp\/v2\/tags?post=293703"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}