To get all the Today in Books Content plus community features, join All Access!
Jeff O'Neal
February 7, 2025
This content contains affiliate links. When you buy through these links, we may earn an affiliate commission.
Welcome to Today in Books, our daily round-up of literary headlines at the intersection of politics, culture, media, and more.
“Torrenting from a corporate laptop doesn’t feel right”: Meta emails unsealed
In the initial news-breaking and opinion-slinging around Meta’s LLM training data, not much consideration was given to where the book data came from, mostly because neither camp’s argument (using books at all is theft vs. this is how things go in technology) really depends on whether the books were legally acquired. But the law, it turns out, might care. If it can be proven that the books in the data set were both pirated and seeded (that is, made available for other people to pirate), that might poison the well. It doesn’t answer the more philosophical questions about training LLMs on human output without consent and compensation (beyond, presumably, at least buying an ebook of each title in the data set), but it could prove extremely costly in this legal dispute. My own open question, “is this all that different from a person reading a bunch of books and then writing an amalgam/homage/remix?”, does in fact assume that the books under consideration were obtained legally.