Ycombinator has a great user discussion/debate over the takeaways of the Anthropic AI data training court decision, we highly recommend the read. One of the key takeaways, in our opinion, is the fact that Anthropic downloaded 7M books for their use, through illegal means (read torrents and the like), and thus the training set material, in this specific court case, was deemed legal but the piracy itself was not. To put this in perspective, willful piracy comes with damages of up to $150,000 per occurrence. This was willful piracy. For the non-math nerds out there, 7M x $150,000 = $1.05 trillion dollars. Yes, trillion.
And this is likely what every single large training model has done. There is evidence Meta did this and there is even a paper trail of them discussing they knew it was illegal and communications from the top of the company saying to do it anyway.
We hope hope hope there are repercussions. But we are also pragmatists and have seen the Trump Administration reduce monetary sentences by 99% for people who have donated to his campaign, bought his crypto, donated to his inauguration, etc., and every single big tech company head has done just that. Trump pardoned the person behind the Silk Road, who was convicted of running a site that facilitated murders for hire, child pornography, sex trafficking, drug trades, and more. The limit at how low this administration will go to reduce damages or commute sentences to the highest donor has no limits. But maybe there will be Justice in the end for these companies that stole from real creators. Maybe.
PS Featured Gen AI image “Create an image that represent how big $1.05 trillion USD really is” brought to you by “What are those piles of exactly?” and “Other than the text, how are we supposed to know?”