Reddit announced API changes today that will eventually stop its content from being freely used to train artificial intelligence tools, including the models that power OpenAI’s ChatGPT, Google’s Bard, and Microsoft’s Bing AI. AI chatbots owe much of their ability to give convincing answers to data sources like Reddit, and now Reddit plans to put that robot food behind a paywall.
Social media sites, including Reddit, are among the sources used to train large language models (LLMs) that can give compelling responses to human prompts. Some of this data can be scraped in an unstructured way, but Reddit’s API has made it easy for companies to find and package useful data on the fly.
Reddit’s API, which has been available since 2008, has until now let developers do almost anything. That includes building tools that help moderate subreddits, creating Reddit browsing clients, and making the site easier to search. Reddit plans to keep the API free for some use cases, such as building moderation tools or using Reddit in education and research settings.
Reddit’s new terms, published as the new Data API Terms, apply to developers who use the API in ways that require “broader use rights,” and they won’t automatically grant a license to anyone who needs to modify user content. This means that commercial use, such as training LLMs, is not allowed under the developer license and will instead require the parties to “enter into a separate agreement with Reddit.” Reddit has yet to specify how much it plans to charge companies that use its data commercially.
Reddit didn’t elaborate on how the API changes will directly affect third-party Reddit clients such as Apollo, RIF, and Relay. It mentions in the Data API Terms that it can enforce limits on the number of API requests made, which can be quite high for clients because they need to use OAuth tokens for Reddit user authentication. Apollo’s sole developer, Christian Selig, asked Reddit how rate limit enforcement will affect apps like his. A Reddit admin replied vaguely, saying it depends on the volume of API usage and whether it’s “compliant with our terms.”
These API changes come as Reddit plans an initial public offering later this year. Much of the company’s revenue comes from advertising (which has its own API) and digital goods. But as more AI platforms emerge, Reddit wants to capitalize on the value of its user-generated content. “The Reddit corpus of data is really valuable,” Reddit CEO Steve Huffman said in an interview with The New York Times. “We don’t have to give all that value to some of the biggest companies in the world for free.” The changes also follow a much broader clampdown on the API at Elon Musk’s Twitter, one that could affect both commercial and non-commercial users.
The new Reddit terms will take effect “after 60 days’ notice,” once developers and third parties receive an official email notification. Reddit will also release new built-in moderator tools that work with the official iOS and Android apps.