On October 22, 2025, Reddit, Inc. initiated a federal lawsuit in the Southern District of New York against Perplexity AI, Inc. and associated data-scraping firms. The lawsuit alleges violations of the Digital Millennium Copyright Act’s anti-circumvention provisions, along with claims for unjust enrichment and unfair competition. Reddit characterizes the dispute not as a typical copyright infringement case but as a focused attack on what it terms “industrial-scale” evasion of technical controls designed to protect its content.
The core of Reddit’s complaint is the alleged scraping of its data through Google’s search results. According to the lawsuit, the defendants used deceptive tactics, including masking their identities and rotating IP addresses, to scrape billions of search engine results pages that contained Reddit URLs, text, images, and videos. Reddit claims that Perplexity then integrated this harvested data into its “answer engine.”
Key Allegations and Legal Framework
Two allegations distinguish this case from others. First, Reddit asserts it created a post that was indexed by Google but not directly accessible. Within hours, Perplexity’s answer engine reportedly surfaced substantial portions of that post, suggesting that Perplexity or its co-defendants scraped Google’s results and used the data. Second, after Reddit issued a cease-and-desist notice in May 2024, Perplexity’s citations to Reddit allegedly surged forty-fold, despite Perplexity’s public claims of respecting robots.txt directives.
Rather than disputing how Perplexity used the copyrighted materials, Reddit’s claim under §1201(a)(1) of the DMCA targets the act of circumventing technological measures that control access to copyrighted works. The lawsuit also points to the allegedly unlawful conduct of the data-scraping co-defendants, claiming that Perplexity worked with them to carry out this large-scale circumvention of Reddit’s and Google’s access controls. Reddit seeks both injunctive relief and damages for the harm to its business.
Reddit positions Perplexity within a broader ecosystem of data brokers and scrapers, contrasting its practices with Reddit’s paid licensing partnerships with companies such as OpenAI and Google. The lawsuit is framed as a defense of a licensing model that Reddit claims Perplexity’s competitors adhere to. It argues that Perplexity’s methods not only undermine the value of existing licensing agreements but also divert user engagement away from Reddit by reducing the need for users to visit the platform directly.
User Privacy and Data Integrity Concerns
The lawsuit raises critical issues regarding user privacy. Reddit alleges that the scraping practices captured deleted, private, or restricted posts, hampering the company’s ability to honor user deletion requests and privacy preferences. This, according to Reddit, compromises users’ rights to control access to their content and threatens the platform’s integrity and user trust.
The case is expected to raise difficult legal questions, including whether Google’s and Reddit’s measures qualify as §1201 “technological measures” that effectively control access. The court will also need to consider whether scraping search engine results pages constitutes accessing the underlying copyrighted works. Perplexity may argue that the snippets shown on search engine results pages are publicly accessible and that the measures at issue limit the volume of automated requests rather than control access to protected content.
Reddit v. Perplexity highlights a growing trend in which platforms rely on access-control and contract-based theories in place of direct copyright claims that may be complex or uncertain. For content owners, the case underscores the need to combine technical barriers with contractual enforcement to combat circumvention effectively. For AI developers, it serves as a reminder that publicly available content is not automatically free for training purposes: obtaining it through methods that evade access restrictions poses significant legal risk under the DMCA.
