Reddit sued Perplexity and three data-scraping companies in New York federal court, alleging the companies bypassed access controls to obtain Reddit content at scale, including by scraping Google search results.
Perplexity posted a public response, saying it summarizes Reddit discussions with citations and doesn’t train AI models on Reddit content.
The position is consistent with the company’s past statements. Whether it addresses the specific allegations in Reddit’s filing remains an open question.
The complaint names Oxylabs UAB, AWMProxy, and SerpApi as intermediaries. It alleges Perplexity is a SerpApi customer and purchased and/or used SerpApi services to bypass controls and copy Reddit data.
Evidence In The Complaint
Perplexity’s argument is built around a technical distinction. The company says it summarizes and cites discussions rather than training models on Reddit posts.
Perplexity wrote in its Reddit response:
“We summarize Reddit discussions, and we cite Reddit threads in answers, just like people share links to posts here all the time.”
The complaint, however, presents technical claims that call that framing into question.
According to the filing, Reddit created a test post that was only crawlable by Google’s search engine and not accessible anywhere else on the internet. Within hours, that hidden content appeared in Perplexity’s results.
The filing also says that after Reddit sent a cease-and-desist letter, Perplexity’s citations to Reddit increased roughly forty-fold.
Similar Accusations From Publishers
Forbes previously accused Perplexity of republishing an exclusive and threatened legal action.
Wired reported that Perplexity used undisclosed IPs and spoofed user-agent strings to bypass robots.txt.
Cloudflare later said Perplexity used “stealth, undeclared crawlers” that ignored no-crawl directives, based on tests it ran in August.
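For readers less familiar with the mechanism: robots.txt rules are matched against the user-agent string a crawler announces, and compliance is voluntary. The sketch below uses Python’s standard urllib.robotparser with a made-up rule set. The bot name ExampleAIBot and the example.com URL are hypothetical and are not taken from any filing or report. It illustrates only why a request that announces a different user agent would not match a rule written for a crawler’s declared name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one declared AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the hypothetical rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# The declared crawler name matches the Disallow rule.
declared = is_allowed("ExampleAIBot", "https://example.com/thread/123")

# A generic browser-style user agent falls through to the wildcard rule.
generic = is_allowed("Mozilla/5.0", "https://example.com/thread/123")

print(declared, generic)
```

Because the check depends entirely on the announced name, site operators that want enforcement typically pair robots.txt with network-level controls, which is the kind of measure the reports above say was circumvented.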
How Perplexity Has Responded
In earlier disputes, Perplexity said issues stemmed from rough edges on new products and promised clearer attribution.
The company has also argued that some media organizations are trying to control “publicly reported facts.”
In this latest response, Perplexity frames Reddit’s lawsuit as leverage in broader training-data negotiations and writes:
“We summarize Reddit discussions… We won’t be extorted, and we won’t help Reddit extort Google.”
Why This Matters
This case matters because it concerns how AI assistants use forum content that your audiences read and that publishers frequently cite.
The legal questions go beyond training.
Courts may examine whether technical controls were bypassed, whether summarization infringes on protected expression, and whether using third-party scrapers could create legal liability for downstream products.
If courts accept Reddit’s anti-circumvention argument, it could lead to changes in how assistants cite or link to Reddit threads.
Alternatively, if courts agree with Perplexity’s view, assistants might start relying more on forum discussions that are less restricted by licensing.
What We Don’t Know Yet
The filing alleges Perplexity obtained data via at least one scraping firm, but the public complaint doesn’t specify which vendor supplied which data or include transaction details.