Tag: Legal Action

Google Held Accountable for False AI Claims in Germany

Recently, a German court ruling caught my attention because it holds that Google can be directly liable for false claims made in its AI Overviews. The Regional Court of Munich’s decision marks a significant shift: it treats AI-generated summaries as Google’s own content rather than as protected search results.

This ruling emerged from a case where AI Overviews mistakenly linked two Munich publishers to scams and dubious practices, despite the linked pages containing no such evidence, as reported by The Decoder.

AI Overviews are not just search tools. According to the court, these Overviews go beyond merely assisting users in finding third-party content. They actually process and present information in their own distinctive manner.

What struck me was the court’s finding that the AI Overview made standalone accusations of questionable business practices that the linked sources did not substantiate. Because Google designs and controls these features and their algorithms, the court ruled that the statements are Google’s own content.

Traditional search protections didn’t apply here. Google argued that it should be protected by German case law, which generally shields search engines as indirect infringers. The court disagreed, emphasizing that AI Overviews are different because they generate new statements from multiple sources.

The court also dismissed Google’s argument that users could verify claims by reviewing the linked content. It noted that AI Overviews present claims as complete answers that appear to need no further verification.

Why does this matter to me? The court’s stance implies that AI Overviews aren’t neutral links. If they issue incorrect claims about a company, Google may bear direct responsibility for these words.

Mismatched connections and misinformation. The court determined that misinformation resulted from AI conflating data about other entities with that concerning the publishers.

Because the contested claims did not appear on the linked sites, the publishers would have had no clear third party to pursue if Google were treated as a mere intermediary.

Interestingly, the court insisted that Google could compare AI-generated content against primary sources, at least in analogous situations.

Action required from Google. The injunction requires Google to refrain from repeating the disputed claims, which include allegations of scams and of business practices that do not exist.

Furthermore, Google must bear 80% of the legal costs, while each publisher covers 10%. Because Google had not submitted a cease-and-desist declaration with a penalty clause, the court found a continuing risk of repeat violations, which underscores the ruling’s importance for similar claims in the future.

Inspired by this post on Search Engine Land.

June 10, 2026

Publishers Demand Halt in AI Data Collection by Common Crawl

Could AI be losing a crucial source of its training data? Major publishers are urging Common Crawl to stop collecting and distributing their content for AI training.

Digital Content Next (DCN) has sent a cease-and-desist letter to the Common Crawl Foundation, asking them to stop scraping and sharing protected publisher content.

Representing leading digital publishers like the AP, the New York Times, NBC Universal, Bloomberg, NPR, and Fox, DCN is also insisting that Common Crawl remove its members’ content, including paywalled and subscriber-only news articles, from its datasets.

Concerns Over Opt-Outs: DCN questions whether Common Crawl has honored publisher opt-out requests. Specifically, DCN’s lawyers are scrutinizing whether Common Crawl’s earlier statements about compliance, which often cited technical costs and delays, were misleading.
- The registry maintained by Common Crawl does list sites that have opted out, including several prominent news organizations.

Claims of Infringement: DCN maintains that copyright isn’t an opt-out system. It alleges that Common Crawl has been “flagrantly infringing” publisher copyrights by distributing protected content without authorization or compensation.
- The group also criticizes how Common Crawl shares this content with AI developers.
- DCN’s CEO, Jason Kint, says the legal action is a stand against the notion that online content is available for unrestricted collection, storage, and reuse.

Common Crawl’s Defense: Rich Skrenta, the Executive Director, denies bypassing paywalls or misleading publishers. He points to a prompt technical process for removing previously crawled content upon request.
- “Our removal process aligns with our dataset’s technical framework,” Skrenta explains.

Importance of This Battle: The outcome could substantially change how much publisher content AI search engines can use without explicit permission. If consent requirements tighten, licensed sources may win out, reducing reliance on openly available web content.

The High Stakes of AI Training: Established in 2008, Common Crawl has amassed billions of webpages to form a free public repository, a vital tool for training AI models. Notably, The New York Times’ lawsuit against OpenAI in 2023 cited that Common Crawl comprised 60% of GPT-3’s training data, as reported by Press Gazette.
- A 2024 Mozilla Foundation paper found generative AI would scarcely exist today without Common Crawl.
- Common Crawl’s ongoing efforts to create AI crawling standards indicate a willingness to adapt, yet DCN wants decisive action: a full halt to the scraping of protected content.

Inspired by this post on Search Engine Land.

June 10, 2026

Navigating the AI Data Wars: Key Developments from 2023 to 2026

As I delve into the ongoing data battles, I’m struck by how they’re reshaping the AI landscape and the answers we rely on. It’s fascinating to observe the pivotal deals, restrictions, and lawsuits that are creating a fragmented visibility landscape in AI.

This journey through 2023 to 2026 reveals how platform shifts are altering the way data access impacts AI answers. Each step is integral to understanding the changing dynamics of this tech-driven era.

Inspired by this post on HiGoodie Blog.

April 1, 2026

Court Blocks Perplexity’s AI Bot from Amazon Access

I’ve just learned that Perplexity AI’s Comet browser agent can no longer make purchases on Amazon. This decision comes after a federal judge ruled in Amazon’s favor, expressing concerns about AI shopping bots.

Why this matters to us. The ruling challenges AI’s ability to simplify tasks, such as online shopping, by acting on our behalf. If similar restrictions are enacted, AI agents might face significant hurdles when trying to access logged-in areas of popular platforms.

The situation as it unfolded. U.S. District Judge Maxine Chesney in San Francisco issued a preliminary injunction in Amazon’s favor.
- Perplexity is now prohibited from using Comet to enter password-protected sections of Amazon, such as those reserved for Prime members.
- Judge Chesney cited Amazon’s “strong evidence” that Comet’s access was granted by users but not authorized by Amazon itself.
- The order also requires Perplexity to delete all Amazon data it has gathered.

Getting up to speed. Back in November, Amazon filed a lawsuit against Perplexity, accusing it of computer fraud and unauthorized platform access. Amazon alleges that Comet completed purchases on user accounts without properly identifying itself as a bot.

Next steps. The order is stayed for one week, giving Perplexity the chance to appeal.

What Amazon says. According to Lara Hendrickson, an Amazon spokesperson, this injunction is crucial for stopping Perplexity’s unauthorized Amazon access and is a vital move towards maintaining trust for customers.

Inspired by this post on Search Engine Land.

March 10, 2026

SerpApi’s Legal Battle: Challenging Google’s Scraping Lawsuit

When I first learned about SerpApi’s motion to dismiss Google’s lawsuit, my immediate thought was how bold the challenge is. SerpApi argues that Google is twisting copyright law to restrict access to public search results in order to protect its ad revenue, not its copyrights.

The motion to dismiss was officially filed on February 20th, as mentioned in a recent blog post by SerpApi’s CEO, Julien Khaleghy. This legal battle stems from Google’s accusation in December that SerpApi bypassed security measures to scrape and resell content from Google Search.

The details: According to Khaleghy, Google is improperly applying the Digital Millennium Copyright Act (DMCA). Here’s what I found compelling:

The DMCA is meant to protect copyrighted works, not online platforms or advertising ventures. In addition, Google doesn’t actually own the content that appears in its search results, and accessing publicly available pages doesn’t qualify as “circumvention” under this law, SerpApi argues.

Google claims that SerpApi managed to evade bot-detection and crawling controls using rotating bot identities and large networks to scrape licensed content from features such as images and real-time data. However, SerpApi insists that they do not decrypt systems or breach authentication protocols, and merely gather the same data any user could see via a browser, without needing to log in.

Khaleghy also points out Google’s admission that its anti-bot systems primarily secure its advertising interests, which weakens the DMCA claim against SerpApi.

SerpApi references significant legal precedents, including the Ninth Circuit’s hiQ v. LinkedIn, which cautions against monopolizing public data, and the Sixth Circuit’s Impression Products v. Lexmark, reinforcing that public-facing content shouldn’t be blocked by merely technical measures.

Catch up quick: This lawsuit is the latest in a series of escalating legal clashes over data scraping and AI usage:

Back in October 2025, Reddit sued SerpApi, among others, alleging they scraped Reddit content indirectly through Google Search. Reddit claims these companies obscured their identities and operated at an “industrial scale.” SerpApi has vowed to defend itself vigorously, emphasizing that public data should remain accessible.

In December, Google escalated by suing SerpApi for allegedly bypassing its security measures and reselling protected content. SerpApi stands firm, citing lawful operation and First Amendment rights to access public search data.

By the numbers: If Google’s interpretation of the DMCA holds, SerpApi suggests potential damages could reach $7.06 trillion. That staggering figure is a theoretical estimate based on potential penalties, not an actual demand.

What’s next: It all boils down to the court’s decision on whether Google’s claims should move forward. Depending on the outcome, this case could significantly impact how SEO platforms, AI tools, and competitive intelligence software access search results data in the future.

A triumph for Google might hinder third-party access to search data, while a victory for SerpApi could reinforce that publicly accessible search outcomes are indeed fair game.

For deeper insights, I recommend reading Google v. SerpApi: We’re filing a Motion to Dismiss. Here’s why we’re in the right.

Don’t miss Inside SearchGuard: How Google detects bots and what the SerpAPI lawsuit reveals for in-depth analysis.

Inspired by this post on Search Engine Land.

February 23, 2026

Inside Google SearchGuard: Decoding Bot Detection Secrets

I recently explored Google’s SearchGuard, an advanced system that safeguards Google Search from bots. This groundbreaking technology has been thrust into the limelight due to a lawsuit against SerpAPI, revealing how Google differentiates between human users and automated scripts.

After meticulously dissecting the JavaScript code, I gained rare insights into how Google distinguishes humans from automated scrapers in real-time.

What happened: On December 19, Google filed a lawsuit against SerpAPI, accusing them of bypassing SearchGuard to extract copyrighted data from Google Search results on a colossal scale. Instead of focusing on terms-of-service breaches, Google cited DMCA Section 1201, emphasizing anti-circumvention clauses.

This case underscores what Google deems worth protecting, which is crucial for anyone in the SEO and marketing sectors who might be using tools that interact with Google Search.

Why we care: Understanding SearchGuard is vital because any large-scale automation with Google Search invokes this system. If you’re using scraping tools, this is the barrier they encounter.

Here’s where it gets interesting: SerpAPI isn’t just another scraper. OpenAI used Google search results, obtained through SerpAPI, to enhance ChatGPT’s capabilities. Google flatly denied OpenAI’s request for direct access to its index in 2024, yet OpenAI still needed real-time data.

This situation highlights a strategic move by Google, focusing on a key element in the competition’s data supply chain.

In investigating SearchGuard, I fully decrypted version 41 of the BotGuard script, which started with an unexpected greeting:
```
Anti-spam. Want to say hello? Contact botguard-contact@google.com
```
Don’t let the friendly tone fool you; behind it lies one of the most complex bot detection systems ever created.

BotGuard vs. SearchGuard: BotGuard, known internally as Web Application Attestation (WAA), protects most Google services. Google’s legal complaint disclosed that the specific system guarding Search is called SearchGuard. When it was deployed in early 2025, it disrupted nearly all SERP scrapers.

Unlike traditional CAPTCHAs, BotGuard operates invisibly, seamlessly analyzing user behavior using sophisticated algorithms to separate bots from people.

It runs its checks in a highly protected bytecode virtual machine designed to resist reverse engineering.

How Google knows you’re human: The system evaluates multiple behavioral metrics in real-time, including mouse movements, keyboard rhythm, scroll behavior, and timing jitter, painting a comprehensive picture of a user’s natural interactions.

Mouse movements: Google observes how fluid a pointer’s motion is, capturing the small deviations that indicate a human hand rather than the straight paths typical of bots. The signals include:
- Path shape
- Speed
- Acceleration changes
- Micro-tremors

A perfectly linear mouse path raises alarms because human movement is usually full of small imperfections.
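To make the mouse signals more concrete, here is a minimal Python sketch of how path shape and speed variation could be measured from pointer samples. This is my own illustration, not Google’s code: the function name `path_features`, the `(x, y, t_ms)` sample format, and the three features are all assumptions chosen for clarity.

```python
import math
import statistics


def path_features(samples):
    """Summarize a pointer path (hypothetical helper, not Google's code).

    samples: list of (x, y, t_ms) tuples with increasing timestamps,
    containing at least two points.
    """
    # Length and duration of each segment between consecutive samples.
    segments = [
        (math.hypot(x1 - x0, y1 - y0), t1 - t0)
        for (x0, y0, t0), (x1, y1, t1) in zip(samples, samples[1:])
    ]
    travelled = sum(dist for dist, _ in segments)
    direct = math.hypot(
        samples[-1][0] - samples[0][0], samples[-1][1] - samples[0][1]
    )
    # 1.0 means a perfectly straight path; lower values mean more wandering.
    straightness = direct / travelled if travelled else 1.0

    speeds = [dist / dt for dist, dt in segments if dt > 0]
    # Change in speed between consecutive segments, a rough acceleration proxy.
    speed_changes = [b - a for a, b in zip(speeds, speeds[1:])]

    return {
        "straightness": straightness,
        "speed_stdev": statistics.pstdev(speeds) if speeds else 0.0,
        "speed_change_stdev": (
            statistics.pstdev(speed_changes) if speed_changes else 0.0
        ),
    }


# A scripted, perfectly even path versus a slightly wobbly one.
scripted = [(0, 0, 0), (10, 0, 100), (20, 0, 200), (30, 0, 300)]
wobbly = [(0, 0, 0), (8, 3, 90), (21, 2, 210), (30, 0, 300)]
print(path_features(scripted))
print(path_features(wobbly))
```

A straightness of exactly 1.0 combined with zero speed variation is the kind of “too perfect” pattern described above; a real detector would presumably weigh many more signals than these three.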

Keyboard rhythm: Everyone types differently. Google captures inter-keystroke intervals, key hold times, error patterns, and post-punctuation pauses to form a user’s unique typing ‘fingerprint.’
- Time between keys
- Keypress duration
- Error sequences
- Pauses after punctuation

Scrolling behavior and timing jitter are scrutinized as well, since natural variation in both helps separate humans from machines.
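The keystroke and timing-jitter signals can be sketched in a similar way. The Python below is a hypothetical illustration under my own assumptions: the `(key, down_ms, up_ms)` event format, the `keystroke_features` name, and the use of the coefficient of variation as a jitter measure are not taken from Google’s script.

```python
import statistics

PUNCTUATION = set(".,;:!?")


def keystroke_features(events):
    """Summarize typing rhythm (hypothetical helper, not Google's code).

    events: list of (key, down_ms, up_ms) tuples in typing order,
    containing at least two key presses.
    """
    # How long each key was held down.
    hold_times = [up - down for _, down, up in events]
    # Time between consecutive key-down events.
    intervals = [b[1] - a[1] for a, b in zip(events, events[1:])]
    # Gap between releasing a punctuation key and pressing the next key.
    pauses = [
        b[1] - a[2]
        for a, b in zip(events, events[1:])
        if a[0] in PUNCTUATION
    ]

    mean_interval = statistics.fmean(intervals)
    # Coefficient of variation: 0.0 means metronome-like timing.
    jitter = statistics.pstdev(intervals) / mean_interval if mean_interval else 0.0

    return {
        "mean_hold_ms": statistics.fmean(hold_times),
        "mean_interval_ms": mean_interval,
        "interval_jitter": jitter,
        "mean_pause_after_punct_ms": statistics.fmean(pauses) if pauses else 0.0,
    }


# Evenly spaced synthetic input versus input with uneven gaps.
scripted = [("h", 0, 50), ("i", 100, 150), (".", 200, 250), ("o", 300, 350)]
uneven = [("h", 0, 70), ("i", 140, 190), (".", 390, 470), ("o", 910, 980)]
print(keystroke_features(scripted))
print(keystroke_features(uneven))
```

Input with zero jitter across many keystrokes is unlikely to come from a person, which is the intuition behind treating timing variation as a signal.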

Google’s system also inspects over 100 HTML elements as part of its browser environment fingerprinting.

Performance monitoring: Google captures intricate details such as navigator properties, screen metrics, and engagement with browser APIs for an exhaustive analysis.

SearchGuard also protects its integrity with cryptographic measures similar to those developed by the NSA, so any successful workaround tends to be short-lived.

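The statistical ingenuity behind SearchGuard: SearchGuard relies on streaming algorithms such as Welford’s algorithm, which updates a running mean and variance one observation at a time, and reservoir sampling, which keeps a fixed-size random sample of an unbounded stream. Together they let it continuously refresh a composite profile of expected user behavior.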
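For readers unfamiliar with the two algorithms named above, here is a minimal Python sketch of each in its textbook form. The class names `RunningStats` and `Reservoir` are my own, and nothing here is taken from Google’s script; how SearchGuard actually wires these algorithms together is not something I can show.

```python
import random


class RunningStats:
    """Welford's online algorithm for mean and variance."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        # Uses the old delta and the updated mean, which keeps it stable.
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Population variance; use (n - 1) for the sample variance.
        return self.m2 / self.n if self.n else 0.0


class Reservoir:
    """Reservoir sampling (Algorithm R): a uniform sample of k stream items."""

    def __init__(self, k, rng=None):
        self.k = k
        self.items = []
        self.seen = 0
        self.rng = rng or random.Random()

    def add(self, item):
        self.seen += 1
        if len(self.items) < self.k:
            self.items.append(item)
        else:
            # Replace a stored item with probability k / seen.
            j = self.rng.randrange(self.seen)
            if j < self.k:
                self.items[j] = item


stats = RunningStats()
sample = Reservoir(k=50)
for interval_ms in (120, 95, 143, 110, 87, 160, 101):
    stats.update(interval_ms)
    sample.add(interval_ms)
print(stats.n, stats.mean, stats.variance)
```

Both run in constant memory per tracked signal, which is why they suit a script that has to summarize a continuous stream of events inside the browser.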

SerpAPI’s stance: Julien Khaleghy, CEO of SerpAPI, notes Google never reached out before filing the lawsuit, suggesting it’s an attempt to stifle competition from innovative services using their platform to power advanced applications.

Google’s assertiveness poses a monumental challenge to the SEO industry, redefining how anti-scraping measures might be perceived legally. Should SearchGuard be recognized as a legitimate protective measure under DMCA, it could set significant precedent.

Inspired by this post on Search Engine Land.

January 19, 2026

Google’s Legal Battle: SerpApi Accused of Unlawful Data Scraping

Today, I came across an intriguing development: Google has sued SerpApi. The lawsuit alleges that SerpApi has been bypassing Google’s security systems to scrape and resell copyrighted content from search results.

The Allegations: According to Google, SerpApi has:
- Circumvented the security measures and standard crawling controls Google has in place.
- Ignored directives from websites that specify how their content may be accessed.
- Employed techniques such as cloaking, rotating bot identities, and large bot networks to scrape vast amounts of content.
- Taken licensed content from search features such as images and real-time data, then sold it for profit.

Google’s Stance: Describing SerpApi’s actions as “brazen” and “unlawful,” Google said stealthy scrapers like SerpApi override crawling directives, stripping sites of their choices. Google also noted a significant increase in SerpApi’s activity over the last year.

Quick Update: Google’s lawsuit mirrors similar legal action by Reddit, which targeted SerpApi, Perplexity, Oxylabs, and AWMProxy. Reddit accused them of scraping its content via Google Search results and concealing their identities to evade restrictions.
- Reddit has licensing agreements with Google and OpenAI and suspects other companies of trying to sidestep such deals.
- Reddit reportedly set a “trap” post, visible only to Google’s crawler, which later surfaced in Perplexity’s results as proof of scraping.
- SerpApi denied these allegations, claiming its operations are lawful.

SerpApi’s Previous Statements: In its defense, SerpApi has maintained that “public search data should be accessible,” viewing its actions as protected by the First Amendment. It also warned that lawsuits like the one from Reddit could endanger the “free and open web.”

Why It Matters to Me: If Google wins this case, reliable SERP data could become harder and more expensive to obtain. That would hit teams that depend on services like SerpApi to understand search results and measure performance.

Inspired by this post on Search Engine Land.

December 19, 2025