I’ve witnessed AI tools become indispensable in automating complex processes that traditionally demanded a lot of manual effort. However, I’ve also seen them used without any real benefit just because they are available.
That’s why I prefer focusing on AI applications that save time and address genuine challenges.
Recently, I was tasked with aligning the SEO architecture for over a dozen websites across three separate businesses, eight regional domains, and numerous languages, including three English dialects, Italian, Japanese, Spanish, Thai, French, and Korean.
Mapping thousands of URLs to create seamless hreflang XML sitemaps traditionally required specialized software or extensive spreadsheet work. Instead, I used Google Gemini to develop a custom Python script to handle the heavy lifting.
Here’s how an initial prompt evolved into a fully customized automation tool and what it taught me about utilizing AI for technical SEO.
Where AI Delivers the Most Value
I leverage AI primarily for practical, time-saving tasks, including:
- Generating regex patterns when I need quick solutions without researching syntax from scratch.
- Creating complex spreadsheet formulas for reporting workflows that depend on manual data exports.
- Speeding up research and planning for projects requiring competitive analysis across business lines.
- Building custom automation tools for recurring SEO and data-processing tasks.
The hreflang project I discuss here fits perfectly into the last category.
Mapping hreflang at Scale
The challenge was straightforward: accurately map thousands of URLs across multiple multilingual websites into cohesive hreflang XML sitemaps.
I chose not to tackle this manually. Instead, Google Gemini helped me build a custom Python solution.
Here’s a walkthrough of how the process unfolded.
Phase 1: Asking for an Approach, Not Just a Script
One common pitfall of using generative AI for coding is asking it to sprint before understanding the course. Typing, “Write a Python script to create an hreflang sitemap,” will yield generic code prone to breaking with real-world data.
Instead, I started by asking for an approach. I detailed the scenario: multiple regional domains, organic growth over several years leading to mismatched URL slugs, translated subfolders, and appended revision years.
Gemini suggested a multi-step, data-driven approach:
- Crawl the websites to collect live URLs and their metadata.
- Use Python in Google Colab to process the raw data.
- Run an exact match cluster to group identical slugs.
- Use a semantic embedding model (for example, through the SentenceTransformers library) to fuzzy match translated pages based on their titles and normalized URLs.
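The exact-match step in that plan can be sketched roughly as follows. This is a minimal illustration rather than the script Gemini produced: the function names and the example.* URLs are hypothetical, and it assumes that "identical slugs" means identical URL paths after trimming slashes and lowercasing.

```python
from collections import defaultdict
from urllib.parse import urlparse


def normalize_slug(url: str) -> str:
    """Return the lowercased URL path without leading or trailing slashes."""
    return urlparse(url).path.strip("/").lower()


def cluster_exact(urls: list[str]) -> dict[str, list[str]]:
    """Group URLs from different domains that share an identical slug."""
    clusters: dict[str, list[str]] = defaultdict(list)
    for url in urls:
        clusters[normalize_slug(url)].append(url)
    # A slug seen only once has no counterpart yet, so it is left out here
    return {slug: group for slug, group in clusters.items() if len(group) > 1}


# Hypothetical URLs for illustration only
urls = [
    "https://example.com/products/widget/",
    "https://example.co.uk/products/widget",
    "https://example.it/prodotti/widget",
]
clusters = cluster_exact(urls)
```

In this sketch the Italian URL stays unclustered because its folder is translated, which is the kind of leftover the semantic matching step is meant to pick up.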
Phase 2: Crawling and Data Collection
Following the recommended strategy, I used a crawler to spider all regional websites to generate a unified CSV file with live URLs, status codes, title tags, and H1s. Screaming Frog proved ideal for this task.
The quality of the AI’s output depends directly on the quality of your crawl data, so make sure the crawl is complete and accurate.
An AI script can miss an obvious “exact match” if a target URL is a 404 or a 301 redirect. Before feeding data into the script, filter your CSV to include only indexable content.
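A filter along these lines could run before any matching. It is a sketch under assumptions: the column names ("Address", "Status Code", "Indexability") are modeled on a typical crawler export and may differ in your file, and the sample rows are hypothetical.

```python
import csv
import io

# Assumed column names; adjust them to match your own crawl export
URL_COL = "Address"
STATUS_COL = "Status Code"
INDEXABILITY_COL = "Indexability"


def indexable_rows(csv_file) -> list[dict]:
    """Keep only rows that return 200 and are flagged as indexable."""
    reader = csv.DictReader(csv_file)
    return [
        row
        for row in reader
        if row[STATUS_COL].strip() == "200"
        and row[INDEXABILITY_COL].strip().lower() == "indexable"
    ]


# Hypothetical sample standing in for the real crawl export
sample = io.StringIO(
    "Address,Status Code,Indexability\n"
    "https://example.com/a,200,Indexable\n"
    "https://example.com/old,301,Non-Indexable\n"
    "https://example.com/missing,404,Non-Indexable\n"
)
kept = indexable_rows(sample)
```

With a real export, you would pass an open file handle instead of the in-memory sample.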
Phase 3: The Google Colab Sandbox
Google Colab offers a free, cloud-based Jupyter notebook environment for coding, bypassing local installations or environment variable issues. I used Google Drive to access it. The free version sufficed for this project.
After uploading the CSV to Colab, Gemini provided an initial Python script that utilized a domain-mapping routine to assign language codes, clean the URLs, and generate an XML tree. The initial results required refinement.
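The overall shape of such a script, a domain-to-language mapping feeding an XML tree, might look like the sketch below. The domains, language codes, and function name are assumptions for illustration. The sitemap and xhtml namespaces are the ones used for hreflang annotations in XML sitemaps.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("xhtml", XHTML_NS)

# Hypothetical mapping from domain to hreflang value
DOMAIN_LANG = {
    "example.com": "en-us",
    "example.co.uk": "en-gb",
    "example.it": "it-it",
}


def build_sitemap(clusters: list[list[str]]) -> ET.Element:
    """Build a <urlset> where each URL lists every alternate in its cluster."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for cluster in clusters:
        for url in cluster:
            url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc").text = url
            # Each entry references all alternates, including itself
            for alt in cluster:
                ET.SubElement(
                    url_el,
                    f"{{{XHTML_NS}}}link",
                    rel="alternate",
                    hreflang=DOMAIN_LANG[urlparse(alt).netloc],
                    href=alt,
                )
    return urlset


# Hypothetical cluster of two equivalent pages
root = build_sitemap(
    [["https://example.com/widget", "https://example.co.uk/widget"]]
)
xml_text = ET.tostring(root, encoding="unicode")
```

Listing every alternate on every entry, self-reference included, keeps the annotations reciprocal, which hreflang requires.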
Phase 4: The Iteration (Where the Real Work Happens)
If you expect AI to produce a flawless script on the first try, you’ll be disappointed. Like an intern, AI requires oversight. The true value lies in iteration.
The initial script left several URLs unmatched, producing orphaned pages instead of grouping them with their international counterparts. Here’s how I iteratively guided the AI through the complexities of human-managed websites.
The Directory Flattening Problem
The U.S. site had recently reorganized its blog into topical folders, unlike the Mexican and Italian sites. I presented these mismatches to Gemini, leading to a script adjustment that flattened directories, allowing slugs to align.
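One way to flatten directories is to compare only the last path segment. This is a sketch of the idea, not the exact adjustment Gemini made, and the URLs and function name are hypothetical.

```python
from urllib.parse import urlparse


def flatten_slug(url: str) -> str:
    """Return only the final path segment, ignoring any parent folders."""
    path = urlparse(url).path.strip("/")
    return path.rsplit("/", 1)[-1].lower()


# Hypothetical URLs: one nested in a topical folder, one not
us_url = "https://example.com/blog/manufacturing/plant-tour/"
mx_url = "https://example.mx/blog/plant-tour"
same_page = flatten_slug(us_url) == flatten_slug(mx_url)
```

The trade-off is that two unrelated pages in different folders can share a final segment, so flattened matches are worth spot-checking.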
The Aggressive Semantic Trap
The concept traps we implemented were initially too strict: a UK article about manufacturing wouldn’t match its Italian counterpart because the title differed slightly. Loosening the traps for general industry terms while still enforcing them for critical terms produced better matches.
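The article does not define a concept trap precisely, so the sketch below rests on an assumption: a trap is a rule that rejects a candidate pair when one title mentions a critical concept and the other does not. The concept list and function names are hypothetical.

```python
# Hypothetical critical concepts, each with its spellings across languages
CRITICAL_CONCEPTS = {
    "warranty": {"warranty", "garanzia", "garantía"},
    "pricing": {"pricing", "prezzi", "precios"},
}


def concepts_in(title: str) -> set[str]:
    """Return the critical concepts mentioned in a title."""
    words = set(title.lower().split())
    return {
        name
        for name, variants in CRITICAL_CONCEPTS.items()
        if words & variants
    }


def passes_concept_trap(title_a: str, title_b: str) -> bool:
    """Block a pair only when their critical concepts disagree."""
    return concepts_in(title_a) == concepts_in(title_b)


# General-industry titles carry no critical concept, so they pass
general_ok = passes_concept_trap(
    "Modern manufacturing trends", "Tendenze della produzione moderna"
)
# A warranty page is never paired with a page that lacks the concept
blocked = passes_concept_trap("Warranty terms", "Tendenze della produzione")
```

Under this assumption, general topics fall through to the semantic score, while the strict check still applies wherever a wrong pairing would be costly.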
The Translated Slug Epiphany
The pivotal insight arrived when I examined the orphaned Mexican blog pages. A Spanish URL, /detras-de-escenas-historias..., was clearly the counterpart of the English /behind-the-scenes-stories..., yet the script had not paired them. When I pointed this out, Gemini updated the script to create a “Combined Semantic Signature” that dynamically translates slugs and bridges the language gap.
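The idea of a combined signature can be illustrated with a deliberately simplified stand-in. Instead of a translation or embedding model, this sketch uses a tiny hypothetical glossary and token overlap, and every name, slug, and title in it is an assumption.

```python
import re

# Hypothetical Spanish-to-English glossary standing in for real translation
GLOSSARY = {
    "guia": "guide",
    "guía": "guide",
    "mantenimiento": "maintenance",
    "de": "of",
}
STOPWORDS = {"of", "the", "a", "for"}


def signature(slug: str, title: str) -> set[str]:
    """Combine slug and title into one set of translated tokens."""
    tokens = re.split(r"[\W_]+", f"{slug} {title}".lower())
    translated = (GLOSSARY.get(token, token) for token in tokens if token)
    return {token for token in translated if token not in STOPWORDS}


def similarity(sig_a: set[str], sig_b: set[str]) -> float:
    """Jaccard overlap between two signatures (0.0 to 1.0)."""
    if not sig_a or not sig_b:
        return 0.0
    return len(sig_a & sig_b) / len(sig_a | sig_b)


# Hypothetical page pairs
es_sig = signature("/blog/guia-de-mantenimiento", "Guía de mantenimiento")
en_sig = signature("/blog/maintenance-guide", "Maintenance Guide")
other_sig = signature("/blog/pricing-update", "Pricing Update")
```

A glossary does not scale to real content. A multilingual model would replace the lookup, but the principle of comparing slug and title as one signal is the same.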
Lessons from Building an AI-Assisted SEO Tool
This project reinforced a simple truth: AI excels as a collaborator rather than a shortcut.
- Be the strategist, let AI be the coder: Rather than demanding a finished product, discuss architecture and logic first, treating AI as a junior developer needing guidance.
- Provide concrete examples: Don’t simply state, “It’s broken.” Give specific failed URL examples or mismatches to help AI refine its logic.
- Embrace the iterative loop: Run the code, identify issues, and iterate. Each iteration makes the tool’s matching more accurate.
- Leverage Google Colab: You don’t need to be a Python guru to apply Python in SEO. Colab bridges the gap, providing access to complex data science libraries in your browser.
In the end, I had a fully customized Python script capable of processing a massive CSV to generate a cross-referenced hreflang XML sitemap in minutes.
Though AI isn’t replacing technical SEOs, those who collaborate with AI to build scalable tools will have a significant edge.
Inspired by this post on Search Engine Land.

