Sitemap URL Extractor

Paste any sitemap.xml URL. The tool fetches it, parses the XML in the browser, and lists every URL it contains, ready to download as a text file.

02.

How It Works

01

HTTP Fetch

Uses the Fetch API with a multi-proxy cascade (direct → corsproxy.io → allorigins → corsproxy.org). Each strategy is tried in sequence until one succeeds.
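
The cascade can be sketched roughly as below. This is a minimal illustration rather than the tool's actual source: the names `STRATEGIES` and `fetchWithFallback` and the injectable `fetchImpl` parameter are assumptions. Only the direct and allorigins strategies are shown, and the allorigins `raw?url=` endpoint format is assumed; the other proxies would be further entries using whatever URL format each one documents.

```javascript
// Each strategy maps the target URL to the URL that is actually requested.
// Illustrative subset of the cascade; entries and formats are assumptions.
const STRATEGIES = [
  { name: "direct", build: (url) => url },
  {
    name: "allorigins",
    build: (url) =>
      "https://api.allorigins.win/raw?url=" + encodeURIComponent(url),
  },
];

// Try each strategy in order and return the first successful response body.
// `fetchImpl` defaults to the global fetch; it is injectable for testing.
async function fetchWithFallback(url, strategies = STRATEGIES, fetchImpl = fetch) {
  const errors = [];
  for (const strategy of strategies) {
    try {
      const response = await fetchImpl(strategy.build(url));
      if (!response.ok) {
        throw new Error("HTTP " + response.status);
      }
      return { via: strategy.name, text: await response.text() };
    } catch (err) {
      // Record the failure and fall through to the next strategy.
      errors.push(strategy.name + ": " + err.message);
    }
  }
  throw new Error("All strategies failed (" + errors.join("; ") + ")");
}
```

Collecting the per-strategy errors makes the final message useful when every route fails, which is the case a user most needs explained.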

02

DOM XML Parsing

The browser's built-in DOMParser converts the XML string into a traversable document tree — no external libraries required.
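
A minimal sketch of this parsing step, assuming a helper named `extractLocs` (the name and the optional `parser` parameter are illustrative, not taken from the tool):

```javascript
// Parse sitemap XML and return the text of every <loc> element.
// `parser` defaults to the browser's DOMParser; the parameter is an
// assumption added so the function can be exercised outside a browser.
function extractLocs(xmlText, parser = new DOMParser()) {
  const doc = parser.parseFromString(xmlText, "application/xml");

  // Malformed XML does not throw; the parser returns a document
  // containing a <parsererror> element instead.
  if (doc.getElementsByTagName("parsererror").length > 0) {
    throw new Error("Invalid XML");
  }

  // "*" matches any namespace, so this works with or without xmlns.
  const nodes = doc.getElementsByTagNameNS("*", "loc");
  return Array.from(nodes, (node) => node.textContent.trim()).filter(Boolean);
}
```

Trimming matters in practice because many sitemaps wrap the URL inside `<loc>` in whitespace or newlines.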

03

Recursive Sitemap Index

Detects <sitemapindex> root elements and recursively fetches every child sitemap to collect all URLs.
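
The recursion might look like the sketch below. `collectUrls` and `loadSitemap` are hypothetical names: `loadSitemap(url)` stands for "fetch and parse one sitemap" and is assumed to resolve to `{ root, locs }`, where `root` is the local name of the root element. The `maxDepth` guard is also an assumption, added so a badly nested index cannot recurse without bound.

```javascript
// Recursively collect page URLs starting from one sitemap URL.
// `loadSitemap(url)` is a hypothetical helper that fetches and parses a
// sitemap and resolves to { root, locs }, where `root` is the local name
// of the root element ("sitemapindex" or "urlset").
async function collectUrls(sitemapUrl, loadSitemap, depth = 0, maxDepth = 5) {
  if (depth > maxDepth) {
    throw new Error("Sitemap index nesting too deep: " + sitemapUrl);
  }
  const { root, locs } = await loadSitemap(sitemapUrl);

  if (root !== "sitemapindex") {
    // A regular <urlset>: every <loc> is a page URL.
    return locs;
  }

  // A <sitemapindex>: every <loc> points at a child sitemap.
  const pages = [];
  for (const childUrl of locs) {
    const childPages = await collectUrls(childUrl, loadSitemap, depth + 1, maxDepth);
    pages.push(...childPages);
  }
  return pages;
}
```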

04

Client-Side File Gen

Constructs a Blob from the URL list, creates an Object URL, and triggers a download. The file is built entirely in the browser; the URL list is never uploaded.
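
A sketch of the download step, split into a pure part and a browser-only part. The function names and the default filename `sitemap-urls.txt` are assumptions made for illustration.

```javascript
// Build a plain-text Blob with one URL per line.
function buildUrlListBlob(urls) {
  return new Blob([urls.join("\n") + "\n"], { type: "text/plain" });
}

// Browser-only: trigger a download of the Blob via a temporary <a> element.
function downloadBlob(blob, filename = "sitemap-urls.txt") {
  const objectUrl = URL.createObjectURL(blob);
  const link = document.createElement("a");
  link.href = objectUrl;
  link.download = filename;
  document.body.appendChild(link);
  link.click();
  link.remove();
  // Release the object URL shortly after the download has been started.
  setTimeout(() => URL.revokeObjectURL(objectUrl), 0);
}
```

In the page this would be called as `downloadBlob(buildUrlListBlob(urls))` from a click handler, since browsers generally expect downloads to follow a user gesture.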

03.

Under the Hood

This tool is built entirely in vanilla JavaScript with zero external dependencies. Here's a breakdown of the CS fundamentals and web platform APIs it exercises:

Data Structures

  • DOM Tree traversal for XML node extraction
  • Set for deduplication (O(1) average-case lookup), Array for ordered output
  • Stack-based iterative traversal of sitemap indexes, in place of call-stack recursion
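
The Set and explicit-stack ideas combine naturally, as in this sketch. `crawlSitemaps` and `loadSitemap` are hypothetical names; `loadSitemap(url)` is assumed to resolve to `{ root, locs }` for one parsed sitemap file.

```javascript
// Iterative sitemap traversal using an explicit stack instead of recursion.
// `loadSitemap(url)` is a hypothetical async helper resolving to
// { root, locs } as parsed from one sitemap file.
async function crawlSitemaps(startUrl, loadSitemap) {
  const stack = [startUrl]; // sitemaps still to visit
  const visited = new Set(); // guards against cycles and repeats
  const pages = new Set(); // a Set iterates in first-insertion order

  while (stack.length > 0) {
    const current = stack.pop();
    if (visited.has(current)) continue; // O(1) average-case lookup
    visited.add(current);

    const { root, locs } = await loadSitemap(current);
    if (root === "sitemapindex") {
      // Push in reverse so children are popped in document order.
      for (let i = locs.length - 1; i >= 0; i--) stack.push(locs[i]);
    } else {
      for (const loc of locs) pages.add(loc);
    }
  }
  return [...pages];
}
```

Because the pending work lives in an array rather than on the call stack, deeply nested or self-referencing indexes cannot overflow the stack, and the `visited` Set ensures each sitemap is fetched at most once.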

Networking & Async

  • Fetch API with async/await control flow
  • CORS handling with multi-proxy cascade fallback
  • AbortController for request cancellation & timeout
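
A timeout built on AbortController could be sketched like this. The name `fetchWithTimeout`, the 10-second default, and the injectable `fetchImpl` parameter are assumptions, not details taken from the tool.

```javascript
// Fetch with a timeout: abort the request if it takes longer than `ms`.
// `fetchImpl` is injectable for testing; it defaults to the global fetch.
async function fetchWithTimeout(url, ms = 10000, fetchImpl = fetch) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    return await fetchImpl(url, { signal: controller.signal });
  } finally {
    // Always clear the timer so it cannot fire after completion.
    clearTimeout(timer);
  }
}
```

The same `controller` could also be exposed to a Cancel button, so user cancellation and timeout share one code path.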

Robustness

  • Input validation & URL sanitisation
  • Graceful error handling with user-facing messages
  • Namespace-aware XML querying (handles xmlns)
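
Input validation can lean on the built-in `URL` constructor, as in this sketch. The function name `normaliseSitemapUrl` and the policy of accepting only `http:` and `https:` are assumptions about what the tool's validation might look like.

```javascript
// Validate user input and return a normalised absolute http(s) URL string.
// Returns null when the input cannot be used.
function normaliseSitemapUrl(input) {
  const trimmed = String(input ?? "").trim();

  let url;
  try {
    url = new URL(trimmed); // throws on empty or relative input
  } catch {
    return null;
  }

  // Reject javascript:, data:, file: and other non-web schemes.
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    return null;
  }
  return url.href;
}
```

Returning `null` rather than throwing keeps the caller simple: it can show a user-facing message for any falsy result.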

Cross-Platform

  • Blob & Object URL API for file download
  • Clipboard API with execCommand fallback
  • Responsive layout for mobile, tablet, desktop
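
The clipboard fallback might be structured as follows. This is a sketch under assumptions: `copyText` is a hypothetical name, and the `env` parameter stands in for the browser globals (`navigator`, `document`) so the branching can be tested outside a browser.

```javascript
// Copy text using the async Clipboard API, falling back to the legacy
// document.execCommand("copy") path when that is unavailable or rejected.
// Resolves to the name of the path that succeeded.
async function copyText(text, env = globalThis) {
  const clipboard = env.navigator && env.navigator.clipboard;
  if (clipboard && typeof clipboard.writeText === "function") {
    try {
      await clipboard.writeText(text);
      return "clipboard-api";
    } catch {
      // Permission denied or insecure context: use the fallback below.
    }
  }

  const doc = env.document;
  const area = doc.createElement("textarea");
  area.value = text;
  area.style.position = "fixed"; // avoid scrolling the page
  area.style.opacity = "0";
  doc.body.appendChild(area);
  area.select();
  try {
    if (!doc.execCommand("copy")) {
      throw new Error("execCommand copy failed");
    }
    return "execCommand";
  } finally {
    // Remove the temporary element whether or not the copy worked.
    doc.body.removeChild(area);
  }
}
```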