Sitemap URL Extractor
Paste any sitemap.xml URL. This tool fetches it, parses the XML in-browser, and lists every URL — ready for download as a text file.
How It Works
HTTP Fetch
Uses the Fetch API with a multi-proxy cascade (direct → corsproxy.io → allorigins → corsproxy.org). Each strategy is tried in sequence until one succeeds.
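The cascade described above could look roughly like the sketch below. It is not the tool's actual source: the function name `fetchWithCascade`, the `strategies` array, and the `proxy-one.example` / `proxy-two.example` hosts are placeholders invented for illustration, and the real proxies' URL formats are not shown here. The `fetchImpl` parameter is also an assumption, added so the logic can be exercised without a network.

```javascript
// Hypothetical strategies: each maps the target URL to the URL actually requested.
// The proxy hosts are placeholders, not the real services' endpoints.
const strategies = [
  (url) => url, // direct request first
  (url) => 'https://proxy-one.example/?url=' + encodeURIComponent(url),
  (url) => 'https://proxy-two.example/raw?url=' + encodeURIComponent(url),
];

// Try each strategy in order and return the first successful response body.
async function fetchWithCascade(targetUrl, strategyList = strategies, fetchImpl = fetch) {
  const errors = [];
  for (const buildUrl of strategyList) {
    try {
      const response = await fetchImpl(buildUrl(targetUrl));
      if (!response.ok) throw new Error('HTTP ' + response.status);
      return await response.text();
    } catch (err) {
      // Network errors, CORS rejections and non-2xx statuses all fall through
      // to the next strategy.
      errors.push(err.message);
    }
  }
  throw new Error('All fetch strategies failed: ' + errors.join('; '));
}
```

Collecting the per-strategy errors is a design choice of this sketch; it lets the final message explain why every attempt failed.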
DOM XML Parsing
The browser's built-in DOMParser converts the XML string into a traversable document tree — no external libraries required.
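As a minimal sketch of this step, assuming the sitemap uses the standard sitemaps.org namespace: the function name `parseSitemap` and its return shape are illustrative, not taken from the tool. It runs in a browser, where `DOMParser` is available globally.

```javascript
// Namespace defined by the sitemaps.org protocol.
const SITEMAP_NS = 'http://www.sitemaps.org/schemas/sitemap/0.9';

// Parse sitemap XML text and return its <loc> values plus whether it is an index.
function parseSitemap(xmlText) {
  const doc = new DOMParser().parseFromString(xmlText, 'application/xml');

  // DOMParser does not throw on malformed XML; it returns a document
  // containing a <parsererror> element instead.
  if (doc.getElementsByTagName('parsererror').length > 0) {
    throw new Error('The response is not valid XML.');
  }

  const isIndex = doc.documentElement.localName === 'sitemapindex';

  // Prefer the sitemap namespace so extension elements such as <image:loc>
  // are skipped; fall back to any namespace for files that omit xmlns.
  let nodes = doc.getElementsByTagNameNS(SITEMAP_NS, 'loc');
  if (nodes.length === 0) nodes = doc.getElementsByTagNameNS('*', 'loc');

  const urls = Array.from(nodes, (node) => node.textContent.trim()).filter(Boolean);
  return { isIndex, urls };
}
```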
Recursive Sitemap Index
Detects <sitemapindex> root elements and recursively fetches every child sitemap to collect all URLs.
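One way to walk a sitemap index is with an explicit stack rather than nested function calls, as sketched below. The names `collectUrls`, `loadSitemap`, and `maxSitemaps` are hypothetical: `loadSitemap` stands for whatever fetches and parses one sitemap and is assumed to resolve to `{ isIndex, urls }`. The cap on sitemaps visited is an added safeguard of this sketch, not something the tool is known to do.

```javascript
// Collect every page URL reachable from rootUrl, following sitemap indexes.
async function collectUrls(rootUrl, loadSitemap, maxSitemaps = 500) {
  const pending = [rootUrl];   // explicit stack of sitemaps still to fetch
  const visited = new Set();   // guards against indexes that reference each other
  const pageUrls = new Set();  // a Set drops duplicate URLs automatically

  while (pending.length > 0) {
    const current = pending.pop();
    if (visited.has(current)) continue;
    if (visited.size >= maxSitemaps) break;
    visited.add(current);

    const { isIndex, urls } = await loadSitemap(current);
    if (isIndex) {
      pending.push(...urls);   // child sitemaps are fetched on later iterations
    } else {
      for (const url of urls) pageUrls.add(url);
    }
  }
  return [...pageUrls];
}
```

Because the stack lives on the heap, a deeply nested index cannot overflow the call stack, and the `visited` set keeps a cyclic index from looping forever.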
Client-Side File Generation
Constructs a Blob from the URL list, creates an Object URL, and triggers a download. The file is built locally and is never uploaded to a server.
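A rough sketch of the download step follows. The function names `buildUrlFile` and `triggerDownload` and the default filename `sitemap-urls.txt` are assumptions made for illustration. `triggerDownload` needs a browser `document`; `buildUrlFile` only needs `Blob`.

```javascript
// Build a plain-text Blob with one URL per line.
function buildUrlFile(urls) {
  return new Blob([urls.join('\n') + '\n'], { type: 'text/plain' });
}

// Offer the Blob as a file download via a temporary <a download> link.
function triggerDownload(blob, filename = 'sitemap-urls.txt') {
  const objectUrl = URL.createObjectURL(blob);
  const link = document.createElement('a');
  link.href = objectUrl;
  link.download = filename;
  document.body.appendChild(link);
  link.click();
  link.remove();
  // Release the object URL once the click has been dispatched.
  setTimeout(() => URL.revokeObjectURL(objectUrl), 0);
}
```

A typical call would be `triggerDownload(buildUrlFile(urls))` from a button's click handler.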
Under the Hood
This tool is built entirely in vanilla JavaScript with zero external dependencies. Here's a breakdown of the CS fundamentals and web platform APIs it exercises:
Data Structures
- ▸ DOM Tree traversal for XML node extraction
- ▸ Array & Set for deduplication (average O(1) lookup)
- ▸ Stack-based iterative traversal of sitemap indexes (no call-stack recursion)
Networking & Async
- ▸ Fetch API with async/await control flow
- ▸ CORS handling with multi-proxy cascade fallback
- ▸ AbortController for request cancellation & timeout
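The timeout pattern from the last bullet can be sketched as follows. The name `fetchWithTimeout`, the 10-second default, and the injectable `fetchImpl` parameter are assumptions of this example rather than details of the tool.

```javascript
// Fetch a URL as text, aborting the request if it takes longer than timeoutMs.
async function fetchWithTimeout(url, timeoutMs = 10000, fetchImpl = fetch) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    // Passing the signal lets fetch reject as soon as abort() is called.
    const response = await fetchImpl(url, { signal: controller.signal });
    return await response.text();
  } finally {
    // Always clear the timer so it cannot fire after the request settles.
    clearTimeout(timer);
  }
}
```

The same `controller` could be exposed to a Cancel button, so that a user-initiated cancel and a timeout share one code path.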
Robustness
- ▸ Input validation & URL sanitisation
- ▸ Graceful error handling with user-facing messages
- ▸ Namespace-aware XML querying (handles xmlns)
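Input validation along these lines can lean on the built-in `URL` constructor. The sketch below is one possible approach, not the tool's code: the function name `normaliseSitemapUrl`, the error messages, and the choice to prepend `https://` when no scheme is given are all assumptions.

```javascript
// Validate user input and return a normalised absolute http(s) URL.
function normaliseSitemapUrl(input) {
  const trimmed = String(input).trim();
  if (trimmed === '') throw new Error('Please enter a sitemap URL.');

  // Assumption: input without a scheme (e.g. "example.com/sitemap.xml")
  // is treated as https.
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(trimmed);
  const candidate = hasScheme ? trimmed : 'https://' + trimmed;

  let parsed;
  try {
    parsed = new URL(candidate); // throws TypeError on malformed input
  } catch {
    throw new Error('That does not look like a valid URL.');
  }

  // Reject schemes such as ftp: or file: that fetch cannot use here.
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    throw new Error('Only http and https URLs are supported.');
  }
  return parsed.href;
}
```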
Cross-Platform
- ▸ Blob & Object URL API for file download
- ▸ Clipboard API with execCommand fallback
- ▸ Responsive layout for mobile, tablet, desktop
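The clipboard fallback mentioned above might be structured like this sketch. The function name `copyText` is hypothetical, and the `nav` and `doc` parameters exist only so the browser objects can be swapped out; in a page they default to the real `navigator` and `document`.

```javascript
// Copy text to the clipboard; resolves to true on success, false otherwise.
async function copyText(text, nav = globalThis.navigator, doc = globalThis.document) {
  // Preferred path: the asynchronous Clipboard API (requires a secure context).
  if (nav && nav.clipboard && typeof nav.clipboard.writeText === 'function') {
    try {
      await nav.clipboard.writeText(text);
      return true;
    } catch {
      // Permission denied or unsupported: fall through to the legacy path.
    }
  }
  if (!doc) return false;

  // Legacy path: select the text in an off-screen textarea and run the
  // deprecated execCommand('copy').
  const area = doc.createElement('textarea');
  area.value = text;
  area.setAttribute('readonly', '');
  area.style.position = 'fixed';
  area.style.opacity = '0';
  doc.body.appendChild(area);
  area.select();
  try {
    return doc.execCommand('copy');
  } finally {
    area.remove();
  }
}
```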