Link Inspection and Crawler Tool

Mobile-first GitHub Pages crawler with snapshots and safe source inspection.

How It Works

Enter a GitHub Pages URL and select a crawl depth. The tool fetches the page, extracts its links, and classifies them as internal (same GitHub Pages domain) or external. It recursively crawls internal links up to the specified depth. Results are displayed in a table with actions for each link.

Working Algorithm

  1. Input URL: User enters a GitHub Pages URL and selects a crawl depth.
  2. Fetch & Parse: The tool fetches the HTML content of the URL and parses it to extract all anchor tags.
  3. Normalize Links: Each extracted link is normalized to an absolute URL based on the base URL.
  4. Classify Links: Links are classified as internal (same GitHub Pages domain) or external (different domain).
  5. Recursive Crawling: Internal links are recursively crawled up to the specified depth, repeating steps 2-4 for each new page.
  6. Data Storage: All unique links are stored in a data structure with their URL, type, and associated repository.
  7. Render Results: The collected data is rendered in a table format with action buttons for visiting, previewing, or copying each link.
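The steps above could be sketched roughly as follows. This is an illustrative sketch, not the tool's actual code: `classifyLink` and `crawl` are hypothetical names, and the regex-based anchor extraction stands in for real HTML parsing.

```javascript
// Steps 3-4: normalize a raw href against the page URL and classify it.
function classifyLink(href, pageUrl) {
  const url = new URL(href, pageUrl); // resolves relative links (step 3)
  const base = new URL(pageUrl);
  const type = url.host === base.host ? 'internal' : 'external'; // step 4
  // On *.github.io, the first path segment is the project repository.
  const repo = type === 'internal' ? (url.pathname.split('/')[1] || null) : null;
  return { url: url.toString(), type, repo };
}

// Steps 2, 5, 6: fetch a page, extract anchors, recurse down to `depth`,
// deduplicating links by their normalized URL in a Map.
async function crawl(pageUrl, depth, links = new Map(), visited = new Set()) {
  if (depth < 0 || visited.has(pageUrl)) return links;
  visited.add(pageUrl);
  const html = await (await fetch(pageUrl)).text(); // step 2
  for (const [, href] of html.matchAll(/<a\s[^>]*href="([^"]+)"/gi)) {
    const link = classifyLink(href, pageUrl); // steps 3-4
    links.set(link.url, link); // step 6: unique links with url, type, repo
    if (link.type === 'internal') {
      await crawl(link.url, depth - 1, links, visited); // step 5
    }
  }
  return links; // step 7 renders this map as the results table
}
```

The returned Map is what the results table renders: one row per unique normalized URL.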

Note: Always exercise caution when visiting external links. Use the preview feature to get a snapshot of the site before visiting. If the preview fails, the site likely blocks embedding (for example via an X-Frame-Options header or a Content-Security-Policy frame-ancestors directive), so it's safer to open it in a new tab for manual inspection.

Disclaimer

This tool is provided for educational purposes only. The creators are not responsible for any misuse or damage caused by using this tool. Always exercise caution when visiting external links. This tool does not guarantee the safety of any links. Use it at your own risk.

This project is also open-source and free to use. If you find it useful, consider starring the repository on GitHub to support further development.

About the Creator

Hey there! I'm Akshat Prasad, a passionate web developer and open-source enthusiast. I created this Link Inspection and Crawler Tool to help users quickly analyze my GitHub Pages sites and understand my projects' link structure. With a focus on simplicity and efficiency, this tool is designed to provide valuable insights while being easy to use. If you have any feedback or suggestions for improvement, feel free to reach out or contribute to the project on GitHub!

Crawled Links

Each link is listed with four columns: URL, Type, Repository, and Actions.

Site Manual

Visit – Opens the link in a new tab for a visual check of its safety and content.

Preview – Shows a snapshot preview; click it to load the live site in a sandboxed container. If the preview fails, the site likely blocks embedding (for example via an X-Frame-Options header); in that case, click "Open in new tab" to inspect it manually, and always use caution.
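A minimal sketch of how such a preview container might work, assuming browser DOM APIs. The names `previewInFrame` and `withTimeout` are illustrative, not the tool's actual code; a timeout heuristic is used because browsers do not expose a direct "embedding blocked" signal to scripts.

```javascript
// Pure helper: resolve to 'loaded' or 'timeout', whichever comes first.
function withTimeout(promise, ms) {
  return Promise.race([
    promise.then(() => 'loaded'),
    new Promise((resolve) => setTimeout(() => resolve('timeout'), ms)),
  ]);
}

// Browser-only: attach a locked-down iframe and wait for its load event.
function previewInFrame(url, container, timeoutMs = 5000) {
  const frame = document.createElement('iframe');
  frame.sandbox = ''; // strictest sandbox: no scripts, forms, or navigation
  const loaded = new Promise((resolve) => {
    frame.addEventListener('load', resolve, { once: true });
  });
  frame.src = url;
  container.appendChild(frame);
  return withTimeout(loaded, timeoutMs); // 'loaded' or 'timeout'
}
```

When `previewInFrame` resolves to `'timeout'`, the UI can fall back to showing the "Open in new tab" advice.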

Copy – Copies the normalized URL to the clipboard for reference or manual inspection.
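A sketch of what "normalized" could mean for the Copy action, assuming the standard URL and Clipboard APIs; `normalizeForCopy` and `copyLink` are hypothetical helper names, not the tool's actual internals.

```javascript
// Resolve the href against the page URL and drop the fragment, since
// fragments identify in-page anchors rather than distinct pages.
function normalizeForCopy(href, baseUrl) {
  const url = new URL(href, baseUrl);
  url.hash = '';
  return url.toString();
}

// Browser-only: navigator.clipboard requires a secure (https) context.
async function copyLink(href, baseUrl) {
  await navigator.clipboard.writeText(normalizeForCopy(href, baseUrl));
}
```

For example, `normalizeForCopy('/repo/page.html#top', 'https://user.github.io/')` yields `https://user.github.io/repo/page.html`.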