Scraping Crawling Tools resources

8 resources

Apify

Scraping Crawling Tools

Cloud platform for web automation, scraping, and actor-based data workflows.

Tool, Needs verification, Open source, Self-hosted, Freemium, GitHub linked

Crawl4AI

Scraping Crawling Tools

Open-source crawler and scraper designed for LLM-friendly web extraction.

Tool, Needs verification, Open source, Self-hosted, GitHub linked

Crawlee

Scraping Crawling Tools

Open-source web scraping and browser automation library from Apify.

Tool, Needs verification, Open source, Self-hosted, GitHub linked

Firecrawl

Scraping Crawling Tools

Tool and API for turning websites into LLM-ready markdown or structured data.

Tool, Needs verification, Open source, Self-hosted, Freemium, GitHub linked

Scrapling

Scraping Crawling Tools

Python scraping library focused on resilient extraction from changing pages.

Tool, Needs verification, Open source, Self-hosted, GitHub linked

Scrapy

Scraping Crawling Tools

Open-source Python framework for building crawlers and web data extraction pipelines.

Tool, Needs verification, Open source, Self-hosted, GitHub linked

D4Vinci/Scrapling

Python

🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!

Repository, Auto enriched, Python, BSD-3-Clause, High stars, Active signal

apify/crawlee

TypeScript

Crawlee: a web scraping and browser automation library for Node.js for building reliable crawlers in JavaScript and TypeScript. Extracts data for AI, LLMs, RAG, or GPTs. Downloads HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP, in both headful and headless mode, with proxy rotation.

Repository, Auto enriched, TypeScript, Apache-2.0, Active signal