## Why this exists
Many websites have paginated data (portfolios, listings, directories) with no export button. You need a way to extract it without setting up Puppeteer, Selenium, or a full scraping framework. This skill generates a browser console script you paste on each page. It accumulates data in localStorage as you navigate, then downloads everything as clean JSON.
No dependencies. No login tokens. No headless browser. Just paste, navigate, paste, download.
## How it works
- **You tell Claude what to scrape.** Give it the URL, tell it what fields you want (title, image, price, link, etc.), and how the site paginates (numbered pages, a load-more button, or infinite scroll). Claude asks follow-up questions if anything is unclear.
- **Claude generates a browser console script.** A custom JavaScript snippet tailored to that specific site. It uses CSS selectors to find the data on the page, deduplicates automatically, and stores everything in your browser's `localStorage` so nothing is lost between pages.
- **You paste the script on each page.** Open DevTools, paste, press Enter. Navigate to the next page. Paste again. Repeat. Each paste adds new items to the running total. You can paste on the same page twice without creating duplicates.
- **You download the collected data.** When you have scraped all the pages, type `downloadData()` in the console. A JSON file downloads to your computer with every item collected across all pages.
- **Claude processes and cleans the output.** Hand the JSON file back to Claude. It generates a Node.js script to deduplicate, normalise text, filter out incomplete entries, and optionally download all images locally. You end up with clean, ready-to-use data.
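To make the pattern concrete, here is a minimal sketch of the kind of script this skill generates. The selectors (`.project-card`, `h3`), field names, and the storage key are assumptions for an imagined portfolio grid — the real script is tailored to your site. The dedup-and-merge step is pulled out as a pure helper so the behaviour is easy to follow.

```javascript
// Hypothetical example of a generated console script. Selectors and the
// storage key are placeholders; the actual script is site-specific.
(function () {
  const KEY = 'scrape_data_v1'; // assumed localStorage key

  // Pure helper: merge newly found items into the existing set,
  // deduplicating by link so re-pasting on the same page is safe.
  function mergeItems(existing, found) {
    const seen = new Set(existing.map((item) => item.link));
    const fresh = found.filter((item) => !seen.has(item.link));
    return existing.concat(fresh);
  }

  // Browser-only part: read items off the page and persist progress.
  if (typeof document !== 'undefined' && typeof localStorage !== 'undefined') {
    const found = Array.from(document.querySelectorAll('.project-card')).map((card) => ({
      title: card.querySelector('h3')?.textContent.trim() ?? '',
      imageUrl: card.querySelector('img')?.src ?? '',
      link: card.querySelector('a')?.href ?? '',
    }));
    const existing = JSON.parse(localStorage.getItem(KEY) || '[]');
    const merged = mergeItems(existing, found);
    localStorage.setItem(KEY, JSON.stringify(merged));
    console.log(
      `%cPage scraped! New items found: ${merged.length - existing.length}. Total collected: ${merged.length}`,
      'color: green'
    );
  }

  // Expose the helper so it can be inspected from the console.
  globalThis.mergeItems = mergeItems;
})();
```

Because `localStorage` survives navigation within the same origin, the accumulated array persists as you move from page to page.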
## Step by step (for first-time users)
Never used browser DevTools before? Follow this exactly.
- **Open Claude Code and type `/web-scrape`.** Claude will ask you a few questions: what website, what data you want, and how the pages work. Answer in plain English. You do not need to know CSS selectors or JavaScript.
- **Claude gives you a script. Copy it.** Select all the code Claude outputs and copy it to your clipboard (Ctrl+C on Windows, Cmd+C on Mac).
- **Open the website you want to scrape.** Go to the first page of whatever you want to extract. Make sure you are logged in if the content requires it.
- **Open DevTools.** Press `F12` on your keyboard. A panel opens on the right or bottom of your browser. Click the tab that says Console.
- **Paste the script and press Enter.** Click inside the Console area, press Ctrl+V (or Cmd+V), then press Enter. You will see a green message confirming how many items were found on this page.
- **Go to the next page.** Click the "Next" button, or page 2, or scroll down if it is infinite scroll. The page changes. Your data is safely stored.
- **Paste the script again and press Enter.** Same thing. Paste, Enter. The counter goes up. New items are added to your total.
- **Repeat until you have covered all pages.** Keep going: navigate, paste, Enter. The console message tells you the running total each time.
- **Download your data.** When you are done, type `downloadData()` in the console and press Enter. A `.json` file downloads to your computer.
- **Give the file back to Claude.** Go back to Claude Code and tell it you have the file. Claude will process it: remove duplicates, clean up the text, and give you the final output ready to use.
## Console output
Each time you paste the script on a new page, you see this in your console:
```
Page scraped!
New items found: 48
Total collected: 192
Next: go to the next page and paste this script again.
```
Three utility commands are always available:
- `checkCount()` — see how many items you have so far
- `downloadData()` — save the JSON when you are done
- `clearData()` — start over if something went wrong
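As a rough illustration of what these three commands do under the hood, here is a hedged sketch. The storage key and filename are assumptions; the download trick (a `Blob` plus a temporary anchor element) is a standard browser pattern, not necessarily the exact code the skill emits.

```javascript
// Hypothetical implementations of the three utility commands.
// KEY and the download filename are placeholder assumptions.
const KEY = 'scrape_data_v1';

// Report how many items are stored so far.
function checkCount() {
  const items = JSON.parse(localStorage.getItem(KEY) || '[]');
  console.log(`Items collected so far: ${items.length}`);
  return items.length;
}

// Save the accumulated JSON as a file via a temporary <a download> link.
function downloadData() {
  const json = localStorage.getItem(KEY) || '[]';
  const blob = new Blob([json], { type: 'application/json' });
  const a = document.createElement('a');
  a.href = URL.createObjectURL(blob);
  a.download = 'scraped-data.json';
  a.click();
  URL.revokeObjectURL(a.href);
}

// Wipe the stored items to restart from page 1.
function clearData() {
  localStorage.removeItem(KEY);
  console.log('Stored data cleared. Start again from page 1.');
}
```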
## Data structure
The downloaded JSON contains an array of objects. Fields are configured per project. A typical output looks like this:
```json
[
  {
    "title": "Project Name",
    "imageUrl": "https://cdn.example.com/img/project.jpg",
    "category": "Design",
    "date": "2025-11",
    "slug": "project-name",
    "link": "https://example.com/projects/project-name"
  },
  {
    "title": "Another Project",
    "imageUrl": "https://cdn.example.com/img/another.jpg",
    "category": "Illustration",
    "date": "2025-09",
    "slug": "another-project",
    "link": "https://example.com/projects/another-project"
  }
]
```
## Honest take
**What it does well:** Zero setup. No npm installs, no API keys, no headless browser configuration. You paste a script, navigate pages, and download JSON. It handles deduplication automatically, so you can re-paste on the same page without creating duplicates. The localStorage pattern means your progress survives page navigation and even accidental tab closes. I built this approach for my own portfolio migration and it worked on the first try.
**What it does not do:** It cannot log in for you. The script runs in your own authenticated browser session, so content behind a login works only when you are already signed in. It does not bypass anti-bot protections like Cloudflare challenges. It will not work on sites that render content inside iframes or shadow DOM without adjustments. And it requires you to manually navigate each page. If you need fully automated scraping of hundreds of pages, you still want Puppeteer or Playwright.
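For the shadow DOM case specifically, the adjustment is usually small when the shadow root is open. Here is a hedged sketch of a helper you might ask Claude to fold into the script; the `my-gallery` host tag and `.card` selector are made-up examples, and closed shadow roots are not reachable this way at all.

```javascript
// Hypothetical helper for scraping inside an OPEN shadow root.
// Host tag and inner selector are placeholder assumptions.
function queryShadow(hostSelector, innerSelector) {
  const host = document.querySelector(hostSelector);
  if (!host || !host.shadowRoot) return []; // no host, or closed shadow root
  return Array.from(host.shadowRoot.querySelectorAll(innerSelector));
}

// Example: queryShadow('my-gallery', '.card') instead of
// document.querySelectorAll('.card')
```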
**When to use it:** Any time you need data from a paginated website where the total pages are manageable (under 50 pages or so). Portfolio platforms, design contest sites, business directories, product listings, job boards. If you can see the data in your browser, this can extract it.