I Found a Loophole to (Successfully) Web Scrape Using ChatGPT. Here’s How it Works

The PyCoach on 2022-12-17

Scrape any website with ChatGPT using this approach (demo with Amazon and Twitter)

Photo by Mikhail Nilov on Pexels

In a previous article, I showed how to scrape websites by writing simple prompts for ChatGPT like “scrape website X using Python.”

But that doesn’t always work.

Actually, after trying to scrape dozens of websites using ChatGPT, I concluded that plain prompts like that one almost never work for web scraping.

But I found another approach that will help us scrape any website out there using ChatGPT and some basic HTML.

First Things First — Use The Advanced Version of ChatGPT (Playground)

To quickly scrape websites using ChatGPT, we’ll use what I call the advanced version of ChatGPT: Playground. Strictly speaking, Playground is OpenAI’s web interface to its GPT-3 models rather than ChatGPT itself, but it has fewer restrictions and is much faster at generating code.

Here’s how it looks.

As you can see, this is different from the classic view of ChatGPT, where you can only type a prompt. On Playground, you also get more customization options.

Fewer restrictions and no more slow responses.

For this tutorial, we’ll write our prompts in the box below the “Playground” title.

How to Scrape Any Website with ChatGPT

To explain how we’re going to use ChatGPT to scrape any website we want, we’ll start with a simple website called subslikescript that has a list of movies.

Later in this guide, I’ll show you how to use the same approach to scrape sites such as Amazon and Twitter, but let’s keep it simple for now.
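To make the goal concrete, here is a rough sketch of the kind of script we’d like ChatGPT to end up writing for us. It is only an illustration under stated assumptions: the sample markup, the `scripts-list` class name, and the helper names `TitleParser` and `extract_titles` are hypothetical stand-ins, not something taken from the real page. It also uses Python’s built-in `html.parser` on an inline snippet so it runs without downloading anything.

```python
from html.parser import HTMLParser

# Hypothetical sample markup. The real page's structure and the
# "scripts-list" class name are assumptions made for this sketch.
SAMPLE_HTML = """
<ul class="scripts-list">
  <li><a href="/movie/example-1">Example Movie One</a></li>
  <li><a href="/movie/example-2">Example Movie Two</a></li>
</ul>
<a href="/about">About</a>
"""


class TitleParser(HTMLParser):
    """Collect the text of every <a> found inside <ul class="scripts-list">."""

    def __init__(self):
        super().__init__()
        self.in_list = False   # True while inside the target <ul>
        self.in_link = False   # True while inside an <a> within that <ul>
        self.titles = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "ul" and "scripts-list" in classes:
            self.in_list = True
        elif tag == "a" and self.in_list:
            self.in_link = True
            self.titles.append("")  # start a new title

    def handle_endtag(self, tag):
        if tag == "ul":
            self.in_list = False
        elif tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if self.in_link:
            self.titles[-1] += data


def extract_titles(html):
    """Return the link texts inside the target list, stripped of whitespace."""
    parser = TitleParser()
    parser.feed(html)
    return [title.strip() for title in parser.titles]


print(extract_titles(SAMPLE_HTML))
```

Notice that the script depends on knowing which HTML element holds the titles. That detail is what a plain prompt leaves out.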

If we try a basic prompt like “scrape the movie titles on this website: https://subslikescript.com/movies” it won’t scrape anything. Here’s…