
sweetCAPTCHA

some scraper prevention thing I made lol

How it works

Configuration is per domain name. The severity can be set to one of the following:

  • always: always challenge an untrusted IP
  • whitelist: always challenge an untrusted IP unless the User-Agent is in the whitelist
  • blacklist: never challenge an untrusted IP unless the User-Agent is in the blacklist
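The three modes boil down to one small decision. Here is a rough sketch of it in Python. The function name, its parameters, and the substring matching on the User-Agent are all assumptions for illustration, not necessarily what server.py does.

```python
def should_challenge(severity: str, ip_trusted: bool,
                     user_agent: str, ua_list: list[str]) -> bool:
    """Decide whether a request gets the challenge (hypothetical helper)."""
    # A trusted IP is never challenged, whatever the severity.
    if ip_trusted:
        return False
    # Assumption: a User-Agent counts as "in the list" when any entry
    # appears in it as a case-insensitive substring.
    listed = any(entry.lower() in user_agent.lower() for entry in ua_list)
    if severity == "always":
        return True
    if severity == "whitelist":
        return not listed   # listed agents pass, everyone else is challenged
    if severity == "blacklist":
        return listed       # only listed agents are challenged
    raise ValueError(f"unknown severity: {severity!r}")
```

For example, `should_challenge("blacklist", False, "curl/8.0", ["curl"])` challenges the request, while the same call with a browser User-Agent lets it through.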

When the IP and User-Agent aren't trusted, the client is shown a challenge:

[Screenshot of the sweetCAPTCHA challenge]

The {{uuid}} placeholder is replaced by a base64-encoded UUIDv4 string. JavaScript on the page decodes it and sets the cookie automatically.

Even with JavaScript disabled, a user can still decode the string by hand and set the sweetcaptcha-uuid cookie themselves.
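To make the encode/decode round trip concrete, here is a minimal Python sketch. It assumes the UUID is encoded in its usual 36-character text form (not as raw bytes), and the helper names are made up for this example.

```python
import base64
import uuid


def encode_challenge(token: uuid.UUID) -> str:
    """Base64-encode the UUID's text form (assumed encoding)."""
    return base64.b64encode(str(token).encode("ascii")).decode("ascii")


def decode_challenge(encoded: str) -> str:
    """Recover the value that would go into the sweetcaptcha-uuid cookie."""
    return base64.b64decode(encoded).decode("ascii")


token = uuid.uuid4()                      # server side: fresh UUIDv4
encoded = encode_challenge(token)         # what would replace {{uuid}}
cookie_value = decode_challenge(encoded)  # client side: decoded cookie value
```

Someone without JavaScript could do the decode step with any base64 tool and then set the cookie by hand in their browser.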

When the UUID cookie is correct, the IP will be permanently whitelisted.

NGINX example

server {
        listen 443 ssl http2;
        listen [::]:443 ssl http2;

        server_name some.protected.page;

        location @correct {
                # The real page.
                proxy_pass http://127.0.0.1:5378;
        }
        location / {
                # Every request goes to the CAPTCHA backend first.
                proxy_pass http://127.0.0.1:5283;
                # X-Real-IP is required for whitelisting IPs.
                proxy_set_header X-Real-IP $remote_addr;
                # Host is required to select the per-domain configuration.
                proxy_set_header Host $host;
                proxy_intercept_errors on;
                # The CAPTCHA answers 403 once the IP has completed the challenge,
                # and nginx then serves the real page from @correct.
                error_page 403 = @correct;
        }
        ssl_certificate /etc/letsencrypt/live/some.protected.page/fullchain.pem; # managed by Certbot
        ssl_certificate_key /etc/letsencrypt/live/some.protected.page/privkey.pem; # managed by Certbot
        include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
        ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot

}
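
The config above relies on the backend answering 403 for IPs that have passed. A rough Python sketch of that handoff follows. Everything here is an assumption for illustration: the function name, keeping one pending UUID per IP, and serving the challenge page with status 200. It also leaves out the per-domain severity check.

```python
def respond(headers: dict, cookies: dict,
            trusted_ips: set, pending: dict) -> int:
    """Return the status code the CAPTCHA backend would send (hypothetical)."""
    ip = headers["X-Real-IP"]  # set by nginx via proxy_set_header
    if ip in trusted_ips:
        # Already whitelisted: 403 makes nginx fall through to @correct.
        return 403
    expected = pending.get(ip)  # UUID issued to this IP, if any (assumption)
    if expected is not None and cookies.get("sweetcaptcha-uuid") == expected:
        # Correct cookie: whitelist the IP and hand over to the real page.
        trusted_ips.add(ip)
        del pending[ip]
        return 403
    # Not trusted yet: serve the challenge page (status 200 is an assumption).
    return 200
```

Because nginx has proxy_intercept_errors on, the client never sees the 403 itself. It only sees whatever the @correct location returns.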