Running Flaresolverr on AWS Lambda: A Serverless Approach

Flaresolverr is a proxy server designed to bypass Cloudflare's anti-bot protection, widely used in automation and scraping pipelines. Traditionally, it runs as a long-lived Docker container, but keeping a container running 24/7 is inefficient for sporadic workloads.

In this post, we explore how we re-engineered Flaresolverr to run inside AWS Lambda, leveraging the power of serverless computing. We’ll dive into the specific configurations, code changes, and the advantages this architecture brings.

The Challenge: Browsers in Lambda

Running a full web browser like Chromium inside AWS Lambda is not straightforward. You face several constraints:

  • Memory & CPU: Browsers are resource-hungry.
  • Filesystem: Lambda has a read-only filesystem (except /tmp).
  • Execution Model: Lambda functions are ephemeral; they don’t keep state between invocations in the same way a daemon does.
  • Startup Time: “Cold starts” can be slow when launching a browser.

The Solution: Configuration & Code Changes

We successfully adapted Flaresolverr for AWS Lambda. Here is a breakdown of the key changes.

1. Framework Switch: Bottle to FastAPI + Mangum

The original Flaresolverr used the Bottle framework. To make it compatible with AWS Lambda’s event-driven model, we switched to FastAPI.

  • FastAPI: Provides a modern, high-performance web framework.
  • Mangum: An adapter that allows ASGI applications (like FastAPI) to run on AWS Lambda.

This change allows the application to handle standard HTTP requests locally while seamlessly processing Lambda events in production.

2. Containerization with Docker

We moved to a container-based Lambda function. This allows us to package all dependencies, including the browser and the AWS Lambda Runtime Interface Client (RIC), into a single image.

Key Dockerfile adjustments:

  • Base image: python:3.11-slim-bookworm.
  • Installed awslambdaric to interface with the Lambda Runtime API.
  • Included necessary system dependencies for Chromium.
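A condensed sketch of what that Dockerfile can look like (exact package names and the handler module are assumptions, not the project's verbatim file):

```dockerfile
FROM python:3.11-slim-bookworm

# System libraries Chromium needs in a headless environment
RUN apt-get update && apt-get install -y --no-install-recommends \
        chromium libnss3 libatk-bridge2.0-0 libgtk-3-0 libasound2 \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY requirements.txt .
# awslambdaric implements the client side of the Lambda Runtime API
RUN pip install --no-cache-dir -r requirements.txt awslambdaric

COPY . .

# Hand control to the Runtime Interface Client, pointing at the Mangum handler
ENTRYPOINT ["python", "-m", "awslambdaric"]
CMD ["main.handler"]
```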

3. Browser Tuning for Serverless

The most critical part was configuring the underlying browser (via nodriver) to survive in the Lambda environment. We modified the configuration to launch Chromium with specific flags:

  • --headless=new: Essential for serverless.
  • --no-sandbox: Required because Lambda's execution environment doesn't provide the privileges Chromium's sandbox depends on.
  • --disable-gpu: Lambda has no GPU.
  • --disable-dev-shm-usage: Prevents usage of /dev/shm (shared memory), which is limited in Lambda.
  • --no-zygote: Disables the zygote process to save memory and startup time.
  • --disable-setuid-sandbox: Further sandbox disabling.
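Putting the flags together, a small helper can assemble the launch arguments, including a profile directory under /tmp, the only writable path in Lambda. The nodriver usage shown in the comment is an assumption based on its `browser_args` start parameter:

```python
# Lambda-specific Chromium flags, mirroring the list above.
LAMBDA_BROWSER_ARGS = [
    "--headless=new",
    "--no-sandbox",
    "--disable-gpu",
    "--disable-dev-shm-usage",
    "--no-zygote",
    "--disable-setuid-sandbox",
]

def build_browser_args(user_data_dir: str = "/tmp/chrome-profile") -> list[str]:
    """Combine the serverless flags with a writable profile directory.

    /tmp is the only writable filesystem in Lambda, so the browser
    profile must live there.
    """
    return LAMBDA_BROWSER_ARGS + [f"--user-data-dir={user_data_dir}"]

# Hypothetical usage with nodriver (requires Chromium inside the image):
#   import nodriver as uc
#   browser = await uc.start(browser_args=build_browser_args())
```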

4. Serverless Framework Configuration

We used the Serverless Framework to orchestrate the deployment.

  • High Memory (3008 MB): In AWS Lambda, CPU power scales with allocated memory. We allocated ~3 GB, which yields more than a full vCPU, significantly speeding up browser startup and page loads.
  • ARM64 Architecture: We switched to ARM64 (Graviton2) processors. They are cheaper and often faster than x86 for this workload.
  • Timeout: Set to 45 seconds to accommodate browser spin-up and solving time.
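These settings map to a few lines of serverless.yml. The fragment below is a sketch with illustrative names, not our exact configuration:

```yaml
service: flaresolverr-lambda

provider:
  name: aws
  architecture: arm64     # Graviton2: cheaper per GB-second
  memorySize: 3008        # CPU scales with memory
  timeout: 45             # browser spin-up + challenge solving
  ecr:
    images:
      flaresolverr:
        path: ./          # build from the local Dockerfile

functions:
  solver:
    image:
      name: flaresolverr
    url: true             # Lambda Function URL for direct HTTP access
```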

Advantages of the Serverless Approach

  1. Effortless Scalability: AWS Lambda automatically scales out up to your account's concurrency limit. If you have a burst of 1000 requests, Lambda spins up concurrent execution environments to match. No more queueing behind a single Docker container.
  2. Cost Efficiency: You pay only for the milliseconds your code runs. For sporadic scraping tasks, this is significantly cheaper than maintaining a 24/7 VPS or EC2 instance.
  3. Zero Maintenance: No OS patches, no server management. You just deploy the container image.
  4. Abuse Prevention: We enforced mandatory proxy usage in the configuration, ensuring that the Lambda IP itself isn’t burned or abused.
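The proxy enforcement from point 4 can be expressed as a simple guard at request time. This is a sketch; the field names and accepted schemes are assumptions, not the project's actual schema:

```python
from typing import Optional

class ProxyRequiredError(ValueError):
    """Raised when a solve request arrives without a usable proxy."""

def validate_proxy(proxy_url: Optional[str]) -> str:
    # Reject requests that would make the Lambda egress IP do the scraping.
    if not proxy_url or not proxy_url.startswith(("http://", "https://", "socks5://")):
        raise ProxyRequiredError("A proxy is required for every request")
    return proxy_url
```

Running this check before launching the browser means a misconfigured client fails fast, without ever consuming browser time.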

Conclusion

By combining Docker, FastAPI, and careful browser tuning, we transformed Flaresolverr into a scalable, serverless solution. This setup offers a robust way to handle Cloudflare challenges on demand without the overhead of managing permanent infrastructure.
