homelab-opentofu/modules/20-services-apps/crawl4ai
Yuris Cakranegara a63f144bf1 feat: add crawl4ai
2025-06-30 22:22:08 +10:00

Crawl4AI Module

This module deploys Crawl4AI, a web crawling and AI analysis tool, as a Docker container in the homelab environment.

Overview

The Crawl4AI module:

  • Deploys the unclecode/crawl4ai Docker container
  • Configures memory limits and reservations
  • Provides shared memory access for Chrome/Chromium performance
  • Supports custom configuration through volume mounting
  • Provides a service definition for integration with networking modules

Usage

module "crawl4ai" {
  source      = "./modules/20-services-apps/crawl4ai"
  volume_path = "/path/to/volumes"
  networks    = ["homelab-network"]
}

Variables

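| Variable    | Description                                 | Type         | Default  |
|-------------|---------------------------------------------|--------------|----------|
| image_tag   | Tag of the Crawl4AI image to use            | string       | "latest" |
| volume_path | Host path for Crawl4AI data volumes         | string       | -        |
| networks    | List of networks to attach the container to | list(string) | []       |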
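
For reference, the table above corresponds to variable declarations along these lines. This is a sketch reconstructed from the table only; the module's actual variables.tf is not shown here and may differ (for example, it may include validation rules or further variables such as the memory_limit used in the integration example).

```hcl
# Sketch of variables.tf, reconstructed from the table above (assumption).

variable "image_tag" {
  description = "Tag of the Crawl4AI image to use"
  type        = string
  default     = "latest"
}

variable "volume_path" {
  description = "Host path for Crawl4AI data volumes"
  type        = string
  # No default: callers must set this value.
}

variable "networks" {
  description = "List of networks to attach the container to"
  type        = list(string)
  default     = []
}
```

Because image_tag defaults to "latest", setting it to a specific released tag in the module call generally gives more reproducible deployments.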

Outputs

| Output             | Description                                                |
|--------------------|------------------------------------------------------------|
| service_definition | Service definition for integration with networking modules |

Service Definition

This module outputs a service definition that is used by the networking modules to expose the service.

{
  name         = "crawl4ai"
  primary_port = 11235
  endpoint     = "http://crawl4ai:11235"
}
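
One way such an output could be declared is sketched below. The local names service_name and primary_port are hypothetical, introduced only to show that the endpoint can be derived from the container name and port; the module's actual outputs.tf may simply hard-code the values.

```hcl
# Sketch of outputs.tf (assumption: local names are illustrative).

locals {
  service_name = "crawl4ai"
  primary_port = 11235
}

output "service_definition" {
  description = "Service definition for integration with networking modules"
  value = {
    name         = local.service_name
    primary_port = local.primary_port
    # Other containers on the same Docker network reach the service by name.
    endpoint = "http://${local.service_name}:${local.primary_port}"
  }
}
```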

Environment Variables

Crawl4AI reads API keys for the LLM providers it integrates with from a .env file in the module directory. Create this file from the provided .env.example template and fill in the keys for the providers you use:

  • OPENAI_API_KEY: OpenAI API key
  • DEEPSEEK_API_KEY: DeepSeek API key
  • ANTHROPIC_API_KEY: Anthropic API key
  • GROQ_API_KEY: Groq API key
  • TOGETHER_API_KEY: Together API key
  • MISTRAL_API_KEY: Mistral API key
  • GEMINI_API_TOKEN: Gemini API token
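
How the module consumes the .env file is not shown in this README. As one possible approach, the sketch below reads the file with built-in functions and turns it into a list of KEY=value strings that a container resource could accept as environment variables. The local names are hypothetical, and startswith assumes a reasonably recent OpenTofu or Terraform release.

```hcl
# Sketch only: parse .env into a list of "KEY=value" strings (assumption,
# not necessarily how this module loads the file).

locals {
  # Split the file into lines and strip surrounding whitespace
  # (including any trailing carriage return).
  env_lines = [
    for line in split("\n", file("${path.module}/.env")) : trimspace(line)
  ]

  # Drop blank lines and comment lines.
  container_env = [
    for line in local.env_lines : line
    if line != "" && !startswith(line, "#")
  ]
}
```

With this approach, file() fails at plan time if .env does not exist, and any value passed into a resource this way is stored in the state file, so the state should be treated as sensitive.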

Configuration

Crawl4AI requires a custom configuration file. This is mounted from ${volume_path}/crawl4ai/config.yml to /app/config.yml in the container.
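
The mount described above could be expressed as follows if the container is managed with the kreuzwerker/docker provider. That is an assumption: the module's implementation is not shown here, and it may use a different mechanism (the references to a .env file suggest Docker Compose is also possible). The resource name is illustrative.

```hcl
# Sketch only, assuming the kreuzwerker/docker provider.

resource "docker_container" "crawl4ai" {
  name  = "crawl4ai"
  image = "unclecode/crawl4ai:${var.image_tag}"

  # Mount the host configuration file into the container.
  volumes {
    host_path      = "${var.volume_path}/crawl4ai/config.yml"
    container_path = "/app/config.yml"
  }

  # Attach the container to each requested network.
  dynamic "networks_advanced" {
    for_each = var.networks
    content {
      name = networks_advanced.value
    }
  }
}
```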

Ports

Crawl4AI exposes one port, which is mapped to a host port defined in the .env file:

  1. API server (port 11235) - The main HTTP interface for submitting crawl requests

Example Integration in Main Configuration

module "crawl4ai" {
  source       = "./modules/20-services-apps/crawl4ai"
  volume_path  = module.system_globals.volume_host
  networks     = [module.services.homelab_docker_network_name]
  memory_limit = 8192  # 8 GB; set this only if you need more memory than the default
}

# The service definition is automatically included in the services output
module "services" {
  source = "./modules/services"
  # ...
  service_definitions = [
    module.crawl4ai.service_definition,
    # Other service definitions
  ]
}
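
On the receiving side, the services module needs an input that accepts these objects. A minimal sketch, assuming the object shape shown under Service Definition; the services module's real variable and the local name services_by_name are not documented here and are illustrative.

```hcl
# Sketch of the services module's input (assumption).

variable "service_definitions" {
  description = "Service definitions collected from application modules"
  type = list(object({
    name         = string
    primary_port = number
    endpoint     = string
  }))
  default = []
}

locals {
  # Index the definitions by name for easy lookup,
  # e.g. local.services_by_name["crawl4ai"].endpoint
  services_by_name = {
    for s in var.service_definitions : s.name => s
  }
}
```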