deer-flow/docker/k8s/README.md
JeffJiang b6da3a219e Add Kubernetes-based sandbox provider for multi-instance support (#19)
2026-02-09 21:59:13 +08:00

Kubernetes Sandbox Setup

This guide explains how to deploy and configure the DeerFlow sandbox execution environment on Kubernetes.

Overview

The Kubernetes sandbox deployment allows you to run DeerFlow's code execution sandbox in a Kubernetes cluster, providing:

  • Isolated Execution: Sandbox runs in dedicated Kubernetes pods
  • Scalability: Easy horizontal scaling with replica configuration
  • Cluster Integration: Seamless integration with existing Kubernetes infrastructure
  • Persistent Skills: Skills directory mounted from host or PersistentVolume

Prerequisites

Before you begin, ensure you have:

  1. Kubernetes Cluster: One of the following:

    • Docker Desktop with Kubernetes enabled
    • OrbStack with Kubernetes enabled
    • Minikube
    • Any production Kubernetes cluster
  2. kubectl: Kubernetes command-line tool

    # macOS
    brew install kubectl
    
    # Linux
    # See: https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/
    
  3. Docker: For pulling the sandbox image (optional, but recommended)

    # Verify installation
    docker version
    

Quick Start

1. Enable Kubernetes

Docker Desktop:

Settings → Kubernetes → Enable Kubernetes → Apply & Restart

OrbStack:

Settings → Enable Kubernetes

Minikube:

minikube start

2. Run Setup Script

The easiest way to get started is the setup script:

cd docker/k8s
./setup.sh

This will:

  • Check kubectl installation and cluster connectivity
  • Pull the sandbox Docker image (optional; skip with --skip-pull)
  • Create the deer-flow namespace
  • Create the sandbox Service and Deployment
  • Verify the deployment is running

3. Configure Backend

Add the following to backend/config.yaml:

sandbox:
  use: src.community.aio_sandbox:AioSandboxProvider
  base_url: http://deer-flow-sandbox.deer-flow.svc.cluster.local:8080
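
The base_url above follows the usual Kubernetes Service DNS pattern, <service>.<namespace>.svc.cluster.local, which resolves only from inside the cluster. As a small illustration, the helper below assembles that URL from its parts, which may be useful if you deploy under a different namespace or port. The function name sandbox_base_url is hypothetical and not part of setup.sh, and cluster.local is assumed to be your cluster domain (the common default).

```shell
# Build the in-cluster URL for a Service.
# Arguments: service name, namespace, port (defaults mirror this guide).
sandbox_base_url() {
  service="${1:-deer-flow-sandbox}"
  namespace="${2:-deer-flow}"
  port="${3:-8080}"
  printf 'http://%s.%s.svc.cluster.local:%s\n' "$service" "$namespace" "$port"
}

sandbox_base_url                                  # URL used in this guide
sandbox_base_url deer-flow-sandbox staging 8080   # same service, other namespace
```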

4. Verify Deployment

Check that the sandbox pod is running:

kubectl get pods -n deer-flow

You should see:

NAME                                 READY   STATUS    RESTARTS   AGE
deer-flow-sandbox-xxxxxxxxxx-xxxxx   1/1     Running   0          1m
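
Instead of polling the pod list by hand, you can let kubectl block until the rollout finishes. This is a minimal sketch: the wrapper name wait_for_sandbox and the 120s default are assumptions made for illustration, while the Deployment name and namespace are the ones used throughout this guide.

```shell
# Block until the Deployment's pods are ready, or fail after a timeout.
# The 120s default is an arbitrary choice for this sketch.
wait_for_sandbox() {
  timeout="${1:-120s}"
  kubectl rollout status deployment/deer-flow-sandbox \
    -n deer-flow --timeout="$timeout"
}

# Usage:
#   wait_for_sandbox        # default timeout
#   wait_for_sandbox 5m     # longer timeout for slow image pulls
```

The command exits non-zero if the rollout does not complete in time, so it can gate later steps in a script.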

Advanced Configuration

Custom Skills Path

By default, the setup script uses PROJECT_ROOT/skills. You can specify a custom path:

Using command-line argument:

./setup.sh --skills-path /custom/path/to/skills

Using environment variable:

SKILLS_PATH=/custom/path/to/skills ./setup.sh

Custom Sandbox Image

To use a different sandbox image:

Using command-line argument:

./setup.sh --image your-registry/sandbox:tag

Using environment variable:

SANDBOX_IMAGE=your-registry/sandbox:tag ./setup.sh

Skip Image Pull

If you already have the image locally or want to pull it manually later:

./setup.sh --skip-pull

Combined Options

./setup.sh --skip-pull --skills-path /custom/skills --image custom/sandbox:latest

Manual Deployment

If you prefer manual deployment or need more control:

1. Create Namespace

kubectl apply -f namespace.yaml

2. Create Service

kubectl apply -f sandbox-service.yaml

3. Deploy Sandbox

First, update the skills path in sandbox-deployment.yaml:

# Substitute __SKILLS_PATH__ with the absolute path to your skills directory
sed 's|__SKILLS_PATH__|/absolute/path/to/deer-flow/skills|g' \
  sandbox-deployment.yaml | kubectl apply -f -

Or manually edit sandbox-deployment.yaml and replace __SKILLS_PATH__ with your skills directory path.
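
The substitution can also be wrapped in a small function that refuses to render the manifest when the skills directory is missing, which catches the "Skills path not found" problem before anything reaches the cluster. This is a sketch only; render_sandbox_deployment is a hypothetical name and is not something setup.sh is known to provide.

```shell
# Render the deployment template with a real skills path.
# Fails early if the path is not an existing directory.
render_sandbox_deployment() {
  template="$1"
  skills_path="$2"
  if [ ! -d "$skills_path" ]; then
    echo "skills path not found: $skills_path" >&2
    return 1
  fi
  # '|' is the sed delimiter and '&' is special in the replacement,
  # so the path must contain neither character.
  sed "s|__SKILLS_PATH__|$skills_path|g" "$template"
}

# Usage:
#   render_sandbox_deployment sandbox-deployment.yaml /absolute/path/to/skills \
#     | kubectl apply -f -
```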

4. Verify Deployment

# Check all resources
kubectl get all -n deer-flow

# Check pod status
kubectl get pods -n deer-flow

# Check pod logs
kubectl logs -n deer-flow -l app=deer-flow-sandbox

# Describe pod for detailed info
kubectl describe pod -n deer-flow -l app=deer-flow-sandbox

Configuration Options

Resource Limits

Edit sandbox-deployment.yaml to adjust resource limits:

resources:
  requests:
    cpu: 100m      # Minimum CPU
    memory: 256Mi  # Minimum memory
  limits:
    cpu: 1000m     # Maximum CPU (1 core)
    memory: 1Gi    # Maximum memory
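
For quick experiments you can change the same values on the live Deployment with kubectl set resources instead of editing the file. The wrapper below is a sketch with a hypothetical name; its defaults repeat the values shown above. Changes made this way are overwritten the next time sandbox-deployment.yaml is applied.

```shell
# Patch requests and limits on the running Deployment.
# Arguments: request CPU, request memory, limit CPU, limit memory.
set_sandbox_resources() {
  kubectl set resources deployment deer-flow-sandbox -n deer-flow \
    --requests="cpu=${1:-100m},memory=${2:-256Mi}" \
    --limits="cpu=${3:-1000m},memory=${4:-1Gi}"
}

# Usage:
#   set_sandbox_resources                      # values from this guide
#   set_sandbox_resources 250m 512Mi 2000m 2Gi # a larger sandbox
```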

Scaling

Adjust the number of replicas:

spec:
  replicas: 3  # Run 3 sandbox pods

Or scale dynamically:

kubectl scale deployment deer-flow-sandbox -n deer-flow --replicas=3

Health Checks

The deployment includes readiness and liveness probes:

  • Readiness Probe: Checks if the pod is ready to serve traffic
  • Liveness Probe: Restarts the pod if it becomes unhealthy

Configure them in sandbox-deployment.yaml. For example, the readiness probe:

readinessProbe:
  httpGet:
    path: /v1/sandbox
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
  timeoutSeconds: 3
  failureThreshold: 3
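
To see what the readiness probe sees, you can request the same path by hand. The sketch below assumes the Service has been forwarded to your machine in another terminal with kubectl port-forward -n deer-flow svc/deer-flow-sandbox 8080:8080; the function name probe_sandbox is hypothetical.

```shell
# Mimic the readiness probe: GET /v1/sandbox with a 3-second timeout
# (matching timeoutSeconds above). curl -f treats HTTP 400+ as failure,
# which lines up with how httpGet probes judge status codes.
probe_sandbox() {
  base="${1:-http://localhost:8080}"
  if curl -fsS --max-time 3 -o /dev/null "$base/v1/sandbox"; then
    echo "ready"
  else
    echo "not ready"
    return 1
  fi
}

# Usage:
#   probe_sandbox                          # via the port-forward above
#   probe_sandbox http://some-host:8080    # any other reachable base URL
```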

Troubleshooting

Pod Not Starting

Check pod status and events:

kubectl describe pod -n deer-flow -l app=deer-flow-sandbox

Common issues:

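  • ImagePullBackOff: The cluster cannot pull the sandbox image
    • Solution: Pre-pull the image with docker pull <image>; on Minikube, which keeps its own image store, load it with minikube image load <image>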
  • Skills path not found: HostPath doesn't exist
    • Solution: Verify the skills path exists on the host
  • Resource constraints: Not enough CPU/memory
    • Solution: Adjust resource requests/limits
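
The first two issues usually surface as a container "waiting" reason, which can be read directly from the pod status. The helper below is a rough sketch under that assumption: the name diagnose_sandbox and the hint texts are made up for illustration. Pods that cannot be scheduled for lack of CPU or memory have no container status yet, so those still need kubectl describe pod.

```shell
# Print the waiting reason of each sandbox container plus a hint.
diagnose_sandbox() {
  reasons=$(kubectl get pods -n deer-flow -l app=deer-flow-sandbox \
    -o jsonpath='{.items[*].status.containerStatuses[*].state.waiting.reason}')
  if [ -z "$reasons" ]; then
    echo "no containers are waiting"
    return 0
  fi
  for reason in $reasons; do
    case "$reason" in
      ImagePullBackOff|ErrImagePull)
        echo "$reason: image cannot be pulled; check the image name and registry access" ;;
      CrashLoopBackOff)
        echo "$reason: container keeps exiting; check logs with --previous" ;;
      ContainerCreating)
        echo "$reason: often a volume mount problem; check the skills hostPath" ;;
      *)
        echo "$reason: see the events in 'kubectl describe pod'" ;;
    esac
  done
}

# Usage: diagnose_sandbox
```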

Service Not Accessible

Verify the service is running:

kubectl get service -n deer-flow
kubectl describe service deer-flow-sandbox -n deer-flow

Test connectivity from another pod:

kubectl run test-pod -n deer-flow --rm -it --image=curlimages/curl -- \
  curl http://deer-flow-sandbox.deer-flow.svc.cluster.local:8080/v1/sandbox

Check Logs

View sandbox logs:

# Follow logs in real-time
kubectl logs -n deer-flow -l app=deer-flow-sandbox -f

# View logs from previous container (if crashed)
kubectl logs -n deer-flow -l app=deer-flow-sandbox --previous

Health Check Failures

If pods show as not ready:

# List recent events; failed readiness probes appear as "Unhealthy" warnings
kubectl get events -n deer-flow --sort-by='.lastTimestamp'

# Exec into pod to debug
kubectl exec -it -n deer-flow <pod-name> -- /bin/sh

Cleanup

Remove All Resources

Using the setup script:

./setup.sh --cleanup

Or manually:

kubectl delete -f sandbox-deployment.yaml
kubectl delete -f sandbox-service.yaml
kubectl delete namespace deer-flow

Remove Specific Resources

# Delete only the deployment (keeps namespace and service)
kubectl delete deployment deer-flow-sandbox -n deer-flow

# Delete pods (they will be recreated by deployment)
kubectl delete pods -n deer-flow -l app=deer-flow-sandbox

Architecture

┌─────────────────────────────────────────────┐
│         DeerFlow Backend                    │
│  (config.yaml: base_url configured)         │
└────────────────┬────────────────────────────┘
                 │ HTTP requests
                 ↓
┌─────────────────────────────────────────────┐
│    Kubernetes Service (ClusterIP)           │
│  deer-flow-sandbox.deer-flow.svc:8080       │
└────────────────┬────────────────────────────┘
                 │ Load balancing
                 ↓
┌─────────────────────────────────────────────┐
│         Sandbox Pods (replicas)             │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  │
│  │  Pod 1   │  │  Pod 2   │  │  Pod 3   │  │
│  │ Port 8080│  │ Port 8080│  │ Port 8080│  │
│  └──────────┘  └──────────┘  └──────────┘  │
└────────────────┬────────────────────────────┘
                 │ Volume mount
                 ↓
┌─────────────────────────────────────────────┐
│         Host Skills Directory               │
│    /path/to/deer-flow/skills                │
└─────────────────────────────────────────────┘

Setup Script Reference

Command-Line Options

./setup.sh [options]

Options:
  -h, --help              Show help message
  -c, --cleanup           Remove all Kubernetes resources
  -p, --skip-pull         Skip pulling sandbox image
  --image <image>         Use custom sandbox image
  --skills-path <path>    Custom skills directory path

Environment Variables:
  SANDBOX_IMAGE      Custom sandbox image
  SKILLS_PATH        Custom skills path

Examples:
  ./setup.sh                                    # Use default settings
  ./setup.sh --skills-path /custom/path         # Use custom skills path
  ./setup.sh --skip-pull --image custom:tag     # Custom image, skip pull
  SKILLS_PATH=/custom/path ./setup.sh           # Use env variable

Production Considerations

Security

  1. Network Policies: Restrict pod-to-pod communication
  2. RBAC: Configure appropriate service account permissions
  3. Pod Security: Enable pod security standards
  4. Image Security: Scan images for vulnerabilities
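
As one possible starting point for the network policy item, the function below prints a NetworkPolicy that admits traffic to the sandbox pods on port 8080 only from pods carrying a given app label in the same namespace. This is a sketch under assumptions: the policy name and the backend label value deer-flow-backend are hypothetical, and the policy has an effect only if your cluster's network plugin enforces NetworkPolicy.

```shell
# Print a NetworkPolicy restricting ingress to the sandbox pods.
# Argument: value of the backend pods' "app" label (hypothetical default).
render_sandbox_netpol() {
  backend_label="${1:-deer-flow-backend}"
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deer-flow-sandbox-ingress
  namespace: deer-flow
spec:
  podSelector:
    matchLabels:
      app: deer-flow-sandbox
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: $backend_label
      ports:
        - protocol: TCP
          port: 8080
EOF
}

# Usage: render_sandbox_netpol my-backend-label | kubectl apply -f -
```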

High Availability

  1. Multiple Replicas: Run at least 3 replicas
  2. Pod Disruption Budget: Prevent all pods from being evicted
  3. Pod Anti-Affinity: Distribute pods across nodes (topology spread constraints also work)
  4. Resource Quotas: Set namespace resource limits
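
A Pod Disruption Budget for the sandbox could look roughly like the manifest printed below. The resource name deer-flow-sandbox-pdb is an assumption; the selector reuses the app=deer-flow-sandbox label that the commands in this guide already rely on, and the default of 2 is chosen to fit a 3-replica Deployment.

```shell
# Print a PodDisruptionBudget manifest for the sandbox pods.
# Argument: minimum number of pods that must stay available (default 2).
render_sandbox_pdb() {
  cat <<EOF
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: deer-flow-sandbox-pdb
  namespace: deer-flow
spec:
  minAvailable: ${1:-2}
  selector:
    matchLabels:
      app: deer-flow-sandbox
EOF
}

# Usage: render_sandbox_pdb | kubectl apply -f -
```

With three replicas and minAvailable set to 2, a voluntary disruption such as a node drain can evict only one sandbox pod at a time.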

Monitoring

  1. Prometheus: Scrape metrics from pods
  2. Logging: Centralized log aggregation
  3. Alerting: Set up alerts for pod failures
  4. Tracing: Distributed tracing for requests

Storage

For production, consider using a PersistentVolume instead of hostPath, since a hostPath volume only works on nodes where the skills directory actually exists:

  1. Create PersistentVolume: Define storage backend
  2. Create PersistentVolumeClaim: Request storage
  3. Update Deployment: Use PVC instead of hostPath

See skills-pv-pvc.yaml.bak for a reference implementation.
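
For orientation, the claim in step 2 might be shaped like the manifest printed below. This is not the content of skills-pv-pvc.yaml.bak, only a generic sketch: the claim name deer-flow-skills and the 1Gi default are hypothetical, and ReadOnlyMany assumes the sandbox only reads skills and that your storage backend supports that access mode.

```shell
# Print a PersistentVolumeClaim for the skills directory.
# Argument: requested storage size (default 1Gi).
render_skills_pvc() {
  size="${1:-1Gi}"
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: deer-flow-skills
  namespace: deer-flow
spec:
  accessModes:
    - ReadOnlyMany
  resources:
    requests:
      storage: $size
EOF
}

# Usage: render_skills_pvc 5Gi | kubectl apply -f -
```

In the Deployment, the hostPath volume source would then be swapped for a persistentVolumeClaim entry whose claimName matches the claim.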

Next Steps

After successful deployment:

  1. Start Backend: make dev or make docker-start
  2. Test Sandbox: Create a conversation and execute code
  3. Monitor: Watch pod logs and resource usage
  4. Scale: Adjust replicas based on workload

Support

For issues and questions:

  • Check the Troubleshooting section above
  • Review pod logs: kubectl logs -n deer-flow -l app=deer-flow-sandbox
  • See main project documentation: ../../README.md
  • Report issues on GitHub