Add Kubernetes-based sandbox provider for multi-instance support (#19)

* feat: adds docker-based dev environment

* docs: updates Docker command help

* fix local dev

* feat(sandbox): add Kubernetes-based sandbox provider for multi-instance support

* fix: skills path in k8s

* feat: add example config for k8s sandbox

* fix: docker config

* fix: load skills on docker dev

* feat: support sandbox execution to Kubernetes Deployment model

* chore: rename web service name
JeffJiang
2026-02-09 21:59:13 +08:00
committed by GitHub
parent 554ec7a91e
commit b6da3a219e
20 changed files with 981 additions and 94 deletions

docker/k8s/README.md (new file, 427 lines)

@@ -0,0 +1,427 @@
# Kubernetes Sandbox Setup
This guide explains how to deploy and configure the DeerFlow sandbox execution environment on Kubernetes.
## Overview
The Kubernetes sandbox deployment allows you to run DeerFlow's code execution sandbox in a Kubernetes cluster, providing:
- **Isolated Execution**: Sandbox runs in dedicated Kubernetes pods
- **Scalability**: Easy horizontal scaling with replica configuration
- **Cluster Integration**: Seamless integration with existing Kubernetes infrastructure
- **Persistent Skills**: Skills directory mounted from host or PersistentVolume
## Prerequisites
Before you begin, ensure you have:
1. **Kubernetes Cluster**: One of the following:
   - Docker Desktop with Kubernetes enabled
   - OrbStack with Kubernetes enabled
   - Minikube
   - Any production Kubernetes cluster
2. **kubectl**: Kubernetes command-line tool
   ```bash
   # macOS
   brew install kubectl
   # Linux
   # See: https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/
   ```
3. **Docker**: For pulling the sandbox image (optional, but recommended)
   ```bash
   # Verify installation
   docker version
   ```
## Quick Start
### 1. Enable Kubernetes
**Docker Desktop:**
```
Settings → Kubernetes → Enable Kubernetes → Apply & Restart
```
**OrbStack:**
```
Settings → Enable Kubernetes
```
**Minikube:**
```bash
minikube start
```
### 2. Run Setup Script
The easiest way to get started:
```bash
cd docker/k8s
./setup.sh
```
This will:
- ✅ Check kubectl installation and cluster connectivity
- ✅ Pull the sandbox Docker image (optional, can be skipped)
- ✅ Create the `deer-flow` namespace
- ✅ Deploy the sandbox service and deployment
- ✅ Verify the deployment is running
### 3. Configure Backend
Add the following to `backend/config.yaml`:
```yaml
sandbox:
  use: src.community.aio_sandbox:AioSandboxProvider
  base_url: http://deer-flow-sandbox.deer-flow.svc.cluster.local:8080
```
### 4. Verify Deployment
Check that the sandbox pod is running:
```bash
kubectl get pods -n deer-flow
```
You should see:
```
NAME                                 READY   STATUS    RESTARTS   AGE
deer-flow-sandbox-xxxxxxxxxx-xxxxx   1/1     Running   0          1m
```
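Instead of re-running `kubectl get pods` until the status changes, a script can block until the rollout completes. The following is a minimal sketch, assuming the deployment name and namespace used in this guide; `wait_for_sandbox` is a hypothetical helper and is not part of `setup.sh`:

```bash
# Hypothetical helper: block until the sandbox Deployment has rolled out.
# Assumes the names used in this guide (deer-flow-sandbox in namespace deer-flow).
wait_for_sandbox() {
    local timeout="${1:-120s}"
    kubectl rollout status deployment/deer-flow-sandbox \
        --namespace deer-flow --timeout="${timeout}"
}
```

Calling `wait_for_sandbox 60s` returns a non-zero exit status if the pods are not ready within 60 seconds, which makes it usable as a gate in CI or in a wrapper script.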
## Advanced Configuration
### Custom Skills Path
By default, the setup script uses `PROJECT_ROOT/skills`. You can specify a custom path:
**Using command-line argument:**
```bash
./setup.sh --skills-path /custom/path/to/skills
```
**Using environment variable:**
```bash
SKILLS_PATH=/custom/path/to/skills ./setup.sh
```
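Both forms end up in the same `SKILLS_PATH` variable, and the script falls back to the project default only when that variable is unset or empty. Because the argument parser assigns `SKILLS_PATH` after the environment has been read, the command-line flag wins when both are given. A small sketch of the fallback, where `resolve_skills_path` is a hypothetical name used only for illustration:

```bash
# Sketch of the skills-path fallback: an explicit SKILLS_PATH wins,
# otherwise <project root>/skills is used.
# resolve_skills_path is a hypothetical name, not a function in setup.sh.
resolve_skills_path() {
    local project_root="$1"
    printf '%s\n' "${SKILLS_PATH:-${project_root}/skills}"
}
```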
### Custom Sandbox Image
To use a different sandbox image:
**Using command-line argument:**
```bash
./setup.sh --image your-registry/sandbox:tag
```
**Using environment variable:**
```bash
SANDBOX_IMAGE=your-registry/sandbox:tag ./setup.sh
```
### Skip Image Pull
If you already have the image locally or want to pull it manually later:
```bash
./setup.sh --skip-pull
```
### Combined Options
```bash
./setup.sh --skip-pull --skills-path /custom/skills --image custom/sandbox:latest
```
## Manual Deployment
If you prefer manual deployment or need more control:
### 1. Create Namespace
```bash
kubectl apply -f namespace.yaml
```
### 2. Create Service
```bash
kubectl apply -f sandbox-service.yaml
```
### 3. Deploy Sandbox
First, update the skills path in `sandbox-deployment.yaml`:
```bash
# Replace __SKILLS_PATH__ with the absolute path to your skills directory
sed 's|__SKILLS_PATH__|/absolute/path/to/deer-flow/skills|g' \
  sandbox-deployment.yaml | kubectl apply -f -
```
Or manually edit `sandbox-deployment.yaml` and replace `__SKILLS_PATH__` with your skills directory path.
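A plain `sed` substitution misbehaves if the path contains `&`, `|`, or a backslash, since those characters are special in a `sed` replacement that uses `|` as the delimiter. The sketch below escapes them first; `render_deployment` is a hypothetical helper that reads the manifest on standard input:

```bash
# Hypothetical helper: substitute __SKILLS_PATH__ safely.
# Escapes the characters that are special in a sed replacement
# (&, the | delimiter, and backslash) before substituting.
render_deployment() {
    local escaped
    escaped="$(printf '%s' "$1" | sed 's/[&|\]/\\&/g')"
    sed "s|__SKILLS_PATH__|${escaped}|g"
}
```

One possible invocation is `render_deployment "$PWD/skills" < sandbox-deployment.yaml | kubectl apply -f -`.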
### 4. Verify Deployment
```bash
# Check all resources
kubectl get all -n deer-flow
# Check pod status
kubectl get pods -n deer-flow
# Check pod logs
kubectl logs -n deer-flow -l app=deer-flow-sandbox
# Describe pod for detailed info
kubectl describe pod -n deer-flow -l app=deer-flow-sandbox
```
## Configuration Options
### Resource Limits
Edit `sandbox-deployment.yaml` to adjust resource limits:
```yaml
resources:
  requests:
    cpu: 100m       # Minimum CPU
    memory: 256Mi   # Minimum memory
  limits:
    cpu: 1000m      # Maximum CPU (1 core)
    memory: 1Gi     # Maximum memory
```
### Scaling
Adjust the number of replicas:
```yaml
spec:
  replicas: 3  # Run 3 sandbox pods
```
Or scale dynamically:
```bash
kubectl scale deployment deer-flow-sandbox -n deer-flow --replicas=3
```
### Health Checks
The deployment includes readiness and liveness probes:
- **Readiness Probe**: Checks if the pod is ready to serve traffic
- **Liveness Probe**: Restarts the pod if it becomes unhealthy
Configure in `sandbox-deployment.yaml`:
```yaml
readinessProbe:
  httpGet:
    path: /v1/sandbox
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
  timeoutSeconds: 3
  failureThreshold: 3
```
## Troubleshooting
### Pod Not Starting
Check pod status and events:
```bash
kubectl describe pod -n deer-flow -l app=deer-flow-sandbox
```
Common issues:
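- **ImagePullBackOff**: The sandbox image cannot be pulled
  - Solution: Pre-pull the image with `docker pull <image>`. This helps only when the cluster nodes share the local Docker image store; on Minikube, load the image with `minikube image load <image>`. Because the image uses the `:latest` tag and no `imagePullPolicy` is set, Kubernetes defaults to `Always`, so the registry must still be reachable
- **Skills path not found**: The hostPath directory doesn't exist
  - Solution: Verify the skills path exists on the node that runs the pod
- **Resource constraints**: Not enough CPU/memory
  - Solution: Adjust resource requests/limits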
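To see at a glance which of these cases applies, the waiting reason of each container can be read directly from the pod status. This is a sketch under the assumption that the label `app=deer-flow-sandbox` selects the sandbox pods, as in the manifests of this guide; `sandbox_waiting_reasons` is a hypothetical helper name:

```bash
# Hypothetical helper: print "<pod name><TAB><waiting reason>" for each sandbox pod.
# The reason column is empty for containers that are not in a waiting state.
sandbox_waiting_reasons() {
    kubectl get pods --namespace deer-flow --selector app=deer-flow-sandbox \
        --output jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].state.waiting.reason}{"\n"}{end}'
}
```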
### Service Not Accessible
Verify the service is running:
```bash
kubectl get service -n deer-flow
kubectl describe service deer-flow-sandbox -n deer-flow
```
Test connectivity from another pod:
```bash
kubectl run test-pod -n deer-flow --rm -it --image=curlimages/curl -- \
curl http://deer-flow-sandbox.deer-flow.svc.cluster.local:8080/v1/sandbox
```
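When the client runs outside the cluster, for example a backend started on the host, the `svc.cluster.local` name may not resolve there. A port-forward is one way to test the same endpoint from the host. The helper below is a sketch; `forward_sandbox` and the local port are illustrative choices, not something `setup.sh` provides:

```bash
# Hypothetical helper: forward a local port to port 8080 of a sandbox pod.
# Runs in the foreground until interrupted with Ctrl-C.
forward_sandbox() {
    local local_port="${1:-8080}"
    kubectl port-forward deployment/deer-flow-sandbox \
        "${local_port}:8080" --namespace deer-flow
}
```

With `forward_sandbox 18080` running in one terminal, `curl http://localhost:18080/v1/sandbox` in another terminal should reach the same path that the probes use.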
### Check Logs
View sandbox logs:
```bash
# Follow logs in real-time
kubectl logs -n deer-flow -l app=deer-flow-sandbox -f
# View logs from previous container (if crashed)
kubectl logs -n deer-flow -l app=deer-flow-sandbox --previous
```
### Health Check Failures
If pods show as not ready:
```bash
# Check readiness probe
kubectl get events -n deer-flow --sort-by='.lastTimestamp'
# Exec into pod to debug
kubectl exec -it -n deer-flow <pod-name> -- /bin/sh
```
## Cleanup
### Remove All Resources
Using the setup script:
```bash
./setup.sh --cleanup
```
Or manually:
```bash
kubectl delete -f sandbox-deployment.yaml
kubectl delete -f sandbox-service.yaml
kubectl delete namespace deer-flow
```
### Remove Specific Resources
```bash
# Delete only the deployment (keeps namespace and service)
kubectl delete deployment deer-flow-sandbox -n deer-flow
# Delete pods (they will be recreated by deployment)
kubectl delete pods -n deer-flow -l app=deer-flow-sandbox
```
## Architecture
```
┌─────────────────────────────────────────────┐
│              DeerFlow Backend               │
│     (config.yaml: base_url configured)      │
└────────────────┬────────────────────────────┘
                 │ HTTP requests
┌─────────────────────────────────────────────┐
│  Kubernetes Service (ClusterIP, headless)   │
│    deer-flow-sandbox.deer-flow.svc:8080     │
└────────────────┬────────────────────────────┘
                 │ DNS resolves to pod IPs
┌─────────────────────────────────────────────┐
│           Sandbox Pods (replicas)           │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐   │
│  │  Pod 1   │  │  Pod 2   │  │  Pod 3   │   │
│  │ Port 8080│  │ Port 8080│  │ Port 8080│   │
│  └──────────┘  └──────────┘  └──────────┘   │
└────────────────┬────────────────────────────┘
                 │ Volume mount
┌─────────────────────────────────────────────┐
│            Host Skills Directory            │
│          /path/to/deer-flow/skills          │
└─────────────────────────────────────────────┘
```
## Setup Script Reference
### Command-Line Options
```bash
./setup.sh [options]

Options:
  -h, --help              Show help message
  -c, --cleanup           Remove all Kubernetes resources
  -p, --skip-pull         Skip pulling sandbox image
  --image <image>         Use custom sandbox image
  --skills-path <path>    Custom skills directory path

Environment Variables:
  SANDBOX_IMAGE           Custom sandbox image
  SKILLS_PATH             Custom skills path

Examples:
  ./setup.sh                                   # Use default settings
  ./setup.sh --skills-path /custom/path        # Use custom skills path
  ./setup.sh --skip-pull --image custom:tag    # Custom image, skip pull
  SKILLS_PATH=/custom/path ./setup.sh          # Use env variable
```
## Production Considerations
### Security
1. **Network Policies**: Restrict pod-to-pod communication
2. **RBAC**: Configure appropriate service account permissions
3. **Pod Security**: Enable pod security standards
4. **Image Security**: Scan images for vulnerabilities
### High Availability
1. **Multiple Replicas**: Run at least 3 replicas
2. **Pod Disruption Budget**: Prevent all pods from being evicted
3. **Pod Anti-Affinity**: Distribute pods across nodes (or use topology spread constraints)
4. **Resource Quotas**: Set namespace resource limits
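As an illustration of the Pod Disruption Budget item, `kubectl` can create one imperatively. The sketch below assumes the label `app=deer-flow-sandbox` from this guide; the helper name `create_sandbox_pdb` and the budget name are hypothetical:

```bash
# Hypothetical helper: keep at least N sandbox pods available during
# voluntary disruptions such as node drains.
create_sandbox_pdb() {
    local min_available="${1:-1}"
    kubectl create poddisruptionbudget deer-flow-sandbox \
        --namespace deer-flow \
        --selector=app=deer-flow-sandbox \
        --min-available="${min_available}"
}
```

A budget like this is only meaningful with more replicas than `--min-available`; with a single replica and a minimum of 1, node drains would be blocked.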
### Monitoring
1. **Prometheus**: Scrape metrics from pods
2. **Logging**: Centralized log aggregation
3. **Alerting**: Set up alerts for pod failures
4. **Tracing**: Distributed tracing for requests
### Storage
For production, consider using PersistentVolume instead of hostPath:
1. **Create PersistentVolume**: Define storage backend
2. **Create PersistentVolumeClaim**: Request storage
3. **Update Deployment**: Use PVC instead of hostPath
See `skills-pv-pvc.yaml.bak` for reference implementation.
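For orientation, a claim for the skills volume might look like the output of the sketch below. It is not the content of `skills-pv-pvc.yaml.bak`: the claim name, the size, and the `ReadOnlyMany` access mode are assumptions, and whether that access mode is available depends on the storage backend. `render_skills_pvc` is a hypothetical helper that only prints a manifest:

```bash
# Hypothetical helper: print a PersistentVolumeClaim manifest for the skills volume.
# Claim name, size, and access mode are illustrative assumptions.
render_skills_pvc() {
    local claim_name="${1:-deer-flow-skills}"
    local size="${2:-1Gi}"
    cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${claim_name}
  namespace: deer-flow
spec:
  accessModes:
    - ReadOnlyMany
  resources:
    requests:
      storage: ${size}
EOF
}
```

The output can be piped to `kubectl apply -f -`. The deployment's `skills` volume would then reference the claim through `persistentVolumeClaim.claimName` in place of `hostPath`; populating the volume with skills is left out of this sketch.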
## Next Steps
After successful deployment:
1. **Start Backend**: `make dev` or `make docker-start`
2. **Test Sandbox**: Create a conversation and execute code
3. **Monitor**: Watch pod logs and resource usage
4. **Scale**: Adjust replicas based on workload
## Support
For issues and questions:
- Check troubleshooting section above
- Review pod logs: `kubectl logs -n deer-flow -l app=deer-flow-sandbox`
- See main project documentation: [../../README.md](../../README.md)
- Report issues on GitHub

docker/k8s/namespace.yaml (new file, 7 lines)

@@ -0,0 +1,7 @@
apiVersion: v1
kind: Namespace
metadata:
  name: deer-flow
  labels:
    app.kubernetes.io/name: deer-flow
    app.kubernetes.io/component: sandbox

docker/k8s/sandbox-deployment.yaml (new file, 65 lines)

@@ -0,0 +1,65 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deer-flow-sandbox
  namespace: deer-flow
  labels:
    app.kubernetes.io/name: deer-flow
    app.kubernetes.io/component: sandbox
spec:
  replicas: 1
  selector:
    matchLabels:
      app: deer-flow-sandbox
  template:
    metadata:
      labels:
        app: deer-flow-sandbox
        app.kubernetes.io/name: deer-flow
        app.kubernetes.io/component: sandbox
    spec:
      containers:
        - name: sandbox
          image: enterprise-public-cn-beijing.cr.volces.com/vefaas-public/all-in-one-sandbox:latest
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
          readinessProbe:
            httpGet:
              path: /v1/sandbox
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 3
          livenessProbe:
            httpGet:
              path: /v1/sandbox
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 3
            failureThreshold: 3
          resources:
            requests:
              cpu: 100m
              memory: 256Mi
            limits:
              cpu: 1000m
              memory: 1Gi
          volumeMounts:
            - name: skills
              mountPath: /mnt/skills
              readOnly: true
          securityContext:
            privileged: false
            allowPrivilegeEscalation: true
      volumes:
        - name: skills
          hostPath:
            # Path to skills directory on the host machine
            # This will be replaced by setup.sh with the actual path
            path: __SKILLS_PATH__
            type: Directory
      restartPolicy: Always

docker/k8s/sandbox-service.yaml (new file, 21 lines)

@@ -0,0 +1,21 @@
apiVersion: v1
kind: Service
metadata:
  name: deer-flow-sandbox
  namespace: deer-flow
  labels:
    app.kubernetes.io/name: deer-flow
    app.kubernetes.io/component: sandbox
spec:
  type: ClusterIP
  clusterIP: None  # Headless service for direct Pod DNS access
  ports:
    - name: http
      port: 8080
      targetPort: 8080
      protocol: TCP
  selector:
    app: deer-flow-sandbox
  # Enable DNS-based service discovery
  # Pods will be accessible at: {pod-name}.deer-flow-sandbox.deer-flow.svc.cluster.local:8080
  publishNotReadyAddresses: false

docker/k8s/setup.sh (new executable file, 245 lines)

@@ -0,0 +1,245 @@
#!/bin/bash
# Kubernetes Sandbox Initialization Script for Deer-Flow
# This script sets up the Kubernetes environment for the sandbox provider
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "${SCRIPT_DIR}/../.." && pwd)"
# Default sandbox image
DEFAULT_SANDBOX_IMAGE="enterprise-public-cn-beijing.cr.volces.com/vefaas-public/all-in-one-sandbox:latest"
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
echo -e "${BLUE}╔════════════════════════════════════════════╗${NC}"
echo -e "${BLUE}║ Deer-Flow Kubernetes Sandbox Setup ║${NC}"
echo -e "${BLUE}╚════════════════════════════════════════════╝${NC}"
echo
# Functions to print status messages
info() {
    echo -e "${BLUE}[INFO]${NC} $1"
}

success() {
    echo -e "${GREEN}[SUCCESS]${NC} $1"
}

warn() {
    echo -e "${YELLOW}[WARN]${NC} $1"
}

error() {
    echo -e "${RED}[ERROR]${NC} $1"
}

# Check if kubectl is installed
check_kubectl() {
    info "Checking kubectl installation..."
    if ! command -v kubectl &> /dev/null; then
        error "kubectl is not installed. Please install kubectl first."
        echo "  - macOS: brew install kubectl"
        echo "  - Linux: https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/"
        exit 1
    fi
    success "kubectl is installed"
}

# Check if Kubernetes cluster is accessible
check_cluster() {
    info "Checking Kubernetes cluster connection..."
    if ! kubectl cluster-info &> /dev/null; then
        error "Cannot connect to Kubernetes cluster."
        echo "Please ensure:"
        echo "  - Docker Desktop: Settings → Kubernetes → Enable Kubernetes"
        echo "  - Or OrbStack: Enable Kubernetes in settings"
        echo "  - Or Minikube: minikube start"
        exit 1
    fi
    success "Connected to Kubernetes cluster"
}
# Apply Kubernetes resources
apply_resources() {
    info "Applying Kubernetes resources..."
    # Determine skills path and image (flag or environment variable, else default)
    SKILLS_PATH="${SKILLS_PATH:-${PROJECT_ROOT}/skills}"
    local image="${SANDBOX_IMAGE:-$DEFAULT_SANDBOX_IMAGE}"
    info "Using skills path: ${SKILLS_PATH}"
    # Validate skills path exists
    if [[ ! -d "${SKILLS_PATH}" ]]; then
        warn "Skills path does not exist: ${SKILLS_PATH}"
        warn "Creating directory..."
        mkdir -p "${SKILLS_PATH}"
    fi
    echo "  → Creating namespace..."
    kubectl apply -f "${SCRIPT_DIR}/namespace.yaml"
    echo "  → Creating sandbox service..."
    kubectl apply -f "${SCRIPT_DIR}/sandbox-service.yaml"
    echo "  → Creating sandbox deployment with skills path: ${SKILLS_PATH}"
    # Replace the __SKILLS_PATH__ placeholder and, if a custom image was
    # requested, the default image in the manifest. The same sed invocation
    # works on both macOS and Linux, so no OS check is needed.
    sed -e "s|__SKILLS_PATH__|${SKILLS_PATH}|g" \
        -e "s|image: ${DEFAULT_SANDBOX_IMAGE}|image: ${image}|" \
        "${SCRIPT_DIR}/sandbox-deployment.yaml" | kubectl apply -f -
    success "All Kubernetes resources applied"
}
# Verify deployment
verify_deployment() {
    info "Verifying deployment..."
    echo "  → Checking namespace..."
    kubectl get namespace deer-flow
    echo "  → Checking service..."
    kubectl get service -n deer-flow
    echo "  → Checking deployment..."
    kubectl get deployment -n deer-flow
    echo "  → Checking pods..."
    kubectl get pods -n deer-flow
    success "Deployment verified"
}

# Pull sandbox image
pull_image() {
    info "Checking sandbox image..."
    IMAGE="${SANDBOX_IMAGE:-$DEFAULT_SANDBOX_IMAGE}"
    # Check if image already exists locally
    if docker image inspect "$IMAGE" &> /dev/null; then
        success "Image already exists locally: $IMAGE"
        return 0
    fi
    info "Pulling sandbox image (this may take a few minutes on first run)..."
    echo "  → Image: $IMAGE"
    echo
    if docker pull "$IMAGE"; then
        success "Image pulled successfully"
    else
        warn "Failed to pull image. Pod startup may be slow on first run."
        echo "  You can manually pull the image later with:"
        echo "    docker pull $IMAGE"
    fi
}
# Print next steps
print_next_steps() {
    echo
    echo -e "${BLUE}╔════════════════════════════════════════════╗${NC}"
    echo -e "${BLUE}║ Setup Complete! ║${NC}"
    echo -e "${BLUE}╚════════════════════════════════════════════╝${NC}"
    echo
    echo -e "${YELLOW}To enable Kubernetes sandbox, add the following to backend/config.yaml:${NC}"
    echo
    echo -e "${GREEN}sandbox:${NC}"
    echo -e "${GREEN}  use: src.community.aio_sandbox:AioSandboxProvider${NC}"
    echo -e "${GREEN}  base_url: http://deer-flow-sandbox.deer-flow.svc.cluster.local:8080${NC}"
    echo
    echo
    echo -e "${GREEN}Next steps:${NC}"
    echo "  make dev           # Start backend and frontend in development mode"
    echo "  make docker-start  # Start backend and frontend in Docker containers"
    echo
}
# Cleanup function
cleanup() {
    if [[ "$1" == "--cleanup" ]] || [[ "$1" == "-c" ]]; then
        info "Cleaning up Kubernetes resources..."
        kubectl delete -f "${SCRIPT_DIR}/sandbox-deployment.yaml" --ignore-not-found=true
        kubectl delete -f "${SCRIPT_DIR}/sandbox-service.yaml" --ignore-not-found=true
        kubectl delete -f "${SCRIPT_DIR}/namespace.yaml" --ignore-not-found=true
        success "Cleanup complete"
        exit 0
    fi
}

# Show help
show_help() {
    echo "Usage: $0 [options]"
    echo
    echo "Options:"
    echo "  -h, --help              Show this help message"
    echo "  -c, --cleanup           Remove all Kubernetes resources"
    echo "  -p, --skip-pull         Skip pulling sandbox image"
    echo "  --image <image>         Use custom sandbox image"
    echo "  --skills-path <path>    Custom skills directory path"
    echo
    echo "Environment variables:"
    echo "  SANDBOX_IMAGE           Custom sandbox image (default: $DEFAULT_SANDBOX_IMAGE)"
    echo "  SKILLS_PATH             Custom skills path (default: PROJECT_ROOT/skills)"
    echo
    echo "Examples:"
    echo "  $0                                 # Use default settings"
    echo "  $0 --skills-path /custom/path      # Use custom skills path"
    echo "  SKILLS_PATH=/custom/path $0        # Use env variable"
    echo
    exit 0
}
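# Parse arguments
SKIP_PULL=false
while [[ $# -gt 0 ]]; do
    case $1 in
        -h|--help)
            show_help
            ;;
        -c|--cleanup)
            cleanup "$1"
            ;;
        -p|--skip-pull)
            SKIP_PULL=true
            shift
            ;;
        --image)
            # Fail with a message instead of a silent "shift 2" error under set -e
            [[ $# -ge 2 ]] || { error "--image requires a value"; exit 1; }
            SANDBOX_IMAGE="$2"
            shift 2
            ;;
        --skills-path)
            [[ $# -ge 2 ]] || { error "--skills-path requires a value"; exit 1; }
            SKILLS_PATH="$2"
            shift 2
            ;;
        *)
            # Reject unknown options instead of silently ignoring them
            error "Unknown option: $1"
            echo "Run '$0 --help' for usage."
            exit 1
            ;;
    esac
done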
# Main execution
main() {
    check_kubectl
    check_cluster
    # Pull image first to avoid Pod startup timeout
    if [[ "$SKIP_PULL" == false ]]; then
        pull_image
    fi
    apply_resources
    verify_deployment
    print_next_steps
}

main