fix(sandbox): deer-flow-provisioner container fails to start in local execution mode (#889)

Willem Jiang
2026-02-24 08:31:52 +08:00
committed by GitHub
parent b5c11baece
commit 03705acf3a
12 changed files with 452 additions and 52 deletions


@@ -6,11 +6,10 @@
 # - frontend: Frontend Next.js dev server (port 3000)
 # - gateway: Backend Gateway API (port 8001)
 # - langgraph: LangGraph server (port 2024)
-# - provisioner: Sandbox provisioner (creates Pods in host Kubernetes)
+# - provisioner (optional): Sandbox provisioner (creates Pods in host Kubernetes)
 #
 # Prerequisites:
-# - Host machine must have a running Kubernetes cluster (Docker Desktop K8s,
-# minikube, kind, etc.) with kubectl configured (~/.kube/config).
+# - Kubernetes cluster + kubeconfig are only required when using provisioner mode.
 #
 # Access: http://localhost:2026
@@ -20,6 +19,8 @@ services:
   # cluster via the K8s API.
   # Backend accesses sandboxes directly via host.docker.internal:{NodePort}.
   provisioner:
+    profiles:
+      - provisioner
     build:
       context: ./provisioner
       dockerfile: Dockerfile
@@ -55,19 +56,21 @@ services:
       start_period: 15s
   # ── Reverse Proxy ──────────────────────────────────────────────────────
-  # Routes API traffic to gateway, langgraph, and provisioner services.
+  # Routes API traffic to gateway/langgraph and (optionally) provisioner.
+  # Select nginx config via NGINX_CONF:
+  # - nginx.local.conf (default): no provisioner route (local/aio modes)
+  # - nginx.conf: includes provisioner route (provisioner mode)
   nginx:
     image: nginx:alpine
     container_name: deer-flow-nginx
     ports:
       - "2026:2026"
     volumes:
-      - ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro
+      - ./nginx/${NGINX_CONF:-nginx.local.conf}:/etc/nginx/nginx.conf:ro
     depends_on:
       - frontend
       - gateway
       - langgraph
-      - provisioner
     networks:
       - deer-flow-dev
     restart: unless-stopped
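
Reviewer note: the new volume entry relies on Compose's `${NGINX_CONF:-nginx.local.conf}` interpolation, where the `:-` form uses the default when the variable is unset or empty. The sketch below only mimics that rule in Python to show which file ends up mounted; `resolve_nginx_conf` and `DEFAULT_NGINX_CONF` are hypothetical names, not part of this commit.

```python
import os

# Default taken from the compose file in this commit.
DEFAULT_NGINX_CONF = "nginx.local.conf"


def resolve_nginx_conf(env=None):
    """Mimic ``${NGINX_CONF:-nginx.local.conf}`` from the compose file.

    The ``:-`` form falls back to the default when NGINX_CONF is unset
    *or* set to an empty string.
    """
    env = os.environ if env is None else env
    value = env.get("NGINX_CONF", "")
    # An empty string is falsy, so it also selects the default.
    return f"./nginx/{value or DEFAULT_NGINX_CONF}"


if __name__ == "__main__":
    # Prints the host path that would be mounted as /etc/nginx/nginx.conf.
    print(resolve_nginx_conf())
```

Under this rule, local and aio modes need no extra environment at all, while provisioner mode has to export `NGINX_CONF=nginx.conf` before the stack starts.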


@@ -188,7 +188,7 @@ kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'
 The provisioner runs as part of the docker-compose-dev stack:
 ```bash
-# Start all services including provisioner
+# Start Docker services (provisioner starts only when config.yaml enables provisioner mode)
 make docker-start
 # Or start just the provisioner
@@ -249,6 +249,18 @@ docker exec deer-flow-gateway curl -s $SANDBOX_URL/v1/sandbox
 - Run `kubectl config view` to verify
 - Check the volume mount in docker-compose-dev.yaml
+### Issue: "Kubeconfig path is a directory"
+**Cause**: The mounted `KUBECONFIG_PATH` points to a directory instead of a file.
+**Solution**:
+- Ensure the compose mount source is a file (e.g., `~/.kube/config`) not a directory
+- Verify inside container:
+```bash
+docker exec deer-flow-provisioner ls -ld /root/.kube/config
+```
+- Expected output should indicate a regular file (`-`), not a directory (`d`)
 ### Issue: "Connection refused" to K8s API
 **Cause**: The provisioner can't reach the K8s API server.
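
Reviewer note: the "Kubeconfig path is a directory" entry added in this hunk comes down to telling apart a regular file, a directory, and a missing path. The sketch below shows that classification with the same `os.path` calls the provisioner change uses; `describe_kubeconfig_path` is a hypothetical helper for illustration, not code from this commit.

```python
import os


def describe_kubeconfig_path(path):
    """Classify ``path`` as 'file', 'directory', 'other', or 'missing'.

    Mirrors the distinction that ``ls -ld`` shows with ``-`` versus ``d``.
    """
    if os.path.isfile(path):
        return "file"
    if os.path.isdir(path):
        return "directory"
    if os.path.exists(path):
        # Exists but is neither a regular file nor a directory
        # (for example a socket or device node).
        return "other"
    return "missing"


if __name__ == "__main__":
    # Assumed default location on the host; adjust to the actual mount source.
    host_path = os.path.expanduser("~/.kube/config")
    print(host_path, "->", describe_kubeconfig_path(host_path))
```

Running this against the compose mount source on the host before starting the stack should catch the directory case without waiting for the container to fail.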


@@ -80,12 +80,29 @@ def _init_k8s_client() -> k8s_client.CoreV1Api:
     Tries the mounted kubeconfig first, then falls back to in-cluster
     config (useful if the provisioner itself runs inside K8s).
     """
-    try:
-        k8s_config.load_kube_config(config_file=KUBECONFIG_PATH)
-        logger.info(f"Loaded kubeconfig from {KUBECONFIG_PATH}")
-    except Exception:
-        logger.warning("Could not load kubeconfig from file, trying in-cluster config")
-        k8s_config.load_incluster_config()
+    if os.path.exists(KUBECONFIG_PATH):
+        if os.path.isdir(KUBECONFIG_PATH):
+            raise RuntimeError(
+                f"KUBECONFIG_PATH points to a directory, expected a file: {KUBECONFIG_PATH}"
+            )
+        try:
+            k8s_config.load_kube_config(config_file=KUBECONFIG_PATH)
+            logger.info(f"Loaded kubeconfig from {KUBECONFIG_PATH}")
+        except Exception as exc:
+            raise RuntimeError(
+                f"Failed to load kubeconfig from {KUBECONFIG_PATH}: {exc}"
+            ) from exc
+    else:
+        logger.warning(
+            f"Kubeconfig not found at {KUBECONFIG_PATH}; trying in-cluster config"
+        )
+        try:
+            k8s_config.load_incluster_config()
+        except Exception as exc:
+            raise RuntimeError(
+                "Failed to initialize Kubernetes client. "
+                f"No kubeconfig at {KUBECONFIG_PATH}, and in-cluster config is unavailable: {exc}"
+            ) from exc
     # When connecting from inside Docker to the host's K8s API, the
     # kubeconfig may reference ``localhost`` or ``127.0.0.1``. We
@@ -103,15 +120,27 @@ def _init_k8s_client() -> k8s_client.CoreV1Api:
 def _wait_for_kubeconfig(timeout: int = 30) -> None:
-    """Block until the kubeconfig file is available."""
+    """Wait for kubeconfig file if configured, then continue with fallback support."""
     deadline = time.time() + timeout
     while time.time() < deadline:
         if os.path.exists(KUBECONFIG_PATH):
-            logger.info(f"Found kubeconfig at {KUBECONFIG_PATH}")
-            return
+            if os.path.isfile(KUBECONFIG_PATH):
+                logger.info(f"Found kubeconfig file at {KUBECONFIG_PATH}")
+                return
+            if os.path.isdir(KUBECONFIG_PATH):
+                raise RuntimeError(
+                    "Kubeconfig path is a directory. "
+                    f"Please mount a kubeconfig file at {KUBECONFIG_PATH}."
+                )
+            raise RuntimeError(
+                f"Kubeconfig path exists but is not a regular file: {KUBECONFIG_PATH}"
+            )
         logger.info(f"Waiting for kubeconfig at {KUBECONFIG_PATH}")
         time.sleep(2)
-    raise RuntimeError(f"Kubeconfig not found at {KUBECONFIG_PATH} after {timeout}s")
+    logger.warning(
+        f"Kubeconfig not found at {KUBECONFIG_PATH} after {timeout}s; "
+        "will attempt in-cluster Kubernetes config"
+    )
 def _ensure_namespace() -> None:
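
Reviewer note: the reworked `_wait_for_kubeconfig` now has three outcomes: the file appears, the path exists but is unusable, or the wait times out and the caller falls back to in-cluster config. The standalone sketch below reproduces that control flow so it can be exercised without Kubernetes. `wait_for_file` and its `interval`, `sleep`, and `clock` parameters are hypothetical; the commit itself hard-codes a 2-second sleep and uses `time.time()`.

```python
import os
import time


def wait_for_file(path, timeout=30.0, interval=2.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll until ``path`` is a regular file.

    Returns True once the file is found and False on timeout, so the
    caller can decide on a fallback. Raises RuntimeError immediately if
    the path exists but is not a regular file, since waiting longer
    will not fix a bad mount.
    """
    deadline = clock() + timeout
    while clock() < deadline:
        if os.path.exists(path):
            if os.path.isfile(path):
                return True
            kind = "a directory" if os.path.isdir(path) else "not a regular file"
            raise RuntimeError(f"{path} exists but is {kind}")
        sleep(interval)
    return False


if __name__ == "__main__":
    # Assumed in-container path, matching the README troubleshooting entry.
    found = wait_for_file("/root/.kube/config", timeout=4.0, interval=1.0)
    print("kubeconfig found" if found else "timed out; would try in-cluster config")
```

Returning a boolean instead of logging keeps the sketch testable; the commit's version logs a warning and lets `_init_k8s_client` perform the fallback.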