Files
LangBot/docker/README_K8S.md
Copilot 43553e2c7d feat: Add Kubernetes deployment configuration for cluster deployments (#1779)
* Initial plan

* feat: Add Kubernetes deployment configuration and guide

Co-authored-by: RockChinQ <45992437+RockChinQ@users.noreply.github.com>

* feat: Add test script and update docker-compose with k8s reference

Co-authored-by: RockChinQ <45992437+RockChinQ@users.noreply.github.com>

* doc: add k8s deployment doc in README

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: RockChinQ <45992437+RockChinQ@users.noreply.github.com>
Co-authored-by: Junyan Qin <rockchinq@gmail.com>
2025-11-14 11:25:11 +08:00

14 KiB
Raw Permalink Blame History

LangBot Kubernetes 部署指南 / Kubernetes Deployment Guide

简体中文 | English


简体中文

概述

本指南提供了在 Kubernetes 集群中部署 LangBot 的完整步骤。Kubernetes 部署配置基于 docker-compose.yaml,适用于生产环境的容器化部署。

前置要求

  • Kubernetes 集群(版本 1.19+
  • kubectl 命令行工具已配置并可访问集群
  • 集群中有可用的存储类StorageClass用于持久化存储可选但推荐
  • 至少 2 vCPU 和 4GB RAM 的可用资源

架构说明

Kubernetes 部署包含以下组件:

  1. langbot: 主应用服务

    • 提供 Web UI端口 5300
    • 处理平台 webhook端口 2280-2290
    • 数据持久化卷
  2. langbot-plugin-runtime: 插件运行时服务

    • WebSocket 通信(端口 5400
    • 插件数据持久化卷
  3. 持久化存储:

    • langbot-data: LangBot 主数据
    • langbot-plugins: 插件文件
    • langbot-plugin-runtime-data: 插件运行时数据

快速开始

1. 下载部署文件

# 克隆仓库
git clone https://github.com/langbot-app/LangBot
cd LangBot/docker

# 或直接下载 kubernetes.yaml
wget https://raw.githubusercontent.com/langbot-app/LangBot/main/docker/kubernetes.yaml

2. 部署到 Kubernetes

# 应用所有配置
kubectl apply -f kubernetes.yaml

# 检查部署状态
kubectl get all -n langbot

# 查看 Pod 日志
kubectl logs -n langbot -l app=langbot -f

3. 访问 LangBot

默认情况下LangBot 服务使用 ClusterIP 类型,只能在集群内部访问。您可以选择以下方式之一来访问:

选项 A: 端口转发(推荐用于测试)

kubectl port-forward -n langbot svc/langbot 5300:5300

然后访问 http://localhost:5300

选项 B: NodePort适用于开发环境

编辑 kubernetes.yaml,取消注释 NodePort Service 部分,然后:

kubectl apply -f kubernetes.yaml
# 获取节点 IP
kubectl get nodes -o wide
# 访问 http://<NODE_IP>:30300

选项 C: LoadBalancer适用于云环境

编辑 kubernetes.yaml,取消注释 LoadBalancer Service 部分,然后:

kubectl apply -f kubernetes.yaml
# 获取外部 IP
kubectl get svc -n langbot langbot-loadbalancer
# 访问 http://<EXTERNAL_IP>

选项 D: Ingress推荐用于生产环境

确保集群中已安装 Ingress Controller如 nginx-ingress然后

  1. 编辑 kubernetes.yaml 中的 Ingress 配置
  2. 修改域名为您的实际域名
  3. 应用配置:
kubectl apply -f kubernetes.yaml
# 访问 http://langbot.yourdomain.com

配置说明

环境变量

ConfigMap 中配置环境变量:

apiVersion: v1
kind: ConfigMap
metadata:
  name: langbot-config
  namespace: langbot
data:
  TZ: "Asia/Shanghai"  # 修改为您的时区

存储配置

默认使用动态存储分配。如果您有特定的 StorageClass请在 PVC 中指定:

spec:
  storageClassName: your-storage-class-name
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

资源限制

根据您的需求调整资源限制:

resources:
  requests:
    memory: "1Gi"
    cpu: "500m"
  limits:
    memory: "4Gi"
    cpu: "2000m"

常用操作

查看日志

# 查看 LangBot 主服务日志
kubectl logs -n langbot -l app=langbot -f

# 查看插件运行时日志
kubectl logs -n langbot -l app=langbot-plugin-runtime -f

重启服务

# 重启 LangBot
kubectl rollout restart deployment/langbot -n langbot

# 重启插件运行时
kubectl rollout restart deployment/langbot-plugin-runtime -n langbot

更新镜像

# 更新到最新版本
kubectl set image deployment/langbot -n langbot langbot=rockchin/langbot:latest
kubectl set image deployment/langbot-plugin-runtime -n langbot langbot-plugin-runtime=rockchin/langbot:latest

# 检查更新状态
kubectl rollout status deployment/langbot -n langbot

扩容(不推荐)

注意:由于 LangBot 使用 ReadWriteOnce 的持久化存储,不支持多副本扩容。如需高可用,请考虑使用 ReadWriteMany 存储或其他架构方案。

备份数据

# 备份 PVC 数据
kubectl exec -n langbot -it <langbot-pod-name> -- tar czf /tmp/backup.tar.gz /app/data
kubectl cp langbot/<langbot-pod-name>:/tmp/backup.tar.gz ./backup.tar.gz

卸载

# 删除所有资源(保留 PVC
kubectl delete deployment,service,configmap -n langbot --all

# 删除 PVC会删除数据
kubectl delete pvc -n langbot --all

# 删除命名空间
kubectl delete namespace langbot

故障排查

Pod 无法启动

# 查看 Pod 状态
kubectl get pods -n langbot

# 查看详细信息
kubectl describe pod -n langbot <pod-name>

# 查看事件
kubectl get events -n langbot --sort-by='.lastTimestamp'

存储问题

# 检查 PVC 状态
kubectl get pvc -n langbot

# 检查 PV
kubectl get pv

网络访问问题

# 检查 Service
kubectl get svc -n langbot

# 检查端口转发
kubectl port-forward -n langbot svc/langbot 5300:5300

生产环境建议

  1. 使用特定版本标签:避免使用 latest 标签,使用具体版本号如 rockchin/langbot:v1.0.0
  2. 配置资源限制:根据实际负载调整 CPU 和内存限制
  3. 使用 Ingress + TLS:配置 HTTPS 访问和证书管理
  4. 配置监控和告警:集成 Prometheus、Grafana 等监控工具
  5. 定期备份:配置自动备份策略保护数据
  6. 使用专用 StorageClass:为生产环境配置高性能存储
  7. 配置亲和性规则:确保 Pod 调度到合适的节点

高级配置

使用 Secrets 管理敏感信息

如果需要配置 API 密钥等敏感信息:

apiVersion: v1
kind: Secret
metadata:
  name: langbot-secrets
  namespace: langbot
type: Opaque
data:
  api_key: <base64-encoded-value>

然后在 Deployment 中引用:

env:
- name: API_KEY
  valueFrom:
    secretKeyRef:
      name: langbot-secrets
      key: api_key

配置水平自动扩缩容HPA

注意:需要确保使用 ReadWriteMany 存储类型

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: langbot-hpa
  namespace: langbot
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: langbot
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

参考资源


English

Overview

This guide provides complete steps for deploying LangBot in a Kubernetes cluster. The Kubernetes deployment configuration is based on docker-compose.yaml and is suitable for production containerized deployments.

Prerequisites

  • Kubernetes cluster (version 1.19+)
  • kubectl command-line tool configured with cluster access
  • Available StorageClass in the cluster for persistent storage (optional but recommended)
  • At least 2 vCPU and 4GB RAM of available resources

Architecture

The Kubernetes deployment includes the following components:

  1. langbot: Main application service

    • Provides Web UI (port 5300)
    • Handles platform webhooks (ports 2280-2290)
    • Data persistence volume
  2. langbot-plugin-runtime: Plugin runtime service

    • WebSocket communication (port 5400)
    • Plugin data persistence volume
  3. Persistent Storage:

    • langbot-data: LangBot main data
    • langbot-plugins: Plugin files
    • langbot-plugin-runtime-data: Plugin runtime data

Quick Start

1. Download Deployment Files

# Clone repository
git clone https://github.com/langbot-app/LangBot
cd LangBot/docker

# Or download kubernetes.yaml directly
wget https://raw.githubusercontent.com/langbot-app/LangBot/main/docker/kubernetes.yaml

2. Deploy to Kubernetes

# Apply all configurations
kubectl apply -f kubernetes.yaml

# Check deployment status
kubectl get all -n langbot

# View Pod logs
kubectl logs -n langbot -l app=langbot -f

3. Access LangBot

By default, LangBot service uses ClusterIP type, accessible only within the cluster. Choose one of the following methods to access:

Option A: Port Forwarding (Recommended for testing)

kubectl port-forward -n langbot svc/langbot 5300:5300

Then visit http://localhost:5300

Option B: NodePort (Suitable for development)

Edit kubernetes.yaml, uncomment the NodePort Service section, then:

kubectl apply -f kubernetes.yaml
# Get node IP
kubectl get nodes -o wide
# Visit http://<NODE_IP>:30300

Option C: LoadBalancer (Suitable for cloud environments)

Edit kubernetes.yaml, uncomment the LoadBalancer Service section, then:

kubectl apply -f kubernetes.yaml
# Get external IP
kubectl get svc -n langbot langbot-loadbalancer
# Visit http://<EXTERNAL_IP>

Option D: Ingress (Recommended for production)

Ensure an Ingress Controller (e.g., nginx-ingress) is installed in the cluster, then:

  1. Edit the Ingress configuration in kubernetes.yaml
  2. Change the domain to your actual domain
  3. Apply configuration:
kubectl apply -f kubernetes.yaml
# Visit http://langbot.yourdomain.com

Configuration

Environment Variables

Configure environment variables in ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: langbot-config
  namespace: langbot
data:
  TZ: "Asia/Shanghai"  # Change to your timezone

Storage Configuration

Uses dynamic storage provisioning by default. If you have a specific StorageClass, specify it in PVC:

spec:
  storageClassName: your-storage-class-name
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

Resource Limits

Adjust resource limits based on your needs:

resources:
  requests:
    memory: "1Gi"
    cpu: "500m"
  limits:
    memory: "4Gi"
    cpu: "2000m"

Common Operations

View Logs

# View LangBot main service logs
kubectl logs -n langbot -l app=langbot -f

# View plugin runtime logs
kubectl logs -n langbot -l app=langbot-plugin-runtime -f

Restart Services

# Restart LangBot
kubectl rollout restart deployment/langbot -n langbot

# Restart plugin runtime
kubectl rollout restart deployment/langbot-plugin-runtime -n langbot

Update Images

# Update to latest version
kubectl set image deployment/langbot -n langbot langbot=rockchin/langbot:latest
kubectl set image deployment/langbot-plugin-runtime -n langbot langbot-plugin-runtime=rockchin/langbot:latest

# Check update status
kubectl rollout status deployment/langbot -n langbot

Note: Due to LangBot using ReadWriteOnce persistent storage, multi-replica scaling is not supported. For high availability, consider using ReadWriteMany storage or alternative architectures.

Backup Data

# Backup PVC data
kubectl exec -n langbot -it <langbot-pod-name> -- tar czf /tmp/backup.tar.gz /app/data
kubectl cp langbot/<langbot-pod-name>:/tmp/backup.tar.gz ./backup.tar.gz

Uninstall

# Delete all resources (keep PVCs)
kubectl delete deployment,service,configmap -n langbot --all

# Delete PVCs (will delete data)
kubectl delete pvc -n langbot --all

# Delete namespace
kubectl delete namespace langbot

Troubleshooting

Pods Not Starting

# Check Pod status
kubectl get pods -n langbot

# View detailed information
kubectl describe pod -n langbot <pod-name>

# View events
kubectl get events -n langbot --sort-by='.lastTimestamp'

Storage Issues

# Check PVC status
kubectl get pvc -n langbot

# Check PV
kubectl get pv

Network Access Issues

# Check Service
kubectl get svc -n langbot

# Test port forwarding
kubectl port-forward -n langbot svc/langbot 5300:5300

Production Recommendations

  1. Use specific version tags: Avoid using latest tag, use specific version like rockchin/langbot:v1.0.0
  2. Configure resource limits: Adjust CPU and memory limits based on actual load
  3. Use Ingress + TLS: Configure HTTPS access and certificate management
  4. Configure monitoring and alerts: Integrate monitoring tools like Prometheus, Grafana
  5. Regular backups: Configure automated backup strategy to protect data
  6. Use dedicated StorageClass: Configure high-performance storage for production
  7. Configure affinity rules: Ensure Pods are scheduled to appropriate nodes

Advanced Configuration

Using Secrets for Sensitive Information

If you need to configure sensitive information like API keys:

apiVersion: v1
kind: Secret
metadata:
  name: langbot-secrets
  namespace: langbot
type: Opaque
data:
  api_key: <base64-encoded-value>

Then reference in Deployment:

env:
- name: API_KEY
  valueFrom:
    secretKeyRef:
      name: langbot-secrets
      key: api_key

Configure Horizontal Pod Autoscaling (HPA)

Note: Requires ReadWriteMany storage type

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: langbot-hpa
  namespace: langbot
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: langbot
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

References