Author: Tomoya Fujita
This blog post explains how to enable the Cilium Container Network Interface (CNI) with KubeEdge.
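Why Cilium for KubeEdge
Cilium is one of the most advanced and efficient container network interface plugins for Kubernetes clusters, providing network connectivity and security for containerized applications. It uses eBPF (extended Berkeley Packet Filter) to implement networking and security policies inside the Linux kernel, which gives a high-performance data plane and fine-grained security controls.
KubeEdge extends cluster orchestration to edge environments, providing unified cluster management along with features designed specifically for the edge.
Using Cilium with KubeEdge brings the advantages of both to edge computing environments. Application containers can be deployed wherever EdgeCore runs and connected through Cilium to workloads in the cloud infrastructure, because Cilium supports transparent encryption of traffic between endpoints using WireGuard.
In addition, Cilium Tetragon's security observability and runtime enforcement can be used to limit security risks and vulnerabilities in edge environments.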
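How to enable Cilium with KubeEdge
The following steps set up a simple cluster that combines Kubernetes, KubeEdge, and Cilium. Because this approach is new and still experimental, the steps must currently be performed manually.
Once all the steps are complete, the result is a cluster running KubeEdge together with Cilium.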
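Prerequisites
KubeEdge v1.16 or later: enabling Cilium with KubeEdge requires KubeEdge v1.16 or newer. cilium-agent issues API requests to the Kubernetes API server using InClusterConfig in order to configure itself. On a Kubernetes node this is not a problem, but with KubeEdge these API requests and responses must pass through the KubeEdge MetaManager. For details, see KubeEdge EdgeCore supports Cilium CNI[1].
A Kubernetes version compatible with KubeEdge v1.16: see the Kubernetes compatibility information[2] for supported versions.
Superuser (root) privileges to run the commands.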
Kubernetes master node setup
Follow the KubeEdge prerequisites[3] to set up the Kubernetes API server.
### Check node status
> kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
tomoyafujita Ready control-plane 25s v1.26.15 AA.BBB.CCC.DD <none> Ubuntu 20.04.6 LTS 5.15.0-102-generic containerd://1.6.32
### Remove the control-plane taint so that CloudCore can be scheduled on this node
> kubectl taint node tomoyafujita node-role.kubernetes.io/control-plane:NoSchedule-
node/tomoyafujita untainted
> kubectl get nodes -o json | jq '.items[].spec.taints'
null
Cilium installation and setup
Follow the Cilium Quick Installation guide[4] to install and configure Cilium in the cluster.
> cilium version
cilium-cli: v0.16.9 compiled with go1.22.3 on linux/amd64
cilium image (default): v1.15.5
cilium image (stable): v1.15.5
cilium image (running): unknown. Unable to obtain cilium version. Reason: release: not found
Then install Cilium in the cluster with WireGuard encryption enabled:
> cilium install --set encryption.enabled=true --set encryption.type=wireguard --set encryption.wireguard.persistentKeepalive=60
...
> cilium status
/¯¯\
/¯¯\__/¯¯\ Cilium: OK
\__/¯¯\__/ Operator: OK
/¯¯\__/¯¯\ Envoy DaemonSet: disabled (using embedded mode)
\__/¯¯\__/ Hubble Relay: disabled
\__/ ClusterMesh: disabled
Deployment cilium-operator Desired: 1, Ready: 1/1, Available: 1/1
DaemonSet cilium Desired: 1, Ready: 1/1, Available: 1/1
Containers: cilium Running: 1
cilium-operator Running: 1
Cluster Pods: 1/2 managed by Cilium
Helm chart version:
Image versions cilium quay.io/cilium/cilium:v1.15.5@sha256:4ce1666a73815101ec9a4d360af6c5b7f1193ab00d89b7124f8505dee147ca40: 1
cilium-operator quay.io/cilium/operator-generic:v1.15.5@sha256:f5d3d19754074ca052be6aac5d1ffb1de1eb5f2d947222b5f10f6d97ad4383e8: 1
Add a nodeAffinity rule to the Cilium DaemonSet so that its pods are created only on cloud nodes. These Cilium pods belong to the generic DaemonSet and should run on cloud nodes, not on nodes running EdgeCore.
### Edit Cilium DaemonSet with the following patch
> kubectl edit ds -n kube-system cilium
diff --git a/cilium-kubelet.yaml b/cilium-kubelet.yaml
index 21881e1..9946be9 100644
--- a/cilium-kubelet.yaml
+++ b/cilium-kubelet.yaml
@@ -29,6 +29,12 @@ spec:
         k8s-app: cilium
     spec:
       affinity:
+        nodeAffinity:
+          requiredDuringSchedulingIgnoredDuringExecution:
+            nodeSelectorTerms:
+            - matchExpressions:
+              - key: node-role.kubernetes.io/edge
+                operator: DoesNotExist
         podAntiAffinity:
           requiredDuringSchedulingIgnoredDuringExecution:
           - labelSelector:
After the edit is saved, the Cilium pods restart.
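The same affinity change can also be applied without opening an editor. The following is a minimal sketch, assuming the DaemonSet is named cilium in the kube-system namespace as above; the helper name cloud_only_affinity_patch is made up for this illustration. It relies on the default strategic merge behavior of kubectl patch, which adds the nodeAffinity key to the existing affinity block and leaves podAntiAffinity as it is.

```shell
#!/bin/sh
# Hypothetical helper (not part of Cilium or KubeEdge): prints a patch body
# carrying the same nodeAffinity rule as the diff above.
cloud_only_affinity_patch() {
  cat <<'EOF'
{
  "spec": {
    "template": {
      "spec": {
        "affinity": {
          "nodeAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": {
              "nodeSelectorTerms": [
                {
                  "matchExpressions": [
                    {"key": "node-role.kubernetes.io/edge", "operator": "DoesNotExist"}
                  ]
                }
              ]
            }
          }
        }
      }
    }
  }
}
EOF
}

# With cluster access, the patch could be applied like this:
# kubectl patch ds cilium -n kube-system -p "$(cloud_only_affinity_patch)"
```

As with the interactive edit, changing the pod template triggers a rollout of the Cilium pods.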
KubeEdge CloudCore setup
First, install KubeEdge with Keadm by following the official procedure[5].
This post uses Keadm v1.16.1 for the installation:
### Install v1.16.1 keadm command
> wget https://github.com/kubeedge/kubeedge/releases/download/v1.16.1/keadm-v1.16.1-linux-amd64.tar.gz
> tar -zxvf keadm-v1.16.1-linux-amd64.tar.gz
> cp keadm-v1.16.1-linux-amd64/keadm/keadm /usr/local/bin
> keadm version
version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.1", GitCommit:"bd7b42acbfbe3a453c7bb75a6bb8f1e8b3db7415", GitTreeState:"clean", BuildDate:"2024-03-27T02:57:08Z", GoVersion:"go1.20.10", Compiler:"gc", Platform:"linux/amd64"}
Then start CloudCore v1.16.1.
> keadm init --advertise-address="AA.BBB.CCC.DD" --profile version=v1.16.1 --kube-config=/root/.kube/config
Kubernetes version verification passed, KubeEdge installation will start...
CLOUDCORE started
=========CHART DETAILS=======
NAME: cloudcore
LAST DEPLOYED: Tue Jun 4 17:19:15 2024
NAMESPACE: kubeedge
STATUS: deployed
REVISION: 1
Once CloudCore is running, the dynamicController module must also be enabled.
### edit ConfigMap of CloudCore to enable dynamicController
> kubectl edit cm -n kubeedge cloudcore
> kubectl delete pod -n kubeedge --selector=kubeedge=cloudcore
### Check ConfigMap
> kubectl get cm -n kubeedge cloudcore -o yaml | grep "dynamicController" -A 1
dynamicController:
enable: true
CloudCore must be granted permission to handle the Cilium API requests that arrive from edge nodes through MetaManager. To do this, edit the ClusterRole and the ClusterRoleBinding.
ClusterRole:
### Edit and apply the following patch
> kubectl edit clusterrole cilium
diff --git a/cilium-clusterrole.yaml b/cilium-clusterrole.yaml
index 736e35c..fd5512e 100644
--- a/cilium-clusterrole.yaml
+++ b/cilium-clusterrole.yaml
@@ -66,6 +66,7 @@ rules:
   verbs:
   - list
   - watch
+  - get
 - apiGroups:
   - cilium.io
   resources:
ClusterRoleBinding:
### Edit and apply the following patch
> kubectl edit clusterrolebinding cilium
diff --git a/cilium-clusterrolebinding.yaml b/cilium-clusterrolebinding.yaml
index 9676737..ac956de 100644
--- a/cilium-clusterrolebinding.yaml
+++ b/cilium-clusterrolebinding.yaml
@@ -12,3 +12,9 @@ subjects:
 - kind: ServiceAccount
   name: cilium
   namespace: kube-system
+- kind: ServiceAccount
+  name: cloudcore
+  namespace: kubeedge
+- kind: ServiceAccount
+  name: cloudcore
+  namespace: default
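For scripted setups, the two extra subjects can be appended with a JSON Patch instead of an interactive edit. This is only a sketch: the helper names subject_add_op and cloudcore_subjects_patch are invented for this example, and it assumes the ClusterRoleBinding is named cilium as shown above.

```shell
#!/bin/sh
# Hypothetical helper: prints one JSON Patch "add" operation that appends a
# ServiceAccount subject. $1 = ServiceAccount name, $2 = namespace.
subject_add_op() {
  printf '{"op":"add","path":"/subjects/-","value":{"kind":"ServiceAccount","name":"%s","namespace":"%s"}}' "$1" "$2"
}

# Hypothetical helper: the two cloudcore subjects from the diff above,
# combined into a single JSON Patch document.
cloudcore_subjects_patch() {
  printf '[%s,%s]' \
    "$(subject_add_op cloudcore kubeedge)" \
    "$(subject_add_op cloudcore default)"
}

# With cluster access, the patch could be applied like this:
# kubectl patch clusterrolebinding cilium --type=json -p "$(cloudcore_subjects_patch)"
```

The "/subjects/-" path appends to the end of the list, so running the patch twice would add duplicate subjects; it is meant to be applied once.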
Finally, obtain the token after CloudCore has restarted.
> keadm gettoken
<TOKEN_HASH>
KubeEdge EdgeCore setup
Using the token obtained above, start EdgeCore and join it to the cluster.
> keadm join --cloudcore-ipport=AA.BBB.CCC.DD:10000 --kubeedge-version=v1.16.1 --cgroupdriver=systemd --token <TOKEN_HASH>
...<snip>
I0604 21:36:31.040859 2118064 join_others.go:265] KubeEdge edgecore is running, For logs visit: journalctl -u edgecore.service -xe
I0604 21:36:41.050154 2118064 join.go:94] 9. Install Complete!
> systemctl status edgecore
● edgecore.service
Loaded: loaded (/etc/systemd/system/edgecore.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2024-06-04 21:36:31 PDT; 40s ago
Main PID: 2118341 (edgecore)
Tasks: 24 (limit: 18670)
Memory: 31.8M
CPU: 849ms
CGroup: /system.slice/edgecore.service
└─2118341 /usr/local/bin/edgecore
Once EdgeCore is running, enable ServiceBus and MetaServer by editing edgecore.yaml. The patch below also sets clusterDNS.
### Edit and apply the following patch
> vi /etc/kubeedge/config/edgecore.yaml
### Restart edgecore systemd-service
> systemctl restart edgecore
diff --git a/edgecore.yaml b/edgecore.yaml
index 8d17418..5391776 100644
--- a/edgecore.yaml
+++ b/edgecore.yaml
@@ -62,6 +62,8 @@ modules:
       cgroupDriver: systemd
       cgroupsPerQOS: true
       clusterDomain: cluster.local
+      clusterDNS:
+      - 10.96.0.10
       configMapAndSecretChangeDetectionStrategy: Get
       containerLogMaxFiles: 5
       containerLogMaxSize: 10Mi
@@ -151,7 +151,7 @@ modules:
     enable: true
     metaServer:
       apiAudiences: null
-      enable: false
+      enable: true
       server: 127.0.0.1:10550
       serviceAccountIssuers:
       - https://kubernetes.default.svc.cluster.local
@@ -161,7 +161,7 @@ modules:
       tlsPrivateKeyFile: /etc/kubeedge/certs/server.key
     remoteQueryTimeout: 60
   serviceBus:
-    enable: false
+    enable: true
     port: 9060
     server: 127.0.0.1
     timeout: 60
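The two enable flags in the patch above can also be flipped with a small filter rather than by hand. The sketch below is an illustration, not a KubeEdge tool: enable_edge_modules is a made-up helper, and it assumes the key layout shown in the diff, where the first "enable:" key after a "metaServer:" or "serviceBus:" line belongs to that module. It does not add the clusterDNS entry, which still has to be set separately.

```shell
#!/bin/sh
# Hypothetical helper: reads edgecore.yaml on stdin and sets the first
# "enable:" key that follows a "metaServer:" or "serviceBus:" line to true.
# All other lines, including other modules' enable flags, pass through unchanged.
enable_edge_modules() {
  awk '
    /^[ \t]*(metaServer|serviceBus):[ \t]*$/ { pending = 1 }
    pending && /^[ \t]*enable:[ \t]*(true|false)[ \t]*$/ {
      sub(/enable:.*$/, "enable: true")
      pending = 0
    }
    { print }
  '
}

# Possible usage on the edge node (as root), keeping a backup of the original:
# cp /etc/kubeedge/config/edgecore.yaml /etc/kubeedge/config/edgecore.yaml.bak
# enable_edge_modules < /etc/kubeedge/config/edgecore.yaml.bak > /etc/kubeedge/config/edgecore.yaml
# systemctl restart edgecore
```

Comparing the result against the backup with diff before restarting EdgeCore is a cheap way to confirm that only the intended lines changed.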
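Next, create a separate cilium-agent DaemonSet for the EdgeCore nodes. The cilium-agent pods must be deployed on the nodes that run KubeEdge EdgeCore, which carry the label node-role.kubernetes.io/edge= (the key matched by the patch below). In addition, cilium-agent must query the API through MetaServer rather than contacting the Kubernetes API server directly, so that the edge autonomy provided by KubeEdge is preserved.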
### Dump original Cilium DaemonSet configuration
> kubectl get ds -n kube-system cilium -o yaml > cilium-edgecore.yaml
### Edit and apply the following patch
> vi cilium-edgecore.yaml
### Deploy the cilium-agent DaemonSet for EdgeCore nodes
> kubectl apply -f cilium-edgecore.yaml
diff --git a/cilium-edgecore.yaml b/cilium-edgecore.yaml
index bff0f0b..3d941d1 100644
--- a/cilium-edgecore.yaml
+++ b/cilium-edgecore.yaml
@@ -8,7 +8,7 @@ metadata:
     app.kubernetes.io/name: cilium-agent
     app.kubernetes.io/part-of: cilium
     k8s-app: cilium
-  name: cilium
+  name: cilium-kubeedge
   namespace: kube-system
 spec:
   revisionHistoryLimit: 10
@@ -29,6 +29,12 @@ spec:
         k8s-app: cilium
     spec:
       affinity:
+        nodeAffinity:
+          requiredDuringSchedulingIgnoredDuringExecution:
+            nodeSelectorTerms:
+            - matchExpressions:
+              - key: node-role.kubernetes.io/edge
+                operator: Exists
         podAntiAffinity:
           requiredDuringSchedulingIgnoredDuringExecution:
           - labelSelector:
@@ -39,6 +45,8 @@ spec:
       containers:
       - args:
         - --config-dir=/tmp/cilium/config-map
+        - --k8s-api-server=127.0.0.1:10550
+        - --auto-create-cilium-node-resource=true
         - --debug
         command:
         - cilium-agent
@@ -178,7 +186,9 @@ spec:
       dnsPolicy: ClusterFirst
       hostNetwork: true
       initContainers:
-      - command:
+      - args:
+        - --k8s-api-server=127.0.0.1:10550
+        command:
         - cilium
         - build-config
         env:
As the output below shows, cilium-pq45v (the standard cilium-agent pod) runs on the cloud node, while cilium-kubeedge-kkb7z (from the EdgeCore-specific DaemonSet) runs alongside EdgeCore on the edge node.
> kubectl get pods -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system cilium-kubeedge-kkb7z 1/1 Running 0 32s 43.135.146.155 edgemaster <none> <none>
kube-system cilium-operator-fdf6bc9f4-445p6 1/1 Running 0 3h40m AA.BBB.CCC.DD tomoyafujita <none> <none>
kube-system cilium-pq45v 1/1 Running 0 3h32m AA.BBB.CCC.DD tomoyafujita <none> <none>
kube-system coredns-787d4945fb-2bbdf 1/1 Running 0 8h 10.0.0.104 tomoyafujita <none> <none>
kube-system coredns-787d4945fb-nmd2p 1/1 Running 0 8h 10.0.0.130 tomoyafujita <none> <none>
kube-system etcd-tomoyafujita 1/1 Running 0 8h AA.BBB.CCC.DD tomoyafujita <none> <none>
kube-system kube-apiserver-tomoyafujita 1/1 Running 1 8h AA.BBB.CCC.DD tomoyafujita <none> <none>
kube-system kube-controller-manager-tomoyafujita 1/1 Running 0 8h AA.BBB.CCC.DD tomoyafujita <none> <none>
kube-system kube-proxy-qmxqp 1/1 Running 0 19m 43.135.146.155 edgemaster <none> <none>
kube-system kube-proxy-v2ht7 1/1 Running 0 8h AA.BBB.CCC.DD tomoyafujita <none> <none>
kube-system kube-scheduler-tomoyafujita 1/1 Running 1 8h AA.BBB.CCC.DD tomoyafujita <none> <none>
kubeedge cloudcore-df8544847-6mlw2 1/1 Running 0 4h23m AA.BBB.CCC.DD tomoyafujita <none> <none>
kubeedge edge-eclipse-mosquitto-9cw6r 1/1 Running 0 19m 43.135.146.155 edgemaster <none> <none>
Check Cilium connectivity from pods
Cilium is now ready to provide network connectivity for application pods and containers. A busybox DaemonSet can be used to test connectivity through Cilium.
> cat busybox.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: busybox
spec:
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      containers:
      - image: busybox
        command: ["sleep", "3600"]
        imagePullPolicy: IfNotPresent
        name: busybox
> kubectl apply -f busybox.yaml
daemonset.apps/busybox created
> kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
busybox-mn98w 1/1 Running 0 84s 10.0.0.58 tomoyafujita <none> <none>
busybox-z2mbw 1/1 Running 0 84s 10.0.1.121 edgemaster <none> <none>
> kubectl exec --stdin --tty busybox-mn98w -- /bin/sh
/ #
/ # ping 10.0.1.121
PING 10.0.1.121 (10.0.1.121): 56 data bytes
64 bytes from 10.0.1.121: seq=0 ttl=63 time=1.326 ms
64 bytes from 10.0.1.121: seq=1 ttl=63 time=1.620 ms
64 bytes from 10.0.1.121: seq=2 ttl=63 time=1.341 ms
64 bytes from 10.0.1.121: seq=3 ttl=63 time=1.685 ms
^C
--- 10.0.1.121 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 1.326/1.493/1.685 ms
/ # exit
> kubectl exec --stdin --tty busybox-z2mbw -- /bin/sh
/ #
/ # ping 10.0.0.58
PING 10.0.0.58 (10.0.0.58): 56 data bytes
64 bytes from 10.0.0.58: seq=0 ttl=63 time=0.728 ms
64 bytes from 10.0.0.58: seq=1 ttl=63 time=1.178 ms
64 bytes from 10.0.0.58: seq=2 ttl=63 time=0.635 ms
64 bytes from 10.0.0.58: seq=3 ttl=63 time=1.152 ms
^C
--- 10.0.0.58 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.635/0.923/1.178 ms
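Cross-node communication between the busybox containers over Cilium works as expected!
For more technical details and development progress, follow and subscribe to KubeEdge EdgeCore supports Cilium CNI[6].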
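The manual ping check between pods can be wrapped in a small script for repeated runs. This is a sketch under stated assumptions: ping_loss_pct and check_pod_ping are hypothetical helpers, the pod names and IP addresses are the ones from the output above and will differ in other clusters, and the parsing relies on the busybox ping summary line in the format shown there.

```shell
#!/bin/sh
# Hypothetical helper: extracts the packet-loss percentage from ping output
# read on stdin (expects a summary line such as "... 0% packet loss").
ping_loss_pct() {
  sed -n 's/.* \([0-9][0-9]*\)% packet loss.*/\1/p'
}

# Hypothetical helper: pings IP $2 from pod $1 four times and succeeds only
# if the reported packet loss is 0%.
check_pod_ping() {
  loss=$(kubectl exec "$1" -- ping -c 4 "$2" | ping_loss_pct)
  [ "$loss" = "0" ]
}

# Possible usage with the pods listed above (needs cluster access):
# check_pod_ping busybox-mn98w 10.0.1.121 && echo "cloud -> edge OK"
# check_pod_ping busybox-z2mbw 10.0.0.58 && echo "edge -> cloud OK"
```

Using ping -c 4 instead of an interactive session lets the check terminate on its own, which makes it usable from a script.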
[1] KubeEdge EdgeCore supports Cilium CNI: https://github.com/kubeedge/kubeedge/issues/4844
[2] Kubernetes compatibility: https://github.com/kubeedge/kubeedge?tab=readme-ov-file#kubernetes-compatibility
[3] KubeEdge prerequisites: https://kubeedge.io/docs/category/prerequisites
[4] Cilium Quick Installation guide: https://docs.cilium.io/en/stable/gettingstarted/k8s-install-default/
[5] Installing KubeEdge with Keadm: https://kubeedge.io/docs/setup/install-with-keadm
[6] KubeEdge EdgeCore supports Cilium CNI: https://github.com/kubeedge/kubeedge/issues/4844