
Installing Cilium on GKE

Overview

GKE Standard clusters do not offer productized Cilium support in their default network mode. The open-source Cilium community, however, does support container networking on GKE Standard clusters: when installed in a GKE environment, Cilium automatically applies a number of configuration adjustments to adapt to it.

Installation

Make sure the nodes carry the following taint (it keeps application pods off a node until the Cilium agent is ready to manage them):

taints:
  - key: "node.cilium.io/agent-not-ready"
    value: "true"
    effect: "NoExecute"

Then install Cilium with the cilium CLI:

cilium install --version 1.18.2
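Putting the two steps together, the taint can be applied at node-pool creation time with gcloud. A sketch, requiring a live GKE cluster; the cluster, pool, and zone names are placeholders:

```shell
# Hypothetical cluster/pool/zone names -- substitute your own.
# Create the node pool with the agent-not-ready taint so application
# pods are not scheduled before Cilium can manage them.
gcloud container node-pools create default-pool \
  --cluster gke-test \
  --zone us-west2-a \
  --node-taints node.cilium.io/agent-not-ready=true:NoExecute

# Install Cilium via the CLI, then wait until it reports ready.
cilium install --version 1.18.2
cilium status --wait
```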

YAML manifest

After Cilium is installed, the relevant YAML is as follows:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  annotations:
    deprecated.daemonset.template.generation: "1"
    meta.helm.sh/release-name: cilium
    meta.helm.sh/release-namespace: kube-system
  labels:
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: cilium-agent
    app.kubernetes.io/part-of: cilium
    k8s-app: cilium
  name: cilium
  namespace: kube-system
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: cilium
  template:
    metadata:
      annotations:
        kubectl.kubernetes.io/default-container: cilium-agent
      labels:
        app.kubernetes.io/name: cilium-agent
        app.kubernetes.io/part-of: cilium
        k8s-app: cilium
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                k8s-app: cilium
            topologyKey: kubernetes.io/hostname
      automountServiceAccountToken: true
      containers:
      - args:
        - --config-dir=/tmp/cilium/config-map
        command:
        - cilium-agent
        env:
        - name: K8S_NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        - name: CILIUM_K8S_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: CILIUM_CLUSTERMESH_CONFIG
          value: /var/lib/cilium/clustermesh/
        - name: GOMEMLIMIT
          valueFrom:
            resourceFieldRef:
              divisor: "1"
              resource: limits.memory
        - name: KUBE_CLIENT_BACKOFF_BASE
          value: "1"
        - name: KUBE_CLIENT_BACKOFF_DURATION
          value: "120"
        image: quay.io/cilium/cilium:v1.18.2@sha256:858f807ea4e20e85e3ea3240a762e1f4b29f1cb5bbd0463b8aa77e7b097c0667
        imagePullPolicy: IfNotPresent
        lifecycle:
          postStart:
            exec:
              command:
              - bash
              - -c
              - |
                set -o errexit
                set -o pipefail
                set -o nounset

                # When running in AWS ENI mode, it's likely that 'aws-node' has
                # had a chance to install SNAT iptables rules. These can result
                # in dropped traffic, so we should attempt to remove them.
                # We do it using a 'postStart' hook since this may need to run
                # for nodes which might have already been init'ed but may still
                # have dangling rules. This is safe because there are no
                # dependencies on anything that is part of the startup script
                # itself, and can be safely run multiple times per node (e.g. in
                # case of a restart).
                if [[ "$(iptables-save | grep -E -c 'AWS-SNAT-CHAIN|AWS-CONNMARK-CHAIN')" != "0" ]];
                then
                    echo 'Deleting iptables rules created by the AWS CNI VPC plugin'
                    iptables-save | grep -E -v 'AWS-SNAT-CHAIN|AWS-CONNMARK-CHAIN' | iptables-restore
                fi
                echo 'Done!'
          preStop:
            exec:
              command:
              - /cni-uninstall.sh
        livenessProbe:
          failureThreshold: 10
          httpGet:
            host: 127.0.0.1
            httpHeaders:
            - name: brief
              value: "true"
            - name: require-k8s-connectivity
              value: "false"
            path: /healthz
            port: 9879
            scheme: HTTP
          periodSeconds: 30
          successThreshold: 1
          timeoutSeconds: 5
        name: cilium-agent
        readinessProbe:
          failureThreshold: 3
          httpGet:
            host: 127.0.0.1
            httpHeaders:
            - name: brief
              value: "true"
            path: /healthz
            port: 9879
            scheme: HTTP
          periodSeconds: 30
          successThreshold: 1
          timeoutSeconds: 5
        securityContext:
          capabilities:
            add:
            - CHOWN
            - KILL
            - NET_ADMIN
            - NET_RAW
            - IPC_LOCK
            - SYS_MODULE
            - SYS_ADMIN
            - SYS_RESOURCE
            - DAC_OVERRIDE
            - FOWNER
            - SETGID
            - SETUID
            drop:
            - ALL
          seLinuxOptions:
            level: s0
            type: spc_t
        startupProbe:
          failureThreshold: 300
          httpGet:
            host: 127.0.0.1
            httpHeaders:
            - name: brief
              value: "true"
            path: /healthz
            port: 9879
            scheme: HTTP
          initialDelaySeconds: 5
          periodSeconds: 2
          successThreshold: 1
          timeoutSeconds: 1
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /var/run/cilium/envoy/sockets
          name: envoy-sockets
        - mountPath: /host/proc/sys/net
          name: host-proc-sys-net
        - mountPath: /host/proc/sys/kernel
          name: host-proc-sys-kernel
        - mountPath: /sys/fs/bpf
          mountPropagation: HostToContainer
          name: bpf-maps
        - mountPath: /var/run/cilium
          name: cilium-run
        - mountPath: /var/run/cilium/netns
          mountPropagation: HostToContainer
          name: cilium-netns
        - mountPath: /host/etc/cni/net.d
          name: etc-cni-netd
        - mountPath: /var/lib/cilium/clustermesh
          name: clustermesh-secrets
          readOnly: true
        - mountPath: /lib/modules
          name: lib-modules
          readOnly: true
        - mountPath: /run/xtables.lock
          name: xtables-lock
        - mountPath: /var/lib/cilium/tls/hubble
          name: hubble-tls
          readOnly: true
        - mountPath: /tmp
          name: tmp
      dnsPolicy: ClusterFirst
      hostNetwork: true
      initContainers:
      - command:
        - cilium-dbg
        - build-config
        env:
        - name: K8S_NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        - name: CILIUM_K8S_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        image: quay.io/cilium/cilium:v1.18.2@sha256:858f807ea4e20e85e3ea3240a762e1f4b29f1cb5bbd0463b8aa77e7b097c0667
        imagePullPolicy: IfNotPresent
        name: config
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /tmp
          name: tmp
      - command:
        - sh
        - -ec
        - |
          cp /usr/bin/cilium-mount /hostbin/cilium-mount;
          nsenter --cgroup=/hostproc/1/ns/cgroup --mount=/hostproc/1/ns/mnt "${BIN_PATH}/cilium-mount" $CGROUP_ROOT;
          rm /hostbin/cilium-mount
        env:
        - name: CGROUP_ROOT
          value: /run/cilium/cgroupv2
        - name: BIN_PATH
          value: /home/kubernetes/bin
        image: quay.io/cilium/cilium:v1.18.2@sha256:858f807ea4e20e85e3ea3240a762e1f4b29f1cb5bbd0463b8aa77e7b097c0667
        imagePullPolicy: IfNotPresent
        name: mount-cgroup
        securityContext:
          capabilities:
            add:
            - SYS_ADMIN
            - SYS_CHROOT
            - SYS_PTRACE
            drop:
            - ALL
          seLinuxOptions:
            level: s0
            type: spc_t
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /hostproc
          name: hostproc
        - mountPath: /hostbin
          name: cni-path
      - command:
        - sh
        - -ec
        - |
          cp /usr/bin/cilium-sysctlfix /hostbin/cilium-sysctlfix;
          nsenter --mount=/hostproc/1/ns/mnt "${BIN_PATH}/cilium-sysctlfix";
          rm /hostbin/cilium-sysctlfix
        env:
        - name: BIN_PATH
          value: /home/kubernetes/bin
        image: quay.io/cilium/cilium:v1.18.2@sha256:858f807ea4e20e85e3ea3240a762e1f4b29f1cb5bbd0463b8aa77e7b097c0667
        imagePullPolicy: IfNotPresent
        name: apply-sysctl-overwrites
        securityContext:
          capabilities:
            add:
            - SYS_ADMIN
            - SYS_CHROOT
            - SYS_PTRACE
            drop:
            - ALL
          seLinuxOptions:
            level: s0
            type: spc_t
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /hostproc
          name: hostproc
        - mountPath: /hostbin
          name: cni-path
      - args:
        - mount | grep "/sys/fs/bpf type bpf" || mount -t bpf bpf /sys/fs/bpf
        command:
        - /bin/bash
        - -c
        - --
        image: quay.io/cilium/cilium:v1.18.2@sha256:858f807ea4e20e85e3ea3240a762e1f4b29f1cb5bbd0463b8aa77e7b097c0667
        imagePullPolicy: IfNotPresent
        name: mount-bpf-fs
        securityContext:
          privileged: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /sys/fs/bpf
          mountPropagation: Bidirectional
          name: bpf-maps
      - command:
        - sh
        - -c
        - |
          until test -s "/tmp/cilium-bootstrap.d/cilium-bootstrap-time"; do
            echo "Waiting on node-init to run...";
            sleep 1;
          done
        image: quay.io/cilium/cilium:v1.18.2@sha256:858f807ea4e20e85e3ea3240a762e1f4b29f1cb5bbd0463b8aa77e7b097c0667
        imagePullPolicy: IfNotPresent
        name: wait-for-node-init
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /tmp/cilium-bootstrap.d
          name: cilium-bootstrap-file-dir
      - command:
        - /init-container.sh
        env:
        - name: CILIUM_ALL_STATE
          valueFrom:
            configMapKeyRef:
              key: clean-cilium-state
              name: cilium-config
              optional: true
        - name: CILIUM_BPF_STATE
          valueFrom:
            configMapKeyRef:
              key: clean-cilium-bpf-state
              name: cilium-config
              optional: true
        - name: WRITE_CNI_CONF_WHEN_READY
          valueFrom:
            configMapKeyRef:
              key: write-cni-conf-when-ready
              name: cilium-config
              optional: true
        image: quay.io/cilium/cilium:v1.18.2@sha256:858f807ea4e20e85e3ea3240a762e1f4b29f1cb5bbd0463b8aa77e7b097c0667
        imagePullPolicy: IfNotPresent
        name: clean-cilium-state
        securityContext:
          capabilities:
            add:
            - NET_ADMIN
            - SYS_MODULE
            - SYS_ADMIN
            - SYS_RESOURCE
            drop:
            - ALL
          seLinuxOptions:
            level: s0
            type: spc_t
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /sys/fs/bpf
          name: bpf-maps
        - mountPath: /run/cilium/cgroupv2
          mountPropagation: HostToContainer
          name: cilium-cgroup
        - mountPath: /var/run/cilium
          name: cilium-run
      - command:
        - /install-plugin.sh
        image: quay.io/cilium/cilium:v1.18.2@sha256:858f807ea4e20e85e3ea3240a762e1f4b29f1cb5bbd0463b8aa77e7b097c0667
        imagePullPolicy: IfNotPresent
        name: install-cni-binaries
        resources:
          requests:
            cpu: 100m
            memory: 10Mi
        securityContext:
          capabilities:
            drop:
            - ALL
          seLinuxOptions:
            level: s0
            type: spc_t
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /host/opt/cni/bin
          name: cni-path
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-node-critical
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        appArmorProfile:
          type: Unconfined
        seccompProfile:
          type: Unconfined
      serviceAccount: cilium
      serviceAccountName: cilium
      terminationGracePeriodSeconds: 1
      tolerations:
      - operator: Exists
      volumes:
      - name: tmp
      - hostPath:
          path: /var/run/cilium
          type: DirectoryOrCreate
        name: cilium-run
      - hostPath:
          path: /var/run/netns
          type: DirectoryOrCreate
        name: cilium-netns
      - hostPath:
          path: /sys/fs/bpf
          type: DirectoryOrCreate
        name: bpf-maps
      - hostPath:
          path: /proc
          type: Directory
        name: hostproc
      - hostPath:
          path: /run/cilium/cgroupv2
          type: DirectoryOrCreate
        name: cilium-cgroup
      - hostPath:
          path: /home/kubernetes/bin
          type: DirectoryOrCreate
        name: cni-path
      - hostPath:
          path: /etc/cni/net.d
          type: DirectoryOrCreate
        name: etc-cni-netd
      - hostPath:
          path: /lib/modules
          type: ""
        name: lib-modules
      - hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
        name: xtables-lock
      - hostPath:
          path: /var/run/cilium/envoy/sockets
          type: DirectoryOrCreate
        name: envoy-sockets
      - hostPath:
          path: /tmp/cilium-bootstrap.d
          type: DirectoryOrCreate
        name: cilium-bootstrap-file-dir
      - name: clustermesh-secrets
        projected:
          defaultMode: 256
          sources:
          - secret:
              name: cilium-clustermesh
              optional: true
          - secret:
              items:
              - key: tls.key
                path: common-etcd-client.key
              - key: tls.crt
                path: common-etcd-client.crt
              - key: ca.crt
                path: common-etcd-client-ca.crt
              name: clustermesh-apiserver-remote-cert
              optional: true
          - secret:
              items:
              - key: tls.key
                path: local-etcd-client.key
              - key: tls.crt
                path: local-etcd-client.crt
              - key: ca.crt
                path: local-etcd-client-ca.crt
              name: clustermesh-apiserver-local-cert
              optional: true
      - hostPath:
          path: /proc/sys/net
          type: Directory
        name: host-proc-sys-net
      - hostPath:
          path: /proc/sys/kernel
          type: Directory
        name: host-proc-sys-kernel
      - name: hubble-tls
        projected:
          defaultMode: 256
          sources:
          - secret:
              items:
              - key: tls.crt
                path: server.crt
              - key: tls.key
                path: server.key
              - key: ca.crt
                path: client-ca.crt
              name: hubble-server-certs
              optional: true
  updateStrategy:
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 2
    type: RollingUpdate
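The manifest above can be pulled straight from a live cluster for comparison; a couple of read-only kubectl checks (requires cluster access):

```shell
# Dump the installed DaemonSet to compare against the manifest above.
kubectl -n kube-system get daemonset cilium -o yaml

# Watch a rollout; with maxSurge: 0 / maxUnavailable: 2, at most two
# nodes at a time run without a ready agent during an update.
kubectl -n kube-system rollout status daemonset/cilium
```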

CNI configuration

After Cilium is installed, the CNI configuration becomes:

/etc/cni/net.d/05-cilium.conflist

{
  "cniVersion": "0.3.1",
  "name": "cilium",
  "plugins": [
    {
      "type": "cilium-cni",
      "enable-debug": false,
      "log-file": "/var/run/cilium/cilium-cni.log"
    }
  ]
}
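A quick local sanity check of such a conflist (no cluster needed; the file is recreated under /tmp here so the snippet is self-contained):

```shell
# Recreate the conflist locally and verify it is valid JSON whose plugin
# chain contains cilium-cni.
mkdir -p /tmp/cni-check
cat > /tmp/cni-check/05-cilium.conflist <<'EOF'
{
  "cniVersion": "0.3.1",
  "name": "cilium",
  "plugins": [
    {
      "type": "cilium-cni",
      "enable-debug": false,
      "log-file": "/var/run/cilium/cilium-cni.log"
    }
  ]
}
EOF

# json.tool fails on invalid JSON, so this doubles as a syntax check.
python3 -m json.tool /tmp/cni-check/05-cilium.conflist > /dev/null \
  && grep -q '"type": "cilium-cni"' /tmp/cni-check/05-cilium.conflist \
  && echo "conflist OK"
```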

cilium-dbg

$ kubectl exec -i -t cilium-69x5v -- bash
root@gke-gke-test-default-pool-e5de334e-9jms:/home/cilium# cilium-dbg status
KVStore:                 Disabled
Kubernetes:              Ok   1.34 (v1.34.0-gke.1662000) [linux/amd64]
Kubernetes APIs:         ["EndpointSliceOrEndpoint", "cilium/v2::CiliumCIDRGroup", "cilium/v2::CiliumClusterwideNetworkPolicy", "cilium/v2::CiliumEndpoint", "cilium/v2::CiliumNetworkPolicy", "cilium/v2::CiliumNode", "core/v1::Pods", "networking.k8s.io/v1::NetworkPolicy"]
KubeProxyReplacement:    False
Host firewall:           Disabled
SRv6:                    Disabled
CNI Chaining:            none
CNI Config file:         successfully wrote CNI configuration file to /host/etc/cni/net.d/05-cilium.conflist
Cilium:                  Ok   1.18.2 (v1.18.2-5bd307a8)
NodeMonitor:             Listening for events on 2 CPUs with 64x4096 of shared memory
Cilium health daemon:    Ok
IPAM:                    IPv4: 11/254 allocated from 10.44.0.0/24,
IPv4 BIG TCP:            Disabled
IPv6 BIG TCP:            Disabled
BandwidthManager:        Disabled
Routing:                 Network: Native   Host: Legacy
Attach Mode:             TCX
Device Mode:             veth
Masquerading:            IPTables [IPv4: Enabled, IPv6: Disabled]
Controller Status:       62/62 healthy
Proxy Status:            OK, ip 10.44.0.205, 0 redirects active on ports 10000-20000, Envoy: external
Global Identity Range:   min 256, max 65535
Hubble:                  Ok   Current/Max Flows: 4095/4095 (100.00%), Flows/s: 14.39   Metrics: Disabled
Encryption:              Disabled
Cluster health:          3/3 reachable   (2025-09-23T07:47:31Z)   (Probe interval: 1m56.754608943s)
Name                     IP   Node   Endpoints
Modules Health:          Stopped(13) Degraded(0) OK(88)
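Beyond the per-agent view above, the cilium CLI can check the installation end to end. A sketch, requiring a working kubeconfig; note the connectivity test deploys test workloads into a dedicated namespace in the cluster:

```shell
# Summarize agent and operator health from the workstation.
cilium status

# Run Cilium's end-to-end connectivity test suite against the cluster.
cilium connectivity test
```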