
Cilium and NodeLocal DNSCache Coexistence

Overview

NodeLocal DNSCache provides DNS caching and acceleration: it reduces the load on coredns and improves DNS query performance.

This article describes how a TKE cluster with Cilium installed can run NodeLocal DNSCache alongside it.

Incompatible with the TKE NodeLocalDNSCache Add-on

When Cilium is installed as a kube-proxy replacement, requests to coredns are intercepted and forwarded by Cilium's eBPF programs, so they can no longer be intercepted by the node-local-dns Pod on the node. DNS caching therefore does not happen, and the add-on's functionality is effectively lost.
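
If in doubt, you can confirm that kube-proxy replacement is active by reading Cilium's ConfigMap. This is only a quick sketch: it assumes the Helm default ConfigMap name cilium-config; a value of "true" (or "strict" on older releases) means Service traffic, including DNS requests to the kube-dns Service, is handled by the eBPF datapath.

# Print the kube-proxy replacement mode from the Cilium ConfigMap
kubectl -n kube-system get configmap cilium-config \
  -o jsonpath='{.data.kube-proxy-replacement}'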

Cilium officially documents a way to coexist with NodeLocal DNSCache by configuring a CiliumLocalRedirectPolicy. However, if you use the TKE NodeLocalDNSCache add-on, even a CiliumLocalRedirectPolicy cannot make the two coexist: the add-on runs with HostNetwork and does not listen on the node/Pod IP (it listens on 169.254.20.10 and the Cluster IP of kube-dns), so DNS traffic cannot be redirected by a CiliumLocalRedirectPolicy to the node-local-dns Pod on the local node.
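
To see this on your own cluster, you can list the DaemonSets in kube-system and check which of them run with hostNetwork. The command below is only an illustration (the column names are arbitrary) and does not assume any particular add-on name:

# Show each DaemonSet and whether its Pods use the host network namespace;
# a hostNetwork node-local-dns cannot be targeted by a CiliumLocalRedirectPolicy.
kubectl -n kube-system get ds \
  -o custom-columns=NAME:.metadata.name,HOSTNETWORK:.spec.template.spec.hostNetwork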

Therefore, if you want to use NodeLocal DNSCache in a cluster where Cilium is installed, it is recommended to deploy NodeLocal DNSCache yourself, as described below.

Deploy NodeLocal DNSCache Yourself

  1. Save the following content to a file named node-local-dns.yaml:
Note

The content below is derived from the upstream node-local-dns deployment manifest nodelocaldns.yaml, adapted following the Manual Configuration approach in the official Cilium document Node-local DNS cache. In addition, the image is replaced with a mirror on Docker Hub so that it can be pulled directly over the private network in a TKE environment, and HINFO requests are disabled to stop the logs from constantly reporting errors (the VPC DNS service does not support HINFO requests).

node-local-dns.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: node-local-dns
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: v1
kind: Service
metadata:
  name: kube-dns-upstream
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "KubeDNSUpstream"
spec:
  ports:
  - name: dns
    port: 53
    protocol: UDP
    targetPort: 53
  - name: dns-tcp
    port: 53
    protocol: TCP
    targetPort: 53
  selector:
    k8s-app: kube-dns
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: node-local-dns
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
data:
  Corefile: |
    __PILLAR__DNS__DOMAIN__:53 {
        errors
        cache {
            success 9984 30
            denial 9984 5
        }
        reload
        loop
        bind 0.0.0.0
        forward . __PILLAR__CLUSTER__DNS__ {
            force_tcp
        }
        prometheus :9253
        health
    }
    in-addr.arpa:53 {
        errors
        cache 30
        reload
        loop
        bind 0.0.0.0
        forward . __PILLAR__CLUSTER__DNS__ {
            force_tcp
        }
        prometheus :9253
    }
    ip6.arpa:53 {
        errors
        cache 30
        reload
        loop
        bind 0.0.0.0
        forward . __PILLAR__CLUSTER__DNS__ {
            force_tcp
        }
        prometheus :9253
    }
    .:53 {
        template ANY HINFO . {
            rcode NXDOMAIN
        }
        errors
        cache 30
        reload
        loop
        bind 0.0.0.0
        forward . __PILLAR__UPSTREAM__SERVERS__
        prometheus :9253
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-local-dns
  namespace: kube-system
  labels:
    k8s-app: node-local-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 10%
  selector:
    matchLabels:
      k8s-app: node-local-dns
  template:
    metadata:
      labels:
        k8s-app: node-local-dns
      annotations:
        prometheus.io/port: "9253"
        prometheus.io/scrape: "true"
    spec:
      priorityClassName: system-node-critical
      serviceAccountName: node-local-dns
      hostNetwork: false
      dnsPolicy: Default # Don't use cluster DNS.
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      - effect: "NoExecute"
        operator: "Exists"
      - effect: "NoSchedule"
        operator: "Exists"
      containers:
      - name: node-cache
        image: docker.io/k8smirror/k8s-dns-node-cache:1.26.4
        resources:
          requests:
            cpu: 25m
            memory: 5Mi
        args: ["-localip", "__PILLAR__DNS__SERVER__", "-conf", "/etc/Corefile", "-upstreamsvc", "kube-dns-upstream", "-skipteardown=true", "-setupinterface=false", "-setupiptables=false"]
        securityContext:
          capabilities:
            add:
            - NET_ADMIN
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        - containerPort: 9253
          name: metrics
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 60
          timeoutSeconds: 5
        volumeMounts:
        - mountPath: /run/xtables.lock
          name: xtables-lock
          readOnly: false
        - name: config-volume
          mountPath: /etc/coredns
        - name: kube-dns-config
          mountPath: /etc/kube-dns
      volumes:
      - name: xtables-lock
        hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
      - name: kube-dns-config
        configMap:
          name: kube-dns
          optional: true
      - name: config-volume
        configMap:
          name: node-local-dns
          items:
          - key: Corefile
            path: Corefile.base
---
# A headless service is a service with a service IP but instead of load-balancing it will return the IPs of our associated Pods.
# We use this to expose metrics to Prometheus.
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/port: "9253"
    prometheus.io/scrape: "true"
  labels:
    k8s-app: node-local-dns
  name: node-local-dns
  namespace: kube-system
spec:
  clusterIP: None
  ports:
  - name: metrics
    port: 9253
    targetPort: 9253
  selector:
    k8s-app: node-local-dns
  2. Install it (the sed command substitutes __PILLAR__DNS__SERVER__ with the Cluster IP of the kube-dns Service; a verification sketch follows after step 4):

    kubedns=$(kubectl get svc kube-dns -n kube-system -o jsonpath={.spec.clusterIP}) && sed -i "s/__PILLAR__DNS__SERVER__/$kubedns/g;" node-local-dns.yaml
    kubectl apply -f node-local-dns.yaml
  3. Save the following content to a file named localdns-redirect-policy.yaml:

    localdns-redirect-policy.yaml
    apiVersion: cilium.io/v2
    kind: CiliumLocalRedirectPolicy
    metadata:
      name: nodelocaldns
      namespace: kube-system
    spec:
      redirectFrontend:
        serviceMatcher:
          serviceName: kube-dns
          namespace: kube-system
      redirectBackend:
        localEndpointSelector:
          matchLabels:
            k8s-app: node-local-dns
        toPorts:
        - port: "53"
          name: dns
          protocol: UDP
        - port: "53"
          name: dns-tcp
          protocol: TCP
  4. Create the CiliumLocalRedirectPolicy (it redirects DNS requests to the node-local-dns Pod on the local node):

    kubectl apply -f localdns-redirect-policy.yaml
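
After both manifests are applied, the setup can be sanity-checked. The commands below are a sketch based on the labels and resource names used in the YAML above; the test Pod image busybox:1.36 is only an example of an image that ships nslookup:

# node-local-dns Pods should be Running on every node
kubectl -n kube-system get pods -l k8s-app=node-local-dns -o wide

# the redirect policy should exist
kubectl -n kube-system get ciliumlocalredirectpolicies.cilium.io nodelocaldns

# DNS resolution should still work from an ordinary Pod,
# now served by the node-local cache instead of coredns directly
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 -- \
  nslookup kubernetes.default.svc.cluster.local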

FAQ

sed error: extra characters at the end of n command

When running the NodeLocal DNSCache installation command on macOS, sed reports an error:

$ kubedns=$(kubectl get svc kube-dns -n kube-system -o jsonpath={.spec.clusterIP}) && sed -i "s/__PILLAR__DNS__SERVER__/$kubedns/g;" node-local-dns.yaml

sed: 1: "node-local-dns.yaml
": extra characters at the end of n command

This happens because the sed that ships with macOS is the BSD variant rather than GNU sed, and its syntax differs slightly. One fix is to install GNU sed:

brew install gnu-sed

and add it to your PATH:

PATH="/usr/local/opt/gnu-sed/libexec/gnubin:$PATH"

Finally, open a new terminal and re-run the installation command.
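
Alternatively, the stock BSD sed can be used without installing anything: its -i flag requires an explicit (possibly empty) backup suffix, so the same substitution can be written as shown below.

# BSD sed: pass an empty string to -i to edit in place without a backup file
kubedns=$(kubectl get svc kube-dns -n kube-system -o jsonpath={.spec.clusterIP}) && \
  sed -i '' "s/__PILLAR__DNS__SERVER__/$kubedns/g;" node-local-dns.yaml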

Unable to Create a CiliumLocalRedirectPolicy

CiliumLocalRedirectPolicy support is not enabled by default; it must be turned on at install time with --set localRedirectPolicies.enabled=true.

If Cilium is already installed, enable it by updating the Cilium configuration as follows:


helm upgrade cilium cilium/cilium --version 1.18.3 \
--namespace kube-system \
--reuse-values \
--set localRedirectPolicies.enabled=true

Then restart the operator and agent for the change to take effect:

kubectl rollout restart deploy cilium-operator -n kube-system
kubectl rollout restart ds cilium -n kube-system
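
After the restart, you can check that the feature is on and that the CRD is registered. This is a sketch assuming the Helm default resource names; in recent releases the ConfigMap key enable-local-redirect-policy corresponds to the Helm value above, but if it differs in your version, inspect the cilium-config ConfigMap directly:

# Should print "true" once the Helm value has been applied
kubectl -n kube-system get configmap cilium-config \
  -o jsonpath='{.data.enable-local-redirect-policy}'

# The CiliumLocalRedirectPolicy CRD should be installed
kubectl get crd ciliumlocalredirectpolicies.cilium.io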

References