
Cilium and NodeLocal DNSCache Coexistence

Overview

NodeLocal DNSCache provides DNS caching and acceleration: it reduces the load on coredns and improves DNS query performance.

This article describes how a TKE cluster with Cilium installed can run NodeLocal DNSCache alongside it.

Incompatible with the TKE NodeLocalDNSCache Add-on

When Cilium is installed as a kube-proxy replacement, requests to coredns are intercepted and forwarded by Cilium's eBPF programs, so they can no longer be intercepted by the node-local-dns Pod on the node. DNS caching therefore does not happen, and the add-on's functionality is effectively lost.
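
If in doubt, you can confirm that kube-proxy replacement is active by reading Cilium's ConfigMap. This is only a quick sketch: it assumes the Helm default ConfigMap name cilium-config; a value of "true" (or "strict" on older releases) means Service traffic, including DNS requests to the kube-dns Service, is handled by the eBPF datapath.

# Print the kube-proxy replacement mode from the Cilium ConfigMap
kubectl -n kube-system get configmap cilium-config \
  -o jsonpath='{.data.kube-proxy-replacement}'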

Cilium officially documents a way to coexist with NodeLocal DNSCache by configuring a CiliumLocalRedirectPolicy. However, if you use the TKE NodeLocalDNSCache add-on, even a CiliumLocalRedirectPolicy cannot make the two coexist: the add-on runs with HostNetwork and does not listen on the node/Pod IP (it listens on 169.254.20.10 and the Cluster IP of kube-dns), so DNS traffic cannot be redirected by a CiliumLocalRedirectPolicy to the node-local-dns Pod on the local node.
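
To see this on your own cluster, you can list the DaemonSets in kube-system and check which of them run with hostNetwork. The command below is only an illustration (the column names are arbitrary) and does not assume any particular add-on name:

# Show each DaemonSet and whether its Pods use the host network namespace;
# a hostNetwork node-local-dns cannot be targeted by a CiliumLocalRedirectPolicy.
kubectl -n kube-system get ds \
  -o custom-columns=NAME:.metadata.name,HOSTNETWORK:.spec.template.spec.hostNetwork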

Therefore, if you want to use NodeLocal DNSCache in a cluster where Cilium is installed, it is recommended to deploy NodeLocal DNSCache yourself, as described below.

Deploy NodeLocal DNSCache Yourself

  1. Save the following content to a file named node-local-dns.yaml:
Note

The content below is derived from the upstream node-local-dns deployment manifest nodelocaldns.yaml, adapted following the Manual Configuration approach in the official Cilium document Node-local DNS cache. In addition, the image is replaced with a mirror on Docker Hub so that it can be pulled directly over the private network in a TKE environment, and HINFO requests are disabled to stop the logs from constantly reporting errors (the VPC DNS service does not support HINFO requests).

node-local-dns.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: node-local-dns
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: v1
kind: Service
metadata:
  name: kube-dns-upstream
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "KubeDNSUpstream"
spec:
  ports:
  - name: dns
    port: 53
    protocol: UDP
    targetPort: 53
  - name: dns-tcp
    port: 53
    protocol: TCP
    targetPort: 53
  selector:
    k8s-app: kube-dns
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: node-local-dns
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
data:
  Corefile: |
    __PILLAR__DNS__DOMAIN__:53 {
        errors
        cache {
            success 9984 30
            denial 9984 5
        }
        reload
        loop
        bind 0.0.0.0
        forward . __PILLAR__CLUSTER__DNS__ {
            force_tcp
        }
        prometheus :9253
        health
    }
    in-addr.arpa:53 {
        errors
        cache 30
        reload
        loop
        bind 0.0.0.0
        forward . __PILLAR__CLUSTER__DNS__ {
            force_tcp
        }
        prometheus :9253
    }
    ip6.arpa:53 {
        errors
        cache 30
        reload
        loop
        bind 0.0.0.0
        forward . __PILLAR__CLUSTER__DNS__ {
            force_tcp
        }
        prometheus :9253
    }
    .:53 {
        template ANY HINFO . {
            rcode NXDOMAIN
        }
        errors
        cache 30
        reload
        loop
        bind 0.0.0.0
        forward . __PILLAR__UPSTREAM__SERVERS__
        prometheus :9253
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-local-dns
  namespace: kube-system
  labels:
    k8s-app: node-local-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 10%
  selector:
    matchLabels:
      k8s-app: node-local-dns
  template:
    metadata:
      labels:
        k8s-app: node-local-dns
      annotations:
        prometheus.io/port: "9253"
        prometheus.io/scrape: "true"
    spec:
      priorityClassName: system-node-critical
      serviceAccountName: node-local-dns
      hostNetwork: false
      dnsPolicy: Default # Don't use cluster DNS.
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      - effect: "NoExecute"
        operator: "Exists"
      - effect: "NoSchedule"
        operator: "Exists"
      containers:
      - name: node-cache
        image: docker.io/k8smirror/k8s-dns-node-cache:1.26.4
        resources:
          requests:
            cpu: 25m
            memory: 5Mi
        args: ["-localip", "__PILLAR__DNS__SERVER__", "-conf", "/etc/Corefile", "-upstreamsvc", "kube-dns-upstream", "-skipteardown=true", "-setupinterface=false", "-setupiptables=false"]
        securityContext:
          capabilities:
            add:
            - NET_ADMIN
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        - containerPort: 9253
          name: metrics
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 60
          timeoutSeconds: 5
        volumeMounts:
        - mountPath: /run/xtables.lock
          name: xtables-lock
          readOnly: false
        - name: config-volume
          mountPath: /etc/coredns
        - name: kube-dns-config
          mountPath: /etc/kube-dns
      volumes:
      - name: xtables-lock
        hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
      - name: kube-dns-config
        configMap:
          name: kube-dns
          optional: true
      - name: config-volume
        configMap:
          name: node-local-dns
          items:
          - key: Corefile
            path: Corefile.base
---
# A headless service is a service with a service IP but instead of load-balancing it will return the IPs of our associated Pods.
# We use this to expose metrics to Prometheus.
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/port: "9253"
    prometheus.io/scrape: "true"
  labels:
    k8s-app: node-local-dns
  name: node-local-dns
  namespace: kube-system
spec:
  clusterIP: None
  ports:
  - name: metrics
    port: 9253
    targetPort: 9253
  selector:
    k8s-app: node-local-dns
  2. Install it (the sed command substitutes __PILLAR__DNS__SERVER__ with the Cluster IP of the kube-dns Service; a verification sketch follows after step 4):

    kubedns=$(kubectl get svc kube-dns -n kube-system -o jsonpath={.spec.clusterIP}) && sed -i "s/__PILLAR__DNS__SERVER__/$kubedns/g;" node-local-dns.yaml
    kubectl apply -f node-local-dns.yaml
  3. Save the following content to a file named localdns-redirect-policy.yaml:

    localdns-redirect-policy.yaml
    apiVersion: cilium.io/v2
    kind: CiliumLocalRedirectPolicy
    metadata:
      name: nodelocaldns
      namespace: kube-system
    spec:
      redirectFrontend:
        serviceMatcher:
          serviceName: kube-dns
          namespace: kube-system
      redirectBackend:
        localEndpointSelector:
          matchLabels:
            k8s-app: node-local-dns
        toPorts:
        - port: "53"
          name: dns
          protocol: UDP
        - port: "53"
          name: dns-tcp
          protocol: TCP
  4. Create the CiliumLocalRedirectPolicy (it redirects DNS requests to the node-local-dns Pod on the local node):

    kubectl apply -f localdns-redirect-policy.yaml
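
After both manifests are applied, the setup can be sanity-checked. The commands below are a sketch based on the labels and resource names used in the YAML above; the test Pod image busybox:1.36 is only an example of an image that ships nslookup:

# node-local-dns Pods should be Running on every node
kubectl -n kube-system get pods -l k8s-app=node-local-dns -o wide

# the redirect policy should exist
kubectl -n kube-system get ciliumlocalredirectpolicies.cilium.io nodelocaldns

# DNS resolution should still work from an ordinary Pod,
# now served by the node-local cache instead of coredns directly
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 -- \
  nslookup kubernetes.default.svc.cluster.local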

FAQ

sed error: extra characters at the end of n command

When running the NodeLocal DNSCache installation command on macOS, sed reports an error:

$ kubedns=$(kubectl get svc kube-dns -n kube-system -o jsonpath={.spec.clusterIP}) && sed -i "s/__PILLAR__DNS__SERVER__/$kubedns/g;" node-local-dns.yaml

sed: 1: "node-local-dns.yaml
": extra characters at the end of n command

This happens because the sed that ships with macOS is the BSD variant rather than GNU sed, and its syntax differs slightly. One fix is to install GNU sed:

brew install gnu-sed

and add it to your PATH:

PATH="/usr/local/opt/gnu-sed/libexec/gnubin:$PATH"

Finally, open a new terminal and re-run the installation command.
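
Alternatively, the stock BSD sed can be used without installing anything: its -i flag requires an explicit (possibly empty) backup suffix, so the same substitution can be written as shown below.

# BSD sed: pass an empty string to -i to edit in place without a backup file
kubedns=$(kubectl get svc kube-dns -n kube-system -o jsonpath={.spec.clusterIP}) && \
  sed -i '' "s/__PILLAR__DNS__SERVER__/$kubedns/g;" node-local-dns.yaml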

Unable to Create a CiliumLocalRedirectPolicy

CiliumLocalRedirectPolicy support is not enabled by default; it must be turned on at install time with --set localRedirectPolicies.enabled=true.

If Cilium is already installed, enable it by updating the Cilium configuration as follows:


helm upgrade cilium cilium/cilium --version 1.18.3 \
--namespace kube-system \
--reuse-values \
--set localRedirectPolicies.enabled=true

Then restart the operator and agent for the change to take effect:

kubectl rollout restart deploy cilium-operator -n kube-system
kubectl rollout restart ds cilium -n kube-system
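
After the restart, you can check that the feature is on and that the CRD is registered. This is a sketch assuming the Helm default resource names; in recent releases the ConfigMap key enable-local-redirect-policy corresponds to the Helm value above, but if it differs in your version, inspect the cilium-config ConfigMap directly:

# Should print "true" once the Helm value has been applied
kubectl -n kube-system get configmap cilium-config \
  -o jsonpath='{.data.enable-local-redirect-policy}'

# The CiliumLocalRedirectPolicy CRD should be installed
kubectl get crd ciliumlocalredirectpolicies.cilium.io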

References