PI-Hole high availability with Kubernetes

Here is what I just achieved in my setup...

Needs:
--Highest availability level possible for DNS
--Highest security level possible for DNS
--Solution requiring as little manual intervention as possible
--Solution ensuring data integrity

While trying to deploy an HA instance of PI-Hole in my Kubernetes cluster, I ran into the same problems many others have faced before me. After searching all over the place, I did not find an easy solution, so I accepted the challenge of doing it myself :slight_smile:

Problem No 1-Avoiding the GUI (not possible as of now)

Since PI-Hole v5, it is no longer possible to add your blocklists without manually clicking in the GUI. That pure Microsoft / Windows philosophy is not good at all for automation platforms like Kubernetes, so I had to work around it.

Problem No 2-SQLite (unavoidable as of now)

SQLite is another problem when multiple instances need access to the same resource. Because PI-Hole cannot be redirected to my existing HA cluster based on MariaDB, I had to deal with that one as well.

Problem No 3-Losing the client's source IP (lose some availability / flexibility or lose the source IP. Pick your poison).

When routing requests through the Kubernetes infrastructure of load balancers, ingresses and services, you often lose the client's source IP. The option to avoid that forces the use of the local network stack, which itself is not good for HA because you end up tied to a specific node / IP associated with a single container.
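For reference, the usual Kubernetes knob behind this trade-off is externalTrafficPolicy: Local, which keeps the client source IP but only delivers traffic to pods on the node that received it. A minimal sketch (the service name here is illustrative, not one of my actual services):
+-+-+-+-+-+-+
apiVersion: v1
kind: Service
metadata:
  name: pihole-udp-local   # illustrative name only
spec:
  type: LoadBalancer
  # Preserves the original client IP, but traffic is only delivered to pods
  # running on the receiving node, so you end up tied to specific nodes.
  externalTrafficPolicy: Local
  selector:
    app: pihole
  ports:
  - name: dnsudp
    protocol: UDP
    port: 53
    targetPort: 53
+-+-+-+-+-+-+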

So how did I solve all of that?

The environment:
--A cluster of 2 pfSense firewalls around which the entire network is built
--A Kubernetes cluster of 9 nodes (3 control plane and 6 workers)
--That K8S cluster is using Longhorn for storage, MetalLB for load balancing and both Nginx and HAProxy for ingress.

About point No 1:

I created two deployments in Kubernetes. Both are using two ReadWriteMany volumes from Longhorn. One is for /etc/pihole and the other is for /etc/dnsmasq.d.

The first deployment is limited to a single replica and mounts these volumes as RW. It also exposes port 53 (TCP and UDP) as well as port 80. That deployment must be started first so it creates the files needed by the others.

The second deployment can have many replicas (running 3 here) and mounts both volumes read-only. It exposes only port 53, not port 80.

As many options as possible are configured using environment variables (DNSSEC, upstream DNS, conditional forwarding, etc.).

The Kubernetes service points to both sets of pods (RW and RO). Because only the RW one has its port 80 open, only that one will present the GUI. That way, when connecting to the GUI, I am sure it is served by the right container. Should that single pod go down for any reason, the DNS service survives and management comes back within a minute.

About point No 2:

Although the volumes are ReadWriteMany, only a single container actually mounts them as RW. The others mount them as RO. Thanks to that, data integrity is ensured and SQLite is protected.

About point No 3:
--Network devices point to the pfSense cluster to get their DNS service
--pfSense resolves local names itself and queries PI-Hole for the rest
--Should PI-Hole not be available despite the HA (e.g. ESXi itself is rebooted, taking everything down), pfSense falls back to a public DNS as a last resort.

Thanks to that:
--pfSense can log everything and provide me with the client's source IP and its associated queries.
--DNS will always be available (if pfSense is down, the entire network is down...)
--PI-Hole will be enforced as long as it is running (and its availability is now very high)
--Management required from the GUI can be done in a safe way
--Changes are propagated to every PI-Hole instance thanks to the shared volume
--As many PI-Holes as desired can run together and share the load / ensure the service

Still, it would be good for PI-Hole to consider the realities of Kubernetes, but in the meantime I managed to work my way around every problem.

Some extra details...

To get reliable access to the management interface, I had to fix the IP address of this pod in my CNI (Calico). I also have BGP routing between the Kubernetes cluster and the LAN. Thanks to that, when needed, I can open the required access in the firewall and reach a pod directly. I had to do the same for my mail server (Poste.IO) so that it sees the clients' real IPs.

The second point is that the ReadOnly pods need to be restarted whenever I make a change on the RW pod. For that, I just delete the RO pods. I can delete all of them at once and DNS will be served by the RW pod, or I can delete them progressively so some remain online to answer queries, albeit with the previous config. See the commands below.
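For example, assuming the deployment names used in the manifests later in this thread (pihole-ro for the resolvers), the restart can be done either progressively or all at once:
+-+-+-+-+-+-+
# Progressive: recycle the RO pods one by one (the RollingUpdate strategy keeps resolvers online)
kubectl -n prod-services rollout restart deployment/pihole-ro

# All at once: scale to zero and back; the RW pod answers DNS in the meantime
kubectl -n prod-services scale deployment/pihole-ro --replicas=0
kubectl -n prod-services scale deployment/pihole-ro --replicas=3
+-+-+-+-+-+-+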

Hi,

I'm interested in your setup. Can you share it?

Is it a Helm chart or plain manifests?

Thanks
Cedric

Hi,

I am using manifests here. I avoid Helm templates as much as possible; I do not feel in control of my setup when using Helm charts.

Most of the solution is outside of PI-Hole itself because natively, PI-Hole is unable to do what I need. For that reason, it is difficult to share the entire solution...

Before using the manifests, you already need:
Step 1
--An HA solution for your gateway that provides DNS to everyone
--Logging done from that gateway
--The gateway configured to forward non-local queries to the VIP used for PI-Hole

I use pfSense with its built-in HA (CARP failover) and DNS Forwarder. I use the Forwarder in sequential mode, starting with PI-Hole. That way, nothing will bypass PI-Hole unless PI-Hole itself is down. In that case, pfSense forwards to a public DNS outside, bypassing the security but leaving DNS functional and still logging everything.

Step 2
--A functional Kubernetes cluster
--A CNI that can fix IP addresses
--Routing between the cluster and the outside world
--MetalLB as the load balancer
--I use IPv4 / IPv6 dual-stack; remove IPv6 if you do not use it...
--HA storage with ReadWriteMany

I built mine with kubeadm on Ubuntu 22.04, running the latest version (upgraded to 1.29.1 yesterday). I use Calico as the CNI, with BGP configured between Calico and pfSense. Longhorn is my storage engine. I gave 100G of space to Longhorn on each of my 6 workers and everything exists in 3 replicas, which gives me 200G of HA local storage space.

So again, this goes way beyond PI-Hole itself and there is a ton of work, debugging, trial and error and more... Most of it is built with manifests and is required for the solution to work, but I cannot provide all of my details because your situation will surely not match mine closely enough to reuse them as-is.

Now the fun stuff :slight_smile:

Create the volumes for PI-Hole:
+-+-+-+-+-+-+
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: lh-pvc-pihole-etc
  namespace: prod-services
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: longhorn
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: lh-pvc-pihole-dnsm
  namespace: prod-services
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: longhorn
+-+-+-+-+-+-+
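Once applied, both claims should show as Bound before going further (the file name below is just an example; note that Longhorn serves ReadWriteMany volumes through an NFS share-manager pod):
+-+-+-+-+-+-+
kubectl apply -f pihole-pvc.yaml
kubectl -n prod-services get pvc lh-pvc-pihole-etc lh-pvc-pihole-dnsm
+-+-+-+-+-+-+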

The first deployment is for PI-Hole RW:
+-+-+-+-+-+-+
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: pihole
  name: pihole-rw
  namespace: prod-services
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pihole
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: pihole
      annotations:
        cni.projectcalico.org/ipAddrs: '["2001:db8:1234:100::1", "10.244.0.11"]'
    spec:
      containers:
      - name: pihole-rw
        image: pihole/pihole:2024.01.0
        imagePullPolicy: IfNotPresent
        env:
        - name: TZ
          value: America/Toronto
        - name: VIRTUAL_HOST
          value: 'pihole.example.org'
        - name: DNSSEC
          value: 'true'
        - name: DNSMASQ_LISTENING
          value: all
        - name: PIHOLE_DNS_
          value: 2606:4700:4700::1111;2620:119:53::53
        - name: FTLCONF_RATE_LIMIT
          value: 0/0
        - name: FTLCONF_LOCAL_IPV4
          value: 172.16.0.11
        - name: FTLCONF_MAXDBDAYS
          value: '90'
        - name: REV_SERVER
          value: 'true'
        - name: REV_SERVER_DOMAIN
          value: local.lan
        - name: REV_SERVER_TARGET
          value: 172.16.0.1
        - name: REV_SERVER_CIDR
          value: 172.16.0.0/12
        - name: WEBPASSWORD
          valueFrom:
            secretKeyRef:
              name: pihole-web-password
              key: password
        volumeMounts:
        - name: pihole-etc
          mountPath: /etc/pihole
        - name: pihole-dnsm
          mountPath: /etc/dnsmasq.d
        ports:
        - name: dns-tcp
          containerPort: 53
          protocol: TCP
        - name: dns-udp
          containerPort: 53
          protocol: UDP
        - name: web
          containerPort: 80
          protocol: TCP
        resources:
          requests:
            cpu: "20m"
            memory: "512Mi"
          limits:
            cpu: "250m"
            memory: "896Mi"
        readinessProbe:
          exec:
            command: ['dig', '@127.0.0.1', 'cloudflare.com']
          timeoutSeconds: 20
          initialDelaySeconds: 5
          periodSeconds: 60
        livenessProbe:
          tcpSocket:
            port: dns-tcp
          initialDelaySeconds: 15
          periodSeconds: 30
      volumes:
      - name: pihole-etc
        persistentVolumeClaim:
          claimName: lh-pvc-pihole-etc
      - name: pihole-dnsm
        persistentVolumeClaim:
          claimName: lh-pvc-pihole-dnsm
+-+-+-+-+-+-+
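The deployment expects the pihole-web-password secret to exist in the same namespace; something along these lines creates it (pick your own password, of course):
+-+-+-+-+-+-+
kubectl -n prod-services create secret generic pihole-web-password \
  --from-literal=password='ChangeMe'
+-+-+-+-+-+-+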

Two services must be created because you cannot put TCP and UDP in a single one.
+-+-+-+-+-+-+
kind: Service
apiVersion: v1
metadata:
  name: pihole-udp
  namespace: prod-services
  annotations:
    metallb.universe.tf/allow-shared-ip: dns
    metallb.universe.tf/address-pool: pihole-pool
spec:
  type: LoadBalancer
  selector:
    app: pihole
  ipFamilyPolicy: PreferDualStack
  ipFamilies:
  - IPv6
  - IPv4
  ports:
  - name: dnsudp
    protocol: UDP
    port: 53
    targetPort: 53
---
kind: Service
apiVersion: v1
metadata:
  name: pihole-tcp
  namespace: prod-services
  annotations:
    metallb.universe.tf/allow-shared-ip: dns
    metallb.universe.tf/address-pool: pihole-pool
spec:
  type: LoadBalancer
  selector:
    app: pihole
  ipFamilyPolicy: PreferDualStack
  ipFamilies:
  - IPv6
  - IPv4
  ports:
  - name: dnstcp
    protocol: TCP
    port: 53
    targetPort: 53
  - name: web
    protocol: TCP
    port: 80
    targetPort: 80
+-+-+-+-+-+-+
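The metallb.universe.tf/address-pool annotation assumes an address pool named pihole-pool already exists. With MetalLB 0.13+ (CRD-based configuration) that would look roughly like this; the addresses are placeholders, and you also need an L2Advertisement or BGPAdvertisement referencing the pool, depending on your mode:
+-+-+-+-+-+-+
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: pihole-pool
  namespace: metallb-system
spec:
  addresses:
  - 172.16.0.53/32              # placeholder IPv4 VIP for DNS
  - 2001:db8:1234:200::53/128   # placeholder IPv6 VIP (dual-stack)
+-+-+-+-+-+-+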

Now I add the ReadOnly pods:
+-+-+-+-+-+-+
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: pihole
  name: pihole-ro
  namespace: prod-services
spec:
  replicas: 3
  selector:
    matchLabels:
      app: pihole
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: pihole
      annotations:
        cni.projectcalico.org/ipv6pools: '["pihole-pool6"]'
    spec:
      containers:
      - name: pihole-ro
        image: pihole/pihole:2024.01.0
        imagePullPolicy: IfNotPresent
        env:
        - name: TZ
          value: America/Toronto
        - name: DNSSEC
          value: 'true'
        - name: DNSMASQ_LISTENING
          value: all
        - name: PIHOLE_DNS_
          value: 2606:4700:4700::1111;2620:119:53::53
        - name: FTLCONF_RATE_LIMIT
          value: 0/0
        - name: FTLCONF_MAXDBDAYS
          value: '90'
        - name: REV_SERVER
          value: 'true'
        - name: REV_SERVER_DOMAIN
          value: local.lan
        - name: REV_SERVER_TARGET
          value: 172.16.0.1
        - name: REV_SERVER_CIDR
          value: 172.16.0.0/12
        - name: WEBPASSWORD
          valueFrom:
            secretKeyRef:
              name: pihole-web-password
              key: password
        volumeMounts:
        - name: pihole-etc
          mountPath: /etc/pihole
          readOnly: true
        - name: pihole-dnsm
          mountPath: /etc/dnsmasq.d
          readOnly: true
        ports:
        - name: dns-tcp
          containerPort: 53
          protocol: TCP
        - name: dns-udp
          containerPort: 53
          protocol: UDP
        resources:
          requests:
            cpu: "20m"
            memory: "512Mi"
          limits:
            cpu: "250m"
            memory: "896Mi"
        readinessProbe:
          exec:
            command: ['dig', '@127.0.0.1', 'cloudflare.com']
          timeoutSeconds: 20
          initialDelaySeconds: 5
          periodSeconds: 60
        livenessProbe:
          tcpSocket:
            port: dns-tcp
          initialDelaySeconds: 15
          periodSeconds: 30
      volumes:
      - name: pihole-etc
        persistentVolumeClaim:
          claimName: lh-pvc-pihole-etc
      - name: pihole-dnsm
        persistentVolumeClaim:
          claimName: lh-pvc-pihole-dnsm
+-+-+-+-+-+-+-+

The last step is to add an ingress (and secure it) to reach the management interface:
+-+-+-+-+-+-+-+
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: pihole-ingress
  namespace: prod-services
  labels:
    auth1: mtls
    auth2: app-password
    auth3: none
    security: restricted
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/auth-tls-secret: ingress-nginx/ca-secret
    nginx.ingress.kubernetes.io/auth-tls-verify-client: "on"
    nginx.ingress.kubernetes.io/auth-tls-verify-depth: "1"
    nginx.ingress.kubernetes.io/auth-tls-error-page: "https://static.example.org/mtls_error.html"
    nginx.ingress.kubernetes.io/backend-protocol: "HTTP"
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - pihole-dc.example.com
    secretName: pihole-tls-secret
  rules:
  - host: pihole-dc.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: pihole-tcp
            port:
              number: 80
+-+-+-+-+-+-+
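The mTLS annotations assume a ca-secret in the ingress-nginx namespace holding the client CA under the ca.crt key (and a letsencrypt-prod ClusterIssuer for cert-manager). The CA secret can be created along these lines; the file name is illustrative:
+-+-+-+-+-+-+
kubectl -n ingress-nginx create secret generic ca-secret \
  --from-file=ca.crt=./client-ca.pem
+-+-+-+-+-+-+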

But again, for these manifests to work there is a ton of requirements: cert-manager, ingress-nginx, MetalLB and more.

Good luck with your own deployment,

Oops; the indentation was removed when I posted the reply... Be sure to fix it when you try it.

I also noticed that I copy-pasted from a previous version of my files. You need to distinguish the two deployments (app = pihole-w and app = pihole-r) to avoid confusion between the services: pihole-w is the web deployment (RW) and pihole-r is the resolver deployment (RO).

Thank you very much for the information :smiley:

Thanks for all this information; it almost deserves a push to a git repo.

If you want to keep indentation, you need to use "Preformatted text".

You have 3 options:

  • type your text or paste the file contents, select the text and click the "Preformatted text" button in the edit window;

  • type your text or paste file contents, select the text and press CTRL + E;

  • use "fences" (3 backticks ```) before and after your text, like this:

```
Text
```

The result will look like this:

Text