DNS responses blocked to other containers on pi-hole host

eightiesCalling · June 28, 2023, 1:30pm

Hello!

I'm going to post this in a docker specific forum as I believe it's more a docker issue/config than anything else but thought it might be a more common question here (but can't see any matching answers).

Setup

Separate container instances of pi-hole running on 2 different hosts.
keepalived has been configured as a floating IP across them (issue is repeatable excluding this but included for completeness).
The hosts running pi-hole also run a number of other container based services.
No common "pi-hole network" has been set up or associated with the other containers on the host. While this is a common google answer and would fix the problem, putting them all in the same network group starts to defeat the point of the containers.

Scenario

Containers running on the same host as the primary pi-hole instance fail resolution requests. Looking at the pihole log I can see the requests and responses (including retries) but the responses don't get back to the originating container.
Pointing the request at a container on an alternative host works.

When the environment is run with DHCP handing out multiple DNS IPs (and not using keepalived), things "appear" to work but that's down to clients retrying on the alternative host after failures. Not great from a resilience perspective as they only have access to 50% of the DNS server pool. (In fact I hadn't noticed this behaviour until putting keepalived in the loop, giving the clients the impression of a single DNS server.)

What I think is happening is that DNS requests are being routed out of Docker to the virtual IP and then forwarded back in through Docker to pi-hole. The problem comes when pi-hole tries to send the response: Docker realises it's crossing between the 2 networks and blocks it rather than letting the response loop back at host level.

So, is there a simple way to deal with this that doesn't involve either joining all the containers on the same network or bumping Pi-Hole up to host networking?

Bucking_Horn · June 29, 2023, 8:08am

There's not enough information to wager a guess of an answer, though on first glance, it indeed seems like a Docker specific issue rather than a Pi-hole one.

To invite answers, you should share your container configuration details (e.g. your docker-compose or docker run files).

eightiesCalling · June 30, 2023, 9:21am

Hi! Thanks for taking the time to read through.

The docker compose is below. The volume structure is because (initially at least) the multiple instances, along with other containers, were running on different Pi4s that ran against shared storage. Basically a 4 Pi setup - 1 providing all the storage and the other 3 net-booting from and running against storage from the first.

version: "3"

services:
  pihole:
    container_name: pihole
    image: pihole:local
    build: .
    hostname: "pihole_${RUNNING_ON:?err}"
    ports:
      - "53:53/tcp"
      - "53:53/udp"
      - "67:67/udp"
      - "8081:80/tcp"
    environment:
      TZ: 'Europe/London'
      WEBPASSWORD: 'pihole'
      PIHOLE_UID: 801
      PIHOLE_GID: 801
    volumes:
      - '/containers/volumes/pihole/${RUNNING_ON:?err}/etc-pihole/:/etc/pihole/'
      - '/containers/volumes/pihole/${RUNNING_ON:?err}/etc-dnsmasq.d/:/etc/dnsmasq.d/'
      - '/containers/volumes/pihole/fixPermissions.sh:/etc/cont-init.d/30-permissions.sh:ro'
      - '/containers/volumes/pihole/fixes/:/fixes/'
    cap_add:
      - NET_ADMIN
    restart: unless-stopped

  keepalived:
    container_name: pihole-vip
    image: pihole-vip:local
    build: ./vip
    environment:
      # KEEPALIVED_DEBUG: 'true'
      DNS_ALL_SERVERS: '10.0.0.210 10.0.0.18'
      DNS_MASTER: '10.0.0.210'
      DNS_VIP: '10.0.0.224'
      ADMIN_PORT: '8081'
      VIP_TESTS: 'dns http'
    restart: unless-stopped
    privileged: true
    network_mode: host
    cap_add:
      - NET_ADMIN

DanSchaper · July 1, 2023, 6:30pm

Have you tried using either host networking or macvlan networking?

The default bridge network will isolate the containers.

https://docs.docker.com/network/drivers/
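
For reference, a macvlan attachment in docker-compose might look roughly like the sketch below. It is an untested illustration, not a recommended config: the network name `lan_macvlan`, the parent interface `eth0`, the `/24` subnet, the `10.0.0.1` gateway and the container address `10.0.0.230` are all placeholders that do not come from this thread.

```yaml
services:
  pihole:
    image: pihole:local
    networks:
      lan_macvlan:
        ipv4_address: 10.0.0.230   # placeholder: a free address on the LAN

networks:
  lan_macvlan:                     # placeholder network name
    driver: macvlan
    driver_opts:
      parent: eth0                 # placeholder: host NIC attached to the LAN
    ipam:
      config:
        - subnet: 10.0.0.0/24      # assumed from the 10.0.0.x addresses in the compose file
          gateway: 10.0.0.1        # placeholder gateway
```

With macvlan the container gets its own address on the LAN, so a `ports:` section is not needed. One caveat worth testing: the Docker host usually cannot reach a macvlan container directly through the parent interface, so same-host traffic may still need attention.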

Bucking_Horn · July 1, 2023, 11:08pm

On the contrary - I'd perceive that as a demonstration of the resilience you get by using two distinct DNS IP addresses.
(On a side note, I'd probably prefer having multiple DNS IPs personally, as using just one floating IP may result in interesting issues when parts of your network links are isolated from one another (e.g. when using VLANs or L3 switches). In your specific case, running keepalived in a container, it would also mean that your failover would be affected if the Docker process or the container fails while the machine itself is still available.)

I'm not sure I understand your issue well enough yet:

What DNS server are those containers using?
How do those requests register in Pi-hole?
Can you attribute DNS requests to individual containers?
Also, how do you point a container to send requests via an alternate DNS server, particularly while a single floating IP is active?

I'm not sure if running keepalived in a container would be my first choice.
In particular, I wonder if and how a container would be configured to allow binding to non-local IPs, which would be required by backup clients, and also if that would still be required on the Docker host itself nevertheless.

Although that would seem unlikely to affect the master system (primary?), investigating non-local binding still seems worthwhile to me.

Some details of your configuration spark my interest, though they may not be related to your issue:

Those are not part of Pi-hole - together with pihole:local, would that imply that you are using some kind of modified Pi-hole image?

I also note the absence of FTLCONF_LOCAL_IPV4.
This should impact Pi-hole's web server operation only, but you still should consider adding it.
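
As a minimal sketch, the variable would sit alongside the existing entries in the `environment:` section. The value shown assumes `10.0.0.210` (the `DNS_MASTER` address from the keepalived service) is the LAN address of the host this instance runs on; each host would need its own address here.

```yaml
services:
  pihole:
    environment:
      TZ: 'Europe/London'
      # Assumption: the LAN address clients use to reach this Pi-hole host
      FTLCONF_LOCAL_IPV4: '10.0.0.210'
```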

You should be aware that Pi-hole's SQLite3 database backend relies on filesystem locks to ensure database locks. Network filesystems (and NFS in particular) may contain bugs that could make database files stored using those filesystems more prone to database corruption. Specifically, the probability of corruption would increase if more than one process would access the same database file.
It's just as well that you've separated your databases for each of your four Pi-holes into their own, host-specific folder (assuming that RUNNING_ON would return a unique identifier for each Pi-hole host).

eightiesCalling · July 2, 2023, 9:05am

Host level works, albeit you lose some flexibility on port mapping. Doesn't this lose network isolation entirely though? (The more I read, the more I think that is about the only option for something like DNS in a container environment either way.)

Wasn't aware of macvlan networking - interesting, thank you, but not sure it helps here. May try it just to better understand the behaviour though.

eightiesCalling · July 2, 2023, 9:10am

The Pi-hole containers aren't configured any differently for DNS so that will end up as loopback then upstream.

Requests are attributed based on the name that was resolved - for example, I know that only container A would ask to get to abc.xyz.

During troubleshooting, a container had DNS forced to bypass the floating IP using the dns entry in docker-compose. That allowed me to prove the theory that with 2 separate pi-hole instances it was always the local one that failed.
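
A per-container override of that kind might look like the sketch below. The service name and image are hypothetical, and the address is simply one of the two `DNS_ALL_SERVERS` entries from the keepalived service; which of the two counts as the "other" host depends on where the container runs.

```yaml
services:
  some_client:                # hypothetical service name
    image: alpine:latest      # hypothetical image
    dns:
      - 10.0.0.18             # a specific Pi-hole host address instead of the floating IP 10.0.0.224
```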

The config shown seems to be working as expected so far - in this case host and NET_ADMIN give the expected permissions. It's a work in progress though when it comes to availability checks and failure scenarios.

I agree that it may be more suitable at host level but the thing that keeps me in containers is the ability to try things, experiment in consistent repeatable manner without worrying about a bunch of digital detritus scattered across the host OS. The ability to just destroy a low overhead container (vs an entire VM) and start from scratch can't be beaten.

Correct - though not by much. The Dockerfile build pulls the official image and then adds a set of scripts that manage the block lists using the Firebog lists.

I use the local tag on images that are built locally rather than pulled more as a quick reference for myself than anything else.

Agreed and understood on that point. Pi-Hole was never designed to run multiple instances on 1 host.

The environment was originally 4 Pi4s, a mixture of a docker learning experiment and just low power capacity to run stuff. Think of it as 1 storage and 3 compute nodes. This setup meant that any persistent volumes ran on the storage node and anything else could move to any compute node without issue - handy for low/no impact OS upgrades where with a bit of automation you could have a fully configured new OS for a compute node in a few minutes. (Yes, that storage node was the exact opposite, as a SPoF!)

Arguably more complex than it needs to be but was interesting/fun to learn and in general, the setup was more performant running the 3 compute nodes against the storage node with a single SSD than running each with their own SD card.

Bucking_Horn · July 2, 2023, 10:31am

Ah, that's not what I was after.
You haven't disclosed how same-host containers join your network (e.g. via own macvlan IPs or via a Docker gateway from a Docker bridge network).
I'd be curious which client IPs you would see for containers in Pi-hole's Query Log:
Some private range IPs from your normal network, some Docker internal IPs for individual containers, or a single IP of Docker's internal gateway?

I don't think that's the case.
Port 53 has to be bound by Pi-hole anyway to be used with regular network clients, and the web UI's port can be switched via WEB_PORT.

If using host mode fixes the issue for you, you should probably stick with it.
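
A host-mode variant of the posted pihole service could be sketched roughly as follows. This is an untested illustration based on the compose file above: the `ports:` section is dropped because published ports are not used in host mode, and `WEB_PORT` is set on the assumption that the web UI should stay on the port previously mapped as `8081:80`. Volumes and the remaining environment entries are omitted for brevity.

```yaml
services:
  pihole:
    container_name: pihole
    image: pihole:local
    build: .
    network_mode: host        # container shares the host's network stack
    environment:
      TZ: 'Europe/London'
      WEB_PORT: '8081'        # assumption: keep the web UI on 8081 as before
    cap_add:
      - NET_ADMIN
    restart: unless-stopped
```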

I tend to disagree here, as you report that DNS queries do not travel back to their requester.
(EDIT: It would be interesting to see some log excerpts here, specifically to see which IP requested DNS resolution and thus should have received Pi-hole's reply.)

Pi-hole's embedded pihole-FTL/dnsmasq is capable of dealing with IP addresses on network interfaces as they come and go.

I am unaware if the same would be true for Docker.

Also, as the floating IP would not exist on backup systems, how would Docker treat requests to bind that IP anyway? Would it perhaps be required to configure the host OS to explicitly allow binding of non-local IPs (which is disallowed by default, I think)?