Enabling DHCP causes API connection loss

Pi2MGc2 · November 30, 2020, 11:19am

I'm running multiple Pi-holes, one per VLAN, each in its own Docker container on a macvlan network. Without DHCP, the DNS service is active and resolves domains as intended. However, when I activate the DHCP service, things start to go wrong.

Expected Behaviour:

The DHCP service is enabled and provides the VLAN with IP addresses.

Actual Behaviour:

The homepage reports an API connection loss and states that FTL is offline.

Not sure what's causing this.

Thanks.

DL6ER · November 30, 2020, 1:19pm

Probably some config errors (I'm just guessing right now).

Check /var/log/pihole-FTL.log and /var/log/pihole.log for possible complaints about config errors.
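A minimal sketch of one way to scan both logs for complaints. The helper name `scan_log` and the keyword list are assumptions for illustration, not part of Pi-hole; adjust the pattern to taste.

```shell
#!/bin/sh
# scan_log FILE...: print lines that look like complaints, prefixed with
# file name and line number. The keyword list is an assumption.
scan_log() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            # /dev/null as a second file makes grep print the file name;
            # "|| true" keeps a log without matches from counting as a failure.
            grep -inE 'error|warning|failed|fatal|crash' "$f" /dev/null || true
        else
            echo "$f: not readable" >&2
        fi
    done
}

# Usage (inside the container):
#   scan_log /var/log/pihole-FTL.log /var/log/pihole.log
```

Since pihole.log can be large, piping the result through `tail` keeps only the most recent hits.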

Pi2MGc2 · November 30, 2020, 3:52pm

When DHCP is disabled, part of the FTL log is thus:

 Successfully accessed setupVars.conf
 *************************************************************************
 * WARNING: Required Linux capability CAP_NET_ADMIN not available        *
 *************************************************************************
 *************************************************************************
 * WARNING: Required Linux capability CAP_SYS_NICE not available         *
 *************************************************************************
 PID of FTL process: 30048
 Changing /FTL-lock (11) to 0:0
 Changing /FTL-strings (11) to 0:0
 Changing /FTL-counters (11) to 0:0
 Changing /FTL-domains (11) to 0:0
 Listening on Unix socket
 Changing /FTL-clients (11) to 0:0
 Listening on port 4711 for incoming IPv4 telnet connections
 Changing /FTL-queries (11) to 0:0
 Changing /FTL-upstreams (11) to 0:0
 Changing /FTL-overTime (11) to 0:0
 Changing /FTL-settings (11) to 0:0
 Changing /FTL-dns-cache (11) to 0:0
 Changing /FTL-per-client-regex (11) to 0:0
 Reloading DNS cache
 Blocking status is enabled
 INFO: No regex whitelist entries found
 Compiled 0 whitelist and 8 blacklist regex filters for 1 clients in 0.4 msec

When DHCP is enabled it is:

 Successfully accessed setupVars.conf
 *************************************************************************
 * WARNING: Required Linux capability CAP_NET_ADMIN not available        *
 *************************************************************************
 *************************************************************************
 * WARNING: Required Linux capability CAP_SYS_NICE not available         *
 *************************************************************************

The lines prior to Successfully accessed setupVars.conf do not differ. In the second log it seems that the FTL process isn't run?
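The comparison above can be automated with a small check. This is only a sketch: the helper name `ftl_started` is hypothetical, and it assumes each start attempt logs "Successfully accessed setupVars.conf" before "PID of FTL process", as in the two excerpts.

```shell
#!/bin/sh
# ftl_started LOGFILE: succeed if the most recent start attempt in the log
# got as far as printing its PID line.
ftl_started() {
    awk '
        /Successfully accessed setupVars\.conf/ { seen = 0 }  # a new start attempt begins
        /PID of FTL process:/                   { seen = 1 }  # this attempt got past the capability checks
        END { exit (seen ? 0 : 1) }
    ' "$1"
}

# Usage (inside the container):
#   ftl_started /var/log/pihole-FTL.log && echo "FTL started" || echo "FTL did not start"
```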

Bucking_Horn · November 30, 2020, 4:48pm

Your pihole-FTL.log looks inconspicuous.

What about /var/log/pihole.log?

DL6ER · November 30, 2020, 4:53pm

So PID of FTL process: 30048 doesn't appear with DHCP, right? Then I have the same question as the last speaker.

Pi2MGc2 · November 30, 2020, 5:31pm

PID of FTL process: 30048 doesn't appear, no.

/var/log/pihole.log is a massive file, but the last few entries after activating the DHCP server are:

query[A] pi.hole from 127.0.0.1
/etc/pihole/local.list pi.hole is 0.0.0.0
query[A] pi.hole from 127.0.0.1
/etc/pihole/local.list pi.hole is 0.0.0.0

After deactivating DHCP, additional lines are present after the former:

started, version pi-hole-2.81 cachesize 10000
DNS service limited to local subnets
compile time options: IPv6 GNU-getopt no-DBus no-UBus no-i18n IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth DNSSEC loop-detect inotify dumpfile
using only locally-known addresses for domain use-application-dns.net
using nameserver 1.0.0.1#53
using nameserver 1.1.1.1#53
read /etc/hosts - 7 addresses
failed to load names from /etc/pihole/custom.list: No such file or directory
read /etc/pihole/local.list - 2 addresses

DL6ER · November 30, 2020, 6:14pm

Okay, so you're never reaching the point of

when enabling DHCP. This is very strange. I'm not familiar with macvlan, but my best guess right now is that something is going wrong there. As there are neither crashes nor config (or other) errors in either of the two logs, it may be that binding the DHCP socket somehow ... well ... freezes without doing anything. This is way outside of what I've tried with Docker myself, so I'm afraid I won't be of much help here.

However, we have some real Docker gurus around who should be able to help you. Whether anyone has a configuration similar enough to yours ... only time will tell.

Pi2MGc2 · November 30, 2020, 6:51pm

Sure, thanks anyway. There's some other strange behaviour I'm getting, such as lighttpd not running. I'll work on trying to sort that in the meantime.

jfb · November 30, 2020, 9:25pm

The "pi.hole is 0.0.0.0" entries in your pihole.log seem odd. The query should return the IP of the Pi-hole, not 0.0.0.0.

DL6ER · December 1, 2020, 9:04am

Yeah, overall it seems your system is in some strange meta-state. Could you generate and upload a debug token using (pihole -d) for us to take a look at your config files?

Pi2MGc2 · December 1, 2020, 10:37am

I've done a fresh install in a new container but now lighttpd isn't running so I can't get to the interface to enable DHCP (actually the container keeps restarting too).

I created the new container with:
sudo docker run -h PH-VL30-C -d --name=PH_VL30 --net=macvlan30 --ip=192.168.30.9 --restart=always -v PH_VL30_CFG:/etc/pihole -v PH_VL30_DNS:/etc/dnsmasq.d -v BLOCKLISTS:/home/blocklists:ro --dns=127.0.0.1 --cap-add=NET_ADMIN -e SERVERIP=192.168.30.9 pihole/pihole:latest

I can upload a debug output of this new one?

PromoFaux · December 1, 2020, 1:43pm

If it helps any, here is a snippet of my docker-compose file (note, I'm probably not following any best practices, and I'm sure someone somewhere will have comments to pass, but it works):

  pihole:
    container_name: pihole
    hostname: pihole-docker   
    image: pihole/pihole:v5.2
    environment:
      - TZ=Europe/London
      - ServerIP=192.168.1.253
      - DNS1=192.168.1.254#53
      - DNS2=no
      - HOST_IP=192.168.1.253
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - ./pihole/resolv.conf:/etc/resolv.conf
      - ./pihole/configs/pihole/:/etc/pihole/
      - ./pihole/configs/dnsmasq.d:/etc/dnsmasq.d/
      # - ./pihole/01-conf-dnsmasq.sh:/etc/cont-init.d/01-conf-dnsmasq.sh
    ports:
      - "53:53/tcp"
      - "67:67/udp" # Uncomment if you want to use Pi-Hole for DHCP
      - "53:53/udp"
    expose:
       - "80"
    mac_address: d0:ca:ab:cd:ef:fe
    networks:
      home:
        ipv4_address: 192.168.1.253
    dns:
      - 127.0.0.1
    cap_add:
      - NET_ADMIN
    restart: always

And my network is defined elsewhere as:

networks:
  home:
    name: home
    driver: macvlan
    driver_opts:
      parent: eth0
    ipam:
      config:
        - subnet: 192.168.0.0/23
    attachable: false

Pi2MGc2 · December 1, 2020, 2:08pm

That looks like a good setup (but I'm not certain on best practices either lol).

I've just done a complete reinstall of Docker (got rid of the snap package and installed the proper one), reconfigured and recreated my Pi-hole containers.

I think it was the NET_ADMIN that did it, though I can't be sure because I changed other things as well.

I also noticed that lighttpd not starting coincided with the container not being able to resolve hostnames. I changed the DNS such that it could and lighttpd now seems to start up ok.

Restarted the container and still seems to be ok.

The new command I'm using:
sudo docker run -h PH-VL30-C -d --name=PH_VL30 --net=macvlan30 --ip=192.168.30.9 --restart=always -v PH_VL30_CFG:/etc/pihole -v PH_VL30_DNS:/etc/dnsmasq.d -v BLOCKLISTS:/home/blocklists:ro --dns=127.0.0.1 --cap-add=NET_ADMIN -e SERVERIP=192.168.30.9 -e DNS1=1.1.1.1 -e DNS2=1.0.0.1 pihole/pihole:latest

Thanks for everyone's help with this. I can't be 100% sure of the cause, but I'd say there's a high probability that the missing NET_ADMIN was behind the DHCP problem and the DNS setting was behind the lighttpd problem. I suspect my issues could be recreated by blocking the default DNS and leaving NET_ADMIN out.

PromoFaux · December 1, 2020, 2:10pm

Ah, yes. I was going to point this out in my post, but I noticed from the post I was replying to that you had included NET_ADMIN. If that was a new addition, then that is definitely the cause:

https://github.com/pi-hole/docker-pi-hole#note-on-capabilities

By default, Docker does not include the NET_ADMIN capability for non-privileged containers, and it is recommended to explicitly add it to the container using --cap-add=NET_ADMIN.
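For anyone who wants to confirm that a running container actually has the capability, here is a rough sketch that decodes the effective capability mask from /proc. The helper names `has_cap` and `current_capeff` are made up for this example; the bit numbers 12 (CAP_NET_ADMIN) and 23 (CAP_SYS_NICE) come from linux/capability.h.

```shell
#!/bin/sh
# Capability bit numbers as defined in <linux/capability.h>.
CAP_NET_ADMIN=12
CAP_SYS_NICE=23

# has_cap MASK BIT: succeed if bit BIT is set in the hex mask MASK
# (MASK is given without a leading 0x, as it appears in /proc).
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Print the effective capability mask (Linux only). awk reads its own
# status entry, which it inherits from the calling shell.
current_capeff() {
    awk '/^CapEff:/ { print $2 }' /proc/self/status
}

# Usage inside the container (e.g. after `docker exec -it PH_VL30 sh`):
#   has_cap "$(current_capeff)" "$CAP_NET_ADMIN" \
#       && echo "NET_ADMIN present" || echo "NET_ADMIN missing"
```

From the host, `docker inspect --format '{{.HostConfig.CapAdd}}' PH_VL30` should list the capabilities that were added when the container was created.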