Conditional Forwarding stops working when I reboot host or restart container

Hi, so I'm having an issue where if I restart my Pi-hole container or reboot the host machine, Conditional Forwarding stops working. To fix it, I have to log in using the IP, disable it, then re-enable it, and then it starts working again. Does anyone know what might be causing this?

Edit: I found that even if I destroy the container and its associated volumes and recreate everything from the docker-compose file below, Conditional Forwarding still does not work. The REV_SERVER environment variables I've supplied are shown in the WebUI and Conditional Forwarding is ticked/enabled, yet it has no effect. I have to disable it, save, re-enable it, re-enter the exact same info, and hit save again before it works. It's almost as if Pi-hole is not applying Conditional Forwarding at container startup, even though the REV_SERVER environment variables are clearly being picked up (they're shown in the WebUI), so I'm not sure why it isn't working.

Here is my docker-compose file:

version: "3.5"

services:
  pihole:
    image: pihole/pihole:latest
    container_name: pihole
    hostname: ******
    networks:
      pihole:
        ipv4_address: 172.20.0.2
    dns:
      - 127.0.0.1
      - 1.1.1.1
    ports:
      - target: 53
        published: 53
        protocol: tcp
      - target: 53
        published: 53
        protocol: udp
      - target: 67
        published: 67
        protocol: udp
      - target: 80
        published: 80
        protocol: tcp
      - target: 443
        published: 443
        protocol: tcp
    environment:
      - "TZ=America/New_York"
      - "PROXY_LOCATION=pihole"
      - "VIRTUAL_PORT=80"
      - "PIHOLE_DNS_=172.20.0.3#5053;172.20.0.3#5053"
      - "WEBPASSWORD=***************"
      - "ServerIP=192.168.1.30"
      - "DNS_BOGUS_PRIV=TRUE"
      - "DNS_FQDN_REQUIRED=TRUE"
      - "DNSSEC=TRUE"
      - "REV_SERVER=TRUE"
      - "REV_SERVER_TARGET=192.168.1.1"
      - "REV_SERVER_DOMAIN=*********.lan"
      - "REV_SERVER_CIDR=192.168.1.0/24"
      - "TEMPERATUREUNIT=f"
      - "WEBUIBOXEDLAYOUT=boxed"
    volumes:
      - "pihole:/etc/pihole/"
      - "dnsmasq:/etc/dnsmasq.d/"
    restart: always

  cloudflared:
    image: crazymax/cloudflared:latest
    container_name: cloudflared
    hostname: cloudflared
    networks:
      pihole:
        ipv4_address: 172.20.0.3
    environment:
      - "TZ=America/New_York"
      - "TUNNEL_DNS_UPSTREAM=https://1.1.1.1/dns-query,https://1.0.0.1/dns-query"
    restart: always

networks:
  pihole:
    name: pihole
    ipam:
      config:
        - subnet: 172.20.0.0/24
        
volumes:
  pihole:
    name: pihole
  dnsmasq:
    name: dnsmasq

Is there a dev or someone available who might be able to take a look at this?

Tail the logs and check to see what IP address is being used as the forward/upstream DNS for PTR queries and A queries for local domain names when you first start up and then again when you change the settings.
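
For example, something along these lines should work (a rough sketch, assuming the container is named pihole as in the compose file and the default log location inside the container):

docker exec pihole tail -f /var/log/pihole.log | grep -E 'PTR|\.lan'   # container name and .lan suffix assumed from the compose file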

And clarify what you mean when you use the acronym CF; you have Conditional Forwarding and Cloudflare, both of which are CF in shorthand.

And clarify what you mean when you use the acronym CF; you have Conditional Forwarding and Cloudflare, both of which are CF in shorthand.

Sorry about that, I edited the original post (I meant CF = Conditional Forwarding).

Tail the logs and check to see what IP address is being used as the forward/upstream DNS for PTR queries and A queries for local domain names when you first start up and then again when you change the settings.

So I restarted the container, tailed pihole.log, and tried to access a couple of local domains, and this is what I got. I did not see any PTR queries during that time. I did save what I captured if you want to see the whole thing (there was a whole slew of other queries going on).

BEFORE:

Feb 22 20:12:17 dnsmasq[505]: query[A] plex.*********.lan from 192.168.1.10
Feb 22 20:12:17 dnsmasq[505]: cached plex.*********.lan is NXDOMAIN
Feb 22 20:12:20 dnsmasq[505]: query[A] pulsar.*********.lan from 192.168.1.10
Feb 22 20:12:20 dnsmasq[505]: cached pulsar.*********.lan is NXDOMAIN

AFTER:

Feb 22 20:26:03 dnsmasq[1518]: query[A] plex.*********.lan from 192.168.1.10
Feb 22 20:26:03 dnsmasq[1518]: forwarded plex.*********.lan to 192.168.1.1
Feb 22 20:26:03 dnsmasq[1518]: reply plex.*********.lan is 192.168.1.20
Feb 22 20:26:00 dnsmasq[1518]: query[A] pulsar.*********.lan from 192.168.1.10
Feb 22 20:26:00 dnsmasq[1518]: forwarded pulsar.*********.lan to 192.168.1.1
Feb 22 20:26:00 dnsmasq[1518]: reply pulsar.*********.lan is 192.168.1.30

Try grepping out plex.*********.lan from that time period to see what the initial query was. There shouldn't be any cache to read from when you start the container fresh.
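
Something like this should surface the very first query for that name (again assuming the pihole container name and the log path mentioned above):

docker exec pihole grep 'plex\..*\.lan' /var/log/pihole.log | head -n 20   # regex stands in for the redacted domain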

I see "PIHOLE_DNS_=172.20.0.3#5053;172.20.0.3#5053" in the env vars, that's duplicated, if you intended to have some kind of dual upstream configuration.

And check the docker logs for the pihole container to see if there are any messages or warnings at startup. You may have something misconfigured that gets resolved when you make changes: changing the settings causes the configuration file to be rewritten inside the container, and that might be what's fixing things.
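
For example, assuming the container name from the compose file:

docker logs pihole 2>&1 | head -n 100   # first 100 lines of the container's startup output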

Try grepping out plex.*********.lan from that time period to see what the initial query was. There shouldn't be any cache to read from when you start the container fresh.

I just restarted the container vs. reloading from scratch, since it happens in either scenario. I can reload it from the docker-compose file though and see what happens....

I see "PIHOLE_DNS_=172.20.0.3#5053;172.20.0.3#5053" in the env vars, that's duplicated, if you intended to have some kind of dual upstream configuration.

I did that on purpose, actually, because on initial container start-up 8.8.8.8 would be checked as an additional upstream DNS provider. I obviously don't want that since I'm using cloudflared with DoH. I was able to prevent it by duplicating 172.20.0.3 in both the DNS1 and DNS2 fields. I only made that change recently, and it fixed that particular issue, but the Conditional Forwarding thing has been a problem since long before I did that. If you think it will cause a problem, I can remove the second entry.

Just restarted the container and here is the docker log:
https://pastebin.com/9c8s0thU

Yeah, please go ahead and attach the first, say, 100 lines from /var/log/pihole.log and /var/log/pihole-FTL.log so we can check what is happening. Once in the BEFORE (broken) and once in the AFTER (working) case.

Is there any notable difference between these two? I'm just asking because a host machine restart naturally triggers a container restart. Do you set the container to auto-start on boot? I'm just wondering if there can be a situation where the Pi-hole container is already up, but your Conditional Forwarding target cannot be reached (and, hence, is rejected) for some reason.

Ok, here are the logs:

pi-hole.log
pihole-FTL.log

I destroyed and recreated the Pi-hole and cloudflared containers from scratch with my docker-compose, then tried to access some local domains (see DM), which failed. I then went into the Pi-hole WebUI, un-ticked Conditional Forwarding (which already had all the info in it), saved, re-ticked Conditional Forwarding, entered the correct information (the same as what's in my docker-compose), and saved. At that point Conditional Forwarding began working.

I captured before/broken and after/working all in one log, so the first part is from when it was broken and the bottom is from when it was working.

I will DM you the passwords for the pastes, since these have some sensitive domains in them.

Is there any notable difference between these two? I'm just asking because a host machine restart naturally triggers a container restart. Do you set the container to auto-start on boot? I'm just wondering if there can be a situation where the Pi-hole container is already up, but your Conditional Forwarding target cannot be reached (and, hence, is rejected) for some reason.

Not that I can tell. I just noticed that with both scenarios Conditional Forwarding stops working. Yes I do set the container to auto-start. The target (192.168.1.1) is my router, which is always on.

Just a note on providing the log files:

You could execute e.g.

tail -n 100 /var/log/pihole-FTL.log | pihole tricorder

or substitute with any relevant other log, and then post the token(s) here.
(See also How do I debug my Pi-hole installation?).

That way, only trusted Pi-hole team members would be able to access them, and they'd be deleted automatically after 48 hours.

Thank you. I did it that way as well. It gave me URLs as output, which are below. Also, instead of using my browser to access some local domains to do the queries, I used "dig @192.168.1.30 somelocal.domain.lan", where the IP is the Pi-hole in question, because I think my browser was pre-querying tons of domains and flooding the logs.

BEFORE/BROKEN:
pihole.log: https://tricorder.pi-hole.net/xv1ufkgtni
pihole-FTL.log: https://tricorder.pi-hole.net/1oygsywv2l

AFTER/WORKING:
pihole.log: https://tricorder.pi-hole.net/rgd9qc1s3r
pihole-FTL.log: https://tricorder.pi-hole.net/qa47sthtwp

The BEFORE/BROKEN log doesn't show any NXDOMAIN results other than two lines for bad domains that should always return NXDOMAIN.

Run that tail -n 100 /var/log/pihole-FTL.log command without piping it to pihole tricorder and see what you are sending us, so that you can tell us what it is we are looking for.

Edit: These are the only two NXDOMAIN queries in the BEFORE log:

Feb 24 07:17:19 dnsmasq[521]: query[AAAA] connectivity-check.ubuntu.com from 192.168.1.31
Feb 24 07:17:19 dnsmasq[521]: cached connectivity-check.ubuntu.com is NODATA-IPv6
Feb 24 07:17:19 dnsmasq[521]: query[AAAA] connectivity-check.ubuntu.com.REDACTED.lan from 192.168.1.31
Feb 24 07:17:19 dnsmasq[521]: cached connectivity-check.ubuntu.com.REDACTED.lan is NXDOMAIN
Feb 24 07:17:19 dnsmasq[521]: query[PTR] 31.1.168.192.in-addr.arpa from 127.0.0.1
Feb 24 07:17:19 dnsmasq[521]: forwarded 31.1.168.192.in-addr.arpa to 172.20.0.3
Feb 24 07:17:19 dnsmasq[521]: query[AAAA] connectivity-check.ubuntu.com from 192.168.1.30
Feb 24 07:17:19 dnsmasq[521]: cached connectivity-check.ubuntu.com is NODATA-IPv6
Feb 24 07:17:19 dnsmasq[521]: query[AAAA] connectivity-check.ubuntu.com.REDACTED.lan from 192.168.1.30
Feb 24 07:17:19 dnsmasq[521]: cached connectivity-check.ubuntu.com.REDACTED.lan is NXDOMAIN

I've redacted the b********a.lan domain name.

But note the PTR went to 172.20.0.3:

Feb 24 07:17:19 dnsmasq[521]: forwarded 31.1.168.192.in-addr.arpa to 172.20.0.3

EDIT 2: I see the .lan lookups are still answered from cache, so we haven't yet found the original query for those lookups.

cached connectivity-check.ubuntu.com.REDACTED.lan is NXDOMAIN


Sorry, I should have caught this at the get-go.

I'm pretty sure your environment entries are wrong.

    environment:
      ServerIP: x.x.x.x
      DNS1: 1.1.1.1
      DNS2: 1.0.0.1
      VIRTUAL_HOST: pi.hole
      DNSMASQ_LISTENING: all

You're dropping in the literal strings as the environment variables, and not variables with values.

    environment:
      - "TZ=America/New_York"
      - "PROXY_LOCATION=pihole"
      - "VIRTUAL_PORT=80"
      - "PIHOLE_DNS_=172.20.0.3#5053;172.20.0.3#5053"
      - "WEBPASSWORD=***************"
      - "ServerIP=192.168.1.30"

Try entering the running Pi-hole container and doing something like echo $DNSSEC or even export to see if the variables are really being populated correctly. Do this in a fresh container that is just spun up.
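
For instance (a minimal sketch, assuming the container name pihole from the compose file; export is wrapped in sh -c so the shell builtin runs inside the container):

docker exec pihole sh -c 'echo "$REV_SERVER $REV_SERVER_TARGET $REV_SERVER_CIDR $REV_SERVER_DOMAIN"'   # print the Conditional Forwarding variables
docker exec pihole sh -c 'export | grep REV_SERVER'   # or list everything that's exported and filter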

Ok, thanks, I fixed them. They look like this now:

    environment:
      TZ: America/New_York
      PROXY_LOCATION: pihole
      VIRTUAL_PORT: 80
      PIHOLE_DNS_: 172.20.0.3#5053;172.20.0.3#5053
      WEBPASSWORD: **************
      ServerIP: 192.168.1.30
      DNS_BOGUS_PRIV: 'TRUE'
      DNS_FQDN_REQUIRED: 'TRUE'
      DNSSEC: 'TRUE'
      REV_SERVER: 'TRUE'
      REV_SERVER_TARGET: 192.168.1.1
      REV_SERVER_DOMAIN: b*******a.lan
      REV_SERVER_CIDR: 192.168.1.0/24
      TEMPERATUREUNIT: f
      WEBUIBOXEDLAYOUT: boxed

That did not fix the issue, though. I did go into a newly spun-up container and checked all of the environment variables shown above; they were all populated with the correct values.

On another note, I forgot to include the actual results I got in the terminal for the dig queries I ran after spinning up a new container with broken Conditional Forwarding. I queried three local domains and all returned results like this:

; <<>> DiG 9.10.6 <<>> @192.168.1.30 plex.b******a.lan
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 29631
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;plex.b*******a.lan. IN A

;; AUTHORITY SECTION:
. 1800 IN SOA a.root-servers.net. nstld.verisign-grs.com. 2021022400 1800 900 604800 86400

;; Query time: 97 msec
;; SERVER: 192.168.1.30#53(192.168.1.30)
;; WHEN: Wed Feb 24 07:16:38 EST 2021
;; MSG SIZE rcvd: 122

I will try again to capture the logs... I'm not sure what the best way is to do it. Can I just send the whole damn log? On a new container it shouldn't be that big.

Edit: I DM'd you the logs since I had to look at them first to find the queries.

Edit 2: I found that in the BEFORE/BROKEN case the queries are going to 172.20.0.3 (the cloudflared container), which is not correct, and in the AFTER/WORKING case they are going to 192.168.1.1 (my router), which is correct. So I guess that's a clue; the question is why. The env variables are supplied in the docker-compose file, they are shown in the UI, and the variables themselves are populated with the correct info right after a fresh container spin-up.
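
As a quick sanity check (using the Plex host IP from the earlier log excerpt), a reverse lookup like the one below should show up in pihole.log as forwarded to 192.168.1.1 once Conditional Forwarding is actually active:

dig @192.168.1.30 -x 192.168.1.20   # PTR query against the Pi-hole for the Plex host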

Post the full output of a fresh docker container start-up. Don't daemonize the container process, so no docker run -d or docker-compose up -d.
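
For example, something like this should do it (run from the directory containing docker-compose.yml; Ctrl-C stops the containers):

docker-compose up --force-recreate   # recreate the containers and run them in the foreground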

Ok, here ya go. Docker output of a new container via docker-compose:

Okay, and then run
docker-compose exec <Pi-hole Container Name> export

and

docker-compose exec <Pi-hole Container Name> cat /etc/pihole/setupVars.conf

docker-compose exec <Pi-hole Container Name> cat /etc/dnsmasq.d/01-pihole.conf
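
With the container name from the compose file above, that would be, for instance, the commands below (run from the directory that holds docker-compose.yml, or point at it with -f /path/to/docker-compose.yml; export is wrapped in sh -c because it is a shell builtin):

docker-compose exec pihole sh -c 'export'   # list the environment inside the container
docker-compose exec pihole cat /etc/pihole/setupVars.conf
docker-compose exec pihole cat /etc/dnsmasq.d/01-pihole.conf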

Why exactly are you doing that?

And I don't know what those are for?

I can't get this to work. I get:

Can't find a suitable configuration file in this directory or any parent. Are you in the right directory? Supported filenames: docker-compose.yml, docker-compose.yaml

Can you explain how to do this?

I did that because on container spin-up one Google upstream DNS server would be checked in the UI in addition to the Custom 1 DNS (for cloudflared). I added it a second time, which prevents the Google one from getting checked. I can undo it if necessary.

Yeah I don't know what those are. I adapted my docker-compose from here:

Should I remove them? Would they even be used with Pi-hole?

How are you starting the docker-compose stack?

That's from 2019.

Two options: Use our official template or ask the creator of that template for help.