Pihole crashes daily

I'm running the latest v5.0 on two RPis (3B and 4) that had been running version 4.3 successfully for at least 12 months. Both were clean installs on a fresh Buster OS. In both cases, the DNS resolver crashes daily. I can restart it without problems, but a day or so later it has crashed again. Can anyone point me in the right direction to find the cause and a solution?

Can you please post the content of /var/log/pihole-FTL.log?

Are you using DNSSEC with Cloudflare as the upstream DNS provider?

Please also generate a debug token with pihole -d.

[2020-06-05 00:11:39.855 21532] Note: FTL forked to handle TCP requests
[2020-06-05 00:21:37.355 5332] !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[2020-06-05 00:21:37.355 5332] ----------------------------> FTL crashed! <----------------------------
[2020-06-05 00:21:37.355 5332] !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[2020-06-05 00:21:37.355 5332] Please report a bug at Issues · pi-hole/FTL · GitHub
[2020-06-05 00:21:37.356 5332] and include in your report already the following details:
[2020-06-05 00:21:37.356 5332] FTL has been running for 41619 seconds
[2020-06-05 00:21:37.356 5332] FTL branch: master
[2020-06-05 00:21:37.356 5332] FTL version: v5.0
[2020-06-05 00:21:37.356 5332] FTL commit: 3d7c095
[2020-06-05 00:21:37.356 5332] FTL date: 2020-05-10 18:58:38 +0100
[2020-06-05 00:21:37.356 5332] FTL user: started as pihole, ended as pihole
[2020-06-05 00:21:37.356 5332] Compiled for armhf (compiled on CI) using arm-linux-gnueabihf-gcc (Debian 6.3.0-18) 6.3.$
[2020-06-05 00:21:37.356 5332] Received signal: Segmentation fault
[2020-06-05 00:21:37.357 5332] at address: 0x636a7300
[2020-06-05 00:21:37.357 5332] with code: SEGV_MAPERR (Address not mapped to object)
[2020-06-05 00:21:37.357 5332] Backtrace:
[2020-06-05 00:21:37.358 5332] B[0000]: 0x4488e4, /usr/bin/pihole-FTL(+0x238e4) [0x4488e4]
[2020-06-05 00:21:37.358 5332] B[0001]: 0x76dce130, /lib/arm-linux-gnueabihf/libc.so.6(__default_rt_sa_restorer+0) [0x7$
[2020-06-05 00:21:37.358 5332] B[0002]: 0x46fa22, /usr/bin/pihole-FTL(iface_check+0x35) [0x46fa22]
[2020-06-05 00:21:37.358 5332] B[0003]: 0x45d986, /usr/bin/pihole-FTL(receive_query+0x1f1) [0x45d986]
[2020-06-05 00:21:37.358 5332] B[0004]: 0x46cb52, /usr/bin/pihole-FTL(+0x47b52) [0x46cb52]
[2020-06-05 00:21:37.358 5332] B[0005]: 0x46dfcc, /usr/bin/pihole-FTL(main_dnsmasq+0xc07) [0x46dfcc]
[2020-06-05 00:21:37.358 5332] B[0006]: 0x43eda8, /usr/bin/pihole-FTL(main+0xa7) [0x43eda8]
[2020-06-05 00:21:37.358 5332] B[0007]: 0x76db8718, /lib/arm-linux-gnueabihf/libc.so.6(__libc_start_main+0x10c) [0x76db$
[2020-06-05 00:21:37.358 5332] ------ Listing content of directory /dev/shm ------
[2020-06-05 00:21:37.359 5332] File Mode User:Group Filesize Filename
[2020-06-05 00:21:37.359 5332] rwxrwxrwx root:root 260 .
[2020-06-05 00:21:37.359 5332] rwxr-xr-x root:root 4K ..
[2020-06-05 00:21:37.359 5332] rw------- pihole:pihole 4K FTL-per-client-regex
[2020-06-05 00:21:37.360 5332] rw------- pihole:pihole 20K FTL-dns-cache
[2020-06-05 00:21:37.360 5332] rw------- pihole:pihole 29K FTL-overTime
[2020-06-05 00:21:37.360 5332] rw------- pihole:pihole 918K FTL-queries
[2020-06-05 00:21:37.361 5332] rw------- pihole:pihole 20K FTL-upstreams
[2020-06-05 00:21:37.361 5332] rw------- pihole:pihole 643K FTL-clients
[2020-06-05 00:21:37.361 5332] rw------- pihole:pihole 66K FTL-domains
[2020-06-05 00:21:37.362 5332] rw------- pihole:pihole 33K FTL-strings
[2020-06-05 00:21:37.362 5332] rw------- pihole:pihole 12 FTL-settings
[2020-06-05 00:21:37.362 5332] rw------- pihole:pihole 124 FTL-counters
[2020-06-05 00:21:37.363 5332] rw------- pihole:pihole 28 FTL-lock
[2020-06-05 00:21:37.363 5332] ---------------------------------------------------

debug token is https://tricorder.pi-hole.net/9kasd6z51e

Yes, I am using Cloudflare but DNSSEC is not checked.

There are some bug reports on GitHub about v5.0 crashing when Cloudflare is used as the upstream DNS server. The root cause has not been found yet.

It would be great if you could file an issue too and attach the debugger to FTL as described here
https://docs.pi-hole.net/ftldns/debugging/

Pinging @DL6ER as he is investigating the crashes.

Addendum: Could you also please provide /var/log/pihole.log from around the time the crash happened? This will help us understand whether the crash is related to a specific DNS request.
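If the log is large, something like the following Python sketch could pull out only the lines close to the crash time. It assumes that /var/log/pihole.log lines start with a syslog-style timestamp such as "Jun  5 00:21:37" (the usual dnsmasq format) and that month names are in English; the helper name lines_near and the sample lines are made up for this illustration.

```python
from datetime import datetime, timedelta

def lines_near(lines, crash, window_seconds=60, year=2020):
    """Yield log lines stamped within window_seconds of the crash time.

    Assumption: each line starts with a 15-character syslog-style
    timestamp ("Jun  5 00:21:37"), which carries no year, so the year
    has to be supplied by the caller.
    """
    window = timedelta(seconds=window_seconds)
    for line in lines:
        try:
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # not a timestamped line, skip it
        if abs(stamp - crash) <= window:
            yield line

# Hypothetical sample lines, only to show the call; real input would be
# the lines read from /var/log/pihole.log.
sample = [
    "Jun  5 00:10:00 dnsmasq[5332]: query[A] old.example from 192.168.1.2",
    "Jun  5 00:21:30 dnsmasq[5332]: query[A] example.com from 192.168.1.2",
]
crash_time = datetime(2020, 6, 5, 0, 21, 37)  # taken from the FTL crash report
for hit in lines_near(sample, crash_time):
    print(hit)
```

A window of a minute or so before the crash is probably enough to see the last queries that FTL handled.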

This is the related crash:

The address differs, however, this is due to you running Pi-hole on a Raspberry Pi device (armhf processor architecture) and the others running it on a "real" server (x86_64 architecture).

Please educate me: How do you know, if the address is different?

Sure thing. See the comment at the very end for why all of this will become obsolete with v5.1, as I have implemented a mechanism that automates all the manual steps below.


We need the binaries for linux-x86_64 and armhf; download them from Release Pi-hole FTL v5.0 · pi-hole/FTL · GitHub

First, we do the calculations for x86_64 (see the linked GitHub ticket). I copied everything we need here:

B[0002]: 0x55b28752b018, /usr/bin/pihole-FTL(iface_check+0x48) [0x55b28752b018]
B[0003]: 0x55b28750fa7d, /usr/bin/pihole-FTL(receive_query+0x33d) [0x55b28750fa7d]
B[0004]: 0x55b28752694b, /usr/bin/pihole-FTL(+0x6694b) [0x55b28752694b]

From line B[0004] (you need to find a line where the code offset is directly shown) we can compute:
offset = 0x55b28752694b - 0x6694b = 0x55b2874c0000 (64 bit pointer)

Using this offset, we can compute the addresses in the binary:

B[0002]: 0x55b28752b018 - offset = 0x6b018
B[0003]: 0x55b28750fa7d - offset = 0x4fa7d
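For anyone who wants to double-check the hex arithmetic, here is a minimal Python sketch of the two subtractions. The function names load_offset and binary_address are invented for this example and are not part of FTL.

```python
def load_offset(runtime_addr: int, code_offset: int) -> int:
    """Address at which the binary was mapped into memory.

    runtime_addr is the absolute address of a backtrace frame,
    code_offset is the "+0x..." value shown on that same frame.
    """
    return runtime_addr - code_offset

def binary_address(runtime_addr: int, offset: int) -> int:
    """Translate an absolute frame address into an address inside the binary."""
    return runtime_addr - offset

# Values copied from the x86_64 backtrace quoted above
offset = load_offset(0x55b28752694b, 0x6694b)        # frame B[0004]
print(hex(offset))
print(hex(binary_address(0x55b28752b018, offset)))   # frame B[0002]
print(hex(binary_address(0x55b28750fa7d, offset)))   # frame B[0003]
```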

Using these addresses, we can get where in the binary the crash happened:

# addr2line 0x4fa7d 0x6694b 0x6b018 -e ./pihole-FTL-linux-x86_64
/root/project/src/dnsmasq/forward.c:1486
/root/project/src/dnsmasq/dnsmasq.c:1786
/root/project/src/dnsmasq/network.c:131  <---------

Now, repeat the same with the data from above:

B[0000]: 0x4488e4, /usr/bin/pihole-FTL(+0x238e4) [0x4488e4]
B[0002]: 0x46fa22, /usr/bin/pihole-FTL(iface_check+0x35) [0x46fa22]

offset = 0x4488e4 - 0x238e4 = 0x425000 (32 bit pointer)

and with this:

B[0002]: 0x46fa22 - offset = 0x4aa22
# addr2line 0x4aa22 -e ./pihole-FTL-arm-linux-gnueabihf 
/root/project/src/dnsmasq/network.c:131  <---------

You can see that both crashed at dnsmasq/network.c:131
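The manual steps can also be scripted. The Python sketch below is only an illustration of the procedure described in this post, not the mechanism planned for v5.1: it parses backtrace lines, looks for a frame that shows the code offset directly, and converts every pihole-FTL frame into an address that can be passed to addr2line. The names FRAME_RE, parse_frames and relative_addresses are made up for this example, and the sketch assumes all pihole-FTL frames belong to one mapping of the binary.

```python
import re

# Matches frames such as
#   B[0002]: 0x46fa22, /usr/bin/pihole-FTL(iface_check+0x35) [0x46fa22]
FRAME_RE = re.compile(
    r"B\[(?P<idx>\d+)\]:\s+0x(?P<addr>[0-9a-fA-F]+),\s+"
    r"(?P<path>[^(\s]+)\((?P<sym>[^)]*)\)"
)

def parse_frames(lines):
    """Return (index, absolute address, file path, symbol text) per frame."""
    frames = []
    for line in lines:
        m = FRAME_RE.search(line)
        if m:
            frames.append((int(m["idx"]), int(m["addr"], 16), m["path"], m["sym"]))
    return frames

def relative_addresses(lines, binary="/usr/bin/pihole-FTL"):
    """Map frame index -> address inside the binary, for frames of `binary`."""
    frames = [f for f in parse_frames(lines) if f[2] == binary]
    for _, addr, _, sym in frames:
        if sym.startswith("+0x"):  # code offset is shown directly
            offset = addr - int(sym[1:], 16)
            break
    else:
        raise ValueError("no frame with a directly shown code offset")
    return {idx: addr - offset for idx, addr, _, _ in frames}

# Frames taken from the armhf crash report in this topic
backtrace = [
    "B[0000]: 0x4488e4, /usr/bin/pihole-FTL(+0x238e4) [0x4488e4]",
    "B[0002]: 0x46fa22, /usr/bin/pihole-FTL(iface_check+0x35) [0x46fa22]",
]
rel = relative_addresses(backtrace)
# Print a command line that can be run next to the downloaded binary
print("addr2line", *(hex(a) for a in rel.values()),
      "-e", "./pihole-FTL-arm-linux-gnueabihf")
```

The script only prints the addr2line command instead of running it, so it works even on a machine where binutils is not installed.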


In the future, all these steps will be done automatically, with the source files and line numbers added directly to the crash report.


Thanks for the explanation, really appreciated!

Any idea when 5.1 might be released either as beta for testing or final?

We plan to have it ready within a few weeks. However, note that this bug is not fixed in v5.1. The issue is non-trivial (I cannot solve it by looking at the code alone), I am still unable to reproduce it myself, and I am still waiting for crash reports with debugger output so that I can try to fix it even though I cannot trigger the crash locally.

Switching the upstream DNS setting away from Cloudflare fixed the DNS resolver/FTL crash. I could go back to Cloudflare and attach a debugger, but these two RPis are in a production environment and it causes chaos if debugging disrupts access.

We still don't know what is going on, but we're trying hard to fix it. If you cannot afford to break your system, then that's absolutely fine! We will surely find a solution in the end.
