Pihole randomly stopped passing traffic

My primary Pi-hole stopped passing traffic in the middle of the night for no apparent reason (although, after reviewing my logs further, it appears to have been triggered by the Daily apt download activities at 03:27:05).
Pihole2 syslog

Expected Behaviour:

This little RPi3B+ has been absolutely incredible. I'm honestly surprised my micro SD card hasn't failed in the ~4 years or so that we've been running it. Thankfully the Pihole4 VM picked right up so we didn't lose connectivity, but I'd be a lot more comfortable if I could have some assistance trying to big brain the solution here. I'm obviously floundering.

OS: Raspbian GNU/Linux 10 (buster) armv7l
Host: Raspberry Pi 3 Model B Plus Rev 1.3
Kernel: 5.10.17-v7+
Uptime: 3 hours, 58 mins
CPU: BCM2835 (4) @ 1.400GHz
Memory: 120MiB / 973MiB

Actual Behaviour:

Pi-hole stopped passing traffic. This was the very last message from my pihole.log:

Jun 12 03:27:35 dnsmasq[1822]: query[AAAA] www.google.com from 192.168.107.216
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^$
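A run of `^@` characters like the one above is how many pagers and editors display NUL (0x00) bytes. A quick way to confirm whether a log file contains them is to count them. The helper below is only a sketch: the function name `count_nul_bytes` is made up for illustration, and the log path in the usage note is an assumption about where this install keeps its log.

```shell
#!/bin/sh
# count_nul_bytes FILE: print how many NUL (0x00) bytes FILE contains.
# (Hypothetical helper name, for illustration only.)
count_nul_bytes() {
    # tr -cd '\000' deletes every byte except NUL; wc -c counts what is left.
    tr -cd '\000' < "$1" | wc -c | tr -d ' '
}
```

Usage, assuming the log lives at /var/log/pihole.log: `count_nul_bytes /var/log/pihole.log`. A non-zero count in a text log usually means a write was cut short, for example by a power loss or an unclean shutdown.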

Debug Token:

https://tricorder.pi-hole.net/bcaa7r7inz

Thanks for taking the time to glance at my little issue(s).

I suspect this is the case. Your debug log doesn't show anything abnormal.

Why do you run this cron script?

I believe it was triggered automatically by my UnattendedUpgrades. What do best practices for Pi-hole say about the proper method for keeping my various machines up to date? Thanks for your input, sir.

I have the following entry in my /etc/apt/apt.conf.d/20auto-upgrades:

APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
APT::Periodic::Download-Upgradeable-Packages "1";
APT::Periodic::AutocleanInterval "7";

sudo systemctl status apt-daily.timer

● apt-daily.timer - Daily apt download activities
   Loaded: loaded (/lib/systemd/system/apt-daily.timer; enabled; vendor preset: enabled)
   Active: active (waiting) since Sat 2021-06-12 03:08:29 EDT; 16h ago
  Trigger: Sun 2021-06-13 01:01:58 EDT; 5h 0min left

Jun 12 03:08:29 pihole2 systemd[1]: Started Daily apt download activities.

Sounds about right, I believe this confirms our original suspicions. My bad: it was triggered through the UnattendedUpgrades package, which I discovered has a sixty-minute RandomizedDelaySec variance between trigger times. Apparently it's ONLY the reboot time which I have manually selected (typically I get an email via the bsd-mailx service, I think, if anything goes awry).
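For anyone who wants to see where that variance comes from, the scheduling directives of a timer unit can be filtered out of its unit text. This is a minimal sketch: `timer_schedule` is a hypothetical helper name, and the list of directives it matches is my own selection, not something taken from this thread.

```shell
#!/bin/sh
# timer_schedule: read a timer unit's text on stdin and keep only the
# lines that control when it fires. (Hypothetical helper, for illustration.)
timer_schedule() {
    grep -E '^(OnCalendar|OnBootSec|OnUnitActiveSec|RandomizedDelaySec|Persistent)='
}
```

On the Pi this could be fed from `systemctl cat apt-daily.timer | timer_schedule`, and the same for `apt-daily-upgrade.timer`. `systemctl list-timers` shows the next computed trigger time, random delay included.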

Sorry, my bad. Lesson learned: RTFM before posting in a half-panicked state. I should have been able to work through this one on my own. Appreciate your diligence, sir. Have a wonderful rest of your weekend. Stay safe.

Okay, so the Daily apt download activities timer was triggered at 03:08:29 EDT and eventually mucked something up, but what? I can't really make sense of the syslog, but apparently something was corrupt that it couldn't recover from:

Jun 12 03:08:29 pihole2 systemd-fsck[247]: 0x41: Dirty bit is set. Fs was not properly unmounted and some data may be corrupt.

Jun 12 03:09:01 pihole2 CRON[955]: (root) CMD (  [ -x /usr/lib/php/sessionclean ] && if [ ! -d /run/systemd/system ]; then /usr/lib/php/sessionclean; fi)
Jun 12 03:09:04 pihole2 systemd[1]: Starting Clean php session files...
Jun 12 03:09:04 pihole2 systemd[1]: phpsessionclean.service: Succeeded.
Jun 12 03:09:04 pihole2 systemd[1]: Started Clean php session files.
Jun 12 03:10:01 pihole2 CRON[1008]: (root) CMD (   PATH="$PATH:/usr/sbin:/usr/local/bin/" pihole updatechecker local)
Jun 12 03:17:01 pihole2 CRON[1054]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Jun 12 03:19:57 pihole2 rngd[323]: stats: bits received from HRNG source: 3380064
Jun 12 03:19:57 pihole2 rngd[323]: stats: bits sent to kernel pool: 3300512
Jun 12 03:19:57 pihole2 rngd[323]: stats: entropy added to kernel pool: 3300512
Jun 12 03:19:57 pihole2 rngd[323]: stats: FIPS 140-2 successes: 168
Jun 12 03:19:57 pihole2 rngd[323]: stats: FIPS 140-2 failures: 1
Jun 12 03:19:57 pihole2 rngd[323]: stats: FIPS 140-2(2001-10-10) Monobit: 0
Jun 12 03:19:57 pihole2 rngd[323]: stats: FIPS 140-2(2001-10-10) Poker: 1
Jun 12 03:19:57 pihole2 rngd[323]: stats: FIPS 140-2(2001-10-10) Runs: 0
Jun 12 03:19:57 pihole2 rngd[323]: stats: FIPS 140-2(2001-10-10) Long run: 0
Jun 12 03:19:57 pihole2 rngd[323]: stats: FIPS 140-2(2001-10-10) Continuous run: 0
Jun 12 03:19:57 pihole2 rngd[323]: stats: HRNG source speed: (min=424.805; avg=831.668; max=1065.128)Kibits/s
Jun 12 03:19:57 pihole2 rngd[323]: stats: FIPS tests speed: (min=4.255; avg=7.429; max=16.629)Mibits/s
Jun 12 03:19:57 pihole2 rngd[323]: stats: Lowest ready-buffers level: 2
Jun 12 03:19:57 pihole2 rngd[323]: stats: Entropy starvations: 0
Jun 12 03:19:57 pihole2 rngd[323]: stats: Time spent starving for entropy: (min=0; avg=0.000; max=0)us
Jun 12 03:20:01 pihole2 CRON[1079]: (root) CMD (   PATH="$PATH:/usr/sbin:/usr/local/bin/" pihole updatechecker local)
Jun 12 03:27:05 pihole2 systemd[1]: Starting Daily apt download activities...
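Since the syslog is hard to read by eye, one option is to pull out just the filesystem-check lines. A rough sketch, assuming a syslog-format text file; the helper name `fsck_warnings` is made up for this example.

```shell
#!/bin/sh
# fsck_warnings FILE: print the systemd-fsck lines from a syslog-format file.
# (Hypothetical helper, for illustration only.)
fsck_warnings() {
    # -a treats the file as text even if it contains NUL bytes.
    grep -a -E 'systemd-fsck\[[0-9]+\]:' "$1"
}
```

Usage: `fsck_warnings /var/log/syslog`. systemd-fsck runs when a filesystem is checked at boot, so the timestamps on these lines also show when the machine last came up.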

If your filesystem isn't properly unmounted, writes to open files may have failed, and sooner or later that is likely to corrupt, seemingly at random, any program that relies on some kind of persistent state.

You should try to find out which of your unattended upgrades would require or enforce unmounting filesystems, and probably shouldn't perform those upgrades unattended.

If that's not possible, it may be advisable, as a precautionary measure, to stop certain processes explicitly before you initiate your unattended upgrades and restart them afterwards.
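The stop, upgrade, restart idea could be sketched roughly as follows. Everything here is an assumption for illustration: the wrapper itself, the `DRY_RUN` switch, and the choice of `pihole-FTL` as the service to stop. It would have to run as root in place of the automatic timer, not alongside it, and DNS on this host is down while the service is stopped.

```shell
#!/bin/sh
# Hypothetical wrapper: stop a service, run the upgrade, start it again.
# SERVICE and DRY_RUN are illustrative assumptions, not from the thread.
SERVICE="${SERVICE:-pihole-FTL}"

run() {
    # With DRY_RUN=1, print the command instead of executing it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

guarded_upgrade() {
    run systemctl stop "$SERVICE" || return 1
    run unattended-upgrade
    rc=$?
    # Start the service again even if the upgrade failed.
    run systemctl start "$SERVICE" || return 1
    return "$rc"
}
```

Calling `guarded_upgrade` with `DRY_RUN=1` set only prints the three commands, which is a safe way to check the order before trying it for real.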

Thank you kindly for the detailed explanation, sir. At this point I'm shifting gears and going to assume my micro SD card in the RPi3B+ is finally going bad. Things just started acting weird: my device no longer passes traffic through the wireguard tunnel (I see tx: traffic, but rx: is perpetually stuck at 0 B), and the only thing that makes sense at this point is impending drive failure.

That's my best uneducated guess right now. Probably should be performing backup/recovery before I lose complete access to the device and data. Already used Teleporter and backed up my wireguard configs. Thanks again @Bucking_Horn !
