Now you have a confirmed second case, thanks for that @beerns
A user on another Dutch forum makes the following remark (translated):
For some reason my pihole triggers the pihole-FTL restart section of the script. There are three choices there:
svc="kill -SIGRTMIN $(pidof ${resolver})"
svc="killall -s SIGHUP ${resolver}"
svc="service ${resolver} restart"
Unfortunately, I haven't been able to figure out why my pihole makes the service choice, while your pihole obviously makes another choice.
Given the fact that cron executes the script with the path /usr/bin:/bin:/usr/local/bin/ (see earlier), the script will fail in only one of the three cases:
kill -> which kill -> /bin/kill -> success (on the path used by cron)
killall -> which killall -> /usr/bin/killall -> success (on the path used by cron)
service -> which service -> /usr/sbin/service -> fail (NOT on the path used by cron)
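The lookup difference above can be reproduced without waiting for a cron run. The sketch below uses a scratch directory as a stand-in for the real filesystem (the fake "service" script and directory layout are illustrative), so it runs anywhere without root:

```shell
#!/bin/sh
# A fake "service" that lives only in an sbin-style directory,
# mirroring /usr/sbin/service on Raspberry Pi OS.
tmp=$(mktemp -d)
mkdir -p "$tmp/usr/bin" "$tmp/usr/sbin"
printf '#!/bin/sh\necho restarting\n' > "$tmp/usr/sbin/service"
chmod +x "$tmp/usr/sbin/service"

# Login-shell-style PATH (sbin included): lookup succeeds.
login_hit=$(PATH="$tmp/usr/bin:$tmp/usr/sbin" command -v service || echo "")
# cron-style PATH (no sbin): lookup fails.
cron_hit=$(PATH="$tmp/usr/bin" command -v service || echo "")

echo "login PATH: ${login_hit:-NOT FOUND}"
echo "cron PATH:  ${cron_hit:-NOT FOUND}"
rm -rf "$tmp"
```

The kill and killall cases succeed under cron because /bin and /usr/bin are on cron's PATH; only the service binary sits in a directory cron never searches.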
All I do before cron executes the weekly gravity update (I run this on Saturday at 23h00; success or failure is logged) is download some lists locally and process them to make them pihole compatible, for example the shallalist (using an md5 checksum to ensure integrity) and the duckduckgo trackers list.
Having looked into this for more than a day now, I'm convinced this is a real problem that some users may or may not notice.
For now, I solved the problem by creating a symbolic link for service:
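The exact symlink command isn't quoted in the post, but later replies mention a link from the protected sbin directory into /usr/local/bin, so a reconstruction would be `sudo ln -s /usr/sbin/service /usr/local/bin/service`. The sketch below demonstrates the effect in a scratch directory (all paths are stand-ins) so it runs without root:

```shell
#!/bin/sh
# Scratch layout standing in for /usr/sbin and /usr/local/bin.
tmp=$(mktemp -d)
mkdir -p "$tmp/usr/sbin" "$tmp/usr/local/bin"
printf '#!/bin/sh\necho service-ran\n' > "$tmp/usr/sbin/service"
chmod +x "$tmp/usr/sbin/service"

# The workaround: link the sbin binary into a directory cron DOES search.
ln -s "$tmp/usr/sbin/service" "$tmp/usr/local/bin/service"

# /usr/local/bin is on cron's PATH, so the lookup now succeeds:
out=$(PATH="$tmp/usr/local/bin:$tmp/usr/bin" service)
echo "$out"
rm -rf "$tmp"
```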
Another possible (better?) solution would be to change the code:
…
else
# A full restart has been requested
serviceCommand=$(which service)
svc="${serviceCommand} ${resolver} restart"
str="Restarting DNS server"
fi
…
It would probably be wise to do the same for the kill and killall binaries, to avoid future problems.
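A minimal sketch of that idea, assuming the variable names from the quoted script excerpt (everything except ${resolver} and svc is illustrative): resolve each binary's full path once, with the sbin directories appended explicitly, so the later invocations no longer depend on cron's PATH.

```shell
#!/bin/sh
resolver="pihole-FTL"

# Append the sbin directories that cron's PATH lacks, then resolve
# full paths up front so later invocations are PATH-independent.
lookup_path="${PATH}:/usr/sbin:/sbin"
killCommand=$(PATH="$lookup_path" which kill)
killallCommand=$(PATH="$lookup_path" which killall)
serviceCommand=$(PATH="$lookup_path" which service)

svc="${serviceCommand} ${resolver} restart"
echo "would run: ${svc}"
```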
If I just run pidof pihole-FTL, I get a valid process identifier; I have no idea why it would not find this when cron runs the script (which pidof -> /bin/pidof -> on the path used by cron).
Do you at least agree there might be a problem on some systems, reason currently unknown?
Of course - otherwise I wouldn't have invested time to track this down
I'm just not the one to decide if this is a user-generated issue or a bug. But you already provided a good hint/workaround/solution the devs can look into.
I have removed all my changes, added your modification to the pihole cron job, and added a temporary echo command to the script to verify that the script (run by cron) is actually executing the service section again.
Test successful, will this be in pihole 5.1.2?
Any idea why the script chooses the service option? yubiuser has examined the code; apparently pidof pihole-FTL fails, which I do not understand, since FTL was in use and responsive during the updateGravity process.
Really, I'm not ack'ing or nack'ing anything. I was merely providing an option that would not carry the dangers of symlinking the init system controller from a protected /sbin to the world's /usr/local/bin.
I don't know, we'd need to know if this affects more than this presented case. And we'd need to know if this is a fix or a bandaid covering something deeper that needs to be resolved.
Not offhand, I don't have any environment to try to duplicate the issue.
I'm not clear on what the web interface screenshot is in reference to, can you explain what that was meant to show?
I hope my comments and tests, as well as those of the second user who noticed the problem, are sufficient indication that something is wrong.
A simple test (a bash script containing echo "${PATH}" > /home/pi/result.txt, executed by a cron job) clearly shows that /usr/sbin is not in the PATH, at least not on the latest Raspberry Pi OS (32-bit) Lite (release date: 2020-05-27), which makes the service command fail. In my opinion, a normal user will probably not notice that the new entries in gravity aren't in use.
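The same check can be run immediately instead of waiting for a cron slot by stripping the environment the way cron does. A sketch, assuming the PATH value cron was observed to use earlier in this thread (env -i stands in for cron here):

```shell
#!/bin/sh
# Simulate cron's stripped-down environment; the PATH value below is the
# one observed for cron earlier in the thread, not a universal default.
cron_path=$(env -i PATH="/usr/bin:/bin:/usr/local/bin" sh -c 'echo "${PATH}"')
echo "cron-like PATH: ${cron_path}"

case ":${cron_path}:" in
    *:/usr/sbin:*) echo "/usr/sbin present: service would be found" ;;
    *)             echo "/usr/sbin missing: service lookup fails" ;;
esac
```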
The screenshot is just there to show the pi, and thus the script, executed by cron, is using pihole-FTL as a resolver.
edit
why did I finally go for the symbolic link solution? To avoid the change being undone by pihole -up OR pihole -r
/edit
As I already indicated, I've removed all changes from my system and use the solution, proposed by Dan Schaper, that works.
Your suggestion also works: pihole-FTL restarts, and sudo service pihole-FTL status indicates an active status of 20 sec (command executed immediately after updateGravity completed).
I don't see the point of this test. My assumption, confirmed by some tests, is that cron has a hardcoded path, /usr/sbin is NOT included, which makes the service command fail.
There are several methods suggested in this topic to overcome that. I used the symbolic link to avoid conflicts with pihole -up and pihole -r, but what is really needed is a code change: either the cron job change suggested by Dan Schaper, or the $(which service) solution in the script.
I really don't care which solution is finally chosen; I am, however, convinced users aren't aware they are affected by this problem and are thus never applying their weekly gravity update (pihole-FTL doesn't restart, reload, or whatever).
My problem is (temporarily) solved; my pihole customization scripts are handling it. The only question I have is why my system always restarts using service (as explained above, pidof pihole-FTL doesn't produce a proper result).
I'm just trying to identify what is different on systems that don't work. The more help I can get with this, the better. If someone can tell me how to break my system accordingly, I will do that as well.
The point is that my command overwrites PATH. As you have seen, there is no /usr/sbin included anymore. Yet it succeeds. I'm just looking for the reason why I (and apparently many others) are not seeing an issue here and others are.
I definitely know that FTL is restarted once a week for me because I'm monitoring my development system very well to catch potential issues.
What I'm concerned about is that the description of this seems to be fairly new (it is not that it has never worked), even though this file has not changed in 2+ years.
Hmm, are you getting exactly one PID? Or maybe multiple? There is an issue with killall as well as kill + pidof in case pidof returns multiple PIDs. It is possible for FTL to have multiple PIDs at the same time (TCP workers, script helpers, ...) so this is something we should look into.
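That situation is easy to reproduce with any program running as several identically named processes (sleep stands in for pihole-FTL below; this is a sketch of the symptom, not of FTL's actual worker behaviour):

```shell
#!/bin/sh
# Start two identically named processes; pidof returns both PIDs.
sleep 30 & p1=$!
sleep 30 & p2=$!

pids=$(pidof sleep)
count=$(set -- $pids; echo $#)
echo "pidof returned: ${pids} (${count} PIDs)"

# With multiple PIDs, "kill -SIGRTMIN $(pidof ${resolver})" signals every
# copy, and killall likewise matches all of them, not just the main process.
kill "$p1" "$p2"
```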
I tested this this afternoon, and again just now; I get a single PID as the reply.
Something I've asked before, but which was never answered: sudo service pihole-FTL status says:
Active: active (exited) since Mon 2020-07-20 23:19:36 CEST; 20s ago
exited, could this be an indicator of the problem?
Anyway, by simply using the full path to service, OR using the $(which service) solution, OR adding the PATH in the cron job, the error disappears.
I noticed you omitted /usr/sbin from your command, so I ran the test again, this time with a modified script that adds an echo to each of the three possible choices (kill, killall, service).
When executing your (ubuntu) command, something like:
No, this is a long-standing init.d issue. FTL forks during startup to become independent of the starting terminal; init.d interprets this as the process having exited. systemd solves this more elegantly by actually looking for the process.
I was almost afraid something like this would happen. I will check more when back from work today.
Came here after reading @jpgpi250 's post on another forum and have the same issue.
Installation: have been upgrading pi-hole from 4.x.x to 5.0 beta to 5.0 to 5.0 dev to 5.1 to current dev on Raspbian Lite on a Pi Zero.
I saw that @jpgpi250 mentioned a 'fix' by @DanSchaper, would that be this: 33 3 * * 7 root PATH="$PATH:/usr/local/bin/:/usr/sbin/" pihole updateGravity >/var/log/pihole_updateGravity.log || cat /var/log/pihole_updateGravity.log
(so should this be added to the cron jobs?)
Thank you for reporting the problem. Did you notice the problem yourself (daily use), or only after checking /var/log/pihole_updateGravity.log, as I indicated?
The file /etc/cron.d/pihole contains a line (job) with the following content (time may be different):
WARNING: pihole -up, pihole -r, and pihole checkout (if core is included) will undo this change, so you need to reapply the fix after executing one of these commands.
Don't forget to remove the symbolic link you created in /usr/local/bin (if it still exists).
Strange that this solution doesn't work; it worked for me. I chose it because I assumed pihole -up wouldn't affect it (i.e., it would be permanent), but apparently that is also a false assumption.
The log file is only created when the job is run by cron; it isn't created/updated when running pihole -g. Look at the file date: it will not match the time you ran pihole -g.