Further to PR#6465 (#6465) and the dropping of PermissionsStartOnly, I have a few extra suggestions to the unit file. The ReadWritePaths directive can be changed to
ConfigurationDirectory=pihole
which grants read/write access to /etc/pihole. If you add
LogsDirectory=pihole
which grants read/write access to /var/log/pihole, then you can change ProtectSystem:
ProtectSystem=strict
which mounts the entire filesystem as read-only.
Additionally, I have also set
RuntimeDirectory=pihole
which will place the PID file under /run/pihole (requires additional configuration in the .toml file.
I have applied other settings as well which I am happy to share in case this is an ongoing item where further work is planned.
Agreed, though it is not that simple, but requires more changes:
ConfigurationDirectory=pihole alone does not guarantee write access if ProtectSystem=full is used. The prior only guarantees the dir exists with correct ownership. The latter can only be overridden by ReadWritePaths. I see ReadWriteDirectories is used, which seems to still do the same looking at the code, but it is not documented anymore, hence seems to be deprecated. So ConfigurationDirectory=pihole does make sense to potentially replace some steps done in the prestart script, and ReadWriteDirectories should be changed to ReadWritePaths, to prevent some difficult to debug surprises once systemd removes the prior entirely.
LogsDirectory=pihole makes sense as well, since all logs have been moved to that subdir, and makes related prestart steps obsolete.
I also agree to ProtectSystem=strict. Just note that some more dirs might need to be added to ReadWritePaths, like /tmp, and then PrivateTmp=1 would be a good addition.
RuntimeDirectory=pihole at the moment would be unused, since /run/pihole-FTL.pid is the PID file. But that should be changed, moving it to /run/pihole/FTL.pid or so, or removing it entirely. Actually with systemd, no PID file is needed (unless used for something important internally). Same as with /etc/pihole, /run/pihole then would need to be added to ReadWritePaths, if FTL itself shall be able to create the PID file.
With these changes, prestart and poststop would not be needed anymore. Overwriting all permissions recursively on every service start should not be needed, is highly uncommon, might break some individual admin decisions etc. If there is some concern that admins break their setup by applying false permissions, the installer could be checked to assure a repair applies correct permissions or so.
However, there is still the SysV/init.d service installed if systemd is not the init system. For this, prestart and poststop steps, including the PID file, are still needed. So keeping these stings instead of adding them to the systemd unit could be seen as reduced redundancy/duplication. However, I personally would still prefer to remove them from the systemd unit with above changes. This would harden like 99% of bare-matel cases, I guess ... excluding the Docker container which has its own start.sh which makes use of prestart and poststop. Though that one actually does not require most of it, since it is a lot harder to mess with permissions inside a container.
Okay this cannot be reasonably ported to native systemd directives. This also makes ProtectSystem=strict difficult, as one would additionally need to add all custom paths to ReadWritePaths. And I know how annoying it is for users to run into permission issues, while UNIX owner/modes are all fine, driving people crazy by times who do not know this additional systemd sandboxing .
Some really good points made here, thanks! My comments:
You (@MichaIng) are correct about write access not guaranteed. This can be addressed by setting ConfigurationDirectoryMode= to a desired value, e.g. 775. Same applies for LogsDirectoryMode= and RuntimeDirectoryMode=.
ReadWritePaths= was added in systemd 231, so I agree this would be the correct replacement.
LogsDirectory=piholemakes life easier as well and neatens up the file a bit, but @rdwebdesign is absolutely correct that it may create problems if the user changes the logging directory.
I would argue that setting PrivateTmp= is a good idea regardless of anything else, but agreed that it will need to be done if ProtectSystem=strict is used, as this would case /tmp to be read-only as well.
RuntimeDirectory= would also be a good idea (required unless making /run writeable) along with the associated PID location change. Same caveat as the logging though, although I think it is less likely that users will change the PID file location.
For ReadWritePaths=, the variables ConfigurationDirectory=, RuntimeDirectory= and LogsDirectoryMode= will take care of this (assuming the permissions are correct which can be handled with the associated mode flags). This negates the need to list these paths explicitly. I went as far as using AppArmor to try figuring out where FTL needs write permissions and I have not come up with anything (after running for about 6-8 weeks) which falls outside of those directories.
I am happy to share snippets from my service file if you would like; while the full file is too strict to be used as a baseline for all users, perhaps it will help as it has been running flawlessly for me.
P.S. @MichaIng totally agree on the troubleshooting of sandbox configuration, unless you know what you are looking for, it can be really difficult.
Nope, as said, the directory mode and owner, i.e. UNIX permissions, have no impact on mount namespacing/sandboxing. It is as if you attach a R/O drive. You cannot undo ProtectSystem=full/strict (or similar systemd sandboxing directives) without ReadWritePaths.
The immediate problem from the perspective of the static service file are the configurability of the log file and pid file locations as described by @rd webdesign above.
There is the possibility of having different prestart/poststop files for different init systems. But that duplication, and then having those diverge over time increases the maintenance burden for very little gain.
Forgive me if I am misunderstanding you, it has been a long day. Assuming I am interpreting your statement correctly, I respectfully disagree:
It is recommended to enable this setting for all long-running services, unless they are involved with system updates or need to modify the operating system in other ways. If this option is used, ReadWritePaths= may be used to exclude specific directories from being made read-only. Similar, StateDirectory=, LogsDirectory=, โฆ and related directory settings (see below) also exclude the specific directories from the effect of ProtectSystem=.
This tallies with my own configuration: I have no ReadWritePaths directive in my file, but logs and modifications to the configuration directory work as expected, as well as the PID file.
Please correct me if I am missing the point here - not trying to argue, merely questioning.
Nah the whole point of this request/suggestion is to move things to native systemd sandboxing directives instead of running external pre/post scripts as root user. One could argue that, if someone really changes those paths, one can assume that the admin knows how to prepare the new location with correct permissions, and that no prestart script or even systemd unit directive is needed to assure those are correct on every start.
But again, this this works sufficiently well with ProtectSystem=full, it does not with ProtectSystem=strict. So I think we can cross out the latter at this point in any case.
Looks like this has changed somewhat recently. This sentence appeared in the docs with v252, was not yet inside with v251: systemd.exec
This e.g. means it works with Debian Bookworm, but not with Bullseye yet. So I guess it is too early to rely on this for Pi-hole. I remember I tested it with PrivateTmp recently, and that one at least does not make /tmp writable. But while one might assume it similarly there, this is actually an additional sandboxing feature, so not comparable, and also not mentioned in the docs in this regards.
Yes that is where we were missing each other, thanks for tracking that down. My system uses v252 which explains why I do not need to manually set excluded paths other than using those directives.
Would further sandboxing directives, e.g.:
ProtectKernelModules=
ProtectKernelLogss=
ProtectKernelTunables=
ProtectControlGroups=
LockPersonality=
be worthy of inclusion?
I agree that it makes sense to rule out using ProtectSystem=strict at this point, given (at a minimum) the systemd dependency.
Just as overview for what I started to use where applicable:
ProtectHome=1 # no read access to /home and /root
PrivateUsers=1 # no access to user list
PrivateTmp=1 # own /tmp
PrivateDevices=1 # empty /dev aside of /dev/null /dev/zero /dev/random etc
ProtectKernelTunables=1 # most things in /proc R/O
ProtectControlGroups=1 # /sys/fs/cgroup R/O
ProtectKernelModules=1 # prevents modprobe and /lib/module access
ProtectKernelLogs=1 # no access to /dev/kmsg and /proc/kmsg
ProtectClock=1 # /dev/rtc0 and /dev/rtc1 R/O
Some syscall filters are also included with some of above directives, which could otherwise be used to bypass the restriction.
I guess PrivateDevices could be problematic for some networking stuff.
I didn't know about LockPersonality. Might this interfere with emulation, e.g. if CI tests foreign arch binaries or containers via QEMU userland emulation/binfmt? Of course this is rarely done via systemd unit, just thinking.
Thanks for the ping. Given the discussion so far, I can understand the reasons and could put my script somewhere else. The question, though, is: where to put auxiliary scripts and under which user should they run? (In my case DHCP updates trigger an MQTT message.)
ProtectHome=true is something I run as well. No issues observed (although I do not use the DHCP functionality in Pi-hole).
PrivateUsers= is something I actually tried earlier today for the first time - I saw warnings in the logs so undid it immediately, but perhaps further testing is required on my part.
Agree on PrivateTmp=
PrivateDevices=trueis also something I run. No issues observed, but wonder if this would interfere with RTC clock for example if you had one attached to a Pi? Guess it would be a niche use case though.
ProtectKernelxxxxis also something I have been using without problems.
Same for ProtectControlGroups=
How would ProtectClock work here? I am not using the NTP server functionality of Pi-hole either, but if you were using it, would Pi-hole not manage the host clock as well? In which case you would need to allow access to the clock in order to modify the time etc. Is my understanding wrong?
Not certain about LockPersonality but I have also combined this with SystemCallArchitecturesand not seen anything break.
Yes, this would definitely cause problems with the RTC related NTP functionality (Which access the RTC via /dev/efirtc, /dev/misc/efirtc, /dev/rtc0, /dev/rtc, /dev/misc/rtc), or other user specified device)., not just on a Rasberry Pi, but on any other system.
Edited to add, I believe that these are the only hardware devices present in the FTL source, other than pseudodevices like /dev/null, /devurandom etc. So this could be dealt with by the following:
Thanks, this would also preclude ProtectClockfrom being used if the option to update the clock is ticked. Devices can be whitelisted with DeviceAllowbut that may not be workable in such a case.
I think there is still a bit more on the table in terms of hardening.
At this point I've only tested on systems running Trixie with systemd257, so some of this may need to be wound back for older versions, especially where the controls were less granular.
Output of sudo systemd-analyze security pihole-FTL.service
Before (current development branch): โ Overall exposure level for pihole-FTL.service: 9.0 UNSAFE ๐จ
After (additions listed below): โ Overall exposure level for pihole-FTL.service: 3.8 OK ๐
These are the current changes I have been testing. I believe everything is working correctly at these settings but will continue to monitor the situation.
###### Hardening discussion
# https://discourse.pi-hole.net/t/suggestions-for-systemd-service-file
# Limit allowable system calls
# Begin by blocking everything
SystemCallFilter=~@known
# Then add allowable calls
SystemCallFilter=@system-service
# @clock is not needed on systems not using Pi-hole's NTP time sync features
SystemCallFilter=@clock
SystemCallFilter=@debug
# Limit Capabilities
# Similarly CAP_SYS_TIME isn't needed unless using NTP (Similarly for Ambient Capabilities above)
CapabilityBoundingSet=CAP_NET_BIND_SERVICE CAP_NET_RAW CAP_NET_ADMIN CAP_SYS_NICE CAP_IPC_LOCK CAP_CHOWN CAP_SYS_TIME
# Restrict name spaces
RestrictNamespaces=yes
# Prevent switching personality
LockPersonality=yes
# Clear IPC at unload
RemoveIPC=yes
# Isolate Pi-hole's temporary files
PrivateTmp=yes
# Prevent writing to /home hierarchy
# Denying this entirely prevents users from making use of locally stored lists
ProtectHome=read-only
# Prevent modification to CGROUP hierarchies
ProtectControlGroups=yes
# Prevent access to keyring
KeyringMode=private
# Prevent altering the system's hostname or NIS domain name
ProtectHostname=yes
# Prevent loading any additional kernel modules
ProtectKernelModules=yes
# Restrict address families
RestrictAddressFamilies=AF_INET AF_INET6 AF_LOCAL AF_NETLINK
# Ignore setuid and setgid bits, and prevent them from being set
NoNewPrivileges=Yes
RestrictSUIDSGID=yes
# Deny access to kernel logs
ProtectKernelLogs=yes
# Don't create globally readable files by default
# (Can still create files and set permissions, but if permissions are not specified, files are readable neither by group nor world)
UMask=0077
# Limit which files the service may execute
# Begin by blocking everything
NoExecPaths=/
# Allow the FTL daemon and gravity update to execute
# (Allows the "Update Gravity" feature of the WebUI to work)
ExecPaths=/opt/pihole/gravity.sh
# Device Policies - restrict Pi-Hole to pseudo devices (eg /dev/null)
# and those devices specifically permitted below
# Note 3 - There is maybe the risk of this blocking some oddball RTCs that don't use any of the documented /dev locations ? Do these even exist?
# Note 2 - The DevicePolicy can be changed to "strict" and the DeviceAllow directives removed from systems not making use of Pi-Hole's NTP time synchronization functionatlity
# Note 1 - DevicePolicy supercedes the older PrivateDevices directive, with more granularity
DevicePolicy=closed
# The following RTC devices are required for Pi-Hole to be able to
# adjust the system's real time clock
DeviceAllow=/dev/efirtc
DeviceAllow=/dev/misc/efirtc
DeviceAllow=/dev/rtc*
DeviceAllow=/dev/rtc
DeviceAllow=/dev/misc/rtc
# Allow Pi-Hole to adjust system time
# This may be set to "yes" on systems not making use of Pi-holes NTP time synchronisation features
ProtectClock=no
I do hope you are planning a pull request at some point @Distort7841