MS Teams gets no presence status for contacts

Expected Behaviour:

MS Teams and Outlook should show presence status of contacts. presence.teams.microsoft.com should get an IP from Pihole

Actual Behaviour:

presence.teams.microsoft.com gets a SERVFAIL

Debug Token:

https://tricorder.pi-hole.net/c8k3vnf3ld

An often-used phrase also applies here: "Yesterday everything was working fine and I did not change anything"

Background: Pihole is advertised as the DNS server by my fritzbox 7590. The fritzbox is my upstream DNS in Pihole, and Cloudflare is the upstream DNS in the fritzbox. I don't actually know what SERVFAIL means or whether it is actually causing the missing presence status. But if I manually change the DNS on my laptop to my fritzbox, the status works fine. I thought DoT in my fritzbox could be an issue, so I deactivated it, but that did not help either. Also, Teams presence status works just fine on my phone, which also goes through Pihole.

I can't provide a solution to your problem, but I've noticed a few issues in your debug log

  • You're not running the latest version of Pi-hole
  • Your Pi seems to have network connectivity issues
[i] Default IPv4 gateway: 192.168.178.1
192.168.178.1
   * Pinging 192.168.178.1
192.168.178.1...
[✗] Gateway did not respond.


 Jan 29 02:59:02 dnsmasq[29690]: failed to send packet: Network is unreachable
   Jan 29 02:59:40 dnsmasq[29690]: failed to send packet: Network is unreachable
   Jan 29 02:59:40 dnsmasq[29690]: failed to send packet: Network is unreachable
   Jan 29 03:01:37 dnsmasq[29690]: failed to send packet: Network is unreachable
   Jan 29 03:01:37 dnsmasq[29690]: failed to send packet: Network is unreachable
   Jan 29 03:01:51 dnsmasq[29690]: failed to send packet: Network is unreachable
   Jan 29 03:01:51 dnsmasq[29690]: failed to send packet: Network is unreachable
   Jan 29 03:02:12 dnsmasq[29690]: failed to send packet: Network is unreachable
   Jan 29 03:03:22 dnsmasq[29690]: failed to send packet: Network is unreachable
   Jan 29 03:03:22 dnsmasq[29690]: failed to send packet: Network is unreachable
   Jan 29 03:03:34 dnsmasq[29690]: failed to send packet: Network is unreachable
   Jan 29 03:04:40 dnsmasq[29690]: failed to send packet: Network is unreachable
  • The Fritz!Box is your only upstream DNS server, so there is no need to configure Conditional Forwarding
    PIHOLE_DNS_1=192.168.178.1
    PIHOLE_DNS_2=fd00::23
    REV_SERVER=true
    REV_SERVER_CIDR=192.168.178.0/24
    REV_SERVER_TARGET=192.168.178.1
    REV_SERVER_DOMAIN=fritz.box
  • Did you manually trigger this shutdown? SQL got a bit confused.
   [2021-01-29 08:37:19.046 29690M] Resizing "FTL-dns-cache" from 253952 to (16128 * 16) == 258048 (/dev/shm: 6.9MB used, 484.9MB total)
   [2021-01-29 08:45:00.479 29690M] Reloading DNS cache
   [2021-01-29 08:45:00.485 29690M] Blocking status is disabled
   [2021-01-29 08:45:01.354 29690/T29694] SQLite3 message: statement aborts at 1: [END TRANSACTION] cannot commit - no transaction is active (1)
   [2021-01-29 08:45:01.354 29690/T29694] ERROR: SQL query "END TRANSACTION" failed: SQL logic error
   [2021-01-29 08:45:01.354 29690/T29694] ERROR: Storing devices in network table failed: SQL logic error
   [2021-01-29 08:45:01.419 29690/T29694] Compiled 0 whitelist and 0 blacklist regex filters for 70 clients in 1.0 msec
   [2021-01-29 08:46:41.509 29690M] Shutting down...
   [2021-01-29 08:46:41.822 29690M] Finished final database update

Thank you so much, yubiuser, for looking into my issue. For no apparent reason (except me switching settings back and forth and updating ... :nerd_face:) it started working again after my lunch break. presence.teams.microsoft.com gets served an IP and I can see presence status again. I would still be curious what might have caused it, but I do understand that it might not be possible to find out. My only hint pointing at Pihole was the fact that it worked once I took Pihole out of the equation by setting the DNS manually to the router.

I updated Pi-hole.

I saw that and first thought my router/gateway would not be responding to pings in general. Turns out it actually does. I tested it from my Windows laptop and from the very same Raspberry Pi that runs Pihole. I got responses on IPv4 and IPv6 on both machines. But a second debug log shows the very same connectivity issue. I have no explanation for that. But since everything is working, I'll just ignore that?

That is very helpful. I just set up my Pihole recently and I was going back and forth on this one, since I was not sure if it makes any difference. The logs sometimes don't show the hostname or have some duplicate hostname entries, and I was hoping this could fix something. (I know the usual recommendation is to use Pihole as the DHCP server, and I will try that when the time is right :wink:)

I just used the buttons in the web interface. Not sure if this might have caused it:

Thanks again so much for your help!!!

This may be due to an old FTL version. This very much looks like the bug in dnsmasq v2.83 which was fixed in v2.84 (and FTL v5.6).

It means that one server on the path to the final answer (Pi-hole -> router -> Cloudflare -> possibly more) failed to produce an answer. Does it happen more often, or did it only happen once?

If it happened only once, I would assume a misconfiguration on Microsoft's or Cloudflare's end of things, which they have fixed in the meantime.

If it happens again, I suggest skipping the Pi-hole and asking the upstreams directly to see who is causing the SERVFAIL. Do something like

dig presence.teams.microsoft.com @fritz.box

to check the fritzbox and similarly to check other upstream servers.
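
The same comparison can be scripted when dig is not at hand. The following is only a minimal sketch in Python, using nothing but the standard library: it sends one plain UDP query for an A record to each server and prints the response code from the reply header. The function names are made up for illustration, and 192.168.178.2 as the Pi-hole address is an assumption (the debug log only shows the Fritz!Box at 192.168.178.1).

```python
import socket
import struct

# Names for the most common DNS response codes (RFC 1035).
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
          3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record with recursion desired."""
    # Header: ID, flags (RD bit set), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def rcode_of(response: bytes) -> str:
    """Return the response code name stored in the low 4 bits of the flags."""
    if len(response) < 12:
        return "malformed reply"
    (flags,) = struct.unpack("!H", response[2:4])
    code = flags & 0x000F
    return RCODES.get(code, f"RCODE{code}")


def ask(server: str, name: str, timeout: float = 3.0) -> str:
    """Send one UDP query to an IPv4 server and report its response code."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, 53))
            response, _ = sock.recvfrom(4096)
        except OSError as exc:  # timeouts and "Network is unreachable" land here
            return f"no answer ({exc})"
    return rcode_of(response)


def compare(servers, name: str) -> None:
    """Print the response code each server returns for the same name."""
    for server in servers:
        print(f"{server:<16} {ask(server, name)}")
```

Calling `compare(["192.168.178.2", "192.168.178.1", "1.1.1.1"], "presence.teams.microsoft.com")` would then ask the (assumed) Pi-hole, the Fritz!Box and Cloudflare in turn. The first server in the chain that answers SERVFAIL while the ones behind it answer NOERROR is the one to look at more closely.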

SERVFAIL is somewhat hard to troubleshoot.
(I've recently explained that in another post in German).

Luckily, more often than not, these failures are only temporary, some of them disappearing as early as the very next repeated DNS request. It is not uncommon for Pi-hole to encounter a SERVFAIL every once in a while without you even noticing it.

SERVFAIL is just a generic error message indicating a server-side resolution failure. Until recently, there was no indication as to the reason for the failure or which server in the chain of DNS servers caused it.

RFC 8914 is the most recent attempt at changing that by supplying Extended DNS Error codes.
As that standard is comparatively new (dating from Oct 2020), implementation support is still lacking. Currently, I am only aware of Cloudflare DNS returning those codes.

In case you encounter another persistent SERVFAIL, you could try a dig for the failed domain against Cloudflare's DNS, e.g.

dig @1.1.1.1 presence.teams.microsoft.com

Look for OPT return codes similar to this (taken from a sample query, not from the above dig):

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; OPT=15: 00 06 ("..")

OPT=15 indicates an EDE (Extended DNS Error), followed by the error code, 06 for DNSSEC Bogus in this case.
Newer versions of dig may supply textual output straight away.
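
For older dig versions that only print the raw bytes, the decoding can be done by hand. The following is a small Python sketch, not part of dig or Pi-hole: `decode_ede_line` is a made-up helper, it assumes the line looks like the sample above (hex bytes after `OPT=15:`), and the table only lists a subset of the info-codes registered by RFC 8914.

```python
import re

# Subset of the Extended DNS Error info-codes defined in RFC 8914.
EDE_CODES = {
    0: "Other",
    1: "Unsupported DNSKEY Algorithm",
    2: "Unsupported DS Digest Type",
    3: "Stale Answer",
    4: "Forged Answer",
    5: "DNSSEC Indeterminate",
    6: "DNSSEC Bogus",
    7: "Signature Expired",
    8: "Signature Not Yet Valid",
    9: "DNSKEY Missing",
    10: "RRSIGs Missing",
    22: "No Reachable Authority",
    23: "Network Error",
}

# Matches the space-separated hex bytes that follow "OPT=15:".
_EDE_LINE = re.compile(r"OPT=15:\s*((?:[0-9a-fA-F]{2}\s+)*[0-9a-fA-F]{2})")


def decode_ede_line(line: str):
    """Decode an 'OPT=15' line into (info_code, meaning, extra_text).

    Returns None if the line carries no Extended DNS Error option.
    """
    match = _EDE_LINE.search(line)
    if match is None:
        return None
    payload = bytes.fromhex(match.group(1))
    if len(payload) < 2:
        return None
    # The first two bytes are the info-code, the rest is optional UTF-8 text.
    info_code = int.from_bytes(payload[:2], "big")
    extra_text = payload[2:].decode("utf-8", errors="replace")
    return info_code, EDE_CODES.get(info_code, "unknown"), extra_text
```

For the sample line, `decode_ede_line('; OPT=15: 00 06 ("..")')` yields info-code 6 with the meaning "DNSSEC Bogus" and no extra text.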

Note that Cloudflare still may show no errors while your actual DNS lookup fails, or vice versa.
So the diagnostic value of such a query is not very high.
Still, it may reveal systematic failures of the same DNS server present in both resolution chains.

Though that would potentially give you a reason for the failure, it would be up to the DNS server maintainers to address that.
There is nothing you can do about the root cause, save for changing your upstream, maybe.

Thanks, Coro and Bucking_Horn, for the explanation. I just think that as long as the problems are fixed as soon as I take Pi-hole out of the equation, it is pretty likely my setup with Pi-hole and not the upstream DNS. I had to take Pi-hole out as the advertised DNS server on my LAN because my family started complaining about the outages. I'm back to testing with my own devices only, with Pi-hole as a manually set DNS server. So far without any problems. But given that the problems stopped immediately after this change, I am wondering if something is wrong with how Pi-hole is advertised as the DNS server. I have a few different theories about that:

  1. I am using a second fritz.box router as a WLAN extension in what the manufacturer calls "WLAN Mesh". I realised that, at least yesterday, I had some issues (SERVFAIL for DuckDuckGo.com and others) as soon as I was connected to that second router. This router did not have IPv6 enabled, which I have now changed. But I am not sure this has anything to do with it at all. Maybe it simply does not advertise the DNS server properly? On the other hand, this really cannot have been the case when I had the MS Teams issues, because I was in the basement and the second router is all the way up on the second floor. So I was definitely connected to my primary router.

  2. DNSSEC. I keep going back and forth with this setting in Pi-hole, mainly because I sometimes have connection issues with a dynamic DNS address that I use to reach my self-hosted Nextcloud. Mostly, when these issues emerge and I switch DNSSEC on, I can reach Nextcloud again. I then switch it off again when I have other issues. In general, I am not sure if Pi-hole should be handling DNSSEC or if my clients do that anyway. (I also wrote about it here in German)

In general, right now there are just too many moving parts, and it's hard for me to figure it out because some issues only emerge a few hours after changing a setting. I can't restart all devices every time I change something (I do flush the DNS cache on my devices, though). Any further help is much appreciated, since I really want to get this to work. I am also enjoying this, since I consider it a hobby and I learn new stuff. My family, though, would rather accept the ads than the internet outages. So I can only switch back to Pi-hole being the main DNS server on the LAN once I have figured it out.

Best to get a second Pi and put Pi-hole on that and assign only your devices to this one. Then do your experiments with this one and leave the family Pi-hole running and stable.

Hey jfb, that would be a great idea if I had a stable Pi-hole instance, which I don't. Otherwise I would not be tinkering around with this one.
