OK, I think I finally struck gold. Posting this for anyone who happens to land on this thread.
TL;DR:
Add this to /etc/dnsmasq.d/03-custom.conf
# Fix for clients that misbehave if no WPAD option specified
dhcp-option=252,"\n"
Detailed findings for those interested below...
I inspected the shared memory and locking code and nothing jumped out at me. Having so many functions take the same mutex is scary, but it's shared memory, so I understand why it's needed.
I got both the PiZ and Pi3B into the hurt setup and watched the debug timings for all the locks. That confirmed there were no deadlocks and no excessive waits for the lock in any function.
So here it is... It's a DHCP client issue and pihole is not to blame. By extension, dnsmasq isn't technically to blame either, because it is just doing what the clients request.
Some DHCP clients, most notably Android (read: cellphones, Amazon Fire OS, anything based on Android is suspect), immediately send another DHCPINFORM/DHCPREQUEST if there is no Web Proxy Auto-Discovery (WPAD) option in the DHCPACK response.
This can be observed in pihole.log: the client sends DHCPREQUEST over and over again until either a) the client finally accepts reality and stops the loop, or b) the client-side timeout expires and it gives up.
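If you want to check whether your own clients are stuck in this loop, here is a rough Python sketch that counts repeated DHCPREQUEST/DHCPINFORM lines per MAC address. The log line shape in the comment is my assumption of what dnsmasq-dhcp entries typically look like, and `count_requests`, `noisy_clients` and the threshold are made-up names and values for illustration, so adjust the regex to match your actual log.

```python
import re
from collections import Counter

# Assumed log line shape (not copied from a real log in this thread):
#   "Mar  1 12:00:01 dnsmasq-dhcp[1234]: DHCPREQUEST(wlan0) 192.168.1.50 aa:bb:cc:dd:ee:ff"
DHCP_LINE = re.compile(
    r"dnsmasq-dhcp\[\d+\]: (DHCPREQUEST|DHCPINFORM)\(\S+\) \S+ ([0-9a-fA-F:]{17})"
)


def count_requests(lines):
    """Count DHCPREQUEST/DHCPINFORM lines per client MAC address."""
    counts = Counter()
    for line in lines:
        match = DHCP_LINE.search(line)
        if match:
            counts[match.group(2).lower()] += 1
    return counts


def noisy_clients(lines, threshold=10):
    """Return the MACs seen at least `threshold` times (hypothetical cutoff)."""
    return {mac: n for mac, n in count_requests(lines).items() if n >= threshold}


# Example use (the log path is an assumption and differs between installs):
#   with open("/var/log/pihole.log") as log:
#       print(noisy_clients(log))
```

A client that shows up dozens of times within a few seconds is a likely candidate for the loop described above.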
This has the knock-on effect of making dnsmasq spin hard, because it keeps executing the same code path for each client over and over again.
The fix above adds the WPAD option, 252, to the DHCPACK response with what is basically a no-op value. The misbehaving client is then satisfied that the option is present and no longer repeats the process over and over.
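For anyone wondering what that option looks like on the wire, here is a minimal Python sketch of the standard DHCP option layout (one byte for the code, one byte for the length, then the data). I'm assuming dnsmasq turns the quoted "\n" into a single newline byte; I haven't verified that against a packet capture, and `encode_dhcp_option` is just a helper name I made up.

```python
WPAD_OPTION = 252  # option code used for WPAD in the config line above


def encode_dhcp_option(code: int, value: bytes) -> bytes:
    """Encode one DHCP option as code byte + length byte + data."""
    if not 1 <= code <= 254:
        # 0 (pad) and 255 (end) are special and carry no length byte
        raise ValueError("option code must be in 1..254")
    if len(value) > 255:
        raise ValueError("option data must fit in a one-byte length")
    return bytes([code, len(value)]) + value


# Assumption: the configured value ends up as a single newline byte.
wpad = encode_dhcp_option(WPAD_OPTION, b"\n")
print(wpad.hex())
```

So the option is present in the reply, but the value is not a usable proxy URL, which is why it works as a no-op.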
With that config change, I was able to observe clients obtaining IPs in <3s from both the PiZ and Pi3B, even with the full ~820k domains on the blocklist.