Request: Ability to add clients to known client list from Network page

Looks great! Thanks.

EDIT: Is there a mechanism to flush the stats such that old devices would be removed? Or is the source of this from the long term reporting data? If so, is there a way to reset/flush long term data? Thanks again!

You can flush the entire network table (Settings -> Danger Zone -> Flush network table).

I'm starting to see duplicates again.

Looking in pihole-FTL.log I see that there have been two lookups (since restarting FTL at 20:50) where the device was not in ARP (21:00 and 22:00) before it was found in ARP at 23:00.
The log indicates that the database queries happen on the full hour, but wasn't the network table supposed to be updated every minute? It seems the device made a few queries between the full hours and was removed from the ARP cache soon afterwards, so it wasn't present when the database queries were performed.

[2020-03-15 21:00:00.302 941] Network table: 10.0.1.64 NOT known through ARP/neigh cache
[2020-03-15 21:00:00.302 941] dbquery: "SELECT network_id FROM network_addresses WHERE ip = '10.0.1.64' AND lastSeen > (cast(strftime('%s', 'now') as int)-86400) ORDER BY lastSeen DESC;"
[2020-03-15 21:00:00.302 941] dbquery: "SELECT id FROM network WHERE hwaddr = 'ip-10.0.1.64';"
[2020-03-15 21:00:00.302 941] dbquery: "INSERT INTO network (hwaddr,interface,firstSeen,lastQuery,numQueries,name,macVendor) VALUES ('ip-10.0.1.64','N/A',1584302400, 1584302333, 8, 'ipad', '');"
[2020-03-15 21:00:00.302 941] dbquery: "INSERT OR REPLACE INTO network_addresses (network_id,ip,lastSeen) VALUES(9,'10.0.1.64',(cast(strftime('%s', 'now') as int)));"

[2020-03-15 22:00:00.156 941] Network table: 10.0.1.64 NOT known through ARP/neigh cache
[2020-03-15 22:00:00.156 941] dbquery: "SELECT network_id FROM network_addresses WHERE ip = '10.0.1.64' AND lastSeen > (cast(strftime('%s', 'now') as int)-86400) ORDER BY lastSeen DESC;"
[2020-03-15 22:00:00.157 941] APR: Identified device 10.0.1.64 using most recently used IP address
[2020-03-15 22:00:00.157 941] dbquery: "UPDATE network SET lastQuery = MAX(lastQuery, 1584305601) WHERE id = 9;"
[2020-03-15 22:00:00.157 941] dbquery: "UPDATE network SET numQueries = numQueries + 72 WHERE id = 9;"
[2020-03-15 22:00:00.157 941] dbquery: "UPDATE network SET name = 'ipad' WHERE id = 9;"

[2020-03-15 23:00:00.135 941] dbquery: "SELECT id FROM network WHERE hwaddr = '14:20:5e:db:00:16';"
[2020-03-15 23:00:00.153 941] dbquery: "INSERT INTO network (hwaddr,interface,firstSeen,lastQuery,numQueries,name,macVendor) VALUES ('14:20:5e:db:00:16','eth0',1584309600, 1584308757, 163, 'ipad', 'Apple, Inc.');"
[2020-03-15 23:00:00.153 941] dbquery: "INSERT OR REPLACE INTO network_addresses (network_id,ip,lastSeen) VALUES(12,'10.0.1.64',(cast(strftime('%s', 'now') as int)));"

[2020-03-15 23:00:00.157 941] Network table: Client 10.0.1.64 known through ARP/neigh cache
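
The queries in the log above suggest a fallback for clients that are missing from the ARP/neigh cache: look for a device that used this IP within the last 24 hours, and otherwise create (or reuse) a mock device keyed `ip-<address>`. Here is a minimal Python sketch of that flow. The two-table schema is cut down to the columns needed here, and `open_db` and `find_or_create_device` are hypothetical names for illustration, not FTL's actual code.

```python
import sqlite3

DAY = 86400  # seconds; same window as the "-86400" in the logged query


def open_db():
    """Create an in-memory database with a simplified (assumed) schema."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE network (id INTEGER PRIMARY KEY, hwaddr TEXT UNIQUE)")
    db.execute(
        "CREATE TABLE network_addresses ("
        "network_id INTEGER, ip TEXT, lastSeen INTEGER, "
        "UNIQUE(network_id, ip))"
    )
    return db


def find_or_create_device(db, ip, now):
    """Return (network_id, created) for a client that has no ARP entry."""
    # Step 1: was this IP seen for some device during the last 24 hours?
    row = db.execute(
        "SELECT network_id FROM network_addresses "
        "WHERE ip = ? AND lastSeen > ? ORDER BY lastSeen DESC",
        (ip, now - DAY),
    ).fetchone()
    if row is not None:
        return row[0], False

    # Step 2: fall back to a mock hardware address derived from the IP.
    mock_hwaddr = f"ip-{ip}"
    row = db.execute(
        "SELECT id FROM network WHERE hwaddr = ?", (mock_hwaddr,)
    ).fetchone()
    created = row is None
    if created:
        network_id = db.execute(
            "INSERT INTO network (hwaddr) VALUES (?)", (mock_hwaddr,)
        ).lastrowid
    else:
        network_id = row[0]

    # Remember that this IP belongs to the (mock) device.
    db.execute(
        "INSERT OR REPLACE INTO network_addresses (network_id, ip, lastSeen) "
        "VALUES (?, ?, ?)",
        (network_id, ip, now),
    )
    return network_id, created
```

In this sketch nothing links the mock device to the real MAC once it shows up later, which would be consistent with the second entry inserted at 23:00 in the log.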

Side note: during FTL's restart I see double queries for each device/IP. Is this supposed to be like this?

[2020-03-15 20:50:49.304 941] Querying count of distinct domains in gravity database table vw_regex_whitelist
[2020-03-15 20:50:49.305 941] Querying gravity database for client 127.0.0.1
[2020-03-15 20:50:49.306 941] Querying gravity database for client 127.0.0.1
[2020-03-15 20:50:49.307 941] Querying gravity database for client 10.0.1.190
[2020-03-15 20:50:49.307 941] Querying gravity database for client 10.0.1.190
[2020-03-15 20:50:49.308 941] Querying gravity database for client 10.0.30.254
[2020-03-15 20:50:49.308 941] Querying gravity database for client 10.0.30.254
[2020-03-15 20:50:49.308 941] Querying gravity database for client 10.0.1.1
[2020-03-15 20:50:49.309 941] Querying gravity database for client 10.0.1.1
[2020-03-15 20:50:49.309 941] Querying gravity database for client 10.0.1.2
[2020-03-15 20:50:49.310 941] Querying gravity database for client 10.0.1.2
[2020-03-15 20:50:49.310 941] Querying gravity database for client 10.0.1.136
[2020-03-15 20:50:49.311 941] Querying gravity database for client 10.0.1.136
[2020-03-15 20:50:49.311 941] Querying gravity database for client 10.0.20.107
[2020-03-15 20:50:49.312 941] Querying gravity database for client 10.0.20.107
[2020-03-15 20:50:49.312 941] Querying gravity database for client 10.0.1.64
[2020-03-15 20:50:49.313 941] Querying gravity database for client 10.0.1.64
[2020-03-15 20:50:49.313 941] Querying gravity database for client 10.0.1.5
[2020-03-15 20:50:49.314 941] Querying gravity database for client 10.0.1.5
[2020-03-15 20:50:49.314 941] Querying gravity database for client 10.0.40.3
[2020-03-15 20:50:49.315 941] Querying gravity database for client 10.0.40.3
[2020-03-15 20:50:49.315 941] Querying gravity database for client 10.0.1.183
[2020-03-15 20:50:49.316 941] Querying gravity database for client 10.0.1.183
[2020-03-15 20:50:49.316 941] Compiled 1 whitelist and 23 blacklist regex filters in 16.2 msec

edit

Wasn't there a check that only devices with queries are added?

Oh geez, I was looking at that page and didn't even see it. Thanks! :slight_smile:

After thinking about what I wrote, I came to a conclusion: I've set DBINTERVAL=60.0, which is why the ARP lookup is also only executed once an hour! Would it make sense to decouple the ARP lookups from FTL's storing of queries in the database?

edit
I was too fast: I'm seeing double entries again, even after removing DBINTERVAL=60.0. Always with zero queries...

I don't think so. If the user wants the database to only be updated once an hour, we should respect this.

Another small change should catch those as well now, please update/re-checkout.

Yes, the regex filters are loaded twice for each client: once for the whitelist and once for the blacklist. As the functions are intentionally generic, the code that logs this message is not aware of why it was called, so it cannot log it accordingly. If you add more debug flags, you see more detail, like:

[2020-03-16 19:58:34.974 4257] Querying gravity database for client 127.0.0.1
[2020-03-16 19:58:34.974 4257] Regex blacklist: Querying groups for client 127.0.0.1: "SELECT id from vw_regex_blacklist WHERE group_id IN (0);"
[2020-03-16 19:58:34.974 4257] Regex blacklist: Enabling regex with DB ID 83 for client 127.0.0.1
[2020-03-16 19:58:34.974 4257] Querying gravity database for client 127.0.0.1
[2020-03-16 19:58:34.974 4257] Regex whitelist: Querying groups for client 127.0.0.1: "SELECT id from vw_regex_whitelist WHERE group_id IN (0);"

Mock-devices with zero queries are gone now, but after restarting and flushing the network table I see a real device (the gateway) with zero queries. Maybe add the filter for zero queries to devices from ARP's cache as well?

I'll try to put what I really meant another way: at the moment, the update of the network table (and with it the ARP lookup) is tied to DBINTERVAL. I set it to 60 to extend the life of my SD card. But if a device is only present in the ARP cache at some point within those 60 minutes and makes some queries, the code currently has no way to assign the queries to that device, as the ARP cache has no entry for it when the lookup is done. What I would expect is the following: as FTL's queries are kept in memory and saved to the database at every DBINTERVAL, I would expect the same behavior for the ARP lookup/network table. Do an ARP lookup every minute (to assign the queries to a specific device), keep the result in memory, but save it to the database only at DBINTERVAL.

To simulate the behavior: set DBINTERVAL to 5 minutes, restart FTL, boot up a device, generate some queries, check ARP for presence of the device's MAC, (wait 1-2 minutes), manually delete the arp entry, see a mock-MAC for that device appear in network table at the next refresh.
The problem is: information that was there at the time the queries were made (plus the simulated ARP timeout) may not be present anymore when you check the ARP table with long DBINTERVAL settings.
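
To make the proposal concrete, here is a small Python sketch of the idea: take a lookup every minute, remember the last MAC seen per IP in memory, and only hand the result over when the database is written. It illustrates the suggestion only and is not how FTL behaves; the class `NeighMemory` and its methods are hypothetical names.

```python
class NeighMemory:
    """Remember the last MAC seen for each IP between two database writes."""

    def __init__(self):
        self._last_mac = {}  # ip -> mac

    def refresh(self, neigh_snapshot):
        """Record one lookup; neigh_snapshot maps ip -> mac (None if incomplete)."""
        for ip, mac in neigh_snapshot.items():
            if mac is not None:
                self._last_mac[ip] = mac

    def resolve(self, ip):
        """Return the remembered MAC for ip, or None if it was never seen."""
        return self._last_mac.get(ip)

    def flush(self):
        """Hand over everything remembered so far and start a new interval."""
        remembered = dict(self._last_mac)
        self._last_mac.clear()
        return remembered
```

With this, a device that was in the cache at minute 1 and gone at minute 2 can still be resolved at the end of the interval:

```python
memory = NeighMemory()
memory.refresh({"10.0.1.64": "14:20:5e:db:00:16"})  # minute 1: entry present
memory.refresh({})                                   # minute 2: entry expired
print(memory.resolve("10.0.1.64"))
```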

No, they are meaningful. Devices that DO NOT use the Pi-hole would show 0 queries as well. We should not try to hide those devices for which this table was implemented in the first place! :slight_smile:

This is the problem. This is an incorrect assumption. The database is a static file, it will only ever grow. The SD card life will not be affected by the number of writes onto it. And even if it were, reducing the number of I/O operations from 60 to one per hour for Pi-hole FTL would have no impact at all given the billion I/O operations the operating system itself does in said hour.

I got what you meant already before, however, I'm not ready to do

as it would be a lot of work and a lot that needs to get tested with only very very few users expected to use this at all. And even these users are likely doing something they shouldn't because they try to solve an issue they have (close to) no influence at all on.

:see_no_evil:

Thanks for the explanation.

Got that :slight_smile:
I'm fine with setting DBINTERVAL back to the default and will keep testing for network table issues. I just wanted to make clear for myself (and maybe others reading this thread one day) why mock-devices for real MACs might appear, and that the culprit is DBINTERVAL and not anything else. But you're right: very, very few users will ever see this happen.

Thank you for implementing this feature at all!

One last bug that might affect a lot of normal users as well:

My iPad (in standby) seems to send requests and immediately go to sleep afterwards. It's so fast that even one minute is too slow to populate ARP's cache. I can see requests in the query log and a mock device in the network table. When I really use the iPad, this always creates a second entry.

During standby:

nanopi@nanopi:~$ sudo arp
Address                  HWtype  HWaddress           Flags Mask            Iface
IPad                             (incomplete)                              eth0

After real usage

nanopi@nanopi:~$ sudo arp
Address                  HWtype  HWaddress           Flags Mask            Iface
IPad                     ether   14:20:5e:db:00:16   C                     eth0

Can you suppress the creation of mock-devices when ARP's HWaddress is (incomplete) ?

@yubiuser I was seeing this too but after the last update and flushing the network table, it hasn't returned in the past day or two.

On a side note, I'm wondering which script file I can tweak to change the default number of lines displayed from 10 to 100 on the Custom DNS page. I've found all the other pages with drop down "10, 25, ..." code and created a "sed" replacement line to batch change those after each update. But cannot find where the Custom DNS page specifies the contents of the drop down box. It's not in the scripts folder like the others. Thanks!

Yeah, it must be somewhere else, because "All" is missing too.

I'm on the latest version and it still happened today after flushing the network table. Maybe my network is strange...

Hmm, I'll report if I see it again. Have similar hosts (dhcp assigned ipads, etc) and router is OPNsense if it matters. I do not include these devices in pihole's Custom DNS.

I do but it shouldn't matter

No, not really. When we don't add a client during the ARP cache run (because we don't have a MAC address), we do it in the second run where mock hardware addresses are added. Mind, if we decide to skip known clients in both rounds, there is a certain chance that they will never get added. Don't misunderstand me, I'm not wanting to bring this discussion to an end, we just have to be careful in the design of what we do. I don't think excluding some special clients will be a good thing and may trigger more support requests in the future (which will be difficult to assess).

I see it the other way: if you leave it as it is there will be mock-devices for real devices and people will send support requests why they have duplicates.

I share your general concern that

but I think the likelihood is very small compared to creating false positive mock-devices (as it would mean a device always has an incomplete ARP entry). In my opinion it's wrong to add a device as a mock-device (= device on a different subnet) if an ARP entry, even an incomplete one, exists at all (= device on the same subnet). This means that you put a device from the local subnet that for some reason has an incomplete ARP entry (Wi-Fi broke down the second the DNS request was made, device was turned off or went to sleep, faulty ARP) in the wrong "category" by design. I would prefer to ignore them.

How is the situation with pihole v4.0? I guess devices with incomplete ARP entries get ignored as well - so there would be no additional harm.

Why not? I haven't looked through the code (it's on my to-do list), but my naive idea is something like: take all IPs with queries since the last network table update from FTL, look in ARP's cache whether the IP and a HWaddress exist (attribute the data to the corresponding MAC), do nothing if just the IP but no HWaddress exists (incomplete), and create/update mock-devices for all IPs without an ARP entry.
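
A rough Python sketch of this naive idea, under the assumption that the ARP cache is available as a mapping from IP to hardware address, with `None` standing for an incomplete entry. The function name `classify_clients` is made up for illustration; the `ip-<address>` form of the mock hardware address follows the log excerpt earlier in the thread.

```python
def classify_clients(query_ips, arp_cache):
    """Sort IPs that made queries into three groups.

    arp_cache maps ip -> MAC string, or None for an incomplete entry.
    Returns (attributed, ignored, mock).
    """
    attributed = {}  # ip -> real MAC found in the ARP cache
    ignored = []     # ARP entry exists but is incomplete: do nothing
    mock = {}        # no ARP entry at all: ip -> mock hardware address

    for ip in query_ips:
        if ip not in arp_cache:
            mock[ip] = f"ip-{ip}"
        elif arp_cache[ip] is None:
            ignored.append(ip)
        else:
            attributed[ip] = arp_cache[ip]
    return attributed, ignored, mock
```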

This convinced me :wink:

I'm currently in a hotel with (apparently) AP-side client isolation so it is difficult for me to obtain an incomplete ARP record in my table. What would be even more interesting than the output of arp would be the output of ip neigh (getting the data directly from the kernel memory). What does it show for an incomplete entry?

Well, the procedure is slightly different. What you suggest would only add devices known to FTL. However, the initial idea of the network table was to use the ARP cache knowledge to also discover those devices that do NOT use FTL. As FTL would have no clue about them, they would never get added to the table with your proposed strategy. The implementation in FTL works differently:

  1. Parse neigh cache, add all devices from here
  2. Iterate through all clients known to FTL and add mock-devices if not already handled in 1

I will add a temporary buffer holding information why a client should not get handled by no. 2 (either already added by ARP or skipped intentionally due to incomplete status). Even more extra code but it is worth it.
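
As an illustration of the two steps plus the temporary buffer, here is a simplified Python sketch. It is not FTL's code: the data structures and the name `build_network_table` are assumptions, and an incomplete neigh entry is modelled as `None`.

```python
def build_network_table(neigh_cache, ftl_clients):
    """Two-pass construction of a network table.

    neigh_cache maps ip -> MAC string, or None for an incomplete entry.
    ftl_clients is an iterable of client IPs known to FTL.
    Returns (table, handled): table maps hwaddr -> list of IPs, and
    handled records why an IP must not be touched by the second pass.
    """
    table = {}
    handled = {}  # the temporary buffer: ip -> "arp" or "incomplete"

    # Pass 1: add every device found in the neigh cache, even those
    # that never sent a query.
    for ip, mac in neigh_cache.items():
        if mac is None:
            handled[ip] = "incomplete"  # skipped intentionally
            continue
        table.setdefault(mac, []).append(ip)
        handled[ip] = "arp"

    # Pass 2: add mock-devices for FTL clients not handled in pass 1.
    for ip in ftl_clients:
        if ip in handled:
            continue
        table[f"ip-{ip}"] = [ip]

    return table, handled
```

A client with an incomplete entry ends up in neither pass, so no mock-device is created for it, while a device that only appears in the neigh cache is still listed.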

Mo 23. Mär 10:09:33 CET 2020
10.0.1.190 dev eth0 lladdr d4:38:9c:01:ac:6c STALE
10.0.1.1 dev eth0 lladdr f0:9f:c2:1e:8f:e9 REACHABLE
0.0.0.0 dev lo lladdr 00:00:00:00:00:00 NOARP
10.0.1.64 dev eth0  FAILED
10.0.1.136 dev eth0 lladdr 3c:97:0e:13:36:c0 REACHABLE
ff02::1:ff42:dae4 dev eth0 lladdr 33:33:ff:42:da:e4 NOARP
ff02::2 dev eth0 lladdr 33:33:00:00:00:02 NOARP
ff02::16 dev eth0 lladdr 33:33:00:00:00:16 NOARP
::1 dev lo lladdr 00:00:00:00:00:00 NOARP

^C
nanopi@nanopi:~$ sudo arp
Address                  HWtype  HWaddress           Flags Mask            Iface
Sony-XZ1-Compact         ether   d4:38:9c:01:ac:6c   C                     eth0
usg                      ether   f0:9f:c2:1e:8f:e9   C                     eth0
IPad                             (incomplete)                              eth0
Thinkpad-LAN             ether   3c:97:0e:13:36:c0   C                     eth0

Thank you for your effort.

Okay, so this will be an issue. FTL actually only sees what you are seeing in ip neigh, not arp.
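
For reference, the `ip neigh` lines posted above can be split into fields with a few lines of Python. This is only a sketch for inspecting such output (the helper name `parse_ip_neigh` is made up); it treats the last token as the state and reports a missing `lladdr` as `None`, which is how the `FAILED` line for 10.0.1.64 appears here.

```python
def parse_ip_neigh(output):
    """Parse `ip neigh` text output into a list of dicts."""
    entries = []
    for line in output.splitlines():
        tokens = line.split()
        if len(tokens) < 2:
            continue  # skip blank or stray lines
        entry = {"ip": tokens[0], "dev": None, "lladdr": None, "state": tokens[-1]}
        for key in ("dev", "lladdr"):
            if key in tokens:
                idx = tokens.index(key)
                if idx + 1 < len(tokens):
                    entry[key] = tokens[idx + 1]
        entries.append(entry)
    return entries


sample = """10.0.1.190 dev eth0 lladdr d4:38:9c:01:ac:6c STALE
10.0.1.64 dev eth0  FAILED
"""
for entry in parse_ip_neigh(sample):
    print(entry["ip"], entry["lladdr"], entry["state"])
```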

BTW: You should not have had to hit Ctrl+C, was ip neigh not reacting? If so, please try again and see if it will just go into some timeout at some point. Maybe the ipad record will only show up at the end?

It was a while sleep loop that requested ip neigh every minute - I terminated it because we found one arp incomplete.
