March 6th, 2015

Position privacy protection

Mozilla maintains access points (AP) database at location.services.mozilla.com. Location using WIFI is cool: you don't need GPS hardware, and you can fix quicker/for less battery power in cities, and you can get a fix indoors.
Mozilla will return your position if you know SSIDs of two nearby access points, using web service. That has disadvantages: you need working internet connection, connection costs you time, power and money, and Mozilla now knows where you are.
Obvious solution would be to publish AP database, but that has downside: if you visit Anicka once and learn SSID of her favourite access point, you could locate Anicka with simple database query once she moves.

Solution:
first = select N numerically lower (or most commonly seen) access points in the area
second = all access points in the area
for i in first:
      for j in second:
              at position sha1(i, j, salt?) in the database, store GPS coordinates.
If probability of missing access point when you are in the right area is P, probability of not being able to tell your location is P^N. Database will grow approximately N times.
Storing salt: it will make it harder to see differences between different version (good). But if we play some tricks with hash-size to artificaly introduce collisions, this may make them ineffective.
Problem: There is only 2^48 access points. Someone could brute force hash. Solution: store fewer bits of hash to create collisions?
Problem: If you can guess Anicka anicka likes South Pole, and suddenly new access point appears in the area, you can guess her address. Comment: not a problem since Anicka would have to take two access points to the South Pole? Or still a problem since you don't need to know address of the second AP to locate her?
Problem: If you know Anicka likes Mesto u Chciplyho psa, where noone ever moves and noone ever activates/deactivats APs, you can still locate her. Comment: is it a problem? Are there such places?

Any ideas? Does it work, or did I make a mistake somewhere? Is there solution with lower overhead?