Unique ID auto-generated across Unconnected Servers

Skara
Forum Regular
Posts: 703
Joined: Sat Mar 12, 2005 7:13 pm
Location: US

Unique ID auto-generated across Unconnected Servers

Post by Skara »

Information I have to work with:
- Location (must be included)
- first name
- last name
- phone
- address
- etc...

I need a way to generate a unique and short id out of the data above. What I'm doing now (as a stopgap) is the following.
Green Acres, John Doe, 555-1234
becomes
doe1234f42h890a
where f42h890a is the first 8 chars of base64_encode(location).
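The stopgap described above can be sketched roughly as follows. The function name `stopgapCode` and its argument order are assumptions for illustration, not the actual code in use, and real `base64_encode()` output will not look like the illustrative `f42h890a` string.

```php
<?php
// Hypothetical helper reproducing the stopgap scheme described above.
function stopgapCode(string $location, string $lastName, string $phone): string
{
    // Keep only the digits of the phone number, then take the last four.
    $digits = preg_replace('/\D/', '', $phone);
    $last4  = substr($digits, -4);

    // First 8 characters of the base64-encoded location. Truncating here
    // discards data, so the full location cannot be decoded from the code.
    $locPart = substr(base64_encode($location), 0, 8);

    return strtolower($lastName) . $last4 . $locPart;
}

echo stopgapCode('Green Acres', 'Doe', '555-1234'), "\n";
```

Two people at the same location who share a last name and the last four phone digits get the same code under this scheme, which is the collision risk raised in the details.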

Details:

Problems, of course, include the fact that substr()ing the base64 location takes out data. The point is to make this short. People (generic people, not employees) will have to enter this as a passcode. The length it is now is about as long as I want it to get.
Also, there's a chance that two location names will be similar enough that the first 8 characters of their encodings are the same (yes?). In addition, there's always a chance that the last name and last 4 digits of the phone number could be the same. Example:
John Ward, 555-1234
Frank Ward, 123-234-1234

I'm running multiple servers completely off of any network that have to come up with this code. Don't ask. There are currently 2 such systems, but there will be 6 before next year and (with luck) a few dozen the next. There are currently three locations, but (with luck) there will end up being dozens to hundreds.

I also have to include the location field in the code somehow because when they enter the code, the app there needs to know the location. There's no way I can keep up a database of which 3 digit code means which location. The application has to figure out the location from the passcode only by using an algorithm.

In brief:

Using the above info, generate a completely unique id containing the location data. This code has to be as short as possible.

Does ANYONE have any ideas...?

Thanks.
s.dot
Tranquility In Moderation
Posts: 5001
Joined: Sun Feb 06, 2005 7:18 pm
Location: Indiana

Re: Unique ID auto-generated across Unconnected Servers

Post by s.dot »

Well, if you can figure that out, data compression specialists will pay you millions for your algorithm. :))

The whole setup smells fishy. :P You CAN use a code to contain information (such as base64, like you mentioned), but to contain all of the data it's going to be longer than that. Way longer.

The best way is, yes, to use the database. Set up each user record to have an auto-incremented primary key. When this key is entered, have the program query the database for the info relevant to that key.

If you can't get to the data from other servers (no network or server access) then the only way would be to sync files or databases manually regularly between the servers.
Mordred
DevNet Resident
Posts: 1579
Joined: Sun Sep 03, 2006 5:19 am
Location: Sofia, Bulgaria

Re: Unique ID auto-generated across Unconnected Servers

Post by Mordred »

1. base64 makes the data longer, not shorter (4 base64-ed characters encode 3 data characters)
2. Do you need to
Skara wrote:generate a unique and short id out of the data above
or
Skara wrote:generate a completely unique id containing the location data
... because these are two quite different things. As scottayy already pointed out, the second one means compression to levels not easily achieved without profound knowledge of the data.

The first one is trivial:

```php
$code = substr(md5($data), 0, 8);
```
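Both points can be checked with a short sketch. The helper name `shortCode` and the pipe-separated input string are assumptions for illustration only. Note that 8 hex characters carry just 32 bits, so the result is short but not guaranteed unique, and because MD5 is one-way the location cannot be read back out of it.

```php
<?php
$location = 'Green Acres';

// base64 expands its input: every 3 bytes become 4 characters.
$encoded = base64_encode($location);
printf("%d bytes -> %d base64 characters\n", strlen($location), strlen($encoded));

// Hypothetical helper: a short fixed-length code from any input string.
function shortCode(string $data, int $length = 8): string
{
    return substr(md5($data), 0, $length);
}

// The way the fields are joined here is an assumption for illustration.
$code = shortCode('Green Acres|John|Doe|555-1234');
echo $code, "\n";
```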
Skara
Forum Regular
Posts: 703
Joined: Sat Mar 12, 2005 7:13 pm
Location: US

Re: Unique ID auto-generated across Unconnected Servers

Post by Skara »

Yes, the setup is fishy, because we're kind of inventing things as we go along here. And yeah, I know encoding tons of data short is impossible, but I can compare the code against existing entries to see if it matches.

In other words..
I encode "Green Acres" into 4a5b6c on server 1 but I only store 4a5b.
On home server, I encode each of the entries:

```
Red Acres   => 3a5b6c => 3a5b
Green Acres => 4a5b6c => 4a5b
Blue Acres  => 5a5b6c => 5a5b
```
and I see that 4a5b matches, so that's the entry I go with.
Problem is, what if "Green Fields" encodes into 4a5b7c? Then 4a5b will match and it gets set up with the wrong location.
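One way to catch that case ahead of time is to index every known location by its truncated tag on the home server and flag any tag shared by more than one location. This is only a sketch: it swaps in a truncated MD5 for the unspecified encoding above, and the names `locationTag`, `buildTagIndex`, and `ambiguousTags` are made up for illustration.

```php
<?php
// Hypothetical helper: short location tag derived from the location name.
function locationTag(string $location, int $length = 4): string
{
    return substr(md5($location), 0, $length);
}

// Build a tag => [location names] map from the list of known locations.
function buildTagIndex(array $locations, int $length = 4): array
{
    $index = [];
    foreach ($locations as $location) {
        $index[locationTag($location, $length)][] = $location;
    }
    return $index;
}

// Keep only the tags that more than one location maps to.
function ambiguousTags(array $index): array
{
    return array_filter($index, fn(array $names): bool => count($names) > 1);
}

$locations = ['Red Acres', 'Green Acres', 'Blue Acres', 'Green Fields'];
$index     = buildTagIndex($locations);
$clashes   = ambiguousTags($index);

if ($clashes === []) {
    echo "All tags are unique for the current location list.\n";
} else {
    print_r($clashes);
}
```

If a clash does show up, a longer tag would resolve it, but the unconnected servers would all have to agree on the new length, so this check only helps if it runs before codes are handed out.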
s.dot wrote:...then the only way would be to sync files or databases manually regularly between the servers.
Well, I can uniquely identify each of the small servers and use that as the base for the uniqueness. For that matter, if I simply needed something unique I could take the server id (2 digits would be plenty, 3 would be overkill) and the time() (then replace each letter randomly for security). It's just matching an id to a location.
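A rough sketch of the server id plus `time()` idea, without the letter-replacement step. The function name `makeId`, the base-36 encoding, and the single-character per-second sequence number are all assumptions added for illustration; the sequence is there because `time()` alone repeats for codes issued within the same second.

```php
<?php
// Hypothetical generator: 2-digit server id + base-36 timestamp + a
// per-server sequence digit for codes issued within the same second.
function makeId(int $serverId, int $timestamp, int $sequence = 0): string
{
    if ($serverId < 0 || $serverId > 99) {
        throw new InvalidArgumentException('server id must fit in 2 digits');
    }
    if ($sequence < 0 || $sequence > 35) {
        throw new InvalidArgumentException('sequence must fit in one base-36 digit');
    }

    return sprintf('%02d', $serverId)
        . base_convert((string) $timestamp, 10, 36)
        . base_convert((string) $sequence, 10, 36);
}

echo makeId(7, time()), "\n";
```

For present-day timestamps this gives 9 characters. It stays unique only as long as every server has a distinct id, its clock never runs backwards, and it issues no more than 36 codes per second. It says nothing about the location by itself, so the id would still have to be matched to a location separately.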
Mordred wrote:2. Do you need to
Skara wrote:generate a unique and short id out of the data above
or
Skara wrote:generate a completely unique id containing the location data
... because these are two quite different things.
Erhm... Not sure. I'll take either I can make work.

I may yet come up with another off-the-wall solution, but if anyone else has one, let me know.