Real-time web analytics and reporting

Posted: Fri Apr 16, 2010 12:21 pm
by Denoxis
Hello everybody,

This is my first post, so first I would like to greet everyone and thank everybody who is reading this. :)
I am new to PHP, so I am learning its OOP aspects using my past experience with C#. When I get stuck it's usually on a design issue, since syntax errors can be resolved with a little googling.

I am working on a project that is somewhat similar to an affiliate network program, where visitors go through a redirector script that I host. What I want to do is provide real-time reporting to the vendors/admins who want to see what their visitors are doing.

A couple of approaches come to mind, and I am kind of thinking out loud here, hoping to get your professional opinions:

1) Let the redirector script save basic visitor information such as IP, user agent, and HTTP referrer to a database. Then use a reporting tool that parses this data from the database in real time. Parsing would involve geolocating the IP, identifying the browser and OS from the user agent, and extracting the referring page and its parameters from the HTTP referrer.
1.b.) If this real-time parsing would take too many resources, should it be done while saving the data instead (i.e. the PHP script parses the location, browser info, and referrer info and saves them in 5-6 different fields instead of 3)? Then the reporting tool doesn't have to do any parsing.

2) Try to integrate an analytics solution like Piwik. A JavaScript snippet would be embedded into the redirector script to capture IP, browser info, referrer info and such, and the analytics server would do the rest.
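
To make option 1.b more concrete, here is a rough sketch of what parsing at save time could look like. It is written in Python only to keep it short; the same shape should carry over to a PHP class. The function name `parse_hit`, the field names, and the substring rules are all made up for illustration, and geolocation is left out because it needs an external IP database.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical substring rules; a real user-agent parser needs many more cases.
# Order matters: Chrome user agents also contain "Safari", so Chrome comes first.
BROWSER_RULES = [
    ("Firefox", "Firefox"),
    ("Chrome", "Chrome"),
    ("MSIE", "Internet Explorer"),
    ("Safari", "Safari"),
]
OS_RULES = [("Windows", "Windows"), ("Mac OS X", "Mac OS X"), ("Linux", "Linux")]


def first_match(text, rules):
    """Return the label of the first rule whose needle occurs in text."""
    for needle, label in rules:
        if needle in text:
            return label
    return "Unknown"


def parse_hit(ip, user_agent, referrer):
    """Turn the 3 raw fields into the parsed fields that would be saved.

    Geolocation from the IP is intentionally not done here.
    """
    parts = urlsplit(referrer or "")
    # Keep only the first value of each query parameter.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {
        "ip": ip,
        "browser": first_match(user_agent or "", BROWSER_RULES),
        "os": first_match(user_agent or "", OS_RULES),
        "referrer_host": parts.hostname or "",
        "referrer_path": parts.path,
        "referrer_params": params,
    }
```

The redirector would call something like `parse_hit(ip, user_agent, referrer)` once per visit and store the result, so the reporting side only has to read ready-made columns.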

For the sake of simplicity and moving forward, I have already implemented #1, thinking that once the project is done I would find a commercially available class (or write one) that can parse this raw data. Is there a readily available solution that can generate reports from specified database tables?
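
By "reports out of database tables" I mean something along these lines. This is only a sketch: it uses Python with an in-memory SQLite database to stay self-contained, and the `hits` table, its columns, and `browser_report` are hypothetical names, not my real schema.

```python
import sqlite3

# In-memory database standing in for the real one (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (vendor_id INTEGER, browser TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO hits (vendor_id, browser, country) VALUES (?, ?, ?)",
    [
        (1, "Firefox", "TR"),
        (1, "Firefox", "US"),
        (1, "Chrome", "US"),
        (2, "Safari", "DE"),
    ],
)


def browser_report(conn, vendor_id):
    """Count one vendor's visits per browser, most common first."""
    cursor = conn.execute(
        "SELECT browser, COUNT(*) AS visits FROM hits "
        "WHERE vendor_id = ? GROUP BY browser "
        "ORDER BY visits DESC, browser",
        (vendor_id,),
    )
    return cursor.fetchall()
```

A charting class would then only need to turn rows like these into a graph.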

My instinct tells me that I should at least implement #1b, so that the data is parsed only once. I would still need to find or write a class to display shiny charts. The only downside I can think of with this approach is that if the parser class is updated in the future, the parsed data might become inconsistent (e.g. the parser was wrongly parsing "A" as "X" and later correctly parses "A" as "Z", so the database now has both "X" and "Z").
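
One way I could imagine handling that downside, as a sketch under my own assumptions: keep the raw fields next to the parsed ones, stamp each row with the parser version that produced it, and re-parse stale rows after an update. Python is used here only for brevity, the list of dicts stands in for database rows, and `PARSER_VERSION`, `parse_browser`, and `reparse_stale` are hypothetical names.

```python
PARSER_VERSION = 2  # bumped whenever the parsing rules change


def parse_browser(user_agent, version=PARSER_VERSION):
    """Toy parser; version 1 is deliberately buggy and labels Chrome as Safari."""
    if version >= 2 and "Chrome" in user_agent:
        return "Chrome"
    if "Safari" in user_agent:
        return "Safari"
    return "Unknown"


def reparse_stale(rows):
    """Re-derive the parsed field for rows written by an older parser.

    Each row keeps the raw user agent, so nothing is lost when rules change.
    Returns the number of rows that were updated.
    """
    updated = 0
    for row in rows:
        if row["parser_version"] < PARSER_VERSION:
            row["browser"] = parse_browser(row["user_agent"])
            row["parser_version"] = PARSER_VERSION
            updated += 1
    return updated
```

The cost is extra storage for the raw fields plus an occasional batch job, but the table would never hold both "X" and "Z" for the same input for long.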

As for #2, it seems the easiest, but not as flexible as I want. It's a whole new package that needs to be integrated. For example, my redirector script may want to save some other data that Piwik's JavaScript code is not aware of. I may have to get my hands really dirty if I want to customize an open-source solution like Piwik.

As for sample data, let's say 10 admins/vendors with 1000 visitors each per day.
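
For a sense of scale, here is the back-of-the-envelope arithmetic for those numbers, assuming one database row per visit and traffic spread evenly over the day (both are simplifications, and real traffic will have peaks). It is written as a few lines of Python.

```python
VENDORS = 10
VISITS_PER_VENDOR_PER_DAY = 1000
SECONDS_PER_DAY = 24 * 60 * 60

# Assumes exactly one stored row per visit.
hits_per_day = VENDORS * VISITS_PER_VENDOR_PER_DAY   # 10,000 rows per day
hits_per_second = hits_per_day / SECONDS_PER_DAY     # roughly 0.12 on average
hits_per_year = hits_per_day * 365                   # 3,650,000 rows per year
```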

Any thoughts/experiences on this?

Thank you.

Deniz