Getting RAW Log files and deciphering

myitanalyst
Forum Newbie
Posts: 24
Joined: Wed Dec 14, 2005 4:31 pm

Post by myitanalyst »

I will be pulling log files from a remote server and I need some help in deciphering them. Basically I will use PHP to parse the file and extract bandwidth used by a client.

The log files come off of Limelight and I can tell who my clients are based on the folder structure.

So let's say I want to determine the bandwidth used for a particular file for a particular client. I know I could isolate the client and the file based on the GET portion of the log, but how do I know how much bandwidth has been used? Can I get this information from the log files?

Here is a small sample of a couple of items.


202.108.250.253 - - [12/Apr/2006:04:00:11 -0700] "GET http://podcastpub.dl.llnw.net/10/blog_kits_1.mp3 HTTP/1.0" 206 413 "http://202.108.23.172/index.html" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"
202.108.250.253 - - [12/Apr/2006:04:14:42 -0700] "GET http://podcastpub.dl.llnw.net/10/blog_kits_1.mp3 HTTP/1.0" 206 284 "http://202.108.23.172/index.html" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"
202.108.250.253 - - [12/Apr/2006:04:28:20 -0700] "GET http://podcastpub.dl.llnw.net/10/blog_kits_1.mp3 HTTP/1.0" 206 8469 "http://202.108.23.172/index.html" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"
220.181.18.7 - - [12/Apr/2006:12:32:48 -0700] "GET http://podcastpub.dl.llnw.net/10/blog_kits_1.mp3 HTTP/1.0" 206 8469 "http://220.181.27.54/index.html" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"
201.129.85.211 - - [12/Apr/2006:07:18:44 -0700] "GET http://podcastpub.dl.llnw.net/8/teen_options_1.mp3 HTTP/1.0" 200 565904 "-" "iTunes/6.0.2 (Macintosh; N; PPC)"
I think I would find the answer here: "GET http://podcastpub.dl.llnw.net/10/blog_kits_1.mp3 HTTP/1.0" 206 413 but I am not sure.

I will continue doing research, but I would be grateful for any pointers.

Since I have to pull them from their FTP server, would a CRON job be the best way to do this? Can a CRON job do the download, and can it call web pages? As you can tell, I have never used CRON jobs.
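For what it is worth, cron can run the PHP command-line binary directly, so the job does not have to call a web page at all. Below is a minimal sketch of a fetch script using PHP's FTP extension. The function names fetch_gz_logs and is_gz_log, the host, the credentials, and every path are placeholders made up for illustration, not Limelight's actual setup.

```php
<?php
// Sketch only: host, credentials, and directories are hypothetical.
// A crontab entry such as the following would run it daily at 05:00
// (both paths are assumptions):
//   0 5 * * * /usr/bin/php /path/to/fetch_logs.php

// True when the file name ends in ".gz".
function is_gz_log(string $name): bool
{
    return substr($name, -3) === '.gz';
}

// Download every .gz file in $remoteDir that is not yet in $localDir.
// Returns the list of local paths written during this run.
function fetch_gz_logs(string $host, string $user, string $pass,
                       string $remoteDir, string $localDir): array
{
    $conn = ftp_connect($host);
    if ($conn === false) {
        throw new RuntimeException("Cannot connect to $host");
    }
    if (!ftp_login($conn, $user, $pass) || !ftp_chdir($conn, $remoteDir)) {
        ftp_close($conn);
        throw new RuntimeException('FTP login or chdir failed');
    }
    ftp_pasv($conn, true); // passive mode tends to work better behind firewalls

    $fetched = [];
    $names = ftp_nlist($conn, '.') ?: [];
    foreach ($names as $remote) {
        $name = basename($remote);
        $local = $localDir . '/' . $name;
        if (!is_gz_log($name) || file_exists($local)) {
            continue; // not a log, or already downloaded on an earlier run
        }
        if (ftp_get($conn, $local, $name, FTP_BINARY)) {
            $fetched[] = $local;
        }
    }
    ftp_close($conn);
    return $fetched;
}

// Example call (placeholder values):
// fetch_gz_logs('ftp.example.com', 'user', 'secret', '/logs', '/var/data/logs');
```

Skipping files that already exist locally keeps repeated cron runs cheap, on the assumption that the provider never rewrites a log file under the same name.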

Also, the log files are compressed in .gz format, so I will have to uncompress them before I can even read them.
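On the .gz point, PHP's zlib extension can read a gzip file line by line, so a separate uncompress step may not be needed. A minimal sketch, assuming the zlib extension is enabled; read_gz_lines is a made-up helper name:

```php
<?php
// Yield each line of a gzip-compressed file, decompressing on the fly.
function read_gz_lines(string $path): Generator
{
    $handle = gzopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        while (($line = gzgets($handle)) !== false) {
            yield rtrim($line, "\r\n"); // strip the trailing newline
        }
    } finally {
        gzclose($handle); // runs even if the caller stops iterating early
    }
}

// Example use (the path is a placeholder):
// foreach (read_gz_lines('/var/data/logs/access.log.gz') as $line) { ... }
```

A generator keeps memory use flat, which matters if the daily logs are large.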

Thanks again!
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

The first number following the quoted GET request line is the HTTP response code. The second number is the byte length sent in the response. In the last sample line, these are 200 (OK) and 565,904 bytes respectively.
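As a rough illustration of pulling those two numbers out in PHP, here is a sketch that assumes every line follows the layout of the sample above (IP, two dashes, bracketed time, quoted request, status, bytes). parse_log_line is a hypothetical name, and the pattern has only been checked against lines shaped like the sample.

```php
<?php
// Parse one access-log line into its main fields, or return null
// when the line does not match the assumed layout.
function parse_log_line(string $line): ?array
{
    $pattern = '/^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) (\d+|-)/';
    if (!preg_match($pattern, $line, $m)) {
        return null;
    }
    return [
        'ip'     => $m[1],
        'time'   => $m[2],
        'method' => $m[3],
        'url'    => $m[4],
        'status' => (int) $m[5],
        // Some servers log "-" when no body was sent; treat that as 0 bytes.
        'bytes'  => $m[6] === '-' ? 0 : (int) $m[6],
    ];
}
```

Returning null for lines that do not match lets the caller count or skip malformed entries instead of silently adding zeros.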
myitanalyst
Forum Newbie
Posts: 24
Joined: Wed Dec 14, 2005 4:31 pm

Post by myitanalyst »

feyd wrote:The first number following the quoted GET request line is the HTTP response code. The second number is the byte length sent in the response. In the last sample line, these are 200 (OK) and 565,904 bytes respectively.
Ok, so I need to learn about the response codes and what they mean.

Secondly, if the mp3 file is 15 MB, would I see one large GET, or would it be broken down into a bunch of small ones?

I ask because the logs I have so far do NOT show any large byte counts at all. Even the 565,904 bytes above is only about half a MB, if I read it correctly.

Thanks.
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

They have always been one single entry for my logs.
myitanalyst
Forum Newbie
Posts: 24
Joined: Wed Dec 14, 2005 4:31 pm

Post by myitanalyst »

feyd wrote:They have always been one single entry for my logs.
So I wonder about the example above: it shows 200 (OK), but the bytes are only a small portion of the entire file. Did they get the entire file, or did they cancel the download? Would I see a 200 if the user stops the transfer?

Thanks
myitanalyst
Forum Newbie
Posts: 24
Joined: Wed Dec 14, 2005 4:31 pm

Post by myitanalyst »

I also wonder if someone already has a parsing script that can read log files that I could alter to extract what I need.
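A starting point might look like the sketch below. It assumes that the first folder in the URL path identifies the client (as in /10/ and /8/ in the sample) and that only 200 and 206 responses should count toward bandwidth. 206 is Partial Content, the reply to a request for a byte range, so a client that fetches a file in pieces shows up as several small entries rather than one large one. sum_bytes_by_file is a made-up name.

```php
<?php
// Sum bytes sent per client folder and file name.
// Returns an array shaped like [client][file] => total bytes.
function sum_bytes_by_file(iterable $lines): array
{
    // Quoted request line, then status code, then byte count.
    $pattern = '/"[A-Z]+ (\S+) [^"]*" (\d{3}) (\d+|-)/';
    $totals = [];
    foreach ($lines as $line) {
        if (!preg_match($pattern, $line, $m)) {
            continue; // skip lines that do not look like log entries
        }
        $status = (int) $m[2];
        if ($status !== 200 && $status !== 206) {
            continue; // assumption: only full and partial content count
        }
        $path = parse_url($m[1], PHP_URL_PATH);
        if (!is_string($path)) {
            continue;
        }
        // "/10/blog_kits_1.mp3" becomes ["10", "blog_kits_1.mp3"]
        $parts = explode('/', ltrim($path, '/'), 2);
        if (count($parts) < 2) {
            continue;
        }
        [$client, $file] = $parts;
        $bytes = $m[3] === '-' ? 0 : (int) $m[3];
        $totals[$client][$file] = ($totals[$client][$file] ?? 0) + $bytes;
    }
    return $totals;
}
```

Because the function accepts any iterable of lines, it can be fed an array for testing or a line-by-line file reader for real logs.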

thanks
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

The response code is sent in the response headers. I do not believe a response code is recorded if the download is aborted or hits an error; however, a dash may get recorded instead.