Useragent Database

Posted: Fri Feb 10, 2006 1:28 am
by Benjamin
Does anyone have a useragent database which includes the operating system, search engine and browser? I need this information for a web site traffic stats application. If you know where I could find this information it would be great. I tried search engines but didn't find anything useful.

Actually if you have a high traffic web site I would appreciate a list of the most common (say 200) user agents so that I can input them manually if I can't find a database of them.

Posted: Fri Feb 10, 2006 1:31 am
by feyd

Posted: Fri Feb 10, 2006 1:51 am
by Benjamin
I can work with that. Thanks!!

Posted: Fri Feb 10, 2006 4:21 am
by Benjamin
I take that back. get_browser() won't work on all servers, and the CSV databases that I found (via the php.net manual) have the versions parsed out of them, so if I insert them into a database they won't match real user agent strings. I need data like this...

"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)","Internet Explorer 6.0","Windows XP"
"Googlebot/2.1 (+http://www.google.com/bot.html)","Google Search Engine",""

Or something similar.
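
A minimal sketch of how rows in that shape could be looked up, assuming an exact match on the full user agent string. The names `load_agent_rows()` and `lookup_agent()` are hypothetical, the two rows are the samples above, and `str_getcsv()` plus the nowdoc syntax assume PHP 5.3 or later.

```php
<?php
// Hypothetical helper: parse CSV text of "user agent","browser","os" rows
// into an array keyed by the exact user agent string.
function load_agent_rows($csv_text) {
	$rows = array();
	foreach (preg_split('/\r\n|\r|\n/', trim($csv_text)) as $line) {
		if ($line === '') {
			continue;
		}
		// Delimiter, enclosure and escape are passed explicitly.
		$fields = str_getcsv($line, ',', '"', '\\');
		if (count($fields) < 3) {
			continue; // skip malformed rows
		}
		$rows[$fields[0]] = array('browser' => $fields[1], 'os' => $fields[2]);
	}
	return $rows;
}

// Hypothetical helper: exact-match lookup, null when the agent is not listed.
function lookup_agent($rows, $user_agent) {
	return isset($rows[$user_agent]) ? $rows[$user_agent] : null;
}

$csv = <<<'CSV'
"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)","Internet Explorer 6.0","Windows XP"
"Googlebot/2.1 (+http://www.google.com/bot.html)","Google Search Engine",""
CSV;

$rows = load_agent_rows($csv);
$hit = lookup_agent($rows, 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)');
echo $hit['browser'], ' / ', $hit['os'], "\n";
```

Exact matching misses any user agent that differs by even one character, so a table like this only covers the strings it actually lists.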

Posted: Fri Feb 10, 2006 8:07 am
by feyd
I've built my own versions of get_browser() before, using parse_ini_file() and regex against the ini file that get_browser() uses.

Posted: Fri Feb 10, 2006 9:16 am
by Benjamin
feyd wrote: I've built my own versions of get_browser() before, using parse_ini_file() and regex against the ini file that get_browser() uses.
You're like a PHP god though, feyd. All I need is a list of common user agents with the OS and browser.

Posted: Fri Feb 10, 2006 10:23 am
by feyd

Code:

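<?php

function feyd_get_browser($user_agent = null, $return_array = false) {

	// Local copy of the browscap file that get_browser() uses.
	$ini = 'php_browscap.ini';
	$default = 'Unknown';
	
	// Fall back to the current request's user agent, if there is one.
	if ($user_agent === null)
	{
		if(isset($_SERVER['HTTP_USER_AGENT']))
		{
			$user_agent = $_SERVER['HTTP_USER_AGENT'];
		}
		else
		{
			$user_agent = $default;
		}
	}

	// Each section name is a wildcard pattern; its keys are the properties.
	$data = parse_ini_file($ini,true);
	if ($data === false)
	{
		// Missing or unparsable file: nothing to match against.
		$data = array();
	}

	$final = array();
	$found = false;

	foreach($data as $pattern => $info)
	{
		// Readable regex, reported the way get_browser() reports browser_name_regex.
		$name_regex     = '^' . str_replace(array('?','*'),array('.','.*'),strtolower($pattern)) . '$';
		// Regex used for matching: preg_quote() escapes everything, including the
		// '#' delimiter, then the escaped wildcards become regex wildcards again.
		$regex 	        = '#^' . str_replace(array('\\?','\\*'),array('.','.*?'),preg_quote($pattern,'#')) . '$#i';
		if (preg_match($regex,$user_agent))
		{
			$found = true;
			$final['browser_name_pattern'] = $pattern;
			$final['browser_name_regex']   = $name_regex;
			// Only one level of parent is merged; the section's own values win.
			if (isset($info['parent']) and isset($data[$info['parent']]))
			{
				$final = array_merge($final,$data[$info['parent']],$info);
			}
			else
			{
				$final = array_merge($final,$info);
			}
			// The first matching section wins.
			break;
		}
	}
	
	if(!$found) {
		// No section matched (not even a catch-all '*'), so report the default name.
		$final = array('browser' => $default);
	}
	
	if ($return_array)
	{
		return $final;
	}
	else
	{
		return (object)$final;
	}
}

echo '<pre>';
var_export(feyd_get_browser());
echo "\n";
var_export(feyd_get_browser(''));
echo "\n";
var_export(feyd_get_browser(null,true));
echo "\n";
var_export(feyd_get_browser('',true));
echo '</pre>';

?>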
outputs

Code:

stdClass::__set_state(array(
   'browser_name_pattern' => 'Mozilla/5.0 (Windows; *; Windows NT 5.1; *rv:*) Gecko/* Firefox/1.5*',
   'browser_name_regex' => '^mozilla/5.0 (windows; .*; windows nt 5.1; .*rv:.*) gecko/.* firefox/1.5.*$',
   'browser' => 'Firefox',
   'version' => '1.5',
   'majorver' => '1',
   'minorver' => '5',
   'css' => '2',
   'frames' => '1',
   'iframes' => '1',
   'tables' => '1',
   'cookies' => '1',
   'backgroundsounds' => '',
   'vbscript' => '',
   'javascript' => '1',
   'javaapplets' => '1',
   'activexcontrols' => '',
   'cdf' => '',
   'aol' => '',
   'beta' => '',
   'win16' => '',
   'crawler' => '',
   'stripper' => '',
   'wap' => '',
   'netclr' => '',
   'parent' => 'Firefox 1.5',
   'platform' => 'WinXP',
))
stdClass::__set_state(array(
   'browser_name_pattern' => '*',
   'browser_name_regex' => '^.*$',
   'browser' => 'Default Browser',
   'css' => '0',
   'frames' => '',
   'iframes' => '',
   'tables' => '1',
   'cookies' => '',
   'backgroundsounds' => '',
   'vbscript' => '',
   'javascript' => '',
   'javaapplets' => '',
   'activexcontrols' => '',
   'cdf' => '',
   'aol' => '',
   'beta' => '',
   'Win16' => '',
   'crawler' => '',
   'stripper' => '',
   'wap' => '',
   'netclr' => '',
))
array (
  'browser_name_pattern' => 'Mozilla/5.0 (Windows; *; Windows NT 5.1; *rv:*) Gecko/* Firefox/1.5*',
  'browser_name_regex' => '^mozilla/5.0 (windows; .*; windows nt 5.1; .*rv:.*) gecko/.* firefox/1.5.*$',
  'browser' => 'Firefox',
  'version' => '1.5',
  'majorver' => '1',
  'minorver' => '5',
  'css' => '2',
  'frames' => '1',
  'iframes' => '1',
  'tables' => '1',
  'cookies' => '1',
  'backgroundsounds' => '',
  'vbscript' => '',
  'javascript' => '1',
  'javaapplets' => '1',
  'activexcontrols' => '',
  'cdf' => '',
  'aol' => '',
  'beta' => '',
  'win16' => '',
  'crawler' => '',
  'stripper' => '',
  'wap' => '',
  'netclr' => '',
  'parent' => 'Firefox 1.5',
  'platform' => 'WinXP',
)
array (
  'browser_name_pattern' => '*',
  'browser_name_regex' => '^.*$',
  'browser' => 'Default Browser',
  'css' => '0',
  'frames' => '',
  'iframes' => '',
  'tables' => '1',
  'cookies' => '',
  'backgroundsounds' => '',
  'vbscript' => '',
  'javascript' => '',
  'javaapplets' => '',
  'activexcontrols' => '',
  'cdf' => '',
  'aol' => '',
  'beta' => '',
  'Win16' => '',
  'crawler' => '',
  'stripper' => '',
  'wap' => '',
  'netclr' => '',
)
when used against php_browscap.ini


I just wrote this from scratch. The output shown was produced on PHP 5.1.2.
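
For a stats application like the one described at the top of the thread, the returned properties can be collapsed into a short label. A minimal sketch, assuming the array form of the result and the key names shown in the output above (`browser`, `version`, `platform`, `crawler`); `summarize_agent()` is a hypothetical helper, not part of PHP or browscap.

```php
<?php
// Hypothetical helper: reduce a browscap-style property array to the
// browser label, platform, and crawler flag a stats table might store.
function summarize_agent(array $info) {
	$browser  = isset($info['browser']) ? $info['browser'] : 'Unknown';
	$version  = isset($info['version']) ? $info['version'] : '';
	$platform = isset($info['platform']) ? $info['platform'] : '';
	return array(
		// trim() drops the trailing space when there is no version.
		'browser'  => trim($browser . ' ' . $version),
		'platform' => $platform,
		// Flags show up as '1' or '' in the output above.
		'crawler'  => isset($info['crawler']) && $info['crawler'] === '1',
	);
}

// Sample values copied from the Firefox result above.
$firefox = array(
	'browser'  => 'Firefox',
	'version'  => '1.5',
	'platform' => 'WinXP',
	'crawler'  => '',
);
print_r(summarize_agent($firefox));
```

Entries without a `version` or `platform` key, such as the Default Browser result, come back with the bare browser name and an empty platform.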

Posted: Fri Feb 10, 2006 1:52 pm
by Roja
There are several different solutions.

I use phpsniff quite often, as it's a fairly nice browser sniffer.

You can also rip into the code used in bbclone, which is regularly updated with new browser identifications including all of the major (and many of the minor) crawler bots. (Demo available).