list of all the colleges in the US?

Posted: Fri Aug 11, 2006 1:21 am
by matt1019
Hmm,

Hi guys! Looking at one of the posts (list of cities,state,zip) in this section got me thinking.... is there a similar compilation of colleges/universities in the US?

thanks much! ;)

-Matt

Posted: Fri Aug 11, 2006 4:07 am
by Weirdan

Posted: Fri Aug 11, 2006 11:31 am
by matt1019
Holy Cow!!

Just what I needed, except they do not provide it in database format.... it's a huge, huge list to manually cut and paste too...

I can think of an idea: a source code parser!!!

I just learned in C++ how to take a .html page and strip everything between specified tags.

Once I finish, I will post the results here in zip format ;)

In the meantime, if someone has a "php" based solution, please share ;)

Basically, the source parser will take the page in .html form and "take" everything between TARGET="_blank"> and </A></LI>, since the names of the colleges are between those tags.

here's an example:
<LI><A HREF="http://www.au.af.mil/" TARGET="_blank">Air University</A></LI>
<LI><A HREF="http://www.aamu.edu/" TARGET="_blank">Alabama A&M University</A></LI>
<LI><A HREF="http://www.alasu.edu/" TARGET="_blank">Alabama State University</A></LI>
<LI><A HREF="http://www.athens.edu/" TARGET="_blank">Athens State University</A></LI>
<LI><A HREF="http://www.auburn.edu/" TARGET="_blank">Auburn University</A></LI>
<LI><A HREF="http://www.aum.edu/" TARGET="_blank">Auburn University at Montgomery</A></LI>
<LI><A HREF="http://www.bsc.edu/" TARGET="_blank">Birmingham-Southern College</A></LI>

Interesting little project.... worth it though ;)

-Matt
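
For anyone who wants the "everything between two markers" idea in PHP without a regex, here is a minimal sketch. It assumes PHP 7 or later, and `extract_between` is a made-up helper name, not something from the thread. The sample string is just two of the lines quoted above.

```php
<?php
// Hypothetical helper: collects every substring found between $start and $end.
function extract_between(string $html, string $start, string $end): array
{
    $names = [];
    $offset = 0;
    // Walk through the page, jumping from one opening marker to the next.
    while (($from = strpos($html, $start, $offset)) !== false) {
        $from += strlen($start);
        $to = strpos($html, $end, $from);
        if ($to === false) {
            break; // opening marker with no closing marker after it
        }
        $names[] = substr($html, $from, $to - $from);
        $offset = $to + strlen($end);
    }
    return $names;
}

// Two of the sample lines from the post above.
$sample = '<LI><A HREF="http://www.au.af.mil/" TARGET="_blank">Air University</A></LI>' . "\n"
        . '<LI><A HREF="http://www.aamu.edu/" TARGET="_blank">Alabama A&M University</A></LI>';

$names = extract_between($sample, 'TARGET="_blank">', '</A></LI>');
print_r($names);
```

This only works while the markers appear exactly as written (same case, same spacing), so it is probably best treated as a one-off for this particular page.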

Posted: Fri Aug 11, 2006 11:55 am
by s.dot
Or how about this :P

Code: Select all

$file = file_get_contents('sourcefile.txt');

preg_match_all("#<LI>*+? TARGET=\"_blank\">(*+?)</A>#ism",$file,$matches);

foreach($matches[0] AS $match)
{
   echo $match.'<br />';
}
Not tested, of course.

Posted: Fri Aug 11, 2006 12:19 pm
by matt1019
Hi Scotayy,

Nope, does not work... gives me the following errors:
Warning: preg_match_all() [function.preg-match-all]: Compilation failed: nothing to repeat at offset 6 in getcolleges.php on line 7

Warning: Invalid argument supplied for foreach() in getcolleges.php on line 9

I tried the following also.... still no luck.

Code: Select all

<?php

error_reporting(E_ALL);

$file = file_get_contents('sourcefile.html'); 

preg_match_all("#<LI>*+? TARGET=\"_blank\">,(*+?)</A>#ism",$file,$matches,PREG_PATTERN_ORDER); 

foreach($matches[0] AS $match) 
{ 
echo $match.'<br />'; 
} 
?>
The file "sourcefile.html" does exist in the same dir as this script.

-Matt

Posted: Fri Aug 11, 2006 12:25 pm
by s.dot
Ah. My bad.

Code: Select all

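// Read the saved page source into one string
$file = file_get_contents('sourcefile.txt');

// Capture the link text of each list item; $matches[1] holds the captured names
preg_match_all("#<LI>.+? TARGET=\"_blank\">(.+?)</A>#ism",$file,$matches); 

// Print one college name per line
foreach($matches[1] AS $match) 
{ 
   echo $match.'<br />'; 
}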
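
A note on why the first attempt failed: `*+?` has no `.` in front of it, so PCRE ends up with a quantifier that has nothing to repeat and refuses to compile the pattern. The foreach warning looks like a knock-on effect, since `$matches` never gets filled when the pattern does not compile. Here is a minimal sketch of checking for that before looping. It assumes PHP 7.1 or later, and `match_names` is a made-up helper name.

```php
<?php
// Hypothetical helper: returns the captured names, or null if the pattern is broken.
// Assumes the pattern contains one capture group around the name.
function match_names(string $pattern, string $html): ?array
{
    // @ hides the "Compilation failed" warning so the caller can handle it.
    $count = @preg_match_all($pattern, $html, $matches);
    if ($count === false) {
        return null; // pattern did not compile, or matching failed
    }
    return $matches[1];
}

$sample = '<LI><A HREF="http://www.au.af.mil/" TARGET="_blank">Air University</A></LI>';

$broken = match_names('#<LI>*+? TARGET="_blank">(*+?)</A>#ism', $sample);
$fixed  = match_names('#<LI>.+? TARGET="_blank">(.+?)</A>#ism', $sample);

var_dump($broken); // null for the pattern from the first attempt
print_r($fixed);
```

Returning null instead of an empty array keeps "bad pattern" separate from "good pattern, no matches", which are easy to mix up when scraping.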

Posted: Fri Aug 11, 2006 12:30 pm
by matt1019
Hahahahahahha!! Yaarrrrrr! Genius!

I will now tweak this script so that it produces "sql" type output.... so then I can just copy and paste.... and get a database table filled with these values.

Again, I will post the completed sql file ;)

thanks scottayy!!


-Matt
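
One way the "sql type output" step could look, as a rough sketch only: the table name `colleges` and column `name` are assumptions (the thread never says what the real table looks like), and `build_inserts` is a made-up helper name. It assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: turns a list of college names into INSERT statements.
// Table "colleges" and column "name" are assumed, not taken from the thread.
function build_inserts(array $names, string $table = 'colleges'): array
{
    $statements = [];
    foreach ($names as $name) {
        // Turn entities such as &amp; back into plain characters.
        $clean = html_entity_decode(trim($name), ENT_QUOTES);
        // Double any single quote so it survives inside a SQL string literal.
        $escaped = str_replace("'", "''", $clean);
        $statements[] = "INSERT INTO $table (name) VALUES ('$escaped');";
    }
    return $statements;
}

$out = build_inserts(['Air University', "St. John's College"]);
echo implode("\n", $out), "\n";
```

Doubling quotes is only meant for a one-off file you paste in by hand. If the script talks to the database directly, prepared statements would be the safer route.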

Posted: Fri Aug 11, 2006 1:39 pm
by matt1019
No Attachments allowed?

any suggestions as to where I can upload my rar file?

(has the complete sql file ;) )

-Matt

Posted: Fri Aug 11, 2006 2:12 pm
by matt1019
Ahhh, yes.

Found a good site to host it on ;)

here's the link to the download:

http://www.upload2.net/page/download/hA ... 0.rar.html

Please read the comments inside the sql file. They explain what you need to know.

-Matt