Need help

Posted: Sat Apr 26, 2008 6:48 pm
by absolut9
Hello,

I am building a weather site and need to gather information from the National Weather Service. Unfortunately, a lot of their information isn't in some of the best formats to pull from. I need to parse information from the text file located at http://www.spc.noaa.gov/products/outloo ... 262000.txt

I need help parsing a file. Specifically, I need the 8-digit numbers, which correspond to lat/long pairs, so that I can draw a line across a map. Is there an easy way to parse this text file and gather the 8-digit codes as well as the heading before them?

Realistically, I would like to parse the text file (perhaps into an array), gather the information I need and then store it into a database. My thinking is if I can write a search for each specific section in the text file, I could kick off several instances of the search and have them combined into the database.

I am using PHP and hoping that regex may offer a better way to search for and store the information I need from the file. I am more interested in getting the categorical numbers for the SLGT and TSTM areas than the top areas. Getting something in the following format would be great:

array = [SLGT, 29680224, 30890281, 32990280, 34040203, 34470120, ..., ...]
array = [TSTM, 30918077, 30308242, 30248345, 29798491, 29218483, ..., ...]

So.. pretty complicated and way above my skill level.. Can anyone help? Thanks!

Re: Need help

Posted: Sun Apr 27, 2008 2:25 am
by prometheuzz
I assume that by "pretty complicated and way above my skill level" you meant the regex part. If you don't know any PHP either, then I think you should work through some basic tutorials before creating the application you described.
That said, here's how you could extract the data using PHP's regex functions:

Code: Select all

#!/usr/bin/php
 
<?php
 
$uri = "http://www.spc.noaa.gov/products/outlook/archive/2008/KWNSPTSDY1_200804262000.txt";
$text = getTextFrom($uri);
$slgt_values = getValuesAfterLabel("SLGT", $text);
display($slgt_values);
$tstm_values = getValuesAfterLabel("TSTM", $text);
display($tstm_values);
 
// Prints an array nicely formatted.
function display($array) {
  echo "array=[";
  for ($i = 0; $i < count($array); $i++) {
    if($i > 0 && $i%6 == 0) echo "\n       ";
    echo "$array[$i]".($i < count($array)-1 ? ", " : "");
  }
  echo "]\n";
}
 
// Read a file (locally, or on a remote location) and return the 
// string contents of it.
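function getTextFrom($uri) {
  $output = "";
  $file = fopen($uri, "r");
  // fopen() returns false on failure; stop here rather than passing
  // false to feof(). Remote URIs also need allow_url_fopen enabled.
  if ($file === false) {
    die("Could not open $uri\n");
  }
  while(!feof($file)) {
    $output = $output . fgets($file, 1024); 
  }
  fclose($file);
  return $output;
}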
 
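// Return all the 8-digit groups from '$text' that occur after the 
// variable '$label' as an array (empty if the label is not found).
function getValuesAfterLabel($label, $text) {
  // preg_quote() escapes any regex meta characters in the label.
  $quoted = preg_quote($label, '/');
  if(preg_match('/.*?'.$quoted.'((\s*\d{8})+).*/', $text, $match)) {
    $trimmed = trim($match[1]);
    return preg_split('/\s+/', $trimmed);
  }
  return array();
}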
 
?>  

Which outputs the following:

Code: Select all

$ ./readfile.php 
 
array=[29680224, 30890281, 32990280, 34040203, 34470120, 34380036, 
       33899929, 32839713, 30899635, 29349702, 28469886, 28480045]
array=[30918077, 30308242, 30248345, 29798491, 29218483, 99999999, 
       35887503, 35397618, 34857699, 34387750, 33367816, 99999999, 
       45097312, 42817372, 41657414, 41027415, 40387396, 40017352, 
       99999999, 30150486, 31020486, 31810480, 32850436, 33750355, 
       34610292, 35950141, 36860069, 37960030, 38890119, 39530257, 
       40880377, 41660472, 42650463, 43390427, 43510337, 43340247, 
       43060125, 42609989, 41739869, 41239738, 40079561, 39429407, 
       38509382, 37759505, 36879651, 36319740, 35909807, 35399828, 
       35089782, 34989696, 35049592, 35059483, 35059399, 35179332, 
       35209200, 35848839, 35588648, 36128428, 36638291, 37328189, 
       38358172, 39048176, 40258139, 42608062]
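
A note on the TSTM output: the 99999999 entries do not look like coordinates. The sketch below assumes they mark a break between separate lines, and that each 8-digit group is a latitude followed by a west longitude, both in hundredths of a degree, with the leading 1 dropped for longitudes of 100 or more. Neither assumption comes from the file itself, so check them against the NWS documentation before relying on this. The function names splitSegments and decodePoint are made up for this example.

```php
<?php

// Split a flat list of 8-digit groups into separate segments.
// ASSUMPTION: "99999999" marks a break between lines, not a coordinate.
function splitSegments(array $values) {
  $segments = array();
  $current = array();
  foreach ($values as $value) {
    if ($value === "99999999") {
      if (count($current) > 0) $segments[] = $current;
      $current = array();
    } else {
      $current[] = $value;
    }
  }
  if (count($current) > 0) $segments[] = $current;
  return $segments;
}

// Decode one group such as "29680224" into array(lat, lon).
// ASSUMPTION: first 4 digits = latitude * 100, last 4 = west
// longitude * 100, and a longitude below 50 has had its leading
// 1 dropped (so 0224 is taken to mean 102.24 degrees west).
function decodePoint($group) {
  $lat = intval(substr($group, 0, 4), 10) / 100;
  $lon = intval(substr($group, 4, 4), 10) / 100;
  if ($lon < 50) {
    $lon += 100;
  }
  return array($lat, -$lon); // west longitudes as negative numbers
}
```

With those assumptions, splitSegments($tstm_values) would give one array per line to draw, and decodePoint("29680224") would come out at roughly 29.68, -102.24.
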
Study the regex carefully to try and find out how it works. If anything is unclear, feel free to post back.
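
To make the pattern easier to study, here is a minimal sketch that runs it on a small made-up string (the sample is invented for illustration and is not the real file). The part that matters is ((\s*\d{8})+): each repetition matches optional whitespace followed by exactly eight digits, and since \s also matches line breaks, the run continues across lines until something other than whitespace or an 8-digit group shows up. I left out the leading '.*?' and trailing '.*' here for brevity; as far as I can tell they do not change what ends up in group 1.

```php
<?php

// Made-up sample in the same general shape as the file (NOT real data).
$sample = "SLGT   29680224 30890281 32990280\n"
        . "       34040203 34470120\n"
        . "TSTM   30918077 30308242\n";

// The label, followed by one or more 8-digit groups, each preceded
// by optional whitespace (spaces or line breaks).
$pattern = '/SLGT((\s*\d{8})+)/';

$groups = array();
if (preg_match($pattern, $sample, $match)) {
  // $match[1] holds the whole run of digit groups, whitespace included;
  // trim it and split on whitespace to get the individual groups.
  $groups = preg_split('/\s+/', trim($match[1]));
}
print_r($groups);
```

Here $groups should end up holding the five groups from 29680224 through 34470120, and the match stops before TSTM because "TSTM" is not eight digits.
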
Any questions about the PHP code itself are better asked in the general PHP section of this forum. I am not too familiar with the language; I only know a little regex.

Good luck.