Alright, here's the breakdown:
The regex needs to match the following format:
Two spaces at the beginning of the line. Capture everything until a run of at least 3 spaces is encountered. Skip everything else up to the newline.
On the next line, ignore whitespace at the beginning of the line. Then capture a street address if one is present. The street address ends with a forward slash ( / ); it may itself contain forward slashes, but the terminating slash is always the last forward slash on that line.
Capture the city, which ends with a comma ( , ).
Capture the state (there may be a variable number of spaces before and after the state).
Capture the zip code, which may have the -XXXX extension.
Anchor to the end of the line. There may be spaces after the zip code.
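The steps above can be sketched as a single pattern. This is only an illustration of the description, not the exact regex in question. It assumes a two-letter uppercase state, a five-digit zip, and sample text whose runs of spaces are made up (the real column widths are not shown here). The `~` delimiter and the `x` modifier are used purely for readability.

```php
<?php
// Hypothetical sample record; the widths of the space runs are assumed.
$text = "  ALLIANCE FLOORING, INC      36646125D  01/23/08\n"
      . "  4711 WEST METAIRIE AVENUE / METAIRIE, LA 70001-0000\n";

// m: ^ and $ match at line boundaries. x: whitespace and # comments in the pattern are ignored.
$pattern = '~
    ^[ ]{2}(.+?)[ ]{3,}.*\n     # line 1: two leading spaces, name up to a run of 3 or more spaces, skip the rest
    [ \t]*                      # line 2: ignore leading whitespace
    (?:(.*\S)[ \t]*/[ \t]*)?    # optional street address, ending at the last slash on the line
    ([^,/\n]+),                 # city, terminated by a comma
    [ \t]*([A-Z]{2})[ \t]*      # state, with optional spaces on either side
    (\d{5}(?:-\d{4})?)          # zip code with optional -XXXX extension
    [ \t]*$                     # optional trailing spaces, then end of line
~mx';

$count = preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    // $m[1] = name, $m[2] = street (empty string when absent),
    // $m[3] = city, $m[4] = state, $m[5] = zip
    printf("%s | %s | %s | %s | %s\n", $m[1], $m[2], $m[3], $m[4], $m[5]);
}
```

The street group is greedy and the city class excludes `/`, so the slash that ends the street is necessarily the last one on the line.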
You are right, my example text did not include all possible cases. Here's one that shows something different:
ALLIANCE FLOORING, INC 36646125D 01/23/08
4711 WEST METAIRIE AVENUE / METAIRIE, LA 70001-0000
Regardless of whether my example covers every possible case, the regex does capture all the expected cases. The problem is that something makes preg_match_all stop matching when it encounters the chunk of text I mentioned before. Here is my understanding of how preg_match_all works (it's probably more optimized than this, though):
Go through the text one character at a time. At each position, try to match the pattern starting there, trying various match lengths for the + and * operators. Store all matches and captures in an array. Then step to the next character and repeat.
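A rough sketch of that model in PHP, for concreteness. `scan_all` is a hypothetical helper, not anything PHP provides, and the pattern and sample text are simplified stand-ins. One detail differs from the description: according to the PHP manual, after a match is found the search continues from the end of that match rather than from the next character, so the sketch does the same.

```php
<?php
// Hypothetical helper: emulates the scanning loop with preg_match and an explicit offset.
function scan_all(string $pattern, string $text): array
{
    $found  = [];
    $offset = 0;
    $length = strlen($text);

    // preg_match returns 1 on a match, 0 on no match, and false on error;
    // anything other than 1 ends the loop.
    while ($offset <= $length
        && preg_match($pattern, $text, $m, PREG_OFFSET_CAPTURE, $offset) === 1) {
        [$matched, $start] = $m[0];
        $found[] = $matched;
        // Resume at the end of this match; move one byte past an empty match
        // so the loop always makes progress.
        $offset = $start + max(1, strlen($matched));
    }
    return $found;
}

// Simplified stand-in pattern: city, state, zip on one line.
$pattern = '~^([^,\n]+),[ \t]*([A-Z]{2})[ \t]*(\d{5}(?:-\d{4})?)[ \t]*$~m';

// Made-up text: lines that do not match, followed by lines that do.
$text = "junk that matches nothing\nmore junk\n"
      . "METAIRIE, LA 70001-0000\nKENNER, LA 70062\n";

$manual = scan_all($pattern, $text);
preg_match_all($pattern, $text, $all);

print_r($manual);   // full matches found by the emulated loop
print_r($all[0]);   // full matches found by preg_match_all
```

In this small case both approaches find the same two lines, which is the behavior I would expect regardless of what precedes them.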
If this is in fact roughly how preg_match_all works, then it should match the lines at the end of the file no matter what I put before them. That is obviously not what happens.
What's wrong with my thinking here?