Oh Glorious Programming Stories

jason
Site Admin
Posts: 1767
Joined: Thu Apr 18, 2002 3:14 pm
Location: Montreal, CA

Post by jason »

I love programming stories. I especially love reading about other people's quirky mistakes, and how they contribute to embarrassing situations. I love reading them because when I make my own mistakes, it helps to recall that I'm not the only one.

So, it's with this mindset I present you with a story that isn't that terrible, but maybe my telling of it will be.

When outputting dates, it's easy to get into the habit of just quickly using something that works. After all, when you output a date, and the format is correct and the date is what you expect, all is good. More importantly, when using 'YYYY' nets you 2009 (last year) and 2010 (this year), it's good.

Of course, a good programmer doesn't simply check his current date. Rather, he makes sure that dates in the future work as well. So he mocks up some data, and looks to check 2010 (since this program was written in the middle of December of 2009). So now he checks January 10th through the 19th. The 10th was chosen because, in testing, it was just the first number he thought of. The 19th because it's his birthday, so of course he'll choose that.

Everything works as expected. So then he starts putting the program through even more tests. Imports over a week. Imports over the course of the month.

Even imports of the new year. In every case, the import script works wonderfully well. Now, of course, he doesn't sit there and watch the import script. He has a test running that makes sure the import script goes out and grabs data, and gets data back. It makes sure that the date he is searching for is the date the remote server is searching on, and he makes sure that the dates match up. He also makes sure that the dates in the data being returned match the date searched for.

He finds a few issues here and there, and makes corrections. Everything works well. He even adds a few extra features, because he has the time, and the script works well. It's efficient, it can work over large date ranges, and it works fairly quickly, with the only bottleneck being the network connection and the remote server's ability to hand back the data. But that's not his issue.

So, finally, after everything is approved, the program is put live. Everything works well. A few minor issues are found on the live server, but those are problems with the live version of the remote server, and not with the script. In fact, the script handles this in the way that it should, but still, it's a small issue that will have to be fixed. But still, the program runs fine.

But there is a problem. An interesting problem. A problem that some might already have guessed at.

YYYY doesn't return the current year. No, instead it returns the current year as per ISO 8601, that is, the year that the current ISO week belongs to. What does this mean? Well, for the vast majority of the time, it works fine. In fact, with all my tests, everything passed. However, I was making an assumption. I was assuming the date that Zend_Date was returning was in fact the date I was expecting. Now, anyone with any programming experience knows that expecting something is bad. Zend_Date was of course doing exactly what it should have done. Jan. 1, 2, and 3 of 2010 fall in the last ISO week of 2009. So, on Jan. 1, 2, and 3, I was requesting data for Jan 1, 2, and 3 of 2009. The remote server was of course fine with this, and the data it returned was proper. The dates matched, and all was good.
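
To see the gap between the calendar year and the ISO 8601 week-numbering year, here is a minimal sketch in Python rather than PHP. The standard `datetime` module stands in for Zend_Date, and `years_for` is a made-up helper name, not anything from the original script.

```python
from datetime import date

def years_for(d):
    """Return (calendar year, ISO 8601 week-numbering year) for a date."""
    iso_year = d.isocalendar()[0]  # first field of isocalendar() is the ISO year
    return d.year, iso_year

# The first days of January 2010, plus one of the dates used in testing.
for day in (date(2010, 1, 1), date(2010, 1, 3), date(2010, 1, 4), date(2010, 1, 10)):
    print(day, years_for(day))
```

The two years disagree only for Jan 1 through 3; from Jan 4 onward they match, which is why test dates of Jan 10 through 19 never showed the problem.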

Now, it was my fault. I did copy the YYYY portion of the code from some place else in the code (someone else had used it first), and since it had worked previously, I was of the mind not to change it. Luckily, the fix was easy, and the error didn't cause any real problems. We simply had to run the script manually to import the missing data. But still, it was embarrassing. Even my supervisor here at my new job hadn't realized the problem, and was surprised as well.

So anyways, that's my little story. And what have I learned?

Several things:
  • Don't hardcode anything! Even dates for Zend_Date and the like! Take those formats and define() them!
  • Don't assume that just because it's worked until now, it will continue to work in the future.
  • Do check options, and find out what they actually do. Even when you get the expected output, find out why. If you don't understand what everything in the option does, take some time to learn.
Happy coding!
Bill H
DevNet Resident
Posts: 1136
Joined: Sat Jun 01, 2002 10:16 am
Location: San Diego CA

Re: Oh Glorious Programming Stories

Post by Bill H »

This goes back to programming in MS-DOS for the IBM PC on a text-based screen. I had a C language program that every once in a while made the screen go blank for about two seconds. Many hours of debugging failed to find the problem, which was limited to the two seconds of blank screen. The screen would refresh and everything would be fine, but that two seconds was troubling. Even on the primitive machine of the day, that was a long, long time.

What was going on in that two seconds?

I gave the program and code to several other programmers. No joy; they could not find the problem either. Months went by and I kept coming back to debugging that silly code. It was really driving me nuts.

I boiled it down to a couple of lines; a string input that was underlined, accepting the user input and then erasing the remaining underlines of the input spaces. Okay, length of input allowed minus actual input equals number of spaces to print in order to erase the underlines. Oh, wait. The remainder is being put into an unsigned integer, and it is not being checked to be sure it has not gone negative. You do know how an unsigned integer interprets -2, right?

It was taking two seconds to print 32,764 spaces to the monitor.
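
A minimal sketch of that wraparound, in Python for brevity. It assumes a 16-bit unsigned int, and the field width and input lengths are invented for illustration; the real values aren't given in the story, so the count this prints won't necessarily match the one above.

```python
UINT16_MODULUS = 1 << 16  # assumption: unsigned int is 16 bits wide

def spaces_to_erase(allowed_len, actual_len):
    """Mimic `unsigned remainder = allowed - actual` with no range check."""
    return (allowed_len - actual_len) % UINT16_MODULUS

def spaces_to_erase_checked(allowed_len, actual_len):
    """Same subtraction, clamped so it can never drop below zero."""
    return max(allowed_len - actual_len, 0)

print(spaces_to_erase(20, 15))          # input shorter than the field
print(spaces_to_erase(20, 22))          # input overran the field by 2
print(spaces_to_erase_checked(20, 22))  # the guarded version
```

The unchecked version is fine whenever the input fits, which is why the blank screen only showed up every once in a while.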
jason
Site Admin
Posts: 1767
Joined: Thu Apr 18, 2002 3:14 pm
Location: Montreal, CA

Re: Oh Glorious Programming Stories

Post by jason »

Bill H wrote: It was taking two seconds to print 32,764 spaces to the monitor.
"You could say it... spaced out." - That guy from CSI
Darhazer
DevNet Resident
Posts: 1011
Joined: Thu May 14, 2009 3:00 pm
Location: HellCity, Bulgaria

Re: Oh Glorious Programming Stories

Post by Darhazer »

onion2k
Jedi Mod
Posts: 5263
Joined: Tue Dec 21, 2004 5:03 pm
Location: usrlab.com

Re: Oh Glorious Programming Stories

Post by onion2k »

We once had a problem with a Perl script that was taking far too long to run. It was for an email marketing app, and the purpose of the script was to create a mailing list based on an existing list minus all the people who have unsubscribed. Fairly simple stuff but it was taking over 5 minutes to run on quite small lists (100,000 contacts), which was a problem.

Having spent about 10 minutes reading through the code we discovered the problem. The code was taking each contact in the mailing list and querying the database to see if it was in the `unsubscribed` table. That's 100,000 queries. No wonder it was taking a while. I asked one of my developers to come up with a fix.

After an afternoon of coding he said he'd done it, and it had gone from over 5 minutes to about 30s to run. That still seemed like a long time to me so I reviewed his code. What he'd done was do a single query to pull out all the unsubscribed contacts and put them into an array (very sensible). What his code then did was to loop through all the contacts in the mailing list, and for each one loop through all unsubscribed contacts and check if it was in the array. For a contact list of 100,000 people, and 10,000 unsubscribed, that was 1,000,000,000 string comparisons.

Half an hour later I'd rewritten the code to use the unsubscribed contact as the key in a hash, and then use exists() (Perl's equivalent to array_key_exists()) to check if the mailing address was unsubscribed, taking advantage of the fact that Perl builds an index of hash keys automatically. It ran against the 100,000 name mailing list in about 0.1s.
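
The difference between the two fixes can be sketched like this, in Python rather than Perl, with a `set` standing in for the Perl hash. The function names and sample addresses are made up for illustration, not taken from the real script.

```python
def filter_with_nested_loop(contacts, unsubscribed):
    """Compare every contact against every unsubscribed address."""
    kept = []
    for contact in contacts:
        found = False
        for address in unsubscribed:  # full scan of the list for each contact
            if contact == address:
                found = True
        if not found:
            kept.append(contact)
    return kept

def filter_with_hash(contacts, unsubscribed):
    """Build the lookup once, then do one hashed membership test per contact."""
    blocked = set(unsubscribed)
    return [contact for contact in contacts if contact not in blocked]

contacts = ["a@example.com", "b@example.com", "c@example.com"]
unsubscribed = ["b@example.com"]
print(filter_with_nested_loop(contacts, unsubscribed))
print(filter_with_hash(contacts, unsubscribed))
```

Both functions return the same list. The first does len(contacts) × len(unsubscribed) string comparisons, while the second does roughly one lookup per contact after building the set once.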