Designing a "price comparison" script - Pitfalls?

onion2k
Jedi Mod
Posts: 5263
Joined: Tue Dec 21, 2004 5:03 pm
Location: usrlab.com

Post by onion2k »

I want to write a script that does something similar to a price comparison script. I'm not actually going to be comparing prices, but my project idea does need to build up a set of price data from different sites. Some have feeds that I can utilise (Amazon for example), but most don't.

As it stands my method is to crawl the 'new releases' pages using a regexp to get ids, then crawl the individual product pages every few days using another regexp to get the price. Is there a better method than that? I imagine it'll be fragile as hell.
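For illustration, the two-regexp approach described above could be sketched roughly as follows. This is only a sketch in Python under assumed markup: the `/product/<id>` link pattern, the `class="price"` span, and the sample HTML are all hypothetical, and a real site will look different. It does show why the method is fragile, since any change to the markup makes the patterns silently stop matching.

```python
import re
from decimal import Decimal

# Hypothetical markup for illustration only; real sites differ and change without notice.
LISTING_HTML = """
<ul class="new-releases">
  <li><a href="/product/1042">Blue Widget</a></li>
  <li><a href="/product/2077">Red Widget</a></li>
</ul>
"""
PRODUCT_HTML = '<div id="buy"><span class="price">&pound;12.99</span></div>'

# Assumed patterns: product links look like /product/<digits>,
# and the price sits directly inside an element with class="price".
ID_RE = re.compile(r'href="/product/(\d+)"')
PRICE_RE = re.compile(r'class="price">\s*(?:&pound;|£|\$)?\s*(\d+(?:\.\d{2})?)')


def extract_ids(html):
    """Return every product id linked from a listing page, in page order."""
    return [int(m) for m in ID_RE.findall(html)]


def extract_price(html):
    """Return the first price on a product page, or None if the pattern fails."""
    match = PRICE_RE.search(html)
    return Decimal(match.group(1)) if match else None


print(extract_ids(LISTING_HTML))
print(extract_price(PRODUCT_HTML))
```

Returning `None` when the pattern fails, rather than raising or guessing, makes it possible to log pages whose layout has changed instead of storing a bad price.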

Has anyone here written some sort of price comparator? What did you find most tricky? What should I watch out for? How did you link similar products? (I'm thinking of comparing product names using Levenshtein distance.)
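A minimal sketch of the name-matching idea, written in Python (PHP has a built-in `levenshtein()` function that would do the same job). The normalisation step, the helper names, and the 0.85 threshold are assumptions for illustration, not tested values; raw edit distance is divided by the longer name's length so that long and short names are compared on the same scale.

```python
import re


def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalise(name):
    """Lower-case the name and collapse punctuation and whitespace to single spaces."""
    name = re.sub(r"[^a-z0-9]+", " ", name.lower())
    return " ".join(name.split())


def similarity(a, b):
    """Return a score in [0, 1]; 1.0 means identical after normalisation."""
    a, b = normalise(a), normalise(b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def same_product(a, b, threshold=0.85):
    """Hypothetical matching rule: treat names above the threshold as one product."""
    return similarity(a, b) >= threshold


print(same_product("Blue Widget (2-Pack)", "blue widget 2 pack"))
print(same_product("Blue Widget", "Red Gadget"))
```

Names alone can mislead (different editions or pack sizes often differ by a character or two), so where a site exposes a stable identifier such as a barcode or ISBN, that is likely a safer key than the name.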
Benjamin
Site Administrator
Posts: 6935
Joined: Sun May 19, 2002 10:24 pm

Post by Benjamin »

You know, one thing that people often forget in the business world is that it's not all about prices. Communicating value is a key to success.

How can you communicate value...

1. When I order a product and you say it will be here on the xth day of the month, will it be?
2. Will the product be as described?
3. Can I count on good customer service?
4. Is this product worth the price?
5. etc.

I have seen so many businesses fail because they think they can grab all the customers by undercutting the competition, but they don't consider loyalty and trust.

Just MHO

Post by onion2k »

My project is not to do with prices directly. It's to do with trends. Prices are merely the root of the data.
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Post by Christopher »

Rather than parsing the HTML you could just use the web service feeds from sites that have them. You can get pricing back as delimited text or XML, which will be much easier to deal with.
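As a rough illustration of why a feed is easier to handle than scraped HTML, here is a sketch in Python that reads prices out of an XML feed. The feed layout (`<products>`, `<product id>`, `<name>`, `<price currency>`) is invented for the example; each real feed has its own schema and should be read against its own documentation.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical feed layout, invented for this example.
FEED_XML = """
<products>
  <product id="1042"><name>Blue Widget</name><price currency="GBP">12.99</price></product>
  <product id="2077"><name>Red Widget</name><price currency="GBP">8.50</price></product>
</products>
"""


def parse_feed(xml_text):
    """Turn the assumed feed format into a list of plain dictionaries."""
    root = ET.fromstring(xml_text)
    rows = []
    for product in root.iter("product"):
        price = product.find("price")
        rows.append({
            "id": int(product.get("id")),
            "name": product.findtext("name"),
            "price": Decimal(price.text),  # Decimal avoids float rounding on money
            "currency": price.get("currency"),
        })
    return rows


for row in parse_feed(FEED_XML):
    print(row["id"], row["name"], row["price"], row["currency"])
```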

Post by onion2k »

arborint wrote: Rather than parsing the HTML you could just use the web service feeds from sites that have them. You can get pricing back as delimited text or XML, which will be much easier to deal with.
Wherever a site has a feed I make full use of it; I'd be daft not to. Where a site doesn't, I crawl it in full accordance with the site's terms, privacy policy, and robots.txt file. My current script waits a random period between 0.5s and 3s between requests, so on average it requests content no more than once every 1.75s. I'm not doing anything that Google doesn't do already. It's all completely above board. This is getting off topic now.
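The throttling described above could look something like this sketch in Python. It is not the actual script: the `trend-bot` user agent, the function names, and the injected `fetch` callable are assumptions. A uniform random delay between 0.5s and 3s has a mean of (0.5 + 3) / 2 = 1.75s, which matches the figure quoted, and the standard library's `urllib.robotparser` handles the robots.txt check.

```python
import random
import time
from urllib.robotparser import RobotFileParser

USER_AGENT = "trend-bot"  # hypothetical crawler name


def build_robots(robots_txt):
    """Parse the text of a robots.txt file that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def polite_delay(low=0.5, high=3.0):
    """Pick a random pause in seconds; uniform(0.5, 3.0) averages 1.75."""
    return random.uniform(low, high)


def crawl(urls, robots, fetch, sleep=time.sleep):
    """Fetch each permitted URL once, pausing after every request.

    `fetch` is any callable that takes a URL and returns its content;
    it is injected so the loop can be exercised without network access.
    """
    results = {}
    for url in urls:
        if not robots.can_fetch(USER_AGENT, url):
            continue  # robots.txt disallows this path, so skip it
        results[url] = fetch(url)
        sleep(polite_delay())
    return results


robots = build_robots("User-agent: *\nDisallow: /private/\n")
pages = crawl(
    ["http://example.com/product/1", "http://example.com/private/x"],
    robots,
    fetch=lambda url: "<html>stub</html>",  # stand-in for a real HTTP request
    sleep=lambda seconds: None,             # skip the real pause in this demo
)
print(list(pages))
```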