Developing a Search engine like application in PHP

tiwari.ravish07
Forum Newbie
Posts: 2
Joined: Tue Jul 22, 2008 7:55 am

Developing a Search engine like application in PHP

Post by tiwari.ravish07 »

Hello Everybody,

This is Ravish Tiwari, a PHP developer from India.
I’ve just got my first major project in PHP: developing a knowledge base for my company.

The project is a search-based repository like MSDN, IBM Redbooks or Google. Unlike Google, though, this application will give end users access to information from its own repository, based on their queries.
The user will enter a search term, and the app will query the database and display the matching records, more or less like Google. You could also say that I have to develop a search engine for our own use, where the data, the search and the presentation will all be ours.

I am having a problem defining the architecture of the project. Should I develop it as a purely file-based project? Or as a purely database project, where all data resides in the database and all searches are performed on the database? Or should I combine the file-based and database approaches? Or combine plain files with an XML-based database? I am confused because this project is going to be huge and I can't settle on an architecture.

Some suggestions that have come my way are:
1) File-based approach:
a. Store all the data in files and perform the search using file handling. [this seems to be time-consuming]
b. Store the file name, title and related keywords in the database and perform the search on those.


2) Database-based approach:
a. Create a multi-level database model and store all data in tables depending on their relevance.
b. Search for the keyword starting at the first-level table and working through to the last-level table.
c. Whenever a match is found, redirect the user to the particular file.


3) XML approach:
a. All data resides in XML and the search is done by parsing the files.
b. Or all data resides in XML and the search is done on references to it stored in a MySQL database. When a match is found, the XML is sent to the client browser, which in turn formats it with XSL and displays it.
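
To make suggestion 1b a bit more concrete, here is a rough sketch of the idea: the documents stay on disk as files, and the database only holds each file's path, title and keywords. It is written in Python with the built-in sqlite3 module purely to keep it short and self-contained; in the actual project this would be PHP against MySQL. The table names, column names and file paths are all made up for illustration.

```python
import sqlite3


def build_index(conn, documents):
    """Store file metadata. `documents` is an iterable of (path, title, keywords)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents "
        "(id INTEGER PRIMARY KEY, path TEXT, title TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS keywords (doc_id INTEGER, keyword TEXT)"
    )
    for path, title, keywords in documents:
        cur = conn.execute(
            "INSERT INTO documents (path, title) VALUES (?, ?)", (path, title)
        )
        doc_id = cur.lastrowid
        # Keywords are stored lower-cased so lookups are case-insensitive.
        conn.executemany(
            "INSERT INTO keywords (doc_id, keyword) VALUES (?, ?)",
            [(doc_id, k.lower()) for k in keywords],
        )
    conn.commit()


def search_keywords(conn, term):
    """Return (path, title) for files whose keywords or title match the term."""
    needle = term.lower()
    rows = conn.execute(
        "SELECT DISTINCT d.path, d.title FROM documents d "
        "JOIN keywords k ON k.doc_id = d.id "
        "WHERE k.keyword = ? OR LOWER(d.title) LIKE ? "
        "ORDER BY d.path",
        (needle, "%" + needle + "%"),
    )
    return rows.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_index(conn, [
        # Hypothetical knowledge-base files.
        ("kb/vpn-setup.html", "Setting up the VPN", ["vpn", "network", "remote"]),
        ("kb/mail-client.html", "Configuring the mail client", ["email", "imap"]),
    ])
    print(search_keywords(conn, "vpn"))
```

The search never opens the files themselves, so it only finds what somebody has tagged. That is the main trade-off against 1a, which reads everything but gets slow as the repository grows.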

All these suggestions have increased my confusion. Please tell me which one I should use, and please share your experiences with me, because that would help me a lot.
Eran
DevNet Master
Posts: 3549
Joined: Fri Jan 18, 2008 12:36 am
Location: Israel, ME

Re: Developing a Search engine like application in PHP

Post by Eran »

There is a fourth approach:
4) Use a dedicated search engine
There are several dedicated solutions for indexing and searching text, and many of them can be used from PHP: Lucene has a PHP port (Zend_Search_Lucene), and Sphinx ships with a PHP client API. See Wikipedia for a list: http://en.wikipedia.org/wiki/Category:F ... e_software
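
For a feel of what these engines do under the hood, the core data structure is an inverted index: a map from each word to the set of documents that contain it. The toy version below is only a sketch in Python with hypothetical function names and sample documents. It is not how Lucene or Sphinx is actually implemented; real engines add ranking, stemming, on-disk storage and much more.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lower-case alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search_index(index, query):
    """Return ids of documents containing every word in the query (AND search)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    # Copy the first posting set so the index itself is never modified.
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


if __name__ == "__main__":
    # Hypothetical documents keyed by id.
    docs = {
        "a": "Install PHP on Linux",
        "b": "Install MySQL on Windows",
        "c": "PHP and MySQL together",
    }
    index = build_inverted_index(docs)
    print(search_index(index, "install php"))
```

Building the index costs one pass over the data, but after that a query only touches the entries for its own words rather than scanning every document, which is why this approach scales better than searching files or tables directly.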
mattinahat
Forum Newbie
Posts: 17
Joined: Tue Jul 22, 2008 12:35 pm

Re: Developing a Search engine like application in PHP

Post by mattinahat »

Well, it is almost impossible to recommend a particular architecture, but what I would suggest is that you don't mix and match your architecture. Choose a method and stick to it.

If there is a lot of information to store (and it sounds like there is) then I would probably avoid the use of files, simply because it could be a lot more complex to build and maintain.

The MySQL approach would probably be one of the easiest to implement, since SQL offers powerful query features (pattern matching with LIKE, and MySQL's FULLTEXT indexes queried with MATCH ... AGAINST) that are simple to use.

However, XML may well be the better approach, as XML is easy to incorporate into unrelated applications (if done properly), which means any future developments or modifications would have easy access to the XML data. In addition, XML data can be parsed in a desktop environment as well as in a web environment, which would make it suitable for desktop search software as well as a web intranet.
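
As a rough illustration of searching XML by parsing it, here is a small sketch in Python using the standard xml.etree.ElementTree module. The element names (knowledgebase, article, title, body) and the sample content are invented for the example; PHP's SimpleXML or DOM extensions would play the same role there.

```python
import xml.etree.ElementTree as ET

# Hypothetical knowledge-base document; the schema is an assumption.
SAMPLE_XML = """<knowledgebase>
  <article id="1">
    <title>Resetting a password</title>
    <body>Open the account page and choose Reset.</body>
  </article>
  <article id="2">
    <title>Printer setup</title>
    <body>Install the driver, then add the printer.</body>
  </article>
</knowledgebase>"""


def search_articles(xml_text, term):
    """Return (id, title) for every article whose title or body contains the term."""
    root = ET.fromstring(xml_text)
    needle = term.lower()
    hits = []
    for article in root.findall("article"):
        title = article.findtext("title", default="")
        body = article.findtext("body", default="")
        # Case-insensitive substring match over both fields.
        if needle in title.lower() or needle in body.lower():
            hits.append((article.get("id"), title))
    return hits


if __name__ == "__main__":
    print(search_articles(SAMPLE_XML, "printer"))
```

Note that every search parses and walks the whole document, so the cost grows with the size of the data. That is worth weighing against the portability argument when the repository is expected to be large.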

As I said, it is hard to recommend a particular technology, but you really need to assess your requirements and come up with your top 3 priorities for the application (efficiency, reliability, etc.). Each technology has its own merits. But XML is becoming increasingly popular as it is so versatile and quite efficient.
volomike
Forum Regular
Posts: 633
Joined: Wed Jan 16, 2008 9:04 am
Location: Myrtle Beach, South Carolina, USA

Re: Developing a Search engine like application in PHP

Post by volomike »

If you go with the file-based approach, consider getting a Google Mini (a search appliance, i.e. a piece of hardware you can purchase from Google) and then learn how to program against it. I haven't done this myself yet, but it did cross my mind.