PHP File Server idea - feedback & a few technical proble

Sparky
Forum Newbie
Posts: 11
Joined: Sat May 12, 2007 10:54 am

Post by Sparky »

Hey,

First post here - thought I'd try and get some feedback for something I'm working on that has quite a few technical limitations - and also to see if people think it's a good idea or not!

I'm designing a PHP file server. The idea is that I'll run it on my PC and on my file server PC (which has 8 or 9 hard drives in it), and it'll index the entire drives whilst keeping all the PCs up to date with a current list. It'll maintain hashes (MD5 and SHA1) of every file, plus lots of keywords generated from the filename and from extra data (like ID3 tags). And user-defined comments, of course.
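As a rough sketch of what one index entry could look like: the function name and the record layout below are made up for illustration, and it only covers hashes and filename keywords (no ID3 tags or comments). Both digests are fed from a single read, so each file is only read from disk once.

```php
<?php
// Hypothetical helper (name and record layout are assumptions, not an existing API).
function buildIndexRecord(string $path): array
{
    // Feed both digests from the same chunks so the file is read only once.
    $md5  = hash_init('md5');
    $sha1 = hash_init('sha1');

    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    while (!feof($handle)) {
        $chunk = fread($handle, 1048576); // 1 MiB at a time
        if ($chunk === false) {
            fclose($handle);
            throw new RuntimeException("Cannot read $path");
        }
        hash_update($md5, $chunk);
        hash_update($sha1, $chunk);
    }
    fclose($handle);

    // Keywords: lower-cased filename, split on anything that is not a-z or 0-9.
    $name  = strtolower(pathinfo($path, PATHINFO_FILENAME));
    $words = preg_split('/[^a-z0-9]+/', $name, -1, PREG_SPLIT_NO_EMPTY);

    return [
        'path'     => $path,
        'size'     => filesize($path),
        'mtime'    => filemtime($path),
        'md5'      => hash_final($md5),
        'sha1'     => hash_final($sha1),
        'keywords' => array_values(array_unique($words)),
    ];
}
```

Storing size and mtime next to the hashes is a design guess on my part: it would let a later scan spot changed files without re-hashing everything.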

After searching for a file it'll find the closest copy of it. I'm also planning for my brother and family, who are on ADSL internet lines, to run a copy, so if a file exists both on the LAN and over the Internet, of course it'll choose the LAN copy to send to you.
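The "closest copy" rule could be sketched something like this. The link types, their costs and the shape of each copy entry are all assumptions for illustration; a real version would probably measure the link rather than trust a label.

```php
<?php
// Hypothetical link costs: lower means "closer". Unknown link types sort last.
const LINK_COST = ['local' => 0, 'lan' => 1, 'internet' => 2];

// Each copy is assumed to look like ['host' => string, 'link' => string].
// Returns the cheapest copy, or null if there are none. Ties keep the first seen.
function pickClosestCopy(array $copies): ?array
{
    $best     = null;
    $bestCost = PHP_INT_MAX;
    foreach ($copies as $copy) {
        $cost = LINK_COST[$copy['link']] ?? PHP_INT_MAX;
        if ($best === null || $cost < $bestCost) {
            $best     = $copy;
            $bestCost = $cost;
        }
    }
    return $best;
}

$copies = [
    ['host' => 'brother-pc', 'link' => 'internet'],
    ['host' => 'fileserver', 'link' => 'lan'],
];
$closest = pickClosestCopy($copies); // the 'fileserver' entry
```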

Limitations:
- Firstly, indexing thousands of files will take a long time, but perhaps this isn't that bad.
- MySQL database will grow very quickly - and will have to be shared. Perhaps too much data?
- Indexing new files - as it's PHP, what is the easiest way (other than rescanning the whole drive! No way!) to find out which new files have been created? If it were C++ or similar it could be hooked into the OS, but this is cross-platform and using PHP ...
- How will searches be made fast enough if the database grows to be significant in size? Considering the problems that Torrent sites have, for instance, (thinking about 'small' sets of data that need to be searched, rather than search engines themselves) I think there's something to be worried about here...
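On the "new files" point, one cheap option that stays in plain PHP is to compare size and modification time against the last scan. This is only a sketch, and the function name and the shape of $known are assumptions. It still walks the directory tree, but it only reads file metadata, so nothing has to be re-hashed unless it shows up as new or changed.

```php
<?php
// $known is assumed to map path => ['size' => int, 'mtime' => int] from the previous scan.
// Returns the sorted list of paths that are new or whose size/mtime differ.
function findChangedFiles(string $root, array $known): array
{
    $changed  = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if (!$file->isFile()) {
            continue;
        }
        $path = $file->getPathname();
        $prev = $known[$path] ?? null;
        if ($prev === null
            || $prev['size'] !== $file->getSize()
            || $prev['mtime'] !== $file->getMTime()
        ) {
            $changed[] = $path;
        }
    }
    sort($changed);
    return $changed;
}
```

Deleted files are not detected here; that would need a second pass over $known checking which paths no longer exist.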

Lastly - do people NEED this program and want it, or are there other, better ways? I've considered things like DC++ but have only briefly used it years ago - would this do the same but easier? Is there something I'm missing?! Plus, I want hot-swap capabilities, and browsing the entire archives as if they were one big file system...

Thoughts appreciated!
Cheers!
Sparky
Forum Newbie
Posts: 11
Joined: Sat May 12, 2007 10:54 am

Post by Sparky »

Sorry for double posting; but I'd really appreciate people's opinions on this...

If it's in the wrong forum please say! It's quite an undertaking for a single coder (although it'll almost certainly be released as open source, I'd rather have it working before I release anything at all), so I need to know if there's a real need for this. I built a Windows-based file server yesterday with seven hard drives: first it blew the entire house electrics (lol!), and second (with 6 drives) it worked, but was sluggish on a 700 MHz P3 processor/256 MB RAM/WinXP. Running a slim version of Linux with a LAMP setup shouldn't be such a PITA ...

Thanks :)
CoderGoblin
DevNet Resident
Posts: 1425
Joined: Tue Mar 16, 2004 10:03 am
Location: Aachen, Germany

Post by CoderGoblin »

You may want to look at something to index your files differently (Lucene springs to mind) rather than keep indexes in MySQL.

Personally I would not want an application like this. I can understand it for a photo/video storage system or something similar but never a complete system. The first thing which springs to mind is the security aspect.... If you get hacked you provide a complete copy of any personal information stored on your system.
Sparky
Forum Newbie
Posts: 11
Joined: Sat May 12, 2007 10:54 am

Post by Sparky »

CoderGoblin, thanks for your reply.

There definitely will be speed issues if MySQL is used entirely for its search database, and this is something I'm still experimenting with. OTOH, it does make it easy to port and pass the data around between systems.

As for security - firstly, this is designed as a closed system - to be used between a FEW people at once. For me, it'll be myself, my brother and my parents - I don't trust anyone else beyond that. But, it'll have a few security extras - directories that'll be excluded (eg, My Documents, Windows, etc) - directories & files set as hidden & protected except to yourself, and even a level of encryption for small document files. Granted, if the system were hacked you'd be opened up to compromise - but that happens for any system / server that runs on your PC (esp. Windows) as it can do anything it likes - so I'd be confident the security mechanisms I made would be sufficient.
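For the excluded directories, a minimal sketch of the check the indexer could run before touching a path. The function name is hypothetical, and lower-casing everything is an assumption that suits Windows paths but would be wrong on a case-sensitive Linux filesystem.

```php
<?php
// Returns true if $path is one of the excluded directories or lies inside one.
// Assumption: case-insensitive comparison with '/' and '\' treated alike.
function isExcluded(string $path, array $excludedDirs): bool
{
    $normalise = fn(string $p): string =>
        rtrim(strtolower(str_replace('\\', '/', $p)), '/');

    $path = $normalise($path);
    foreach ($excludedDirs as $dir) {
        $dir = $normalise($dir);
        // Match the directory itself or anything below it, but not "C:/WindowsOld".
        if ($path === $dir || strpos($path, $dir . '/') === 0) {
            return true;
        }
    }
    return false;
}
```

This is plain string matching, so it does not resolve ".." segments or symlinks; paths would need to be made canonical (for example with realpath()) before it can be relied on as a security boundary.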

The primary need for this is, in fact, for me to quickly locate videos, music and programs that are scattered across a couple of dozen hard drives. The fact that I want it to be extended to work with normal documents, program files, DLL's, etc, is just an addition because of the number of times I've searched for precisely that over said dozens of hard drives!

Interested in more response - thanks! :)
CoderGoblin
DevNet Resident
Posts: 1425
Joined: Tue Mar 16, 2004 10:03 am
Location: Aachen, Germany

Post by CoderGoblin »

Sparky wrote: The primary need for this is, in fact, for me to quickly locate videos, music and programs that are scattered across a couple of dozen hard drives. The fact that I want it to be extended to work with normal documents, program files, DLL's, etc, is just an addition because of the number of times I've searched for precisely that over said dozens of hard drives!
The question here is: why not use the default Windows search? What does your "application" do that Windows search doesn't? As stated before, I can imagine uses for videos, music etc. where the user can "tag" things. For everything else (DLLs, .ini files etc.) I would simply use the Windows search. If I wanted to connect from outside to find these I would use something like VNC or a remote login, and to fetch files, FTP etc. The tools are out there. Why give yourself the potential security headache when the existing tools have probably solved issues you haven't even considered?
Sparky
Forum Newbie
Posts: 11
Joined: Sat May 12, 2007 10:54 am

Post by Sparky »

I know there are plenty of ways to achieve roughly the same thing, but not exactly the same thing ...

Here are a few reasons for me wanting to create it:
- Windows search is SLOW. Even Indexed search is slow, and the indexer is a waste of resources.
- Using 'windows' tools is not cross-platform, and the idea is to have a common way to achieve this, not "This way for Windows, this way for Linux". I'm mainly considering my lower-spec'ed Linux boxes here ...
- I don't want multiple ways to achieve one task: Finding files, checking hashes, getting previews of files and ultimately downloading them to my PC for use
- I don't want to log in using remote connections / SSH / VNC / whatever. If there are 10 or more PCs in the network, THIS is the headache I'm trying to avoid ...
- I want to be able to consider the reliability of hard drives. Some hard drives are somewhat unstable - this will track the reliability of hard drives and, using a priority system, reorganise files on the same system to locations more suited. For example - my documents need a high-reliability drive; videos will probably suffice on a low-reliability drive.

There are plenty of tools to do part of the job, but I'm struggling to find one that does even half of what this would accomplish. Having used literally hundreds of applications to do similar jobs and not once having found something remotely close to this has inspired me to write one - otherwise I absolutely do not want to waste time if it's not fixing a real problem. Plus, this program would sit over the files, and thus tasks like duplicate file searching would be done instantaneously, instead of as a long-winded and processor-intensive task.
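To show why duplicate searching becomes cheap once the hashes are already in the index, here is a small sketch that groups records by SHA1. The record layout (a 'sha1' and a 'path' key) is an assumption; with MySQL the same thing would presumably be a GROUP BY on the hash column.

```php
<?php
// Each record is assumed to look like ['path' => string, 'sha1' => string].
// Returns sha1 => list of paths, keeping only hashes that occur more than once.
function findDuplicates(array $records): array
{
    $byHash = [];
    foreach ($records as $record) {
        $byHash[$record['sha1']][] = $record['path'];
    }
    return array_filter($byHash, fn(array $paths): bool => count($paths) > 1);
}

$duplicates = findDuplicates([
    ['path' => 'D:/music/track.mp3',  'sha1' => 'aaa'],
    ['path' => 'E:/backup/track.mp3', 'sha1' => 'aaa'],
    ['path' => 'D:/music/other.mp3',  'sha1' => 'bbb'],
]);
// $duplicates holds one group, keyed 'aaa', with the two track.mp3 paths.
```

No file contents are read at this point, so the cost depends on the number of index records rather than on the size of the drives.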

Cheers for your thoughts, though, I do like to hear what the opinion is of this and if it's viable!
Begby
Forum Regular
Posts: 575
Joined: Wed Dec 13, 2006 10:28 am

Post by Begby »

This isn't too bad of an idea, but you are going to have some indexing problems. You are probably going to want to use something other than PHP to do the searches, something that has already been written and has a lot of support. PHP would then work with that search API to get and then show the results. There is no need to reinvent the wheel. However, cross-platform support might be an issue, and I agree that Windows search is something to stay away from.

Have you looked into the Google Desktop Search app to see if that has an open API that you could use?

Also, iFolder and WebDAV already exist for serving files. You could index and then search WebDAV shares on the local PC and set up SSL security. iFolder is slick too from what I have seen (dunno how good the open source version is though).
Sparky
Forum Newbie
Posts: 11
Joined: Sat May 12, 2007 10:54 am

Post by Sparky »

Begby - thanks for the suggestions. I will look a bit more into iFolder, looks interesting... I don't want to reinvent the wheel, in fact I don't want to make this at all if there's something capable of doing even half of what I need! I guess the core of it is:

- Share files across at LEAST a LAN, but over the Internet would be superb
- Cross platform would be superb - allow me to run some older boxes with Linux nicely
- Handle a large number (dozens, not hundreds!) of hard drives on multiple systems easily, allowing me to send one search to scan the whole lot
- Handle removal and addition of drives relatively easily

If I can access all my files from anywhere, at least within my LAN (not necessarily 'edit' facilities - just run / download / play facilities and upload facilities, i.e. not random access), and search the whole LAN very quickly, then those are the important bits. The rest is all extras on top!

Cheers!