Browsing remote folders

evilmonkey
Forum Regular
Posts: 823
Joined: Sun Oct 06, 2002 1:24 pm
Location: Toronto, Canada

Post by evilmonkey »

Hello. I would like to make a bot that would go into a website, go through all the directories under that path, and display a list of all the files that are there. Can PHP do that? I know it can connect to remote hosts and do file_get_contents(), but can it get the list of files?

Thanks.

(P.S. Before I get accused of being a hacker, or something of the sort, this is purely with good intentions.)
m3mn0n
PHP Evangelist
Posts: 3548
Joined: Tue Aug 13, 2002 3:35 pm
Location: Calgary, Canada

Post by m3mn0n »

Yep. :)
evilmonkey
Forum Regular
Posts: 823
Joined: Sun Oct 06, 2002 1:24 pm
Location: Toronto, Canada

Post by evilmonkey »

How? :P
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

You have to look at all the file references that the pages and other returned data give. It's potentially a very time-consuming endeavour.

If you look through my "recent" posts, you'll find something about a roll-your-own proxy, where I roughly detailed what's involved in creating a script that quasi-bypasses a firewall.
Getran
Forum Commoner
Posts: 59
Joined: Wed Aug 11, 2004 7:58 am
Location: UK

Post by Getran »

Hmm, I'm not sure, but is this kind of what you're looking for?

http://www.spoono.com/php/tutorials/tutorial.php?id=10
evilmonkey
Forum Regular
Posts: 823
Joined: Sun Oct 06, 2002 1:24 pm
Location: Toronto, Canada

Post by evilmonkey »

Hello Getran,

That doesn't seem to work if I set $path to an http://.../somefolder URL; it only seems to work for local directories...
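
For contrast, here is a minimal sketch of the kind of listing that does work on a local path. It is not the code from the linked tutorial; the function name list_local_files() is made up for illustration. It relies on PHP's directory iterators, which need a stream wrapper that supports reading directories. As far as the PHP manual documents it, the http:// wrapper offers no such support, which would explain the behaviour described above.

```php
<?php
// Hypothetical helper: recursively collect every file under a LOCAL directory.
function list_local_files(string $path): array
{
    $files = [];
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($path, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $entry) {
        // Keep regular files only; directories are descended into automatically.
        if ($entry->isFile()) {
            $files[] = $entry->getPathname();
        }
    }
    sort($files);
    return $files;
}
```

Calling list_local_files('/some/local/dir') returns a sorted array of full paths; the path shown here is only a placeholder.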

Feyd, I'll take a look at your posts (assuming I find them ;))
evilmonkey
Forum Regular
Posts: 823
Joined: Sun Oct 06, 2002 1:24 pm
Location: Toronto, Canada

Post by evilmonkey »

Hello feyd,

I found your thread on bypassing a firewall, but I don't really understand how it applies to me, nor do I understand what you mean by "use regex to switch all the external file references in the page to usable ones". What are "usable ones"? And once again, I don't really get how it applies to my situation.

Thanks for your help. :)
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

The "usable ones" referred to creating the proxy, but that part doesn't apply to you. The parts I was talking about were mostly the regular-expression matching bits. Basically, you create a search engine of sorts that reads a page, finds all the links in it, stores them, and keeps working through its list until it hits a dead end (all links on the page have already been spidered). When you find links that apply to the folder you want, you store those in a separate "results" list.
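
The spidering idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the proxy thread: the names extract_links() and spider() are hypothetical, only absolute links that start with the start URL are followed (relative links are ignored for brevity), and the page fetcher is passed in as a callable so the loop itself stays independent of the network.

```php
<?php
// Hypothetical helper: pull the href value out of every <a> tag with a regex.
function extract_links(string $html): array
{
    preg_match_all('/<a\s[^>]*href\s*=\s*["\']([^"\']+)["\']/i', $html, $matches);
    return array_values(array_unique($matches[1]));
}

// Hypothetical spider: $fetch is any callable returning a page body, or false on failure.
function spider(string $start, string $folder, callable $fetch, int $maxPages = 100): array
{
    $queue   = [$start];          // pages still to read
    $seen    = [$start => true];  // links already stored, so nothing is spidered twice
    $results = [];                // links that fall under the folder of interest

    for ($fetched = 0; $queue && $fetched < $maxPages; $fetched++) {
        $url  = array_shift($queue);
        $html = $fetch($url);
        if ($html === false) {
            continue;
        }
        foreach (extract_links($html) as $link) {
            // Assumption: stay on the same site and skip anything already seen.
            if (isset($seen[$link]) || strncmp($link, $start, strlen($start)) !== 0) {
                continue;
            }
            $seen[$link] = true;
            $queue[]     = $link;
            if (strncmp($link, $folder, strlen($folder)) === 0) {
                $results[] = $link;
            }
        }
    }
    return $results;
}
```

In real use, $fetch could be a small wrapper around file_get_contents(), which returns false on failure. The $maxPages cap is there because a crawl like this can take a long time on a large site.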
evilmonkey
Forum Regular
Posts: 823
Joined: Sun Oct 06, 2002 1:24 pm
Location: Toronto, Canada

Post by evilmonkey »

Ooooh, no, that's unfortunately not what I need. The content I want to find is not linked anywhere. It's a lone file sitting somewhere within a directory, not linking to anything, not linked from anything, and I want to find it. For that, I need to open the directory that it's in, get its listing, then if there are directories in there, get their listings, and so forth, until I find that file.
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

if the server doesn't provide a directory listing, then you are probably out of luck.
m3mn0n
PHP Evangelist
Posts: 3548
Joined: Tue Aug 13, 2002 3:35 pm
Location: Calgary, Canada

Post by m3mn0n »

How about FTP?
Breckenridge
Forum Commoner
Posts: 62
Joined: Thu Sep 09, 2004 11:10 pm
Location: Breckenridge, Colorado

Post by Breckenridge »

I think evilmonkey is talking about HTTP without any access to the server's file system through a user account. If that is the case, I don't think this can be done.
evilmonkey
Forum Regular
Posts: 823
Joined: Sun Oct 06, 2002 1:24 pm
Location: Toronto, Canada

Post by evilmonkey »

Yeah, through a script running on one server, get the directory listing of a folder on a totally different computer. Are you sure this can't be done?

Sami, FTP is out of the question. I want the script to know nothing about the server that it is accessing. Just read the names of the files and folders, nothing else. Modify nothing, delete nothing, open nothing. Just the filenames.
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

it's not possible unless the server allows a directory listing of the files in it.
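
When a server does expose a listing, the page it returns is ordinary HTML that can be parsed. The sketch below assumes an Apache-style auto-generated index (a title beginning with "Index of" and one link per entry, with directory links ending in a slash). Other servers format their listings differently, so treat the patterns as assumptions; looks_like_listing() and parse_listing() are hypothetical names.

```php
<?php
// Hypothetical check: does this page look like an auto-generated directory index?
function looks_like_listing(string $html): bool
{
    return (bool) preg_match('/<title>\s*Index of\s/i', $html);
}

// Hypothetical parser: split the entries of a listing page into directories and files.
function parse_listing(string $html): array
{
    $entries = ['dirs' => [], 'files' => []];
    preg_match_all('/<a\s[^>]*href\s*=\s*"([^"]+)"/i', $html, $matches);
    foreach ($matches[1] as $href) {
        // Skip column-sort links (?C=N;O=D), parent-directory links and absolute URLs.
        if ($href[0] === '?' || $href[0] === '/'
            || strpos($href, '../') === 0 || strpos($href, '://') !== false) {
            continue;
        }
        if (substr($href, -1) === '/') {
            $entries['dirs'][] = rawurldecode(rtrim($href, '/'));
        } else {
            $entries['files'][] = rawurldecode($href);
        }
    }
    return $entries;
}
```

To walk a whole tree, the same two functions could be applied again to each entry in 'dirs' after fetching its URL. If looks_like_listing() returns false for a folder, the server is not handing out a listing and the script has nothing to read.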
evilmonkey
Forum Regular
Posts: 823
Joined: Sun Oct 06, 2002 1:24 pm
Location: Toronto, Canada

Post by evilmonkey »

Okay, if there's an index.htm/.html/.php/.asp/whatever file, does that mean directory listing is disallowed?