
Writing own registration text validator

Posted: Wed Jul 18, 2007 12:20 pm
by divx
Hi,

I'm thinking of writing my own text (picture) verification.

I know there are methods out there, but I don't want to use the standard ones (since I know the common ones always get hacked eventually).

I'm thinking of writing this in PHP, where the PHP gets an image number (say, from the millisecond at which the user reached the registration page), e.g. 01 milliseconds = pic 1 (an algorithm will modify this, but you get the idea).

But I have then run into security problems with various approaches:

1. On send, the JavaScript can verify that the image is the same as the input, e.g. 01img = Myinput
- security problem: the JavaScript can be searched

2. imageNumber can be verified by PHP, i.e. PHP checks that imageNumber corresponds to Myinput
- can't see any problems with this so far

Has anyone done this before? Any information on this sort of thing would be appreciated, thx.

Posted: Wed Jul 18, 2007 1:03 pm
by Zoxive
They are called CAPTCHAs, and most PHP-based ones I know of use session variables.
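
To illustrate the session-variable idea, here is a minimal sketch, written in Python rather than PHP. The `SESSIONS` dict is a hypothetical stand-in for server-side session storage (the role `$_SESSION` plays in PHP), and the function names are made up for the example. The point is only that the expected answer stays on the server and is never sent to the browser.

```python
import hmac
import secrets

# Hypothetical in-memory stand-in for server-side session storage.
SESSIONS: dict[str, dict[str, str]] = {}

# Letters and digits without the easily confused 0/O and 1/I.
ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"


def start_challenge(session_id: str, length: int = 5) -> str:
    """Pick a random answer and store it server-side under the session."""
    answer = "".join(secrets.choice(ALPHABET) for _ in range(length))
    SESSIONS.setdefault(session_id, {})["captcha"] = answer
    # The caller renders this text into an image; only the image is sent out.
    return answer


def check_answer(session_id: str, user_input: str) -> bool:
    """Compare the submitted text with the stored answer (single use)."""
    expected = SESSIONS.get(session_id, {}).pop("captcha", None)
    if expected is None:
        return False
    submitted = user_input.strip().upper()
    return hmac.compare_digest(expected.encode(), submitted.encode())
```

Because the stored answer is removed on the first check, a second submission with the same session fails, so a solved challenge cannot be replayed.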

Posted: Wed Jul 18, 2007 1:13 pm
by divx
Cheers. I think most captchas use the PHP GD2 library (I'm not hosting this one myself), so I don't think I can really go down that route?

Posted: Wed Jul 18, 2007 1:14 pm
by superdezign
Most captchas actually generate an image, not retrieve a pre-generated image. Your method is much easier to hack.

Posted: Wed Jul 18, 2007 1:41 pm
by divx
I don't know how, but they have been hacked (just look at most of the discussion about this in vBulletin).

Pre-generated may not be as bad as you might first think, since once someone designs a hack, you can just upload a new batch of images with different values.

Posted: Wed Jul 18, 2007 1:43 pm
by John Cartwright
divx wrote:I don't know how, but they have been hacked (just look at most of the discussion about this in vBulletin).

Pre-generated may not be as bad as you might first think, since once someone designs a hack, you can just upload a new batch of images with different values.
That seems awfully painful, having to design new images every time. Why not generate the images on the fly, randomly changing things such as background, distortion, letter size, pitch, color, etc.?
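
As a rough sketch of what "randomly changing things" could look like (in Python rather than PHP, with every name and value range being an assumption made for the example), the attributes can be drawn fresh for each challenge and then handed to whatever image library does the drawing. The rendering itself is left out here.

```python
import random
import string
from dataclasses import dataclass
from typing import Optional


@dataclass
class Glyph:
    char: str
    size_px: int                 # letter size
    angle_deg: float             # rotation of the individual letter
    color: tuple[int, int, int]  # dark RGB color for the letter


@dataclass
class CaptchaSpec:
    glyphs: list[Glyph]
    background: tuple[int, int, int]  # light RGB background color
    distortion: float                 # warp strength for the renderer


def random_spec(length: int = 5,
                rng: Optional[random.Random] = None) -> CaptchaSpec:
    """Draw a fresh set of attributes for one challenge."""
    rng = rng or random.SystemRandom()
    symbols = string.ascii_uppercase + string.digits
    glyphs = [
        Glyph(
            char=rng.choice(symbols),
            size_px=rng.randint(24, 40),
            angle_deg=rng.uniform(-30.0, 30.0),
            color=(rng.randint(0, 120), rng.randint(0, 120), rng.randint(0, 120)),
        )
        for _ in range(length)
    ]
    background = tuple(rng.randint(180, 255) for _ in range(3))
    return CaptchaSpec(glyphs, background, distortion=rng.uniform(0.5, 2.0))
```

The expected answer is `"".join(g.char for g in spec.glyphs)`, which would be kept server-side while only the rendered image goes to the user.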

Posted: Wed Jul 18, 2007 1:53 pm
by superdezign
You may even want to go the Digg route where it's hard for humans to read what the captcha says, let alone a bot.

Posted: Wed Jul 18, 2007 5:11 pm
by divx
superdezign wrote:You may even want to go the Digg route where it's hard for humans to read what the captcha says, let alone a bot.
Hmm, no, I really wouldn't want to do that. Let's face it, it's not neural networks reading the images; most of these generators have simply been hacked (via code exploits).

Posted: Wed Jul 18, 2007 5:13 pm
by divx
Jcart wrote:
divx wrote:I don't know how, but they have been hacked (just look at most of the discussion about this in vBulletin).

Pre-generated may not be as bad as you might first think, since once someone designs a hack, you can just upload a new batch of images with different values.
That seems awfully painful, having to design new images every time. Why not generate the images on the fly, randomly changing things such as background, distortion, letter size, pitch, color, etc.?
Maybe not as painful as you might think: predefined sets for each week, 52 sets, all mutually exclusive. Nobody will want to redesign a macro to resubmit for 1 week and then have to design another one.

I think generating on the fly leaves me exposed, since once it's hacked, it stays hacked until the way they hacked it is figured out and a patch can be applied.

Maybe a combination of these ideas will seal the deal.
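
For what it's worth, picking the weekly set could be as small as the following sketch (Python rather than PHP; `NUM_SETS` and `set_for` are names made up for the example, and using the ISO week number is an assumption about how "week" is counted).

```python
from datetime import date

NUM_SETS = 52  # one pre-generated image set per week


def set_for(day: date) -> int:
    """Map a date to a set index in 0..51 using its ISO week number."""
    iso_week = day.isocalendar()[1]  # ranges from 1 to 53
    return (iso_week - 1) % NUM_SETS
```

Some years have an ISO week 53, which wraps around to set 0 here. The index also depends only on the date, so anyone can work out which set is live in a given week.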

Posted: Fri Jul 20, 2007 6:25 am
by Mordred
divx wrote:Let's face it, it's not neural networks reading the images; most of these generators have simply been hacked (via code exploits).
You seem so certain of what you say, so why come here looking for advice and then discard it?
You are also wrong: most attempts at defeating CAPTCHAs are indeed algorithmic, not exploit-based.

Posted: Fri Jul 20, 2007 5:20 pm
by feyd
Mordred wrote:most attempts at defeating CAPTCHAs are indeed algorithmic, not exploit-based.
Indeed.

Posted: Fri Jul 20, 2007 9:56 pm
by John Cartwright
divx wrote:
Jcart wrote:
divx wrote:I don't know how, but they have been hacked (just look at most of the discussion about this in vBulletin).

Pre-generated may not be as bad as you might first think, since once someone designs a hack, you can just upload a new batch of images with different values.
That seems awfully painful, having to design new images every time. Why not generate the images on the fly, randomly changing things such as background, distortion, letter size, pitch, color, etc.?
Maybe not as painful as you might think: predefined sets for each week, 52 sets, all mutually exclusive. Nobody will want to redesign a macro to resubmit for 1 week and then have to design another one.

I think generating on the fly leaves me exposed, since once it's hacked, it stays hacked until the way they hacked it is figured out and a patch can be applied.

Maybe a combination of these ideas will seal the deal.
I'm not sure you quite understood my last post. This generally only applies if you have fixed attributes on your captcha, such as a static background image. Essentially, if all your attributes are dynamic and highly random, I don't see how it's easily possible to break the captcha without the use of a very clever OCR program. So please tell me: if someone creates a program that can break highly dynamic captchas, what's to stop them from breaking yours as well?

Posted: Sat Jul 21, 2007 4:44 am
by The Phoenix
Most captchas are broken by optical character recognition (OCR). Here's a good example breaking down some of the "how": http://sam.zoy.org/pwntcha/

The summary version is that it can 'guess' where letters are based on common fonts, positions, rendering, and so forth.

There is a considerable amount of work in the security field on breaking them precisely because they are being used to secure resources.

Let's simplify the examples to clarify things. Let's say you pre-generate an image of a number 1-9. On the other hand, Bob generates (real-time) images of a number 1-9. An attacker finds a way to identify those numbers reliably.

Bob has to change his algorithm.

You have to change your algorithm also. But you also need to generate the images, and then upload them. Then you need to use different names for your images so that the cached versions people already received from your site won't linger.

For 9 images, that's not too unreasonable. For 36 images, it should be an obvious disadvantage.

No matter how you slice it, dynamically generated images will be easier to maintain in an attack/response environment.

Worse, by having pre-generated images, the attacker can download the very small set (9? 36?) of images, and be sure he has 100% of the images he needs to be able to guess. With dynamic image generation, you can combine multiple items into one image, forcing the number of potential solutions much higher (thousands or more).
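
To make that last point concrete, here is a small sketch in Python (not PHP; the function names and the byte strings standing in for image files are made up for the example). With a fixed set, an attacker who has solved each image once by hand only needs a hash lookup, with no OCR at all. The last function shows how quickly the number of possible answers grows once several symbols are combined into one generated image.

```python
import hashlib


def build_lookup(labelled_images: dict[bytes, str]) -> dict[str, str]:
    """Attacker's table: hash of each fixed image file -> its known answer."""
    return {
        hashlib.sha256(data).hexdigest(): answer
        for data, answer in labelled_images.items()
    }


def solve(lookup: dict[str, str], image_data: bytes):
    """Return the known answer for a previously seen image, else None."""
    return lookup.get(hashlib.sha256(image_data).hexdigest())


def solution_count(symbols: int, length: int) -> int:
    """Number of distinct answers for `length` symbols drawn from `symbols`."""
    return symbols ** length
```

For example, `solution_count(36, 1)` is 36, while `solution_count(36, 4)` is 1,679,616, and a freshly rendered image does not hash to anything the attacker has stored.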

Posted: Sat Jul 21, 2007 9:50 am
by superdezign
The Phoenix wrote:Worse, by having pre-generated images, the attacker can download the very small set (9? 36?) of images, and be sure he has 100% of the images he needs to be able to guess. With dynamic image generation, you can combine multiple items into one image, forcing the number of potential solutions much higher (thousands or more).
And, you can save some server space.