Captcha

Posted: Mon Feb 23, 2009 12:51 am
by m!nus
so those who have solved it, how did you do that?
someone i know has managed to crack captchas via an ANN, so i'm curious how you did it.

Posted: Wed Feb 25, 2009 1:03 pm
by papa
I didn't bother reading the words but only compared the pictures to each other.

Posted: Wed Feb 25, 2009 1:29 pm
by Tenebrar
So, the repeated word is actually twice the exact same picture? I had assumed at least the angle of the word would be different.

Posted: Wed Feb 25, 2009 1:56 pm
by gfoot
I thought the standard way to crack arbitrary captchas on a large scale was to proxy them to porn site surfers, and get the guinea-pigs to do the work without realising it.

Of course, to do that you need (a) lots of bandwidth, (b) lots of visitors, and (c) lots of porn. Hmm.

Posted: Wed Feb 25, 2009 1:59 pm
by papa
Of course the angle is different. Only comparing the pixels won't do it.

Posted: Wed Feb 25, 2009 9:40 pm
by m!nus
papa wrote:I didn't bother reading the words but only compared the pictures to each other.
how can you not read the words and still compare them...

Posted: Thu Feb 26, 2009 6:23 pm
by MerickOWA
You could come up with some measure of how different each picture is from another and find the two pictures which have the lowest difference value. But that's assuming that the two pictures "look" very much the same.

I'm going to bet that they don't, but I haven't tested this to see ;)
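
A minimal sketch of the "lowest difference value" idea, for illustration only. It assumes the pictures are already loaded as equally sized grids of grayscale values; `difference` and `closest_pair` are made-up names, and the sum of absolute pixel differences is just one possible measure. As noted earlier in the thread, the angle differs between the pictures, so a plain pixel comparison like this is probably not enough on its own.

```python
from itertools import combinations

def difference(a, b):
    """Sum of absolute pixel differences between two equally sized images."""
    return sum(
        abs(pa - pb)
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )

def closest_pair(images):
    """Return (score, i, j) for the two images with the lowest difference."""
    return min(
        (difference(images[i], images[j]), i, j)
        for i, j in combinations(range(len(images)), 2)
    )

# Three tiny 2x2 "pictures": 0 and 2 are nearly identical, 1 is very different.
pictures = [
    [[0, 0], [0, 0]],
    [[255, 255], [255, 255]],
    [[0, 10], [0, 0]],
]
print(closest_pair(pictures))  # (10, 0, 2)
```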

Posted: Mon Aug 16, 2010 3:43 pm
by arthur
gfoot wrote:I thought the standard way to crack arbitrary captchas on a large scale was to proxy them to porn site surfers, and get the guinea-pigs to do the work without realising it.

Of course, to do that you need (a) lots of bandwidth, (b) lots of visitors, and (c) lots of porn. Hmm.
My teacher once talked about that...
Anyway, porn sites are illegal in my country. :(

Posted: Wed Jul 13, 2011 11:03 am
by moose
10,001 images? Are you serious?

How long did it take for the 32 who solved it?

If I used my standard OCR approach, I guess it would take about 0.5-1.0 seconds per picture. At 1.0 second per picture that is:
10001/(60*60) ≈ 2.78 hours, i.e. about 2 hours 46 minutes. As I would have to train it a bit, let's say 3 hours, but I would only have to interact for about 15 minutes, then it'll know the font .... well ... acceptable at the weekend, if I have nothing to do and a good book to read :D
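
The estimate can be checked with a few lines of Python. This is only the arithmetic from the post; the 0.5 and 1.0 seconds per picture are the guessed figures above, not measurements, and `format_duration` is a helper name invented for this sketch.

```python
def format_duration(seconds):
    """Format a number of seconds as hours, minutes and seconds."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"

IMAGES = 10_001

# Guessed per-picture OCR times from the post (assumptions, not benchmarks).
for seconds_per_picture in (0.5, 1.0):
    total = IMAGES * seconds_per_picture
    print(seconds_per_picture, format_duration(total))
# 0.5 1h 23m 20s
# 1.0 2h 46m 41s
```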

Posted: Fri Jul 15, 2011 3:24 pm
by megabreit
I did not use an OCR approach but "simply" reduced the number of possible candidates.
But, reading the solved forum, there were other approaches working far better... none of them used OCR.

Posted: Sat Jul 16, 2011 1:55 pm
by moose
megabreit wrote:I did not use an OCR approach but "simply" reduced the number of possible candidates.
But, reading the solved forum, there were other approaches working far better... none of them used OCR.
Ok, I've tried something much simpler, as my algorithm for rotating the images does some strange things (Does anyone have experience with ImageMagick? Why is the font getting smaller each time I rotate the picture?)

As 10,001 files are given, you can make 10,001² = 100,020,001 comparisons. I have reduced them with a very simple method to 245,536. But it might be possible that my method has errors. Anyway, it's too much.

Posted: Sat Jul 16, 2011 2:16 pm
by AMindForeverVoyaging
moose wrote: As 10,001 files are given, you can make 10,001² = 100,020,001 comparisons.
I don't think so, since if you have compared file M with file N already, you don't need to compare file N with file M anymore. Also you don't compare a file to itself. Example:

Compare 4 things with each other:
#1 with #2
#1 with #3
#1 with #4
#2 with #3
#2 with #4
#3 with #4
= 6 comparisons total
= sum of (1 to n) where n is 3

Thus, comparing 10,001 files with each other should need the sum of (1 to n) where n is 10,000. With the formula n*(n+1)/2, that is 50,005,000 comparisons, if I am not mistaken.
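
The count above can be double-checked in Python. This sketch only verifies the arithmetic; `pair_count` is a name chosen here, and `math.comb` needs Python 3.8 or newer.

```python
from itertools import combinations
from math import comb

def pair_count(files):
    """Number of unordered pairs among `files` items: sum of 1..(files - 1)."""
    n = files - 1
    return n * (n + 1) // 2

# The 4-item example: enumerate the pairs explicitly.
pairs = list(combinations(range(1, 5), 2))
print(pairs)       # [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
print(len(pairs))  # 6

# The 10,001-file case, computed two ways.
print(pair_count(10_001))  # 50005000
print(comb(10_001, 2))     # 50005000
```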

Posted: Tue Aug 07, 2012 2:39 pm
by contagious
moose wrote:
megabreit wrote:I did not use an OCR approach but "simply" reduced the number of possible candidates.
But, reading the solved forum, there were other approaches working far better... none of them used OCR.
Ok, I've tried something much simpler, as my algorithm for rotating the images does some strange things (Does anyone have experience with ImageMagick? Why is the font getting smaller each time I rotate the picture?)

As 10,001 files are given, you can make 10,001² = 100,020,001 comparisons. I have reduced them with a very simple method to 245,536. But it might be possible that my method has errors. Anyway, it's too much.
It's because a rotation by anything other than n*90 degrees is lossy.
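
A toy illustration of that point, under stated assumptions: `rotate_nearest` is a hand-written nearest-neighbour rotation on a same-size canvas, written for this sketch. It is not ImageMagick's algorithm and does not claim to reproduce the shrinking-font effect, it only shows that quarter turns move pixels exactly while a 45 degree turn and its inverse do not give the original back.

```python
import math

def rotate_nearest(img, degrees):
    """Rotate a square grid about its centre, keeping the canvas size.

    Uses nearest-neighbour sampling; output pixels whose source falls
    outside the grid are filled with 0.
    """
    n = len(img)
    c = (n - 1) / 2
    t = math.radians(degrees)
    cos_t, sin_t = math.cos(t), math.sin(t)
    out = [[0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            # Inverse mapping: find the source pixel for this output pixel.
            dx, dy = x - c, y - c
            sx = round(cos_t * dx + sin_t * dy + c)
            sy = round(-sin_t * dx + cos_t * dy + c)
            if 0 <= sx < n and 0 <= sy < n:
                out[y][x] = img[sy][sx]
    return out

def rotate_times(img, degrees, times):
    """Apply the same rotation repeatedly."""
    for _ in range(times):
        img = rotate_nearest(img, degrees)
    return img

# 5x5 test image with a distinct value in every pixel.
image = [[5 * y + x + 1 for x in range(5)] for y in range(5)]

# Four quarter turns only permute the pixels, so nothing is lost.
print(rotate_times(image, 90, 4) == image)  # True

# 45 degrees there and back: the corners leave the canvas and pixels are
# resampled, so the original is not recovered.
print(rotate_nearest(rotate_nearest(image, 45), -45) == image)  # False
```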

Posted: Sat Apr 18, 2015 8:45 pm
by argyblarg
If anyone is still interested in this one...I suppose OCR has come a long way in the last few years, and it is viable as (part of) the way to the solution.

If you do go this route, though, you'll need a top-tier OCR and plan on some serious training time. That font is *ugh*.