Captcha

m!nus · Post by **m!nus** » Mon Feb 23, 2009 12:51 am

so those who have solved it, how did you do that?
someone i know has managed to crack captchas via an ANN, so i'm curious how you did it.

papa · Post by **papa** » Wed Feb 25, 2009 1:03 pm

I didn't bother reading the words but only compared the pictures to each other.

Tenebrar · Post by **Tenebrar** » Wed Feb 25, 2009 1:29 pm

So, the repeated word is actually twice the exact same picture? I had assumed at least the angle of the word would be different.

gfoot · Post by **gfoot** » Wed Feb 25, 2009 1:56 pm

I thought the standard way to crack arbitrary captchas on a large scale was to proxy them to porn site surfers, and get the guinea-pigs to do the work without realising it.

Of course, to do that you need (a) lots of bandwidth, (b) lots of visitors, and (c) lots of porn. Hmm.

papa · Post by **papa** » Wed Feb 25, 2009 1:59 pm

Of course the angle is different. Only comparing the pixels won't do it.

m!nus · Post by **m!nus** » Wed Feb 25, 2009 9:40 pm

papa wrote:I didn't bother reading the words but only compared the pictures to each other.

how can you not read the words and still compare them...

MerickOWA · Post by **MerickOWA** » Thu Feb 26, 2009 6:23 pm

You could come up with some equation for how different each picture is from another and find the two pictures which have the lowest difference value. But that's assuming that the two pictures "look" very much the same.

I'm going to bet that they don't, but I haven't tested this to see.
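
A minimal sketch of the "lowest difference value" idea, assuming each picture is already loaded as an equally sized grid of grayscale values. The names `difference` and `closest_pair` and the toy data are made up for illustration, and a plain per-pixel sum like this would not cope with pictures that are rotated relative to each other:

```python
from itertools import combinations


def difference(img_a, img_b):
    """Sum of absolute per-pixel differences between two equally sized images."""
    return sum(
        abs(pa - pb)
        for row_a, row_b in zip(img_a, img_b)
        for pa, pb in zip(row_a, row_b)
    )


def closest_pair(images):
    """Return the indices (i, j) of the two images with the lowest difference."""
    return min(
        combinations(range(len(images)), 2),
        key=lambda pair: difference(images[pair[0]], images[pair[1]]),
    )


# Hypothetical toy data: three 2x2 "images"; the first and third are nearly identical.
images = [
    [[0, 255], [255, 0]],
    [[255, 255], [0, 0]],
    [[0, 250], [255, 5]],
]
print(closest_pair(images))  # (0, 2)
```

This checks every pair, so the cost grows quadratically with the number of pictures.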

arthur · Post by **arthur** » Mon Aug 16, 2010 3:43 pm

gfoot wrote:I thought the standard way to crack arbitrary captchas on a large scale was to proxy them to porn site surfers, and get the guinea-pigs to do the work without realising it.

Of course, to do that you need (a) lots of bandwidth, (b) lots of visitors, and (c) lots of porn. Hmm.

My teacher once talked about that...
Anyway, porn sites are illegal in my country.

moose · Post by **moose** » Wed Jul 13, 2011 11:03 am

10,001 images? Are you serious?

How long did it take for the 32 who solved it?

If I used my standard OCR approach, I guess it would take about 0.5-1.0 seconds per picture. At 1 second per picture:
10,001/(60*60) ≈ 2.78 hours, so about 2 hours 46 minutes. As I would have to train it a bit, let's say 3 hours, but I have to interact only about 15 minutes, then it'll know the font .... well ... acceptable at the weekend, if I have nothing to do and a good book to read

megabreit · Post by **megabreit** » Fri Jul 15, 2011 3:24 pm

I did not use an OCR approach but "simply" reduced the number of possible candidates.
But, reading the solved forum, there were other approaches working far better... none of them used OCR.

moose · Post by **moose** » Sat Jul 16, 2011 1:55 pm

megabreit wrote:I did not use an OCR approach but "simply" reduced the number of possible candidates.
But, reading the solved forum, there were other approaches working far better... none of them used OCR.

Ok, I've tried something much simpler as my algorithm for rotating the images makes some strange things (Has anyone experience with imagemagick? Why is the font getting smaller each time I rotate the picture?)

As 10,001 files are given, you can make 10,001² = 100,020,001 comparisons. I have reduced them with a very simple method to 245,536. But it might be possible that my method has errors. Anyway, it's too much.

Sat Jul 16, 2011 2:16 pm

moose wrote: As 10,001 files are given, you can make 10,001² = 100,020,001 comparisons.

I don't think so, since if you have compared file M with file N already, you don't need to compare file N with file M anymore. Also you don't compare a file to itself. Example:

Compare 4 things with each other:
#1 with #2
#1 with #3
#1 with #4
#2 with #3
#2 with #4
#3 with #4
= 6 comparisons total
= sum of (1 to n) where n is 3

Thus, comparing 10,001 files with each other should need sum of (1 to n) where n is 10,000. With the formula n*(n+1)/2, that is 50,005,000 comparisons. If I am not mistaken.
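
A quick way to check that count, as a small sketch: the number of unordered pairs among k items is k*(k-1)/2, which equals the sum of 1 to k-1. The helper name `pair_count` is made up for illustration.

```python
from itertools import combinations


def pair_count(k):
    """Number of distinct unordered pairs among k items (no self-comparisons)."""
    return k * (k - 1) // 2


# The 4-item example: enumerate the pairs explicitly and compare with the formula.
pairs = list(combinations([1, 2, 3, 4], 2))
print(len(pairs), pair_count(4))  # 6 6

# The same formula for the 10,001 files.
print(pair_count(10_001))  # 50005000
```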

contagious · Post by **contagious** » Tue Aug 07, 2012 2:39 pm

moose wrote:
megabreit wrote:I did not use an OCR approach but "simply" reduced the number of possible candidates.
But, reading the solved forum, there were other approaches working far better... none of them used OCR.
Ok, I've tried something much simpler as my algorithm for rotating the images makes some strange things (Has anyone experience with imagemagick? Why is the font getting smaller each time I rotate the picture?)

As 10,001 files are given, you can make 10,001² = 100,020,001 comparisons. I have reduced them with a very simple method to 245,536. But it might be possible that my method has errors. Anyway, it's too much.

It's because any rotation that is not a multiple of 90 degrees is lossy.
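
A toy illustration of that point in plain Python, not a reproduction of what ImageMagick does internally: quarter turns only move pixels around, while any other angle has to resample onto the pixel grid. The 9x9 test grid, the nearest-neighbour resampling and the fixed canvas size are all assumptions chosen to keep the sketch short.

```python
import math


def rotate90(img):
    """Rotate a square grid 90 degrees clockwise; pixels are moved, never resampled."""
    return [list(row) for row in zip(*img[::-1])]


def rotate_nearest(img, degrees):
    """Rotate a square grid about its centre with nearest-neighbour resampling.

    The canvas keeps its size; destination pixels whose source falls
    outside the grid are set to 0.
    """
    n = len(img)
    c = (n - 1) / 2
    t = math.radians(degrees)
    cos_t, sin_t = math.cos(t), math.sin(t)
    out = [[0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            # Inverse mapping: find the source position of this destination pixel.
            sx = cos_t * (x - c) + sin_t * (y - c) + c
            sy = -sin_t * (x - c) + cos_t * (y - c) + c
            ix, iy = round(sx), round(sy)
            if 0 <= ix < n and 0 <= iy < n:
                out[y][x] = img[iy][ix]
    return out


# Hypothetical 9x9 test image with non-zero values everywhere.
original = [[1 + (x + 2 * y) % 5 for x in range(9)] for y in range(9)]

four_quarter_turns = original
for _ in range(4):
    four_quarter_turns = rotate90(four_quarter_turns)

there_and_back = rotate_nearest(rotate_nearest(original, 45), -45)

print(four_quarter_turns == original)  # True: multiples of 90 degrees are exact
print(there_and_back == original)      # False: at least the corner pixels were lost
```

If several rotations are needed, rotating the untouched original by the total angle each time avoids stacking these losses.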

argyblarg · Post by **argyblarg** » Sat Apr 18, 2015 8:45 pm

If anyone is still interested in this one...I suppose OCR has come a long way in the last few years, and it is viable as (part of) the way to the solution.

If you do go this route, though, you'll need a top-tier OCR and plan on some serious training time. That font is *ugh*.