Captcha
How did you solve it? First I tried employing some OCR software. I had to roughly correct the orientation of the images, because neither FineReader nor OmniPage seems to support large rotations. In the end, neither performed well enough to give me the answer.
So I further straightened the images by minimizing the height of the text bounding boxes. Finally, I computed the difference between the images (squared distance of pixel colors) to find the best-matching pair.
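For illustration, the straightening step could look roughly like this in C# (a minimal sketch, not the poster's code: it assumes each image has already been thresholded into a list of white-pixel coordinates, and the name BestAngle is made up here). It brute-forces the angle that minimizes the height of the rotated text.
Code:

using System;
using System.Collections.Generic;

// Minimal sketch: find the rotation angle that minimizes the height of the
// text bounding box, by brute force over candidate angles.
static double BestAngle(List<(double X, double Y)> whitePixels)
{
    double bestAngle = 0, bestHeight = double.MaxValue;
    for (double deg = -90; deg <= 90; deg += 0.5) // half-degree steps
    {
        double rad = deg * Math.PI / 180;
        double sin = Math.Sin(rad), cos = Math.Cos(rad);
        double minY = double.MaxValue, maxY = double.MinValue;
        foreach (var (x, y) in whitePixels)
        {
            double ry = x * sin + y * cos; // y-coordinate after rotating by deg
            if (ry < minY) minY = ry;
            if (ry > maxY) maxY = ry;
        }
        if (maxY - minY < bestHeight) { bestHeight = maxY - minY; bestAngle = deg; }
    }
    return bestAngle; // rotate the image back by this angle before comparing
}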
This is probably a bit complicated. Do you guys have a simpler solution?
I didn't try to recognize the characters because I thought it needless. I did it almost the same way as you finally did. I think it's simple enough.
First I calculated the center and the rotation (by linear regression analysis) for each image, and straightened it into a 128x16 rectangle. Then I compared each pair of straightened images by the sum of squared distances of pixel values.
It was funny that the best score was achieved by the "docketing"/"pocketing" pair, because the difference lay outside the 128x16 area. The answer was found as the second-best pair.
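For readers wondering what the regression and comparison steps look like, here is a rough C# sketch under the same assumption as above (white pixels collected as coordinates; RotationByRegression and SquaredDistance are hypothetical names):
Code:

using System;
using System.Collections.Generic;

// Fit y = a + b*x through the white pixels by least squares;
// the slope b gives the text rotation.
static double RotationByRegression(List<(double X, double Y)> pts)
{
    double n = pts.Count, sx = 0, sy = 0, sxx = 0, sxy = 0;
    foreach (var (x, y) in pts) { sx += x; sy += y; sxx += x * x; sxy += x * y; }
    double slope = (n * sxy - sx * sy) / (n * sxx - sx * sx);
    return Math.Atan(slope); // angle to rotate back by, in radians
}

// Sum of squared pixel differences between two straightened 128x16 images.
static long SquaredDistance(byte[,] a, byte[,] b)
{
    long sum = 0;
    for (int x = 0; x < 128; x++)
        for (int y = 0; y < 16; y++)
        {
            int d = a[x, y] - b[x, y];
            sum += d * d;
        }
    return sum; // the pair with the smallest sum is the best match
}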
tails wrote: First I calculated the center and the rotation (by linear regression analysis)
Oh, this is of course the better, faster and more robust approach! I did it purely ad hoc: rotating and trimming the images with ImageMagick until they reached minimal height.
tails wrote: It was funny that the best score was achieved by the "docketing"/"pocketing" pair, because the difference lay outside the 128x16 area. The answer was found as the second-best pair.
I didn't fit the images into fixed boxes; instead, I only compared images of similar size. This way, the answer really did get the best score.
I used principal component analysis to calculate the rotation. I didn't reduce the size of the images, but sorted them by the sum of all pixels' color values and only correlated each picture with its 100 nearest neighbors.
I expected that too many similar images would correlate well, and had already planned what to do next. But in fact the two desired images correlated much better than all the other pairs.
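The PCA step is small enough to sketch in C# (again assuming the foreground pixels are given as coordinates; RotationByPca is a made-up name). The idea: the principal axis of the pixel cloud of a long word points along the text.
Code:

using System;
using System.Collections.Generic;

// Estimate text orientation as the principal axis of the white-pixel cloud.
static double RotationByPca(List<(double X, double Y)> pts)
{
    double mx = 0, my = 0;
    foreach (var (x, y) in pts) { mx += x; my += y; }
    mx /= pts.Count; my /= pts.Count;
    double cxx = 0, cxy = 0, cyy = 0; // 2x2 covariance matrix entries
    foreach (var (x, y) in pts)
    {
        cxx += (x - mx) * (x - mx);
        cxy += (x - mx) * (y - my);
        cyy += (y - my) * (y - my);
    }
    // orientation of the dominant eigenvector of [[cxx, cxy], [cxy, cyy]]
    return 0.5 * Math.Atan2(2 * cxy, cxx - cyy);
}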
Yharaskrik wrote: Hi!
I calculated the rotation by finding the highest/lowest x/y pixels and using arctan. Then I used "gocr", trained it with 200 captchas, and used a hash table to find duplicates. I had to look over 100 false positives and finally found the correct pair.
Interesting. My first estimate of the rotation was also based on the bounding box and arctan, but I didn't bother training my OCR tools (maybe that was the reason they failed). After maybe 50 false positives I gave up.
I used principal component analysis (http://en.wikipedia.org/wiki/Principal_ ... s_analysis) to calculate the rotation, ImageMagick (http://www.imagemagick.org/script/index.php) to undo the rotation and DoublePics (http://www.doublepics.net/index.cms?FQT ... ics.Whatis) to find the duplicates.
As usual I tried not to use any special software, only Delphi ...
Rotating with a best fractional fit, i.e. the value of each pixel and each color channel in the target (integer) is the weighted average of the four pixels (integer) around the calculated source point (double).
Looking for the best linear alignment (minimal height & best fit).
Getting the pattern of the middle row (only black & white), sorting the patterns, and comparing each with its ten nearest neighbors (Hamming distance <= 2). Got about 30 hits; the result was the third hit, after immobilize/immobilise and attenders/attendees.
Nice challenge, but it took a while ...
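The "fractional fit" described above is plain bilinear interpolation. A minimal C# sketch (the helper name Sample is made up, and out-of-image pixels are assumed black):
Code:

using System;

// Bilinear interpolation: the target pixel is the weighted average of the
// four source pixels around the back-rotated, fractional source coordinate.
static byte Sample(byte[,] src, double x, double y)
{
    int x0 = (int)Math.Floor(x), y0 = (int)Math.Floor(y);
    double fx = x - x0, fy = y - y0; // fractional parts = interpolation weights
    byte P(int px, int py) =>
        px >= 0 && py >= 0 && px < src.GetLength(0) && py < src.GetLength(1)
            ? src[px, py] : (byte)0; // outside the image counts as black
    double v = P(x0, y0) * (1 - fx) * (1 - fy)
             + P(x0 + 1, y0) * fx * (1 - fy)
             + P(x0, y0 + 1) * (1 - fx) * fy
             + P(x0 + 1, y0 + 1) * fx * fy;
    return (byte)Math.Round(v);
}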
I converted every picture into black&white and counted the number of white pixels.
Then I (visually) compared all pictures with the same pixel count. Unfortunately, there were far more pictures with the same pixel count than I expected. Luckily the twins had the same pixel count... so I didn't have to think about how to rotate and compare things.
nice challenge
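A sketch of the counting step in C# (assuming grayscale input and a hypothetical threshold of 128). Images can then be bucketed by this count, since it does not change under rotation:
Code:

// Threshold to black & white and count the white pixels - a value that
// stays the same however the image is rotated.
static int WhitePixelCount(byte[,] gray, byte threshold = 128)
{
    int count = 0;
    for (int x = 0; x < gray.GetLength(0); x++)
        for (int y = 0; y < gray.GetLength(1); y++)
            if (gray[x, y] >= threshold) count++;
    return count;
}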
To solve this one I decided to use C#.
1) First I calculated the rotation with linear regression and rotated every image into a horizontal position.
That was the easy part.
Now I had to figure out some method to compare all those images...
2) I only used the R(ed) value of the images.
Here is the important part of the comparison routine; the image with the lowest score is the winner:
Code:

using System;
using System.Drawing;

// Compare the central region of two straightened images using only the red channel.
static int Compare(Bitmap bmp1, Bitmap bmp2)
{
    int score = 0;
    for (int x = 10; x < 120; x++)
    {
        for (int y = 50; y < 80; y++)
        {
            Color c1 = bmp1.GetPixel(x, y);
            Color c2 = bmp2.GetPixel(x, y);
            int value = Math.Abs(c1.R - c2.R);
            if (value > 200) score += value;
            if (value > 240) score += value; // count it twice - this one really doesn't fit
            if (score > 9000) return int.MaxValue; // give up early on hopeless pairs
        }
    }
    return score;
}
One run: approximately 3 hours. The best score was 0 (zero)!!! What a surprise
... and it was the correct solution.
In Freepascal/Lazarus:
I first tried to compute, for each image, the sum of r_i^2 over all white pixels, with r_i the distance of pixel i from the center. This quantity, like the number of white pixels or the sum of r_i, is rotationally invariant.
Then I wrote a tool to select the best 10000 matches among the pairs and sorted them in order of fitness.
Still, I could not find the solution. I did not use PCA to compute the center of the rotation; I assumed that it was lying exactly in the middle of the picture... I might have failed for this reason?
I then did it the hard way. I wrote a tool which randomly showed me pictures; I read them and typed them in by hand, and the tool checked if I already had the word. My speed was about 1000 words per hour. After 5507 tries I finally found the word.
All in all, this was one of my most time-consuming challenges! All the challenges I am facing now look really hard. I would need more in my head.
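The invariants are cheap to compute. A C# sketch (here the center is taken as the centroid of the white pixels rather than the image middle, which addresses the suspected failure above; Invariants is a made-up name):
Code:

using System;
using System.Collections.Generic;

// Rotation-invariant signature: pixel count, sum of r_i and sum of r_i^2,
// with distances measured from the centroid of the white pixels.
static (int Count, double SumR, double SumR2) Invariants(List<(double X, double Y)> pts)
{
    double mx = 0, my = 0;
    foreach (var (x, y) in pts) { mx += x; my += y; }
    mx /= pts.Count; my /= pts.Count; // centroid = true center of the word
    double sumR = 0, sumR2 = 0;
    foreach (var (x, y) in pts)
    {
        double r2 = (x - mx) * (x - mx) + (y - my) * (y - my);
        sumR += Math.Sqrt(r2);
        sumR2 += r2;
    }
    return (pts.Count, sumR, sumR2);
}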
Learn the rules if you want to break them effectively. Dalai Lama XV
dangermouse wrote: I did not use PCA to compute the center of the rotation; I assumed that it was lying exactly in the middle of the picture... I might have failed for this reason?
No, my assumption was identical (but it wasn't hard to find the center).
I did it in raw C++, implementing everything from scratch.
I don't know PCA, so I figured out another way to find the rotation. I computed the convex hull of the non-black pixels and then separated the vectors (one for each side of the hull) into two types by direction: (left-up & right-down) versus (left-down & right-up). I determined which type has the greater total length and used only those vectors (they come from the longer sides of the rectangle containing the non-rotated text). The rotation is the angle of the sum of these vectors (each vector additionally weighted by its length). No, it wasn't hard to implement, but PCA is probably easier.
After rotation I just compared each pair by calculating the sum of pixel differences and dividing it by the word length (in pixels, calculated from the bounding rect) so as not to favor short words.
The first hit was the password.
Total time of calculations: about 20 minutes.
I don't think this is a clear explanation because of my poor English, but maybe somebody will understand it.
Thanks for the nice task!
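For the curious, the hull-based angle estimate might look like this in C# (a sketch, not the poster's code: Andrew's monotone chain for the hull, then the two edge families described above; coordinates are image coordinates with y pointing down):
Code:

using System;
using System.Collections.Generic;
using System.Linq;

// Cross product of OA x OB; the sign gives the turn direction.
static long Cross((int X, int Y) o, (int X, int Y) a, (int X, int Y) b)
    => (long)(a.X - o.X) * (b.Y - o.Y) - (long)(a.Y - o.Y) * (b.X - o.X);

// Convex hull of the non-black pixels (Andrew's monotone chain).
static List<(int X, int Y)> ConvexHull(List<(int X, int Y)> pts)
{
    pts = pts.Distinct().OrderBy(p => p.X).ThenBy(p => p.Y).ToList();
    if (pts.Count < 3) return pts;
    var lower = new List<(int X, int Y)>();
    foreach (var p in pts)
    {
        while (lower.Count >= 2 && Cross(lower[lower.Count - 2], lower[lower.Count - 1], p) <= 0)
            lower.RemoveAt(lower.Count - 1);
        lower.Add(p);
    }
    var upper = new List<(int X, int Y)>();
    for (int i = pts.Count - 1; i >= 0; i--)
    {
        while (upper.Count >= 2 && Cross(upper[upper.Count - 2], upper[upper.Count - 1], pts[i]) <= 0)
            upper.RemoveAt(upper.Count - 1);
        upper.Add(pts[i]);
    }
    lower.RemoveAt(lower.Count - 1);
    upper.RemoveAt(upper.Count - 1);
    lower.AddRange(upper);
    return lower;
}

// Split the hull edges into the two diagonal families and take the angle
// of the (length-weighted) sum of the dominant family.
static double EstimateAngle(List<(int X, int Y)> whitePixels)
{
    var hull = ConvexHull(whitePixels);
    double[] sx = new double[2], sy = new double[2], total = new double[2];
    for (int i = 0; i < hull.Count; i++)
    {
        var a = hull[i];
        var b = hull[(i + 1) % hull.Count];
        double dx = b.X - a.X, dy = b.Y - a.Y;
        if (dx < 0) { dx = -dx; dy = -dy; } // normalize so every edge points right
        double len = Math.Sqrt(dx * dx + dy * dy);
        int k = dy < 0 ? 0 : 1; // rising vs falling family (y points down)
        sx[k] += dx * len; sy[k] += dy * len; total[k] += len;
    }
    int best = total[0] > total[1] ? 0 : 1; // the long sides of the text box
    return Math.Atan2(sy[best], sx[best]); // radians, image coordinates
}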
Uff
I had a feeling I should stop searching when I found the match ... uff lucky me.
OK, maybe not that lucky, as I had reduced the candidates to about 60% ...
[edit]
I have to give a better explanation of what I did:
First I started by renaming the images according to the number of white regions in each image ... unfortunately there were a lot of "ligatures", so that didn't look promising.
Then I coded a good-enough rotation of the image by a given angle.
I detected the angle by computing all pairs of highest and lowest points on the ligatures and making a list of lower and upper bounds of the corresponding angles. The angle contained in the most intervals was the chosen one (sorting helped). Of course this failed on pictures with only one ligature...
Then I decided to look for interesting properties ... first I put the images into different directories according to the white width after rotation, but I was not sure about rounding, so I put each image into directories X and X+1 according to width/2. I did the same for the height and ended up with 40000 images.
... that was not a good way to go.
Then I looked at other properties ... the number of dots above letters. (I put all images back into the root directory ... the process so far had eliminated maybe 1% as outliers.)
The dots seemed to be classified with 100% accuracy, at least as far as I checked optically.
Using the rotated white width at the start of the file name kept the images sorted, which simplified the MANUAL/VISUAL check that the images in the small categories are unique. I am not sure whether I eliminated words with at least 3 or at least 4 dots this way.
The next bucket criterion was the number of big holes (o, d, b, p letters) ... and there I found the solution.
I didn't want to compare all pairs of images, and I didn't expect there would be only a small difference after derotation.
[/edit]
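The "number of white regions" signature from the first step can be sketched in C# with a flood fill (CountWhiteRegions is a made-up name; 4-connectivity assumed):
Code:

using System.Collections.Generic;

// Count connected white regions with a 4-neighbor flood fill.
static int CountWhiteRegions(bool[,] white) // white[x, y] = thresholded pixel
{
    int w = white.GetLength(0), h = white.GetLength(1);
    var seen = new bool[w, h];
    var queue = new Queue<(int X, int Y)>();
    int regions = 0;
    for (int x = 0; x < w; x++)
        for (int y = 0; y < h; y++)
        {
            if (!white[x, y] || seen[x, y]) continue;
            regions++; // found a new region; flood-fill all of it
            seen[x, y] = true;
            queue.Enqueue((x, y));
            while (queue.Count > 0)
            {
                var (cx, cy) = queue.Dequeue();
                foreach (var (nx, ny) in new[] { (cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1) })
                {
                    if (nx < 0 || ny < 0 || nx >= w || ny >= h) continue;
                    if (!white[nx, ny] || seen[nx, ny]) continue;
                    seen[nx, ny] = true;
                    queue.Enqueue((nx, ny));
                }
            }
        }
    return regions;
}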
I used OpenCV to scale the text 4x and un-rotate it, to make it manageable for OCR tools to dig into the words.
Functions used: cv::imread, cv::resize, cv::minAreaRect, cv::warpAffine.
It was a nice experience to play with a modern computer-vision library.
Then I used tesseract and qt-box-editor in manual training mode on the first 101 scaled images, and ran 'sort | uniq -d' on the text output.
The answer was one of 10 words.
Tesseract makes almost perfect guesses even for such crudely rotated text ('j' and 'q' are missing in the first 100 pictures).
Reading now about linear regression and principal component analysis mentioned here.
Nice challenge!