CipherQuest D

Posted: Sun Jun 21, 2009 7:24 am
by tails
Hi,

I happened to guess the first 10 characters at a very early stage of deciphering, and after that it took a while to get the full plaintext. I hadn't noticed the new rule introduced in this challenge when I submitted the answer.

I can't think of a very good way to let solvers prove their success, but wouldn't it be better to ask for the first 10 words instead of the first 10 characters?

Posted: Sun Jun 21, 2009 9:15 am
by gfoot
Yes, 10 characters isn't much. Also, apart from on CipherQuest A I haven't bothered converting anything into punctuation - I must have just deleted the punctuation along with the true null characters.

That's why I made the solution be the entire alphabet in the homophonic ciphers. You could do the same here - the deletes would mean you still don't need to decode some characters, but you would need most of them.

I must say I felt lucky when solving CipherQuest D - my approach wasn't systematic, and there were a bunch of things I did which could easily fail if the cipher was designed to foil them. It'll be interesting to see any further challenges.

Posted: Sun Jun 21, 2009 10:31 pm
by adum
very interesting -- i didn't realize that just asking for the first 10 could be an easy shortcut. of course, this is very typical -- cipher creator not understanding weaknesses =)

at first i was going to require the solution to be something like the alphabet, but that gets complicated when it's deletes and nulls, etc. so i went for first ten chars.

i'll pick something better for CipherQuest E =)

Posted: Sun Jun 21, 2009 10:33 pm
by adum
let's keep in mind too that these aren't exactly proving to be a walk in the park for people. the fact that only you two got here means they're extremely challenging for most. yay substitution ciphers!

Posted: Mon Jun 22, 2009 12:49 am
by tails
Ah, yes, of course you are right :) This cipher is very challenging.

How I solved

Posted: Mon Jun 22, 2009 1:54 am
by tails
As feedback, this is how I solved this cipher:

1. Guessing that "k" must be the word separator.
2. Guessing some nulls and deletes.
3. There are some "Cdg" and "CdOC" found as whole words in the cipher, so they may be "the" and "that" respectively.
4. There are many words which end with "sQP", so they may be "ing".
5. There are some "WgsQP" and "WggQ". They are very suitable for "being" and "been" because only "W" is unknown among those letters.
6. "WggQ" ("been") is often preceded by "dOt", so I thought it must be "has" (in fact it is "had").

At this point, the first 10 characters are "*a*ing he*" where * are unknown letters.
It didn't take long to find that they are "m", "k" and "r".

After that, I noticed some ciphertext letters that did not fit any single plaintext character.
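
A minimal sketch of the whole-word and word-ending counts behind steps 3 and 4, assuming the word separator has already been guessed. The name `word_stats` and the sample string are made up for illustration; the sample is not the real ciphertext.

```python
from collections import Counter

def word_stats(ciphertext, sep="k", suffix_len=3):
    """Count whole words and word endings after splitting on the separator."""
    words = [w for w in ciphertext.split(sep) if w]
    whole_words = Counter(words)
    # Only look at endings of words longer than the suffix itself,
    # so a three-letter word is not counted as its own ending.
    suffixes = Counter(w[-suffix_len:] for w in words if len(w) > suffix_len)
    return whole_words, suffixes

# Toy input built from the letter groups quoted in the steps above.
sample = "CdgkWgsQPkCdOCkCdgkXYsQP"
whole_words, suffixes = word_stats(sample)
print(whole_words.most_common(3))
print(suffixes.most_common(3))
```

On a real text the top whole words and endings are only candidates for "the", "that" or "ing", since nulls and deletes can distort the counts.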

Posted: Mon Jun 22, 2009 9:53 am
by gfoot
My steps:

1. Guess word separator based on minimising the longest word length
2. Analyse digraph, trigraph, ... up to hexagraph(?) frequencies and compare with an online reference. Most hexagraphs are four-letter words with a space on either side, and most quadragraphs are two-letter words, which is pretty useful
3. Analyse characters that tend to start or end words. 50% of words start with one of "TAOSW", 50% end with one of "ESOT".
4. Lots of cross-referencing between common sequences and word start/end letters, and trying out various letters as nulls and deletes to try to improve the profile of the frequency results.

So nowhere near as direct - it took quite a while, with a lot of failed guesses as to which letter was which. The double letters messed up a lot of the frequencies - "THE" was unusually hard to find, for me.
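
A rough sketch of steps 1 and 3, assuming the ciphertext is a plain string with one symbol per character. The function names and the toy input are hypothetical, and ties between candidate separators are broken arbitrarily (first in sorted order).

```python
from collections import Counter

def guess_separator(ciphertext):
    """Return (symbol, longest_word) for the symbol that, used as a
    separator, gives the shortest 'longest word'."""
    best = None
    for symbol in sorted(set(ciphertext)):
        longest = max(len(word) for word in ciphertext.split(symbol))
        if best is None or longest < best[1]:
            best = (symbol, longest)
    return best

def edge_letters(ciphertext, sep):
    """Count which symbols start words and which end them."""
    words = [w for w in ciphertext.split(sep) if w]
    starts = Counter(w[0] for w in words)
    ends = Counter(w[-1] for w in words)
    return starts, ends

toy = "abkcdkabkgh"  # made-up text, not the CipherQuest D ciphertext
sep, longest = guess_separator(toy)
starts, ends = edge_letters(toy, sep)
print(sep, longest, starts.most_common(2), ends.most_common(2))
```

The start and end counts can then be compared against the "TAOSW" and "ESOT" profiles mentioned in step 3.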

Things that would make it particularly hard for me would be:
* making the word separator harder to find (have more than one, add a fake one using nulls, or use deletes to hide the real one)
* non-random null/delete placement (maybe you do this already) - otherwise they can stand out as being unusual due to being too promiscuous
* proactively disguise common words - deliberately insert nulls and deletes into them
* add fake common words - use specific sets of nulls to give false hits for words like "the" and "and"

On top of that, I'm sure you'll add more metacharacters - there are lots of interesting things you could do with them. :)

Posted: Mon Jun 22, 2009 10:17 am
by tails
Wow, gfoot, your method is very systematic, while mine very ad-hoc :)
gfoot wrote: 50% of words start with one of "TAOSW", 50% end with one of "ESOT".
Didn't know that. Good information :)
gfoot wrote: Things that would make it particularly hard for me would be:
* Using homophony.

Posted: Mon Jun 22, 2009 3:32 pm
by Yharaskrik
Hi,

besides the NULLs I didn't find other rules. Well, I didn't look for them because I found the solution by just guessing character after character.
...researching...
So, I just tried to fully get the plain text and found these rules:
- delete this character
- delete this and the following character
- replace this character by another character
- replace this character by two other characters
Now it's almost completely readable. Do you know what the missing rules are?
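
A minimal sketch of a decoder for the four rule types listed above, assuming a single left-to-right pass. The function name `decode`, its parameters and the toy mapping are all made up for illustration; the real key and the order in which the rules are applied are not given in this thread.

```python
def decode(ciphertext, nulls, skip_next, table):
    """Apply the four rule types in one left-to-right pass.

    nulls     -- symbols that are simply dropped
    skip_next -- symbols that drop themselves and the following symbol
    table     -- symbol -> one or two plaintext characters
    """
    out = []
    i = 0
    while i < len(ciphertext):
        symbol = ciphertext[i]
        if symbol in nulls:
            i += 1                       # delete this character
        elif symbol in skip_next:
            i += 2                       # delete this and the following character
        else:
            out.append(table.get(symbol, "?"))  # "?" marks an unknown symbol
            i += 1
    return "".join(out)

# Hypothetical key: "x" is a null, "y" deletes itself and its successor,
# and "q" stands for the two-letter pair "re".
toy_table = {"C": "t", "d": "h", "g": "e", "q": "re"}
print(decode("CxdgyZq", nulls={"x"}, skip_next={"y"}, table=toy_table))
```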

Posted: Mon Jun 22, 2009 6:26 pm
by gfoot
Those are all the rules. The only new rule since CipherQuest C is that some ciphertext letters represent pairs of plaintext letters (e.g. "er", "th"...)

Posted: Tue Jun 23, 2009 12:43 am
by tails
When enciphering, if the two-character rule were fully applied to each occurrence of such a pair, I think the cipher would be much harder to solve. Maybe it would not even look like English. This time some "th" are expressed as two one-character letters, and that is a big clue for guessing and analyzing.

By the way, I found only two-character letters are homophonic; e.g. "a", "r" and "z" all represent "ti" ("z" appears only once so it has some other possibilities though).

And, hi Yharaskrik, I'm glad we took very similar approaches :)

Posted: Sun Jul 12, 2009 12:59 am
by rmplpmpl
Ha, done it... I'll post tomorrow what I did, because it seems to me I didn't take the same route you all did.

Posted: Sun Jul 12, 2009 12:33 pm
by rmplpmpl
OK, after further analysing the text and my solution, the differences from your way of cracking that nut are not too big.

1) Guessing Space, I was unsure at first. All other possibilities did not work with the estimated amount of one-, two- and three-letter words.

2) Analysed 3-, 4-, 5- and 6-grams where the first and last letter was a space; this led me to the conclusion that my approach of guessing the most common 3-gram as THE must be wrong. So I opted for another substitution for THE (to find out later that it was in fact SHE).

3) From there I got R, and a short time later A, N and D

4) I had guessed some NULL and double-NULL values but held off on deleting them; nevertheless notepad++ helped me a lot with marking them out

From there I made my way through all the other letters and digrams.
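
A small sketch of the kind of count described in step 2, assuming the separator symbol is already known. `bounded_ngrams` is a hypothetical helper, and it only keeps n-grams with no separator in the middle, i.e. single whole words with a space on each side.

```python
from collections import Counter

def bounded_ngrams(text, sep, n):
    """Count n-grams that start and end with sep and contain no sep inside."""
    counts = Counter()
    for i in range(len(text) - n + 1):
        gram = text[i:i + n]
        if gram[0] == sep and gram[-1] == sep and sep not in gram[1:-1]:
            counts[gram] += 1
    return counts

toy = "kabckabckxyk"  # made-up text using "k" as the separator
print(bounded_ngrams(toy, "k", 5).most_common(3))
print(bounded_ngrams(toy, "k", 4).most_common(3))
```

Neighbouring words share a separator, so the same "k" can close one n-gram and open the next.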

There is one little glitch in the text in my solution (three or four words where I found a W-substitution are somewhat corrupted), so maybe I have to look into it once more...

Thanks for this great cipher!

Posted: Tue Jun 18, 2013 5:30 pm
by godefv
Hi,

it has been a long time since last post but maybe some people are still around...

This one was definitely harder than the previous ones because all the frequency analysis was fucked up by missing common digrams.
The sQ and sQP counts were incredibly high, and I waited a long time before assigning them to "ing" because I thought adding these sQP at the end of words might be a trick. I was also confused by the missing "the" and "and". How could it be real that "ing" was so much higher?
Also, I really thought that several different letters would decipher to the same one, and that this would explain high-frequency letters being split into a few low-frequency ones.

So, nice one. However, as before, I still have some little errors in the deciphered text at the end, like a few missing and extra letters. In particular, I am not sure what to do if I have fKJO: should I delete fK, KJ, or J first? A bad order of operations can lead to those few errors. Besides, for me, z was "li" ...
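
A toy illustration of why the order matters, assuming (only from the question above) that "f" and "K" each delete themselves plus the next symbol and that "J" is a plain null. Both function names are made up, and the thread does not say which order the real cipher uses.

```python
def strip_left_to_right(text, nulls, skip_next):
    """Single pass from the left: each metacharacter acts on what follows it."""
    out = []
    i = 0
    while i < len(text):
        if text[i] in nulls:
            i += 1
        elif text[i] in skip_next:
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

def strip_rightmost_first(text, nulls, skip_next):
    """Apply the rightmost metacharacter first, then work leftwards."""
    chars = list(text)
    i = len(chars) - 1
    while i >= 0:
        if chars[i] in nulls:
            del chars[i]
        elif chars[i] in skip_next:
            del chars[i:i + 2]
        i -= 1
    return "".join(chars)

nulls, skip_next = {"J"}, {"f", "K"}
print(repr(strip_left_to_right("fKJO", nulls, skip_next)))
print(repr(strip_rightmost_first("fKJO", nulls, skip_next)))
```

With these assumed roles the two orders disagree on whether the "O" survives, which is the kind of stray missing or extra letter described above.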

And... this one is a dead end :'(

Posted: Mon Jun 16, 2014 1:10 am
by Cheesy
Phew. I got really confused by the frequency analysis for a while there.

I've gotten every word to make sense, except "wzit" whatever that is.

"wzit was common in the dungeons with them being full of cold drafts and all."