Not Smaller Yet
Not Smaller Yet
I have a question. How should we use the value 34 on decoding?
I suppose, 34 defines the beginning of the original text. Since the BWT ist rotation invariant (abcde has the same bwt-result as e.g. bcdea or deabc), you cannot otherwise know where the text starts in the backtransformed sequence.
But I thought that this should be the position of the last (or the first) character of the original text in the BWT-Sequence, which is 17 (resp. 20)?
But I thought that this should be the position of the last (or the first) character of the original text in the BWT-Sequence, which is 17 (resp. 20)?
BWT is the first step in the compression algorithm of bzip2. It sorts the letters such that those appearing in a similar surrounding come together. E.g. a 'q' is always followed by a 'u', so the q's will appear only within a certain region of the BWT-sequence. Therefore this sequence is easier to compress with Move-To-Front- and Huffman-Coding, which are the next steps in bzip2 and make the sequence shorter actually.
By the way, I did pattern matching on BWT-sequences in my diploma thesis, that's why I found the solution to this challenge at first glance (in contrast to some other crypto challenges).
By the way, I did pattern matching on BWT-sequences in my diploma thesis, that's why I found the solution to this challenge at first glance (in contrast to some other crypto challenges).
About the meaning of the number: in this case it denotes which character of the original text appears first in the encoded text. In the given text, the rotation starting with " Burr..." is lexically smallest, so the 'e' of "the" in front of it appears first in the transformed output, and this is character #34 of the original text. To the decoder, this means that after the text has been reconstructed by iterating the "sorting permutation" starting with the first character, the correct rotation will be obtained after moving the last 34 characters to the front of the string.
Note that this is quite similar in spirit, yet opposite in effect to the number expected by tails and papa. In that case the decoder would obtain the correct rotation by applying the "sorting permutation" repeatedly, starting with the specified character of the transformed string.
This is the first challenge I have created and I'm sorry if the number caused much confusion... I wasn't sure whether to include it in the puzzle because there is more than one equally useful way to convey the missing information to the decoder and this puzzle is most likely just as solvable without it (modulo rotation).
Cheers and happy puzzling,
Joachim
Note that this is quite similar in spirit, yet opposite in effect to the number expected by tails and papa. In that case the decoder would obtain the correct rotation by applying the "sorting permutation" repeatedly, starting with the specified character of the transformed string.
This is the first challenge I have created and I'm sorry if the number caused much confusion... I wasn't sure whether to include it in the puzzle because there is more than one equally useful way to convey the missing information to the decoder and this puzzle is most likely just as solvable without it (modulo rotation).
Cheers and happy puzzling,
Joachim
Yes, it was a big hint for me. I was like this:adum wrote:was the title of the challenge a hint for you guys?
(1) frequency analysis -> hmm, it seems a transposition cipher.
(2) "Not Smaller Yet" -> going to be smaller?
(1) + (2) -> Aha!
No, I didn't even know the word BWT. But I know the very basic things about how bzip2 works. It's called as "block sort" in Japan, so this time I googled it (actually in Japanese) and found the Wikipedia article.adum wrote:were you already familiar with BWT?
I recognized it even without the challenge title, but the title made it even more clear.
I know BWT because I once tried to implement DEFLATE in PHP (didn't make it but that's why i know about it). For the challenge I did BWT in PHP, not really hard, the number was very confusing though.
Quite nice challenge, maybe I should take a (closer) look at MTF und Huffman again aswell.
I know BWT because I once tried to implement DEFLATE in PHP (didn't make it but that's why i know about it). For the challenge I did BWT in PHP, not really hard, the number was very confusing though.
Quite nice challenge, maybe I should take a (closer) look at MTF und Huffman again aswell.
-
- Forum Admin
- Posts: 496
- Joined: Sat May 28, 2011 9:14 am
- Location: Germany