'Say It' [Misc]

Drifter
Posts: 2
Joined: Tue Dec 09, 2008 11:36 am

Post by Drifter »

  • Aggravation: *Curses the creator of this challenge*
  • Pity: *I hope your friend has a good calling plan*
  • Denial: *Your friend must be a computer, b/c 'he read' (for continuously)*
  • Self-Pity: *Do we have to correct mistakes made while reading?*
  • Fear: *Hopes your friend isn't dyslexic*
The original file is 4 hours, 2 minutes, 7 seconds (14527 seconds). At 70% play speed, for transcription and safety, that is 5 hours, 45 minutes, 53 seconds (20753 seconds). Add two child labors again for transcription safety and integrity; and the resultant

While the premise is relatively simple, compression from speech to hex looks too painful. 40 bytes in the first 25 seconds extrapolates to a final file size of about 23 KB.
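The estimates above can be reproduced with a quick calculation. This is only a sketch of the arithmetic; the helper name `hms` is made up for illustration, and the byte rate (40 bytes per 25 seconds) is the figure quoted in the post, assumed to hold for the whole recording.

```python
def hms(seconds):
    """Split a whole number of seconds into (hours, minutes, seconds)."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return hours, minutes, secs

original = 4 * 3600 + 2 * 60 + 7        # 4 h 2 min 7 s of audio
slowed = round(original / 0.7)          # listening time at 70% play speed
estimated_bytes = original / 25 * 40    # assumes 40 bytes per 25 s throughout

print(original, hms(original))
print(slowed, hms(slowed))
print(round(estimated_bytes))
```

The last figure lands at roughly 23 thousand bytes, which is where the "23 KB" estimate comes from.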

Anyone care to recommend a speech recognition engine (preferably open source)?

This was before I realized that it was not 4:02 minutes but hours. The first 240 bytes are:

89504E470D0A1A0A0000000D4948445200000078
0000005A0802000000FC6205B800000001735247
4200AECE1CE9000000097048597300000B130000
0B1301009A9C180000000774494D4507D8071403
171AA545EB8F0000001974455874436F6D6D656E
74004372656174656420776974682047494D5057
810E17000020004944415478DA7CBC69AC6DD971
1E56C35A6BEF7D867BEE7CEF9BA7EED7CD26BB39
8B4D89438B8B461AB06311A000590A2843919040
91292B13E0244E2028B163433F12C1B013409194
C0F290C888048B1425311145A4A2D8249B3DF075
F77BDD6F7E773CF78C7B586B55557EDC1F010992
This block is about 1% of the unchecked raw file.
Feel free to edit if you feel this reveals too much...

They say a picture is worth a thousand words; this one is worth ~12 thousand...
! and I only need one.
- Another atomatron trudges senselessly away.
MagneticMonopole
Posts: 26
Joined: Fri Nov 07, 2008 3:19 pm

Post by MagneticMonopole »

Hi Drifter,

and, as you will without doubt have noticed, the PNG format is exceptionally unforgiving. A single wrong byte will not result in an ill-shaded pixel, but in an almost completely corrupted file, thanks to the mandatory use of compression and checksums in PNG. :D
efe
Posts: 45
Joined: Sun Oct 26, 2008 10:28 am
Location: germany

Post by efe »

@Drifter: There are at least 5 errors in your 240 bytes.
I compared it to my result and listened again to the differing bytes.
I also found an error in my byte sequence. :)

Sometimes it's hard to distinguish between 5 and 6 (e.g. 62 or 52).
gfoot
Posts: 269
Joined: Wed Sep 05, 2007 11:34 pm
Location: Brighton, UK

Post by gfoot »

Checksums work in your favour if the input data could be corrupted.
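To make that concrete: every PNG chunk is a 4-byte big-endian length, a 4-byte type, the data, and a 4-byte CRC-32 computed over the type and data. A rough sketch of a checker follows, using only Python's standard `struct` and `zlib` modules. The function names `check_png_chunks` and `make_chunk` and the demo chunk contents are made up for illustration, not taken from the challenge file.

```python
import struct
import zlib

PNG_SIGNATURE = bytes.fromhex("89504E470D0A1A0A")

def check_png_chunks(data):
    """Yield (offset, chunk_type, status) for each chunk found in `data`.

    status is "ok", "bad crc", or "truncated" (chunk runs past the end).
    """
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("PNG signature mismatch")
    pos = 8
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        end = pos + 8 + length + 4
        if end > len(data):
            yield pos, ctype, "truncated"
            return
        body = data[pos + 8:pos + 8 + length]
        (stored,) = struct.unpack(">I", data[end - 4:end])
        # The CRC covers the chunk type and data, not the length field.
        actual = zlib.crc32(ctype + body) & 0xFFFFFFFF
        yield pos, ctype, "ok" if actual == stored else "bad crc"
        pos = end

def make_chunk(ctype, body):
    """Build a well-formed chunk (demo helper only)."""
    crc = zlib.crc32(ctype + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)

# Demo with a made-up two-chunk file, then the same file with one bit flipped.
ihdr = struct.pack(">IIBBBBB", 120, 90, 8, 2, 0, 0, 0)
good = PNG_SIGNATURE + make_chunk(b"IHDR", ihdr) + make_chunk(b"IEND", b"")
bad = bytearray(good)
bad[16] ^= 0x01  # corrupt the first data byte of IHDR

print(list(check_png_chunks(good)))
print(list(check_png_chunks(bytes(bad))))
```

A failed CRC only tells you which chunk contains a mistake, not which byte, so the small header chunks are easy to fix while a large IDAT chunk still leaves a lot to re-listen to. If a length field itself was misheard, the walk will go off the rails from that chunk onward.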
MichaBln
Posts: 18
Joined: Tue Nov 11, 2008 1:55 pm
Location: Berlin, GER

Post by MichaBln »

Hi,

I'm getting approximately 95% right ... still my PNG is corrupt.
My way of approaching the whole thing works pretty well ... I made kind of my own learning speech-recognition software. Unfortunately I'm really not good at audio fingerprinting and such, so I tried using my own algorithms for the recognition by analysing characteristics of each number.
Still, that 5 / 6 problem as mentioned is really sick ... I don't know if it's a good idea, but I'm going to try using FFT and analyzing the frequencies themselves.
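For anyone wondering what the FFT idea might look like in practice, here is a very rough sketch: take a frame of samples, compute its spectrum, sum the power in a few frequency bands, and pick the closest stored template. Everything here is an assumption for illustration (the function names, eight equal-width bands, nearest-template matching), and synthetic sine tones stand in for real recordings of spoken digits.

```python
import cmath
import math

def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; length must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def band_energies(frame, bands=8):
    """Spectral power summed into equal-width bands, normalised to sum to 1."""
    spectrum = fft(frame)[: len(frame) // 2]   # non-negative frequencies only
    power = [abs(c) ** 2 for c in spectrum]
    width = len(power) // bands
    energies = [sum(power[i * width:(i + 1) * width]) for i in range(bands)]
    total = sum(energies) or 1.0
    return [e / total for e in energies]

def nearest_label(features, templates):
    """Label of the template closest in squared Euclidean distance."""
    return min(templates, key=lambda label: sum(
        (a - b) ** 2 for a, b in zip(features, templates[label])))

def tone(bin_index, n=256):
    """Synthetic test frame: a sine that falls exactly on one FFT bin."""
    return [math.sin(2 * math.pi * bin_index * i / n) for i in range(n)]

# Two made-up "templates" and one unknown frame close to the second one.
templates = {"tone_a": band_energies(tone(10)),
             "tone_b": band_energies(tone(40))}
print(nearest_label(band_energies(tone(41)), templates))
```

Real speech would need templates built from actual recorded digits, a window function, and several frames per word, so treat this as a starting point rather than a solution to the 5 / 6 problem.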

I've got the feeling I'm close (and so, it seems, are others ...) but as mentioned, 99.9% won't do the job.

Anyways ... nice challenge.

Michael
helly0d
Posts: 29
Joined: Fri Feb 13, 2009 2:10 am
Location: Iasi Romania

Post by helly0d »

Can I ask if it is really true that of the 18 hours of the track just 6 are the picture, because the other 12 hours are just 2 repetitions of the picture?
And am I close if I get a 4 KB PNG filled with black?
Arkondi
Posts: 3
Joined: Wed Jan 27, 2010 7:54 am
Location: Germany

Post by Arkondi »

[quote="helly0d"]Can I ask if it is really true that of the 18 hours of the track just 6 are the picture, because the other 12 hours are just 2 repetitions of the picture?
And am I close if I get a 4 KB PNG filled with black?[/quote]

There are no repetitions; the file you will get has 22536 bytes and it is not filled with black.
Masti6
Posts: 55
Joined: Sat May 15, 2010 12:04 pm
Location: Finland, Nurmes

Post by Masti6 »

This is one of the most frustrating challenges yet.
Any speech-to-text software (other than YouTube?) out there that you could share?
wolf may cry
Posts: 6
Joined: Fri Nov 05, 2010 9:20 am

Post by wolf may cry »

Anyone still working on this challenge?
I've tried some speech-to-text software, but it doesn't work at all.
Do I have to write special speech-to-text software for this challenge? Because I have no idea how it works... :cry:

Thanks
AMindForeverVoyaging
Forum Admin
Posts: 496
Joined: Sat May 28, 2011 9:14 am
Location: Germany

Post by AMindForeverVoyaging »

There really should be a warm-up challenge for this one, introducing you to speech processing/voice recognition.

The "solving rate" for this challenge is less than one person per year. If that is not a clear indicator that the difficulty level is way overdone (for a challenge that stands on its own and has no warm-up), then I don't know what is.
aurora
Posts: 54
Joined: Thu Feb 05, 2009 12:31 pm
Location: Bavaria, Germany

Post by aurora »

I am trying speech-to-text now, but because of the bad quality my hopes that this will work are not very high. I still have another idea for solving it, but that will surely mean "some" work in coding ...
aurora
Posts: 54
Joined: Thu Feb 05, 2009 12:31 pm
Location: Bavaria, Germany

Post by aurora »

Could someone who solved this problem verify that it's indeed 22536 bytes? I currently have 25442 to work with. (I gave up on speech-to-text, btw, and am trying a different approach.)
AMindForeverVoyaging
Forum Admin
Posts: 496
Joined: Sat May 28, 2011 9:14 am
Location: Germany

Post by AMindForeverVoyaging »

aurora wrote:could someone who solved this problem verify, that it's indeed 22536 bytes?
Well that figure is from Arkondi, who *is* one of the (very few) people who solved it.
aurora
Posts: 54
Joined: Thu Feb 05, 2009 12:31 pm
Location: Bavaria, Germany

Post by aurora »

AMindForeverVoyaging wrote:
aurora wrote:could someone who solved this problem verify, that it's indeed 22536 bytes?
Well that figure is from Arkondi, who *is* one of the (very few) people who solved it.
errr ... right. I think I could have figured this out myself, sorry. I probably had the wrong tool anyway, and I can confirm now that with the right tool you can indeed get 22536 bytes out of this input. The toughest part is still ahead, though :( ...
Konk
Posts: 2
Joined: Sun Mar 08, 2009 10:30 am

Post by Konk »

Like many others, I get about 2% errors in my speech-detected files, so I corrected some hours manually...
Since my PNG looks very corrupt and 2/3 black, I would like to ask if anybody with the solution would have a look at my numbers. If anybody has a text file, it's just a simple file compare. Currently I am pretty sure that the first 8336 numbers are correct. In the PNG only the first 5 lines are nice; the rest is corrupted pixels and black.
So if anybody would be willing to have a quick look at my numbers, maybe you can drop me a message. I would appreciate this very much.
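The "simple file compare" could look something like the sketch below. It assumes both transcripts are plain text files of hex digits, and it uses Python's standard `difflib` so that a dropped or doubled byte shows up as a single insert or delete instead of shifting every later position. The function names are made up for illustration.

```python
import difflib

def parse_hex(text):
    """Turn a hex transcript (any whitespace or line breaks) into a list like ['89', '50', ...]."""
    digits = "".join(text.split()).upper()
    if len(digits) % 2:
        raise ValueError("odd number of hex digits")
    return [digits[i:i + 2] for i in range(0, len(digits), 2)]

def diff_transcripts(mine, theirs):
    """List (tag, byte_offset_in_mine, my_bytes, their_bytes) for every difference."""
    a, b = parse_hex(mine), parse_hex(theirs)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [(tag, i1, a[i1:i2], b[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]

# A misheard 5/6 and a dropped byte, on made-up snippets.
print(diff_transcripts("0000005A0802", "0000006A 0802"))
print(diff_transcripts("AABBCCDD", "AABBDD"))
```

On a full transcript of over twenty thousand bytes the matcher may take a while, but it only has to run once per comparison.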