Transcription HTML Handy Tool

DV Info Net > Windows / PC Post Production Solutions > Non-Linear Editing on the PC

Non-Linear Editing on the PC
Discussing the editing of all formats with Matrox, Pinnacle and more.

August 2nd, 2009, 01:08 AM   #1
Regular Crew
Join Date: Jun 2009
Location: Winnipeg, Manitoba, Canada
Posts: 43
Transcription HTML Handy Tool

Problem: How do you give a client the audio for an interview they need to listen to, along with the transcription metadata XML file, so they can study roughly what was said?

Supposedly this metadata can be encoded with Flash, but I've got no idea how Flash displays it, or what you can do with it other than just look at it. I dunno.

OK, I wrote this tool today. It's very, very basic, but if you know the code, there are hints of how great this could become.

I can't expect my client to have Soundbooth kicking around, though it has handy features like renaming words, deleting words, and so on: things someone can change to improve the quality of the transcription and get a better handle on picking out talking points. Those features are beyond my skills. But I can do this:

To use this code, you'll need to do a few VERY simple things.
#1 Grab this file. It's got two HTML files in there. Unzip the files into a folder somewhere.
Free File Hosting Made Simple - MediaFire

#2 Take the audio file (.wav, .mp3, whatever) and the resulting transcription file from Soundbooth, copy them to the folder above, and rename the .xml file to 'voices.xml'. The file name is fixed in the code, but it's easy to change.

#3 You need Media Player Classic. In the options tab, select Web Interface.
- Check Listen on port ##### (the default is 13579; check your firewall settings, though it shouldn't matter since everything is served from your localhost).
- Check Serve pages from: and select the folder where you put the files.
- Default Page should be the file named 'MPC.html'

#4 Open the audio file via Media Player Classic. Pause it, play it, whatever.

#5 Open the following link in Firefox


Two things are happening here. The first is that a smaller window is being opened (make sure you allow popups from localhost). This window is just a simple workaround for a problem I have submitting the data to the web server.
The other thing that is happening is that the transcription file (voices.xml) is being parsed and made ready to be displayed. I've set the max words shown to 1500. Increase that number if you choose.

You'll see a timecode displayed on the left, starting at 00:00:00. To the right are 20 words. The next line is another timecode and another 20 words, and so it goes.
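
For anyone curious how that layout could be built, here is a minimal sketch of the grouping step. It is not the code from the download. The word shape ({ text, startMs }) and the function name groupWords are made up for illustration, and the real Soundbooth XML fields may be named differently.

```javascript
// Sketch only: assumes each parsed word looks like { text: "hello", startMs: 1234 }.
// That shape is hypothetical; the actual Soundbooth XML may use other field names.
function groupWords(words, wordsPerLine = 20, maxWords = 1500) {
  // Honour the cap on how many words get shown at all.
  const shown = words.slice(0, maxWords);
  const lines = [];
  for (let i = 0; i < shown.length; i += wordsPerLine) {
    const chunk = shown.slice(i, i + wordsPerLine);
    // Each display line starts at the start time of its first word.
    lines.push({ startMs: chunk[0].startMs, words: chunk });
  }
  return lines;
}

// Usage: groupWords(parsedWords) gives one entry per display line,
// each holding up to 20 words plus the time of the first one.
```

Each entry would then be rendered as one row: the line's start time on the left, its words on the right.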

The timecodes and the words are both clickable, and clicking one moves the playhead of Media Player Classic to the corresponding timecode.
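
A rough sketch of what a click could send to the player follows. Big assumption here: that the web interface takes a seek request at /command.html with wm_command=-1 and a position given as hh:mm:ss. That is how I understand the MPC web interface to work, but check it against your own build. The function name buildSeekUrl is made up.

```javascript
// Sketch only. ASSUMPTION: the Media Player Classic web interface accepts a
// seek at /command.html?wm_command=-1&position=hh:mm:ss. Verify on your build.
function buildSeekUrl(timecode, port = 13579, host = "localhost") {
  // Only whole-second timecodes like 00:01:30 are accepted here.
  if (!/^\d{2}:\d{2}:\d{2}$/.test(timecode)) {
    throw new Error("Expected a timecode like 00:01:30, got: " + timecode);
  }
  return "http://" + host + ":" + port +
    "/command.html?wm_command=-1&position=" + timecode;
}

// Usage: buildSeekUrl("00:01:30")
// gives "http://localhost:13579/command.html?wm_command=-1&position=00:01:30"
```

The page would request that URL in the background (or through the small popup window) when a word or timecode is clicked.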

So it's really easy to click around a document and immediately hear the audio from that spot in the transcription.

The downside is that the transcription is SO off the mark, it's useless. I couldn't make heads or tails of what I was seeing.

About the code:
I've got a few things in there that would change the size or colour of the font according to the confidence level of the transcription. Trouble words would be redder or taller than the more accurate words, which would be greener and smaller. I ran into a few odd tag structures in one of my larger .xml files and was getting errors parsing the confidence data, so I disabled this feature. Confidence is always set at 40.
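
To show the idea, here is a sketch of a confidence-to-style mapping. It assumes confidence runs from 0 to 100, which I have not confirmed against the XML. The function name, the colour formula and the pixel sizes are all made up for illustration. When the value can't be read it falls back to 40, same as the fixed value mentioned above.

```javascript
// Sketch only. ASSUMPTION: confidence is a number from 0 to 100.
// Low confidence gives red and taller text, high confidence gives green and smaller.
function styleForConfidence(confidence) {
  // Fall back to 40 when the parsed value is missing or not a number.
  const raw = Number.isFinite(confidence) ? confidence : 40;
  const c = Math.min(100, Math.max(0, raw));
  const hue = Math.round(c * 1.2);          // 0 is red, 120 is green
  const sizePx = Math.round(20 - c * 0.08); // 20px at 0 down to 12px at 100
  return { color: "hsl(" + hue + ", 80%, 35%)", fontSize: sizePx + "px" };
}

// Usage: apply the result to a word's element, for example
//   Object.assign(span.style, styleForConfidence(word.confidence));
```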

There is no feedback from Media Player Classic as to where it is in a document. I've not figured out all the back-and-forth communication on this yet. What should happen is that each word gets highlighted as it is played.
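
If the current playback position could be read from the player somehow (that part is still unsolved, as said above), picking the word to highlight is the easy bit. A sketch, assuming the words are sorted by start time and shaped like { text, startMs }; both the shape and the function name are hypothetical:

```javascript
// Sketch only: assumes words are sorted by startMs and positionMs is the
// current playback position in milliseconds, obtained by some other means.
function findCurrentWordIndex(words, positionMs) {
  let lo = 0;
  let hi = words.length - 1;
  let found = -1; // -1 means playback has not reached the first word yet
  while (lo <= hi) {
    const mid = Math.floor((lo + hi) / 2);
    if (words[mid].startMs <= positionMs) {
      found = mid;   // this word has started; a later one may have too
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return found;
}
```

This is a plain binary search for the last word whose start time is at or before the playback position, so it stays fast even on long interviews.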

Time is only accurate to the second, since I'm not sure of any more precise way of setting the playhead. The metadata gives you millisecond accuracy, but the value gets divided by 1000 so Media Player Classic can be told where to skip to (it expects times as '00:00:00').
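
The conversion itself is simple enough. A sketch of turning a millisecond value into the '00:00:00' form, dropping the fraction of a second (the function name is made up):

```javascript
// Sketch only: converts milliseconds to hh:mm:ss, truncating to whole seconds.
function msToTimecode(ms) {
  const totalSeconds = Math.floor(Math.max(0, ms) / 1000);
  const hours = Math.floor(totalSeconds / 3600);
  const minutes = Math.floor((totalSeconds % 3600) / 60);
  const seconds = totalSeconds % 60;
  const pad = (n) => String(n).padStart(2, "0");
  return pad(hours) + ":" + pad(minutes) + ":" + pad(seconds);
}

// Usage: msToTimecode(3723456) gives "01:02:03"
```

So a word that starts at 1999 ms ends up at 00:00:01, which is where the up-to-one-second inaccuracy comes from.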

Would be sweet to edit the words, delete them, do those things that can be done in Soundbooth.

As I said, the XML parsing is rough and prone to errors. And if you changed any of the words, I'm not sure how to turn all that data back into an .XML file.
Greg Paulson
