voice recognition systems - Help!! at DVinfo.net

Old November 22nd, 2009, 05:07 AM   #1
Tourist
 
Join Date: Nov 2009
Location: Cardiff, Wales
Posts: 2
voice recognition systems - Help!!

I am considering buying a voice recognition programme and am looking for any and all advice I can get. I would need it NOT to recognise my voice but someone else's. Here's the deal: I recently shot some interviews with my father and, rather than transcribe it all by hand, I need the software to recognise his voice from playback. I am currently using Edius Neo for editing but am considering upgrading my system (possibly CS4). Any advice would be much appreciated. Thanks.
Rhys Carter is offline   Reply With Quote
Old November 22nd, 2009, 07:12 AM   #2
Trustee
 
Join Date: Sep 2004
Location: Bristol, CT (Home of EPSN)
Posts: 1,182
Won't work, but here's an alternative.

Dragon Naturally Speaking is the standard in voice recognition. It's very good, but it's not perfect. You need to speak a bit differently than you normally would to get maximum accuracy: you'll enunciate more, and you need to train the system to your voice.

My suggestion is that you listen to the interviews through an earphone and repeat them (audio prompting) as you are listening, using DNS. Your results will be much better than using the interview audio, although you may want to try it.

I use DNS for creative and instructional writing. It's great for that, once you learn its idiosyncrasies and get used to composing verbally.
__________________
Paul Cascio
www.pictureframingschool.com
Paul Cascio is offline   Reply With Quote
Old November 22nd, 2009, 10:42 AM   #3
Trustee
 
Join Date: May 2004
Location: USA
Posts: 1,550
I've used Dragon Naturally Speaking and I don't like it. The problem is that it takes just as long to go through and fix all the mistakes as it does to simply type it yourself. If you can't type quickly or don't want to do it, then pay someone. Go cheap and find a high school/college student to do it.
Pete Cofrancesco is offline   Reply With Quote
Old November 23rd, 2009, 10:25 PM   #4
Major Player
 
Join Date: May 2004
Location: Louisville, KY
Posts: 378
I'm also thinking about getting Dragon. The basic version is only $45 on their site right now. I just wish I could use a trial first.
Eric Stemen is offline   Reply With Quote
Old November 26th, 2009, 12:54 PM   #5
Tourist
 
Join Date: Nov 2009
Location: Cardiff, Wales
Posts: 2
thanks for your replies

Thanks for your replies guys - indeed I must agree a trial would be ideal.
Having done a little more research into DNS, I have read some positive reviews but have also seen some not so positive.

The main complaint seems to be lack of technical support from the manufacturers.
The other, understandably I suppose, is the mistakes it makes - this is inevitable if you ask me, as no voice rec system can be 100% correct 100% of the time.

Buyers of version 10 seem to feel let down, and they suggest hanging on to version 9 or 9.5 if you already have it.
Oh, what to do???!
Rhys Carter is offline   Reply With Quote
Old November 27th, 2009, 09:59 PM   #6
Trustee
 
Join Date: Sep 2004
Location: Bristol, CT (Home of EPSN)
Posts: 1,182
Quote:
Originally Posted by Pete Cofrancesco View Post
I've used Dragon Naturally Speaking and I don't like it. The problem is that it takes just as long to go through and fix all the mistakes as it does to simply type it yourself. If you can't type quickly or don't want to do it, then pay someone. Go cheap and find a high school/college student to do it.
Pete, I disagree. DNS is not good for everything, but it's great for getting your ideas into written form. I find that it's even good for later revisions. Consider giving it another chance. If you're patient, learn its idiosyncrasies, and accept the limitations of voice recognition, it's a great creative tool.
__________________
Paul Cascio
www.pictureframingschool.com
Paul Cascio is offline   Reply With Quote
Old November 27th, 2009, 10:28 PM   #7
Inner Circle
 
Join Date: Nov 2006
Location: Tallahassee, FL
Posts: 4,100
We've used DNS since its inception. In fact, the creator of the system did a demo for my company when the product was still in beta. Last year, we bought the latest version of it to try and use it as a way to transcribe video for us. That failed miserably. Then we used the audio-prompting and that was better, but only after spending considerable time training it.

It's just not the correct solution. If you REALLY want to get this done properly, with good accuracy, and you don't mind paying a bit of money, PM me, and I'll get you the info you really need to make this work.

In case you are wondering, my company has to be compliant with Federal Rule 508 which dictates that we provide captioning for ANY video we create for public consumption or for use within the company. Essentially, everything I shoot at the office has be be transcribed and captioned. We do this a lot.
__________________
DVX100, PMW-EX1, Canon 550D, FigRig, Dell Octocore, Avid MC4/5, MB Looks, RedCineX, Matrox MX02 mini, GTech RAID, Edirol R-4, Senn. G2 Evo, Countryman, Moles and Lowels.
Perrone Ford is offline   Reply With Quote
Old November 28th, 2009, 05:35 PM   #8
Trustee
 
Join Date: May 2004
Location: USA
Posts: 1,550
Quote:
Originally Posted by Paul Cascio View Post
Pete, I disagree. DNS is not good for everything, but it's great for getting your ideas into written form. I find that it's even good for later revisions. Consider giving it another chance. If you're patient, learn its idiosyncrasies, and accept the limitations of voice recognition, it's a great creative tool.
I stand by what I said. For his purposes it makes even less sense because he wants to transcribe someone else's voice. I've taken the time in the past to train it to my voice, and the worst thing is that it commonly misinterprets an entire phrase. So you might say "I took my dog for a walk" and it will transcribe that as "I draw with chalk". In addition, say any non-dictionary word like someone's name, a technical term, street, town, abbreviation, short phrase, etc., and it spits out the wrong words. When you go back and proofread later, it's easy to miss because both the grammar and spelling are correct, but it has transcribed something completely different from what was said. If DNS were that accurate and saved time, court reporters would use it, and they don't, because it's simply better to spend the time to type it correctly the first time.
Pete Cofrancesco is offline   Reply With Quote
Old November 28th, 2009, 05:48 PM   #9
Inner Circle
 
Join Date: Nov 2006
Location: Tallahassee, FL
Posts: 4,100
A properly trained DNS system can be faster than a court reporter on a longhand keyboard, but not even close for a certified reporter on a shorthand machine.

In any event DNS is not the correct solution to this problem.
__________________
DVX100, PMW-EX1, Canon 550D, FigRig, Dell Octocore, Avid MC4/5, MB Looks, RedCineX, Matrox MX02 mini, GTech RAID, Edirol R-4, Senn. G2 Evo, Countryman, Moles and Lowels.
Perrone Ford is offline   Reply With Quote
Old November 28th, 2009, 05:57 PM   #10
Trustee
 
Join Date: May 2004
Location: USA
Posts: 1,550
Quote:
Originally Posted by Perrone Ford View Post
A properly trained DNS system can be faster than a court reporter on a longhand keyboard, but not even close for a certified reporter on a shorthand machine.

In any event DNS is not the correct solution to this problem.
That's an odd caveat to add, as if a certified reporter didn't use shorthand. That's like saying I could outplay Tiger Woods at golf if he used a stick instead of a golf club. Besides, even if a court reporter used a keyboard, they're going to be 98+% accurate. Can you make that claim with DNS? Because who cares how fast DNS can transcribe if it's not accurate. That's my point.
Pete Cofrancesco is offline   Reply With Quote
Old November 28th, 2009, 06:15 PM   #11
Inner Circle
 
Join Date: Nov 2006
Location: Tallahassee, FL
Posts: 4,100
Pete,

I understand your point. However, I also understand some of the limitations of having court reporters do certain work. And in some instances they cannot use their steno machines. Like when they have to feed computer systems that are captioning on the fly. If we can tie them to dedicated CC hardware it's no issue, but that isn't always the case.
__________________
DVX100, PMW-EX1, Canon 550D, FigRig, Dell Octocore, Avid MC4/5, MB Looks, RedCineX, Matrox MX02 mini, GTech RAID, Edirol R-4, Senn. G2 Evo, Countryman, Moles and Lowels.
Perrone Ford is offline   Reply With Quote
Old November 28th, 2009, 07:04 PM   #12
Trustee
 
Join Date: May 2004
Location: USA
Posts: 1,550
I'm not familiar with CC, but I think technology has progressed to allow steno machines to interface digitally. But often people come here looking for an impossibly cheap solution, like "I need a good shotgun mic for $50"...

I work with court reporters, and they often have their steno machine plugged into a laptop which converts the shorthand to ordinary text, which can then be sent in real time to netbooks for clients to view.
Pete Cofrancesco is offline   Reply With Quote
Old November 28th, 2009, 08:50 PM   #13
Inner Circle
 
Join Date: Jan 2004
Location: Boca Raton, FL
Posts: 2,979
Rhys,
I worked in this field for many years with some of the best technology around. Google me.

Fundamentally, systems that can recognize large vocabularies with accuracy in the 95% and up range are systems that are trained to a person's voice, vocabulary and speaking habits. Even still, the high success rates are achieved with skilled speakers using systems trained to their voice. Yes, inevitably, getting a dictation system to have low errors comes with help from the speaker adapting to the system.

The best systems will only get about 70% accuracy out of the box. Over time, training of the system by reading to it, dictating to it, and telling it the errors it made will raise the accuracy.

If you are willing and your father is able, you can have him go through the training procedure. If your system allows you to run your recorded audio through it, you can then start the process of correcting and feeding back corrections.

Like others have said here, by the time you are done, even with a system running at 90% accuracy, you are fixing one out of every 10 words and you may have been better off doing it yourself. YMMV
Les Wilson is offline   Reply With Quote
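
The "one out of every 10 words" arithmetic above can be sketched in a few lines of Python. This is only a rough illustration: the 5,000-word transcript length is a made-up figure, the function name is hypothetical, and it assumes every misrecognised word needs exactly one manual fix, which real dictation errors (whole phrases going wrong at once) will not always match.

```python
def corrections_needed(total_words: int, accuracy: float) -> int:
    """Estimate how many words must be fixed by hand at a given word accuracy."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return round(total_words * (1.0 - accuracy))


# Hypothetical 5,000-word interview transcript (assumed length, not from the thread).
# 70%, 90% and 95% are the accuracy figures mentioned in the post above.
for acc in (0.70, 0.90, 0.95):
    fixes = corrections_needed(5000, acc)
    print(f"{acc:.0%} accuracy: about {fixes} words to fix")
```

Even at the 95% end, a transcript of that assumed length would still leave a few hundred words to find and correct, which is the trade-off being argued over in this thread.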
Old November 28th, 2009, 09:34 PM   #14
Major Player
 
Join Date: Aug 2006
Location: Petaluma, CA
Posts: 456
Hmm, I have DNS 10 on my laptop and version 7 on this old PC. Still, I've had good results not knowing much about how to correctly use it. Below is an example reading the text of a prior post. For me, it's an easy read through to make a few minor corrections (I see I don't know the command to capitalize a word - it's not "cap that" but I'm sure I could easily look it up if I had the need). Also, I noticed that when I captured the original text in MS-Word before copying it over here, it caught the typo by the original poster (double words "be be" at the end of the post), but I left it as-is.

So when I've got hours of text to transcribe, I'll continue to use DNS to help. One thing I've learned is not to worry about how DNS is transcribing as you read - just let it do its thing and make the corrections later (much faster workflow).

However, if we're doing a project requiring closed caption for a customer and had the original text/script, I'd be the first to think about using a transcription service.

Regards, Michael

We've used Dragon NaturallySpeaking since its inception. In fact, the creator of the system did a demo for my company when the product was still in beta. Last year, we bought the latest version of it to try and use it as a way to transcribe video for us. That failed miserably. Then we used the audio-prompting and that was better, but only after spending considerable time training at.

It's just not the correct solution. But If You Really want to get this done properly, with good accuracy, and you don't mind paying a bit of money, P. M. may, and all get two the info you really need to make this work.

In case you're wondering, my company has To Be Compliant with Federal Role 508 which dictates that we provide captioning for Any video we create for public consumption or four years within the company. Essentially, everything I shoot at the office has be be transcribed and captioned. We do this a lot.
Michael Nistler is offline   Reply With Quote
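
For anyone who wants to put a number on a sample like the one above, one common yardstick is word error rate (WER): the word-level edit distance between the original text and the recognised text, divided by the number of words in the original. The sketch below is a minimal, assumption-laden version: the function names are hypothetical, the tokenizer simply lowercases and drops punctuation, and it compares only one clause from the original post against the matching clause in the DNS output.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref, hyp = tokenize(reference), tokenize(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from hypothesis
                            curr[j - 1] + 1,      # extra word in hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# One clause from the original post and the matching clause from the DNS output.
reference = "PM me, and I'll get you the info you really need to make this work."
hypothesis = "P. M. may, and all get two the info you really need to make this work."
print(f"WER: {word_error_rate(reference, hypothesis):.2f}")
```

Accuracy as quoted in this thread is roughly 1 minus WER, although a single clause is far too small a sample to say anything about a whole transcript.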
Old November 29th, 2009, 05:57 PM   #15
Major Player
 
Join Date: May 2004
Location: Louisville, KY
Posts: 378
Thanks for the example Michael. It looks like that would be pretty quick to go back through and fix the mistakes.
Eric Stemen is offline   Reply