DV Info Net

DV Info Net (https://www.dvinfo.net/forum/)
-   What Happens in Vegas... (https://www.dvinfo.net/forum/what-happens-vegas/)
-   -   Synchronization of Audio and Video from Different Sources (https://www.dvinfo.net/forum/what-happens-vegas/64559-synchronization-audio-video-different-sources.html)

Dale Paterson April 7th, 2006 06:45 AM

Synchronization of Audio and Video from Different Sources
Hello again,

Let me start by describing my setup:

6 Radio Mics (Sony UWP-C3 Transmitters)
Each Mic assigned an input channel on an Alesis FireWire Mixer
Alesis FireWire Mixer attached to Fujitsu Siemens Notebook
6 Audio Channels created in Vegas to capture sound from each Mic/Channel

Recording went perfectly, and all tracks were identical in length and in perfect synch with each other.


Using the Sony FX1 for video (DV) and the camera's audio track as a reference (against the recorded audio tracks above), the video starts out perfectly in synch with the tracks recorded from the mixer (when placed correctly on the Vegas timeline) but gets progressively out of synch by the end of the tape.

In other words - I have 6 sound channels recorded in Vegas from a FireWire mixer, plus the audio track from the camera, which is used as a reference to align or synch the 6 recorded sound channels to the video. Once correctly (manually) aligned on the Vegas timeline, the video and audio are initially in synch, but they seem to drift out of synch by the end of the tape / video footage, i.e. after about an hour.

In order to correct this I have to stretch the video track by a fraction to get the whole video event synchronized with the audio recorded from the mixer.

I am confused about this.

I would have thought that because everything is being recorded digitally that this would not happen.

The project created to record the 6 audio tracks from the mixer had identical settings (48 kHz, 16-bit audio) to the final project where everything, i.e. the 6 audio tracks and the video footage, is brought together again.

Why would the audio recorded from the mixer using Vegas be any different to the audio track of the camera?

Is this possible?

Any ideas or input?



Mike Kujbida April 7th, 2006 09:57 AM

Dale, the only thing that comes to mind is that the camera probably runs in drop-frame mode (i.e. 29.97 frames/sec.).
If the audio stream wasn't clocked to this (i.e. it was running at non-drop or 30 fps), the drift after one hour would be 3.6 sec. (108 frames).
Am I on the right track?
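Mike's figure is easy to verify; a quick sketch of the same arithmetic (purely illustrative):

```python
# Drift after one hour if one device counts 30 fps while the other
# actually runs at the NTSC rate of 30000/1001 (~29.97) fps.
nominal_fps = 30.0
ntsc_fps = 30000 / 1001                       # ~29.970 fps

seconds = 3600                                # one hour of material
frames_counted = nominal_fps * seconds        # 108,000 frames
real_duration = frames_counted / ntsc_fps     # time those frames take at NTSC rate

drift_seconds = real_duration - seconds
print(f"drift after one hour: {drift_seconds:.2f} s "
      f"({drift_seconds * ntsc_fps:.0f} frames)")
```

This reproduces Mike's numbers: 3.6 seconds, or about 108 frames, per hour.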


Seth Bloombaum April 7th, 2006 11:58 AM


Originally Posted by Mike Kujbida
...the camera probably runs in drop-frame mode (i.e. 29.97 frames/sec.)...

I don't think this could be the issue. The camera is at 29.97 fps regardless of how the time code generator is counting the frames. The audio record chain has no frames, it just runs at its sample rate.

I'd be looking deeper into setups in the Alesis Firewire mixer. This is where you want the clock locked to 16/48, this is the clock that determines your audio sampling and sync. Is it possible that the Alesis is set up for 16/44, then you're recording in Vegas at 16/48 while resampling on the fly?
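For scale, if 44.1 kHz samples were simply treated as 48 kHz with no resampling at all, the error would be enormous rather than subtle; a quick sketch of that arithmetic:

```python
# If audio were captured at 44.1 kHz but played back as if it were 48 kHz,
# an hour of real time would play back dramatically short.
capture_rate = 44_100
playback_rate = 48_000

hour = 3600
samples = capture_rate * hour               # samples recorded in one real hour
playback_time = samples / playback_rate     # duration of those samples at 48 kHz

print(f"one hour plays back as {playback_time:.1f} s "
      f"({hour - playback_time:.1f} s short)")
```

That is nearly five minutes short per hour, so a raw rate mismatch is easy to distinguish from the tiny drift being discussed here; on-the-fly resampling, by contrast, preserves duration but costs quality.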

In dealing with your existing recordings, I think you'll have better results shrinking the duration of the audio than expanding the duration of the video. Definitely shorter rendering times, and perhaps better picture quality.

BTW, multitrack recording and posting in Vegas has been very very good to me! Some with Sound Devices 744T, some with Alesis HD24. I've been recording at 24/48.

Dale Paterson April 7th, 2006 12:22 PM

Hi, and thanks for the replies.

The Alesis was set to 16/48 but I don't ever remember clicking on the button that says 'set clock master' but I am not sure if this was ever necessary as 16/48 is the default for the installation.

BTW I am shooting in PAL so I can only assume that I am getting a straight and true 25fps :)

It is the strangest thing though, and I cannot figure it out. And it is not out by that much either, i.e. not even a whole second at the end of the video footage - more like milliseconds.

Maybe I am supposed to click on 'set clock master' in the Alesis software prior to recording in Vegas, BUT if the mixer was actually recording at 16/44 then I am sure that the audio track would differ greatly from the camera's audio track, whereas, like I say, it is only fractionally out.

Just had a thought though - if the audio tracks were recorded at 16/44 then Vegas would identify them as such would it not? Vegas does recognize the individual audio tracks as 16/48.


Seth Bloombaum April 7th, 2006 01:51 PM

Regarding "set clock master", I'm unfamiliar with Alesis' terminology here. If there is anything that looks like "lock sample rate" you want it.

Milliseconds out over an hour may be the best you can do. I'm not enough of a digital circuits person to say whether you "should" be able to do better. We're talking extremely small fractions of a percentage differences if, say, you're 200 milliseconds out after an hour... that's about 0.0055 percent error.

If I've got my math right, that would be about 5 frames at 25fps.

Not bad for clocks that aren't locked to each other?
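Seth's arithmetic can be double-checked with the same worked example (the 200 ms figure is his hypothetical, not a measurement):

```python
drift = 0.200        # hypothetical seconds of drift after one hour
hour = 3600.0

error_pct = drift / hour * 100      # clock error as a percentage
frames_pal = drift * 25             # PAL frames represented by that drift

print(f"clock error: {error_pct:.4f} %")           # ~0.0056 %
print(f"equivalent PAL frames: {frames_pal:.0f}")  # 5 frames
```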

Dale Paterson April 7th, 2006 02:36 PM


I suppose you are right - it is a small margin - but I was always under the impression that because all of this is digital none of this would matter.

I tried your suggestion about stretching the audio track, but it gets far more complicated: although the video is initially in synch with the audio tracks at the beginning, by adjusting the audio tracks to synch with the end of the video you change the properties of the audio tracks in their entirety, so the beginning goes out again, and you have to repeat this process over and over until you get it just right.

The strange thing is that normally when you stretch or compress a video track the frames are recompressed when previewing on an external monitor via FireWire but this is not happening when stretching or compressing the video by such a small amount.

I have another thought though - do you think it makes any difference that I captured the video on another workstation (my desktop editing workstation)? I.e. the audio was captured in the field on a notebook and then copied to my desktop, and the video was then captured from the FX1 straight onto the desktop for synch and editing. I have not tried capturing the video to the notebook and then trying to align the stuff there (another three hours just to test).

Still don't understand this clock thing!

By the way - is there any better way or technique of checking the synchronization? At the moment I preview all of the audio tracks (including the video's original audio track) and move the video / video audio track back and forth on the timeline until there is no echo or delay, at which point I assume the audio of all tracks is in synch - i.e. when the audio tracks are out of synch an echo or delay can be heard, but when they are spot on with each other there is none. Any other techniques?


Seth Bloombaum April 8th, 2006 12:29 PM


Originally Posted by Dale Paterson
...do you think it makes any difference that I captured the video on another workstation...

No, don't think so. A capture over DV/FireWire is really more of a file transfer; no bits get changed from what's on tape.


...By the way - is there any better way or technique of checking the synchronization i.e. I am previewing all of the audio tracks (including the video's original audio track) and then moving the video / video audio track back and forth on the timeline until there is no echo or delay...
With a good ear and some devoted listening this is the best! Due to the way we perceive sound, our hearing is really very accurate for small timing differences (echo). Short of field equipment upgrades to cameras and recorders capable of jamming timecode, or better yet running synced, you can't do better, nor do you need to.

If you've reduced echo to nil, and lip-sync seems right, you're done.

If you're slipping the audio it can be handy to turn off "quantize to frames" so you can get sub-frame sync (better than 1/25th of a second for PAL). Don't forget to turn quantize to frames back on before you touch anything with video.

Barry Oppenheim April 8th, 2006 02:53 PM

I'm not a Vegas user, but I had the same problem once before. Audio started out fine but later on was out of sync. Turned out that during the capture there was a frame drop in a couple of spots. Once I found out where the frame drops were (a tedious procedure) I realigned the audio clips at those points and things were fine.

In Premiere I've never had problems with drift importing 16/44k audio into 16/48k projects.


Dale Paterson April 8th, 2006 11:18 PM

Thanks everyone for the replies and info.

Seth - I followed your advice and stretched the mixer's audio tracks instead of the video, and after only a slight adjustment to the audio the individual video clips just fell into place, which tells me that it is not the video capture but something to do with the audio capture.

By the way, the difference per clip was 0.120 seconds - amazing how we (our ears) are able to perceive such small differences. I'm just glad I was able to align it, and was just surprised that this kind of thing could happen in our digital world. Like you said - small differences like this (the audio tracks from the mixer were actually about 3 hours long and I only had to stretch them by a fraction) are actually not bad for unsynched clocks (although I did not know this was a factor).

Funny enough - only yesterday - after doing the shoot - did it occur to me to turn off things like quantize to frames, snapping, etc. etc. in Vegas when capturing the audio from the mixer on the notebook. Maybe all of these settings played a part but as you say - I did get it right and there you go.



Dale Paterson April 13th, 2006 01:28 PM

Just to update this thread with some further thoughts on the subject:

It has dawned on me that using the audio captured by the camera as a reference to synch the video to the audio captured via the mixer is not actually an ideal or usable method. The reason I say this is as follows:

The camera was picking up the sound from the public address system (which was connected to the mixer's main out). The notebook was recording the sound directly from the wireless mics. In other words - the camera was picking up the audio after it had come from the mics, then to the mixer, then to the public address system, and only then to the camera's mic. Also, the camera was always a fair way away from the public address system. Is it not possible that this sort of 'round trip', compounded by the physical distance between the camera and the public address system, could have caused my video/audio synch problems (although it still would not explain why either the video or audio track would have to be stretched or compressed, given the same sampling rate/bit depth for a given period)?



Mike Rehmus April 13th, 2006 03:31 PM

It is easy to get a one frame difference between the optical record of the event and the audio. This can happen over a shorter distance than you might think. Just for grins, think 600 mph is the speed of sound for this calculation.

We take 30*60*60 frames per hour, that's 108,000 (NTSC)
Let's figure nautical miles cause that's 6000 feet.
600 x 6000 = 3,600,000 feet per hour. But we get to divide it by 3600 to get down to feet per second = 1,000 feet per second.
30 frames per second = 33.3 feet per frame

So you can see that you don't have to be too far away before the difference between a camera-local microphone and a radio microphone (or an action-local recording system) becomes significant in sound recording timing.

Even a small wedding can cause problems.
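Mike's round-number arithmetic is easy to sketch directly; the second figure below uses the more usual sea-level value of roughly 1,125 ft/s (an assumption about temperature and conditions):

```python
# Feet-per-frame arithmetic for NTSC, using Mike's round number (1,000 ft/s)
# alongside a more typical sea-level speed of sound (~1,125 ft/s at 20 C).
video_fps = 30

for label, speed_ft_s in (("round number", 1000), ("typical ~20 C", 1125)):
    feet_per_frame = speed_ft_s / video_fps
    print(f"{label}: sound travels {feet_per_frame:.1f} ft per NTSC frame")
```

Either way, roughly 33-38 feet of distance between source and microphone costs about one NTSC frame of delay.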

Steve House April 13th, 2006 08:39 PM

Some digital A/D converters run at 48.048kHz instead of exactly 48kHz but produce files that are timestamped as 48kHz. The purpose is to produce a 0.1% slowdown to match the speed of 30FPS viewed at 29.97FPS for editing film on video in Avid etc. I wonder if the Alesis Firewire mixer could be one of those critters? The slowdown would mean the audio would run just a hair longer than the video. Just guessing here, you'd need to find out the exact sample rates in the Sony camera and the Alesis mixer to know for sure.
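Steve's 48.048 kHz scenario is easy to quantify; a small sketch of the implied stretch (purely illustrative - whether the Alesis actually does this is the open question):

```python
# A converter clocked at 48.048 kHz whose files are labeled 48 kHz
# stretches the audio by 0.1% when played at the labeled rate.
true_rate = 48_048
labeled_rate = 48_000

hour = 3600
samples = true_rate * hour
playback_time = samples / labeled_rate   # duration when played at labeled rate

print(f"one hour of audio plays back as {playback_time:.1f} s "
      f"({playback_time - hour:.1f} s long)")
```

That predicts a 3.6-second drift per hour. For comparison, a 0.12-second offset, if it accrued over roughly an hour, would correspond to a clock error of only about 33 ppm, so measuring the drift over a known duration should rule this in or out.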

Dale Paterson April 14th, 2006 01:18 AM

Steve and Mike - thank you both for those replies.


I'm not sure that I understand your calculation properly - particularly the 'nautical miles' part. Also, I shoot in PAL and use the metric system, so if you could explain a little more in detail (or adapt it for my specifics) I would appreciate it. Does the speed of sound not come into this equation? (I can't believe how ridiculous that question sounds!)

On the other hand - accepting your figures for an NTSC environment - it does make a lot of sense. At my last event the main camera was at least twenty metres away from the subject at any given point (and twenty-five metres away from a subject at the far end of the venue), and that would equate (if I'm not mistaken) to about a one- or two-frame difference or delay (still not quite sure what we're talking about here), which seems to be roughly the difference between the audio and video after rendering (using the camera's original sound track to synch to the mixer's sound track). The lip synch is definitely out by a fraction on the final product and I think that this is why.
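For what it's worth, the same arithmetic in metric, PAL terms (assuming roughly 343 m/s for sound at room temperature) looks like this:

```python
# Acoustic delay at the distances mentioned, expressed in PAL frames.
speed_of_sound = 343.0     # m/s at ~20 C (an assumption about conditions)
pal_fps = 25               # one PAL frame = 40 ms

for distance_m in (20, 25):
    delay = distance_m / speed_of_sound
    print(f"{distance_m} m -> {delay * 1000:.0f} ms "
          f"(~{delay * pal_fps:.1f} PAL frames)")
```

On those assumptions, 20 m works out to about 58 ms (roughly 1.5 PAL frames) and 25 m to about 73 ms (roughly 1.8 frames).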

I am going to use another method that I hope will work - I'll assign an extra channel on the mixer to a spare mic and at the beginning of each tape I will somehow generate a series of 'beeps' or 'test tones' using a small dictaphone or message pager or something like that (in close proximity to the camera and the spare mic). The idea being that I would then be able to 'visually align' the two audio tracks (at least at the beginning of each tape). Any thoughts on this idea?
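The beep idea is straightforward to prepare ahead of time. As one sketch using only the Python standard library, a short script can generate a series of tones to play near both the camera and the spare mic (the file name and tone parameters are arbitrary choices):

```python
# Generate a short series of sync beeps (1 kHz, three 100 ms tones with
# 400 ms gaps) as a 16-bit mono WAV, to be played near both the camera
# mic and a spare mixer channel, then aligned visually on the waveforms.
import math
import struct
import wave

RATE = 48_000          # match the project sample rate
FREQ = 1000            # tone frequency in Hz
BEEP_S, GAP_S = 0.1, 0.4

samples = []
for _ in range(3):
    for n in range(int(RATE * BEEP_S)):
        samples.append(int(32767 * 0.8 * math.sin(2 * math.pi * FREQ * n / RATE)))
    samples.extend([0] * int(RATE * GAP_S))   # silence between beeps

with wave.open("sync_beeps.wav", "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)          # 16-bit samples
    w.setframerate(RATE)
    w.writeframes(struct.pack(f"<{len(samples)}h", *samples))
```

Short, sharp tones with silence between them show up as clean, easy-to-spot spikes on the waveform display, which makes the visual alignment much easier than matching speech.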

None of the above of course explains away the difference between the length of the camera's audio track and the mixer's recorded audio track. Steve, however, may be on to something here.


I did not know about the possibility of having a 0.1% slowdown for matching 30fps to 29.97fps and I am shooting in PAL so I would imagine that this could cause an even bigger difference if this is indeed the case. Am I right? I will investigate this further as I can just see this causing many problems for me in the future. It took me a whole day just to synch two hours of video to the mixer's audio tracks to the point where I was happy and even then, after rendering, the lip synch is slightly out (because of Mike's theory above I think).

How did I arrive at this point? What happened to the good old days of shooting and editing home video of the kids? What happened ...? Actually - not true! I think my fascination with video and film actually lies, not in what I already know but, in what I have still yet to learn! (How's that for a quotable quote)!

Thanks again.



Jeff Mack April 14th, 2006 08:52 AM


Here's a couple of thoughts as well - for what they are worth. Try using a line out of the mixer into your camera. Bypass the on-camera mic altogether. Since everything is coming off the board, use the line in as your reference channel and sync up your 6 discrete channels to it.

What I do is expand the height of the audio tracks and minimize the video track. I group the 6 other tracks so they move in unison. I then match the waveforms by expanding out the timeline with my wheel mouse. I'll drop my cursor somewhere on a peak on the reference track and then slide the group so the same peak is lined up. It really helps to have a click or other loud, short-duration sound to use for a mark.

I do this at the front of each song, and once it appears as close as I can get, I go to maybe halfway through and find another reference peak to confirm sync. If you are planning to edit in individual songs, you can cut (split) your tracks and resync all over again as you go down the timeline. This way you minimize any sync issues over short segments.
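Jeff's peak-matching can also be automated outside Vegas. As a sketch (not a Vegas feature - this assumes the tracks have been exported as mono arrays at a common sample rate, e.g. loaded with NumPy), cross-correlation finds the same offset the eye finds on the waveforms:

```python
# Estimate the offset between a reference track and another track by
# cross-correlating their waveforms -- the automated version of sliding
# a group until the peaks line up.  Assumes mono float arrays, same rate.
import numpy as np

def estimate_offset(reference: np.ndarray, other: np.ndarray, rate: int) -> float:
    """Seconds by which `other` arrives later than `reference` (negative = earlier)."""
    corr = np.correlate(other, reference, mode="full")
    lag = int(np.argmax(corr)) - (len(reference) - 1)
    return lag / rate

# Tiny demonstration: the same click, delayed by 120 samples at a
# 1 kHz "sample rate" (i.e. 120 ms), in otherwise silent tracks.
rate = 1000
ref = np.zeros(2000); ref[500] = 1.0
late = np.zeros(2000); late[620] = 1.0
print(f"estimated offset: {estimate_offset(ref, late, rate):.3f} s")
```

A loud, short-duration sound (Jeff's click) gives the correlation a sharp, unambiguous peak, for exactly the same reason it helps when matching by eye.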

Hope this helps.


Dale Paterson April 15th, 2006 01:44 AM

Thanks Jeff for that.

Funny - I have been thinking about doing something similar (well actually the same - just a different way).

The Sony UWP-C3 Kit is made up of a plug-on transmitter and receiver. The nice thing about the transmitter is that you can switch the input between mic and line level, so it would not be a big deal to take a line out from the mixer to the transmitter and then straight to the camera - maybe this is the BEST way of doing things. It would also most certainly identify which of the clocks is causing the synch problems (see details in the thread above), or at least which of the clocks is 'out'.

I really think that I need to give this a try.




DV Info Net -- Real Names, Real People, Real Info!
1998-2021 The Digital Video Information Network