A Beginner’s Guide to Shooting Stereoscopic 3D
This article was originally published in FCPUG Supermag #4 (April 2010) and on Dashwood3D.com.
Updated and Revised September 2011. All images ©Tim Dashwood unless otherwise specified.
3D is back in style again and it seems like everyone, from Hollywood producers to wedding videographers, is interested in producing stereoscopic 3D content.
So how can you get involved by shooting your own 3D content? It’s actually quite easy to get started and learn the basics of stereoscopic 3D photography. You won’t be able to sell yourself as a stereographer after reading this beginner’s guide (it literally takes years to learn all the aspects of shooting and build the necessary experience to shoot good stereoscopic 3D) but I guarantee you will have some fun and impress your friends.
The basic principle behind shooting stereoscopic 3D is to capture and then present two slightly different points of view and let the viewer’s own visual system determine stereoscopic depth. It sounds simple enough but the first thing any budding stereographer should learn is some basic stereoscopic terminology. These few terms may seem daunting at first but they will form the basis of your stereoscopic knowledge.
Stereoscopic 3D a.k.a. “Stereo3D,” “S-3D,” or “S3D”
“3D” means different things to different people. In the world of visual effects it primarily refers to CGI modeling. This is why stereographers refer to the craft specifically as “stereoscopic 3D” or simply “S3D” to differentiate it from 3D CGI.
Interaxial (a.k.a. “Stereo Base”) & Interocular (a.k.a. “i.o.”) separation
The interocular separation (or interpupillary distance) technically refers to the distance between the centers of the human eyes. This distance is typically accepted to be an average of 65mm (roughly 2.5 inches) for a male adult.
Interaxial separation is the distance between the centers of two camera lenses (specifically their entrance pupils). The human interocular separation is an important constant stereographers use when calculating interaxial separation. Beware that interaxial separation is often incorrectly referred to as “interocular” and vice versa. In the professional world of stereoscopic cinema it has become the norm to refer to interaxial separation as “i.o.” even though that is the incorrect term.
Binocular Vision, Retinal Disparity and Parallax
Binocular Vision simply means that two eyes are used in the vision system. Binocular Vision is very important to most mammals (including humans) because it allows us to perceive depth at close range.
Try this: Hold one finger next to your ear. Now stretch your other arm out straight and hold up another finger. Now bring your two fingers together and touch the tips. That was easy, right? Now repeat the same procedure with one eye closed. Were you able to touch your fingers together on the first try? Now you know how important binocular vision is at close range.
When we look around at objects at different distances from us the images of those objects will be projected on our retinas in slightly different locations for each eye. Our brain can interpret this “Retinal Disparity” and help us determine depth.
When we shoot 3D with two cameras from slightly different positions the same thing happens; each camera’s sensor registers the objects in the scene in slightly different horizontal positions. We call this difference “parallax.”
Convergence & Divergence
Binocular Vision and Parallax are the primary visual tools animals use to perceive depth at close range. The wider an animal’s eyes are apart (its interocular distance) the deeper its binocular depth perception or “depth range.”
At greater distances we start to use monocular depth cues like perspective, relative size, occlusion, shadows and relation to horizon to perceive how far away objects are from us.
Of course it would be difficult to look at double images all day so instead our eyes naturally angle in towards the object of interest to make it a single image. This is called convergence.
Here’s an example of how your eyes use convergence in the real world. Hold a pen about one foot in front of your face and look directly at it. You will feel your eyes both angle towards the pen in order to converge on it, creating a single image of the pen. What you may not immediately perceive is that everything behind the pen appears as a double image (diverged.) Now look at the background behind the pen and your pen will suddenly appear as two pens because your eyes are no longer converged on it. This “double-image” is retinal disparity at work and it is helping your brain determine which object is in front of the other.
What never happens to your eyes in the natural world is divergence, which would mean that your eyes angle outward. This is because the furthest point you could possibly attempt to look at is at infinity, and even infinity would only require that your eyes be angled perfectly parallel to each other. This is why stereographers should avoid background parallax values in their scene that may require the eyes to diverge when viewed. This is easy to keep in check through some simple math, but we will cover that a little later.
Stereo Window, the Screen Plane and Negative, Zero or Positive Parallax
Simply put, the “Stereo Window” refers to the physical display surface. You will be able to visualize the concept if you think of your TV screen as a real window that allows you to view the outside world. Objects in your stereoscopic scene can be behind or outside the window (positive parallax), on the window (the screen plane, or zero parallax), or inside the window, between you and the screen (negative parallax). In the same way that objects appear in horizontally offset locations on our retinas to create retinal disparity, stereoscopically recorded and displayed objects will appear with different horizontal offsets (parallax) depending on their depth in the scene. If an object has no perceivable amount of parallax then we consider it to appear on the screen surface, just like the star in the illustration. This is why converging on an object will make it appear to be at the screen. This can be done by converging the cameras on the object while shooting, or by sliding the images horizontally in opposite directions during post-production.
If an object’s left image is to the left of the corresponding right image then that object has positive parallax and will appear to be behind the screen.
If an object’s left image is to the right of the corresponding right image then it has negative parallax and will cause your eyes to cross, which will suggest to your brain that the object is in front of the screen.
This is the basic principle behind stereoscopic shooting and emulating human binocular vision with two cameras.
Respecting the Stereo Window
We discussed briefly how the display screen represents a window and how objects can be behind, at or in front of that window. If you want an object to appear in front of the window it cannot touch the left or right edge of the frame. If it does, the viewer’s brain receives contradictory cues: the parallax suggests the object is in front of the screen, yet the object is being occluded by the edge of the screen. This contradiction is referred to as a window violation and it should be avoided. Professional stereographers have a few tricks for fixing window violations with lighting or soft masks, but it is best for beginners to simply obey this rule.
Unwelcome Disparities (Geometric, lens and temporal)
Disparity is a “dirty word” for stereographers. In fact the only “good” type of disparity in S3D is horizontal disparity between the left and right eye images. As mentioned before, this is known as parallax.
Any other type of disparity in your image (vertical, rotational, zoom, keystone or temporal) will cause the viewer’s eyes to strain to accommodate. This can break the 3D effect and cause muscular pain in the viewer’s eyes, or even nausea. Every stereographer will strive to avoid these disparities on set by carefully calibrating the stereoscopic rig, and they will be tweaked even further in post-production through the use of 3D mastering software.
Ortho-stereo, Hyper-stereo & Hypo-stereo
I already mentioned that the average interocular of humans is considered to be about 65mm (2.5 inches). When this same distance is used as the interaxial distance between two shooting cameras, the resulting stereoscopic effect is typically known as “Ortho-stereo.” Many stereographers choose 2.5″ as a stereo base for this reason. If the interaxial distance used to shoot is smaller than 2.5 inches then you are shooting “Hypo-stereo.” This technique is common for theatrically released films to accommodate the effects of the big screen. It is also used for macro stereoscopic photography.
Lastly, Hyper-stereo refers to interaxial distances greater than 2.5 inches. As I mentioned earlier the greater the interaxial separation, the greater the depth effect. An elephant can perceive much more depth than a human, and a human can perceive more depth than a mouse. However, using this same analogy, the mouse can get close and peer inside the petals of a flower with very good depth perception, and the human will just go “cross-eyed.” Therefore decreasing the interaxial separation between two cameras to 1” or less will allow you to shoot amazing macro stereo-photos and separating the cameras to several feet apart will allow great depth on mountain ranges, city skylines and other vistas.
The trouble with using hyper-stereo is that scenes with gigantic objects in real life may appear as small models. This phenomenon is known as dwarfism, and we perceive it this way because the exaggerated separation between the taking lenses allows us to see around big objects much more than we do in the real world. Our brain interprets this as meaning the object must be small. The opposite happens with hypo-stereo, where normal-sized objects appear gigantic (gigantism).
If one attempts to shoot with two cameras configured in a side-by-side stereoscopic mount, the smallest interaxial distance available will be the width of the camera. In most cases the width of the camera will be around 6 inches. This might seem like a big limiting factor, but other specialized equipment is available to achieve small interaxial distances with almost any sized camera. (More on that in the “Selecting Your Gear” segment.)
Viewing 3D: Passive Polarization, Active Shutter Glasses, Anaglyph & Autostereo
There are three basic types of glasses used for presenting stereoscopic 3D material. In most of the theatres in North America the common method is passive polarized glasses with either circular or linear polarizers. There are a few consumer and professional HD 3D monitors that use the same passive method. However, most of the consumer 3DTVs on the market use some form of active shutter glasses to flicker the left and right images on and off at 120Hz. Autostereoscopic displays use lenticular lenses or parallel barrier technologies to present stereoscopic material without the use of glasses.
Anaglyph glasses will work with almost any display but use color filters to separate the left and right images. The most common configurations are red/cyan, blue/amber, and green/magenta.
The Quick Math & Some Rules to Remember
Stereoscopic Parallax Budget (sometimes called Depth Budget) vs Depth Bracket
The Depth Bracket of your scene refers to the actual distance between the closest object in the frame and the furthest object. The Parallax Budget refers to your calculated maximum positive parallax and desired maximum negative parallax, represented as a percentage of screen width. For example, if I determine through a simple calculation that my positive parallax should never exceed 0.7% of screen width, and I have determined that my negative parallax should not exceed 2% of screen width, then my total Parallax Budget is 2.7%. The Depth Bracket must fit within the Parallax Budget. There are many algebraic formulas to determine the proper interaxial distance to achieve this.
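The budget arithmetic above can be sketched in a few lines of Python. The 0.7% and 2% limits are the illustrative numbers from the example, not fixed rules:

```python
# Illustrative limits from the example above, in percent of screen width
max_positive_pct = 0.7   # furthest object (behind the screen)
max_negative_pct = 2.0   # closest object (in front of the screen)

# Total parallax budget the scene's depth bracket must be squeezed into
parallax_budget_pct = max_positive_pct + max_negative_pct  # 2.7% of screen width
print(parallax_budget_pct)
```

Whatever limits you choose for your display, the sum is the total depth range you have to work with when placing your scene around the stereo window.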
Native Parallax for final display size
The native parallax for a given screen size simply refers to the percentage of screen width that will equal the human interocular. If you are using 2.5 inches as the baseline interocular and you know your presentation screen will be 30 feet wide (360 inches), then just divide 2.5 by 360: 2.5 ÷ 360 ≈ 0.007, or 0.7%. Therefore the native parallax of a 30-foot screen is 0.7%, so we should make sure to keep our maximum positive parallax under 0.7% of screen width if we plan to show our footage on a 30-foot-wide screen. If we shoot for a 65″ 3DTV, we can get away with over 3% positive parallax.
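The same division works for any screen. Here is a minimal sketch; the function name is mine, the 2.5-inch interocular is the constant used throughout this article, and the roughly 57-inch width of a 65″ 16:9 panel is my own estimate:

```python
def native_parallax_pct(screen_width_inches, interocular_inches=2.5):
    """Percentage of screen width that equals the human interocular."""
    return interocular_inches / screen_width_inches * 100

# 30-foot (360-inch) theatrical screen: keep positive parallax under this
print(round(native_parallax_pct(360), 2))  # ~0.69%

# A 65-inch 16:9 3DTV is roughly 57 inches wide: far more forgiving
print(round(native_parallax_pct(57), 2))   # ~4.39%
```

Note that the smaller the screen, the larger the native parallax percentage, which is why footage converged for television can diverge painfully on a cinema screen.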
The 1/30th Rule
The 1/30 rule refers to a commonly accepted rule that has been used for decades by hobbyist stereographers around the world. It basically states that the interaxial separation should only be 1/30th of the distance from your camera to the closest subject. In the case of ortho-stereoscopic shooting that would mean your cameras should only be 2.5” apart and your closest subject should never be any closer than 75 inches (about 6 feet) away.
Interaxial x 30 = minimum object distance
Minimum object distance ÷ 30 = Interaxial
If you are using a couple of standard 6″-wide camcorders in a side-by-side rig, mounted as close together as they will fit, then the calculation would look like: 6″ × 30 = 180 inches, or 15 feet. That’s right… 15 feet!
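Both forms of the rule translate directly into code. A minimal sketch (the function names are mine):

```python
def min_object_distance(interaxial_inches):
    """1/30 rule: the closest subject should be at least 30x the interaxial."""
    return interaxial_inches * 30

def max_interaxial(closest_distance_inches):
    """1/30 rule solved the other way: interaxial from the closest subject."""
    return closest_distance_inches / 30

# Side-by-side 6-inch camcorders: closest subject is 180 inches (15 feet) away
print(min_object_distance(6))

# Ortho-stereo: a subject 75 inches away allows a 2.5-inch interaxial
print(max_interaxial(75))
```

Running either function before you set up the rig saves a lot of repositioning once the cameras are locked down.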
But does the 1/30 rule apply to all scenarios? Certainly not. In feature film production destined for the big screen we will typically use a ratio of 1/60, 1/100 or higher. The 1/30 rule works well if your final display screen is less than 65 inches wide, your cameras were parallel to each other, and your shots were all taken outside with the background at infinity. When you are ready to take the next step to becoming a stereographer you will need to learn about parallax range and the various equations available to calculate maximum positive parallax (the parallax of the furthest object), which will translate into a real-world distance when you eventually display your footage.
Remember the illustration of the eyes pointing outward (diverging)? It isn’t natural for humans to diverge, and therefore the maximum positive parallax on the display should not exceed the human interocular of 2.5 inches (65mm). You can readjust the convergence point to bring the maximum positive parallax within the limits of the native display parallax (2.5 inches), but that will also increase your negative parallax.
Selecting Your Gear
Side by Side Rig vs Beam-Splitter Rig
Interaxial separation is an important factor when shooting S3D, so the width of your two cameras will determine the minimum interaxial separation in a side-by-side rig. For most camcorders that minimum distance is far too wide for any application other than hyper-stereo shots of landscapes, mountain ranges, helicopter shots, etc.
In order to shoot subjects in close range (within 15 or 20 feet) you will require a beamsplitter rig.
Beam-splitters use a 50/50 or 60/40 mirror (similar to teleprompter glass) that allows one camera to shoot through the glass and the other to shoot the reflection. The interaxial can be brought down to as little as 0mm (2D) with beamsplitter rigs.
There are over 20 different beamsplitter rigs on the market ranging from $2500 USD to $500,000. However, many other types of disparity can be introduced when shooting through the glass (polarization effect, dust contamination, color cast, etc.)
Special Stereoscopic Lenses
There are special stereoscopic lenses on the market designed for various digital SLR cameras. These lenses will work with a single camera but capture a left and right point of view in the same frame. The concept is intriguing but the lenses are very slow (F/11 – F/22), they use a smaller portion of the image sensor for each eye, they are usually made from plastic optics instead of glass and (in the case of the Loreo) the aspect ratio is vertically oriented.
Purpose-built Stereoscopic cameras
Stereoscopic film cameras have existed for decades. I personally own a Kodak Stereo camera from the early ’50s that I’ve shot hundreds of 3D slides with, and I love its simplicity. Recently, manufacturers like Fujifilm, Panasonic, Sony and JVC have recognized the demand for digital versions of these cameras and released new products to market. Some can record to separate left and right files or to side-by-side format files for easy workflows in most non-linear editing systems (and easy compatibility with Stereo3D Toolbox), but many of the new systems record the two streams into a self-contained Multiview Video Coding (MVC) file that requires specific editing software (currently only Sony Vegas 10 on Windows) or a demuxing stage to separate the MVC file into discrete left and right files (as with JVC’s bundled Mac/PC software).
If you plan to shoot stereoscopic video with any action then it will be beneficial to use two cameras that can be genlocked together. Cameras that cannot be genlocked will have some degree of temporal disparity. However using the highest frame rate available (60p for example) will reduce the chance of detrimental temporal disparity. There are also some devices capable of synchronizing cameras that use LANC controllers.
Interlace vs. Progressive
Every frame of interlaced video will inherently have some degree of temporal disparity between its fields. It is recommended to shoot with progressive formats whenever possible.
Lens & Focal Length selection
Wider lenses will be easier to shoot with for the beginner and will also lend more “dimensionality” to your subjects. Telephoto lenses will compress your subjects flat so they appear as cardboard cutouts. Stay away from “fisheye” lenses because the distortion will cause many geometrical disparities.
OK, so you’ve learned your terminology and selected your gear. Now what? It’s time to get out there and shoot. We haven’t discussed the various calculations or the rules of S3D but I encourage you to shoot now so you can learn from your mistakes.
Turn off Image Stabilization
If you are using video cameras with image stabilization you must turn the feature off, or the cameras’ optical axes will move independently of each other in unpredictable ways. As you can imagine, this will make it impossible to tune out disparities.
Manually Set White Balance
Use a white card, chart or 18% gray card to set the manual white balance of both cameras. On beamsplitter rigs it is not advisable to use preset white balance settings because the mirror glass introduces its own tint to the image on each camera. Set the WB switch to either A or B and press and hold the AWB button to execute the white balance sequence.
It is best to shoot on 0dB gain when possible. The noise and grain patterns at high gain levels will be unique on each camera for each frame and therefore will be a visual disparity.
Use identical settings on both cameras
It is very important to use the same type of camera, same type of lens and exactly the same camera settings (white balance, shutter speed, aperture, frame rate, resolution, zoom, codec, etc.) on both cameras. Any differences will cause a disparity. It is also a good idea to use manual focus and set it to the hyperfocal distance or a suitable distance with a deep depth of field.
Proper configuration for CMOS shutters
The CMOS sensors in cameras like the Sony F3, Red, Canon XF105 or XF305 use a rolling shutter that requires a particular mounting configuration in a beamsplitter rig. The tops of the frames must match so there is no rolling shutter disparity between the sensors. If the mirror in your rig faces the ground and the upright camera mounts underneath then the camera can be mounted on the plate normally. If your mirror faces up and the upright camera points down then the camera must be mounted upside down so that the top-bottom orientation of the sensors match.
Use a clapper or synchronize timecode
If your cameras are capable of genlock and TC slave then by all means use those features to maintain synchronization. If you are using consumer level cameras it will be up to you to synchronize the shots in post. In either case you should use a slate with a clapper to identify the shot/takes and easily synch them.
If your cameras have an IR remote start/stop it is handy to use one remote to roll & cut both cameras simultaneously. If you are shooting stills with DSLRs there are ways to connect the cameras with an electronic cable release for synchronized shutters.
Slow down your pans
However fast you are used to panning in 2D, cut the speed in half for 3D. If you are shooting in interlace then cut the speed in half again. Better yet, avoid pans altogether unless your cameras are genlocked. Whip pans should be OK with genlocked cameras.
Label your media “Left” and “Right”
This might seem like a simple rule to remember, but the truth is that most instances of inverted 3D are the result of a mislabeled tape or clip. Good logging and management of clips is essential in stereoscopic post-production.
To Converge or Not Converge… That is the question.
One of the most debated topics among stereographers is whether to “toe-in” the cameras to converge on your subject or simply mount the cameras perfectly parallel and set convergence in post-production. Converging while shooting requires more time during production but, one would hope, less time in post-production. However, “toeing-in” can also create keystoning issues that need to be repaired later. My personal mantra is to always shoot perfectly parallel, and I recommend the same for the budding stereographer.
So you’ve shot your footage and now you want to edit and watch it. If you work with After Effects, Motion or Final Cut Pro on the Mac please watch some of the tutorials on this website to learn more about how Stereo3D Toolbox can help you master your S3D content.
Fixing Disparity and Setting Convergence
Most stereoscopic post software has sliders to adjust vertical, rotational, zoom, color & keystone disparities. Fixing these disparities requires skill and practice, but my recommendation is to start with rotation and make sure any straight lines are parallel to each other, then adjust zoom to make sure objects are the same apparent size. Next, adjust the vertical disparity control to make sure adjacent objects line up vertically. Finally, adjust the horizontal convergence to perfectly align the object you want to sit on the stereo window.
Native Pixel Parallax
There is one last thing you should check after aligning each shot. You must make sure that your background doesn’t exceed the Native Pixel Parallax of your display screen or your audience’s eyes will diverge (which is bad.) The idea here is that the maximum positive parallax (the parallax of your deepest object/background) does not exceed the human interocular distance when presented.
You can determine the Native Pixel Parallax (a.k.a. NPP) by dividing 2.5 inches by the display screen’s width and then multiply the result by the amount of horizontal pixels (i.e. 1920 for 1080p or 1280 for 720p.)
I present my S3D material on JVC’s 46” 3DTV. It is 42 inches wide and 1920 pixels wide so the calculation is 2.5/42×1920 = 114 pixels. This means that the parallax of the background should not exceed 114 pixels.
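That calculation generalizes to any display. A minimal sketch (the function name is mine; the 2.5-inch interocular is the constant used throughout this article):

```python
def native_pixel_parallax(screen_width_inches, horizontal_pixels,
                          interocular_inches=2.5):
    """Maximum background parallax, in pixels, before the viewer's eyes diverge."""
    return interocular_inches / screen_width_inches * horizontal_pixels

# 46-inch JVC 3DTV: 42 inches wide, 1920 pixels across
print(round(native_pixel_parallax(42, 1920)))  # 114 pixels
```

Run the same function with your own screen width and resolution to find the pixel limit for your setup before you lock convergence.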
In Stereo3D Toolbox you can enter your screen width and the filter will automatically calculate NPP and display a grid. If the parallax in your background does exceed this limit then adjust your convergence to move the depth range back away from the viewer.
Share your S3D Masterpiece on YouTube with the yt3d tag
Now that you have finished editing and mastering your S3D movie it is time to share it with the world. YouTube has added the capability to dynamically present S3D content in any anaglyph format. All you have to do is export your movie file as “side by side squeezed” and encode it as H.264 with Compressor. I recommend using 1280×720p rather than 1080p for S3D content on YouTube; the workload of rendering the anaglyph result is handled by the viewer’s computer, so 1080p will decrease the frame rate on most laptops.
Upload your movie file to YouTube and then add the tag “yt3d:enable=true” to enable YouTube 3D mode. If your footage is 16×9 aspect ratio, also add the tag “yt3d:aspect=16:9”. YouTube 3D expects cross-view formatted side-by-side, so if you exported as side-by-side parallel instead of cross-view you will need to add the tag “yt3d:swap=true” to ensure the left and right eyes are presented correctly.
I think I’ve covered the basics of shooting & posting stereoscopic 3D, but we’ve really just scratched the surface of what a professional stereographer needs to know. If you want to continue your education in this area I recommend you pick up Bernard Mendiburu’s 3D Movie Making or search your library for the “bible” of stereoscopic 3D, Lenny Lipton’s classic “Foundations of the Stereoscopic Cinema: A Study in Depth.”
Remember… stereoscopic 3D cinematography is a craft that can take years to master, and one where even the ‘experts’ are still learning new techniques. As the popularity of S3D continues to rise there will be many demands on inexperienced videographers to provide stereoscopic services. It is important to remember that 2D can look bad, but 3D can feel bad. The last thing any producer wants is to physically hurt the audience. Therefore, extensive practice and testing is advised before producing content to be viewed by anyone other than the stereographer. Trial and error is the best way to learn this particular craft.