how fast: Packing < 16 bit pixels into words? at DVinfo.net

Old August 3rd, 2004, 01:42 AM   #1
Major Player
 
Join Date: Oct 2003
Location: Southern Cal-ee-for-Ni-ya
Posts: 608
how fast: Packing < 16 bit pixels into words?

I have a programming question that maybe some of you can answer: How fast can we expect a high-end P4 system with dual-channel DDR to bit shift and repack 12 bit data into a few words?
Is this something that MMX or SSE instructions can help with?
I suppose this would happen in cache if it's a scan line at a time.

So, how many megapixels per second, for lack of a better metric?
I don't care if it's not real time, but how fast. I want to move images across a gigE network, weighing packing vs just transferring.

Thanks
-Les
Les Dit is offline   Reply With Quote
Old August 3rd, 2004, 06:21 AM   #2
RED Code Chef
 
Join Date: Oct 2001
Location: Holland
Posts: 12,514
If I understand you correctly this can be done very fast, especially
since it can be done in place in your buffer (i.e., no need to allocate
a second buffer). I can't give you an indication of how fast, but
probably fast enough to do real time over a gigE network,
especially if it were coded in assembly.

12 bits are excellent as well since you can store these as 8 + 4
bits instead of 16 bits. So you will need to process two 16 bit
pixels at a time and pack that into 24 bits (3 bytes), not that
hard to do in C or assembly as long as you know how many
pixels there are in the memory block.

I'm not sure if MMX/SSE could help here; I have no experience
with those. But if someone checks the spec on the instructions
available it shouldn't be too hard. Basically, in assembly it boils
down to something like this (the following code assumes the pixel
data sits in the lower 12 bits of each 16 bit word, in Intel
little-endian format, and is not tested):


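CLD                                ; make LODSD/STOSW/STOSB step forward
MOV ESI, [source buffer]           ; pseudocode: address of the pixel buffer
MOV EDI, ESI                       ; pack in place: destination = source
MOV ECX, [number of 16bit pixels]  ; pseudocode: must be even and non-zero

Start:
; load 2 16 bit pixels, 32 bits
LODSD                ; AX = first pixel, upper half of EAX = second pixel
MOV BX,AX            ; BX = first pixel
SHR EAX, 16          ; AX = second pixel

; re-pack them as 24 bits
SHL BX, 4            ; move the first pixel's 12 bits to the top of BX
SHLD AX,BX,4         ; AX = (second pixel << 4) | top 4 bits of first pixel
STOSW                ; store 2 bytes

SHR BX, 4            ; BX = first pixel again, upper 4 bits cleared
MOV EAX,EBX ;probably faster than AL,BL
STOSB                ; store the low 8 bits of the first pixel

DEC ECX              ; DEC plus LOOP together count down 2 pixels per pass
LOOP Start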


Something like this. It can probably be optimized further, but this
should basically do what you want pretty fast (it has one 386
instruction in it to speed up some of the packing). This routine
overwrites the buffer it is given, transforming every 4 bytes into
3 bytes. So for every n pixels (which must be a multiple of 2!)
you will get a buffer back of (n * 2) - (n / 2) bytes, i.e. 1.5 bytes
per pixel. It might swap some pixels, but that is rarely a problem if
you have enough time on the other end to de-pack the data, which
can probably be done in real time as well.
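
For comparison, here is a minimal C sketch of the same idea, including the de-packing step. It is only an illustration: the function names `pack12` and `unpack12` are made up, and the byte layout (first pixel's low 8 bits, then the shared nibble byte, then the second pixel's high 8 bits) is an assumption that does not match the byte order the assembly above produces. Packing may be done in place because each pair is read before its 3 bytes are written and the write position never overtakes the read position; de-packing needs a separate output buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack n_pixels 12-bit samples (low 12 bits of each 16-bit word) into
 * 3 bytes per pixel pair. dst may alias src for in-place packing.
 * Returns the number of bytes written (n_pixels * 3 / 2). */
size_t pack12(const uint16_t *src, uint8_t *dst, size_t n_pixels)
{
    size_t out = 0;

    assert(n_pixels % 2 == 0);          /* pixels are handled in pairs */
    for (size_t i = 0; i < n_pixels; i += 2) {
        uint16_t p0 = src[i] & 0x0FFF;  /* read both pixels first ... */
        uint16_t p1 = src[i + 1] & 0x0FFF;
        dst[out++] = (uint8_t)(p0 & 0xFF);            /* ... then write */
        dst[out++] = (uint8_t)((p0 >> 8) | ((p1 & 0x0F) << 4));
        dst[out++] = (uint8_t)(p1 >> 4);
    }
    return out;
}

/* Reverse of pack12. dst must be a separate buffer of n_pixels words. */
void unpack12(const uint8_t *src, uint16_t *dst, size_t n_pixels)
{
    assert(n_pixels % 2 == 0);
    for (size_t i = 0; i < n_pixels; i += 2) {
        dst[i]     = (uint16_t)(src[0] | ((src[1] & 0x0F) << 8));
        dst[i + 1] = (uint16_t)((src[1] >> 4) | (src[2] << 4));
        src += 3;
    }
}
```

Calling `pack12(buf, (uint8_t *)buf, n)` packs a buffer in place; the first `n * 3 / 2` bytes then hold the packed data.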

I've looked around the web a bit regarding MMX and SSE and I don't
think they will help for this. MMX seems to largely operate on
words/double words/quad words, which is not what we are
trying to do. SSE seems to be mostly for floating point work,
which is also not what you are trying to do.

One last thing. In this case it will introduce another loop over
the data. Whenever possible you should try to integrate packing
into either the writing routine or the reading one so you don't
waste time going over the data again.
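
One way this could look in C, as a rough sketch only: the packer fills a small chunk and hands it straight to the writing routine, so the packed image never gets its own pass. Everything named here (`emit_fn`, `pack12_and_write`, `mem_sink`, `mem_sink_write`) is hypothetical, the chunk size is arbitrary, and the byte layout is an assumed one rather than the order the assembly above produces. In real use the callback would be the network send or file write.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical writer callback: consumes len bytes, returns 0 on success. */
typedef int (*emit_fn)(const uint8_t *data, size_t len, void *ctx);

/* Pack 12-bit pixels (low 12 bits of each 16-bit word) and pass them to
 * the writer chunk by chunk. n_pixels must be even. Returns 0 on success. */
int pack12_and_write(const uint16_t *src, size_t n_pixels,
                     emit_fn emit, void *ctx)
{
    uint8_t chunk[3 * 256];             /* room for 512 packed pixels */
    size_t used = 0;

    assert(n_pixels % 2 == 0);
    for (size_t i = 0; i < n_pixels; i += 2) {
        uint16_t p0 = src[i] & 0x0FFF;
        uint16_t p1 = src[i + 1] & 0x0FFF;
        chunk[used++] = (uint8_t)(p0 & 0xFF);
        chunk[used++] = (uint8_t)((p0 >> 8) | ((p1 & 0x0F) << 4));
        chunk[used++] = (uint8_t)(p1 >> 4);
        if (used == sizeof chunk) {     /* chunk full: flush it */
            if (emit(chunk, used, ctx) != 0)
                return -1;
            used = 0;
        }
    }
    /* flush whatever is left over */
    return used ? emit(chunk, used, ctx) : 0;
}

/* Example sink for testing: appends to a caller-supplied memory buffer. */
struct mem_sink { uint8_t *buf; size_t len, cap; };

int mem_sink_write(const uint8_t *data, size_t len, void *ctx)
{
    struct mem_sink *s = ctx;

    if (s->len + len > s->cap)
        return -1;                      /* sink is full */
    for (size_t i = 0; i < len; i++)
        s->buf[s->len++] = data[i];
    return 0;
}
```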
__________________

Rob Lohman, visuar@iname.com
DV Info Wrangler & RED Code Chef

Join the DV Challenge | Lady X

Search DVinfo.net for quick answers | Buy from the best: DVinfo.net sponsors
Rob Lohman is offline   Reply With Quote
Old August 3rd, 2004, 08:59 AM   #3
Major Player
 
Join Date: Apr 2003
Location: St. Louis, MO
Posts: 581
Hey! A fellow asm coder! Have I seen you on hutch's board or win32asm before?
Rob Belics is offline   Reply With Quote
Old August 3rd, 2004, 09:28 AM   #4
RED Code Chef
 
Join Date: Oct 2001
Location: Holland
Posts: 12,514
Hmmm, all Robs seem to be ASM coders thus far, hehe. Rob Scott
at least codes as well; I think he can do ASM too.

Nope, never been to any of those boards. It has been a pretty
long time since I've done anything in ASM (at least 5+ years),
but did a lot of low-level stuff in the DOS days and whatnot.

Never really forgot it although I never ventured into the whole
386/protected mode/MMX/SSE stuff etc.

Gives you a great insight into computers / operating systems and
how things work, don't you think?

My programming experience went like this:

(Quick)basic -> assembly -> pascal/delphi -> C(++) -> Visual Basic -> C#
Rob Lohman is offline   Reply With Quote
Old August 3rd, 2004, 09:46 AM   #5
Major Player
 
Join Date: Apr 2003
Location: St. Louis, MO
Posts: 581
I used to design hardware so assembly was part of the job. Begrudgingly learned C, then C++. Looks like I'll be starting a server business so need to get into C#, Java, etc. But I'd rather do it all in assembly.

But programming has taken a back seat for a few years now that I've gotten back into film. So I hesitate to answer MMX/SSE questions since it would only be from a foggy memory.

My foggy memory says SSE can do this 12-bit work on chip but, as I said, I don't recall.
Rob Belics is offline   Reply With Quote
Old August 3rd, 2004, 09:50 AM   #6
RED Code Chef
 
Join Date: Oct 2001
Location: Holland
Posts: 12,514
It's like the other way around for me. I'm doing programming as
a job and hopefully will be moving to some film related stuff in
the near future. Too bad I ain't exactly on the right side of the
globe for that.
Rob Lohman is offline   Reply With Quote
Old August 3rd, 2004, 12:51 PM   #7
Major Player
 
Join Date: Oct 2003
Location: Southern Cal-ee-for-Ni-ya
Posts: 608
Thanks Rob,

I think it would be interesting to see what the ASM output of the Visual C compiler would look like to do the same thing!
A test for their optimizer?
I hear the Intel compiler is the best, but I don't think it's popular.

-Les
Les Dit is offline   Reply With Quote
Old August 3rd, 2004, 12:57 PM   #8
Major Player
 
Join Date: Apr 2003
Location: St. Louis, MO
Posts: 581
The output of compilers can be really bizarre to look at, especially Microsoft's. Names get mangled and it can be hard to follow the logic flow. Though efficient, it is just hard to follow sometimes.

The optimizers are very good. But in critical timing, it can still be best to hand optimize it.

I just happen to think that, in the programming world, you get arguments about HLL vs assembly all the time. Just like the film/digital arguments.
Rob Belics is offline   Reply With Quote
Old August 4th, 2004, 03:05 PM   #9
RED Code Chef
 
Join Date: Oct 2001
Location: Holland
Posts: 12,514
If you build the function correctly in C I think Microsoft's and Intel's
compilers will probably closely match what I've written. They
might even include some more tricks. I've been reading an
assembly optimization guide for Intel processors the other day.
Interesting stuff regarding cache misses etc. Such stuff will
take quite a lot of time if you want to do it correctly (i.e., do it in
C first, time that, do it in assembly, time again and see what can
be further improved, etc.)
Rob Lohman is offline   Reply With Quote
Old August 4th, 2004, 03:11 PM   #10
Major Player
 
Join Date: Oct 2003
Location: Southern Cal-ee-for-Ni-ya
Posts: 608
Thanks again for the info guys.

On a related note: I just did some network tests between 2 identical P4 3 GHz machines with a tool called iperf.
I'm getting 90 megabytes a second between the two !!

This has no file system overhead, it's just raw data passing between the two, but it looks very good!
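
To put that figure in the megapixels-per-second terms asked about at the start of the thread, here is a small back-of-the-envelope helper. It is only a sketch: `pixels_per_second` is a made-up name, 1 megabyte is taken as 10^6 bytes, and it assumes the raw iperf rate would also hold for image data. At 90 MB/s that works out to 45 megapixels per second for unpacked 16 bit pixels and 60 megapixels per second for packed 12 bit pixels.

```c
#include <assert.h>

/* Pixels per second that fit through a link of the given byte rate. */
double pixels_per_second(double bytes_per_second, unsigned bits_per_pixel)
{
    assert(bits_per_pixel > 0);
    return bytes_per_second * 8.0 / bits_per_pixel;
}
```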

-Les
Les Dit is offline   Reply With Quote
Old August 4th, 2004, 03:25 PM   #11
RED Code Chef
 
Join Date: Oct 2001
Location: Holland
Posts: 12,514
Is this through the Microsoft drivers and TCP/IP stack? What kind of
network is this exactly?
Rob Lohman is offline   Reply With Quote
Old August 4th, 2004, 03:34 PM   #12
Major Player
 
Join Date: Oct 2003
Location: Southern Cal-ee-for-Ni-ya
Posts: 608
Microsoft drivers, for the Marvell Yukon chip on the motherboards.
8 port gig E switch.
Nothing fancy!
-Les
Les Dit is offline   Reply With Quote
Old August 9th, 2004, 04:56 AM   #13
Major Player
 
Join Date: May 2004
Location: Knoxville, TN (USA)
Posts: 358
Les, Rob Lohman is correct -- it should be possible to write a very efficient routine in assembler. In the ObscuraCapture app (tm :-) I've been able to pack the 10-bit data from the SI-1300 camera at over 250 MB/sec.
Rob Scott is offline   Reply With Quote
Old August 9th, 2004, 12:44 PM   #14
Major Player
 
Join Date: Oct 2003
Location: Southern Cal-ee-for-Ni-ya
Posts: 608
Thanks Rob, that's fast enough. I'm looking at options for speeding up my film scanner, it has an 8 megapixel camera on it.
Are you guys using GCC ? I was wondering what the interactive debugger is like on that. Most of my code is written by my programmer, but I do mods and add features. I am OK with the MS visual debugger, but MS isn't issuing bug fixes on the C side of that dev system much anymore, so we want to switch to the GCC system. I'm not comfortable with their optimizer and I hear the debugger is command line.

What type of system are you getting 250 MB a sec on? Dual-channel DDR with an 800 MHz FSB?

Thanks
-Les
Les Dit is offline   Reply With Quote
Old August 9th, 2004, 01:09 PM   #15
Major Player
 
Join Date: May 2004
Location: Knoxville, TN (USA)
Posts: 358
Quote:
Les Dit wrote:
Are you guys using GCC?
I'm using MS VC++. I started out using MinGW (basically GCC for Windows) but had trouble calling DirectX. There is a very nice graphical IDE for MinGW and it had a decent visual debugger IIRC.
Quote:
What type of system are you getting 250MB a sec on? dual ddr with 800 MHz FSB ?
No, it's just a basic laptop running an AMD Athlon XP-M 2500. I'm not sure about the memory configuration. (And no, I don't have the camera connected to the laptop. I have some code that simulates it.)
Rob Scott is offline   Reply
All times are GMT -6.


DV Info Net -- Real Names, Real People, Real Info!
1998-2017 The Digital Video Information Network