DV Info Net


Chris Sinista February 1st, 2007 11:31 PM

Can someone please explain RAID to me
 
What is RAID? When do I need it? Where do I get it? Does it come built into the external hard drives that say "configured with RAID 0", or is that just part of what I need to run RAID? Any help would be great. Thank you all.

Kit Hannah February 2nd, 2007 12:18 AM

1st google search return for "What is Raid?"

In computing, the acronym RAID (originally redundant array of inexpensive disks, also known as redundant array of independent disks) refers to a data storage scheme using multiple hard drives to share or replicate data among the drives. Depending on the configuration of the RAID (typically referred to as the RAID level), the benefit of RAID is to increase data integrity, fault-tolerance, throughput or capacity, compared with single drives. In its original implementations, its key advantage was the ability to combine multiple low-cost devices using older technology into an array that offered greater capacity, reliability, speed, or a combination of these things, than was affordably available in a single device using the newest technology.

http://en.wikipedia.org/wiki/Redunda...ependent_disks

Raid will usually be determined by how many hard drives you have and what kind of motherboard you have, specifically whether it supports RAID (most do nowadays). The different levels of Raid are different configurations, depending on what you want to accomplish. There are many different ways to do it, and it's not very hard to do, but you need to do your homework.

It can also be used for mirroring hard drives so you have 2 exact copies in case one fails. A simple google search will give you all the answers you need.

Chris Sinista February 2nd, 2007 02:01 PM

My main question is: there's RAID in the Western Digital Pro Book 1TB with RAID. Is that just part of what I would need to use RAID, or is it complete?

Mike Teutsch February 2nd, 2007 02:22 PM

Quote:

Originally Posted by Chris Sinista
My main question is: there's RAID in the Western Digital Pro Book 1TB with RAID. Is that just part of what I would need to use RAID, or is it complete?

There are numerous forms of RAID. They are different ways of using or controlling multiple drives. Not sure you can have a RAID on just one disk.

Mike

Search the forum:

Start with this thread: http://www.dvinfo.net/conf/showthrea...highlight=RAID

Dat Nguyen February 2nd, 2007 02:40 PM

Raid
 
This comes with everything you need. However, you should check if it is RAID implemented in software or hardware. Hardware RAID is generally better.

RAID 0 (striping) is for disk read performance. RAID 1 (mirroring) is for data redundancy. All data is written twice, once for each drive. So if one drive fails, the other still has the data. Note that if you use RAID 1 (you must choose one or the other) you will effectively halve your storage to 500GB.
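
The capacity trade-off above can be put into numbers with a rough sketch. It assumes the 1TB enclosure holds two 500GB drives (implied by the 500GB figure, not confirmed), and `usable_capacity_gb` is a hypothetical helper name:

```python
def usable_capacity_gb(level, drive_sizes_gb):
    """Return the usable capacity of a simple RAID 0 or RAID 1 array."""
    if len(drive_sizes_gb) < 2:
        raise ValueError("RAID 0 and RAID 1 need at least two drives")
    smallest = min(drive_sizes_gb)
    if level == 0:
        # Striping: every drive contributes, limited by the smallest drive.
        return smallest * len(drive_sizes_gb)
    if level == 1:
        # Mirroring: every drive holds the same data, so only one drive's worth is usable.
        return smallest
    raise ValueError("only RAID 0 and RAID 1 are modelled here")


# Assumed layout: two 500GB drives inside the 1TB enclosure.
print(usable_capacity_gb(0, [500, 500]))  # 1000
print(usable_capacity_gb(1, [500, 500]))  # 500
```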

Georgia Hilton February 2nd, 2007 03:53 PM

Hi ...This might help in your search for Raid systems answers.

Overview

A drive array is a collection of hard disk drives that are grouped together. When you talk about Raid, there is often a distinction between physical drives and arrays and logical drives and arrays. Physical arrays can be divided or grouped together to form one or more logical arrays. These logical arrays can be divided into logical drives that the operating system sees. The logical drives are treated like single hard drives and can be partitioned and formatted accordingly.

The Raid controller manages how the data is stored and accessed across both the physical and logical arrays. It ensures that the operating system only sees the logical drives and does not need to worry about managing the underlying scheme. As far as the system is concerned, it's dealing with regular hard drives. A Raid controller's functions can be implemented in hardware or software. Hardware implementations are better for Raid levels that require large amounts of calculation.

The single Raid levels don't address every application requirement that exists. So, to get more functionality, someone thought of combining Raid levels. The main benefit of combining Raid levels is increased performance. Combining Raid levels usually means using a hardware Raid controller, because the added complexity makes software solutions impractical. Raid 0 has the best performance of the single levels and it is the one most commonly combined. Not all combinations of Raid levels exist. The most common combinations are Raid 0+1 and 1+0. The difference between 0+1 and 1+0 might seem subtle… it lies in the amount of fault tolerance. Both of these levels require at least 4 hard drives to implement, so this can get a bit expensive.

OK, let's hit the details of the Raid levels…

Raid 0
This is the simplest level of Raid… it just involves striping. There is no data redundancy at this level, so it is not recommended for applications where data is critical. This level offers the highest performance of any single Raid level. At least 2 hard drives are required, preferably identical, and the maximum depends on the Raid controller. None of the space is wasted as long as the hard drives used are identical. It has a relatively low cost and a high performance gain. This level is good for most people who don't need any data redundancy. It works with SCSI and IDE/ATA implementations. Finally, it's important to note that if any of the hard drives in the array fails, you lose everything.
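
To make striping concrete, here is a minimal sketch that deals fixed-size chunks out to the drives round-robin and reads them back. The function names and the tiny 2-byte stripe size are assumptions for illustration only; a real controller works on disk blocks, not Python lists:

```python
def stripe(data: bytes, num_drives: int, stripe_size: int):
    """Split data into stripe_size chunks and deal them round-robin to the drives."""
    drives = [[] for _ in range(num_drives)]
    for offset in range(0, len(data), stripe_size):
        chunk_index = offset // stripe_size
        drives[chunk_index % num_drives].append(data[offset:offset + stripe_size])
    return drives


def unstripe(drives):
    """Reassemble the original data by reading chunks back in round-robin order."""
    out = []
    longest = max(len(d) for d in drives)
    for row in range(longest):
        for d in drives:
            if row < len(d):
                out.append(d[row])
    return b"".join(out)


drives = stripe(b"ABCDEFGHIJ", num_drives=2, stripe_size=2)
print(drives[0])         # [b'AB', b'EF', b'IJ']
print(drives[1])         # [b'CD', b'GH']
print(unstripe(drives))  # b'ABCDEFGHIJ'
```

Every drive holds only part of the file, which is why losing any single drive makes the whole file unreadable.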

Raid 1
This level is usually implemented as mirroring. Two identical copies of the data are stored on two drives. When one drive fails, the other drive still has the data to keep the system going. Rebuilding a lost drive is very simple since you still have the second copy. This adds data redundancy to the system and provides some safety from failures. Some implementations add an extra Raid controller to increase the fault tolerance even more. It's ideal for applications that use critical data. Even though the performance benefits are not great, it really helps with preserving data. It is also relatively simple and has a low cost of implementation. Most Raid controllers nowadays implement some form of Raid 1.
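
A toy model of mirroring might look like the following. `MirroredPair` is a hypothetical class, each "drive" is just a dictionary of block number to data, and the sketch ignores everything a real controller has to handle (caching, partial writes, hot spares):

```python
class MirroredPair:
    """Toy RAID 1: two drives that always receive the same writes."""

    def __init__(self):
        self.drives = [{}, {}]
        self.failed = [False, False]

    def write(self, block_no, data):
        # Every write goes to each working drive.
        for i, drive in enumerate(self.drives):
            if not self.failed[i]:
                drive[block_no] = data

    def read(self, block_no):
        # Any working drive can serve the read.
        for i, drive in enumerate(self.drives):
            if not self.failed[i]:
                return drive[block_no]
        raise IOError("both drives have failed")

    def fail(self, i):
        self.failed[i] = True
        self.drives[i] = {}

    def rebuild(self, i):
        # Rebuilding is a straight copy from the surviving drive.
        if self.failed[1 - i]:
            raise IOError("no surviving drive to copy from")
        self.drives[i] = dict(self.drives[1 - i])
        self.failed[i] = False


pair = MirroredPair()
pair.write(0, b"clip-001")
pair.fail(0)
print(pair.read(0))  # b'clip-001'
pair.rebuild(0)
print(pair.drives[0] == pair.drives[1])  # True
```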

Raid 2
This level uses bit level striping with Hamming code ECC. The technique used here is somewhat similar to striping with parity, but not quite the same. The data is split at the bit level and spread over a number of data and ECC disks. When data is written to the array, the Hamming codes are calculated and written to the ECC disks. When the data is read from the array, the Hamming codes are used to check whether errors have occurred since the data was written to the array. Single bit errors can be detected and corrected immediately. This is the only level that really deviates from traditional Raid ideas. Remember, this level is very complicated, and expensive Raid controller hardware is needed.
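
As an illustration of how a Hamming code finds and fixes a single flipped bit, here is a sketch using the textbook Hamming(7,4) code. That particular code is an assumption: the text does not say which Hamming code a given Raid 2 array uses, and the function names are hypothetical. Think of each of the 7 bit positions as living on its own disk (4 data, 3 ECC):

```python
def hamming74_encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into 7 bits laid out as p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(code):
    """Return (corrected codeword, error position); position 0 means no error was found."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]  # parity check over positions 1, 3, 5, 7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]  # parity check over positions 2, 3, 6, 7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]  # parity check over positions 4, 5, 6, 7
    pos = s1 + 2 * s2 + 4 * s3      # the syndrome spells out the bad position
    if pos:
        c[pos - 1] ^= 1
    return c, pos


word = hamming74_encode([1, 0, 1, 1])
damaged = list(word)
damaged[4] ^= 1  # flip the bit at position 5
fixed, position = hamming74_correct(damaged)
print(position)       # 5
print(fixed == word)  # True
```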

Raid 3
Raid 3 uses byte level striping with dedicated parity. In other words, data is striped across the array at the byte level with one dedicated parity drive holding the redundancy information. The idea behind this level is that striping the data increases performance and the dedicated parity takes care of redundancy. At least 3 hard drives are required: 2 for striping and 1 as the dedicated parity drive. Although the performance is good, the added parity does slow down writes. The parity information has to be written to the parity drive whenever a write occurs. This increased computation calls for a hardware controller, so software implementations are not practical. Raid 3 is good for applications that deal with large files since the stripe size is small.
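
Parity in this sense is normally a byte-wise XOR of the data drives. The sketch below assumes the minimum 3-drive layout (2 data drives, 1 parity drive) and shows a failed data drive being rebuilt from the survivor plus parity. `xor_bytes` and the sample data are made up for illustration:

```python
from functools import reduce


def xor_bytes(*blocks: bytes) -> bytes:
    """XOR equal-length byte strings together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = b"DVFOOTAGE!"                  # even length keeps the two halves equal
drive_a = data[0::2]                  # bytes 0, 2, 4, ... go to drive A
drive_b = data[1::2]                  # bytes 1, 3, 5, ... go to drive B
parity = xor_bytes(drive_a, drive_b)  # the dedicated parity drive

# Drive B dies: XOR the survivor with the parity to get its contents back.
rebuilt_b = xor_bytes(drive_a, parity)
restored = bytes(byte for pair in zip(drive_a, rebuilt_b) for byte in pair)
print(restored)  # b'DVFOOTAGE!'
```

The same XOR has to be recomputed on every write, which is the write slowdown mentioned above.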

Raid 4
This level is very similar to Raid 3. The only difference is that it uses block level striping instead of byte level striping. The advantage is that you can change the stripe size to suit application needs. This level is often seen as a mix between Raid 3 and Raid 5, having the dedicated parity of Raid 3 and the block level striping of Raid 5. Again, you'll probably need a hardware Raid controller for this level. The dedicated parity drive remains a bottleneck at this level as well.

Raid 5
Raid 5 uses block level striping and distributed parity. This level tries to remove the bottleneck of the dedicated parity drive. With the use of a distributed parity algorithm, this level writes the data and parity data across all the drives. Basically, the blocks of data are used to create the parity blocks which are then stored across the array. This removes the bottleneck of writing to just one parity drive. However, the parity information still has to be calculated and written whenever a write occurs, so the slowdown involved with that still applies. The fault tolerance is maintained by separating the parity information for a block from the actual data block. This way when one drive goes, all the data on that drive can be rebuilt from the data on the other drives. Recovery is more complicated than usual because of the distributed nature of the parity. Just as in Raid 4, the stripe size can be changed to suit the needs of the application. Also, using a hardware controller is probably the more practical solution. Raid 5 is one of the most popular Raid levels being used today. It appears to be the best combination of performance, redundancy, and storage efficiency.
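
To picture what "distributed parity" means, the sketch below prints which drive holds the parity block for each stripe. Real controllers use several different rotation schemes; the simple right-to-left rotation here is an assumption for illustration, and `raid5_layout` is a hypothetical name:

```python
def raid5_layout(num_drives: int, num_stripes: int):
    """Return a grid of labels: 'P' for a parity block, 'D<k>' for the k-th data block."""
    rows = []
    block = 0
    for s in range(num_stripes):
        # Assumed rotation: parity starts on the last drive and moves left each stripe.
        parity_drive = (num_drives - 1 - s) % num_drives
        row = []
        for d in range(num_drives):
            if d == parity_drive:
                row.append("P")
            else:
                row.append(f"D{block}")
                block += 1
        rows.append(row)
    return rows


for row in raid5_layout(num_drives=3, num_stripes=3):
    print(" ".join(f"{cell:>3}" for cell in row))
```

With 3 drives and 3 stripes each drive ends up holding exactly one parity block, so no single drive takes all the parity writes.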

Raid 0+1
This combination uses Raid 0 for its high performance and Raid 1 for its high fault tolerance. Let's say you have 8 hard drives. You can split them into 2 arrays of 4 drives each, and apply Raid 0 to each array. Now you have 2 striped arrays. Then you would apply Raid 1 to the 2 striped arrays and have one array mirrored on the other. If a hard drive in one striped array fails, that entire striped array is lost. The other striped array is left, but it has no fault tolerance if any of its drives fail.

Georgia Hilton February 2nd, 2007 03:54 PM

more
 
Raid 1+0
Raid 1+0 applies Raid 1 first then Raid 0 to the drives. To apply Raid 1, you split the 8 drives into 4 sets of 2 drives each. Now each set is mirrored and has duplicate information. To apply Raid 0, you then stripe across the 4 sets. In essence, you have a striped array across a number of mirrored sets. This combination has better fault tolerance than Raid 0+1. As long as one drive in each mirrored set is active, the array can still function. So theoretically you can have up to half the drives fail before you lose everything, as opposed to Raid 0+1, where two failed drives (one in each striped array) are enough.
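
The difference in fault tolerance can be checked by brute force. This sketch assumes the 8-drive example used here: for Raid 0+1, drives 0-3 and 4-7 form the two striped arrays, and for Raid 1+0 the mirrored sets are (0,1), (2,3), (4,5), (6,7). It counts how many combinations of failed drives each layout survives:

```python
from itertools import combinations

DRIVES = range(8)


def raid01_survives(failed):
    """Two mirrored 4-drive stripe sets: at least one stripe set must be fully intact."""
    stripe_sets = [set(range(0, 4)), set(range(4, 8))]
    return any(not (s & failed) for s in stripe_sets)


def raid10_survives(failed):
    """Four mirrored pairs striped together: every pair needs one working drive."""
    pairs = [{0, 1}, {2, 3}, {4, 5}, {6, 7}]
    return all(pair - failed for pair in pairs)


def survival_count(survives, num_failures):
    """Return (surviving combinations, total combinations) for a given failure count."""
    combos = [set(c) for c in combinations(DRIVES, num_failures)]
    return sum(survives(c) for c in combos), len(combos)


print(survival_count(raid01_survives, 2))  # (12, 28)
print(survival_count(raid10_survives, 2))  # (24, 28)
print(survival_count(raid01_survives, 4))  # (2, 70)
print(survival_count(raid10_survives, 4))  # (16, 70)
```

Neither layout is guaranteed to survive two failures, but Raid 1+0 survives far more of the possible combinations.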

In conclusion
OK, now that you know the different Raid levels and configurations, why would you even bother? Well, it really all depends on your application and the Raid level you use. However, in general, using Raid provides data redundancy, fault tolerance, increased capacity, and increased performance. Data redundancy protects the data from hard drive failures. This benefit is good for companies or individuals that have critical or important data to protect, or just anyone who's paranoid about losing their gigabytes of data. Fault tolerance goes hand in hand with redundancy in providing a better overall storage system. The only Raid level that does not have any form of redundancy or fault tolerance is Raid 0. Raid also provides increased capacity by combining multiple drives. The efficiency of how the total drive storage is used depends on the Raid level. Usually, levels involving mirroring need twice as much storage to mirror the data. And lastly, the reason most people go to Raid is for the increase in performance. The performance increase differs depending on the Raid level used. For applications that need raw speed, Raid is definitely the way to go.

Here is a simple view of Raid:
Mirroring gives you redundancy… therefore data security goes up. Write performance goes down due to duplicated writes (the amount varies by implementation), and read performance goes up, since there are two spindles with duplicated data that can be accessed by the system. In fact, in some implementations, the data that is closest to the read head of a given spindle is chosen for the read, making the seek and latency time drop dramatically (note: again, this depends on how your system is implemented and how you configure caching algorithms). The main thing to remember here is that the Raid controller writes the same data blocks to each mirrored drive. Each drive or array has the same information in it. To set up mirroring, the number of drives has to be a multiple of 2, since the drives are paired. The drawback here is that both drives are tied up during the writing process, which limits parallelism and can hurt performance. A good Raid controller will only read from one of the drives since the data on both is the same. While one drive is used for the read, the free drive can be used for other requests. This increases parallelism, which is pretty much the concept behind the performance increase of Raid.

Striping
Spreading that single file across a bunch-o-drives. Security of data drops (more spindles & drive mechanics to break), but this gives you almost unlimited size for a "single" logical disk. Add two 60 gig disks, get one 120 gig disk. Striping improves the performance of the array by distributing the data across all the drives. The main principle behind striping is parallelism. Imagine you have a large file on a single hard drive. If you want to read the file, you have to wait for the hard drive to read the file from beginning to end. Now, if you break the file up into multiple pieces and distribute it across multiple hard drives, you have all these drives reading a part of the file at the same time. You only have to wait as long as it takes to read each piece since the drives are working in parallel. The same is true if you are writing a large file to a disk. Transfer performance is greatly increased. The more hard drives you have, the greater the increase in performance.
The stripe size is a largely debated topic. There is no ideal stripe size, but certain sizes work best with certain applications. Using a small stripe size lets files be broken up more and distributed across more of the drives. The transfer performance will increase due to the increased parallelism. However, this also increases the randomness of the position of each piece of the file. As you probably guessed already, using a large stripe size does the opposite. The data will be less distributed and transfer performance is decreased. The randomness is decreased as well. The best way to find the right stripe size for your particular application is to experiment. Start out with a medium stripe size, then try decreasing or increasing the size and recording the difference in overall performance.
Remember, if you want to move or transfer a file somewhere, a two-drive array lets the controller access both drives simultaneously, which is where the performance gain kicks in. Ideally it only takes half the time to transfer the file. If you increase the number of hard drives to N, the file will ideally be transferred in 1/Nth the time it takes to transfer from 1 hard drive.
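
The 1/N rule of thumb works out as follows in the ideal case. The 12,000 MB file size and 60 MB/s per-drive rate are made-up numbers for illustration, and the sketch ignores seek time, controller overhead and bus limits, so real arrays will come in slower than this:

```python
def ideal_transfer_seconds(file_mb: float, drive_mb_per_s: float, num_drives: int) -> float:
    """Best-case transfer time when the file is striped evenly over num_drives drives."""
    return file_mb / (drive_mb_per_s * num_drives)


# Hypothetical figures: a 12,000 MB capture file and drives that sustain 60 MB/s each.
for n in (1, 2, 4):
    print(n, ideal_transfer_seconds(12000, 60, n))
# 1 200.0
# 2 100.0
# 4 50.0
```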

Mirroring and striping
Add them both together: data redundancy is up, security of data is better, and read performance goes up and is much faster (depending on configuration again). Write performance suffers, depending on implementation.

Sorry for being so long winded…. It just seemed that there is some confusion with regard to Raid capabilities and benefits.

... For what it's worth...


cheers

georgia

Scott Ellifritt February 2nd, 2007 04:27 PM

In a nutshell, Raid enables your computer to recognize multiple drives as a single information source.

David Bertinelli February 2nd, 2007 05:21 PM

Thanks for being so long winded, Georgia. That was a great explanation for those of us who did not know what a RAID is or does.

Thanks.

Cheers,
D

David Yuen February 2nd, 2007 09:34 PM

More on RAID
 
Here is the fuller text of Georgia's explanation.

StorageReview.com's detailed explanation.

Chris Sinista February 2nd, 2007 11:00 PM

You guys rock, this forum rocks!



DV Info Net -- Real Names, Real People, Real Info!
1998-2024 The Digital Video Information Network