RAID
RAID stands for redundant array of inexpensive disks or redundant array of independent disks. Its primary purpose is to provide fault tolerance and protection against file server hard disk failure and the resultant loss of availability and data. Some RAID types secondarily improve system performance by caching and distributing disk reads from multiple disks that work together to save files simultaneously.
Simply put, RAID separates the data into multiple units and stores them on multiple disks by using a process called striping. It can be implemented as either a hardware or a software solution; each type of implementation has its own issues and benefits.
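The striping idea can be illustrated with a short sketch. This is a simplified model, not how any particular controller works: the function names `stripe` and `unstripe`, the in-memory lists standing in for disks, and the unit size are all assumptions made for illustration.

```python
def stripe(data: bytes, num_disks: int, unit_size: int) -> list[list[bytes]]:
    """Split data into fixed-size units and deal them round-robin across disks."""
    if num_disks < 1 or unit_size < 1:
        raise ValueError("num_disks and unit_size must be positive")
    disks: list[list[bytes]] = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), unit_size):
        unit = data[offset:offset + unit_size]
        # Unit 0 goes to disk 0, unit 1 to disk 1, and so on, wrapping around.
        disks[(offset // unit_size) % num_disks].append(unit)
    return disks


def unstripe(disks: list[list[bytes]]) -> bytes:
    """Reassemble the original data by reading one unit from each disk in turn."""
    longest = max((len(d) for d in disks), default=0)
    units = []
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                units.append(disk[row])
    return b"".join(units)


if __name__ == "__main__":
    layout = stripe(b"ABCDEFGH", num_disks=3, unit_size=2)
    print(layout)
    print(unstripe(layout))
```

Because consecutive units sit on different disks, several disks can service one large read or write at the same time, which is where the performance gain comes from.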
The RAID Advisory Board has defined three classifications of RAID:
· Failure-Resistant Disk Systems (FRDS)
· Failure-Tolerant Disk Systems
· Disaster-Tolerant Disk Systems
RAID Levels
RAID is implemented in one or a combination of several ways, called levels. They are:
- RAID Level 0 creates one large disk by using several disks. This process is called striping. It stripes data across all disks (but provides no redundancy) by using all the available drive space to create the maximum usable data volume size and to increase the read/write performance. One problem with this level of RAID is that it actually lessens the fault tolerance of the disk system rather than increasing it; the entire data volume is unusable if one drive in the set fails.
- RAID Level 1 is commonly called mirroring. It mirrors the data from one disk or set of disks by duplicating the data onto another disk or set of disks. This process is often implemented by a one-for-one disk-to-disk ratio; each drive is mirrored to an equal drive partner that is continually being updated with current data. If one drive fails, the system automatically gets the data from the other drive. The main issue with this level of RAID is that the one-for-one ratio is very expensive, resulting in the highest cost per megabyte of data capacity. This level effectively doubles the number of hard drives you need; therefore, it is usually best for smaller-capacity systems.
- RAID Level 2 consists of bit-interleaved data on multiple disks. The parity information is created by using a Hamming code, which detects errors and establishes which part of which drive is in error. It defines a disk drive system with 39 disks – 32 disks of user storage and seven disks of error-recovery coding. This level is not used in practice and was quickly superseded by the more flexible levels of RAID that follow.
- RAID Levels 3 and 4 are discussed together because they function in the same way. The only difference is that Level 3 is implemented at the byte level, whereas Level 4 is usually implemented at the block level. In this scenario, data is striped across several drives, and the parity check bit is written to a dedicated parity drive. This process is similar to RAID 0. They both have a large data volume, but the addition of a dedicated parity drive provides redundancy. If a hard disk fails, the data can be reconstructed by using the bit information on the parity drive. The main issue with these levels of RAID is that the constant writes to the parity drive can create a performance hit. In this implementation, spare drives can be used to replace crashed drives.
- RAID Level 5 stripes the data and the parity information at the block level across all the drives in the set. It is similar to RAID 3 and 4 except that the parity information is written to the next-available drive rather than to a dedicated drive by using an interleave parity. This feature enables more flexibility in the implementation and increases fault tolerance because the parity drive is not a single point of failure, as it is in RAID 3 and 4. The disk reads and writes are also performed concurrently, thereby increasing performance over levels 3 and 4. The spare drives that replace the failed drives are usually hot swappable, meaning they can be replaced on the server while the system is up and running. This is probably the most popular implementation of RAID today.
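The parity-based recovery described for Levels 3, 4, and 5 can be sketched with exclusive-OR (XOR), the operation commonly used for single-parity schemes. The helper names below (`xor_blocks`, `reconstruct`, `parity_disk_for_stripe`) are hypothetical, and the rotation rule is only one possible layout; real implementations vary.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    if not blocks:
        raise ValueError("at least one block is required")
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(surviving_blocks: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing data block from the survivors and the parity."""
    return xor_blocks(surviving_blocks + [parity])


def parity_disk_for_stripe(stripe_index: int, num_disks: int) -> int:
    """Illustrative rotation (an assumption): move parity to the next disk each stripe.

    A dedicated-parity layout (Levels 3 and 4) would return a fixed disk instead.
    """
    return stripe_index % num_disks


if __name__ == "__main__":
    data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
    parity = xor_blocks(data)
    # Pretend the disk holding data[1] failed and rebuild its block.
    rebuilt = reconstruct([data[0], data[2]], parity)
    print(rebuilt == data[1])
```

A single parity block can recover only one missing block per stripe, which is why these levels tolerate the loss of one drive at a time.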
Vendors created various other implementations of RAID to combine the features of several RAID levels, although these levels are less common. Level 6 is an extension of Level 5 that allows for additional fault tolerance by using a second independent distributed-parity scheme (i.e., two-dimensional parity). Level 10 is created by combining Level 0 (striping) with Level 1 (mirroring). Level 15 is created by combining Level 1 (mirroring) with Level 5 (interleave). Level 51 is created by mirroring entire Level 5 arrays. The table below shows the various levels of RAID with terms you will need to remember.
Table: RAID Level Descriptions
| RAID LEVEL | DESCRIPTION |
| 0 | Striping |
| 1 | Mirroring |
| 2 | Hamming Code Parity |
| 3 | Byte Level Parity |
| 4 | Block Level Parity |
| 5 | Interleave Parity |
| 6 | Second Independent Parity |
| 7 | Single Virtual Disk |
| 10 | Striping Across Multiple Pairs (1+0) |
| 15 | Striping With Parity Across RAID 1 Pairs (1+5) |
| 51 | Mirrored RAID 5 Arrays With Parity (5+1) |
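The cost trade-off between levels becomes clearer when usable capacity is worked out. The sketch below assumes identical disks, two-way mirroring for Levels 1 and 10, and the usual minimum disk counts; the function name `usable_capacity` is hypothetical, and Levels 2, 7, 15, and 51 are left out because their layouts depend on the specific configuration.

```python
def usable_capacity(level: int, num_disks: int, disk_size: float) -> float:
    """Return usable capacity (same unit as disk_size) for common RAID levels."""
    minimum_disks = {0: 2, 1: 2, 3: 3, 4: 3, 5: 3, 6: 4, 10: 4}
    if level not in minimum_disks:
        raise ValueError(f"RAID level {level} is not covered by this sketch")
    if num_disks < minimum_disks[level]:
        raise ValueError(f"RAID {level} needs at least {minimum_disks[level]} disks")
    if level in (1, 10) and num_disks % 2:
        raise ValueError("mirrored pairs need an even number of disks")

    if level == 0:
        return num_disks * disk_size            # no redundancy, all space usable
    if level in (1, 10):
        return num_disks * disk_size / 2        # every disk has a mirror partner
    if level in (3, 4, 5):
        return (num_disks - 1) * disk_size      # one disk's worth of parity
    return (num_disks - 2) * disk_size          # Level 6: two disks' worth of parity


if __name__ == "__main__":
    for level in (0, 1, 5, 6, 10):
        print(level, usable_capacity(level, num_disks=4, disk_size=100))
```

With four 100 GB disks, mirroring yields half the raw space, while single parity gives up only one disk's worth, which is the cost difference the level descriptions point to.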
Backup Concepts
A CISSP candidate will also need to know the basic concepts of data backup. The candidate might be presented with questions regarding file selection methods, tape format types, and common problems.
Tape Backup Methods
The purpose of a tape backup method is to protect and restore lost, corrupted, or deleted information – thereby preserving the data’s integrity and ensuring network availability. There are several methods of selecting files for backup.
Most backup methods use the Archive file attribute to determine whether the file should be backed up. The backup software determines which files need to be backed up by checking to see whether the Archive file attribute has been set and then resets the Archive bit value to null after the backup procedure.
The three most common methods are:
1. Full Backup Method – This backup method makes a complete backup of every file on the server every time it is run. A full or complete backup backs up all files in all directories stored on the server regardless of when the last backup was made and whether the files have already been backed up. The Archive file attribute is changed to mark that the files have been backed up, and the tape or tapes will have all data and applications on them. The method is primarily run for system archive or baselined tape sets.
2. Incremental Backup Method – The incremental backup method backs up only files that have been created or modified since the last backup was made, or in other words files whose Archive file attribute is set. The attribute is reset once each file has been backed up. This can result in the backup operator needing several tapes to do a complete restoration, because every tape with changed files as well as the last full backup tape will need to be restored.
3. Differential Backup Method – The differential backup method backs up only files that have been created or modified since the last full backup was made. The difference between an incremental backup and a differential backup is that the Archive file attribute is not reset after the differential backup is completed. Therefore the changed file is backed up every time the differential backup is run. The backup set grows in size until the next full backup, as these files continue to be backed up during each subsequent differential backup. The advantage of this backup method is that the backup operator should need only the full backup and the latest differential backup to restore the system.
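The way the three methods treat the Archive attribute can be modeled in a few lines. This is a toy simulation, not real backup software: the `FileEntry` class, the `run_backup` function, and the file names are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class FileEntry:
    name: str
    archive: bool = True  # set whenever the file is created or modified


def run_backup(files: list[FileEntry], method: str) -> list[str]:
    """Return the names of the files a backup run would copy, updating Archive bits."""
    if method == "full":
        selected = list(files)                      # everything, regardless of the bit
    elif method in ("incremental", "differential"):
        selected = [f for f in files if f.archive]  # only files flagged as changed
    else:
        raise ValueError(f"unknown backup method: {method}")

    # Full and incremental runs reset the bit; a differential run leaves it set.
    if method != "differential":
        for f in selected:
            f.archive = False
    return [f.name for f in selected]


if __name__ == "__main__":
    files = [FileEntry("a.doc"), FileEntry("b.xls"), FileEntry("c.txt")]
    print(run_backup(files, "full"))          # all three files
    files[0].archive = True                   # a.doc is modified
    print(run_backup(files, "differential"))  # a.doc
    files[1].archive = True                   # b.xls is modified
    print(run_backup(files, "differential"))  # a.doc and b.xls: the set grows
    print(run_backup(files, "incremental"))   # a.doc and b.xls, bits now reset
    print(run_backup(files, "incremental"))   # nothing changed since
```

The growing differential set and the empty second incremental run show why a restore needs only the latest differential tape but every incremental tape since the last full backup.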
Other Backup Formats
- Compact Disc (CD) Optical Media. Write once, read many (WORM) optical disk “jukeboxes” are used for archiving data that does not change. This is a very good format to use for a permanent backup. Companies use this format to store data that may need to be accessed at a much later date, such as legal data. The shelf life of a CD is also longer than that of a tape. Rewritable and erasable (CD-RW) optical disks are sometimes used for backups that require short-time storage for changeable data but require faster file access than tape. This format is used more often for very small data sets.
- Zip/Jaz Drives, SyQuest, and Bernoulli Boxes. These types of drives are frequently used for the individual backups of small data sets of specific application data. These formats are very transportable and are often the standard for data exchange in many businesses.
- Tape Arrays. A tape array is a large hardware/software system that uses the RAID technology we discussed earlier in a large device with multiple (sometimes 32 or 64) tapes, configured as a single array. These devices require very specific hardware and software to operate, but they provide a very fast backup and a multitasking backup of multiple targets with considerable fault tolerance.
- Hierarchical Storage Management (HSM). HSM provides a continuous online backup by using optical or tape “jukeboxes,” similar to WORMs. It appears as an infinite disk to the system and can be configured to provide the closest version of an available real-time backup. This is commonly employed in very large data retrieval systems.
Common Backup Issues and Problems
All backup systems share common issues and problems, whether they use a tape or a CD-ROM format. There are three primary backup concerns:
- Slow data transfer of the backup. All backups take time, especially tape backup. Depending upon the volume of data that needs to be copied, full backups to tape can take an incredible amount of time. In addition, the time required to restore the data must also be factored into any disaster recovery plan. Backups that pass data through the network infrastructure must be scheduled during periods of low network utilization, which are commonly overnight, over the weekend, or during holidays. This also requires off-hour monitoring of the backup process.
- Server disk space utilization expands over time. As the amount of data that needs to be copied increases, the length of time to run the backup proportionally increases, and the demand on the system grows as more tapes are required. Sometimes the data volume on the hard drives expands very quickly, thus overwhelming the backup process. Therefore, this process must be monitored regularly.
- The time the last backup was run is never the time of the server crash. With noncontinuous backup systems, data that was entered after the last backup prior to a system crash will have to be recreated. Some systems have been designed to provide online fault tolerance during backup (the old Vortex Retrochron was one), yet, because backup is a postprocessing batch process, some data reentry will need to be performed.
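The first two concerns come down to simple arithmetic: backup time grows in proportion to data volume, and it must fit the off-hours window. The sketch below assumes a constant sustained transfer rate and decimal units (1 GB = 1000 MB); the function names and the figures in the example are hypothetical, not measurements.

```python
def estimate_backup_hours(data_gb: float, throughput_mb_per_s: float) -> float:
    """Estimate how long a backup takes at a constant sustained transfer rate."""
    if data_gb < 0 or throughput_mb_per_s <= 0:
        raise ValueError("data_gb must be >= 0 and throughput must be > 0")
    seconds = (data_gb * 1000) / throughput_mb_per_s
    return seconds / 3600


def fits_window(data_gb: float, throughput_mb_per_s: float, window_hours: float) -> bool:
    """Check whether the estimated backup time fits the scheduled window."""
    return estimate_backup_hours(data_gb, throughput_mb_per_s) <= window_hours


if __name__ == "__main__":
    # Hypothetical figures: 360 GB of data, 10 MB/s sustained, 8-hour overnight window.
    print(estimate_backup_hours(360, 10))
    print(fits_window(360, 10, window_hours=8))
```

Rerunning an estimate like this as disk utilization grows is one way to notice when a full backup is about to outgrow its window. Restore time can be estimated the same way and belongs in the disaster recovery plan.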
