Failure Protection in Teradata
Let's first look at data allocation in Teradata. Basically, the OS recognizes logical units (LUNs), each composed of slices (UNIX) or partitions (Windows/Linux) from each of the disk drives in a disk rank. The PDE then translates each LUN into one or more pdisks, and the pdisks are assigned to AMPs. All the logical disk space an AMP manages is called a vdisk. In general, all pdisks from a rank are assigned to the same AMP.
Failure protection in Teradata is provided at several different levels:
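To make the pdisk/vdisk relationship concrete, here is a minimal sketch in Python. The pdisk names, rank names, sizes and the rank-to-AMP mapping are all made up for illustration; this is not how the PDE actually stores its configuration.

```python
from collections import defaultdict

# Hypothetical inventory: (pdisk name, rank it was carved from, size in GB).
pdisks = [
    ("pdisk0", "rank0", 72),
    ("pdisk1", "rank0", 72),
    ("pdisk2", "rank1", 72),
    ("pdisk3", "rank1", 72),
]

# Assumed assignment: every pdisk of a rank goes to the same AMP.
rank_to_amp = {"rank0": "AMP0", "rank1": "AMP1"}

# An AMP's vdisk is simply the collection of pdisks it manages.
vdisks = defaultdict(list)
for name, rank, size_gb in pdisks:
    vdisks[rank_to_amp[rank]].append((name, size_gb))

def vdisk_size_gb(amp):
    """Total logical disk space managed by one AMP."""
    return sum(size for _, size in vdisks[amp])
```

Calling vdisk_size_gb("AMP0") adds up the two rank0 pdisks, which is the sense in which a vdisk is a logical view rather than a physical device.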
Disk drive level: RAID
RAID: Redundant Array of Independent(or Inexpensive) Disks.
The various RAID designs pursue two key goals: increased data reliability and increased input/output performance. Six designs, RAID 1 through RAID 6, provide fault tolerance. (There is also RAID 0, which has no fault tolerance, and RAID 10, TBD.)
Teradata supports RAID 1 and RAID 5.
- RAID 1 (mirroring without parity): Data is fully replicated on one or more mirror disks. A block is read from the first available disk. Besides failure protection, this also gives a read performance benefit.
- RAID 5 (block-level striping with distributed parity): Data is striped across a rank of disks one segment at a time. Parity is also striped across all the disk drives, interleaved with the data. When a disk fails, its data is reconstructed on the fly from the remaining data and parity.
RAID 1 is faster than RAID 5, because the two (or more) mirrored disks can be read in parallel and there is no parity computation.
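The RAID 5 reconstruction described above relies on XOR parity. The following Python sketch shows the idea on a single stripe. The four-drive rank and the segment contents are invented for the example, and a real array controller works on much larger blocks and rotates the parity position from stripe to stripe.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe on a hypothetical four-drive rank: three data segments plus parity.
data = [b"AMP1", b"AMP2", b"AMP3"]
parity = xor_blocks(data)

# Simulate losing the drive that holds the second segment.
survivors = [data[0], data[2], parity]

# XOR of the surviving segments and the parity yields the lost segment.
rebuilt = xor_blocks(survivors)
```

Every read of a lost segment needs this extra computation over all surviving drives, which is why a degraded RAID 5 rank is slower than a RAID 1 pair that simply reads the mirror.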
AMP level: Fallback tables
A second copy of each row of a table is stored on a different AMP in the same cluster. Fallback is specified during table creation. It causes twice the I/O on data modifications.
Obviously the highest level of protection is RAID 1 with Fallback protection.
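As a rough illustration of Fallback, the Python sketch below places each row on a primary AMP and a fallback AMP within one cluster, and reads from the fallback copy when the primary AMP is down. The cluster layout and the placement formula are assumptions made for the example; Teradata's real hash maps work differently.

```python
CLUSTER = ["AMP0", "AMP1", "AMP2", "AMP3"]  # one hypothetical cluster

def place_row(row_hash):
    """Return (primary AMP, fallback AMP) for a row hash."""
    n = len(CLUSTER)
    primary = row_hash % n
    # An offset in 1..n-1 guarantees the fallback copy is on a different AMP.
    offset = 1 + (row_hash // n) % (n - 1)
    fallback = (primary + offset) % n
    return CLUSTER[primary], CLUSTER[fallback]

def read_row(row_hash, down_amps):
    """Pick the AMP that serves the row, given the set of AMPs that are down."""
    primary, fallback = place_row(row_hash)
    if primary not in down_amps:
        return primary
    if fallback not in down_amps:
        return fallback
    raise RuntimeError("both copies of the row are unavailable")
```

With this layout the table survives the loss of any single AMP in the cluster, at the cost of writing every row twice.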
Component/Process Level: Journal
Journals are used for specific types of data or process recovery.
Recovery Journals: maintained automatically by the system. There are two types:
- transient journal: keeps a "before image" of each changed row so that data can be restored to its previous state if a transaction is interrupted. Maintained on each AMP.
- down AMP recovery journal: while an AMP is down, the other AMPs in the cluster log the changes made to its data. The changes are then applied to the AMP once it recovers.
Permanent Journals: optional; specified by the user at the table level. They can store before images or after images to provide full-table recovery to a specific point in time.
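The transient journal idea (save a before image, restore it if the transaction is interrupted) can be sketched in a few lines of Python. The Table class and its methods are hypothetical names for this example only and say nothing about how an AMP implements its journal internally.

```python
_MISSING = object()  # marks "row did not exist before the change"

class Table:
    def __init__(self, rows):
        self.rows = dict(rows)
        self.journal = []  # before images for the open transaction

    def update(self, key, value):
        # Save the before image first, then apply the change.
        self.journal.append((key, self.rows.get(key, _MISSING)))
        self.rows[key] = value

    def commit(self):
        # The transaction completed, so the before images can be discarded.
        self.journal.clear()

    def rollback(self):
        # Restore before images in reverse order of the changes.
        while self.journal:
            key, before = self.journal.pop()
            if before is _MISSING:
                self.rows.pop(key, None)
            else:
                self.rows[key] = before
```

Undoing in reverse order matters when the same row is changed more than once in a transaction: the oldest before image is applied last, so the row ends up in its original state.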
Database Object Level: Locks
Applied at 3 different levels: Database/Table/Row Hash
4 types:
- Exclusive: at db/table level, used for DDL, blocks all other locks
- Write: ensures data consistency while writing, only allows access locks
- Read: ensures data consistency while reading, allows read/access locks
- Access: lets a user read data even while another user holds a write lock on it, so the data read may be inconsistent; compatible with every lock except exclusive.
Local deadlocks are checked at the AMP level, and global deadlocks are coordinated by the PE on a timed basis.
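The compatibility rules for the four lock types can be written down as a small table. The Python sketch below is only an illustration of those rules: the names COMPATIBLE and can_grant are made up, and it assumes that two access locks are compatible with each other.

```python
# For each held lock, the lock requests that can be granted alongside it.
COMPATIBLE = {
    "access": {"access", "read", "write"},
    "read": {"access", "read"},
    "write": {"access"},
    "exclusive": set(),
}

def can_grant(requested, held):
    """True if the requested lock is compatible with every lock already held."""
    return all(requested in COMPATIBLE[h] for h in held)
```

For example, can_grant("read", ["write"]) is False, so the reader has to wait, while can_grant("access", ["write"]) is True, which is exactly the case where a query may see data that is still being changed.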