## Monday, April 30, 2007

### My View of the IBM PC Boot Process

-------------------------------------------------------

dd if=/dev/hda of=mbr.bin bs=512 count=1

-------------------------------------------------------

1. "Booting" entry from Wikipedia
2. "MBR" entry from Wikipedia
3. From Power Up to Bash Prompt
4. "LPI certification 101 exam prep: Linux installation and package management" from IBM developerWorks

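### [Repost] Boot Process Overview

The standard hard disk MBR used by MS-DOS, PC-DOS, and Windows checks the partition table, looks for the primary partition on the boot drive that is marked active, loads the first sector from that partition, and passes control to the beginning of the loaded code. This new piece of code is also known as the partition boot record. The partition boot record is really another first-stage boot loader, but it is able to load a set of code blocks from the partition. This new code is called the second-stage boot loader. In MS-DOS and PC-DOS, the second-stage boot loader loads the rest of the operating system directly. This is how an operating system pulls itself up by its bootstraps.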

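1. Loadlin, a DOS executable that can be invoked from a running DOS system to boot a Linux partition. It was popular in the days when setting up a multi-boot system was complicated and risky.
2. OS/2 Boot Manager, a program installed in a small dedicated partition. That partition is marked active, so the standard MBR boot process starts Boot Manager, which displays a menu that lets the user choose which operating system to boot.
3. Smart boot loaders, programs that can reside on an operating system partition and can be invoked either by the partition boot record of the active partition or by the master boot record. These include:
• BootMagic™, part of Norton PartitionMagic™
• GRUB, the GRand Unified Boot loader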

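The second-stage loaders used in LILO and GRUB let you choose which of several operating systems or versions to load. The notable difference between them is that after the system changes (a kernel upgrade, for example) a command must be run to recreate the LILO boot setup, whereas GRUB picks up the change from an editable configuration text file. LILO has a longer history; GRUB is newer. The original GRUB has now become GRUB Legacy, and GRUB 2 is being developed under the auspices of the Free Software Foundation.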

Cylinder-head-sector, also known as CHS, was an early method for giving addresses to each physical block of data on a hard drive. In the case of floppy drives, for which the same exact diskette medium can be truly low-level formatted to different capacities, this is still true.

Though CHS values no longer have a direct physical relationship to the data stored on disks, pseudo CHS values (which can be translated by disk electronics or software) are still being used by many utility programs.

## Definitions

### Platters and Tracks

Every disk drive consists of one or more platters. A platter can be thought of as a collection of concentric, flat rings. Data is stored on the surface of a platter inside these rings, or circular strips, which are called tracks. These tracks, which may exist on either side of a platter, can be delineated by specifying certain cylinder and head values.

Data is written to and read from the surface of a platter by a device called a head. Naturally, a platter has 2 sides and thus 2 surfaces on which data can be manipulated; usually there are 2 heads per platter, one on each side. (Sometimes the term side is substituted for head, since platters may be separated from their head assemblies, as is definitely the case with the removable media of a floppy drive.)

### Cylinders

A cylinder comprises all the tracks that can be accessed by the read/write heads while their access arms remain stationary. If drives used only one side of a single platter, the term "track" would always be used instead of "cylinder."

### Sectors

A platter can also be thought of as a collection of slices called sectors.

### Blocks

The intersection of a cylinder and a sector is called a block. These blocks are the smallest geometrical breakdown of a disk, and they represent the smallest amount of data with which a disk can deal (each block usually contains 512 bytes of data).

Note: Many PC engineers and technicians now use the term sector to speak of the smallest amount of data a disk drive can access. The UNIX/Linux communities, however, still employ the term block in the sense described above, for 512-byte chunks of data, or various multiples thereof. For example, the Linux fdisk utility normally displays partition table information using 1024-byte Blocks while also using the term sector in describing the disk's size in the phrase, "63 sectors/track" (though technically, the phrase "63 sectors/cylinder" should be used).

Hence, each block of data can be addressed by specifying a cylinder, head, and sector. The following formulas detail the CHS geometry and addressing scheme.

The number of blocks on one side of a platter is:

   blocksPerPlatterSide = (cylindersPerPlatter)*(SectorsPerPlatter)

The number of blocks per platter is:

   blocksPerPlatter = (blocksPerPlatterSide)*(sidesUsedPerPlatter)

which is usually written in terms of the number of heads used:

   blocksPerPlatter = (blocksPerPlatterSide)*(HeadsPerPlatter)

This is usually expanded to:

   blocksPerPlatter =  (cylindersPerPlatter)*(SectorsPerPlatter)*(HeadsPerPlatter)

and rearranged:

   blocksPerPlatter =  (cylindersPerPlatter)*(HeadsPerPlatter)*(SectorsPerPlatter)

then finally rewritten as:

   blocksPerPlatter = (Cylinders)*(Heads)*(Sectors)

## Examples

A "1.44 MB"[1] floppy disk has 80 cylinders (numbered 0 to 79), 2 heads (numbered 0 to 1) and 18 sectors (numbered 1 to 18). Therefore, its capacity in blocks is computed as follows:

   blocksPerPlatter = (80)*(2)*(18) = 2880

The four 16-byte entries within an MBR's Partition Table limit the values of their CHS tuples to: 1024 cylinders, 255 heads and 63 sectors. For computers whose BIOS code was also limited to using only these CHS values, what was the largest size hard disk they could handle? Starting with the formula above, but also multiplying by 512 bytes/block, the hard disk could be no larger than:

   (1024)*(255)*(63)*(512) = 8,422,686,720 bytes (about 8.4 GB)
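
The capacity formula above is easy to check mechanically. The following is a minimal Python sketch, not taken from any of the quoted sources; the function names `chs_blocks` and `chs_bytes` are made up for illustration, and the 512-byte block size is the usual value mentioned in the text.

```python
BYTES_PER_BLOCK = 512  # the usual block size cited in the text


def chs_blocks(cylinders: int, heads: int, sectors: int) -> int:
    """Number of addressable blocks for a Cylinders x Heads x Sectors geometry."""
    return cylinders * heads * sectors


def chs_bytes(cylinders: int, heads: int, sectors: int,
              block_size: int = BYTES_PER_BLOCK) -> int:
    """Capacity in bytes for the same geometry."""
    return chs_blocks(cylinders, heads, sectors) * block_size


if __name__ == "__main__":
    print(chs_blocks(80, 2, 18))     # "1.44 MB" floppy: 2880 blocks
    print(chs_bytes(80, 2, 18))      # 1474560 bytes
    print(chs_bytes(1024, 255, 63))  # MBR CHS limit: 8422686720 bytes
```

The floppy result, 1,474,560 bytes, is the same figure given in the note on the "1.44 MB" label.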

## History

Older hard drives, such as MFM and RLL drives, divided each cylinder into an equal number of sectors and the CHS values matched the physical makeup of the drive. A drive with a CHS value of 500 x 4 x 32 would have 500 tracks per side on each platter, two platters, and 32 sectors per cylinder, with a total of 32,768,000 bytes (about 32.8 MB, or 31.25 MiB). Most modern drives have a surplus space that doesn't make a cylinder boundary. Each partition should always start and end at a cylinder boundary. Only some of the most modern operating systems may disregard this rule, but doing so can still cause some compatibility issues, especially if the user wants to boot more than one OS on the same drive.

ATA/IDE drives have replaced the MFM and RLL drives, and are much more efficient at storing data. They use Zone Bit Recording (ZBR), where the number of sectors in a cylinder varies with its location on the drive. Cylinders nearer to the edge of the platter contain more sectors than cylinders close to the spindle, because there is more space in a given track near the edge of the platter. The CHS addressing scheme does not work on these drives because of the varying number of sectors per cylinder. An IDE drive can be set in the system BIOS with any configuration of cylinders, heads, and sectors that does not exceed the capacity of the drive, since the drive will convert any given CHS value into an actual address for its specific hardware configuration.

## Notes

1. ^ This popular label for the usual format of a 3.5-inch floppy diskette is a misnomer, since its actual capacity is 1440 KiB; or: 1440 binaryKilo-bytes x 1024/binaryKilo = (exactly) 1,474,560 bytes (which is neither 1,440,000 bytes nor the equivalent of 1.44 MiB).

### Hard drives, disks, storage. Partitions: primary, extended, and logical. Random notes.

-------------------------------------------------------------------------------------

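1. Classification of storage

• Standby versus hibernation in the OS: in standby, power to the hard disk, the display, and so on is cut, but memory stays powered, so the system can resume quickly. Hibernation instead saves all of the current state (the context) to the hard disk and then powers off. On the next power-up, the OS detects that it was hibernating when it last shut down, reads the context back from the hard disk, and quickly restores the state it was in before hibernating.

2. The disk's MBR (Master Boot Record)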

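-01B7 holds four 9-byte primary partition table entries, and 01BE-01FD holds the four 16-byte primary partition table entries! This is also why people always say that a hard disk can have only four primary partitions.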
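
To make the layout concrete, here is a minimal Python sketch that reads the four 16-byte entries at offsets 0x01BE-0x01FD of a 512-byte sector such as the `mbr.bin` produced by the `dd` command at the top of this post. The field layout (status byte, CHS start, type byte, CHS end, 32-bit start LBA, 32-bit sector count) and the 0x55 0xAA signature are the standard MBR format; the function names and the synthetic demo sector are made up for illustration.

```python
import struct

PARTITION_TABLE_OFFSET = 0x1BE  # first of the four 16-byte entries
ENTRY_SIZE = 16
BOOT_SIGNATURE = b"\x55\xaa"    # last two bytes of a valid MBR (offset 0x1FE)


def parse_partition_table(mbr: bytes) -> list:
    """Return the non-empty primary partition entries of a 512-byte MBR."""
    if len(mbr) < 512:
        raise ValueError("an MBR is 512 bytes long")
    if mbr[0x1FE:0x200] != BOOT_SIGNATURE:
        raise ValueError("missing 0x55 0xAA boot signature")

    entries = []
    for slot in range(4):
        start = PARTITION_TABLE_OFFSET + slot * ENTRY_SIZE
        # status, CHS of first sector, type, CHS of last sector, start LBA, size
        status, _chs_first, ptype, _chs_last, lba_start, sectors = struct.unpack(
            "<B3sB3sII", mbr[start:start + ENTRY_SIZE]
        )
        if ptype == 0x00:  # type 0 marks an unused slot
            continue
        entries.append({
            "slot": slot + 1,
            "active": status == 0x80,  # 0x80 is the "active" (bootable) flag
            "type": ptype,
            "lba_start": lba_start,
            "sectors": sectors,
        })
    return entries


def make_demo_mbr() -> bytes:
    """Build a synthetic MBR with one active Linux (0x83) entry in slot 2."""
    mbr = bytearray(512)
    entry = struct.pack("<B3sB3sII", 0x80, b"\x00" * 3, 0x83, b"\x00" * 3, 63, 1000)
    mbr[0x1CE:0x1DE] = entry  # 0x1BE + 16 is the second slot
    mbr[0x1FE:0x200] = BOOT_SIGNATURE
    return bytes(mbr)


if __name__ == "__main__":
    for entry in parse_partition_table(make_demo_mbr()):
        print(entry)
```

To try it on a real disk image, pass the contents of `mbr.bin` (opened in binary mode) to `parse_partition_table` instead of the demo sector.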

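• Must an operating system be installed on a primary partition? I have heard that Windows and FreeBSD require it... but Linux definitely does not, because I have always installed it on a logical partition :)

Linux numbers primary and extended partitions 1 through 4, so /dev/hda can have four primary partitions: /dev/hda1, /dev/hda2, /dev/hda3, and /dev/hda4. It can also have one primary partition, /dev/hda1, and one extended partition, /dev/hda2. Logical partitions, if defined, are numbered starting from 5, so the first logical partition on /dev/hda is /dev/hda5, even if the disk has no primary partitions and only one extended partition (/dev/hda1).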

[root@localhost ~]# fdisk -l

Disk /dev/hda: 120.0 GB, 120034123776 bytes
255 heads, 63 sectors/track, 14593 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/hda1 1 892 7164958+ 12 Compaq diagnostics
/dev/hda2 * 893 4156 26218080 c W95 FAT32 (LBA)
/dev/hda3 4157 9371 41889456 c W95 FAT32 (LBA)
/dev/hda4 9372 14593 41945715 f W95 Ext'd (LBA)
/dev/hda5 9372 11982 20972826 b W95 FAT32
/dev/hda6 11983 11995 104391 83 Linux
/dev/hda7 11996 12256 2096451 82 Linux swap / Solaris
/dev/hda8 12257 14593 18771921 83 Linux

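hda1 (primary partition 1) was turned into a diagnostics partition by my laptop's manufacturer... enough said. hda2 (primary partition 2) is the partition used for booting (?), and it uses LBA. hda3 and hda4 also occupy primary partition slots, and hda4 is an extended partition. (Adding up hda5 through hda8 gives 20972826+104391+2096451+18771921=41945589. Why does this not match hda4? 41945715-41945589=126; my guess is that some space is used to store the extended partition table information.)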
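
The 126-block gap can be reproduced from the cylinder numbers in the fdisk listing. The Python sketch below is my own check, not part of the fdisk output: it assumes that each logical partition is preceded by one reserved track of 63 sectors (where its extended boot record lives), that hda1 likewise skips the first track (where the MBR lives), and that fdisk reports 1 KiB blocks with a trailing `+` when one 512-byte sector is left over. Under those assumptions every Blocks value used here comes out exactly.

```python
HEADS = 255
SECTORS_PER_TRACK = 63
SECTORS_PER_CYLINDER = HEADS * SECTORS_PER_TRACK  # 16065, as in the fdisk header


def fdisk_blocks(start_cyl: int, end_cyl: int, reserved_sectors: int = 0) -> str:
    """Size in 1 KiB blocks, with '+' appended if one sector is left over."""
    sectors = (end_cyl - start_cyl + 1) * SECTORS_PER_CYLINDER - reserved_sectors
    blocks, leftover = divmod(sectors, 2)  # two 512-byte sectors per block
    return f"{blocks}{'+' if leftover else ''}"


if __name__ == "__main__":
    print("hda4", fdisk_blocks(9372, 14593))  # the whole extended partition
    logicals = {"hda5": (9372, 11982), "hda6": (11983, 11995),
                "hda7": (11996, 12256), "hda8": (12257, 14593)}
    for name, (start, end) in logicals.items():
        # assumption: one 63-sector track is reserved in front of each logical partition
        print(name, fdisk_blocks(start, end, reserved_sectors=SECTORS_PER_TRACK))
```

Four reserved tracks of 63 sectors make 252 sectors, which is exactly 126 blocks of 1 KiB, so the guess about partition table bookkeeping is consistent with the numbers.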

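Data exchange on the IBM PC has been through some history. Roughly speaking, the data bus went from the now-obsolete ISA, to the widely used PCI (which also seems to be in decline), to AGP, and then to PCI-E. The data bus used by disks was called IBM-AT (Advanced Technology). It was later renamed ATA (AT Attachment), and once the serial bus (S-ATA) appeared it also became known as P-ATA (P stands for Parallel). Disks that use PATA are also called IDE disks (a name left over from a Western Digital specification, Integrated Drive Electronics; I suspect this is also why SATA disks are recognized as SCSI disks on a machine...)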

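• Misconception 1: of two drives attached to two separate ATA cables, one is the master and the other is the slave. In fact, master and slave only apply to two drives sharing the same IDE cable. This is also what the term "jumper" refers to: if the two drives are on two separate cables, no jumper setting is needed at all. The jumper only designates which drive is the master and which is the slave.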
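• Misconception 2: the master drive is better than the slave. Not so; it is merely a matter of indexing. Two drives on one ATA cable are both limited by the same motherboard interface, that is, by the same disk controller. Neither really ranks above the other, so they are equal in both access speed and access priority.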
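• Misconception 3: two drives on one cable conflict over the shared medium. Modern host adapters time each device independently, so two drives of different speeds do not produce the so-called bottleneck effect in which the slower drive drags down the faster one. The cable is still shared, however: only one of the two drives can carry out a transfer at any given moment.
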
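3. CHS, LBA, Large

LBA (Logical Block Addressing) mode can manage up to 8.4 GB of disk space. In LBA mode, the cylinder, head, and sector parameters that are set are not the drive's actual physical parameters. When the disk is accessed, the IDE controller converts the logical address determined by the cylinder, head, and sector parameters into a physical address on the actual drive. In LBA mode the maximum number of heads that can be set is 255, and the other parameters are the same as in normal mode, so the accessible capacity works out to 512 x 63 x 255 x 1024 = 8.4 GB. Newer motherboard BIOSes, however, extend INT 13h so that LBA can support disks larger than 100 GB.

LARGE mode is used when the disk has more than 1024 cylinders and LBA is not supported. LARGE mode works by dividing the cylinder count by 2 and multiplying the head count by 2, so the total capacity stays the same.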
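
As a rough illustration of the two modes, here is a Python sketch. The CHS-to-LBA formula, LBA = (C x heads + H) x sectors + (S - 1), is the conventional mapping and is not spelled out in the text above; `large_mode_geometry` applies the single halve-and-double step described for LARGE mode. All function names are made up for this example.

```python
def chs_to_lba(c: int, h: int, s: int, heads: int, sectors_per_track: int) -> int:
    """Conventional CHS -> LBA mapping; sectors are numbered from 1."""
    return (c * heads + h) * sectors_per_track + (s - 1)


def lba_to_chs(lba: int, heads: int, sectors_per_track: int) -> tuple:
    """Inverse of chs_to_lba for the same logical geometry."""
    c, rest = divmod(lba, heads * sectors_per_track)
    h, s0 = divmod(rest, sectors_per_track)
    return c, h, s0 + 1


def large_mode_geometry(cylinders: int, heads: int, sectors: int) -> tuple:
    """One LARGE-mode step: halve the cylinders, double the heads.

    An odd cylinder count is rounded down, which gives up one cylinder.
    """
    return cylinders // 2, heads * 2, sectors


if __name__ == "__main__":
    print(chs_to_lba(1, 0, 1, heads=16, sectors_per_track=63))  # 1008
    print(lba_to_chs(1008, heads=16, sectors_per_track=63))     # (1, 0, 1)
    print(large_mode_geometry(2000, 16, 63))                    # (1000, 32, 63)
```

For the 2000-cylinder example, 2000 x 16 x 63 and 1000 x 32 x 63 both equal 2,016,000 blocks, which is the "capacity stays the same" property.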

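4. IRQ

5. I/O ports

6. A few words about Linux and Windows file systems

• C: is the system drive (System). Give it only enough space to leave a few GB free once the system is installed. Apart from the system and essential drivers, install nothing here.
• D: is the media drive (Media). It holds pictures, music, movies, and other multimedia files.
• E: is the games drive (Game). It is dedicated to games.
• F: is the tools drive (Tool). It is where tools get installed.
• G: is the documents drive (Document). It holds documents, e-books, and ghost images.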

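A Linux file system contains files arranged in directories on a disk or other block storage device. As on many other systems, directories on a Linux system can contain other directories, which are called subdirectories. Systems such as Microsoft® Windows® separate file systems by drive letter (A:, C:, and so on), but a Linux file system is a single tree with the / directory as its root directory.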

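• bin: essential command binaries
• boot: static files of the boot loader
• dev: device files
• etc: host-specific system configuration
• lib: essential shared libraries and kernel modules
• media: mount point for removable media
• mnt: mount point for temporarily mounted file systems
• opt: add-on application software
• sbin: essential system binaries
• srv: data for services provided by this system
• tmp: temporary files
• usr: secondary hierarchy
• var: variable data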

LPI certification 101 exam prep: Linux installation and package management

### MBR-Master Boot Record

A Master Boot Record (MBR), or partition sector, is the 512-byte boot sector that is the first sector ("Sector 0") of a partitioned data storage device such as a hard disk. (The boot sector of a non-partitioned device is a Volume Boot Record.) It is sometimes used for bootstrapping operating systems, containing a machine code program; sometimes used for holding part of a disk's partition table; and sometimes used for uniquely identifying individual disk media, with a 32-bit data signature; although on some machines it is entirely unused or redundant.[1][2][3][4]

Due to the broad popularity of IBM PC-compatible computers, this type of MBR is widely used, to the extent of being supported by, and incorporated into, other computer types, including newer cross-platform standards for bootstrapping and partitioning.

### Drive letter assignment

Drive letter assignment is the process of assigning drive letters to primary and logical partitions (drive volumes) in the root namespace; this usage is found in Microsoft operating systems. Unlike the concept of UNIX mount points, where the user can create directories of arbitrary name and content in a root namespace, drive letter assignment constrains the highest-level namespace to single letters. Drive letter assignment is thus a process of using letters to name the roots of the "forest" representing the file system; each volume holds an independent "tree" (or, for non-hierarchical file systems, an independent list of files).

## Origin

The concept of drive letters, as used today, probably owes its origins to IBM's VM family of operating systems, dating back to CP/CMS in 1967 (and its research predecessor CP-40). The concept evolved through several steps:

• CP/CMS used drive letters to identify minidisks attached to a user session. A full file reference (pathname in today's parlance) consisted of a filename, a filetype, and a disk letter called a filemode. Minidisks could correspond to physical disk drives, but more typically referred to logical drives, which were mapped automatically onto shared devices by the operating system as sets of virtual cylinders of fixed-size blocks. This was vastly easier to use than other mainframe file reference mechanisms, e.g. JCL.
• CP/CMS inspired numerous other operating systems, including the CP/M microcomputer operating system – which used a drive letter prefix (e.g. "A:") to specify a physical storage device. This usage was similar to the device prefixes used in the RSX-11 and VMS operating systems. Early versions of CP/M (and other microcomputer operating systems) implemented a flat file system on each disk drive, where a complete file reference consisted of a drive letter followed by a filename (eight characters) and a filetype (three characters): A:readme.txt. (This was the era of 8-inch floppy disks, where such small namespaces did not impose practical constraints.)
• The drive letter syntax chosen for CP/M was also adopted by Microsoft for its ubiquitous microcomputer operating systems MS-DOS and, later, Microsoft Windows. Originally, drive letters always represented physical volumes, but support for logical volumes eventually appeared.

Note that the important capability of hierarchical directories within each drive letter was initially absent from these systems. This was a major feature of UNIX and other robust operating systems, where hard disk drives held thousands (rather than tens or hundreds) of files. Increasing microcomputer storage capacities led to their introduction, eventually followed by long filenames. In file systems lacking such naming mechanisms, drive letter assignment proved a useful, simple organizing principle.

## JOIN and SUBST

Drive letters are not the only way of accessing different volumes. MS-DOS offers a JOIN command that allows access to an assigned volume through an arbitrary directory, similar to the Unix mount command. It also offers a SUBST command which allows the assignment of a drive letter to a directory. One or both of these commands are removed in later systems like OS/2 or Windows NT, but starting with Windows 2000 both are again supported: the SUBST command exists as before, while JOIN's functionality is subsumed in LINKD (part of the Windows Resource Kit). In Windows Vista, the new command MKLINK can be used for this purpose. Also Windows 2000 and later supports mount points, accessible from the Control Panel.

## Order of assignment

Except for CP/M and early versions of MS-DOS, each of these operating systems assigns drive letters according to the following algorithm:

1. Assign the drive letter 'A' to the boot floppy, and 'B' to the secondary floppy.
2. Assign a drive letter, beginning with 'C' to the first active primary partition recognised upon the first physical hard disk.
3. Assign subsequent drive letters to the first primary partition upon each successive physical hard disk drive, if present within the system.
4. Assign subsequent drive letters to every recognised logical partition, beginning with the first hard drive and proceeding through successive physical hard disk drives, if present within the system.
5. Assign subsequent drive letters to any RAM Disk.
6. Assign subsequent drive letters to any additional floppy, CD/DVD drives.
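
The six steps above can be sketched in a few lines of Python. This is only an illustration of the listed order, not code from any Microsoft system: the data model (a list of floppy names, and one dict per hard disk with a `"primary"` entry and a `"logicals"` list) and all volume names are hypothetical. As in the list, B is kept for a second floppy even when none is present, so hard disks always start at C.

```python
from string import ascii_uppercase


def assign_drive_letters(floppies, disks, ram_disks=(), others=()):
    """Map drive letters to volume names following the six steps listed above."""
    assignment = {}

    # Step 1: A and B belong to the first two floppy drives.
    for letter, name in zip("AB", floppies[:2]):
        assignment[letter] = name

    queue = []
    # Steps 2-3: the (active) primary partition of each disk, in disk order.
    queue += [disk["primary"] for disk in disks if disk.get("primary")]
    # Step 4: every logical partition, again in disk order.
    for disk in disks:
        queue += disk.get("logicals", [])
    # Step 5: RAM disks.
    queue += list(ram_disks)
    # Step 6: additional floppies, then CD/DVD drives.
    queue += list(floppies[2:]) + list(others)

    free_letters = ascii_uppercase[2:]  # C through Z
    if len(queue) > len(free_letters):
        raise ValueError("more volumes than available drive letters")
    assignment.update(zip(free_letters, queue))
    return assignment


if __name__ == "__main__":
    disks = [
        {"primary": "disk0-primary", "logicals": ["disk0-logical1", "disk0-logical2"]},
        {"primary": "disk1-primary", "logicals": ["disk1-logical1"]},
    ]
    result = assign_drive_letters(["floppy0"], disks, ["ramdisk"], ["cdrom"])
    for letter in sorted(result):
        print(f"{letter}: {result[letter]}")
```

With this input the second disk's primary partition gets D: ahead of the first disk's logical partitions, which is the behaviour the step order implies.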

MS-DOS versions 3 and earlier assign letters to all of the floppy drives before considering hard drives, so a system with four floppy drives would call the first hard drive 'E'.

The order can depend on whether a given disk is managed by a boot-time driver or by a dynamically loaded driver. For example, if the second or third hard disk is of SCSI type and on MS-DOS requires drivers loaded through the CONFIG.SYS file (e.g. the controller card does not offer on-board BIOS or using this BIOS is not practical), then the first SCSI primary partition will appear after all the IDE partitions on MS-DOS. Therefore MS-DOS and, for example, OS/2 could have different drive letters, as OS/2 loads the SCSI driver earlier. A solution was not to use primary partitions on such hard disks.

In Windows NT, Windows 2000, Windows XP and OS/2, the operating system uses the aforementioned algorithm to automatically assign letters to floppy disk drives, CD-ROM drives, DVD drives, the boot disk, and other recognised volumes that are not otherwise created by an administrator within the operating system. Volumes that are created within the operating system are manually specified, and some of the automatic drive letters can be changed. Unrecognised volumes are not assigned letters, and are usually left untouched by the operating system.

A common problem with drive letter assignment is that the letter assigned to a network drive can interfere with the letter of a local volume (like a newly installed CD/DVD drive or a USB stick). For example, if the last local drive has the letter D: and the letter E: has been assigned to a network drive, then a newly connected USB mass storage device will also be assigned the letter E:, causing loss of connectivity with either the network share or the USB device. To overcome this problem, drive letters have to be assigned manually, or third-party software such as USB Drive Letter Manager has to be installed.

An alternate condition that can cause problems on Windows XP is when there are network drives defined but in an error condition (as they would be on a laptop operating outside the network). Even when the unconnected network drive is not the next available drive letter, Windows XP may be unable to map a drive and this error may also prevent the mounting of the USB device.

## Common assignments

Applying the algorithms discussed above on a fairly modern Windows based system typically results in the following drive letter assignments:

• A: Floppy drive (3.5-inch is the modern standard).
• B: Unused, reserved for floppy drive; historically also for a second floppy drive, usually 5.25-inch. Also used for RAM-drives, in case of live CDs.
• C: Hard drive.
• D: to Z: Other disk partitions are labeled here. (Win98 update really likes to put any CD-ROM drive as D:\ even putting it above a Primary Partition IDE device.)
• D: to Z: CD, DVD and shared drives begin lettering after the last used hard drive partition designation.
• F: First Network Drive if using Novell NetWare
• Z: First Network Drive if using Banyan VINES

The C: drive usually contains all of the operating system files required for operation of the computer. On many modern personal computers only one hard drive is included in the design so it is designated C:. On such a computer, all of a user's personal files are often stored in directories on this drive as well. Keep in mind, however, that these drives can be different.

When there was not a second physical floppy drive, the B: drive was used as a virtual floppy drive marker for the A: drive, whereby the user would be prompted to switch floppies every time a read or write was required to whichever was not most recently used of A: or B:. This allowed for much of the functionality of two floppy drives on a computer that had only one (albeit usually resulting in lots of swapping).

Network drives are often assigned letters towards the end of the alphabet. This is often done to differentiate them from local drives. Local drives typically use letters towards the beginning of the alphabet, so using letters towards the end reduces the risk of an assignment conflict. This is especially true when the assignment is done automatically across a network (usually by a logon script).

It is not possible to have more than 26 drives. If access to more filesystems than this is required, volume mount points must be used.[1]

## Other implementations

### The Amiga

The Commodore Amiga used a modified system whereby each drive was identified by a drive type and a drive number, starting from 0 (zero). For example, the first floppy drive on a system would be referred to as DF0:, where DF presumably stands for disk floppy (this, like the name of the computer, was probably inspired by Spanish: Amiga is the Spanish word for female friend, and the Spanish word for e.g. hard disk is disco duro, literally "disk hard"). Hard disks would be numbered starting at DH0:. Printers and CD-ROMs, however, broke this scheme, being numbered LPT1: (Line PrinTer 1) etc., and CD0: and up, respectively.

## References

1. ^ Microsoft TechNet Retrieved on 1 December 2006

### Large Disk HOWTO

Advanced Technology Attachment (ATA) is a standard interface for connecting storage devices such as hard disks and CD-ROM drives inside personal computers.

The standard is maintained by X3/INCITS committee T13. Many synonyms and near-synonyms for ATA exist, including abbreviations such as IDE and ATAPI. Also, with the market introduction of Serial ATA in 2003, the original ATA was retroactively renamed Parallel ATA (PATA).

In line with the original naming, this article covers only Parallel ATA.

Parallel ATA standards allow cable lengths up to only 18 inches (46 centimetres) although cables up to 36 inches (91 cm) can be readily purchased. Because of this length limit, the technology normally appears as an internal computer storage interface. It provides the most common and the least expensive interface for this application.

## History

ATA connection sockets on a PC motherboard located below RAM sockets

The name of the standard was originally conceived as PC/AT Attachment as its primary feature was a direct connection to the 16-bit ISA bus then known as 'AT bus'; the name was shortened to an inconclusive "AT Attachment" to avoid possible trademark issues.

An early version of the specification, conceived by Western Digital in late 1980s, was commonly known as Integrated Drive Electronics (IDE) due to the drive controller being contained on the drive itself as opposed to the then-common configuration of a separate controller connected to the computer's motherboard — thus making the interface on the motherboard a host adapter, though many people continue, by habit, to call it a controller.

Enhanced IDE (EIDE) — an extension to the original ATA standard again developed by Western Digital — allowed the support of drives having a storage capacity larger than 528 megabytes (504 mebibytes), up to 8.4 gigabytes. Although these new names originated in branding convention and not as an official standard, the terms IDE and EIDE often appear as if interchangeable with ATA. This may be attributed to the two technologies being introduced with the same consumable devices — these "new" ATA hard drives.

With the introduction of Serial ATA around 2003, conventional ATA was retroactively renamed to Parallel ATA (P-ATA), referring to the method in which data travels over wires in this interface.

The interface at first worked only with hard disks, but eventually an extended standard came to work with a variety of other devices — generally those using removable media. Principally, these devices include CD-ROM and DVD-ROM drives, tape drives, and large-capacity floppy drives such as the Zip drive and SuperDisk drive. The extension bears the name AT Attachment Packet Interface (ATAPI), which started as non-ANSI SFF-8020 standard developed by Western Digital and Oak Technologies, but then included in the full standard now known as ATA/ATAPI starting with version 4. Removable media devices other than CD and DVD drives are classified as ARMD (ATAPI Removable Media Device) and can appear as either a floppy or a hard drive to the operating system.

The move from programmed input/output (PIO) to direct memory access (DMA) provided another important transition in the history of ATA. As every computer word must be read by the CPU individually, PIO tends to be slow and use a lot of CPU resources. This is especially a problem on faster CPUs where accessing an address outside of the cacheable main memory (whether in the I/O map or the memory map) is a relatively expensive process. This meant that systems based around ATA devices generally performed disk-related activities much more slowly than computers using SCSI or other interfaces. However, DMA (and later Ultra DMA, or UDMA) greatly reduced the amount of processing time the CPU had to use in order to read and write the disks. This is possible because DMA and UDMA allow the disk controller to write data to memory directly, thus bypassing the CPU.

The original ATA specification used a 28-bit addressing mode. This allowed for the addressing of 2^28 (268,435,456) sectors (with blocks of 512 bytes each), resulting in a maximum capacity of 137 gigabytes (128 GiB). The standard PC BIOS system supported up to 7.88 GiB (8.46 GB), with a maximum of 1024 cylinders, 256 heads and 63 sectors. When the lowest common denominators of the CHS limitations in the standard PC BIOS system and the IDE standard were combined, the system as a whole was left limited to a mere 504 mebibytes (528 megabytes). BIOS translation and LBA were introduced, removing the need for the CHS structure on the drive itself to match that used by the BIOS and consequently allowing up to 7.88 GiB when accessed through the Int 13h interface. This barrier was overcome with Int 13h extensions, which used a 64-bit linear address and therefore allowed access to the full 128 GiB and more (although some BIOSes initially had problems handling more than 31.5 GiB due to a bug in implementation).

ATA-6 introduced 48 bit addressing, increasing the limit to 128 PiB (or 144 petabytes). Some OS environments, including Windows 2000 until Service Pack 3, did not enable 48-bit LBA by default, so the user was required to take extra steps to get full capacity on a 160 GB drive.
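
The limits quoted above all follow from multiplying out the addressable sectors. The short Python sketch below recomputes them; the BIOS geometry (1024 x 256 x 63) comes from the text, while the IDE-side CHS maxima (65536 cylinders, 16 heads, 255 sectors) are the commonly cited values and are an assumption here, since the text does not list them. The names are made up for illustration.

```python
SECTOR_BYTES = 512

BIOS_CHS_MAX = (1024, 256, 63)   # cylinders, heads, sectors (from the text)
IDE_CHS_MAX = (65536, 16, 255)   # assumption: commonly cited ATA CHS maxima


def lba_limit(bits: int) -> int:
    """Bytes addressable with an LBA of the given width."""
    return 2 ** bits * SECTOR_BYTES


def chs_limit(geometry: tuple) -> int:
    """Bytes addressable with the given maximum C, H, S counts."""
    cylinders, heads, sectors = geometry
    return cylinders * heads * sectors * SECTOR_BYTES


def combined(a: tuple, b: tuple) -> tuple:
    """Lowest common denominator of two CHS limits, field by field."""
    return tuple(min(x, y) for x, y in zip(a, b))


if __name__ == "__main__":
    print(lba_limit(28))                                  # 137438953472 (128 GiB)
    print(chs_limit(BIOS_CHS_MAX))                        # 8455716864 (7.875 GiB)
    print(chs_limit(combined(BIOS_CHS_MAX, IDE_CHS_MAX)))  # 528482304 (504 MiB)
    print(lba_limit(48))                                  # 144115188075855872 (128 PiB)
```

The combined geometry works out to 1024 x 16 x 63, which is where the 504 MiB barrier comes from.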

All these size limitations come about because some part of the system is unable to deal with block addresses above some limit. This problem may manifest itself by the system recognizing no more of a drive than that limiting value, or by the system refusing to boot and hanging on the BIOS screen at the point when drives are initialized. In some cases, a BIOS upgrade for the motherboard will resolve the problem. This problem is also found in older external FireWire disk enclosures, which limit the usable size of a disk to 128 GB. By early 2005 most enclosures available have practically no limit. (Earlier versions of the popular Oxford 911 FireWire chipset had this problem. Later Oxford 911 versions and all Oxford 922 chips resolve the problem.)

## Parallel ATA interface

Until the introduction of Serial ATA, 40-pin connectors generally attached drives to a ribbon cable. Each cable has two or three connectors, one of which plugs into an adapter that interfaces with the rest of the computer system. The remaining one or two connectors plug into drives. Parallel ATA cables transfer data 16 bits at a time (it is a common misconception that they transfer 32 bits of data at a time, mainly because the 40 cable ribbon would appear to allow this).

Parallel ATA Pins

| Pin | Function | Pin | Function |
|---|---|---|---|
| 1 | Reset | 2 | Ground |
| 3 | Data 7 | 4 | Data 8 |
| 5 | Data 6 | 6 | Data 9 |
| 7 | Data 5 | 8 | Data 10 |
| 9 | Data 4 | 10 | Data 11 |
| 11 | Data 3 | 12 | Data 12 |
| 13 | Data 2 | 14 | Data 13 |
| 15 | Data 1 | 16 | Data 14 |
| 17 | Data 0 | 18 | Data 15 |
| 19 | Ground | 20 | Key (alternative usage is VCC_in) |
| 21 | DDRQ | 22 | Ground |
| 23 | I/O Write | 24 | Ground |
| 27 | IOC HRDY | 28 | Cable Select (see below) |
| 29 | DDACK | 30 | Ground |
| 31 | IRQ | 32 | No Connect |
| 33 | Addr 1 | 34 | GPIO_DMA66_Detect (see below) |
| 37 | Chip Select 1P | 38 | Chip Select 3P |
| 39 | Activity | 40 | Ground |

ATA's ribbon cables had 40 wires for most of its history, but an 80-wire version appeared with the introduction of the Ultra DMA/66 (UDMA4) mode. All of the additional wires in the new cable are ground wires, interleaved with the previously defined wires. The interleaved ground wire reduces the effects of capacitive coupling between neighboring signal wires, thereby reducing crosstalk. Capacitive coupling is more of a problem at higher transfer rates, and this change was necessary to enable the 66 megabytes per second (MB/s) transfer rate of UDMA4 to work reliably. The faster UDMA5 and UDMA6 modes also require 80-conductor cables.

Though the number of wires doubled, the number of connector pins and the pinout remain the same as on 40-conductor cables, and the external appearance of the connectors is identical. Internally, of course, the connectors are different: The connectors for the 80-wire cable connect a larger number of ground wires to a smaller number of ground pins, while the connectors for the 40-wire cable connect ground wires to ground pins one-for-one. 80-wire cables usually come with three differently colored connectors (blue, gray & black) as opposed to uniformly colored 40-wire cable's connectors (all black). The gray connector has pin 28 CSEL not connected; this makes it the slave position for drives configured cable select.

### Using non-standard cables

The ATA standard has always specified a maximum cable length of just 46 cm (18 inches) and flat cables with particular impedance and capacitance characteristics. For various reasons, it may be desirable to use alternative cables, e.g. longer cables when connecting drives within a large computer case or when mounting several physical drives into one computer, or rounded cables to improve airflow (cooling) inside the computer case. Such cables are widely available on the market and are used successfully in most cases. However, the user must understand that they are outside the parameters set by the specifications and should be used with caution.

The short standard cable length all but completely eliminates the possibility of using parallel ATA for external devices.

### Pin 20

In the ATA standard, pin 20 is defined as the key and is not used. However, some flash disks can use pin 20 as VCC_in to power the disk without the need for a special power cable[1].

### Pin 28

Pin 28 of the gray connector of an 80 conductor cable is not attached to any conductor of the cable. It is attached normally on the blue and black connectors.

### Pin 34

Pin 34 is connected to ground inside the blue connector of an 80 conductor cable but not attached to any conductor of the cable. It is attached normally on the gray and black connectors. See page 315 of [2].

### Differences between connectors on 80 conductor cables

The image shows PATA connectors after removal of strain relief, cover, and cable. Pin one is at bottom left of the connectors, pin 2 is top left, etc., except that the lower image of the blue connector shows the view from the opposite side, and pin one is at top right.

Each contact comprises a pair of points which together pierce the insulation of the ribbon cable with such precision that they make an excellent connection to the desired conductor without harming the insulation on the neighboring wires. The center row of contacts are all connected to the common ground bus and attach to the odd numbered conductors of the cable. The top row of contacts are the even-numbered sockets of the connector (mating with the even-numbered pins of the receptacle) and attach to every other even-numbered conductor of the cable. The bottom row of contacts are the odd-numbered sockets of the connector (mating with the odd-numbered pins of the receptacle) and attach to the remaining even-numbered conductors of the cable. (An alternate version of the connectors is allowed in which the even-numbered conductors are grounded and the odd-numbered conductors carry the signals. Obviously all three connectors on a cable must agree on this.)

Note the connections to the common ground bus from sockets 2 (top left), 19 (center bottom row), 22, 24, 26, 30, and 40 on all connectors. Also note (enlarged detail, bottom, looking from the opposite side of the connector) that socket 34 of the blue connector does not contact any conductor but unlike socket 34 of the other two connectors, it does connect to the common ground bus. On the gray connector, note that socket 28 is completely missing, so that pin 28 of the drive attached to the gray connector will be open. On the black connector, sockets 28 and 34 are completely normal, so that pins 28 and 34 of the drive attached to the black connector will be connected to the cable. Pin 28 of the black drive reaches pin 28 of the host receptacle but not pin 28 of the gray drive, while pin 34 of the black drive reaches pin 34 of the gray drive but not pin 34 of the host. Instead, pin 34 of the host is grounded.

The standard dictates color-coded connectors for easy identification by both installer and cable maker. All three connectors are different from one another. The blue (host) connector has the socket for pin 34 connected to ground inside the connector but not attached to any conductor of the cable. Since the old 40-conductor cables do not ground pin 34, the presence of a ground connection indicates that an 80-conductor cable is installed. The wire for pin 34 is attached normally on the other two connectors and is not grounded. Installing the cable backwards (with the black connector on the system board, the blue connector on the remote device and the gray connector on the center device) will ground pin 34 of the remote device and connect host pin 34 through to pin 34 of the center device. The gray center connector omits the connection to pin 28 but connects pin 34 normally, while the black end connector connects both pins 28 and 34 normally.

(Although the standard itself must be purchased or consulted at a library, there is a draft copy available. See page 315 of the draft at [3], also archived at [4].)
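The pin 34 wiring described above can be summarised in a few lines of Python. This is only a sketch of the rule as stated in the text: the dictionary and function names are hypothetical, and real host adapters sample the pin electrically rather than looking anything up.

```python
# Hypothetical model: how socket 34 is wired inside each connector
# of an 80-conductor cable, as described in the text above.
CONNECTOR_PIN34 = {
    "blue": "grounded",   # host end: tied to ground, no cable conductor
    "gray": "wired",      # middle device: attached to the conductor
    "black": "wired",     # end device: attached to the conductor
}


def host_sees_80_conductor(host_connector: str, cable_conductors: int) -> bool:
    """Return True if the host finds its pin 34 grounded by the cable."""
    if cable_conductors == 40:
        # Old 40-conductor cables never ground pin 34.
        return False
    return CONNECTOR_PIN34[host_connector] == "grounded"


print(host_sees_80_conductor("blue", 80))   # cable installed correctly
print(host_sees_80_conductor("black", 80))  # cable installed backwards
print(host_sees_80_conductor("blue", 40))   # old 40-conductor cable
```

Only the first call reports a grounded pin 34. With the cable reversed, the host's pin 34 simply runs through to the center device, so the host cannot recognise the cable as an 80-conductor one.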

### Multiple devices on a cable

If two devices attach to a single cable, one is commonly referred to as a master and the other as a slave. The master drive generally appears first when the computer's BIOS and/or operating system enumerates available drives. On old BIOSes (486 era and older) the drives are often misleadingly referred to by the BIOS as "C" for the master and "D" for the slave.

If there is a single device on a cable, in most cases it should be configured as master. However, some hard drives (Western Digital drives in particular) have a special setting called single for this configuration. Also, depending on the hardware and software available, a single drive on a cable can work reliably even though configured as the slave drive (this configuration is most often seen when a CD-ROM drive has a channel to itself).

### Cable select

A drive setting called cable select was described as optional in ATA-1 and has come into fairly widespread use with ATA-5 and later. A drive set to "cable select" automatically configures itself as master or slave, according to its position on the cable. Cable select is controlled by pin 28. The host adapter grounds this pin; if a device sees that the pin is grounded, it becomes the master device; if it sees that pin 28 is open, the device becomes the slave device.

This setting is usually chosen by placing a jumper on the "cable select" position, usually marked CS, rather than on the "master" or "slave" position.

With the 40-wire cable it was very common to implement cable select by simply cutting the pin 28 wire between the two device connectors. This puts the slave device at the end of the cable, and the master on the "middle" connector. This arrangement eventually was standardized in later versions of the specification. If there is just one device on the cable, this results in an unused "stub" of cable. This is undesirable, both for physical convenience and electrical reasons: The stub causes signal reflections, particularly at higher transfer rates.

When the 80-wire cable was defined for use with ATA/ATAPI-5 (UDMA4), the arrangement changed: the master device goes at the end of the 18-inch cable (black connector), the slave device goes on the middle connector (gray), and the blue connector goes onto the motherboard. So, if there is only one (master) device on the cable, there is no cable "stub" to cause reflections. Also, cable select is now implemented in the slave device connector, usually simply by omitting the pin 28 contact from the connector body. Both the 40-wire and 80-wire parallel-IDE cables share the same 40-socket connector configuration.
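To make the two cable layouts concrete, here is a minimal Python sketch of the cable select rule (pin 28 grounded means master, pin 28 open means slave) applied to each connector position. The table of cables and all names are illustrative assumptions, not part of any ATA document.

```python
def cable_select_role(pin28_grounded: bool) -> str:
    """A drive jumpered to CS picks its role from the state of pin 28."""
    return "master" if pin28_grounded else "slave"


# Hypothetical table: does the host's grounded pin 28 reach this connector?
CABLES = {
    # 40-wire cable with the pin 28 wire cut between the device connectors
    "40-wire, pin 28 cut": {"middle": True, "end": False},
    # 80-wire cable: the gray middle connector has no pin 28 contact
    "80-wire": {"gray (middle)": False, "black (end)": True},
}


def roles(cable: str) -> dict:
    """Map each device connector on the cable to the role a CS drive takes."""
    return {pos: cable_select_role(grounded)
            for pos, grounded in CABLES[cable].items()}


for name in CABLES:
    print(name, roles(name))
```

On the modified 40-wire cable the master ends up in the middle, while on the 80-wire cable the master sits at the end, which is why a lone master drive leaves no unused stub.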

### Master and slave clarification

Although they are in extremely common use, the terms master and slave do not actually appear in current versions of the ATA specifications. The two devices are correctly referred to as device 0 (master) and device 1 (slave), respectively. It is a common myth that "the master drive arbitrates access to devices on the channel" or that "the controller on the master drive also controls the slave drive." In fact, the drivers in the host operating system perform the necessary arbitration and serialization (as described in the next section), and each drive's controller operates independently. There is therefore no suggestion in the ATA protocols that one device has to ask the other if it can use the channel. Both are really "slaves" to the driver in the host OS.

### Serialized, overlapped, and queued operations

The parallel ATA protocols up through ATA-3 require that once a command has been given to one device on an ATA interface, that command must complete before any subsequent command may be given to either device on the same interface. In other words, operations on the devices must be serialized—with only one operation in progress at a time—with respect to the ATA host interface.

For example, suppose a read operation is in progress on one drive on a given interface (cable). It is not possible to initiate another command on another drive on the same interface, or inform the first drive of additional commands that it should perform later (capabilities referred to as "overlapped" and "queued" operations, respectively), until the first drive's read operation completes. This is true even though the total I/O time is dominated by seek time and rotational latency, and during these phases, the first drive is transferring no data.

A useful mental model is that the host ATA interface is busy with the first request for its entire duration, and therefore can't be told about another request until the first one is complete.

The function of serializing requests to the interface is usually performed by a device driver in the host operating system.
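One way to picture this serialization is a single lock per ATA interface that the driver holds for the whole duration of a command. The Python sketch below is only a toy model under that assumption: `AtaChannel` and `run_demo` are hypothetical names, and no real driver is claimed to be written this way.

```python
import threading


class AtaChannel:
    """Toy model of one ATA interface (cable) shared by two devices."""

    def __init__(self):
        self._busy = threading.Lock()  # one command in flight per interface
        self.log = []

    def issue(self, device: int, command: str) -> None:
        # The lock is held from command issue until completion, so a
        # command for the other device must wait even while this one
        # is only seeking and moving no data.
        with self._busy:
            self.log.append(("start", device, command))
            # ... seek, rotational latency, data transfer would go here ...
            self.log.append(("end", device, command))


def run_demo():
    channel = AtaChannel()
    threads = [threading.Thread(target=channel.issue, args=(dev, "READ"))
               for dev in (0, 1, 0, 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return channel.log


for entry in run_demo():
    print(entry)
```

Whatever order the threads run in, every "start" entry is immediately followed by its own "end" entry, which is the serialized behaviour described above.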

The ATA-4 and subsequent versions of the specification have included both an "overlapped feature set" and a "queued feature set" as optional features. However, support for these is extremely rare in actual parallel ATA products and device drivers.

By contrast, overlapped and queued operations have been common in other storage buses for some time. In particular, tagged command queuing is characteristic of SCSI, and this has long been seen as a major advantage of SCSI over parallel ATA. The Serial ATA standard has supported what it calls native command queueing since its first released version, but the feature is present in only a few (generally the highest-priced) Serial ATA drives.

### Two Devices on one cable - speed impact

There are many debates about how much a slow device can impact the performance of a faster device on the same cable. There is an effect, but the debate is confused by the blurring of two quite different causes, called here "Slowest Speed" and "One Operation at a Time".

#### "Slowest Speed"

It is a common misconception that, if two devices of different speed capabilities are on the same cable, both devices' data transfers will be constrained to the speed of the slower device.

For all modern ATA host adapters (since, at least, the late Pentium III and AMD K7 era) this is not true, as modern ATA host adapters support independent device timing. This allows each device on the cable to transfer data at its own best speed.

Even with older adapters without independent timing, this effect only impacts the data transfer phase of a read or write operation. This is usually the shortest part of a complete read or write operation (except for burst mode transfers).

#### "One Operation at a Time"

This is a much more important effect. It is caused by the omission of both overlapped and queued feature sets from most parallel ATA products. This means that only one device on a cable can perform a read or write operation at one time. Therefore, a fast device on the same cable as a slow device under heavy use will find that nearly every time it is asked to perform a transfer, it has to wait for the slow device to finish its own ponderous transfer.

For example, consider an optical device such as a DVD-ROM, and a hard drive on the same parallel ATA cable. With average seek and rotation speeds for such devices, a read operation to the DVD-ROM will take an average of around 100 milliseconds, while a typical fast parallel ATA hard drive can complete a read or write in less than 10 milliseconds. This means that the hard drive, if unencumbered, could perform more than 100 operations per second (and far more than that if only short head movements are involved). But since the devices are on the same cable, once a "read" command is given to the DVD-ROM, the hard drive will be inaccessible (and idle) for as long as it takes the DVD-ROM to complete its read—seek time included. Frequent accesses to the DVD-ROM will therefore vastly reduce the maximum throughput available from the hard drive. If the DVD-ROM is kept busy with average-duration requests, and if the host operating system driver sends commands to the two drives in a strict "round robin" fashion, then the hard drive will be limited to about 10 operations per second while the DVD-ROM is in use, even though the burst data transfers to and from the hard drive still happen at the hard drive's usual speed.
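The arithmetic behind these figures can be checked with a short Python sketch. It assumes the round numbers used in the example (10 ms per hard drive operation, 100 ms per DVD-ROM read) and a strict alternation between the two drives; the function names are illustrative only.

```python
def ops_per_second(service_ms: float) -> float:
    """Operations per second for a drive that has the cable to itself."""
    return 1000.0 / service_ms


def round_robin_hdd_ops(hdd_ms: float, dvd_ms: float) -> float:
    """Hard drive operations per second when every hard drive command
    must wait for one DVD-ROM command to finish first."""
    cycle_ms = hdd_ms + dvd_ms  # one operation on each drive per cycle
    return 1000.0 / cycle_ms


print(ops_per_second(10))            # hard drive alone
print(round_robin_hdd_ops(10, 100))  # hard drive sharing with a busy DVD-ROM
```

With these inputs the hard drive drops from 100 operations per second to roughly 9, which matches the "about 10" quoted above.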

The impact of this on a system's performance depends on the application. For example, when copying data from an optical drive to a hard drive (such as during software installation), this effect probably doesn't matter: Such jobs are necessarily limited by the speed of the optical drive no matter where it is. But if the hard drive in question is also expected to provide good throughput for other tasks at the same time, it probably should not be on the same cable as the optical drive.

Remember that this effect occurs only if the slow drive is actually being accessed. The mere presence of an idle drive will not affect the performance of the other device on the cable (for a modern host adapter which supports independent timing).

## ATA standards versions, transfer rates, and features

The following table shows the names of the versions of the ATA standards and the transfer modes and rates supported by each. Note that the transfer rate for each mode (for example, 66.7 MB/s for UDMA4, commonly called "Ultra-DMA 66") gives its maximum theoretical transfer rate on the cable. This is simply two bytes multiplied by the effective clock rate, and presumes that every clock cycle is used to transfer end-user data. In practice, of course, protocol overhead reduces this value.
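The "two bytes multiplied by the effective clock rate" rule can be sketched in Python. The cycle times below are an assumption on my part: they are the per-transfer times that reproduce the Ultra DMA figures quoted in the table, not values taken from this article.

```python
BYTES_PER_TRANSFER = 2  # the ATA data bus is 16 bits wide

# Assumed time per 16-bit transfer, in nanoseconds, for each UDMA mode.
UDMA_CYCLE_NS = {0: 120, 1: 80, 2: 60, 3: 45, 4: 30, 5: 20, 6: 15}


def udma_rate_mb_s(mode: int) -> float:
    """Theoretical peak rate in MB/s: bytes per transfer / cycle time."""
    # bytes per nanosecond times 1000 gives millions of bytes per second
    return BYTES_PER_TRANSFER * 1000.0 / UDMA_CYCLE_NS[mode]


for mode in sorted(UDMA_CYCLE_NS):
    print(f"UDMA{mode}: {udma_rate_mb_s(mode):.1f} MB/s")
```

For UDMA4 this gives 2 bytes every 30 ns, that is an effective clock of about 33.3 MHz and a peak of about 66.7 MB/s.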

Congestion on the host bus to which the ATA adapter is attached may also limit the maximum burst transfer rate. For example, the maximum data transfer rate for conventional PCI bus is 133 MB/s, and this is shared among all active devices on the bus.

In addition, no ATA hard drive has a measured sustained transfer rate above 80 MB/s. Furthermore, sustained transfer rate tests do not give realistic throughput expectations for most workloads: they use I/O loads specifically designed to encounter almost no delays from seek time or rotational latency. Hard drive performance under most workloads is limited first and second by those two factors; the transfer rate on the bus is a distant third in importance. Therefore, transfer speed limits above 66 MB/s really affect performance only when the hard drive can satisfy all I/O requests by reading from its internal cache, a very unusual situation, especially considering that such data is usually already buffered by the operating system.

| Standard | Other Names | Transfer Modes Added (MB/s) | Maximum disk size | Other New Features | ANSI Reference |
|---|---|---|---|---|---|
| ATA-1 | ATA, IDE | PIO 0, 1, 2 (3.3, 5.2, 8.3); Single-word DMA 0, 1, 2 (2.1, 4.2, 8.3); Multi-word DMA 0 (4.2) | 137 GB | | X3.221-1994 (obsolete since 1999) |
| ATA-2 | EIDE, Fast ATA, Fast IDE, Ultra ATA | PIO 3, 4 (11.1, 16.6); Multi-word DMA 1, 2 (13.3, 16.6) | | 28-bit logical block addressing (LBA) | X3.279-1996 (obsolete since 2001) |
| ATA-3 | EIDE | | | S.M.A.R.T., Security | X3.298-1997 (obsolete since 2002) |
| ATA/ATAPI-4 | ATA-4, Ultra ATA/33 | Ultra DMA 0, 1, 2 (16.7, 25.0, 33.3), aka UDMA/33 | | AT Attachment Packet Interface (ATAPI), i.e. support for CD-ROM, tape drives etc.; optional overlapped and queued command set features; Host Protected Area (HPA) | NCITS 317-1998 |
| ATA/ATAPI-5 | ATA-5, Ultra ATA/66 | Ultra DMA 3, 4 (44.4, 66.7), aka UDMA/66 | | 80-wire cables | NCITS 340-2000 |
| ATA/ATAPI-6 | ATA-6, Ultra ATA/100 | UDMA 5 (100), aka UDMA/100 | 144 PB | 48-bit LBA; Device Configuration Overlay (DCO); Automatic Acoustic Management | NCITS 361-2002 |
| ATA/ATAPI-7 | ATA-7, Ultra ATA/133 | UDMA 6 (133), aka UDMA/133; SATA/150 | | SATA 1.0; Streaming feature set; long logical/physical sector feature set for non-packet devices | NCITS 397-2005 (vol 1) |
| ATA/ATAPI-8 | ATA-8 | | | Hybrid drive featuring non-volatile cache to speed up critical OS files | In progress |
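The two maximum disk sizes in the table follow directly from the width of the logical block address. The Python sketch below assumes 512-byte sectors, which the table does not state explicitly; the function name is illustrative.

```python
SECTOR_BYTES = 512  # assumed sector size


def lba_capacity_bytes(address_bits: int,
                       sector_bytes: int = SECTOR_BYTES) -> int:
    """Largest disk addressable with an LBA of the given width."""
    return (2 ** address_bits) * sector_bytes


for bits in (28, 48):
    size = lba_capacity_bytes(bits)
    print(f"{bits}-bit LBA: {size} bytes")
```

Under that assumption, 28-bit LBA reaches about 137.4 GB (decimal gigabytes) and 48-bit LBA about 144.1 PB, in line with the 137 GB and 144 PB entries.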

In August 2004, Sam Hopkins and Brantley Coile of Coraid specified a lightweight ATA-over-Ethernet protocol that carries ATA commands over Ethernet instead of requiring the drives to be connected directly to a PATA host adapter. This permitted the established block protocol to be reused in network-attached storage applications.