This is the printer-friendly version of 'reference information' section.




Files & Filesystems

Executable File Types


Executable file types are as numerous and varied as image and sound file formats. Every operating system seems to have several executable file types unique to itself. This part of the FAQ gives a brief rundown of the various types you will come across.

A quick intro to a few terms:

  • TEXT is the actual executable code area,
  • DATA is "initialised" data,
  • BSS is "un-initialised" data.

The BSS (historically "Block Started by Symbol") need not be present in the executable file. At load time, the loader still allocates memory for it and fills that memory with zeroes (C programs, for instance, assume this).

If you're looking for comprehensive information, consider using the Programmer's File Format Collection and the Linkers and Loaders online book. You can also check Pierre's Library.

EXE (DOS "MZ")

DOS-MZ was introduced with MS-DOS (not DOS v1 though) as a companion to the simpler DOS COM file format. DOS-MZ was designed to run in real mode and reflects this, having a relocation table of SEGMENT:OFFSET pairs. It is a very simple format that can be run at any load address, and it does not distinguish between TEXT, DATA and BSS. Since it was designed to run in real mode, the maximum combined size of code + data + bss is 1 MB.

Operating Systems that use it: DOS, Win*, Linux DOS Emu, Amiga DOS Emu

EXE (Win 3.xx "NE")

The "NE" (New Executable) format was designed for Windows 3.x. Again a 16-bit format, it lifted the maximum size restrictions that DOS-MZ had.

Operating Systems that use it: Windows 3.xx

EXE (OS/2 "LE")

The "LE" (Linear Executable) format was designed by Microsoft for IBM's OS/2 operating system. It supports both 16-bit and 32-bit segments.

Operating Systems that use it: OS/2, Watcom Compiler/Extender (DOS)

EXE (Win 9x/NT "PE")


With Windows 95/NT, a new executable file type was required. Thus was born the "PE" Portable Executable, which is still in use. Unlike its predecessors, WIN-PE is a true 32-bit file format, supporting relocatable code. It does distinguish between TEXT, DATA, and BSS. It is, in fact, a bastardised version of the COFF format.

If you have set up a Cygwin environment on your Windows machine, "PE" is the target format for your Cygwin GCC toolchain. This causes the unaware some headaches when trying to link parts built under Cygwin with parts built under Linux or BSD (which use the ELF target by default). (Hint: You have to build a GCC Cross-Compiler...)

Operating Systems that use it: Windows 95/98/NT, the Mobius.


Additional resources

  • MSDN documentation about PE format

ELF


The ELF (Executable and Linkable Format) originated with Unix System V Release 4, on which Sun's Solaris is based. A very versatile file format, it was later picked up by many other operating systems for use as both executable files and shared library files. It does distinguish between TEXT, DATA and BSS.

Documentation on ELF can be obtained e.g. at http://www.linuxbase.org/spec/refspecs/elf/, ftp://tsx.mit.edu/pub/linux/packages/GCC/ELF.doc.tar.gz, or various other sources.

Today, ELF is considered the standard format on Unix-like systems. While it has some drawbacks (e.g., using up one of the scarce general purpose registers of the IA32 when using position-independent code), it is well supported and documented.

Operating Systems that use it: Solaris, IRIX and IRIX64, Linux, *BSD, many many others...

A.OUT

A.OUT is the "original" binary format for Unix machines. It is considered obsolete today because of several shortcomings. However, as it is extremely simple and supported by many compilers/assemblers, it may be a good choice if you're willing to develop your own format or have more information than 'raw binary' for your bootloader.

File Systems


Tell me about Filesystems

Filesystems are the machine's way of ordering your data on readable and/or writable media. They provide a logical way to access the stuff that you have down on disk so that you can read or modify it. Which filesystem you use depends upon what you want to do with it. For example, Windows uses the FAT32 or NTFS filesystem. If your disk is really huge, then there's no point using FAT32, because the FAT system was designed in the days when nobody had disks as big as we do now. At the same time, there's no point using an NTFS filesystem on a tiny disk, because it was designed to work with large volumes of data - the overhead would be pointless for, say, reading a 1.44 MB floppy disk.

There are many different kinds of filesystems around, from the well-known to the more obscure ones. The most unfortunate thing about filesystems is that every hobbyist OS programmer thinks that the filesystem they design is the ultimate technology, when in reality it's usually just a bad copy of DOS FAT with a change here and there. The world doesn't need another crap filesystem. Investigate all the possibilities before you decide to roll your own.

FAT And its Variants

File Allocation Table (FAT) was introduced with DOS v1.0 (and possibly CP/M), supposedly written by Bill Gates. FAT is a very simple filesystem, nothing more than a singly linked list of clusters. FAT filesystems use very little memory, and FAT is one of, if not the, most basic filesystems in existence today.

There are two versions of this simplified FAT, FAT12 and FAT16. FAT12 was designed for floppy disks and can manage a maximum size of 16 MB using 12-bit cluster numbers. FAT16 was designed for early hard disks and can handle a maximum size of 64K (65,536) clusters * cluster_size. The larger the hard disk, the larger the cluster size had to be, which led to large amounts of "slack space" on the disk.
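The size limits above are just the number of addressable clusters multiplied by the cluster size. A minimal sketch of that arithmetic follows; `max_volume_bytes` is a hypothetical helper, the 4 KiB and 32 KiB cluster sizes are illustrative assumptions, and the handful of reserved cluster values is ignored, so real limits are slightly lower.

```python
KIB = 1024
MIB = 1024 * KIB


def max_volume_bytes(cluster_bits: int, cluster_size: int) -> int:
    """Upper bound on volume size: addressable clusters times cluster size."""
    return (1 << cluster_bits) * cluster_size


# FAT12: 2**12 = 4096 clusters. With (assumed) 4 KiB clusters that is 16 MiB.
fat12_limit = max_volume_bytes(12, 4 * KIB)

# FAT16: 2**16 = 65536 clusters. With (assumed) 32 KiB clusters that is 2 GiB.
fat16_limit = max_volume_bytes(16, 32 * KIB)

print(fat12_limit // MIB, "MiB")
print(fat16_limit // MIB, "MiB")
```

The same function shows why big disks hurt: to cover a larger volume with a fixed number of cluster bits, only `cluster_size` can grow, and every file wastes on average about half a cluster.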

FAT12 and FAT16 filesystems have a fixed "8.3" size for filenames and limited support for file attributes. You could also read the FAT12 document, or check out the FAT tutorial reported by Kemp.

VFAT

VFAT is an extension of FAT16 and FAT12 that has the ability to use long filenames (up to 255 characters). First introduced by Windows 95, it uses a "kludge" whereby the pieces of a long filename are stored in sequential 8.3-style directory entries that are marked with a "volume label" attribute. (This is a bit of an oversimplification, but close enough.)

FAT32

FAT32 was introduced to us by Windows 95-B and Windows 98. FAT32 solved some of FAT's problems. No more limit of 64K clusters! FAT32, as its name suggests, can handle a maximum of 4 G clusters per partition. This enables very large hard disks to still maintain very small cluster sizes and thus reduce slack space between files.

Is FAT32 really able to handle 4 G clusters? Last time I looked at a FAT table, it seemed to have 4 bits cleared all the time, including in the 'end of chain' tag, giving me the feeling that it was somehow more a "FAT28" than a "FAT32". -- PypeClicker

Correct - FAT32 is actually only FAT28. Top 4 bits are currently "Undefined" (according to Microsoft's specification). Note that this does not mean 'top four bits should be "0"' - they should be honored. This means that if you need to alter a FAT entry, and the top 4 bits contain 1001, you should write it back to disk containing 1001 and not 0000. -- djhayman
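The read-modify-write rule djhayman describes can be sketched as a pair of bit masks. This is only an illustration: `update_fat32_entry` is a hypothetical helper name, and a real driver would of course also read and write the entry on disk.

```python
FAT32_CLUSTER_MASK = 0x0FFFFFFF   # low 28 bits: the actual cluster value
FAT32_RESERVED_MASK = 0xF0000000  # top 4 bits: "undefined", must be preserved


def update_fat32_entry(old_entry: int, new_cluster: int) -> int:
    """Return the value to write back: new cluster number, old top 4 bits."""
    return (old_entry & FAT32_RESERVED_MASK) | (new_cluster & FAT32_CLUSTER_MASK)


def fat32_cluster(entry: int) -> int:
    """Extract the 28-bit cluster value, ignoring the top 4 bits."""
    return entry & FAT32_CLUSTER_MASK


# An entry whose top 4 bits are 1001 keeps them after the update.
print(hex(update_fat32_entry(0x90000005, 0x00001234)))
```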

Inode-based File Systems

Inodes (information nodes) are a crucial design element in most Unix filesystems: Each file is made of data blocks (the sectors that contain your raw data bits), index blocks (containing pointers to data blocks so that you know which sector is the nth in the sequence), and one inode block.

The inode is the root of the index blocks, and can also be the sole index block if the file is small enough. Moreover, as unix filesystems support hard links (the same file may appear several times in the directory tree), inodes are a natural place to store metadata such as file size, owner, creation/access/modification times, locks, etc.

HPFS (High Performance Filesystem)

The HPFS was designed by IBM/Microsoft for IBM's new windowing system, OS/2. It was designed to be fast, remove all the shortcomings of FAT, support long filenames and small cluster sizes, avoid fragmentation as much as possible and support more attributes.

HPFS is the precursor to NTFS and is, in a nutshell, NTFS minus the security features embedded into NTFS. Instead of storing cluster chains in a singly linked list format, HPFS stores its information in sorted B-trees. This makes searching for files very fast.

Instead of keeping the directory tables and other descriptors at the start of the disk, HPFS bands them at regular intervals throughout the disk and in the middle of the disk, with the theory being that the heads only have to move half as much in any direction.

More information about HPFS can be found at: http://www.wotsit.org/download.asp?f=hpfs

NTFS (New Technology Filesystem)

NTFS is the native filesystem of Windows NT. It is much like HPFS, but supports security features in the filesystem such as access control. Since Windows NT is entirely Unicode, NTFS is a Unicode filesystem, each "character" being 16 bits wide. NTFS adds quite a bit more to HPFS than just security features, though. First, it adds quite a bit of built-in redundancy -- with HPFS, wiping out one sector in the wrong place can render an entire volume inaccessible. Second, it adds support for multiple hard links to a file (up 'til now, the only easy access has been via the POSIX subsystem, but NT 5/Win2K adds this to Win32 as well). Third, it supports an arbitrary number of file forks a la MacOS (except MacOS always has exactly 2 forks per file). Fourth, HPFS decrees that a cluster is always 512 bytes, and a cluster is always one sector. For the sake of performance and compatibility with some (especially Japanese) machines, NTFS allows sectors of other sizes. It also supports clusters of more than one sector, which tends to help performance a little.

NTFS is probably one of the most difficult file systems to deal with, especially because of the lack of hacking experience and reliable documents about it. A stable read-only driver has been in the Linux source code base since kernel 2.4, while an experimental read-write driver is coming with Linux 2.6.

The best information found about it so far is Andrew Tanenbaum's article. The Linux NTFS project also has some information about it, at http://linux-ntfs.sourceforge.net/ntfs/index.html. (Use the next / previous links at the top of the pages, or use the glossary.) You are welcome to add more.

ext2fs (Second Extended Filesystem)

The Second Extended Filesystem (ext2fs) was the default filesystem of Linux prior to the advent of the journaling file systems ext3fs and ReiserFS. It has native support for UNIX ownership / access rights, symbolic and hard links and other Unix-native properties. Like HPFS, it tries to minimize head movement by distributing data across the disk. Also, by using "groups", it minimizes the impact of fragmentation. It is another "inode" based system. An ext2fs partition is made up of blocks, which normally are 1K each. The first block (the bootblock) is zeroed; all the other blocks are divided into so-called block groups (normally, between 256 and 8192 blocks form a group). Each block group contains:

  • a copy of the superblock (which is a mighty useful structure containing info about the filesystem);
  • the group descriptors, which record where each block group keeps its bitmaps and inode table;
  • the block bitmap, which tells which blocks are used;
  • the inode bitmap, which tells which inodes are used;
  • the inode table, which contains the inodes themselves;
  • the data blocks referenced by the inodes.

The first inode is a special one; it is the bad blocks inode, which references all the damaged blocks of the partition. The second inode holds the root directory and the fifth is reserved for the bootloader, whereas the 11th is the first inode available for ordinary files.

Windows users can access ext2fs partitions with explore2fs.


ext3fs (Third Extended File System)

ext3fs is basically ext2fs with journaling added. If your ext3fs partition does not need journal replay, it can even be accessed with a 'simple' ext2fs driver.

ReiserFS

ReiserFS - homepage at http://www.namesys.com - is a file system that is free if your OS is free, and comes at a licensing cost if your OS is not free. It has excellent performance on large directories and small files, using "dancing trees" instead of B-trees, and does meta-data journaling to improve file-system stability across system crashes.

Unlike "classic" filesystems, Reiser allows you to have files that occupy less than one sector on the disk (i.e. it can store several tiny files or tails of files on the same sector) through its tree organization.

As of version 3.6, ReiserFS supports:

  • max number of files - 2^32 - 3 => 4 G - 3
  • max number of files a dir can have - 2^32 - 4 => 4 G - 4, but in practice this value is limited by the hash function. The r5 hash allows about 1 200 000 file names without collisions
  • max file size - 2^60 bytes => 1 E, but the page cache limits this to 16 T on architectures with 32 bit int
  • max number of links to a file - 2^32 => 4 G
  • max filesystem size - 2^32 (4K) blocks => 16 T

Network-based File Systems

All these file systems are a way to create a large, distributed storage system from a collection of "back end" systems. That means you cannot (for instance) format a disk in 'NFS'; instead, you mount a 'virtual' NFS partition that reflects what's on another machine. Note that a new generation of file systems is under heavy research, based on the latest P2P, cryptography and error correction techniques (such as the Ocean Store Project or Archival Intermemory).

NFS

NFS was invented by Sun Microsystems. It became widespread largely because it's quite easy to implement. In return for its simplicity, it tends to give relatively poor performance and a nearly complete lack of safety. These are both largely due to its connectionless nature. When you request data from a file, the server sends you the requested information, but does NOT keep track of which clients have which files open. To keep you from seeing (terribly) out-of-date information from a file, the data you read has an "expiration date". If you refer to the data for more than, say, a minute, it will expire and your client will request the data from the server again, whether it's changed or not. If you write data to the file, you have no way of knowing whether somebody else has updated the information between your reading and writing your data, so you may overwrite things they've written with older data. To ensure at least a little bit of safety, the server is supposed to actually commit data you write to disk before it returns to you.

In other words, NFS works pretty well for read-only access to things like executables on a server. For things like on-line databases, it's essentially a disaster waiting to happen (and it usually doesn't wait very long).

More recent versions of the NFS spec have cured most of these problems, but support for these updates is still (years later) somewhat uneven.

AFS

AFS is the Andrew File System aka Advanced File System, similar to NFS to about the same degree that a tricycle is similar to a fighter jet -- they're both typically one-person vehicles. AFS is a drastically more robust design than NFS, and is intended for MUCH larger networks. OTOH, it's also much more difficult to implement completely -- to the point that it's not likely to be of much interest to most hobbyists and such writing a new OS.

RFS

RFS (Remote File System) was introduced in UNIX System V to compete with NFS and such. Unlike NFS, RFS is a connection-oriented system, so if, for example, two different machines access a file on a server, they get about the same semantics as if two processes on a single machine accessed the file. Note that NFS and RFS are both built on top of some sort of local file system, which determines things like inodes and such.

Unclassifiable and Other Stuff

BeFS

BeFS is the new filesystem for the Be Operating system. It is very much like the MacOS Filesystem, supporting multiple forks in a 64bit filesystem. One very useful feature it shares with the AmigaOS FFS is the ability for an application to set a "notify callback", i.e. being notified when a file or directory changes.

Get way more information about BeFS in Practical File System Design

FFS (Amiga)

The Amiga Fast File System, to put it bluntly, is not - or rather, it's fast only when compared to the OFS, the Original File System of AmigaOS 1.x.

There are many bright design ideas making the AmigaOS a very special thing, but the file system was not exactly part of it. It is prone to invalidation, holds redundant data, and its directory structure is comparatively slow to traverse. It also lacks any concept of multi-user environments.

Perhaps the only good thing with the Amiga FFS was the concept of the Rigid Disk Block (RDB) - a special area at the beginning of a disk, holding not only the partitioning information. It was also possible to store a file system there - a module that would tell a different AmigaOS machine how to read a partition if it was not formatted in FFS format but something else.

Those interested in its internals should try to find a copy of "The Amiga Guru Book" by Ralph Babel, which holds a complete reference of its rather complex block structure. (It also has a complete reference of the DOS library, as well as interesting information on various internals of the Amiga architecture. It is long out of print, but perhaps you can still find copies on eBay.) The old FAQ also held some info on the internals, which is preserved in the AmigaFFS Document.

FFS / UFS (BSD)

Not to be confused with the Amiga FFS, the BSD FFS / UFS is commonly used on hard disks for the *BSD and derivatives. What is usually called a "partition" is called a "slice" in *BSD, which is in turn subdivided into "partitions" - a naming pattern that leads to some confusion, and to rather cryptic device names (ad0s1c for the third partition on the first slice of the primary master ATA hard drive...).

XFS

XFS is Silicon Graphics' "Next Generation Journalled 64-Bit Filesystem With Guaranteed Rate I/O", designed for IRIX based systems. XFS uses the standard inodes, bitmaps and blocks, and is compatible with EFS and NFS filesystems.

According to the XFS white paper it has:

  • Scalable features and performance from small to truly huge data (petabytes)
  • Huge numbers of files (millions)
  • Exceptional performance: 500+ MBytes/second
  • Designed with log/database (journal) technology as a fundamental part not just an extension to an existing filesystem
  • Mission-critical reliability

BFS

BFS (UnixWare Boot File System) is a SCO specification for a KISS filesystem used at bootstrap. It only offers one directory and, due to the way information about blocks is stored, only one file can be open for writing at a time.

http://www.penguin.cz/~mhi/fs/bfs/bfs-structure.html

From what i see, it also means BFS will have to do nasty things if a file must be extended after some other file has been created -- PypeClicker

Agreed, but it's not a general-purpose filesystem. One tends not to extend things like the kernel image or modules. -- Strib

FAT12 (for floppies)


Other FAT references might have information left out of this document. Make sure you read them too.

Note that the FAT filesystem is covered by software patents.


File Allocation Table (FAT 12)

This paper concentrates on the FAT12 system only. It is broken down into several sections. Following a brief introduction on File Allocation Tables, the paper goes into a step by step instruction on how to read an MS-DOS File Allocation Table for a diskette (FAT12). The sections are in order:

  • Introduction
  • FAT12 (Diskette)
  • Reading the Boot Sector
  • Reading the Directory
  • Finding the Beginning of the Boot, FAT, Directory, and Open Space
  • File Allocation Table Entry Cluster Values
  • Location of File in Open Space Area
  • A printed copy of the file that is used to dissect the FAT12 table
  • A printed copy of the Boot Sector and Directory
  • A printed copy of the File Allocation Table
  • A printed copy of the Beginning of the File in the Open Space Area
  • A printed copy of the Ending of the File in the Open Space Area

What the Introduction calls the data area is called the Open Space Area later in the instructional part of the paper. And finally, although this paper does go into quite a bit of detail, it is by no means complete.

Introduction

The File Allocation Table (FAT) is a table stored on every hard or floppy disk that indicates the status and location of all data clusters that are on the disk. The File Allocation Table can be considered to be the "table of contents" of a disk. If the file allocation table is damaged or lost, then a disk is unreadable. In a file server the FAT data is sometimes kept in the computer RAM for quick access and is easily lost if the system crashes as the result of a power failure.

The File Allocation Table is maintained by the operating system and provides a map of the clusters (the basic unit of logical storage on a disk) that a file has been stored in. When you write a new file, the file is stored in one or more clusters that are not necessarily next to each other; they may be rather widely scattered over the disk. A typical cluster size is 2,048 Bytes, 4,096 Bytes or 8,192 Bytes.* The operating system creates a FAT entry for the new file that records where each cluster is located and their sequential order. When you read a file, the operating system reassembles the file from clusters and places it as an entire file where you want to read it.

The hard disk is physically arranged by cylinders, heads, and sectors. That is how it is addressed by the hardware controller and the ROM BIOS, which address it at a physical level. For the operating system and other programs, however, this is cumbersome, since the physical number of cylinders, heads, and sectors varies from disk to disk. It would be convenient to view the disk as simply a large continuous block of sectors with simple sequential addresses.

MS-DOS does, in fact, view the sectors on a disk as a one-dimensional array of sectors numbered from 0 to n-1, where n is the total number of sectors on the disk. It therefore must translate from the logical sector numbers to physical cylinder-head-sector, or CHS, addresses. In doing so, MS-DOS sequentially numbers all the sectors of head 0, cylinder 0, then all the sectors of head 1, cylinder 0, and so on for each head, and then repeats this for each cylinder, to the end of the disk.
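The numbering scheme just described can be written as a pair of conversions. This is a sketch, not MS-DOS's actual code: the function names are made up for illustration, and the geometry in the example (2 heads, 18 sectors per track) is that of a 1.44 MB diskette. Note that physical sector numbers start at 1 while logical sector numbers start at 0.

```python
from typing import Tuple


def lsn_to_chs(lsn: int, heads: int, sectors_per_track: int) -> Tuple[int, int, int]:
    """Translate a logical sector number into (cylinder, head, sector)."""
    cylinder, rest = divmod(lsn, heads * sectors_per_track)
    head, sector_index = divmod(rest, sectors_per_track)
    return cylinder, head, sector_index + 1  # physical sectors count from 1


def chs_to_lsn(cylinder: int, head: int, sector: int,
               heads: int, sectors_per_track: int) -> int:
    """Inverse translation: (cylinder, head, sector) to logical sector number."""
    return (cylinder * heads + head) * sectors_per_track + (sector - 1)


# 1.44 MB diskette: 80 cylinders, 2 heads, 18 sectors per track.
print(lsn_to_chs(0, 2, 18))     # first sector of the disk
print(lsn_to_chs(18, 2, 18))    # first sector under head 1
print(lsn_to_chs(2879, 2, 18))  # last sector of the disk
```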

Furthermore, MS-DOS logically divides this array of sectors into five distinct areas, which are, in the order they appear on the disk,

  • The partition table,
  • The boot record,
  • The File Allocation Table (FAT),
  • The root directory, and
  • The data area.

    The first four areas of the disk, collectively called the system area, are used by MS-DOS to keep track of the contents of the disk. The largest area of the disk, the data area, is where all user files and data reside. MS-DOS uses a special numbering scheme for the data area, called cluster numbering, which is in addition to, but independent of, logical sector numbers.

    The boot record occupies one sector, and is always placed in logical sector number (LSN) 0, which is physically cylinder 0, head 0, sector 1, the first sector of the first head of the first cylinder on the disk. This is the easiest sector on the disk for the computer to locate when it begins running.

    The File Allocation Table (FAT) is an array of integers in which each element represents one cluster in the data area. For each cluster in the data area the corresponding entry in the FAT contains a code which indicates the status of the cluster. The cluster may be available for use, it may be reserved by the operating system, it may be unavailable due to a bad sector on the disk, or it may be in use by a file.

    MS-DOS maintains a hierarchical directory structure in which there is one entry for every file on the disk.

    Data area is where all user files and data reside.
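To make the "array of integers" concrete, here is a hedged sketch of reading FAT12 entries and following a file's cluster chain. The details below are not spelled out in the text above and are stated as assumptions: FAT12 packs two 12-bit entries into three bytes, and by convention 0x000 marks a free cluster, 0xFF7 a bad one, and 0xFF8 and above the end of a chain. The sample table bytes are invented for the example.

```python
from typing import List


def fat12_entry(fat: bytes, cluster: int) -> int:
    """Read the 12-bit FAT entry for `cluster` (two entries share 3 bytes)."""
    offset = cluster + cluster // 2           # 1.5 bytes per entry
    pair = fat[offset] | (fat[offset + 1] << 8)
    return pair >> 4 if cluster & 1 else pair & 0x0FFF


def cluster_chain(fat: bytes, first_cluster: int) -> List[int]:
    """Follow the linked list of clusters starting at `first_cluster`."""
    max_entries = len(fat) * 2 // 3
    chain = []
    cluster = first_cluster
    # Values 0x002..0xFEF are ordinary "next cluster" links.
    while 0x002 <= cluster < 0xFF0:
        chain.append(cluster)
        if len(chain) > max_entries:
            raise ValueError("loop in FAT chain")
        cluster = fat12_entry(fat, cluster)
    return chain


# Invented sample: entries 0 and 1 are reserved, then 2 -> 3 -> 4 -> end.
SAMPLE_FAT = bytes.fromhex("f0 ff ff 03 40 00 ff 0f 00")
print(cluster_chain(SAMPLE_FAT, 2))
```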

FAT 12 (Diskette)

Boot Sector

Bytes      Offset  +0 +1 +2 +3 +4 +5 +6 +7  Meaning
0 - 2      0000    eb 3c 90 .. .. .. .. ..  Jump to start of boot code
3 - 10     0000    .. .. .. 6d 6b 64 6f 73  OEM identifier (mkdosfs)
           0008    66 73 00 .. .. .. .. ..  (program/OS being used to format)
11 - 12    0008    .. .. .. 00 02 .. .. ..  The number of Bytes per sector (512)
13         0008    .. .. .. .. .. 01 .. ..  Number of sectors per allocation unit (cluster)
14 - 15    0008    .. .. .. .. .. .. 01 00  Number of reserved sectors
16         0010    02 .. .. .. .. .. .. ..  Number of FATs on the diskette
17 - 18    0010    .. e0 00 .. .. .. .. ..  Number of directory entries
19 - 20    0010    .. .. .. 40 0b .. .. ..  The total sectors in the logical volume
21         0010    .. .. .. .. .. f0 .. ..  Media descriptor type
22 - 23    0010    .. .. .. .. .. .. 09 00  Number of sectors per FAT
24 - 25    0018    12 00 .. .. .. .. .. ..  Number of sectors per track
26 - 27    0018    .. .. 02 00 .. .. .. ..  Number of heads or sides on the diskette
28 - 31    0018    .. .. .. .. 00 00 00 00  Number of hidden sectors
32 - 35    0020    00 00 00 00 .. .. .. ..  Large sector count (used when Bytes 19 - 20 are zero)
36         0020    .. .. .. .. 00 .. .. ..  Drive number
37         0020    .. .. .. .. .. 00 .. ..  Flags
38         0020    .. .. .. .. .. .. 29 ..  Signature (must be 0x28 or 0x29)
39 - 42    <disk-dependent>                 Volume ID 'serial' number (ignore this)
43 - 53    0028    .. .. .. 20 20 20 20 20  Volume label,
           0030    20 20 20 20 20 20 .. ..  padded with spaces
54 - 61    0030    .. .. .. .. .. ..  F  A  System identifier
           0038     T  1  2 20 20 20 .. ..  (padded with spaces)
62 - 509   0038    .. .. .. .. .. .. 0e 1f  Start of bootstrap routine
           0040    be 5b 7c ac 22 c0 74 0b  Bootstrap routine (cont'd)
510 - 511  01f8    .. .. .. .. .. .. 55 aa  BIOS boot signature (dw 0xAA55)

Reading the Boot Sector

Bytes (0-2)
The first three bytes, EB 3C and 90, disassemble to JMP SHORT 3E followed by NOP (the jump skips 3C bytes past the end of the two-byte JMP instruction). The reason for this is to jump over the disk format information. Since the first sector of the disk is loaded into RAM at location 0x0000:0x7C00 and executed, without this jump the processor would attempt to execute data that isn't code.
Bytes (3 - 10)
Bytes 3 - 10 hold the OEM identifier: the name of the DOS version or program that formatted the disk. In the dump above they read 6D 6B 64 6F 73 66 73 00, i.e. "mkdosfs" followed by a zero byte. The official FAT Specification from Microsoft says that this field is really meaningless and is ignored by MS FAT drivers; however, it does recommend the value "MSWIN4.1", as some third-party drivers supposedly check it and expect it to have that value. Older versions of DOS also report MSDOS5.1, and a Linux-formatted floppy will likely carry "mkdosfs" here. If the string is less than 8 bytes, it is padded with zeroes.
Bytes (11 - 12)
The next two Bytes (11 - 12), 00 and 02, is the number of Bytes per sector. The first thing you do when reading a pair of Bytes is reverse them to read 02 00. 0200 is the number of Bytes per sector in hexadecimal or 512 Bytes per sector in decimal.
Byte 13
Byte 13 is the number of sectors per allocation (cluster). In this case it is one.
Bytes (14 - 15)
These two Bytes, 01 and 00, indicate the number of reserved sectors. Again you must reverse the Bytes to 00 01. There is one reserved sector.
Byte 16
Byte 16, the first Byte of the row at offset 0010, indicates the number of FATs on the diskette. There are two.
Bytes (17 -18)
This indicates the number of directory entries. Reversing the Bytes E0 00 to 00 E0 and converting the number to decimal we have 224 directory entries.
Bytes (19 - 20)
The total sectors in the logical volume. Reversing 40 and 0B to 0B 40 and converting the number to decimal we have 2880 sectors in the logical volume. If that value is 0, it means there are more than 65535 sectors in the volume, and the actual count is stored in "Large Sectors" (Bytes 32 - 35).
Byte 21
This Byte (F0) indicates the media descriptor type, which is here a 1.44MB floppy.
Bytes (22 - 23)
These two Bytes, 09 and 00, indicate the number of sectors per FAT. Reversing 09 and 00 to 00 09, we see that we have nine sectors per FAT.
Bytes (24 - 25)
Number of sectors per track. Reversing the Bytes 12 and 00 to 00 12, there are eighteen sectors per track.
Bytes (26 - 27)
These two Bytes indicate the number of heads or sides on the diskette. Reversing the Bytes 02 and 00 to 00 02, we see that there are two sides to the diskette.
Bytes (28 - 31)
Number of hidden sectors. All four Bytes read zero, so there are no hidden sectors.

The bootstrap routine itself starts at Byte 62, the target of the jump in Bytes 0 - 2.
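The byte-by-byte reading above, including the "reverse the Bytes" step (little-endian order), can be done in one go with Python's `struct` module. This is a sketch under assumptions: the field names are invented here, only the first 36 bytes are decoded, and the sample data is the dump from the Boot Sector table.

```python
import struct
from typing import NamedTuple


class BootSector(NamedTuple):
    oem_id: str
    bytes_per_sector: int
    sectors_per_cluster: int
    reserved_sectors: int
    num_fats: int
    root_entries: int
    total_sectors: int
    media_descriptor: int
    sectors_per_fat: int
    sectors_per_track: int
    heads: int
    hidden_sectors: int


# "<" means little-endian with no padding: the low byte comes first on disk.
_LAYOUT = struct.Struct("<3s8sHBHBHHBHHHII")  # Bytes 0 - 35


def parse_boot_sector(sector: bytes) -> BootSector:
    (_jump, oem, bps, spc, reserved, fats, root_entries, total16,
     media, spf, spt, heads, hidden, total32) = _LAYOUT.unpack_from(sector, 0)
    return BootSector(
        oem_id=oem.rstrip(b"\x00 ").decode("ascii"),
        bytes_per_sector=bps,
        sectors_per_cluster=spc,
        reserved_sectors=reserved,
        num_fats=fats,
        root_entries=root_entries,
        # A zero 16-bit count means the real value is in Bytes 32 - 35.
        total_sectors=total16 or total32,
        media_descriptor=media,
        sectors_per_fat=spf,
        sectors_per_track=spt,
        heads=heads,
        hidden_sectors=hidden,
    )


SAMPLE = bytes.fromhex(
    "eb3c90 6d6b646f73667300 0002 01 0100 02 e000 400b"
    " f0 0900 1200 0200 00000000 00000000"
)
boot = parse_boot_sector(SAMPLE)
print(boot)
```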

Directory

ToDo this information is weak, lacks clarifications about padding (how is A.B exactly encoded), what time and dates refers, etc.
Bytes     Meaning
0 - 10    File name with extension
11        Attributes of the file
12 - 21   Reserved Bytes
22 - 23   Indicate the time
24 - 25   Indicate the date
26 - 27   Indicate the entry cluster value
28 - 31   Indicate the size of the file

Reading the Directory

We had a thread where it shows that root directory might not be that simple. It seems like we should read all entries, skipping entries marked as 'volume label' if any. -- PypeClicker

Bytes (0 - 10)
Starting with the Byte 2620, the first 11 Bytes (0 - 10) are the name of the file with extension. If the 11 byte string is PROCESSATXT, then the 8.3 filename is PROCESSA.TXT, since the first 8 bytes of the string comprise the filename and the last 3 are the extension. If the filename is less than 8 bytes or the extension is less than 3, padding spaces are added, e.g. a file name of LOADER.RC would be encoded simply as "LOADER  RC " (that's two spaces after LOADER and one after RC).
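The padding rule can be sketched as a pair of helpers. The names `encode_83` and `decode_83` are hypothetical, and the sketch only handles plain ASCII names; it does not validate which characters a real FAT driver would accept.

```python
def encode_83(name: str) -> bytes:
    """Turn 'LOADER.RC' into the 11-byte, space-padded directory form."""
    base, _, ext = name.upper().partition(".")
    if not 1 <= len(base) <= 8 or len(ext) > 3 or "." in ext:
        raise ValueError(f"not an 8.3 name: {name!r}")
    return base.ljust(8).encode("ascii") + ext.ljust(3).encode("ascii")


def decode_83(raw: bytes) -> str:
    """Turn the 11-byte directory form back into 'NAME.EXT'."""
    base = raw[:8].decode("ascii").rstrip()
    ext = raw[8:11].decode("ascii").rstrip()
    return f"{base}.{ext}" if ext else base


print(encode_83("LOADER.RC"))
print(decode_83(b"PROCESSATXT"))
```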
Byte 11
This Byte lists the attributes of the file. To read this you must convert the hexadecimal Byte to binary. In this case 20 (hex) is converted to 0010 0000. Each of the eight bits represents an attribute of the file. When a bit is on, indicated by a one, the file has that attribute. Starting with the rightmost bit, which is bit 0, and working over to the leftmost bit, bit 7, the attributes are: read only, hidden, system file, volume label, sub-directory, archive, and the last two bits (the 6th and 7th) are reserved. In this particular file it is the 5th bit that is on, meaning that it is an archive file.
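Rather than converting to binary by hand, the attribute Byte can be tested bit by bit. A small sketch, with the attribute names taken from the description above and a hypothetical function name:

```python
from typing import List

# Index in this list == bit number in the attribute Byte (bits 6 and 7 unused).
ATTRIBUTE_NAMES = [
    "read only",
    "hidden",
    "system file",
    "volume label",
    "sub-directory",
    "archive",
]


def decode_attributes(attr: int) -> List[str]:
    """Return the names of all attributes whose bit is set."""
    return [name for bit, name in enumerate(ATTRIBUTE_NAMES) if attr & (1 << bit)]


print(decode_attributes(0x20))  # the example Byte from the text
```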
Bytes (12 - 21)
These are the reserved Bytes.
Bytes (22 - 23)
These two Bytes, 4E and 7B, indicate the time the file was made. To retrieve the time reverse the Bytes to 7B 4E and convert to binary 0111 1011 0100 1110. The hour is read from the first five bits, the minutes are read from the next six bits, and the seconds are read from the last five bits. So our time Bytes are read like this 01111 011010 01110. Reading the Bytes; the hour is 15, the minutes are 26, and the seconds are 14. Important, the seconds must be multiplied by 2 to get the true second reading. So the time that the file was created was 15:26:28 military time or 3:26:28 PM.
Hour 5 bits
Minutes 6 bits
Seconds 5 bits
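The same decoding with shifts and masks, as a sketch (the function name is made up for illustration). The two Bytes are read in little-endian order, which is the "reverse the Bytes" step.

```python
from typing import Tuple


def decode_fat_time(raw: bytes) -> Tuple[int, int, int]:
    """Decode the 2-byte time field into (hour, minute, second)."""
    value = int.from_bytes(raw[:2], "little")  # 4E 7B -> 0x7B4E
    hour = value >> 11                 # top 5 bits
    minute = (value >> 5) & 0x3F       # middle 6 bits
    second = (value & 0x1F) * 2        # low 5 bits, stored in 2-second units
    return hour, minute, second


print(decode_fat_time(bytes([0x4E, 0x7B])))
```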
Bytes (24 - 25)
These two Bytes, 96 and 26, indicate the date the file was made. To retrieve the date reverse the Bytes to 26 96 and convert to binary 0010 0110 1001 0110. The year is read from the first seven bits, the month is read from the next four bits, and the day is read from the last five bits. So our date Bytes are read like this 0010011 0100 10110. Reading the Bytes; the year is 19, the month is 4, and the day is 22. The number for the year must be added with 1980 to get the correct year the file was made. So the date that the file was made was April 22, 1999.
Year 7 bits
Month 4 bits
Day 5 bits
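The date field follows the same pattern with different bit widths. Again a sketch with an invented function name:

```python
from typing import Tuple


def decode_fat_date(raw: bytes) -> Tuple[int, int, int]:
    """Decode the 2-byte date field into (year, month, day)."""
    value = int.from_bytes(raw[:2], "little")  # 96 26 -> 0x2696
    year = (value >> 9) + 1980         # top 7 bits, counted from 1980
    month = (value >> 5) & 0x0F        # middle 4 bits
    day = value & 0x1F                 # low 5 bits
    return year, month, day


print(decode_fat_date(bytes([0x96, 0x26])))
```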
Bytes (26 -27)
These two Bytes, 02 and 00, indicate the entry cluster value for both the FAT and the Open Space Area. More about this in the last two sections (File Allocation Table Entry Cluster Values and Location of File in the Open Space Area).
Bytes (28 - 31)
These four Bytes 1B, 0C, 00, 00 indicate the size of the file. Reversing the Bytes to 00 00 0C 1B and converting the number to decimal the size of the file is 3099 Bytes.
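"Reversing the Bytes" is simply reading a little-endian number. A small sketch, with helper names invented for this example:

```c
#include <stdint.h>

/* Read a 16-bit value stored low Byte first (e.g. the entry cluster). */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Read a 32-bit value stored low Byte first (e.g. the file size). */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Applied to the directory entry above, Bytes 26 - 27 (02 00) give entry cluster 2 and Bytes 28 - 31 (1B 0C 00 00) give a size of 3099 Bytes.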

Finding the Beginning of the Boot, FAT, Directory, and Open Space

Boot Sector

As stated in the introduction, the Boot Sector is always placed in logical sector number (LSN) 0, which starts at 0000.

File Allocation Table (FAT)

The File Allocation Table begins after the Boot Sector. To find the starting Byte, find the length of the Boot Sector which is one sector multiplied by the number of Bytes per sector (Bytes 11 and 12 of the Boot Sector). The File Allocation Table begins at 0200 (hex).

Directory

The Directory begins after both the Boot Sector and the File Allocation Tables. To find the starting Byte, find the number of File Allocation Tables on the diskette (Byte 16 of the Boot Sector) and multiply this number by the number of sectors per FAT (Bytes 22 and 23 of the Boot Sector). Add the number of Boot Sectors (which is one) to get the total number of sectors occupied by the FATs and the Boot Sector. Multiply this total by the number of Bytes per sector (Bytes 11 and 12 of the Boot Sector) to get the starting Byte of the Directory.

(2 * 9) + 1 = 19 sectors; 19 sectors * 512 Bytes/ sector = 9728 Bytes (decimal) or 2600 Bytes (hex). The start of the Directory is 2600.

Open Space

The Open Space begins after the directory. To find the beginning of the Open Space you need to find the size of the directory in Bytes and add that to the beginning Byte of the Directory (2600). To find the size of the directory multiply the number of directory entries (Bytes 17 and 18 of the Boot Sector) by the Bytes per directory entries which in this case is given at 32 Bytes/directory entry of data (decimal).

224 directory entries * 32 Bytes/ directory entry = 7168 Bytes (decimal) or 1C00 (hex)

1C00 Bytes (hex) + 2600 Bytes (hex) = 4200 Bytes (hex). The start of the Open Space is 4200.
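The three calculations above can be collected into one routine that reads the Boot Sector fields directly. This is a sketch under the same assumptions as the text (exactly one Boot Sector, 32 Bytes per directory entry); the struct and function names are invented for the example.

```c
#include <stdint.h>

struct fat_layout {
    uint32_t fat_start;    /* first Byte of the first FAT */
    uint32_t dir_start;    /* first Byte of the Directory */
    uint32_t data_start;   /* first Byte of the Open Space */
};

/* boot points at the raw Boot Sector. */
static struct fat_layout fat_compute_layout(const uint8_t *boot)
{
    uint32_t bytes_per_sector = (uint32_t)(boot[11] | (boot[12] << 8));
    uint32_t fat_count        = boot[16];
    uint32_t dir_entries      = (uint32_t)(boot[17] | (boot[18] << 8));
    uint32_t sectors_per_fat  = (uint32_t)(boot[22] | (boot[23] << 8));
    struct fat_layout l;

    l.fat_start  = 1 * bytes_per_sector;                  /* after the Boot Sector */
    l.dir_start  = (1 + fat_count * sectors_per_fat) * bytes_per_sector;
    l.data_start = l.dir_start + dir_entries * 32;        /* 32 Bytes per entry */
    return l;
}
```

For the diskette used here (512 Bytes per sector, 2 FATs of 9 sectors, 224 directory entries) this yields 0200, 2600 and 4200 (hex).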

File Allocation Table Entry Cluster Values

Starting with the entry cluster value (Bytes 26 and 27 in the Directory), find the values (02 and 00) and reverse them to read 00 02 (hex). The result being **2 (hex or decimal).

Because the value 2 is the same in hexadecimal and decimal, no conversion is necessary; just remember that this next step is done in decimal. Multiply 2 by 1.5, giving 3 (decimal) or 3 (hex). Now go to the File Allocation Table and retrieve the 3rd (0203) and 4th (0204) Bytes. Remember to start counting from zero. Take the two Bytes 03 and 40 and reverse them to 40 03. Because 3 is a whole number, AND the binary value of the hexadecimal number 4003 (0100 0000 0000 0011) with the binary value of the hexadecimal number 0FFF (0000 1111 1111 1111). The result is **3 (hex or decimal).

Convert the result above to decimal (if necessary) and multiply by 1.5, giving 4.5. Drop the fraction and extract the 4th (0204) and 5th (0205) Bytes from the File Allocation Table, which are 40 and 00. Reverse them to read 00 40. Because 4.5 is not a whole number, right shift 0040 by four bits (one hexadecimal digit) to read 0004. The result is **4 (hex or decimal).

Multiply 4 (decimal) by 1.5, giving 6. Now go to the 6th (0206) and 7th (0207) Bytes in the File Allocation Table, which are 05 and 60. Reverse them to read 60 05. Because 6 is a whole number, AND the binary value of the hexadecimal number 6005 (0110 0000 0000 0101) with the binary value of the hexadecimal number 0FFF (0000 1111 1111 1111). The result is **5 (hex or decimal).

Multiply 5 (decimal) by 1.5, giving 7.5. Drop the fraction and read the 7th (0207) and 8th (0208) Bytes in the File Allocation Table, which are 60 and 00. Reversing them we have 00 60. Because 7.5 is not a whole number, right shift 0060 by four bits to read 0006. The result is **6 (hex or decimal).

Multiply 6 (decimal) by 1.5, giving 9. Read the 9th (0209) and 10th (020A) Bytes in the File Allocation Table, which are 07 and 80. Reverse them to read 80 07. Because 9 is a whole number, AND the binary value of the hexadecimal number 8007 (1000 0000 0000 0111) with the binary value of the hexadecimal number 0FFF (0000 1111 1111 1111). The result is **7 (hex or decimal).

Take the decimal result above and multiply by 1.5, giving 10.5. Drop the fraction and read the 10th (020A) and 11th (020B) Bytes in the File Allocation Table, which are 80 and 00. Reversing them we have 00 80. Because 10.5 is not a whole number, right shift 0080 by four bits to read 0008. The result is **8 (hex or decimal).

    • ** Each number marked this way is also used, in decimal form, to calculate the Location of File in Open Space Area (next section).

Multiply 8 (decimal) by 1.5, giving 12. Now extract the 12th (020C) and 13th (020D) Bytes in the File Allocation Table. These Bytes are FF 0F. Reverse them to read 0F FF. Because 12 is a whole number, AND 0FFF with 0FFF, which leaves 0FFF. This value 0FFF (hex) indicates the end of this file.
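The whole walk above is a loop over 12-bit FAT entries, and can be sketched in a few lines of C. "Multiply by 1.5 and drop the fraction" becomes cluster + cluster / 2, and "whole number or not" becomes "even or odd cluster". The function names are invented for this example, and treating every value from 0FF0 (hex) upwards as "stop" is an assumption of this sketch; the text itself only shows 0FFF.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the 12-bit FAT entry for cluster; fat points at the first
   Byte of the File Allocation Table (0200 in the example). */
static uint16_t fat12_next(const uint8_t *fat, uint16_t cluster)
{
    size_t offset = (size_t)cluster + cluster / 2;   /* cluster * 1.5, fraction dropped */
    uint16_t pair = (uint16_t)(fat[offset] | (fat[offset + 1] << 8));

    /* odd cluster: fractional result, right shift by four bits;
       even cluster: whole result, AND with 0FFF */
    return (cluster & 1) ? (uint16_t)(pair >> 4) : (uint16_t)(pair & 0x0FFF);
}

/* Follow the chain from the entry cluster, storing each cluster in out.
   Returns the number of clusters found (at most max). */
static size_t fat12_chain(const uint8_t *fat, uint16_t first,
                          uint16_t *out, size_t max)
{
    size_t n = 0;
    uint16_t c = first;

    while (c >= 2 && c < 0x0FF0 && n < max) {
        out[n++] = c;
        c = fat12_next(fat, c);
    }
    return n;
}
```

Starting from entry cluster 2 with the FAT Bytes shown above, the chain comes out as 2, 3, 4, 5, 6, 7, 8.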

Location of File in Open Space Area

To find the location of the file in the Open Space Area take the decimal results of the File Allocation Table entry cluster values, as denoted by the double asterisks, and subtract 2. Then multiply by the number of Bytes per sector, which is indicated in Bytes 11 and 12 in the Boot Sector. In this case the Bytes per sector value is 512 (decimal). Finally take this value in Bytes, convert it to hexadecimal, and add it onto the starting location of the Open Space Area, which in this case is 4200.

(2 - 2) sectors * 512 Bytes per sector = 0 Bytes (decimal); 0 Bytes (hex) + 4200 Bytes = 4200. Entry value of the first cluster is 4200.

(3 - 2) sectors * 512 Bytes per sector = 512 Bytes (decimal); 200 Bytes (hex) + 4200 Bytes = 4400. Entry value of the second cluster is 4400.

(4 - 2) sectors * 512 Bytes per sector = 1024 Bytes (decimal); 400 Bytes (hex) + 4200 Bytes = 4600. Entry value of the third cluster is 4600.

(5 - 2) sectors * 512 Bytes per sector = 1536 Bytes (decimal); 600 Bytes (hex) + 4200 Bytes = 4800. Entry value of the fourth cluster is 4800.

(6 - 2) sectors * 512 Bytes per sector = 2048 Bytes (decimal); 800 Bytes (hex) + 4200 Bytes = 4A00. Entry value of the fifth cluster is 4A00.

(7 - 2) sectors * 512 Bytes per sector = 2560 Bytes (decimal); A00 Bytes (hex) + 4200 Bytes = 4C00. Entry value of the sixth cluster is 4C00.

(8 - 2) sectors * 512 Bytes per sector = 3072 Bytes (decimal); C00 Bytes (hex) + 4200 Bytes = 4E00. Entry value of the seventh cluster is 4E00.
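The same arithmetic as a one-line helper. Like the text, this sketch assumes one sector per cluster; the function name is invented for the example.

```c
#include <stdint.h>

/* Byte offset of a cluster inside the disk image.
   data_start is the start of the Open Space (4200 hex in the example). */
static uint32_t cluster_offset(uint16_t cluster, uint32_t data_start,
                               uint32_t bytes_per_sector)
{
    return data_start + (uint32_t)(cluster - 2) * bytes_per_sector;
}
```

cluster_offset(2, 0x4200, 512) gives 4200 (hex) and cluster_offset(8, 0x4200, 512) gives 4E00 (hex), matching the first and last lines of the list above.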


Links to more information about FAT

AmigaFFS document

Included from AmigaFFS Document

1.1 Root Block

The root of the tree is the root block, which is at a fixed place on the disk. The root is like any other directory, except that it has no parent, and its secondary type is different. AmigaDOS stores the name of the disk volume in the name field of the root block.

Each filing system block contains a checksum, where the sum (ignoring overflow) of all the words in the block is zero.

          +---------------+
        0 |  T. SHORT     | Type
          |---------------|
        1 |       0       | header key (always 0)
          |---------------|
        2 |         0     | Highest seq number (always 0)
          |---------------|
        3 |   HT SIZE     | Hashtable size (=blocksize -56)
          |---------------|
        4 |       0       |
          |---------------|
        5 |   CHECKSUM    |
          |---------------|
        6 |     hash      |
          |     table     |
          /               /
          \               \
  SIZE-51 |               |
          |---------------|
  SIZE-50 |  BMFLAG       | TRUE if bitmap on disk is valid
          |---------------|
  SIZE-49 |   bitmap      | Used to indicate the blocks
  SIZE-24 |    pages      | containing the bitmap
          |---------------|
  SIZE-23 |    DAYS       | Volume last altered date and time
          |---------------|
  SIZE-22 |    MINS       |
          |---------------|
  SIZE-21 |    TICKS      |
          |---------------|
  SIZE-20 |     DISK      | Volume name as a BCPL string
          |     NAME      | of <= 30 characters
          |---------------|
  SIZE-7  |   CREATEDAYS  | Volume creation date and time
          |---------------|
  SIZE-6  |   CREATEMINS  |
          |---------------|
  SIZE-5  |  CREATETICKS  |
          |---------------|
  SIZE-4  |       0       | Next entry on this hash chain
          |---------------| (always 0)
  SIZE-3  |       0       | Parent directory (always 0)
          |---------------|
  SIZE-2  |       0       | Extension (always 0)
          |---------------|
  SIZE-1  |    ST.ROOT    | Secondary type indicates root block
          +---------------+
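The checksum rule stated above (all words of a block sum to zero, ignoring overflow) can be sketched as follows. The sketch assumes that the block has already been read into an array of 32-bit words in host byte order, and that the CHECKSUM word sits at offset 5 as in the diagrams; the function names are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define CHECKSUM_WORD 5   /* offset of the CHECKSUM field in the diagrams */

/* Sum of all words, wrapping on overflow; zero for a valid block. */
static uint32_t block_sum(const uint32_t *words, size_t count)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i < count; i++)
        sum += words[i];
    return sum;
}

/* Value to store in the CHECKSUM field so that the block sums to zero. */
static uint32_t block_checksum(const uint32_t *words, size_t count)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i < count; i++)
        if (i != CHECKSUM_WORD)
            sum += words[i];
    return 0u - sum;   /* unsigned arithmetic wraps, which is what we want */
}
```

A block is accepted when block_sum returns 0; before writing a block, store the result of block_checksum in word 5.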

1.1.2 User Directory Blocks

          +---------------+
        0 |   T.SHORT     | Type
          |---------------|
        1 |   OWN KEY     | Header Key (pointer to self)
          |---------------|
        2 |       0       | Highest Seq Number (always 0)
          |---------------|
        3 |       0       |
          |---------------|
        4 |       0       |
          |---------------|
        5 |  CHECKSUM     |
          |---------------|
        6 |               |
          |    hash table |
          /               /
          \               \
  SIZE-51 |               |
          |---------------|
  SIZE-50 |    Spare      |
          |---------------|
  SIZE-48 |    PROTECT    |  Protection bits
          |---------------|
  SIZE-47 |       0       | Unused (always 0)
          |---------------|
  SIZE-46 |               |
          |   COMMENT     | Stored as  BCPL string
  SIZE-24 |               |
          |---------------|
  SIZE-23 |     DAYS      | Creation date and time
          |---------------|
  SIZE-22 |     MINS      |
          |---------------|
  SIZE-21 |    TICKS      |
          |---------------|
  SIZE-20 | DIRECTORY NAME| Stored as a BCPL string <=30 chars
          |---------------|
  SIZE-4  | HASHCHAIN     | Next entry with same hash value
          |---------------|
  SIZE-3  |    PARENT     | back pointer to parent directory
          |---------------|
  SIZE-2  |      0        | Extension (always 0)
          |---------------|
  SIZE-1  |  ST.USERDIR   | secondary type
          +---------------+

User directory blocks have type T.SHORT and secondary type ST.USERDIRECTORY. The six information words at the start of the block also indicate the block's own key (that is, the block number) as a consistency check and the size of the hash table. The 50 information words at the end of the block contain the date and time of creation, the name of the directory, a pointer to the next file or directory on the hash chain, and a pointer to the directory above.

To find a file or sub-directory, you must first apply a hash function to its name. This hash function yields an offset in the hash table, which is the key of the first block on a chain linking those with the same hash value (or 0, if there are none). AmigaDOS reads the block with this key and compares the name of the block with the required name. If the names do not match, it reads the next block on the chain, and so on.
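The chain walk described above could look roughly like this. It is only a sketch: struct header_block is a simplified in-memory stand-in for a header block, the array lookup stands in for reading a block from disk, the hash function itself is left out (first_key is assumed to come from the hash table slot), and the plain strcmp comparison is a simplification.

```c
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical view of a header block. */
struct header_block {
    const char *name;      /* file or directory name */
    unsigned hash_chain;   /* key of the next block with the same hash, 0 = none */
};

/* blocks[key] stands in for "read the block with this key".
   Returns the key of the block called name, or 0 if the chain ends
   without a match. */
static unsigned find_in_chain(const struct header_block *blocks, size_t nblocks,
                              unsigned first_key, const char *name)
{
    unsigned key = first_key;

    while (key != 0 && key < nblocks) {
        if (strcmp(blocks[key].name, name) == 0)
            return key;
        key = blocks[key].hash_chain;   /* follow the HASHCHAIN field */
    }
    return 0;
}
```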

1.1.3 File Header Block

           +------------+
        0  |   T.SHORT  | Type
           |------------|
        1  |   OWN KEY  | Header Key
           |------------|
        2  | HIGHEST SEQ| Total number of data blocks in file
           |------------|
        3  |  DATA SIZE | Number of data block slots used
           |------------|
        4  | FIRST DATA | First data block
           |------------|
        5  |  CHECKSUM  |
           |------------|
        6  |            |
           /            /
           \            \
           | DATA BLK 3 |
           | DATA BLK 2 | List of data block keys
  SIZE-51  | DATA BLK 1 |
           |------------|
  SIZE-50  |  Spare     |
           |------------|
  SIZE-49  |   PROTECT  | Protection bits
           |------------|
  SIZE-48  |  BYTESIZE  | Total size of file in bytes
           |------------|
  SIZE-46  |            |
           |  COMMENT   | Comment as a BCPL string
  SIZE-24  |            |
           |------------|
  SIZE-23  |    DAYS    | Creation date and time
           |------------|
  SIZE-22  |    MINS    |
           |------------|
  SIZE-21  |    TICKS   |
           |------------|
  SIZE-20  | FILE NAME  | Stored as BCPL string <= 30 chars
           |------------|
  SIZE-4   |  HASHCHAIN | Next entry with same hash value
           |------------|
  SIZE-3   |   PARENT   | Back pointer to the parent directory
           |------------|
  SIZE-2   |  EXTENSION | Zero or pointer to the first extension
           |------------| block
  SIZE-1   |  ST. FILE  | Secondary type
           +------------+

Each terminal file starts with a file header block, which has type T.SHORT and secondary type ST.FILE. The start and end of the block contain name, time, and redundancy information similar to that in a directory block. The body of the file consists of Data blocks with sequence numbers from 1 upwards. AmigaDOS stores the addresses of these blocks in consecutive words downwards from offset size-51 in the block. In general, AmigaDOS does not use all the space for this list and the last data block is not full.
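Since the data block keys are stored downwards from SIZE-51, finding the slot for a given data block is simple index arithmetic. A small sketch, with function names invented for the example and SIZE taken as the block size in words:

```c
/* Word offset, in a header block of size_words words, of the slot that
   holds the key of data block number seq (sequence numbers start at 1). */
static long data_key_index(long size_words, long seq)
{
    return (size_words - 51) - (seq - 1);
}

/* Number of key slots in one block: offsets 6 through SIZE-51. */
static long data_key_slots(long size_words)
{
    return (size_words - 51) - 6 + 1;
}
```

Assuming 128-word blocks, data block 1 is listed at offset 77, data block 3 at offset 75, and one header block has room for 72 keys.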

1.1.4 File List Block

If there are more blocks in the file than can be specified in the block list, then the EXTENSION field is non-zero and points to another disk block which contains a further data block list. The following figure explains the structure of the file list block.

           +-------------+
        0  |   T. LIST   | Type
           |-------------|
        1  |   OWN KEY   | Header Key
           |-------------|
        2  | BLOCK COUNT | =number of data blocks in block list
           |-------------|
        3  | DATA SIZE   | Same as above
           |-------------|
        4  | FIRST DATA  | First Data Block
           |-------------|
        5  |  CHECKSUM   |
           |-------------|
        6  |             |
           /             /
           \             \
           | BLOCK N+3   |
           | BLOCK N+2   | Extended list of data block keys
  SIZE-51  | BLOCK N+1   |
           |-------------|
  SIZE-50  |      info   | (unused)
           |-------------|
  SIZE-4   |     0       | Next in hash list (always 0)
           |-------------|
  SIZE-3   |   PARENT    | File header block of this file
           |-------------|
  SIZE-2   |  EXTENSION  | Next extension block
           |-------------|
  SIZE-1   |   ST. FILE  | Secondary type
           +-------------+

There are as many file extension blocks as required to list the data blocks that make up the file. The layout of the block is very similar to that of a file header block, except that the type is different and the date and filename fields are not used.

1.1.5 Data Block

           +-------------+
        0  |   T. DATA   | type
           |-------------|
        1  |   HEADER    | header key
           |-------------|
        2  |   SEQ NUM   | Sequence number
           |-------------|
        3  |  DATA SIZE  |
           |-------------|
        4  |  NEXT DATA  | next data block
           |-------------|
        5  |  CHECKSUM   |
           |-------------|
        6  |             |
           |             |
           |             |
           |             |
           |    DATA     |
           |             |
           |             |
           |             |
           +-------------+

Data blocks contain only six words of filing system information. These six words refer to the following:

  • type (T.DATA)
  • pointer to the file header block
  • sequence number of the data block
  • number of words of data
  • pointer to the next data block
  • checksum

Normally, all data blocks except the last are full (that is, they have a data size = blocksize-6). The last data block has a forward pointer of 0.
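From the "blocksize-6" rule one can work out how many data blocks a file needs. The sketch below assumes that a word is 32 bits (4 bytes) and takes the block size in words as a parameter; both function names are invented for this example.

```c
#include <stdint.h>

/* Bytes of file data that fit in one data block: every word except
   the six filing system words (assuming 4-byte words). */
static uint32_t data_bytes_per_block(uint32_t block_words)
{
    return (block_words - 6) * 4;
}

/* Number of data blocks needed for a file, rounding up so that
   only the last block may be partly filled. */
static uint32_t data_blocks_needed(uint32_t file_bytes, uint32_t block_words)
{
    uint32_t per_block = data_bytes_per_block(block_words);

    return (file_bytes + per_block - 1) / per_block;
}
```

With 128-word blocks each data block carries 488 bytes, so a 489-byte file already needs two data blocks.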






Hardware::CPU

The IA32 Architecture Family

Included from The IA32 Architecture Family

(Information taken from the Intel manuals to give an overview over the individual generation's capabilities.)

Intel Processors

These processors from Intel use the CPUID string "GenuineIntel"

Intel 386

Successor to the 80286, the Intel 386 is the first processor of the IA32 architecture. It has 32 bit wide registers, supports 4 kByte paging, and a flat memory model in addition to the segmented memory model of the 80286.

Intel 486

The 486 integrates an 80x87 FPU on-chip, and supports power saving functions (System Management Mode, StopClock, AutoHaltPowerdown).

Pentium

The Pentium supports 4 MByte paging in addition to the usual 4 kByte paging, integrates an APIC and (in later steppings) MMX SIMD registers (single instruction, multiple data). It also supports 2-way multiprocessing.

Pentium Pro

The Pentium Pro supports PAE (36 bit address space), but does not have the MMX registers of the Pentium.

Pentium II

The Pentium II supports MMX again (as well as PAE), and adds low-power states: AutoHALT, Stop-Grant, Sleep, and DeepSleep.

Pentium II Xeon

The Xeon supports 4/8/+ way multiprocessing.

Pentium III

The Pentium III supports SSE (128 bit packed single FP SIMD).

Pentium IV / Pentium M

The Pentium IV as well as the (mobile) Pentium M both support SSE2; the Pentium IV also supports Hyper-Threading (one-chip multiprocessing).

Advanced Micro Devices Intel-compatible Processors

AMD is the biggest competitor to Intel at this time (2004 August). They entered this market slightly after Cyrix with the 5k86 (a 486 compatible similar to the 5x86; don't confuse them) and then followed it up with the K6 processor. The K6 was faster than the Pentiums, and more popular than the Cyrix processors, both because AMD didn't use ratings for it (afaik) and because it didn't overheat (as was claimed, untruly, of the Cyrixes).

The CPUID identifier string is "AuthenticAMD"

K6

The K6 processor was a very nice processor, being Pentium compatible and doing anything the Pentium could. It became unpopular because it was too damn fast: it did a LOOPcc within 2 cycles, where the Pentiums took 18 cycles.

Software that couldn't handle the sheer speed made errors, and thus caused frequent "Blue Screen"s. Microsoft issued a K6 patch which was mandatory for all K6 users.

K6-2

AMD had learned a lesson there: don't make your processor too fast in some instructions. Not that they were put off by that; they just added 16 wait states to the execution of LOOPcc, slowing it to the speed of a Pentium. AMD didn't stop there, however. They added a special case (speculation, might be coincidence) for the DEC (E)CX; Jcc combination, which is semantically equivalent to the LOOPcc instruction. Because of this equivalence, and because LOOP was faster on Intels, the LOOP instruction was always used and nobody used the DEC/Jcc combo. AMD kept the original speed for this combo and specified in their optimization manuals that it was the preferred method over the LOOPcc instruction.

It also featured a new technology, 3DNow!, which was MMX using floating point numbers, multiplexed (again) on the floating point registers. The K6-2 was quite popular, and scaled higher than the P1 ever did. It was largely compatible with the P2, but (afaik) not completely.

K6-3

They started this design off with the concept of not making it underpowered in any place, and to make it at least P2 compatible. It was fully P2 compatible.

The K6-3 was not too popular, mainly because the K6-2 did very well and people didn't see why they should buy a more expensive K6-3 for the same number of megahertz. This of course was a joke, just as it is to call a 2 GHz Opteron slower than a 2.2 GHz Celeron.

A little known fact about the K6-3 is that it is in fact an Athlon, minus a few instructions, and minus one very important piece. The K6-3 suffered from a bottleneck at the instruction decode unit (which converts the X86 instructions to native instructions). It could only handle 2 in a cycle, which it made during about 20-30% of the cycles for average software. For optimized software you could bring it to 100% easily, and still want another channel. This wasn't too weird, because it did have 3 execution units of each type (ALU / MMX / loadstore) which were not used much at all. Note that these units are units executing the native instructions, so making 3 of each is not a stupid idea. They needed a new front end, and of course a new copy of instructions from Intel.

Athlon (first try)

The first models of the Athlon were distinct: it was the first time that a competitor to Intel actually had a faster processor without Intel having a backup plan. It was pitted against the PIII, which at that time was Intel's top model and best-running one too. The Athlon beat them to the 1 GHz mark, and by that time the 1 GHz mark had become completely irrelevant; it just meant that they had a new size to mark their processors with. Intel missed the point here, and did so until very recently. The GHz myth had been broken: the Athlon at 1.1 GHz was still faster than the PIII at 1.3 GHz, and people knew. They didn't go for a P3 if a faster Athlon was available at a lower clock speed, and at a lower price.

Athlon XP / MP / Duron (new style)

AMD switched to a big offensive, trying to persuade buyers to demand AMD CPUs instead of being OK with Intels. Each new version of these processors was just a tad better than the previous one and supported a few more instructions (the Athlons started without even SSE1; from model 6 on, both Athlon and Duron supported it). The processors also advanced very slightly in every other direction, making each new type just a tad faster than the previous one. At the end of the GHz wars (about a year ago) the fastest Athlon was running at 2.2 GHz, but outperformed the better half of the 3 GHz P4s.

AMD64 based CPU's

This is slightly off-topic here, but still quite relevant, since these processors all support the entire IA32 family natively. AMD created a new processor with 64-bit (actually 48-bit, but who notices those 16 bits?) memory addressing and 64-bit calculations that is very compatible with the old-style CPUs. So compatible, in fact, that the core for 32-bit and 64-bit is essentially the same, aside from the size of calculations and the support of a few encodings that were in effect redundant. They removed a few 1-byte opcodes (about 20 in total, including all 1-byte INC and 1-byte DEC instructions) to make room for a new REX prefix. They extended it to 16 registers instead of 8, added a load of new names, got the old software working, and optimized the 32-bit performance to unprecedented levels. These CPUs outperform the P4 at any clock speed in almost any calculation-intensive program (about 1 program in 20 being the exception). This made them very popular, but also very expensive. The cheapest nowadays is around 180 dollars, or euros.

Other CPU vendors making similar chips

Cyrix

Cyrix was a well-known CPU vendor from the 386 years (and slightly before) up to the Pentium II times, when it more or less vanished inside Via. Via now uses the name as a CPU name (which doesn't make things clearer), but this section is about the Cyrix CPUs. The processors that support CPUID report the identifier string "CyrixInstead".

Cyrix 387

This isn't actually a processor but a coprocessor, yet it is the most famous Cyrix product. It was the fastest coprocessor for the 386 to be found, and was even very usable beside a 486-SX. These were the main source of income for Cyrix.

Cyrix 5x86

A processor that performed like a 486 and was socket-compatible with it. It is not Pentium compatible, as it lacks required instructions (such as cmpxchg8b).

Cyrix 6x86MX / M1

This processor is, even though the name suggests otherwise, compatible with the 586 (Pentium). It didn't contain any of the MMX or PPro features but is nevertheless very nice. It performed slightly better per cycle, and was thus given ratings. This was the time they were loathed for rating their processors.

Cyrix M2

Was a Pentium MMX compatible processor, also using ratings which gave it a bad name to start with. It was again socket-compatible to the Pentium MMX and the older Pentiums (without MMX). It supported a few features from the Pentium Pro, among which the very usable CMOVcc set. This however wasn't well known at the time, and nobody seemed to care.

There were possibly more, but the current author can't recall which ones. Suffice it to say, if there were any they were unpopular and soon gone. The company was bought by Via.

Rise Technologies

I've only heard about this company making Pentium-compatible chips, without MMX, but I don't know any detail but the CPUID identifier string. It just stuck. The string was "RiseRiseRise", or the same in all 3 dwords (making a search for it very easy).


partially related thread: AT,XT and PC


Category: CollectedKnowledge, HardWareCpu

What is v8086 mode ?

Included from What is v8086 mode?

Virtual 8086 mode is a sub-mode of ProtectedMode. In short, in virtual 8086 mode the cpu (in protected mode) runs an "emulated" 16bit 'segmented' model (real mode) machine.

I don't enable V86 myself. Why should i care ?

The most common problem with v86 mode is that you can't enter ProtectedMode inside a v86 task. In other words, if you are running Windows or have emm386 in memory, you can't do a "raw" switch into protected mode (it causes an exception, iirc). DOS extenders worked around that problem using either VCPI or DPMI interfaces to switch into pmode (actually, promoting their V86 task as a 'regular' user task). For an OS programmer such interfaces are simply useless as they're part of another OS.

There are a few other more "technical" problems people have when using v86 mode, mostly because some instructions in v86 are "emulated" by what's known as a v86-monitor program: as the cpu is in protected mode, some instructions are high up on the security/protection level, and running those directly would cause no end of trouble for the OS.

Such technicalities are beyond the scope of a simple FAQ. If you wish to learn more about virtual mode, I suggest you read the corresponding chapter of the HollyIntelManual.

How do i detect v8086 ?

EFLAGS.VM is NEVER pushed onto the stack if the V86 task uses PUSHFD. You should check if CR0.PE=1 and then assume it's V86 if that bit is set.

detect_v86:
        smsw    ax
        and     eax,1           ;CR0.PE bit
        ret

VM mode detection is mainly useful when writing DOS extenders or other programs that could be started either in plain real mode or in virtual mode from a protected mode system. An 'ordinary' bootloader shouldn't worry about this since the BIOS will not set up VM86 to read the bootsector ;)

I heard it could help me. How can i support it ?

Indeed, VM86 can be of great interest if you need to access BIOS functions while you're in ProtectedMode. This is mainly useful for setting up a video mode. As many modern cards/chipsets lack support for the VBE3 protected mode interface, setting up a VM86 task that will perform the proper 'set video mode' call remains the usual method.

TimRobinson has provided a very nice tutorial about VM86 mode. BeyondInfinity also has a working implementation (combined VM86+VBE task). See VirtualMonitor page for more implementation considerations.

Argh! My kernel is below 1MB! what can i do ?

TimRobinson and many others suggest that you put your kernel at a 'high' logical address (e.g. 0xC0000000) to keep VM86 tasks from interfering with it. This is especially important when your kernel is large and leaves no room for VM86 code below 1MB, or when you plan to run 'full programs' within your VM86 box.

If all you need is a BIOS interrupt wrapper, then you can easily do the following:

  1. ensure that your 16bits code is on a separate page from any 32 bits code
  2. enable paging
  3. make kernel pages unwritable (and unreadable ?) for DPL3 and allow user-access only to those pages that contain your 16 bits code and pages that contain BIOS code or data.
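Step 3 comes down to the low flag bits of each 32-bit page table entry: bit 0 is "present", bit 1 is "writable" and bit 2 is "user". A minimal sketch follows; the macro names and make_pte are invented for this example, and it only covers plain 4 kByte pages without PAE.

```c
#include <stdint.h>

#define PTE_PRESENT  0x001u   /* bit 0: page is mapped */
#define PTE_WRITABLE 0x002u   /* bit 1: writes allowed */
#define PTE_USER     0x004u   /* bit 2: accessible from ring 3 (and thus VM86) */

/* Build a page table entry for the 4 kByte page at phys_addr. */
static uint32_t make_pte(uint32_t phys_addr, int user, int writable)
{
    uint32_t pte = (phys_addr & 0xFFFFF000u) | PTE_PRESENT;

    if (writable)
        pte |= PTE_WRITABLE;
    if (user)
        pte |= PTE_USER;
    return pte;
}
```

Kernel pages would be built with user = 0, which makes them both unreadable and unwritable from the VM86 task, while the pages holding the 16 bits code and the BIOS areas get user = 1. Keep in mind that the page directory entry covering those pages needs the user bit as well.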

Can i use VM86 for disk access ?

Theoretically yes, though it is probably not a GoodIdea(tm): most BIOS disk access involves IRQ handlers and DMA transfers which you can hardly control from your VM monitor, and the VM86 task may get stuck while the BIOS waits for an interrupt response, whereas a 'good' driver would have left the CPU free for other processes.

Remember your old MS9x system freezing when doing a disk access? That was most of the time due to an INT13-through-VM86 problem.


Categories: FAQ, HardWareCpu


Related forum threads

Creating vm86 task, VM86 and INT10h, kernel location & VM86


AMD K6 WriteBack Optimisations

Included from AMD K6 WriteBack Optimisations

I wrote and tested this on my own K6 (K6-200) and it works OK, but I was unable to find anyone with a K6-2 (CXT core) or K6-3 to test the other path, since there are two different methods for enabling writeback mode. It should work fine on K6-2 CXT and K6-3 processors.

With some tweaking, it can be put into anyone's OS.

You call AMD_K6_writeback with the CPUID results family, model and stepping, only when you are sure you have an AMD cpu.

Here's the code, using InlineAssembly

/* Forward declarations; both helpers are defined below. */
void AMD_K6_write_msr(ULONG msr, ULONG v1, ULONG v2, union REGS *regs);
void AMD_K6_read_msr(ULONG msr, union REGS *regs);

void AMD_K6_writeback(int family, int model, int stepping)
{
    /* mem_end == top of memory in bytes */
    ULONG mem=(mem_end>>20)/4; /* memory size in 4MB units */
    int c;
    union REGS regs;

    if(family==5)
    {
        c=model;

        /* model 8 stepping 0-7 use old style, 8-F use new style */
        if(model==8)
        {
            if(stepping<8)
                c=7;
            else
                c=9;
        }

        switch(c)
        {
        /* old style write back: 7-bit limit in bits 1-7 of EAX */
        case 6:
        case 7:
            AMD_K6_read_msr(0xC0000082, &regs);
            if(((regs.x.eax>>1)&0x7F)==0)
                kprintf("AMD K6 : WriteBack currently disabled\n");
            else
                kprintf("AMD K6 : WriteBack currently enabled (%luMB)\n",
                    ((regs.x.eax>>1)&0x7F)*4);

            kprintf("AMD K6 : Enabling WriteBack to %luMB\n", mem*4);
            /* mask the limit first, then shift it into place */
            AMD_K6_write_msr(0xC0000082, ((mem&0x7F)<<1), 0, &regs);
            break;

        /* new style write back: 10-bit limit in bits 22-31 of EAX */
        case 9:
            AMD_K6_read_msr(0xC0000082, &regs);
            if(((regs.x.eax>>22)&0x3FF)==0)
                kprintf("AMD K6 : WriteBack Disabled\n");
            else
                kprintf("AMD K6 : WriteBack Enabled (%luMB)\n",
                    ((regs.x.eax>>22)&0x3FF)*4);

            kprintf("AMD K6 : Enabled WriteBack (%luMB)\n", mem*4);
            /* mask the limit first, then shift it into place */
            AMD_K6_write_msr(0xC0000082, ((mem&0x3FF)<<22), 0, &regs);
            break;
        default:    /* dont set it on Unknowns + k5's */
            break;
        }
    }
}

void AMD_K6_write_msr(ULONG msr, ULONG v1, ULONG v2, union REGS *regs)
{
    /* Interrupts are disabled and the caches flushed around the write.
       Registers used as operands must not be listed as clobbers too. */
    asm __volatile__ (
        "pushfl\n"
        "cli\n"
        "wbinvd\n"
        "wrmsr\n"
        "popfl\n"
        : "=a" (regs->x.eax),
          "=b" (regs->x.ebx),
          "=c" (regs->x.ecx),
          "=d" (regs->x.edx)
        : "a" (v1),
          "d" (v2),
          "c" (msr)
        : "memory", "cc");
}

void AMD_K6_read_msr(ULONG msr, union REGS *regs)
{
    /* rdmsr returns the MSR selected by ECX in EDX:EAX.
       Registers used as operands must not be listed as clobbers too. */
    asm __volatile__ (
        "pushfl\n"
        "cli\n"
        "wbinvd\n"
        "xorl %%eax, %%eax\n"
        "xorl %%edx, %%edx\n"
        "rdmsr\n"
        "popfl\n"
        : "=a" (regs->x.eax),
          "=b" (regs->x.ebx),
          "=c" (regs->x.ecx),
          "=d" (regs->x.edx)
        : "c" (msr)
        : "memory", "cc");
}
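The family, model and stepping arguments can be taken from the value CPUID function 1 returns in EAX. The following sketch only does the bit extraction, so it does not execute CPUID itself; the struct and function names are invented for this example, and the extended family/model fields are ignored, which is fine for family 5 parts like the K6.

```c
#include <stdint.h>

struct cpu_signature {
    int family;
    int model;
    int stepping;
};

/* Split the EAX value returned by CPUID function 1 into its fields. */
static struct cpu_signature decode_signature(uint32_t eax)
{
    struct cpu_signature s;

    s.stepping = (int)(eax & 0xF);          /* bits 0-3 */
    s.model    = (int)((eax >> 4) & 0xF);   /* bits 4-7 */
    s.family   = (int)((eax >> 8) & 0xF);   /* bits 8-11 */
    return s;
}
```

A signature of 0x58C, for example, decodes to family 5, model 8, stepping 12, which AMD_K6_writeback treats as a "new style" part.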

Categories: CollectedKnowledge, HardWareCpu

How can I tell CPU speed ?

Included from How can I tell CPU speed ?

General Overview

To determine the CPU speed, we need two things:

  1. a way to tell that a given (precise) amount of time has elapsed.
  2. a way to know how many clock cycles a portion of code took.

Once these two sub-problems are solved, the CPU speed can be computed with the following pseudo-code:

prepare_a_timer(X milliseconds ahead);
while (timer has not fired) {
  inc iterations_counter;
}
cpuspeed_mhz = (iterations_counter * clock_cycles_per_iteration) / (X * 1000);
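
The arithmetic of that last line can be sketched in C as follows. This is only an illustration: the function name and its parameters are hypothetical, and it assumes the loop ran for a whole number of milliseconds.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a calibrated-loop measurement into MHz.
 * iterations           : loop iterations counted before the timer fired
 * cycles_per_iteration : known cost of one iteration, in clock cycles
 * elapsed_ms           : length of the measurement window, in milliseconds */
static uint32_t cpu_mhz_from_loop(uint64_t iterations,
                                  uint32_t cycles_per_iteration,
                                  uint32_t elapsed_ms)
{
    assert(elapsed_ms != 0);
    uint64_t cycles = iterations * cycles_per_iteration;
    /* cycles per millisecond / 1000 = cycles per microsecond = MHz */
    return (uint32_t)(cycles / ((uint64_t)elapsed_ms * 1000u));
}
```

For instance, one million iterations of a 3-cycle loop counted in 10 ms correspond to 300 MHz.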

Note that, except in very special cases, using a busy-loop (even a calibrated one) to introduce delays is a bad idea. It should be kept for the very small delays (nanoseconds or microseconds) that you must respect when programming hardware.

Also note that PC emulators (like BOCHS, for instance) are rarely realtime and that you shouldn't be surprised if your clock appears to run faster than expected on those emulators.

Waiting for a given amount of time

There are two circuits in a PC that allow you to deal with time: the PIT (Programmable Interval Timer, 8253/8254) and the RTC (Real Time Clock). The PIT is probably the better of the two for this task.

The PIT has two operating modes that can be useful for telling the CPU speed:

  1. the periodic interrupt mode (command byte 0x36), in which a signal is emitted to the interrupt controller at a fixed frequency. This is especially interesting on PIT channel 0, which is bound to IRQ0 on a PC.
  2. the mode selected by command byte 0x34, called 'one shot' here, in which the PIT decrements a counter at its top speed (1.19318 MHz) until the counter reaches zero. Strictly speaking, 0x34 selects the rate generator (mode 2), which reloads the counter and keeps counting after it reaches zero; the interrupt-on-terminal-count mode (mode 0) would be selected by 0x30.

    Whether or not an IRQ is fired by channel 0 in 0x34 mode should be checked.

Note that, theoretically, one shot mode could be used with a polling approach, reading the current count on the channel's data port. However, I/O bus cycles have unpredictable latency, so one should make sure the time stamp counter measurement is not distorted by this approach.
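
As an illustration of how the command bytes 0x36 and 0x34 are formed, here is a minimal sketch. The helper names are hypothetical, and the input clock value (1193182 Hz) is the commonly quoted figure for the PC's PIT, close to the 1.19318 MHz given above.

```c
#include <assert.h>
#include <stdint.h>

#define PIT_INPUT_HZ 1193182u   /* PIT input clock, about 1.19318 MHz */

/* Build the byte written to the PIT command port (0x43):
 * bits 7-6 = channel, bits 5-4 = access mode (3 = low byte then high byte),
 * bits 3-1 = operating mode, bit 0 = 0 for binary counting. */
static uint8_t pit_command(unsigned channel, unsigned access, unsigned mode)
{
    assert(channel < 3 && access < 4 && mode < 6);
    return (uint8_t)((channel << 6) | (access << 4) | (mode << 1));
}

/* Reload value that makes a channel fire roughly `hz` times per second. */
static uint16_t pit_reload_for(unsigned hz)
{
    assert(hz >= 19);   /* lower rates would need a reload value above 65535 */
    return (uint16_t)((PIT_INPUT_HZ + hz / 2) / hz);   /* rounded division */
}
```

Assuming an outb(port, value) helper like the one used further down this page, programming channel 0 for 100 Hz would then be outb(0x43, pit_command(0, 3, 3)) followed by two writes to port 0x40, low byte of pit_reload_for(100) first and high byte second.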

ToDo: check if there's code that programs the PIT in the FAQ already

Knowing how many cycles your loop takes

This step depends on your CPU. On the 286, 386 and 486, each instruction took a well-known and deterministic number of clock cycles to execute. This allowed the programmer to tell exactly how many cycles a loop iteration took by looking up the timing of each instruction (see HelpPC) and summing them up.

Since the multi-pipelined architecture of the Pentium, however, such numbers are no longer published, largely because the same instruction can have variable timings depending on its surroundings, which makes per-instruction timings almost useless.

It is possible to create code which is exceptionally pipeline hostile such as:

xor eax,edx
xor edx,eax
xor eax,edx
xor edx,eax
...

A simple xor instruction takes one cycle, and the processor cannot execute these instructions in parallel, because each instruction's operands depend on the result of the previous one. One can check that, for a small count (tested from 16 to 64), RDTSC shows a cycle count almost exactly equal to the instruction count (sometimes off by one). Unfortunately, when the chain is made longer you start experiencing code cache misses, which ruin the whole process.

E.g. looping on a chain of 1550 XORs may require a hundred iterations before it stabilizes around 1575 clock cycles on an AMD x86-64, and I'm still waiting for it to stabilize on my Pentium 3.

Despite this inaccuracy, it gives relatively good results across the whole processor generation given a reasonably accurate timer, but if very accurate measurements are needed the next method should prove more useful.

A Pentium developer has a much better tool for measuring timings: the Time Stamp Counter, an internal counter that can be read using the special RDTSC instruction.

rdtscpm1.pdf explains how that feature can be used for performance monitoring and should provide the necessary information on how to access the TSC on a Pentium.

How do I know if I have access to the RDTSC instruction or not ?

The presence of the Time Stamp Counter (and thus the availability of the RDTSC instruction) can be detected through the CPUID instruction. When calling cpuid with eax=1, you'll receive the feature flags in edx. TSC is bit #4 of that field.
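
The bit test itself is tiny; a sketch in C could look like this. The macro and function names are made up for the example, and the caller is assumed to have obtained EDX from CPUID leaf 1 already.

```c
#include <stdint.h>

#define CPUID_EDX_TSC (1u << 4)   /* bit #4 of the feature flags in EDX */

/* Returns 1 if the feature flags returned in EDX by CPUID (eax=1)
 * advertise the Time Stamp Counter, 0 otherwise. */
static int cpuid_has_tsc(uint32_t feature_edx)
{
    return (feature_edx & CPUID_EDX_TSC) != 0;
}
```

Keeping the test separate from the CPUID call makes it easy to check on any machine. With GCC, the EDX value can be fetched in hosted code through __get_cpuid() from <cpuid.h>; a kernel would typically use its own inline assembly instead.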

Included from CpuIdWarning

Note that before using the CPUID instruction, you should also make sure the processor supports it by testing the 'ID' bit in eflags (this is 0x200000 and is modifiable only when the CPUID instruction is supported. On systems that don't support CPUID, writing a '1' at that place will have no effect).

In the case of a processor that does not support CPUID, you'll have to use more eflags-based tests to tell if you're running on a 486, 386, etc. and then pick up one of the 'calibrated loops' for that architecture (8086 through 80486 may have variable instruction timings).

Do you have code that works ?

There is a RealMode Intel-copyrighted example in the above-mentioned application note ... Here is another piece of code, submitted by DennisCGC, that gives the total measured frequency of a Pentium processor.

Some notes:

  • irq0_count is a variable that is incremented each time the timer interrupt is called.
  • in this code it's assumed that the PIT is programmed to 100 Hz (of course, I give the formula showing how to calculate it).
  • it's assumed that the CPUID instruction is supported.

AsmExample:

get_speed:
;first do a cpuid command, with eax=1
mov  eax,1
cpuid
test edx,0x10           ; test bit #4. Do we have TSC ?
jz   detect_end         ; no ?, go to detect_end
;wait until the timer interrupt has been called.
mov  ebx, [irq0_count]
wait_irq0:
cmp  ebx, [irq0_count]
jz   wait_irq0
rdtsc                   ; read time stamp counter
mov  [tscLoDword], eax
mov  [tscHiDword], edx
add  ebx, 2             ; Set time delay value ticks.
; remember: so far ebx = [irq0_count]-1, so the next tick is
; two steps ahead of the current ebx ;)
wait_for_elapsed_ticks:
cmp  ebx, [irq0_count] ; Have we hit the delay?
jnz  wait_for_elapsed_ticks
rdtsc
sub eax, [tscLoDword]  ; Calculate TSC
sbb edx, [tscHiDword]
; f(total_ticks_per_Second) =  (1 / total_ticks_per_Second) * 1,000,000
; This adjusts for MHz.
; so for this: f(100) = (1/100) * 1,000,000 = 10000
mov ebx, 10000
div ebx
; ax contains measured speed in MHz
mov [mhz], ax
detect_end:

See the Intel manual (see links) for more information.

-- Bug reports are welcome. IM to DennisCGC

Can I do it if I have no interrupt support (yet) ?

I'd be tempted to say 'yes', though I haven't tested it nor heard of it elsewhere so far. Here is the trick:
disable();         // disable interrupts (if not done already)
outb(0x43,0x34);   // PIT channel 0, low byte then high byte, mode 2 (counts down by 1 per tick)
outb(0x40,0);
outb(0x40,0);      // reload value 0 means 0x10000: the counter reads 0x10000 - n after n ticks
long stsc=CPU::readTimeStamp();
for (int i=0x1000;i>0;i--);   // busy loop (must not be optimized away by the compiler)
long etsc=CPU::readTimeStamp();
outb(0x43,0x04);   // latch the count of channel 0 (bits 7-6 = channel, bits 5-4 = 00 = latch)
byte lo=inb(0x40);
byte hi=inb(0x40);

Now, we know that

  1. at least ticks and no more than ticks+1 periods of 1/1193180 seconds have elapsed, where ticks=(0x10000 - (hi*256+lo)).
  2. etsc-stsc clock cycles have elapsed during the same time.

Thus (etsc-stsc)*1193180 / ticks should be your CPU speed in Hz ...
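
The final computation can be written out as a small C sketch. The function names are hypothetical, the PIT frequency is the 1193180 Hz figure used above, and 64-bit arithmetic is used so that the multiplication cannot overflow for realistic TSC differences.

```c
#include <assert.h>
#include <stdint.h>

#define PIT_INPUT_HZ 1193180u   /* value used in the text above */

/* Number of PIT ticks elapsed since the counter was loaded with 0
 * (meaning 0x10000), given the two bytes read back after latching.
 * Only meaningful if at least one tick has elapsed and the counter
 * has not wrapped around. */
static uint32_t pit_elapsed_ticks(uint8_t lo, uint8_t hi)
{
    uint32_t count = ((uint32_t)hi << 8) | lo;
    return 0x10000u - count;
}

/* CPU frequency in Hz from two TSC readings and the PIT ticks between them. */
static uint64_t cpu_hz_from_sample(uint64_t stsc, uint64_t etsc, uint32_t ticks)
{
    assert(ticks != 0);
    return (etsc - stsc) * PIT_INPUT_HZ / ticks;
}
```

With only a handful of ticks in the sample, the one-tick uncertainty mentioned in point 1 dominates the result, so a longer loop gives a more trustworthy figure.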

As far as I can tell, 0x1000 iterations lead to 10 PIT ticks on a 1GHz CPU and a bit less than 0x8000 ticks on the same CPU running BOCHS. This certainly means that on very high speed systems the discovered speed may not be accurate at all or, worse, that less than 1 tick could elapse ...

This technique is currently under evaluation in the forum

-- hope you like my technique /PypeClicker

Asking the SMBios for CPU speed

The SMBios (System Management BIOS) Specification addresses how motherboard and system vendors present management information about their products in a standard format by extending the BIOS interface on Intel architecture systems. The information is intended to allow generic instrumentation to deliver this information to management applications that use DMI, CIM or direct access, eliminating the need for error prone operations like probing system hardware for presence detection.

SMBios Processor Information

A Processor Information (type 4) structure describes features of the CPU as detected by the SMBios. The exact structure is depicted in section 3.3.5 (p 39) of the standard. Within that information you will find the processor type, family, manufacturer etc., but also

  • the External Clock (bus) frequency, which is a word at offset 0x12,
  • the Maximum CPU speed in MHz, which is a word at offset 0x14 (e.g. 0xe9 is a 233MHz processor),
  • the Current CPU speed in MHz, (word at offset 0x16).
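
A sketch of reading those three fields from a raw type 4 structure follows. The struct and function names are invented for the example; it relies on the SMBIOS structure header layout (byte 0 is the type, byte 1 is the length of the formatted area) and on the fields being little-endian words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct cpu_clocks {
    uint16_t external_mhz;  /* word at offset 0x12 */
    uint16_t max_mhz;       /* word at offset 0x14 */
    uint16_t current_mhz;   /* word at offset 0x16 */
};

static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Extract the clock fields from a Processor Information structure.
 * Returns 0 on success, -1 if `s` is not type 4 or is too short. */
static int smbios_cpu_clocks(const uint8_t *s, struct cpu_clocks *out)
{
    assert(s != NULL && out != NULL);
    if (s[0] != 4 || s[1] < 0x18)   /* must reach the end of the word at 0x16 */
        return -1;
    out->external_mhz = read_le16(s + 0x12);
    out->max_mhz      = read_le16(s + 0x14);
    out->current_mhz  = read_le16(s + 0x16);
    return 0;
}
```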

How do I get that structure ?

SMBios provides a Get SMBIOS Information function that tells you how many structures exist. You can then use the Get SMBIOS Structure function to read the processor information.

As an alternative, you can locate the SMBIOS Entry Point and then traverse the SMBIOS structure table manually, looking for type 4.

All this is depicted in the 'Accessing SMBIOS Information' section of the standard (p 11).

The SMBIOS Entry Point structure, described below, can be located by application software by searching for the anchor-string on paragraph (16-byte) boundaries within the physical memory address range 000F0000h to 000FFFFFh. This entry point encapsulates an intermediate anchor string that is used by some existing DMI browsers.

00-03 Anchor String (_SM_ or 5f 53 4d 5f)
04 Checksum
05 Length
06 major version
07 minor version
08-09 max structure size
0A entry point revision
0B-0F formatted area
10-14 _DMI_ signature
15 intermediate checksum
16-17 structure table length
18-1B structure table (physical) address
1C-1D number of SMBIOS structures
1E SMBIOS revision (BCD)
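
The search for the entry point can be sketched as below. This is an illustration only: the function name is hypothetical, and `buf` is assumed to be a copy or mapping of the physical range 000F0000h to 000FFFFFh. It checks the anchor on 16-byte boundaries and then verifies that all bytes of the entry point (as many as the Length field says) sum to zero modulo 256, which is how the Checksum field is defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns the offset of the SMBIOS entry point inside buf, or -1 if none. */
static long smbios_find_entry(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t off = 0; off + 0x1F <= len; off += 16) {
        const uint8_t *p = buf + off;
        if (memcmp(p, "_SM_", 4) != 0)
            continue;
        uint8_t eps_len = p[5];              /* Length field at offset 05 */
        if (eps_len < 6 || off + eps_len > len)
            continue;
        uint8_t sum = 0;
        for (size_t i = 0; i < eps_len; i++)
            sum = (uint8_t)(sum + p[i]);
        if (sum == 0)                        /* checksum must cancel out */
            return (long)off;
    }
    return -1;
}
```

Once the entry point is found, the structure table address at offset 18h and the structure count at offset 1Ch tell you where to start looking for the type 4 structure.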

I don't feel like re-explaining the PnP calling convention etc. as chances are it will be useless in ProtectedMode ...

-- Thanks to DasCandy for bringing this information to my knowledge ;)


Categories: HowTo, HardWareCpu


Links

Related threads in the forum:

Forum:5849 Forum:767 Forum:922 Forum:8949 featuring info on bogomips, how linux does it and durand's code.

Other resources

ftp://download.intel.com/support/processors/procid/

especially section 12: "Operating Frequency" on page 29 of 24161815.pdf

Searching for SMBIOS should give you info on that too, it contains entries about the CPU, including current speed.

Tell me about x86 64 bits CPU ...

Included from Tell me about x86 64 bits CPU ...

This page tries to clarify a few things about x86-64 CPUs (AMD64 and Intel's equivalent EM64T implementation). IA-64 (Itanium) is really a different beast and is not addressed here.

Features

What does Long Mode offer ?

Long mode extends the general registers to 64 bits (RAX, RBX, RIP, RSP, RFLAGS, etc), and adds 8 more integer registers (R8, R9, ..., R15) plus 8 more SSE registers (XMM8 to XMM15) to the CPU. Linear addresses are extended to 64 bits (however, a given CPU may implement less than this) and the physical address space is extended to 52 bits (a given CPU may implement less than this). In essence, long mode adds another mode to the CPU:
  • Real mode
  • Legacy mode (32 bit protected mode)
  • Long mode (64 bit protected mode)
  • System Management mode

Long mode does not support hardware task switching or virtual 8086 tasks, and most of the segment register details are ignored (a flat memory model is required). In long mode the current CS determines whether the code currently running is 64 bit code (true long mode), 32 bit code (compatibility mode), or even 16-bit protected mode code (still in compatibility mode).

The first 64 bit CPUs from both Intel and AMD will support 40 bit physical addresses and 48 bit linear addresses.

Setting up ...

How do I detect if the CPU is 64 bits ?

You can find that out by checking CPUID. All AMD64 compliant processors have the long-mode-capable bit turned on in the extended feature flags (bit 29) in EDX, after calling CPUID with EAX=0x80000001. There are also other bits required by long mode, but you can look those up yourself in the CPUID description in AMD's general purpose instruction reference.
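
A sketch of that check in C is given below. The names are hypothetical, and the caller is assumed to supply two values obtained beforehand: the EAX result of CPUID with EAX=0x80000000 (the highest extended leaf supported) and the EDX result of CPUID with EAX=0x80000001. Checking the first value guards against CPUs that do not implement the extended leaves at all.

```c
#include <stdint.h>

#define CPUID_EXT_FEATURES 0x80000001u   /* extended feature flags leaf */
#define CPUID_EXT_EDX_LM   (1u << 29)    /* long mode capable bit in EDX */

/* max_ext_leaf : EAX returned by CPUID with EAX=0x80000000
 * ext_edx      : EDX returned by CPUID with EAX=0x80000001
 * Returns 1 if the CPU reports long mode support, 0 otherwise. */
static int cpu_has_long_mode(uint32_t max_ext_leaf, uint32_t ext_edx)
{
    if (max_ext_leaf < CPUID_EXT_FEATURES)
        return 0;   /* leaf 0x80000001 is not available on this CPU */
    return (ext_edx & CPUID_EXT_EDX_LM) != 0;
}
```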

How do i enable Long Mode ?

The steps for enabling long mode are
  • Disable paging
  • Set the PAE enable bit in CR4
  • Load CR3 with the physical address of the PML4
  • Enable long mode by setting the EFER.LME flag in MSR 0xC0000080
  • Enable paging

Now the CPU will be in compatibility mode, and instructions are still 32-bit. To enter 64-bit mode, CS must be loaded (e.g. with a far jump) with a GDT code segment descriptor whose D/B bit (bit 22, 2nd dword) is clear (as it would be for a 16-bit code segment) and whose L bit (bit 21, 2nd dword) is set. Once that is done, the CPU is in 64-bit long mode.
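
The bits involved in the steps above can be collected in a few C definitions. This is a sketch, not a complete mode switch: the macro and function names are invented, and the actual CR0/CR4/MSR writes and the far jump still need assembly. The two helpers only deal with the second dword of a code segment descriptor.

```c
#include <stdint.h>

#define CR4_PAE    (1u << 5)      /* physical address extension enable */
#define MSR_EFER   0xC0000080u    /* extended feature enable register */
#define EFER_LME   (1u << 8)      /* long mode enable */
#define CR0_PG     (1u << 31)     /* paging enable */

#define DESC_HI_L  (1u << 21)     /* L bit, 2nd dword of the descriptor */
#define DESC_HI_DB (1u << 22)     /* D/B bit, 2nd dword of the descriptor */

/* Turn the 2nd dword of a 32-bit code descriptor into a 64-bit one:
 * set L, clear D/B, leave everything else untouched. */
static uint32_t make_64bit_code_descriptor_hi(uint32_t hi32)
{
    return (hi32 | DESC_HI_L) & ~DESC_HI_DB;
}

/* Returns 1 if the 2nd dword describes a 64-bit code segment. */
static int is_64bit_code_descriptor(uint32_t hi)
{
    return (hi & DESC_HI_L) && !(hi & DESC_HI_DB);
}
```

For example, the usual flat 32-bit code descriptor has 0x00CF9A00 as its second dword; after conversion it becomes 0x00AF9A00.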

Are there restrictions on 32-bit code running in Legacy Mode ?

x86-64 processors can operate in a legacy mode: they still start in real mode, and protected mode is still available (along with the associated v8086 mode). This means an x86 operating system, even DOS, will still run just fine. The only difference is that physical addresses can be up to 52 bits (or as many bits as implemented by the CPU) when PAE is used.

However, there is nothing like Virtual8086 Mode (16 bits support) once in long/compatibility mode.

Can i enable Long Mode directly ?

Protected mode must be entered before activating long mode. A minimal protected-mode environment must be established to allow long-mode initialization to take place. This environment must include the following:

  • A protected-mode IDT for vectoring interrupts and exceptions to the appropriate handlers while in protected mode.
  • The protected-mode interrupt and exception handlers referenced by the IDT.
  • Gate descriptors for each handler must be loaded in the IDT.

    --AMD64 docs, volume 2, section 14.4 (Enabling Protected Mode), 24593 Rev. 3.10 February 2005

That being said, we have a thread where Brendan shows how you can enable 64-bit long mode with no 32-bit IDT and no 32-bit segments ... Be warned, however, that any paging-related exception that occurs in long mode before you set up a 64-bit IDT will cause the processor to reset due to a triple fault ...

64bit Environment Models

There are three 64bit programming models you need to consider: LP64, ILP64 and LLP64. Each model has its own pitfalls. The I/L/P stand for Int, Long and Pointer, and the 64 is the number of bits in each of the named types.

Thus LP64 means longs and pointers are 64 bits wide. LL is a special case and means long long...

DataTypes

This table lists the breakdown of sizes in the various programming models.

 Datatype   LP64   ILP64   LLP64   ILP32   LP32 
 char   8   8   8   8   8 
 short   16   16   16   16   16 
 _int   32   --   32   --   -- 
 int   32   64   32   32   16 
 long   64   64   32   32   32 
 long long   --   --   64   --   -- 
 pointer   64   64   64   32   32 
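
To see which of these models a given compiler uses, the int, long and pointer rows of the table can be turned into a small lookup. This is a sketch with made-up function names; it only distinguishes the five models listed above and reports anything else as "unknown".

```c
#include <limits.h>
#include <string.h>

/* Name the programming model for the given widths (in bits). */
static const char *data_model(unsigned int_bits, unsigned long_bits,
                              unsigned ptr_bits)
{
    if (ptr_bits == 64) {
        if (int_bits == 64 && long_bits == 64) return "ILP64";
        if (int_bits == 32 && long_bits == 64) return "LP64";
        if (int_bits == 32 && long_bits == 32) return "LLP64";
    } else if (ptr_bits == 32) {
        if (int_bits == 32 && long_bits == 32) return "ILP32";
        if (int_bits == 16 && long_bits == 32) return "LP32";
    }
    return "unknown";
}

/* Model of the compiler that builds this file. */
static const char *this_compiler_model(void)
{
    return data_model((unsigned)(sizeof(int) * CHAR_BIT),
                      (unsigned)(sizeof(long) * CHAR_BIT),
                      (unsigned)(sizeof(void *) * CHAR_BIT));
}
```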

64bit OS Modes

The following table lists what some current 64bit OS have as a programming model.

 OS   Mode 
 Windows XP-64 / IA64   LLP64 
 Linux   LP64 
 Solaris   LP64 
 DEC OSF/1 Alpha   LP64 
 SGI Irix   LP64 
 HP UX 11   LP64 

Categories: CollectedKnowledge, HardWareCpu


Learn More

Protected Mode (glossary)

Included from ProtectedMode

Glossary -- ProtectedMode

Protected mode is the 32 bit 'native' operating mode of Intel processors (and clones) since the 80386. It allows the developer to work with several virtual address spaces, each of which has 4GB of addressable memory and allows the system to enforce strict memory protection as well as restricting the available instruction set (so that your application cannot control the hard disk directly while the kernel can ;)

Protected mode unleashes the real power of your CPU, so you had better get informed about it if you are considering writing an OS. However, it will prevent you from using virtually any of the BIOS interrupts (unless you have a V86 monitor).

Whether the CPU is in RealMode or in protected mode is defined by the lowest bit of the CR0 register, so basically

    ;; make sure interrupts are disabled, etc.
    mov eax, cr0
    or al,1
    mov cr0,eax

takes you to protected mode ... however you'll discover that there are many other things to be done before and after that operation to switch gracefully to pmode rather than resetting the CPU...

Plenty of information about protected mode can be found on both OSRC and Bona Fide, including in-detail tutorials and realmode/pmode switch programs. Our BabyStep series could also help you a bit :)


You may like http://home.swipnet.se/smaffy/asm/info/embedded_pmode.pdf if you're looking for a pragmatic tutorial on pmode.

Real Mode (glossary)

Included from RealMode

Glossary -- RealMode: the 16-bit operating mode in which the x86 CPU runs when it boots. That mode exists mainly for backward compatibility and provides very little help for the modern developer (no memory protection, only 1MB of addressable memory, no virtual memory support). BIOS and DOS are typically real-mode stuff. All the rest you know (windows, linux, DukeNukem3D, zsnes, dos4gw, djgpp ...) are ProtectedMode OS/applications/dosextenders respectively.

Additional information about address formation in RealMode can be found in Perica's tutorial on Bona Fide.
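
As a quick illustration of real-mode address formation (segment*16+offset), here is a minimal C sketch; the function name is hypothetical.

```c
#include <stdint.h>

/* Linear address produced by a real-mode segment:offset pair.
 * The result can exceed 1MB by almost 64KB; whether those addresses
 * wrap around to zero depends on the state of the A20 line. */
static uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

For example, 07C0:0000 and 0000:7C00 both designate linear address 0x7C00, which shows that several segment:offset pairs can name the same byte.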

Unreal Mode (glossary)

Included from UnrealMode

What is unreal mode ? (for the Glossary)

Basically, unreal mode consists of breaking the '64KB' limit of real mode segments while still keeping 16-bit instructions and segment*16+offset address formation. You can find much more about it in OSRC.

When should I use unreal mode ?

Unreal mode is recommended in the two following cases:

  1. you're trying to extend a legacy 16-bit DOS program so that it can deal with larger data, and neither vm86 nor XMS is suitable for your needs
  2. you're trying to load something that will run in 32-bit mode and which is larger than 640KB (so you cannot load it in conventional memory), you don't want to bother with a disk driver called from pmode yet, and you do not wish to switch between real and protected mode to copy chunks from a conventional memory buffer to the high memory areas ...

Of course, unreal mode is kinda useless as long as you don't have the A20Line enabled.

How do i set up unreal mode ?

See BabyStep7 :)


related threads:

PowerPC (a step in non-Intel world)

Included from PowerPC

The PowerPC CPU architecture is significantly different from the IA32. Yet still, the architecture of your OS need not differ too much: While the way you address memo