BabyStep is a series of tutorials written by CrazyBuddah in the forum. You can browse them by going to the .:QuickLinkz:. forum thread.

They are intended for novice programmers who need help writing simple bootloaders, and range from a simple 'hello world' boot sector to UnrealMode and ProtectedMode switch & display.

Step 1: your first boot sector


The following code is the smallest possible example of booting code from a floppy. It is assembled in NASM and copied to floppy using Partcopy. Then you boot from the floppy.

; nasmw boot.asm -f bin -o boot.bin
; partcopy boot.bin 0 200 -f0

hang:
   jmp hang

   times 512-($-$$) db 0

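The CPU starts in real mode and the BIOS loads this code at 0000:7c00. The "times 512-($-$$) db 0" line is NASM's way of saying "pad the rest of the 512-byte sector with zeros": $ is the current position and $$ is the start of the section, so $-$$ is the number of bytes emitted so far. Partcopy is going to expect exactly that size (200 in hex = 512). Change it, and you'll see Partcopy choke.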

Often, you will see a so-called boot signature (0xAA55) at the end. Many BIOSes look for this in order to identify a boot sector on a disk, and refuse to boot a sector without it. It was evidently unnecessary on my machine, but it is safest to include it. If it's needed, the last line would be replaced with the following (or some version of it):

   times 510-($-$$) db 0  ;2 bytes less now
   db 0x55
   db 0xAA
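
One such version, as a minimal sketch: DW stores a 16-bit value low byte first, so a single dw 0xAA55 should put the same two bytes on disk as the db pair above. The file name in the comment is made up for illustration.

```nasm
; nasm sig.asm -f bin -o sig.bin   (sig.asm is a hypothetical file name)

hang:
   jmp hang

   times 510-($-$$) db 0   ; pad with zeros up to offset 510
   dw 0xAA55               ; little endian: 0x55 lands at offset 510, 0xAA at 511
```

Either spelling gives a 512-byte file; which one you use is a matter of taste.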

But the thing I'd really like to point out is how once you've booted, and the cursor is happily blinking on a blank screen, you might notice two things. One is that the floppy motor will turn off and the other is that you can press Ctrl-Alt-Del to reboot. The point is that interrupts (such as int 0x09) are still being generated.

For kicks, try clearing the interrupt flag:

   cli
hang:
   jmp hang

   times 510-($-$$) db 0
   db 0x55
   db 0xAA

You may notice that the floppy motor doesn't turn off and you can't reboot with Ctrl-Alt-Del.

If you try to reduce this even more by removing the loop and merely pad out the sector with zeros, the BIOS will have something to say about it. On my machine, it was "Operating System Not Found". I have yet to try filling the sector with zeros except for adding a boot signature.

Not exactly something you would show your girlfriend, but I wanted to show just what the bare minimum is before I elaborate. Unless I'm irritating anyone, in which case I'll desist.



Step 2: writing a message using the BIOS


Quick review:

  1. boot sector loaded by BIOS is 512 bytes
  2. the code in the boot sector of the disk is loaded by the BIOS at 0000:7c00
  3. machine starts in real mode
  4. be aware that the CPU is being interrupted unless you CLI

Many (but not all) BIOS interrupts expect DS to be filled with a real-mode segment value. This is one reason why BIOS interrupts won't work in protected mode. The print loop below fetches each character with LODSB, which reads from DS:SI, so if you want to use int 10h/ah=0eh to print a string to the screen, you need to make sure that the seg:offset of the characters to print is correct.

It doesn't matter if you use 0000:7c00 or 07c0:0000, but if you use ORG, you need to be aware of what's happening.
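
To see why the two spellings are interchangeable, here is a small assemble-time sketch. In real mode the address is segment times 16 plus offset, so both pairs should name the same byte. LINEAR is a helper macro made up for this example, not a NASM built-in, and the file name is hypothetical.

```nasm
; nasm linear.asm -f bin -o linear.bin   (hypothetical file name)

; LINEAR is an illustrative helper: real-mode address = segment * 16 + offset
%define LINEAR(seg, off) (((seg) << 4) + (off))

; assembly stops here if the two pairs ever disagree
%if LINEAR(0x0000, 0x7c00) != LINEAR(0x07c0, 0x0000)
  %error "the two seg:offset pairs name different bytes"
%endif

   dd LINEAR(0x07c0, 0x0000)   ; emits 00 7C 00 00, i.e. 0x00007c00
```

What changes between the two is what your labels mean. Without ORG, labels count from 0, so DS has to be 0x07c0. With ORG 0x7c00, labels already include the 0x7c00, so DS has to be 0.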

AsmExample:

   mov ax, 0x07c0
   mov ds, ax

   mov si, msg
ch_loop:lodsb
   or al,al ;zero=end of str
   jz hang   ;get out
   mov ah,0x0E
   int 0x10
   jmp ch_loop

hang:
   jmp hang

msg   db 'Welcome to Macintosh', 13, 10, 0
   times 510-($-$$) db 0
   db 0x55
   db 0xAA

Here's the ORG version. Note that you still need to tell DS what to be

[ORG 0x7c00]

   xor ax, ax  ;make it zero
   mov ds, ax

   mov si, msg
ch_loop:lodsb
   or al,al ;zero=end of str
   jz hang   ;get out
   mov ah,0x0E
   int 0x10
   jmp ch_loop

hang:
   jmp hang

msg   db 'Welcome to Macintosh', 13, 10, 0

   times 510-($-$$) db 0
   db 0x55
   db 0xAA

New topic: Typically 'procedures' are separated from the main code using CALL/RET, like the following:

[ORG 0x7c00]
   xor ax, ax  ;make it zero
   mov ds, ax

   mov si, msg
   call bios_print

hang:
   jmp hang



msg   db 'Welcome to Macintosh', 13, 10, 0


bios_print:
   lodsb
   or al,al ;zero=end of str
   jz done   ;get out
   mov ah,0x0E
   int 0x10
   jmp bios_print
done:
   ret

   times 510-($-$$) db 0
   db 0x55
   db 0xAA

For some inexplicable reason, loading SI then jumping to the procedure always bugged me. Fortunately for psychos like me, NASM's macros let you pretend that you are passing a parameter (the macro definition has to come before the place where it is used). The labels inside the macro use the %% prefix, which makes them local to each expansion, so the macro can be used more than once without "label redefined" errors.

%macro BiosPrint 1
   mov si, word %1
%%ch_loop:
   lodsb
   or al,al ;zero=end of str
   jz %%done   ;get out
   mov ah,0x0E
   int 0x10
   jmp %%ch_loop
%%done:
%endmacro

[ORG 0x7c00]

   xor ax, ax  ;make it zero
   mov ds, ax

   BiosPrint msg

hang:
   jmp hang

msg   db 'Welcome to Macintosh', 13, 10, 0

   times 510-($-$$) db 0
   db 0x55
   db 0xAA
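
If you want to call the macro more than once, the labels inside it have to be unique per expansion. A minimal sketch, assuming NASM's %% macro-local label prefix; the macro name BiosPrint2 and the second message are made up for this example.

```nasm
; nasm twice.asm -f bin -o twice.bin   (hypothetical file name)

%macro BiosPrint2 1        ; hypothetical name, same idea as BiosPrint
   mov si, word %1
%%next:                    ; %% makes the label unique in every expansion
   lodsb
   or al, al               ; zero = end of string
   jz %%done
   mov ah, 0x0E            ; BIOS teletype output
   int 0x10
   jmp %%next
%%done:
%endmacro

[ORG 0x7c00]

   xor ax, ax
   mov ds, ax              ; DS=0 to match ORG 0x7c00
   cld                     ; make LODSB walk forward

   BiosPrint2 msg1
   BiosPrint2 msg2         ; second use, no label clash

hang:
   jmp hang

msg1   db 'Welcome to Macintosh', 13, 10, 0
msg2   db 'Just kidding', 13, 10, 0

   times 510-($-$$) db 0
   db 0x55
   db 0xAA
```

The trade-off is size: every use of the macro pastes the whole loop again, whereas CALL/RET keeps a single copy.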

And in case your code is becoming long and unreadable, you can break it up into different files, then include the files at the beginning of your main code. Like so:

   jmp main

   %include "othercode.inc"

main:
   ;... rest of your code here


Step 3: a look at machine code (opcodes, prefix, etc)


Opcodes in machine language (AsmExample)

; nasmw encode.asm -f bin -o encode.bin

   mov cx, 0xFF
   times 510-($-$$) db 0
   db 0x55
   db 0xAA

Don't Partcopy this one to disk. Just open it in DEBUG (on MS systems; hexdump works nicely for Linux users):

C:\osdev>debug encode.bin

Type in 'd' after the '-' to see the binary file. ('?' will give you help; 'q' will quit). You will see something like this:

0AE3:0100 B9 FF 00 00 00 00 etc...

Look up the opcode for MOV here: http://www.baldwin.cx/386htm/MOV.htm

See Section "17.2.2.1 Opcode" here: http://www.baldwin.cx/386htm/s17_02.htm

In other words, there is a unique register number (CX=1) added to the base opcode value 'B8' to give 'B9', which you see in the dump.
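
The same pattern holds for the other 16-bit general registers. A short sketch you can assemble and dump the same way (the file name is made up); the bytes in the comments are what the "B8 plus register number" rule predicts.

```nasm
; nasm regs.asm -f bin -o regs.bin   (hypothetical file name)
; 16-bit is NASM's default for flat binary output

   mov ax, 0xFF   ; B8 FF 00   (AX = 0)
   mov cx, 0xFF   ; B9 FF 00   (CX = 1)
   mov dx, 0xFF   ; BA FF 00   (DX = 2)
   mov bx, 0xFF   ; BB FF 00   (BX = 3)
   mov sp, 0xFF   ; BC FF 00   (SP = 4)
   mov bp, 0xFF   ; BD FF 00   (BP = 5)
   mov si, 0xFF   ; BE FF 00   (SI = 6)
   mov di, 0xFF   ; BF FF 00   (DI = 7)
```

Note that the register numbers do not follow alphabetical order: BX is 3, not 1.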

But watch what happens when you replace CX with ECX:

   mov ecx, 0xFF
   times 510-($-$$) db 0
   db 0x55
   db 0xAA

0AE3:0100 66 B9 FF 00 00 00 00 etc...

The '66' is an Operand Size Override Prefix, which the assembler generates when the size of the operand differs from the default for the current mode. When NASM assembles flat binary files, that default is 16-bit. The same thing happens if you use the BITS directive to change the default and then use an operand of the other size:

[BITS 32]
   mov cx, 0xFF
   times 510-($-$$) db 0
   db 0x55
   db 0xAA

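The BITS directive doesn't actually change the mode of the processor. It only tells the assembler which mode the code is going to run in, so that it knows when a prefix is needed. The prefix byte, in turn, tells the processor how to interpret the bytes that follow it.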
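
All four combinations side by side, as a sketch to assemble and dump (the file name is made up). The prefix shows up exactly when the operand size and the BITS setting disagree.

```nasm
; nasm prefix.asm -f bin -o prefix.bin   (hypothetical file name)

[BITS 16]
   mov cx,  0xFF   ; B9 FF 00            sizes agree, no prefix
   mov ecx, 0xFF   ; 66 B9 FF 00 00 00   32-bit operand in 16-bit code

[BITS 32]
   mov cx,  0xFF   ; 66 B9 FF 00         16-bit operand in 32-bit code
   mov ecx, 0xFF   ; B9 FF 00 00 00      sizes agree, no prefix
```

Notice that "B9 FF 00 ..." means two different instructions depending on the mode the processor is really in, which is why code assembled for the wrong mode goes wrong so quickly.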

Addresses

Address encoding is a bit more complicated

   mov cx, [temp]

temp db 0x99
   times 510-($-$$) db 0
   db 0x55
   db 0xAA

0AE3:0100 8B 0E 04 00 99 00 00 00 etc...

  • '8B' is the opcode
  • '0E' is a ModR/M byte which helps the opcode interpretation

See Section "17.2.1 ModR/M and SIB Bytes" here: http://www.baldwin.cx/386htm/s17_02.htm

The rules for interpreting this byte, which contains several fields (see Fig. 17-2), are a bit involved, but fortunately Table 17-2 makes it easier. Look up '0E' and you will see at the left it says "disp16", which means that the operand is addressed by a 16-bit offset that follows the ModR/M byte.

'04 00' is the 16-bit offset. If you are confused why 0x0004 is backwards, it's because the Intel processor is "little endian". The "little" end of the number comes first.

'99' is of course the value of the byte at 0x0004 (8B is at 0x0000)

Be aware of another prefix called the Address size Override Prefix '67' which the assembler generates when there is a discrepancy just like with '66' above.
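
Here is a sketch that builds the same instruction by hand from the ModR/M fields, and then forces a '67' prefix by using a 32-bit register as the address. The field values come from the tables referenced above; the file name is made up.

```nasm
; nasm modrm.asm -f bin -o modrm.bin   (hypothetical file name)

   mov cx, [temp]               ; assembler's version: 8B 0E 0B 00

   db 0x8B                      ; same instruction by hand: opcode
   db (0 << 6) | (1 << 3) | 6   ; ModR/M: mod=00, reg=001 (CX), r/m=110 (disp16) = 0x0E
   dw temp                      ; the 16-bit offset, low byte first

   mov cx, [ebx]                ; 67 8B 0B: 32-bit addressing in 16-bit code

temp db 0x99                    ; ends up at offset 0x000B in this file
```

In the dump, the first two instructions should be byte-for-byte identical.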

This stuff matters for a bunch of reasons, but since we will be making the switch from 16-bit real mode to 32-bit protected mode, our code is going to also change. And being aware of what a dump looks like can prevent a lot of grief.



Step 4: writing to video memory


I know this is starting to look like a half-baked tutorial in assembly, but there's actually a reason behind my madness. Namely, solving as many problems as possible before switching to pmode etc. will lessen the confusion a great deal.

This example prints a string and the contents of a memory location (which is the first letter of the string in video memory). It is meant to demonstrate printing to screen in text mode without using BIOS, as well as converting hex so it can be displayed -- so we can check register and memory values.

I added a stack, but didn't end up using it. However, I just left it in because it will probably get used soon.

;=====================================
; nasmw boot.asm -f bin -o boot.bin
; partcopy boot.bin 0 200 -f0

[ORG 0x7c00]      ; add to offsets
   xor ax, ax    ; make it zero
   mov ds, ax   ; DS=0
   mov ss, ax   ; stack starts at 0
   mov sp, 0x9c00   ; 2000h past code start

   mov ax, 0xb800   ; text video memory
   mov es, ax

   mov si, msg   ; show text string
   call sprint

   mov ax, 0xb800   ; look at video mem
   mov gs, ax
   mov bx, 0x0000   ; 'W'=57 attrib=0F
   mov ax, [gs:bx]

   mov  word [reg16], ax ;look at register
   call printreg16

hang:
   jmp hang

;----------------------
dochar:   call cprint         ; print one character
sprint:   lodsb      ; string char to AL
   cmp al, 0
   jne dochar   ; else, we're done
   add byte [ypos], 1   ;down one row
   mov byte [xpos], 0   ;back to left
   ret

cprint:   mov ah, 0x0F   ; attrib = white on black
   mov cx, ax    ; save char/attribute
   movzx ax, byte [ypos]
   mov dx, 160   ; 2 bytes (char/attrib)
   mul dx      ; for 80 columns
   movzx bx, byte [xpos]
   shl bx, 1    ; times 2 to skip attrib

   mov di, 0        ; start of video memory
   add di, ax      ; add y offset
   add di, bx      ; add x offset

   mov ax, cx        ; restore char/attribute
   stosw              ; write char/attribute
   add byte [xpos], 1  ; advance to right

   ret

;------------------------------------

printreg16:
   mov di, outstr16
   mov ax, [reg16]
   mov si, hexstr
   mov cx, 4   ;four places
hexloop:
   rol ax, 4   ;leftmost will
   mov bx, ax   ; become
   and bx, 0x0f   ; rightmost
   mov bl, [si + bx];index into hexstr
   mov [di], bl
   inc di
   dec cx
   jnz hexloop

   mov si, outstr16
   call sprint

   ret

;------------------------------------

xpos   db 0
ypos   db 0
hexstr   db '0123456789ABCDEF'
outstr16   db '0000', 0  ;register value string
reg16   dw    0  ; pass values to printreg16
msg   db "What are you doing, Dave?", 0
times 510-($-$$) db 0
db 0x55
db 0xAA
;==================================
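
The offset arithmetic that cprint does at run time can also be done at assembly time when the position is fixed. A minimal sketch, assuming the BIOS left the screen in the usual 80x25 color text mode; VIDOFF and COLS are names made up for this example.

```nasm
; nasm corners.asm -f bin -o corners.bin   (hypothetical file name)

[ORG 0x7c00]

%define COLS 80
; illustrative helper: byte offset of column x, row y (2 bytes per cell)
%define VIDOFF(x, y) ((((y) * COLS) + (x)) * 2)

   mov ax, 0xb800                         ; text video memory
   mov es, ax

   mov word [es:VIDOFF(0, 0)], 0x0F41     ; 'A', white on black, top left
   mov word [es:VIDOFF(79, 24)], 0x0F5A   ; 'Z', bottom right (offset 3998)

hang:
   jmp hang

   times 510-($-$$) db 0
   db 0x55
   db 0xAA
```

Each cell is a word: the character in the low byte and the attribute in the high byte, which is why 0x0F41 shows up as a white 'A'.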


Step 5: the IVT - interrupts in the good old days.


This code is meant to show how the hardware interrupt generated when you press a key can be handled by replacing the seg:offset specified in the IVT (interrupt vector table). This normally points to a BIOS routine. To find the entry in the IVT, multiply the interrupt number by 4 (which is the size of each entry).

This key handler just displays the scan code without conversion to ASCII, buffering, or handling extended keys. The reason for doing this is to not muddle up the basic idea, which is to provide input, as well as output, in its most simple form.

I will not go into the hows and whys of reading the ports involved in a key press. Suffice it to say that you are communicating with actual chips (or parts of chips), not some software intermediary. I personally feel it is good to remember that, no matter what level of abstraction you work at, you are ultimately telling hardware what to do.

I will point out that the sequence for turning the keyboard off and back on through port 0x61 is given in its complete form, some of which might not be needed, depending on the system.

;==========================================
; nasmw boot.asm -f bin -o boot.bin
; partcopy boot.bin 0 200 -f0

[ORG 0x7c00]      ; add to offsets
   jmp start

   %include "print.inc"

start:   xor ax, ax   ; make it zero
   mov ds, ax   ; DS=0
   mov ss, ax   ; stack starts at 0
   mov sp, 0x9c00   ; 2000h past code start

   mov ax, 0xb800   ; text video memory
   mov es, ax

   cli      ;no interruptions
   mov bx, 0x09   ;hardware interrupt #
   shl bx, 2   ;multiply by 4
   xor ax, ax
   mov gs, ax   ;start of memory
   mov [gs:bx], word keyhandler
   mov [gs:bx+2], ds ; segment
   sti

   jmp $      ; loop forever

keyhandler:
   in al, 0x60   ; get key data
   mov bl, al   ; save it
   mov byte [port60], al

   in al, 0x61   ; keybrd control
   mov ah, al
   or al, 0x80   ; set bit 7 (disable kbd)
   out 0x61, al   ; send it back
   xchg ah, al   ; get original
   out 0x61, al   ; send that back

   mov al, 0x20   ; End of Interrupt
   out 0x20, al   ;

   and bl, 0x80   ; key released
   jnz done   ; don't repeat

   mov ax, [port60]
   mov  word [reg16], ax
   call printreg16

done:
   iret

port60   dw 0

   times 510-($-$$) db 0  ; fill sector w/ 0's
   dw 0xAA55        ; req'd by some BIOSes
;==========================================
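
As a small illustration of the IVT lookup, each entry is two words, offset first and then segment, so the keyboard entry (interrupt 0x09) sits at 0000:0024. The sketch below only reads the existing entry, which is what you would do first if you wanted to restore the BIOS handler or chain to it later. IVT_OFFSET, old_off and old_seg are names made up for this example.

```nasm
; nasm readivt.asm -f bin -o readivt.bin   (hypothetical file name)

[ORG 0x7c00]

; illustrative helper: each IVT entry is 4 bytes
%define IVT_OFFSET(n) ((n) * 4)

start:
   xor ax, ax
   mov ds, ax                           ; DS=0 for our own variables
   mov es, ax                           ; the IVT starts at 0000:0000

   cli                                  ; keep the entry stable while we copy it
   mov ax, [es:IVT_OFFSET(0x09)]        ; word 0: handler offset
   mov [old_off], ax
   mov ax, [es:IVT_OFFSET(0x09) + 2]    ; word 1: handler segment
   mov [old_seg], ax
   sti

   jmp $                                ; loop forever

old_off   dw 0
old_seg   dw 0

   times 510-($-$$) db 0
   dw 0xAA55
```

With the old seg:offset saved, a replacement handler could later jump to it with a far jump instead of handling the key itself.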

*** hardware fun http://chip.ms.mff.cuni.cz/pcguts/

*** Intel's Summer Reading List http://developer.intel.com/vtune/cbts/refman.htm

*** John Fine links to hardware programming http://www.geocities.com/SiliconValley/Peaks/8600/device.html



Step 6: Understanding segmentation


Baby Step VI : descriptors

Actually entering ProtectedMode is simply switching a single bit in a special control register (cr0). (All the other stuff, like A20Line, tasks, IDT, call gates, etc. is additional stuff.)

However, before switching to pmode, you have to use the LGDT instruction to load another special register (gdtr) with the location of a table of data structures called descriptors that tell the processor how to access memory.

We're arguing about whether GDT could be set up after switching to pmode in this thread
--PypeClicker

Overview of bytes in the descriptor:

  +0 +1 +2 +3  +4 +5 +6 +7
  l0 l1 b0 b1  b2 TT Fl b3

Descriptor bytes arranged from lowest memory location to highest:

0 0x00 lowest byte of Limit
1 0x00 next byte of Limit
2 0x00 lowest byte of Base Addr
3 0x00 next byte of Base Addr
4 0x00 third byte of Base Addr
5 0x00 = (bits) 0 - 00 - 0 - 0000 = P - DPL - S - Type
6 0x00 = (bits) 0 - 0 - 0 - 0 - 0000 = G - D/B - 0 - AVL - Size
7 0x00 fourth and highest byte of Base Addr

Bits in Type (byte#5)

"P"
Present (1 bit) = 1 means segment is in memory (accessing a non-present segment will raise an exception)
"DPL"
Descriptor Privilege Level (2 bits) = 0 is most privileged and 3 is least.
"S"
System (1 bit) = must be 0 in descriptors for Task State Segments (TSS), Interrupt Gate, Trap Gate, Task Gate, Call Gates. Otherwise, for code/data/stack segment descriptors, it will be 1.
"Type"
Type (4 bits) = interpretation of these depends on whether S (above) is set or not. For S=0, the interpretation will be covered in specific instances of gates etc.
Type bit 3
If S=1, then if high bit is 1, it's a code segment, otherwise it's a data segment.
Type bit 2
The meaning of the next highest bit depends on the highest bit. If it's a code segment, this bit indicates whether the segment is 'Conforming' or not. A conforming code segment can be called by LESS privileged code, and it then runs at (conforms to) the privilege level of the calling program. If it's a data segment, this bit specifies "Expand (up or down)". Expand-up (bit=0) is the normal behavior, where valid offsets run from 0 up to the limit. Expand-down (bit=1) turns this around, so that valid offsets lie above the limit; it is meant for stack segments that may need to be resized.
Type bit 1
The subsequent bit specifies permission to Read/Write. For data segments, 0 means read-only and 1 is r/w. For code segments, 0 means you can't read from it (e.g. using MOV) and 1 means you can.
Type bit 0
The lowest bit means that the segment has been accessed already (1) or not.

bits in 'flags' (byte #6)

"G" = Granularity (1 bit) = segment Size (limit) specified in bytes (0) or 4K pages (1)

"D/B" = Default (code seg) / Big (data seg) = (1 bit) In a code segment (see "Type" above), this bit says the default operand/address size is 32-bit (1) or 16-bit (0). For a data segment used as a stack, it means the stack pointer is 32-bit (1) or 16-bit (0). It also means something for expand-down segments (see "Type" above), but we don't care.

"0" = Reserved (1 bit) = belongs to the Intel of the future.

"AVL" = Available (1 bit) = For your use. Go crazy.

"Size" = Top Nibble of Size (4 bits) = The size (limit) of the segment is 20 bits wide, and these are its top four bits. Whether the highest possible segment size is 1 MB or 4 GB depends on Granularity above.

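Packing these fields by hand is error-prone, so here is a sketch of a macro that scatters base, limit, type byte and flags into the eight bytes in the order listed above. DESCRIPTOR is a name made up for this example, not a NASM feature. The second entry should come out as the same eight bytes as the flatdesc entry used in Step 7.

```nasm
; nasm gdt.asm -f bin -o gdt.bin   (hypothetical file name)

; illustrative macro: DESCRIPTOR base, limit (20 bits), type byte, flags nibble
%macro DESCRIPTOR 4
   dw (%2) & 0xFFFF                                  ; bytes 0-1: limit bits 0-15
   dw (%1) & 0xFFFF                                  ; bytes 2-3: base bits 0-15
   db ((%1) >> 16) & 0xFF                            ; byte 4:    base bits 16-23
   db %3                                             ; byte 5:    P, DPL, S, Type
   db (((%4) & 0x0F) << 4) | (((%2) >> 16) & 0x0F)   ; byte 6:    flags + limit bits 16-19
   db ((%1) >> 24) & 0xFF                            ; byte 7:    base bits 24-31
%endmacro

gdt:
   DESCRIPTOR 0, 0, 0, 0                     ; entry 0: all zeros, unused
   ; P=1, DPL=0, S=1, Type=0010 (data, expand-up, writable); flags: G=0, D/B=1
   DESCRIPTOR 0, 0xFFFFF, 10010010b, 0100b   ; FF FF 00 00 00 92 4F 00
gdt_end:
```

Dump the output and compare it with the byte table above; the split base and limit fields are the part most people get wrong when typing descriptors in by hand.
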

See also What Segments are About?

Step 7: Entering Unreal mode


Baby Step VII: Big Real Mode

(a.k.a UnrealMode or voodoo mode.)

While this code is largely just a party trick, understanding it gives a gentle intro to protected mode concepts and possibly avoids some headaches later on 'cause you skipped over this kind of stuff.

The single descriptor in the global descriptor table at the bottom is laid out to match BabyStep6. The 'size' given is 1 MB, the base address is 0x0, and the bit fields you can work out yourself.

The reason for doing this is to enable 32-bit offsets in real mode. However, you won't be able to go past 1 meg quite yet.

In protected mode, bits 3-15 of the segment register are an index into the descriptor table. That's why in this code 0x08 = 1000b gets you entry 1.
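
The lower three bits are not part of the index: bit 2 chooses between the GDT (0) and an LDT (1), and bits 0-1 hold the requested privilege level. A small sketch that builds selectors from those fields; SELECTOR is a helper made up for this example.

```nasm
; nasm selector.asm -f bin -o selector.bin   (hypothetical file name)

; illustrative helper: index in bits 3-15, table indicator in bit 2, RPL in bits 0-1
%define SELECTOR(index, ti, rpl) (((index) << 3) | ((ti) << 2) | (rpl))

[BITS 16]
   mov bx, SELECTOR(1, 0, 0)   ; BB 08 00: GDT entry 1, ring 0 (the 0x08 used here)
   mov cx, SELECTOR(2, 0, 3)   ; B9 13 00: GDT entry 2, requested ring 3
```

Since each descriptor is 8 bytes, a ring 0 GDT selector is simply the byte offset of its entry in the table.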

When this register is given a "selector", a "segment descriptor cache register" is filled with the descriptor values, including the size (or limit). After the switch back to real mode, these values are not modified, regardless of what value is in the 16-bit segment register. So the 64k limit is no longer in force and 32-bit offsets can be used with the real-mode addressing rules (i.e. shift segment 4 bits, then add offset).

Finally, note that IP is unaffected by all this, so the code itself is still limited to 64k.

AsmExample:

;==========================================
; nasmw boot.asm -o boot.bin
; partcopy boot.bin 0 200 -f0

[ORG 0x7c00]      ; add to offsets

start:   xor ax, ax   ; make it zero
   mov ds, ax   ; DS=0
   mov ss, ax   ; stack starts at 0
   mov sp, 0x9c00   ; 2000h past code start

   cli      ; no interrupt
   push ds      ; save real mode

   lgdt [gdtinfo]   ; load gdt register

   mov  eax, cr0   ; switch to pmode by
   or al,1         ; set pmode bit
   mov  cr0, eax

   mov  bx, 0x08   ; select descriptor 1
   mov  ds, bx   ; 8h = 1000b

   and al,0xFE     ; back to realmode
   mov  cr0, eax   ; by toggling bit again

   pop ds      ; get back old segment
   sti

   mov bx, 0x0f01   ; attrib/char of smiley
   mov eax, 0x0b8000 ; note 32 bit offset
   mov word [ds:eax], bx

   jmp $      ; loop forever

gdtinfo:
   dw gdt_end - gdt - 1   ;last byte in table
   dd gdt         ;start of table

gdt        dd 0,0  ; entry 0 is always unused
flatdesc    db 0xff, 0xff, 0, 0, 0, 10010010b, 01001111b, 0
gdt_end:

   times 510-($-$$) db 0  ; fill sector w/ 0's
   db 0x55          ; req'd by some BIOSes
   db 0xAA
;==========================================


Step 8: 32-bits aware text display


baby steps VIII - 32-bit printing

Here is the same non-BIOS screen print AsmExample as before, but adjusted to use 32-bit registers and offsets. The 'complex' string instructions have been replaced.

;----------------------
dochar:
    call cprint              ; print one character
sprint:
    movzx eax, byte [esi]    ; string char to AL (one byte only)
    lea esi, [esi+1]
    cmp al, 0
    jne dochar               ; else, we're done
    add byte [ypos], 1       ; down one row
    mov byte [xpos], 0       ; back to left
    ret

cprint:
    mov ah, 0x0F             ; attrib = white on black
    mov ecx, eax             ; save char/attribute
    movzx eax, byte [ypos]
    mov edx, 160             ; 2 bytes (char/attrib)
    mul edx                  ; for 80 columns
    movzx ebx, byte [xpos]
    shl ebx, 1               ; times 2 to skip attrib

    mov edi, 0xb8000         ; start of video memory
    add edi, eax             ; add y offset
    add edi, ebx             ; add x offset

    mov eax, ecx             ; restore char/attribute
    mov word [ds:edi], ax
    add byte [xpos], 1       ; advance to right

    ret

;------------------------------------

printreg32:
    mov edi, outstr32
    mov eax, [reg32]
    mov esi, hexstr
    mov ecx, 8               ; eight nibbles

hexloop:
    rol eax, 4               ; leftmost will
    mov ebx, eax             ; become rightmost
    and ebx, 0x0f ;
    mov bl, [esi + ebx]      ; index into hexstr
    mov [edi], bl
    inc edi
    dec ecx
    jnz hexloop

    mov esi, outstr32
    call sprint

    ret

;------------------------------------

xpos db 0
ypos db 0
hexstr db '0123456789ABCDEF'
outstr32 db '00000000', 0    ; register value
reg32 dd 0                   ; pass values to printreg32

;------------------------------------


Note: See GasAllInOne for the examples in AT&T GAS Syntax.