BabyStep is a series of tutorials written by CrazyBuddah in the forum. You can browse them by going to the .:QuickLinkz:. forum thread.
They are intended for novice programmers who need help writing simple bootloaders, and range from a simple 'hello world' boot sector to UnrealMode and ProtectedMode switch & display.
Step 1: your first boot sector
Included from BabyStep1
The following code is the smallest possible example of booting code from a floppy. It is assembled in NASM and copied to floppy using Partcopy. Then you boot from the floppy.
; nasmw boot.asm -f bin -o boot.bin
; partcopy boot.bin 0 200 -f0
hang:
jmp hang
times 512-($-$$) db 0
The CPU starts in real mode and BIOS loads this code at 0000:7c00. The "times 512..." stuff is NASM's way of saying fill up 512 bytes with zeros. And Partcopy is going to expect that (200 in hex = 512). Change it, and you'll see Partcopy choke.
Often, you will see a so-called boot signature (0xAA55) at the end. BIOSes look for this in order to identify a boot sector on a disk. Some BIOSes evidently do not insist on it, at least when booting from a floppy, but many do, so it is safest to include it. To add it, the last line would be replaced with the following (or some version of it):
times 510-($-$$) db 0 ;2 bytes less now
db 0x55
db 0xAA
But the thing I'd really like to point out is that once you've booted, and the cursor is happily blinking on a blank screen, you might notice two things. One is that the floppy motor will turn off and the other is that you can press Ctrl-Alt-Del to reboot. The point is that interrupts (such as int 0x09) are still being generated.
For kicks try clearing the interrupts flag:
cli
hang:
jmp hang
times 510-($-$$) db 0
db 0x55
db 0xAA
You may notice that the floppy motor doesn't turn off and you can't reboot with Ctrl-Alt-Del.
If you try to reduce this even more by removing the loop and merely pad out the sector with zeros, the BIOS will have something to say about it. On my machine, it was "Operating System Not Found". I have yet to try filling the sector with zeros except for adding a boot signature.
Not exactly something you would show your girlfriend, but I wanted to show just what the bare minimum is before I elaborate. Unless I'm irritating anyone, in which case I'll desist.
REFERENCES:
- Instruction Set from the horse's mouth: http://www.intel.com/design/PentiumII/manuals/
- Easier to read
- NASM assembler - docs incl Instruction Set
- Partcopy - download pcopy02.zip
- Interrupts by number
- Randall Hyde's look into the bowels of the PC
Step 2: writing a message using the BIOS
Included from BabyStep2
Quick review:
- boot sector loaded by BIOS is 512 bytes
- the code in the boot sector of the disk is loaded by the BIOS at 0000:7c00
- machine starts in real mode
- be aware that the CPU is being interrupted unless you CLI
Many (but not all) BIOS interrupts expect DS to be filled with a real-mode segment value. This is why many BIOS interrupts won't work in protected mode. So if you want to use int 10h/ah=0eh to print to the screen, then you need to make sure that your seg:offset for the characters to print is correct.
It doesn't matter if you use 0000:7c00 or 07c0:0000, but if you use ORG, you need to be aware of what's happening.
mov ax, 0x07c0
mov ds, ax
mov si, msg
ch_loop:lodsb
or al,al ;zero=end of str
jz hang ;get out
mov ah,0x0E
int 0x10
jmp ch_loop
hang:
jmp hang
msg db 'Welcome to Macintosh', 13, 10, 0
times 510-($-$$) db 0
db 0x55
db 0xAA
Here's the ORG version. Note that you still need to tell DS what to be:
[ORG 0x7c00]
xor ax, ax ;make it zero
mov ds, ax
mov si, msg
ch_loop:lodsb
or al,al ;zero=end of str
jz hang ;get out
mov ah,0x0E
int 0x10
jmp ch_loop
hang:
jmp hang
msg db 'Welcome to Macintosh', 13, 10, 0
times 510-($-$$) db 0
db 0x55
db 0xAA
New topic: Typically 'procedures' are separated from the code using CALL/RET like the following
[ORG 0x7c00]
xor ax, ax ;make it zero
mov ds, ax
mov si, msg
call bios_print
hang:
jmp hang
msg db 'Welcome to Macintosh', 13, 10, 0
bios_print:
lodsb
or al,al ;zero=end of str
jz done ;get out
mov ah,0x0E
int 0x10
jmp bios_print
done:
ret
times 510-($-$$) db 0
db 0x55
db 0xAA
For some inexplicable reason, loading SI then jumping to the procedure always bugged me. Fortunately for psychos like me NASM's macros let you pretend that you are passing a parameter (macro definition has to go before it's being called).
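%macro BiosPrint 1
mov si, word %1
; the %% prefix makes these labels local to each expansion,
; so the macro can be used more than once without duplicate labels
%%ch_loop:lodsb
or al,al ;zero=end of str
jz %%done ;get out
mov ah,0x0E
int 0x10
jmp %%ch_loop
%%done:
%endmacro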
[ORG 0x7c00]
xor ax, ax ;make it zero
mov ds, ax
BiosPrint msg
hang:
jmp hang
msg db 'Welcome to Macintosh', 13, 10, 0
times 510-($-$$) db 0
db 0x55
db 0xAA
And in case your code is becoming long and unreadable, you can break it up into different files, then include the files at the beginning of your main code. Like so:
jmp main
%include "othercode.inc"
main:
;... rest of your code here
Step 3: a look at machine code (opcodes, prefix, etc)
Included from BabyStep3
Opcodes in machine language (AsmExample)
; nasmw encode.asm -f bin -o encode.bin
mov cx, 0xFF
times 510-($-$$) db 0
db 0x55
db 0xAA
Don't Partcopy this to disk. Just open it in DEBUG (on MS-DOS/Windows; Linux users can use hexdump instead):
C:\osdev> debug encode.bin
Type in 'd' after the '-' to see the binary file. ('?' will give you help; 'q' will quit). You will see something like this:
0AE3:0100 B9 FF 00 00 00 00 etc...
Look up the opcode for MOV here:
http://www.baldwin.cx/386htm/MOV.htm
See Section "17.2.2.1 Opcode" here:
http://www.baldwin.cx/386htm/s17_02.htm
In other words, there is a unique register number (CX=1) added to the base opcode value 'B8' to give 'B9', which you see in the dump.
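As a rough illustration of that rule, the sketch below assembles the same MOV for each 16-bit general register. The bytes in the comments follow from adding the register number to 'B8'. The file name is only an assumption, and like the example above this is meant to be dumped, not booted.

```nasm
; nasm regs.asm -f bin -o regs.bin   (hypothetical file name)
[BITS 16]
    mov ax, 0xFF    ; B8 FF 00   AX = register 0
    mov cx, 0xFF    ; B9 FF 00   CX = 1
    mov dx, 0xFF    ; BA FF 00   DX = 2
    mov bx, 0xFF    ; BB FF 00   BX = 3
    mov sp, 0xFF    ; BC FF 00   SP = 4
    mov bp, 0xFF    ; BD FF 00   BP = 5
    mov si, 0xFF    ; BE FF 00   SI = 6
    mov di, 0xFF    ; BF FF 00   DI = 7
```

Note that the register numbers do not follow alphabetical order: BX is 3, not 1.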
But watch what happens when you replace CX with ECX:
mov ecx, 0xFF
times 510-($-$$) db 0
db 0x55
db 0xAA
0AE3:0100 66 B9 FF 00 00 00 00 etc...
The '66' is an Operand Size Override Prefix. The assembler generates it when the operand size differs from the default for the current mode, which is 16-bit when NASM assembles binary files. The same thing happens if you use the BITS directive to change the mode so that it differs from the size of the operand:
[BITS 32]
mov cx, 0xFF
times 510-($-$$) db 0
db 0x55
db 0xAA
The BITS directive doesn't actually change the mode of the processor; it only tells the assembler which mode the code will run in. The prefix is what helps the processor interpret the subsequent bytes.
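To see all four combinations side by side, something like the following can be assembled and dumped. It is only a sketch for inspection with DEBUG or hexdump, not code to boot, and the file name is an assumption.

```nasm
; nasm prefix.asm -f bin -o prefix.bin   (hypothetical file name)
[BITS 16]
    mov cx, 0xFF     ; B9 FF 00            operand matches 16-bit default
    mov ecx, 0xFF    ; 66 B9 FF 00 00 00   32-bit operand needs the prefix
[BITS 32]
    mov cx, 0xFF     ; 66 B9 FF 00         16-bit operand needs the prefix
    mov ecx, 0xFF    ; B9 FF 00 00 00      operand matches 32-bit default
```

The opcode 'B9' is the same in every case. Only the prefix and the number of immediate bytes change, which is why running code in the wrong mode makes the processor misread everything after the first mismatch.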
Addresses
Address encoding is a bit more complicated
mov cx, [temp]
temp db 0x99
times 510-($-$$) db 0
db 0x55
db 0xAA
0AE3:0100 8B 0E 04 00 99 00 00 00 etc...
- '8B' is the opcode
- '0E' is a ModR/M byte which helps the opcode interpretation
See Section "17.2.1 ModR/M and SIB Bytes" here:
http://www.baldwin.cx/386htm/s17_02.htm
This byte contains several fields (see Fig. 17-2), each with its own rules of interpretation, but fortunately Table 17-2 makes it easier. Look up '0E' and you will see that the row label at the left says "disp16", which means that the operand will be interpreted as a 16-bit offset.
- '04 00' is the 16-bit offset. If you are confused why 0x0004 is backwards, it's because Intel processors are "little endian": the "little" end of the number comes first.
- '99' is of course the value of the byte at 0x0004 (8B is at 0x0000)
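The ModR/M byte '0E' is 00 001 110 in binary: mod=00, reg=001 (CX), r/m=110 (disp16). As a hedged sketch, the variations below change one field at a time so the effect shows up in the dump. The file name is an assumption, and this is for dumping only, not for running.

```nasm
; nasm modrm.asm -f bin -o modrm.bin   (hypothetical file name)
[BITS 16]
    mov cx, [temp]   ; 8B 0E 0C 00   00 001 110: reg=CX, r/m=disp16
    mov dx, [temp]   ; 8B 16 0C 00   00 010 110: reg=DX, r/m=disp16
    mov cx, [bx]     ; 8B 0F         00 001 111: reg=CX, r/m=[BX]
    mov cx, [si]     ; 8B 0C         00 001 100: reg=CX, r/m=[SI]
temp db 0x99         ; lands at offset 0x000C, after 12 bytes of code
```

AX is left out on purpose: for a plain memory load into AX the assembler can pick a shorter opcode that has no ModR/M byte at all.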
Be aware of another prefix, the Address Size Override Prefix '67', which the assembler generates when there is a discrepancy in address size, just like with '66' above.
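A small sketch of when '67' appears, again meant only for dumping (the file name is an assumption):

```nasm
; nasm addr.asm -f bin -o addr.bin   (hypothetical file name)
[BITS 16]
    mov cx, [bx]     ; 8B 0F      16-bit address in 16-bit code: no prefix
    mov cx, [ebx]    ; 67 8B 0B   32-bit address in 16-bit code
[BITS 32]
    mov ecx, [ebx]   ; 8B 0B      32-bit address in 32-bit code: no prefix
    mov ecx, [bx]    ; 67 8B 0F   16-bit address in 32-bit code
```

Notice that the ModR/M byte for [BX] and for [EBX] differs as well, because 16-bit and 32-bit addressing use different r/m tables.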
This stuff matters for a bunch of reasons, but since we will be making the switch from 16-bit real mode to 32-bit protected mode, our code is going to also change. And being aware of what a dump looks like can prevent a lot of grief.
Step 4: writing to video memory
Included from BabyStep4
I know this is starting to look like a half-baked tutorial in assembly, but there's actually a reason behind my madness. Namely, solving as many problems as possible before switching to pmode etc. will lessen the confusion a great deal.
This example prints a string and the contents of a memory location (which is the first letter of the string in video memory). It is meant to demonstrate printing to screen in text mode without using BIOS, as well as converting hex so it can be displayed -- so we can check register and memory values.
I added a stack, but didn't end up using it. However, I just left it in because it will probably get used soon.
;=====================================
; nasmw boot.asm -f bin -o boot.bin
; partcopy boot.bin 0 200 -f0
[ORG 0x7c00] ; add to offsets
xor ax, ax ; make it zero
mov ds, ax ; DS=0
mov ss, ax ; stack starts at 0
mov sp, 0x9c00 ; 2000h past code start
cld ; make lodsb/stosw count upward
mov ax, 0xb800 ; text video memory
mov es, ax
mov si, msg ; show text string
call sprint
mov ax, 0xb800 ; look at video mem
mov gs, ax
mov bx, 0x0000 ; 'W'=57 attrib=0F
mov ax, [gs:bx]
mov word [reg16], ax ;look at register
call printreg16
hang:
jmp hang
;----------------------
dochar: call cprint ; print one character
sprint: lodsb ; string char to AL
cmp al, 0
jne dochar ; else, we're done
add byte [ypos], 1 ;down one row
mov byte [xpos], 0 ;back to left
ret
cprint: mov ah, 0x0F ; attrib = white on black
mov cx, ax ; save char/attribute
movzx ax, byte [ypos]
mov dx, 160 ; 2 bytes (char/attrib)
mul dx ; for 80 columns
movzx bx, byte [xpos]
shl bx, 1 ; times 2 to skip attrib
mov di, 0 ; start of video memory
add di, ax ; add y offset
add di, bx ; add x offset
mov ax, cx ; restore char/attribute
stosw ; write char/attribute
add byte [xpos], 1 ; advance to right
ret
;------------------------------------
printreg16:
mov di, outstr16
mov ax, [reg16]
mov si, hexstr
mov cx, 4 ;four places
hexloop:
rol ax, 4 ;leftmost will
mov bx, ax ; become
and bx, 0x0f ; rightmost
mov bl, [si + bx];index into hexstr
mov [di], bl
inc di
dec cx
jnz hexloop
mov si, outstr16
call sprint
ret
;------------------------------------
xpos db 0
ypos db 0
hexstr db '0123456789ABCDEF'
outstr16 db '0000', 0 ;register value string
reg16 dw 0 ; pass values to printreg16
msg db "What are you doing, Dave?", 0
times 510-($-$$) db 0
db 0x55
db 0xAA
;==================================
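For a fixed screen position, the offset arithmetic that cprint does at run time, (row * 80 + column) * 2, can be left to the assembler instead. A minimal sketch, assuming a hypothetical VIDOFF macro that is not part of the original code:

```nasm
; nasm vidoff.asm -f bin -o vidoff.bin   (hypothetical file name)
[ORG 0x7c00]
; VIDOFF is a made-up helper: byte offset of column x, row y in text mode
%define VIDOFF(x,y) (((y)*80+(x))*2)

    mov ax, 0xb800                      ; text video memory
    mov es, ax
    mov word [es:VIDOFF(0,1)], 0x0F41   ; 'A' (0x41), attrib 0x0F, row 1 col 0
hang:
    jmp hang

    times 510-($-$$) db 0
    db 0x55
    db 0xAA
```

VIDOFF(0,1) works out to 160, the same value cprint would compute for ypos=1 and xpos=0.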
Step 5: the IVT - interrupts in the good old days.
Included from BabyStep5
This code is meant to show how the hardware interrupt generated when you press a key can be handled by replacing the seg:offset specified in the IVT (interrupt vector table). This normally points to a BIOS routine. To find the entry in the IVT, multiply the interrupt number by 4 (which is the size of each entry).
This key handler just displays the scan code without conversion to ASCII, buffering, or handling extended keys. The reason for doing this is to not muddle up the basic idea, which is to provide input, as well as output, in its most simple form.
I will not go into the hows and whys of reading the ports involved in a key press. Suffice it to say that you are communicating with actual chips (or parts of chips), not some software intermediary. I personally feel it is good to remember that, no matter what level of abstraction you work at, you are ultimately telling hardware what to do.
I will point out that the code for turning the keyboard on/off through port 0x61 is given in its complete form, some of which might not be needed, depending on the system.
;==========================================
; nasmw boot.asm -f bin -o boot.bin
; partcopy boot.bin 0 200 -f0
[ORG 0x7c00] ; add to offsets
jmp start
%include "print.inc"
start: xor ax, ax ; make it zero
mov ds, ax ; DS=0
mov ss, ax ; stack starts at 0
mov sp, 0x9c00 ; 2000h past code start
mov ax, 0xb800 ; text video memory
mov es, ax
cli ;no interruptions
mov bx, 0x09 ;hardware interrupt #
shl bx, 2 ;multiply by 4
xor ax, ax
mov gs, ax ;start of memory
mov [gs:bx], word keyhandler
mov [gs:bx+2], ds ; segment
sti
jmp $ ; loop forever
keyhandler:
in al, 0x60 ; get key data
mov bl, al ; save it
mov byte [port60], al
in al, 0x61 ; keybrd control
mov ah, al
or al, 0x80 ; set bit 7 (disable keyboard)
out 0x61, al ; send it back
xchg ah, al ; get original
out 0x61, al ; send that back
mov al, 0x20 ; End of Interrupt
out 0x20, al ;
and bl, 0x80 ; key released
jnz done ; don't repeat
mov ax, [port60]
mov word [reg16], ax
call printreg16
done:
iret
port60 dw 0
times 510-($-$$) db 0 ; fill sector w/ 0's
dw 0xAA55 ; req'd by some BIOSes
;==========================================
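The code above overwrites the BIOS vector for int 0x09 without keeping a copy. Below is a minimal sketch of reading an IVT entry first, using the same "interrupt number times 4" rule. The names IRQ1_VECTOR, old_off and old_seg are assumptions made for this example only.

```nasm
; nasm ivtsave.asm -f bin -o ivtsave.bin   (hypothetical file name)
[ORG 0x7c00]
IRQ1_VECTOR equ 0x09 * 4        ; each IVT entry is 4 bytes: offset, then segment

    xor ax, ax
    mov ds, ax                  ; DS=0, so [IRQ1_VECTOR] addresses the IVT
    cli                         ; don't let the entry change while we read it
    mov ax, [IRQ1_VECTOR]       ; low word = handler offset
    mov [old_off], ax
    mov ax, [IRQ1_VECTOR + 2]   ; high word = handler segment
    mov [old_seg], ax
    sti
hang:
    jmp hang

old_off dw 0                    ; saved BIOS handler offset
old_seg dw 0                    ; saved BIOS handler segment

    times 510-($-$$) db 0
    dw 0xAA55
```

With the old seg:offset saved, a replacement handler could later restore it, or pass keys on to the BIOS routine with a far jump.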
- Hardware fun: http://chip.ms.mff.cuni.cz/pcguts/
- Intel's Summer Reading List: http://developer.intel.com/vtune/cbts/refman.htm
- John Fine links to hardware programming: http://www.geocities.com/SiliconValley/Peaks/8600/device.html
Step 6: Understanding segmentation
Included from BabyStep6
Baby Step VI : descriptors
Actually entering ProtectedMode is simply switching a single bit in a special control register (cr0). (All the other stuff, like A20Line, tasks, IDT, call gates, etc. is additional stuff.)
However, before switching to pmode, you have to use the LGDT instruction to load another special register (gdtr) with the location of a table of data structures called descriptors that tell the processor how to access memory.
We're arguing about whether GDT could be set up after switching to pmode in this thread. --PypeClicker
Overview of bytes in the descriptor:
+0 +1 +2 +3 +4 +5 +6 +7
l0 l1 b0 b1 b2 TT Fl b3
Descriptor bytes arranged from lowest memory location to highest:
| Byte | Contents |
|---|---|
| 0 | 0x00 lowest byte of Limit |
| 1 | 0x00 next byte of Limit |
| 2 | 0x00 lowest byte of Base Addr |
| 3 | 0x00 next byte of Base Addr |
| 4 | 0x00 third byte of Base Addr |
| 5 | 0x00 = (bits) 0 - 00 - 0 - 0000 = P - DPL - S - Type |
| 6 | 0x00 = (bits) 0 - 0 - 0 - 0 - 0000 = G - D/B - 0 - AVL - Size |
| 7 | 0x00 fourth and highest byte of Base Addr |
Bits in Type (byte #5)
- "P" = Present (1 bit): 1 means the segment is in memory (accessing a non-present segment will raise an exception).
- "DPL" = Descriptor Privilege Level (2 bits): 0 is most privileged and 3 is least.
- "S" = System (1 bit): must be 0 in descriptors for Task State Segments (TSS), Interrupt Gates, Trap Gates, Task Gates and Call Gates. Otherwise, for code/data/stack segment descriptors, it will be 1.
- "Type" = Type (4 bits): the interpretation of these depends on whether S (above) is set or not. For S=0, the interpretation will be covered in specific instances of gates etc.
- Type bit 3: if S=1 and this high bit is 1, it's a code segment; otherwise it's a data segment.
- Type bit 2: the meaning of the next highest bit depends on bit 3. For a code segment, it indicates whether the segment is 'Conforming' or not. A conforming segment can be called by LESS privileged code elsewhere, and it then runs at (conforms to) the privilege level of the calling program. For a data segment, this bit specifies "Expand (up or down)" for when the segment is used as a stack. Expand-up (bit=0) is the normal behavior. Expand-down is used to prevent problems in stacks that are resized.
- Type bit 1: the subsequent bit specifies permission to Read/Write. For data segments, 0 means read-only and 1 is r/w. For code segments, 0 means you can't read from it (e.g. using MOV) and 1 means you can.
- Type bit 0: the lowest bit means that the segment has been accessed already (1) or not (0).
Bits in 'flags' (byte #6)
- "G" = Granularity (1 bit): segment Size is specified in bytes (0) or 4K pages (1).
- "D/B" = Default (code seg) / Big (data seg) (1 bit): in a code segment (see "Type" above), this bit says the default operand/address size is 32-bit (1) or 16-bit (0). For a data segment, it means the stack pointer is 32-bit (1) or 16-bit (0). It also means something for expand-down stacks (see "Type" above), but we don't care.
- "0" = Reserved (1 bit): belongs to the Intel of the future.
- "AVL" = Available (1 bit): for your use. Go crazy.
- "Size" = Top Nibble of Size (4 bits): the size of the segment is 20 bits, and these are the top four. Whether the highest possible segment size is 1 meg or 4 gigs depends on Granularity above.
See also What Segments are About?
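Packing these fields by hand is error-prone, so here is a hedged sketch of a NASM macro that builds the eight bytes from a base, a limit, the byte #5 value and the four flag bits. DESCRIPTOR is a made-up name, not something NASM provides, and this version assumes the base and limit are plain numeric constants rather than labels.

```nasm
; nasm desc.asm -f bin -o desc.bin   (hypothetical file name)
; DESCRIPTOR base, limit, access, flags   (hypothetical helper)
;   access = byte #5 (P, DPL, S, Type); flags = G, D/B, 0, AVL as 4 bits
%macro DESCRIPTOR 4
    dw (%2) & 0xFFFF                                  ; bytes 0-1: limit bits 0-15
    dw (%1) & 0xFFFF                                  ; bytes 2-3: base bits 0-15
    db ((%1) >> 16) & 0xFF                            ; byte 4: base bits 16-23
    db (%3)                                           ; byte 5: P, DPL, S, Type
    db (((%4) & 0x0F) << 4) | (((%2) >> 16) & 0x0F)   ; byte 6: flags + limit bits 16-19
    db ((%1) >> 24) & 0xFF                            ; byte 7: base bits 24-31
%endmacro

gdt:
    DESCRIPTOR 0, 0, 0, 0                     ; entry 0: null descriptor
    DESCRIPTOR 0, 0xFFFFF, 10010010b, 0100b   ; present, DPL 0, r/w data, D/B=1
gdt_end:
```

Dumping the output should show the second entry as FF FF 00 00 00 92 4F 00, which is the same byte sequence as the hand-written flatdesc in Step 7.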
Step 7: Entering Unreal mode
Included from BabyStep7
Baby Step VII: Big Real Mode
(a.k.a UnrealMode or voodoo mode.)
While this code is largely just a party trick, understanding it gives a gentle intro to protected mode concepts and possibly avoids some headaches later on 'cause you skipped over this kind of stuff.
The single descriptor in the global descriptor table at the bottom is laid out to match BabyStep6. The 'size' given is 1 MB, the base address is 0x0, and the bit fields you can work out yourself.
The reason for doing this is to enable 32-bit offsets in real mode. However, you won't be able to go past 1 meg quite yet.
In protected mode, bits 3-15 in the segment register are an index into the descriptor table. That's why in this code 0x08 = 1000b gets you entry 1.
When this register is given a "selector", a "segment descriptor cache register" is filled with the descriptor values, including the size (or limit). After the switch back to real mode, these values are not modified, regardless of what value is in the 16-bit segment register. So the 64k limit is no longer valid and 32-bit offsets can be used with the real-mode addressing rules (i.e. shift segment 4 bits, then add offset).
Finally, note that IP is unaffected by all this, so the code itself is still limited to 64k.
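The selector arithmetic mentioned above (0x08 selects entry 1) can be sketched with a small assemble-time helper. SELECTOR is a made-up macro name. The two low fields it fills in are the table indicator (bit 2, 0 for the GDT) and the requested privilege level (bits 0-1), both of which are 0 in this step.

```nasm
; nasm selector.asm -f bin -o selector.bin   (hypothetical file name)
[BITS 16]
; SELECTOR is a made-up helper: index in bits 3-15, TI in bit 2, RPL in bits 0-1
%define SELECTOR(index,ti,rpl) (((index) << 3) | ((ti) << 2) | (rpl))

    mov bx, SELECTOR(1,0,0)   ; assembles as mov bx, 0x08 (entry 1, GDT, RPL 0)
    mov cx, SELECTOR(2,0,3)   ; assembles as mov cx, 0x13 (entry 2, GDT, RPL 3)
```

This fragment is only meant to be assembled and dumped. Since each descriptor is 8 bytes, shifting the index left by 3 also happens to give the byte offset of the entry within the table.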
;==========================================
; nasmw boot.asm -o boot.bin
; partcopy boot.bin 0 200 -f0
[ORG 0x7c00] ; add to offsets
start: xor ax, ax ; make it zero
mov ds, ax ; DS=0
mov ss, ax ; stack starts at 0
mov sp, 0x9c00 ; 2000h past code start
cli ; no interrupt
push ds ; save real mode
lgdt [gdtinfo] ; load gdt register
mov eax, cr0 ; switch to pmode by
or al,1 ; set pmode bit
mov cr0, eax
mov bx, 0x08 ; select descriptor 1
mov ds, bx ; 8h = 1000b
and al,0xFE ; back to realmode
mov cr0, eax ; by toggling bit again
pop ds ; get back old segment
sti
mov bx, 0x0f01 ; attrib/char of smiley
mov eax, 0x0b8000 ; note 32 bit offset
mov word [ds:eax], bx
jmp $ ; loop forever
gdtinfo:
dw gdt_end - gdt - 1 ;last byte in table
dd gdt ;start of table
gdt dd 0,0 ; entry 0 is always unused
flatdesc db 0xff, 0xff, 0, 0, 0, 10010010b, 01001111b, 0
gdt_end:
times 510-($-$$) db 0 ; fill sector w/ 0's
db 0x55 ; req'd by some BIOSes
db 0xAA
;==========================================
Step 8: 32-bits aware text display
Included from BabyStep8
baby steps VIII - 32-bit printing
Here is the same non-BIOS screen print AsmExample as before, but adjusted to use 32-bit registers and offsets. The 'complex' string instructions have been replaced.
;----------------------
dochar:
call cprint ; print one character
sprint:
mov al, [esi] ; string char to AL (one byte only)
lea esi, [esi+1]
cmp al, 0
jne dochar ; else, we're done
add byte [ypos], 1 ; down one row
mov byte [xpos], 0 ; back to left
ret
cprint:
mov ah, 0x0F ; attrib = white on black
mov ecx, eax ; save char/attribute
movzx eax, byte [ypos]
mov edx, 160 ; 2 bytes (char/attrib)
mul edx ; for 80 columns
movzx ebx, byte [xpos]
shl ebx, 1 ; times 2 to skip attrib
mov edi, 0xb8000 ; start of video memory
add edi, eax ; add y offset
add edi, ebx ; add x offset
mov eax, ecx ; restore char/attribute
mov word [ds:edi], ax
add byte [xpos], 1 ; advance to right
ret
;------------------------------------
printreg32:
mov edi, outstr32
mov eax, [reg32]
mov esi, hexstr
mov ecx, 8 ; eight nibbles
hexloop:
rol eax, 4 ; leftmost will
mov ebx, eax ; become rightmost
and ebx, 0x0f ;
mov bl, [esi + ebx] ; index into hexstr
mov [edi], bl
inc edi
dec ecx
jnz hexloop
mov esi, outstr32
call sprint
ret
;------------------------------------
xpos db 0
ypos db 0
hexstr db '0123456789ABCDEF'
outstr32 db '00000000', 0 ; register value
reg32 dd 0 ; pass values to printreg32
;------------------------------------
All Steps
Note: See GasAllInOne for the examples in AT&T GAS Syntax.
