Booting Linux on x86 with FIT
=============================

Background
----------

(corrections to the text below are welcome)

Generally Linux x86 uses its own very complex booting method. There is a setup
binary which contains all sorts of parameters and a compressed self-extracting
binary for the kernel itself, often with a small built-in serial driver to
display decompression progress.

The x86 CPU has various processor modes. I am no expert on these, but my
understanding is that an x86 CPU (even a really new one) starts up in a 16-bit
'real' mode where only 1MB of memory is visible, moves to 32-bit 'protected'
mode where 4GB is visible (or more with special memory access techniques) and
then to 64-bit 'long' mode if 64-bit execution is required.

Partly the self-extracting nature of Linux was introduced to cope with boot
loaders that were barely capable of loading anything. Even changing to 32-bit
mode was something of a challenge, so putting this logic in the kernel seemed
to make sense.

Bit by bit more and more logic has been added to this post-boot pre-Linux
wrapper:

- Changing to 32-bit mode
- Decompression
- Serial output (with drivers for various chips)
- Load address randomisation
- ELF loader complete with relocation (for the above)
- Random number generator via 3 methods (again for the above)
- Some sort of EFI mini-loader (1000+ glorious lines of code)
- Locating and tacking on a device tree and ramdisk

To my mind, if you sit back and look at things from first principles, this
doesn't make a huge amount of sense. Any boot loader worth its salt already
has most of the above features and more besides. The boot loader already knows
the layout of memory, has a serial driver, can decompress things, includes an
ELF loader and supports device tree and ramdisks. The decision to duplicate
all these features in a Linux wrapper caters for the lowest common
denominator: a boot loader which consists of a BIOS call to load something off
disk, followed by a jmp instruction.

(Aside: On ARM systems, we worry that the boot loader won't know where to load
the kernel. It might be easier to just provide that information in the image,
or in the boot loader, rather than adding a self-relocator to put it in the
right place. Or just use ELF?)

As a result, the x86 kernel boot process is needlessly complex. The file
format is also complex, and obfuscates the contents to such a degree that it
is quite a challenge to extract anything from it. This bzImage format has
become so prevalent that it actually isn't possible to produce the 'raw'
kernel build outputs with the standard Makefile (as it is on ARM for example,
at least at the time of writing).

This document describes an alternative boot process which uses simple raw
images which are loaded into the right place by the boot loader and then
executed.


Build the kernel
----------------

Note: these instructions assume a 32-bit kernel. U-Boot does not currently
support booting a 64-bit kernel as it has no way of going into 64-bit mode on
x86.

You can build the kernel as normal with 'make'. This will create a file called
'vmlinux'. This is a standard ELF file and you can look at it if you like:

    $ objdump -h vmlinux

    vmlinux:     file format elf32-i386

    Sections:
    Idx Name          Size      VMA       LMA       File off  Algn
      0 .text         00416850  81000000  01000000  00001000  2**5
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
      1 .notes        00000024  81416850  01416850  00417850  2**2
                      CONTENTS, ALLOC, LOAD, READONLY, CODE
      2 __ex_table    00000c50  81416880  01416880  00417880  2**3
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
      3 .rodata       00154b9e  81418000  01418000  00419000  2**5
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
      4 __bug_table   0000597c  8156cba0  0156cba0  0056dba0  2**0
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
      5 .pci_fixup    00001b80  8157251c  0157251c  0057351c  2**2
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
      6 .tracedata    00000024  8157409c  0157409c  0057509c  2**0
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
      7 __ksymtab     00007ec0  815740c0  015740c0  005750c0  2**2
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
      8 __ksymtab_gpl 00004a28  8157bf80  0157bf80  0057cf80  2**2
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
      9 __ksymtab_strings 0001d6fc  815809a8  015809a8  005819a8  2**0
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
     10 __init_rodata 00001c3c  8159e0a4  0159e0a4  0059f0a4  2**2
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
     11 __param       00000ff0  8159fce0  0159fce0  005a0ce0  2**2
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
     12 __modver      00000330  815a0cd0  015a0cd0  005a1cd0  2**2
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
     13 .data         00063000  815a1000  015a1000  005a2000  2**12
                      CONTENTS, ALLOC, LOAD, RELOC, DATA
     14 .init.text    0002f104  81604000  01604000  00605000  2**2
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
     15 .init.data    00040cdc  81634000  01634000  00635000  2**12
                      CONTENTS, ALLOC, LOAD, RELOC, DATA
     16 .x86_cpu_dev.init 0000001c  81674cdc  01674cdc  00675cdc  2**2
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
     17 .altinstructions 0000267c  81674cf8  01674cf8  00675cf8  2**0
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
     18 .altinstr_replacement 00000942  81677374  01677374  00678374  2**0
                      CONTENTS, ALLOC, LOAD, READONLY, CODE
     19 .iommu_table  00000014  81677cb8  01677cb8  00678cb8  2**2
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
     20 .apicdrivers  00000004  81677cd0  01677cd0  00678cd0  2**2
                      CONTENTS, ALLOC, LOAD, RELOC, DATA
     21 .exit.text    00001a80  81677cd8  01677cd8  00678cd8  2**0
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
     22 .data..percpu 00007880  8167a000  0167a000  0067b000  2**12
                      CONTENTS, ALLOC, LOAD, RELOC, DATA
     23 .smp_locks    00003000  81682000  01682000  00683000  2**2
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
     24 .bss          000a1000  81685000  01685000  00686000  2**12
                      ALLOC
     25 .brk          00424000  81726000  01726000  00686000  2**0
                      ALLOC
     26 .comment      00000049  00000000  00000000  00686000  2**0
                      CONTENTS, READONLY
     27 .GCC.command.line 0003e055  00000000  00000000  00686049  2**0
                      CONTENTS, READONLY
     28 .debug_aranges 0000f4c8  00000000  00000000  006c40a0  2**3
                      CONTENTS, RELOC, READONLY, DEBUGGING
     29 .debug_info   0440b0df  00000000  00000000  006d3568  2**0
                      CONTENTS, RELOC, READONLY, DEBUGGING
     30 .debug_abbrev 0022a83b  00000000  00000000  04ade647  2**0
                      CONTENTS, READONLY, DEBUGGING
     31 .debug_line   004ead0d  00000000  00000000  04d08e82  2**0
                      CONTENTS, RELOC, READONLY, DEBUGGING
     32 .debug_frame  0010a960  00000000  00000000  051f3b90  2**2
                      CONTENTS, RELOC, READONLY, DEBUGGING
     33 .debug_str    001b442d  00000000  00000000  052fe4f0  2**0
                      CONTENTS, READONLY, DEBUGGING
     34 .debug_loc    007c7fa9  00000000  00000000  054b291d  2**0
                      CONTENTS, RELOC, READONLY, DEBUGGING
     35 .debug_ranges 00098828  00000000  00000000  05c7a8c8  2**3
                      CONTENTS, RELOC, READONLY, DEBUGGING

There is also the setup binary mentioned earlier. This is at
arch/x86/boot/setup.bin and is about 12KB in size. It includes the command
line and various settings needed by the kernel. Arguably the boot loader
should provide all of this also, but setting it up is so complex that the
kernel helps by providing a head start.
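A quick sanity check on setup.bin can be sketched as follows. It assumes the
layout given in the kernel's boot protocol document, where the header
signature "HdrS" sits at offset 0x202 of the setup code. The function name
has_setup_magic is made up for this illustration.

```shell
# Succeed if the file named by $1 carries the "HdrS" signature that the
# Linux x86 boot protocol places at offset 0x202 (514 decimal).
# has_setup_magic is a hypothetical helper name, not part of any tool.
has_setup_magic() {
    magic=$(dd if="$1" bs=1 skip=514 count=4 2>/dev/null)
    [ "$magic" = "HdrS" ]
}
```

For example, 'has_setup_magic arch/x86/boot/setup.bin && echo ok' should
print 'ok' for a setup binary produced by the kernel build.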

As you can see the code loads to address 0x01000000 and everything else
follows after that. We could load this image using the 'bootelf' command but
we would still need to provide the setup binary. This is not supported by
U-Boot although I suppose you could mostly script it. This would permit the
use of a relocatable kernel.
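If you would rather not read the load address off the listing by eye, a
small filter along these lines can pull it out. It assumes the column order
shown in the objdump listing above (index, name, size, VMA, LMA); text_lma is
a hypothetical helper name.

```shell
# Read `objdump -h` output on stdin and print the LMA (load address) of the
# .text section as a hex constant. text_lma is a hypothetical helper name.
text_lma() {
    awk '$2 == ".text" { print "0x" $5; exit }'
}
```

It would be used as 'objdump -h vmlinux | text_lma'.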

All we need to boot is the vmlinux file and the setup.bin file.


Create a FIT
------------

To create a FIT you will need a source file describing what should go in the
FIT. See kernel.its for an example for x86. Put this into a file called
image.its.
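As a rough guide to what such a source file contains, the sketch below
writes one from the shell. The node names, descriptions, kernel load address
and hash algorithm follow the dumpimage listing later in this document, and
the setup address is the 0x90000 mentioned in this section. The property
names follow the usual FIT source conventions and should be checked against
kernel.its. The input file names (vmlinux.bin.lzo and setup.bin in the
current directory) and the helper name write_its are assumptions.

```shell
# Write a FIT source file to the path given in $1 (default: image.its).
# write_its is a hypothetical helper; compare the result with kernel.its.
write_its() {
    cat > "${1:-image.its}" <<'EOF'
/dts-v1/;

/ {
    description = "Simple image with single Linux kernel on x86";
    #address-cells = <1>;

    images {
        kernel@1 {
            description = "Vanilla Linux kernel";
            data = /incbin/("./vmlinux.bin.lzo");
            type = "kernel";
            arch = "x86";
            os = "linux";
            compression = "lzo";
            load = <0x01000000>;
            entry = <0x00000000>;
            hash@1 {
                algo = "sha1";
            };
        };

        setup@1 {
            description = "Linux setup.bin";
            data = /incbin/("./setup.bin");
            type = "x86_setup";
            arch = "x86";
            os = "linux";
            compression = "none";
            load = <0x00090000>;
            entry = <0x00090000>;
            hash@1 {
                algo = "sha1";
            };
        };
    };

    configurations {
        default = "config@1";
        config@1 {
            description = "Boot Linux kernel";
            kernel = "kernel@1";
            setup = "setup@1";
        };
    };
};
EOF
}
```

Running 'write_its' with no argument leaves the result in image.its.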

Note that setup is loaded to the special address of 0x90000 (a special address
you just have to know) and the kernel is loaded to 0x01000000 (the address you
saw above). This means that you will need to load your FIT to a different
address so that U-Boot doesn't overwrite it when decompressing. Something like
0x02000000 will do, so you can set CONFIG_SYS_LOAD_ADDR to that.
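The arithmetic behind that choice can be checked with a couple of shell
helpers. This is only a sketch using figures from the objdump listing above:
the last section, .brk, has LMA 0x01726000 and size 0x00424000. The helper
names region_end and clear_of are made up for this illustration.

```shell
# Print the first address past a region that starts at $1 and is $2 bytes
# long. region_end is a hypothetical helper name.
region_end() {
    printf '0x%08x\n' $(( $1 + $2 ))
}

# Succeed if address $1 lies at or above the end of the region that starts
# at $2 and is $3 bytes long. clear_of is a hypothetical helper name.
clear_of() {
    [ $(( $1 )) -ge $(( $2 + $3 )) ]
}
```

Here 'region_end 0x01726000 0x00424000' prints 0x01b4a000, and
'clear_of 0x02000000 0x01726000 0x00424000' succeeds, so a FIT loaded at
0x02000000 sits above everything the kernel occupies.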

In that example the kernel is compressed with lzo. Also we need to provide a
flat binary, not an ELF. So the steps needed to set things up are:

    # Create a flat binary
    objcopy -O binary vmlinux vmlinux.bin

    # Compress it into LZO format
    lzop vmlinux.bin

    # Build a FIT image
    mkimage -f image.its image.fit

(Be careful to run the mkimage from your U-Boot tools directory, since it
will have x86_setup support.)
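The three steps can be wrapped in one function that stops at the first
failure. This is a sketch rather than a supported script: build_fit and the
MKIMAGE variable (for pointing at the mkimage in your U-Boot tools directory)
are hypothetical names, and the -f flag is added to lzop so that a stale
vmlinux.bin.lzo is overwritten on a rebuild.

```shell
# Run the packaging steps in order, stopping at the first failure.
# $1 = kernel ELF (default vmlinux), $2 = FIT source (default image.its),
# $3 = output file (default image.fit).
# build_fit and MKIMAGE are hypothetical names used for illustration.
build_fit() {
    elf=${1:-vmlinux}
    its=${2:-image.its}
    fit=${3:-image.fit}
    # Flat binary, then LZO (-f overwrites an existing vmlinux.bin.lzo),
    # then the FIT itself.
    objcopy -O binary "$elf" vmlinux.bin &&
    lzop -f vmlinux.bin &&
    "${MKIMAGE:-mkimage}" -f "$its" "$fit"
}
```

For example: 'MKIMAGE=/path/to/u-boot/tools/mkimage build_fit', where the
path is a placeholder for your own U-Boot build.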

You can take a look at the resulting FIT file if you like:

    $ dumpimage -l image.fit
    FIT description: Simple image with single Linux kernel on x86
    Created:         Tue Oct 7 10:57:24 2014
     Image 0 (kernel@1)
      Description:  Vanilla Linux kernel
      Created:      Tue Oct 7 10:57:24 2014
      Type:         Kernel Image
      Compression:  lzo compressed
      Data Size:    4591767 Bytes = 4484.15 kB = 4.38 MB
      Architecture: Intel x86
      OS:           Linux
      Load Address: 0x01000000
      Entry Point:  0x00000000
      Hash algo:    sha1
      Hash value:   446b5163ebfe0fb6ee20cbb7a8501b263cd92392
     Image 1 (setup@1)
      Description:  Linux setup.bin
      Created:      Tue Oct 7 10:57:24 2014
      Type:         x86 setup.bin
      Compression:  uncompressed
      Data Size:    12912 Bytes = 12.61 kB = 0.01 MB
      Hash algo:    sha1
      Hash value:   a1f2099cf47ff9816236cd534c77af86e713faad
     Default Configuration: 'config@1'
     Configuration 0 (config@1)
      Description:  Boot Linux kernel
      Kernel:       kernel@1


Booting the FIT
---------------

To make it boot you need to load it and then use 'bootm' to boot it. A
suitable script to do this from a network server is:

    bootp
    tftp image.fit
    bootm

This will load the image from the network and boot it. The command line (from
the 'bootargs' environment variable) will be passed to the kernel.
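If you prefer to spell out the load address and the command line rather than
rely on defaults, a variant of the command sequence can be generated on the
build host. This is only a sketch: write_boot_cmds and the boot.cmd file name
are hypothetical, and the bootargs value is a placeholder to replace with
your own.

```shell
# Write a U-Boot command sequence to the file named by $1 (default:
# boot.cmd), loading the FIT at the address in $2 (default: 0x02000000).
# write_boot_cmds is a hypothetical helper; the bootargs value is only a
# placeholder.
write_boot_cmds() {
    addr=${2:-0x02000000}
    cat > "${1:-boot.cmd}" <<EOF
setenv bootargs console=ttyS0,115200
bootp
tftp $addr image.fit
bootm $addr
EOF
}
```

The resulting lines can be typed at the U-Boot prompt as they are.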

If you want a ramdisk you can add it as normal with FIT. As for a device tree,
x86 doesn't normally use one; it has ACPI instead.


Why Bother?
-----------

1. It demystifies the process of booting an x86 kernel
2. It allows use of the standard U-Boot boot file format
3. It allows U-Boot to perform decompression. Any problem produces an error
   message while you are still in the boot loader, so it is possible to
   investigate.
4. It avoids all the pre-loader code in the kernel which is quite complex to
   follow
5. You can use verified/secure boot and other features which haven't yet been
   added to the pre-Linux wrapper
6. It makes x86 more like other architectures in the way it boots a kernel.
   You can potentially use the same file format for the kernel, and the same
   procedure for building and packaging it.


References
----------

In the Linux kernel, Documentation/x86/boot.txt defines the boot protocol for
the kernel including the setup.bin format. This is handled in U-Boot in
arch/x86/lib/zimage.c and arch/x86/lib/bootm.c.

The procedure for entering 64-bit mode on x86 seems to be described here:

    http://wiki.osdev.org/64-bit_Higher_Half_Kernel_with_GRUB_2

Various files in the same directory as this file describe the FIT format.


--
Simon Glass
sjg@chromium.org
7-Oct-2014