| Welcome to YAFFS, the first file system developed specifically for NAND flash. |
| |
| It is now YAFFS2 - original YAFFS (AYFFS1) only supports 512-byte page |
| NAND and is now deprectated. YAFFS2 supports 512b page in 'YAFFS1 |
| compatibility' mode (CONFIG_YAFFS_YAFFS1) and 2K or larger page NAND |
| in YAFFS2 mode (CONFIG_YAFFS_YAFFS2). |
| |
| |
| A note on licencing |
| ------------------- |
| YAFFS is available under the GPL and via alternative licensing |
| arrangements with Aleph One. If you're using YAFFS as a Linux kernel |
| file system then it will be under the GPL. For use in other situations |
| you should discuss licensing issues with Aleph One. |
| |
| |
| Terminology |
| ----------- |
| Page - NAND addressable unit (normally 512b or 2Kbyte size) - can |
| be read, written, marked bad. Has associated OOB. |
| Block - Eraseable unit. 64 Pages. (128K on 2K NAND, 32K on 512b NAND) |
| OOB - 'spare area' of each page for ECC, bad block marked and YAFFS |
| tags. 16 bytes per 512b - 64 bytes for 2K page size. |
| Chunk - Basic YAFFS addressable unit. Same size as Page. |
| Object - YAFFS Object: File, Directory, Link, Device etc. |
| |
| YAFFS design |
| ------------ |
| |
| YAFFS is a log-structured filesystem. It is designed particularly for |
| NAND (as opposed to NOR) flash, to be flash-friendly, robust due to |
| journalling, and to have low RAM and boot time overheads. File data is |
| stored in 'chunks'. Chunks are the same size as NAND pages. Each page |
| is marked with file id and chunk number. These marking 'tags' are |
| stored in the OOB (or 'spare') region of the flash. The chunk number |
| is determined by dividing the file position by the chunk size. Each |
| chunk has a number of valid bytes, which equals the page size for all |
| except the last chunk in a file. |
| |
| File 'headers' are stored as the first page in a file, marked as a |
| different type to data pages. The same mechanism is used to store |
| directories, device files, links etc. The first page describes which |
| type of object it is. |
| |
| YAFFS2 never re-writes a page, because the spec of NAND chips does not |
| allow it. (YAFFS1 used to mark a block 'deleted' in the OOB). Deletion |
| is managed by moving deleted objects to the special, hidden 'unlinked' |
| directory. These records are preserved until all the pages containing |
| the object have been erased (We know when this happen by keeping a |
| count of chunks remaining on the system for each object - when it |
| reaches zero the object really is gone). |
| |
| When data in a file is overwritten, the relevant chunks are replaced |
| by writing new pages to flash containing the new data but the same |
| tags. |
| |
| Pages are also marked with a short (2 bit) serial number that |
| increments each time the page at this position is incremented. The |
| reason for this is that if power loss/crash/other act of demonic |
| forces happens before the replaced page is marked as discarded, it is |
| possible to have two pages with the same tags. The serial number is |
| used to arbitrate. |
| |
| A block containing only discarded pages (termed a dirty block) is an |
| obvious candidate for garbage collection. Otherwise valid pages can be |
| copied off a block thus rendering the whole block discarded and ready |
| for garbage collection. |
| |
| In theory you don't need to hold the file structure in RAM... you |
| could just scan the whole flash looking for pages when you need them. |
| In practice though you'd want better file access times than that! The |
| mechanism proposed here is to have a list of __u16 page addresses |
| associated with each file. Since there are 2^18 pages in a 128MB NAND, |
| a __u16 is insufficient to uniquely identify a page but is does |
| identify a group of 4 pages - a small enough region to search |
| exhaustively. This mechanism is clearly expandable to larger NAND |
| devices - within reason. The RAM overhead with this approach is approx |
| 2 bytes per page - 512kB of RAM for a whole 128MB NAND. |
| |
| Boot-time scanning to build the file structure lists only requires |
| one pass reading NAND. If proper shutdowns happen the current RAM |
| summary of the filesystem status is saved to flash, called |
| 'checkpointing'. This saves re-scanning the flash on startup, and gives |
| huge boot/mount time savings. |
| |
| YAFFS regenerates its state by 'replaying the tape' - i.e. by |
| scanning the chunks in their allocation order (i.e. block sequence ID |
| order), which is usually different form the media block order. Each |
| block is still only read once - starting from the end of the media and |
| working back. |
| |
| YAFFS tags in YAFFS1 mode: |
| |
| 18-bit Object ID (2^18 files, i.e. > 260,000 files). File id 0- is not |
| valid and indicates a deleted page. File od 0x3ffff is also not valid. |
| Synonymous with inode. |
| 2-bit serial number |
| 20-bit Chunk ID within file. Limit of 2^20 chunks/pages per file (i.e. |
| > 500MB max file size). Chunk ID 0 is the file header for the file. |
| 10-bit counter of the number of bytes used in the page. |
| 12 bit ECC on tags |
| |
| YAFFS tags in YAFFS2 mode: |
| 4 bytes 32-bit chunk ID |
| 4 bytes 32-bit object ID |
| 2 bytes Number of data bytes in this chunk |
| 4 bytes Sequence number for this block |
| 3 bytes ECC on tags |
| 12 bytes ECC on data (3 bytes per 256 bytes of data) |
| |
| |
| Page allocation and garbage collection |
| |
| Pages are allocated sequentially from the currently selected block. |
| When all the pages in the block are filled, another clean block is |
| selected for allocation. At least two or three clean blocks are |
| reserved for garbage collection purposes. If there are insufficient |
| clean blocks available, then a dirty block ( ie one containing only |
| discarded pages) is erased to free it up as a clean block. If no dirty |
| blocks are available, then the dirtiest block is selected for garbage |
| collection. |
| |
| Garbage collection is performed by copying the valid data pages into |
| new data pages thus rendering all the pages in this block dirty and |
| freeing it up for erasure. I also like the idea of selecting a block |
| at random some small percentage of the time - thus reducing the chance |
| of wear differences. |
| |
| YAFFS is single-threaded. Garbage-collection is done as a parasitic |
| task of writing data. So each time some data is written, a bit of |
| pending garbage collection is done. More pages are garbage-collected |
| when free space is tight. |
| |
| |
| Flash writing |
| |
| YAFFS only ever writes each page once, complying with the requirements |
| of the most restricitve NAND devices. |
| |
| Wear levelling |
| |
| This comes as a side-effect of the block-allocation strategy. Data is |
| always written on the next free block, so they are all used equally. |
| Blocks containing data that is written but never erased will not get |
| back into the free list, so wear is levelled over only blocks which |
| are free or become free, not blocks which never change. |
| |
| |
| |
| Some helpful info |
| ----------------- |
| |
| Formatting a YAFFS device is simply done by erasing it. |
| |
| Making an initial filesystem can be tricky because YAFFS uses the OOB |
| and thus the bytes that get written depend on the YAFFS data (tags), |
| and the ECC bytes and bad block markers which are dictated by the |
| hardware and/or the MTD subsystem. The data layout also depends on the |
| device page size (512b or 2K). Because YAFFS is only responsible for |
| some of the OOB data, generating a filesystem offline requires |
| detailed knowledge of what the other parts (MTD and NAND |
| driver/hardware) are going to do. |
| |
| To make a YAFFS filesystem you have 3 options: |
| |
| 1) Boot the system with an empty NAND device mounted as YAFFS and copy |
| stuff on. |
| |
| 2) Make a filesystem image offline, then boot the system and use |
| MTDutils to write an image to flash. |
| |
| 3) Make a filesystem image offline and use some tool like a bootloader to |
| write it to flash. |
| |
| Option 1 avoids a lot of issues because all the parts |
| (YAFFS/MTD/hardware) all take care of their own bits and (if you have |
| put things together properly) it will 'just work'. YAFFS just needs to |
| know how many bytes of the OOB it can use. However sometimes it is not |
| practical. |
| |
| Option 2 lets MTD/hardware take care of the ECC so the filesystem |
| image just had to know which bytes to use for YAFFS Tags. |
| |
| Option 3 is hardest as the image creator needs to know exactly what |
| ECC bytes, endianness and algorithm to use as well as which bytes are |
| available to YAFFS. |
| |
| mkyaffs2image creates an image suitable for option 3 for the |
| particular case of yaffs2 on 2K page NAND with default MTD layout. |
| |
| mkyaffsimage creates an equivalent image for 512b page NAND (i.e. |
| yaffs1 format). |
| |
| Bootloaders |
| ----------- |
| |
| A bootloader using YAFFS needs to know how MTD is laying out the OOB |
| so that it can skip bad blocks. |
| |
| YAFFS Tracing |
| ------------- |