
Btrfs documentation and tricks

The B-Tree File System, usually called btrfs (also butterfs, betterfs, etc.), “is a new copy on write (CoW) filesystem for Linux aimed at implementing advanced features while focusing on fault tolerance, repair and easy administration. Jointly developed at Oracle, Red Hat, Fujitsu, Intel, SUSE, STRATO and many others, Btrfs is licensed under the GPL and open for contribution from anyone.”
Source: https://btrfs.wiki.kernel.org/index.php/Main_Page

How to Set Up RAID Using btrfs

Regardless of which RAID level you want, you first need to find the /dev/sdX names of the disks you plan to use, with fdisk or lsblk. The last step before turning the disks into a btrfs pool is to partition each drive with a single (preferably empty) partition using whatever utility gets the job done; cfdisk comes highly recommended. Once you have done this, continue on to either the RAID 1 or RAID 10 section.

RAID 1

The -d switch sets the profile for data and -m the profile for metadata. The command below creates a RAID 1 array with both data and metadata mirrored across both disks.

# mkfs.btrfs -d raid1 -m raid1 /dev/disk1 /dev/disk2

RAID 10

RAID 10 requires four or more drives, added in even numbers. The command below creates a RAID 10 array with both data and metadata striped and mirrored across all disks.

# mkfs.btrfs -d raid10 -m raid10 /dev/part1 /dev/part2 /dev/part3 /dev/part4...
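If you want to try these commands without dedicating real disks, a throwaway array can be built on sparse files and loop devices. This is only a sketch: the /tmp paths are illustrative, and the losetup and mkfs steps need root plus btrfs-progs installed.

```shell
# Create four 1 GiB sparse files to stand in for disks (no root needed).
mkdir -p /tmp/btrfs-lab
for i in 1 2 3 4; do
    truncate -s 1G "/tmp/btrfs-lab/disk$i.img"
done

# The remaining steps require root:
#   losetup -f --show /tmp/btrfs-lab/disk1.img   # prints e.g. /dev/loop0; repeat for each image
#   mkfs.btrfs -d raid10 -m raid10 /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3
```

Loop devices behave like ordinary block devices, so the mounting and scrubbing commands in the rest of this page work on them too.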

Mounting

Mounting a btrfs RAID array is very simple: mount any one of the RAID members and the whole mirrored pool is available. There is one difference between RAID 1 and RAID 10: in this setup btrfs shows the partitions on the member drives for RAID 1 but not for RAID 10.

Example:

user@box ~> lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0 111.8G  0 disk
└─sda1   8:1    0 111.8G  0 part /
sdb      8:16   0   3.7T  0 disk
└─sdb1   8:17   0   3.7T  0 part
sdc      8:32   0   3.7T  0 disk
└─sdc1   8:33   0   3.7T  0 part /mnt/pool0
sdd      8:48   0 931.5G  0 disk /mnt/pool1
sde      8:64   0 931.5G  0 disk
sdf      8:80   1 931.5G  0 disk
sdg      8:96   1 931.5G  0 disk
sdh      8:112  1 931.5G  0 disk
sdi      8:128  1 931.5G  0 disk

The fstab for the above configuration:

<file system>      <dir>           <type>          <options>                       <dump>  <pass>
/dev/sdc1          /mnt/pool0      btrfs           auto,compress=lzo               0       0
/dev/sdd           /mnt/pool1      btrfs           auto,compress=lzo               0       0

As you can see, it does not matter which of the disks in the RAID is mounted. This is also where the mount option for data compression is chosen, in this case LZO. It is best to set the compression method before you begin moving data to the drives so that existing files do not have to be recompressed later. Compression is an easy win: why wouldn't you want to save space if you can afford the CPU time?
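If you only enabled compression after data was already on the pool, existing files can still be recompressed in place. A sketch, assuming the /mnt/pool0 mount point from the fstab above (both commands need root):

```shell
# Turn on LZO compression for future writes without unmounting:
mount -o remount,compress=lzo /mnt/pool0

# Rewrite existing files so they get compressed too (slow on large pools):
btrfs filesystem defragment -r -clzo /mnt/pool0
```

Be aware that defragmenting rewrites files, which breaks reflink/snapshot sharing and can temporarily use extra space.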

Monitoring

To display the current space usage and drives in a pool:

# btrfs filesystem show

It is highly recommended to set up scheduled scrubbing as a means of error correction on your btrfs pools; see the Scrubbing section below for more info.

Replacing a Hard Drive

Comment out the disk pool that needs the drive replaced in /etc/fstab. Then power down the machine, physically remove the failing/failed hard drive, and insert the replacement.

Find out what the disk id is in /dev. In this example the drive that has just been inserted is /dev/sdh.

$ lsblk

Example:

NAME                    MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sdf                       8:80   1 931.5G  0 disk 
sdd                       8:48   0 931.5G  0 disk /mnt/pool1
sdb                       8:16   0   3.7T  0 disk 
└─sdb1                    8:17   0   3.7T  0 part 
sdi                       8:128  1 931.5G  0 disk 
sdg                       8:96   1 931.5G  0 disk 
sde                       8:64   0 931.5G  0 disk 
sdc                       8:32   0   3.7T  0 disk 
└─sdc1                    8:33   0   3.7T  0 part /mnt/pool0
sda                       8:0    0  29.8G  0 disk 
├─sda2                    8:2    0     1K  0 part 
├─sda5                    8:5    0  29.3G  0 part 
│ ├─waruwaru--vg-swap_1 253:1    0   980M  0 lvm  [SWAP]
│ └─waruwaru--vg-root   253:0    0  28.4G  0 lvm  /
└─sda1                    8:1    0   487M  0 part /boot
sdh                       8:112  1 931.5G  0 disk

List the btrfs filesystems to find the devid number of the missing device:

# btrfs filesystem show

Example:

warning, device 5 is missing
warning devid 5 not found already
Label: none  uuid: 63e6a264-9951-4271-87f3-197a0c745036
	Total devices 6 FS bytes used 991.08GiB
	devid    1 size 931.51GiB used 332.00GiB path /dev/sdd
	devid    2 size 931.51GiB used 332.00GiB path /dev/sde
	devid    3 size 931.51GiB used 332.00GiB path /dev/sdf
	devid    4 size 931.51GiB used 332.00GiB path /dev/sdg
	devid    6 size 931.51GiB used 332.00GiB path /dev/sdi
	*** Some devices missing

Force the array to mount in degraded mode.

# mount -o degraded /dev/sdx /<mnt dir>

Start the drive rebuild process, where <devid> is the id of the missing device (5 in the example above) and /dev/sdx is the new drive:

# btrfs replace start <devid> /dev/sdx /<mnt dir>

Show the rebuild progress

# btrfs replace status /<mnt dir>

Should look something like this:

0.5% done, 0 write errs, 0 uncorr. read errs

Scrubbing

Scrubbing is a type of checksum validation between the members of a RAID array, commonly known as a consistency check in hardware RAID setups. In FreeNAS, which uses ZFS, scrubs are part of the web GUI, which can schedule them on a regular basis. It is highly recommended to set up scheduled scrubs on your RAID arrays no matter what type of RAID is being used.

To start a scrub manually with btrfs run

# btrfs scrub start <btrfs mount point>

To see the status of a scrub (useful to monitor with the watch command):

# btrfs scrub status <btrfs mount point>

Create a cron job to run a scrub on the first of every month. Note that scripts in /etc/cron.monthly need a shebang and must be executable:

# printf '#!/bin/sh\nbtrfs scrub start <btrfs mount point>\n' > /etc/cron.monthly/btrfsscrub
# chmod +x /etc/cron.monthly/btrfsscrub

Fixing Errors

The dread of any data hoarder: checksum errors. If you come across a checksum error during a scrub, you should be greeted with a message like this:

[1377563.360789] BTRFS warning (device sdf): csum failed root -9 ino 7740 off 750673920 csum 0x2bc360f3 expected csum 0x98308d82 mirror 2

If you missed the message while the scrub was running, you can check dmesg to see if anything happened.

$ dmesg

The corrupted file can be found using the find command, where <inode number> is the ino value from the error message.

$ find /<btrfs mount point> -xdev -inum <inode number>
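To see how the inode lookup behaves without a corrupted pool, here is a self-contained sketch on a scratch directory (the file name is made up for illustration):

```shell
# Create a scratch file, read its inode number, then find it again by inode.
tmpdir=$(mktemp -d)
touch "$tmpdir/victim.dat"
ino=$(stat -c %i "$tmpdir/victim.dat")      # inode number, like the ino in the csum error
found=$(find "$tmpdir" -xdev -inum "$ino")  # same find invocation as above
echo "$found"
rm -r "$tmpdir"
```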

There are probably ways of fixing it, but most likely you will be out of luck and have to restore from a backup. For further info on resolving the corrupted inode to a path:

sudo btrfs inspect-internal inode-resolve 15380 /home

man btrfs-inspect-internal says:

   inode-resolve [-v] <ino> <path>
       (needs root privileges)

       resolve paths to all files with given inode number ino in a given
       subvolume at path, ie. all hardlinks

       Options

       -v
           verbose mode, print count of returned paths and ioctl()
           return value
filesystems/btrfs.txt · Last modified: 2021/06/18 16:36 by 127.0.0.1