The B-Tree File System, usually called btrfs (also "butterfs" or "betterfs"), “is a new copy on write (CoW) filesystem for Linux aimed at implementing advanced features while focusing on fault tolerance, repair and easy administration. Jointly developed at Oracle, Red Hat, Fujitsu, Intel, SUSE, STRATO and many others, Btrfs is licensed under the GPL and open for contribution from anyone.”
Source: https://btrfs.wiki.kernel.org/index.php/Main_Page
Regardless of which type of RAID setup you would like to use, you will first need to find the /dev/sdX names of the disks you want to include, using fdisk or lsblk. The last step before you make your disks into a btrfs pool is to give each drive a single, preferably empty, partition using whatever utility gets the job done; cfdisk comes highly recommended. Once you have done this you can continue on to either the RAID 1 or RAID 10 section.
The -d switch sets the profile for data and -m sets it for metadata. The command below creates a RAID 1 array with both data and metadata mirrored across both disks.
# mkfs.btrfs -d raid1 -m raid1 /dev/disk1 /dev/disk2
RAID 10 requires four or more drives, added only in even numbers. The command below creates a RAID 10 array with both data and metadata striped and mirrored across all disks.
# mkfs.btrfs -d raid10 -m raid10 /dev/part1 /dev/part2 /dev/part3 /dev/part4...
Mounting a btrfs RAID is very simple: mount any one of the RAID members and the whole pool becomes available. One difference between the two examples below is that the RAID 1 pool was created on partitions, so lsblk shows a partition on each member, while the RAID 10 pool was created on whole disks, so no partitions appear.
Example:
user@box ~> lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0 111.8G  0 disk
└─sda1   8:1    0 111.8G  0 part /
sdb      8:16   0   3.7T  0 disk
└─sdb1   8:17   0   3.7T  0 part
sdc      8:32   0   3.7T  0 disk
└─sdc1   8:33   0   3.7T  0 part /mnt/pool0
sdd      8:48   0 931.5G  0 disk /mnt/pool1
sde      8:64   0 931.5G  0 disk
sdf      8:80   1 931.5G  0 disk
sdg      8:96   1 931.5G  0 disk
sdh      8:112  1 931.5G  0 disk
sdi      8:128  1 931.5G  0 disk
The fstab for the above configuration:
<file system>   <dir>        <type>  <options>           <dump>  <pass>
/dev/sdc1       /mnt/pool0   btrfs   auto,compress=lzo   0       0
/dev/sdd        /mnt/pool1   btrfs   auto,compress=lzo   0       0
As you can see, it does not matter which of the disks in the array is mounted. The mount options are also where data compression is chosen, in this case LZO. It is best to set the compression method before you begin moving data onto the drives, so that everything is compressed as it is written rather than having to be compressed later. Compression is well worth enabling as a default: why wouldn't you want to save space if you can afford the CPU time?
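Because /dev/sdX names can change between boots, it is often safer to reference the pool in fstab by its filesystem UUID instead, which blkid or btrfs filesystem show will display. A sketch with a placeholder UUID (substitute your pool's actual UUID):

```
<file system>       <dir>        <type>  <options>           <dump>  <pass>
UUID=<pool-uuid>    /mnt/pool0   btrfs   auto,compress=lzo   0       0
```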
To display the current space usage and drives in a pool:
# btrfs filesystem show
It is highly recommended to set up scheduled scrubbing as a means of error detection and correction on your btrfs pools; see the Scrubbing section below for more info.
Comment out the pool that needs the drive replaced in /etc/fstab, then power down the machine and physically remove the failing/failed hard drive.
Find out what the disk id is in /dev. In this example the drive that has just been inserted is /dev/sdh.
$ lsblk
Example:
NAME                    MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sdf                       8:80   1 931.5G  0 disk
sdd                       8:48   0 931.5G  0 disk /mnt/pool1
sdb                       8:16   0   3.7T  0 disk
└─sdb1                    8:17   0   3.7T  0 part
sdi                       8:128  1 931.5G  0 disk
sdg                       8:96   1 931.5G  0 disk
sde                       8:64   0 931.5G  0 disk
sdc                       8:32   0   3.7T  0 disk
└─sdc1                    8:33   0   3.7T  0 part /mnt/pool0
sda                       8:0    0  29.8G  0 disk
├─sda2                    8:2    0     1K  0 part
├─sda5                    8:5    0  29.3G  0 part
│ ├─waruwaru--vg-swap_1 253:1    0   980M  0 lvm  [SWAP]
│ └─waruwaru--vg-root   253:0    0  28.4G  0 lvm  /
└─sda1                    8:1    0   487M  0 part /boot
sdh                       8:112  1 931.5G  0 disk
List the btrfs filesystem to find out the devid number of the missing device.
# btrfs filesystem show
Example:
warning, device 5 is missing
warning devid 5 not found already
Label: none  uuid: 63e6a264-9951-4271-87f3-197a0c745036
        Total devices 6 FS bytes used 991.08GiB
        devid    1 size 931.51GiB used 332.00GiB path /dev/sdd
        devid    2 size 931.51GiB used 332.00GiB path /dev/sde
        devid    3 size 931.51GiB used 332.00GiB path /dev/sdf
        devid    4 size 931.51GiB used 332.00GiB path /dev/sdg
        devid    6 size 931.51GiB used 332.00GiB path /dev/sdi
        *** Some devices missing
Force the array to mount in degraded mode.
# mount -o degraded /dev/sdx /<mnt dir>
Start the drive rebuild process
# btrfs replace start <devid> /dev/sdx /<mnt dir>
Show the rebuild progress
# btrfs replace status /<mnt dir>
Should look something like this:
0.5% done, 0 write errs, 0 uncorr. read errs
Scrubbing is a checksum validation pass over all members of a RAID array, commonly known as a consistency check in hardware RAID setups. On FreeNAS, which uses ZFS, scrubs are part of the web GUI, which can schedule them on a regular basis. It is highly recommended to set up scheduled scrubs on your RAID arrays no matter what type of RAID is being used.
To start a scrub manually with btrfs run
# btrfs scrub start <btrfs mount point>
To see the status of a scrub (useful to monitor with the watch command):
# btrfs scrub status <btrfs mount point>
Create a cron job to run a scrub once a month. Note that scripts in /etc/cron.monthly must be executable to be picked up by run-parts:
# printf '#!/bin/sh\nbtrfs scrub start <btrfs mount point>\n' > /etc/cron.monthly/btrfsscrub
# chmod +x /etc/cron.monthly/btrfsscrub
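As an alternative to the cron job, systemd-based machines can schedule the scrub with a timer unit. A minimal sketch, assuming the pool is mounted at /mnt/pool0 (adjust the path to your own mount point):

```
# /etc/systemd/system/btrfs-scrub.service
[Unit]
Description=Monthly btrfs scrub of /mnt/pool0

[Service]
Type=oneshot
ExecStart=/usr/bin/btrfs scrub start -B /mnt/pool0

# /etc/systemd/system/btrfs-scrub.timer
[Unit]
Description=Run btrfs scrub monthly

[Timer]
OnCalendar=monthly
Persistent=true

[Install]
WantedBy=timers.target
```

Enable it with systemctl enable --now btrfs-scrub.timer. The -B flag makes btrfs scrub start run in the foreground, so the service stays active until the scrub finishes.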
The dread of any data hoarder: checksum errors. If you come across a checksum error during a scrub, you should be greeted with a kernel message like this:
[1377563.360789] BTRFS warning (device sdf): csum failed root -9 ino 7740 off 750673920 csum 0x2bc360f3 expected csum 0x98308d82 mirror 2
If you missed it while the scrub was running, you can check dmesg to see if anything happened.
$ dmesg
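The inode number of the affected file is the number after "ino" in the warning. As a sketch of pulling it out with standard tools (the log line below is the sample from above; the sed pattern is just one of several that would work):

```shell
#!/bin/sh
# Sample csum-failure line as it appears in dmesg (from the example above)
line='[1377563.360789] BTRFS warning (device sdf): csum failed root -9 ino 7740 off 750673920 csum 0x2bc360f3 expected csum 0x98308d82 mirror 2'

# Extract the number that follows "ino " -- the inode of the corrupted file
ino=$(printf '%s\n' "$line" | sed -n 's/.*ino \([0-9]*\).*/\1/p')
echo "$ino"   # prints 7740
```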
The corrupted file can be found with the find command, using the inode number from the warning.
$ find /<btrfs mount point> -xdev -inum <inode number>
There are ways you can try to fix it, but most likely you will be out of luck and will have to restore the file from a backup. Further info on recovering corrupted inodes:
sudo btrfs inspect-internal inode-resolve 15380 /home
man btrfs-inspect-internal says:
inode-resolve [-v] <ino> <path>
(needs root privileges)
resolve paths to all files with given inode number ino in a given
subvolume at path, ie. all hardlinks
Options
-v
verbose mode, print count of returned paths and ioctl()
return value