From LedHed's Wiki

REPOST

The article below is a repost of an article I found on the internet. All credit goes to the original author.


Tonight I had to replace a disk in my FreeNAS box that was completely dead, as in not detected by the BIOS. Below are the steps to replace a completely failed disk. The FreeNAS docs have an article on replacing a failed disk, but it does not cover replacing a disk that is no longer detected by the system.

A zpool status shows the disk as unavailable:

[root@freenas] ~# zpool status -v zpool0
 pool: zpool0
state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
       the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
  see: http://illumos.org/msg/ZFS-8000-2Q
 scan: scrub repaired 0 in 6h31m with 0 errors on Sun Jul 28 06:31:27 2013
config:

       NAME                                            STATE     READ WRITE CKSUM
       zpool0                                          DEGRADED     0     0     0
         mirror-0                                      DEGRADED     0     0     0
           3282272283788900661                         UNAVAIL      0     0     0  was /dev/gptid/3937b1c2-fec4-11d5-a8b2-001f2961db
           gptid/398a9808-fec4-11d5-a8b2-001f2961db70  ONLINE       0     0     0
         mirror-1                                      ONLINE       0     0     0
           gptid/998b8dc4-ff2b-11d5-a8b2-001f2961db70  ONLINE       0     0     0
           gptid/99e507d9-ff2b-11d5-a8b2-001f2961db70  ONLINE       0     0     0

errors: No known data errors
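
If you would rather script this check than read the table by eye, the UNAVAIL rows can be filtered out of the status output. This is only a sketch: the function name unavail_devices is made up for illustration, and it assumes the column layout shown above (device name first, state second).

```shell
# Hypothetical helper: print the name column of every device whose
# STATE column reads UNAVAIL. Reads `zpool status` output on stdin.
unavail_devices() {
    awk '$2 == "UNAVAIL" { print $1 }'
}

# Usage (on the NAS itself):
#   zpool status -v zpool0 | unavail_devices
```

For the pool above this would print the numeric GUID that ZFS shows in place of the missing gptid.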

Next, I took a screenshot of all my disks in the webGUI and noted the serial numbers of the disks in the pool. The failed disk does not show up in that list, which is how we identify the physical disk to pull. Shut down the server and pull one disk at a time until you find the one whose serial number is not in your list. Replace it with the new disk, noting the new disk's serial number. Then power on the system and log in via SSH.
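
The serial number comparison can also be done with a small helper instead of a screenshot. The sketch below assumes you have typed the serial numbers from the webGUI into a text file, one per line. The function name serial_is_known, the file name serials.txt and the example serial are all hypothetical.

```shell
# Hypothetical helper: succeed if SERIAL ($1) appears as a whole line in
# LISTFILE ($2), meaning the system can still see that disk.
serial_is_known() {
    # -F: fixed string, -x: match the whole line, -q: no output
    grep -Fxq -- "$1" "$2"
}

# Usage: serials.txt holds one serial per line, copied from the webGUI.
#   if serial_is_known "EXAMPLE-SERIAL" serials.txt; then
#       echo "still detected, put it back"
#   else
#       echo "not in the list: this is the failed disk"
#   fi
```

Matching whole lines avoids a false hit when one serial number happens to be a prefix of another.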

Next, offline the failed disk:

[root@freenas] ~# zpool offline zpool0 /dev/gptid/3937b1c2-fec4-11d5-a8b2-001f2961db70

Check the status of the disk to ensure it's offline:

[root@freenas] ~# zpool status -v zpool0
 pool: zpool0
state: DEGRADED
status: One or more devices has been taken offline by the administrator.
       Sufficient replicas exist for the pool to continue functioning in a
       degraded state.
action: Online the device using 'zpool online' or replace the device with
       'zpool replace'.
 scan: scrub repaired 0 in 6h31m with 0 errors on Sun Jul 28 06:31:27 2013
config:

       NAME                                            STATE     READ WRITE CKSUM
       zpool0                                          DEGRADED     0     0     0
         mirror-0                                      DEGRADED     0     0     0
           3282272283788900661                         OFFLINE      0     0     0  was /dev/gptid/3937b1c2-fec4-11d5-a8b2-001f2961db
           gptid/398a9808-fec4-11d5-a8b2-001f2961db70  ONLINE       0     0     0
         mirror-1                                      ONLINE       0     0     0
           gptid/998b8dc4-ff2b-11d5-a8b2-001f2961db70  ONLINE       0     0     0
           gptid/99e507d9-ff2b-11d5-a8b2-001f2961db70  ONLINE       0     0     0


errors: No known data errors
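
The same check can be made without reading the whole table. The helper below is a sketch, not part of the original procedure: device_state is a made-up name, and it assumes the device name or GUID is the first column and the state is the second, as in the output above.

```shell
# Hypothetical helper: print the STATE column of the first row whose
# first column equals the given device name or GUID ($1).
# Reads `zpool status` output on stdin.
device_state() {
    awk -v dev="$1" '$1 == dev && !seen { print $2; seen = 1 }'
}

# Usage (on the NAS itself):
#   state=$(zpool status -v zpool0 | device_state 3282272283788900661)
#   [ "$state" = "OFFLINE" ] && echo "disk is offline"
```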

Now replace the disk in the pool with your new disk. You can use the webGUI to get the block device name, looking for the serial number of the new device you noted above:

[root@freenas] ~# zpool replace zpool0 /dev/gptid/3937b1c2-fec4-11d5-a8b2-001f2961db70 /dev/ada2

Now just online the disk and ensure the status says the new disk is resilvering:

[root@freenas] ~# zpool online zpool0 /dev/ada2
[root@freenas] ~# zpool status -v zpool0
 pool: zpool0
state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
       continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scan: resilver in progress since Tue Sep  3 21:32:02 2013
       28.1M scanned out of 3.53T at 1.17M/s, (scan is slow, no estimated time)
       15.4M resilvered, 0.00% done
config:

       NAME                                            STATE     READ WRITE CKSUM
       zpool0                                          DEGRADED     0     0     0
         mirror-0                                      DEGRADED     0     0     0
           replacing-0                                 DEGRADED     0     0     0
             3282272283788900661                       OFFLINE      0     0     0  was /dev/gptid/3937b1c2-fec4-11d5-a8b2-001f2961db70
             ada2                                      ONLINE       0     0     0  (resilvering)
           gptid/398a9808-fec4-11d5-a8b2-001f2961db70  ONLINE       0     0     0
         mirror-1                                      ONLINE       0     0     0
           gptid/998b8dc4-ff2b-11d5-a8b2-001f2961db70  ONLINE       0     0     0
           gptid/99e507d9-ff2b-11d5-a8b2-001f2961db70  ONLINE       0     0     0


errors: No known data errors
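
A resilver of this size takes a while, so it can be handy to poll the progress rather than rerun the command by hand. The following is a rough sketch that assumes the status text looks like the output above (a line ending in "% done" and the phrase "resilver in progress"). Other ZFS versions may word these lines differently. The function names resilver_percent and wait_for_resilver are made up for illustration.

```shell
# Hypothetical helper: print the percentage from the "... resilvered, N% done"
# line of `zpool status` output read on stdin. Prints nothing if no such line.
resilver_percent() {
    awk '/resilvered,/ && /% done/ && !seen {
        for (i = 1; i <= NF; i++)
            if ($i ~ /%$/) { sub(/%$/, "", $i); print $i; seen = 1; break }
    }'
}

# Hypothetical helper: poll pool $1 once a minute until the status no
# longer reports a resilver in progress.
wait_for_resilver() {
    while zpool status "$1" | grep -q 'resilver in progress'; do
        pct=$(zpool status "$1" | resilver_percent)
        echo "resilver at ${pct:-?}%"
        sleep 60
    done
    echo "resilver finished"
}

# Usage (on the NAS itself):
#   wait_for_resilver zpool0
```

Once the loop exits, run zpool status again and confirm the pool state is back to ONLINE.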


References

http://dcprom0.blogspot.com/2013/09/freenas-replacing-failed-disk.html