OpenBSD disk encryption

11 January 2015

Introduction

Although there are many tutorials on how to set up OpenBSD disk encryption, there is only limited information on the encryption itself (design, algorithms, etc.). Historically, OpenBSD used vnd(4) to implement this feature. Currently, softraid(4) is recommended.

This post will review the design of the current implementation.

Documentation

From the softraid documentation:

The softraid device emulates a Host Bus Adapter (HBA) that provides RAID and other I/O related services. [...] softraid supports a number of disciplines. A discipline is a collection of functions that provides specific I/O functionality.

Although this device was initially developed for RAID 0/1, there is also a CRYPTO discipline that may be used for encryption. Each discipline takes a given number of chunks (e.g., two for RAID 1); CRYPTO takes only one. This is about all the information the man page has. There is no mention of how the encryption operates or which algorithm or block cipher mode can be used.

In userland, bioctl(8) is used to interact with softraid. The man page gives us more details on the available options such as:

  • -k to specify a key disk,
  • -p to specify a passfile and
  • -r to specify the number of iterations for PBKDF2.

Since 4.7, OpenBSD has supported storing the decryption key on a separate disk (e.g., a USB stick). In this post, we will focus only on the passphrase method.

Source code

The kernel part of the encryption lives mainly in the sys/dev/softraid* files. softraidvar.h contains all the structures used by the discipline, including the metadata written at the beginning of the partition.

The code is well commented, and references to PBKDF2, AES-XTS, AES-ECB and HMAC-SHA1 can be found. The most useful comment is:

/*
* Check that HMAC-SHA1_k(decrypted scm_key) == sch_mac, where
* k = SHA1(masking key)
*/

Each volume created with softraid contains a metadata structure at the beginning of the partition called sr_metadata. This metadata includes a magic number, a version number, the type of discipline, a header checksum, etc. It also includes the number of associated chunks:

struct sr_metadata {
  struct sr_meta_invariant {
    /* do not change order of ssd_magic, ssd_version */
    u_int64_t ssd_magic;  /* magic id */
#define SR_MAGIC    0x4d4152436372616dLLU
    u_int32_t ssd_version;  /* meta data version */
    u_int32_t ssd_vol_flags;  /* volume specific flags. */
    struct sr_uuid  ssd_uuid; /* unique identifier */

    /* chunks */
    u_int32_t ssd_chunk_no; /* number of chunks */
    u_int32_t ssd_chunk_id; /* chunk identifier */

    /* optional */
    u_int32_t ssd_opt_no; /* nr of optional md elements */
    u_int32_t ssd_pad;

    /* volume metadata */
    u_int32_t ssd_volid;  /* volume id */
    u_int32_t ssd_level;  /* raid level */
    int64_t   ssd_size; /* virt disk size in blocks */
    char    ssd_vendor[8];  /* scsi vendor */
    char    ssd_product[16];/* scsi product */
    char    ssd_revision[4];/* scsi revision */
    /* optional volume members */
    u_int32_t ssd_strip_size; /* strip size */
  } _sdd_invariant;
#define ssdi      _sdd_invariant
  /* MD5 of invariant metadata */
  u_int8_t    ssd_checksum[MD5_DIGEST_LENGTH];
  char      ssd_devname[32];/* /dev/XXXXX */
  u_int32_t   ssd_meta_flags;
#define SR_META_DIRTY   0x1
  u_int32_t   ssd_data_offset;
  u_int64_t   ssd_ondisk; /* on disk version counter */
  int64_t     ssd_rebuild;  /* last block of rebuild */
} __packed;
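
The "MD5 of invariant metadata" comment suggests a simple consistency check that can be run against a dump. The following is a minimal sketch, assuming a little-endian machine, a 16-byte struct sr_uuid (its definition is not shown here) and a checksum computed over exactly the invariant part. The function name is hypothetical.

```python
import hashlib
import struct

# Packed little-endian layout of sr_meta_invariant, transcribed from the
# struct above. Assumption: struct sr_uuid is 16 bytes.
INVARIANT_FMT = "<QII16sIIIIIIq8s16s4sI"
INVARIANT_LEN = struct.calcsize(INVARIANT_FMT)
MD5_DIGEST_LENGTH = 16


def invariant_checksum_ok(metadata: bytes) -> bool:
    """Compare ssd_checksum with the MD5 of the invariant metadata."""
    invariant = metadata[:INVARIANT_LEN]
    stored = metadata[INVARIANT_LEN:INVARIANT_LEN + MD5_DIGEST_LENGTH]
    return hashlib.md5(invariant).digest() == stored
```

Passing the bytes read at the start of sr_metadata should return True on an intact volume, provided the assumptions above hold.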

This header is followed by the sr_meta_chunk structure, which includes the chunk id, name, size, etc.:

struct sr_meta_chunk {
  struct sr_meta_chunk_invariant {
    u_int32_t scm_volid;  /* vd we belong to */
    u_int32_t scm_chunk_id; /* chunk id */
    char    scm_devname[32];/* /dev/XXXXX */
    int64_t   scm_size; /* size of partition in blocks*/
    int64_t   scm_coerced_size; /* coerced sz of part in blk*/
    struct sr_uuid  scm_uuid; /* unique identifier */
  } _scm_invariant;
#define scmi      _scm_invariant
  /* MD5 of invariant chunk metadata */
  u_int8_t    scm_checksum[MD5_DIGEST_LENGTH];
  u_int32_t   scm_status; /* use bio bioc_disk status */
} __packed;

Finally, an optional structure may be appended to sr_meta_chunk, depending on the discipline used. For the passphrase-based CRYPTO, it is sr_meta_crypto:

struct sr_meta_crypto {
  struct sr_meta_opt_hdr  scm_hdr;
  u_int32_t   scm_alg;  /* vol crypto algorithm */
#define SR_CRYPTOA_AES_XTS_128  1
#define SR_CRYPTOA_AES_XTS_256  2
  u_int32_t   scm_flags;  /* key & kdfhint valid */
#define SR_CRYPTOF_INVALID  (0)
#define SR_CRYPTOF_KEY    (1<<0)
#define SR_CRYPTOF_KDFHINT  (1<<1)
  u_int32_t   scm_mask_alg; /* disk key masking crypt alg */
#define SR_CRYPTOM_AES_ECB_256  1
  u_int32_t   scm_pad1;
  u_int8_t    scm_reserved[64];

  /* symmetric keys used for disk encryption */
  u_int8_t    scm_key[SR_CRYPTO_MAXKEYS][SR_CRYPTO_KEYBYTES];
  /* hint to kdf algorithm (opaque to kernel) */
  u_int8_t    scm_kdfhint[SR_CRYPTO_KDFHINTBYTES];

  u_int32_t   scm_check_alg;  /* key chksum algorithm */
#define SR_CRYPTOC_HMAC_SHA1    1
  u_int32_t   scm_pad2;
  union {
    struct sr_crypto_chk_hmac_sha1  chk_hmac_sha1;
    u_int8_t      chk_reserved2[64];
  }     _scm_chk;
#define chk_hmac_sha1 _scm_chk.chk_hmac_sha1
} __packed;

struct sr_crypto_chk_hmac_sha1 {
  u_int8_t  sch_mac[20];
} __packed;

That structure includes which type of encryption is used for the disk blocks (AES-XTS 128 or 256). There are also references to a key masking algorithm and KDF hints.

Structure Matching

All these structures will be written to the disk. As an exercise, we may match the structure against a dump of the disk. With a bit of dd and hexdump:

[Figure: OpenBSD softraid dump 1]

sr_metadata from 0xa000 to 0xa0a8:

  • The marcCRAM string, which is the ssd_magic (a reference to Marco Peereboom, the original developer).
  • OPENBSD, the ssd_vendor
  • SR CRYPTO, the ssd_product
  • 005, the ssd_revision
  • sd0, the ssd_devname
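
The marcCRAM string is simply SR_MAGIC stored in little-endian byte order. A short sketch, assuming a little-endian machine such as amd64 or i386 (the function name is only illustrative):

```python
import struct

SR_MAGIC = 0x4D4152436372616D  # value of SR_MAGIC in the header quoted above


def magic_as_stored(magic: int = SR_MAGIC, little_endian: bool = True) -> bytes:
    """Return the 8 bytes of ssd_magic as they would appear in a hexdump."""
    return struct.pack("<Q" if little_endian else ">Q", magic)


print(magic_as_stored())                      # byte order seen in the dump
print(magic_as_stored(little_endian=False))   # byte order of the constant as written
```

Read in the order the constant is written, the bytes spell MARCcram; the dump shows them reversed.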

sr_meta_chunk from 0xa0a8 to 0xa104:

  • wda0, the scm_devname
  • 0x3ff9b0, the scm_size
  • 0x4fca97..3bfd72, the scm_checksum (MD5)

sr_meta_opt_hdr from 0xa104 to 0xa11c

sr_meta_crypto from 0xa11c to 0xaa88:

  • 0x2 (SR_CRYPTOA_AES_XTS_256), the scm_alg
  • 0x1 (SR_CRYPTOM_AES_ECB_256), the scm_mask_alg
  • The encryption keys, scm_key, from 0xa16c to 0xa96c
  • scm_kdfhint, which contains the PBKDF2 information (0x2000 rounds, salt 128 bytes from 0xa978 to 0xa9f8)
  • 0xbc5105..ac81ad, the chk_hmac_sha1
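
The offsets above can be cross-checked against the struct definitions. The sketch below rebuilds them with struct.calcsize. It assumes packed little-endian layouts, a 16-byte struct sr_uuid, and a 24-byte sr_meta_opt_hdr (inferred from the 0xa104 to 0xa11c range, since that definition is not quoted in this post).

```python
import struct

BASE = 0xA000  # offset of sr_metadata in the dump discussed above

# Layouts transcribed from the structs quoted earlier (assumptions noted above).
SR_METADATA_FMT = "<QII16sIIIIIIq8s16s4sI" + "16s32sIIQq"
SR_META_CHUNK_FMT = "<II32sqq16s" + "16sI"
SR_META_OPT_HDR_LEN = 24
CRYPTO_HEAD_FMT = "<IIII64s"   # scm_alg .. scm_reserved
SCM_KEY_LEN = 32 * 64          # 32 disk keys of 512 bits each


def metadata_offsets(base: int = BASE) -> dict:
    """Return the start offset of each on-disk element."""
    off = {"sr_metadata": base}
    off["sr_meta_chunk"] = base + struct.calcsize(SR_METADATA_FMT)
    off["sr_meta_opt_hdr"] = off["sr_meta_chunk"] + struct.calcsize(SR_META_CHUNK_FMT)
    off["sr_meta_crypto"] = off["sr_meta_opt_hdr"] + SR_META_OPT_HDR_LEN
    off["scm_key"] = off["sr_meta_crypto"] + struct.calcsize(CRYPTO_HEAD_FMT)
    off["scm_kdfhint"] = off["scm_key"] + SCM_KEY_LEN
    return off


for name, offset in metadata_offsets().items():
    print(f"{name:16s} {offset:#x}")
```

The computed values can then be compared with the ranges listed above.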

Design

Now that we have a good understanding of what is being stored, we may review softraid_crypto.c and try to deduce how the full encryption operates:

  1. A key is derived from the user's passphrase using PBKDF2; the salt and the number of iterations are stored on the disk itself. The result is called the 'masking key'. Only the number of iterations can be changed by the end user (see option -r of bioctl).
  2. The masking key is used to decrypt the disk keys using AES-ECB-256. There are 32 keys, each 512 bits long (AES-XTS-256 uses two 256-bit keys).
  3. Each disk key is used to decrypt a portion of the disk (about 0.5TB per key) using AES-XTS-256. Although the headers define another algorithm (AES-XTS-128), it cannot be selected from userland.
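
The first step can be sketched with Python's standard library. This is an illustration rather than the bioctl code: it assumes PBKDF2 with HMAC-SHA1 (as reported by the John the Ripper format shown later) and a 32-byte output, matching the AES-256 masking algorithm. The function name is hypothetical.

```python
import hashlib

MASKING_KEY_BYTES = 32  # assumption: one AES-256 key


def derive_masking_key(passphrase: bytes, salt: bytes, rounds: int) -> bytes:
    """Derive the masking key from the passphrase (PBKDF2-HMAC-SHA1)."""
    return hashlib.pbkdf2_hmac("sha1", passphrase, salt, rounds,
                               dklen=MASKING_KEY_BYTES)


# Example with the parameters seen in the dump: 0x2000 rounds, 128-byte salt.
# The salt here is a placeholder, not the one from the dump.
key = derive_masking_key(b"correct horse", bytes(128), 0x2000)
print(len(key))
```

Decrypting the disk keys with this masking key needs an AES implementation, which is outside the Python standard library and therefore left out here.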

Since nothing at this point proves that the user has provided the correct passphrase, a validation is performed:

  1. The masking key is hashed (SHA1).
  2. The HMAC of the decrypted disk keys is calculated using the hash of the masking key.
  3. This HMAC is compared to chk_hmac_sha1.
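
These three steps map directly onto the kernel comment quoted earlier. A minimal sketch, assuming the disk keys have already been decrypted and are passed as one contiguous byte string (the function and parameter names are illustrative):

```python
import hashlib
import hmac


def passphrase_is_valid(masking_key: bytes, decrypted_keys: bytes,
                        sch_mac: bytes) -> bool:
    """Check HMAC-SHA1_k(decrypted scm_key) == sch_mac, k = SHA1(masking key)."""
    k = hashlib.sha1(masking_key).digest()
    mac = hmac.new(k, decrypted_keys, hashlib.sha1).digest()
    # Constant-time comparison, a habit worth keeping even in a sketch.
    return hmac.compare_digest(mac, sch_mac)
```

A wrong passphrase yields a different masking key, hence different "decrypted" keys and a different HMAC key, so the comparison fails.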

Some extra notes from reading the code:

  • The disk keys are randomly generated using arc4random_buf. Since random data is very unlikely to contain repeated 16-byte blocks, the usual weakness of ECB mode does not apply and its use here is appropriate.
  • The current implementation does not use all the disk keys, only the first one:

XXX - this does not handle the case where the read/write spans across a different key blocks (e.g. 0.5TB boundary). Currently this is already broken by the use of scr_key[0] below.
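
To make the boundary problem in that comment concrete, here is a sketch of the intended block-to-key mapping. It assumes 512-byte blocks and 2^30 blocks per key, which is consistent with the 0.5TB figure; the constant and function names are illustrative rather than taken from the source.

```python
BLOCK_BYTES = 512      # assumption: 512-byte disk blocks
KEY_BLKSHIFT = 30      # assumption: 2**30 blocks per key, i.e. 0.5TB
MAX_KEYS = 32


def key_index(blkno: int) -> int:
    """Index of the disk key that would cover a given block."""
    return blkno >> KEY_BLKSHIFT


def spans_two_keys(blkno: int, nblocks: int) -> bool:
    """True if an I/O of nblocks starting at blkno crosses a key boundary."""
    return key_index(blkno) != key_index(blkno + nblocks - 1)


bytes_per_key = (1 << KEY_BLKSHIFT) * BLOCK_BYTES
print(bytes_per_key, MAX_KEYS * bytes_per_key)  # bytes per key, bytes for all keys
```

Under these assumptions, a volume smaller than 0.5TB only ever needs key 0, so the limitation in the comment matters only for larger volumes.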

Relative Strength

It is interesting to compare how OpenBSD disk encryption stands against that of other operating systems. From a design perspective, the use of AES-XTS for the main encryption is quite standard. LUKS, FileVault2, TrueCrypt and geli (FreeBSD) all use this algorithm and block mode by default.

The key derivation function (PBKDF2) is again standard. The interesting part is the number of iterations, which directly impacts the resistance to brute-force attacks.

On my home install, my LUKS volume is configured for about 400,000 iterations, which is significantly higher than the OpenBSD default (8,192, the 0x2000 rounds seen in the dump above).

There are other interesting differences in the design details:

  • On OpenBSD, once the passphrase has been run through PBKDF2, the cost of the remaining operations is negligible. By design, there is only one slot for the master key (the masking key above), which unlocks the 32 disk keys.
  • On LUKS, the master key is rehashed with another PBKDF2 in order to validate its integrity. There are also 8 slots by default for the master key, but only one disk key. Another difference is the anti-forensic filter (AF-split), which spreads the encrypted master key over a large blob (~256kB). This mechanism affects brute-force attacks, as a complete copy of the header is necessary.

From a practical perspective, here are speed tests run with John the Ripper against both OpenBSD softraid and LUKS (the test vectors mimic real-world configurations):

[tweek@sec0 run]$ ./john --test -format=luks
Will run 4 OpenMP threads
Benchmarking: LUKS [PBKDF2-SHA1 8x SSE2]... (4xOMP) Warning: No dupe-salt detection
DONE
Raw:    49.6 c/s real, 13.0 c/s virtual

[tweek@sec0 run]$ ./john --test -format=openbsd-softraid
Will run 4 OpenMP threads
Benchmarking: OpenBSD-SoftRAID (8192 iterations) [PBKDF2-SHA1 8x SSE2]... (4xOMP) DONE
Speed for cost 1 (iteration count) of 8192
Raw:    1088 c/s real, 274 c/s virtual
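
A quick back-of-the-envelope calculation puts these two figures in perspective. The sketch below assumes that cracking cost scales linearly with the PBKDF2 iteration count and ignores the LUKS-specific extra work; the function names are illustrative.

```python
LUKS_CPS = 49.6          # candidates per second, from the benchmark above
SOFTRAID_CPS = 1088.0    # candidates per second, from the benchmark above
SOFTRAID_ROUNDS = 8192


def speed_ratio() -> float:
    """How many times faster softraid candidates are tested than LUKS ones."""
    return SOFTRAID_CPS / LUKS_CPS


def rounds_to_match_luks() -> int:
    """Rounds needed for softraid to reach the LUKS rate (linear scaling)."""
    return round(SOFTRAID_ROUNDS * speed_ratio())


def days_to_exhaust(candidates: int, cps: float) -> float:
    """Days needed to try every candidate at a given rate."""
    return candidates / cps / 86400


print(f"ratio: {speed_ratio():.1f}x, rounds to match: {rounds_to_match_luks()}")
print(f"10^9 candidates: {days_to_exhaust(10**9, SOFTRAID_CPS):.0f} days (softraid), "
      f"{days_to_exhaust(10**9, LUKS_CPS):.0f} days (LUKS)")
```

On this hardware, a softraid passphrase is tested roughly twenty times faster than a LUKS one.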

Conclusion

If using OpenBSD on a recent computer, bump up the number of PBKDF2 iterations when creating the volume.
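
One way to pick a value is to time PBKDF2 on the target machine and scale to an acceptable unlock delay. The sketch below uses Python's hashlib as a stand-in, so it gives only a rough order of magnitude: the derivation used when the volume is unlocked is a different implementation and may run at a different speed. The function name and the probe size are arbitrary choices.

```python
import hashlib
import os
import time


def calibrate_rounds(target_seconds: float = 1.0, probe_rounds: int = 20000) -> int:
    """Estimate how many PBKDF2-HMAC-SHA1 rounds fit in target_seconds."""
    salt = os.urandom(128)
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha1", b"benchmark passphrase", salt, probe_rounds, dklen=32)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero reading
    return max(1, int(probe_rounds * target_seconds / elapsed))


print(calibrate_rounds())
```

The resulting number is a candidate for the -r option of bioctl when creating the volume.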

Extra

As currently implemented, the softraid cracking algorithm in JtR reproduces all the steps of the standard implementation (i.e., PBKDF2, AES over disk keys, SHA1 of masking key and HMAC of decrypted keys). If the HMAC matches, we can be sure that the tested passphrase is correct.

However, a shortcut version may be possible. Instead of hashing the masking key and HMAC'ing the decrypted keys, we may stop after the decryption of the first disk key and try to decrypt the beginning of the partition. Chances are that we will land on the MBR and recognise some code or a string matching the boot code.

This method has two drawbacks. First, we have to confirm the guessed passphrase, so our savings depend heavily on the false-positive rate of our MBR identification. Second, the gain from avoiding the last stages is not as attractive as it seems: we only avoid the decryption of the remaining keys and a few hash operations, which are all negligible compared to the thousands of iterations of PBKDF2.

To confirm this, I modified the code to implement that approach and ran the test suite. A gain of about 3% was observed, which is probably not worth the increase in false positives and potential false negatives.
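
To give an idea of the false-positive side of this trade-off, here is a sketch of the simplest possible recogniser. It is a stand-in for the check described above, not the code I tested: it assumes the first decrypted sector is a classic MBR and only looks at the 0x55 0xAA signature in its last two bytes.

```python
SECTOR_BYTES = 512
MBR_SIGNATURE = b"\x55\xaa"  # last two bytes of a classic MBR sector


def looks_like_mbr(sector: bytes) -> bool:
    """Cheap plausibility test on a decrypted first sector."""
    return len(sector) == SECTOR_BYTES and sector[510:512] == MBR_SIGNATURE


def expected_false_positives(candidates: int) -> float:
    """Wrong keys yield pseudo-random data: 2 matching bytes happen 1 in 2**16."""
    return candidates / 2**16


print(expected_false_positives(10**9))
```

With such a weak test, a large fraction of a big wordlist would still need the full HMAC check, and a disk without an MBR in its first sector would produce a false negative. A stricter recogniser reduces the false positives at the price of more work per candidate.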