=====================================
Filesystem-level encryption (fscrypt)
=====================================

Introduction
============

fscrypt is a library which filesystems can hook into to support
transparent encryption of files and directories.

Note: "fscrypt" in this document refers to the kernel-level portion,
implemented in ``fs/crypto/``, as opposed to the userspace tool
`fscrypt <https://github.com/google/fscrypt>`_. This document only
covers the kernel-level portion. For command-line examples of how to
use encryption, see the documentation for the userspace tool `fscrypt
<https://github.com/google/fscrypt>`_. Also, it is recommended to use
the fscrypt userspace tool, or other existing userspace tools such as
`fscryptctl <https://github.com/google/fscryptctl>`_ or `Android's key
management system
<https://source.android.com/security/encryption/file-based>`_, over
using the kernel's API directly. Using existing tools reduces the
chance of introducing your own security bugs. (Nevertheless, for
completeness this documentation covers the kernel's API anyway.)

Unlike dm-crypt, fscrypt operates at the filesystem level rather than
at the block device level. This allows it to encrypt different files
with different keys and to have unencrypted files on the same
filesystem. This is useful for multi-user systems where each user's
data-at-rest needs to be cryptographically isolated from the others.
However, except for filenames, fscrypt does not encrypt filesystem
metadata.

Unlike eCryptfs, which is a stacked filesystem, fscrypt is integrated
directly into supported filesystems --- currently ext4, F2FS, and
UBIFS. This allows encrypted files to be read and written without
caching both the decrypted and encrypted pages in the pagecache,
thereby nearly halving the memory used and bringing it in line with
unencrypted files. Similarly, half as many dentries and inodes are
needed. eCryptfs also limits encrypted filenames to 143 bytes,
causing application compatibility issues; fscrypt allows the full 255
bytes (NAME_MAX). Finally, unlike eCryptfs, the fscrypt API can be
used by unprivileged users, with no need to mount anything.

fscrypt does not support encrypting files in-place. Instead, it
supports marking an empty directory as encrypted. Then, after
userspace provides the key, all regular files, directories, and
symbolic links created in that directory tree are transparently
encrypted.

Threat model
============

Offline attacks
---------------

Provided that userspace chooses a strong encryption key, fscrypt
protects the confidentiality of file contents and filenames in the
event of a single point-in-time permanent offline compromise of the
block device content. fscrypt does not protect the confidentiality of
non-filename metadata, e.g. file sizes, file permissions, file
timestamps, and extended attributes. Also, the existence and location
of holes (unallocated blocks which logically contain all zeroes) in
files is not protected.

fscrypt is not guaranteed to protect confidentiality or authenticity
if an attacker is able to manipulate the filesystem offline prior to
an authorized user later accessing the filesystem.

Online attacks
--------------

fscrypt (and storage encryption in general) can only provide limited
protection, if any at all, against online attacks. In detail:

fscrypt is only resistant to side-channel attacks, such as timing or
electromagnetic attacks, to the extent that the underlying Linux
Cryptographic API algorithms are. If a vulnerable algorithm is used,
such as a table-based implementation of AES, it may be possible for an
attacker to mount a side-channel attack against the online system.
Side-channel attacks may also be mounted against applications
consuming decrypted data.

After an encryption key has been provided, fscrypt is not designed to
hide the plaintext file contents or filenames from other users on the
same system, regardless of the visibility of the keyring key.
Instead, existing access control mechanisms such as file mode bits,
POSIX ACLs, LSMs, or mount namespaces should be used for this purpose.
Also note that as long as the encryption keys are *anywhere* in
memory, an online attacker can necessarily compromise them by mounting
a physical attack or by exploiting any kernel security vulnerability
which provides an arbitrary memory read primitive.

While it is ostensibly possible to "evict" keys from the system,
recently accessed encrypted files will remain accessible at least
until the filesystem is unmounted or the VFS caches are dropped, e.g.
using ``echo 2 > /proc/sys/vm/drop_caches``. Even after that, if the
RAM is compromised before being powered off, it will likely still be
possible to recover portions of the plaintext file contents, if not
some of the encryption keys as well. (Since Linux v4.12, all
in-kernel keys related to fscrypt are sanitized before being freed.
However, userspace would need to do its part as well.)

Currently, fscrypt does not prevent a user from maliciously providing
an incorrect key for another user's existing encrypted files. A
protection against this is planned.

Key hierarchy
=============

Master Keys
-----------

Each encrypted directory tree is protected by a *master key*. Master
keys can be up to 64 bytes long, and must be at least as long as the
greater of the key length needed by the contents and filenames
encryption modes being used. For example, if AES-256-XTS is used for
contents encryption, the master key must be 64 bytes (512 bits). Note
that the XTS mode is defined to require a key twice as long as that
required by the underlying block cipher.

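The length rule above can be sketched as a small check. This is only an
illustration: the ``example_*`` names are hypothetical, and the per-mode key
lengths other than the 64 bytes stated for AES-256-XTS are assumptions based
on the key sizes of the underlying ciphers.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_MAX_MASTER_KEY_SIZE 64

/* Hypothetical identifiers; these are not the kernel's mode numbers. */
enum example_mode {
    EX_AES_256_XTS,
    EX_AES_256_CTS_CBC,
    EX_AES_128_CBC,
    EX_AES_128_CTS_CBC,
    EX_ADIANTUM,
    EX_MODE_COUNT
};

/* Key length in bytes needed by each mode (assumed, see lead-in). */
size_t example_mode_key_size(enum example_mode mode)
{
    assert(mode < EX_MODE_COUNT);
    switch (mode) {
    case EX_AES_256_XTS:     return 64; /* XTS needs twice the AES-256 key */
    case EX_AES_256_CTS_CBC: return 32;
    case EX_AES_128_CBC:     return 16;
    case EX_AES_128_CTS_CBC: return 16;
    case EX_ADIANTUM:        return 32;
    default:                 return 0;
    }
}

/* The master key must cover the greater of the two per-mode key lengths. */
size_t example_min_master_key_size(enum example_mode contents,
                                   enum example_mode filenames)
{
    size_t c = example_mode_key_size(contents);
    size_t f = example_mode_key_size(filenames);

    return c > f ? c : f;
}

/* Returns 1 if a master key of key_size bytes is acceptable, else 0. */
int example_master_key_size_ok(size_t key_size, enum example_mode contents,
                               enum example_mode filenames)
{
    return key_size >= example_min_master_key_size(contents, filenames) &&
           key_size <= EXAMPLE_MAX_MASTER_KEY_SIZE;
}
```

For the recommended (AES-256-XTS, AES-256-CTS-CBC) pair, the check accepts
only 64-byte keys.
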
To "unlock" an encrypted directory tree, userspace must provide the
appropriate master key. There can be any number of master keys, each
of which protects any number of directory trees on any number of
filesystems.

Userspace should generate master keys either using a cryptographically
secure random number generator, or by using a KDF (Key Derivation
Function). Note that whenever a KDF is used to "stretch" a
lower-entropy secret such as a passphrase, it is critical that a KDF
designed for this purpose be used, such as scrypt, PBKDF2, or Argon2.

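As a minimal sketch of the first option, a master key can be read from the
kernel's random number generator. The function name and the use of
``/dev/urandom`` are choices made for this example; passphrase stretching
with scrypt, PBKDF2, or Argon2 needs a crypto library and is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define EXAMPLE_MAX_MASTER_KEY_SIZE 64

/*
 * Fill key[0..size-1] with random bytes from /dev/urandom.
 * Returns 0 on success and -1 on failure.
 */
int example_generate_master_key(unsigned char *key, size_t size)
{
    FILE *f;
    size_t n;

    assert(key != NULL);
    if (size == 0 || size > EXAMPLE_MAX_MASTER_KEY_SIZE)
        return -1;

    f = fopen("/dev/urandom", "rb");
    if (f == NULL)
        return -1;

    /* Disable stdio buffering so no extra copy of the key is left behind. */
    setvbuf(f, NULL, _IONBF, 0);

    n = fread(key, 1, size, f);
    fclose(f);
    return n == size ? 0 : -1;
}
```

The caller remains responsible for wiping the buffer once the key has been
handed to the kernel.
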
Per-file keys
-------------

Since each master key can protect many files, it is necessary to
"tweak" the encryption of each file so that the same plaintext in two
files doesn't map to the same ciphertext, or vice versa. In most
cases, fscrypt does this by deriving per-file keys. When a new
encrypted inode (regular file, directory, or symlink) is created,
fscrypt randomly generates a 16-byte nonce and stores it in the
inode's encryption xattr. Then, it uses a KDF (Key Derivation
Function) to derive the file's key from the master key and nonce.

The Adiantum encryption mode (see `Encryption modes and usage`_) is
special, since it accepts longer IVs and is suitable for both contents
and filenames encryption. For it, a "direct key" option is offered
where the file's nonce is included in the IVs and the master key is
used for encryption directly. This improves performance; however,
users must not use the same master key for any other encryption mode.

Below, the KDF and design considerations are described in more detail.

The current KDF works by encrypting the master key with AES-128-ECB,
using the file's nonce as the AES key. The output is used as the
derived key. If the output is longer than needed, then it is
truncated to the needed length.

Note: this KDF meets the primary security requirement, which is to
produce unique derived keys that preserve the entropy of the master
key, assuming that the master key is already a good pseudorandom key.
However, it is nonstandard and has some problems such as being
reversible, so it is generally considered to be a mistake! It may be
replaced with HKDF or another more standard KDF in the future.

Key derivation was chosen over key wrapping because wrapped keys would
require larger xattrs which would be less likely to fit in-line in the
filesystem's inode table, and there didn't appear to be any
significant advantages to key wrapping. In particular, currently
there is no requirement to support unlocking a file with multiple
alternative master keys or to support rotating master keys. Instead,
the master keys may be wrapped in userspace, e.g. as is done by the
`fscrypt <https://github.com/google/fscrypt>`_ tool.

Including the inode number in the IVs was considered. However, it was
rejected as it would have prevented ext4 filesystems from being
resized, and by itself still wouldn't have been sufficient to prevent
the same key from being directly reused for both XTS and CTS-CBC.

Encryption modes and usage
==========================

fscrypt allows one encryption mode to be specified for file contents
and one encryption mode to be specified for filenames. Different
directory trees are permitted to use different encryption modes.
Currently, the following pairs of encryption modes are supported:

- AES-256-XTS for contents and AES-256-CTS-CBC for filenames
- AES-128-CBC for contents and AES-128-CTS-CBC for filenames
- Adiantum for both contents and filenames

If unsure, you should use the (AES-256-XTS, AES-256-CTS-CBC) pair.

AES-128-CBC was added only for low-powered embedded devices with
crypto accelerators such as CAAM or CESA that do not support XTS.

Adiantum is a (primarily) stream cipher-based mode that is fast even
on CPUs without dedicated crypto instructions. It's also a true
wide-block mode, unlike XTS. It can also eliminate the need to derive
per-file keys. However, it depends on the security of two primitives,
XChaCha12 and AES-256, rather than just one. See the paper
"Adiantum: length-preserving encryption for entry-level processors"
(https://eprint.iacr.org/2018/720.pdf) for more details. To use
Adiantum, CONFIG_CRYPTO_ADIANTUM must be enabled. Also, fast
implementations of ChaCha and NHPoly1305 should be enabled, e.g.
CONFIG_CRYPTO_CHACHA20_NEON and CONFIG_CRYPTO_NHPOLY1305_NEON for ARM.

New encryption modes can be added relatively easily, without changes
to individual filesystems. However, authenticated encryption (AE)
modes are not currently supported because of the difficulty of dealing
with ciphertext expansion.

Contents encryption
-------------------

For file contents, each filesystem block is encrypted independently.
Currently, only the case where the filesystem block size is equal to
the system's page size (usually 4096 bytes) is supported.

Each block's IV is set to the logical block number within the file as
a little endian number, except that:

- With CBC mode encryption, ESSIV is also used. Specifically, each IV
  is encrypted with AES-256 where the AES-256 key is the SHA-256 hash
  of the file's data encryption key.

- In the "direct key" configuration (FS_POLICY_FLAG_DIRECT_KEY set in
  the fscrypt_policy), the file's nonce is also appended to the IV.
  Currently this is only allowed with the Adiantum encryption mode.

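The IV layout described above can be sketched as follows. This is a sketch
under assumptions, not the kernel's code: the 32-byte buffer size, the
placement of the nonce directly after an 8-byte block number, and the
``example_*`` names are all chosen for illustration, and the ESSIV step for
CBC is omitted.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_NONCE_SIZE  16
#define EXAMPLE_MAX_IV_SIZE 32 /* assumed large enough for every mode */

/*
 * Build the IV for one filesystem block.
 *
 * lblk_num: logical block number within the file.
 * nonce:    the file's 16-byte nonce for the "direct key" configuration,
 *           or NULL when per-file keys are derived instead.
 */
void example_build_contents_iv(uint8_t iv[EXAMPLE_MAX_IV_SIZE],
                               uint64_t lblk_num, const uint8_t *nonce)
{
    int i;

    assert(iv != NULL);
    memset(iv, 0, EXAMPLE_MAX_IV_SIZE);

    /* Store the block number as a little endian 64-bit integer. */
    for (i = 0; i < 8; i++)
        iv[i] = (uint8_t)(lblk_num >> (8 * i));

    /* Direct key: append the nonce after the block number. */
    if (nonce != NULL)
        memcpy(iv + 8, nonce, EXAMPLE_NONCE_SIZE);
}
```

A cipher with a 16-byte IV, such as AES-256-XTS, would consume only the
leading 16 bytes of this buffer; the nonce variant needs the longer IV that
Adiantum accepts.
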
Filenames encryption
--------------------

For filenames, each full filename is encrypted at once. Because of
the requirements to retain support for efficient directory lookups and
filenames of up to 255 bytes, the same IV is used for every filename
in a directory.

However, each encrypted directory still uses a unique key; or
alternatively (for the "direct key" configuration) has the file's
nonce included in the IVs. Thus, IV reuse is limited to within a
single directory.

With CTS-CBC, the IV reuse means that when the plaintext filenames
share a common prefix at least as long as the cipher block size (16
bytes for AES), the corresponding encrypted filenames will also share
a common prefix. This is undesirable. Adiantum does not have this
weakness, as it is a wide-block encryption mode.

All supported filenames encryption modes accept any plaintext length
>= 16 bytes; cipher block alignment is not required. However,
filenames shorter than 16 bytes are NUL-padded to 16 bytes before
being encrypted. In addition, to reduce leakage of filename lengths
via their ciphertexts, all filenames are NUL-padded to the next 4, 8,
16, or 32-byte boundary (configurable). 32 is recommended since this
provides the best confidentiality, at the cost of making directory
entries consume slightly more space. Note that since NUL (``\0``) is
not otherwise a valid character in filenames, the padding will never
produce duplicate plaintexts.

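The padding rules above amount to a short length calculation. The sketch
below is illustrative only; the function name is hypothetical, and capping
the result at a caller-supplied maximum (for example 255) is an assumption
about how the padded length is kept within the filesystem's name limit.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_MIN_FNAME_LEN 16 /* shorter names are NUL-padded to this */

/*
 * Length of the NUL-padded plaintext filename that gets encrypted.
 *
 * name_len: length of the original filename in bytes.
 * padding:  configured padding boundary: 4, 8, 16, or 32.
 * max_len:  upper bound on the result (assumption, see lead-in).
 */
size_t example_padded_fname_len(size_t name_len, size_t padding,
                                size_t max_len)
{
    size_t len;

    assert(padding == 4 || padding == 8 || padding == 16 || padding == 32);

    /* Step 1: pad short names up to the 16-byte minimum. */
    len = name_len < EXAMPLE_MIN_FNAME_LEN ? EXAMPLE_MIN_FNAME_LEN : name_len;

    /* Step 2: round up to the next padding boundary. */
    len = (len + padding - 1) / padding * padding;

    /* Step 3: never exceed the maximum name length. */
    return len < max_len ? len : max_len;
}
```

With 32-byte padding, every name of up to 32 bytes encrypts to the same
32-byte length, which is why that setting leaks the least about name
lengths.
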
Symbolic link targets are considered a type of filename and are
encrypted in the same way as filenames in directory entries, except
that IV reuse is not a problem as each symlink has its own inode.

User API
========

Setting an encryption policy
----------------------------

The FS_IOC_SET_ENCRYPTION_POLICY ioctl sets an encryption policy on an
empty directory or verifies that a directory or regular file already
has the specified encryption policy. It takes in a pointer to a
:c:type:`struct fscrypt_policy`, defined as follows::

    #define FS_KEY_DESCRIPTOR_SIZE  8

    struct fscrypt_policy {
            __u8 version;
            __u8 contents_encryption_mode;
            __u8 filenames_encryption_mode;
            __u8 flags;
            __u8 master_key_descriptor[FS_KEY_DESCRIPTOR_SIZE];
    };

This structure must be initialized as follows:

- ``version`` must be 0.

- ``contents_encryption_mode`` and ``filenames_encryption_mode`` must
  be set to constants from ``<linux/fs.h>`` which identify the
  encryption modes to use. If unsure, use
  FS_ENCRYPTION_MODE_AES_256_XTS (1) for ``contents_encryption_mode``
  and FS_ENCRYPTION_MODE_AES_256_CTS (4) for
  ``filenames_encryption_mode``.

- ``flags`` must contain a value from ``<linux/fs.h>`` which
  identifies the amount of NUL-padding to use when encrypting
  filenames. If unsure, use FS_POLICY_FLAGS_PAD_32 (0x3).
  In addition, if the chosen encryption modes are both
  FS_ENCRYPTION_MODE_ADIANTUM, this can contain
  FS_POLICY_FLAG_DIRECT_KEY to specify that the master key should be
  used directly, without key derivation.

- ``master_key_descriptor`` specifies how to find the master key in
  the keyring; see `Adding keys`_. It is up to userspace to choose a
  unique ``master_key_descriptor`` for each master key. The e4crypt
  and fscrypt tools use the first 8 bytes of
  ``SHA-512(SHA-512(master_key))``, but this particular scheme is not
  required. Also, the master key need not be in the keyring yet when
  FS_IOC_SET_ENCRYPTION_POLICY is executed. However, it must be added
  before any files can be created in the encrypted directory.

If the file is not yet encrypted, then FS_IOC_SET_ENCRYPTION_POLICY
verifies that the file is an empty directory. If so, the specified
encryption policy is assigned to the directory, turning it into an
encrypted directory. After that, and after providing the
corresponding master key as described in `Adding keys`_, all regular
files, directories (recursively), and symlinks created in the
directory will be encrypted, inheriting the same encryption policy.
The filenames in the directory's entries will be encrypted as well.

Alternatively, if the file is already encrypted, then
FS_IOC_SET_ENCRYPTION_POLICY validates that the specified encryption
policy exactly matches the actual one. If they match, then the ioctl
returns 0. Otherwise, it fails with EEXIST. This works on both
regular files and directories, including nonempty directories.

Note that the ext4 filesystem does not allow the root directory to be
encrypted, even if it is empty. Users who want to encrypt an entire
filesystem with one key should consider using dm-crypt instead.

FS_IOC_SET_ENCRYPTION_POLICY can fail with the following errors:

- ``EACCES``: the file is not owned by the process's uid, nor does the
  process have the CAP_FOWNER capability in a namespace with the file
  owner's uid mapped
- ``EEXIST``: the file is already encrypted with an encryption policy
  different from the one specified
- ``EINVAL``: an invalid encryption policy was specified (invalid
  version, mode(s), or flags)
- ``ENOTDIR``: the file is unencrypted and is a regular file, not a
  directory
- ``ENOTEMPTY``: the file is unencrypted and is a nonempty directory
- ``ENOTTY``: this type of filesystem does not implement encryption
- ``EOPNOTSUPP``: the kernel was not configured with encryption
  support for this filesystem, or the filesystem superblock has not
  had encryption enabled on it. (For example, to use encryption on an
  ext4 filesystem, CONFIG_EXT4_ENCRYPTION must be enabled in the
  kernel config, and the superblock must have had the "encrypt"
  feature flag enabled using ``tune2fs -O encrypt`` or ``mkfs.ext4 -O
  encrypt``.)
- ``EPERM``: this directory may not be encrypted, e.g. because it is
  the root directory of an ext4 filesystem
- ``EROFS``: the filesystem is readonly

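Putting the initialization rules and error codes of this section together,
a caller might look roughly like the sketch below. It assumes that
``<linux/fs.h>`` on the build system provides :c:type:`struct
fscrypt_policy` and the constants named in this section; the ``example_*``
functions and the hint strings are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

/*
 * Set the recommended (AES-256-XTS, AES-256-CTS) policy on the empty
 * directory open at dirfd. Returns 0 on success, or -1 with errno set.
 */
int example_set_policy(int dirfd,
                       const unsigned char descriptor[FS_KEY_DESCRIPTOR_SIZE])
{
    struct fscrypt_policy policy;

    assert(descriptor != NULL);

    memset(&policy, 0, sizeof(policy));
    policy.version = 0;
    policy.contents_encryption_mode = FS_ENCRYPTION_MODE_AES_256_XTS;
    policy.filenames_encryption_mode = FS_ENCRYPTION_MODE_AES_256_CTS;
    policy.flags = FS_POLICY_FLAGS_PAD_32;
    memcpy(policy.master_key_descriptor, descriptor, FS_KEY_DESCRIPTOR_SIZE);

    return ioctl(dirfd, FS_IOC_SET_ENCRYPTION_POLICY, &policy);
}

/* Map a few of the documented errno values to short hints. */
const char *example_policy_error_hint(int err)
{
    switch (err) {
    case EEXIST:     return "already encrypted with a different policy";
    case ENOTDIR:    return "unencrypted regular file, not a directory";
    case ENOTEMPTY:  return "directory is not empty";
    case ENOTTY:     return "filesystem type does not implement encryption";
    case EOPNOTSUPP: return "encryption not enabled in kernel or superblock";
    default:         return "see errno";
    }
}
```

Because the ioctl also succeeds on an already encrypted directory whose
policy matches exactly, calling ``example_set_policy()`` a second time with
the same descriptor should be harmless.
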
Getting an encryption policy
----------------------------

The FS_IOC_GET_ENCRYPTION_POLICY ioctl retrieves the :c:type:`struct
fscrypt_policy`, if any, for a directory or regular file. See above
for the struct definition. No additional permissions are required
beyond the ability to open the file.

FS_IOC_GET_ENCRYPTION_POLICY can fail with the following errors:

- ``EINVAL``: the file is encrypted, but it uses an unrecognized
  encryption context format
- ``ENODATA``: the file is not encrypted
- ``ENOTTY``: this type of filesystem does not implement encryption
- ``EOPNOTSUPP``: the kernel was not configured with encryption
  support for this filesystem

Note: if you only need to know whether a file is encrypted or not, on
most filesystems it is also possible to use the FS_IOC_GETFLAGS ioctl
and check for FS_ENCRYPT_FL, or to use the statx() system call and
check for STATX_ATTR_ENCRYPTED in stx_attributes.

Getting the per-filesystem salt
-------------------------------

Some filesystems, such as ext4 and F2FS, also support the deprecated
ioctl FS_IOC_GET_ENCRYPTION_PWSALT. This ioctl retrieves a randomly
generated 16-byte value stored in the filesystem superblock. This
value is intended to be used as a salt when deriving an encryption key
from a passphrase or other low-entropy user credential.

FS_IOC_GET_ENCRYPTION_PWSALT is deprecated. Instead, prefer to
generate and manage any needed salt(s) in userspace.

Adding keys
-----------

To provide a master key, userspace must add it to an appropriate
keyring using the add_key() system call (see:
``Documentation/security/keys/core.rst``). The key type must be
"logon"; keys of this type are kept in kernel memory and cannot be
read back by userspace. The key description must be "fscrypt:"
followed by the 16-character lower case hex representation of the
``master_key_descriptor`` that was set in the encryption policy. The
key payload must conform to the following structure::

    #define FS_MAX_KEY_SIZE 64

    struct fscrypt_key {
            u32 mode;
            u8 raw[FS_MAX_KEY_SIZE];
            u32 size;
    };

``mode`` is ignored; just set it to 0. The actual key is provided in
``raw`` with ``size`` indicating its size in bytes. That is, the
bytes ``raw[0..size-1]`` (inclusive) are the actual key.

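The description format and payload layout above can be sketched as below.
The struct here is a local mirror of the one shown above, and every
``example_*`` / ``EXAMPLE_*`` name is hypothetical. Invoking add_key()
through ``syscall()`` is one option assumed for this sketch (libkeyutils
offers a wrapper); the keyring ID is left to the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#define EXAMPLE_DESCRIPTOR_SIZE 8
#define EXAMPLE_MAX_KEY_SIZE    64
/* "fscrypt:" + 16 hex characters + terminating NUL */
#define EXAMPLE_DESC_LEN        (8 + 2 * EXAMPLE_DESCRIPTOR_SIZE + 1)

/* Local mirror of the payload structure documented above. */
struct example_fscrypt_key {
    uint32_t mode;
    uint8_t raw[EXAMPLE_MAX_KEY_SIZE];
    uint32_t size;
};

/* Write "fscrypt:" followed by the lower case hex descriptor into out. */
void example_key_description(const uint8_t descriptor[EXAMPLE_DESCRIPTOR_SIZE],
                             char out[EXAMPLE_DESC_LEN])
{
    int i;

    assert(descriptor != NULL && out != NULL);
    memcpy(out, "fscrypt:", 8);
    for (i = 0; i < EXAMPLE_DESCRIPTOR_SIZE; i++)
        snprintf(out + 8 + 2 * i, 3, "%02x", (unsigned int)descriptor[i]);
}

/* Fill the payload; returns 0 on success, -1 if size is out of range. */
int example_fill_payload(struct example_fscrypt_key *payload,
                         const uint8_t *key, size_t size)
{
    assert(payload != NULL && key != NULL);
    if (size == 0 || size > EXAMPLE_MAX_KEY_SIZE)
        return -1;
    memset(payload, 0, sizeof(*payload));
    payload->mode = 0; /* ignored by the kernel */
    memcpy(payload->raw, key, size);
    payload->size = (uint32_t)size;
    return 0;
}

/* Add the key as type "logon"; returns the key ID, or -1 with errno set. */
long example_add_logon_key(const char *description,
                           const struct example_fscrypt_key *payload,
                           int keyring_id)
{
    return syscall(SYS_add_key, "logon", description, payload,
                   sizeof(*payload), keyring_id);
}
```

After the key has been added, the payload buffer should be wiped, since the
kernel keeps its own copy.
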
The key description prefix "fscrypt:" may alternatively be replaced
with a filesystem-specific prefix such as "ext4:". However, the
filesystem-specific prefixes are deprecated and should not be used in
new programs.

There are several different types of keyrings in which encryption keys
may be placed, such as a session keyring, a user session keyring, or a
user keyring. Each key must be placed in a keyring that is "attached"
to all processes that might need to access files encrypted with it, in
the sense that request_key() will find the key. Generally, if only
processes belonging to a specific user need to access a given
encrypted directory and no session keyring has been installed, then
that directory's key should be placed in that user's user session
keyring or user keyring. Otherwise, a session keyring should be
installed if needed, and the key should be linked into that session
keyring, or in a keyring linked into that session keyring.

Note: introducing the complex visibility semantics of keyrings here
was arguably a mistake --- especially given that by design, after any
process successfully opens an encrypted file (thereby setting up the
per-file key), possessing the keyring key is not actually required for
any process to read/write the file until its in-memory inode is
evicted. In the future there probably should be a way to provide keys
directly to the filesystem instead, which would make the intended
semantics clearer.

Access semantics
================

With the key
------------

With the encryption key, encrypted regular files, directories, and
symlinks behave very similarly to their unencrypted counterparts ---
after all, the encryption is intended to be transparent. However,
astute users may notice some differences in behavior:

- Unencrypted files, or files encrypted with a different encryption
  policy (i.e. different key, modes, or flags), cannot be renamed or
  linked into an encrypted directory; see `Encryption policy
  enforcement`_. Attempts to do so will fail with EPERM. However,
  encrypted files can be renamed within an encrypted directory, or
  into an unencrypted directory.

- Direct I/O is not supported on encrypted files. Attempts to use
  direct I/O on such files will fall back to buffered I/O.

- The fallocate operations FALLOC_FL_COLLAPSE_RANGE,
  FALLOC_FL_INSERT_RANGE, and FALLOC_FL_ZERO_RANGE are not supported
  on encrypted files and will fail with EOPNOTSUPP.

- Online defragmentation of encrypted files is not supported. The
  EXT4_IOC_MOVE_EXT and F2FS_IOC_MOVE_RANGE ioctls will fail with
  EOPNOTSUPP.

- The ext4 filesystem does not support data journaling with encrypted
  regular files. It will fall back to ordered data mode instead.

- DAX (Direct Access) is not supported on encrypted files.

- The st_size of an encrypted symlink will not necessarily give the
  length of the symlink target as required by POSIX. It will actually
  give the length of the ciphertext, which will be slightly longer
  than the plaintext due to NUL-padding and an extra 2-byte overhead.

- The maximum length of an encrypted symlink is 2 bytes shorter than
  the maximum length of an unencrypted symlink. For example, on an
  ext4 filesystem with a 4K block size, unencrypted symlinks can be up
  to 4095 bytes long, while encrypted symlinks can only be up to 4093
  bytes long (both lengths excluding the terminating NUL).

Note that mmap *is* supported. This is possible because the pagecache
for an encrypted file contains the plaintext, not the ciphertext.

Without the key
---------------

Some filesystem operations may be performed on encrypted regular
files, directories, and symlinks even before their encryption key has
been provided:

- File metadata may be read, e.g. using stat().

- Directories may be listed, in which case the filenames will be
  listed in an encoded form derived from their ciphertext. The
  current encoding algorithm is described in `Filename hashing and
  encoding`_. The algorithm is subject to change, but it is
  guaranteed that the presented filenames will be no longer than
  NAME_MAX bytes, will not contain the ``/`` or ``\0`` characters, and
  will uniquely identify directory entries.

  The ``.`` and ``..`` directory entries are special. They are always
  present and are not encrypted or encoded.

- Files may be deleted. That is, nondirectory files may be deleted
  with unlink() as usual, and empty directories may be deleted with
  rmdir() as usual. Therefore, ``rm`` and ``rm -r`` will work as
  expected.

- Symlink targets may be read and followed, but they will be presented
  in encrypted form, similar to filenames in directories. Hence, they
  are unlikely to point to anywhere useful.

Without the key, regular files cannot be opened or truncated.
Attempts to do so will fail with ENOKEY. This implies that any
regular file operations that require a file descriptor, such as
read(), write(), mmap(), fallocate(), and ioctl(), are also forbidden.

Also without the key, files of any type (including directories) cannot
be created or linked into an encrypted directory, nor can a name in an
encrypted directory be the source or target of a rename, nor can an
O_TMPFILE temporary file be created in an encrypted directory. All
such operations will fail with ENOKEY.

It is not currently possible to back up and restore encrypted files
without the encryption key. This would require special APIs which
have not yet been implemented.

Encryption policy enforcement
=============================

After an encryption policy has been set on a directory, all regular
files, directories, and symbolic links created in that directory
(recursively) will inherit that encryption policy. Special files ---
that is, named pipes, device nodes, and UNIX domain sockets --- will
not be encrypted.

Except for those special files, it is forbidden to have unencrypted
files, or files encrypted with a different encryption policy, in an
encrypted directory tree. Attempts to link or rename such a file into
an encrypted directory will fail with EPERM. This is also enforced
during ->lookup() to provide limited protection against offline
attacks that try to disable or downgrade encryption in known locations
where applications may later write sensitive data. It is recommended
that systems implementing a form of "verified boot" take advantage of
this by validating all top-level encryption policies prior to access.

Implementation details
======================

Encryption context
------------------

An encryption policy is represented on-disk by a :c:type:`struct
fscrypt_context`. It is up to individual filesystems to decide where
to store it, but normally it would be stored in a hidden extended
attribute. It should *not* be exposed by the xattr-related system
calls such as getxattr() and setxattr() because of the special
semantics of the encryption xattr. (In particular, there would be
much confusion if an encryption policy were to be added to or removed
from anything other than an empty directory.) The struct is defined
as follows::

    #define FS_KEY_DESCRIPTOR_SIZE  8
    #define FS_KEY_DERIVATION_NONCE_SIZE 16

    struct fscrypt_context {
            u8 format;
            u8 contents_encryption_mode;
            u8 filenames_encryption_mode;
            u8 flags;
            u8 master_key_descriptor[FS_KEY_DESCRIPTOR_SIZE];
            u8 nonce[FS_KEY_DERIVATION_NONCE_SIZE];
    };

Note that :c:type:`struct fscrypt_context` contains the same
information as :c:type:`struct fscrypt_policy` (see `Setting an
encryption policy`_), except that :c:type:`struct fscrypt_context`
also contains a nonce. The nonce is randomly generated by the kernel
and is used to derive the inode's encryption key as described in
`Per-file keys`_.

Data path changes
-----------------

For the read path (->readpage()) of regular files, filesystems can
read the ciphertext into the page cache and decrypt it in-place. The
page lock must be held until decryption has finished, to prevent the
page from becoming visible to userspace prematurely.

For the write path (->writepage()) of regular files, filesystems
cannot encrypt data in-place in the page cache, since the cached
plaintext must be preserved. Instead, filesystems must encrypt into a
temporary buffer or "bounce page", then write out the temporary
buffer. Some filesystems, such as UBIFS, already use temporary
buffers regardless of encryption. Other filesystems, such as ext4 and
F2FS, have to allocate bounce pages specially for encryption.

Filename hashing and encoding
-----------------------------

Modern filesystems accelerate directory lookups by using indexed
directories. An indexed directory is organized as a tree keyed by
filename hashes. When a ->lookup() is requested, the filesystem
normally hashes the filename being looked up so that it can quickly
find the corresponding directory entry, if any.

With encryption, lookups must be supported and efficient both with and
without the encryption key. Clearly, it would not work to hash the
plaintext filenames, since the plaintext filenames are unavailable
without the key. (Hashing the plaintext filenames would also make it
impossible for the filesystem's fsck tool to optimize encrypted
directories.) Instead, filesystems hash the ciphertext filenames,
i.e. the bytes actually stored on-disk in the directory entries. When
asked to do a ->lookup() with the key, the filesystem just encrypts
the user-supplied name to get the ciphertext.

Lookups without the key are more complicated. The raw ciphertext may
contain the ``\0`` and ``/`` characters, which are illegal in
filenames. Therefore, readdir() must base64-encode the ciphertext for
presentation. For most filenames, this works fine; on ->lookup(), the
filesystem just base64-decodes the user-supplied name to get back to
the raw ciphertext.

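To illustrate why an encoding step yields names free of ``/`` and ``\0``,
here is a sketch of an unpadded base64 variant. It is not the kernel's
encoder: the alphabet (``,`` in place of ``/``), the bit ordering, and the
function name are assumptions made for this example, and the presented
format is in any case subject to change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64 filename-safe characters; ',' replaces the usual '/'. */
static const char example_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+,";

/*
 * Encode len bytes from src into dst without '=' padding.
 * dst must have room for (4 * len + 2) / 3 + 1 bytes.
 * Returns the number of characters written, excluding the final NUL.
 */
size_t example_encode_name(const uint8_t *src, size_t len, char *dst)
{
    uint32_t acc = 0; /* bit accumulator; only the low bits matter */
    int bits = 0;     /* number of not yet emitted bits in acc */
    size_t i, out = 0;

    assert(src != NULL && dst != NULL);

    for (i = 0; i < len; i++) {
        acc = (acc << 8) | src[i];
        bits += 8;
        while (bits >= 6) {
            bits -= 6;
            dst[out++] = example_alphabet[(acc >> bits) & 0x3f];
        }
    }

    /* Flush the remaining 2 or 4 bits, left-aligned in a 6-bit group. */
    if (bits > 0)
        dst[out++] = example_alphabet[(acc << (6 - bits)) & 0x3f];

    dst[out] = '\0';
    return out;
}
```

Every output character comes from the 64-character alphabet, so the encoded
name can never contain a path separator or an embedded NUL, whatever bytes
the ciphertext holds.
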
However, for very long filenames, base64 encoding would cause the
filename length to exceed NAME_MAX. To prevent this, readdir()
actually presents long filenames in an abbreviated form which encodes
a strong "hash" of the ciphertext filename, along with the optional
filesystem-specific hash(es) needed for directory lookups. This
allows the filesystem to still, with a high degree of confidence, map
the filename given in ->lookup() back to a particular directory entry
that was previously listed by readdir(). See :c:type:`struct
fscrypt_digested_name` in the source for more details.

Note that the precise way that filenames are presented to userspace
without the key is subject to change in the future. It is only meant
as a way to temporarily present valid filenames so that commands like
``rm -r`` work as expected on encrypted directories.