
Security aspects regarding S3

General password policy

Users with access to the web interface (CMC) will automatically be prompted to change their password when it approaches a year since the last change. We encourage users to simultaneously rotate their key pairs, both the primary system user's key and any created IAM keys. This is easily done by creating a new key pair and deleting the previous one.
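
If you manage IAM keys from the command line, rotation can be scripted. A minimal sketch with the AWS CLI follows; the user name, key ID, and endpoint URL are placeholders, and it assumes the IAM endpoint supports the standard access-key calls (the CMC web interface is the alternative):

# Create a new key pair for an IAM user (all names are placeholders)
aws iam create-access-key --user-name my-iam-user \
    --endpoint-url https://iam.example.educloud.no

# Once your clients use the new key, delete the old one
aws iam delete-access-key --user-name my-iam-user \
    --access-key-id AKIA0123456787EXAMPLE \
    --endpoint-url https://iam.example.educloud.no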

For red buckets, we routinely rotate the keys belonging to the system users that own the buckets, but we do not yet have a system for routinely rotating the IAM keys distributed to end users. An update will follow here as soon as possible.

 

Data protection

We offer three options for protecting your data:

  1. Versioning of objects
  2. Traditional backup to another storage system
  3. Replication to a separate Cloudian cluster

Versioning means that the system retains all copies of objects with the same name (key), providing a change history per object. This protects against both unwanted overwrites and file deletions. If there is only one version of an object, it will be duplicated before a delete marker is placed on the current version.
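
The version history of an object can be inspected with the AWS CLI; a minimal example, where the bucket and key names are placeholders:

# List all versions and delete markers for a given key
aws s3api list-object-versions --bucket my-bucket --prefix path/to/object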

Users can themselves decide whether to enable automatic deletion of previous versions, and how often this should be done. Moreover, there will be no surprises regarding cost as the extra capacity used by older versions is deducted from the regular quota.
For many, this may be sufficient data protection in itself, but note that versioning cannot be considered adequate protection for sensitive data that needs to be stored long-term.
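
Enabling versioning, and optionally automatic clean-up of old versions, is done with standard S3 calls. Below is a minimal sketch using the AWS CLI; the bucket name and the 30-day window are examples only, and lifecycle support on the Cloudian side is assumed:

# Enable versioning on a bucket (bucket name is a placeholder)
aws s3api put-bucket-versioning --bucket my-bucket \
    --versioning-configuration Status=Enabled

# Delete noncurrent versions 30 days after they are superseded
aws s3api put-bucket-lifecycle-configuration --bucket my-bucket \
    --lifecycle-configuration '{
      "Rules": [{
        "ID": "expire-old-versions",
        "Status": "Enabled",
        "Filter": {"Prefix": ""},
        "NoncurrentVersionExpiration": {"NoncurrentDays": 30}
      }]
    }'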

Traditional backup stores a full copy of the bucket and incremental changes once a day, with a retention period of 90 days. This may be necessary if you have critical data that you want a copy of on a separate system and in a separate data center. There is an additional cost for this.

Finally, we have a separate, virtual Cloudian instance that can be used to upload the same data to two separate locations. However, this instance has very limited capacity and is a service we only offer for the most critical data/use cases.

 

Securing the access key on the client

By default, configured key pairs are stored in plaintext under ~/.aws/credentials. Access to this file is initially restricted to the owner of the home directory, but it is recommended to encrypt the key file to prevent unauthorized access to object storage if the machine is compromised. This is especially important if sensitive data is stored in the buckets.

In the credentials file, you can specify the credential_process parameter in place of the keys; it points to a script that retrieves them as needed. The configuration looks like this:

[default] 
endpoint_url = https://s3-oslo.educloud.no 
region = oslo 
credential_process = /path/to/script

credential_process expects JSON output in the following format:

{
  "Version": 1, 
  "AccessKeyId": "AKIA0123456787EXAMPLE",
  "SecretAccessKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
}

Below are two examples of such scripts: one that decrypts an encrypted key file, and one that retrieves the keys from a password vault (e.g. HashiCorp Vault or Enpass, both vetted by UiO).

 

1. Encryption + decryption script

This method requires only a standard encryption tool such as OpenSSL and a few lines of shell code, so it is available out of the box on most Unix systems.

Start by creating a JSON file with the keys in the format above and encrypt it with OpenSSL using the following options:

openssl enc -aes-256-cbc -md sha512 -pbkdf2 -in s3-creds.json -out s3-creds.enc

You will be prompted to enter a password, which you will also need to provide when decrypting the file. It's a good idea to store this in Vault or similar, as you would normally store shared passwords.
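
Before wiring it into the AWS configuration, you can verify that the file decrypts correctly; the command below prompts for the password and prints the original JSON:

# Test decryption of the encrypted key file
openssl enc -d -aes-256-cbc -md sha512 -pbkdf2 -in s3-creds.enc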

Additional Steps for Mac Users
By default, macOS uses LibreSSL, which does not support the -pbkdf2 option used in the example above.
So if you encounter errors, check your OpenSSL version. If it reports LibreSSL, install a proper OpenSSL (for example via Homebrew) and add it to your PATH:

# openssl version
LibreSSL 3.3.6

# brew update
# brew install openssl
# echo 'export PATH="/usr/local/opt/openssl@3/bin:$PATH"' >> ~/.bash_profile

# openssl version
OpenSSL 3.1.2 1 Aug 2023 (Library: OpenSSL 3.1.2 1 Aug 2023) 

 

Now we create a short decryption script to be referenced in credential_process, so that the encrypted JSON file with the keys can be decrypted as needed. In its simplest form, it can look like this:

#!/bin/sh

# Decrypt the key file given as the first argument, using the password
# from the S3_PW environment variable, and print the JSON to stdout
openssl enc -d -aes-256-cbc -md sha512 -pbkdf2 -in "$1" -pass env:S3_PW

The decryption script above expects a path to the encrypted file when it runs (argument $1) and that the password has been set beforehand as an environment variable, S3_PW. This can be set without displaying input in the terminal as follows:

read -s -r S3_PW && export S3_PW

Note that when you close the terminal, this variable will be removed and must be set again in the next session.

Finally, update ~/.aws/credentials to use the decryption script:

[default]
endpoint_url = https://s3-oslo.educloud.no
region = oslo
credential_process = /path/to/decrypt.sh /path/to/s3-creds.enc

When you now perform an S3 call against the storage using this profile, the script automatically decrypts s3-creds.enc and authenticates with the associated keys. Remember to remove the unencrypted file once you have confirmed that the setup works.
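
A quick end-to-end test could look like this (the bucket name is a placeholder):

# Set the decryption password for this session (input is hidden)
read -s -r S3_PW && export S3_PW

# Any S3 call now triggers the decryption script behind the scenes
aws s3 ls s3://my-bucket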

 

2. Retrieval of credentials from password vault

The guide below is based on HashiCorp Vault (which is widely used internally), but most password vaults/managers have equivalent APIs that can be used to retrieve stored secrets.

To use this method, you need to have the Vault client installed. Start by placing a secret in the same JSON format at a preferred path in the password vault, where the field name (testuser in the example below) is the key under which the JSON data containing your access key and secret key is stored:

vault kv put /secret/engine/path/to/S3_credentials testuser='{
  "Version": 1,
  "AccessKeyId": "AKIA0123456787EXAMPLE",
  "SecretAccessKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
}'

Then we can create a bash script which retrieves the keys:

#!/bin/bash

# Print the JSON credentials stored under the field name given as the
# first argument (e.g. testuser); requires an active Vault token
vault kv get --field="$1" /secret/engine/path/to/S3_credentials

When the script runs, you specify which user to retrieve credentials for (testuser in the example above). This means you can store multiple users in the same secret and dynamically retrieve the one you need:

$ ./retrieve_S3_vault.sh testuser
{
  "Version": 1,
  "AccessKeyId": "AKIA0123456787EXAMPLE",
  "SecretAccessKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
}

When it is confirmed to be working as intended, update ~/.aws/credentials:

[default]
endpoint_url = https://s3-oslo.educloud.no
region = oslo
credential_process = /path/to/retrieve_S3_vault.sh testuser

Note that every time you need to communicate with the bucket, you must first log in to the Vault CLI to have an active token (which by default expires after one hour).
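
How you log in depends on the authentication method configured for your Vault instance; with OIDC, for example, it could look like this:

# Obtain a Vault token (the oidc method is an example; use whatever
# auth method your Vault instance is configured with)
vault login -method=oidc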

Server Side Encryption (SSE)

The UiO S3 solution supports SSE for encryption at rest, but we do not currently use it in our storage policies. The reason is that with this method, the data is still automatically decrypted on every request, so it offers no protection in situations where data is wrongly made public. The only protection SSE would offer is against physical disks being stolen, which is considered unlikely.
As such, we concluded that SSE is not worth the extra overhead caused by the encryption. However, we aim to implement KMS encryption when it is properly supported, as this does protect against accidental publication of data.

If you are required to utilize some form of encryption, we recommend using SSE-C (Server-Side Encryption with Customer Key). This requires an encryption key which will be used both when uploading and downloading data.
Note that this method comes with a certain risk; if the encryption key is lost, the object is also effectively lost forever. The key should also not be stored in the same location as the access and secret keys for the bucket.
For more information, we refer to Amazon's guide on SSE-C with PowerShell.
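
If you use the AWS CLI rather than PowerShell, SSE-C could look roughly like the sketch below; the key file, bucket, and object names are placeholders:

# Generate a random 256-bit key; store it safely, and never alongside
# the S3 access/secret keys (if the key is lost, the objects are lost)
openssl rand -out sse-c.key 32

# Upload with SSE-C; the object is encrypted server-side with your key
aws s3 cp file.txt s3://my-bucket/file.txt \
    --sse-c AES256 --sse-c-key fileb://sse-c.key

# Download requires supplying the exact same key
aws s3 cp s3://my-bucket/file.txt file.txt \
    --sse-c AES256 --sse-c-key fileb://sse-c.key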

In the end, we believe that all forms of encryption at rest can provide a false sense of security, as they do not replace good handling of IAM keys and well-configured bucket policies, which by themselves offer sufficient protection on the user side.

Tags: S3, storage, object storage
By Markus Sørensen
Published Sep. 26, 2024 10:44 AM - Last modified Sep. 26, 2024 2:31 PM