Amazon S3 Encryption
Cloud computing has changed how organizations operate and store their data. It attracts large organizations with promises such as significant savings on capital and operating expenditure (cap-ex and op-ex), on-demand provisioning, and high availability. These are all valid benefits, but the cloud also comes with inherent security concerns. One of them is data storage: you are moving your data off your premises and into the data center of a cloud service provider (CSP), which is in charge of the underlying machines and servers that store it.
Storing Data in Cloud
Security concerns around data storage are among the most significant risks holding companies back from adopting the cloud. In a public cloud, data can by definition be accessed from anywhere, but the client does not necessarily know where it is physically stored. The situation changes somewhat in a private cloud. Many online storage services are available, such as Google Cloud, Amazon S3, and OpenStack Swift. Organizations also need to account for a number of laws and regulations before migrating data to the cloud. An important requirement among them is the protection of data in the cloud, which can be achieved with a sound encryption and decryption strategy. Once encryption comes into the picture, one of the first questions to ask is where the keys are stored and who owns them.
Key Management in Cloud
As I mentioned in my previous article, key management in the cloud is one of the biggest concerns. If ownership of the keys is lost, the whole underlying encryption strategy is of no use. For more details on this, please refer to my published article here.
Amazon Simple Storage Service (S3) is an online file storage service provided by Amazon Web Services. Amazon S3 exposes its functionality through web service interfaces such as REST, SOAP, and BitTorrent. S3 stores arbitrary objects of up to 5 terabytes in size, each accompanied by up to 2 kilobytes of metadata. Objects are organized into buckets (each owned by an Amazon Web Services account) and identified within each bucket by a unique, user-assigned key. The service offers scalability, high availability, and low latency, and is designed to provide 99.999999999% durability and 99.99% availability of objects over a given year. Amazon Machine Images (AMIs), which are used in the Elastic Compute Cloud (EC2), can be exported to S3 as bundles. Buckets and objects can be created, listed, and retrieved using either a REST-style HTTP interface or a SOAP interface. Additionally, objects can be downloaded using the HTTP GET interface and the BitTorrent protocol. All access to AWS S3 is controlled using an access control list associated with every bucket and object. Bucket names and keys are chosen so that objects are addressable using HTTP URLs.
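As an illustration of how a bucket name and key map to a URL, the sketch below builds the two addressing styles commonly associated with S3: path-style and virtual-hosted-style. The function name, bucket, and key are hypothetical, and the host name s3.amazonaws.com is assumed here as the classic global endpoint; regional endpoints use different host names.

```python
from urllib.parse import quote


def s3_object_urls(bucket: str, key: str) -> dict:
    """Return path-style and virtual-hosted-style URLs for an object.

    Hypothetical helper for illustration; it only formats strings and
    does not contact S3 or check that the bucket exists.
    """
    # Percent-encode characters that are unsafe in a URL path, keeping "/".
    encoded_key = quote(key)
    return {
        "path_style": f"https://s3.amazonaws.com/{bucket}/{encoded_key}",
        "virtual_hosted_style": f"https://{bucket}.s3.amazonaws.com/{encoded_key}",
    }


# Example bucket and key names are made up.
urls = s3_object_urls("example-bucket", "reports/2014 summary.txt")
```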
Because objects are accessible by unmodified HTTP clients, S3 can be used to replace significant existing (static) web hosting infrastructure. The AWS authentication mechanism allows the bucket owner to create an authenticated URL with time-bounded validity. In other words, someone can construct a URL that can be handed off to a third party and that grants access for a limited period, such as the next 30 minutes or any other defined amount of time. Every item in a bucket can also be served up as a BitTorrent feed. The S3 store can act as a seed host for a torrent, and any BitTorrent client can retrieve the file.
Encryption Capabilities of AWS S3
As data moves from the client's on-premises environment to AWS S3, it should be protected both in transit and at rest in S3 buckets. While protecting data in transit can be handled fairly easily by deploying SSL/TLS, protecting data at rest involves the following two deployment options:
Client Side Encryption
In client side encryption, all encryption and decryption happens exclusively in the client's applications, using a process called “envelope encryption”. In the envelope encryption process, master encryption keys and unencrypted data are never sent to AWS, so it is very important that the client manages its encryption keys safely. If the client loses its encryption keys, it will not be able to decrypt its data, and it cannot recover the keys from AWS, since AWS does not know anything about them.
The goal of envelope encryption is to combine the secure key management that asymmetric keys provide with the performance of fast symmetric encryption. A one-time-use symmetric key (the envelope symmetric key) is generated by the Amazon S3 encryption client to encrypt the data; that key is then encrypted with the master key and stored alongside the data in Amazon S3. When the data is accessed with the Amazon S3 encryption client, the encrypted symmetric key is retrieved and decrypted with the client's master key, and then the data is decrypted.
Below is the process of encryption and decryption in AWS S3 client side encryption.
Encryption:
- Generate a one-time-use envelope symmetric key using the AWS S3 encryption client.
- Encrypt the data using this envelope key.
- Encrypt the envelope key using a master public key or symmetric key.
- Store the encrypted envelope key with the encrypted file.
- Store a description of the master key alongside the envelope key to uniquely identify the key used to encrypt the envelope key.
Decryption:
- Retrieve the encrypted envelope key that was stored with the encrypted file.
- Retrieve the description of the original master key.
- If the description of the master key on hand does not match the description of the original master key, use the unique description to fetch the original master symmetric key or private key.
- Decrypt the envelope key using the master key.
- Decrypt the file data using the envelope key.
Benefits and drawbacks:
- If the client's master key is compromised, the client has the option of re-encrypting only the stored envelope symmetric keys, instead of re-encrypting all the data in its account.
- Despite these advantages, one drawback of client side encryption is that the envelope symmetric key, although encrypted, is stored on the AWS side, so the client does not keep every key entirely within its own control.
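The encryption and decryption steps listed above can be sketched in a few lines. This is a minimal illustration of the envelope flow only, not the Amazon S3 encryption client: all function names and the record layout are hypothetical, and, so that it runs with the Python standard library alone, it uses a stand-in keystream cipher built from HMAC-SHA256 instead of AES. The stand-in is unauthenticated and must not be used to protect real data.

```python
import hashlib
import hmac
import os


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Stand-in cipher: XOR data with an HMAC-SHA256 counter keystream.

    NOT AES and NOT authenticated; it exists only to make the sketch runnable.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = os.urandom(16)  # fresh nonce per message, stored as a prefix
    return nonce + _keystream_xor(key, nonce, plaintext)


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, body = blob[:16], blob[16:]
    return _keystream_xor(key, nonce, body)


def envelope_encrypt(master_keys: dict, master_key_id: str, plaintext: bytes) -> dict:
    envelope_key = os.urandom(32)  # one-time-use envelope key
    return {
        "ciphertext": toy_encrypt(envelope_key, plaintext),
        # Only the encrypted (wrapped) envelope key is stored with the data.
        "wrapped_key": toy_encrypt(master_keys[master_key_id], envelope_key),
        # Description that identifies which master key wrapped the envelope key.
        "master_key_id": master_key_id,
    }


def envelope_decrypt(master_keys: dict, record: dict) -> bytes:
    master_key = master_keys[record["master_key_id"]]  # look up by description
    envelope_key = toy_decrypt(master_key, record["wrapped_key"])
    return toy_decrypt(envelope_key, record["ciphertext"])


def rewrap(master_keys: dict, record: dict, new_master_key_id: str) -> dict:
    """Rotate the master key by re-encrypting only the envelope key."""
    old_master = master_keys[record["master_key_id"]]
    envelope_key = toy_decrypt(old_master, record["wrapped_key"])
    return {
        **record,
        "wrapped_key": toy_encrypt(master_keys[new_master_key_id], envelope_key),
        "master_key_id": new_master_key_id,
    }


# Usage with made-up master keys held only by the client.
keys = {"master-v1": os.urandom(32), "master-v2": os.urandom(32)}
record = envelope_encrypt(keys, "master-v1", b"quarterly report")
rotated = rewrap(keys, record, "master-v2")
```

Note that `rewrap` leaves the ciphertext untouched, which is the property that makes master key rotation cheap: only the small wrapped key changes, not the stored data.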
Server Side Encryption
AWS S3 has added a feature that gives customers the option to encrypt data at rest using encryption keys they manage themselves. The S3 interface is easily accessible over APIs: the encryption key is supplied as part of the PUT request, and S3 takes care of the rest of the process. Using that key, S3 handles both the encryption of data as it is written to disk and the decryption of objects when they are read, freeing customers from maintaining their own encryption and decryption code. AWS S3 uses the key to apply 256-bit AES encryption to the client's data and then removes the key from memory, i.e. AWS S3 does not store client keys. Instead, AWS S3 stores a randomly salted HMAC value of the encryption key in order to validate future requests. The salted HMAC value cannot be used to derive the value of the encryption key or to decrypt the contents of the encrypted object. When you retrieve an object, you must provide the same encryption key as part of your request. Amazon S3 first verifies that the encryption key you provided matches, and then decrypts the object before returning the object data to you.
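To make the idea of a salted HMAC check concrete, here is a minimal sketch of how a service could validate a presented key without storing it. It is not a description of S3's internals: the choice of SHA-256, the 16-byte salt, the use of the salt as the HMAC key, and both function names are assumptions made for illustration.

```python
import hashlib
import hmac
import os


def store_key_check(customer_key: bytes) -> dict:
    """Keep only a random salt and an HMAC of the key, never the key itself."""
    salt = os.urandom(16)  # assumed salt length
    tag = hmac.new(salt, customer_key, hashlib.sha256).digest()
    return {"salt": salt, "tag": tag}


def key_matches(check: dict, presented_key: bytes) -> bool:
    """Recompute the HMAC for the presented key and compare in constant time."""
    expected = hmac.new(check["salt"], presented_key, hashlib.sha256).digest()
    return hmac.compare_digest(expected, check["tag"])


# Usage: the stored check validates the right key and rejects any other.
customer_key = os.urandom(32)
check = store_key_check(customer_key)
```

Because the stored value is a one-way function of the key, holding the salt and tag is not enough to recover the key or decrypt the object.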
Features of Server Side Encryption:
- AWS S3 accepts encryption keys as parameters only over an HTTPS connection. Any key provided over insecure HTTP is discarded.
- All key management lifecycle activities have to be handled by the client.
- If the client loses the encryption key, the client loses the object stored in AWS S3.
When using AWS S3 server side encryption with customer keys, provide encryption key information using the following headers:
- x-amz-server-side-encryption-customer-algorithm : This header should be used to specify the encryption algorithm. The header value must be “AES256”.
- x-amz-server-side-encryption-customer-key : This header should be used to provide the 256 bit, base64-encoded encryption key for Amazon S3 to use to encrypt or decrypt your data.
- x-amz-server-side-encryption-customer-key-MD5: This header should be used to provide the base64-encoded 128-bit MD5 digest of the encryption key according to RFC 1321. Amazon S3 uses this header for a message integrity check to ensure the encryption key was transmitted without error.
Advantages and drawbacks of server side encryption with customer keys:
- The client has complete ownership of the keys, which greatly reduces the chances of data theft from the CSP side.
- Because the client has complete ownership of the keys, a client that loses a key loses the data as well. AWS will not be responsible for recovering the data.
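The three request headers described above can be assembled as in the following sketch. The helper name `sse_c_headers` is hypothetical; the code only builds the header values and does not send a request, which in practice must go over HTTPS.

```python
import base64
import hashlib
import os


def sse_c_headers(key: bytes) -> dict:
    """Build the customer-key headers for a 256-bit key (illustrative helper)."""
    if len(key) != 32:
        raise ValueError("expected a 256-bit (32-byte) key")
    key_b64 = base64.b64encode(key).decode("ascii")
    # MD5 digest of the raw key bytes, base64-encoded, used as an integrity check.
    md5_b64 = base64.b64encode(hashlib.md5(key).digest()).decode("ascii")
    return {
        "x-amz-server-side-encryption-customer-algorithm": "AES256",
        "x-amz-server-side-encryption-customer-key": key_b64,
        "x-amz-server-side-encryption-customer-key-MD5": md5_b64,
    }


# Usage: generate a random key and derive the headers from it.
key = os.urandom(32)
headers = sse_c_headers(key)
```

The same headers have to accompany the later GET request for the object, so the client needs to keep the key somewhere safe for as long as the object must remain readable.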