In S3 there is no concept of directories within the buckets, just keys with very long names that contain slashes (“/”)
Object values are the content of the body:
Max object size in S3 is 5TB
An upload bigger than 5GB must use multi-part upload
Each object can have metadata: list of text key/value pairs - system or user added
Each object can have tags: Unicode key/value pairs, useful for security and lifecycle rules. An object can have up to 10 tags
If versioning is enabled each object has a version ID
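As a rough illustration of how a multi-part upload splits a large object, the sketch below only plans the byte ranges of the parts; it does not talk to S3. The function name plan_parts and the default part size are assumptions made for this example. The limits used (parts between 5 MiB and 5 GiB, at most 10,000 parts) are the documented multi-part limits as far as I know and should be checked against the current S3 documentation.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

MIN_PART_SIZE = 5 * MIB   # every part except the last must be at least this big
MAX_PART_SIZE = 5 * GIB   # upper bound for a single part
MAX_PARTS = 10_000        # upper bound on the number of parts per upload


def plan_parts(object_size, part_size=100 * MIB):
    """Return (part_number, offset, length) tuples covering object_size bytes."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    if not MIN_PART_SIZE <= part_size <= MAX_PART_SIZE:
        raise ValueError("part_size must be between 5 MiB and 5 GiB")
    part_count = -(-object_size // part_size)  # ceiling division
    if part_count > MAX_PARTS:
        raise ValueError("too many parts; choose a larger part_size")
    parts = []
    for index in range(part_count):
        offset = index * part_size
        length = min(part_size, object_size - offset)  # the last part may be shorter
        parts.append((index + 1, offset, length))      # part numbers start at 1
    return parts
```

For example, plan_parts(6 * GIB, part_size=1 * GIB) yields six parts of 1 GiB each, which together cover an object that is too large for a single upload.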
S3 Versioning
Files can be versioned in S3; versioning is enabled at the bucket level
If a file is uploaded with the same key (same filename), a new version of the file is created. The existing file won't be overwritten, and both files remain available under different versions
It is best practice to version the files, because:
The files will be protected against unintended deletes
The files can be rolled back to previous versions
Notes:
Any file that is not versioned prior to enabling versioning will have the version “null”
Suspending versioning does not delete the previous versions of the file
Deleting versioned files:
Deleting a versioned file adds a delete marker to the file, but the file itself won't be deleted
The file can be restored by deleting the delete marker
Deleting a specific version by its version ID (the file version itself, not just the delete marker) is a permanent delete, meaning that version can't be restored
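The delete-marker behaviour described above can be sketched with a small in-memory model. This is a toy, not the S3 API: the class VersionedBucket, its method names, and the "v1", "v2" style version IDs are all invented for illustration (real version IDs are opaque strings generated by S3).

```python
import itertools


class VersionedBucket:
    """Toy in-memory model of a versioning-enabled bucket (not the real S3 API)."""

    def __init__(self):
        self._versions = {}             # key -> list of version records, oldest first
        self._ids = itertools.count(1)  # hypothetical version ID generator

    def _append(self, key, body, is_delete_marker):
        version_id = f"v{next(self._ids)}"
        record = {"id": version_id, "body": body, "delete_marker": is_delete_marker}
        self._versions.setdefault(key, []).append(record)
        return version_id

    def put(self, key, body):
        """Uploading to an existing key adds a new version; nothing is overwritten."""
        return self._append(key, body, False)

    def delete(self, key):
        """Delete without a version ID: only adds a delete marker."""
        return self._append(key, None, True)

    def delete_version(self, key, version_id):
        """Delete one specific version (or delete marker) permanently."""
        self._versions[key] = [v for v in self._versions[key] if v["id"] != version_id]

    def get(self, key):
        """Return the current body, or None if the latest version is a delete marker."""
        versions = self._versions.get(key, [])
        if not versions or versions[-1]["delete_marker"]:
            return None
        return versions[-1]["body"]
```

In this model, calling delete hides the file, calling delete_version on the returned marker ID brings the latest version back, and calling delete_version on a real version removes it for good.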
S3 Security
Encryption at Rest
AWS provides 4 methods of encryption for objects in S3
SSE-S3: encrypts S3 objects using keys handled and managed by AWS
SSE-KMS: leverage AWS Key Management Service to manage encryption keys
SSE-C: the encryption keys are managed by the user
Client Side Encryption
SSE-S3:
Keys used for encryption are managed by Amazon S3
Objects are encrypted on the server side
It uses AES-256 encryption
To request SSE-S3 encryption, clients must set the header "x-amz-server-side-encryption": "AES256"
SSE-KMS:
Encryption keys are handled and managed by KMS
KMS lets us control which keys are used for the encryption; it also provides audit trails that show who used the KMS key
Objects are encrypted on the server side
To request SSE-KMS encryption, clients must set the header "x-amz-server-side-encryption": "aws:kms"
SSE-C:
Server-side encryption using keys provided by the user and managed outside of AWS
Amazon S3 does not store the encryption key provided by the user
Data must be transmitted through HTTPS, because the key is sent to AWS
Encryption key must be provided in the header of the request for every request
When retrieving the object, the same encryption key must be provided in the header
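To make the three server-side modes concrete, the sketch below builds the request headers each one expects. It only constructs dictionaries and sends nothing. The function names are made up for this example. The SSE-S3 and SSE-KMS header values come from the notes above; the optional KMS key ID header and the three SSE-C header names are the ones I understand S3 to use and should be verified against the S3 documentation.

```python
import base64
import hashlib


def sse_s3_headers():
    """Header asking S3 to encrypt with S3-managed keys."""
    return {"x-amz-server-side-encryption": "AES256"}


def sse_kms_headers(kms_key_id=None):
    """Header asking S3 to encrypt with a KMS key; kms_key_id is optional."""
    headers = {"x-amz-server-side-encryption": "aws:kms"}
    if kms_key_id is not None:
        headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
    return headers


def sse_c_headers(key):
    """Headers carrying a customer-provided 256-bit key (send over HTTPS only)."""
    if len(key) != 32:
        raise ValueError("SSE-C expects a 256-bit (32-byte) key")
    key_b64 = base64.b64encode(key).decode("ascii")
    # The MD5 digest lets S3 check that the key arrived intact.
    key_md5_b64 = base64.b64encode(hashlib.md5(key).digest()).decode("ascii")
    return {
        "x-amz-server-side-encryption-customer-algorithm": "AES256",
        "x-amz-server-side-encryption-customer-key": key_b64,
        "x-amz-server-side-encryption-customer-key-MD5": key_md5_b64,
    }
```

With SSE-C, the same headers built from the same key have to accompany the later GET request, since S3 does not keep the key.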
Client Side Encryption:
Data must be encrypted before sending it to S3
This is usually accomplished by using a client-side encryption library, like the Amazon S3 Encryption Client
The user is solely responsible for encryption-decryption
The keys and the encryption cycle are managed by the user
Encryption in transit (SSL/TLS)
Amazon S3 exposes:
HTTP endpoint for non-encrypted data
HTTPS endpoint for encryption in flight which relies on SSL/TLS
Most clients for S3 will use HTTPS by default
In case of SSE-C encryption HTTPS is mandatory
S3 Security Overview
User based security:
Accomplished by using IAM policies, which specify which API calls are allowed for a specific user; they are managed from the IAM console
Resource based security:
Accomplished by using bucket policies which are bucket wide rules from the S3 console. These rules may allow cross account access to the bucket
We also have Object Access Control Lists (ACL) and Bucket Access Control Lists, which allow finer-grained access control
Note: an IAM principal can access an S3 object if:
The user's IAM permissions allow it or the resource policy allows it
There is no explicit deny
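The rule in the note above (an allow from either side, and no explicit deny) can be sketched as a small evaluation function. This is a deliberately simplified model: the function names and the statement shape (dictionaries with "Effect" and a list under "Action") are assumptions for illustration, and it ignores conditions, resource matching, partial wildcards and the extra requirements of cross-account access.

```python
def matches(statement, action):
    """True if the statement covers the action; "Action" must be a list of strings."""
    return action in statement["Action"] or "*" in statement["Action"]


def is_allowed(action, iam_statements, bucket_statements):
    """Allow if any relevant statement allows and none explicitly denies."""
    statements = list(iam_statements) + list(bucket_statements)
    relevant = [s for s in statements if matches(s, action)]
    if any(s["Effect"] == "Deny" for s in relevant):
        return False  # an explicit deny always wins
    return any(s["Effect"] == "Allow" for s in relevant)


# Hypothetical statements used only to exercise the function.
iam = [{"Effect": "Allow", "Action": ["s3:GetObject"]}]
bucket_deny = [{"Effect": "Deny", "Action": ["s3:GetObject"]}]

print(is_allowed("s3:GetObject", iam, []))           # allowed by the IAM policy
print(is_allowed("s3:GetObject", iam, bucket_deny))  # blocked by the explicit deny
```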
S3 Bucket Policies
Bucket policies are JSON based documents
They can be applied to both buckets and objects in buckets
The effect of a statement in the bucket policy can be either allow or deny
The principal in the policy represents the account or the user to which the policy applies
Common use cases for S3 bucket policies:
Grant public access to the bucket
Force objects to be encrypted at the upload
Grant access to another account (cross account access)
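As an example of the "force objects to be encrypted at upload" use case, the sketch below builds a bucket policy as a Python dictionary and prints it as JSON. The bucket name example-bucket, the function name and the Sid values are placeholders. The pattern (deny s3:PutObject when the encryption header is missing or not AES256) follows the commonly published AWS example as I remember it, so treat it as a starting point rather than a verified policy.

```python
import json


def require_sse_s3_policy(bucket_name):
    """Build a bucket policy that rejects uploads without the SSE-S3 header."""
    resource = f"arn:aws:s3:::{bucket_name}/*"
    header_key = "s3:x-amz-server-side-encryption"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Deny uploads whose encryption header has the wrong value.
                "Sid": "DenyWrongEncryptionHeader",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": resource,
                "Condition": {"StringNotEquals": {header_key: "AES256"}},
            },
            {
                # Deny uploads that omit the encryption header entirely.
                "Sid": "DenyMissingEncryptionHeader",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": resource,
                "Condition": {"Null": {header_key: "true"}},
            },
        ],
    }


print(json.dumps(require_sse_s3_policy("example-bucket"), indent=2))
```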
Bucket Settings for Block Public Access
Relatively new settings that were created to block public access to buckets and objects; they can also be applied at the account level
S3 provides 4 different kinds of block public access settings, which block public access granted through:
new access control lists
any access control lists
new public bucket or access point policies
any public bucket or access point policies (this setting blocks both public and cross-account access to buckets and objects)
These settings were created to prevent company data leaks
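The four settings map onto four boolean fields. The sketch below builds that structure as a plain dictionary. The field names are those of the PublicAccessBlockConfiguration structure as I understand it, and the helper name public_access_block is invented here. Assuming boto3 is in use, a dictionary of this shape is what its put_public_access_block call takes as PublicAccessBlockConfiguration, but that call is not exercised in this example.

```python
def public_access_block(block_all=True):
    """Return the four Block Public Access flags, all set to the same value."""
    return {
        "BlockPublicAcls": block_all,        # reject new public ACLs
        "IgnorePublicAcls": block_all,       # ignore any public ACLs, including existing ones
        "BlockPublicPolicy": block_all,      # reject new public bucket or access point policies
        "RestrictPublicBuckets": block_all,  # restrict access granted by any public policy
    }


config = public_access_block()
print(config)
```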
S3 Other Security Features
Networking:
S3 supports VPC Endpoints (for instances in VPC without internet access)
Logging and Audit:
S3 Access Logs can be stored in other S3 buckets
API calls can be logged in AWS CloudTrail
User Security:
MFA Delete can be required in versioned buckets in order to protect against accidental deletions
Pre-Signed URLs: URLs that are valid only for a limited time
S3 Websites
S3 can host static websites and have them accessible from the internet
In case of 403 errors we have to make sure that the bucket policy allows public reads
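A bucket policy that allows public reads, which is the usual fix for the 403 errors mentioned above, might look like the sketch below. The bucket name example-website-bucket, the function name and the Sid are placeholders, and the policy only takes effect if the Block Public Access settings of the bucket permit public policies.

```python
import json


def public_read_policy(bucket_name):
    """Build a bucket policy that lets anyone read objects in the bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadGetObject",
                "Effect": "Allow",
                "Principal": "*",                              # anyone
                "Action": "s3:GetObject",                      # read objects only
                "Resource": f"arn:aws:s3:::{bucket_name}/*",   # every object in the bucket
            }
        ],
    }


print(json.dumps(public_read_policy("example-website-bucket"), indent=2))
```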
CORS
An origin is the combination of a scheme (protocol), a host (domain) and a port
CORS: Cross Origin Resource Sharing
CORS is a web browser based mechanism to allow requests to other origins while visiting the main one
Same origin example: http://example.com/app1 and http://example.com/app2
Different origins: http://example.com and http://otherexample.com
The request won't be fulfilled unless the other origin allows it, using CORS headers (example: Access-Control-Allow-Origin, Access-Control-Allow-Methods)
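The same-origin comparison used in the examples above can be sketched with the standard library. The helper names origin and same_origin are invented for this example, and filling in the default ports 80 and 443 is an assumption made so that URLs with and without an explicit port compare sensibly.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that identifies the origin of a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)  # fill in the default port if none is given
    return (scheme, parts.hostname, port)


def same_origin(url_a, url_b):
    """Two URLs share an origin only if scheme, host and port all match."""
    return origin(url_a) == origin(url_b)


print(same_origin("http://example.com/app1", "http://example.com/app2"))
print(same_origin("http://example.com", "http://otherexample.com"))
```

The path is not part of the origin, which is why /app1 and /app2 on the same host count as the same origin.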
S3 CORS
If a client does a cross-origin request on an S3 bucket, the correct CORS headers need to be enabled in order for the request to succeed
Requests can be allowed for a specified origin (by specifying the URL of the origin) or for all origins (by using *)
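A CORS configuration for a bucket is a list of rules. The sketch below shows one such rule as a Python dictionary, plus a simplified check of whether a request would match it. The key names (CORSRules, AllowedOrigins, AllowedMethods, AllowedHeaders, MaxAgeSeconds) follow the S3 CORS configuration format as I understand it; the function cors_allows and its matching logic are illustrative only and cover just an exact origin match or "*".

```python
# Example rule: allow GET requests coming from pages served by http://example.com.
cors_configuration = {
    "CORSRules": [
        {
            "AllowedOrigins": ["http://example.com"],
            "AllowedMethods": ["GET"],
            "AllowedHeaders": ["*"],
            "MaxAgeSeconds": 3000,  # how long a browser may cache the preflight response
        }
    ]
}


def cors_allows(configuration, request_origin, method):
    """Simplified check: True if some rule allows this origin and method."""
    for rule in configuration["CORSRules"]:
        origins = rule["AllowedOrigins"]
        origin_ok = "*" in origins or request_origin in origins
        if origin_ok and method in rule["AllowedMethods"]:
            return True
    return False


print(cors_allows(cors_configuration, "http://example.com", "GET"))
print(cors_allows(cors_configuration, "http://otherexample.com", "GET"))
```

Replacing the origin list with ["*"] would allow requests from all origins.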