Storage

Amazon S3 provides multiple storage classes depending on your object’s access patterns and requirements for retrieval speed. S3 Standard, S3 Standard-Infrequent Access, and S3 Glacier Instant Retrieval all provide millisecond first-byte latency for object retrievals but charge different rates per GB of data stored and retrieved. Generally speaking, the colder a storage tier is, the cheaper it is to store data, but the more expensive it gets to retrieve it. For this example, I’ll only be comparing three storage classes that offer millisecond first-byte data retrieval. Read more...
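To make the trade-off concrete, here is a small back-of-the-envelope calculation. The per-GB prices are illustrative us-east-1 figures at the time of writing, ignore request and minimum-storage-duration charges, and will drift over time, so treat the numbers as placeholders rather than a quote.

# Monthly cost for a hypothetical 500 GB stored / 50 GB retrieved workload.
# Storage and retrieval prices below are illustrative and subject to change.
awk -v gb=500 -v ret=50 'BEGIN {
  printf "S3 Standard                  $%.2f\n", gb * 0.023  + ret * 0.00
  printf "S3 Standard-IA               $%.2f\n", gb * 0.0125 + ret * 0.01
  printf "S3 Glacier Instant Retrieval $%.2f\n", gb * 0.004  + ret * 0.03
}'

The colder classes win as long as the retrieved fraction stays small; retrieve a large enough share of the data each month and S3 Standard comes out ahead.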

TL;DR: There may come a time when you need to build a ZFS pool in a temporarily degraded state. Reddit user u/mercenary_sysadmin describes how this is done, and this post provides additional commentary. The approach uses a temporary file in place of the missing disk, then takes the file offline, leaving the pool degraded but usable. When the physical disk becomes available, the file can be replaced with the actual disk in the pool. My desktop sits idle most of the time as I’ve switched to using my laptop for day-to-day tasks, which seemed like a waste of hardware. Meanwhile, my NAS, running off an 8TB WD Elements drive connected via USB to a ThinkCentre M93p, could use an upgrade. The plan was to set up my desktop as the new NAS with four 8TB WD drives in RAID-Z1. I already had the drives, plus one spare, lying around, so I thought it would be an easy installation, but two of the drives were dead. This left me with three good drives plus the WD Elements, which I would then have to shuck. A backup of all critical data was already in place, but the restore would have taken too much time, so I decided to set up a degraded ZFS pool with the three drives, copy the data over from the WD Elements, shuck it, and then introduce it to the pool. Read more...
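For reference, a minimal sketch of the sparse-file trick described above; the pool name, device paths, and size are placeholders you would adapt to your own hardware.

# Create a sparse file at least as large as the smallest real disk.
truncate -s 8T /root/fake-disk.img
# Build the RAID-Z1 pool with three real disks plus the file vdev.
zpool create tank raidz1 \
  /dev/disk/by-id/ata-DISK1 \
  /dev/disk/by-id/ata-DISK2 \
  /dev/disk/by-id/ata-DISK3 \
  /root/fake-disk.img
# Take the file offline; the pool is now DEGRADED but fully usable.
zpool offline tank /root/fake-disk.img
# Later, once the fourth physical disk is available, swap it in.
zpool replace tank /root/fake-disk.img /dev/disk/by-id/ata-DISK4

While the pool is degraded there is no parity to fall back on until the replace and resilver complete, so the data copy runs without a safety net.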

AWS Backup is a service that helps orchestrate, audit, and restore backups within and across AWS accounts. The contents of this post are my personal notes on backup plans and do not reflect the views of my employer. A core component of the AWS Backup service is the backup plan. The AWS documentation describes a backup plan as: “[…] a policy expression that defines when and how you want to back up your AWS resources.” Read more...
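As a rough illustration of what a plan looks like, here is a minimal sketch using the CLI; the plan name, vault, schedule, and retention are placeholder values, and the target vault must already exist.

# plan.json: one daily rule with 35-day retention (all values are examples).
cat > plan.json <<'EOF'
{
  "BackupPlanName": "daily-35-day-retention",
  "Rules": [
    {
      "RuleName": "daily",
      "TargetBackupVaultName": "Default",
      "ScheduleExpression": "cron(0 5 * * ? *)",
      "StartWindowMinutes": 60,
      "CompletionWindowMinutes": 180,
      "Lifecycle": { "DeleteAfterDays": 35 }
    }
  ]
}
EOF
aws backup create-backup-plan --backup-plan file://plan.json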

I’ve been spending more time browsing StackOverflow recently and came across a question asking if it was possible to find duplicate objects within an S3 bucket. One way would be to hash each object prior to upload and store the value in a local or remote data store. If that’s not possible, or too much overhead, I figured I could use S3 Metadata and Athena to solve this, both services I’ve covered on this blog not too long ago. Athena alone has come up a few times this year, simply because I keep finding interesting use cases for it. While I am an AWS employee, everything I’ve written, and will write, on this blog has always been out of personal interest. There are no sponsored posts here and all opinions are my own. Read more...
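For a rough idea of the problem, here is a CLI-only sketch that skips Athena entirely: it groups keys by ETag and prints any ETag shared by more than one object. The bucket name is a placeholder, keys are assumed to contain no whitespace, and ETags of multipart uploads are not plain MD5 hashes, so a matching ETag is a strong hint rather than proof of identical content.

# List ETag/Key pairs and report ETags that appear more than once.
aws s3api list-objects-v2 \
  --bucket my-bucket \
  --query 'Contents[].[ETag, Key]' \
  --output text |
awk '{ count[$1]++; keys[$1] = keys[$1] " " $2 }
     END { for (e in count) if (count[e] > 1) print e ":" keys[e] }'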

Amazon S3 Intelligent-Tiering moves your data to the most cost-effective S3 storage tier based on the object’s access pattern for the price of $0.0025 per 1,000 objects it monitors. Since the movement is done by the service, you don’t know, or need to know, the access tier an object is currently in, as all objects can be retrieved synchronously. If you opt in to the asynchronous archive tiers, you can find out whether an object is in one of those tiers by requesting the HEAD of the object. This only works for the opt-in tiers; if you’d like to find out whether an object is in the Frequent Access, Infrequent Access, or Archive Instant Access tier, you will need to refer to Amazon S3 Inventory. S3 Inventory provides a snapshot of your objects’ metadata at a daily or weekly frequency; this snapshot also includes the S3 Intelligent-Tiering access tier, the key we are interested in. Read more...
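A quick sketch of the HEAD request for the opt-in case; the bucket and key are placeholders. ArchiveStatus only appears once the object has been moved to the Archive Access or Deep Archive Access tier.

# Check whether an Intelligent-Tiering object has been archived.
aws s3api head-object \
  --bucket my-bucket \
  --key path/to/object \
  --query '{StorageClass: StorageClass, ArchiveStatus: ArchiveStatus, Restore: Restore}'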

Recently, Simon Willison shared how he uses S3 event notifications with Lambda and DynamoDB to list recently uploaded files from an S3 bucket. The first thought that occurred to me was to use S3 Inventory, which provides a daily catalog of objects within a bucket, queryable through Athena. The second idea involved doing the same with the recently announced S3 Metadata feature. Both methods, I discovered, had already been suggested by others. In this post, I want to explore the S3 Metadata method to get my feet wet with the service. Read more...
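For the S3 Inventory route, a hedged sketch of what the Athena side could look like; the database, table, and output location are placeholders, and it assumes the inventory configuration includes the last_modified_date field.

# Ask Athena for the most recently modified keys in the inventory snapshot.
aws athena start-query-execution \
  --query-string "SELECT key, last_modified_date FROM s3_inventory ORDER BY last_modified_date DESC LIMIT 20" \
  --query-execution-context Database=my_database \
  --result-configuration OutputLocation=s3://my-athena-results/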

Recursively deleting all objects in a bucket and the bucket itself can be done with the following command.

aws s3 rb s3://<bucket_name> --force

If the bucket has versioning enabled, any object versions and delete markers will fail to delete, and the following message will be returned.

remove_bucket failed: s3://<bucket_name> An error occurred (BucketNotEmpty) when calling the DeleteBucket operation: The bucket you tried to delete is not empty. You must delete all versions in the bucket.

The following set of commands deletes all objects, versions, delete markers, and the bucket. Read more...
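As a sketch of one way to finish the job (not necessarily the exact commands behind the link): delete the versions, then the delete markers, then the bucket. delete-objects accepts at most 1,000 keys per call, so larger buckets need the first two steps repeated or wrapped in a loop.

BUCKET=my-bucket
# Delete object versions (the call errors if there are none to delete).
aws s3api delete-objects --bucket "$BUCKET" --delete "$(aws s3api list-object-versions \
  --bucket "$BUCKET" --max-items 1000 \
  --query '{Objects: Versions[].{Key: Key, VersionId: VersionId}}' --output json)"
# Delete delete markers.
aws s3api delete-objects --bucket "$BUCKET" --delete "$(aws s3api list-object-versions \
  --bucket "$BUCKET" --max-items 1000 \
  --query '{Objects: DeleteMarkers[].{Key: Key, VersionId: VersionId}}' --output json)"
# Remove the now-empty bucket.
aws s3 rb "s3://$BUCKET"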

Install the NFS client package. For distros that use yum, install nfs-utils instead.

sudo apt install nfs-common

Manually mount the share in a directory, replacing the following with your own values: server with your NFS server, /data with your exported directory, and /mnt/data with your mount point.

sudo mount -t nfs server:/data /mnt/data

To automatically mount the NFS share, edit /etc/fstab with the following:

# <file system> <mount point> <type> <options> <dump> <pass>
server:/data /mnt/data nfs defaults 0 0

To reload fstab verbosely, use the following command: Read more...
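Separately from whatever command the post goes on to show, a couple of quick sanity checks can confirm the export is visible and the share is actually mounted; server and /mnt/data are the same placeholders as above.

showmount -e server   # list the exports the NFS server offers
df -h /mnt/data       # confirm the share is mounted at the expected path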

EBS sends events to CloudWatch when creating, deleting, or attaching a volume, but not on detachment. However, CloudTrail is able to list detachments; the command below lists the last 25 detachment events.

aws cloudtrail lookup-events \
  --max-results 25 \
  --lookup-attributes AttributeKey=EventName,AttributeValue=DetachVolume

Setting up notifications is then possible with CloudWatch alarms for CloudTrail. The steps are summarized below, with a sketch of the CLI commands after the list:
1. Ensure that a trail is created with a log group.
2. Create a metric filter in CloudWatch with the filter pattern { $.eventName = "DetachVolume" }.
3. Create an alarm in CloudWatch with a threshold of 1 and the appropriate action.
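A hedged sketch of steps 2 and 3 with the CLI; the log group name, metric namespace, and SNS topic ARN are placeholders, and it assumes the trail from step 1 is already delivering to that log group.

# Step 2: metric filter that counts DetachVolume events in the trail's log group.
aws logs put-metric-filter \
  --log-group-name CloudTrail/Logs \
  --filter-name DetachVolumeFilter \
  --filter-pattern '{ $.eventName = "DetachVolume" }' \
  --metric-transformations metricName=DetachVolumeCount,metricNamespace=Custom/EBS,metricValue=1
# Step 3: alarm that fires when the metric reaches 1 within a five-minute period.
aws cloudwatch put-metric-alarm \
  --alarm-name ebs-detach-volume \
  --namespace Custom/EBS \
  --metric-name DetachVolumeCount \
  --statistic Sum \
  --period 300 \
  --evaluation-periods 1 \
  --threshold 1 \
  --comparison-operator GreaterThanOrEqualToThreshold \
  --treat-missing-data notBreaching \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:alerts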