S3 Folder Usage Recording


Prerequisite

Install or update to the latest version of the AWS CLI
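You can confirm the CLI is available and check its version with:

aws --version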

Solution

get_daily_usage.sh

#!/bin/bash

# Set your AWS credentials (placeholders shown here; in production an
# IAM role or the shared credentials file is safer than hardcoding keys)
export AWS_ACCESS_KEY_ID="YOUR_ID"
export AWS_SECRET_ACCESS_KEY="YOUR_KEY"
export AWS_DEFAULT_REGION="REGION"

# Name of your S3 bucket
my_bucket_name="bucketx"

# CSV file name and local directory where daily results are appended
output_file="daily_usage.csv"
local_output_path="/Users/waicung/aws/"

# Calculate the total size of a folder in GB: list every object under the
# prefix and sum the size column (bytes, field 3) with awk
get_folder_size() {
    bucket_name="$1"
    folder_path="$2"
    aws s3 ls "s3://$bucket_name/$folder_path" --recursive | awk 'BEGIN {total=0} {total+=$3} END {print total/1024/1024/1024}'
}

# List all first-level "folders" (common prefixes), which `aws s3 ls`
# prints as lines starting with "PRE"; strip that marker to get the names
get_first_level_folders() {
    bucket_name="$1"
    aws s3 ls "s3://$bucket_name/" | sed -n 's/^ *PRE //p'
}

check_folder_size() {
    bucket_name="$1"
    output_file="$2"

    folders=$(get_first_level_folders "$bucket_name")

    # Create the output file with a header row if it doesn't exist yet
    if [ ! -f "$output_file" ]; then
        echo "Date, Bucket, Folder, Size(GB)" >> "$output_file"
    fi

    # Append one timestamped row per first-level folder
    while IFS= read -r folder; do
        echo "Measuring folder: $folder"
        datetime=$(date '+%Y-%m-%d %H:%M:%S')
        echo "$datetime, $bucket_name, $folder, $(get_folder_size "$bucket_name" "$folder")" >> "$output_file"
    done <<< "$folders"
}

check_folder_size "$my_bucket_name" "$local_output_path$output_file"
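Make the script executable and run it once to confirm a row is written to the CSV (paths as defined in the script above):

chmod +x get_daily_usage.sh
./get_daily_usage.sh
cat "/Users/waicung/aws/daily_usage.csv"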

Upload the result back to S3

log_bucket_name="bucket-log"

aws s3 cp "$local_output_path$output_file" "s3://$log_bucket_name/usage/daily_usage.csv"
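To verify the upload, list the destination prefix (using the bucket name defined above):

aws s3 ls "s3://bucket-log/usage/"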

Schedule in crontab

crontab -e

For example, to run daily at 23:59:

59 23 * * * /home/waicung/get_daily_usage.sh
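Cron jobs run silently, so it can help to redirect the script's output to a log file for troubleshooting. The log path below is just an example:

59 23 * * * /home/waicung/get_daily_usage.sh >> /home/waicung/cron.log 2>&1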