Finding the size of everything in your Redshift cluster – including tables, schemas, and even entire databases – is a fairly simply procedure when using a few handy scripts provided by Amazon support. We’ll explore all three methods below.
Getting Table Sizes
The simplest method for listing the size of your database tables is to use a handy and official script provided by Amazon Redshift Utilities, table_info.sql
.
This script looks (and, in fact, is) rather complex, but in essence it can be executed to quickly return a list of all the user tables in the system, along with various information about each table. The crucial column for our purposes is mbytes
, which indicates the total size, in megabytes
, of that particular table.
The results table will look something like this, which is automatically sorted from largest to smallest table:
schema table tableid distkey skew sortkey #sks rows mbytes enc pct_of_total pct_stats_off pct_unsorted
public users 100074 userid 1.00 userid 1 49990 84 Y 0.02 0.00 0.00
public sales 100087 listid 1.00 dateid 1 172456 52 Y 0.01 0.00 0.00
public date 100080 dateid 1.00 dateid 1 365 44 N 0.01 0.00 0.00
public listing 100085 listid 1.00 dateid 1 192497 44 Y 0.01 0.00 0.00
public event 100083 eventid 1.00 dateid 1 8798 36 Y 0.00 0.00 0.00
public venue 100076 venueid 1.00 venueid 1 202 32 N 0.00 0.00 0.00
public category 100078 catid 1.00 catid 1 11 28 N 0.00 0.00 0.00
Getting Schema Sizes
While Amazon hasn’t created a specific script for this purpose, we can easily repurpose an existing script for the job of finding the total schema size. First, copy and execute the code for the v_space_used_per_tbl.sql
view script to create a copy of that view in your Redshift cluster. Note: If you receive an error about the admin
schema, you likely are missing that schema from the system. Try changing the schema in the first line from admin
to something that exists (usually public
).
Now we can use this view to grab just the information we need. In this case, we want to group the results by the schema
and total the size of all tables therein to find the actual size of our schemas. This is accomplished with the following execution of the installed view:
SELECT
schemaname,
SUM(megabytes) as total_mb
FROM
public.v_space_used_per_tbl
GROUP BY
schemaname;
The results will look something like this, displaying the list of schemas
in the system, along with the total megabytes of disk space used by that schema:
schemaname total_mb
public 320
Getting Database Sizes
Finding the size of your databases is nearly identical to the method used above for the schema
sizes. Again we’ll use the already installed v_space_used_per_tbl
view, but this time group by the database
instead of schema
:
SELECT
dbase_name,
SUM(megabytes) as total_mb
FROM
public.v_space_used_per_tbl
GROUP BY
dbase_name;
The results should look familiar to you:
dbase_name total_mb
public 320