
How to Estimate Google BigQuery Pricing

Posted by AJ Welch

Because Google BigQuery pricing is based entirely on usage, there are three core aspects of your BigQuery usage to consider when estimating costs: Storage Data, Long Term Storage Data, and Query Data Usage.

While the official Pricing page includes a lot more details and useful information that’s worth looking over, in this guide we’ll briefly explore each of the three pricing elements in order to estimate our monthly costs.

Storage Data

Storage Data is by far the simplest component of BigQuery pricing to calculate, as BigQuery currently charges a flat rate of $0.02 per GB, per month for all stored data.

It is simple to view the Table Size for the various tables in a BigQuery dataset to give a rough estimation of the Storage Data you’re using. For example, the public-data:samples.gsod table is around 16.1 GB in total with 114 million rows. Yet even a massive table that size is only about a third of a dollar per month in storage fees.

If we had a dataset that was a total of 500 TB of data, we can simply multiply that by 1000 (to convert to GB), then multiply the result by the $0.02 per GB fee.

500 TB * 1000 = 500,000 GB
500,000 GB * $0.02 = $10,000 per month

Even storing a whopping 500 TB of data is (at most) a cost of roughly $10,000 per month in BigQuery.
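The storage arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name and constants are our own, and the $0.02 rate is the figure quoted in this guide, so check it against the official Pricing page before relying on it.

```python
# Assumed rate, taken from this guide (USD per GB, per month).
STORAGE_USD_PER_GB_MONTH = 0.02
# The guide converts TB to GB with a factor of 1000.
GB_PER_TB = 1000


def monthly_storage_cost(size_gb, rate=STORAGE_USD_PER_GB_MONTH):
    """Return the estimated monthly storage cost in USD for size_gb of data."""
    return size_gb * rate


# The 16.1 GB sample table mentioned above.
print(f"${monthly_storage_cost(16.1):.2f} per month")
# The hypothetical 500 TB dataset.
print(f"${monthly_storage_cost(500 * GB_PER_TB):,.0f} per month")
```

Swapping in your own table sizes gives a quick upper bound on storage fees, since no Long Term Storage discount is applied here.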

Long Term Storage Data

After determining the total Storage Data size above, it’s also worth considering how much of that data will qualify as Long Term Storage. Long Term Storage is simply a flag that is automatically applied to any table that has not been updated within the previous 90 consecutive days. Once the 90-day mark has been met, the Storage Data price drops by 50%, from $0.02 per GB, per month to $0.01 per GB, per month.

Functionally, a table marked as Long Term Storage behaves no differently from a normal table. Once the table is updated, however, it automatically reverts to normal Storage Data pricing and the 90-day timer is reset.

The critical keyword here is updated. Simply QUERYING a table (or manually exporting or copying data from it) DOES NOT count as updating its data, so the table will retain Long Term Storage status (if already applied).

On the other hand, if records are updated or added in any way, the table goes back to normal pricing and the 90-day timer resets to zero.

If we estimate that 25% of our dataset from the above example will remain static and only be viewed or queried, then 125,000 GB is billed at $0.01 and the remaining 375,000 GB at $0.02. That lowers our previous estimation of $10,000 per month for Storage Data to $8,750 per month, giving us a $1,250 per month savings due to Long Term Storage.
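To make the blended rate explicit, here is a minimal sketch that splits a dataset into active and Long Term Storage portions. The function name and the `long_term_fraction` parameter are hypothetical, and both rates are the ones quoted in this guide.

```python
def blended_storage_cost(total_gb, long_term_fraction,
                         active_rate=0.02, long_term_rate=0.01):
    """Estimate monthly storage cost in USD when part of the data is long term.

    long_term_fraction is the share (0.0 to 1.0) of the data that has gone
    90 days without an update. Rates are assumptions taken from this guide.
    """
    long_term_gb = total_gb * long_term_fraction
    active_gb = total_gb - long_term_gb
    return active_gb * active_rate + long_term_gb * long_term_rate


full_price = blended_storage_cost(500_000, 0.0)
with_discount = blended_storage_cost(500_000, 0.25)
print(f"Full price:    ${full_price:,.0f}")
print(f"With discount: ${with_discount:,.0f}")
print(f"Savings:       ${full_price - with_discount:,.0f}")
```

Because the discount depends on how much of the data really stays untouched, it is worth trying a few different fractions rather than trusting a single guess.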

Query Data Usage

To calculate Query Data Usage we need to start by estimating a few basic parameters of our service, then use that information to calculate our monthly costs:

  • # of Users (per day)
  • # of Queries (per User, per day)
  • Average Data Usage (per Query)

We can then take those parameters and apply a basic calculation to estimate our monthly Query Data Usage:

numUsersPerDay * numQueriesPerUser * dataPerQuery * daysPerMonth = MONTHLY_QUERY_DATA_USAGE

Thus, if we estimate we’ll see about 150 Users per day, each running 50 Queries per day, with an average data usage of 5 GB per query, we just plug those values in to get our estimation:

150 * 50 * 5 GB * 30 = 1,125,000 GB (1,125 TB) per month

With a rough estimation of 1,125 TB of Query Data Usage per month, we can simply multiply that by the $5 per TB cost of BigQuery at the time of writing to get an estimation of ~$5,625 / month for Query Data Usage.
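The query formula translates directly into code. The sketch below uses our own function and parameter names, assumes a 30-day month and 1000 GB per TB as in the example, and uses the $5 per TB rate quoted above, which may have changed since.

```python
def monthly_query_cost(users_per_day, queries_per_user, gb_per_query,
                       days_per_month=30, usd_per_tb=5.0):
    """Return (monthly usage in TB, estimated monthly cost in USD)."""
    monthly_gb = (users_per_day * queries_per_user
                  * gb_per_query * days_per_month)
    monthly_tb = monthly_gb / 1000  # decimal conversion, as in the guide
    return monthly_tb, monthly_tb * usd_per_tb


usage_tb, cost = monthly_query_cost(150, 50, 5)
print(f"{usage_tb:,.0f} TB per month, roughly ${cost:,.0f}")
```

The average data per query is usually the least certain input, so it is the one most worth varying when testing how sensitive the estimate is.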

Added to the $8,750 per month Storage Data estimate, this brings our grand total to $14,375 per month for our example dataset.
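Putting the pieces together, a rough end-to-end estimator might look like the following. The `Workload` class and `estimate_monthly_total` function are hypothetical names for illustration, and every rate is an assumption carried over from this guide rather than a live price.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Inputs for a monthly estimate (all values are our own guesses)."""
    storage_gb: float
    long_term_fraction: float  # share of data untouched for 90+ days
    users_per_day: int
    queries_per_user: int
    gb_per_query: float


def estimate_monthly_total(w, active_rate=0.02, long_term_rate=0.01,
                           usd_per_tb=5.0, days_per_month=30):
    """Return estimated monthly cost in USD: storage plus query usage."""
    long_term_gb = w.storage_gb * w.long_term_fraction
    storage = ((w.storage_gb - long_term_gb) * active_rate
               + long_term_gb * long_term_rate)
    query_tb = (w.users_per_day * w.queries_per_user
                * w.gb_per_query * days_per_month) / 1000
    return storage + query_tb * usd_per_tb


example = Workload(storage_gb=500_000, long_term_fraction=0.25,
                   users_per_day=150, queries_per_user=50, gb_per_query=5)
print(f"${estimate_monthly_total(example):,.0f} per month")
```

Keeping the rates as keyword arguments makes it easy to rerun the estimate when pricing changes.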

Google Cloud Platform Pricing Calculator

While the above calculations are simple enough once you’ve estimated your monthly usage, Google provides a handy Pricing Calculator tool that you can use to estimate your costs. Just visit the link and select BigQuery, then enter your Storage Data and Query Pricing estimations and you’re all set!