Choosing Between Amazon Redshift and Google BigQuery
Posted by Data Governance on November 16, 2016
Data runs our world. It’s how startups track usage to determine product viability and how large enterprises determine quarterly performance. Making data-driven decisions is no longer a nice-to-have for companies; it’s a competitive requirement.
Most companies run multiple operational databases like MySQL or PostgreSQL, track web analytics with Google Analytics and use Customer Relationship Management (CRM) systems like Salesforce, each of which collects volumes of data. Historically, this data has been stored in separate silos, often as a mix of structured and unstructured data that must be transformed and combined before meaningful analysis can be performed.
As a company continues to amass ever larger volumes of data, it may be time to evaluate data warehousing as a potential solution to one or more of the following challenges:
Data isolation/consolidation: Data needs to be collocated in order for analysis to cross application boundaries (e.g. Google Analytics visitor traffic and sales receipts).
Database sustainability: Production systems tuned for concurrent latency-sensitive transactions may not be particularly efficient when queries touch the entire dataset.
Long-term retention: Storing many years of historical data may be desirable, but it may not be practical to use the same systems as used for day-to-day operations.
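The consolidation challenge above can be illustrated with a minimal sketch. It assumes two hypothetical datasets (daily visitor counts and daily order counts, with made-up values) and uses an in-memory SQLite database as a stand-in for a warehouse. It is only meant to show why colocating data matters, not how Redshift or BigQuery work internally.

```python
import sqlite3

# Hypothetical sample data standing in for two separate silos:
# web analytics (visitors per day) and a sales system (orders per day).
visits = [("2016-11-14", 1200), ("2016-11-15", 1500), ("2016-11-16", 900)]
sales = [("2016-11-14", 30), ("2016-11-15", 45), ("2016-11-16", 18)]


def conversion_by_day(visits, sales):
    """Load both sources into one store and join them on the date."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (day TEXT PRIMARY KEY, visitors INTEGER)")
    conn.execute("CREATE TABLE sales (day TEXT PRIMARY KEY, orders INTEGER)")
    conn.executemany("INSERT INTO visits VALUES (?, ?)", visits)
    conn.executemany("INSERT INTO sales VALUES (?, ?)", sales)
    # The join is only possible because both tables now live in one place.
    rows = conn.execute(
        "SELECT v.day, 1.0 * s.orders / v.visitors "
        "FROM visits v JOIN sales s ON s.day = v.day "
        "ORDER BY v.day"
    ).fetchall()
    conn.close()
    return rows


for day, rate in conversion_by_day(visits, sales):
    print(day, round(rate, 4))
```

Once both sources sit in the same store, a question that crosses application boundaries (orders per visitor, in this case) becomes a single query.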
Data warehouses are not just for storing data. They must be architected to meet analytics and business reporting needs, handling complex analytical queries quickly without impacting operational databases or other systems where the data was originally created.
Raw data on its own offers no insights and addresses few business goals, but by loading raw data into a data warehouse, it’s possible to facilitate data exploration, interactive reporting and data-informed decision-making for an entire organization.
Data Warehouses: A Brief Overview
Historically, data warehouses were clunky systems that took up physical space, needed a white-glove installation and required a team of database administrators to keep them running smoothly. Today’s data warehouses are cloud-based, available on demand and carry significantly lower upfront and ongoing costs than legacy data warehouses.
A wide range of businesses from SaaS startups to Fortune 500 companies are storing and analyzing massive amounts of data without any server, storage or networking of their own. By outsourcing management of those systems to cloud providers, they can focus employee efforts on analyzing data rather than keeping data centers running 24/7.
Among cloud-based data warehouses, Amazon Web Services (AWS) pioneered the movement and refocused public perception with Amazon Redshift. Since its launch, Amazon Redshift has gained its share of competitors, including Google Cloud Platform’s Google BigQuery.
There’s no one-size-fits-all data warehouse, so it’s crucial to choose one that fits your business needs and will scale alongside your company. Whichever products you consider, they all share similarities, which can make evaluation difficult.
We’re comparing Amazon Redshift and Google BigQuery because both systems:
Are marketed as fully managed, petabyte-scale systems
Leverage parallel processing
Leverage columnar storage
Are geared towards interactive reporting on large data sets
Support integrations and connections with various applications, including Business Intelligence tools
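To make the columnar storage point concrete, here is a toy sketch in plain Python. The order records and function names are hypothetical, and the model ignores compression, disk layout and parallel execution, so it should not be read as a description of how either product is implemented. It only shows why an aggregate over one field can read less data when values are stored column by column.

```python
# Hypothetical order records in a row-oriented layout:
# each record keeps all of its fields together.
rows = [
    {"order_id": 1, "customer": "acme", "amount": 120.0},
    {"order_id": 2, "customer": "globex", "amount": 80.0},
    {"order_id": 3, "customer": "acme", "amount": 50.0},
]


def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_row_store(rows):
    # Visits every whole record even though only one field is needed.
    return sum(row["amount"] for row in rows)


def total_column_store(columns):
    # Touches only the "amount" column; other columns are never read.
    return sum(columns["amount"])


columns = to_columns(rows)
print(total_row_store(rows), total_column_store(columns))
```

Both functions return the same total. The difference is how much data each one has to touch, which is what matters for analytical queries that scan a few fields across very many records.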
While the similarities are recognizable between Amazon Redshift and Google BigQuery, their differences may be harder to pinpoint.
Choosing a data warehouse that will help you analyze your data doesn’t have to be difficult. To help you make your data warehouse decision, particularly between Amazon Redshift and Google BigQuery, we’ve written a white paper titled “What to Consider When Choosing Between Amazon Redshift and Google BigQuery.”
Our White Paper:
If you have yet to invest in a data warehouse, our white paper is integral to your evaluation process. If you’re already invested in either Amazon Redshift or Google BigQuery, the advice within this white paper can help optimize your data warehouse to its full potential. Download What to Consider When Choosing Between Amazon Redshift and Google BigQuery for more insight into the most popular data warehouses and learn:
The differences between Amazon Redshift and Google BigQuery
The impact and implications of throughput and concurrency
Data warehouse operations (including provisioning, loading, security and maintenance)
The ongoing cost of operating a data warehouse
Use cases for each data warehouse from fast-growing, innovative companies