Serverless offerings are becoming increasingly popular for storing and analyzing data. Going serverless lowers deployment, operational, and scaling costs and makes resources easier to manage. Amazon and Google each offer a serverless query service: Amazon Athena and Google BigQuery. Both are good options for data analysis. In this article, we’ll go over the differences between BigQuery and Athena.
What is Google BigQuery?
Google BigQuery is a well-known cloud-based Data Warehouse noted for its high-level analytic services that can efficiently analyze enormous datasets. This serverless platform enables high-speed SQL query processing and can handle billions of records at once.
Google BigQuery also simplifies resource allocation. It stores data in a columnar format, which makes querying and aggregation efficient. The platform also provides data security features that let you verify users’ identities and control their access.
Advantages of using Google BigQuery
BigQuery Machine Learning
BigQuery ML allows you to build, run, and test machine learning models inside the database using standard SQL queries. Because the model and the whole build, run, and test process move to the data rather than the other way round, there is less complexity, there are fewer steps to get started, and performance improves.
In addition, BigQuery ML’s use of SQL queries lets customers already familiar with the SQL language leverage their current skills and procedures to get started with the numerous benefits of machine learning.
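To make "machine learning with SQL" concrete, here is a minimal sketch that assembles the two statements a BigQuery ML workflow typically needs: a `CREATE MODEL` statement for training and an `ML.PREDICT` query for scoring. The dataset, table, and column names are hypothetical, and the Python helper only builds SQL text. Submitting the statements would need a BigQuery client and credentials, which are left out here.

```python
from typing import Sequence, Tuple


def build_bqml_statements(
    model: str, table: str, label: str, features: Sequence[str]
) -> Tuple[str, str]:
    """Return (training_sql, prediction_sql) for a logistic regression model.

    Every identifier passed in is a hypothetical placeholder.
    """
    cols = ", ".join(features)
    # Training: the model is created directly from a SELECT over the table.
    train_sql = (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = 'logistic_reg', input_label_cols = ['{label}']) AS\n"
        f"SELECT {cols}, {label}\n"
        f"FROM `{table}`"
    )
    # Scoring: ML.PREDICT takes the model and a query that yields the feature columns.
    predict_sql = (
        f"SELECT *\n"
        f"FROM ML.PREDICT(MODEL `{model}`, (SELECT {cols} FROM `{table}`))"
    )
    return train_sql, predict_sql


train_sql, predict_sql = build_bqml_statements(
    model="mydataset.churn_model",      # hypothetical model name
    table="mydataset.customers",        # hypothetical source table
    label="churned",                    # hypothetical label column
    features=["tenure_months", "monthly_spend"],
)
print(train_sql)
print(predict_sql)
```

Both steps stay in SQL, so an analyst who can write a `SELECT` can train and score a model without exporting the data.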
Security
BigQuery provides strong security for a cloud data warehouse service and is backed by a 99.99 percent uptime service level agreement. Data integrity is critical for ensuring the correctness and dependability of the data you keep and query and of the insights you obtain from it.
In addition, BigQuery features free automated data replication for availability and disaster recovery in the event of an unexpected outage or accidental intrusion. Without needing additional tools to achieve this degree of data protection, your team is free to focus on creating insights rather than fussing with different security solutions.
Ease of use
One of BigQuery’s most notable advantages is its ease of use. Building a data warehouse is a costly, time-consuming, and complex process, and you do not want to invest that effort and money in a warehouse that does not work as expected.
BigQuery is one of the best alternatives to building your own data warehouse. You store and process your data in a single place and are billed only for what you use.
What is AWS Athena?
Amazon Athena is an interactive query service that lets you query data stored in S3 using SQL. Because Athena is serverless, no infrastructure management is required, and you pay only for the queries you execute.
You just need to point to your S3 data, define the schema, and begin querying with regular SQL; most results arrive in seconds. In addition, there is no need for sophisticated ETL procedures to prepare your data for analysis with Athena. This enables anyone with SQL knowledge to analyze large-scale datasets quickly.
Athena assists you in analyzing data stored in Amazon S3 that is unstructured, semi-structured, or structured. CSV, JSON, and columnar data formats like Apache ORC and Parquet are some of the data formats supported by Athena.
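The "point to your S3 data and define the schema" step comes down to a single DDL statement. The sketch below builds one for a Parquet dataset. The database, table, column, and bucket names are hypothetical, and the helper only produces SQL text that you could paste into the Athena query editor.

```python
from typing import Mapping


def build_external_table_ddl(
    database: str, table: str, columns: Mapping[str, str], s3_location: str
) -> str:
    """Build a CREATE EXTERNAL TABLE statement for Parquet files under an S3 prefix."""
    col_defs = ",\n  ".join(f"`{name}` {dtype}" for name, dtype in columns.items())
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} (\n"
        f"  {col_defs}\n"
        f")\n"
        f"STORED AS PARQUET\n"
        f"LOCATION '{s3_location}'"
    )


ddl = build_external_table_ddl(
    database="weblogs",                           # hypothetical database
    table="requests",                             # hypothetical table
    columns={"request_time": "timestamp", "status": "int", "url": "string"},
    s3_location="s3://example-bucket/requests/",  # hypothetical bucket and prefix
)
print(ddl)

# Once the table exists, it is queried like any other SQL table.
query = "SELECT status, COUNT(*) AS hits FROM weblogs.requests GROUP BY status"
print(query)
```

No data is copied or loaded by this statement. The table definition is metadata that tells Athena how to read the files already sitting in S3.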
For quick data visualization, Athena connects with Amazon QuickSight, so you can use it to produce reports or explore data. In addition, Athena works with the AWS Glue Data Catalog to provide a persistent metadata store for your data in Amazon S3. This enables you to create tables and query data in Athena based on a central metadata store that is available throughout your account and integrated with AWS Glue’s ETL and data discovery features.
Advantages of using Athena
Pricing
With Amazon Athena, you pay only for the queries you run. You are charged $5 for each terabyte scanned. However, you can save up to 90% on per-query costs by compressing your data and converting it to columnar formats, because each query then scans far less data. Athena queries data directly in Amazon S3, so there is no separate data store to pay for; for storage, standard S3 charges apply.
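The arithmetic behind that saving is simple enough to sketch. The example below uses the $5 per terabyte price quoted above. The 2 TB dataset and the assumption that a compressed columnar copy leaves 10% of the bytes to scan (the "up to 90%" best case) are illustrative figures, not measurements.

```python
PRICE_PER_TB_USD = 5.0  # per-terabyte scan price quoted above


def athena_query_cost(tb_scanned: float) -> float:
    """Estimated cost in USD of one query that scans `tb_scanned` terabytes."""
    return tb_scanned * PRICE_PER_TB_USD


raw_tb = 2.0                       # hypothetical: query scans 2 TB of raw CSV
columnar_fraction = 0.10           # hypothetical: columnar + compression leaves 10%

raw_cost = athena_query_cost(raw_tb)
columnar_cost = athena_query_cost(raw_tb * columnar_fraction)

print(f"raw CSV scan:        ${raw_cost:.2f}")
print(f"compressed columnar: ${columnar_cost:.2f}")
```

Because the bill follows bytes scanned rather than query runtime, anything that shrinks the scan lowers the cost in direct proportion.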
Easy to set up
Athena is a serverless system. You can query your data quickly without setting up or managing any servers or infrastructure. Simply point to your data on Amazon S3, specify the schema, and begin querying with the integrated query editor.
Amazon Athena lets you access all of your S3 data without having to set up complex extract, transform, and load (ETL) processes.
High speed
Most Athena queries return results in seconds, so you don’t need to worry about having enough computational resources to run quick interactive queries.
Powerful performance
Amazon Athena works with various standard data formats, including CSV, JSON, ORC, Avro, and Parquet, and is powered by Presto with ANSI SQL support. Athena excels at interactive querying and can perform complex analyses such as large joins and operations on arrays.
It is highly available: queries are executed using compute resources distributed across multiple facilities and multiple devices inside each facility. Athena’s underlying data store is Amazon S3, which makes your data highly available and durable.
Google BigQuery vs AWS Athena: Differences between Google BigQuery and AWS Athena
Now that you have a broad grasp of BigQuery and Athena, let us look at what makes them different. We will compare them on parameters such as architecture, scalability, security, supported data, performance, and pricing.
Architecture
BigQuery was among the first data warehouses to separate storage and compute. It is multi-tenant and uses shared resources, which are assigned as “slots,” virtual CPUs responsible for SQL execution. BigQuery decides how many slots a query needs, and users cannot change that.
BigQuery runs queries and creates tables using Google’s Dremel query engine, designed for executing queries on enormous datasets stored natively in BigQuery. It takes advantage of the Jupiter petabit network to speed up remote storage access; however, relying on shared memory over the network for each stage of query execution in the DAG might harm performance.
Athena is serverless and decouples storage and computation. It directly queries data in S3, eliminating data ingest/copy requirements. It is multi-tenant and uses shared resources. Users have no say over the computing resources.
Instead, Athena allocates resources to each query from the shared resource pool. For query execution, Athena uses Presto, which is designed for running queries on smaller single data sources.
Scalability
BigQuery scales to large data volumes and automatically assigns extra computational resources, as “slots,” when required. It operates either in an “on-demand pricing model,” in which slot assignment depends entirely on BigQuery and the state of the shared resource pool, or in a “reserved model,” in which slots are booked in advance.
Reserved slots provide more control over computational resources, making scalability more predictable.
Because AWS Athena is a shared multi-tenant offering, there are no assurances about the amount or availability of resources assigned to your queries. In addition, query concurrency is capped at 20.
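A client that submits many queries has to respect that concurrency cap itself. One common approach, sketched below, is a fixed-size thread pool: at most 20 queries are in flight, and the rest wait their turn. The `fake_run_query` function is a hypothetical stand-in for a real client call, used here only so the peak concurrency can be observed.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List

MAX_CONCURRENT_QUERIES = 20  # concurrency cap cited above


def run_queries_bounded(
    queries: Iterable[str],
    run_query: Callable[[str], str],
    limit: int = MAX_CONCURRENT_QUERIES,
) -> List[str]:
    """Run every query with at most `limit` in flight; results keep input order."""
    with ThreadPoolExecutor(max_workers=limit) as pool:
        return list(pool.map(run_query, queries))


# Hypothetical stand-in for a real query call; it records peak concurrency.
_lock = threading.Lock()
_active = 0
peak_active = 0


def fake_run_query(sql: str) -> str:
    global _active, peak_active
    with _lock:
        _active += 1
        peak_active = max(peak_active, _active)
    time.sleep(0.01)  # simulate time spent waiting for the query to finish
    with _lock:
        _active -= 1
    return f"done: {sql}"


results = run_queries_bounded([f"SELECT {i}" for i in range(50)], fake_run_query)
print(len(results), "queries finished; peak concurrency:", peak_active)
```

Throttling on the client side avoids having submissions rejected once the shared service reaches its limit.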
Security
AWS IAM and GCP IAM provide role-based access control for Athena and BigQuery, respectively. Although the IAM services are not built primarily for data warehouses, they give a comprehensive set of controls for those services. BigQuery additionally encrypts stored customer data by default and supports customer-managed encryption keys. Because Athena reads its data from S3, you can query encrypted data and write encrypted results back to your S3 bucket.
Data type and sources
BigQuery supports various data formats, including CSV, JSON, ORC, Parquet, and Avro. Data that is already stored in BigQuery can be queried directly, and external sources can also be queried without loading, but a reference table must be created first.
Athena supports data formats like CSV, Avro, Apache log files, Logstash log files, JSON, ORC, CloudTrail log files, and Parquet. In addition, complex data types such as maps and arrays are also supported. Because Athena can only query data stored in S3, this is the only data source available to it.
Performance
The most challenging hurdle for most data warehouses today is performance.
Although separating compute and storage has improved scalability and manageability, it has introduced two bottlenecks in most data warehouses, one in storage access and one in compute. On the storage side, most current cloud data warehouses retrieve whole partitions over the network rather than only the data required for each query. On the compute side, many vendors invest in caching, but only a few invest heavily in query optimization.
Apart from applying best practices, there is little you can do to speed up BigQuery, because it determines how many resources (slots) each query receives. BigQuery can be paired with the “BigQuery BI Engine” for low-latency analytics. However, because BI Engine operates in memory, it is restricted in size, with a maximum capacity of 100GB.
Athena is intended to query data where it resides, which makes it very convenient for quick ad hoc querying but comes at a cost in performance. Even so, Athena performs relatively well in performance tests, particularly when the external storage is well managed. While it supports partitions, it does not offer indexing, and because resources are pooled in a shared multi-tenant service, consistent performance and low latency are not Athena’s priorities.
Pricing
BigQuery eliminates the need to provision separate instances or virtual machines by allocating computing resources as needed. It has two compute pricing models. With on-demand pricing, the first 1 TB of data processed each month is free, and after that you are billed $5/TB for the bytes processed by each query. With reserved pricing, you get dedicated slots to run queries, at $4 per hour or $1700 per month for 100 slots. Storage is billed separately, at $20/TB per month for active and $10/TB per month for inactive storage.
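A quick way to choose between the two compute models is to find the monthly volume at which they cost the same. The sketch below uses only the prices quoted above; the function names are illustrative, and the comparison deliberately ignores whether 100 reserved slots would actually deliver the throughput your workload needs.

```python
FREE_TB_PER_MONTH = 1.0                    # free on-demand allowance quoted above
ON_DEMAND_USD_PER_TB = 5.0                 # on-demand price quoted above
RESERVED_USD_PER_100_SLOTS_MONTH = 1700.0  # monthly reserved price quoted above


def on_demand_monthly_cost(tb_processed: float) -> float:
    """On-demand bill in USD for one month, after the free allowance."""
    billable_tb = max(0.0, tb_processed - FREE_TB_PER_MONTH)
    return billable_tb * ON_DEMAND_USD_PER_TB


def break_even_tb(slot_blocks: int = 1) -> float:
    """Monthly TB processed at which on-demand cost equals the reserved fee."""
    reserved_fee = slot_blocks * RESERVED_USD_PER_100_SLOTS_MONTH
    return FREE_TB_PER_MONTH + reserved_fee / ON_DEMAND_USD_PER_TB


for tb in (0.5, 51.0, 341.0, 500.0):
    print(f"{tb:>6.1f} TB/month on demand: ${on_demand_monthly_cost(tb):,.2f}")

print(f"break-even against 100 reserved slots: {break_even_tb():.0f} TB/month")
```

Under these list prices, workloads that process well below the break-even volume each month are cheaper on demand, while steady heavy workloads favor reserved slots.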
In Athena, you pay only for the queries you execute. Because you are billed based on the quantity of data scanned by each query, costs are easy to control, which makes Athena well suited to one-off analytics. Athena charges $5/TB of data scanned; however, you can significantly lower your costs by compressing and partitioning your data and converting it to columnar formats. Storage is billed at the rates of the S3 storage that holds the data.
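Partitioning lowers cost because a query that filters on the partition column can skip every other partition. The sketch below models that with a hypothetical S3 listing laid out in Hive-style `dt=` folders. The keys and sizes are made up, and the real saving depends on how the data and queries are organized.

```python
from typing import Mapping, Optional

# Hypothetical object listing: S3 key -> size in bytes, one folder per day.
objects = {
    "logs/dt=2024-01-01/part-0.parquet": 400_000_000,
    "logs/dt=2024-01-02/part-0.parquet": 350_000_000,
    "logs/dt=2024-01-03/part-0.parquet": 250_000_000,
}


def bytes_scanned(listing: Mapping[str, int], dt: Optional[str] = None) -> int:
    """Sum object sizes, keeping only the matching dt= partition when one is given."""
    if dt is None:
        return sum(listing.values())  # no partition filter: everything is read
    marker = f"/dt={dt}/"
    return sum(size for key, size in listing.items() if marker in key)


full = bytes_scanned(objects)
pruned = bytes_scanned(objects, dt="2024-01-02")
print(f"full scan:   {full:,} bytes")
print(f"pruned scan: {pruned:,} bytes ({pruned / full:.0%} of the data)")
```

Since the bill is proportional to bytes scanned, a query restricted to one day in this layout would pay for that day’s files only.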
Sum up
Here we described the fundamental distinctions between two popular serverless cloud analytics platforms, AWS Athena and Google BigQuery, along with each platform’s strengths. BigQuery enables you to execute SQL queries on petabytes of data at high speed, while Athena lets you run SQL queries directly on data stored in S3. Amazon Athena is simple yet capable. Because no upfront setup is necessary, ad hoc querying is easy. It is suitable for basic and aggregate queries and can be less expensive than BigQuery.

Amit Doshi is a Cloud Engineer with more than 5 years of experience in AWS, Azure, and Google Cloud. He is an IT professional responsible for designing, implementing, managing, and maintaining cloud computing infrastructure, applications, and services.