
Introducing AlloyDB for PostgreSQL: GCP’s Latest Solution for Operational Analytics

By
Beatrice Bu
Bitstrapped
Updated
July 25, 2022

AlloyDB was announced for public preview shortly before we began collaborating with a recent client. Our original intention was to migrate the client’s Snowflake database to BigQuery, a smart choice because of BigQuery’s speed, cost efficiency and ease of use (there is no need to manage infrastructure or indexes).

With AlloyDB, we saw an opportunity not only to provide a solution better suited to the client’s analytical needs, but also to champion the cutting edge and lead in operational analytics innovation. But what exactly is AlloyDB? And how does it compare to GCP’s best-of-breed analytical powerhouse, BigQuery?

First, A Few Thoughts On Database Migration

Database migrations have specific success criteria and key performance indicators. These play a large part in the business logic behind choosing an appropriate migration destination:

  • Minimizing the amount of “friction” between systems. It’s often important to keep in mind how much refactoring of code, queries, data structures and indexing is required to produce similar or better results on your new database.
  • Data governance and health. Ensuring the reliability, accessibility and completeness of data from source to sink is critical. This extends to ensuring access control is properly emulated on the sink database.
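The completeness check in the second point can be sketched as a comparison of row counts and content hashes between source and sink. This is a minimal illustration, not a description of any migration tool: the function names, the `orders` table and the in-memory SQLite databases are all assumptions chosen so the example runs on its own. Against real systems you would pass your own database connections and would likely hash in batches.

```python
import hashlib
import sqlite3


def table_fingerprint(conn, table, key_column):
    """Return (row_count, sha256 hex digest) for a table, reading rows in key order."""
    # Identifiers cannot be bound as query parameters, so only pass trusted names.
    cursor = conn.execute(f"SELECT * FROM {table} ORDER BY {key_column}")
    digest = hashlib.sha256()
    row_count = 0
    for row in cursor:
        digest.update(repr(row).encode("utf-8"))
        row_count += 1
    return row_count, digest.hexdigest()


def tables_match(source_conn, sink_conn, table, key_column):
    """True when both databases hold the same rows for the given table."""
    return table_fingerprint(source_conn, table, key_column) == table_fingerprint(
        sink_conn, table, key_column
    )


def _demo_db(rows):
    """Build a throwaway in-memory database standing in for a source or sink."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


source = _demo_db([(1, 9.5), (2, 20.0), (3, 4.25)])
complete_sink = _demo_db([(1, 9.5), (2, 20.0), (3, 4.25)])
partial_sink = _demo_db([(1, 9.5), (2, 20.0)])

print(tables_match(source, complete_sink, "orders", "id"))  # True
print(tables_match(source, partial_sink, "orders", "id"))   # False
```

Hashing `repr(row)` is only safe when both sides return the same Python types for each column; in a heterogeneous migration you would normalize values (for example, decimals and timestamps) before hashing.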

What is AlloyDB for PostgreSQL?

AlloyDB is, in a nutshell, a PostgreSQL database with native serverless GCP support. From the front end, you might never know you’re using anything other than a standard PostgreSQL database, much as with Google Cloud SQL. On the back end, however, Google has engineered an entirely disaggregated, distributed cluster architecture to maximize analytical query performance in line with transactional and application performance. Google separates your database into two abstractions: the Primary Instance, which receives writes, updates and transactions, and Read Pools, which serve reads and analytics at distributed speeds while also providing redundancy and fault tolerance.
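To make the Primary Instance and Read Pool split concrete, here is a rough sketch of how an application might decide which endpoint a statement should go to. Everything in it is an assumption for illustration: the connection strings are placeholders, `choose_dsn` is a hypothetical helper, and the regular-expression check is a deliberately naive heuristic rather than anything AlloyDB provides.

```python
import re

# Hypothetical connection strings; replace with your own instance addresses.
PRIMARY_DSN = "postgresql://app@primary.example.internal:5432/appdb"
READ_POOL_DSN = "postgresql://app@read-pool.example.internal:5432/appdb"

# Naive heuristic: plain SELECTs that take no row locks are treated as read-only.
_READ_ONLY = re.compile(r"^\s*select\b", re.IGNORECASE)
_LOCKING = re.compile(
    r"\bfor\s+(no\s+key\s+update|update|key\s+share|share)\b", re.IGNORECASE
)


def choose_dsn(sql, in_transaction=False):
    """Pick the endpoint a statement should be sent to.

    Anything inside an explicit transaction, and anything that is not a
    plain SELECT, goes to the primary instance.
    """
    if in_transaction:
        return PRIMARY_DSN
    if _READ_ONLY.match(sql) and not _LOCKING.search(sql):
        return READ_POOL_DSN
    return PRIMARY_DSN


print(choose_dsn("SELECT count(*) FROM orders") == READ_POOL_DSN)    # True
print(choose_dsn("UPDATE orders SET total = 0") == PRIMARY_DSN)      # True
print(choose_dsn("SELECT * FROM orders FOR UPDATE") == PRIMARY_DSN)  # True
```

A text-based check like this misclassifies some statements (a SELECT that calls a function with side effects, for instance), so in practice the routing decision is usually made explicitly by the calling code, which knows whether it intends to read or write.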

Performance and Disaggregation

AlloyDB promises up to 100x faster analytical queries (reads) and 4x faster transactional workloads (writes) compared with standard PostgreSQL. (In their demo of AlloyDB, Google actually demonstrates a 160x increase.) By combining an ML-driven columnar cache with a traditional buffer cache, along with a distributed network of read-only replica nodes, you get speeds in the class of in-memory data. With AlloyDB, the storage layer is entirely separate from compute. Not only does this allow storage and computation to be billed separately, it also allows your system to utilize GCP’s Autopilot to scale horizontally and vertically, with storage and compute scaling independently of each other. Read performance scales linearly, and thus very predictably, as nodes are added to a Read Pool; a Read Pool with 8 nodes will in most cases be twice as fast as a Read Pool with 4 nodes.
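Taking the linear-scaling claim at face value, capacity planning for a Read Pool reduces to simple proportional arithmetic. The sketch below assumes perfectly linear scaling, which real workloads will only approximate; the function names and the throughput figures are made up for illustration and are not benchmark results.

```python
import math


def projected_read_throughput(measured_qps, measured_nodes, target_nodes):
    """Project read throughput for a different node count, assuming linear scaling."""
    if measured_nodes <= 0 or target_nodes <= 0:
        raise ValueError("node counts must be positive")
    return measured_qps * target_nodes / measured_nodes


def nodes_needed(measured_qps, measured_nodes, target_qps):
    """Smallest node count that reaches target_qps under the same assumption."""
    if measured_nodes <= 0 or measured_qps <= 0:
        raise ValueError("measured values must be positive")
    per_node_qps = measured_qps / measured_nodes
    return math.ceil(target_qps / per_node_qps)


# Illustrative numbers only: a 4-node pool measured at 12,000 queries per second.
print(projected_read_throughput(12_000, 4, 8))  # 24000.0
print(nodes_needed(12_000, 4, 30_000))          # 10
```

Any measured deviation from these projections is itself useful: it tells you how far a given workload is from the linear ideal.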

No schema or configuration changes to your workload are needed, and GCP even suggests that you may be able to drop your analytical indexes altogether to improve transactional performance.

Enter BigQuery

BigQuery has been Google’s go-to for data warehousing. With minimal setup and no need to manage storage capacity, compute, indexes or infrastructure, BigQuery is the analytics and data warehouse choice for many. In minimal time, you can have a production-grade data warehouse running with serverless, autoscaling queries. With AlloyDB, GCP now provides two highly scalable, native serverless, relational, analytical systems. For many, it won’t be obvious from the outside just how different these two products are, or which one is appropriate for which use case.

BigQuery is a great system for new databases and datasets. Because setup is minimal and there is no infrastructure to manage, it is a great option for quickly deploying and managing petabyte-scale datasets and getting query results fast.

AlloyDB will prove to be the choice of teams migrating legacy systems to the cloud. Alongside native compatibility with existing PostgreSQL servers, AlloyDB comes with full integration for the Database Migration Service, which provides heterogeneous migration from other database systems. In addition, GCP has announced the Database Migration Program for “tooling, resources, and expertise from our network of specialized technology partners”. While head-to-head performance against BigQuery is yet to be seen, AlloyDB will prove more useful and effective to any team with prior infrastructure experience and protocols.

Conclusion

As Google Cloud Platform continues to release new, powerful and sometimes niche products, the smart cloud practitioner will need to know which use cases are appropriate for which systems. In future posts, we will compare the friction of migrating a database to BigQuery and to AlloyDB. Afterwards, we will follow up with benchmarks for both services, including a test of the linear scalability of AlloyDB’s Read Pools.
