Cloud GCP Kafka FinOps

Kafka to Google Pub/Sub with massive savings

Can managed Pub/Sub be an inexpensive replacement for on-prem and hybrid-cloud Kafka?

October 20, 2023

Overview

I led a joint engineering team (client, consultant, Google PSO) to run a two-pronged experiment validating whether Google Cloud Managed Pub/Sub could be a cost-effective replacement for an on-prem Kafka installation on an OpenShift cluster.

The Challenge

A midsized bank ran a sizable on-prem OpenShift cluster hosting both internally developed applications and vendor products (e.g., IBM Cloud Pak: MQ, App Connect, API Connect). The bank operated a hybrid-cloud setup with workloads across Azure, AWS, GCP, and on-prem. It wanted to retire its aging data center but was concerned about continuity and the migration path for in-house applications that relied on an on-prem Confluent Kafka deployment as their messaging backbone. The bank was unsure whether to lift and shift to ROSA (Red Hat OpenShift Service on AWS) with Confluent Kafka or to pursue alternatives.

The Solution

I assembled a joint engineering team of client developers, consultants, and vendor engineers and defined the scope as follows:

  1. Cloud fundamentals
    1. Establish connectivity from on-prem to GCP
    2. Enable GCP Pub/Sub (API enablement and access/auth setup)
    3. Create PoC topics
  2. Rapid Dev & Ops onboarding kit
    1. Create onboarding documentation for developers and operators
    2. Create customized code/libraries, in the two languages most commonly used at the bank, for publishing messages to and consuming messages from GCP Pub/Sub
  3. Rapid PoC and benchmark
    1. Identify two client applications in different languages that use Kafka
    2. Work with client developers to validate that they can publish to and consume from Pub/Sub, and document the required changes
    3. Create a messaging performance-testing framework, benchmark latency, and compare results to on-prem Kafka
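
To make the benchmarking step concrete, here is a minimal sketch of what a latency-measuring harness could look like. It is not the framework the team built. The names `measure_latencies`, `percentile`, and `summarize` are hypothetical, and the transport is passed in as a plain callable so the same harness could wrap either a Kafka or a Pub/Sub round trip (the wrappers themselves are assumed and not shown).

```python
import math
import statistics
import time
from typing import Callable, Dict, List


def measure_latencies(send_and_wait: Callable[[bytes], None],
                      payload: bytes,
                      iterations: int) -> List[float]:
    """Time each call to send_and_wait and return latencies in milliseconds.

    send_and_wait is a hypothetical adapter: it should publish one message
    and block until that message has been received by a consumer.
    """
    latencies_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        send_and_wait(payload)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def percentile(sorted_values: List[float], pct: int) -> float:
    """Nearest-rank percentile of an already sorted, non-empty list."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(latencies_ms: List[float]) -> Dict[str, float]:
    """Reduce raw samples to the figures usually compared across brokers."""
    ordered = sorted(latencies_ms)
    return {
        "count": len(ordered),
        "mean_ms": statistics.fmean(ordered),
        "p50_ms": percentile(ordered, 50),
        "p95_ms": percentile(ordered, 95),
        "p99_ms": percentile(ordered, 99),
        "max_ms": ordered[-1],
    }


if __name__ == "__main__":
    # Stand-in transport that does nothing; replace with a real adapter.
    samples = measure_latencies(lambda payload: None, b"ping", 1000)
    print(summarize(samples))
```

Reporting tail percentiles (p95, p99) alongside the mean matters for a comparison like this one, because two messaging systems with similar average latency can differ noticeably in their slowest round trips.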

Results

  • Validated that GCP Pub/Sub performance (latency/throughput) is comparable to on-prem Kafka from the bank’s workload locations (cloud and on-prem)
  • Demonstrated an 80% reduction in operational costs relative to the existing Kafka infrastructure at comparable message volumes
  • Enabled client developers to start onboarding applications on Day 1 with the provided kit
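
For readers who want to run the same kind of comparison, the cost figure above is a simple relative reduction against the Kafka baseline. The sketch below shows only the arithmetic. The function name `percent_reduction` is hypothetical, and the inputs in the usage line are placeholder values chosen to illustrate the formula, not the bank's actual costs.

```python
def percent_reduction(baseline_cost: float, new_cost: float) -> float:
    """Return the cost reduction as a percentage of the baseline.

    Both costs must cover the same period and a comparable message volume,
    otherwise the comparison is not meaningful.
    """
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    return (baseline_cost - new_cost) * 100.0 / baseline_cost


if __name__ == "__main__":
    # Placeholder figures only: a baseline of 100 units and a new cost of 20.
    print(percent_reduction(100.0, 20.0))
```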