Off the Charts: Craig Kerstiens, Postgres Community Contributor and Heroku Ecosystem Product Manager

Posted by aj on July 31, 2015 Off The Charts

Craig Kerstiens runs the Ecosystem group at Heroku, which comprises their add-ons marketplace, core languages and API. He’s a well-known content contributor to the Postgres community. Craig is originally from Alabama, so just ask him about BBQ, but he has been a Bay Area resident for nine years. Chartio’s AJ Welch interviewed Craig for Off the Charts.

You were part of the Heroku Postgres team for a while and are a well-known contributor to the Postgres community. When did you first learn about Postgres and what got you interested in it?

I first came to Postgres when I was at a startup that extended Postgres to be a streaming database. We essentially did MapReduce on data as it came in. If you didn’t have data throughput of 10-100 GB a day you weren’t very interesting to us. This was around seven years ago, and I recall some of the big improvements that started to make Postgres a bit more powerful at that time, such as CTEs. Since that time Postgres has generally been my go-to database. Before that, my database experience had been with Oracle and SQL Server, so I’ve always worked with pretty “real” databases.

Initially, I didn’t come on board at Heroku to do anything with Postgres. I was focused on product management over areas of the core platform – in particular launching Python and then some other areas as well. I got pulled over to the Postgres team after the GM of the team stated that I was doing better marketing than the data team themselves. In reality I was just focused on evangelizing internally, found I was repeating myself, and so I blogged it. These days, though, I run product for several other teams at Heroku and am not actively involved with our data team.

Heroku provided Postgres as a managed service well before other platforms. Was there an initial insight that drove the launch of the service?

The initial decision came because we believed Postgres was safe for your data. When building a database-as-a-service, the number one requirement is to be safe and reliable. At the time, however, Postgres still wasn’t the most user-friendly database, but we decided it was easier to make Postgres user-friendly than take something less mature and make it more safe.

For the same reason, we funded some core development around the JSON and JSONB data types, and created to make it easier to get Postgres up and running on a Mac.

Your team has grown the service to one of the largest Postgres fleets in the world. What have some of the challenges been along the way and what can you share about operating at that scale?

There are a lot of pieces to it.

Don’t over-engineer for the future, engineer for where you are and 10x growth. The problems you’ll have when you hit that 10x growth aren’t ever what you imagine, so there’s no point in trying to optimize for 100x growth.

Something I’ve always valued as a product person is feeling the customer’s pain directly. It’s very much a shared responsibility across the team. It’s one of the best ways to help keep a clear picture of where your product is failing and how you can improve.

What sort of metrics are you collecting about Heroku Postgres and how does this shape the service?

Our data team does some pretty extensive monitoring across our fleet. We’ve had to, in order to operate at our scale. It starts with a single system that reaches out to all of our databases and checks for a variety of things, such as whether they can be written to and read from. For follower databases, how bad is replication lag? If it gets beyond a certain amount, we page. At the core is a finite state machine that maintains the individual state of all databases. If anything starts to seem off on the first check, we flag it as questionable and run it again. If that fails, we begin a series of processes to fix the issue or page the on-call engineer.
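
To make the flow Craig describes a little more concrete, here is a minimal Python sketch of a check-and-escalate state machine. It is not Heroku's implementation. The state names, the `CheckResult` fields, the `MAX_LAG_S` threshold and the `probe` and `escalate` callbacks are all hypothetical, chosen only to illustrate the idea of flagging a database as questionable on a first failure and escalating if the re-run also fails.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class State(Enum):
    HEALTHY = "healthy"
    QUESTIONABLE = "questionable"  # first check failed, re-run pending
    ESCALATED = "escalated"        # re-run failed: repair or page


@dataclass
class CheckResult:
    readable: bool
    writable: bool
    # Replication lag in seconds; None means the database is not a follower.
    replication_lag_s: Optional[float] = None


MAX_LAG_S = 300.0  # hypothetical threshold, not a Heroku value


def passes(result: CheckResult, max_lag_s: float = MAX_LAG_S) -> bool:
    """Return True if a single check result looks healthy."""
    if not result.readable:
        return False
    if result.replication_lag_s is None:
        # Primary: it must also accept writes.
        return result.writable
    # Follower: read-only, so only the lag matters here.
    return result.replication_lag_s <= max_lag_s


def next_state(current: State, ok: bool) -> State:
    """Transition function of the state machine."""
    if ok:
        return State.HEALTHY
    if current is State.HEALTHY:
        return State.QUESTIONABLE
    return State.ESCALATED


def monitor(probe: Callable[[], CheckResult],
            escalate: Callable[[str], None],
            name: str) -> State:
    """Run the first check plus one re-run, escalating if both fail."""
    state = State.HEALTHY
    for _ in range(2):
        state = next_state(state, passes(probe()))
        if state is State.HEALTHY:
            return state
    # Stands in for "fix the issue or page the on-call engineer".
    escalate(name)
    return state
```

In this sketch, `probe` would wrap whatever actually connects to the database, and `escalate` would kick off a repair process or a page. Keeping the transition logic in a pure function like `next_state` makes it easy to test separately from any network calls.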

Your analysis about feature and extension usage across the Heroku Postgres fleet was particularly interesting. I even saw it referenced on the pgsql-hackers list to make a more informed decision about supporting the CUBE keyword in 9.5. Are there any updates as new features have been released?

I actually had no idea that was referenced in the decision, so thanks for surfacing that! I’m a big fan of using data to make decisions for products and seeing it done in communities is equally exciting.

I covered this in the talk, but some of the recent feature adoption around hstore and then JSON has been amazing. I’ve been excited about them and can understand why others are, but to see the actual adoption is huge.

Being safe for your data has always been a priority of Postgres. As a database, this is the number one thing I care about. Being cool is not what I want with my data. As an open source database, I consider it the best in its class easily. This isn’t to discount some of the commercial databases. Many of them are of a great quality, but Postgres is quickly catching up. I once heard from a former Oracle sales engineer years ago that they felt their number one threat was this open source PostgreSQL database thing. It’s got the baseline of being safe for your data, and now is quickly catching up on all of the usability and flexibility features.

Given the success you’ve had with your blog, newsletter and talks, can you share any tips for producing and distributing good technical content?

I gave a talk titled “Marketing for Developers” a few months ago. Sadly it wasn’t recorded.

The number one piece of advice I can give is to create content early. It’s incredibly hard to maintain a beginner’s mindset. Once you get past it, it’s almost impossible to get it back. As a beginner you feel that you don’t have something worth saying, but many of my resonant posts are those where I record the first time I do something.

You’ve also written a lot about marketing to a technical audience. What are some tactics you’ve seen that get it right?

I could go on at length about how companies do it wrong, but I’d prefer to talk about what companies do right. The first thing is giving your engineers a voice that’s at least somewhat detached from marketing. An engineering blog or support for personal blogs can be great.

The other piece is helping push out that content once you create the channels. Often there are a lot of internal emails among engineers that would make great blog posts. Moving to a pull instead of push model to flag those and help get them published is a great way for the company to help support the process.

Finally, most engineers want to write content or speak at conferences, but many just don’t know where to start. An internal program to support and foster that can go a long way.

Looking ahead, what excites you most about the future of Postgres and Heroku Postgres?

There are two big things for Postgres: extensions and foreign data wrappers. Extensions really turn Postgres into a data platform more than just an RDBMS. At that point a massive number of people can begin contributing and expanding its capabilities.

Foreign data wrappers are going to be powerful, allowing you to access lots of disparate systems.

As for Heroku Postgres, we’re excited to continue focusing on whatever you need for your app in terms of data. Heroku Redis is a recent addition, and you can already see how it starts to play well with Postgres via Data Links. For Postgres specifically, it’s really about giving app developers all the insights and tools they need to be able to focus on their app. Our expensive queries report is one example of this and we’ll continue investing in all of these tools so you can worry less about the ops and focus more on your app.