Meltastic October 2022 Update

What happened in the Meltano universe in early fall? A lot, it turns out.

Before we get started, two short reminders:

1. Our CEO Douwe and Head of Product Taylor are at Coalesce this week. Feel free to reach out to them for a quick chat!

2. The Tap-Toberfest is happening next week. Sign up now to participate!

Let’s dive into these meltastic events. 

Meltano Ecosystem in Action

We saw 4 releases in the past few weeks, bringing us some exciting new features including… the interactive config, which makes it much easier to configure Meltano from right inside the CLI. Watch Ken’s demo of this exciting new feature below:

List of all releases

  • v2.7.0, v2.7.2 State IDs can now be user-defined; documentation improvements, a simplified installation guide and a 4-part tutorial were added. Includes 1 community contributor (thanks to Reuben, again).
  • v2.6.0 The “interactive” config and rich exception formatting now improve the Meltano CLI experience. Includes 3 community contributors (thanks to Jared, Reuben and Jake)!

Community & Hub updates

  • Meltano Hub got a brand new design!
  • The Meltano Singer SDK released v0.11.1 with “BATCH” message support.  
  • The utility dbt-osmosis got added to the hub, supercharging your dbt developer experience.
  • The extractor tap-sftp powered by MeltanoLabs was added to the hub, including a full description on how to use it.
  • The utility Metabase was added to the hub, providing you with another BI tool integration.
  • The tap-xero description was added to the hub to make it usable.
  • Target-duckdb was added to the hub.

The Community in Action

This month the community provided everything including core Meltano features, plugins, blog articles and GitHub repositories implementing Meltano in action. Here are a few selected highlights:

Write Up: Finding Performance Issues in Meltano

This is a write-up of community content, shamelessly taken from AJ Steers’ & Derek Visch’s discussions on Slack & GitHub.

We’re going to focus on how to answer the question:

The total time to run the job is very long, why?

To answer this question, we need to find the bottleneck: 

  1. Is it the tap?
  2. Is it the target?
  3. Is it Meltano?

(0) Compare to a subset

Run `time meltano run tap-name target-name` in a shell to get the total runtime. Then modify your tap to select just a subset (10-25% of your records). Time the run again, compare the results, and save them as baselines.

Note: Make the subset small enough to be manageable, but large enough to still show the problem. You’ll save it to your disk and then send it out again.
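To keep the baselines in one place, the timings can be appended to a small log file. The following is only a sketch: the helper name `record_time`, the `TIMINGS_FILE` variable and the default file name `timings.txt` are made up for this example and are not part of Meltano.

```shell
# Hypothetical helper (not a Meltano feature): run a command, then append
# "<label> <whole seconds>" to a timings file so the runs can be compared later.
# The command's stdin and stdout are left untouched, so redirects still work.
record_time() (
  label=$1
  shift
  status=0
  start=$(date +%s)
  "$@" || status=$?          # keep going so the timing is logged even on failure
  end=$(date +%s)
  echo "$label $((end - start))" >> "${TIMINGS_FILE:-timings.txt}"
  exit "$status"             # hand the command's exit status back to the caller
)
```

With this in place, `record_time full meltano run tap-name target-name` logs the full run, and the same call with a different label (for example `subset`) logs the subset run once the tap has been restricted.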

(1) Starting at the beginning, the tap

Run the tap alone by using `time { meltano invoke tap-name > tap.singer; }`. This writes the tap's output to `tap.singer` and tells you how long the tap takes to run.

(2) Run the target alone 

Let’s test how long the target takes to run with data already pulled from the tap by running `time { cat tap.singer | meltano invoke target-name; }`.

(3) Let’s identify the bottleneck

Take your baseline time: it will be dominated by either the tap-only run or the target-only run. This should identify the bottleneck right away!
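The comparison itself can be written down as a tiny helper. This is a minimal sketch under assumptions: the function name `bottleneck` is invented here, the three inputs are whole seconds you measured yourself, and the "unclear" result for a tie is an addition of this example rather than part of the original discussion.

```shell
# Hypothetical helper: bottleneck BASELINE_SECONDS TAP_SECONDS TARGET_SECONDS
# Prints whichever stage took longer when run alone, next to the baseline.
bottleneck() {
  baseline=$1
  tap=$2
  target=$3
  if [ "$tap" -gt "$target" ]; then
    echo "tap (${tap}s of ${baseline}s baseline)"
  elif [ "$target" -gt "$tap" ]; then
    echo "target (${target}s of ${baseline}s baseline)"
  else
    echo "unclear (tap and target both took ${tap}s)"
  fi
}
```

For example, `bottleneck 120 100 15` points at the tap. If neither stage comes close to the baseline, that is a hint to look at Meltano itself or at the environment.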

(4) Drill deeper, possible limiting factors

Great, now you know where the bottleneck is. You can then use the following list, and variations along these dimensions, to identify it exactly. Possible limiting factors are (non-exhaustive list!):

  • CLI startup (should be visible within the test runs)
  • tap throughput (identifiable in (1))
  • tap rate limits (identifiable in (1) – possibly increase subset size)
  • target throughput (identifiable in (2))
  • target batch frequency
  • network latency
  • network bandwidth
  • RAM limitations and buffer overruns (if things work fine locally, this might be the issue; in that case, you can try to dockerize things to match the production environment)
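For the throughput items, a rough records-per-second figure can be derived from the `tap.singer` file saved in (1). The sketch below assumes the file holds one JSON Singer message per line and that record messages contain `"type": "RECORD"`; the function names are made up for this example, and the `grep` match is approximate (a record whose payload contains the same text would be counted too).

```shell
# Hypothetical helpers for a rough throughput estimate from a saved tap output.

# Count lines that look like Singer RECORD messages (with or without a space
# after the colon). "|| true" keeps the exit status at 0 when nothing matches.
record_count() {
  grep -c '"type": *"RECORD"' "$1" || true
}

# records_per_second FILE SECONDS: integer records per second.
# A duration of 0 seconds is treated as 1 to avoid dividing by zero.
records_per_second() (
  count=$(record_count "$1")
  seconds=$2
  if [ "$seconds" -le 0 ]; then
    seconds=1
  fi
  echo $((count / seconds))
)
```

Calling `records_per_second tap.singer 90` after a 90-second tap run gives a number to compare between the subset and the full run; a sharp drop on the larger run may point to rate limits rather than raw tap throughput.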

Join the Conversation

The Meltano community is active on a variety of GitHub repositories like the core meltano/meltano repository as well as on Slack. Here are a few conversations that might interest you!

Additional Updates

A lot has happened on the Meltano blog, here are a few highlights.

Intrigued?

You haven’t seen nothing yet!