Now Available: Meltano v1.70.0

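Today, we are excited to release Meltano version 1.70.0, which adds hotgluexyz variants of tap-chargebee and tap-intacct, disallows two pipelines with the same job ID from running at the same time by default, and fixes a bug with finding a schedule based on namespace for a custom plugin.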

Excited to try it out?

To upgrade Meltano and your Meltano project to the latest version, navigate to your project directory, activate the appropriate virtual environment, and run meltano upgrade. This will upgrade the meltano package and apply any necessary changes to your project.

What else is new?

The list below (copied from the changelog) covers all of the changes made to Meltano since the release of v1.69.0 on February 16:

New

  • #2590 Add hotgluexyz variant of tap-chargebee
  • #2593 Add hotgluexyz variant of tap-intacct

Changes

  • #2356 Disallow two pipelines with the same job ID from running at the same time by default

Fixes

  • #2585 Fix bug with finding a schedule based on namespace for a custom plugin

Now Available: Meltano v1.69.0

Today, we are excited to release Meltano version 1.69.0, which adds out-of-the-box support for the Quickbooks source (thanks Hassan Syyid of Hotglue!) and adds support for Airflow 2 (thanks Michel Radosavljevic!).

You can add tap-quickbooks to your project using meltano add:

meltano add extractor tap-quickbooks

Airflow 2 is not the default yet, but you can use it in your project by adding the following to your meltano.yml project file (or modifying pip_url in your existing entry for airflow):

orchestrators:
- name: airflow
  pip_url: apache-airflow==2.0.1 --constraint https://raw.githubusercontent.com/apache/airflow/constraints-2.0.1/constraints-3.6.txt

Change 3.6 to 3.7 or 3.8 in accordance with your Python version. Then run meltano install orchestrator airflow to install.
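The pip_url value above differs only in the Python version at the end of the constraints file name. As a minimal sketch (the helper names below are hypothetical and not part of Meltano), the value for a given Python version could be assembled like this:

```python
import sys

# Airflow version taken from the meltano.yml snippet in this post.
AIRFLOW_VERSION = "2.0.1"


def airflow_pip_url(python_version: str, airflow_version: str = AIRFLOW_VERSION) -> str:
    """Build a pip_url value for a Python minor version such as "3.7".

    Hypothetical helper: it only formats the string shown in this post.
    The post names 3.6, 3.7 and 3.8 as the versions to choose from.
    """
    constraints_url = (
        "https://raw.githubusercontent.com/apache/airflow/"
        f"constraints-{airflow_version}/constraints-{python_version}.txt"
    )
    return f"apache-airflow=={airflow_version} --constraint {constraints_url}"


def current_python_version() -> str:
    """Return the running interpreter's version as "major.minor"."""
    return f"{sys.version_info.major}.{sys.version_info.minor}"
```

Calling airflow_pip_url(current_python_version()) produces the string to paste into pip_url, provided your interpreter is one of the versions listed above.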

Excited to try it out?

To upgrade Meltano and your Meltano project to the latest version, navigate to your project directory, activate the appropriate virtual environment, and run meltano upgrade. This will upgrade the meltano package and apply any necessary changes to your project.

What else is new?

The list below (copied from the changelog) covers all of the changes made to Meltano since the release of v1.68.0 on February 11:

New

  • #2558 Add support for Airflow 2.0
  • #2577 Add hotgluexyz variant of tap-quickbooks

Now Available: Meltano v1.68.0

Today, we are excited to release Meltano version 1.68.0, which adds support for entity/attribute selection to tap-gitlab (thanks Charles Julian Knight for contributing!) and bumps Airflow to version 1.10.14 (support for Airflow 2.0 is on the way!).

Excited to try it out?

To upgrade Meltano and your Meltano project to the latest version, navigate to your project directory, activate the appropriate virtual environment, and run meltano upgrade. This will upgrade the meltano package and apply any necessary changes to your project.

What else is new?

The list below (copied from the changelog) covers all of the changes made to Meltano since the release of v1.67.0 on January 26:

New

  • #2557 Add support for entity and attribute selection to tap-gitlab

Changes

  • #2559 Bump Airflow version to 1.10.14

Fixes

  • #2543 Fix package dependencies that claim Python 3.9 is supported when it actually isn’t.

Now Available: Meltano v1.67.0

Today, we are excited to release Meltano version 1.67.0, which fixes two bugs with meltano schedule run <name>: if the schedule’s meltano elt command fails with a nonzero exit code, meltano schedule run now exits with that same code, and it no longer requires the meltano executable to be in the PATH.
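Both fixes follow a general wrapper pattern, sketched below: return the child's exit code unchanged, and launch the child through an absolute path rather than a PATH lookup. This illustrates the pattern only and is not Meltano's implementation. The function name is hypothetical, and the current Python interpreter stands in for the wrapped command.

```python
import subprocess
import sys
from typing import Sequence


def run_and_propagate(command: Sequence[str]) -> int:
    """Run a child command and return its exit code unchanged.

    Hypothetical helper: a wrapper that returns this value from its own
    entry point fails exactly when the wrapped command fails.
    """
    completed = subprocess.run(command)
    return completed.returncode


# sys.executable is the absolute path of the running interpreter, so this
# child process starts without relying on anything being found via PATH.
FAILING_CHILD = [sys.executable, "-c", "import sys; sys.exit(3)"]
SUCCEEDING_CHILD = [sys.executable, "-c", "pass"]
```

A command-line wrapper would end with sys.exit(run_and_propagate(command)), so schedulers and shell scripts see the failure.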

Excited to try it out?

To upgrade Meltano and your Meltano project to the latest version, navigate to your project directory, activate the appropriate virtual environment, and run meltano upgrade. This will upgrade the meltano package and apply any necessary changes to your project.

What else is new?

The list below (copied from the changelog) covers all of the changes made to Meltano since the release of v1.66.0 on January 18:

Fixes

  • #2540 meltano schedule run exit code now matches exit code of wrapped meltano elt
  • #2525 meltano schedule run no longer requires meltano to be in the PATH

Building Meltano in Public: Bimonthly recap

Last week, it was once again my turn to host a GitLab Group Conversation (a publicly live streamed Q&A on the GitLab Unfiltered YouTube channel) on Meltano!

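I used the opportunity to share a recap of recent releases, community contributions, Slack activity, other ongoing developments, and our current and upcoming priorities.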

If you’re curious, check out the presentation on Google Slides and the Q&A on YouTube. The presentation content is also reproduced below, as is an embedded video of the Q&A!

Group Conversation Presentation

7 releases since the last GC (2020-11-19)

  1. v1.59.0 makes sure that all meltano elt errors properly make it into the log file and UI and that meltano select --list prints entities and attributes in alphabetical order.
  2. v1.60.0 adopts Poetry for dependency and build management and hides settings of kind object or array in the UI.
  3. v1.61.0 adds out-of-the-box support for the BigQuery source using the tap-bigquery extractor, adds a new meltano schedule run <name> command to easily run a scheduled pipeline by name, shows array and object settings in the plugin configuration UI as unsupported, and fixes the “Run Now” button in the pipelines UI to take into account the schedule’s overridden environment variables.
  4. v1.62.0 introduces plugin inheritance to let you have multiple configurations of the same package in your project at the same time, in the form of separate plugins that inherit their base plugin (package) description from an existing plugin and can then override (parts of) the inherited configuration.
  5. v1.63.0 automatically retries failed connections to the system database, and lets you tweak this behavior using new database_max_retries (default: 3) and database_retry_timeout (default: 5 seconds) settings.
  6. v1.64.0 fixes runaway memory consumption when an extractor outputs records at a much higher rate than the loader can process them, by enabling flow control with a 64KB buffer size limit.
  7. v1.65.0 lets you tweak the size of the buffer that stores records output by the extractor (tap) while they are waiting to be processed by the loader (target), using a new elt.buffer_size setting.

(Earlier today, we also released v1.66.0!)

17 recent contributions by 11 community members

Done

  1. Enable `pool_pre_ping` for the project’s DB engine by Suyash Behera (Goldman Sachs)
  2. Update the ELT runner to explicitly log any error messages by Allan Whatmough (Run with AI)
  3. Added sorting for `meltano select --list --all` by Nil
  4. Added missing `mysql-logo.png` by Nil
  5. Adding Poetry for better dependency and build management by Tobias Macey (MIT)
  6. Make tap bigquery known to meltano by Niall Woodward (Tails.com)
  7. Adding pre-commit and linting configurations by Tobias Macey (MIT)
  8. Tap-salesforce: Pass is_sandbox to authenticator by Kevin Ford
  9. Files-dbt: Add bigquery profile by Daniel Pettersson (Smartr)
  10. Linting fixes and contributor guide update by Tobias Macey (MIT)
  11. Target-postgres: Fix non-null falsey values by Charles Julian Knight (FIXD)

In development

  1. Search and replace “entities” >> “streams”, “attributes” >> “properties” by AJ Steers (Slalom)
  2. Add pipx-based install instructions by AJ Steers (Slalom)
  3. Singer SDK: Accelerated tap development framework (v0.0.1-alpha) by AJ Steers (Slalom)
  4. Singer SDK: Initial mock-up for target-base by AJ Steers (Slalom)
  5. Added libpq required target-postgres dependency install instructions by Geoff Langenderfer
  6. Target-snowflake: Dynamic precision fix by Bryan Wise (Halosight)

Recent weekly Slack activity

As expected, activity dropped significantly over the holidays, but it’s steadily climbing back to our previous records of 157 “weekly active members” (Dec 9) and 30 “members who posted” (Nov 18).

Join us on Slack!

Other exciting recent and ongoing developments

  • AJ Steers (Slalom) is working on the Singer SDK: a new framework and set of tools for building high-quality Singer taps and targets
  • This quarter, we intend to hire 2 active contributors from the community onto the Meltano team at GitLab as Backend Engineers to work full-time on Meltano and related projects like the Singer SDK! If you’re interested, please reach out to project lead Douwe Maan on Slack.

This week’s priorities

Milestone issue board

Upcoming priorities

Milestones issue board

Epics:

Group Conversation Q&A

Now Available: Meltano v1.66.0

Today, we are excited to release Meltano version 1.66.0, which (among other things) prevents pipelines from getting stuck in the “running” state forever when their meltano elt process is killed unceremoniously by the operating system or some other mechanism.

This is realized by automatically detecting stale runs in the system database and marking them as “failed” before meltano elt runs a new pipeline with the same Job ID, and before meltano schedule list lists the scheduled pipelines and their status (which meltano invoke airflow scheduler does periodically).

As long as a pipeline is running, meltano elt now records a heartbeat timestamp on the pipeline run row in the system database every second. A run is considered stale when 5 minutes have elapsed since the last recorded heartbeat. Older runs without a heartbeat are considered stale if they are still in the “running” state 24 hours after starting.
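The staleness rule described above can be written as a small predicate. This is a minimal sketch of the rule as stated in this post, not the actual Meltano code. The function and constant names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

# Thresholds as stated in this post.
HEARTBEAT_VALID_FOR = timedelta(minutes=5)
HEARTBEATLESS_VALID_FOR = timedelta(hours=24)


def is_stale(
    started_at: datetime,
    last_heartbeat_at: Optional[datetime],
    now: datetime,
) -> bool:
    """Decide whether a run still marked as "running" should be treated as stale.

    Hypothetical helper: runs with a heartbeat are judged by the age of that
    heartbeat, older runs without one by how long ago they started.
    """
    if last_heartbeat_at is not None:
        return now - last_heartbeat_at >= HEARTBEAT_VALID_FOR
    return now - started_at >= HEARTBEATLESS_VALID_FOR
```

A run whose process was killed stops updating its heartbeat, so it is picked up by this check five minutes later and can then be marked as failed.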

Excited to try it out?

To upgrade Meltano and your Meltano project to the latest version, navigate to your project directory, activate the appropriate virtual environment, and run meltano upgrade. This will upgrade the meltano package and apply any necessary changes to your project.

What else is new?

The list below (copied from the changelog) covers all of the changes made to Meltano since the release of v1.65.0 on January 12:

New

  • #2483 Every second, meltano elt records a heartbeat timestamp on the pipeline run row in the system database as long as the pipeline is running.
  • #2483 Before running the new pipeline, meltano elt automatically marks runs with the same Job ID that have become stale as failed. A run is considered stale when 5 minutes have elapsed since the last recorded heartbeat. Older runs without a heartbeat are considered stale if they are still in the running state 24 hours after starting.
  • #2483 meltano schedule list (which is run periodically by meltano invoke airflow scheduler) automatically marks any stale run as failed.
  • #2502 Add User-Agent header with Meltano version to request for remote discovery.yml manifest (typically https://www.meltano.com/discovery.yml)
  • #2503 Include project ID in X-Project-ID header and project_id query param in request for remote discovery.yml manifest when send_anonymous_usage_stats setting is enabled.

Now Available: Meltano v1.65.0

Today, we are excited to release Meltano version 1.65.0, which lets you tweak the size of the buffer that stores records (and other messages) output by the extractor (tap) while they are waiting to be processed by the loader (target), using a new elt.buffer_size setting with a default value of 10MiB.

The length of a single line of extractor output is limited to half the buffer size, making the default maximum message size 5MiB.
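The relationship between the buffer size and the maximum message size is simple arithmetic, illustrated below. The names are hypothetical and this is not Meltano's code. It only restates the "half the buffer size" rule from this post.

```python
MIB = 1024 * 1024

# Default value of the elt.buffer_size setting, per this post.
DEFAULT_BUFFER_SIZE = 10 * MIB


def max_message_size(buffer_size: int = DEFAULT_BUFFER_SIZE) -> int:
    """Return the longest single line of extractor output that is allowed."""
    return buffer_size // 2


def fits_in_buffer(line: bytes, buffer_size: int = DEFAULT_BUFFER_SIZE) -> bool:
    """Check whether one line of extractor output is within the limit."""
    return len(line) <= max_message_size(buffer_size)
```

With the default setting, max_message_size() is 5 MiB (5242880 bytes). An extractor that emits larger individual messages would need a proportionally larger elt.buffer_size.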

Excited to try it out?

To upgrade Meltano and your Meltano project to the latest version, navigate to your project directory, activate the appropriate virtual environment, and run meltano upgrade. This will upgrade the meltano package and apply any necessary changes to your project.

What else is new?

The list below (copied from the changelog) covers all of the changes made to Meltano since the release of v1.64.0 on January 7:

New

  • #2392 Add ‘elt.buffer_size’ setting with default value of 10MiB to let extractor output buffer size and line length limit (maximum message size) be configured as appropriate for the extractor and loader in question.

Fixes

  • #2501 Don’t lose version when caching discovery.yml.

Now Available: Meltano v1.64.0 and v1.64.1

Today, we are excited to release Meltano version 1.64.0, which fixes runaway memory consumption when an extractor outputs records at a much higher rate than the loader can process them, by enabling flow control with a 64KB buffer size limit.

As a result of this bug, meltano elt pipelines composed of fast extractors and slow loaders would sometimes be terminated by the operating system before completing, to prevent the system from running out of memory entirely.

Making the buffer size (and the related Singer message length limit) configurable is being tracked in a separate issue that is also being worked on this week.


Shortly after v1.64.0 was released, Yordan Ivanov reported a new critical bug introduced by this “fix”: when the extractor finishes before the loader, not all messages (records) would actually make it to the loader, but meltano elt would finish successfully anyway. This has been fixed in Meltano version 1.64.1, released on January 8.
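To illustrate what flow control means here, the sketch below uses a bounded asyncio.Queue as a stand-in for the buffer between extractor and loader. It is an analogy only, not Meltano's implementation. Meltano limits the buffer in bytes, whereas this sketch counts records, and all names are hypothetical. The sentinel at the end shows one way to make sure every record reaches the loader even though the extractor finishes first.

```python
import asyncio
from typing import Tuple


async def run_pipeline(num_records: int, max_buffered: int) -> Tuple[int, int]:
    """Run a fast producer against a slower consumer through a bounded queue.

    Returns (records_loaded, peak_buffered). Hypothetical illustration only.
    """
    queue: asyncio.Queue = asyncio.Queue(maxsize=max_buffered)
    loaded = 0
    peak = 0

    async def extractor() -> None:
        nonlocal peak
        for record in range(num_records):
            # put() waits while the queue is full, which is what stops
            # memory use from growing without bound.
            await queue.put(record)
            peak = max(peak, queue.qsize())
        # Sentinel: tells the loader that no more records will follow.
        await queue.put(None)

    async def loader() -> None:
        nonlocal loaded
        while True:
            record = await queue.get()
            if record is None:
                break
            loaded += 1
            # Yield control to mimic a loader that is slower than the extractor.
            await asyncio.sleep(0)

    await asyncio.gather(extractor(), loader())
    return loaded, peak
```

Running asyncio.run(run_pipeline(1000, 8)) loads all 1000 records while never holding more than 8 in the queue at once.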

Excited to try it out?

To upgrade Meltano and your Meltano project to the latest version, navigate to your project directory, activate the appropriate virtual environment, and run meltano upgrade. This will upgrade the meltano package and apply any necessary changes to your project.

What else is new?

The list below (copied from the changelog) covers all of the changes made to Meltano since the release of v1.63.0 on January 4:

Fixes

  • #2478 Fix runaway memory usage (and possible out-of-memory error) when extractor outputs messages at higher rate than loader can process them, by enabling flow control with a 64KB buffer size limit

Now Available: Meltano v1.63.0

Today, we are excited to release Meltano version 1.63.0, which automatically retries failed connections to the system database, and lets you tweak this behavior using new database_max_retries (default: 3) and database_retry_timeout (default: 5 seconds) settings!

Special thanks go out to Suyash Behera for contributing this functionality.
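The retry behavior can be pictured with the following sketch. It is not the contributed implementation. The function name is hypothetical, and it assumes that max_retries counts the attempts made after the first failure and that retry_timeout is the wait in seconds between attempts.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retries(
    connect: Callable[[], T],
    max_retries: int = 3,
    retry_timeout: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds or the retries are used up.

    Assumption: max_retries=3 allows up to 4 attempts in total
    (the first one plus three retries).
    """
    retries_used = 0
    while True:
        try:
            return connect()
        except ConnectionError:
            if retries_used >= max_retries:
                raise  # give up and surface the last error
            retries_used += 1
            sleep(retry_timeout)
```

The sleep parameter is injectable so that the waiting can be skipped in tests.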

Excited to try it out?

To upgrade Meltano and your Meltano project to the latest version, navigate to your project directory, activate the appropriate virtual environment, and run meltano upgrade. This will upgrade the meltano package and apply any necessary changes to your project.

What else is new?

The list below (copied from the changelog) covers all of the changes made to Meltano since the release of v1.62.0 on December 23:

New

  • #2308 Verify that system database connection is still viable when checking it out of connection pool.
  • #2308 Add database_max_retries and database_retry_timeout settings to configure retry attempts when the first connection to the DB fails.

Fixes

  • #2486 Remove state capability from tap-google-analytics because it’s not actually supported yet

Now Available: Meltano v1.62.0

Today, we are excited to release Meltano version 1.62.0, which introduces plugin inheritance to let you have multiple configurations of the same package in your project at the same time, in the form of separate plugins that inherit their base plugin (package) description from an existing plugin and can then override (parts of) the inherited configuration!

To learn more about plugin inheritance, refer to the Plugins concept doc.

To learn how to use plugin inheritance for multiple configurations, refer to the Configuration guide.

To learn how to add an inheriting plugin to your project using --inherit-from=<name> on meltano add or inherit_from: <name> in meltano.yml, refer to the Plugin Management guide.
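Conceptually, an inheriting plugin starts from its parent's description and configuration and overrides parts of it. The sketch below models that idea with plain dictionaries. It is a simplified illustration, not how Meltano resolves plugins internally. The plugin names, the pip_url value, and the config keys are all hypothetical.

```python
from typing import Any, Dict

Plugin = Dict[str, Any]


def resolve_plugin(name: str, plugins: Dict[str, Plugin]) -> Plugin:
    """Return a plugin definition with everything inherited via inherit_from applied.

    Hypothetical helper: top-level keys of the child replace the parent's,
    while the "config" mapping is merged key by key.
    """
    plugin = plugins[name]
    parent_name = plugin.get("inherit_from")
    if parent_name is None:
        return {**plugin, "config": dict(plugin.get("config", {}))}
    parent = resolve_plugin(parent_name, plugins)
    merged = {**parent, **plugin}
    merged["config"] = {**parent["config"], **plugin.get("config", {})}
    return merged


# Hypothetical project: one package, used with two different configurations.
EXAMPLE_PLUGINS: Dict[str, Plugin] = {
    "tap-example": {
        "pip_url": "tap-example-package",
        "config": {"host": "db.example.com", "port": 5432},
    },
    "tap-example--billing": {
        "inherit_from": "tap-example",
        "config": {"dbname": "billing"},
    },
}
```

Resolving "tap-example--billing" yields the parent's pip_url together with a config that contains host, port, and dbname.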

Excited to try it out?

To upgrade Meltano and your Meltano project to the latest version, navigate to your project directory, activate the appropriate virtual environment, and run meltano upgrade. This will upgrade the meltano package and apply any necessary changes to your project.

What else is new?

The list below (copied from the changelog) covers all of the changes made to Meltano since the release of v1.61.0 on December 9:

New

  • #2390 Let a plugin inherit its base plugin (package) description and configuration from another one using --inherit-from=<name> on meltano add or inherit_from: <name> in meltano.yml, so that the same package can be used in a project multiple times with different configurations.

Changes

  • #2479 Use extractor load_schema (usually its namespace) as default for target-bigquery dataset_id setting, as it already is for target-snowflake and target-postgres schema