Now Available: Meltano v1.55.0

Today, we are excited to release Meltano version 1.55.0, which (among other things) adds out-of-the-box support for the datamill-co and transferwise (aka PipelineWise) variants of target-snowflake. The datamill-co variant is recommended for new users and is now the default, but the original meltano variant is still supported.
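As an illustration, choosing a variant when adding the loader to a project might look like the sketch below. The --variant option on meltano add is assumed from Meltano’s CLI documentation rather than stated in this post, and the add_snowflake_loader wrapper is a hypothetical name.

```shell
# Hypothetical wrapper around `meltano add` for target-snowflake.
# With no argument it adds the default variant (datamill-co as of v1.55.0);
# otherwise it passes the requested variant via --variant (assumed flag).
add_snowflake_loader() {
  variant="${1:-}"
  if [ -n "$variant" ]; then
    meltano add loader target-snowflake --variant="$variant"
  else
    meltano add loader target-snowflake
  fi
}
```

Calling add_snowflake_loader transferwise would request the PipelineWise variant, while add_snowflake_loader meltano would keep using the original one.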

Special thanks go out to Nil for contributing a bug fix to stop meltano init from failing when the underlying filesystem doesn’t support symlinks!

Excited to try it out?

To upgrade Meltano and your Meltano project to the latest version, navigate to your project directory, activate the appropriate virtual environment, and run meltano upgrade. This will upgrade the meltano package and apply any necessary changes to your project.
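As a rough sketch, the upgrade steps above could be wrapped in a small shell function. The function name, the project path argument, and the .venv location are assumptions for illustration; only meltano upgrade comes from the instructions above.

```shell
# Hypothetical helper: upgrade Meltano inside a given project directory.
# Assumes the project's virtual environment lives at <project>/.venv,
# which may not match your setup.
upgrade_meltano_project() {
  project_dir="$1"
  (
    # Run in a subshell so the caller's working directory and
    # environment are left untouched.
    cd "$project_dir" || exit 1
    . .venv/bin/activate || exit 1
    meltano upgrade
  )
}
```

You would call it with the path to your project, for example upgrade_meltano_project ~/my-meltano-project (a placeholder path).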

What else is new?

The list below (copied from the changelog) covers all of the changes made to Meltano since the release of v1.54.0 on October 8:

New

  • #2368 Add transferwise and datamill-co variants of target-snowflake

Changes

  • #2368 Make datamill-co variant of target-snowflake the default instead of meltano
  • #2380 Add target_ prefix to namespaces of discoverable loaders target-postgres, target-snowflake, and target-sqlite

Fixes

  • #2373 meltano init emits warning instead of failing when underlying filesystem doesn’t support symlinks
  • #2391 Add missing max_workers setting to tap-salesforce discoverable plugin definition
  • #2400 Constrain Airflow installation to specific set of known-to-work requirements to prevent it from breaking unexpectedly

Watch Now: Open source EL(T) with Meltano and Singer

On October 16, I had the honor of hosting a talk and Q&A on open source EL(T) with Meltano and Singer at a Data Nerd Herd event organized by Ternary Data’s Joe Reis.

If you’d like to learn why GitLab is so excited about open source data integration, how Singer provides the foundation, and how Meltano completes the picture, watch the recorded talk below, or check out the slides!

Talk description

Meltano is an open source, self-hosted, CLI-first ELT platform that brings DevOps best practices such as version control, code review, and continuous integration and deployment (CI/CD) to a space historically dominated by proprietary, hosted (SaaS), UI-first solutions that end up being poorly customizable black boxes in otherwise transparent data stacks.

At GitLab, where Meltano was founded in 2018, we think that the power of data integration should be available to all, and are on a mission to build a truly competitive alternative to existing proprietary EL(T) solutions, in terms of ease of use, reliability, and quantity and quality of supported data sources.

Meltano leverages the existing ecosystem of open source Singer taps and targets for extraction and loading, and supports dbt transformations and Airflow orchestration through a thin plugin-based integration layer. This makes Meltano the easiest way to get started with open source ELT, without imposing arbitrary limitations on those advanced users who want to take a look behind the curtain and interact with the plugins directly.

About the presenter

Douwe Maan leads the Meltano project at GitLab, which he joined in 2015 as its fifth software engineer, when GitLab’s community of open source contributors still greatly outnumbered the in-house engineering team. He became GitLab’s first Engineering Lead, and ran the team responsible for all version control and code review functionality until 2019, when he moved to Meltano to attempt to replicate GitLab’s success in commoditizing DevOps tooling in the data integration space.

Building Meltano in Public: 6-weekly recap

Earlier this week, it was my turn to host a GitLab Group Conversation (a publicly live streamed Q&A on the GitLab Unfiltered YouTube channel) on Meltano.

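I used the opportunity to share a recap of recent releases, community contributions, Slack activity, other recent developments, and current and upcoming priorities.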

If you’re curious, check out the presentation on Google Slides and the Q&A on YouTube. The presentation content is also reproduced below, as is an embedded video of the Q&A!

Group Conversation Presentation

Meltano has had 7 releases since the last GC (2020-09-01)

  1. V1.47.0 adds support for Bing Ads, prints docs and repo URLs when adding plugins, lets you specify a full schema for taps that can’t discover theirs, automatically uppercases target-snowflake’s schema setting, and fixes a bug with embedded reports in the UI.
  2. V1.48.0 lets you extract a subset of selected entities using new --select and --exclude options on meltano elt, improves plugin invocation and extractor catalog discovery error messages, and changes where meltano elt logs and generated plugin config files are stored. 
  3. V1.49.0 standardizes on <PLUGIN_NAME>_<SETTING_NAME> for configuration environment variable names, makes environment variable expansion in setting values more flexible, and uses this to let you easily override your extractor’s load_schema.
  4. V1.50.0 lets you manually provide extractor catalog and state files to meltano elt using new --catalog and --state options (and catalog and state extras), as an alternative to letting the catalog be generated on the fly and letting state be looked up based on the Job ID.
  5. V1.51.0 simplifies debugging extractor catalog generation, pipeline state lookup, and pipeline-specific configuration by letting you dump the contents of meltano elt’s generated catalog, state, and config files to STDOUT (or a file) using a new --dump option.
  6. V1.52.0 fixes a bug where meltano elt --transform=run would unexpectedly install a transform plugin, and another where meltano select would show outdated results after changing configuration.
  7. V1.53.0 lays the foundation for out-of-the-box support for different variants of extractors and loaders, like the transferwise and datamill-co variants of target-snowflake and the singer-io variant of tap-facebook.  

5 community members made 11 recent contributions

Done

  1. Compose file update and readme addition by Nevin Morgan (VividFront)
  2. Override auth check when using a shared embed link by Allan Whatmough
  3. Resolve “Add a new `upcase_string` `value_processor` and apply to `target-snowflake`’s `schema` setting” by Nevin Morgan (VividFront)
  4. Add max_active_runs=1 to prevent scheduled job overlap by Niall Woodward (Tails.com)
  5. Remove automatic plugin install and remove associated tests by Paul Blankley (Zenlytic)
  6. Remove snowflake-connector-python dependency, bump snowflake-sqlalchemy, sqlalchemy and flask-sqlalchemy by Niall Woodward (Tails.com)
  7. Stop inheriting Meltano venv when invoking Airflow by Niall Woodward (Tails.com)
  8. Upgrade `pip` and related tools to the latest version in plugin venvs by Charles Julian Knight (FIXD)

In development

  1. Add pipelinewise-tap-mysql and pipelinewise-target-snowflake to known tap and targets by Niall Woodward (Tails.com)
  2. Use pipenv for reproducible development environment by Niall Woodward (Tails.com)
  3. Bump Airflow version to 1.10.12 by Niall Woodward (Tails.com)

Recent weekly Slack activity

Join us on Slack!

Other exciting recent developments

Current milestone priorities

Milestone issue board

Epics for upcoming priorities

Group Conversation Q&A

Now Available: Meltano v1.54.0

Today, we are excited to release Meltano version 1.54.0, which (among other things) ensures arbitrary environment variables defined in .env are passed to invoked plugins, recreates plugin virtual environments when running meltano install to make sure the latest versions of dependencies are installed, and bumps the Airflow version to 1.10.12.

Special thanks go out to Niall Woodward of Tails.com for contributing these last two changes!

Excited to try it out?

To upgrade Meltano and your Meltano project to the latest version, navigate to your project directory, activate the appropriate virtual environment, and run meltano upgrade. This will upgrade the meltano package and apply any necessary changes to your project.

What else is new?

The list below (copied from the changelog) covers all of the changes made to Meltano since the release of v1.53.0 on October 6:

Changes

  • #2057 Bump Airflow version to 1.10.12
  • #2224 Delete (if present) and recreate virtual environments for all plugins when running meltano install.

Fixes

  • #2334 Omit keys for settings with null values from config.json files generated for taps and targets, to support plugins that check if a config key is present instead of checking if the value is non-null.
  • #2376 Fix meltano elt ... --transform={run,only} raising PluginMissingError when a default transform for the extractor is discoverable but not installed
  • #2377 Ensure arbitrary env vars defined in .env are passed to invoked plugins

Now Available: Meltano v1.53.0

Today, we are excited to release Meltano version 1.53.0, which lays the foundation for out-of-the-box support for different variants of extractors and loaders, like the transferwise and datamill-co variants of target-snowflake and the singer-io variant of tap-facebook.

It also unblocks updating Airflow to the latest version and supporting Python 3.8, since Airflow no longer inherits Meltano’s own virtual environment when invoked. Thanks, Niall Woodward, for contributing this change!

Finally, it makes sure that plugins are always installed using the latest versions of pip and setuptools to ensure support for all modern PyPI packages. Thanks, Charles Julian Knight, for contributing this change!

Excited to try it out?

To upgrade Meltano and your Meltano project to the latest version, navigate to your project directory, activate the appropriate virtual environment, and run meltano upgrade. This will upgrade the meltano package and apply any necessary changes to your project.

What else is new?

The list below (copied from the changelog) covers all of the changes made to Meltano since the release of v1.52.0 on September 28:

New

  • #2134 Let different variants of plugins be discovered and let users choose which to add to their project

Changes

  • #2112 Stop inheriting Meltano venv when invoking Airflow

Fixes

  • #2372 Upgrade pip and related tools to the latest version in plugin venvs

Now Available: Meltano v1.52.0

Today, we are excited to release Meltano version 1.52.0, which fixes a bug where meltano elt --transform=run would unexpectedly install a transform plugin, and another where meltano select would show outdated results after changing configuration.

Special thanks go out to Paul Blankley for contributing the first of these bug fixes!

Excited to try it out?

To upgrade Meltano and your Meltano project to the latest version, navigate to your project directory, activate the appropriate virtual environment, and run meltano upgrade. This will upgrade the meltano package and apply any necessary changes to your project.

What else is new?

The list below (copied from the changelog) covers all of the changes made to Meltano since the release of v1.51.0 on September 21:


Fixes

  • #2360 Remove automatic install of extractors, loaders, and transforms if they are not present.
  • #2348 Invalidate meltano select catalog discovery cache when extractor configuration is changed

Now Available: Meltano v1.51.0

Today, we are excited to release Meltano version 1.51.0, which simplifies debugging extractor catalog generation, pipeline state lookup, and pipeline-specific configuration by letting you dump the contents of meltano elt’s generated catalog, state, and config files to STDOUT (or a file) using a new --dump option with possible values catalog, state, extractor-config, and loader-config:

# Dump generated catalog in catalog.json
meltano elt <extractor> <loader> --job_id=<pipeline name> --dump=catalog > catalog.json

# Dump generated state in state.json
meltano elt <extractor> <loader> --job_id=<pipeline name> --dump=state > state.json

# View generated extractor config
meltano elt <extractor> <loader> --job_id=<pipeline name> --dump=extractor-config

# View generated loader config
meltano elt <extractor> <loader> --job_id=<pipeline name> --dump=loader-config

Excited to try it out?

To upgrade Meltano and your Meltano project to the latest version, navigate to your project directory, activate the appropriate virtual environment, and run meltano upgrade. This will upgrade the meltano package and apply any necessary changes to your project.

What else is new?

The list below (copied from the changelog) covers all of the changes made to Meltano since the release of v1.50.0 on September 17:

New

  • #2355 Add meltano elt --dump option with possible values catalog, state, extractor-config, and loader-config to dump content of pipeline-specific generated file

Fixes

  • #2358 Don’t unintentionally deselect all attributes other than those marked inclusion: automatic when using extractor select_filter extra or meltano elt’s --select <entity> option

Now Available: Meltano v1.50.0

Today, we are excited to release Meltano version 1.50.0, which lets you manually provide extractor catalog and state files to meltano elt using new --catalog and --state options (and catalog and state extras), as an alternative to letting the catalog be generated on the fly and letting state be looked up based on the Job ID.
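A minimal sketch of how the new options might be combined in a script follows. Only --catalog and --state come from the release notes; the elt_with_files function name and its argument order are illustrative assumptions.

```shell
# Hypothetical wrapper: run a pipeline with a manually provided catalog and state,
# instead of letting Meltano generate the catalog and look up state by Job ID.
elt_with_files() {
  extractor="$1"
  loader="$2"
  catalog_file="$3"
  state_file="$4"
  meltano elt "$extractor" "$loader" \
    --catalog="$catalog_file" \
    --state="$state_file"
}
```

For instance, elt_with_files <extractor> <loader> catalog.json state.json would pass both files to a single meltano elt run.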

Excited to try it out?

To upgrade Meltano and your Meltano project to the latest version, navigate to your project directory, activate the appropriate virtual environment, and run meltano upgrade. This will upgrade the meltano package and apply any necessary changes to your project.

What else is new?

The list below (copied from the changelog) covers all of the changes made to Meltano since the release of v1.49.0 on September 15:

New

  • #2291 Add catalog extractor extra to allow a catalog to be provided manually
  • #2291 Add --catalog option to meltano elt to allow a catalog to be provided manually
  • #2289 Add state extractor extra to allow state file to be provided manually
  • #2289 Add --state option to meltano elt to allow state file to be provided manually

Fixes

  • #2352 meltano elt --select and --exclude no longer unexpectedly select entities for extraction that match the wildcard pattern but weren’t selected originally.

Now Available: Meltano v1.49.0

Today, we are excited to release Meltano version 1.49.0, which (among other things) standardizes on <PLUGIN_NAME>_<SETTING_NAME> for configuration environment variable names (e.g. TARGET_POSTGRES_HOST for target-postgres’s host setting instead of PG_ADDRESS), makes environment variable expansion in setting values more flexible, and uses this to let you easily override your extractor’s load_schema.
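The naming convention can be sketched as a small shell function. This is an illustration rather than Meltano’s own implementation: the setting_env_var name is hypothetical, and it assumes plugin and setting names contain only letters, digits, hyphens, and underscores.

```shell
# Derive the conventional environment variable name for a plugin setting:
# <PLUGIN_NAME>_<SETTING_NAME>, uppercased, with hyphens turned into underscores.
# Assumption: inputs contain only letters, digits, hyphens, and underscores.
setting_env_var() {
  printf '%s_%s\n' "$1" "$2" | tr '[:lower:]' '[:upper:]' | tr '-' '_'
}
```

Running setting_env_var target-postgres host prints TARGET_POSTGRES_HOST, matching the example above.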

Excited to try it out?

To upgrade Meltano and your Meltano project to the latest version, navigate to your project directory, activate the appropriate virtual environment, and run meltano upgrade. This will upgrade the meltano package and apply any necessary changes to your project.

What else is new?

The list below (copied from the changelog) covers all of the changes made to Meltano since the release of v1.48.0 on September 7:

New

  • #2279 Populate primary setting env var and aliases when invoking plugin or expanding env vars
  • #2349 Allow plugin setting (default) values to reference pipeline plugin extras using generic env vars, e.g. MELTANO_EXTRACT__<EXTRA>
  • #2281 Allow plugin extra (default) values to reference plugin name, namespace, profile using generic env vars, e.g. MELTANO_EXTRACTOR_NAMESPACE
  • #2280 Allow plugin extra (default) values to reference plugin settings using env vars, e.g. target_schema: $PG_SCHEMA
  • #2278 Read setting values from <PLUGIN_NAME>_<SETTING_NAME> env vars, taking precedence over <PLUGIN_NAMESPACE>_<SETTING_NAME> but not custom env
  • #2350 Add MELTANO_TRANSFORM_* transform pipeline env vars for transformer (configuration) to access
  • #2282 Add new extractor extra load_schema and use it as default loader schema instead of namespace
  • #2284 Add new loader extra dialect and use it as default dbt target and Meltano UI SQL dialect instead of namespace
  • #2283 Add new loader extra target_schema and use it as default dbt source_schema instead of loader schema
  • #2285 Add new transform extra package_name and use it in dbt’s dbt_project.yml and --models argument instead of namespace

Changes

  • #2279 Fall back on setting values from <PLUGIN_NAME>_<SETTING_NAME> and <PLUGIN_NAMESPACE>_<SETTING_NAME> env vars if a custom env is defined but not used
  • #2278 Stop unnecessarily prepopulating env on a newly added custom plugin’s settings definitions
  • #2208 Standardize on setting env vars prefixed with plugin name, not namespace or custom env
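The lookup order described in #2278 and #2279 can be approximated with the sketch below. It is a simplified reading of the changelog entries, not Meltano’s actual code: the function names are hypothetical, and the custom env var name, plugin name, namespace, and setting name are all supplied by the caller.

```shell
# Sketch of the assumed lookup order for a setting value:
#   1. the custom env var from the setting definition, if one is defined and set
#   2. <PLUGIN_NAME>_<SETTING_NAME>
#   3. <PLUGIN_NAMESPACE>_<SETTING_NAME>
# Arguments: custom env var name (may be empty), plugin name, namespace, setting name.
env_name() {
  printf '%s_%s\n' "$1" "$2" | tr '[:lower:]' '[:upper:]' | tr '-' '_'
}

resolve_setting() {
  custom_env="$1"
  for candidate in "$custom_env" "$(env_name "$2" "$4")" "$(env_name "$3" "$4")"; do
    # Skip the custom env slot when the setting defines none.
    [ -n "$candidate" ] || continue
    # printenv exits non-zero when the variable is not in the environment.
    if value="$(printenv "$candidate")"; then
      printf '%s\n' "$value"
      return 0
    fi
  done
  return 1
}
```

With PG_ADDRESS unset and TARGET_POSTGRES_HOST exported, resolve_setting PG_ADDRESS target-postgres <namespace> host would print the value of TARGET_POSTGRES_HOST.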

Now Available: Meltano v1.48.0

Today, we are excited to release Meltano version 1.48.0, which (among other things) lets you extract a subset of selected entities using new --select and --exclude options on meltano elt, improves plugin invocation and extractor catalog discovery error messages, and changes where meltano elt logs and generated plugin config files are stored.
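As a hedged example, a script that runs a pipeline for only part of the selected entities might wrap the new options like this. The elt_subset name and its arguments are hypothetical; --select and --exclude are the options named above, and the entity names you pass depend on your extractor.

```shell
# Hypothetical wrapper: extract only one entity while excluding another.
# The entity arguments are placeholders for names from your extractor's catalog.
elt_subset() {
  extractor="$1"
  loader="$2"
  select_entity="$3"
  exclude_entity="$4"
  meltano elt "$extractor" "$loader" \
    --select "$select_entity" \
    --exclude "$exclude_entity"
}
```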

Users who were depending on the previous locations of meltano elt logs (.meltano/run/elt/<job ID>/<run ID>/elt.log) and extractor catalog files (.meltano/run/<extractor>/tap.properties.json) will want to use the new locations instead: .meltano/logs/elt/<job ID>/<run ID>/elt.log and .meltano/extractors/<extractor>/tap.properties.json.
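For scripts that tail or archive pipeline logs, the change in location can be captured in two small helpers. This is only a sketch: the function names are hypothetical, and the paths are relative to the project root as given above.

```shell
# Build the pre- and post-v1.48.0 locations of a pipeline run's log file.
# Arguments: job ID, run ID. Paths are relative to the Meltano project root.
old_elt_log_path() {
  printf '.meltano/run/elt/%s/%s/elt.log\n' "$1" "$2"
}

new_elt_log_path() {
  printf '.meltano/logs/elt/%s/%s/elt.log\n' "$1" "$2"
}
```

A script that previously read the file at old_elt_log_path would switch to new_elt_log_path with the same job and run IDs.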

Excited to try it out?

To upgrade Meltano and your Meltano project to the latest version, navigate to your project directory, activate the appropriate virtual environment, and run meltano upgrade. This will upgrade the meltano package and apply any necessary changes to your project.

What else is new?

The list below (copied from the changelog) covers all of the changes made to Meltano since the release of v1.47.0 on September 3:

New

  • #2340 Print full error message when extractor catalog discovery fails
  • #2223 Print clear error message when meltano invoke or meltano elt attempts to execute a plugin that hasn’t been installed yet
  • #2345 Make tap-csv files a known setting
  • #2155 Add select_filter extractor extra to allow extracting a subset of selected entities
  • #2155 Add --select and --exclude options to meltano elt to allow extracting a subset of selected entities

Changes

  • #2167 Include extractor and loader name in autogenerated meltano elt job ID
  • #2343 Automatically delete generated plugin config files at end of meltano elt and meltano invoke

Fixes

  • #2167 Make sure autogenerated meltano elt job ID matches in system database and .meltano/{run,logs}/elt
  • #2347 Have meltano config <plugin> set --store=dotenv store valid JSON values for arrays and objects
  • #2346 Correctly cast environment variable string value when overriding custom array and object settings

Breaks

  • #2344 Move meltano elt output logs from .meltano/run/elt to .meltano/logs/elt
  • #2342 Store pipeline-specific generated plugin config files (tap.config.json, tap.properties.json, etc) under .meltano/run/elt/<job_id>/<run_id> instead of .meltano/run/<plugin_name>. Users who were explicitly putting a catalog file at .meltano/run/<plugin_name>/tap.properties.json should use .meltano/extractors/<plugin_name>/tap.properties.json instead.