# Plugins

A Meltano project's primary components are its plugins, which implement the various details of your ELT pipelines:

  • Extractors pull data out of arbitrary data sources.
  • Loaders load extracted data into arbitrary data destinations.
  • Transforms transform data that has been loaded into a database (data warehouse).
  • Models describe the schema of the data being analyzed and the ways different tables can be joined.
  • Dashboards bundle curated Meltano UI dashboards and reports.
  • Orchestrators orchestrate a project's scheduled pipelines.
  • Transformers run transforms.
  • File bundles bundle files you may want in your project.

Discoverable plugins are supported out of the box, and others can easily be added as custom plugins.

To learn how to manage your project's plugins, refer to the Plugin Management guide.

# Discoverable plugins

Before Meltano can use a plugin, it needs to know where its package can be found, how it can be invoked, and what settings it supports, on top of plugin type-specific details like an extractor's capabilities (supported executable options) or a loader's dialect.

Meltano supports many common extractors (Singer taps), loaders (Singer targets), and other plugins out of the box, since their metadata has already been collected and contributed to its index of discoverable plugins: the discovery.yml manifest. This manifest can be found in the Meltano repository, ships inside the meltano package, and can be downloaded from https://www.meltano.com/discovery.yml.

To find out which plugins are discoverable and supported out of the box, run meltano discover or refer to the lists of Extractors and Loaders. Discoverable plugins can be added to your project using meltano add <type> <name>.
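For example, assuming plugins named tap-gitlab and target-jsonl are discoverable (both appear in the examples later in this document), adding them to a project looks like:

```bash
# Add a discoverable extractor and loader to the project
meltano add extractor tap-gitlab
meltano add loader target-jsonl
```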

To find discoverable plugins, Meltano will first look for the discovery.yml manifest at the root of your project (where it won't exist until you create it), then at the URL specified by the discovery_url setting (https://www.meltano.com/discovery.yml by default), and finally inside its own package.

# Variants

In the case of various popular data sources and destinations, multiple alternative implementations of Singer taps (extractors) and targets (loaders) exist, some of which are forks of an original (canonical) version that evolved in their own direction, while others were developed independently from the start.

These different implementations and their repositories typically use the same name (tap-<source> or target-<destination>) and may on the surface appear interchangeable, but often vary significantly in terms of exact behavior, quality, and supported settings.

In its index of discoverable plugins, Meltano considers these different implementations to be different variants of the same plugin. Variants share a plugin name and other source/destination-specific details (like a logo and description), but have their own implementation-specific variant name and metadata (like capabilities and settings).

Every discoverable plugin has a default variant that is known to work well and recommended for new users, which will be added to your project unless you explicitly select a different one. Users who already have experience with a different variant (or have specific reasons to prefer it) can explicitly choose to add it to their project instead of the default, so that they get the same behavior and can use the same settings as before. If the variant in question is not discoverable yet, it can be added as a custom plugin.

When multiple variants of a discoverable plugin are available, meltano discover will list their names alongside the plugin name. When adding a plugin to your project, a non-default variant can be specified using meltano add's --variant option.
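On the command line this looks as follows; `<variant>` is a placeholder, since the available variant names depend on the plugin (run meltano discover to list them):

```bash
# List discoverable extractors, including available variants
meltano discover extractors

# Add a specific (non-default) variant instead of the default
meltano add extractor tap-gitlab --variant=<variant>
```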

# Custom plugins

If you'd like to use Meltano with a plugin that isn't discoverable yet, you'll be asked to provide the relevant metadata when you add it to your project as a custom plugin using meltano add --custom <type> <name>.

Once you've got the plugin working in your project, please consider contributing its definition to the discovery.yml manifest so that it can be supported out of the box for new users!
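Adding a custom plugin looks like this (the plugin name below is hypothetical); Meltano will then prompt for the metadata it needs, such as a namespace, pip_url, executable, capabilities, and settings:

```bash
# Add an extractor that isn't in the discoverable index yet
meltano add --custom extractor tap-my-source
```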

# Extractors

Extractors are pip packages used by meltano elt as part of data integration. They are responsible for pulling data out of arbitrary data sources: databases, SaaS APIs, or file formats.

Meltano supports Singer taps: executables that implement the Singer specification.

To learn which extractors are discoverable and supported out of the box, refer to the Extractors page or run meltano discover extractors.

# catalog extra

  • Setting: _catalog
  • Environment variable: <EXTRACTOR>__CATALOG, e.g. TAP_GITLAB__CATALOG
  • meltano elt CLI option: --catalog
  • Default: None

An extractor's catalog extra holds a path to a catalog file (relative to the project directory) to be provided to the extractor when it is run in sync mode using meltano elt or meltano invoke.

If a catalog path is not set, the catalog will be generated on the fly by running the extractor in discovery mode and applying the schema, selection, and metadata rules to the discovered catalog.

Selection filter rules are always applied to manually provided catalogs as well as discovered ones.

While this extra can be managed using meltano config or environment variables like any other setting, a catalog file is typically provided using meltano elt's --catalog option.

# How to use

# In meltano.yml

```yaml
extractors:
- name: tap-gitlab
  pip_url: tap-gitlab
  catalog: extract/tap-gitlab.catalog.json
```

# On the command line

```bash
meltano config <extractor> set _catalog <path>

export <EXTRACTOR>__CATALOG=<path>

meltano elt <extractor> <loader> --catalog <path>

# For example:
meltano config tap-gitlab set _catalog extract/tap-gitlab.catalog.json

export TAP_GITLAB__CATALOG=extract/tap-gitlab.catalog.json

meltano elt tap-gitlab target-jsonl --catalog extract/tap-gitlab.catalog.json
```

# load_schema extra

  • Setting: _load_schema
  • Environment variable: <EXTRACTOR>__LOAD_SCHEMA, e.g. TAP_GITLAB__LOAD_SCHEMA
  • Default: $MELTANO_EXTRACTOR_NAMESPACE, which will expand to the extractor's namespace, e.g. tap_gitlab for tap-gitlab

An extractor's load_schema extra holds the name of the database schema extracted data should be loaded into, when this extractor is used in a pipeline with a loader for a database that supports schemas, like PostgreSQL or Snowflake.

The value of this extra can be referenced from a loader's configuration using the MELTANO_EXTRACT__LOAD_SCHEMA pipeline environment variable. It is used as the default value for the target-postgres and target-snowflake schema settings.
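As a sketch of what that reference looks like from the loader's side in meltano.yml (mirroring the built-in default for target-postgres's schema setting, so setting it explicitly is only needed for other loaders):

```yaml
loaders:
- name: target-postgres
  config:
    # Provided by Meltano at pipeline runtime from the extractor's `load_schema` extra
    schema: $MELTANO_EXTRACT__LOAD_SCHEMA
```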

# How to use

# In meltano.yml

```yaml
extractors:
- name: tap-gitlab
  pip_url: tap-gitlab
  load_schema: gitlab_data
```

# On the command line

```bash
meltano config <extractor> set _load_schema <schema>

export <EXTRACTOR>__LOAD_SCHEMA=<schema>

# For example:
meltano config tap-gitlab set _load_schema gitlab_data

export TAP_GITLAB__LOAD_SCHEMA=gitlab_data
```

# metadata extra

  • Setting: _metadata, alias: metadata
  • Environment variable: <EXTRACTOR>__METADATA, e.g. TAP_GITLAB__METADATA
  • Default: {} (an empty object)

An extractor's metadata extra holds an object describing Singer stream and property metadata rules that are applied to the extractor's discovered catalog file when the extractor is run using meltano elt or meltano invoke. These rules are not applied when a catalog is provided manually.

Stream (entity) metadata <key>: <value> pairs (e.g. {"replication-method": "INCREMENTAL"}) are nested under top-level entity identifiers that correspond to Singer stream tap_stream_id values. These nested properties can also be thought of and interacted with as settings named _metadata.<entity>.<key>.

Property (attribute) metadata <key>: <value> pairs (e.g. {"is-replication-key": true}) are nested under top-level entity identifiers and second-level attribute identifiers that correspond to Singer stream property names. These nested properties can also be thought of and interacted with as settings named _metadata.<entity>.<attribute>.<key>.

Unix shell-style wildcards can be used in entity and attribute identifiers to match multiple entities and/or attributes at once.
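For example, a metadata rule can be applied to every discovered stream at once using a wildcard entity identifier (a sketch; the replication method value is illustrative):

```yaml
extractors:
- name: tap-postgres
  pip_url: tap-postgres
  metadata:
    # Applies to every discovered stream
    "*":
      replication-method: FULL_TABLE
```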

Entity and attribute names can be discovered using meltano select --list --all <plugin>.

# How to use

# In meltano.yml

```yaml
extractors:
- name: tap-postgres
  pip_url: tap-postgres
  metadata:
    some_stream_id:
      replication-method: INCREMENTAL
      replication-key: created_at
      created_at:
        is-replication-key: true
```

# On the command line

```bash
meltano config <extractor> set _metadata <entity> <key> <value>
meltano config <extractor> set _metadata <entity> <attribute> <key> <value>

export <EXTRACTOR>__METADATA='{"<entity>": {"<key>": "<value>", "<attribute>": {"<key>": "<value>"}}}'

# Once metadata has been set in `meltano.yml`, environment variables can be used
# to override specific nested properties:
export <EXTRACTOR>__METADATA_<ENTITY>_<ATTRIBUTE>_<KEY>=<value>

# For example:
meltano config tap-postgres set _metadata some_stream_id replication-method INCREMENTAL
meltano config tap-postgres set _metadata some_stream_id replication-key created_at
meltano config tap-postgres set _metadata some_stream_id created_at is-replication-key true

export TAP_POSTGRES__METADATA_SOME_TABLE_REPLICATION_METHOD=FULL_TABLE
```

# schema extra

  • Setting: _schema
  • Environment variable: <EXTRACTOR>__SCHEMA, e.g. TAP_GITLAB__SCHEMA
  • Default: {} (an empty object)

An extractor's schema extra holds an object describing Singer stream schema override rules that are applied to the extractor's discovered catalog file when the extractor is run using meltano elt or meltano invoke. These rules are not applied when a catalog is provided manually.

JSON Schema descriptions for specific properties (attributes) (e.g. {"type": ["string", "null"], "format": "date-time"}) are nested under top-level entity identifiers that correspond to Singer stream tap_stream_id values, and second-level attribute identifiers that correspond to Singer stream property names. These nested properties can also be thought of and interacted with as settings named _schema.<entity>.<attribute> and _schema.<entity>.<attribute>.<key>.

Unix shell-style wildcards can be used in entity and attribute identifiers to match multiple entities and/or attributes at once.

Entity and attribute names can be discovered using meltano select --list --all <plugin>.

If a schema is specified for a property that does not yet exist in the discovered stream's schema, the property (and its schema) will be added to the catalog. This allows you to define a full schema for taps such as tap-dynamo-db that do not themselves have the ability to discover the schema of their streams.
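As an illustrative sketch (the stream and property names below are hypothetical), a full property schema can be supplied up front:

```yaml
extractors:
- name: tap-dynamo-db
  pip_url: tap-dynamo-db
  schema:
    users: # hypothetical stream
      id:
        type: ["string"]
      created_at: # added to the catalog if discovery doesn't find it
        type: ["string", "null"]
        format: date-time
```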

# How to use

# In meltano.yml

```yaml
extractors:
- name: tap-postgres
  pip_url: tap-postgres
  schema:
    some_stream_id:
      created_at:
        type: ["string", "null"]
        format: date-time
```

# On the command line

```bash
meltano config <extractor> set _schema <entity> <attribute> <schema description>
meltano config <extractor> set _schema <entity> <attribute> <key> <value>

export <EXTRACTOR>__SCHEMA='{"<entity>": {"<attribute>": {"<key>": "<value>"}}}'

# Once schema descriptions have been set in `meltano.yml`, environment variables can be used
# to override specific nested properties:
export <EXTRACTOR>__SCHEMA_<ENTITY>_<ATTRIBUTE>_<KEY>=<value>

# For example:
meltano config tap-postgres set _schema some_stream_id created_at type '["string", "null"]'
meltano config tap-postgres set _schema some_stream_id created_at format date-time

export TAP_POSTGRES__SCHEMA_SOME_TABLE_CREATED_AT_FORMAT=date
```

# select extra

  • Setting: _select
  • Environment variable: <EXTRACTOR>__SELECT, e.g. TAP_GITLAB__SELECT
  • Default: ["*.*"]

An extractor's select extra holds an array of entity selection rules that are applied to the extractor's discovered catalog file when the extractor is run using meltano elt or meltano invoke. These rules are not applied when a catalog is provided manually.

A selection rule consists of an entity identifier that corresponds to a Singer stream's tap_stream_id value and an attribute identifier that corresponds to a Singer stream property name, separated by a period (.). Rules indicating that an entity or attribute should be excluded are prefixed with an exclamation mark (!). Unix shell-style wildcards can be used in entity and attribute identifiers to match multiple entities and/or attributes at once.
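Putting these rule types together in meltano.yml (the stream and attribute names below are illustrative):

```yaml
extractors:
- name: tap-gitlab
  pip_url: tap-gitlab
  select:
  - commits.*        # include all attributes of the `commits` entity...
  - '!commits.*_url' # ...except those whose names end in `_url`
```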

Entity and attribute names can be discovered using meltano select --list --all <plugin>.

While this extra can be managed using meltano config or environment variables like any other setting, selection rules are typically specified using meltano select.

# How to use

# In meltano.yml

```yaml
extractors:
- name: tap-gitlab
  pip_url: tap-gitlab
  select:
  - project_members.*
  - commits.*
```

# On the command line

```bash
meltano config <extractor> set _select '["<entity>.<attribute>", ...]'

export <EXTRACTOR>__SELECT='["<entity>.<attribute>", ...]'

meltano select <extractor> <entity> <attribute>

# For example:
meltano config tap-gitlab set _select '["project_members.*", "commits.*"]'

export TAP_GITLAB__SELECT='["project_members.*", "commits.*"]'

meltano select tap-gitlab project_members "*"
meltano select tap-gitlab commits "*"
```

# select_filter extra

  • Setting: _select_filter
  • Environment variable: <EXTRACTOR>__SELECT_FILTER, e.g. TAP_GITLAB__SELECT_FILTER
  • meltano elt CLI options: --select and --exclude
  • Default: []

An extractor's select_filter extra holds an array of entity selection filter rules that are applied to the extractor's discovered or provided catalog file when the extractor is run using meltano elt or meltano invoke, after schema, selection, and metadata rules are applied.

It can be used to only extract records for specific matching entities, or to extract records for all entities except for those specified, by letting you apply filters on top of configured entity selection rules.

Selection filter rules use entity identifiers that correspond to Singer stream tap_stream_id values. Rules indicating that an entity should be excluded are prefixed with an exclamation mark (!). Unix shell-style wildcards can be used in entity identifiers to match multiple entities at once.

Entity names can be discovered using meltano select --list --all <plugin>.

While this extra can be managed using meltano config or environment variables like any other setting, selection filters are typically specified using meltano elt's --select and --exclude options.

# How to use

# In meltano.yml

```yaml
extractors:
- name: tap-gitlab
  pip_url: tap-gitlab
  select:
  - project_members.*
  - commits.*
  select_filter:
  - commits
```

# On the command line

```bash
meltano config <extractor> set _select_filter '["<entity>", ...]'
meltano config <extractor> set _select_filter '["!<entity>", ...]'

export <EXTRACTOR>__SELECT_FILTER='["<entity>", ...]'
export <EXTRACTOR>__SELECT_FILTER='["!<entity>", ...]'

meltano elt <extractor> <loader> --select <entity>
meltano elt <extractor> <loader> --exclude <entity>

# For example:
meltano config tap-gitlab set _select_filter '["commits"]'
meltano config tap-gitlab set _select_filter '["!project_members"]'

export TAP_GITLAB__SELECT_FILTER='["commits"]'
export TAP_GITLAB__SELECT_FILTER='["!project_members"]'

meltano elt tap-gitlab target-jsonl --select commits
meltano elt tap-gitlab target-jsonl --exclude project_members
```

# state extra

  • Setting: _state
  • Environment variable: <EXTRACTOR>__STATE, e.g. TAP_GITLAB__STATE
  • meltano elt CLI option: --state
  • Default: None

An extractor's state extra holds a path to a state file (relative to the project directory) to be provided to the extractor when it is run as part of a pipeline using meltano elt.

If a state path is not set, the state will be looked up automatically based on the ELT run's Job ID.

While this extra can be managed using meltano config or environment variables like any other setting, a state file is typically provided using meltano elt's --state option.

# How to use

# In meltano.yml

```yaml
extractors:
- name: tap-gitlab
  pip_url: tap-gitlab
  state: extract/tap-gitlab.state.json
```

# On the command line

```bash
meltano config <extractor> set _state <path>

export <EXTRACTOR>__STATE=<path>

meltano elt <extractor> <loader> --state <path>

# For example:
meltano config tap-gitlab set _state extract/tap-gitlab.state.json

export TAP_GITLAB__STATE=extract/tap-gitlab.state.json

meltano elt tap-gitlab target-jsonl --state extract/tap-gitlab.state.json
```

# Loaders

Loaders are pip packages used by meltano elt as part of data integration. They are responsible for loading extracted data into arbitrary data destinations: databases, SaaS APIs, or file formats.

Meltano supports Singer targets: executables that implement the Singer specification.

To learn which loaders are discoverable and supported out of the box, refer to the Loaders page or run meltano discover loaders.

# dialect extra

  • Setting: _dialect
  • Environment variable: <LOADER>__DIALECT, e.g. TARGET_POSTGRES__DIALECT
  • Default: $MELTANO_LOADER_NAMESPACE, which will expand to the loader's namespace. Note that this default has been overridden on discoverable loaders, e.g. postgres for target-postgres and snowflake for target-snowflake.

A loader's dialect extra holds the name of the dialect of the target database, so that transformers in the same pipeline and Meltano UI's Analysis feature can determine the type of database to connect to.

The value of this extra can be referenced from a transformer's configuration using the MELTANO_LOAD__DIALECT pipeline environment variable. It is used as the default value for dbt's target setting, and should therefore correspond to a target name in transform/profile/profiles.yml.

# How to use

# In meltano.yml

```yaml
loaders:
- name: target-example-db
  pip_url: target-example-db
  dialect: example-db
```

# On the command line

```bash
meltano config <loader> set _dialect <dialect>

export <LOADER>__DIALECT=<dialect>

# For example:
meltano config target-example-db set _dialect example-db

export TARGET_EXAMPLE_DB__DIALECT=example-db
```

# target_schema extra

  • Setting: _target_schema
  • Environment variable: <LOADER>__TARGET_SCHEMA, e.g. TARGET_POSTGRES__TARGET_SCHEMA
  • Default: $MELTANO_LOAD_SCHEMA, which will expand to the value of the loader's schema setting

A loader's target_schema extra holds the name of the database schema the loader has been configured to load data into (assuming the destination supports schemas), so that transformers in the same pipeline and Meltano UI's Analysis feature can determine the database schema to load data from.

The value of this extra is usually not set explicitly, since it should correspond to the value of the loader's own "target schema" setting. If the name of this setting is not schema, its value can be referenced from the extra's value using $MELTANO_LOAD_<TARGET_SCHEMA_SETTING>, e.g. $MELTANO_LOAD_DESTINATION_SCHEMA for a setting named destination_schema.

The value of this extra can be referenced from a transformer's configuration using the MELTANO_LOAD__TARGET_SCHEMA pipeline environment variable. It is used as the default value for dbt's source_schema setting.

# How to use

# In meltano.yml

```yaml
loaders:
- name: target-example-db
  pip_url: target-example-db
  settings:
  - name: destination_schema
  target_schema: $MELTANO_LOAD_DESTINATION_SCHEMA # Value of `destination_schema` setting
```

# On the command line

```bash
meltano config <loader> set _target_schema <schema>

export <LOADER>__TARGET_SCHEMA=<schema>

# For example:
meltano config target-example-db set _target_schema '$MELTANO_LOAD_DESTINATION_SCHEMA'

# If the target schema cannot be determined dynamically using a setting reference:
meltano config target-example-db set _target_schema explicit_target_schema

export TARGET_EXAMPLE_DB__TARGET_SCHEMA=explicit_target_schema
```

# Transforms

Transforms are dbt packages containing dbt models, which are used by meltano elt as part of data transformation.

Together with the dbt transformer, they are responsible for transforming data that has been loaded into a database (data warehouse) into a different format, usually one more appropriate for analysis.

When a transform is added to your project using meltano add, the dbt package Git repository referenced by its pip_url will be added to your project's transform/packages.yml and the package will be enabled in transform/dbt_project.yml.

# package_name extra

  • Setting: _package_name
  • Environment variable: <TRANSFORM>__PACKAGE_NAME, e.g. TAP_GITLAB__PACKAGE_NAME
  • Default: $MELTANO_TRANSFORM_NAMESPACE, which will expand to the transform's namespace, e.g. tap_gitlab for tap-gitlab

A transform's package_name extra holds the name of the dbt package's internal dbt project: the value of name in dbt_project.yml.

When a transform is added to your project using meltano add, this name will be added to the models dictionary in transform/dbt_project.yml.

The value of this extra can be referenced from a transformer's configuration using the MELTANO_TRANSFORM__PACKAGE_NAME pipeline environment variable. It is included in the default value for dbt's models setting: $MELTANO_TRANSFORM__PACKAGE_NAME $MELTANO_EXTRACTOR_NAMESPACE my_meltano_model.

# How to use

# In meltano.yml

```yaml
transforms:
- name: dbt-facebook-ads
  namespace: tap_facebook
  pip_url: https://github.com/fishtown-analytics/facebook-ads
  package_name: facebook_ads
```

# On the command line

```bash
meltano config <transform> set _package_name <name>

export <TRANSFORM>__PACKAGE_NAME=<name>

# For example:
meltano config dbt-facebook-ads set _package_name facebook_ads

export DBT_FACEBOOK_ADS__PACKAGE_NAME=facebook_ads
```

# vars extra

  • Setting: _vars
  • Environment variable: <TRANSFORM>__VARS, e.g. TAP_GITLAB__VARS
  • Default: {} (an empty object)

A transform's vars extra holds an object representing dbt model variables that can be referenced from a model using the var function.

When a transform is added to your project using meltano add, this object will be used as the dbt model's vars object in transform/dbt_project.yml.

Because these variables are handled by dbt rather than Meltano, environment variables can be referenced using the env_var function instead of $VAR or ${VAR}.

# How to use

# In meltano.yml

```yaml
transforms:
- name: tap-gitlab
  pip_url: dbt-tap-gitlab
  vars:
    schema: '{{ env_var(''DBT_SOURCE_SCHEMA'') }}'
```

# On the command line

```bash
meltano config <transform> set _vars <key> <value>

export <TRANSFORM>__VARS='{"<key>": "<value>"}'

# For example:
meltano config --plugin-type=transform tap-gitlab set _vars schema "{{ env_var('DBT_SOURCE_SCHEMA') }}"

export TAP_GITLAB__VARS='{"schema": "{{ env_var(''DBT_SOURCE_SCHEMA'') }}"}'
```

# Models

Models are pip packages used by Meltano UI to aid in data analysis. They describe the schema of the data being analyzed and the ways different tables can be joined, and are used to automatically generate SQL queries using a point-and-click interface.

# Dashboards

Dashboards are pip packages bundling curated Meltano UI dashboards and reports.

When a dashboard is added to your project using meltano add, the bundled dashboards and reports will automatically be added to your project's analyze directory as well.

# Orchestrators

Orchestrators are pip packages responsible for orchestrating a project's scheduled pipelines.

Meltano supports Apache Airflow out of the box, but can be used with any tool capable of reading the output of meltano schedule list --format=json and executing each pipeline's meltano elt command on a schedule.
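For example, a pipeline is scheduled with meltano schedule, after which any tool can consume the JSON description (the schedule name and interval below are illustrative):

```bash
# Create a scheduled pipeline: <name> <extractor> <loader> <interval>
meltano schedule gitlab-to-jsonl tap-gitlab target-jsonl @daily

# Output all scheduled pipelines as JSON for an external orchestrator
meltano schedule list --format=json
```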

When the airflow orchestrator is added to your project using meltano add, its related file bundle will automatically be added as well.

# Transformers

Transformers are pip packages used by meltano elt as part of data transformation. They are responsible for running transforms.

Meltano supports dbt and its dbt models out of the box.

When the dbt transformer is added to your project using meltano add, its related file bundle will automatically be added as well.

# File bundles

File bundles are pip packages bundling files you may want in your project.

When a file bundle is added to your project using meltano add, the bundled files will automatically be added as well. The file bundle itself will not be added to your meltano.yml project file unless it contains files that are managed by the file bundle and should be updated automatically when meltano upgrade is run.

# update extra

  • Setting: _update
  • Environment variable: <BUNDLE>__UPDATE, e.g. DBT__UPDATE
  • Default: {} (an empty object)

A file bundle's update extra holds an object mapping file paths (of files inside the bundle, relative to the project root) to booleans.

When a file path's value is true, the file is considered to be managed by the file bundle and will be updated automatically when meltano upgrade is run.

# How to use

# In meltano.yml

```yaml
files:
- name: dbt
  pip_url: files-dbt
  update:
    transform/dbt_project.yml: false
```

# On the command line

```bash
meltano config <bundle> set _update <path> <true/false>

export <BUNDLE>__UPDATE='{"<path>": <true/false>}'

# For example:
meltano config --plugin-type=files dbt set _update transform/dbt_project.yml false

export DBT__UPDATE='{"transform/dbt_project.yml": false}'
```