At the core of the Meltano experience is your Meltano project, which represents the single source of truth regarding your ELT pipelines: how data should be integrated and transformed, how the pipelines should be orchestrated, and how the various plugins that make up your pipelines should be configured.
Since a Meltano project is just a directory on your filesystem containing text-based files, you can treat it like any other software development project and benefit from DataOps best practices such as version control, code review, and continuous integration and deployment (CI/CD).
You can initialize a new Meltano project using meltano init.
meltano.yml project file
At a minimum, a Meltano project must contain a project file named meltano.yml, which contains your project configuration and tells Meltano that a particular directory is a Meltano project.
The only required property is version, which currently always holds the value 1.
At the root of meltano.yml, and usually at the top of the file, you will find project-specific configuration.
In a newly initialized project, only the
will be set.
To learn which settings are available, refer to the Settings reference.
Every plugin in your project needs to have:
- a name that's unique among plugins of the same type,
- a base plugin description describing the package in terms Meltano can understand, and
- configuration that can be defined across various layers, including the definition's config property.
A base plugin description consists of the pip_url, executable, capabilities, and settings properties, but not every plugin definition will specify these explicitly:
- An inheriting plugin definition has an inherit_from property and inherits its base plugin description from another plugin in your project or a discoverable plugin identified by name.
- A custom plugin definition has a namespace property instead and explicitly defines its base plugin description.
- A shadowing plugin definition has neither property and implicitly inherits its base plugin description from the discoverable plugin with the same name.
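The three cases above can be told apart by which property is present. The following is a minimal sketch of that rule, not Meltano's own implementation: the function name classify_plugin_definition is hypothetical, and checking inherit_from before namespace is an assumption for definitions that would (unusually) carry both.

```python
def classify_plugin_definition(definition: dict) -> str:
    """Classify a plugin definition as 'inheriting', 'custom', or 'shadowing'.

    Hypothetical helper for illustration only. Checking `inherit_from`
    before `namespace` is an assumption, not documented Meltano behavior.
    """
    if "inherit_from" in definition:
        return "inheriting"
    if "namespace" in definition:
        return "custom"
    # Neither property: the definition shadows the discoverable plugin
    # that has the same name.
    return "shadowing"


# Definitions modeled on the examples used elsewhere on this page.
definitions = [
    {"name": "tap-postgres--billing", "inherit_from": "tap-postgres"},
    {"name": "tap-covid-19", "namespace": "tap_covid_19", "pip_url": "tap-covid-19"},
    {"name": "tap-gitlab"},
]
kinds = [classify_plugin_definition(d) for d in definitions]
print(kinds)
```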
When inheriting a base plugin description, the plugin definition does not need to explicitly specify a pip_url (the package's pip install argument), but you may want to override the inherited value and set the property explicitly to point at a (custom) fork or to pin a package to a specific version.
When a plugin is added using meltano add, the pip_url is automatically repeated in the plugin definition for convenience.
# Inheriting plugin definitions
A plugin defined with an inherit_from property inherits its base plugin description from another plugin identified by name. To find the matching plugin, other plugins in your project are considered first, followed by discoverable plugins:
plugins:
  extractors:
  - name: tap-postgres          # Shadows discoverable `tap-postgres` (see below)
  - name: tap-postgres--billing
    inherit_from: tap-postgres  # Inherits from project's `tap-postgres`
  - name: tap-bigquery--events
    inherit_from: tap-bigquery  # Inherits from discoverable `tap-bigquery`
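The lookup order described above (project plugins first, then discoverable plugins) can be sketched as a simple two-step search. This is an illustration under stated assumptions rather than Meltano's actual resolver: resolve_parent is a hypothetical name, plugins are modeled as plain dictionaries keyed by name, and the effect of a variant property on the lookup is deliberately ignored.

```python
def resolve_parent(name, project_plugins, discoverable_plugins):
    """Find the plugin that `inherit_from: <name>` refers to.

    Hypothetical sketch: both arguments are dicts mapping plugin names to
    definitions. Project plugins are consulted before discoverable ones.
    """
    for source, plugins in (
        ("project", project_plugins),
        ("discoverable", discoverable_plugins),
    ):
        if name in plugins:
            return source, plugins[name]
    raise LookupError(f"no plugin named {name!r}")


# Stand-in data mirroring the example above.
project = {"tap-postgres": {"name": "tap-postgres"}}
discoverable = {
    "tap-postgres": {"name": "tap-postgres"},
    "tap-bigquery": {"name": "tap-bigquery"},
}

billing_source, _ = resolve_parent("tap-postgres", project, discoverable)
events_source, _ = resolve_parent("tap-bigquery", project, discoverable)
print(billing_source, events_source)
```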
When inheriting from another plugin in your project, its configuration is also inherited as if the values were defaults, which can then be overridden as appropriate:
plugins:
  extractors:
  - name: tap-google-analytics
    variant: meltano
    config:
      key_file_location: client_secrets.json
      start_date: '2020-10-01T00:00:00Z'
  - name: tap-ga--view-foo
    inherit_from: tap-google-analytics
    config:
      # `key_file_location` and `start_date` are inherited
      view_id: 123456
  - name: tap-ga--view-bar
    inherit_from: tap-google-analytics
    config:
      # `key_file_location` is inherited
      start_date: '2020-12-01T00:00:00Z' # `start_date` is overridden
      view_id: 789012
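Treating the parent's configuration "as if the values were defaults" amounts to a dictionary merge in which the child's own values win. The sketch below reproduces the example above in plain Python; effective_config is a hypothetical helper, it assumes there are no inheritance cycles, and it only models the config property, not the other configuration layers Meltano supports.

```python
def effective_config(plugin, plugins_by_name):
    """Merge inherited config (as defaults) with the plugin's own config.

    Hypothetical sketch: assumes no inheritance cycles and only looks at
    parents defined in the same project.
    """
    inherited = {}
    parent_name = plugin.get("inherit_from")
    if parent_name in plugins_by_name:
        inherited = effective_config(plugins_by_name[parent_name], plugins_by_name)
    # Later keys win, so the plugin's own values override inherited ones.
    return {**inherited, **plugin.get("config", {})}


plugins_by_name = {
    "tap-google-analytics": {
        "config": {
            "key_file_location": "client_secrets.json",
            "start_date": "2020-10-01T00:00:00Z",
        },
    },
    "tap-ga--view-foo": {
        "inherit_from": "tap-google-analytics",
        "config": {"view_id": 123456},
    },
    "tap-ga--view-bar": {
        "inherit_from": "tap-google-analytics",
        "config": {"start_date": "2020-12-01T00:00:00Z", "view_id": 789012},
    },
}

foo = effective_config(plugins_by_name["tap-ga--view-foo"], plugins_by_name)
bar = effective_config(plugins_by_name["tap-ga--view-bar"], plugins_by_name)
print(foo)
print(bar)
```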
plugins:
  loaders:
  - name: target-snowflake          # Shadows discoverable `target-snowflake` (see below)
    variant: datamill-co            # using variant `datamill-co`
  - name: target-snowflake--derived
    inherit_from: target-snowflake  # Inherits from project's `target-snowflake`
  - name: target-snowflake--transferwise
    inherit_from: target-snowflake  # Inherits from discoverable `target-snowflake`
    variant: transferwise           # using variant `transferwise`
To learn how to add an inheriting plugin to your project, refer to the Plugin Management guide.
# Custom plugin definitions
plugins:
  extractors:
  - name: tap-covid-19
    namespace: tap_covid_19
    pip_url: tap-covid-19
    executable: tap-covid-19
    capabilities:
    - catalog
    - discover
    - state
    settings:
    - name: api_token
    - name: user_agent
    - name: start_date
To learn how to add a custom plugin to your project, refer to the Plugin Management guide.
# Shadowing plugin definitions
plugins:
  extractors:
  - name: tap-gitlab
To learn how to add a discoverable plugin to your project, refer to the Plugin Management guide.
If multiple variants of a discoverable plugin are available, the variant property can be used to choose a specific one:
plugins:
  extractors:
  - name: tap-gitlab
    variant: meltano
If no variant is specified, the original variant supported by Meltano is used.
Note that this is not necessarily the default variant that is recommended to new users and would be used if the plugin were newly added to the project.
# Plugin configuration
extractors:
- name: tap-example
  config:
    # Configuration goes here!
    example_setting: value
  # Extras go here!
  example_extra: value
# Plugin commands
Plugin commands are defined by the commands property. The keys are the names of the commands and the values are the arguments to be passed to the plugin executable. These can contain dynamic references to configuration using the environment variable form of the configuration option.
transformers:
- name: dbt
  executable: dbt
  commands:
    seed: seed --project-dir $DBT_PROJECT_DIR --profile $DBT_PROFILE --target $DBT_TARGET --select $DBT_MODEL
    snapshot: snapshot --project-dir $DBT_PROJECT_DIR --profile $DBT_PROFILE --target $DBT_TARGET --select $DBT_MODEL
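To illustrate how a command string with environment variable references turns into an argument list, here is a small sketch using Python's standard string.Template and shlex modules. It is not how Meltano itself performs the expansion; expand_command is a hypothetical helper, and the environment values are made up for the example.

```python
import shlex
from string import Template


def expand_command(args_template, env):
    """Substitute $VAR references from `env`, then split into arguments.

    Hypothetical sketch: raises KeyError if a referenced variable is
    missing. Values containing spaces would be split into several
    arguments, which a real implementation may handle differently.
    """
    return shlex.split(Template(args_template).substitute(env))


# Made-up values standing in for the plugin's configuration.
env = {
    "DBT_PROJECT_DIR": "transform",
    "DBT_PROFILE": "meltano",
    "DBT_TARGET": "postgres",
    "DBT_MODEL": "my_model",
}
seed_args = expand_command(
    "seed --project-dir $DBT_PROJECT_DIR --profile $DBT_PROFILE "
    "--target $DBT_TARGET --select $DBT_MODEL",
    env,
)
print(seed_args)
```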
A schedule definition must have a name, extractor, loader, transform, and interval:
schedules:
- name: foo-to-bar
  extractor: tap-foo
  loader: target-bar
  transform: skip
  interval: '@hourly'
schedules:
- name: foo-to-bar
  extractor: tap-foo
  loader: target-bar
  transform: skip
  interval: '@hourly'
  env:
    TAP_FOO_BAR: bar
    TAP_FOO_BAZ: baz
To learn more about pipeline schedules and orchestration, refer to the Orchestration guide.
A newly initialized project comes with a .gitignore file to ensure that environment-specific and potentially sensitive configuration stored inside the .meltano directory and .env file is not leaked accidentally.
All other files should be checked into the repository and shared between all users and environments that may use the project.
Typically, this file is used to store configuration that is environment-specific or sensitive, and should not be stored in meltano.yml and checked into version control.
meltano config <plugin> set will automatically store configuration in .env as appropriate.
In a newly initialized project, this file will be included in .gitignore by default.
Meltano stores various files for internal use inside a .meltano directory inside your project.
These files are specific to the environment Meltano is running in, and should not be checked into version control.
In a newly initialized project, this directory will be included in .gitignore by default.
While you would usually not want to modify files in this directory directly, knowing what's in there can aid in debugging:
- .meltano/meltano.db: The default SQLite system database.
- meltano elt output logs for the specified pipeline run.
- .meltano/run/bin: Symlink to the meltano executable most recently used in this project.
- .meltano/run/elt/gitlab-to-postgres/<UUID>/: Directory used by meltano elt to store pipeline-specific generated plugin config files, like an extractor's
- .meltano/run/<plugin name>/, e.g. .meltano/run/tap-gitlab/: Directory used by meltano invoke to store generated plugin config files.
- .meltano/<plugin type>/<plugin name>/venv/, e.g. .meltano/extractors/tap-gitlab/venv/: Python virtual environment directory that a plugin's pip package was installed into by
# System database
Meltano stores various types of metadata in a project-specific system database, which takes the shape of a meltano.db SQLite database stored inside the .meltano directory by default.
Like all files stored in the .meltano directory, the system database is also environment-specific.
You can choose to use a different system database backend or configuration using the
While you would usually not want to modify the system database directly, knowing what's in there can aid in debugging:
- job table: One row for each meltano elt pipeline run, holding started/ended timestamps and incremental replication state.
- plugin_settings table: Plugin configuration set using meltano config <plugin> set or the UI when the project is deployed as read-only.
- user table: Users for Meltano UI created using meltano user add.
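When debugging, it can help to check which tables a system database actually contains. The sketch below uses only Python's standard sqlite3 module and runs against a throwaway in-memory database so that it is self-contained; the single id column is a placeholder, not the real schema, and list_tables is a hypothetical helper. Against a real project you would presumably connect to .meltano/meltano.db instead, ideally on a copy so that nothing is modified.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in the connected SQLite database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# Stand-in database: the table names come from the list above, but the
# single `id` column is a placeholder and not Meltano's actual schema.
conn = sqlite3.connect(":memory:")
for table in ("job", "plugin_settings", "user"):
    conn.execute(f'CREATE TABLE "{table}" (id INTEGER PRIMARY KEY)')

tables = list_tables(conn)
print(tables)
conn.close()
```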