How do we make custom connectors easier to move from local code to production use, without hiding the details that data teams need to trust?
For analytical engineers, this is not abstract product plumbing. It is the part of the work that decides whether a connector becomes a dependable pipeline or stays as something only one person knows how to run.
Custom connectors often start simply. A source is missing from the catalog. An internal API has useful operational data. A vendor endpoint exposes exactly the records finance, growth, or support wants in the warehouse. You build a Singer tap with the Meltano SDK, define the settings in Python, test it locally, and get records flowing.
Then the second job begins.
You need to add that connector to a real Meltano project. You need configuration that another person can read. You need install behavior that is repeatable. You need the connector definition to survive review, deployment, and update cycles. If it is useful beyond your team, you may also want to make it discoverable through Meltano Hub.
That path works today, but it asks developers to repeat themselves more than they should have to.
The Problem: Plugin Metadata Can Drift
A Meltano plugin definition is the information Meltano needs to run a package as part of a project. For an extractor, that usually means the package location, executable, namespace, capabilities, settings, and configuration shape.
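To make that list concrete, here is a minimal sketch of a plugin definition as a plain data structure. The class names, field names, and example values are illustrative assumptions, not a copy of Meltano's actual schema.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class Setting:
    """One configurable setting (hypothetical, simplified shape)."""
    name: str
    kind: str = "string"
    sensitive: bool = False


@dataclass
class PluginDefinition:
    """Illustrative subset of what a definition carries for an extractor."""
    name: str
    namespace: str
    pip_url: str      # package location
    executable: str
    capabilities: list[str] = field(default_factory=list)
    settings: list[Setting] = field(default_factory=list)


# All values below are made up for the example.
definition = PluginDefinition(
    name="tap-example",
    namespace="tap_example",
    pip_url="git+https://example.com/acme/tap-example.git",
    executable="tap-example",
    capabilities=["catalog", "discover", "state"],
    settings=[
        Setting("api_key", sensitive=True),
        Setting("start_date", kind="date_iso8601"),
    ],
)

# A plain dict is easy to serialize, review, and compare.
as_dict = asdict(definition)
```

Every field here is something a reviewer may want to see in a pull request, which is why the question of where it lives matters.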
When you build a custom Singer tap with the SDK, some of that information already lives in the connector code. Settings live in the tap class. Capabilities can be exposed by the connector. The SDK can describe parts of the connector through its --about output.
But in practice, the same information may show up in several places:
- The connector code, such as tap.py
- The local meltano.yml used while testing the tap
- The destination project where the tap is actually used
- A Hub-style plugin definition if the connector becomes discoverable
- A lock file under the project's plugins directory, created when a discoverable plugin is added
None of these artifacts are pointless. They each exist for a reason. The problem is that they do not always share one source of truth.
If a tap setting changes from start_date to replication_start_date, the Python code can be updated while the project definition remains stale. If a capability changes, the Hub definition may not reflect it. If a lock file and project file diverge, updates become harder to reason about.
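The rename case can be caught with a very small check. This is a sketch, assuming you can already obtain the list of setting names from the connector code and from the project definition. The function name and the input lists are hypothetical.

```python
def find_setting_drift(code_settings, project_settings):
    """Compare setting names declared in code with those in a project file."""
    code_names = set(code_settings)
    project_names = set(project_settings)
    return {
        # Declared by the connector, unknown to the project definition.
        "missing_from_project": sorted(code_names - project_names),
        # Still in the project definition, no longer declared by the connector.
        "stale_in_project": sorted(project_names - code_names),
    }


# The tap renamed start_date, but the project file was not updated.
code_settings = ["api_key", "replication_start_date"]
project_settings = ["api_key", "start_date"]

drift = find_setting_drift(code_settings, project_settings)
```

Here `drift` reports `replication_start_date` as missing from the project and `start_date` as stale. A check like this could run in CI, but it only helps if both sides are readable by a machine.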
That is the sort of friction Meltano should remove. Not by making the system opaque, but by making the right metadata easier to move around.
Why This Matters For Analytical Engineers
Analytical engineers tend to live between application code and business reporting. That is a demanding place to be.
A notebook can prove that an API has useful data. A script can get records into a table. But production analytics needs more than a working first run. It needs predictable installs, explicit settings, state handling, environment-aware configuration, reviewable changes, and clear ownership when something breaks.
This is why Meltano’s project model matters. It gives teams a code-first, composable way to manage data movement without handing the whole workflow to a black box.
The goal is not to turn every connector into a large software project. The goal is also not to make data teams click through hidden configuration. The goal is simpler:
Make custom connector promotion predictable.
A connector should be easy to test locally, easy to add to a project, easy to update, and easy to share with the community when it is ready.
What meltano add Handles Today
Today, meltano add works well when the plugin is discoverable. If a plugin exists in the catalog, Meltano can resolve its definition, add it to the project, and create the lock artifact that keeps the plugin behavior stable over time.
That is good for common connectors. It is less smooth for a connector that still lives in a private repo, a local directory, or a project branch.
Meltano already supports useful custom plugin workflows. A team can add a custom plugin definition to meltano.yml. Meltano can also add a plugin from a reference to a YAML definition, using the --from-ref option of meltano add. That gives developers a path before a plugin is published to Hub.
The question from office hours was whether this can become more natural.
What if Meltano could add a custom plugin from a repository, local directory, or structured reference, then resolve the right metadata with less manual translation?
That would make a custom connector feel closer to a dependency. You point Meltano at the source, Meltano understands what it is, and the project records enough detail to run it predictably.
Option One: Keep Plugin YAML Explicit
The first model is the explicit model.
In this approach, the connector repository includes a plugin definition file, for example plugin.yml. Meltano reads that file when adding the connector to a project.
There is a lot to like about this. A YAML definition is visible. It can be reviewed in a pull request. It can include details that are not obvious from code. It also works for plugins that cannot describe themselves through the Singer SDK, such as utilities or small command-based tools.
This fits the way many data teams prefer to work. Configuration is in version control. The tradeoffs are visible. No one has to guess what was inferred.
The cost is maintenance. The developer still needs to keep plugin.yml aligned with the connector code. If the tap settings change and the YAML does not, drift remains.
For some teams, that tradeoff is acceptable. Explicit files are dependable because they are boring. For others, it is another place to forget an update.
Option Two: Let Meltano Pull Metadata From The Connector
The second model is introspection.
In this approach, Meltano asks the connector to describe itself. For Singer SDK connectors, that is a reasonable direction because the SDK already knows a lot about the tap or target.
A connector can expose settings, capabilities, and supported runtime information. In theory, Meltano could use that metadata to produce the plugin definition it needs, instead of asking the developer to hand-write it again.
This is attractive because it reduces repetition. The connector code becomes closer to the source of truth. If a setting changes in the SDK config schema, the generated Meltano metadata can change with it.
There are two practical challenges.
First, the SDK --about output and the Meltano plugin definition format are not the same today. They overlap, but they are not interchangeable.
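To show what bridging the two formats might involve, here is a rough sketch that turns a JSON Schema style settings block into a list of Meltano-style settings. The input shape, the `secret` marker, and the output keys (`kind`, `sensitive`, `value`) are assumptions made for illustration. Check them against the real formats before relying on them.

```python
# Assumed mapping from JSON Schema types to setting kinds (illustrative only).
KIND_BY_JSON_TYPE = {
    "string": "string",
    "integer": "integer",
    "boolean": "boolean",
    "array": "array",
    "object": "object",
}


def settings_from_about(about):
    """Translate an about-style settings schema into a list of setting dicts."""
    schema = about.get("settings", {})
    settings = []
    for name, prop in schema.get("properties", {}).items():
        json_type = prop.get("type", "string")
        if isinstance(json_type, list):
            # Nullable types such as ["string", "null"]: keep the real type.
            non_null = [t for t in json_type if t != "null"]
            json_type = non_null[0] if non_null else "string"
        kind = KIND_BY_JSON_TYPE.get(json_type, "string")
        if prop.get("format") == "date-time":
            kind = "date_iso8601"
        setting = {"name": name, "kind": kind}
        if prop.get("secret"):
            setting["sensitive"] = True
        if "description" in prop:
            setting["description"] = prop["description"]
        if "default" in prop:
            setting["value"] = prop["default"]
        settings.append(setting)
    return settings


# Hand-written sample input, not real connector output.
about = {
    "name": "tap-example",
    "settings": {
        "type": "object",
        "properties": {
            "api_key": {"type": ["string"], "secret": True},
            "start_date": {"type": ["string", "null"], "format": "date-time"},
            "page_size": {"type": ["integer", "null"], "default": 100},
        },
    },
}

generated = settings_from_about(about)
```

Even this small translation has to make decisions, such as how to treat nullable types and secrets. That is the gap between overlapping and interchangeable.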
Second, introspection may require installing enough of the connector to run it. That means Meltano has to learn about a plugin before it has fully added the plugin. That can work, but the failure modes need to stay clear. If dependency installation fails, the user should know whether the problem is packaging, metadata, credentials, or something else.
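One way to keep those failure modes distinct is to label each step and carry the label on the error. This is a sketch of the idea only. The stage names and the `install` and `read_metadata` callables are hypothetical stand-ins, not Meltano internals.

```python
class IntrospectionError(Exception):
    """An error that records which stage of introspection failed."""

    def __init__(self, stage, detail):
        super().__init__(f"{stage}: {detail}")
        self.stage = stage
        self.detail = detail


def run_stage(stage, action):
    """Run one step and re-raise any failure tagged with its stage name."""
    try:
        return action()
    except Exception as exc:
        raise IntrospectionError(stage, str(exc)) from exc


def introspect(install, read_metadata):
    """Install the connector, then ask it to describe itself."""
    run_stage("packaging", install)
    return run_stage("metadata", read_metadata)


def broken_install():
    # Simulated installer failure for the example.
    raise RuntimeError("no matching distribution found")


try:
    introspect(broken_install, lambda: {})
except IntrospectionError as err:
    failure = (err.stage, err.detail)
```

The user sees that packaging failed, not a generic error from somewhere inside the add command. Further stages, such as credentials, could be tagged the same way.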
This is where Meltano has to stay true to its own voice. No hidden magic. No mystery behavior. If the system infers metadata, it should still make the result visible and understandable.
The Likely Answer Is Both
The strongest path may be to support both models.
Meltano could first look for an explicit plugin definition in the repository. If it exists, use it.
If no definition exists and the package looks like an SDK-based connector, Meltano could attempt to read metadata from the connector and generate the definition.
That gives maintainers control when they want it, while making the default path easier for common SDK-based taps and targets.
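The lookup order described above could be sketched like this. The file name plugin.yml comes from the example earlier in this post. The `introspect` callable and the returned dict shape are assumptions for illustration.

```python
from pathlib import Path


def resolve_definition(repo_dir, introspect):
    """Prefer an explicit definition file, then fall back to introspection.

    `introspect` is a hypothetical callable that returns connector metadata,
    or None when the package does not look like an SDK-based connector.
    """
    explicit = Path(repo_dir) / "plugin.yml"
    if explicit.is_file():
        # The maintainer wrote a definition, so it wins.
        return {"source": "explicit", "path": str(explicit)}

    metadata = introspect(repo_dir)
    if metadata is not None:
        return {"source": "introspected", "metadata": metadata}

    raise LookupError(
        f"No plugin.yml in {repo_dir} and the package could not describe itself"
    )
```

Recording the source alongside the result keeps the inference visible: a reviewer can tell whether a definition was written by a person or generated from the connector.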
It also respects the shape of the Meltano ecosystem. Not every plugin is a modern SDK connector. Some plugins predate the SDK. Some are utilities. Some are internal packages. Some will never be published to Hub, but still deserve clean project ergonomics.
Meltano does not need one clever rule for every case. It needs a dependable path for each case.
Where URI-Based References Might Fit
A second part of the discussion was how much meaning a plugin reference should carry.
If a user points Meltano at a Git repository, Meltano needs to know what kind of thing it is adding. Is it an extractor, a loader, a mapper, or a utility? Should that come from the plugin name? From an explicit plugin type flag? From the structure of a URI? From a file inside the repository?
There is a tradeoff here.
A structured reference can help automation. It can make the plugin type, location, subdirectory, and variant more explicit. That matters as more teams use agents and automated workflows to inspect repos, generate project files, or propose pipeline changes.
But the interface still has to be humane. A user should not need to learn a private URI language just to add a connector from a Git repo.
The right design probably meets both needs. It accepts simple references when the answer is obvious. It allows explicit metadata when precision matters.
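As a sketch of that balance, the parser below accepts a plain URL or path, infers the plugin type from the name when it can, and lets the caller state the type explicitly when it cannot. The reference format is an assumption, not an existing Meltano syntax. The `#subdirectory=` fragment is borrowed from pip's convention for VCS URLs.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import parse_qs, urlsplit

# Singer naming convention used for inference (a simplifying assumption).
TYPE_BY_PREFIX = {"tap-": "extractor", "target-": "loader"}


@dataclass
class PluginRef:
    location: str
    name: str
    plugin_type: str
    subdirectory: Optional[str] = None


def parse_reference(reference, plugin_type=None):
    """Parse a simple reference; an explicit plugin_type overrides inference."""
    parts = urlsplit(reference)
    fragment = parse_qs(parts.fragment)
    subdirectory = fragment.get("subdirectory", [None])[0]
    location = reference.split("#", 1)[0]

    # Use the last path segment as the plugin name.
    name = parts.path.rstrip("/").rsplit("/", 1)[-1]
    if name.endswith(".git"):
        name = name[: -len(".git")]

    if plugin_type is None:
        for prefix, inferred in TYPE_BY_PREFIX.items():
            if name.startswith(prefix):
                plugin_type = inferred
                break
    if plugin_type is None:
        raise ValueError(f"Cannot infer plugin type for {name!r}; pass it explicitly")

    return PluginRef(location, name, plugin_type, subdirectory)


simple = parse_reference(
    "https://github.com/acme/tap-example.git#subdirectory=taps/example"
)
explicit = parse_reference("./tools/report-builder", plugin_type="utility")
```

The first call needs no extra input because the name says enough. The second requires an explicit type, and fails loudly without one instead of guessing.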
That is how data tooling should feel. Clear when the default is enough. Exact when the work demands it.
What This Means For Meltano Hub
This discussion also raised a useful question: what should Meltano Hub become as more connectors can describe themselves?
Hub originally solved an ecosystem problem. Many Singer connectors existed before Meltano and before the SDK. Their repositories did not carry Meltano-specific metadata. Hub gathered plugin definitions, variants, settings, capabilities, and install details so users could find and add connectors without reconstructing that information each time.
That role still matters. Discovery matters. Defaults matter. Trust signals matter.
But the next version of Hub can be more than a catalog. It can help data teams answer the questions they already ask before trusting a connector:
- Which variant should I use?
- Who maintains it?
- When was it last updated?
- Which capabilities are supported?
- How many teams appear to use it?
- Does the Hub definition match what the connector actually exposes?
That last question is where metadata flow matters. If SDK-based connectors can generate or validate Meltano-compatible definitions, Hub becomes less of a manual registry and more of a reliable index over living packages.
That would make the ecosystem easier to trust. Not because every connector is perfect, but because the facts about each connector are clearer.
Recent Connector Work
Office hours also covered a few connector updates from the team.
The Meltano Cloud tap now includes streams for pipeline jobs as well as workspace jobs. That gives teams a cleaner way to bring pipeline metrics into the warehouse and build reporting around operational behavior.
The Weather API tap received larger changes, including support for bulk mode where the API plan allows it. Bulk mode reduces request volume, and the tap handles chunking when the API limits the number of locations per request.
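Chunking of this kind is simple to sketch. The function name, the sample locations, and the limit of two per request are hypothetical. This is not the tap's actual code.

```python
def chunk_locations(locations, max_per_request):
    """Split a list of locations into batches no larger than the limit."""
    if max_per_request < 1:
        raise ValueError("max_per_request must be at least 1")
    return [
        locations[start:start + max_per_request]
        for start in range(0, len(locations), max_per_request)
    ]


# Five locations with a limit of two per request yield three requests.
batches = chunk_locations(["London", "Paris", "Tokyo", "Lima", "Cairo"], 2)
```

Each batch becomes one bulk request, so five locations cost three calls instead of five.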
We also touched on ongoing work around spreadsheet-style ingestion, especially large file sets and state based on file modification time. Spreadsheet ingestion can look simple from the outside, but real production use often depends on careful state behavior and predictable file handling.
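State based on modification time can be sketched as a bookmark that only moves forward. The helper names and the bookmark format (a plain timestamp) are assumptions for illustration, not the connector's implementation.

```python
from pathlib import Path


def files_to_sync(directory, bookmark_mtime):
    """Return (mtime, file name) pairs modified after the bookmark, oldest first."""
    candidates = []
    for path in Path(directory).iterdir():
        if not path.is_file():
            continue
        mtime = path.stat().st_mtime
        if bookmark_mtime is None or mtime > bookmark_mtime:
            candidates.append((mtime, path.name))
    # Oldest first, so an interrupted run can resume from the last file done.
    candidates.sort()
    return candidates


def next_bookmark(bookmark_mtime, synced):
    """Advance the bookmark to the newest synced file; never move it backward."""
    newest = max((mtime for mtime, _ in synced), default=None)
    if newest is None:
        return bookmark_mtime
    if bookmark_mtime is None:
        return newest
    return max(bookmark_mtime, newest)
```

The strict comparison is a design choice with a cost: a file written with exactly the bookmark's timestamp would be skipped. A production connector would need a deliberate answer for that case, which is part of why this work is less simple than it looks.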
These are small examples of a larger point. Connectors are operational software. A better setting, a clearer state model, or fewer requests can matter a lot when a pipeline runs every day.
The Direction We Want
Meltano’s direction is not to hide data movement behind a glossy abstraction. It is to make data movement dependable without taking away control.
For custom connectors, that means a few things:
- Plugin metadata should have less room to drift.
- SDK-based connectors should be easier to describe.
- Explicit plugin definitions should remain supported.
- Hub should help users choose and trust connectors.
- Adding a private or local connector should feel like a normal project workflow.
That is not a flashy promise. It is a practical one.
Data movement should be open, predictable, and clear enough to debug. It should fit into the way teams already manage code. It should make getting data into the warehouse feel finished, not fragile.
If you maintain custom taps or targets, we would like to hear how this works in your projects.
Where does metadata drift today?
Would you rather maintain a plugin definition file, generate one from connector code, or let Meltano inspect the connector?
What would make a private connector feel natural to add, review, and update?
Bring those workflows to the next Meltano Office Hours, or join the conversation in Slack. The best version of this will come from the teams who are already doing the work.
Dig Deeper
Watch the office hours recording on YouTube: https://youtu.be/lJGN3OW-mKU
Join the Meltano community: https://meltano.com/slack
Read the docs: https://docs.meltano.com
Quick FAQ
What is a Meltano plugin definition?
A Meltano plugin definition tells Meltano how to use a package inside a project. It can include the package source, executable, namespace, capabilities, settings, and configuration details.
Why does plugin metadata duplication matter?
Duplication matters because connector code, project config, Hub definitions, and lock artifacts can drift. When they drift, updates become harder to review and production behavior becomes harder to predict.
What is the proposed improvement?
The proposed direction is to make custom connectors easier to add from a repository, local directory, or plugin definition. Meltano could support both explicit plugin files and metadata generated from SDK-based connectors.
Who benefits from this?
Analytical engineers, data engineers, and platform teams benefit because custom connectors become easier to promote from local development into dependable project workflows.
