Today, we are excited to launch Meltano 2.0, which represents a major step toward our vision to become the foundation of every team’s ideal data stack. Going beyond ELT, 2.0 makes it easier than ever for teams of any size to build an end-to-end data platform out of open source modern data stack components like Singer, dbt, Airflow, Great Expectations, and Superset, and collaborate on it like the software project it is. This makes it the MVP (Minimum Viable Product) of Meltano as your DataOps platform infrastructure.
In many ways, 2.0 is a return to our roots. Meltano first started in 2018 with a goal of building an end-to-end data platform that would be open source and bring software development best practices to the entire data lifecycle. Over time, we narrowed our focus to bringing together Singer for data replication and dbt for transformation to form an open source ELT solution that lets users manage their pipelines and configuration as code. In Singer, we found a vibrant ecosystem of data teams and consultancies collaborating on open source connectors that, with our support (and Meltano SDK), has grown to cover more than 300 sources and destinations listed on MeltanoHub—more than any other data integration solution out there, open source or not.
Now, with the launch of 2.0, we’ve officially returned to our original end-to-end vision by adding support for open source BI with Superset, and we are proud to reintroduce ourselves as Meltano: your DataOps platform infrastructure.
Less than a year after spinning out of GitLab with $4.2 million of Seed funding led by GV, we’re also grateful to be supported in this next stage of our journey with additional funding from Venrock and others, which brings the total funds raised to $12.4 million. Ethan Batraski, partner at Venrock and our newest board member, wrote about his vision for the data space and Meltano’s growing role in it.
What’s New in Meltano 2.0?
Meltano 2.0 makes it easier than ever to get up and running with the open source modern data stack. Meltano serves as the infrastructure that holds the data stack together and enables you to bring in any of the hundreds of plugins available on MeltanoHub—without ever needing to leave your local machine or go through a signup or payment flow.
Specifically, 2.0 completes Meltano’s initial end-to-end aspirations by adding support for open source data analysis tools, letting you easily build a modern data stack out of all or some of the following components:
- Singer taps and targets for data replication (EL)
- dbt for data transformation
- Airflow for pipeline scheduling and workflow orchestration
- Great Expectations for data quality
- Superset for analysis
The only missing piece still managed outside of Meltano is a cloud data warehouse like Snowflake.
Each of these plugins is installable, configurable, and runnable via Meltano. With your meltano.yml as your project definition, you can test all these plugins in a development environment and then deploy them to production with confidence, knowing that your configuration is version-controlled and your changes are tested automatically. You can also use composable pipelines via `meltano run` to define workflows across all your plugins.
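As a rough illustration of what such a project definition might look like, here is a minimal `meltano.yml` sketch. The plugin names (`tap-gitlab`, `target-postgres`, `dbt-postgres`) and the environment names are assumptions chosen for the example, not a recommended stack. Real entries usually carry extra fields such as `variant` and `pip_url`, which Meltano fills in when you add a plugin.

```yaml
# Illustrative sketch only: plugin and environment names are assumptions.
version: 1
default_environment: dev

environments:
  - name: dev    # local development and testing
  - name: prod   # production deployment

plugins:
  extractors:          # Singer taps (the "E" in EL)
    - name: tap-gitlab
  loaders:             # Singer targets (the "L" in EL)
    - name: target-postgres
  transformers:        # dbt, via an adapter-specific plugin
    - name: dbt-postgres
  orchestrators:       # pipeline scheduling and orchestration
    - name: airflow
  utilities:           # data quality and analysis tools
    - name: great_expectations
    - name: superset
```

With a project along these lines, a composed pipeline could be invoked as `meltano run tap-gitlab target-postgres dbt-postgres:run`, which would replicate the data first and then run the dbt transformations in the same invocation.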
Meltano 2.0 also elevates MeltanoHub to be the complete resource for all plugins available for use with Meltano, ranging from Singer taps and targets to alternative analysis tools, and data utilities such as SQLFluff. Previously, a subset of plugins could be directly added to a Meltano project, while others needed some metadata to be provided manually. Now, any plugin listed on MeltanoHub is immediately discoverable within Meltano and can be installed appropriately. We’ve also added lock files, which give you confidence in the stability and reproducibility of your data platform while allowing the community to iterate on plugins and their metadata definitions on the Hub. Stay tuned for more improvements in the coming months that will make it even easier for users to contribute new plugins and discover existing ones.
As with every new release, 2.0 provides a myriad of behavior improvements and bug fixes. In particular, we’ve updated how jobs and schedules are defined to take advantage of `meltano run`. We’ve also updated how environments work by clarifying the behavior of inheritance and how environment variables are sourced and injected into their run context.
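To make the job and schedule change more concrete, the fragment below sketches how a named job and a schedule that references it might appear in `meltano.yml`. The job name, schedule name, interval, and plugin names are all hypothetical; treat this as an approximation of the layout and check the Meltano documentation for the exact schema.

```yaml
# Illustrative sketch only: job, schedule, and plugin names are hypothetical.
jobs:
  - name: gitlab-to-postgres
    tasks:
      # Each task is a sequence of plugins, as you would pass to `meltano run`.
      - tap-gitlab target-postgres dbt-postgres:run

schedules:
  - name: daily-gitlab-load
    interval: '@daily'        # how often the orchestrator should trigger the job
    job: gitlab-to-postgres   # refers to the job defined above
```

Defining the pipeline once as a job and pointing the schedule at it keeps the plugin sequence in a single place, so the same steps run whether the job is triggered manually or by the orchestrator.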
Additional Highlights of Meltano 2.0
- Improved flexibility in how jobs and schedules are defined to take full advantage of composable pipelines via `meltano run`
- Greater stability in Meltano projects with the addition of the Meltano lockfile
- Explicit plugin dependency references to enable installation of multiple plugins simultaneously
- Enhanced support for Singer plugins with the addition of Stream Map plugins to enable on-the-fly data transformations
- Enhanced support for dbt through adapter-specific plugins
- Enhanced telemetry to improve transparency on MeltanoHub as well as product support
- Routine maintenance, bug squashing, deprecation of legacy features, and elimination of technical debt
Meltano 2.0 is just the first step toward our larger vision of taming the complexity and fragility of the modern data stack. Our product roadmap is public, and we regularly release new features and capabilities that we’ve collaborated on with our community of 2,500-plus data professionals.
In the next two to three months, we’ll focus primarily on a few strategic goals. The first is to continue enhancing Meltano’s capabilities as infrastructure for your data stack and to increase the value users get from managing more and more of their stack in one place.
Achieving these goals requires new and improved abstractions for unique capabilities, such as end-to-end testing across plugins and standardized deployments, and deeper integration with existing plugins so that they use these abstractions to full advantage. At the same time, we will add more plugins to MeltanoHub and make it easier for our community to help us with this, increasing the diversity of data tools and stacks that Meltano can add value to. We’ll also improve how Meltano creates and manages metadata generated by the workflows it runs to enhance monitoring and debugging across the board. Other major upcoming features include better support for handling secrets natively and a more robust API.
The next top priorities concern the two stages on the journey from having just found Meltano to getting lasting value out of it: first you need to get to a successful local proof-of-concept workflow run, and then you need to take this into production. This means we’ll focus on documentation and functionality that help users understand core Meltano concepts and discover the right plugins for their use case, guide them through configuration and the debugging of any issues that may come up, and simplify the deployment process.
Looking even further ahead to late 2022 and early 2023, we’ll be building our Managed Meltano service to bring the value of Meltano to all of those teams that are comfortable with Git repos and CLIs, but would rather not deal with Terraform, Kubernetes, and one or more self-managed data platform deployments. We’re currently in the early design and planning phase, so if you’re interested in learning more, please sign up for the waitlist to let us know!
Try it Out
Join our Slack community to connect with other data professionals and learn more about how to get the most out of Meltano.