Earlier this month, I joined the Meltano team full time with two ambitious goals: (1) redefine how organizations think about their approaches to data projects and (2) make this next-gen approach to data freely available to everyone, regardless of budget or the size of the data. From hobbyists working on pet projects, to multi-billion dollar corporations: I believe everyone can benefit from Meltano’s approach.
This post will explain why I decided to join Meltano, why I think DataOps is critical to a team’s success, and what I think is so unique about Meltano in particular.
ETL is dead; Long live EL and T!
To start off, one of the biggest reasons I joined Meltano is that I think the transition to the new DataOps approach is exciting, and I want to do my part to make that transition a positive, joyful experience. If your data team has not taken the plunge yet, the plunge is coming and you should be getting ready! If you work in data, it is likely that some time within the next 12 months someone is going to excitedly come ask you about DevOps, its younger cousin “DataOps”, and CI/CD. They’re going to ask you if your data team has taken that plunge into the world of DataOps or if you are still using the old-style traditional ETL tools. I’m here to tell you life is much nicer here on the other side.
Addressing two important questions about the future of data teams
You might be in the data space yourself and you might be skeptical of the direction that tools like Meltano and dbt are moving in. Or maybe DevOps, git, and CI/CD seem like too much to learn. I want to address two of these concerns before I go further.
Objection #1: My data team aren’t coders. Do we really need git and CI/CD?
First of all, I know this is going to sound controversial, but your data team are already coders. They just might not yet have the tools they need to support them writing good code. In our own way, we are all coders whether we write code directly or use a tool to write the code for us.
As long as two or more people are working together towards a common goal, there will always be merge conflicts to resolve and code reviews to perform. Have you ever emailed a coworker a copy of a Word document, only for both of you to make changes and then stumble through how to merge those changes back again into a single copy? That’s called a “merge conflict” – resolving it made you a coder! Have you enabled “track changes” on a document so someone else could review those changes one at a time? That was a code review and in that moment, you were a coder! These are exactly the types of source control problems which benefit from git and CI/CD. Historically, however, very few of us in the data community have had access to tools that managed our code effectively. That is, until now.
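If you are curious what “track changes” looks like to a tool rather than a person, here is a minimal sketch in Python using the standard library’s difflib module. The file names and sentences are made up for illustration, and this is only the same idea at toy scale, not how git or Meltano work internally.

```python
import difflib

# Two hypothetical versions of the same short document.
original = [
    "Q3 revenue was flat.",
    "We expect growth in Q4.",
]
edited = [
    "Q3 revenue was up 4%.",
    "We expect growth in Q4.",
]

# unified_diff yields the same "-" (removed) and "+" (added) lines
# that a reviewer reads during a code review.
diff = list(
    difflib.unified_diff(
        original,
        edited,
        fromfile="report_v1.txt",  # made-up file names
        tofile="report_v2.txt",
        lineterm="",
    )
)

for line in diff:
    print(line)
```

The removed sentence shows up prefixed with “-” and its replacement with “+”, which is all a reviewer needs in order to approve or question the change.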
For anyone still nervous about git or “coding”, please consider checking out the Git and GitHub for Poets series. It’s clever and entertaining and does a great job demystifying all the DevOps technical jargon.
Objection #2: But isn’t coding hard? What about the move towards “no-code” solutions?
I want to address this point because I hear this question often in the data space. While we all want to be writing less code, more elegant code, and more efficient code, we can’t get away from code altogether; we can only invent simpler and more accessible languages to deal with it, and then build great tools on top of those languages. The Meltano UI, for instance, literally writes EL code for us. Afterwards, we’re still free to modify the code ourselves if we wish, track changes, and even undo changes.
Rest assured: the languages you learn when building a Meltano project are not going away anytime soon. These are the same languages that I’d bet on my 5-year-old son learning in high school ten years from now: SQL, YAML, and Python. These are not only among the most popular languages out there, but they are also among the easiest that one can learn today! (Yes, much easier than French! 😅)
Why I believe Meltano is special
As I think about why Meltano is special, there are three big reasons that come to mind: specialization, composition, and collaboration.
Meltano treats EL and T as distinct problem types (Specialization)
Meltano understands that the challenges with EL (“Extract-and-Load”) and T (“transformation”) are each unique. As such, Meltano handles each with a targeted, best-in-class approach: the powerful Singer platform as our EL framework and the revolutionary open source tool dbt as our data transformation layer. Meltano itself is the cohesive solution for the entire end-to-end data project. Looking to the future, beyond EL (Singer) and T (dbt), Meltano will similarly integrate with other great tools like Superset for reporting and BI and Great Expectations for validation and governance.
Meltano brings it all together in one place (Composition)
Despite the distinct and unique challenges of each stage of EL and T, Meltano knows that data professionals want a single home for the entire data project, with clear transitions and handoff between each stage of the process. For instance, today Meltano recommends dbt transformation packages based upon the data sources in your project. There is plenty of room for us to continue expanding this functionality in the future.
Meltano is Pro-Team, because… DataOps! (Collaboration)
If you haven’t heard the word “DataOps” before, please allow me to formally introduce you: “DataOps” is simply “DevOps for Data” – and at the same time it is also so much more. The term “DevOps” was coined in 2009 as an agile-driven movement in software development, automating the “ops” part of development and putting source control systems to work automating those steps we don’t need to be spending time on manually. Deploying, packaging, testing, rolling back: all of those tasks are handled automatically by what we now call “CI/CD”. All we need to do is write some tests, use automation to continually rerun those tests, and then finally we can trade our manual deployment headaches for a streamlined code review and approval flow. With DataOps, anyone can send a proposed code change, code owners can review that change, and the systems literally do the rest.
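To make “write some tests” a little more concrete, here is a minimal sketch in Python of the kind of check an automated job could rerun on every proposed change. The function name and the sample records are hypothetical. They are not part of Meltano, Singer, or dbt, and a real project would more likely express a check like this in its own testing tool.

```python
def find_duplicate_ids(rows, key="id"):
    """Return the sorted key values that appear more than once in rows."""
    seen = set()
    duplicates = set()
    for row in rows:
        value = row[key]
        if value in seen:
            duplicates.add(value)
        seen.add(value)
    return sorted(duplicates)


# Hypothetical sample of loaded records; id 2 was loaded twice.
accounts = [
    {"id": 1, "name": "Acme"},
    {"id": 2, "name": "Globex"},
    {"id": 2, "name": "Globex Inc"},
]

duplicates = find_duplicate_ids(accounts)
if duplicates:
    print("Duplicate ids found:", duplicates)
else:
    print("All ids are unique.")
```

A check this small is already enough for automation to block a change that would load the same record twice, which is exactly the kind of manual double-checking we want to stop doing by hand.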
With DataOps and Meltano, teams can pride themselves not on how complex their solutions are, but how simple and elegant they are. A “great” solution is a solution that is elegant, repeatable, and easily understood by new team members. Everyone can see the work contributed by everyone else, and that code is human readable by design.
Meltano’s DataOps approach makes the hard stuff easy
Want to add Salesforce as a new source? Easy:
✅ Type into a terminal: meltano add extractor tap-salesforce
Want to do the same thing with a friendly UI? Easy:
✅ Go to the Meltano web UI, click the option to add a new extractor, and select “Salesforce”.
Want to give everyone in the company the ability to contribute? Easy:
✅ Make the code repository public to all employees – especially the company’s product managers and analysts!
…while no single person has access to break anything? Easy:
✅ Require code review on all changes.
Want to make sure it’s perfect before you deploy? Easy:
✅ Turn on CI/CD with your source provider. Get a full or mini copy of the DB every time you make a change!
Want to undo your mistake? Easy:
✅ Instantly roll back to any point in your history using git.
Like magic! The right tools, fit together just right.
“The most profound technologies are those that disappear. They weave themselves into the fabric of everyday life until they’re indistinguishable from it.” – Mark Weiser
“Everything should be made as simple as possible, but no simpler.” – Albert Einstein
What’s unique about Meltano is that rather than build a more complex platform or a more complex set of tools, we’re making a simpler platform and a more cohesive set of specialized tools. We solve the core data problems in a fundamentally more elegant way. For instance, rather than build a platform with rollback and schema-diff capability, we get out of the way and let source control and CI/CD tools do their job. In future, you won’t be thinking about Singer taps and targets – you’ll just be thinking about source data and where you want it to land. You won’t be thinking about dbt or SQL – you’ll just be thinking about whether your model runs smoothly and meets the business requirements.
The sign of having a great tool is that the tool gets the job done seamlessly and doesn’t distract or get in the way, providing safety and agility at the same time. While I’m not going to claim that Meltano is perfect (not yet! 😉), I do believe Meltano is on its way to being the perfect DataOps platform for any organization – whether you are a junior analyst, a senior data engineer, or a one-person startup.
Let’s build Meltano, together!
I’ve told you now why I’m excited about building Meltano, but the most exciting part is that I get to build it alongside all of you! Whether you are reporting bugs or fixing them, whether you are brand new in your career or have 20 years under your belt: we’re all in this together and we all can benefit from making Meltano the most amazing product it can be!
I do believe we’ll be successful in making DataOps available to the world and we’ll do it with Meltano. 🚀
Let’s do this!
Aaron (“AJ”) Steers