DataOps Teams Get a Seat at the Adults’ Table as Organizations Recognize Their Strategic, Proactive Value
Gone are the days when success meant keeping data teams small and getting insights quickly with tools built in-house. Data is taking on a new level of importance to businesses, and expectations are changing. Reliability, consistency, and accuracy matter more than ever, and the old ways of working with data don’t support them, leaving DataOps professionals frustrated.
The modern data stack and data lifecycle are extremely complex. An ideal data stack today can include upwards of 15 tools, including extractors and loaders, transformers, reverse ETL, warehouses, business intelligence (BI) tools, and metrics layers. Managing and deploying that mix is a big challenge.
Data teams (analysts, engineers, and scientists), and sometimes product teams, are answering vital business questions and generating insights that inform organizations’ decisions. Companies want to feel confident that their data is good: that it can be relied upon and trusted. The old ways of working with data don’t deliver those outcomes. Data professionals are feeling this pain, but the solution isn’t necessarily obvious to them.
Looking to the Software Development Team as a Model
Software engineering went through this journey over a decade ago. DevOps offered a solution to the frustration software engineers experienced while juggling multiple tools, managing a clunky workflow, and facing higher demands. Developers and software engineers have very different skill sets from data professionals, but the outcomes and benefits that DevOps brought those teams are the very things data teams crave. A set of best practices emerged from DevOps. Those best practices, such as version control, a unified development environment, and continuous, built-in testing, can also be brought to data workflows.
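To make “built-in testing” concrete for a data workflow, here is a minimal sketch in Python. It is not taken from any particular tool: the function name check_rows, the column names, and the sample rows are all hypothetical. It assumes a batch of records is available as a list of dictionaries and checks two common expectations before the batch moves downstream: required fields are present, and a key column is unique.

```python
from typing import Any, Dict, List


def check_rows(rows: List[Dict[str, Any]], required: List[str], unique_key: str) -> List[str]:
    """Return a list of problems found in the batch; an empty list means it passed."""
    problems: List[str] = []
    seen = set()
    for i, row in enumerate(rows):
        # Every required column must be present and non-null.
        for col in required:
            if row.get(col) is None:
                problems.append(f"row {i}: missing {col}")
        # The key column must not repeat within the batch.
        key = row.get(unique_key)
        if key is not None:
            if key in seen:
                problems.append(f"row {i}: duplicate {unique_key}={key}")
            seen.add(key)
    return problems


# Hypothetical sample batch: the second row repeats an id and lacks a region.
sample = [
    {"id": 1, "region": "EMEA"},
    {"id": 1, "region": None},
]
issues = check_rows(sample, required=["id", "region"], unique_key="id")
```

Run on every change, a check like this plays the same role for data that a unit test plays for application code: a failing batch is caught before it reaches a report.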
Now It’s Data’s Turn
The data profession is about five years behind software engineering. Data professionals need the quality outcomes and benefits that DevOps brings developers, but those outcomes need to be clearly defined. Analysts and scientists often don’t have the same skills as developers and software engineers, and the tools data teams are accustomed to using come with limitations. Many data professionals are less comfortable working with tools such as version control and the command line. Data work is simply very different from software development.
There are, however, similarities from a broad operational standpoint. Data teams are building a data workflow to generate insights where they need to be, just as dev teams are building a development pipeline to get applications to production and in the hands of the user. Chaining tools together is part of the DevOps story, and now it’s also part of the DataOps story.
Meltano Ushers in the Age of the DataOps Platform Infrastructure
Data teams could be using best-of-breed tools, but pulling them all together into a data platform is extremely difficult, especially considering the business decisions riding on the data they produce. For example, a top-notch BI tool could create great reports, but those reports can’t be relied upon for decisions if they can’t be kept up to date. A report is only as good as the data behind it. Change management is a key need for data tooling because data is constantly changing.
With Meltano, teams can choose any open source tools, install them, configure them in a central location, and get them up and running. The complexity introduced by all the tools in the modern data stack is made simpler with Meltano, which serves as the foundation to pull them all together.
Meltano also handles those other challenges that are unique to data, such as change management, scaling, and new business requirements. The added bonus is that Meltano brings those DevOps/DataOps best practices, including features that offer version control, a native understanding of development environments, and built-in testing. It takes on the data team’s burden of managing and deploying their complex data stack.
Examples of Unique Challenges for Data Teams
Data is dynamic; it is innately different from the software in a DevOps pipeline. Data teams are asked for data points such as the status of a customer or the cash flow in a specific region. They have access to data, but it could be buried in Microsoft Excel macros or a BI tool. Once they get the data, their work isn’t done. They might next be asked to tweak the data or share it with someone else in the organization who can make adjustments. Collaboration is very hard, and it is not something data professionals of the “get it done fast and lean” age had to contend with.
Another challenge for the modern data team is reproducibility. In data pipelines, data manipulations are made in production, across many different staging environments, and without testing. Once changes are made, they aren’t propagated back. This is a core difference between software development and data. Today, it’s nearly impossible to reproduce the same data someone asked for two months ago, yet data professionals are asked to do this.
Sometimes data teams can create special deployment environments to test things and manipulate data as a one-off workflow of sorts. Still, when it’s time to deploy those changes into production, those environments aren’t necessarily compatible. A user interface (UI) alone isn’t enough; teams also need to see how things have changed over time. It’s the statement echoed in data departments everywhere: “Well, it works on my machine.” That just won’t fly anymore.
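One way to think about reproducibility is to record, for every run, exactly which code version, configuration, and input produced a result. The following is a minimal sketch in Python, not a description of how any specific tool does it. The function name run_fingerprint, the version string, and the sample config and rows are hypothetical. It assumes the inputs are JSON-serializable and hashes them into a single identifier that can be stored next to the output.

```python
import hashlib
import json
from typing import Any, Dict, List


def run_fingerprint(code_version: str, config: Dict[str, Any], input_rows: List[Dict[str, Any]]) -> str:
    """Hash the code version, configuration, and input data into one identifier."""
    # sort_keys makes the serialization independent of dictionary key order,
    # so logically identical runs always produce the same fingerprint.
    payload = json.dumps(
        {"code": code_version, "config": config, "input": input_rows},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Hypothetical run: the same request made twice, then once with a changed setting.
rows = [{"customer": "acme", "status": "active"}]
first = run_fingerprint("v1.4.2", {"region": "EMEA", "currency": "EUR"}, rows)
again = run_fingerprint("v1.4.2", {"currency": "EUR", "region": "EMEA"}, rows)
changed = run_fingerprint("v1.4.2", {"region": "APAC", "currency": "EUR"}, rows)
```

If the fingerprint stored with a two-month-old report matches the fingerprint of today’s rerun, the two were produced from the same code, settings, and data. If it differs, something changed, and the team knows to look for what.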
A Tandem Culture Change
Meltano understands that data engineers can confidently build systems and tools that are sustainable and that work. The culture of the data field is changing, and it is becoming more technical. Data professionals are becoming more aware and skilled, primarily due to open source tooling. Proprietary stacks used to be more common in companies, and data professionals gained skills that were pertinent to that stack but weren’t necessarily transferable when they changed jobs. With the proliferation of open source tools, frameworks, and languages (such as SQL and Python), data professionals can build skills that will stay with them throughout their careers.
In terms of the larger culture, the view of data is shifting from static to dynamic. Awareness is rising of the changing nature of data over time and the need for collaboration with other people outside the data team. The need for repeatable, reproducible, and dynamic analyses is clear. A DataOps platform infrastructure enables these goals to help data teams take their place as strategic partners in achieving business objectives.
Meltano is on track to launch the beta version of Managed Meltano in late 2022 and make it open to the public in 2023. You can get a quick preview of what Managed Meltano will deliver or be first in line to be part of the test by filling out this waitlist form. Or, follow our progress on GitHub.