The Journey to Meltano Version 1.0

This post was originally published on Medium

This week we are working through the final requirements for announcing Meltano v1.0, and it’s been a long road to get to this point.

I’m so proud of our team for fighting through the mental fog that all too often accompanies a complex and wide-ranging product vision to reach clarity about our order of operations. Sticking to the guidelines below has been key to delivering value to users, and I hope readers will check out what we’ve built.

We’re on track to formally release v1.0 in early October, and you can check out the Meltano roadmap for a detailed breakdown of remaining issues.

Usable End-to-End, Without the Command Line

Meltano is an internal startup within GitLab, and our goal is to grow MAUI (monthly active UI users) 10% week-over-week.

When I joined the team in February, I quickly discovered something that made hitting this goal very challenging: in our existing user-adoption workflow, most command-line (CLI) users never made it to the UI.

In the data world, many analysts receive either a login to a SaaS tool or a link (or laptop) where software has already been installed and configured for them by the IT department or a software engineer. To drive adoption of Meltano as a self-hosted tool for managing data analytics pipelines from data ingestion to dashboard, we knew we’d need to drive bottom-up adoption by people who wouldn’t want to be burdened with these extra steps.

Users who would like to install and/or use Meltano from the command line can still do so, but when v1.0 releases this will no longer be required.

Deployable to the Cloud in a Single Click

Another major barrier we wanted to resolve in service of data analysts adopting Meltano is hosting. While Meltano can be run locally on your laptop/desktop using a virtual environment, as soon as you are looking to pull very large data sets this can become problematic from a performance and/or security perspective.

We are working to offer several one-click installation options on cloud hosting platforms, and our first submission will be to DigitalOcean’s Droplet marketplace. In the process of working on our submission, we’ve also documented the steps required to deploy Meltano on a DigitalOcean Droplet as an advanced user, and we look forward to simplifying all of this into a pre-made image you can install with a single click.

Everything You Need Comes Installed

Earlier Meltano users will remember the many steps required to install various pieces of the pipeline. Meltano v1.0 will bundle what we believe is the best-in-class open source software for each step in the pipeline: data taps and targets from Singer, transforms from dbt, and orchestration from Airflow.

We still have work to do post v1.0 to make Jupyter Notebooks integration even easier (advanced users can check out this tutorial) and we are looking to swap out the Meltano Models and Meltano Analyze steps with open source solutions rather than what we’ve built (email hello@meltano.com if you have suggestions or are working on a project that might be a fit).


Thank you for following along with our adventure building Meltano!

We are a 6-person startup within GitLab, and your engagement and support help us learn what is important and keep us motivated every day.

You can find us on Twitter @meltanodata and subscribe to our newsletter here

Meltano 0.41 Released

If this is your first time exploring Meltano for your company’s data pipeline management, you can follow our Installation Guide and Getting Started Guide to get going in minutes!

New

  • #579 Add meltano schedule list to show a project’s schedules
  • #942 Add progress bars on various routes to improve UX feedback
  • #779 Add various UI polish details regarding iconography use, preloading feedback, breadcrumbs, container styling, navigation, and sub-navigation

Changes

  • #942 Update Analyze Connections UI to match configuration-as-modal pattern for UX consistency regarding configuration
  • #779 Update all “This feature is queued…” temporary UI buttons to link to the Meltano repo issues page with a contextual search term


Instructions for upgrading to the most current version of Meltano are available in our documentation.

To see the full history of improvements to Meltano, please review our CHANGELOG

Meltano is now using Vue CLI 3! 🎉

Since the early days of Meltano, we have been using Vue.js, a progressive JavaScript framework, to power our frontend application. 

Until recently though, we were using Vue CLI 2 to power the infrastructure behind our app. Having gone through the transition from Vue CLI 2 to Vue CLI 3 multiple times, I knew that there were numerous benefits that the team would gain as a result:

  • A lower barrier for future contributions to the Vue app
  • An improved developer experience when managing configuration such as linting rules
  • Scaffolded pre-commit linting that minimizes back and forth on merge requests

And while everything was working smoothly, the team decided it was time to upgrade.

Why upgrade to Vue CLI 3?

As some of you may know, Meltano is entirely open source. As a result, contributions from outside the core team are not only welcome, but highly encouraged!

However, as someone who has spent a fair amount of time with the open source community, I have found that one of the biggest hurdles to contributing to an unfamiliar codebase is the onboarding time it takes to become productive. After all, your time is extremely valuable and we want to ensure that you spend it doing something impactful rather than wading through a bunch of configuration options.

While one could argue that people “should” be able to navigate the old Vue CLI configuration options, the reality is that we needed to leverage the new Vue CLI 3 tooling in order to:

#1. Reduce technical debt

As the core library continues to grow and change, any custom configurations that we make to things such as webpack will only add to the complexity of a migration in the future. 

#2. Improve the security of our code

For those who are active open source contributors, one of the major changes that came to GitHub is alerts about security vulnerabilities in the dependencies that projects use. Although it would be great if we lived in a world where people did not attempt to take advantage of others by injecting malicious code into seemingly harmless packages, this is a reality of using open source libraries. Staying on current, actively maintained tooling makes it easier to pick up dependency fixes when those alerts arrive.

#3. Make it easier for you to contribute

At the end of the day, creating an environment that makes it easy for developers to contribute is critical to any open source project looking to thrive in the ecosystem. This means we want to make sure that you are:

  1. Able to easily understand how to navigate our codebase since we are following many of the best practices recommended in the Vue community
  2. Using the latest and greatest tools so that your skills are continually sharpened
  3. Productive and creating impact in as little time as possible

What are the technical benefits?

For those who are new to Vue CLI 3, the prime directive of any CLI tool for frontend frameworks is to help automate processes such as:

  • Creating intelligently bundled scripts
  • Automatically injecting them into final production code
  • Managing configuration for build processes such as transpilation and integration of libraries to improve the overall developer experience

It has a lot of features out of the box

Here is what makes Vue CLI 3 special though:

Once it is installed, Vue CLI 3 makes it incredibly easy to progressively enhance Vue apps with popular tools and functionality such as:

  • Babel
  • ESLint
  • TypeScript
  • PostCSS
  • PWA
  • Unit & E2E Testing

It has a Graphical User Interface (GUI)!

Although it might seem counter-intuitive for a command-line tool, Vue CLI 3 ships with a GUI that offers an approachable and powerful way to manage your Vue apps. Whether you are creating, developing, or maintaining your projects, the GUI is a breath of fresh air when it comes to accomplishing these tasks.

It has a plugin system which makes it extensible

With an intuitive plugin system, it is easier than ever to leverage community-built plugins to solve common problems. In addition, users can more easily discover other plugins, which gives the team additional opportunities for outreach.
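
To make the plugin idea concrete, here is a minimal sketch of what a Vue CLI service plugin can look like. The plugin name (vue-cli-plugin-hello), the hello command, and its message are hypothetical and exist only for illustration; they are not part of Meltano. The api.registerCommand call is, to the best of our knowledge, part of the Vue CLI plugin API, but check the plugin development guide for your version before relying on the details.

```javascript
// index.js of a hypothetical plugin package named "vue-cli-plugin-hello".
// Vue CLI calls the exported function with a plugin API object and the
// project's resolved options.
const helloPlugin = (api, projectOptions) => {
  // Register a custom command, runnable as `vue-cli-service hello`.
  api.registerCommand(
    'hello',
    {
      description: 'Print a greeting (illustrative only)',
      usage: 'vue-cli-service hello',
    },
    () => {
      console.log('Hello from a Vue CLI plugin');
    }
  );
};

module.exports = helloPlugin;
```

Because a plugin is just a function that receives the API object, a project can gain new commands or build behavior by installing a package rather than by editing its own build configuration.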

Configuration without the mess of ejecting

Rather than putting the burden of maintaining all the various standard configurations on your repo, Vue CLI 3 abstracts them away so that a project’s infrastructure stays maintainable for longer. An additional benefit is that any configuration tracked in a repo is custom to the project itself. This is a big relief from a developer experience perspective since you do not have to worry about accidentally overriding default behavior.

Final Thoughts

Since we have upgraded to Vue CLI 3, the team has already been seeing benefits of the new infrastructure with things like:

  • An improved developer experience when managing configuration such as linting rules
  • Scaffolded pre-commit linting that minimizes back and forth on merge requests

We are thrilled with the upgrade and can’t wait to see more contributions from the frontend community as we work our way towards v1!

Meltano 0.40 Released

If this is your first time exploring Meltano for your company’s data pipeline management, you can follow our Installation Guide and Getting Started Guide to get going in minutes!

New

  • #916 Add Transform step as first-class and adjacent step to Extract and Load
  • #916 Improve Create Pipeline Schedule default selection UX by leveraging “ELT recents” concept
  • #936 Add “Refresh Airflow” button in Orchestrate to bypass route change or full-page refresh when iframe doesn’t initially inflate as expected (this will likely be automated once the root cause is determined)
  • #899 Add deep linking improvements to reports and dashboards to better facilitate sharing
  • #899 Add “Edit” and “Explore” buttons to each report instance displayed in a dashboard to enable editing said report and exploring a fresh and unselected analysis of the same model and design

Changes

  • #909 Default names will be generated for Reports and Dashboards
  • #892 Improve experience for parsing Snowflake URL for ID by showing processing step
  • #935 Update Entity Selection to be nested in the Extract step so each ELT step is consecutive
  • #886 Add validation for grouping settings as the next iteration of improved form validation for generated connector settings

Fixes

  • #931 Fix Analyze Connections identifier mismatch resulting from recent linting refactor
  • #919 Fix Airflow iframe automatic UI refresh
  • #937 Fix Chart.vue prop type error

Instructions for upgrading to the most current version of Meltano are available in our documentation.

To see the full history of improvements to Meltano, please review our CHANGELOG

Meltano Demo Day 2019-08-30

Showcasing a whole bunch of new goodness:

  • Ben
    • Snowflake configuration UI improvements
    • Saved reports and dashboards have default names with timestamps
  • Derek
    • Pipeline configuration UI updated, Transform has its own step and Select is nested within Extract
    • Extractor configuration now has data validation and makes it clear what is required
    • Dashboards UI improvements
  • Danielle
    • DigitalOcean Droplet creation tutorial

Meltano 0.39 Released

If this is your first time exploring Meltano for your company’s data pipeline management, you can follow our Installation Guide and Getting Started Guide to get going in minutes!

New

  • #838 Add indicator for speed run plugins
  • #870 Add global footer component in docs
  • #871 Add contributing link in footer of docs
  • #908 Add auto installation for Airflow Orchestrator for improved UX
  • #912 Auto run the ELT of a saved Pipeline Schedule by default
  • #907 Add auto select of “All” for Entities Selection step and remove the performance warning (a future iteration will address the “Recommended” implementation and the display of a resulting performance warning when “All” is selected and “Recommended” ignored)
  • #799 Standardize code conventions on the frontend and update related documentation (issues related to further linting enforcement will soon follow)

Changes

  • #838 Speed run plugins prioritized to top of the list
  • #896 Add documentation for how to do patch releases
  • #885 Add docs for all extractors and loaders
  • #885 All plugin modal cards show docs text if they have docs
  • #733 Improve error feedback to be more specific when plugin installation errors occur

Fixes

  • #923 Fix contributing release docs merge conflict issue

Instructions for upgrading to the most current version of Meltano are available in our documentation.

To see the full history of improvements to Meltano, please review our CHANGELOG

Using Airflow Within Meltano

With the release of Meltano v0.38, users are now able to install Apache Airflow from the Meltano UI with a single click and then use its powerful orchestration capabilities from within the Meltano UI to schedule ELT runs.

This functionality is crucial to our path to v1, which requires that Meltano UI users never have to execute commands from the command line to use our product. While all command line functionality will still work, and is documented, the goal is to make Meltano accessible to users who don’t want to (or don’t know how to) use the command line.

At this point, the final two command line steps for us to eliminate are meltano init, which creates a new instance of Meltano, and meltano ui, which launches the Meltano UI in the browser at localhost:5000.

Next up: We are looking forward to offering one-click installs on Amazon, Dreamhost, DigitalOcean, and anywhere else users want this capability!


If this is your first time exploring Meltano for your company’s data pipeline management, you can follow our Installation Guide and Getting Started Guide to get going in minutes!