# Comma Separated Values (CSV)

The tap-csv extractor pulls data from Comma Separated Values (CSV) files.

To learn more about tap-csv, refer to the repository at https://gitlab.com/meltano/tap-csv.

# Getting Started

# Prerequisites

If you haven't already, follow the initial steps of the Getting Started guide:

  1. Install Meltano
  2. Create your Meltano project

# Installation and configuration

  1. Add the tap-csv extractor to your project using meltano add:

    meltano add extractor tap-csv
    
  2. Configure the settings below using meltano config.

# Next steps

Follow the remaining steps of the Getting Started guide:

  1. Select entities and attributes to extract
  2. Add a loader to send data to a destination
  3. Run a data integration (EL) pipeline

# Settings

tap-csv requires one of the following settings to be configured:

  • Files
  • CSV Files Definition

# Minimal configuration

A minimal configuration of tap-csv in your meltano.yml project file looks like this:

plugins:
  extractors:
  - name: tap-csv
    variant: meltano
    pip_url: git+https://gitlab.com/meltano/tap-csv.git
    config:
      files:
        - entity: things
          file: extract/things.csv
          keys: [thing_id]
        - entity: widgets
          file: extract/widgets.csv
          keys: [widget_id]
      # csv_files_definition: extract/csv_files.json    # if defining the files in a separate file is preferred

# Files

An array of objects, each with the keys entity, file, and keys:

  • entity: The entity name, used as the table name for the data loaded from that CSV.
  • file: Local path (relative to the project's root) to the file to be ingested. This may also be a directory, in which case all files in that directory and its subdirectories are processed recursively.
  • keys: The names of the columns that constitute the unique keys for that entity.
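
To preview which files a directory-valued file entry would cover, a sketch like the following may help. It is not tap-csv's own code: the helper name list_candidate_files is hypothetical, and it assumes that "processed recursively" means every regular file beneath the directory is picked up.

```python
from pathlib import Path


def list_candidate_files(path):
    """Return the files a `file` entry would plausibly cover.

    Hypothetical helper, not part of tap-csv: a single file yields itself,
    a directory yields every regular file beneath it, recursively.
    """
    root = Path(path)
    if root.is_dir():
        # rglob("*") walks all subdirectories; keep regular files only.
        return sorted(p for p in root.rglob("*") if p.is_file())
    return [root]
```

Running it against a directory before adding it to the configuration is a cheap way to spot stray non-CSV files that would otherwise be ingested.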

Each input CSV file must be a conventionally delimited CSV: comma-separated columns, one row per line, and double-quoted values.

The first row is the header. It defines the attribute name for each column and results in a column of the same name in the database. Header names must have a valid format, with no spaces or special characters (such as ! or @).
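
The header rules can be checked before running the extractor. The sketch below is an illustration only: check_header is a hypothetical helper, and the pattern assumes that a "valid" name means letters, digits, and underscores, which is stricter than the text strictly requires.

```python
import csv
import io
import re

# Assumption: a valid column name is letters, digits, and underscores only,
# and does not start with a digit.
VALID_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def check_header(csv_text, keys):
    """Return a list of problems found in the header row of csv_text."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty input yields an empty header
    problems = [
        f"invalid column name: {name!r}"
        for name in header
        if not VALID_NAME.match(name)
    ]
    # Every configured key must appear as a column in the header.
    problems += [
        f"key column missing: {key!r}" for key in keys if key not in header
    ]
    return problems
```

An empty list means the header passed; for example, a header of thing_id,name with keys [thing_id] yields no problems, while a header containing "thing id" is reported both as an invalid name and as a missing key.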

You can check the following files as an example of valid CSV files:

Those files were generated by exporting Google Sheets to CSV. Exporting to CSV from most spreadsheet applications produces files in a format that the CSV extractor supports.

# How to use

Manage this setting directly in your meltano.yml project file:

plugins:
  extractors:
  - name: tap-csv
    variant: meltano
    pip_url: git+https://gitlab.com/meltano/tap-csv.git
    config:
      files:
        - entity: <entity>
          file: <path>
          keys: [<key>]
        # ...

Alternatively, manage this setting using meltano config or an environment variable:

meltano config tap-csv set files '[{"entity": "<entity>", "file": "<path>", "keys": ["<key>", ...]}, ...]'

export TAP_CSV_FILES='[{"entity": "<entity>", "file": "<path>", "keys": ["<key>", ...]}, ...]'
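
Hand-writing the JSON value invites quoting mistakes, so one option is to generate it. The sketch below builds the value with Python's json module and prints a shell-safe export line; the entity names and paths are the illustrative ones from the minimal configuration, not required values.

```python
import json
import shlex

# Illustrative entries, taken from the minimal configuration example.
files = [
    {"entity": "things", "file": "extract/things.csv", "keys": ["thing_id"]},
    {"entity": "widgets", "file": "extract/widgets.csv", "keys": ["widget_id"]},
]

# Compact separators keep the value on a single line.
files_json = json.dumps(files, separators=(",", ":"))

# shlex.quote wraps the value so the shell passes it through unchanged.
print(f"export TAP_CSV_FILES={shlex.quote(files_json)}")
```

The same quoted string can be passed as the value argument to meltano config tap-csv set files.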

# CSV Files Definition

  • Name: csv_files_definition
  • Environment variable: TAP_CSV_FILES_DEFINITION, alias: TAP_CSV_CSV_FILES_DEFINITION

Project-relative path to a JSON file holding an array of objects with entity, file, and keys keys, as described under Files:

[
  {
    "entity": "<entity>",
    "file": "<path>",
    "keys": ["<key>"]
  },
  // ...
]
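
Note that the // ... placeholder above is not valid JSON, so a real definition file must omit it. A definition file can be sanity-checked before use with a sketch like the following; load_files_definition is a hypothetical helper rather than part of tap-csv, and it only checks the structure described under Files.

```python
import json

REQUIRED_KEYS = {"entity", "file", "keys"}


def load_files_definition(path):
    """Load a definition file and check each entry has the documented keys."""
    with open(path, encoding="utf-8") as handle:
        entries = json.load(handle)  # raises on invalid JSON, e.g. comments
    if not isinstance(entries, list):
        raise ValueError("definition must be a JSON array")
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            raise ValueError(f"entry {index} is not an object")
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            raise ValueError(f"entry {index} is missing: {sorted(missing)}")
    return entries
```

Calling it with the path that csv_files_definition points to returns the parsed entries, or raises with the index of the first malformed entry.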

# How to use

Manage this setting using meltano config or an environment variable:

meltano config tap-csv set csv_files_definition <path>

export TAP_CSV_FILES_DEFINITION=<path>