
Spreadsheets Gmail
tap-spreadsheets-gmail (matatika variant)
Sync spreadsheet data from a Gmail mailbox
Setup
In Gmail:
- Identify the file names/formats of email attachments you want to sync
In Meltano Cloud:
- Log in and connect with your Google account
- Add a Tables configuration for the identified files
Settings
Tables
A list of table definition objects
Table definition
- name: string (required)
  The name to assign to the stream
- path: string (required)
  A path to a folder or set of emails in the format imap://imap.gmail.com/path/to/folder-or-emails
  Folders and emails are treated as directories and attachments as files
  Supports glob pattern matching:
  - imap://imap.gmail.com/*/*/: all emails in all top-level folders
  - imap://imap.gmail.com/INBOX/*/: all emails in the inbox
  - imap://imap.gmail.com/INBOX/*/*: all attachments for emails in the inbox
  - imap://imap.gmail.com/INBOX/*/*.csv: all CSV file attachments for emails in the inbox
- format: string (required)
  The format of files to sync - one of csv, json, jsonl, excel or detect
- pattern: string (required)
  A regex pattern used to filter resolved files by name - set to "" if path already specifies some kind of filtering mechanism (e.g. glob pattern matching) and no more granular filtering is required
- start_date: string (required)
  An ISO-8601 date-time used to filter resolved files by last modified timestamp
- key_properties: array of strings (required)
  The stream primary keys - for files where a primary key cannot be clearly identified, you can reference the meta-properties [ "_smart_source_bucket", "_smart_source_file", "_smart_source_lineno" ], or use [] for append-only behaviour
  If using the meta-property approach outlined above, be aware that changes to file locations or contents may result in unexpected duplicates or overwrites - this approach is therefore safest when the targeted files are, for all intents and purposes, immutable
- encoding: string (default: utf-8)
  The encoding to use when reading files
- state_based_discovery: boolean (default: false)
  Whether or not to use state-based discovery (i.e. sample files starting from the bookmark stored in state, rather than the initial start_date as defined in the tables configuration) - we recommend setting this to true to capture any file schema changes and optimise performance
- skip_initial: integer (default: 0)
  The number of lines to skip over when reading a file - mostly useful for excel format files
- max_sampled_files: integer (default: 50)
  The number of files to sample during dynamic catalog discovery
- max_sampling_read: integer (default: 1000)
  The number of lines to sample for each file during dynamic catalog discovery
- sample_rate: integer (default: 5)
  Controls how frequently lines are sampled (i.e. every nth line) during dynamic catalog discovery
- prefer_schema_as_string: boolean (default: false)
  Whether or not to skip inferring property types during sampling - if using with CSV files, you can set max_sampling_read to 1 alongside this to improve discovery performance (only the header row needs to be sampled to resolve the property set, given that all values will be treated as strings)
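The path, pattern and start_date settings each narrow down the set of attachments that end up in a stream. The Python sketch below illustrates how the three filters could combine; it is not the tap's actual implementation. The helper names (glob_match, select_files), the sample attachment paths, the choice to match the regex against the final path segment only, and the inclusive date cut-off are all assumptions made for illustration.

```python
import re
from datetime import datetime, timezone
from fnmatch import fnmatchcase


def glob_match(file_path: str, glob: str) -> bool:
    """Match segment by segment so '*' never spans a '/' boundary (assumption)."""
    path_parts = file_path.split("/")
    glob_parts = glob.split("/")
    if len(path_parts) != len(glob_parts):
        return False
    return all(fnmatchcase(p, g) for p, g in zip(path_parts, glob_parts))


def select_files(files, path, pattern, start_date):
    """Keep files matching the path glob, the name regex and the date cut-off."""
    cutoff = datetime.fromisoformat(start_date)
    name_re = re.compile(pattern)  # an empty pattern matches every name
    selected = []
    for file_path, modified in files:
        name = file_path.rsplit("/", 1)[-1]  # attachment file name (assumption)
        if glob_match(file_path, path) and name_re.search(name) and modified >= cutoff:
            selected.append(file_path)
    return selected


# Hypothetical attachments with their last modified timestamps
utc = timezone.utc
files = [
    ("imap://imap.gmail.com/INBOX/101/sales-2024-03.csv", datetime(2024, 3, 2, tzinfo=utc)),
    ("imap://imap.gmail.com/INBOX/102/sales-2023-12.csv", datetime(2023, 12, 5, tzinfo=utc)),
    ("imap://imap.gmail.com/INBOX/103/notes.csv", datetime(2024, 3, 3, tzinfo=utc)),
    ("imap://imap.gmail.com/INBOX/104/sales-2024-04.xlsx", datetime(2024, 4, 1, tzinfo=utc)),
]

selected = select_files(
    files,
    path="imap://imap.gmail.com/INBOX/*/*.csv",
    pattern=r"^sales-",
    start_date="2024-01-01T00:00:00+00:00",
)
print(selected)
```

With these inputs only the first attachment passes all three filters: the second is older than start_date, the third fails the regex, and the fourth is not a CSV file.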
csv format only
- delimiter: string (default: ,)
  The value delimiter sequence used in the targeted files
- quotechar: string (default: ")
  The quote character used in the targeted files - set to detect to auto-discover
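To see why delimiter and quotechar matter, the sketch below parses a small semicolon-delimited sample with Python's standard csv module. It only illustrates the effect of the two options and says nothing about how the tap reads files internally; read_rows and the sample text are hypothetical, and Python's csv module accepts a single-character delimiter only, which may be narrower than what the tap allows.

```python
import csv
import io


def read_rows(text: str, delimiter: str = ",", quotechar: str = '"'):
    """Parse CSV text into a list of rows using the given delimiter and quote character."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar=quotechar)
    return list(reader)


# Hypothetical file content: ';' separates values and "'" quotes them
sample = "id;comment\n1;'semi;colon inside quotes'\n2;plain\n"

rows = read_rows(sample, delimiter=";", quotechar="'")
for row in rows:
    print(row)
```

With the matching quotechar, the semicolon inside the quoted value stays part of a single field; with the default quotechar it would be split into two.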
excel format only
- worksheet_name: string
  The specific worksheet name to pull out - defaults to the sheet with the most available data
See the tap README for further information and other miscellaneous settings
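As a worked example of the settings above, the following Python sketch assembles one table definition and serialises the Tables list as JSON. The setting names come from this page; every value (stream name, path, dates) is hypothetical, and the exact way Meltano Cloud expects the Tables value to be entered is not covered here, so treat the JSON output as one plausible representation.

```python
import json

# Settings marked as required in the table definition above
REQUIRED = ("name", "path", "format", "pattern", "start_date", "key_properties")

# Hypothetical table definition: all CSV attachments for emails in the inbox
table = {
    "name": "sales",
    "path": "imap://imap.gmail.com/INBOX/*/*.csv",
    "format": "csv",
    "pattern": "",  # the glob in path already does the filtering
    "start_date": "2024-01-01T00:00:00Z",
    "key_properties": [
        "_smart_source_bucket",
        "_smart_source_file",
        "_smart_source_lineno",
    ],
    "state_based_discovery": True,
    "delimiter": ",",
}

# Fail early if a required setting has been left out
missing = [key for key in REQUIRED if key not in table]
if missing:
    raise ValueError(f"table definition is missing: {missing}")

tables_json = json.dumps([table], indent=2)
print(tables_json)
```

The meta-property key_properties used here suit attachments that never change once received; for files that may be re-sent or edited, an empty list gives append-only behaviour instead.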
Mailbox email address
The Spreadsheets Gmail connector is available on Meltano. It is built, maintained, supported, and tested by Meltano.