Greenhouse Connector UI Setup Guide
Prerequisites
Generate Greenhouse Harvest API key
Before getting started, please make sure you have admin access to an active Greenhouse account.
The Greenhouse connector requires a Harvest API token to collect data. The token can be obtained from Greenhouse through the following steps:
- Click on the "Configure" icon on the top right corner
- Click on "Dev Center" on the left
- Click on "API Credential Management"
- Click on "Create New API Key"
- Use an identifier, like "Datacoral API Key", to describe the new API key, select "Harvest" as the Type, and click "Create"
- Click on "Manage Permissions" and select all the objects that you need to ingest from Greenhouse to your warehouse
Important!
The API key will have permission to access only the selected objects. Please review the objects and select only the ones needed.
For example, if the offers object is considered confidential, please uncheck it under "Manage API Key Permissions" before generating the key.
- Copy the newly generated Harvest API key
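Before pasting the key into the connector, it can be useful to confirm that it works. The sketch below is a minimal check and not part of the Datacoral product. It assumes the Harvest API is served at https://harvest.greenhouse.io/v1, that it uses HTTP Basic authentication with the key as the username and an empty password, and that the jobs object is among the permissions selected for the key. The function names are illustrative.

```python
import base64
import urllib.request

# Assumed base URL of the Greenhouse Harvest API (not stated in this guide).
HARVEST_BASE_URL = "https://harvest.greenhouse.io/v1"


def build_auth_header(api_key: str) -> dict:
    """Build a Basic auth header: the key is the username, the password is empty."""
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def check_key(api_key: str, resource: str = "jobs") -> bool:
    """Return True if the key can read one record of the given resource."""
    request = urllib.request.Request(
        f"{HARVEST_BASE_URL}/{resource}?per_page=1",
        headers=build_auth_header(api_key),
    )
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status == 200
    except OSError:
        # Covers HTTP errors (for example 401 or 403) and network failures.
        return False
```

Calling check_key with the copied key should return True only if the key is valid and has permission on the chosen resource. Pass a different resource name if jobs was not selected under "Manage Permissions".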
Step 1: Select Greenhouse Connector
- From the main menu, click on Add connector
- Find and select Greenhouse connector
Step 2: Configure connection parameters
Fill in the following details:
- Connector name : Set the name of the connector. Please note that this cannot be changed later, because it becomes the name of the schema
- Destination warehouse : Choose the destination warehouse from the drop-down
- API key : Fill in the Harvest API key created to connect to your Greenhouse account, then click on Check Connection to validate it
- Click on Next after successfully connecting to the source
Step 3: Configure source information
- Interval : Set the frequency of data extraction
- Sync Historical data : Loads all historical data from the source as a one-time activity
Note
This functionality will be enabled in the future. For now, contact Datacoral Support to initiate a historical sync.
- Click on Fetch Source Metadata to retrieve the metadata and then click on Next
Step 4: Configure load units information
The list of loadunits with extraction mode and schedule is displayed.
Extraction mode is auto-detected based on the table size and the availability of a primary key and a timestamp column in the source table. Click on Edit to update the configuration per loadunit.
- Extraction mode: Can be snapshot, incrementalappend or incrementalupdate
- Interval: The frequency of extraction, chosen from discrete intervals starting at 5 minutes
- Timestampcol: It is auto-detected for the incrementalupdate extraction mode
- Column Blacklist: The columns that need to be excluded from the destination warehouse should be added here
Important
When using the Column Blacklist feature in a loadunit, please make sure that the same data is excluded from other loadunits as well.
For example, the jobs and job_openings loadunits have custom_fields and keyed_custom_fields as JSON properties, which may contain confidential data.
Excluding just these columns in the above loadunits will not suffice, because job_openings data is also present as the openings column in the jobs loadunit, which means the openings column should be excluded as well.
Please refer to the links to the documentation of each object in the Greenhouse connector overview page.
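The cross-loadunit check described above can be sketched as a small audit script. This is only an illustration: the blacklist and embedded_in structures and the find_leaks function are hypothetical and are not a Datacoral configuration format. Which columns embed another object's data has to be filled in by hand from the Greenhouse object documentation.

```python
# Hypothetical per-loadunit blacklists, using the names from the example above.
blacklist = {
    "jobs": {"custom_fields", "keyed_custom_fields"},
    "job_openings": {"custom_fields", "keyed_custom_fields"},
}

# Hypothetical map: (loadunit, column) -> loadunit whose data the column embeds.
embedded_in = {
    ("jobs", "openings"): "job_openings",
}


def find_leaks(blacklist, embedded_in):
    """Return (loadunit, column) pairs that still expose blacklisted data."""
    leaks = []
    for (loadunit, column), source in sorted(embedded_in.items()):
        # The embedded loadunit has at least one blacklisted column.
        source_is_restricted = bool(blacklist.get(source))
        # The embedding column itself is already excluded.
        column_is_excluded = column in blacklist.get(loadunit, set())
        if source_is_restricted and not column_is_excluded:
            leaks.append((loadunit, column))
    return leaks


for loadunit, column in find_leaks(blacklist, embedded_in):
    print(f"Add '{column}' to the Column Blacklist of loadunit '{loadunit}'")
```

With the sample data, the script reports the openings column of the jobs loadunit, which matches the example above. The check is deliberately coarse: it flags the whole embedding column as soon as the embedded loadunit has any blacklisted column.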
Step 5: Edit data layouts
Update the data types as needed and click on Next
Step 6: Configure warehouse
For each of the load units on the left, you can decide the load mode
Load Mode: Datacoral supports the below load modes
- Replace : This is a wipe and load operation replacing all the rows of the destination table with the results of the transformation query
- Append: Insert operation where the results of the transformation query are inserted into the destination table. Rows already in the destination table are not updated
- Merge: Upsert operation where the transformation query results in rows that indicate that the destination table rows have to be inserted, updated, or even deleted. This mode allows for efficient incremental updates to destination tables.
Primary Key: This is a mandatory key for Merge load mode.
Copy options: Add the copy options (for more information, see the Redshift and Snowflake documentation)
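To make the difference between the three load modes concrete, here is a minimal in-memory sketch in Python. It models the destination table as a list of dictionaries and is not how Datacoral implements loading in the warehouse. The function names and the "_deleted" marker used to signal deletes in Merge mode are assumptions made for the illustration.

```python
def load_replace(destination, rows):
    """Replace: wipe the destination and load only the new rows."""
    return [dict(row) for row in rows]


def load_append(destination, rows):
    """Append: insert the new rows; existing rows are left untouched."""
    return [dict(row) for row in destination] + [dict(row) for row in rows]


def load_merge(destination, rows, primary_key, delete_flag="_deleted"):
    """Merge: upsert by primary key; rows flagged as deleted are removed."""
    by_key = {row[primary_key]: dict(row) for row in destination}
    for row in rows:
        key = row[primary_key]
        if row.get(delete_flag):
            by_key.pop(key, None)  # delete if present
        else:
            # Insert a new row or overwrite the existing one.
            by_key[key] = {k: v for k, v in row.items() if k != delete_flag}
    return list(by_key.values())


destination = [{"id": 1, "status": "open"}, {"id": 2, "status": "open"}]
incoming = [
    {"id": 2, "status": "closed"},  # update
    {"id": 3, "status": "open"},    # insert
    {"id": 1, "_deleted": True},    # delete
]
merged = load_merge(destination, incoming, primary_key="id")
```

The sketch also shows why a primary key is mandatory for Merge: without it there is no way to decide whether an incoming row matches an existing one.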
When done with the configuration changes, please click on Update and Next on the top right.
Connector Added
Once you land on the connector's page, the connector has been added successfully. Click on the enable icon on the top right to activate it. To initiate a historical sync, please open a support ticket.
Questions?
Please contact Datacoral's Support Team. We'd be more than happy to answer any of your questions.