Firebase Collect Slice

Overview

Firebase Realtime Database is a cloud-hosted database and collaboration platform for software developers. Firebase lets you build cross-platform apps with its Android, iOS, and JavaScript SDKs. All clients share one Realtime Database instance and automatically receive updates with the newest data.

The Datacoral Firebase slice collects data from a Firebase database and enables data flow of object updates into a data warehouse, such as Redshift.

Capabilities

  • Periodically pulling data from the Firebase database. With a pull mechanism, data can be extracted as frequently as every five minutes
  • Extracting only the data modified since the last pull, using date/timestamp attributes
  • Flattening nested attributes into columns
  • Loading nested array structures into parent/child tables for easier analysis
  • Scalar transformations of attributes during extraction, e.g. date format changes
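
To make the incremental pull concrete, here is a minimal sketch of how a request for "only data modified since the last pull" could be expressed against the Firebase REST API. It is an illustration, not the slice's actual implementation: the function name build_incremental_url and its arguments are hypothetical, and it assumes the records carry a numeric timestamp attribute. The orderBy, startAt, and access_token query parameters are part of the Firebase REST API, and ordering by a child attribute generally requires a matching .indexOn rule in the database rules.

```python
import json
from urllib.parse import urlencode


def build_incremental_url(base_url, path, order_by, last_value, access_token=None):
    """Build a Firebase REST URL returning children whose `order_by`
    attribute is greater than or equal to `last_value`.

    Hypothetical helper for illustration only.
    """
    params = {
        # Firebase REST expects JSON-encoded values, so a string such as
        # timestamp is sent with its surrounding double quotes.
        "orderBy": json.dumps(order_by),
        "startAt": json.dumps(last_value),
    }
    if access_token:
        params["access_token"] = access_token
    return f"{base_url.rstrip('/')}/{path.strip('/')}.json?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder database URL and last-seen timestamp (epoch milliseconds).
    print(build_incremental_url(
        "https://example.firebaseio.com", "gps", "timestamp", 1700000000000
    ))
```

The JSON-encoded orderBy value is the reason the attribute name appears wrapped in an extra pair of double quotes in slice configurations.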

Steps to add this slice to your installation

The steps to launch your slice are:

  1. Generate Firebase API keys
  2. Specify the slice config
  3. Add the Firebase slice

1. Generate Firebase API keys

Setup requirements

Before getting started, make sure you have the following:

  • Admin access in your Firebase console

Setup instructions

You can generate your access token (auth_token) using the following steps:

  1. Access tokens are generated using a service account with the proper permissions on your Realtime Database. If you do not already have a service account key file, click the Generate New Private Key button at the bottom of the Service Accounts section of the Firebase console to generate one.
  2. Once you have a service account key file, you can use one of the Google API client libraries to generate a Google OAuth2 access token with the scope
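
As a rough sketch of step 2, the token can be minted with the google-auth Python library (installed with pip install google-auth requests). The function name fetch_access_token and the key file path are placeholders, and the OAuth2 scopes are deliberately left as an argument because the required scope is not spelled out here. Take the scope values from the Firebase documentation.

```python
def fetch_access_token(key_file, scopes):
    """Exchange a service account key file for a short-lived OAuth2 access token.

    `key_file` is the JSON key downloaded from the Firebase console;
    `scopes` is the list of OAuth2 scopes to request (supplied by the caller).
    """
    if not scopes:
        raise ValueError("at least one OAuth2 scope is required")

    # Imported lazily so this module still loads where google-auth is absent.
    from google.auth.transport.requests import Request
    from google.oauth2 import service_account

    credentials = service_account.Credentials.from_service_account_file(
        key_file, scopes=scopes
    )
    # Performs a network call to Google's token endpoint.
    credentials.refresh(Request())
    return credentials.token
```

The returned string is the value to use as the token input parameter. Access tokens expire, so a long-running setup would need to refresh them periodically.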

2. Specify the slice config

To get a template for the Firebase slice configuration, save the output of the describe --input-parameters command as follows:

datacoral collect describe --slice-type firebase \
--input-parameters > firebase_parameters_file.json

Necessary input parameters:

  • token - your auth_token from step 1 above
  • url - URL of your database

Example template:

{
  "token": "test",
  "url": "https://gardener-app.firebaseio.com/locationsDBv2",
  "schedule": "0 0 * * *",
  "loadunits": {
    "gardeners_emails": {
      "config": {
        "uri": "URL.json",
        "fanout": true,
        "nextLoadUnit": "gardeners_gps_data"
      }
    },
    "gardeners_gps_data": {
      "config": {
        "split": true,
        "uri": "URL/SPLIT.json",
        "indexOn": "\"timestamp\"",
        "orderBy": "\"timestamp\"",
        "map": {
          "list": "",
          "item": {
            "email_id": "",
            "accuracy": "accuracy",
            "latitude": "latitude",
            "longitude": "longitude",
            "meta_batterylevel": "meta.batteryLevel",
            "timestamp_raw": "timestamp",
            "timestamp": "timestamp"
          },
          "operate": [
            {
              "run": "(function(){return context.eventParams.split;})",
              "on": "email_id"
            },
            {
              "run": "(function(val){return moment(val).format('YYYY-MM-DD HH:mm:ss')})",
              "on": "timestamp"
            }
          ]
        }
      },
      "ignoreSchedule": "true"
    }
  }
}
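
To illustrate what the item map and operate entries in the template appear to describe, here is a small sketch. It rests on assumptions: that each item entry maps an output column to a dotted path in the source record, that an empty path means the column is filled later by an operate function, and that timestamps are epoch milliseconds. The helper names are hypothetical, and the formatting uses UTC, whereas moment formats in local time by default.

```python
from datetime import datetime, timezone


def get_path(record, dotted_path):
    """Follow a dotted path such as 'meta.batteryLevel' through nested dicts."""
    value = record
    for key in dotted_path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def apply_item_map(record, item_map):
    """Return one flat row: output column -> value found at the source path.

    Columns with an empty path are left as None (assumed to be filled later).
    """
    return {
        column: get_path(record, path) if path else None
        for column, path in item_map.items()
    }


def format_timestamp(epoch_ms):
    """Rough analogue of moment(val).format('YYYY-MM-DD HH:mm:ss'), in UTC."""
    parsed = datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)
    return parsed.strftime("%Y-%m-%d %H:%M:%S")


if __name__ == "__main__":
    record = {"accuracy": 5, "meta": {"batteryLevel": 80}, "timestamp": 1500000000000}
    item_map = {
        "email_id": "",
        "accuracy": "accuracy",
        "meta_batterylevel": "meta.batteryLevel",
        "timestamp_raw": "timestamp",
        "timestamp": "timestamp",
    }
    row = apply_item_map(record, item_map)
    row["timestamp"] = format_timestamp(row["timestamp_raw"])
    print(row)
```

Under these assumptions, the nested meta.batteryLevel attribute ends up in a flat meta_batterylevel column, while timestamp_raw keeps the original value alongside the formatted timestamp.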

3. Add the Firebase slice

datacoral collect add --slice-type firebase --slice-name <slice-name> --parameters-file <params-file>
  • slice-name: Name of your slice. A schema with your slice-name is automatically created in your warehouse
  • params-file: File path to your input parameters file, e.g. firebase_parameters_file.json
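
Before running the add command, it can help to check that the parameters file parses and contains the necessary input parameters. The following is a minimal sketch that assumes the file is strict JSON. The helper names are hypothetical and are not part of the datacoral CLI.

```python
import json

# The two parameters this guide lists as necessary.
REQUIRED_KEYS = ("token", "url")


def missing_parameters(params):
    """Return the required keys that are absent or empty in `params`."""
    return [key for key in REQUIRED_KEYS if not params.get(key)]


def check_parameters_file(path):
    """Load a parameters file and report which required keys are missing."""
    with open(path, encoding="utf-8") as handle:
        params = json.load(handle)  # raises ValueError on malformed JSON
    return missing_parameters(params)


if __name__ == "__main__":
    missing = check_parameters_file("firebase_parameters_file.json")
    print("missing parameters:", missing or "none")
```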

Slice output

Data stored in AWS S3 is partitioned by date and time in the following bucket: s3://datacoral-data-bucket/<sliceName>

AWS Redshift: the schema name is the same as the slice name. Tables produced by the slice are:

  • schema.loadunit_name

Questions? Interested?

If you have questions or feedback, feel free to reach out at hello@datacoral.co or request a demo.