The easiest way to ingest data into Peak is by performing a direct file transfer.

This is typically done at the start of a project or on an ad hoc basis, and can be done via a simple drag and drop, FTP/SFTP, or a signed URL.

This article explains how to create a feed for files uploaded using FTP or SFTP.


Getting to the screens

To make an FTP/SFTP connection:

  1. Go to Dock > Data Sources.
    The Feeds screen appears.
  2. Click ADD.
    The Choose Connector screen appears.
  3. Go to the File Storage section and click FTP/SFTP.
    The Create Feed screen appears.


Process Overview

There are four stages that need to be completed when creating a data feed:

  • Connection
  • Import Configuration 
  • Destination
  • Trigger

To find out how to create new and edit existing data feeds, see Managing your data feeds.



Connection

When setting up a connection, you can either use one that has been preconfigured or create a new one.

If you are using a preconfigured connection, you can leave the configuration parameters as they are or edit them.

To use a preconfigured connection:

  1. At the Connection stage, from the Select Connection drop-down, choose the required connection.
    The drop-down will be empty if no connections have previously been configured.
  2. Enter the required connection parameters.
    See below for details.
  3. Click SAVE to save the parameters and move to the next step.


To create a new connection:

  1. At the Connection stage, click NEW CONNECTION.

  2. Enter the required connection parameters.
    See below for details.
  3. Click SAVE to save the parameters and move to the next step.


Connection Parameters

Complete these fields:

Connection Name: Enter a suitable name for the connection using alphanumeric characters.

Protocol: Choose either FTP or SFTP.

Encryption: This only applies when FTP is used. Choose from:
  • Use explicit FTP over TLS
  • Use plain FTP
For more details, see File Transfer Protocol (FTP).

Host Address: Enter the address of the host server, for example ftp.example.com.

Port: Enter the port number. 21 is used as the default.

Username: This is the FTP account login ID.

Password (optional): This is the FTP account login password.

Use Private Key: This is only available for SFTP.
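
If you want to sanity-check the connection details before entering them into Peak, here is a minimal sketch using Python's standard ftplib module. This is not part of Peak; the host, port and credentials are placeholders for your own values.

    # Quick credential check using Python's standard library (not a Peak API).
    from ftplib import FTP_TLS

    ftp = FTP_TLS()                      # explicit FTP over TLS (AUTH TLS)
    ftp.connect("ftp.example.com", 21)   # host address and port
    ftp.login("your-username", "your-password")
    ftp.prot_p()                         # switch the data channel to TLS as well
    print(ftp.nlst())                    # list remote files to confirm access
    ftp.quit()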


Import Configuration

The Import Configuration screen enables you to specify the type of file that you are importing and how the data will be formatted and loaded.


Completing the fields

File path

The accepted file types are CSV, TXT, JSON, XML and GZIP.

Validate the file path before you continue.

After the file has been validated, a preview of the file appears.
File type

Choose the type of file: CSV, JSON or XML


CSV

Value separators can be:

  • Comma
  • Tab
  • Pipe
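
To see how each separator behaves, here is a small local illustration using Python's built-in csv module; this is just a sketch, not Peak's own parser:

    # Local illustration of the separator options (not Peak's parser).
    import csv, io

    sample = "id|name|qty\n1|widget|5\n"
    # delimiter="," for Comma, "\t" for Tab, "|" for Pipe
    for row in csv.reader(io.StringIO(sample), delimiter="|"):
        print(row)   # ['id', 'name', 'qty'] then ['1', 'widget', '5']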


XML


Feed load type


Primary key (optional)

The primary key is only mandatory for an upsert feed, because an upsert uses the key to decide whether an incoming row updates an existing row or is inserted as a new one.
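
A minimal sketch of what this means, using a plain Python dict as a stand-in for the destination table (illustration only, not Peak's implementation):

    # Upsert illustration: "id" acts as the primary key (not Peak's code).
    existing = {1: {"id": 1, "qty": 5}}
    incoming = [{"id": 1, "qty": 7}, {"id": 2, "qty": 3}]

    for row in incoming:
        existing[row["id"]] = row  # matching key: update; new key: insert

    print(existing)
    # {1: {'id': 1, 'qty': 7}, 2: {'id': 2, 'qty': 3}}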


Feed name

Enter a suitable name for the feed:

  • The name should be meaningful.
  • Only alphanumeric characters and underscores are allowed.
  • It must start with a letter.
  • It must not end with an underscore.
  • It can be up to 50 characters long.
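
As an illustration only (the rules above are what Peak enforces), these constraints can be expressed as a single regular expression; a hypothetical validator is sketched below.

    # Hypothetical validator for the feed name rules above (illustration only).
    import re

    # starts with a letter, alphanumeric/underscore only,
    # does not end with an underscore, 1-50 characters
    FEED_NAME = re.compile(r"^[A-Za-z](?:[A-Za-z0-9_]{0,48}[A-Za-z0-9])?$")

    print(bool(FEED_NAME.match("sales_feed_2021")))  # True
    print(bool(FEED_NAME.match("sales_feed_")))      # False (trailing underscore)
    print(bool(FEED_NAME.match("1sales")))           # False (starts with a digit)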

Destination

When configuring a data connector, stage 3 of the process enables you to choose a destination to store your data.


Choose a destination

The destination is where the customer data is stored by Peak. 

It can be either S3 (Spark processing), Redshift, or both.

S3 (Spark processing)

This is Amazon S3 data storage. 

Apache Spark is used by Peak to process large, unstructured (CSV) datasets on Amazon S3.


Redshift

This is Amazon Redshift data storage.

Data stored using Redshift can be queried using SQL. This makes it possible to run frequent aggregations on really large datasets.

Redshift is a relational database, so any data that is fed into it has to map exactly, column by column.

Any failed rows are flagged and written to a separate table.


Failed row threshold

This is the number of failed rows that is acceptable before the feed is stopped.

The threshold should reflect the total number of rows being written to the table and the proportion of failures that is acceptable before the quality of the data could be considered compromised. For example, if a feed writes 1,000,000 rows and a 0.1% failure rate is acceptable, a threshold of 1,000 would be appropriate.


Changing the data type of a schema

When specifying the destination for a data connector, you can change the data type of your schema. 

This function is available for all connectors apart from Webhooks and the old agent-based feeds.

Choose the required column name or field name and click the dropdown icon next to the Suggested Data Type. The following data types are available:

  • STRING
  • INTEGER
  • NUMERIC 
  • TIMESTAMP
  • DATE
  • BOOLEAN
  • JSON

Note:
In the current release, TIMESTAMPTZ is not supported.
Any data in this format will be ingested as a string by default.



Setting a trigger

From the Trigger stage, you can define triggers and watchers:

  • Triggers enable you to define when a data feed is run. 
  • Watchers can be added to feeds to provide notifications of feed events to Peak users or other systems.

Triggers

Triggers enable you to define when a data feed is run. There are three types of trigger:

  • Schedule trigger:
    Schedule when the feed runs. A basic and advanced (Cron) scheduler is available.
  • Webhook trigger:
    Trigger a feed to run via a webhook from another system.
  • Run Once trigger:
    Trigger the feed to run once at either a set time or manually from the feed list.

Basic Schedule Trigger

  • Basic schedules use days and time.
  • The feed will run on the selected days (blue).
  • Enter a suitable time or frequency for the tenant’s environment.

Advanced Schedule Trigger

  • Advanced schedules use Cron.
  • Enter the time / frequency as a Cron string.

Cron formatting

A cron expression is a string comprising 6 or 7 fields separated by spaces.


Field           Mandatory   Allowed Values      Allowed Special Characters
Seconds         Yes         0-59                , - * /
Minutes         Yes         0-59                , - * /
Hours           Yes         0-23                , - * /
Day of month    Yes         1-31                , - * ? / L W
Month           Yes         1-12 or JAN-DEC     , - * /
Day of week     Yes         1-7 or SUN-SAT      , - * ? / L #
Year            No          empty, 1970-2099    , - * /


Cron expression examples

Expression           Meaning
0 0 12 * * ?         Trigger at 12pm (noon) every day
0 15 10 * * ? 2021   Trigger at 10:15am every day during the year 2021
0 15 10 ? * 6L       Trigger at 10:15am on the last Friday of every month


Webhook triggers

Webhook triggers are used to trigger a data feed when data on a separate system has been updated. 

Webhooks work in a similar way to regular APIs, but rather than making constant requests to other systems to check for updates, webhooks will only send data when a particular event takes place - in this case when new data is available for the data feed.

Using the webhook URL

The webhook URL is generated by Peak and is unique to the data feed that you are creating or editing. The data source system needs the URL so that it knows where to send the notification.

  1. From the Trigger stage, click Webhook and copy the URL.
    If required, you can generate a new URL by clicking the curved arrow.
  2. Use the URL in the webhook section of the application that you want to receive data from.
    If the system is external to Peak, you will also need to provide it with an API Key for your tenant so that the webhook can be authenticated.
    For more information about generating API Keys, see API Keys.
  3. Once you have generated and copied your webhook URL, click SAVE.
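
If the source system allows you to make the call yourself (for testing, for example), a minimal sketch using Python's requests library could look like this. The webhook URL and API key are placeholders, and the header used to pass the key is an assumption; check how your tenant expects the key to be supplied.

    # Hypothetical test call to a feed's webhook URL.
    import requests

    WEBHOOK_URL = "https://example.com/your-copied-webhook-url"  # placeholder
    API_KEY = "your-tenant-api-key"                              # placeholder

    # The "Authorization" header is an assumption, not a documented Peak contract.
    response = requests.post(WEBHOOK_URL, headers={"Authorization": API_KEY})
    response.raise_for_status()  # raises an error if the trigger was rejected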

Run Once Triggers

Run Once triggers are used to run the feed once at either a set time or manually from the feeds list.

From the Run Type drop-down menu, choose either:

  • Manual:
    This enables you to trigger the feed manually from the feeds list.
    To do this, go to Dock > Data Sources, hover over the feed and click ‘Run now’.
    For more information, see Managing your data feeds.
  • Date and Time:
    The feed will run once at the scheduled date and time.
    The time you enter must be at least 30 minutes from the current time.


Watchers

Watchers can be added to feeds to provide notifications of feed events to Peak users or other systems. 

There are two types of watcher:

  • User watcher:
    These are users of your tenant that will receive a notification within the platform if a feed event occurs.
  • Webhook watcher:
    These are used to trigger or send notifications to other systems or applications when a feed is updated.
    They could include external applications such as Slack or internal Peak functions such as Workflows.

To add a watcher:

  1. From the Trigger step screen, click ADD WATCHER.
  2. Choose either User or Webhook.


User Watchers

These are users of your tenant that will receive a notification within the platform if a feed event occurs.

  1. To choose a tenant user to add as a watcher, click the Search User drop-down.
  2. Choose the data feed events that you want the user to be notified of.  
    You can choose to watch all or a custom selection.
    Once added, users can view notifications by clicking the bell icon at the top of the screen.

Data feed events

Users can be notified of the following data feed events:

  • Create:
    The feed has been created.
  • Execution status:
    The execution status of the feed has changed.
  • Run fail:
    The feed run has failed.
  • Edit / delete:
    The feed has been edited or deleted.
  • Run success:
    The feed has run successfully.
  • No new data:
    There is no new data available on the feed.


Webhook Watchers

These are used to trigger or send notifications to other systems or applications when a feed is updated.
They could include external applications such as Slack or internal Peak functions such as Workflows.

The Webhook URL is taken from the application that you want to trigger if an event occurs. 

If this is a Peak Workflow, this can be taken from the workflow’s trigger step.

The JSON payload is optional. It can be used to pass variables to provide additional information about the feed. Parameters can include:

  • {tenantname}
  • {jobtype}
  • {jobname}
  • {trigger}
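
For example, a hypothetical payload for a Slack incoming webhook could reference the parameters like this; the "text" field is Slack's, and the message wording is illustrative only:

    {
      "text": "Feed {jobname} ({jobtype}) on {tenantname} was triggered by {trigger}"
    }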


Data feed events

Webhooks can be configured for the following data feed events:

  • Run fail:
    The feed run has failed.
  • Run success:
    The feed has run successfully.
  • Running for more than x minutes:
    The feed has been running for more than the specified number of minutes.
  • No new data:
    There is no new data available on the feed.