Peak uses a data lake to store data from multiple sources and in multiple formats.

This guide explains how to connect Peak to either a Peak managed or customer managed data lake when onboarding to the platform.




Process overview

You must configure a data lake before you can start using Peak.

Currently, Peak supports Amazon S3 data lakes.


There are three steps to complete during this process:

  1. Details
    This step lets you name your data lake and choose between a Peak managed or Customer managed configuration.

  2. Configuration
    This step lets you specify the region where your data lake is located.
    If you choose Customer managed, this step also guides you through creating an IAM role in your AWS account so that Peak can connect to your S3 bucket.

  3. Review
    Once you have entered all of your configuration details, this step lets you review everything before saving.


Entering the data lake details

When you sign in to Peak for the first time, you are prompted to connect to a data lake.

  1. To get started, click ADD DATA LAKE and follow the prompts.



  2. Name your data lake connection.
    The name must be unique to your Peak organization. 
    Only alphanumeric characters and underscores are allowed.
    The name cannot be changed after the connection has been set up.




  3. Choose either Peak managed or Customer managed.
  4. Click NEXT to move to the Configuration stage.
    See below for details of each type of configuration.
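
Because the name cannot be changed after setup, it can help to check a candidate name against the character rule in step 2 before entering it. The following is a minimal sketch, not part of Peak: the function name is hypothetical, and it assumes "alphanumeric" means ASCII letters and digits. It cannot check uniqueness within your Peak organization; only Peak can do that.

```python
import re

# Rule from step 2: only alphanumeric characters and underscores.
# Assumption: "alphanumeric" means ASCII letters (A-Z, a-z) and digits (0-9).
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_data_lake_name(name: str) -> bool:
    """Return True if name is non-empty and contains only letters, digits and underscores."""
    return _NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    # Hypothetical candidate names, for illustration only.
    for candidate in ["sales_lake_01", "sales-lake", "sales lake", ""]:
        print(repr(candidate), is_valid_data_lake_name(candidate))
```

A name with a hyphen or a space, such as the second and third candidates, fails the check, as does an empty name.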

Peak managed configuration

This is the quicker option because Peak holds all of the security credentials required to make the connection.

  1. Choose the data lake region where your data will be physically stored.
    Peak then creates and manages the data lake for your organization.

    Make sure that your chosen region complies with your local storage regulations.


  2. Once you have chosen the region, either save it as a draft or click NEXT to move to the Review stage.
    See Reviewing your connection.


Customer managed configuration

During this process, you configure your own Amazon S3 data lake to work with your Peak organization.
You will need to create an IAM role in your AWS account so that Peak can connect to your S3 bucket. Peak generates the IAM policy that you will need when creating the IAM role.

To configure your Amazon S3 data lake to work with Peak:

  1. Choose the data lake region where your data will be physically stored.

    Make sure that your chosen region complies with your local storage regulations.

  2. Enter the bucket name.
    This is the name of the Amazon S3 bucket where your data is stored.

  3. Enter the Root Path.
    This is the path from the root of your S3 bucket.

  4. After entering the bucket and root path, click GENERATE POLICY.
    This generates an AWS Identity and Access Management (IAM) policy so that Peak can access your S3 bucket.



  5. After the policy has been generated, go to the IAM console in your AWS account.

  6. Create an IAM role in your AWS account and add Peak as a trusted entity.
    For more information, see AWS IAM Configuration.

    The IAM role you create is linked to the data lake region, bucket name and path. This means that if there is a change in user, you will need to generate a new IAM policy and update the role in your AWS account.

    The data lake region, bucket name and path can be edited while the data lake configuration is in draft state. Once setup is complete, no further edits are possible.

  7. Once the IAM role is created in your account, copy the IAM role ARN and paste it into the IAM Role ARN field.
    Peak will use this to connect to your Amazon S3 bucket.


  8. Click TEST to test the connection.
    If successful, proceed to the review stage.
    If it is unsuccessful, check your connection details and try again.
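
Steps 6 and 7 can be sketched as follows. This is an illustration only, under stated assumptions: the principal ARN below is a made-up placeholder (the real value comes from the AWS IAM Configuration instructions), both function names are hypothetical, and the trust policy shows only the general shape AWS expects for a role that another account can assume. The policy that Peak generates and the AWS IAM Configuration page remain the authoritative sources. The ARN check covers the standard `aws` partition only.

```python
import json
import re

# Hypothetical placeholder. Replace with the principal given in Peak's
# AWS IAM Configuration instructions; this account ID is made up.
PEAK_PRINCIPAL_ARN = "arn:aws:iam::111111111111:root"


def build_trust_policy(principal_arn: str) -> dict:
    """Build a trust policy document that lets principal_arn assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# General shape of an IAM role ARN in the standard "aws" partition:
# arn:aws:iam::<12-digit account ID>:role/<optional path/><role name>
_ROLE_ARN_PATTERN = re.compile(r"arn:aws:iam::\d{12}:role/[A-Za-z0-9_+=,.@/-]+")


def looks_like_role_arn(value: str) -> bool:
    """Return True if value has the general shape of an IAM role ARN."""
    return _ROLE_ARN_PATTERN.fullmatch(value.strip()) is not None


if __name__ == "__main__":
    # Print the trust policy as JSON, ready to compare with what you
    # enter when adding the trusted entity to the role.
    print(json.dumps(build_trust_policy(PEAK_PRINCIPAL_ARN), indent=2))

    # Check a role ARN before pasting it into the IAM Role ARN field.
    # This ARN is a made-up example.
    print(looks_like_role_arn("arn:aws:iam::123456789012:role/peak-data-lake"))
```

The shape check catches common copy-and-paste mistakes, such as pasting a bucket ARN or only the role name. It does not confirm that the role exists or that Peak can assume it; the TEST button in step 8 does that.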


Reviewing your connection

Before you complete the configuration process, you can review the details you entered at each stage.

  1. To make changes, click Edit next to the option you want to change.


  2. Once the details are correct, click FINISH.
    You will be taken back to the Data Bridge listing screen and your newly configured data lake will be shown as ‘Active’.