Solution Recipe 24: AWS S3-Triggered Ingestion via Klaviyo’s SFTP

August 25, 2023

Solution Recipes are tutorials to achieve specific objectives in Klaviyo. They can also help you master Klaviyo, learn new third-party technologies, and come up with creative ideas. They are written mainly for developers and technically-advanced users.

What you’ll learn:

In this Solution Recipe, we will outline how to connect AWS S3 to Klaviyo’s SFTP to trigger profile ingestion when a new file is added to an S3 bucket. While this recipe covers AWS S3, you can apply this solution to connect any code hosting platform to Klaviyo’s SFTP.

Why it matters:

Increasingly, customers need a reliable way to import large amounts of data with little manual effort. Many Klaviyo customers rely on SFTP ingestion to ingest data quickly and accurately. By leveraging AWS, specifically S3 and Lambda, users can trigger ingestion when a new file is added to an S3 bucket, creating a more automatic ingestion schedule that scales for large data sets.

Level of sophistication:

Moderate

Introduction

It is time-consuming to update and import a large number of profiles (e.g., 500K+). We will outline an AWS-based solution that utilizes an S3 bucket and Klaviyo’s SFTP import tool to reduce the time required to make updates and automate key parts of the process.

When using an S3-triggered SFTP ingestion, you can streamline and automate the process of importing data, which ultimately improves the overall experience of accurately and effectively maintaining customer data and leveraging Klaviyo’s powerful marketing tools. Some use cases that may require this type of solution include a bulk daily sync to update profile properties with rewards data or updating performance results from a third-party integration.

Other key profile attributes you may want to update in bulk and leverage for greater personalization in Klaviyo include:

Birthdate
Favorite brands
Loyalty
Properties from off-line sources

In this solution recipe, we’ll walk through:

Setting up your SFTP credentials
Configuring your AWS account with the required IAM, S3 and Lambda settings
Deploying code to programmatically ingest a CSV file into Klaviyo via our SFTP server.

The goal is to streamline and accelerate the process of ingesting profile data into Klaviyo using AWS services and Klaviyo’s SFTP import tool.

Ingredients

GitHub repo link
Klaviyo SFTP Import tool
AWS Lambda with S3
Python Libraries: pandas, pysftp, boto3

Instructions

We will provide in-depth, step-by-step instructions throughout this Solution Recipe. The broad overview of the steps is:

1. Create an SSH key pair and prepare for SFTP ingestion.
2. Set up an AWS account with access to IAM, S3, and Lambda:
a. Create an IAM execution role for Lambda
b. Create and record security credentials
c. Create an S3 bucket
d. Configure a Lambda function with the IAM execution role and set the S3 bucket as the trigger
e. Configure AWS environment variables
3. Deploy the Lambda function and monitor progress to ensure profiles are updated and displayed in Klaviyo’s UI as anticipated.

Instructions for SFTP configuration

Step 1: Create SSH key pair and prepare for SFTP ingestion

SFTP is only available to Klaviyo users with an Owner, Admin, or Manager role. To start, you’ll need to generate a public/private key pair on your local machine using ssh-keygen or a tool of your choice. Refer to this documentation for the supported SSH key formats.
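
As a rough sketch, the key pair could also be generated from Python by shelling out to ssh-keygen. The RSA key type, 4096-bit size, empty passphrase, and the function names are assumptions for illustration only; check the supported key formats before relying on them.

```python
import subprocess
from pathlib import Path


def keygen_command(private_key_path):
    """Build the ssh-keygen call for an RSA key pair (type and size are assumptions)."""
    return [
        "ssh-keygen",
        "-t", "rsa",                  # key type
        "-b", "4096",                 # key size in bits
        "-N", "",                     # empty passphrase so the key can be used unattended
        "-f", str(private_key_path),  # private key path; the public key gets a .pub suffix
    ]


def generate_key_pair(private_key_path):
    """Run ssh-keygen and return the public key text to paste into Klaviyo."""
    subprocess.run(keygen_command(private_key_path), check=True)
    return Path(f"{private_key_path}.pub").read_text().strip()
```

Calling generate_key_pair with a path that does not exist yet writes the private key to that path and returns the public key. ssh-keygen prompts before overwriting an existing key, so pick a fresh file name.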

Once you have generated your keys:

In Klaviyo, click your account name in the lower left corner and select “Integrations”
On the Integrations page, click “Manage Sources” in the upper right, then select “Import via SFTP”
Click “Add a new SSH Key”
Paste your public key into the “SSH Key” box
Click “Add key”

Record the following details so you can add them into your AWS Environment Variables in the next step:

Server: sftp.klaviyo.com
Username: Your_SFTP_Username (abc123_def456)
SSH Private Key: the private key associated with the public key generated in previous steps

You can read up on our SFTP tool by visiting our developer portal.

Instructions for AWS implementation

Step 2A. Create an IAM Execution role for Lambda

Create an IAM role with AWS service as the trusted entity and Lambda as the use case.
Attach the following policies to the role:
- AmazonS3FullAccess
- AmazonAPIGatewayInvokeFullAccess
- AWSLambdaBasicExecutionRole

Step 2B. Set up your access key ID and secret access key

You will need an access key ID and secret access key so that files can be uploaded to S3 from your local machine.

Navigate to “Users” in the IAM console
Choose your IAM username
Open the “Security credentials” tab, and then choose “Create access key”
To see the new access key, choose “Show”
To download the key pair, choose “Download .csv file”. Store the file in a secure location. You will add these values into your AWS Environment Variables.

Note: You can retrieve the secret access key only when you create the key pair. Like a password, you can’t retrieve it later. If you lose it, you must create a new key pair.

Step 2C. Create S3 bucket

Navigate to the Amazon S3 console to create the S3 bucket.

Step 2D. Configure a Lambda function

The Lambda function is the component of this setup that opens the SFTP connection and ingests the CSV file.

Create a new Lambda function with the execution role configured in step 2A.
Update the Trigger settings with the S3 bucket created in step 2C.

Add the corresponding files from the GitHub repo to your Lambda function.

We’ll do a code deep dive in step 3.
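
When the S3 trigger fires, Lambda passes the function an event describing the uploaded object. The handler in this recipe reads the file name from an environment variable, but as an optional sketch, the bucket and key could be read from the event itself. The helper name extract_s3_objects and the sample values are hypothetical.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return a (bucket, key) pair for every record in an S3 event."""
    objects = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        bucket = s3_info["bucket"]["name"]
        # Object keys arrive URL-encoded (a space becomes '+'), so decode them.
        key = unquote_plus(s3_info["object"]["key"])
        objects.append((bucket, key))
    return objects


# Trimmed-down example of the event shape S3 sends to Lambda.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "uploads/unmapped+profiles.csv"},
            }
        }
    ]
}

print(extract_s3_objects(sample_event))
# [('example-bucket', 'uploads/unmapped profiles.csv')]
```

Reading the key from the event lets one function handle any file name dropped into the bucket, rather than only the one configured in S3_FILE_NAME.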

Step 2E. Configure AWS environment variables

Navigate to the “Configuration” tab and add the following keys to your Environment Variables with their corresponding values.

# AWS access and secret keys
ACCESS_KEY_ID
SECRET_ACCESS_KEY

S3_BUCKET_NAME

# folder path where the S3 file is saved locally (inside Lambda, only /tmp is writable)
FOLDER_PATH
Example: /Users/tyler.berman/Documents/SFTP/

S3_FILE_NAME
Example: unmapped_profiles.csv

LOCAL_FILE_NAME
Example: unmapped_profiles_local.csv

MAPPED_FILE_NAME
Example: mapped_profiles.csv

# add your 6-character List ID found in the URL of the subscriber list
LIST_ID
Example: ABC123

PROFILES_PATH = /profiles/profiles.csv

# folder path to where your SSH private key is stored
PRIVATE_KEY_PATH
Example: /Users/tyler.berman/.ssh/id_rsa

HOST = sftp.klaviyo.com

# add your assigned username found in the UI of Klaviyo's SFTP import tool
USERNAME
Example: abc123_def456
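
Because the code in step 3 reads every one of these values from os.environ, a missing key only surfaces as a KeyError partway through a run. A small check at startup, sketched below, can report all missing keys at once. The helper names missing_variables and load_settings are hypothetical and not part of the recipe's repo.

```python
import os

# Every key listed in step 2E.
REQUIRED_VARIABLES = [
    "ACCESS_KEY_ID", "SECRET_ACCESS_KEY", "S3_BUCKET_NAME", "FOLDER_PATH",
    "S3_FILE_NAME", "LOCAL_FILE_NAME", "MAPPED_FILE_NAME", "LIST_ID",
    "PROFILES_PATH", "PRIVATE_KEY_PATH", "HOST", "USERNAME",
]


def missing_variables(environ=None):
    """Return the required keys that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARIABLES if not environ.get(name)]


def load_settings(environ=None):
    """Return all required settings as a dict, or raise naming every missing key."""
    environ = os.environ if environ is None else environ
    missing = missing_variables(environ)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARIABLES}
```

Calling load_settings() at the top of the handler would fail fast with a single readable message instead of partway through the download or upload.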

Step 3: Deploy your code

Let’s review the code deployed in your AWS instance.

Programmatically add CSV file to S3:

import boto3
import os


session = boto3.Session(
    aws_access_key_id = os.environ['ACCESS_KEY_ID'],
    aws_secret_access_key = os.environ['SECRET_ACCESS_KEY']
)

s3 = session.resource('s3')

#Define the bucket name and file names
bucket_name = os.environ['S3_BUCKET_NAME']
#The name you want to give to the file in S3
s3_file_name = os.environ['S3_FILE_NAME']
#The file on your local machine to upload
local_file = os.environ['LOCAL_FILE_NAME']

s3.meta.client.upload_file(Filename=local_file, Bucket=bucket_name, Key=s3_file_name)

Download S3 file:

import boto3


def download_file_from_s3(aws_access_key_id, aws_secret_access_key, bucket_name, s3_file_name, downloaded_file):
    session = boto3.Session(
        aws_access_key_id=aws_access_key_id,
        aws_secret_access_key=aws_secret_access_key
    )

    s3 = session.resource('s3')
    
    s3.meta.client.download_file(Bucket=bucket_name, Key=s3_file_name, Filename=downloaded_file)
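
The downloaded_file argument must point somewhere the function can write. Inside Lambda that means /tmp, so one possible way to build the destination path is sketched below. The helper name local_path_for is hypothetical, and the sketch assumes the file should keep its S3 file name.

```python
import os

# The only writable directory inside a Lambda execution environment.
LAMBDA_WRITABLE_DIR = "/tmp"


def local_path_for(s3_key, folder=LAMBDA_WRITABLE_DIR):
    """Map an S3 key such as 'uploads/file.csv' to a local path in the folder."""
    filename = os.path.basename(s3_key)
    if not filename:
        raise ValueError(f"S3 key {s3_key!r} does not name a file")
    # os.path.join inserts the separator, so the folder needs no trailing slash.
    return os.path.join(folder, filename)
```

For example, local_path_for("uploads/unmapped_profiles.csv") yields a path under /tmp that can be passed as downloaded_file.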

Prepare CSV file for ingestion:

import pandas as pd
import os


#anticipate column mapping based on commonly used headers
def suggest_column_mapping(loaded_file):
    column_mapping = {
        'Email': ['EmailAddress', 'person.email', 'Email', 'email', 'emailaddress', 'email address', 'Email Address', 'Emails'],
        'PhoneNumber': ['Phone#', 'person.number', 'phone', 'numbers', 'phone number', 'Phone Number']
    }

    suggested_mapping = {}
    for required_header, old_columns in column_mapping.items():
        for column in old_columns:
            if column in loaded_file.columns:
                suggested_mapping[column] = required_header

    return suggested_mapping


#map column headers of S3 file and add List ID column
def map_column_headers(loaded_file):
    mapped_file = loaded_file.rename(columns=suggest_column_mapping(loaded_file), inplace=False)
    mapped_file['List ID'] = os.environ['LIST_ID']

    final_file = os.environ['FOLDER_PATH'] + os.environ['MAPPED_FILE_NAME']
    mapped_file.to_csv(final_file, index=False)

    return final_file
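
To see how the mapping behaves without loading pandas, here is a dependency-free sketch that applies the same idea to a plain list of headers. The alias lists are shortened for illustration, and find_collisions is a hypothetical helper that flags one edge case: if a file contains two aliases for the same field, both columns would be renamed to the same header.

```python
# Shortened version of the alias lists used above (illustrative only).
COLUMN_ALIASES = {
    "Email": ["EmailAddress", "Email", "email", "email address", "Email Address"],
    "PhoneNumber": ["Phone#", "phone", "phone number", "Phone Number"],
}


def suggest_mapping(headers):
    """Map each recognized header to the header name required for import."""
    mapping = {}
    for required_header, aliases in COLUMN_ALIASES.items():
        for alias in aliases:
            if alias in headers:
                mapping[alias] = required_header
    return mapping


def find_collisions(headers):
    """Return required headers that more than one input column would map to."""
    collisions = {}
    for required_header, aliases in COLUMN_ALIASES.items():
        matched = [header for header in headers if header in aliases]
        if len(matched) > 1:
            collisions[required_header] = matched
    return collisions


print(suggest_mapping(["email address", "Phone#", "First Name"]))
# {'email address': 'Email', 'Phone#': 'PhoneNumber'}
print(find_collisions(["Email", "email", "phone"]))
# {'Email': ['Email', 'email']}
```

Unrecognized headers such as "First Name" are left untouched. Running a collision check before renaming is one way to catch files that would otherwise end up with duplicate column names.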

Establish SFTP server connection:

import pysftp
import os


#ingest S3 file via SFTP
def connect_to_sftp_and_import_final_csv(final_file):
    # the with block closes the connection on exit, even if the upload fails
    with pysftp.Connection(host=os.environ['HOST'],
                           username=os.environ['USERNAME'],
                           private_key=os.environ['PRIVATE_KEY_PATH']) as sftp:
        print(f"Connected to {os.environ['HOST']}!")
        # any upload error propagates with its original type and traceback
        sftp.put(final_file, os.environ['PROFILES_PATH'])
        print(f"Imported {final_file}. Check your inbox for SFTP job details. View progress at https://www.klaviyo.com/sftp/set-up")

    print(f"Connection to {os.environ['HOST']} has been closed.")

Put it all together in a lambda_handler function to ingest S3 file via SFTP:

import configure_csv
import sftp_connection
import s3_download
import os
import pandas as pd
import json


def lambda_handler(event, context):
    # local path where the S3 file is saved before mapping
    local_path = os.environ['FOLDER_PATH'] + os.environ['LOCAL_FILE_NAME']
    s3_download.download_file_from_s3(os.environ['ACCESS_KEY_ID'],
                                      os.environ['SECRET_ACCESS_KEY'],
                                      os.environ['S3_BUCKET_NAME'],
                                      os.environ['S3_FILE_NAME'],
                                      local_path)
    loaded_file = pd.read_csv(local_path)
    final_file = configure_csv.map_column_headers(loaded_file)
    sftp_connection.connect_to_sftp_and_import_final_csv(final_file)

    return {
        'statusCode': 200,
        'body': json.dumps('successfully ran lambda'),
    }

Impact

Using Klaviyo’s SFTP tool makes data ingestion faster and more efficient. When coupled with the power of AWS’s S3 and Lambda services, you can boost its automation and scalability. With this configuration, your team will be able to manage data and execute ingestion with speed and accuracy, reducing the time and burden of updating profiles manually. Moreover, you can significantly improve data accuracy and mitigate the risk of errors during the ingestion process, ensuring the reliability and integrity of the data you’re using.

Overall, this solution optimizes the efficiency and effectiveness of leveraging relevant data in Klaviyo while streamlining operations and enhancing overall performance.

Nina Ephremidze

Tyler Berman