How to get started with S3 on Polygon.io

Explore vast financial datasets with Polygon.io’s S3 integration. Easy setup with AWS CLI, Rclone, MinIO, or Boto3. See guide for details.

Polygon.io is proud to offer extensive financial datasets, easily accessible thanks to S3. This guide provides detailed steps on setting up various S3 clients to efficiently retrieve the data you need.

Pre-requisites:

You need a valid flat files subscription with Polygon.io. Once subscribed, you will find your unique Access Key and Secret Key in the Dashboard. The endpoint and bucket are the same for all users:

Endpoint: https://files.polygon.io
Bucket: flatfiles
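
Every client in this guide needs the same four values: the two keys, the endpoint, and the bucket. As a minimal sketch, assuming Python and two hypothetical environment variable names (POLYGON_S3_ACCESS_KEY and POLYGON_S3_SECRET_KEY are chosen for illustration and are not defined by Polygon.io), the keys can be kept out of source code like this:

```python
import os

# Values stated in this guide; identical for every user.
POLYGON_S3_ENDPOINT = "https://files.polygon.io"
POLYGON_S3_BUCKET = "flatfiles"


def load_s3_settings(env=os.environ):
    """Collect connection settings, reading the keys from the environment.

    The variable names below are hypothetical, picked for this sketch only.
    """
    try:
        access_key = env["POLYGON_S3_ACCESS_KEY"]
        secret_key = env["POLYGON_S3_SECRET_KEY"]
    except KeyError as missing:
        # Fail early with the name of the variable that is not set
        raise RuntimeError(f"Set {missing.args[0]} before running") from None
    return {
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
        "endpoint_url": POLYGON_S3_ENDPOINT,
        "bucket": POLYGON_S3_BUCKET,
    }
```

The first three entries use the same parameter names as the Boto3 example in the Python section of this guide; bucket is kept alongside them for convenience.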

Supported S3 Clients

Polygon.io rigorously tests and supports four widely recognized S3 clients to ensure compatibility and user convenience. These include:

  1. AWS S3 CLI
  2. Rclone
  3. MinIO
  4. Python Boto3 SDK

Below, you’ll find the setup process for each client.

AWS S3 CLI

The AWS Command Line Interface (CLI) provides aws s3 commands for working with Amazon S3 and, through the --endpoint-url option, with S3-compatible endpoints such as Polygon.io’s.

Here’s how to get started:

  • Install the AWS S3 CLI from the official website.
  • Run aws configure in your command line and enter your Access Key ID and Secret Access Key.
  • To interact with Polygon.io’s S3 files, use aws s3 commands:

# Configure your S3 Access and Secret keys
aws configure set aws_access_key_id Your-Access-Key
aws configure set aws_secret_access_key Your-Secret-Key

# List
aws s3 ls s3://flatfiles/ --endpoint-url https://files.polygon.io

# Copy
aws s3 cp s3://flatfiles/us_stocks_sip/trades_v1/2024/03/2024-03-07.csv.gz . --endpoint-url https://files.polygon.io
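
The object key in the copy command above appears to follow a date-based layout (asset class, dataset, year, month, then a file named after the full date). The sketch below builds such a key in Python. The layout is inferred only from that single example, and flatfile_key is a hypothetical helper name, so confirm the real paths with a listing before relying on it:

```python
from datetime import date


def flatfile_key(asset_class: str, dataset: str, day: date) -> str:
    """Build an object key shaped like the example in this guide.

    Assumes the <asset_class>/<dataset>/YYYY/MM/YYYY-MM-DD.csv.gz layout
    seen in the copy example; other datasets may be organized differently.
    """
    return f"{asset_class}/{dataset}/{day:%Y}/{day:%m}/{day:%Y-%m-%d}.csv.gz"


key = flatfile_key("us_stocks_sip", "trades_v1", date(2024, 3, 7))
print(key)                       # us_stocks_sip/trades_v1/2024/03/2024-03-07.csv.gz
print(f"s3://flatfiles/{key}")   # the form used by aws s3 cp
```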

Rclone

Rclone facilitates data synchronization between your local system and S3 storage. Follow these steps to configure Rclone with Polygon.io:

  • Download and install Rclone from the official Rclone website.
  • Initiate a new remote storage configuration with rclone config.
  • An example configuration is provided below for reference:

# Set up your rclone configuration
rclone config create s3polygon s3 env_auth=false access_key_id=ACCESS_KEY_ID secret_access_key=SECRET_ACCESS_KEY endpoint=https://files.polygon.io

# List
rclone ls s3polygon:flatfiles

# Copy
rclone copy s3polygon:flatfiles/us_stocks_sip/trades_v1/2024/03/2024-03-07.csv.gz .

MinIO

The MinIO client (mc) works with any S3-compatible cloud storage. Here’s how to configure it with Polygon.io:

  • If you haven’t already, download and set up the MinIO client by following the MinIO documentation.
  • An example configuration is provided below for reference:

# Register an alias; mc saves the Access and Secret keys in its config file ~/.mc/config.json
mc alias set s3polygon https://files.polygon.io Your-Access-Key Your-Secret-Key

# List
mc ls s3polygon/flatfiles

# View the first 4 lines (gunzip -c is used because gzcat is not available on every system)
mc cat s3polygon/flatfiles/us_stocks_sip/trades_v1/2024/03/2024-03-07.csv.gz | gunzip -c | head -4

# Copy
mc cp s3polygon/flatfiles/us_stocks_sip/trades_v1/2024/03/2024-03-07.csv.gz .

Python Boto3 SDK

Boto3 is the Amazon Web Services (AWS) SDK for Python, which enables Python developers to write software that makes use of Amazon services like S3. To set it up, follow these steps:

  • Install Boto3 by running
    pip install boto3
    if you haven’t already.
  • Utilize the following script to interact with Polygon.io data:

import boto3
from botocore.config import Config

# Initialize a session using your credentials
session = boto3.Session(
    aws_access_key_id='Your-Access-Key',
    aws_secret_access_key='Your-Secret-Key',
)

# Create a client with your session and specify the endpoint
s3 = session.client(
    's3',
    endpoint_url='https://files.polygon.io',
    config=Config(signature_version='s3v4'),
)

# List Example
# Initialize a paginator for listing objects
paginator = s3.get_paginator('list_objects_v2')

# Choose the appropriate prefix depending on the data you need:
# - 'global_crypto' for global cryptocurrency data
# - 'global_forex' for global forex data
# - 'us_indices' for US indices data
# - 'us_options_opra' for US options (OPRA) data
# - 'us_stocks_sip' for US stocks (SIP) data
prefix = 'us_stocks_sip'  # Example: Change this prefix to match your data need

# List objects using the selected prefix
for page in paginator.paginate(Bucket='flatfiles', Prefix=prefix):
    # 'Contents' is absent from a page when no objects match the prefix
    for obj in page.get('Contents', []):
        print(obj['Key'])

# Copy example
# Specify the bucket name
bucket_name = 'flatfiles'

# Specify the S3 object key name
object_key = 'us_stocks_sip/trades_v1/2024/03/2024-03-07.csv.gz'

# Specify the local file name and path to save the downloaded file
# This splits the object_key string by '/' and takes the last segment as the file name
local_file_name = object_key.split('/')[-1]

# This constructs the full local file path
local_file_path = './' + local_file_name

# Download the file
s3.download_file(bucket_name, object_key, local_file_path)
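
Once the download finishes, it can help to peek at the file before loading it fully, much as the MinIO example does with mc cat and head. The following is a minimal sketch using only the Python standard library. It assumes the file is UTF-8 text, and head_gz is a hypothetical helper name:

```python
import gzip
from itertools import islice
from pathlib import Path


def head_gz(path, n=4):
    """Return the first n lines of a gzip-compressed text file."""
    # Text mode decompresses on the fly, so only the start of the file is read
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        return [line.rstrip("\r\n") for line in islice(handle, n)]


# Path taken from the download example above; skipped if the file is absent.
downloaded = Path("./2024-03-07.csv.gz")
if downloaded.exists():
    for line in head_gz(downloaded):
        print(line)
```

Because islice stops after n lines, the rest of the archive is never decompressed, which keeps the preview fast even for large files.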

Next Steps

By following the steps outlined for each client, you can set up your systems to access extensive financial datasets from Polygon.io via S3. For further assistance or more advanced setup options, please refer to the respective client’s official documentation.

