Scraping Crypto Price Data From

Polygon is a great website for getting price data across crypto, options, forex and the stock market. It has a very simple pricing model that gives you unlimited access to whatever asset classes you sign up for.

Today we'll be using their free plan to get historical candlestick bars for cryptocurrencies, although the same techniques will work with any of their other REST API endpoints for other asset classes.

To get up and running, you'll only need pandas and requests installed in a fresh Python virtual environment.

Requesting from the API

I'm going to structure this script around a pull_1m_data function that will grab 1-minute bars for a certain symbol on a given day. I'll make it easily adjustable for other timeframes.

If you take a look at the docs you can see all of the different parameters that we need to feed into the endpoint to get our data.

Here's what my attempt at the function looked like:

    from datetime import datetime, timedelta

    import requests


    def pull_1m_data(symbol, date):
        """
        date is a Python datetime.date
        symbol is of the form XXXUSD
        """
        polygon_api_key = "rACH0leobCBj1JkpYoEZjveYtXFhxyJj"
        # base URL of the REST API (empty here; take it from the docs)
        polygon_rest_baseurl = ""

        # crypto tickers carry an "X:" prefix
        symbol = "X:" + symbol

        # 1 x "minute" gives 1-minute bars; change these for other timeframes
        multiplier = 1
        timespan = "minute"

        limit = 40000

        # newest data at the bottom
        sort = "asc"

        # midnight at the start of the day, and midnight at the start of the next
        start_time = datetime.combine(date, datetime.min.time())
        end_time = start_time + timedelta(days=1)

        # millisecond timestamps; the -1 excludes the next day's first bar
        start_time = int(start_time.timestamp() * 1000)
        end_time = int(end_time.timestamp() * 1000) - 1

        request_url = (
            f"{polygon_rest_baseurl}aggs/ticker/{symbol}/range/{multiplier}/"
            f"{timespan}/{start_time}/{end_time}?adjusted=true&sort={sort}&"
            f"limit={limit}&apiKey={polygon_api_key}"
        )

        data = requests.get(request_url).json()

        if "results" in data:
            return data["results"]
        else:
            raise Exception("Something went wrong")

Obviously, make sure that you replace the API key with your own from the Polygon dashboard.
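
The query string in pull_1m_data is assembled by hand, which is fine for a handful of parameters. As a rough sketch of an alternative, the standard library's urlencode can build it from a dict. Note that build_aggs_url is a hypothetical helper name of mine, and the base URL below is a placeholder rather than Polygon's real address, so substitute the one from the docs.

```python
from urllib.parse import urlencode

# Placeholder only: substitute the real REST base URL from the Polygon docs.
BASE_URL = "https://example.invalid/"


def build_aggs_url(symbol, multiplier, timespan, start_ms, end_ms, api_key,
                   sort="asc", limit=40000, base_url=BASE_URL):
    """Hypothetical helper: assemble the same URL shape as pull_1m_data."""
    path = f"aggs/ticker/{symbol}/range/{multiplier}/{timespan}/{start_ms}/{end_ms}"
    # urlencode joins the pairs with "&" and escapes any unsafe characters
    query = urlencode({
        "adjusted": "true",
        "sort": sort,
        "limit": limit,
        "apiKey": api_key,
    })
    return f"{base_url}{path}?{query}"


url = build_aggs_url("X:BTCUSD", 1, "minute",
                     1609459200000, 1609545599999, "YOUR_API_KEY")
print(url)
```

Keeping the parameters in a dict makes it easier to add or drop one later without hunting for a stray "&".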

One interesting trick I'm using in there is

    start_time = datetime.combine(date, datetime.min.time())

to get a datetime at midnight for the date that I pass in. We do this so that we can extract a Unix timestamp from the datetime object. You'll also notice that I take 1 millisecond away from end_time. This makes sure that we don't include the bar starting at midnight the next day. Try it yourself and you'll see what I mean.
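
One thing to be aware of: calling .timestamp() on a naive datetime interprets it in your machine's local timezone, so "midnight" shifts depending on where the script runs. Here is a minimal sketch of the same trick pinned to UTC, assuming UTC days are what you want (pd.to_datetime with unit="ms" also reads the values as UTC). The helper name day_window_ms is mine, not part of the original script.

```python
from datetime import date, datetime, timedelta, timezone


def day_window_ms(day):
    """Return (start_ms, end_ms) covering `day` in UTC, end inclusive."""
    # attach UTC so the result does not depend on the local timezone
    start = datetime.combine(day, datetime.min.time(), tzinfo=timezone.utc)
    end = start + timedelta(days=1)
    start_ms = int(start.timestamp() * 1000)
    end_ms = int(end.timestamp() * 1000) - 1  # last millisecond of the day
    return start_ms, end_ms


start_ms, end_ms = day_window_ms(date(2021, 1, 1))
print(start_ms, end_ms)  # 1609459200000 1609545599999
```

The two values are exactly one day apart minus a millisecond, so the window stops just short of the next day's first bar.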

Other than that, the script follows a conventional pattern that you'll use over and over again when interacting with REST APIs.
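
One part of that pattern worth tightening is the error path: a bare "Something went wrong" gives you nothing to debug with. A small sketch, using a hypothetical extract_results helper, that puts the returned payload into the error message instead:

```python
def extract_results(payload):
    """Return the list of bars, or raise with the payload for debugging."""
    if not isinstance(payload, dict) or "results" not in payload:
        # include whatever the API sent back so the failure is diagnosable
        raise RuntimeError(f"No 'results' in API response: {payload!r}")
    return payload["results"]


# made-up payload, shaped like the bars used in this article
ok = extract_results({"results": [{"t": 1609459200000, "o": 1.0}]})
print(ok)
```

In the real function you could also pass timeout= to requests.get and call raise_for_status() on the response before parsing it, so that a hung connection or an HTTP error surfaces early.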

At this point you can go ahead and call the function, running it through a for loop to get as many days as you're after:

    import time
    from datetime import date, timedelta

    day = date(year=2021, month=1, day=1)

    bars = []
    days_of_data = 2

    # walk backwards one day at a time from the starting date
    for i in range(days_of_data):
        bars += pull_1m_data("BTCUSD", day)
        day -= timedelta(days=1)
        time.sleep(15)

If you have a paid plan, you don't need the time.sleep; it's only there to make sure that we don't go over the limit of 5 requests per minute for free accounts.
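
If you'd rather derive the pause from the limit than hard-code it, something like the following would work. This is only a sketch: days_back and paced are hypothetical helper names, and 60 / 5 = 12 seconds is the bare minimum spacing, so the 15 seconds used above leaves some headroom.

```python
import time
from datetime import date, timedelta

REQUESTS_PER_MINUTE = 5  # free-plan limit quoted in this article


def days_back(start, count):
    """Yield `count` dates, starting at `start` and stepping backwards."""
    for offset in range(count):
        yield start - timedelta(days=offset)


def paced(items, per_minute=REQUESTS_PER_MINUTE, sleep=time.sleep):
    """Yield items, sleeping between them to stay within per_minute."""
    delay = 60 / per_minute
    for index, item in enumerate(items):
        if index:  # no need to wait before the first request
            sleep(delay)
        yield item


days = list(days_back(date(2021, 1, 1), 3))
print(days)

# Intended use with the function from this article:
# for day in paced(days_back(date(2021, 1, 1), 2)):
#     bars += pull_1m_data("BTCUSD", day)
```

Passing sleep in as a parameter is just a convenience so the pacing can be checked without actually waiting.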

Lastly, we can use pandas to clean up our data and save it to a CSV:

    import pandas as pd

    df = pd.DataFrame(bars)
    df["date"] = pd.to_datetime(df["t"], unit="ms")
    # select columns in the same order as the names assigned below
    df = df[["date", "o", "h", "l", "c", "v"]]
    df.columns = ["time", "open", "high", "low", "close", "volume"]
    df = df.sort_values("time")

    df.to_csv("data.csv", index=False)
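
The single-letter keys o, h, l and c are easy to transpose when selecting or renaming columns, so a quick sanity check on the raw bars can save some head-scratching. The sketch below flags any bar whose high and low don't bracket its open and close. find_bad_bars is a hypothetical helper, and the sample values are made up for illustration.

```python
def find_bad_bars(bars):
    """Return bars whose high/low do not bracket open and close."""
    bad = []
    for bar in bars:
        highest = max(bar["o"], bar["c"])
        lowest = min(bar["o"], bar["c"])
        if bar["h"] < highest or bar["l"] > lowest:
            bad.append(bar)
    return bad


# made-up bars: the second has its high and low swapped
sample = [
    {"o": 10.0, "h": 12.0, "l": 9.0, "c": 11.0},
    {"o": 10.0, "h": 9.0, "l": 12.0, "c": 11.0},
]
print(find_bad_bars(sample))
```

An empty list from find_bad_bars(bars) is what you'd hope to see before writing the CSV.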

Video Tutorial

If you'd prefer a video tutorial, there is one covering this on my channel.