Deploying XGBoost on AWS Lambda using S3 to store binaries

2 min read · Oct 2, 2020

Lambda is getting more and more powerful: so much so that you can now run many ETL and ML workloads on it. In this article I’ll show you how to deploy XGBoost on Lambda so that you can train models and run inference serverlessly.

Why this is hard

Let’s begin with why this is a challenge: Lambda’s deployment package size limits. The XGBoost binaries are ~400 MB, but Lambda only lets you deploy a package of up to 250 MB. The same applies if you try to use Lambda layers: at the time of writing, layers still have a total size limit of 250 MB. I tried splitting XGBoost into its constituent libraries so that each layer was under 250 MB, but Lambda counts the aggregate size of all attached layers.

Two ways to make it work

When I originally worked on this deployment, there was only one option: downloading the binaries from S3 into Lambda’s /tmp directory. That works perfectly well under the following circumstances:

  1. Your code + the ML binaries + any artifacts you produce total less than 512 MB (the /tmp storage available to Lambda functions)
  2. Your Lambda function runs often enough that you generally hit a warm container (so the binaries are already in /tmp), OR your code executes in well under the 15-minute limit, so the extra download on a cold start doesn’t matter.

Since then, AWS has released support for using EFS with Lambda. Here is a blog post showing how to retrieve ML binaries from EFS. The EFS approach is better if you are likely to exceed the 512 MB /tmp limit or if you want to speed up cold starts (though the S3 download doesn’t really take that long).

I am quite satisfied with the S3 solution and haven’t found it well described elsewhere, so I’ll document that approach here.

High level approach

To get this working you need:

  1. A “slim handler” Lambda (I borrowed this approach from Zappa) which downloads the libraries from S3 and makes them available for Python to import
  2. The dependencies bundled as a tar archive and stored in S3
  3. The lambda code bundled as a tar archive and stored in S3

Step 1. Slim handler

I borrowed this code from Zappa, removing a lot of boilerplate that wasn’t needed. The key functionality is in the load_remote_project_archive function: it downloads the fat handler code and the dependencies archive and makes them both available to the Python interpreter.
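The original embed isn’t reproduced here, so below is a minimal sketch of the idea rather than the exact Zappa-derived code. The environment variable names (REMOTE_BUCKET, DEPS_ARCHIVE_KEY, PROJECT_ARCHIVE_KEY) and the fat_handler module name are placeholders for illustration:

```python
import os
import sys
import tarfile

import boto3

s3 = boto3.client("s3")


def load_remote_project_archive(bucket, key, dest):
    """Download a tar archive from S3 into /tmp and put it on sys.path."""
    if not os.path.isdir(dest):  # skip the download on a warm container
        os.makedirs(dest, exist_ok=True)
        archive_path = "/tmp/{}".format(os.path.basename(key))
        s3.download_file(bucket, key, archive_path)
        with tarfile.open(archive_path, "r:gz") as tar:
            tar.extractall(dest)
    if dest not in sys.path:
        sys.path.insert(0, dest)


def handler(event, context):
    # Pull down both archives, then import and delegate to the fat handler.
    load_remote_project_archive(os.environ["REMOTE_BUCKET"],
                                os.environ["DEPS_ARCHIVE_KEY"],
                                dest="/tmp/deps")
    load_remote_project_archive(os.environ["REMOTE_BUCKET"],
                                os.environ["PROJECT_ARCHIVE_KEY"],
                                dest="/tmp/project")
    from fat_handler import handler as fat_handler  # lives in the project archive
    return fat_handler(event, context)
```

On a warm container the archives are already in /tmp, so the handler skips the download entirely; that’s what makes the “hot cache” scenario above cheap.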

Step 2: packaging dependencies

I’m using Pipenv to manage dependencies. A little shell script converts the Pipfile into a requirements.txt and then passes that to a lambci Docker image, which installs the packages in a Lambda-like environment:
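The original script isn’t embedded here; this is a minimal sketch of what it does, assuming Docker is available locally. The bucket name and archive key are placeholders:

```sh
#!/usr/bin/env bash
set -euo pipefail

# Freeze the Pipfile into a plain requirements.txt
pipenv lock -r > requirements.txt

# Install the dependencies inside a Lambda-like build container
docker run --rm -v "$PWD":/var/task lambci/lambda:build-python3.8 \
    pip install -r requirements.txt -t deps/

# Archive the installed packages and upload them to S3
tar -czf deps.tar.gz -C deps .
aws s3 cp deps.tar.gz s3://my-artifacts-bucket/deps.tar.gz
```

The -C deps . puts the packages at the root of the archive, so extracting it into a directory and adding that directory to sys.path is enough for imports to work.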

Step 3: packaging the fat handler

I’ve chosen to wrap the fat handler in a tar archive as well (though this isn’t strictly necessary, as I only have one file):
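Again the original embed is missing, so here is a sketch with placeholder file and bucket names:

```sh
#!/usr/bin/env bash
set -euo pipefail

# Package the fat handler (a single file in this case) and upload it to S3
tar -czf fat_handler.tar.gz fat_handler.py
aws s3 cp fat_handler.tar.gz s3://my-artifacts-bucket/fat_handler.tar.gz
```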

Step 4: CDK code

We use the AWS CDK (Cloud Development Kit) to define our infrastructure. Key things to note (an illustrative sketch follows the list):

  1. The slim handler code is the only code deployed directly to the Lambda function; it bootstraps everything else
  2. The dependencies + fat handler archives are uploaded to S3 and passed to the slim handler as S3 references
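The original CDK embed isn’t reproduced here, so this is a minimal CDK v1 Python sketch under stated assumptions: the artifact bucket name, archive keys, and environment variable names are placeholders matching the slim handler sketch above, not the exact code from my stack:

```python
from aws_cdk import core, aws_lambda, aws_s3


class XgboostLambdaStack(core.Stack):
    def __init__(self, scope, construct_id, **kwargs):
        super().__init__(scope, construct_id, **kwargs)

        # Bucket holding deps.tar.gz and fat_handler.tar.gz (placeholder name)
        artifacts = aws_s3.Bucket.from_bucket_name(
            self, "Artifacts", "my-artifacts-bucket")

        # The slim handler is the only code deployed directly to Lambda
        fn = aws_lambda.Function(
            self, "XgboostFn",
            runtime=aws_lambda.Runtime.PYTHON_3_8,
            handler="slim_handler.handler",
            code=aws_lambda.Code.from_asset("slim_handler"),
            memory_size=1024,
            timeout=core.Duration.minutes(15),
            environment={
                "REMOTE_BUCKET": artifacts.bucket_name,
                "DEPS_ARCHIVE_KEY": "deps.tar.gz",
                "PROJECT_ARCHIVE_KEY": "fat_handler.tar.gz",
            },
        )

        # Allow the slim handler to download the archives at runtime
        artifacts.grant_read(fn)
```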

Hope that helps!