The mechanism for uploading files from a browser has been around since the early days of the Internet. In the server-full environment it’s very easy to use Django, Express, or any other popular framework. It’s not an exciting topic — until you experience the scaling problem.
Imagine this scenario — you have an application that uploads files. All is well until the site suddenly gains popularity. Instead of handling a gigabyte of uploads a month, usage grows to 100Gb an hour for the month leading up to tax day. Afterwards, the usage drops back down again for another year. This is exactly the problem we had to solve.
File uploading at scale gobbles up your resources — network bandwidth, CPU, storage. All this data is ingested through your web server(s), which you then have to scale — if you’re lucky this means auto-scaling in AWS, but if you’re not in the cloud you’ll also have to contend with the physical network bottleneck issues.
You can also face some difficult race conditions if your server fails in the middle of handling the uploaded file. Did the file make to its end destination? What was the state of the processing? It can be very hard to replay the steps to failure or know the state of transactions when the server is overloaded.
Fortunately, this particular problem turns out to be a great use case for serverless — as you can eliminate the scaling issues entirely. For mobile and web apps with unpredictable demand, you can simply allow the application to upload the file directly to S3. This has the added benefit of enabling an https endpoint for the upload, which is critical for keeping the file’s contents secure in transit.
All this sounds great — but how does this work in practice when the server is no longer there to do the authentication and intermediary legwork?