If you’ve ever tried to run operations on a large number of objects in S3, you might have encountered a few hurdles. Listing all the objects and running the operation on each one gets complicated and time-consuming as the number of objects scales up. Many decisions have to be made: is running the operations from my personal computer fast enough? Or should I run them from a server that’s closer to the AWS resources, benefiting from AWS’s fast internal network? If so, I’ll have to provision resources (e.g. EC2 instances, Lambda functions, containers, etc.) to run the job.
Thankfully, AWS has heard our pain and announced a preview of S3 Batch Operations during the last AWS re:Invent conference. This new service (which you can access by asking AWS politely) allows you to easily run operations on very large numbers of S3 objects in your bucket. Curious to know how it works? Let’s get going.