Building common SaaS features à la serverless: Upload a File
Building on the foundation of the previous blog post, by the end of this one you'll have designed a well-architected, serverless way of processing files uploaded by your users via the web or mobile clients of your (surely awesome) SaaS app.
A personal announcement
By the way, some unrelated yet wonderful news: my recent blogging hiatus might have something to do with me becoming a father for the first time!
I promise to pick the pace back up going forward 😅
Premise
Suppose that your SaaS app has a feature or use-case that requires some sort of document/file to be processed on the server side.
This could be anything, ranging from fancier business use cases, like running image recognition ML code when the file is a picture or a PDF, to more common ones, like extracting information out of an Excel or CSV file.
Traditionally, that would involve an HTTP POST request from a web or mobile client to one of your application's web servers, with the file in question as part of the request payload. The web server would then run the file-processing code in-process, using the same pool of hardware resources as everything else in your app and limiting capacity for other users and services in the meantime.
I'll show an alternative way of achieving the same result, without any on-premises computing infrastructure (and its shortcomings) involved, and with all the benefits of serverless computing that we've already covered earlier in this series.
Assumptions
I'll assume that your application is already making good use of a blob storage service (such as AWS S3) to store and retrieve static assets.
Additionally, for the sake of brevity, I'll assume that your client-side code already uses the AWS SDK for whatever language your client code is written in to create objects in S3 buckets.
Finally, building on that assumption, I'll take it that you have an established way of retrieving objects from S3, via memory streams or otherwise.
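For illustration, here's a minimal sketch of what that client-side upload could look like in .NET with the AWSSDK.S3 package; your clients may well use another language's SDK, and the bucket name and key scheme below are placeholders of my own:

```csharp
using System.IO;
using System.Threading.Tasks;
using Amazon.S3;
using Amazon.S3.Model;

public static class UploadExample
{
    public static async Task UploadAsync(string filePath)
    {
        // Creating the object in the bucket is all it takes; the processing
        // pipeline we design below picks it up from there.
        var s3 = new AmazonS3Client();
        await s3.PutObjectAsync(new PutObjectRequest
        {
            BucketName = "your-uploads-bucket",            // placeholder bucket name
            Key = $"uploads/{Path.GetFileName(filePath)}", // placeholder key scheme
            FilePath = filePath
        });
    }
}
```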
Design
Good news: Amazon S3 has an Event Notifications system built in! Check the references section for a detailed read-up on it.
The core idea of the proposed system architecture is that you can leverage a PUT event (that's AWS lingo for creating an object in an S3 bucket) to trigger a Lambda function. Thus, your client-side code creating an object in an S3 bucket can automatically trigger an AWS Lambda function invocation, which can process the S3 event payload (including metadata as well as the unique URL of the new object in S3) and run business-specific code against the file.
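You'd usually wire this notification up in the S3 console or via your infrastructure-as-code tooling, but just to make the mechanism concrete, here's a minimal sketch using the AWS SDK for .NET (AWSSDK.S3); the bucket name and function ARN are placeholders:

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Amazon.S3;
using Amazon.S3.Model;

public static class NotificationSetup
{
    public static async Task ConfigureAsync()
    {
        var s3 = new AmazonS3Client();

        await s3.PutBucketNotificationAsync(new PutBucketNotificationRequest
        {
            BucketName = "your-uploads-bucket", // placeholder bucket name
            LambdaFunctionConfigurations = new List<LambdaFunctionConfiguration>
            {
                new LambdaFunctionConfiguration
                {
                    // s3:ObjectCreated:Put fires whenever a new object is PUT
                    Events = new List<EventType> { EventType.ObjectCreatedPut },
                    // placeholder function ARN
                    FunctionArn = "arn:aws:lambda:eu-west-1:123456789012:function:MyFunction"
                }
            }
        });
    }
}
```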
.NET Implementation
Building a reusable Library
We start by creating a .NET Standard 2.1 library project. This will be the dependency that every Lambda function needing to process S3 events will reference, streamlining & standardizing how objects created or updated in an S3 bucket get processed by a connected Lambda function trigger.
Run dotnet new classlib in your command line of choice to create the library.
Then make sure you reference the Amazon.Lambda.S3Events NuGet package: dotnet add package Amazon.Lambda.S3Events.
Building this library on top of the generic event handler function that I talked about in the previous post, we need a contract, one that an actual Lambda function's entry point should be able to call:
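A minimal sketch of such a contract; the interface and method names here are placeholders of my own choosing:

```csharp
using System.Threading.Tasks;
using Amazon.Lambda.Core; // ILambdaContext

// Hypothetical contract name; the single method is what a Lambda
// function's entry point will call with the deserialized event.
public interface IS3EventHandler<TS3Event> where TS3Event : class
{
    Task HandleAsync(TS3Event s3Event, ILambdaContext context);
}
```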
There's something going on here beyond the provided method; notice the interface's generic constraint: TS3Event needs to be a class (hint: the S3Event class from Amazon.Lambda.S3Events, but also, potentially, any other class that represents a blob storage object).
Next, we'll need to add a reference to our existing generic EventHandler library from the previous post in order to implement the S3EventHandler class, which implements the IEventHandler<T> interface defined there, with S3Event as <T>:
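Here's a rough sketch; since the previous post's library isn't reproduced here, a stand-in for its IEventHandler<T> interface is declared inline just so the snippet compiles, and all member names are assumptions:

```csharp
using System;
using System.Threading.Tasks;
using Amazon.Lambda.Core;
using Amazon.Lambda.S3Events;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;

// Stand-in for the generic interface from the previous post's library;
// its actual shape is assumed here.
public interface IEventHandler<T>
{
    Task HandleAsync(T input, ILambdaContext context);
}

public abstract class S3EventHandler : IEventHandler<S3Event>, IS3EventHandler<S3Event>
{
    protected ILogger Logger { get; }
    protected IServiceProvider Services { get; }

    protected S3EventHandler()
    {
        // Wire up the DI container & logging factory once per handler instance.
        var services = new ServiceCollection();
        services.AddLogging();        // Microsoft.Extensions.Logging
        services.AddS3EventHandling(); // our library's registrations (see the DI section below)
        Services = services.BuildServiceProvider();
        Logger = Services.GetRequiredService<ILoggerFactory>().CreateLogger(GetType());
    }

    public async Task HandleAsync(S3Event s3Event, ILambdaContext context)
    {
        // Asynchronously process each record that constitutes the invocation event.
        foreach (var record in s3Event.Records)
        {
            try
            {
                await ProcessRecordAsync(record.S3.Bucket.Name, record.S3.Object.Key, context);
            }
            catch (Exception ex)
            {
                // Rudimentary exception handling: log, then rethrow so the
                // Lambda invocation is marked as failed.
                Logger.LogError(ex, "Failed to process s3://{Bucket}/{Key}",
                    record.S3.Bucket.Name, record.S3.Object.Key);
                throw;
            }
        }
    }

    // Business-specific code lives in concrete implementations.
    protected abstract Task ProcessRecordAsync(string bucketName, string objectKey, ILambdaContext context);
}
```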
Pretty straightforward: we just wire up the logging factory & dependency injection services here, add some rudimentary exception handling, and asynchronously process each of the records that constitute the invocation event.
That's all it is! Before we move on to using our new library in a Lambda function .NET project though, it's worth discussing the DI part briefly.
Microsoft has provided its own implementation of a dependency injection container in .NET (in the form of a NuGet package called Microsoft.Extensions.DependencyInjection) since .NET Core 2.1.
If you're looking to do DI as part of any library, the convention is to expose extension methods on the IServiceCollection interface from a static class, so that consuming applications can register the necessary service descriptors.
For our library, that will look something like this 👇🏻
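(A sketch; the method name and the example registration are illustrative, not prescribed.)

```csharp
using Amazon.S3;
using Microsoft.Extensions.DependencyInjection;

// Static class exposing an extension method over IServiceCollection, so
// consumers can register the library's services with a single call.
public static class ServiceCollectionExtensions
{
    public static IServiceCollection AddS3EventHandling(this IServiceCollection services)
    {
        // Example registration: a shared S3 client for handlers that need
        // to fetch the uploaded object (assumes the AWSSDK.S3 package).
        services.AddSingleton<IAmazonS3, AmazonS3Client>();
        return services;
    }
}
```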
Using the Library with an AWS Lambda .NET project template
Using your favorite shell, install the AWS .NET project templates: dotnet new -i Amazon.Lambda.Templates.
Now let's create an AWS .NET Lambda project: dotnet new lambda.EmptyFunction. Please see the references section for an exhaustive list of the possible arguments for this command.
Next, reference both the generic event handler library from the previous post and the S3-specific one you just created.
Modify Function.cs, which is the entry point for Lambda function invocations, as follows:
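A sketch of what that can look like, reusing the types from the library sketches above; the serializer attribute mirrors what the lambda.EmptyFunction template generates:

```csharp
using System.Threading.Tasks;
using Amazon.Lambda.Core;
using Amazon.Lambda.S3Events;

// Tell Lambda which serializer to use when deserializing the event payload.
[assembly: LambdaSerializer(typeof(Amazon.Lambda.Serialization.SystemTextJson.DefaultLambdaJsonSerializer))]

namespace MyFunction
{
    public class Function
    {
        // Hypothetical business-logic handler; created in the next step.
        private readonly IS3EventHandler<S3Event> _handler = new YourAwesomeImplS3EventHandler();

        // The Lambda entry point: hand the deserialized S3 event to our handler.
        public Task FunctionHandler(S3Event s3Event, ILambdaContext context)
            => _handler.HandleAsync(s3Event, context);
    }
}
```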
Y'all know exactly what's up above already. All that's left at this point is to create a class that implements your business logic, YourAwesomeImplS3EventHandler.cs.
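For illustration only, assuming the abstract base class sketched earlier and the IAmazonS3 registration from the DI section; the CSV/character-counting bit is a stand-in for whatever your actual business logic is:

```csharp
using System.IO;
using System.Threading.Tasks;
using Amazon.Lambda.Core;
using Amazon.S3;
using Microsoft.Extensions.DependencyInjection;

public class YourAwesomeImplS3EventHandler : S3EventHandler
{
    protected override async Task ProcessRecordAsync(string bucketName, string objectKey, ILambdaContext context)
    {
        // Resolve the S3 client registered by AddS3EventHandling and fetch
        // the newly created object.
        var s3 = Services.GetRequiredService<IAmazonS3>();
        using var response = await s3.GetObjectAsync(bucketName, objectKey);
        using var reader = new StreamReader(response.ResponseStream);

        // Your business logic goes here: parse a CSV, run image recognition, etc.
        var contents = await reader.ReadToEndAsync();
        context.Logger.LogLine($"Processed {objectKey} ({contents.Length} characters)");
    }
}
```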
Once you're done, you can build and deploy using the Amazon.Lambda.Tools .NET Core Global Tool. To install it, run: dotnet tool install -g Amazon.Lambda.Tools.
Then update the tool to its latest version: dotnet tool update -g Amazon.Lambda.Tools.
With Amazon.Lambda.Tools installed, you can deploy your function using the following command: dotnet lambda deploy-function MyFunction --function-role role. Please check the references section for details on configuring a relevant AWS IAM role, as well as function-specific configuration such as resources & environment variables.
Conclusions
Serverless is badass, people. That's it, that's the conclusion.
Coming up next
In the next post you'll design and implement the serverless equivalent of an extremely common type of application service: the cron job, i.e. code that runs on a schedule or at regular intervals, or that is scheduled to run once at some point in the future, as popularized by *nix-based operating systems.