COMP 4299|System Design

Serverless Functions

These notes introduce serverless functions as an alternative deployment model for backend logic. The API design principles from the previous chapter still apply: you are still designing endpoints, modelling entities, and thinking about request and response shapes. What changes is the infrastructure underneath and the constraints that come with it.

We will use Vercel as the reference platform throughout, given its popularity with Next.js projects and its straightforward mental model. Where behaviour differs between providers, that is called out explicitly.

Contents

  1. What Is a Serverless Function?
  2. Traditional Server vs. Serverless
  3. Statelessness
  4. Cold Starts
  5. Re-implementing the Notes API
  6. Provider Differences
  7. When to Use Serverless
  8. Summary

1. What Is a Serverless Function?

A serverless function is a single unit of backend logic, written as a function, that a cloud provider runs on demand in response to a request. You write the code, deploy it, and the provider handles provisioning servers, scaling, patching, and availability. There is no server for you to think about — hence the name, though servers obviously still exist underneath.

The major providers each have their own implementation of this idea: AWS Lambda, Google Cloud Functions, Cloudflare Workers, and Vercel Functions are the most common. They share the same core model but differ in meaningful ways around runtime limits, cold start behaviour, supported languages, and pricing. The examples in this chapter use Vercel with Next.js, which exposes serverless functions through its App Router API routes.

⚠️Serverless Behaviour Varies Significantly Between Providers

The concepts in this chapter are broadly applicable, but the details shown are specific to Vercel. Execution time limits, cold start characteristics, runtime environments, and deployment models differ enough between AWS Lambda, Cloudflare Workers, and Vercel that code and assumptions from one do not transfer directly to another. Always consult the documentation for the provider you are actually using.

2. Traditional Server vs. Serverless

With a traditional REST server, a single long-lived process boots once and stays running. It listens for incoming requests, handles them one at a time or concurrently, and keeps the same process (and everything it holds in memory) alive between requests. If traffic increases, you scale by adding more server instances. If traffic drops to zero, the server is still running and still costing money.

Figure: A traditional REST server is a persistent process. It boots once and handles every subsequent request within the same running environment.

With serverless, there is no persistent process. Each incoming request causes the provider to run your function, and when the function returns a response, that execution ends. The provider decides when to spin up new instances and when to tear them down. If no requests arrive, nothing is running.

Figure: A serverless function has no persistent process. The platform manages invocation and instance lifecycle. Each request may land on a fresh or reused instance.

The practical difference for a developer: with a traditional server, you control when the process starts and what it holds in memory for the duration of its life. With serverless, you give up that control entirely. The platform decides.
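The contrast can be sketched in code. The example below is a minimal illustration, not a production setup: handleHealth, startServer, and the health-check response are hypothetical names chosen for this sketch. The same request logic is wrapped twice, once in a process you start yourself and once as a bare handler that a platform would invoke.

```typescript
import { createServer } from 'node:http';

// Shared request logic: the same function serves both deployment models.
function handleHealth(): { status: number; body: string } {
  return { status: 200, body: JSON.stringify({ ok: true }) };
}

// Traditional model: you own the process. Calling startServer() boots once
// and keeps listening until the process is stopped.
function startServer(port: number) {
  const server = createServer((_req, res) => {
    const { status, body } = handleHealth();
    res.writeHead(status, { 'content-type': 'application/json' });
    res.end(body);
  });
  server.listen(port);
  return server;
}

// Serverless model: you only provide a handler. There is no listen() call;
// the platform decides when an instance exists and invokes this per request.
// (In a Next.js route.ts file this function would be exported.)
async function GET(): Promise<Response> {
  const { status, body } = handleHealth();
  return new Response(body, {
    status,
    headers: { 'content-type': 'application/json' },
  });
}
```

Note that startServer is defined but never called here. In the traditional model, calling it is your responsibility; in the serverless model, there is nothing equivalent to call.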

3. Statelessness

The REST chapter introduced statelessness as a constraint: every request should carry all the information the server needs to process it, and the server should store no session state between requests. This was a recommendation for REST servers.

For serverless functions, it is not a recommendation. It is a hard constraint.

Because each invocation may run on a completely fresh instance, any data stored in memory during one request is not guaranteed to exist when the next request arrives. A variable set at module scope might persist across requests on a warm instance, or it might not. You cannot rely on it.

Figure: Serverless functions cannot rely on in-memory state surviving between invocations. Any data that needs to persist must live in an external store.

Every piece of state that needs to outlive a single invocation must live in an external store: a database, a cache like Redis, or object storage like S3. The function itself is purely a stateless handler.
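A small simulation makes the difference visible. This is a sketch under stated assumptions: CounterStore, FakeExternalStore, and createInstance are hypothetical names, the "external store" is faked with a Map so the example runs on its own, and the store is synchronous for brevity (a real database or Redis client would be asynchronous). Each call to createInstance models the platform creating a fresh function instance.

```typescript
// Hypothetical external store interface; in practice this would be backed by
// a database or a cache such as Redis rather than an in-process Map.
interface CounterStore {
  increment(key: string): number;
}

// Stand-in for the external service so the sketch runs on its own.
class FakeExternalStore implements CounterStore {
  private data = new Map<string, number>();

  increment(key: string): number {
    const next = (this.data.get(key) ?? 0) + 1;
    this.data.set(key, next);
    return next;
  }
}

// Models one function instance: its memory starts empty whenever the
// platform creates it, while the external store is shared by every instance.
function createInstance(store: CounterStore) {
  let inMemoryCount = 0; // lost when this instance is torn down

  return {
    handle() {
      inMemoryCount += 1;
      return { inMemory: inMemoryCount, external: store.increment('visits') };
    },
  };
}

const store = new FakeExternalStore();
const firstResult = createInstance(store).handle();  // request 1, fresh instance
const secondResult = createInstance(store).handle(); // request 2, another fresh instance

console.log(firstResult);  // { inMemory: 1, external: 1 }
console.log(secondResult); // { inMemory: 1, external: 2 }
```

The in-memory counter restarts at 1 on every fresh instance, while the count held in the external store keeps increasing.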

📝Module-Level Code Does Sometimes Persist

On Vercel and other providers, a warm instance may reuse the same module environment across multiple invocations. This means module-level state (an open database connection, for example) can sometimes be reused between requests as an optimisation. Treat this as an implementation detail of the platform, not something to rely on for application logic, and never store request-specific data at module scope.
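The connection reuse described in this note is commonly written as a lazily initialised module-scope variable. The sketch below assumes a hypothetical DbClient type and connect function standing in for a real database driver; the counter exists only to show how many connections were opened.

```typescript
// Hypothetical client type standing in for a real database driver.
interface DbClient {
  id: number;
}

let connectionsOpened = 0;

// Stand-in for an expensive connection setup.
function connect(): DbClient {
  connectionsOpened += 1;
  return { id: connectionsOpened };
}

// Module scope: survives only as long as the platform keeps this instance warm.
let cachedClient: DbClient | undefined;

function getClient(): DbClient {
  // Reuse the connection on a warm instance; create it on a cold one.
  if (!cachedClient) {
    cachedClient = connect();
  }
  return cachedClient;
}

// Two invocations landing on the same warm instance share one connection.
const clientA = getClient();
const clientB = getClient();
console.log(clientA === clientB, connectionsOpened); // true 1
```

The code is correct whether or not the instance is reused: on a cold instance it simply connects again. That is the test for safe module-level caching. It may make things faster, but nothing breaks when the cache is empty.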

4. Cold Starts

Because there is no always-on process, the first request to a function that has not been recently invoked requires the provider to provision a new execution environment before the function can actually run. This is a cold start. Subsequent requests that land on an already-running instance skip this step — those are warm invocations.

Figure: A cold start adds provisioning and code-loading time before the handler runs. Warm invocations skip straight to the handler.

Cold starts are typically measured in tens to hundreds of milliseconds, though this varies by provider, runtime, and how much code your function loads at startup. For most applications they are imperceptible. For latency-sensitive endpoints (real-time features, interactive UIs waiting on a response) they can be a genuine problem.

Vercel mitigates cold starts by keeping instances warm for active deployments, but does not eliminate them entirely. Providers like Cloudflare Workers use a different execution model (V8 isolates rather than full Node.js environments) that significantly reduces cold start time, but introduces other constraints in return.
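One way to observe cold and warm invocations in your own logs is to rely on the fact that module-scope code runs once per instance. The sketch below is an illustration with hypothetical names (describeInvocation, instanceStartedAt). It reports whether an invocation is the first on its instance; it does not measure the provisioning time itself, which happens before your code loads.

```typescript
// Module scope runs once per instance, so this timestamp marks instance creation.
const instanceStartedAt = Date.now();
let invocationCount = 0;

function describeInvocation() {
  invocationCount += 1;
  return {
    cold: invocationCount === 1, // first invocation on this instance
    instanceAgeMs: Date.now() - instanceStartedAt,
  };
}

const firstCall = describeInvocation();
const secondCall = describeInvocation();
console.log(firstCall.cold, secondCall.cold); // true false
```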

💡Keep Functions Small to Reduce Cold Start Time

The more code a function loads at startup, the longer a cold start takes. Avoid importing large libraries that are only needed conditionally, and split unrelated logic into separate functions rather than bundling everything into one. A function that only loads what it needs will start faster.
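A dynamic import is one way to defer loading a dependency until the code path that needs it actually runs. In the sketch below, node:zlib stands in for a large library (it is a built-in module and cheap to load, so treat it purely as a placeholder), and renderBody is a hypothetical function name.

```typescript
// Loads the compression module only on the code path that needs it, so
// requests that never compress do not pay for loading it at startup.
async function renderBody(
  text: string,
  compress: boolean
): Promise<Uint8Array | string> {
  if (!compress) {
    return text;
  }
  // Dynamic import: resolved on first use, then cached by the module system.
  const { gzipSync } = await import('node:zlib');
  return gzipSync(text);
}
```

The trade-off is that the first request to take the slow path pays the loading cost instead of the cold start. That is usually the better deal when the path is rarely used.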

5. Re-implementing the Notes API

The notes API from the previous chapter defined three core endpoints. Below is how those same endpoints are structured as Vercel serverless functions using the Next.js App Router.

In Next.js, API routes live under the app/api/ directory. Each folder maps to a URL segment, and a route.ts file inside it exports named functions corresponding to HTTP verbs. This file-system routing convention belongs to Next.js; Vercel builds on it to determine which function handles which request.

Figure: Next.js maps the file system to API endpoints. Each route.ts file handles requests for its corresponding URL, with folder names in square brackets becoming dynamic parameters.
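To make the mapping concrete, the function below converts a route file path into the URL pattern it serves. It is illustrative only: urlPatternFor is a hypothetical helper that mirrors the convention for the three routes used in this chapter, and it is not how Next.js implements routing internally.

```typescript
// Illustrative only: converts a route file path into the URL pattern it serves.
function urlPatternFor(filePath: string): string {
  return (
    '/' +
    filePath
      .replace(/^app\//, '')       // routes are rooted at app/
      .replace(/\/route\.ts$/, '') // the file name itself is not a URL segment
      .split('/')
      .map((segment) => segment.replace(/^\[(.+)\]$/, ':$1')) // [noteId] -> :noteId
      .join('/')
  );
}

console.log(urlPatternFor('app/api/note/route.ts'));
// /api/note
console.log(urlPatternFor('app/api/note/[noteId]/route.ts'));
// /api/note/:noteId
console.log(urlPatternFor('app/api/users/[userId]/notes/route.ts'));
// /api/users/:userId/notes
```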

Create a note

POST /api/note
ts
// app/api/note/route.ts

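import { NextRequest, NextResponse } from 'next/server';
// Assumed location of your database client; adjust the path to your project.
import { db } from '@/lib/db';

export async function POST(req: NextRequest) {
  // Malformed JSON makes req.json() reject; treat it as an empty body so the
  // validation below returns a 400 instead of an unhandled 500.
  const body = await req.json().catch(() => null);
  const { userId, title, content, tags, sectionId } = body ?? {};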

  if (!userId || !title || !content) {
    return NextResponse.json(
      {
        error: {
          code: 'VALIDATION_ERROR',
          message: 'Request body is missing required fields.',
          details: [
            !userId && { field: 'userId', issue: 'Field is required.' },
            !title && { field: 'title', issue: 'Field is required.' },
            !content && { field: 'content', issue: 'Field is required.' },
          ].filter(Boolean),
        },
      },
      { status: 400 }
    );
  }

  // Write to your database here
  const note = await db.notes.create({
    userId,
    title,
    content,
    tags: tags ?? [],
    sectionId: sectionId ?? null,
  });

  return NextResponse.json(note, { status: 201 });
}

The exported function name matches the HTTP verb (POST). The request body is parsed from JSON, required fields are validated, and the response is returned using NextResponse.json. The db.notes.create call represents whatever database client you are using — the function itself stays thin.
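The inline validation in the handler can also be factored into a pure function, which makes it testable without the HTTP layer. The sketch below is one possible shape, with findMissingFields and FieldIssue as hypothetical names; it keeps the same rule as the handler, treating any falsy value (including an empty string) as missing.

```typescript
interface FieldIssue {
  field: string;
  issue: string;
}

// Hypothetical helper: returns one issue per missing required field.
function findMissingFields(
  body: Record<string, unknown>,
  required: string[]
): FieldIssue[] {
  return required
    .filter((field) => !body[field])
    .map((field) => ({ field, issue: 'Field is required.' }));
}

const issues = findMissingFields(
  { userId: 'u1', title: '' },
  ['userId', 'title', 'content']
);
console.log(issues);
// [ { field: 'title', issue: 'Field is required.' },
//   { field: 'content', issue: 'Field is required.' } ]
```

The handler would return the 400 response whenever the returned array is non-empty, using it directly as the details value.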

Get a single note

GET /api/note/:noteId
ts
// app/api/note/[noteId]/route.ts

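import { NextRequest, NextResponse } from 'next/server';
// Assumed location of your database client; adjust the path to your project.
import { db } from '@/lib/db';

// Note: from Next.js 15 onward, params is a Promise and must be awaited
// (type it as Promise<{ noteId: string }> and use `await params`).
export async function GET(
  req: NextRequest,
  { params }: { params: { noteId: string } }
) {
  const { noteId } = params;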

  const note = await db.notes.findById(noteId);

  if (!note) {
    return NextResponse.json(
      { error: { code: 'NOT_FOUND', message: 'Note not found.' } },
      { status: 404 }
    );
  }

  return NextResponse.json(note, { status: 200 });
}

The dynamic segment [noteId] in the folder name becomes a parameter available on the params object. If no note is found for that ID, the function returns a structured 404 instead of a 200 response with a null body.

Get all notes for a user (paginated)

GET /api/users/:userId/notes?limit=10&offset=0
ts
// app/api/users/[userId]/notes/route.ts

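import { NextRequest, NextResponse } from 'next/server';
// Assumed location of your database client; adjust the path to your project.
import { db } from '@/lib/db';

// Note: from Next.js 15 onward, params is a Promise and must be awaited.
export async function GET(
  req: NextRequest,
  { params }: { params: { userId: string } }
) {
  const { userId } = params;
  const { searchParams } = req.nextUrl;

  // Non-numeric or fractional input falls back to the defaults; valid input is clamped.
  const rawLimit = Number(searchParams.get('limit') ?? 20);
  const rawOffset = Number(searchParams.get('offset') ?? 0);
  const limit = Number.isInteger(rawLimit) ? Math.min(Math.max(rawLimit, 1), 100) : 20;
  const offset = Number.isInteger(rawOffset) ? Math.max(rawOffset, 0) : 0;

  const { notes, total } = await db.notes.findByUser(userId, { limit, offset });

  return NextResponse.json({ notes, total, limit, offset }, { status: 200 });
}

Query parameters are read from req.nextUrl.searchParams. The limit is clamped to the range 1 to 100 server-side, the offset cannot go below 0, and non-numeric values fall back to the defaults (Number('abc') is NaN, which Math.min would otherwise pass through unchanged). This matches the design principle from the previous chapter: never let user input drive an unbounded query.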

📝The API Contract Is the Same

The request shapes, response shapes, status codes, and pagination parameters are identical to what was designed in the previous chapter. Serverless changes how the code is deployed and run, not what the API promises to its callers.
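Because the contract is unchanged, a caller pages through results the same way regardless of how the API is deployed. The sketch below shows one way a client might collect every page using the limit, offset, and total fields from the response. The names fetchAllNotes and FetchPage are hypothetical, and the Note shape is reduced to two fields for illustration. The page fetcher is injected so the loop can run without a network call.

```typescript
interface Note {
  id: string;
  title: string;
}

interface NotesPage {
  notes: Note[];
  total: number;
  limit: number;
  offset: number;
}

// fetchPage is injected so the loop can be exercised without a network call.
// In a real client it would wrap a fetch() call to
// /api/users/:userId/notes?limit=<limit>&offset=<offset>.
type FetchPage = (limit: number, offset: number) => Promise<NotesPage>;

async function fetchAllNotes(fetchPage: FetchPage, limit = 20): Promise<Note[]> {
  const all: Note[] = [];
  let offset = 0;

  while (true) {
    const page = await fetchPage(limit, offset);
    all.push(...page.notes);
    offset += page.notes.length;

    // Stop when everything is collected, or when a page comes back empty.
    if (offset >= page.total || page.notes.length === 0) {
      return all;
    }
  }
}
```

The empty-page check guards against looping forever if total changes while the client is still paging.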

6. Provider Differences

Serverless functions are not a single standardised technology. Each provider implements the model differently, and those differences can matter significantly when choosing a platform or debugging behaviour that does not match expectations.

| Dimension | Vercel | AWS Lambda | Cloudflare Workers |
| --- | --- | --- | --- |
| Runtime | Node.js (default), Edge runtime available | Node.js, Python, Go, Java, Ruby, and others | V8 isolates; limited Node.js compatibility |
| Max execution time | 10s (Hobby), 15s (Pro), 300s (Enterprise) | Up to 15 minutes | 30s (CPU time); effectively unlimited wall time |
| Cold starts | Present; mitigated for active deployments | Present; can be significant for large runtimes | Very fast; isolates start in under 5ms typically |
| Pricing model | Per invocation and compute time; generous free tier | Per invocation and GB-seconds of compute | Per request; very low cost at scale |
| Deployment | Git-based via Vercel dashboard or CLI | ZIP upload, container image, or IaC (CDK, Terraform) | Wrangler CLI or git integration |

⚠️Edge Runtime Is a Different Environment

Vercel offers an Edge runtime in addition to its standard Node.js runtime. Edge functions run closer to the user geographically and start faster, but they run in a restricted environment that does not support all Node.js APIs. Standard database clients, for instance, may not work on the Edge runtime without a compatible adapter. Check compatibility before choosing Edge for a data-heavy route.
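In the Next.js App Router, the runtime is selected per route through an exported runtime constant in the route file. The sketch below uses a hypothetical /api/ping route to show the shape. It sticks to Web-standard APIs, which is the safe baseline for Edge code.

```typescript
// app/api/ping/route.ts (hypothetical route used only for illustration)

// Route segment config: opts this route into the Edge runtime.
// Omit it, or set 'nodejs', to stay on the default Node.js runtime.
export const runtime = 'edge';

export async function GET(): Promise<Response> {
  // Only Web-standard APIs are used here, so no Node.js built-ins are required.
  return new Response(JSON.stringify({ ok: true }), {
    headers: { 'content-type': 'application/json' },
  });
}
```

A route that imports a Node.js-only database driver may fail to build or run once this line is added, which is why compatibility should be checked first.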

7. When to Use Serverless

Serverless is a good fit when traffic is unpredictable or spiky, when the workload is composed of small discrete operations, or when you want to avoid the operational overhead of managing a persistent server. Common examples include webhook handlers, background tasks triggered by events (sending an email on sign-up, processing an image on upload), and APIs for projects where traffic is low or variable enough that paying for idle server time makes no sense.

It is a poor fit for long-running processes, anything that genuinely needs to hold shared state in memory across requests, or workloads with consistently high and predictable traffic where a dedicated server would be cheaper and simpler. If your function regularly runs for more than a few seconds, or if your use case requires holding an open connection to many concurrent clients, a traditional server is likely the better choice.

For a Next.js project in particular, serverless is often the natural default: Vercel deploys API routes as functions automatically, the operational overhead is near zero, and the pricing works well for projects at typical indie or startup scale. The constraints become relevant mostly when you grow into them.

💡Start Serverless, Migrate When You Have a Reason

For most new projects, starting with serverless and migrating specific routes to a dedicated server only when a concrete constraint forces it is a sound approach. Serverless lets you ship quickly without infrastructure decisions, and the API contract your callers depend on does not change when you swap the deployment model underneath it.

8. Summary

| Concept | Key Takeaway |
| --- | --- |
| Serverless function | A single unit of backend logic run on demand by a cloud provider, with no persistent server to manage |
| Traditional server | A long-lived process that boots once and handles requests continuously; you manage its lifecycle |
| Statelessness | Not a recommendation in serverless but a hard constraint; in-memory state does not reliably survive between invocations |
| Cold start | The latency penalty on the first invocation of an idle function, caused by provisioning a new execution environment |
| Warm invocation | A request handled by an already-running instance; skips provisioning and runs at normal latency |
| File-system routing | Next.js maps app/api/ folder structure to endpoint URLs; route.ts exports named functions per HTTP verb |
| Provider differences | Runtime, execution limits, cold start behaviour, and pricing vary significantly between Vercel, Lambda, and Cloudflare Workers |
| Good fit | Spiky or unpredictable traffic, event-driven tasks, low operational overhead requirements |
| Poor fit | Long-running processes, shared in-memory state, consistently high traffic with predictable load |