Lesson Overview

In the Introduction to Node.js lesson, we talked a little bit about Express.js: what it is and what it's used for. One of the great things Express.js allows us to use is middleware. Often when processing requests, you might have several different functions that need to execute: these are middleware functions. Express.js allows us to add middleware functions to perform important tasks such as converting a request's body into JSON or form-encoded data, calculating data from the request in some way, or even retrieving and processing values from a request URL.

In this lesson we'll become familiar with Express.js and how we can use middleware to route requests that involve a bit more processing.

Pre-Requisites

Before doing this lesson, make sure you've gone over the First Node.js Applications examples and that you've completed the Basic Routes in Node.js lesson. It's also helpful if you are familiar with the concepts discussed in Processing Form Submission.

Express Middleware

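Middleware is the function or set of functions that execute in between receiving a request and sending the response. Most requests involve many steps, and each step could be a middleware function. A middleware function generally performs one or more of the following tasks: reading data from the request, modifying the response, ending the request/response cycle, or calling next() to pass control to the next middleware function.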

The set of middleware functions that a request has to go through is called the middleware stack.

A stack is a data structure for a collection of items. The first item in the collection is the last item processed, and the last item in the collection is the first item processed. I always describe it as a stack of plates: when you stack the plates, the first plate you put down is at the bottom. The last plate you put down is on the top of the stack. When someone comes along to take a plate, they will take the top plate (the one on the top of the stack). The plate on the bottom of the stack will be the last one taken. Stacks are used a lot in computers and programming: you're probably already familiar with a stack trace (the pathway of function calls a process took when it encountered an error), or a call stack (a series of function calls in the order they were invoked: when the last function is done, the program flows back up the call stack in reverse order to figure out where it left off).
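The plate analogy can be sketched in a few lines of JavaScript, using a plain array as the stack. The variable names here are only for illustration:

```javascript
// A plain array works as a stack: push() adds to the top, pop() removes from the top.
const plates = [];
plates.push("first plate");   // ends up at the bottom
plates.push("second plate");
plates.push("third plate");   // ends up on top

const taken = [];
while (plates.length > 0) {
    taken.push(plates.pop()); // always takes the top plate
}

console.log(taken); // [ 'third plate', 'second plate', 'first plate' ]
```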

In an Express app, the middleware stack is the stack of middleware functions that need to execute for a specific request. Despite the name, Express runs these functions in the order they were added, not in reverse. For example, a GET request to www.foo.com/orders/123 might have the following middleware functions:

  1. a function getId(req.url) that extracts the ID "123" from the URL and returns it
  2. a function called searchById(id) that accepts the ID and looks it up in a database, and then returns the order object it retrieved
  3. a function renderOrder(order) that accepts that order object, formats it in a user-friendly way, and sends it to a view
  4. a function logRequest() that logs the request for audit purposes

So in this example, your middleware stack consists of getId(), searchById(), renderOrder(), and logRequest(). Why would we not write all this code in a single function? Because a lot of the tasks you want to perform on a request are also performed on other requests. We modularize our code so that we don't have to write the same code for different requests: we just call the function(s) that a request needs.

A middleware function must either end the request/response cycle (e.g. by sending the response back to the client) or call the next() function to pass control to the next middleware function in the stack. If it does neither, the cycle is never terminated, the response is never sent, and the client is left waiting.
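This rule can be illustrated without Express at all. The sketch below is a simplified stand-in for what a framework does with its middleware, not Express's actual implementation; runStack and the three sample functions are made up for this illustration:

```javascript
// Run an array of middleware functions in order. Each function receives
// req, res, and a next() callback that starts the following function.
function runStack(stack, req, res) {
    let index = 0;
    function next() {
        const middleware = stack[index++];
        if (middleware) {
            middleware(req, res, next);
        }
    }
    next();
}

const log = [];
const stack = [
    (req, res, next) => { log.push("logger"); next(); },             // passes control on
    (req, res, next) => { log.push("handler"); res.body = "done"; }, // ends the cycle
    (req, res, next) => { log.push("never reached"); next(); }       // skipped
];

const res = {};
runStack(stack, { url: "/orders/123" }, res);
console.log(log);      // [ 'logger', 'handler' ]
console.log(res.body); // done
```

The third function never runs because the second one ended the cycle without calling next().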

Middleware functions can read request data and modify the response before the response is sent to the client. With Express.js, you can use middleware functions for both GET and POST requests. You can use functions that are already part of Node, Express, or some other module, or you can create your own middleware functions. In most applications, you'll do all of these.

Adding Express to an Application

The Express module contains the following:

To add Express to your Node.js app, you have to import the "express" module and then instantiate the Express Application object.

// programs would start with something like:
const express = require("express"),
    app = express();
// ... rest of code ...

The const express = require("express") part of the statement imports the Express.js module into your project so that you can use Express.js objects/functions. Because we install Express with npm, we don't have to pick a version by hand: by default, npm install express fetches the latest published version and records it in package.json.

The app = express() part of the statement creates an instance of the Express Application object. The variable app can be any valid JavaScript identifier. It references the main Express.js Application object, so it's the variable you use when you want to refer to one of the Application properties or methods.

The application object also takes care of the server: you don't have to call createServer() yourself, because the Express application does that for you when you start listening.

To start the server, you can use the app.listen() function:

app.listen(app.get("port"), () => {
    console.log(`Server running on port: ${app.get('port')}`);
});

Common Functions

Some common functions you will find useful from the Application object:

app.set(property, value)
  - Allows you to configure your application by setting application settings to specific values.
  - Application settings are application-wide values that store configuration data about the application.
  - e.g. app.set("port", process.env.PORT || 3000); sets the port the app uses for listening/waiting for requests (if there is no PORT environment variable, it sets it to 3000).
  - In previous apps, we set a const for this, e.g. const port = 3000. But in real life, the port number may not stay the same. process.env contains all the environment variables available to the application's process.
  - You can retrieve the value of a setting using app.get(), e.g. app.get("port"). The app.get() method is overloaded: there are two versions (one for processing GET requests and one for getting settings). Express knows the difference by looking at the arguments passed into the get() method.
app.get(), app.post(), etc.
  - These methods route a specific type of request with a specific URL or URL pattern; they define an end point for a specific request.
  - There are several request methods, not just GET and POST (e.g. PUT, DELETE). There's a function for each request method, e.g. app.put() for PUT requests, app.delete() for DELETE requests, etc.
  - app.all() is used when you want to route requests of any request type, e.g. all requests to www.foo.com/orders execute a specific function, regardless of whether it's GET, POST, or anything else.
  - All these functions can take an optional argument for a URL or URL pattern: the URL/pattern indicates the end point. For example, app.get("/foo") will define the end point for the URL www.domain.whatever/foo.
  - All of these functions also take one or more callbacks, which are the functions that should execute when the request URL matches the pattern.
  - If no URL is provided, the path "/" is assumed.
app.use()
  - Sets up ("mounts") a middleware function to be executed for a specific URL pattern.
  - Accepts an optional URL pattern and a callback function.
  - It adds the callback function to the middleware stack when the request matches the URL/pattern.
  - If no URL is specified, the path "/" is assumed.
  - Unlike app.get(), app.post(), etc., app.use() will add the callback(s) if the URL begins with the URL pattern, e.g. app.use("/foo") will match www.domain.whatever/foo, www.domain.whatever/foo/bar, www.domain.whatever/foo/bar.html, www.domain.whatever/foo/bar/baz, etc.
app.listen()
  - Starts listening for connections.
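The process.env.PORT || 3000 fallback used with app.set() can be tried on its own. This is a small sketch; resolvePort is a hypothetical helper written for this example, not part of Express or Node:

```javascript
// Environment variables are always strings, so convert before using the value.
function resolvePort(env, fallback = 3000) {
    const parsed = Number.parseInt(env.PORT, 10);
    return Number.isNaN(parsed) ? fallback : parsed;
}

console.log(resolvePort({}));               // 3000
console.log(resolvePort({ PORT: "8080" })); // 8080
console.log(resolvePort(process.env));      // depends on your environment
```

In an app you could then write app.set("port", resolvePort(process.env)); the simpler process.env.PORT || 3000 form works too.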

If you look back at the routing program we wrote in Routing in Node.js, you might notice that we called createServer() with a callback, and that callback handled all requests. We then wrote code inside the callback to examine the request URL and send back the appropriate file.

In an Express app, we instead instantiate the express() object and then tell it what list of functions we want to add to the middleware stack for the various requests we are expecting to process. So instead of having one callback function with if statements, we will have several modular functions that the server will execute when the URL patterns match. It's much easier to understand with an actual example.

URL Patterns

A URL pattern (or route path), along with a request type (e.g. GET or POST), is what defines the end points for various requests. For example, you could specify that GET requests to /foo execute function1() but POST requests to /foo execute function2().

A URL pattern is simply a pattern or expression that a request URL needs to match for a specific end point. For example, the pattern /foo would match localhost:3000/foo (for these examples, I'll use localhost:3000, but note that you could put a domain in here too, such as www.whatever.com/foo). Similarly, the pattern "/" just means the root of the application. So if your application was at www.whatever.com, then the pattern "/" would be the same as the url https://www.whatever.com. If your application lived at www.whatever.com/myapplication, then the pattern "/" would match https://www.whatever.com/myapplication. Note that if you're testing your applications on a server, this might be an important distinction.

URL patterns don't have to be for a specific directory. You can use file names, wildcard symbols, and even full regular expression syntax to match specific URLs. For example:

You can enclose characters in parentheses to group them together. For example, "/foo(bar)?" means "foo" which may be optionally followed by "bar", so this would match /foo or /foobar. This could be useful if you wanted to do something like "match /getyear or /getyear.html": you could use the pattern "/getyear(.html)?".

You can use other valid regex symbols and expressions if you need to create more complicated URL patterns.
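To see what an optional-group pattern matches, here is the same idea written as a plain JavaScript regular expression that you can test outside of Express. This is only an approximation of how Express interprets the string pattern; the exact string syntax depends on the Express version you have installed (Express 5 changed how optional parts are written), so treat it as a sketch:

```javascript
// Roughly equivalent to the string pattern "/getyear(.html)?"
const getYearPattern = /^\/getyear(\.html)?$/;

const samples = ["/getyear", "/getyear.html", "/getyears", "/getyear.htm"];
for (const path of samples) {
    console.log(path, getYearPattern.test(path));
}
// /getyear true
// /getyear.html true
// /getyears false
// /getyear.htm false
```

Express routing methods also accept a RegExp object in place of a string, so a pattern like this one could be passed straight to app.get().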

First Express Application

Start up a new project; I'll call mine /firstExpress

Add your app.js file and add the strict mode statement followed by the statements to add express, set the port setting (from the PORT environment variable, if there is one), and start listening for requests:

"use strict";

const express = require("express"),
    app = express();
app.set("port", process.env.PORT || 3000);

// rest of code will go here

app.listen(app.get("port"), () => {
  console.log(`Server running on port: ${app.get('port')}`);
});

What we'd like to do is add a function that prints the request information to the console for debugging purposes. This will be part of our middleware stack. When the program executes a middleware function, it needs to know where to go next (what is the next function to execute in the stack). Middleware functions usually have an extra parameter, besides the request and response parameters:

app.use((req, res, next) => {                    
    console.log(`${req.method} request made to: ${req.url}`);    
    // uncomment if you'd like to see the request headers:
    //console.log(req.headers);
    next(); // invoke next function                                  
});

The app.use() function mounts a middleware function: in plain English, this just means that you're telling Express that you would like it to use this function by adding it to the middleware stack. Express will take the function and execute it whenever a request matches the URL pattern. We didn't specify a URL, so it will use the default of "/". This means that this middleware function will execute for ALL incoming requests: the URL pattern for app.use() specifies what the URL should begin with, and all requests begin with "/".

The next parameter is a reference to the next function in the stack: you can see that we're calling next() on the last line of the callback. So basically, this function executes its code and then calls the next function in the stack. If a middleware function takes next as a parameter, it must either call next() when it's done or end the request/response cycle itself.

You don't have to pass anything into the next parameter: like req and res, this is given a value automatically.

Also note that even though we're not using the res parameter, we still have to include it (we also have to include req): Express passes the arguments by position, so next is always the third one. The req and res parameters contain the request and response objects for this request: even if we don't use them, they are automatically passed to the next function in the middleware stack.

If a middleware function doesn't have a next parameter, it's considered the end of that chain, or the last function in the middleware stack. Any middleware with no next needs to send a response, because no other functions will execute after it.

The order of middleware functions matters: they will execute in the order they are defined (as long as they match the incoming request). For example, let's add a middleware function that executes on a GET request to the /express URL:

app.get("/express", (req, res) => {
    console.log(req.url);
    res.send("<h1>Hello, Express!</h1>");
});

This segment of code should go AFTER the app.use() statement: this middleware is specifically for GET requests to a URL that has a single segment "/express". For example, this middleware will execute for http://localhost:3000/express but it would not match http://localhost:3000/expresssss, http://localhost:3000/nexpress, http://localhost:3000/express/foo, or http://localhost:3000/foo/express.

The Request and Response objects in Express have more properties and functions than the ones in plain Node.js. The Response object's send() function is similar to the Node.js Response.end() function except that response.send() will automatically set the Content-Type header to an appropriate MIME type, based on the response body being sent. For example, up above it automatically sets the Content-Type header to "text/html", because the send() function is sending a response body with HTML code inside it. You can check this later in your browser when you run this program. In most modern browsers, the developers' tools should have a way to view the request and response headers (usually in the Network tab).

How does our modified program work, now?

Initialize your app if you haven't already. Then run your program and try some URLs in your browser, while watching your program's console output:

Try localhost:3000/express

When a request comes in, it doesn't matter what the URL is: it will match app.use() because there is no URL pattern, so it defaults to "/". All requests start with "/", so this will match all requests. So after trying localhost:3000/express, you should see the output from the app.use() callback in the console.

Since this request is for "/express", it will also execute the callback in the app.get("/express") function. This will show the request URL in the console and load the Hello, Express! output in the browser.

Figure: code and output from http://localhost:3000/express

But if a request comes in for http://localhost:3000/foo, the app.use() callback will execute first, and then the request will fail: the app.use() callback passes execution on to the next function in the middleware stack, but we didn't define one (because the app.get() doesn't match the URL). Therefore, there's no function left to handle the request, and Express falls back to its default 404 "Cannot GET /foo" response. In fact, all URLs except /express will work the same way.

Figure: "Cannot GET /foo" in the browser, with the console output from http://localhost:3000/foo

It's important to make sure you cover all the possibilities, or make sure you have an error handler that sends back a 404 response (we'll get to that eventually). You should never have a middleware function with a next parameter at the end of a middleware stack.
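One common way to cover the leftover cases is a catch-all middleware mounted after every other route. This is a minimal sketch, assuming app is the Express application object created earlier in the lesson; the function name addNotFoundHandler and the message text are made up for this example:

```javascript
// Call this AFTER all other app.use()/app.get()/app.post() calls,
// so it only runs when nothing else has sent a response.
function addNotFoundHandler(app) {
    app.use((req, res) => {
        // No next parameter: this is the end of the stack, so it must respond.
        res.status(404).send("<h1>404: Page not found</h1>");
    });
}
```

In app.js you would call addNotFoundHandler(app); just before app.listen(), since middleware runs in the order it is added.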

Think of middleware like a series of steps that are followed, but not everyone will always need to follow the same steps. For example, you might consider a set of steps or procedures that are followed after arriving at an airport by plane:

This is a standard set of procedures for most airports, but not everyone needs to follow all of those steps. Furthermore, some steps depend on whether or not other steps are followed.

For example, not all flights have officials asking to see passports. So if a flight you're on isn't one of those flights, then the "show passport to officials" doesn't need to be followed for that flight. Someone who flew without extra baggage does not need to follow the steps regarding going to baggage claim and picking up their luggage. If a person is being met by someone at the airport, they may not need to find a ride to their destination if the person they're meeting is doing the driving.

Similarly, a single person is not going to both "join citizens line" and "join visitors line": if you're arriving, you're either a citizen of the country you've landed in or you're a visitor to that country, so you would only join the line that applies to you. Also, doing a random search is only going to apply to people going through customs who get chosen for a random search: not everyone is chosen and most people are not going to volunteer.

The middleware stack is the same: not all functions will apply to every request. Some functions will only apply to one specific request or might even apply to many requests. Some functions only apply if other functions have also been executed.

For example, say you have a web site at mystore.com, and in the /orders directory is the index page where customers can view their past orders, along with other pages and resources. Only logged in customers can access anything in the /orders directory. So you might set up your middleware as:

app.use("/orders", (req, res, next) => {
    if (userIsLoggedIn()) {
        next();
    } else {
        // send redirect to login page
    }
});
app.get("/orders(/index.html)?", (req, res) => {
    // send response with index.html page
});
app.get("/orders/outstanding", (req, res) => {
    // send response with outstanding.html page
});

In the fictional example above, all requests to /orders or any files/directories in /orders result in a callback that ensures a user is already logged in. If the user is not logged in, they are redirected to a login page. But if they are logged in, the next function in the middleware stack is executed (which is probably one of the ones bound to either of the two app.get() functions in the example). So if a user requests mystore.com/orders/outstanding, the program will make sure they're logged in before passing the request on to the appropriate app.get() callback.
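The login check in the fictional example could be filled in along these lines. This is only a sketch: it assumes that some earlier middleware (for example, a session library) has stored the logged-in user on req.user, and that the login page lives at /login. Both are assumptions made for this illustration, as is the name requireLogin:

```javascript
// Hypothetical check: assumes earlier middleware set req.user for logged-in users.
function userIsLoggedIn(req) {
    return Boolean(req.user);
}

// Guard middleware: either passes control on or ends the cycle with a redirect.
function requireLogin(req, res, next) {
    if (userIsLoggedIn(req)) {
        next();                 // logged in: continue to the matching app.get()
    } else {
        res.redirect("/login"); // not logged in: respond now, never call next()
    }
}
```

Because the function is named, it can be mounted with app.use("/orders", requireLogin); and reused for any other protected path.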

In our own example so far, all requests will do the app.use() middleware function, but only GET-requests to /express are going to do both middleware functions.

Let's add another middleware for GET requests. Put this one after the app.get("/express"):

app.get("/express/foo", (req, res) => {
    console.log(req.url);
    res.send("<h1>Foo!</h1>");
});

This middleware function will handle GET requests to the URL /express/foo. So when this URL is requested, the first middleware (the one with app.use()) will execute, because that one applies to ALL requests. The next one that handles GET requests to /express will be skipped over because the URL pattern doesn't match. The third middleware function for GET requests to /express/foo will execute, instead.

Try your program again: Go to your browser and try the following URLS:

What if you wanted a function to execute for requests that started with the /express path segment (e.g. /express, /express/foo, or /express/foo.html)?

You can add a middleware function with app.use() with the URL pattern "/express". Unlike app.get(), app.use("/express") will match localhost:3000/express, localhost:3000/express/foo, and localhost:3000/express/foo.html. However, it would also match localhost:3000/express/foo/bar and localhost:3000/express/foo/bar/whee. Try it (make sure this goes below the first app.use() and above the first app.get()):

app.use("/express", (req, res, next) => {
    console.log(`${req.originalUrl} starts with /express`);
    console.log(`New request URL: ${req.url}`);
    next();
});
Figure: all our middleware so far

Why does this middleware have to go between app.use() and app.get("/express")? Because the middleware functions are executed in the order that they appear in the code. When you enter the URL localhost:3000/express/foo in your browser, the app.use() will execute first, then app.use("/express"), then app.get("/express/foo"). If you were to move app.use("/express") below app.get("/express"), then app.use("/express") will not execute for localhost:3000/express, because the app.get("/express") is the end of the middleware stack for the /express request URL.

Try your program now, and test the url localhost:3000/express again:

GET request made to: /express
/express starts with /express
New request URL: /
/express
Console Output from the /express request

The first line of output is from the first app.use(). The second and third lines of output are from the app.use("/express") callback. Notice that the "New request URL" is listed as just "/". The app.use() callback's req.url property contains the URL without the mount point (which is /express). It has no effect on the original request URL, which you can access in the req.originalUrl property.
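The relationship between req.originalUrl and req.url inside a mounted middleware can be approximated with a small helper. stripMountPoint is a hypothetical function written for this illustration; it imitates the result you see in the console, not Express's internal code, and it assumes the URL really does start with the mount point:

```javascript
// Approximates how req.url is derived inside app.use(mountPoint, ...):
// the mount point is removed from the front of the original URL, and the
// result always starts with "/".
function stripMountPoint(mountPoint, originalUrl) {
    const rest = originalUrl.slice(mountPoint.length);
    return rest.startsWith("/") ? rest : "/" + rest;
}

console.log(stripMountPoint("/express", "/express"));            // /
console.log(stripMountPoint("/express", "/express/foo"));        // /foo
console.log(stripMountPoint("/express", "/express/foo?age=50")); // /foo?age=50
```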

The last line of output is from the app.get("/express") callback.

Now try using the same requests in the list we made before:

URL patterns in app.get() must match exactly. So you'll notice that the pattern /express will only match localhost:3000/express: it will not match localhost:3000/expresss, localhost:3000/foo/express, or anything else. URL patterns in app.use() will match if the URL starts with the pattern, which can be followed by one or more additional segments. So app.use("/express") matches localhost:3000/express, localhost:3000/express/foo, localhost:3000/express/foo.html, localhost:3000/express/foo/bar.html, etc.
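The two matching rules can be compared side by side with a pair of rough stand-in functions. These are simplifications written for this lesson (matchesRoute and matchesMount are made-up names), not Express's real matching code, and they only handle plain paths without wildcards or query strings:

```javascript
// app.get(pattern): the whole path must match
// (by default Express also tolerates a trailing "/").
function matchesRoute(pattern, path) {
    return path === pattern || path === pattern + "/";
}

// app.use(mount): the path must equal the mount point or continue with "/".
function matchesMount(mount, path) {
    if (mount === "/") {
        return true; // the default mount point matches every request
    }
    return path === mount || path.startsWith(mount + "/");
}

for (const path of ["/express", "/expresss", "/express/foo", "/foo/express"]) {
    console.log(path, matchesRoute("/express", path), matchesMount("/express", path));
}
// /express true true
// /expresss false false
// /express/foo false true
// /foo/express false false
```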

You can also try a query string in the URL by simply typing it. Query string parameters are stored in the req.query property when they are sent via GET request. Add a console.log(req.query); to the app.get("/express/foo") callback, and then try the following URL:
http://localhost:3000/express/foo?name=foobar&age=50

You should see the following output in the console:

GET request made to: /express/foo?name=foobar&age=50
/express/foo?name=foobar&age=50 starts with /express
New request URL: /foo?name=foobar&age=50
/express/foo?name=foobar&age=50
{ name: 'foobar', age: '50' }
Sample output with query string contents

The req.query property contains a regular JavaScript object, so you can also access the properties of that object:

res.send(`<h1>Foo!</h1><p>${req.query.name} is ${req.query.age}</p>`);
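If you want to experiment with query strings outside of Express, Node's built-in URL class produces a very similar object. This sketch is not how Express builds req.query internally, but for a simple query string like the one in this lesson the result has the same keys and values:

```javascript
// Parse the query string from the lesson's URL with Node's built-in URL class.
const requestUrl = new URL("http://localhost:3000/express/foo?name=foobar&age=50");
const query = Object.fromEntries(requestUrl.searchParams);

console.log(query);                 // { name: 'foobar', age: '50' }
console.log(typeof query.age);      // string
console.log(Number(query.age) + 1); // 51
```

Note that every value arrives as a string, even when it looks like a number, so convert it before doing arithmetic.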

Now you can start to imagine how you can process form inputs!

Handling POST Requests

Before reading this part of the lesson, make sure you are familiar with how form submission works: you should be familiar with the action and method attributes, understand the difference between an HTTP GET request and an HTTP POST request, and know what a query string is. If you need to review, read over Processing Form Submission.

To handle a POST request in an Express app, you can use the app.post() function: it works exactly like app.get(), except that it only matches POST requests. Try adding a middleware function that handles a POST request to /express:

app.post("/express", (req, res) => {
    // prints whatever is in the request body; req.body is passed as a separate
    // argument because an object inside a template string prints as [object Object]
    console.log("POST request to /express contains:", req.body);
    res.send("<h1>This was a POST Request</h1>");
});

In this middleware, I'm printing the contents of the request body by accessing the body property of the request object.

You can't test a POST request in your browser without a FORM: when you click a URL or type a URL in the address bar, your browser will always perform a GET request. To test a POST request, you can use the curl command in your terminal or command prompt window. Open a second terminal window or command prompt window (obviously you can't use the one where your app is currently running) and type:

curl -X POST http://localhost:3000/express

After typing this command, press ENTER.

The curl command performs an HTTP request to the specified URL. The -X POST option says that we want to do a POST request instead of the default GET request.

In the window where you typed the curl command, you should see the HTML for your level-1 heading: <h1>This was a POST Request</h1>. If you check the other terminal or command prompt window where your app is running, you'll see the console output with an undefined request body (we haven't added a request body yet).

POST request made to: /express
/express starts with /express
New request URL: /
POST request to /express contains: undefined
Console output from the POST request

The first line of output is from the first app.use() callback. The second and third lines are from the app.use("/express") callback. The last line is from the app.post() callback.

You can send data in the request body using the curl command by adding the --data option (the data is formatted like a query string):

curl --data "name=foobar&age=50" http://localhost:3000/express

This will automatically perform a POST request, but if you prefer, you can specify that it's a POST request:

curl -X POST --data "name=foobar&age=50" http://localhost:3000/express

Either way, when you return to your console window, you'll notice the request body is still showing undefined.

The req.body property is not populated with the actual body of the Request object automatically. You have to tell Express to take the body of the request and put it in the req.body property. There are a few existing middleware functions you can use to do this (you do not have to write your own), such as express.json() for JSON bodies and express.urlencoded() for URL-encoded form data.

Since we're dealing with regular query strings, we can use urlencoded(). You need to mount the express.urlencoded() function as a middleware function. This should be placed above app.post() so that it catches all POST requests before the app.post() end point.

app.use(express.urlencoded({extended: false}));

You can pass express.urlencoded() a JavaScript object with various options as properties. In this case I'm sending an object with one property:
{extended: false}
Setting the extended option to false indicates that we don't want to use extended syntax to parse out the data (there's an extra library/module for this if you want to do fancy things: you can read more in the express.urlencoded() documentation if you're interested in exploring this).

After express.urlencoded() parses out the request body and saves it to the req.body property, it will automatically go on to the next function in the stack: it calls next() for you, so there is no need to call it yourself.

Before trying the updated program, uncomment the console.log(req.headers); in the first app.use() callback, if it's not already.

Rerun your program and try the curl command again that sent a POST request with a query string: see that now you have some meaningful output for the request body:

POST request made to: /express
/express starts with /express
New request URL: /
POST request to /express contains: [Object: null prototype] { name: 'foobar', age: '50' }
Output after updated POST request
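The [Object: null prototype] prefix in the output is what Node prints for an object that was created without a prototype, which is the kind of object Node's built-in querystring parser produces. The sketch below reproduces the same shape outside of Express; it imitates the result of the middleware, not its internals:

```javascript
const querystring = require("querystring");

// The same string that curl sent with --data
const rawBody = "name=foobar&age=50";
const parsed = querystring.parse(rawBody);

console.log(parsed);                                 // [Object: null prototype] { name: 'foobar', age: '50' }
console.log(Object.getPrototypeOf(parsed) === null); // true
console.log(parsed.name);                            // foobar
```

The properties are read the same way as on any other object; the only practical difference is that inherited methods such as toString() are not available on it.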

Check the request headers in your output: notice the Content-Type header is set to "application/x-www-form-urlencoded". This is the MIME type for url-encoded data in a request body (e.g. from a form submission). That's how the express.urlencoded() middleware knew to parse this request: it only parses the bodies of requests that have that MIME type, because it says the request body contains url-encoded data in query string format.

Figure: the request headers from the POST request, with the Content-Type header circled

As with the GET request's req.query, you can access the key-value pairs individually, e.g. req.body.name:

res.send(`<h1>This was a POST Request</h1><p>${req.body.name} is ${req.body.age}</p>`);