In the Introduction to Node.js
lesson, we talked a little
bit about Express.js: what it is and what it's used for.
One of the great things Express.js allows us to use is
middleware.
Often when processing requests, you might have several
different functions that need to execute: these are
middleware functions. Express.js allows us to add
middleware functions to perform important tasks such
as converting a request's body into JSON or form-encoded
data, calculating data from the request in some way,
or even retrieving and processing values from a request
URL.
In this lesson we'll become familiar with Express.js
and how we can use middleware to route
requests that involve a bit more processing.
Middleware is the function or set of functions that
execute in between receiving a request and sending
the response. Most requests involve many steps,
and each step could be a middleware function.
A middleware function generally performs one or more
of the following tasks:
Execute code that processes the request.
Modify the request or response object.
End the request/response cycle.
Invoke the next middleware function in the middleware stack.
The set of middleware functions that a request has to go
through is called the middleware
stack.
A stack is a data structure for a
collection of items. The first item in the collection
is the last item processed, and the last item in the collection
is the first item processed. I always describe it as a
stack of plates: When you stack the plates, the first plate
you put down is at the bottom. The last plate you put down is
on the top of the stack. When someone comes along to take a plate,
they will take the top plate (the one on the top of the stack).
The first plate you put down, at the bottom of the stack, will be the last
one taken. Stacks are used a lot in computers and programming:
you're probably already familiar with a stack trace
(the pathway of function calls a process took when it encountered
an error), or a call stack (a series of function calls in the
order they were invoked: when the last function is done,
the program flows back up the call stack in reverse order
to figure out where it left off).
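The plate analogy maps directly onto a JavaScript array used as a stack. This is only an illustrative sketch; the plates array and its values are made up for the example.

```javascript
// A plain array works as a stack: push() adds to the top, pop() removes from the top.
const plates = [];
plates.push("first plate");   // ends up at the bottom
plates.push("second plate");
plates.push("third plate");   // ends up on top

// Plates come off in the reverse of the order they were stacked.
const taken = [];
while (plates.length > 0) {
    taken.push(plates.pop());
}
console.log(taken); // third plate first, first plate last
```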
In an Express app, the middleware stack is the set
of middleware functions that need to execute for
a specific request. For example, a GET request to
www.foo.com/orders/123 might have the following middleware
functions:
a function getId(req.url) that extracts the ID "123" from the URL and returns it
a function searchById(id) that accepts the ID, looks it up
in a database, and returns the order object it retrieved
a function renderOrder(order) that accepts that
order object, formats it in a user-friendly way,
and sends it to a view
a function logRequest() that logs the request for
audit purposes
So in this example, your middleware stack consists of
getId(), searchById(), renderOrder(), and logRequest().
Why would we not write all this code in a single function?
Because a lot of the tasks you want to perform on a request
are also performed on other requests. We modularize our
code so that we don't have to write the same code for
different requests: just call the function(s) that request
needs.
A middleware function must either end the request/response
cycle or it must call the next() function
to pass control to the next middleware function in the stack.
If a middleware function does not call next(),
it must terminate the request/response cycle (e.g. by sending
the response back to the client). If a cycle is never
terminated, the response is never sent back to the client.
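To make the role of next() concrete, here is a small stand-alone sketch that walks a list of middleware functions the way the text describes. It is a simplified model written for illustration, not Express's actual implementation: runStack, the fake orders data, and the req.log, req.orderId and req.order properties are all hypothetical. Logging is placed first here so that the last function can be the one that sends the response.

```javascript
// Hypothetical in-memory "database" for the example.
const orders = { "123": { id: "123", item: "Widget", quantity: 2 } };

// Each middleware receives req, res, and next, does one job, then either
// calls next() or ends the cycle by sending a response.
function logRequest(req, res, next) {
    req.log = `${req.method} ${req.url}`;
    next();
}
function getId(req, res, next) {
    req.orderId = req.url.split("/").pop(); // "/orders/123" -> "123"
    next();
}
function searchById(req, res, next) {
    req.order = orders[req.orderId];
    next();
}
function renderOrder(req, res) {
    // No next parameter: this function ends the request/response cycle.
    res.send(`Order ${req.order.id}: ${req.order.quantity} x ${req.order.item}`);
}

// Runs the functions in order; each one runs only if the previous one called next().
function runStack(stack, req, res) {
    let index = 0;
    function next() {
        const fn = stack[index++];
        if (fn) fn(req, res, next);
    }
    next();
}

// Minimal stand-ins for the request and response objects.
const req = { method: "GET", url: "/orders/123" };
const res = { body: null, send(text) { this.body = text; } };
runStack([logRequest, getId, searchById, renderOrder], req, res);
console.log(res.body);
```

If any function in the list neither calls next() nor sends a response, res.body stays null: that is the "response is never sent" situation described above.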
Middleware functions can read request data and modify the response
before the response is sent to the client. With Express.js, you
can use middleware functions for both GET and POST requests.
You can use functions that are already part of Node, Express,
or some other module, or you can create your own middleware functions.
In most applications, you'll do all of these.
Adding Express to an Application
The Express module contains the following:
Methods/functions for reading various forms of request data
(e.g. if a request body contains plain text, JSON data, or
other types of data).
An Application object, which contains properties and middleware
functions that allow you to process GET requests and POST requests
(this is what you're getting when you call the express() function).
A Request object, which contains properties and functions
that allow you to read request information such as the URL
and headers, along with information that plain Node.js doesn't
hand you directly (the protocol used, the parsed query string, etc).
A Response object, which contains properties and methods/functions
that allow you to modify the response headers and response body,
set cookies, add attachments, perform redirects, and lots of other
useful tasks.
A Router object that is used for routing of requests. We often
use the Router in larger programs with lots of different requests
that require many different tasks.
To add Express to your Node.js app, you have to
import the "express" module and then
instantiate the Express Application object.
// programs would start with something like:
const express = require("express"),
    app = express();
// ... rest of code ...
The const express = require("express") part
of the statement imports the Express.js module into your project
so that you can use Express.js objects/functions.
Because we install Express with npm,
we don't have to download it by hand: unless you ask for a specific
version, npm install express fetches the latest published release
and records it in your package.json.
The app = express() part of the statement creates
an instance of the Express Application object.
The variable app can be any valid JavaScript
identifier. It references the main Express.js Application
object, so it's the variable you use when you want to refer
to one of the Application properties or methods.
The application object is also your server:
you don't have to call createServer(): the express
application will do that for you.
To start the server, you can use the app.listen()
function:
app.listen(app.get("port"), () => {
    console.log(`Server running on port: ${app.get('port')}`);
});
Common Functions
Some common functions you will find useful from the Application
object:
app.set(property, value)
allows you to configure your application by setting
named application settings to specific values
Application settings are application-wide
variables that store configuration data about the application.
e.g. app.set("port", process.env.PORT || 3000);
sets the port the app uses for listening/waiting for requests
(if there is no PORT environment variable, it sets it to 3000)
In previous apps, we set a const for this
e.g. const port = 3000
But in real life, the port number may not stay the same.
process.env contains all the environment variables
available to the process running the application (these are
set outside your code, e.g. by the operating system or hosting service).
As you can imagine, you can retrieve the values of
application settings using app.get()
e.g. app.get("port")
-- the app.get() method is overloaded: there are two versions (one
for processing GET requests and one for getting
settings. Express knows the difference by looking at the arguments
passed into the get() method: a single argument is treated as a setting name)
app.get(), app.post(), etc.
These methods route a specific type of request with a
specific URL or URL pattern; they define an end
point for a specific request.
There are several request methods, not just GET and POST
(e.g. PUT, DELETE). There's a function for each request
method e.g. app.put() for PUT requests, app.delete() for DELETE
requests, etc.
app.all() is used when you want to route requests of any
request type e.g. all requests to www.foo.com/orders execute a
specific function, regardless of whether it's GET, POST, whatever.
All these functions can take an optional argument for a URL
or URL Pattern: the URL/pattern indicates the end point.
For example, app.get("/foo") will define the end point for
the url www.domain.whatever/foo.
All of these functions also take one or more callbacks,
which are the functions that should execute when the
request URL matches the pattern.
If no URL is provided, the path "/" is assumed.
app.use()
Sets up ("mounts") a middleware function to be executed
for a specific URL pattern.
Accepts an optional URL pattern and a callback function.
It adds the callback function to the middleware stack when
the request matches the URL/pattern
If no URL is specified, the path "/" is assumed.
Unlike app.get(), app.post(), etc, app.use() will add the
callback(s) if the URL begins with the URL pattern.
e.g. app.use("/foo") will match www.domain.whatever/foo,
www.domain.whatever/foo/bar, www.domain.whatever/foo/bar.html,
www.domain.whatever/foo/bar/baz, etc.
app.listen()
starts listening for connections
If you look back at the routing program we wrote in
Routing in Node.js, you
might notice that we called createServer() with a callback,
and that callback handled all requests. We then wrote
code inside the callback to examine the request URL
and send back the appropriate file.
In an Express app, we instead create the Express application
object and then tell it which functions we
want to add to the middleware stack for the various
requests we are expecting to process. So instead of
having one callback function with if statements, we
will have several modular functions that the
server will execute when the URL patterns match.
It's much easier to understand with an actual example.
URL Patterns
A URL pattern (or route path), along with
a request type (e.g. GET or POST), is what defines the
end points for various requests.
For example, you could specify that GET requests to /foo
execute function1() but POST requests to /foo execute
function2().
A URL pattern is simply a pattern or expression that
a request URL needs to match for a specific end point.
For example, the pattern /foo would match localhost:3000/foo
(for these examples, I'll use localhost:3000, but note
that you could put a domain in here too, such as www.whatever.com/foo).
Similarly, the pattern "/" just means the root of the application.
So if your application was at www.whatever.com, then the pattern "/"
would be the same as the url https://www.whatever.com.
If your application lived at www.whatever.com/myapplication, then the
pattern "/" would match https://www.whatever.com/myapplication.
Note that if you're testing your applications on a server, this
might be an important distinction.
URL patterns don't have to be for a specific directory.
You can use file names, wildcard
symbols, and even regular expressions to match specific
URLs. You can create complex expressions using special
characters, and even full
regular expression syntax. For
example:
The ? character means the preceding character is optional.
So "/foo?" means "fo followed by an optional o": this will
match /foo or /fo
The + character means "one or more of the preceding
character". So "/foo+bar" will match "fo" followed by one or more
"o"s, followed by bar: /foobar, /foooobar, /fooooooobar, etc.
The * character is a wildcard that matches any sequence of characters,
so "/foo/*" means the directory /foo followed by any other
path segment (and any number of path segments),
so it will match /foo/bar, /foo/bar.html, /foo/bar/index.html,
etc. However, "/foo/*" will not
match /foo on its own (e.g. www.whatever.com/foo would not match).
You can also use the wildcard for file extensions: "/foo.*"
will match /foo.html, /foo.php, /foo.css, etc.
You can enclose characters in parentheses to group them
together. For example, "/foo(bar)?" means "foo" which
may be optionally followed by "bar", so this would match
/foo or /foobar. This could be useful if you wanted to
do something like "match /getyear or /getyear.html":
you could use the pattern "/getyear(.html)?".
You can use other valid regex symbols and expressions
if you need to create more complicated URL patterns.
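How special characters in a string pattern are interpreted depends on the Express version, so treat the strings above as the syntax this lesson was written against. As a version-independent illustration, the sketch below writes a few of the same patterns as ordinary JavaScript regular expressions (which can also be passed as route paths) and tests them against sample paths. The patterns object and its property names are made up for this example.

```javascript
// Regular-expression equivalents of some of the string patterns above.
const patterns = {
    optionalO: /^\/foo?$/,                 // like "/foo?": /fo or /foo
    repeatedO: /^\/foo+bar$/,              // like "/foo+bar": /foobar, /foooobar, ...
    underFoo: /^\/foo\/.*$/,               // like "/foo/*": anything below /foo/
    optionalHtml: /^\/getyear(\.html)?$/   // like "/getyear(.html)?"
};

// test() returns true when the path matches the pattern.
console.log(patterns.optionalO.test("/fo"));               // true
console.log(patterns.repeatedO.test("/foooobar"));         // true
console.log(patterns.underFoo.test("/foo/bar/index.html")); // true
console.log(patterns.underFoo.test("/foo"));               // false
console.log(patterns.optionalHtml.test("/getyear.html"));  // true
```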
First Express Application
Start up a new project; I'll call mine /firstExpress
Add your app.js file and add the strict mode statement
followed by the statements to add express,
set the "port" application setting, and start
listening for requests:
"use strict";
const express = require("express"),
    app = express();
app.set("port", process.env.PORT || 3000);
// rest of code will go here
app.listen(app.get("port"), () => {
    console.log(`Server running on port: ${app.get('port')}`);
});
What we'd like to do is add a function that prints the
request information to the console for debugging purposes.
This will be part of our middleware stack. When the program
executes a middleware function, it needs to know where to go
next (what is the next function to execute in the stack).
Middleware functions usually have an extra parameter, besides
the request and response parameters:
app.use((req, res, next) => {
    console.log(`${req.method} request made to: ${req.url}`);
    // uncomment if you'd like to see the request headers:
    //console.log(req.headers);
    next(); // invoke next function
});
The app.use() function mounts a middleware function:
in plain English, this just means that you're telling Express that
you would like it to use this function by adding it to the middleware
stack. Express will take the function and execute it
whenever a request matches the URL pattern.
We didn't specify a URL, so it will use the default of "/".
This means that this middleware function will execute for
ALL incoming requests: the URL pattern for app.use()
specifies what the URL should begin with, and all requests
begin with "/".
The next parameter is a reference to the
next function in the stack: you can see that we're calling
next() on the last line of the callback.
So basically, this function executes its code and then calls
the next function in the stack. If a middleware
function takes next as a parameter, it should
call next() when it has finished its work, unless it ends the
request/response cycle itself.
You don't have to pass anything into the next
parameter: like req and res, this is
given a value automatically.
Also note that even though we're not using the res
parameter, we still have to include it (we also have to include
req). The req and res
parameters contain the request and response objects for this
request: even if we don't use them, they are automatically
passed to the next function in the middleware stack.
If a middleware function doesn't have a next parameter,
it's considered the end of that chain, or the last function
in the middleware stack. Any middleware with no next
needs to send a response, because no other functions will
execute after it.
The order of middleware functions matters: they will
execute in the order they are defined (as long as they match
the incoming request).
For example, let's add a middleware function that
executes on a GET request to the /express URL:
This segment of code should go AFTER the app.use() statement:
this middleware is specifically for GET requests to a URL
that has a single segment "/express". For example, this
middleware will execute for http://localhost:3000/express
but it would not match http://localhost:3000/expresssss,
http://localhost:3000/nexpress, http://localhost:3000/express/foo,
or http://localhost:3000/foo/express.
The Request and Response objects in Express have more
properties and functions than the ones in plain Node.js.
The Response object's send() function is similar to the
Node.js Response.end() function except that response.send()
will automatically set the Content-Type header to an
appropriate MIME type, based on the response body being
sent. For example, up above it automatically sets the
Content-Type header to
"text/html", because the send()
function is sending a response body with HTML code
inside it. You can check this later in your browser
when you run this program. In most modern browsers, the
developers' tools should have a way to view the request and
response headers (usually in the Network tab).
How does our modified program work, now?
Initialize your app if you haven't already.
Then run your program and try some URLs in your browser, while
watching your program's console output:
Try localhost:3000/express
When a request comes in, it doesn't matter what the URL
is: it will match app.use() because there is no URL pattern,
so it defaults to "/". All requests start with "/", so
this will match all requests. So after trying localhost:3000/express,
you should see the output from the app.use() callback in the
console.
Since this request is for "/express" then it will also execute the
callback in the app.get("/express") function.
This will show the request URL in the console and load the
Hello, Express! output in the browser.
But if a request comes in for http://localhost:3000/foo, the
app.use() callback will execute first, and then the request
fails: the app.use() callback was told
to pass execution on to the next function in the middleware
stack, but we didn't define one (because the app.get()
doesn't match the URL). Therefore, there's no function
left to handle the request, and Express falls back to its default
"Cannot GET /foo" error response with a 404 status.
In fact, all URLs except /express will work the same way.
It's important to make sure you cover all the possibilities,
or make sure you have an error handler that sends back
a 404 response (we'll get to that eventually). You should never
have a middleware function with a next parameter
at the end of a middleware stack.
Think of middleware like a series of steps that are followed,
but not everyone will need to always follow the same steps.
For example, you might consider a set of steps or procedures
that are followed after arriving at an airport by plane:
disembark plane
show passport to officials
join citizens line
join visitors line
provide documents and answer questions
go to baggage claim
find and pick up bags
go through customs, answer questions
do random search
enter public arrivals hall
search for person/people meeting you
find a ride to destination
This is a standard set of procedures for most airports,
but not everyone needs to follow all of those steps.
Furthermore, some steps depend on whether or not other
steps are followed.
For example, not all flights have officials asking to
see passports. So if a flight you're on isn't one of
those flights, then the "show passport to officials"
doesn't need to be followed for that flight. Someone
who flew without extra baggage does not need to follow the
steps regarding going to baggage claim and picking up
their luggage. If a person is being met by someone at the
airport, they may not need to find a ride to their destination
if the person they're meeting is doing the driving.
Similarly, a single person is not going to both
"join citizens line" and "join visitors line": if you're
arriving, you're either a citizen of the country you've landed
in or you're a visitor to that country, so you would only
join the line that applies to you. Also, doing a random search
is only going to apply to people going through customs that
get chosen for a random search: not everyone is chosen
and most people are not going to volunteer.
The middleware stack is the same: not all functions will apply
to every request. Some functions will only apply to one
specific request or might even apply to many requests.
Some functions only apply if other functions have
also been executed.
For example, say you have a web site at mystore.com,
and in the /orders directory is the index page
where customers can view their past orders, along with
other pages and resources. Only logged in customers
can access anything in the /orders directory.
So you might set up your middleware as:
In the fictional example above, all requests to /orders
or any files/directories in /orders result in a callback
that ensures a user is already logged in. If the user is
not logged in, they are redirected to a login page. But
if they are logged in, the next function in the
middleware stack is executed (which is probably one of the
ones bound to either of the two app.get() functions in
the example). So if a user requests mystore.com/orders/outstanding,
the program will make sure they're logged in before passing
the request on to the appropriate app.get() callback.
In our own example so far, all requests will do the app.use()
middleware function, but only GET-requests to /express
are going to do both middleware functions.
Let's add another middleware for GET requests. Put this
one after the app.get("/express"):
This middleware function will handle GET requests to the URL
/express/foo. So when this URL is requested, the
first middleware (the one with app.use()) will execute, because
that one applies to ALL requests. The next one that handles GET
requests to /express will be skipped over because
the URL pattern doesn't match. The third middleware function
for GET requests to /express/foo will execute,
instead.
Try your program again: Go to your browser and
try the following URLs:
localhost:3000/foo - this should print the request information
and request headers to the console, and then move to the
next middleware function in the chain. There is no other
function that matches a GET request to /foo, so you will
see Express's default error response in the browser:
Cannot GET /foo.
localhost:3000/express - this should print the request
information and headers (from the app.use() callback),
and then it will move to the next
function in the chain: the GET request for /express, which
sends a response that shows Hello Express in the browser.
localhost:3000/express/foo - this should print the request
information and headers (from the app.use() callback),
and then it will move to the next
function in the chain: the GET request for /express/foo,
which sends a response that shows "Foo" in the browser.
localhost:3000/foo/express - this should print the request
information and headers (from the app.use() callback),
and then it will move to the next
function in the chain: But there is no other
function that matches a GET request to /foo/express, so you will
see Express's default error response in the browser:
Cannot GET /foo/express.
localhost:3000/ - this should print the request
information and headers (from the app.use() callback),
and then it will move to the next
function in the chain: But there is no other
function that matches a GET request to /, so you will
see Express's default error response in the browser:
Cannot GET / (/ is another way of saying the "application
root").
What if you wanted a function to execute for requests
that started with the /express path segment (e.g.
/express, /express/foo, or /express/foo.html)?
You can add a middleware function with app.use()
with the URL pattern "/express". Unlike app.get(),
app.use("/express") will match localhost:3000/express,
localhost:3000/express/foo, and
localhost:3000/express/foo.html. However,
it would also match localhost:3000/express/foo/bar
and localhost:3000/express/foo/bar/whee.
Try it (make sure this goes below the first
app.use() and above the first app.get()):
Why does this middleware have to go between app.use() and
app.get("/express")? Because the middleware functions are
executed in the order that they appear in the code. When you
enter the URL localhost:3000/express/foo in your browser, the
app.use() will execute first, then app.use("/express"),
then app.get("/express/foo"). If you were to move app.use("/express")
below app.get("/express"), then app.use("/express") will not
execute for localhost:3000/express, because the app.get("/express")
is the end of the middleware stack for the /express request URL.
Try your program now, and test the url
localhost:3000/express again:
The first line of output is from the first
app.use(). The second and third lines of output
are from the app.use("/express") callback.
Notice that the "New request URL" is listed as
just "/". The app.use() callback's req.url
property contains the URL without the mount point
(which is /express). It has no effect on the original
request URL, which you can access in the
req.originalUrl property.
The last line of output is from the app.get("/express")
callback.
Now try using the same requests in the list
we made before:
localhost:3000/foo - executes app.use(), then an error.
localhost:3000/express/foo - executes app.use(),
then app.use("/express"), then app.get("/express/foo")
localhost:3000/foo/express - executes app.use(),
then an error
localhost:3000/ - executes app.use(), then an error
Also, try localhost:3000/express/foo/bar - this executes
app.use(), then app.use("/express"), then shows an error
because app.use("/express") calls next() and there are no
middleware functions left that match.
URL patterns in app.get()
must match exactly. So you'll notice
that the pattern /express will only match
localhost:3000/express,
it will not match localhost:3000/expresss,
localhost:3000/foo/express,
or anything else. URL patterns in app.use() will match
if the URL starts with the
pattern and can be followed by one or more additional
segments. So app.use("/express") matches localhost:3000/express,
localhost:3000/express/foo, localhost:3000/express/foo.html,
localhost:3000/express/foo/bar.html, etc.
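The difference between the two matching rules can be modelled with a few lines of plain JavaScript. This is a deliberately simplified model, not Express's implementation: it ignores details such as trailing slashes, letter case, and special pattern characters, and the function names are made up for the example.

```javascript
// Route methods such as app.get() compare against the whole path.
function matchesExact(pattern, path) {
    return path === pattern;
}

// app.use() matches when the path is the mount point itself
// or continues with another "/" segment.
function matchesMount(mount, path) {
    return path === mount || path.startsWith(mount + "/");
}

// What req.url looks like inside mounted middleware: the mount point is removed.
function urlInsideMount(mount, path) {
    return path.slice(mount.length) || "/";
}

console.log(matchesExact("/express", "/express/foo"));   // false
console.log(matchesMount("/express", "/express/foo"));   // true
console.log(matchesMount("/express", "/expresss"));      // false
console.log(urlInsideMount("/express", "/express"));     // "/"
console.log(urlInsideMount("/express", "/express/foo")); // "/foo"
```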
You can also try a query string in the URL by
simply typing it. Query string parameters are stored
in the req.query property when they
are sent via GET request. Add a console.log(req.query);
to the app.get("/express/foo") callback, and then try
the following URL: http://localhost:3000/express/foo?name=foobar&age=50
You should see the following output in the console:
The req.query property contains a regular
JavaScript object, so you can access the properties of that
object, also:
res.send(`<h1>Foo!</h1><p>${req.query.name} is ${req.query.age}</p>`);
Now you can start to imagine how you can process form inputs!
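To see how a query string turns into key-value pairs without running the server, you can parse the same URL with Node's built-in URL class. This is a stand-in for illustration, not how Express builds req.query; the query variable simply mimics what req.query would hold for this request.

```javascript
// Parse the example URL with Node's built-in URL class.
const url = new URL("http://localhost:3000/express/foo?name=foobar&age=50");

// Collect the query string's key-value pairs into a plain object.
const query = Object.fromEntries(url.searchParams);

console.log(query.name);        // foobar
console.log(typeof query.age);  // string: values arrive as text, not numbers
console.log(`${query.name} is ${query.age}`);
```

Because every value is a string, convert it (for example with Number(query.age)) before doing arithmetic.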
Handling POST Requests
Before reading this part of the lesson, make sure you are
familiar with how form submission works: you should be
familiar with the action and method
attributes, understand the difference between an HTTP GET-request
and an HTTP POST-request, and know what a query string is.
If you need to review, read over
Processing Form Submission.
To handle a POST request in an Express app, you can use
the app.post() function: it works exactly like app.get(),
except that it only matches POST requests. Try adding a
middleware function that handles a POST request to /express:
app.post("/express", (req, res) => {
    // prints whatever is in the request body; req.body is passed as a separate
    // argument because an object placed in a template string prints as [object Object]
    console.log("POST request to /express contains:", req.body);
    res.send("<h1>This was a POST Request</h1>");
});
In this middleware, I'm printing the contents of the request
body by accessing the body property of the
request object.
You can't test a POST request in your browser without a form:
when you click a URL or type a URL in the address bar, your
browser will always perform a GET request. To test a POST
request, you can use the curl command in your
terminal or command prompt window. Open a second terminal
window or command prompt window (obviously you can't use the
one where your app is currently running) and type:
curl -X POST http://localhost:3000/express
After typing this command, press ENTER.
The curl command performs an HTTP request to the
specified URL. We indicated that we wanted to do a POST request:
the -X POST says that we want to do a POST request
instead of a default GET request.
In the window where you typed the curl command, you should
see your level-1 heading with "This was a POST request".
If you check the other terminal or command prompt window
where your app is running, you'll see the console output
with an undefined request body (we haven't added a request
body yet).
The first line of output is from the first app.use()
callback. The second and third lines are from the app.use("/express")
callback. The last line is from the app.post() callback.
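You can add URL-encoded data (the same format as a query string)
to the body of your request using the curl
command by adding the --data option:
curl --data "name=foobar&age=50" http://localhost:3000/express
This will automatically perform a POST request, but if you
prefer, you can specify that it's a POST request: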
curl -X POST --data "name=foobar&age=50" http://localhost:3000/express
Either way, when you return to your console window, you'll
notice the request body is still showing undefined.
The req.body property is not populated
with the actual body of the Request object automatically.
You have to tell Express to take the body of the request
and put it in the req.body property.
There are a few existing middleware functions you can use
to do this (you do not have to write your own middleware
to do this):
express.urlencoded() will look for requests
that have a body containing URL-encoded text. When it finds
these, it will take the body of the request and store it in the
req.body property as key-value pairs.
Query strings are url-encoded data, so when a form submits
using a POST request, the request body contains the form's
query string as url-encoded data.
express.json() works like express.urlencoded()
except that it looks for requests that contain JSON data in the
request body.
Since we're dealing with regular query strings, we can use
urlencoded(). You need to mount the express.urlencoded()
function as a middleware function. This should be placed
above app.post() so that it catches all
POST requests before the app.post() end point.
app.use(express.urlencoded({extended: false}));
You can pass express.urlencoded() an object with various
options as properties. In this case I'm sending an object
with one property: {extended: false}
Setting the extended option to false indicates that we don't want
to use extended syntax to parse out the data (there's an
extra library/module for this if you want to do fancy
things: you can read more in the
express.urlencoded()
documentation if you're
interested in exploring this).
After express.urlencoded() parses out the request body
and saves it to the req.body property,
it will automatically go on to the next function
in the stack: there is no need to call next() yourself.
Before trying the updated program, uncomment
the console.log(req.headers);
in the first app.use() callback, if it's not already.
Rerun your program and try the curl command again
that sent a POST request with a query string: see
that now you have some meaningful output for the
request body:
Check the request headers in your output: notice
the Content-Type header is set to
"application/x-www-form-urlencoded". This is the
MIME type for url-encoded data that comes from a
query string (e.g. form submission). That's
how the express.urlencoded() middleware
knew to execute: it targets requests that have that
MIME type, because this says there is a query string
inside the request body as url-encoded data.
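The idea of "only act on requests with this Content-Type" can be sketched as a small middleware function. This is a simplified stand-in, not how express.urlencoded() is implemented: it assumes the raw body text is already available in a made-up req.rawBody property, and the names parseUrlencoded and formRequest are hypothetical.

```javascript
// Simplified stand-in for a URL-encoded body parser (not Express's implementation).
// Assumes the raw body text is already available as req.rawBody (a made-up property).
function parseUrlencoded(req, res, next) {
    const type = req.headers["content-type"] || "";
    if (type.startsWith("application/x-www-form-urlencoded")) {
        // Turn "name=foobar&age=50" into key-value pairs on req.body.
        req.body = Object.fromEntries(new URLSearchParams(req.rawBody));
    }
    next(); // always pass control on, whether or not the body was parsed
}

// A fake request that looks like the curl form submission.
const formRequest = {
    headers: { "content-type": "application/x-www-form-urlencoded" },
    rawBody: "name=foobar&age=50"
};
parseUrlencoded(formRequest, {}, () => {
    console.log(formRequest.body);
});
```

A request with a different Content-Type passes through untouched, which is why its req.body would stay undefined unless another parser handles it.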
As with the GET request's req.query, you can access the key-value
pairs individually e.g. req.body.name
res.send(`<h1>This was a POST Request</h1><p>${req.body.name} is ${req.body.age}</p>`);