Processing Form Data

So you have an application that uses an HTML form to get user inputs: but how do you access that input data in your server-side code?

The action="" Attribute

First of all, your form's action="" attribute needs to define which program/file/script should receive the form input data. That program will then execute its code to process the form data (e.g. perform calculations, create objects, read/write a database, etc.).

If you want the current page that contains your form to also process the form data, you can set the action attribute to the current file name or leave the attribute out entirely. For example, if you have a form inside a page called foo.php, either of these will cause the form to be processed by the same foo.php page:

<form action="foo.php">
<form>

When a form is submitted using the SUBMIT button, the user's inputs in the form are sent to the server in an HTTP request. The form data is sent as a query string, either in the request URL or in the request body.

Query Strings

A query string consists of a set of key-value pairs separated by the & (ampersand) symbol. The key is the form input element's name="" attribute value, and the value is the value that the input element contains, i.e. the value the user typed, selected, or checked. We'll look at how different types of form input elements send their input value to the query string later in this lesson.

For example, if you have a form with some inputs defined as:

<form action="contact.php" method="get">
  <div><label for="name">Name:
    <input type="text" id="name" name="userName">
  </label></div>    
  <div><label for="email">Email:
    <input type="email" id="email" name="userEmail">
  </label></div>
  <div><label for="phone">Phone:
    <input type="text" id="phone" name="userPhone">
  </label></div>
  <div><input type="submit"></div>  
</form>

When the user submits the form using the SUBMIT button after entering "Foo Bar" as the name, "me@me.com" as the email, and "555-5555" as the phone number, the query string will be built as:

userName=Foo+Bar&userEmail=me%40me.com&userPhone=555-5555

Notice that

  • The user input "Foo Bar" contains a space, which is encoded as a + when it's added to the query string.
  • The @ symbol in the email address is also encoded, as %40.
  • Each key-value pair is formatted as "key=value" when it's added to the query string.

Other special characters in the URL might also be encoded, but you don't have to worry about it at all: when you send and then later receive the values in the query string, the encoding and decoding is automatic. The characters that are encoded are generally ones that have special meaning in URLs (such as the @ symbol) or that make the URL invalid (such as a space).
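The encoding and decoding described above can be sketched in a few lines. This is only an illustration, assuming Python and its standard urllib.parse module rather than the PHP scripts this lesson targets; the variable names are made up for the example, and the input values are the ones from the sample form.

```python
from urllib.parse import urlencode, parse_qs

# Form inputs keyed by each element's name="" attribute
# (values taken from the sample form above).
form_inputs = {
    "userName": "Foo Bar",
    "userEmail": "me@me.com",
    "userPhone": "555-5555",
}

# Encode: key=value pairs joined by &, with special characters escaped.
query_string = urlencode(form_inputs)
print(query_string)

# Decode: the step a server-side environment normally performs for you.
decoded = parse_qs(query_string)
print(decoded["userName"][0])
print(decoded["userEmail"][0])
```

The encoded string matches the query string shown earlier, and decoding it gives back the original "Foo Bar" and "me@me.com" values.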

GET vs POST Requests

When your form's method attribute is set as method="get" (or if the form has no method attribute at all), it uses the GET method to send the request. In this case, the data is sent as key-value pairs in the URL of the request. If your form's method is set as method="post", the request is sent using the POST method. This means that the data is sent as encoded key-value pairs in the HTTP Request body.

What's the difference between GET and POST?

The GET method sends your query string in the URL. A ? (question mark) separates the query string from the request URL. For example, using the form and sample inputs in the earlier example, which is processed by the file contact.php, the GET request would appear as:

http://server.name.com/whatever/contact.php?userName=Foo+Bar&userEmail=me%40me.com&userPhone=555-5555

The POST method doesn't use the URL to send the query string in the request. It sends the query string in the body of the request.
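To make the difference concrete, here is a minimal sketch that builds (but never sends) one GET request and one POST request for the same form data. It assumes Python's standard urllib.request module purely for illustration; the variable names are hypothetical, and the URL is the sample one used above.

```python
from urllib.parse import urlencode
from urllib.request import Request

base_url = "http://server.name.com/whatever/contact.php"
query_string = urlencode({
    "userName": "Foo Bar",
    "userEmail": "me@me.com",
    "userPhone": "555-5555",
})

# GET: the query string is appended to the URL after a ?, and there is no body.
get_request = Request(base_url + "?" + query_string)

# POST: the URL stays as-is and the query string travels in the body (as bytes).
post_request = Request(base_url, data=query_string.encode("ascii"))

# Nothing is sent over the network; we only inspect the request objects.
for request in (get_request, post_request):
    print(request.get_method(), request.full_url, request.data)
```

The GET request carries everything in its URL and has no body, while the POST request keeps the plain contact.php URL and carries the same query string as its body.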

A request (and a response, for that matter) consists of a header and a body. The header contains information (stored in HTTP headers) about the request, such as the type of content the client is willing to accept in the response (and in what encoding schemes and languages), the size and type of the data being sent in the request, where the request came from, which browser sent it, and lots of other information.

The request body contains the data that was sent along in the request. It can hold more than a query string. For example, if you were uploading a file, the body of the request would contain the file you were uploading.
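The header/body split can be seen by writing out a POST request as plain text. The sketch below is a simplified illustration in Python: the header values are assumptions chosen to match the sample form, and a real browser sends many more headers than these.

```python
from urllib.parse import parse_qs

body = "userName=Foo+Bar&userEmail=me%40me.com&userPhone=555-5555"

# Header section: the request line followed by HTTP headers (illustrative values).
header_lines = [
    "POST /whatever/contact.php HTTP/1.1",
    "Host: server.name.com",
    "Content-Type: application/x-www-form-urlencoded",
    f"Content-Length: {len(body.encode('ascii'))}",
]

# A blank line (CRLF CRLF) separates the header section from the body.
raw_request = "\r\n".join(header_lines) + "\r\n\r\n" + body

# Receiving side: split at the blank line, then decode the body.
head, _, received_body = raw_request.partition("\r\n\r\n")
request_line, *headers = head.split("\r\n")

print(request_line)
for header in headers:
    print(header)
print(parse_qs(received_body))
```

The Content-Type header tells the receiver that the body is a URL-encoded query string, and Content-Length gives the size of the body in bytes.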

You can use your browser's developer tools to view the request headers and body. Press Ctrl-Shift-J to open the developer tools, go to the Network tab, and then refresh the page you want to inspect. You can view the request headers by selecting the request:

Figure: Select the request in the list on the left and you can view the Request Headers on the right.

For a request that has a query string, either in the URL as a GET request or in the request body in a POST request, you will see a "Payload" tab beside the "Headers" tab. You can view the query string in its raw form or formatted as it might look in the form.

Figure: Use the Payload tab to view the query string.

Figure: Click "view source" to see the query string in its raw, encoded form.

Both GET and POST send the query string as encoded plain text, but there are still some significant differences when it comes to choosing between GET and POST.

GET vs. POST Request Methods

GET:
  • Data is sent as key-value pairs as part of the URL, therefore
    • data is visible to a user
    • data is visible when page is bookmarked or in the browser history
    • data can be cached
    • it's extremely easy to hack
  • Data can only be plain text
  • Data is limited to the size limit of a URL (generally 2,048 characters)
  • No problems reloading the URL or using the browser's BACK button to go back to the URL

POST:
  • Data is sent inside the HTTP request body, therefore
    • data is not visible to a user
    • data is not visible when page is bookmarked nor in the browser history
    • data can't be cached
    • it's more difficult to hack (but still not difficult)
  • Data can be plain text, objects, binary data, etc.
  • There is no limit to the amount of data you can send
  • When browser reloads/revisits the page, the form data must be resubmitted

So you can see that using a GET request makes the form data visible to anyone. It appears not only in the browser window, but also in the bookmark if the requested page is bookmarked. Also, a request passes through many intermediate systems on its way from the client to the server, and some of them, such as proxies and the web server itself, may log or cache the requested URL. Anyone with access to those logs or caches can easily see the query string in the URL. This means any data sent with a GET request has the potential to be seen by a lot of different people!

A POST request sends the data within the request body, so it's not as easily visible - it doesn't appear in the URL, so it won't appear in bookmarks or in a cache or log of requested URLs. This doesn't mean your data is safe, though: you would have to encrypt sensitive data and send it over Secure HTTP (HTTPS) for it to be safe. You'll learn those things in a course on network security.

Lastly, when the user refreshes a page, the browser requests that page again. This means if your request is to process some form data, that form data will be sent to the server again and the page that's processing the data will execute again. This might not be an issue for a lot of applications, but what if your form is asking for data to add a new record to a database? If your processing script inserts that record into a database or adds it to a file, refreshing or reloading that page will cause the script to execute again, and it will add the new record to the database/file a second time, and every time the page is refreshed. This causes redundant records and might even cause data integrity exceptions on a database server.
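The effect of a repeated submission can be simulated without a browser. The sketch below is a rough illustration in Python with an in-memory SQLite database; the process_form function, the contacts table, and its columns are all hypothetical stand-ins for whatever your real processing script and database look like.

```python
import sqlite3


def process_form(connection, form_data):
    """Hypothetical processing script: inserts one record per request."""
    connection.execute(
        "INSERT INTO contacts (name, email) VALUES (?, ?)",
        (form_data["userName"], form_data["userEmail"]),
    )


form_data = {"userName": "Foo Bar", "userEmail": "me@me.com"}

# Case 1: no uniqueness rule, so a refresh silently adds a redundant record.
plain_db = sqlite3.connect(":memory:")
plain_db.execute("CREATE TABLE contacts (name TEXT, email TEXT)")
process_form(plain_db, form_data)  # original submission
process_form(plain_db, form_data)  # page refreshed: same data sent again
row_count = plain_db.execute("SELECT COUNT(*) FROM contacts").fetchone()[0]
print("rows after refresh:", row_count)

# Case 2: email must be unique, so the refresh raises an integrity error.
strict_db = sqlite3.connect(":memory:")
strict_db.execute("CREATE TABLE contacts (name TEXT, email TEXT UNIQUE)")
process_form(strict_db, form_data)
try:
    process_form(strict_db, form_data)
    refresh_failed = False
except sqlite3.IntegrityError as error:
    refresh_failed = True
    print("refresh rejected:", error)
```

In the first case the table ends up with two identical rows; in the second, the database refuses the duplicate and the script has to handle the resulting error.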

With a GET request, your browser will allow the user to refresh a page without any warning, but with a POST request, the browser will warn the user that they're about to re-submit the form and cause the form inputs to be processed again by the program. The user has the option of confirming form re-submission, so you can't prevent it, but at least they will get a warning and have the opportunity to say no.

With these things in mind, here are the standard industry guidelines to use when trying to decide if you should use GET or POST:

  • Use POST when:
    • Transmitting Sensitive Data: since POST doesn't send the data as part of the URL, you should use it for sensitive data like logins, personal data, payment information, etc.
    • Writing to a File/Database: if you use GET, the user can reload/refresh the page, which causes whatever method you're calling to execute again with the same data. Using POST requires a resubmission of the data, so the user is less likely to accidentally perform a file/database write with the exact same data multiple times.
    • Transmitting Large Amounts of Data or Mixed Types of Data: POST has no restrictions on the amount and type of data you want to send, whereas GET can only handle 2k-4k characters and those characters can only be ASCII characters.
  • If none of the above conditions apply, use GET.
    • GET is generally faster than POST, in part because GET requests can be cached.
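The guidelines above amount to a short decision rule, which can be sketched as a function. This is only an illustration in Python: choose_form_method and its parameters are hypothetical, and the 2,048-character limit is the conservative figure used in this lesson (real limits vary by browser and server).

```python
URL_CHARACTER_LIMIT = 2048  # conservative figure from this lesson; real limits vary


def choose_form_method(sensitive: bool, writes_data: bool,
                       data_length: int, text_only: bool = True) -> str:
    """Return "post" or "get" by applying the guidelines above (hypothetical helper)."""
    too_large = data_length > URL_CHARACTER_LIMIT
    if sensitive or writes_data or too_large or not text_only:
        return "post"
    return "get"


# A search box: nothing sensitive, nothing written, short text.
search_method = choose_form_method(False, False, data_length=20)
# A login form: sensitive data.
login_method = choose_form_method(True, False, data_length=40)
# A form that saves a new record to a database.
save_method = choose_form_method(False, True, data_length=60)
# A file upload: binary data, so not text only.
upload_method = choose_form_method(False, False, data_length=500, text_only=False)

print(search_method, login_method, save_method, upload_method)
```

Only the search box ends up using GET; every other case trips one of the POST conditions.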