This lesson introduces you to HTML: What it is, why we use it,
and how we use it. We'll look at where HTML came from and why
we use it to structure a document and its content. We'll examine
the basic syntax of HTML and learn about elements
and attributes. Lastly, we'll discuss the HTML
standard: where does it come from, who defines it, and how are
HTML elements categorized?
SGML uses a DTD (Document Type Definition) to define a markup language.
For example, the SGML DTDs for older versions of HTML (up to HTML 4.01)
state that the elements of HTML are defined as tags that have names,
and that those tags are identified by the tag name inside angle brackets.
So an element with the name "span" is represented by the tag <span>.
(HTML5 and later are no longer defined by an SGML DTD, but the syntax carries over.)
The DTD also states that any HTML element that can contain text or other HTML
elements denotes the boundaries of its content with an opening tag and a closing
tag. For example, if you wanted to code the "span" element and give it the text
"foo", it must be coded as:
<span>foo</span>
The DTD for HTML also indicates that the closing tag for an element must
include the forward-slash in front of the element's name, like you see in
the example.
I'm simplifying things a bit: in reality, an SGML DTD is a syntax
all its own. Thankfully, we don't need to learn SGML in order to learn HTML!
Learning HTML is just a matter of learning the different elements and what
they do. To be honest, it's a lot of memorization: there are more than a hundred
elements in HTML. But learning when, where, and how to use each element is going to
be the bigger challenge.
You might have also heard of XML (Extensible Markup Language).
XML is also derived from SGML and is used to
structure data. In fact, there is a version of HTML that is based
on XML, called XHTML, but it's not as flexible as HTML and still
endures a lot of criticism today.
XML has stricter rules, but it's used in a
variety of applications that need to share data over networks.
XML is also used as a basis for more specific languages, such as
XHTML, FXML (used to create user interfaces in JavaFX), and
EPUB (used for e-book files that can be read on smart devices
and e-readers). You might learn XML in later courses, but it's actually
very simple. For example, here's a bit of XML that defines some of my cats:
HTML looks very similar to the XML you see above, so you'll see quickly
why they are related through SGML.
HTML is written in plain text, so you can use any editor,
even if it's just Notepad. However, an editor that has
syntax highlighting is helpful and makes it
easier to read and edit your code.
HTML Syntax
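HTML is made up of elements, attributes, and comments.
By convention, element and attribute names are written in all lower-case letters.
HTML itself doesn't care about case in these names (browsers treat <P> and <p>
the same), but lower-case is the widely followed convention and is required
in XHTML, so avoid upper-case letters in elements and attributes.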
Elements
HTML consists of a set of pre-defined elements or tags that represent the
structure of your web page. Many of these elements can contain
text, which makes up the content of your web page.
For example, a paragraph of text can be contained inside
a <p> element like so:
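A minimal sketch of such a paragraph (the sentence inside it is made up for illustration):

```html
<!-- The opening <p> tag and the closing </p> tag mark where the paragraph starts and ends -->
<p>This is my first paragraph of text.</p>
```

A browser shows only the text between the two tags; the tags themselves are not displayed.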
There are elements you will use for paragraphs, document headers
and footers, articles, sidebars, lists, tables, forms, and many
other standard parts of a document.
Elements don't contain styling properties such as colours,
backgrounds, borders, fonts, and layout. These things are configured
with CSS (Cascading Style Sheets), which you'll learn in a
different set of tutorials.
Most elements have an opening tag and a closing tag. For example,
the paragraph tag <p></p> encloses a paragraph of text and/or other
HTML content. A few elements only have an opening tag, such as
the line-break tag <br>, which adds a line break or new-line into a document.
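As an illustrative sketch (the address is invented for this example), here is a paragraph that uses both kinds of tags: the <p> element has an opening and a closing tag, while each <br> stands alone.

```html
<!-- <p> needs both an opening and a closing tag -->
<p>
  123 Example Street<br> <!-- <br> has no closing tag; it just ends the current line -->
  Springfield<br>
  Canada
</p>
```

In a browser this should display as three short lines inside a single paragraph.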
How do you learn all the different tags? Practice!!
Practice!!! Practice!!!
Attributes
Attributes appear inside an element's opening tag. Attributes
add extra information to the element.
Attributes are generally assigned a value. When using an attribute
that has a value, always place the value in single- or double-quotes.
For example,
<a href="https://terminallearning.com">Visit TerminalLearning for more tutorials</a>
This example uses the <a> element, or the anchor element. The
anchor element is also known as the "link element" because it is
used to add a hyperlink (or clickable link) in a document.
But be careful,
because there is also a <link> element that has nothing to do
with creating hyperlinks on a web page (you'll learn about the
<link> element when you learn about CSS files).
The content in between the anchor element's open and closing
tags is the link text: the text that appears as the clickable
link on the page.
This <a> element's opening tag
has an attribute called href. The <a> element's
href attribute defines the URL/page/file you want the user to
go to when they click the link text. The user will see the
text "Visit TerminalLearning for more tutorials" on the page
as a clickable link. When the user clicks that text, their
browser will load the URL https://terminallearning.com.
You could also use
single quotes around the href attribute's value and write it as
<a href='https://terminallearning.com'>Visit TerminalLearning for more tutorials</a>
Right now it doesn't matter if you use single-quotes or double-quotes,
but you should choose one and be consistent. This will matter
later if you start coding with JavaScript and PHP.
HTML defines many different attributes. Many are Global Attributes,
which can be added to any HTML element/tag. Examples include the accesskey
attribute, which identifies a special shortcut key a user can press to go to
(or focus on) that element, and the id attribute, which assigns
a unique identifier to an element (these are often needed
for CSS and/or JavaScript).
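A short sketch of these two global attributes in use. The id value "page-title", the heading text, and the shortcut key "h" are hypothetical choices for this example:

```html
<!-- id gives this heading a unique identifier that CSS or JavaScript can refer to -->
<h1 id="page-title">Introduction to HTML</h1>

<!-- accesskey suggests a keyboard shortcut (here the "h" key) that focuses or activates this link -->
<a href="https://terminallearning.com" accesskey="h">Home</a>
```

Because both attributes are global, they could be moved to other elements and still be valid. How the shortcut key is actually triggered depends on the browser and operating system.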
Other attributes are specific to certain elements. For example,
the href
attribute is used for <a> and <link> elements to
specify the URL or path to a link or file, and the src
attribute is used in <img> elements to specify the path to
an image file.
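As a rough sketch of these element-specific attributes, assuming a hypothetical stylesheet at css/main.css and a hypothetical image at images/cat.jpg (the rel and alt attributes are additions not discussed above, included because these elements normally need them):

```html
<!-- href on <link> is the path to a file the page uses, here a stylesheet -->
<link rel="stylesheet" href="css/main.css">

<!-- src on <img> is the path to the image file; alt describes the image in words -->
<img src="images/cat.jpg" alt="A grey cat sleeping on a sofa">
```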
As you learn various HTML elements, you'll also learn the different
attributes available. Especially if you practice!!
Comments
Comments are special bits of code that are ignored by compilers,
parsers, browsers, etc. If you have learned a programming language,
you've likely learned how to add comments to your program code.
We do the same thing in HTML, although the reasons for adding
comments and documentation are different.
The purpose of a comment in programming languages
is to describe WHY you're doing something,
especially when you're commenting a complex piece of program code
or algorithm.
We also use comments in HTML to label the different parts
of our page structure and to explain why we're using certain
elements/attributes for certain things.
Comments are very important, but are unfortunately under-taught
and not emphasized enough. If you ever hear
other developers say things like "No one comments in industry"
or "You shouldn't comment because then only you can understand
your code so they can't fire you", know that those things are
NOT TRUE! In fact, those things were likely said by someone
who mistakenly thinks that comments are there to describe WHAT the
code is doing, and not WHY it was written a certain way.
It's absolutely
vital that you comment your code. If you don't comment your
code, or you comment it badly, that will make you look unprofessional
and could even get you demoted or fired.
Comments help other developers understand WHY you chose to use
a certain technique, logic, or structure. The purpose of comments
is not necessarily to describe WHAT you're doing: any coder
can read your code and tell what you're doing! The idea that
not commenting can save your job is just B.S. When someone
is trying to understand your code, it's much easier when that
code is commented and documented well, describing why the
code is written the way it is and what you were trying to
accomplish. We will focus a lot on good documentation in this
course.
To add a comment or to document your HTML code, you
use the <!-- symbols to open the comment and then the
--> symbols to close the comment. Your comment text
goes in between. For example:
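A sketch of what this can look like (the page content and the wording of the comments are invented for illustration):

```html
<!-- page header: site title shown at the top of the page -->
<header>
  <h1>My Cat Page</h1>
</header>

<!-- main content: kept in its own element so it's separate from the header and footer -->
<main>
  <p>Welcome to my page about cats.</p>
</main>

<!-- page footer: will hold copyright and contact information -->
<footer>
  <!-- TODO: add the copyright notice and a contact link -->
</footer>
```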
In the example, you can see that the main sections of the
document have been documented to describe their purpose.
You'll also notice a special TODO
comment inside the <footer> element. A TODO is one of
a few special key words used in documentation to identify
tasks that still need to be performed. You can use these
in Java, too!
The W3C was founded in 1994 by Tim Berners-Lee after he
left CERN. The objective of the W3C was to develop standards
for the World Wide Web (WWW). They remained the sole community
for web standards until 2004.
In 2004, several of the web's main players (including Apple, Mozilla,
and Opera) were at a conference
discussing their concerns about where the W3C was going in terms
of HTML (this was around the days of XHTML, which many developers felt
was becoming an ungainly monstrosity). These folks decided to
form their own group, the WHATWG (Web Hypertext Application Technology
Working Group), and developed their own standards for
HTML based on the pre-XHTML standards. WHATWG has maintained its
own HTML standard ever since.
The Drama Between W3C and WHATWG
Eventually, W3C abandoned the XHTML standard and decided to
focus on maintaining the new HTML standards, which were also
maintained by WHATWG at the same time. The first W3C working draft
of HTML5 was published in 2008, and HTML5 was accepted as the current
standard (a W3C Recommendation) in 2014. Both W3C and WHATWG
had similar standards for HTML5 with some minor differences,
although the number of differences grew over the years, and
in some cases, the two standards completely contradicted each
other. This became very confusing and problematic for
developers.
Additionally, WHATWG preferred a Living
Standard for HTML: No version numbers are assigned
because the standards documents grow as the language grows and expands.
When items are changed or new things are added, the documentation
is updated as necessary.
W3C preferred retiring standards: a version number is assigned and
the documentation for that version is static. Changes and new
items are added to a draft document of the next version, but this
draft is not yet the accepted standard. When there are enough new
additions and changes to the language, the draft of the new standard
is then published as the current standard with a new version
number. Then the old standard is retired.
There are advantages and disadvantages to both approaches: for example,
browsers need time to update to accept new standards so sometimes
a living standard that changes quickly can be difficult to keep
up with. But maintaining a living standard for a language
that changes quickly, as is the case with HTML, is much easier.
A living standard also allows developers to start implementing
changes right away because they know the browsers will also
be ensuring their software supports the new changes as they
are added to the living standard. With a static standard, the
changes that browsers and developers make are less gradual and
generally happen only when a current standard is published.
This means that sometimes developers have to spend large amounts
of time implementing many changes at once, rather than being
able to implement them gradually over time.
In May 2019, W3C and WHATWG signed an agreement that they would
collaborate on a single HTML Living Standard and DOM Specification
so that developers could be assured that there was one set of
standards for HTML and DOM specifications (we cover
DOM in a future lesson). They agreed
that both the HTML and DOM standards would be maintained by
WHATWG and W3C would no longer publish their own specifications
for HTML and DOM (but they continue to maintain other specifications).
If you're interested in the finer details of the agreement
you can read the announcement published on May 28 2019 in the W3C blog.
The HTML Living Standard
The HTML Living Standard can be found at
WHATWG: HTML Living Standard. This document outlines
the standard structure of HTML documents, the syntax for
elements and attributes, and describes the intended use for
all the HTML elements and attributes. There are several other
sections of the specification that are beyond the scope of
this particular set of tutorials, but some of those will appear
in other courses! Feel free to explore the document as you
become more comfortable with HTML.
The HTML elements are organized into categories, and you can see
these in Section 4 of the Living Standard.
The categories of elements we're covering in this course are:
| Section # | Category | Description | Examples of Elements |
|---|---|---|---|
| 4.1, 4.2 | Document Structure and Meta Data | Elements that define the main structure of the page and the meta data for the document. Note that we already covered these elements in the Minimal HTML lesson. | HTML, HEAD, TITLE, META |
| 4.3 | Document Sectioning | Elements that logically structure the various sections of the page/document. These are semantic elements that tell you something about that section of the page. | BODY, SECTION, NAV, HEADER, FOOTER, H1, H2, etc. |
| 4.4 | Content Grouping | Elements that organize the actual content of the page. These elements contain actual content (e.g. text, figures). These are block elements, and the element name helps define the type of content it contains, in some cases. | P, BLOCKQUOTE, PRE, MAIN, DIV, FIGURE, MENU, elements for lists |
| 4.5 | Text-Level Semantics | In-line elements that help define the style or meaning of a piece of in-line text. | EM, STRONG, CODE, SPAN, BR |
| 4.6 | Links | Elements used to create hypertext links in a document. | A, AREA |
| 4.8 | Embedded Content | Elements that contain media such as images, audio, and video. | IMG, PICTURE, EMBED, AUDIO, VIDEO |
| 4.9 | Tables | The various elements that make up a table of data with rows and columns. | TABLE, TR, TD, TH |
| 4.10 | Forms | The various elements that make up a form that can be used to collect inputs from users. | FORM, INPUT, BUTTON, FIELDSET, LEGEND |
| 4.11 | Interactive Elements | Elements used to create interactive components such as a details/disclosure box or a dialog box. | DETAILS, SUMMARY, DIALOG |
The next set of tutorials covers each of the above sections, although
some sets of elements require their own tutorials!