📖 Deeper dive reading:
- MDN HTML
- W3C specification - This official specification is only for reference
HyperText Markup Language (HTML) provides the foundational content structure that all web applications build on. HTML was originally designed to be a publishing format for web documents, or pages. From that original definition web programmers have morphed the web page concept into a web application where a page now represents either a single page application (SPA) or a large group of hyperlinked pages that form a multi-page application (MPA). By itself HTML is amazing, but to create a full web application we will need other technologies to style (CSS) our pages and make them interactive (JavaScript). For now, we will focus on creating the content structure with HTML.
Here is an example of a simple HTML document.
```html
Hello world
```
The first thing you will notice is that this looks like a simple text document. That is because text is valid HTML. In order to provide structure to our text we need to introduce the concept of elements and their associated tag representation.
HTML elements are represented with enclosing tags that may enclose other elements or text. For example, the paragraph element, and its associated tag (p), designate that the text is a structural paragraph of text. When we talk about tags we are referring to a delimited textual name that we use to designate the start and end of an HTML element as it appears in an HTML document. Tags are delimited with the less than (<) and greater than (>) symbols. A closing tag will also have a forward slash (/) before its name.
```html
<p>Hello world</p>
```
We can continue adding structure to the page with additional elements. Each of these elements may contain other elements that provide the structure of our web page. The html element represents the top level page structure. The head element contains metadata about the page and the page title. The body element represents the content structure. The main element represents the main content structure, as opposed to things like headers, footers, asides, and navigation content. These additional elements result in the following HTML page.
```html
<html>
  <head>
    <title>My First Page</title>
  </head>
  <body>
    <main>
      <p>Hello world</p>
    </main>
  </body>
</html>
```
However, when we render the HTML in a browser it would look exactly the same as our original simple text example. The reason for that is that HTML is almost completely about structure. The visual appearance of the web page won't really change until we start styling the page with CSS.
Every HTML element may have attributes. Attributes describe the specific details of the element. For example, the id attribute gives a unique ID to the element so that you can distinguish it from other elements. The class attribute is another common element attribute that designates the element as being classified into a named group of elements. Attributes are written inside the element tag with a name followed by an optional value. You can use either single quotes (') or double quotes (") to delimit attribute values.
```html
<p id="hello" class="greeting">Hello world</p>
```
One of the core features that made the web so successful was the ability to create hyperlinks that take you from one page to another with a simple click. A hyperlink in HTML is represented with an anchor (a) element that has an attribute containing the address of the hyperlink reference (href). A hyperlink to BYU's home page looks like this:
```html
<a href="https://byu.edu">Go to the Y</a>
```
HTML defines a header (`<!DOCTYPE html>`) that tells the browser the type and version of the document. You should always include this at the top of the HTML file. We can now add the header, some attributes, and more content to our document for a full example.
```html
<!DOCTYPE html>
<html lang="en">
  <body>
    <main>
      <h1>Hello world</h1>
      <p class="introduction">
        HTML welcomes you to the amazing world of
        <span class="topic">web programming</span>.
      </p>
      <p class="question">What will this mean to you?</p>
      <p class="assignment">Learn more <a href="instruction.html">here</a>.</p>
    </main>
  </body>
</html>
```
Notice that the rendered document has almost no styling. That is because the entire purpose of HTML is to provide content and structure. The layout of the content is left almost entirely up to Cascading Style Sheets (CSS). When styling was introduced with CSS, all of the HTML elements that defined style, such as font, strike, and plaintext, were deprecated.
Modern HTML contains over 100 different elements. Here is a short list of HTML elements that you will commonly see.
| element | meaning |
|---|---|
| `html` | The page container |
| `head` | Header information |
| `title` | Title of the page |
| `meta` | Metadata for the page such as character set or viewport settings |
| `script` | JavaScript reference. Either an external reference, or inline |
| `link` | External content reference |
| `body` | The entire content body of the page |
| `header` | Header of the main content |
| `footer` | Footer of the main content |
| `nav` | Navigational inputs |
| `main` | Main content of the page |
| `section` | A section of the main content |
| `aside` | Aside content from the main content |
| `div` | A block division of content |
| `span` | An inline span of content |
| `h<1-6>` | Text heading. From h1, the highest level, down to h6, the lowest level |
| `p` | A paragraph of text |
| `b` | Bring attention |
| `table` | Table |
| `tr` | Table row |
| `th` | Table header |
| `td` | Table data |
| `ol`, `ul` | Ordered or unordered list |
| `li` | List item |
| `a` | Anchor the text to a hyperlink |
| `img` | Graphical image reference |
| `dialog` | Interactive component such as a confirmation |
| `form` | A collection of user input |
| `input` | User input field |
| `audio` | Audio content |
| `video` | Video content |
| `svg` | Scalable vector graphic content |
| `iframe` | Inline frame of another HTML page |
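To see how several of these elements fit together, here is a minimal sketch of a page skeleton. The text content and the link targets (`index.html`, `about.html`) are placeholders invented for illustration and are not files from this instruction.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <!-- Metadata: character set and the title shown in the browser tab -->
    <meta charset="UTF-8">
    <title>Element sampler</title>
  </head>
  <body>
    <header>
      <h1>Element sampler</h1>
      <!-- Navigation built from an unordered list of hyperlinks -->
      <nav>
        <ul>
          <li><a href="index.html">Home</a></li>
          <li><a href="about.html">About</a></li>
        </ul>
      </nav>
    </header>
    <main>
      <section>
        <h2>Lists</h2>
        <ol>
          <li>First item</li>
          <li>Second item</li>
        </ol>
      </section>
      <aside>
        <p>Content related to, but separate from, the main content.</p>
      </aside>
    </main>
    <footer>
      <span>Placeholder footer text</span>
    </footer>
  </body>
</html>
```

As with the earlier examples, a browser would render this with only its default styling. The elements describe what each piece of content is, not how it looks.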
You can include comments in your HTML files by starting the comment with `<!--` and ending it with `-->`. Any text within a comment block will be completely ignored when the browser renders it.
```html
<!-- commented text -->
```
HTML uses several reserved characters for defining its file format. If you want to use those characters in your content then you need to escape them using the entity syntax. For example, to display a less than symbol (<) you would instead use the less than entity (`&lt;`). You can also use the entity syntax to represent any Unicode character.
| Character | Entity |
|---|---|
| & | `&amp;` |
| < | `&lt;` |
| > | `&gt;` |
| " | `&quot;` |
| ' | `&apos;` |
| 😀 | `&#128512;` |
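As a small sketch of the entity syntax in use, the paragraph below displays markup as literal text instead of having the browser interpret it. The sentence itself is an invented example.

```html
<!-- Renders as: Use <p> & </p> to mark up a "paragraph" 😀 -->
<p>Use &lt;p&gt; &amp; &lt;/p&gt; to mark up a &quot;paragraph&quot; &#128512;</p>
```

Without the escapes, the browser would treat the inner `<p>` and `</p>` as real tags rather than as text to display.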
Understanding when different HTML features were introduced helps you know what has been around for a long time and is probably supported by all browsers, and what is new and may not work everywhere. HTML is pretty stable, but it is still good to check a website like MDN or canIUse to make sure.
| Year | Version | Features |
|---|---|---|
| 1990 | HTML1 | format tags |
| 1995 | HTML2 | tables, internationalization |
| 1997 | HTML3 | MathML, CSS, frame tags |
| 1999 | HTML4 | external CSS |
| 2014 | HTML5 | email, password, media, and semantic tags |
By default a web server will return the HTML file named index.html when a web browser, such as Google Chrome, makes a request without asking for a specific HTML file. For example, when you ask for https://google.com in your web browser you will actually get back the HTML for the file https://google.com/index.html. For this reason, it is very common to name the main HTML file for your web application index.html.
You can save any HTML file to your computer's disk and then open the file using your browser. You can also open the HTML file in VS Code and use the Live Server extension to display the HTML. Another way to easily play with HTML is to use a sandbox like CodePen. However, when you use CodePen it is not necessary to supply the HTML DocType header or the root html element since CodePen already assumes you are providing HTML. Here is our example HTML document rendered in CodePen.



