HTML (Hypertext Markup Language) is a text-based approach to describing how content contained within an HTML file is structured. This markup tells a web browser how to display text, images and other forms of multimedia on a webpage.
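HTML was long published as a formal recommendation by the World Wide Web Consortium (W3C) and is generally adhered to by all major web browsers, including both desktop and mobile web browsers. HTML5 was the last major numbered version of the specification. Since 2019, the specification has been maintained as the HTML Living Standard by the Web Hypertext Application Technology Working Group (WHATWG), with the endorsement of the W3C.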
Basics of an HTML element
Using HTML, a document containing text is further marked up with additional text describing how the document should be displayed. To keep the markup separate from the actual content of the HTML file, HTML uses a special, distinguishing syntax. These special components are known as HTML tags. The tags can contain name-value pairs known as attributes, and a piece of content together with the tags that enclose it is referred to as an HTML element.
An HTML element typically has an opening tag, content in the middle and a closing tag. A few elements, such as the line break <br> and the image <img>, are void elements that have no content and no closing tag. Attributes can provide additional information about the element and are included in the opening tag. Elements can be described in one of two ways:
- Block-level elements start on a new line in the document and take up their own space. Examples of these elements include headings and paragraph tags.
- Inline elements do not start on a new line in the document and only take up necessary space. These elements usually format the contents of block-level elements. Examples of inline elements include hyperlinks and text format tags.
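As a minimal sketch of these ideas, the snippet below shows a block-level paragraph that contains two inline elements, followed by a second block-level element. The URL, the lang attribute value and the text are placeholders chosen for illustration.

```html
<!-- Opening tag with an attribute, then content, then the closing tag -->
<p lang="en">
  This paragraph is a block-level element and starts on a new line.
  <!-- Inline elements sit inside the paragraph's flow of text -->
  <a href="https://example.com/">This hyperlink</a> and
  <em>this emphasized text</em> are inline elements.
</p>
<!-- A second block-level element begins on its own line -->
<h2>A heading after the paragraph</h2>
```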
Commonly used HTML tags
The role of HTML is to inform a web browser about how the content contained within an HTML file is structured. Commonly used HTML tags include:
- <h1> which describes a top-level heading.
- <h2> which describes a second-level heading.
- <p> which describes a paragraph.
- <table> which describes tabular data.
- <ol> which describes an ordered list of information.
As you can see from this very short list, HTML tags primarily dictate the structural elements of a page.
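The tags listed above might be combined as in the following sketch. The headings, figures and list items are invented placeholder content, not data from any real report.

```html
<h1>Quarterly report</h1>
<h2>Summary</h2>
<p>Sales rose in every region.</p>

<!-- Tabular data: one header row followed by two data rows -->
<table>
  <tr><th>Region</th><th>Sales</th></tr>
  <tr><td>North</td><td>120</td></tr>
  <tr><td>South</td><td>95</td></tr>
</table>

<!-- Ordered list: the browser numbers the items automatically -->
<ol>
  <li>Review the figures.</li>
  <li>Publish the report.</li>
</ol>
```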
How to use and implement HTML
Because HTML is completely text-based, an HTML file can be edited simply by opening it up in a program such as Notepad++, Vi or Emacs. Any text editor can be used to create or edit an HTML file and, so long as the file is created with an .html extension, any web browser, such as Chrome or Firefox, will be capable of displaying the file as a webpage.
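For instance, a complete page could be as small as the sketch below. Saving it under a hypothetical name such as index.html and opening that file in a browser should display the heading and the paragraph. The title and text are placeholders.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <!-- Declares the character encoding of the file -->
    <meta charset="utf-8">
    <!-- Shown in the browser tab, not in the page body -->
    <title>My first page</title>
  </head>
  <body>
    <h1>Hello, world</h1>
    <p>This page was written in a plain text editor.</p>
  </body>
</html>
```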
For professional software developers, there are a variety of WYSIWYG editors to develop webpages. NetBeans, IntelliJ, Eclipse and Microsoft's Visual Studio provide WYSIWYG editors as either plug-ins or as standard components, which makes it easier to use and implement HTML.
These WYSIWYG editors also provide HTML troubleshooting facilities, although modern web browsers often contain web developer plug-ins that will highlight problems with HTML pages, such as a missing end tag or syntax that does not create well-formed HTML.
Chrome and Firefox both include HTML developer tools that allow for the immediate viewing of a webpage's complete HTML file, along with the ability to edit HTML on the fly and immediately incorporate changes within the browser.
Separating information about how a page is structured, which is the role of HTML, from the information about how a webpage looks when it is rendered in a browser is a software development pattern and best practice known as separation of concerns.
New features of HTML5
In the early days of the world wide web, marking up text-based documents using HTML syntax was more than sufficient to facilitate the sharing of academic documents and technical memos. However, as the internet expanded beyond the walls of academia and into the homes of the general population, greater demand was placed on webpages in terms of formatting and interactivity.
HTML 4.01 was released in 1999, at a time when the internet was not yet a household name, and HTML5 was not standardized until 2014. During this time, HTML markup drifted from the job of simply describing the structure of the content on a webpage into the role of also describing how content should look when a webpage displays it.
As a result, HTML4-based webpages often included information within a tag about what font to use when displaying text, what color should be used for the background and how content should be aligned. Describing within an HTML tag how an HTML element should be formatted when rendered on a webpage is considered an HTML antipattern. HTML should describe how content is structured, not how it will be styled and rendered within a browser.
The separation of concerns pattern is more rigorously enforced in HTML5 than it was in HTML4. With HTML5, purely presentational elements such as <font> and <center> are obsolete, and the <b> and <i> tags have been redefined to convey meaning rather than simply bold or italic styling. For the paragraph tag, the align attribute has been completely removed from the HTML specification.
For the purpose of backward-compatibility, web browsers will continue to support these deprecated HTML tags, but the changes to the HTML specification do demonstrate the desire of the community for HTML to return to its original purpose of describing the structure of content, while encouraging developers to use cascading style sheets for formatting purposes.
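To illustrate the difference, the sketch below writes the same paragraph twice: first with presentation mixed into the markup, in the HTML4-era manner described above, and then with a class that is styled by a cascading style sheet. The class name notice and the text are placeholders.

```html
<!-- HTML4-era style: font, color and alignment set inside the tags -->
<p align="center"><font face="Arial" color="red">Sale ends Friday</font></p>

<!-- Preferred: the markup describes structure, the CSS describes appearance.
     In a full page this rule would normally live in the <head>
     or in an external stylesheet. -->
<style>
  .notice {
    text-align: center;
    font-family: Arial, sans-serif;
    color: red;
  }
</style>
<p class="notice">Sale ends Friday</p>
```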
Another important feature of HTML5 when compared to HTML4 is the support of audio and video embedding. Instead of using plugins, multimedia can be placed within the HTML code using an <audio> or <video> tag. Additionally, there is built-in support for scalable vector graphics and MathML for mathematical and scientific formulas.
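A minimal sketch of these features follows. The media file names clip.mp4 and theme.ogg are hypothetical, and the text inside the <video> and <audio> elements is fallback content for browsers that do not support them.

```html
<!-- The controls attribute asks the browser to show its own playback controls -->
<video src="clip.mp4" controls width="640">
  Your browser does not support the video element.
</video>

<audio src="theme.ogg" controls>
  Your browser does not support the audio element.
</audio>

<!-- Inline SVG: a filled circle drawn without any image file -->
<svg width="100" height="100" viewBox="0 0 100 100">
  <circle cx="50" cy="50" r="40" fill="steelblue" />
</svg>

<!-- Inline MathML: x squared -->
<math>
  <msup><mi>x</mi><mn>2</mn></msup>
</math>
```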
Editing HTML example
In the following HTML example, there are two HTML elements. Both elements use the same paragraph tag, designated with the letter p, and both use the directional attribute dir, although a different value is assigned to the HTML attribute's name-value pairing, namely rtl and ltr.
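A minimal snippet matching that description might look like the following. The paragraph text is placeholder content chosen for illustration.

```html
<!-- Same tag, same attribute name, different attribute values -->
<p dir="rtl">This paragraph is laid out from right to left.</p>
<p dir="ltr">This paragraph is laid out from left to right.</p>
```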
Notice that when this HTML snippet is rendered in a browser, the HTML tags impact how each HTML element is displayed on the page, but none of the HTML tags or attributes are displayed. HTML simply describes how to render the content. The HTML itself is never displayed to the end user.
HTML syntax standards
For a web browser to display an HTML page predictably, it should be provided with well-formed HTML. Browsers try to recover from malformed markup, but the result may not match the author's intent. To be well-formed, the content of each HTML element must be contained within an opening tag -- <p> -- and a closing tag -- </p>, with the exception of void elements such as <br>, which have no closing tag. Furthermore, any new tag opened within another tag must be closed before the containing tag is closed. So <p><em>well-formed HTML</em></p> is well-formed HTML, while <p><em>well-formed HTML</p></em> is not well-formed HTML.
Another syntax rule is that HTML attribute values should be enclosed within single or double quotes. There is often debate about which format is technically correct, but the World Wide Web Consortium asserts that both approaches are acceptable.
The best advice for choosing between single and double quotes is to keep the usage consistent across all the documents. HTML style-checkers can be used to enforce consistent use across pages. Sometimes a single quote is required, such as when an attribute's value contains a double quote character, and the reverse is true as well.
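The nesting rule can be sketched with a paragraph that contains emphasized text. The choice of tags and the sentences are illustrative only, and the second line is deliberately incorrect.

```html
<!-- Properly nested: <em> is opened and closed inside <p> -->
<p>This sentence is <em>well-formed</em> HTML.</p>

<!-- Improperly nested: <p> is closed while <em> is still open -->
<p>This sentence is <em>not well-formed</p></em>
```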
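The following sketch shows both quoting styles, along with a character reference as an alternative when a value contains the same quote character that delimits it. The title attribute and its values are placeholders.

```html
<!-- Double quotes around a value that contains an apostrophe -->
<p title="The author's note">First paragraph.</p>

<!-- Single quotes around a value that contains double quotes -->
<p title='She said "hello"'>Second paragraph.</p>

<!-- Alternative: keep double quotes and escape the inner ones with &quot; -->
<p title="She said &quot;hello&quot;">Third paragraph.</p>
```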
Pros and cons of HTML
Pros to using HTML include:
- Is widely adopted, with a large number of resources available.
- Is supported natively by every browser.
- Is relatively easy to learn.
- Has a clean and consistent syntax.
- Is an open standard and free to use.
- Can be integrated with backend programming languages such as PHP.
A few cons to consider are:
- Offers little dynamic functionality on its own and is mainly used for static webpages.
- Each component must be written out separately, even when several components use similar elements.
- Browser behavior can be unpredictable. For example, older browsers may not be compatible with newer features.