HTML (Hypertext Markup Language) is a text-based approach to describing how content contained within an HTML file is structured. This markup tells a web browser how to display the text, images and other forms of multimedia on a webpage.
Commonly used HTML tags
HTML tags tell a web browser how the content of an HTML file is structured. Commonly used HTML tags include <h1>, which describes a top-level heading; <h2>, which describes a second-level heading; <p>, which describes a paragraph; <table>, which describes tabular data; and <ol>, which describes an ordered list of information. Tag names are not case-sensitive in HTML, so <H1> and <h1> are equivalent, but lowercase is the prevailing convention.
As you can see from this very short list, HTML tags primarily dictate the structural elements of a page.
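As a rough illustration, the tags listed above could appear together in a small page like the one below. The headings, figures and list items are invented placeholder content, not taken from any real document.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Common tags (illustrative)</title>
</head>
<body>
  <!-- Top-level heading -->
  <h1>Quarterly report</h1>

  <!-- Second-level heading -->
  <h2>Summary</h2>

  <!-- Paragraph -->
  <p>Sales rose in every region.</p>

  <!-- Tabular data: one header row and one data row -->
  <table>
    <tr><th>Region</th><th>Sales</th></tr>
    <tr><td>North</td><td>120</td></tr>
  </table>

  <!-- Ordered list -->
  <ol>
    <li>Review the figures</li>
    <li>Publish the report</li>
  </ol>
</body>
</html>
```

None of these tags says anything about fonts or colors; each one only states what kind of content it encloses.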
Variations of HTML
In the early days of the World Wide Web, marking up text-based documents using HTML syntax was more than sufficient for sharing academic documents and technical memos. However, as the internet expanded beyond the walls of academia and into the homes of the general population, greater demand was placed on webpages in terms of formatting and interactivity.
HTML 4.01 was released in 1999, at a time when the internet was not yet a household name, and HTML5 was not standardized until 2014. During this time, HTML markup drifted from the job of simply describing the structure of the content on a webpage into the role of also describing how content should look when a webpage displays it.
As a result, HTML4-based webpages often included information within a tag about what font to use when displaying text, what color should be used for the background and how content should be aligned.
Describing within an HTML tag how an HTML element should be formatted when rendered on a webpage is considered an HTML antipattern. HTML should describe how content is structured, not how it will be styled and rendered within a browser.
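The fragment below sketches the kind of presentational markup this antipattern refers to. It uses the HTML 4-era bgcolor and align attributes and the font element; the text and color values are placeholders chosen for illustration, and the fragment is shown as something to avoid rather than to copy.

```html
<!-- Presentational HTML 4-era markup, shown only as an example of what to avoid -->
<body bgcolor="#ffffcc">
  <!-- align and font mix styling decisions into the structure of the page -->
  <p align="center">
    <font face="Arial" color="#333333">Welcome to my homepage</font>
  </p>
</body>
```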
For rendering, the proper practice is to use cascading style sheets (CSS). An HTML file can link to a cascading style sheet, which will contain information about which colors to use, which fonts to use and other HTML element rendering information. Separating information about how a page is structured, which is the role of HTML, from the information about how a webpage looks when it is rendered in a browser, which is the role of a CSS file, is a software development pattern and best practice known as separation of concerns.
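A minimal sketch of this separation might look like the page below. The file name styles.css, the class name welcome and the rules listed in the comment are all assumptions made for the example.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Separation of concerns (illustrative)</title>
  <!-- styles.css is a hypothetical file name. It might contain rules such as:
         body     { background-color: #ffffcc; }
         .welcome { font-family: Arial, sans-serif; color: #333333; text-align: center; }
  -->
  <link rel="stylesheet" href="styles.css">
</head>
<body>
  <!-- The markup states only what the content is; the stylesheet decides how it looks -->
  <p class="welcome">Welcome to my homepage</p>
</body>
</html>
```

With this arrangement, changing the look of every page that links to the stylesheet means editing one CSS file and leaving the HTML untouched.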
HTML4 vs. HTML5
The separation of concerns pattern is more rigorously enforced in HTML5 than it was in HTML4. With HTML5, purely presentational elements such as <font> and <center> are obsolete, and the align attribute on the paragraph tag has been removed from the HTML specification. The bold <b> and italic <i> tags remain, but they have been redefined to carry meaning rather than to prescribe a typeface: <b> marks text to which attention is being drawn, and <i> marks text in an alternate voice, such as a technical term or a foreign phrase.
For the purpose of backward compatibility, web browsers continue to support these obsolete tags and attributes, but the changes to the HTML specification demonstrate the community's desire for HTML to return to its original purpose of describing the structure of content, while encouraging developers to use cascading style sheets for formatting.
HTML tag vs. element vs. attribute
The idea of using text to describe how text should be displayed might sound somewhat paradoxical, but it is not. This is the whole reason why HTML is known as a markup language.
Using HTML, a document containing text is further marked up with additional text describing how the document is structured. To keep the markup separate from the actual content of the HTML file, a special, distinguishing syntax is used: names enclosed in angle brackets, known as HTML tags. An opening tag can contain name-value pairs known as attributes, and the combination of an opening tag, the content it encloses and the closing tag is referred to as an HTML element.
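To make the three terms concrete, the annotated link below labels each part. The URL and link text are placeholders chosen for illustration.

```html
<!--
  Element:   everything from the opening tag through the closing tag
  Tags:      <a ...> (opening) and </a> (closing)
  Attribute: href="https://www.example.com/" (name: href, value: the URL)
  Content:   Visit the example site
-->
<a href="https://www.example.com/">Visit the example site</a>
```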
Editing HTML example
In the following HTML example, there are two HTML elements. Both use the paragraph tag, designated with the letter p, and both set the directional attribute dir, but each assigns it a different value: rtl (right to left) in one and ltr (left to right) in the other.
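A minimal snippet matching this description might look as follows. The paragraph text is a placeholder chosen for illustration.

```html
<!-- dir="rtl": the paragraph is laid out right to left -->
<p dir="rtl">This paragraph is displayed right to left.</p>

<!-- dir="ltr": the paragraph is laid out left to right (the usual default) -->
<p dir="ltr">This paragraph is displayed left to right.</p>
```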
Notice that when this HTML snippet is rendered in a browser, the HTML tags affect how each HTML element is displayed on the page, but none of the HTML tags or attributes are themselves displayed. HTML simply describes the content to the browser; the markup itself is never shown to the end user.
What is well-formed HTML?
For a web browser to display an HTML page predictably, it should be provided with well-formed HTML. Browsers try to recover from malformed markup rather than report an error, but the result may not be what the author intended. To be well-formed, an element's content must be contained within an open tag -- <p> -- and a close tag -- </p>. (A small set of void elements, such as <br> and <img>, have no content and take no close tag.) Furthermore, any new tag opened within another tag must be closed before the containing tag is closed. So <div><p>well-formed HTML</p></div> is well-formed HTML, while <div><p>well-formed HTML</div></p> is not well-formed HTML.
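The nesting rule can be sketched as follows, using a <div> container as an assumed stand-in for any enclosing element. The second fragment is deliberately malformed to show the crossed close tags.

```html
<!-- Well-formed: the inner <p> is closed before the outer <div> -->
<div>
  <p>well-formed HTML</p>
</div>

<!-- Not well-formed: the close tags are crossed, so </div> arrives while <p> is still open -->
<div>
  <p>not well-formed HTML</div>
</p>
```

A browser will usually still render the second fragment, but it has to guess where the paragraph was meant to end.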
HTML syntax standards
Another syntax rule is that HTML attribute values should be enclosed within single or double quotes. There is often debate about which format is technically correct, but the World Wide Web Consortium asserts that both approaches are acceptable. The HTML 4 specification states:
"By default, SGML requires that all attribute values be delimited using either double quotation marks (ASCII decimal 34) or single quotation marks (ASCII decimal 39)."
The best advice for choosing between single and double quotes is to keep the usage consistent across all documents. HTML style-checkers can be used to enforce consistent use across pages. Sometimes single quotes are the more convenient delimiter, such as when an attribute's value itself contains a double quote character. The reverse is true as well.
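The quoting options described above can be sketched as follows. The title values are placeholder text, and the third line shows one further option that is not discussed above: the standard &quot; character reference.

```html
<!-- Double quotes delimit the value; a single quote (apostrophe) may appear inside it -->
<p title="The author's note">Double-quoted attribute value</p>

<!-- Single quotes delimit the value; double quotes may appear inside it -->
<p title='The "official" version'>Single-quoted attribute value</p>

<!-- A character reference can stand in for a double quote inside a double-quoted value -->
<p title="The &quot;official&quot; version">Escaped quotation marks</p>
```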
"Single quote marks can be included within the attribute value when the value is delimited by double quote marks, and vice versa."
How to use and implement HTML
Because HTML is completely text-based, an HTML file can be edited simply by opening it up in a program such as Notepad++, Vi or Emacs. Any text editor can be used to create or edit an HTML file and, so long as the file is created with a .html extension, any web browser, such as Chrome or Firefox, will be capable of displaying the file as a webpage.
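As a quick way to try this, the text below could be typed into any editor and saved under a name such as hello.html (the file name is only an example), then opened in a browser.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Hello</title>
</head>
<body>
  <!-- The only visible content on the page -->
  <p>Hello, world.</p>
</body>
</html>
```

The browser tab should show the title "Hello" and the page should show the single paragraph, with none of the tags visible.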
For professional software developers, a variety of WYSIWYG editors are available for developing webpages. NetBeans, IntelliJ, Eclipse and Microsoft's Visual Studio provide WYSIWYG editors either as plug-ins or as standard components, which simplifies creating and editing HTML.
These WYSIWYG editors also provide HTML troubleshooting facilities, although modern web browsers often contain web developer plug-ins that will highlight problems with HTML pages, such as a missing end tag or syntax that does not create well-formed HTML.
Chrome and Firefox both include HTML developer tools that allow for the immediate viewing of a webpage's complete HTML file, along with the ability to edit HTML on the fly and immediately incorporate changes within the browser.
The HTML standard
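HTML was long published as a formal recommendation by the World Wide Web Consortium (W3C) and is generally adhered to by all major web browsers, including both desktop and mobile web browsers. HTML5 was the last major numbered version of the specification; since 2019, the W3C and the WHATWG have collaborated on a single, continuously updated version, the HTML Living Standard.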