HTML
What HTML actually is (and is not)
HTML is a document markup language, not a programming language. It describes what things are, not how they look or how they behave. Browsers parse HTML into a DOM tree that becomes the foundation for CSS styling, JavaScript behavior, accessibility, and security decisions. If the structure is wrong, everything built on top becomes fragile, so getting the HTML right pays off in every layer above it.
The document model (DOM)
When a browser loads HTML, it incrementally builds a Document Object Model (DOM).
- Nodes represent elements, text, comments, etc.
- Parent–child relationships define structure.
- The DOM is live and mutable via JavaScript.
- CSS selectors and JS APIs operate on this tree.
Poor HTML leads to:
- Unexpected layout behavior
- Accessibility failures
- Hard-to-debug JavaScript issues
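To make the tree concrete, here is a minimal sketch of a page together with a script that reads and mutates its own DOM. The id greeting and the text content are illustrative choices, not part of any standard.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>DOM sketch</title>
</head>
<body>
  <!-- A comment is a node too. -->
  <p id="greeting">Hello, <em>world</em>!</p>
  <!-- The <p> above has three children:
       text "Hello, ", element <em> (containing text "world"), text "!" -->
  <script>
    // Selectors and JS APIs operate on the tree the parser built.
    const em = document.querySelector("#greeting em");
    console.log(em.textContent); // logs "world"

    // The DOM is live: this appends a new element node to <body> at runtime.
    const note = document.createElement("p");
    note.textContent = "Added by JavaScript";
    document.body.appendChild(note);
  </script>
</body>
</html>
```

Opening the page and inspecting it in the browser's developer tools shows the added paragraph in the tree even though it never appears in the source text.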
Required document skeleton
Every HTML document follows this high-level structure:
    <!DOCTYPE html>
    <html lang="en">
    <head>
      <!-- metadata -->
    </head>
    <body>
      <!-- visible content -->
    </body>
    </html>
<!DOCTYPE html>
Enables standards mode. Without it, browsers may enter quirks mode (legacy behavior). Always include it.
<html>
Root element of the document. The lang attribute tells screen readers which pronunciation rules to use and helps search engines and translation tools identify the page language.
<head>
Contains metadata, not content. Common responsibilities:
- Character encoding
- Page title
- CSS inclusion
- Script loading hints
- SEO and social metadata
<body>
Contains all visible, interactive content.
This is where the application lives.
Core metadata tags (inside <head>)
<meta charset="utf-8">
- Must be early in <head>: the declaration has to fall within the first 1024 bytes of the document.
- Prevents encoding bugs.
<title>
Browser tab title.
Used by search engines and bookmarks.
<meta name="viewport">
Essential for responsive layouts.
Controls how mobile browsers scale content.
<link>
Used for stylesheets, icons, preloads.
Declarative resource loading.
<script>
Loads JavaScript.
defer and async dramatically affect execution timing.
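Put together, a typical head might look like the sketch below. The file names styles.css, favicon.ico, and app.js are hypothetical placeholders, and the viewport value shown is a common choice rather than the only valid one.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <!-- Encoding first, so everything after it is decoded correctly. -->
  <meta charset="utf-8">
  <!-- Lay the page out at the device width instead of a zoomed-out desktop width. -->
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>Example page</title>
  <!-- Declarative resource loading (hypothetical file names). -->
  <link rel="stylesheet" href="styles.css">
  <link rel="icon" href="favicon.ico">
  <!-- defer: download in parallel, run after the document has been parsed. -->
  <script src="app.js" defer></script>
</head>
<body>
  <p>Visible content goes here.</p>
</body>
</html>
```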
Structural content tags (the backbone)
These define meaningful sections, not layout.
<header>
Introductory content for a page or section.
Often contains navigation or headings.
<nav>
Major navigation blocks.
Helps screen readers and search engines.
<main>
Primary content of the document.
Only one visible <main> is allowed per page.
<section>
Thematic grouping of content.
Should usually have a heading.
<article>
Self-contained, reusable content.
Examples: blog post, comment, card.
<aside>
Tangential or supplementary content.
Sidebars, callouts, related links.
<footer>
Footer for a page or section.
Metadata, links, legal text.
Semantic structure matters more than visual layout.
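As an illustration of how these elements nest, here is one possible page layout. The blog content, link targets, and the aria-label value are invented for the example; other arrangements are equally valid.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Example blog</title>
</head>
<body>
  <header>
    <h1>Example blog</h1>
    <nav aria-label="Primary">
      <ul>
        <li><a href="/">Home</a></li>
        <li><a href="/archive">Archive</a></li>
      </ul>
    </nav>
  </header>

  <main>
    <article>
      <h2>Why semantics matter</h2>
      <section>
        <h3>Background</h3>
        <p>Each section groups related content under its own heading.</p>
      </section>
      <footer>
        <p>Posted by a hypothetical author.</p>
      </footer>
    </article>

    <aside>
      <h2>Related links</h2>
      <p><a href="/archive">More posts</a></p>
    </aside>
  </main>

  <footer>
    <p>Legal text and site links.</p>
  </footer>
</body>
</html>
```

Note that header and footer appear twice: once for the page and once scoped to the article, which matches their "page or section" definition.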
Headings (critical and often misused)
Headings represent structure, not font size.
<h1> through <h6>
Define a logical outline.
<h1> is the top-level heading.
Do not skip levels arbitrarily.
Used heavily by screen readers.
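A short sketch of a logical outline follows. The heading text is invented, and the indentation is only there to make the nesting visible; it has no effect on parsing.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Heading outline sketch</title>
</head>
<body>
  <h1>HTML study notes</h1>
    <h2>The document model</h2>
      <h3>Nodes</h3>
      <h3>Parent and child relationships</h3>
    <h2>Forms</h2>
      <h3>Labels</h3>

  <!-- Avoid picking a level for its default size, for example jumping
       from <h1> straight to <h4> because it looks smaller.
       Keep the level that matches the outline and size it with CSS. -->
</body>
</html>
```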
Text-level semantics (meaning over style)
Prefer semantic tags over generic ones.
<p>
Paragraphs of text.
Rendered as blocks with default vertical margins.
<strong> vs <b>
<strong> = importance
<b> = draws attention without adding importance (for example, keywords)
<em> vs <i>
<em> = stress emphasis that changes the meaning of a sentence (assistive technology may convey it)
<i> = alternate voice or style
<span>
No semantic meaning.
Use only when nothing else fits.
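The distinctions above are easier to see side by side. This is a small sketch with invented sentences; the class name price is a hypothetical styling hook.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Text-level semantics sketch</title>
</head>
<body>
  <!-- <strong>: the content is important. -->
  <p><strong>Warning:</strong> this action cannot be undone.</p>

  <!-- <b>: draws the eye, but carries no extra importance. -->
  <p>The <b>DOM</b> is the tree a browser builds from markup.</p>

  <!-- <em>: stress that changes how the sentence is read. -->
  <p>I <em>did</em> close the tag.</p>

  <!-- <i>: alternate voice, such as a term from another language. -->
  <p>The phrase <i lang="la">ad hoc</i> means "for this purpose".</p>

  <!-- <span>: no meaning, only a hook for styling or scripting. -->
  <p>Total: <span class="price">42</span></p>
</body>
</html>
```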
Lists (structured repetition)
Lists convey structure to accessibility tools.
<ul>
Unordered list
<ol>
Ordered list
<li>
List item (must be a direct child of <ul>, <ol>, or <menu>)
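A minimal sketch of both list types, including nesting. The list contents are invented for illustration.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>List sketch</title>
</head>
<body>
  <!-- Order matters: use <ol>. -->
  <ol>
    <li>Declare the doctype</li>
    <li>Add the head metadata</li>
    <li>Write the body content</li>
  </ol>

  <!-- Order does not matter: use <ul>. -->
  <ul>
    <li>Headings</li>
    <li>
      Lists
      <!-- A nested list lives inside an <li>, never directly inside <ul>. -->
      <ul>
        <li>Ordered</li>
        <li>Unordered</li>
      </ul>
    </li>
  </ul>
</body>
</html>
```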
Links and navigation
<a>
Hyperlink element. href defines navigation target. Without href, it’s not a link.
Important notes:
- Links are for navigation, not actions.
- Buttons trigger actions; links navigate.
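The split between navigation and action can be sketched as follows. The /pricing path, the id save, and the saveDraft function are hypothetical names used only for this example.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Link versus button sketch</title>
</head>
<body>
  <!-- Navigation: goes to another URL, so it is a link. -->
  <a href="/pricing">See pricing</a>

  <!-- Action: changes something on this page, so it is a button. -->
  <button type="button" id="save">Save draft</button>

  <!-- Avoid: <a href="#" onclick="saveDraft()">Save draft</a>
       It looks like navigation to browsers and assistive technology,
       but performs an action. -->

  <script>
    // Hypothetical action handler for the example.
    function saveDraft() {
      console.log("draft saved");
    }
    document.getElementById("save").addEventListener("click", saveDraft);
  </script>
</body>
</html>
```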
Media elements
<img>
Embeds images.
The alt attribute is required: describe informative images, and use alt="" for purely decorative ones.
It is a void element, so it has no closing tag.
<video> / <audio>
Native media playback.
Support controls, captions, multiple sources.
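Here is a sketch of the media elements in use. All file names (chart.png, divider.png, talk.webm, talk.mp4, talk.en.vtt) and the dimensions are placeholders assumed for the example.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Media sketch</title>
</head>
<body>
  <!-- Informative image: alt describes what the image conveys. -->
  <img src="chart.png" alt="Bar chart of monthly visits" width="640" height="360">

  <!-- Decorative image: empty alt tells screen readers to skip it. -->
  <img src="divider.png" alt="">

  <!-- The browser plays the first <source> it supports. -->
  <video controls width="640">
    <source src="talk.webm" type="video/webm">
    <source src="talk.mp4" type="video/mp4">
    <!-- Captions are supplied as a separate WebVTT text track. -->
    <track kind="captions" src="talk.en.vtt" srclang="en" label="English">
    <!-- Shown only by browsers without <video> support. -->
    <p>Your browser cannot play this video. <a href="talk.mp4">Download it</a>.</p>
  </video>
</body>
</html>
```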
Forms (user input and data submission)
Forms define how users interact with data.
Form semantics affect:
- Keyboard navigation
- Validation
- Screen reader behavior
<form>
Groups inputs and defines submission behavior.
<input>
Many types: text, email, checkbox, radio, password, etc.
Type determines validation and UI.
<textarea>
Multi-line text input.
<select> / <option>
Dropdown selection.
<label>
Associates text with an input. Crucial for accessibility.
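The form elements above can be combined as in this sketch. The /subscribe endpoint and all field names are hypothetical; the point is the pairing of each label with its control through matching for and id values.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Form sketch</title>
</head>
<body>
  <form action="/subscribe" method="post">
    <!-- for="email" ties the label to the input with id="email". -->
    <label for="email">Email address</label>
    <!-- type="email" plus required gives built-in validation. -->
    <input id="email" name="email" type="email" required>

    <label for="topic">Topic</label>
    <select id="topic" name="topic">
      <option value="html">HTML</option>
      <option value="css">CSS</option>
    </select>

    <label for="notes">Notes</label>
    <textarea id="notes" name="notes" rows="4"></textarea>

    <!-- Wrapping the control in the label also associates them. -->
    <label>
      <input type="checkbox" name="updates" value="yes">
      Send me updates
    </label>

    <button type="submit">Subscribe</button>
  </form>
</body>
</html>
```

Clicking a label moves focus to its control, and screen readers announce the label text when the control receives focus.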
Tables (structured data only)
Misuse of tables for layout is a historical antipattern.
<table>
For tabular data, not layout.
Core elements:
- <thead>, <tbody>, <tfoot>: header, body, and footer row groups
- <tr>: rows
- <th>: header cells
- <td>: data cells
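A small sketch of a data table using these elements. The figures are invented for illustration, and the caption and scope attributes are additions that are not discussed above but are commonly used to label tables for assistive technology.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Table sketch</title>
</head>
<body>
  <table>
    <caption>Support tickets by month (illustrative data)</caption>
    <thead>
      <tr>
        <th scope="col">Month</th>
        <th scope="col">Tickets</th>
      </tr>
    </thead>
    <tbody>
      <tr>
        <th scope="row">January</th>
        <td>12</td>
      </tr>
      <tr>
        <th scope="row">February</th>
        <td>30</td>
      </tr>
    </tbody>
    <tfoot>
      <tr>
        <th scope="row">Total</th>
        <td>42</td>
      </tr>
    </tfoot>
  </table>
</body>
</html>
```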
Script and style integration
<script>
Executes JavaScript.
- defer → execute after parsing, before DOMContentLoaded.
- async → execute as soon as loaded, order not guaranteed.
<style>
Embeds CSS directly in the document (generally discouraged at scale; prefer external stylesheets loaded with <link>).
Script placement and attributes directly affect performance and correctness.
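The timing difference can be sketched like this. The script file names analytics.js, vendor.js, and app.js are hypothetical; both attributes apply to external scripts loaded with src.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Script loading sketch</title>

  <!-- async: runs whenever its download finishes, possibly before parsing
       ends and in any order relative to other scripts. Suits independent
       code that nothing else relies on. -->
  <script src="analytics.js" async></script>

  <!-- defer: downloads in parallel, then runs after parsing completes.
       Deferred scripts keep document order, so vendor.js runs before app.js. -->
  <script src="vendor.js" defer></script>
  <script src="app.js" defer></script>
</head>
<body>
  <p>Content is parsed without waiting for any of the scripts above.</p>
  <script>
    // Fires after the document is parsed and the deferred scripts have run.
    document.addEventListener("DOMContentLoaded", () => {
      console.log("DOM ready");
    });
  </script>
</body>
</html>
```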
Attributes (data and behavior)
Use attributes for declarative intent, not logic.
Attributes modify elements:
- Global attributes: id, class, hidden, tabindex
- ARIA attributes for accessibility
- data-* for custom metadata
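A brief sketch of these attribute families working together. The ids, class name, and data-product-id values are invented for the example.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Attribute sketch</title>
</head>
<body>
  <!-- tabindex="-1": focusable from script, but not in the Tab order. -->
  <h2 id="results" tabindex="-1">Results</h2>

  <ul>
    <!-- data-*: custom metadata attached to the element. -->
    <li class="product" data-product-id="42">Keyboard</li>
    <!-- hidden: kept in the DOM, but not rendered. -->
    <li class="product" data-product-id="43" hidden>Discontinued mouse</li>
  </ul>

  <!-- aria-label: accessible name for a control whose visible text is a symbol. -->
  <button type="button" aria-label="Close dialog">&times;</button>

  <script>
    // data-product-id is exposed to scripts as dataset.productId.
    const first = document.querySelector(".product");
    console.log(first.dataset.productId); // logs "42"
  </script>
</body>
</html>
```

The markup states the intent declaratively; the script only reads what the attributes already say.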
HTML and accessibility
Correct HTML:
- Enables keyboard navigation
- Enables screen readers
- Enables assistive technologies
Bad HTML:
- Cannot be fixed fully with JavaScript
- Breaks users silently
Accessibility starts with semantic HTML, not ARIA.
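To see why semantic HTML comes first, compare a native button with a generic element dressed up to imitate one. This is a sketch: the ids and the save function are hypothetical, and the imitation shown covers only the basics.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Native versus ARIA sketch</title>
</head>
<body>
  <!-- Native: focusable, activated by Enter and Space,
       and announced as a button, with no extra work. -->
  <button type="button" id="native">Save</button>

  <!-- Imitation: role, focusability, and keyboard handling
       all have to be added by hand. -->
  <div id="custom" role="button" tabindex="0">Save</div>

  <script>
    // Hypothetical action shared by both controls.
    function save() {
      console.log("saved");
    }

    document.getElementById("native").addEventListener("click", save);

    const custom = document.getElementById("custom");
    custom.addEventListener("click", save);
    custom.addEventListener("keydown", (event) => {
      // Re-create the keyboard activation a real button provides.
      if (event.key === "Enter" || event.key === " ") {
        event.preventDefault(); // stop Space from scrolling the page
        save();
      }
    });
  </script>
</body>
</html>
```

If the script fails to load, the native button still behaves like a button, while the div stays an inert box that merely claims to be one.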
HTML error handling (browser reality)
HTML is forgiving by design.
Browsers:
- Auto-close tags
- Reorder invalid structures
- Guess intent
This tolerance hides bugs. Invalid HTML leads to unpredictable DOMs.
Rule:
- Write valid HTML even if browsers “fix” it.
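Two small cases show how the parsed DOM can differ from the source text. This is a sketch of standard parser behavior; the content is invented.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Parser recovery sketch</title>
</head>
<body>
  <!-- Invalid: a <div> cannot be nested inside a <p>. -->
  <p>one<div>two</div>three</p>
  <!-- The parser closes the <p> before the <div>, and the stray </p>
       then creates an empty paragraph. Resulting DOM:
       <p>one</p><div>two</div>three<p></p> -->

  <!-- Valid, but still surprising: rows written directly inside <table>
       are placed in an implied <tbody>. -->
  <table><tr><td>cell</td></tr></table>

  <script>
    // One <p> in the source, two in the DOM.
    console.log(document.querySelectorAll("p").length); // logs 2

    // A selector written against the source text matches nothing...
    console.log(document.querySelector("table > tr")); // logs null
    // ...because the row actually lives inside the implied <tbody>.
    console.log(document.querySelector("table > tbody > tr") !== null); // logs true
  </script>
</body>
</html>
```

Running the markup through a validator catches the first case before it turns into a layout or scripting bug.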
Mental checklist
- Does this element describe what it is?
- Is this structure meaningful without CSS?
- Would this work with a keyboard only?
- Does the DOM match my mental model?