AEM Architecture: The Complete Guide (Whiteboard-Ready)

June 14, 2026 · 12 min read

Understand AEM's architecture end to end — author vs publish, the Dispatcher, the CDN, replication and reverse replication, the load balancer, the layered caching, and content distribution. Includes a whiteboard-ready diagram, a step-by-step way to draw it, a cheat sheet, best practices, and do's & don'ts.

Tags: AEM, Architecture, Dispatcher, Caching, DevOps, Reference

Sooner or later — in an interview, a design review, or an incident call — someone hands you a marker and says "draw the AEM architecture." It's a fair test, because if you can draw it accurately you understand how content gets created, how it reaches users, and where things cache and fail. This guide is built to get you there: it explains every piece of a production AEM topology, shows the whole thing in one diagram, and then walks you through how to reproduce that diagram on a whiteboard from memory.

We'll cover the two instance tiers (author and publish), the Dispatcher, the CDN, the load balancer, the layered caching, and how content moves around through replication, reverse replication, and content distribution. A cheat sheet, best practices, and do's & don'ts close it out.

This expands on the architecture section of the AEM Developer Cheat Sheet and pairs with the deep-dive Dispatcher guide.

Author vs Publish

The first and most important split in AEM is between two kinds of instance, each tuned for an opposite job.

The author instance is where content is created and edited. It sits behind authentication, holds the full repository, and is optimized for writing — drag-and-drop editing, workflows, versioning, previews. Typically there are very few author instances (often one, sometimes a cluster for high availability), because authoring is a controlled, internal activity.

The publish instance is what serves the live site to the public. It receives a copy of approved content from author and is optimized for reading at scale: fast, stateless, and horizontally scaled. You usually run many publish instances so the site can absorb real traffic and survive the loss of any one of them.

Keeping the two separate is what makes AEM both secure and scalable: the editing environment is never exposed to the internet, and the delivery environment can be scaled and cached aggressively without worrying about author tooling.

	Author	Publish
Purpose	Create & edit content	Serve the live site
Access	Internal, authenticated	Public
Optimized for	Writing (editing, workflow)	Reading (high traffic)
Count	Few (often 1, or a cluster)	Many (horizontally scaled)

The full picture

Here is a production AEM topology in one diagram — the read path top to bottom, with the author tier feeding publish from below:

            End users (browsers)
                     │
                     ▼
            ┌──────────────────┐
            │       CDN        │
            └──────────────────┘
                     │
                     ▼
            ┌──────────────────┐
            │  Load Balancer   │
            └──────────────────┘
                │            │
                ▼            ▼
          ┌──────────┐ ┌──────────┐
          │Dispatcher│ │Dispatcher│
          └──────────┘ └──────────┘
                │            │
                ▼            ▼
          ┌──────────┐ ┌──────────┐
          │ Publish  │ │ Publish  │
          └──────────┘ └──────────┘
                ▲            ▲
                └─────┬──────┘
                      │
     forward repl. ↑   reverse repl. ↓
                      ▼
            ┌──────────────────┐
            │      Author      │
            └──────────────────┘
                     ▲
                     │
            Authors (logged in)

Two different journeys run through this picture, and separating them in your head is the key to understanding AEM:

The request (read) path — how a visitor's request reaches content: Users → CDN → Load Balancer → Dispatcher → Publish. Most requests are answered by a cache long before they reach publish.
The content (write) path — how content gets published: Authors → Author → (replication) → Publish, with reverse replication carrying user-generated content back the other way.

We'll walk each layer of the read path, then each mechanism of the write path.

The Dispatcher

The Dispatcher sits directly in front of publish and does three jobs at once: it caches rendered pages and assets as files on disk, it load-balances across publish instances, and — most importantly — it acts as the security layer, deciding which URLs are even allowed through. When a page is in the Dispatcher cache, the request never touches publish at all, which is the foundation of AEM performance.

The Dispatcher is an Apache HTTP Server module, configured very differently on classic AMS/on-prem versus AEM as a Cloud Service. Because it's a deep topic in its own right — filters, caching, invalidation, security hardening — it has a dedicated Dispatcher guide. For architecture purposes, remember its one-line role: cache and protect publish.
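
The three jobs can be pictured as one decision per request. What follows is a toy Python model, not the real Apache module: `ToyDispatcher`, `ALLOWED_PREFIXES`, and the render callables are hypothetical names, and the filter is reduced to a path-prefix allowlist, which is much simpler than real Dispatcher filter rules.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical allowlist; real filters also match method, selectors, and extension.
ALLOWED_PREFIXES = ("/content/", "/etc.clientlibs/")


class ToyDispatcher:
    """Toy model of the Dispatcher's roles: filter, cache, then balance across renders."""

    def __init__(self, renders: List[Callable[[str], str]]) -> None:
        self.renders = renders            # stand-ins for publish instances
        self.cache: Dict[str, str] = {}   # stand-in for the on-disk cache
        self._next = 0

    def handle(self, path: str) -> Tuple[int, str]:
        # 1. Security: reject anything not explicitly allowed.
        if not path.startswith(ALLOWED_PREFIXES):
            return (404, "")
        # 2. Cache: a hit never touches publish.
        if path in self.cache:
            return (200, self.cache[path])
        # 3. Load balancing: take the next render in rotation, then cache its output.
        render = self.renders[self._next % len(self.renders)]
        self._next += 1
        body = render(path)
        self.cache[path] = body
        return (200, body)

    def invalidate(self, path: str) -> None:
        # Called on activation so the next request is rendered again.
        self.cache.pop(path, None)
```

The order of the three checks is the point of the sketch: a blocked URL is rejected before the cache is consulted, and a cached page is returned before any publish instance is chosen.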

The CDN

In front of the Dispatcher sits a CDN (Content Delivery Network) — a globally distributed edge cache. It stores cacheable responses close to users around the world, so a visitor in Tokyo is served from a nearby edge node rather than reaching back to your origin. On AEM as a Cloud Service an Adobe-managed CDN (built on Fastly) is always present; on classic deployments you add your own (Akamai, CloudFront, Fastly).

The CDN is controlled almost entirely by response headers: Cache-Control (in particular s-maxage) and Surrogate-Control set how long the edge may keep a response, while Age reports how long a cached copy has already been held. The mental model is that the CDN is simply another cache layer, further out and closer to the user, governed by the same TTL thinking as the Dispatcher.
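
To make the header-driven model concrete, here is a minimal Python sketch of how an edge cache and a browser might each derive a TTL from the same response. The function names `edge_ttl` and `browser_ttl` are hypothetical, and the precedence order (Surrogate-Control, then s-maxage, then max-age) is an assumption that mirrors common CDN behavior; check your own CDN's documentation before relying on it.

```python
import re
from typing import Dict, Optional


def _directive(header: str, name: str) -> Optional[int]:
    """Return the integer value of `name=<seconds>` in a header value, or None."""
    pattern = r"(?:^|[,\s])" + re.escape(name) + r"=(\d+)"
    match = re.search(pattern, header.lower())
    return int(match.group(1)) if match else None


def edge_ttl(headers: Dict[str, str]) -> int:
    """TTL in seconds a shared (CDN) cache would use under the assumed precedence."""
    cache_control = headers.get("Cache-Control", "")
    lowered = cache_control.lower()
    if "private" in lowered or "no-store" in lowered:
        return 0  # shared caches must not store these responses
    candidates = (
        _directive(headers.get("Surrogate-Control", ""), "max-age"),
        _directive(cache_control, "s-maxage"),
        _directive(cache_control, "max-age"),
    )
    for value in candidates:
        if value is not None:
            return value
    return 0  # assumption: no header means no caching; real CDNs may apply a default


def browser_ttl(headers: Dict[str, str]) -> int:
    """TTL in seconds a browser would use; it ignores s-maxage and Surrogate-Control."""
    cache_control = headers.get("Cache-Control", "")
    if "no-store" in cache_control.lower():
        return 0
    return _directive(cache_control, "max-age") or 0
```

For example, a response carrying `Cache-Control: max-age=60, s-maxage=600` would be kept for 60 seconds by the browser and 600 seconds at the edge under this model, which is the usual way to give the CDN a longer lifetime than the user's device.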

The Load Balancer

A load balancer distributes incoming traffic across multiple instances so no single one is overwhelmed, and so the loss of one doesn't take the site down. It uses health checks to route only to healthy instances and can apply session stickiness when a flow needs to stay on one instance.

A subtle point that trips people up: AEM has load balancing at two levels. A load balancer sits in front of the web tier (the Dispatcher/Apache instances), and the Dispatcher itself also load-balances across the publish instances behind it (its "renders"). On AEM as a Cloud Service this is all managed for you — instances autoscale and the routing is handled by the platform — but the concept still exists, and it's worth drawing both levels on the whiteboard to show you understand it.
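
The health-check behavior can be sketched in a few lines of Python. This is an illustrative round-robin model only: the class `RoundRobinBalancer` and the backend names are hypothetical, and real load balancers offer other strategies (least connections, weighted, sticky sessions) that are not shown.

```python
from itertools import count
from typing import Dict, List


class RoundRobinBalancer:
    """Rotate across backends, skipping any that failed their health check."""

    def __init__(self, backends: List[str]) -> None:
        self.backends = backends
        self.healthy: Dict[str, bool] = {b: True for b in backends}
        self._counter = count()

    def mark(self, backend: str, healthy: bool) -> None:
        # In a real setup this would be driven by periodic health probes.
        self.healthy[backend] = healthy

    def pick(self) -> str:
        candidates = [b for b in self.backends if self.healthy[b]]
        if not candidates:
            raise RuntimeError("no healthy backends")
        return candidates[next(self._counter) % len(candidates)]


# Two levels, as described above: one balancer in front of the web tier,
# and one per Dispatcher in front of its publish instances.
web_tier = RoundRobinBalancer(["dispatcher-1", "dispatcher-2"])
renders = {
    "dispatcher-1": RoundRobinBalancer(["publish-1", "publish-2"]),
    "dispatcher-2": RoundRobinBalancer(["publish-1", "publish-2"]),
}
```

Marking one backend unhealthy with `mark(name, False)` removes it from rotation without any change on the caller's side, which is the property that lets the site survive the loss of an instance.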

Caching layers

Performance in AEM is fundamentally about caching, and there isn't one cache — there are several, stacked. A request passes through each in turn, and ideally is answered by the outermost one it can be:

Layer	Where	Caches	Controlled by
Browser cache	The user's device	Per-user assets/pages	`Cache-Control: max-age`
CDN	Global edge	Cacheable responses worldwide	`s-maxage` / `Surrogate-Control`
Dispatcher	In front of publish	Rendered pages & assets (as files)	cache rules + invalidation
Publish	The AEM instance	Last resort — renders the page	—

The golden rule is cache as far out as possible: a response served from the browser or CDN never consumes origin resources. Invalidation flows in the opposite direction — when content is published, the Dispatcher cache is flushed, and the CDN is purged or expires by TTL — so a change ripples outward from publish to the edge. When "my change isn't showing up," you're almost always looking at a stale copy in one of these layers; check them from the outside in.
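
The outside-in lookup and the stale-copy problem can be reproduced with a small Python model. It is a sketch under simplifying assumptions: `LayeredCache` is a hypothetical name, TTL expiry is not modeled, and publishing is assumed to flush the Dispatcher and purge the CDN immediately.

```python
from typing import Dict, Tuple

LAYERS = ["browser", "cdn", "dispatcher"]  # outermost first


class LayeredCache:
    """Toy model of the stacked caches in front of publish."""

    def __init__(self) -> None:
        self.stores: Dict[str, Dict[str, str]] = {name: {} for name in LAYERS}
        self.origin: Dict[str, str] = {}  # content rendered by publish

    def get(self, path: str) -> Tuple[str, str]:
        """Return (layer_that_answered, body) and fill the layers further out."""
        for index, name in enumerate(LAYERS):
            if path in self.stores[name]:
                answered, body = name, self.stores[name][path]
                break
        else:
            index, answered, body = len(LAYERS), "publish", self.origin[path]
        for name in LAYERS[:index]:
            self.stores[name][path] = body
        return answered, body

    def publish(self, path: str, body: str) -> None:
        self.origin[path] = body
        # Activation flushes the Dispatcher and purges the CDN. A browser copy
        # cannot be purged remotely; it lives until its max-age runs out.
        self.stores["dispatcher"].pop(path, None)
        self.stores["cdn"].pop(path, None)
```

In this model, publishing a new version while a browser still holds the old one leaves `get` returning the old body from the `browser` layer, which is exactly the "my change isn't showing up" case and the reason to check layers from the outside in.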

Replication

Replication is how approved content travels from author to publish — the forward content path. When an author activates a page, a replication agent on the author instance packages the content and sends it to each publish instance, and a companion Dispatcher flush agent invalidates the relevant cache so visitors see the new version.

Replication handles three actions: activate (publish), deactivate (unpublish), and delete. You can trigger it from the UI, from a workflow step, or from code via the Replicator service. It's the mechanism that keeps the publish tier in sync with what authors have approved.
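
As a rough illustration of the forward path, the Python sketch below models what an activation does: deliver to every publish instance, then record a cache flush. It does not call the real AEM API (which is Java); `Action`, `ToyPublish`, and `replicate` are hypothetical names, and deactivate and delete are treated identically on the publish side for simplicity.

```python
from enum import Enum
from typing import Dict, List, Optional


class Action(Enum):
    ACTIVATE = "activate"
    DEACTIVATE = "deactivate"
    DELETE = "delete"


class ToyPublish:
    """Stand-in for a publish instance's copy of the content."""

    def __init__(self) -> None:
        self.pages: Dict[str, str] = {}

    def receive(self, action: Action, path: str, content: Optional[str]) -> None:
        if action is Action.ACTIVATE:
            self.pages[path] = content or ""
        else:
            # Simplification: both deactivate and delete remove the page here.
            self.pages.pop(path, None)


def replicate(
    action: Action,
    path: str,
    author_repo: Dict[str, str],
    publishers: List[ToyPublish],
    flushed: List[str],
) -> None:
    """Send one action to every publish instance, then flush the cached path."""
    content = author_repo[path] if action is Action.ACTIVATE else None
    for publish in publishers:
        publish.receive(action, path, content)
    flushed.append(path)  # stand-in for the Dispatcher flush agent
```

The flush is recorded after delivery on purpose: if the cache were invalidated first, a request arriving in between could re-cache the old version from a publish instance that has not yet received the update.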

Reverse Replication

Content doesn't only flow outward. Some content originates on publish — think form submissions, comments, ratings, or other user-generated content created by anonymous visitors. Reverse replication carries that content back from publish to author, where it can be moderated, stored, or processed.

It works through an outbox on publish and a reverse replication agent on author that periodically pulls from it. Reverse replication is far less common in modern architectures (user-generated data increasingly goes to dedicated services or databases), but you should know it exists and which direction it flows — publish → author — because it's a classic interview question and the conceptual counterpart to forward replication.
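
The pull direction is easy to get backwards, so here is a minimal Python sketch of the outbox pattern described above. The names `PublishOutbox` and `poll_outboxes` are hypothetical, and scheduling, retries, and moderation are left out.

```python
from collections import deque
from typing import Deque, List


class PublishOutbox:
    """Queue of user-generated items waiting on one publish instance."""

    def __init__(self) -> None:
        self._items: Deque[str] = deque()

    def submit(self, item: str) -> None:
        # Called on publish when a visitor submits content.
        self._items.append(item)

    def drain(self) -> List[str]:
        # Hand over everything queued so far and empty the outbox.
        items = list(self._items)
        self._items.clear()
        return items


def poll_outboxes(outboxes: List[PublishOutbox], author_inbox: List[str]) -> int:
    """Run on author: pull from each publish outbox and return the item count."""
    pulled = 0
    for outbox in outboxes:
        for item in outbox.drain():
            author_inbox.append(item)
            pulled += 1
    return pulled
```

Note that publish never pushes to author in this model; author initiates every transfer, which fits the rule that the authoring environment is not reachable from the delivery tier.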

Content Distribution

"Content distribution" is the broader umbrella for moving content between instances and environments, and how it's implemented depends on your AEM flavor:

On classic AMS/on-prem, distribution is the replication-agent model described above.
On AEM as a Cloud Service, the platform uses Sling Content Distribution (SCD) under the hood. You don't hand-configure replication agents; activation still works the same way from your perspective, but the transport is a managed, queue-based distribution service designed for autoscaling clusters.
Between environments (copying content from prod to stage, say), teams use content packages or content-copy tooling rather than replication.

The takeaway for the whiteboard: there's always some mechanism moving content author → publish; name it "replication" on-prem and "content distribution / SCD" on cloud, and you've got it right.

AEM as a Cloud Service: what changes

The diagram above is universal, but AEMaaCS formalizes and automates much of it. The CDN is always present and Adobe-managed; publish instances autoscale horizontally; the load balancing and Dispatcher operation are managed for you; content distribution runs over SCD; and there's an extra preview tier (its own Dispatcher + publisher pair) for previewing content before it goes live. The shapes on the whiteboard don't change — but on cloud, you operate fewer of them by hand.

How to draw it on a whiteboard

Here's a reliable way to reproduce the architecture under pressure. Build it up in six strokes, narrating as you go — interviewers care as much about the story as the boxes.

Start with the two tiers. Draw two boxes: Author and Publish. Say: "Author is where content is created; publish serves it to the public — kept separate for security and scale."
Connect them with replication. Draw an arrow Author → Publish labeled replication (and a thinner arrow back, reverse replication, for user-generated content). This is the content path.
Put a Dispatcher in front of publish. One box: Dispatcher → Publish. Say: "Cache plus security plus load balancing in front of publish."
Scale publish and add the load balancer. Draw multiple publish boxes and a Load Balancer in front of the Dispatcher(s). Mention the two levels of balancing — LB to web tier, Dispatcher to publishers.
Add the CDN. One box on top: CDN. Say: "Global edge cache, governed by response-header TTLs."
Add the actors and the request path. Users at the top hitting the CDN; Authors at the bottom logging into author. Trace the read path aloud: Users → CDN → LB → Dispatcher → Publish, and the write path: Authors → Author → replication → Publish.

If you can draw those six strokes and narrate the two paths, you've demonstrated real architectural understanding — not just memorized boxes.

Cheat sheet

Component	One-line role
Author	Create & edit content (internal, authenticated)
Publish	Serve the live site (public, scaled out)
Dispatcher	Cache + security + load balance, in front of publish
CDN	Global edge cache, TTL-driven
Load Balancer	Distribute traffic; health checks + stickiness
Replication	Author → Publish (activate / deactivate / delete)
Reverse Replication	Publish → Author (user-generated content)
Content Distribution	The transport (agents on-prem, SCD on cloud)
Caching layers	Browser → CDN → Dispatcher → Publish

Best practices

✅ Keep author internal and never publicly reachable.
✅ Scale publish horizontally and put a load balancer in front.
✅ Cache as far out as possible — browser/CDN before Dispatcher before publish.
✅ Drive CDN and Dispatcher TTLs with response headers, and ensure activation invalidates them.
✅ Trigger publishing through workflows / the Replicator, not manual one-offs, for repeatability.

Do's and Don'ts

Do

✅ Distinguish the read path (users inward) from the write path (authors outward) when reasoning about issues.
✅ Check caches outside-in (CDN → Dispatcher → publish) when content looks stale.
✅ Treat AEMaaCS's autoscaling/managed CDN as the same shapes, automated.

Don't

❌ Don't expose the author instance to the public internet.
❌ Don't assume one cache: there are several layers, each with its own TTL or invalidation rule.
❌ Don't forget reverse replication flows publish → author (not the other way).
❌ Don't conflate the load balancer (web tier) with the Dispatcher's balancing to publishers.

Wrapping up

AEM's architecture is really two stories overlaid on the same boxes: content flowing outward from authors to the edge, and requests flowing inward from users to publish — meeting at the publish tier, with caches stacked in between to keep almost everything off the origin. Hold onto the two tiers, the Dispatcher and CDN caches, the load balancer, and the replication/reverse-replication/distribution flows, and you can both operate the platform and draw it on a whiteboard with confidence.

Go deeper where it matters: the Dispatcher guide for the caching and security layer, and the AEM Developer Cheat Sheet for how this topology maps to the codebase and consoles.