
ProseMirror for Beginners: How It Actually Works

ProseMirror can feel intimidating at first because it does not think like most UI libraries. This guide explains the core mental model behind schemas, documents, state, transactions, and views.

#prosemirror #rich-text #editor #tutorial #javascript

I work with ProseMirror every day. As a core maintainer of Tiptap, I have spent years building on top of it, debugging it, and writing code that runs inside its transaction pipeline. And I still remember how much I had to wrap my head around it at first.

Most people I talk to have a similar experience. ProseMirror is incredibly powerful, but its mental model is different from what most frontend developers are used to. The docs are thorough, but they assume you already understand the concepts, which makes the first read confusing. A lot of people bounce off before the pieces click. I almost did too.

The thing is, ProseMirror is not complicated by accident. It was designed to model rich-text editing properly, and that turns out to be a much harder problem than most of us realize. Once you understand why it works the way it does, everything starts fitting together.

This post covers the mental model that took me a while to figure out: schemas, documents, editor state, transactions, the view, and how they all connect.

What ProseMirror Actually Is

At its core, ProseMirror is a toolkit for building rich-text editors. It is not a ready-made editor you can drop into a page, and it is not a React component. It is a toolkit. It gives you primitives to describe document structure, update it, and render an editor view that stays in sync.

Because it is low-level, a lot of serious editor products are built on top of it. But that also means it feels heavier than a <textarea> from day one.

If you just need a polished editor quickly, raw ProseMirror is probably not what you want. If you want to understand structured editing, or need deep control over editor behavior, it is absolutely worth the effort.

Why It Feels Weird at First

In most frontend work, state is simple. A modal is open or closed. A form field holds a string. A tab is selected or not. Even complex UIs usually boil down to manageable state.

A rich-text editor is not like that.

It has to track structure, selections, formatting, keyboard behavior, document validity, commands (covered later), plugin state (covered later), and transformations, all at once. The internal model needs to be way more precise than a plain string or loose HTML.

Typical UI State                          | ProseMirror State
Simple values (boolean, string, number)   | Complex tree structure (nodes, marks)
Mutable updates                           | Immutable state with transactions
Single source of truth                    | Document + selection + plugin state
Component-driven rendering                | Schema-driven document rendering
Event handlers everywhere                 | Centralized transaction pipeline

Here is the first mental shift: ProseMirror is not mostly about rendering. It is mostly about keeping a structured document consistent while users edit it.

The Five Core Concepts

If you are new to ProseMirror, these are the concepts that matter most.

1. Schema

The schema is the blueprint for your document. It defines which nodes and marks are valid, and how they nest.

Most practical text-editing schemas include at least doc, paragraph, and text. doc is the root, text represents inline text, and paragraph is usually the default block node.

Before diving into a custom schema, here is the smallest useful ProseMirror setup using the built-in basic schema:

npm install prosemirror-state prosemirror-view prosemirror-model prosemirror-schema-basic prosemirror-commands prosemirror-keymap
import { EditorState } from "prosemirror-state";
import { EditorView } from "prosemirror-view";
import { schema } from "prosemirror-schema-basic";
import { keymap } from "prosemirror-keymap";
import { baseKeymap } from "prosemirror-commands";

const state = EditorState.create({
  schema,
  plugins: [keymap(baseKeymap)],
});

const view = new EditorView(document.querySelector("#editor"), {
  state,
});

This is enough to mount a basic editable ProseMirror view with sensible keyboard behavior. The custom schema example below shows what the schema looks like when you define it yourself.

Here is how you define your own schema:

import { Schema } from "prosemirror-model";

// A schema defines the grammar of your editor.
// This one allows paragraphs, headings, and basic formatting.
const mySchema = new Schema({
  nodes: {
    // Every schema needs a root "doc" node and a "text" node.
    doc: { content: "block+" },
    text: { group: "inline" },

    // "paragraph" is the default block node.
    // "inline*" means it accepts zero or more inline nodes (here, text).
    // Marks are not nodes; they are attached to inline nodes.
    paragraph: {
      content: "inline*",
      group: "block",

      // toDOM tells ProseMirror how to render this node as HTML.
      // The view calls this when the node needs to appear in the DOM.
      // The 0 is the content hole. It tells ProseMirror
      // where the node's child content should be rendered.
      toDOM() { return ["p", 0]; },
    },

    // A custom heading node with a numbered level attribute.
    heading: {
      content: "inline*",
      group: "block",
      attrs: { level: { default: 1 } },

      // The level attribute picks the tag name: level 2 renders as <h2>.
      // The trailing 0 tells ProseMirror where child content goes.
      toDOM(node) {
        return ["h" + node.attrs.level, 0];
      },

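      // parseDOM tells ProseMirror how to read existing HTML back
      // into this node type, e.g. for pasting or loading saved HTML.
      // h4 to h6 are included so every level toDOM can emit parses back.
      parseDOM: [
        { tag: "h1", attrs: { level: 1 } },
        { tag: "h2", attrs: { level: 2 } },
        { tag: "h3", attrs: { level: 3 } },
        { tag: "h4", attrs: { level: 4 } },
        { tag: "h5", attrs: { level: 5 } },
        { tag: "h6", attrs: { level: 6 } },
      ],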
    },

    // A blockquote node for quoted text.
    blockquote: {
      content: "block+",
      group: "block",
      toDOM() { return ["blockquote", 0]; },
      parseDOM: [{ tag: "blockquote" }],
    },
  },
  marks: {
    // Marks are inline formatting that wraps parts of a text node.
    strong: {
      toDOM() { return ["strong", 0]; },
      parseDOM: [{ tag: "strong" }, { tag: "b" }],
    },
    em: {
      toDOM() { return ["em", 0]; },
      parseDOM: [{ tag: "em" }, { tag: "i" }],
    },
  },
});

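The toDOM function tells ProseMirror how to render a node or mark into the DOM. The array it returns is ProseMirror’s declarative DOM description (a DOMOutputSpec): the first element is the tag name, an optional plain object right after it holds HTML attributes, and 0 marks the content hole where child content is rendered.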

The parseDOM rules do the reverse: they tell ProseMirror how to recognize HTML elements during pasting or content load and convert them back into your schema types. Together, they make the schema a two-way bridge between your structured document and the browser’s DOM.
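To make the array syntax concrete, here is a small sketch of how such a description could be turned into an HTML string. renderSpec is a hypothetical helper written for this post, not part of ProseMirror (the library's DOMSerializer builds real DOM nodes rather than strings). It only handles the subset used above: a tag name, an optional attributes object, and the content hole 0. The link spec is also made up for illustration, since the schema above has no link mark.

```javascript
// Hypothetical helper: turns a toDOM-style array into an HTML string.
// Simplified on purpose: no escaping and no nested child specs.
function renderSpec(spec, innerHTML) {
  const [tag, ...rest] = spec;
  // An optional plain object in second position holds HTML attributes.
  const hasAttrs =
    rest.length > 0 &&
    typeof rest[0] === "object" &&
    rest[0] !== null &&
    !Array.isArray(rest[0]);
  const attrs = hasAttrs ? rest.shift() : {};
  const attrText = Object.entries(attrs)
    .map(([name, value]) => ` ${name}="${value}"`)
    .join("");
  // 0 is the content hole: the node's children are rendered there.
  const body = rest.includes(0) ? innerHTML : "";
  return `<${tag}${attrText}>${body}</${tag}>`;
}

const headingSpec = ["h" + 2, 0];
const linkSpec = ["a", { href: "https://prosemirror.net" }, 0];

console.log(renderSpec(headingSpec, "Hello World"));
// <h2>Hello World</h2>
console.log(renderSpec(linkSpec, "docs"));
// <a href="https://prosemirror.net">docs</a>
```

The point of the sketch is the role of 0: everything else in the array describes the wrapper element, and the hole says where the node's own content goes.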

2. Document

The document is the actual content tree. It is made up of nodes and marks that conform to the schema.

A ProseMirror document is a tree of nodes. Here is the JSON form of a document with a heading and a paragraph. This is what doc.toJSON() returns; in memory, ProseMirror works with Node objects rather than plain JSON:

{
  "type": "doc",
  "content": [
    {
      "type": "heading",
      "attrs": { "level": 1 },
      "content": [
        { "type": "text", "text": "Hello World" }
      ]
    },
    {
      "type": "paragraph",
      "content": [
        { "type": "text", "text": "This is a paragraph with " },
        {
          "type": "text",
          "text": "bold",
          "marks": [{ "type": "strong" }]
        },
        { "type": "text", "text": " text." }
      ]
    }
  ]
}

This structured tree is why ProseMirror can validate, transform, and serialize content reliably. HTML is the browser’s model. This tree is yours.
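Because the document is a plain tree, ordinary recursion is enough to ask questions about it. The sketch below walks the JSON form shown above and collects every text run together with its marks. collectTextRuns is a hypothetical helper, and it works on the serialized JSON rather than on ProseMirror's Node objects.

```javascript
// The same document as above, in its JSON form.
const docJSON = {
  type: "doc",
  content: [
    {
      type: "heading",
      attrs: { level: 1 },
      content: [{ type: "text", text: "Hello World" }],
    },
    {
      type: "paragraph",
      content: [
        { type: "text", text: "This is a paragraph with " },
        { type: "text", text: "bold", marks: [{ type: "strong" }] },
        { type: "text", text: " text." },
      ],
    },
  ],
};

// Hypothetical helper: collect every text run with the names of its marks.
function collectTextRuns(node, runs = []) {
  if (node.type === "text") {
    runs.push({
      text: node.text,
      marks: (node.marks || []).map((mark) => mark.type),
    });
  }
  for (const child of node.content || []) {
    collectTextRuns(child, runs);
  }
  return runs;
}

const runs = collectTextRuns(docJSON);
const boldText = runs
  .filter((run) => run.marks.includes("strong"))
  .map((run) => run.text);

console.log(boldText); // ["bold"]
```

Answering the same question against raw HTML would mean parsing tags and guessing which ones mean "bold". With the tree, the formatting is data.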

3. EditorState

EditorState is the current snapshot of the editor. It holds the document, but also the selection, stored marks, and plugin state, which becomes important once you start adding custom behavior.

A common beginner mistake is treating the document as the only state. The editor is a live system: cursor position and plugin data matter too.

Here is how you create an EditorState using the schema we defined above:

import { EditorState } from "prosemirror-state";

// Create an editor state from our custom schema.
// The state bundles everything: document, selection, plugin state.
const state = EditorState.create({ schema: mySchema });

// The state is immutable. You never mutate it directly.
// Instead, you apply transactions to produce a new state.

4. Transaction

A transaction describes a change to the state.

When a user types, deletes text, or moves the selection, ProseMirror represents that as a transaction. The transaction is applied to the current EditorState to produce a new one.

Here is how you create and dispatch a transaction programmatically:

// state.tr creates a new transaction based on the current state.
// It starts empty; each method call below adds steps to it.
const tr = state.tr;

// Add a step: insert text at the current selection. In a freshly
// created state, that is the start of the document.
tr.insertText("Hello ProseMirror!");

// dispatch applies the transaction through the EditorView,
// which updates the state and re-renders the DOM automatically.
// `view` is the EditorView created in the next section.
view.dispatch(tr);

Internally, a transaction contains zero or more steps. Each step describes a document change, like replacing a range or adding a mark. A transaction that only moves the selection has no steps at all.

Instead of mutating state, you build a transaction that describes what changed, and let ProseMirror apply it. This is the second big mental shift: changes are explicit transformations, not random mutations.
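If the idea of "explicit transformations" feels abstract, a toy model can help. The sketch below is not ProseMirror's API: the document is just a string, a step inserts text at an offset, and a transaction is a list of steps. All function names are made up for illustration. What it shares with the real thing is the shape: applying a transaction never touches the old state, it returns a new one.

```javascript
// Toy model, NOT ProseMirror's API: a "document" is a plain string.
function createState(doc) {
  return Object.freeze({ doc });
}

// A toy transaction collects steps; nothing changes until it is applied.
function createTransaction() {
  const steps = [];
  return {
    steps,
    insertText(text, at) {
      steps.push({ type: "insert", text, at });
      return this; // allow chaining
    },
  };
}

// Applying never mutates the old state; it returns a new snapshot.
function applyTransaction(state, tr) {
  let doc = state.doc;
  for (const step of tr.steps) {
    doc = doc.slice(0, step.at) + step.text + doc.slice(step.at);
  }
  return createState(doc);
}

const before = createState("ProseMirror");
const tr = createTransaction().insertText("Hello ", 0).insertText("!", 17);
const after = applyTransaction(before, tr);

console.log(before.doc); // "ProseMirror"
console.log(after.doc);  // "Hello ProseMirror!"
```

Both snapshots exist side by side after the change. That is what makes features like undo history and collaborative editing tractable in the real library.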

5. EditorView

EditorView is the bridge between ProseMirror’s state and the DOM. It renders the current state and captures user interactions. When the state changes, the view updates. When the user types or clicks, the view helps produce the next transaction.

Here is how you connect a view to the state we created earlier:

import { EditorView } from "prosemirror-view";

// Mount the editor into a DOM element.
// The view renders the document and captures user interactions.
const view = new EditorView(document.querySelector("#editor"), {
  state,
});

// The editor is now visible in the browser.
// Basic text input works at this point, but you usually add keymaps
// such as baseKeymap to get the expected behavior for Enter, Backspace,
// and Delete. Formatting shortcuts like Mod-b need their own bindings.
// You can also dispatch changes programmatically, see the next section.

How Updates Actually Flow

This is the part that made it click for me. The whole system runs on a simple loop:

EditorState → User Action → Transaction → New EditorState → View Update → Plugin Reaction
     ↑                                                                      ↓
     └──────────────────────────────────────────────────────────────────────┘

In words:

  1. You start with an EditorState (document + selection + plugin state).
  2. The user types a character, clicks somewhere, presses Backspace.
  3. The EditorView captures that and creates a transaction.
  4. The transaction is applied to the current state, producing a new one.
  5. The view renders the new state.
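  6. Plugin views and other listeners react to the new state. (Plugin state itself is recomputed in step 4, as part of applying the transaction.)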

That is it. Every edit follows this same path. Once you understand that the editor moves forward by applying transactions to immutable state, you stop trying to force it into patterns that do not fit.
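One way to watch this loop is the dispatchTransaction prop of EditorView, which lets you take over steps 4 and 5 yourself. The sketch below is a minimal version of that idea. createDispatchTransaction and the log callback are hypothetical names; state.apply, view.updateState, tr.docChanged, and selection.eq are the ProseMirror calls it relies on.

```javascript
// Sketch of a dispatchTransaction prop. ProseMirror calls it with the
// EditorView as `this`, so a regular function (not an arrow) is used.
// `log` is a hypothetical callback used here to observe the loop.
function createDispatchTransaction(log) {
  return function dispatchTransaction(tr) {
    const oldState = this.state;
    const newState = oldState.apply(tr); // step 4: old state + transaction
    this.updateState(newState);          // step 5: the view renders the new state
    log({
      docChanged: tr.docChanged,
      selectionChanged: !oldState.selection.eq(newState.selection),
    });
  };
}

// Usage, assuming `state` exists as in the earlier examples:
// const view = new EditorView(document.querySelector("#editor"), {
//   state,
//   dispatchTransaction: createDispatchTransaction(console.log),
// });
```

With this in place, every keystroke and click shows up as one log entry, which makes the "everything is a transaction" claim easy to verify for yourself.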

Here is the whole setup with a programmatic change:

import { EditorState } from "prosemirror-state";
import { EditorView } from "prosemirror-view";

// Step 1: create an initial state
const state = EditorState.create({ schema: mySchema });

// Step 2: mount the view
const view = new EditorView(document.querySelector("#editor"), {
  state,
});

// Step 3: make a programmatic change using a transaction
const tr = state.tr;
tr.insertText("Hello ProseMirror!");

// Dispatch the transaction through the view.
// The view applies it to produce a new state and re-renders.
view.dispatch(tr);

// The editor now shows "Hello ProseMirror!".
// The view owns the current state: read it from view.state.
// The `state` variable still points at the old, unchanged snapshot.

How Behavior Gets Added

ProseMirror provides several systems for adding custom behavior on top of the core state loop. These are the ones you will encounter most often.

Commands

A command is a function that performs an action, like toggling bold or inserting a heading, and returns a boolean indicating whether it could run.

import { toggleMark } from "prosemirror-commands";

// toggleMark creates a command that toggles a mark on the current selection.
// It returns true if the action ran, or false if it could not.
const toggleBold = toggleMark(mySchema.marks.strong);

// Execute the command by passing the current state and dispatch function.
// dispatch is typically `view.dispatch`, the same dispatch from earlier.
toggleBold(view.state, view.dispatch);

// Now the selected text is bold (or bold is removed if already applied).

This is one of those patterns I use constantly. Instead of manually building a tr each time, I write a command that checks whether the action is valid and applies the right transaction if it is.
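Writing your own command follows the same shape. Here is a minimal sketch: insertGreeting is a hypothetical command made up for this post. It relies on two ProseMirror conventions: the command receives (state, dispatch), and when dispatch is omitted it should only report whether it could run, without changing anything.

```javascript
// A command has the shape (state, dispatch?) => boolean.
// `insertGreeting` is a hypothetical command written for illustration.
function insertGreeting(state, dispatch) {
  // Only applicable when nothing is selected (a plain cursor).
  if (!state.selection.empty) return false;

  // Without dispatch this is a dry run: report "could run" and stop.
  if (dispatch) {
    dispatch(state.tr.insertText("Hello!"));
  }
  return true;
}

// Usage, assuming `view` is an EditorView as in the earlier examples:
// const canRun = insertGreeting(view.state);   // dry run, changes nothing
// insertGreeting(view.state, view.dispatch);   // actually inserts the text
```

The dry run is useful for UI: a toolbar button can call the command without dispatch to decide whether it should appear enabled.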

Keymaps

Keymaps bind keyboard shortcuts to commands.

import { EditorState } from "prosemirror-state";
import { keymap } from "prosemirror-keymap";
import { toggleMark } from "prosemirror-commands";

// A keymap plugin maps key combinations to commands.
const myKeymap = keymap({
  "Mod-b": toggleMark(mySchema.marks.strong),   // Cmd/Ctrl+B → bold
  "Mod-i": toggleMark(mySchema.marks.em),       // Cmd/Ctrl+I → italic
});

// Add the keymap as a plugin when creating the state.
const state = EditorState.create({
  schema: mySchema,
  plugins: [myKeymap],
});

ProseMirror does not come with default key bindings. Pick the keymaps you need, or import baseKeymap from prosemirror-commands for sensible defaults such as Enter, Backspace, Delete, and Mod-a (select all).

Input Rules

Input rules automate behavior as the user types. They are the mechanism behind Markdown-like shortcuts.

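// Requires an extra package: npm install prosemirror-inputrules
import { EditorState } from "prosemirror-state";
import { inputRules, wrappingInputRule, textblockTypeInputRule } from "prosemirror-inputrules";

// Type "> " at the start of a line to turn it into a blockquote.
// Input rule patterns should end with $ so they only match
// the text directly in front of the cursor.
const blockquoteRule = wrappingInputRule(
  /^\s*>\s$/, mySchema.nodes.blockquote
);

// Type "## " to turn the line into an h2.
const headingRule = textblockTypeInputRule(
  /^(#{1,6})\s$/, mySchema.nodes.heading, (match) => ({
    level: match[1].length,
  })
);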

const myInputRules = inputRules({
  rules: [blockquoteRule, headingRule],
});

// Add to the state as a plugin, same as keymaps.
const state = EditorState.create({
  schema: mySchema,
  plugins: [myInputRules],
});

If you have ever used a Markdown shortcut in any editor, that is input rules at work. Every keystroke is checked against the patterns, and when something matches, the document transforms automatically.
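The matching itself is plain regular expressions, so you can try a pattern without an editor. The sketch below uses a heading pattern like the one in the rule above, written with a capture group and a trailing $ (the prosemirror-inputrules documentation says patterns should end with $, so they only match right at the cursor). headingLevelFor is a hypothetical helper, and the string passed in stands for the text between the start of the textblock and the cursor.

```javascript
// 1 to 6 "#" characters at the start of the textblock, followed by
// a whitespace character that sits directly before the cursor.
const headingPattern = /^(#{1,6})\s$/;

// Hypothetical helper: given the text before the cursor, return the
// heading level the rule would produce, or null if it does not fire.
function headingLevelFor(textBeforeCursor) {
  const match = headingPattern.exec(textBeforeCursor);
  return match ? match[1].length : null;
}

console.log(headingLevelFor("## "));      // 2
console.log(headingLevelFor("## Title")); // null
console.log(headingLevelFor("text ## ")); // null
```

The second and third cases show why the anchors matter: without ^ and $, the rule could fire in the middle of a line or long after the shortcut was typed.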

Paste Handling

ProseMirror lets you intercept pasted content through props.handlePaste on the EditorView or through a plugin.

import { EditorState, Plugin } from "prosemirror-state";

// A plugin that logs pasted content and inserts it as plain text.
const pastePlugin = new Plugin({
  props: {
    // handlePaste receives the paste event and returns true
    // if it handled the paste, preventing the default behavior.
    handlePaste(view, event) {
      const text = event.clipboardData?.getData("text/plain");
      if (!text) return false;

      console.log("Pasted text:", text);

      // Insert the pasted text as a transaction.
      const tr = view.state.tr.insertText(text);
      view.dispatch(tr);
      return true;
    },
  },
});

const state = EditorState.create({
  schema: mySchema,
  plugins: [pastePlugin], // or [myKeymap, myInputRules, pastePlugin]
});

The same pattern works for other paste-related props like transformPasted (transform content before insertion) and transformPastedHTML (clean up HTML before parsing).
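As an example of the lighter-weight props, here is a sketch of transformPastedHTML that strips inline style attributes before ProseMirror parses the clipboard HTML. cleanPasteProps is a hypothetical name, and the regular expression is a deliberately crude assumption for illustration, not a robust HTML sanitizer.

```javascript
// Plugin props as a plain object; pass it as `props` to `new Plugin({ props })`.
// transformPastedHTML receives the clipboard HTML as a string before parsing.
// Assumption: a simple regex is good enough for this demo. Real-world
// clipboard HTML is messier and deserves a proper sanitizing step.
const cleanPasteProps = {
  transformPastedHTML(html) {
    return html.replace(/\sstyle=("[^"]*"|'[^']*')/gi, "");
  },
};

console.log(
  cleanPasteProps.transformPastedHTML(
    "<p style=\"color: red\">Hi <b style='x'>there</b></p>"
  )
);
// <p>Hi <b>there</b></p>
```

Unlike handlePaste, this does not replace the default paste behavior. ProseMirror still parses the cleaned HTML through your schema's parseDOM rules and builds the transaction itself.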

These extension points sit around the same transaction lifecycle. Commands produce transactions, keymaps are plugins that bind keys to commands, input rules are plugins that build a transaction when typed text matches a pattern, and paste handling can be provided through view props or plugin props.

Mistakes I See Often

I see the same patterns trip people up over and over, both in issues and in conversations at meetups.

Treating It Like A Controlled Input

A rich-text editor is not a textarea with extra features. A common mistake is serializing the document to JSON or HTML on every transaction, storing it in React state, and then feeding it back into the editor as the source of truth. This can cause selection bugs, unnecessary renders, and awkward state loops. Let the EditorView own the live editor state while the app stores snapshots only when needed.
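One way to follow that advice is to track only whether the document changed, and serialize when the app actually needs a snapshot. The sketch below assumes that approach. createSnapshotTracker and persist are hypothetical names; the ProseMirror calls it relies on are state.apply, view.updateState, tr.docChanged, and doc.toJSON.

```javascript
// Sketch: let the view own the live state and serialize only when asked.
// `createSnapshotTracker` and `persist` are hypothetical names.
function createSnapshotTracker(persist) {
  let dirty = false;
  return {
    // Pass this as the EditorView's dispatchTransaction prop.
    // ProseMirror calls it with the view as `this`.
    dispatchTransaction(tr) {
      this.updateState(this.state.apply(tr));
      if (tr.docChanged) dirty = true; // cheap flag, no serialization per keystroke
    },
    // Call from a save button, a debounce timer, or before unload.
    save(view) {
      if (!dirty) return false;
      persist(view.state.doc.toJSON());
      dirty = false;
      return true;
    },
  };
}

// Usage, assuming `state` exists as in the earlier examples and
// `saveToServer` is your own (hypothetical) persistence function:
// const tracker = createSnapshotTracker(saveToServer);
// const view = new EditorView(document.querySelector("#editor"), {
//   state,
//   dispatchTransaction: tracker.dispatchTransaction,
// });
// saveButton.addEventListener("click", () => tracker.save(view));
```

Selection-only transactions never set the flag, so moving the cursor around costs nothing beyond the normal view update.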

Thinking The Document Is The Whole State

The document matters, but so do selection and plugin state. They are not secondary details: they affect how commands behave, how marks are applied, and how plugins track their own data. If you only think about serialized content, you miss how the editor actually works.

Jumping Into Plugins Too Early

Plugins are powerful, but they make way more sense once the basics click. Learn schema, state, transactions, and view first. Plugins start feeling natural after that. Many things can first be modeled as commands, keymaps, input rules, or view props before writing complex plugin logic.

Three Things To Remember

If nothing else sticks, keep these:

  • The schema defines what is valid. Without it, the editor does not know what content to expect.
  • Transactions describe change. Every edit is an explicit transformation, not a silent mutation.
  • The view stays in sync with the state. If the state changes, the DOM follows.

That is the foundation. Once those click, everything else stops feeling random.

Supporting ProseMirror

ProseMirror is maintained by Marijn Haverbeke, who also built CodeMirror and the Acorn parser. It is one of those rare open-source projects that powers thousands of products without asking for anything in return. I see its impact every day in the work I do.

If you find ProseMirror useful, or if this guide helped you understand it better, consider supporting the project.

ProseMirror exists because Marijn spent years engineering a solution to a problem most people do not even realize is hard. A little support goes a long way.

What To Learn Next

If this helped, here is a suggested learning path:

  • Schema design for real products, custom nodes, marks, and content rules (ProseMirror docs)
  • Commands and keymaps, beyond the basics here (prosemirror-commands)
  • Plugin architecture, how plugins hook into the state and transaction lifecycle (ProseMirror docs)
  • ProseMirror state management, thinking about EditorState in an app context (ProseMirror docs)
  • Collaborative editing with Yjs, how ProseMirror’s state model makes real-time collab possible (Yjs docs)
  • Tiptap, built on ProseMirror, packages much of this into a practical framework

For a first step though, this is enough. ProseMirror is a structured document system first. Approach it that way and the rest gets a lot easier.