LC-JSON

An open learning-content interchange specification.

LC-JSON (Learning Content JSON) is a JSON-native format, schema set, and producer/consumer behavior contract for portable teacher-authored courses, lessons, questions, feedback, and assessment intent. A course authored in one tool can be validated, transferred, and delivered in another, with predictable behavior on both ends.

The specification is open. The schemas are public, stable, and versioned. The license is permissive (Apache 2.0). Implementers can build conforming tools without permission.

LC-JSON is a content-layer format — complementary to LMS interop standards (LTI, OneRoster, xAPI, SCORM) rather than competing with them. See the Rationale for the full landscape and what LC-JSON is not.

What you can do with it

Take your courses with you. Course content in LC-JSON is independent of the tool that authored it. Schools, publishers, and authors can move content between platforms without rewriting it.

Validate before you ship. Every LC-JSON document validates against published JSON Schemas. Authoring errors are caught before delivery, not after a learner gets stuck.

Build with confidence. Schema URLs at every published version path — lc-json.org/1.1-rc.1/, lc-json.org/1.0/, the frozen lc-json.org/1.0-rc.N/ candidate paths, and any future minor or major release — are immutable. A document that validates today will validate forever. Forward-compatible additions land at new URL paths; existing files keep working.

Read the specification

Specification overview — what LC-JSON looks like, with worked examples.
NORMATIVE.md — the conformance requirements (RFC 2119 keywords, producer/consumer roles).
Question types reference — per-type property reference for all 12 implemented question types.
Schemas — Draft-7 JSON Schemas for every artifact and question type.
Examples — examples spanning all five artifact types, from minimal documents to fuller Course and QuestionSet samples, plus question, item, unit, and lesson fragments.

For implementers

Conformance test corpus — valid and invalid cases per clause, with a machine-readable manifest. Run your validator over the corpus to verify conformance.
Reference tools: validate_course.py (validator) and run_corpus.py (corpus harness for spec contributors and the spec repo’s CI). See tools/.
GitHub repository — issues, discussions, releases.

Who is this for?

If you are…	LC-JSON gives you…
A teacher or course author	Confidence that the courses you write are not locked into any single tool.
A school or institution	A portable, vendor-neutral format for learning content. Procurement decisions don’t lock in pedagogical content.
An EdTech tool builder	A clean import/export target. Conforming tools interoperate without bespoke adapters.
A learning-platform vendor	Reduced friction in onboarding teacher-authored content from any source.

What’s covered in 1.1

Five artifact types sharing a common flat root format:

Course — hierarchical: Course → Units → Lessons → Items → Questions.
Question Set — flat list of questions for question-bank exchange and packaged delivery.
Glossary — flat list of terms with pronunciation, translations, examples, and inflected forms; attaches to a course, unit, or lesson.
Subject Collection — a reusable classification vocabulary: the tags and learning objectives for one subject at one level.
Curriculum Pack — an arrangement: sequence, pacing, and assessment checkpoints over a collection plus content documents.

The three 1.1 additions are abstract at first read. Two distinctions do most of the work of telling the five types apart:

Type	Plain role	What sets it apart
Course	Learning content	Contains what learners work through
Question Set	Assessment resource	Contains reusable questions
Glossary	Learner reference	Contains the terms learners study
Subject Collection	Curriculum vocabulary	Describes how educators classify content and objectives
Curriculum Pack	Curriculum plan	Arranges and references content without becoming the content

A Subject Collection describes how educators classify learning; a Glossary contains the words and meanings learners study.
A Curriculum Pack arranges learning content; a Course contains it.

Each type’s reference page opens with a plain-English summary for educators and curriculum teams; the rest of each page is the technical contract.

Twelve question types fully implemented and schema-validated:

simpleGapFill · trueFalseQuestion · multipleChoice · wordBankCloze · multiGapCloze · multipleChoiceCloze · shortAnswer · essay · sentenceTransformation · matching · ordering · placement

Seven additional types are reserved for a future minor version (targeted for 2027).

Five lesson item types: content, exercise, quiz, content-sequence, signpost.

License

LC-JSON is licensed under the Apache License, Version 2.0. The license includes a patent grant. Conforming implementations require no further permission.

“Lesson Commons” is a separate trademark and is not asserted over LC-JSON or its conforming implementations.

Project status

Version 1.1-rc.1 — current publication (2026-07-17), a release candidate at immutable lc-json.org/1.1-rc.1/ URLs. It adds three artifact types — Glossary, Subject Collection, and Curriculum Pack — plus publication fields and glossaryRefs on courses. The addition is backwards-compatible: every 1.0-valid document remains valid under 1.1 with no migration or re-export. As a candidate, 1.1-rc.1 may take backwards-compatible corrections before the 1.1 final release; the /1.1/ path is not populated until then.

Version 1.0 — accepted final release (2026-06-30). The 1.0 wire format is stable and schema URLs at lc-json.org/1.0/ are immutable per NORMATIVE.md §8.3. 1.0 is a pure URL rebase of 1.0-rc.3 with no wire/content delta: the rc.3 schema set is republished unchanged at the final path, with only $id/$schema URL strings and document version labels updated. Every 1.0-rc.3-valid document is valid under 1.0 with no migration or re-export. The earlier release-candidate paths — /1.0-rc.1/, /1.0-rc.2/, and /1.0-rc.3/ — stay served and frozen.

LC-JSON’s public history begins with the 1.0 release-candidate line — 1.0-rc.2 (2026-05-30) was its first publicly announced release. Internal iteration before the candidate line is not reflected in the version history.

LC-JSON is maintained under a single-maintainer steward model; see GOVERNANCE.md for the decision-making process and the criteria for transitioning to a working group.

LC-JSON Specification

Spec version: 1.1-rc.1 Last updated: 2026-07-22

This directory contains the LC-JSON (Learning Content JSON) specification for structured learning content, covering the complete hierarchy from Course structure down to individual Question types.

Implementing LC-JSON? See NORMATIVE.md for the conformance requirements (RFC 2119 keywords, producer/consumer roles, versioning rules, URL stability promises). This README is descriptive; NORMATIVE.md is authoritative. For terminology, see GLOSSARY.md.

Complete Coverage:

Five artifact types sharing a common flat root format: three content documents (Course, QuestionSet, Glossary), a vocabulary document (SubjectCollection), and an arrangement document (CurriculumPack)
Course Hierarchy (Course → Units → Lessons → Items)
5 Lesson Item Types (Content, Exercise, Quiz, ContentSequence, Signpost)
19 Question Types (12 fully implemented + schema-validated; 7 reserved for a future minor version)
JSON Schemas (27) for validation — strictly enforced by the reference validator
Minimal + detailed examples (35 files, all schema-clean)

Design Principles

LC-JSON is machine-validatable, but human-inspectable.

The documents are validated automatically against JSON Schema Draft 7, but they are also designed so that authored content remains visible in the file. A teacher, curriculum designer, or teacher-developer can recognize courses, units, lessons, items, questions, prompts, choices, answers, and feedback without proprietary tooling — opening a course .json in any text editor should be enough to inspect what the course actually contains.

Technical fields such as $schema, specVersion, and globalId exist to make documents portable across tools and stable across re-imports, but they should not bury the pedagogical content. Where this trade-off arises in spec evolution — naming, structure, ordering of fields — the spec favors the form that keeps pedagogical content recognizable.

This is a deliberate stance against formats whose meaning only emerges through tooling. It is offered without promise of zero technical fields, because portability requires some; the promise is that the pedagogical structure stays inspectable to the people who authored it.

Wire Format

LC-JSON uses a flat root with a documentType discriminator (no enclosing envelope around the document). Every conforming document carries $schema, documentType, and specVersion as root-level siblings. The course content itself is hierarchical — Course → Units → Lessons → Items → Questions — and reflects how teachers structure their material.

Five artifact types

Artifact	`documentType`	Schema	Description
Course	`"course"`	`course.schema.json`	Hierarchical course (Units → Lessons → Items). The usual shape for a full course.
Question Set	`"questionSet"`	`question-set.schema.json`	Flat list of questions for question-bank exchange and packaged delivery — no hierarchy.
Glossary	`"glossary"`	`glossary.schema.json`	Flat list of terms — pronunciation, translations, examples, inflected forms. Study material that attaches to a course, unit, or lesson.
Subject Collection	`"subjectCollection"`	`subject-collection.schema.json`	A reusable classification vocabulary: the tags and learning objectives for one subject at one level, as a portable document.
Curriculum Pack	`"curriculumPack"`	`curriculum-pack.schema.json`	An arrangement: sequence, pacing, and assessment checkpoints referencing a Subject Collection plus content documents.

Required root fields (all artifact types)

{
  "$schema": "https://lc-json.org/1.1-rc.1/<artifact>.schema.json",
  "documentType": "course",      // or "questionSet", "glossary",
                                 // "subjectCollection", "curriculumPack"
  "specVersion": "1.1",
  "title": "...",
  ...
}

The $schema URL serves as a stable, versioned identifier and is used by integrated development environments (IDEs such as VS Code) for schema autocomplete. specVersion is forward-compatible across the 1.x series — conforming consumers MUST accept any 1.x value and reject 2.x+ cleanly.
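The root-field and version rules above can be sketched as a small pre-check that a consumer might run before full schema validation. This is an illustrative sketch, not the reference validator's logic: the function name `check_root`, the error strings, and the assumption that specVersion always has the form major.minor are all hypothetical.

```python
import re

# Root fields that the text above says every conforming document carries.
REQUIRED_ROOT_FIELDS = ("$schema", "documentType", "specVersion")
ARTIFACT_TYPES = {
    "course", "questionSet", "glossary",
    "subjectCollection", "curriculumPack",
}


def check_root(doc: dict) -> list[str]:
    """Return a list of problems found at the document root (empty if none)."""
    problems = [f"missing root field: {name}"
                for name in REQUIRED_ROOT_FIELDS if name not in doc]
    if problems:
        return problems
    if doc["documentType"] not in ARTIFACT_TYPES:
        problems.append(f"unknown documentType: {doc['documentType']!r}")
    # Accept any 1.x value (e.g. "1.0", "1.1"); reject 2.x and later.
    # Assumption: specVersion is written as "<major>.<minor>".
    if not re.fullmatch(r"1\.\d+", str(doc["specVersion"])):
        problems.append(f"unsupported specVersion: {doc['specVersion']!r}")
    return problems
```

A pre-check like this only reports obvious root-level problems early; it does not replace validation against the published schemas.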

Reserved question types (targeted for 2027)

Seven question types are reserved in the polymorphic discriminator set but do not yet have per-type schemas — full authoring and consumer support is targeted for 2027:

association, hotspot, graphicGapMatch, graphicAssociate, graphicOrder, fileUpload, mediaPromptedEssay

The 12 question types with full per-type schemas — simpleGapFill, trueFalseQuestion, multipleChoice, wordBankCloze, multiGapCloze, multipleChoiceCloze, shortAnswer, essay, sentenceTransformation, matching, ordering, placement — are the spec’s stable surface as of 1.0.

Consumer obligations for reserved (and unknown) types are normative under NORMATIVE.md §6: consumers MUST preserve them verbatim across read/write cycles, MUST NOT silently drop them, MUST treat their earned points as zero, and SHOULD render a non-interactive placeholder. The intent is round-trip preservation: a teacher exporting from a consumer that does not support hotspot can take the file back to a consumer that does, without losing the question. Producers SHOULD NOT emit reserved types in 1.0 documents intended for cross-implementation distribution; reserved types are tool-specific extensions until promoted.
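As a rough sketch of these obligations, a consumer could keep three small helpers: one that decides between interactive rendering and a placeholder, one that zeroes earned points for unsupported types, and one that re-exports every question without filtering. The helper names and the `supported` parameter are hypothetical; NORMATIVE.md §6 remains the authoritative statement of the rules.

```python
import json


def render_mode(question: dict, supported: set[str]) -> str:
    """Choose how this consumer presents a question."""
    # Unsupported (reserved or unknown) types get a non-interactive placeholder.
    return "interactive" if question.get("type") in supported else "placeholder"


def earned_points(question: dict, raw_score: float, supported: set[str]) -> float:
    """Reserved or unknown types always earn zero points."""
    return raw_score if question.get("type") in supported else 0.0


def export_questions(questions: list[dict]) -> str:
    """Serialize every question, supported or not, without filtering."""
    # Nothing is dropped or rewritten, so a question this consumer cannot
    # render still reaches the next tool with all of its fields.
    return json.dumps(questions, ensure_ascii=False, indent=2)
```

Note that a parse-and-serialize round trip preserves content and key order, not byte-for-byte formatting; whether that satisfies "verbatim" in a given implementation is a question for NORMATIVE.md.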

Discriminator casing

Conforming producers emit camelCase question discriminators (simpleGapFill, multipleChoice, etc.). All examples in this directory strictly validate against the schemas in their canonical casing. Non-canonical casings are non-conforming; consumers MUST reject them.
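One way a consumer might tell a mis-cased discriminator apart from a genuinely unknown type is a case-insensitive lookup against the 19 canonical names. The sketch below is an assumption about how to apply the rule, not behavior taken from the reference validator; `classify_discriminator` is a hypothetical name.

```python
IMPLEMENTED = frozenset({
    "simpleGapFill", "trueFalseQuestion", "multipleChoice", "wordBankCloze",
    "multiGapCloze", "multipleChoiceCloze", "shortAnswer", "essay",
    "sentenceTransformation", "matching", "ordering", "placement",
})
RESERVED = frozenset({
    "association", "hotspot", "graphicGapMatch", "graphicAssociate",
    "graphicOrder", "fileUpload", "mediaPromptedEssay",
})
# Lower-cased lookup used only to detect non-canonical casing.
_BY_LOWER = {name.lower(): name for name in IMPLEMENTED | RESERVED}


def classify_discriminator(value: str) -> str:
    """Return 'implemented', 'reserved', or 'unknown'; raise on bad casing."""
    if value in IMPLEMENTED:
        return "implemented"
    if value in RESERVED:
        return "reserved"
    canonical = _BY_LOWER.get(value.lower())
    if canonical is not None:
        # Same letters as a canonical name but different casing: reject.
        raise ValueError(
            f"non-canonical casing {value!r}; expected {canonical!r}")
    return "unknown"
```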

What v1.1 adds — vocabularies, arrangements, and glossaries

LC-JSON 1.0 moved content: courses and question sets that survive the trip between tools. Version 1.1 adds the documents that sit around content — the classification it is organized by, the plan it is taught on, and the vocabulary students study alongside it. All three are ordinary LC-JSON files: flat root, stable ids, readable in a text editor.

A Subject Collection is a curriculum’s working vocabulary as a file. It holds the tags and learning objectives for one scope — B2 adult general English, KS3 mathematics — with two properties free-text tagging cannot offer. First, every tag and objective has a permanent id, so two courses classified with the same member are comparably classified: search, curriculum mapping, and progress reporting can line them up without guessing that “conditionals” and “if-clauses” mean the same thing. Second, a collection can state precisely what external framework it tracks (externalAlignments — a national curriculum section, an official training catalog entry, an exam specification) without re-publishing that framework’s content. For an educational authority or awarding body, this is the piece that makes publishing an official vocabulary practical: author the objectives once, point at the official register, and let any number of course authors reference the same members rather than re-typing them. For a content author, it means the expensive work — writing good can-do objectives — is done once per scope and reused, and a course built against a collection says so in a form other tools can verify. See subject-collection-reference.md.

A Curriculum Pack is a scheme of work that a machine can check. Where a collection says what the objectives are, a pack says how a program covers them: an ordered sequence of steps on a calendar-relative timeline (year / term / week — never real dates, so any institution’s calendar fits), each step teaching, revisiting, or assessing listed objectives, with pacing capacity and assessment checkpoints declared in the document. Because steps reference objective ids rather than prose, a validator can verify the claims a scheme of work usually only implies: nothing is assessed before it is taught, spaced revisits actually have spacing, term plans fit term capacity, and — against the referenced collection — every in-scope objective is taught and assessed at least once (coverage, a declared contract with explicit exemptions). A pack can start as a blueprint (the plan alone, every content slot empty), grow into a working manifest as courses are bound, and ship as a self-contained bundle — one document type at three depths of completion. For curriculum leads and inspectors, the pack is the difference between “the scheme of work says so” and “any conforming curriculum-pack validator holding the referenced collection can re-verify the claim.” See curriculum-pack-reference.md.
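Two of the checks described above, "nothing is assessed before it is taught" and coverage with explicit exemptions, can be sketched over a simplified step list. The step shape used here (`action` and `objectives` keys, with `"teach"`, `"revisit"`, and `"assess"` values) is hypothetical and is not the curriculum-pack schema's field naming; see curriculum-pack-reference.md for the real contract.

```python
def assessed_before_taught(steps: list[dict]) -> list[tuple[int, str]]:
    """Return (step index, objective id) for each premature assessment."""
    taught: set[str] = set()
    violations = []
    for index, step in enumerate(steps):
        for objective in step["objectives"]:
            if step["action"] == "teach":
                taught.add(objective)
            elif step["action"] == "assess" and objective not in taught:
                violations.append((index, objective))
    return violations


def uncovered(steps: list[dict], in_scope: set[str],
              exempt: frozenset = frozenset()) -> list[str]:
    """Return in-scope objectives not both taught and assessed at least once."""
    taught = {o for s in steps if s["action"] == "teach" for o in s["objectives"]}
    assessed = {o for s in steps if s["action"] == "assess" for o in s["objectives"]}
    return sorted((set(in_scope) - set(exempt)) - (taught & assessed))
```

Because both functions work on objective ids rather than prose, the same plan can be re-checked by any tool that holds the referenced collection.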

A Glossary is the vocabulary students study, as a portable document. Terms with pronunciation (IPA, a friendly respelling, audio), translations, example sentences, and inflected forms — designed language-education-first but serving any subject’s key words. A glossary attaches to a course, unit, or lesson (glossaryRefs), and consumers can auto-link its terms in content, render study lists, or drive flashcards. It is the one v1.1 type that is content rather than structure: where a collection classifies what courses teach, a glossary is learning material. See glossary-reference.md.

The three documents compose: a collection scopes a pack; the pack sequences courses, question sets, and glossaries; every reference is by stable id, so the set travels between tools without losing its joins. Consumers adopt at any depth — a tool that only reads courses ignores the rest and loses nothing (NORMATIVE.md §5.1 scopes conformance per artifact type).

Directory Structure

specification/
├── README.md                          # This file
├── NORMATIVE.md                       # RFC 2119 conformance requirements (authoritative)
├── HTML_SAFETY.md                     # Normative HTML allowlist + sanitization profile
├── ACCESSIBILITY.md                   # Producer/consumer accessibility profile
├── LOCALIZATION.md                    # Language model: language / lang / supportLanguage; BCP 47; pronunciation
├── VALIDATION.md                      # Rule catalog — schema / validator / advisory tiers
├── ITEM_PATTERNS.md                   # Informative authoring guide
├── question-types-reference.md        # Complete reference for all 19 question types
├── subject-collection-reference.md    # Complete reference for the SubjectCollection type
├── curriculum-pack-reference.md       # Complete reference for the CurriculumPack type
├── glossary-reference.md              # Complete reference for the Glossary type
├── GLOSSARY.md                        # Terminology
├── schemas/                           # JSON Schema validation files
│   ├── course.schema.json             # Course (top level)
│   ├── question-set.schema.json       # QuestionSet (flat artifact)
│   ├── glossary.schema.json           # Glossary (flat artifact)
│   ├── subject-collection.schema.json # SubjectCollection (vocabulary artifact)
│   ├── curriculum-pack.schema.json    # CurriculumPack (arrangement artifact)
│   ├── publication-fields.schema.json # Shared publication field group (§4.11)
│   ├── unit.schema.json               # Unit (within Course)
│   ├── lesson.schema.json             # Lesson (within Unit)
│   ├── item-base.schema.json          # Base schema for all Items
│   ├── content-item.schema.json       # ContentItem type
│   ├── exercise-item.schema.json      # ExerciseItem type
│   ├── quiz-item.schema.json          # QuizItem type
│   ├── content-sequence-item.schema.json  # ContentSequenceItem type
│   ├── signpost-item.schema.json      # SignpostItem type (intro/summary navigation)
│   ├── question-base.schema.json      # Base schema for all Questions
│   ├── simple-gap-fill.schema.json    # SimpleGapFill validation
│   ├── true-false-question.schema.json # TrueFalseQuestion validation
│   ├── multiple-choice.schema.json    # MultipleChoice validation
│   ├── word-bank-cloze.schema.json    # WordBankCloze validation
│   ├── multi-gap-cloze.schema.json    # MultiGapCloze validation
│   ├── multiple-choice-cloze.schema.json  # MultipleChoiceCloze validation
│   ├── short-answer.schema.json       # ShortAnswer validation
│   ├── essay.schema.json              # Essay validation
│   ├── sentence-transformation.schema.json  # SentenceTransformation validation
│   ├── matching.schema.json           # Matching validation
│   ├── ordering.schema.json           # Ordering validation
│   └── placement.schema.json          # Placement type validation
└── examples/                          # Example JSON files (35 total)
    ├── course-minimal.json            # Minimal Course example
    ├── question-set-minimal.json      # Minimal QuestionSet example
    ├── glossary-minimal.json          # Minimal Glossary example
    ├── subject-collection-minimal.json # Minimal SubjectCollection example
    ├── curriculum-pack-manifest.json  # Minimal CurriculumPack example (manifest depth)
    ├── question-set-10-true-false.json # Richer QuestionSet showcase
    ├── unit-minimal.json              # Minimal Unit example
    ├── lesson-minimal.json            # Minimal Lesson example
    ├── 10-content-item.json           # ContentItem with HTML
    ├── 11-exercise-item.json          # ExerciseItem (graded homework example)
    ├── 12a-graded-quiz-item.json      # QuizItem, isGraded:true (typical assessment)
    ├── 12b-ungraded-quiz-item.json    # QuizItem, isGraded:false (diagnostic pre-test)
    ├── 13-content-sequence-item.json  # ContentSequenceItem
    ├── 14-signpost-item.json          # SignpostItem
    ├── 01-simple-gap-fill.json        # Per-question examples (01-09)
    ├── ...                            # 09-sentence-transformation.json
    ├── 15-matching.json               # Matching example
    ├── 16-ordering.json               # Ordering example (word-level)
    ├── 16b-sentence-ordering.json     # Ordering example (sentence-level — process narrative)
    ├── 16c-paragraph-ordering.json    # Ordering example (paragraph-level — essay structure)
    ├── 17a-sentence-placement.json    # Placement example (sentence-mode — Cambridge B2 First Part 6 style)
    ├── 17b-paragraph-placement.json   # Placement example (paragraph-mode — IELTS Reading Matching Information style)
    ├── 17c-section-label-placement.json # Placement example (sectionLabel-mode — IELTS Matching Headings)
    ├── 17d-toefl-insertion-placement.json # Placement example (TOEFL Sentence Insertion — decoy-gaps variant)
    └── sample-course-with-questions.json    # Full course example

Total: 27 schemas (7 root/structural [course, questionSet, glossary, subjectCollection, curriculumPack, unit, lesson] + 1 publication-fields group + 1 item-base + 5 item types + 1 question-base + 12 question types).

Source layout vs. published layout. The tree above is the source directory, where the schemas sit in a single flat schemas/. The published specification does not use that layout: it serves each publication from its own versioned directory, so the files appear as
schemas/
├── 1.1-rc.1/        # the current publication — 27 schemas
├── 1.0/             # frozen final release
├── 1.0-rc.1/        # frozen release candidates
├── 1.0-rc.2/
└── 1.0-rc.3/
and each schema is served at an immutable versioned URL — https://lc-json.org/1.1-rc.1/<name>.schema.json for the current publication, with the frozen /1.0/ and /1.0-rc.N/ publications alongside it forever (NORMATIVE §8.3). That versioned URL is the value every document’s $schema field carries, and it is the only citable form.

This is why schema links differ between the two: the source documents use a relative schemas/<name>.schema.json so contributors browsing the source tree can follow them, and the publication step rewrites every one of those links to its absolute versioned URL. A published schema link is always immutable — there is deliberately no mutable “current” alias to cite by mistake.
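The link rewrite described above could look roughly like the following. This is a sketch only: the actual publication step is not shown in this README, and `publish_links`, `BASE_URL`, and the regular expression are assumptions made for illustration.

```python
import re

BASE_URL = "https://lc-json.org"

# Matches a bare relative link such as schemas/course.schema.json.
# The lookbehind skips paths that are already part of a longer path or URL.
_RELATIVE_SCHEMA = re.compile(r"(?<![\w/.-])schemas/([a-z0-9-]+\.schema\.json)")


def publish_links(text: str, version: str) -> str:
    """Rewrite relative schema links to absolute, versioned URLs."""
    return _RELATIVE_SCHEMA.sub(
        lambda match: f"{BASE_URL}/{version}/{match.group(1)}", text)
```

For example, `publish_links("see schemas/course.schema.json", "1.1-rc.1")` yields a link under `https://lc-json.org/1.1-rc.1/`, while text that already cites a versioned URL is left unchanged.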

Course Hierarchy

A Course document has the following nested structure:

Course (top level)
└─ Units[] (array of units)
   └─ Lessons[] (array of lessons)
      └─ Items[] (array of items - 5 types)
         ├─ ContentItem (reading/content pages)
         ├─ ExerciseItem (questions; structural form, grading via isGraded)
         │  └─ Questions[] (only for ExerciseItem and QuizItem)
         ├─ QuizItem (questions; structural form, grading via isGraded)
         │  └─ Questions[] (only for ExerciseItem and QuizItem)
         ├─ ContentSequenceItem (grouped content)
         └─ SignpostItem (intro/summary with objectives)

Minimal Examples for Quick Reference:

course-minimal.json - Bare minimum Course structure
unit-minimal.json - Bare minimum Unit
lesson-minimal.json - Bare minimum Lesson

Lesson Item Types

Every Lesson contains an items array with one or more of these 5 item types:

Item Type	Schema	Example	Description
ContentItem	content-item.schema.json	10-content-item.json	Reading/content pages with HTML content (subject to HTML_SAFETY.md)
ExerciseItem	exercise-item.schema.json	11-exercise-item.json	Exercise-shaped questions container. Grading independent (`isGraded`).
QuizItem (graded)	quiz-item.schema.json	12a-graded-quiz-item.json	Quiz-shaped, `isGraded: true` — typical assessment.
QuizItem (ungraded)	quiz-item.schema.json	12b-ungraded-quiz-item.json	Quiz-shaped, `isGraded: false` — diagnostic pre-test, self-check. Same schema, different policy.
ContentSequenceItem	content-sequence-item.schema.json	13-content-sequence-item.json	Grouped content with layout options (carousel, tabs, accordion)
SignpostItem	signpost-item.schema.json	14-signpost-item.json	Structural navigation (intro/summary) with objectives and stats; `customHtml` subject to HTML_SAFETY.md

Exercise vs. Quiz. These are structural distinctions only. They render differently in the UI and contribute to separate point buckets (enabling weighted grading). Whether the score counts toward a learner’s grade is controlled by the isGraded flag, which is set independently. The examples model this: 11-exercise-item.json is a graded homework exercise (isGraded: true); 12a-graded-quiz-item.json and 12b-ungraded-quiz-item.json use the same content under the same schema to show that a quiz can be either graded or ungraded. The fourth combination (ungraded exercise / open practice) is conventional and not given its own example.
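To make the "separate point buckets" idea concrete, the sketch below totals the available points of graded exercise items and graded quiz items separately. It is one possible consumer-side interpretation, not normative behavior; the function name and the per-question `points` field are assumptions for illustration.

```python
def graded_point_buckets(items: list[dict]) -> dict[str, float]:
    """Sum available points per structural type, counting graded items only."""
    buckets = {"exercise": 0, "quiz": 0}
    for item in items:
        # The structural type selects the bucket; isGraded decides whether
        # the item counts toward the learner's grade at all.
        if item.get("type") in buckets and item.get("isGraded"):
            buckets[item["type"]] += sum(
                question.get("points", 0)  # hypothetical field name
                for question in item.get("questions", []))
    return buckets
```

A consumer could then apply its own weights to the two totals, which is what keeping the buckets separate makes possible.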

For an authoring guide that walks through the full design space of type × isGraded × isOptional × passMarkPercent — common patterns (graded homework, diagnostic pre-test, exit ticket, etc.) and how different consumers may interpret each combination — see ITEM_PATTERNS.md.

Key Properties (all items inherit from item-base.schema.json):

type (required) - Discriminator: “content”, “exercise”, “quiz”, “contentsequence”, or “signpost”
title (required) - Display title for the item
sequence - Display order within lesson (0-based)
instructions - Instructions shown to learner
suggestedTime - Estimated time in minutes
isOptional - Whether item can be skipped

Questions Array:

Only ExerciseItem and QuizItem have a questions array
ContentItem, ContentSequenceItem, and SignpostItem do NOT contain questions
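A minimal sketch of this rule walks the hierarchy and reports any non-question item that carries a questions array. The `items`, `questions`, and `type` keys come from this README; the lowercase `units` and `lessons` keys and the function name are assumptions, so check the schemas before relying on them.

```python
QUESTION_ITEM_TYPES = {"exercise", "quiz"}


def misplaced_questions(course: dict) -> list[str]:
    """List items that carry questions although their type does not allow it."""
    problems = []
    for u, unit in enumerate(course.get("units", [])):
        for l, lesson in enumerate(unit.get("lessons", [])):
            for i, item in enumerate(lesson.get("items", [])):
                if "questions" in item and item.get("type") not in QUESTION_ITEM_TYPES:
                    problems.append(
                        f"units[{u}].lessons[{l}].items[{i}]: "
                        f"{item.get('type')} item must not contain questions")
    return problems
```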

SignpostItem Properties:

signpostType (required) - “intro” or “summary”
scope (required) - “course”, “unit”, or “lesson”
customHtml (optional) - Custom HTML content to override auto-generated message

Documentation Files

1. question-types-reference.md

Complete JSON Format Reference

All 19 Question Types with detailed specifications
Property tables showing required/optional fields
Examples for each question type
Validation rules and best practices
Common properties inherited by all questions
Complete course example showing nested structure

When to use:

Creating new course JSON files
Understanding question type requirements
Troubleshooting import errors

1b. Artifact type references

Complete JSON Format References for the v1.1 artifact types

subject-collection-reference.md — SubjectCollection: scope, members (tags and objectives), externalAlignments, carried copies in course documents.
curriculum-pack-reference.md — CurriculumPack: sequence[] steps on the calendar-relative timeline, pacing, assessment checkpoints, the coverage contract, and the blueprint / manifest / bundle depths.
glossary-reference.md — Glossary: terms, pronunciation, the declared translationLanguages inventory, definitionTranslations, firstMention, and glossaryRefs attachment.

When to use:

Authoring or consuming a collection, pack, or glossary document
Understanding how the v1.1 types reference each other by stable id

2. JSON Schema Files (`schemas/`)

Machine-Readable Validation

JSON Schema files for automated validation using tools like ajv, jsonschema, or IDE validators.

Root Artifact Schemas:

course.schema.json - Course (top level)
question-set.schema.json - QuestionSet (flat artifact)
glossary.schema.json - Glossary (flat artifact)
subject-collection.schema.json - SubjectCollection (vocabulary artifact)
curriculum-pack.schema.json - CurriculumPack (arrangement artifact)
publication-fields.schema.json - Shared publication field group, composed via allOf (NORMATIVE §4.11)

Course Hierarchy Schemas:

unit.schema.json - Unit (within Course)
lesson.schema.json - Lesson (within Unit)

Item Type Schemas:

item-base.schema.json - Base schema for all Items
content-item.schema.json - ContentItem type
exercise-item.schema.json - ExerciseItem type
quiz-item.schema.json - QuizItem type
content-sequence-item.schema.json - ContentSequenceItem type
signpost-item.schema.json - SignpostItem type (intro/summary navigation)

Question Type Schemas:

question-base.schema.json - Base properties for all questions
simple-gap-fill.schema.json - SimpleGapFill type validation
true-false-question.schema.json - TrueFalseQuestion type validation
multiple-choice.schema.json - MultipleChoice type validation
word-bank-cloze.schema.json - WordBankCloze type validation
multi-gap-cloze.schema.json - MultiGapCloze type validation
multiple-choice-cloze.schema.json - MultipleChoiceCloze type validation
short-answer.schema.json - ShortAnswer type validation
essay.schema.json - Essay type validation
sentence-transformation.schema.json - SentenceTransformation type validation
matching.schema.json - Matching type validation
ordering.schema.json - Ordering type validation
placement.schema.json - Placement type validation

Strict enforcement: the reference validator (validate_course.py) runs every document through these schemas as a primary pass via the jsonschema package (≥4.18, modern referencing Registry API). Per-question type-specific dispatch keys off the type discriminator. Install dependencies with pip install -r tools/requirements.txt.

Usage Example (Node.js with ajv):

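const Ajv = require('ajv');
// allErrors reports every violation instead of stopping at the first.
const ajv = new Ajv({ allErrors: true });

const baseSchema = require('./schemas/question-base.schema.json');
const simpleGapFillSchema = require('./schemas/simple-gap-fill.schema.json');
// Data to validate; assumes the example file holds a single question object.
const questionData = require('./examples/01-simple-gap-fill.json');

// Register the base schema first so that any $ref to it from the
// per-type schema can be resolved by its $id.
ajv.addSchema(baseSchema);

const validate = ajv.compile(simpleGapFillSchema);
const valid = validate(questionData);

if (!valid) {
  console.error(validate.errors);
}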

3. Example Files (`examples/`)

Ready-to-Use Templates

Minimal Hierarchy Examples - Quick Format Reference

Ultra-minimal examples for quick consultation when creating courses:

course-minimal.json - Bare minimum Course structure with required properties
unit-minimal.json - Minimal Unit within a course
lesson-minimal.json - Minimal Lesson within a unit

Use these as exact-format references when drafting new content (for example, a unit on travel that practises the present perfect).

Item Type Examples (10-14) - Item Structure Reference

Individual examples for each of the Lesson Item types:

10-content-item.json - ContentItem with rich HTML content (Declaration of Independence reading)
11-exercise-item.json - ExerciseItem framed as graded homework (5 T/F world-rivers questions; isGraded: true)
12a-graded-quiz-item.json - QuizItem as a graded assessment (isGraded: true, passMarkPercent: 70); same content as 11
12b-ungraded-quiz-item.json - QuizItem as an ungraded diagnostic pre-test (isGraded: false); same content as 11 and 12a, demonstrating that quiz vs. exercise is structural while grading is policy
13-content-sequence-item.json - ContentSequenceItem with carousel layout
14-signpost-item.json - SignpostItem (intro/summary navigation)

Individual Question Examples (01-09) - RECOMMENDED

Standalone JSON files for nine of the implemented question types, each with a complete feedback bundle (the matching, ordering, and placement examples are numbered 15 to 17):

01-simple-gap-fill.json - Articles with indefinite article rule
02-true-false-question.json - Science fact with per-choice feedback
03-multiple-choice.json - Programming languages with detailed choiceFeedback
04-word-bank-cloze.json - Articles in context with per-gap feedback
05-multi-gap-cloze.json - Prepositions open cloze (FCE Part 2 style, 8 gaps)
06-multiple-choice-cloze.json - Vocabulary with nested gapOptionFeedback
07-short-answer.json - Astronomy fact recall
08-essay.json - IELTS Task 2 with comprehensive rubric
09-sentence-transformation.json - FCE Part 4 with chunk feedback

Features Demonstrated:

Complete feedback bundle (correct, incorrect, choiceFeedback)
Strategic hints that guide without revealing answers
Multi-level tagging (grammar:articles:indefinite, exam:fce, level:B2)
Realistic educational content
Production-ready quality
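The multi-level tags shown above read as colon-separated paths, so a consumer could filter by prefix at any level. The sketch below assumes that reading; the README gives only examples of tags, not their formal semantics, and `tag_matches` is a hypothetical helper.

```python
def tag_matches(tag: str, prefix: str) -> bool:
    """True if `tag` equals `prefix` or sits beneath it in the tag hierarchy."""
    tag_parts = tag.split(":")
    prefix_parts = prefix.split(":")
    # Compare whole segments so "grammar:art" does not match "grammar:articles".
    return tag_parts[:len(prefix_parts)] == prefix_parts
```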

Use these as templates - they showcase all features including feedback mechanisms that are integral to effective course design.

The 7 reserved-for-2027 question types (association, hotspot, graphicGapMatch, graphicAssociate, graphicOrder, fileUpload, mediaPromptedEssay) are declared in question-base.schema.json’s enum for forward compatibility, but no example payloads ship; full authoring and rendering support is targeted for 2027.

`sample-course-with-questions.json`

Complete course JSON showing:

Full hierarchy: Course → Units → Lessons → Items → Questions
Mixed item types: ContentItem, ExerciseItem, QuizItem
Real-world structure: Lessons with intro content + exercises and quizzes
Cambridge FCE alignment: Exam-style questions with proper tags
Best practices: Proper tagging, feedback, hints, difficulty levels

Quick Start Guide

For Content Creators

Creating a Simple Course:

Start with a template:

cp examples/sample-course-with-questions.json my-course.json

Modify the course metadata:

{
  "title": "Your Course Title",
  "subtitle": "Your subtitle",
  "description": "Course description",
  "tags": ["level:B1", "grammar"]
}

Add or modify questions using the per-type example files (01-simple-gap-fill.json through 09-sentence-transformation.json, plus 15-matching.json and the 16… ordering family).
Validate the result with any conforming consumer or the reference validator:
```
python ../tools/validate_course.py --course-path my-course.json
```

For Developers

Validating LC-JSON in code:

Implementations may use any JSON Schema (Draft 7) validator. The earlier ajv snippet (Node.js) is one example; common alternatives:

Python: pip install jsonschema (≥ 4.18) → Draft7Validator
Java: everit-org/json-schema or networknt/json-schema-validator
Go: santhosh-tekuri/jsonschema
Rust: Stranger6667/jsonschema
Ruby: voxpupuli/json-schema

The reference Python validator (tools/validate_course.py in this repository) layers domain checks (HTML allowlist, gap-marker counts, points consistency, signpost-without-objectives) on top of schema validation. Re-implementations are welcome.

Adding a new question type to the spec (PR-driven contributions):

Create a JSON schema in schemas/ for the new type.
Add the discriminator value to question-base.schema.json’s enum.
Add a per-type example file under examples/ (e.g. 17-new-type.json).
Document in question-types-reference.md.
Add positive and negative test cases under tests/.

Question Types — Implementation Status

Implemented (12 types, fully schema-validated as of 1.0-rc.3):

Question Type	Example	Use Case
`simpleGapFill`	01-simple-gap-fill.json	Single gap fill-in-the-blank
`trueFalseQuestion`	02-true-false-question.json	Binary choice (True/False, Yes/No)
`multipleChoice`	03-multiple-choice.json	Single or multiple selection MCQ
`wordBankCloze`	04-word-bank-cloze.json	Gap fill from word pool
`multiGapCloze`	05-multi-gap-cloze.json	Open cloze (FCE Reading Part 2)
`multipleChoiceCloze`	06-multiple-choice-cloze.json	Dropdown cloze (FCE Reading Part 1)
`shortAnswer`	07-short-answer.json	Free text short response
`essay`	08-essay.json	Long-form writing with rubric
`sentenceTransformation`	09-sentence-transformation.json	FCE Use of English Part 4
`matching`	15-matching.json	Term-definition matching
`ordering`	16-ordering.json	Sequence/chronological ordering
`placement`	17a-sentence-placement.json	Place items into anchored gaps in a structured passage (sentence / paragraph / sectionLabel; supports decoy gaps for TOEFL Sentence Insertion)

Reserved (7 types declared in the discriminator enum for forward compatibility; per-type schemas and authoring/consumer support targeted for 2027):

Question Type	Use Case
`association`	Categorization/grouping
`hotspot`	Click regions on image
`graphicGapMatch`	Drag-and-drop on image
`graphicAssociate`	Match text with images
`graphicOrder`	Order images sequentially
`fileUpload`	Document submission
`mediaPromptedEssay`	Audio/video recording

Status definitions:

Implemented — per-type schema, example, and conformance fixtures present.
Reserved — declared in the question-base.schema.json discriminator enum for forward compatibility; no per-type schema or example ships yet.

The 12 implemented types are the entire user-facing surface as of 1.0. The 7 reserved types are targeted for 2027.

Common Validation Errors

Error: “Number of gaps doesn’t match accepted answers”

Cause: Mismatch between numbered @@@N markers in passage and entries in gapAcceptedAnswers. Fix: Count @@@1, @@@2, … markers in the passage and ensure gapAcceptedAnswers has matching string keys ("1", "2", …).
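The gap-count check can be reproduced outside the validator. This is a rough sketch, not the reference validator's implementation: it assumes the passage is available as a plain string and that `gapAcceptedAnswers` is an object keyed by gap number as a string; the function name `check_gaps` and the list-of-strings answer values are illustrative.

```python
import re

GAP_MARKER = re.compile(r"@@@(\d+)")


def check_gaps(passage: str, gap_accepted_answers: dict) -> list[str]:
    """Return human-readable problems; an empty list means markers and keys agree."""
    marker_ids = set(GAP_MARKER.findall(passage))
    answer_ids = set(gap_accepted_answers)
    problems = []
    # Markers in the passage that have no accepted answers.
    for gap_id in sorted(marker_ids - answer_ids, key=int):
        problems.append(f"gap @@@{gap_id} has no entry in gapAcceptedAnswers")
    # Answer keys that point at no marker in the passage.
    for gap_id in sorted(answer_ids - marker_ids):
        problems.append(f'gapAcceptedAnswers key "{gap_id}" has no @@@{gap_id} marker')
    return problems


print(check_gaps("She bought @@@1 apple and @@@2 orange.", {"1": ["an"]}))
# ['gap @@@2 has no entry in gapAcceptedAnswers']
```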

Error: “Unknown question type: simplegapfill”

Cause: Type discriminator uses non-canonical casing. Fix: Use camelCase: "simpleGapFill", not "SimpleGapFill" or "simplegapfill". Per NORMATIVE.md §5.3, conforming consumers MUST reject non-canonical casings.
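A consumer can reject non-canonical casing while still giving the author a useful hint. The sketch below takes the 12 implemented and 7 reserved names from the tables above; the function name `check_question_type` and the wording of the hint are assumptions, not reference-validator output.

```python
IMPLEMENTED = {
    "simpleGapFill", "trueFalseQuestion", "multipleChoice", "wordBankCloze",
    "multiGapCloze", "multipleChoiceCloze", "shortAnswer", "essay",
    "sentenceTransformation", "matching", "ordering", "placement",
}
RESERVED = {
    "association", "hotspot", "graphicGapMatch", "graphicAssociate",
    "graphicOrder", "fileUpload", "mediaPromptedEssay",
}
KNOWN = IMPLEMENTED | RESERVED
_BY_LOWER = {name.lower(): name for name in KNOWN}


def check_question_type(value: str) -> str:
    """Return the value if it is canonical; otherwise raise ValueError."""
    if value in KNOWN:
        return value
    # Reject, but suggest the canonical spelling when only the casing differs.
    canonical = _BY_LOWER.get(value.lower())
    if canonical is not None:
        raise ValueError(f'Unknown question type: {value} (did you mean "{canonical}"?)')
    raise ValueError(f"Unknown question type: {value}")


try:
    check_question_type("simplegapfill")
except ValueError as err:
    print(err)  # Unknown question type: simplegapfill (did you mean "simpleGapFill"?)
```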

Error: “globalId does not match UUID pattern”

Cause: globalId is missing or not in RFC 4122 UUID form (any version; shape-only validation against the 8-4-4-4-12 hex pattern). Fix: Generate a UUID for every Unit, Lesson, Item, and Question. Use any standard UUID library; v4 is recommended.
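The shape-only check and the recommended fix can both be done with the Python standard library. This is an illustrative sketch: the published schema pattern is not reproduced here, so accepting both upper- and lower-case hex digits is an assumption, and `has_uuid_shape` is a hypothetical helper name.

```python
import re
import uuid

# 8-4-4-4-12 hex groups; no check on the UUID version or variant bits.
UUID_SHAPE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def has_uuid_shape(value) -> bool:
    """True if value is a string in 8-4-4-4-12 hex form."""
    return isinstance(value, str) and UUID_SHAPE.fullmatch(value) is not None


new_id = str(uuid.uuid4())  # v4, as recommended above
print(has_uuid_shape(new_id))      # True
print(has_uuid_shape("lesson-1"))  # False
```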

Error: “Unsupported specVersion ‘2.0’”

Cause: The document declares a specVersion whose major version exceeds 1. Fix: This validator implements LC-JSON 1.x. Either update the validator or correct the specVersion to a 1.x value.
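A consumer-side version gate might look like the following. It is a sketch under one stated assumption: that `specVersion` has the form `MAJOR.MINOR` with an optional `-rc.N` suffix, as in the version labels used in this document. The authoritative pattern is the one enforced by the schemas, and `check_spec_version` is a hypothetical name.

```python
import re

SUPPORTED_MAJOR = 1
# Assumed shape: "1.0", "1.1", "1.1-rc.1", ...
_VERSION = re.compile(r"(\d+)\.(\d+)(?:-rc\.(\d+))?")


def check_spec_version(spec_version: str) -> None:
    """Raise ValueError if the version is malformed or its major is too new."""
    match = _VERSION.fullmatch(spec_version)
    if match is None:
        raise ValueError(f"Malformed specVersion '{spec_version}'")
    if int(match.group(1)) > SUPPORTED_MAJOR:
        raise ValueError(f"Unsupported specVersion '{spec_version}'")


check_spec_version("1.1-rc.1")  # accepted, returns None
try:
    check_spec_version("2.0")
except ValueError as err:
    print(err)  # Unsupported specVersion '2.0'
```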

NORMATIVE.md — RFC 2119 conformance requirements (the authoritative source for what implementations must do)
HTML_SAFETY.md — Normative HTML allowlist for ContentItem.html and SignpostItem.customHtml (elements, attributes, URL schemes, sanitization)
ACCESSIBILITY.md — Producer/consumer accessibility obligations (alt text, captions, keyboard, language/direction) with WCAG 2.1 AA cross-references and recommended ARIA patterns; the opt-in Accessibility Profile claim binds these as MUSTs per NORMATIVE.md §12
VALIDATION.md — Catalog of every documented validation rule, tagged schema-enforced / domain-validator-enforced / advisory, with citations to the enforcing site (schemas, validate_course.py, or prose). One-map view for implementers building consumers, validators, or round-trip tests.
ITEM_PATTERNS.md — Informative authoring guide for items + signposts + objectives
IMPLEMENTATIONS.md — Directory of tools that produce, consume, or validate LC-JSON
CONTRIBUTORS.md — Acknowledgments
schemas/ — JSON Schema files (the contract)
examples/ — Example documents for every artifact and question type
tests/ — Conformance test corpus (valid and invalid cases)
question-types-reference.md — Per-type property reference

Version History

v1.0-rc.3 (2026-06-13) — second public release candidate

Adds LOCALIZATION.md: the language model (language / lang / supportLanguage), the single-language-per-document boundary, BCP 47 tags, and screen-reader pronunciation expectations. Bound by new NORMATIVE.md §13.
Adds a positioning page (RATIONALE.md) explaining where LC-JSON sits among adjacent standards.
Conformance corpus expanded to 64 cases (per-type valid + invalid coverage, grading matrix, globalId-uniqueness).
Schema change requiring a new immutable path: the prototype-era allowedFillerWords and prohibitExtraWordsBetweenChunks fields are dropped from sentence-transformation.schema.json. Because /1.0-rc.2/ is immutable, this lands at /1.0-rc.3/. Backwards-compatible — every rc.2-valid document remains valid under rc.3 (the removed fields were optional and are ignored on import).
Schemas published as immutable at https://lc-json.org/1.0-rc.3/*.schema.json; /1.0-rc.1/ and /1.0-rc.2/ stay served and frozen; the https://lc-json.org/1.0/*.schema.json URL space is reserved for the accepted final release.

v1.0-rc.2 (2026-05-30) — first publicly announced release candidate

Two artifact types under a common flat root: course (hierarchical) and questionSet (flat).
12 user-facing question types fully implemented and schema-validated; 7 graphic/upload types reserved for a 2027 minor version.
23 JSON Schemas (Draft 7) covering every artifact, item type, and question type.
32 example files; conformance test corpus under tests/ (13 valid + 25 invalid = 38 cases).
Reference validator (tools/validate_course.py) and conformance corpus harness (tools/run_corpus.py).
prompt field correction (the rc.1 → rc.2 change): prompt remains required but minLength is 0, so an empty string "" is valid. prompt is defined as non-authoritative for the eight symbolic question types (gap-fill family, sentence transformation, matching, ordering, placement), whose structured fields carry the question’s meaning; for those types it MAY be empty or MAY carry a brief producer-derived readable summary. A reference-validator domain rule still flags an empty prompt on the four real-content types (true/false, multiple choice, short answer, essay), where it is the question. Backwards-compatible widening — every rc.1-valid document remains valid under rc.2.
Apache 2.0 throughout. Release-candidate schemas are published as immutable at https://lc-json.org/1.0-rc.2/*.schema.json; the https://lc-json.org/1.0/*.schema.json URL space is reserved for the accepted final release.

v1.0-rc.1 — internal release candidate (superseded, never announced)

Frozen and served at https://lc-json.org/1.0-rc.1/*.schema.json for transparency, but never publicly announced; rc.2 is the first announced prerelease. The only substantive difference is the backwards-compatible prompt minLength 1 → 0 correction above; the /1.0-rc.1/ schema URLs remain immutable and any document valid under rc.1 is valid under rc.2.

v1.0 (2026-06-30) — accepted final release

Publishes the rc.3 schema set unchanged at immutable https://lc-json.org/1.0/*.schema.json — a pure URL rebase of rc.3 with zero wire/content delta (only the version pointer, the $id/$schema URL strings, and doc version labels change).
Further accessibility deepenings (per-criterion cross-reference table, expanded ARIA patterns, screen-reader announcement timing, accessibility-conformance fixtures) are post-1.0, additive, and informative or opt-in — they do not change the 1.0 base contract (see ACCESSIBILITY.md §11).
Any future non-breaking wire refinement lands in a new immutable minor-version path, not by mutating 1.0.

LC-JSON’s public history begins with the 1.0 release-candidate line — 1.0-rc.2 (2026-05-30) was its first publicly announced release. Internal iteration before the candidate line is not reflected in the version history.

Contributing

PRs welcome. To propose a new question type or modify an existing one, see Adding a new question type above. For non-trivial changes, open an issue first to discuss the proposal.

See CONTRIBUTORS.md for acknowledgments.

Support

GitHub Issues: open an issue on the spec repository.
Conformance questions: consult NORMATIVE.md; it is the authoritative source for implementer requirements.

LC-JSON Specification v1.0

LC-JSON Rationale and Positioning

Status: Informative
Spec version context: LC-JSON 1.1-rc.1
Last updated: 2026-07-22
Audience: teachers, curriculum designers, institutional reviewers, educational software developers, and implementers evaluating LC-JSON for adoption.

This document is informative, not normative. It explains the design rationale and positioning behind LC-JSON. Conformance requirements remain in NORMATIVE.md.

LC-JSON describes itself as an open learning-content interchange specification rather than an “industry standard.” That term is reserved for formats whose acceptance has been ratified by a recognized body or by long ecosystem use. LC-JSON has neither yet, and overclaiming would invite reasonable skepticism.

The Problem

Teachers and institutions create large amounts of learning content: courses, lessons, readings, exercises, quizzes, feedback, and assessments. Too often, that content becomes tied to the tool that created it.

Common problems include:

A course can be exported, but the export is difficult for another tool to understand.
Question banks lose feedback, scoring intent, tags, or structure during transfer.
Teachers cannot inspect their own course files without proprietary tooling.
Institutions cannot easily preserve or migrate teacher-authored content across platforms.
Accessibility metadata can be lost when content is exported, imported, edited, or repackaged.

LC-JSON exists to make learning content portable in a way that is both technically reliable and inspectable by the people who own the content.

What LC-JSON Is

LC-JSON is an open learning-content interchange specification.

For teachers, it can be understood as a portable course file format: a way to store courses, lessons, questions, answers, feedback, and related teaching material in a file that compatible tools can read.

For developers, LC-JSON defines a schema-validated JSON wire format plus producer/consumer conformance rules for exchanging learning content.

LC-JSON 1.1 defines five artifact types:

Course — hierarchical learning content: Course -> Units -> Lessons -> Items -> Questions.
QuestionSet — a flat list of questions for question-bank exchange and packaged delivery.
Glossary — a flat list of terms (definitions, pronunciation, translations, examples, etc.) that attaches to a course, unit, or lesson.
SubjectCollection — a reusable classification vocabulary: the tags and learning objectives for one subject at one level.
CurriculumPack — an arrangement: sequence, pacing, and assessment checkpoints over a collection plus content documents.

The first two (added in 1.0) carry learning content; the last three (added in 1.1) carry the vocabulary, arrangement, and study material that sit around it.

The practical goal is to preserve teacher-authored instructional intent: sequence, explanations, questions, distractors, feedback, objectives, tags, rubrics, and grading intent.

Design Principles

Machine-Validatable, Human-Inspectable

LC-JSON documents validate against published JSON Schemas, but they are also designed so that authored content remains visible in the file.

A teacher, curriculum designer, or teacher-developer should be able to open a course JSON file and recognize courses, units, lessons, items, prompts, choices, answers, and feedback without proprietary tooling.

This is a deliberate stance against formats whose meaning only emerges through tooling. It is offered without promise of zero technical fields — portability requires some, and $schema, specVersion, and globalId exist for that reason. The promise is that the pedagogical structure stays inspectable to the people who authored it. Where field-naming or structural trade-offs arise during spec evolution, the spec favors the form that keeps pedagogical content recognizable.

Hierarchy Follows Pedagogy

LC-JSON uses the structure teachers already recognize:

Course -> Unit -> Lesson -> Item -> Question

This is not a database-first shape. It is a teaching-content shape.

Plain Property Names

LC-JSON favors readable property names where technically possible:

prompt, not p.
acceptedAnswers, not accAns.
passMarkPercent, not pmp.
feedback, not an opaque metadata bundle.

The goal is not to remove every technical term. The goal is to keep teaching intent visible.

No Envelope Tricks

LC-JSON uses a flat document root with $schema, documentType, and specVersion as root-level siblings. The course or question-set payload lives beside those fields, not hidden inside an extra wrapper.

This keeps schema dispatch explicit while avoiding unnecessary nesting.

Accessibility Metadata Must Survive Transformation

Learning content often moves through multiple tools before it reaches learners. LC-JSON therefore treats accessibility metadata as something that must survive import/export cycles.

Base LC-JSON consumer conformance includes a preservation floor for accessibility-relevant data such as image alt, media <track>, lang, dir, language, supportLanguage, and reserved-type accessibility metadata. Tools that additionally claim the LC-JSON Accessibility Profile take on the rendering obligations defined in ACCESSIBILITY.md.

Where LC-JSON Fits

LC-JSON is not trying to replace every educational specification or format.

It focuses on a specific problem: portable teacher-authored courses and questions in a JSON-native format that tools can validate, exchange, preserve, and inspect.

LC-JSON is most useful when a team needs:

portable course files,
question-bank exchange,
schema validation before import,
preservation of feedback and scoring intent,
round-trip preservation of unsupported future question types,
a format that teacher-developers and technical curriculum teams can inspect directly.

Runtime delivery, gradebook integration, learner analytics, roster sync, and LMS-specific workflows remain implementation concerns unless a future LC-JSON version explicitly adds a portable contract for them.

A typical adoption path is to author or preserve content in LC-JSON, then export or map selected surfaces to delivery, package, or analytics layers such as QTI, Common Cartridge, H5P, xAPI, or Caliper where needed.

The 1.1 vocabulary and arrangement types sit alongside existing work rather than replacing it. Machine-readable standards frameworks already exist — 1EdTech® CASE®¹ is the established way authorities publish competency frameworks — and LC-JSON collections do not compete with those registers: a collection points at them (externalAlignments) while carrying the working tags and objectives that course files actually reference. What LC-JSON has lacked is the connective layer in the same plain-file format as the content: a scheme of work whose sequencing, pacing, and coverage claims are checkable by a validator against the vocabulary it cites, in a document a teacher can open and read. That — not the existence of machine-readable curricula — is the gap the Curriculum Pack addresses.

Landscape

LC-JSON is one of several specifications that touch learning content. It sits at a specific layer — content interchange — and is intended to be used alongside, not instead of, the formats that handle adjacent concerns.

Format	Layer	Relation to LC-JSON
LTI 1.3 / Advantage	Tool launch, deep linking, roster, grade passback	Different layer. LTI is how an LMS launches and integrates with an external tool; LC-JSON is the content that tool may have authored or consumed. Complementary.
xAPI / cmi5	Learning activity records	Different layer. xAPI describes what a learner did; LC-JSON describes the content they did it with. Complementary.
SCORM 2004	Packaged courseware delivery and runtime API	Older, XML-based, designed for self-paced corporate compliance training and bound to a runtime API. LC-JSON is editable interchange, not a delivery wrapper.
IMS Common Cartridge	Multi-format content package	Bundles QTI, SCORM, web links, and a manifest into a single archive. LC-JSON is a single JSON-native artifact rather than a package format.
QTI 2.x / 3.0	Question and assessment interchange	Closest peer. QTI was conceived as XML; 3.0 added a JSON binding but the conceptual model remains XML-shaped and the surface area is broad. LC-JSON is JSON-native from the start, course-shaped as well as question-bank-shaped, narrower in surface, and designed for direct human inspection.
OneRoster	Roster, enrollment, grade exchange	Different layer; orthogonal to content.
CASE	Competency and academic-standards framework	The nearest overlap is in objectives. CASE provides a standardized way for an authority to publish a canonical competency framework; an LC-JSON Subject Collection is a portable working vocabulary of objectives and tags that can declare its relationship to such a framework through typed `externalAlignments`, rather than republishing it. Complementary — the Subject Collection points to the external register; it does not replace or author it.
H5P	Interactive content packages and runtimes	Different layer. H5P provides executable interaction types and player/runtime semantics; LC-JSON is a neutral editable source/interchange format that could generate or map to selected runtime targets.
Caliper	Learning analytics event model	Different layer. Like xAPI, Caliper describes learner activity events; LC-JSON describes the content those events may refer to. Complementary.

This is a high-level map, not an exhaustive comparison. LC-JSON’s intended combination — JSON-native, human-inspectable, and covering hierarchical course structure as well as flat question sets — is uncommon among established educational interchange formats. 1.1 extends that combination beyond the content to the curriculum around it — a shared vocabulary of objectives and tags, and a scheme-of-work arrangement, in the same plain-file format — and adds something rarer still: a curriculum pack whose sequencing and coverage claims a validator can check against the vocabulary it cites.

Whether such a format needs to exist comes down to layers. QTI is mature and deep for assessment exchange, and LC-JSON deliberately targets a narrower, JSON-native course-and-question authoring source rather than competing on assessment surface area. SCORM and Common Cartridge are package-and-delivery formats from an earlier era, not editable JSON. xAPI, Caliper, LTI, OneRoster, and CASE are oriented at other layers. LC-JSON exists to occupy the JSON-native, teacher-readable interchange slot and, with 1.1, to widen it beyond courses and questions to the curriculum around them: a shared vocabulary authors can classify against, and a validator-checkable scheme of work that sequences content against it.

How LC-JSON Differs

The same comparison, expressed as field-level stances:

Need	LC-JSON stance	Typical peer behavior
Teacher-readable interchange files	First-class design principle	Many established interchange formats prioritize machine/tool processing over direct human inspection
JSON-native validation	Published JSON Schemas (Draft 7)	QTI is XML-shaped (3.0 added a JSON binding); SCORM and Common Cartridge are XML and package-based
Course + question portability in one family	Separate `course` and `questionSet` artifacts under a common flat root	QTI covers questions; SCORM and Common Cartridge package courses; few formats cover both as editable JSON
Shared curriculum vocabulary across courses	A `subjectCollection` is a portable, owned vocabulary of reusable objectives and tags with stable member ids that are never re-minted. It can declare typed relationships to external frameworks (e.g. CASE) through `externalAlignments`. Courses can carry copies of collection-origin objectives with their ids preserved; a `curriculumPack` can reference the collection directly and use its objective and tag ids to sequence and pace content. This lets courses within a compatible scope share objective identities, while packs use a referenced objective-and-tag vocabulary for sequencing and coverage	Alignment references may exist in packages — for example, Common Cartridge can carry standards and CASE URIs — but a reusable, editable objective and tag vocabulary is typically managed separately from the portable content package, such as in an LMS or standards service
Curriculum sequencing and coverage as checkable data	A `curriculumPack` makes the plan portable and calendar-relative. It can declare coverage assertions against a cited collection and enables validators to enforce the defined taught-before-used sequencing rules	Other systems can order resources, and schemes of work often live as prose or LMS structures; they do not generally encode this combination of relative pacing, coverage assertions, and taught-before-used rules as a portable, independently machine-checkable contract against a cited vocabulary
Accessibility metadata preservation across import/export	Base consumer-conformance preservation floor for `alt`, `<track>`, `lang`, `dir`, `language`, `supportLanguage`, reserved-type accessibility metadata	Accessibility metadata can be dropped or normalized away during transformation
Accessible delivery claims	Opt-in Accessibility Profile binding (see `ACCESSIBILITY.md`)	Accessibility-conformance claims are typically made about the delivery platform, not the interchange file
Unsupported future question types	Preserve verbatim and report; never silently drop	Fallback behavior varies by implementation; without an explicit preservation contract, data loss is a practical risk
Tool-specific data	Namespaced `x-` extensions; other consumers ignore unknown namespaces and extension-preserving consumers round-trip them where possible (see `NORMATIVE.md` §7)	Custom-extension mechanisms exist (e.g. QTI custom interactions) but are often tightly coupled to one tool
Version stability	Immutable schema URL paths per spec version	URL stability practices vary by specification
LMS / runtime integration	Out of scope unless a future LC-JSON version adds a portable contract	SCORM defines a runtime API; LTI defines launch and grade integration
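The "preserve verbatim and report" stance for unsupported question types in the table above can be illustrated with a small import routine. This is a sketch, not normative behavior: the discriminator key name `type`, the function name `import_questions`, and the report wording are all assumptions made for illustration.

```python
import copy

SUPPORTED_TYPES = frozenset({
    "simpleGapFill", "trueFalseQuestion", "multipleChoice", "wordBankCloze",
    "multiGapCloze", "multipleChoiceCloze", "shortAnswer", "essay",
    "sentenceTransformation", "matching", "ordering", "placement",
})


def import_questions(questions: list, type_key: str = "type"):
    """Return (kept, report): every question is kept, unsupported ones are reported."""
    kept = copy.deepcopy(questions)  # verbatim copy, original order preserved
    report = [
        f"question {index}: unsupported type {q.get(type_key)!r} preserved verbatim"
        for index, q in enumerate(kept)
        if q.get(type_key) not in SUPPORTED_TYPES
    ]
    return kept, report


incoming = [
    {"type": "essay", "prompt": "Discuss."},
    {"type": "hotspot", "regions": [[10, 20, 30, 40]]},
]
kept, report = import_questions(incoming)
print(kept == incoming)  # True: nothing dropped or rewritten
print(report)            # ["question 1: unsupported type 'hotspot' preserved verbatim"]
```

Keeping the unsupported payload in the same list position means a later export can emit the document without reordering or data loss.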

Adoption Positioning

For teachers:

LC-JSON is a portable course file format for moving teaching materials between compatible platforms.

For institutions:

LC-JSON is an open, JSON-based interchange format that makes teacher-authored learning content portable between compatible tools and platforms. From version 1.1 it also supports shared curriculum vocabularies: a portable Subject Collection carries reusable learning objectives and tags and can declare typed relationships to official curriculum frameworks. Courses can carry objectives from the collection with their ids preserved, while a Curriculum Pack can reference the collection directly to sequence content and check coverage against it. Courses within a compatible scope can therefore reuse objective identities, while packs can reuse the collection’s objective and tag vocabulary.

For developers:

LC-JSON defines a schema-validated JSON wire format plus producer/consumer conformance rules for learning-content interchange.

For standards reviewers:

LC-JSON is an emerging open learning-content interchange specification with a published 1.0 release and a 1.1 release candidate, published schemas, conformance fixtures, and explicit producer/consumer obligations.

Scope and Limits

LC-JSON is a focused, open, schema-validated interchange specification for portable learning content. It is not — on its own — any of the following:

A WCAG conformance claim. LC-JSON’s Accessibility Profile binds preservation and rendering obligations on conforming consumers, but WCAG conformance is established by the delivery platform under test, not by the interchange file.
An LMS interoperability format. Tool launch, deep linking, and grade passback are LTI’s domain.
A roster, enrollment, or grade-exchange format. That is OneRoster’s domain.
A learning-analytics or activity-record format. That is xAPI / cmi5’s domain.
A runtime delivery wrapper. SCORM 2004 defines a runtime API; LC-JSON does not.
A broadly adopted industry specification. LC-JSON is an emerging open specification with published, versioned schemas, conformance fixtures, and explicit producer/consumer obligations. Whether it becomes widely adopted will be determined by implementers and time, not by self-description.

Within those limits, LC-JSON aims to do one thing well: provide a JSON-native, human-inspectable interchange format for three learner-facing content types — hierarchical courses, flat question sets, and glossaries — plus two coordinating types: Subject Collections, which provide reusable objective-and-tag vocabularies, and Curriculum Packs, which arrange content against those vocabularies. It supports extension-preserving round-trips, a base accessibility-preservation floor, and an opt-in Accessibility Profile for delivery obligations.

¹ CASE® and 1EdTech® are trademarks of 1EdTech Consortium, Inc. (1edtech.org).

LC-JSON Specification — Normative Requirements

Spec version: 1.1
Status: Normative (release candidate, published at /1.1-rc.1/)
Last updated: 2026-07-22

This document states the requirements that conforming LC-JSON (Learning Content JSON) tools MUST satisfy. It is the authoritative source of truth for compliance; descriptive material elsewhere in the specification illustrates how to meet these requirements but does not relax them.

1. Scope

This document specifies the requirements for tools that produce, consume, or validate LC-JSON 1.1 documents. It defines:

The canonical wire format for the five artifact types (Course, QuestionSet, Glossary, SubjectCollection, CurriculumPack).
Two conformance roles — producer and consumer — and what each MUST do.
Versioning rules and URL stability guarantees.
Conformance-claim language (how a tool may state it conforms).

This document does not prescribe implementation strategies, programming languages, or runtime architecture. Any tool meeting the requirements below conforms, regardless of how it is built.

LC-JSON 1.1 is a purely additive minor version over 1.0 (§8.2): every conforming 1.0 document is a conforming 1.1 document with unchanged meaning. The additions are the three new artifact types (§3.3) — two vocabulary/arrangement types and the glossary content type — the member-identity and self-containment rules that make them portable (§3.4, §4.9–§4.11, §5.7), and the optional course-root additions: three publication-metadata fields and the glossaryRefs attachment arrays (Appendix A).

2. Conformance Language

The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in RFC 2119 and RFC 8174 when, and only when, they appear in all capitals.

A requirement stated in lowercase (“must,” “should”) is descriptive prose, not a normative requirement.

3. Document Identity

3.1 Canonical URL space

LC-JSON schemas are published at:

https://lc-json.org/<spec-version>/<schema-name>.schema.json

The <spec-version> segment identifies either a released version (1.0, 1.1, 2.0, …) or a release candidate: an immutable draft of an upcoming version, published for review and implementer feedback before the final release is accepted (e.g., 1.0-rc.1, 1.0-rc.2, 1.0-rc.3, 1.1-rc.1, …). Each receives its own URL path. For released spec version 1.0, schemas resolve at https://lc-json.org/1.0/*.schema.json. For the 1.0 release candidates, schemas resolve at https://lc-json.org/1.0-rc.N/*.schema.json, one URL path per candidate.

URLs under any published path — released or release-candidate — are immutable. They MUST NOT be renamed, removed, redirected to a different schema, or repointed to a non-canonical host once published.

The /X.Y/ URL path is reserved for the accepted final X.Y release and MUST NOT be populated until that release is published. Release candidates targeting X.Y are published at /X.Y-rc.N/ paths. A document pinned via $schema to /X.Y-rc.N/ does not automatically validate against /X.Y/; adoption of the final release is an explicit choice by the publisher (typically a re-export against the new schema URL). See §8.1 and §8.3 for the full versioning and stability contract.
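For tooling that needs to tell a release path from a release-candidate path, the URL layout above can be parsed mechanically. The following is an informative sketch only: the regular expression is inferred from the example URLs in this section (a `MAJOR.MINOR` segment with an optional `-rc.N` suffix, and a lowercase hyphenated schema name), and `SchemaRef` and `parse_schema_url` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional

_SCHEMA_URL = re.compile(
    r"https://lc-json\.org/(?P<version>\d+\.\d+)(?:-rc\.(?P<rc>\d+))?"
    r"/(?P<name>[a-z0-9-]+)\.schema\.json"
)


class SchemaRef(NamedTuple):
    version: str        # target release, e.g. "1.0"
    rc: Optional[int]   # candidate number, or None for a final release
    name: str           # schema name, e.g. "course"


def parse_schema_url(url: str) -> SchemaRef:
    """Split a canonical schema URL into its parts, or raise ValueError."""
    match = _SCHEMA_URL.fullmatch(url)
    if match is None:
        raise ValueError(f"not a canonical LC-JSON schema URL: {url}")
    rc = match.group("rc")
    return SchemaRef(match.group("version"), int(rc) if rc else None, match.group("name"))


candidate = parse_schema_url("https://lc-json.org/1.0-rc.3/course.schema.json")
final = parse_schema_url("https://lc-json.org/1.0/course.schema.json")
print(candidate)           # SchemaRef(version='1.0', rc=3, name='course')
print(candidate == final)  # False: a candidate pin is not the final-release pin
```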

3.2 Required root fields

Every conforming LC-JSON document MUST contain at the root, as siblings (not nested under any envelope):

Field	Required?	Type	Value
`documentType`	MUST	string	`"course"`, `"questionSet"`, `"glossary"`, `"subjectCollection"`, or `"curriculumPack"`. The artifact discriminator.
`specVersion`	MUST	string	The LC-JSON contract version this document conforms to. Pattern enforced by the schemas; consumer/producer rules in §5.2 / §4.6.
`$schema`	MUST (producer) / SHOULD-tolerate (consumer)	string	A URL identifying the schema for this document type at the spec version the producer conforms to (e.g., `https://lc-json.org/1.1-rc.1/<artifact>.schema.json` for a 1.1-rc.1 producer; `https://lc-json.org/1.0/<artifact>.schema.json` for a 1.0-final producer). Consumers SHOULD accept documents that omit `$schema` (re-import scenarios from older or lenient producers), but MUST reject any other root-field omission.

A document missing documentType or specVersion is non-conforming. A producer that emits a document missing $schema is non-conforming with respect to that document; a consumer that rejects an otherwise-valid document on the basis of a missing $schema is overly strict and SHOULD instead infer the schema from documentType + specVersion.
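The tolerance rule for a missing `$schema` can be sketched as follows. The schema file names are those listed in §3.3. Using `specVersion` verbatim as the URL path segment is an assumption of this sketch, not a stated rule, and `resolve_schema_url` is a hypothetical helper name.

```python
SCHEMA_FILES = {
    "course": "course.schema.json",
    "questionSet": "question-set.schema.json",
    "glossary": "glossary.schema.json",
    "subjectCollection": "subject-collection.schema.json",
    "curriculumPack": "curriculum-pack.schema.json",
}


def resolve_schema_url(doc: dict) -> str:
    """Return the schema URL to validate against, inferring it if $schema is absent."""
    # documentType and specVersion are hard requirements at the root.
    for field in ("documentType", "specVersion"):
        if field not in doc:
            raise ValueError(f"non-conforming document: missing root field {field!r}")
    doc_type = doc["documentType"]
    if doc_type not in SCHEMA_FILES:
        raise ValueError(f"unknown documentType {doc_type!r}")
    if "$schema" in doc:
        return doc["$schema"]
    # Assumption: the specVersion string doubles as the URL path segment.
    return f"https://lc-json.org/{doc['specVersion']}/{SCHEMA_FILES[doc_type]}"


print(resolve_schema_url({"documentType": "questionSet", "specVersion": "1.0"}))
# https://lc-json.org/1.0/question-set.schema.json
```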

3.3 Artifact types

Spec version 1.1 defines exactly five artifact types:

Course (documentType: "course") — hierarchical learning content (Course → Units → Lessons → Items → Questions). Validated by course.schema.json.
QuestionSet (documentType: "questionSet") — flat list of questions without a course/unit/lesson scaffold. Validated by question-set.schema.json.
Glossary (documentType: "glossary") — a flat list of terms with immutable member ids: pronunciation, translations, examples, inflected forms. Content-adjacent learning material a student studies — not vocabulary about content (contrast SubjectCollection). Validated by glossary.schema.json. See glossary-reference.md.
SubjectCollection (documentType: "subjectCollection") — a reusable classification vocabulary: tags and learning objectives for a structured (subject, level, audience, purpose, jurisdiction) scope. Validated by subject-collection.schema.json. See subject-collection-reference.md.
CurriculumPack (documentType: "curriculumPack") — an arrangement: sequence, pacing, and checkpoints referencing a SubjectCollection plus content documents. Validated by curriculum-pack.schema.json. See curriculum-pack-reference.md.

Course, QuestionSet, and Glossary are content documents; SubjectCollection is a vocabulary document; CurriculumPack is an arrangement document. The vocabulary/arrangement types never carry learner-facing content items or questions — they classify and organize content documents. A glossary carries learner-facing study material (its terms) but never items or questions.

A producer MUST emit exactly one of these artifact types per document. Mixing artifact types within a single document is non-conforming.

3.3.1 Conformance rules by artifact type (normative)

A conforming document MUST satisfy the rules for its artifact type. For Course and QuestionSet those rules are stated in §4–§6, in §12 (accessibility preservation), and in the JSON Schemas. For the artifact types introduced in 1.1, and for the 1.1 deepenings of Course, the rules are the rule families listed below, which apply in addition to §3.4 (member identity), §4.9 (self-containment), §4.10 (alignment claims), and §4.11 (publication metadata):

SubjectCollection — the structural, member-identity, closure, and alignment-claim rules identified SC-1 … SC-14.
CurriculumPack — the step-shape, pacing, checkpoint, taught-before-used, term-capacity, dependency-direction, bill-of-materials, coverage, and bundle-closure rules identified CP-1 … CP-17.
Glossary — the structural, gloss, and declared-translation-inventory rules identified GL-1 … GL-11.
Course (1.1 deepenings) — the specVersion↔$schema agreement rule RD-1, and the objective-pool and glossary-attachment rules identified CO-1 … CO-5.

These rule families are normative requirements of this specification. VALIDATION.md §15–§19 enumerates each rule, cites its source, and tags its enforcement tier (schema-enforced, domain-validator-enforced, or advisory); each rule’s severity (ERROR / WARN / NOTE) is as tagged there. The three reference documents — subject-collection-reference.md, curriculum-pack-reference.md, and glossary-reference.md — are informative: they explain and illustrate these rules but do not add to or relax them. Where a reference document differs from this document, the JSON Schemas, or the VALIDATION.md catalog, this document and the schemas govern.

3.4 Member identity (vocabulary and glossary documents)

A SubjectCollection’s entries — tags[] and objectives[] — and a Glossary’s entries[] are the owning document’s members. Members are identified by portable, stable ids, and those ids are the unit of interoperability across documents, tools, and time:

Every member MUST carry an id that is immutable for the life of the member. Renaming a tag, re-wording an objective, re-parenting a tag, or moving a member between display categories are content revisions of the owning document (a new document version), never id changes.
Display text is never identity. slug, name, and label are mutable presentation and lookup fields; no conforming tool may key member identity on them.
For SubjectCollection members (tags and objectives), a member id encountered in another document identifies the same member. Two documents that both carry tag id X are both referring to one shared concept — this is what makes classification comparable across independently-authored content (see §5.7 for the consumer obligations this creates). Glossary entry identity is narrower — see the glossary bullet below.
A document’s globalId is likewise portable and immutable: consumers MUST preserve it verbatim on import and MUST NOT re-mint it.

The membership model is deliberately asymmetric between the two member kinds:

A tag may be a member of any number of SubjectCollections. A document listing a tag asserts membership, not exclusive ownership.
An objective has exactly one owning document (its wording is scope-specific). Other documents may reference and carry copies of an objective (§4.9), but only the owner revises its wording.
A glossary entry is single-owner, like an objective, and its identity is document-scoped: the identifying key is the pair (glossary globalId, entry id), and entry ids are required to be unique only within their owning glossary. An entry id encountered in a different glossary is a different entry — the cross-document same-id-same-member rule above applies to collection members only. Entries are not carried into other documents in 1.1 — a course references a whole glossary (glossaryRefs), never individual entries — so entry ids exist for re-import reconciliation (§5.7): the same entry id in a later version of the same glossary (same globalId) is the same entry, updated in place, never duplicated or re-minted.
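
The difference in identity scope between collection members and glossary entries can be made concrete with two key functions. This is a sketch only; the function names are invented for illustration and the ids in the usage note are made-up values.

```python
def collection_member_key(member_id):
    """Tags and objectives: the member id alone identifies the member,
    whichever document it is encountered in."""
    return member_id


def glossary_entry_key(glossary_global_id, entry_id):
    """Glossary entries: identity is the pair (glossary globalId, entry id),
    because entry ids are unique only within their owning glossary."""
    return (glossary_global_id, entry_id)
```

With these keys, the entry id `e1` in glossary `gloss-a` and the entry id `e1` in glossary `gloss-b` map to different keys, while a tag id seen in two SubjectCollections maps to one key.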

4. Producer Conformance

A producer is any tool that emits LC-JSON documents intended for external consumption.

4.1 Wire format

A producer MUST emit documents in the canonical flat-root form: $schema, documentType, and specVersion at the top level, with the artifact payload as flat siblings. Nested envelopes such as {"course": {...}} are non-conforming.

4.2 Discriminator casing

A producer MUST emit the type discriminator on questions in canonical camelCase form: "simpleGapFill", "trueFalseQuestion", "multipleChoice", "wordBankCloze", "multiGapCloze", "multipleChoiceCloze", "shortAnswer", "essay", "sentenceTransformation", "matching", "ordering", "placement".

A producer MUST emit the type discriminator on items in canonical lowercase form: "content", "exercise", "quiz", "contentsequence", "signpost".

A producer MUST emit documentType in canonical camelCase form: "course", "questionSet", "glossary", "subjectCollection", or "curriculumPack".
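
A producer-side lint for the casing rules above might look like the following Python sketch. The three sets repeat the values listed in this section; the helper name `noncanonical_casing` is an assumption made for this example.

```python
QUESTION_TYPES = frozenset({
    "simpleGapFill", "trueFalseQuestion", "multipleChoice", "wordBankCloze",
    "multiGapCloze", "multipleChoiceCloze", "shortAnswer", "essay",
    "sentenceTransformation", "matching", "ordering", "placement",
})
ITEM_TYPES = frozenset({"content", "exercise", "quiz", "contentsequence", "signpost"})
DOCUMENT_TYPES = frozenset({
    "course", "questionSet", "glossary", "subjectCollection", "curriculumPack",
})


def noncanonical_casing(value, canonical):
    """Return the canonical spelling when `value` differs from it only by
    letter case; return None when `value` is already canonical or unrelated."""
    by_lower = {name.lower(): name for name in canonical}
    match = by_lower.get(value.lower())
    return match if match is not None and match != value else None
```

For example, `noncanonical_casing("MultipleChoice", QUESTION_TYPES)` reports the canonical `"multipleChoice"`, which a producer can use in its error message before emit.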

4.3 Item-type semantics

The exercise and quiz item-type discriminators are structural distinctions, not policy. They allow consumers to render the two forms differently in the UI and to track their points in separate buckets (enabling weighted grading).

The grading policy of an item is composed independently from its type via the isGraded, isOptional, and passMarkPercent fields. All four combinations of {exercise, quiz} × {graded, ungraded} are valid LC-JSON: a graded exercise (e.g. homework that counts), an ungraded exercise (open practice), a graded quiz (typical assessment), and an ungraded quiz (e.g. diagnostic pre-test, self-check) are all conforming.

A producer MUST NOT infer or assert grading state from item type alone, and a consumer MUST NOT reject a document on the basis that an exercise is graded or a quiz is ungraded.

4.4 Identifiers

A producer MUST emit globalId values as RFC 4122 UUIDs (any version) where the schema requires them. Specifically: every Unit, Lesson, Item, and Question MUST have a globalId; these identify the entity across re-imports, enabling consumers to match unchanged content against existing records and detect modifications.

Within a single document, globalId values MUST be unique across all entities (Units, Lessons, Items, and Questions share one namespace). A document in which two entities carry the same globalId does not conform: a consumer keyed on globalId cannot tell the entities apart, so re-import matching breaks and updates can land on the wrong record. globalId comparison is case-insensitive (the hexadecimal digits of a UUID carry no case significance).
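
The uniqueness rule can be checked with a short walk over the course hierarchy. The Python sketch below assumes the hierarchy is carried in properties named `units`, `lessons`, `items`, and `questions`; only `questions[]` is named in this document, so the other three property names are assumptions to be checked against the schemas.

```python
def duplicate_global_ids(course):
    """Return the lower-cased globalIds that occur more than once in a course.

    Units, Lessons, Items, and Questions share one namespace, and comparison
    is case-insensitive, so ids are lower-cased before they are compared.
    """
    seen, duplicates = set(), set()

    def visit(entity, child_keys):
        gid = entity.get("globalId")
        if isinstance(gid, str):
            key = gid.lower()
            (duplicates if key in seen else seen).add(key)
        if child_keys:
            # Descend one level; the remaining keys describe deeper levels.
            for child in entity.get(child_keys[0], []):
                visit(child, child_keys[1:])

    for unit in course.get("units", []):  # property name is an assumption
        visit(unit, ("lessons", "items", "questions"))
    return sorted(duplicates)
```

An empty result means the document satisfies the uniqueness rule; any returned id is carried by at least two entities.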

A producer SHOULD emit a sourceCourseId at the course root for any course that may be re-imported or version-tracked. sourceCourseId is the stable course-identity field — the same sourceCourseId across versions of a course identifies them as the same logical course, enabling consumers to detect re-imports and apply update semantics rather than treating each upload as a fresh course. sourceCourseId is generated by the source authoring system; it does not identify a human author. A QuestionSet carries the analogous sourceQuestionSetId at its root, with the same source-side semantics.

Vocabulary- and glossary-document identifiers follow §3.4: a producer MUST emit an id on every SubjectCollection member and every Glossary entry, and MUST NOT re-mint an id when regenerating or revising a document (the member persists; the document’s version changes). Member ids are opaque strings; RFC 4122 UUIDs are RECOMMENDED. Document identity for both types is the root globalId (an opaque string chosen by the original publisher; stable, human-readable slugs and UUIDs are both conventional) together with version.

Document identity by artifact type. The portable document identity — the value another document uses when it references this one — depends on the document’s documentType:

`documentType`	Portable document identity
`course`	`sourceCourseId`
`questionSet`	`sourceQuestionSetId`
`subjectCollection`	`globalId`
`glossary`	`globalId`
`curriculumPack`	`globalId`

Course and QuestionSet identity is source-side (sourceCourseId / sourceQuestionSetId); the three 1.1 artifact types carry a publisher-chosen root globalId. A Curriculum Pack references content by the value above for the referenced document’s type, carried in the reference’s id field (§3 of curriculum-pack-reference.md; step-level binding in §4.3 of that document). Because a producer only SHOULD emit sourceCourseId / sourceQuestionSetId, a Curriculum Pack producer MUST NOT emit a contentRef to a Course or QuestionSet that does not carry the identifier its type resolves against. Like the §4.9 closure rules, this is a producer-emission requirement: a conforming validator reports a pack that references an id-less course as an error, but a Course is never obliged to carry sourceCourseId merely to exist standalone. SubjectCollection and Glossary always carry a root globalId, so a reference to one always resolves.

The identity-by-type table above defines how each artifact type is identified when it is referenced; it does not itself make every type a contentRef target. A Curriculum Pack’s contentRefs bind content documents only (Course, QuestionSet, or Glossary), and a pack MUST NOT reference another Curriculum Pack: an arrangement is not embeddable content, and 1.1 does not define pack-in-pack nesting (SubjectCollections are referenced through collectionRefs, not contentRefs). The contentRefs[].type vocabulary is closed for producers: a producer MUST emit one of course, questionSet, or glossary, and MUST NOT emit curriculumPack or any other value. Like the alignment-claim vocabulary (§4.10, §5.5), it binds producers only and is deliberately left schema-open (§5.8): a consumer that meets a contentRefs[].type it does not recognize MUST NOT reject the document; it treats the reference as unresolvable and preserves it across read/write cycles.

Consumers MUST NOT conflate a source-side id with a platform-assigned identifier (see the forward-direction note below): a contentRef resolves against source-side identity, never against a platform id.

Forward-direction note (informative, not normative for 1.0): Future versions of LC-JSON may introduce a complementary coursePlatformId field for platform-assigned course identifiers, enabling round-trip flows where a teacher exports from a platform and re-imports to an authoring tool with the platform’s identity preserved. Implementations should not rely on this field’s absence in 1.0 documents being permanent. A platform-assigned identifier is deployment-scoped and is not the identity a Curriculum Pack contentRef resolves against (which is always the source-side sourceCourseId / sourceQuestionSetId), so introducing it does not change pack reference resolution.

4.5 Property naming

A producer MUST emit all property names in camelCase. PascalCase, snake_case, and other casings are non-conforming on the wire.

4.6 Spec version

A producer MUST emit specVersion matching the spec version it implements. For producers conforming to this document, specVersion MUST begin with "1." (e.g., "1.0", "1.1", "1.1.1").

specVersion carries the contract version regardless of which publication the producer targets. The specific publication — release candidate or final release — is identified by the $schema URL (§4.7). A producer conforming to 1.1-rc.1 emits specVersion: "1.1" together with $schema: "https://lc-json.org/1.1-rc.1/course.schema.json"; a producer conforming to a later 1.1 final release emits the same specVersion value together with that release’s $schema URL (the /1.1/ path, which is reserved and is not populated until the 1.1 final release ships — see §8.3). specVersion does not include release-candidate suffixes — "1.1-rc.1" is not a conforming specVersion value.
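
The split between contract version and publication can be illustrated in Python. Both regular expressions below are assumptions written for this sketch; the authoritative specVersion pattern is the one enforced by the published schemas, and the helper names are invented here.

```python
import re

# Illustrative patterns only; the schemas hold the authoritative ones.
_CONTRACT_VERSION = re.compile(r"1\.\d+(\.\d+)?")
_SCHEMA_URL = re.compile(r"https://lc-json\.org/([^/]+)/[^/]+\.schema\.json")


def is_contract_version(spec_version):
    """True for values such as "1.0", "1.1", "1.1.1"; False for "1.1-rc.1"."""
    return _CONTRACT_VERSION.fullmatch(spec_version) is not None


def publication_path(schema_url):
    """Return the publication segment of a $schema URL, e.g. "1.1-rc.1" or "1.0",
    or None when the URL is not on the lc-json.org host in the expected shape."""
    match = _SCHEMA_URL.fullmatch(schema_url)
    return match.group(1) if match else None
```

A 1.1-rc.1 producer therefore emits a specVersion for which `is_contract_version` holds, and a `$schema` whose `publication_path` is `"1.1-rc.1"`; the release-candidate suffix appears only in the URL.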

4.7 Schema URL

A producer MUST emit a $schema URL pointing at the canonical published schema for its documentType at the spec version the producer conforms to. For example: a producer conforming to LC-JSON 1.1-rc.1 emits https://lc-json.org/1.1-rc.1/course.schema.json for courses and https://lc-json.org/1.1-rc.1/subject-collection.schema.json for subject collections; a producer conforming to 1.0 final emits https://lc-json.org/1.0/course.schema.json. A producer that emits a non-canonical URL or omits the field is non-conforming.

The strict producer / lenient consumer split (§3.2 above) is deliberate: emitting $schema makes documents self-describing for IDEs, schema dispatch, and ad-hoc validation; tolerating its absence on import preserves portability across older or otherwise-non-conforming producers without hard-failing re-imports.

4.8 Validation before emit

A producer SHOULD validate every emitted document against the published JSON Schemas before delivery. A producer that emits an invalid document is non-conforming with respect to that document.

4.9 Self-containment of vocabulary references (closure and carried copies)

SubjectCollection closure. A conforming SubjectCollection document is self-contained:

categories[] MUST include every category referenced by any tags[].categoryId. Categories are shared display buckets, not identity; consumers merge them by category id.
Every objectives[].tagIds entry MUST resolve to a member of the document’s own tags[].
Every tags[].parentId MUST be the member id of another tag in the document (parents are member ids, never slugs).
The parentId relation MUST be acyclic: no tag may be its own parent, and no sequence of parentId links may return to a tag already visited along that sequence. Because each tag has at most one parent, an acyclic relation is a forest — every tag reaches a root in finitely many steps. A document containing a parentId cycle of any length is non-conforming, and a consumer walking the hierarchy is entitled to assume termination.

A producer MUST NOT emit a SubjectCollection whose members link outside the document. A document that violates closure is non-conforming.
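
The four closure rules lend themselves to a single pre-emit check. The following Python sketch assumes that category entries carry their identifier in a field named `id` (the text above says only that categories are merged by category id); the function name and message wording are likewise illustrative.

```python
def closure_violations(collection):
    """Return human-readable closure violations for a SubjectCollection dict."""
    violations = []
    category_ids = {c["id"] for c in collection.get("categories", [])}
    tags = {t["id"]: t for t in collection.get("tags", [])}

    for tag_id, tag in tags.items():
        category = tag.get("categoryId")
        if category is not None and category not in category_ids:
            violations.append(f"tag {tag_id}: categoryId {category} not in categories[]")
        parent = tag.get("parentId")
        if parent is not None and parent not in tags:
            violations.append(f"tag {tag_id}: parentId {parent} is not a tag in this document")

    for objective in collection.get("objectives", []):
        for tag_id in objective.get("tagIds", []):
            if tag_id not in tags:
                violations.append(f"objective {objective['id']}: tagId {tag_id} not in tags[]")

    # Acyclicity: follow parentId links from every tag; revisiting a tag
    # along one chain means the chain contains a cycle.
    for start in tags:
        visited, current = set(), start
        while current in tags:
            if current in visited:
                violations.append(f"tag {start}: parentId chain reaches a cycle")
                break
            visited.add(current)
            current = tags[current].get("parentId")
    return violations
```

A producer would refuse to emit when the returned list is non-empty; a lenient consumer could surface the same list instead of rejecting the document.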

Carried copies in content documents. A course document may assign objectives that originate in a SubjectCollection (its courseObjectiveIds / objectiveIds arrays reference them). A producer emitting such a course MUST embed a copy of every referenced objective in the course’s objectives[] pool — with the member id preserved verbatim — so the document remains self-contained. Such an embedded copy is a carried copy: it travels for portability and does not transfer ownership or revise the member’s wording (§5.7 governs what a consumer does with it). A course document declaring specVersion 1.1 or later whose objective-id references do not all resolve within its own objectives[] pool is non-conforming. Documents declaring specVersion 1.0 retain their 1.0 meaning unchanged: unresolved objective references were an advisory (warning-tier) condition in 1.0 and remain so for 1.0 documents — this rule tightens only what 1.1 producers emit, which is what keeps 1.1 additive under §8.2.
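
A producer can verify the carried-copy rule before emitting a 1.1 course. The Python sketch below makes one assumption worth flagging: it looks for `courseObjectiveIds` and `objectiveIds` arrays at any depth of the document, because this section does not say where in the hierarchy those arrays live. The function name is also invented for this example.

```python
def unresolved_objective_refs(course):
    """Return the objective ids a course references but does not carry
    in its own objectives[] pool."""
    pool = {objective["id"] for objective in course.get("objectives", [])}
    referenced = set()

    def collect(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key in ("courseObjectiveIds", "objectiveIds") and isinstance(value, list):
                    referenced.update(value)
                else:
                    collect(value)
        elif isinstance(node, list):
            for child in node:
                collect(child)

    collect(course)
    return sorted(referenced - pool)
```

For a document declaring specVersion 1.1 or later, a non-empty result means the document is non-conforming; for a 1.0 document the same result is a warning-tier finding.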

The same pattern applies at document scale for glossaries: a producer emitting a course whose glossaryRefs reference glossaries it holds SHOULD embed a carried copy of each referenced glossary document (whole, with globalId and entry member ids preserved verbatim) in the course’s root glossaries[] pool, so that a single course file is self-contained. The obligation is SHOULD, not MUST: a glossary ref that resolves to no carried copy (a dangling ref) is legal, and consumers surface it without failing the import (see glossary-reference.md §4). The weaker obligation is deliberate: the course’s learner-facing content is complete without its glossary panel, which is not true of assigned objectives.

4.10 Alignment claims

A SubjectCollection may assert typed alignments to external frameworks and registries via externalAlignments[]. Each entry carries {claim, scheme, id, label}:

claim MUST be one of the 1.1 claim types: "references", "alignedTo", or "covers". The values "assesses" and "verifiedBy" are reserved for a future version; a 1.1 producer MUST NOT emit them. This vocabulary binds producers only and is deliberately not closed in the schema: a consumer never rejects a document over a claim value it does not recognize (§5.5) — the schema leaves claim open precisely so the §5.1 schema-validation obligation and the §5.5 preservation obligation cannot collide.
scheme and id MUST both be present and non-empty: scheme names the external namespace (e.g., a standards body, a national curriculum register, an official catalog), and id is the identifier within that namespace. label is optional display text.

External registries are referenced, never re-implemented: an alignment entry points at an external identifier; it does not embed or restate the external framework’s content. A consumer MUST NOT reject a document for carrying an alignment whose scheme it does not recognize, and MUST preserve alignment entries across read/write cycles; interpretation of any given scheme is consumer-defined.
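
The asymmetry between the producer and consumer obligations for alignment entries can be sketched in Python. The function names are assumptions for this example, and the scheme and id values in the usage note are made up.

```python
CLAIMS_1_1 = frozenset({"references", "alignedTo", "covers"})
RESERVED_CLAIMS = frozenset({"assesses", "verifiedBy"})


def producer_alignment_problems(entry):
    """Return the reasons a 1.1 producer must not emit this alignment entry."""
    problems = []
    claim = entry.get("claim")
    if claim in RESERVED_CLAIMS:
        problems.append(f"claim {claim!r} is reserved for a future version")
    elif claim not in CLAIMS_1_1:
        problems.append(f"claim {claim!r} is not a 1.1 claim type")
    for field in ("scheme", "id"):
        value = entry.get(field)
        if not isinstance(value, str) or not value:
            problems.append(f"{field} must be present and non-empty")
    return problems


def consumer_can_interpret(entry):
    """A consumer never rejects on claim; it only decides whether to interpret.
    Entries it cannot interpret are still preserved verbatim."""
    return entry.get("claim") in CLAIMS_1_1
```

For instance, an entry such as `{"claim": "verifiedBy", "scheme": "example-register", "id": "X-1"}` yields one producer problem, while a consumer reading it keeps the entry untouched and simply does not interpret the claim.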

4.11 Publication metadata

The distributable artifact types — Course, SubjectCollection, CurriculumPack, and Glossary — carry publication metadata as optional top-level fields: license, canonicalUrl, and derivedFrom[] (alongside the type’s existing authors/version fields). QuestionSet is excluded by role: it is a lightweight referencable resource, not a distribution-governed one. A glossary shares QuestionSet’s structural lightness (flat root, referencable) but is distribution-governed: glossaries are shareable, remixable artifacts, which is precisely what the publication fields exist for.

A producer SHOULD populate license on any document intended for distribution beyond its authoring environment. The value "unspecified" is appropriate only for private drafts.
derivedFrom[] entries ({globalId, version}) record provenance: the document(s) this one was revised or remixed from. A producer creating a new document by modifying an existing one SHOULD record the source there.
A producer MUST NOT encode commerce data (price, entitlement, buyer identity) in publication metadata or anywhere else in an LC-JSON document.

5. Consumer Conformance

A consumer is any tool that ingests LC-JSON documents from an external source.

Consumer conformance requires more than schema validation. Schema validation (§5.1) is necessary but not sufficient: a conforming consumer also satisfies the discriminator-handling rule (§5.3), the unknown-fields rule (§5.4), the reserved-enum-values rule (§5.5), the randomization requirements (§5.6), the member-identity rules where vocabulary members appear (§5.7), and, where reserved or unknown question types appear, the round-trip preservation obligations in §6. A generic JSON Schema validator alone does not implement these; consumers MUST implement the relevant §5.x and §6 obligations to claim conformance (see §10.3). See the worked example at the end of this section.

5.1 Strict validation

A consumer MUST validate incoming documents against the published JSON Schemas for the declared documentType and reject documents that fail schema validation.

Exception (§6 fallback for unknown types). Schema-validation failures whose only cause is one or more type discriminator values not present in the consumer’s implemented question-base.schema.json enum do not trigger §5.1 rejection. The consumer applies the §6 fallback to those questions (preserve verbatim, treat earned points as 0, render placeholder, report to user) and validates the rest of the document under §5.1. All other schema-validation failures — missing required fields, type mismatches, pattern violations on known fields, additionalProperties violations on closed objects, etc. — still trigger rejection. This carve-out is what makes §5.2’s “accept any 1.x specVersion” rule operable: a 1.0-only consumer reading a 1.x+ document with a future-minor question type satisfies both §5.1 and §6 by following this path.

Unimplemented artifact types. A consumer is not required to implement every artifact type. A consumer that does not implement a document’s declared documentType MUST reject that document cleanly, naming the unsupported type — it MUST NOT attempt a partial or coerced interpretation. Conformance claims are scoped per artifact type (§10.1).

5.2 Spec version handling

A consumer MUST accept any specVersion value whose major version it implements (e.g., a 1.x consumer accepts 1.0, 1.1, 1.0.1, …; the canonical pattern is enforced by course.schema.json / question-set.schema.json).

A consumer MUST reject specVersion values whose major version exceeds what it implements (a 1.x consumer rejects 2.0, 2.1, 3.0, …). The rejection SHOULD be a clear error indicating the unsupported spec version.

A consumer MUST NOT silently downgrade or interpret unknown spec versions.
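
A consumer-side gate for these rules might look like the Python sketch below. Two choices here are assumptions of the sketch rather than requirements stated above: the version-shape regular expression, and the decision to reject a lower major version as an unknown version (the text mandates rejection only for higher majors).

```python
import re

# Assumed shape: MAJOR.MINOR with an optional PATCH; ASCII digits only.
_VERSION_SHAPE = re.compile(r"(\d+)\.\d+(?:\.\d+)?", re.ASCII)


def accept_spec_version(spec_version, implemented_major=1):
    """Return spec_version when its major version is implemented;
    raise ValueError with a clear message otherwise."""
    match = _VERSION_SHAPE.fullmatch(spec_version)
    if match is None:
        raise ValueError(f"unrecognized specVersion: {spec_version!r}")
    major = int(match.group(1))
    if major != implemented_major:
        raise ValueError(
            f"unsupported specVersion {spec_version!r}: "
            f"this consumer implements {implemented_major}.x"
        )
    return spec_version
```

The function never rewrites the value it accepts, which keeps the no-silent-downgrade rule visible in the code: a document is either taken at its declared version or refused.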

5.3 Discriminator handling

A consumer MUST recognize canonical camelCase question-type discriminators and canonical lowercase item-type discriminators as defined in §4.2. Non-canonical casings are non-conforming and MUST be rejected.

5.4 Unknown fields

A consumer MUST NOT reject a document solely because it contains additional fields not defined by the schema. Such fields are reserved for forward-compatible additions and MUST be ignored or preserved at the consumer’s discretion.

5.5 Reserved enum values

A consumer MUST accept question types listed in question-base.schema.json’s enum even when no per-type schema is published for them. Full handling obligations — including round-trip preservation, learner-facing placeholder rendering, and grading semantics — are normative under §6 (Reserved and unknown types).

The same forward-compatible posture applies to alignment-claim values (§4.10): a consumer encountering an externalAlignments[].claim value outside the 1.1 set MUST NOT reject the document, MUST NOT interpret the claim, and MUST preserve the entry verbatim across read/write cycles.

5.6 Randomization requirements for matching and placement

For matching and placement questions, two surfaces a consumer presents to a learner have no author-defined order:

The choice pool, comprising every authored answer value (pairs[].match or categories[].label for matching; placements[].item for placement) plus any distractors. Source order would directly expose the correct-answer mapping (the N-th option being the correct answer for the N-th row or gap), defeating the question.
The row order in matching classification mode, where each row is one item to be classified. Source order is grouped by category — items belonging to categories[0] first, then categories[1], and so on — which directly exposes the answer (the first N rows all share the same category label).

A consumer MUST present both surfaces to learners in randomized order. A consumer MUST NOT render either surface in source order. The randomization algorithm and any seeding strategy are consumer-defined.

These requirements do not apply to:

multipleChoice and other single-question choice lists, where authors may deliberately position the correct option and the question schema’s own shuffleOptions field governs shuffle policy per question.
The order of pair rows in matching pairs mode, where each item has its own distinct match value and source row order does not directly expose the answer.
The order of items in ordering source-tile pools, where the question’s structural design requires the tile pool to be presented in non-source order regardless.
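
A choice pool for a matching question can be built and randomized as in the Python sketch below. The sketch takes a conservative reading of the MUST NOT above and reshuffles until the result differs from source order whenever that is possible. Distractors are passed in as a parameter because this section does not name the field that carries them; both function names are invented for this example.

```python
import random


def matching_choice_pool(question, distractors=()):
    """Collect the authored answer values of a matching question:
    pairs[].match in pairs mode, categories[].label in classification mode."""
    if "pairs" in question:
        values = [pair["match"] for pair in question["pairs"]]
    else:
        values = [category["label"] for category in question["categories"]]
    return values + list(distractors)


def randomized_order(values, rng=None):
    """Return the values in a randomized order that is not the source order,
    unless every value is identical (then only one ordering exists)."""
    rng = rng or random.Random()
    source = list(values)
    shuffled = list(source)
    if all(value == source[0] for value in source):
        return shuffled
    while shuffled == source:
        rng.shuffle(shuffled)
    return shuffled
```

Passing a seeded `random.Random` instance makes the order reproducible for one learner session, which is one possible seeding strategy; the specification leaves that choice to the consumer.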

5.7 Member-identity handling

When a consumer that maintains a persistent store of vocabulary members ingests a document carrying members (a SubjectCollection, or a course with carried copies per §4.9), the member ids govern reconciliation:

An incoming member id the consumer already holds identifies the same member. The consumer MUST NOT create a duplicate member for it.
Tags — record membership. When a SubjectCollection lists a tag the consumer already holds, the consumer records the tag’s membership in that collection. It MUST NOT duplicate the tag, and MUST NOT transfer or revoke the tag’s other memberships.
Objectives — link, never overwrite. When a document carries an objective the consumer already holds, the consumer links to its existing member. It MUST NOT modify the existing member’s wording, difficulty, or tag links from the incoming copy unless the incoming document is the member’s owning document (a revision of the owning SubjectCollection, or a re-import of the course that owns the objective). Carried copies (§4.9) are read-only with respect to the member they duplicate.
Ownership is determinable only as far as the wire permits — 1.1 carries no owner/provenance marker. An objective listed in a SubjectCollection’s objectives[] is owned by that collection (§3.4: collections own their objectives), and a consumer MAY treat that as an authoritative ownership claim. A course’s objectives[] pool is deliberately ambiguous — it holds the course’s own objectives and carried copies without distinction, and nothing on the wire says which — so a consumer cannot determine from a course document alone whether that course owns a given objective id or merely carries a copy of one owned elsewhere. Because of that, a consumer MUST NOT overwrite one document’s wording for an objective id with another’s on the basis of an inferred course ownership (link-never-overwrite): it reconciles by id, keeps the ingested wordings, and SHOULD surface a divergence between two sources for the same id rather than silently picking one, offering a fork (a new member id) as the keep-mine path. When a SubjectCollection listing an id is ingested, the collection’s wording MAY be treated as authoritative for that id from then on; absent any collection claim, an id seen only in course pools has no determinable owner and is treated as shared-by-reference. Two SubjectCollections both listing the same objective id is a document-set conflict the consumer SHOULD surface — the model gives objectives exactly one owner. A per-copy provenance marker on carried copies, which would let a consumer resolve course-level ownership deterministically, is reserved for a future minor version (alongside the reserved course→collection reference); until it exists, the obligations above are the whole of what a 1.1 consumer is required to do.
Absent members are created verbatim. A member id the consumer does not hold is created from the incoming copy with its id preserved — never re-minted — so that a later document carrying the same id reconciles to it.
Glossary entries reconcile like objectives. When a consumer re-ingests a glossary it already holds (same document globalId), entry ids govern the update: an entry id already held is the same entry (update in place, never duplicate); an absent id is created verbatim; ids are never re-minted. A glossary entry’s firstMention naming a lesson the consumer does not hold is treated as absent; a consumer that regenerates lesson globalIds on import MUST remap firstMention on glossaries imported alongside.
Identity-less members are rejected; closure is a producer-validity rule. The §4.9 closure rules and §3.4 identity rule are producer-emission requirements: a producer MUST NOT emit a violating SubjectCollection, and such a document is non-conforming (a conforming validator reports it as an error). On the consumer side the two rules differ in strictness, mirroring the strict-producer / lenient-consumer split of §3.2 and §5.1:
- A consumer MUST reject a SubjectCollection whose members lack ids — identity is non-optional (§3.4) and there is nothing to reconcile against.
- A consumer SHOULD reject a SubjectCollection that violates the §4.9 closure rules, reporting the specific violations; a lenient consumer MAY instead ingest it with the violations surfaced (for example, treating an unresolved categoryId as an uncategorized tag). Closure is what a producer must guarantee; a consumer is not obliged to fail an otherwise-usable document over it.
Display collisions never override identity. If an incoming member’s slug (or other display/lookup field) collides with a different member the consumer already holds, the consumer resolves the collision on the display field (e.g., by qualifying the incoming slug) — it MUST NOT merge the two members or reassign the id.

A consumer with no persistent member store (e.g., a single-document validator or converter) satisfies §5.7 vacuously, but MUST still preserve member ids verbatim across any read/write cycle.
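
The link-never-overwrite and create-verbatim rules for objectives can be sketched with an in-memory store. In this Python illustration the store is a plain dict keyed by member id, and the caller states whether the incoming document is an owning SubjectCollection; both the store shape and the function name are assumptions of the sketch.

```python
import copy


def ingest_objectives(store, incoming, from_owning_collection=False):
    """Reconcile incoming objective dicts into `store` (member id -> objective).

    Returns the ids whose incoming copy diverges from the member already held,
    so the caller can surface the divergence to a user.
    """
    divergent = []
    for objective in incoming:
        member_id = objective["id"]
        held = store.get(member_id)
        if held is None:
            # Absent member: created verbatim, id preserved, never re-minted.
            store[member_id] = copy.deepcopy(objective)
        elif from_owning_collection:
            # A revision of the owning collection may update the member.
            store[member_id] = copy.deepcopy(objective)
        elif held != objective:
            # Carried copy that differs: link to the held member, keep its
            # wording, and report the divergence instead of overwriting.
            divergent.append(member_id)
    return divergent
```

The store never gains a second member for an id it already holds, and a course's carried copy cannot change the wording the consumer holds.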

5.8 Consumer validation order (informative)

The §5.x obligations compose into one algorithm; implementers who follow it satisfy the strict-validation and forward-compatibility rules simultaneously:

1. Parse the JSON. A parse failure is fatal.
2. Dispatch the schema from documentType + the $schema URL (inferring from documentType + specVersion when $schema is absent, per §3.2). An unimplemented documentType is rejected cleanly, naming the type (§5.1).
3. Validate against the schema. A failure whose only cause is one or more unknown question-type discriminators routes those questions to the §6 fallback and the rest of the document continues (§5.1, Exception). Every other schema failure is import-fatal. Note what does not fail schema validation by design: unknown fields on open objects (§5.4; LC-JSON objects are open unless §7.1 names them closed), and vocabulary values the spec deliberately leaves schema-open because their vocabularies bind producers only (alignment claim, pack contentRefs[].type).
4. Apply the domain rules cataloged in VALIDATION.md at their stated tiers: ERROR-tier domain failures reject; WARN/NOTE-tier findings are surfaced, never fatal.
5. Ignore-or-preserve unknown fields (§5.4) and extension members (§7.4); reconcile members (§5.7); apply the §6 obligations to any fallback questions.
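
The first four steps of the order above can be sketched as one Python function. Schema validation and domain validation are represented by caller-supplied callables, since a real implementation would delegate to a JSON Schema library and to its own rule catalog; the callable signatures, the exception class, and the function name are all assumptions of this sketch.

```python
import json


class ImportRejected(Exception):
    """Raised when a document must be rejected outright."""


def import_document(text, implemented_types, schema_check, domain_check):
    """Run parse, dispatch, schema, and domain steps.

    schema_check(doc) yields (path, message, unknown_question_type_only) tuples.
    domain_check(doc) yields (severity, message) tuples.
    Returns (doc, fallback_paths, notices).
    """
    try:
        doc = json.loads(text)  # step 1: a parse failure is fatal
    except json.JSONDecodeError as exc:
        raise ImportRejected(f"not valid JSON: {exc}") from exc

    # Step 2: an unimplemented documentType is rejected, naming the type.
    doc_type = doc.get("documentType") if isinstance(doc, dict) else None
    if doc_type not in implemented_types:
        raise ImportRejected(f"unsupported documentType: {doc_type!r}")

    # Step 3: unknown question types go to the fallback; anything else is fatal.
    fallback_paths, fatal = [], []
    for path, message, unknown_type_only in schema_check(doc):
        if unknown_type_only:
            fallback_paths.append(path)
        else:
            fatal.append(message)
    if fatal:
        raise ImportRejected("; ".join(fatal))

    # Step 4: ERROR-tier findings reject; WARN and NOTE are surfaced.
    errors, notices = [], []
    for severity, message in domain_check(doc):
        (errors if severity == "ERROR" else notices).append(message)
    if errors:
        raise ImportRejected("; ".join(errors))
    return doc, fallback_paths, notices
```

The final step (ignoring or preserving unknown fields, reconciling members, and applying the fallback obligations) is left to the caller, which receives the paths of the questions that need the fallback.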

Forward compatibility: three look-alike situations (informative)

A 1.0-conformant consumer reading a 1.x document may encounter three superficially-similar cases at the JSON layer, each governed by a different consumer obligation. A generic JSON Schema validator handles none of them automatically.

1. An unknown top-level field on a question. Example: "explanationVideoUrl": "..." appears on a multipleChoice question. Under §5.4 (Unknown fields), the consumer MUST NOT reject the document; it ignores or preserves the field at its discretion.
2. An extension-namespaced field. Example: "x-somecompany-difficultyBand": "B2" appears on the same question. Under §7 (Extensions), the consumer MUST NOT reject for it and SHOULD preserve it verbatim across read/write cycles.
3. An unknown type discriminator value. Example: a question carries "type": "novelCodingTask", a value the consumer’s implemented question-base.schema.json enum does not include. Per §6.1, reserved and unknown types are handled identically: it does not matter whether novelCodingTask is destined for a future minor version, is a vendor-specific extension type, or will never be standardized at all. Under §5.1 (Strict validation, Exception) and §6.2 (Consumer obligations), the consumer applies the §6 fallback to that question (preserve verbatim, treat earned points as 0, render a placeholder naming the type, report to user) and validates the rest of the document. Note that earned points are set to 0, but the question’s possible points still count toward the item’s total. The item’s maximum is consumer-independent by design, so a learner who completes the item in a fuller consumer can earn all the points the producer declared while a learner in a more limited consumer earns whatever subset they can; both report grades against the same denominator. Under §6.4 (Round-trip preservation), if the consumer re-exports the document, the novelCodingTask question is preserved with every member, value, and nested structure intact (semantic preservation; key order is producer-discretion per §6.2).

These three cases look similar at the JSON layer but are not interchangeable. Implementers using a generic JSON Schema validator (jsonschema for Python, Ajv for JavaScript, etc.) MUST add the §5.x and §6 fallback logic around the base validation call. This matters most for case 3: a generic validator would reject the whole document on the unknown "novelCodingTask" enum value, whereas §5.1’s Exception permits the rest of the document to validate while §6 governs the unknown-type question.
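Informative sketch (not part of the specification): one way a consumer might sort a question into the three cases before handing the rest of the document to a generic validator. The names IMPLEMENTED_TYPES, KNOWN_FIELDS, and classify_question are hypothetical, and the field set shown is an assumed subset of question-base, not the published list.

```python
from typing import Any

# Assumption: the type values this consumer implements. A real consumer would
# derive this from its own copy of question-base.schema.json.
IMPLEMENTED_TYPES = {"multipleChoice"}

# Assumption: the core question members this consumer knows about.
KNOWN_FIELDS = {"type", "globalId", "points", "prompt"}


def classify_question(question: dict[str, Any]) -> dict[str, Any]:
    """Sort one question object into the three cases described above."""
    qtype = question.get("type")
    if not isinstance(qtype, str) or qtype not in IMPLEMENTED_TYPES:
        # Case 3: unknown or reserved type. The section 6 fallback applies to
        # this question only; the rest of the document is still validated.
        return {"fallback": True, "extensions": [], "unknown": []}
    # Case 2: extension-namespaced members (never a reason to reject).
    extensions = [k for k in question if k.startswith("x-")]
    # Case 1: unknown non-extension members (never a reason to reject).
    unknown = [k for k in question
               if k not in KNOWN_FIELDS and not k.startswith("x-")]
    return {"fallback": False, "extensions": extensions, "unknown": unknown}
```

A consumer built this way would call the strict schema validator only for questions where fallback is False, and would route the others to its §6 handling.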

6. Reserved and Unknown Types

6.1 Definitions

A reserved type is a type discriminator value listed in question-base.schema.json’s discriminator enum that does not have a published per-type schema in this spec version. The 1.0 reserved types are: association, hotspot, graphicGapMatch, graphicAssociate, graphicOrder, fileUpload, and mediaPromptedEssay.

An unknown type is a type discriminator value not listed in question-base.schema.json’s discriminator enum. Unknown types may appear in 1.x+ documents read by 1.0-only consumers.

For the purposes of this section, reserved and unknown types are handled identically.

6.2 Consumer obligations

When a consumer encounters a question whose type is reserved or unknown, the consumer:

MUST preserve every member of the question object across read/write cycles — every field name, every value, every nested object and array, and any extension fields present on import. No field dropping, no value mutation, no globalId rewriting. (Key order within JSON objects is producer-discretion: producers SHOULD preserve input key order for authoring ergonomics and diff stability, but consumers are not required to — JSON object members are unordered per RFC 8259 §4.)
MUST NOT silently drop the question from the parent item’s questions[] array. The question’s existence is preserved even when its rendering is not supported.
MUST treat the question’s earned points as 0 for grading purposes. The question’s possible points still count toward the item’s total — the maximum is not silently reduced.
MUST report the unsupported question to the user (or upstream caller) at import time, naming the type and the question’s globalId. Form is implementation-defined (UI banner, log line, returned warning), but the report is required.
SHOULD render a non-interactive placeholder in the learner UI naming the type. Example: “Question type ‘hotspot’ is not supported by this application. Skip to the next question.”
SHOULD disable navigation gating for unsupported questions (e.g. do not block lesson completion just because a reserved question was not answered).
MAY offer the learner a way to view the raw question data (instructor preview, debug mode), but MUST NOT expose internal field names to the learner UI by default.
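Informative sketch: the grading rule above, with earned points forced to 0 for unsupported questions while their possible points stay in the denominator. The function name score_item and the earned_by_id mapping are hypothetical; only type, globalId, and points come from this document.

```python
from typing import Any


def score_item(questions: list[dict[str, Any]],
               implemented_types: set[str],
               earned_by_id: dict[str, float]):
    """Return (earned, possible, unsupported) for one item's questions."""
    earned = 0.0
    possible = 0.0
    unsupported = []  # (type, globalId) pairs to report at import time
    for q in questions:
        # Possible points always count, so the maximum is never reduced.
        possible += q["points"]
        if q["type"] in implemented_types:
            earned += earned_by_id.get(q["globalId"], 0)
        else:
            # Unsupported question: earned points are treated as 0.
            unsupported.append((q["type"], q["globalId"]))
    return earned, possible, unsupported
```

For an item holding a 2-point multipleChoice question answered correctly and a 3-point hotspot question the consumer cannot render, this yields 2 earned out of 5 possible, plus one entry to report.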

6.3 Producer obligations

With respect to reserved types, a producer:

SHOULD NOT emit reserved question types in 1.0 documents intended for cross-implementation distribution. Reserved types are explicitly tool-specific extensions until promoted in a future version.
MUST still satisfy question-base.schema.json if it does emit them: valid type, valid globalId, valid points, valid prompt, plus any other question-base requirements. Consumers’ fallback handling can only operate on a structurally well-formed object.
SHOULD document in the tool’s README which reserved types it emits and which fields it populates, so other tool authors can interoperate or contribute.

6.4 Round-trip preservation

A consumer that imports an LC-JSON document, modifies it, and re-exports MUST preserve every member of every reserved-type question in the exported document — including their globalId, type, points, prompt, and any additional fields that were present on import. No field dropping, no value mutation. (Key order within JSON objects is producer-discretion per §6.2; the preservation obligation is semantic, not byte-level.)

The intent is that a teacher exporting from a consumer that does not support hotspot can take the file back to a consumer that does, without losing the hotspot question. This is the core interop guarantee for reserved types: consumers MUST NOT strip reserved questions on export even if they cannot render them on import.
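Informative sketch: a round-trip test helper that compares an imported question with its re-exported form semantically, ignoring object key order but not array order. The name same_members and the regions field in the usage are hypothetical. The type check is an assumption of this sketch, included because plain Python equality would treat true and 1 as equal; it also treats 1 and 1.0 as different, which an implementation may choose to relax.

```python
import json
from typing import Any


def same_members(a: Any, b: Any) -> bool:
    """True when two parsed JSON values match member for member."""
    if type(a) is not type(b):
        return False  # e.g. true rewritten as 1 counts as a mutation here
    if isinstance(a, dict):
        # Key order is irrelevant: compare key sets, then each value.
        return a.keys() == b.keys() and all(same_members(a[k], b[k]) for k in a)
    if isinstance(a, list):
        # Array order is significant.
        return len(a) == len(b) and all(same_members(x, y) for x, y in zip(a, b))
    return a == b


imported = json.loads('{"type": "hotspot", "globalId": "g", "points": 2, '
                      '"prompt": "p", "regions": [{"x": 1, "y": 2}]}')
exported = json.loads('{"prompt": "p", "points": 2, "regions": [{"y": 2, "x": 1}], '
                      '"globalId": "g", "type": "hotspot"}')
preserved = same_members(imported, exported)
```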

6.5 Producer guidance (informative)

To make a reserved-type question maximally compatible with future first-class implementations and other producers emitting the same name:

Use the published reserved name exactly (hotspot, not Hotspot or hotspot-question).
Always populate globalId (UUID), points, and prompt.
Use additional fields conservatively: anything beyond question-base is by convention only until a future version publishes a per-type schema. Document any tool-specific extensions in your README.
Avoid generic field names that future per-type schemas may use canonically (data, config, settings).

This subsection is informative — producers that do not follow it still produce valid LC-JSON. But the future first-class schemas are likelier to land cleanly if 1.0 producers stay within the spirit.

Note (1.1): this version promotes no reserved question types to first-class schemas; the 1.0 reserved-type list is unchanged in 1.1.

7. Extensions

LC-JSON is deliberately small. Tools frequently need to attach data that is meaningful to themselves but is not part of the interchange contract — authoring provenance, internal identifiers, editor state, analytics hints. Namespaced extensions provide a forward-compatible, collision-free way to carry such data without polluting the core format or requiring a spec revision.

7.1 Extension members

An extension member is an object member whose key begins with the prefix x- followed by a vendor or tool namespace, for example x-acme-reviewState or x-acme.lineage.

Extension members MAY appear on the document root and on any Course, Unit, Lesson, Item, or Question object, and — in vocabulary, arrangement, and glossary documents — on any category, tag, objective, alignment, reference, or glossary entry. They MUST NOT be added to objects whose schema declares additionalProperties: false (in 1.0, the matching pair/category entries and placement entries), because those objects are closed by contract and would fail validation.

The x- prefix is reserved exclusively for extensions. A producer MUST NOT introduce a non-extension field whose name begins with x-.

7.2 Namespacing

The segment immediately following x- is the namespace and MUST identify the originating tool or vendor (e.g. x-acme). Namespacing prevents two tools from colliding on the same key with incompatible meanings. A producer MUST NOT emit an extension member under a namespace it does not own.

A namespace owner SHOULD document the extension members it emits — their shape and meaning — in its public implementation notes (for known implementations, in IMPLEMENTATIONS.md).
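Informative sketch: extracting the namespace from an extension key and flagging members under namespaces a producer does not own. This document does not define the delimiter that ends the namespace segment; the sketch assumes it ends at the first "-" or "." after the prefix, which matches the examples x-acme-reviewState and x-acme.lineage. The names EXT_KEY, extension_namespace, and foreign_extension_keys are hypothetical.

```python
import re
from typing import Any, Optional

# Assumption: a namespace is a run of ASCII letters and digits after "x-".
EXT_KEY = re.compile(r"^x-([A-Za-z0-9]+)(?:[-.].+)?$")


def extension_namespace(key: str) -> Optional[str]:
    """Return e.g. 'x-acme' for 'x-acme-reviewState', or None if not an extension."""
    m = EXT_KEY.match(key)
    return f"x-{m.group(1)}" if m else None


def foreign_extension_keys(obj: dict[str, Any], owned: set[str]) -> list[str]:
    """Extension keys on one object whose namespace is not in `owned`."""
    return [k for k in obj
            if k.startswith("x-") and extension_namespace(k) not in owned]
```

A producer could run foreign_extension_keys over each object it is about to emit as a pre-export check; members it merely preserves from an imported document would need to be excluded from that check.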

7.3 Additive-only constraint

Extensions are strictly additive. A producer MUST NOT encode in an extension member any data required for a baseline-correct interpretation of the document. A consumer that ignores every extension member MUST still obtain a complete and correct learning experience. Equivalently: removing all x- members from a conforming document MUST leave a conforming document with equivalent learner-facing meaning.

This keeps extensions from degenerating into a shadow format that fragments the ecosystem.
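Informative sketch: the additive-only rule suggests a simple producer self-test, namely strip every x- member and confirm the result is still a conforming document. The helper below performs only the stripping step; the name strip_extensions is hypothetical, and validation of the stripped result is left to whatever validator the tool already uses.

```python
from typing import Any


def strip_extensions(node: Any) -> Any:
    """Return a copy of a parsed JSON value with every x- member removed."""
    if isinstance(node, dict):
        return {k: strip_extensions(v)
                for k, v in node.items() if not k.startswith("x-")}
    if isinstance(node, list):
        return [strip_extensions(v) for v in node]
    return node  # strings, numbers, booleans, null are returned unchanged
```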

7.4 Consumer obligations

A consumer MUST NOT reject a document solely because it contains extension members (this restates §5.4 for the namespaced case).

A consumer MUST NOT interpret an extension member outside its own namespace as having any defined meaning. A consumer MAY read and act on extension members within namespaces it understands.

A consumer that imports, modifies, and re-exports a document SHOULD preserve extension members it does not understand, re-attaching each to the same object it arrived on (identified by globalId where the object carries one). A consumer that preserves all unrecognized extension members across a round trip is said to be extension-preserving; a consumer that cannot SHOULD document the loss.

The SHOULD — rather than MUST — acknowledges that some consumers have fixed internal storage with nowhere to hold arbitrary foreign data. But preservation is what lets a tool use LC-JSON as a faithful transfer or backup format for its own tool-specific state: a document that round-trips through an extension-preserving consumer comes back whole, including data that consumer never understood.

7.5 Producer obligations

A producer MAY emit extension members under namespaces it owns, subject to §7.1–§7.3. A producer MUST keep extension content well-formed JSON. A producer SHOULD prefer extension members over overloading core fields (for example, encoding private state in tags or title) for tool-specific data.

8. Versioning and Stability

8.1 Semantic versioning

Spec versions follow a semver-style scheme: MAJOR.MINOR[.PATCH].

A major version bump (e.g., 1.x → 2.0) signifies a breaking change. New schemas are published at a new URL path (/2.0/).
A minor version bump (e.g., 1.0 → 1.1) signifies an additive change. New schemas are published at a new URL path (/1.1/).
A patch bump signifies non-normative fixes (description text, examples, clarifications). No URL change.
A release candidate of an upcoming version X.Y carries the version label X.Y-rc.N (where N is 1, 2, …) and is published at its own URL path /X.Y-rc.N/. RCs allow non-breaking refinements between the candidate and the accepted final release; each RC is its own immutable publication. The final X.Y release is published at /X.Y/ only when accepted. Documents pinned to /X.Y-rc.N/ do not auto-promote to /X.Y/ — adopting the final release is an explicit publisher choice (typically a re-export against the new schema URL).

8.2 Definition of “breaking”

For the purposes of §8.1, a change is breaking if and only if it causes a previously-conforming document to stop validating under the new schema, or to change in meaning under the new schema (i.e., a field that previously had one interpretation now has another).

Loosening the schema so that a previously-non-conforming document begins to validate is not breaking by this definition: documents that already conformed continue to conform with unchanged meaning. The additive examples below rely on this asymmetry.

Examples of breaking changes:

Renaming a property.
Removing an enum value that existing documents may have used.
Tightening a constraint (e.g., reducing a string’s maxLength below an existing value’s length).
Adding a new required property.
Changing a property’s type.

Examples of additive changes:

Adding an optional property.
Adding an enum value.
Loosening a constraint (e.g., increasing maxLength).
Removing a property from an object’s required list (the field becomes optional).
Adding an entirely new artifact type with its own documentType value.

LC-JSON 1.1 is additive by this definition: its changes are three new artifact types, new optional properties on the course document (publication metadata at the root; glossaryRefs at course/unit/lesson), and new enum values. Every conforming 1.0 document validates unchanged under the 1.1 schemas with unchanged meaning.
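Informative sketch: a rough classifier for a few of the breaking-change examples above, applied to one object schema. It is deliberately conservative and incomplete: it flags any reduction of maxLength, assumes enum values are simple scalars, does not follow $ref or nested schemas, and cannot detect changes in meaning. The name breaking_reasons is hypothetical.

```python
def breaking_reasons(old: dict, new: dict) -> list[str]:
    """List reasons the change from schema `old` to schema `new` looks breaking."""
    reasons = []
    added_required = set(new.get("required", [])) - set(old.get("required", []))
    reasons += [f"new required property: {n}" for n in sorted(added_required)]
    new_props = new.get("properties", {})
    for name, spec in old.get("properties", {}).items():
        if name not in new_props:
            reasons.append(f"property removed or renamed: {name}")
            continue
        nspec = new_props[name]
        if spec.get("type") != nspec.get("type"):
            reasons.append(f"type changed: {name}")
        if "enum" in spec and "enum" in nspec:
            removed = [v for v in spec["enum"] if v not in nspec["enum"]]
            if removed:
                reasons.append(f"enum values removed from {name}: {removed}")
        old_max, new_max = spec.get("maxLength"), nspec.get("maxLength")
        if new_max is not None and (old_max is None or new_max < old_max):
            reasons.append(f"maxLength tightened: {name}")
    return reasons
```

An empty result means none of the checked patterns fired, not that the change is proven additive.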

8.3 URL stability

Schemas at any published version path, released versions and release candidates alike, MUST remain available at that URL with byte-identical content (modulo whitespace) for the lifetime of the specification. Specifically:

https://lc-json.org/1.0/*.schema.json MUST resolve to the 1.0 schemas indefinitely once 1.0 final is published.
https://lc-json.org/1.0-rc.N/*.schema.json MUST resolve to the rc.N schemas indefinitely once rc.N is published.
These URLs MUST NOT be redirected to a different schema, even one that is “compatible” or “improved.”
These URLs MUST NOT be moved to a non-canonical host.
The /X.Y/ URL path MUST NOT be populated until X.Y final is published; serving rc.N content at /X.Y/ is non-conforming and prevents downstream documents from distinguishing the candidate from the final release.

This guarantee enables conforming documents to embed $schema URLs that remain valid for the document’s entire lifetime in archives, version-control systems, and offline contexts — including across rc.N → final transitions, where rc.N documents continue to validate against their original rc.N URL indefinitely.

8.4 Version-path forward compatibility

A document is validated against the schemas at the URL given in its $schema field — that URL is the document’s canonical schema location and the binding target for conformance. The specVersion field declares the spec version the document targets; the $schema URL identifies the specific schema publication (release or release candidate) it was authored against. Both MUST be present (§3.2) and MUST agree on the targeted version (§4.6, §4.7): a document declaring specVersion: "1.0" MUST point $schema at either /1.0/ (the final release, once published) or a /1.0-rc.N/ candidate path; a document declaring specVersion: "1.1" MUST point $schema at /1.1/ or a /1.1-rc.N/ candidate path.

Reminder (§4.6): specVersion never carries an -rc.N suffix. Every document targeting the 1.1 contract — whether authored against an rc.N candidate or 1.1 final — declares specVersion: "1.1". The specific publication is identified by $schema. For example, a document authored during the 1.1-rc.1 phase looks like:

{
  "$schema":     "https://lc-json.org/1.1-rc.1/subject-collection.schema.json",
  "documentType": "subjectCollection",
  "specVersion":  "1.1",
  ...
}

It follows that:

A document declaring specVersion: "1.0" with $schema pointing at /1.0/ MUST validate against the schemas published at /1.0/. The 1.0 final release has shipped; /1.0/ is populated and frozen (§8.3).
A document declaring specVersion: "1.0" with $schema pointing at /1.0-rc.N/ MUST validate against the schemas published at /1.0-rc.N/ and is not required to validate against /1.0/. The rc.N → final transition is an explicit publisher choice (see §8.1, §8.3) — a re-export against the new $schema URL — not an automatic upgrade.
A document declaring specVersion: "1.1" MUST validate against the schemas at its declared $schema URL. Backward compatibility runs the other way and is defined by consumer obligation, not by cross-schema validation: a 1.1 consumer MUST continue to accept any valid 1.0 document (1.1 adds artifact types and optional fields; it removes nothing a 1.0 document relies on). “Validate a 1.1 document against 1.0 schemas” is not defined — the 1.1 artifact types have no 1.0 schema, and additive fields would fail a strict 1.0 schema.
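Informative sketch: checking that specVersion and $schema agree on the targeted version. The path pattern assumes schema URLs of the form https://lc-json.org/X.Y/name.schema.json or https://lc-json.org/X.Y-rc.N/name.schema.json, as in the examples in §8.3 and above; the names SCHEMA_PATH and schema_matches_spec_version are hypothetical.

```python
import re
from urllib.parse import urlparse

# Group 1 is the X.Y version; an optional -rc.N suffix is allowed in the path only.
SCHEMA_PATH = re.compile(r"^/(\d+\.\d+)(?:-rc\.[1-9]\d*)?/[^/]+\.schema\.json$")


def schema_matches_spec_version(doc: dict) -> bool:
    """True when both fields are present and name the same X.Y version."""
    spec_version = doc.get("specVersion")
    schema_url = doc.get("$schema")
    if not isinstance(spec_version, str) or not isinstance(schema_url, str):
        return False  # both fields are required
    parsed = urlparse(schema_url)
    if parsed.scheme != "https" or parsed.netloc != "lc-json.org":
        return False  # assumption: only the canonical host is accepted
    m = SCHEMA_PATH.match(parsed.path)
    # specVersion never carries an -rc.N suffix, so "1.1-rc.1" fails here.
    return bool(m) and m.group(1) == spec_version
```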

9. Deprecation

A field, discriminator value, or shape may be deprecated in a minor version and removed in a subsequent major version.

9.1 Deprecation marking

Deprecated fields MUST be marked with "deprecated": true in their schema definition and SHOULD include a description referencing their replacement.

9.2 Producer behavior for deprecated fields

A producer MUST NOT emit deprecated fields in new documents. A producer that re-emits previously-imported documents MAY preserve deprecated fields it received, but SHOULD prefer to emit only the canonical replacement.

9.3 Consumer behavior for deprecated fields

A consumer MUST continue to accept deprecated fields for the lifetime of the major version that introduced the deprecation. Removal is permitted only at the next major version bump.
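Informative sketch: how a producer's export check might find deprecated members in an object, given the "deprecated": true marking from §9.1. It inspects only the top-level properties of one schema, and the field name oldTitle in the usage is invented for illustration, since nothing is currently deprecated. The function names are hypothetical.

```python
def deprecated_properties(schema: dict) -> set[str]:
    """Names of top-level properties marked "deprecated": true."""
    return {name for name, spec in schema.get("properties", {}).items()
            if isinstance(spec, dict) and spec.get("deprecated") is True}


def deprecated_in_use(schema: dict, obj: dict) -> list[str]:
    """Deprecated property names that actually appear on `obj`, sorted."""
    return sorted(obj.keys() & deprecated_properties(schema))


# Hypothetical schema fragment: "oldTitle" is not a real LC-JSON field.
example_schema = {"properties": {
    "oldTitle": {"type": "string", "deprecated": True, "description": "Use title."},
    "title": {"type": "string"},
}}
flagged = deprecated_in_use(example_schema, {"oldTitle": "x", "title": "y"})
```

A consumer would use the same list only to warn, never to reject, since deprecated fields remain acceptable for the rest of the major version.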

9.4 Currently deprecated items

No items are deprecated in 1.0 or 1.1. The specification ships clean.

10. Conformance Claims

10.1 Base LC-JSON conformance

A tool MAY claim conformance to LC-JSON 1.1 as follows:

“Conforms to LC-JSON 1.1 as a producer” — the tool emits documents satisfying §4.
“Conforms to LC-JSON 1.1 as a consumer” — the tool ingests documents satisfying §5, §6, §7, and the accessibility-preservation obligations of §12.1.
“Conforms to LC-JSON 1.1” without qualification — the tool implements both producer and consumer conformance.

Conformance is scoped to the artifact types the tool implements. A tool that implements only Course and QuestionSet MAY claim LC-JSON 1.1 conformance for those artifact types — provided it satisfies every applicable requirement, including §5.1’s clean rejection of documentTypes it does not implement and §5.2’s acceptance of 1.x specVersion values on the types it does. A claim SHOULD name its artifact-type scope when it is narrower than the full set (e.g., “Conforms to LC-JSON 1.1 as a consumer (course, questionSet)”). A claim without a scope qualifier asserts all five artifact types.

10.2 LC-JSON Accessibility Profile conformance (opt-in)

A tool that additionally satisfies the obligations in ACCESSIBILITY.md MAY claim:

“Conforms to the LC-JSON 1.1 Accessibility Profile as a producer” — the tool emits documents satisfying §4 plus the producer-side obligations in ACCESSIBILITY.md §§2–7.
“Conforms to the LC-JSON 1.1 Accessibility Profile as a consumer” — the tool ingests and renders documents satisfying §5/§6/§7/§12.1 plus the consumer-side MUST-level obligations in ACCESSIBILITY.md §§2–8.
“Conforms to the LC-JSON 1.1 Accessibility Profile” without qualification — both producer and consumer.

A consumer claiming the Accessibility Profile MUST satisfy all MUST-level items in ACCESSIBILITY.md §§2–8 for its role; partial satisfaction is a misclaim. See §12 for the profile’s binding text.

10.3 Claim accuracy

A tool MUST NOT claim conformance unless it satisfies all applicable MUST requirements. A tool MAY publish self-test results against the conformance test corpus (see tests/) as evidence.

Three rules guard against the predictable misclaims:

Producer ≠ consumer. Claim only the roles the tool actually satisfies; a producer-side conformance claim does not extend to the consumer role without satisfying §5.
The Accessibility Profile is fully bound. Claiming the Accessibility Profile means every MUST-level item in ACCESSIBILITY.md §§2–8 (for the claimed role) is satisfied. Partial profile claims are misclaims.
LC-JSON does not certify WCAG conformance. The LC-JSON Accessibility Profile provides the wire-format affordances and consumer-rendering obligations that enable WCAG 2.1 AA delivery; a delivering consumer’s own WCAG claim (under EN 301 549, DOJ ADA Title II, Section 508, Section 504, or equivalent) is separate and remains the consumer’s responsibility.

10.4 Suggested wording (informative)

Implementers may use the following short forms for marketing pages, badges, READMEs, and footers. They are advisory — formal claims live in §10.1 and §10.2.

Tier 1 — Base LC-JSON 1.1 conformance

Form	Wording
Badge	LC-JSON 1.1 Compatible
Sentence	“Reads and writes LC-JSON 1.1 — the open Learning Content JSON specification at lc-json.org.”
Formal	“Conforms to LC-JSON 1.1 as a producer / consumer / producer and consumer.”

Tier 2 — LC-JSON 1.1 Accessibility Profile

Form	Wording
Badge	LC-JSON 1.1 Accessibility Profile
Sentence	“Delivers LC-JSON 1.1 content with accessible rendering — keyboard navigation, screen-reader support, captions, language-aware text direction. Conforms to the LC-JSON 1.1 Accessibility Profile.”
Formal	“Conforms to the LC-JSON 1.1 Accessibility Profile as a producer / consumer / producer and consumer.”

Role qualifiers ((producer) / (consumer)) SHOULD accompany the badge or sentence when the implementation supports only one role, so readers do not infer capabilities the tool does not provide.

A Tier 2 claim implies Tier 1 (the Accessibility Profile is additive to base conformance); no double-badging is needed.

10.5 Trademark

Trademark rights in “LC-JSON” and “Learning Content JSON” are not asserted against conformance claims. Any tool meeting the requirements above MAY freely state its conformance and use the suggested wording in §10.4.

11. HTML Safety Profile

LC-JSON permits HTML in two fields: ContentItem.html and SignpostItem.customHtml. The complete normative HTML safety profile — allowed elements, allowed attributes, URL-scheme allowlist, sanitization obligation, link normalization, media handling, and unknown-element handling — is specified in HTML_SAFETY.md. SubjectCollection, CurriculumPack, and Glossary documents carry no HTML-bearing fields; their text fields (name, description, objective text, glossary definition, translation values, …) are plain text, and a consumer SHOULD render them as such.

A producer that emits HTML in any HTML-bearing field MUST emit only constructs permitted by HTML_SAFETY.md §2 (elements), §3 (attributes), and §4 (URL schemes).

A consumer that renders HTML from any HTML-bearing field MUST sanitize the HTML against HTML_SAFETY.md §5 before rendering, MUST normalize <a target="_blank"> to include rel="noopener noreferrer" per HTML_SAFETY.md §6.1, and MUST strip-while-preserving-text any unknown element per HTML_SAFETY.md §6.2. A consumer MUST reject any document containing forbidden constructs listed under HTML_SAFETY.md §8.1 (<script>, event handlers, javascript:/vbscript: URLs, etc.).
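Informative sketch: the link-normalization step only, applied to the already-parsed attributes of one <a> element. This is not a sanitizer and does not stand in for the rules in HTML_SAFETY.md, whose details are not reproduced here. The name normalize_link_attrs is hypothetical, and the case-insensitive comparison and preservation of existing rel tokens are assumptions of this sketch.

```python
def normalize_link_attrs(attrs: dict[str, str]) -> dict[str, str]:
    """Return a copy of an <a> element's attributes with rel normalized."""
    out = dict(attrs)
    if out.get("target", "").lower() == "_blank":
        tokens = out.get("rel", "").split()
        present = {t.lower() for t in tokens}
        for required in ("noopener", "noreferrer"):
            if required not in present:
                tokens.append(required)  # keep existing tokens, add missing ones
        out["rel"] = " ".join(tokens)
    return out
```

In a real consumer this would run inside the sanitization pass, after disallowed elements and attributes have already been removed.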

HTML_SAFETY.md is normative and forms part of LC-JSON 1.1. The split into a separate document reflects its length, not its status.

12. Accessibility Profile

LC-JSON’s accessibility model distinguishes two layers: preservation of accessibility metadata across read/write cycles (binding on every conforming consumer), and delivery of accessible rendering to end users (binding only when the Accessibility Profile is claimed).

The motivating concern is that accessibility information must survive transformation. In real ecosystems, educational content is exported, imported, translated, edited, and repackaged across multiple tools; accessibility failures most commonly occur during these transformations rather than during original authoring — alt text silently removed during save operations, transcripts discarded during export, localized accessibility text overwritten, unknown accessibility fields stripped by intermediate tools. The accessibility-preservation floor (§12.1) protects the format against that failure mode in every conforming consumer. The Accessibility Profile (§12.2) is the opt-in commitment to also deliver the affordances accessibly.

12.1 Base-conformance accessibility preservation

A conforming consumer that re-emits a document MUST NOT degrade its accessibility shape. Specifically:

alt attributes on <img> MUST round-trip.
<track> elements (including kind, src, srclang, label, default) on <video> and <audio> MUST round-trip.
lang and dir attributes on HTML-bearing elements MUST round-trip.
The required document-root language field MUST round-trip. The document-root supportLanguage field MUST round-trip when present, including explicit null.
Reserved-type questions MUST round-trip with any accessibility metadata they carry, per §6.4.
Extension-preserving consumers (§7.4) SHOULD round-trip x--namespaced extension members that carry accessibility data.

These obligations are part of base LC-JSON conformance; a consumer claiming “Conforms to LC-JSON 1.1 as a consumer” satisfies them. The HTML safety profile (§11 / HTML_SAFETY.md) explicitly allows alt, <track>, lang, and dir on every applicable element class to make this preservation possible.
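Informative sketch: a round-trip test that collects the accessibility-relevant markup from an HTML-bearing field before and after re-emission and checks that nothing was lost. It covers only the HTML items in the list above (alt, <track>, lang, dir) and uses the Python standard library parser; the class and function names are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser


class A11yCollector(HTMLParser):
    """Records alt on <img>, every <track> with its attributes, and lang/dir."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and "alt" in attrs:
            self.found.append(("img", "alt", attrs["alt"]))
        if tag == "track":
            # Sort so attribute order in the markup does not matter.
            self.found.append(("track",) + tuple(sorted(attrs.items())))
        for name in ("lang", "dir"):
            if name in attrs:
                self.found.append((tag, name, attrs[name]))


def a11y_shape(html: str) -> Counter:
    collector = A11yCollector()
    collector.feed(html)
    collector.close()
    return Counter(collector.found)


def a11y_preserved(before: str, after: str) -> bool:
    """True when every item found in `before` is still present in `after`."""
    return not (a11y_shape(before) - a11y_shape(after))
```

The comparison is one-directional on purpose: an export that adds accessibility markup passes, while one that drops or alters any of it fails.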

Base conformance is preservation only: it never requires a producer to author accessibility content (alt text, captions, transcripts). A small or non-institutional producer is therefore never non-conforming for omitting them — the reference validator surfaces omissions as non-blocking warnings. The obligation to author accessibility content is part of the opt-in Accessibility Profile (§12.2). The two-layer split is intentional: accessibility information is never silently stripped or ignored on read/write (base), while the heavier “the content must actually be accessible” bar is opt-in for the products — typically institutional, or those with legal or marketing accessibility commitments — that need it.

12.2 The Accessibility Profile (opt-in)

The accessibility profile defined in ACCESSIBILITY.md — alt-text requirements, video caption obligations for instructional content, keyboard alternatives for structured-task question types, non-color feedback, language-aware rendering, accessible reserved-type placeholders, and validator severities — is bound by an opt-in claim (§10.2).

A consumer claiming the Accessibility Profile MUST satisfy the structured-task keyboard alternatives (ACCESSIBILITY.md §4), the non-color-feedback obligations (§5), the language/dir rendering obligations (§6), and the reserved-type placeholder accessibility (§7).
A producer claiming the Accessibility Profile MUST satisfy the producer-side authoring obligations across ACCESSIBILITY.md §§2–7. These include, at minimum: alt on every <img> (§2.1); <track> captions on prerecorded instructional video carrying speech, plus a transcript for that video, and a transcript for prerecorded audio-only instructional content (§3.1); and root language matching the delivery language (§6). These authoring MUSTs apply only under a Profile claim; they are not base-conformance obligations (§12.1).
Tools that satisfy preservation (§12.1) but not delivery (§12.2) are conforming LC-JSON consumers but are NOT conforming Accessibility Profile consumers, and MUST NOT claim the latter.

12.3 Relationship to WCAG

WCAG governs rendered user experiences; LC-JSON governs portability and metadata preservation. A consumer claiming the LC-JSON Accessibility Profile carries the wire-format affordances and consumer-rendering obligations that WCAG 2.1 AA delivery requires (alt text, captions, language/direction, textual feedback, keyboard alternatives); the consumer’s own jurisdictional WCAG conformance claim (under EN 301 549, DOJ ADA Title II, Section 508, Section 504, or equivalent) remains separate and is the consumer’s responsibility, not LC-JSON’s.

A tool MUST NOT claim WCAG 2.1 AA conformance by virtue of LC-JSON Accessibility Profile conformance alone. LC-JSON does not certify WCAG conformance.

ACCESSIBILITY.md is normative for tools claiming the Accessibility Profile and forms part of LC-JSON 1.1 in that capacity. The split into a separate document reflects the opt-in scope, not a lesser status.

13. Localization and language

LC-JSON 1.x is single-language-per-document. A document declares one delivery language in the root language field; multiple languages are delivered as multiple documents, not as localized field bundles within one document. The full model — the distinct roles of language (delivery), lang/dir (language of parts), and supportLanguage (the optional pedagogical L1 layer), the accepted language-tag forms, and the expectations around assistive-technology pronunciation — is specified in LOCALIZATION.md.

Binding requirements (restated here; full detail in LOCALIZATION.md):

A producer MUST emit a language root field matching the document’s delivery language.
Language-tag values (language, supportLanguage, HTML lang) are BCP 47 tags. Producers SHOULD use the bare ISO 639-1 primary subtag unless a region/script subtag carries meaning; a consumer MAY act on only the primary subtag.
A producer SHOULD mark HTML spans whose language differs from the delivery language with lang (and dir where script direction differs); a consumer MUST preserve lang/dir through sanitization and round-trip (see §12.1).
lang is the necessary affordance for assistive-technology language switching, but correct pronunciation also depends on the end user’s screen reader and installed voices — outside the format’s control. Emitting lang is not optional on that account; it is the floor (LOCALIZATION.md §7).
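Informative sketch: a consumer that acts on only the primary subtag can obtain it by taking the text before the first hyphen. The function names are hypothetical; full BCP 47 parsing (script, region, and extension subtags) is out of scope here.

```python
def primary_subtag(tag: str) -> str:
    """Return the primary language subtag of a BCP 47 tag, lowercased."""
    return tag.split("-", 1)[0].lower()


def same_primary_language(a: str, b: str) -> bool:
    """Compare two language tags on the primary subtag only."""
    return primary_subtag(a) == primary_subtag(b)
```

With this comparison, pt-BR and pt count as the same language, so a span marked lang="pt-BR" inside a document whose root language is pt would not trigger a language switch in a consumer that works at primary-subtag granularity.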

LOCALIZATION.md is normative for the obligations it states and informative for the pronunciation-expectations discussion. Where it and this document disagree, this document wins.

The root-language producer MUST above (and the §12.1 round-trip obligations on language/supportLanguage) bind per document type, for the types that define those fields: Course, QuestionSet, and Glossary. SubjectCollection and CurriculumPack documents do not carry a root language field in 1.1 — omitting it there is not a conformance failure: a vocabulary’s member wording is authored in one language as a matter of practice, but the classification it expresses is language-neutral, and the structured scope (subject/level/audience/purpose/jurisdiction) is the discovery surface. A future version may add an optional language field to the vocabulary types if implementer experience shows the need; producers wanting to record wording language today may use an extension member (§7).

Glossary documents are single-language like courses: the required root language names the language of terms, definitions, and examples. Per-entry translation maps (translations, definitionTranslations, example translations) are content — data about the term — not field-level document localization, so they do not breach the single-language rule; the optional root translationLanguages array declares their exact language inventory as a checkable claim. Glossaries carry no supportLanguage: the which-translation-to-display preference belongs to the delivery context (the attached course’s supportLanguage, or the consumer’s knowledge of the user’s L1), not to the portable artifact. See glossary-reference.md §1.

14. Validation surface

The requirements in this document are enforced across three sites: the 27 JSON Schemas under schemas/ in the published tree (23 from 1.0, plus subject-collection.schema.json, curriculum-pack.schema.json, glossary.schema.json, and the shared publication-fields.schema.json), the reference validators (the course validator, the vocabulary-document validator covering the §4.9 closure rules and §3.4 identity rules, the Curriculum Pack validator covering the CP-1 … CP-17 step, pacing, checkpoint, coverage, and bundle-closure rules, and the glossary validator covering the gloss rule and translation-inventory rules), and the per-document prose in the companion normative documents (HTML_SAFETY.md, ACCESSIBILITY.md, LOCALIZATION.md). VALIDATION.md catalogs every documented rule and tags it with its enforcement tier — schema-enforced, domain-validator-enforced, or advisory. Implementers building consumers, validators, or producer round-trip tests should consult VALIDATION.md for the one-map view of what to check.

VALIDATION.md is a catalog: it enumerates and tiers the rules whose normative force comes from this document (including the §3.3.1 artifact-type rule families it incorporates), from the schemas, and from the companion normative documents. It introduces no requirements of its own beyond those sources. Where its wording and any of those sources disagree, those sources win.

The four reference validators named above are non-authoritative reference implementations. Only this document (including the rule families it incorporates at §3.3.1), the companion normative documents, and the JSON Schemas’ constraints are authoritative. A validator’s behavior — in any mode, including --strict — never defines the conformance contract: where a validator diverges from these sources, the validator is defective and the sources govern.

The three artifact-type reference documents (subject-collection-reference.md, curriculum-pack-reference.md, glossary-reference.md) are informative — they explain and illustrate the rules incorporated by §3.3.1 but are not themselves authoritative (§3.3.1).

Schema description strings sometimes restate normative requirements (including RFC 2119 keywords) for implementer convenience at the point of use. Those restatements are not independently binding: this document and the JSON Schemas are authoritative, and where a schema description’s wording diverges from this document, this document wins. The schema’s constraints (types, patterns, required, enums) are binding as stated in §5.1.

14a. Security and privacy considerations (informative)

LC-JSON documents are content, and 1.1 widens what they can carry: external URLs (canonicalUrl, officialSourceURI-style alignment ids, glossary audioUrl/imageUrl), whole embedded documents (pack bundles, the course glossaries[] pool), globally portable identifiers, and preserved unknown fields. Implementers should hold four postures:

URLs are references, never instructions. A consumer SHOULD NOT dereference document-carried URLs automatically without a policy the deploying institution controls (allowlists, user gesture, or no fetching at all). canonicalUrl and alignment ids are provenance to display, not endpoints to call; media URLs are resolved subject to the consumer’s own content policy. Nothing in LC-JSON conformance requires network access.
Embedded documents are untrusted input. A bundle’s embedded block and a course’s glossaries[] pool are imports like any other: validate each embedded document under its own rules before use, and apply the full HTML_SAFETY.md sanitization to any HTML-bearing field regardless of how the document arrived.
Portable artifacts carry no person data. LC-JSON documents describe learning content, never learners: no learner identities, progress, grades, or contact data belong in any field of a portable artifact (grading policy fields like passMarkPercent are content; grade records are not). §4.11 already excludes commerce data; the same posture applies to personal data. authors is public display credit a contributor chose to assert.
Extensions inherit the same duty. x- members are preserved verbatim across tools and jurisdictions (§7.4); a producer SHOULD NOT place personal data or secrets in extension members, precisely because faithful consumers will carry them everywhere the document goes.

This section is informative: it creates no new conformance requirements, but implementers claiming conformance should expect deployers to ask these questions.
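
As an illustration of the first posture, a consumer might gate every fetch behind a policy object that the deploying institution configures. The sketch below is a minimal Python example and not part of LC-JSON; `FetchPolicy`, its mode names, and `may_dereference` are hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass(frozen=True)
class FetchPolicy:
    """Hypothetical institution-controlled policy for document-carried URLs."""
    mode: str = "none"  # "none", "allowlist", or "gesture" (assumed names)
    allowed_hosts: frozenset = field(default_factory=frozenset)


def may_dereference(url: str, policy: FetchPolicy, user_gesture: bool = False) -> bool:
    """Return True only when the policy permits fetching this URL."""
    if policy.mode == "none":
        return False  # no fetching at all
    if policy.mode == "gesture":
        return user_gesture  # fetch only after an explicit user action
    host = (urlsplit(url).hostname or "").lower()
    return policy.mode == "allowlist" and host in policy.allowed_hosts
```

The default policy fetches nothing, which matches the statement that conformance never requires network access.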

15. References

RFC 2119 — Key words for use in RFCs to Indicate Requirement Levels
RFC 8174 — Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words
RFC 4122 — A Universally Unique IDentifier (UUID) URN Namespace
RFC 3986 — Uniform Resource Identifier (URI): Generic Syntax
BCP 47 — Tags for Identifying Languages
JSON Schema Draft 7
LC-JSON HTML safety profile: HTML_SAFETY.md
LC-JSON accessibility profile: ACCESSIBILITY.md
LC-JSON localization and language model: LOCALIZATION.md
LC-JSON validation surface (informative): VALIDATION.md
LC-JSON glossary (informative): GLOSSARY.md
LC-JSON schemas: schemas/
LC-JSON examples: examples/
LC-JSON conformance test corpus: tests/
LC-JSON glossary reference: glossary-reference.md
LC-JSON subject-collection reference: subject-collection-reference.md
LC-JSON curriculum-pack reference: curriculum-pack-reference.md

Appendix A — Changes from 1.0 (informative)

LC-JSON 1.1 is additive per §8.2. The complete change list:

Three new artifact types. subjectCollection (vocabulary: tags + learning objectives with structured scope), curriculumPack (arrangement: sequence/pacing/checkpoints referencing a collection plus content), and glossary (content: a flat term list with pronunciation, translations, examples). New schemas subject-collection.schema.json, curriculum-pack.schema.json, and glossary.schema.json; documentType gains the three values (§3.2, §3.3, §4.2).
Member identity and membership. New §3.4 (immutable member ids; display is never identity; the same collection-member id in another document is the same member, while glossary entry identity is document-scoped as (glossary globalId, entry id); tags many-membership, objectives and glossary entries single-owner), §4.9 (collection closure + carried copies in course documents), and §5.7 (consumer reconciliation: no duplication, membership recording for tags, link-never-overwrite for objectives, ownership resolution from ingested claims, entry reconciliation on glossary re-import, verbatim creation of absent members, identity-less rejection, display-collision handling).
Alignment claims. New §4.10: externalAlignments[] with claim types references / alignedTo / covers; assesses and verifiedBy reserved; forward-compatible consumer handling in §5.5.
Publication metadata. New §4.11: optional license, canonicalUrl, and derivedFrom[] on the distributable types (Course, SubjectCollection, CurriculumPack, Glossary), added to the course root as plain optional top-level fields (composition via the shared publication-fields.schema.json). QuestionSet excluded by role. Commerce data stays out of LC-JSON.
Glossary attachment. Optional glossaryRefs arrays on the course root, units, and lessons — plain glossary globalId strings whose placement encodes scope (nearest attachment wins; junctions stop at Lesson) — plus an optional root glossaries[] pool carrying a whole-document copy of each referenced glossary, identity verbatim, for single-file self-containment (§4.9). A ref that resolves to no pool copy and no held document is legal and consumer-surfaced, never an import failure (see glossary-reference.md §4 and course.schema.json).
Unimplemented-artifact-type handling. §5.1 addition: clean rejection naming the unsupported type; §10.1 addition: conformance claims scoped per artifact type.
Extension surface. §7.1 extended to the vocabulary/arrangement/glossary objects.
No other changes. Question types (implemented and reserved), item types, HTML safety, the Accessibility Profile, and the localization model are unchanged from 1.0. Course/QuestionSet tags remain free wire strings — the opaque member ids of collections do not replace them; an optional member-id reference alongside the string arrays is reserved for future coordination.

LC-JSON HTML Safety Profile

Status: Normative. Referenced from NORMATIVE.md §11.
Spec version: 1.0
Last updated: 2026-05-03

This document defines the HTML subset that LC-JSON (Learning Content JSON) 1.0 documents MAY carry in HTML-bearing fields, the obligations consumers MUST satisfy when rendering it, and the URL-scheme allowlist for embedded references.

The keywords MUST, MUST NOT, SHOULD, SHOULD NOT, MAY, and RECOMMENDED are to be interpreted as described in RFC 2119 and RFC 8174.

1. Scope

1.1 HTML-bearing fields

HTML is permitted in the following fields:

Field	Carrier	Schema reference
`html`	`ContentItem`	`schemas/content-item.schema.json`
`customHtml`	`SignpostItem`	`schemas/signpost-item.schema.json`

No other LC-JSON 1.0 field carries HTML. Question prompts, hints, choice text, feedback strings, and similar author-visible prose are plain text. A producer MUST NOT embed HTML in plain-text fields; a consumer MUST treat HTML in plain-text fields as literal text.
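
One way a consumer could honor the literal-text rule is to escape plain-text fields before placing them into rendered markup. This is a minimal Python sketch under that assumption; `render_plain_text` is a hypothetical helper name, not something the specification defines.

```python
import html


def render_plain_text(value: str) -> str:
    """Escape a plain-text field so any markup in it is shown literally."""
    return html.escape(value, quote=True)
```

With this approach a prompt such as `Is <b>2 < 3</b>?` is displayed character for character instead of being interpreted as bold text.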

1.2 Why this profile exists

Without a portable allowlist, every consumer would sanitize against its own subset, and the same document would render differently — sometimes unsafely — across implementations. This profile fixes the contract:

Producers know what they MAY emit and can expect it to be rendered consistently.
Consumers know what they MUST accept, what they MUST sanitize away, and where the line falls between “render-time stripping” and “reject the document.”
Third-party implementers have a single reference for <script>, event handlers, <iframe>, target="_blank", data: URLs, and the rest of the long tail.

The profile is deliberately strict-enough-to-be-safe, lenient-enough-to-author. Decisions throughout favor producer flexibility (any class, an inline-style allowlist that covers real authoring patterns, tel: for adult/corporate audiences) while binding consumer sanitization tightly enough that no conforming consumer can be coerced into XSS by a conforming document.

2. Allowed elements

A conforming consumer MUST render the following HTML elements when they appear in HTML-bearing fields, subject to the attribute allowlist in §3 and the URL-scheme allowlist in §4.

2.1 Block

<p>, <div>, <h1>, <h2>, <h3>, <h4>, <h5>, <h6>, <ul>, <ol>, <li>, <blockquote>, <pre>, <hr>, <table>, <thead>, <tbody>, <tr>, <th>, <td>, <figure>, <figcaption>

2.2 Inline

<a>, <strong>, <em>, <b>, <i>, <u>, <mark>, <small>, <sub>, <sup>, <code>, <br>, <span>, <abbr>, <q>, <time>

2.3 Media

<img>, <video>, <audio>, <source>, <track>

2.4 Forbidden elements

The following elements MUST NOT be emitted by producers and MUST be stripped (along with their entire subtree) by consumers:

<script>, <iframe>, <object>, <embed>, <form>, <input>, <button>, <select>, <textarea>, <style>, <link>, <meta>, <base>, <svg>, <math>, <applet>, <frame>, <frameset>, <noframes>

<svg> and <math> are forbidden inline (the surface for XSS via SVG sanitization is wide and inconsistently understood across libraries). SVG raster equivalents are permitted via <img src="..."> per §4.1; consumers SHOULD NOT inline-render the contents of an SVG fetched this way (the standard <img> rendering pipeline is sufficient and isolates script).
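
A pre-render pass could locate forbidden elements before any sanitization runs. The following Python sketch uses the standard-library `html.parser`; the `find_forbidden` function and `ForbiddenFinder` class are hypothetical names, and a production consumer would more likely rely on a vetted sanitizer.

```python
from html.parser import HTMLParser

# Element names taken from the forbidden list in §2.4.
FORBIDDEN = frozenset(
    "script iframe object embed form input button select textarea style "
    "link meta base svg math applet frame frameset noframes".split()
)


class ForbiddenFinder(HTMLParser):
    """Collects the names of forbidden elements in document order."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.found = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so a direct lookup is enough.
        if tag in FORBIDDEN:
            self.found.append(tag)


def find_forbidden(fragment: str) -> list:
    finder = ForbiddenFinder()
    finder.feed(fragment)
    finder.close()
    return finder.found
```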

2.5 Unknown elements

When a consumer encounters an element name not listed in §2.1–§2.3 and not in the forbidden list of §2.4, the consumer MUST handle it per §6.2 (Unknown-element handling). Consumers MUST NOT reject a document on the basis of unknown elements alone.

3. Allowed attributes

3.1 Universal attributes

The following attributes MAY appear on every element listed in §2.1–§2.3:

Attribute	Purpose	Notes
`id`	Anchor target	SHOULD be document-unique; consumers MAY rewrite to namespace within their UI
`class`	Author-defined CSS hooks	See §3.2
`title`	Tooltip / accessible name
`lang`	Language override (BCP 47)
`dir`	Text direction (`ltr`, `rtl`, `auto`)

3.2 The `class` attribute

The class attribute is permitted on all allowed elements. Values are author-defined; the spec does not constrain or interpret them. Consumers MUST preserve the class attribute across read/write cycles (§6.4 round-trip preservation in NORMATIVE applies). Consumers MAY style classes they recognize; consumers MUST ignore (without stripping) classes they do not recognize.

This is intentional. Different consumers ship different stylesheets — img-medium matters to one consumer, lc-callout matters to another, generic Tailwind classes might appear in a third. The wire format does not arbitrate which class system wins; it preserves the author’s intent and lets each consumer apply its own visual policy.

3.3 Per-element attribute table

In addition to the universal attributes, the following per-element attributes are allowed.

Element	Attributes	URL-scheme constrained?
`<a>`	`href`, `target`, `rel`	`href` per §4.1
`<img>`	`src`, `alt` (REQUIRED), `width`, `height`	`src` per §4.1
`<video>`	`src`, `poster`, `controls`, `width`, `height`, `preload`	`src`, `poster` per §4.1
`<audio>`	`src`, `controls`, `preload`	`src` per §4.1
`<source>`	`src`, `type`	`src` per §4.1
`<track>`	`src`, `kind`, `srclang`, `label`, `default`	`src` per §4.1
`<table>`	`border` (`"1"` or absent only)	—
`<th>`, `<td>`	`colspan`, `rowspan`, `headers`, `scope`	—
`<ol>`	`start`, `reversed`, `type`	—
`<li>`	`value`	—
`<blockquote>`	`cite`	URL per §4.1
`<q>`	`cite`	URL per §4.1
`<abbr>`	(universal only)	—
`<time>`	`datetime`	—

<img alt> is REQUIRED. Empty alt="" is permitted (and indicates a decorative image — see ACCESSIBILITY.md §2). Producers MUST emit alt; consumers SHOULD treat a missing alt as a domain-validation warning and render the image.
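
The universal and per-element tables can be expressed as a lookup that filters an element's attributes. This Python sketch is illustrative only; `filter_attributes` is a hypothetical name, `style` is passed through on the assumption that its value is sanitized separately per §3.4, and URL values are assumed to be checked separately per §4.

```python
# Universal attributes from §3.1, plus style (value sanitized per §3.4).
UNIVERSAL = frozenset({"id", "class", "title", "lang", "dir", "style"})

# Per-element attributes from the §3.3 table.
PER_ELEMENT = {
    tag: frozenset(names.split())
    for tag, names in {
        "a": "href target rel",
        "img": "src alt width height",
        "video": "src poster controls width height preload",
        "audio": "src controls preload",
        "source": "src type",
        "track": "src kind srclang label default",
        "table": "border",
        "th": "colspan rowspan headers scope",
        "td": "colspan rowspan headers scope",
        "ol": "start reversed type",
        "li": "value",
        "blockquote": "cite",
        "q": "cite",
        "time": "datetime",
    }.items()
}


def filter_attributes(tag: str, attrs: dict) -> dict:
    """Keep only attributes allowed on this element; the element itself is preserved."""
    allowed = UNIVERSAL | PER_ELEMENT.get(tag, frozenset())
    kept = {}
    for name, value in attrs.items():
        key = name.lower()
        if key not in allowed:
            continue
        if tag == "table" and key == "border" and value != "1":
            continue  # border accepts only "1" or absence
        kept[key] = value
    return kept
```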

3.4 Inline `style` attribute

The style attribute MAY appear on any element listed in §2.1–§2.3. Consumers MUST sanitize CSS properties against the allowlist below; properties outside the allowlist MUST be stripped (the property only — the element and other style properties are preserved).

Allowed CSS properties:

Category	Properties
Sizing	`max-width`, `min-width`, `width`, `max-height`, `min-height`, `height`
Spacing	`margin`, `margin-top`, `margin-right`, `margin-bottom`, `margin-left`, `padding`, `padding-top`, `padding-right`, `padding-bottom`, `padding-left`
Borders	`border`, `border-top`, `border-right`, `border-bottom`, `border-left`, `border-collapse`, `border-spacing`, `border-style`, `border-width`, `border-color`
Alignment	`text-align`, `vertical-align`

Property values:

Lengths in px, em, rem, %, or unitless 0. Negative values permitted where the property allows them. vh/vw/vmin/vmax MAY be permitted at consumer discretion; producers SHOULD NOT emit them.
Color values for border-color: hex (#abc, #aabbcc), rgb(), rgba(), named CSS colors. currentColor permitted.
auto is permitted for sizing properties.

Consumers MUST NOT execute CSS expressions, url() references to remote stylesheets, @import directives, or any value that resembles a JavaScript expression (expression(...), behavior:, -moz-binding, etc.). Consumers MUST strip any value that doesn’t lex as a simple length, color, or keyword token.

The narrow allowlist exists because authors need to size images, set table borders, and align cell content: pragmatic affordances that semantic markup alone doesn’t cover. Anything beyond layout (text and background colors, fonts, animations, positioning, transforms) is consumer-skin territory and belongs on a class hook (§3.2).
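
The property allowlist and the value rule can be sketched as a small declaration filter. This Python example is a simplified illustration, not a complete CSS parser: `sanitize_style` is a hypothetical name, the value grammar is deliberately narrower than real CSS, color tokens are accepted on any allowed property for brevity, and viewport units are not accepted (the spec leaves them to consumer discretion).

```python
import re

ALLOWED_PROPERTIES = frozenset(
    "max-width min-width width max-height min-height height "
    "margin margin-top margin-right margin-bottom margin-left "
    "padding padding-top padding-right padding-bottom padding-left "
    "border border-top border-right border-bottom border-left "
    "border-collapse border-spacing border-style border-width border-color "
    "text-align vertical-align".split()
)

_LENGTH = r"-?(?:\d+\.?\d*|\.\d+)(?:px|em|rem|%)|0"
_COLOR = r"#[0-9a-fA-F]{3}(?:[0-9a-fA-F]{3})?|rgba?\([0-9.,%\s]+\)"
_KEYWORD = r"[a-zA-Z][a-zA-Z-]*"
_TOKEN = rf"(?:{_LENGTH}|{_COLOR}|{_KEYWORD})"
# A value is one or more simple tokens separated by whitespace.
_VALUE = re.compile(rf"{_TOKEN}(?:\s+{_TOKEN})*")


def sanitize_style(style: str) -> str:
    """Drop declarations whose property or value falls outside the allowlist."""
    kept = []
    for declaration in style.split(";"):
        name, sep, value = declaration.partition(":")
        name, value = name.strip().lower(), value.strip()
        if sep and name in ALLOWED_PROPERTIES and _VALUE.fullmatch(value):
            kept.append(f"{name}: {value}")
    return "; ".join(kept)
```

Values such as `expression(...)` or `url(...)` fail the token match because of the parenthesis, so the whole declaration is dropped while the remaining declarations survive.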

3.5 Forbidden attributes

The following attributes MUST NOT appear on any element. Consumers MUST strip them on render:

All event handler attributes: any attribute matching on* (e.g., onclick, onload, onmouseover, onerror, onfocus, onblur).
srcdoc (on any element).
formaction, formenctype, formmethod, formnovalidate, formtarget (form submission attributes).

data: and other forbidden URL schemes are governed by §4.2; this section does not duplicate that rule.

4. URL scheme allowlist

4.1 Allowed schemes

For URL-bearing attributes (href, src, poster, cite, <source>.src, <track>.src):

Scheme	Where allowed	Notes
`https:`	All URL-bearing attributes	Always allowed.
`http:`	All URL-bearing attributes	Allowed but discouraged. Mixed-content rendering on HTTPS pages is consumer-defined; consumers SHOULD warn or upgrade.
`mailto:`	`<a href>` only	Standard mail-link behavior.
`tel:`	`<a href>` only	See ITEM_PATTERNS.md §3. Consumer policy varies by audience.
Relative URLs	All URL-bearing attributes	Resolved against the consumer’s content base for the document. Producers MAY use relative paths to reference media bundled alongside the LC-JSON file (e.g., `media/images/foo.jpg`).

4.2 Forbidden schemes

The following schemes MUST NOT appear in any URL-bearing attribute. Consumers MUST reject the URL (either by stripping the attribute or by replacing the attribute with a safe placeholder, e.g., href="#"):

javascript:, vbscript:, data:, blob:, file:, chrome:, chrome-extension:, ftp:, ws:, wss:, gopher:, view-source:

data: is forbidden globally — including for <img src>. The XSS surface (SVG-via-data, HTML-via-data, type-confusion attacks via mixed content sniffing) is wider than the authoring convenience justifies. Consumers MUST strip data: URIs even on <img>.

blob: and file: are forbidden because they reference consumer-local memory or filesystem state; their meaning is not portable.

4.3 URL validation

Consumers SHOULD validate URLs against RFC 3986 before rendering. Malformed URLs (whitespace in the middle, control characters, embedded null bytes) MUST be treated as invalid and stripped.
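
The scheme rules of §4.1 through §4.3 can be combined into one check per URL-bearing attribute. The Python sketch below is a minimal illustration and does not perform full RFC 3986 validation; `url_allowed` is a hypothetical name, and treating any scheme-less value as a relative URL is an assumption of the sketch.

```python
import re

ALLOWED_EVERYWHERE = frozenset({"https", "http"})
ANCHOR_ONLY = frozenset({"mailto", "tel"})  # permitted on <a href> only
_SCHEME = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):")


def url_allowed(url: str, tag: str, attr: str) -> bool:
    """Return True when the URL may stay on the given element attribute."""
    candidate = url.strip()
    # Interior whitespace, control characters, and null bytes are malformed.
    if any(ch.isspace() or ord(ch) < 0x20 or ord(ch) == 0x7F for ch in candidate):
        return False
    match = _SCHEME.match(candidate)
    if match is None:
        return True  # no scheme: treated as a relative URL
    scheme = match.group(1).lower()
    if scheme in ALLOWED_EVERYWHERE:
        return True
    return scheme in ANCHOR_ONLY and tag == "a" and attr == "href"
```

Because the check is an allowlist, every scheme in §4.2 (and any scheme not mentioned at all) is refused without needing to be enumerated.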

5. Sanitization obligation

A consumer MUST sanitize HTML from LC-JSON documents before rendering. The HTML in an LC-JSON document is untrusted input from the consumer’s perspective, regardless of the document source.

A producer’s claim of LC-JSON conformance does NOT exempt the consumer from sanitization. Producers can be misconfigured, compromised, or simply buggy; consumers stand alone as the last line of defense.

5.1 Sanitization rules summary

A conforming consumer MUST:

Strip every element not listed in §2.1–§2.3, preserving its inner text content per §6.2.
Strip every attribute not listed in §3, preserving the element.
Strip every event handler attribute (on*).
Strip every URL with a scheme outside §4.1.
Strip every CSS property in inline style outside the §3.4 allowlist.
Normalize <a target="_blank"> to include rel="noopener noreferrer" per §6.1, even when the producer omitted it.
Reject the entire document if it contains any element from the §2.4 forbidden list (<script>, <iframe>, etc.) or any on* event-handler attribute or any javascript: / vbscript: URL. See §8 for validator severity.

5.2 Reference implementations (informative)

The following sanitizer configurations are known to align with this profile:

DOMPurify (JavaScript) — configure ALLOWED_TAGS and ALLOWED_ATTR from §2.1–§2.3 and §3.
Bleach (Python) — bleach.clean(text, tags=..., attributes=..., protocols=['http','https','mailto','tel']).
HtmlSanitizer (.NET) — equivalent allowlist configuration.

These are reference points only. Conformance is judged against the rules in this document, not against any specific library’s defaults.

6. Link safety, link normalization, and unknown-element handling

6.1 `target="_blank"` rel-normalization

A producer that emits <a target="_blank"> SHOULD also emit rel="noopener noreferrer".

A consumer MUST normalize <a target="_blank"> to include rel="noopener noreferrer" on render, adding the tokens if the producer omitted them. This applies even to documents that otherwise satisfy producer conformance — the consumer has the last word on render.

The reverse-tabnabbing risk that this mitigates is well-documented; the cost of producing rel="noopener noreferrer" is zero. Producers SHOULD save consumers the work, but consumers cannot rely on producers to do so.
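
The consumer-side normalization amounts to a token merge on the `rel` attribute. A minimal Python sketch, assuming attributes are held in a dict and using the hypothetical name `normalize_rel`:

```python
def normalize_rel(attrs: dict) -> dict:
    """Ensure rel carries noopener and noreferrer when target is _blank."""
    out = dict(attrs)
    if (out.get("target") or "").strip().lower() == "_blank":
        tokens = (out.get("rel") or "").split()
        present = {token.lower() for token in tokens}
        for required in ("noopener", "noreferrer"):
            if required not in present:
                tokens.append(required)  # keep any producer-supplied tokens
        out["rel"] = " ".join(tokens)
    return out
```

Existing tokens such as `nofollow` are kept, and links without `target="_blank"` are returned unchanged.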

6.2 Unknown-element handling

When a consumer encounters an HTML element whose name is not in §2.1–§2.3 and not in the §2.4 forbidden list, the consumer:

MUST strip the element while preserving its text content. <unknown>hello world</unknown> becomes hello world.
SHOULD log a warning (form is consumer-defined).
MUST NOT reject the document for unknown elements alone. Forward-compatibility for HTML extensions is preserved by graceful degradation, not by strict rejection.

This mirrors NORMATIVE §6’s handling of reserved/unknown question types: degrade gracefully, never fail-closed on names you don’t recognize. The contract is symmetrical across both surfaces.
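
Strip-while-preserving-text can be sketched with the standard-library `html.parser`. The Python example below is an illustration under stated assumptions: `strip_unknown` and `UnknownStripper` are hypothetical names, allowed start tags are passed through as written (attribute filtering is assumed to happen in a separate step), and forbidden elements raise an error instead of being handled here.

```python
from html import escape
from html.parser import HTMLParser

ALLOWED = frozenset(
    "p div h1 h2 h3 h4 h5 h6 ul ol li blockquote pre hr table thead tbody tr "
    "th td figure figcaption a strong em b i u mark small sub sup code br "
    "span abbr q time img video audio source track".split()
)
FORBIDDEN = frozenset(
    "script iframe object embed form input button select textarea style "
    "link meta base svg math applet frame frameset noframes".split()
)


class UnknownStripper(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out, self.warnings = [], []

    def handle_starttag(self, tag, attrs):
        if tag in FORBIDDEN:
            raise ValueError(f"forbidden element <{tag}>")
        if tag in ALLOWED:
            self.out.append(self.get_starttag_text())
        else:
            self.warnings.append(tag)  # unknown: drop the tag, keep the text

    def handle_startendtag(self, tag, attrs):
        self.handle_starttag(tag, attrs)  # self-closing: no end tag to emit

    def handle_endtag(self, tag):
        if tag in ALLOWED:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def strip_unknown(fragment: str):
    """Return (cleaned fragment, list of unknown element names)."""
    parser = UnknownStripper()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out), parser.warnings
```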

6.3 Unknown-attribute handling

When a consumer encounters an attribute not listed in §3, the consumer MUST strip the attribute while preserving the element. Unknown attributes are not grounds for rejecting the document.

6.4 Unknown CSS properties

When a consumer encounters a CSS property in style="..." not listed in §3.4, the consumer MUST strip the property while preserving the element and the other (allowed) properties. Unknown properties are not grounds for rejecting the document.

7. Media handling

7.1 `<video>`

src MUST be https:, http:, or relative.
Consumers MUST NOT auto-play. Producers MUST NOT emit autoplay or loop. Consumers SHOULD ignore these attributes if a non-conforming producer emits them.
controls SHOULD be present (consumer policy MAY hide them, but the wire intent is “user-driven playback”).
Inner <source> elements MAY appear; consumers MUST process them per the same URL-scheme allowlist (§4.1).
Inner <track> elements with kind="captions" or kind="subtitles" SHOULD be present for video content. Accessibility requirements for captions are codified separately in ACCESSIBILITY.md §3.
poster URL MUST satisfy §4.1.

7.2 `<audio>`

src MUST be https:, http:, or relative.
Consumers MUST NOT auto-play. Producers MUST NOT emit autoplay or loop.
controls SHOULD be present.
Inner <source> elements MAY appear.

7.3 Bandwidth and preload

preload accepts "none", "metadata", "auto". Consumers SHOULD respect the producer’s preload hint but MAY override for bandwidth, storage, or accessibility reasons.

7.4 Format compatibility

LC-JSON does not mandate specific media codecs. Producers SHOULD use widely-compatible formats (H.264 + AAC in MP4 for video; MP3, AAC, or Opus for audio) and SHOULD provide multiple <source> fallbacks where format compatibility matters.

7.5 `<track>` for captions and subtitles

<track src> MUST satisfy §4.1. kind accepts "subtitles", "captions", "descriptions", "chapters", "metadata". srclang is a BCP 47 language tag (RECOMMENDED for subtitles and captions).

8. Validator severity

A reference validator (or any consumer’s pre-render validation pass) SHOULD classify HTML profile violations as follows.

8.1 Errors (validator MUST reject)

These violations indicate a security-critical XSS surface or a structural violation that no consumer can render safely:

Any forbidden element listed in §2.4.
Any event handler attribute (onclick, onload, onmouseover, etc.).
Any URL with scheme javascript: or vbscript:.

8.2 Warnings (validator MAY accept; consumer SHOULD strip)

These violations are sanitizable and not security-critical. The validator reports them so producers can fix their output, but the document is still useful:

Unknown elements (per §2.5, §6.2).
Unknown attributes (per §3.5, §6.3).
CSS properties outside the §3.4 allowlist (per §6.4).
URL schemes outside §4.1 but not listed in §4.2 (rare; mostly relative-URL edge cases).
tel: URLs (per ITEM_PATTERNS.md §3; consumer-policy gated, and some audiences disable them).
Missing rel="noopener noreferrer" on <a target="_blank"> (per §6.1 — consumer auto-normalizes).
Missing alt on <img> (cross-references ACCESSIBILITY.md §2).
data: URLs (forbidden per §4.2, but a warning rather than an error because the consumer-side mitigation — strip the data: URL before rendering — degrades gracefully to a broken image, not an XSS surface. The forbidden-scheme rule still binds; the validator severity choice is “tell the author the image won’t render anywhere,” not “reject this otherwise-fine document.”)

8.3 Why this split

Errors fail the build. Warnings notify the author but don’t break interop. The line between them is “could a consumer render this document safely if it tried?” — yes for warnings, no for errors. Producers SHOULD treat warnings as actionable; consumers MUST sanitize regardless.
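
The error and warning split could be encoded as a simple lookup in a validation pass. In this Python sketch the violation kind strings, `classify`, and `document_accepted` are all hypothetical names chosen for illustration; only the grouping into errors and warnings comes from §8.1 and §8.2.

```python
from enum import Enum


class Severity(Enum):
    ERROR = "error"
    WARNING = "warning"


# §8.1: security-critical, the validator rejects.
ERROR_KINDS = frozenset({"forbidden-element", "event-handler", "script-url"})

# §8.2: sanitizable, the validator reports and the consumer strips.
WARNING_KINDS = frozenset({
    "unknown-element", "unknown-attribute", "css-property", "unlisted-scheme",
    "tel-url", "missing-rel", "missing-alt", "data-url",
})


def classify(kind: str) -> Severity:
    if kind in ERROR_KINDS:
        return Severity.ERROR
    if kind in WARNING_KINDS:
        return Severity.WARNING
    raise ValueError(f"unrecognized violation kind: {kind}")


def document_accepted(kinds) -> bool:
    """A document passes when none of its violations is an error."""
    return not any(classify(kind) is Severity.ERROR for kind in kinds)
```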

9. Round-trip preservation

NORMATIVE §6.4 requires consumers to preserve every member of reserved-type questions across read/write cycles (semantic preservation; key order is producer-discretion per §6.2). The same principle applies to HTML content with one important softening: a consumer that re-exports an LC-JSON document MAY emit the sanitized HTML rather than the input HTML, provided that:

No allowed elements, attributes, or CSS properties (per §2 and §3) are lost.
Element classes (per §3.2) are preserved verbatim.
Authored text content is preserved.
Semantic structure (heading levels, list nesting, table rows/cells) is preserved.

In other words: consumers MAY drop content the spec requires them to strip anyway (<script>, onclick, data: URLs). Consumers MUST NOT drop content they’re not required to strip. This protects authors from silent edit-on-import without forcing consumers to round-trip security-critical violations.

A consumer that imports a document containing forbidden content under §8.1 MUST report the violation to the user; the consumer MAY refuse to round-trip such a document at all.

10. Examples

10.1 Minimal conforming HTML

{
  "type": "content",
  "globalId": "...",
  "title": "Reading",
  "html": "<h2>Section 1</h2>\n<p>Some text with <strong>emphasis</strong> and <a href=\"https://example.org\">a link</a>.</p>"
}

10.2 Image with class hook

{
  "html": "<p>The diagram below shows the cycle:</p>\n<img src=\"media/cycle.png\" alt=\"Carbon cycle diagram\" class=\"img-medium\" />"
}

10.3 Video with captions

{
  "html": "<video src=\"media/lecture.mp4\" controls poster=\"media/lecture-thumb.jpg\" preload=\"metadata\" width=\"640\">\n  <track src=\"media/lecture.vtt\" kind=\"captions\" srclang=\"en\" label=\"English\" default />\n</video>"
}

10.4 Table with allowed inline styles

{
  "html": "<table border=\"1\" style=\"border-collapse: collapse; width: 100%;\">\n  <thead>\n    <tr><th style=\"padding: 8px; text-align: left;\">Country</th><th style=\"padding: 8px;\">Capital</th></tr>\n  </thead>\n  <tbody>\n    <tr><td style=\"padding: 8px;\">France</td><td style=\"padding: 8px;\">Paris</td></tr>\n  </tbody>\n</table>"
}

10.5 Link with `target="_blank"` and `rel`

{
  "html": "<p>Read more on <a href=\"https://en.wikipedia.org/wiki/Photosynthesis\" target=\"_blank\" rel=\"noopener noreferrer\">Wikipedia</a>.</p>"
}

10.6 What to avoid

<!-- ✗ <script> is forbidden — validator MUST reject -->
<script>alert("hi")</script>

<!-- ✗ event handler — validator MUST reject -->
<a href="https://example.com" onclick="track()">click</a>

<!-- ✗ javascript: URL — validator MUST reject -->
<a href="javascript:void(0)">click</a>

<!-- ✗ data: URL — consumer strips, validator warns -->
<img src="data:image/png;base64,..." alt="..." />

<!-- ✗ inline-rendered SVG — element forbidden -->
<svg><circle cx="50" cy="50" r="40" /></svg>

<!-- ✓ SVG raster reference is fine -->
<img src="https://example.org/logo.svg" alt="Example logo" />

11. Cross-references

NORMATIVE.md §11 — normative reference to this document
ITEM_PATTERNS.md §3 — tel: consumer policy as one example of consumer plurality
schemas/content-item.schema.json — html field
schemas/signpost-item.schema.json — customHtml field
ACCESSIBILITY.md — alt, captions, keyboard alternatives, language/direction, placeholder accessibility for reserved types, WCAG 2.1 AA cross-references, recommended ARIA patterns (rc.1 release; additive deepenings — per-criterion normative table, expanded ARIA patterns, conformance fixtures — land in 1.0 final)
tests/ — conformance fixtures including valid/06-html-with-video-track.json and invalid/13-html-with-script.json

12. Summary table

Category	Producer MUST	Producer SHOULD	Consumer MUST	Consumer SHOULD
Allowed elements	Stay within §2.1–§2.3	Use semantic markup	Render allowed elements; strip forbidden (§2.4); strip-while-preserving-text for unknown (§6.2)	Log warnings on unknown
Forbidden elements	Not emit `<script>`, `<iframe>`, `<form>`, etc.	—	Reject document if forbidden present (§8.1)	Surface error to user
Attributes	Stay within §3	Use semantic attributes	Strip unknown attributes (§6.3); strip event handlers always	—
Inline `style`	Stay within §3.4 allowlist	Prefer class hooks	Strip out-of-allowlist properties (§6.4)	—
URL schemes	Use `https:`, `http:`, `mailto:`, `tel:`, or relative	Prefer `https:`	Reject `javascript:`/`vbscript:`; strip `data:`/`blob:`/`file:`/etc.	Warn on `http:`, `tel:`
`target="_blank"`	—	Emit `rel="noopener noreferrer"`	Normalize to add `rel="noopener noreferrer"` if missing (§6.1)	—
`<img alt>`	Emit `alt`	Use empty `alt=""` for decorative	—	Treat missing `alt` as warning
`<video>`, `<audio>` autoplay	Not emit `autoplay`, `loop`	—	Not auto-play	Ignore `autoplay` if a non-conforming producer emits it
Sanitization	—	—	Sanitize before render, every time	Use a vetted reference implementation (§5.2)

LC-JSON Accessibility Profile

Status: Released for 1.0. This is the stable accessibility contract carried forward unchanged from 1.0-rc.3; 1.0 is a pure rebase of rc.3. Further deepenings (per-criterion cross-reference table, expanded ARIA patterns, screen-reader timing guidance, --accessibility validator flag + fixtures) are post-1.0, additive, and informative or opt-in — none change the base-vs-Profile contract; see §11. Obligations stated here will not be retracted or contradicted. Spec version: 1.0 Last updated: 2026-06-30

This document collects the accessibility expectations for LC-JSON (Learning Content JSON) producers and consumers. The keywords MUST, MUST NOT, SHOULD, SHOULD NOT, MAY, and RECOMMENDED are to be interpreted as in RFC 2119 and RFC 8174. RFC 2119 language binds wire-format obligations; ARIA-pattern guidance is informative — the spec hints at affordances rather than mandating a single canonical UI (see README.md §“Wire Format”). This two-layer split — wire-format affordances versus the duties of the consumer that ultimately delivers the content — is the organizing principle of this document.

1. Scope

LC-JSON is a portable interchange format. The wire format does not render anything itself — accessibility outcomes depend on consumer rendering. The role of this document is to:

Specify producer obligations that make accessible rendering possible (alt text, captions, language tags).
Specify consumer obligations for rendering that don’t drop accessibility affordances the producer already provided.
Cross-reference HTML_SAFETY.md where HTML-bearing fields constrain accessibility-bearing markup (<img alt>, <track>, lang, dir).

Accessibility for the authoring tools that produce LC-JSON, and for the delivery surfaces that render it, is the responsibility of those tools — not the wire format. This document binds the wire-format obligations only.

1.1 The two-layer duty (informative)

Accessibility is achievable downstream if and only if the format can carry the affordances a renderer needs, and the rendering consumer surfaces them correctly. These are two distinct, non-interchangeable layers:

The wire format (LC-JSON). Cannot produce an accessible experience on its own; it can only enable one — by carrying alt text, captions, language/direction signals, textual feedback, and position semantics for structured tasks. If the format cannot represent an affordance, no conforming consumer can ever deliver it.
The consumer (the renderer). Where a disabled end-user actually meets the content, and therefore where every accessibility law attaches. The consumer’s duty is to surface the affordances the producer provided.

A perfectly capable format rendered by a non-conformant consumer is still inaccessible. A conformant consumer cannot rescue a format that never carried the affordance. Both layers must hold.

1.2 Legal context (informative)

WCAG governs rendered user experiences; LC-JSON governs portability and metadata preservation. These layers are complementary but distinct — see NORMATIVE.md §12.3.

The technical accessibility target for educational and commercial delivery in the EU and US converges on WCAG 2.1 Level AA:

EU — European Accessibility Act, Directive (EU) 2019/882 (applicable since 28 June 2025): points at the harmonized standard EN 301 549, which references WCAG 2.1 AA.
EU — Web Accessibility Directive (EU) 2016/2102: binds public-sector bodies (public universities, schools) to EN 301 549 → WCAG 2.1 AA.
US — DOJ ADA Title II final rule (April 2024): explicitly adopts WCAG 2.1 AA for state/local government, including public schools and universities, with compliance deadlines in April 2026 / April 2027.
US — Section 508 / Section 504: WCAG-based conformance for federal procurement and recipients of federal financial assistance.

LC-JSON supports WCAG 2.1 AA by carrying the wire-format affordances a conforming renderer needs (alt text, captions, lang/dir signals, textual feedback, position semantics for structured tasks). The delivering consumer remains responsible for the full WCAG conformance claim under its applicable jurisdiction.

1.3 ATAG vs WCAG (producers vs consumers)

Web consumers (the renderers that display LC-JSON content to learners) fall under WCAG. The obligations in §§2–7 are written from this perspective.
Authoring tools that produce LC-JSON (whether browser-based editors, desktop applications, AI-assisted authoring scripts, or import converters) fall under W3C ATAG 2.0, not WCAG. ATAG covers two things: (a) making the authoring environment itself accessible to author-users (ATAG Part A), and (b) supporting authors in producing accessible content — e.g. prompting for alt text, captions, transcripts (ATAG Part B). Producer obligations in this document map to ATAG Part B: the authoring tool’s job is to make the affordances easy to author and hard to forget.
Desktop authoring tools are additionally outside WCAG’s scope entirely — WCAG governs web content. Desktop accessibility is governed by platform standards (e.g. UIAutomation on Windows; Section 508 / EN 301 549 software clauses).

1.4 Five conformance requirements for a WCAG 2.1 AA claim (informative)

A WCAG 2.1 AA claim by a delivering consumer is valid only if all five hold; failing any one voids the claim independent of per-criterion passes:

Conformance level — all Level A and Level AA criteria are met (50 criteria total: 30 A + 20 AA).
Full pages — conformance is claimed for complete pages, including dynamically loaded states; partial-page exclusions are not permitted.
Complete processes — every page in a multi-step process must conform (e.g. login → course → item → submission → results). A conformant results page after a non-conformant quiz flow does not pass.
Accessibility-supported technologies — reliance only on technologies that work with assistive technology (HTML + ARIA + native form controls is the baseline).
Non-interference — even non-relied-on content must not break 1.4.2 (Audio Control), 2.1.2 (No Keyboard Trap), 2.2.2 (Pause, Stop, Hide), or 2.3.1 (Three Flashes or Below Threshold).

This document specifies what LC-JSON producers and consumers must do to make the per-criterion items achievable. The five claim-level gates are properties of a delivering consumer, not of the wire format.

1.5 Versions targeted

This profile targets WCAG 2.1 Level AA as the primary claim baseline. Selected criteria from WCAG 2.2 are designed-in for new interactive components to avoid near-term rework:

2.5.7 Dragging Movements (2.2, AA): every drag interaction MUST ship a single-pointer, non-drag alternative. Governs structured-task question types in §4.
2.5.8 Target Size (Minimum) (2.2, AA): interactive targets SHOULD be ≥ 24×24 CSS px.

Other WCAG 2.2 additions are out of the 2.1 claim baseline. 4.1.1 Parsing is treated as satisfied-by-default (modern browsers/AT; W3C errata; obsolete in WCAG 2.2).

2. Image alt text — WCAG 1.1.1

2.1 Producer obligations

When claiming Accessibility Profile conformance, a producer MUST emit an alt attribute on every <img> element in HTML-bearing fields (HTML_SAFETY.md §3.3). This satisfies WCAG 1.1.1 Non-text Content at Level A.

Outside an Accessibility Profile claim, authoring alt is not a base-conformance requirement: a producer that omits alt is still a conforming LC-JSON producer (the reference validator emits a non-blocking WARN per HTML_SAFETY.md §8.2). What base conformance does require is preservation — a consumer MUST NOT strip an alt that is present (NORMATIVE.md §12.1). The distinction is deliberate: a small producer is never blocked for an alt-less image, but accessibility information, once authored, is never silently dropped.

For informative images (diagrams, screenshots, photographs that carry meaning), alt MUST be a meaningful textual description.
For decorative images (visual flourishes, spacers, redundant illustrations of adjacent text), alt="" (empty string) is RECOMMENDED. An empty alt is a positive signal to assistive technology that the image carries no content; it is not a missing attribute.

Question types that carry image references in tool-specific extension fields (e.g. reserved-type hotspot, graphicGapMatch) SHOULD include an alt-text-equivalent property when those types are promoted to first-class schemas (see §11).

2.2 Consumer obligations

When an image is rendered, a consumer MUST expose its alt text to assistive technology. A consumer that strips <img> (e.g. when sanitization fails) MUST surface the alt text as fallback content rather than silently dropping the image entirely.

A missing alt attribute SHOULD trigger a domain-validation warning per HTML_SAFETY.md §8.2; the consumer MUST still render the image (the failure mode is a warning to the author, not a refused document).
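
The alt rules in §2.1 and the warning above distinguish three cases: alt absent, alt empty (decorative), and alt carrying text. A minimal sketch of that check, using Python's standard html.parser, might look like the following. It is not the reference validator; the names AltAudit and audit_alt are illustrative assumptions.

```python
from html.parser import HTMLParser


class AltAudit(HTMLParser):
    """Collects <img> tags whose alt attribute is absent (not merely empty)."""

    def __init__(self):
        super().__init__()
        self.missing = []    # (line, column) of each <img> with no alt at all
        self.decorative = 0  # count of <img alt=""> (a valid decorative signal)

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        values = dict(attrs)
        if "alt" not in values:
            self.missing.append(self.getpos())
        elif values["alt"] in ("", None):
            self.decorative += 1


def audit_alt(html):
    """Return (positions of images missing alt, number of decorative images)."""
    parser = AltAudit()
    parser.feed(html)
    parser.close()
    return parser.missing, parser.decorative
```

A caller would report each entry in the first list as a non-blocking warning and still render the image. Empty alt is counted separately because it is a deliberate signal, not an omission.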

3. Video and audio: captions, transcripts, descriptions — WCAG 1.2.1, 1.2.2, 1.2.3, 1.2.5

3.1 Producer obligations

For prerecorded instructional video that contains speech or meaningful audio, producers MUST emit at least one <track kind="captions"> or <track kind="subtitles"> element with a valid src and srclang when claiming Accessibility Profile conformance. This satisfies WCAG 1.2.2 Captions (Prerecorded) at Level A. WebVTT is the RECOMMENDED caption format (broad browser support; AT-compatible).

For all other <video> content (decorative, non-speech, ambient), producers SHOULD emit a <track kind="captions"> or <track kind="subtitles"> element where the content carries any information the learner is expected to receive (HTML_SAFETY.md §7.5).

When claiming Accessibility Profile conformance, producers MUST provide a transcript for prerecorded instructional content that carries speech — either as adjacent ContentItem.html prose or as a linked resource:

For audio-only instructional content (e.g. an <audio> listening passage), the transcript is the text alternative required by WCAG 1.2.1 Audio-only and Video-only (Prerecorded) at Level A.
For instructional video, the transcript is required in addition to the captions above; it satisfies WCAG 1.2.3 (media alternative) and serves learners who cannot use synchronized captions (deafblind users on a braille display, users who need to read at their own pace).

Outside an Accessibility Profile claim, a transcript is RECOMMENDED but not required — base conformance never compels a small producer to author one. As with alt (§2.1), the base floor is preservation, not production: a transcript or <track> already present MUST round-trip (NORMATIVE.md §12.1).
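
The caption-track requirement at the start of this section is mechanically checkable. The sketch below, again built on the standard html.parser, counts <video> elements that lack a captions or subtitles <track> with both src and srclang. CaptionAudit and count_uncaptioned are hypothetical names, and the check cannot tell whether a video actually contains speech, so it flags every <video> without such a track.

```python
from html.parser import HTMLParser


class CaptionAudit(HTMLParser):
    """Counts closed <video> elements with no usable captions/subtitles track."""

    CAPTION_KINDS = {"captions", "subtitles"}

    def __init__(self):
        super().__init__()
        self.in_video = False
        self.has_caption_track = False
        self.uncaptioned = 0

    def handle_starttag(self, tag, attrs):
        values = dict(attrs)
        if tag == "video":
            self.in_video = True
            self.has_caption_track = False
        elif tag == "track" and self.in_video:
            kind = (values.get("kind") or "").lower()
            # A track only counts when kind, src, and srclang are all usable.
            if kind in self.CAPTION_KINDS and values.get("src") and values.get("srclang"):
                self.has_caption_track = True

    def handle_endtag(self, tag):
        if tag == "video" and self.in_video:
            if not self.has_caption_track:
                self.uncaptioned += 1
            self.in_video = False


def count_uncaptioned(html):
    """Return how many <video>...</video> elements lack a caption track."""
    parser = CaptionAudit()
    parser.feed(html)
    parser.close()
    return parser.uncaptioned
```

Under a Profile claim each hit would be a failure for speech-bearing video; outside a claim it corresponds to the warning in §8.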

<track kind="descriptions"> (audio descriptions of visual-only information) is RECOMMENDED for video where visual content is essential to the pedagogy and not redundantly narrated. This pairs with WCAG 1.2.5 Audio Description (Prerecorded) at AA.

3.2 Consumer obligations

A consumer that renders <video> or <audio> MUST surface caption/subtitle controls when <track> elements are present. A consumer MUST NOT auto-play media (HTML_SAFETY.md §7.1, §7.2). The <video> rendering pipeline is user-driven, which is itself an accessibility requirement (motion-sensitivity, screen-reader interruption, bandwidth control). Auto-playing audio that runs longer than 3 seconds without a mechanism to pause, stop, or adjust its volume would also fail WCAG 1.4.2 Audio Control.

A consumer SHOULD render <track kind="descriptions"> as either a switchable audio track or a synchronized text alternative.

4. Keyboard alternatives for structured-task question types — WCAG 2.1.1, 2.5.1, 2.5.2, 2.5.3, 2.5.7 (2.2 designed-in), 4.1.2, 1.3.1

Three implemented question types involve drag-and-drop or pointer-driven interaction — matching, ordering, and placement. The cloze family (wordBankCloze, multiGapCloze) is structurally similar; multipleChoiceCloze is dropdown-based and inherently keyboard-accessible. The reserved-for-2027 hotspot, graphicGapMatch, graphicAssociate, and graphicOrder types compound the same pattern with image regions.

4.1 Consumer obligations

A consumer that renders these types MUST provide a fully keyboard-navigable interaction. Pointer-only implementations are non-conforming for accessibility purposes regardless of LC-JSON conformance. Per WCAG 2.5.7 (designed-in from 2.2), every drag interaction MUST additionally ship a single-pointer, non-drag alternative.

Concretely:

matching (pairs mode) — Tab-to-item, Enter-to-select, Tab-to-match, Enter-to-pair (or equivalent two-step keyboard model) MUST work without a pointer. Native <select> per item is the simplest conforming pattern (see §4.2.2).
matching (classification mode) — Tab through the item pool, Enter-to-select an item, Tab-to-category, Enter-to-place. Many items can target the same category.
ordering — Up/Down (or Left/Right for orderingUnit: "word") keys MUST move a focused tile within the sequence. The interaction model SHOULD be discoverable from focus state alone. See §4.2.1 for a recommended ARIA pattern.
placement — Tab through the distractor pool and the gap targets; Enter-to-select an item, Tab-to-gap, Enter-to-place. The interaction MUST work without a pointer regardless of placementUnit mode. A labeled <select> per gap is the simplest conforming pattern.
wordBankCloze, multiGapCloze — Bank-token selection and gap-placement MUST be reachable by keyboard. multipleChoiceCloze’s <select> rendering is inherently keyboard-accessible and is the RECOMMENDED fallback pattern when richer drag-and-drop interactions cannot be made keyboard-equivalent.

Focus indicators on interactive elements MUST be visible (WCAG 2.4.7 Focus Visible) and SHOULD meet 3:1 contrast against adjacent backgrounds (WCAG 1.4.11 Non-text Contrast).

4.1.1 The accessible alternative is expressible from the document data

The position/target semantics a consumer needs to render a keyboard- and AT-navigable alternative are already carried by the schemas, so the accessible path is expressible from the document rather than improvised at render time: ordering by item position (items[i] is the tile for position i); placement by gap number (@@@N markers correspond to placements[].gap); matching by item↔match value (pairs) or item→category value (classification). Element identity is positional or by value rather than a durable token — sufficient for rendering and for scoring, including repeated values, which are disambiguated by position. A consumer that needs durable per-element identity across systems (for example, portable response or analytics interchange) supplies it at its own layer; the wire format intentionally does not carry it.

4.2 Recommended ARIA patterns (informative)

The following patterns are RECOMMENDED for consumers; they satisfy 4.1.2 (Name, Role, Value), 1.3.1 (Info and Relationships), and 2.5.3 (Label in Name) for the structured-task question types. They are informative — a consumer that satisfies the §4.1 obligations through a different ARIA pattern is conforming.

4.2.1 Ordering

Bank — role="group" with aria-labelledby pointing at a visible label (“Available tiles” or equivalent).
Answer area — role="listbox" with aria-orientation="horizontal" for orderingUnit: "word" and aria-orientation="vertical" for sentence/paragraph; aria-labelledby for the answer label; aria-describedby pointing at visible keyboard-and-pointer instructions.
Slots inside the listbox — role="presentation" so the listbox→option relationship is preserved across intervening layout elements.
Tiles — tabindex="0" on every tile so all tiles are reachable while arrow keys remain available for movement. Each tile’s accessible name (aria-label) carries the content and position information when placed (e.g. “goes, position 2 of 5”). When a tile is placed, set aria-selected="true".
Live region — a visually-hidden aria-live="polite" aria-atomic="true" element for movement announcements (tile picked up, tile moved, tile returned to bank). This satisfies WCAG 4.1.3 Status Messages for the interaction’s transient state.
Single-pointer alternative — click-to-place from the bank, click-to-pick-up from a placed tile, click-to-place at another position. Distinct visual indication when a tile is “picked up” (separate from the focus indicator, since both can show simultaneously).
Discoverable instructions — keyboard-and-pointer instructions SHOULD be visible (not buried in aria-label) and referenced by aria-describedby on the listbox.

An alternative satisfying the same obligations is the WAI-ARIA Authoring Practices grab/drop model: Space to “grab,” arrows move only while grabbed, single-roving tabindex. The pattern above (per-tile tabindex) is the recommended baseline for short sequences; the grab/drop model scales better for long sequences at the cost of a mode step.
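
The keyboard movement and naming rules in this pattern reduce to two small pure functions, which a consumer could call from its key handlers. This is a sketch under assumptions: move_tile and tile_label are hypothetical names, and the sequence is modelled as a plain list of tile texts.

```python
def move_tile(tiles, index, delta):
    """Move the tile at `index` by `delta` (-1 = Up/Left, +1 = Down/Right).

    Returns (new_tiles, new_index). A move past either end is a no-op,
    so focus stays on the same tile and nothing is lost.
    """
    target = index + delta
    if not 0 <= target < len(tiles):
        return list(tiles), index
    moved = list(tiles)
    moved[index], moved[target] = moved[target], moved[index]
    return moved, target


def tile_label(tiles, index):
    """Accessible name for a placed tile, e.g. 'goes, position 2 of 5'."""
    return f"{tiles[index]}, position {index + 1} of {len(tiles)}"
```

After each move, the renderer would update the tile's aria-label from tile_label and write a short message into the polite live region.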

4.2.2 Matching, Placement

The simplest conforming pattern is native form controls:

matching (pairs) — one <select> per item, options drawn from match values + distractors (shuffled per §5.6 of NORMATIVE.md). Each <select> is labeled with the item text via a visible <label> or aria-labelledby.
matching (classification) — one <select> per item, options drawn from the category labels.
placement — one <select> per @@@N gap, options drawn from placements[].item values + distractors. Each <select> is labeled with surrounding-passage context or a gap label.

Native <select> is inherently keyboard-accessible, satisfies 2.5.3 by carrying its visible label as its accessible name, and avoids the ARIA-listbox complexity of §4.2.1. Richer drag-and-drop renderings are permitted but MUST ship the keyboard and single-pointer alternatives per §4.1.
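
As an illustration of the placement case, the option list for each per-gap <select> can be derived directly from the document data. The sketch below assumes integer gap numbers, a passage string containing @@@N markers, and a flat list of distractor strings; the function name and the passage and distractors parameter names are hypothetical. Sorting stands in for whatever option ordering the consumer applies.

```python
import re

GAP_MARKER = re.compile(r"@@@(\d+)")


def gap_select_model(passage, placements, distractors=()):
    """Return {gap_number: option list} for each @@@N marker in `passage`."""
    # Every gap offers the same pool: all placement items plus distractors.
    pool = sorted({p["item"] for p in placements} | set(distractors))
    declared = {p["gap"] for p in placements}
    model = {}
    for match in GAP_MARKER.finditer(passage):
        gap = int(match.group(1))
        if gap not in declared:
            raise ValueError(f"marker @@@{gap} has no placements[] entry")
        model[gap] = pool
    return model
```

Each key then becomes one labeled <select>, with the learner's choice scored against the placements entry for that gap number.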

4.2.3 Cloze family

simpleGapFill, wordBankCloze, multiGapCloze, multipleChoiceCloze MAY be rendered with native text inputs (<input type="text">) or selects (<select>). Each gap MUST have a programmatic label — either an associated <label> element, or aria-label, or aria-labelledby pointing at adjacent gap-context prose.

4.3 Producer obligations

Producers MAY include hint text guiding learners who use keyboard or assistive technology, as hint strings on the question or as adjacent ContentItem.html prose. The wire format does not currently carry interaction-specific accessibility hints; this is intentional (consumer-defined affordance), but producers SHOULD assume diverse interaction modalities when authoring.

A post-1.0 accessibility guidance update may deepen this with: an aria-grabbed/aria-dropeffect deprecation note, modern aria-activedescendant patterns as an alternative to per-tile tabindex, focus-management requirements during placement, and screen-reader announcement timing requirements for partial-credit feedback.

5. Feedback: not by color alone — WCAG 1.4.1, 4.1.3

Question types that emit feedback (question-types-reference.md, Common Properties — feedback, choiceFeedback) carry textual content. Consumers MUST render this textual feedback in addition to any visual indicators of correctness (green/red highlighting, check/cross icons).

A consumer MUST NOT convey correctness solely through color or icon. Conformant rendering provides at minimum:

An accessible textual indicator (“Correct”, “Incorrect”, or the producer-supplied feedback string) — WCAG 1.4.1 Use of Color.
A non-color visual indicator (icon, position, label) for sighted users with color-vision differences.
An assistive-technology-readable announcement when feedback updates dynamically — WCAG 4.1.3 Status Messages.

This binds consumer rendering. The wire format already carries the textual indicators; the obligation is to render them.

5.1 Recommended live-region pattern (informative)

Per-question feedback that updates dynamically (without a page reload) SHOULD be exposed to assistive technology via an ARIA live region:

Routine feedback (per-question correct/incorrect, score updates): aria-live="polite" so the announcement does not interrupt the learner’s current speech.
Critical feedback (final score, submission confirmation, error states): role="alert" (implicit aria-live="assertive").
Score summaries that change after submission: the live region MUST contain the textual indicator before any visual transition begins.

Status-message regions SHOULD NOT receive focus; focus management for status announcements is governed by 4.1.3 — expose to AT without moving focus.
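
The three indicators required in §5 and the polite/assertive split above can be combined in a single rendering helper. The following is a sketch only: render_feedback, the CSS class, and the icon characters are assumptions, and the feedback string is treated as plain text and escaped.

```python
from html import escape


def render_feedback(correct, feedback_text="", critical=False):
    """Return markup with a text label, a non-color icon, and a live-region role."""
    label = "Correct" if correct else "Incorrect"
    icon = "\u2713" if correct else "\u2717"  # check / cross, hidden from AT
    # role="alert" implies aria-live="assertive"; routine feedback stays polite.
    live = 'role="alert"' if critical else 'role="status" aria-live="polite"'
    detail = f" {escape(feedback_text)}" if feedback_text else ""
    return (
        f'<div class="feedback" {live}>'
        f'<span aria-hidden="true">{icon}</span> '
        f"<strong>{label}.</strong>{detail}"
        "</div>"
    )
```

The text label carries the meaning, the icon gives sighted users a non-color cue, and the region has no tabindex, so it is announced without taking focus.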

6. Language and direction — WCAG 3.1.1, 3.1.2

The language field requirement is also tied to EN 301 549 clause 5.4 (Preservation of accessibility information during conversion) for educational content delivery.

6.1 Producer obligations

Every Course or QuestionSet carries a language field (a BCP 47 tag, commonly a bare ISO 639-1 code; see LOCALIZATION.md §3) at the document root. Producers MUST set language to the primary delivery language. When the document carries content in a secondary language (typically the learner’s L1 for [L1:] translation/support tags), producers SHOULD also set supportLanguage.

Within HTML-bearing fields, producers MAY use the lang attribute to mark spans of content in a different language than the document default (per HTML_SAFETY.md §3.1). Producers SHOULD use lang for any in-line foreign-language quotation or term — this satisfies WCAG 3.1.2 Language of Parts.

If the primary delivery language is a right-to-left language (Arabic, Hebrew, Persian, Urdu, etc.), producers SHOULD indicate document-level direction where the consumer supports it. For embedded RTL passages inside an LTR document (for example, an English lesson that quotes Arabic, Hebrew, Persian, or Urdu in the body), producers SHOULD mark the relevant HTML span or block with local lang and dir attributes (per HTML_SAFETY.md §3.1). See examples/course-rtl-writing-systems.json for a worked LTR-document-with-embedded-RTL example.

6.2 Consumer obligations

A consumer MUST honor the document-level language field when setting the rendering surface’s lang attribute. For web consumers, this means setting <html lang> from the document language rather than hardcoding a single locale. This satisfies WCAG 3.1.1 Language of Page.

A consumer MUST honor the dir attribute on HTML-bearing elements when rendering RTL content; failure to do so produces unintelligible bidirectional text. For RTL document languages (ar, he, fa, ur), a web consumer SHOULD additionally emit <html dir="rtl"> so the browser’s bidirectional algorithm is engaged for the whole rendering surface.
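
For a web consumer, the two document-level rules above amount to deriving the root element's attributes from the language field. A minimal sketch follows, assuming the document is already parsed into a dict; html_root_attrs is a hypothetical name, and the RTL set is limited to the four codes named in this section.

```python
RTL_PRIMARY_SUBTAGS = {"ar", "he", "fa", "ur"}


def html_root_attrs(document):
    """Derive <html> lang/dir attributes from an LC-JSON document's language."""
    language = document["language"]
    # Only the primary subtag decides direction: "fa-IR" is still RTL.
    primary = language.split("-")[0].lower()
    direction = "rtl" if primary in RTL_PRIMARY_SUBTAGS else "ltr"
    return {"lang": language, "dir": direction}
```

A production consumer would likely extend the RTL set or key on script subtags; span-level lang and dir inside HTML-bearing fields are left untouched by this step.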

A consumer MUST NOT strip lang or dir attributes during sanitization. Both attributes are explicitly allowed on every element class in HTML_SAFETY.md §3.1.

Emitting lang on a foreign-language span is necessary but not sufficient for a screen reader to pronounce that span correctly. lang is an instruction; whether it is acted on depends on the end user’s environment, which neither the format nor the consumer controls. Two conditions must hold. First, the reader must support automatic language switching and have it enabled; support varies (screen readers such as NVDA and JAWS switch reliably, Windows Narrator’s automatic switching is comparatively limited, and VoiceOver sits in between). Second, the matching voice must be installed: a reader with only an English voice reads a correct lang="es" span in English, mispronouncing it. The producer/consumer duty is therefore to emit and preserve lang/dir faithfully; correct pronunciation is completed by the user’s assistive technology. This does not make lang optional: without it, no reader can switch at all. See LOCALIZATION.md §7 for the fuller discussion.

The localization model promised here — the distinct roles of language / lang / supportLanguage, BCP 47 language-tag rules, the single-document-per-language boundary, and the pronunciation-expectations framing above — is specified in LOCALIZATION.md. What remains for a later iteration: explicit RTL rendering tests in the conformance corpus.

7. Reserved and unknown question types: placeholder accessibility — WCAG 1.3.1, 4.1.2

Per NORMATIVE.md §6, consumers MUST preserve reserved/unknown question types in full (every field, value, and nested structure) and SHOULD render a non-interactive placeholder for them.

The placeholder MUST be accessible:

Surfaced to assistive technology with a meaningful description (at minimum: the question’s title, the type name, and the fact that the consumer cannot render this question).
Distinguishable from rendered questions (so a screen-reader user understands the question is informational, not interactive) — WCAG 1.3.1 Info and Relationships.
Not announced as “interactive” or “form control” when no interaction is possible — WCAG 4.1.2 Name, Role, Value.

A consumer MUST NOT silently skip the placeholder for assistive-technology users; the §6 round-trip-preservation philosophy applies equally to the rendering surface.

7.1 Recommended placeholder pattern (informative)

A conforming placeholder SHOULD use:

role="region" — surfaces the placeholder as a labeled landmark, distinguishable from form controls.
aria-label carrying the question’s title, the unsupported type discriminator, and an indication that the renderer can’t display this type. Recommended template: “Unsupported question: <title>. This question type (<type>) can’t be displayed by this viewer.”
A visible visual treatment that signals “informational, not interactive” (e.g. a warning or info alert styling).
No interactive children (<input>, <button>, <select>) — the placeholder is announced as a region, not a form control.
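
The placeholder pattern above can be sketched as a small rendering function. This is an illustration, not a required implementation: render_placeholder, the lc-placeholder class, and the "Untitled question" fallback for a missing title are assumptions; the label text follows the recommended template.

```python
from html import escape


def render_placeholder(title, question_type):
    """Return a non-interactive, labeled region for an unsupported question type."""
    shown_title = title or "Untitled question"  # hypothetical fallback wording
    label = (
        f"Unsupported question: {shown_title}. "
        f"This question type ({question_type}) can\u2019t be displayed by this viewer."
    )
    safe = escape(label, quote=True)  # safe for both attribute and text contexts
    return (
        f'<div role="region" aria-label="{safe}" class="lc-placeholder">'
        f"<p>{safe}</p>"
        "</div>"
    )
```

The same text appears visibly and as the accessible name, and the markup contains no form controls, so assistive technology announces a region rather than an interactive element.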

A post-1.0 accessibility guidance update may deepen this with: example placeholder text in multiple languages and producer guidance for emitting accessibility metadata on tool-specific extensions to reserved types.

8. Validator severity (current baseline, established in rc.1)

The reference validator surfaces accessibility issues at the following severities. WCAG SC references are cross-references — accessibility violations in producer output are content-validation issues, not just renderer concerns.

Issue	Severity	WCAG SC	Cross-reference
Missing `alt` on `<img>`	warning	1.1.1	`HTML_SAFETY.md` §8.2; §2 above
`<video>` without `<track kind="captions">` or `kind="subtitles"`	warning	1.2.2	§3.1
`<iframe>`, `<script>`, event handlers (inaccessible regardless)	error	4.1.2 (would-be)	`HTML_SAFETY.md` §8.1
Missing `language` at document root	error (schema-enforced)	3.1.1	§6.1
Reserved-type question without a `title`	informational note (recommended for placeholder)	1.3.1, 4.1.2	§7

A post-1.0 accessibility tooling update may deepen this with: an --accessibility validator flag (analogous to --strict) for tooling that wants to fail-build on warnings, additional severity entries for the reserved-type placeholder surface, and conformance fixtures exercising accessibility-related warnings/errors.
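
To make the severity table and the proposed opt-in promotion concrete, here is a hypothetical sketch. Only the severities come from the table above; the issue codes, the BASELINE mapping, and both function names are invented for illustration, and the accessibility parameter models a flag that does not exist in the current baseline.

```python
# Severities from the table in this section, keyed by made-up issue codes.
BASELINE = {
    "img-missing-alt": "warning",
    "video-missing-caption-track": "warning",
    "forbidden-element": "error",
    "missing-language": "error",
    "reserved-type-missing-title": "info",
}


def effective_severity(issue, accessibility=False):
    """Promote warnings to errors when the opt-in mode is on; leave the rest."""
    severity = BASELINE[issue]
    if accessibility and severity == "warning":
        return "error"
    return severity


def exit_code(issues, accessibility=False):
    """Return 1 if any reported issue is an error under the chosen mode, else 0."""
    failed = any(effective_severity(i, accessibility) == "error" for i in issues)
    return 1 if failed else 0
```

Because promotion happens only in tooling, the same document keeps the same conformance status in either mode.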

9. WCAG 2.1 AA mapping (informative)

The table below indexes which sections of this document cover which Success Criteria. This is an informative cross-reference; per-criterion normative obligations live in the section bodies.

WCAG 2.1 AA SC	Level	Topic	This profile
1.1.1 Non-text Content	A	Alt text on images	§2
1.2.1 Audio-only/Video-only	A	Transcript or alt media	§3
1.2.2 Captions (Prerecorded)	A	`<track kind="captions">`	§3
1.2.3 Audio Desc. or Media Alt.	A	Description track or transcript	§3
1.2.5 Audio Description (Prerecorded)	AA	`<track kind="descriptions">`	§3
1.3.1 Info and Relationships	A	ARIA roles/labels, structured-task semantics, placeholder landmark	§4, §7
1.4.1 Use of Color	A	Textual + non-color correctness cues	§5
1.4.2 Audio Control	A	No autoplay	§3.2
1.4.11 Non-text Contrast	AA	Focus indicator contrast	§4.1
2.1.1 Keyboard	A	Keyboard alternatives for structured tasks	§4
2.4.7 Focus Visible	AA	Visible focus indicators	§4.1
2.5.1 Pointer Gestures	A	Single-pointer alternatives	§4
2.5.2 Pointer Cancellation	A	Activation on up-event	§4 (consumer behavior)
2.5.3 Label in Name	A	Accessible name contains visible label	§4.2
2.5.7 Dragging Movements	AA (2.2, designed-in)	Single-pointer alternative for every drag	§4
2.5.8 Target Size (Minimum)	AA (2.2, designed-in)	≥ 24×24 px interactive targets	§1.5
3.1.1 Language of Page	A	Document `language` → `<html lang>`	§6
3.1.2 Language of Parts	AA	Inline `lang` on foreign-language spans	§6
4.1.2 Name, Role, Value	A	ARIA semantics on custom widgets and placeholders	§4, §7
4.1.3 Status Messages	AA	Live regions for dynamic feedback	§5

Criteria not listed (e.g. 1.3.2, 2.4.1, 3.2.2, 3.3.x) are properties of a delivering consumer rather than wire-format affordances. A delivering consumer’s full WCAG 2.1 AA claim covers them per its own conformance plan.

10. Cross-references

NORMATIVE.md — RFC 2119 conformance requirements; language field requirement; reserved-type round-trip; randomization requirements (§5.6).
HTML_SAFETY.md — <img alt>, <track>, lang, dir, validator severity.
question-types-reference.md — per-type feedback fields and structured-task definitions.
GLOSSARY.md — terminology.
WCAG 2.1 — https://www.w3.org/TR/WCAG21/
WCAG 2.1 Quick Reference (filterable SC list) — https://www.w3.org/WAI/WCAG21/quickref/
WAI-ARIA Authoring Practices Guide (APG) — https://www.w3.org/WAI/ARIA/apg/ — pattern recommendations (listbox, grab/drop, status messages).
ATAG 2.0 — https://www.w3.org/TR/ATAG20/ — authoring-tool obligations referenced in §1.3.
EN 301 549 — https://www.etsi.org/deliver/etsi_en/301500_301599/301549/ — EU harmonized standard pointing at WCAG 2.1 AA.

11. From 1.0 onward

This document is the 1.0 accessibility profile. Its obligations are the stable accessibility contract: the base-conformance preservation floor (NORMATIVE.md §12.1) and the opt-in Accessibility Profile authoring MUSTs (NORMATIVE.md §12.2: alt, captions, transcripts) were settled as of rc.3 and carry into 1.0 unchanged. 1.0 is a pure rebase of rc.3: it adds no new obligations and tightens nothing.

The deepenings below are post-1.0, additive, and either informative or opt-in: none change the base-vs-Profile contract above, none gate 1.0. They are listed so implementers can see the intended direction.

Per-criterion cross-reference table — a presentation of §9 mapping each WCAG SC to the obligation already stated in §§2–7. Clarity, not new obligation.
Expanded ARIA patterns — patterns for matching classification mode, richer announcement guidance for partial-credit feedback (§4), per-language placeholder text examples (§7). Informative.
Screen-reader timing guidance — announcement timing for auto-grading flows (§5). Informative.
--accessibility validator flag — analogous to --strict; opt-in tooling that promotes accessibility warnings (missing alt, missing <track> on speech-bearing video) to errors for teams that want to fail-build on them. Opt-in; changes no document’s conformance.
Conformance fixtures for accessibility — an a11y/ corpus suite exercising the --accessibility flag, beyond the round-trip and missing-language fixtures already in the baseline.
Reserved-type accessibility metadata schema — guidance for emitting accessibility metadata on hotspot, graphicGapMatch, and the other graphic types when their per-type schemas land (tied to the 1.1 promotion of the reserved types).
Multilingual accessibility metadata shape — localized alt text / transcripts / accessible-name fields per locale; bounded by the single-language-per-document decision in LOCALIZATION.md §2.4.

Resolved in rc.3 (no longer pending): the authoring obligations for alt, captions, and transcripts were settled as Accessibility Profile MUSTs (§12.2), deliberately not promoted into base NORMATIVE.md — base conformance stays preservation-only so a small or non-institutional producer is never blocked. The BCP 47 / ISO 639-1 language-tag reconciliation also shipped in rc.3 (see LOCALIZATION.md §3).

Implementers building against 1.0 can rely on the obligations stated above. Implementers already built against 1.0-rc.3 have the same accessibility obligations under 1.0.

LC-JSON Localization and Language Model

Status: Part of the 1.0 contract. First introduced in 1.0-rc.3; codifies the language model that has been implicit since 1.0-rc.1; introduces no breaking change. The language root field and lang/dir annotation behave exactly as they did in rc.1/rc.2; this document states the model explicitly and sets expectations.
Spec version: 1.0
Last updated: 2026-06-30

This document defines how LC-JSON represents natural language: what the language and supportLanguage fields mean, how lang/dir annotate individual spans, which language-tag forms are accepted, and — importantly for implementers — what the format can and cannot promise about pronunciation in assistive technology. The keywords MUST, MUST NOT, SHOULD, SHOULD NOT, MAY, and RECOMMENDED are interpreted as in RFC 2119 and RFC 8174.

1. Scope

The word “language” does four different jobs in a learning document, and conflating them is the most common source of confusion for implementers. This document separates them:

Concept	Field / mechanism	What it is
Delivery language	`language` (root)	The single primary language the document is authored in.
Language of parts	`lang` / `dir` on HTML	Individual spans in a different language from the delivery language.
Support language	`supportLanguage` (root)	An optional pedagogical layer: the learner’s first language (L1), surfaced to aid comprehension of second-language (L2) content.
Translation bundles	(not in 1.x)	Parallel copies of the same content in multiple languages within one document — explicitly out of scope; see §2.4.

LC-JSON’s wire format is language-neutral: a document may declare any natural language and any script. This document governs how that language is declared and annotated, not which languages are permitted (all are).

2. The four roles of “language”

2.1 Delivery language — `language`

language is a required root field on both the course and questionSet artifacts. It declares the single primary language the document is authored and delivered in — e.g. "language": "en" means the document is an English document.

A document has exactly one delivery language. LC-JSON 1.x is single-language-per-document (see §2.4).
A delivering consumer SHOULD set the rendering surface’s primary language from this field (for a web consumer, <html lang="…">), so that assistive technology, hyphenation, and font selection default correctly.
language is the document’s identity, not a runtime choice: a consumer does not “switch” a document’s delivery language; it renders the document in the language it declares.

2.2 Language of parts — `lang` and `dir` on HTML

Within HTML-bearing fields (ContentItem.html, SignpostItem.customHtml), a run of text in a language other than the delivery language is marked with the standard HTML lang attribute (and dir where the script direction differs). This is the WCAG 3.1.2 Language of Parts mechanism.

{
  "type": "content",
  "html": "<p>The French call it <span lang=\"fr\">l'esprit de l'escalier</span> — the wit of the staircase.</p>"
}

Language of parts is about correct rendering and pronunciation, not translation: the French span in the English document above is content in French, not an English string’s French equivalent. lang/dir are part of the HTML safety profile’s universal-attribute allowlist (HTML_SAFETY.md §3.1) and MUST survive a consumer’s sanitization and round-trip (NORMATIVE.md §12.1).

2.3 Support language — `supportLanguage`

supportLanguage is an optional root field (nullable). It names the learner’s first language (L1) for a document whose delivery language is a second language (L2) being taught. It exists for the language-teaching case: an English course built for Spanish-speaking learners declares "language": "en", "supportLanguage": "es", signaling that L1 (Spanish) support — glosses, hints, translations of key terms — is appropriate.

supportLanguage is a signal, not a rendering instruction. How a consumer surfaces L1 support — inline glosses, hover tooltips, a toggle, a glossary panel, or not at all — is consumer-defined. One consumer’s convention is an inline bracket tag ([L1: una hipoteca]) that its renderer expands to a lang-annotated span; that is an authoring/rendering convention of that consumer, not a wire-format construct. The wire format carries supportLanguage plus ordinary text and lang-annotated parts; the pedagogy is layered on by the consumer.

When supportLanguage is absent or null, no L1 support is implied and a consumer SHOULD render the document monolingually.

2.4 Out of scope for 1.x — translation bundles

LC-JSON 1.x does not provide field-level localization. There is no shape in which a single field carries parallel translations (no "title": {"en": "...", "es": "..."} maps, no per-locale field bundles). A document is authored in one delivery language.

Multiple languages are delivered as multiple documents. An English course and its Spanish translation are two separate LC-JSON documents, each with its own single language. This keeps the wire format simple, keeps validation unambiguous, and matches how content interchange formats in adjacent ecosystems treat translation (as separate artifacts, not multiplexed fields).

This is a deliberate boundary, not an oversight. Producers MUST NOT assume a future minor version will add localized field bundles; if it ever does, it will be additive and will not change the meaning of the single-language documents defined here.

3. Language tags

Language-tag values (language, supportLanguage, and HTML lang) are BCP 47 language tags.

The common case is a bare ISO 639-1 primary subtag: en, es, fr, ar. Producers SHOULD use the bare primary subtag when region and script do not matter.
Region and script subtags are permitted where they carry meaning: pt-BR, es-MX, en-GB, zh-Hant. These are most useful for selecting a regional voice or regional spelling.
A conforming consumer MAY act on only the primary subtag (treating es-MX as es) when it has no region-specific behavior. Producers therefore SHOULD NOT rely on a consumer honoring a region subtag, but MAY emit one so that consumers which do (for example, choosing a regional text-to-speech voice) can use it.

The reference validator performs a plausibility check on these fields (well-formed primary subtag, optional script, optional region) and emits a WARN — not an error — on a malformed tag. It does not validate the full BCP 47 registry.
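
A plausibility check of this kind, together with the primary-subtag fallback a consumer is permitted to apply, could be sketched as follows. The regular expression and the function names are assumptions made for illustration; they are not the reference validator's actual pattern or helpers, and the sketch deliberately does not consult the BCP 47 registry.

```python
import re

# Hypothetical shape check: 2-3 letter primary subtag, optional 4-letter
# script, optional region (2 letters or 3 digits). An assumption for
# illustration, not the reference validator's real pattern.
_TAG_SHAPE = re.compile(r"[A-Za-z]{2,3}(-[A-Za-z]{4})?(-(?:[A-Za-z]{2}|[0-9]{3}))?")


def is_plausible_language_tag(tag):
    """Return True when tag is shaped like primary[-script][-region]."""
    return isinstance(tag, str) and _TAG_SHAPE.fullmatch(tag) is not None


def primary_subtag(tag):
    """Reduce a tag to its primary subtag, e.g. 'es-MX' -> 'es'."""
    return tag.split("-", 1)[0].lower()


def check_language_field(name, value):
    """Return (severity, message) findings for one language-tag field."""
    if value is None:
        return []  # absence is the concern of other rules, not this check
    if not is_plausible_language_tag(value):
        # A malformed tag is a warning, never an error.
        return [("WARN", f"{name}: {value!r} is not a plausible BCP 47 tag")]
    return []
```

Under these assumptions, `check_language_field("language", "en_US")` yields one WARN finding, while `"pt-BR"` and `"zh-Hant"` pass silently.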

4. Text direction — `dir`

The delivery language’s script direction is the document default; for a right-to-left (RTL) delivery language a consumer SHOULD set the rendering surface direction accordingly. Within content, the dir attribute marks spans or blocks whose direction differs from the surrounding text — an Arabic phrase embedded in an English paragraph, or an English term inside an Arabic passage.

Producers SHOULD emit dir alongside lang whenever an annotated part’s script direction differs from its surroundings; a lang without the matching dir can render with incorrect bidirectional ordering. The full producer/consumer direction obligations live in ACCESSIBILITY.md §6; the worked example examples/course-rtl-writing-systems.json demonstrates LTR-with-embedded-RTL across four writing systems.
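
As a rough producer-side illustration of emitting dir alongside lang, the sketch below wraps a phrase in a span and adds dir only when the phrase's direction differs from its surroundings. The `RTL_PRIMARY_SUBTAGS` set and both function names are hypothetical; a real producer would need a fuller script-direction table and would also have to account for script subtags.

```python
from html import escape

# Assumption: a small illustrative set of primary subtags whose usual
# script is written right-to-left. Not an exhaustive or normative list.
RTL_PRIMARY_SUBTAGS = {"ar", "he", "fa", "ur"}


def direction_of(tag):
    """Return 'rtl' or 'ltr' for a language tag, judged by primary subtag only."""
    primary = tag.split("-", 1)[0].lower()
    return "rtl" if primary in RTL_PRIMARY_SUBTAGS else "ltr"


def annotate_part(text, part_lang, surrounding_lang):
    """Wrap text in a span carrying lang, plus dir when the direction differs."""
    attrs = f'lang="{escape(part_lang, quote=True)}"'
    part_dir = direction_of(part_lang)
    if part_dir != direction_of(surrounding_lang):
        attrs += f' dir="{part_dir}"'
    return f"<span {attrs}>{escape(text)}</span>"
```

For example, `annotate_part("una hipoteca", "es", "en")` returns `<span lang="es">una hipoteca</span>`, while an Arabic phrase inside English text additionally receives `dir="rtl"`.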

5. Producer obligations

A producer MUST emit a language root field matching the document’s delivery language (§2.1).
A producer SHOULD mark any HTML span in a language other than the delivery language with lang (§2.2, WCAG 3.1.2), and SHOULD add dir where that span’s script direction differs (§4).
A producer MAY emit supportLanguage for language-teaching documents (§2.3), and MUST leave it absent or null otherwise.
A producer SHOULD use bare ISO 639-1 primary subtags unless a region/script subtag carries real meaning (§3).

6. Consumer obligations

A consumer SHOULD set the rendering surface’s primary language and direction from the document language (§2.1).
A consumer MUST preserve lang and dir on HTML through sanitization and round-trip (per NORMATIVE.md §12.1).
A consumer MAY act on only the primary subtag of any language tag (§3).
A consumer MAY surface supportLanguage-driven L1 support in any form, or none (§2.3).

7. Screen readers and pronunciation — expectations (informative)

This section exists because the gap it describes is invisible to most implementers until a screen-reader user hits it.

Emitting lang on a foreign-language span is necessary but not sufficient for that span to be pronounced correctly. lang is an instruction to the assistive technology; whether the instruction is acted on depends on the delivery environment, which the format cannot control:

The reader must support automatic language switching, and it must be enabled. Support varies by product — NVDA and JAWS switch reliably; Windows Narrator’s automatic switching is comparatively limited; VoiceOver sits in between.
The matching voice / pronunciation data must be installed on the device. A reader with only an English voice will read a correctly-tagged lang="es" span in the English voice — mispronouncing it — even though the markup is perfect.

The practical consequence for implementers: a producer’s job is to emit the affordance (lang/dir) faithfully; a delivering consumer’s job is to preserve it and honor it on the rendering surface. Correct pronunciation is then completed by the end user’s screen reader and installed voices, which is outside the format’s and often the consumer’s control. This does not make lang optional — without it, no reader can switch at all, so the affordance is the floor, not the ceiling. It does mean that “the document is correctly tagged” and “every user hears flawless pronunciation” are different claims, and only the first is within an LC-JSON producer’s or consumer’s power to guarantee.

8. Relationship to the Accessibility Profile

The language-of-parts and direction obligations here overlap the ACCESSIBILITY.md §6 obligations and are bound by the same opt-in Accessibility Profile claim (NORMATIVE.md §10.2). This document adds the language model (the four roles, the single-document boundary, the language-tag rules) and the pronunciation-expectations framing; ACCESSIBILITY.md §6 remains the home for the per-criterion WCAG cross-references.

LC-JSON Validation Surface

Status: Informative reference. The authoritative rules live in NORMATIVE.md — including the artifact-type rule families it incorporates at §3.3.1 — in the companion normative documents (HTML_SAFETY.md, ACCESSIBILITY.md, LOCALIZATION.md), and in the constraints of the JSON Schemas under schemas/. This document catalogs them in one place.

The four reference validators (tools/validate_course.py, tools/lc_collection.py, tools/lc_pack.py, and tools/lc_glossary.py) are non-authoritative reference implementations. They implement the contract; they do not define it. Where a validator’s behavior and a normative source disagree, the normative source wins and the validator’s behavior is a defect to be fixed.

Spec version: 1.1
Last updated: 2026-07-22

This document maps every documented validation rule in LC-JSON (Learning Content JSON) 1.1 to the place where it is enforced. The audience is implementers building consumers, validators, or producer round-trip tests — the same audience as NORMATIVE.md.

The catalog is additive and descriptive: it introduces no new normative rules. The inventory pass that built this catalog (2026-05-24) surfaced eight documented-but-unenforced rules; all eight were closed in the same rc.1-polish session by extending tools/validate_course.py (no schema changes). See §14 Forward-looking deepenings for the deepenings cataloged during the 1.0 cycle (1.0 has since shipped; the still-open items carry forward).

Rules for the 1.1 artifact types are cataloged in §15 (Subject Collection, SC-*), §16 (Curriculum Pack, CP-*), and §17 (Glossary, GL-*); §18 catalogs the 1.1 deepenings to the root document (RD-*) and to courses (CO-*). These four rule families are normative requirements — NORMATIVE.md §3.3.1 incorporates SC-*, CP-*, GL-*, RD-1, and CO-* as conformance rules for their artifact types; the sections below enumerate each rule, cite its source, and tag its enforcement tier. (This catalog adds nothing to those rules; it is the one-map view of them.) The rule ids are load-bearing — the reference validators cite them — so they cannot silently drift. §15–§18 follow §14 rather than sitting beside §3–§11 to keep the 1.0 section numbers (and their anchors) stable.

1. Scope and structure

LC-JSON’s validation surface is split across four enforcement sites:

27 JSON Schemas in schemas/ — declarative constraints (Draft 7) enforced by any conforming JSON Schema validator.
NORMATIVE.md — RFC 2119 prose obligations that may or may not be representable in JSON Schema.
tools/validate_course.py (the reference validator) — domain checks that run after schema validation, plus consumer-friendly diagnostics.
Companion normative documents and informative references — HTML_SAFETY.md (normative) and ACCESSIBILITY.md (normative for tools claiming the Accessibility Profile; preservation obligations bind every consumer per NORMATIVE.md §12.1); per-type prose in question-types-reference.md and authoring patterns in ITEM_PATTERNS.md (both informative).

A consumer that only runs schema validation will accept documents the spec considers invalid (e.g. an MCQ with no correct option, a placement whose placements[].gap points at a missing @@@N marker). A consumer that re-implements the reference validator from prose will miss rules. This catalog gives implementers one map of “these are all the things a conforming consumer must check, and here’s where each rule is enforced.”
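
To make the gap concrete, here is a minimal sketch of one domain check that schema validation alone does not cover: a multipleChoice question whose `options` and `optionsAndPoints` are well-shaped but which has no correct option. The function name and the finding tuples are assumptions for illustration, not the reference validator's code.

```python
def check_multiple_choice(question):
    """Return (severity, message) findings for one multipleChoice question."""
    findings = []
    options = question.get("options", [])
    points = question.get("optionsAndPoints", {})

    # An MCQ must have at least one option worth more than zero points.
    if not any(isinstance(v, (int, float)) and v > 0 for v in points.values()):
        findings.append(("ERROR", "no option has points > 0"))

    # Every listed option needs a points entry.
    for option in options:
        if option not in points:
            findings.append(("ERROR", f"option {option!r} missing from optionsAndPoints"))

    # Extra keys that match no option are only warned about.
    for key in points:
        if key not in options:
            findings.append(("WARN", f"optionsAndPoints key {key!r} is not in options"))
    return findings
```

A question with `"options": ["a", "b"]` and `"optionsAndPoints": {"a": 0, "b": 0}` has the right types and counts, yet this check reports an ERROR for it.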

1.1 The three enforcement tiers

The rule tables below tag each rule with one tier:

Tier	Meaning	Citation format
Schema-enforced	Expressed in one of the JSON Schemas under `schemas/`. Any Draft-7 validator catches violations.	`schemas/<file>.schema.json: <json-pointer>`
Domain-validator-enforced	Not (or not cleanly) expressible in JSON Schema; the reference validator `tools/validate_course.py` checks it. Conforming consumers MUST replicate these checks to round-trip and grade correctly.	`validate_course.py: <function-name>` + NORMATIVE § where cited
Advisory	Described in prose (`NORMATIVE.md`, `README.md`, `question-types-reference.md`, `ITEM_PATTERNS.md`) but not mechanically enforced anywhere. SHOULD/MAY rules, naming conventions, behaviors the spec hints at but lets consumers vary. Listed so implementers know what they are choosing.	Document and section

A fourth, implicit tier (runtime-enforced) covers grading policy, navigation gating, and gradebook display. It is out of scope for this document; LC-JSON specifies document validity, not runtime behavior.

1.2 Severity (Domain-validator rows)

The reference validator distinguishes three severities on its domain-rule pass. Schema-enforced rows are always hard errors (any schema violation fails the document); Advisory rows are not enforced. Domain rows carry one of:

ERROR — the validator returns non-zero exit; the document is non-conforming. Consumers MUST reject.
WARN — the document is reported as suspect but still parses; the validator returns success. Conforming consumers SHOULD surface the warning to the user.
NOTE — informational only (e.g. item.points intentionally weighted away from the sum of question points). The validator returns success without raising; no consumer obligation.

Where a single rule is enforced at multiple tiers (e.g. schema + validator double-check for friendlier messages), the row lists both. Satisfying the strictest tier suffices.
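
The three severities and their effect on the validator's result can be modeled in a few lines. This is a sketch under stated assumptions: the `Finding` type, the helper names, and the concrete exit value 1 are hypothetical (the text above only says "non-zero").

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    ERROR = "ERROR"
    WARN = "WARN"
    NOTE = "NOTE"


@dataclass
class Finding:
    severity: Severity
    rule: str
    message: str


def exit_code(findings):
    """Return non-zero only when at least one ERROR is present."""
    # Assumption: 1 stands in for "non-zero"; the real value is not specified here.
    return 1 if any(f.severity is Severity.ERROR for f in findings) else 0


def findings_to_surface(findings):
    """ERRORs and WARNs are shown to the user; NOTEs carry no obligation."""
    return [f for f in findings if f.severity is not Severity.NOTE]
```

A document that produces only WARN and NOTE findings therefore still validates successfully, while a single ERROR makes it non-conforming.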

1.3 Strict mode and the lenient migration path

The reference validator tools/validate_course.py accepts a --strict flag. The default (lenient) mode emits a warning and falls through with reduced enforcement when it encounters two pre-1.0 document shapes: the wrapped envelope {"course": {...}} and the bare payload {"units": [...]} with no documentType. Neither shape is part of the published 1.0 contract; the lenient handling is a maintainer-side migration aid that allows pre-1.0 document shapes to be ingested during the upgrade — it is not a published affordance third-party producers may rely on.

Under --strict, both shapes are fatal errors. The conformance corpus harness tools/run_corpus.py always invokes the validator with --strict (every fixture is run through the validator); CI runs the harness on every PR; and per NORMATIVE.md §10.3 conformance claims under §10 are evaluated in --strict mode. Third-party consumers and producers should treat the lenient path as a maintainer-side migration aid only. --strict is the mode in which the reference validator attempts to implement the published contract in full; the contract itself is stated in NORMATIVE.md, the companion normative documents, and the schema constraints — not by the validator’s behavior.

Rows in the tables below that depend on this distinction explicitly say “ERROR under --strict; WARN otherwise”; everywhere else, the rule applies uniformly.
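
The strict/lenient split for the two pre-1.0 shapes might look like the following. This is an illustrative sketch, not the logic of tools/validate_course.py: the function name is hypothetical, and treating the wrapped envelope as a root object with a `course` member and no `documentType` is an assumption.

```python
def classify_root_shape(doc, strict=False):
    """Return (severity, message) for a pre-1.0 root shape, or None otherwise."""
    if not isinstance(doc, dict):
        return ("ERROR", "root is not a JSON object")

    # Both legacy shapes are fatal under --strict and warnings otherwise.
    severity = "ERROR" if strict else "WARN"

    if "documentType" not in doc:
        if isinstance(doc.get("course"), dict):
            return (severity, "pre-1.0 wrapped envelope with a 'course' member")
        if isinstance(doc.get("units"), list):
            return (severity, "pre-1.0 bare payload with 'units' and no documentType")

    # Anything else is left to the ordinary root-field rules.
    return None
```

With `strict=True`, `classify_root_shape({"units": []})` reports an ERROR; in the default mode the same document yields a WARN.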

2. Where to look

What	Where
JSON Schemas	`schemas/*.schema.json` — 27 files
Reference validator (course, questionSet)	`tools/validate_course.py`
Conformance language (RFC 2119 MUSTs/SHOULDs/MAYs)	`NORMATIVE.md`
Per-type property reference	`question-types-reference.md`
Subject Collection property reference	`subject-collection-reference.md`
Curriculum Pack property reference	`curriculum-pack-reference.md`
Glossary property reference	`glossary-reference.md`
HTML safety profile (elements, attributes, URL schemes, sanitization)	`HTML_SAFETY.md`
Accessibility profile (preservation + opt-in delivery claim)	`ACCESSIBILITY.md`
Item authoring patterns (consumer-policy plurality)	`ITEM_PATTERNS.md`
Conformance test fixtures	`tests/` — manifest + valid/invalid sets

3. Root document

Required root fields (NORMATIVE.md §3.2). Both artifact types (course, questionSet) share these.

Rule	Tier	Source	NORMATIVE §
Producer MUST emit `$schema` pointing at the canonical published schema URL	Schema-enforced (producer validity)	`course.schema.json: /required[]="$schema"`, `question-set.schema.json: /required[]="$schema"`	§3.2, §4.7
Consumer SHOULD tolerate documents that omit `$schema` (infer the schema from `documentType` + `specVersion`); MUST reject any other root-field omission	Advisory (consumer-side import tolerance)	`NORMATIVE.md` §3.2	§3.2
`$schema`, when present, is a URI	Schema-declared via `format: "uri"` (annotation; not universally enforced by Draft-7 validators — see §13)	`course.schema.json: /properties/$schema/format="uri"`, `question-set.schema.json: /properties/$schema/format="uri"`	§4.7
`documentType` required at root	Schema-enforced	`course.schema.json: /required[]="documentType"`, `question-set.schema.json: /required[]="documentType"`	§3.2
`documentType` is `"course"` (course document)	Schema-enforced	`course.schema.json: /properties/documentType/const="course"`	§3.2, §4.2, §5.3
`documentType` is `"questionSet"` (question-set document)	Schema-enforced	`question-set.schema.json: /properties/documentType/const="questionSet"`	§3.2, §4.2, §5.3
Non-canonical `documentType` casing rejected (`"Course"`, `"questionset"`, `"question-set"`)	Schema-enforced (via `const`)	`course.schema.json` / `question-set.schema.json` `const`; `validate_course.py: dispatch_document_shape` provides casing-tolerant dispatch as a maintainer-side migration aid (§1.3) — disabled under `--strict`	§4.2, §5.3
`specVersion` required at root	Schema-enforced	`course.schema.json: /required[]="specVersion"`, `question-set.schema.json: /required[]="specVersion"`	§3.2, §4.6
`specVersion` matches `^1\.[0-9]+(\.[0-9]+)?$`	Schema-enforced + Domain-validator-enforced (ERROR for `2.x`+)	`course.schema.json: /properties/specVersion/pattern`; `validate_course.py: check_spec_version`	§4.6, §5.2
`specVersion` MUST NOT carry an `-rc.N` suffix	Advisory	`NORMATIVE.md` §4.6, §8.4	§4.6
`language` required at root	Schema-enforced	`course.schema.json: /required[]="language"`, `question-set.schema.json: /required[]="language"`	§12.1
`language` is a plausible BCP 47 tag (bare ISO 639-1, or with region/script subtag)	Domain-validator-enforced (WARN; schema typed only, no `pattern`)	`validate_course.py: validate_course_level` (course path) and `validate_question_set_flat` (question-set path), via `_is_plausible_language_tag`	§13, `LOCALIZATION.md` §3
`supportLanguage` is a plausible BCP 47 tag (or omitted/null)	Domain-validator-enforced (WARN; schema typed only)	`validate_course.py: validate_course_level` (course path) and `validate_question_set_flat` (question-set path), via `_is_plausible_language_tag`	§13, `LOCALIZATION.md` §3
Pre-1.0 wrapped envelope `{"course": {...}}` rejected (published contract; lenient migration aid in default mode)	Domain-validator-enforced (ERROR under `--strict`; WARN otherwise — see §1.3)	`validate_course.py: validate_course` (`--strict` branch)	§3.2, §4.1
Pre-1.0 bare payload `{"units": [...]}` rejected (published contract; lenient migration aid in default mode)	Domain-validator-enforced (ERROR under `--strict`; WARN otherwise — see §1.3)	`validate_course.py: validate_course` (`--strict` branch)	§3.2, §4.1
Property names are camelCase	Advisory (consumer-side import is lenient via `JsonNormalizer`-style helpers)	`NORMATIVE.md` §4.5	§4.5
Extension members keyed `x-<namespace>` MAY appear on root + Course/Unit/Lesson/Item/Question	Advisory (schemas do not restrict `additionalProperties` on those objects)	`NORMATIVE.md` §7.1	§7.1
Extension members MUST NOT appear on `matching.pairs[]`, `matching.categories[]`, or `placement.placements[*]`	Schema-enforced	`matching.schema.json: /allOf/1/then/properties/pairs/items/additionalProperties=false` etc.; `placement.schema.json: /allOf/1/properties/placements/items/additionalProperties=false`	§7.1
Producer MUST NOT introduce a non-extension field beginning with `x-`	Advisory	`NORMATIVE.md` §7.1	§7.1
Consumer MUST NOT reject documents solely for unknown fields or `x-` members	Advisory	`NORMATIVE.md` §5.4, §7.4	§5.4, §7.4
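
A consumer-side reading of the root-field rows above can be sketched as below. The `specVersion` pattern is the one given in the table; the function name and finding tuples are assumptions. Note that the pattern as written admits no suffix, so a value such as `1.1-rc.1` fails it as well.

```python
import re

# Pattern from the table above, applied with fullmatch instead of ^...$ anchors.
SPEC_VERSION = re.compile(r"1\.[0-9]+(\.[0-9]+)?")

# `$schema` is deliberately absent here: a consumer tolerates its omission
# and infers the schema from documentType + specVersion.
REQUIRED_ON_IMPORT = ("documentType", "specVersion", "language")


def check_root(doc):
    """Return (severity, message) findings for the root fields of a document."""
    findings = []
    for name in REQUIRED_ON_IMPORT:
        if name not in doc:
            findings.append(("ERROR", f"missing required root field {name!r}"))

    version = doc.get("specVersion")
    if isinstance(version, str) and not SPEC_VERSION.fullmatch(version):
        findings.append(("ERROR", f"specVersion {version!r} is not a 1.x version"))
    return findings
```

Unknown fields and `x-` members are not inspected at all, consistent with the rule that a consumer must not reject a document solely because of them.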

4. Course-level

Course payload fields on a documentType: "course" document.

Rule	Tier	Source	NORMATIVE §
`title` required, `minLength: 1`	Schema-enforced + Domain-validator-enforced	`course.schema.json: /properties/title`, `/required[*]="title"`; `validate_course.py: validate_course_level`	§3.2
`sourceCourseId`, when present, matches the RFC 4122 UUID pattern (any version; shape-only validation)	Schema-enforced + Domain-validator-enforced (WARN if non-UUID)	`course.schema.json: /properties/sourceCourseId/pattern`; `validate_course.py: validate_course_level`	§4.4
`sourceCourseId` SHOULD be emitted for re-importable or version-tracked courses	Advisory	`NORMATIVE.md` §4.4	§4.4
`version`, when present, matches `^[0-9]+(\.[0-9]+){0,2}$` (1–3 numeric segments)	Schema-enforced + Domain-validator-enforced (WARN)	`course.schema.json: /properties/version/pattern`; `validate_course.py: validate_course_level`	§4.4
Pre-1.0 identity fields (`authorId`, `authorCourseId`) trigger a migration warning	Domain-validator-enforced (WARN)	`validate_course.py: validate_course_level`	(none — migration aid)
Course `objectives[].id` and `objectives[].text` required	Schema-enforced	`course.schema.json: /properties/objectives/items/required`	(none)
Course `objectives[*].difficultyBand` enum: `"Recall"`, `"Understand"`, `"Apply"`, `"Analyze"`, null	Schema-enforced	`course.schema.json: /properties/objectives/items/properties/difficultyBand/enum`	(none)
`courseObjectiveIds[*]` reference `course.objectives[].id`	Domain-validator-enforced (WARN)	`validate_course.py` (objective-reference integrity check)	(none — warning-tier integrity check; unresolved references break signpost auto-rendering)
`estimatedDurationMinutes >= 0`	Schema-enforced	`course.schema.json: /properties/estimatedDurationMinutes/minimum`	(none)
Course `tags[*]` are strings (Unit/Lesson/Item additionally enforce `minLength: 1`)	Schema-enforced	`course.schema.json: /properties/tags/items/type="string"`; `unit.schema.json` / `lesson.schema.json` / `item-base.schema.json: /properties/tags/items/minLength=1`	(none)
`units[]` MUST be present at the root (course); is an array when present	Domain-validator-enforced (ERROR if missing); Schema-enforced (array type when present)	`validate_course.py: validate_course` (“Missing ‘units’ array at root level”); `course.schema.json: /properties/units/type="array"` (the schema’s `default: []` would otherwise admit a missing field)	(none)
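
The two shape-only identity checks in this table can be illustrated as follows. The `version` pattern is taken from the table; the UUID expression is an assumed generic 8-4-4-4-12 hexadecimal shape, since the schemas' exact pattern is not reproduced here, and the function names are hypothetical.

```python
import re

# Assumption: generic 8-4-4-4-12 hex shape, any UUID version.
UUID_SHAPE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)
# From the table: one to three numeric segments.
VERSION = re.compile(r"[0-9]+(\.[0-9]+){0,2}")


def looks_like_uuid(value):
    """Shape-only test; does not inspect the UUID version or variant bits."""
    return isinstance(value, str) and UUID_SHAPE.fullmatch(value) is not None


def check_course_identity(course):
    """Return WARN findings for malformed sourceCourseId / version values."""
    findings = []
    source_id = course.get("sourceCourseId")
    if source_id is not None and not looks_like_uuid(source_id):
        findings.append(("WARN", f"sourceCourseId {source_id!r} is not UUID-shaped"))
    version = course.get("version")
    if version is not None and not (
        isinstance(version, str) and VERSION.fullmatch(version)
    ):
        findings.append(("WARN", f"version {version!r} is not 1-3 numeric segments"))
    return findings
```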

5. Unit-level

Rule	Tier	Source	NORMATIVE §
`globalId` required	Schema-enforced + Domain-validator-enforced	`unit.schema.json: /required[*]="globalId"`; `validate_course.py: validate_unit`	§4.4
`globalId` matches the RFC 4122 UUID pattern (any version; shape-only validation)	Schema-enforced + Domain-validator-enforced (WARN if non-UUID)	`unit.schema.json: /properties/globalId/pattern`; `validate_course.py: validate_unit` (via `is_valid_uuid`)	§4.4
`title` required, `minLength: 1`	Schema-enforced + Domain-validator-enforced	`unit.schema.json: /properties/title`, `/required[*]="title"`; `validate_course.py: validate_unit`	(none)
`description` defaults to `""`	Schema-declared (annotation; not enforced — see §13)	`unit.schema.json: /properties/description/default`	(none)
`tags[*]` `minLength: 1`	Schema-enforced	`unit.schema.json: /properties/tags/items/minLength`	(none)
`sequence >= 0`	Schema-enforced	`unit.schema.json: /properties/sequence/minimum`	(none — import uses array position, not `sequence`)
`sequence` duplicates/gaps within siblings	Domain-validator-enforced (WARN, advisory)	`validate_course.py: validate_sequence_order`	(none)
`objectiveIds[*]` reference `course.objectives[].id`	Domain-validator-enforced (WARN)	`validate_course.py` (objective-reference integrity check)	(none)
`lessons[]` array (default `[]` schema-declared)	Schema-enforced (type)	`unit.schema.json: /properties/lessons`	(none)
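
The sibling `sequence` warning could be approximated like this. What counts as a "gap" is an assumption of the sketch (the distinct values are not consecutive integers); the reference validator's `validate_sequence_order` may define it differently, and the function name here is hypothetical.

```python
from collections import Counter


def check_sequence_order(siblings):
    """Return WARN findings for duplicate or non-consecutive sequence values."""
    values = [s["sequence"] for s in siblings if isinstance(s.get("sequence"), int)]
    findings = []

    duplicates = sorted(v for v, n in Counter(values).items() if n > 1)
    if duplicates:
        findings.append(("WARN", f"duplicate sequence values: {duplicates}"))

    unique = sorted(set(values))
    # Assumption: a gap means the distinct values do not form a consecutive run.
    if unique and unique != list(range(unique[0], unique[0] + len(unique))):
        findings.append(("WARN", f"gaps in sequence values: {unique}"))
    return findings
```

Because import uses array position rather than `sequence`, findings of this kind stay at WARN and never fail the document.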

6. Lesson-level

Rule	Tier	Source	NORMATIVE §
`globalId` required + RFC 4122 UUID pattern (any version; shape-only validation)	Schema-enforced + Domain-validator-enforced (WARN)	`lesson.schema.json: /required`, `/properties/globalId/pattern`; `validate_course.py: validate_lesson`	§4.4
`title` required, `minLength: 1`	Schema-enforced + Domain-validator-enforced	`lesson.schema.json`; `validate_course.py: validate_lesson`	(none)
`items[]` is an array of `content`/`exercise`/`quiz`/`contentsequence`/`signpost` (`oneOf` dispatch)	Schema-enforced	`lesson.schema.json: /properties/items/items/oneOf`	(none)
`items` missing or empty — informational	Domain-validator-enforced (WARN)	`validate_course.py: validate_lesson` (“empty lesson”)	(none)
`sequence` duplicates/gaps within siblings (lesson item ordering)	Domain-validator-enforced (WARN, advisory)	`validate_course.py: validate_sequence_order`	(none)
`objectiveIds[*]` reference `course.objectives[].id`	Domain-validator-enforced (WARN)	`validate_course.py` (objective-reference integrity check)	(none)
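
The objective-reference integrity check that appears in the unit and lesson tables can be sketched as a single walk over the course. The function name and message wording are assumptions; the sketch covers only the `objectiveIds` arrays on units and lessons.

```python
def check_objective_references(course):
    """Return WARN findings for objectiveIds that name no declared objective."""
    declared = {o["id"] for o in course.get("objectives", []) if "id" in o}
    findings = []

    def check(ids, where):
        for ref in ids or []:
            if ref not in declared:
                findings.append(("WARN", f"{where}: objective id {ref!r} is not declared"))

    for unit in course.get("units", []):
        unit_label = "unit " + repr(unit.get("title", "?"))
        check(unit.get("objectiveIds"), unit_label)
        for lesson in unit.get("lessons", []):
            lesson_label = "lesson " + repr(lesson.get("title", "?"))
            check(lesson.get("objectiveIds"), lesson_label)
    return findings
```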

7. Item-level — common

Properties inherited by every item type via item-base.schema.json.

Rule	Tier	Source	NORMATIVE §
`type` required, enum: `content`, `exercise`, `quiz`, `contentsequence`, `signpost`	Schema-enforced + Domain-validator-enforced	`item-base.schema.json: /properties/type/enum`, `/required`; `validate_course.py: validate_item`	§4.2, §5.3
Non-canonical item-type casing (`Content`, `ExerciseItem`) rejected	Schema-enforced (via `enum`/`const`) + Domain-validator-enforced (WARN; tolerated via `normalize_item_type`)	`item-base.schema.json`; `validate_course.py: validate_item`	§4.2, §5.3
`globalId` required + RFC 4122 UUID pattern (any version; shape-only validation)	Schema-enforced + Domain-validator-enforced (WARN)	`item-base.schema.json: /required`, `/properties/globalId/pattern`; `validate_course.py: validate_item`	§4.4
`title` required, `minLength: 1`	Schema-enforced + Domain-validator-enforced (WARN if missing)	`item-base.schema.json`; `validate_course.py: validate_item`	(none)
`tags[*]` `minLength: 1`	Schema-enforced	`item-base.schema.json: /properties/tags/items/minLength`	(none)
`suggestedTime >= 0`	Schema-enforced	`item-base.schema.json: /properties/suggestedTime/minimum`	(none)
`isOptional` boolean, default `false` (schema-declared)	Schema-enforced (type)	`item-base.schema.json: /properties/isOptional`	(none)

7.1 ContentItem

Rule	Tier	Source	NORMATIVE §
`type` is `"content"`	Schema-enforced	`content-item.schema.json: /allOf/1/properties/type/const="content"`	§4.2
`html` required	Schema-enforced + Domain-validator-enforced	`content-item.schema.json: /allOf/1/required[*]="html"`; `validate_course.py: validate_item` (`normalized_type == "content"`)	(none)
Deprecated `body` property → use `html`	Domain-validator-enforced (WARN)	`validate_course.py: validate_item`	(none)
`html` content satisfies the HTML safety profile	Domain-validator-enforced (ERROR / WARN per `HTML_SAFETY.md` §8)	`validate_course.py: validate_html_content`	§11, `HTML_SAFETY.md` (see §12 below)

7.2 ExerciseItem

Rule	Tier	Source	NORMATIVE §
`type` is `"exercise"`	Schema-enforced	`exercise-item.schema.json: /allOf/1/properties/type/const="exercise"`	§4.2
`questions[]` required	Schema-enforced + Domain-validator-enforced	`exercise-item.schema.json: /allOf/1/required[*]="questions"`; `validate_course.py: validate_item`	(none)
`instructions` required (PascalCase `Instructions` triggers WARN)	Domain-validator-enforced (ERROR if missing, WARN on PascalCase)	`validate_course.py: validate_item`	(none)
`isGraded` boolean, default `false` (schema-declared)	Schema-enforced (type)	`exercise-item.schema.json: /allOf/1/properties/isGraded/default=false`	§4.3
`passMarkPercent` is a number `0 <= x <= 100`, default `70.0` (schema-declared)	Schema-enforced (type/range)	`exercise-item.schema.json: /allOf/1/properties/passMarkPercent/{minimum,maximum,default}`	(none — consumer-policy gated; see `ITEM_PATTERNS.md` §3)
`points >= 0`	Schema-enforced	`exercise-item.schema.json: /allOf/1/properties/points/minimum`	(none)
Producer/consumer MUST NOT infer grading state from item type alone	Advisory	`NORMATIVE.md` §4.3	§4.3

7.3 QuizItem

Rule	Tier	Source	NORMATIVE §
`type` is `"quiz"`	Schema-enforced	`quiz-item.schema.json: /allOf/1/properties/type/const="quiz"`	§4.2
`questions[]` required, `isGraded` required	Schema-enforced	`quiz-item.schema.json: /allOf/1/required[*]={"questions","isGraded"}`	(none)
`isGraded` boolean, default `true` (schema-declared)	Schema-enforced (type)	`quiz-item.schema.json: /allOf/1/properties/isGraded/default=true`	§4.3
`passMarkPercent` is a number `0 <= x <= 100`, default `70.0` (schema-declared)	Schema-enforced (type/range)	`quiz-item.schema.json: /allOf/1/properties/passMarkPercent`	(none — see `ITEM_PATTERNS.md` §3)
`points >= 0`	Schema-enforced	`quiz-item.schema.json: /allOf/1/properties/points/minimum`	(none)
`item.points` vs `sum(question.points)` mismatch is intentional weighting	Domain-validator-enforced (NOTE)	`validate_course.py` (weighted-points NOTE collection)	(none)
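
The weighted-points NOTE in the last row might be computed along these lines. Treating a missing or null question `points` as the schema-declared default of 1.0 is an assumption of this sketch, as is the function name.

```python
def check_weighted_points(item):
    """Return a NOTE when item.points differs from the sum of question points."""
    item_points = item.get("points")
    if item_points is None:
        return []  # nothing to compare against

    # Assumption: a missing or null question `points` counts as 1.0.
    total = sum(
        1.0 if q.get("points") is None else q["points"]
        for q in item.get("questions", [])
    )
    if abs(item_points - total) > 1e-9:
        message = f"item.points={item_points} differs from question total={total}"
        return [("NOTE", message)]
    return []
```

The result is informational only: a mismatch is read as intentional weighting, so the document remains valid.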

7.4 ContentSequenceItem

Rule	Tier	Source	NORMATIVE §
`type` is `"contentsequence"`	Schema-enforced	`content-sequence-item.schema.json: /allOf/1/properties/type/const`	§4.2
`contentItemId` required; value is a UUID	Schema-enforced (required) + Domain-validator-enforced (WARN if non-UUID)	`content-sequence-item.schema.json: /allOf/1/required`, `/properties/contentItemId/format="uuid"` (annotation — see §13); `validate_course.py: validate_item` (via `is_valid_uuid`)	(none)
`relatedItemIds[*]` are UUIDs	Domain-validator-enforced (WARN); schema declares `format: "uuid"` as annotation	`content-sequence-item.schema.json: /allOf/1/properties/relatedItemIds/items/format="uuid"` (annotation); `validate_course.py: validate_item` (via `is_valid_uuid`)	(none)
`relatedItemIds` non-empty array	Domain-validator-enforced (ERROR if missing or empty)	`validate_course.py: validate_item`	(none)
`layout` enum: `"Auto"`, `"Split"`, `"Vertical"`, default `"Auto"` (schema-declared)	Schema-enforced (enum) + Domain-validator-enforced (WARN if other)	`content-sequence-item.schema.json: /allOf/1/properties/layout/enum`; `validate_course.py: validate_item`	(none)
`contentItemId` resolves to a sibling `content` item declared earlier in the lesson	Domain-validator-enforced (ERROR)	`validate_course.py: validate_item` (CSI branch)	(none)
Each `relatedItemIds[*]` resolves to a sibling `exercise`/`quiz` item declared earlier in the lesson	Domain-validator-enforced (ERROR)	`validate_course.py: validate_item` (CSI branch)	(none)
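
The ordering-sensitive resolution rules in the last two rows lend themselves to a single forward pass over a lesson's items. The following is a sketch under that reading, with a hypothetical function name; it is not the CSI branch of the reference validator.

```python
def check_content_sequence(lesson_items):
    """Return ERROR findings for contentsequence items with unresolved references."""
    findings = []
    seen = {}  # globalId -> item type, for items declared so far in the lesson

    for item in lesson_items:
        if item.get("type") == "contentsequence":
            where = item.get("globalId", "?")
            related = item.get("relatedItemIds") or []
            if not related:
                findings.append(("ERROR", f"{where}: relatedItemIds is missing or empty"))
            # contentItemId must name a content item that appeared earlier.
            if seen.get(item.get("contentItemId")) != "content":
                findings.append(("ERROR", f"{where}: contentItemId does not resolve to an earlier content item"))
            # Each related id must name an earlier exercise or quiz item.
            for ref in related:
                if seen.get(ref) not in ("exercise", "quiz"):
                    findings.append(("ERROR", f"{where}: related id {ref!r} does not resolve to an earlier exercise/quiz"))
        if "globalId" in item:
            seen[item["globalId"]] = item.get("type")
    return findings
```

Recording each item only after it has been checked is what enforces "declared earlier": a contentsequence item placed before its targets fails even though the ids exist later in the lesson.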

7.5 SignpostItem

Rule	Tier	Source	NORMATIVE §
`type` is `"signpost"`	Schema-enforced	`signpost-item.schema.json: /allOf/1/properties/type/const`	§4.2
`signpostType` required, enum: `"intro"`, `"summary"`	Schema-enforced + Domain-validator-enforced (ERROR if missing or other)	`signpost-item.schema.json: /allOf/1/required`, `/properties/signpostType/enum`; `validate_course.py: validate_item`	(none)
`scope` required, enum: `"course"`, `"unit"`, `"lesson"`	Schema-enforced + Domain-validator-enforced (ERROR if missing or other)	`signpost-item.schema.json: /allOf/1/required`, `/properties/scope/enum`; `validate_course.py: validate_item`	(none)
Signpost items MUST NOT carry `questions`	Domain-validator-enforced (ERROR)	`validate_course.py: validate_item` (signpost branch)	(none)
`customHtml`, when present, satisfies the HTML safety profile	Domain-validator-enforced (ERROR / WARN per `HTML_SAFETY.md` §8)	`validate_course.py: validate_html_content`	§11, `HTML_SAFETY.md`
A signpost with no objectives (and no `customHtml`) renders an empty stub	Advisory	`ITEM_PATTERNS.md` §5	(none)

8. Question-level — common

Properties inherited by every question via question-base.schema.json. Required by NORMATIVE §4.4: every question MUST carry a globalId.

Rule	Tier	Source	NORMATIVE §
`type` required + enum (19 values: 12 implemented + 7 reserved)	Schema-enforced + Domain-validator-enforced	`question-base.schema.json: /required[*]="type"`, `/properties/type/enum`; `validate_course.py: validate_question`	§4.2, §5.3, §6.1
Non-canonical question-type casing (`MultipleChoice`, `simplegapfill`) rejected	Schema-enforced (via `enum`) + Domain-validator-enforced (ERROR for unknown discriminator)	`question-base.schema.json: /properties/type/enum`; `validate_course.py` (per-type question dispatch)	§4.2, §5.3
`globalId` required + RFC 4122 UUID pattern (any version; shape-only validation)	Schema-enforced + Domain-validator-enforced (WARN if non-UUID)	`question-base.schema.json: /required[*]="globalId"`, `/properties/globalId/pattern`; `validate_course.py: validate_question`	§4.4
`prompt` required, `minLength: 0` (may be empty); empty/whitespace `prompt` is an ERROR for the 4 real-content types (`trueFalseQuestion`, `multipleChoice`, `shortAnswer`, `essay`), valid (empty) for the 8 symbolic types, and unconstrained for the 7 reserved types (deferred to v1.1)	Schema-enforced (`required`, `minLength: 0`) + Domain-validator-enforced (ERROR on real-content empty; WARN if missing)	`question-base.schema.json: /required[*]="prompt"`, `/properties/prompt/minLength`; `validate_course.py: validate_question`	(none)
`points` is a non-negative number, MAY be null, default `1.0` (schema-declared)	Schema-enforced (type/range) + Domain-validator-enforced (WARN if missing)	`question-base.schema.json: /properties/points/{type,minimum,default}`; `validate_course.py: validate_question`	(none)
`difficulty` is a number `0.0 <= x <= 10.0`, default `5.0` (schema-declared)	Schema-enforced (type/range)	`question-base.schema.json: /properties/difficulty/{minimum,maximum,default}`	(none — author estimate; see `question-types-reference.md` Common Properties)
`tags[*]` strings	Schema-enforced	`question-base.schema.json: /properties/tags/items/type`	(none)
`hint` is string or null, default null	Schema-enforced	`question-base.schema.json: /properties/hint`	(none)
`feedback` is an object or null; `feedback.{correct,incorrect}` are strings; `feedback.choiceFeedback` is `{string: string}`	Schema-enforced	`question-base.schema.json: /properties/feedback`	(none)
Deprecated `questionType` (instead of `type`)	Domain-validator-enforced (ERROR)	`validate_course.py: validate_question`	(none — migration aid)
For non-question fields: producer MUST NOT embed HTML in plain-text fields	Advisory	`HTML_SAFETY.md` §1.1	§11
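
The type-dependent `prompt` rule is the densest row in this table, so a small sketch may help. The four real-content type names come from the row itself; the function name is hypothetical, and every other type is simply left unconstrained here, which covers both the symbolic and the reserved types.

```python
# The four real-content types named in the table above.
REAL_CONTENT_TYPES = {"trueFalseQuestion", "multipleChoice", "shortAnswer", "essay"}


def check_prompt(question):
    """Return (severity, message) findings for a question's prompt."""
    if "prompt" not in question:
        return [("WARN", "prompt is missing")]

    prompt = question["prompt"]
    qtype = question.get("type")
    is_blank = not isinstance(prompt, str) or not prompt.strip()
    if qtype in REAL_CONTENT_TYPES and is_blank:
        return [("ERROR", f"{qtype}: prompt must not be empty or whitespace")]

    # Other types may carry an empty prompt.
    return []
```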

9. Question-level — by type (12 implemented)

Properties enforced per question-type schema, plus per-type domain-validator rules. The 7 reserved types are covered in §10.

9.1 simpleGapFill

Rule	Tier	Source	NORMATIVE §
`type` is `"simpleGapFill"`	Schema-enforced	`simple-gap-fill.schema.json: /allOf/1/properties/type/const`	§4.2
`sentence` required, contains `@@@`, `minLength: 4`	Schema-enforced	`simple-gap-fill.schema.json: /allOf/1/required`, `/properties/sentence/{pattern,minLength}`	(none)
`acceptedAnswers` required, `minItems: 1`, each `minLength: 1`	Schema-enforced	`simple-gap-fill.schema.json: /allOf/1/required`, `/properties/acceptedAnswers/{minItems,items/minLength}`	(none)
`caseSensitive` boolean, default `false` (schema-declared)	Schema-enforced (type)	`simple-gap-fill.schema.json: /allOf/1/properties/caseSensitive`	(none)

9.2 trueFalseQuestion

Rule	Tier	Source	NORMATIVE §
`type` is `"trueFalseQuestion"`	Schema-enforced	`true-false-question.schema.json: /allOf/1/properties/type/const`	§4.2
`correctAnswer` required, boolean	Schema-enforced + Domain-validator-enforced (ERROR if missing both v1 and v2 forms, ERROR if non-boolean, WARN on boolean-ish coercion)	`true-false-question.schema.json: /allOf/1/required`, `/properties/correctAnswer/type`; `validate_course.py: validate_true_false_question`	(none)
Pre-1.0 TF shape (`options` / `optionsAndPoints`) deprecated	Domain-validator-enforced (WARN; multiple positive-points options WARN; zero positives WARN)	`validate_course.py: validate_true_false_question`	(none)
`displayStyle` enum: `"TrueFalse"`, `"CorrectIncorrect"`, `"CheckmarkX"`, default `"TrueFalse"` (schema-declared)	Schema-enforced (enum) + Domain-validator-enforced (WARN if other)	`true-false-question.schema.json: /allOf/1/properties/displayStyle/enum`; `validate_course.py: validate_true_false_question`	(none)
`penalizeIncorrect` boolean, default `false` (schema-declared)	Schema-enforced (type)	`true-false-question.schema.json: /allOf/1/properties/penalizeIncorrect`	(none)
`incorrectPenaltyPercent` is `0..100`, default `50.0` (schema-declared)	Schema-enforced (type/range) + Domain-validator-enforced (WARN if out of range)	`true-false-question.schema.json: /allOf/1/properties/incorrectPenaltyPercent/{minimum,maximum}`; `validate_course.py: validate_true_false_question`	(none)
`feedback.choiceFeedback` deprecated on TF	Domain-validator-enforced (WARN)	`validate_course.py: validate_true_false_question`	(none — TF v2 forbids the field; see `question-types-reference.md`)

9.3 multipleChoice

Rule	Tier	Source	NORMATIVE §
`type` is `"multipleChoice"`	Schema-enforced	`multiple-choice.schema.json: /allOf/1/properties/type/const`	§4.2
`options` required, `minItems: 2`, each `minLength: 1`	Schema-enforced	`multiple-choice.schema.json: /allOf/1/required`, `/properties/options/{minItems,items/minLength}`	(none)
`optionsAndPoints` required, `{string: number}` map	Schema-enforced	`multiple-choice.schema.json: /allOf/1/required`, `/properties/optionsAndPoints`	(none)
At least one `optionsAndPoints` value > 0 (an MCQ MUST have a correct answer)	Domain-validator-enforced (ERROR)	`validate_course.py: validate_multiple_choice`	(none)
`optionsAndPoints` keys cover every entry in `options`	Domain-validator-enforced (ERROR if missing; WARN if `optionsAndPoints` contains extras not in `options`)	`validate_course.py: validate_multiple_choice`	(none)
`allowMultipleCorrect` boolean, default `false` (schema-declared)	Schema-enforced (type)	`multiple-choice.schema.json: /allOf/1/properties/allowMultipleCorrect`	(none)
`allowPartialCredit` boolean, default `true` (schema-declared)	Schema-enforced (type)	`multiple-choice.schema.json: /allOf/1/properties/allowPartialCredit`	(none)
`penalizeIncorrect` boolean, default `false` (schema-declared)	Schema-enforced (type)	`multiple-choice.schema.json: /allOf/1/properties/penalizeIncorrect`	(none)
`showLetterLabels` boolean, default `false` (schema-declared)	Schema-enforced (type)	`multiple-choice.schema.json: /allOf/1/properties/showLetterLabels`	(none)
`shuffleOptions` governs per-question option randomization (per-question discretion)	Advisory	`NORMATIVE.md` §5.6 (multipleChoice is explicitly exempt from the §5.6 randomization MUST)	§5.6
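
The two cross-field rows above (a positive-points option exists; `optionsAndPoints` covers every entry in `options`) are checks that JSON Schema alone cannot express. A minimal sketch of how a domain validator might implement them, assuming a hypothetical helper `check_multiple_choice` that returns `(errors, warnings)`. The name, return shape, and messages are illustrative and are not taken from `validate_course.py`:

```python
def check_multiple_choice(question):
    """Return (errors, warnings) for the cross-field MCQ rules.

    Hypothetical helper: name, return shape, and messages are assumptions
    made for illustration, not the reference validator's API.
    """
    errors, warnings = [], []
    options = question.get("options", [])
    points = question.get("optionsAndPoints", {})

    # An MCQ must have at least one option worth more than zero points.
    if not any(value > 0 for value in points.values()):
        errors.append("no option in optionsAndPoints has points > 0")

    # Every entry in options needs a points entry (ERROR when missing).
    for option in options:
        if option not in points:
            errors.append(f"option {option!r} missing from optionsAndPoints")

    # Keys that match no option are only a warning.
    for key in points:
        if key not in options:
            warnings.append(f"optionsAndPoints key {key!r} not in options")

    return errors, warnings
```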

9.4 wordBankCloze

Rule	Tier	Source	NORMATIVE §
`type` is `"wordBankCloze"`	Schema-enforced	`word-bank-cloze.schema.json: /allOf/1/properties/type/const`	§4.2
`passage` required, matches `@@@\d+`, `minLength: 4`	Schema-enforced	`word-bank-cloze.schema.json: /allOf/1/required`, `/properties/passage/{pattern,minLength}`	(none)
`wordBank` required, `minItems: 1`, each `minLength: 1`	Schema-enforced	`word-bank-cloze.schema.json: /allOf/1/required`, `/properties/wordBank`	(none)
`gapAcceptedAnswers` required, `{"^[0-9]+$": [string]}`, each gap `minItems: 1`, each accepted answer `minLength: 1`	Schema-enforced	`word-bank-cloze.schema.json: /allOf/1/required`, `/properties/gapAcceptedAnswers/patternProperties`	(none)
`passage` `@@@N` marker set MUST equal `gapAcceptedAnswers` key set	Domain-validator-enforced (ERROR)	`validate_course.py: validate_word_bank_cloze` (cloze gap-consistency check)	(none)
`@@@N` marker numbers SHOULD be sequential starting at 1	Domain-validator-enforced (WARN)	`validate_course.py: validate_word_bank_cloze` (cloze gap-consistency check)	(none)
`allowWordReuse` boolean, default `false` (schema-declared)	Schema-enforced (type)	`word-bank-cloze.schema.json: /allOf/1/properties/allowWordReuse`	(none)
`bankPosition` enum: `"above"`, `"below"`, `"side"`	Schema-enforced	`word-bank-cloze.schema.json: /allOf/1/properties/bankPosition/enum`	(none)
`gapCaseSensitive` / `gapFeedback` value types	Schema-enforced	`word-bank-cloze.schema.json: /allOf/1/properties/gapCaseSensitive`, `gapFeedback`	(none)
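
The cloze gap-consistency check cited above recurs in §9.5 and §9.6 as well. A minimal sketch of the idea, assuming a hypothetical helper `check_gap_consistency` and assuming markers and keys are compared numerically (the reference validator may compare them as strings instead):

```python
import re

MARKER = re.compile(r"@@@([0-9]+)")


def check_gap_consistency(passage, gap_keys):
    """Compare @@@N markers in a passage with the keys of a per-gap map.

    Hypothetical helper; gap_keys can be any iterable of digit strings,
    for example the gapAcceptedAnswers dict itself.
    """
    errors, warnings = [], []
    markers = {int(n) for n in MARKER.findall(passage)}
    keys = {int(k) for k in gap_keys}

    # ERROR tier: the two sets must be identical.
    if markers != keys:
        errors.append(
            f"markers {sorted(markers)} do not match gap keys {sorted(keys)}"
        )

    # WARN tier: numbering should run 1, 2, 3, ... without holes.
    if sorted(markers) != list(range(1, len(markers) + 1)):
        warnings.append("marker numbers are not sequential starting at 1")

    return errors, warnings
```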

9.5 multiGapCloze

Rule	Tier	Source	NORMATIVE §
`type` is `"multiGapCloze"`	Schema-enforced	`multi-gap-cloze.schema.json: /allOf/1/properties/type/const`	§4.2
`passage` required, matches `@@@\d+`, `minLength: 4`	Schema-enforced	`multi-gap-cloze.schema.json: /allOf/1/required`, `/properties/passage`	(none)
`gapAcceptedAnswers` required	Schema-enforced	`multi-gap-cloze.schema.json: /allOf/1/required`	(none)
Each accepted answer MUST NOT contain `,` or `:` (scoring-engine wire format)	Schema-enforced + Domain-validator-enforced (ERROR)	`multi-gap-cloze.schema.json: /allOf/1/properties/gapAcceptedAnswers/patternProperties/.../items/not/pattern="[,:]"`; `validate_course.py: validate_multi_gap_cloze`	(none — wire-format consequence)
Other punctuation in answers SHOULD be limited to apostrophes and hyphens	Domain-validator-enforced (WARN)	`validate_course.py: validate_multi_gap_cloze`	(none)
`passage` `@@@N` marker set MUST equal `gapAcceptedAnswers` key set	Domain-validator-enforced (ERROR)	`validate_course.py: validate_multi_gap_cloze` (cloze gap-consistency check)	(none)
`@@@N` marker numbers SHOULD be sequential starting at 1	Domain-validator-enforced (WARN)	`validate_course.py: validate_multi_gap_cloze` (cloze gap-consistency check)	(none)
`allowPartialCredit` boolean, default `true` (schema-declared)	Schema-enforced (type)	`multi-gap-cloze.schema.json: /allOf/1/properties/allowPartialCredit`	(none)
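
The answer-punctuation rows above split into an ERROR tier (`,` and `:`) and a WARN tier (other punctuation). A rough sketch, assuming a hypothetical helper `check_gap_answers`. The `SUSPECT` pattern is a guess at what "other punctuation" means; the reference validator's actual definition is not given in this section:

```python
import re

# Same character class as the schema's not/pattern constraint.
FORBIDDEN = re.compile(r"[,:]")
# Assumption: anything that is not a word character, whitespace,
# apostrophe (straight or curly), or hyphen counts as "other punctuation".
SUSPECT = re.compile(r"[^\w\s'’\-]")


def check_gap_answers(gap_accepted_answers):
    """Return (errors, warnings) for multiGapCloze accepted answers."""
    errors, warnings = [], []
    for gap, answers in gap_accepted_answers.items():
        for answer in answers:
            if FORBIDDEN.search(answer):
                errors.append(f"gap {gap}: {answer!r} contains ',' or ':'")
            elif SUSPECT.search(answer):
                warnings.append(f"gap {gap}: {answer!r} has unusual punctuation")
    return errors, warnings
```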

9.6 multipleChoiceCloze

Rule	Tier	Source	NORMATIVE §
`type` is `"multipleChoiceCloze"`	Schema-enforced	`multiple-choice-cloze.schema.json: /allOf/1/properties/type/const`	§4.2
`passage`, `gapOptions`, `correctAnswers` required	Schema-enforced	`multiple-choice-cloze.schema.json: /allOf/1/required`	(none)
Each gap’s `gapOptions` has `minItems: 2`	Schema-enforced	`multiple-choice-cloze.schema.json: /allOf/1/properties/gapOptions/patternProperties/.../minItems`	(none)
`correctAnswers[N]` is a non-negative integer	Schema-enforced	`multiple-choice-cloze.schema.json: /allOf/1/properties/correctAnswers/patternProperties/.../{type,minimum}`	(none)
`correctAnswers[N]` index in bounds of `gapOptions[N]`	Domain-validator-enforced (ERROR)	`validate_course.py: validate_multiple_choice_cloze`	(none)
`passage` `@@@N` marker set MUST equal `gapOptions` key set	Domain-validator-enforced (ERROR)	`validate_course.py: validate_multiple_choice_cloze` (cloze gap-consistency check)	(none)
`gapOptions` key set MUST equal `correctAnswers` key set	Domain-validator-enforced (ERROR)	`validate_course.py: validate_multiple_choice_cloze`	(none)
`@@@N` marker numbers SHOULD be sequential starting at 1	Domain-validator-enforced (WARN)	`validate_course.py: validate_multiple_choice_cloze` (cloze gap-consistency check)	(none)
`shuffleOptions` boolean, default `false` (schema-declared)	Schema-enforced (type)	`multiple-choice-cloze.schema.json: /allOf/1/properties/shuffleOptions`	(none)
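
The bounds and key-set rows above can be sketched as one pass over `correctAnswers`. This assumes zero-based indexing, which the non-negative-integer constraint suggests but this section does not state outright; `check_correct_answer_indices` is a hypothetical name:

```python
def check_correct_answer_indices(gap_options, correct_answers):
    """Return ERROR-tier messages for multipleChoiceCloze cross-field rules.

    Hypothetical helper. Assumes correctAnswers[N] is a zero-based index
    into gapOptions[N].
    """
    errors = []

    # gapOptions and correctAnswers must describe the same gaps.
    if set(gap_options) != set(correct_answers):
        errors.append("gapOptions and correctAnswers have different gap keys")

    # Each index must point at an existing option for its gap.
    for gap, index in correct_answers.items():
        options = gap_options.get(gap)
        if options is not None and not 0 <= index < len(options):
            errors.append(
                f"gap {gap}: index {index} out of bounds for {len(options)} options"
            )
    return errors
```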

9.7 shortAnswer

Rule	Tier	Source	NORMATIVE §
`type` is `"shortAnswer"`	Schema-enforced	`short-answer.schema.json: /allOf/1/properties/type/const`	§4.2
`acceptedAnswers` required, `minItems: 1`, each `minLength: 1`	Schema-enforced	`short-answer.schema.json: /allOf/1/required`, `/properties/acceptedAnswers`	(none)
`acceptedAnswers[0]` is the canonical form for display	Advisory (`question-types-reference.md` §7)	(no enforcement)	(none)
`caseSensitive` boolean, default `false` (schema-declared)	Schema-enforced (type)	`short-answer.schema.json: /allOf/1/properties/caseSensitive`	(none)

9.8 essay

Rule	Tier	Source	NORMATIVE §
`type` is `"essay"`	Schema-enforced	`essay.schema.json: /allOf/1/properties/type/const`	§4.2
`expectedAnswer` required (string, may be empty)	Schema-enforced	`essay.schema.json: /allOf/1/required`	(none)
`expectedLines`, `minWords`, `maxWords` are integers `>= 0` (0 = no limit)	Schema-enforced	`essay.schema.json: /allOf/1/properties/{expectedLines,minWords,maxWords}`	(none)
`maxWords >= minWords` when both > 0	Domain-validator-enforced (WARN)	`validate_course.py: validate_essay`	(none)
`rubricText` is Markdown when present	Advisory (`question-types-reference.md` §8)	(none)	(none)

9.9 sentenceTransformation

Rule	Tier	Source	NORMATIVE §
`type` is `"sentenceTransformation"`	Schema-enforced	`sentence-transformation.schema.json: /allOf/1/properties/type/const`	§4.2
`promptSentence`, `keyword`, `targetSentence`, `acceptedChunks` required	Schema-enforced + Domain-validator-enforced (ERROR if missing)	`sentence-transformation.schema.json: /allOf/1/required`; `validate_course.py: validate_sentence_transformation`	(none)
`targetSentence` contains exactly one `@@@` placeholder (`minLength: 4`); multiple `@@@` markers are non-conforming because SentenceTransformation chunks are sequential answer pieces typed at that single position, not separate gaps	Schema-enforced (pattern requires at least one `@@@`; `minLength`) + Domain-validator-enforced (ERROR if more than one `@@@`; WARN if zero)	`sentence-transformation.schema.json: /allOf/1/properties/targetSentence/{pattern,minLength}`; `validate_course.py: validate_sentence_transformation`	(none)
`acceptedChunks` keys are `^[0-9]+$`, each value `minItems: 1`, each chunk `minLength: 1`	Schema-enforced	`sentence-transformation.schema.json: /allOf/1/properties/acceptedChunks/patternProperties`	(none)
Chunk numbers SHOULD be sequential starting at 1	Domain-validator-enforced (WARN)	`validate_course.py: validate_sentence_transformation`	(none)
`keyword` SHOULD be uppercase	Domain-validator-enforced (WARN)	`validate_course.py: validate_sentence_transformation`	(none)
Deprecated PascalCase chunks/keyword fields (`AcceptedChunks`, `Keyword`, …) → camelCase	Domain-validator-enforced (WARN)	`validate_course.py: validate_sentence_transformation` (`deprecated_props` map)	(none)
`allOrNothing` boolean, default `false` (schema-declared); `chunkCaseSensitive` / `chunkFeedback` typed maps	Schema-enforced (types) + Domain-validator-enforced (WARN if not boolean / not dict)	`sentence-transformation.schema.json`; `validate_course.py: validate_sentence_transformation`	(none)

9.10 matching

matching carries an if/then/else branch in the schema, keyed off matchingMode.

Rule	Tier	Source	NORMATIVE §
`type` is `"matching"`	Schema-enforced	`matching.schema.json: /allOf/1/properties/type/const`	§4.2
`matchingMode` required, enum: `"pairs"`, `"classification"`	Schema-enforced	`matching.schema.json: /allOf/1/required`, `/properties/matchingMode/enum`	(none)
`pairs` mode: `pairs[]` required, `minItems: 2`, each `{item,match}` required, `additionalProperties: false`	Schema-enforced	`matching.schema.json: /allOf/1/then/{required,properties/pairs}`	§7.1 (closed object disallows `x-` extensions inside)
`pairs` mode: `categories` MUST NOT be present	Schema-enforced	`matching.schema.json: /allOf/1/then/not/required[*]="categories"`	(none)
`classification` mode: `categories[]` required, `minItems: 2`, each `{label,items}` required (`items.minItems: 1`), `additionalProperties: false`	Schema-enforced	`matching.schema.json: /allOf/1/else/{required,properties/categories}`	§7.1
`classification` mode: `pairs` MUST NOT be present	Schema-enforced	`matching.schema.json: /allOf/1/else/not/required[*]="pairs"`	(none)
`distractors[*]` non-empty strings	Schema-enforced	`matching.schema.json: /allOf/1/properties/distractors/items/minLength`	(none)
`allowPartialCredit` boolean, default `true` (schema-declared)	Schema-enforced (type)	`matching.schema.json: /allOf/1/properties/allowPartialCredit`	(none)
Consumers MUST randomize the choice pool (matches + distractors)	Advisory + runtime obligation	`NORMATIVE.md` §5.6	§5.6
Consumers MUST randomize row order in `classification` mode	Advisory + runtime obligation	`NORMATIVE.md` §5.6	§5.6
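
For implementers without a JSON Schema library that supports `if`/`then`/`else`, the mode branch above can be mirrored in plain code. A minimal sketch, assuming a hypothetical helper `check_matching_mode`; it covers only the required/forbidden field pairing and the `minItems: 2` floor, not the per-entry shape:

```python
def check_matching_mode(question):
    """Mirror the matchingMode branch: one field required, the other forbidden.

    Hypothetical helper, not part of the reference validator.
    """
    mode = question.get("matchingMode")
    if mode == "pairs":
        required, forbidden = "pairs", "categories"
    elif mode == "classification":
        required, forbidden = "categories", "pairs"
    else:
        return [f"matchingMode must be 'pairs' or 'classification', got {mode!r}"]

    errors = []
    if len(question.get(required, [])) < 2:
        errors.append(f"{mode} mode requires at least 2 entries in {required}")
    if forbidden in question:
        errors.append(f"{forbidden} must not be present in {mode} mode")
    return errors
```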

9.11 ordering

Rule	Tier	Source	NORMATIVE §
`type` is `"ordering"`	Schema-enforced	`ordering.schema.json: /allOf/1/properties/type/const`	§4.2
`sourceText` required, `minLength: 1`	Schema-enforced	`ordering.schema.json: /allOf/1/required`, `/properties/sourceText/minLength`	(none)
`items` required, `minItems: 2`, each `minLength: 1`	Schema-enforced	`ordering.schema.json: /allOf/1/required`, `/properties/items`	(none)
`distractors[*]` non-empty strings, default `[]` (schema-declared)	Schema-enforced (item type)	`ordering.schema.json: /allOf/1/properties/distractors`	(none)
`scoringMode` enum: `"strict"`, `"kendall"` (when present)	Schema-enforced	`ordering.schema.json: /allOf/1/properties/scoringMode/enum`	(none)
`scoringMode` default: `"strict"` for `orderingUnit:"word"`, `"kendall"` for `"sentence"`/`"paragraph"`	Advisory (description prose; no JSON Schema literal `default`)	`ordering.schema.json: /allOf/1/properties/scoringMode/description`	(none)
`orderingUnit` enum: `"word"`, `"sentence"`, `"paragraph"`, default `"word"` (schema-declared; advisory display hint)	Schema-enforced (enum)	`ordering.schema.json: /allOf/1/properties/orderingUnit/{enum,default}`	(none)
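
Because the `scoringMode` default lives in description prose rather than a schema literal, a consumer has to resolve it in code. A small sketch of that resolution, assuming a hypothetical helper `effective_scoring_mode`:

```python
def effective_scoring_mode(question):
    """Resolve scoringMode for an ordering question.

    Hypothetical helper. An explicit scoringMode wins; otherwise the
    default depends on orderingUnit, which itself defaults to "word".
    """
    explicit = question.get("scoringMode")
    if explicit is not None:
        return explicit
    unit = question.get("orderingUnit", "word")
    return "strict" if unit == "word" else "kendall"
```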

9.12 placement

Rule	Tier	Source	NORMATIVE §
`type` is `"placement"`	Schema-enforced	`placement.schema.json: /allOf/1/properties/type/const`	§4.2
`placementUnit`, `passage`, `placements` required	Schema-enforced	`placement.schema.json: /allOf/1/required`	(none)
`placementUnit` enum: `"sentence"`, `"paragraph"`, `"sectionLabel"`, default `"sentence"` (schema-declared)	Schema-enforced (enum)	`placement.schema.json: /allOf/1/properties/placementUnit/enum`	(none)
`passage` `minLength: 1`, MUST contain at least one `@@@N` marker	Schema-enforced	`placement.schema.json: /allOf/1/properties/passage/{minLength,pattern}`	(none)
`placements` `minItems: 1`, each `{gap >= 1, item.minLength >= 1}`, `additionalProperties: false`	Schema-enforced	`placement.schema.json: /allOf/1/properties/placements/items`	§7.1 (closed)
Every `placements[*].gap` references a `@@@N` marker present in `passage`	Domain-validator-enforced (ERROR)	`validate_course.py: validate_placement`	(none)
No duplicate `gap` values within `placements[]`	Domain-validator-enforced (ERROR)	`validate_course.py: validate_placement`	(none)
`@@@N` markers SHOULD be sequential starting at 1	Domain-validator-enforced (WARN)	`validate_course.py: validate_placement`	(none)
`placementUnit: "paragraph"` — marker SHOULD stand alone in its paragraph	Domain-validator-enforced (WARN)	`validate_course.py: validate_placement`	(none)
`placementUnit: "sectionLabel"` — marker SHOULD be at the start of a paragraph followed by a space	Domain-validator-enforced (WARN)	`validate_course.py: validate_placement`	(none)
Extra `@@@N` markers without a `placements[]` entry are valid decoy gaps (TOEFL Sentence Insertion variant)	Advisory	`placement.schema.json: /allOf/1/properties/placements/description`, `question-types-reference.md` §11	(none)
`distractors[*]` non-empty strings	Schema-enforced	`placement.schema.json: /allOf/1/properties/distractors/items/minLength`	(none)
Consumers MUST randomize the choice pool (placements items + distractors)	Advisory + runtime obligation	`NORMATIVE.md` §5.6	§5.6
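
The two ERROR-tier rows and the decoy-gap row above interact: a marker without a `placements[]` entry is valid, while a `placements[]` entry without a marker is not. A minimal sketch, assuming a hypothetical helper `check_placements` that returns the errors together with the decoy gap numbers:

```python
import re


def check_placements(passage, placements):
    """Return (errors, decoys) for a placement question.

    Hypothetical helper. decoys lists @@@N markers that no placements[]
    entry references; these are valid decoy gaps, not errors.
    """
    errors = []
    markers = {int(n) for n in re.findall(r"@@@([0-9]+)", passage)}
    seen = set()
    for entry in placements:
        gap = entry["gap"]
        if gap not in markers:
            errors.append(f"gap {gap} has no @@@{gap} marker in passage")
        if gap in seen:
            errors.append(f"gap {gap} appears more than once in placements")
        seen.add(gap)
    return errors, sorted(markers - seen)
```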

10. Reserved and unknown types

The 7 reserved question types — association, hotspot, graphicGapMatch, graphicAssociate, graphicOrder, fileUpload, mediaPromptedEssay — are declared in question-base.schema.json’s enum but have no per-type schemas in 1.0. Their handling is normative under NORMATIVE.md §6 (the fallback contract):

Rule	Tier	Source	NORMATIVE §
Reserved-type discriminator MUST be accepted by consumers	Schema-enforced	`question-base.schema.json: /properties/type/enum`	§5.5, §6.1
Reserved-type question MUST satisfy `question-base.schema.json` (`type`, `globalId`, `prompt` required; `points` validated against the base type/range when present, defaulting to `1.0` schema-declared)	Schema-enforced (required fields + type/range on `points`)	`question-base.schema.json: /required=["type","globalId","prompt"]`, `/properties/points/{type,minimum,default}`; validator dispatches reserved types to `question-base.schema.json`	§6.3
Consumer MUST preserve every member of reserved-type question objects across read/write cycles (semantic preservation; key order is producer-discretion per §6.2)	Advisory (round-trip preservation — runtime obligation, not document-validity)	`NORMATIVE.md` §6.2, §6.4	§6.2, §6.4
Consumer MUST NOT silently drop reserved-type questions from `questions[]`	Advisory	`NORMATIVE.md` §6.2	§6.2
Consumer MUST treat reserved-type earned points as 0 (max still counts)	Advisory (runtime obligation)	`NORMATIVE.md` §6.2	§6.2
Consumer MUST report the unsupported question to the user at import (UI banner / log / returned warning)	Advisory	`NORMATIVE.md` §6.2	§6.2
Consumer SHOULD render a non-interactive placeholder naming the type	Advisory	`NORMATIVE.md` §6.2, `ACCESSIBILITY.md` §7	§6.2, §12
Consumer SHOULD disable navigation gating for unsupported questions	Advisory	`NORMATIVE.md` §6.2	§6.2
Producer SHOULD NOT emit reserved types in cross-implementation distribution	Advisory	`NORMATIVE.md` §6.3	§6.3
Producer SHOULD use the published reserved name exactly (`hotspot`, not `Hotspot`); SHOULD document tool-specific extensions in IMPLEMENTATIONS.md / README	Advisory	`NORMATIVE.md` §6.5	§6.5
Producer: emitting a discriminator value not listed in `question-base.schema.json`’s `enum` is non-conforming at 1.0 — the schema rejects it; the reference validator surfaces it with a friendlier message naming the allowed values	Schema-enforced + Domain-validator-enforced (ERROR)	`question-base.schema.json: /properties/type/enum`; `validate_course.py` (per-type question dispatch)	§6.1
Consumer (1.0-only) reading a 1.x+ document with a type discriminator unknown to the consumer: apply the §6 fallback (preserve in full, treat earned points as 0, render placeholder, report to user) — do NOT reject the document	Advisory (runtime / forward-compat obligation, not document-validity)	`NORMATIVE.md` §6.1, §6.2, §6.4	§6.1, §6.2

The two rows above are not in conflict: the producer-validity row describes the 1.0 strict-validator behavior on a document whose type enum is exhausted at 1.0 (the schema and the reference validator agree it’s a malformed 1.0 document). The consumer-import row describes the runtime obligation a 1.0-only consumer carries when it ingests a 1.x+ document whose newer type it does not recognize — there, NORMATIVE §6 binds the consumer to graceful fallback rather than rejection. A 1.0 consumer cannot validate a 1.x+ document under a 1.0 schema and therefore SHOULD NOT use schema validation as the ingest gate when reading future-minor content; consumer-side ingest is governed by §6, not by question-base.schema.json.
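
The consumer-side fallback described above can be sketched as two small steps: import without dropping anything, then score with unsupported questions earning 0. The function names, the wrapper dict shape, and the warning text are all hypothetical; only the obligations themselves (preserve, report, score as 0, count the max) come from the rows above:

```python
def import_questions(questions, supported_types):
    """Keep every question; flag the ones this consumer cannot deliver."""
    imported, warnings = [], []
    for question in questions:
        supported = question["type"] in supported_types
        if not supported:
            # Report to the user; a banner or log entry would work equally well.
            warnings.append(
                f"unsupported question type {question['type']!r} "
                f"({question['globalId']})"
            )
        # Store the original object untouched so every member round-trips.
        imported.append({"document": question, "supported": supported})
    return imported, warnings


def score(imported, earned_by_id):
    """Sum earned and maximum points; unsupported questions earn 0."""
    earned = maximum = 0.0
    for entry in imported:
        question = entry["document"]
        maximum += question.get("points", 1.0)  # schema-declared default
        if entry["supported"]:
            earned += earned_by_id.get(question["globalId"], 0.0)
    return earned, maximum
```

A reserved-type or unknown-type question therefore contributes to the maximum but never to the earned total.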

11. Question Sets (flat artifact)

A documentType: "questionSet" document is a flat questions list with no course hierarchy. Required root fields apply (see §3) plus:

Rule	Tier	Source	NORMATIVE §
`title` required, `minLength: 1`	Schema-enforced	`question-set.schema.json: /required[*]="title"`, `/properties/title/minLength`	§3.2
`language` required at root	Schema-enforced	`question-set.schema.json: /required[*]="language"`	§12.1
`questions[]` required (may be empty)	Schema-enforced	`question-set.schema.json: /required[*]="questions"`	(none)
`sourceQuestionSetId`, when present, matches the RFC 4122 UUID pattern (any version; shape-only validation)	Schema-enforced	`question-set.schema.json: /properties/sourceQuestionSetId/pattern`	(none)
`version` matches `^[0-9]+(\.[0-9]+){0,2}$`, default `"1.0"` (schema-declared)	Schema-enforced (pattern)	`question-set.schema.json: /properties/version/pattern`	(none)
Each `questions[*]` validates against its per-type schema (per-question dispatch)	Domain-validator-enforced (ERROR on schema failure or unknown discriminator)	`validate_course.py` (per-type question dispatch), `validate_question_set_flat`	§5.1, §5.3
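
Implementers re-checking the `version` pattern outside a JSON Schema library should note one dialect difference. JSON Schema patterns follow ECMA-262 semantics, where `$` matches only at the end of input, whereas Python's `$` also matches just before a trailing newline. A small sketch using `fullmatch` to avoid the discrepancy (`is_valid_version` is a hypothetical name):

```python
import re

# Pattern copied from the version row above, minus the anchors,
# because fullmatch anchors both ends.
VERSION = re.compile(r"[0-9]+(\.[0-9]+){0,2}")


def is_valid_version(value):
    """Shape check for questionSet.version: one to three dotted integers."""
    return isinstance(value, str) and VERSION.fullmatch(value) is not None
```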

12. Cross-cutting

12.1 HTML safety profile

HTML appears in two fields: ContentItem.html and SignpostItem.customHtml. The full normative profile is in HTML_SAFETY.md. The reference validator’s HTML checks live in validate_course.py: validate_html_content and mirror that profile.

Surface	Severity (`HTML_SAFETY.md` §8)	Validator function
Forbidden elements (`<script>`, `<iframe>`, `<form>`, `<input>`, `<button>`, `<style>`, `<link>`, `<meta>`, `<base>`, `<svg>`, `<math>`, etc.)	ERROR — consumer MUST reject	`validate_html_content` (HTML_FORBIDDEN_TAGS)
Event-handler attributes (`onclick`, `onload`, `onerror`, …)	ERROR	`validate_html_content` (`attr_name.startswith("on")`)
Form-submission attributes (`srcdoc`, `formaction`, …)	ERROR	`validate_html_content`
`javascript:` or `vbscript:` URL in any URL-bearing attribute	ERROR	`validate_html_content`
`expression(...)` / `javascript:` inside `style` CSS value	ERROR	`validate_html_content`
`data:` URL in any URL-bearing attribute (including `<img src>`)	WARN	`validate_html_content`
Other forbidden URL schemes (`blob:`, `file:`, `chrome:`, `ftp:`, `ws:`, `gopher:`, `view-source:`)	WARN	`validate_html_content`
`tel:` URL (consumer-policy gated; see `ITEM_PATTERNS.md` §3)	WARN	`validate_html_content`
Unknown element (not in `HTML_ALLOWED_TAGS`, not in forbidden list)	WARN — strip while preserving text	`validate_html_content`
Unknown attribute on an allowed element (outside §3 allowlist)	WARN — strip the attribute	`validate_html_content`
CSS property outside the `HTML_SAFETY.md` §3.4 allowlist	WARN — strip the property	`validate_html_content`
`<a target="_blank">` without `rel="noopener noreferrer"`	WARN — consumer MUST normalize	`validate_html_content`
`<img>` without `alt` attribute	WARN — empty `alt=""` permitted for decorative images	`validate_html_content`
`<video>`/`<audio>` with `autoplay` or `loop`	WARN — producer MUST NOT emit; consumer SHOULD ignore	`validate_html_content`

A conforming consumer MUST sanitize HTML before render regardless of producer claims (HTML_SAFETY.md §5).
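
To make the severity table concrete, the sketch below scans an HTML fragment for a handful of the surfaces listed above using only the standard library. It is an illustration, not a sanitizer and not a copy of `validate_html_content`: the class and function names are hypothetical, the forbidden-tag set is limited to the elements named in the table, and the `javascript:` check is applied to every attribute rather than only URL-bearing ones.

```python
from html.parser import HTMLParser

# Only the elements named in the table above; the full list is in HTML_SAFETY.md.
FORBIDDEN_TAGS = {"script", "iframe", "form", "input", "button", "style",
                  "link", "meta", "base", "svg", "math"}


class HtmlScan(HTMLParser):
    """Collect ERROR- and WARN-tier findings while parsing a fragment."""

    def __init__(self):
        super().__init__()
        self.errors, self.warnings = [], []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before this call.
        if tag in FORBIDDEN_TAGS:
            self.errors.append(f"forbidden element <{tag}>")
        for name, value in attrs:
            if name.startswith("on"):
                self.errors.append(f"event-handler attribute {name} on <{tag}>")
            scheme = (value or "").strip().lower()
            if scheme.startswith(("javascript:", "vbscript:")):
                self.errors.append(f"script URL in {name} on <{tag}>")
        if tag == "img" and "alt" not in {name for name, _ in attrs}:
            self.warnings.append("<img> without alt attribute")


def scan_html(html):
    """Return (errors, warnings) for one HTML string."""
    scanner = HtmlScan()
    scanner.feed(html)
    scanner.close()
    return scanner.errors, scanner.warnings
```

A scan like this only reports; the MUST-sanitize obligation above still requires the consumer to strip or reject the offending markup before render.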

12.2 Accessibility preservation

ACCESSIBILITY.md defines two layers: a base-conformance preservation floor that binds every consumer, and an opt-in Accessibility Profile claim that binds delivery.

Round-trip preservation (base conformance, NORMATIVE.md §12.1):

Rule	Tier	Source	NORMATIVE §
`alt` on `<img>` MUST round-trip	Advisory (runtime / round-trip obligation)	`NORMATIVE.md` §12.1	§12.1
`<track>` elements (incl. `kind`, `src`, `srclang`, `label`, `default`) MUST round-trip on `<video>`/`<audio>`	Advisory	`NORMATIVE.md` §12.1	§12.1
`lang` and `dir` attributes on HTML-bearing elements MUST round-trip	Advisory	`NORMATIVE.md` §12.1	§12.1
Document-root `language` MUST round-trip	Schema-enforced (required) + runtime preservation obligation	`course.schema.json: /required[*]="language"`; `NORMATIVE.md` §12.1	§12.1
Document-root `supportLanguage` MUST round-trip when present (including explicit `null`)	Advisory	`NORMATIVE.md` §12.1	§12.1
Reserved-type questions MUST round-trip with any accessibility metadata they carry	Advisory	`NORMATIVE.md` §6.4, §12.1	§6.4, §12.1
Extension-preserving consumers (§7.4) SHOULD round-trip `x-`-namespaced extension members carrying accessibility data	Advisory	`NORMATIVE.md` §12.1	§12.1

Opt-in Accessibility Profile delivery obligations (binding only when claimed): see ACCESSIBILITY.md §§2–8. Not duplicated here.

Validator severity for accessibility issues at the current baseline (ACCESSIBILITY.md §8):

Issue	Severity	Validator function
Missing `alt` on `<img>`	WARN	`validate_html_content`
`<video>` without `<track kind="captions"|"subtitles">`	WARN (current baseline; promotion to ERROR under the `--accessibility` flag is targeted for 1.0 final)	`validate_html_content` (post-pass scan for `<video>…</video>` blocks)
`<iframe>`, `<script>`, event handlers	ERROR	`validate_html_content`
Missing `language` at document root	ERROR (schema-enforced)	`course.schema.json` / `question-set.schema.json` `required`
Reserved-type question without `title`	NOTE	(advisory; not currently surfaced)

12.3 Randomization requirements

NORMATIVE.md §5.6 binds two surfaces. These are consumer rendering obligations, not document-validity rules — a document is conforming whether or not consumers randomize it. Listed here so implementers know what they MUST do at render time:

Choice pool for matching (pairs/classification) and placement MUST be presented in randomized order.
Row order in matching classification mode MUST be randomized.
The randomization algorithm and any seeding strategy are consumer-defined.
Exemptions: multipleChoice (per-question shuffleOptions instead), matching pairs rows, ordering source tiles.
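
Since the algorithm and seeding are consumer-defined, the following is only one possible shape. It builds and shuffles the choice pool for `placement` and for `matching` in `pairs` mode; classification mode is left out because this section does not spell out how its pool is composed. `build_choice_pool` and its `rng` parameter are hypothetical:

```python
import random


def build_choice_pool(question, rng=None):
    """Return the shuffled choice pool (correct entries plus distractors).

    Hypothetical helper. Pass a seeded random.Random for reproducible
    order, for example to keep one learner's layout stable across reloads.
    """
    if rng is None:
        rng = random.Random()

    if question["type"] == "placement":
        pool = [entry["item"] for entry in question["placements"]]
    elif question["type"] == "matching" and question.get("matchingMode") == "pairs":
        pool = [pair["match"] for pair in question["pairs"]]
    else:
        raise ValueError("pool composition not covered by this sketch")

    pool.extend(question.get("distractors", []))
    rng.shuffle(pool)
    return pool
```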

12.4 Extensions (`x-` members)

Extension rules from NORMATIVE.md §7. Round-trip preservation by extension-preserving consumers is a runtime obligation, not a document-validity rule.

Rule	Tier	Source	NORMATIVE §
`x-` keys MAY appear on root + Course/Unit/Lesson/Item/Question	Advisory (schemas omit `additionalProperties: false` on these objects)	`NORMATIVE.md` §7.1	§7.1
`x-` keys MUST NOT appear on closed objects (`matching.pairs[]`, `matching.categories[]`, `placement.placements[*]`)	Schema-enforced	`matching.schema.json` / `placement.schema.json` (`additionalProperties: false`)	§7.1
Producer MUST NOT emit non-extension fields whose name begins with `x-`	Advisory	`NORMATIVE.md` §7.1	§7.1
Producer MUST NOT emit an extension under a namespace it does not own	Advisory	`NORMATIVE.md` §7.2	§7.2
Extensions are strictly additive — removing every `x-` member MUST leave a conforming document with equivalent learner-facing meaning	Advisory	`NORMATIVE.md` §7.3	§7.3
Consumer MUST NOT reject documents solely for `x-` members or interpret members outside its own namespace	Advisory	`NORMATIVE.md` §7.4	§7.4
Extension-preserving consumers SHOULD round-trip unrecognized `x-` members on the same object	Advisory (round-trip behavior)	`NORMATIVE.md` §7.4	§7.4
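
The strictly-additive rule (§7.3) suggests a simple producer-side self-test: strip every `x-` member and confirm the result still validates. A minimal sketch of the stripping step, assuming a hypothetical helper `strip_extensions`; re-validation of the stripped document is left to whatever validator the producer already uses:

```python
def strip_extensions(node):
    """Return a deep copy of a parsed document with every x- member removed.

    Hypothetical helper. Works on the dict/list structure produced by
    json.load; scalar values are returned unchanged.
    """
    if isinstance(node, dict):
        return {
            key: strip_extensions(value)
            for key, value in node.items()
            if not key.startswith("x-")
        }
    if isinstance(node, list):
        return [strip_extensions(value) for value in node]
    return node
```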

12.5 Versioning and URL stability

NORMATIVE.md §8 binds publication-side guarantees. These are not document-validity rules per se, but consumers SHOULD enforce them when resolving $schema:

$schema URL identifies the specific publication; specVersion identifies the contract version (§4.6, §4.7, §8.4).
/X.Y/ paths are reserved for accepted final releases; /X.Y-rc.N/ paths are immutable once published; rc.N → final adoption is an explicit re-export (§8.1, §8.3).
A document declaring specVersion: "1.0" with $schema at /1.0-rc.N/ validates against /1.0-rc.N/ and is not required to validate against /1.0/ (§8.4).

12.6 Discriminator casing

Reiteration of §3, §7, §8: conforming consumers MUST reject non-canonical casings on documentType, item type, and question type (NORMATIVE.md §4.2, §5.3). The schemas enforce these via const / enum. The reference validator additionally provides lenient migration paths (PascalCase → camelCase warnings, casing-tolerant documentType dispatch) that are disabled under --strict.

12.7 globalId uniqueness

Rule	Tier	Source	NORMATIVE §
`globalId` values unique across all entities in a document (Units, Lessons, Items, Questions share one namespace; comparison case-insensitive)	Domain-validator-enforced (ERROR)	`validate_course.py: _collect_duplicate_global_id_errors` (course and questionSet paths)	§4.4

JSON Schema cannot express cross-entity uniqueness across nesting levels, so this rule is domain-validator-only. Reference fields that point at a globalId (contentItemId, relatedItemIds) are references, not declarations, and are exempt. Conformance fixture: tests/invalid/40-duplicate-global-id.json.
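
A minimal sketch of such a cross-entity check, assuming a hypothetical helper `find_duplicate_global_ids` that walks the whole parsed document. It is not the reference `_collect_duplicate_global_id_errors` implementation. Reference fields are exempt simply because the walk only looks at members named `globalId`:

```python
def find_duplicate_global_ids(document):
    """Return globalId values that repeat anywhere in the document.

    Hypothetical helper. Comparison is case-insensitive and all nesting
    levels share one namespace.
    """
    seen, duplicates = set(), []

    def walk(node):
        if isinstance(node, dict):
            global_id = node.get("globalId")
            if isinstance(global_id, str):
                key = global_id.lower()
                if key in seen:
                    duplicates.append(global_id)
                seen.add(key)
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for value in node:
                walk(value)

    walk(document)
    return duplicates
```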

13. Conformance note

The catalog tiers describe what the reference validators enforce today under --strict (§1.3). A gap between that behavior and the normative sources is a validator defect, not a relaxation of the contract.

Producers MUST emit documents that satisfy every Schema-enforced rule and every Domain-validator-enforced (ERROR) rule. Producers SHOULD additionally honor Domain-validator-enforced (WARN) rules; the validator’s warnings flag suspect-but-not-rejected content that authors typically want to fix.

Consumers MUST reject documents that fail any Schema-enforced or Domain-validator-enforced (ERROR) rule, with explicit exceptions where a rule distinguishes producer-emission from consumer-import. This matches NORMATIVE.md §3.2’s strict-producer / lenient-consumer split — a producer that omits $schema is non-conforming with respect to that document, but a consumer that rejects an otherwise-valid document on the basis of a missing $schema is overly strict. The producer-emission-vs-consumer-import rules are:

The $schema rows in §3 — the canonical example: MUST emit, SHOULD tolerate absence.
The SubjectCollection closure rules SC-6, SC-7, SC-8 (§15) — a producer MUST NOT emit a violating document (ERROR), but a consumer SHOULD (not MUST) reject one, and MAY instead ingest it with the violations surfaced (NORMATIVE.md §5.7).
The alignment-claim vocabulary rule SC-10 (§15) is producer-tier only, and its consumer side is not a weakened rejection — it is the opposite. A producer MUST NOT emit a claim outside the 1.1 vocabulary (ERROR). A consumer encountering a claim value outside that set MUST NOT reject the document, MUST NOT interpret the claim, and MUST preserve the entry verbatim across read/write cycles (NORMATIVE.md §4.10, §5.5). Consumers therefore never apply SC-10 as a rejection criterion at all.
Identity-less members (SC-3) are not in this set — a consumer MUST reject those (§3.4).

Domain-validator-enforced (WARN) rules describe sanitization, accessibility, or migration-aid behavior — consumers SHOULD surface them but are not required to reject on their basis. NOTE-tier rows are informational only.

Advisory rules carry the RFC 2119 weight stated in the cited section (NORMATIVE.md MUST/SHOULD/MAY, HTML_SAFETY.md §8 severity, ACCESSIBILITY.md §8). Consumers that diverge from advisory SHOULD/MAY rules are non-canonical but not non-conforming. Where a rule is enforced in multiple tiers (schema + validator), satisfying the strictest tier suffices.

A note on JSON Schema format keywords. Several rows above cite format: "uri" or format: "uuid" from the schemas. Under JSON Schema Draft 7, format is an annotation by default — a validator only enforces it when configured with a FormatChecker (or equivalent). The reference validator’s Draft7Validator instance runs without explicit format assertions, so format-only claims are not guaranteed by the schema pass alone. Rows that depend on these formats also cite a regex pattern (for UUIDs on globalId properties) or a domain-validator backstop (validate_course.py: validate_item via is_valid_uuid for contentItemId / relatedItemIds). Implementers re-implementing the validator in other languages should either enable format-assertion in their JSON Schema library or replicate the regex/domain backstops.
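
As an illustration of replicating a regex backstop, the sketch below performs a shape-only UUID check. The exact pattern used by the schemas and by `is_valid_uuid` is not reproduced in this section, so the 8-4-4-4-12 hexadecimal shape here is an assumption, and `looks_like_uuid` is a hypothetical name:

```python
import re

# Assumed shape: 8-4-4-4-12 hex digits, any version, either letter case.
UUID_SHAPE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def looks_like_uuid(value):
    """Shape-only check; does not inspect the version or variant bits."""
    return isinstance(value, str) and UUID_SHAPE.fullmatch(value) is not None
```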

A note on JSON Schema default keywords. Several rows above cite a property’s default value (e.g. isGraded defaults to true on quiz, points defaults to 1.0 on questions, placementUnit defaults to "sentence"). Under JSON Schema Draft 7, default is an annotation — most validators (including jsonschema for Python, AJV with default options, etc.) do not apply or enforce it. A producer that omits the property emits a document that validates; a consumer that reads such a document MUST apply the default itself if it wants the documented behavior. The defaults are listed here so implementers know what the spec intends absent an explicit value — they are schema-declared, not schema-enforced. Consumers SHOULD NOT rely on the validator filling in defaults; producers SHOULD emit explicit values when the documented default does not match their intent.
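
One way for a consumer to apply schema-declared defaults itself is to merge them under the authored values at read time. A minimal sketch with a hypothetical helper `with_defaults`; the default tables list only a few of the values cited in this catalog and would need completing from the schemas:

```python
# A partial table, for illustration only.
QUESTION_DEFAULTS = {"points": 1.0}
PER_TYPE_DEFAULTS = {
    "placement": {"placementUnit": "sentence"},
    "multipleChoice": {"allowPartialCredit": True, "penalizeIncorrect": False},
    "trueFalseQuestion": {"displayStyle": "TrueFalse", "incorrectPenaltyPercent": 50.0},
}


def with_defaults(question):
    """Return a copy of the question with documented defaults filled in.

    Authored values always win. The original object is left untouched so
    it can still round-trip exactly as read.
    """
    merged = dict(QUESTION_DEFAULTS)
    merged.update(PER_TYPE_DEFAULTS.get(question.get("type"), {}))
    merged.update(question)
    return merged
```

Keeping the merged view separate from the stored document means a later export does not silently add members the producer never wrote.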

Where a reference validator and a normative document disagree, the normative document wins. Discrepancies should be reported as issues against this spec, and the validator is then updated to match the normative text.

14. Forward-looking deepenings (1.0 final)

(Historical: this section records the deepenings cataloged during the 1.0 cycle. 1.0 has since shipped and 1.1-rc.1 is the current publication; the still-open items below carry forward. Rules for the 1.1 artifact types are in §15–§19.)

The inventory pass that produced this catalog (2026-05-24) surfaced eight documented-but-unenforced rules. All eight were closed in the same rc.1-polish session by extending tools/validate_course.py; no schema changes were needed, because the closures land in the domain-validator pass. The corresponding rows in the per-type tables above are tagged Domain-validator-enforced rather than Advisory; new invalid conformance fixtures (tests/invalid/21-mcq-no-correct-option.json, 22-mcq-options-points-missing-entry.json, 23-word-bank-cloze-gap-count-mismatch.json, 24-multiple-choice-cloze-index-out-of-bounds.json) pin the ERROR-tier checks. The corpus runs 64/64 under python tools/run_corpus.py, which invokes validate_course.py --strict internally on every fixture. The 64 fixtures comprise the 36 present at the time of that rc.1 pass, the two prompt-correction fixtures added in rc.2, and the per-type / referential-integrity / grading-matrix / globalId-uniqueness expansion added in rc.3.

Three areas remain explicitly forward-looking for 1.0 final or beyond:

--accessibility validator flag. The <video> without <track kind="captions"|"subtitles"> check (§12.2) is WARN at the current baseline. The 1.0-final --accessibility flag promotes it (and related accessibility warnings) to ERROR so tooling that wants to fail-build on accessibility-profile claims can do so.
Tag namespace conventions. Optional best-practice tag prefixes (stage:, level:, exam:, …) are described informally in ITEM_PATTERNS.md §1. No schema-level constraint, no validator check; left to convention for 1.0. Referential-integrity validation on objectiveIds is closed at rc.1 (consumers MUST report unresolved IDs per the validator).
Reserved-type per-type schemas. The 7 reserved question types (hotspot, association, etc.) validate against question-base.schema.json only (§10). First-class per-type schemas remain deferred to a future minor version (targeted for 2027) and are not part of 1.1.

Future deepenings (a new accessibility rule promoted to ERROR, a new cross-document rule added by 1.1) will surface as new rows in the per-type tables above or as new entries in this section. The published /1.0-rc.2/ and /1.0-rc.3/ schema URLs remain immutable per NORMATIVE.md §8.3; any future closures land at /1.0/ or a later version path.

15. Subject Collection (`SC-*`)

Rules for documentType: "subjectCollection" documents.

#	Rule	Source	Tier	Severity
SC-1	Root triplet `$schema`/`documentType`/`specVersion` present; `documentType == "subjectCollection"`	NORMATIVE §3.2	schema	ERROR
SC-2	`globalId`, `version`, `title`, `scope` present; `scope.subject.id` non-empty	NORMATIVE §3.3; schema `required`	schema	ERROR
SC-3	Every tag and objective carries a non-empty `id` (identity is not optional)	NORMATIVE §3.4	schema	ERROR
SC-4	Member ids unique within the document (tags and objectives separately)	NORMATIVE §3.4	domain	ERROR
SC-5	Tag `slug` present and unique within the document	schema + domain	domain	ERROR
SC-6	Every `tags[].categoryId` resolves within `categories[]` (closure)	NORMATIVE §4.9	domain	ERROR
SC-7	Every `tags[].parentId`, when present, is the member id of another tag in the document (parents are member ids, never slugs), and the parent relation is acyclic — no tag is its own parent, and no chain of `parentId` links returns to a tag already on that chain. The relation therefore forms a forest	NORMATIVE §4.9	domain	ERROR
SC-8	Every `objectives[].tagIds` entry resolves within the document’s `tags[]` (closure)	NORMATIVE §4.9	domain	ERROR
SC-9	`difficultyBand` within the enum (`Recall`/`Understand`/`Apply`/`Analyze`/null)	schema	schema	ERROR
SC-10	`externalAlignments[].claim` within the 1.1 producer vocabulary (`references`/`alignedTo`/`covers`; `assesses`/`verifiedBy` reserved). Producer-tier by design: the schema deliberately leaves `claim` an open string so consumers can satisfy the §5.5 preserve-unknown-claims rule without a schema failure — the vocabulary binds what a 1.1 producer may emit, never what a consumer accepts. `scheme` and `id` non-empty (schema)	NORMATIVE §4.10, §5.5	domain (producer-tier; the reference vocabulary tooling enforces it as an authoring/emission gate) + schema (`scheme`/`id`)	ERROR (emission)
SC-11	Category ids unique within the document	schema + domain	domain	ERROR
SC-12	`aliases`, when present, is an array (not a bare string)	schema	schema	ERROR
SC-13	`license` populated with a concrete value on documents intended for distribution (`"unspecified"` only for private drafts)	NORMATIVE §4.11	advisory	WARN
SC-14	Objective `text` reads as a can-do capability (active verb, completes “…be able to:”)	`subject-collection-reference.md` §5	advisory	NOTE

Reference implementation note: SC-4…SC-8, SC-11, SC-12 are implemented by the vocabulary-document validator in the reference tooling; SC-1 and SC-2’s root-triplet checks are schema-enforced.
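
The SC-7 check (parents resolve to member ids, and the parent relation is acyclic) can be sketched as follows. This is an illustration under assumptions, not the reference tooling's code: `check_tag_parents` is a hypothetical name, and it expects tag ids to be present and unique already (SC-3, SC-4).

```python
def check_tag_parents(tags):
    """Return SC-7 style error strings for a list of tag dicts."""
    ids = {tag["id"] for tag in tags}
    parent_of = {tag["id"]: tag.get("parentId") for tag in tags}
    errors = []

    # Closure: every parentId must be the member id of a tag in the document.
    for tag_id, parent in parent_of.items():
        if parent is not None and parent not in ids:
            errors.append(f"tag {tag_id}: parentId {parent!r} is not a member id")

    # Acyclicity: walking parent links must never revisit a tag on the chain.
    for start in parent_of:
        seen = {start}
        current = parent_of[start]
        while current is not None and current in parent_of:
            if current in seen:
                errors.append(f"tag {start}: parentId chain contains a cycle")
                break
            seen.add(current)
            current = parent_of[current]
    return errors


print(check_tag_parents([{"id": "t1"}, {"id": "t2", "parentId": "t1"}]))
```

A tag whose chain leads into a cycle is reported too, since its chain also revisits a tag.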

16. Curriculum Pack (`CP-*`)

Rules for documentType: "curriculumPack" documents.

#	Rule	Source	Tier	Severity
CP-1	Root triplet present; `documentType == "curriculumPack"`	NORMATIVE §3.2	schema	ERROR
CP-2	`packMode` is `"manifest"` or `"bundle"`	schema	schema	ERROR
CP-3	`collectionRefs[]` entries carry non-empty `globalId`; `contentRefs[]` entries carry non-empty `type` + `id` (the referenced document’s type-directed identity, NORMATIVE §4.4: course→`sourceCourseId`, questionSet→`sourceQuestionSetId`, else root `globalId`). `contentRefs[].type` is a producer-closed vocabulary — `course`, `questionSet`, or `glossary` (a SubjectCollection is referenced via `collectionRefs`; `curriculumPack` is prohibited — no pack nesting, §4.4). Like alignment `claim`, it binds producers only and is left schema-open (§5.8): a producer emitting any other value is non-conforming; a consumer MUST NOT reject an unrecognized type (§5.4), treating the ref as unresolvable	NORMATIVE §4.4, §5.8	schema (`type`/`id` non-empty) + domain (producer type-vocabulary)	ERROR (emission)
CP-4	Step shape: `id` non-empty and unique within the pack; `kind` within the five-value enum; `label` non-empty; `year`/`term`/`weekOfTerm`/`durationLessons` integers ≥ 1; the `contentRef` key present on every step (null = unauthored slot); `selector`, when present, uses the node-globalId-bearing grammar `unit:`/`lesson:`/`item:` (interior nodes retain `globalId`; only the document root’s identity is type-directed)	`curriculum-pack-reference.md` §4.1–§4.3	schema (per-step) + domain (id uniqueness)	ERROR
CP-5	`pacing` present when `sequence[]` is non-empty; `years`/`lessonsPerWeek` ≥ 1; `weeksPerTerm`, when declared, has `termsPerYear` entries ≥ 1; `teachingWeeksPerYear`, when both present, equals `sum(weeksPerTerm)`	`curriculum-pack-reference.md` §5	domain	ERROR
CP-6	Position within the frame: `year ≤ pacing.years`; `term ≤ termsPerYear`; `weekOfTerm` within the term; a step’s week span (`ceil(durationLessons/lessonsPerWeek)` weeks from its start) stays inside its term	`curriculum-pack-reference.md` §4.1, §5	domain	ERROR
CP-7	Term capacity: per `(year, term)`, `Σ durationLessons ≤ weeksPerTerm[term−1] × lessonsPerWeek`	`curriculum-pack-reference.md` §5	domain	ERROR
CP-8	`dependsOn`: ids exist; no self-reference; every edge points at a step strictly earlier in the `(year, term, weekOfTerm)` timeline, or in the same week and earlier in document order (within a week, document order is the schedule)	`curriculum-pack-reference.md` §4.4	domain	ERROR
CP-9	Checkpoint presence: a step of kind `assessment`/`mock` carries a checkpoint object; any other kind carries `null` or omits the key	`curriculum-pack-reference.md` §4.2, §4.5	domain	ERROR
CP-10	Checkpoint shape: `kind` formative/summative (mock ⇒ summative); `format` non-empty; `scope` listed/allTaughtToDate; listed ⇒ `assessesObjectiveIds` non-empty; allTaughtToDate ⇒ `assessesObjectiveIds == []`	`curriculum-pack-reference.md` §4.5	domain	ERROR
CP-11	Taught-before-used: every listed assessed objective, and every objective on a `review` step, is first taught (a `teaching` step’s `objectiveIds`) strictly earlier in the timeline	`curriculum-pack-reference.md` §4.5	domain	ERROR
CP-12	Bill of materials: every non-null step `contentRef` appears in root `contentRefs[]` with a matching `type`	`curriculum-pack-reference.md` §3, §4.3	domain	ERROR
CP-13	Coverage block: `collectionGlobalId` non-empty and present in `collectionRefs[]`; `assertions[]` within the two-value 1.1 vocabulary (unknown assertion strings are errors)	`curriculum-pack-reference.md` §6	schema + domain	ERROR
CP-14	With the referenced collection available: every `objectiveId`/`tagId` used anywhere (steps, checkpoints, exemptions) resolves to a collection member; declared coverage assertions hold over the non-exempt member set. Without it, validators skip these checks visibly rather than fail	`curriculum-pack-reference.md` §6	domain (collection-resolved)	ERROR
CP-15	Bundle closure: a bundle carries `embedded {collections, content}` and a manifest does not; every ref (root and step-level) resolves to an embedded document of the matching `documentType`, identity verbatim; a course selector resolves to a node of that kind inside the embedded course	`curriculum-pack-reference.md` §2	domain (bundle)	ERROR
CP-16	A bundle MUST NOT re-mint any embedded document’s type-directed portable identity (root `globalId`, or a course/questionSet’s `sourceCourseId` / `sourceQuestionSetId`; NORMATIVE §4.4) or its member ids	`curriculum-pack-reference.md` §2	consumer obligation — an import behavior, not a static-document check; cataloged in §19	(behavioral)
CP-17	Advisory hygiene: refs SHOULD pin `version` (consumers surface mismatches rather than silently substituting); a `teaching` step SHOULD list objectives; `sequence[]` SHOULD serialize in timeline order; an unauthored slot SHOULD carry an `authoringNote` (buffer steps excepted); week occupancy SHOULD stay within `lessonsPerWeek`; an objective SHOULD meet a formative checkpoint before a summative one; a checkpoint’s effective assessed set SHOULD be non-empty; a review SHOULD revisit no sooner than `recyclingPolicy.minSpacingWeeks` (default 2) absolute weeks after first teach; a bundle SHOULD embed nothing unreferenced and SHOULD surface pinned-version drift	`curriculum-pack-reference.md` §2–§7	advisory	WARN

Reference implementation note: CP-4…CP-15 and CP-17 are implemented by the pack validator in the reference tooling, whose docstring carries the rule-by-rule list as E01–E15/W01–W10. Embedded-document validity: the embedded block’s documents are typed as generic objects in curriculum-pack.schema.json, so the pack schema does not by itself reject a malformed embedded artifact. CP-15 checks each embedded document’s identity (documentType + its type-directed identity per NORMATIVE §4.4 — a course/questionSet by sourceCourseId/sourceQuestionSetId, a collection/glossary by root globalId) and closure (every ref resolves to an embedded document of the matching type; selectors resolve inside embedded courses). Full validation of each embedded document — against its own schema and its own domain rules, dispatched on the embedded document’s documentType — is performed by the shared emission gate before a bundle is written. Every writer in the reference tooling runs that gate (the bundle assembler, lc_pack.save(), lc_collection.save(), and the pack validator’s own --validate pass over a bundle), so an embedded artifact that is schema-valid but domain-invalid is rejected on the producer path rather than packaged. This is a producer-path guarantee of the reference tooling, not a constraint the pack schema itself expresses.
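
The arithmetic behind CP-6 (a step's week span stays inside its term) and CP-7 (term capacity) can be sketched as below. This is not the pack validator's code. It assumes `weeksPerTerm` is declared, that each step's `year`/`term` are already within the frame, and that the starting week counts as the first week of the span; `check_term_fit` is a hypothetical name.

```python
import math
from collections import defaultdict


def check_term_fit(pacing, steps):
    """Return CP-6 (week span) and CP-7 (term capacity) style error strings."""
    errors = []
    per_week = pacing["lessonsPerWeek"]
    weeks = pacing["weeksPerTerm"]
    used = defaultdict(int)  # (year, term) -> total durationLessons

    for step in steps:
        term_weeks = weeks[step["term"] - 1]
        # CP-6: ceil(durationLessons / lessonsPerWeek) weeks from the start week.
        span = math.ceil(step["durationLessons"] / per_week)
        last_week = step["weekOfTerm"] + span - 1
        if last_week > term_weeks:
            errors.append(
                f"{step['id']}: runs to week {last_week} of a {term_weeks}-week term"
            )
        used[(step["year"], step["term"])] += step["durationLessons"]

    # CP-7: sum of durationLessons <= weeksPerTerm[term - 1] * lessonsPerWeek.
    for (year, term), lessons in sorted(used.items()):
        capacity = weeks[term - 1] * per_week
        if lessons > capacity:
            errors.append(
                f"year {year} term {term}: {lessons} lessons exceed capacity {capacity}"
            )
    return errors


pacing = {"lessonsPerWeek": 3, "weeksPerTerm": [10, 12]}
steps = [{"id": "s1", "year": 1, "term": 1, "weekOfTerm": 10, "durationLessons": 4}]
print(check_term_fit(pacing, steps))
```

In the usage above, four lessons at three per week need two weeks, so a step starting in week 10 of a 10-week term overruns it.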

17. Glossary (`GL-*`)

Rules for documentType: "glossary" documents.

The declared-inventory rules follow a single severity principle: a false claim is an ERROR; a missing claim is a WARN. Gloss-rule and inventory ERRORs are interchange-boundary rules — they gate what a producer may emit, and a work-in-progress document inside an authoring tool may legitimately fail them mid-authoring (glossary-reference.md §2).

#	Rule	Source	Tier	Severity
GL-1	Root triplet present; `documentType == "glossary"`; `globalId`/`version`/`title`/`language` non-empty; `entries[]` present	NORMATIVE §3.2; schema	schema	ERROR
GL-2	`translationLanguages`, when present, is an array of unique, non-empty, BCP 47-shaped strings	`glossary-reference.md` §1	schema + domain	ERROR
GL-3	Entry shape: `id` non-empty and unique; `term` non-empty; `kind` within `word`/`phrase`; typed optional fields (strings non-empty incl. `firstMention`, `linkAutomatically` boolean, `examples[].text` non-empty); translation maps (`translations`, `definitionTranslations`, example translations) keyed by BCP 47 tags with non-empty values	`glossary-reference.md` §2	schema (per-field) + domain (id uniqueness)	ERROR
GL-4	The gloss rule: every entry carries a `definition`, or at least one `translations` value, or at least one `definitionTranslations` value (interchange-boundary)	`glossary-reference.md` §2	domain	ERROR
GL-5	Inventory membership: when `translationLanguages` is declared (non-empty), every translation key used anywhere in the document is a member of it. Language tags are compared case-insensitively (BCP 47 §2.1.1): a document declaring `es` and using the key `ES` satisfies this rule, and satisfies GL-6 as well. Absent and empty declarations are equivalent — no claim is made, and GL-5/GL-6 are vacuous (GL-8 then applies)	`glossary-reference.md` §1	domain	ERROR
GL-6	No false claims: every declared `translationLanguages` value is used by at least one translation value somewhere in the document, again compared case-insensitively. The same case-insensitive key governs GL-2’s uniqueness check: `es` and `ES` in one `translationLanguages` array are a duplicate declaration, not two languages	`glossary-reference.md` §1	domain	ERROR
GL-7	Duplicate matching surface (`term`/`otherForms`) across entries: legal (senses are separate entries), but auto-linking consumers must disambiguate. Surfaces are compared after Unicode normalization to NFC followed by `casefold()` (the Unicode caseless-matching operation, not simple lowercasing), so German `Straße` and `STRASSE` are one surface, and canonically equivalent compositions of an accented character do not read as two. *This warning is routinely accepted, not fixed:* two senses of one spelling are the model working as designed, so GL-7 exists to inform disambiguation, not to be driven to zero	`glossary-reference.md` §2	advisory	WARN
GL-8	Translations are present but the document declares no `translationLanguages` (missing claim — fires on imports; documents authored against the declaration always carry it)	`glossary-reference.md` §1	advisory	WARN
GL-9	An entry’s `otherForms` repeats its own term (redundant matching surface)	`glossary-reference.md` §2	advisory	WARN
GL-10	Language-code lint: a declared `translationLanguages` value (or, absent a declaration, a used key) whose 2-letter primary subtag is not ISO 639-1, or whose 2-letter region subtag is not ISO 3166-1 alpha-2. 3-letter primary subtags and script subtags are shape-checked only	`glossary-reference.md` §2	advisory	WARN
GL-11	Absent `firstMention` is legal — never an error or warning (imported glossaries carry no lesson provenance; the entry renders as course-scoped background vocabulary). Surfaced only as a maturity signal (the validator’s success report counts entries with/without it). A `firstMention` naming a lesson the importer does not hold is treated as absent	`glossary-reference.md` §2.1	advisory	NOTE

Reference implementation note: GL-1…GL-10 are implemented by the glossary validator in the reference tooling (rule ids GE01–GE05/GW01–GW04 in its docstring: GL-1↔GE01, GL-2↔GE01, GL-3↔GE02, GL-4↔GE03, GL-5↔GE04, GL-6↔GE05, GL-7↔GW01, GL-8↔GW02, GL-9↔GW03, GL-10↔GW04); GL-11 is NOTE-tier — implemented as the maturity report’s firstMention n/total line, not as a GE/GW rule.
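
The two comparison keys used by the GL rules can be sketched with the standard library: NFC followed by `casefold()` for matching surfaces (GL-7), and case-insensitive language tags for the inventory rules (GL-5, GL-6). The function names are hypothetical, and lowercasing is used for tags on the assumption that BCP 47 tags are ASCII.

```python
import unicodedata


def surface_key(text):
    """GL-7 comparison key: NFC normalization, then Unicode casefolding."""
    return unicodedata.normalize("NFC", text).casefold()


def check_inventory(declared, used_keys):
    """Return GL-5 / GL-6 style error strings; vacuous when nothing is declared."""
    if not declared:
        return []  # absent and empty declarations make no claim
    declared_set = {tag.lower() for tag in declared}
    used_set = {key.lower() for key in used_keys}
    errors = []
    for key in sorted(used_set - declared_set):
        errors.append(f"GL-5: translation key {key!r} is not declared")
    for tag in sorted(declared_set - used_set):
        errors.append(f"GL-6: declared language {tag!r} is never used")
    return errors


print(surface_key("Straße") == surface_key("STRASSE"))
print(check_inventory(["es"], ["ES"]))
```

The same lowercased key would also catch `es` and `ES` as a duplicate declaration under GL-2.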

18. 1.1 deepenings — root and course (`RD-`, `CO-`)

Rules that deepen the root document (§3) and the course artifact (§4) as of 1.1. They add no new fields; they enforce obligations NORMATIVE already carried.

18.1 Root document, all artifact types

#	Rule	Source	Tier	Severity
RD-1	`specVersion` ↔ `$schema` version agreement: when both fields are present, `specVersion` is well-formed `1.x`, and `$schema` is an `lc-json.org` versioned URL, the URL’s version path MUST match `specVersion` on major.minor (patch releases share the minor’s URL per §8.1: `1.0.1` legitimately pairs with `/1.0/` or `/1.0-rc.N/`). Field absence is governed separately by §3.2 and is not re-reported here	NORMATIVE §8.4	domain	ERROR
RD-2	Unrecognized non-`x-` fields are surfaced informationally, one line per stray, stating the §5.4 consequence (consumers may ignore; free to drop on re-export; a future spec version may claim the name) and the remedy (`x-<vendor>-…` for guaranteed preservation, §7.4). `x-` members are listed separately as extension members with preservation expected. Never rises above NOTE — §5.4 makes these fields legal; this surfaces silent-drop risk and catches typos (`titel`). Skips: closed objects (already ERROR by contract), members of reserved/unknown-type questions (§6 preserves them wholesale), and the interior of `x-` subtrees (the namespace owner’s business)	NORMATIVE §5.4, §7 (explains existing rules; adds none)	advisory	NOTE

A document that pins a $schema publication other than the validator’s own is not an error — the reference validator surfaces it as a NOTE, because validating an rc.1 document with an rc.1-tagged validator is the normal case and validating it with a later one is legitimate.
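
RD-1's major.minor comparison can be sketched as below. Several details are assumptions of this sketch rather than statements of the spec: the accepted `specVersion` shape (`1.<minor>` with an optional patch), the optional URL scheme, and the schema file name in the usage example. `version_agreement_error` is a hypothetical name.

```python
import re

# Versioned lc-json.org path, e.g. lc-json.org/1.0/ or lc-json.org/1.0-rc.3/
SCHEMA_URL = re.compile(r"^(?:https?://)?lc-json\.org/(\d+)\.(\d+)(?:-rc\.\d+)?/")
# Assumed well-formed specVersion shape: 1.<minor> with an optional .<patch>
SPEC_VERSION = re.compile(r"^1\.(\d+)(?:\.\d+)?$")


def version_agreement_error(doc):
    """Return an RD-1 style error string, or None when the rule does not fire."""
    spec = doc.get("specVersion")
    schema = doc.get("$schema")
    if not isinstance(spec, str) or not isinstance(schema, str):
        return None  # field absence is reported under §3.2, not here
    spec_match = SPEC_VERSION.match(spec)
    url_match = SCHEMA_URL.match(schema)
    if spec_match is None or url_match is None:
        return None  # the rule only applies to well-formed pairs
    spec_pair = (1, int(spec_match.group(1)))
    url_pair = (int(url_match.group(1)), int(url_match.group(2)))
    if spec_pair != url_pair:
        return f"specVersion {spec} does not match $schema version path in {schema}"
    return None


doc = {"specVersion": "1.0.1", "$schema": "https://lc-json.org/1.0-rc.3/course.schema.json"}
print(version_agreement_error(doc))
```

The patch number and the `-rc.N` suffix are both ignored in the comparison, which is why `1.0.1` pairs cleanly with `/1.0-rc.3/`.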

18.2 Course-level

#	Rule	Source	Tier	Severity
CO-1	Every id in `courseObjectiveIds` and unit/lesson `objectiveIds` resolves within the course’s own `objectives[]` pool (carried copies embedded; no dangling vocabulary references)	NORMATIVE §4.9	domain	WARN for `specVersion` 1.0 documents (the 1.0 WARN-tier objective-reference-integrity posture, unchanged); ERROR for documents declaring `specVersion` ≥ 1.1 (this restates §4.9’s MUST)
CO-2	`license`/`canonicalUrl`/`derivedFrom` validate against the shared publication field group	schema (composed)	schema	ERROR
CO-3	`glossaryRefs`, when present at course/unit/lesson, is an array of non-empty glossary `globalId` strings (placement encodes scope; no `{globalId, scope}` object form; junctions stop at Lesson)	`glossary-reference.md`; `course.schema.json`	schema	ERROR
CO-4	A `glossaryRefs` entry that resolves to no `glossaries[]` pool copy and no held document (a dangling ref) — consumers SHOULD surface it, naming the missing `globalId`, SHOULD preserve it for later binding, and MUST NOT fail the import	`glossary-reference.md`	advisory (consumer-surfaced)	WARN
CO-5	`glossaries[]` pool hygiene: a pool entry no `glossaryRefs` references is a stowaway; the pool SHOULD hold one copy per referenced glossary, deduplicated by `globalId`. Per-copy validity is NOT schema-enforced: the `glossaries[]` items are typed as generic objects so a carried copy can travel whole (a `$ref` to the full glossary schema would collide with the pre-port/pool-copy tolerance for an omitted `$schema`/`specVersion`). A producer SHOULD ensure each pool copy is a valid glossary; a consumer MAY re-validate copies with the glossary validator. The reference course validator surfaces the dangling-ref (CO-4) and stowaway conditions but does not re-run the GL-* rules on each copy	`glossary-reference.md`; `course.schema.json`	advisory (stowaway) + producer-obligation (per-copy validity)	WARN (stowaway)
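
The dangling-ref (CO-4) and stowaway (CO-5) conditions reduce to two set differences. The sketch below is an illustration, not the reference course validator: the `units` and `lessons` array names are assumptions about the course shape, and `held_ids` stands in for whatever glossary documents a consumer already holds.

```python
def glossary_ref_report(course, held_ids=()):
    """Return dangling glossaryRefs (CO-4) and stowaway pool copies (CO-5)."""
    referenced = set()

    def collect(node):
        referenced.update(node.get("glossaryRefs", []))

    # Placement encodes scope: course, unit, and lesson may each carry refs.
    collect(course)
    for unit in course.get("units", []):  # "units" is an assumed field name
        collect(unit)
        for lesson in unit.get("lessons", []):  # "lessons" is assumed too
            collect(lesson)

    pool_ids = {g["globalId"] for g in course.get("glossaries", []) if "globalId" in g}
    dangling = sorted(referenced - pool_ids - set(held_ids))
    stowaways = sorted(pool_ids - referenced)
    return {"dangling": dangling, "stowaways": stowaways}


course = {
    "glossaryRefs": ["g1"],
    "glossaries": [{"globalId": "g1"}, {"globalId": "g3"}],
    "units": [{"glossaryRefs": ["g2"], "lessons": [{"glossaryRefs": ["g1"]}]}],
}
print(glossary_ref_report(course))
```

Both results are WARN-tier findings to surface; neither should fail an import.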

19. Consumer-side obligations not expressible as document checks

Cataloged for implementers; these are behavioral, tested by round-trip/import tests rather than document validation.

Member reconciliation by id — no duplication; membership recording for tags; link-never-overwrite for non-owned objectives; verbatim creation of absent members (NORMATIVE.md §5.7).
Display-collision handling never merges members or reassigns ids (NORMATIVE.md §5.7).
Clean rejection of unimplemented documentTypes, naming the type (NORMATIVE.md §5.1).
Preservation of unknown alignment-claim values, and of unrecognized fields inside sequence[] steps, across read/write cycles (NORMATIVE.md §5.5; curriculum-pack-reference.md §4 — the step shape itself is firm as of 1.1).
Preservation of sequence[] array order — within a week, document order is the schedule (curriculum-pack-reference.md §4.4).
Emission of the contentRef key even when null: serializers that drop null values must exempt it (curriculum-pack-reference.md §4.1).
Glossary entry reconciliation by id on re-import — update, never duplicate; ids preserved verbatim, never re-minted (NORMATIVE.md §3.4/§5.7 as extended to glossary entries).
firstMention handling on import: a value naming a lesson the importer does not hold is treated as absent (never an error); an importer that regenerates lesson GlobalIds MUST remap firstMention on glossaries imported alongside (glossary-reference.md §2.1).
Bundle re-mint prohibition (CP-16): a consumer importing a bundle MUST NOT re-mint any embedded document’s identity (root globalId or a course/questionSet’s sourceCourseId/sourceQuestionSetId) or member ids — an import-behavior obligation, not a static-document check (curriculum-pack-reference.md §2).
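
The `contentRef` obligation above is easy to miss in serializers that strip null values. A minimal sketch of an exemption, assuming a hypothetical `drop_nulls` pre-serialization pass:

```python
import json

# Keys that must be emitted even when their value is null.
KEEP_WHEN_NULL = {"contentRef"}


def drop_nulls(value):
    """Recursively drop null-valued object members, except exempt keys."""
    if isinstance(value, dict):
        return {
            key: drop_nulls(item)
            for key, item in value.items()
            if item is not None or key in KEEP_WHEN_NULL
        }
    if isinstance(value, list):
        return [drop_nulls(item) for item in value]  # array order is preserved
    return value


step = {"id": "s1", "contentRef": None, "selector": None}
print(json.dumps(drop_nulls(step)))
```

Lists are rebuilt in their original order, which also keeps `sequence[]` order intact through the pass.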

LC-JSON Question Types — Format Reference

Spec version: 1.0

Purpose: Per-type property reference for LC-JSON (Learning Content JSON), covering the 12 implemented question types and the 7 reserved-for-2027 types.

Overview
Common Properties
Phase 1: Core Foundation
Phase 2: Cloze Family
Phase 3: Text Entry
Phase 4: Structured Tasks (Implemented)
Reserved Types
Validation Rules

Overview

LC-JSON questions are tagged-union objects. Every question carries a type field whose value selects the per-type schema that applies. Consumers dispatch on type to validate and render.

Key requirements:

The type discriminator value uses canonical camelCase (simpleGapFill, multipleChoice, …). Conforming consumers MUST reject non-canonical casings (NORMATIVE.md §5.3).
All property names use camelCase.
Every question carries a globalId in RFC 4122 UUID form (any version; shape-only validation against the 8-4-4-4-12 hex pattern).
See NORMATIVE.md for the full conformance requirements.

Supported Question Types (19 total):

Phase 1 - Core Foundation:

1. simpleGapFill - Single gap with free text entry
2. trueFalseQuestion - Binary choice questions
3. multipleChoice - Single or multiple correct answers

Phase 2 - Cloze Family:

4. wordBankCloze - Gap fill from word bank
5. multiGapCloze - Multiple free-text gaps
6. multipleChoiceCloze - Multiple dropdown gaps

Phase 3 - Text Entry:

7. shortAnswer - Free text response
8. essay - Long-form text with word limits

Phase 4 - Structured tasks (implemented):

9. matching - Pair items 1:1, or classify items into categories
10. ordering - Sequence items (word / sentence / paragraph variants)
11. placement - Place items into anchored gaps in a structured passage (sentence / paragraph / sectionLabel variants; supports decoy gaps for TOEFL Sentence Insertion)
12. sentenceTransformation - Cambridge exam-style controlled paraphrase

Reserved types (per NORMATIVE.md §6: preserved on round-trip; per-type schemas targeted for 2027):

13. association - Group items into categories
14. hotspot - Click regions on image
15. graphicGapMatch - Visual drag-and-drop
16. graphicAssociate - Associate items with images
17. graphicOrder - Order items based on images
18. fileUpload - Submit documents
19. mediaPromptedEssay - Record audio/video
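
Dispatching on `type` with the type names listed above might look like the sketch below. `classify_type` is a hypothetical helper. Rejecting a casing variant of a known name follows the canonical-casing requirement; returning "unknown" for any other value (so the question can be preserved rather than rejected) is an assumption of this sketch, and NORMATIVE.md §6 governs the actual behavior.

```python
IMPLEMENTED_TYPES = {
    "simpleGapFill", "trueFalseQuestion", "multipleChoice",
    "wordBankCloze", "multiGapCloze", "multipleChoiceCloze",
    "shortAnswer", "essay",
    "matching", "ordering", "placement", "sentenceTransformation",
}
RESERVED_TYPES = {
    "association", "hotspot", "graphicGapMatch", "graphicAssociate",
    "graphicOrder", "fileUpload", "mediaPromptedEssay",
}
_CANONICAL_BY_LOWER = {
    name.lower(): name for name in IMPLEMENTED_TYPES | RESERVED_TYPES
}


def classify_type(question):
    """Return 'implemented', 'reserved', or 'unknown'; raise on bad casing."""
    type_value = question.get("type")
    if not isinstance(type_value, str):
        raise ValueError("question has no string 'type' discriminator")
    if type_value in IMPLEMENTED_TYPES:
        return "implemented"
    if type_value in RESERVED_TYPES:
        return "reserved"
    canonical = _CANONICAL_BY_LOWER.get(type_value.lower())
    if canonical is not None:
        # A known type in the wrong casing, e.g. "MultipleChoice".
        raise ValueError(
            f"non-canonical casing {type_value!r}; expected {canonical!r}"
        )
    return "unknown"


print(classify_type({"type": "multipleChoice"}))
print(classify_type({"type": "hotspot"}))
```

A spelling that differs by more than case (for example a snake_case variant) is not detected as a casing error here and falls through to "unknown".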

Common Properties

All question types inherit these base properties:

{
  "type": "simpleGapFill",
  "globalId": "550e8400-e29b-41d4-a716-446655440000",
  "title": "Question title",
  "prompt": "",
  "tags": ["tag1", "tag2"],
  "difficulty": 5.0,
  "points": 1.0,
  "hint": "Optional hint text",
  "feedback": {
    "correct": "Feedback shown when the answer is correct",
    "incorrect": "Feedback shown when the answer is incorrect",
    "choiceFeedback": {
      "choice1": "Per-choice feedback (where applicable)"
    }
  }
}

Property Details:

Property	Type	Required	Default	Description
`type`	string	✅ Yes	-	Question type discriminator. Canonical camelCase form.
`globalId`	string (UUID)	✅ Yes	-	RFC 4122 UUID (any version; shape-only validation); stable across versions of the question.
`title`	string	❌ No	`""`	Short title for editorial/list views.
`prompt`	string	✅ Yes	`""`	Main question text. Required for every type; may be empty (`""`). Authoritative for the real-content types (true/false, multiple choice, short answer, essay), where it is the question. Non-authoritative for the symbolic types, whose structured fields carry the meaning — there it MAY be empty or MAY carry a brief producer-derived readable summary (see the symbolic-type note below).
`tags`	string[]	❌ No	`[]`	Tag array for categorization.
`difficulty`	number	❌ No	`5.0`	Estimated difficulty for the intended learners (0.0 = extremely easy, 10.0 = extremely difficult).
`points`	number	❌ No	`1.0`	Points awarded for a correct answer.
`hint`	string	❌ No	`null`	Optional hint shown to the learner.
`feedback`	object	❌ No	`null`	Optional feedback bundle (see FeedbackBundle below).

difficulty is an author estimate, not a subject level, grade level, CEFR level, or Bloom band. It estimates how challenging the question is for its intended learners. Applications SHOULD display it in teacher-readable form, commonly rounded to the nearest whole number on the 0-10 scale, and MAY later compare it with observed first-attempt success rates.

Phase 1: Core Foundation

1. SimpleGapFill

Description: Single gap with free text entry and multiple acceptable answers.

Use Case: Simple fill-in-the-blank questions.

Example: “The capital of France is ___.”

{
  "type": "simpleGapFill",
  "globalId": "550e8400-e29b-41d4-a716-446655440001",
  "prompt": "",
  "title": "Capital of France",
  "tags": ["geography", "level:A1"],
  "difficulty": 2.0,
  "points": 1.0,
  "sentence": "The capital of France is @@@.",
  "acceptedAnswers": ["Paris", "paris"],
  "caseSensitive": false
}

Type-Specific Properties:

Property	Type	Required	Default	Description
`sentence`	string	✅ Yes	`""`	Sentence with `@@@` marking gap position
`acceptedAnswers`	string[]	✅ Yes	`[]`	List of acceptable answers
`caseSensitive`	boolean	❌ No	`false`	Whether answer matching is case-sensitive
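
A minimal sketch of how a consumer might match a response against `acceptedAnswers`. `is_accepted` is a hypothetical helper; using `casefold()` for the case-insensitive comparison and leaving whitespace untouched are assumptions, since the table above does not specify either.

```python
def is_accepted(response, accepted_answers, case_sensitive=False):
    """Return True when response matches one of the accepted answers."""
    if case_sensitive:
        return response in accepted_answers
    folded = response.casefold()
    return any(folded == answer.casefold() for answer in accepted_answers)


print(is_accepted("PARIS", ["Paris", "paris"]))
print(is_accepted("paris", ["Paris"], case_sensitive=True))
```

With `caseSensitive` false, a single entry such as "Paris" already covers "paris", so listing both spellings is harmless but redundant.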

2. TrueFalseQuestion

Description: Binary choice question with a boolean correctAnswer as the single source of truth. Supports configurable display styles and an optional penalty for incorrect answers.

Use Case: True/False, Correct/Incorrect, or visual checkmark/X questions.

Example: “Water boils at 100°C at sea level. True or False?”

{
  "type": "trueFalseQuestion",
  "globalId": "550e8400-e29b-41d4-a716-446655440002",
  "prompt": "Water boils at 100°C at sea level.",
  "title": "Boiling Point",
  "tags": ["science", "level:A2"],
  "difficulty": 1.0,
  "points": 1.0,
  "correctAnswer": true,
  "displayStyle": "TrueFalse",
  "penalizeIncorrect": false,
  "incorrectPenaltyPercent": 50.0
}

Type-Specific Properties:

Property	Type	Required	Default	Description
`correctAnswer`	boolean	✅ Yes	-	The correct answer (`true` or `false`). Single source of truth for scoring.
`displayStyle`	string	❌ No	`"TrueFalse"`	UI label style: “TrueFalse”, “CorrectIncorrect”, “CheckmarkX”. Presentation only — does not affect scoring.
`penalizeIncorrect`	boolean	❌ No	`false`	Whether to apply a point penalty for a wrong answer. Independent of item type and `isGraded` — author’s choice. Common: `false` when you want learners to try without risk; `true` when guessing should cost something.
`incorrectPenaltyPercent`	number	❌ No	`50.0`	Penalty percentage (0-100). 0%=no penalty, 50%=partial, 100%=full penalty.

Import normalization (pre-1.0 lenient migration affordance — NOT conforming behavior). Some authoring tools historically emitted non-boolean correctAnswer values for True/False questions. A consumer MAY accept and normalize the following on read, purely as a migration aid for ingesting pre-1.0 documents:

true, "true", "True", "correct", "tick", "✓", 1 → normalized to true
false, "false", "False", "incorrect", "cross", "✗", 0 → normalized to false

Conforming behavior under LC-JSON 1.0 is unambiguous: the schema requires correctAnswer to be a JSON boolean (true / false). Conforming producers MUST emit it as a boolean. Conforming consumers in strict mode MUST reject non-boolean values per NORMATIVE.md §5.1 — the reference validator’s --strict mode (which tools/run_corpus.py invokes on every fixture) does so. Tools relying on the normalization above should treat it as a transitional ingestion aid that does not survive into a --strict-conforming document on re-export.

Note: For True/False/Not Mentioned questions (3 options), use MultipleChoice instead.
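
The strict and lenient readings of `correctAnswer` described above can be sketched as one function. `read_correct_answer` is a hypothetical name; the lenient branch accepts exactly the values listed in the normalization lists and nothing else.

```python
TRUE_STRINGS = {"true", "True", "correct", "tick", "✓"}
FALSE_STRINGS = {"false", "False", "incorrect", "cross", "✗"}


def read_correct_answer(value, strict=True):
    """Return the boolean correctAnswer, or raise ValueError."""
    if isinstance(value, bool):
        return value  # the only conforming form
    if strict:
        raise ValueError(f"correctAnswer must be a JSON boolean, got {value!r}")
    # Lenient migration aid for pre-1.0 documents only.
    if isinstance(value, int) and value in (0, 1):
        return value == 1
    if isinstance(value, str):
        if value in TRUE_STRINGS:
            return True
        if value in FALSE_STRINGS:
            return False
    raise ValueError(f"cannot normalize correctAnswer value {value!r}")


print(read_correct_answer("tick", strict=False))
print(read_correct_answer(0, strict=False))
```

The `bool` check comes first because Python treats `True` and `False` as integers; testing for `int` first would blur the conforming and lenient paths.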

3. MultipleChoice

Description: Single or multiple correct answers with optional partial credit and shuffling.

Use Case: Traditional multiple-choice questions (MCQ).

Example: “Which of the following are programming languages? (Select all that apply)”

{
  "type": "multipleChoice",
  "globalId": "550e8400-e29b-41d4-a716-446655440003",
  "prompt": "Which of the following are programming languages?",
  "title": "Programming Languages",
  "tags": ["programming", "level:B1"],
  "difficulty": 3.0,
  "points": 2.0,
  "options": ["Python", "HTML", "Java", "CSS"],
  "optionsAndPoints": {
    "Python": 1.0,
    "HTML": 0.0,
    "Java": 1.0,
    "CSS": 0.0
  },
  "allowMultipleCorrect": true,
  "allowPartialCredit": true,
  "penalizeIncorrect": false,
  "shuffleOptions": true,
  "showLetterLabels": true
}

Type-Specific Properties:

Property	Type	Required	Default	Description
`options`	string[]	✅ Yes	`[]`	Array of answer choices
`optionsAndPoints`	object	✅ Yes	`{}`	Dictionary mapping options to points (>0 = correct)
`allowMultipleCorrect`	boolean	❌ No	`false`	Allow selecting multiple answers
`allowPartialCredit`	boolean	❌ No	`true`	Award partial credit for partially correct answers
`penalizeIncorrect`	boolean	❌ No	`false`	Deduct points for incorrect selections
`shuffleOptions`	boolean	❌ No	`false`	Randomize option order for each student
`showLetterLabels`	boolean	❌ No	`false`	Display A, B, C, D labels
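
Two structural checks follow from the table above and from the invalid fixtures named in §14 (no correct option; an option missing from `optionsAndPoints`). The sketch below illustrates them; `check_multiple_choice` is a hypothetical name, and the exact messages and any further checks of the reference validator are not reproduced here.

```python
def check_multiple_choice(question):
    """Return error strings for two multipleChoice structural checks."""
    errors = []
    options = question.get("options", [])
    points = question.get("optionsAndPoints", {})

    # Every option needs an entry in optionsAndPoints.
    for option in options:
        if option not in points:
            errors.append(f"option {option!r} has no optionsAndPoints entry")

    # At least one option must carry points > 0 (points > 0 marks it correct).
    if not any(value > 0 for value in points.values()):
        errors.append("no option carries points > 0")
    return errors


question = {
    "options": ["Python", "HTML", "Java", "CSS"],
    "optionsAndPoints": {"Python": 1.0, "HTML": 0.0, "Java": 1.0, "CSS": 0.0},
}
print(check_multiple_choice(question))
```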

Phase 2: Cloze Family

4. WordBankCloze

Description: Passage with gaps filled from a shared word pool (includes distractors).

Use Case: Cambridge FCE/CAE, vocabulary exercises.

Example: “Fill in the blanks using words from the word bank.”

{
  "type": "wordBankCloze",
  "globalId": "550e8400-e29b-41d4-a716-446655440004",
  "prompt": "",
  "title": "Word Bank Exercise",
  "tags": ["grammar:articles", "level:B1"],
  "difficulty": 5.0,
  "points": 5.0,
  "passage": "I saw @@@1 cat and @@@2 dog in @@@3 park yesterday. @@@4 cat was chasing @@@5 dog.",
  "wordBank": ["a", "an", "the", "some"],
  "gapAcceptedAnswers": {
    "1": ["a"],
    "2": ["a"],
    "3": ["the"],
    "4": ["The"],
    "5": ["the"]
  },
  "gapCaseSensitive": {
    "4": true
  },
  "allowWordReuse": true,
  "bankPosition": "above",
  "gapFeedback": {
    "1": "Remember: 'a' is used before consonants"
  },
  "allowPartialCredit": true
}

Type-Specific Properties:

Property	Type	Required	Default	Description
`passage`	string	✅ Yes	`""`	Text with numbered `@@@1`, `@@@2`, etc. marking gaps (1-based)
`wordBank`	string[]	✅ Yes	`[]`	Pool of words to choose from (includes distractors)
`gapAcceptedAnswers`	object	✅ Yes	`{}`	Dictionary: gap number (1-based) → array of accepted answers
`gapCaseSensitive`	object	❌ No	`null`	Dictionary: gap number → boolean (default: false)
`allowWordReuse`	boolean	❌ No	`false`	Whether the same word can be used more than once
`bankPosition`	string	❌ No	`"above"`	Word bank position: “above”, “below”, “side”
`gapFeedback`	object	❌ No	`null`	Dictionary: gap number → feedback string
`allowPartialCredit`	boolean	❌ No	`true`	Award partial credit for some correct answers
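
A gap-count consistency check (compare the `@@@N` markers in `passage` with the keys of `gapAcceptedAnswers`) can be sketched as below. What exactly the reference validator compares for its gap-count fixture is not stated here, so treat `check_gap_keys` as a hypothetical illustration.

```python
import re

GAP_MARKER = re.compile(r"@@@(\d+)")


def check_gap_keys(question):
    """Return errors where passage gap markers and answer keys disagree."""
    marked = set(GAP_MARKER.findall(question.get("passage", "")))
    keyed = set(question.get("gapAcceptedAnswers", {}))
    errors = []
    for gap in sorted(marked - keyed, key=int):
        errors.append(f"gap {gap} is marked in passage but has no accepted answers")
    for gap in sorted(keyed - marked):
        errors.append(f"gapAcceptedAnswers key {gap!r} matches no gap in passage")
    return errors


question = {
    "passage": "I saw @@@1 cat and @@@2 dog in @@@3 park yesterday.",
    "gapAcceptedAnswers": {"1": ["a"], "2": ["a"]},
}
print(check_gap_keys(question))
```

Gap keys are JSON object keys, so they are compared as strings against the digits captured from the passage.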

5. MultiGapCloze

Description: Passage with multiple gaps, each accepting free text with multiple valid answers.

Use Case: Cambridge FCE/CAE Reading Part 2 (Open Cloze).

Example: “Fill in the blanks (no word bank provided).”

{
  "type": "multiGapCloze",
  "globalId": "550e8400-e29b-41d4-a716-446655440005",
  "prompt": "",
  "title": "Open Cloze Exercise",
  "tags": ["grammar:prepositions", "exam:fce", "level:B2"],
  "difficulty": 6.0,
  "points": 8.0,
  "passage": "She walked @@@1 the park and sat @@@2 a bench @@@3 the lake.",
  "gapAcceptedAnswers": {
    "1": ["through", "in", "into"],
    "2": ["on"],
    "3": ["by", "near", "beside"]
  },
  "gapCaseSensitive": {
    "1": false,
    "2": false,
    "3": false
  },
  "gapFeedback": {
    "2": "We use 'on' with bench"
  },
  "allowPartialCredit": true
}

Type-Specific Properties:

Property	Type	Required	Default	Description
`passage`	string	✅ Yes	`""`	Text with numbered `@@@1`, `@@@2`, etc. marking gaps (1-based)
`gapAcceptedAnswers`	object	✅ Yes	`{}`	Dictionary: gap number (1-based) → array of accepted answers
`gapCaseSensitive`	object	❌ No	`null`	Dictionary: gap number → boolean (default: false)
`gapFeedback`	object	❌ No	`null`	Dictionary: gap number → feedback string
`allowPartialCredit`	boolean	❌ No	`true`	Award partial credit for some correct answers
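
To make the interaction between `gapAcceptedAnswers`, `gapCaseSensitive`, and `allowPartialCredit` concrete, here is a hedged scoring sketch. The function name `score_gaps`, the whitespace trimming, and the proportional split of `points` across gaps are assumptions; this reference only says that partial credit is awarded for some correct answers and does not fix a formula.

```python
def score_gaps(question: dict, responses: dict[str, str]) -> float:
    """Score a cloze question; responses maps gap number (as a string) to text."""
    accepted = question["gapAcceptedAnswers"]
    case_flags = question.get("gapCaseSensitive") or {}  # null means all False
    correct = 0
    for gap, answers in accepted.items():
        given = responses.get(gap, "").strip()  # trimming is an assumption
        if case_flags.get(gap, False):
            ok = given in answers
        else:
            ok = given.casefold() in {a.casefold() for a in answers}
        correct += ok
    if question.get("allowPartialCredit", True):
        # Assumption: each gap carries an equal share of the points.
        return question["points"] * correct / len(accepted)
    return question["points"] if correct == len(accepted) else 0.0
```

With the example above (8 points, three gaps), a learner who answers "Through", "on", and "under" gets two gaps right and would earn two thirds of the points under these assumptions.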

6. MultipleChoiceCloze

Description: Passage with multiple gaps, each offering 3-4 discrete options (dropdown).

Use Case: Cambridge FCE/CAE Reading Part 1.

Example: “Choose the correct word for each gap from the dropdown.”

{
  "type": "multipleChoiceCloze",
  "globalId": "550e8400-e29b-41d4-a716-446655440006",
  "prompt": "",
  "title": "Multiple Choice Cloze",
  "tags": ["vocabulary", "exam:fce", "level:B2"],
  "difficulty": 7.0,
  "points": 6.0,
  "passage": "The weather was @@@1 cold that we decided to stay indoors. We @@@2 a movie instead.",
  "gapOptions": {
    "1": ["so", "such", "very", "too"],
    "2": ["watched", "saw", "looked", "viewed"]
  },
  "correctAnswers": {
    "1": 0,
    "2": 0
  },
  "gapOptionFeedback": {
    "1": {
      "0": "Correct! 'so' is used before adjectives",
      "1": "'such' is used before nouns",
      "2": "'very' doesn't fit with 'that'",
      "3": "'too' suggests excess"
    }
  },
  "allowPartialCredit": true,
  "shuffleOptions": false
}

Type-Specific Properties:

Property	Type	Required	Default	Description
`passage`	string	✅ Yes	`""`	Text with numbered `@@@1`, `@@@2`, etc. marking gaps (1-based)
`gapOptions`	object	✅ Yes	`{}`	Dictionary: gap number (1-based) → array of options
`correctAnswers`	object	✅ Yes	`{}`	Dictionary: gap number (1-based) → correct option index (0-based)
`gapOptionFeedback`	object	❌ No	`null`	Dictionary: gap number → option index → feedback
`allowPartialCredit`	boolean	❌ No	`true`	Award partial credit for some correct answers
`shuffleOptions`	boolean	❌ No	`false`	Randomize option order within each gap
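
Because `correctAnswers` holds 0-based option indices while `gapOptionFeedback` is keyed by those indices as strings, a consumer has to convert between the two when looking up feedback. A minimal sketch, assuming a hypothetical `grade_mc_cloze` helper and a `selections` mapping of gap number to the chosen option index in authored (unshuffled) order:

```python
def grade_mc_cloze(question: dict, selections: dict[str, int]) -> dict[str, dict]:
    """Map each gap number to {'correct': bool, 'feedback': str or None}."""
    feedback = question.get("gapOptionFeedback") or {}
    result = {}
    for gap, correct_index in question["correctAnswers"].items():
        chosen = selections.get(gap)  # None when the gap was left unanswered
        result[gap] = {
            "correct": chosen == correct_index,
            # JSON object keys are strings, so the index is looked up as "0", "1", ...
            "feedback": feedback.get(gap, {}).get(str(chosen)),
        }
    return result
```

For the example above, selecting option 1 ("such") in gap 1 would be marked incorrect and return that option's authored feedback; gap 2 has no feedback entries, so its feedback is `None`.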

Phase 3: Text Entry

7. ShortAnswer

Description: Free text response with multiple acceptable answers and case sensitivity options.

Use Case: Short answer questions, name/term identification.

Example: “What is the largest planet in our solar system?”

{
  "type": "shortAnswer",
  "globalId": "550e8400-e29b-41d4-a716-446655440007",
  "prompt": "What is the largest planet in our solar system?",
  "title": "Largest Planet",
  "tags": ["science:astronomy", "stage:lower-secondary"],
  "difficulty": 2.0,
  "points": 1.0,
  "acceptedAnswers": ["Jupiter"],
  "caseSensitive": false
}

Type-Specific Properties:

Property	Type	Required	Default	Description
`acceptedAnswers`	string[]	✅ Yes	`[]`	All acceptable answers. The first entry is treated as the canonical form shown in solutions and feedback.
`caseSensitive`	boolean	❌ No	`false`	Whether answer matching is case-sensitive
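
The two properties above are enough to sketch answer checking together with the canonical form used in solutions. The helper name `check_short_answer` and the trimming of surrounding whitespace are assumptions for illustration only.

```python
def check_short_answer(question: dict, response: str) -> tuple[bool, str]:
    """Return (is_correct, canonical_answer) for a shortAnswer question."""
    accepted = question["acceptedAnswers"]
    canonical = accepted[0]  # the first entry is the canonical form
    given = response.strip()  # trimming is an assumption
    if question.get("caseSensitive", False):
        ok = given in accepted
    else:
        ok = given.casefold() in {a.casefold() for a in accepted}
    return ok, canonical
```

A consumer could show the second element of the result ("Jupiter" in the example above) in the solution view regardless of which accepted spelling the learner typed.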

8. Essay

Description: Long-form text response with optional word limits and grading rubric.

Use Case: Essay questions, extended writing tasks.

Example: “Write a 250-word essay about climate change.”

{
  "type": "essay",
  "globalId": "550e8400-e29b-41d4-a716-446655440008",
  "prompt": "Write an essay discussing the impact of climate change on global ecosystems.",
  "title": "Climate Change Essay",
  "tags": ["writing", "exam:ielts", "level:C1"],
  "difficulty": 8.0,
  "points": 20.0,
  "expectedAnswer": "Sample model answer...",
  "expectedLines": 15,
  "minWords": 200,
  "maxWords": 300,
  "rubricText": "## Grading Criteria\n- Task Response (25%)\n- Coherence & Cohesion (25%)\n- Lexical Resource (25%)\n- Grammatical Range (25%)"
}

Type-Specific Properties:

Property	Type	Required	Default	Description
`expectedAnswer`	string	❌ No	`""`	Model answer / sample response
`expectedLines`	integer	❌ No	`0`	Suggested number of lines in text area
`minWords`	integer	❌ No	`0`	Minimum word count (0 = no limit)
`maxWords`	integer	❌ No	`0`	Maximum word count (0 = no limit)
`rubricText`	string	❌ No	`null`	Markdown-formatted grading rubric
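
The "0 = no limit" convention for `minWords` and `maxWords` is easy to get wrong in a consumer, since a naive comparison against 0 would reject every response as too long. A small sketch, assuming whitespace-based word counting (this reference does not define how words are counted) and a hypothetical `word_limit_status` helper:

```python
def word_limit_status(question: dict, response: str) -> str:
    """Return 'too_short', 'too_long', or 'ok' for an essay response."""
    count = len(response.split())  # assumption: words are whitespace-separated
    min_words = question.get("minWords", 0)
    max_words = question.get("maxWords", 0)
    if min_words and count < min_words:  # 0 disables the lower bound
        return "too_short"
    if max_words and count > max_words:  # 0 disables the upper bound
        return "too_long"
    return "ok"
```

With the example above (`minWords: 200`, `maxWords: 300`), a 150-word response would be reported as too short and a 250-word response as within limits.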

Phase 4: Structured Tasks (Implemented)

9. Matching

Description: Pair each item with its single correct match (1:1), or classify items into categories (many-to-one). The shape branches on an explicit matchingMode discriminator: "pairs" for 1:1 matching, where each item has one correct match, and "classification" for many-to-one sorting, where each item belongs to one category and several items may share a category. distractors carries decoys in either mode (extra match values in pairs mode; extra category labels in classification mode).

Use Cases:

pairs — Vocabulary ↔ definition, country ↔ capital, author ↔ work, thinker ↔ idea, cause ↔ effect. Any 1:1 association where the pedagogical meaning lives in the pairing.
classification — Time expressions ↔ tense, foods ↔ food group, animals ↔ habitat, sentences ↔ register, examples ↔ argument-role. Any sort where multiple items share each category.

Common properties (both modes)

Property	Type	Required	Description
`matchingMode`	string (`"pairs"` \| `"classification"`)	✅ Yes	Selects the sub-shape. No default — the schema can’t validate the shape without it.
`distractors`	string[]	❌ No	Pairs mode: extra match values with no correct item. Classification mode: extra category labels that don’t own any item. Default `[]`.
`allowPartialCredit`	boolean	❌ No	If `true` (default), score per correct row; if `false`, all-or-nothing.

`pairs` mode — properties

Property	Type	Required	Description
`pairs`	object[]	✅ Yes	Each entry is one item-and-match row: `{ "item": string, "match": string }`. Both fields required, `minLength: 1`. Minimum 2 pairs.

`classification` mode — properties

Property	Type	Required	Description
`categories`	object[]	✅ Yes	Each entry is one category: `{ "label": string, "items": string[] }`. `label` required, `minLength: 1`. `items` required, `minItems: 1` (an empty category is meaningless — list it as a `distractors[]` entry instead). Minimum 2 categories. Consumers MUST randomize the row order at render time per NORMATIVE §5.6 — source order is grouped by category and would directly expose the answer.

A document that mixes shapes (both pairs and categories present, or matchingMode omitted) fails validation.
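
The shape rules above are enforced by the published JSON Schemas; the sketch below only mirrors them in plain Python to show how the `matchingMode` discriminator selects the sub-shape. The function name `matching_shape_errors` and the error wording are illustrative assumptions, and the `minLength` checks on individual strings are left out for brevity.

```python
def matching_shape_errors(question: dict) -> list[str]:
    """Return shape problems for a matching question; empty means well-formed."""
    errors = []
    mode = question.get("matchingMode")
    if mode not in ("pairs", "classification"):
        errors.append("matchingMode must be 'pairs' or 'classification'")
    if "pairs" in question and "categories" in question:
        errors.append("pairs and categories must not both be present")
    if mode == "pairs":
        if len(question.get("pairs", [])) < 2:
            errors.append("pairs mode needs at least 2 pairs")
    elif mode == "classification":
        categories = question.get("categories", [])
        if len(categories) < 2:
            errors.append("classification mode needs at least 2 categories")
        if any(not category.get("items") for category in categories):
            errors.append("every category needs at least one item")
    return errors
```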

Example — `pairs` mode

{
  "type": "matching",
  "globalId": "550e8400-e29b-41d4-a716-446655441502",
  "title": "Roots of Democracy — Match the Thinker",
  "tags": ["politics:enlightenment", "philosophy:political"],
  "points": 8.0,
  "difficulty": 5.0,
  "prompt": "",
  "matchingMode": "pairs",
  "pairs": [
    { "item": "John Locke",         "match": "Government derives its authority from the consent of the governed." },
    { "item": "Jean-Jacques Rousseau", "match": "Citizens form a 'social contract' that creates the legitimate state." },
    { "item": "Baron de Montesquieu", "match": "Power should be divided across separate branches of government." },
    { "item": "John Stuart Mill",   "match": "Liberty is the freedom to act, limited only by harm to others." }
  ],
  "distractors": [
    "The state should own the means of production.",
    "Tradition is the safest guide to political reform."
  ]
}

See examples/15-matching.json for the full canonical pairs example.

Example — `classification` mode

{
  "type": "matching",
  "globalId": "550e8400-e29b-41d4-a716-446655441550",
  "title": "Time Expressions — Classify by Tense",
  "tags": ["grammar:tenses:past-simple", "grammar:tenses:present-perfect", "level:B1"],
  "points": 6.0,
  "difficulty": 4.0,
  "prompt": "",
  "matchingMode": "classification",
  "categories": [
    { "label": "past simple",     "items": ["a year ago", "yesterday", "in May 2019"] },
    { "label": "present perfect", "items": ["all my life", "never", "since 2020"] }
  ],
  "distractors": ["future continuous"]
}

See examples/15b-matching-classification.json for the full canonical classification example.

Renderer expectation

In pairs mode, consumers MAY render two columns with drag-and-drop pairing (or a per-row dropdown of choosable matches). In classification mode, consumers MAY render the items as draggable chips and the category labels (plus distractors) as drop zones. Both presentations are consumer-defined; the wire format describes the structural relationship and hints at the affordance.

Scoring

Per-row scoring. In pairs mode, each row is one item↔match comparison. In classification mode, each item is compared against its category’s label (the row is correct if the learner placed the item under the correct label). allowPartialCredit: true (default) awards partial credit per correct row; false requires every row correct for any credit.
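
The per-row rule can be sketched by flattening both modes into one answer key that maps each item to its expected match or category label. This is an illustration under stated assumptions: `score_matching` is a hypothetical name, item strings are assumed unique within a question, and each row is given an equal share of `points` (this reference says credit is per correct row but does not spell out the weighting).

```python
def score_matching(question: dict, answers: dict[str, str]) -> float:
    """answers maps each item to the match or category label the learner chose."""
    if question["matchingMode"] == "pairs":
        key = {pair["item"]: pair["match"] for pair in question["pairs"]}
    else:
        key = {item: category["label"]
               for category in question["categories"]
               for item in category["items"]}
    correct = sum(1 for item, expected in key.items()
                  if answers.get(item) == expected)
    if question.get("allowPartialCredit", True):
        # Assumption: every row is worth the same fraction of the points.
        return question["points"] * correct / len(key)
    return question["points"] if correct == len(key) else 0.0
```

For the classification example above (6 points, six items), a learner who sorts five items correctly and drops "never" onto the "future continuous" distractor would earn 5 points under these assumptions, or 0 with `allowPartialCredit: false`.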

10. Ordering

Description: Sequence items correctly. Students arrange shuffled tiles into the teacher-defined correct order.

Use Case: Sentence word-order unscrambling, ordering process steps in a paragraph, ordering paragraphs of an essay, chronological ordering — any task where pedagogical meaning lives in the sequence.

Property	Type	Required	Description
`sourceText`	string	✅ Yes	The original sentence or passage shown for context (the correct ordering when items are joined)
`items`	string[]	✅ Yes	Tiles in correct order. `items[i]` is the correct tile at position `i`
`distractors`	string[]	❌ No	Extra tiles that do not belong in the sequence (mixed into the tile bank as decoys)
`scoringMode`	string	❌ No	Scoring policy hint: `"strict"` = all-or-nothing exact match; `"kendall"` = partial credit via Kendall tau distance. When omitted, the recommended default is `"strict"` for `orderingUnit: "word"` and `"kendall"` for `"sentence"` / `"paragraph"`. See Scoring below
`orderingUnit`	string	❌ No	Display granularity hint: `"word"` (default), `"sentence"`, or `"paragraph"`. See variants below

Display variants (`orderingUnit`)

orderingUnit is an advisory hint — the same ordering discriminator covers all three variants and consumers MAY render uniformly. The hint lets a consumer choose layout that fits the chunk size:

`orderingUnit`	Typical chunk size	Typical layout	Example use
`"word"`	one word or short phrase	inline draggable tokens on one line	Unscramble a sentence — see `16-ordering.json`
`"sentence"`	one sentence (10–30 words)	stacked card blocks, vertical	Order steps of a process, narrative beats — see `16b-sentence-ordering.json`
`"paragraph"`	one paragraph (50–100 words)	stacked block cards, vertical, larger	Order paragraphs of an essay — see `16c-paragraph-ordering.json`

Word-level example

{
  "type": "ordering",
  "globalId": "550e8400-e29b-41d4-a716-446655440010",
  "prompt": "",
  "title": "Word Order",
  "sourceText": "She went shopping yesterday.",
  "items": ["She", "went", "shopping", "yesterday"],
  "distractors": ["quickly"],
  "points": 1.0,
  "tags": ["grammar", "level:A2"]
}

(orderingUnit is omitted — the default "word" applies. scoringMode is also omitted; consumers default to "strict" for word-level — see Scoring below.)

Sentence-level example

{
  "type": "ordering",
  "globalId": "550e8400-e29b-41d4-a716-446655441620",
  "prompt": "",
  "title": "Cellular Respiration — Order the Stages",
  "sourceText": "Glucose enters the cell and is split into two pyruvate molecules… (full passage)",
  "items": [
    "Glucose enters the cell and is split into two pyruvate molecules in the cytoplasm during glycolysis…",
    "Each pyruvate is then transported into the mitochondrion and converted to acetyl-CoA…",
    "…"
  ],
  "scoringMode": "kendall",
  "orderingUnit": "sentence",
  "points": 4.0,
  "tags": ["biology:cell-biology:respiration"]
}

See 16b-sentence-ordering.json for the full example.

Paragraph-level example

Same shape, with orderingUnit: "paragraph" and longer items. See 16c-paragraph-ordering.json — a four-paragraph essay-structure reorder.

Scoring

Two modes selected by scoringMode:

"strict": all items must be in their correct positions AND no distractors may be placed in the answer area. Any deviation scores 0 points.
"kendall": partial credit by Kendall tau distance over the learner’s permutation. Each discordant pair (two items whose relative order in the learner’s answer is the reverse of their correct order) reduces the score. With N items and k discordant pairs, the score is points × (1 − k / (N × (N−1) / 2)). Useful when the chunks have a single defensible order but partial credit reflects partial understanding (e.g., process narratives, essay structure).

When scoringMode is omitted, the recommended default is "strict" for orderingUnit: "word" (where pairwise inversions don’t have pedagogical meaning — a sentence either reads correctly or it doesn’t) and "kendall" for orderingUnit: "sentence" and "paragraph" (where partial credit reflects partial understanding of the discourse structure). Consumers that don’t support partial credit MAY collapse "kendall" to "strict".
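
The two scoring modes can be sketched as follows. This is an illustration rather than a reference implementation: `score_ordering` is a hypothetical name, and the "kendall" branch assumes the learner's answer is a permutation of `items` with unique tiles and no distractors placed, since this section does not say how distractors interact with Kendall scoring.

```python
from itertools import combinations

def score_ordering(items: list[str], answer: list[str], points: float,
                   mode: str = "strict") -> float:
    """Score a learner's ordering against the authored `items` order."""
    if mode == "strict" or len(items) < 2:
        # Exact match only; a placed distractor also makes the lists differ.
        return points if answer == items else 0.0
    # "kendall": count pairs whose relative order is reversed in the answer.
    position = {tile: index for index, tile in enumerate(answer)}
    discordant = sum(
        1 for first, second in combinations(items, 2)  # first precedes second
        if position[first] > position[second]
    )
    total_pairs = len(items) * (len(items) - 1) // 2
    return points * (1 - discordant / total_pairs)
```

As a worked case: with four items worth 4 points, swapping the first two tiles gives k = 1 discordant pair out of 6, so the Kendall score is 4 × (1 − 1/6) ≈ 3.33, while the same answer scores 0 in strict mode. A fully reversed answer has k = 6 and scores 0 in both modes.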

11. Placement

Description: Place items into anchored gaps in a structured passage. Each placement entry pairs a 1-based gap-marker number (@@@N in the passage) with the item that belongs in that gap. Distractors are extra items with no correct gap; passage gap-markers without a corresponding placement entry are decoy positions — a TOEFL-style variant where one item must be placed into one of several candidate positions. The shape mirrors the matching-redesign principle: for structured tasks where the relationship between items and slots is the data, the relationship is encoded explicitly per row.

Use Cases:

Cambridge B2 First Part 6 — sentence-level “missing sentences” tasks where 6 short sentences must be placed back into a 6-gap article. Use placementUnit: "sentence". See examples/17a-sentence-placement.json.
Cambridge C1 Advanced Part 7 — paragraph-level reordering of a 6-gap essay. Use placementUnit: "paragraph". See examples/17b-paragraph-placement.json.
IELTS Matching Headings — short headings labeled to sections of a passage. Use placementUnit: "sectionLabel". The same shape covers analytical meta-labels (e.g., labeling each paragraph ‘thesis’, ‘counter-argument’, ‘evidence’) — both real headings and analytical labels share the wire format. See examples/17c-section-label-placement.json.
TOEFL Sentence Insertion — a single missing sentence with multiple candidate positions; only one is correct. Author 4 @@@N markers in passage and a single placements[] entry whose gap is the correct position. The unanswered markers are decoy gaps. Use placementUnit: "sentence" and allowPartialCredit: false (single-gap → all-or-nothing). See examples/17d-toefl-insertion-placement.json.

Word-level placement is covered by wordBankCloze — placement does not include "word" in the placementUnit enum.

Symbolic-type prompt convention. On the eight symbolic question types (the gap-fill family, sentence transformation, matching, ordering, placement) the structured fields carry the question’s meaning, so prompt is non-authoritative. It remains required but MAY be empty (""); equally, a producer MAY populate it with a brief human-readable summary derived from the question’s content — a readable preview, not authored framing. See examples/01b-simple-gap-fill-readable-prompt.json, which shows "I saw ___ elephant at the zoo yesterday." as one valid form, with "" (as in examples/01-simple-gap-fill.json) equally valid. Consumers MUST NOT rely on a symbolic prompt’s content for scoring, rendering, equality, or deduplication.

Framing instructions for the exercise (e.g., “Place these sentences in the gaps where they best fit. Some markers are decoys.”) belong on the parent exerciseItem.instructions (or quizItem.instructions) field, not duplicated into each question’s prompt. Consumers typically render the parent item’s instructions once at the top of the exercise — above all its questions — so per-question framing would be redundant. This applies symmetrically to all eight symbolic question types.

Properties

Property	Type	Required	Description
`placementUnit`	string	✅ Yes	Display granularity hint: `"sentence"`, `"paragraph"`, or `"sectionLabel"`. Default `"sentence"`. Each value carries a different marker-placement convention — see the table below.
`passage`	string	✅ Yes	Structured text with `@@@1`, `@@@2`, … gap markers (1-based). Must contain at least one marker (the schema’s `pattern` keyword enforces this). Plain text only — no HTML.
`placements`	object[]	✅ Yes	Each entry: `{ "gap": int, "item": string }`. Both required; `gap` ≥ 1; `item.minLength` 1. Order is author-free. `minItems: 1` — permits the TOEFL variant (1 item, multiple candidate gaps).
`distractors`	string[]	❌ No	Extra items with no correct gap. Default `[]`. Distinct from decoy gaps (extra `@@@N` markers without a corresponding `placements[].gap` entry).
`allowPartialCredit`	boolean	❌ No	Award partial credit per correct gap instead of all-or-nothing. Default `true`.

Display variants (`placementUnit`)

placementUnit is an advisory hint and a marker-placement convention. The same placement discriminator covers all three variants and consumers MAY render uniformly.

`placementUnit`	Typical render	Marker-placement convention	Example use
`"sentence"`	Inline drop slot at each marker	Marker appears mid-prose; surrounding whitespace and punctuation are the author’s choice.	Cambridge B2 missing-sentences (17a) and TOEFL Sentence Insertion (17d).
`"paragraph"`	Block-level drop slot between paragraphs	Marker is the entire content of its paragraph — surrounded by `\n\n` or at start/end of passage.	Cambridge C1 paragraph-reordering (17b).
`"sectionLabel"`	Label slot above / leading edge of a paragraph	Marker at the start of the section it labels (first non-whitespace token of a paragraph), followed by a space and then the section’s content.	IELTS Matching Headings and analytical meta-labels (17c).

Marker-placement conventions

passage is plain text. Consumers detect paragraph boundaries by \n\n (double newline). The conventions above are documented in the schema’s placementUnit description and given side-by-side here:

sentence:
  "The experiment began with a simple question. @@@1 The results surprised the team."

paragraph:
  "Coined money had served European trade for centuries.\n\n@@@1\n\nWhat converted private bills into public currency was the cost of seventeenth-century war."

sectionLabel:
  "@@@1 Paper currency did not spread simply because it was convenient.\n\n@@@2 Before paper notes, European trade depended mainly on metal coin."

Decoy gaps (TOEFL Sentence Insertion variant)

A passage may contain more @@@N markers than it has placements[] entries: the unanswered markers are decoy gaps, candidate positions where the missing item could plausibly fit but does not. This natively expresses TOEFL Sentence Insertion: 4 candidate positions, 1 correct placement. See examples/17d-toefl-insertion-placement.json.

This is distinct from decoy items (distractors[]) — extra content the learner has but should not place anywhere.

Validator policy

Hard errors (validation fails):

Every placements[].gap MUST reference a @@@N marker present in passage. Orphan placement entries fail.
No duplicate gap values within placements[].
gap must be a positive integer (≥ 1).
passage MUST contain at least one @@@N marker (enforced by the schema’s pattern keyword).

Soft warnings (NOTE-tier, not blocking):

@@@N markers SHOULD be sequential starting at 1 (1, 2, 3, …). Inherits the wordBankCloze convention.
Per-placementUnit marker-placement convention violations: paragraph markers that sit mid-prose alongside other text rather than alone on a paragraph; sectionLabel markers that don’t appear at the start of a paragraph. sentence mode has no positional rule.
A @@@N marker without a corresponding placements[].gap entry is not a warning — it is a valid decoy gap. The validator distinguishes intentional decoys from authoring errors only by inference.
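
The hard errors and the sequential-marker warning listed above can be sketched as a single pass over the question. This is a minimal illustration, not the project's validator: `validate_placement` and the message wording are assumptions, and the per-placementUnit positional warnings are omitted.

```python
import re

MARKER = re.compile(r"@@@(\d+)")

def validate_placement(question: dict) -> tuple[list[str], list[str]]:
    """Return (errors, warnings) for a placement question."""
    errors, warnings = [], []
    markers = {int(n) for n in MARKER.findall(question["passage"])}
    if not markers:
        errors.append("passage has no @@@N marker")
    seen = set()
    for entry in question["placements"]:
        gap = entry["gap"]
        if isinstance(gap, bool) or not isinstance(gap, int) or gap < 1:
            errors.append(f"gap {gap!r} is not a positive integer")
            continue
        if gap in seen:
            errors.append(f"duplicate gap {gap}")
        seen.add(gap)
        if gap not in markers:
            errors.append(f"gap {gap} has no @@@{gap} marker in passage")
    # Markers with no placement entry are valid decoy gaps, so they are not flagged.
    if sorted(markers) != list(range(1, len(markers) + 1)):
        warnings.append("markers are not sequential starting at 1")
    return errors, warnings
```

A TOEFL-style question with four markers and one placement passes with no errors and no warnings, because the three unanswered markers are treated as decoys.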

Scoring

Per-gap scoring against the authored placements[]. With allowPartialCredit: true (default), each gap whose chosen item matches the authored item is worth points / placements.length; remaining gaps contribute zero. With allowPartialCredit: false, every gap must be correct for any credit. Decoy gaps (markers without a placements[] entry) are unscored — placing an item into one is a “wrong gap” event whose treatment is consumer-defined; placing nothing into one is the expected case.
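
A sketch of the per-gap rule, where each correct gap is worth points / placements.length. The name `score_placement` and the `chosen` mapping (gap number to the item the learner dropped there) are assumptions; items dropped into decoy gaps are simply ignored here, which is one possible consumer-defined treatment rather than a required one.

```python
def score_placement(question: dict, chosen: dict[int, str]) -> float:
    """Score a placement question against its authored placements[]."""
    placements = question["placements"]
    correct = sum(1 for entry in placements
                  if chosen.get(entry["gap"]) == entry["item"])
    if question.get("allowPartialCredit", True):
        return question["points"] * correct / len(placements)
    # All-or-nothing: every authored gap must hold its authored item.
    return question["points"] if correct == len(placements) else 0.0
```

For a 6-point question with three authored placements, one correct gap earns 2 points with partial credit and 0 without it.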

12. SentenceTransformation

Description: Cambridge exam-style controlled paraphrase tasks.

Use Case: Cambridge FCE/CAE Use of English Part 4 (Key Word Transformation).

Example: “Transform the sentence using the given keyword.”

{
  "type": "sentenceTransformation",
  "globalId": "550e8400-e29b-41d4-a716-446655440018",
  "prompt": "",
  "title": "Key Word Transformation",
  "tags": ["grammar", "exam:fce", "level:B2"],
  "difficulty": 8.0,
  "points": 2.0,
  "promptSentence": "I haven't seen John for three weeks.",
  "keyword": "LAST",
  "targetSentence": "The @@@ was three weeks ago.",
  "allOrNothing": false,
  "acceptedChunks": {
    "1": ["last time"],
    "2": ["I saw John"]
  },
  "chunkFeedback": {
    "1": "Use 'LAST' + 'time'. The fixed phrase is 'the last time'.",
    "2": "Past simple 'saw' is needed here, not present perfect 'have seen'."
  }
}

Type-Specific Properties:

Property	Type	Required	Default	Description
`promptSentence`	string	✅ Yes	`""`	Original sentence to transform
`keyword`	string	✅ Yes	`""`	Word that MUST be used (uppercase)
`targetSentence`	string	✅ Yes	`""`	Template with `@@@` for answer chunks
`allOrNothing`	boolean	❌ No	`false`	All chunks correct or zero points
`acceptedChunks`	object	✅ Yes	`{}`	Dictionary: chunk index → array of accepted answers
`chunkCaseSensitive`	object	❌ No	`null`	Dictionary: chunk index → boolean (default: false)
`chunkFeedback`	object	❌ No	`null`	Dictionary: chunk index → feedback string

Reserved Types

The seven question types in this section are reserved in the question-base.schema.json discriminator enum but do not yet have per-type schemas; full implementation is targeted for 2027. Per NORMATIVE.md §6, conforming consumers MUST preserve them in full across read/write cycles (every field, value, and nested structure — per §6.2/§6.4), MUST treat earned points as 0, and SHOULD render a non-interactive placeholder. Producers SHOULD NOT emit reserved types in cross-implementation distribution.

13. Association

Description: Group items into categories (Categorization).

Use Case: Classify items, group by category.

Status: 🔒 Reserved (per NORMATIVE.md §6) — discriminator name is reserved in 1.0; per-type schema is targeted for 2027. Producers SHOULD NOT emit reserved types in cross-implementation distribution; consumers MUST preserve them in full across read/write cycles (every field, value, and nested structure — per §6.2/§6.4), MUST treat earned points as 0, and SHOULD render a non-interactive placeholder. The example below shows only the question-base fields any reserved-type instance must carry.

{
  "type": "association",
  "globalId": "550e8400-e29b-41d4-a716-446655440011",
  "prompt": "Group these words by part of speech",
  "title": "Parts of Speech",
  "tags": ["grammar", "level:B1"],
  "difficulty": 5.0,
  "points": 4.0
}
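
The consumer obligations for reserved types (preserve every field, treat earned points as 0, render a placeholder) can be sketched in a few lines. Everything here is illustrative: the helper names are hypothetical, the placeholder text is invented, and `RESERVED_TYPES` lists only `association` because it is the reserved discriminator shown so far; the remaining names are deliberately not guessed.

```python
import copy

# Assumption: a consumer keeps its own set of reserved discriminators.
RESERVED_TYPES = {"association"}

def earned_points(question: dict, raw_score: float) -> float:
    """Reserved types always earn 0, whatever a scorer might compute."""
    return 0.0 if question["type"] in RESERVED_TYPES else raw_score

def render_label(question: dict) -> str:
    """Return a non-interactive placeholder for reserved types."""
    if question["type"] in RESERVED_TYPES:
        return f"[{question['title']}: question type not yet supported]"
    return question["title"]

def preserve(question: dict) -> dict:
    """Copy a question for writing back out with every field and nested value intact."""
    return copy.deepcopy(question)
```

The key design point is that the preserved copy is taken from the parsed document as a whole, not rebuilt from the fields the consumer happens to understand, so unknown type-specific properties survive a read/write cycle.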