LC-JSON
An open learning-content interchange specification.
LC-JSON (Learning Content JSON) is a JSON-native format, schema set, and producer/consumer behavior contract for portable teacher-authored courses, lessons, questions, feedback, and assessment intent. A course authored in one tool can be validated, transferred, and delivered in another, with predictable behavior on both ends.
The specification is open. The schemas are public, stable, and versioned. The license is permissive (Apache 2.0). Implementers can build conforming tools without permission.
LC-JSON is a content-layer format — complementary to LMS interop standards (LTI, OneRoster, xAPI, SCORM) rather than competing with them. See the Rationale for the full landscape and what LC-JSON is not.
What you can do with it
Take your courses with you. Course content in LC-JSON is independent of the tool that authored it. Schools, publishers, and authors can move content between platforms without rewriting it.
Validate before you ship. Every LC-JSON document validates against published JSON Schemas. Authoring errors are caught before delivery, not after a learner gets stuck.
Build with confidence. Schema URLs at every published version path — lc-json.org/1.0-rc.3/ today, lc-json.org/1.0/ once 1.0 final ships, and any future minor or major release — are immutable. A document that validates today will validate forever. Forward-compatible additions land at new URL paths; existing files keep working.
Read the specification
- Specification overview — what LC-JSON looks like, with worked examples.
- NORMATIVE.md — the conformance requirements (RFC 2119 keywords, producer/consumer roles).
- Question types reference — per-type property reference for all 12 implemented question types.
- Schemas — Draft-7 JSON Schemas for every artifact and question type.
- Examples — minimal and full course examples; per-type fragments.
For implementers
- Conformance test corpus — valid and invalid cases per clause, with a machine-readable manifest. Run your validator over the corpus to verify conformance.
- Reference tools: validate_course.py (validator) and run_corpus.py (corpus harness for spec contributors and the spec repo’s CI). See tools/.
- GitHub repository — issues, discussions, releases.
Who is this for?
| If you are… | LC-JSON gives you… |
|---|---|
| A teacher or course author | Confidence that the courses you write are not locked into any single tool. |
| A school or institution | A portable, vendor-neutral format for learning content. Procurement decisions don’t lock in pedagogical content. |
| An EdTech tool builder | A clean import/export target. Conforming tools interoperate without bespoke adapters. |
| A learning-platform vendor | Reduced friction in onboarding teacher-authored content from any source. |
What’s covered in 1.0
Two artifact types sharing a common flat root format:
- Course — hierarchical: Course → Units → Lessons → Items → Questions.
- Question Set — flat list of questions for question-bank exchange and packaged delivery.
Twelve question types fully implemented and schema-validated:
simpleGapFill · trueFalseQuestion · multipleChoice · wordBankCloze · multiGapCloze · multipleChoiceCloze · shortAnswer · essay · sentenceTransformation · matching · ordering · placement
Seven additional types are reserved for a future minor version (targeted for 2027).
Five lesson item types: content, exercise, quiz, content-sequence, signpost.
License
LC-JSON is licensed under the Apache License, Version 2.0. The license includes a patent grant. Conforming implementations require no further permission.
“Lesson Commons” is a separate trademark and is not asserted over LC-JSON or its conforming implementations.
Project status
Version 1.0-rc.3 — public release candidate (2026-06-13). The wire format is stable and schema URLs at lc-json.org/1.0-rc.3/ are immutable per NORMATIVE.md §8.3 — early adopters can build against rc.3 with confidence. Each release candidate gets its own immutable URL path; the /1.0/ URL is reserved for 1.0 final (targeted 2026-06-30). rc.3 supersedes two earlier candidates — internal 1.0-rc.1 and announced 1.0-rc.2 — whose /1.0-rc.1/ and /1.0-rc.2/ schema sets stay served and frozen. rc.3 adds the localization model and an expanded conformance corpus, and removes two prototype-era sentenceTransformation fields from the schema (the change requiring a new immutable path). It is backwards-compatible — every rc.2-valid document remains valid under rc.3 — and the move to 1.0 final is planned as a pure URL rebase with no content change. Feedback is welcome through 2026-06-27; 1.0 final is planned for 2026-06-30 as that rebase, barring substantive feedback — the date announced with rc.2 on 2026-05-30 and unchanged since.
LC-JSON’s public history begins with the 1.0 release-candidate line — 1.0-rc.2 (2026-05-30) was its first publicly announced release. Internal iteration before the candidate line is not reflected in the version history.
LC-JSON is maintained under a single-maintainer steward model; see GOVERNANCE.md for the decision-making process and the criteria for transitioning to a working group.
LC-JSON Specification
Spec version: 1.0 (release candidate: rc.3) Last updated: 2026-06-13
This directory contains the LC-JSON (Learning Content JSON) specification for structured learning content, covering the complete hierarchy from Course structure down to individual Question types.
Implementing LC-JSON? See NORMATIVE.md for the conformance requirements (RFC 2119 keywords, producer/consumer roles, versioning rules, URL stability promises). This README is descriptive; NORMATIVE.md is authoritative. For terminology, see GLOSSARY.md.
Complete Coverage:
- Two artifact types (Course, QuestionSet) sharing a common flat root format
- Course Hierarchy (Course → Units → Lessons → Items)
- 5 Lesson Item Types (Content, Exercise, Quiz, ContentSequence, Signpost)
- 19 Question Types (12 fully implemented + schema-validated; 7 reserved for a future minor version)
- JSON Schemas (23) for validation — strictly enforced by the reference validator
- Minimal + detailed examples (32 files, all schema-clean)
Design Principles
LC-JSON is machine-validatable, but human-inspectable.
The documents are validated automatically against JSON Schema Draft 7, but they are also designed so that authored content remains visible in the file. A teacher, curriculum designer, or teacher-developer can recognize courses, units, lessons, items, questions, prompts, choices, answers, and feedback without proprietary tooling — opening a course .json in any text editor should be enough to inspect what the course actually contains.
Technical fields such as $schema, specVersion, and globalId exist to make documents portable across tools and stable across re-imports, but they should not bury the pedagogical content. Where this trade-off arises in spec evolution — naming, structure, ordering of fields — the spec favors the form that keeps pedagogical content recognizable.
This is a deliberate stance against formats whose meaning only emerges through tooling. It is offered without promise of zero technical fields, because portability requires some; the promise is that the pedagogical structure stays inspectable to the people who authored it.
Wire Format
LC-JSON uses a flat root with a documentType discriminator (no enclosing envelope around the document). Every conforming document carries $schema, documentType, and specVersion as root-level siblings. The course content itself is hierarchical — Course → Units → Lessons → Items → Questions — and reflects how teachers structure their material.
Two artifact types
| Artifact | documentType | Schema | Description |
|---|---|---|---|
| Course | "course" | course.schema.json | Hierarchical course (Units → Lessons → Items). The standard shape for a full course. |
| Question Set | "questionSet" | question-set.schema.json | Flat list of questions for question-bank exchange and packaged delivery — no hierarchy. |
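As a rough illustration of the table above, a consumer might dispatch on documentType to pick the root schema. This is only a sketch: the helper name `root_schema_for` and its error handling are assumptions made for this example; only the two documentType values and the schema file names come from the table.

```python
# Illustrative sketch: map a document's documentType to its root schema file.
# The mapping mirrors the artifact table; the helper itself is hypothetical.
ROOT_SCHEMAS = {
    "course": "course.schema.json",
    "questionSet": "question-set.schema.json",
}


def root_schema_for(document: dict) -> str:
    """Return the root schema file name for an LC-JSON document."""
    doc_type = document.get("documentType")
    if doc_type not in ROOT_SCHEMAS:
        # Matching is exact; "Course" or a missing value is rejected here.
        raise ValueError(f"Unsupported documentType: {doc_type!r}")
    return ROOT_SCHEMAS[doc_type]


print(root_schema_for({"documentType": "questionSet"}))
```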
Required root fields (both artifact types)
{
  "$schema": "https://lc-json.org/1.0-rc.3/<artifact>.schema.json",
  "documentType": "course", // or "questionSet"
  "specVersion": "1.0",
  "title": "...",
  ...
}
The $schema URL serves as a stable, versioned identifier and is used
by integrated development environments (IDEs such as VS Code) for schema
autocomplete. specVersion is
forward-compatible across the 1.x series — conforming consumers MUST
accept any 1.x value and reject 2.x+ cleanly.
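A minimal sketch of how a consumer might apply the root-field and specVersion rules described above. It assumes specVersion is always a "MAJOR.MINOR" string; the function names are hypothetical, and NORMATIVE.md remains the authority on what exactly must be accepted or rejected.

```python
# Illustrative sketch, not the reference validator.
REQUIRED_ROOT_FIELDS = ("$schema", "documentType", "specVersion")


def missing_root_fields(document: dict) -> list[str]:
    """List the required root-level siblings absent from the document."""
    return [name for name in REQUIRED_ROOT_FIELDS if name not in document]


def accepts_spec_version(spec_version: str) -> bool:
    """Accept any 1.x value; reject other majors and malformed values.

    Assumption: the value has the form "MAJOR.MINOR" with a numeric minor.
    """
    major, sep, minor = spec_version.partition(".")
    return major == "1" and sep == "." and minor.isascii() and minor.isdigit()


doc = {
    "$schema": "https://lc-json.org/1.0-rc.3/course.schema.json",
    "documentType": "course",
    "specVersion": "1.0",
}
print(missing_root_fields(doc), accepts_spec_version(doc["specVersion"]))
```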
Reserved question types (targeted for 2027)
Seven question types are reserved in the polymorphic discriminator set but do not yet have per-type schemas — full authoring and consumer support is targeted for 2027:
association, hotspot, graphicGapMatch, graphicAssociate,
graphicOrder, fileUpload, mediaPromptedEssay
The 12 question types with full per-type schemas — simpleGapFill,
trueFalseQuestion, multipleChoice, wordBankCloze, multiGapCloze,
multipleChoiceCloze, shortAnswer, essay, sentenceTransformation,
matching, ordering, placement — are the spec’s stable surface as of 1.0.
Consumer obligations for reserved (and unknown) types are normative under NORMATIVE.md §6: consumers MUST preserve them verbatim across read/write cycles, MUST NOT silently drop them, MUST treat their earned points as zero, and SHOULD render a non-interactive placeholder. The intent is round-trip preservation: a teacher exporting from a consumer that does not support hotspot can take the file back to a consumer that does, without losing the question. Producers SHOULD NOT emit reserved types in 1.0 documents intended for cross-implementation distribution; reserved types are tool-specific extensions until promoted.
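The consumer obligations above could be sketched roughly as follows. This is an assumption-laden illustration, not the normative behavior: the helper names, the `awarded` parameter, and the `regions` field on the sample hotspot question are all invented for the example. Only the type names, the `type` discriminator, and the `prompt` field come from this specification.

```python
import copy

# The 12 types with per-type schemas; anything else is reserved or unknown.
IMPLEMENTED_TYPES = frozenset({
    "simpleGapFill", "trueFalseQuestion", "multipleChoice", "wordBankCloze",
    "multiGapCloze", "multipleChoiceCloze", "shortAnswer", "essay",
    "sentenceTransformation", "matching", "ordering", "placement",
})


def is_supported(question: dict) -> bool:
    return question.get("type") in IMPLEMENTED_TYPES


def earned_points(question: dict, awarded: float) -> float:
    """Reserved and unknown types always earn zero points."""
    return awarded if is_supported(question) else 0.0


def render(question: dict) -> str:
    """Render the prompt, or a non-interactive placeholder if unsupported."""
    if is_supported(question):
        return question.get("prompt", "")
    qtype = question.get("type")
    return f"[Question type {qtype!r} is not supported by this tool]"


def export_questions(questions: list[dict]) -> list[dict]:
    """Write back every question unchanged, including unsupported ones."""
    return copy.deepcopy(questions)


# "regions" is a hypothetical field used only to show verbatim preservation.
hotspot = {"type": "hotspot", "prompt": "Click the capital city.",
           "regions": [{"x": 10, "y": 20}]}
print(render(hotspot), earned_points(hotspot, 5.0))
```

The point of the design is that an unsupported question passes through the tool as opaque data: it is shown as a placeholder and scored as zero, but nothing in it is rewritten or dropped.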
Discriminator casing
Conforming producers emit camelCase question discriminators
(simpleGapFill, multipleChoice, etc.). All examples in this
directory strictly validate against the schemas in their canonical
casing. Non-canonical casings are non-conforming; consumers MUST
reject them.
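One way a consumer might enforce canonical casing is an exact, case-sensitive membership test. The sketch below assumes the discriminator set is the 12 implemented plus 7 reserved names listed in this document; the function name and the "did you mean" hint are additions for illustration, not part of the specification.

```python
# Illustrative sketch: exact-match check on question discriminators.
CANONICAL_TYPES = frozenset({
    # implemented
    "simpleGapFill", "trueFalseQuestion", "multipleChoice", "wordBankCloze",
    "multiGapCloze", "multipleChoiceCloze", "shortAnswer", "essay",
    "sentenceTransformation", "matching", "ordering", "placement",
    # reserved
    "association", "hotspot", "graphicGapMatch", "graphicAssociate",
    "graphicOrder", "fileUpload", "mediaPromptedEssay",
})
_BY_LOWERCASE = {name.lower(): name for name in CANONICAL_TYPES}


def check_discriminator(value: str) -> str:
    """Return the value if canonical; otherwise raise with a hint."""
    if value in CANONICAL_TYPES:
        return value
    message = f"Unknown question type: {value}"
    suggestion = _BY_LOWERCASE.get(value.lower())
    if suggestion is not None:
        message += f" (did you mean {suggestion!r}?)"
    raise ValueError(message)


print(check_discriminator("simpleGapFill"))
```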
Directory Structure
specification/
├── README.md # This file
├── NORMATIVE.md # RFC 2119 conformance requirements (authoritative)
├── HTML_SAFETY.md # Normative HTML allowlist + sanitization profile
├── ACCESSIBILITY.md # Producer/consumer accessibility profile
├── LOCALIZATION.md # Language model: language / lang / supportLanguage; BCP 47; pronunciation
├── VALIDATION.md # Rule catalog — schema / validator / advisory tiers
├── ITEM_PATTERNS.md # Informative authoring guide
├── question-types-reference.md # Complete reference for all 19 question types
├── GLOSSARY.md # Terminology
├── schemas/ # JSON Schema validation files
│ ├── course.schema.json # Course (top level)
│ ├── question-set.schema.json # QuestionSet (flat artifact)
│ ├── unit.schema.json # Unit (within Course)
│ ├── lesson.schema.json # Lesson (within Unit)
│ ├── item-base.schema.json # Base schema for all Items
│ ├── content-item.schema.json # ContentItem type
│ ├── exercise-item.schema.json # ExerciseItem type
│ ├── quiz-item.schema.json # QuizItem type
│ ├── content-sequence-item.schema.json # ContentSequenceItem type
│ ├── signpost-item.schema.json # SignpostItem type (intro/summary navigation)
│ ├── question-base.schema.json # Base schema for all Questions
│ ├── simple-gap-fill.schema.json # SimpleGapFill validation
│ ├── true-false-question.schema.json # TrueFalseQuestion validation
│ ├── multiple-choice.schema.json # MultipleChoice validation
│ ├── word-bank-cloze.schema.json # WordBankCloze validation
│ ├── multi-gap-cloze.schema.json # MultiGapCloze validation
│ ├── multiple-choice-cloze.schema.json # MultipleChoiceCloze validation
│ ├── short-answer.schema.json # ShortAnswer validation
│ ├── essay.schema.json # Essay validation
│ ├── sentence-transformation.schema.json # SentenceTransformation validation
│ ├── matching.schema.json # Matching validation
│ ├── ordering.schema.json # Ordering validation
│ └── placement.schema.json # Placement type validation
└── examples/ # Example JSON files (32 total)
    ├── course-minimal.json # Minimal Course example
    ├── question-set-minimal.json # Minimal QuestionSet example
    ├── question-set-10-true-false.json # Richer QuestionSet showcase
    ├── unit-minimal.json # Minimal Unit example
    ├── lesson-minimal.json # Minimal Lesson example
    ├── 10-content-item.json # ContentItem with HTML
    ├── 11-exercise-item.json # ExerciseItem (graded homework example)
    ├── 12a-graded-quiz-item.json # QuizItem, isGraded:true (typical assessment)
    ├── 12b-ungraded-quiz-item.json # QuizItem, isGraded:false (diagnostic pre-test)
    ├── 13-content-sequence-item.json # ContentSequenceItem
    ├── 14-signpost-item.json # SignpostItem
    ├── 01-simple-gap-fill.json # Per-question examples (01-09)
    ├── ... # 09-sentence-transformation.json
    ├── 15-matching.json # Matching example
    ├── 16-ordering.json # Ordering example (word-level)
    ├── 16b-sentence-ordering.json # Ordering example (sentence-level — process narrative)
    ├── 16c-paragraph-ordering.json # Ordering example (paragraph-level — essay structure)
    ├── 17a-sentence-placement.json # Placement example (sentence-mode — Cambridge B2 First Part 6 style)
    ├── 17b-paragraph-placement.json # Placement example (paragraph-mode — IELTS Reading Matching Information style)
    ├── 17c-section-label-placement.json # Placement example (sectionLabel-mode — IELTS Matching Headings)
    ├── 17d-toefl-insertion-placement.json # Placement example (TOEFL Sentence Insertion — decoy-gaps variant)
    └── sample-course-with-questions.json # Full course example
Total: 23 schemas (4 [course, questionSet, unit, lesson] + 1 item-base + 5 item types + 1 question-base + 12 question types).
Course Hierarchy
A Course document has the following nested structure:
Course (top level)
└─ Units[] (array of units)
   └─ Lessons[] (array of lessons)
      └─ Items[] (array of items - 5 types)
         ├─ ContentItem (reading/content pages)
         ├─ ExerciseItem (questions; structural form, grading via isGraded)
         ├─ QuizItem (questions; structural form, grading via isGraded)
         ├─ ContentSequenceItem (grouped content)
         └─ SignpostItem (intro/summary with objectives)
            └─ Questions[] (only for ExerciseItem and QuizItem)
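The hierarchy above can be walked with a few nested loops. The sketch below assumes the arrays are named `units`, `lessons`, `items`, and `questions`; `items` and `questions` are named in this document, while `units` and `lessons` are inferred from the hierarchy and should be checked against the schemas.

```python
from typing import Iterator


def iter_questions(course: dict) -> Iterator[dict]:
    """Yield every question in a course, in document order."""
    for unit in course.get("units", []):          # assumed key name
        for lesson in unit.get("lessons", []):    # assumed key name
            for item in lesson.get("items", []):
                # Only exercise and quiz items carry a questions array.
                if item.get("type") in ("exercise", "quiz"):
                    yield from item.get("questions", [])


course = {
    "documentType": "course",
    "units": [{
        "lessons": [{
            "items": [
                {"type": "content", "title": "Reading"},
                {"type": "exercise", "title": "Practice",
                 "questions": [{"type": "shortAnswer"}, {"type": "essay"}]},
                {"type": "quiz", "title": "Check",
                 "questions": [{"type": "multipleChoice"}]},
            ],
        }],
    }],
}
print([q["type"] for q in iter_questions(course)])
```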
Minimal Examples for Quick Reference:
- course-minimal.json - Bare minimum Course structure
- unit-minimal.json - Bare minimum Unit
- lesson-minimal.json - Bare minimum Lesson
Lesson Item Types
Every Lesson contains an items array with one or more of these 5 item types:
| Item Type | Schema | Example | Description |
|---|---|---|---|
| ContentItem | content-item.schema.json | 10-content-item.json | Reading/content pages with HTML content (subject to HTML_SAFETY.md) |
| ExerciseItem | exercise-item.schema.json | 11-exercise-item.json | Exercise-shaped questions container. Grading independent (isGraded). |
| QuizItem (graded) | quiz-item.schema.json | 12a-graded-quiz-item.json | Quiz-shaped, isGraded: true — typical assessment. |
| QuizItem (ungraded) | quiz-item.schema.json | 12b-ungraded-quiz-item.json | Quiz-shaped, isGraded: false — diagnostic pre-test, self-check. Same schema, different policy. |
| ContentSequenceItem | content-sequence-item.schema.json | 13-content-sequence-item.json | Grouped content with layout options (carousel, tabs, accordion) |
| SignpostItem | signpost-item.schema.json | 14-signpost-item.json | Structural navigation (intro/summary) with objectives and stats; customHtml subject to HTML_SAFETY.md |
Exercise vs. Quiz. These are structural distinctions only. They render differently in the UI and contribute to separate point buckets (enabling weighted grading). Whether the score counts toward a learner’s grade is determined by the isGraded flag, set independently. The examples model this: 11-exercise-item.json is a graded homework exercise (isGraded: true); 12a-graded-quiz-item.json and 12b-ungraded-quiz-item.json use the same content under the same schema to show that a quiz can be either graded or ungraded. The fourth combination (ungraded exercise / open practice) is conventional and not given its own example.
For an authoring guide that walks through the full design space of type × isGraded × isOptional × passMarkPercent, covering common patterns (graded homework, diagnostic pre-test, exit ticket, etc.) and how different consumers may interpret each combination, see ITEM_PATTERNS.md.
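To make the "separate point buckets" idea concrete, here is a small sketch that totals available points per structural type, counting only items with isGraded set. The per-question `points` field and the function name are assumptions for illustration; how a consumer actually weights the buckets is up to that consumer.

```python
def graded_point_buckets(items: list[dict]) -> dict[str, int]:
    """Sum available points per bucket, counting only graded items."""
    buckets = {"exercise": 0, "quiz": 0}
    for item in items:
        item_type = item.get("type")
        if item_type in buckets and item.get("isGraded"):
            # "points" per question is an assumed field name.
            buckets[item_type] += sum(
                q.get("points", 0) for q in item.get("questions", [])
            )
    return buckets


items = [
    {"type": "exercise", "isGraded": True,
     "questions": [{"points": 1}, {"points": 1}]},
    {"type": "quiz", "isGraded": False, "questions": [{"points": 5}]},
    {"type": "quiz", "isGraded": True, "questions": [{"points": 3}]},
    {"type": "content"},
]
print(graded_point_buckets(items))
```

The ungraded quiz contributes nothing here even though it has the same shape as the graded one, which is the structural-versus-policy split the examples demonstrate.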
Key Properties (all items inherit from item-base.schema.json):
- type (required) - Discriminator: “content”, “exercise”, “quiz”, “contentsequence”, or “signpost”
- title (required) - Display title for the item
- sequence - Display order within lesson (0-based)
- instructions - Instructions shown to learner
- suggestedTime - Estimated time in minutes
- isOptional - Whether item can be skipped
Questions Array:
- Only ExerciseItem and QuizItem have a questions array
- ContentItem, ContentSequenceItem, and SignpostItem do NOT contain questions
SignpostItem Properties:
- signpostType (required) - “intro” or “summary”
- scope (required) - “course”, “unit”, or “lesson”
- customHtml (optional) - Custom HTML content to override auto-generated message
Documentation Files
1. question-types-reference.md
Complete JSON Format Reference
- All 19 Question Types with detailed specifications
- Property tables showing required/optional fields
- Examples for each question type
- Validation rules and best practices
- Common properties inherited by all questions
- Complete course example showing nested structure
When to use:
- Creating new course JSON files
- Understanding question type requirements
- Troubleshooting import errors
2. JSON Schema Files (schemas/)
Machine-Readable Validation
JSON Schema files for automated validation using tools like ajv, jsonschema, or IDE validators.
Course Hierarchy Schemas:
- course.schema.json - Course (top level)
- unit.schema.json - Unit (within Course)
- lesson.schema.json - Lesson (within Unit)
Item Type Schemas:
- item-base.schema.json - Base schema for all Items
- content-item.schema.json - ContentItem type
- exercise-item.schema.json - ExerciseItem type
- quiz-item.schema.json - QuizItem type
- content-sequence-item.schema.json - ContentSequenceItem type
- signpost-item.schema.json - SignpostItem type (intro/summary navigation)
Question Type Schemas:
- question-base.schema.json - Base properties for all questions
- simple-gap-fill.schema.json - SimpleGapFill type validation
- true-false-question.schema.json - TrueFalseQuestion type validation
- multiple-choice.schema.json - MultipleChoice type validation
- word-bank-cloze.schema.json - WordBankCloze type validation
- multi-gap-cloze.schema.json - MultiGapCloze type validation
- multiple-choice-cloze.schema.json - MultipleChoiceCloze type validation
- short-answer.schema.json - ShortAnswer type validation
- essay.schema.json - Essay type validation
- sentence-transformation.schema.json - SentenceTransformation type validation
- matching.schema.json - Matching type validation
- ordering.schema.json - Ordering type validation
- placement.schema.json - Placement type validation
Strict enforcement: the reference validator (validate_course.py) runs every document through these schemas as a primary pass via the jsonschema package (≥4.18, modern referencing Registry API). Per-question type-specific dispatch keys off the type discriminator. Install dependencies with pip install -r tools/requirements.txt.
Usage Example (Node.js with ajv):
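const fs = require('fs');
const Ajv = require('ajv');
const ajv = new Ajv();
const baseSchema = require('./schemas/question-base.schema.json');
const simpleGapFillSchema = require('./schemas/simple-gap-fill.schema.json');
// Register the base schema so a $ref to it from the per-type schema can resolve.
ajv.addSchema(baseSchema);
// Read the question to validate from the path given on the command line.
const questionData = JSON.parse(fs.readFileSync(process.argv[2], 'utf8'));
const validate = ajv.compile(simpleGapFillSchema);
const valid = validate(questionData);
if (!valid) {
  console.error(validate.errors);
}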
3. Example Files (examples/)
Ready-to-Use Templates
Minimal Hierarchy Examples - Quick Format Reference
Ultra-minimal examples for quick consultation when creating courses:
- course-minimal.json - Bare minimum Course structure with required properties
- unit-minimal.json - Minimal Unit within a course
- lesson-minimal.json - Minimal Lesson within a unit
Use these as exact-format references (e.g., a unit on Travel + present perfect).
Item Type Examples (10-13) - Item Structure Reference
Individual examples for each of the Lesson Item types:
- 10-content-item.json — ContentItem with rich HTML content (Declaration of Independence reading)
- 11-exercise-item.json — ExerciseItem framed as graded homework (5 T/F world-rivers questions; isGraded: true)
- 12a-graded-quiz-item.json — QuizItem as a graded assessment (isGraded: true, passMarkPercent: 70); same content as 11
- 12b-ungraded-quiz-item.json — QuizItem as an ungraded diagnostic pre-test (isGraded: false); same content as 11 and 12a — demonstrates that quiz vs. exercise is structural, grading is policy
- 13-content-sequence-item.json — ContentSequenceItem with carousel layout
Individual Question Examples (01-09) - RECOMMENDED
Standalone JSON files for each implemented question type with complete feedback bundles:
- 01-simple-gap-fill.json - Articles with indefinite article rule
- 02-true-false-question.json - Science fact with per-choice feedback
- 03-multiple-choice.json - Programming languages with detailed choiceFeedback
- 04-word-bank-cloze.json - Articles in context with per-gap feedback
- 05-multi-gap-cloze.json - Prepositions open cloze (FCE Part 2 style, 8 gaps)
- 06-multiple-choice-cloze.json - Vocabulary with nested gapOptionFeedback
- 07-short-answer.json - Astronomy fact recall
- 08-essay.json - IELTS Task 2 with comprehensive rubric
- 09-sentence-transformation.json - FCE Part 4 with chunk feedback
Features Demonstrated:
- Complete feedback bundle (correct, incorrect, choiceFeedback)
- Strategic hints that guide without revealing answers
- Multi-level tagging (grammar:articles:indefinite, exam:fce, level:B2)
- Realistic educational content
- Production-ready quality
Use these as templates - they showcase all features including feedback mechanisms that are integral to effective course design.
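The multi-level tags shown above (for example grammar:articles:indefinite) lend themselves to prefix filtering. The sketch below assumes that the colon separates hierarchy levels and that a tag matches a filter when the filter's levels are a leading run of the tag's levels. That reading is an assumption for illustration, not something the specification states here.

```python
def tag_matches(tag: str, prefix: str) -> bool:
    """True if every level of `prefix` matches the leading levels of `tag`."""
    tag_levels = tag.split(":")
    prefix_levels = prefix.split(":")
    return tag_levels[:len(prefix_levels)] == prefix_levels


tags = ["grammar:articles:indefinite", "exam:fce", "level:B2"]
print([t for t in tags if tag_matches(t, "grammar:articles")])
```

Comparing whole levels rather than raw string prefixes keeps "grammar:art" from accidentally matching "grammar:articles".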
The 7 reserved-for-2027 types (association, hotspot,
graphicGapMatch, graphicAssociate, graphicOrder, fileUpload,
mediaPromptedEssay) are declared in question-base.schema.json’s
enum for forward compatibility, but no example payloads ship — full
authoring and rendering support is targeted for 2027.
sample-course-with-questions.json
Complete course JSON showing:
- Full hierarchy: Course → Units → Lessons → Items → Questions
- Mixed item types: ContentItem, ExerciseItem, QuizItem
- Real-world structure: Lessons with intro content + exercises and quizzes
- Cambridge FCE alignment: Exam-style questions with proper tags
- Best practices: Proper tagging, feedback, hints, difficulty levels
Quick Start Guide
For Content Creators
Creating a Simple Course:
- Start with a template:
  cp examples/sample-course-with-questions.json my-course.json
- Modify the course metadata:
  {
    "title": "Your Course Title",
    "subtitle": "Your subtitle",
    "description": "Course description",
    "tags": ["level:B1", "grammar"]
  }
- Add or modify questions using the per-type example files (01-simple-gap-fill.json through 09-sentence-transformation.json, plus 15-matching.json and the 16… ordering family).
- Validate the result with any conforming consumer or the reference validator:
  python ../tools/validate_course.py --course-path my-course.json
For Developers
Validating LC-JSON in code:
Implementations may use any JSON Schema (Draft 7) validator. The earlier ajv snippet (Node.js) is one example; common alternatives:
- Python: pip install jsonschema (≥ 4.18) → Draft7Validator
- Java: everit-org/json-schema or networknt/json-schema-validator
- Go: santhosh-tekuri/jsonschema
- Rust: Stranger6667/jsonschema
- Ruby: voxpupuli/json-schema
The reference Python validator (tools/validate_course.py in this repository) layers domain checks (HTML allowlist, gap-marker counts, points consistency, signpost-without-objectives) on top of schema validation. Re-implementations are welcome.
Adding a new question type to the spec (PR-driven contributions):
- Create a JSON schema in schemas/ for the new type.
- Add the discriminator value to question-base.schema.json’s enum.
- Add a per-type example file under examples/ (e.g. 17-new-type.json).
- Document in question-types-reference.md.
- Add positive and negative test cases under tests/.
Question Types — Implementation Status
Implemented (12 types, fully schema-validated as of 1.0-rc.3):
| Question Type | Example | Use Case |
|---|---|---|
| simpleGapFill | 01-simple-gap-fill.json | Single gap fill-in-the-blank |
| trueFalseQuestion | 02-true-false-question.json | Binary choice (True/False, Yes/No) |
| multipleChoice | 03-multiple-choice.json | Single or multiple selection MCQ |
| wordBankCloze | 04-word-bank-cloze.json | Gap fill from word pool |
| multiGapCloze | 05-multi-gap-cloze.json | Open cloze (FCE Reading Part 2) |
| multipleChoiceCloze | 06-multiple-choice-cloze.json | Dropdown cloze (FCE Reading Part 1) |
| shortAnswer | 07-short-answer.json | Free text short response |
| essay | 08-essay.json | Long-form writing with rubric |
| sentenceTransformation | 09-sentence-transformation.json | FCE Use of English Part 4 |
| matching | 15-matching.json | Term-definition matching |
| ordering | 16-ordering.json | Sequence/chronological ordering |
| placement | 17a-sentence-placement.json | Place items into anchored gaps in a structured passage (sentence / paragraph / sectionLabel; supports decoy gaps for TOEFL Sentence Insertion) |
Reserved (7 types declared in the discriminator enum for forward compatibility; per-type schemas and authoring/consumer support targeted for 2027):
| Question Type | Use Case |
|---|---|
| association | Categorization/grouping |
| hotspot | Click regions on image |
| graphicGapMatch | Drag-and-drop on image |
| graphicAssociate | Match text with images |
| graphicOrder | Order images sequentially |
| fileUpload | Document submission |
| mediaPromptedEssay | Audio/video recording |
Status definitions:
- Implemented — per-type schema, example, and conformance fixtures present.
- Reserved — declared in the question-base.schema.json discriminator enum for forward compatibility; no per-type schema or example ships yet.
The 12 implemented types are the entire user-facing surface as of 1.0. The 7 reserved types are targeted for 2027.
Common Validation Errors
Error: “Number of gaps doesn’t match accepted answers”
Cause: Mismatch between numbered @@@N markers in passage and entries in gapAcceptedAnswers.
Fix: Count @@@1, @@@2, … markers in the passage and ensure gapAcceptedAnswers has matching string keys ("1", "2", …).
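A check along these lines can be sketched with a regular expression. The sketch assumes the passage text is available as a string and that gapAcceptedAnswers is an object keyed by gap number; the function name, the sample passage, and the list-valued answers are illustrative assumptions rather than the reference validator's implementation.

```python
import re

GAP_MARKER = re.compile(r"@@@(\d+)")


def gap_mismatches(passage: str, gap_accepted_answers: dict) -> tuple[list[str], list[str]]:
    """Return (markers without answers, answer keys without markers)."""
    marker_ids = set(GAP_MARKER.findall(passage))
    answer_ids = set(gap_accepted_answers)
    missing_answers = sorted(marker_ids - answer_ids, key=int)
    orphan_answers = sorted(answer_ids - marker_ids)
    return missing_answers, orphan_answers


passage = "She has @@@1 apple and @@@2 orange."
answers = {"1": ["an"]}  # the entry for gap "2" is missing
print(gap_mismatches(passage, answers))
```

Reporting which gap numbers are unmatched, rather than only comparing counts, makes the fix easier to locate in a long passage.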
Error: “Unknown question type: simplegapfill”
Cause: Type discriminator uses non-canonical casing.
Fix: Use camelCase: "simpleGapFill", not "SimpleGapFill" or "simplegapfill". Per NORMATIVE.md §5.3, conforming consumers MUST reject non-canonical casings.
Error: “globalId does not match UUID pattern”
Cause: globalId is missing or not in RFC 4122 UUID form (any version; shape-only validation against the 8-4-4-4-12 hex pattern).
Fix: Generate a UUID for every Unit, Lesson, Item, and Question. Use any standard UUID library; v4 is recommended.
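For illustration, the shape check and the fix can both be done with the Python standard library. The sketch assumes the 8-4-4-4-12 pattern accepts upper- and lower-case hex digits; the schema's actual pattern may be stricter, so treat the regular expression as an approximation.

```python
import re
import uuid

# Shape-only check: 8-4-4-4-12 hex digits, any UUID version.
UUID_SHAPE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def has_uuid_shape(value: object) -> bool:
    """True if value is a string in 8-4-4-4-12 hex form."""
    return isinstance(value, str) and UUID_SHAPE.fullmatch(value) is not None


def new_global_id() -> str:
    """Generate a fresh v4 UUID string for a Unit, Lesson, Item, or Question."""
    return str(uuid.uuid4())


print(has_uuid_shape(new_global_id()))
```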
Error: “Unsupported specVersion ‘2.0’”
Cause: The document declares a specVersion whose major version exceeds 1.
Fix: This validator implements LC-JSON 1.x. Either update the validator or correct the specVersion to a 1.x value.
Related Documentation
- NORMATIVE.md — RFC 2119 conformance requirements (the authoritative source for what implementations must do)
- HTML_SAFETY.md — Normative HTML allowlist for ContentItem.html and SignpostItem.customHtml (elements, attributes, URL schemes, sanitization)
- ACCESSIBILITY.md — Producer/consumer accessibility obligations (alt text, captions, keyboard, language/direction) with WCAG 2.1 AA cross-references and recommended ARIA patterns; the opt-in Accessibility Profile claim binds these as MUSTs per NORMATIVE.md §12
- VALIDATION.md — Catalog of every documented validation rule, tagged schema-enforced / domain-validator-enforced / advisory, with citations to the enforcing site (schemas, validate_course.py, or prose). One-map view for implementers building consumers, validators, or round-trip tests.
- ITEM_PATTERNS.md — Informative authoring guide for items + signposts + objectives
- IMPLEMENTATIONS.md — Directory of tools that produce, consume, or validate LC-JSON
- CONTRIBUTORS.md — Acknowledgments
- schemas/ — JSON Schema files (the contract)
- examples/ — Example documents for every artifact and question type
- tests/ — Conformance test corpus (valid and invalid cases)
- question-types-reference.md — Per-type property reference
Version History
v1.0-rc.3 (target: 2026-06-13) — second public release candidate
- Adds LOCALIZATION.md: the language model (language / lang / supportLanguage), the single-language-per-document boundary, BCP 47 tags, and screen-reader pronunciation expectations. Bound by new NORMATIVE.md §13.
- Adds a positioning page (RATIONALE.md) explaining where LC-JSON sits among adjacent standards.
- Conformance corpus expanded to 64 cases (per-type valid + invalid coverage, grading matrix, globalId-uniqueness).
- Schema change requiring a new immutable path: the prototype-era allowedFillerWords and prohibitExtraWordsBetweenChunks fields are dropped from sentence-transformation.schema.json. Because /1.0-rc.2/ is immutable, this lands at /1.0-rc.3/. Backwards-compatible — every rc.2-valid document remains valid under rc.3 (the removed fields were optional and are ignored on import).
- Schemas published as immutable at https://lc-json.org/1.0-rc.3/*.schema.json; /1.0-rc.1/ and /1.0-rc.2/ stay served and frozen; the https://lc-json.org/1.0/*.schema.json URL space is reserved for the accepted final release.
v1.0-rc.2 (2026-05-30) — first publicly announced release candidate
- Two artifact types under a common flat root: course (hierarchical) and questionSet (flat).
- 12 user-facing question types fully implemented and schema-validated; 7 graphic/upload types reserved for a 2027 minor version.
- 23 JSON Schemas (Draft 7) covering every artifact, item type, and question type.
- 32 example files; conformance test corpus under tests/ (13 valid + 25 invalid = 38 cases).
- Reference validator (tools/validate_course.py) and conformance corpus harness (tools/run_corpus.py).
- prompt field correction (the rc.1 → rc.2 change): prompt remains required but minLength is 0, so an empty string "" is valid. prompt is defined as non-authoritative for the eight symbolic question types (gap-fill family, sentence transformation, matching, ordering, placement), whose structured fields carry the question’s meaning; for those types it MAY be empty or MAY carry a brief producer-derived readable summary. A reference-validator domain rule still flags an empty prompt on the four real-content types (true/false, multiple choice, short answer, essay), where it is the question. Backwards-compatible widening — every rc.1-valid document remains valid under rc.2.
- Apache 2.0 throughout. Release-candidate schemas are published as immutable at https://lc-json.org/1.0-rc.2/*.schema.json; the https://lc-json.org/1.0/*.schema.json URL space is reserved for the accepted final release.
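The prompt rule from the rc.2 notes could be expressed as a small domain check. This sketch assumes the "gap-fill family" means the four gap and cloze types listed in this document (which brings the symbolic set to eight), and that "empty" means exactly the empty string. The function name is hypothetical and this is not the reference validator's code.

```python
# Types whose structured fields carry the question; prompt may be "".
SYMBOLIC_TYPES = frozenset({
    "simpleGapFill", "wordBankCloze", "multiGapCloze", "multipleChoiceCloze",
    "sentenceTransformation", "matching", "ordering", "placement",
})
# Types where the prompt is the question itself.
REAL_CONTENT_TYPES = frozenset({
    "trueFalseQuestion", "multipleChoice", "shortAnswer", "essay",
})


def empty_prompt_flagged(question: dict) -> bool:
    """True if a real-content question carries an empty prompt."""
    return (
        question.get("type") in REAL_CONTENT_TYPES
        and question.get("prompt") == ""
    )


print(empty_prompt_flagged({"type": "essay", "prompt": ""}))
```

A missing prompt is a separate matter: the field remains required, so its absence would be caught by schema validation rather than by this rule.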
v1.0-rc.1 — internal release candidate (superseded, never announced)
- Frozen and served at https://lc-json.org/1.0-rc.1/*.schema.json for transparency, but never publicly announced; rc.2 is the first announced prerelease. The only substantive difference is the backwards-compatible prompt minLength 1→0 correction above; the /1.0-rc.1/ schema URLs remain immutable and any document valid under rc.1 is valid under rc.2.
v1.0 (planned: 2026-06-30) — accepted final release
- Publishes the rc.3 schema set unchanged at immutable https://lc-json.org/1.0/*.schema.json — a pure URL rebase of rc.3 with zero wire/content delta (only the version pointer, the $id / $schema URL strings, and doc version labels change).
- Further accessibility deepenings (per-criterion cross-reference table, expanded ARIA patterns, screen-reader announcement timing, accessibility-conformance fixtures) are post-1.0, additive, and informative or opt-in — they do not change the 1.0 base contract (see ACCESSIBILITY.md §11).
- Any non-breaking refinement that does warrant a wire change before final lands in a further immutable /1.0-rc.N/ candidate first, not in 1.0 itself.
LC-JSON’s public history begins with the 1.0 release-candidate line: 1.0-rc.2 (2026-05-30) was its first publicly announced release. Internal iteration before the candidate line is not reflected in the version history.
Contributing
PRs welcome. To propose a new question type or modify an existing one, see Adding a new question type above. For non-trivial changes, open an issue first to discuss the proposal.
See CONTRIBUTORS.md for acknowledgments.
Support
- GitHub Issues: open an issue on the spec repository.
- Conformance questions: consult NORMATIVE.md; it is the authoritative source for implementer requirements.
LC-JSON Specification v1.0-rc.3
LC-JSON Rationale and Positioning
Status: Informative
Spec version context: LC-JSON 1.0-rc.3
Audience: teachers, curriculum designers, institutional reviewers, educational software developers, and implementers evaluating LC-JSON for adoption.
This document is informative, not normative. It explains the design rationale and positioning behind LC-JSON. Conformance requirements remain in NORMATIVE.md.
LC-JSON describes itself as an open learning-content interchange specification rather than an “industry standard.” That term is reserved for formats whose acceptance has been ratified by a recognized body or established through long ecosystem use. LC-JSON has neither yet, and overclaiming would invite reasonable skepticism.
The Problem
Teachers and institutions create large amounts of learning content: courses, lessons, readings, exercises, quizzes, feedback, and assessments. Too often, that content becomes tied to the tool that created it.
Common problems include:
- A course can be exported, but the export is difficult for another tool to understand.
- Question banks lose feedback, scoring intent, tags, or structure during transfer.
- Teachers cannot inspect their own course files without proprietary tooling.
- Institutions cannot easily preserve or migrate teacher-authored content across platforms.
- Accessibility metadata can be lost when content is exported, imported, edited, or repackaged.
LC-JSON exists to make learning content portable in a way that is both technically reliable and inspectable by the people who own the content.
What LC-JSON Is
LC-JSON is an open learning-content interchange specification.
For teachers, it can be understood as a portable course file format: a way to store courses, lessons, questions, answers, feedback, and related teaching material in a file that compatible tools can read.
For developers, LC-JSON defines a schema-validated JSON wire format plus producer/consumer conformance rules for exchanging learning content.
LC-JSON 1.0 defines two artifact types:
- Course — hierarchical learning content: Course -> Units -> Lessons -> Items -> Questions.
- QuestionSet — a flat list of questions for question-bank exchange and packaged delivery.
The practical goal is to preserve teacher-authored instructional intent: sequence, explanations, questions, distractors, feedback, objectives, tags, rubrics, and grading intent.
Design Principles
Machine-Validatable, Human-Inspectable
LC-JSON documents validate against published JSON Schemas, but they are also designed so that authored content remains visible in the file.
A teacher, curriculum designer, or teacher-developer should be able to open a course JSON file and recognize courses, units, lessons, items, prompts, choices, answers, and feedback without proprietary tooling.
This is a deliberate stance against formats whose meaning only emerges through tooling. It is offered without promise of zero technical fields — portability requires some, and $schema, specVersion, and globalId exist for that reason. The promise is that the pedagogical structure stays inspectable to the people who authored it. Where field-naming or structural trade-offs arise during spec evolution, the spec favors the form that keeps pedagogical content recognizable.
Hierarchy Follows Pedagogy
LC-JSON uses the structure teachers already recognize:
Course -> Unit -> Lesson -> Item -> Question
This is not a database-first shape. It is a teaching-content shape.
Plain Property Names
LC-JSON favors readable property names where technically possible:
- prompt, not p.
- acceptedAnswers, not accAns.
- passMarkPercent, not pmp.
- feedback, not an opaque metadata bundle.
The goal is not to remove every technical term. The goal is to keep teaching intent visible.
No Envelope Tricks
LC-JSON uses a flat document root with $schema, documentType, and specVersion as root-level siblings. The course or question-set payload lives beside those fields, not hidden inside an extra wrapper.
This keeps schema dispatch explicit while avoiding unnecessary nesting.
Accessibility Metadata Must Survive Transformation
Learning content often moves through multiple tools before it reaches learners. LC-JSON therefore treats accessibility metadata as something that must survive import/export cycles.
Base LC-JSON consumer conformance includes a preservation floor for accessibility-relevant data such as image alt, media <track>, lang, dir, language, supportLanguage, and reserved-type accessibility metadata. Tools that additionally claim the LC-JSON Accessibility Profile take on the rendering obligations defined in ACCESSIBILITY.md.
Where LC-JSON Fits
LC-JSON is not trying to replace every educational specification or format.
It focuses on a specific problem: portable teacher-authored courses and questions in a JSON-native format that tools can validate, exchange, preserve, and inspect.
LC-JSON is most useful when a team needs:
- portable course files,
- question-bank exchange,
- schema validation before import,
- preservation of feedback and scoring intent,
- round-trip preservation of unsupported future question types,
- a format that teacher-developers and technical curriculum teams can inspect directly.
Runtime delivery, gradebook integration, learner analytics, roster sync, and LMS-specific workflows remain implementation concerns unless a future LC-JSON version explicitly adds a portable contract for them.
A typical adoption path is to author or preserve content in LC-JSON, then export or map selected surfaces to delivery, package, or analytics layers such as QTI, Common Cartridge, H5P, xAPI, or Caliper where needed.
Landscape
LC-JSON is one of several specifications that touch learning content. It sits at a specific layer — content interchange — and is intended to be used alongside, not instead of, the formats that handle adjacent concerns.
| Format | Layer | Relation to LC-JSON |
|---|---|---|
| LTI 1.3 / Advantage | Tool launch, deep linking, roster, grade passback | Different layer. LTI is how an LMS launches and integrates with an external tool; LC-JSON is the content that tool may have authored or consumed. Complementary. |
| xAPI / cmi5 | Learning activity records | Different layer. xAPI describes what a learner did; LC-JSON describes the content they did it with. Complementary. |
| SCORM 2004 | Packaged courseware delivery and runtime API | Older, XML-based, designed for self-paced corporate compliance training and bound to a runtime API. LC-JSON is editable interchange, not a delivery wrapper. |
| IMS Common Cartridge | Multi-format content package | Bundles QTI, SCORM, web links, and a manifest into a single archive. LC-JSON is a single JSON-native artifact rather than a package format. |
| QTI 2.x / 3.0 | Question and assessment interchange | Closest peer. QTI was conceived as XML; 3.0 added a JSON binding but the conceptual model remains XML-shaped and the surface area is broad. LC-JSON is JSON-native from the start, course-shaped as well as question-bank-shaped, narrower in surface, and designed for direct human inspection. |
| OneRoster | Roster, enrollment, grade exchange | Different layer; orthogonal to content. |
| CASE | Competency and academic-standards framework | Different layer. CASE describes the competencies a course might address; LC-JSON describes the course content itself. Complementary. |
| H5P | Interactive content packages and runtimes | Different layer. H5P provides executable interaction types and player/runtime semantics; LC-JSON is a neutral editable source/interchange format that could generate or map to selected runtime targets. |
| Caliper | Learning analytics event model | Different layer. Like xAPI, Caliper describes learner activity events; LC-JSON describes the content those events may refer to. Complementary. |
This is a high-level map, not an exhaustive comparison. LC-JSON’s intended combination — JSON-native, human-inspectable, and covering hierarchical course structure as well as flat question sets — is uncommon among established educational interchange formats.
The question of whether such a format needs to exist resolves as follows: QTI is mature and deep for assessment exchange, and LC-JSON deliberately targets a narrower, JSON-native course-and-question authoring source rather than competing on assessment surface area; SCORM and Common Cartridge are package-and-delivery formats from an earlier era, not editable JSON; xAPI, Caliper, LTI, OneRoster, and CASE are oriented at other layers. LC-JSON exists to occupy the JSON-native, teacher-readable, course-and-question interchange slot.
How LC-JSON Differs
The same comparison, expressed as field-level stances:
| Need | LC-JSON stance | Typical peer behavior |
|---|---|---|
| Teacher-readable interchange files | First-class design principle | Many established interchange formats prioritize machine/tool processing over direct human inspection |
| JSON-native validation | Published JSON Schemas (Draft 7) | QTI is XML-shaped (3.0 added a JSON binding); SCORM and Common Cartridge are XML and package-based |
| Course + question portability in one family | Separate course and questionSet artifacts under a common flat root | QTI covers questions; SCORM and Common Cartridge package courses; few formats cover both as editable JSON |
| Accessibility metadata preservation across import/export | Base consumer-conformance preservation floor for alt, <track>, lang, dir, language, supportLanguage, reserved-type accessibility metadata | Accessibility metadata can be dropped or normalized away during transformation |
| Accessible delivery claims | Opt-in Accessibility Profile binding (see ACCESSIBILITY.md) | Accessibility-conformance claims are typically made about the delivery platform, not the interchange file |
| Unsupported future question types | Preserve verbatim and report; never silently drop | Fallback behavior varies by implementation; without an explicit preservation contract, data loss is a practical risk |
| Tool-specific data | Namespaced x- extensions; other consumers ignore unknown namespaces and extension-preserving consumers round-trip them where possible (see NORMATIVE.md §7) | Custom-extension mechanisms exist (e.g. QTI custom interactions) but are often tightly coupled to one tool |
| Version stability | Immutable schema URL paths per spec version | URL stability practices vary by specification |
| LMS / runtime integration | Out of scope unless a future LC-JSON version adds a portable contract | SCORM defines a runtime API; LTI defines launch and grade integration |
Adoption Positioning
For teachers:
LC-JSON is a portable course file format for moving teaching materials between compatible platforms.
For institutions:
LC-JSON is an open JSON-based interchange format designed to make teacher-authored learning content portable between compatible tools and platforms.
For developers:
LC-JSON defines a schema-validated JSON wire format plus producer/consumer conformance rules for learning-content interchange.
For standards reviewers:
LC-JSON is an emerging open learning-content interchange specification with a stable 1.0 release-candidate contract, published schemas, conformance fixtures, and explicit producer/consumer obligations.
Scope and Limits
LC-JSON is a focused, open, schema-validated interchange specification for portable learning content. It is not — on its own — any of the following:
- A WCAG conformance claim. LC-JSON’s Accessibility Profile binds preservation and rendering obligations on conforming consumers, but WCAG conformance is established by the delivery platform under test, not by the interchange file.
- An LMS interoperability format. Tool launch, deep linking, and grade passback are LTI’s domain.
- A roster, enrollment, or grade-exchange format. That is OneRoster’s domain.
- A learning-analytics or activity-record format. That is xAPI / cmi5’s domain.
- A runtime delivery wrapper. SCORM 2004 defines a runtime API; LC-JSON does not.
- An established industry standard. LC-JSON is an emerging open specification at 1.0-rc.3, with published schemas, conformance fixtures, and explicit producer/consumer obligations. Whether it becomes widely adopted will be determined by implementers and time, not by self-description.
Within those limits, LC-JSON aims to be exactly one thing well: a JSON-native, human-inspectable interchange format for hierarchical courses and flat question sets, with extension-preserving round-trips, a base accessibility-preservation floor, and an opt-in Accessibility Profile for delivery obligations.
LC-JSON Specification — Normative Requirements
Spec version: 1.0
Status: Normative
Last updated: 2026-05-24
This document states the requirements that conforming LC-JSON (Learning Content JSON) tools MUST satisfy. It is the authoritative source of truth for compliance; descriptive material elsewhere in the specification illustrates how to meet these requirements but does not relax them.
1. Scope
This document specifies the requirements for tools that produce, consume, or validate LC-JSON 1.0 documents. It defines:
- The canonical wire format for the two artifact types (Course, QuestionSet).
- Two conformance roles — producer and consumer — and what each MUST do.
- Versioning rules and URL stability guarantees.
- Conformance-claim language (how a tool may state it conforms).
This document does not prescribe implementation strategies, programming languages, or runtime architecture. Any tool meeting the requirements below conforms, regardless of how it is built.
2. Conformance Language
The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in RFC 2119 and RFC 8174 when, and only when, they appear in all capitals.
A requirement stated in lowercase (“must,” “should”) is descriptive prose, not a normative requirement.
3. Document Identity
3.1 Canonical URL space
LC-JSON schemas are published at:
https://lc-json.org/<spec-version>/<schema-name>.schema.json
The <spec-version> segment identifies either a released version (1.0, 1.1, 2.0, …) or a release candidate — an immutable draft of an upcoming version, published for review and implementer feedback before the final release is accepted (e.g., 1.0-rc.1, 1.0-rc.2, 1.0-rc.3, 1.1-rc.1, …). Each receives its own URL path. For released spec version 1.0, schemas resolve at https://lc-json.org/1.0/*.schema.json. For release candidates, schemas resolve at https://lc-json.org/1.0-rc.N/*.schema.json, one URL path per candidate.
URLs under any published path — released or release-candidate — are immutable. They MUST NOT be renamed, removed, redirected to a different schema, or repointed to a non-canonical host once published.
The /X.Y/ URL path is reserved for the accepted final X.Y release and MUST NOT be populated until that release is published. Release candidates targeting X.Y are published at /X.Y-rc.N/ paths. A document pinned via $schema to /X.Y-rc.N/ does not automatically validate against /X.Y/; adoption of the final release is an explicit choice by the publisher (typically a re-export against the new schema URL). See §8.1 and §8.3 for the full versioning and stability contract.
3.2 Required root fields
Every conforming LC-JSON document MUST contain at the root, as siblings (not nested under any envelope):
| Field | Required? | Type | Value |
|---|---|---|---|
| documentType | MUST | string | "course" or "questionSet". The artifact discriminator. |
| specVersion | MUST | string | The LC-JSON contract version this document conforms to. Pattern enforced by the schemas; consumer/producer rules in §5.2 / §4.6. |
| $schema | MUST (producer) / SHOULD-tolerate (consumer) | string | A URL identifying the schema for this document type at the spec version the producer conforms to (e.g., https://lc-json.org/1.0-rc.3/<artifact>.schema.json for an rc.3 producer; https://lc-json.org/1.0/<artifact>.schema.json for a 1.0-final producer). Consumers SHOULD accept documents that omit $schema (re-import scenarios from older or lenient producers), but MUST reject any other root-field omission. |
A document missing documentType or specVersion is non-conforming. A producer that emits a document missing $schema is non-conforming with respect to that document; a consumer that rejects an otherwise-valid document on the basis of a missing $schema is overly strict and SHOULD instead infer the schema from documentType + specVersion.
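The root-field rules above can be sketched as a small consumer-side helper. This is an illustration under stated assumptions, not a prescribed implementation: resolve_schema_url and its publication parameter are hypothetical names. The parameter exists because specVersion alone does not say whether a document targets a release candidate or the final release, so the consumer has to decide which published path to infer.

```python
SCHEMA_FILE = {
    "course": "course.schema.json",
    "questionSet": "question-set.schema.json",
}


def resolve_schema_url(doc, publication="1.0-rc.3"):
    """Return the document's $schema, or infer one when the field is absent.

    `publication` is the URL path segment the consumer implements; picking it
    is a consumer decision and an assumption of this sketch.
    """
    for field in ("documentType", "specVersion"):
        if field not in doc:
            raise ValueError(f"non-conforming document: missing {field}")
    if doc["documentType"] not in SCHEMA_FILE:
        raise ValueError(f"unknown documentType: {doc['documentType']!r}")
    if "$schema" in doc:
        return doc["$schema"]
    # Lenient path: tolerate the omission and infer from documentType.
    return f"https://lc-json.org/{publication}/{SCHEMA_FILE[doc['documentType']]}"
```

A document that omits $schema is still accepted and resolved to a schema URL, while one that omits documentType or specVersion raises an error.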
3.3 Artifact types
Spec version 1.0 defines exactly two artifact types:
- Course (documentType: "course"): hierarchical learning content (Course → Units → Lessons → Items → Questions). Validated by course.schema.json.
- QuestionSet (documentType: "questionSet"): flat list of questions without a course/unit/lesson scaffold. Validated by question-set.schema.json.
A producer MUST emit exactly one of these artifact types per document. Mixing artifact types within a single document is non-conforming.
4. Producer Conformance
A producer is any tool that emits LC-JSON documents intended for external consumption.
4.1 Wire format
A producer MUST emit documents in the canonical flat-root form: $schema, documentType, and specVersion at the top level, with the artifact payload as flat siblings. Nested envelopes such as {"course": {...}} are non-conforming.
4.2 Discriminator casing
A producer MUST emit the type discriminator on questions in canonical camelCase form: "simpleGapFill", "trueFalseQuestion", "multipleChoice", "wordBankCloze", "multiGapCloze", "multipleChoiceCloze", "shortAnswer", "essay", "sentenceTransformation", "matching", "ordering", "placement".
A producer MUST emit the type discriminator on items in canonical lowercase form: "content", "exercise", "quiz", "contentsequence", "signpost".
A producer MUST emit documentType in canonical camelCase form with a lowercase initial letter: "course" or "questionSet".
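As an illustration of the casing rules in this section, the sketch below holds the canonical discriminator values listed above and reports when a value differs from a canonical one only by case. The helper name canonical_spelling is hypothetical; a tool might use something like it to produce a clearer rejection message.

```python
QUESTION_TYPES = {
    "simpleGapFill", "trueFalseQuestion", "multipleChoice", "wordBankCloze",
    "multiGapCloze", "multipleChoiceCloze", "shortAnswer", "essay",
    "sentenceTransformation", "matching", "ordering", "placement",
}
ITEM_TYPES = {"content", "exercise", "quiz", "contentsequence", "signpost"}
DOCUMENT_TYPES = {"course", "questionSet"}


def canonical_spelling(value, canonical):
    """Return the canonical form if `value` is a mis-cased variant, else None.

    None is returned both for already-canonical values and for values that
    match nothing in `canonical` (for example, a reserved or unknown type).
    """
    if value in canonical:
        return None
    by_lower = {name.lower(): name for name in canonical}
    return by_lower.get(value.lower())
```

For example, canonical_spelling("MultipleChoice", QUESTION_TYPES) returns "multipleChoice", which lets an error message say what the producer should have emitted.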
4.3 Item-type semantics
The exercise and quiz item-type discriminators are structural distinctions, not policy. They allow consumers to render the two forms differently in the UI and to track their points in separate buckets (enabling weighted grading).
The grading policy of an item is composed independently from its type via the isGraded, isOptional, and passMarkPercent fields. All four combinations of {exercise, quiz} × {graded, ungraded} are valid LC-JSON: a graded exercise (e.g. homework that counts), an ungraded exercise (open practice), a graded quiz (typical assessment), and an ungraded quiz (e.g. diagnostic pre-test, self-check) are all conforming.
A producer MUST NOT infer or assert grading state from item type alone, and a consumer MUST NOT reject a document on the basis that an exercise is graded or a quiz is ungraded.
4.4 Identifiers
A producer MUST emit globalId values as RFC 4122 UUIDs (any version) where the schema requires them. Specifically: every Unit, Lesson, Item, and Question MUST have a globalId; these identify the entity across re-imports, enabling consumers to match unchanged content against existing records and detect modifications.
Within a single document, globalId values MUST be unique across all entities (Units, Lessons, Items, and Questions share one namespace). A document in which two entities carry the same globalId does not conform: a consumer keyed on globalId cannot tell the entities apart, so re-import matching breaks and updates can land on the wrong record. globalId comparison is case-insensitive (the hexadecimal digits of a UUID carry no case significance).
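A minimal sketch of the uniqueness check, assuming the document has already been parsed into Python objects. It walks the whole tree rather than relying on specific array names, so it also picks up globalId members nested inside extension or reserved-type data, which a real validator may want to exclude. Note that Python's uuid.UUID is more lenient than a strict RFC 4122 textual check (it also accepts braces and a urn:uuid: prefix), so this is not a substitute for schema validation. The function names are hypothetical.

```python
import uuid
from collections import Counter


def iter_global_ids(node):
    """Yield every globalId value found anywhere in a parsed JSON tree."""
    if isinstance(node, dict):
        if "globalId" in node:
            yield node["globalId"]
        for value in node.values():
            yield from iter_global_ids(value)
    elif isinstance(node, list):
        for value in node:
            yield from iter_global_ids(value)


def duplicate_global_ids(doc):
    """Return globalIds that occur more than once, compared case-insensitively."""
    # str(uuid.UUID(...)) normalizes to lowercase; malformed input raises ValueError.
    counts = Counter(str(uuid.UUID(g)) for g in iter_global_ids(doc))
    return sorted(g for g, n in counts.items() if n > 1)
```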
A producer SHOULD emit a sourceCourseId at the course root for any course that may be re-imported or version-tracked. sourceCourseId is the stable course-identity field — the same sourceCourseId across versions of a course identifies them as the same logical course, enabling consumers to detect re-imports and apply update semantics rather than treating each upload as a fresh course. sourceCourseId is generated by the source authoring system; it does not identify a human author.
Forward-direction note (informative, not normative for 1.0): Future versions of LC-JSON may introduce a complementary coursePlatformId field for platform-assigned course identifiers, enabling round-trip flows where a teacher exports from a platform and re-imports to an authoring tool with the platform’s identity preserved. Implementations should not rely on this field’s absence in 1.0 documents being permanent.
4.5 Property naming
A producer MUST emit all property names in camelCase. PascalCase, snake_case, and other casings are non-conforming on the wire.
4.6 Spec version
A producer MUST emit specVersion matching the spec version it implements. For producers conforming to this document, specVersion MUST begin with "1." (e.g., "1.0", "1.0.1").
specVersion carries the contract version regardless of which publication the producer targets. The specific publication — release candidate or final release — is identified by the $schema URL (§4.7). A producer conforming to 1.0-rc.3 emits specVersion: "1.0" together with $schema: "https://lc-json.org/1.0-rc.3/course.schema.json"; a producer conforming to 1.0 final emits the same specVersion value together with $schema: "https://lc-json.org/1.0/course.schema.json". specVersion does not include release-candidate suffixes — "1.0-rc.3" is not a conforming specVersion value.
4.7 Schema URL
A producer MUST emit a $schema URL pointing at the canonical published schema for its documentType at the spec version the producer conforms to. For example: a producer conforming to LC-JSON 1.0-rc.3 emits https://lc-json.org/1.0-rc.3/course.schema.json for courses and https://lc-json.org/1.0-rc.3/question-set.schema.json for question sets; a producer conforming to 1.0 final emits https://lc-json.org/1.0/course.schema.json. A producer that emits a non-canonical URL or omits the field is non-conforming.
The strict producer / lenient consumer split (§3.2 above) is deliberate: emitting $schema makes documents self-describing for IDEs, schema dispatch, and ad-hoc validation; tolerating its absence on import preserves portability across older or otherwise-non-conforming producers without hard-failing re-imports.
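Taken together, §4.6 and §4.7 mean that the release-candidate suffix belongs in the $schema URL and never in specVersion. The sketch below shows one way a producer might build its root fields. The root_fields helper is hypothetical, and the regular expression is an assumed shape for a 1.x specVersion; the authoritative pattern is the one in the published schemas.

```python
import re

# Assumed shape of a 1.x specVersion; the schemas hold the authoritative pattern.
SPEC_VERSION_1X = re.compile(r"1\.\d+(\.\d+)?")

SCHEMA_FILE = {
    "course": "course.schema.json",
    "questionSet": "question-set.schema.json",
}


def root_fields(document_type, publication, spec_version="1.0"):
    """Build the three root fields for a document targeting `publication`.

    `publication` is the URL path segment, e.g. "1.0-rc.3" or "1.0".
    """
    if not SPEC_VERSION_1X.fullmatch(spec_version):
        raise ValueError(f"not a conforming 1.x specVersion: {spec_version!r}")
    return {
        "$schema": f"https://lc-json.org/{publication}/{SCHEMA_FILE[document_type]}",
        "documentType": document_type,
        "specVersion": spec_version,
    }
```

Calling root_fields("course", "1.0-rc.3") and root_fields("course", "1.0") yields the same specVersion value and differs only in the $schema URL.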
4.8 Validation before emit
A producer SHOULD validate every emitted document against the published JSON Schemas before delivery. A producer that emits an invalid document is non-conforming with respect to that document.
5. Consumer Conformance
A consumer is any tool that ingests LC-JSON documents from an external source.
Consumer conformance requires more than schema validation. Schema validation (§5.1) is necessary but not sufficient: a conformant consumer ALSO satisfies the discriminator-handling rule (§5.3), the unknown-fields rule (§5.4), the reserved-enum-values rule (§5.5), the randomization requirements (§5.6), and — where reserved or unknown question types appear — the round-trip preservation obligations in §6. A generic JSON Schema validator alone does not implement these; consumers MUST implement the relevant §5.x and §6 obligations to claim conformance (see §10.3). See the worked example at the end of this section.
5.1 Strict validation
A consumer MUST validate incoming documents against the published JSON Schemas for the declared documentType and reject documents that fail schema validation.
Exception (§6 fallback for unknown types). Schema-validation failures whose only cause is one or more type discriminator values not present in the consumer’s implemented question-base.schema.json enum do not trigger §5.1 rejection. The consumer applies the §6 fallback to those questions (preserve verbatim, treat earned points as 0, render placeholder, report to user) and validates the rest of the document under §5.1. All other schema-validation failures — missing required fields, type mismatches, pattern violations on known fields, additionalProperties violations on closed objects, etc. — still trigger rejection. This carve-out is what makes §5.2’s “accept any 1.x specVersion” rule operable: a 1.0-only consumer reading a 1.x+ document with a future-minor question type satisfies both §5.1 and §6 by following this path.
5.2 Spec version handling
A consumer MUST accept any specVersion value whose major version it implements (e.g., a 1.x consumer accepts 1.0, 1.1, 1.0.1, …; the canonical pattern is enforced by course.schema.json / question-set.schema.json).
A consumer MUST reject specVersion values whose major version exceeds what it implements (a 1.x consumer rejects 2.0, 2.1, 3.0, …). The rejection SHOULD be a clear error indicating the unsupported spec version.
A consumer MUST NOT silently downgrade or interpret unknown spec versions.
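A minimal sketch of the major-version gate, assuming schema validation has already confirmed that specVersion matches the canonical pattern. The function name is hypothetical. The text above only specifies rejection for majors that exceed the implemented one; rejecting lower majors as well is an assumption of this sketch.

```python
def check_spec_version(spec_version, implemented_major=1):
    """Return spec_version if its major version is implemented, else raise."""
    major_text = spec_version.split(".", 1)[0]
    if not major_text.isdecimal():
        raise ValueError(f"unrecognized specVersion: {spec_version!r}")
    if int(major_text) != implemented_major:
        # Clear error instead of a silent downgrade or best-effort parse.
        raise ValueError(
            f"unsupported specVersion {spec_version!r}: "
            f"this consumer implements {implemented_major}.x"
        )
    return spec_version
```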
5.3 Discriminator handling
A consumer MUST recognize canonical camelCase question-type discriminators and canonical lowercase item-type discriminators as defined in §4.2. Non-canonical casings are non-conforming and MUST be rejected.
5.4 Unknown fields
A consumer MUST NOT reject a document solely because it contains additional fields not defined by the schema. Such fields are reserved for forward-compatible additions and MUST be ignored or preserved at the consumer’s discretion.
5.5 Reserved enum values
A consumer MUST accept question types listed in question-base.schema.json’s enum even when no per-type schema is published for them. Full handling obligations — including round-trip preservation, learner-facing placeholder rendering, and grading semantics — are normative under §6 (Reserved and unknown types).
5.6 Randomization requirements for matching and placement
For matching and placement questions, two surfaces a consumer presents to a learner have no author-defined order:
- The choice pool, comprising every authored answer value (pairs[].match or categories[].label for matching; placements[].item for placement) plus any distractors. Source order would directly expose the correct-answer mapping (the N-th option being the correct answer for the N-th row or gap), defeating the question.
- The row order in matching classification mode, where each row is one item to be classified. Source order is grouped by category (items belonging to categories[0] first, then categories[1], and so on), which directly exposes the answer (the first N rows all share the same category label).
A consumer MUST present both surfaces to learners in randomized order. A consumer MUST NOT render either surface in source order. The randomization algorithm and any seeding strategy are consumer-defined.
These requirements do not apply to:
- multipleChoice and other single-question choice lists, where authors may deliberately position the correct option and the question schema’s own shuffleOptions field governs shuffle policy per question.
- The order of pair rows in matching pairs mode, where each item has its own distinct match value and source row order does not directly expose the answer.
- The order of items in ordering source-tile pools, where the question’s structural design requires the tile pool to be presented in non-source order regardless.
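The randomization requirement for the choice pool can be sketched as follows for a matching question in pairs mode. Several things here are assumptions: that distractors is a flat list of strings, that a bounded retry is an acceptable reading of "MUST NOT render in source order" (a fair shuffle can reproduce source order by chance), and the helper name itself. Classification mode and placement questions would build their pools from categories[].label and placements[].item instead.

```python
import random


def randomized_choice_pool(question, rng=None, max_attempts=16):
    """Return pairs[].match plus distractors in a shuffled, non-source order."""
    rng = rng or random.Random()
    source = [pair["match"] for pair in question["pairs"]]
    source += question.get("distractors", [])
    pool = list(source)
    # Retry when the shuffle happens to reproduce source order. Pools with a
    # single value (or all-identical values) cannot be reordered, so the loop
    # is bounded rather than open-ended.
    for _ in range(max_attempts):
        rng.shuffle(pool)
        if pool != source:
            break
    return pool
```

Passing a seeded random.Random instance makes the order reproducible for tests, while production code would typically leave rng unset.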
Forward compatibility: three look-alike situations (informative)
A 1.0-conformant consumer reading a 1.x document may encounter three superficially-similar cases at the JSON layer, each governed by a different consumer obligation. A generic JSON Schema validator handles none of them automatically.
1. An unknown top-level field on a question. Example: "explanationVideoUrl": "..." appears on a multipleChoice question. Under §5.4 (Unknown fields), the consumer MUST NOT reject the document; it ignores or preserves the field at its discretion.
2. An extension-namespaced field. Example: "x-somecompany-difficultyBand": "B2" appears on the same question. Under §7 (Extensions), the consumer MUST NOT reject for it and SHOULD preserve it verbatim across read/write cycles.
3. An unknown type discriminator value. Example: a question carries "type": "novelCodingTask", a value the consumer’s implemented question-base.schema.json enum does not include. Per §6.1, reserved and unknown types are handled identically: it does not matter whether novelCodingTask is destined for a future minor version, is a vendor-specific extension type, or will never be standardized at all. Under §5.1 (Strict validation, Exception) and §6.2 (Consumer obligations), the consumer applies the §6 fallback to that question (preserve verbatim, treat earned points as 0, render a placeholder naming the type, report to user) and validates the rest of the document. Note that earned points are set to 0, but the question’s possible points still count toward the item’s total: the item’s maximum is consumer-independent by design, so a learner who completes the item in a fuller consumer can earn all the points the producer declared while a learner in a more limited consumer earns whatever subset they can; both report grades against the same denominator. Under §6.4 (Round-trip preservation), if the consumer re-exports the document, the novelCodingTask question is preserved with every member, value, and nested structure intact (semantic preservation; key order is producer-discretion per §6.2).
These three cases look similar at the JSON layer but are not interchangeable. Implementers using a generic JSON Schema validator (jsonschema for Python, Ajv for JavaScript, etc.) MUST add the §5.x and §6 fallback logic above the base validation call — particularly for case 3, where a generic validator would reject the whole document on the unknown "novelCodingTask" enum value, but §5.1’s Exception is what permits the rest of the document to validate while §6 governs the unknown-type question.
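To make the distinction concrete, the sketch below sorts a single question's forward-compatibility surprises into the three cases. It is illustrative only: classify_question is a hypothetical helper, and known_fields stands in for whatever field list a real consumer derives from the per-type schemas it implements.

```python
def classify_question(question, known_types, known_fields):
    """Sort a question's unknown members into the three look-alike cases."""
    extras = [key for key in question if key not in known_fields]
    return {
        # Case 1 (§5.4): ignore or preserve, never reject.
        "unknown_fields": [k for k in extras if not k.startswith("x-")],
        # Case 2 (§7): never reject, SHOULD preserve verbatim.
        "extension_fields": [k for k in extras if k.startswith("x-")],
        # Case 3 (§5.1 Exception + §6): apply the fallback to this question.
        "unknown_type": question.get("type") not in known_types,
    }
```

A consumer could run a check like this before calling a generic validator, routing any question flagged as unknown_type to the §6 fallback and validating the remainder of the document as usual.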
6. Reserved and Unknown Types
6.1 Definitions
A reserved type is a type discriminator value listed in question-base.schema.json’s discriminator enum that does not have a published per-type schema in this spec version. The 1.0 reserved types are: association, hotspot, graphicGapMatch, graphicAssociate, graphicOrder, fileUpload, and mediaPromptedEssay.
An unknown type is a type discriminator value not listed in question-base.schema.json’s discriminator enum. Unknown types may appear in 1.x+ documents read by 1.0-only consumers.
For the purposes of this section, reserved and unknown types are handled identically.
6.2 Consumer obligations
When a consumer encounters a question whose type is reserved or unknown, the consumer:
- MUST preserve every member of the question object across read/write cycles: every field name, every value, every nested object and array, and any extension fields present on import. No field dropping, no value mutation, no globalId rewriting. (Key order within JSON objects is producer-discretion: producers SHOULD preserve input key order for authoring ergonomics and diff stability, but consumers are not required to; JSON object members are unordered per RFC 8259 §4.)
- MUST NOT silently drop the question from the parent item’s questions[] array. The question’s existence is preserved even when its rendering is not supported.
- MUST treat the question’s earned points as 0 for grading purposes. The question’s possible points still count toward the item’s total; the maximum is not silently reduced.
- MUST report the unsupported question to the user (or upstream caller) at import time, naming the type and the question’s globalId. Form is implementation-defined (UI banner, log line, returned warning), but the report is required.
- SHOULD render a non-interactive placeholder in the learner UI naming the type. Example: “Question type ‘hotspot’ is not supported by this application. Skip to the next question.”
- SHOULD disable navigation gating for unsupported questions (e.g. do not block lesson completion just because a reserved question was not answered).
- MAY offer the learner a way to view the raw question data (instructor preview, debug mode), but MUST NOT expose internal field names to the learner UI by default.
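The grading and reporting obligations above could be sketched as follows. This is an illustration under assumptions: points is taken to be numeric, supported_types is the consumer's own set of renderable types, and score_fn stands in for whatever grader the consumer uses. None of these names come from the spec.

```python
def import_item(item, supported_types):
    """Return (warnings, possible_points) for one item.

    Nothing is removed from item["questions"]; unsupported questions stay
    in place, and their possible points still count toward the total.
    """
    warnings = []
    possible = 0
    for question in item.get("questions", []):
        possible += question.get("points", 0)
        if question.get("type") not in supported_types:
            warnings.append(
                f"Unsupported question type {question.get('type')!r} "
                f"(globalId {question.get('globalId')})"
            )
    return warnings, possible


def earned_points(question, supported_types, score_fn):
    # Unsupported questions earn 0; score_fn is a hypothetical grader.
    if question.get("type") not in supported_types:
        return 0
    return score_fn(question)
```

The warnings list is one possible form of the required import-time report; a UI banner or log line would serve equally well.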
6.3 Producer obligations
A producer that emits reserved types:
- SHOULD NOT emit reserved question types in 1.0 documents intended for cross-implementation distribution. Reserved types are explicitly tool-specific extensions until promoted in a future version.
- MUST still satisfy question-base.schema.json if it does emit them: valid type, valid globalId, valid points, valid prompt, plus any other question-base requirements. Consumers’ fallback handling can only operate on a structurally well-formed object.
- SHOULD document in the tool’s README which reserved types it emits and which fields it populates, so other tool authors can interoperate or contribute.
6.4 Round-trip preservation
A consumer that imports an LC-JSON document, modifies it, and re-exports MUST preserve every member of every reserved-type question in the exported document — including their globalId, type, points, prompt, and any additional fields that were present on import. No field dropping, no value mutation. (Key order within JSON objects is producer-discretion per §6.2; the preservation obligation is semantic, not byte-level.)
The intent is that a teacher exporting from a consumer that does not support hotspot can take the file back to a consumer that does, without losing the hotspot question. This is the core interop guarantee for reserved types: consumers MUST NOT strip reserved questions on export even if they cannot render them on import.
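One way to honor this guarantee is to pass unsupported questions through an edit cycle untouched. The sketch below assumes an in-memory list of question objects and a hypothetical edit_fn for supported ones; the helper names are illustrative.

```python
import copy


def export_with_passthrough(questions, supported_types, edit_fn):
    """Re-export questions: edit the supported ones, copy the rest verbatim."""
    exported = []
    for question in questions:
        if question.get("type") in supported_types:
            exported.append(edit_fn(copy.deepcopy(question)))
        else:
            # Reserved or unknown: preserved member for member.
            exported.append(copy.deepcopy(question))
    return exported


def preserved(before, after):
    # Python dict equality ignores key order, which matches the semantic
    # (not byte-level) preservation obligation.
    return before == after
```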
6.5 Producer guidance (informative)
To make a reserved-type question maximally compatible with future first-class implementations and other producers emitting the same name:
- Use the published reserved name exactly (hotspot, not Hotspot or hotspot-question).
- Always populate globalId (UUID), points, and prompt.
- Use additional fields conservatively: anything beyond question-base is by convention only until 1.1 publishes a per-type schema. Document any tool-specific extensions in your README.
- Avoid generic field names that 1.1 schemas may use canonically (data, config, settings).
This subsection is informative — producers that do not follow it still produce valid LC-JSON. But the future first-class schemas are likelier to land cleanly if 1.0 producers stay within the spirit.
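A producer could check its own output against the guidance above with a small lint pass. The sketch below is advisory only, in keeping with the informative status of this subsection; the function name and message strings are assumptions.

```python
import uuid

RESERVED_TYPES = frozenset({
    "association", "hotspot", "graphicGapMatch", "graphicAssociate",
    "graphicOrder", "fileUpload", "mediaPromptedEssay",
})
GENERIC_FIELD_NAMES = frozenset({"data", "config", "settings"})


def lint_reserved_question(question):
    """Return advisory notes for a reserved-type question (empty if clean)."""
    notes = []
    if question.get("type") not in RESERVED_TYPES:
        notes.append("type is not an exact reserved name")
    try:
        uuid.UUID(str(question.get("globalId")))
    except ValueError:
        notes.append("globalId is not a UUID")
    for field in ("points", "prompt"):
        if field not in question:
            notes.append(f"{field} is missing")
    for name in sorted(GENERIC_FIELD_NAMES.intersection(question)):
        notes.append(f"generic field name {name!r} may clash with a future schema")
    return notes
```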
7. Extensions
LC-JSON is deliberately small. Tools frequently need to attach data that is meaningful to themselves but is not part of the interchange contract — authoring provenance, internal identifiers, editor state, analytics hints. Namespaced extensions provide a forward-compatible, collision-free way to carry such data without polluting the core format or requiring a spec revision.
7.1 Extension members
An extension member is an object member whose key begins with the prefix x- followed by a vendor or tool namespace, for example x-acme-reviewState or x-acme.lineage.
Extension members MAY appear on the document root and on any Course, Unit, Lesson, Item, or Question object. They MUST NOT be added to objects whose schema declares additionalProperties: false (in 1.0, the matching pair/category entries and placement entries), because those objects are closed by contract and would fail validation.
The x- prefix is reserved exclusively for extensions. A producer MUST NOT introduce a non-extension field whose name begins with x-.
7.2 Namespacing
The segment immediately following x- is the namespace and MUST identify the originating tool or vendor (e.g. x-acme). Namespacing prevents two tools from colliding on the same key with incompatible meanings. A producer MUST NOT emit an extension member under a namespace it does not own.
A namespace owner SHOULD document the extension members it emits — their shape and meaning — in its public implementation notes (for known implementations, in IMPLEMENTATIONS.md).
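The key rules in §7.1 and §7.2 can be illustrated with a small parser. One assumption is labeled in the code: the namespace segment is taken to end at the next "-" or ".", which fits the two examples given in §7.1 but is not stated as a rule there.

```python
import re

# Assumption: the namespace ends at the next "-" or ".", as in the examples
# x-acme-reviewState and x-acme.lineage.
_EXTENSION_KEY = re.compile(r"^x-([A-Za-z0-9]+)(?:[-.].+)?$")


def extension_namespace(key):
    """Return the namespace of an extension key, or None for any other key."""
    match = _EXTENSION_KEY.match(key)
    return match.group(1) if match else None


def foreign_extensions(obj, owned_namespaces):
    """List x- keys on obj whose namespace the producer does not own."""
    return [
        key for key in obj
        if key.startswith("x-")
        and extension_namespace(key) not in owned_namespaces
    ]
```

A producer might run foreign_extensions over each object before export to confirm it emits only under namespaces it owns.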
7.3 Additive-only constraint
Extensions are strictly additive. A producer MUST NOT encode in an extension member any data required for a baseline-correct interpretation of the document. A consumer that ignores every extension member MUST still obtain a complete and correct learning experience. Equivalently: removing all x- members from a conforming document MUST leave a conforming document with equivalent learner-facing meaning.
This keeps extensions from degenerating into a shadow format that fragments the ecosystem.
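The "remove every x- member" test can be approximated mechanically. The sketch below strips extension members at any depth; re-validating the stripped result against the schemas would then cover the structural half of the constraint. Whether the learner-facing meaning is equivalent still needs human judgment.

```python
def strip_extensions(node):
    """Return a copy of node with every x- member removed, at any depth."""
    if isinstance(node, dict):
        return {
            key: strip_extensions(value)
            for key, value in node.items()
            if not key.startswith("x-")
        }
    if isinstance(node, list):
        return [strip_extensions(value) for value in node]
    return node
```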
7.4 Consumer obligations
A consumer MUST NOT reject a document solely because it contains extension members (this restates §5.4 for the namespaced case).
A consumer MUST NOT interpret an extension member outside its own namespace as having any defined meaning. A consumer MAY read and act on extension members within namespaces it understands.
A consumer that imports, modifies, and re-exports a document SHOULD preserve extension members it does not understand, re-attaching each to the same object it arrived on (identified by globalId where the object carries one). A consumer that preserves all unrecognized extension members across a round trip is said to be extension-preserving; a consumer that cannot SHOULD document the loss.
The SHOULD — rather than MUST — acknowledges that some consumers have fixed internal storage with nowhere to hold arbitrary foreign data. But preservation is what lets a tool use LC-JSON as a faithful transfer or backup format for its own tool-specific state: a document that round-trips through an extension-preserving consumer comes back whole, including data that consumer never understood.
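For a consumer whose internal storage cannot hold foreign data inline, one possible approach is to set extension members aside on import, keyed by globalId, and re-attach them on export. The sketch below assumes that approach; the function names are hypothetical, and objects without a globalId are not covered.

```python
def collect_extensions(node, found=None):
    """Map globalId -> {extension key: value} for objects carrying both."""
    found = {} if found is None else found
    if isinstance(node, dict):
        ext = {k: v for k, v in node.items() if k.startswith("x-")}
        if ext and "globalId" in node:
            found[node["globalId"]] = ext
        for value in node.values():
            collect_extensions(value, found)
    elif isinstance(node, list):
        for value in node:
            collect_extensions(value, found)
    return found


def reattach_extensions(node, found):
    """Put saved extension members back on the objects they arrived on."""
    if isinstance(node, dict):
        for key, value in found.get(node.get("globalId"), {}).items():
            node.setdefault(key, value)
        for value in list(node.values()):
            reattach_extensions(value, found)
    elif isinstance(node, list):
        for value in node:
            reattach_extensions(value, found)
    return node
```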
7.5 Producer obligations
A producer MAY emit extension members under namespaces it owns, subject to §7.1–§7.3. A producer MUST keep extension content well-formed JSON. A producer SHOULD prefer extension members over overloading core fields (for example, encoding private state in tags or title) for tool-specific data.
8. Versioning and Stability
8.1 Semantic versioning
Spec versions follow a semver-style scheme: MAJOR.MINOR[.PATCH].
- A major version bump (e.g., 1.x → 2.0) signifies a breaking change. New schemas are published at a new URL path (/2.0/).
- A minor version bump (e.g., 1.0 → 1.1) signifies an additive change. New schemas are published at a new URL path (/1.1/).
- A patch bump signifies non-normative fixes (description text, examples, clarifications). No URL change.
- A release candidate of an upcoming version X.Y carries the version label X.Y-rc.N (where N is 1, 2, …) and is published at its own URL path /X.Y-rc.N/. RCs allow non-breaking refinements between the candidate and the accepted final release; each RC is its own immutable publication. The final X.Y release is published at /X.Y/ only when accepted. Documents pinned to /X.Y-rc.N/ do not auto-promote to /X.Y/; adopting the final release is an explicit publisher choice (typically a re-export against the new schema URL).
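The mapping from version label to URL path described above could be sketched as follows. The regular expression and function name are illustrative; the sketch accepts MAJOR.MINOR, MAJOR.MINOR.PATCH, and MAJOR.MINOR-rc.N labels.

```python
import re

_LABEL = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+)|-rc\.([1-9]\d*))?$")


def schema_path(label):
    """Map a version label to its URL path; a patch bump keeps the same path."""
    match = _LABEL.match(label)
    if not match:
        raise ValueError(f"not a version label: {label!r}")
    major, minor, _patch, rc = match.groups()
    path = f"/{major}.{minor}"
    if rc:
        path += f"-rc.{rc}"
    return path + "/"
```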
8.2 Definition of “breaking”
For the purposes of §8.1, a change is breaking if and only if it causes a previously-conforming document to stop validating under the new schema, or to change in meaning under the new schema (i.e., a field that previously had one interpretation now has another).
Loosening the schema so that a previously-non-conforming document begins to validate is not breaking by this definition: documents that already conformed continue to conform with unchanged meaning. The additive examples below rely on this asymmetry.
Examples of breaking changes:
- Renaming a property.
- Removing an enum value that existing documents may have used.
- Tightening a constraint (e.g., reducing a string’s maxLength below an existing value’s length).
- Adding a new required property.
- Changing a property’s type.
Examples of additive changes:
- Adding an optional property.
- Adding an enum value.
- Loosening a constraint (e.g., increasing maxLength).
- Removing a property from an object’s required list (the field becomes optional).
- Adding an entirely new artifact type with its own documentType value.
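The examples above can be turned into a rough classifier for schema changes. This is a conservative sketch over two object-schema fragments, not a complete schema differ: it treats any maxLength reduction as breaking, only compares enums when both versions declare one, and cannot detect changes in meaning.

```python
def breaking_reasons(old, new):
    """Compare two object-schema fragments; return reasons a change is breaking."""
    reasons = []
    added = set(new.get("required", [])) - set(old.get("required", []))
    if added:
        reasons.append(f"new required properties: {sorted(added)}")
    new_props = new.get("properties", {})
    for name, old_prop in old.get("properties", {}).items():
        if name not in new_props:
            reasons.append(f"{name}: property removed or renamed")
            continue
        new_prop = new_props[name]
        if old_prop.get("type") != new_prop.get("type"):
            reasons.append(f"{name}: type changed")
        if "enum" in old_prop and "enum" in new_prop:
            removed = set(old_prop["enum"]) - set(new_prop["enum"])
            if removed:
                reasons.append(f"{name}: enum values removed: {sorted(removed)}")
        old_max = old_prop.get("maxLength", float("inf"))
        if new_prop.get("maxLength", float("inf")) < old_max:
            reasons.append(f"{name}: maxLength tightened")
    return reasons
```

An empty result suggests the change is additive under the definition in this section.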
8.3 URL stability
Schemas published at any published version path — released versions and release candidates alike — MUST remain available at that URL with byte-identical content (modulo whitespace) for the lifetime of the specification. Specifically:
- https://lc-json.org/1.0/*.schema.json MUST resolve to the 1.0 schemas indefinitely once 1.0 final is published.
- https://lc-json.org/1.0-rc.N/*.schema.json MUST resolve to the rc.N schemas indefinitely once rc.N is published.
- These URLs MUST NOT be redirected to a different schema, even one that is “compatible” or “improved.”
- These URLs MUST NOT be moved to a non-canonical host.
- The /X.Y/ URL path MUST NOT be populated until X.Y final is published; serving rc.N content at /X.Y/ is non-conforming and prevents downstream documents from distinguishing the candidate from the final release.
This guarantee enables conforming documents to embed $schema URLs that remain valid for the document’s entire lifetime in archives, version-control systems, and offline contexts — including across rc.N → final transitions, where rc.N documents continue to validate against their original rc.N URL indefinitely.
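A mirror or archive could spot-check the "byte-identical modulo whitespace" guarantee by comparing whitespace-normalized fingerprints. The sketch below is an approximation: re-serializing through the json module also normalizes string escapes and number spellings, which goes slightly beyond whitespace. Key order is kept, so reordered members count as a change.

```python
import hashlib
import json


def schema_fingerprint(schema_text):
    """Hash a schema with insignificant whitespace removed, key order kept."""
    compact = json.dumps(json.loads(schema_text), separators=(",", ":"))
    return hashlib.sha256(compact.encode("utf-8")).hexdigest()


def unchanged(published_text, served_text):
    """True if the two schema texts differ at most in whitespace."""
    return schema_fingerprint(published_text) == schema_fingerprint(served_text)
```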
8.4 Version-path forward compatibility
A document is validated against the schemas at the URL given in its $schema field — that URL is the document’s canonical schema location and the binding target for conformance. The specVersion field declares the spec version the document targets; the $schema URL identifies the specific schema publication (release or release candidate) it was authored against. Both MUST be present (§3.2) and MUST agree on the targeted version (§4.6, §4.7): a document declaring specVersion: "1.0" MUST point $schema at either /1.0/ (the final release, once published) or a /1.0-rc.N/ candidate path; a document declaring specVersion: "1.1" MUST point $schema at /1.1/ or a /1.1-rc.N/ candidate path.
Reminder (§4.6): specVersion never carries an -rc.N suffix. Every document targeting the 1.0 contract — whether authored against an rc.N candidate or 1.0 final — declares specVersion: "1.0". The specific publication is identified by $schema. For example, a document authored during the rc.3 phase looks like:
{
"$schema": "https://lc-json.org/1.0-rc.3/course.schema.json",
"documentType": "course",
"specVersion": "1.0",
...
}
It follows that:
- A document declaring specVersion: "1.0" with $schema pointing at /1.0/ MUST validate against the schemas published at /1.0/. This is the post-1.0-final scenario; /1.0/ is reserved until 1.0 ships (§8.3).
- A document declaring specVersion: "1.0" with $schema pointing at /1.0-rc.N/ MUST validate against the schemas published at /1.0-rc.N/ and is not required to validate against /1.0/. This is the current rc-cycle case. The rc.N → final transition is an explicit publisher choice (see §8.1, §8.3), namely a re-export against the new $schema URL, not an automatic upgrade.
- A document declaring specVersion: "1.1" SHOULD validate against the schemas at its declared $schema URL and SHOULD also validate against /1.0/ schemas for fields that are unchanged between minor versions.
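The agreement rule between specVersion and $schema could be checked before schema validation runs. The sketch below assumes schema URLs of the form shown in the example above (host lc-json.org, one path segment for the version, one file name ending in .schema.json); the function name and messages are illustrative.

```python
import re

_SCHEMA_URL = re.compile(
    r"^https://lc-json\.org/(\d+\.\d+)(?:-rc\.\d+)?/[^/]+\.schema\.json$"
)


def version_problems(document):
    """Return a list of specVersion/$schema disagreements (empty if consistent)."""
    spec_version = document.get("specVersion")
    schema_url = document.get("$schema")
    if not isinstance(spec_version, str) or not isinstance(schema_url, str):
        return ["specVersion and $schema must both be present"]
    problems = []
    if "-rc." in spec_version:
        problems.append("specVersion must not carry an -rc.N suffix")
    match = _SCHEMA_URL.match(schema_url)
    if not match:
        problems.append("$schema is not a recognized lc-json.org schema URL")
    elif match.group(1) != spec_version:
        problems.append("$schema path and specVersion target different versions")
    return problems
```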
9. Deprecation
A field, discriminator value, or shape may be deprecated in a minor version and removed in a subsequent major version.
9.1 Deprecation marking
Deprecated fields MUST be marked with "deprecated": true in their schema definition and SHOULD include a description referencing their replacement.
9.2 Producer behavior for deprecated fields
A producer MUST NOT emit deprecated fields in new documents. A producer that re-emits previously-imported documents MAY preserve deprecated fields it received, but SHOULD prefer to emit only the canonical replacement.
9.3 Consumer behavior for deprecated fields
A consumer MUST continue to accept deprecated fields for the lifetime of the major version that introduced the deprecation. Removal is permitted only at the next major version bump.
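Because the marking is a plain schema member, producer tooling can discover deprecated fields by walking the schema. The sketch below assumes deprecations are marked on entries under properties, as §9.1 describes; generic Draft-7 validators typically do not act on this member, so a check like this would live in the tool itself. Both function names are hypothetical.

```python
def deprecated_properties(schema, prefix=""):
    """Yield dotted paths of properties marked "deprecated": true."""
    for name, sub in schema.get("properties", {}).items():
        if not isinstance(sub, dict):
            continue  # boolean subschemas carry no markers
        path = f"{prefix}{name}"
        if sub.get("deprecated") is True:
            yield path
        yield from deprecated_properties(sub, prefix=path + ".")


def emitted_deprecated(document, schema):
    """Top-level deprecated fields present in a document (producer-side check)."""
    top_level = {p for p in deprecated_properties(schema) if "." not in p}
    return sorted(top_level.intersection(document))
```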
9.4 Currently deprecated items (1.0)
No items are deprecated in 1.0. The specification ships clean.
10. Conformance Claims
10.1 Base LC-JSON 1.0 conformance
A tool MAY claim conformance to LC-JSON 1.0 as follows:
- “Conforms to LC-JSON 1.0 as a producer” — the tool emits documents satisfying §4.
- “Conforms to LC-JSON 1.0 as a consumer” — the tool ingests documents satisfying §5, §6, §7, and the accessibility-preservation obligations of §12.1.
- “Conforms to LC-JSON 1.0” without qualification — the tool implements both producer and consumer conformance.
10.2 LC-JSON 1.0 Accessibility Profile conformance (opt-in)
A tool that additionally satisfies the obligations in ACCESSIBILITY.md MAY claim:
- “Conforms to the LC-JSON 1.0 Accessibility Profile as a producer”: the tool emits documents satisfying §4 plus the producer-side obligations in ACCESSIBILITY.md §§2–7.
- “Conforms to the LC-JSON 1.0 Accessibility Profile as a consumer”: the tool ingests and renders documents satisfying §5/§6/§7/§12.1 plus the consumer-side MUST-level obligations in ACCESSIBILITY.md §§2–8.
- “Conforms to the LC-JSON 1.0 Accessibility Profile” without qualification: both producer and consumer.
A consumer claiming the Accessibility Profile MUST satisfy all MUST-level items in ACCESSIBILITY.md §§2–8 for its role; partial satisfaction is a misclaim. See §12 for the profile’s binding text.
10.3 Claim accuracy
A tool MUST NOT claim conformance unless it satisfies all applicable MUST requirements. A tool MAY publish self-test results against the conformance test corpus (see tests/) as evidence.
Three rules guard against the predictable misclaims:
- Producer ≠ consumer. Claim only the roles the tool actually satisfies; a producer-side conformance claim does not extend to the consumer role without satisfying §5.
- The Accessibility Profile is fully bound. Claiming the Accessibility Profile means every MUST-level item in ACCESSIBILITY.md §§2–8 (for the claimed role) is satisfied. Partial profile claims are misclaims.
- LC-JSON does not certify WCAG conformance. The LC-JSON Accessibility Profile provides the wire-format affordances and consumer-rendering obligations that enable WCAG 2.1 AA delivery; a delivering consumer’s own WCAG claim (under EN 301 549, DOJ ADA Title II, Section 508, Section 504, or equivalent) is separate and remains the consumer’s responsibility.
10.4 Suggested wording (informative)
Implementers may use the following short forms for marketing pages, badges, READMEs, and footers. They are advisory — formal claims live in §10.1 and §10.2.
Tier 1 — Base LC-JSON 1.0 conformance
| Form | Wording |
|---|---|
| Badge | LC-JSON 1.0 Compatible |
| Sentence | “Reads and writes LC-JSON 1.0 — the open Learning Content JSON specification at lc-json.org.” |
| Formal | “Conforms to LC-JSON 1.0 as a producer / consumer / producer and consumer.” |
Tier 2 — LC-JSON 1.0 Accessibility Profile
| Form | Wording |
|---|---|
| Badge | LC-JSON 1.0 Accessibility Profile |
| Sentence | “Delivers LC-JSON 1.0 content with accessible rendering — keyboard navigation, screen-reader support, captions, language-aware text direction. Conforms to the LC-JSON 1.0 Accessibility Profile.” |
| Formal | “Conforms to the LC-JSON 1.0 Accessibility Profile as a producer / consumer / producer and consumer.” |
Role qualifiers ((producer) / (consumer)) SHOULD accompany the badge or sentence when the implementation supports only one role, so readers do not infer capabilities the tool does not provide.
A Tier 2 claim implies Tier 1 (the Accessibility Profile is additive to base conformance); no double-badging is needed.
10.5 Trademark
Trademark rights in “LC-JSON” and “Learning Content JSON” are not asserted against conformance claims. Any tool meeting the requirements above MAY freely state its conformance and use the suggested wording in §10.4.
11. HTML Safety Profile
LC-JSON 1.0 permits HTML in two fields: ContentItem.html and SignpostItem.customHtml. The complete normative HTML safety profile — allowed elements, allowed attributes, URL-scheme allowlist, sanitization obligation, link normalization, media handling, and unknown-element handling — is specified in HTML_SAFETY.md.
A producer that emits HTML in any HTML-bearing field MUST emit only constructs permitted by HTML_SAFETY.md §2 (elements), §3 (attributes), and §4 (URL schemes).
A consumer that renders HTML from any HTML-bearing field MUST sanitize the HTML against HTML_SAFETY.md §5 before rendering, MUST normalize <a target="_blank"> to include rel="noopener noreferrer" per §6.1, and MUST strip-while-preserving-text any unknown element per §6.2. A consumer MUST reject any document containing forbidden constructs listed under §8.1 (<script>, event handlers, javascript:/vbscript: URLs, etc.).
HTML_SAFETY.md is normative and forms part of LC-JSON 1.0. The split into a separate document reflects its length, not its status.
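The link-normalization obligation above could look like the following for a single element's attributes. The sketch assumes attributes arrive as a name-to-value mapping from whatever HTML parser the consumer uses; the function name is illustrative.

```python
def normalize_blank_target(attrs):
    """Return a copy of an <a> element's attributes with rel hardened."""
    attrs = dict(attrs)
    if (attrs.get("target") or "").lower() == "_blank":
        tokens = (attrs.get("rel") or "").split()
        present = {token.lower() for token in tokens}
        for required in ("noopener", "noreferrer"):
            if required not in present:
                tokens.append(required)
        attrs["rel"] = " ".join(tokens)
    return attrs
```

Existing rel tokens are kept so that author-supplied values survive the normalization.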
12. Accessibility Profile
LC-JSON’s accessibility model distinguishes two layers: preservation of accessibility metadata across read/write cycles (binding on every conforming consumer), and delivery of accessible rendering to end users (binding only when the Accessibility Profile is claimed).
The motivating concern is that accessibility information must survive transformation. In real ecosystems, educational content is exported, imported, translated, edited, and repackaged across multiple tools; accessibility failures most commonly occur during these transformations rather than during original authoring — alt text silently removed during save operations, transcripts discarded during export, localized accessibility text overwritten, unknown accessibility fields stripped by intermediate tools. The accessibility-preservation floor (§12.1) protects the format against that failure mode in every conforming consumer. The Accessibility Profile (§12.2) is the opt-in commitment to also deliver the affordances accessibly.
12.1 Base-conformance accessibility preservation
A conforming consumer that re-emits a document MUST NOT degrade its accessibility shape. Specifically:
- alt attributes on <img> MUST round-trip.
- <track> elements (including kind, src, srclang, label, default) on <video> and <audio> MUST round-trip.
- lang and dir attributes on HTML-bearing elements MUST round-trip.
- The required document-root language field MUST round-trip. The document-root supportLanguage field MUST round-trip when present, including explicit null.
- Reserved-type questions MUST round-trip with any accessibility metadata they carry, per §6.4.
- Extension-preserving consumers (§7.4) SHOULD round-trip x- namespaced extension members that carry accessibility data.
These obligations are part of base LC-JSON conformance; a consumer claiming “Conforms to LC-JSON 1.0 as a consumer” satisfies them. The HTML safety profile (§11 / HTML_SAFETY.md) explicitly allows alt, <track>, lang, and dir on every applicable element class to make this preservation possible.
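A round-trip test for the HTML-level items above could compare the accessibility-bearing attributes before and after a save. The sketch below uses Python's standard html.parser and is deliberately simple: it does not check that a track sits inside a video or audio element, and it covers only the HTML fields, not the document-root language fields.

```python
from collections import Counter
from html.parser import HTMLParser


class _A11yCollector(HTMLParser):
    """Collect alt, track, lang, and dir attributes as (tag, name, value)."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and "alt" in attrs:
            self.found.append(("img", "alt", attrs["alt"]))
        if tag == "track":
            for name in ("kind", "src", "srclang", "label", "default"):
                if name in attrs:
                    self.found.append(("track", name, attrs[name]))
        for name in ("lang", "dir"):
            if name in attrs:
                self.found.append((tag, name, attrs[name]))


def a11y_shape(html):
    collector = _A11yCollector()
    collector.feed(html)
    collector.close()
    return collector.found


def degraded(before_html, after_html):
    """True if any accessibility attribute present before is missing after."""
    return bool(Counter(a11y_shape(before_html)) - Counter(a11y_shape(after_html)))
```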
Base conformance is preservation only: it never requires a producer to author accessibility content (alt text, captions, transcripts). A small or non-institutional producer is therefore never non-conforming for omitting them — the reference validator surfaces omissions as non-blocking warnings. The obligation to author accessibility content is part of the opt-in Accessibility Profile (§12.2). The two-layer split is intentional: accessibility information is never silently stripped or ignored on read/write (base), while the heavier “the content must actually be accessible” bar is opt-in for the products — typically institutional, or those with legal or marketing accessibility commitments — that need it.
12.2 The Accessibility Profile (opt-in)
The accessibility profile defined in ACCESSIBILITY.md — alt-text requirements, video caption obligations for instructional content, keyboard alternatives for structured-task question types, non-color feedback, language-aware rendering, accessible reserved-type placeholders, and validator severities — is bound by an opt-in claim (§10.2).
- A consumer claiming the Accessibility Profile MUST satisfy the structured-task keyboard alternatives (ACCESSIBILITY.md §4), the non-color-feedback obligations (§5), the language/dir rendering obligations (§6), and the reserved-type placeholder accessibility (§7).
- A producer claiming the Accessibility Profile MUST satisfy the producer-side authoring obligations across ACCESSIBILITY.md §§2–7. These include, at minimum: alt on every <img> (§2.1); <track> captions on prerecorded instructional video carrying speech, plus a transcript for that video, and a transcript for prerecorded audio-only instructional content (§3.1); and root language matching the delivery language (§6). These authoring MUSTs apply only under a Profile claim; they are not base-conformance obligations (§12.1).
- Tools that satisfy preservation (§12.1) but not delivery (§12.2) are conforming LC-JSON consumers but are NOT conforming Accessibility Profile consumers, and MUST NOT claim the latter.
12.3 Relationship to WCAG
WCAG governs rendered user experiences; LC-JSON governs portability and metadata preservation. A consumer claiming the LC-JSON Accessibility Profile carries the wire-format affordances and consumer-rendering obligations that WCAG 2.1 AA delivery requires (alt text, captions, language/direction, textual feedback, keyboard alternatives); the consumer’s own jurisdictional WCAG conformance claim (under EN 301 549, DOJ ADA Title II, Section 508, Section 504, or equivalent) remains separate and is the consumer’s responsibility, not LC-JSON’s.
A tool MUST NOT claim WCAG 2.1 AA conformance by virtue of LC-JSON Accessibility Profile conformance alone. LC-JSON does not certify WCAG conformance.
ACCESSIBILITY.md is normative for tools claiming the Accessibility Profile and forms part of LC-JSON 1.0 in that capacity. The split into a separate document reflects the opt-in scope, not a lesser status.
13. Localization and language
LC-JSON 1.x is single-language-per-document. A document declares one delivery language in the root language field; multiple languages are delivered as multiple documents, not as localized field bundles within one document. The full model — the distinct roles of language (delivery), lang/dir (language of parts), and supportLanguage (the optional pedagogical L1 layer), the accepted language-tag forms, and the expectations around assistive-technology pronunciation — is specified in LOCALIZATION.md.
Binding requirements (restated here; full detail in LOCALIZATION.md):
- A producer MUST emit a language root field matching the document’s delivery language.
- Language-tag values (language, supportLanguage, HTML lang) are BCP 47 tags. Producers SHOULD use the bare ISO 639-1 primary subtag unless a region/script subtag carries meaning; a consumer MAY act on only the primary subtag.
- A producer SHOULD mark HTML spans whose language differs from the delivery language with lang (and dir where script direction differs); a consumer MUST preserve lang/dir through sanitization and round-trip (see §12.1).
- lang is the necessary affordance for assistive-technology language switching, but correct pronunciation also depends on the end user’s screen reader and installed voices, which are outside the format’s control. Emitting lang is not optional on that account; it is the floor (LOCALIZATION.md §7).
LOCALIZATION.md is normative for the obligations it states and informative for the pronunciation-expectations discussion. Where it and this document disagree, this document wins.
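The allowance that a consumer MAY act on only the primary subtag can be sketched in a few lines. This simplification splits on the first hyphen and lowercases the result; it does not validate tags or handle grandfathered BCP 47 forms.

```python
def primary_subtag(tag):
    """Return the lowercase primary language subtag of a BCP 47 tag."""
    return tag.split("-", 1)[0].lower()


def same_language(tag_a, tag_b):
    # Compare on the primary subtag only, ignoring region and script.
    return primary_subtag(tag_a) == primary_subtag(tag_b)
```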
14. Validation surface
The requirements in this document are enforced across three sites: the 23 JSON Schemas under schemas/, the reference validator tools/validate_course.py, and the per-document prose in the companion normative documents (HTML_SAFETY.md, ACCESSIBILITY.md, LOCALIZATION.md). VALIDATION.md catalogs every documented rule and tags it with its enforcement tier — schema-enforced, domain-validator-enforced, or advisory — and identifies the forward-looking deepenings scheduled for 1.0 final. Implementers building consumers, validators, or producer round-trip tests should consult VALIDATION.md for the one-map view of what to check.
VALIDATION.md is informative and additive — it introduces no requirements beyond those already stated in this document, in the schemas, or in the companion normative documents. Where it and any of those sources disagree, those sources win.
15. References
- RFC 2119 — Key words for use in RFCs to Indicate Requirement Levels
- RFC 8174 — Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words
- RFC 4122 — A Universally Unique IDentifier (UUID) URN Namespace
- RFC 3986 — Uniform Resource Identifier (URI): Generic Syntax
- BCP 47 — Tags for Identifying Languages
- JSON Schema Draft 7
- LC-JSON HTML safety profile: HTML_SAFETY.md
- LC-JSON accessibility profile: ACCESSIBILITY.md
- LC-JSON localization and language model: LOCALIZATION.md
- LC-JSON validation surface (informative): VALIDATION.md
- LC-JSON glossary (informative): GLOSSARY.md
- LC-JSON schemas: schemas/
- LC-JSON examples: examples/
- LC-JSON conformance test corpus: tests/
LC-JSON HTML Safety Profile
Status: Normative. Referenced from NORMATIVE.md §11.
Spec version: 1.0
Last updated: 2026-05-03
This document defines the HTML subset that LC-JSON (Learning Content JSON) 1.0 documents MAY carry in HTML-bearing fields, the obligations consumers MUST satisfy when rendering it, and the URL-scheme allowlist for embedded references.
The keywords MUST, MUST NOT, SHOULD, SHOULD NOT, MAY, and RECOMMENDED are to be interpreted as described in RFC 2119 and RFC 8174.
1. Scope
1.1 HTML-bearing fields
HTML is permitted in the following fields:
| Field | Carrier | Schema reference |
|---|---|---|
| html | ContentItem | schemas/content-item.schema.json |
| customHtml | SignpostItem | schemas/signpost-item.schema.json |
No other LC-JSON 1.0 field carries HTML. Question prompts, hints, choice text, feedback strings, and similar author-visible prose are plain text. A producer MUST NOT embed HTML in plain-text fields; a consumer MUST treat HTML in plain-text fields as literal text.
1.2 Why this profile exists
Without a portable allowlist, every consumer would sanitize against its own subset, and the same document would render differently — sometimes unsafely — across implementations. This profile fixes the contract:
- Producers know what they MAY emit and have rendered consistently.
- Consumers know what they MUST accept, what they MUST sanitize away, and where the line falls between “render-time stripping” and “reject the document.”
- Third-party implementers have a single reference for <script>, event handlers, <iframe>, target="_blank", data: URLs, and the rest of the long tail.
The profile is deliberately strict-enough-to-be-safe, lenient-enough-to-author. Decisions throughout favor producer flexibility (any class, an inline-style allowlist that covers real authoring patterns, tel: for adult/corporate audiences) while binding consumer sanitization tightly enough that no conforming consumer can be coerced into XSS by a conforming document.
2. Allowed elements
A conforming consumer MUST render the following HTML elements when they appear in HTML-bearing fields, subject to the attribute allowlist in §3 and the URL-scheme allowlist in §4.
2.1 Block
<p>, <div>, <h1>, <h2>, <h3>, <h4>, <h5>, <h6>, <ul>, <ol>, <li>, <blockquote>, <pre>, <hr>, <table>, <thead>, <tbody>, <tr>, <th>, <td>, <figure>, <figcaption>
2.2 Inline
<a>, <strong>, <em>, <b>, <i>, <u>, <mark>, <small>, <sub>, <sup>, <code>, <br>, <span>, <abbr>, <q>, <time>
2.3 Media
<img>, <video>, <audio>, <source>, <track>
2.4 Forbidden elements
The following elements MUST NOT be emitted by producers and MUST be stripped (along with their entire subtree) by consumers:
<script>, <iframe>, <object>, <embed>, <form>, <input>, <button>, <select>, <textarea>, <style>, <link>, <meta>, <base>, <svg>, <math>, <applet>, <frame>, <frameset>, <noframes>
<svg> and <math> are forbidden inline (the surface for XSS via SVG sanitization is wide and inconsistently understood across libraries). SVG raster equivalents are permitted via <img src="..."> per §4.1; consumers SHOULD NOT inline-render the contents of an SVG fetched this way (the standard <img> rendering pipeline is sufficient and isolates script).
2.5 Unknown elements
When a consumer encounters an element name not listed in §2.1–§2.3 and not in the forbidden list of §2.4, the consumer MUST handle it per §6 (Unknown-element handling). Consumers MUST NOT reject a document on the basis of unknown elements alone.
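The three element classes in this section map naturally onto a lookup. The sketch below encodes the lists from §2.1 to §2.4; the policy labels ("render", "strip-subtree", "unknown") are illustrative names, not spec terms.

```python
BLOCK = frozenset(
    "p div h1 h2 h3 h4 h5 h6 ul ol li blockquote pre hr table thead tbody "
    "tr th td figure figcaption".split()
)
INLINE = frozenset(
    "a strong em b i u mark small sub sup code br span abbr q time".split()
)
MEDIA = frozenset("img video audio source track".split())
FORBIDDEN = frozenset(
    "script iframe object embed form input button select textarea style "
    "link meta base svg math applet frame frameset noframes".split()
)


def element_policy(name):
    """Return "render", "strip-subtree", or "unknown" for an element name."""
    name = name.lower()
    if name in FORBIDDEN:
        return "strip-subtree"
    if name in BLOCK or name in INLINE or name in MEDIA:
        return "render"
    return "unknown"  # handled per the unknown-element rules, never rejected
```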
3. Allowed attributes
3.1 Universal attributes
The following attributes MAY appear on every element listed in §2.1–§2.3:
| Attribute | Purpose | Notes |
|---|---|---|
| id | Anchor target | SHOULD be document-unique; consumers MAY rewrite to namespace within their UI |
| class | Author-defined CSS hooks | See §3.2 |
| title | Tooltip / accessible name | |
| lang | Language override (BCP 47) | |
| dir | Text direction (ltr, rtl, auto) | |
3.2 The class attribute
The class attribute is permitted on all allowed elements. Values are author-defined; the spec does not constrain or interpret them. Consumers MUST preserve the class attribute across read/write cycles (the round-trip preservation rule in NORMATIVE.md §6.4 applies). Consumers MAY style classes they recognize; consumers MUST ignore (without stripping) classes they do not recognize.
This is intentional. Different consumers ship different stylesheets — img-medium matters to one consumer, lc-callout matters to another, generic Tailwind classes might appear in a third. The wire format does not arbitrate which class system wins; it preserves the author’s intent and lets each consumer apply its own visual policy.
3.3 Per-element attribute table
In addition to the universal attributes, the following per-element attributes are allowed.
| Element | Attributes | URL-scheme constrained? |
|---|---|---|
| <a> | href, target, rel | href per §4.1 |
| <img> | src, alt (REQUIRED), width, height | src per §4.1 |
| <video> | src, poster, controls, width, height, preload | src, poster per §4.1 |
| <audio> | src, controls, preload | src per §4.1 |
| <source> | src, type | src per §4.1 |
| <track> | src, kind, srclang, label, default | src per §4.1 |
| <table> | border ("1" or absent only) | No |
| <th>, <td> | colspan, rowspan, headers, scope | No |
| <ol> | start, reversed, type | No |
| <li> | value | No |
| <blockquote> | cite | URL per §4.1 |
| <q> | cite | URL per §4.1 |
| <abbr> | (universal only) | No |
| <time> | datetime | No |
<img alt> is REQUIRED. Empty alt="" is permitted (and indicates a decorative image — see ACCESSIBILITY.md §2). Producers MUST emit alt; consumers SHOULD treat a missing alt as a domain-validation warning and render the image.
3.4 Inline style attribute
The style attribute MAY appear on any element listed in §2.1–§2.3. Consumers MUST sanitize CSS properties against the allowlist below; properties outside the allowlist MUST be stripped (the property only — the element and other style properties are preserved).
Allowed CSS properties:
| Category | Properties |
|---|---|
| Sizing | max-width, min-width, width, max-height, min-height, height |
| Spacing | margin, margin-top, margin-right, margin-bottom, margin-left, padding, padding-top, padding-right, padding-bottom, padding-left |
| Borders | border, border-top, border-right, border-bottom, border-left, border-collapse, border-spacing, border-style, border-width, border-color |
| Alignment | text-align, vertical-align |
Property values:
- Lengths in px, em, rem, %, or unitless 0. Negative values permitted where the property allows them. vh/vw/vmin/vmax MAY be permitted at consumer discretion; producers SHOULD NOT emit them.
- Color values for border-color: hex (#abc, #aabbcc), rgb(), rgba(), named CSS colors. currentColor permitted.
- auto is permitted for sizing properties.
Consumers MUST NOT execute CSS expressions, url() references to remote stylesheets, @import directives, or any value that resembles a JavaScript expression (expression(...), behavior:, -moz-binding, etc.). Consumers MUST strip any value that doesn’t lex as a simple length, color, or keyword token.
The narrow allowlist exists because authors need to size images, set table borders, and align cell content — pragmatic affordances that semantic markup alone doesn’t cover. Anything beyond layout (colors, fonts, animations, positioning, transforms) is consumer-skin territory and belongs on a class hook (§3.2).
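The property-level stripping described in §3.4 can be sketched as a small filter. This is a minimal illustration, not a reference sanitizer: the function name `filter_style` and both regular expressions are assumptions made for the example, and the value check is a coarse stand-in for "lexes as a simple length, color, or keyword token".

```python
import re

# Property allowlist transcribed from the §3.4 table.
ALLOWED_PROPERTIES = {
    "max-width", "min-width", "width", "max-height", "min-height", "height",
    "margin", "margin-top", "margin-right", "margin-bottom", "margin-left",
    "padding", "padding-top", "padding-right", "padding-bottom", "padding-left",
    "border", "border-top", "border-right", "border-bottom", "border-left",
    "border-collapse", "border-spacing", "border-style", "border-width",
    "border-color", "text-align", "vertical-align",
}

# Assumption: a "simple" value uses only these characters (lengths, hex
# colors, keywords, rgb()/rgba()). Real CSS lexing is stricter than this.
SIMPLE_VALUE = re.compile(r"^[A-Za-z0-9#%.,()\s-]+$")
SUSPICIOUS = re.compile(r"url\s*\(|expression|behavior|binding|javascript", re.IGNORECASE)


def filter_style(style: str) -> str:
    """Return the style string with disallowed declarations removed."""
    kept = []
    for declaration in style.split(";"):
        name, separator, value = declaration.partition(":")
        name, value = name.strip().lower(), value.strip()
        if not separator or name not in ALLOWED_PROPERTIES:
            continue  # strip this property only; sibling properties survive
        if not SIMPLE_VALUE.match(value) or SUSPICIOUS.search(value):
            continue  # strip values that are not simple tokens
        kept.append(f"{name}: {value}")
    return "; ".join(kept)
```

Only the offending declaration is dropped, so `filter_style("width: 100%; color: red; padding: 8px")` keeps the `width` and `padding` declarations and the element itself is untouched.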
3.5 Forbidden attributes
The following attributes MUST NOT appear on any element. Consumers MUST strip them on render:
- All event handler attributes: any attribute matching on* (e.g., onclick, onload, onmouseover, onerror, onfocus, onblur).
- srcdoc (on any element).
- formaction, formenctype, formmethod, formnovalidate, formtarget (form submission attributes).
data: and other forbidden URL schemes are governed by §4.2; this section does not duplicate that rule.
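As a rough sketch, the forbidden-attribute rule reduces to a name test. The helper names `is_forbidden_attribute` and `strip_forbidden` are hypothetical, and attributes are assumed to arrive as a plain name-to-value dictionary.

```python
# Named attributes from §3.5; event handlers are matched by prefix instead.
FORBIDDEN_ATTRIBUTES = {
    "srcdoc", "formaction", "formenctype", "formmethod",
    "formnovalidate", "formtarget",
}


def is_forbidden_attribute(name: str) -> bool:
    """True for any on* event handler or a named forbidden attribute."""
    lowered = name.strip().lower()
    return lowered.startswith("on") or lowered in FORBIDDEN_ATTRIBUTES


def strip_forbidden(attrs: dict) -> dict:
    """Return a copy of the attributes without forbidden entries."""
    return {name: value for name, value in attrs.items()
            if not is_forbidden_attribute(name)}
```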
4. URL scheme allowlist
4.1 Allowed schemes
For URL-bearing attributes (href, src, poster, cite, <source>.src, <track>.src):
| Scheme | Where allowed | Notes |
|---|---|---|
| https: | All URL-bearing attributes | Always allowed. |
| http: | All URL-bearing attributes | Allowed but discouraged. Mixed-content rendering on HTTPS pages is consumer-defined; consumers SHOULD warn or upgrade. |
| mailto: | <a href> only | Standard mail-link behavior. |
| tel: | <a href> only | See §7. Consumer policy varies by audience. |
| Relative URLs | All URL-bearing attributes | Resolved against the consumer’s content base for the document. Producers MAY use relative paths to reference media bundled alongside the LC-JSON file (e.g., media/images/foo.jpg). |
4.2 Forbidden schemes
The following schemes MUST NOT appear in any URL-bearing attribute. Consumers MUST reject the URL (either by stripping the attribute or by replacing the attribute with a safe placeholder, e.g., href="#"):
javascript:, vbscript:, data:, blob:, file:, chrome:, chrome-extension:, ftp:, ws:, wss:, gopher:, view-source:
data: is forbidden globally — including for <img src>. The XSS surface (SVG-via-data, HTML-via-data, type-confusion attacks via mixed content sniffing) is wider than the authoring convenience justifies. Consumers MUST strip data: URIs even on <img>.
blob: and file: are forbidden because they reference consumer-local memory or filesystem state; their meaning is not portable.
4.3 URL validation
Consumers SHOULD validate URLs against RFC 3986 before rendering. Malformed URLs (whitespace in the middle, control characters, embedded null bytes) MUST be treated as invalid and stripped.
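The scheme rules in §4.1 to §4.3 can be sketched as one predicate. This is an assumption-laden illustration: `url_is_allowed` is a hypothetical name, the check rejects any whitespace or control character rather than performing full RFC 3986 validation, and anything without a scheme is treated as a relative URL.

```python
import re

ALLOWED_EVERYWHERE = {"http", "https"}
ANCHOR_HREF_ONLY = {"mailto", "tel"}

# Whitespace, control characters, and NUL bytes mark a URL as malformed (§4.3).
UNSAFE_CHARACTERS = re.compile(r"[\x00-\x20\x7f]")
# RFC 3986 scheme syntax: a letter, then letters, digits, "+", ".", or "-".
SCHEME = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):")


def url_is_allowed(url: str, element: str, attribute: str) -> bool:
    """Return True when the URL may stay on the given element and attribute."""
    if UNSAFE_CHARACTERS.search(url):
        return False
    match = SCHEME.match(url)
    if match is None:
        return True  # no scheme: relative URL, resolved by the consumer
    scheme = match.group(1).lower()
    if scheme in ALLOWED_EVERYWHERE:
        return True
    if scheme in ANCHOR_HREF_ONLY:
        return element == "a" and attribute == "href"
    return False  # javascript:, data:, blob:, file:, and every other scheme
```

A caller that gets `False` would then strip the attribute or substitute a safe placeholder, as §4.2 allows.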
5. Sanitization obligation
A consumer MUST sanitize HTML from LC-JSON documents before rendering. The HTML in an LC-JSON document is untrusted input from the consumer’s perspective, regardless of the document source.
A producer’s claim of LC-JSON conformance does NOT exempt the consumer from sanitization. Producers can be misconfigured, compromised, or simply buggy; consumers stand alone as the last line of defense.
5.1 Sanitization rules summary
A conforming consumer MUST:
- Strip every element not listed in §2.1–§2.3, preserving its inner text content per §6.
- Strip every attribute not listed in §3, preserving the element.
- Strip every event handler attribute (on*).
- Strip every URL with a scheme outside §4.1.
- Strip every CSS property in inline style outside the §3.4 allowlist.
- Normalize <a target="_blank"> to include rel="noopener noreferrer" per §6.1, even when the producer omitted it.
- Reject the entire document if it contains any element from the §2.4 forbidden list (<script>, <iframe>, etc.) or any on* event-handler attribute or any javascript:/vbscript: URL. See §8 for validator severity.
5.2 Reference implementations (informative)
The following sanitizer configurations are known to align with this profile:
- DOMPurify (JavaScript): configure ALLOWED_TAGS and ALLOWED_ATTR from §2.1–§2.3 and §3.
- Bleach (Python): bleach.clean(text, tags=..., attributes=..., protocols=['http','https','mailto','tel']).
- HtmlSanitizer (.NET): equivalent allowlist configuration.
These are reference points only. Conformance is judged against the rules in this document, not against any specific library’s defaults.
6. Link safety, link normalization, and unknown-element handling
6.1 target="_blank" rel-normalization
A producer that emits <a target="_blank"> SHOULD also emit rel="noopener noreferrer".
A consumer MUST normalize <a target="_blank"> to include rel="noopener noreferrer" on render, adding the tokens if the producer omitted them. This applies even to documents that otherwise satisfy producer conformance — the consumer has the last word on render.
The reverse-tabnabbing risk that this mitigates is well-documented; the cost of producing rel="noopener noreferrer" is zero. Producers SHOULD save consumers the work, but consumers cannot rely on producers to do so.
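A minimal sketch of the consumer-side normalization, assuming the attributes of an <a> element are available as a dictionary. The function name `normalize_rel` is hypothetical; existing rel tokens are kept and only the missing ones are appended.

```python
def normalize_rel(attrs: dict) -> dict:
    """Return a copy of <a> attributes with rel tokens added for target="_blank"."""
    result = dict(attrs)
    if (result.get("target") or "").strip().lower() != "_blank":
        return result  # nothing to normalize
    tokens = (result.get("rel") or "").split()
    present = {token.lower() for token in tokens}
    for required in ("noopener", "noreferrer"):
        if required not in present:
            tokens.append(required)
    result["rel"] = " ".join(tokens)
    return result
```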
6.2 Unknown-element handling
When a consumer encounters an HTML element whose name is not in §2.1–§2.3 and not in the §2.4 forbidden list, the consumer:
- MUST strip the element while preserving its text content. <unknown>hello world</unknown> becomes hello world.
- SHOULD log a warning (form is consumer-defined).
- MUST NOT reject the document for unknown elements alone. Forward-compatibility for HTML extensions is preserved by graceful degradation, not by strict rejection.
This mirrors NORMATIVE §6’s handling of reserved/unknown question types: degrade gracefully, never fail-closed on names you don’t recognize. The contract is symmetrical across both surfaces.
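The strip-while-preserving-text behavior can be sketched with Python's standard `html.parser`. Several things here are assumptions: `ALLOWED_ELEMENTS` is a tiny stand-in for the §2.1–§2.3 allowlist (which is defined outside this section), the class and function names are hypothetical, allowed start tags are re-emitted verbatim without attribute sanitization, and forbidden elements are assumed to have been rejected in an earlier pass.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: a small stand-in for the real §2.1–§2.3 allowlist.
ALLOWED_ELEMENTS = {"p", "a", "strong", "em", "h2", "ul", "ol", "li", "br"}


class UnknownElementStripper(HTMLParser):
    """Drops unknown tags, keeps their text, and records a warning."""

    def __init__(self):
        super().__init__()
        self.out = []
        self.warnings = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_ELEMENTS:
            self.out.append(self.get_starttag_text())  # attributes not sanitized here
        else:
            self.warnings.append(f"unknown element <{tag}> stripped")

    def handle_startendtag(self, tag, attrs):
        self.handle_starttag(tag, attrs)  # self-closing tag: no end tag to emit

    def handle_endtag(self, tag):
        if tag in ALLOWED_ELEMENTS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))  # text always survives


def strip_unknown_elements(html: str):
    """Return (cleaned_html, warnings); never rejects for unknown names."""
    parser = UnknownElementStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out), parser.warnings
```

The function returns warnings instead of raising, which matches the rule that unknown elements alone are never grounds for rejection.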
6.3 Unknown-attribute handling
When a consumer encounters an attribute not listed in §3, the consumer MUST strip the attribute while preserving the element. Unknown attributes are not grounds for rejecting the document.
6.4 Unknown CSS properties
When a consumer encounters a CSS property in style="..." not listed in §3.4, the consumer MUST strip the property while preserving the element and the other (allowed) properties. Unknown properties are not grounds for rejecting the document.
7. Media handling
7.1 <video>
- src MUST be https:, http:, or relative.
- Consumers MUST NOT auto-play. Producers MUST NOT emit autoplay or loop. Consumers SHOULD ignore these attributes if a non-conforming producer emits them.
- controls SHOULD be present (consumer policy MAY hide them, but the wire intent is “user-driven playback”).
- Inner <source> elements MAY appear; consumers MUST process them per the same URL-scheme allowlist (§4.1).
- Inner <track> elements with kind="captions" or kind="subtitles" SHOULD be present for video content. Accessibility requirements for captions are codified separately in ACCESSIBILITY.md §3.
- poster URL MUST satisfy §4.1.
7.2 <audio>
- src MUST be https:, http:, or relative.
- Consumers MUST NOT auto-play. Producers MUST NOT emit autoplay or loop.
- controls SHOULD be present.
- Inner <source> elements MAY appear.
7.3 Bandwidth and preload
preload accepts "none", "metadata", "auto". Consumers SHOULD respect the producer’s preload hint but MAY override for bandwidth, storage, or accessibility reasons.
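One way a consumer might apply the autoplay, loop, and preload rules to a media element's attributes is sketched below. The function name `normalize_media_attrs` is hypothetical, attribute names are assumed to be lowercase already, and dropping an unrecognized preload value is an assumption of this example rather than something the profile spells out.

```python
ALLOWED_PRELOAD = {"none", "metadata", "auto"}


def normalize_media_attrs(attrs: dict) -> dict:
    """Return <video>/<audio> attributes with autoplay and loop ignored."""
    result = {name: value for name, value in attrs.items()
              if name not in ("autoplay", "loop")}
    preload = result.get("preload")
    if preload is not None and preload.lower() not in ALLOWED_PRELOAD:
        del result["preload"]  # assumption: unknown hints are dropped
    return result
```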
7.4 Format compatibility
LC-JSON does not mandate specific media codecs. Producers SHOULD use widely-compatible formats (H.264 + AAC in MP4 for video; MP3, AAC, or Opus for audio) and SHOULD provide multiple <source> fallbacks where format compatibility matters.
7.5 <track> for captions and subtitles
<track src> MUST satisfy §4.1. kind accepts "subtitles", "captions", "descriptions", "chapters", "metadata". srclang is a BCP 47 language tag (RECOMMENDED for subtitles and captions).
8. Validator severity
A reference validator (or any consumer’s pre-render validation pass) SHOULD classify HTML profile violations as follows.
8.1 Errors (validator MUST reject)
These violations indicate a security-critical XSS surface or a structural violation that no consumer can render safely:
- Any forbidden element listed in §2.4.
- Any event handler attribute (onclick, onload, onmouseover, etc.).
- Any URL with scheme javascript: or vbscript:.
8.2 Warnings (validator MAY accept; consumer SHOULD strip)
These violations are sanitizable and not security-critical. The validator reports them so producers can fix their output, but the document is still useful:
- Unknown elements (per §2.5, §6.2).
- Unknown attributes (per §3.5, §6.3).
- CSS properties outside the §3.4 allowlist (per §6.4).
- URL schemes outside §4.1 but not listed in §4.2 (rare; mostly relative-URL edge cases).
- tel: URLs (per §7; consumer-policy gated, and some audiences disable them).
- Missing rel="noopener noreferrer" on <a target="_blank"> (per §6.1; the consumer auto-normalizes).
- Missing alt on <img> (cross-references ACCESSIBILITY.md §2).
- data: URLs (forbidden per §4.2, but a warning rather than an error because the consumer-side mitigation, stripping the data: URL before rendering, degrades gracefully to a broken image, not an XSS surface. The forbidden-scheme rule still binds; the validator severity choice is “tell the author the image won’t render anywhere,” not “reject this otherwise-fine document.”)
8.3 Why this split
Errors fail the build. Warnings notify the author but don’t break interop. The line between them is “could a consumer render this document safely if it tried?” — yes for warnings, no for errors. Producers SHOULD treat warnings as actionable; consumers MUST sanitize regardless.
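The error/warning split can be illustrated with a small scanner built on the standard `html.parser`. It is a sketch under stated assumptions: `FORBIDDEN_ELEMENTS` lists only the elements this document names explicitly as forbidden (the full §2.4 list lives elsewhere), only a few of the §8.2 warnings are covered, and `SeverityScanner` and `validate` are hypothetical names.

```python
from html.parser import HTMLParser

# Assumption: partial stand-in for the §2.4 forbidden list.
FORBIDDEN_ELEMENTS = {"script", "iframe", "form", "svg"}
URL_ATTRIBUTES = {"href", "src", "poster", "cite"}


class SeverityScanner(HTMLParser):
    """Collects (severity, message) findings for a fragment of HTML."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        if tag in FORBIDDEN_ELEMENTS:
            self.findings.append(("error", f"forbidden element <{tag}>"))
        for name, value in attrs:
            url = (value or "").strip().lower()
            if name.startswith("on"):
                self.findings.append(("error", f"event handler attribute {name}"))
            elif name in URL_ATTRIBUTES and url.startswith(("javascript:", "vbscript:")):
                self.findings.append(("error", f"script URL in {name}"))
            elif name in URL_ATTRIBUTES and url.startswith(("data:", "tel:")):
                scheme = url.split(":")[0]
                self.findings.append(("warning", f"{scheme}: URL in {name}"))


def validate(html: str):
    """Return (rejected, findings); any error rejects the document."""
    scanner = SeverityScanner()
    scanner.feed(html)
    scanner.close()
    rejected = any(severity == "error" for severity, _ in scanner.findings)
    return rejected, scanner.findings
```

Warnings never flip `rejected`, so a document with only a data: image still passes validation and is left for the sanitizer to clean up.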
9. Round-trip preservation
NORMATIVE §6.4 requires consumers to preserve every member of reserved-type questions across read/write cycles (semantic preservation; key order is producer-discretion per §6.2). The same principle applies to HTML content with one important softening: a consumer that re-exports an LC-JSON document MAY emit the sanitized HTML rather than the input HTML, provided that:
- No allowed elements, attributes, or CSS properties (per §2 and §3) are lost.
- Element classes (per §3.2) are preserved verbatim.
- Authored text content is preserved.
- Semantic structure (heading levels, list nesting, table rows/cells) is preserved.
In other words: consumers MAY drop content the spec requires them to strip anyway (<script>, onclick, data: URLs). Consumers MUST NOT drop content they’re not required to strip. This protects authors from silent edit-on-import without forcing consumers to round-trip security-critical violations.
A consumer that imports a document containing forbidden content under §8.1 MUST report the violation to the user; the consumer MAY refuse to round-trip such a document at all.
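A consumer could self-check its re-export against the sanitized input with a structural comparison along these lines. This is only a sketch: `signature` and `round_trip_preserved` are hypothetical names, the comparison covers element order, class values, and text, and collapsing whitespace before comparing text is an assumption of the example.

```python
from html.parser import HTMLParser


class _Signature(HTMLParser):
    """Records element order, class values, and text content."""

    def __init__(self):
        super().__init__()
        self.elements = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.elements.append((tag, dict(attrs).get("class")))

    def handle_data(self, data):
        self.text.append(data)


def signature(html: str):
    parser = _Signature()
    parser.feed(html)
    parser.close()
    # Assumption: whitespace runs are collapsed before text is compared.
    return parser.elements, " ".join("".join(parser.text).split())


def round_trip_preserved(sanitized_html: str, exported_html: str) -> bool:
    """True when structure, classes, and text match between the two forms."""
    return signature(sanitized_html) == signature(exported_html)
```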
10. Examples
10.1 Minimal conforming HTML
{
"type": "content",
"globalId": "...",
"title": "Reading",
"html": "<h2>Section 1</h2>\n<p>Some text with <strong>emphasis</strong> and <a href=\"https://example.org\">a link</a>.</p>"
}
10.2 Image with class hook
{
"html": "<p>The diagram below shows the cycle:</p>\n<img src=\"media/cycle.png\" alt=\"Carbon cycle diagram\" class=\"img-medium\" />"
}
10.3 Video with captions
{
"html": "<video src=\"media/lecture.mp4\" controls poster=\"media/lecture-thumb.jpg\" preload=\"metadata\" width=\"640\">\n <track src=\"media/lecture.vtt\" kind=\"captions\" srclang=\"en\" label=\"English\" default />\n</video>"
}
10.4 Table with allowed inline styles
{
"html": "<table border=\"1\" style=\"border-collapse: collapse; width: 100%;\">\n <thead>\n <tr><th style=\"padding: 8px; text-align: left;\">Country</th><th style=\"padding: 8px;\">Capital</th></tr>\n </thead>\n <tbody>\n <tr><td style=\"padding: 8px;\">France</td><td style=\"padding: 8px;\">Paris</td></tr>\n </tbody>\n</table>"
}
10.5 Link with target="_blank" and rel
{
"html": "<p>Read more on <a href=\"https://en.wikipedia.org/wiki/Photosynthesis\" target=\"_blank\" rel=\"noopener noreferrer\">Wikipedia</a>.</p>"
}
10.6 What to avoid
<!-- ✗ <script> is forbidden — validator MUST reject -->
<script>alert("hi")</script>
<!-- ✗ event handler — validator MUST reject -->
<a href="https://example.com" onclick="track()">click</a>
<!-- ✗ javascript: URL — validator MUST reject -->
<a href="javascript:void(0)">click</a>
<!-- ✗ data: URL — consumer strips, validator warns -->
<img src="data:image/png;base64,..." alt="..." />
<!-- ✗ inline-rendered SVG — element forbidden -->
<svg><circle cx="50" cy="50" r="40" /></svg>
<!-- ✓ SVG referenced through img src is fine -->
<img src="https://example.org/logo.svg" alt="Example logo" />
11. Cross-references
- NORMATIVE.md §11: normative reference to this document
- ITEM_PATTERNS.md §3: tel: consumer policy as one example of consumer plurality
- schemas/content-item.schema.json: html field
- schemas/signpost-item.schema.json: customHtml field
- ACCESSIBILITY.md: alt, captions, keyboard alternatives, language/direction, placeholder accessibility for reserved types, WCAG 2.1 AA cross-references, recommended ARIA patterns (rc.1 release; additive deepenings, namely a per-criterion normative table, expanded ARIA patterns, and conformance fixtures, land in 1.0 final)
- tests/: conformance fixtures including valid/06-html-with-video-track.json and invalid/13-html-with-script.json
12. Summary table
| Category | Producer MUST | Producer SHOULD | Consumer MUST | Consumer SHOULD |
|---|---|---|---|---|
| Allowed elements | Stay within §2.1–§2.3 | Use semantic markup | Render allowed elements; strip forbidden (§2.4); strip-while-preserving-text for unknown (§6.2) | Log warnings on unknown |
| Forbidden elements | Not emit <script>, <iframe>, <form>, etc. | — | Reject document if forbidden present (§8.1) | Surface error to user |
| Attributes | Stay within §3 | Use semantic attributes | Strip unknown attributes (§6.3); strip event handlers always | — |
| Inline style | Stay within §3.4 allowlist | Prefer class hooks | Strip out-of-allowlist properties (§6.4) | — |
| URL schemes | Use https:, http:, mailto:, tel:, or relative | Prefer https: | Reject javascript:/vbscript:; strip data:/blob:/file:/etc. | Warn on http:, tel: |
| target="_blank" | — | Emit rel="noopener noreferrer" (§6.1) | Normalize to add rel="noopener noreferrer" if missing (§6.1) | — |
| <img alt> | Emit alt | Use empty alt="" for decorative | — | Treat missing alt as warning |
| <video>, <audio> autoplay | Not emit autoplay, loop | — | Not auto-play | Ignore autoplay if a non-conforming producer emits it |
| Sanitization | — | — | Sanitize before render, every time | Use a vetted reference implementation (§5.2) |
LC-JSON Accessibility Profile
Status: Released for 1.0-rc.3, and the stable accessibility contract: the obligations stated here carry into 1.0 final (2026-06-30) unchanged — 1.0 is a pure rebase of rc.3. Further deepenings (per-criterion cross-reference table, expanded ARIA patterns, screen-reader timing guidance, --accessibility validator flag + fixtures) are post-1.0, additive, and informative or opt-in — none change the base-vs-Profile contract; see §11. Obligations stated here will not be retracted or contradicted.
Spec version: 1.0 (release candidate: rc.3)
Last updated: 2026-06-13
This document collects the accessibility expectations for LC-JSON (Learning Content JSON) producers and consumers. The keywords MUST, MUST NOT, SHOULD, SHOULD NOT, MAY, and RECOMMENDED are to be interpreted as in RFC 2119 and RFC 8174. RFC 2119 language binds wire-format obligations; ARIA-pattern guidance is informative — the spec hints at affordances rather than mandating a single canonical UI (see README.md §“Wire Format”). This two-layer split — wire-format affordances versus the duties of the consumer that ultimately delivers the content — is the organizing principle of this document.
1. Scope
LC-JSON is a portable interchange format. The wire format does not render anything itself — accessibility outcomes depend on consumer rendering. The role of this document is to:
- Specify producer obligations that make accessible rendering possible (alt text, captions, language tags).
- Specify consumer obligations for rendering that don’t drop accessibility affordances the producer already provided.
- Cross-reference HTML_SAFETY.md where HTML-bearing fields constrain accessibility-bearing markup (<img alt>, <track>, lang, dir).
Accessibility for the authoring tools that produce LC-JSON, and for the delivery surfaces that render it, is the responsibility of those tools — not the wire format. This document binds the wire-format obligations only.
1.1 The two-layer duty (informative)
Accessibility is achievable downstream if and only if the format can carry the affordances a renderer needs, and the rendering consumer surfaces them correctly. These are two distinct, non-interchangeable layers:
- The wire format (LC-JSON). Cannot produce an accessible experience on its own; it can only enable one — by carrying alt text, captions, language/direction signals, textual feedback, and position semantics for structured tasks. If the format cannot represent an affordance, no conforming consumer can ever deliver it.
- The consumer (the renderer). Where a disabled end-user actually meets the content, and therefore where every accessibility law attaches. The consumer’s duty is to surface the affordances the producer provided.
A perfectly capable format rendered by a non-conformant consumer is still inaccessible. A conformant consumer cannot rescue a format that never carried the affordance. Both layers must hold.
1.2 Legal context (informative)
WCAG governs rendered user experiences; LC-JSON governs portability and metadata preservation. These layers are complementary but distinct — see NORMATIVE.md §12.3.
The technical accessibility target for educational and commercial delivery in the EU and US converges on WCAG 2.1 Level AA:
- EU — European Accessibility Act, Directive (EU) 2019/882 (applicable since 28 June 2025): points at the harmonized standard EN 301 549, which references WCAG 2.1 AA.
- EU — Web Accessibility Directive (EU) 2016/2102: binds public-sector bodies (public universities, schools) to EN 301 549 → WCAG 2.1 AA.
- US — DOJ ADA Title II final rule (April 2024): explicitly adopts WCAG 2.1 AA for state/local government, including public schools and universities, with compliance deadlines in April 2026 / April 2027.
- US — Section 508 / Section 504: WCAG-based conformance for federal procurement and recipients of federal financial assistance.
LC-JSON supports WCAG 2.1 AA by carrying the wire-format affordances a conforming renderer needs (alt text, captions, lang/dir signals, textual feedback, position semantics for structured tasks). The delivering consumer remains responsible for the full WCAG conformance claim under its applicable jurisdiction.
1.3 ATAG vs WCAG (producers vs consumers)
- Web consumers (the renderers that display LC-JSON content to learners) fall under WCAG. The obligations in §§2–7 are written from this perspective.
- Authoring tools that produce LC-JSON (whether browser-based editors, desktop applications, AI-assisted authoring scripts, or import converters) fall under W3C ATAG 2.0, not WCAG. ATAG covers two things: (a) making the authoring environment itself accessible to author-users (ATAG Part A), and (b) supporting authors in producing accessible content, e.g. prompting for alt text, captions, transcripts (ATAG Part B). Producer obligations in this document map to ATAG Part B: the authoring tool’s job is to make the affordances easy to author and hard to forget.
- Desktop authoring tools are additionally outside WCAG’s scope entirely: WCAG governs web content. Desktop accessibility is governed by platform standards (e.g. UIAutomation on Windows; Section 508 / EN 301 549 software clauses).
1.4 Five conformance requirements for a WCAG 2.1 AA claim (informative)
A WCAG 2.1 AA claim by a delivering consumer is valid only if all five hold; failing any one voids the claim independent of per-criterion passes:
- Conformance level — all Level A and Level AA criteria are met (50 criteria total: 30 A + 20 AA).
- Full pages — conformance is claimed for complete pages, including dynamically loaded states; partial-page exclusions are not permitted.
- Complete processes — every page in a multi-step process must conform (e.g. login → course → item → submission → results). A conformant results page after a non-conformant quiz flow does not pass.
- Accessibility-supported technologies — reliance only on technologies that work with assistive technology (HTML + ARIA + native form controls is the baseline).
- Non-interference — even non-relied-on content must not break 1.4.2 (Audio Control), 2.1.2 (No Keyboard Trap), 2.2.2 (Pause, Stop, Hide), or 2.3.1 (Three Flashes or Below Threshold).
This document specifies what LC-JSON producers and consumers must do to make the per-criterion items achievable. The five claim-level gates are properties of a delivering consumer, not of the wire format.
1.5 Versions targeted
This profile targets WCAG 2.1 Level AA as the primary claim baseline. Selected criteria from WCAG 2.2 are designed-in for new interactive components to avoid near-term rework:
- 2.5.7 Dragging Movements (2.2, AA): every drag interaction MUST ship a single-pointer, non-drag alternative. Governs structured-task question types in §4.
- 2.5.8 Target Size (Minimum) (2.2, AA) — interactive targets SHOULD be ≥ 24×24 CSS px.
Other WCAG 2.2 additions are out of the 2.1 claim baseline. 4.1.1 Parsing is treated as satisfied-by-default (modern browsers/AT; W3C errata; obsolete in WCAG 2.2).
2. Image alt text — WCAG 1.1.1
2.1 Producer obligations
When claiming Accessibility Profile conformance, a producer MUST emit an alt attribute on every <img> element in HTML-bearing fields (HTML_SAFETY.md §3.3). This satisfies WCAG 1.1.1 Non-text Content at Level A.
Outside an Accessibility Profile claim, authoring alt is not a base-conformance requirement: a producer that omits alt is still a conforming LC-JSON producer (the reference validator emits a non-blocking WARN per HTML_SAFETY.md §8.2). What base conformance does require is preservation — a consumer MUST NOT strip an alt that is present (NORMATIVE.md §12.1). The distinction is deliberate: a small producer is never blocked for an alt-less image, but accessibility information, once authored, is never silently dropped.
- For informative images (diagrams, screenshots, photographs that carry meaning), alt MUST be a meaningful textual description.
- For decorative images (visual flourishes, spacers, redundant illustrations of adjacent text), alt="" (empty string) is RECOMMENDED. An empty alt is a positive signal to assistive technology that the image carries no content; it is not a missing attribute.
Question types that carry image references in tool-specific extension fields (e.g. reserved-type hotspot, graphicGapMatch) SHOULD include an alt-text-equivalent property when those types are promoted to first-class schemas (1.0 final, see §11).
2.2 Consumer obligations
A consumer MUST expose the alt text to assistive technology whenever an image is rendered. A consumer that strips <img> (e.g. when sanitization fails) MUST surface the alt text as fallback content rather than silently dropping the image entirely.
A missing alt attribute SHOULD trigger a domain-validation warning per HTML_SAFETY.md §8.2; the consumer MUST still render the image (the failure mode is a warning to the author, not a refused document).
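The distinction between a missing alt and an empty alt can be checked mechanically. The sketch below uses the standard `html.parser`; `AltAudit` and `audit_alt` are hypothetical names, and whether an informative image's alt text is actually meaningful is left to human review.

```python
from html.parser import HTMLParser


class AltAudit(HTMLParser):
    """Sorts <img> elements into missing-alt and decorative (alt="") groups."""

    def __init__(self):
        super().__init__()
        self.missing = []     # src values of images with no alt attribute
        self.decorative = []  # src values of images with an empty alt

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src", "")
        if "alt" not in attributes:
            self.missing.append(src)       # warning, but the image still renders
        elif not (attributes["alt"] or "").strip():
            self.decorative.append(src)    # positive "no content" signal


def audit_alt(html: str):
    audit = AltAudit()
    audit.feed(html)
    audit.close()
    return audit.missing, audit.decorative
```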
3. Video and audio: captions, transcripts, descriptions — WCAG 1.2.1, 1.2.2, 1.2.3, 1.2.5
3.1 Producer obligations
For prerecorded instructional video that contains speech or meaningful audio, producers MUST emit at least one <track kind="captions"> or <track kind="subtitles"> element with a valid src and srclang when claiming Accessibility Profile conformance. This satisfies WCAG 1.2.2 Captions (Prerecorded) at Level A. WebVTT is the RECOMMENDED caption format (broad browser support; AT-compatible).
For all other <video> content (decorative, non-speech, ambient), producers SHOULD emit a <track kind="captions"> or <track kind="subtitles"> element where the content carries any information the learner is expected to receive (HTML_SAFETY.md §7.5).
When claiming Accessibility Profile conformance, producers MUST provide a transcript for prerecorded instructional content that carries speech — either as adjacent ContentItem.html prose or as a linked resource:
- For audio-only instructional content (e.g. an <audio> listening passage), the transcript is the text alternative required by WCAG 1.2.1 Audio-only (Prerecorded) at Level A.
- For instructional video, the transcript is required in addition to the captions above; it satisfies WCAG 1.2.3 (media alternative) and serves learners who cannot use synchronized captions (deafblind users on a braille display, users who need to read at their own pace).
Outside an Accessibility Profile claim, a transcript is RECOMMENDED but not required — base conformance never compels a small producer to author one. As with alt (§2.1), the base floor is preservation, not production: a transcript or <track> already present MUST round-trip (NORMATIVE.md §12.1).
<track kind="descriptions"> (audio descriptions of visual-only information) is RECOMMENDED for video where visual content is essential to the pedagogy and not redundantly narrated. This pairs with WCAG 1.2.5 Audio Description (Prerecorded) at AA.
3.2 Consumer obligations
A consumer that renders <video> or <audio> MUST surface caption/subtitle controls when <track> elements are present. A consumer MUST NOT auto-play media (HTML_SAFETY.md §7.1, §7.2) — the <video> rendering pipeline is user-driven, which is itself an accessibility requirement (motion-sensitivity, screen-reader interruption, bandwidth control). Auto-play would also violate WCAG 1.4.2 Audio Control.
A consumer SHOULD render <track kind="descriptions"> as either a switchable audio track or a synchronized text alternative.
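A producer-side pre-flight check for the caption requirement might look like the sketch below. It assumes every <video> in the fragment is instructional content with speech, which the markup alone cannot tell us; `CaptionAudit` and `videos_missing_captions` are hypothetical names.

```python
from html.parser import HTMLParser


class CaptionAudit(HTMLParser):
    """Tracks whether each <video> has a usable captions or subtitles track."""

    def __init__(self):
        super().__init__()
        self.videos = []   # one bool per <video>, in document order
        self._inside = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "video":
            self.videos.append(False)
            self._inside = True
        elif tag == "track" and self._inside:
            usable = (attributes.get("kind") in ("captions", "subtitles")
                      and bool(attributes.get("src"))
                      and bool(attributes.get("srclang")))
            if usable:
                self.videos[-1] = True

    def handle_endtag(self, tag):
        if tag == "video":
            self._inside = False


def videos_missing_captions(html: str) -> int:
    """Count <video> elements without a track that has kind, src, and srclang."""
    audit = CaptionAudit()
    audit.feed(html)
    audit.close()
    return sum(1 for has_track in audit.videos if not has_track)
```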
4. Keyboard alternatives for structured-task question types — WCAG 2.1.1, 2.5.1, 2.5.2, 2.5.3, 2.5.7 (2.2 designed-in), 4.1.2, 1.3.1
Three implemented question types involve drag-and-drop or pointer-driven interaction — matching, ordering, and placement. The cloze family (wordBankCloze, multiGapCloze) is structurally similar; multipleChoiceCloze is dropdown-based and inherently keyboard-accessible. The reserved-for-2027 hotspot, graphicGapMatch, graphicAssociate, and graphicOrder types compound the same pattern with image regions.
4.1 Consumer obligations
A consumer that renders these types MUST provide a fully keyboard-navigable interaction. Pointer-only implementations are non-conforming for accessibility purposes regardless of LC-JSON conformance. Per WCAG 2.5.7 (designed-in from 2.2), every drag interaction MUST additionally ship a single-pointer, non-drag alternative.
Concretely:
- matching (pairs mode): Tab-to-item, Enter-to-select, Tab-to-match, Enter-to-pair (or equivalent two-step keyboard model) MUST work without a pointer. Native <select> per item is the simplest conforming pattern (see §4.2.2).
- matching (classification mode): Tab through the item pool, Enter-to-select an item, Tab-to-category, Enter-to-place. Many items can target the same category.
- ordering: Up/Down (or Left/Right for orderingUnit: "word") keys MUST move a focused tile within the sequence. The interaction model SHOULD be discoverable from focus state alone. See §4.2.1 for a recommended ARIA pattern.
- placement: Tab through the distractor pool and the gap targets; Enter-to-select an item, Tab-to-gap, Enter-to-place. The interaction MUST work without a pointer regardless of placementUnit mode. A labeled <select> per gap is the simplest conforming pattern.
- wordBankCloze, multiGapCloze: Bank-token selection and gap-placement MUST be reachable by keyboard. multipleChoiceCloze’s <select> rendering is inherently keyboard-accessible and is the RECOMMENDED fallback pattern when richer drag-and-drop interactions cannot be made keyboard-equivalent.
Focus indicators on interactive elements MUST be visible (WCAG 2.4.7 Focus Visible) and SHOULD meet 3:1 contrast against adjacent backgrounds (WCAG 1.4.11 Non-text Contrast).
4.1.1 The accessible alternative is expressible from the document data
The position/target semantics a consumer needs to render a keyboard- and AT-navigable alternative are already carried by the schemas, so the accessible path is expressible from the document rather than improvised at render time: ordering by item position (items[i] is the tile for position i); placement by gap number (@@@N markers correspond to placements[].gap); matching by item↔match value (pairs) or item→category value (classification). Element identity is positional or by value rather than a durable token — sufficient for rendering and for scoring, including repeated values, which are disambiguated by position. A consumer that needs durable per-element identity across systems (for example, portable response or analytics interchange) supplies it at its own layer; the wire format intentionally does not carry it.
4.2 Recommended ARIA patterns (informative)
The following patterns are RECOMMENDED for consumers; they satisfy 4.1.2 (Name, Role, Value), 1.3.1 (Info and Relationships), and 2.5.3 (Label in Name) for the structured-task question types. They are informative — a consumer that satisfies the §4.1 obligations through a different ARIA pattern is conforming.
4.2.1 Ordering
- Bank: role="group" with aria-labelledby pointing at a visible label (“Available tiles” or equivalent).
- Answer area: role="listbox" with aria-orientation="horizontal" for orderingUnit: "word" and aria-orientation="vertical" for sentence/paragraph; aria-labelledby for the answer label; aria-describedby pointing at visible keyboard-and-pointer instructions.
- Slots inside the listbox: role="presentation" so the listbox→option relationship is preserved across intervening layout elements.
- Tiles: tabindex="0" on every tile so all tiles are reachable while arrow keys remain available for movement. Each tile’s accessible name (aria-label) carries the content and position information when placed (e.g. “goes, position 2 of 5”). When a tile is placed, set aria-selected="true".
- Live region: a visually-hidden aria-live="polite" aria-atomic="true" element for movement announcements (tile picked up, tile moved, tile returned to bank). This satisfies WCAG 4.1.3 Status Messages for the interaction’s transient state.
- Single-pointer alternative: click-to-place from the bank, click-to-pick-up from a placed tile, click-to-place at another position. Distinct visual indication when a tile is “picked up” (separate from the focus indicator, since both can show simultaneously).
- Discoverable instructions: keyboard-and-pointer instructions SHOULD be visible (not buried in aria-label) and referenced by aria-describedby on the listbox.
An alternative satisfying the same obligations is the WAI-ARIA Authoring Practices grab/drop model: Space to “grab,” arrows move only while grabbed, single-roving tabindex. The pattern above (per-tile tabindex) is the recommended baseline for short sequences; the grab/drop model scales better for long sequences at the cost of a mode step.
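The per-tile model boils down to two pieces of state logic: the accessible name of a placed tile and the result of an arrow-key move. The sketch below shows both in a UI-independent form; `tile_label` and `move_tile` are hypothetical names, and the wording of the boundary announcement is an assumption.

```python
def tile_label(items: list, index: int) -> str:
    """Accessible name for a placed tile: content plus position."""
    return f"{items[index]}, position {index + 1} of {len(items)}"


def move_tile(items: list, index: int, delta: int):
    """Move a focused tile by delta (-1 for Up/Left, +1 for Down/Right).

    Returns (new_items, new_index, announcement); the announcement is the
    text a polite live region would carry.
    """
    target = index + delta
    if not 0 <= target < len(items):
        # Assumption: announce the boundary instead of wrapping around.
        return list(items), index, f"{items[index]} cannot move further"
    moved = list(items)
    moved[index], moved[target] = moved[target], moved[index]
    return moved, target, tile_label(moved, target)
```

For the five-word sequence `["She", "goes", "to", "school", "daily"]`, `tile_label(items, 1)` yields the "goes, position 2 of 5" form described in the Tiles bullet.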
4.2.2 Matching, Placement
The simplest conforming pattern is native form controls:
- matching (pairs): one <select> per item, options drawn from match values + distractors (shuffled per §5.6 of NORMATIVE.md). Each <select> is labeled with the item text via a visible <label> or aria-labelledby.
- matching (classification): one <select> per item, options drawn from the category labels.
- placement: one <select> per @@@N gap, options drawn from placements[].item values + distractors. Each <select> is labeled with surrounding-passage context or a gap label.
Native <select> is inherently keyboard-accessible, satisfies 2.5.3 by carrying its visible label as its accessible name, and avoids the ARIA-listbox complexity of §4.2.1. Richer drag-and-drop renderings are permitted but MUST ship the keyboard and single-pointer alternatives per §4.1.
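To illustrate how the select-per-gap alternative falls out of the document data, the sketch below derives one control description per @@@N marker. The property names placements, item, and distractors come from this section; the passage field name `text`, the generated "Gap N" label, and the function name `select_controls` are assumptions made for the example, and options are sorted here only to keep the output deterministic.

```python
import re

GAP_MARKER = re.compile(r"@@@(\d+)")


def select_controls(question: dict) -> list:
    """Describe one <select> per gap marker found in the passage."""
    options = sorted(
        {placement["item"] for placement in question["placements"]}
        | set(question.get("distractors", []))
    )
    controls = []
    for match in GAP_MARKER.finditer(question["text"]):  # "text" is hypothetical
        gap = int(match.group(1))
        controls.append({"gap": gap, "label": f"Gap {gap}", "options": options})
    return controls
```

A renderer would turn each entry into a labeled native <select>, which keeps the interaction keyboard-accessible without any drag handling.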
4.2.3 Cloze family
simpleGapFill, wordBankCloze, multiGapCloze, multipleChoiceCloze MAY be rendered with native text inputs (<input type="text">) or selects (<select>). Each gap MUST have a programmatic label — either an associated <label> element, or aria-label, or aria-labelledby pointing at adjacent gap-context prose.
4.3 Producer obligations
Producers MAY include hint text guiding learners who use keyboard or assistive technology, as hint strings on the question or as adjacent ContentItem.html prose. The wire format does not currently carry interaction-specific accessibility hints; this is intentional (consumer-defined affordance), but producers SHOULD assume diverse interaction modalities when authoring.
1.0 final will deepen this with: an aria-grabbed/aria-dropeffect deprecation note, modern aria-activedescendant patterns as an alternative to per-tile tabindex, focus-management requirements during placement, and screen-reader announcement timing requirements for partial-credit feedback.
5. Feedback: not by color alone — WCAG 1.4.1, 4.1.3
Question types that emit feedback (question-types-reference.md, Common Properties — feedback, choiceFeedback) carry textual content. Consumers MUST render this textual feedback in addition to any visual indicators of correctness (green/red highlighting, check/cross icons).
A consumer MUST NOT convey correctness solely through color or icon. Conformant rendering provides at minimum:
- An accessible textual indicator (“Correct”, “Incorrect”, or the producer-supplied feedback string) — WCAG 1.4.1 Use of Color.
- A non-color visual indicator (icon, position, label) for sighted users with color-vision differences.
- An assistive-technology-readable announcement when feedback updates dynamically — WCAG 4.1.3 Status Messages.
This binds consumer rendering. The wire format already carries the textual indicators; the obligation is to render them.
5.1 Recommended live-region pattern (informative)
Per-question feedback that updates dynamically (without a page reload) SHOULD be exposed to assistive technology via an ARIA live region:
- Routine feedback (per-question correct/incorrect, score updates): aria-live="polite" so the announcement does not interrupt the learner’s current speech.
- Critical feedback (final score, submission confirmation, error states): role="alert" (implicit aria-live="assertive").
- Score summaries that change after submission: live region MUST contain the textual indicator before any visual transition begins.
Status-message regions SHOULD NOT receive focus; focus management for status announcements is governed by 4.1.3 — expose to AT without moving focus.
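A minimal sketch of the live-region choice, assuming a consumer that renders feedback markup in Python. The feedback-kind labels and function names are hypothetical; only the aria-live="polite" and role="alert" attributes come from the pattern described here.

```python
from html import escape

# Hypothetical labels for the two feedback classes described in this section.
ROUTINE_KINDS = {"question-result", "score-update"}
CRITICAL_KINDS = {"final-score", "submission-confirmation", "error"}


def live_region_attrs(kind):
    """Pick the live-region attributes for a feedback kind."""
    if kind in CRITICAL_KINDS:
        return {"role": "alert"}  # role="alert" implies aria-live="assertive"
    if kind in ROUTINE_KINDS:
        return {"aria-live": "polite"}
    raise ValueError(f"unknown feedback kind: {kind!r}")


def feedback_text(is_correct, feedback=None):
    """Build the textual indicator, so correctness never depends on color alone."""
    indicator = "Correct" if is_correct else "Incorrect"
    return f"{indicator}. {feedback}" if feedback else indicator


def render_status(kind, text):
    """Render a status region; no tabindex is emitted, so it never takes focus."""
    attrs = " ".join(f'{k}="{v}"' for k, v in live_region_attrs(kind).items())
    return f"<div {attrs}>{escape(text)}</div>"
```

In a real page the region element would normally exist before the text is injected, since many screen readers only announce changes to a live region that is already in the DOM.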
6. Language and direction — WCAG 3.1.1, 3.1.2
The language field requirement is also tied to EN 301 549 5.4 (Closed functionality) for educational content delivery.
6.1 Producer obligations
Every Course or QuestionSet carries a language field (a BCP 47 tag, commonly a bare ISO 639-1 code; see LOCALIZATION.md §3) at the document root. Producers MUST set language to the primary delivery language. When the document carries content in a secondary language (typically the learner’s L1 for [L1:] translation/support tags), producers SHOULD also set supportLanguage.
Within HTML-bearing fields, producers MAY use the lang attribute to mark spans of content in a different language than the document default (per HTML_SAFETY.md §3.1). Producers SHOULD use lang for any in-line foreign-language quotation or term — this satisfies WCAG 3.1.2 Language of Parts.
Producers MUST set the root language to the document’s primary delivery language. If the primary delivery language is a right-to-left language (Arabic, Hebrew, Persian, Urdu, etc.), producers SHOULD indicate document-level direction where the consumer supports it. For embedded RTL passages inside an LTR document — for example, an English lesson that quotes Arabic, Hebrew, Persian, or Urdu in the body — producers SHOULD mark the relevant HTML span or block with local lang and dir attributes (per HTML_SAFETY.md §3.1). See examples/course-rtl-writing-systems.json for a worked LTR-document-with-embedded-RTL example.
6.2 Consumer obligations
A consumer MUST honor the document-level language field when setting the rendering surface’s lang attribute. For web consumers, this means setting <html lang> from the document language rather than hardcoding a single locale. This satisfies WCAG 3.1.1 Language of Page.
A consumer MUST honor the dir attribute on HTML-bearing elements when rendering RTL content; failure to do so produces unintelligible bidirectional text. For RTL document languages (ar, he, fa, ur), a web consumer SHOULD additionally emit <html dir="rtl"> so the browser’s bidirectional algorithm is engaged for the whole rendering surface.
A consumer MUST NOT strip lang or dir attributes during sanitization. Both attributes are explicitly allowed on every element class in HTML_SAFETY.md §3.1.
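The consumer obligations above can be sketched as a small helper that derives the rendering surface's lang and dir from the document root. The function name and the RTL set are assumptions for illustration; the set lists only the four RTL languages named in this section and is not exhaustive.

```python
# The four RTL document languages named in this section; not an exhaustive list.
RTL_PRIMARY_SUBTAGS = {"ar", "he", "fa", "ur"}


def surface_attrs(document):
    """Return the attributes a web consumer would place on <html>.

    Hypothetical helper: it reads only the root `language` field.
    """
    language = document["language"]
    primary = language.split("-")[0].lower()
    attrs = {"lang": language}  # set from the document, never hardcoded
    if primary in RTL_PRIMARY_SUBTAGS:
        attrs["dir"] = "rtl"
    return attrs
```

Spans that carry their own lang and dir inside HTML-bearing fields are untouched by this step; they only need to survive sanitization.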
6.3 Screen-reader pronunciation — what lang can and cannot promise (informative)
Emitting lang on a foreign-language span is necessary but not sufficient for that span to be pronounced correctly by a screen reader. lang is an instruction; whether it is acted on depends on the end user’s environment, which the format and the consumer cannot control: the reader must support automatic language switching and have it enabled (support varies — screen readers such as NVDA and JAWS switch reliably, Windows Narrator’s automatic switching is comparatively limited, VoiceOver sits in between), and the matching voice must be installed (a reader with only an English voice reads a correct lang="es" span in English, mispronouncing it). The producer/consumer duty is therefore to emit and preserve lang/dir faithfully; correct pronunciation is completed by the user’s assistive technology. This does not make lang optional — without it no reader can switch at all. See LOCALIZATION.md §7 for the fuller discussion.
The localization model promised here — the distinct roles of language/lang/supportLanguage, BCP 47 language-tag rules, the single-document-per-language boundary, and the pronunciation-expectations framing above — is specified in LOCALIZATION.md. What remains for a later iteration: explicit RTL rendering tests in the conformance corpus.
7. Reserved and unknown question types: placeholder accessibility — WCAG 1.3.1, 4.1.2
Per NORMATIVE.md §6, consumers MUST preserve reserved/unknown question types in full (every field, value, and nested structure) and SHOULD render a non-interactive placeholder for them.
The placeholder MUST be accessible:
- Surfaced to assistive technology with a meaningful description (at minimum: the question’s title, the type name, and the fact that the consumer cannot render this question).
- Distinguishable from rendered questions (so a screen-reader user understands the question is informational, not interactive) — WCAG 1.3.1 Info and Relationships.
- Not announced as “interactive” or “form control” when no interaction is possible — WCAG 4.1.2 Name, Role, Value.
A consumer MUST NOT silently skip the placeholder for assistive-technology users; the §6 round-trip-preservation philosophy applies equally to the rendering surface.
7.1 Recommended placeholder pattern (informative)
A conforming placeholder SHOULD use:
- role="region" — surfaces the placeholder as a labeled landmark, distinguishable from form controls.
- aria-label carrying the question’s title, the unsupported type discriminator, and an indication that the renderer can’t display this type. Recommended template: “Unsupported question: <title>. This question type (<type>) can’t be displayed by this viewer.”
- A visible visual treatment that signals “informational, not interactive” (e.g. a warning or info alert styling).
- No interactive children (<input>, <button>, <select>) — the placeholder is announced as a region, not a form control.
1.0 final will deepen this with: example placeholder text in multiple languages and producer guidance for emitting accessibility metadata on tool-specific extensions to reserved types.
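The recommended placeholder pattern in §7.1 can be sketched in Python as follows. The function names, the "Untitled" fallback, and the CSS class are assumptions for illustration; the label wording follows the recommended template, written with a plain ASCII apostrophe.

```python
from html import escape


def placeholder_label(title, qtype):
    """Fill the recommended template with the question title and type."""
    return (
        f"Unsupported question: {title}. "
        f"This question type ({qtype}) can't be displayed by this viewer."
    )


def render_placeholder(question):
    """Render a non-interactive, labeled region for an unrenderable question.

    Hypothetical helper: "Untitled" and the class name are assumptions.
    """
    label = escape(placeholder_label(question.get("title", "Untitled"), question["type"]))
    # role="region" plus aria-label exposes a labeled landmark; the body has
    # no <input>, <button>, or <select>, so nothing is announced as a control.
    return (
        f'<div role="region" aria-label="{label}" class="placeholder-info">'
        f"<p>{label}</p></div>"
    )
```

The question object itself is left untouched, so round-trip preservation of the reserved type is unaffected by how the placeholder is drawn.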
8. Validator severity (current baseline, established in rc.1)
The reference validator surfaces accessibility issues at the following severities. WCAG SC references are cross-references — accessibility violations in producer output are content-validation issues, not just renderer concerns.
| Issue | Severity | WCAG SC | Cross-reference |
|---|---|---|---|
| Missing alt on <img> | warning | 1.1.1 | HTML_SAFETY.md §8.2; §2 above |
| <video> without <track kind="captions"> or kind="subtitles" | warning | 1.2.2 | §3.1 |
| <iframe>, <script>, event handlers (inaccessible regardless) | error | 4.1.2 (would-be) | HTML_SAFETY.md §8.1 |
| Missing language at document root | error (schema-enforced) | 3.1.1 | §6.1 |
| Reserved-type question without a title | informational note (recommended for placeholder) | 1.3.1, 4.1.2 | §7 |
1.0 final will deepen this with: an --accessibility validator flag (analogous to --strict) for tooling that wants to fail-build on warnings, additional severity entries for the reserved-type placeholder surface, and conformance fixtures exercising accessibility-related warnings/errors.
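To make the HTML rows of the severity table concrete, here is a rough sketch of how such checks could be written with Python's standard html.parser. It is not the reference validator's implementation: the class name, the finding messages, and the decision to treat an empty alt="" as present are all assumptions.

```python
from html.parser import HTMLParser


class A11yScanner(HTMLParser):
    """Collect (severity, message) findings from one HTML fragment."""

    def __init__(self):
        super().__init__()
        self.findings = []
        self._in_video = False
        self._has_text_track = False

    def handle_starttag(self, tag, attrs):
        names = {name for name, _ in attrs}
        if any(name.startswith("on") for name in names):
            self.findings.append(("error", f"event handler on <{tag}>"))
        if tag in ("iframe", "script"):
            self.findings.append(("error", f"<{tag}> is not allowed"))
        elif tag == "img" and "alt" not in names:
            self.findings.append(("warning", "<img> without alt"))
        elif tag == "video":
            self._in_video, self._has_text_track = True, False
        elif tag == "track" and self._in_video:
            if dict(attrs).get("kind") in ("captions", "subtitles"):
                self._has_text_track = True

    def handle_endtag(self, tag):
        if tag == "video" and self._in_video:
            if not self._has_text_track:
                self.findings.append(
                    ("warning", "<video> without captions or subtitles track")
                )
            self._in_video = False


def scan_html(fragment):
    """Scan one HTML-bearing field value and return its findings."""
    scanner = A11yScanner()
    scanner.feed(fragment)
    scanner.close()
    return scanner.findings
```

Warnings here would not fail a document; only the error rows do, matching the severities in the table.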
9. WCAG 2.1 AA mapping (informative)
The table below indexes which sections of this document cover which Success Criteria. This is an informative cross-reference; per-criterion normative obligations live in the section bodies.
| WCAG 2.1 AA SC | Level | Topic | This profile |
|---|---|---|---|
| 1.1.1 Non-text Content | A | Alt text on images | §2 |
| 1.2.1 Audio-only/Video-only | A | Transcript or alt media | §3 |
| 1.2.2 Captions (Prerecorded) | A | <track kind="captions"> | §3 |
| 1.2.3 Audio Desc. or Media Alt. | A | Description track or transcript | §3 |
| 1.2.5 Audio Description (Prerecorded) | AA | <track kind="descriptions"> | §3 |
| 1.3.1 Info and Relationships | A | ARIA roles/labels, structured-task semantics, placeholder landmark | §4, §7 |
| 1.4.1 Use of Color | A | Textual + non-color correctness cues | §5 |
| 1.4.2 Audio Control | A | No autoplay | §3.2 |
| 1.4.11 Non-text Contrast | AA | Focus indicator contrast | §4.1 |
| 2.1.1 Keyboard | A | Keyboard alternatives for structured tasks | §4 |
| 2.4.7 Focus Visible | AA | Visible focus indicators | §4.1 |
| 2.5.1 Pointer Gestures | A | Single-pointer alternatives | §4 |
| 2.5.2 Pointer Cancellation | A | Activation on up-event | §4 (consumer behavior) |
| 2.5.3 Label in Name | A | Accessible name contains visible label | §4.2 |
| 2.5.7 Dragging Movements | AA (2.2 — designed-in) | Single-pointer alternative for every drag | §4 |
| 2.5.8 Target Size (Minimum) | AA (2.2 — designed-in) | ≥ 24×24 px interactive targets | §1.5 |
| 3.1.1 Language of Page | A | Document language → <html lang> | §6 |
| 3.1.2 Language of Parts | AA | Inline lang on foreign-language spans | §6 |
| 4.1.2 Name, Role, Value | A | ARIA semantics on custom widgets and placeholders | §4, §7 |
| 4.1.3 Status Messages | AA | Live regions for dynamic feedback | §5 |
Criteria not listed (e.g. 1.3.2, 2.4.1, 3.2.2, 3.3.x) are properties of a delivering consumer rather than wire-format affordances. A delivering consumer’s full WCAG 2.1 AA claim covers them per its own conformance plan.
10. Cross-references
- NORMATIVE.md — RFC 2119 conformance requirements; language field requirement; reserved-type round-trip; randomization requirements (§5.6).
- HTML_SAFETY.md — <img alt>, <track>, lang, dir, validator severity.
- question-types-reference.md — per-type feedback fields and structured-task definitions.
- GLOSSARY.md — terminology.
- WCAG 2.1 — https://www.w3.org/TR/WCAG21/
- WCAG 2.1 Quick Reference (filterable SC list) — https://www.w3.org/WAI/WCAG21/quickref/
- WAI-ARIA Authoring Practices Guide (APG) — https://www.w3.org/WAI/ARIA/apg/ — pattern recommendations (listbox, grab/drop, status messages).
- ATAG 2.0 — https://www.w3.org/TR/ATAG20/ — authoring-tool obligations referenced in §1.3.
- EN 301 549 — https://www.etsi.org/deliver/etsi_en/301500_301599/301549/ — EU harmonized standard pointing at WCAG 2.1 AA.
11. From rc.3 to 1.0 final and beyond
This document is the 1.0-rc.3 accessibility profile, and its obligations are the stable accessibility contract: the base-conformance preservation floor (NORMATIVE.md §12.1) and the opt-in Accessibility Profile authoring MUSTs (§12.2 — alt, captions, transcripts) are settled as of rc.3 and carry into 1.0 final unchanged. 1.0 final is a pure rebase of rc.3 — it adds no new obligations and tightens nothing.
The deepenings below are post-1.0, additive, and either informative or opt-in: none change the base-vs-Profile contract above, none gate 1.0. They are listed so implementers can see the intended direction.
- Per-criterion cross-reference table — a presentation of §9 mapping each WCAG SC to the obligation already stated in §§2–7. Clarity, not new obligation.
- Expanded ARIA patterns — patterns for matching classification mode, richer announcement guidance for partial-credit feedback (§4), per-language placeholder text examples (§7). Informative.
- Screen-reader timing guidance — announcement timing for auto-grading flows (§5). Informative.
- --accessibility validator flag — analogous to --strict; opt-in tooling that promotes accessibility warnings (missing alt, missing <track> on speech-bearing video) to errors for teams that want to fail-build on them. Opt-in; changes no document’s conformance.
- Conformance fixtures for accessibility — an a11y/ corpus suite exercising the --accessibility flag, beyond the round-trip and missing-language fixtures already in the baseline.
- Reserved-type accessibility metadata schema — guidance for emitting accessibility metadata on hotspot, graphicGapMatch, and the other graphic types when their per-type schemas land (tied to the 1.1 promotion of the reserved types).
- Multilingual accessibility metadata shape — localized alt text / transcripts / accessible-name fields per locale; bounded by the single-language-per-document decision in LOCALIZATION.md §2.4.
Resolved in rc.3 (no longer pending): the authoring obligations for alt, captions, and transcripts were settled as Accessibility Profile MUSTs (§12.2), deliberately not promoted into base NORMATIVE.md — base conformance stays preservation-only so a small or non-institutional producer is never blocked. The BCP 47 / ISO 639-1 language-tag reconciliation also shipped in rc.3 (see LOCALIZATION.md §3).
Implementers building against 1.0-rc.3 can rely on the obligations stated above; 1.0 final (2026-06-30) carries them unchanged.
LC-JSON Localization and Language Model
Status: New in 1.0-rc.3. Clarification document for the 1.0 contract; codifies the language model that has been implicit since 1.0-rc.1; introduces no breaking change. The language root field and lang/dir annotation behave exactly as they did in rc.1/rc.2 — this document states the model explicitly and sets expectations.
Spec version: 1.0 (release candidate: rc.3)
Last updated: 2026-06-13
This document defines how LC-JSON represents natural language: what the language and supportLanguage fields mean, how lang/dir annotate individual spans, which language-tag forms are accepted, and — importantly for implementers — what the format can and cannot promise about pronunciation in assistive technology. The keywords MUST, MUST NOT, SHOULD, SHOULD NOT, MAY, and RECOMMENDED are interpreted as in RFC 2119 and RFC 8174.
1. Scope
The word “language” does four different jobs in a learning document, and conflating them is the most common source of confusion for implementers. This document separates them:
| Concept | Field / mechanism | What it is |
|---|---|---|
| Delivery language | language (root) | The single primary language the document is authored in. |
| Language of parts | lang / dir on HTML | Individual spans in a different language from the delivery language. |
| Support language | supportLanguage (root) | An optional pedagogical layer: the learner’s first language (L1), surfaced to aid comprehension of second-language (L2) content. |
| Translation bundles | (not in 1.x) | Parallel copies of the same content in multiple languages within one document — explicitly out of scope; see §2.4. |
LC-JSON’s wire format is language-neutral: a document may declare any natural language and any script. This document governs how that language is declared and annotated, not which languages are permitted (all are).
2. The four roles of “language”
2.1 Delivery language — language
language is a required root field on both the course and questionSet artifacts. It declares the single primary language the document is authored and delivered in — e.g. "language": "en" means the document is an English document.
- A document has exactly one delivery language. LC-JSON 1.x is single-language-per-document (see §2.4).
- A delivering consumer SHOULD set the rendering surface’s primary language from this field (for a web consumer, <html lang="…">), so that assistive technology, hyphenation, and font selection default correctly.
- language is the document’s identity, not a runtime choice: a consumer does not “switch” a document’s delivery language; it renders the document in the language it declares.
2.2 Language of parts — lang and dir on HTML
Within HTML-bearing fields (ContentItem.html, SignpostItem.customHtml), a run of text in a language other than the delivery language is marked with the standard HTML lang attribute (and dir where the script direction differs). This is the WCAG 3.1.2 Language of Parts mechanism.
{
"type": "content",
"html": "<p>The French call it <span lang=\"fr\">l'esprit de l'escalier</span> — the wit of the staircase.</p>"
}
Language of parts is about correct rendering and pronunciation, not translation: a Spanish span in an English document is content in Spanish, not an English string’s Spanish equivalent. lang/dir are part of the HTML safety profile’s universal-attribute allowlist (HTML_SAFETY.md §3.1) and MUST survive a consumer’s sanitization and round-trip (NORMATIVE.md §12.1).
2.3 Support language — supportLanguage
supportLanguage is an optional root field (nullable). It names the learner’s first language (L1) for a document whose delivery language is a second language (L2) being taught. It exists for the language-teaching case: an English course built for Spanish-speaking learners declares "language": "en", "supportLanguage": "es", signaling that L1 (Spanish) support — glosses, hints, translations of key terms — is appropriate.
supportLanguage is a signal, not a rendering instruction. How a consumer surfaces L1 support — inline glosses, hover tooltips, a toggle, a glossary panel, or not at all — is consumer-defined. One consumer’s convention is an inline bracket tag ([L1: una hipoteca]) that its renderer expands to a lang-annotated span; that is an authoring/rendering convention of that consumer, not a wire-format construct. The wire format carries supportLanguage plus ordinary text and lang-annotated parts; the pedagogy is layered on by the consumer.
When supportLanguage is absent or null, no L1 support is implied and a consumer SHOULD render the document monolingually.
2.4 Out of scope for 1.x — translation bundles
LC-JSON 1.x does not provide field-level localization. There is no shape in which a single field carries parallel translations (no "title": {"en": "...", "es": "..."} maps, no per-locale field bundles). A document is authored in one delivery language.
Multiple languages are delivered as multiple documents. An English course and its Spanish translation are two separate LC-JSON documents, each with its own single language. This keeps the wire format simple, keeps validation unambiguous, and matches how content interchange formats in adjacent ecosystems treat translation (as separate artifacts, not multiplexed fields).
This is a deliberate boundary, not an oversight. Producers MUST NOT assume a future minor version will add localized field bundles; if it ever does, it will be additive and will not change the meaning of the single-language documents defined here.
3. Language tags
Language-tag values (language, supportLanguage, and HTML lang) are BCP 47 language tags.
- The common case is a bare ISO 639-1 primary subtag: en, es, fr, ar. Producers SHOULD use the bare primary subtag when region and script do not matter.
- Region and script subtags are permitted where they carry meaning: pt-BR, es-MX, en-GB, zh-Hant. These are most useful for selecting a regional voice or regional spelling.
- A conforming consumer MAY act on only the primary subtag (treating es-MX as es) when it has no region-specific behavior. Producers therefore SHOULD NOT rely on a consumer honoring a region subtag, but MAY emit one so that consumers which do (for example, choosing a regional text-to-speech voice) can use it.
The reference validator performs a plausibility check on these fields (well-formed primary subtag, optional script, optional region) and emits a WARN — not an error — on a malformed tag. It does not validate the full BCP 47 registry.
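A plausibility check of this kind might look like the sketch below. It is an illustration under stated assumptions, not the reference validator's code: the regular expression, the function names, and the accepted subtag lengths are guesses at what "well-formed primary subtag, optional script, optional region" could mean.

```python
import re

# Assumed shape: 2-3 letter primary subtag, optional 4-letter script,
# optional region (2 letters or 3 digits). Not a full BCP 47 parser.
_PLAUSIBLE_TAG = re.compile(
    r"^[A-Za-z]{2,3}(-[A-Za-z]{4})?(-(?:[A-Za-z]{2}|[0-9]{3}))?$"
)


def looks_like_language_tag(value):
    """Return True when value has a plausible language-tag shape."""
    return isinstance(value, str) and bool(_PLAUSIBLE_TAG.match(value))


def primary_subtag(tag):
    """Return the primary subtag, e.g. 'es' for 'es-MX'."""
    return tag.split("-")[0].lower()


def language_warnings(document):
    """Emit WARN strings (never errors) for malformed root language tags."""
    warnings = []
    for field in ("language", "supportLanguage"):
        value = document.get(field)
        if value is not None and not looks_like_language_tag(value):
            warnings.append(f"WARN: {field} is not a plausible language tag: {value!r}")
    return warnings
```

A consumer with no region-specific behavior could call primary_subtag on any tag it reads and ignore the rest.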
4. Text direction — dir
The delivery language’s script direction is the document default; for a right-to-left (RTL) delivery language a consumer SHOULD set the rendering surface direction accordingly. Within content, the dir attribute marks spans or blocks whose direction differs from the surrounding text — an Arabic phrase embedded in an English paragraph, or an English term inside an Arabic passage.
Producers SHOULD emit dir alongside lang whenever an annotated part’s script direction differs from its surroundings; a lang without the matching dir can render with incorrect bidirectional ordering. The full producer/consumer direction obligations live in ACCESSIBILITY.md §6; the worked example examples/course-rtl-writing-systems.json demonstrates LTR-with-embedded-RTL across four writing systems.
5. Producer obligations
- A producer MUST emit a language root field matching the document’s delivery language (§2.1).
- A producer SHOULD mark any HTML span in a language other than the delivery language with lang (§2.2, WCAG 3.1.2), and SHOULD add dir where that span’s script direction differs (§4).
- A producer MAY emit supportLanguage for language-teaching documents (§2.3), and MUST leave it absent or null otherwise.
- A producer SHOULD use bare ISO 639-1 primary subtags unless a region/script subtag carries real meaning (§3).
6. Consumer obligations
- A consumer SHOULD set the rendering surface’s primary language and direction from the document language (§2.1).
- A consumer MUST preserve lang and dir on HTML through sanitization and round-trip (binds NORMATIVE.md §12.1).
- A consumer MAY act on only the primary subtag of any language tag (§3).
- A consumer MAY surface supportLanguage-driven L1 support in any form, or none (§2.3).
7. Screen readers and pronunciation — expectations (informative)
This section exists because the gap it describes is invisible to most implementers until a screen-reader user hits it.
Emitting lang on a foreign-language span is necessary but not sufficient for that span to be pronounced correctly. lang is an instruction to the assistive technology; whether the instruction is acted on depends on the delivery environment, which the format cannot control:
- The reader must support automatic language switching, and it must be enabled. Support varies by product — NVDA and JAWS switch reliably; Windows Narrator’s automatic switching is comparatively limited; VoiceOver sits in between.
- The matching voice / pronunciation data must be installed on the device. A reader with only an English voice will read a correctly-tagged lang="es" span in the English voice — mispronouncing it — even though the markup is perfect.
The practical consequence for implementers: a producer’s job is to emit the affordance (lang/dir) faithfully; a delivering consumer’s job is to preserve it and honor it on the rendering surface. Correct pronunciation is then completed by the end user’s screen reader and installed voices, which is outside the format’s and often the consumer’s control. This does not make lang optional — without it, no reader can switch at all, so the affordance is the floor, not the ceiling. It does mean that “the document is correctly tagged” and “every user hears flawless pronunciation” are different claims, and only the first is within an LC-JSON producer’s or consumer’s power to guarantee.
8. Relationship to the Accessibility Profile
The language-of-parts and direction obligations here overlap the ACCESSIBILITY.md §6 obligations and are bound by the same opt-in Accessibility Profile claim (NORMATIVE.md §10.2). This document adds the language model (the four roles, the single-document boundary, the language-tag rules) and the pronunciation-expectations framing; ACCESSIBILITY.md §6 remains the home for the per-criterion WCAG cross-references.
LC-JSON Validation Surface
Status: Informative reference. The authoritative rules live in NORMATIVE.md, the JSON Schemas under schemas/, and the reference validator tools/validate_course.py. This document catalogs them in one place.
Spec version: 1.0
Last updated: 2026-05-24
This document maps every documented validation rule in LC-JSON (Learning Content JSON) 1.0 to the place where it is enforced. The audience is implementers building consumers, validators, or producer round-trip tests — the same audience as NORMATIVE.md.
The catalog is additive and descriptive: it introduces no new normative rules. The inventory pass that built this catalog (2026-05-24) surfaced eight documented-but-unenforced rules; all eight were closed in the same rc.1-polish session by extending tools/validate_course.py (no schema changes). See §14 Forward-looking deepenings for what’s still scheduled for 1.0 final.
1. Scope and structure
LC-JSON’s validation surface is split across four enforcement sites:
- 23 JSON Schemas in schemas/ — declarative constraints (Draft 7) enforced by any conforming JSON Schema validator.
- NORMATIVE.md — RFC 2119 prose obligations that may or may not be representable in JSON Schema.
- tools/validate_course.py (the reference validator) — domain checks that run after schema validation, plus consumer-friendly diagnostics.
- Companion normative documents and informative references — HTML_SAFETY.md (normative) and ACCESSIBILITY.md (normative for tools claiming the Accessibility Profile; preservation obligations bind every consumer per NORMATIVE.md §12.1); per-type prose in question-types-reference.md and authoring patterns in ITEM_PATTERNS.md (both informative).
A consumer that only runs schema validation will accept documents the spec considers invalid (e.g. an MCQ with no correct option, a placement whose placements[].gap points at a missing @@@N marker). A consumer that re-implements the reference validator from prose will miss rules. This catalog gives implementers one map of “these are all the things a conforming consumer must check, and here’s where each rule is enforced.”
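As an example of a rule that schema validation alone cannot express, the dangling-gap case mentioned above could be checked roughly as follows. This is a sketch under assumptions: the passage is passed in as a plain string, and each placements[].gap is taken to be the integer N of an @@@N marker; the real field shapes are defined in the placement schema and the question types reference.

```python
import re

# @@@N gap markers inside the passage text, e.g. "@@@1".
_GAP_MARKER = re.compile(r"@@@(\d+)")


def dangling_gaps(passage, placements):
    """Return the gap numbers in placements that have no @@@N marker.

    Hypothetical helper: assumes each placement is a dict whose "gap"
    value is the integer N of the marker it targets.
    """
    present = {int(n) for n in _GAP_MARKER.findall(passage)}
    return [p["gap"] for p in placements if p["gap"] not in present]
```

For instance, dangling_gaps("The @@@1 sat on the @@@2.", [{"gap": 1}, {"gap": 3}]) returns [3], which a domain validator would report as an error even though the document is schema-valid.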
1.1 The three enforcement tiers
The rule tables below tag each rule with one tier:
| Tier | Meaning | Citation format |
|---|---|---|
| Schema-enforced | Expressed in one of the JSON Schemas under schemas/. Any Draft-7 validator catches violations. | schemas/<file>.schema.json: <json-pointer> |
| Domain-validator-enforced | Not (or not cleanly) expressible in JSON Schema; the reference validator tools/validate_course.py checks it. Conforming consumers MUST replicate these checks to round-trip and grade correctly. | validate_course.py: <function-name> + NORMATIVE § where cited |
| Advisory | Described in prose (NORMATIVE.md, README.md, question-types-reference.md, ITEM_PATTERNS.md) but not mechanically enforced anywhere. SHOULD/MAY rules, naming conventions, behaviors the spec hints at but lets consumers vary. Listed so implementers know what they are choosing. | Document and section |
A fourth, implicit tier — runtime-enforced — covers grading policy, navigation gating, gradebook display. Out of scope for this document; LC-JSON specifies document validity, not runtime behavior.
1.2 Severity (Domain-validator rows)
The reference validator distinguishes three severities on its domain-rule pass. Schema-enforced rows are always hard errors (any schema violation fails the document); Advisory rows are not enforced. Domain rows carry one of:
- ERROR — the validator returns non-zero exit; the document is non-conforming. Consumers MUST reject.
- WARN — the document is reported as suspect but still parses; the validator returns success. Conforming consumers SHOULD surface the warning to the user.
- NOTE — informational only (e.g. item.points intentionally weighted away from the sum of question points). The validator returns success without raising; no consumer obligation.
Where a single rule is enforced at multiple tiers (e.g. schema + validator double-check for friendlier messages), the row lists both. Satisfying the strictest tier suffices.
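The three severities map onto process outcomes as sketched below. The enum and function names are hypothetical and the findings structure is an assumption; only the ERROR/WARN/NOTE semantics come from this section.

```python
from enum import Enum


class Severity(Enum):
    ERROR = "ERROR"  # non-conforming; non-zero exit
    WARN = "WARN"    # suspect but still parses; success
    NOTE = "NOTE"    # informational only; success


def exit_code(findings):
    """Return 1 if any finding is an ERROR, else 0.

    `findings` is assumed to be a list of (Severity, message) pairs.
    """
    return 1 if any(sev is Severity.ERROR for sev, _ in findings) else 0


def user_visible(findings):
    """Messages a conforming consumer SHOULD surface: errors and warnings."""
    return [msg for sev, msg in findings if sev is not Severity.NOTE]
```

With this mapping a document that only produces WARN and NOTE findings still validates, which matches the success return described for those severities.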
1.3 Strict mode and the lenient migration path
The reference validator tools/validate_course.py accepts a --strict flag. The default (lenient) mode emits a warning and falls through with reduced enforcement when it encounters two pre-1.0 document shapes: the wrapped envelope {"course": {...}} and the bare payload {"units": [...]} with no documentType. Neither shape is part of the published 1.0 contract; the lenient handling is a maintainer-side migration aid that allows pre-1.0 document shapes to be ingested during the upgrade — it is not a published affordance third-party producers may rely on.
Under --strict, both shapes are fatal errors. The conformance corpus harness tools/run_corpus.py always invokes the validator with --strict (every fixture is run through the validator); CI runs the harness on every PR; and per NORMATIVE.md §10.3 conformance claims under §10 are evaluated in --strict mode. The published conformance contract is the --strict behavior — third-party consumers and producers should treat the lenient path as a maintainer-side migration aid only.
Rows in the tables below that depend on this distinction explicitly say “ERROR under --strict; WARN otherwise”; everywhere else, the rule applies uniformly.
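The strict/lenient split for the two pre-1.0 shapes can be sketched as a small classifier. This is an illustrative approximation, not the logic of tools/validate_course.py: the function name and return shape are assumptions.

```python
def classify_legacy_shape(doc, strict):
    """Classify a parsed document as a pre-1.0 shape, if it is one.

    Returns None for documents that carry documentType, otherwise a
    (severity, shape) pair: ERROR under strict mode, WARN otherwise.
    """
    if "documentType" in doc:
        return None  # a 1.0-shaped document; normal validation applies
    if isinstance(doc.get("course"), dict):
        shape = "wrapped envelope"  # {"course": {...}}
    elif "units" in doc:
        shape = "bare payload"      # {"units": [...]} with no documentType
    else:
        return None
    return ("ERROR" if strict else "WARN", shape)
```

Because conformance claims are evaluated in strict mode, a third-party tool would normally call this with strict=True and treat any non-None result as a rejection.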
2. Where to look
| What | Where |
|---|---|
| JSON Schemas | schemas/*.schema.json — 23 files |
| Reference validator | tools/validate_course.py |
| Conformance language (RFC 2119 MUSTs/SHOULDs/MAYs) | NORMATIVE.md |
| Per-type property reference | question-types-reference.md |
| HTML safety profile (elements, attributes, URL schemes, sanitization) | HTML_SAFETY.md |
| Accessibility profile (preservation + opt-in delivery claim) | ACCESSIBILITY.md |
| Item authoring patterns (consumer-policy plurality) | ITEM_PATTERNS.md |
| Conformance test fixtures | tests/ — manifest + valid/invalid sets |
3. Root document
Required root fields (NORMATIVE.md §3.2). Both artifact types (course, questionSet) share these.
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
| Producer MUST emit $schema pointing at the canonical published schema URL | Schema-enforced (producer validity) | course.schema.json: /required[*]="$schema", question-set.schema.json: /required[*]="$schema" | §3.2, §4.7 |
| Consumer SHOULD tolerate documents that omit $schema (infer the schema from documentType + specVersion); MUST reject any other root-field omission | Advisory (consumer-side import tolerance) | NORMATIVE.md §3.2 | §3.2 |
| $schema, when present, is a URI | Schema-declared via format: "uri" (annotation; not universally enforced by Draft-7 validators — see §13) | course.schema.json: /properties/$schema/format="uri", question-set.schema.json: /properties/$schema/format="uri" | §4.7 |
| documentType required at root | Schema-enforced | course.schema.json: /required[*]="documentType", question-set.schema.json: /required[*]="documentType" | §3.2 |
| documentType is "course" (course document) | Schema-enforced | course.schema.json: /properties/documentType/const="course" | §3.2, §4.2, §5.3 |
| documentType is "questionSet" (question-set document) | Schema-enforced | question-set.schema.json: /properties/documentType/const="questionSet" | §3.2, §4.2, §5.3 |
| Non-canonical documentType casing rejected ("Course", "questionset", "question-set") | Schema-enforced (via const) | course.schema.json / question-set.schema.json const; validate_course.py: dispatch_document_shape provides casing-tolerant dispatch as a maintainer-side migration aid (§1.3) — disabled under --strict | §4.2, §5.3 |
| specVersion required at root | Schema-enforced | course.schema.json: /required[*]="specVersion", question-set.schema.json: /required[*]="specVersion" | §3.2, §4.6 |
| specVersion matches ^1\.[0-9]+(\.[0-9]+)?$ | Schema-enforced + Domain-validator-enforced (ERROR for 2.x+) | course.schema.json: /properties/specVersion/pattern; validate_course.py: check_spec_version | §4.6, §5.2 |
| specVersion MUST NOT carry an -rc.N suffix | Advisory | NORMATIVE.md §4.6, §8.4 | §4.6 |
| language required at root | Schema-enforced | course.schema.json: /required[*]="language", question-set.schema.json: /required[*]="language" | §12.1 |
| language is a plausible BCP 47 tag (bare ISO 639-1, or with region/script subtag) | Domain-validator-enforced (WARN; schema typed only, no pattern) | validate_course.py: validate_course_level (course path) and validate_question_set_flat (question-set path), via _is_plausible_language_tag | §13, LOCALIZATION.md §3 |
| supportLanguage is a plausible BCP 47 tag (or omitted/null) | Domain-validator-enforced (WARN; schema typed only) | validate_course.py: validate_course_level (course path) and validate_question_set_flat (question-set path), via _is_plausible_language_tag | §13, LOCALIZATION.md §3 |
| Pre-1.0 wrapped envelope {"course": {...}} rejected (published contract; lenient migration aid in default mode) | Domain-validator-enforced (ERROR under --strict; WARN otherwise — see §1.3) | validate_course.py: validate_course (--strict branch) | §3.2, §4.1 |
| Pre-1.0 bare payload {"units": [...]} rejected (published contract; lenient migration aid in default mode) | Domain-validator-enforced (ERROR under --strict; WARN otherwise — see §1.3) | validate_course.py: validate_course (--strict branch) | §3.2, §4.1 |
Property names are camelCase | Advisory (consumer-side import is lenient via JsonNormalizer-style helpers) | NORMATIVE.md §4.5 | §4.5 |
Extension members keyed x-<namespace> MAY appear on root + Course/Unit/Lesson/Item/Question | Advisory (schemas do not restrict additionalProperties on those objects) | NORMATIVE.md §7.1 | §7.1 |
Extension members MUST NOT appear on matching.pairs[*], matching.categories[*], or placement.placements[*] | Schema-enforced | matching.schema.json: /allOf/1/then/properties/pairs/items/additionalProperties=false etc.; placement.schema.json: /allOf/1/properties/placements/items/additionalProperties=false | §7.1 |
Producer MUST NOT introduce a non-extension field beginning with x- | Advisory | NORMATIVE.md §7.1 | §7.1 |
Consumer MUST NOT reject documents solely for unknown fields or x- members | Advisory | NORMATIVE.md §5.4, §7.4 | §5.4, §7.4 |
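
The root-field rows above can be approximated in a few lines of Python. This is a minimal sketch, not the logic of validate_course.py: the function name `check_root_fields`, the `consumer` flag, and the message strings are hypothetical, and only the field names, the two documentType values, and the specVersion pattern come from the table.

```python
import re

# Pattern from the specVersion row. fullmatch() is used instead of "^...$"
# because Python's "$" also matches just before a trailing newline.
SPEC_VERSION = re.compile(r"1\.[0-9]+(\.[0-9]+)?")
REQUIRED_ROOT_FIELDS = ("$schema", "documentType", "specVersion", "language")
DOCUMENT_TYPES = ("course", "questionSet")


def check_root_fields(doc, consumer=False):
    """Return a list of problem strings for the root object of a document."""
    problems = []
    for field in REQUIRED_ROOT_FIELDS:
        if field in doc:
            continue
        # Consumers SHOULD tolerate a missing $schema, but nothing else.
        if consumer and field == "$schema":
            continue
        problems.append(f"missing root field: {field}")
    doc_type = doc.get("documentType")
    if doc_type is not None and doc_type not in DOCUMENT_TYPES:
        problems.append(f"non-canonical documentType: {doc_type!r}")
    version = doc.get("specVersion")
    if version is not None and not (
        isinstance(version, str) and SPEC_VERSION.fullmatch(version)
    ):
        problems.append(f"unsupported specVersion: {version!r}")
    return problems
```

Because the pattern has no room for a suffix, a value such as "1.0-rc.3" fails the same check that rejects "2.0".
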
4. Course-level
Course payload fields on a documentType: "course" document.
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
title required, minLength: 1 | Schema-enforced + Domain-validator-enforced | course.schema.json: /properties/title, /required[*]="title"; validate_course.py: validate_course_level | §3.2 |
sourceCourseId, when present, matches the RFC 4122 UUID pattern (any version; shape-only validation) | Schema-enforced + Domain-validator-enforced (WARN if non-UUID) | course.schema.json: /properties/sourceCourseId/pattern; validate_course.py: validate_course_level | §4.4 |
sourceCourseId SHOULD be emitted for re-importable or version-tracked courses | Advisory | NORMATIVE.md §4.4 | §4.4 |
version, when present, matches ^[0-9]+(\.[0-9]+){0,2}$ (1–3 numeric segments) | Schema-enforced + Domain-validator-enforced (WARN) | course.schema.json: /properties/version/pattern; validate_course.py: validate_course_level | §4.4 |
Pre-1.0 identity fields (authorId, authorCourseId) trigger a migration warning | Domain-validator-enforced (WARN) | validate_course.py: validate_course_level | (none — migration aid) |
Course objectives[*].id and objectives[*].text required | Schema-enforced | course.schema.json: /properties/objectives/items/required | (none) |
Course objectives[*].difficultyBand enum: "Recall", "Understand", "Apply", "Analyze", null | Schema-enforced | course.schema.json: /properties/objectives/items/properties/difficultyBand/enum | (none) |
courseObjectiveIds[*] reference course.objectives[].id | Domain-validator-enforced (WARN) | validate_course.py (objective-reference integrity check) | (none — warning-tier integrity check; unresolved references break signpost auto-rendering) |
estimatedDurationMinutes >= 0 | Schema-enforced | course.schema.json: /properties/estimatedDurationMinutes/minimum | (none) |
Course tags[*] are strings (Unit/Lesson/Item additionally enforce minLength: 1) | Schema-enforced | course.schema.json: /properties/tags/items/type="string"; unit.schema.json / lesson.schema.json / item-base.schema.json: /properties/tags/items/minLength=1 | (none) |
units[] MUST be present at the root (course); is an array when present | Domain-validator-enforced (ERROR if missing); Schema-enforced (array type when present) | validate_course.py: validate_course (“Missing ‘units’ array at root level”); course.schema.json: /properties/units/type="array" (the schema’s default: [] would otherwise admit a missing field) | (none) |
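
As an illustration of the shape-only identity checks in this table, the sketch below tests sourceCourseId and version and flags the pre-1.0 identity fields. The 8-4-4-4-12 hexadecimal pattern is an assumption (the exact pattern in course.schema.json is not reproduced here), and `course_identity_warnings` is a hypothetical name.

```python
import re

# Assumed 8-4-4-4-12 hex shape; the published schema's exact pattern may differ.
UUID_SHAPE = re.compile(r"[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")
# One to three dot-separated numeric segments, from the version row above.
VERSION = re.compile(r"[0-9]+(\.[0-9]+){0,2}")


def course_identity_warnings(course):
    """Return warning strings for course-level identity fields."""
    warnings = []
    source_id = course.get("sourceCourseId")
    if source_id is not None and not UUID_SHAPE.fullmatch(str(source_id)):
        warnings.append("sourceCourseId is not UUID-shaped")
    version = course.get("version")
    if version is not None and not VERSION.fullmatch(str(version)):
        warnings.append("version is not 1-3 numeric segments")
    # Pre-1.0 identity fields only trigger a migration warning.
    for legacy in ("authorId", "authorCourseId"):
        if legacy in course:
            warnings.append(f"pre-1.0 identity field present: {legacy}")
    return warnings
```
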
5. Unit-level
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
globalId required | Schema-enforced + Domain-validator-enforced | unit.schema.json: /required[*]="globalId"; validate_course.py: validate_unit | §4.4 |
globalId matches the RFC 4122 UUID pattern (any version; shape-only validation) | Schema-enforced + Domain-validator-enforced (WARN if non-UUID) | unit.schema.json: /properties/globalId/pattern; validate_course.py: validate_unit (via is_valid_uuid) | §4.4 |
title required, minLength: 1 | Schema-enforced + Domain-validator-enforced | unit.schema.json: /properties/title, /required[*]="title"; validate_course.py: validate_unit | (none) |
description defaults to "" | Schema-declared (annotation; not enforced — see §13) | unit.schema.json: /properties/description/default | (none) |
tags[*] minLength: 1 | Schema-enforced | unit.schema.json: /properties/tags/items/minLength | (none) |
sequence >= 0 | Schema-enforced | unit.schema.json: /properties/sequence/minimum | (none — import uses array position, not sequence) |
sequence duplicates/gaps within siblings | Domain-validator-enforced (WARN, advisory) | validate_course.py: validate_sequence_order | (none) |
objectiveIds[*] reference course.objectives[].id | Domain-validator-enforced (WARN) | validate_course.py (objective-reference integrity check) | (none) |
lessons[] array (default [] schema-declared) | Schema-enforced (type) | unit.schema.json: /properties/lessons | (none) |
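
The sequence duplicates/gaps row can be pictured with the sketch below. It assumes that a "gap" means a missing integer between the smallest and largest sequence value among siblings; the table does not define the term, and validate_sequence_order may apply a different rule. The function name `sequence_warnings` is hypothetical.

```python
from collections import Counter


def sequence_warnings(siblings):
    """Return advisory warnings for the 'sequence' values of sibling nodes."""
    # Nodes without an integer sequence are ignored in this sketch.
    values = [s["sequence"] for s in siblings if isinstance(s.get("sequence"), int)]
    warnings = []
    duplicates = sorted(v for v, n in Counter(values).items() if n > 1)
    if duplicates:
        warnings.append(f"duplicate sequence values: {duplicates}")
    if values:
        expected = set(range(min(values), max(values) + 1))
        missing = sorted(expected - set(values))
        if missing:
            warnings.append(f"gaps in sequence: {missing}")
    return warnings
```

Since import uses array position rather than sequence, findings of this kind stay advisory.
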
6. Lesson-level
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
globalId required + RFC 4122 UUID pattern (any version; shape-only validation) | Schema-enforced + Domain-validator-enforced (WARN) | lesson.schema.json: /required, /properties/globalId/pattern; validate_course.py: validate_lesson | §4.4 |
title required, minLength: 1 | Schema-enforced + Domain-validator-enforced | lesson.schema.json; validate_course.py: validate_lesson | (none) |
items[] is an array of content/exercise/quiz/contentsequence/signpost (oneOf dispatch) | Schema-enforced | lesson.schema.json: /properties/items/items/oneOf | (none) |
items missing or empty — informational | Domain-validator-enforced (WARN) | validate_course.py: validate_lesson (“empty lesson”) | (none) |
sequence duplicates/gaps within siblings (lesson item ordering) | Domain-validator-enforced (WARN, advisory) | validate_course.py: validate_sequence_order | (none) |
objectiveIds[*] reference course.objectives[].id | Domain-validator-enforced (WARN) | validate_course.py (objective-reference integrity check) | (none) |
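
The objective-reference integrity check that appears at unit and lesson level amounts to a set-membership test. A minimal sketch, assuming the nesting course.units[].lessons[] and a hypothetical function name `unresolved_objective_refs`:

```python
def unresolved_objective_refs(course):
    """Return (node title, objective id) pairs that do not resolve."""
    declared = {o["id"] for o in course.get("objectives") or [] if "id" in o}
    unresolved = []
    for unit in course.get("units") or []:
        # Check the unit itself, then each of its lessons.
        nodes = [unit] + list(unit.get("lessons") or [])
        for node in nodes:
            for ref in node.get("objectiveIds") or []:
                if ref not in declared:
                    unresolved.append((node.get("title"), ref))
    return unresolved
```
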
7. Item-level — common
Properties inherited by every item type via item-base.schema.json.
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
type required, enum: content, exercise, quiz, contentsequence, signpost | Schema-enforced + Domain-validator-enforced | item-base.schema.json: /properties/type/enum, /required; validate_course.py: validate_item | §4.2, §5.3 |
Non-canonical item-type casing (Content, ExerciseItem) rejected | Schema-enforced (via enum/const) + Domain-validator-enforced (WARN; tolerated via normalize_item_type) | item-base.schema.json; validate_course.py: validate_item | §4.2, §5.3 |
globalId required + RFC 4122 UUID pattern (any version; shape-only validation) | Schema-enforced + Domain-validator-enforced (WARN) | item-base.schema.json: /required, /properties/globalId/pattern; validate_course.py: validate_item | §4.4 |
title required, minLength: 1 | Schema-enforced + Domain-validator-enforced (WARN if missing) | item-base.schema.json; validate_course.py: validate_item | (none) |
tags[*] minLength: 1 | Schema-enforced | item-base.schema.json: /properties/tags/items/minLength | (none) |
suggestedTime >= 0 | Schema-enforced | item-base.schema.json: /properties/suggestedTime/minimum | (none) |
isOptional boolean, default false (schema-declared) | Schema-enforced (type) | item-base.schema.json: /properties/isOptional | (none) |
7.1 ContentItem
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
type is "content" | Schema-enforced | content-item.schema.json: /allOf/1/properties/type/const="content" | §4.2 |
html required | Schema-enforced + Domain-validator-enforced | content-item.schema.json: /allOf/1/required[*]="html"; validate_course.py: validate_item (normalized_type == "content") | (none) |
Deprecated body property → use html | Domain-validator-enforced (WARN) | validate_course.py: validate_item | (none) |
html content satisfies the HTML safety profile | Domain-validator-enforced (ERROR / WARN per HTML_SAFETY.md §8) | validate_course.py: validate_html_content | §11, HTML_SAFETY.md (see §12 below) |
7.2 ExerciseItem
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
type is "exercise" | Schema-enforced | exercise-item.schema.json: /allOf/1/properties/type/const="exercise" | §4.2 |
questions[] required | Schema-enforced + Domain-validator-enforced | exercise-item.schema.json: /allOf/1/required[*]="questions"; validate_course.py: validate_item | (none) |
instructions required (PascalCase Instructions triggers WARN) | Domain-validator-enforced (ERROR if missing, WARN on PascalCase) | validate_course.py: validate_item | (none) |
isGraded boolean, default false (schema-declared) | Schema-enforced (type) | exercise-item.schema.json: /allOf/1/properties/isGraded/default=false | §4.3 |
passMarkPercent is a number 0 <= x <= 100, default 70.0 (schema-declared) | Schema-enforced (type/range) | exercise-item.schema.json: /allOf/1/properties/passMarkPercent/{minimum,maximum,default} | (none — consumer-policy gated; see ITEM_PATTERNS.md §3) |
points >= 0 | Schema-enforced | exercise-item.schema.json: /allOf/1/properties/points/minimum | (none) |
Producer/consumer MUST NOT infer grading state from item type alone | Advisory | NORMATIVE.md §4.3 | §4.3 |
7.3 QuizItem
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
type is "quiz" | Schema-enforced | quiz-item.schema.json: /allOf/1/properties/type/const="quiz" | §4.2 |
questions[] required, isGraded required | Schema-enforced | quiz-item.schema.json: /allOf/1/required[*]={"questions","isGraded"} | (none) |
isGraded boolean, default true (schema-declared) | Schema-enforced (type) | quiz-item.schema.json: /allOf/1/properties/isGraded/default=true | §4.3 |
passMarkPercent is a number 0 <= x <= 100, default 70.0 (schema-declared) | Schema-enforced (type/range) | quiz-item.schema.json: /allOf/1/properties/passMarkPercent | (none — see ITEM_PATTERNS.md §3) |
points >= 0 | Schema-enforced | quiz-item.schema.json: /allOf/1/properties/points/minimum | (none) |
item.points vs sum(question.points) mismatch is intentional weighting | Domain-validator-enforced (NOTE) | validate_course.py (weighted-points NOTE collection) | (none) |
7.4 ContentSequenceItem
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
type is "contentsequence" | Schema-enforced | content-sequence-item.schema.json: /allOf/1/properties/type/const | §4.2 |
contentItemId required; value is a UUID | Schema-enforced (required) + Domain-validator-enforced (WARN if non-UUID) | content-sequence-item.schema.json: /allOf/1/required, /properties/contentItemId/format="uuid" (annotation — see §13); validate_course.py: validate_item (via is_valid_uuid) | (none) |
relatedItemIds[*] are UUIDs | Domain-validator-enforced (WARN); schema declares format: "uuid" as annotation | content-sequence-item.schema.json: /allOf/1/properties/relatedItemIds/items/format="uuid" (annotation); validate_course.py: validate_item (via is_valid_uuid) | (none) |
relatedItemIds non-empty array | Domain-validator-enforced (ERROR if missing or empty) | validate_course.py: validate_item | (none) |
layout enum: "Auto", "Split", "Vertical", default "Auto" (schema-declared) | Schema-enforced (enum) + Domain-validator-enforced (WARN if other) | content-sequence-item.schema.json: /allOf/1/properties/layout/enum; validate_course.py: validate_item | (none) |
contentItemId resolves to a sibling content item declared earlier in the lesson | Domain-validator-enforced (ERROR) | validate_course.py: validate_item (CSI branch) | (none) |
Each relatedItemIds[*] resolves to a sibling exercise/quiz item declared earlier in the lesson | Domain-validator-enforced (ERROR) | validate_course.py: validate_item (CSI branch) | (none) |
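
The two "declared earlier in the lesson" rules depend on item order, which a single forward pass can capture. The sketch below is an illustration only; `csi_reference_errors` and its messages are hypothetical, and short ids stand in for real UUIDs.

```python
def csi_reference_errors(lesson_items):
    """Check contentsequence references against items declared earlier."""
    errors = []
    seen = {}  # globalId -> item type, for items already passed in the lesson
    for item in lesson_items:
        if item.get("type") == "contentsequence":
            related = item.get("relatedItemIds") or []
            if not related:
                errors.append("relatedItemIds missing or empty")
            if seen.get(item.get("contentItemId")) != "content":
                errors.append("contentItemId does not resolve to an earlier content item")
            for ref in related:
                if seen.get(ref) not in ("exercise", "quiz"):
                    errors.append(f"relatedItemIds entry {ref} does not resolve to an earlier exercise/quiz item")
        # Register the item only after checking it, so it cannot reference itself.
        if "globalId" in item:
            seen[item["globalId"]] = item.get("type")
    return errors
```

A reference to an item that exists but appears later in the lesson fails in the same way as a reference to an unknown id.
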
7.5 SignpostItem
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
type is "signpost" | Schema-enforced | signpost-item.schema.json: /allOf/1/properties/type/const | §4.2 |
signpostType required, enum: "intro", "summary" | Schema-enforced + Domain-validator-enforced (ERROR if missing or other) | signpost-item.schema.json: /allOf/1/required, /properties/signpostType/enum; validate_course.py: validate_item | (none) |
scope required, enum: "course", "unit", "lesson" | Schema-enforced + Domain-validator-enforced (ERROR if missing or other) | signpost-item.schema.json: /allOf/1/required, /properties/scope/enum; validate_course.py: validate_item | (none) |
Signpost items MUST NOT carry questions | Domain-validator-enforced (ERROR) | validate_course.py: validate_item (signpost branch) | (none) |
customHtml, when present, satisfies the HTML safety profile | Domain-validator-enforced (ERROR / WARN per HTML_SAFETY.md §8) | validate_course.py: validate_html_content | §11, HTML_SAFETY.md |
A signpost with no objectives (and no customHtml) renders an empty stub | Advisory | ITEM_PATTERNS.md §5 | (none) |
8. Question-level — common
Properties inherited by every question via question-base.schema.json. Required by NORMATIVE §4.4: every question MUST carry a globalId.
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
type required + enum (19 values: 12 implemented + 7 reserved) | Schema-enforced + Domain-validator-enforced | question-base.schema.json: /required[*]="type", /properties/type/enum; validate_course.py: validate_question | §4.2, §5.3, §6.1 |
Non-canonical question-type casing (MultipleChoice, simplegapfill) rejected | Schema-enforced (via enum) + Domain-validator-enforced (ERROR for unknown discriminator) | question-base.schema.json: /properties/type/enum; validate_course.py (per-type question dispatch) | §4.2, §5.3 |
globalId required + RFC 4122 UUID pattern (any version; shape-only validation) | Schema-enforced + Domain-validator-enforced (WARN if non-UUID) | question-base.schema.json: /required[*]="globalId", /properties/globalId/pattern; validate_course.py: validate_question | §4.4 |
prompt required, minLength: 0 (may be empty); empty/whitespace prompt is an ERROR for the 4 real-content types (trueFalseQuestion, multipleChoice, shortAnswer, essay), valid (empty) for the 8 symbolic types, and unconstrained for the 7 reserved types (deferred to v1.1) | Schema-enforced (required, minLength: 0) + Domain-validator-enforced (ERROR on real-content empty; WARN if missing) | question-base.schema.json: /required[*]="prompt", /properties/prompt/minLength; validate_course.py: validate_question | (none) |
points is a non-negative number, MAY be null, default 1.0 (schema-declared) | Schema-enforced (type/range) + Domain-validator-enforced (WARN if missing) | question-base.schema.json: /properties/points/{type,minimum,default}; validate_course.py: validate_question | (none) |
difficulty is a number 0.0 <= x <= 10.0, default 5.0 (schema-declared) | Schema-enforced (type/range) | question-base.schema.json: /properties/difficulty/{minimum,maximum,default} | (none — author estimate; see question-types-reference.md Common Properties) |
tags[*] strings | Schema-enforced | question-base.schema.json: /properties/tags/items/type | (none) |
hint is string or null, default null | Schema-enforced | question-base.schema.json: /properties/hint | (none) |
feedback is an object or null; feedback.{correct,incorrect} are strings; feedback.choiceFeedback is {string: string} | Schema-enforced | question-base.schema.json: /properties/feedback | (none) |
Deprecated questionType (instead of type) | Domain-validator-enforced (ERROR) | validate_course.py: validate_question | (none — migration aid) |
For non-question fields: producer MUST NOT embed HTML in plain-text fields | Advisory | HTML_SAFETY.md §1.1 | §11 |
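
The prompt row combines a schema rule (required, may be empty) with a type-dependent domain rule. A minimal sketch of that split, using the four real-content type names from the table; the function name `prompt_findings` and the message strings are hypothetical:

```python
REAL_CONTENT_TYPES = {"trueFalseQuestion", "multipleChoice", "shortAnswer", "essay"}


def prompt_findings(question):
    """Return (level, message) tuples for a question's prompt."""
    if "prompt" not in question:
        return [("WARN", "prompt is missing")]
    prompt = question["prompt"]
    # Only the real-content types need a non-blank prompt; symbolic and
    # reserved types may carry an empty one.
    if question.get("type") in REAL_CONTENT_TYPES and not str(prompt).strip():
        return [("ERROR", "prompt is empty for a real-content type")]
    return []
```
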
9. Question-level — by type (12 implemented)
Properties enforced per question-type schema, plus per-type domain-validator rules. The 7 reserved types are covered in §10.
9.1 simpleGapFill
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
type is "simpleGapFill" | Schema-enforced | simple-gap-fill.schema.json: /allOf/1/properties/type/const | §4.2 |
sentence required, contains @@@, minLength: 4 | Schema-enforced | simple-gap-fill.schema.json: /allOf/1/required, /properties/sentence/{pattern,minLength} | (none) |
acceptedAnswers required, minItems: 1, each minLength: 1 | Schema-enforced | simple-gap-fill.schema.json: /allOf/1/required, /properties/acceptedAnswers/{minItems,items/minLength} | (none) |
caseSensitive boolean, default false (schema-declared) | Schema-enforced (type) | simple-gap-fill.schema.json: /allOf/1/properties/caseSensitive | (none) |
9.2 trueFalseQuestion
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
type is "trueFalseQuestion" | Schema-enforced | true-false-question.schema.json: /allOf/1/properties/type/const | §4.2 |
correctAnswer required, boolean | Schema-enforced + Domain-validator-enforced (ERROR if missing both v1 and v2 forms, ERROR if non-boolean, WARN on boolean-ish coercion) | true-false-question.schema.json: /allOf/1/required, /properties/correctAnswer/type; validate_course.py: validate_true_false_question | (none) |
Pre-1.0 TF shape (options / optionsAndPoints) deprecated | Domain-validator-enforced (WARN; multiple positive-points options WARN; zero positives WARN) | validate_course.py: validate_true_false_question | (none) |
displayStyle enum: "TrueFalse", "CorrectIncorrect", "CheckmarkX", default "TrueFalse" (schema-declared) | Schema-enforced (enum) + Domain-validator-enforced (WARN if other) | true-false-question.schema.json: /allOf/1/properties/displayStyle/enum; validate_course.py: validate_true_false_question | (none) |
penalizeIncorrect boolean, default false (schema-declared) | Schema-enforced (type) | true-false-question.schema.json: /allOf/1/properties/penalizeIncorrect | (none) |
incorrectPenaltyPercent is 0..100, default 50.0 (schema-declared) | Schema-enforced (type/range) + Domain-validator-enforced (WARN if out of range) | true-false-question.schema.json: /allOf/1/properties/incorrectPenaltyPercent/{minimum,maximum}; validate_course.py: validate_true_false_question | (none) |
feedback.choiceFeedback deprecated on TF | Domain-validator-enforced (WARN) | validate_course.py: validate_true_false_question | (none — TF v2 forbids the field; see question-types-reference.md) |
9.3 multipleChoice
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
type is "multipleChoice" | Schema-enforced | multiple-choice.schema.json: /allOf/1/properties/type/const | §4.2 |
options required, minItems: 2, each minLength: 1 | Schema-enforced | multiple-choice.schema.json: /allOf/1/required, /properties/options/{minItems,items/minLength} | (none) |
optionsAndPoints required, {string: number} map | Schema-enforced | multiple-choice.schema.json: /allOf/1/required, /properties/optionsAndPoints | (none) |
At least one optionsAndPoints value > 0 (an MCQ MUST have a correct answer) | Domain-validator-enforced (ERROR) | validate_course.py: validate_multiple_choice | (none) |
optionsAndPoints keys cover every entry in options | Domain-validator-enforced (ERROR if missing; WARN if optionsAndPoints contains extras not in options) | validate_course.py: validate_multiple_choice | (none) |
allowMultipleCorrect boolean, default false (schema-declared) | Schema-enforced (type) | multiple-choice.schema.json: /allOf/1/properties/allowMultipleCorrect | (none) |
allowPartialCredit boolean, default true (schema-declared) | Schema-enforced (type) | multiple-choice.schema.json: /allOf/1/properties/allowPartialCredit | (none) |
penalizeIncorrect boolean, default false (schema-declared) | Schema-enforced (type) | multiple-choice.schema.json: /allOf/1/properties/penalizeIncorrect | (none) |
showLetterLabels boolean, default false (schema-declared) | Schema-enforced (type) | multiple-choice.schema.json: /allOf/1/properties/showLetterLabels | (none) |
shuffleOptions governs per-question option randomization (per-question discretion) | Advisory | NORMATIVE.md §5.6 (multipleChoice is explicitly exempt from the §5.6 randomization MUST) | §5.6 |
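
The two domain-validator rows for multipleChoice relate options to optionsAndPoints. The sketch below shows one way to express them; `multiple_choice_findings` and its messages are hypothetical and are not taken from validate_multiple_choice.

```python
def multiple_choice_findings(question):
    """Return (level, message) tuples for options vs. optionsAndPoints."""
    options = question.get("options") or []
    points = question.get("optionsAndPoints") or {}
    findings = []
    # An MCQ needs at least one option worth more than zero points.
    if not any(value > 0 for value in points.values()):
        findings.append(("ERROR", "no option carries positive points"))
    for option in options:
        if option not in points:
            findings.append(("ERROR", f"option missing from optionsAndPoints: {option}"))
    for key in points:
        if key not in options:
            findings.append(("WARN", f"optionsAndPoints key not in options: {key}"))
    return findings
```
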
9.4 wordBankCloze
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
type is "wordBankCloze" | Schema-enforced | word-bank-cloze.schema.json: /allOf/1/properties/type/const | §4.2 |
passage required, matches @@@\d+, minLength: 4 | Schema-enforced | word-bank-cloze.schema.json: /allOf/1/required, /properties/passage/{pattern,minLength} | (none) |
wordBank required, minItems: 1, each minLength: 1 | Schema-enforced | word-bank-cloze.schema.json: /allOf/1/required, /properties/wordBank | (none) |
gapAcceptedAnswers required, {"^[0-9]+$": [string]}, each gap minItems: 1, each accepted answer minLength: 1 | Schema-enforced | word-bank-cloze.schema.json: /allOf/1/required, /properties/gapAcceptedAnswers/patternProperties | (none) |
passage @@@N marker set MUST equal gapAcceptedAnswers key set | Domain-validator-enforced (ERROR) | validate_course.py: validate_word_bank_cloze (cloze gap-consistency check) | (none) |
@@@N marker numbers SHOULD be sequential starting at 1 | Domain-validator-enforced (WARN) | validate_course.py: validate_word_bank_cloze (cloze gap-consistency check) | (none) |
allowWordReuse boolean, default false (schema-declared) | Schema-enforced (type) | word-bank-cloze.schema.json: /allOf/1/properties/allowWordReuse | (none) |
bankPosition enum: "above", "below", "side" | Schema-enforced | word-bank-cloze.schema.json: /allOf/1/properties/bankPosition/enum | (none) |
gapCaseSensitive / gapFeedback value types | Schema-enforced | word-bank-cloze.schema.json: /allOf/1/properties/gapCaseSensitive, gapFeedback | (none) |
9.5 multiGapCloze
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
type is "multiGapCloze" | Schema-enforced | multi-gap-cloze.schema.json: /allOf/1/properties/type/const | §4.2 |
passage required, matches @@@\d+, minLength: 4 | Schema-enforced | multi-gap-cloze.schema.json: /allOf/1/required, /properties/passage | (none) |
gapAcceptedAnswers required | Schema-enforced | multi-gap-cloze.schema.json: /allOf/1/required | (none) |
Each accepted answer MUST NOT contain , or : (scoring-engine wire format) | Schema-enforced + Domain-validator-enforced (ERROR) | multi-gap-cloze.schema.json: /allOf/1/properties/gapAcceptedAnswers/patternProperties/.../items/not/pattern="[,:]"; validate_course.py: validate_multi_gap_cloze | (none — wire-format consequence) |
Other punctuation in answers SHOULD be limited to apostrophes and hyphens | Domain-validator-enforced (WARN) | validate_course.py: validate_multi_gap_cloze | (none) |
passage @@@N marker set MUST equal gapAcceptedAnswers key set | Domain-validator-enforced (ERROR) | validate_course.py: validate_multi_gap_cloze (cloze gap-consistency check) | (none) |
@@@N marker numbers SHOULD be sequential starting at 1 | Domain-validator-enforced (WARN) | validate_course.py: validate_multi_gap_cloze (cloze gap-consistency check) | (none) |
allowPartialCredit boolean, default true (schema-declared) | Schema-enforced (type) | multi-gap-cloze.schema.json: /allOf/1/properties/allowPartialCredit | (none) |
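
The cloze gap-consistency check and the wire-format restriction on answers can be sketched together. This is an illustrative approximation, not the code in validate_multi_gap_cloze; the function name `cloze_gap_findings` and the messages are hypothetical.

```python
import re

MARKER = re.compile(r"@@@(\d+)")


def cloze_gap_findings(passage, gap_accepted_answers):
    """Return (level, message) tuples for a multiGapCloze question."""
    findings = []
    markers = set(MARKER.findall(passage))  # marker numbers as strings
    keys = set(gap_accepted_answers)
    if markers != keys:
        findings.append(("ERROR", f"marker/key mismatch: {sorted(markers ^ keys)}"))
    numbers = sorted(int(m) for m in markers)
    if numbers != list(range(1, len(numbers) + 1)):
        findings.append(("WARN", "markers are not sequential from 1"))
    # "," and ":" are reserved by the scoring-engine wire format.
    for gap, answers in gap_accepted_answers.items():
        for answer in answers:
            if "," in answer or ":" in answer:
                findings.append(("ERROR", f"gap {gap}: answer contains ',' or ':'"))
    return findings
```
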
9.6 multipleChoiceCloze
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
type is "multipleChoiceCloze" | Schema-enforced | multiple-choice-cloze.schema.json: /allOf/1/properties/type/const | §4.2 |
passage, gapOptions, correctAnswers required | Schema-enforced | multiple-choice-cloze.schema.json: /allOf/1/required | (none) |
Each gap’s gapOptions has minItems: 2 | Schema-enforced | multiple-choice-cloze.schema.json: /allOf/1/properties/gapOptions/patternProperties/.../minItems | (none) |
correctAnswers[N] is a non-negative integer | Schema-enforced | multiple-choice-cloze.schema.json: /allOf/1/properties/correctAnswers/patternProperties/.../{type,minimum} | (none) |
correctAnswers[N] index in bounds of gapOptions[N] | Domain-validator-enforced (ERROR) | validate_course.py: validate_multiple_choice_cloze | (none) |
passage @@@N marker set MUST equal gapOptions key set | Domain-validator-enforced (ERROR) | validate_course.py: validate_multiple_choice_cloze (cloze gap-consistency check) | (none) |
gapOptions key set MUST equal correctAnswers key set | Domain-validator-enforced (ERROR) | validate_course.py: validate_multiple_choice_cloze | (none) |
@@@N marker numbers SHOULD be sequential starting at 1 | Domain-validator-enforced (WARN) | validate_course.py: validate_multiple_choice_cloze (cloze gap-consistency check) | (none) |
shuffleOptions boolean, default false (schema-declared) | Schema-enforced (type) | multiple-choice-cloze.schema.json: /allOf/1/properties/shuffleOptions | (none) |
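
The bounds and key-set rows for multipleChoiceCloze can be checked as below. The sketch assumes correctAnswers values are zero-based indexes into the matching gapOptions array, which the "non-negative integer" and "in bounds" rows suggest but do not state outright; `mc_cloze_errors` is a hypothetical name.

```python
def mc_cloze_errors(gap_options, correct_answers):
    """Return error strings for gapOptions vs. correctAnswers."""
    errors = []
    if set(gap_options) != set(correct_answers):
        errors.append("gapOptions and correctAnswers key sets differ")
    for gap, index in correct_answers.items():
        options = gap_options.get(gap, [])
        # Assumed zero-based: valid indexes are 0 .. len(options) - 1.
        if not 0 <= index < len(options):
            errors.append(f"gap {gap}: correct index {index} out of bounds")
    return errors
```
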
9.7 shortAnswer
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
type is "shortAnswer" | Schema-enforced | short-answer.schema.json: /allOf/1/properties/type/const | §4.2 |
acceptedAnswers required, minItems: 1, each minLength: 1 | Schema-enforced | short-answer.schema.json: /allOf/1/required, /properties/acceptedAnswers | (none) |
acceptedAnswers[0] is the canonical form for display | Advisory (question-types-reference.md §7) | (no enforcement) | (none) |
caseSensitive boolean, default false (schema-declared) | Schema-enforced (type) | short-answer.schema.json: /allOf/1/properties/caseSensitive | (none) |
9.8 essay
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
type is "essay" | Schema-enforced | essay.schema.json: /allOf/1/properties/type/const | §4.2 |
expectedAnswer required (string, may be empty) | Schema-enforced | essay.schema.json: /allOf/1/required | (none) |
expectedLines, minWords, maxWords are integers >= 0 (0 = no limit) | Schema-enforced | essay.schema.json: /allOf/1/properties/{expectedLines,minWords,maxWords} | (none) |
maxWords >= minWords when both > 0 | Domain-validator-enforced (WARN) | validate_course.py: validate_essay | (none) |
rubricText is Markdown when present | Advisory (question-types-reference.md §8) | (none) | (none) |
9.9 sentenceTransformation
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
type is "sentenceTransformation" | Schema-enforced | sentence-transformation.schema.json: /allOf/1/properties/type/const | §4.2 |
promptSentence, keyword, targetSentence, acceptedChunks required | Schema-enforced + Domain-validator-enforced (ERROR if missing) | sentence-transformation.schema.json: /allOf/1/required; validate_course.py: validate_sentence_transformation | (none) |
targetSentence contains exactly one @@@ placeholder (minLength: 4); multiple @@@ markers are non-conforming because SentenceTransformation chunks are sequential answer pieces typed at that single position, not separate gaps | Schema-enforced (pattern requires at least one @@@; minLength) + Domain-validator-enforced (ERROR if more than one @@@; WARN if zero) | sentence-transformation.schema.json: /allOf/1/properties/targetSentence/{pattern,minLength}; validate_course.py: validate_sentence_transformation | (none) |
acceptedChunks keys are ^[0-9]+$, each value minItems: 1, each chunk minLength: 1 | Schema-enforced | sentence-transformation.schema.json: /allOf/1/properties/acceptedChunks/patternProperties | (none) |
Chunk numbers SHOULD be sequential starting at 1 | Domain-validator-enforced (WARN) | validate_course.py: validate_sentence_transformation | (none) |
keyword SHOULD be uppercase | Domain-validator-enforced (WARN) | validate_course.py: validate_sentence_transformation | (none) |
Deprecated PascalCase chunks/keyword fields (AcceptedChunks, Keyword, …) → camelCase | Domain-validator-enforced (WARN) | validate_course.py: validate_sentence_transformation (deprecated_props map) | (none) |
allOrNothing boolean, default false (schema-declared); chunkCaseSensitive / chunkFeedback typed maps | Schema-enforced (types) + Domain-validator-enforced (WARN if not boolean / not dict) | sentence-transformation.schema.json; validate_course.py: validate_sentence_transformation | (none) |
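
The targetSentence row splits enforcement between the schema (at least one @@@) and the domain validator (no more than one). A minimal sketch of the domain-side counting, together with the keyword casing warning; `sentence_transformation_findings` and its messages are hypothetical:

```python
def sentence_transformation_findings(question):
    """Return (level, message) tuples for targetSentence and keyword."""
    findings = []
    # str.count counts non-overlapping occurrences of the placeholder.
    markers = question.get("targetSentence", "").count("@@@")
    if markers > 1:
        findings.append(("ERROR", "targetSentence has more than one @@@ placeholder"))
    elif markers == 0:
        findings.append(("WARN", "targetSentence has no @@@ placeholder"))
    keyword = question.get("keyword", "")
    if keyword != keyword.upper():
        findings.append(("WARN", "keyword should be uppercase"))
    return findings
```
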
9.10 matching
matching carries an if/then/else branch in the schema, keyed off matchingMode.
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
type is "matching" | Schema-enforced | matching.schema.json: /allOf/1/properties/type/const | §4.2 |
matchingMode required, enum: "pairs", "classification" | Schema-enforced | matching.schema.json: /allOf/1/required, /properties/matchingMode/enum | (none) |
pairs mode: pairs[] required, minItems: 2, each {item,match} required, additionalProperties: false | Schema-enforced | matching.schema.json: /allOf/1/then/{required,properties/pairs} | §7.1 (closed object disallows x- extensions inside) |
pairs mode: categories MUST NOT be present | Schema-enforced | matching.schema.json: /allOf/1/then/not/required[*]="categories" | (none) |
classification mode: categories[] required, minItems: 2, each {label,items} required (items.minItems: 1), additionalProperties: false | Schema-enforced | matching.schema.json: /allOf/1/else/{required,properties/categories} | §7.1 |
classification mode: pairs MUST NOT be present | Schema-enforced | matching.schema.json: /allOf/1/else/not/required[*]="pairs" | (none) |
distractors[*] non-empty strings | Schema-enforced | matching.schema.json: /allOf/1/properties/distractors/items/minLength | (none) |
allowPartialCredit boolean, default true (schema-declared) | Schema-enforced (type) | matching.schema.json: /allOf/1/properties/allowPartialCredit | (none) |
Consumers MUST randomize the choice pool (matches + distractors) | Advisory + runtime obligation | NORMATIVE.md §5.6 | §5.6 |
Consumers MUST randomize row order in classification mode | Advisory + runtime obligation | NORMATIVE.md §5.6 | §5.6 |
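
The if/then/else branch keyed off matchingMode can be mirrored in plain Python to see which field each mode requires and which it forbids. This sketch covers only the presence and minItems rules from the table, not the per-entry shapes; `matching_mode_errors` is a hypothetical name.

```python
def matching_mode_errors(question):
    """Return error strings for the mode-dependent fields of a matching question."""
    mode = question.get("matchingMode")
    if mode not in ("pairs", "classification"):
        return ["matchingMode must be 'pairs' or 'classification'"]
    # Each mode requires one array and forbids the other.
    required, forbidden = (
        ("pairs", "categories") if mode == "pairs" else ("categories", "pairs")
    )
    errors = []
    entries = question.get(required)
    if not isinstance(entries, list) or len(entries) < 2:
        errors.append(f"{mode} mode requires {required} with at least 2 entries")
    if forbidden in question:
        errors.append(f"{mode} mode must not carry {forbidden}")
    return errors
```
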
9.11 ordering
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
type is "ordering" | Schema-enforced | ordering.schema.json: /allOf/1/properties/type/const | §4.2 |
sourceText required, minLength: 1 | Schema-enforced | ordering.schema.json: /allOf/1/required, /properties/sourceText/minLength | (none) |
items required, minItems: 2, each minLength: 1 | Schema-enforced | ordering.schema.json: /allOf/1/required, /properties/items | (none) |
distractors[*] non-empty strings, default [] (schema-declared) | Schema-enforced (item type) | ordering.schema.json: /allOf/1/properties/distractors | (none) |
scoringMode enum: "strict", "kendall" (when present) | Schema-enforced | ordering.schema.json: /allOf/1/properties/scoringMode/enum | (none) |
scoringMode default: "strict" for orderingUnit:"word", "kendall" for "sentence"/"paragraph" | Advisory (description prose; no JSON Schema literal default) | ordering.schema.json: /allOf/1/properties/scoringMode/description | (none) |
orderingUnit enum: "word", "sentence", "paragraph", default "word" (schema-declared; advisory display hint) | Schema-enforced (enum) | ordering.schema.json: /allOf/1/properties/orderingUnit/{enum,default} | (none) |
9.12 placement
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
type is "placement" | Schema-enforced | placement.schema.json: /allOf/1/properties/type/const | §4.2 |
placementUnit, passage, placements required | Schema-enforced | placement.schema.json: /allOf/1/required | (none) |
placementUnit enum: "sentence", "paragraph", "sectionLabel", default "sentence" (schema-declared) | Schema-enforced (enum) | placement.schema.json: /allOf/1/properties/placementUnit/enum | (none) |
passage minLength: 1, MUST contain at least one @@@N marker | Schema-enforced | placement.schema.json: /allOf/1/properties/passage/{minLength,pattern} | (none) |
placements minItems: 1, each {gap >= 1, item.minLength >= 1}, additionalProperties: false | Schema-enforced | placement.schema.json: /allOf/1/properties/placements/items | §7.1 (closed) |
Every placements[*].gap references a @@@N marker present in passage | Domain-validator-enforced (ERROR) | validate_course.py: validate_placement | (none) |
No duplicate gap values within placements[] | Domain-validator-enforced (ERROR) | validate_course.py: validate_placement | (none) |
@@@N markers SHOULD be sequential starting at 1 | Domain-validator-enforced (WARN) | validate_course.py: validate_placement | (none) |
placementUnit: "paragraph" — marker SHOULD stand alone in its paragraph | Domain-validator-enforced (WARN) | validate_course.py: validate_placement | (none) |
placementUnit: "sectionLabel" — marker SHOULD be at the start of a paragraph followed by a space | Domain-validator-enforced (WARN) | validate_course.py: validate_placement | (none) |
Extra @@@N markers without a placements[] entry are valid decoy gaps (TOEFL Sentence Insertion variant) | Advisory | placement.schema.json: /allOf/1/properties/placements/description, question-types-reference.md §11 | (none) |
distractors[*] non-empty strings | Schema-enforced | placement.schema.json: /allOf/1/properties/distractors/items/minLength | (none) |
| Consumers MUST randomize the choice pool (placements items + distractors) | Advisory + runtime obligation | NORMATIVE.md §5.6 | §5.6 |
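The domain-validator rows above on gap references, duplicate gaps, and sequential markers can be sketched as follows. This is an illustration, not the code of validate_course.py. The function name check_placement and the message wording are hypothetical, and the sketch reads "sequential starting at 1" as "the set of marker numbers is exactly 1..n", which is an assumption.

```python
import re
from typing import Any, Dict, List, Tuple

MARKER_RE = re.compile(r"@@@(\d+)")


def check_placement(question: Dict[str, Any]) -> Tuple[List[str], List[str]]:
    """Return (errors, warnings) for the gap/marker rules in the table above."""
    errors: List[str] = []
    warnings: List[str] = []
    marker_set = {int(n) for n in MARKER_RE.findall(question["passage"])}

    seen = set()
    for entry in question["placements"]:
        gap = entry["gap"]
        if gap not in marker_set:
            errors.append(f"gap {gap} has no @@@{gap} marker in passage")
        if gap in seen:
            errors.append(f"duplicate gap {gap} in placements")
        seen.add(gap)

    # Assumption: "sequential" means the markers are exactly 1, 2, ..., n.
    if sorted(marker_set) != list(range(1, len(marker_set) + 1)):
        warnings.append("markers are not sequential starting at 1")
    return errors, warnings
```

A marker with no placements[] entry produces neither an error nor a warning here, which matches the decoy-gap row.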
10. Reserved and unknown types
The 7 reserved question types — association, hotspot, graphicGapMatch, graphicAssociate, graphicOrder, fileUpload, mediaPromptedEssay — are declared in question-base.schema.json’s enum but have no per-type schemas in 1.0. Their handling is normative under NORMATIVE.md §6 (the fallback contract):
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
| Reserved-type discriminator MUST be accepted by consumers | Schema-enforced | question-base.schema.json: /properties/type/enum | §5.5, §6.1 |
Reserved-type question MUST satisfy question-base.schema.json (type, globalId, prompt required; points validated against the base type/range when present, defaulting to 1.0 schema-declared) | Schema-enforced (required fields + type/range on points) | question-base.schema.json: /required=["type","globalId","prompt"], /properties/points/{type,minimum,default}; validator dispatches reserved types to question-base.schema.json | §6.3 |
| Consumer MUST preserve every member of reserved-type question objects across read/write cycles (semantic preservation; key order is producer-discretion per §6.2) | Advisory (round-trip preservation — runtime obligation, not document-validity) | NORMATIVE.md §6.2, §6.4 | §6.2, §6.4 |
Consumer MUST NOT silently drop reserved-type questions from questions[] | Advisory | NORMATIVE.md §6.2 | §6.2 |
| Consumer MUST treat reserved-type earned points as 0 (max still counts) | Advisory (runtime obligation) | NORMATIVE.md §6.2 | §6.2 |
| Consumer MUST report the unsupported question to the user at import (UI banner / log / returned warning) | Advisory | NORMATIVE.md §6.2 | §6.2 |
| Consumer SHOULD render a non-interactive placeholder naming the type | Advisory | NORMATIVE.md §6.2, ACCESSIBILITY.md §7 | §6.2, §12 |
| Consumer SHOULD disable navigation gating for unsupported questions | Advisory | NORMATIVE.md §6.2 | §6.2 |
| Producer SHOULD NOT emit reserved types in cross-implementation distribution | Advisory | NORMATIVE.md §6.3 | §6.3 |
Producer SHOULD use the published reserved name exactly (hotspot, not Hotspot); SHOULD document tool-specific extensions in IMPLEMENTATIONS.md / README | Advisory | NORMATIVE.md §6.5 | §6.5 |
Producer: emitting a discriminator value not listed in question-base.schema.json’s enum is non-conforming at 1.0 — the schema rejects it; the reference validator surfaces it with a friendlier message naming the allowed values | Schema-enforced + Domain-validator-enforced (ERROR) | question-base.schema.json: /properties/type/enum; validate_course.py (per-type question dispatch) | §6.1 |
| Consumer (1.0-only) reading a 1.x+ document with a type discriminator unknown to the consumer: apply the §6 fallback (preserve in full, treat earned points as 0, render placeholder, report to user) — do NOT reject the document | Advisory (runtime / forward-compat obligation, not document-validity) | NORMATIVE.md §6.1, §6.2, §6.4 | §6.1, §6.2 |
The two rows above are not in conflict: the producer-validity row describes the 1.0 strict-validator behavior on a document whose type enum is exhausted at 1.0 (the schema and the reference validator agree it’s a malformed 1.0 document). The consumer-import row describes the runtime obligation a 1.0-only consumer carries when it ingests a 1.x+ document whose newer type it does not recognize — there, NORMATIVE §6 binds the consumer to graceful fallback rather than rejection. A 1.0 consumer cannot validate a 1.x+ document under a 1.0 schema and therefore SHOULD NOT use schema validation as the ingest gate when reading future-minor content; consumer-side ingest is governed by §6, not by question-base.schema.json.
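The consumer-side fallback can be sketched as follows, under stated assumptions: questions are plain dicts, the consumer keeps its own set of supported types, and the names SUPPORTED_TYPES, import_questions, and score_unsupported are all hypothetical. It illustrates three of the obligations in the table (preserve, report, score earned points as 0) and does not render anything.

```python
from typing import Any, Dict, List, Set, Tuple

# Hypothetical: the types this particular consumer can render and score.
SUPPORTED_TYPES: Set[str] = {"simpleGapFill", "trueFalseQuestion", "multipleChoice"}


def import_questions(
    questions: List[Dict[str, Any]], supported: Set[str] = SUPPORTED_TYPES
) -> Tuple[List[Dict[str, Any]], List[str]]:
    """Keep every question; collect one user-facing warning per unsupported one."""
    kept: List[Dict[str, Any]] = []
    warnings: List[str] = []
    for question in questions:
        kept.append(question)  # preserved whole, never dropped or rewritten
        if question["type"] not in supported:
            warnings.append(
                f"Question {question['globalId']} has unsupported type "
                f"'{question['type']}'; it will be shown as a placeholder."
            )
    return kept, warnings


def score_unsupported(question: Dict[str, Any]) -> Tuple[float, float]:
    """Return (earned, maximum): earned is 0, the maximum still counts."""
    return 0.0, float(question.get("points", 1.0))
```

The same path serves reserved types and discriminators from a future minor version, since both are simply absent from the consumer's supported set.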
11. Question Sets (flat artifact)
A documentType: "questionSet" document is a flat questions list with no course hierarchy. Required root fields apply (see §3) plus:
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
title required, minLength: 1 | Schema-enforced | question-set.schema.json: /required[*]="title", /properties/title/minLength | §3.2 |
language required at root | Schema-enforced | question-set.schema.json: /required[*]="language" | §12.1 |
questions[] required (may be empty) | Schema-enforced | question-set.schema.json: /required[*]="questions" | (none) |
sourceQuestionSetId, when present, matches the RFC 4122 UUID pattern (any version; shape-only validation) | Schema-enforced | question-set.schema.json: /properties/sourceQuestionSetId/pattern | (none) |
version matches ^[0-9]+(\.[0-9]+){0,2}$, default "1.0" (schema-declared) | Schema-enforced (pattern) | question-set.schema.json: /properties/version/pattern | (none) |
Each questions[*] validates against its per-type schema (per-question dispatch) | Domain-validator-enforced (ERROR on schema failure or unknown discriminator) | validate_course.py (per-type question dispatch), validate_question_set_flat | §5.1, §5.3 |
12. Cross-cutting
12.1 HTML safety profile
HTML appears in two fields: ContentItem.html and SignpostItem.customHtml. The full normative profile is in HTML_SAFETY.md. The reference validator’s HTML checks live in validate_course.py: validate_html_content and mirror that profile.
| Surface | Severity (HTML_SAFETY.md §8) | Validator function |
|---|---|---|
Forbidden elements (<script>, <iframe>, <form>, <input>, <button>, <style>, <link>, <meta>, <base>, <svg>, <math>, etc.) | ERROR — consumer MUST reject | validate_html_content (HTML_FORBIDDEN_TAGS) |
Event-handler attributes (onclick, onload, onerror, …) | ERROR | validate_html_content (attr_name.startswith("on")) |
Form-submission attributes (srcdoc, formaction, …) | ERROR | validate_html_content |
javascript: or vbscript: URL in any URL-bearing attribute | ERROR | validate_html_content |
expression(...) / javascript: inside style CSS value | ERROR | validate_html_content |
data: URL in any URL-bearing attribute (including <img src>) | WARN | validate_html_content |
Other forbidden URL schemes (blob:, file:, chrome:, ftp:, ws:, gopher:, view-source:) | WARN | validate_html_content |
tel: URL (consumer-policy gated; see ITEM_PATTERNS.md §3) | WARN | validate_html_content |
Unknown element (not in HTML_ALLOWED_TAGS, not in forbidden list) | WARN — strip while preserving text | validate_html_content |
| Unknown attribute on an allowed element (outside §3 allowlist) | WARN — strip the attribute | validate_html_content |
CSS property outside the HTML_SAFETY.md §3.4 allowlist | WARN — strip the property | validate_html_content |
<a target="_blank"> without rel="noopener noreferrer" | WARN — consumer MUST normalize | validate_html_content |
<img> without alt attribute | WARN — empty alt="" permitted for decorative images | validate_html_content |
<video>/<audio> with autoplay or loop | WARN — producer MUST NOT emit; consumer SHOULD ignore | validate_html_content |
A conforming consumer MUST sanitize HTML before render regardless of producer claims (HTML_SAFETY.md §5).
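For implementers porting the ERROR-tier checks, a detection-only sketch using Python's standard html.parser is shown below. It is not the reference validate_html_content and it is not a sanitizer. The class and function names are hypothetical, the forbidden-tag set copies only the elements named in the table (the source list ends in "etc."), and the scheme check is applied to every attribute rather than only URL-bearing ones, which is a simplifying assumption.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple

# Only the elements spelled out in the table above; the full list is longer.
FORBIDDEN_TAGS = {
    "script", "iframe", "form", "input", "button",
    "style", "link", "meta", "base", "svg", "math",
}
BLOCKED_SCHEMES = ("javascript:", "vbscript:")


class HtmlErrorScanner(HTMLParser):
    """Collects ERROR-tier findings; tag and attribute names arrive lowercased."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.errors: List[str] = []

    def handle_starttag(self, tag: str, attrs: List[Tuple[str, Optional[str]]]) -> None:
        if tag in FORBIDDEN_TAGS:
            self.errors.append(f"forbidden element <{tag}>")
        for name, value in attrs:
            if name.startswith("on"):
                self.errors.append(f"event-handler attribute {name}")
            # Simplification: checks every attribute, not only URL-bearing ones.
            if value and value.strip().lower().startswith(BLOCKED_SCHEMES):
                self.errors.append(f"blocked URL scheme in {name}")


def scan_html(html: str) -> List[str]:
    scanner = HtmlErrorScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.errors
```

An empty result only means none of these particular checks fired; the consumer still has to sanitize before render.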
12.2 Accessibility preservation
ACCESSIBILITY.md defines two layers: a base-conformance preservation floor that binds every consumer, and an opt-in Accessibility Profile claim that binds delivery.
Round-trip preservation (base conformance, NORMATIVE.md §12.1):
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
alt on <img> MUST round-trip | Advisory (runtime / round-trip obligation) | NORMATIVE.md §12.1 | §12.1 |
<track> elements (incl. kind, src, srclang, label, default) MUST round-trip on <video>/<audio> | Advisory | NORMATIVE.md §12.1 | §12.1 |
lang and dir attributes on HTML-bearing elements MUST round-trip | Advisory | NORMATIVE.md §12.1 | §12.1 |
Document-root language MUST round-trip | Schema-enforced (required) + runtime preservation obligation | course.schema.json: /required[*]="language"; NORMATIVE.md §12.1 | §12.1 |
Document-root supportLanguage MUST round-trip when present (including explicit null) | Advisory | NORMATIVE.md §12.1 | §12.1 |
| Reserved-type questions MUST round-trip with any accessibility metadata they carry | Advisory | NORMATIVE.md §6.4, §12.1 | §6.4, §12.1 |
Extension-preserving consumers (§7.4) SHOULD round-trip x- extension members carrying accessibility data | Advisory | NORMATIVE.md §12.1 | §12.1 |
Opt-in Accessibility Profile delivery obligations (binding only when claimed): see ACCESSIBILITY.md §§2–8. Not duplicated here.
Validator severity for accessibility issues at the current baseline (ACCESSIBILITY.md §8):
| Issue | Severity | Validator function |
|---|---|---|
Missing alt on <img> | WARN | validate_html_content |
<video> without <track kind="captions"> or <track kind="subtitles"> | WARN (current baseline; promotion to ERROR under the --accessibility flag is targeted for 1.0 final) | validate_html_content (post-pass scan for <video>…</video> blocks) |
<iframe>, <script>, event handlers | ERROR | validate_html_content |
Missing language at document root | ERROR (schema-enforced) | course.schema.json / question-set.schema.json required |
Reserved-type question without title | NOTE | (advisory; not currently surfaced) |
12.3 Randomization requirements
NORMATIVE.md §5.6 binds two surfaces. These are consumer rendering obligations, not document-validity rules — a document is conforming whether or not consumers randomize it. Listed here so implementers know what they MUST do at render time:
- Choice pool for matching (pairs/classification) and placement MUST be presented in randomized order.
- Row order in matching classification mode MUST be randomized.
- The randomization algorithm and any seeding strategy are consumer-defined.
- Exemptions: multipleChoice (per-question shuffleOptions instead), matching pairs rows, ordering source tiles.
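Since the algorithm and seeding are consumer-defined, the following is only one possible sketch, shown for placement because its pool (placements[*].item plus distractors) is spelled out in §9.12. The function name and the optional seed parameter are hypothetical. A fixed seed is one way to give a learner the same order on reload.

```python
import random
from typing import Any, Dict, List, Optional


def randomized_choice_pool(question: Dict[str, Any], seed: Optional[int] = None) -> List[str]:
    """Return placement items plus distractors in shuffled order."""
    pool = [entry["item"] for entry in question["placements"]]
    pool += question.get("distractors", [])
    rng = random.Random(seed)  # seed=None lets Python pick its own seed
    rng.shuffle(pool)          # shuffles the new list; the document is untouched
    return pool
```

The document itself is never reordered, so round-trip preservation is unaffected by rendering.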
12.4 Extensions (x- members)
Extension rules from NORMATIVE.md §7. Round-trip preservation by extension-preserving consumers is a runtime obligation, not a document-validity rule.
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
x- keys MAY appear on root + Course/Unit/Lesson/Item/Question | Advisory (schemas omit additionalProperties: false on these objects) | NORMATIVE.md §7.1 | §7.1 |
x- keys MUST NOT appear on closed objects (matching.pairs[*], matching.categories[*], placement.placements[*]) | Schema-enforced | matching.schema.json / placement.schema.json (additionalProperties: false) | §7.1 |
Producer MUST NOT emit non-extension fields whose name begins with x- | Advisory | NORMATIVE.md §7.1 | §7.1 |
| Producer MUST NOT emit an extension under a namespace it does not own | Advisory | NORMATIVE.md §7.2 | §7.2 |
Extensions are strictly additive — removing every x- member MUST leave a conforming document with equivalent learner-facing meaning | Advisory | NORMATIVE.md §7.3 | §7.3 |
Consumer MUST NOT reject documents solely for x- members or interpret members outside its own namespace | Advisory | NORMATIVE.md §7.4 | §7.4 |
Extension-preserving consumers SHOULD round-trip unrecognized x- members on the same object | Advisory (round-trip behavior) | NORMATIVE.md §7.4 | §7.4 |
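The "strictly additive" row implies a simple test: strip every x- member and re-validate what remains. A sketch of the stripping step follows; the function name strip_extensions and the OPAQUE_MAPS set are hypothetical. The set exists because some objects are keyed by author content rather than by member names (an option literally called "x-ray" in optionsAndPoints, for example), and which maps need that treatment is an assumption here.

```python
from typing import Any, FrozenSet

# Assumption: objects under these keys are keyed by author text, not member names.
OPAQUE_MAPS: FrozenSet[str] = frozenset({"optionsAndPoints", "choiceFeedback"})


def strip_extensions(node: Any, opaque_keys: FrozenSet[str] = OPAQUE_MAPS) -> Any:
    """Return a copy of node with every x- prefixed member removed."""
    if isinstance(node, dict):
        stripped = {}
        for key, value in node.items():
            if key.startswith("x-"):
                continue  # extension member: drop it and everything beneath it
            if key in opaque_keys:
                stripped[key] = value  # content-keyed map: leave untouched
            else:
                stripped[key] = strip_extensions(value, opaque_keys)
        return stripped
    if isinstance(node, list):
        return [strip_extensions(item, opaque_keys) for item in node]
    return node
```

The stripped copy is what a producer would feed to the validator to check the rule; the original document, extensions included, is what gets written out.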
12.5 Versioning and URL stability
NORMATIVE.md §8 binds publication-side guarantees. Not document-validity rules per se, but consumers SHOULD enforce them when resolving $schema:
- $schema URL identifies the specific publication; specVersion identifies the contract version (§4.6, §4.7, §8.4).
- /X.Y/ paths are reserved for accepted final releases; /X.Y-rc.N/ paths are immutable once published; rc.N → final adoption is an explicit re-export (§8.1, §8.3).
- A document declaring specVersion: "1.0" with $schema at /1.0-rc.N/ validates against /1.0-rc.N/ and is not required to validate against /1.0/ (§8.4).
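A consumer that wants to tell a final publication from a release candidate can parse the version segment of $schema. The sketch below assumes the version is the first path segment and that a schema file name follows it; both the URL shape beyond lc-json.org/X.Y[-rc.N]/ and the function name publication_of are assumptions.

```python
import re
from typing import Optional, Tuple
from urllib.parse import urlparse

# First path segment: X.Y for a final release, X.Y-rc.N for a release candidate.
_VERSION_SEGMENT = re.compile(r"^/(\d+\.\d+)(?:-rc\.(\d+))?/")


def publication_of(schema_url: str) -> Tuple[str, Optional[int]]:
    """Return (contract version, rc number or None) for a $schema URL."""
    match = _VERSION_SEGMENT.match(urlparse(schema_url).path)
    if match is None:
        raise ValueError(f"no version path in {schema_url!r}")
    version, rc = match.groups()
    return version, (int(rc) if rc is not None else None)
```

A document at /1.0-rc.3/ and one at /1.0/ both report contract version "1.0", but they name different publications and are validated against different schema files.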
12.6 Discriminator casing
Reiteration of §3, §7, §8: conforming consumers MUST reject non-canonical casings on documentType, item type, and question type (NORMATIVE.md §4.2, §5.3). The schemas enforce these via const / enum. The reference validator additionally provides lenient migration paths (PascalCase → camelCase warnings, casing-tolerant documentType dispatch) that are disabled under --strict.
12.7 globalId uniqueness
| Rule | Tier | Source | NORMATIVE § |
|---|---|---|---|
globalId values unique across all entities in a document (Units, Lessons, Items, Questions share one namespace; comparison case-insensitive) | Domain-validator-enforced (ERROR) | validate_course.py: _collect_duplicate_global_id_errors (course and questionSet paths) | §4.4 |
JSON Schema cannot express cross-entity uniqueness across nesting levels, so this rule is domain-validator-only. Reference fields that point at a globalId (contentItemId, relatedItemIds) are references, not declarations, and are exempt. Conformance fixture: tests/invalid/40-duplicate-global-id.json.
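A structure-agnostic sketch of this check is shown below. It is not the reference _collect_duplicate_global_id_errors; the function name is hypothetical. It walks the whole document and counts only values stored under the key globalId, so reference fields such as contentItemId are exempt by construction.

```python
from collections import Counter
from typing import Any, List


def duplicate_global_ids(document: Any) -> List[str]:
    """Return lowercased globalId values that are declared more than once."""
    counts: Counter = Counter()

    def walk(node: Any) -> None:
        if isinstance(node, dict):
            declared = node.get("globalId")
            if isinstance(declared, str):
                counts[declared.lower()] += 1  # comparison is case-insensitive
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(document)
    return sorted(gid for gid, n in counts.items() if n > 1)
```

One caveat of walking every member: a globalId nested inside an extension object would also be counted, which a stricter implementation might want to exclude.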
13. Conformance note
The catalog tiers describe what validate_course.py --strict enforces today; the published conformance contract is the --strict behavior (§1.3).
Producers MUST emit documents that satisfy every Schema-enforced rule and every Domain-validator-enforced (ERROR) rule. Producers SHOULD additionally honor Domain-validator-enforced (WARN) rules; the validator’s warnings flag suspect-but-not-rejected content that authors typically want to fix.
Consumers MUST reject documents that fail any Schema-enforced or Domain-validator-enforced (ERROR) rule, with one explicit exception: where a row distinguishes producer-emission from consumer-import (the $schema rows in §3 are the canonical example), the consumer-side row applies. This matches NORMATIVE.md §3.2’s strict-producer / lenient-consumer split — a producer that omits $schema is non-conforming with respect to that document, but a consumer that rejects an otherwise-valid document on the basis of a missing $schema is overly strict. Domain-validator-enforced (WARN) rules describe sanitization, accessibility, or migration-aid behavior — consumers SHOULD surface them but are not required to reject on their basis. NOTE-tier rows are informational only.
Advisory rules carry the RFC 2119 weight stated in the cited section (NORMATIVE.md MUST/SHOULD/MAY, HTML_SAFETY.md §8 severity, ACCESSIBILITY.md §8). Consumers that diverge from advisory SHOULD/MAY rules are non-canonical but not non-conforming. Where a rule is enforced in multiple tiers (schema + validator), satisfying the strictest tier suffices.
A note on JSON Schema format keywords. Several rows above cite format: "uri" or format: "uuid" from the schemas. Under JSON Schema Draft 7, format is an annotation by default — a validator only enforces it when configured with a FormatChecker (or equivalent). The reference validator’s Draft7Validator instance runs without explicit format assertions, so format-only claims are not guaranteed by the schema pass alone. Rows that depend on these formats also cite a regex pattern (for UUIDs on globalId properties) or a domain-validator backstop (validate_course.py: validate_item via is_valid_uuid for contentItemId / relatedItemIds). Implementers re-implementing the validator in other languages should either enable format-assertion in their JSON Schema library or replicate the regex/domain backstops.
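For ports that cannot enable format assertion, the shape-only backstop is a short regular expression. The sketch below illustrates the 8-4-4-4-12 hex check using only the standard library. The helper name is_uuid_shaped is hypothetical and is not the reference validator's is_valid_uuid.

```python
import re
from typing import Any

# Shape only: 8-4-4-4-12 hex digits, any UUID version, either letter case.
_UUID_SHAPE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def is_uuid_shaped(value: Any) -> bool:
    """True when value is a string with the 8-4-4-4-12 hex shape."""
    # fullmatch avoids the trailing-newline leniency of a "$" anchor.
    return isinstance(value, str) and _UUID_SHAPE.fullmatch(value) is not None
```

This deliberately does not inspect the version or variant bits, in line with the "any version; shape-only" wording used for globalId.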
A note on JSON Schema default keywords. Several rows above cite a property’s default value (e.g. isGraded defaults to true on quiz, points defaults to 1.0 on questions, placementUnit defaults to "sentence"). Under JSON Schema Draft 7, default is an annotation — most validators (including jsonschema for Python, AJV with default options, etc.) do not apply or enforce it. A producer that omits the property emits a document that validates; a consumer that reads such a document MUST apply the default itself if it wants the documented behavior. The defaults are listed here so implementers know what the spec intends absent an explicit value — they are schema-declared, not schema-enforced. Consumers SHOULD NOT rely on the validator filling in defaults; producers SHOULD emit explicit values when the documented default does not match their intent.
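A consumer-side sketch of applying declared defaults follows. The helper name with_defaults is hypothetical, and the defaults mapping is supplied by the caller from the relevant table rather than read from the schema. The result is a working copy for rendering and scoring; the sketch assumes the original object is what gets written back, so that omitted properties stay omitted on round-trip.

```python
import copy
from typing import Any, Dict


def with_defaults(question: Dict[str, Any], defaults: Dict[str, Any]) -> Dict[str, Any]:
    """Return a working copy of question with declared defaults filled in."""
    resolved = copy.deepcopy(defaults)  # never share mutable defaults such as []
    resolved.update(question)           # explicit values, including null, win
    return resolved
```

For example, with_defaults(question, {"points": 1.0}) yields a points value of 1.0 for a question that omits it, while the stored document is left unchanged.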
Where the reference validator and a normative document disagree, the normative document wins. Discrepancies should be reported as issues against this spec; the validator is then updated to match the normative text.
14. Forward-looking deepenings (1.0 final)
The inventory pass that produced this catalog (2026-05-24) surfaced eight documented-but-unenforced rules. All eight were closed in the same rc.1-polish session by extending tools/validate_course.py; no schemas changed, because the closures land in the domain-validator pass. The corresponding rows in the per-type tables above are tagged Domain-validator-enforced rather than Advisory, and new invalid conformance fixtures (tests/invalid/21-mcq-no-correct-option.json, 22-mcq-options-points-missing-entry.json, 23-word-bank-cloze-gap-count-mismatch.json, 24-multiple-choice-cloze-index-out-of-bounds.json) pin the ERROR-tier checks. The corpus passes 64/64 under python tools/run_corpus.py, which invokes validate_course.py --strict internally on every fixture. The 64 fixtures comprise the 36 present at the time of that rc.1 pass, the two prompt-correction fixtures added in rc.2, and the per-type / referential-integrity / grading-matrix / globalId-uniqueness expansion added in rc.3.
Three areas remain explicitly forward-looking for 1.0 final or beyond:
- --accessibility validator flag. The <video> without <track kind="captions"|"subtitles"> check (§12.2) is WARN at the current baseline. The 1.0-final --accessibility flag promotes it (and related accessibility warnings) to ERROR so tooling that wants to fail-build on accessibility-profile claims can do so.
- Tag namespace conventions. Optional best-practice tag prefixes (stage:, level:, exam:, …) are described informally in ITEM_PATTERNS.md §1. No schema-level constraint, no validator check; left to convention for 1.0. Referential-integrity validation on objectiveIds is closed at rc.1 (consumers MUST report unresolved IDs per the validator).
- Reserved-type per-type schemas. The 7 reserved question types (hotspot, association, etc.) validate against question-base.schema.json only in 1.0 (§10). First-class per-type schemas are targeted for the 1.1 minor.
Future deepenings (a new accessibility rule promoted to ERROR, a new cross-document rule added by 1.1) will surface as new rows in the per-type tables above or as new entries in this section. The published /1.0-rc.2/ and /1.0-rc.3/ schema URLs remain immutable per NORMATIVE.md §8.3; any future closures land at /1.0/ or a later version path.
LC-JSON Question Types — Format Reference
Spec version: 1.0
Purpose: Per-type property reference for LC-JSON (Learning Content JSON), covering the 12 implemented question types and the 7 reserved-for-2027 types.
Table of Contents
- Overview
- Common Properties
- Phase 1: Core Foundation
- Phase 2: Cloze Family
- Phase 3: Text Entry
- Phase 4: Structured Tasks (Implemented)
- Reserved Types
- Validation Rules
Overview
LC-JSON questions are tagged-union objects. Every question carries a type field whose value selects the per-type schema that applies. Consumers dispatch on type to validate and render.
Key requirements:
- The type discriminator value uses canonical camelCase (simpleGapFill, multipleChoice, …). Conforming consumers MUST reject non-canonical casings (NORMATIVE.md §5.3).
- All property names use camelCase.
- Every question carries a globalId in RFC 4122 UUID form (any version; shape-only validation against the 8-4-4-4-12 hex pattern).
- See NORMATIVE.md for the full conformance requirements.
Question Types (19 total: 12 implemented, 7 reserved):
Phase 1 - Core Foundation:
1. simpleGapFill - Single gap with free text entry
2. trueFalseQuestion - Binary choice questions
3. multipleChoice - Single or multiple correct answers
Phase 2 - Cloze Family:
4. wordBankCloze - Gap fill from word bank
5. multiGapCloze - Multiple free-text gaps
6. multipleChoiceCloze - Multiple dropdown gaps
Phase 3 - Text Entry:
7. shortAnswer - Free text response
8. essay - Long-form text with word limits
Phase 4 - Structured tasks (implemented):
9. matching - Pair items 1:1, or classify items into categories
10. ordering - Sequence items (word / sentence / paragraph variants)
11. placement - Place items into anchored gaps in a structured passage (sentence / paragraph / sectionLabel variants; supports decoy gaps for TOEFL Sentence Insertion)
12. sentenceTransformation - Cambridge exam-style controlled paraphrase
Reserved types (per NORMATIVE.md §6 — preserved on round-trip; per-type schemas targeted for 2027):
13. association - Group items into categories
14. hotspot - Click regions on image
15. graphicGapMatch - Visual drag-and-drop
16. graphicAssociate - Associate items with images
17. graphicOrder - Order items based on images
18. fileUpload - Submit documents
19. mediaPromptedEssay - Record audio/video
Common Properties
All question types inherit these base properties:
{
"type": "simpleGapFill",
"globalId": "550e8400-e29b-41d4-a716-446655440000",
"title": "Question title",
"prompt": "",
"tags": ["tag1", "tag2"],
"difficulty": 5.0,
"points": 1.0,
"hint": "Optional hint text",
"feedback": {
"correct": "Feedback shown when the answer is correct",
"incorrect": "Feedback shown when the answer is incorrect",
"choiceFeedback": {
"choice1": "Per-choice feedback (where applicable)"
}
}
}
Property Details:
| Property | Type | Required | Default | Description |
|---|---|---|---|---|
type | string | ✅ Yes | - | Question type discriminator. Canonical camelCase form. |
globalId | string (UUID) | ✅ Yes | - | RFC 4122 UUID (any version; shape-only validation); stable across versions of the question. |
title | string | ❌ No | "" | Short title for editorial/list views. |
prompt | string | ✅ Yes | "" | Main question text. Required for every type; may be empty (""). Authoritative for the real-content types (true/false, multiple choice, short answer, essay), where it is the question. Non-authoritative for the symbolic types, whose structured fields carry the meaning — there it MAY be empty or MAY carry a brief producer-derived readable summary (see the symbolic-type note below). |
tags | string[] | ❌ No | [] | Tag array for categorization. |
difficulty | number | ❌ No | 5.0 | Estimated difficulty for the intended learners (0.0 = extremely easy, 10.0 = extremely difficult). |
points | number | ❌ No | 1.0 | Points awarded for a correct answer. |
hint | string | ❌ No | null | Optional hint shown to the learner. |
feedback | object | ❌ No | null | Optional feedback bundle (see FeedbackBundle below). |
difficulty is an author estimate, not a subject level, grade level, CEFR level, or Bloom band. It estimates how challenging the question is for its intended learners. Applications SHOULD display it in teacher-readable form, commonly rounded to the nearest whole number on the 0-10 scale, and MAY later compare it with observed first-attempt success rates.
Phase 1: Core Foundation
1. SimpleGapFill
Description: Single gap with free text entry and multiple acceptable answers.
Use Case: Simple fill-in-the-blank questions.
Example: “The capital of France is ___.”
{
"type": "simpleGapFill",
"globalId": "550e8400-e29b-41d4-a716-446655440001",
"prompt": "",
"title": "Capital of France",
"tags": ["geography", "level:A1"],
"difficulty": 2.0,
"points": 1.0,
"sentence": "The capital of France is @@@.",
"acceptedAnswers": ["Paris", "paris"],
"caseSensitive": false
}
Type-Specific Properties:
| Property | Type | Required | Default | Description |
|---|---|---|---|---|
sentence | string | ✅ Yes | "" | Sentence with @@@ marking gap position |
acceptedAnswers | string[] | ✅ Yes | [] | List of acceptable answers |
caseSensitive | boolean | ❌ No | false | Whether answer matching is case-sensitive |
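How a consumer might compare a response against acceptedAnswers is sketched below. The function name is hypothetical. The sketch compares the response exactly as typed; whether surrounding whitespace is trimmed first is not stated in this reference, so no trimming is applied.

```python
from typing import Any, Dict


def is_accepted(question: Dict[str, Any], response: str) -> bool:
    """True when response matches one of the question's acceptedAnswers."""
    answers = question["acceptedAnswers"]
    if question.get("caseSensitive", False):  # declared default is false
        return response in answers
    folded = {answer.casefold() for answer in answers}
    return response.casefold() in folded
```

With caseSensitive left at false, listing both "Paris" and "paris" is redundant but harmless.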
2. TrueFalseQuestion
Description: Binary choice question with a boolean correctAnswer as the single source of truth. Supports configurable display styles and an optional penalty for incorrect answers.
Use Case: True/False, Correct/Incorrect, or visual checkmark/X questions.
Example: “Water boils at 100°C at sea level. True or False?”
{
"type": "trueFalseQuestion",
"globalId": "550e8400-e29b-41d4-a716-446655440002",
"prompt": "Water boils at 100°C at sea level.",
"title": "Boiling Point",
"tags": ["science", "level:A2"],
"difficulty": 1.0,
"points": 1.0,
"correctAnswer": true,
"displayStyle": "TrueFalse",
"penalizeIncorrect": false,
"incorrectPenaltyPercent": 50.0
}
Type-Specific Properties:
| Property | Type | Required | Default | Description |
|---|---|---|---|---|
correctAnswer | boolean | ✅ Yes | - | The correct answer (true or false). Single source of truth for scoring. |
displayStyle | string | ❌ No | "TrueFalse" | UI label style: “TrueFalse”, “CorrectIncorrect”, “CheckmarkX”. Presentation only — does not affect scoring. |
penalizeIncorrect | boolean | ❌ No | false | Whether to apply a point penalty for a wrong answer. Independent of item type and isGraded — author’s choice. Common: false when you want learners to try without risk; true when guessing should cost something. |
incorrectPenaltyPercent | number | ❌ No | 50.0 | Penalty percentage (0-100). 0%=no penalty, 50%=partial, 100%=full penalty. |
Import normalization (pre-1.0 lenient migration affordance — NOT conforming behavior). Some authoring tools historically emitted non-boolean correctAnswer values for True/False questions. A consumer MAY accept and normalize the following on read, purely as a migration aid for ingesting pre-1.0 documents:
- true, "true", "True", "correct", "tick", "✓", 1 → normalized to true
- false, "false", "False", "incorrect", "cross", "✗", 0 → normalized to false
Conforming behavior under LC-JSON 1.0 is unambiguous: the schema requires correctAnswer to be a JSON boolean (true / false). Conforming producers MUST emit it as a boolean. Conforming consumers in strict mode MUST reject non-boolean values per NORMATIVE.md §5.1 — the reference validator’s --strict mode (which tools/run_corpus.py invokes on every fixture) does so. Tools relying on the normalization above should treat it as a transitional ingestion aid that does not survive into a --strict-conforming document on re-export.
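The split between strict and lenient handling can be sketched as a single function. The name normalize_correct_answer and the strict parameter are hypothetical; the accepted spellings are exactly those listed above and nothing more.

```python
from typing import Any

_TRUE_STRINGS = {"true", "True", "correct", "tick", "✓"}
_FALSE_STRINGS = {"false", "False", "incorrect", "cross", "✗"}


def normalize_correct_answer(value: Any, strict: bool = True) -> bool:
    """Return a boolean correctAnswer, or raise ValueError if not acceptable."""
    if isinstance(value, bool):
        return value  # already conforming
    if strict:
        raise ValueError(f"correctAnswer must be a JSON boolean, got {value!r}")
    if isinstance(value, str):
        if value in _TRUE_STRINGS:
            return True
        if value in _FALSE_STRINGS:
            return False
    elif isinstance(value, int):  # bool was handled above, so this is a real integer
        if value == 1:
            return True
        if value == 0:
            return False
    raise ValueError(f"cannot normalize correctAnswer value {value!r}")
```

A tool using the lenient path would write the returned boolean on re-export, so the normalized document passes --strict.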
Note: For True/False/Not Mentioned questions (3 options), use MultipleChoice instead.
3. MultipleChoice
Description: Single or multiple correct answers with optional partial credit and shuffling.
Use Case: Traditional multiple-choice questions (MCQ).
Example: “Which of the following are programming languages? (Select all that apply)”
{
"type": "multipleChoice",
"globalId": "550e8400-e29b-41d4-a716-446655440003",
"prompt": "Which of the following are programming languages?",
"title": "Programming Languages",
"tags": ["programming", "level:B1"],
"difficulty": 3.0,
"points": 2.0,
"options": ["Python", "HTML", "Java", "CSS"],
"optionsAndPoints": {
"Python": 1.0,
"HTML": 0.0,
"Java": 1.0,
"CSS": 0.0
},
"allowMultipleCorrect": true,
"allowPartialCredit": true,
"penalizeIncorrect": false,
"shuffleOptions": true,
"showLetterLabels": true
}
Type-Specific Properties:
| Property | Type | Required | Default | Description |
|---|---|---|---|---|
options | string[] | ✅ Yes | [] | Array of answer choices |
optionsAndPoints | object | ✅ Yes | {} | Dictionary mapping options to points (>0 = correct) |
allowMultipleCorrect | boolean | ❌ No | false | Allow selecting multiple answers |
allowPartialCredit | boolean | ❌ No | true | Award partial credit for partially correct answers |
penalizeIncorrect | boolean | ❌ No | false | Deduct points for incorrect selections |
shuffleOptions | boolean | ❌ No | false | Randomize option order for each student |
showLetterLabels | boolean | ❌ No | false | Display A, B, C, D labels |
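Two consistency checks follow naturally from these properties: every option should have an optionsAndPoints entry, and at least one option should carry points above zero. The sketch below is inferred from the property descriptions and the names of the related conformance fixtures, not copied from the reference validator; the function name and messages are hypothetical.

```python
from typing import Any, Dict, List


def check_multiple_choice(question: Dict[str, Any]) -> List[str]:
    """Return error messages for options/optionsAndPoints inconsistencies."""
    errors: List[str] = []
    options = question["options"]
    points = question["optionsAndPoints"]
    for option in options:
        if option not in points:
            errors.append(f"option {option!r} has no optionsAndPoints entry")
    # ">0 = correct": at least one option must be worth something.
    if not any(points.get(option, 0) > 0 for option in options):
        errors.append("no option carries points greater than 0")
    return errors
```

Run against the example above, the function returns an empty list: all four options have entries and two of them are worth 1.0.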
Phase 2: Cloze Family
4. WordBankCloze
Description: Passage with gaps filled from a shared word pool (includes distractors).
Use Case: Cambridge FCE/CAE, vocabulary exercises.
Example: “Fill in the blanks using words from the word bank.”
{
"type": "wordBankCloze",
"globalId": "550e8400-e29b-41d4-a716-446655440004",
"prompt": "",
"title": "Word Bank Exercise",
"tags": ["grammar:articles", "level:B1"],
"difficulty": 5.0,
"points": 5.0,
"passage": "I saw @@@1 cat and @@@2 dog in @@@3 park yesterday. @@@4 cat was chasing @@@5 dog.",
"wordBank": ["a", "an", "the", "some"],
"gapAcceptedAnswers": {
"1": ["a"],
"2": ["a"],
"3": ["the"],
"4": ["The"],
"5": ["the"]
},
"gapCaseSensitive": {
"4": true
},
"allowWordReuse": true,
"bankPosition": "above",
"gapFeedback": {
"1": "Remember: 'a' is used before consonants"
},
"allowPartialCredit": true
}
Type-Specific Properties:
| Property | Type | Required | Default | Description |
|---|---|---|---|---|
passage | string | ✅ Yes | "" | Text with numbered @@@1, @@@2, etc. marking gaps (1-based) |
wordBank | string[] | ✅ Yes | [] | Pool of words to choose from (includes distractors) |
gapAcceptedAnswers | object | ✅ Yes | {} | Dictionary: gap number (1-based) → array of accepted answers |
gapCaseSensitive | object | ❌ No | null | Dictionary: gap number → boolean (default: false) |
allowWordReuse | boolean | ❌ No | false | Whether the same word may be used in more than one gap |
bankPosition | string | ❌ No | "above" | Word bank position: “above”, “below”, “side” |
gapFeedback | object | ❌ No | null | Dictionary: gap number → feedback string |
allowPartialCredit | boolean | ❌ No | true | Award partial credit for some correct answers |
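Because gaps are declared twice, once as @@@N markers in passage and once as keys of gapAcceptedAnswers, the two can drift apart. A hedged sketch of a cross-check follows; the function name and messages are hypothetical, and the exact rule enforced by the reference validator may differ.

```python
import re
from typing import Any, Dict, List

MARKER_RE = re.compile(r"@@@(\d+)")


def check_gap_keys(question: Dict[str, Any]) -> List[str]:
    """Compare @@@N markers in passage with the keys of gapAcceptedAnswers."""
    markers = set(MARKER_RE.findall(question["passage"]))  # e.g. {"1", "2"}
    keys = set(question["gapAcceptedAnswers"])
    errors: List[str] = []
    for gap in sorted(markers - keys, key=int):
        errors.append(f"gap {gap} has a marker but no accepted answers")
    for gap in sorted(keys - markers):
        errors.append(f"gap {gap} has accepted answers but no marker")
    return errors
```

For the example above the result is empty: the passage carries markers 1 through 5 and gapAcceptedAnswers has exactly those five keys.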
5. MultiGapCloze
Description: Passage with multiple gaps, each accepting free text with multiple valid answers.
Use Case: Cambridge FCE/CAE Reading Part 2 (Open Cloze).
Example: “Fill in the blanks (no word bank provided).”
{
"type": "multiGapCloze",
"globalId": "550e8400-e29b-41d4-a716-446655440005",
"prompt": "",
"title": "Open Cloze Exercise",
"tags": ["grammar:prepositions", "exam:fce", "level:B2"],
"difficulty": 6.0,
"points": 8.0,
"passage": "She walked @@@1 the park and sat @@@2 a bench @@@3 the lake.",
"gapAcceptedAnswers": {
"1": ["through", "in", "into"],
"2": ["on"],
"3": ["by", "near", "beside"]
},
"gapCaseSensitive": {
"1": false,
"2": false,
"3": false
},
"gapFeedback": {
"2": "We use 'on' with bench"
},
"allowPartialCredit": true
}
Type-Specific Properties:
| Property | Type | Required | Default | Description |
|---|---|---|---|---|
| passage | string | ✅ Yes | "" | Text with numbered @@@1, @@@2, etc. marking gaps (1-based) |
| gapAcceptedAnswers | object | ✅ Yes | {} | Dictionary: gap number (1-based) → array of accepted answers |
| gapCaseSensitive | object | ❌ No | null | Dictionary: gap number → boolean (default: false) |
| gapFeedback | object | ❌ No | null | Dictionary: gap number → feedback string |
| allowPartialCredit | boolean | ❌ No | true | Award partial credit for some correct answers |
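The properties above can be read as a small scoring procedure. The following is a minimal Python sketch, not a normative algorithm: the function name `score_gaps`, the shape of `responses`, the whitespace trimming, and the equal weighting of gaps under partial credit are all assumptions made for illustration.

```python
def score_gaps(question, responses):
    """Score a gap-fill passage (hypothetical helper, not part of LC-JSON).

    `responses` maps gap numbers as strings ("1", "2", ...) to learner text.
    """
    accepted = question["gapAcceptedAnswers"]
    case_flags = question.get("gapCaseSensitive") or {}  # null: every gap is case-insensitive
    correct = 0
    for gap, answers in accepted.items():
        given = responses.get(gap, "").strip()  # trimming is an assumption
        if case_flags.get(gap, False):
            hit = given in answers
        else:
            hit = given.lower() in {a.lower() for a in answers}
        correct += hit
    total = len(accepted)
    if total == 0:
        return 0.0
    points = question["points"]  # a null `points` is not handled in this sketch
    if question.get("allowPartialCredit", True):
        return points * correct / total  # equal weight per gap: an assumption
    return points if correct == total else 0.0
```

With the example question above, a learner who answers gaps 1 and 3 correctly but misses gap 2 would earn two thirds of the 8 points under this sketch, or nothing if `allowPartialCredit` were `false`.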
6. MultipleChoiceCloze
Description: Passage with multiple gaps; each gap offers 3–4 discrete options (dropdown).
Use Case: Cambridge FCE/CAE Reading Part 1.
Example: “Choose the correct word for each gap from the dropdown.”
{
"type": "multipleChoiceCloze",
"globalId": "550e8400-e29b-41d4-a716-446655440006",
"prompt": "",
"title": "Multiple Choice Cloze",
"tags": ["vocabulary", "exam:fce", "level:B2"],
"difficulty": 7.0,
"points": 6.0,
"passage": "The weather was @@@1 cold that we decided to stay indoors. We @@@2 a movie instead.",
"gapOptions": {
"1": ["so", "such", "very", "too"],
"2": ["watched", "saw", "looked", "viewed"]
},
"correctAnswers": {
"1": 0,
"2": 0
},
"gapOptionFeedback": {
"1": {
"0": "Correct! 'so' is used before adjectives",
"1": "'such' is used before nouns",
"2": "'very' doesn't fit with 'that'",
"3": "'too' suggests excess"
}
},
"allowPartialCredit": true,
"shuffleOptions": false
}
Type-Specific Properties:
| Property | Type | Required | Default | Description |
|---|---|---|---|---|
| passage | string | ✅ Yes | "" | Text with numbered @@@1, @@@2, etc. marking gaps (1-based) |
| gapOptions | object | ✅ Yes | {} | Dictionary: gap number (1-based) → array of options |
| correctAnswers | object | ✅ Yes | {} | Dictionary: gap number (1-based) → correct option index (0-based) |
| gapOptionFeedback | object | ❌ No | null | Dictionary: gap number → option index → feedback |
| allowPartialCredit | boolean | ❌ No | true | Award partial credit for some correct answers |
| shuffleOptions | boolean | ❌ No | false | Randomize option order within each gap |
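The nested dictionaries above mix 1-based gap numbers with 0-based option indices, which is easy to get wrong. The sketch below shows one way a consumer might look up correctness and feedback for a single gap. The helper name `gap_result` is hypothetical, and the sketch assumes `chosen_index` refers to the authored order in `gapOptions` (so a consumer that shuffles options would map the displayed position back first).

```python
def gap_result(question, gap, chosen_index):
    """Return (is_correct, feedback_or_None) for one gap.

    `gap` is the gap number as a string ("1", "2", ...); `chosen_index` is the
    0-based index into the authored gapOptions[gap] list (an assumption).
    """
    options = question["gapOptions"][gap]
    if not 0 <= chosen_index < len(options):
        raise IndexError(f"gap {gap}: option index {chosen_index} out of range")
    is_correct = question["correctAnswers"][gap] == chosen_index
    per_gap = (question.get("gapOptionFeedback") or {}).get(gap, {})
    # Feedback keys are JSON object keys, hence strings.
    return is_correct, per_gap.get(str(chosen_index))
```

Feedback is optional per gap and per option, so the second element is `None` whenever the author supplied nothing for that choice.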
Phase 3: Text Entry
7. ShortAnswer
Description: Free text response with multiple acceptable answers and case sensitivity options.
Use Case: Short answer questions, name/term identification.
Example: “What is the largest planet in our solar system?”
{
"type": "shortAnswer",
"globalId": "550e8400-e29b-41d4-a716-446655440007",
"prompt": "What is the largest planet in our solar system?",
"title": "Largest Planet",
"tags": ["science:astronomy", "stage:lower-secondary"],
"difficulty": 2.0,
"points": 1.0,
"acceptedAnswers": ["Jupiter"],
"caseSensitive": false
}
Type-Specific Properties:
| Property | Type | Required | Default | Description |
|---|---|---|---|---|
| acceptedAnswers | string[] | ✅ Yes | [] | All acceptable answers. The first entry is treated as the canonical form shown in solutions and feedback. |
| caseSensitive | boolean | ❌ No | false | Whether answer matching is case-sensitive |
8. Essay
Description: Long-form text response with optional word limits and grading rubric.
Use Case: Essay questions, extended writing tasks.
Example: “Write a 250-word essay about climate change.”
{
"type": "essay",
"globalId": "550e8400-e29b-41d4-a716-446655440008",
"prompt": "Write an essay discussing the impact of climate change on global ecosystems.",
"title": "Climate Change Essay",
"tags": ["writing", "exam:ielts", "level:C1"],
"difficulty": 8.0,
"points": 20.0,
"expectedAnswer": "Sample model answer...",
"expectedLines": 15,
"minWords": 200,
"maxWords": 300,
"rubricText": "## Grading Criteria\n- Task Response (25%)\n- Coherence & Cohesion (25%)\n- Lexical Resource (25%)\n- Grammatical Range (25%)"
}
Type-Specific Properties:
| Property | Type | Required | Default | Description |
|---|---|---|---|---|
| expectedAnswer | string | ❌ No | "" | Model answer / sample response |
| expectedLines | integer | ❌ No | 0 | Suggested number of lines in text area |
| minWords | integer | ❌ No | 0 | Minimum word count (0 = no limit) |
| maxWords | integer | ❌ No | 0 | Maximum word count (0 = no limit) |
| rubricText | string | ❌ No | null | Markdown-formatted grading rubric |
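The "0 = no limit" convention for `minWords` and `maxWords` can be illustrated with a short Python sketch. The function name and the return labels are hypothetical, and counting words by splitting on whitespace is an assumption; this reference does not define how words are counted.

```python
def word_limit_status(text, min_words=0, max_words=0):
    """Classify a response against essay word limits (0 means no limit)."""
    count = len(text.split())  # whitespace tokenization: an assumption
    if min_words > 0 and count < min_words:
        return "below_min"
    if max_words > 0 and count > max_words:
        return "above_max"
    return "ok"
```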
Phase 4: Structured Tasks (Implemented)
9. Matching
Description: Pair each item with its counterpart (1:1) or classify items into categories (many-to-one). The shape branches on an explicit matchingMode discriminator: "pairs" for 1:1 matching, where each item has one correct match, and "classification" for many-to-one, where each item belongs to one category and several items may share a category. distractors carries decoys in either mode (extra match values in pairs mode; extra category labels in classification mode).
Use Cases:
- pairs: Vocabulary ↔ definition, country ↔ capital, author ↔ work, thinker ↔ idea, cause ↔ effect. Any 1:1 association where the pedagogical meaning lives in the pairing.
- classification: Time expressions ↔ tense, foods ↔ food group, animals ↔ habitat, sentences ↔ register, examples ↔ argument-role. Any sort where multiple items share each category.
Common properties (both modes)
| Property | Type | Required | Description |
|---|---|---|---|
| matchingMode | string ("pairs" or "classification") | ✅ Yes | Selects the sub-shape. No default: the schema can’t validate the shape without it. |
| distractors | string[] | ❌ No | Pairs mode: extra match values with no correct item. Classification mode: extra category labels that don’t own any item. Default []. |
| allowPartialCredit | boolean | ❌ No | If true (default), score per correct row; if false, all-or-nothing. |
pairs mode — properties
| Property | Type | Required | Description |
|---|---|---|---|
| pairs | object[] | ✅ Yes | Each entry is one item-and-match row: { "item": string, "match": string }. Both fields required, minLength: 1. Minimum 2 pairs. |
classification mode — properties
| Property | Type | Required | Description |
|---|---|---|---|
| categories | object[] | ✅ Yes | Each entry is one category: { "label": string, "items": string[] }. label required, minLength: 1. items required, minItems: 1 (an empty category is meaningless; list it as a distractors[] entry instead). Minimum 2 categories. Consumers MUST randomize the row order at render time per NORMATIVE §5.6, because source order is grouped by category and would directly expose the answer. |
A document that mixes shapes (both pairs and categories present, or matchingMode omitted) fails validation.
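The shape rules above can be sketched as a small pre-check. This is an illustrative Python sketch only: the published JSON Schemas remain the actual contract, and the function name and error strings are hypothetical.

```python
def matching_shape_errors(q):
    """Return a list of shape problems for a matching question (sketch)."""
    mode = q.get("matchingMode")
    if mode not in ("pairs", "classification"):
        return ["matchingMode must be 'pairs' or 'classification'"]
    errors = []
    if "pairs" in q and "categories" in q:
        errors.append("document mixes pairs and categories")
    if mode == "pairs":
        if len(q.get("pairs", [])) < 2:
            errors.append("pairs mode needs at least 2 pairs")
    else:
        categories = q.get("categories", [])
        if len(categories) < 2:
            errors.append("classification mode needs at least 2 categories")
        for category in categories:
            if not category.get("items"):
                errors.append(f"category {category.get('label')!r} has no items")
    return errors
```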
Example — pairs mode
{
"type": "matching",
"globalId": "550e8400-e29b-41d4-a716-446655441502",
"title": "Roots of Democracy — Match the Thinker",
"tags": ["politics:enlightenment", "philosophy:political"],
"points": 8.0,
"difficulty": 5.0,
"prompt": "",
"matchingMode": "pairs",
"pairs": [
{ "item": "John Locke", "match": "Government derives its authority from the consent of the governed." },
{ "item": "Jean-Jacques Rousseau", "match": "Citizens form a 'social contract' that creates the legitimate state." },
{ "item": "Baron de Montesquieu", "match": "Power should be divided across separate branches of government." },
{ "item": "John Stuart Mill", "match": "Liberty is the freedom to act, limited only by harm to others." }
],
"distractors": [
"The state should own the means of production.",
"Tradition is the safest guide to political reform."
]
}
See examples/15-matching.json for the full canonical pairs example.
Example — classification mode
{
"type": "matching",
"globalId": "550e8400-e29b-41d4-a716-446655441550",
"title": "Time Expressions — Classify by Tense",
"tags": ["grammar:tenses:past-simple", "grammar:tenses:present-perfect", "level:B1"],
"points": 6.0,
"difficulty": 4.0,
"prompt": "",
"matchingMode": "classification",
"categories": [
{ "label": "past simple", "items": ["a year ago", "yesterday", "in May 2019"] },
{ "label": "present perfect", "items": ["all my life", "never", "since 2020"] }
],
"distractors": ["future continuous"]
}
See examples/15b-matching-classification.json for the full canonical classification example.
Renderer expectation
In pairs mode, consumers MAY render two columns with drag-and-drop pairing (or a per-row dropdown of choosable matches). In classification mode, consumers MAY render the items as draggable chips and the category labels (plus distractors) as drop zones. Both presentations are consumer-defined; the wire format describes the structural relationship and hints at the affordance.
Scoring
Per-row scoring. In pairs mode, each row is one item↔match comparison. In classification mode, each item is compared against its category’s label (the row is correct if the learner placed the item under the correct label). allowPartialCredit: true (default) awards partial credit per correct row; false requires every row correct for any credit.
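Both modes reduce to the same per-row comparison once classification categories are flattened into item → label rows. The Python sketch below illustrates that idea; the helper names and the shape of `answers` are hypothetical, it assumes item strings are unique within a question, and weighting every row equally under partial credit is an assumption.

```python
def expected_rows(q):
    """Map each item to its correct target (match text or category label)."""
    if q["matchingMode"] == "pairs":
        return {p["item"]: p["match"] for p in q["pairs"]}
    return {item: c["label"] for c in q["categories"] for item in c["items"]}

def score_matching(q, answers):
    """`answers` maps item text to the target the learner chose."""
    rows = expected_rows(q)
    correct = sum(1 for item, target in rows.items() if answers.get(item) == target)
    if q.get("allowPartialCredit", True):
        return q["points"] * correct / len(rows)  # equal weight per row: an assumption
    return q["points"] if correct == len(rows) else 0.0
```

A learner who drops an item onto a distractor label simply fails that row; distractors never appear as a correct target in `expected_rows`.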
10. Ordering
Description: Sequence items correctly. Students arrange shuffled tiles into the teacher-defined correct order.
Use Case: Sentence word-order unscrambling, ordering process steps in a paragraph, ordering paragraphs of an essay, chronological ordering — any task where pedagogical meaning lives in the sequence.
| Property | Type | Required | Description |
|---|---|---|---|
| sourceText | string | Yes | The original sentence or passage shown for context (the correct ordering when items are joined) |
| items | string[] | Yes | Tiles in correct order. items[i] is the correct tile at position i |
| distractors | string[] | No | Extra tiles that do not belong in the sequence (mixed into the tile bank as decoys) |
| scoringMode | string | No | Scoring policy hint: "strict" = all-or-nothing exact match; "kendall" = partial credit via Kendall tau distance. When omitted, the recommended default is "strict" for orderingUnit: "word" and "kendall" for "sentence" / "paragraph". See Scoring below |
| orderingUnit | string | No | Display granularity hint: "word" (default), "sentence", or "paragraph". See variants below |
Display variants (orderingUnit)
orderingUnit is an advisory hint — the same ordering discriminator covers all three variants and consumers MAY render uniformly. The hint lets a consumer choose layout that fits the chunk size:
| orderingUnit | Typical chunk size | Typical layout | Example use |
|---|---|---|---|
| "word" | one word or short phrase | inline draggable tokens on one line | Unscramble a sentence (see 16-ordering.json) |
| "sentence" | one sentence (10–30 words) | stacked card blocks, vertical | Order steps of a process, narrative beats (see 16b-sentence-ordering.json) |
| "paragraph" | one paragraph (50–100 words) | stacked block cards, vertical, larger | Order paragraphs of an essay (see 16c-paragraph-ordering.json) |
Word-level example
{
"type": "ordering",
"globalId": "550e8400-e29b-41d4-a716-446655440010",
"prompt": "",
"title": "Word Order",
"sourceText": "She went shopping yesterday.",
"items": ["She", "went", "shopping", "yesterday"],
"distractors": ["quickly"],
"points": 1.0,
"tags": ["grammar", "level:A2"]
}
(orderingUnit is omitted — the default "word" applies. scoringMode is also omitted; consumers default to "strict" for word-level — see Scoring below.)
Sentence-level example
{
"type": "ordering",
"globalId": "550e8400-e29b-41d4-a716-446655441620",
"prompt": "",
"title": "Cellular Respiration — Order the Stages",
"sourceText": "Glucose enters the cell and is split into two pyruvate molecules… (full passage)",
"items": [
"Glucose enters the cell and is split into two pyruvate molecules in the cytoplasm during glycolysis…",
"Each pyruvate is then transported into the mitochondrion and converted to acetyl-CoA…",
"…"
],
"scoringMode": "kendall",
"orderingUnit": "sentence",
"points": 4.0,
"tags": ["biology:cell-biology:respiration"]
}
See 16b-sentence-ordering.json for the full example.
Paragraph-level example
Same shape, with orderingUnit: "paragraph" and longer items. See 16c-paragraph-ordering.json — a four-paragraph essay-structure reorder.
Scoring
Two modes selected by scoringMode:
- "strict": all items must be in their correct positions AND no distractors placed in the answer area. Any deviation = 0 points.
- "kendall": partial credit by Kendall tau distance over the learner’s permutation. Each discordant pair (two items whose relative order in the learner’s answer is the reverse of their correct order) reduces the score. With N items and k discordant pairs, the score is points × (1 − k / (N × (N−1) / 2)). Useful when the chunks have a single defensible order but partial credit reflects partial understanding (e.g., process narratives, essay structure).
When scoringMode is omitted, the recommended default is "strict" for orderingUnit: "word" (where pairwise inversions don’t have pedagogical meaning — a sentence either reads correctly or it doesn’t) and "kendall" for orderingUnit: "sentence" and "paragraph" (where partial credit reflects partial understanding of the discourse structure). Consumers that don’t support partial credit MAY collapse "kendall" to "strict".
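The two scoring modes can be sketched in a few lines of Python. This is an illustration of the formula above, not a reference implementation: the function names are hypothetical, and `kendall_score` assumes the learner's answer is a permutation of `items` with unique tiles and no distractors placed (how a consumer treats placed distractors under "kendall" is not specified here).

```python
from itertools import combinations

def strict_score(items, learner_order, points):
    """All-or-nothing: any deviation, including a placed distractor, scores 0."""
    return points if list(learner_order) == list(items) else 0.0

def kendall_score(items, learner_order, points):
    """points × (1 − k / (N × (N−1) / 2)), where k counts discordant pairs."""
    n = len(items)
    if n < 2:
        return points
    position = {tile: i for i, tile in enumerate(learner_order)}
    # combinations() yields (a, b) with a before b in the correct order;
    # the pair is discordant when the learner placed a after b.
    discordant = sum(1 for a, b in combinations(items, 2) if position[a] > position[b])
    return points * (1 - discordant / (n * (n - 1) / 2))
```

For four items, swapping two adjacent tiles produces one discordant pair out of six, so the learner keeps five sixths of the points; a fully reversed answer scores zero.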
11. Placement
Description: Place items into anchored gaps in a structured passage. Each placement entry pairs a 1-based gap-marker number (@@@N in the passage) with the item that belongs in that gap. Distractors are extra items with no correct gap; passage gap-markers without a corresponding placement entry are decoy positions — a TOEFL-style variant where one item must be placed into one of several candidate positions. The shape mirrors the matching-redesign principle: for structured tasks where the relationship between items and slots is the data, the relationship is encoded explicitly per row.
Use Cases:
- Cambridge B2 First Part 6: sentence-level “missing sentences” tasks where 6 short sentences must be placed back into a 6-gap article. Use placementUnit: "sentence". See examples/17a-sentence-placement.json.
- Cambridge C1 Advanced Part 7: paragraph-level reordering of a 6-gap essay. Use placementUnit: "paragraph". See examples/17b-paragraph-placement.json.
- IELTS Matching Headings: short headings assigned to sections of a passage. Use placementUnit: "sectionLabel". The same shape covers analytical meta-labels (e.g., labeling each paragraph ‘thesis’, ‘counter-argument’, ‘evidence’); both real headings and analytical labels share the wire format. See examples/17c-section-label-placement.json.
- TOEFL Sentence Insertion: a single missing sentence with multiple candidate positions; only one is correct. Author 4 @@@N markers in passage and a single placements[] entry whose gap is the correct position. The unanswered markers are decoy gaps. Use placementUnit: "sentence" and allowPartialCredit: false (single-gap → all-or-nothing). See examples/17d-toefl-insertion-placement.json.
Word-level placement is covered by wordBankCloze — placement does not include "word" in the placementUnit enum.
Symbolic-type prompt convention. On the eight symbolic question types (the gap-fill family, sentence transformation, matching, ordering, placement) the structured fields carry the question’s meaning, so prompt is non-authoritative. It remains required but MAY be empty (""); equally, a producer MAY populate it with a brief human-readable summary derived from the question’s content — a readable preview, not authored framing. See examples/01b-simple-gap-fill-readable-prompt.json, which shows "I saw ___ elephant at the zoo yesterday." as one valid form, with "" (as in examples/01-simple-gap-fill.json) equally valid. Consumers MUST NOT rely on a symbolic prompt’s content for scoring, rendering, equality, or deduplication.
Framing instructions for the exercise (e.g., “Place these sentences in the gaps where they best fit. Some markers are decoys.”) belong on the parent exerciseItem.instructions (or quizItem.instructions) field, not duplicated into each question’s prompt. Consumers typically render the parent item’s instructions once at the top of the exercise — above all its questions — so per-question framing would be redundant. This applies symmetrically to all eight symbolic question types.
Properties
| Property | Type | Required | Description |
|---|---|---|---|
| placementUnit | string | ✅ Yes | Display granularity hint: "sentence", "paragraph", or "sectionLabel". Default "sentence". Each value carries a different marker-placement convention; see the table below. |
| passage | string | ✅ Yes | Structured text with @@@1, @@@2, … gap markers (1-based). Must contain at least one marker (the schema’s pattern keyword enforces this). Plain text only, no HTML. |
| placements | object[] | ✅ Yes | Each entry: { "gap": int, "item": string }. Both required; gap ≥ 1; item.minLength 1. Order is author-free. minItems: 1, which permits the TOEFL variant (1 item, multiple candidate gaps). |
| distractors | string[] | ❌ No | Extra items with no correct gap. Default []. Distinct from decoy gaps (extra @@@N markers without a corresponding placements[].gap entry). |
| allowPartialCredit | boolean | ❌ No | Award partial credit per correct gap instead of all-or-nothing. Default true. |
Display variants (placementUnit)
placementUnit is an advisory hint and a marker-placement convention. The same placement discriminator covers all three variants and consumers MAY render uniformly.
| placementUnit | Typical render | Marker-placement convention | Example use |
|---|---|---|---|
| "sentence" | Inline drop slot at each marker | Marker appears mid-prose; surrounding whitespace and punctuation are the author’s choice. | Cambridge B2 missing-sentences (17a) and TOEFL Sentence Insertion (17d). |
| "paragraph" | Block-level drop slot between paragraphs | Marker is the entire content of its paragraph (surrounded by \n\n or at start/end of passage). | Cambridge C1 paragraph-reordering (17b). |
| "sectionLabel" | Label slot above / leading edge of a paragraph | Marker at the start of the section it labels (first non-whitespace token of a paragraph), followed by a space and then the section’s content. | IELTS Matching Headings and analytical meta-labels (17c). |
Marker-placement conventions
passage is plain text. Consumers detect paragraph boundaries by \n\n (double newline). The conventions above are documented in the schema’s placementUnit description and given side-by-side here:
sentence:
"The experiment began with a simple question. @@@1 The results surprised the team."
paragraph:
"Coined money had served European trade for centuries.\n\n@@@1\n\nWhat converted private bills into public currency was the cost of seventeenth-century war."
sectionLabel:
"@@@1 Paper currency did not spread simply because it was convenient.\n\n@@@2 Before paper notes, European trade depended mainly on metal coin."
Decoy gaps (TOEFL Sentence Insertion variant)
A passage with more @@@N markers than placements[] entries is valid: the unanswered markers are decoy gaps, candidate positions where the missing item could plausibly fit but doesn’t. This natively expresses TOEFL Sentence Insertion: 4 candidate positions, 1 correct placement. See examples/17d-toefl-insertion-placement.json.
This is distinct from decoy items (distractors[]) — extra content the learner has but should not place anywhere.
Validator policy
Hard errors (validation fails):
- Every placements[].gap MUST reference a @@@N marker present in passage. Orphan placement entries fail.
- No duplicate gap values within placements[].
- gap must be a positive integer (≥ 1).
- passage MUST contain at least one @@@N marker (enforced by the schema’s pattern keyword).
Soft warnings (NOTE-tier, not blocking):
- @@@N markers SHOULD be sequential starting at 1 (1, 2, 3, …). Inherits the wordBankCloze convention.
- Per-placementUnit marker-placement convention violations: paragraph markers that sit mid-prose alongside other text rather than alone on a paragraph; sectionLabel markers that don’t appear at the start of a paragraph. sentence mode has no positional rule.
- A @@@N marker without a corresponding placements[].gap entry is not a warning; it is a valid decoy gap. The validator distinguishes intentional decoys from authoring errors only by inference.
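The hard errors, the sequential-marker warning, and decoy detection can all be derived from one pass over the passage markers. The Python sketch below is illustrative only: the function name, message strings, and return shape are hypothetical, and the per-placementUnit positional warnings are left out.

```python
import re

MARKER = re.compile(r"@@@(\d+)")

def check_placement(q):
    """Return (errors, warnings, decoy_gaps) for a placement question (sketch)."""
    markers = {int(n) for n in MARKER.findall(q["passage"])}
    errors, warnings = [], []
    if not markers:
        errors.append("passage contains no @@@N marker")
    seen = set()
    for entry in q["placements"]:
        gap = entry.get("gap")
        if isinstance(gap, bool) or not isinstance(gap, int) or gap < 1:
            errors.append(f"gap must be a positive integer, got {gap!r}")
            continue
        if gap in seen:
            errors.append(f"duplicate gap {gap}")
        seen.add(gap)
        if gap not in markers:
            errors.append(f"gap {gap} has no @@@{gap} marker in passage")
    if sorted(markers) != list(range(1, len(markers) + 1)):
        warnings.append("markers are not sequential from 1")
    # Markers nobody answers are decoy gaps: valid, neither error nor warning.
    decoys = sorted(markers - seen)
    return errors, warnings, decoys
```

For a TOEFL-style question with four markers and one placement at gap 3, this returns no errors, no warnings, and decoys `[1, 2, 4]`.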
Scoring
Per-gap scoring against the authored placements[]. With allowPartialCredit: true (default), each gap whose chosen item matches the authored item is worth points / placements.length; remaining gaps contribute zero. With allowPartialCredit: false, every gap must be correct for any credit. Decoy gaps (markers without a placements[] entry) are unscored — placing an item into one is a “wrong gap” event whose treatment is consumer-defined; placing nothing into one is the expected case.
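A minimal Python sketch of the per-gap rule above, assuming the learner's answer arrives as a mapping from gap number to the placed item's text. The function name and answer shape are hypothetical. Items dropped into decoy gaps are simply ignored here, which is one of several consumer-defined treatments the text allows.

```python
def score_placement(q, answers):
    """`answers` maps gap numbers (ints) to the item text the learner placed."""
    placements = q["placements"]
    correct = sum(1 for p in placements if answers.get(p["gap"]) == p["item"])
    if q.get("allowPartialCredit", True):
        # Each correct gap is worth points / placements.length.
        return q["points"] * correct / len(placements)
    return q["points"] if correct == len(placements) else 0.0
```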
12. SentenceTransformation
Description: Cambridge exam-style controlled paraphrase tasks.
Use Case: Cambridge FCE/CAE Use of English Part 4 (Key Word Transformation).
Example: Transform sentence using given keyword.
{
"type": "sentenceTransformation",
"globalId": "550e8400-e29b-41d4-a716-446655440018",
"prompt": "",
"title": "Key Word Transformation",
"tags": ["grammar", "exam:fce", "level:B2"],
"difficulty": 8.0,
"points": 2.0,
"promptSentence": "I haven't seen John for three weeks.",
"keyword": "LAST",
"targetSentence": "The @@@ was three weeks ago.",
"allOrNothing": false,
"acceptedChunks": {
"1": ["last time"],
"2": ["I saw John"]
},
"chunkFeedback": {
"1": "Use 'LAST' + 'time'. The fixed phrase is 'the last time'.",
"2": "Past simple 'saw' is needed here, not present perfect 'have seen'."
}
}
Type-Specific Properties:
| Property | Type | Required | Default | Description |
|---|---|---|---|---|
| promptSentence | string | ✅ Yes | "" | Original sentence to transform |
| keyword | string | ✅ Yes | "" | Word that MUST be used (uppercase) |
| targetSentence | string | ✅ Yes | "" | Template with @@@ for answer chunks |
| allOrNothing | boolean | ❌ No | false | All chunks correct or zero points |
| acceptedChunks | object | ✅ Yes | {} | Dictionary: chunk index → array of accepted answers |
| chunkCaseSensitive | object | ❌ No | null | Dictionary: chunk index → boolean (default: false) |
| chunkFeedback | object | ❌ No | null | Dictionary: chunk index → feedback string |
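To see how the chunks relate to the single @@@ placeholder, the Python sketch below expands every combination of accepted chunks into a full target sentence. The helper name is hypothetical, and joining consecutive chunks with a single space is an assumption; this reference does not say how a consumer concatenates or splits chunks.

```python
from itertools import product

def accepted_sentences(q):
    """List every full sentence implied by acceptedChunks (sketch)."""
    chunks = q["acceptedChunks"]
    # Chunk indices are 1-based string keys: "1", "2", ...
    ordered = [chunks[str(i)] for i in range(1, len(chunks) + 1)]
    return [
        q["targetSentence"].replace("@@@", " ".join(combo), 1)  # space join: an assumption
        for combo in product(*ordered)
    ]
```

For the example above this yields the single sentence "The last time I saw John was three weeks ago."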
Reserved Types
The seven question types in this section are reserved in the question-base.schema.json discriminator enum but do not yet have per-type schemas; full implementation is targeted for 2027. Per NORMATIVE.md §6, conforming consumers MUST preserve them in full across read/write cycles (every field, value, and nested structure — per §6.2/§6.4), MUST treat earned points as 0, and SHOULD render a non-interactive placeholder. Producers SHOULD NOT emit reserved types in cross-implementation distribution.
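The consumer obligations above (preserve everything, earn zero points) can be sketched in Python as follows. The names `RESERVED_TYPES`, `earned_points`, and `round_trip` are hypothetical, and `score_fn` stands in for whatever per-type scorer a consumer already has.

```python
import json

# The seven reserved discriminators listed in this section.
RESERVED_TYPES = frozenset({
    "association", "hotspot", "graphicGapMatch", "graphicAssociate",
    "graphicOrder", "fileUpload", "mediaPromptedEssay",
})

def earned_points(question, score_fn):
    """Reserved types always earn 0; other types go to the normal scorer."""
    if question["type"] in RESERVED_TYPES:
        return 0.0
    return score_fn(question)

def round_trip(document_text):
    """Parse and re-serialize without dropping unknown fields or nesting."""
    return json.dumps(json.loads(document_text), ensure_ascii=False)
```

Parsing into plain dictionaries rather than fixed classes is one simple way to avoid losing tool-specific extension fields on write.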
13. Association
Description: Group items into categories (Categorization).
Use Case: Classify items, group by category.
Status: 🔒 Reserved (per NORMATIVE.md §6) — discriminator name is reserved in 1.0; per-type schema is targeted for 2027. Producers SHOULD NOT emit reserved types in cross-implementation distribution; consumers MUST preserve them in full across read/write cycles (every field, value, and nested structure — per §6.2/§6.4), MUST treat earned points as 0, and SHOULD render a non-interactive placeholder. The example below shows only the question-base fields any reserved-type instance must carry.
{
"type": "association",
"globalId": "550e8400-e29b-41d4-a716-446655440011",
"prompt": "Group these words by part of speech",
"title": "Parts of Speech",
"tags": ["grammar", "level:B1"],
"difficulty": 5.0,
"points": 4.0
}
Note: Per-type properties are not defined in 1.0; tool-specific extension fields MAY be carried (consumers MUST preserve them per NORMATIVE.md §6.4 round-trip preservation).
14. Hotspot
Description: Click regions on image (HotspotImage).
Use Case: Image-based identification, anatomy diagrams.
Status: 🔒 Reserved (per NORMATIVE.md §6) — discriminator name is reserved in 1.0; per-type schema is targeted for 2027. Producers SHOULD NOT emit reserved types in cross-implementation distribution; consumers MUST preserve them in full across read/write cycles (every field, value, and nested structure — per §6.2/§6.4), MUST treat earned points as 0, and SHOULD render a non-interactive placeholder. The example below shows only the question-base fields any reserved-type instance must carry.
{
"type": "hotspot",
"globalId": "550e8400-e29b-41d4-a716-446655440012",
"prompt": "Click on the heart in the diagram",
"title": "Anatomy Hotspot",
"tags": ["science:biology", "level:B1"],
"difficulty": 4.0,
"points": 2.0
}
Note: Per-type properties are not defined in 1.0; tool-specific extension fields MAY be carried (consumers MUST preserve them per NORMATIVE.md §6.4 round-trip preservation).
15. GraphicGapMatch
Description: Visual arrangement (DragAndDrop).
Use Case: Drag-and-drop activities, visual matching.
Status: 🔒 Reserved (per NORMATIVE.md §6) — discriminator name is reserved in 1.0; per-type schema is targeted for 2027. Producers SHOULD NOT emit reserved types in cross-implementation distribution; consumers MUST preserve them in full across read/write cycles (every field, value, and nested structure — per §6.2/§6.4), MUST treat earned points as 0, and SHOULD render a non-interactive placeholder. The example below shows only the question-base fields any reserved-type instance must carry.
{
"type": "graphicGapMatch",
"globalId": "550e8400-e29b-41d4-a716-446655440013",
"prompt": "Drag the labels to the correct positions on the diagram",
"title": "Label Diagram",
"tags": ["science", "level:B2"],
"difficulty": 6.0,
"points": 5.0
}
Note: Per-type properties are not defined in 1.0; tool-specific extension fields MAY be carried (consumers MUST preserve them per NORMATIVE.md §6.4 round-trip preservation).
16. GraphicAssociate
Description: Associate items with images.
Use Case: Match text with images.
Status: 🔒 Reserved (per NORMATIVE.md §6) — discriminator name is reserved in 1.0; per-type schema is targeted for 2027. Producers SHOULD NOT emit reserved types in cross-implementation distribution; consumers MUST preserve them in full across read/write cycles (every field, value, and nested structure — per §6.2/§6.4), MUST treat earned points as 0, and SHOULD render a non-interactive placeholder. The example below shows only the question-base fields any reserved-type instance must carry.
{
"type": "graphicAssociate",
"globalId": "550e8400-e29b-41d4-a716-446655440014",
"prompt": "Match each animal with its habitat",
"title": "Animal Habitats",
"tags": ["science:biology", "level:A2"],
"difficulty": 4.0,
"points": 3.0
}
Note: Per-type properties are not defined in 1.0; tool-specific extension fields MAY be carried (consumers MUST preserve them per NORMATIVE.md §6.4 round-trip preservation).
17. GraphicOrder
Description: Order items based on images.
Use Case: Sequence images, visual ordering.
Status: 🔒 Reserved (per NORMATIVE.md §6) — discriminator name is reserved in 1.0; per-type schema is targeted for 2027. Producers SHOULD NOT emit reserved types in cross-implementation distribution; consumers MUST preserve them in full across read/write cycles (every field, value, and nested structure — per §6.2/§6.4), MUST treat earned points as 0, and SHOULD render a non-interactive placeholder. The example below shows only the question-base fields any reserved-type instance must carry.
{
"type": "graphicOrder",
"globalId": "550e8400-e29b-41d4-a716-446655440015",
"prompt": "Put these images in the correct order to show the life cycle",
"title": "Life Cycle Order",
"tags": ["science:biology", "level:B1"],
"difficulty": 5.0,
"points": 4.0
}
Note: Per-type properties are not defined in 1.0; tool-specific extension fields MAY be carried (consumers MUST preserve them per NORMATIVE.md §6.4 round-trip preservation).
18. FileUpload
Description: Submit documents.
Use Case: Assignment submission, file uploads.
Status: 🔒 Reserved (per NORMATIVE.md §6) — discriminator name is reserved in 1.0; per-type schema is targeted for 2027. Producers SHOULD NOT emit reserved types in cross-implementation distribution; consumers MUST preserve them in full across read/write cycles (every field, value, and nested structure — per §6.2/§6.4), MUST treat earned points as 0, and SHOULD render a non-interactive placeholder. The example below shows only the question-base fields any reserved-type instance must carry.
{
"type": "fileUpload",
"globalId": "550e8400-e29b-41d4-a716-446655440016",
"prompt": "Upload your completed assignment (PDF format)",
"title": "Assignment Submission",
"tags": ["assignment", "level:C1"],
"difficulty": 0.0,
"points": 50.0
}
Note: Per-type properties are not defined in 1.0; tool-specific extension fields MAY be carried (consumers MUST preserve them per NORMATIVE.md §6.4 round-trip preservation).
19. MediaPromptedEssay
Description: Record audio/video answer.
Use Case: Speaking tasks, oral presentations.
Status: 🔒 Reserved (per NORMATIVE.md §6) — discriminator name is reserved in 1.0; per-type schema is targeted for 2027. Producers SHOULD NOT emit reserved types in cross-implementation distribution; consumers MUST preserve them in full across read/write cycles (every field, value, and nested structure — per §6.2/§6.4), MUST treat earned points as 0, and SHOULD render a non-interactive placeholder. The example below shows only the question-base fields any reserved-type instance must carry.
{
"type": "mediaPromptedEssay",
"globalId": "550e8400-e29b-41d4-a716-446655440017",
"prompt": "Record a 2-minute audio response describing your favorite place",
"title": "Speaking Task",
"tags": ["speaking", "exam:ielts", "level:B2"],
"difficulty": 7.0,
"points": 10.0
}
Note: Per-type properties are not defined in 1.0; tool-specific extension fields MAY be carried (consumers MUST preserve them per NORMATIVE.md §6.4 round-trip preservation).
Validation Rules
Type Discriminator
- ✅ "type" value MUST match one of the supported types in canonical camelCase: simpleGapFill, multipleChoice, trueFalseQuestion, etc.
- ❌ Non-conforming: "type": "SimpleGapFill" (pre-1.0 PascalCase). Conforming consumers MUST reject per NORMATIVE.md §5.3.
- ❌ Non-conforming: "type": "simplegapfill" (wrong case). Conforming consumers MUST reject.
- ❌ Invalid: "type": "GapFill" (not a recognized discriminator).
Common Properties
- ✅ globalId must be a valid RFC 4122 UUID (any version; shape-only validation against the 8-4-4-4-12 hex pattern); required per NORMATIVE.md §4.4.
- ✅ difficulty must be 0.0 to 10.0.
- ✅ points must be a non-negative number (minimum 0.0); MAY be null to inherit a consumer-default scoring weight. Use 0 for ungraded questions; positive values for graded.
- ✅ tags must be an array of strings (can be empty; per the empty-default-strip rule in spec examples, omit when empty).
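The discriminator and common-property rules above can be sketched as a single Python check. This is illustrative and does not replace schema validation: the function name and messages are hypothetical, accepting both upper- and lowercase hex in the UUID shape is an assumption, and `difficulty` is only checked when present because this section does not say whether it is required.

```python
import re

KNOWN_TYPES = frozenset({
    # 12 implemented types
    "simpleGapFill", "multipleChoice", "trueFalseQuestion", "wordBankCloze",
    "multiGapCloze", "multipleChoiceCloze", "shortAnswer", "essay",
    "matching", "ordering", "placement", "sentenceTransformation",
    # 7 reserved types (present in the discriminator enum)
    "association", "hotspot", "graphicGapMatch", "graphicAssociate",
    "graphicOrder", "fileUpload", "mediaPromptedEssay",
})
UUID_SHAPE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)

def _is_number(value):
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def common_errors(q):
    """Check the discriminator and common properties of one question."""
    errors = []
    qtype = q.get("type")
    if not isinstance(qtype, str) or qtype not in KNOWN_TYPES:  # exact, case-sensitive
        errors.append(f"unrecognized type {qtype!r}")
    gid = q.get("globalId")
    if not isinstance(gid, str) or not UUID_SHAPE.fullmatch(gid):
        errors.append("globalId is not UUID-shaped")
    if "difficulty" in q:
        d = q["difficulty"]
        if not (_is_number(d) and 0 <= d <= 10):
            errors.append("difficulty must be between 0.0 and 10.0")
    points = q.get("points")  # null is allowed: inherit the consumer default
    if points is not None and not (_is_number(points) and points >= 0):
        errors.append("points must be a non-negative number or null")
    tags = q.get("tags", [])
    if not (isinstance(tags, list) and all(isinstance(t, str) for t in tags)):
        errors.append("tags must be an array of strings")
    return errors
```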
Type-Specific Validation
SimpleGapFill:
- ✅ sentence must contain exactly one @@@ marker
- ✅ acceptedAnswers must have at least one answer
TrueFalseQuestion:
- ✅ correctAnswer must be a boolean (true or false)
- ✅ displayStyle, if present, must be one of: "TrueFalse", "CorrectIncorrect", "CheckmarkX"
MultipleChoice:
- ✅ options must have at least 2 items
- ✅ optionsAndPoints must have entries for all options
- ✅ At least one option must have points > 0
WordBankCloze / MultiGapCloze:
- ✅ Numbered @@@1, @@@2, etc. in passage must match gapAcceptedAnswers count
- ✅ Gap numbers must be sequential starting at 1 (1, 2, 3, …)
- ✅ Each gap must have at least one accepted answer
MultipleChoiceCloze:
- ✅ Numbered @@@1, @@@2, etc. in passage must match gapOptions and correctAnswers count
- ✅ correctAnswers indices must be valid (within option array bounds)
SentenceTransformation:
- ✅ keyword must appear in the student’s answer (validated by renderer)
- ✅ targetSentence must contain exactly one @@@ placeholder. Chunks are sequential answer pieces typed at that single position, not separate gaps. Multiple @@@ markers are ambiguous and non-conforming.
- ✅ Chunk indices in acceptedChunks must be sequential starting from 1 ("1", "2", "3", …)
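Two of the marker rules above, sketched in Python. The function names and messages are hypothetical. Note that `cloze_errors` checks that the set of marker numbers equals the set of `gapAcceptedAnswers` keys, which is a slightly stricter reading of "must match count" and is an interpretation, not something this reference states.

```python
import re

def simple_gap_fill_errors(q):
    """simpleGapFill: exactly one @@@ marker and at least one accepted answer."""
    errors = []
    if q["sentence"].count("@@@") != 1:
        errors.append("sentence must contain exactly one @@@ marker")
    if not q.get("acceptedAnswers"):
        errors.append("acceptedAnswers must have at least one answer")
    return errors

def cloze_errors(q):
    """wordBankCloze / multiGapCloze: markers vs. gapAcceptedAnswers."""
    errors = []
    markers = sorted({int(n) for n in re.findall(r"@@@(\d+)", q["passage"])})
    gaps = q["gapAcceptedAnswers"]
    if markers != list(range(1, len(markers) + 1)):
        errors.append("gap numbers are not sequential from 1")
    if {str(n) for n in markers} != set(gaps):
        errors.append("passage markers and gapAcceptedAnswers keys differ")
    for gap, answers in gaps.items():
        if not answers:
            errors.append(f"gap {gap} has no accepted answer")
    return errors
```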
Complete Example: ExerciseItem with Mixed Questions
{
  "type": "exercise",
  "globalId": "550e8400-e29b-41d4-a716-446655440100",
  "title": "Grammar Practice Exercise",
  "sequence": 0,
  "instructions": "Complete all questions to the best of your ability.",
  "suggestedTime": 15,
  "isOptional": false,
  "isGraded": true,
  "points": 10.0,
  "passMarkPercent": 70.0,
  "questions": [
    {
      "type": "simpleGapFill",
      "globalId": "550e8400-e29b-41d4-a716-446655440101",
      "prompt": "",
      "title": "Articles",
      "tags": ["grammar:articles"],
      "difficulty": 3.0,
      "points": 2.0,
      "sentence": "I saw @@@ elephant at the zoo.",
      "acceptedAnswers": ["an"],
      "caseSensitive": false
    },
    {
      "type": "multipleChoice",
      "globalId": "550e8400-e29b-41d4-a716-446655440102",
      "prompt": "Which sentence is grammatically correct?",
      "title": "Correct Sentence",
      "tags": ["grammar:tenses"],
      "difficulty": 5.0,
      "points": 3.0,
      "options": [
        "She go to school every day.",
        "She goes to school every day.",
        "She going to school every day.",
        "She is go to school every day."
      ],
      "optionsAndPoints": {
        "She go to school every day.": 0.0,
        "She goes to school every day.": 1.0,
        "She going to school every day.": 0.0,
        "She is go to school every day.": 0.0
      },
      "allowMultipleCorrect": false,
      "allowPartialCredit": false,
      "penalizeIncorrect": false,
      "shuffleOptions": true,
      "showLetterLabels": true
    },
    {
      "type": "trueFalseQuestion",
      "globalId": "550e8400-e29b-41d4-a716-446655440103",
      "prompt": "The past tense of 'go' is 'went'.",
      "title": "Past Tense",
      "tags": ["grammar:irregular-verbs"],
      "difficulty": 2.0,
      "points": 1.0,
      "correctAnswer": true,
      "displayStyle": "TrueFalse",
      "penalizeIncorrect": false,
      "incorrectPenaltyPercent": 0.0
    }
  ]
}
Related Documentation
- NORMATIVE.md — Conformance requirements (RFC 2119 keywords, producer/consumer roles).
- README.md — Specification overview.
- schemas/ — JSON Schema files (the contract for each type).
- examples/ — Per-type example files (01-simple-gap-fill.json … 16c-paragraph-ordering.json).
- tests/ — Conformance test corpus.
Version history:
- 1.0-rc.1 (2026-05-25) — initial release candidate; internal, never publicly announced.
- 1.0-rc.2 (2026-05-30) — first publicly announced release candidate; prompt-field correction.
- 1.0-rc.3 (2026-06-13) — localization model, corpus expansion; sentenceTransformation schema drops two prototype-era fields.
- 1.0 (target 2026-06-30) — final release.
LC-JSON Item Patterns — Authoring Guide
Status: Informative. See NORMATIVE.md for binding requirements.
Spec version: 1.0
Last updated: 2026-05-02
This document is a guide for authors and developers working with LC-JSON (Learning Content JSON) items (exercise, quiz). It explains the four policy fields available on each item, names the common combinations teachers use, and surveys how different consumers may interpret them.
It is informative, not normative. Nothing here adds requirements to producers or consumers — it describes the affordances the wire format already gives you, and how to think about composing them.
1. The four policy fields
Every Exercise or Quiz item carries four fields that an author composes to express pedagogical intent. They are independent: any combination is valid LC-JSON.
| Field | What it is on the wire | What it isn’t |
|---|---|---|
| type | Structural form: "exercise" or "quiz". Signals the consumer’s UI rendering (different chrome, different settings panes) and tells the consumer to track points in separate buckets — enabling future weighted-grading schemes (e.g., “exercises = 30%, quizzes = 70%”). | A grading-policy signal. quiz does not mean graded; exercise does not mean ungraded. |
| isGraded | Boolean. Whether the score on this item should be recorded against the learner’s grade. | A navigation signal. Whether the learner can move forward despite a low score is a separate consumer-policy question. |
| isOptional | Boolean. Whether the learner is required to attempt this item to be considered “done” with the lesson. | A grading or navigation signal in itself — but consumers often combine it with completion-tracking. |
| passMarkPercent | Number 0–100. The threshold at which a score counts as “passing.” | A gating directive. The wire format does not say what the consumer does when a learner falls below the threshold. |
The crucial point: passMarkPercent is a threshold, not a gate. What happens when a learner scores below it is consumer policy, not spec mandate. See §3 below.
Plus: tags (metadata, not policy)
Items also carry an optional tags array — taxonomic strings used for filtering, search, gradebook grouping, and similar metadata operations. Tags are not policy fields (they don’t affect grading, navigation, or threshold logic). Conventional namespaces use colons:
| Namespace | Example | Use |
|---|---|---|
| Domain hierarchy | history:immigration:united-states | Subject → topic → sub-topic |
| Stage | stage:lower-secondary, grade:9, year:9, level:B2 | Pick the convention your audience knows; the spec doesn’t impose a single vocabulary |
| Cognitive skill | fact-recall, analysis, synthesis | Bloom-style levels |
| Exam alignment | exam:cambridge-fce, exam:ap-history | When content is targeted at a specific exam |
Item-level tags are independent of any tags on inner questions. A quiz with three tagged questions can itself be tagged differently — the quiz’s tags describe the assessment as a whole; the questions’ tags describe each item.
Note on ContentSequenceItem (CSI) tagging
A contentsequence item references a contentItemId plus zero or more relatedItemIds (typically Exercise/Quiz items). Each of those referenced items may carry its own tags. The wire format does not specify how a consumer should compose tags across the CSI and its referenced items — that is consumer policy. Reasonable choices include:
- Surface only the CSI’s own tags in search/filter UIs (treat the CSI as the authoritative metadata source for the bundle)
- Surface the union of CSI tags + related-items’ tags (the bundle inherits everything contributed)
- Surface only the related items’ tags when the CSI is being treated as a thin wrapper
- Combine with namespace-aware deduplication (e.g., union of history:* tags but only the CSI’s own grade:* tag)
Authors should apply tags to whichever items make sense for their content; consumers should document their composition policy so authors know what to expect.
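Three of the composition policies listed above can be sketched in Python as follows. The function names and the `csi_owned_prefixes` parameter are hypothetical, chosen only for illustration; the spec leaves the policy to the consumer, so none of these is more correct than the others.

```python
def csi_only(csi_tags, related_tags):
    """Treat the CSI as the authoritative metadata source for the bundle."""
    return sorted(set(csi_tags))


def union_of_all(csi_tags, related_tags):
    """The bundle inherits every tag contributed by the CSI and related items."""
    combined = set(csi_tags)
    for tags in related_tags:
        combined.update(tags)
    return sorted(combined)


def namespace_aware(csi_tags, related_tags, csi_owned_prefixes=("grade:",)):
    """Union everything, except namespaces where only the CSI's own tag counts."""
    combined = set(csi_tags)
    for tags in related_tags:
        # Related items cannot contribute tags in a CSI-owned namespace.
        combined.update(t for t in tags if not t.startswith(csi_owned_prefixes))
    return sorted(combined)
```

Here `related_tags` is a list of tag lists, one per referenced item. A consumer that documents which of these functions it applies gives authors the predictability the paragraph above asks for.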
2. Common authoring patterns
Question-level display hints
A few question types carry advisory display hints that don’t change the wire shape but help consumers pick a fitting layout. The clearest example is ordering: an ordering question can carry orderingUnit: "word" | "sentence" | "paragraph", signalling tile size — inline tokens for word-level, stacked card blocks for sentence- or paragraph-level (see examples/16-ordering.json, 16b-sentence-ordering.json, 16c-paragraph-ordering.json). The discriminator stays ordering in all three cases; the hint lets the consumer choose layout. Consumers MAY render uniformly regardless. Placement carries the parallel placementUnit: "sentence" | "paragraph" | "sectionLabel" hint with the same advisory framing — see question-types-reference.md §11.
Two-mode shape: matching
Matching questions carry a matchingMode: "pairs" | "classification" discriminator, with two distinct sub-shapes underneath: pairs: [{ item, match }] for 1:1 matching and categories: [{ label, items }] for classification (many-to-one). In pairs mode, consumers MAY render two columns with drag-and-drop pairing or a per-row dropdown of candidate matches. In classification mode, consumers MAY render items as draggable chips and category labels (plus distractors) as drop zones. Both modes use the same distractors[] field for extra unmatched values. The two presentations are consumer-defined; the spec describes the structural relationship and hints at the affordance. See examples/15-matching.json (pairs) and examples/15b-matching-classification.json (classification).
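As an illustration of how one code path can serve both modes, the sketch below derives an answer key and a list of drop targets from either sub-shape. It uses only the field names mentioned above (matchingMode, pairs, categories, distractors); the helper names are hypothetical, and it assumes items, matches, and labels are plain strings, which the actual schema may not guarantee.

```python
def answer_key(question: dict) -> dict:
    """Map each item to its correct target, whichever mode is in use."""
    mode = question["matchingMode"]
    if mode == "pairs":
        # 1:1 matching: each item has exactly one match.
        return {pair["item"]: pair["match"] for pair in question["pairs"]}
    if mode == "classification":
        # Many-to-one: every item in a category maps to that category's label.
        return {
            item: category["label"]
            for category in question["categories"]
            for item in category["items"]
        }
    raise ValueError("unknown matchingMode: %r" % mode)


def drop_targets(question: dict) -> list:
    """Correct targets (deduplicated, in order) followed by distractors."""
    targets = list(dict.fromkeys(answer_key(question).values()))
    return targets + list(question.get("distractors", []))
```

A renderer could feed `drop_targets` into a dropdown or a set of drop zones; which presentation it picks remains consumer-defined.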
Item-level patterns
The four item-level fields generate a wide design space. These patterns are common enough to have names:
| Pattern | type | isGraded | isOptional | passMarkPercent | Use case |
|---|---|---|---|---|---|
| Open practice | exercise | false | false | 0 | Free practice; no stakes. Learners try as much as they want. |
| Graded homework | exercise | true | false | 70 (varies) | Counts toward grade. Threshold defines “passing the homework.” |
| Mastery exercise | exercise | true | false | 80–100 | Must (typically) demonstrate mastery before continuing — depends on consumer’s gating model. |
| Diagnostic pre-test | quiz | false | false | 0 | “What do you know going in?” Surfaces baseline; never blocks; learner sees their score for self-awareness. |
| Typical assessment | quiz | true | false | 70 | Standard quiz. Counts toward grade. |
| Exit ticket | quiz | true | false | 0 | Quick check at end of lesson. Recorded in gradebook (so the teacher can see attempts), but no threshold gate. |
| Bonus / enrichment | exercise or quiz | varies | true | varies | Skippable. Some teachers grade bonus work, some don’t. |
| Self-check | quiz | false | true | 0 | Optional, ungraded reflection. Pure metacognition. |
Pick the row that matches your intent. Or compose your own combination — the spec doesn’t enumerate “valid” patterns; the patterns above just happen to be common.
3. How consumers may interpret these fields
The wire format describes what the fields are. It does not prescribe what a consumer does with them. Different consumers — even different modes of the same consumer — can interpret the same payload differently. Some examples:
Consumer-policy patterns (illustrative)
- Open-navigation LMS: Learner moves freely between items in any order. passMarkPercent is used for the score-display badge and for the gradebook-column threshold; it doesn’t gate forward progress.
- Sequential-navigation LMS: Learner moves in order. A graded item below passMarkPercent blocks forward progress until passed; ungraded items always allow forward progress regardless of score. The “is this graded?” question is what determines whether a low score gates anything.
- Mastery-routing LMS: Scores below passMarkPercent route the learner to remediation content; scores above route to enrichment. Grading is recorded but doesn’t block progress (the routing handles it).
- Reporting-only consumer: Uses passMarkPercent purely for the gradebook-column display. Doesn’t affect navigation. Ungraded items don’t appear in the gradebook at all.
- Strict-gate consumer: Gates forward progress on every item with a passMarkPercent > 0, regardless of isGraded. (This is more restrictive than the spec mandates, but it’s a real pattern in some LMS.)
Many consumers also implement a common gradebook-semantics pattern: items where the learner has scored ≥ passMarkPercent (and the item is graded) are reported as Completed in the teacher’s gradebook view. Whether “completed” maps to a letter grade is a separate gradebook configuration.
The point is plurality. Author intent is portable across consumers only insofar as the author composes it across multiple fields. Setting passMarkPercent: 0 alongside isGraded: false is the most permissive combination — virtually no consumer policy can construct a friction surface from “ungraded + no threshold.”
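To make the plurality concrete, the sketch below applies three of the illustrative consumer policies to the same item payload. It is not normative and not taken from any real LMS: the function names are hypothetical, and the item is assumed to carry explicit isGraded and passMarkPercent values.

```python
def is_passing(item: dict, score_percent: float) -> bool:
    """The only thing the wire format defines: a threshold comparison."""
    return score_percent >= item["passMarkPercent"]


def open_navigation(item: dict, score_percent: float) -> bool:
    """Open-navigation policy: the threshold never gates forward progress."""
    return True


def sequential_navigation(item: dict, score_percent: float) -> bool:
    """Sequential policy: only graded items can block on a low score."""
    if not item["isGraded"]:
        return True
    return is_passing(item, score_percent)


def strict_gate(item: dict, score_percent: float) -> bool:
    """Strict-gate policy: any non-zero threshold gates, graded or not."""
    if item["passMarkPercent"] > 0:
        return is_passing(item, score_percent)
    return True
```

An ungraded item with passMarkPercent: 70 and a score of 65 moves forward under the first two policies but not the third, while isGraded: false with passMarkPercent: 0 passes all three. That is the practical reason to compose intent across both fields.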
Consumer plurality on tel: links (HTML safety profile, §7)
The same consumer-plurality principle governs the HTML safety profile’s URL allowlist. tel: links are permitted in <a href> (per HTML_SAFETY.md §4.1), but whether the consumer makes them actionable is consumer-policy:
- K–12 / school-aged-audience consumer: gates tel: links behind a configuration flag, default off. When the flag is off, the link’s text remains visible but the href is neutralized so taps don’t open the dialer.
- Adult/corporate-training consumer: typically enables tel: unconditionally — calling a sales contact or support line is a legitimate authoring affordance.
- Print/export consumer: renders the phone number as plain text; no actionable link.
- Reporting-only consumer: treats tel: no differently from mailto: — passes through verbatim.
The safety profile’s domain validator emits a warning (not an error) for tel: links so producers know to verify their target audience before shipping. Consumers MUST NOT reject documents that contain tel: links; the choice of how to render them is consumer policy.
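The flag-gated behavior described for a school-aged-audience consumer might look like the sketch below. The function name `render_anchor` and the `allow_tel` flag are hypothetical, and replacing the anchor with a span is just one way to neutralize the href. The document itself is never rejected; only the rendering changes.

```python
from html import escape


def render_anchor(href: str, text: str, allow_tel: bool = False) -> str:
    """Render a link, neutralizing tel: targets unless the flag is on."""
    is_tel = href.strip().lower().startswith("tel:")
    if is_tel and not allow_tel:
        # Keep the visible text, drop the actionable href.
        return "<span>%s</span>" % escape(text)
    return '<a href="%s">%s</a>' % (escape(href, quote=True), escape(text))
```

With the default flag, `render_anchor("tel:+15551234567", "Call us")` yields a plain span. An adult-training consumer would pass `allow_tel=True` and get a working link from the same payload.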
Authored-text line-break conventions
Plain-text authored-text fields — description, prompt, passage, hint, feedback.correct, feedback.incorrect, etc. — carry literal newline characters as part of the field value (escaped as \n in JSON per RFC 8259). The spec is silent on how consumers should render those characters: a runtime might collapse all whitespace and treat the field as a single block, or honor line breaks, or wrap paragraphs in semantic block elements. The wire format is a string; what consumers do with embedded newlines is consumer policy.
Where consumers want portable rendering across the ecosystem, a useful convention is:
- \n\n (a blank-line boundary) denotes a paragraph break. Consumers MAY render each paragraph as a block element (e.g. <p>...</p>).
- A single \n denotes a line break within a paragraph. Consumers MAY render it as <br> or equivalent.
Producers using this convention get aligned rendering across consumers that adopt it; producers ignoring it get whatever the consumer chooses.
Some consumers implement this convention as: \n\n becomes <p> blocks, single \n becomes <br>, both with HTML-safe encoding of authored content. Other consumer models may legitimately:
- Collapse all whitespace and render the field as a single inline block (typical of grid-cell renderers, screen-reader-first consumers).
- Honor single \n as <br> but treat \n\n the same way (line-break-only convention).
- Render the field through a Markdown processor that gives \n\n paragraph semantics plus richer features (lists, emphasis, etc.) — out of scope for this spec but a legitimate consumer choice.
This convention applies only to plain-text authored fields. Trusted-HTML surfaces (ContentItem.html, SignpostItem.customHtml) carry HTML directly, sanitized per HTML_SAFETY.md, and use the producer’s chosen block elements (<p>, <h2>, <blockquote>, etc.) — they do not rely on newline-character conventions.
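For consumers that do adopt the paragraph and line-break convention for plain-text fields, a minimal rendering sketch could look like this. The function name `render_plain_text` is hypothetical, and collapsing runs of three or more newlines into a single paragraph break is an assumption of the sketch, not something the spec states.

```python
import re
from html import escape


def render_plain_text(value: str) -> str:
    """Render a plain-text authored field using the \\n\\n / \\n convention."""
    # Two or more consecutive newlines mark a paragraph boundary (assumption:
    # longer runs are treated the same as exactly two).
    paragraphs = [p for p in re.split(r"\n{2,}", value.strip("\n")) if p.strip()]
    rendered = []
    for paragraph in paragraphs:
        # HTML-escape authored content, then join single newlines with <br>.
        lines = [escape(line) for line in paragraph.split("\n")]
        rendered.append("<p>" + "<br>".join(lines) + "</p>")
    return "".join(rendered)
```

Escaping happens per line before any markup is added, so authored angle brackets are displayed as text rather than interpreted as HTML.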
4. Author tips
- For diagnostics, exit tickets, and any item where you don’t want a threshold to matter at all: set passMarkPercent: 0. Combined with isGraded: false (where appropriate), this is the lowest-friction signal across all consumer models.
- isGraded: false + passMarkPercent: 0 is the universal “let-them-through” combination. No consumer policy can reasonably construct a navigation gate from these values.
- isGraded: true + passMarkPercent: 0 is a real pattern (exit ticket): graded for record so the teacher sees attempts, but no threshold gate.
- If you target multiple consumers, prefer composing intent across multiple fields rather than relying on one alone. Given isGraded: false alone, some consumers might still apply a default 70% threshold and add friction. Setting passMarkPercent: 0 explicitly makes the intent self-evident.
- Test your content against the consumer you target. The spec defines the wire format; it doesn’t guarantee navigation behavior. Different consumers — and different navigation modes within a consumer — will produce different learner experiences from the same payload.
- Document author intent in instructions. When a learner sees “Your score doesn’t count toward your grade — this is just to help you see where you’re starting from,” they understand the experience regardless of how the consumer chooses to interpret the fields. The wire payload is for the consumer; the prose is for the human.
5. Signposts + Learning Objectives
A signpost item is a structural marker — typically the first or last item in a unit or lesson — that consumers use to orient learners (“In this unit, you will be able to:”) or close a section (“You can now:”).
The wire format does not bind a specific rendering, but the field design suggests an expected pattern that any consumer can implement:
| Field | Wire-level meaning | Expected use |
|---|---|---|
| signpostType: "intro" or "summary" | The structural role of the signpost | Intro signposts open a section; summary signposts close it |
| scope: "course" / "unit" / "lesson" | Which structural level the signpost belongs to | Determines which objectiveIds[] set the consumer can read |
| customHtml (optional) | Authored prose | When provided, lets the author add motivational/orientational text beyond the auto-rendered objectives |
The expected pattern (consumer-defined):
When a course/unit/lesson has objectives assigned via objectiveIds[], consuming applications are expected to render them in or near the matching signpost — typically:
- Intro signpost: a header phrase like “In this unit, you will be able to:” followed by a bulleted list of the unit’s objectives (resolved from course.objectives[] via the unit’s objectiveIds).
- Summary signpost: a header phrase like “You can now:” followed by a checklist (often with checkmarks) of the same objectives.
The customHtml field, when set, lets the author add prose ABOVE or alongside this auto-rendered objectives block — typically motivational framing, an evocative image, an applications-of-the-topic list, etc. The author writes the hook; the consumer auto-renders the objectives.
Implication for authors:
- Don’t restate the objectives in customHtml — the consumer is expected to render them automatically. Doing so creates duplication.
- Don’t open customHtml with phrasing that clashes with the consumer’s likely auto-render header (e.g., “By the end of this unit, you will…” — that’s typically what the consumer prepends to the objectives list).
- Do use customHtml for what the auto-render can’t say: the why of the unit, an opening image or visual, an “applications in:” list, a motivational closing.
- For courses delivered to multiple consumers with different rendering policies, prefer brief customHtml so the consumer’s auto-rendered objectives are always the dominant orientation surface.
Fallback when there are no objectives:
If a unit/lesson has no objectiveIds[] and no customHtml, consumers may render nothing, render a generic placeholder, or skip the signpost entirely — that policy is consumer-defined. Authors who include signposts SHOULD also assign objectives or provide customHtml, lest some consumers render an empty stub.
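The expected pattern and its fallback can be sketched as follows. This is one possible consumer rendering, not a required one. The objective shape (`id` and `text` keys) and the function name `render_signpost` are assumptions made for the example, and customHtml is assumed to have been sanitized per HTML_SAFETY.md before it reaches this function.

```python
from html import escape


def render_signpost(signpost: dict, container: dict, course_objectives: list) -> str:
    """Render authored prose first, then the auto-rendered objectives block."""
    # Assumption: each objective looks like {"id": ..., "text": ...}.
    by_id = {objective["id"]: objective for objective in course_objectives}
    objectives = [by_id[i] for i in container.get("objectiveIds", []) if i in by_id]

    parts = []
    if signpost.get("customHtml"):
        parts.append(signpost["customHtml"])  # assumed already sanitized

    if objectives:
        if signpost["signpostType"] == "intro":
            header = "In this %s, you will be able to:" % signpost["scope"]
        else:
            header = "You can now:"
        items = "".join("<li>%s</li>" % escape(o["text"]) for o in objectives)
        parts.append("<p>%s</p><ul>%s</ul>" % (header, items))

    # No objectives and no customHtml: this sketch renders nothing.
    return "".join(parts)
```

Here `container` is the course, unit, or lesson that the signpost's scope points at. Returning an empty string in the fallback case is this sketch's choice; a placeholder would be equally valid consumer policy.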
See also
- NORMATIVE.md — binding producer/consumer requirements
- README.md — spec overview
- examples/11-exercise-item.json — graded homework exercise
- examples/12a-graded-quiz-item.json — typical graded assessment
- examples/12b-ungraded-quiz-item.json — ungraded diagnostic pre-test
- schemas/exercise-item.schema.json, schemas/quiz-item.schema.json — formal schemas
LC-JSON Glossary
Status: Informative.
Spec version: 1.0
Last updated: 2026-06-13
This glossary defines the terms LC-JSON (Learning Content JSON) uses throughout the specification. It is informative — definitive normative meaning lives in NORMATIVE.md — but implementers should treat the entries below as the project’s working vocabulary.
Terms are organized in five groups: core concepts, artifact and hierarchy, conformance, language and localization, and identity, versioning, and extensions.
Core concepts
artifact
A top-level document type defined by the specification. LC-JSON 1.0 defines two artifact types: Course (hierarchical) and QuestionSet (flat). The artifact a document represents is signaled by its documentType root field.
document
A single instance of an artifact — one JSON file (or in-memory equivalent) that carries $schema, documentType, specVersion, and the artifact payload as flat root siblings. A document is the unit of validation, exchange, and storage.
wire format
The on-disk / on-the-wire JSON shape that conforming tools produce and consume. The wire format is what the specification binds; how a tool represents the same content internally (database rows, AST, runtime DTOs, etc.) is outside the spec.
producer
Any tool that emits LC-JSON documents intended for external consumption. Producer obligations are listed in NORMATIVE.md §4. Examples: course-authoring tools, AI-assisted authoring scripts, format converters that export to LC-JSON.
consumer
Any tool that ingests LC-JSON documents from an external source. Consumer obligations are listed in NORMATIVE.md §5. Examples: learning-management systems, delivery platforms, conversion tools that import from LC-JSON, validators.
validator
A tool that checks whether a document conforms to the spec. A conforming validator runs each document against the published JSON Schemas at the appropriate version URL, plus the domain rules described in NORMATIVE.md §5.1. The reference validator is tools/validate_course.py.
round-trip preservation
The property that a document survives a read → modify → write cycle through a consumer without losing or silently dropping data the consumer didn’t understand. LC-JSON requires round-trip preservation for reserved question types (§6.4) and recommends it for x--namespaced extension members (§7.4). Round-trip preservation is what lets one tool use LC-JSON as a faithful transfer or backup format for another tool’s content.
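A small sketch of what round-trip preservation means in practice: the consumer below edits one field on the parsed JSON object and writes everything else back untouched, instead of projecting the document onto a typed model that would drop unrecognized members. The function name `retitle` and the presence of a root-level `title` field are assumptions for illustration.

```python
import json


def retitle(document_text: str, new_title: str) -> str:
    """Read, modify one known field, and write back without losing the rest."""
    document = json.loads(document_text)
    # Only the field this tool understands is touched; extension members and
    # reserved-type question bodies stay in the dict and are re-emitted as-is.
    document["title"] = new_title
    return json.dumps(document, ensure_ascii=False, indent=2)
```

A consumer built around fixed data classes would need an explicit catch-all for unrecognized members to achieve the same result.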
Artifact and hierarchy
Course
The hierarchical artifact type: a course contains units, each unit contains lessons, each lesson contains items, and items of type exercise or quiz contain questions. Identified at the root by documentType: "course". Schema: course.schema.json.
QuestionSet
The flat artifact type: a question set is a list of questions without any enclosing course/unit/lesson scaffold. Used for question-bank exchange and packaged delivery of curated question batches. Identified at the root by documentType: "questionSet". Schema: question-set.schema.json.
Unit
The second tier of a Course’s hierarchy. A unit groups related lessons under a single banner (e.g., a topic, a module, a week). Carries title, globalId, an objectiveIds reference list, and a lessons array.
Lesson
The third tier of a Course’s hierarchy. A lesson groups related items under a single banner (e.g., a class period, a sub-topic). Carries title, globalId, an objectiveIds reference list, and an items array.
Item
The fourth tier of a Course’s hierarchy — the atomic unit a learner interacts with. Items come in five types: content (reading material), exercise (practice questions), quiz (graded assessment), contentsequence (a content item paired with related exercises/quizzes), and signpost (intro/summary marker). Items of type exercise or quiz carry a questions array.
Question
A single assessment unit inside an exercise or quiz item. LC-JSON 1.0 defines 19 question types: 12 implemented (with full per-type schemas) and 7 reserved for a future minor version. Every question carries a type discriminator, a globalId, and type-specific fields.
HTML-bearing field
A spec field whose value is HTML markup rather than plain text or a structured object. The two HTML-bearing fields are ContentItem.html and SignpostItem.customHtml. Both are subject to the HTML safety profile in HTML_SAFETY.md, which constrains the allowed elements, attributes, URL schemes, and inline CSS.
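The hierarchy described in this group can be walked with a few nested loops. The sketch below collects every question in a Course document. The `lessons`, `items`, and `questions` array names come from the entries above; the root-level `units` array name is an assumption, as is the helper name `iter_questions`.

```python
from typing import Iterator


def iter_questions(course: dict) -> Iterator[dict]:
    """Yield every question in a Course, in document order."""
    for unit in course.get("units", []):  # assumption: units live under "units"
        for lesson in unit.get("lessons", []):
            for item in lesson.get("items", []):
                # Only exercise and quiz items carry a questions array.
                if item.get("type") in ("exercise", "quiz"):
                    yield from item.get("questions", [])
```

A QuestionSet needs no such traversal, since its questions are already a flat list.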
Conformance
conformance
The state of meeting all MUST-level requirements of NORMATIVE.md for a given role (producer, consumer, or both). A tool MAY claim conformance to LC-JSON 1.0 per NORMATIVE.md §10. Conformance is per-role and per-spec-version.
strict mode
A validator-configuration mode in which warnings are treated as errors. The reference validator’s default mode reports warnings as warnings; --strict promotes them to errors and exits non-zero on any warning. Useful in CI pipelines that require an unambiguous pass/fail signal.
schema-clean
A document that validates without errors against the published JSON Schemas at its declared specVersion. “Schema-clean” excludes domain-rule warnings (e.g., HTML allowlist violations, gap-count mismatches) — those are reported separately by the reference validator’s domain pass.
validator (cross-reference)
See validator under Core concepts above.
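The strict mode entry above describes warnings being promoted to errors. As a sketch of that exit-status logic only (this is not the reference validator's code, and the function name `exit_code` is hypothetical):

```python
def exit_code(errors: list, warnings: list, strict: bool = False) -> int:
    """Return a process exit status: 0 for pass, 1 for fail."""
    if errors:
        return 1
    # In strict mode any warning fails the run, giving CI a clean pass/fail.
    if strict and warnings:
        return 1
    return 0
```

A CI job would call the validator in strict mode and fail the build on a non-zero status.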
Language and localization
See LOCALIZATION.md for the full model.
delivery language
The single primary language a document is authored and delivered in, declared in the root language field. LC-JSON 1.x is single-language-per-document: a document has exactly one delivery language, and multiple languages are delivered as multiple documents.
language of parts
A run of HTML content in a language different from the delivery language, marked with the lang attribute (and dir where script direction differs). This is the WCAG 3.1.2 mechanism; it concerns correct rendering and pronunciation, not translation.
support language
The learner’s first language (L1), declared in the optional root supportLanguage field, for a document whose delivery language is a second language being taught. It signals that L1 support (glosses, hints) is appropriate; how a consumer surfaces that support is consumer-defined.
language tag
A BCP 47 tag identifying a language, used by language, supportLanguage, and HTML lang. Commonly a bare ISO 639-1 primary subtag (en, es); region and script subtags (pt-BR, zh-Hant) are permitted, and a consumer may act on only the primary subtag.
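A consumer that acts only on the primary subtag, as the language tag entry permits, could extract it like this. It is a deliberately small sketch with hypothetical function names. It does not validate the tag against BCP 47; it only splits on the first hyphen.

```python
def primary_subtag(tag: str) -> str:
    """Return the primary language subtag of a BCP 47 tag, lowercased."""
    return tag.split("-", 1)[0].lower()


def same_primary_language(tag_a: str, tag_b: str) -> bool:
    """Compare two tags while ignoring region and script subtags."""
    return primary_subtag(tag_a) == primary_subtag(tag_b)
```

For example, `pt-BR` and `pt` compare as the same primary language, while `zh-Hant` and `en` do not.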
Identity, versioning, and extensions
globalId
An RFC 4122 UUID (any version; shape-only validation against the 8-4-4-4-12 hex pattern) assigned to every Unit, Lesson, Item, and Question. globalId identifies an entity across re-imports, enabling consumers to match unchanged content against existing records and detect modifications. Required per NORMATIVE.md §4.4.
sourceCourseId
A stable, course-level identity field. The same sourceCourseId across multiple exports identifies them as the same logical course; consumers use this to detect re-imports and apply update semantics rather than treating each upload as a fresh course. Generated by the source authoring system; does not identify a human author. QuestionSet artifacts use the parallel sourceQuestionSetId field with the same semantics.
specVersion
A required root field declaring which contract version of LC-JSON the document conforms to. Pattern: ^1\.[0-9]+(\.[0-9]+)?$ (e.g., "1.0", "1.0.1") — anchored to major version 1, since the LC-JSON contract is currently in the 1.x family. Distinct from the author-provided version root field, which carries the content’s own version (pattern ^[0-9]+(\.[0-9]+){0,2}$, 1 to 3 segments) and tracks revisions to the content rather than to the spec contract. specVersion does not carry release-candidate suffixes — the candidate vs final-release distinction is carried by the $schema URL, not by specVersion. A consumer MUST reject documents whose major version exceeds what it supports (NORMATIVE.md §5.2).
$schema
A required root field carrying the canonical URL of the JSON Schema for the document’s artifact type at the specific publication the producer targets. A 1.0-rc.3 Course document carries "$schema": "https://lc-json.org/1.0-rc.3/course.schema.json"; a 1.0-final Course document carries "$schema": "https://lc-json.org/1.0/course.schema.json". URLs at any published path — released versions and release candidates alike — are immutable for the lifetime of the spec (§8.3).
implemented question type
A question type that has a published per-type JSON Schema in the current spec version, full authoring-tool support, and reference-runtime scoring. LC-JSON 1.0 has 12 implemented types: simpleGapFill, trueFalseQuestion, multipleChoice, wordBankCloze, multiGapCloze, multipleChoiceCloze, shortAnswer, essay, sentenceTransformation, matching, ordering, placement.
reserved question type
A type discriminator value listed in question-base.schema.json’s discriminator enum that does not yet have a published per-type schema. The 1.0 reserved types are association, hotspot, graphicGapMatch, graphicAssociate, graphicOrder, fileUpload, and mediaPromptedEssay. Consumers MUST preserve reserved-type questions in full across read/write cycles — every field, value, and nested structure (§6.4; semantic preservation, key order is producer-discretion per §6.2).
unknown question type
A type discriminator value not listed in the spec’s discriminator enum at all — typically a tool-specific extension under a type name that the producer’s tooling recognizes but other tools do not. Unknown types follow the same round-trip preservation contract as reserved types (§6).
extension member
A root-level or per-object JSON member whose key begins with x- and identifies vendor-specific data not defined by the LC-JSON spec. Extension members let tools carry authoring provenance, internal identifiers, editor state, analytics hints, etc., without polluting the core format. Defined in NORMATIVE.md §7.
namespace
The segment of an extension-member key immediately following x-. The namespace identifies the originating tool or vendor (e.g., x-acme.workflow-id has namespace acme). Namespacing prevents collisions: a producer MUST NOT emit an extension member under a namespace it does not own (§7.2).
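Several entries in this group describe string shapes that are easy to check mechanically. The sketch below encodes the two version patterns quoted above, a shape-only check for the 8-4-4-4-12 globalId form, and namespace extraction for extension keys. The helper names are hypothetical. Accepting upper-case hex digits, and treating the first `.` as the end of the namespace (following the x-acme.workflow-id example), are assumptions of the sketch.

```python
import re

# Shape-only: 8-4-4-4-12 hex groups, any UUID version (case-insensitive here).
GLOBAL_ID = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)
SPEC_VERSION = re.compile(r"1\.[0-9]+(\.[0-9]+)?")      # contract version, major 1
CONTENT_VERSION = re.compile(r"[0-9]+(\.[0-9]+){0,2}")  # author's content version


def is_global_id(value: str) -> bool:
    return GLOBAL_ID.fullmatch(value) is not None


def is_spec_version(value: str) -> bool:
    return SPEC_VERSION.fullmatch(value) is not None


def is_content_version(value: str) -> bool:
    return CONTENT_VERSION.fullmatch(value) is not None


def extension_namespace(key: str):
    """Return the namespace of an x- extension key, or None for core keys."""
    if not key.startswith("x-"):
        return None
    # Assumption: the namespace ends at the first "." after the x- prefix.
    return key[2:].split(".", 1)[0] or None
```

`fullmatch` is used instead of anchors so that a trailing newline cannot slip through. Note that `"1.0-rc.3"` is rejected as a specVersion, consistent with the release-candidate distinction living in the $schema URL.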
See also
- NORMATIVE.md — binding conformance requirements (the authoritative source for all terms above).
- README.md — descriptive specification overview.
- HTML_SAFETY.md — the HTML allowlist for HTML-bearing fields.
- ACCESSIBILITY.md — accessibility expectations for producers and consumers.
- IMPLEMENTATIONS.md — directory of conforming implementations.
Implementations
Status: Informative.
Spec version: 1.0
Last updated: 2026-05-26
Tools that produce, consume, or validate LC-JSON (Learning Content JSON) 1.0.
To list a new implementation, open a PR adding an entry below. Implementations are listed in alphabetical order within each section. Inclusion does not imply endorsement.
Producers
Tools that emit LC-JSON documents.
- Lesson Commons — Authoring and delivery platform for structured learning content. Emits courses and question sets in LC-JSON 1.0. https://lessoncommons.com
Consumers
Tools that ingest LC-JSON documents.
- Lesson Commons — Imports LC-JSON 1.0. Extension-preserving consumer per NORMATIVE §7 (unknown x-* members round-trip through a load/save cycle), §6.4 (reserved-type questions preserve their type-specific bodies), and §12.1 (accessibility-preservation floor). https://lessoncommons.com
Reference tools
Distributed alongside the specification in this repository.
- validate_course.py — Python reference validator. Runs documents through the published JSON Schemas (jsonschema ≥ 4.18) plus a hand-written domain pass for rules JSON Schema cannot easily express (HTML allowlist, gap-marker counts, points consistency). Default mode is lenient — pre-1.0 document shapes (wrapped envelopes, bare payloads) are tolerated with warnings so legacy/pre-1.0 documents can still be ingested during migration. The --strict flag is the public-conformance mode: those shapes become fatal errors, and the full set of NORMATIVE.md §3.2 / §4.1 rejections is enforced. Public conformance claims under NORMATIVE.md §10 are evaluated in --strict mode.
- run_corpus.py — Conformance corpus harness for spec maintainers and contributors. Reads tests/manifest.json and runs every fixture through validate_course.py --strict, asserting that valid fixtures pass and invalid fixtures fail with non-zero exit. The spec repo’s CI runs it as a gating step on every push and PR — a corpus regression blocks deployment. Contributors SHOULD run it locally before opening a PR that touches the spec. LC-JSON consumers (tools that read/write LC-JSON documents in their own applications) do not need it; they can ignore it and treat tests/manifest.json plus the fixture files as the canonical test set for their own implementation tests.
Conformance claims
Implementations may state conformance per NORMATIVE.md §10:
- Conforms to LC-JSON 1.0 as a producer
- Conforms to LC-JSON 1.0 as a consumer
- Conforms to LC-JSON 1.0 (both producer and consumer)
The conformance test corpus at tests/ lets implementations self-verify.
Extension namespaces
Registered x--namespaced extension prefixes (NORMATIVE §7). Listing a namespace here documents its owner and intent so other tools can interoperate or avoid collision; it does not make the extension part of the core format.
- x-lessoncommons — Lesson Commons. Carries tool-specific authoring metadata that has no place in the interchange core; member details are documented in the Lesson Commons developer docs. Consumers outside Lesson Commons MUST ignore these members (NORMATIVE §7.4) but are encouraged to preserve them across round trips so authoring provenance survives a transfer through third-party tools.
Changelog
All notable changes to the LC-JSON (Learning Content JSON) specification are documented in this file.
The format is based on Keep a Changelog, and this project adheres to the versioning policy described in NORMATIVE.md §8.
[1.0-rc.3] — 2026-06-13
Second publicly announced release candidate. Adds the localization model (LOCALIZATION.md) and a conformance-corpus expansion; removes two prototype-era sentenceTransformation fields from the schema (the change that requires a new immutable path — /1.0-rc.2/ cannot be mutated). Backwards-compatible with 1.0-rc.2: every rc.2-valid document remains valid under rc.3. Published at immutable /1.0-rc.3/ URLs; /1.0-rc.1/ and /1.0-rc.2/ stay served and frozen.
Removed
- allowedFillerWords and prohibitExtraWordsBetweenChunks property declarations removed from sentence-transformation.schema.json. Completes the removal documented in the 2026-05-26 rc.1 doc revisions: the reference docs, VALIDATION.md §9.9, and the shipped fixtures dropped both fields then; the working schema was the last surface still declaring them. No additionalProperties: false was added, so 1.0-rc.2 documents carrying the fields continue to validate as unknown members per NORMATIVE.md §5.4 and are dropped on re-emit by conforming consumers. The frozen /1.0-rc.1/ and /1.0-rc.2/ schemas are unchanged.
Added
- LOCALIZATION.md: language model. New normative-and-informative document specifying the three roles of language: language (delivery), lang/dir (language of parts, WCAG 3.1.2), and supportLanguage (optional pedagogical L1 layer). It states explicitly that LC-JSON 1.x is single-language-per-document (translations are separate documents, not localized field bundles). Bound by new NORMATIVE.md §13 (sections renumbered: Validation surface §13→§14, References §14→§15). GLOSSARY.md gains a “Language and localization” group.
- Language tags are BCP 47. language, supportLanguage, and HTML lang accept BCP 47 tags; bare ISO 639-1 (en, es) is the common case, region/script subtags (pt-BR, zh-Hant) are permitted, and a consumer MAY act on only the primary subtag. The reference validator’s supportLanguage check was widened from “2-letter ISO 639-1” to a BCP 47 plausibility check (WARN), and the same check now also covers language. ACCESSIBILITY.md §6.1 wording aligned.
- Screen-reader pronunciation expectations (informative). LOCALIZATION.md §7 and ACCESSIBILITY.md §6.3 state that lang is necessary but not sufficient for correct pronunciation: automatic language switching and installed voices vary across screen readers (NVDA, JAWS, Narrator, VoiceOver) and are outside the format’s control. lang remains required where parts differ; it is the floor, not optional.
- GitHub issue and PR templates. bug_report.md and spec_change_proposal.md under .github/ISSUE_TEMPLATE/, plus pull_request_template.md mirroring CONTRIBUTING.md’s pre-merge checklist.
- RATIONALE.md added to the published book. The positioning page was linked from the landing page but missing from the site navigation and sync set; it now ships as a chapter (“Rationale and positioning”).
- globalId document-wide uniqueness made explicit and enforced. NORMATIVE.md §4.4: within a single document, globalId values MUST be unique across all entities (Units, Lessons, Items, and Questions share one namespace); comparison is case-insensitive. Previously implied by “identify the entity across re-imports” but stated nowhere and enforced by no published artifact. Reference validator gains a document-wide duplicate check (ERROR-tier) on both the course and questionSet paths; reference fields that point at a globalId (contentItemId, relatedItemIds) are exempt as non-declarations. New VALIDATION.md §12.7 catalog row. Fixture: tests/invalid/40-duplicate-global-id.json (case-varied duplicate).
- Validator: WARN on course-root author (singular). Course author credits are carried by the authors array; a singular author at the course root is not declared by course.schema.json (it belongs to the QuestionSet artifact) and conforming consumers discard it. Tolerated as an unknown field per NORMATIVE.md §5.4; the reference validator now surfaces it as a likely authoring mistake. The QuestionSet artifact’s author field is unaffected.
- Accessibility: base-vs-Profile authoring split made explicit. Base conformance is now stated as preservation only (NORMATIVE.md §12.1): it never requires a producer to author alt text, captions, or transcripts, so a small or non-institutional producer is never non-conforming for omitting them (the validator surfaces omissions as non-blocking warnings). Authoring obligations are bound by the opt-in Accessibility Profile (§12.2): under a Profile claim, a producer MUST emit alt on every <img> (ACCESSIBILITY.md §2.1), captions and a transcript on prerecorded instructional video carrying speech, and a transcript on prerecorded audio-only instructional content (§3.1, WCAG 1.2.1 / 1.2.2 / 1.2.3). Transcripts were previously SHOULD; they are now MUST under the Profile. No schema change; normative prose only.
- Conformance corpus expanded 38 → 64 fixtures. Per-type coverage for all 12 implemented question types: 12 new valid fixtures (valid/14–25) including both matching sub-shapes (pairs, classification), ordering with scoringMode: "kendall", and a grading combination matrix (valid/25) exercising all four graded/ungraded × exercise/quiz combinations with a resolvable objectives pool. 14 new invalid fixtures (invalid/27–40) pinning schema-tier rules (missing required fields, gap-marker pattern, matchingMode sub-shape selection and pairs/categories collision, items minItems, scoringMode enum, empty wordBank, question-level missing globalId) and domain-tier rules (non-boolean correctAnswer, comma in a multiGapCloze accepted answer, ContentSequence relatedItemIds referential integrity, duplicate globalId). tests/manifest.json updated; tools/run_corpus.py reports 64/64.
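The document-wide globalId rule above can be sketched as a recursive walk. This is an illustrative approximation, not the reference validator's code: it assumes every member named globalId anywhere in the parsed document is a declaration, and the function name find_duplicate_global_ids and the path format are hypothetical.

```python
from collections import defaultdict
from typing import Any

# Reference fields point at a globalId but do not declare one.
REFERENCE_FIELDS = {"contentItemId", "relatedItemIds"}


def find_duplicate_global_ids(document: Any) -> dict[str, list[str]]:
    """Map each case-folded globalId declared more than once to its locations."""
    seen: dict[str, list[str]] = defaultdict(list)

    def walk(node: Any, path: str) -> None:
        if isinstance(node, dict):
            for key, value in node.items():
                if key in REFERENCE_FIELDS:
                    continue  # exempt: a reference, not a declaration
                if key == "globalId" and isinstance(value, str):
                    # Case-insensitive comparison via case folding.
                    seen[value.casefold()].append(path or "/")
                walk(value, f"{path}/{key}")
        elif isinstance(node, list):
            for index, value in enumerate(node):
                walk(value, f"{path}/{index}")

    walk(document, "")
    return {gid: paths for gid, paths in seen.items() if len(paths) > 1}
```

A case-varied pair such as "ABC" on a unit and "abc" on one of its lessons is reported as one duplicate with two locations, while a relatedItemIds entry that repeats the same value is not counted.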
[1.0-rc.2] — 2026-05-30
First formally announced public release candidate. Corrects the prompt-field definition from the (never-announced) internal 1.0-rc.1 candidate. Backwards-compatible widening: every 1.0-rc.1-valid document remains valid under 1.0-rc.2. Published at immutable /1.0-rc.2/ URLs; /1.0-rc.1/ stays served and frozen.
Changed
- prompt: minLength 1 → 0; defined as non-authoritative for the symbolic question types. Required on every question. Authoritative for trueFalseQuestion, multipleChoice, shortAnswer, and essay. Non-authoritative for the symbolic types (gap-fill family, sentenceTransformation, matching, ordering, placement): MAY be empty or carry a producer-derived summary; consumers MUST NOT rely on its content for scoring, rendering, equality, or deduplication. Affected: question-base.schema.json, question-types-reference.md, eight symbolic examples; new examples/01b-simple-gap-fill-readable-prompt.json. Object shape unchanged.
Added
- Real-content empty-prompt domain rule. validate_course.py flags empty/whitespace prompt on trueFalseQuestion, multipleChoice, shortAnswer, and essay as an ERROR. Symbolic types pass; reserved types unconstrained (deferred to v1.1). Fixtures: tests/valid/13-symbolic-empty-prompt.json, tests/invalid/26-real-content-empty-prompt.json. Corpus: 38/38.
- GOVERNANCE.md: masterdoc principle + Release Candidate Policy. Single-living-source model: published releases are immutable versioned artifacts; the git tag is the historical source for prior versions. RC policy: RC releases MAY introduce backwards-compatible corrections and clarify language; MUST NOT silently modify previously published artifacts; v1.0 final establishes the stable contract.
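The empty-prompt domain rule can be sketched as a small predicate over a parsed question object. This is an illustration under assumptions, not the validator's actual function: the name empty_prompt_error and the message wording are hypothetical; the four type names and the type discriminator come from the entries above.

```python
from typing import Optional

# Types for which prompt is authoritative, per the rc.2 "Changed" entry.
REAL_CONTENT_TYPES = {"trueFalseQuestion", "multipleChoice", "shortAnswer", "essay"}


def empty_prompt_error(question: dict) -> Optional[str]:
    """Return an error message if a real-content question has an empty prompt.

    Symbolic and reserved types are not constrained and always return None.
    """
    question_type = question.get("type")
    if question_type not in REAL_CONTENT_TYPES:
        return None
    prompt = question.get("prompt")
    if isinstance(prompt, str) and prompt.strip():
        return None
    return f"ERROR: {question_type} requires a non-empty prompt"
```

A whitespace-only prompt on an essay is flagged, while an empty prompt on a matching question passes.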
Documentation
- NORMATIVE §5 (Consumer Conformance): preamble + forward-compatibility worked example. Preamble at top of §5: schema validation is necessary but not sufficient; cross-references §5.3–§5.6, §6, §10.3. New informative subsection at end of §5 (“Forward compatibility: three look-alike situations”) covers three JSON-layer cases governed by different consumer obligations: unknown top-level field (§5.4), extension-namespaced field (§7), unknown type discriminator value (§5.1 Exception + §6 fallback). Cites §6.1’s “reserved and unknown types are handled identically.”
- NORMATIVE §5 case 3: design-choice note on points handling for unsupported types. Earned 0; possible points stay in the item total; item maximum is consumer-independent. Open question logged in the rc.2 release notes’ “Areas still under discussion.”
- tests/README.md: Behavioral conformance (informative) section. Three-step round-trip self-test recipe (load → re-emit → diff) for the existing fixtures tagged in manifest.json as demonstrating §6.4, §7, and §12.1 preservation obligations. No new fixtures or runner.
- Scope clarification. Release notes intro, src/index.md, and README-public.md now state that LC-JSON is a content-layer format complementary to LTI, OneRoster, xAPI, and SCORM, with a pointer to RATIONALE.md’s Scope and Limits.
- README.md: count + listing corrections. Example count 31 → 32; removed the directory-tree reference to non-existent examples/course-legacy-wrapped.json; added placement.schema.json to the Question Type Schemas list.
- tests/manifest.json: two stale § references. valid/01-course-minimal.json: §4.4 camelCase property naming → §4.5. valid/06-html-with-video-track.json: §10 → §11 (HTML Safety Profile). Sweep confirms no further stale NORMATIVE § refs in the manifest.
- Accessibility severity alignment. ACCESSIBILITY.md §8 listed video-without-<track> as “informational note”; VALIDATION.md §12.2 and validate_course.py enforce it as WARN. ACCESSIBILITY.md is now aligned to WARN.
- “rc.1 baseline” wording → “current baseline (established in rc.1)”. Affected: ACCESSIBILITY.md §8 heading, §11 heading, §11 inline; VALIDATION.md §12.2 + §14. The accessibility severity policy is unchanged; only the labeling is brought into the current package’s framing.
- UUID prose: “RFC 4122 UUID” → “RFC 4122 UUID (any version; shape-only validation)”. Brings prose into line with the schema regex (which checks shape only, not version/variant bits). Affected: GLOSSARY.md (globalId entry), README.md (Common Validation Errors), VALIDATION.md (six catalog rows across §4, §5, §6, §7, §8, §10), question-types-reference.md (intro requirement, Common Properties table, Validation Rules). NORMATIVE.md §4.4 already carried “(any version)” and is unchanged. Open question for v1.0 (whether to enforce strict v4/variant) logged in the rc.2 release notes’ “Areas still under discussion.”
- Reserved/unknown-type preservation language: “verbatim” / “byte-equivalent” → semantic preservation. NORMATIVE.md §6.2 and §6.4 now require preservation of every member, value, and nested structure across read/write cycles; key order within JSON objects is producer-discretion (SHOULD for authoring ergonomics, not MUST for consumers) per RFC 8259 §4. §5 forward-compatibility case 3 “preserved byte-equivalent” → “preserved with every member, value, and nested structure intact.” Tracked through GLOSSARY.md, VALIDATION.md (two catalog rows), HTML_SAFETY.md §9, ACCESSIBILITY.md, question-types-reference.md (8 reserved-type Status lines), and tests/manifest.json valid/05.
- TrueFalseQuestion correctAnswer import normalization: explicitly labeled as pre-1.0 lenient migration affordance, not conforming behavior. question-types-reference.md previously documented value-coercion ("true", "correct", "tick", "✓", 1 → true; symmetric for false) without saying it was non-conforming. The schema requires a JSON boolean and NORMATIVE.md §5.1 binds consumers to reject schema-validation failures (strict mode). The section now states: conforming producers MUST emit a JSON boolean; conforming consumers in strict mode MUST reject non-boolean values; the coercion is a transitional ingestion aid for pre-1.0 documents that does not survive into a --strict-conforming re-export.
- Release notes (“Areas still under discussion”): three further open questions logged for 1.0 final: normative authority of the reference validator’s ERROR-tier domain rules; HTML processing-model split between strict-validation rejection and recovery-rendering sanitization; language-tag model (ISO 639-1 root vs. BCP 47 inline). None block rc.2.
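The load → re-emit → diff recipe and the semantic-preservation wording above suggest a comparison that ignores object key order but nothing else. The sketch below is one way to express that under assumptions: the function names are hypothetical, and treating 1 and 1.0 (or 1 and true) as different is a deliberately conservative choice of this example, not something the entries above specify.

```python
import json
from typing import Any


def same_json(a: Any, b: Any) -> bool:
    """Compare parsed JSON values: key order ignored, everything else strict."""
    if type(a) is not type(b):
        return False  # also keeps True distinct from 1, and 1 from 1.0
    if isinstance(a, dict):
        return a.keys() == b.keys() and all(same_json(a[k], b[k]) for k in a)
    if isinstance(a, list):
        # Array order is significant in JSON, so compare position by position.
        return len(a) == len(b) and all(same_json(x, y) for x, y in zip(a, b))
    return a == b


def round_trip_preserved(original_text: str, reemitted_text: str) -> bool:
    """True if the re-emitted document keeps every member, value, and nesting."""
    return same_json(json.loads(original_text), json.loads(reemitted_text))
```

Reordered keys pass the check; a dropped extension member or a boolean rewritten as a number does not.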
Note
1.0-rc.1 was never publicly announced; 1.0-rc.2 is the first announced prerelease. The /1.0-rc.1/ schema set remains served and byte-frozen at its immutable URL.
[1.0-rc.1 doc revisions] — 2026-05-27
Doc-only follow-up revisions to the rc.1 publication. No schema or wire-format changes; the rc.1 contract at /1.0-rc.1/ is unchanged.
Added
- GOVERNANCE.md: canonical sources identified. New paragraph in the Trademark and naming section names github.com/lc-json/specification and lc-json.org as the canonical sources for the LC-JSON specification, distinguishing them from forks and mirrors (which remain permitted under Apache 2.0 but should be named as derivative works rather than as “LC-JSON” without qualification). README-public.md trademark paragraph extended with a one-line pointer.
[1.0-rc.1 doc revisions] — 2026-05-26
Doc-only follow-up revisions to the rc.1 publication. No schema or wire-format changes; the rc.1 contract at /1.0-rc.1/ is unchanged.
Removed
- allowedFillerWords and prohibitExtraWordsBetweenChunks (sentenceTransformation): prototype-era optionals, removed from the public spec surface. The two fields worked as a pair to permit specific words between chunks, a degree of leniency that practice never used. Every authored example emitted the empty default (allowedFillerWords: []) and the strict default (prohibitExtraWordsBetweenChunks: true); the off-stance for either was not a real authoring use case. The same intent, accepting a chunk with a permitted filler, is better expressed by adding the filler-bearing variant to the chunk’s acceptedChunks[N] array (rare edge case). The /1.0-rc.1/ sentence-transformation.schema.json still declares both properties because rc.1 schemas are immutable; the properties have been dropped from the reference docs (question-types-reference.md JSON example and property table), from VALIDATION.md §9.9, and from the shipped fixtures (examples/09-sentence-transformation.json, tests/invalid/25-sentence-transformation-multiple-markers.json). The /1.0/ schema MUST omit both when finalized.
Fixed
- Editorial polish pass on rc.1 documentation. Count consistency across question-types-reference.md (19 question types; reserved-types numbered 13–19), VALIDATION.md rule-catalog completeness rows for optional schema fields, vendor-neutral phrasing in ITEM_PATTERNS.md §3, US-English normalization gap closure, IMPLEMENTATIONS.md standard header block, internal-rationale generalization, elimination of redundant audience-targeted guidance in favor of single-source rules in Overview / Common Properties / Validation Rules, and markdown hygiene throughout. No normative changes; the wire format and JSON Schemas are unchanged.
1.0-rc.1 — 2026-05-25
Release-candidate hardening pass before 1.0 final. Closes contract-drift issues identified in the external evaluation while keeping the wire format additive (no breaking changes vs unreleased 1.0 baseline).
Added
- placement question type: full schema, four examples, validator domain rules, conformance corpus. New top-level type with a placements: [{gap, item}] explicit-relationship shape (mirrors the matching-redesign principle). Three modes via placementUnit: sentence (inline gaps), paragraph (block-level gaps), and sectionLabel (label slots above sections; covers IELTS Matching Headings and analytical meta-labels). Decoy gaps (extra @@@N markers without placements[] entries) natively express TOEFL Sentence Insertion. Word-level placement is covered by wordBankCloze. Four canonical examples: 17a (Cambridge B2 sentence-mode missing-sentences), 17b (Cambridge C1 paragraph-mode reorder), 17c (IELTS Matching Headings + analytical meta-labels), 17d (TOEFL Sentence Insertion with 4 candidate positions). Validator: hard-error rules on placements[].gap references and duplicates; soft-warning rules on marker-placement convention violations per placementUnit and on non-sequential @@@N numbering. Conformance corpus extended with 4 valid + 6 invalid placement fixtures. Reference-runtime scoring: pointsEarned = pointsPossible × correctCount / gapCount under partial credit; strict zero-or-full otherwise. Decoy-gap rule: an item placed at an unlisted gap marks the question incorrect but does not subtract from credit earned on correctly-filled gaps when partial credit is allowed; under all-or-nothing it fails the strict check. Discriminator enum extended.
- Ordering: Kendall tau partial-credit scoring. The reference runtime implements Kendall tau over the learner’s permutation; with N items and k inversions, pointsEarned = pointsPossible × (1 − k / (N × (N−1) / 2)). Distractors placed in the answer area drop the question to incorrect at the strict-mode level but partial credit is still awarded over the in-bounds correct indices. A new helper resolves the conditional default at scoring time.
- Matching question: matchingMode discriminator with two sub-shapes. Replaces the legacy stems[]/targets[] parallel-array shape entirely. matchingMode: "pairs" selects pairs: [{ item, match }] for 1:1 matching; matchingMode: "classification" selects categories: [{ label, items }] for many-to-one classification. The discriminator is required (no default). Mixed shapes (both pairs and categories present, or matchingMode omitted) are rejected by the schema. Each pairs[i].{item,match} and categories[i].{label,items} is required and minLength: 1; categories[i].items has minItems: 1. New canonical example 15b-matching-classification.json (time expressions → tense). A one-off migration script rewrites legacy on-disk content; the new shape replaces the legacy one entirely with no back-compat path.
- NORMATIVE §6: Reserved and Unknown Types. Full fallback contract: consumers MUST preserve reserved/unknown question types verbatim across read/write cycles, MUST NOT silently drop, MUST treat earned points as 0, SHOULD render a non-interactive placeholder. Producers MUST satisfy question-base.schema.json if emitting reserved types and SHOULD NOT emit them in cross-implementation distribution. Closes the round-trip preservation gap that broke portability for tool-specific extensions.
- NORMATIVE §7: Extensions (namespaced x- members). Defines a forward-compatible, collision-free mechanism for tools to attach data outside the interchange contract (authoring provenance, internal identifiers, editor state). Extension members are keyed x-<namespace> (e.g. x-acme-reviewState), MAY appear on the document root and any Course/Unit/Lesson/Item/Question object, and are strictly additive: removing all x- members MUST leave a conforming document with equivalent learner-facing meaning. Consumers MUST NOT reject documents for carrying them, MUST NOT interpret members outside namespaces they own, and SHOULD preserve unrecognized members across a round trip (defining an extension-preserving consumer). This is the contract that lets a tool use LC-JSON as a faithful transfer/backup format for its own tool-specific state. IMPLEMENTATIONS.md registers the x-lessoncommons namespace (question lineage / authoring provenance). Sections §7–§11 (was §6–§10 after Reserved Types) renumbered to §8–§12.
- NORMATIVE §11 + new sibling HTML_SAFETY.md: HTML safety profile. Normative allowlist for ContentItem.html and SignpostItem.customHtml: allowed elements (block, inline, media including <video>, <audio>, <source>, <track>, <h1>–<h6>, <div>, <blockquote>, <figure>); per-element attribute table; URL-scheme allowlist (http:, https:, mailto:, tel:, relative; javascript: / vbscript: / data: etc. forbidden); link normalization (target="_blank" → rel="noopener noreferrer"); inline style CSS-property allowlist (sizing, spacing, borders, alignment); class attribute permitted unconstrained (author-defined CSS hooks); sanitization obligation; unknown-element strip-while-preserving-text per §6.2 (mirrors the §6 reserved-types philosophy: degrade gracefully, never fail-closed); validator severity split (ERROR for XSS-class violations like <script> and on*, WARN for sanitizable cases). Closes the largest public-spec gap from the 2026-05-02 external evaluation.
- unitLevel hint on ordering questions: "word" (default), "sentence", "paragraph". Enables sentence-in-text and paragraph-in-text ordering tasks under the same discriminator. Existing ordering examples continue to validate (default = "word").
- Two ordering examples: 16b-sentence-ordering.json (cellular respiration stages with scoringMode: "kendall") and 16c-paragraph-ordering.json (Enlightenment essay with scoringMode: "strict").
- Authored-text line-break rendering convention: \n\n → paragraph, \n → line break. ITEM_PATTERNS.md §3 gains a non-normative subsection documenting the affordance for plain-text authored fields (description, prompt, passage, hint, feedback.correct, feedback.incorrect, etc.). Producers using this convention get portable rendering across consumers that align with the affordance; consumers may legitimately collapse all whitespace and treat the field as a single block. Lesson Commons Learn implements this convention as the reference behavior. Wire format unchanged: JSON strings continue to escape newlines per RFC 8259; no schema change. Trusted-HTML fields (ContentItem.html, SignpostItem.customHtml) keep their existing sanitizer pipeline and stay outside this convention.
- NORMATIVE §5.6: Randomization requirements for matching and placement. Consumers MUST present two surfaces in randomized order: (1) the choice pool (authored answer values + distractors) for both matching and placement, and (2) the row order in matching classification mode (where source order is grouped by category and would directly expose the answer). Source order is forbidden on these surfaces. The randomization algorithm and any seeding strategy are consumer-defined. The requirement does not apply to multipleChoice (where shuffleOptions per question still governs), to matching pairs-mode rows (each item has a distinct match, so source order doesn’t leak), or to ordering source tiles (already shuffled structurally). Schema descriptions on matching.distractors, matching.categories, and placement.distractors cross-reference §5.6.
- tags arrays on Unit, Lesson, and Item. Five-tier tagging now uniform across Course, Unit, Lesson, Item, and Question.
- ITEM_PATTERNS.md: informative authoring guide covering policy fields, common patterns, signposts + learning objectives, consumer-policy variance. §3 gains a tel: consumer-policy example showing how the same wire-level construct (a tel: link) is gated differently across consumer audiences.
- Conformance test corpus expansion: valid/05-reserved-type-with-extensions.json (round-trip preservation case for §6.4), valid/06-html-with-video-track.json (HTML safety §7 media handling), invalid/12-unit-missing-global-id.json (§4.4 enforcement), invalid/13-html-with-script.json (HTML safety §2.4 + §3.5: forbidden <script> element and onclick event handler).
- Validator [NOTE] tier: informational lines for intentional weighted-points overrides, distinct from warnings.
- validate_course.py --strict mode: public-conformance mode that rejects pre-1.0 document shapes (wrapped envelope {"course":{...}}, bare payload {"units":[...]} with no documentType) as fatal errors. Default mode remains lenient for legacy/pre-1.0 document ingestion during migration. NORMATIVE.md §10 conformance claims are evaluated in --strict mode.
- tools/run_corpus.py: conformance corpus harness. Reads tests/manifest.json and asserts every valid fixture passes and every invalid fixture fails (the harness invokes validate_course.py --strict on every fixture internally). Wired into .github/workflows/publish.yml as a required job, so a corpus regression blocks deployment. Closes the gap between CONTRIBUTING.md’s “CI runs the corpus” promise and the previous behavior (3 invalid fixtures silently passing).
- ACCESSIBILITY.md (rc.1 release): producer/consumer accessibility profile covering image alt text, video/audio (captions/transcripts/descriptions), keyboard alternatives for structured-task question types, feedback not conveyed by color alone, language and direction, and reserved-type placeholder accessibility. The rc.1 release (2026-05-23) carries per-section WCAG 2.1 AA SC cross-references, recommended ARIA patterns for the structured-task question types, the two-layer format/consumer duty framing, EN 301 549 + DOJ ADA Title II legal context, the ATAG vs WCAG split for producers vs consumers, the five claim-level gates for an AA claim, and selected WCAG 2.2 criteria designed-in (2.5.7 Dragging Movements, 2.5.8 Target Size). RTL producer wording precisely distinguishes RTL-primary documents from LTR documents with embedded RTL passages. Caption requirements split: SHOULD for generic video, MUST for prerecorded instructional video with speech when claiming Accessibility Profile conformance (WCAG 1.2.2).
Additive deepenings (per-criterion normative cross-reference table, expanded ARIA patterns, screen-reader timing requirements, expanded accessibility conformance fixtures, --accessibility validator flag) land in 1.0 final on 2026-06-30; every deferral is an explicit §-level callout. Closes the HTML_SAFETY.md dangling-reference issue flagged by the v2 audit.
- NORMATIVE.md §12: Accessibility Profile binding (hybrid: preservation in base conformance + opt-in claim for delivery). Splits the accessibility obligation into two layers per the 2026-05-23 accessibility audit Finding 1 and the consultant’s “must survive transformation” framing. §12.1 codifies the base-conformance accessibility-preservation floor (every conforming consumer MUST round-trip alt, <track>, lang, dir, language, supportLanguage, reserved-type accessibility metadata, and x- namespaced accessibility extensions). §12.2 defines the opt-in LC-JSON 1.0 Accessibility Profile claim binding all MUST-level items in ACCESSIBILITY.md §§2–8 (keyboard, ARIA, feedback, language-aware rendering, placeholder accessibility). §12.3 codifies the relationship to WCAG: LC-JSON enables WCAG 2.1 AA delivery via the affordances and consumer obligations; the WCAG claim itself remains the delivering consumer’s responsibility, not LC-JSON’s. Closes accessibility-audit Finding 1.
- NORMATIVE.md §10: Conformance Claims expanded with marketing wording. §10.1 lists base-conformance claim forms; §10.2 lists Accessibility Profile claim forms; §10.3 codifies three claim-accuracy rules (producer≠consumer; Accessibility Profile is fully bound; LC-JSON does not certify WCAG); §10.4 provides suggested badge/sentence/formal-claim wording for both tiers (Tier 1 badge: “LC-JSON 1.0 Compatible”; Tier 2 badge: “LC-JSON 1.0 Accessibility Profile”); §10.5 reaffirms the trademark stance. The marketing wording is informative and freely usable.
- language is now a required root field on course.schema.json and question-set.schema.json (per accessibility-audit Finding 2: the spec said language was schema-enforced but the schemas didn’t require it). All 31 existing root-level course/questionSet examples and conformance fixtures already carried language; no migration burden. New invalid fixture tests/invalid/20-missing-language.json pins the constraint.
- Two new accessibility conformance fixtures. tests/valid/12-accessibility-round-trip.json demonstrates §12.1 base-conformance accessibility preservation: alt on <img>, <track> on <video>, lang/dir on inline spans, document-root language/supportLanguage, and x- namespaced accessibility metadata on a reserved-type question MUST all round-trip. Paired with the tests/invalid/20-missing-language.json invalid fixture from the previous entry, the two additions bring the conformance corpus toward its rc.1 total of 12 valid + 24 invalid = 36 fixtures; run_corpus.py reports 36/36 behave as expected.
- VALIDATION.md: schema-vs-domain-vs-advisory rule catalog. New sibling document referenced from NORMATIVE.md §13 (“Validation surface”) and listed in README.md’s directory tree. Catalogs every documented validation rule across the 23 JSON Schemas, tools/validate_course.py (domain pass, ERROR/WARN/NOTE severities), and prose in NORMATIVE.md / HTML_SAFETY.md / ACCESSIBILITY.md / question-types-reference.md / ITEM_PATTERNS.md. Each rule is tagged with its enforcement tier and a precise citation (schemas/<file>.schema.json: <json-pointer> or validate_course.py: <function-name> + NORMATIVE §). Organized by document scope (root → course → unit → lesson → item-base → per item type → question-base → per question type → reserved types → question sets → cross-cutting). Cross-cutting section covers HTML safety severity table, accessibility preservation + validator severities, randomization (§5.6), extensions (x- namespacing), versioning / URL stability, and discriminator casing. Closes the external evaluation’s “validation appendix” gap (2026-05-02 §3): consumers that only run JSON Schema validation can now see at a glance which rules they would otherwise miss. The inventory pass surfaced eight documented-but-unenforced rules; all eight were closed in the same rc.1-polish session by extending tools/validate_course.py (see next entry). The catalog reflects the rc.1 enforced state, not a gaps list. §14 “Forward-looking deepenings” tracks what’s scheduled for 1.0 final (the --accessibility validator flag, tag namespace conventions, and reserved-type per-type schemas).
- Validator gap closures: 8 documented rules promoted from prose to enforced. Extends tools/validate_course.py with no schema changes; all rules were already documented in NORMATIVE.md / question-types-reference.md / README.md / ACCESSIBILITY.md but enforced nowhere. New domain-validator functions: validate_multiple_choice (KG-3: MCQ MUST have ≥1 positive optionsAndPoints value; KG-4: optionsAndPoints MUST cover every entry in options, with WARN for orphan entries); validate_word_bank_cloze and validate_multiple_choice_cloze (KG-1: passage @@@N marker set MUST equal the answer/option dictionary key set; KG-2: marker numbers SHOULD be sequential from 1; KG-5: correctAnswers[N] MUST be in bounds of gapOptions[N]; new ERROR for gapOptions/correctAnswers key-set mismatch); validate_essay (maxWords >= minWords when both > 0, WARN). Extended validate_multi_gap_cloze with the same KG-1/KG-2 checks via the new shared _check_cloze_gap_consistency helper. New _collect_objective_id_violations walker (KG-6: WARN if courseObjectiveIds[*] / unit.objectiveIds[*] / lesson.objectiveIds[*] reference an id not declared in course.objectives[]; closes the referential-integrity gap on objective references). validate_html_content gains a post-pass that scans for <video> blocks without a <track kind="captions"|"subtitles"> child element (KG-8: WARN at rc.1; ERROR-tier promotion under the --accessibility flag is targeted for 1.0 final). Internal: renamed _placement_marker_numbers → _gap_marker_numbers since the helper is now shared across all cloze types. Four new invalid conformance fixtures pin the ERROR-tier checks (tests/invalid/21-mcq-no-correct-option.json, 22-mcq-options-points-missing-entry.json, 23-word-bank-cloze-gap-count-mismatch.json, 24-multiple-choice-cloze-index-out-of-bounds.json); tests/manifest.json updated; python tools/run_corpus.py reports 36/36 fixtures behave as expected (the harness invokes validate_course.py --strict on every fixture internally). KG-7 was a false alarm during inventory (validator and schema agree on relatedItemIds) and is dropped from the catalog. No new normative rules; no schema changes.
- Language Policy (editorial). Specification text is written in US English as a single editorial register; example content should vary across English regional varieties (British, American, Australian, Indian, Irish, Canadian, South African, and others); non-English content in examples is reserved for demonstrating language-specific or multilingual capabilities (RTL rendering, non-Latin scripts, bilingual [L1:] tag rendering, language/dir behavior). The wire format is language-neutral; the policy governs only the editorial language of the specification and its examples.
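The two reference-runtime scoring formulas quoted in this list (Kendall tau for ordering, proportional credit for placement) can be worked through in a short sketch. The function names are hypothetical, the learner's answer is assumed to be given as the correct index of each item in the order the learner arranged them, and distractor and decoy-gap handling are left out.

```python
from typing import Sequence


def count_inversions(order: Sequence[int]) -> int:
    """Count pairs (i, j) with i < j whose items appear in the wrong order."""
    n = len(order)
    return sum(
        1 for i in range(n) for j in range(i + 1, n) if order[i] > order[j]
    )


def kendall_points(points_possible: float, learner_order: Sequence[int]) -> float:
    """pointsEarned = pointsPossible * (1 - k / (N * (N - 1) / 2))."""
    n = len(learner_order)
    if n < 2:
        # Assumption of this sketch: fewer than two items cannot be misordered.
        return float(points_possible)
    max_inversions = n * (n - 1) / 2
    k = count_inversions(learner_order)
    return points_possible * (1 - k / max_inversions)


def placement_points(
    points_possible: float,
    correct_count: int,
    gap_count: int,
    partial_credit: bool = True,
) -> float:
    """pointsEarned = pointsPossible * correctCount / gapCount, or zero-or-full."""
    if partial_credit:
        return points_possible * correct_count / gap_count
    return float(points_possible) if correct_count == gap_count else 0.0
```

For example, with three items and one adjacent swap (k = 1 of a possible 3 inversions), a 6-point ordering question earns two thirds of its points, and a fully reversed sequence earns none.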
Changed
- Ordering: renamed unitLevel → orderingUnit for teacher readability. Pre-publication rename, non-breaking. Schema, examples (16b, 16c), reference docs, and ITEM_PATTERNS.md all updated. Placement’s parallel placementUnit field-name reads naturally alongside.
- Ordering: scoringMode default is now conditional on orderingUnit. Schema description prose now states: when scoringMode is omitted, consumers SHOULD default to "strict" for orderingUnit: "word" and "kendall" for "sentence" / "paragraph". The literal "default": "strict" keyword is removed from the schema property since JSON Schema draft-07 can’t cleanly express conditional defaults; the conditional guidance lives in the description, where consumers actually read it. Existing examples are unaffected: 16-ordering.json (word) carries no scoringMode and inherits strict; 16b and 16c set kendall explicitly.
- globalId is now required on every Unit, Lesson, Item, and Question schema (was nullable + not-required on Unit/Lesson/Item, despite being normatively MUST under §4.4). Pattern unchanged (RFC 4122 any version).
- $schema is now required at the document root (producer MUST emit; consumer SHOULD tolerate omission for re-import flexibility). Updated course.schema.json and question-set.schema.json required[].
- scoringMode and orderingUnit (formerly unitLevel) remain optional on ordering.schema.json: orderingUnit defaults to "word"; scoringMode’s default is the conditional rule above (description prose, no literal default keyword).
- exercise/quiz decoupled from grading policy. quiz-item.schema.json no longer has "isGraded": {"const": true}. All four combinations of {exercise, quiz} × {graded, ungraded} are valid; new NORMATIVE §4.3 codifies this.
- Brand consistency: schema descriptions now say “LC-JSON spec version” (was “LC.JSON”).
question-types-reference.md: matching reference entry rewritten as a real type (was labeled stub);pointsconstraint corrected to≥ 0allowingnull; reserved-type entries (11 total) reframed under §6 fallback contract;Phase 4: Advanced Typesheading split intoStructured & Reserved Types; casing references corrected to NORMATIVE §5.3.README.md(specification): Reserved Types section now describes consumer obligations under §6 explicitly; example count updated to 30 across the file (placement adds 17a–17d to the example set).ACCESSIBILITY.md: keyboard interaction obligations split per matching mode (pairsvsclassification) and extended with aplacementrow;unitLevelreference updated toorderingUnit.- NORMATIVE §5.1 carve-out for §6 fallback. Resolved an internal contradiction between §5.1 (“consumer MUST reject documents that fail schema validation”), §5.2 (“consumer MUST accept any 1.x
specVersion”), and §6 (“consumer MUST preserve unknown-type questions verbatim and apply fallback, do not reject”). Applied literally, the three together made the forward-compat case impossible: a 1.0-only consumer reading a 1.1 document with a future-minor question type was bound to both reject (§5.1) and preserve (§6). §5.1 now carries an explicit exception: schema-validation failures whose only cause is one or moretypediscriminator values not in the consumer’s implementedquestion-base.schema.jsonenum do not trigger rejection; the consumer applies §6 fallback to those questions and validates the rest of the document under §5.1. All other schema failures (missing required fields, type mismatches, pattern violations on known fields,additionalPropertiesviolations) still trigger rejection. Doc-only NORMATIVE change; no schema change, no validator change —tools/run_corpus.pycontinues to report 36/36 fixtures behave as expected. Closes the consultant’s 2026-05-24 third-pass note on VALIDATION.md. - URL policy clarified — release candidates get their own immutable URL paths (NORMATIVE §3.1, §4.7, §8.1, §8.3). Each release candidate of an upcoming version
X.Yis published at/X.Y-rc.N/, immutable once published. The/X.Y/URL path is reserved for the accepted finalX.Yrelease and MUST NOT be populated until that release is published. A document pinned via$schemato/1.0-rc.1/continues to validate against rc.1 forever; adoption of/1.0/final is an explicit re-export, not an automatic promotion. Producers MUST emit a$schemaURL matching the spec version they conform to (/1.0-rc.1/for an rc.1 producer,/1.0/for a 1.0-final producer). This closes the audit’s “Release Candidate URL Immutability Wording” concern and preserves the §8.3 immutability guarantee across the rc.1 → 1.0 transition.
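The conditional scoringMode default described in the Ordering entries above can be sketched as a small consumer-side helper. This is a minimal illustration, not the reference runtime: the name resolve_scoring_mode and the decision to raise on an unrecognized orderingUnit are assumptions. Only the field names, the "word" default for orderingUnit, and the strict/kendall mapping come from the entries above.

```python
# Hypothetical consumer-side helper; not part of the published tools.
DEFAULT_SCORING_BY_UNIT = {
    "word": "strict",
    "sentence": "kendall",
    "paragraph": "kendall",
}


def resolve_scoring_mode(question: dict) -> str:
    """Return the effective scoringMode for an ordering question."""
    explicit = question.get("scoringMode")
    if explicit is not None:
        return explicit  # an explicitly authored value always wins
    # orderingUnit itself defaults to "word" when omitted.
    unit = question.get("orderingUnit", "word")
    try:
        return DEFAULT_SCORING_BY_UNIT[unit]
    except KeyError:
        # Assumption: the changelog does not say how to treat other units.
        raise ValueError(f"unrecognized orderingUnit: {unit!r}") from None
```

Keeping the mapping in a table rather than in the schema mirrors the changelog's point that draft-07 has no clean way to express a default that depends on a sibling property.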
Deprecated
- None.
Removed
- course-walkthrough.json (was excluded from publication via build-time SKIP; deleted to remove residual maintenance noise).
Fixed
- TrueFalseQuestion exports no longer leak choiceFeedback: {} (TF v2 architecture forbids the field; previous new() default emitted an empty dict on every TF question).
- Course/QuestionSet Version is now mandatory + numeric on producer side (^[0-9]+(\.[0-9]+){0,2}$); previously could ship as blank or free-form, breaking sortability across consumers.
- Python authoring tools emit canonical camelCase question discriminators; prior PascalCase emissions are gone.
- Cross-references in README-public.md, IMPLEMENTATIONS.md, and the test corpus updated for the §5.5 collapse + §6→§7 / §8→§9 renumber driven by the new §6 insertion.
- Reference Runtime (consumer-side): bilingual tag rendering on question prompt is now applied uniformly across all 12 question types. Previously ShortAnswer, Essay, SentenceTransformation, WordBankCloze, and MultiGapCloze skipped the bilingual transform on prompt; all 12 types skipped it on hint. Adopting a unified line-break renderer (which composes the bilingual transform) closes the gap. No spec or schema change; behavior change in the reference runtime only.
- HTML_SAFETY.md §8.1 / §2.4 alignment. The §8.1 “validator MUST reject” enumeration of forbidden elements had drifted from the §2.4 source list: six elements were missing (<select>, <textarea>, <applet>, <frame>, <frameset>, <noframes>). §8.1 now references §2.4 directly (“Any forbidden element listed in §2.4.”) so the two lists cannot drift in future revisions.
- HTML_SAFETY.md §3.5 cleanup. Removed a misleading data: cross-listing in the forbidden-attributes section. The rule lives in §4.2’s URL-scheme allowlist; the §3.5 line carried an empty “except as permitted in §4.1” carve-out that didn’t reflect any actual permission (§4.1 permits no data: URLs). §3.5 now cross-references §4.2 instead of duplicating the rule.
- ACCESSIBILITY.md §8 validator-severity table: root field name. The “Missing lang at document root” row swapped the LC-JSON root field (language, ISO 639-1, schema-enforced) for the HTML attribute (lang). Schemas enforce language; lang is the HTML attribute used inline for WCAG 3.1.2 Language of Parts. Table now reads “Missing language at document root.”
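The Version constraint in the Fixed entries above lends itself to a short check. A minimal sketch, assuming Python's standard re module; the helper names is_valid_version and version_sort_key are illustrative and not part of the reference tools. Only the pattern itself is taken from the entry.

```python
import re

# Pattern quoted in the changelog entry: one to three dot-separated integers.
VERSION_PATTERN = re.compile(r"^[0-9]+(\.[0-9]+){0,2}$")


def is_valid_version(value: str) -> bool:
    # fullmatch also rejects a trailing newline, which a bare `$` would allow.
    return VERSION_PATTERN.fullmatch(value) is not None


def version_sort_key(value: str) -> tuple:
    """Numeric sort key, so that "1.10" sorts after "1.9"."""
    if not is_valid_version(value):
        raise ValueError(f"not a numeric Version: {value!r}")
    return tuple(int(part) for part in value.split("."))
```

For example, sorted(["1.10", "1.9", "2"], key=version_sort_key) yields ["1.9", "1.10", "2"], which a plain string sort would not. That is the sortability concern the entry refers to.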
Notes
- All 31 spec examples + 12 valid + 24 invalid conformance fixtures validate clean against the tightened schemas (run python LC.JSON/tools/run_corpus.py; expects “36/36 fixtures behaved as expected”).
- Wire format is forward-compatible with 1.0 baseline: every change either softens a constraint, adds an optional field, or codifies behavior consumers were already expected to provide.
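The forward-compatibility case behind the NORMATIVE §5.1 carve-out listed under Changed can be sketched as a per-question triage step. This is a rough illustration under stated assumptions, not the reference validator: IMPLEMENTED_TYPES is a hypothetical subset, triage_question is an invented name, and how a real validator attributes schema errors to the type enum is not shown here.

```python
# Hypothetical: the discriminators this particular consumer implements.
IMPLEMENTED_TYPES = frozenset({"multipleChoice", "shortAnswer", "ordering"})


def triage_question(question: dict) -> str:
    """Return "validate" or "fallback"; raise for a document-level rejection.

    "validate": the type is implemented, so the question goes through normal
    schema validation and any failure still rejects the document.
    "fallback": the type is unrecognized, so the question is preserved
    verbatim and handled under the fallback contract instead.
    """
    qtype = question.get("type")
    if not isinstance(qtype, str):
        # A missing or non-string discriminator is an ordinary schema
        # failure, not an unknown-type case, so it is not exempted.
        raise ValueError("question has no usable type discriminator")
    return "validate" if qtype in IMPLEMENTED_TYPES else "fallback"
```

The point of the split is that only an unrecognized discriminator is exempt from rejection; every other failure keeps its original consequence.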
[1.0-internal] — 2026-04-29
Internal milestone — not publicly released. Captured here as the wire-format baseline that the 1.0-rc.x release-candidate line builds on. 1.0-rc.2 (2026-05-30) was the first publicly announced candidate; public consumers should target it or later (the earlier 1.0-rc.1 is served for transparency but was never announced). The 1.0-rc.1 schemas are published at lc-json.org/1.0-rc.1/; the /1.0/ URL space is reserved for the accepted final 1.0 release (target 2026-06-30) and does not resolve until then.
Added
- Two artifact types under a common flat root: course (hierarchical) and questionSet (flat).
- 11 user-facing question types: simpleGapFill, trueFalseQuestion, multipleChoice, wordBankCloze, multiGapCloze, multipleChoiceCloze, shortAnswer, essay, sentenceTransformation, matching, ordering. All schema-validated.
- 7 reserved question types declared in the discriminator enum for forward compatibility, with full implementation targeted for 2027: association, hotspot, graphicGapMatch, graphicAssociate, graphicOrder, fileUpload, mediaPromptedEssay.
- 22 JSON Schemas (Draft 7) covering every artifact, item type, and question type.
- 25 example documents covering every artifact and per-type fragments.
- Conformance test corpus under tests/ (4 valid + 10 invalid cases) with a machine-readable manifest.json mapping each file to the clause it tests.
- NORMATIVE.md: RFC 2119 conformance requirements, producer/consumer roles, versioning policy, deprecation rules, conformance-claim language.
- Reference Python tools: validate_course.py (validator). The fixture-scaffolding helper used during internal development was reclassified as internal authoring infrastructure for 1.0-rc.1 publication and is not shipped to the public repo.
- Schema URL stability guarantee codified in NORMATIVE.md §8.3 (each published version path is immutable for the lifetime of the spec). Public publication of lc-json.org/1.0/*.schema.json is deferred to the accepted 1.0 final release; 1.0-rc.1 publishes at lc-json.org/1.0-rc.1/*.schema.json.
- Apache License 2.0 throughout.
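The split between user-facing and reserved discriminators in the list above can be captured in a small lookup. A minimal sketch: the two sets copy the names given for the 1.0-internal baseline only (later release candidates changed the counts), while classify_type and its three labels are illustrative names, not part of the specification.

```python
# Discriminator names as listed for the 1.0-internal baseline.
USER_FACING_TYPES = frozenset({
    "simpleGapFill", "trueFalseQuestion", "multipleChoice", "wordBankCloze",
    "multiGapCloze", "multipleChoiceCloze", "shortAnswer", "essay",
    "sentenceTransformation", "matching", "ordering",
})
RESERVED_TYPES = frozenset({
    "association", "hotspot", "graphicGapMatch", "graphicAssociate",
    "graphicOrder", "fileUpload", "mediaPromptedEssay",
})


def classify_type(type_name: str) -> str:
    """Label a discriminator as "implemented", "reserved", or "unknown"."""
    if type_name in USER_FACING_TYPES:
        return "implemented"
    if type_name in RESERVED_TYPES:
        return "reserved"
    return "unknown"
```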
Notes
- LC-JSON 1.0 distills approximately 12 months of internal format iteration. Earlier internal versions were never publicly released and are not part of this version history.
Governance
Status: Informative. Spec version: 1.0. Last updated: 2026-05-29.
Stewardship
LC-JSON (Learning Content JSON) is currently maintained by Brent Miller, the originator of the specification. The project operates under a benevolent-dictator model: substantive proposals are reviewed and accepted, modified, or rejected by the maintainer, with deliberations conducted publicly via GitHub Issues and Discussions on this repository.
This is a single-maintainer steward model. It is appropriate for a specification at its early stage — one originator, no working group, a small but growing community of implementers. It is not a permanent posture.
How decisions are made
| Change type | Process |
|---|---|
| Typo, broken link, formatting | PR welcome directly. Merged quickly. |
| Schema or example bug fix | Open an issue describing the bug and a test case (positive or negative) demonstrating it. PR welcome alongside or after the issue. |
| New question type | Open an issue first describing the type, its scoring semantics, and the audience (which exam, which level, which subject). A reference schema, an example, and conformance test cases are required before merge. |
| Breaking change to existing schemas | Strongly discouraged within a major version (see NORMATIVE.md §8). Requires a new minor or major version with a new URL path. Open an issue to discuss before any PR. |
| Process or governance change | Open an issue. Substantial changes will be discussed publicly before being adopted. |
Substantive proposals (anything beyond a typo or non-normative fix) are documented in proposals/ (added when the first such RFC lands) as one-page RFCs with a clear status line: draft, accepted, or rejected.
Versioning and the living specification
The LC-JSON repository maintains a single living specification source. Published releases create immutable versioned artifacts — the per-version schema URLs (lc-json.org/<version>/<name>.schema.json). Backward compatibility and behavioral changes are documented through versioned schema URLs, release notes, and the changelog.
There is no separate, independently maintained prose source for a prior published version: the git tag is the historical source. The living specification (this repository’s working tree) advances to the current version; prior versions persist as (a) their immutable published schema URLs and (b) git tags and commits. For example, the schemas at /1.0-rc.1/ and /1.0-rc.2/ stay served and frozen, but the repository’s working schemas have moved on to /1.0-rc.3/; to read the spec exactly as rc.1 published it, read the rc.1 git tag, not the current working tree.
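The per-version URL shape described above (lc-json.org/<version>/<name>.schema.json) can be built and read back with two small helpers. This is a sketch under assumptions: the https:// scheme is assumed because the text gives only the host, and schema_url and pinned_version are hypothetical names rather than functions from the reference tools.

```python
BASE = "https://lc-json.org"  # assumption: the text shows the host without a scheme


def schema_url(version: str, name: str) -> str:
    """Build the per-version URL for a named schema."""
    return f"{BASE}/{version}/{name}.schema.json"


def pinned_version(document: dict) -> str:
    """Return the version path segment that a document's $schema points at."""
    url = document["$schema"]
    prefix = BASE + "/"
    if not url.startswith(prefix):
        raise ValueError(f"not an lc-json.org schema URL: {url!r}")
    version, _, rest = url[len(prefix):].partition("/")
    if not version or not rest.endswith(".schema.json"):
        raise ValueError(f"unexpected schema URL shape: {url!r}")
    return version
```

For a document whose $schema is schema_url("1.0-rc.1", "course"), pinned_version returns "1.0-rc.1". That is also the git tag to read for the prose as that release published it.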
Release Candidate Policy
- RC releases MAY introduce backwards-compatible corrections.
- RC releases MAY clarify specification language.
- RC releases MUST NOT silently modify previously published artifacts; each RC publishes at its own immutable URL path (/1.0-rc.1/, /1.0-rc.2/, /1.0-rc.3/, …).
- Final v1.0 establishes the stable contract.
The rc.1 → rc.2 prompt-field correction is exactly this kind of sanctioned change: a backwards-compatible correction (minLength: 1 → 0) and a language clarification (defining prompt as non-authoritative for symbolic types), published at a new immutable /1.0-rc.2/ path while /1.0-rc.1/ stays frozen. The rc.2 → rc.3 cut follows the same pattern: it removes two prototype-era sentenceTransformation fields from the schema — which, because /1.0-rc.2/ is immutable, must land at a new /1.0-rc.3/ path — while rc.1 and rc.2 stay frozen, and every rc.2-valid document remains valid under rc.3.
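The URL path convention used throughout this policy (/X.Y-rc.N/ for a release candidate, /X.Y/ for the final release) can be parsed mechanically. A minimal sketch, assuming the two path shapes shown above are the only ones; parse_version_path and is_release_candidate are illustrative names.

```python
import re

# /X.Y-rc.N/ for a release candidate, /X.Y/ for the accepted final release.
_VERSION_PATH = re.compile(r"/(\d+)\.(\d+)(?:-rc\.(\d+))?/")


def parse_version_path(path: str) -> tuple:
    """Return (major, minor, rc); rc is None for a final release path."""
    match = _VERSION_PATH.fullmatch(path)
    if match is None:
        raise ValueError(f"not a version path: {path!r}")
    major, minor, rc = match.groups()
    return (int(major), int(minor), int(rc) if rc is not None else None)


def is_release_candidate(path: str) -> bool:
    return parse_version_path(path)[2] is not None
```

Here parse_version_path("/1.0-rc.2/") gives (1, 0, 2) and parse_version_path("/1.0/") gives (1, 0, None), so tooling can tell a frozen candidate path from the final one without hard-coding either.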
Working group
A formal working group will be considered when at least two independent third-party implementations of LC-JSON exist and are in active use. “Independent” means not built or maintained by the same organization that originated the specification.
Until that threshold is met, decisions remain with the maintainer. Forks are permitted under the Apache 2.0 license; community input is welcomed via Issues and Discussions, but accepted changes are at the maintainer’s discretion.
Once the threshold is met, the maintainer will publish a transition plan: charter, working-group composition criteria, decision-making process, and (if appropriate) custodial handoff to a foundation or neutral steward.
Trademark and naming
“Lesson Commons” is a trademark of Brent Miller and is not asserted over the LC-JSON specification, the names “LC-JSON” or “Learning Content JSON”, or any conforming implementation. Implementers may state conformance freely.
The canonical sources for the LC-JSON specification are this repository — github.com/lc-json/specification — and the published site at lc-json.org. Forks and mirrors are permitted under the Apache 2.0 license, but only the URLs above carry the canonical specification text. A forked or mirrored copy that materially differs from these sources is a derivative work and should be named accordingly — not “LC-JSON” without qualification.
The specification’s name and canonical URL (lc-json.org) are not vendor-coupled by design. If stewardship of LC-JSON ever passes to a foundation, working group, or successor maintainer, the name and URL travel with the specification, not with any sponsoring organization.
Contact
- Issues and proposals: open an issue on this repository.
- Conformance questions: consult NORMATIVE.md; it is the authoritative source for implementer requirements.
- Maintainer correspondence: via GitHub.
Contributing to LC-JSON
Thank you for your interest in contributing to the LC-JSON (Learning Content JSON) specification. This document describes how to file issues, propose changes, and submit pull requests.
Quick orientation
- Specification text: README-spec.md, NORMATIVE.md, question-types-reference.md.
- Schemas: schemas/ (JSON Schema Draft 7).
- Examples: examples/ (31 example documents).
- Conformance tests: tests/ (valid and invalid cases per clause; see tests/README.md).
- Implementation directory: IMPLEMENTATIONS.md.
Filing issues
Open an issue for:
- Bugs in the specification text, schemas, or examples. Please include the file and line, the text in question, and a clear description of the discrepancy.
- Proposals for new question types or new properties. Describe the use case (which audience, which exam or subject, which scoring semantics), the rationale, and a draft schema or example if you have one.
- Conformance questions. First check NORMATIVE.md; it is the authoritative source for implementer requirements. If the spec is ambiguous, file an issue noting the ambiguous passage.
- Implementations to list. See IMPLEMENTATIONS.md for the listing format.
A short, focused issue is more useful than a long one. If you are not sure whether something belongs as an issue or a discussion, open an issue.
Submitting pull requests
| Change type | Process |
|---|---|
| Typo, broken link, small clarification | PR directly. |
| Schema or example fix | Open an issue describing the bug; reference it in the PR. Include a conformance test case under tests/valid/ or tests/invalid/ if applicable. |
| New schema or new question type | Open an issue first to discuss. PRs without prior discussion may be closed pending alignment. |
| Spec text revisions | Open an issue first if the change is substantive (more than a clarification or typo). |
What a good PR looks like
- A clear PR title summarizing the change.
- A description tying the PR to an issue (if one exists) and explaining the rationale.
- For schema or example changes: positive and negative test cases under tests/, if applicable.
- For new question types: a reference schema, an example file, an entry in question-types-reference.md, and at least one positive test case.
- A CHANGELOG.md entry if the change is observable to implementers.
Tests and CI
CI for this repository runs the conformance test corpus against the reference validator using tools/run_corpus.py, which invokes validate_course.py --strict on every fixture in tests/manifest.json. The harness exits non-zero unless every valid fixture passes and every invalid fixture fails — PRs that break a corpus expectation will not merge.
To reproduce locally before opening a PR:
pip install -r tools/requirements.txt
python tools/run_corpus.py
If your change requires updating a test fixture or the manifest, do so in the same PR and call it out in the description.
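The pass rule the harness applies (every valid fixture accepted, every invalid fixture rejected, otherwise a non-zero exit) can be sketched as follows. This is an illustration only: the internals of tools/run_corpus.py and the manifest layout are not shown in this document, so the (expected, accepted) pair shape and the name corpus_exit_code are hypothetical.

```python
def corpus_exit_code(results: list) -> int:
    """Return 0 only if every fixture behaved as expected.

    `results` holds (expected, accepted) pairs, where expected is "valid"
    or "invalid" and accepted is True when the validator accepted the file.
    """
    for expected, accepted in results:
        # A valid fixture must be accepted; an invalid one must be rejected.
        behaved = accepted if expected == "valid" else not accepted
        if not behaved:
            return 1
    return 0
```

Note that an invalid fixture that the validator accepts fails the run just as a valid fixture that is rejected does; both directions are corpus expectations.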
Contribution license
By submitting a contribution, you agree that your contribution is licensed under the terms of the Apache License, Version 2.0. No separate Contributor License Agreement (CLA) is required at this stage.
If you contribute substantively (more than a typo fix), you may add yourself to CONTRIBUTORS.md in the same PR.
Code of Conduct
This project follows the Contributor Covenant 2.1. By participating, you agree to abide by its terms.
Scope and direction
LC-JSON is a specification for portable learning-content interchange. Contributions that align with this scope are welcome:
- Schema fixes and clarifications.
- New question types with clear use cases and pedagogical motivation.
- Improvements to the conformance test corpus.
- Translations of the specification text (open an issue first to discuss).
Out of scope for the specification repository:
- Implementations (these belong in their own repositories; list them in IMPLEMENTATIONS.md).
- Authoring UIs, editors, content libraries, or delivery platforms.
- Pedagogical guidance unrelated to the JSON wire format.
Decision-making
Substantive proposals are reviewed by the maintainer per GOVERNANCE.md. Acceptance, modification, or rejection is the maintainer’s discretion at this stage; a working-group governance model will replace this when the criteria in GOVERNANCE.md are met.
Decisions are made publicly via Issues and PRs. If a discussion is going long or complex, the maintainer may summarize the resolution in a proposals/ RFC document for durable reference (the directory is added when the first such RFC lands).
Thank you
Open standards depend on people who care enough to file issues and read drafts. If you have noticed something that could be better, please tell us: the cost of an issue is small, and the value of catching a problem early is large.
Code of Conduct
This project follows the Contributor Covenant 2.1, a widely adopted code of conduct for open-source communities.
The full text is available at: https://www.contributor-covenant.org/version/2/1/code_of_conduct/
Summary
We are committed to providing a friendly, safe, and welcoming environment for all contributors and participants, regardless of background, experience, or identity. We expect everyone interacting in this project’s repositories, issue trackers, and discussions to follow the Contributor Covenant.
In short:
- Be respectful. Disagreement is fine; personal attacks are not.
- Be inclusive. Welcome newcomers, assume good intent, ask before correcting.
- Be patient. Open-source contribution is voluntary; people respond when they can.
- Keep discussions on-topic. Issues and PRs are about the specification.
Reporting
If you experience or witness behavior that violates this Code of Conduct, please report it by sending a private GitHub message to the maintainer (Brent Miller, @bantonym) or, if appropriate, by filing a public issue.
Reports will be reviewed and acted upon. Confidentiality of reporters will be respected to the extent consistent with addressing the issue.
Enforcement
The maintainer is responsible for clarifying and enforcing this Code of Conduct. Enforcement actions may include: a warning, temporary restriction from contribution, or permanent removal, depending on severity.
The maintainer reserves the right to remove, edit, or reject comments, commits, code, issues, and other contributions that are not aligned with this Code of Conduct.
Contributors
LC-JSON (Learning Content JSON) was originated by Brent Miller, who holds all substantive decisions on scope, naming, licensing, trademark stance, and design.
AI-assistant contributions
The specification text, schemas, conformance corpus, reference tools, and publication pipeline were drafted with substantial AI assistance — Claude (Anthropic) as primary drafting and implementation assistant, and Codex (OpenAI) as auditor and editor across review passes.
All substantive decisions (scope, naming, design, licensing, and trademark stance) and every normative claim are the originator’s.
Future contributors will be listed below as pull requests land.
How to be listed
Open a PR with a substantive contribution (specification text, schema fix, conformance test, tooling, governance) and add yourself to the list in the same PR. Typo fixes and small clarifications are welcome but do not require listing.
Inclusion in this file is acknowledgment, not assignment of authorship rights — copyright in contributed work follows the project’s license (LICENSE, Apache 2.0) and is governed by the contribution model in CONTRIBUTING.md.