Introduction

MD-Models is a markdown-based specification language for research data management.

It is designed to be easy to read and write, and to be converted to various programming languages and schema languages.

# Hello MD-Models

This is a simple markdown file that defines a model.

### Object

Enrich your objects with documentation and communicate intent to domain experts.

This is a simple object definition:

- string_attribute
    - type: string
    - description: A string attribute
- integer_attribute
    - type: integer
    - description: An integer attribute

Core Philosophy

The primary motivation behind MD-Models is to reduce cognitive overhead and maintenance burden by unifying documentation and structural definition into a single source of truth. Traditional approaches often require maintaining separate artifacts:

  1. Technical schemas (JSON Schema, XSD, ShEx, SHACL)
  2. Programming language implementations
  3. Documentation for domain experts
  4. API documentation

This separation frequently leads to documentation drift and increases the cognitive load on both developers and domain experts.

A Little Anecdote

When I began my journey in research data management, I was frequently overwhelmed by the intricate tools and standards in use. As a researcher suddenly thrown into a blend of software engineering, format creation, and data management, it felt like I was plunged into deep water without a safety net.

Data management, by its very nature, spans multiple disciplines and demands a thorough understanding of the domain, the data itself, and the available tools. Yet, even the most impressive tools lose their value if they don’t cater to the needs of domain experts. I came to realize that those experts are best positioned to define the structure and purpose of the data, but the overwhelming complexity of existing tools and standards often prevents their active participation.

MD-Models is my response to this challenge. It makes building structured data models easier by enabling domain experts to document the data’s intent and structure in a clear and manageable way. Markdown is an ideal choice for this task. It is simple to read and write, and it effectively communicates the necessary intent. Moreover, its semi-structured format allows for effortless conversion into various schema languages and programming languages, eliminating the need for excessive boilerplate code.

Quickstart

In order to get started with MD-Models, you can follow the steps below.

Installation

In order to install the command line tool, you can use the following command:

cargo install mdmodels

Writing your first MD-Models file

MD-Models files can be written in any editor that supports markdown. The following is a list of recommended editors:

We also provide a web editor at mdmodels.vercel.app that can be used to write and validate MD-Models files. Beyond a syntax-highlighted editor, it offers:

  • Live preview of the rendered MD-Models file
  • Graph editor to visualize the relationships between objects
  • Automatic validation of the MD-Models file
  • Export to various schema languages and programming languages

Packages

The main Rust crate is compiled to Python and WebAssembly, which makes it usable beyond the command line tool. These are the main packages:

  • Core Python Package: Install via pip:

    # Mainly used to access the core functionality of the library
    pip install mdmodels-core
    
  • Python Package: Install via pip:

    # Provides in-memory data models, database support, LLM support, etc.
    pip install mdmodels
    
  • NPM Package: Install via npm:

    # Mainly used to access the core functionality of the library
    npm install mdmodels-core
    

Examples

The following projects are examples of how to use MD-Models in practice:

Syntax

This section describes the syntax of MD-Models. It is intended to be used as a reference for the syntax and semantics of MD-Models.

Objects

Objects are the building blocks of your data structure. Think of them as containers for related information, similar to how a form organizes different fields of information about a single topic.

What is an Object?

An object is simply a named collection of properties. For example, a Person object might have properties like name, age, and address. In our system, objects are defined using a straightforward format that’s easy to read and write, even if you’re not a programmer.

How to Define an Object

You start an object by declaring its name using a level 3 heading (###) followed by the name of the object. In the example below, we define an object called Person.

### Person

This is an object definition.

Great! Now we have a named object. But what’s next?

Object Properties

Objects can have properties, which define the specific data fields that belong to the object. Properties are defined using a structured list format with the following components:

  1. The property name - starts with a dash (-) followed by the name
  2. The property type - indicates what kind of data the property holds
  3. Optional metadata - additional specifications like descriptions, constraints, or validation rules

Here’s the basic structure:

### Person (schema:object)

- name
  - type: string
  - description: The name of the person

Let’s break this down:

  • - name - The name of the property
  • - type: string - The type of the property, because we expect a name to be a string (e.g. “John Doe”)
  • - description: The name of the person - A description of the property

The name of the property and its type are required. The description is optional, but adding one is good practice. As we will see later, a thorough description can also guide a large language model when extracting information from text.

By default, properties are optional. If you want to make a property required, you need to bold the property name using either __name__ or **name**. Replace name with the name of the property.
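As a rough illustration of this rule, the following Python sketch detects whether a property line is marked as required. The helper `parse_property_name` is hypothetical and written only for this example. It is not the MD-Models parser, and it handles only the two bold notations mentioned above.

```python
import re
from typing import Tuple

# Matches a name wrapped in matching bold markers: __name__ or **name**.
BOLD = re.compile(r"^(\*\*|__)(?P<name>.+?)\1$")


def parse_property_name(line: str) -> Tuple[str, bool]:
    """Return (property name, is_required) for a line such as '- __name__'.

    Hypothetical helper for illustration only, not part of MD-Models.
    """
    text = line.strip()
    if not text.startswith("-"):
        raise ValueError(f"not a property line: {line!r}")
    text = text[1:].strip()
    match = BOLD.match(text)
    if match:
        # Bold markers present: the property is required.
        return match.group("name"), True
    # No bold markers: the property is optional (the default).
    return text, False


print(parse_property_name("- name"))      # optional property
print(parse_property_name("- __name__"))  # required property
```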

Property Types

The data type of a property is very important and generally communicates what kind of data the property holds. Here is a list of the supported base types:

  • string - A string of characters
  • integer - A whole number
  • float - A floating point number
  • number - A numeric value (integer or float)
  • boolean - A true or false value

Arrays

While these types are the building blocks, they fail to capture the full range of data types that can be used in a data model. For example, we need to be able to express that a property is an array/list of strings, or an array/list of numbers. This is where the array notation comes in.

We define an array of a given type by placing empty square brackets after the type, a notation inspired by TypeScript. For example, an array of strings is written as string[].

### Person (schema:object)

- an_array_of_strings
  - type: string[]
  - description: An array of strings
- an_array_of_numbers
  - type: number[]
  - description: An array of numbers
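The array notation is simple enough to take apart by hand. The sketch below assumes a hypothetical helper `parse_type` that splits a type expression into its item type and an array flag. It only illustrates the notation and makes no claim about how MD-Models parses types internally.

```python
from typing import Tuple

# Base types listed in the "Property Types" section.
BASE_TYPES = {"string", "integer", "float", "number", "boolean"}


def parse_type(type_expr: str) -> Tuple[str, bool]:
    """Split a type such as 'string[]' into (item type, is_array)."""
    type_expr = type_expr.strip()
    if type_expr.endswith("[]"):
        # Strip the trailing brackets to get the item type.
        return type_expr[:-2], True
    return type_expr, False


item_type, is_array = parse_type("string[]")
print(item_type, is_array, item_type in BASE_TYPES)
```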

Connecting Objects

Now we know how to define singular and array properties, but we often need to create relationships between objects in our data models. For example, a Person object might have an address property that references an Address object. This relationship is easily established by using another object’s name as a property’s type.

### Person

- name
  - type: string
- address
  - type: Address

### Address

- street
  - type: string
- city
  - type: string
- zip
  - type: string

This approach allows you to build complex, interconnected data models that accurately represent real-world relationships between entities. You can create both one-to-one relationships (like a person having one address) and one-to-many relationships (by using array notation).
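To show what these relationships mean on the programming side, here is a hand-written Python sketch that mirrors the Person and Address objects as dataclasses. It is not the output of any MD-Models template, and the `previous_addresses` property is a hypothetical addition that stands in for a property of type Address[].

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Address:
    # Properties are optional by default, hence Optional with a None default.
    street: Optional[str] = None
    city: Optional[str] = None
    zip: Optional[str] = None


@dataclass
class Person:
    name: Optional[str] = None
    # One-to-one relationship, corresponds to "type: Address".
    address: Optional[Address] = None
    # One-to-many relationship, corresponds to "type: Address[]"
    # (hypothetical property, not part of the example above).
    previous_addresses: List[Address] = field(default_factory=list)


person = Person(name="John Doe", address=Address(city="Stuttgart"))
person.previous_addresses.append(Address(city="Berlin"))
print(person)
```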

Property Options

When defining properties in your data model, you can apply various options to control their behavior, validation, and representation. These options are defined using the - option: value syntax. In the following sections, we will look at the different options that are available.

General Options

| Option | Description | Example |
|--------|-------------|---------|
| description | Provides a description for the property | - description: "The name of the person" |
| example | Provides an example value for the property | - example: "John Doe" |

JSON Schema Validation Options

These options map to standard JSON Schema validation constraints, allowing you to enforce data integrity and validation rules in your models. When you use these options, they will be translated into corresponding JSON Schema properties during schema generation, ensuring that your data adheres to the specified constraints. This provides a standardized way to validate data across different systems and implementations that support JSON Schema.

| Option | Description | Example |
|--------|-------------|---------|
| minimum | Specifies the minimum value for a numeric property | - minimum: 0 |
| maximum | Specifies the maximum value for a numeric property | - maximum: 100 |
| minitems | Specifies the minimum number of items for an array property | - minitems: 1 |
| maxitems | Specifies the maximum number of items for an array property | - maxitems: 10 |
| minlength | Specifies the minimum length for a string property | - minlength: 3 |
| maxlength | Specifies the maximum length for a string property | - maxlength: 50 |
| pattern or regex | Specifies a regular expression pattern that a string property must match | - pattern: "^[a-zA-Z0-9]+$" |
| unique | Specifies whether array items must be unique | - unique: true |
| multipleof | Specifies that a numeric value must be a multiple of this number | - multipleof: 5 |
| exclusiveminimum | Specifies an exclusive minimum value for a numeric property | - exclusiveminimum: 0 |
| exclusivemaximum | Specifies an exclusive maximum value for a numeric property | - exclusivemaximum: 100 |
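To make the mapping concrete, the sketch below translates the option names from the table into the keyword spellings used by the JSON Schema specification. The dictionary and the helper `to_json_schema_keywords` are assumptions for illustration, as is the case-insensitive lookup. They do not reproduce how MD-Models performs the translation internally.

```python
from typing import Any, Dict

# Option name (as written in the table) -> JSON Schema keyword.
# Assumed mapping, based on the standard JSON Schema vocabulary.
OPTION_TO_KEYWORD = {
    "minimum": "minimum",
    "maximum": "maximum",
    "minitems": "minItems",
    "maxitems": "maxItems",
    "minlength": "minLength",
    "maxlength": "maxLength",
    "pattern": "pattern",
    "regex": "pattern",
    "unique": "uniqueItems",
    "multipleof": "multipleOf",
    "exclusiveminimum": "exclusiveMinimum",
    "exclusivemaximum": "exclusiveMaximum",
}


def to_json_schema_keywords(options: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only validation options and rename them to JSON Schema keywords."""
    return {
        OPTION_TO_KEYWORD[name.lower()]: value
        for name, value in options.items()
        if name.lower() in OPTION_TO_KEYWORD
    }


print(to_json_schema_keywords({"type": "string", "minlength": 3, "regex": "^[a-z]+$"}))
```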

Format Options

The following options are used to define how the property should be represented in different formats.

| Option | Description | Example |
|--------|-------------|---------|
| xml | Specifies that the property should be represented in XML format | - xml: someName |

A note on the xml option

The xml option has multiple effects:

  • Element will be set as an element in the XML Schema.
  • @Name will be set as an attribute in the XML Schema.
  • someWrapper/Element will wrap the element in a parent element called someWrapper.
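The three forms listed above can be told apart with a few string checks. The following Python sketch uses a hypothetical helper `interpret_xml_option` to classify a value. The returned dictionary layout is an assumption made for this example and not a structure defined by MD-Models.

```python
from typing import Dict, List, Union


def interpret_xml_option(value: str) -> Dict[str, Union[str, List[str]]]:
    """Classify an xml option value as attribute or (wrapped) element.

    Hypothetical helper for illustration only.
    """
    value = value.strip()
    if value.startswith("@"):
        # "@Name" -> attribute called "Name"
        return {"kind": "attribute", "name": value[1:], "wrappers": []}
    # "someWrapper/Element" -> element "Element" wrapped in "someWrapper";
    # a plain "Element" yields an empty list of wrappers.
    *wrappers, name = value.split("/")
    return {"kind": "element", "name": name, "wrappers": wrappers}


for example in ("Element", "@Name", "someWrapper/Element"):
    print(example, "->", interpret_xml_option(example))
```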

Semantic Options

The following options are used to define semantic annotations. Read more about semantic annotations in the Semantics section.

| Option | Description | Example |
|--------|-------------|---------|
| term | Specifies the term for the property in the ontology | - term: schema:name |

SQL Database Options

Database options allow you to specify how properties should be represented in relational database systems. MD-Models supports the following options:

| Option | Description | Example |
|--------|-------------|---------|
| pk | Indicates whether the property is a primary key in a database | - primary key: true |

LinkML Specific Options

Options specific to the LinkML specification:

| Option | Description | Example |
|--------|-------------|---------|
| readonly | Indicates whether the property is read-only | - readonly: true |
| recommended | Indicates whether the property is recommended | - recommended: true |

Custom Options

You can also define custom options that aren’t covered by the predefined ones:

- name
  - MyKey: my value

Example Usage

Here’s how you might use these options in a data model:

### Person (schema:object)

- id
  - type: string
  - primary key: true
  - description: The unique identifier for the person
- name
  - type: string
  - description: The name of the person
  - example: "John Doe"
- age
  - type: integer
  - description: The age of the person
  - minimum: 0

These options help to define constraints, provide validation rules, and give hints to code generators about how properties should be treated in the resulting applications and schemas.

Enumerations

Sometimes you want to restrict the values that can be assigned to a property. For example, you might want to restrict the categories of a product to a set of predefined values. A product might be of category book, movie, music, or other. This is where enumerations come in.

Defining an enumeration

To define an enumeration, we start the same as we do for any other type, by using a level 3 heading (###) and then the name of the type.

### ProductCategory

```
BOOK = "book"
MOVIE = "movie"
MUSIC = "music"
OTHER = "other"
```

We define a key and a value here, wrapped in a code fence (```), where the value is the actual value of the enumeration and the key is an identifier. The key is required because, when we re-use the enumeration in a programming language, we need to be able to refer to it by name. For instance, in Python we can pass an enumeration via the following code:

from model import ProductCategory, Product

product = Product(
    name="Inception",
    category=ProductCategory.MOVIE
)

print(product)

Output:

{
    "name": "Inception",
    "category": "movie"
}

Similar to how we can use an object as a type for a property, we can also use an enumeration as a type for a property:

### Product

- name
  - type: string
- category
  - type: ProductCategory
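The `model` module imported in the Python snippet above is not shown in this section. As a stand-in, the following hand-written sketch uses only the standard library to show how the ProductCategory enumeration and the Product object could look in Python. It is an assumption about the general shape of such code, not the output of an MD-Models template.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Optional


class ProductCategory(str, Enum):
    # Key on the left, actual value on the right, as in the enumeration above.
    BOOK = "book"
    MOVIE = "movie"
    MUSIC = "music"
    OTHER = "other"


@dataclass
class Product:
    name: Optional[str] = None
    category: Optional[ProductCategory] = None


product = Product(name="Inception", category=ProductCategory.MOVIE)

# Because ProductCategory also derives from str, the enumeration member
# serializes to its value rather than to its key.
serialized = json.dumps(asdict(product), indent=4)
print(serialized)
```

Referring to the member by its key (ProductCategory.MOVIE) keeps the code independent of the stored value, which is the reason the key is required in the enumeration definition.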

Descriptions

This section further highlights the usage of descriptions in MD-Models. Since we are using markdown, we can enrich our data model with any additional information that we want to add. This not only includes text, but also links and images.

Text

To add a text description to an object, we can use the following syntax:

### Product

A product is a physical or digital item that can be bought or sold.

- name
  - type: string
  - description: The name of the product

Links

To add a link to an object, we can use the following syntax:

### Product

[Additional information](https://www.google.com)

- name
  - type: string
  - description: The name of the product

Images

To add an image to an object, we can use the following syntax:

### Product

![Product image](https://www.google.com/images/branding/googlelogo/1x/googlelogo_color_272x92dp.png)

- name
  - type: string
  - description: The name of the product

Please note that tables can be used within object definitions, but can, under some circumstances, lead to parsing errors. It is therefore recommended to use tables only in sections.

Sections

Since objects and enumerations can get quite complex, we can use sections to group related information together. The level 2 heading (##) can be used to create a new section:

## Store-related information

This section contains information about the store.

### Product

[...]

### Customer

[...]

## Sales-related information

This section contains information about the sales.

### Order

[...]

### Invoice

[...]

Within these sections, you can add any of the previously mentioned elements, including tables. This is very useful to breathe life into your data model and communicate intent and additional information. Treat this as the non-technical part you would usually add in an additional document. It should be noted that the parsers ignore these sections, so they are not included in the generated code.

Best Practices

  • Use sections to group related information together.
  • Use links to reference external sources.
  • Use images to visually represent complex concepts.
  • Use tables to represent concepts that are better understood in a table format.

Semantics

MD-Models supports a variety of semantic annotations to help you add meaning to your data model. Most commonly, you want to annotate objects and properties with a semantic type to allow for better interoperability and discoverability. For this, ontologies are used:

Ontologies

Ontologies are a way to add semantic meaning to your data model. They are a collection of concepts and relationships between them and are specific to the domain of your data model. For instance, the schema.org ontology is a collection of concepts and relationships that span across many domains. This is very useful when you want to connect to other data models that employ similar concepts, but use different names for them.

Typically these relations are defined as triples, consisting of a subject, predicate and object. For instance, the statement “John is a person” can be represented as the triple (John, is a, person). The first element of the triple is the subject, the second is the predicate and the third is the object.

With MD-Models, you can define the is a predicate as an object annotation for an object definition. On the other hand, you can define the predicate as a property annotation for a property definition.

How to annotate objects

Objects are annotated at the level 3 heading of the object definition. The annotation follows the object name, separated by a whitespace and enclosed in parentheses. Typically, these annotations are expressed as a URI that points to a definition of the concept in the ontology. Since full URIs are verbose, they can be shortened by using a prefix. We will be using the schema prefix in the following examples. More on how to use prefixes can be found in the Preamble section.

We want to express: “A Product is a schema:Product.”

### Product (schema:Product)

- name
  - type: string

How to annotate properties

Properties are annotated using an option, as defined in the Property Options section. We utilize the keyword term to add a semantic type to the property. Properties can function in one of two ways:

  1. If the type of the property is a primitive type, the term option describes an is a relationship and thus the object in the sense of the triple.
  2. If the type of the property is an object or an array of objects, the term option describes the relationship (predicate) between the subject (object) and the object (type).

Object-valued properties

We want to express: “A Product is ordered by a Person.”

### Product

- orders
  - type: Person[]
  - term: schema:orderedBy

The annotation effectively describes the relationship between the orders property and the Person type. Given that a Person is also annotated with a term, one can then build a Knowledge Graph that connects the orders property to the Person type in a semantically rich way, which can be used for a variety of purposes, such as semantic search and discovery.

Primitive-valued properties

We want to express: “The name of a Product is a schema:name.”

### Product

- name
  - type: string
  - term: schema:name

Naturally, since the name property is part of the Product object, it builds the relationship “A Product has a name”. In terms of triples, this is represented as (Product, has, name).
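The annotations discussed in this section can be written down as plain triples. The Python sketch below collects them in a list and looks up related objects. The `Triple` type and the `objects_of` helper are hypothetical names chosen for this example, and MD-Models does not expose them.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str


triples: List[Triple] = [
    # Object annotation: "### Product (schema:Product)"
    Triple("Product", "is a", "schema:Product"),
    # Object-valued property: the term acts as the predicate.
    Triple("Product", "schema:orderedBy", "Person"),
    # The property belongs to the object.
    Triple("Product", "has", "name"),
    # Primitive-valued property: the term describes an "is a" relationship.
    Triple("name", "is a", "schema:name"),
]


def objects_of(subject: str, predicate: str) -> List[str]:
    """Return all objects linked to a subject through a given predicate."""
    return [t.object for t in triples if t.subject == subject and t.predicate == predicate]


print(objects_of("Product", "schema:orderedBy"))
```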

Once these annotations are defined, they are automatically added to the generated code and schemas, if supported. Semantic annotations are currently supported in the following language templates:

  • python-dataclass (JSON-LD)
  • python-pydantic (JSON-LD)
  • typescript (JSON-LD)
  • shacl (Shapes Constraint Language)
  • shex (Shape Expressions)

Preamble

The preamble is the first section of your data model. It is used to provide metadata about the data model, such as the name, version, and author.

---
id: my-data-model
prefix: md
repo: http://mdmodel.net/
prefixes:
  schema: http://schema.org/
nsmap:
  tst: http://example.com/test/
imports:
  common.md: common.md
---

Frontmatter Keys

The frontmatter section of your MD-Models document supports several configuration keys that control how your data model is processed and interpreted. Here’s a detailed explanation of each available key:

id

  • Type: String (Optional)
  • Description: A unique identifier for your data model. This can be used to reference your model from other models or systems.
  • Example: id: my-data-model

prefixes

  • Type: Map of String to String (Optional)
  • Description: Defines namespace prefixes that can be used throughout your model to reference external vocabularies or schemas. This is particularly useful for semantic annotations.
  • Example:
    prefixes:
      schema: http://schema.org/
      foaf: http://xmlns.com/foaf/0.1/
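A prefix is a shorthand for the start of a URI. The Python sketch below shows how a prefixed term such as schema:name could be expanded with the prefix map from the example. The helper `expand_term` is hypothetical and only illustrates the idea. It is not an MD-Models function.

```python
from typing import Dict

# Prefix map as it appears in the frontmatter example.
PREFIXES = {
    "schema": "http://schema.org/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand_term(term: str, prefixes: Dict[str, str]) -> str:
    """Expand 'prefix:local' to a full URI if the prefix is known."""
    prefix, separator, local = term.partition(":")
    if separator and prefix in prefixes:
        return prefixes[prefix] + local
    # Unknown prefix or already a full URI: return the term unchanged.
    return term


print(expand_term("schema:name", PREFIXES))
print(expand_term("foaf:Person", PREFIXES))
```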
    

nsmap

  • Type: Map of String to String (Optional)
  • Description: Similar to prefixes, defines namespace mappings that can be used in your model. This is often used for XML-based formats or when integrating with systems that use namespaces.
  • Example:
    nsmap:
      tst: http://example.com/test/
      ex: http://example.org/
    

repo

  • Type: String
  • Default: http://mdmodel.net/
  • Description: Specifies the base repository URL for your model. This can be used to generate absolute URIs for your model elements.
  • Example: repo: https://github.com/myorg/myrepo/

prefix

  • Type: String
  • Default: md
  • Description: Defines the default prefix to use for your model elements when generating URIs or qualified names.
  • Example: prefix: mymodel

imports

  • Type: Map of String to String
  • Default: Empty map
  • Description: Specifies other models to import into your current model. The key is the alias or name to use for the import, and the value is the location of the model to import. The location can be either a local file path or a remote URL.
  • Example:
    imports:
      common: common.md
      external: https://example.com/models/external.md
    

Import Types

The imports key supports two types of imports:

  1. Local Imports: References to local files on your filesystem

    imports:
      common: ./common/base.md
    
  2. Remote Imports: References to models hosted on remote servers (URLs)

    imports:
      external: https://example.com/models/external.md
    

When importing models, the definitions from the imported models become available in your current model, allowing you to reference and extend them. This is useful for creating modular and reusable data models.

Full example

The following is a full example of an MD-Models file that defines a data model for a research publication.

---
id: research-publication
prefix: rpub
prefixes:
  schema: https://schema.org/
---

### ResearchPublication (schema:Publication)

This model represents a scientific publication with its core metadata, authors, 
and citations.

- __doi__
  - Type: Identifier
  - Term: schema:identifier
  - Description: Digital Object Identifier for the publication
  - XML: @doi
- title
  - Type: string
  - Term: schema:name
  - Description: The main title of the publication
- authors
  - Type: [Author](#author)[]
  - Term: schema:authored
  - Description: List of authors who contributed to the publication
- publication_year
  - Type: integer
  - Term: schema:datePublished
  - Description: Year when the publication was published
  - Minimum: 1900
  - Maximum: 2100
- citations
  - Type: integer
  - Term: schema:citation
  - Description: Number of times this publication has been cited
  - Default: 0


### Author (schema:Person)

The `Author` object is a simple object that has a name and an email address.

- __name__
  - Type: string
  - Term: schema:name
  - Description: The name of the author
- __email__
  - Type: string
  - Term: schema:email
  - Description: The email address of the author

Best practices

  1. Use Descriptive Names

    • Object names should be PascalCase (e.g., ResearchPublication)
    • Attribute names should be in snake_case (e.g., publication_year)
    • Use clear, domain-specific terminology
  2. Identifiers

    • Mark identifier fields as required with double underscores (e.g., __doi__)
    • Choose meaningful identifier fields
  3. Documentation

    • Always include object descriptions
    • Document complex attributes
    • Explain any constraints or business rules
  4. Semantic Mapping

    • Use standard vocabularies when possible
    • Define custom terms in your prefix map
    • Maintain consistent terminology
  5. Validation Rules

    • Include range constraints for numbers
    • Specify default values when appropriate
    • Document any special validation requirements

Common Patterns

Array Types

- tags
  - Type: string[]
  - Description: List of keywords describing the publication

Object References

- main_author
  - Type: Author
  - Description: The primary author of the publication

Required Fields

- __id__
  - Type: Identifier
  - Description: Unique identifier for the object

Remember that MD-Models aims to balance human readability with technical precision. Your object definitions should be clear enough for domain experts to understand while maintaining the structure needed for technical implementation.

Command Line Interface

The MD-Models command-line interface (CLI) provides a comprehensive set of tools for working with markdown data models. It enables you to validate models, generate code in multiple formats, extract data using AI, and automate workflows through pipelines.

Installation

Install the MD-Models CLI using Cargo:

cargo install mdmodels

Once installed, you can use the md-models command from anywhere in your terminal.

Available Commands

The MD-Models CLI provides the following commands:

validate

Validates markdown model files for structural integrity, naming conventions, and type consistency.

Quick start:

md-models validate -i model.md

Use cases:

  • Check models before code generation
  • Validate models in CI/CD pipelines
  • Ensure models meet naming and structure requirements

Learn more: Schema Validation


convert

Converts markdown models to various output formats including programming languages, schema definitions, API specifications, and documentation.

Quick start:

# Generate Python Pydantic models
md-models convert -i model.md -t python-pydantic -o models.py

# Generate JSON Schema
md-models convert -i model.md -t json-schema -r Document -o schema.json

Use cases:

  • Generate type-safe code for your application
  • Create API schemas and specifications
  • Produce documentation from models
  • Export to semantic web formats

Learn more: Code Generation

Available formats: See Exporters for a complete list of supported templates.


pipeline

Generates multiple output files from one or more models using a TOML configuration file. Ideal for automating code generation workflows.

Quick start:

md-models pipeline -i pipeline.toml

Use cases:

  • Generate multiple formats in one command
  • Automate code generation for entire projects
  • Maintain consistency across generated outputs
  • Integrate with CI/CD workflows

Learn more: Pipelines


extract

Uses Large Language Models (LLMs) to extract structured data from unstructured text based on your data model.

Quick start:

md-models extract -m model.md -i text.txt -o output.json

Use cases:

  • Extract structured data from documents
  • Parse unstructured text into typed objects
  • Convert legacy data formats to structured JSON
  • Automate data entry tasks

Learn more: Large Language Models


dataset validate

Validates JSON datasets against markdown models to ensure data conforms to the model structure.

Quick start:

md-models dataset validate -i data.json -m model.md

Use cases:

  • Validate API request/response data
  • Check data quality in ETL pipelines
  • Ensure data consistency before processing
  • Validate user-submitted data

Common Workflows

Development Workflow

# 1. Validate your model
md-models validate -i model.md

# 2. Generate code for your application
md-models convert -i model.md -t python-pydantic -o models.py

# 3. Generate API schemas
md-models convert -i model.md -t json-schema -r Document -o schema.json
md-models convert -i model.md -t graphql -o schema.graphql

Automated Pipeline Workflow

# Use a pipeline configuration to generate everything at once
md-models pipeline -i pipeline.toml

Data Extraction Workflow

# Extract structured data from unstructured text
md-models extract -m model.md -i document.txt -o extracted.json

# Validate the extracted data
md-models dataset validate -i extracted.json -m model.md

Input Sources

All commands that accept input files support:

  • Local file paths: md-models validate -i model.md
  • Remote URLs: md-models validate -i https://example.com/model.md

MD-Models automatically detects whether the input is a URL (starts with http/https) or a local file path.
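The detection rule can be pictured with a small Python sketch. The function `is_remote_input` is a hypothetical name, and checking for the full http:// and https:// scheme prefixes is an assumption. The exact check performed by the CLI is not documented here.

```python
def is_remote_input(source: str) -> bool:
    """Return True if the input looks like a URL, False for a local path."""
    return source.startswith(("http://", "https://"))


for source in ("model.md", "https://example.com/model.md"):
    kind = "remote URL" if is_remote_input(source) else "local file"
    print(f"{source}: {kind}")
```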

Getting Help

Get help for any command using the --help flag:

# General help
md-models --help

# Command-specific help
md-models convert --help
md-models validate --help
md-models pipeline --help

Command Reference

| Command | Purpose | Documentation |
|---------|---------|---------------|
| validate | Validate model structure and syntax | Schema Validation |
| convert | Generate code and schemas | Code Generation |
| pipeline | Batch generation from config | Pipelines |
| extract | LLM-powered data extraction | Large Language Models |
| dataset validate | Validate data against models | See md-models dataset validate --help |

Next Steps

For detailed information about available export formats and templates, see the Exporters documentation.

Code generation

MD-Models provides powerful code generation capabilities through the convert command, allowing you to transform your markdown data models into various formats including programming languages, schema definitions, API specifications, and documentation.

The Convert Command

The convert command is the primary way to export your MD-Models data model to different formats. It reads your markdown model file and generates output in the specified template format.

Basic Syntax

md-models convert -i <input> -t <template> [-o <output>] [-r <root>] [-O <options>]
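When the command is called from a script, it can help to assemble the argument list programmatically. The Python sketch below does this with a hypothetical helper `build_convert_command`. It uses only the flags from the syntax line above and enforces the rule, described for the root parameter in this chapter, that the json-schema template needs a root object. It builds the command without running it.

```python
from typing import List, Optional, Sequence


def build_convert_command(
    input_path: str,
    template: str,
    output: Optional[str] = None,
    root: Optional[str] = None,
    options: Sequence[str] = (),
) -> List[str]:
    """Assemble the argument list for 'md-models convert'.

    Hypothetical helper for illustration only.
    """
    if template == "json-schema" and root is None:
        raise ValueError("the json-schema template requires a root object (-r)")
    command = ["md-models", "convert", "-i", input_path, "-t", template]
    if output is not None:
        command += ["-o", output]
    if root is not None:
        command += ["-r", root]
    if options:
        # Options are passed as a comma-separated list without spaces.
        command += ["-O", ",".join(options)]
    return command


print(" ".join(build_convert_command("model.md", "json-schema", "schema.json", "Document")))
```

If the CLI is installed, the resulting list can be handed to `subprocess.run(command, check=True)` from the Python standard library.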

Command Parameters

Input (-i or --input)

Specifies the input markdown model file that contains your data model definition. The input can be provided in two ways:

  • Local file path: Path to a markdown file on your local filesystem. Use this when working with models stored locally on your machine.

    md-models convert -i model.md -t python-pydantic
    

    This example reads the model.md file from the current directory and converts it to Python Pydantic models. The file path can be relative (like model.md or ../models/schema.md) or absolute (like /path/to/model.md).

  • Remote URL: URL to a markdown file hosted online. Use this to fetch and convert models directly from web repositories, GitHub, or other online sources.

    md-models convert -i https://example.com/model.md -t json-schema
    

    This example fetches the model from the specified URL and generates a JSON Schema. This is particularly useful when working with shared models or models hosted in version control systems like GitHub.

MD-Models automatically detects whether the input is a URL (starts with http or https) or a local file path, so you don’t need to specify the input type explicitly.

Template (-t or --template)

Specifies the output format template. Available templates are organized into categories:

  • Programming Languages: Generate type-safe code in Python, TypeScript, Rust, Go, and Julia with validation and serialization support
  • Schema Languages: Create validation schemas (JSON Schema, XML Schema, SHACL, ShEx, OWL) and semantic definitions (JSON-LD, LinkML)
  • API Specifications: Generate API specification formats (GraphQL, Protobuf) for service contracts
  • Documentation: Produce documentation formats (MkDocs, Markdown) for your data models

See the Exporters documentation for a complete list of available templates and their usage.

Output (-o or --output)

Specifies the destination file path where the generated output will be written. This parameter is optional but highly recommended for production use.

Writing to a file:

md-models convert -i model.md -t python-pydantic -o models.py

This example generates Python Pydantic models and writes them to models.py. The output file path can be relative or absolute. If the file already exists, it will be overwritten. The directory containing the output file will be created automatically if it doesn’t exist.

Printing to stdout:

md-models convert -i model.md -t python-pydantic

When the output parameter is omitted, the generated content is printed directly to the terminal (stdout). This is useful for quick previews, piping to other commands, or when you want to inspect the output before saving it to a file. For example, you could pipe the output to less for paginated viewing: md-models convert -i model.md -t python-pydantic | less.

Root Object (-r or --root)

Specifies which object in your data model should be treated as the root or entry point for the generated output. This parameter is essential for certain export formats that need to know where to start traversing your model’s object graph.

When root object is required:

  • JSON Schema: The root object parameter is required. The generated schema validates documents starting from the root object, so MD-Models needs to know which object represents the top-level structure of your data.

    md-models convert -i model.md -t json-schema -r MyRootObject -o schema.json
    

    This example generates a JSON Schema where MyRootObject is the root. The schema will validate JSON documents that have MyRootObject as their top-level structure, and all referenced types will be included in the $defs section.

When root object is optional:

  • JSON-LD: The root object parameter is optional. If not specified, MD-Models uses the first object defined in your model as the root.

    md-models convert -i model.md -t json-ld -r Document -o context.jsonld
    

    This example generates a JSON-LD context header specifically for the Document object. The context will include term definitions for all attributes and relationships starting from Document, making it suitable for serializing Document instances as JSON-LD.

For other templates: The root object parameter is typically ignored, as these templates generate code or schemas for all objects in your model rather than focusing on a specific root.

Options (-O or --options)

Passes template-specific configuration options to customize the generated output. These options enable additional features or modify the behavior of the template. Multiple options can be specified as a comma-separated list without spaces.

md-models convert -i model.md -t <template> -O option1,option2

Available options by template:

  • Rust (rust):

    • jsonld - Adds JSON-LD header support to generated Rust structs, including JsonLdHeader with context management methods (add_term, update_term, remove_term). This enables semantic web integration for your Rust data structures.
  • TypeScript Zod (typescript-zod):

    • json-ld - Includes JSON-LD schema definitions (JsonLdSchema, JsonLdContextSchema) in the generated TypeScript code, allowing you to validate and work with JSON-LD data structures.
  • Go (golang):

    • gorm - Adds GORM (Go Object-Relational Mapping) tags to struct fields, enabling database integration. This includes primary key tags, foreign key relationships, many-to-many relationships, and JSON serializer tags for complex types.
    • xml - Adds XML serialization tags to struct fields, enabling XML marshaling/unmarshaling. Supports XML element names, attributes, and wrapped elements.
  • Python Pydantic (python-pydantic):

    • astropy - Enables Astropy unit support for UnitDefinition types. This replaces standard UnitDefinition with UnitDefinitionAnnot and filters out unit-related objects that Astropy handles natively, making the generated code compatible with Astropy’s unit system.
  • JSON Schema (json-schema):

    • openai - Generates OpenAI-compatible JSON Schema by removing options from schema properties. OpenAI’s function calling API doesn’t support custom options, so this option ensures compatibility when using the schema with OpenAI’s API.

Examples:

# Generate Rust code with JSON-LD support
md-models convert -i model.md -t rust -O jsonld -o models.rs

This generates Rust structs with embedded JSON-LD header support. Each struct will include a jsonld field of type Option<JsonLdHeader>, along with helper methods for managing JSON-LD contexts. This is useful when you need to serialize your Rust data as JSON-LD for semantic web applications.

# Generate Go code with GORM tags
md-models convert -i model.md -t golang -O gorm -o models.go

This generates Go structs with GORM database tags. Fields marked as primary keys will have gorm:"primaryKey" tags, relationships will have foreign key tags, and arrays of objects will have many-to-many relationship tags. This enables direct database persistence using GORM.

# Generate Python Pydantic with Astropy support
md-models convert -i model.md -t python-pydantic -O astropy -o models.py

This generates Python Pydantic models optimized for use with Astropy’s unit system. If your model includes UnitDefinition objects, they’ll be replaced with Astropy-compatible annotations, making it easy to work with physical units in scientific computing applications.
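
If you call the CLI from a script, it may help to check option names against the list above before running the command. The following Python sketch assumes a hypothetical helper, `convert_args`, that only builds the argument list; the option table is copied from this section, and nothing here is part of MD-Models itself.

```python
# Option names per template, copied from the list in this section.
KNOWN_OPTIONS = {
    "rust": {"jsonld"},
    "typescript-zod": {"json-ld"},
    "golang": {"gorm", "xml"},
    "python-pydantic": {"astropy"},
    "json-schema": {"openai"},
}


def convert_args(model, template, options=(), output=None, root=None):
    """Build the argument list for `md-models convert` (hypothetical helper)."""
    unknown = set(options) - KNOWN_OPTIONS.get(template, set())
    if unknown:
        raise ValueError(f"{template} has no option(s): {', '.join(sorted(unknown))}")
    args = ["md-models", "convert", "-i", model, "-t", template]
    if root:
        args += ["-r", root]
    if options:
        # Options are passed as one comma-separated value without spaces.
        args += ["-O", ",".join(options)]
    if output:
        args += ["-o", output]
    return args
```

The returned list can be handed to `subprocess.run(args, check=True)` on a machine where `md-models` is installed.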

Export Categories

MD-Models supports exporting to multiple categories of formats:

  • Programming Languages: Generate type-safe code with validation and serialization in Python, TypeScript, Rust, Go, and Julia
  • Schema Languages: Create validation schemas (JSON Schema, XML Schema) and semantic web formats (SHACL, ShEx, OWL, JSON-LD, LinkML)
  • API Specifications: Generate API specification formats (GraphQL, Protobuf) for defining service contracts
  • Documentation: Produce documentation formats (MkDocs, Markdown) with interactive diagrams and cross-references

Each category includes multiple templates optimized for specific use cases. See the linked documentation pages for detailed information about available templates, features, and usage examples.

Input from JSON Schema

MD-Models can also read JSON Schema files as input, allowing you to convert from JSON Schema to other formats. This enables workflows where you start with a JSON Schema and generate code or other schemas from it.

Usage:

# Convert JSON Schema to Python Pydantic
md-models convert -i schema.json -t python-pydantic -o models.py

The tool automatically detects JSON Schema files by parsing the file content. If the file contains valid JSON Schema syntax, MD-Models will parse it as a JSON Schema rather than a markdown model. This allows you to use JSON Schema as a source format and convert it to any supported output format, making it easy to migrate from JSON Schema-based workflows to MD-Models or generate code from existing JSON Schema definitions.
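
As a rough mental model of content-based detection, consider the Python sketch below. The heuristic (input that parses as a JSON object is treated as a schema candidate) and the function name `detect_input_kind` are assumptions made for illustration; the checks MD-Models actually performs may be stricter.

```python
import json


def detect_input_kind(text: str) -> str:
    """Classify input text as 'json-schema' or 'markdown' (illustrative heuristic)."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        # Not JSON at all, so treat it as a markdown model.
        return "markdown"
    # Assumption: only a JSON object can be a schema candidate.
    return "json-schema" if isinstance(parsed, dict) else "markdown"
```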

Complete Examples

The following examples demonstrate common use cases for the convert command with detailed explanations:

Generate Python Code with JSON-LD

md-models convert -i model.md -t python-pydantic -o models.py

What this does: Converts your markdown model to Python Pydantic classes with built-in JSON-LD support. The generated models.py file will contain:

  • Pydantic model classes for each object in your model
  • Type hints and runtime validation
  • Automatic JSON serialization/deserialization
  • JSON-LD fields (@id, @type, @context) for semantic web integration
  • Helper methods for managing JSON-LD contexts

Use case: Ideal for Python web APIs (especially FastAPI), data validation pipelines, or applications that need semantic web integration.

Generate JSON Schema for API Validation

md-models convert -i model.md -t json-schema -r Document -o api-schema.json

What this does: Generates a JSON Schema Draft 2020-12 compliant schema file with Document as the root object. The schema includes:

  • Complete type definitions for Document and all referenced objects
  • Validation rules (required fields, data types, constraints)
  • Enumeration definitions
  • All referenced types in the $defs section

Use case: Perfect for API documentation, request/response validation, form generation, or ensuring data consistency across services. The -r Document parameter specifies that Document is the root type that API consumers will send/receive.

Generate Multiple Formats with Options

# Rust with JSON-LD support
md-models convert -i model.md -t rust -O jsonld -o models.rs

What this does: Generates Rust structs with serde serialization and JSON-LD header support. Each struct includes:

  • Builder pattern support via derive_builder
  • JSON-LD header field with context management
  • Helper methods for managing JSON-LD contexts (add_term, update_term, remove_term)
  • Default JSON-LD header functions for each object type

Use case: Ideal for Rust web servers, high-performance APIs, or systems programming applications that need semantic web capabilities.

# Go with GORM and XML support
md-models convert -i model.md -t golang -O gorm,xml -o models.go

What this does: Generates Go structs with both GORM database tags and XML serialization tags. The structs include:

  • GORM tags for database relationships (primary keys, foreign keys, many-to-many)
  • XML tags for XML marshaling/unmarshaling
  • JSON tags for JSON serialization
  • Custom marshaling for union types

Use case: Perfect for Go microservices that need both database persistence (via GORM) and XML data exchange (for SOAP services or legacy system integration).

# TypeScript Zod with JSON-LD
md-models convert -i model.md -t typescript-zod -O json-ld -o schemas.ts

What this does: Generates TypeScript Zod schemas with JSON-LD support. The output includes:

  • Zod schema definitions with type inference
  • Runtime validation functions
  • JSON-LD schema types (JsonLdSchema, JsonLdContextSchema)
  • TypeScript types inferred from the schemas

Use case: Excellent for TypeScript/JavaScript applications that need runtime validation with semantic web support, such as React applications, Node.js APIs, or TypeScript-based frontend frameworks.

Generate from Remote Model

md-models convert -i https://raw.githubusercontent.com/user/repo/main/model.md -t graphql -o schema.graphql

What this does: Fetches a markdown model from a remote URL (in this case, a GitHub raw file) and generates a GraphQL Schema Definition Language (SDL) file. The generated schema includes:

  • Type definitions for all objects
  • Enum definitions
  • Union types for multi-type attributes
  • Query type with automatically generated query operations

Use case: Useful when working with shared models hosted in version control, or when you want to generate schemas from models maintained by other teams or in public repositories. This enables collaborative data modeling workflows where models are versioned and shared via Git.

For more detailed information about specific exporters, their features, and configuration options, see the Exporters documentation.

Pipelines

Pipelines provide a powerful way to generate multiple output files from one or more MD-Models files in a single command. Instead of running multiple convert commands manually, pipelines allow you to define a configuration file that specifies all the formats you want to generate, making it easy to automate code generation workflows and maintain consistency across your project.

The Pipeline Command

The pipeline command reads a TOML configuration file and generates all specified output formats in one execution.

Basic Usage

md-models pipeline -i <pipeline_config.toml>

Example:

md-models pipeline -i pipeline.toml

Pipeline Configuration Format

Pipeline configurations are written in TOML format and consist of two main sections:

  1. [meta]: Metadata and input file paths
  2. [generate]: Output generation specifications

Configuration Structure

[meta]
name = "My Project"
description = "Project data models"
paths = ["model1.md", "model2.md"]

[generate]
python-pydantic = { out = "models.py" }
json-schema = { out = "schema.json", root = "Document" }
graphql = { out = "schema.graphql" }

Meta Section

The [meta] section defines metadata and input files for the pipeline.

Fields

  • name (optional): A name for the pipeline configuration
  • description (optional): A description of what the pipeline generates
  • paths (required): An array of paths to MD-Models markdown files

Example:

[meta]
name = "API Models"
description = "Generate API models and schemas"
paths = ["models/user.md", "models/product.md"]

Path Resolution

Paths in the paths array are resolved relative to the directory that contains the pipeline configuration file. For example, if your pipeline file is at configs/pipeline.toml and you specify paths = ["../models/user.md"], the path resolves to configs/../models/user.md, that is, models/user.md.

Multiple Input Files:

When multiple files are specified in paths, MD-Models processes them in one of two modes:

  • Merge mode (default): Combine all models into a single unified model before generation
  • Per-spec mode: Generate separate outputs for each input file (requires per-spec = true)

Generate Section

The [generate] section defines what outputs to generate. Each key is a template name, and the value is a specification object.

Basic Generation Specification

[generate]
python-pydantic = { out = "models.py" }

Specification Fields

Each generation specification supports the following fields:

  • out (required): Output file path or directory path
  • root (optional): Root object name (required for JSON Schema, optional for JSON-LD)
  • per-spec (optional): Boolean indicating whether to generate separate files per input (default: false)
  • fname-case (optional): Case transformation for output filenames (pascal, snake, kebab, camel, none)
  • description (optional): Description of this generation step
  • Template-specific options: Additional options like jsonld, gorm, astropy, etc.

Merge Mode (Default)

When per-spec is false or omitted, all input models are merged into a single unified model, and one output file is generated.

Example:

[meta]
paths = ["user.md", "product.md"]

[generate]
python-pydantic = { out = "all_models.py" }
json-schema = { out = "schema.json", root = "User" }

What happens:

  • user.md and product.md are merged into one model
  • A single all_models.py file is generated containing both User and Product classes
  • A single schema.json file is generated with User as the root

Per-Spec Mode

When per-spec = true, separate output files are generated for each input file. This requires using a wildcard (*) in the output path.

Example:

[meta]
paths = ["user.md", "product.md"]

[generate]
python-pydantic = { out = "models/*.py", per-spec = true }
json-schema = { out = "schemas/*.json", per-spec = true, root = "User" }

What happens:

  • user.md generates models/user.py and schemas/user.json
  • product.md generates models/product.py and schemas/product.json
  • Each file contains only the objects from its corresponding input model

Wildcard Requirements:

When using per-spec = true, the output path must contain a wildcard (*). The wildcard will be replaced with the input filename (without extension).

Valid wildcard examples:

  • "models/*.py" → models/user.py, models/product.py
  • "output/*_schema.json" → output/user_schema.json, output/product_schema.json
  • "schemas/*" → schemas/user, schemas/product

Invalid (will cause error):

  • "models/filename.py" with per-spec = true → Error: must contain wildcard
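
The merge and per-spec rules above can be summarized in a few lines of Python. This is a sketch of the documented behavior, not the tool's code; `plan_outputs` is a hypothetical name, and the error text is illustrative.

```python
from pathlib import PurePosixPath


def plan_outputs(paths, out, per_spec=False):
    """Map each output file to the input files that feed it."""
    if not per_spec:
        # Merge mode: every input contributes to a single output.
        return {out: list(paths)}
    if "*" not in out:
        raise ValueError("per-spec output path must contain a wildcard (*)")
    # Per-spec mode: the wildcard becomes the input filename without extension.
    return {out.replace("*", PurePosixPath(p).stem): [p] for p in paths}
```

For example, `plan_outputs(["user.md", "product.md"], "models/*.py", per_spec=True)` maps `models/user.py` to `user.md` and `models/product.py` to `product.md`.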

File Name Case Transformation

The fname-case option allows you to transform output filenames to different case conventions when using per-spec mode.

Available cases:

  • pascal: PascalCase (e.g., UserProfile.py)
  • snake: snake_case (e.g., user_profile.py)
  • kebab: kebab-case (e.g., user-profile.py)
  • camel: camelCase (e.g., userProfile.py)
  • none: No transformation (default)

Example:

[generate]
python-pydantic = { out = "models/*.py", per-spec = true, fname-case = "snake" }

If your input file is UserProfile.md, this will generate models/user_profile.py instead of models/UserProfile.py.
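
To make the four conventions concrete, here is a small Python sketch of the transformation. The word-splitting rule (split at case changes, underscores, and hyphens) is an assumption, and `apply_fname_case` is a hypothetical helper; MD-Models may treat edge cases such as acronyms differently.

```python
import re


def split_words(stem: str):
    """Split a filename stem into words at case changes, '_' and '-'."""
    return re.findall(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z0-9]+|[A-Z]+", stem)


def apply_fname_case(stem: str, case: str = "none") -> str:
    """Apply one of the fname-case conventions to a filename stem."""
    if case == "none":
        return stem
    words = [w.lower() for w in split_words(stem)]
    if not words:
        return stem
    if case == "snake":
        return "_".join(words)
    if case == "kebab":
        return "-".join(words)
    if case == "pascal":
        return "".join(w.capitalize() for w in words)
    if case == "camel":
        return words[0] + "".join(w.capitalize() for w in words[1:])
    raise ValueError(f"unknown fname-case: {case}")
```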

Template-Specific Options

You can pass template-specific options in the generation specification, just like with the convert command’s -O flag.

Example:

[generate]
rust = { out = "models.rs", jsonld = true }
golang = { out = "models.go", gorm = true, xml = true }
python-pydantic = { out = "models.py", astropy = true }
typescript-zod = { out = "schemas.ts", json-ld = true }
json-schema = { out = "schema.json", root = "Document", openai = true }

Available options by template:

  • Rust: jsonld
  • Go: gorm, xml
  • Python Pydantic: astropy
  • TypeScript Zod: json-ld
  • JSON Schema: openai

Options are specified as boolean values (true/false) or as string values in the TOML configuration.
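
When migrating existing convert invocations into a pipeline file, the mapping from `-O` options to TOML keys is mechanical. The Python sketch below renders one `[generate]` entry from convert-style arguments; `generate_entry` is a hypothetical helper written for this guide, and it assumes every option maps to a boolean key.

```python
def generate_entry(template, out, options="", root=None):
    """Render one [generate] line from convert-style arguments."""
    fields = [f'out = "{out}"']
    if root:
        fields.append(f'root = "{root}"')
    # Each comma-separated -O option becomes a boolean key set to true.
    fields += [f"{opt} = true" for opt in options.split(",") if opt]
    return f"{template} = {{ {', '.join(fields)} }}"
```

Calling `generate_entry("golang", "models.go", "gorm,xml")` produces the same line as the Go entry in the example above.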

Root Object Specification

For templates that use a root object (JSON Schema and JSON-LD), you can specify it in the generation specification.

Example:

[generate]
json-schema = { out = "schema.json", root = "Document" }
json-ld = { out = "context.jsonld", root = "User" }

If root is not specified:

  • JSON Schema: Will use the first object in the merged model
  • JSON-LD: Will use the first object in the merged model

Complete Examples

Example 1: Single Model, Multiple Formats

Generate multiple formats from a single model file:

[meta]
name = "API Models"
description = "Generate API models and schemas"
paths = ["api/models.md"]

[generate]
python-pydantic = { out = "api/models.py" }
json-schema = { out = "api/schema.json", root = "Document" }
graphql = { out = "api/schema.graphql" }
protobuf = { out = "api/schema.proto" }
rust = { out = "api/models.rs", jsonld = true }

Usage:

md-models pipeline -i pipeline.toml

Output:

  • api/models.py - Python Pydantic models
  • api/schema.json - JSON Schema with Document as root
  • api/schema.graphql - GraphQL schema
  • api/schema.proto - Protobuf definitions
  • api/models.rs - Rust structs with JSON-LD support

Example 2: Multiple Models, Merged Output

Merge multiple model files and generate unified outputs:

[meta]
paths = ["models/user.md", "models/product.md", "models/order.md"]

[generate]
python-pydantic = { out = "lib/all_models.py" }
json-schema-all = { out = "schemas/" }
shacl = { out = "schemas/shapes.ttl" }

What happens:

  • All three models are merged into one
  • all_models.py contains User, Product, and Order classes
  • schemas/ directory contains separate JSON Schema files for each object type
  • shapes.ttl contains SHACL shapes for all merged objects

Example 3: Per-Spec Generation

Generate separate outputs for each input model:

[meta]
paths = ["user.md", "product.md"]

[generate]
python-pydantic = { out = "models/*.py", per-spec = true, fname-case = "snake" }
json-schema = { out = "schemas/*.json", per-spec = true, root = "User" }
xml-schema = { out = "schemas/*.xsd", per-spec = true }

What happens:

  • user.md → models/user.py, schemas/user.json, schemas/user.xsd
  • product.md → models/product.py, schemas/product.json, schemas/product.xsd
  • Each output file contains only the objects from its corresponding input

Example 4: Complex Pipeline with Options

A comprehensive pipeline with various options and configurations:

[meta]
name = "Full Stack API"
description = "Generate models for frontend, backend, and documentation"
paths = ["api/models.md"]

[generate]
# Backend - Python
python-pydantic = { out = "backend/models.py", description = "Python API models" }

# Backend - Rust
rust = { out = "backend/models.rs", jsonld = true }

# Frontend - TypeScript
typescript-zod = { out = "frontend/schemas.ts", json-ld = true }

# API Schemas
json-schema = { out = "api/schema.json", root = "Document", openai = true }
graphql = { out = "api/schema.graphql" }
protobuf = { out = "api/schema.proto" }

# Documentation
mk-docs = { out = "docs/api.md" }

# Semantic Web
shacl = { out = "schemas/shapes.ttl" }
json-ld = { out = "schemas/context.jsonld", root = "Document" }

Special Templates

JSON Schema All

The json-schema-all template generates separate JSON Schema files for each object. The output must be a directory path.

Merge mode:

[generate]
json-schema-all = { out = "schemas/" }

Generates one .json file per object in the schemas/ directory.

Per-spec mode:

[generate]
json-schema-all = { out = "schemas/", per-spec = true }

Generates separate directories for each input file, each containing JSON Schema files for that file’s objects.

MkDocs

The MkDocs template automatically disables navigation in merge mode unless you enable it explicitly with the nav option.

Example:

[generate]
# Navigation disabled automatically in merge mode
mk-docs = { out = "docs.md" }

# Alternative with navigation enabled (use one mk-docs entry, not both)
# mk-docs = { out = "docs.md", nav = true }

Path Resolution

All paths in the pipeline configuration are resolved relative to the pipeline configuration file’s directory:

  • Input paths (paths): Relative to the pipeline file’s directory
  • Output paths (out): Relative to the pipeline file’s directory

Example:

If your pipeline file is at configs/pipeline.toml:

[meta]
paths = ["../models/user.md"]  # Resolved as configs/../models/user.md

[generate]
python-pydantic = { out = "output/models.py" }  # Resolved as configs/output/models.py
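
The resolution rule can be reproduced with Python's standard library. This sketch is purely lexical (it does not touch the filesystem or follow symlinks), and `resolve_from_config` is a hypothetical helper name.

```python
import posixpath


def resolve_from_config(config_file: str, relative: str) -> str:
    """Resolve a path from the pipeline file's directory and normalize it."""
    config_dir = posixpath.dirname(config_file)
    # normpath collapses segments such as "configs/.." lexically.
    return posixpath.normpath(posixpath.join(config_dir, relative))
```

With the configuration above, `resolve_from_config("configs/pipeline.toml", "../models/user.md")` gives `models/user.md`, and the output path resolves to `configs/output/models.py`.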

Error Handling

The pipeline command will:

  • Stop on first error: If any generation step fails, the pipeline stops
  • Validate inputs: Ensures all input files exist before processing
  • Create directories: Automatically creates output directories if they don’t exist
  • Report errors: Provides clear error messages for missing files, invalid configurations, or generation failures

Best Practices

  1. Use descriptive names: Give your pipeline a meaningful name and description
  2. Organize output paths: Use consistent directory structures for outputs
  3. Version control: Include pipeline configuration files in version control
  4. Test incrementally: Start with a few templates and add more as needed
  5. Use per-spec for modular models: When models are independent, use per-spec = true for separate outputs
  6. Use merge for related models: When models share types or should be combined, use merge mode (default)
  7. Document options: Use the description field to document why specific options are used

Integration with CI/CD

Pipelines are ideal for CI/CD workflows. You can:

  • Generate all formats automatically on model changes
  • Ensure consistency across all generated outputs
  • Version control both models and generated code together
  • Automate documentation generation

Example CI/CD usage:

# In your CI/CD pipeline
# -i takes a single input, so validate each model in turn
for model in models/*.md; do
    md-models validate -i "$model" || exit 1
done
md-models pipeline -i pipeline.toml
# Generated files are ready for deployment

For more information about individual templates and their options, see the Exporters documentation.

Schema Validation

MD-Models provides comprehensive validation capabilities to ensure your data models are well-formed, consistent, and ready for code generation. The validation system checks for structural integrity, naming conventions, type consistency, and other potential issues that could cause problems during code generation or runtime.

The Validate Command

The validate command checks your MD-Models file for errors and inconsistencies. It can validate both markdown model files and JSON Schema files.

Basic Usage

md-models validate -i <input>

Examples:

# Validate a local markdown model file
md-models validate -i model.md

# Validate a model from a remote URL
md-models validate -i https://example.com/model.md

# Validate a JSON Schema file
md-models validate -i schema.json

Input Sources

The validate command accepts the same input types as the convert command:

  • Local file path: Path to a markdown model or JSON Schema file on your local filesystem
  • Remote URL: URL to a markdown model or JSON Schema file hosted online

MD-Models automatically detects the file type (markdown model vs JSON Schema) and applies the appropriate validation rules.

Validation Checks

MD-Models performs comprehensive validation checks on your data model. Understanding these checks helps you write correct models and quickly identify issues.

Global Model Validation

Empty Model Check

What it checks: Ensures your model contains at least one object definition.

Error message: "This model has no definitions."

Solution: Add at least one object to your model.

Example:

# My Model

<!-- This would fail validation - no objects defined -->

Duplicate Object Names

What it checks: Ensures each object has a unique name within the model.

Error type: DuplicateError

Error message: "Object '<name>' is defined more than once."

Solution: Rename one of the duplicate objects to be unique.

Example:

## User
- name: string

## User  <!-- Error: duplicate name -->
- email: string

Duplicate Enum Names

What it checks: Ensures each enumeration has a unique name within the model.

Error type: DuplicateError

Error message: "Enumeration '<name>' is defined more than once."

Solution: Rename one of the duplicate enumerations to be unique.

Object Validation

Empty Object Check

What it checks: Ensures objects have at least one attribute (unless allow_empty: true is set in frontmatter).

Error type: ObjectError

Error message: "Type '<name>' is empty and has no properties."

Solution: Add at least one property to the object, or set allow_empty: true in the model’s frontmatter if empty objects are intentional.

Example:

## EmptyObject
<!-- This would fail validation - no attributes -->

Object Name Validation

What it checks: Validates that object names follow naming conventions:

  1. Must start with a letter: Object names cannot start with numbers or special characters
  2. No whitespace: Object names cannot contain spaces
  3. No special characters: Only alphanumeric characters and underscores are allowed

Error type: NameError

Error messages:

  • "Name '<name>' must start with a letter."
  • "Name '<name>' contains whitespace, which is not valid. Use underscores instead."
  • "Name '<name>' contains special characters, which are not valid except for underscores."

Valid examples:

  • User
  • UserProfile
  • user_profile
  • User123

Invalid examples:

  • 123User (starts with number)
  • User Profile (contains space)
  • User-Profile (contains special character)
  • User.Profile (contains special character)
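
The three naming rules can be expressed as a short Python check. This is a sketch based on the rules and messages listed above, not the validator's source; `check_name` is a hypothetical function, and restricting letters to ASCII is an assumption.

```python
import string

ALLOWED = string.ascii_letters + string.digits + "_"


def check_name(name: str):
    """Return the naming-rule messages that apply to a name (empty if valid)."""
    errors = []
    # Assumption: "letter" means an ASCII letter.
    if not name or name[0] not in string.ascii_letters:
        errors.append(f"Name '{name}' must start with a letter.")
    if any(ch.isspace() for ch in name):
        errors.append(
            f"Name '{name}' contains whitespace, which is not valid. Use underscores instead."
        )
    # Whitespace is reported by the rule above, so it is skipped here.
    if any(ch not in ALLOWED and not ch.isspace() for ch in name):
        errors.append(
            f"Name '{name}' contains special characters, which are not valid except for underscores."
        )
    return errors
```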

Attribute Validation

Duplicate Attribute Names

What it checks: Ensures each attribute within an object has a unique name.

Error type: DuplicateError

Error message: "Property '<name>' is defined more than once."

Solution: Rename one of the duplicate properties to be unique within the object.

Example:

## User
- name: string
- name: string  <!-- Error: duplicate attribute name -->
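
All three duplicate checks (objects, enumerations, properties) follow the same pattern: count the names and report any that appear more than once. A minimal Python sketch, with `duplicate_errors` as a hypothetical helper and the message wording taken from this section:

```python
from collections import Counter


def duplicate_errors(kind: str, names):
    """Report every name that occurs more than once, in first-seen order."""
    counts = Counter(names)
    return [
        f"{kind} '{name}' is defined more than once."
        for name, count in counts.items()
        if count > 1
    ]
```

Here `kind` would be `"Object"`, `"Enumeration"`, or `"Property"`, depending on which list of names is being checked.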

Attribute Name Validation

What it checks: Validates that attribute names follow the same naming conventions as object names:

  1. Must start with a letter
  2. No whitespace
  3. No special characters (except underscores)

Error type: NameError

Error messages: Same as object name validation

Valid examples:

  • name
  • user_name
  • emailAddress
  • age123

Invalid examples:

  • 123age (starts with number)
  • user name (contains space)
  • user-name (contains special character)

Type Definition Validation

What it checks: Ensures every attribute has at least one valid type defined.

Error type: TypeError

Error messages:

  • "Property '<name>' has no type specified."
  • "Property '<name>' has no type defined. Either define a type or use a base type."

Solution: Add a type to the property using the syntax - <property>: <TYPE>.

Example:

## User
- name  <!-- Error: no type specified -->
- email: string  <!-- Valid -->

Type Reference Validation

What it checks: Ensures that all referenced types exist in the model or are basic types.

Error type: TypeError

Error message: "Type '<type>' of property '<name>' not found."

Solution: Either add the referenced type to your model, or use one of the basic types: string, number, integer, boolean, float, date, bytes.

Valid basic types:

  • string - Text data
  • number - Numeric value
  • integer - Whole number
  • boolean - True/false value
  • float - Floating-point number
  • date - Date value
  • bytes - Binary data

Example:

## User
- name: string  <!-- Valid: basic type -->
- profile: UserProfile  <!-- Error if UserProfile doesn't exist -->
- status: Status  <!-- Valid if Status enum exists -->
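
The reference check amounts to a set lookup: a type is known if it is a basic type, an object, or an enumeration. The Python sketch below models a parsed model as plain dictionaries, which is an assumption made for illustration; `unknown_type_errors` is a hypothetical name.

```python
BASIC_TYPES = {"string", "number", "integer", "boolean", "float", "date", "bytes"}


def unknown_type_errors(objects, enums=()):
    """Check references; `objects` maps object name -> {attribute: type name}."""
    known = BASIC_TYPES | set(objects) | set(enums)
    errors = []
    for attributes in objects.values():
        for attr, type_name in attributes.items():
            if type_name not in known:
                errors.append(f"Type '{type_name}' of property '{attr}' not found.")
    return errors
```

Applied to the `User` object above with a `Status` enumeration but no `UserProfile` object, only the `profile` attribute is reported.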

XML Option Validation

MD-Models validates XML serialization options to ensure they’re correctly formatted.

XML Element Option Validation

What it checks: Validates XML element options (used for custom XML tag names).

Error type: XMLError

Error messages:

  • "XML option is not defined."
  • "Name '<name>' contains special characters..."

Solution: Ensure XML options are properly defined and don’t contain invalid characters.

XML Attribute Option Validation

What it checks: Validates XML attribute options (used for XML attributes).

Error type: XMLError

Error messages:

  • "XML attribute option is not defined."
  • "Name '<name>' contains special characters..."

Solution: Ensure XML attribute options use the @ prefix and are properly formatted.

XML Wrapped Option Validation

What it checks: Validates XML wrapped element options (for nested XML structures).

Error type: XMLError

Error messages:

  • "XML wrapped option can only contain two types."
  • "Name '<name>' contains special characters..."

Solution: XML wrapped options can only have a depth of two types. For deeper nesting, create intermediate objects.

Error Reporting

When validation fails, MD-Models provides detailed error messages to help you identify and fix issues.

Error Format

Each validation error includes:

  • Line number: The line(s) where the error occurs
  • Location: The object and attribute (if applicable) where the error was found
  • Error type: The category of error (NameError, TypeError, DuplicateError, etc.)
  • Message: A clear description of what’s wrong
  • Solution: Suggested fix for the error

Example Error Output

[line: 5, 12] [User.name] NameError:
 └── Name 'user name' contains whitespace, which is not valid. Use underscores instead.
     Resolve the issue by using 'user_name'.

[line: 8] [User] TypeError:
 └── Type 'Profile' of property 'profile' not found.
     Add the type 'Profile' to the model or use a base type.
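
If you post-process validation results in your own scripts, the layout above is easy to reproduce. The Python sketch below assumes a hypothetical `ValidationReport` record whose fields mirror the listed components; it is not a data structure exposed by MD-Models.

```python
from dataclasses import dataclass


@dataclass
class ValidationReport:
    lines: list       # line number(s) where the error occurs
    location: str     # object, optionally followed by ".attribute"
    error_type: str   # e.g. NameError, TypeError
    message: str
    solution: str

    def render(self) -> str:
        """Format the report in the three-line layout shown above."""
        positions = ", ".join(str(n) for n in self.lines)
        return (
            f"[line: {positions}] [{self.location}] {self.error_type}:\n"
            f" └── {self.message}\n"
            f"     {self.solution}"
        )
```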

Error Types

MD-Models categorizes errors into several types:

  • NameError: Issues with naming conventions (object names, attribute names)
  • TypeError: Issues with type definitions or type references
  • DuplicateError: Duplicate definitions (objects, enums, attributes)
  • ObjectError: Issues with object structure (empty objects)
  • XMLError: Issues with XML serialization options
  • GlobalError: Model-level issues (empty model)

Validation Success

When validation passes, you’ll see:

✓ Model is valid

This indicates your model is well-formed and ready for code generation or export.

JSON Schema Validation

MD-Models can also validate JSON Schema files. When you provide a JSON Schema file as input, MD-Models:

  1. Parses the JSON Schema
  2. Converts it to an internal model representation
  3. Validates the converted model using the same validation rules

This allows you to validate JSON Schema files and ensure they’re compatible with MD-Models workflows.

Example:

md-models validate -i schema.json

Best Practices

  1. Validate before generating: Always validate your model before running code generation to catch errors early
  2. Fix errors systematically: Start with global errors (duplicates, empty model) before fixing attribute-level errors
  3. Use descriptive names: Follow naming conventions to avoid NameError issues
  4. Check type references: Ensure all referenced types exist in your model or are basic types
  5. Validate in CI/CD: Include validation checks in your continuous integration pipeline

Integration with Code Generation

Validation is automatically performed when parsing models for code generation. If validation fails during parsing, code generation will not proceed, ensuring that only valid models are used to generate code.

You can also use validation independently to check models without generating code:

# Just validate
md-models validate -i model.md

# Validate and generate (validation happens automatically)
md-models convert -i model.md -t python-pydantic -o models.py

Large Language Models

To be added

Exporters

MD-Models provides a comprehensive set of exporters that allow you to convert your data models into various formats, including programming languages, schema definitions, API specifications, and documentation formats. This enables you to:

  • Generate type-safe code in multiple programming languages
  • Create validation schemas for data validation
  • Build API specifications for service contracts
  • Produce documentation for your data models
  • Enable semantic web integration with RDF and JSON-LD support

Export Categories

MD-Models exporters are organized into several categories:

Programming Languages

Generate type-safe code and data structures in various programming languages:

  • Python: Dataclasses, Pydantic models (with XML support)
  • TypeScript: io-ts and Zod schemas
  • Rust: Structs with serde serialization
  • Go: Structs with GORM and XML support
  • Julia: Type definitions with JSON serialization

View all programming language exporters →

Schema Languages

Create validation schemas and semantic definitions:

  • JSON Schema: JSON data validation (Draft 2020-12)
  • XML Schema (XSD): XML document validation
  • SHACL & ShEx: RDF graph validation
  • OWL: Web Ontology Language for knowledge graphs
  • JSON-LD: Linked data context generation
  • LinkML: Multi-format schema generation

View all schema language exporters →

API Specifications

Generate API specification formats:

  • GraphQL: GraphQL Schema Definition Language (SDL)
  • Protobuf: Protocol Buffer message definitions

View all API specification exporters →

Documentation

Generate documentation formats:

  • MkDocs: Markdown documentation with interactive diagrams

View all documentation exporters →

Quick Start

To export your MD-Models file to any format, use the convert command:

md-models convert -i <model> -t <template> -o <output>

Example:

# Generate Python Pydantic models
md-models convert -i model.md -t python-pydantic -o models.py

# Generate JSON Schema
md-models convert -i model.md -t json-schema -r MyObject -o schema.json

# Generate GraphQL schema
md-models convert -i model.md -t graphql -o schema.graphql

Template Options

Many exporters support additional options via the -O or --options flag:

md-models convert -i <model> -t <template> -O option1,option2

Common options include:

  • jsonld (Rust) / json-ld (TypeScript Zod): Enable JSON-LD support
  • gorm: Enable GORM tags (Go)
  • xml: Enable XML serialization (Go)
  • astropy: Enable Astropy unit support (Python Pydantic)
  • openai: OpenAI-compatible schema (JSON Schema)

See the individual exporter pages for specific options available for each template.

Choosing the Right Exporter

| Use Case | Recommended Exporters |
|---|---|
| API Development | GraphQL, Protobuf, JSON Schema |
| Data Validation | JSON Schema, XML Schema, SHACL, ShEx |
| Code Generation | Python Pydantic, TypeScript Zod, Rust, Go |
| Semantic Web | OWL, SHACL, ShEx, JSON-LD |
| Documentation | MkDocs |
| Multi-format Support | LinkML |

For detailed information about each exporter, including features, usage examples, and configuration options, visit the dedicated pages linked above.

Programming languages

MD-Models can export your data models to various programming languages, generating type-safe code with validation, serialization, and additional features like JSON-LD support.

Python

Python is a high-level, interpreted programming language known for its simplicity and readability. It’s widely used in web development, data science, scientific computing, automation, and API development. Python’s extensive ecosystem and ease of use make it ideal for rapid prototyping and production applications.

Python Dataclass

The Python dataclass exporter generates Python classes using the dataclasses module and dataclasses-json for JSON serialization. This is ideal for simple data models that need basic validation and serialization.

Features:

  • Type hints for all attributes
  • Automatic JSON serialization/deserialization
  • Runtime validation
  • Helper methods for adding nested objects

Usage:

md-models convert -i <model> -t python-dataclass

Example:

md-models convert -i model.md -t python-dataclass -o models.py

JSON-LD Support: ✅ Yes - Includes @id, @type, and @context fields automatically


Python Pydantic

The Python Pydantic exporter generates Pydantic models, which provide powerful runtime validation and type checking. Pydantic is widely used in modern Python applications, especially with FastAPI for building REST APIs, data validation in data pipelines, and configuration management.

Features:

  • Runtime data validation
  • Type coercion and conversion
  • Field descriptions and documentation
  • Filter methods for nested collections
  • JSON-LD helper methods (set_attr_term, add_type_term)
  • Support for Astropy units (via astropy option)

Usage:

md-models convert -i <model> -t python-pydantic

Example:

md-models convert -i model.md -t python-pydantic -o models.py

Options:

  • astropy: Enable Astropy unit support for UnitDefinition types

    md-models convert -i model.md -t python-pydantic -O astropy -o models.py
    

JSON-LD Support: ✅ Yes - Includes JSON-LD fields and helper methods for managing semantic annotations


Python Pydantic XML

The Python Pydantic XML exporter generates Pydantic models with XML serialization support using pydantic-xml. This is ideal for applications that need to work with XML data formats, such as SOAP services, legacy system integration, document processing, and scientific data exchange formats.

Features:

  • XML serialization and deserialization
  • Support for XML namespaces and attributes
  • Wrapped XML elements
  • Pretty-printed XML output
  • Runtime validation

Usage:

md-models convert -i <model> -t python-pydantic-xml

Example:

md-models convert -i model.md -t python-pydantic-xml -o models.py

JSON-LD Support: ❌ No - Focused on XML serialization


TypeScript

TypeScript is a typed superset of JavaScript that compiles to plain JavaScript. It adds static type checking to JavaScript, making it ideal for large-scale web applications, frontend frameworks (React, Vue, Angular), Node.js backend services, and anywhere type safety is crucial for maintainability and catching errors early.

TypeScript (io-ts)

The TypeScript io-ts exporter generates TypeScript interfaces with runtime validation using the io-ts library. This provides both static type checking and runtime validation, making it perfect for API clients, data validation layers, and functional programming approaches in TypeScript.

Features:

  • TypeScript interfaces with type inference
  • Runtime validation with io-ts decoders
  • Generic validate function for type-safe validation
  • JSON-LD interface support

Usage:

md-models convert -i <model> -t typescript

Example:

md-models convert -i model.md -t typescript -o models.ts

JSON-LD Support: ✅ Yes - Includes JsonLd interface that all types extend


TypeScript Zod

The TypeScript Zod exporter generates Zod schemas, which provide both runtime validation and TypeScript type inference. Zod is a popular choice for modern TypeScript applications, especially in form validation, API request/response validation, configuration schemas, and anywhere you need runtime type safety with excellent developer experience.

Features:

  • Zod schema definitions with type inference
  • Runtime validation
  • Field descriptions
  • Union type support
  • Optional JSON-LD schema support

Usage:

md-models convert -i <model> -t typescript-zod

Example:

md-models convert -i model.md -t typescript-zod -o schemas.ts

Options:

  • json-ld: Enable JSON-LD schema support

    md-models convert -i model.md -t typescript-zod -O json-ld -o schemas.ts
    

JSON-LD Support: ✅ Yes - Available via the json-ld option


Rust

Rust is a systems programming language focused on safety, performance, and concurrency. It provides memory safety without garbage collection, making it ideal for systems programming, web servers, embedded systems, blockchain development, and performance-critical applications. Rust’s strong type system and ownership model prevent many common programming errors at compile time.

The Rust exporter generates Rust struct definitions with serde serialization, builder pattern support, and optional JSON-LD functionality.

Features:

  • Rust structs with serde serialization
  • Builder pattern using derive_builder
  • JSON Schema generation support (schemars)
  • Optional JSON-LD header support
  • Union types for multi-type attributes
  • XML wrapped element support

Usage:

md-models convert -i <model> -t rust

Example:

md-models convert -i model.md -t rust -o models.rs

Options:

  • jsonld: Enable JSON-LD header support with context management

    md-models convert -i model.md -t rust -O jsonld -o models.rs
    

When jsonld is enabled, the generated code includes:

  • JsonLdHeader struct with context, ID, and type fields
  • Helper methods for managing JSON-LD contexts (add_term, update_term, remove_term)
  • Default JSON-LD header functions for each object type

JSON-LD Support: ✅ Yes - Available via the jsonld option


Go

Go (Golang) is a statically typed, compiled language designed for simplicity, efficiency, and concurrency. Developed by Google, Go is widely used for building scalable backend services, microservices, cloud-native applications, command-line tools, and distributed systems. Its built-in concurrency primitives (goroutines and channels) make it excellent for handling high-throughput network services and concurrent operations.

The Go exporter generates Go struct definitions with JSON and XML serialization tags, and optional GORM support for database integration.

Features:

  • Go structs with JSON/XML tags
  • Automatic type conversion (string, float64, int64, bool, []byte)
  • Union types for multi-type attributes with custom marshaling
  • Optional GORM tags for database integration
  • Self-referential type handling (pointers)

Usage:

md-models convert -i <model> -t golang

Example:

md-models convert -i model.md -t golang -o models.go

Options:

  • gorm: Enable GORM tags for database relationships

    md-models convert -i model.md -t golang -O gorm -o models.go
    
  • xml: Enable XML serialization tags

    md-models convert -i model.md -t golang -O xml -o models.go
    

When gorm is enabled, the generated code includes:

  • Primary key tags (gorm:"primaryKey")
  • Foreign key relationships (gorm:"foreignKey:...")
  • Many-to-many relationships (gorm:"many2many:...")
  • JSON serializer tags for complex types (gorm:"serializer:json")

When xml is enabled, the generated structs include XML tags for serialization, supporting:

  • XML element names
  • XML attributes
  • Wrapped XML elements

JSON-LD Support: ❌ No


Julia

Julia is a high-level, high-performance dynamic programming language designed for numerical and scientific computing. It combines the ease of use of Python with the performance of C, making it ideal for data science, machine learning, scientific simulations, computational biology, and high-performance numerical computing. Julia’s multiple dispatch and just-in-time compilation enable both rapid prototyping and production performance.

The Julia exporter generates Julia struct definitions with JSON serialization support using JSON3 and StructTypes.

Features:

  • Julia mutable structs with keyword constructors
  • JSON serialization via JSON3
  • Union types for optional and multi-type fields
  • Type-safe field definitions
  • Module-based organization

Usage:

md-models convert -i <model> -t julia

Example:

md-models convert -i model.md -t julia -o models.jl

JSON-LD Support: ❌ No


Template Options

All templates support passing options via the -O or --options flag:

md-models convert -i <model> -t <template> -O option1,option2

Multiple options can be passed as a comma-separated list. Available options vary by template:

  • Python Pydantic: astropy
  • TypeScript Zod: json-ld
  • Rust: jsonld
  • Go: gorm, xml

Options are passed to the template as configuration flags and can modify the generated code structure and features.
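
If you drive the CLI from a script, the comma-separated option list can be assembled programmatically. The following Python sketch is illustrative only: the helper name convert_command and the KNOWN_OPTIONS table are hypothetical, and the table covers just the templates and options listed above.

```python
# Options listed above, keyed by the template name passed to -t.
# Other exporters may accept further options; extend the table as needed.
KNOWN_OPTIONS = {
    "python-pydantic": {"astropy"},
    "typescript-zod": {"json-ld"},
    "rust": {"jsonld"},
    "golang": {"gorm", "xml"},
}


def convert_command(model, template, output=None, options=()):
    """Assemble an `md-models convert` call with a comma-separated -O value."""
    unknown = set(options) - KNOWN_OPTIONS.get(template, set())
    if unknown:
        raise ValueError(f"options not listed for {template}: {sorted(unknown)}")
    cmd = ["md-models", "convert", "-i", str(model), "-t", template]
    if options:
        # Multiple options are joined into one comma-separated argument.
        cmd += ["-O", ",".join(options)]
    if output is not None:
        cmd += ["-o", str(output)]
    return cmd
```

For example, convert_command("model.md", "golang", "models.go", ["gorm", "xml"]) yields the argument list for md-models convert -i model.md -t golang -O gorm,xml -o models.go, which can be handed to subprocess.run.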


Language Comparison

| Language | Primary Use Case | Runtime Validation | Type Safety | JSON-LD Support | Best For |
|---|---|---|---|---|---|
| Python | Web APIs, Data Science | ✅ Yes | ✅ Yes | ✅ Yes | FastAPI, Data pipelines, Scientific computing |
| TypeScript | Web Development, Frontend/Backend | ✅ Yes | ✅ Yes | ✅ Yes | React, Node.js, Type-safe APIs |
| Rust | Systems Programming, Performance | ✅ Yes | ✅ Yes | ✅ Yes | Web servers, Embedded systems, High performance |
| Go | Backend Services, Microservices | ✅ Yes | ✅ Yes | ❌ No | Cloud-native apps, Distributed systems |
| Julia | Scientific Computing, Data Science | ✅ Yes | ✅ Yes | ❌ No | Numerical computing, Machine learning |

Schema languages

MD-Models can export your data models to various schema languages, enabling validation, semantic annotation, and interoperability across different systems and standards.

JSON Schema

JSON Schema is a vocabulary that allows you to annotate and validate JSON documents. It provides a way to describe the structure of JSON data, making it ideal for API documentation, data validation, form generation, and ensuring data consistency across systems.

The JSON Schema exporter generates JSON Schema Draft 2020-12 compliant schemas from your MD-Models data model. These schemas can be used to validate JSON data, generate API documentation, and provide type information for various tools and frameworks.

Usage:

md-models convert -i <model> -t json-schema -r <root_object>

Example:

md-models convert -i model.md -t json-schema -r MyObject -o schema.json

Options:

  • openai: Remove options from schema properties (OpenAI function calling compatibility)

    md-models convert -i model.md -t json-schema -r MyObject -O openai -o schema.json
    

Features:

  • Generates JSON Schema Draft 2020-12 compliant schemas
  • Includes all referenced types in $defs section
  • Supports required fields, default values, and constraints
  • Handles nested objects and enumerations
  • Optional OpenAI-compatible mode

Note: The -r or --root parameter is required to specify which object should be the root of the schema.
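
To make the role of the $defs section concrete, here is a small Python sketch of how a consumer might follow a local $ref into $defs. The schema literal is hand-written for illustration and is not actual exporter output; the Nested type and its properties are hypothetical.

```python
def resolve_local_ref(schema, ref):
    """Follow a local JSON Pointer reference such as '#/$defs/Nested'."""
    if not ref.startswith("#/"):
        raise ValueError(f"only local references are handled: {ref}")
    node = schema
    for part in ref[2:].split("/"):
        # Undo JSON Pointer escaping (~1 is '/', ~0 is '~').
        node = node[part.replace("~1", "/").replace("~0", "~")]
    return node


# Hand-written example in the Draft 2020-12 layout, not generated output.
schema = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "title": "MyObject",
    "type": "object",
    "properties": {"nested": {"$ref": "#/$defs/Nested"}},
    "$defs": {
        "Nested": {
            "type": "object",
            "properties": {"value": {"type": "integer"}},
        }
    },
}

nested = resolve_local_ref(schema, schema["properties"]["nested"]["$ref"])
```

Because referenced types are collected in $defs, a single schema file is enough to resolve every reference reachable from the root object.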


JSON Schema All

The JSON Schema All exporter generates separate JSON Schema files for each object in your data model. This is useful when you need individual schemas for each type rather than a single schema with definitions.

Usage:

md-models convert -i <model> -t json-schema-all -o <output_directory>

Example:

md-models convert -i model.md -t json-schema-all -o schemas/

Features:

  • Generates one JSON Schema file per object type
  • Each file is named after the object (e.g., MyObject.json)
  • Useful for API documentation where each endpoint has its own schema
  • Output directory is required (will be created if it doesn’t exist)

XML Schema (XSD)

XML Schema Definition (XSD) is a W3C standard for describing and validating the structure and content of XML documents. XSD is widely used in enterprise systems, SOAP web services, document exchange, and legacy system integration.

The XML Schema exporter generates W3C XML Schema 1.1 compliant schemas from your MD-Models data model. These schemas define the structure, data types, and constraints for XML documents.

Usage:

md-models convert -i <model> -t xml-schema

Example:

md-models convert -i model.md -t xml-schema -o schema.xsd

Features:

  • Generates W3C XML Schema 1.1 compliant schemas
  • Supports XML attributes and elements
  • Handles required/optional fields and cardinality (minOccurs/maxOccurs)
  • Supports default values
  • Generates complex types for nested objects
  • Supports XML namespace declarations

Generated Output:

  • Complex type definitions for each object
  • Simple type definitions for enumerations
  • Element and attribute declarations
  • Namespace prefixes and target namespaces

SHACL

SHACL (Shapes Constraint Language) is a W3C standard for validating RDF data graphs. SHACL shapes describe constraints that RDF data must satisfy, making it ideal for data quality validation, semantic web applications, and linked data validation.

The SHACL exporter generates SHACL shapes in Turtle (TTL) format from your MD-Models data model. These shapes can be used with SHACL validators to check RDF data against your model constraints.

Usage:

md-models convert -i <model> -t shacl

Example:

md-models convert -i model.md -t shacl -o shapes.ttl

Features:

  • Generates SHACL NodeShapes for each object type
  • Property shapes with cardinality constraints (sh:minCount, sh:maxCount)
  • Datatype constraints (sh:datatype)
  • Enumeration constraints (sh:in)
  • Class constraints (sh:class)
  • Term-based property paths (supports semantic annotations)
  • Class-scoped prefixes for semantic web integration

Requirements:

  • Objects must have semantic terms defined (via - Term: <term> annotations)
  • The model must include ontology prefixes for proper RDF namespace handling

Use Cases:

  • Validating RDF/JSON-LD data against semantic models
  • Data quality assurance in knowledge graphs
  • Semantic web application development
  • Linked data validation pipelines

ShEx

ShEx (Shape Expressions) is a language for describing and validating RDF graphs. Similar to SHACL, ShEx provides a concise syntax for expressing constraints on RDF data, making it popular in bioinformatics, semantic web applications, and RDF validation tools.

The ShEx exporter generates ShEx shape expressions from your MD-Models data model. ShEx provides a more compact syntax than SHACL and is well-suited for validating RDF data in various formats.

Usage:

md-models convert -i <model> -t shex

Example:

md-models convert -i model.md -t shex -o shapes.shex

Features:

  • Generates ShEx shape definitions for each object type
  • Cardinality constraints (?, *, +)
  • Datatype constraints
  • Enumeration constraints
  • Class constraints
  • Term-based property paths
  • Class-scoped prefixes

Requirements:

  • Objects must have semantic terms defined (via - Term: <term> annotations)
  • The model must include ontology prefixes for proper RDF namespace handling

Use Cases:

  • RDF data validation
  • Semantic web applications
  • Bioinformatics data validation
  • Knowledge graph quality assurance

OWL

OWL (Web Ontology Language) is a W3C standard for defining ontologies on the semantic web. OWL allows you to express rich logical relationships between concepts, making it ideal for knowledge representation, semantic reasoning, and building sophisticated knowledge graphs.

The OWL exporter generates OWL 2 ontologies in Turtle (TTL) format from your MD-Models data model. These ontologies define classes, properties, and their relationships in a machine-readable format suitable for semantic reasoning and inference.

Usage:

md-models convert -i <model> -t owl

Example:

md-models convert -i model.md -t owl -o ontology.ttl

Features:

  • Generates OWL 2 ontology definitions
  • Class definitions with rdfs:comment descriptions
  • Object properties for relationships between classes
  • Datatype properties for primitive attributes
  • Enumeration classes for enum types
  • Subclass relationships (rdfs:subClassOf)
  • Property domain and range constraints
  • Ontology metadata and prefixes

Generated Output:

  • OWL ontology header with metadata
  • Class definitions for each object type
  • Object properties for relationships
  • Datatype properties for attributes
  • Enumeration class definitions
  • Property constraints (domain, range, cardinality)

Use Cases:

  • Building semantic knowledge bases
  • Semantic reasoning and inference
  • Knowledge graph construction
  • Semantic web application development
  • Ontology-driven data integration

JSON-LD

JSON-LD (JSON for Linked Data) is a method of encoding linked data using JSON. It provides a way to add semantic meaning to JSON data through context definitions, making JSON data part of the semantic web while maintaining compatibility with existing JSON tools.

The JSON-LD exporter generates JSON-LD context headers (@context, @id, @type) from your MD-Models data model. These headers provide the semantic context needed to interpret JSON data as linked data.

Usage:

md-models convert -i <model> -t json-ld -r <root_object>

Example:

md-models convert -i model.md -t json-ld -r MyObject -o context.jsonld

Features:

  • Generates JSON-LD @context with term definitions
  • Includes @id and @type for the root object
  • Maps model attributes to semantic terms
  • Supports nested contexts for complex object graphs
  • Includes ontology prefixes and namespace mappings

Note: The -r or --root parameter is optional. If not specified, the first object in the model is used as the root.

Use Cases:

  • Adding semantic annotations to JSON APIs
  • Creating linked data from structured JSON
  • Semantic web data serialization
  • Knowledge graph data exchange
  • Schema.org and other vocabulary integration

LinkML

LinkML (Linked Data Modeling Language) is a modeling language for building schemas that can be used to generate various artifacts including JSON Schema, Python classes, RDF, and more. LinkML provides a YAML-based schema format that bridges the gap between data modeling and code generation.

The LinkML exporter generates LinkML YAML schemas from your MD-Models data model. These schemas can be used with the LinkML toolkit to generate code, documentation, and other artifacts in multiple formats.

Usage:

md-models convert -i <model> -t linkml -o schema.yaml

Example:

md-models convert -i model.md -t linkml -o schema.yaml

Features:

  • Generates LinkML YAML schema format
  • Class definitions with attributes
  • Slot definitions (shared attributes across classes)
  • Enumeration definitions
  • Prefix and import declarations
  • Tree root identification
  • Dependency-aware class ordering

Generated Output:

  • LinkML schema header with ID, name, and prefixes
  • Class definitions with slot usage
  • Global slot definitions
  • Enumeration definitions with permissible values
  • Import declarations for external schemas

Use Cases:

  • Multi-format code generation via LinkML toolkit
  • Schema-driven development workflows
  • Interoperability between different schema formats
  • Biomedical and scientific data modeling
  • Building comprehensive data model ecosystems

Schema Comparison

| Schema Format | Primary Use Case | Output Format | Validation Target |
|---|---|---|---|
| JSON Schema | JSON data validation | JSON | JSON documents |
| XML Schema | XML document validation | XML (XSD) | XML documents |
| SHACL | RDF graph validation | Turtle (TTL) | RDF graphs |
| ShEx | RDF graph validation | ShEx | RDF graphs |
| OWL | Ontology definition | Turtle (TTL) | Knowledge graphs |
| JSON-LD | Linked data context | JSON | JSON-LD documents |
| LinkML | Multi-format schema | YAML | Various formats |

Each schema format serves different purposes in the data modeling ecosystem, from runtime validation (JSON Schema, XML Schema) to semantic web technologies (SHACL, ShEx, OWL) to multi-format code generation (LinkML).

API specifications

MD-Models can export your data models to API specification formats, enabling you to generate type-safe API schemas and client/server code.

GraphQL

The GraphQL exporter generates GraphQL Schema Definition Language (SDL) files from your MD-Models data model. This allows you to:

  • Define GraphQL types, enums, and queries based on your data model
  • Generate type-safe GraphQL APIs
  • Use union types for attributes with multiple possible types
  • Automatically create query types for fetching collections and filtering by attributes

Usage

To generate a GraphQL schema from your MD-Models file:

md-models convert -i <model> -t graphql

For example:

md-models convert -i model.md -t graphql -o schema.graphql

Generated Output

The GraphQL exporter generates:

  • Type definitions: Each object in your model becomes a GraphQL type with its attributes as fields
  • Union types: Attributes with multiple possible types are converted to GraphQL union types
  • Enum definitions: Enumerations from your model are converted to GraphQL enums
  • Query type: Automatically generates query operations including:
    • all{ObjectName}s: Query to fetch all instances of an object type
    • {objectName}By{AttributeName}: Query to filter objects by scalar attribute values

The exporter automatically maps MD-Models types to GraphQL scalar types:

  • integer → Int
  • float / number → Float
  • boolean → Boolean
  • string → String
  • bytes → String
  • date → String

Fields are marked as required (using !) based on the required attribute in your model, and arrays are represented using GraphQL list syntax [Type].
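
The mapping rules above can be sketched in a few lines of Python. This is an illustration of the described behaviour, not the exporter's implementation: the function name is hypothetical, and two details are assumptions, namely that non-basic type names are referenced unchanged and that `!` on an array applies to the list and not to its items.

```python
# Scalar mapping copied from the list above.
GRAPHQL_SCALARS = {
    "integer": "Int",
    "float": "Float",
    "number": "Float",
    "boolean": "Boolean",
    "string": "String",
    "bytes": "String",
    "date": "String",
}


def graphql_type(dtype, required=False, is_array=False):
    """Render the GraphQL type for one attribute."""
    # Assumption: anything that is not a basic type is an object or enum
    # name and is referenced unchanged.
    rendered = GRAPHQL_SCALARS.get(dtype, dtype)
    if is_array:
        rendered = f"[{rendered}]"
    # Assumption: for arrays, `!` marks the list itself as required.
    if required:
        rendered += "!"
    return rendered
```

Under these assumptions, a required integer attribute renders as Int! and an optional array of strings as [String].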

Protobuf

The Protobuf exporter generates Protocol Buffer (protobuf) message definitions from your MD-Models data model. Protocol Buffers are Google’s language-neutral, platform-neutral mechanism for serializing structured data, commonly used for:

  • Efficient data serialization in microservices
  • Cross-language data exchange
  • gRPC service definitions
  • High-performance data storage and transmission

Usage

To generate a Protocol Buffer schema from your MD-Models file:

md-models convert -i <model> -t protobuf

For example:

md-models convert -i model.md -t protobuf -o schema.proto

Generated Output

The Protobuf exporter generates:

  • Message definitions: Each object in your model becomes a protobuf message type
  • Enum definitions: Enumerations from your model are converted to protobuf enums
  • OneOf types: Attributes with multiple possible types are converted to protobuf oneof fields
  • Field rules: Fields are marked as repeated for arrays or optional for non-required fields
  • Package declaration: Uses the model title (or “model” if not specified) as the package name

The exporter uses proto3 syntax and automatically maps MD-Models types to protobuf types:

  • string → string
  • float → double
  • int → int32
  • bool → bool
  • Object types and enums are preserved as-is

Each field is assigned a unique field number starting from 1, following protobuf conventions.
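
As a rough illustration of these rules, the Python sketch below renders a proto3 message from a list of fields. It is not the exporter's code: the function name and the tuple layout are hypothetical, the type table uses the keys exactly as listed above, and the sketch assumes that an array field is always rendered as repeated, whether or not it is required.

```python
# Type mapping copied from the list above; other names pass through as-is.
PROTO_TYPES = {"string": "string", "float": "double", "int": "int32", "bool": "bool"}


def render_message(name, fields):
    """Render a proto3 message from (field_name, dtype, required, is_array) tuples."""
    lines = [f"message {name} {{"]
    # Field numbers start at 1 and follow declaration order.
    for number, (field, dtype, required, is_array) in enumerate(fields, start=1):
        proto_type = PROTO_TYPES.get(dtype, dtype)
        if is_array:
            label = "repeated "
        elif not required:
            label = "optional "
        else:
            label = ""
        lines.append(f"  {label}{proto_type} {field} = {number};")
    lines.append("}")
    return "\n".join(lines)
```

Calling render_message("Sample", [("name", "string", True, False), ("values", "float", False, True)]) produces a message with string name = 1; and repeated double values = 2;.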


API Specification Comparison

| Format | Primary Use Case | Serialization Format | Type System | Best For |
|---|---|---|---|---|
| GraphQL | Flexible API queries | JSON | Strong typing | Modern web APIs, Flexible data fetching |
| Protobuf | Efficient data serialization | Binary/Text | Strong typing | Microservices, gRPC, High-performance APIs |

Documentation

MD-Models can export your data models to documentation formats, making it easy to generate comprehensive, interactive documentation for your data structures.

MkDocs

The MkDocs exporter generates Markdown documentation files formatted for use with MkDocs, a static site generator for project documentation. This exporter creates human-readable documentation that includes:

  • Visual graph representation of your data model structure
  • Detailed type definitions with all attributes and their properties
  • Enumeration documentation with value mappings
  • Cross-references between related types
  • Ontology and prefix information

Usage

To generate MkDocs documentation from your MD-Models file:

md-models convert -i <model> -t mk-docs

For example:

md-models convert -i model.md -t mk-docs -o documentation.md

Generated Output

The MkDocs exporter generates a comprehensive documentation page that includes:

Visual Graph

An interactive Mermaid flowchart that visualizes:

  • All object types and enumerations in your model
  • Relationships between types (when one type references another)
  • Clickable links to navigate to each type’s documentation section

The graph is displayed in a collapsible quote block using MkDocs’ ??? quote syntax.

Ontologies Section

If your model includes ontology prefixes, they are listed with links to their namespace URIs. This helps users understand the semantic context of your data model.

Types Section

Each object in your model is documented with:

  • Type name: The name of the object type
  • Description: The docstring from your model
  • Attributes: A detailed list of all attributes including:
    • Attribute name (marked with * if required)
    • Data type(s) with automatic linking to referenced types
    • Array notation (list[Type]) for multiple values
    • Attribute description/docstring
    • Default values (if specified)
    • Additional options (e.g., primary key, unique constraints)

Types are automatically cross-referenced - when an attribute references another type in your model, it becomes a clickable link to that type’s section.

Enumerations Section

All enumerations are documented in table format showing:

  • Enumeration name and description
  • A table mapping each enum alias to its corresponding value

Integration with MkDocs

The generated Markdown file can be directly included in your MkDocs site. To use it:

  1. Add the generated file to your docs/ directory
  2. Include it in your mkdocs.yml navigation:

    nav:
      - Home: index.md
      - API Reference: documentation.md

  3. The documentation will automatically render with:
    • Syntax highlighting for code blocks
    • Interactive Mermaid diagrams (requires mkdocs-mermaid2-plugin)
    • Cross-referenced links between types
    • Responsive tables for enumerations

Configuration Options

The MkDocs exporter supports a nav configuration option. When set, the generated documentation will include navigation elements. Without it, the navigation is hidden by default, making it suitable for embedding in existing MkDocs sites.


Documentation Format Comparison

| Format | Output Format | Interactive Features | Best For |
|---|---|---|---|
| MkDocs | Markdown | ✅ Diagrams, Links | Project documentation, API docs |

FAQ

My model is not validating

This section highlights the most common mistakes that break validation rules. Understanding these pitfalls will help you write valid models and avoid errors during validation and code generation.

Naming Issues

❌ Names Starting with Numbers

Problem: Object and attribute names cannot start with a number.

### 1Test          <!-- ❌ Invalid: starts with number -->
- 1number: string <!-- ❌ Invalid: starts with number -->

Solution: Use letters at the beginning of names.

### Test1          <!-- ✅ Valid -->
- number1: string <!-- ✅ Valid -->

❌ Names with Whitespace

Problem: Object and attribute names cannot contain spaces.

### Test Object    <!-- ❌ Invalid: contains space -->
- some name: string <!-- ❌ Invalid: contains space -->

Solution: Use underscores or camelCase instead.

### TestObject     <!-- ✅ Valid -->
- some_name: string <!-- ✅ Valid -->
- someName: string <!-- ✅ Valid -->

❌ Names with Special Characters

Problem: Only alphanumeric characters and underscores are allowed in names.

### User-Profile   <!-- ❌ Invalid: contains hyphen -->
- user.name: string <!-- ❌ Invalid: contains dot -->

Solution: Use underscores or camelCase.

### UserProfile    <!-- ✅ Valid -->
- user_name: string <!-- ✅ Valid -->
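
The three naming rules above can be checked before you run the validator. The Python sketch below is one possible reading of them, not the validator's actual implementation: it assumes ASCII names and requires a leading letter, so it also rejects a leading underscore, which may be stricter than MD-Models itself. The function name is hypothetical.

```python
import re

# One reading of the naming rules: a leading letter, followed by letters,
# digits, or underscores. ASCII only, which is an assumption of this sketch.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_name(name):
    """Return True if the whole name matches the pattern above."""
    return NAME_PATTERN.fullmatch(name) is not None
```

With this check, Test1, some_name, and someName pass, while 1Test, Test Object, and User-Profile are rejected.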

Type Definition Issues

❌ Missing Type Definitions

Problem: Every attribute must have a type specified.

### User
- name            <!-- ❌ Invalid: no type -->
- email: string   <!-- ✅ Valid -->

Solution: Always specify a type using - <property>: <TYPE> syntax.

### User
- name: string    <!-- ✅ Valid -->
- email: string   <!-- ✅ Valid -->

❌ Undefined Type References

Problem: Referenced types must exist in the model or be basic types.

### User
- profile: UserProfile <!-- ❌ Invalid if UserProfile doesn't exist -->
- name: string          <!-- ✅ Valid: basic type -->

Solution: Either define the referenced type or use a basic type (string, number, integer, boolean, float, date, bytes).

### UserProfile    <!-- Define the type first -->
- bio: string

### User
- profile: UserProfile <!-- ✅ Valid -->
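
Conceptually, this rule is a lookup of every attribute type against the basic types and the names defined in the model. The Python sketch below illustrates the idea on a plain dictionary; it does not parse MD-Models files, ignores array notation, and treats enumerations as an optional extra set of names. All function and parameter names are hypothetical.

```python
# Basic types as listed above.
BASIC_TYPES = {"string", "number", "integer", "boolean", "float", "date", "bytes"}


def undefined_references(objects, enums=()):
    """Find attribute types that are neither basic nor defined in the model.

    `objects` maps an object name to a dict of attribute name -> type name.
    Returns (object, attribute, type) tuples in declaration order.
    """
    defined = set(objects) | set(enums)
    missing = []
    for obj, attributes in objects.items():
        for attribute, dtype in attributes.items():
            if dtype not in BASIC_TYPES and dtype not in defined:
                missing.append((obj, attribute, dtype))
    return missing
```

For the invalid example above, the function reports ("User", "profile", "UserProfile"); once UserProfile is added to the dictionary, the result is empty.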

❌ Incorrect Type Syntax

Problem: Using wrong keywords or syntax for type definitions.

### User
- name
  - DType: string <!-- ❌ Invalid: should be "Type" not "DType" -->

Solution: Use the correct syntax: - <property>: <TYPE> or - Type: <TYPE>.

### User
- name: string    <!-- ✅ Valid: inline syntax -->
- email            <!-- ✅ Valid: block syntax -->
  - Type: string

Duplicate Definitions

❌ Duplicate Object Names

Problem: Each object must have a unique name within the model.

### User
- name: string

### User  <!-- ❌ Invalid: duplicate name -->
- email: string

Solution: Rename one of the duplicate objects.

### User
- name: string

### UserProfile  <!-- ✅ Valid: unique name -->
- email: string

❌ Duplicate Attribute Names

Problem: Each attribute within an object must have a unique name.

### User
- name: string
- name: string  <!-- ❌ Invalid: duplicate attribute -->

Solution: Rename one of the duplicate attributes or combine them if they represent the same concept.

### User
- first_name: string
- last_name: string  <!-- ✅ Valid: unique names -->

❌ Duplicate Enum Names

Problem: Each enumeration must have a unique name within the model.

### Status

```text
ACTIVE = "active"
```

### Status  <!-- ❌ Invalid: duplicate enum name -->

```text
INACTIVE = "inactive"
```

Solution: Rename one of the duplicate enumerations.

### UserStatus

```text
ACTIVE = "active"
```

### AccountStatus  <!-- ✅ Valid: unique name -->

```text
INACTIVE = "inactive"
```
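
All three duplicate checks in this section (objects, attributes, enumerations) reduce to the same question: does a name occur more than once within its scope? A minimal sketch of that idea, using a hypothetical helper `find_duplicates` that is not part of MD-Models:

```python
from collections import Counter


def find_duplicates(names: list[str]) -> list[str]:
    """Return each name that occurs more than once, in first-seen order."""
    counts = Counter(names)
    return [name for name, count in counts.items() if count > 1]


# Enumeration headings from the invalid and valid examples above.
print(find_duplicates(["Status", "Status"]))               # ['Status']
print(find_duplicates(["UserStatus", "AccountStatus"]))    # []

# The same helper applies to the attribute names of a single object.
print(find_duplicates(["name", "name"]))                   # ['name']
```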

Structure Issues

❌ Empty Models

Problem: Models must contain at least one object definition.

# My Model

<!-- ❌ Invalid: no objects defined -->

Solution: Add at least one object to your model.

# My Model

### User
- name: string  <!-- ✅ Valid: model has objects -->

❌ Empty Objects

Problem: Objects must have at least one attribute (unless allow_empty: true is set).

### EmptyObject
<!-- ❌ Invalid: no attributes -->

Solution: Add at least one attribute or set allow_empty: true in frontmatter.

---
allow_empty: true
---

### EmptyObject  <!-- ✅ Valid with allow_empty -->

Or:

### EmptyObject
- id: string  <!-- ✅ Valid: has at least one attribute -->

XML Option Issues

❌ Empty XML Options

Problem: XML options cannot be empty or contain only special characters.

### User
- name: string
  - XML:        <!-- ❌ Invalid: empty -->
  - XML: @      <!-- ❌ Invalid: only special character -->

Solution: Provide a valid XML element name or use proper attribute syntax.

### User
- name: string
  - XML: userName  <!-- ✅ Valid: element name -->
  - XML: @id       <!-- ✅ Valid: attribute with name -->

❌ Special Characters in XML Options

Problem: XML element and attribute names cannot contain special characters like colons (unless part of a namespace prefix).

### User
- name: string
  - XML: schema:hello  <!-- ❌ Invalid: colon not allowed in element names -->
  - XML: @schema:hello <!-- ❌ Invalid: colon not allowed in attribute names -->

Solution: Use valid XML names without special characters, or restructure your XML serialization.

### User
- name: string
  - XML: hello         <!-- ✅ Valid: simple name -->
  - XML: @hello        <!-- ✅ Valid: attribute -->

❌ Invalid XML Wrapped Syntax

Problem: XML wrapped options can only contain two levels of nesting.

### Test
- items: string[]
  - XML: some/other/path  <!-- ❌ Invalid: more than 2 levels -->

Solution: Limit XML wrapped paths to two levels, or create intermediate objects for deeper nesting.

### Test
- items: string[]
  - XML: items/item  <!-- ✅ Valid: exactly 2 levels -->

Or create intermediate objects:

### ItemList
- items: Item[]

### Item
- value: string

### Test
- list: ItemList
  - XML: list/items  <!-- ✅ Valid: uses intermediate object -->
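
The two-level limit can be pictured as counting the segments of the path. The sketch below is an illustration under that assumption; `is_valid_wrapped` is a hypothetical helper, and the real validator may apply further rules to each segment.

```python
def is_valid_wrapped(xml_option: str, max_depth: int = 2) -> bool:
    """Return True if the path has at most `max_depth` non-empty segments."""
    segments = xml_option.split("/")
    # `all(segments)` rejects empty segments such as the one in "items/".
    return all(segments) and len(segments) <= max_depth


print(is_valid_wrapped("items/item"))       # True
print(is_valid_wrapped("some/other/path"))  # False
```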

❌ Invalid Multiple Types with XML

Problem: When using multiple types, every XML option must be valid and the number of XML options must match the number of types.

### Test
- value: string, float
  - XML: fine, bad:  <!-- ❌ Invalid: second XML option has colon -->

Solution: Provide valid XML options for each type, matching the type count.

### Test
- value: string, float
  - XML: stringValue, floatValue  <!-- ✅ Valid: matches type count -->
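
The XML rules in this section (no empty options, no colons, one option per type) can be combined into one check. This is a rough sketch under those assumptions only; `check_xml_options` is a hypothetical function, and the actual validator may enforce additional constraints on XML names.

```python
def split_list(value: str) -> list[str]:
    """Split a comma-separated value and trim surrounding whitespace."""
    return [part.strip() for part in value.split(",")]


def check_xml_options(types: str, xml: str) -> list[str]:
    """Return a list of problems found for one attribute's XML options."""
    type_names = split_list(types)
    options = split_list(xml)
    problems = []
    if len(options) != len(type_names):
        problems.append(f"expected {len(type_names)} XML options, found {len(options)}")
    for option in options:
        name = option.removeprefix("@")  # a leading "@" marks an XML attribute
        if not name or ":" in name:
            problems.append(f"invalid XML option: {option!r}")
    return problems


print(check_xml_options("string, float", "fine, bad:"))
print(check_xml_options("string, float", "stringValue, floatValue"))
```

The first call reports the option containing a colon; the second returns an empty list because both options are non-empty, colon-free, and match the two declared types.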

Reserved Names

❌ Using Reserved Names

Problem: Certain names like __other__ are reserved and cannot be used as attribute names.

### Test
- __other__: string  <!-- ❌ Invalid: reserved name -->

Solution: Use a different name that doesn’t conflict with reserved keywords.

### Test
- other_value: string  <!-- ✅ Valid: not reserved -->

Inheritance Issues

❌ Invalid Inheritance Syntax

Problem: Inheritance syntax must be properly formatted.

### Test [  <!-- ❌ Invalid: incomplete syntax -->
- name: string

Solution: Use proper inheritance syntax with a valid parent type.

### Test [Parent]  <!-- ✅ Valid: proper inheritance -->
- name: string

Or if no inheritance:

### Test  <!-- ✅ Valid: no inheritance -->
- name: string

Array Type Issues

❌ Mixing Array Syntax Incorrectly

Problem: When using multiple types with arrays, the syntax must be consistent.

### Test
- primitive: string[], integer, float, boolean  <!-- ❌ Invalid: mixing array and non-array types incorrectly -->

Solution: Use consistent array syntax or separate into different attributes.

### Test
- strings: string[]      <!-- ✅ Valid: array of strings -->
- numbers: integer[]     <!-- ✅ Valid: array of integers -->
- single_value: string   <!-- ✅ Valid: single value -->

Or use union types properly:

### Test
- values: string | integer | float  <!-- ✅ Valid: union type -->

Quick Reference: Valid vs Invalid

| Issue | ❌ Invalid | ✅ Valid |
| --- | --- | --- |
| Name starts with number | `1User`, `123test` | `User1`, `test123` |
| Name with space | `User Profile`, `user name` | `UserProfile`, `user_name` |
| Name with special char | `User-Profile`, `user.name` | `UserProfile`, `user_name` |
| Missing type | `- name` | `- name: string` |
| Undefined type | `- profile: Profile` (if `Profile` doesn’t exist) | `- profile: Profile` (if `Profile` exists) |
| Duplicate object | Two `### User` sections | Unique object names |
| Duplicate attribute | Two `- name: string` in same object | Unique attribute names |
| Empty model | No objects defined | At least one object |
| Empty object | `### User` with no attributes | `### User` with attributes or `allow_empty: true` |
| Empty XML option | `- XML:` (empty) | `- XML: elementName` |
| XML wrapped depth | `- XML: a/b/c/d` (3+ levels) | `- XML: a/b` (2 levels) |
| Reserved name | `- __other__: string` | `- other: string` |

Tips for Avoiding Common Mistakes

  1. Always validate first: Run md-models validate -i model.md before generating code
  2. Start simple: Begin with basic types and simple structures, then add complexity
  3. Use descriptive names: Follow naming conventions from the start to avoid refactoring later
  4. Check type references: Ensure all referenced types exist before using them
  5. Test incrementally: Add objects and attributes one at a time, validating as you go
  6. Read error messages carefully: Validation errors include line numbers and specific solutions
  7. Use basic types when possible: Prefer string, number, integer, boolean over custom types when appropriate
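
The "validate first" tip can be wired into a script so that code generation only runs on a valid model. The following is a sketch, assuming that `md-models` is on the `PATH` and that it exits with a non-zero status when validation fails; neither assumption is confirmed by this section, and the helper names are hypothetical.

```python
import subprocess


def build_validate_command(model_path: str) -> list[str]:
    """Build the validation command quoted in tip 1."""
    return ["md-models", "validate", "-i", model_path]


def validate_model(model_path: str) -> bool:
    """Run the validator and report whether it succeeded.

    Assumption: the CLI exits with a non-zero status when validation fails.
    """
    try:
        result = subprocess.run(
            build_validate_command(model_path),
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        print("md-models was not found on PATH")
        return False
    if result.returncode != 0:
        # Show whatever the tool printed so the reported errors are visible.
        print(result.stdout + result.stderr)
    return result.returncode == 0
```

Calling `validate_model("model.md")` before a generation step and stopping when it returns `False` keeps invalid models from reaching the code generator.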