md4qt C++ Classes

Static C++ library for parsing Markdown.

Namespaces

MD	Main namespace of md4qt library
MD::details	Namespace for implementation details that are still useful for reuse

Classes

MD::ATXHeadingParser	ATX heading parser
MD::Anchor	Just an anchor
MD::AsteriskEmphasisParser	Asterisk emphasis parser
MD::AutolinkParser	Autolink parser
MD::Block	Abstract block (storage of child items)
MD::BlockParser	Base class for parsing a block
MD::Blockquote	Blockquote
MD::BlockquoteParser	Blockquote parser
MD::Code	Code
MD::Context	Parsing context
MD::Document	Document
MD::EmphasisParser	Emphasis parser
MD::FencedCodeParser	Fenced code parser
MD::Footnote	Footnote
MD::FootnoteParser	Footnote parser
MD::FootnoteRef	Footnote reference
MD::GfmAutolinkParser	GFM autolink parser
MD::HTMLParser	HTML block parser
MD::HardLineBreakParser	Hard line break parser
MD::Heading	Heading
MD::HorizontalLine	Horizontal line
MD::Image	Image
MD::IndentedCodeParser	Indented code parser
MD::InlineCodeParser	Inline code parser
MD::InlineContext	Inline parsing context
MD::InlineContext::Delimiter	Description
MD::InlineHtmlParser	Inline HTML parser
MD::InlineMathParser	Inline math parser
MD::InlineParser	Base class for parsing inlines
MD::Item	Base class for item in Markdown document
MD::ItemWithOpts	Base class for items with style options
MD::Line	Text line in the Markdown input document
MD::Line::RollbackLine	Struct to make a rollback of line state on destruction
MD::LineBreak	Line break
MD::Link	Link
MD::LinkBase	Base class for links
MD::LinkImageParser	Link and image parser
MD::List	List
MD::ListItem	List item in a list
MD::ListParser	List parser
MD::Math	LaTeX math expression
MD::PageBreak	Page break
MD::Paragraph	Paragraph
MD::ParagraphParser	Paragraph parser
MD::ParagraphStream	Text stream for paragraph processing
MD::ParagraphStream::State	State of the stream
MD::Parser	Markdown parser
MD::PosCache	Cache of Markdown items to be accessed via position
MD::PosCache::Items	Vector of items
MD::RawHtml	Raw HTML
MD::ReverseSolidusHandler	Helper for processing reverse solidus characters
MD::SetextHeadingParser	Setext heading parser
MD::StrikethroughEmphasisParser	Strikethrough emphasis parser
MD::StyleDelim	Emphasis in the Markdown document
MD::Table	Table
MD::TableCell	Table cell
MD::TableParser	Table parser
MD::TableRow	Table row
MD::Text	Item in Paragraph
MD::TextStream	Actual text stream
MD::TextStreamBase	Base class for text stream
MD::ThematicBreakParser	Thematic break parser
MD::UnderlineEmphasisParser	Underline emphasis parser
MD::Visitor	Interface to walk through the MD::Document
MD::WithPosition	Base for anything with start and end position
MD::YAMLHeader	YAML header item in the document
MD::YAMLParser	YAML parser
MD::details::AlgoVisitor	Visitor for algorithms
MD::details::AlgoVisitor::IncrementNestingLevel	Auxiliary structure for MD::details::AlgoVisitor
MD::details::HtmlVisitor	HTML visitor interface to walk through the MD::Document
MD::details::HtmlVisitor::FootnoteRefStuff	Auxiliary struct to process footnotes
MD::details::PosRange	Cached position of MD::Item

Detailed Description

md4qt is a static C++ library for parsing Markdown.

md4qt supports the CommonMark 0.31.2 specification and some GitHub extensions: tables, footnotes, task lists, strikethrough, LaTeX math injections, and GitHub's autolinks.

This library parses Markdown into a tree structure.

Example

#include <md4qt/parser.h>

#include <QDebug>

int main()
{
    MD::Parser p;

    auto doc = p.parse(QStringLiteral("your_markdown.md"));

    for (auto it = doc->items().cbegin(), last = doc->items().cend(); it != last; ++it) {
        switch ((*it)->type())
        {
        case MD::ItemType::Anchor :
        {
            auto a = static_cast<MD::Anchor *> (it->get());
            qDebug() << a->label();
        }
            break;

        default :
            break;
        }
    }

    return 0;
}

License

/*
    SPDX-FileCopyrightText: 2025 Igor Mironchik <igor.mironchik@gmail.com>
    SPDX-License-Identifier: MIT
*/

Benchmark

An approximate benchmark against cmark-gfm shows that the Qt6 version of md4qt is about 8 times slower. In exchange, you get a complete C++ tree structure of the Markdown document with all major extensions, plus the sugar and the cherry on the cake.

Markdown library	Result
cmark-gfm	~0.22 ms
md4qt with Qt6	~1.7 ms
md4qt with Qt6 without GitHub auto-links extension	~1.2 ms

Playground

You can try md4qt in action in Markdown Tools, which includes a Markdown editor, a viewer, and a converter to PDF.

KleverNotes from KDE uses md4qt too.

Release notes

Note that version 5.0.0 is API incompatible with 4.x.x. Version 5.0.0 was fully refactored for better performance and to be more user-friendly.
Note that version 4.0.0 is API incompatible with 3.0.0. Version 4.0.0 changed the rules for handling spaces and fully follows the CommonMark standard in this respect. The methods isSpaceBefore() and isSpaceAfter() were removed, and spaces are now presented exactly as they appear in the Markdown, so keep this in mind.

Known issues

You can find a list of known issues here.

What should I know about links in the document?

In some cases a link's URL in Markdown refers to something defined in the document itself. So when you get an MD::Link, check whether the document's labeled links contain a key equal to the link's URL, and if so, use the URL from the labeled link instead:
```
MD::Link *item = ...;

QString url = item->url();

const auto it = doc->labeledLinks().find(url);

if (it != doc->labeledLinks().cend()) {
    url = it->second->url();
}
```

What is the second argument of `MD::Parser::parse()`?

The second argument of MD::Parser::parse() is a flag that tells the parser whether to process Markdown files recursively. If parsing is recursive and the target Markdown file contains links to other Markdown files, those files are parsed too and appear in the resulting document.

What is an `MD::Anchor`?

Because md4qt supports recursive Markdown parsing, the resulting document can contain more than one Markdown file. Each file in the document starts with an MD::Anchor, which simply shows that you have reached a new file while traversing the document.
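
The idea of anchors as file boundaries can be sketched without md4qt at all. The snippet below is only an illustration with stand-in types: `FlatItem` and `groupByFile` are hypothetical names, not md4qt classes, and a real traversal would check the item's type for an anchor and read its label instead.

```cpp
#include <map>
#include <string>
#include <vector>

// Stand-in for a top-level document item (NOT an md4qt type).
struct FlatItem {
    bool isAnchor;     // true for the marker that starts a new file
    std::string value; // file label for anchors, some content otherwise
};

// Groups the content items by the file whose anchor precedes them.
std::map<std::string, std::vector<std::string>> groupByFile(const std::vector<FlatItem> &items)
{
    std::map<std::string, std::vector<std::string>> result;
    std::string currentFile;

    for (const auto &item : items) {
        if (item.isAnchor) {
            // An anchor means "a new file starts here".
            currentFile = item.value;
            result[currentFile]; // make sure empty files are listed too
        } else {
            result[currentFile].push_back(item.value);
        }
    }

    return result;
}
```

Everything between one anchor and the next belongs to the file named by the first of the two.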

Does the library throw exceptions?

No. This library doesn't use exceptions. Any text is valid Markdown, so I don't need to inform the user about errors. Qt itself doesn't use exceptions either. So you can catch only standard C++ exceptions, such as std::bad_alloc.

Why is parsing wrong on Windows with `std::ifstream`?

Such a problem can occur on Windows with MSVC if you open the file in text mode, so for MD::Parser always open std::ifstream with the std::ios::binary flag. And yes, I expect to receive UTF-8 encoded content...
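
A minimal sketch of opening a file with the binary flag, using only the standard library. The helper name `readFileBinary` is hypothetical and not part of md4qt; how the stream or its bytes are then passed to MD::Parser depends on the overload you use.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper (not part of md4qt): reads a whole file as raw bytes.
// std::ios::binary disables the newline translation that text mode performs
// on Windows, so "\r\n" sequences stay exactly as they are on disk.
std::string readFileBinary(const std::string &path)
{
    std::ifstream in(path, std::ios::in | std::ios::binary);

    if (!in.is_open()) {
        return {}; // the file could not be opened, return empty content
    }

    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

The important part is the `std::ios::binary` flag in the constructor call; the rest is ordinary stream reading.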

How can I convert `MD::Document` into `HTML`?

The MD::toHtml() function was implemented in version 2.0.5. You can do the following:

#define MD4QT_QT_SUPPORT
#include <md4qt/parser.h>
#include <md4qt/html.h>

int main()
{
    MD::Parser p;

    auto doc = p.parse(QStringLiteral("your_markdown.md"));

    const auto html = MD::toHtml(doc);

    return 0;
}

How can I obtain positions of blocks/elements in `Markdown` file?

Done in version 2.0.5. Remember that all positions in md4qt start at 0, so the first symbol on the first line has coordinates (0,0). Just as important, all position ranges in md4qt are inclusive, which means that the last column of any element points to the last symbol in that element.
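
The zero-based, inclusive convention is easy to get wrong by one, so here is a small sketch of the arithmetic. The `Range` struct and its field names are hypothetical stand-ins, not md4qt types; only the convention itself comes from the text above.

```cpp
#include <cassert>

// Stand-in for an item's position (NOT an md4qt type).
// All values are zero-based and both ends are inclusive.
struct Range {
    long long startColumn;
    long long startLine;
    long long endColumn;
    long long endLine;
};

// Number of symbols covered by a single-line range.
// Both ends are inclusive, hence the "+ 1".
long long lengthOnLine(const Range &r)
{
    assert(r.startLine == r.endLine && r.endColumn >= r.startColumn);
    return r.endColumn - r.startColumn + 1;
}

// True if the zero-based (column, line) point lies inside the range,
// counting both the first and the last symbol as inside.
bool contains(const Range &r, long long column, long long line)
{
    if (line < r.startLine || line > r.endLine) {
        return false;
    }
    if (line == r.startLine && column < r.startColumn) {
        return false;
    }
    if (line == r.endLine && column > r.endColumn) {
        return false;
    }
    return true;
}
```

For example, the text `# Title` on the first line would span columns 0 through 6 of line 0, so its length is 7 and column 6 is still inside the element.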

How can I easily traverse through the `MD::Document`?

Since version 2.6.0, the visitor.h header provides the MD::Visitor interface, with which you can easily walk through the document. All you need to do is implement (override) the virtual methods that handle the elements you are interested in, like:
```
/*!
 * Handle heading.
 *
 * \a h Heading.
 */
virtual void onHeading(Heading *h) = 0;
```

Is it possible to find `Markdown` item by its position?

Version 3.0.0 added a new structure, MD::PosCache. You can pass an MD::Document to its MD::PosCache::initialize() method and then use MD::PosCache::findFirstInCache() to find the first item at a given position, together with all of its nested first children.

How can I walk through the document and find all items of given type?

Since version 3.0.0, the MD::forEach() algorithm is available.

/*!
 * \inheaderfile md4qt/algo.h
 *
 * \brief Calls function for each item in the document with the given type.
 *
 * \a types Vector of item's types to be processed.
 *
 * \a doc Document.
 *
 * \a func Functor object.
 *
 * \a maxNestingLevel Maximum nesting level. 0 means infinity, 1 - only top level items...
 */
inline void forEach(
    const QVector<ItemType> &types,
    QSharedPointer<Document> doc,
    ItemFunctor func,
    unsigned int maxNestingLevel = 0);
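
To make the meaning of maxNestingLevel concrete, here is a self-contained sketch of a type-filtered walk with a nesting limit. It does not use md4qt: `Node` and `forEachSketch` are hypothetical stand-ins, and the way levels are counted (top-level items are level 1, their children level 2) is an assumption based on the comment above.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Stand-in tree node (NOT md4qt's MD::Item), just enough structure
// to illustrate the nesting limit.
struct Node {
    std::string type;
    std::vector<std::shared_ptr<Node>> children;
};

// Calls func for every node whose type is listed in types.
// maxNestingLevel == 0 means no limit; 1 visits only top-level nodes,
// 2 also visits their direct children, and so on.
void forEachSketch(const std::vector<std::string> &types,
                   const std::vector<std::shared_ptr<Node>> &items,
                   const std::function<void(const Node &)> &func,
                   unsigned int maxNestingLevel = 0,
                   unsigned int currentLevel = 1)
{
    if (maxNestingLevel != 0 && currentLevel > maxNestingLevel) {
        return; // deeper than the caller asked for
    }

    for (const auto &item : items) {
        for (const auto &t : types) {
            if (item->type == t) {
                func(*item);
                break;
            }
        }

        // Descend into children one level deeper.
        forEachSketch(types, item->children, func, maxNestingLevel, currentLevel + 1);
    }
}
```

With a limit of 1, a paragraph nested inside a blockquote is skipped while a top-level paragraph is still reported; with the default of 0, both are reported.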

How can I add and process a custom (user-defined) item in `MD::Document`?

Since version 3.0.0, the MD::ItemType enum has the MD::ItemType::UserDefined enumerator. You can inherit from any MD::Item class and return from its type() method a value greater than or equal to MD::ItemType::UserDefined. To handle user-defined item types, the MD::Visitor class now has the method void onUserDefined(MD::Item *item), so you can process your custom items as you need.
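
The "greater than or equal to UserDefined" rule can be sketched with stand-in types. Nothing below is md4qt code: the enum, the `Item` base class, `Callout`, and `isUserDefined` are hypothetical, and the numeric value given to `UserDefined` is an assumption chosen only for illustration.

```cpp
// Stand-in enum (NOT md4qt's MD::ItemType). The real enum has more
// enumerators, and the value of UserDefined here is an assumption.
enum class ItemType : int {
    Heading = 0,
    Text,
    Paragraph,
    UserDefined = 255
};

// Hypothetical custom type: any value at or above UserDefined is free
// for user-defined items.
constexpr ItemType CalloutType =
    static_cast<ItemType>(static_cast<int>(ItemType::UserDefined) + 1);

// Minimal base class with a virtual type(), mirroring the idea of MD::Item.
struct Item {
    virtual ~Item() = default;
    virtual ItemType type() const = 0;
};

// A custom item that reports its own type value.
struct Callout : Item {
    ItemType type() const override { return CalloutType; }
};

// True for items whose type lies in the user-defined range.
bool isUserDefined(const Item &item)
{
    return static_cast<int>(item.type()) >= static_cast<int>(ItemType::UserDefined);
}
```

A handler for user-defined items would typically switch on the value returned by type() to tell several custom item kinds apart before casting to the concrete class.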
