summaryrefslogtreecommitdiffstats
path: root/src/gui/text/qtextmarkdownimporter_p.h
Commit message (Collapse)AuthorAgeFilesLines
* Extract and re-write "front matter" in markdown documentsShawn Rutledge2024-02-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | It's increasingly common for YAML to be used as metadata in front of markdown documents. md4c does not handle this, so we need to remove it ahead of time, lest md4c misinterpret it as heading text or so. The --- fences are expected to be consistent regardless of the format of what's between them, and the yaml (or whatever) parser does not need to see them. So we remove them while reading, and QTextMarkdownWriter writes them around the front matter if there is any. If your application needs to parse this "front matter", just call qtd->metaInformation(QTextDocument::FrontMatter).toUtf8() and feed that to some parser that you've linked in, such as yaml-cpp. Since YAML is used with GitHub Docs, we consider this feature to be part of the GitHub dialect: https://docs.github.com/en/contributing/writing-for-github-docs/using-yaml-frontmatter [ChangeLog][QtGui][Text] Markdown "front matter" (usually YAML) is now extracted during parsing (GitHub dialect) and can be retrieved from QTextDocument::metaInformation(FrontMatter). QTextMarkdownWriter also writes front matter (if any) to the output. Fixes: QTBUG-120722 Change-Id: I220ddcd2b94c99453853643516ca7a36bb2bcd6f Reviewed-by: Axel Spoerl <axel.spoerl@qt.io>
* Add QTextDocument* constructor argument to QTextMarkdownImporterShawn Rutledge2023-09-011-5/+4
| | | | | | | | | | | | | | | | | QTextMarkdownImporter::import() took a QTD* as if it would be ok to reuse one instance of QTextMarkdownImporter for repeated importing into different documents; but in practice, we never do that: in fact it's usually a short-lived, stack-allocated object, as in QTextMarkdownImporter(&doc, QTMI::DialectGitHub).import(input); So it's less clumsy internally to require the document be provided to the constructor: that way a QTextCursor can be constructed immediately too, as part of the importer object rather than separately on the heap. This is private API, unused outside qtbase. Change-Id: I8041ceb33cb7e7608df55dc5a963292c585afb90 Reviewed-by: Oliver Eftevaag <oliver.eftevaag@qt.io> Reviewed-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>
* Use SPDX license identifiersLucie Gérard2022-05-161-38/+2
| | | | | | | | | | | | | Replace the current license disclaimer in files by a SPDX-License-Identifier. Files that have to be modified by hand are modified. License files are organized under LICENSES directory. Task-number: QTBUG-67283 Change-Id: Id880c92784c40f3bbde861c0d93f58151c18b9f1 Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org> Reviewed-by: Lars Knoll <lars.knoll@qt.io> Reviewed-by: Jörg Bornemann <joerg.bornemann@qt.io>
* Make sure all qtbase private headers include at least one otherThiago Macieira2022-02-241-0/+1
| | | | | | | | | | See script in qtbase/util/includeprivate for the rules. Since these files are being touched anyway, I also ran the updatecopyright.pl script too. Change-Id: Ib056b47dde3341ef9a52ffff13ef677e471674b6 Reviewed-by: Marc Mutz <marc.mutz@qt.io>
* Support the markdown underline extensionShawn Rutledge2020-11-071-0/+1
| | | | | | | | | | | | | | | | MarkdownDialectGitHub now includes this feature, so *emph* is italicized and _emph_ is underlined. This is a better fit for QTextDocument capabilities; until now, _underlined_ markdown could be read, but would be rendered with italics, because in CommonMark, *emphasis* and _emphasis_ are the same. But QTextMarkdownWriter already writes underlining and italics distinctly in this way. [ChangeLog][QtGui][Text] By default (with MarkdownDialectGitHub), markdown _underline_ and *italic* text styles are now distinct. Fixes: QTBUG-84429 Change-Id: Ifc6defa4852abe831949baa4ce28bae5f1a82265 Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
* Use QList instead of QVector in guiJarek Kobus2020-06-291-1/+1
| | | | | | | | Applied to headers only. Source file to be changed separately. Task-number: QTBUG-84469 Change-Id: Ic08a899321eaffc46b8461aaee3dbaa4d2c727a9 Reviewed-by: Laszlo Agocs <laszlo.agocs@qt.io>
* Merge remote-tracking branch 'origin/5.15' into devLars Knoll2020-03-041-1/+2
|\ | | | | | | Change-Id: I99ee6f8b4bdc372437ee60d1feab931487fe55c4
| * QTextMarkdownImporter: fix use after free; add fuzz-generated testsShawn Rutledge2020-02-281-1/+1
|/ | | | | | | | | | | | | | It was possible to end up with a dangling pointer in m_listStack. This is now avoided by using QPointer and doing nullptr checks before accessing any QTextList pointer stored there. We have 2 specimens of garbage that caused crashes before; now they don't. But only fuzz20450 triggered the dangling pointer in the list stack. The crash caused by fuzz20580 was fixed by updating md4c from upstream: 4b0fc030777cd541604f5ebaaad47a2b76d61ff9 Change-Id: I8e1eca23b281256a03aea0f55e9ae20f1bdd2a38 Reviewed-by: Robert Loehning <robert.loehning@qt.io>
* Enforce QTextDocument::MarkdownFeature compatibility at compile timeShawn Rutledge2019-10-241-16/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | We use md4c for parsing markdown. It provides flags to control the feature set that will be supported when parsing particular documents. QTextMarkdownImporter::Feature is a fine-grained set of flags that exactly match the md4c feature flags that we support in Qt so far. QTextMarkdownImporter is a private exported class (new in 5.14). We don't expect the corresponding flags in md4c to change in incompatible ways in the future: the md4c authors have as much respect for avoiding compatibility issues as we do, and likely will only add features, not remove them. We now enforce QTextMarkdownImporter::Features compatibility with QTextDocument::MarkdownFeatures by setting them directly. We check QTextMarkdownImporter::Features compatibility with md4c's #define'd feature flags using static asserts, so that any hypothetical incompatibility would be detected at compile time. The enum conversion from QTextDocument::MarkdownFeatures to QTextMarkdownImporter::Features is moved to a new QTextMarkdownImporter constructor; thus the conversions from QTextDocument::MarkdownFeatures to QTextMarkdownImporter::Features, and then to unsigned (in QTextMarkdownImporter::import()) are adjacent in the same private class implementation. If incompatibility ever occurred, we would need to replace one or both of those with another suitable conversion function. Change-Id: I0bf8a21eb7559df1d38406b948ef657f9060c67b Reviewed-by: Vitaly Fanaskov <vitaly.fanaskov@qt.io>
* Make QTextBlockFormat::MarkerType an enum classShawn Rutledge2019-10-101-1/+1
| | | | | | | This came up during API review. Change-Id: I9198e1eb96db0c21e46a226a032919bb62d3ca66 Reviewed-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>
* QTextMarkdownWriter: write fenced code blocks with language declarationShawn Rutledge2019-06-041-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MD4C now makes it possible to detect indented and fenced code blocks: https://github.com/mity/md4c/issues/81 Fenced code blocks have the advantages of being easier to write by hand, and having an "info string" following the opening fence, which is commonly used to declare the language. Also, the HTML parser now recognizes tags of the form <pre class="language-foo"> which is one convention for declaring the programming language (as opposed to human language, for which the lang attribute would be used): https://stackoverflow.com/questions/5134242/semantics-standards-and-using-the-lang-attribute-for-source-code-in-markup So it's possible to read HTML and write markdown without losing this information. It's also possible to read markdown with any type of code block: fenced with ``` or ~~~, or indented, and rewrite it the same way. Change-Id: I33c2bf7d7b66c8f3ba5bdd41ab32572f09349c47 Reviewed-by: Gatis Paeglis <gatis.paeglis@qt.io>
* Markdown and HTML: support image alt text and titleShawn Rutledge2019-06-011-0/+1
| | | | | | | | | | | | | | | | | | | | | It's a required CommonMark feature: https://spec.commonmark.org/0.29/#images and alt text is also required in HTML: https://www.w3.org/wiki/Html/Elements/img#Requirements_for_providing_text_to_act_as_an_alternative_for_images Now we are able to read these attributes from either html or markdown and rewrite either an html or markdown document that preserves them. This patch does not add viewing or editing support in QTextEdit etc. Change-Id: I51307389f8f9fc00809808390e583a83111a7b33 Reviewed-by: Gatis Paeglis <gatis.paeglis@qt.io>
* QTextMarkdownImporter: don't keep heading level on following list itemShawn Rutledge2019-05-241-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When reading a document like # heading - list item and then re-writing it, it turned into # heading - # list item because QTextCursor::insertList() simply calls QTextCursor::insertBlock(), thus inheriting block format from the previous block, without an opportunity to explicitly define the block format. So be more consistent: use QTextMarkdownImporter::insertBlock() for blocks inside list items too. Now it fully defines blockFormat first, then inserts the block, and then adds it to the current list only when the "paragraph" is actually the list item's text (but not when it's a continuation paragraph). Also, be prepared for applying and removing block markers to arbitrary blocks, just in case (they might be useful for block quotes, for example). Change-Id: I391820af9b65e75abce12abab45d2477c49c86ac Reviewed-by: Gatis Paeglis <gatis.paeglis@qt.io>
* Fix gui build without feature.regularexpressionTasuku Suzuki2019-05-221-0/+4
| | | | | | | | | | | Change-Id: Id27fc81af8d2b0355b186540f41d75a9c8d7c7f3 Reviewed-by: David Faure <david.faure@kdab.com>
* Markdown: blockquotes, code blocks, and generalized nestingShawn Rutledge2019-05-081-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Can now detect nested quotes and code blocks inside quotes, and can rewrite the markdown too. QTextHtmlParser sets hard-coded left and right margins, so we need to do the same to be able to read HTML and write markdown, or vice-versa, and to ensure that all views (QTextEdit, QTextBrowser, QML Text etc.) will render it with margins. But now we add a semantic memory too: BlockQuoteLevel is similar to HeadingLevel, which was added in 310daae53926628f80c08e4415b94b90ad525c8f to preserve H1..H6 heading levels, because detecting it via font size didn't make sense in QTextMarkdownWriter. Likewise detecting quote level by its margins didn't make sense; markdown supports nesting quotes; and indenting nested quotes via 40 pixels may be a bit too much, so we should consider it subject to change (and perhaps be able to change it via CSS later on). Since we're adding BlockQuoteLevel and depending on it in QTextMarkdownWriter, it's necessary to set it in QTextHtmlParser to enable HTML->markdown conversion. (But so far, nested blockquotes in HTML are not supported.) Quotes (and nested quotes) can contain indented code blocks, but it seems the reverse is not true (according to https://spec.commonmark.org/0.29/#example-201 ) Quotes can contain fenced code blocks. Quotes can contain lists. Nested lists can be interrupted with nested code blocks and nested quotes. So far the writer assumes all code blocks are the indented type. It will be necessary to add another attribute to remember whether the code block is indented or fenced (assuming that's necessary). Fenced code blocks would work better for writing inside block quotes and list items because the fence is less ambiguous than the indent. Postponing cursor->insertBlock() as long as possible helps with nesting. cursor->insertBlock() needs to be done "just in time" before inserting text that will go in the block. The block and char formats aren't necessarily known until that time. When a nested block (such as a nested quote) ends, the context reverts to the previous block format, which then needs to be re-determined and set before we insert text into the outer block; but if no text will be inserted, no new block is necessary. But we can't use QTextBlockFormat itself as storage, because for some reason bullets become very "sticky" and it becomes impossible to have plain continuation paragraphs inside list items: they all get bullets. Somehow QTextBlockFormat remembers, if we copy it. But we can create a new one each time and it's OK. Change-Id: Icd0529eb90d2b6a3cb57f0104bf78a7be81ede52 Reviewed-by: Gatis Paeglis <gatis.paeglis@qt.io>
* Markdown: fix several issues with lists and continuationsShawn Rutledge2019-05-081-0/+2
| | | | | | | | | | | | | | | | | | | | Importer fixes: - the first list item after a heading doesn't keep the heading font - the first text fragment after a bullet is the bullet text, not a separate paragraph - detect continuation lines and append to the list item text - detect continuation paragraphs and indent them properly - indent nested list items properly - add a test for QTextMarkdownImporter Writer fixes: - after bullet items, continuation lines and paragraphs are indented - indentation of continuations isn't affected by checkboxes - add extra newlines between list items in "loose" lists - avoid writing triple newlines - enhance the test for QTextMarkdownWriter Change-Id: Ib1dda514832f6dc0cdad177aa9a423a7038ac8c6 Reviewed-by: Gatis Paeglis <gatis.paeglis@qt.io>
* Remove 3rdparty include from QTextMarkdownImporter header fileShawn Rutledge2019-05-081-23/+22
| | | | | | | | It's a private header; but to be able to use it in a test, it has to be as clean as a public header. Change-Id: I868372406e62acc24051a6523fee89bb911a61f9 Reviewed-by: Gatis Paeglis <gatis.paeglis@qt.io>
* Add QTextMarkdownImporterShawn Rutledge2019-04-171-0/+127
This provides the ability to read from a Markdown string or file into a QTextDocument, such that the formatting will be recognized and can be rendered. - Add QTextDocument::setMarkdown(QString) - Add QTextEdit::setMarkdown(QString) - Add TextFormat::MarkdownText - QWidgetTextControl::setContent() calls QTextDocument::setMarkdown() if that's the format Fixes: QTBUG-72349 Change-Id: Ief2ad71bf840666c64145d58e9ca71d05fad5659 Reviewed-by: Lisandro Damián Nicanor Pérez Meyer <perezmeyer@gmail.com> Reviewed-by: Gatis Paeglis <gatis.paeglis@qt.io>