Commit 2a6d8133 authored by dpsutton, committed by GitHub

Relativize links when emitting html from md in pulse/subscriptions (#18428)

* Relativize links when emitting html from md in pulse/subscriptions

Helpful links:
- https://awesomeopensource.com/project/vsch/flexmark-java
- https://github.com/vsch/flexmark-java/blob/master/flexmark-java-samples/src/com/vladsch/flexmark/java/samples/PegdownCustomLinkResolverOptions.java
- https://github.com/vsch/flexmark-java/blob/8a881b73109a287b5f202e2e1fc9f9c497d5eecf/flexmark/src/main/java/com/vladsch/flexmark/html/HtmlRenderer.java#L433
- https://github.com/vsch/flexmark-java/blob/8a881b73109a287b5f202e2e1fc9f9c497d5eecf/flexmark/src/main/java/com/vladsch/flexmark/html/renderer/ResolvedLink.java#L10

This was a pain in the ass. I had been looking for a way to simply
traverse the AST, but that might cause problems since there are spans
and text positions everywhere. So I looked at how to change the
parser, but everything there is difficult to change. Luckily there is a
built-in way to do this with link resolvers, found in a GitHub issue:
https://github.com/vsch/flexmark-java/issues/308 . Ideally this
would be done in the parser, but it seems to be done in the HTML
emitter.

https://github.com/vsch/flexmark-java/blob/master/flexmark-java-samples/src/com/vladsch/flexmark/java/samples/CustomLinkResolverSample.java

And then I got turned around on what is relative or not. What happens if
you do something seemingly sane like:

```markdown
[a link to google](www.google.com)
```

This is apparently a relative link since it lacks a protocol. And our
`resolve-uri` code treats it as such:

```clojure
markdown=> (resolve-uri "www.google.com")
"http://localhost:3000www.google.com"
markdown=>
```

This seems strange to me but is also the behavior on gist.github.com:
https://gist.github.com/dpsutton/412502ffa89186487e41885855dfa781 has a
link to https://gist.github.com/dpsutton/www.google.com
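
For illustration, `java.net.URI` draws the same line: a string without a
scheme parses as a relative reference. `absolute-link?` is a hypothetical
helper written for this note, not something in this change:

```clojure
(import '(java.net URI))

(defn absolute-link?
  "True when `s` parses as a URI that carries a scheme such as http or https.
  Hypothetical helper, for illustration only."
  [^String s]
  (.isAbsolute (URI. s)))

(absolute-link? "www.google.com")         ;; => false
(absolute-link? "https://www.google.com") ;; => true
(absolute-link? "/dashboard/1")           ;; => false
```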

In all, trying to figure out this object-oriented factory mess, I had 24
tabs open to the source of NodeVisitor, NodeVisitorBase,
AstActionHandler, VisitHandler, ParserExtension,
NodePostProcessorFactory, etc. It was truly unpleasant.

* Remove errant require of mb.util.urls

* Ensure the site setting always has a slash when resolving relative

The URI class will do some wonky things. Sometimes resolution feels structural,
but it can also feel like it is just concatenating strings willy-nilly.

```clojure
(.. (URI. "http://example.com") (resolve "www.google.com") toString)
"http://example.comwww.google.com"
```

So ensure that the site URL ends with a trailing slash before resolving.
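
A minimal sketch of the trailing-slash behavior in isolation, assuming a
plain base URL string in place of the site-url setting. `resolve-against`
and the `example.relative-links` namespace are made-up names for
illustration, not part of this change:

```clojure
(ns example.relative-links
  (:require [clojure.string :as str])
  (:import (java.net URI)))

(defn ensure-slash
  "Appends a trailing slash to `s` unless it already ends with one.
  Returns nil when `s` is nil."
  [s]
  (when s
    (cond-> s
      (not (str/ends-with? s "/")) (str "/"))))

(defn resolve-against
  "Resolves `link` against `base` using java.net.URI. `base` stands in for
  the site URL; this function is hypothetical and only for illustration."
  [^String base ^String link]
  (let [^String base' (ensure-slash base)]
    (.. (URI. base') (resolve link) toString)))

;; With the slash ensured, both spellings of the base behave the same.
(resolve-against "https://example.com"  "image.png")      ;; => "https://example.com/image.png"
(resolve-against "https://example.com/" "/dashboard/1")   ;; => "https://example.com/dashboard/1"
(resolve-against "https://example.com"  "www.google.com") ;; => "https://example.com/www.google.com"
;; Links that already carry a scheme come back unchanged.
(resolve-against "https://example.com"  "https://www.google.com") ;; => "https://www.google.com"
```
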
parent 8da16d62
@@ -7,7 +7,8 @@
             HardLineBreak Heading HtmlBlock HtmlCommentBlock HtmlEntity HtmlInline HtmlInlineBase HtmlInlineComment
             HtmlInnerBlockComment Image ImageRef IndentedCodeBlock Link LinkRef MailLink OrderedList OrderedListItem
             Paragraph Reference SoftLineBreak StrongEmphasis Text ThematicBreak]
-           com.vladsch.flexmark.html.HtmlRenderer
+           [com.vladsch.flexmark.html HtmlRenderer LinkResolver LinkResolverFactory]
+           [com.vladsch.flexmark.html.renderer LinkResolverBasicContext LinkStatus]
            com.vladsch.flexmark.parser.Parser
            [com.vladsch.flexmark.util.ast Document Node]
            com.vladsch.flexmark.util.data.MutableDataSet
@@ -190,11 +191,14 @@
 (defn- resolve-uri
   "If the provided URI is a relative path, resolve it relative to the site URL so that links work
   correctly in Slack/Email."
-  [uri]
-  (when uri
-    (if-let [site-url (public-settings/site-url)]
-      (.toString (.resolve (new URI ^String site-url) ^String uri))
-      uri)))
+  [^String uri]
+  (letfn [(ensure-slash [s] (when s
+                              (cond-> s
+                                (not (str/ends-with? s "/")) (str "/"))))]
+    (when uri
+      (if-let [^String site-url (ensure-slash (public-settings/site-url))]
+        (.. (URI. site-url) (resolve uri) toString)
+        uri))))
 
 (defn- ast->mrkdwn
   "Takes an AST representing Markdown input, and converts it to a mrkdwn string that will render nicely in Slack.
@@ -301,10 +305,19 @@
 
 (def ^:private renderer
   "An instance of a Flexmark HTML renderer"
-  (let [options (.. (MutableDataSet.)
-                    (set (. HtmlRenderer ESCAPE_HTML) true)
-                    (toImmutable))]
-    (delay (.build (HtmlRenderer/builder options)))))
+  (let [options    (.. (MutableDataSet.)
+                       (set HtmlRenderer/ESCAPE_HTML true)
+                       (toImmutable))
+        lr-factory (reify LinkResolverFactory
+                     (^LinkResolver apply [_this ^LinkResolverBasicContext _context]
+                       (reify LinkResolver
+                         (resolveLink [_this node context link]
+                           (if-let [url (resolve-uri (.getUrl link))]
+                             (.. link
+                                 (withStatus LinkStatus/VALID)
+                                 (withUrl url))
+                             link)))))]
+    (delay (.build (.linkResolverFactory (HtmlRenderer/builder options) lr-factory)))))
 
 (defmulti process-markdown
   "Converts a markdown string from a virtual card into a form that can be sent to a channel
@@ -175,8 +175,14 @@
              (html "1. foo\n 1. bar")))
       (is (= "<p>/</p>\n"
              (html "\\/")))
-      (is (= "<p><img src=\"image.png\" alt=\"alt-text\" /></p>\n"
-             (html "![alt-text](image.png)"))))
+      (doseq [temp-setting ["https://example.com" "https://example.com/"]]
+        (tu/with-temporary-setting-values [site-url temp-setting]
+          (is (= "<p><img src=\"https://example.com/image.png\" alt=\"alt-text\" /></p>\n"
+                 (html "![alt-text](/image.png)")))
+          (is (= "<p><img src=\"https://example.com/image.png\" alt=\"alt-text\" /></p>\n"
+                 (html "![alt-text](image.png)")))
+          (is (= "<p><a href=\"https://example.com/dashboard/1\">dashboard 1</a></p>\n"
+                 (html "[dashboard 1](/dashboard/1)"))))))
 
     (testing "HTML in the source markdown is escaped properly, but HTML entities are retained"
       (is (= "<p>&lt;h1&gt;header&lt;/h1&gt;</p>\n" (html "<h1>header</h1>")))