Confluence does not have an export-to-Markdown feature. For long-term conversion I download the page as .docx and convert it to Markdown with Pandoc, but for quick uses (e.g. copying reference documentation for an LLM) that’s overkill.
This morning I iterated with Claude on a script that works very nicely. I use an Alfred workflow to trigger it on my Mac with ⌘ ⌥ M (for Markdown), but you could just as easily trigger it with FastScripts, Raycast, Keyboard Maestro or probably even Shortcuts (or adapt it for use on Linux or Windows).
The script assumes you have Pandoc installed and that you’ve just pressed ⌘ C on some selected rich text (a section of a Confluence page, say), so that the clipboard holds an HTML version of it.
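Those assumptions can be checked up front. Here is a minimal sketch of a preflight check, assuming a POSIX shell; `require_cmd` is a hypothetical helper name and not part of the script in this post:

```shell
# Hypothetical helper (not part of the original script): succeed only if
# the named command can be found on PATH, otherwise complain on stderr.
require_cmd() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "missing dependency: $1" >&2
    return 1
  fi
}
```

Calling `require_cmd pandoc || exit 1` at the top of the workflow would stop it before the clipboard is read if Pandoc is missing.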
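# osascript prints the clipboard HTML as «data HTML<hex digits>»; the first perl call
# strips that wrapper and decodes the hex back into raw bytes.
osascript -e 'the clipboard as «class HTML»' \
| perl -ne 'print chr foreach unpack("C*",pack("H*",substr($_,11,-3)))' \
| perl -pe '
# Before Pandoc: normalize spacing spans, drop data-* and style attributes,
# and turn class="code" spans into <code> elements.
s/<span class="Apple-converted-space">[^<]*<\/span>/ /g;
s/ (data-[\w-]+|style)="[^"]*"//g;
s/<span class="code"[^>]*>(.*?)<\/span>/<code>$1<\/code>/g;
' \
| pandoc -f html -t gfm --wrap=none \
| perl -pe '
# After Pandoc: strip any raw HTML tags left in the Markdown and simplify code fences.
s/<div[^>]*>//g;
s/<\/div>//g;
s/<[^>]+>//g;
s/^``` code-block$/```/gm;
' \
| perl -0777 -pe '
# Whole-input pass: empty out whitespace-only lines, then collapse runs of blank lines.
s/^[^\S\n]+$//mg;
s/\n{3,}/\n\n/g;
'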
I plan to update this post as I use it more and make small tweaks. It cleans up Confluence HTML nicely, but I haven’t yet tested it with other sources.
If I remember, I’ll also write another post explaining the evolution from pbpaste -Prefer rtf | pandoc -f rtf -t markdown --wrap=none (which doesn’t work) to the monstrosity above.