An unclosed, unescaped <script> tag in markdown should be rendered as text to match GH/GL behavior #935

Closed
opened 2025-11-02 03:42:16 -06:00 by GiteaMirror · 14 comments
Owner

Originally created by @wyattoday on GitHub (Aug 2, 2017).

Description

An unclosed, unescaped <script> tag in markdown breaks all subsequent markdown rendering in Gitea. The same problem does not affect more benign tags like <strong>.

Gitea should render the <script> tag "as is" (that is, as text, without emitting the <script> HTML). That would match the behavior in GitHub.

Here's the raw Markdown file: https://try.gitea.io/wyattoday/simple-respository/raw/new-feature-branch/BrokenRendering.md

Here's the broken rendering: https://try.gitea.io/wyattoday/simple-respository/src/new-feature-branch/BrokenRendering.md

Screenshots

gitea-broken-markdown: https://user-images.githubusercontent.com/11202536/28880372-dcf9adee-7772-11e7-9b2c-39b206ef9457.png

  • Gitea version (or commit ref): Current master: https://github.com/go-gitea/gitea/commit/f29458bd3a20d2d89638d5031d801c161f456374
  • Git version: 2.13.3
  • Operating system: Ubuntu Linux 16.04
  • Database: MySQL
  • Can you reproduce the bug at https://try.gitea.io: Yes: https://try.gitea.io/wyattoday/simple-respository/src/new-feature-branch/BrokenRendering.md
GiteaMirror added the type/proposal and issue/confirmed labels 2025-11-02 03:42:16 -06:00

@stale[bot] commented on GitHub (Feb 13, 2019):

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs during the next 2 weeks. Thank you for your contributions.


@wyattoday commented on GitHub (Feb 13, 2019):

This issue is still very much alive: https://try.gitea.io/wyattoday/Test1234/src/branch/master/README.md


@lunny commented on GitHub (Jan 2, 2021):

Should be closed, see https://try.gitea.io/wyattoday/Test1234/src/branch/master/README.md


@wyattoday commented on GitHub (Jan 2, 2021):

Still very broken.

See : https://try.gitea.io/wyattoday/Test1234/raw/branch/master/README.md

Notice how the lines after the <script> tag are not written out.

The markdown renderer should remove malicious tags (especially <script>) to match how other systems like GitLab / GitHub render markdown.


@zeripath commented on GitHub (Jan 2, 2021):

How would you suggest we do this?

How should they be rendered? Please link to an actual example of how you expect it to be rendered and explain your rules for how we are supposed to determine what was "supposed to be inside" the unclosed tag that we need to sanitize away, including what you do when the tag is closed.


@wyattoday commented on GitHub (Jan 2, 2021):

Script tags should be stripped completely. Or render them as text.

Either way, the current markdown behavior of breaking after a tag it doesn't like isn't a good way to handle things.


@zeripath commented on GitHub (Jan 2, 2021):

The script tag *is* stripped along with its contents. Script is a block level element so an unclosed script block contains everything after it.
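
The behavior described here can be modeled with a small Go sketch. This is a simplified illustration only, not bluemonday's or Gitea's actual code: the function name `stripScriptBlocks` is hypothetical, and the matching is a plain case-sensitive substring search rather than real HTML parsing. It shows why an unclosed `<script` swallows the rest of the document.

```go
package main

import (
	"fmt"
	"strings"
)

// stripScriptBlocks is a hypothetical, simplified model of "strip the script
// element along with its contents". It is NOT how bluemonday is implemented.
func stripScriptBlocks(s string) string {
	const openTag, closeTag = "<script", "</script>"
	var out strings.Builder
	for {
		start := strings.Index(s, openTag)
		if start < 0 {
			// No further script element: keep the remaining text.
			out.WriteString(s)
			return out.String()
		}
		out.WriteString(s[:start])
		rest := s[start:]
		end := strings.Index(rest, closeTag)
		if end < 0 {
			// Unclosed element: everything up to the end of the input
			// counts as its contents, so all of it is dropped.
			return out.String()
		}
		s = rest[end+len(closeTag):]
	}
}

func main() {
	fmt.Printf("%q\n", stripScriptBlocks("intro <script>x()</script> outro"))
	fmt.Printf("%q\n", stripScriptBlocks("intro <script> outro that is lost"))
}
```

In the second call nothing after the opening tag survives, which matches the "lines after the <script> tag are not written out" symptom reported above.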


@wyattoday commented on GitHub (Jan 2, 2021):

We ran into this issue in real life while writing markdown documentation that mentions <script> in running text. We wanted it rendered as text. This is how it’s done on GitHub / GitLab. (See 2 sentences back in this comment for how it’s just rendered as text, no backticks necessary. Just write it and the markdown renders it as text.)

So, the ideal solution is to either strip it out correctly (i.e. if the tag is never closed, then loop back and assume it’s a standalone tag), or ideally match the behavior of GH/GL and render as text.

What Gitea does currently is the worst of both worlds (strips it and breaks the rest of the rendering).
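
A rough Go sketch of the "loop back and treat an unclosed tag as standalone" idea, combined with rendering it as text. Everything here is an assumption made for illustration: the function name `handleScriptTags` is hypothetical, matching is a case-sensitive substring search, and this is not how GitHub, GitLab, or Gitea actually implement it.

```go
package main

import (
	"fmt"
	"strings"
)

// handleScriptTags is a hypothetical helper: closed <script>...</script>
// blocks are removed, while an opening tag with no closing tag anywhere
// after it is escaped so that it shows up as literal text.
func handleScriptTags(s string) string {
	const openTag, closeTag = "<script", "</script>"
	var out strings.Builder
	for {
		start := strings.Index(s, openTag)
		if start < 0 {
			out.WriteString(s)
			return out.String()
		}
		out.WriteString(s[:start])
		rest := s[start:]
		end := strings.Index(rest, closeTag)
		if end < 0 {
			// Never closed: escape only the "<" and keep scanning, so the
			// rest of the document is preserved instead of swallowed.
			out.WriteString("&lt;")
			s = rest[1:]
			continue
		}
		// Properly closed: drop the element and its contents.
		s = rest[end+len(closeTag):]
	}
}

func main() {
	fmt.Println(handleScriptTags("docs about <script> tags, then more text"))
	fmt.Println(handleScriptTags("a <script>x()</script> b"))
}
```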


@lunny commented on GitHub (Jan 3, 2021):

Then I think the title should be changed to say that the script tag should be rendered as text, rather than that it breaks rendering.


@zeripath commented on GitHub (Jan 3, 2021):

It appears that @wyattoday wants a completely different DOM from the one bluemonday creates. They almost want the file to be passed through HTML Tidy before sanitizing.

I suspect, however, that a regexp replacing `<(/?script[ >])` with `&lt;$1` in sanitizer.go would work.
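
The suggested replacement can be tried in isolation with Go's standard `regexp` package. This is a minimal sketch and not the actual contents of sanitizer.go: the function name `escapeScriptTags` is hypothetical, and the pattern is used exactly as written in the comment, so it is case-sensitive and would not match `<SCRIPT>`.

```go
package main

import (
	"fmt"
	"regexp"
)

// scriptTagRe matches the "<" that opens a <script ...> or </script> tag,
// capturing the rest of the tag name plus the following space or ">".
var scriptTagRe = regexp.MustCompile(`<(/?script[ >])`)

// escapeScriptTags is a hypothetical pre-sanitizing step: it turns the
// leading "<" into "&lt;" so the tag is rendered as text, not parsed as HTML.
func escapeScriptTags(input string) string {
	return scriptTagRe.ReplaceAllString(input, "&lt;$1")
}

func main() {
	fmt.Println(escapeScriptTags("before <script> after"))
	fmt.Println(escapeScriptTags("<script src=\"x.js\"></script>"))
	fmt.Println(escapeScriptTags("<strong>untouched</strong>"))
}
```

Only the opening `<` is escaped; the trailing `>` is left alone, which is still enough to stop the HTML parser from seeing a script element.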


@milahu commented on GitHub (Oct 1, 2023):

> Script tags should be stripped completely. Or render them as text.

Generally, I'm voting to ignore these tags in markdown in the blob API:

  • <script>...</script>
  • <style>...</style>
  • <head>...</head>
  • <!doctype html>
  • <html>
  • </html>

This is useful for rendering HTML files as markdown in the blob API by creating a symlink from readme.md to index.html.

Example:

  • https://github.com/milahu/gitea-html-in-markdown-render-bug
  • https://try.gitea.io/milahu/gitea-html-in-markdown-render-bug
  • https://codeberg.org/milahu/gitea-html-in-markdown-render-bug

> We wanted it rendered as text. This is how it’s done on GitHub / Gitlab.

Both do it wrong. If you want to render the text <script>, then just write &lt;script&gt;, which works in all markdown renderers.
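
For documentation that is generated rather than written by hand, the escaping recommended here can be produced with Go's standard `html` package. A minimal sketch; the helper name `escapeLiteralTag` is hypothetical and only wraps `html.EscapeString`.

```go
package main

import (
	"fmt"
	"html"
)

// escapeLiteralTag is a hypothetical helper that converts a tag meant to be
// shown as text into its entity-escaped form before it is written into markdown.
func escapeLiteralTag(tag string) string {
	return html.EscapeString(tag)
}

func main() {
	fmt.Println(escapeLiteralTag("<script>")) // prints: &lt;script&gt;
}
```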

@wxiaoguang commented on GitHub (Apr 6, 2025):

It's impossible to support this correctly with the goldmark renderer currently used by Gitea, unless someone writes a special HTML tag parser for goldmark to handle "unclosed" tags.

I believe writing `<script>` or &lt;script&gt; is the right approach for all markdown renderers.


@wxiaoguang commented on GitHub (Apr 6, 2025):

And we can make a simple improvement to output the broken (unclosed) tags: Make markdown render match GitHub's behavior #34129. It should be good enough for most cases, telling the writer that "you forgot to correctly close or escape the tags".

![Image](https://github.com/user-attachments/assets/7b23f689-755b-4fbe-819e-2b549677ca30)

@wxiaoguang commented on GitHub (Apr 6, 2025):

Hmm, good news: according to my test, I think the fix should behave almost the same as GitHub.

So I think this issue could be closed as "completed" now.

Reference: github-starred/gitea#935