Mathematical formulas cannot be displayed normally. #1738

Closed
opened 2025-11-11 14:51:18 -06:00 by GiteaMirror · 19 comments
Owner

Originally created by @MangoFF on GitHub (Aug 9, 2024).

Originally assigned to: @tjbck on GitHub.

![image](https://github.com/user-attachments/assets/0ff940e5-12db-4be0-bc8e-6791acacd7b1)

This is very uncomfortable for me, because mathematical formulas are not displayed correctly. I hope this bug can be fixed, thank you.

@Yanyutin753 commented on GitHub (Aug 9, 2024):

Yep, it seems that on the latest dev branch some mathematical formulas are not rendered, such as those in the following example.


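The main methods for finding the inverse of a matrix are as follows:

### 1. Using the adjugate matrix (suitable for small matrices)
For an \(n \times n\) square matrix \(A\), its inverse \(A^{-1}\) can be obtained from the following formula:
\[ A^{-1} = \frac{1}{\text{det}(A)} \text{adj}(A) \]
where:
- \(\text{det}(A)\) is the determinant of the matrix \(A\).
- \(\text{adj}(A)\) is the adjugate of the matrix \(A\), i.e. the transpose of the matrix formed by the cofactors of \(A\).

Steps:
1. Compute the determinant \(\text{det}(A)\) of the matrix \(A\).
2. Compute the cofactor of each element of \(A\) to form the adjugate matrix \(\text{adj}(A)\).
3. Use the formula above to compute \(A^{-1}\).

### 2. Using row operations (Gauss-Jordan elimination)
Gauss-Jordan elimination is a direct method for computing the inverse and is suitable for square matrices of most sizes.

Steps:
1. Combine the matrix \(A\) and the identity matrix \(I\) into an augmented matrix \([A | I]\).
2. Apply elementary row operations to the augmented matrix until \(A\) is transformed into the identity matrix \(I\).
3. The matrix on the right-hand side of the augmented matrix is then the inverse \(A^{-1}\) of \(A\).

### 3. Using LU decomposition
LU decomposition factors the matrix \(A\) into the product of two triangular matrices: a lower triangular matrix \(L\) and an upper triangular matrix \(U\), so that \(A = LU\). If such a decomposition exists, the inverse of \(A\) can be obtained by inverting \(L\) and \(U\) separately.

Steps:
1. Perform an LU decomposition of \(A\) to obtain \(L\) and \(U\).
2. Compute the inverses \(L^{-1}\) and \(U^{-1}\) of \(L\) and \(U\) (inverting a triangular matrix is relatively simple).
3. Finally, \(A^{-1} = U^{-1}L^{-1}\).

### 4. Using software tools or a calculator
In practice, numerical software (such as MATLAB, Python's NumPy, Mathematica, etc.) is usually used to invert larger matrices. These tools typically use more sophisticated algorithms, such as block algorithms or decomposition methods, to improve efficiency.

### Notes
- Only square matrices (matrices with as many rows as columns) can have an inverse.
- If the determinant of a matrix is zero, the matrix is not invertible (no inverse exists).

You can choose the method that best fits your situation.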

![image](https://github.com/user-attachments/assets/cdefa845-f33a-4a03-a08b-2329533adb25)


@Yanyutin753 commented on GitHub (Aug 10, 2024):

@tjbck

## In the latest version of dev

### Issues

  1. The current logic waits for the message to finish before rendering LaTeX, and even then some mathematical formulas are not rendered.
  2. Because LaTeX is only rendered once the message has finished, the reply looks worse while streaming than it does in other clients.

### Improvements

  1. It does fix the earlier Markdown rendering problem, where elements were repeatedly deleted and re-created and the streamed output looked poor.
  2. It also makes Markdown rendering virtually problem-free and suitable for all Markdown formats.

### Sincerely

I hope the new version keeps these improvements while fixing the problems mentioned. Thanks again to the author for your selfless dedication 🥳


@Yanyutin753 commented on GitHub (Aug 12, 2024):

I don't know how to fix this real-time rendering problem; I hope someone can fix it.


@tjbck commented on GitHub (Aug 14, 2024):

Should be fixed on dev, testing wanted!


@MangoFF commented on GitHub (Aug 14, 2024):

I will test it tomorrow!


@maxkonrad commented on GitHub (Aug 14, 2024):

I would like to test dev, but I can't run it on my Pi 4B; see #4331.


@MangoFF commented on GitHub (Aug 15, 2024):

![image](https://github.com/user-attachments/assets/209f9dd5-2bd5-4b9b-95e0-66e392c85e0f)

Thanks for solving this! It seems to be all right.

@MangoFF commented on GitHub (Aug 15, 2024):

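![image](https://github.com/user-attachments/assets/6f3b713f-d66d-473a-a8a2-37a757567677)
![image](https://github.com/user-attachments/assets/9365dbf9-7ae2-4bae-8615-c1b0f50b5230)

Sometimes there are still problems, and some words turn red.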


@MangoFF commented on GitHub (Aug 15, 2024):

@tjbck


@jukofyork commented on GitHub (Sep 5, 2024):

> Sometimes there are still problems, and some words turn red.

Yeah, I've got this and think I know what is causing it:

![Screenshot_20240905-082315](https://github.com/user-attachments/assets/403a87c5-abdc-4653-a1a3-8433f13ef226)

It seems that `\))` isn't closing the inline math block: whenever the LLM writes an inline expression inside parentheses, the closing `\)` is missed, the parser waits until it sees the next closing `\)`, and everything between the two is rendered in red, e.g.:

While Q is orthogonal (\(Q^T Q = I\)), \(I + Q\) is generally not orthogonal

Is treating:

\(Q^T Q = I\)), \(I + Q\)

as the expression, instead of two expressions:

\(Q^T Q = I\)
\(I + Q\)

and the red likely means it has come across invalid LaTeX syntax.


@jukofyork commented on GitHub (Sep 5, 2024):

It seems the \\\(.*?\\\) part of the regex in:

function escapeBrackets(text: string) {
	let cleanSquareBracket = '';
	let cleanRoundBracket = '';

	const pattern = /(```[\s\S]*?```|`.*?`)|\\\[([\s\S]*?[^\\])\\\]|\\\((.*?)\\\)/g;
	return text.replace(pattern, (match, codeBlock, squareBracket, roundBracket) => {
		if (codeBlock) {
			return codeBlock;
		} else if (squareBracket) {
			cleanSquareBracket = squareBracket.replace(/\s*\n\s*/g, ' ').trim();
			return `$$${cleanSquareBracket}$$`;
		} else if (roundBracket) {
			cleanRoundBracket = roundBracket.replace(/\s*\n\s*/g, ' ').trim();
			return `$${cleanRoundBracket}$`;
		}
		return match.replace(/\s*\n\s*/g, ' ').trim();
	});
}

is the cause of the problem, but I've no idea why it isn't matching lazily, nor what difference the extra bracket after it could make... :/

I've tried Claude Sonnet 3.5, GPT 4-Turbo and GPT 4o to see if they can spot the problem and solve it, and they all just start going round in circles and contradicting themselves when asked about this.

I suspect the problem is deeper and relates to the theory that "regular expressions are fundamentally not capable of parsing context-free grammars".

That theory relates to nested structures though and the problem above doesn't seem to have any nesting, but for some reason the extra ) after the \) seems to cause the .*? to not behave "lazily" and I've no idea why...
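
One way to narrow this down is to run the pattern on its own against the failing sentence. The sketch below reuses the pattern and replacement logic from the function quoted above, with the type annotation dropped so it runs under plain Node.js and the triple backtick written with a `{3}` quantifier (an equivalent spelling, chosen only so the snippet is easy to embed). In isolation the match appears to be lazy and the two expressions come out separately. If the same holds inside the app, the red text may be produced by a later stage that parses the `$...$` output. That last step is an assumption and has not been checked against the actual renderer.

```javascript
// Same pattern as the quoted escapeBrackets; only the spelling of the
// triple backtick differs (`{3} instead of three literal backticks).
function escapeBrackets(text) {
	const pattern = /(`{3}[\s\S]*?`{3}|`.*?`)|\\\[([\s\S]*?[^\\])\\\]|\\\((.*?)\\\)/g;
	return text.replace(pattern, (match, codeBlock, squareBracket, roundBracket) => {
		if (codeBlock) {
			return codeBlock; // code is passed through untouched
		} else if (squareBracket) {
			return `$$${squareBracket.replace(/\s*\n\s*/g, ' ').trim()}$$`;
		} else if (roundBracket) {
			return `$${roundBracket.replace(/\s*\n\s*/g, ' ').trim()}$`;
		}
		return match.replace(/\s*\n\s*/g, ' ').trim();
	});
}

// The sentence from the report, with an inline expression inside parentheses.
const input = 'While Q is orthogonal (\\(Q^T Q = I\\)), \\(I + Q\\) is generally not orthogonal';
const output = escapeBrackets(input);
console.log(output);
// → While Q is orthogonal ($Q^T Q = I$), $I + Q$ is generally not orthogonal
```

The first `$...$` is immediately followed by `)` in this output, so a downstream `$` parser that expects whitespace or particular punctuation after the closing delimiter would be one place to look.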


@jukofyork commented on GitHub (Sep 5, 2024):

> Could this be solved by first converting all of the `\(` strings to some obscure ASCII character and all of the `\)` strings to a different obscure ASCII character, and then performing the regex using these instead? Surely there could then be no confusion between `\)` and `\))`?

Absolutely, substituting \( and \) with unique placeholders can be a practical workaround to sidestep issues associated with distinguishing escaped and non-escaped parentheses. This method essentially simplifies the regex patterns by ensuring that the placeholders do not have syntactic ambiguity within the text. Here’s how you might approach this:

Steps Involved:

  1. Choose Unique Placeholders: Select placeholders that are unlikely to appear in the regular text. It could be a combination of non-standard ASCII characters or a rarely-used Unicode character.

  2. Substitution: Replace all instances of \( with a unique opening placeholder, and \) with a corresponding closing placeholder.

  3. Regex Application: Run your regex operations using these placeholders instead of the original bracket characters.

  4. Reverse Substitution: Convert the placeholders back to the original LaTeX brackets after processing.

Example Implementation:

Here's a simple JavaScript example that demonstrates this approach:

function escapeBrackets(text) {
    // Unique placeholders
    const openPlaceholder = "«";  // Example: U+00AB
    const closePlaceholder = "»"; // Example: U+00BB

    // Step 1: Replace LaTeX brackets with placeholders
    let modifiedText = text.replace(/\\\(/g, openPlaceholder).replace(/\\\)/g, closePlaceholder);

    // Step 2: Apply regex on the modified text
    const regexPattern = new RegExp(`${openPlaceholder}(.*?)${closePlaceholder}`, 'g');
    modifiedText = modifiedText.replace(regexPattern, (match, expression) => {
        // Assuming transformation of the captured expression needed
        return `$${expression.trim()}$`;
    });

    // Step 3: Replace placeholders back to LaTeX brackets
    modifiedText = modifiedText.replace(new RegExp(openPlaceholder, 'g'), '\\(').replace(new RegExp(closePlaceholder, 'g'), '\\)');

    return modifiedText;
}

// Test the function
const inputText = "While Q is orthogonal \\(Q^T Q = I\\)), \\(I + Q\\)";
const outputText = escapeBrackets(inputText);
console.log(outputText);

Advantages:

  • Clarity: Removes ambiguity about what constitutes the bounding parts of the expressions to be matched.
  • Flexibility: Allows regex patterns to be simpler and focus solely on content between the placeholders.
  • Control: Minimizes unexpected matching errors that arise from overlooking escaped characters mixed with regular characters.

Considerations:

  • Ensure Uniqueness: The placeholders must be unique enough to not occur naturally in your datasets.
  • Reversibility: Ensure that the conversion back and forth does not alter any data unintentionally.
  • Efficiency: Multiple pass-through text replacements might introduce a performance overhead, so this strategy is more suited for scenarios where text size is manageable.

This approach works well for circumventing the complications due to syntax mixture in languages like LaTeX embedded in texts and gives more controlled management over how expressions are parsed and handled.


No idea if this will work or why it would make a difference, though.


@jukofyork commented on GitHub (Sep 5, 2024):

https://onecompiler.com/javascript/42r9qmmny

function escapeBrackets(text) {
    let cleanRoundBracket = '';
    let result = '';
    let i = 0;

    while (i < text.length) {
        if (text[i] === '\\' && text[i + 1] === '(') {
            // Found the start of a LaTeX inline block
            let end = i + 2; // Start searching after '\('
            while (end < text.length && !(text[end - 1] === '\\' && text[end] === ')')) {
                end++;
            }
            if (end < text.length) { // Ensure the closing delimiter was found
                cleanRoundBracket = text.substring(i + 2, end - 1).replace(/\s*\n\s*/g, ' ').trim();
                result += `$${cleanRoundBracket}$`;
                i = end + 1; // Move index past the closing '\)'
            } else {
                // No closing bracket found, add the rest as normal text
                result += text.substring(i);
                break;
            }
        } else {
            result += text[i];
            i++;
        }
    }

    return result;
}

// Test cases
const testText = `While Q is orthogonal (\\(Q^T Q = I\\)), \\(I + Q\\) is generally not orthogonal.`;
console.log(escapeBrackets(testText));
Output:

While Q is orthogonal ($Q^T Q = I$), $I + Q$ is generally not orthogonal.

Maybe just have three scanners like this, one per delimiter type?


@jukofyork commented on GitHub (Sep 5, 2024):

https://onecompiler.com/javascript/42r9rd8h3

function parseRoundBrackets(text) {
    let i = 0, result = '';
    while (i < text.length) {
        if (text[i] === '\\' && text[i + 1] === '(') {
            let end = i + 2;
            while (end < text.length && !(text[end - 1] === '\\' && text[end] === ')')) {
                end++;
            }
            if (end < text.length) {
                let content = text.substring(i + 2, end - 1).replace(/\s*\n\s*/g, ' ').trim();
                result += `$${content}$`;
                i = end + 1;
            } else {
                result += text.substring(i);
                break;
            }
        } else {
            result += text[i];
            i++;
        }
    }
    return result;
}

function parseSquareBrackets(text) {
    let i = 0, result = '';
    while (i < text.length) {
        if (text[i] === '\\' && text[i + 1] === '[') {
            let end = i + 2;
            while (end < text.length && !(text[end - 1] === '\\' && text[end] === ']')) {
                end++;
            }
            if (end < text.length) {
                let content = text.substring(i + 2, end - 1).replace(/\s*\n\s*/g, ' ').trim();
                result += `$$${content}$$`;
                i = end + 1;
            } else {
                result += text.substring(i);
                break;
            }
        } else {
            result += text[i];
            i++;
        }
    }
    return result;
}

function parseCodeBlocks(text) {
    let i = 0, result = '';
    while (i < text.length) {
        if (text[i] === '`' && text[i + 1] === '`' && text[i + 2] === '`') {
            let end = i + 3;
            while (end < text.length && !(text[end - 2] === '`' && text[end - 1] === '`' && text[end] === '`')) {
                end++;
            }
            if (end < text.length) {
                let content = text.substring(i + 3, end - 2).trim();
                result += content;  // Escape or process further as needed
                i = end + 1;
            } else {
                result += text.substring(i);
                break;
            }
        } else {
            result += text[i];
            i++;
        }
    }
    return result;
}

function escapeBrackets(text) {
    let result = parseRoundBrackets(text);
    result = parseSquareBrackets(result);
    result = parseCodeBlocks(result);
    return result;
}

// Test cases
const testText = `While Q is orthogonal (\\(Q^T Q = I\\)), \\(I + Q\\) is generally not orthogonal.`;
console.log(escapeBrackets(testText));

// Testing for various content types and combinations
const testText1 = "Simple LaTeX inline mode: \\(x^2 + y^2 = z^2\\) with text following.";
const testText2 = "LaTeX display mode with square brackets: \\[ \\frac{a}{b} = \\frac{c}{d} \\]";
const testText3 = "Code block test with backticks: ```console.log('Hello World');``` and more text.";
const testText4 = "Mixed content: \\(x + y = z\\) followed by a code block: ```int main() { return 0; }``` and more \\(u+v=w\\) text.";
const testText5 = "Incorrect closing sequence test: \\(n*m)) additional text here.";
const testText6 = "Multiple LaTeX expressions: \\(a + b = c\\) and \\(d = e + f\\) in one line.";
const testText7 = "Nested fake delimiters test: \\( this contains (fake) delimiters \\) but should parse correctly.";
const testText8 = "Incomplete expressions: Here is one \\(a+b incomplete and text continues.";
const testText9 = "Test escaping: This is not a latex \\( because the bracket is escaped: \\(n = m \\cdot k\\) the actual.";

console.log(escapeBrackets(testText1));
console.log(escapeBrackets(testText2));
console.log(escapeBrackets(testText3));
console.log(escapeBrackets(testText4));
console.log(escapeBrackets(testText5));
console.log(escapeBrackets(testText6));
console.log(escapeBrackets(testText7));
console.log(escapeBrackets(testText8));
console.log(escapeBrackets(testText9));

Output:

While Q is orthogonal ($Q^T Q = I$), $I + Q$ is generally not orthogonal.
Simple LaTeX inline mode: $x^2 + y^2 = z^2$ with text following.
LaTeX display mode with square brackets: $$\frac{a}{b} = \frac{c}{d}$$
Code block test with backticks: console.log('Hello World'); and more text.
Mixed content: $x + y = z$ followed by a code block: int main() { return 0; } and more $u+v=w$ text.
Incorrect closing sequence test: \(n*m)) additional text here.
Multiple LaTeX expressions: $a + b = c$ and $d = e + f$ in one line.
Nested fake delimiters test: $this contains (fake) delimiters$ but should parse correctly.
Incomplete expressions: Here is one \(a+b incomplete and text continues.
Test escaping: This is not a latex $because the bracket is escaped: \(n = m \cdot k$ the actual.

Something like this but tidied up. It still won't handle nesting but at least it will perform "lazy" matches where the .*? seems to be failing...
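
As a rough idea of what "tidied up" could look like, the three scanners can be folded into one pass driven by a small delimiter table. This is only a sketch: `convertDelimiters`, `DELIMITERS` and `FENCE` are hypothetical names, not part of the project. Unlike the version above, it leaves fenced code untouched, including any `\(` inside it. Like the version above, it copies the rest of the text unchanged when a closing delimiter is missing. Inline single-backtick code is not handled here.

```javascript
// Hypothetical single-pass variant of the scanners above (illustrative names).
const FENCE = '`'.repeat(3); // triple backtick, built this way to keep the snippet embeddable

// Checked in order at every position; wrap === null means "copy through as-is".
const DELIMITERS = [
	{ open: FENCE, close: FENCE, wrap: null },
	{ open: '\\[', close: '\\]', wrap: '$$' },
	{ open: '\\(', close: '\\)', wrap: '$' }
];

function convertDelimiters(text) {
	let result = '';
	let i = 0;
	while (i < text.length) {
		const rule = DELIMITERS.find((d) => text.startsWith(d.open, i));
		if (!rule) {
			result += text[i];
			i += 1;
			continue;
		}
		const start = i + rule.open.length;
		const end = text.indexOf(rule.close, start); // first close = lazy match
		if (end === -1) {
			// No closing delimiter: keep the remainder as plain text.
			result += text.slice(i);
			break;
		}
		const inner = text.slice(start, end);
		if (rule.wrap === null) {
			result += rule.open + inner + rule.close;
		} else {
			result += rule.wrap + inner.replace(/\s*\n\s*/g, ' ').trim() + rule.wrap;
		}
		i = end + rule.close.length;
	}
	return result;
}

console.log(convertDelimiters('While Q is orthogonal (\\(Q^T Q = I\\)), \\(I + Q\\) is generally not orthogonal.'));
// → While Q is orthogonal ($Q^T Q = I$), $I + Q$ is generally not orthogonal.
```

Taking the first closing delimiter found by `indexOf` gives the same "lazy" behaviour as the hand-written loops, and checking the code fence first means math-like text inside code is never rewritten.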


@jukofyork commented on GitHub (Sep 5, 2024):

I'm absolutely clueless when it comes to Docker, so I have no idea how to test any of this in an actual running version :/ Where this file even lives in Docker, how it can be edited, and so on are total mysteries. Around 50% of the time when I think I'm updating files inside Docker they don't update, and I couldn't even delete the OpenWebUI Docker data without completely uninstalling everything to do with Docker, its Linux user and so on, so I seriously doubt I could apply a fix even if I knew what it was... :(

I might be able to patch the JS in the browser using the "inspector" though, so I will try that later if I get the chance.


@jukofyork commented on GitHub (Sep 16, 2024):

OpenAI: o1-preview

USER

Can you look at this code:

function escapeBrackets(text: string) {
	let cleanSquareBracket = '';
	let cleanRoundBracket = '';

	const pattern = /(```[\s\S]*?```|`.*?`)|\\\[([\s\S]*?[^\\])\\\]|\\\((.*?)\\\)/g;
	return text.replace(pattern, (match, codeBlock, squareBracket, roundBracket) => {
		if (codeBlock) {
			return codeBlock;
		} else if (squareBracket) {
			cleanSquareBracket = squareBracket.replace(/\s*\n\s*/g, ' ').trim();
			return `$$${cleanSquareBracket}$$`;
		} else if (roundBracket) {
			cleanRoundBracket = roundBracket.replace(/\s*\n\s*/g, ' ').trim();
			return `$${cleanRoundBracket}$`;
		}
		return match.replace(/\s*\n\s*/g, ' ').trim();
	});
}

It seems that \)) isn't closing the inline math block, so whenever the LLM writes a LaTeX expression inside some brackets the parser misses the close, waits until it sees the next closing \), and makes everything between the two red, e.g.:

While Q is orthogonal (\(Q^T Q = I\)), \(I + Q\) is generally not orthogonal

Is treating:

\(Q^T Q = I\)), \(I + Q\)

as the expression, instead of two expressions:

\(Q^T Q = I\)
\(I + Q\)

The problem seems to be in the \\\(.*?\\\) part of the regex: for some reason the extra ) after the \) seems to cause the .*? to not behave "lazily".

Can you explain exactly why this reg-ex is failing and suggest a fix to handle these specific \)) cases properly? It seems to be a simple problem, but as yet no other LLM has been able to offer a valid solution that wouldn't break other LaTeX syntax parsing for backslashes inside the inline expressions, etc.

ASSISTANT

The issue arises due to the way regular expressions handle backtracking, especially with nested patterns and quantifiers like .*?. In your regex, the \\\((.*?)\\\) pattern is designed to match LaTeX expressions enclosed in \( and \). However, when the LaTeX expression is immediately followed by a closing parenthesis ), like in your example, the regex captures that extra ) as part of the LaTeX expression.

Here's why this happens:

  • The .*? is a non-greedy quantifier, so it matches as few characters as possible before the next part of the pattern (\\\)).
  • However, regex engines use backtracking when the overall match fails. Due to the alternation in your regex (|), the regex engine may backtrack and expand the .*? to include more characters, inadvertently including the extra ).

In your example, the regex matches from the first \( to the last \), including the extra ):

\(Q^T Q = I\)), \(I + Q\)

Solution:

To prevent the regex from capturing the extra closing parenthesis ), we can use a negative lookahead (?!\)) to ensure that the closing \) is not immediately followed by another ). Here's the corrected regex:

const pattern = /(```[\s\S]*?```|`.*?`)|\\\[([\s\S]*?[^\\])\\\]|\\\((.*?)\\\)(?!\))/g;

By adding (?!\)) right after \\\), we tell the regex engine to only match \) if it is not followed by a ). This prevents the extra ) from being included in the match.

Explanation:

  • \\\((.*?)\\\) matches LaTeX expressions enclosed in \( and \).
  • (?!\)) is a negative lookahead that asserts that the pattern \) is not followed by ).
  • This modification ensures that the regex stops at the correct \) and doesn't include any subsequent ) characters.

Updated Function:

function escapeBrackets(text) {
  let cleanSquareBracket = '';
  let cleanRoundBracket = '';

  const pattern =
    /(```[\s\S]*?```|`.*?`)|\\\[([\s\S]*?[^\\])\\\]|\\\((.*?)\\\)(?!\))/g;
  return text.replace(
    pattern,
    (match, codeBlock, squareBracket, roundBracket) => {
      if (codeBlock) {
        return codeBlock;
      } else if (squareBracket) {
        cleanSquareBracket = squareBracket.replace(/\s*\n\s*/g, ' ').trim();
        return `$$${cleanSquareBracket}$$`;
      } else if (roundBracket) {
        cleanRoundBracket = roundBracket.replace(/\s*\n\s*/g, ' ').trim();
        return `$${cleanRoundBracket}$`;
      }
      return match.replace(/\s*\n\s*/g, ' ').trim();
    }
  );
}

Testing the Fix:

Using the updated regex on your example:

const text =
  'While Q is orthogonal (\\(Q^T Q = I\\)), \\(I + Q\\) is generally not orthogonal';

console.log(escapeBrackets(text));

Output:

While Q is orthogonal ($Q^T Q = I$), $I + Q$ is generally not orthogonal

Now, the LaTeX expressions are correctly converted, and the extra ) is not included in the LaTeX expression, preserving the intended structure.

USER

Just to be clear: "we tell the regex engine to only match \) if it is not followed by a )" means this will still work for the case where there is a closing bracket after the expression closing? The way you worded it makes it sound like it will not treat this as a closing?

ASSISTANT

You're absolutely right, and I apologize for any confusion. My previous explanation wasn't accurate, and the proposed solution doesn't handle the case correctly. Let me clarify the issue and provide a better solution.

Understanding the Issue:

The problem arises because the regular expression is matching more than it should due to backtracking caused by the alternation (|) in the pattern. Specifically, when the LaTeX expression \(Q^T Q = I\)) is followed by , \(I + Q\), the regex incorrectly includes the extra ), causing it to consume beyond the intended closing \).

Here's what's happening step by step:

  1. The pattern \\\((.*?)\\\) is supposed to match content between \( and \).
  2. The .*? is a non-greedy quantifier, so it should match as little as possible.
  3. However, due to the alternation in the regex pattern and the way regex engines handle backtracking, the engine may expand .*? to include more characters if the overall match fails, even consuming extra ) characters.
  4. This causes the regex to match from the first \( to the last \), including any intervening ) characters.

Why the Previous Solution Doesn't Work:

Adding (?!\)) after \\\) in the regex (\\\((.*?)\\\)(?!\))) means that the closing \) will only be matched if it's not followed by another ). In your case, since the \) is followed by a ), the regex fails to match the closing delimiter, causing the entire expression to be incorrectly matched.
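
The effect of the lookahead can be checked directly. The snippet below is a minimal sketch, assuming a plain JavaScript runtime with `String.prototype.matchAll`; `listMatches` and `sample` are hypothetical names introduced for illustration and are not part of the code in this thread. It lists the raw substrings that the lookahead variant matches on the example sentence.

```js
// The pattern proposed in the first reply: the closing \) is only accepted
// when it is NOT immediately followed by another ")".
const lookaheadPattern =
  /(```[\s\S]*?```|`.*?`)|\\\[([\s\S]*?[^\\])\\\]|\\\((.*?)\\\)(?!\))/g;

// Hypothetical helper: return every raw substring matched by the pattern.
function listMatches(pattern, text) {
  return Array.from(text.matchAll(pattern), (m) => m[0]);
}

const sample =
  'While Q is orthogonal (\\(Q^T Q = I\\)), \\(I + Q\\) is generally not orthogonal';

// The first \) is rejected by the lookahead, so the lazy .*? keeps growing
// until the next \) that is not followed by ")".
const lookaheadMatches = listMatches(lookaheadPattern, sample);
console.log(lookaheadMatches);
```

On this input the lookahead variant should report a single match spanning both expressions, which is exactly the merged span the original question complains about.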

Correct Solution:

To fix this issue, we need to adjust the regex pattern to ensure that:

  • The non-greedy quantifier doesn't consume extra characters inadvertently.
  • The regex engine doesn't backtrack and expand the match beyond the intended scope.

Here's how we can modify the regex:

  1. Change the Order of Alternation: Place the patterns in the regex in an order that minimizes backtracking. Specifically, match the LaTeX expressions before the code blocks.

  2. Use Character Classes to Restrict Matching: Instead of .*?, use a character class that matches any character except the closing delimiter.

  3. Avoid Matching Extra Characters: Ensure that the pattern for matching \( and \) pairs does not include any closing parentheses ) that are not part of the LaTeX expression.

Updated Regex Pattern:

Here's the corrected regex pattern:

const pattern = /\\\(([\s\S]*?)\\\)|\\\[([\s\S]*?)\\\]|(```[\s\S]*?```|`[^`]*`)/g;

Explanation:

  • \\\(([\s\S]*?)\\\): Matches LaTeX expressions enclosed in \( and \).
    • ([\s\S]*?): Matches any character (including newlines) non-greedily.
    • This ensures that we match up to the first closing \) without including extra characters.
  • \\\[([\s\S]*?)\\\]: Matches LaTeX expressions enclosed in \[ and \].
  • `[^`]*`: Matches inline code enclosed in backticks.
    • We use [^`]* instead of .*? to prevent matching backticks inside code blocks.
  • ```[\s\S]*?```: Matches code blocks enclosed in triple backticks.

Updated Function:

function escapeBrackets(text) {
  const pattern =
    /\\\(([\s\S]*?)\\\)|\\\[([\s\S]*?)\\\]|(```[\s\S]*?```|`[^`]*`)/g;
  return text.replace(
    pattern,
    (match, roundBracket, squareBracket, codeBlock) => {
      if (codeBlock) {
        return codeBlock;
      } else if (squareBracket) {
        const cleanSquareBracket = squareBracket.replace(/\s*\n\s*/g, ' ').trim();
        return `$$${cleanSquareBracket}$$`;
      } else if (roundBracket) {
        const cleanRoundBracket = roundBracket.replace(/\s*\n\s*/g, ' ').trim();
        return `$${cleanRoundBracket}$`;
      }
      return match;
    }
  );
}

Testing the Updated Function:

Let's test it with your example:

const text =
  'While Q is orthogonal (\\(Q^T Q = I\\)), \\(I + Q\\) is generally not orthogonal';

console.log(escapeBrackets(text));

Expected Output:

While Q is orthogonal ($Q^T Q = I$), $I + Q$ is generally not orthogonal

Explanation of the Changes:

  • Order of Patterns: By placing the LaTeX expression patterns before the code blocks, we ensure that LaTeX expressions are matched first, reducing the chance of backtracking.
  • Restricting Matches: Using [\s\S]*? directly inside the LaTeX patterns ensures that we match any character, but the non-greedy quantifier stops at the first \\\).
  • Avoiding Backtracking Issues: By simplifying the regex and avoiding unnecessary alternation, we reduce the complexity and potential for the regex engine to backtrack and over-consume characters.

Why This Works:

  • The regex now correctly matches each LaTeX expression separately, even if there are extra parentheses or characters following the \) delimiter.
  • The non-greedy quantifier *? ensures that the match stops at the first occurrence of \\\).
  • By avoiding more complex assertions and lookarounds, we simplify the matching process and prevent unintended behavior.
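
One behavioural difference between the two inline patterns that is easy to overlook: `.` does not match line breaks in JavaScript (without the `s` flag), while `[\s\S]` does. The sketch below assumes a plain JavaScript runtime; `dotPattern`, `anyCharPattern`, `toDollar` and `multiLine` are hypothetical names used only for this illustration. It applies just the inline part of each pattern to an expression that spans two lines.

```js
// Inline part of the original pattern: "." stops at line breaks.
const dotPattern = /\\\((.*?)\\\)/g;
// Inline part of the updated pattern: [\s\S] also matches line breaks.
const anyCharPattern = /\\\(([\s\S]*?)\\\)/g;

// Same clean-up as in escapeBrackets: collapse line breaks, trim, wrap in $...$.
const toDollar = (_match, body) =>
  `$${body.replace(/\s*\n\s*/g, ' ').trim()}$`;

const multiLine = 'Sum: \\(a\n  + b\\) done';

const dotResult = multiLine.replace(dotPattern, toDollar);
const anyCharResult = multiLine.replace(anyCharPattern, toDollar);

console.log(JSON.stringify(dotResult));     // left unchanged: no match across the line break
console.log(JSON.stringify(anyCharResult)); // converted to a single-line $...$ expression
```

So the updated pattern also converts inline expressions that contain a line break, which the original pattern leaves untouched. Whether that is desirable for the renderer is a separate question.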

Answer to Your Question:

Just to be clear: "we tell the regex engine to only match \) if it is not followed by a )" means this will still work for the case where there is a closing bracket after the expression closing? The way you worded it makes it sound like it will not treat this as a closing?

You're correct, and I apologize for any confusion. My previous wording was misleading, and the proposed negative lookahead (?!\)) would indeed prevent the \) from being matched if it's followed by a ), which is not what we want.

In the updated solution, we've adjusted the regex to ensure that the closing \) is always matched, regardless of what follows it. This ensures that LaTeX expressions like \(Q^T Q = I\) are correctly identified and transformed, even if they are immediately followed by additional parentheses or other characters.

Key Takeaways:

  • Be Cautious with Lookarounds: Negative lookaheads or lookbehinds can sometimes prevent the regex from matching valid patterns, especially when dealing with characters that can appear in various contexts.
  • Simplify When Possible: Reducing the complexity of the regex by simplifying patterns and avoiding unnecessary alternation can lead to more reliable matches.
  • Test Thoroughly: Always test the regex with various inputs to ensure it handles all expected cases, especially edge cases like the one you encountered.

I hope this clarifies the issue and provides a working solution for your problem. Let me know if you have any more questions or need further assistance!


@jukofyork commented on GitHub (Sep 16, 2024):

OpenAI: o1-mini doesn't get it at all and only gives unusable solutions.

Reading through the OpenAI: o1-preview solution makes me think that just breaking the regex into 3 separate patterns and running the replace 3 times would quickly show whether it is actually the ordering that causes the lazy match to fail?
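
That experiment could look roughly like the sketch below, assuming a plain JavaScript runtime. `convertCombined`, `convertSeparately`, `clean` and the pattern constants are hypothetical names for this illustration; the combined pattern is copied from the original `escapeBrackets`. Running the passes separately drops the protection of LaTeX-looking text inside code spans, so this is meant as a diagnostic only, not as a replacement.

```js
// Combined pattern from the original escapeBrackets.
const combinedPattern =
  /(```[\s\S]*?```|`.*?`)|\\\[([\s\S]*?[^\\])\\\]|\\\((.*?)\\\)/g;

// The same three alternatives as standalone patterns.
const codePattern = /```[\s\S]*?```|`.*?`/g;
const displayPattern = /\\\[([\s\S]*?[^\\])\\\]/g;
const inlinePattern = /\\\((.*?)\\\)/g;

// Collapse line breaks and trim, as the original function does.
const clean = (s) => s.replace(/\s*\n\s*/g, ' ').trim();

function convertCombined(text) {
  return text.replace(combinedPattern, (match, code, display, inline) => {
    if (code) return code;
    if (display) return `$$${clean(display)}$$`;
    if (inline) return `$${clean(inline)}$`;
    return clean(match);
  });
}

function convertSeparately(text) {
  return text
    .replace(codePattern, (code) => code) // pass 1: code is left as-is (a no-op)
    .replace(displayPattern, (_m, display) => `$$${clean(display)}$$`) // pass 2
    .replace(inlinePattern, (_m, inline) => `$${clean(inline)}$`); // pass 3
}

const sampleText =
  'While Q is orthogonal (\\(Q^T Q = I\\)), \\(I + Q\\) is generally not orthogonal';

console.log(convertCombined(sampleText));
console.log(convertSeparately(sampleText));
```

If both calls print the same string with two separate `$...$` expressions, the alternation order is probably not what breaks the `\))` case, and the cause may sit later in the rendering pipeline. That would still need to be confirmed against the running application.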


@jukofyork commented on GitHub (Sep 16, 2024):

@tjbck can we reopen this? The LaTeX parsing is sooo close to working now and this just seems like the last hurdle and it will be working 100%! :)


@a15355447898a commented on GitHub (Jan 20, 2025):

To date, I still frequently encounter issues where mathematical formulas don't render correctly, such as the following examples.

![Image](https://github.com/user-attachments/assets/fd4c8097-c167-420c-a88a-a57601cbf94c)

> #### Four parameters
> - \( \theta_i \): joint variable, the rotation angle about the Zi axis of the current link.
> - \( d_i \): translation along the Zi axis of the current link (same as in the D-H parameters).
> - \( a_i \): in the modified D-H method, the translation distance between the two joint axes on the following link.
> - \( \alpha_i \): angle from the Zi axis of the current link to the Zi+1 axis of the next link, as a rotation about the Xi axis.
>
> #### Characteristics
> - The modified D-H method is more flexible in how local coordinate frames are defined and suits complex robot structures.
> - Compared with the D-H method, the modified D-H method is better suited to describing many different types of mechanical structures, such as parallel robots.
> - It usually leads to fewer exceptional parameter values (for example \(\sin(90^\circ)\) and \(\cos(90^\circ)\)), so the computation is simpler and more intuitive.

![Image](https://github.com/user-attachments/assets/2f66215f-a6e8-4928-9be4-dac8dfb689d2)

> 2. **Compute the control quantities**: from the reference and feedback signals ($i_{sm}$, $i_{st}$, $\psi_r$, $\omega$), compute the stator current references $i_{sm}^*$ and $i_{st}^*$ using a specific control algorithm (for example vector control).

![Image](https://github.com/user-attachments/assets/e7fe73c0-909f-4e7d-af2a-016ba7e328ba)

> * **Steady-state error:** the difference between the system's output and the feedback quantity once the system has reached steady state (that is, $t \to \infty$). In the figure, $\Delta U_n$ denotes the steady-state error.

![Image](https://github.com/user-attachments/assets/fb850d62-65e3-4d0a-ac2f-9f3d992f34b0)

> ## III. Dynamic constraint modeling
>
> ### 1. Water balance equation
> $$\frac{dh_i}{dt} = \frac{1}{A_i}[\underbrace{Q_{in,i}(t)}_{\text{inflow}} - \underbrace{Q_{out,i}(t)}_{\text{outflow}}] + P_i(t) - E_i(t)$$
> where $A_i$ is the lake surface area [Problem_D_Addendum.pdf]
>
> ### 2. Hydraulic connection constraint
> $$Q_{out,i} = f_i(h_i,h_{i+1},u_i)$$
> $u_i$ is the control-dam strategy variable
>
> ### 3. Physical bound constraints
> $$h_{i}^{min} \leq h_i(t) \leq h_{i}^{max}$$
> Bounds are taken from historical data [GLHYD_data_english.xlsx]
>
> ## IV. Multi-objective optimization model
>
> ### 1. Combining the objective functions
> $$\min \mathbf{J} = [J_1, -J_2, J_3]^T$$
>
> ### 2. Determining the weights (AHP method)
> Build the judgment matrix:
> $$A = \begin{bmatrix}
> 1 & 3 & 5 \\
> 1/3 & 1 & 3 \\
> 1/5 & 1/3 & 1
> \end{bmatrix}$$
> Computing the eigenvector gives the weights $\mathbf{w} = [0.63, 0.26, 0.11]^T$
>
> ### 3. Final optimization problem
> $$\min_{\mathbf{u}} \sum_{k=1}^3 w_k J_k$$
> s.t.
> $$
> \begin{cases}
> \text{dynamic equation constraints} \\
> Q_{min} \leq u_i \leq Q_{max} \\
> |du_i/dt| \leq \Delta Q_{max}
> \end{cases}
> $$
>
> ## V. Solution algorithm
>
> ### 1. Model predictive control framework
> ```mermaid
> graph TD
> A[Real-time water level monitoring] --> B[Rolling optimization]
> B --> C[Control strategy solving]
> C --> D[Actuator adjustment]
> D --> E[Dynamic system]
> E -->|feedback| A
> ```

Software like Typora can render these formats perfectly.

![Image](https://github.com/user-attachments/assets/5e2435a6-3403-4166-b336-4492f13b0e27)

Is there any plan to address this issue in open-webui? I'm really looking forward to it!
Reference: github-starred/open-webui#1738