[PR #14885] [CLOSED] feat(i18n): Implement automated translation file management script #23629

Closed
opened 2026-04-20 04:56:04 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/14885
Author: @silentoplayz
Created: 6/11/2025
Status: Closed

Base: dev ← Head: i18n-keys-alignment


📝 Commits (4)

  • f725c40 fix: Removed orphaned keys & aligned with en-US keys
  • 63b7bda chore: Update 2 translation.json files
  • 4858268 fix: consistent sorting of all keys, including nested ones, with lowercase before uppercase
  • 4c2885b Update sync-translations.cjs

📊 Changes

4 files changed (+1675 additions, -557 deletions)

View changed files

📝 package.json (+2 -1)
➕ scripts/sync-translations.cjs (+228 -0)
📝 src/lib/i18n/locales/gl-ES/translation.json (+259 -47)
📝 src/lib/i18n/locales/tk-TM/translation.json (+1186 -509)

📄 Description

Pull Request Checklist

Before submitting, make sure you've checked the following:

  • Target branch: Please verify that the pull request targets the dev branch.
  • Description: Provide a concise description of the changes made in this pull request.
  • Changelog: Ensure a changelog entry following the format of Keep a Changelog is added at the bottom of the PR description.
  • Documentation: Have you updated relevant documentation Open WebUI Docs, or other documentation sources?
  • Dependencies: Are there any new dependencies? Have you updated the dependency versions in the documentation?
  • Testing: Have you written and run sufficient tests to validate the changes?
  • Code review: Have you performed a self-review of your code, addressing any coding standard issues and ensuring adherence to the project's coding standards?
  • Prefix: To clearly categorize this pull request, prefix the pull request title using one of the following:
    • BREAKING CHANGE: Significant changes that may affect compatibility
    • build: Changes that affect the build system or external dependencies
    • ci: Changes to our continuous integration processes or workflows
    • chore: Refactor, cleanup, or other non-functional code changes
    • docs: Documentation update or addition
    • feat: Introduces a new feature or enhancement to the codebase
    • fix: Bug fix or error correction
    • i18n: Internationalization or localization changes
    • perf: Performance improvement
    • refactor: Code restructuring for better maintainability, readability, or scalability
    • style: Changes that do not affect the meaning of the code (white space, formatting, missing semi-colons, etc.)
    • test: Adding missing tests or correcting existing tests
    • WIP: Work in progress, a temporary label for incomplete or ongoing work

Changelog Entry

Description

  • Introduces a new utility script (scripts/sync-translations.cjs) and an associated npm run i18n:sync command to automate and standardize the management of translation.json files across all locales. This reduces the manual effort of maintaining translations and keeps every locale's key set and key order consistent with the en-US master.

Added

  • New npm run i18n:sync script in package.json.
  • New utility script: scripts/sync-translations.cjs.

Changed

  • All translation.json files, including newly created ones, are now automatically synchronized with the en-US master locale. This includes:
    • Automatic Creation: Simply create a new locale directory (src/lib/i18n/locales/xx-YY), and running npm run i18n:sync will automatically generate and populate its translation.json file from the en-US master.
    • Key Synchronization: Existing translation.json files are strictly aligned with the en-US master's key set and nested structure, preserving existing translations.

Removed

  • Obsolete and orphaned translation keys are now automatically removed from all translation.json files if they are not present in the en-US master, ensuring a clean and current key set.

Fixed

  • Inconsistent key sets across different translation.json files (e.g., missing keys are added, and keys not present in en-US are removed).
  • Discrepancies in the sorting order of keys within translation.json files, including nested objects, to ensure a consistent, natural alphabetical order (lowercase before uppercase).
  • Reduced manual work involved in maintaining translation file consistency and creating new language files.

Breaking Changes

  • BREAKING CHANGE: None. While this script removes unused keys from translation files (i.e., keys not present in the en-US master), this is not considered a breaking change as these keys were already non-functional or obsolete. It strictly improves consistency without impacting active translations.

Additional Information

This pull request enhances our existing internationalization (i18n) workflow by introducing a dedicated synchronization script.

Existing i18n Workflow (Context):

  • i18next-parser (npm run i18n:parse): Our project currently uses i18next-parser (configured by i18next-parser.config.ts) to automate the extraction of translation keys.
    • This parser scans the source files matched by src/**/*.{js,svelte}, i.e. our Svelte and JavaScript files. As written, this glob does not match .ts files.
    • It generates/updates translation.json files (e.g., src/lib/i18n/locales/$LOCALE/$NAMESPACE.json).
    • It helps keep our en-US/translation.json (and other generated files) up-to-date with strings found in the code, adding new keys (defaultValue: '') and potentially removing keys no longer detected in the codebase (keepRemoved: false).
    • The list of languages processed by i18next-parser is dynamically sourced from src/lib/i18n/locales/languages.json through the getLanguages helper in src/lib/i18n/index.ts.
  • src/lib/i18n/index.ts: This file sets up i18next for our application, handling language detection, loading translation resources dynamically (resourcesToBackend), and managing the active language. It's the runtime core of our i18n system.
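For orientation, the parser options mentioned above might sit together in a config roughly like the one below. This is a minimal sketch written as a CommonJS object, not a copy of our actual i18next-parser.config.ts: the locales array is a hypothetical stand-in for what getLanguages reads from languages.json, and sort is shown enabled purely for illustration.

```javascript
// Hypothetical sketch of an i18next-parser configuration.
// The option names (locales, input, output, defaultValue, keepRemoved, sort)
// are i18next-parser options; the values mirror the description above.
const parserConfig = {
  // Assumption: hard-coded here; the project derives this list from languages.json.
  locales: ['en-US', 'gl-ES', 'tk-TM'],
  input: 'src/**/*.{js,svelte}',
  output: 'src/lib/i18n/locales/$LOCALE/$NAMESPACE.json',
  defaultValue: '',   // newly extracted keys start out empty
  keepRemoved: false, // keys no longer found in the code are dropped
  sort: true          // parser-side alphabetical sorting (illustrative)
};

module.exports = parserConfig;
```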

The Problem sync-translations.cjs Solves:

While i18next-parser is excellent for code-to-translation file synchronization, it has limitations in ensuring perfect consistency between existing translation.json files themselves, or in enforcing a precise, consistent internal sorting order across all files. This can lead to:

  • Key Discrepancies: A key might exist in en-US but be missing in tk-TM or gl-ES (if it wasn't a newly extracted key by the parser). Conversely, some locale files might accumulate "orphaned" keys not present in en-US or the current codebase.
  • Inconsistent Formatting: While i18next-parser offers sort: true, its default alphabetical sorting might not match a "natural" order (e.g., it places "Add Arena" before "Add a model"), which leads to noisy Git diffs and files that are harder to read.
  • Manual Overhead: Adding a new language often requires manually creating an empty translation.json file and then running the parser, which still wouldn't guarantee a complete key set without manual merging.
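The first two problems can be made concrete with a small sketch. The helper names (keyPaths, diffKeys) and the sample data are hypothetical and are not taken from the script or the real locale files; joining nested keys with a dot is only a reporting convenience and would be ambiguous for keys that themselves contain dots.

```javascript
// Flatten a nested translation object into dot-separated key paths.
// Assumption: '.' is used as a separator purely for reporting.
function keyPaths(obj, prefix = '') {
  return Object.entries(obj).flatMap(([key, value]) => {
    const path = prefix ? `${prefix}.${key}` : key;
    const isBranch = value !== null && typeof value === 'object' && !Array.isArray(value);
    return isBranch ? keyPaths(value, path) : [path];
  });
}

// Report keys a locale is missing and keys the master no longer defines.
function diffKeys(master, locale) {
  const masterKeys = new Set(keyPaths(master));
  const localeKeys = new Set(keyPaths(locale));
  return {
    missing: [...masterKeys].filter((k) => !localeKeys.has(k)),
    orphaned: [...localeKeys].filter((k) => !masterKeys.has(k))
  };
}

// Made-up sample data.
const enUS = { 'Add a model': '', 'Add Arena': '', Settings: { Theme: '' } };
const glES = { 'Add Arena': 'Engadir Arena', 'Old key': 'Vello' };

console.log(diffKeys(enUS, glES));
// → { missing: [ 'Add a model', 'Settings.Theme' ], orphaned: [ 'Old key' ] }

// The default Array.prototype.sort compares UTF-16 code units,
// so 'A' (65) comes before 'a' (97).
console.log(['Add a model', 'Add Arena'].sort());
// → [ 'Add Arena', 'Add a model' ]
```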

How scripts/sync-translations.cjs (npm run i18n:sync) Works:

To address these gaps, the new scripts/sync-translations.cjs script provides a post-processing and comprehensive synchronization layer that operates directly on the JSON translation files, leveraging the en-US file as the ultimate source of truth.

  1. Master Locale as Canonical Source:

    • The script first loads src/lib/i18n/locales/en-US/translation.json. This file is used as the definitive canonical structure for all keys (including nested paths) and their desired alphabetical order. All other locale files will be brought into strict alignment with this master.
  2. Iterative Processing of All Locales:

    • It then systematically identifies and processes every locale folder found under src/lib/i18n/locales/ (e.g., tk-TM, gl-ES, fr), regardless of whether a translation.json file currently exists within them.
  3. Automatic File Creation and Initialization:

    • For any locale directory where a translation.json file is missing (e.g., for a newly added language where only the folder exists), the script automatically creates this file. It initializes this new file as an empty JSON object in memory, ready to be populated.
  4. Intelligent Key Merging (Adding Missing Keys):

    • For each locale's translation.json data (whether newly loaded or initialized as empty), the script performs a deep, recursive merge with the en-US master's canonical structure.
    • Any translation keys (including nested objects) present in the en-US master but missing in the current locale's file are recursively added.
    • Crucially, if a key already exists in the current locale's file, its existing translated value is preserved and never overwritten. This is fundamental to preventing data loss.
    • Newly added leaf keys are populated with their corresponding value from the en-US master as a placeholder, or an empty string ("") if no en-US equivalent is found (though this is rare given en-US is the source).
  5. Strict Key Pruning (Removing Unused Keys):

    • Conversely, if a translation key (or a nested object structure) exists in a locale's translation.json file but is not present in the en-US master's canonical structure, that key is automatically removed. This eliminates obsolete entries and keeps our translation files lean and current, ensuring they only contain keys actively defined in the master.
  6. Recursive Natural Alphabetical Sorting:

    • After all merging and pruning operations, the entire JSON object for each locale's translation.json file is recursively sorted alphabetically.
    • The sorting logic uses String.prototype.localeCompare with caseFirst: 'lower'. Unlike a plain code-point sort, where every uppercase letter precedes every lowercase one, this compares keys by letter first and uses case only as a tie-breaker, with lowercase first. As a result, "Add a model ID" sorts before "Add Arena Model", which keeps the files readable and Git diffs small.
  7. Optimized File Writes:

    • The script intelligently compares the final, sorted content with the original file content (if it existed) before writing. A file is only written back to disk if actual changes (key additions/removals or sorting adjustments) have occurred, preventing unnecessary file modifications.
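Steps 4 to 6 can be sketched as one recursive function. This is an illustration under stated assumptions rather than the code in scripts/sync-translations.cjs: the names isPlainObject, compareKeys and syncWithMaster are hypothetical, the 'en' locale passed to localeCompare is an assumption, and the sample data is made up.

```javascript
const isPlainObject = (v) => v !== null && typeof v === 'object' && !Array.isArray(v);

// Locale-aware comparison with lowercase first as the tie-breaker.
// Assumption: the 'en' locale is used for comparing keys.
const compareKeys = (a, b) => a.localeCompare(b, 'en', { caseFirst: 'lower' });

// Build a new object that has exactly the master's keys, in sorted order.
// Pruning falls out for free: keys absent from the master are never copied.
function syncWithMaster(master, locale) {
  const result = {};
  for (const key of Object.keys(master).sort(compareKeys)) {
    const masterValue = master[key];
    const localeValue = Object.hasOwn(locale, key) ? locale[key] : undefined;
    if (isPlainObject(masterValue)) {
      // Recurse into nested structures, starting empty if the locale has none.
      result[key] = syncWithMaster(masterValue, isPlainObject(localeValue) ? localeValue : {});
    } else if (typeof localeValue === 'string') {
      result[key] = localeValue;       // preserve the existing translation
    } else {
      result[key] = masterValue ?? ''; // placeholder taken from the master
    }
  }
  return result;
}

// Made-up sample data.
const master = {
  'Add Arena Model': 'Add Arena Model',
  'Add a model ID': 'Add a model ID',
  Nested: { b: 'B', a: 'A' }
};
const locale = { 'Add Arena Model': 'X', Orphan: 'gone', Nested: { b: 'Y', stale: 'z' } };

const synced = syncWithMaster(master, locale);
console.log(Object.keys(synced));
// → [ 'Add a model ID', 'Add Arena Model', 'Nested' ]
console.log(synced.Nested);
// → { a: 'A', b: 'Y' }
```

Building a fresh object from the master's sorted keys handles adding, pruning and ordering in a single pass. One caveat of this approach: JavaScript always enumerates integer-like keys (such as "1") before other keys, regardless of insertion order.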

The Workflow:

The ideal workflow after this PR will be:

  1. npm run i18n:parse: Run this first to scan your codebase for new strings and update en-US/translation.json (and other files) with what's found in the code. This is your "source code to master translation" sync.
  2. npm run i18n:sync: Run this second to take the (now updated) en-US/translation.json and propagate its structure, keys, and order to all other translation.json files. This is your "master translation to other translations" sync and cleanup.

In essence:

  • i18n:parse is about discovery and extraction from code.
  • i18n:sync is about enforcement of consistency, standardization, and cleanup between translation files, driven by a single master.
  • To run this comprehensive translation file synchronization process, execute:
    npm run i18n:sync
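At the file level, what that command does (walking every locale folder, treating a missing translation.json as empty, and writing only on change) might look roughly like the sketch below. The function name syncLocaleFiles and its syncFn callback are hypothetical, and the choices to skip the master folder, indent with tabs and end files with a newline are assumptions; the real script may differ on each point.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Sync every locale folder under `localesDir` against the master locale.
// `syncFn(master, locale)` is a hypothetical callback that returns the
// merged, pruned and sorted object for one locale.
function syncLocaleFiles(localesDir, syncFn, masterLocale = 'en-US') {
  const fileOf = (name) => path.join(localesDir, name, 'translation.json');
  const master = JSON.parse(fs.readFileSync(fileOf(masterLocale), 'utf8'));
  const written = [];

  for (const entry of fs.readdirSync(localesDir, { withFileTypes: true })) {
    // Assumption: only directories count as locales, and the master is skipped.
    if (!entry.isDirectory() || entry.name === masterLocale) continue;

    const file = fileOf(entry.name);
    // A missing file is treated as an empty object, ready to be populated.
    const original = fs.existsSync(file) ? fs.readFileSync(file, 'utf8') : null;
    const current = original === null ? {} : JSON.parse(original);

    // Assumption: tab indentation and a trailing newline.
    const next = JSON.stringify(syncFn(master, current), null, '\t') + '\n';

    // Only touch the disk when the serialized content actually changed.
    if (next !== original) {
      fs.writeFileSync(file, next, 'utf8');
      written.push(entry.name);
    }
  }
  return written; // names of the locales whose files were created or updated
}
```

Any function that maps (master, locale) to the synchronized object can be passed as syncFn. Comparing the serialized strings rather than the parsed objects means that a pure reordering of keys is also detected as a change.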
    

Contributor License Agreement

By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-20 04:56:04 -05:00
Reference: github-starred/open-webui#23629