Dump contains duplicated data #4894

Closed
opened 2025-11-02 06:06:24 -06:00 by GiteaMirror · 4 comments

Originally created by @PhilippHomann on GitHub (Feb 19, 2020).

  • Gitea version (or commit ref): v1.11.1
  • Git version: 2.24.1
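  • Operating system: Docker Container (https://hub.docker.com/layers/gitea/gitea/1.11.1/images/sha256-b038973ab4ee1a04ac5518124c7b5be32aa46cb0fa025adb2b11af38b1acc6a3?context=explore)
  • Database (use [x]):
    • [ ] PostgreSQL
    • [ ] MySQL
    • [ ] MSSQL
    • [x] SQLite
  • Can you reproduce the bug at https://try.gitea.io:
    • [ ] Yes (provide example URL)
    • [ ] No
    • [x] Not relevant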
  • Log gist:

Description

It seems that much of the data inside a dump created using gitea dump is duplicated.

Steps to reproduce

  1. Run docker container: docker run --name=gitea -p 3000:3000 -ti --rm gitea/gitea:1
  2. Run initial setup without any configuration change
  3. Run dump command
    docker exec -ti gitea sh
    su git
    gitea dump
  4. Examine dump file
Archive:  gitea-dump-1582113818.zip
Zip file size: 66623 bytes, number of entries: 53
drwxr-xr-x  2.0 unx        0 bx stor 20-Feb-19 13:03 custom/
drwxr-xr-x  2.0 unx        0 bx stor 20-Feb-19 13:03 custom/conf/
-rw-r--r--  2.0 unx     2069 bX defN 20-Feb-19 13:03 custom/conf/app.ini
drwxr-xr-x  2.0 unx        0 bx stor 20-Feb-19 13:03 custom/indexers/
drwxr-xr-x  2.0 unx        0 bx stor 20-Feb-19 13:03 custom/indexers/issues.bleve/
-rw-r--r--  2.0 unx       13 bX defN 20-Feb-19 13:03 custom/indexers/issues.bleve/rupture_meta.json
-rw-------  2.0 unx    32768 bX defN 20-Feb-19 13:03 custom/indexers/issues.bleve/store
-rw-r--r--  2.0 unx       47 bX defN 20-Feb-19 13:03 custom/indexers/issues.bleve/index_meta.json
drwxr-xr-x  2.0 unx        0 bx stor 20-Feb-19 13:03 custom/log/
-rw-r-----  2.0 unx    51734 bX defN 20-Feb-19 13:03 custom/log/gitea.log
-rw-r--r--  2.0 unx  1110016 bX defN 20-Feb-19 13:03 custom/gitea.db
drwxr-xr-x  2.0 unx        0 bx stor 20-Feb-19 13:03 custom/queues/
drwxr-xr-x  2.0 unx        0 bx stor 20-Feb-19 13:03 custom/queues/issue_indexer/
-rw-r--r--  2.0 unx       54 bX defN 20-Feb-19 13:03 custom/queues/issue_indexer/MANIFEST-000000
-rw-r--r--  2.0 unx       67 bX defN 20-Feb-19 13:03 custom/queues/issue_indexer/000001.log
-rw-r--r--  2.0 unx        0 bX defN 20-Feb-19 13:03 custom/queues/issue_indexer/LOCK
-rw-r--r--  2.0 unx       16 bX defN 20-Feb-19 13:03 custom/queues/issue_indexer/CURRENT
-rw-r--r--  2.0 unx      360 bX defN 20-Feb-19 13:03 custom/queues/issue_indexer/LOG
drwxr-xr-x  2.0 unx        0 bx stor 20-Feb-19 13:03 custom/queues/task/
-rw-r--r--  2.0 unx       54 bX defN 20-Feb-19 13:03 custom/queues/task/MANIFEST-000000
-rw-r--r--  2.0 unx       67 bX defN 20-Feb-19 13:03 custom/queues/task/000001.log
-rw-r--r--  2.0 unx        0 bX defN 20-Feb-19 13:03 custom/queues/task/LOCK
-rw-r--r--  2.0 unx       16 bX defN 20-Feb-19 13:03 custom/queues/task/CURRENT
-rw-r--r--  2.0 unx      358 bX defN 20-Feb-19 13:03 custom/queues/task/LOG
drwxr-xr-x  2.0 unx        0 bx stor 20-Feb-19 13:03 data/
drwxr-xr-x  2.0 unx        0 bx stor 20-Feb-19 13:03 data/conf/
-rw-r--r--  2.0 unx     2069 bX defN 20-Feb-19 13:03 data/conf/app.ini
drwxr-xr-x  2.0 unx        0 bx stor 20-Feb-19 13:03 data/indexers/
drwxr-xr-x  2.0 unx        0 bx stor 20-Feb-19 13:03 data/indexers/issues.bleve/
-rw-r--r--  2.0 unx       13 bX defN 20-Feb-19 13:03 data/indexers/issues.bleve/rupture_meta.json
-rw-------  2.0 unx    32768 bX defN 20-Feb-19 13:03 data/indexers/issues.bleve/store
-rw-r--r--  2.0 unx       47 bX defN 20-Feb-19 13:03 data/indexers/issues.bleve/index_meta.json
drwxr-xr-x  2.0 unx        0 bx stor 20-Feb-19 13:03 data/log/
-rw-r-----  2.0 unx    51734 bX defN 20-Feb-19 13:03 data/log/gitea.log
-rw-r--r--  2.0 unx  1110016 bX defN 20-Feb-19 13:03 data/gitea.db
drwxr-xr-x  2.0 unx        0 bx stor 20-Feb-19 13:03 data/queues/
drwxr-xr-x  2.0 unx        0 bx stor 20-Feb-19 13:03 data/queues/issue_indexer/
-rw-r--r--  2.0 unx       54 bX defN 20-Feb-19 13:03 data/queues/issue_indexer/MANIFEST-000000
-rw-r--r--  2.0 unx       67 bX defN 20-Feb-19 13:03 data/queues/issue_indexer/000001.log
-rw-r--r--  2.0 unx        0 bX defN 20-Feb-19 13:03 data/queues/issue_indexer/LOCK
-rw-r--r--  2.0 unx       16 bX defN 20-Feb-19 13:03 data/queues/issue_indexer/CURRENT
-rw-r--r--  2.0 unx      360 bX defN 20-Feb-19 13:03 data/queues/issue_indexer/LOG
drwxr-xr-x  2.0 unx        0 bx stor 20-Feb-19 13:03 data/queues/task/
-rw-r--r--  2.0 unx       54 bX defN 20-Feb-19 13:03 data/queues/task/MANIFEST-000000
-rw-r--r--  2.0 unx       67 bX defN 20-Feb-19 13:03 data/queues/task/000001.log
-rw-r--r--  2.0 unx        0 bX defN 20-Feb-19 13:03 data/queues/task/LOCK
-rw-r--r--  2.0 unx       16 bX defN 20-Feb-19 13:03 data/queues/task/CURRENT
-rw-r--r--  2.0 unx      358 bX defN 20-Feb-19 13:03 data/queues/task/LOG
drwxr-xr-x  2.0 unx        0 bx stor 20-Feb-19 13:03 log/
-rw-r-----  2.0 unx    51734 bX defN 20-Feb-19 13:03 log/gitea.log
-rw-r--r--  2.0 unx    32618 bX defN 20-Feb-19 13:03 gitea-db.sql
-rw-r--r--  2.0 unx     2069 bX defN 20-Feb-19 13:03 app.ini
-rw-r--r--  2.0 unx      142 bX defN 20-Feb-19 13:03 gitea-repo.zip
53 files, 2481841 bytes uncompressed, 58531 bytes compressed:  97.7%

It looks to me like the data and the custom directories contain the same data.
Also, the log files (which might be quite large) are dumped a third time, and so is app.ini.
Is this expected behaviour?
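
One way to confirm the duplication without reading the listing by eye is to group the archive members by checksum. The following is a minimal sketch, not part of Gitea: the function name `find_duplicate_entries` is hypothetical, and it treats two members as duplicates when their CRC-32 and uncompressed size match, which is a strong hint rather than a proof of identical content. Empty files such as LOCK are skipped because they would all match each other.

```python
import zipfile
from collections import defaultdict


def find_duplicate_entries(zip_path):
    """Return groups of member names that share CRC-32 and uncompressed size."""
    groups = defaultdict(list)
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            # Directories and empty files carry no content worth comparing.
            if info.is_dir() or info.file_size == 0:
                continue
            groups[(info.CRC, info.file_size)].append(info.filename)
    # Keep only checksums that occur more than once; sort names for stable output.
    return [sorted(names) for names in groups.values() if len(names) > 1]
```

Calling `find_duplicate_entries("gitea-dump-1582113818.zip")` on the dump above should return one group per duplicated file, each listing the paths that carry the same content.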

GiteaMirror added the issue/needs-feedback label 2025-11-02 06:06:24 -06:00

@lunny commented on GitHub (Feb 19, 2020):

Could you check whether those duplicated files actually exist inside your Docker container?


@PhilippHomann commented on GitHub (Feb 19, 2020):

The folder structure after initial setup:

bash-5.0# find /data/gitea
/data/gitea
/data/gitea/conf
/data/gitea/conf/app.ini
/data/gitea/indexers
/data/gitea/indexers/issues.bleve
/data/gitea/indexers/issues.bleve/rupture_meta.json
/data/gitea/indexers/issues.bleve/store
/data/gitea/indexers/issues.bleve/index_meta.json
/data/gitea/log
/data/gitea/log/gitea.log
/data/gitea/gitea.db
/data/gitea/queues
/data/gitea/queues/issue_indexer
/data/gitea/queues/issue_indexer/MANIFEST-000000
/data/gitea/queues/issue_indexer/000001.log
/data/gitea/queues/issue_indexer/LOCK
/data/gitea/queues/issue_indexer/CURRENT
/data/gitea/queues/issue_indexer/LOG
/data/gitea/queues/task
/data/gitea/queues/task/MANIFEST-000000
/data/gitea/queues/task/000001.log
/data/gitea/queues/task/LOCK
/data/gitea/queues/task/CURRENT
/data/gitea/queues/task/LOG

I already took a look at cmd/dump.go, and it seems that the custom path and the data path are backed up separately.
However, in the Docker image the default custom path (set by GITEA_CUSTOM) is /data/gitea, which is the same as the APP_DATA_PATH generated by the Docker image, so the same directory is archived twice.

The log path is also backed up explicitly, even when it is below APP_DATA_PATH.
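
To illustrate the overlap described above, here is a rough sketch of how a dump routine could avoid archiving the same directory more than once. It is written in Python for brevity and is not the code in cmd/dump.go, nor necessarily what any fix does; `plan_dump_roots` and the archive names are hypothetical. A path is skipped when it resolves to, or lies inside, a directory that was already selected, so parent directories should be listed first.

```python
import os


def plan_dump_roots(roots):
    """Pick the (archive_name, path) pairs to archive, skipping any path that
    equals, or is nested inside, a path selected earlier in the list."""
    selected = []
    for name, path in roots:
        real = os.path.realpath(path)
        # commonpath(real, kept) == kept means real is kept itself or below it.
        covered = any(
            os.path.commonpath([real, kept]) == kept for _, kept in selected
        )
        if not covered:
            selected.append((name, real))
    return selected
```

With the Docker defaults from this report, the input would be the custom path `/data/gitea`, the data path `/data/gitea`, and the log path `/data/gitea/log`. Assuming none of these are symlinks, only the first entry would be selected, since the other two are already covered by it.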


@stale[bot] commented on GitHub (Apr 19, 2020):

This issue has been automatically marked as stale because it has not had recent activity. I am here to help clear issues left open even if solved or waiting for more insight. This issue will be closed if no further activity occurs during the next 2 weeks. If the issue is still valid just add a comment to keep it alive. Thank you for your contributions.


@PhilippHomann commented on GitHub (Apr 20, 2020):

@lunny Could you please approve #10376, which fixes this?


Reference: github-starred/gitea#4894