mirror of
https://github.com/go-gitea/gitea.git
synced 2026-03-12 10:39:38 -05:00
Is LFS store garbage collected? #3380
Closed
opened 2025-11-02 05:10:46 -06:00 by GiteaMirror
·
4 comments
No Branch/Tag Specified
main
release/v1.25
release/v1.24
release/v1.23
release/v1.22
release/v1.21
release/v1.20
release/v1.19
release/v1.18
release/v1.17
release/v1.16
release/v1.15
release/v1.14
release/v1.13
release/v1.12
release/v1.11
release/v1.10
release/v1.9
release/v1.8
v1.25.3
v1.25.2
v1.25.1
v1.25.0
v1.24.7
v1.25.0-rc0
v1.26.0-dev
v1.24.6
v1.24.5
v1.24.4
v1.24.3
v1.24.2
v1.24.1
v1.24.0
v1.23.8
v1.24.0-rc0
v1.25.0-dev
v1.23.7
v1.23.6
v1.23.5
v1.23.4
v1.23.3
v1.23.2
v1.23.1
v1.23.0
v1.23.0-rc0
v1.24.0-dev
v1.22.6
v1.22.5
v1.22.4
v1.22.3
v1.22.2
v1.22.1
v1.22.0
v1.23.0-dev
v1.22.0-rc1
v1.21.11
v1.22.0-rc0
v1.21.10
v1.21.9
v1.21.8
v1.21.7
v1.21.6
v1.21.5
v1.21.4
v1.21.3
v1.21.2
v1.20.6
v1.21.1
v1.21.0
v1.21.0-rc2
v1.21.0-rc1
v1.20.5
v1.22.0-dev
v1.21.0-rc0
v1.20.4
v1.20.3
v1.20.2
v1.20.1
v1.20.0
v1.19.4
v1.21.0-dev
v1.20.0-rc2
v1.20.0-rc1
v1.20.0-rc0
v1.19.3
v1.19.2
v1.19.1
v1.19.0
v1.19.0-rc1
v1.20.0-dev
v1.19.0-rc0
v1.18.5
v1.18.4
v1.18.3
v1.18.2
v1.18.1
v1.18.0
v1.17.4
v1.18.0-rc1
v1.19.0-dev
v1.18.0-rc0
v1.17.3
v1.17.2
v1.17.1
v1.17.0
v1.17.0-rc2
v1.16.9
v1.17.0-rc1
v1.18.0-dev
v1.16.8
v1.16.7
v1.16.6
v1.16.5
v1.16.4
v1.16.3
v1.16.2
v1.16.1
v1.16.0
v1.15.11
v1.17.0-dev
v1.16.0-rc1
v1.15.10
v1.15.9
v1.15.8
v1.15.7
v1.15.6
v1.15.5
v1.15.4
v1.15.3
v1.15.2
v1.15.1
v1.14.7
v1.15.0
v1.15.0-rc3
v1.14.6
v1.15.0-rc2
v1.14.5
v1.16.0-dev
v1.15.0-rc1
v1.14.4
v1.14.3
v1.14.2
v1.14.1
v1.14.0
v1.13.7
v1.14.0-rc2
v1.13.6
v1.13.5
v1.14.0-rc1
v1.15.0-dev
v1.13.4
v1.13.3
v1.13.2
v1.13.1
v1.13.0
v1.12.6
v1.13.0-rc2
v1.14.0-dev
v1.13.0-rc1
v1.12.5
v1.12.4
v1.12.3
v1.12.2
v1.12.1
v1.11.8
v1.12.0
v1.11.7
v1.12.0-rc2
v1.11.6
v1.12.0-rc1
v1.13.0-dev
v1.11.5
v1.11.4
v1.11.3
v1.10.6
v1.12.0-dev
v1.11.2
v1.10.5
v1.11.1
v1.10.4
v1.11.0
v1.11.0-rc2
v1.10.3
v1.11.0-rc1
v1.10.2
v1.10.1
v1.10.0
v1.9.6
v1.9.5
v1.10.0-rc2
v1.11.0-dev
v1.10.0-rc1
v1.9.4
v1.9.3
v1.9.2
v1.9.1
v1.9.0
v1.9.0-rc2
v1.10.0-dev
v1.9.0-rc1
v1.8.3
v1.8.2
v1.8.1
v1.8.0
v1.8.0-rc3
v1.7.6
v1.8.0-rc2
v1.7.5
v1.8.0-rc1
v1.9.0-dev
v1.7.4
v1.7.3
v1.7.2
v1.7.1
v1.7.0
v1.7.0-rc3
v1.6.4
v1.7.0-rc2
v1.6.3
v1.7.0-rc1
v1.7.0-dev
v1.6.2
v1.6.1
v1.6.0
v1.6.0-rc2
v1.5.3
v1.6.0-rc1
v1.6.0-dev
v1.5.2
v1.5.1
v1.5.0
v1.5.0-rc2
v1.5.0-rc1
v1.5.0-dev
v1.4.3
v1.4.2
v1.4.1
v1.4.0
v1.4.0-rc3
v1.4.0-rc2
v1.3.3
v1.4.0-rc1
v1.3.2
v1.3.1
v1.3.0
v1.3.0-rc2
v1.3.0-rc1
v1.2.3
v1.2.2
v1.2.1
v1.2.0
v1.2.0-rc3
v1.2.0-rc2
v1.1.4
v1.2.0-rc1
v1.1.3
v1.1.2
v1.1.1
v1.1.0
v1.0.2
v1.0.1
v1.0.0
v0.9.99
Labels
Clear labels
$20
$250
$50
$500
backport/done
💎 Bounty
docs-update-needed
good first issue
hacktoberfest
issue/bounty
issue/confirmed
issue/critical
issue/duplicate
issue/needs-feedback
issue/not-a-bug
issue/regression
issue/stale
issue/workaround
lgtm/need 2
modifies/api
modifies/translation
outdated/backport/v1.18
outdated/theme/markdown
outdated/theme/timetracker
performance/bigrepo
performance/cpu
performance/memory
performance/speed
pr/breaking
proposal/accepted
proposal/rejected
pr/wip
pull-request
reviewed/wontfix
💰 Rewarded
skip-changelog
status/blocked
topic/accessibility
topic/api
topic/authentication
topic/build
topic/code-linting
topic/commit-signing
topic/content-rendering
topic/deployment
topic/distribution
topic/federation
topic/gitea-actions
topic/issues
topic/lfs
topic/mobile
topic/moderation
topic/packages
topic/pr
topic/projects
topic/repo
topic/repo-migration
topic/security
topic/theme
topic/ui
topic/ui-interaction
topic/ux
topic/webhooks
topic/wiki
type/bug
type/deprecation
type/docs
type/enhancement
type/feature
type/miscellaneous
type/proposal
type/question
type/refactoring
type/summary
type/testing
type/upstream
Mirrored from GitHub Pull Request
Milestone
No items
No Milestone
Projects
Clear projects
No project
No Assignees
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: github-starred/gitea#3380
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Originally created by @yacoob on GitHub (May 25, 2019).
[x]):Description
I'm trying to understand how is Gitea's LFS store gets garbage collected. I can see some references to LFS object removal in the code, but I can't find a definite answer when exactly are unreferenced blobs removed from LFS directory. As a test, I've created a repository on gitea, pushed some LFS objects to it, then removed the branch referencing them and forced a gc in the admin panel. The objects under
git/lfsare still present.Does this kind of gc happen at all? Or is it only after whole repository is removed? If there's no automatic gc, please treat this bug as a feature request. If there is, consider this a documentation request.
Thanks!
@ghost commented on GitHub (May 26, 2019):
You have to delete the repository to remove the LFS objects from disk.
@yacoob commented on GitHub (Jun 1, 2019):
That's the way only GC that happens for lfs? Okay - can we treat this topic as a request to implement something more granular, that would run during
git gc?Thanks!
@zeripath commented on GitHub (Jun 1, 2019):
There is one big store - files are not stored by repository but by oid. The repository information is kept separately. We don't have the putative filename - LFS never gives us it - we never get the SHA of the pointer file that points to the oid and although you can guess what the bland pointer should be, the spec allows for extensions so you won't be able to guess them all. You can't even simply use a .gitattributes file stored within the repository - as it might not be stored - and they might not call the filter lfs!
Therefore in terms of a GC, what you would have to do is:
Get a list of the oids that are stored in the LFS for the repo. (Simple select on the database)
Walk the git repository, find all blobs <=1k, check if they look like a pointer file, if so get the oid, check if it's stored in the LFS and is associated with the repo.
Give a diff of the two (possibly three) states.
Any unreachable LFS objects by repository suggest deletion? I guess, but you don't know why they're there - you're assuming LFS is only being used by git-lfs. This might be useful to know about and then you could prune these but this can't be automatically done.
What about potential oids that are missing - either because they're not attached to the repo or they're not in the LFS? Well are you sure that they're actually pointers rather than just files that look like pointers? (You cannot tell the difference - you can't assume .gitattributes is present and you can't really assume that they're only placed there by filter.lfs.* commands either.)
Do you reveal that you have a file matching an oid but one that is not attached to the oid? It could be a security issue to do so - although if sha256 is a secure hash the only way you should have the hash is if you have the object.
In #7082 I decided that the only sensible thing to do when merging a pr from one repository to another was to check if a blob could be a pointer file, check if it's oid is in the LFS and if so associate it with the base repository. (I probably should only add it to the base repository if that oid is actually associated with the head repository for that possible security reason above.)
When we display files in the UI we tend to just check if the blob looks like a pointer and then if the oid is associated with repository assume it's meant to be an LFS object.
It's only during uploads to repositories that we actually pay attention to .gitattributes as that's the only possible hint we have that an object should be in the LFS.
It's not simple at all. The spec for LFS is so extensible that you just don't know why an object has been placed in the LFS.
There is one final thing that might be useful - find all things in the store that are not associated with a repo - then you have to walk all the repos and try to find out if they could match a repo.
@stale[bot] commented on GitHub (Jul 31, 2019):
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs during the next 2 weeks. Thank you for your contributions.