Commit Briefs

13b2bc374c Stefan Sperling

introduce gotd(8), a Git repository server reachable via ssh(1)

This is an initial barebones implementation which provides the absolute minimum of functionality required to serve got(1) and git(1) clients. Basic fetch/send functionality has been tested and seems to work here, but this server is not yet expected to be stable. More testing is welcome. See the man pages for setup instructions.

The current design uses one reader and one writer process per repository, which will have to be extended to N readers and N writers in the future. At startup, each process will chroot(2) into its assigned repository. This works because gotd(8) can only be started as root, and will then fork+exec, chroot, and privdrop.

At present the parent process runs with the following pledge(2) promises: "stdio rpath wpath cpath proc getpw sendfd recvfd fattr flock unix unveil"

The parent is the only process able to modify the repository in a way that becomes visible to Git clients. The parent uses unveil(2) to restrict its view of the filesystem to /tmp and the repositories listed in the configuration file gotd.conf(5). Per-repository chroot(2) processes use "stdio rpath sendfd recvfd". The writer defers to the parent for modifying references in the repository to point at newly uploaded commits. The reader is fine without such help, because Git repositories can be read without having to create any lock-files.

gotd(8) requires a dedicated user ID, which should own repositories on the filesystem, and a separate secondary group, which should not have filesystem-level repository access, and must be allowed access to the gotd(8) socket. To obtain Git repository access, users must be members of this secondary group, and must have their login shell set to gotsh(1). gotsh(1) connects to the gotd(8) socket and speaks Git-protocol towards the client on the other end of the SSH connection. gotsh(1) is not an interactive command shell.

At present, authenticated clients are granted read/write access to all repositories and all references (except for the "refs/got/" and the "refs/remotes/" namespaces, which are already being protected from modification). While complicated access control mechanisms are not a design goal, making it possible to safely offer anonymous Git repository access over ssh(1) is on the road map.
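
The namespace protection mentioned above could be expressed as a simple prefix check. This is only a sketch: ref_is_protected() is a hypothetical helper invented for illustration and is not taken from the gotd(8) sources. Only the two namespace names come from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not gotd(8) code: report whether a reference
 * name lies in a namespace that clients must not modify.
 */
bool
ref_is_protected(const char *refname)
{
	/* The two namespaces named in the commit message. */
	static const char *const protected_ns[] = {
		"refs/got/",
		"refs/remotes/",
	};
	size_t i;

	for (i = 0; i < sizeof(protected_ns) / sizeof(protected_ns[0]); i++) {
		size_t len = strlen(protected_ns[i]);

		/* Prefix match including the trailing slash. */
		if (strncmp(refname, protected_ns[i], len) == 0)
			return true;
	}
	return false;
}
```

Matching on the trailing slash keeps a name such as "refs/gotcha" outside the protected set.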


eb81bc23c7 Tracey Emery

move got_opentempfd out of open_blob. ok stsp@


db9b9b1c2b Stefan Sperling

let got-read-pack be explicit about whether it could enumerate all objects

This allows the main process to avoid looping over all object IDs again in case the pack file used for enumeration is complete. ok op@


0ab4c95723 Stefan Sperling

Bring back object enumeration inside got-read-pack as a fast path.

The problem that was found in the earlier version has been fixed. ok op@


e44d939152 Stefan Sperling

revert object enumeration in got-read-pack for now; needs more work

This implementation marked commits and trees as enumerated before all trees which they depend on were enumerated. This behaviour leads to incomplete pack files when a tree is only partially packed and got-read-pack hits a missing tree entry as a result. The algorithm must be reworked such that packed leaf nodes are marked enumerated first, and the marking then bubbles up. Found by op@
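
The "leaves first, then bubble up" ordering amounts to a post-order traversal. The sketch below illustrates that ordering only. It is not the got-read-pack implementation, and struct node, its fields, and enumerate() are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 4	/* arbitrary limit for this sketch */

/* Hypothetical stand-in for a commit or tree object. */
struct node {
	bool packed;		/* object is present in the pack file */
	bool enumerated;	/* object and everything below it was found */
	size_t nchildren;
	struct node *children[MAX_CHILDREN];
};

/*
 * Post-order walk: children are visited before their parent, so a
 * parent is marked enumerated only if it is packed and every node
 * it depends on has already been marked.
 */
bool
enumerate(struct node *n)
{
	bool complete = n->packed;
	size_t i;

	/* Keep visiting siblings after a failure so that complete
	 * subtrees are still marked. */
	for (i = 0; i < n->nchildren; i++) {
		if (!enumerate(n->children[i]))
			complete = false;
	}

	n->enumerated = complete;
	return complete;
}
```

With this ordering a partially packed tree is never reported as enumerated, which is the failure mode described above.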


cee6a7ea55 Stefan Sperling

implement object enumeration support in got-read-pack

ok op@


fae7e03842 Stefan Sperling

run the search for deltas to reuse in got-read-pack

This significantly speeds up the deltification step of packing by avoiding imsg traffic. gotadmin no longer requests individual raw deltas from got-read-pack to check whether it can reuse them. Instead, got-read-pack obtains a list of objects we want to pack, and hands back the list of all deltas in its pack file which can be reused. Messages are now batched such that imsg buffers are filled as much as possible. Another advantage is that deltas we are not going to reuse will no longer be written to the delta cache file, saving disk space. Before this patch, any raw delta candidate was written to the delta cache file by got-read-pack, and the decision whether to reuse the delta happened afterwards in the gotadmin process. Code for reading individual raw deltas is now unused and could be removed at some point. ok op@
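
The batching idea can be sketched as follows: copy as many fixed-size object IDs as fit into one message buffer before handing it off. This sketch does not use the imsg API. The size constants, send_ids_batched(), the emit callback, and count_sink() are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

#define ID_LEN		20	/* SHA1 digest length */
#define MSG_MAX		16384	/* assumed maximum message size */
#define HDR_LEN		16	/* assumed per-message header size */
#define IDS_PER_MSG	((MSG_MAX - HDR_LEN) / ID_LEN)

/* Hypothetical transport callback; returns 0 on success. */
typedef int (*emit_fn)(const unsigned char *payload, size_t len, void *arg);

/*
 * Send nids object IDs, stored back to back in ids, using as few
 * messages as possible by filling each payload before emitting it.
 */
int
send_ids_batched(const unsigned char *ids, size_t nids, emit_fn emit,
    void *arg)
{
	unsigned char buf[IDS_PER_MSG * ID_LEN];
	size_t filled = 0, i;

	for (i = 0; i < nids; i++) {
		memcpy(buf + filled * ID_LEN, ids + i * ID_LEN, ID_LEN);
		if (++filled == IDS_PER_MSG) {
			if (emit(buf, filled * ID_LEN, arg) != 0)
				return -1;
			filled = 0;
		}
	}

	/* Flush the final, partially filled message. */
	if (filled > 0 && emit(buf, filled * ID_LEN, arg) != 0)
		return -1;
	return 0;
}

/* Example sink which only counts messages and payload bytes. */
struct send_stats {
	size_t nmsgs;
	size_t nbytes;
};

int
count_sink(const unsigned char *payload, size_t len, void *arg)
{
	struct send_stats *st = arg;

	(void)payload;
	st->nmsgs++;
	st->nbytes += len;
	return 0;
}
```

Under these assumed sizes, 818 IDs fit into one message, so the number of messages drops by roughly that factor compared with one message per object.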


2d9e6abf24 Stefan Sperling

store deltas in compressed form while packing, both in memory and cache file

This reduces memory and disk space consumption during packing. With tweaks and a memleak-on-error fix from op@. ok op@


d7b5a0e827 Stefan Sperling

inline struct got_object_id in struct got_object_qid

Saves us from doing a malloc/free call for every item on the list. ok op@
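
The saving can be illustrated by comparing two list item layouts. The struct and function names below are simplified stand-ins chosen for this sketch, not the actual got definitions.

```c
#include <stdlib.h>
#include <string.h>

#define DIGEST_LEN 20	/* SHA1 digest length */

/* Simplified stand-in for an object ID. */
struct object_id {
	unsigned char sha1[DIGEST_LEN];
};

/* Before: the ID lives in a separate allocation, so every list item
 * costs two malloc calls and two free calls. */
struct qid_ptr {
	struct qid_ptr *next;
	struct object_id *id;
};

/* After: the ID is embedded, so one allocation covers the item. */
struct qid_inline {
	struct qid_inline *next;
	struct object_id id;
};

/* Allocate a list item holding a copy of the given ID. */
struct qid_inline *
qid_alloc(const struct object_id *id)
{
	struct qid_inline *q = malloc(sizeof(*q));

	if (q == NULL)
		return NULL;
	q->next = NULL;
	q->id = *id;	/* struct copy, no second malloc */
	return q;
}
```

Freeing an item is then a single free() call, and the ID sits next to the list linkage in memory.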


67fd684965 Stefan Sperling

reuse existing deltas when creating pack files

tested by thomas, naddy, and myself


64a8571e12 Stefan Sperling

map raw object files into memory while packing if possible



284e766353 Stefan Sperling

remove unused internal raw object API functions


d3c116bf72 Stefan Sperling

cache raw objects in order to speed up gotadmin pack




b3d68e7f99 Stefan Sperling

implement 'gotadmin cleanup'





5aa813935b Stefan Sperling

add copyright year for files already touched in 2020


2c98ee284c Stefan Sperling

NAME_MAX does not account for a terminating NUL
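
A minimal sketch of the sizing rule behind this fix: a buffer that must hold any valid file name needs NAME_MAX + 1 bytes. copy_filename() is a hypothetical helper written for this example, and the fallback definition of NAME_MAX is an assumption for systems whose <limits.h> does not expose it.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

#ifndef NAME_MAX
#define NAME_MAX 255	/* fallback assumption for this sketch */
#endif

/*
 * Copy a file name into a buffer of NAME_MAX + 1 bytes.
 * Returns 0 on success, or -1 if the name is too long.
 */
int
copy_filename(char dst[NAME_MAX + 1], const char *src)
{
	size_t len = strlen(src);

	/* NAME_MAX counts the name's bytes only, not the NUL. */
	if (len > NAME_MAX)
		return -1;

	memcpy(dst, src, len + 1);	/* include the terminating NUL */
	return 0;
}
```

A buffer declared as char name[NAME_MAX] would be one byte too small for the longest valid name.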


56e0773df7 Stefan Sperling

convert tree entries from SIMPLEQ to an array



8aa93786da Stefan Sperling

make 'got cat' output look more like raw object files