Commit Briefs

Omar Polo

use struct got_object_id instead of sha1 digest in a few imsg

Change got_imsg_commit_painting_request, got_imsg_tag_object and the data of GOT_IMSG_TRAVERSED_COMMITS to send the whole struct got_object_id directly, instead of copying the sha1 digest into the imsg buffer and then from there into a new struct got_object_id. ok stsp@
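The difference between the two message layouts can be sketched roughly as follows. This is only an illustration: `struct object_id`, `struct tag_request_old`, `struct tag_request_new`, `ID_DIGEST_LEN` and both receiver functions are hypothetical stand-ins, not the actual got definitions.

```c
#include <stdint.h>
#include <string.h>

#define ID_DIGEST_LEN 20	/* length of a SHA-1 digest in bytes */

/* Hypothetical stand-in for an object ID type. */
struct object_id {
	uint8_t sha1[ID_DIGEST_LEN];
};

/* Old style (assumed layout): the message carries a bare digest buffer. */
struct tag_request_old {
	uint8_t id[ID_DIGEST_LEN];
	int pack_idx;
};

/* New style (assumed layout): the message embeds the whole ID struct. */
struct tag_request_new {
	struct object_id id;
	int pack_idx;
};

/* Receiver, old style: rebuild the ID from the digest buffer. */
static void
recv_old(struct object_id *out, const struct tag_request_old *req)
{
	memcpy(out->sha1, req->id, ID_DIGEST_LEN);
}

/* Receiver, new style: a plain struct assignment is enough. */
static void
recv_new(struct object_id *out, const struct tag_request_new *req)
{
	*out = req->id;
}
```

Both receivers end up with the same ID. The struct-based variant avoids the field-by-field copy and keeps working unchanged if the ID type later gains more fields.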



Omar Polo

style


Omar Polo

got_imsg_raw_delta_request: use struct instead of buffer for id

ok stsp@


Omar Polo

got_imsg_packed_object: use struct instead of buffer for id

ok stsp@


Stefan Sperling

avoid traversing enumerated commits more than once in got-read-pack

Keep track of parent commits that will be processed as part of looping over the commit queue provided by the main process, and do not add these commits to the queue again. Fixes pointless traversal of commits on the queue which will simply be skipped. The end result is the same either way. ok tracey
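A minimal sketch of the idea, assuming a small fixed-size set of IDs with linear lookup. The names `struct id_set`, `id_set_contains` and `queue_once` are hypothetical, and got-read-pack itself may well use a different data structure.

```c
#include <stdint.h>
#include <string.h>

#define ID_LEN 20	/* bytes per object ID (SHA-1) */
#define MAX_IDS 64	/* arbitrary capacity for this sketch */

/* Hypothetical set of IDs which are already on the queue. */
struct id_set {
	uint8_t ids[MAX_IDS][ID_LEN];
	size_t n;
};

static int
id_set_contains(const struct id_set *s, const uint8_t *id)
{
	size_t i;

	for (i = 0; i < s->n; i++) {
		if (memcmp(s->ids[i], id, ID_LEN) == 0)
			return 1;
	}
	return 0;
}

/*
 * Remember that a commit is queued. Returns 1 if the ID was added,
 * 0 if it was already known (the caller skips it), -1 if the set is full.
 */
static int
queue_once(struct id_set *queued, const uint8_t *id)
{
	if (id_set_contains(queued, id))
		return 0;
	if (queued->n == MAX_IDS)
		return -1;
	memcpy(queued->ids[queued->n++], id, ID_LEN);
	return 1;
}
```

A parent commit would only be appended to the traversal queue when `queue_once()` returns 1, so each commit is visited at most once.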


Stefan Sperling

fix missing commits in pack files created with packed object enumeration

got-read-pack forgot to send a tree-enumeration-done message to the main process if the tree of a given commit had already been traversed. The main process would then not add the corresponding commit to the pack file, even though it should be added. Found while using 'got send' towards gotd in order to populate an empty repository on the server with non-trivial history, where some commits always ended up missing due to this bug. ok tracey



Omar Polo

check size before calling mmap(2)

It's only a preparatory step, as checking whether a size_t is less than SIZE_MAX is moot. In a follow-up commit, however, the `filesize' field of the struct got_pack will become off_t and these checks will kick in. This also makes the way we guard mmap(2) against empty files consistent. ok and improvements stsp@
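The kind of guard described here might look like the sketch below, which assumes the size is already an off_t as announced for the follow-up commit. `size_ok_for_mmap` and `map_file` are hypothetical helpers, not functions from got.

```c
#include <sys/types.h>
#include <sys/mman.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if a file of this size can be passed to mmap(2), else 0. */
static int
size_ok_for_mmap(off_t filesize)
{
	if (filesize <= 0)
		return 0;	/* mmap(2) rejects a zero length with EINVAL */
	if ((uintmax_t)filesize > SIZE_MAX)
		return 0;	/* would be truncated when cast to size_t */
	return 1;
}

/* Map a file read-only; returns NULL if the size is unusable or mmap fails. */
static void *
map_file(int fd, off_t filesize)
{
	void *p;

	if (!size_ok_for_mmap(filesize))
		return NULL;
	p = mmap(NULL, (size_t)filesize, PROT_READ, MAP_PRIVATE, fd, 0);
	return p == MAP_FAILED ? NULL : p;
}
```

On platforms where off_t and size_t are both 64 bits wide the SIZE_MAX comparison can never be true. It only has an effect where off_t is wider than size_t.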


Stefan Sperling

allow got_object_parse_tree to reuse entries buffer allocations for speed

ok millert@
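Reusing an allocation across repeated parse calls can be sketched as below. `struct entry`, `struct entry_buf` and `entry_buf_reserve` are hypothetical names chosen for illustration, and the real got_object_parse_tree interface may differ.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical parsed tree entry. */
struct entry {
	int mode;
};

/* Caller-owned buffer which survives across parse calls. */
struct entry_buf {
	struct entry *entries;
	size_t nalloc;		/* number of entries allocated */
};

/*
 * Make room for at least n entries. The buffer only grows; if it is
 * already large enough no allocation happens at all.
 * Returns 0 on success, -1 on overflow or allocation failure.
 */
static int
entry_buf_reserve(struct entry_buf *b, size_t n)
{
	struct entry *p;

	if (n <= b->nalloc)
		return 0;
	if (n > SIZE_MAX / sizeof(*p))
		return -1;
	p = realloc(b->entries, n * sizeof(*p));
	if (p == NULL)
		return -1;
	b->entries = p;
	b->nalloc = n;
	return 0;
}
```

A caller that parses many trees in a row keeps one `struct entry_buf` around and frees it once at the end, instead of allocating and freeing an entries array for every tree.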



Omar Polo

mark got_error_fmt as printf-like and fix the resulting errors

ok stsp@
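Marking a variadic function as printf-like is done with a compiler attribute in GCC and Clang. The sketch below uses a hypothetical `format_error` function rather than got's actual got_error_fmt, whose signature is not shown here.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical error formatter. The attribute tells the compiler that
 * argument 3 is a printf-style format string and that the variadic
 * arguments to check against it start at argument 4.
 */
static int format_error(char *buf, size_t len, const char *fmt, ...)
    __attribute__((__format__(__printf__, 3, 4)));

static int
format_error(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	return n;
}
```

With the attribute in place, a call whose arguments do not match the format string, such as passing an integer for `%s`, is reported when format warnings are enabled. This is presumably how the errors fixed in this commit surfaced.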


Stefan Sperling

implement support for commit coloring in got-read-pack for speed

ok op, tracey



Stefan Sperling

fix a bug in got_privsep_send_object_idlist() exposed by recent changes

The old code did not work correctly if only a single object ID was to be sent to got-read-pack. Make got-read-pack error out if the list of commits for object enumeration is empty to catch this problem if it occurs again. Found by the send_basic test, which was failing with GOT_TEST_PACK=1. ok tracey


Stefan Sperling

let got-read-pack be explicit about whether it could enumerate all objects

This allows the main process to avoid looping over all object IDs again in case the pack file used for enumeration is complete. ok op@



Stefan Sperling

Bring back object enumeration inside got-read-pack as a fast path.

The problem that was found in the earlier version has been fixed. ok op@


Stefan Sperling

revert object enumeration in got-read-pack for now; needs more work

This implementation marked commits and trees as enumerated before all trees which they depend on were enumerated. This behaviour leads to incomplete pack files when a tree is only partially packed and got-read-pack hits a missing tree entry as a result. The algorithm must be reworked such that packed leaf nodes are marked enumerated first and the marking then bubbles up. Found by op@
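One way to read the proposed rework is as a post-order traversal in which a node is marked only after everything below it has been marked. The sketch below illustrates that reading only. `struct node` and `enumerate` are hypothetical and are not taken from got-read-pack.

```c
#include <stddef.h>

#define MAX_CHILDREN 4	/* arbitrary limit for this sketch */

/* Hypothetical tree node: a tree or blob which may be absent from the pack. */
struct node {
	int present;		/* 1 if the object is in the pack file */
	int enumerated;		/* 1 once this node and all below it are done */
	size_t nchildren;
	struct node *children[MAX_CHILDREN];
};

/*
 * Post-order enumeration. Leaves are marked first; a parent is marked
 * only if every child was enumerated. Returns 1 if n is fully enumerated.
 */
static int
enumerate(struct node *n)
{
	size_t i;
	int complete = 1;

	if (!n->present)
		return 0;
	if (n->enumerated)
		return 1;
	for (i = 0; i < n->nchildren; i++) {
		if (!enumerate(n->children[i]))
			complete = 0;
	}
	if (complete)
		n->enumerated = 1;
	return complete;
}
```

With this ordering, a tree containing a missing entry is never marked as enumerated, so the main process would still know that it has to look at that tree itself.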


Stefan Sperling

in enumeration_request(), use the correct index for tagged commit objects

Fixes an error where got-read-pack errors out with "bad object data" during 'got send' because we ended up handing a tag object to the commit object parser.


Stefan Sperling

implement object enumeration support in got-read-pack

ok op@


Stefan Sperling

convert delta cache to a hash table

This approach uses more memory but is much faster. To offset the additional memory usage somewhat, the cache now stores very small deltas only. However, overall memory usage goes up. Hopefully we will find a way to reduce this later. ok op@



Stefan Sperling

parse tree entries into an array instead of a pathlist

Avoids some extra malloc/free in a performance-critical path. ok op@


Stefan Sperling

run the search for deltas to reuse in got-read-pack

This significantly speeds up the deltification step of packing by avoiding imsg traffic. gotadmin no longer requests individual raw deltas from got-read-pack to check whether it can reuse them. Instead, got-read-pack obtains a list of objects we want to pack, and hands back the list of all deltas in its pack file which can be reused. Messages are now batched such that imsg buffers are filled as much as possible. Another advantage is that deltas we are not going to reuse will no longer be written to the delta cache file, saving disk space. Before this patch, any raw delta candidate was written to the delta cache file by got-read-pack, and the decision whether to reuse the delta happened afterwards in the gotadmin process. Code for reading individual raw deltas is now unused and could be removed at some point. ok op@