commit 4639d50089ac22478075efda37c09e4ecaf0db88
from: Stefan Sperling
date: Wed Aug 01 18:39:27 2018 UTC

continue the git-repository(5) man page; still incomplete

commit - 5e5560e10410aa7dab84154c6cad083c6fd3ef76
commit + 4639d50089ac22478075efda37c09e4ecaf0db88
blob - bec518e7eb43b478dc1a3d613f5deb835001f7f8
blob + c394438c77219166adde0da921d27747e4778aec
--- got/git-repository.5
+++ got/git-repository.5
@@ -18,20 +18,11 @@
 .Os
 .Sh NAME
 .Nm git-repository
-.Nd git repository format
+.Nd Git repository format
 .Sh DESCRIPTION
-A git repository stores a series of versioned snapshots of a file hierarchy.
-.Pp
+A Git repository stores a series of versioned snapshots of a file hierarchy.
 The repository's core data model is a directed acyclic graph which
 contains three types of objects as nodes.
-Each object is identified by the SHA-1 hash calculated over the object's
-header plus the content stored in the object.
-The object header names the type of object in an ASCII string, which is
-followed by a space, followed by the size of data in the object encoded
-as an ASCII number string.
-This header is terminated by a
-.Sy NUL
-character.
 .Pp
 The content of tracked files is stored in objects of type
 .Em blob .
@@ -39,14 +30,13 @@ The content of tracked files is stored in objects of t
 A
 .Em tree
 object points to any number of such blobs, and also to other trees in
-order to form a hierarchy of files and directories.
+order to represent a hierarchy of files and directories.
 .Pp
 A
 .Em commit
 object points to the root element of one tree, and thus records the
 state of this entire tree as a snapshot.
-Commit objects are chained together and thus form a line of history
-of snapshots.
+Commit objects are chained together to form a line of history of snapshots.
 A given commit can be suceeded by an arbitrary number of subsequent commits,
 such that diverging lines of version control history, known as
 .Em branches ,
@@ -56,17 +46,34 @@ A commit which preceeds another commit is referred to
 A commit with multiple parents reunites diverged lines of history
 and is known as a
 .Em merge commit .
-While the data model allows for commits with an arbitrary number of
-parent commits,
-.Xr got 1
-restricts all commits to at most 2 parents in order to discourage chaotic
-branching and merging practices.
 .Pp
-When stored on disk, all objects are compressed with
+Each object is identified by a SHA1 hash calculated over the object's
+header and the data stored in the object.
+.Sh OBJECT STORAGE
+Loose objects are stored as individual files beneath the directory
+.Pa objects ,
+spread across 256 sub-directories named after the 256 possible hexadecimal
+values of the first byte of an object identifier.
+The name of the loose object file corresponds to the remaining bytes of the
+object's identifier.
+.Pp
+A loose object file begins with a header which specifies the type of object
+as an ASCII string, followed by an ASCII space character, followed by the
+object data's size encoded as an ASCII number string.
+The header is terminated by a
+.Sy NUL
+character, and the remainder of the file contains object data.
+Loose objects files are compressed with
 .Xr deflate 3 .
-Mulitple objects may be stored together in a
+.Pp
+Multiple objects can be bundled in a
 .Em pack file
-which provides for deltification of object content.
+for better disk space efficiency and increased run-time performance.
+The pack file format adds two additional types of objects:
+offset delta objects and reference delta objects.
+.Pp
+TODO describe pack file format
+.Pp
 .Sh FILES
 .Bl -tag -width /etc/rpc -compact
 .It Pa HEAD
@@ -86,6 +93,9 @@ which provides for deltification of object content.
 .Sh SEE ALSO
 .Xr got 1 ,
 .Xr deflate 3 ,
+.Xr SHA1 3 ,
 .Xr got-worktree 5
 .Sh HISTORY
-The Git repository format was designed by Linus Torvalds in 2005.
+The Git repository format was initially designed by Linus Torvalds in 2005
+and has since been extended by various people involved in the development
+of the Git version control system.
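
The loose object layout that the new OBJECT STORAGE section describes can be illustrated with a short sketch. It is written in Python rather than taken from the got sources, and the function names `encode_loose_object` and `decode_loose_object` are hypothetical. It assumes that the SHA1 identifier is computed over the uncompressed header plus data, and that compression is a zlib stream covering both header and data, which is how Git itself writes loose objects; the man page text alone does not spell this out.

```python
import hashlib
import zlib


def encode_loose_object(obj_type: str, data: bytes):
    """Return (object id, relative path, compressed bytes) for one object.

    Hypothetical helper for illustration; not part of got or Git.
    """
    # Header: type as ASCII, one space, data size as ASCII digits, then NUL.
    header = f"{obj_type} {len(data)}".encode("ascii") + b"\x00"
    raw = header + data

    # The identifier is the SHA1 hash over header plus data.
    object_id = hashlib.sha1(raw).hexdigest()

    # The first byte (two hex digits) names the sub-directory under
    # "objects"; the remaining hex digits name the file.
    path = f"objects/{object_id[:2]}/{object_id[2:]}"

    # Assumption: header and data are compressed together as one zlib stream.
    return object_id, path, zlib.compress(raw)


def decode_loose_object(compressed: bytes):
    """Inverse of encode_loose_object: return (type, data)."""
    raw = zlib.decompress(compressed)
    # The header ends at the first NUL; everything after it is object data.
    header, _, data = raw.partition(b"\x00")
    obj_type, size = header.decode("ascii").split(" ")
    if int(size) != len(data):
        raise ValueError("size in header does not match object data")
    return obj_type, data
```

For example, `encode_loose_object("blob", b"")` yields the identifier `e69de29bb2d1d6434b8b29ae775ad8c2e48c5391`, the well-known ID of the empty blob, stored at `objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391`. Pack files and the two delta object types mentioned in the diff are not covered by this sketch.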