Commits


make diff_chunk_type() public and clarify comment As discussed with stsp, reword an easily misunderstood comment, and move diff_chunk_type() into the public diff API to improve caller efficiency. ok stsp@


always cast ctype is*() arguments to unsigned char Almost all already had an unsigned argument (uint8_t or unsigned char), but cast anyway in case the types are changed in the future. ok stsp@
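The cast matters because passing a negative value other than EOF to the is*() functions is undefined behavior, which can happen when a plain char holds a byte >= 0x80 on platforms where char is signed. A minimal sketch of the pattern; count_spaces() is a hypothetical helper, not taken from the libdiff source:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count whitespace bytes in a buffer. */
size_t
count_spaces(const char *buf, size_t len)
{
	size_t i, n = 0;

	for (i = 0; i < len; i++) {
		/* Cast so bytes >= 0x80 are never passed as negative ints. */
		if (isspace((unsigned char)buf[i]))
			n++;
	}
	return n;
}
```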


c99-only construct is not good on OpenBSD's old gcc; patch by Ted Bullock


Don't return errno when fread fails fread doesn't consistently set errno on failure. - On OpenBSD fread sets errno on possible argument overflows, but this doesn't occur on other platforms. fread doesn't set errno on EOF or other failures. - ferror does not set errno on failure. Returning errno here would therefore be inconsistent. Return EIO instead. ok stsp@
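A minimal sketch of the idea, not the actual libdiff code: read_exact() is a hypothetical helper, and treating every short read (EOF included) as EIO is an assumption made to keep the example small.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: read exactly len bytes; return 0 or an error code. */
int
read_exact(FILE *f, void *buf, size_t len)
{
	size_t r;

	r = fread(buf, 1, len, f);
	if (r != len) {
		/* errno is not reliably set by fread, so do not return it. */
		return EIO;
	}
	return 0;
}
```

Callers get a stable error code regardless of platform, at the cost of losing whatever errno value some implementations might have provided.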


diff: Add API for consumers to check if diff is printable Programs using the libdiff API may need to know whether the diff contains anything that would be printed, or is empty. Expose the same check that the output functions do as a function call. ok stsp@
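A rough sketch of what such a check could look like, assuming that only added and removed chunks count as printable. The enum and the function name has_printable_chunks() are hypothetical and do not reflect the actual libdiff API:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical chunk kinds; the real libdiff enum may differ. */
enum chunk_kind { KIND_SAME, KIND_MINUS, KIND_PLUS };

/* Return true if any chunk is an addition or a removal. */
bool
has_printable_chunks(const enum chunk_kind *chunks, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (chunks[i] == KIND_MINUS || chunks[i] == KIND_PLUS)
			return true;
	}
	return false;
}
```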


sync files from got.git 336075a42a5ae0fa322db734c481d21998e82bb8 ok tb@


remove gcc ternary if extension ok stsp@
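For context, the GNU extension allows the middle operand of ?: to be omitted (a ?: b). A minimal sketch of the portable spelling; name_or_default() is a hypothetical helper, not code from this repository:

```c
#include <stddef.h>

/* Hypothetical helper: return name, or fallback when name is NULL. */
const char *
name_or_default(const char *name, const char *fallback)
{
	/* GNU extension form would be: return name ?: fallback; */
	return name ? name : fallback;
}
```

The standard form evaluates the first operand twice, so an expression with side effects should be stored in a local variable first.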


set diff box recursion limit to UINT_MAX by default In practice, recursion is already limited by our Myers max-effort cut and this heuristic should generally provide a better split than a limit on the number of diff boxes. A recursion limit is only required for diff configs that do not include the Myers algorithm, which we currently don't have. Discussed with Neels


fix result chunks: add minus above plus failed to shift "empty" positions


undup code to add result chunks


add sanity assertions around adding result chunks This uncovers errors related to placing minus chunks above already added plus chunks.


patience debug


debug fix in diff_data_init_subsection


debug init subsection


allow diff API users to atomize files separately This is a breaking API change (not that we care about that at this point). This can avoid redundant work spent on atomizing a file multiple times. There are use cases where one particular file must be compared to other files over and over again, such as when blaming file history. The old API gave access to both versions of the file to the atomizer just in case a future atomizer implementation needs this. This can still be achieved by passing a second file via the atomizer's private data pointer.


optimize diff_atom_same(): if hashes differ, return false
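The shortcut works because equal content always yields equal hashes, so differing hashes prove the atoms differ without comparing bytes. A minimal sketch using a simplified, hypothetical struct atom; the real struct diff_atom and diff_atom_same() differ:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified atom; the real struct diff_atom differs. */
struct atom {
	const unsigned char *at;	/* start of the atom's bytes */
	size_t len;			/* number of bytes */
	unsigned int hash;		/* precomputed hash of the bytes */
};

bool
atom_same(const struct atom *a, const struct atom *b)
{
	/* Different hashes imply different content: skip the byte compare. */
	if (a->hash != b->hash)
		return false;
	if (a->len != b->len)
		return false;
	/* Equal hashes may still be a collision, so compare the bytes. */
	return memcmp(a->at, b->at, a->len) == 0;
}
```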


diff_atom_cmp: no special case for ignore whitespace when both atoms empty


cache kd_buf in struct diff_state to avoid repeated allocation + free
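A minimal sketch of the caching pattern, assuming the buffer holds ints and only ever grows. struct state and state_get_kd_buf() are hypothetical names; only kd_buf comes from the commit message:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical state; only the kd_buf name comes from the commit message. */
struct state {
	int *kd_buf;
	size_t kd_buf_size;	/* capacity in ints */
};

/* Return a buffer of at least n ints, reusing the cached one if it fits. */
int *
state_get_kd_buf(struct state *s, size_t n)
{
	int *p;

	if (s->kd_buf != NULL && s->kd_buf_size >= n)
		return s->kd_buf;
	/* Guard against overflow in the size computation. */
	if (n > SIZE_MAX / sizeof(*p))
		return NULL;
	p = realloc(s->kd_buf, n * sizeof(*p));
	if (p == NULL)
		return NULL;	/* the old buffer stays valid and cached */
	s->kd_buf = p;
	s->kd_buf_size = n;
	return p;
}
```

The owner of the state frees kd_buf once, when the state itself is torn down, instead of on every algorithm run.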


initialize to NULL instead of ""; allows pointer check to see if file is mapped


rename diff_atom->d to diff_atom->root, because it always is The idea was that for each diff box within the files, the atoms would have a backpointer to the current layer of diff_data (indicating the current section), but it is not actually necessary to update the backpointer in each atom to the current diff_data. That is why the current code always points atom->d to the root diff_data for the entire file. Clarify with a proper name. Constructs like atom->d->root->foo are redundant, just use atom->root->foo.


diff_main: don't run algo if left or right are empty


debug: fix logging first chunk


diff_algo_none: cosmetics


fix diff_algo_none() for ending in plus chunk


results: also combine chunks coming from temp_chunks