Tags: rextge/git
Changed Paths Bloom Filters
Hey!
The commit graph feature brought in a lot of performance improvements across multiple commands. However, file based history continues to be a performance pain point, especially in large repositories.
Adopting changed path bloom filters has been discussed on the list before, and a prototype version was worked on by SZEDER Gábor, Jonathan Tan and Dr. Derrick Stolee [1]. This series is based on Dr. Stolee's proof of concept in [2].
Performance Gains: We tested the performance of git log -- path on the git repo, the linux repo and some internal large repos, with a variety of paths of varying depths.
On the git and linux repos: We observed a 2x to 5x speed up.
On a large internal repo with files seated 6-10 levels deep in the tree: We observed 10x to 20x speed ups, with some paths going up to 28 times faster.
Future Work (not included in the scope of this series):
1. Supporting multiple path based revision walk
2. Adopting it in git blame logic.
3. Interactions with line log git log -L
----------------------------------------------------------------------------
Updates since the last submission
* Removed all the RFC callouts, this is a ready for full review version
* Added unit tests for the bloom filter computation layer
* Added more evolved functional tests for git log
* Fixed a lot of the bugs found by the tests
* Reacted to other miscellaneous feedback on the RFC series.
Cheers! Garima Singh
[1] https://lore.kernel.org/git/20181009193445.21908-1-szeder.dev@gmail.com/
[2] https://lore.kernel.org/git/61559c5b-546e-d61b-d2e1-68de692f5972@gmail.com/
Derrick Stolee (2):
diff: halt tree-diff early after max_changes
commit-graph: examine commits by generation number
Garima Singh (8):
commit-graph: use MAX_NUM_CHUNKS
bloom: core Bloom filter implementation for changed paths
commit-graph: compute Bloom filters for changed paths
commit-graph: write Bloom filters to commit graph file
commit-graph: reuse existing Bloom filters during write.
commit-graph: add --changed-paths option to write subcommand
revision.c: use Bloom filters to speed up path based revision walks
commit-graph: add GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS test flag
Jeff King (1):
commit-graph: examine changed-path objects in pack order
Documentation/git-commit-graph.txt | 5 +
.../technical/commit-graph-format.txt | 24 ++
Makefile | 2 +
bloom.c | 277 ++++++++++++++++++
bloom.h | 58 ++++
builtin/commit-graph.c | 10 +-
ci/run-build-and-tests.sh | 1 +
commit-graph.c | 211 ++++++++++++-
commit-graph.h | 9 +-
diff.h | 5 +
revision.c | 124 +++++++-
revision.h | 11 +
t/README | 5 +
t/helper/test-bloom.c | 84 ++++++
t/helper/test-read-graph.c | 4 +
t/helper/test-tool.c | 1 +
t/helper/test-tool.h | 1 +
t/t0095-bloom.sh | 113 +++++++
t/t4216-log-bloom.sh | 143 +++++++++
t/t5318-commit-graph.sh | 2 +
t/t5324-split-commit-graph.sh | 1 +
tree-diff.c | 6 +
22 files changed, 1088 insertions(+), 9 deletions(-)
create mode 100644 bloom.c
create mode 100644 bloom.h
create mode 100644 t/helper/test-bloom.c
create mode 100755 t/t0095-bloom.sh
create mode 100755 t/t4216-log-bloom.sh
base-commit: 5b0ca87
Submitted-As: https://lore.kernel.org/git/pull.497.v2.git.1580943390.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.497.git.1576879520.gitgitgadget@gmail.com
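
To make the idea behind the series concrete, here is a minimal sketch of a per-commit changed-path Bloom filter and of how a path-limited walk could use it to skip work. It is an illustration only: the class name `ChangedPathFilter`, the function `commits_touching`, the SHA-256 based hashing and the filter sizes are all assumptions of this sketch, not the hashing scheme or on-disk format used by bloom.c.

```python
import hashlib


class ChangedPathFilter:
    """Toy Bloom filter over the paths changed by one commit (hypothetical)."""

    def __init__(self, num_bits=256, num_hashes=7):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit set stored in a Python int

    def _positions(self, path):
        # Double hashing: derive all bit positions from two base hashes.
        # SHA-256 is an assumption of this sketch, chosen for convenience.
        digest = hashlib.sha256(path.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big")
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, path):
        for pos in self._positions(path):
            self.bits |= 1 << pos

    def might_contain(self, path):
        # False means "definitely not changed"; True means "maybe changed".
        return all((self.bits >> pos) & 1 for pos in self._positions(path))


def commits_touching(path, history):
    """Return ids of commits that changed `path`.

    `history` is a list of (commit_id, filter, changed_paths) tuples; the
    `path in changed_paths` test stands in for the expensive tree diff.
    """
    hits = []
    for commit_id, bloom, changed_paths in history:
        if not bloom.might_contain(path):
            continue  # definite "no": the tree diff can be skipped
        if path in changed_paths:  # confirm, since "maybe" can be a false positive
            hits.append(commit_id)
    return hits


def build_history(commits):
    """Attach a filter to each (commit_id, changed_paths) pair."""
    history = []
    for commit_id, changed_paths in commits:
        bloom = ChangedPathFilter()
        for p in changed_paths:
            bloom.add(p)
        history.append((commit_id, bloom, changed_paths))
    return history


history = build_history([
    ("c1", {"Makefile"}),
    ("c2", {"bloom.c", "bloom.h"}),
    ("c3", {"revision.c"}),
])
print(commits_touching("bloom.c", history))
```

A Bloom filter never reports a false negative, so skipping on a "no" answer is safe; a "maybe" still has to be confirmed, which is why the sketch keeps the exact check after the filter lookup.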
clone: use submodules.recurse option for automatically clone submodules
From: Markus Klein <masmiseim@gmx.de>
Simplify cloning repositories with submodules when the option submodule.recurse is set by the user. This makes the use of submodules transparent to the user: the user doesn't have to know whether an extra parameter is needed to get the full project, including its submodules. This makes clone behave identically to other commands like fetch, pull, checkout, ... which include the submodules automatically if this option is set.
It is implemented analogously to the pull command, using a dedicated config function instead of just the default config. In contrast to the pull command, the submodule.recurse state is saved as an array of strings, as it can take an optional pathspec argument which describes which submodules should be recursively initialized and cloned. To recursively initialize and clone all submodules, a pathspec of "." has to be used.
The regression test is simplified compared to the test for "git clone --recursive", as the general functionality is already checked there.
Changes since v1:
* Fixed the commit author to match the Signed-off-by line
Signed-off-by: Markus Klein <masmiseim@gmx.de>
Submitted-As: https://lore.kernel.org/git/pull.695.v2.git.git.1580851963616.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.695.git.git.1580505092071.gitgitgadget@gmail.com
Reftable support git-core
This adds the reftable library, and hooks it up as a ref backend.
At this point, I am mainly interested in feedback on the spots marked with XXX in the Git source code, in particular, how to handle reflog expiry in this backend.
v2
* address Jun's nits.
* address Dscho's portability comments
* more background in commit messages.
Han-Wen Nienhuys (6):
refs.h: clarify reflog iteration order
setup.c: enable repo detection for reftable
create .git/refs in files-backend.c
refs: document how ref_iterator_advance_fn should handle symrefs
Add reftable library
Reftable support for git-core
Makefile | 24 +-
builtin/init-db.c | 42 +-
cache.h | 2 +
refs.c | 22 +-
refs.h | 5 +-
refs/files-backend.c | 5 +
refs/refs-internal.h | 6 +
refs/reftable-backend.c | 880 +++++++++++++++++++++++++++++++
reftable/LICENSE | 31 ++
reftable/README.md | 19 +
reftable/VERSION | 5 +
reftable/basics.c | 196 +++++++
reftable/basics.h | 37 ++
reftable/block.c | 401 ++++++++++++++
reftable/block.h | 71 +++
reftable/blocksource.h | 20 +
reftable/bytes.c | 0
reftable/config.h | 1 +
reftable/constants.h | 27 +
reftable/dump.c | 97 ++++
reftable/file.c | 97 ++++
reftable/iter.c | 229 ++++++++
reftable/iter.h | 56 ++
reftable/merged.c | 286 ++++++++++
reftable/merged.h | 34 ++
reftable/pq.c | 114 ++++
reftable/pq.h | 34 ++
reftable/reader.c | 708 +++++++++++++++++++++++++
reftable/reader.h | 52 ++
reftable/record.c | 1107 +++++++++++++++++++++++++++++++++++++++
reftable/record.h | 79 +++
reftable/reftable.h | 399 ++++++++++++++
reftable/slice.c | 199 +++++++
reftable/slice.h | 39 ++
reftable/stack.c | 983 ++++++++++++++++++++++++++++++++++
reftable/stack.h | 40 ++
reftable/system.h | 57 ++
reftable/tree.c | 66 +++
reftable/tree.h | 24 +
reftable/writer.c | 622 ++++++++++++++++++++++
reftable/writer.h | 46 ++
reftable/zlib-compat.c | 92 ++++
repository.c | 4 +
repository.h | 3 +
setup.c | 27 +-
45 files changed, 7255 insertions(+), 33 deletions(-)
create mode 100644 refs/reftable-backend.c
create mode 100644 reftable/LICENSE
create mode 100644 reftable/README.md
create mode 100644 reftable/VERSION
create mode 100644 reftable/basics.c
create mode 100644 reftable/basics.h
create mode 100644 reftable/block.c
create mode 100644 reftable/block.h
create mode 100644 reftable/blocksource.h
create mode 100644 reftable/bytes.c
create mode 100644 reftable/config.h
create mode 100644 reftable/constants.h
create mode 100644 reftable/dump.c
create mode 100644 reftable/file.c
create mode 100644 reftable/iter.c
create mode 100644 reftable/iter.h
create mode 100644 reftable/merged.c
create mode 100644 reftable/merged.h
create mode 100644 reftable/pq.c
create mode 100644 reftable/pq.h
create mode 100644 reftable/reader.c
create mode 100644 reftable/reader.h
create mode 100644 reftable/record.c
create mode 100644 reftable/record.h
create mode 100644 reftable/reftable.h
create mode 100644 reftable/slice.c
create mode 100644 reftable/slice.h
create mode 100644 reftable/stack.c
create mode 100644 reftable/stack.h
create mode 100644 reftable/system.h
create mode 100644 reftable/tree.c
create mode 100644 reftable/tree.h
create mode 100644 reftable/writer.c
create mode 100644 reftable/writer.h
create mode 100644 reftable/zlib-compat.c
base-commit: 5b0ca87
Submitted-As: https://lore.kernel.org/git/pull.539.v3.git.1580848060.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.539.git.1579808479.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.539.v2.git.1580134944.gitgitgadget@gmail.com
Avoid multiple recursive calls for same path in read_directory_recursive()
This patch series builds on en/fill-directory-fixes-more.
This series should be considered an RFC because of the untracked-cache changes (see the last two commits), for which I'm hoping to get an untracked-cache expert to comment. This series does provide some modest speedups (see second to last commit message), and should allow 'git status --ignored' to complete in a more reasonable timeframe for Martin Melka (see https://lore.kernel.org/git/CANt4O2L_DZnMqVxZzTBMvr=BTWqB6L0uyORkoN_yMHLmUX7yHw@mail.gmail.com/ ).
Changes since v1:
* Replaced patch 4 with improved version from Stolee (with additional improvement of my own)
* Clarifications, wording fixes, and more about linear perf in commit message to patch 5
* More detail in patch 5 about why "whackamole" particularly makes me uneasy for dir.c
Stuff clearly still missing from v2:
* I didn't make the DIR_KEEP_UNTRACKED_CONTENTS changes I mentioned in https://lore.kernel.org/git/CABPp-BEQ5s=+6Rnb-A+pdEaoPXxfo-hMSegSe1eai=RE74A3Og@mail.gmail.com/ which I think would make the code cleaner & clearer.
* I still have not addressed the untracked-cache issue mentioned in the last two commits. I looked at it very, very briefly, but I was really close to doing something similar to [1] and just dropping my patches in this series before even submitting them on Wednesday[2] (dir.c is really unpleasant to work in).
Other than wording fixes, I just need a week or two off from this area before I dig further, unless someone else wants to dive in and needs me to provide pointers on what I've done so far.
[1] https://lore.kernel.org/git/pull.676.v3.git.git.1576571586.gitgitgadget@gmail.com/
[2] I was inches from doing that Wednesday morning. I had done several rounds of "Okay, I fixed all the tests that broke with my changes last time, let's re-run the testsuite -- wow, four totally different tests from testfiles I hadn't looked at before now break", and decided that I would only do one more before dropping it and maybe coming back in a month or two. That time happened to work, minus the untracked-cache, so I decided to put it in front of other eyeballs.
Derrick Stolee (1):
dir: refactor treat_directory to clarify control flow
Elijah Newren (5):
dir: consolidate treat_path() and treat_one_path()
dir: fix broken comment
dir: fix confusion based on variable tense
dir: replace exponential algorithm with a linear one
t7063: blindly accept diffs
dir.c | 331 +++++++++++++++++-------------
t/t7063-status-untracked-cache.sh | 50 ++---
2 files changed, 208 insertions(+), 173 deletions(-)
base-commit: 0cbb605
Submitted-As: https://lore.kernel.org/git/pull.700.v2.git.git.1580495486.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.700.git.git.1580335424.gitgitgadget@gmail.com
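
As a rough illustration of why repeated recursive calls for the same path matter, the toy model below counts directory visits in a chain of nested directories. It is not the logic of dir.c: the functions `visits_when_recursing_twice` and `visits_when_recursing_once` are hypothetical and only model the cost of walking each subdirectory twice versus once.

```python
def visits_when_recursing_twice(depth):
    """Directory visits when every subdirectory is walked twice.

    Models a walker that recurses into the same child once to classify it
    and a second time to collect its entries (an assumption of this sketch).
    """
    if depth == 0:
        return 1
    return 1 + 2 * visits_when_recursing_twice(depth - 1)


def visits_when_recursing_once(depth):
    """Directory visits when every subdirectory is walked exactly once."""
    if depth == 0:
        return 1
    return 1 + visits_when_recursing_once(depth - 1)


for depth in (5, 10, 20):
    print(depth, visits_when_recursing_twice(depth), visits_when_recursing_once(depth))
```

In this model the doubled walk costs 2^(depth+1) - 1 visits while the single walk costs depth + 1, which is the kind of exponential versus linear gap the patch title refers to.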
git-p4: add hook p4-pre-edit-changelist
Our company's workflow requires that our P4 check-in messages have a specific format. A helpful feature in the GIT-P4 program would be a hook that occurs after the P4 change list is created but before it is displayed in the editor, and that would allow an external program to possibly edit the changelist text.
v1: My suggestion for the hook name is p4-pre-edit-changelist. It would take a single parameter, the full path of the temporary file. If the hook returns a non-zero exit code, it would cancel the current P4 submit. The hook should be optional.
v2: Instead of a single hook, p4-pre-edit-changelist, follow the git convention for hook names and add the trio of hooks that work together, similar to git commit. The hook names are:
* p4-prepare-changelist
* p4-changelist
* p4-post-changelist
The hooks should follow the same convention as git commit, so a new command line option for the git-p4 submit function, --no-verify, should also be added.
Ben Keene (4):
git-p4: rewrite prompt to be Windows compatible
git-p4: create new method gitRunHook
git-p4: add hook p4-pre-edit-changelist
git-p4: add p4 submit hooks
Documentation/git-p4.txt | 44 ++++++++-
Documentation/githooks.txt | 46 +++++++++
git-p4.py | 191 ++++++++++++++++++++++++++-----------
3 files changed, 225 insertions(+), 56 deletions(-)
base-commit: 5b0ca87
Submitted-As: https://lore.kernel.org/git/pull.698.v2.git.git.1580507895.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.698.git.git.1579555036314.gitgitgadget@gmail.com
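
The v1 behaviour described above (an optional hook that receives the path of the temporary changelist file and cancels the submit on a non-zero exit code) could look roughly like the sketch below. The function `run_changelist_hook` and its parameters are hypothetical and are not the gitRunHook method added by the series.

```python
import os
import subprocess


def run_changelist_hook(hooks_dir, hook_name, changelist_file):
    """Run an optional hook; return False if the submit should be cancelled.

    hooks_dir, hook_name and changelist_file are illustrative parameters,
    not names taken from git-p4.py.
    """
    hook_path = os.path.join(hooks_dir, hook_name)
    # The hook is optional: a missing or non-executable file is not an error.
    if not (os.path.isfile(hook_path) and os.access(hook_path, os.X_OK)):
        return True
    # The hook gets a single argument: the full path of the temporary file.
    result = subprocess.run([hook_path, changelist_file])
    # A non-zero exit code cancels the current submit.
    return result.returncode == 0
```

A caller would abort the submit when the function returns False and otherwise go on to open the (possibly rewritten) changelist text in the editor.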
clone: use submodules.recurse option for automatically clone submodules
From: Markus <masmiseim@gmx.de>
Simplify cloning repositories with submodules when the option submodule.recurse is set by the user. This makes the use of submodules transparent to the user: the user doesn't have to know whether an extra parameter is needed to get the full project, including its submodules. This makes clone behave identically to other commands like fetch, pull, checkout, ... which include the submodules automatically if this option is set.
It is implemented analogously to the pull command, using a dedicated config function instead of just the default config. In contrast to the pull command, the submodule.recurse state is saved as an array of strings, as it can take an optional pathspec argument which describes which submodules should be recursively initialized and cloned. To recursively initialize and clone all submodules, a pathspec of "." has to be used.
The regression test is simplified compared to the test for "git clone --recursive", as the general functionality is already checked there.
Signed-off-by: Markus Klein <masmiseim@gmx.de>
Submitted-As: https://lore.kernel.org/git/pull.695.git.git.1580505092071.gitgitgadget@gmail.com
Harden the sparse-checkout builtin
This series is based on ds/sparse-list-in-cone-mode.
This series attempts to clean up some rough edges in the sparse-checkout
feature, especially around the cone mode.
Unfortunately, after the v2.25.0 release, we noticed an issue with the "git
clone --sparse" option when using a URL instead of a local path. This is
fixed and properly tested here.
Also, let's improve Git's response to these more complicated scenarios:
1. Running "git sparse-checkout init" in a worktree would complain because
the "info" dir doesn't exist.
2. Tracked paths that include "*" and "\" in their filenames.
3. If a user edits the sparse-checkout file to have a non-cone pattern, such
as "**" anywhere or "*" in the wrong place, then we should respond
appropriately. That is: warn that the patterns are not cone-mode, then
revert to the old logic.
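
The third scenario can be sketched as a small pattern check. This is a simplified illustration, not the matching code in dir.c: the helper names `has_unescaped_glob` and `looks_like_cone_pattern` are hypothetical, and the sketch assumes cone patterns are rooted at "/", name directories, and may only use "*" as a final "/*" or "/*/" component.

```python
def has_unescaped_glob(text):
    """Return True if text contains *, ? or [ not preceded by a backslash."""
    i = 0
    while i < len(text):
        if text[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if text[i] in "*?[":
            return True
        i += 1
    return False


def looks_like_cone_pattern(pattern):
    """Rough test for whether a sparse-checkout line fits cone mode."""
    body = pattern[1:] if pattern.startswith("!") else pattern
    if not body.startswith("/"):
        return False
    if "**" in body:
        return False  # "**" anywhere is treated as non-cone
    # Allow exactly one trailing wildcard component: "/*" or "/*/".
    for suffix in ("/*/", "/*"):
        if body.endswith(suffix):
            body = body[: -len(suffix)]
            break
    else:
        # Without a wildcard suffix the pattern must name a directory.
        if not body.endswith("/"):
            return False
    # Any remaining glob character is a "*" in the wrong place.
    return not has_unescaped_glob(body)


def check_patterns(patterns):
    """Return the patterns that would trigger the warning and fallback."""
    return [p for p in patterns if not looks_like_cone_pattern(p)]


print(check_patterns(["/*", "!/*/", "/A/", "!/A/*/", "/A/B/", "/A/**", "*.c"]))
```

Any pattern returned by `check_patterns` corresponds to the case where the series warns that the patterns are not cone-mode and reverts to the old logic.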
Updates in V2:
* Added C-style quoting to the output of "git sparse-checkout list" in cone
mode.
* Improved documentation.
* Responded to most style feedback. Hopefully I didn't miss anything.
* I was lingering on this a little to see if I could also fix the issue
raised in [1], but I have not figured that one out, yet.
Update in V3:
* Input now uses Peff's recommended pattern: unquote C-style strings over
stdin and otherwise do not un-escape input.
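
The V3 input rule can be illustrated with a small reader that unquotes a line only when it is a C-style quoted string and otherwise passes it through untouched. This is a simplified sketch, not Git's quote.c: the function `unquote_if_quoted` is hypothetical and handles only single-character escapes and three-digit octal escapes.

```python
SIMPLE_ESCAPES = {
    "a": 7, "b": 8, "f": 12, "n": 10, "r": 13, "t": 9, "v": 11,
    '"': 34, "\\": 92,
}


def unquote_if_quoted(line):
    """Unquote a C-style quoted string; return any other input unchanged."""
    if not line.startswith('"'):
        return line  # not quoted: do not un-escape anything
    out = bytearray()
    i = 1
    while i < len(line):
        ch = line[i]
        if ch == '"':
            if i != len(line) - 1:
                raise ValueError("unexpected text after closing quote")
            return out.decode("utf-8")
        if ch != "\\":
            out += ch.encode("utf-8")
            i += 1
            continue
        if i + 1 >= len(line):
            raise ValueError("dangling backslash")
        nxt = line[i + 1]
        if nxt in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[nxt])
            i += 2
        elif nxt in "0123":
            # Octal escape for one byte, e.g. \303\251 for a UTF-8 'é'.
            digits = line[i + 1:i + 4]
            if len(digits) != 3 or any(d not in "01234567" for d in digits):
                raise ValueError("bad octal escape")
            out.append(int(digits, 8))
            i += 4
        else:
            raise ValueError("unknown escape")
    raise ValueError("missing closing quote")
```

With this rule an unquoted line containing a backslash is passed through as typed, while a line wrapped in double quotes is decoded first.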
[1] https://lore.kernel.org/git/062301d5d0bc$c3e17760$4ba46620$@Frontier.com/
Thanks, -Stolee
Derrick Stolee (14):
t1091: use check_files to reduce boilerplate
t1091: improve here-docs
sparse-checkout: create leading directories
clone: fix --sparse option with URLs
sparse-checkout: cone mode does not recognize "**"
sparse-checkout: detect short patterns
sparse-checkout: warn on globs in cone patterns
sparse-checkout: properly match escaped characters
sparse-checkout: write escaped patterns in cone mode
sparse-checkout: unquote C-style strings over --stdin
sparse-checkout: use C-style quotes in 'list' subcommand
sparse-checkout: escape all glob characters on write
sparse-checkout: improve docs around 'set' in cone mode
sparse-checkout: fix cone mode behavior mismatch
Jeff King (1):
sparse-checkout: fix documentation typo for core.sparseCheckoutCone
Documentation/git-sparse-checkout.txt | 19 +-
builtin/clone.c | 2 +-
builtin/sparse-checkout.c | 48 +++-
dir.c | 79 +++++-
t/t1091-sparse-checkout-builtin.sh | 352 +++++++++++++++-----------
unpack-trees.c | 2 +-
6 files changed, 346 insertions(+), 156 deletions(-)
base-commit: 4fd683b
Submitted-As: https://lore.kernel.org/git/pull.513.v4.git.1580501775.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.513.git.1579029962.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.513.v2.git.1579900782.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.513.v3.git.1580236003.gitgitgadget@gmail.com
grep: ignore --recurse-submodules if --no-index is given
From: Philippe Blain <levraiphilippeblain@gmail.com>
Since grep learned to recurse into submodules in 0281e48 (grep: optionally recurse into submodules, 2016-12-16), using --recurse-submodules along with --no-index makes Git die().
This is unfortunate because if submodule.recurse is set in a user's ~/.gitconfig, invoking `git grep --no-index` either inside or outside a Git repository results in
fatal: option not supported with --recurse-submodules
Let's allow using these options together, so that setting submodule.recurse globally does not prevent using `git grep --no-index`.
Using `--recurse-submodules` should not have any effect if `--no-index` is used inside a repository, as Git will recurse into the checked out submodule directories just like into regular directories.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Submitted-As: https://lore.kernel.org/git/pull.540.v2.git.1580391448318.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.540.git.1580000298097.gitgitgadget@gmail.com
git: update documentation for --git-dir
From: Heba Waly <heba.waly@gmail.com>
git --git-dir <path> is a bit confusing and sometimes doesn't work as the user would expect it to.
For example, if the user runs `git --git-dir=<path> status`, git will skip the repository discovery algorithm and will assign the work tree to the user's current work directory unless otherwise specified. When this assignment is wrong, the output will not match the user's expectations.
This patch updates the documentation to make it clearer.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Submitted-As: https://lore.kernel.org/git/pull.537.v4.git.1580346841614.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.537.git.1579745811615.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.537.v2.git.1580091855792.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.537.v3.git.1580185440512.gitgitgadget@gmail.com
add: use advice API to display hints
From: Heba Waly <heba.waly@gmail.com>
In the "add" command, use the advice API to display hints to users, as it provides a neat, standard format for hint messages and makes the message visibility configurable.
Signed-off-by: Heba Waly <heba.waly@gmail.com>
Submitted-As: https://lore.kernel.org/git/pull.508.v3.git.1580346702203.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.508.git.1577934241.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.508.v2.git.1578438752.gitgitgadget@gmail.com