A toolset for fast and efficient scanning of git repositories to capture data, with a focus on large repos
MIT License
npm install @discoveryjs/scan-git
import { createGitReader } from '@discoveryjs/scan-git';
const repo = await createGitReader('path/to/.git');
const commits = await repo.log({ ref: 'my-branch', depth: 10 });
console.log(commits);
await repo.dispose();
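The example above ends by calling `repo.dispose()`. To make sure that call also happens when something in between throws, the open/use/dispose cycle can be wrapped in a small helper. This is a minimal sketch, not part of scan-git: `withReader` and its `open` callback are hypothetical names, and the only assumption about the reader is that it exposes `dispose()`.

```javascript
// Hypothetical helper (not part of scan-git): opens a reader, runs `fn` with it,
// and always disposes the reader afterwards, even if `fn` throws.
async function withReader(open, fn) {
    const repo = await open();

    try {
        return await fn(repo);
    } finally {
        await repo.dispose();
    }
}

// Usage (assumes createGitReader is imported as in the example above):
// const commits = await withReader(
//     () => createGitReader('path/to/.git'),
//     (repo) => repo.log({ ref: 'my-branch', depth: 10 })
// );
```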
- gitdir: string - path to the git repo
- options - optional settings:
  - cruftPacks - defines how cruft packs are processed:
    - 'include' or true (default) - process all packs
    - 'exclude' or false - exclude cruft packs from processing
    - 'only' - process cruft packs only

Common parameters:
- ref: string - a reference to an object in the repository
- withOid: boolean - a flag to include the resolved oid for a reference

Returns the default branch name used in a repo:
const defaultBranch = await repo.defaultBranch();
// 'main'
The algorithm to identify a default branch name:
upstream/HEAD
origin/HEAD
main
master
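One way to read the list above is as a set of candidates that are probed one by one, with the first existing one winning. The sketch below illustrates that reading only; the probing order is an assumption, `firstExistingRef` is a hypothetical helper, and it returns the matching candidate as-is rather than the branch name that `repo.defaultBranch()` reports.

```javascript
// Candidate names taken from the list above; probing them in this order is an assumption.
const DEFAULT_BRANCH_CANDIDATES = ['upstream/HEAD', 'origin/HEAD', 'main', 'master'];

// Hypothetical helper: returns the first candidate that `repo` reports as existing,
// or null when none of them exists.
async function firstExistingRef(repo, candidates = DEFAULT_BRANCH_CANDIDATES) {
    for (const candidate of candidates) {
        // `await` works whether isRefExists() returns a boolean or a promise
        if (await repo.isRefExists(candidate)) {
            return candidate;
        }
    }

    return null;
}
```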
Expands a ref into a full form, e.g. 'main' -> 'refs/heads/main'. Returns null if ref doesn't exist. For the symbolic ref names ('HEAD', 'FETCH_HEAD', 'CHERRY_PICK_HEAD', 'MERGE_HEAD' and 'ORIG_HEAD') returns a name without changes.
const fullPath = repo.expandRef('heads/main');
// 'refs/heads/main'
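To make the expansion rule concrete, here is a minimal sketch that works on a plain set of known full ref names. It follows the lookup order documented in Git's gitrevisions manual (`refs/`, `refs/tags/`, `refs/heads/`, `refs/remotes/`, then `refs/remotes/<name>/HEAD`); the order scan-git itself uses may differ, and `expandRefSketch` is a hypothetical name.

```javascript
const SYMBOLIC_REFS = new Set(['HEAD', 'FETCH_HEAD', 'CHERRY_PICK_HEAD', 'MERGE_HEAD', 'ORIG_HEAD']);
const REF_PREFIXES = ['refs/', 'refs/tags/', 'refs/heads/', 'refs/remotes/'];

// Hypothetical sketch: expands a short ref name against a Set of known full ref names.
function expandRefSketch(ref, knownRefs) {
    // Symbolic ref names are returned without changes
    if (SYMBOLIC_REFS.has(ref)) {
        return ref;
    }

    // The name is already a full ref
    if (knownRefs.has(ref)) {
        return ref;
    }

    // Try each prefix in order, the first match wins
    for (const prefix of REF_PREFIXES) {
        if (knownRefs.has(prefix + ref)) {
            return prefix + ref;
        }
    }

    // A bare remote name refers to its HEAD
    const remoteHead = `refs/remotes/${ref}/HEAD`;

    return knownRefs.has(remoteHead) ? remoteHead : null;
}
```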
Resolves ref into an oid if it exists, otherwise throws an exception. If ref is already an oid, returns it unchanged. If ref is not a full path, expands it first.
const oid = repo.resolveRef('main');
// '8bb6e23769902199e39ab70f2441841712cbdd62'
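Since resolving a missing ref throws, callers that treat "no such ref" as an ordinary outcome may prefer a null result. The helper below is a hypothetical sketch of that pattern; note that it swallows every error raised by `resolveRef()`, not only the "ref not found" case.

```javascript
// Hypothetical helper: resolves `ref` to an oid, or returns null instead of throwing.
async function resolveRefOrNull(repo, ref) {
    try {
        // `await` works whether resolveRef() returns a string or a promise
        return await repo.resolveRef(ref);
    } catch {
        // Any failure is reported as null, including errors unrelated to a missing ref
        return null;
    }
}
```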
Checks if a ref exists.
const isValidRef = repo.isRefExists('main');
// true
Get a list of remotes.
const remotes = repo.listRemotes();
// [
// 'origin'
// ]
Get a list of branches for a remote.
const originBranches = await repo.listRemoteBranches('origin');
// [
// 'HEAD',
// 'main'
// ]
const originBranches = await repo.listRemoteBranches('origin', true);
// [
// { name: 'HEAD', oid: '7c2a62cdbc2ef28afaaed3b6f3aef9b581e5aa8e' },
// { name: 'main', oid: '56ea7a808e35df13e76fee92725a65a373a9835c' }
// ]
Get a list of local branches.
const localBranches = await repo.listBranches();
// [
// 'HEAD',
// 'main'
// ]
const localBranches = await repo.listBranches(true);
// [
// { name: 'HEAD', oid: '7c2a62cdbc2ef28afaaed3b6f3aef9b581e5aa8e' },
// { name: 'main', oid: '56ea7a808e35df13e76fee92725a65a373a9835c' }
// ]
Get a list of tags.
const tags = await repo.listTags();
// [
// 'v1.0.0',
// 'some-feature'
// ]
const tags = await repo.listTags(true);
// [
// { name: 'v1.0.0', oid: '7c2a62cdbc2ef28afaaed3b6f3aef9b581e5aa8e' },
// { name: 'some-feature', oid: '56ea7a808e35df13e76fee92725a65a373a9835c' }
// ]
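The `{ name, oid }` form returned when `withOid` is true can be grouped by oid to see which names point at the same object. This is a small sketch over plain arrays of that shape; `groupNamesByOid` is a hypothetical helper, not part of scan-git.

```javascript
// Hypothetical helper: groups { name, oid } entries by oid.
// Returns a Map of oid -> array of names, in input order.
function groupNamesByOid(entries) {
    const namesByOid = new Map();

    for (const { name, oid } of entries) {
        if (!namesByOid.has(oid)) {
            namesByOid.set(oid, []);
        }

        namesByOid.get(oid).push(name);
    }

    return namesByOid;
}

// Usage (inside an async context, with `repo` opened as shown earlier):
// const refs = [...await repo.listBranches(true), ...await repo.listTags(true)];
// const namesByOid = groupNamesByOid(refs);
```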
Resolve a tree oid by a commit reference.
- ref: string (default: 'HEAD') - commit reference
const treeOid = await repo.treeOidFromRef('HEAD');
// 'a1b2c3d4e5f6...'
List all files in the repository at the specified commit reference.
- ref: string (default: 'HEAD') - commit reference
- filesWithHash: boolean (default: false) - set to true to return blob hashes along with paths
const headFiles = await repo.listFiles(); // the same as repo.listFiles('HEAD')
// [ 'file.ext', 'path/to/file.ext', ... ]
const headFilesWithHashes = await repo.listFiles('HEAD', true);
// [ { path: 'file.ext', hash: 'f2e492a3049...' }, ... ]
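Because identical blobs share a hash, the `{ path, hash }` form can be used to spot files with identical content. The sketch below works on a plain array of that shape; `findDuplicateFiles` is a hypothetical helper, not part of scan-git.

```javascript
// Hypothetical helper: returns groups of paths that share the same blob hash.
// Files with a unique hash are left out of the result.
function findDuplicateFiles(files) {
    const pathsByHash = new Map();

    for (const { path, hash } of files) {
        if (!pathsByHash.has(hash)) {
            pathsByHash.set(hash, []);
        }

        pathsByHash.get(hash).push(path);
    }

    return [...pathsByHash.values()].filter((paths) => paths.length > 1);
}
```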
Retrieve a tree entry (file or directory) by its path at the specified commit reference.
- path: string - the path to the file or directory
- ref: string (default: 'HEAD') - commit reference
const entry = await repo.getPathEntry('path/to/file.txt');
// { isTree: false, path: 'path/to/file.txt', hash: 'a1b2c3d4e5f6...' }
Retrieve a list of tree entries (files or directories) by their paths at the specified commit reference.
- paths: string[] - an array of paths to files or directories
- ref: string (default: 'HEAD') - commit reference
const entries = await repo.getPathsEntries([
'path/to/file1.txt',
'path/to/dir1',
'path/to/file2.txt'
]);
// [
// { isTree: false, path: 'path/to/file1.txt', hash: 'a1b2c3d4e5f6...' },
// { isTree: true, path: 'path/to/dir1', hash: 'b1c2d3e4f5g6...' },
// { isTree: false, path: 'path/to/file2.txt', hash: 'c1d2e3f4g5h6...' }
// ]
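The `isTree` flag distinguishes directories from files, so a result like the one above can be split accordingly. A minimal sketch over a plain array of entries of that shape; `partitionEntries` is a hypothetical helper.

```javascript
// Hypothetical helper: splits tree entries into file paths and directory paths
// based on the `isTree` flag.
function partitionEntries(entries) {
    const files = [];
    const dirs = [];

    for (const entry of entries) {
        (entry.isTree ? dirs : files).push(entry.path);
    }

    return { files, dirs };
}
```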
Compute the file delta (changes) between two commit references, including added, modified, and removed files.
- nextRef: string (default: 'HEAD') - commit reference for the "next" state
- prevRef: string (optional) - commit reference for the "previous" state
const fileDelta = await repo.deltaFiles('HEAD', 'branch-name');
// {
// add: [ { path: 'path/to/new/file.txt', hash: 'a1b2c3d4e5f6...' }, ... ],
// modify: [ { path: 'path/to/modified/file.txt', hash: 'f1e2d3c4b5a6...', prevHash: 'a1b2c3d4e5f6...' }, ... ],
// remove: [ { path: 'path/to/removed/file.txt', hash: 'a1b2c3d4e5f6...' }, ... ]
// }
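The delta reports moved files as a removal plus an addition. When content is unchanged, the two entries carry the same hash, which allows a simple pairing heuristic. The sketch below is an illustration on top of the delta shape shown above, not something `deltaFiles()` does itself; `findExactRenames` is a hypothetical helper.

```javascript
// Hypothetical helper: pairs removed and added entries that share a blob hash,
// treating each pair as a rename with unchanged content.
function findExactRenames(delta) {
    const removedByHash = new Map();

    for (const { path, hash } of delta.remove) {
        // Keep the first removed path per hash
        if (!removedByHash.has(hash)) {
            removedByHash.set(hash, path);
        }
    }

    const renames = [];

    for (const { path, hash } of delta.add) {
        const from = removedByHash.get(hash);

        if (from !== undefined) {
            renames.push({ from, to: path, hash });
            removedByHash.delete(hash); // use each removed path at most once
        }
    }

    return renames;
}
```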
Return a list of commits in topological order.
Options:
- ref - an oid (hash) or a ref to start from
- depth (default 50) - limits the number of commits returned
const commits = await repo.log({ ref: 'my-branch', depth: 10 });
// [
// Commit,
// Commit,
// ...
// ]
Note: Pass Infinity as the depth value to load all the commits that are reachable from ref at once.
Returns statistics for a repo:
const stats = await repo.stat();
// {
// refs: { ... },
// objects: {
// loose: { ... },
// packed: { ... }
// }
// }
scan-git | isomorphic-git | Feature |
---|---|---|
✅ | ✅ | loose refs |
✅ | ✅ | packed refs |
🚫 | ✅ | index file: speeds up fetching a file list for HEAD |
✅ | ✅ | loose objects |
✅ | ✅ | packed objects (*.pack + *.idx files) |
✅ | 🚫 | 2 GB+ packs support: version 2 pack-*.idx files support packs larger than 4 GiB by adding an optional table of 8-byte offset entries for large offsets |
✅ | 🚫 | On-disk reverse indexes (*.rev files): a reverse index speeds up operations such as seeking an object by offset or scanning objects in pack order |
🚫 | 🚫 | multi-pack-index (MIDX): stores a list of objects and their offsets into multiple packfiles; can provide O(log N) lookup time for any number of packfiles |
🚫 | 🚫 | multi-pack-index reverse indexes (RIDX): similar to the pack-based reverse index |
✅ | 🚫 | Cruft packs: a cruft pack eliminates the need for storing unreachable objects in a loose state by including the per-object mtimes in a separate file alongside a single pack containing all loose objects |
🚫 | 🚫 | Pack and multi-pack bitmaps: bitmaps store reachability information about the set of objects in a packfile or a multi-pack index |
🚫 (TBD) | 🚫 | commit-graph: a binary file format that creates a structured representation of Git's commit history and speeds up some operations |
MIT