Well THAT was a long break from blogging...
One of the things that's happened in the illumos community is a subtle shift of the main illumos source repository from being primarily Mercurial to being primarily Git. This means I've had to learn Git. At first, I wasn't sure why people were so rabidly pro-Git. I found one of the big reasons:
everywhere(~/ws)[0]% /bin/time git clone git-illumos git-illumos.copy
Cloning into git-illumos.copy...
done.

real       11.8
user        4.7
sys         3.2
everywhere(~/ws)[0]% /bin/time hg clone illumos-clone illumos-clone.copy
updating working directory
44332 files updated, 0 files merged, 0 files removed, 0 files unresolved

real     1:52.6
user       28.9
sys        25.4
everywhere(~/ws)[0]%
Wow! Yeah, I can see why this would appeal to people. I'm still using Mercurial in a fair number of places, both for my illumos work and for Nexenta as well. I should show one other thing that both SCM cloning operations do: take up disk space.
everywhere(~/ws)[0]% zpool list
NAME    SIZE  ALLOC   FREE  EXPANDSZ    CAP  DEDUP  HEALTH  ALTROOT
rpool   298G   198G   100G         -    66%  1.00x  ONLINE  -
everywhere(~/ws)[0]% /bin/time git clone git-illumos git-illumos.copy
*** SNIP! ***
everywhere(~/ws)[0]% sync
everywhere(~/ws)[0]% zpool list
NAME    SIZE  ALLOC   FREE  EXPANDSZ    CAP  DEDUP  HEALTH  ALTROOT
rpool   298G   198G  99.6G         -    66%  1.00x  ONLINE  -
everywhere(~/ws)[0]% /bin/time hg clone illumos-clone illumos-clone.copy
*** SNIP! ***
everywhere(~/ws)[0]% sync
everywhere(~/ws)[0]% zpool list
NAME    SIZE  ALLOC   FREE  EXPANDSZ    CAP  DEDUP  HEALTH  ALTROOT
rpool   298G   199G  98.7G         -    66%  1.00x  ONLINE  -
everywhere(~/ws)[0]%
Going by the free-space numbers, Git also takes up less disk space than Mercurial (roughly 0.4G versus 0.9G here), but still, that's approximately half a gig or more for an illumos workspace. If it's populated, say with a preinstalled proto area and compiled objects, that'll be even larger.
Consider one of the great strengths of ZFS: its copy-on-write architecture. Take a local, on-disk master repo, say one you're pulling directly from the source, and make it its own filesystem. Child/downstream workspaces from your on-disk master now can be created using low-latency ZFS operations. Only two problems need to be solved: non-privileged usage, and SCM correction to properly designate the parent/child or upstream/downstream relationship.
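To make the idea concrete, here is a minimal sketch of the ZFS side of it. It only prints the commands rather than running them, and plan_cow_workspace is a name I made up for illustration, not part of any existing tool. It assumes the master workspace is its own dataset and that the child should live next to it in the dataset tree.

```shell
#!/bin/sh
# Hypothetical helper: print the ZFS commands that would turn an existing
# workspace dataset into a copy-on-write child workspace.
# Usage: plan_cow_workspace <parent-dataset> <child-name>
plan_cow_workspace() {
    parent=$1
    child=$2
    # Assumption: the child dataset sits next to the parent in the tree.
    container=${parent%/*}
    snapshot="${parent}@${child}"
    # A ZFS clone needs a snapshot as its origin, so snapshot first.
    echo "zfs snapshot ${snapshot}"
    echo "zfs clone ${snapshot} ${container}/${child}"
}

# Dry run only: nothing here touches a real pool.
plan_cow_workspace rpool/export/home/danmcd/ws/git-illumos git-illumos.zc
```

Both printed commands need only the snapshot, clone, create, and mount permissions, which is why the non-privileged usage problem comes down to delegation.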
Another useful ZFS feature is administrative delegation. Put simply, an administrator can allow an ordinary user to perform selected ZFS primitives on a given filesystem, and its descendants in the ZFS filesystem tree. For example:
everywhere(~)[0]% zfs allow rpool/export/home/danmcd
everywhere(~)[0]% zfs allow rpool/export/home/danmcd/ws
---- Permissions on rpool/export/home/danmcd/ws ----------------------
Local+Descendent permissions:
        user danmcd clone,create,destroy,mount,promote,snapshot
everywhere(~)[0]%
I (as root) delegated several permissions for a subdirectory of $HOME to me (as danmcd). From here, I can create new filesystems in ~/ws, as well as destroy them, clone them, mount, snapshot, and promote them. All of these are useful operations. The syntax for delegation is mostly straightforward: zfs allow -ld danmcd clone,create,destroy,mount,promote,snapshot rpool/export/home/danmcd/ws. The user name comes before the permission list, and the -ld flags apply the permissions both locally (to the named filesystem) and to its descendants.
First thing I did was zfs create rpool/export/home/danmcd/ws/illumos-clone, followed by hg clone ssh://anonhg@hg.illumos.org/illumos-gate illumos-clone. This populates my local Mercurial illumos repo. I can perform a similar operation with git. Per my above timing examples, I did so with git-illumos.
I wrote a script to clone, promote, and reparent Git and Mercurial workspaces using ZFS operations. It's called zclone and it's here for download. It's still a work in progress, and I'd like to maybe have it end up in usr/src/tools in illumos-gate someday. (I'll try and update this particular post as things evolve.)
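For a rough idea of what the reparenting step involves, here is a small sketch. It is not the zclone script itself: plan_reparent is a hypothetical helper that only prints what it would do, and it assumes the Git workspace already has a remote named origin and that the Mercurial parent is recorded as the default path in .hg/hgrc.

```shell
#!/bin/sh
# Hypothetical helper (not the zclone script itself): print the step that
# would point a freshly cloned workspace at its on-disk parent.
# Usage: plan_reparent <child-dir> <parent-dir>
plan_reparent() {
    child=$1
    parent=$2
    if [ -d "${child}/.git" ]; then
        # Git: make the parent workspace the "origin" remote (assumed to exist).
        echo "git -C ${child} remote set-url origin ${parent}"
    elif [ -d "${child}/.hg" ]; then
        # Mercurial: the default pull/push path lives in .hg/hgrc.
        echo "set 'default = ${parent}' under [paths] in ${child}/.hg/hgrc"
    else
        echo "plan_reparent: ${child} is neither a Git nor a Mercurial workspace" >&2
        return 1
    fi
}

# Demonstration against throwaway directories, so nothing real is touched.
demo=$(mktemp -d)
mkdir -p "${demo}/git-illumos.zc/.git" "${demo}/illumos-clone.zc/.hg"
plan_reparent "${demo}/git-illumos.zc" "${demo}/git-illumos"
plan_reparent "${demo}/illumos-clone.zc" "${demo}/illumos-clone"
rm -rf "${demo}"
```

Without a step like this, a ZFS clone is a byte-for-byte copy of its parent, so it would still pull from whatever the parent pulls from instead of from the parent itself.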
Check out the times, and the disk space (not) used:
everywhere(~/ws)[0]% zpool list
NAME    SIZE  ALLOC   FREE  EXPANDSZ    CAP  DEDUP  HEALTH  ALTROOT
rpool   298G   198G   100G         -    66%  1.00x  ONLINE  -
everywhere(~/ws)[0]% /bin/time zclone git-illumos git-illumos.zc
Created rpool/export/home/danmcd/ws/git-illumos.zc, a zfs clone of rpool/export/home/danmcd/ws/git-illumos

real        1.0
user        0.0
sys         0.0
everywhere(~/ws)[0]% /bin/time zclone illumos-clone illumos-clone.zc
Created rpool/export/home/danmcd/ws/illumos-clone.zc, a zfs clone of rpool/export/home/danmcd/ws/illumos-clone

real        1.0
user        0.0
sys         0.0
everywhere(~/ws)[0]% zpool list
NAME    SIZE  ALLOC   FREE  EXPANDSZ    CAP  DEDUP  HEALTH  ALTROOT
rpool   298G   198G   100G         -    66%  1.00x  ONLINE  -
everywhere(~/ws)[0]%
These are constant-time operations, folks. And like I said earlier, I suppose it's possible to have the local master repos populated with pre-compiled objects, header files in proto areas (an illumos build trick), and the results of other disk-intensive operations, all done ahead of time.
A quick search didn't yield me any results in this area: using ZFS to help make source trees take up less space. I'm surprised nobody's blogged about this or documented it, but I may have missed something. Either way, it doesn't hurt to mention it again.
I've been doing something like this with hg for a while, and yes, it's an excellent time saver, and saves large amounts of disk space too. (I do keep an unpacked copy of the closed bins, and do "make rootdirs" in my zfs parent dataset.)
Thanks for posting a git version of this handy tool.
use "git branch" :-)
Like it or not, here's my take on this (with surprisingly similar name) among my other git plugins: https://github.com/jimklimov/git-scripts
Announced in mailing lists quite a while ago :)
Usage: e.g. "git pull" into local workspace which is just a replica of upstream, "git zclone" it to spawn a build workspace (with downloads and publish-repos pointed via envvars into another dataset) and build there. Then instead of a "make clean" you can just "zfs destroy" this workspace.