<p><b>Kebe Says: a blog by Dan McD.</b></p>
<p>Dan McDonald -- illumos engineer and RTI advocate at Joyent. This blog formerly resided on Sun's blog roll as, "End-to-end... and everything in between."</p>
<p>Author: danmcd (<a href=http://www.blogger.com/profile/02293330539766533891>Blogger profile</a>, noreply@blogger.com). Feed updated 2024-03-12.</p>
<h2>Goodbye blogspot</h2>
<p>Posted 2020-08-21 by danmcd</p>
<p>
First off, long time no blog!
</p>
<p>
This is the last post I'm putting on the
<a href=https://kebesays.blogspot.com/>Blogspot site</a>. In the spirit of
eating my own dogfood, I've now set up a
<a href=https://kebe.com/blog/>self-hosted blog</a> on my HDC. I'm sure it
won't be hard for all half-dozen of you readers to move over. I'll have new
content over there, at the very least the Hello, World post, a catchup post,
and a HDC 3.0 post to match the ones for 1.0 and 2.0.
</p>
<h2>From 0-to-illumos on OmniOS r151016</h2>
<p>Posted 2015-11-03 by danmcd</p>
<p>
Today we updated OmniOS to its next stable release: <a href=http://omnios.omniti.com/wiki.php/ReleaseNotes/r151016>r151016</a>. You can click the link to see its release notes, and you may notice a brief mention of the <tt>illumos-tools</tt> package.
</p>
<p>
I want to see more people working on illumos. A way to help that is to get people started on actually BUILDING illumos more quickly. To that end, r151016 contains everything to bring up an illumos development environment. You can <a href=http://kebesays.blogspot.com/2011/03/for-illumos-newbies-on-developing-small.html>develop small</a> on it, but this post is going to discuss how we make building all of illumos-gate from scratch easier. (I plan on updating the older post on small/focused compilation after ws(1) and bldenv(1) effectively merge into one tool.)
</p>
<p>
The first thing you want to do is install OmniOS. The latest release media can be found <a href=http://omnios.omniti.com/wiki.php/Installation>here, on the Installation page</a>.
</p>
<p>
After installation, your system is a blank slate. You'll need to set a root password, create a non-root user, and finally add networking parameters. The OmniOS wiki's <a href=http://omnios.omniti.com/wiki.php/GeneralAdministration>General Administration Guide</a> covers how to do this.
</p>
<p>
I've added a new <a href=http://omnios.omniti.com/wiki.php/illumos-tools>building illumos</a> page to the OmniOS wiki that should detail how straightforward the process is. You should be able to kick off a full <a href=http://illumos.org/man/1onbld/nightly>nightly</a>(1ONBLD) build quickly enough. If you don't want to edit one of the omnios-illumos-* samples in /opt/onbld/env, just make sure you have a $USER/ws directory, clone one of illumos-gate or illumos-omnios into $USER/ws/testws and use one of the template /opt/onbld/env/omnios-illumos-* files corresponding to illumos-gate or illumos-omnios. For example:
<pre>
omnios(~)[0]% mkdir ws
omnios(~)[0]% cd ws
omnios(~/ws)[0]% git clone https://github.com/illumos/illumos-gate/ testws
<OUTPUT SHORTENED...>
omnios(~/ws)[0]% /bin/time /opt/onbld/bin/nightly /opt/onbld/env/omnios-illumos-gate
</pre>
You can then look in testws/log/log-<i>date&time</i>/mail_msg to see how your build went.
</p>
<h2>Quick Reminder -- tcp_{xmit,recv}_hiwat and high-bandwidth*delay networks</h2>
<p>Posted 2015-04-20 by danmcd</p>
<p>
I was recently working with a colleague on connecting two data centers via an IPsec tunnel. He was using <tt>iperf</tt> (coming soon to OmniOS bloody along with <tt>netperf</tt>) to test the bandwidth, and was disappointed in his results.
</p>
<p>
The amount of memory you need to hold a TCP connection's unacknowledged data is the <a href=http://en.wikipedia.org/wiki/Bandwidth-delay_product>Bandwidth-Delay product</a>. The defaults shipped in illumos are small on the receive side:
<pre>
bloody(~)[0]% ndd -get /dev/tcp tcp_recv_hiwat
128000
bloody(~)[0]%
</pre>
and even smaller on the transmit side:
<pre>
bloody(~)[0]% ndd -get /dev/tcp tcp_xmit_hiwat
49152
bloody(~)[0]%
</pre>
</p>
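<p>
To put numbers on the bandwidth-delay product, here is a minimal sketch in C. The function names and the sample link figures (1 Gbit/s, 50 ms round trip) are hypothetical and chosen only for illustration; the two buffer sizes are the defaults shown above.
</p>

```c
#include <stdint.h>

/*
 * Bandwidth-delay product: how many bytes can be "in flight"
 * (sent but not yet acknowledged) on a path with the given
 * bandwidth and round-trip time.
 */
uint64_t
bdp_bytes(uint64_t bits_per_sec, uint64_t rtt_ms)
{
    return (bits_per_sec * rtt_ms / 8000);
}

/*
 * Best-case throughput, in bits per second, when no more than
 * window_bytes may be unacknowledged at once. A zero RTT means
 * the window is never the limit, so report "unlimited".
 */
uint64_t
window_limited_bps(uint64_t window_bytes, uint64_t rtt_ms)
{
    if (rtt_ms == 0)
        return (UINT64_MAX);
    return (window_bytes * 8000 / rtt_ms);
}
```

<p>
On the hypothetical 1 Gbit/s, 50 ms path, <tt>bdp_bytes()</tt> gives 6250000 bytes, while the default 49152-byte transmit buffer caps a single connection at roughly 7.9 Mbit/s no matter how fast the link is.
</p>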
<p>
Even on platforms with <a href=http://wenxueliu.github.io/blog/10/10/2014/linux-tuning/>automatic tuning</a>, the maximums they use are often not set high enough.
</p>
<p>
Introducing IPsec into the picture adds additional latency (if not so much for encryption thanks to AES-NI & friends, then for the encapsulation and checks). This often is enough to take what are normally good enough maximums and invalidate them as too small. To change these on illumos, you can use the ndd(1M) command shown above, OR you can use the modern, persists-across-reboots, ipadm(1M) command:
<pre>
bloody(~)[1]% sudo ipadm set-prop -p recv_buf=1048576 tcp
bloody(~)[0]% sudo ipadm set-prop -p send_buf=1048576 tcp
bloody(~)[0]% ipadm show-prop -p send_buf tcp
PROTO PROPERTY PERM CURRENT PERSISTENT DEFAULT POSSIBLE
tcp send_buf rw 1048576 1048576 49152 4096-1048576
bloody(~)[0]% ipadm show-prop -p recv_buf tcp
PROTO PROPERTY PERM CURRENT PERSISTENT DEFAULT POSSIBLE
tcp recv_buf rw 1048576 1048576 128000 2048-1048576
bloody(~)[0]%
</pre>
</p>
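<p>
The commands above change the system-wide defaults. An individual application can also ask for larger buffers on a single socket with the standard <tt>SO_RCVBUF</tt> and <tt>SO_SNDBUF</tt> socket options. The sketch below is one way to do that; the helper name <tt>request_tcp_buffers()</tt> is hypothetical, and how the system rounds or caps the requested size is platform-dependent, so treat the result as a request rather than a guarantee.
</p>

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Ask for `bytes` of receive and send buffer on one socket.
 * Returns 0 if both requests were accepted, -1 otherwise.
 * The system may silently round or cap the value; use
 * getsockopt() afterwards to see what was actually granted.
 */
int
request_tcp_buffers(int fd, int bytes)
{
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof (bytes)) != 0)
        return (-1);
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof (bytes)) != 0)
        return (-1);
    return (0);
}
```

<p>
Calling this before <tt>connect()</tt> or <tt>listen()</tt> is the usual advice, since the TCP window scale is negotiated when the connection is set up.
</p>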
<p>
There's future work there in not only increasing the upper bound (easy), but also adopting automatic tuning so the maximum isn't just taken right off the bat.
</p>
<h2>New HDC service: Calendaring (or, The Limitation Game)</h2>
<p>Posted 2015-03-15 by danmcd</p>
<p>
I'll start by stating my biases: I don't like data bloat like ASN.1, XML, or even bloaty protocols like HTTP. (Your homework: Would a 1980s-developed WAN-scale RPC have obviated HTTP? Write a paper with your answer to that question, with support.) I understand the big problems they attempt to solve. I also still think not enough people in the business were paying attention in OS (or Networking) class when seeing the various attempts at data representation during the 80s and 90s. Also, I generally like pushing intelligence out to the end-nodes, and in client/server models, this means the clients. CalDAV rubs me the wrong way on the first bias, and MOSTLY the right way on my second bias, though the clients I use aren't very smart. I will admit near-complete ignorance of CalDAV. I poked a little at its <a href=https://tools.ietf.org/html/rfc4791>RFC</a>, looking up how Alarms are implemented, and discovered that mostly, Alarm processing is a client issue. ("This specification makes no attempt to provide multi-user alarms on group calendars or to find out for whom an alarm is intended.")
</p>
<p>
I've configured <a href=http://radicale.org/>Radicale</a> on my <a href=http://kebesays.blogspot.com/2014/06/home-data-center-20-dogfooding-again.html>Home Data Center</a>. I need to publicly thank Lauri Tirkkonen (aka <tt>lotheac</tt> on Freenode) for the IPS publisher which serves me up Radicale. Since my target audience is my family-of-four, I wasn't particularly concerned with its reported lack of scalability. I also didn't want to have CalDAV be a supplicant of Apache or another web server for the time being. If I decide to revisit my web server choices, I may move CalDAV to that new webserver (likely nginx). I got TLS and four users configured on stock Radicale.
</p>
<p>
My job was to make an electronic equivalent of our family paper calendar. We have seven (7) colors/categories for this calendar (names withheld from the search engines): Whole-Family, Parent1, Parent2, Both-Parents, Child1, Child2, Both-Children. I thought, given iCal (10.6), Calendar.app (10.10), or Calendar (iOS), it wouldn't be too hard for these to be created and shared. I was mildly wrong.
</p>
<p>
I'm not sure if what I had to do was a limitation of my clients, of Radicale, or of CalDAV itself, but I had to create seven (7) different accounts, each with a distinct ends-in-'/' URL:
<ul>
<li>https://.../Whole-Family.ics/</li>
<li>https://.../Parent1.ics/</li>
<li>https://.../Parent2.ics/</li>
<li>https://.../Both-Parents.ics/</li>
<li>https://.../Child1.ics/</li>
<li>https://.../Child2.ics/</li>
<li>https://.../Both-Children.ics/</li>
</ul>
I had to configure N (large N) devices or machine-logins with these seven (7) accounts.
</p>
<p>
Luckily, Radicale DID allow me to restrict Child1's and Child2's write access to just their own calendars. Apart from that, we want the whole family to read all of the calendars. This means the colors are uniform across all of our devices (stored on the server). It also means any alarms (per above) trigger on ALL of our devices. This makes alarms (something I really like in my own Calendar) useless. Modulo the alarms problem (which can be mitigated by judicious use of iOS's Reminders app and a daily glance at the calendar), this seems to end up working pretty well, so far.
</p>
<p>
Both children recently acquired iPhones, which means that if I open this service outside our internal home network, we can schedule calendars and get up-to-date changes no matter where we are. That will be extremely convenient.
</p>
<p>
I somewhat hope that one of my half-dozen readers will find something so laughably wrong with how I configured things that any complaints I make will be rendered moot. I'm not certain, however, that will be the case.
</p>
<h2>Toolsmiths - since everything is software now anyway...</h2>
<p>Posted 2014-11-09 by danmcd</p>
<p>
A recent twitter storm occurred in light of last week's <a href=https://twitter.com/hashtag/encryptnews?src=hash>#encryptnews</a> event.
</p>
<p>
I was rather flattered when well-known whistleblower Thomas Drake retweeted this response of mine:
<blockquote class="twitter-tweet" lang="en"><p><a href="https://twitter.com/KevinBankston">@KevinBankston</a> <a href="https://twitter.com/Thomas_Drake1">@Thomas_Drake1</a> <a href="https://twitter.com/headhntr">@headhntr</a> Because vendors can be compromised. You want goodness, fully fund an FOSS project.</p>— Dan McDonald (@kebesays) <a href="https://twitter.com/kebesays/status/530743332134477825">November 7, 2014</a></blockquote>
The mention of "buying usable software" probably makes sense to someone who's used to dealing with Commercial, Off-The-Shelf (COTS) software. We don't live in a world where COTS is necessarily safe anymore. There was a period (which I luckily lived and worked in), where Defense Department ARPA money was being directed specifically to make COTS software more secure and high-assurance. Given the Snowden revelations, however, COTS can possibly be a vulnerability as much as it could be a strength.
</p>
<p>
In the seminal Frederick Brooks book, <a href=http://www.amazon.com/Mythical-Man-Month-Software-Engineering-Anniversary/dp/0201835959/ref=sr_1_1?s=books&ie=UTF8&qid=1415567117&sr=1-1&keywords=the+mythical+man+month>The Mythical Man-Month</a>, he describes one approach to software engineering: The Surgical Team. See <a href=http://www.dfpug.de/loseblattsammlung/online/workshop/design_patterns/sonstiges.htm>here</a> and scroll down for a proper description. Note the different roles for such a team.
</p>
<p>
Given that most media is equivalent to software (easily copied, distributed, etc.), I wonder if media organizations shouldn't adopt certain types of those organizational roles that have been until now the domain of traditional software. In particular, the role of the Toolsmith should be one that modern media organizations adopt. Ignoring traditional functions of "IT", a toolsmith for, say, an investigative organization should be well-versed in what military types like to call Defensive Information Warfare. Beyond just the mere use of encryption (NOTE: ANYONE who equates encryption with security should be shot, or at least distrusted), such Toolsmiths should enable their journalists (who would correspond to the surgeon or the assistant in the surgical team model) to do their job in the face of strong adversaries. An entity that needs a toolsmith will also need a software base, and unless the entity has resources enough to create an entire software stack, that entity will need Free Open-Source Software (for various definitions of Free and Open I won't get into for fear of derailing my point).
</p>
<p>
I haven't been working in security much since the Solaris Diaspora, so I'm a little out of touch with modern threat environments. I suspect it's everything I'd previously imagined, just more real, and where the word "foreign" can be dropped from "major foreign governments". Anyone who cares about keeping their information and themselves safe should, in my opinion, have at least a toolsmith on their staff. Several organizations do, or at least have technology experts, like the ACLU's <a href=https://twitter.com/csoghoian>Christopher Soghoian</a>, for example. The analogy could probably extend beyond security, but I wanted to at least point out the use of an effective toolsmith.
</p>
<h2>Happy (early) 20th anniversary, IPv6</h2>
<p>Posted 2014-07-21 by danmcd</p>
<p>
My first full-time job out of school was with <a href=http://www.nrl.navy.mil/>The U.S. Naval Research Laboratory</a>. It was a spectacular opportunity. I was going to be working on next-generation (at the time) Internet Protocol research and development.
</p>
<p>
When I joined in early 1994, the IPng proposals had been narrowed to three:
</p>
<ul>
<li><b>SIPP</b> - Simple Internet Protocol Plus. 8-byte addresses, combined with a routing header that could, in theory, extend the space even further (inherited from IPng contender PIP).</li>
<li><b>TUBA</b> - TCP Using Big Addresses. The use of OSI's CLNP with proven IPv4 transports TCP and UDP running over it.</li>
<li><b>CATNIP</b> - Common Architecture for the Internet. I never understood this proposal, to be honest, but I believe it was an attempt to merge CLNP and IPv4.</li>
</ul>
<p>
NRL, well, <a href=http://www.nrl.navy.mil/itd/chacs/>my part of NRL, anyway</a>, placed its bet on SIPP. I was hired to help build SIPP for the then-nascent 4.4BSD. (The first 10 months were actually on 4.3 Net/2 as shipped by BSDI!) It was a great team to work with, and our <a href=http://www.usenix.org/publications/library/proceedings/sd96/full_papers/atkinson.ps>1995 USENIX paper</a> displayed our good work.
</p>
<p>
Ooops... I'm getting a bit ahead of myself.
</p>
<p>
The announcement of the IPng winner was to be at the <a href=http://www.ietf.org/proceedings/30/>30th IETF meeting in Toronto</a>, late in July. Some of us were fortunate to find out early that what would become IPv6 was SIPP, but with 16-byte addresses. Since I was building this thing, I figured it was time to get to work before Toronto.
</p>
<p>
20 years ago today, I sent this (with slightly reordered header fields) mail out to a subset of people. I didn't use the public mailing list, because I couldn't disclose SIPP-16 (which became IPv6) before the Toronto meeting. I also discovered some issues that later implementors would discover, as you can see.
</p>
<pre style=font-size:7pt>
From: "Daniel L. McDonald" <danmcd>
Subject: SIPP-16 stuff
To: danmcd (Daniel L. McDonald), cmetz (Craig Metz), atkinson (Ran Atkinson),
deering@parc.xerox.com, Bob.Hinden@eng.sun.com,
bob.gilligan@eng.sun.com, francis@cactus.ntt.jp,
rxg@thumper.bellcore.com, set@thumper.bellcore.com, bound@zk3.dec.com,
christian.huitema@sophia.inria.fr, conta@lassie.ucx.lkg.dec.com,
grehan@flotsm.ozy.dec.com, nordmark@jurassic-248.Eng.Sun.COM,
bill.simpson@um.cc.umich.edu, rj@sgi.com
Date: Thu, 21 Jul 1994 19:20:33 -0500 (EST)
Cc: vjs@sgi.com
X-Mailer: ELM [version 2.4 PL23]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Message-ID: <9407220020.aa02835@sundance.itd.nrl.navy.mil>
Content-Length: 1578
Status: RO
X-Status:
X-Keywords: NotJunk
X-UID: 155
SIPP folks,

Has anyone tried quick-n-dirty SIPP-16 mods yet?

We have managed to send/receive SIPP-16 pings across both Ethernet and
loopback. UDP was working with SIPP-8, and we're working on it for SIPP-16.
Minor multicast cases were working for SIPP-8 also, and will be moved to
SIPP-16. TCP will be forthcoming once we're comfortable with some of the
protocol control block changes.

My idea for the SIPP-16 sockaddr_sipp and sipp_addr is something like:

        struct sipp_addr {
                u_long words[4];
        };

        struct sockaddr_sipp {
                u_char ss_len;          /* For BSD routing tree code. */
                u_char ss_family;
                u_short ss_port;
                u_long ss_reserved;
                struct sipp_addr ss_addr;
        };

We've managed to use the above to configure our interfaces, and send raw
SIPP-16 ICMP pings. I've a feeling the routing tree will get hairy with the
new sockaddr_sipp. The size discrepancy between the sockaddr_sipp, and the
conventional sockaddr will cause other compatibility issues to arise.
(E.g. SIOCAIFADDR will not work with SIPP, but SIOCAIFADDR_SIPP will.)

We look forward to the implementors meeting, so we can talk about bloody
gory details, experience with certain internals (PCBs!), and to find out
how far behind we still are.

Dan McD, Craig Metz, & Ran Atkinson
--
Dan McDonald | Mail: {danmcd,mcdonald}@itd.nrl.navy.mil --------------+
Computer Scientist | WWW: http://wintermute.itd.nrl.navy.mil/danmcd.html |
Naval Research Lab | Phone: (202) 404-7122 #include <disclaimer.h> |
Washington, DC | "Rise from the ashes, A blaze of everyday glory" - Rush +
</pre>
<p>
Funny how many defunct-or-at-least-renamed organizations (Sun, DEC, Bellcore) are in that mail. BTW, for Solarish systems, the SIOCSLIFADDR (note the 'L') became the ioctl of choice for longer sockaddrs. Also, this was before I discovered uintN_t data types.
</p>
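<p>
The remark about uintN_t types can be made concrete. The sketch below restates the two structures from the mail with fixed-width types, so the address is 16 bytes regardless of how wide <tt>long</tt> is on the target. The <tt>_fixed</tt> names are hypothetical and do not come from the original code or from any shipped header.
</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-width restatement of the 1994 structures. */
struct sipp_addr_fixed {
    uint32_t words[4];      /* 4 x 32 bits = 16 bytes on every platform */
};

struct sockaddr_sipp_fixed {
    uint8_t  ss_len;        /* For BSD routing tree code. */
    uint8_t  ss_family;
    uint16_t ss_port;
    uint32_t ss_reserved;
    struct sipp_addr_fixed ss_addr;
};

/* Compile-time (C11) checks that the layout holds no surprises. */
_Static_assert(sizeof (struct sipp_addr_fixed) == 16,
    "address must be exactly 16 bytes");
_Static_assert(offsetof(struct sockaddr_sipp_fixed, ss_addr) == 8,
    "address must start right after the 8-byte header");
_Static_assert(sizeof (struct sockaddr_sipp_fixed) == 24,
    "unexpected padding in sockaddr");
```

<p>
With <tt>u_long words[4]</tt>, the same declaration would come out at 32 bytes on an LP64 system, where <tt>long</tt> is 64 bits wide.
</p>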
<p>
If it wasn't clear from the text of the mail, we actually transmitted IPv6 packets across an Ethernet that day. It's possible these were the first IPv6 packets ever sent <i>on a wire</i>. (Other early implementations used IPv6-in-IPv4 exclusively.) I won't fully claim that honor here, but I do believe it <i>could</i> be true.
</p>
<h2>Home Data Center 2.0 - dogfooding again!</h2>
<p>Posted 2014-06-02 by danmcd</p>
Over six years ago, I put together my first <a href="http://kebesays.blogspot.com/2008/03/kebes-home-data-center-or-fbarts-new.html">home data center (HDC)</a>, which I assembled around a free CPU that was given to me.
<br />
A lot has happened in those six years. I've moved house, been through <a href="http://oracle.com/">three</a> <a href="http://nexenta.com/">different</a> <a href="http://omniti.com/">employers</a> (and yes, I count Oracle as a different employer, for reasons you can see <a href="https://www.youtube.com/watch?v=-zRN7XLCRhc">here</a>), and most relevant to this blog post - technology has improved.
<br />
My old home server was an energy pig, loud, and hitting certain limits. The <a href="http://www.cpu-world.com/CPUs/K8/AMD-Dual-Core%20Opteron%20185%20-%20OSA185DAA6CD%20(OSA185CDBOX).html">Opteron Model 185</a> has a TDP of 110 watts, and worse, the original power supply in the original HDC broke, and I replaced it with a LOUD one from a Sun w2100z workstation. I also replaced other parts over the years as things evolved. What I ended up with at the start of 2014 was:
<br />
<ul>
<li><b>AMD Opteron Model 185</b> - No changes here.</li>
<li><b>Tyan S2866</b> - Same here, too.</li>
<li><b>4GB of ECC RAM</b> - Up from 2GB of ECC, to the motherboard's maximum. I tried at first with two additional GB of non-ECC, but one nightly build of <a href="http://github.com/illumos/illumos-gate/">illumos-gate</a> where I saw a single-bit error in one built binary was enough to convince me about ECC's fundamental goodness.</li>
<li><b>Two Intel S3500 80GB SATA SSDs</b> - I use these as mirrored root, and mirrored slog, leaving alone ~20GB slices (16 + 4) each. I'm under the assumption that the Intel disk controller will do proper wear-leveling, and what-not. (Any corrections are most appreciated!) These replace two different, lesser-brand 64GB SSDs that crapped out on me.</li>
<li><b>Two Seagate ST2000DL003 2TB SATA drives.</b> - I bought these on clearance a month before the big Thailand flood that disrupted the disk-drive market. At $30/TB, I still haven't found as good of a deal, and the batch on sale were of sufficient quality to not fail me or my mirrored data (so says ZFS, anyway).</li>
<li><b>Lian Li case</b> - I still like the overall mechanical design of this brother-in-law recommended case. I already mentioned the power supply, so I'll skip that here.</li>
<li><b>A cheap nVidia 8400 card</b> - It runs twm on a 1920x1200 display, good enough!</li>
<li><b>OpenIndiana</b> - After moving OpenSolaris from SVR4 to IPS, I used OpenSolaris until Oracle happened. OI was a natural stepping stone off of OpenSolaris.</li>
</ul>
I gave a <a href="https://www.youtube.com/watch?v=APL7FzuzbpE">talk</a> on how I use my HDC. I'll update that later in this post, but suffice to say, between the energy consumption and the desire for me and my family to enable more services, I figured it was time to upgrade the hardware. With my new job at <a href="http://omniti.com/">OmniTI</a>, I also wanted to start <a href="http://en.wikipedia.org/wiki/Eating_your_own_dog_food">dogfooding</a> something I was working with. I couldn't use <a href="http://nexenta.com/products/nexentastor">NexentaStor</a> with my HDC, because of the non-storage functions of <a href="http://www.illumos.org/">Illumos</a> I was using. <a href="http://omnios.omniti.com/">OmniOS</a>, on the other hand, was going to be a near-ideal candidate to replace OpenIndiana, especially given its server focus.
<br />
As before, I started with a CPU for the system. The Socket 1150 Xeon E3 chips, which we had on one server at Nexenta (to help with the Illumos bring up of Intel's I210 and I217 ethernet chips, alongside Joyent and Pluribus), seemed an ideal candidate. Some models had low power draws, and they had all of the features needed to exploit more advanced Illumos features like KVM, if I ever needed them. I also considered the Socket 2011 Xeon E5 chips, but decided that I really didn't need more than 32GB of RAM for the foreseeable future. So with that in mind, I asked OmniTI's Supermicro sales rep to put together a box for me. Here's what I got:
<br />
<ul>
<li><b>Intel Xeon E3 1265L v3</b> - <a href="http://www.cpu-world.com/CPUs/Xeon/Intel-Xeon%20E3-1265L%20v3.html">This CPU</a> has a TDP of 45 watts, about 40% of the old CPU's 110-watt TDP. It clocks slightly slower, but otherwise is <a href="http://www.cpu-world.com/Compare/962/AMD_Dual-Core_Opteron_185_vs_Intel_Xeon_E3-1265L_v3.html">quite the upgrade</a> with 4 cores, hyperthreading (looking like 8 CPUs to Illumos), and all of the modern bells and whistles like VT-x with EPT and AES-NI. It also is being used in <a href="http://www.listbox.com/member/archive/182179/2014/05/sort/time_rev/page/18/entry/0:432/20140501213634:1DDB9C50-D19A-11E3-837C-EBC080CE6C91/">at least one shipping illumos-driven product</a>, which is nice to know.</li>
<li><b>Supermicro X10SLM-LN4F motherboard</b> - <a href="http://www.supermicro.com/products/motherboard/Xeon/C220/X10SLM_-LN4F.cfm">This motherboard</a> has four Intel I210 Gigabit ethernet ports on it. I only need two for now, thanks to Crossbow, but I have plans that my paranoia about separate physical LANs may require one or both of those last two. I'm using all four of its 6Gbit SATA ports, and it has two more 3Gbit ones for later. (I'll probably move the SSDs to the 3Gbit ones, because of latency vs. throughput, if I go to a 4-spinning-rust storage setup.) I've disabled USB3 for now, but if/when illumos supports it, I'll be able to test it here.</li>
<li><b>32 GB of ECC RAM</b> - Maxed out now. So far, this hasn't been a concern.</li>
<li><b>Same drives as the old one</b> - I moved them right over from the old setup. Installed OmniOS (see below), but basically did "zpool split", "zpool export" from the old server, and "zpool import" on the new one. ZFS again for the win!</li>
<li><b>Supermicro SC732D4</b> - <a href="http://www.supermicro.com/products/chassis/tower/732/SC732D4-865.cfm">The case</a>, while not QUITE as cabling-friendly as the old Lian Li, has plastic disk trays that are an improvement over just screwing them in place on the Lian Li. The case comes standard with a four-disk 3.5" cage, and I added a four-disk 2.5" cage to mine. The 500W power supply seems to be an energy improvement, and is DEFINITELY quieter.</li>
<li><b>OmniOS r151010</b> - For my home server use, I'm going to be using the stable OmniOS release, which as of very recently became r151010. Every six months, therefore, I'll be getting a new OmniOS to use on this server. I haven't tried installing X or twm just yet, but that, and possibly printer support for my USB color printer, are the only things lacking over my old OI install.</li>
</ul>
I've had this hardware running for about two weeks now. It does everything the old server did, and a few new things.
<br />
<ul>
<li><b>File Service</b> - NFS, and as of very recently, CIFS as well. The latter is entirely to enable scan-to-network-disk scanning. This happens in the global zone, on the "internal network" NIC.</li>
<li><b>Router</b> - This is a dedicated zone which serves as the default router and NAT box. It also redirects external web and Minecraft requests (see below) to their respective zones. It also serves as an IPsec-protected remote access point. Ex-Sun people will know <i>exactly</i> what I'm talking about. It uses an internal vNIC, and a dedicated external NIC.</li>
<li><b>Webserver</b> - As advertised. Right now it just serves static content on port 80 (<a href="http://kebe.com/">www.kebe.com</a>), but I may expand this, if I don't put HTTPS service in another zone later. This sits on an internal vNIC, and its inbound traffic is directed by the NAT/Router.</li>
<li><b>Minecraft</b> - My children discovered <a href="https://minecraft.net/">Minecraft</a> in the past year or so. Turns out, Illumos does a good job of serving Minecraft. With this new server, and running the processes as 32-bit ones (implicit 4Gig limit), I can host two Minecraft servers easily now. This sits on an internal vNIC as well.</li>
<li><b>Work</b> - For now, this is just a place for me to store files for my job and build things. Soon, I plan on using another IPsec tunnel in the Router zone, an etherstub, and making this a part of my office, sitting in my house. Once that happens, I'll be using a dedicated NIC (for separation) to plug my work-issued laptop into.</li>
<li><b>Remote printing</b> - I have a USB color printer that the global zone can share (via lpd). To be honest, I don't have this working on OmniOS just yet, but I'll get that back.</li>
<li><b>DHCP and DNS</b> - Some people assume these are part of a router, but that's not necessarily the case. In this new instantiation, they'll live in the same zone as the webserver (which has a default route installed but is NOT the router). For this new OmniOS install, I'm switching to the ISC DHCP daemon. I hope to upstream it to <a href="http://github.com/omniti-labs/omnios-build/">omnios-build</a> after some operational experience.</li>
</ul>
Not quite two weeks now, and so far, so good. My kids haven't noticed any lags in Minecraft, and I've built illumos-gate from scratch, both DEBUG and non-DEBUG, in less than 90 minutes. We'll see how DHCP holds up when <a href="http://anywaybecause.blogspot.com/2013/09/great-expectations.html">Homeschool Book Club</a> shows up with Moms carrying smartphones, tablets, and laptops, plus even a kid or two bringing a Minecraft-playing laptop as well for after the discussion.
<h2>It's just me, I think, but return()s in loops can be bad.</h2>
<p>Posted 2014-02-26 by danmcd</p>
<p>I was reviewing some code tonight. It was a simple linked-list match which originally looked like:</p>
<pre><tt>
obj_t *
lookup(match_t key)
{
    obj_t *p;

    for (p = list_head(); p; p = list_next()) {
        if (p->val == key)
            return (p);
    }
    return (NULL);
}
</tt></pre>
<p>Not bad. But it turns out the list in question needed mutually exclusive access, so the reviewee inserted the mutex into this code.</p>
<pre><tt>
obj_t *
lookup(match_t key)
{
    obj_t *p;

    mutex_enter(list_lock());
    for (p = list_head(); p; p = list_next()) {
        if (p->val == key) {
            mutex_exit(list_lock());
            return (p);
        }
    }
    mutex_exit(list_lock());
    return (NULL);
}
</tt></pre>
<p>Eeesh, two places to call mutex_exit(). I suppose a good compiler would recognize the common basic blocks and optimize them out, but that's still mildly ugly to look at. Still, that above code just rubbed me the wrong way, even though I KNOW there are other bits of Illumos that are like the above. I didn't block the reviewee, but I did write down what I thought it should look like:</p>
<pre><tt>
obj_t *
lookup(match_t key)
{
    obj_t *p;

    mutex_enter(list_lock());
    p = list_head();
    while (p != NULL && p->val != key)
        p = list_next();
    mutex_exit(list_lock());
    return (p);
}
</tt></pre>
<p>That seems simpler. The operation is encapsulated in the mutex_{enter,exit} section, and there are no escape hatches save those in the while boolean. (It's the always-drop-the-mutex-upon-return that makes language constructs like monitors look appealing.)</p>
<p>I think I'm probably making a bigger deal out of this than I should, but the last code looks more readable to me.</p>
<p>One thing the reviewee suggested to me was that a for loop like before, but with breaks, would be equally clean w.r.t. only having one place to drop the mutex. I think the reviewee is right, and it allows for more sophisticated exits from a loop.</p>
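<p>For comparison, here is a self-contained sketch of the for-loop-with-break variant the reviewee suggested. It is not the code under review: a POSIX <tt>pthread_mutex_t</tt> stands in for the kernel mutex, and <tt>obj_t</tt>, <tt>match_t</tt>, <tt>list_first</tt>, and <tt>list_push()</tt> are hypothetical stand-ins invented so the example compiles on its own.</p>

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the types in the reviewed code. */
typedef int match_t;

typedef struct obj {
    struct obj *next;
    match_t val;
} obj_t;

static pthread_mutex_t list_mutex = PTHREAD_MUTEX_INITIALIZER;
static obj_t *list_first = NULL;

/* Push an object on the front of the list, under the lock. */
void
list_push(obj_t *o)
{
    pthread_mutex_lock(&list_mutex);
    o->next = list_first;
    list_first = o;
    pthread_mutex_unlock(&list_mutex);
}

/*
 * Find the first object whose val matches key, or NULL.
 * The loop leaves via break or by running off the end (p == NULL);
 * either way control reaches the single unlock below.
 */
obj_t *
lookup(match_t key)
{
    obj_t *p;

    pthread_mutex_lock(&list_mutex);
    for (p = list_first; p != NULL; p = p->next) {
        if (p->val == key)
            break;
    }
    pthread_mutex_unlock(&list_mutex);
    return (p);
}
```

<p>As with the while version, there is exactly one place where the mutex is dropped, and more exit conditions can be added as extra breaks without adding more unlock calls.</p>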
<p>Some people would even use "goto fail" here, but we know what can happen when that goes wrong. :) </p>
<h2>Greetings from OmniTI</h2>
<p>Posted 2014-01-07 by danmcd</p>
<h2>Hello again, world!</h2>
<a href=http://www.omniti.com/>OmniTI</a> gave me an opportunity to get back into the networking stack, while still having the ability to stay a jack-of-all-trades at least some of the time. It was a hard decision to make, but as of this past Monday, I'm now at OmniTI. My first week I'm down here in Maryland at HQ, but I'll be working from my house primarily. I hope also with this new job to appear at conferences a bit more, and meet more <a href=http://www.illumos.org/>illumos</a> users and developers in person, especially <a href=http://omnios.omniti.com/>OmniOS</a> ones.
<h2>What I learned from my Atari 8-bit days</h2>
<p>Posted 2013-12-12 by danmcd</p>
<p>
Happy Throwback Thursday! Some time ago, also on Throwback Thursday, I tweeted a link to a document I wish I had when I was much younger:
</p>
<blockquote class="twitter-tweet"><p>Apparently it's throw-back Thursday, aka <a href="https://twitter.com/search/%23tbt">#tbt</a>. Here's the Atari XL/XE memory map. Cut my assembly teeth here. <a href="http://t.co/q6lHRYjUN6" title="http://www.atariarchives.org/mapping/appendix12.php">atariarchives.org/mapping/append…</a></p>— Dan McDonald (@kebesays) <a href="https://twitter.com/kebesays/status/324968626601984000">April 18, 2013</a></blockquote>
<script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>
<p>
I wish I'd had it, because it may have helped me save my first 8-bit Atari computer (an 800XL) from having its POKEY chip fried by a dumb copy-ROM-into-RAM loop. Beyond learning not to blindly write into hardware registers, my Atari 8-bits ended up teaching me a surprising amount. A fair amount of what I learned helped me mature into a proper Computer Scientist and Software Engineer.
</p>
<h3>Be Careful of the Next Version</h3>
<p>
I generally look forward to upgrades. Bugs get fixed, features get added, things move faster, and if you're really lucky, you get more than one of those with one upgrade. It doesn't always turn out nicely, though. Sometimes, the next version changes things enough that things that once worked no longer do. Other times, the next version just plain sucks.
</p>
<p>
8-bit Atari owners had two serious negative encounters - one of each kind. The unexpected change was the transition from the original 400 & 800 models to the XL (and later XE) series. The reason this was a problem is actually best described later.
</p>
<p>
Atari's DOS (almost every 8-bit machine's disk drivers were called "DOS") lingered on version 2.0 from 1980 until 1984. To accompany new "enhanced density" 5.25" floppy drives, Atari released DOS 3. DOS 3 falls squarely into the, "just plain sucks," category. It was a poor design, including such misfeatures as:
<ul>
<li>Larger block sizes (2048 bytes vs. 128 bytes), which led to wasted disk space and sometimes less overall capacity if anything barely spilled into the next block</li>
<li>One-way migration. Once your data moved to DOS 3, it wasn't going back.</li>
<li>An overbearing help system that took up disk space (already at a premium).</li>
</ul>
I didn't know what it was called at the time, but DOS 3 suffered from the <a href=http://www.catb.org/jargon/html/S/second-system-effect.html>Second-System Effect</a>. Luckily, Atari ended up offering DOS 2.5, which looked like DOS 2.0, save for both support for enhanced-density floppies AND the ability to migrate DOS 3 files back to DOS 2.x.
</p>
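<p>To put a number on the block-size complaint, here is a rough sketch of the arithmetic. It assumes space is handed out in whole blocks and ignores any per-sector bookkeeping bytes and directory overhead, so the figures are illustrative rather than exact for either DOS. The function names are made up.</p>

```python
def allocated_bytes(file_size, block_size):
    """Disk space consumed when a file is stored in whole blocks."""
    blocks = -(-file_size // block_size)  # ceiling division on integers
    return blocks * block_size


def wasted_bytes(file_size, block_size):
    """Slack left in the last block of the file."""
    return allocated_bytes(file_size, block_size) - file_size


# A file that barely spills past 2048 bytes:
small_blocks = wasted_bytes(2049, 128)    # 127 bytes of slack
large_blocks = wasted_bytes(2049, 2048)   # 2047 bytes of slack
```

<p>One extra byte costs a whole additional 2048-byte block, which is how a disk full of small files could end up holding less than it did before.</p>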
<h3>Declare Your String Sizes</h3>
<p>
Jumping from Pascal or even BASIC to a language like C could be confusing to some. "What do you mean strings are just a character array?" If you cut your teeth on Atari BASIC, you already had an inkling of what was going on.
</p>
<p>
The classic Microsoft BASIC took up more than the 8K bytes that 8-bit Ataris had reserved for the cartridge slot. The resulting shrinkage of Atari BASIC included the array-like requirements for strings. On classic Microsoft BASIC:
<pre><tt>
10 A$="HELLO, WORLD"
20 PRINT "THE TEST STRING IS: ", A$
</tt></pre>
But you had to declare the string size in Atari BASIC:
<pre><tt>
5 DIM A$(100)
10 A$="HELLO, WORLD"
20 PRINT "THE TEST STRING IS: ", A$
</tt></pre>
One could not have an array of strings in Atari BASIC, and some of the classic BASIC array operators took on new significance in Atari BASIC. See <a href=http://www.atariarchives.org/c2ba/page023.php>here</a> for a treatise on the subject.
</p>
<h3>Don't Depend on Implementation Details</h3>
<p>
I mentioned the transition from the 400 and 800 to the XL series. Several pieces of software broke when they loaded onto an XL. The biggest reason was that these programs, to save cycles, would jump directly into various ROM routines that were supposed to be accessed through a documented table of JMP instructions. To save the three cycles of an additional JMP, programs would often inline the table entries into their programs. The XL series included a rewritten ROM, which scrambled a large portion of where these routines were implemented. BOOM, no more working code.
</p>
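<p>Here is a toy model of that failure, with everything invented for illustration: the addresses, the routine names, and the two "ROMs" are hypothetical and do not correspond to real Atari OS locations. The point is only that the documented table stays put while the routines behind it move.</p>

```python
def put_char():
    return "put_char ran"


def get_char():
    return "get_char ran"


# Two hypothetical ROM revisions: same routines, different addresses.
# "vectors" plays the role of the documented jump table.
OLD_ROM = {
    "memory": {0x1000: put_char, 0x1040: get_char},
    "vectors": {"PUT_CHAR": 0x1000, "GET_CHAR": 0x1040},
}
NEW_ROM = {
    "memory": {0x2200: put_char, 0x2300: get_char},
    "vectors": {"PUT_CHAR": 0x2200, "GET_CHAR": 0x2300},
}


def call_via_vector(rom, name):
    """The supported path: look the routine up in the table, then call it."""
    return rom["memory"][rom["vectors"][name]]()


def call_direct(rom, address):
    """The shortcut: jump to an address copied out of the old table."""
    routine = rom["memory"].get(address)
    if routine is None:
        # Real hardware would happily execute whatever bytes live there.
        raise RuntimeError("nothing useful at 0x%04X" % address)
    return routine()
```

<p>Calling <tt>call_via_vector(NEW_ROM, "PUT_CHAR")</tt> still works, while <tt>call_direct(NEW_ROM, 0x1000)</tt> fails because the shortcut baked in an address that was only ever an implementation detail.</p>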
<p>
Atari, to their credit, released a "Translator" boot disk, which loaded a variant of the old 800 ROM into the XL's extended, bank-switched, RAM, and ran the system using the old 800 ROM. This allowed the broken software to continue to work.
</p>
<h3>You WILL Have Rejected Submissions</h3>
<p>
Owning an 8-bit Atari meant you subscribed to at least one of <a href=http://www.atarimagazines.com/antic/>Antic</a> or <a href=http://www.atarimagazines.com/analog/>ANALOG</a>. I was an ANTIC subscriber until I graduated high school. I even tried to submit, twice, type-in programs with accompanying articles to ANTIC. Both were terrible, and rightly rejected by the editor. I'm honestly afraid to remember what they were.
</p>
<h3>And William Gibson's a Pretty Good Writer</h3>
<p>
Speaking of <a href=http://www.atarimagazines.com/antic/>Antic</a>, check out this <a href=http://www.atarimagazines.com/v4n5/hackers.html>article</a> from September 1985, especially Part 3 of <a href=http://www.atarimagazines.com/v4n5/hackers.html>the article</a>. I <b>immediately</b> scoured the Waukesha County Library System trying to find <i>Neuromancer</i>, and wasn't disappointed... not at all. 16-year-old me really liked this book, and wouldn't have discovered it before college were it not for ANTIC, which I'd not have read without my 8-bit Atari.
</p>danmcdhttp://www.blogger.com/profile/02293330539766533891noreply@blogger.com0tag:blogger.com,1999:blog-4888740544783150397.post-1402716419603990472013-09-07T00:36:00.000-04:002013-09-07T00:36:51.945-04:00I Have No Whistle to Blow, But I Must Scream<p>
I'm sure all twelve of you readers out there know what's been going on with respect to recent <a href=http://www.theguardian.com/commentisfree/2013/sep/06/nsa-surveillance-revelations-encryption-expert-chat>revelations about NSA activity</a>. Among other things is the unnerving discovery that NSA has been attempting to <a href=http://www.theguardian.com/commentisfree/2013/sep/05/government-betrayed-internet-nsa-spying>actively dumb-down security for the Internet</a>.
</p>
<p>
In the second linked article, Bruce Schneier calls upon people to blow the whistle on, "how the NSA and other agencies are subverting routers, switches, the internet backbone, encryption technologies and cloud systems." Here's the deal:
</p>
<menu>
<b>I have never been asked to introduce back-doors or weaken security in Solaris, OpenSolaris, Oracle Solaris 11 (for the four months I worked on it post-barn-door-closing), or Illumos. If there are weaknesses there, it was not because of any deliberate effort on my part.</b>
</menu>
<p>
You can view the kernel IPsec protocol sources (AH & ESP) <a href=http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/inet/ip/>here</a>, by looking at <tt>ipsec*.c, sadb.c, spd.c, spdsock.c, keysock.c</tt> and header files in the directory above it. You can see the IPsec management utilities <a href=http://src.illumos.org/source/xref/illumos-gate/usr/src/cmd/cmd-inet/usr.sbin/ipsecutils/>here</a>. According to at least one well-known security researcher, the Illumos (nee OpenSolaris) IPsec code <a href=http://kebe.com/paterson-on-solaris-ipsec.mp3>isn't bollocks</a>.
</p>
<p>
There is no open-source for IKE, because the libike.so.1 library was mostly OEM code, from a vendor whose technical lead let me co-write <a href=http://tools.ietf.org/html/rfc5879>an RFC</a> with him. You can use the various observability and debugging tools in Illumos to see how things work, however, if you wish.
</p>
<p>
If you want to write your own, better, key management application for Illumos (or even Oracle Solaris), you can use PF_KEY to control the IPsec SADB. I detail the subsequent additions to <a href=http://tools.ietf.org/html/rfc2367>RFC 2367</a> on my <a href=http://kebesays.blogspot.com/2005/06/pfkey-in-solaris-or-dude-wheres-my-spec.html>day-one-of-OpenSolaris blog post</a>. If you want to work on IPsec in totally-open-source <a href=http://www.illumos.org/>Illumos</a>, you have my blessing, and I'll definitely be reviewing (and maybe integrating if you pass code reviews) your code.
</p>
danmcdhttp://www.blogger.com/profile/02293330539766533891noreply@blogger.com1tag:blogger.com,1999:blog-4888740544783150397.post-50486617384624647692013-03-25T22:01:00.000-04:002013-03-25T22:01:56.231-04:00Broad-Spectrum Dogfooding, or Why I Miss Jurassic.<p>
I think most of you dozen readers know what I mean, when I refer to <a href=http://en.wikipedia.org/wiki/Eating_your_own_dog_food>dogfooding</a>. Some people think of Microsoft when they hear the term, but I first heard it from the same person via his being a Sun customer, AND via my old roommate, who worked for him.
</p>
<p>
I saw this Tweet last week:
<blockquote class="twitter-tweet"><p>RT @<a href="https://twitter.com/stu">stu</a>: "Compared to networking, storage is serious business" great article on storage networking <a href="http://t.co/JBMPgFcTkS" title="http://bit.ly/WNEOxT">bit.ly/WNEOxT</a> @<a href="https://twitter.com/ioshints">ioshints</a> <a href="https://twitter.com/search/%23iSCSI">#iSCSI</a></p>— Charles Beeler (@charlesbeeler) <a href="https://twitter.com/charlesbeeler/status/314780219477221377">March 21, 2013</a></blockquote>
<script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>
I then checked out the blog post. It dealt with how an iSCSI LAN can be a failure point, partially due to the weakness of the <a href=http://en.wikipedia.org/wiki/Transmission_Control_Protocol#Checksum_computation>ones-complement TCP/IP checksum</a>.
</p>
<p>
Reading this reminded me of an old bug we found at Sun with either NFS or an ethernet device driver, and the only way we caught it was by using IPsec (AH particularly) and seeing packets fail the authentication check. The corrupt NFS packets had 16 bits' worth of 1s (0xffff) where they should have had 16 bits' worth of 0s (0x0000). Using the standard TCP/IP checksum, there's no difference between those two values, no matter where they fall in the packet. Using IPsec, however, even with HMAC-MD5, showed the packet failure clearly when the packet authentication check failed. This bug wouldn't have been discovered were it not for the Solaris Team's big honking server, jurassic, and how its multiple concurrent uses interacted with each other.
</p>
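<p>A small sketch of why the checksum was blind to that corruption. The checksum function follows the RFC 1071 style of ones-complement sum; the "packet" bytes and the key are made up, and the HMAC here covers only those bytes, which is a simplification of what AH actually authenticates. In ones-complement arithmetic 0xffff is just another way of writing zero, so as long as the rest of the data does not sum to zero, swapping 0x0000 for 0xffff leaves the checksum unchanged.</p>

```python
import hashlib
import hmac


def internet_checksum(data: bytes) -> int:
    """16-bit ones-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold the carries back in (end-around carry)
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def hmac_md5_96(key: bytes, data: bytes) -> bytes:
    """HMAC-MD5 truncated to 96 bits, the size AH carries."""
    return hmac.new(key, data, hashlib.md5).digest()[:12]


# Made-up payload with a run of zero bytes at offsets 10..13.
CLEAN = b"NFS reply " + bytes(4) + b" more data"
# The same payload with 16 bits of 0 turned into 16 bits of 1,
# once on a 16-bit boundary and once straddling two words.
CORRUPT_ALIGNED = CLEAN[:10] + b"\xff\xff" + CLEAN[12:]
CORRUPT_ODD = CLEAN[:11] + b"\xff\xff" + CLEAN[13:]
KEY = b"hypothetical shared key"
```

<p>All three byte strings produce the same <tt>internet_checksum()</tt> value, while <tt>hmac_md5_96()</tt> differs for each corrupted copy, which is the kind of mismatch that finally exposed the bug.</p>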
<p>
Even before there was OpenSolaris, people knew about jurassic. Solaris people's (not any old Sun people... Solaris people) posts on IETF mailing lists often showed <tt>user@jurassic</tt>. Jurassic served as the NFS source of home directories, and until the early 2000s e-mail inboxes as well. Every two weeks the in-development Solaris build would be placed upon jurassic. As a Solaris developer, if your changes broke jurassic, you fixed those changes immediately, or risked getting your changes yanked out. Not breaking jurassic was a great motivator for code quality. Also, if you had a new feature, you wanted it used on jurassic, even if not by everyone.
</p>
<p>
Once the basic IPsec protocols - AH & ESP - went into Solaris 8, I convinced the jurassic maintainers to protect all traffic between jurassic and a couple of workstations. One was mine, naturally. I encrypted all of my traffic to jurassic. Since we only had 100Mbit in our building at that time, the performance hit wasn't too bad, relatively speaking. Another belonged to an NFS developer, who I'd somehow convinced to run AH, because I was already running ESP (and AH used fewer cycles for protection). It was this NFS developer, surprised he wasn't getting data corruption while others were, who helped suss out the bug in question.
</p>
<p><small>
At this point, I'd like to have a moment of silence for all of the made-public Solaris information that Oracle has since put back in its box. I could've had a bug id here, folks, A REAL BUG ID!!!
</small></p>
<p>
So for a few of us, jurassic also served as an IPsec testbed. It also was helpful in determining that nobody else's cleartext performance dropped while a few of us were running with IPsec-protected network traffic (put more succinctly, connection policy latching worked). Other services would run on jurassic as well: DNS, IMAP, and others I'm sure I'm forgetting. Jurassic core dumps eventually would be used to test out the then-new mdb (oh, those early ::findleaks results...), and I'm sure more than a few DTrace scripts helped diagnose some jurassic-discovered bugs.
</p>
<p>
At Nexenta, we make a dedicated storage appliance. Naturally, we use them inside where appropriate. We Nexentians (especially the ones in Lowell) use Illumos from other distributions for even greater effect. My <a href=http://www.youtube.com/watch?feature=player_detailpage&v=APL7FzuzbpE>Illumos Home Data Center talk</a> touches upon these at about <a href=http://www.youtube.com/watch?feature=player_detailpage&v=APL7FzuzbpE#t=643s>10:43 in</a>. We use Illumos to host VMs (Thank you <a href=http://www.joyent.com/>Joyent</a>), we use it for site-to-site VPNs, we will be using it for public services at some point, and everything I mentioned all runs on Illumos. It's not quite the magnifying glass Jurassic was, but we do what we can.
</p>
<p>
I believe Oracle still has jurassic around; I know it did prior to my 2011 departure. I suspect it's helping Oracle Solaris even today. I suspect, however, that a less dense, but more widely instantiated, broad-spectrum dogfooding continues on in Illumos today.
</p>danmcdhttp://www.blogger.com/profile/02293330539766533891noreply@blogger.com1tag:blogger.com,1999:blog-4888740544783150397.post-12427903085003251432013-02-26T16:09:00.000-05:002013-02-26T16:09:05.573-05:00Delegated ZFS, cloning, and SCM<p>Well THAT was a long break from blogging...</p>
<p>
One of the things that's happened in the <a href=http://www.illumos.org/>illumos</a> community is a subtle shift of the main illumos source repository from being primarily Mercurial to being primarily Git. This means I've had to learn Git. At first, I wasn't sure why people were so rabidly pro-Git. I found one of the big reasons:
</p>
<pre>
everywhere(~/ws)[0]% /bin/time git clone git-illumos git-illumos.copy
Cloning into git-illumos.copy...
done.
real 11.8
user 4.7
sys 3.2
everywhere(~/ws)[0]% /bin/time hg clone illumos-clone illumos-clone.copy
updating working directory
44332 files updated, 0 files merged, 0 files removed, 0 files unresolved
real 1:52.6
user 28.9
sys 25.4
everywhere(~/ws)[0]%
</pre>
<p>
Wow! Yeah, I can see why this would appeal to people. I'm still using Mercurial in a fair number of places, both for my illumos work and for Nexenta as well. I should show one other thing that both SCM cloning operations do: take up disk space.
</p>
<pre>
everywhere(~/ws)[0]% zpool list
NAME SIZE ALLOC FREE EXPANDSZ CAP DEDUP HEALTH ALTROOT
rpool 298G 198G 100G - 66% 1.00x ONLINE -
everywhere(~/ws)[0]% /bin/time git clone git-illumos git-illumos.copy
<b> *** SNIP! *** </b>
everywhere(~/ws)[0]% sync
everywhere(~/ws)[0]% zpool list
NAME SIZE ALLOC FREE EXPANDSZ CAP DEDUP HEALTH ALTROOT
rpool 298G 198G 99.6G - 66% 1.00x ONLINE -
everywhere(~/ws)[0]% /bin/time hg clone illumos-clone illumos-clone.copy
<b> *** SNIP! *** </b>
everywhere(~/ws)[0]% sync
everywhere(~/ws)[0]% zpool list
NAME SIZE ALLOC FREE EXPANDSZ CAP DEDUP HEALTH ALTROOT
rpool 298G 199G 98.7G - 66% 1.00x ONLINE -
everywhere(~/ws)[0]%
</pre>
<p>
I believe Git will also take up less disk space, but still, that's approximately half a gig or more for an illumos workspace. If it's populated, say with a preinstalled proto area and compiled objects, that'll be even larger.
</p>
<p>
Consider one of the great strengths of ZFS: its copy-on-write architecture. Take a local, on-disk master repo, say one you're pulling directly from the source, and make it its own filesystem. Child/downstream workspaces from your on-disk master now can be created using low-latency ZFS operations. Only two problems need to be solved: non-privileged usage, and SCM correction to properly designate the parent/child or upstream/downstream relationship.
</p>
<p>
Another useful ZFS feature is administrative delegation. Put simply, an administrator can allow an ordinary user to perform selected ZFS primitives on a given filesystem, and its descendants in the ZFS filesystem tree. For example:
</p>
<pre>
everywhere(~)[0]% zfs allow rpool/export/home/danmcd
everywhere(~)[0]% zfs allow rpool/export/home/danmcd/ws
---- Permissions on rpool/export/home/danmcd/ws ----------------------
Local+Descendent permissions:
user danmcd clone,create,destroy,mount,promote,snapshot
everywhere(~)[0]%
</pre>
<p>
I (as root) delegated several permissions for a subdirectory of $HOME to me (as danmcd). From here, I can create new filesystems in ~/ws, as well as destroy them, clone them, mount, snapshot, and promote them. All of these are useful operations. The syntax for delegation is mostly straightforward: <tt>zfs allow -ld clone,create,destroy,mount,promote,snapshot rpool/export/home/danmcd/ws</tt>. The -ld flags enable local and descendant permission propagation.
</p>
<p>
First thing I did was <tt>zfs create rpool/export/home/danmcd/ws/illumos-clone</tt>, followed by <tt>hg clone ssh://anonhg@hg.illumos.org/illumos-gate illumos-clone</tt>. This populates my local Mercurial illumos repo. I can perform a similar operation with git. Per my above timing examples, I did so with <tt>git-illumos</tt>.
</p>
<p>
I wrote a script to clone, promote, and reparent Git and Mercurial workspaces using ZFS operations. It's called <tt>zclone</tt> and it's <a href=http://kebe.com/~danmcd/zclone.sh>here for download</a>. It's still a work in progress, and I'd like to maybe have it end up in <tt>usr/src/tools</tt> in illumos-gate someday. (I'll try and update this particular post as things evolve.)
</p>
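<p>For a feel of what such a tool has to do, here is a hedged sketch that only <em>plans</em> the steps and runs nothing. It is not the actual <tt>zclone</tt> script: the function name, its arguments, and the choice of snapshot name are assumptions, and it leaves out the promote case entirely. The <tt>zfs snapshot</tt>, <tt>zfs clone</tt>, and <tt>git remote set-url</tt> invocations are standard; for Mercurial the parent lives in the <tt>[paths]</tt> section of <tt>.hg/hgrc</tt>, so the sketch returns the text a tool would need to put there.</p>

```python
def zclone_plan(parent_fs, child_fs, parent_dir, child_dir, scm):
    """Return (commands, hgrc_paths) for a ZFS-backed workspace clone.

    parent_fs / child_fs are ZFS dataset names; parent_dir / child_dir are
    their mountpoints. Nothing is executed; the caller decides what to run.
    """
    # Hypothetical convention: name the snapshot after the child workspace.
    snapshot = parent_fs + "@" + child_fs.rsplit("/", 1)[-1]
    commands = [
        ["zfs", "snapshot", snapshot],
        ["zfs", "clone", snapshot, child_fs],
    ]
    hgrc_paths = None
    if scm == "git":
        # The clone's .git/config is a copy of the parent's, so repoint
        # origin at the parent workspace to make it the upstream.
        commands.append(
            ["git", "-C", child_dir, "remote", "set-url", "origin", parent_dir]
        )
    elif scm == "hg":
        # Mercurial reads its default parent from .hg/hgrc.
        hgrc_paths = "[paths]\ndefault = " + parent_dir + "\n"
    else:
        raise ValueError("unsupported SCM: " + scm)
    return commands, hgrc_paths
```

<p>The two ZFS steps need the delegated <tt>snapshot</tt>, <tt>clone</tt>, <tt>create</tt>, and <tt>mount</tt> permissions shown earlier, which is why the delegation has to be in place before a non-root user can do any of this.</p>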
<p>Check out the times, and the disk space (not) used:</p>
<pre>
everywhere(~/ws)[0]% zpool list
NAME SIZE ALLOC FREE EXPANDSZ CAP DEDUP HEALTH ALTROOT
rpool 298G 198G 100G - 66% 1.00x ONLINE -
everywhere(~/ws)[0]% /bin/time zclone git-illumos git-illumos.zc
Created rpool/export/home/danmcd/ws/git-illumos.zc,
a zfs clone of rpool/export/home/danmcd/ws/git-illumos
real 1.0
user 0.0
sys 0.0
everywhere(~/ws)[0]% /bin/time zclone illumos-clone illumos-clone.zc
Created rpool/export/home/danmcd/ws/illumos-clone.zc,
a zfs clone of rpool/export/home/danmcd/ws/illumos-clone
real 1.0
user 0.0
sys 0.0
everywhere(~/ws)[0]% zpool list
NAME SIZE ALLOC FREE EXPANDSZ CAP DEDUP HEALTH ALTROOT
rpool 298G 198G 100G - 66% 1.00x ONLINE -
everywhere(~/ws)[0]%
</pre>
<p>
These are constant-time operations, folks. And like I said earlier, I suppose it's possible to have the local master repos populated with pre-compiled objects, header files in proto areas (an illumos build trick), and other disk-intensive operations pre-performed.
</p>
<p>
A quick search didn't yield me any results in this area: using ZFS to help make source trees take up less space. I'm surprised nobody's blogged about this or documented it, but I may have missed something. Either way, it doesn't hurt to mention it again.
</p>danmcdhttp://www.blogger.com/profile/02293330539766533891noreply@blogger.com3tag:blogger.com,1999:blog-4888740544783150397.post-90735400641093159312012-01-18T01:09:00.002-05:002012-01-18T01:11:34.167-05:00On SOPA and PIPAI can't say anything you haven't heard my tech friends say already on the subject. I can, however, quote this, because it's both funny and true:<br /><br />"I think we need to drive a stake into this thing's heart, fill its mouth with garlic, cut off its head, expose it to sunlight and then throw the ash into a running body of water. It is vital that people not let up on the pressure merely because they appear to compromise."<br /><br />Thank you Perry, for eloquently stating what should be SOPA's and PIPA's fates.danmcdhttp://www.blogger.com/profile/02293330539766533891noreply@blogger.com0tag:blogger.com,1999:blog-4888740544783150397.post-42880486734390209622011-11-29T13:34:00.000-05:002011-11-29T15:05:52.911-05:00A Tale of Two Soccer Websites (A Security Story)(Pardon the latency on this post. I had it in the Drafts section for a while.)<br /><br />When a website requires a password for registration, said site SHOULD NOT EVER mail you back the password in the clear in an e-mail. Let me repeat that... SHOULD... NOT... EVER.<br /><br />One of my daughters plays soccer, and has for two towns. My whole family enjoys seeing the Boston Breakers play soccer too. Both my daughter's town website (outsourced to Blue Sombrero) and the Boston Breakers' ticketing website (run by PMI ticketing using TicketSocket's technology) made the aforementioned mistake. Both of them quickly addressed the issue with direct and up-front e-mails. I believe Blue Sombrero addressed the problem a bit quicker, but that's because of a combination of smaller organizations and the Breakers' mistake happening on a weekend.<br /><br />The Blue Sombrero handling of my daughter's old town website mistake was quick, and without incident. 
Hats off (no pun intended) to the Blue Sombrero folks, who I hope have implemented the no-mailing-passwords policy throughout their entire customer base.<br /><br />One bad thing <i>someone</i> in the Breakers organization did was remove my original complaining posts on Facebook. I suspect this was merely the case of panic and not active malice. The General Manager of the Breakers, Andy Crossley, sent me a mail on Saturday to see what was going on. Once he understood the problem, he got the relevant technical folks involved, and they solved things.<br /><br />While I'm glad to see quick turnaround on these flaws, the one piece of advice I will reiterate is NEVER SEND OUT CLEARTEXT PASSWORDS. Thank you.danmcdhttp://www.blogger.com/profile/02293330539766533891noreply@blogger.com0tag:blogger.com,1999:blog-4888740544783150397.post-20477911243417847622011-10-07T16:35:00.009-04:002011-10-07T17:00:14.235-04:00Finding Ada, but with better technology examples!I found out thanks to <a href=http://codingrelic.geekhold.com/>Denny Gentry</a> about <a href=http://findingada.com/>Ada Lovelace Day</a> today. Denny has a great blog post <a href=http://codingrelic.geekhold.com/2011/10/finding-ada-2011.html>citing three engineers and their work with ATM</a>.<br /><br />The three engineers are wonderful examples of excellence, ones I'd gladly mention. What bugs me is that he cited... ewww.... ATM. His third paragraph mentioned why I go, "ewww..." over ATM. He didn't have to deal with (I think) some of the politics of ATM zealots, but that doesn't take away from Allyn's, Sally's, or Renee's abilities or contributions.<br /><br />In fact, it's not difficult to cite further contributions from each of them... two of which I can further support with source code!<br /><br />First off, Sally Floyd is well known for much TCP and congestion control goodness. If you followed the link to <a href=http://icir.org/floyd/>Sally's page</a> you can see all (or at least most) of her work for yourself. 
I unfortunately don't know of any quickly-linkable code to cite, but I'll gladly accept suggestions.<br /><br />Allyn Romanow was an engineer at Sun, and worked in my old group (Solaris Internet Engineering) while she was there. Her big contribution to the Solaris TCP/IP stack was the support for large, fast networks (a.k.a. <a href=http://www.ietf.org/rfc/rfc1323.txt>RFC 1323</a>), which you can see scattered throughout the <a href=http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/inet/tcp/>TCP code</a>, particularly <a href=http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/inet/tcp/tcp_sack.c>here</a>.<br /><br />Renee Danson (now Sommerfeld), also an engineer at Sun, escaped the world of ATM to join Internet Engineering later on. I was fortunate to have her land with Team IPsec for a while. As we were bringing up IKE for Solaris 9, I was hoping to have a command-line tool alter the running IKE daemon using the Solaris lightweight IPC mechanism known as <a href=http://www.unix.com/man-page/OpenSolaris/3c/door_create/>doors</a>. Renee made this happen. Because of a large OEM component, the IKE daemon source isn't available for browsing, but the control program, <a href=http://src.illumos.org/source/xref/illumos-gate/usr/src/cmd/cmd-inet/usr.sbin/ipsecutils/ikeadm.c>ikeadm(1M)</a>, is there for the world to see.<br /><br />An unofficial IETF slogan was, "We believe in rough consensus and running code." I figured it's even better to find Ada with some running code to back it up.danmcdhttp://www.blogger.com/profile/02293330539766533891noreply@blogger.com0tag:blogger.com,1999:blog-4888740544783150397.post-50581149604580677412011-08-01T23:18:00.004-04:002011-08-01T23:40:19.432-04:00MTV is 30, and do you remember PopClips?MTV (Music Television... at least it used to be) is 30 today (or yesterday depending on how late I post this). I'm sure lots of people have written about this already. 
I'd recommend checking out <a href=http://www.youtube.com/user/MTVTheFirst24>this YouTube channel</a> if you want a glimpse into the past. It has commercials inserted (which I believe weren't actually on MTV in those early days), but otherwise should stir some 1981 memories.<br /><br />I'm here to write about a precursor to MTV that I remember seeing months before MTV appeared. Nickelodeon used to air a show on Sunday nights called "PopClips". Internet searching on it turns up very little. The <a href="http://en.wikipedia.org/wiki/PopClips">Wikipedia article</a> sums up all of my own recollections, and includes some tidbits that former Monkee Michael Nesmith produced the show.<br /><br />Do any of you half-dozen readers who are approximately my age (40s) remember PopClips? I remember seeing some good videos on there that did eventually make their way to MTV (my favorite specifically from PopClips was "Walking on the Moon" from The Police). The amount of collective net data on PopClips is surprisingly sparse.danmcdhttp://www.blogger.com/profile/02293330539766533891noreply@blogger.com0tag:blogger.com,1999:blog-4888740544783150397.post-61712940997723314622011-06-07T14:13:00.003-04:002011-06-07T14:32:59.625-04:00WRITE_SAME support now in Illumos COMSTARThe WRITE_SAME primitive is now available in Illumos as of this push:<br /><br />13382:d84aa76f7cd2 Dan McDonald <danmcd@nexenta.com><br />937 WRITE_SAME support for COMSTAR<br />Reviewed by: Gordon Ross <gwr@nexenta.com><br />Reviewed by: Richard Elling <richard.elling@richardelling.com><br />Reviewed by: Robert Gordon <rbg@openrbg.com><br />Approved by: Gordon Ross <gwr@nexenta.com> <br /><br />Sumit Gupta wrote the original contribution, and after a bit of my own massaging, it's now in Illumos. Unlike the UNMAP push, this one did not have a lot of rewhacking (in large part due to its lower amount of direct interaction with ZFS).<br /><br />The WRITE_SAME primitive works pretty much like its name. 
The iSCSI initiator passes in a WRITE_SAME primitive along with a single disk block. The iSCSI target then writes the same block over the range of logical block addresses specified in the command.<br /><br />One set of experiments I did prior to integration was figuring out what size buffer to allocate for an I/O. In a perfect world, you don't want to do sbd_write() calls for every 512-byte block. On the other hand, you also don't want to force the kmem allocator to perform unholy tasks of allocation. I settled on a default of 128kbytes, which has a kmem_cache magazine backing it up (according to kmem stats). Users can experiment with this themselves by tweaking stmf_sbd's sbd_write_same_optimal_chunk variable. Every WRITE_SAME request, once it generates the data, consults this variable prior to allocating a block. Source-junkies can look <a href=http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/io/comstar/lu/stmf_sbd/sbd_scsi.c#2242>here</a> for the function in question.<br /><br />Happy block-writing, folks!danmcdhttp://www.blogger.com/profile/02293330539766533891noreply@blogger.com0tag:blogger.com,1999:blog-4888740544783150397.post-49080056205932886102011-04-07T00:44:00.003-04:002011-04-07T01:22:16.092-04:00Showing your kids the Star Wars films - which order?WARNING: Spoilers for the Star Wars movies. Here's some old-school spoiler space...<br /><br /><br /><br /><br /><br /><br /><br /><br /><br /><br />We finally finished (in fits and starts) showing our twin 8-year-olds all six Star Wars films. We showed them in the order Wendy and I saw them on the big screen: 4 (<i>Star Wars</i>), 5 (<i>The Empire Strikes Back</i>), 6 (<i>Return of the Jedi</i>), 1 (<i>The Phantom Menace</i>), 2 (<i>Attack of the Clones</i>), and finally 3 (<i>Revenge of the Sith</i>). 
Now I'll admit we skipped over char-broiled Anakin and Vader's suit-fitting during <i>Sith</i>, but they're <b>8</b>, what would you expect?<br /><br />As we finished up, something occurred to me. I remember reading my favorite online-exclusive film critic (and fellow parent) Drew McWeeney mentioning toward the bottom of <a href=http://www.hitfix.com/blogs/motion-captured/posts/fox-makes-it-official-at-ces-star-wars-on-blu-ray-in-september>this article</a> that he was going to show them to his son in a slightly different order: 4, 5, 1, 2, 3, 6.<br /><br />That's a fascinating way to show them. At the end of <i>The Empire Strikes Back</i> the first-time viewer may have a question about whether or not Darth Vader is Luke's father. Why not, at that point, show the first-time viewer the story of Anakin Skywalker? This works especially well now, where the special-edition <i>Empire</i> uses Ian McDiarmid's Emperor Palpatine, and an astute child will notice how much Darth Sidious resembles him (or even Senator Palpatine).<br /><br />Commenters (oh gotta love Internet feedback... makes me glad I only have a half-dozen readers) mention a few other orders: 1, 2, 3, 4, 5, 6 ("to get the crap out of the way"), or the flip-flop 1, 4, 2, 5, 3, 6 (tracking both in single steps).<br /><br />There's a little part of me that wishes we tried the flashback-in-the-middle approach, but the only thing that matters is that our girls enjoyed the movies, and now they get one or two more of the jokes Wendy and I make.danmcdhttp://www.blogger.com/profile/02293330539766533891noreply@blogger.com0tag:blogger.com,1999:blog-4888740544783150397.post-89619286556875548982011-03-22T00:14:00.004-04:002011-03-22T00:52:09.390-04:00For Illumos newbies: On developing smallI just finished a chat with a person who's doing a device driver, and he was worried that a certain header file wasn't available in his /usr/include. 
This struck me as odd, as I always get my headers from the workspace's proto area...<br /><br />Then I realized I've had 15 years at Sun under my belt and this person's a complete newbie.<br /><br />I haven't looked very closely at the Illumos build instructions, but I'm going to do some things now that will help kernel module writers (e.g. device drivers) get started without resorting to a full build right off the bat. I'll assume that you've installed the appropriate compilers and the "onbld" package so that you have a populated /opt/onbld/bin.<br /><br />STEP 1: The /opt/onbld/bin/ws command:<br /><br />When you go to work in an Illumos source base, you're best off "entering it" via the ws command. I've hacked my .tcshrc to print a different prompt when I'm in with ws. Here, check it out:<br /><br /><small><pre><br />everywhere(~)[1]% ws ws/to_mhi<br /><br />Workspace : /export/home/danmcd/ws/to_mhi<br />Workspace Parent : /export/home/danmcd/ws/illumos-clone<br />Proto area ($ROOT) : /export/home/danmcd/ws/to_mhi/proto/root_i386<br />Parent proto area ($PARENT_ROOT) : /export/home/danmcd/ws/illumos-clone/proto/root_i386<br />Root of source ($SRC) : /export/home/danmcd/ws/to_mhi/usr/src<br />Root of test source ($TSRC) : /export/home/danmcd/ws/to_mhi/usr/ontest<br />Current directory ($PWD) : /export/home/danmcd/ws/to_mhi<br /><br />WS-everywhere-WS(~/ws/to_mhi)[0]% <br /></pre></small><br /><br />You'll notice a few things got set in the environment. What I use to alter my .tcshrc is the CODEMGR_WS variable. You should do the same in your favorite shell's config.<br /><br /><b>UPDATE</b>: You will need to set SPRO_ROOT and BUILD_TOOLS after invoking ws. I do this already in my .tcshrc, but forgot to report it. A newer tool, bldenv, fixes this, but currently at the cost of a configuration file. There's talk of merging ws's simplicity with bldenv's completeness.<br /><br />One of the key concepts in building Illumos is the "proto area". 
This is a version of the root filesystem that lives within your source tree. You'll see it set above. There's one per basic architecture type (i386 or sparc). When a full "nightly" build happens, the proto area gets populated with headers, libraries, commands, kernel modules, etc., and then the packaging tools sweep up their input from the proto area. The proto area contains more than what is on a running system.<br /><br />You need to populate your proto area with basics (directory structures, etc.) to start.<br /><br /><pre><small><br />WS-everywhere-WS(~/ws/to_mhi)[1]% cd $SRC<br />WS-everywhere-WS(usr/src)[0]% pwd<br />/export/home/danmcd/ws/to_mhi/usr/src<br />WS-everywhere-WS(usr/src)[0]% dmake sgs<br /> < Go get a drink of water or coffee, it's gonna be a bit... ><br />WS-everywhere-WS(usr/src)[1]% <br /></small></pre><br /><br />The "sgs" target sets up the proto area completely.<br /><br />If you're proceeding to build, say, kernel modules, you should populate the kernel include files in the proto area.<br /><br /><pre><small><br />WS-everywhere-WS(~/ws/to_mhi)[0]% cd usr/src/uts<br />WS-everywhere-WS(src/uts)[0]% dmake install_h<br /> < TONS of output deleted... ><br />WS-everywhere-WS(src/uts)[0]% <br /></small></pre><br /><br /><b>UPDATE</b> Fellow Illumos hacker Rich Lowe has informed me that "dmake setup" does both sgs and install_h in one fell swoop.<br /><br />And then you can go and compile your kernel module. I'll use "ip" as an example:<br /><br /><pre><small><br />WS-everywhere-WS(src/uts)[1]% cd intel/ip<br />WS-everywhere-WS(intel/ip)[0]% pwd<br />/export/home/danmcd/ws/to_mhi/usr/src/uts/intel/ip<br />WS-everywhere-WS(intel/ip)[0]% dmake<br /> < MORE output deleted... ><br />WS-everywhere-WS(intel/ip)[0]% <br /></small></pre><br /><br />If you want to lint-check your module, don't do the obvious "make lint" but instead do "make modlintlib". 
This will perform basic lint sanity without the overhead of a full crosscheck.<br /><br />Now if you want to do something in userland, you'll need to do more than a simple header install. You MIGHT need to bring up libraries too, because it's possible your workspace's libraries have different versions than the machine you're actually building on.<br /><br /><pre><small><br />WS-everywhere-WS(intel/ip)[0]% cd $SRC/lib<br />WS-everywhere-WS(src/lib)[0]% <br /></small></pre><br /><br />If you utter "dmake install", it's going to be a while. You can, if you know only a certain library was altered, cd into that library and utter "dmake install" in there. For example:<br /><br /><pre><small><br />WS-everywhere-WS(src/lib)[0]% cd libipsecutil<br />WS-everywhere-WS(lib/libipsecutil)[0]% dmake install_h<br /> < output deleted... ><br />WS-everywhere-WS(lib/libipsecutil)[0]% dmake install<br /> < MORE output deleted... ><br />WS-everywhere-WS(lib/libipsecutil)[0]% <br /></small></pre><br /><br />Then you can go to, say, your new command, and start compiling and debugging there. Once you're done, you can exit this shell, and it will return you to your original pre-ws shell.<br /><br />Hopefully this will lower some of the barriers to entry for budding Illumos hackers.danmcdhttp://www.blogger.com/profile/02293330539766533891noreply@blogger.com0tag:blogger.com,1999:blog-4888740544783150397.post-64468760461896094132011-03-10T18:50:00.002-05:002011-03-10T18:55:25.272-05:00Finally unpackedI think I've managed to move all of my old blog entries over from blogs.sun.com. Hopefully I'll be posting some <a href=http://www.illumos.org/>Illumos</a>-related technical content before too long. 
Stay tuned!danmcdhttp://www.blogger.com/profile/02293330539766533891noreply@blogger.com0tag:blogger.com,1999:blog-4888740544783150397.post-71955904407785092852011-03-07T15:33:00.003-05:002011-03-07T15:38:49.175-05:00Hello again, world!At <a href=http://anywaybecause.blogspot.com/>Wendy's</a> and <a href=http://gdamore.blogspot.com>Garrett's</a> advice, I've set up shop here on Blogger/Blogspot.<br /><br />I plan on importing all of my old <a href=http://blogs.sun.com/danmcd/>Sun blog posts</a> here, but I exported in a non-blogger non-XML (ick) format. So I'll be backpatching by copy-and-paste when time allows.<br /><br />Happy blog reading, you half-dozen readers! :)danmcdhttp://www.blogger.com/profile/02293330539766533891noreply@blogger.com0tag:blogger.com,1999:blog-4888740544783150397.post-81190832529411044522011-01-25T08:44:00.000-05:002011-03-10T18:46:36.785-05:00A final suggested readDavid Reed passed along a pointer to this paper by Dan Geer:<br /><br /><a href=http://geer.tinho.net/ieee/ieee.sp.geer.1101b.pdf>A Time for Choosing</a><br /><br />Please read it, and understand the founding spirit of the Internet. And with that, I say goodbye to Oracle.danmcdhttp://www.blogger.com/profile/02293330539766533891noreply@blogger.com0tag:blogger.com,1999:blog-4888740544783150397.post-88414982946204504242011-01-18T19:35:00.001-05:002011-03-10T18:45:10.921-05:00I'm leaving Oracle, and switching gears15 years ago I was finishing up last-minute changes at <a href=http://www.nrl.navy.mil />NRL</a> while getting ready to move coasts. While I'm not moving coasts, I'm at the point where I'm finishing up last-minute changes again.<br /><br />I'm leaving Oracle this week, and will be trying something a bit different after that. I've been doing IPsec or at least TCP/IP related work for the entirety of my time at Sun. 
I expect to be back in TCP/IP-land relatively soon, but I will be learning some new-to-me technologies in the immediate future.<br /><br />I've met and worked with some extraordinary people during my time at Sun. I hope to keep in touch with them after I depart. If any of you half-dozen readers wish to keep up, I'd suggest following my <a href=http://twitter.com/kebesays>Twitter feed</a> until I decide whether or not I find a new home for this blog. I'm also findable on <a href=http://www.facebook.com/>Facebook</a> and <a href=http://www.linkedin.com/>LinkedIn</a> for those so inclined.danmcdhttp://www.blogger.com/profile/02293330539766533891noreply@blogger.com0tag:blogger.com,1999:blog-4888740544783150397.post-33331970340768572822010-11-02T12:45:00.000-04:002011-03-07T15:24:00.210-05:00MAC-then-encrypt - also harmful, also hard to do in Solaris<p>Hello again!<br />
</p><p>Kenny Paterson's once again turning the theoretical into practical. This time he's pointed out that if one configures IPsec to MAC-then-encrypt (do packet authentication first, THEN encrypt the packet), one is open to cryptographic attack. Here's a <a href=http://www.citeulike.org/user/lispler/article/8133053>citation</a> for his ACM CCS paper.<br />
</p><p>The good news is that we cannot configure the IPsec SPD to perform MAC-then-encrypt at all. One could configure transport mode to just MAC, then have the packet transit a tunnel that just encrypts, but then you'll see warnings about the encryption-only tunnel configuration. This has been true for a LONG time (starting with S9, maybe even S8).<br />
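</p><p>For contrast, here is a minimal sketch of the kind of policy entry that sidesteps the problem: a single ESP rule that asks for both encryption and authentication, so the integrity check is computed over the ciphertext (encrypt-then-MAC) within one SA. The addresses are placeholders and the algorithm choices are only illustrative; treat it as an assumption-laden example and check ipsecconf(1M) on your release before using anything like it.<br />
</p><pre><small>
# Hypothetical /etc/inet/ipsecinit.conf entry; the addresses are placeholders.
# One ESP rule requesting both encryption and authentication.
{laddr 192.0.2.1 raddr 192.0.2.2} ipsec {encr_algs aes encr_auth_algs sha1 sa shared}
</small></pre><p>Leaving encr_auth_algs out of an entry like this gives the encryption-only case, which is the configuration the warning is about.<br />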
</p><p>So basically, we don't make it easy for you to shoot yourself in the foot this way. You really have to try, and as I pointed out <a href=http://blogs.sun.com/danmcd/entry/esp_without_authentication_considered_harmful>earlier</a>, the encryption-only part will warn you.<br />
</p>danmcdhttp://www.blogger.com/profile/02293330539766533891noreply@blogger.com0