Wednesday, October 31, 2007

We'll miss you, Itojun

Jun-Ichiro Hagino, better known to the IETF community as Itojun, died on October 29th at the age of 37.

Please visit this link for a message from his colleagues at Japan's WIDE Project.

He led the KAME/WIDE IPv6 work that now appears in all of the BSDs (including Mac OS X), and their user-land tools even appear in Linux. Having worked on one of the pre-KAME IPv6 implementations (NRL), I can tell you that Itojun pushed things along even further than I'd hoped.

I had the privilege to talk with him when I was attending IETF meetings, and even visited him on his home turf in Tokyo. He was very unassuming, enthusiastic, and an all-around Good Guy (TM). I'll miss him, and a LOT of other people will too. PLEASE follow the link I mentioned above to see what you can do to honor Itojun.

Tuesday, October 30, 2007

No HD happiness. Who do I blame? DirecTV? TiVo? Both of 'em? Telcos?

Okay... I'm mad. I see that TiVo is offering another lifetime subscription transfer if you buy a new HD TiVo between now and the 8th of November. Pretty cool, huh?

Well, it's anything BUT cool if that lifetime subscription is on a DirecTV + TiVo box. TiVo's fine print spells out quite clearly that:

DirecTV DVR's with TiVo service not eligible for offer.

I know TiVo and DirecTV get along about as well as Sun and NetApp (okay... maybe not THAT bad), but this leaves me, and I gotta believe some others, in the lurch.

I've determined I'm more TiVo-happy than DirecTV-happy. I've heard multiple independent accounts that other DVR services and software leave a lot to be desired, especially if one had TiVo at one point. I've also continued to hear verification that the cable companies still suck, and my own recent experiences with the Cable Company have been less than stellar. I even started using Verizon for local phone service... something I swore I'd never do after then-Bell-Atlantic botched a simple two-phone-line installation so badly that I had to violate the network interface boundary to fix it.

So I see several opportunities for improvement, but I'm not assuming any of the companies in question will bite. I doubt it'll happen, but if the Red Sox can win a second World Series in four years (GO SOX!), anything's possible.

Hey telcos --> pay attention, here's one geek's wishlist:

  • Comcast - Start charging rates for two-cable-card HD setups that are competitive with DirecTV. You can get bonus points if I can get bundled discounts with business-class Internet service.

  • Verizon - Stop holding the Mass. legislature hostage with a statewide TV franchise bill and keep rolling out more FiOS. I'd kill for 5 Mbit/s upstream (lets me do more work), and I'd even pay for it. Throw in FiOS TV (which is supposed to have the bandwidth of DirecTV with the bundling/saving opportunities of cable) and I might be all over it. I told your phone reps that if they do well, they'd get more of my monthly subscription money!

  • DirecTV - Get over yourself and re-partner with TiVo. I hear that if John Malone gets a large interest, such a re-partnering may very well happen. Throw in the $200 lifetime subscription upgrade fee, and I'll keep facing South-Southwest!

  • TiVo - Unless one of the above miracles happens, please consider a migration path for DirecTV TiVo users. I've a bad feeling one of your divorce terms was to not poach from the DirecTV subscriber base. If so, that's too bad, because I'll bet there are a lot of people who want to keep TiVo more than DirecTV.



Any takers? (I somehow doubt it, but a guy can hope...)

Wednesday, September 26, 2007

Go Blue! Recruiting at Michigan (day 2)

Oh my am I exhausted! I hoped to have most of the text of this completed before my flight got back to Manchester last night, but that didn't happen.

I keep telling people I know that Michigan is a hardware school (in spite of having some great software people - see my post from Monday). We Solaris developers at the Sun table were brutally reminded of this yesterday. Lots of EEs with Verilog and/or VHDL experience. Many of them asked about architecture and/or verification, but a surprising number had never heard of SPARC, the UltraSPARC T1 (a.k.a. Niagara), or that they can see the entire source for the Niagara via OpenSPARC. Almost every business card I handed out had the word "OpenSPARC" on the back so folks could Google it later.

We also tried to make sure everyone had OpenSolaris disks. There are four binary distributions of OpenSolaris on that set of disks: Solaris Express Community Edition (see the previous link) - Sun's current OpenSolaris vehicle, Nexenta - which is probably going to be one of the more comfortable ones for Ubuntu Linux users to land in, Belenix - which is optimized for Live CD use, and Schillix, which was the first non-Sun distribution of OpenSolaris, by Joerg Schilling of "cdrecord" fame. I hope some of the students went home and had success playing with OpenSolaris. You all should visit opensolaris.org and engage the community discussions with your feedback and questions.

I mentioned on Monday how much of a geezer I felt like. I had more of that yesterday, not only saying "Class of '91" a few times, but also when Professor Quentin Stout visited our table. The only graduate-level class I took at U. of M. was his Parallel Algorithms class in the fall of 1990 (during Football/Marching Band season). Back in the day it was all theory - we discussed how to partition problems using the abstract PRAM (Parallel Random Access Machine). It was the ONLY parallel ANYTHING class offered when I had an available slot. This was when shared-memory multiprocessors were experiments or startups (anyone remember the BBN Butterfly, the Sequent Balance, or the Encore Multimax?). I mentioned to Prof. Stout that I took his class back then. He proceeded to tell me how the class is far more practical now, covering stuff like OpenMP and other high-level constructs that, as a systems programmer, I just don't get to use all that much. I still felt pretty smart for seeing the future back in 1990. I hope I have luck that good 17 years later.

Anyway, I had a great time in Ann Arbor, and I hope to get back there sooner rather than later. If anyone who visited our table is reading this, leave a comment, and don't be afraid to be honest. :)

Monday, September 24, 2007

Go Blue! Recruiting at Michigan (day 1)

I mentioned I was going to be at the University of Michigan's Engineering career fair, and here I am!

I got in yesterday (Sunday) afternoon, and did some things to re-orient myself. I visited my fraternity house first, and quickly, because rush began that night. In some ways things hadn't changed a bit - the house is still there and the rooms have the same names (my old room with a skylight window is still called Lighthouse). In other ways, they had - the TV is bigger and flatter, half the brothers had laptops, and the basement was being seriously renovated. The guys were pretty mellow, probably because of all of the post-beating-of-Penn-State celebrations. I then wandered around campus, eating dinner at Krazy Jim's Blimpyburger, where they give you burgers made of small, ground-that-day patties. Yum!

When I flew in, the woman next to me on the plane explained the phenomenon she experienced when taking one of her kids to her alma mater. It all felt intimately familiar to her, even modulo some new buildings, but then she suddenly realized she was an old fart wandering campus. My kids aren't old enough to be shopping colleges yet, but I definitely felt the combination of familiarity and age. I saw buildings with new names, old names on new buildings, and just plain new buildings (esp. at North Campus). Twenty years ago I was a freshman; now I'm literally old enough to be the father of a student in the incoming class of 2011.

This morning, I tagged along with Kais Belgaied as he visited some Computer Science faculty and grad students here. Our first visit was with Professor Z. Morley Mao, who's a new professor here. She has a lot of great ideas on how to exploit the Crossbow project for aiding intrusion detection (and mitigation), among other interesting ideas. We then talked to two other professors, Atul Prakash and Thomas Wenisch, and a few students as well. I remember Prof. Prakash from my time at Michigan (1987-1991), but the other two are new Assistant Professors. I'm confident from what I saw that U. of M.'s CSE division of EECS is going to be strong for years to come.

[Edit from Wednesday] Shoot! I forgot I also visited my old theory professor, Kevin Compton. He's a very good teacher, and helps even the most clueless undergrads (ahem). He told me he's teaching a very popular undergraduate cryptography class, which is just too cool, IMHO.

This evening several of us (Kais, Eric Kustarz, Bill and Sherry Moore, and I) gave a breezy tech talk about various goodies in OpenSolaris that we work on. We also had very yummy Pizza House pizza. Pizza House was "established 1986", which means it wasn't all that old when I was there, but it was good enough to have our host recommend it.

I'm now back in my hotel, squeezing packets over a flaky, but free, wifi. Tomorrow we will be spending the whole day at the table, taking resumes and answering questions. If one of you four readers of this blog is a U. of M. student, you don't have to wear a suit when visiting us. :)

Friday, September 21, 2007

More ZFS Love - Rapid Recovery

I recently scragged my laptop's primary root partition badly enough that I needed to install from scratch. I had a bootable secondary root, but since it was running an experimental BFUed build, that partition could not be upgraded.

Let's quickly look at how I configure my 100-decimal-GB laptop disk (&*%$% disk vendors):

  • c0d0s0 --> Primary root, approx 8GB (and I mean GB the way software geeks mean it, 8 * 1024^3).

  • c0d0s1 --> Secondary root, same size.

  • c0d0s3 --> swap, 3GB, same as main memory size (useful for system dumps).

  • c0d0s7 --> ZFS pool "tank", with 5 ZFS filesystems (tank, CSW, spro, local, and danmcd).
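
Recreating a layout like that one is pleasantly short. Here's a sketch (dataset names taken from the list above; this isn't my literal shell history):

```
zpool create tank c0d0s7
zfs create tank/CSW
zfs create tank/spro
zfs create tank/local
zfs create tank/danmcd
```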


Before I shut it down for upgrade, I simply uttered this:

zpool export tank

That's it!

Then I plugged my laptop into a local netinstall network, PXE-booted to a Nevada build 73 install (which includes detangled NAT-Traversal), and started it up. I used the old Solaris installer because I know how to tell it to preserve disk slices. I told it to preserve the secondary root and the zpool.

One install later, I had root again, and to recover my miscellaneous backups, CSW software, compilers, local binaries, and home directory, I just did:

zpool import tank

And again, that's it! All of my filesystems got mounted properly, no tables to edit, NOTHING.

I'll be at the University of Michigan Engineering Career Fair this coming Tuesday, and will be wandering campus on Monday. If you're one of the four people who read this blog and are there, drop by the Sun table - and see the very laptop I'm talking about. :)

Wednesday, September 12, 2007

IPsec Tunnel Reform, IP Instances, and other new-in-S10 goodies

Solaris 10 Update 4 (or as marketing calls it, Solaris 10 08/07) contains some backported goodies we've had in Nevada/OpenSolaris for a while.

IPsec Tunnel Reform was one of the first big pieces of code to be dropped into the S10u4 codebase. It shores up our interoperability story, so you can now start constructing VPNs that tell IKE to negotiate Tunnel Mode (as opposed to IP-in-IP transport mode). Tunnels themselves are still network interfaces, but their IPsec configuration is now wholly in the purview of ipsecconf(1M). Modulo IKE (which we still OEM part of), we developed Tunnel Reform in the open with OpenSolaris.
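
For the curious, a tunnel-mode policy entry in /etc/inet/ipsecinit.conf looks roughly like this. I'm reconstructing the shape from memory, and the addresses and algorithms below are made-up examples, so check ipsecconf(1M) before copying anything:

```
# Protect traffic through tunnel ip.tun0 with ESP, letting IKE
# negotiate tunnel mode (not IP-in-IP transport mode).
{tunnel ip.tun0 negotiate tunnel laddr 10.1.1.0/24 raddr 10.2.2.0/24}
    ipsec {encr_algs aes encr_auth_algs sha1 sa shared}
```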

Also new for S10u4 is IP Instances. Before u4, you could create non-global zones, but their network management (e.g. ifconfig(1M)) had to be done from the global zone. With u4, one can create a zone with its own unique IP Instance, which gives the zone its own complete TCP/IP stack. The global zone need only assign a GLDv3-compatible interface (e.g. bge, nge, e1000g) to the zone. You could have a single box be your router/firewall/NAT, your web server, and who knows what else, all while keeping those functions out of the fully-privileged global zone. It makes me think about upgrading to business-class Internet service at home, building my own box like Bart did, and getting a few extra Ethernet ports.
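
A minimal sketch of carving out such a zone (the zone and NIC names are made up for illustration; see zonecfg(1M) for the full procedure):

```
global# zonecfg -z webzone
zonecfg:webzone> set ip-type=exclusive
zonecfg:webzone> add net
zonecfg:webzone:net> set physical=bge1
zonecfg:webzone:net> end
zonecfg:webzone> commit
```

Once booted, the zone runs ifconfig(1M) on bge1 itself, and the global zone stays out of it.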

Oh, and if you want to do it all with fewer Ethernet ports, check out OpenSolaris's Crossbow and its VNIC abstraction!

Have fun moving your network bits in new and interesting ways!

Tuesday, September 4, 2007

Detangling IPsec NAT-Traversal, and a more stable API

As of OpenSolaris build 73, the way we do IPsec NAT-Traversal changes for the cleaner.

Before this build, IPsec NAT-Traversal was performed by pushing a STREAMS module on top of an open UDP socket. This module (nattymod) would either strip UDP headers out of ESP-in-UDP packets, or strip the "0-SPI" marker (four bytes of zeroes) before passing the datagram up to the application.

This method worked, but it had some flaws, including the implicit setting of certain socket options (UDP_INCLHDR) that would then potentially be blocked from applications that actually required them. Also, nattymod did not insert the 0-SPI automatically; the application was stuck doing that on its own. And while FireEngine merged TCP into IP for S10, we needed to wait for one of the earlier builds of OpenSolaris to get the UDP equivalent.

With the new NAT-Traversal scheme, a key management application (like our closed-source in.iked(1M)) that wishes to aid in NAT-Traversal simply sets a new socket option: UDP_NAT_T_ENDPOINT. If this option is set, the following things happen:

  • On inbound packets, the first four bytes after the UDP header are inspected.


    • If there are fewer than four bytes, the packet is dropped and assumed to be a NAT-T keepalive.

    • If the four bytes are all zeroes (i.e. the 0-SPI), they are stripped and regular UDP processing occurs.

    • Otherwise, the UDP header is stripped and the packet is shuffled off to IPsec's ESP for processing.


  • On outbound packets:


    • The 0-SPI is inserted between the UDP header and the user-generated data.

    • ESP itself will send ESP-in-UDP, depending on the Security Association's properties.



This will help anyone who wants to port their open-source IKE or other key management application to Solaris deal with the possibility of NAT boxes.
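
The inbound rules are easy to see in a sketch. This is illustrative Python for readability (the real code is C in the kernel's UDP path), and all of the names below are mine, not Solaris's:

```python
ZERO_SPI = b"\x00\x00\x00\x00"   # four-byte marker; real ESP SPIs are nonzero

# Classification results (invented names for this sketch).
KEEPALIVE, TO_APP, TO_ESP = "keepalive", "udp-to-app", "esp"

def natt_inbound(udp_payload: bytes) -> str:
    """Classify a UDP payload arriving on a UDP_NAT_T_ENDPOINT socket."""
    if len(udp_payload) < 4:
        # Too short to hold an SPI: assume a NAT-T keepalive and drop it.
        return KEEPALIVE
    if udp_payload[:4] == ZERO_SPI:
        # 0-SPI marker: strip it, then do regular UDP delivery (IKE traffic).
        return TO_APP
    # Anything else is ESP-in-UDP: strip the UDP header, hand off to ESP.
    return TO_ESP

print(natt_inbound(b"\xff"))               # keepalive
print(natt_inbound(ZERO_SPI + b"IKE..."))  # udp-to-app
print(natt_inbound(b"\x12\x34\x56\x78"))   # esp
```

On the outbound side the kernel does the mirror image: it slips the 0-SPI in between the UDP header and the application's data, so the key management daemon never touches it.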



And on a related note, this will be mentioned during Nicolas Droux's OpenSolaris Networking for Developers talk next week at Sun Tech Days in Boston. I'll be there too, talking about S10 and OpenSolaris security features, as well as being in the audience for Nicolas's talk.

Monday, July 16, 2007

Not using "ncp" on Niagara considered harmful

One of our IPsec remote-access servers here is on a Niagara-powered T2000 server. It's really overkill for the job, but we get to see how IKE and the Niagara crypto accelerator (known as "ncp" by its driver name) interact.

The nice thing about running your own stuff is you find things out before others do. Consider bug 6339802. We saw AWFUL IKE performance on Niagara boxes before we fixed this. Admittedly, IKE is single-threaded (for reasons beyond the scope of this blog entry), but it was taking seconds to complete an IKE Phase I with 2048-bit RSA and 1536-bit Diffie-Hellman.

More recently, we've been enabling bigger Diffie-Hellman MODP groups in our IKE. The Niagara driver has a limit of 2048-bit operations, so we limited the Phase I DH to 2048-bits.

Here's a DTrace script we like to use to measure responder-side Phase I times (in.iked and libike are closed-source, but trust me on this one):

#!/usr/sbin/dtrace -s

/*
 * Responder-side Phase I setup.
 */
pid$1::ssh_policy_new_connection:entry
{
        self->negstart[arg0] = timestamp;
        printf("Initial packet received, pm_info = %p", arg0);
}

pid$1::ssh_policy_negotiation_done_isakmp:entry
{
        /* 16384 is the "CONNECTED" value from isakmp_doi.h. */
        printf("return %d - %s ", arg1,
            (arg1 == 16384) ? "Success" : "Error case.");

        printf("pm_info %p finished, took %d ns", arg0,
            timestamp - self->negstart[arg0]);

        /* Free the thread-local slot now that this negotiation is done. */
        self->negstart[arg0] = 0;
}



With the fix for 6339802 in place, we can get pretty good Phase I times...


dtrace: script '/space/responder-phase1.d' matched 4 probes
CPU ID FUNCTION:NAME
4 48040 ssh_policy_new_connection:entry Initial packet received, pm_info = bed6c8
4 48042 ssh_policy_negotiation_done_isakmp:entry return 16384 - Success pm_info bed6c8 finished, took 165512764 ns


That's 165 msec. Some of that time is packet round-trips, but let's ignore that for this exercise.

Now let's do something drastic: cryptoadm disable provider=ncp/0 all. Suddenly those seconds come back...


4 48040 ssh_policy_new_connection:entry Initial packet received, pm_info = bed6c8
4 48042 ssh_policy_negotiation_done_isakmp:entry return 16384 - Success pm_info bed6c8 finished, took 7732419300 ns


WOW! Like Darren said, that's blog-worthy, and that's why I'm here.
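
Doing the division on those two timestamps shows just how big the gap is (Python as a calculator; the numbers are straight from the DTrace output above):

```python
# Nanosecond timings straight from the two DTrace runs above.
with_ncp = 165_512_764       # ncp enabled: ~165 msec
without_ncp = 7_732_419_300  # ncp disabled: ~7.7 seconds

slowdown = without_ncp / with_ncp
print(f"{with_ncp / 1e6:.1f} ms vs {without_ncp / 1e9:.1f} s: "
      f"roughly {slowdown:.0f}x slower without ncp")
# -> 165.5 ms vs 7.7 s: roughly 47x slower without ncp
```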

So why is Niagara so slow without its crypto accelerator?


That's a good question. Keep in mind that four big-number operations occur in an IKE Phase I exchange like the one I measured above: one RSA signature, one RSA verification, one Diffie-Hellman generate, and one Diffie-Hellman agree. I'm not 100% sure, but I believe the default software implementation of big-number operations on SPARC uses floating-point tricks to help out. Using floating-point on a Niagara kicks in software emulation, which would definitely increase the time taken for each bignum operation.

So the moral of the story is to make sure you're exploiting all of the hardware that's available to you!

Monday, May 21, 2007

"Adaptation Issues", and a Neuromancer movie?

Ahh what to do while test runs finish? Sometimes I get additional real work done, but when it gets close to a holiday, or just the end of a day, I visit Movie websites.

Coming Soon recently quoted an article from Variety about one of my favorite novels, William Gibson's Neuromancer, getting a producer for a (long-overdue, IMHO) movie.

Now granted, some of the book hasn't aged very well (Gibson's introduction to a recent re-issue asks about the lack of cell phones), but some of it would still translate VERY nicely to the big screen. Of course, the last major-release attempt to bring one of his works to the big screen was a dismal failure, story-wise, even though (again IMHO) the look-and-feel was close.

Another movie site I visit introduced me to a phrase - "adaptation issues". Basically, if one has enjoyed a story in one form, one may have problems when that story is retold in another form. Ask die-hard (insert book or comic here) fans about how (book or comic)'s movie version just messed things up SO BADLY. Often some of their criticisms turn out to be well-founded; other times it's just acute fan{boy,girl} nitpicking. One's adaptation issues seem to be tied to how beloved the original source material is. My only real experience with adaptation issues was my reaction to Johnny Mnemonic, where I felt a wonderful opportunity had been squandered. I really hope Neuromancer's movie adaptation doesn't leave me feeling the same way.

To that end, I'm going to resurrect a conversation I had with a former movie-site writer about who should be cast in a Neuromancer movie. Hey Widge -- care to revisit that cast again?

Tuesday, February 13, 2007

Unnumbered interfaces confuse Quagga

The whole reason I was reading e-mail on a Sunday was not to look for telnetd exploits.

I was logged in because Team IPsec runs its punchin IPsec remote-access server (sometimes called a VPN server, but I hate that term because it's pushed by too many middlebox vendors), which was having routing problems.

As stated before, Solaris implements tunnels as point-to-point interfaces. For a remote-access server like we have in punchin, this means every external IP address gets a tunnel interface. (Until we had Tunnel Reform, this meant only one client per external IP address, which messed up NATs for multiple clients.) A tunnel interface has two addresses - a local one and a remote one. The local one can be shared with other tunnels or even with a different local interface (like the local ethernet). Such interfaces are called unnumbered interfaces.
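
To make this concrete, here's roughly what two such client tunnels look like. All addresses are made up for illustration (see ifconfig(1M)); the point is that both tunnels borrow 192.168.1.1, the server's ethernet address, as their inner local address:

```
# tsrc/tdst are the outer (real packet) endpoints; the inner local
# address 192.168.1.1 is shared with the ethernet interface, which is
# what makes ip.tun0 and ip.tun1 "unnumbered" interfaces.
ifconfig ip.tun0 plumb 192.168.1.1 10.0.0.2 \
    tsrc 203.0.113.5 tdst 198.51.100.7 up
ifconfig ip.tun1 plumb 192.168.1.1 10.0.0.3 \
    tsrc 203.0.113.5 tdst 198.51.100.99 up
```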

A remote access server does forward packets, and is therefore by definition a router. One of our servers just swapped out Zebra (from an older OpenSolaris/Nevada build) for Quagga. We use Quagga's OSPF to learn the topology of the Sun internal network (the SWAN).

As clients "punch out", their tunnel gets destroyed. Now each of these tunnels shares the same local IP address with our ethernet to the SWAN. Unfortunately, these "interface down" events confuse Quagga, and suddenly all of my punchin clients can't move bits to the internal network anymore.

There is a workaround, and that's to assign a different local IP address than the one that is directly connected to the SWAN for use with all of the client tunnels. It's not that painful, as I only lose one out of 256 possible client addresses (our engineering servers only have a /24 from which to allocate client addresses). Still, as an esteemed colleague said, "I hope that's not the *whole* solution."

It isn't, and I would like to ask the Quagga community (as I've already asked our local routing folks, Paul Jakma and Alan Maguire) to make sure that Quagga and its routing protocols play nicely with unnumbered interfaces. It'll allow me to plumb tunnels until I'm all out of address space! :)


How OpenSolaris did its job during this telnet mess

I don't have a tag for general Security because dammit, I'm still a networking person who works on security!

Anyway, you've seen elsewhere about how Alan H. turned around the S10 fix as quickly as he could. I'm going to tell you how Alan already found this:


D 1.67 07/02/11 19:46:41 danmcd 90 89 00009/00010/04896
6523815 LARGE vulnerability in telnetd


when he went to file a bug that'd already been putback into Nevada/OpenSolaris.

The best place to see what happened is to visit the OpenSolaris discussions, especially this thread.

I was reading e-mail on a Sunday because of an operations problem I was having with one of our punchin IPsec remote access servers. (I'll discuss the problem, a routing one, in a followup entry later today.) I found the initial note and read the PDF file to which "skunsul" so graciously provided a link. MAN was I embarrassed. After trying it on some lab machines and my laptop, I brought up the in.telnetd source (at the line number provided by Kingcope). My first approach was to verify the content of the $USER environment variable fed to in.telnetd. I compiled and ran the fix, which seemed to work. Great! Time to find some code reviewers.

My only regret about this was not putting the review on security-discuss@opensolaris.org or networking-discuss@opensolaris.org. I'll do better next time, especially for something that was announced on an opensolaris list initially. Anyway, two reviewers (OpenSolaris board member and well-known Sun Good Guy Casper Dik, and crypto framework expert Krishna Yenduri) pointed out that login(1) is already getopt-compliant, and that I should just pass "--" between the rest of the arguments and the contents of $USER, no matter how *&^$-ed up it is. Because it was a Sunday, I didn't get rapid turnaround on e-mail replies, which is why the putback didn't happen until six hours after I'd read the note from skunsul. Krishna also recommended (in the spirit of open development) that I place the diffs on the very same thread, and I did just that.
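
The essence of that fix can be shown with any getopt implementation. Here's an illustrative Python sketch (login(1) is C, of course; this just demonstrates the POSIX convention the reviewers were relying on): a literal "--" argument ends option parsing, so a hostile $USER value like "-fbin" can no longer be read as login's -f (skip-authentication) flag.

```python
import getopt

hostile_user = "-fbin"   # attacker-controlled $USER from the telnet client

# Without the "--" separator, -f parses as an option: the bug.
opts, operands = getopt.getopt([hostile_user], "f:p")
assert ("-f", "bin") in opts

# With "--" in front, the same string is just a plain operand: the fix.
opts, operands = getopt.getopt(["--", hostile_user], "f:p")
assert opts == []
assert operands == ["-fbin"]
```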

Anyone I know here who happened to have seen the initial note would've jumped on this in the same way - please don't think I did something others wouldn't do. My point is - this is the first security exploit reported to us via OpenSolaris, and I think the "Open" part of OpenSolaris helped out the code, as well as Sun's customers.
