Monday, July 16, 2007

Not using "ncp" on Niagara considered harmful

One of our IPsec remote-access servers here is on a Niagara-powered T2000 server. It's really overkill for the job, but we get to see how IKE and the Niagara crypto accelerator (known as "ncp" by its driver name) interact.

The nice thing about running your own stuff is you find things out before others do. Consider bug 6339802. We saw AWFUL IKE performance on Niagara boxes before we fixed this. Admittedly, IKE is single-threaded (for reasons beyond the scope of this blog entry), but it was taking seconds to complete an IKE Phase I with 2048-bit RSA and 1536-bit Diffie-Hellman.

More recently, we've been enabling bigger Diffie-Hellman MODP groups in our IKE. The Niagara driver has a limit of 2048-bit operations, so we limited the Phase I DH to 2048 bits.

Here's a DTrace script we like to use to measure responder-side Phase I times (in.iked and libike are closed-source, but trust me on this one):

#!/usr/sbin/dtrace -s

/*
 * Responder-side Phase I setup.
 * $1 is the process ID of in.iked.
 */
pid$1::ssh_policy_new_connection:entry
{
    /* Thread-local storage is fine here because in.iked is single-threaded. */
    self->negstart[arg0] = timestamp;
    printf("Initial packet received, pm_info = %p", arg0);
}

pid$1::ssh_policy_negotiation_done_isakmp:entry
{
    /* Use 16384 value for "CONNECTED" from isakmp_doi.h */
    printf("return %d - %s ", arg1,
        (arg1 == 16384) ? "Success" : "Error case.");

    printf("pm_info %p finished, took %d ns", arg0,
        timestamp - self->negstart[arg0]);

    /* Release the slot so finished negotiations don't accumulate. */
    self->negstart[arg0] = 0;
}



With the fix for 6339802 in place, we can get pretty good Phase I times:


dtrace: script '/space/responder-phase1.d' matched 4 probes
CPU ID FUNCTION:NAME
4 48040 ssh_policy_new_connection:entry Initial packet received, pm_info = bed6c8
4 48042 ssh_policy_negotiation_done_isakmp:entry return 16384 - Success pm_info bed6c8 finished, took 165512764 ns


That's about 165 msec. Some of that time is packet round-trips, but let's ignore that for this exercise.

Now let's do something drastic: cryptoadm disable provider=ncp/0 all. Suddenly those seconds come back...


4 48040 ssh_policy_new_connection:entry Initial packet received, pm_info = bed6c8
4 48042 ssh_policy_negotiation_done_isakmp:entry return 16384 - Success pm_info bed6c8 finished, took 7732419300 ns
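
To put the two runs side by side, here is a quick bit of arithmetic in Python. The two constants are copied straight from the DTrace output above; the helper names are mine and purely illustrative.

```python
# Phase I times reported by the two DTrace runs, in nanoseconds.
WITH_NCP_NS = 165_512_764       # ncp enabled
WITHOUT_NCP_NS = 7_732_419_300  # after "cryptoadm disable provider=ncp/0 all"


def ns_to_ms(ns):
    """Convert nanoseconds to milliseconds."""
    return ns / 1_000_000


def slowdown(slow_ns, fast_ns):
    """How many times longer the slow run took than the fast one."""
    return slow_ns / fast_ns


print(f"with ncp:    {ns_to_ms(WITH_NCP_NS):.1f} ms")
print(f"without ncp: {ns_to_ms(WITHOUT_NCP_NS):.1f} ms")
print(f"slowdown:    {slowdown(WITHOUT_NCP_NS, WITH_NCP_NS):.1f}x")
```

Both numbers still include packet round-trips, so the ratio for the crypto work alone is probably somewhat higher than what this prints.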


WOW! Like Darren said, that's blog-worthy, and that's why I'm here.

So why is Niagara so slow without its crypto accelerator?


That's a good question. Keep in mind that four big-number operations occur in an IKE Phase I exchange like I measured above: one RSA Signature, one RSA Verification, one Diffie-Hellman generate, and one Diffie-Hellman agree. I'm not 100% sure, but I believe the default software implementation of big-number operations on SPARC uses floating-point tricks to help out. Using floating-point on a Niagara kicks in a software emulation, which would definitely increase the time taken for each bignum operation.
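
For anyone wondering why those four operations dominate, each one boils down to a modular exponentiation on very large integers. The Python sketch below is only an illustration of that idea, not how in.iked or libike does it. The function names are made up, the 2048-bit modulus is a random odd number standing in for a real MODP prime, and the RSA key is a textbook-sized toy.

```python
import random


def dh_generate(g, x, p):
    """Diffie-Hellman generate: public value g^x mod p."""
    return pow(g, x, p)


def dh_agree(peer_public, x, p):
    """Diffie-Hellman agree: shared secret (peer's public value)^x mod p."""
    return pow(peer_public, x, p)


def rsa_sign(m, d, n):
    """Raw RSA signature m^d mod n (no padding; illustration only)."""
    return pow(m, d, n)


def rsa_verify(m, sig, e, n):
    """Raw RSA verification: sig^e mod n must give back m."""
    return pow(sig, e, n) == m


rng = random.Random(2007)

# ASSUMPTION: a random odd 2048-bit number, NOT a real MODP group prime.
# It is only here to give the bignum arithmetic a realistic size.
P = rng.getrandbits(2048) | (1 << 2047) | 1
G = 2

a = rng.getrandbits(256)  # our private exponent
b = rng.getrandbits(256)  # the peer's private exponent
A = dh_generate(G, a, P)
B = dh_generate(G, b, P)
assert dh_agree(B, a, P) == dh_agree(A, b, P)  # both sides get the same secret

# ASSUMPTION: toy RSA key (n = 61 * 53), far too small for real use.
N, E, D = 3233, 17, 2753
sig = rsa_sign(65, D, N)
assert rsa_verify(65, sig, E, N)
```

Each pow() call here is the kind of operation a big-number accelerator takes off the CPU, which is why switching ncp off hurts all four steps at once.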

So the moral of the story is to make sure you're exploiting all of the hardware that's available to you!
