Cisco VPN Client, Ralink Wi-Fi Drivers and Random Kernel Panics

A client came in recently with a series of odd, apparently unpredictable kernel panics. A few minutes of log trolling led to four panic logs. Knowing that she’d installed Ralink USB Wi-Fi device drivers on the machine under Snow Leopard, I initially suspected them as the root cause. (I’ve never liked Ralink’s ugly, kludgy software, and tend to suspect it on principle.)

Two logs showed a crash on loading Airport extensions, and two on USB drivers, but the common link was the extension loaded just prior…in each case, a kext for Cisco’s VPN client. This surprised me not just because I’d expected to find the Ralink stuff, but because we switched from the Cisco native VPN client to AnyConnect over a year ago. Apparently, a significant portion of the old VPN client had remained on her system, and been carried forward through at least one upgrade cycle.
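For anyone who wants to repeat that comparison without eyeballing each log, here is a minimal Python sketch. It assumes each panic report contains a line of the form "last loaded kext at <timestamp>: <bundle id> <version> (...)", and the function name is my own invention. If your reports are laid out differently, adjust the pattern.

```python
import re
from collections import Counter

# Assumed report line, e.g.:
#   last loaded kext at 123456789: com.example.driver  1.0.0 (addr 0x..., size ...)
# The bundle ID is taken to be the first token after the colon.
LAST_LOADED = re.compile(
    r"^\s*last loaded kext at \d+:\s+(\S+)",
    re.IGNORECASE | re.MULTILINE,
)

def last_loaded_kexts(panic_texts):
    """Count the bundle ID named on each report's 'last loaded kext' line."""
    counts = Counter()
    for text in panic_texts:
        match = LAST_LOADED.search(text)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feed it the text of each panic report. A bundle ID that tops the count across otherwise unrelated panics is worth a closer look.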

A check of the /private/var/opt/ subdirectory showed most of the client still around. Running the vpn_uninstall script removed most of the bits (which I double-checked by visiting each location specified in the script). I then manually deleted the cisco_vpn subdirectory in opt. I also removed the Ralink utilities, including a preferences file in /Library/Preferences and a directory in /Library/Frameworks. Ralink provides no uninstaller.
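Visiting each location by hand gets tedious, so the double-check could be scripted. This is only a sketch: the helper name is hypothetical, and the list of paths is whatever you copy out of the uninstall script yourself, not something I'm supplying here.

```python
import os

def remaining_paths(candidates):
    """Return the candidate paths that still exist on disk.

    lexists() is used so that dangling symlinks left behind by an
    installer are reported too.
    """
    return [path for path in candidates if os.path.lexists(path)]
```

An empty list back means everything you listed is gone. Anything returned still needs manual cleanup.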

So far, no more kernel panics. Possibly my distaste for both Cisco’s VPN client and Ralink’s software is justified.

UPDATE: This situation is a good argument for keeping a copy of John Welch’s Kext Lister utility handy.

802.1x PEAP authentication errors under Mac OS X Lion (10.7.2)

UPDATE: I have it from a source I trust that the “Unknown” not-a-certificate mentioned below is ignorable, and will go away with 10.7.3.

Just bumped up against an interesting problem.

A client’s personal laptop was having problems authenticating to our Wi-Fi network. The machine is a black MacBook running Mac OS X 10.7.2. Our Wi-Fi uses Aruba access points to authenticate against our Active Directory using 802.1x, with additional security provided by a Bradford Networks NAC. Our 802.1x profiles are built automatically by our CloudPath XpressConnect tool (a Java Web-based utility). Normally, all this works quite seamlessly.
This particular client was seeing an odd response, though. After running (successfully) the CloudPath XPC setup routine, XPC attempted to switch the Wi-Fi to our standard secured network, and failed. The failure produced the following error dialog:
The identity of the authentication server could not be established. Contact your network administrator to verify your configuration settings.
This dialog appeared thrice before the MacBook finally quit attempting to join the network. Checking the logs, I found these errors:
eapolclient	peap_verify_server: server certificate not trusted, status 6 0
eapolclient	en1 PEAP: authentication failed with status 6
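If you need to sift a longer log for lines like these, a small Python sketch along the following lines would do. It assumes only what the two lines above show: the word eapolclient, a message, and "status" followed by a number. The function name is hypothetical.

```python
import re

# Matches an eapolclient log line and captures the message and the
# first number after the word "status". Any prefix before "eapolclient"
# (timestamp, host) and any suffix such as "[pid]:" is tolerated.
EAPOL_STATUS = re.compile(
    r"eapolclient\S*\s+(?P<message>.*\bstatus (?P<status>\d+).*)$"
)

def eapol_failures(lines):
    """Return (status, message) pairs for eapolclient lines that report a status."""
    results = []
    for line in lines:
        match = EAPOL_STATUS.search(line.strip())
        if match:
            results.append((int(match.group("status")), match.group("message")))
    return results
```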
I checked in with our Wi-Fi admin, and confirmed that no certificates should be involved in this transaction, and that he had never seen the cited PEAP errors. A cursory Google search didn’t help, either. My next shot was the keychain, so I made a backup of ~/Library/Keychains/login.keychain, and went looking.
The login keychain showed several dozen keys that appeared to be spurious. I deleted these, and got no change in behavior (predictably). Next, I ran Keychain First Aid, and got the following errors:
User differs on ~/Library/Preferences/com.apple.security.plist, should be 501, owner is 0.
Repair and rescan.
Keychain search list not properly configured.
Repair and rescan. Finally, no errors, but the error dialog on Wi-Fi connection attempts still appeared. At last, in what might be considered a mild fit of pique, I deleted all references and files for the login keychain, logged out, and logged back in to recreate it. Lo! There was Wi-Fi! As it should, the system queried for the user’s AD credentials, and added them to the login keychain. Swapping the old keychain file back in made the problem reappear, so clearly the keychain was the problem. But why?
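As an aside, the ownership complaint from Keychain First Aid (file should belong to UID 501, actually owned by 0) is easy to check for directly. A minimal Python sketch, with a hypothetical helper name, assuming you already know which UID the file ought to have:

```python
import os

def owner_mismatch(path, expected_uid):
    """Return the file's actual owner UID if it differs from expected_uid, else None."""
    actual_uid = os.stat(path).st_uid
    return actual_uid if actual_uid != expected_uid else None
```

For the case above, you would call it on the preferences file named in the error with 501 as the expected UID. A return value of 0 would mean root owns the file. This only detects the mismatch and leaves the repair to Keychain First Aid.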
Regrettably, I don’t have a definitive answer. The client needed his machine back, and breakfix was a higher priority than investigation. However, I did notice that after connecting to Wi-Fi, the new login.keychain contained only a half-dozen entries, one of them sort of suspicious:
  • Two com.apple.ubiquity keys (private and public) and one com.apple.ubiquity certificate (surprisingly listed as untrusted). These are associated with iCloud, which the client confirmed activating.
  • One Apple Persistent State Encryption application password. Google that for a list of brief, generally uninformative explanations that basically amount to “it’s a Lion thing.”
  • One 802.1x credential (for our secure Wi-Fi).
  • Most interestingly, one “certificate” named Unknown, and marked as untrusted. This reappeared each time I removed and recreated the keychain. Its inspector says its data is not recognized as a certificate, and no similar entry appears on my own Lion laptop, set up via the same procedure.
The short story here is that if you get weird certificate errors with 802.1x and Lion, you could do worse than looking hard at the login.keychain. The longer story is that I’m not sure what was borked, or why the new keychain worked, but I’m very curious about the Unknown “certificate.”
Pity I won’t get to run down an answer.