Josef “Jeff” Sipek

iSCSI boot - Success

In my previous post, I documented some steps necessary to get OpenIndiana to boot from iSCSI.

I finally managed to get it to work cleanly. So, here are the remaining details necessary to boot your OI box from iSCSI.

Installation

First, boot from one of the OI installation media. I used a USB flash drive. Then, before starting the installer, drop into a shell and connect to the target.

# iscsiadm add discovery-address 172.16.0.1
# iscsiadm modify discovery -t enable

At this point, you should have all the LUs accessible:

# format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
       0. c5t600144F000000000000052A4B4CE0002d0 <SUN-COMSTAR-1.0 cyl 13052 alt 2 hd 255 sec 63>
          /scsi_vhci/disk@g600144f000000000000052a4b4ce0002
Specify disk (enter its number): 

Exit the shell and start the installer.

Now, the tricky part… When you get to the network configuration page, you must select the “None” option. Selecting “Automatically” will cause nwam to try to start on boot and it’ll step onto the already configured network interface. That’s it. Finish installation normally. Once you’re ready to reboot, either configure your network card or use iPXE as I’ve shared before.

e1000g

For the curious, here’s what the iSCSI booted (from the e1000g NIC) system looks like:

# svcs network/physical
STATE          STIME    FMRI
disabled       17:13:10 svc:/network/physical:nwam
online         17:13:15 svc:/network/physical:default
# dladm show-link
LINK        CLASS     MTU    STATE    BRIDGE     OVER
e1000g0     phys      1500   up       --         --
# ipadm show-addr
ADDROBJ           TYPE     STATE        ADDR
e1000g0/?         static   ok           172.16.0.179/24
lo0/v4            static   ok           127.0.0.1/8
lo0/v6            static   ok           ::1/128

nge

Does switching back to the on-board nge NICs work now? No. We still get a lovely panic:

WARNING: Cannot plumb network device 19

panic[cpu0]/thread=fffffffffbc2f400: vfs_mountroot: cannot mount root

Warning - stack not written to the dump buffer
fffffffffbc71ae0 genunix:vfs_mountroot+75 ()
fffffffffbc71b10 genunix:main+136 ()
fffffffffbc71b20 unix:_locore_start+90 ()

iSCSI boot

I decided a couple of days ago to try to see if OpenIndiana would still fail to boot from iSCSI like it did about two years ago. This post exists to remind me later what I did. If you find it helpful, great.

First, I got to set up the target. There is a bunch of documentation how to use COMSTAR to export a LU, so I won’t explain. I made a 100 GB LU.

I dug up an older system to act as my test box and disconnected its SATA disk. Booting from the OI USB image was uneventful. Before starting the installer, dropped into a shell and connected to the target (using iscsiadm). Then I installed OI onto the LU. Then, I dropped back into the shell to modify Grub’s menu.lst to use the serial port for both the Grub menu as well as make the kernel direct console output there.

Since the two on-board NICs can’t boot off iSCSI, I ended up using iPXE to boot off iSCSI. First, I made a script file:

#!ipxe

dhcp
sanboot iscsi:172.16.0.1:::0:iqn.2010-08.org.illumos:02:oi-test

Then it was time to grab the source and build it. I did run into a simple problem in a test file, so I patched it trivially.

$ git clone git://git.ipxe.org/ipxe.git
$ cd ipxe
$ cat /tmp/ipxe.patch
diff --git a/src/tests/vsprintf_test.c b/src/tests/vsprintf_test.c
index 11512ec..2231574 100644
--- a/src/tests/vsprintf_test.c
+++ b/src/tests/vsprintf_test.c
@@ -66,7 +66,7 @@ static void vsprintf_test_exec ( void ) {
 	/* Basic format specifiers */
 	snprintf_ok ( 16, "%", "%%" );
 	snprintf_ok ( 16, "ABC", "%c%c%c", 'A', 'B', 'C' );
-	snprintf_ok ( 16, "abc", "%lc%lc%lc", L'a', L'b', L'c' );
+	//snprintf_ok ( 16, "abc", "%lc%lc%lc", L'a', L'b', L'c' );
 	snprintf_ok ( 16, "Hello world", "%s %s", "Hello", "world" );
 	snprintf_ok ( 16, "Goodbye world", "%ls %s", L"Goodbye", "world" );
 	snprintf_ok ( 16, "0x1234abcd", "%p", ( ( void * ) 0x1234abcd ) );
$ patch -p1 < /tmp/ipxe.patch
$ make bin/ipxe.usb EMBED=/tmp/ipxe.script
$ sudo dd if=bin/ipxe.usb of=/dev/rdsk/c8t0d0p0 bs=1M

Now, I had a USB flash drive with iPXE that’d get a DHCP lease and then proceed to boot from my iSCSI target.

Did the system boot? Partially. iPXE did everything right — DHCP, storing the iSCSI information in the Wikipedia article: iBFT, reading from the LU and handing control over to Grub. Grub did the right thing too. Sadly, once within kernel, things didn’t quite work out the way they should.

iBFT

Was the iBFT getting parsed properly? After reading the code for a while and using mdb to examine the state, I found a convenient tunable (read: global int that can be set using the debugger) that will cause the iSCSI boot parameters to be dumped to the console. It is called iscsi_print_bootprop. Setting it to non-zero will produce nice output:

Welcome to kmdb
kmdb: unable to determine terminal type: assuming `vt100'
Loaded modules: [ unix krtld genunix ]
[0]> iscsi_print_bootprop/W 1
iscsi_print_bootprop:           0               =       0x1
[0]> :c
OpenIndiana Build oi_151a7 64-bit (illumos 13815:61cf2631639d)
SunOS Release 5.11 - Copyright 1983-2010 Oracle and/or its affiliates.
All rights reserved. Use is subject to license terms.
Initiator Name : iqn.2010-04.org.ipxe:00020003-0004-0005-0006-000700080009
Local IP addr  : 172.16.0.179
Local gateway  : 172.16.0.1
Local DHCP     : 0.0.0.0
Local MAC      : 00:02:b3:a8:66:0c
Target Name    : iqn.2010-08.org.illumos:02:oi-test
Target IP      : 172.16.0.1
Target Port    : 3260
Boot LUN       : 0000-0000-0000-0000

nge vs. e1000g

So, the iBFT was getting parsed properly. The only “error” message to indicate that something was wrong was the “Cannot plumb network device 19”. Searching the code reveals that this is in the rootconf function. After more tracing, it became apparent that the kernel was trying to set up the NIC but was failing to find a device with the MAC address iBFT indicated. (19 is ENODEV)

At this point, it dawned on me that the on-board NICs are mere nge devices. I popped in a PCI-X e1000g moved the cable over and rebooted. Things got a lot farther!

unable to connect

Currently, I’m looking at this output.

NOTICE: Configuring iSCSI boot session...
NOTICE: iscsi connection(5) unable to connect to target iqn.2010-08.org.illumos:02:oi-test
Loading smf(5) service descriptions: 171/171
Hostname: oi-test
Configuring devices.
Loading smf(5) service descriptions: 6/6
NOTICE: iscsi connection(12) unable to connect to target iqn.2010-08.org.illumos:02:oi-test

The odd thing is, while these appear SMF is busy loading manifests and tracing the iSCSI traffic to the target shows that the kernel is doing a bunch of reads and writes. I suspect that all the successful I/O was done over one connection and then something happens and we lose the link. This is where I am now.

Meili

Back in June I got myself a new laptop — Thinkpad T520. As always, it’s a solid design (yes, I know, Thinkpads aren’t what they used to be). The unfortunate news is that the hardware was just a bit too new to be supported well. It came with Windows 7, which of course knew how to deal with all the devices in the system.

Debian

I tried installing Debian, but anything but the latest testing snapshot didn’t recognize my Intel 82579LM ethernet chip. The latest development snapshot installed just fine, but when I tried to boot into the installed system, everything got stuck in the middle of the initramfs. Booting with init=/bin/bash got me a shell, but anything and everything I tried didn’t fix the problem in the end. I searched the bug tracker for similar issues — no luck. In a last ditch effort, I tried to ask #debian. In has been my experience in the past that this channel is useless, but I asked anyway. I got precisely zero responses.

OpenIndiana

Unlike Debian, OpenIndiana installed and booted just fine. Sadly, both my wifi (Intel Wifi Link 1000) and my wired ethernet were not supported. I ended up installing VirtualBox in Windows and OpenIndiana underneath it. It worked reasonably well. At the same time, I started pestering some of the Illumos developers that mentioned that they were working on an update to the e1000g driver — the driver for my wired network interface.

Yesterday, one of them updated the bug related to the driver update with binaries for people to try. Well, guess what? I’m writing this entry in Vim over ssh.

The driver install was a simple matter of overwriting the existing e1000g files, and then running update_drv -a -i ’"pci8086,1502"’ e1000g and then rebooting. (I could have used devfsadm instead, but I wanted to make sure things would come up on boot anyway.)

I still need to switch Xorg to the proprietary NVidia driver.

Windows

I’m pretty sure I’ll end up rebooting into Windows every so often anyway…if only to play Wikipedia article: Age of Mythology. :)

P.S. In case you haven’t guessed it yet, Meili is my laptop’s hostname.

OLS 2006 - Day 5

The day began with an awesome presentation I gave about Unionfs. :) Shawn was recoding it, but after the presentation, he found out that the video turned out to be crap. He has audio only. I’m sure he’ll share it soon. :) I was pleasantly surprised at the number of people that use Unionfs or were interested in Unionfs.

The keynote was excelent as always. However I must say that Greg K-H made it sound like any piece of code will get into the kernel. Yeah, right :) But he did say few nice things about the status of Linux.

After the keynote, there was the GPG key signing - which I did not attend, although I wanted to. Instead we went to get some food. Food was good, we (I, Dave, Mike Halcrow, and Prof. Zadok) talked about a bunch of things ranging from MythTV and terabyte storage servers, to things like the number of ants in Texas. (Apparently, it is a lot of fun to watch termites and fire ants battle to the death. O_o )

We finished food around 19:45 which was about right to head over to the Black Thorn for the after event party. Just as last year it was quite interesting. Pretty much as soon as I got there, I noticed Peter Baudis aka. pasky - the cogito maintainer. We chatted about how git and Mercurial differ (Matt’s talk the day before came in handy :) ). I mentioned I was slowly working on a generic benchmark script that would test a number of popular SCMs including Mercurial, Subversion, and CVS. He was thrilled about the prospect of knowing exactly where git sucked compared to other SCMs - my guess is that he wants to fix it and make it better, a noble goal, but unnecessary as Mercurial already exists and why reinvent the wheel? ;) Seriously, though, I think a lot of people would benefit from knowing exactly where each SCM excels, and where each sucks. The nice thing about collaborating with the git people would be that it would make it more apparent that this wouldn’t just be yet-another-fake-test. After some time, a bunch of other Czech people poped up right next to us (people like, Pavel Machek, etc.). It was quite interesting. :)

After than I joined a converation with some Intel people. As it turns out, one of the Intel people is working on the e1000 driver — awesome piece of hardware, by the way, don’t ever buy anything other than it. :) Some time later, Jens Axboe joined the group briefly. When he said my name seemed familiar, I mentioned how I tried to implement IO priorities - and failed :) Later on, a guy from University of Toronto joined the group. He approached me earlier in the day about unionfs on clusters. We chatted about things ranging from school (undergraduate program, and grad school) to submitting kernel code to lkml. The e1000 guy said a similar thing that we should split unionfs up into a few patches, and send it off. During the event a few people still asked me about Unionfs, which felt good :)

Then, I decided that it would be fun to talk to some IRC people. I found John Levon and Seth Arnold. We sat down, and had an interesting conversation about a number of things. Since at least some of these were quite interesting, here’s a list:

  1. How can I deal with VFS and not drink vodka or other hard liquer
  2. Everybody hates CDE, even people at Sun
  3. Solaris is dead (well, they didn’t say it, but that’s the feeling I got)
  4. Brittons have some interesting sports or at least some of the expected behavior during the sport is interesting, namely:

  1. darts - you are expected to drink as you play
  2. I can’t recall the name - gigantic pool table
  3. cricket - everyone smokes "reefer" (to quote Movement, I just find this name of the substance mildly amusing) because their games sometimes take several days

After that, they kicked everyone out as it was 2:45 already. We (Seth, John, and I) went back to the hotel. There, I Prof. Zadok and Chip (who arrived on Friday) were about to get up and head to the airport. :) I just went to bed.

Powered by blahgd