Josef “Jeff” Sipek

Exclusive Or Character

A couple of years ago I blogged about the CCS instruction in the Apollo Guidance Computer. Today I want to tell you about the XC instruction from the System/360 ISA.

Many ISAs have some sort of xor instruction. The 360 is no different. It offers several different xor instructions which differ in the type of operands that they operate on. In all cases, the operation they perform could be summarized as (using C syntax):

A ^= B;

That is, one of the operands is used as both a source and a destination.

There are the boring X (reg ^= memory), XR (reg ^= reg), and XI (memory byte ^= immediate). Then there is XC, which is what inspired this post. XC, or Exclusive Or Character, takes two memory locations and a length and performs what appears to be a byte-by-byte xor of the two buffers. (The hardware is smart enough to operate on bigger chunks of memory, but the effect is as if it were done a byte at a time.) In assembly XC looks like:

XC d1(l,b1),d2(b2)

The d fields are 12-bit unsigned displacements, while the b fields specify the registers holding the base addresses. For each of the operands the actual address is dX plus the value of the bX register. The l is a length field which encodes a length between 1 and 256.
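To make the operand fields concrete, here is a minimal sketch of how such an instruction is packed into its 6-byte machine form: the opcode (0xD7 for XC), the length minus one, and then a 4-bit base register followed by a 12-bit displacement for each operand. The helper name xc_encode and its argument order are made up for this illustration; they do not come from any real assembler.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Encode "XC d1(l,b1),d2(b2)" into its 6-byte machine form.
 * xc_encode() is a hypothetical helper used only for illustration.
 */
void xc_encode(uint8_t out[6], unsigned d1, unsigned len, unsigned b1,
               unsigned d2, unsigned b2)
{
	assert(len >= 1 && len <= 256);   /* length as written in assembly */
	assert(d1 < 4096 && d2 < 4096);   /* 12-bit unsigned displacements */
	assert(b1 < 16 && b2 < 16);       /* 16 general purpose registers */

	out[0] = 0xD7;                    /* XC opcode */
	out[1] = (uint8_t)(len - 1);      /* stored as length minus one */
	out[2] = (uint8_t)((b1 << 4) | (d1 >> 8));
	out[3] = (uint8_t)(d1 & 0xFF);
	out[4] = (uint8_t)((b2 << 4) | (d2 >> 8));
	out[5] = (uint8_t)(d2 & 0xFF);
}
```

Storing the length minus one is what lets a single byte cover lengths from 1 to 256. With this sketch, XC 160(16,15),160(15) comes out as the bytes D7 0F F0 A0 F0 A0.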

To use more C pseudocode, XC does:

void XC(unsigned char *op1, size_t len, unsigned char *op2)
{
	while (len--) {
		*op1 ^= *op2;
		op1++;
		op2++;
	}
}

(This pseudo code ignores the condition code calculation and exception generation which are not relevant to the discussion.)

This by itself is neat but not very exciting…until you remember that xor can be used to zero out a register. You can use XC to zero out up to 256 bytes of memory. It turns out this idiom is used pretty often in handwritten assembly, and compilers such as GCC even produce such instructions without any special effort on the programmer’s behalf.
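The zeroing trick can be sketched in C by passing the same buffer as both operands of the byte-at-a-time model. The names xc_model and xc_zero are made up for this illustration, and the chunking loop is an assumption about how one might cover buffers longer than the 256 bytes a single XC can handle.

```c
#include <assert.h>
#include <stddef.h>

/* Byte-at-a-time model of XC; len is limited to what one XC can encode. */
void xc_model(unsigned char *op1, size_t len, const unsigned char *op2)
{
	assert(len >= 1 && len <= 256);

	while (len--) {
		*op1 ^= *op2;
		op1++;
		op2++;
	}
}

/* Zero len bytes by xor-ing the buffer with itself, one XC-sized chunk at a time. */
void xc_zero(unsigned char *buf, size_t len)
{
	while (len > 0) {
		size_t chunk = len > 256 ? 256 : len;

		xc_model(buf, chunk, buf);   /* x ^ x == 0 for every byte */
		buf += chunk;
		len -= chunk;
	}
}
```

Because both operands name the same bytes, each byte is xor-ed with its own value and ends up as zero, regardless of what it held before.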

For example, in HVF I have this line:

memset(&psw, 0, sizeof(struct psw));

Which GCC helpfully turns into (struct psw is 16 bytes in size):

xc      160(16,%r15),160(%r15)

When I first saw that line in the disassembly of HVF years ago, it blew my mind. It is elegant, fast thanks to the microarchitecture optimizations, and once you are used to the idiom it is clear about what it does. I hope your mind was as blown as mine. Till next time!

CMake

Recently, I looked into various build systems in hopes of finding one that sucks less than the custom-built (not by me) one I had the “pleasure” of dealing with for about a month. The requirements were that it had to work on Windows, Linux, and Mac OS X. Of all the possibilities, CMake looked really promising. After successfully convincing the others that CMake was better for the project, most of the code got switched over. I have to say, CMake is nice.

Since then, I have replaced Makefiles in several of my projects (e.g., my blogging system) with CMakeLists.txt. Not only is the console output cleaner, but it does proper dependencies, some sanity checks, and in general simplifies one’s life. I am actually considering switching HVF’s build system to it.

HVF v0.16-rc3

Whee! This weekend happened to be filled with coding.

First, I realized that it’s been 15 months since the last HVF release. I looked at the list of commits since then, and it was a sizable enough list to warrant a new release. Since there wasn’t a whole lot of “Oh my! Must have!” it ended up being just another release candidate.

Once released, I looked around my patch directories to see what else I should hack at. The installer patch caught my eye.

I had a branch for the DASD loader work for a while, and the loader is complete. On the same branch, I had uncommitted code to implement a simple installer program. I started working on it almost a year ago (at least that’s what I gather from the bug report: bug #146), but I kept all the code uncommitted. I did some simple cleanup, and committed the work-in-progress code. Then, over the next few hours, I managed to get a very large portion of it done. The last piece that needs to be implemented is the EDF handling. That is, the code that lets HVF use the CMS file system for config files, etc.

In other news, I ran Doxygen on the HVF codebase. The output wasn’t as impressive as I hoped it would be. Part of it is probably because there are way too many functions that aren’t documented (bug #76). This would be a good starter project for anyone looking to start hacking on HVF. Since I’m on the topic of looking for help, I realized that the HVF website is awful and could use some help with both look & feel as well as content.

Lastly, I’d like to post a link here to the HVF Ohloh page.

Odin

I finally decided that enough was enough, and I ordered the parts for my new server. This means that in the next week or two, I will be replacing the good ol’ dual Athlon (see below for specs), with a shiny new quad-core Xeon.

Current setup — baal:

2x AMD Athlon MP 1800+ (1.533 GHz, 256 KB cache)
2x 40GB IDE disk
4x 512 MB
1x e1000 Intel NIC

New setup - odin:

1x Intel Xeon W3520 Bloomfield 2.66GHz 4 x 256KB L2 Cache 8MB L3 Cache LGA 1366 130W Quad-Core Server Processor
6x Kingston 2GB 240-Pin DDR3 SDRAM ECC Unbuffered DDR3 1333 Server Memory Model KVR1333D3E9S/2G
6x Seagate Barracuda 7200.11 ST31500341AS 1.5TB 7200 RPM 32MB Cache SATA 3.0Gb/s 3.5" Internal Hard Drive
1x SUPERMICRO CSE-743T-645B Black 4U Pedestal Chassis w/ 645W Power Supply 2 External 5.25" Drive Bays
1x SUPERMICRO MBD-X8STE-O LGA 1366 Intel X58 ATX Intel Core i7 Intel Motherboard

I’ve “stolen” some images of the case from NewEgg:
[Two images: Odin’s SuperMicro case]

Baal gives me about 40 GB of disk space (I use RAID 1 across the two drives). Odin will give me about 6TB (RAID 6). This will finally allow me to do a few things I wanted to do for a while; one such thing is to provide a Hercules image with Linux set up to do HVF development.

HVF v0.15

Two days ago, I decided to release HVF v0.15. It’s been over a year since I did the v0.14 release. There were 4 -rc’s in between. All in all, there have been 132 commits with lots of changes all around.

You can get the source code via Git (git://repo.or.cz/hvf.git), or a tarball.

HVF: Sample Session

Looking at some of my older posts about z/Architecture, I decided to post a sample console session (including some annotations) with the latest version of the code with some work-in-progress patches that I haven’t touched in a while.

Every OS needs a nice banner, right?

                    HH        HH  VV        VV  FFFFFFFFFFFF
                    HH        HH  VV        VV  FFFFFFFFFFFF
                    HH        HH  VV        VV  FF
                    HH        HH  VV        VV  FF
                    HH        HH  VV        VV  FF
                    HHHHHHHHHHHH  VV        VV  FFFFFFF
                    HHHHHHHHHHHH  VV        VV  FFFFFFF
                    HH        HH   VV      VV   FF
                    HH        HH    VV    VV    FF
                    HH        HH     VV  VV     FF
                    HH        HH      VVVV      FF
                    HH        HH       VV       FF

HVF VERSION v0.15-rc4-7-g62eac50

NOW 06:38:44 UTC 2009-04-15

LOGON AT 06:38:45 UTC 2009-04-15

The IPL command isn’t completely done, so for the time being, it has the device number hardcoded in.

ipl
WARNING: IPL command is work-in-progress
GUEST IPL HELPER LOADED; ENTERED STOPPED STATE

You can see the device number in R2, the subchannel ID in R1, and the base address in R12.

d g
GR  0 = 0000000000000000 0000000000010005
GR  2 = 0000000000000a00 0000000000000000
GR  4 = 0000000000000000 0000000000000000
GR  6 = 0000000000000000 0000000000000000
GR  8 = 0000000000000000 0000000000000000
GR 10 = 0000000000000000 0000000000000000
GR 12 = 0000000001000000 0000000000000000
GR 14 = 0000000000000000 0000000000000000

Execution will begin at 16MB, that’s where the loader gets copied.

d psw
PSW = 00080000 81000000 00000000 00000000
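As a rough illustration of how to read this dump, here is a minimal sketch that pulls a few fields out of the first two words. It assumes those words follow the ESA/390 PSW layout (condition code in bits 18-19, addressing mode in bit 32, instruction address in bits 33-63) and ignores the last two words entirely. The struct and function names are made up for this example.

```c
#include <stdint.h>

/* Fields picked out of the first two 32-bit words printed by "d psw". */
struct psw_fields {
	unsigned cc;        /* condition code, 0..3 */
	unsigned amode31;   /* 1 if the top bit of the address word is set */
	uint32_t addr;      /* instruction address without the mode bit */
};

/* Hypothetical helper; assumes the ESA/390 layout described above. */
struct psw_fields psw_decode(uint32_t word0, uint32_t word1)
{
	struct psw_fields f;

	f.cc = (word0 >> 12) & 0x3;     /* PSW bits 18-19 */
	f.amode31 = word1 >> 31;        /* PSW bit 32 */
	f.addr = word1 & 0x7FFFFFFF;    /* PSW bits 33-63 */
	return f;
}
```

Under that assumption, 00080000 81000000 decodes to condition code 0 and address 0x01000000, which is the 16MB mark mentioned above.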

The first few instructions of the loader…as disassembled by the built-in disassembler.

d s i1000000.20
R0000000001000000  B234C090      STSCH  144(R12)
R0000000001000004  4770C040      BC     7,64(R0,R12)
R0000000001000008  9680C095      OI     149(R12),128
R000000000100000C  B232C090      MSCH   144(R12)
R0000000001000010  4770C040      BC     7,64(R0,R12)
R0000000001000014  D2070078C060  MVC    120(8,R0),96(R12)
R000000000100001A  5830007C      L      R3,124(R0,R0)
R000000000100001E  4133C03C      LA     R3,60(R3,R12)

There are real devices. Since this run was under Hercules, these were all defined in the hvf.cnf.

q real
CPU RUNNING
STORAGE = 128M
CONS 0009 3215 SCH = 10000
RDR  000C 3505 SCH = 10001
PUN  000D 3525 SCH = 10002
PRT  000E 1403 SCH = 10003
GRAF 0040 3278 SCH = 10004
GRAF 0041 3278 SCH = 10005
TAPE 0580 3590 SCH = 10006

And there are virtual devices (including their subchannel information blocks).

q virtual
CPU STOPPED
STORAGE = 17M
CONS 0009 3215 ON CONS 0009 SCH = 10000
RDR  000C 3505 SCH = 10001
PUN  000D 3525 SCH = 10002
PRT  000E 1403 SCH = 10003
DASD 0191 3390      0 CYL ON DASD 0000 SCH = 10004
d schib all
SCHIB DEV  INT-PARM ISC FLG LP PNO LPU PI MBI  PO PA CHPID0-3 CHPID4-7
10000 0009 00000000   0  01 80  00  00 80 ---- FF 80 00000000 00000000
10001 000C 00000000   0  01 80  00  00 80 ---- FF 80 00000000 00000000
10002 000D 00000000   0  01 80  00  00 80 ---- FF 80 00000000 00000000
10003 000E 00000000   0  01 80  00  00 80 ---- FF 80 00000000 00000000
10004 0191 00000000   0  01 80  00  00 80 ---- FF 80 00000000 00000000

Let ’er rip! Well, it gets past SSCH (well, kind of) and then it stopped when it didn’t know what to do with a DIAG.

be
INTRCPT: INST (b234 c0900000)
STSCH handler (raw:0000b234c0900000 addr:0000000001000090 sch:10005)
INTRCPT: INST (8300 00010000)
Unknown/mis-handled intercept code 04, err = -4

Ah, condition code 3, that’s why the loader gave up with DIAG, instead of attempting MSCH.

d psw
PSW = 00083000 81000048 00000000 00000000
d s i1000040.10
R0000000001000040  980FC0C4      LM     R0,R15,196(R12)
R0000000001000044  83000001      DIAG   X'000001'
R0000000001000048  980FC0C4      LM     R0,R15,196(R12)
R000000000100004C  83000000      DIAG   X'000000'

What version is this anyway? Is it 6:45 already?!

q cplevel
HVF version v0.15-rc4-7-g62eac50
IPL at 06:38:44 UTC 2009-04-15
q time
TIME IS 06:45:26 UTC 2009-04-15

P.S. I just realized that the post id for this post is 360. How apt! :)

HVF & z/VM

After a couple of days of trying to figure out why HVF didn’t want to IPL under z/VM, I finally managed to find out that z/VM is trying to act like a real 3215 printer-keyboard console, and refusing to do Sense-ID. It just rejects it. Grrr. So I set up a “device blacklist” where I say that device number 0009 is a 3215 console. It works well enough. This is what things look like on a real mainframe :)

ipl c clear
HVF version 0.13
 Memory:
    4096 kB/page
    131072 kB
    32768 pages
    PSA for each CPU     0..1024 kB
    nucleus              1024..4096 kB
    struct page array    4096..4608 kB
    generic pages        4608..131072 kB
 Devices:
    3215-00 @ 0009 (sch 10003)
    3390-12 @ 0190 (sch 10004)
    3390-12 @ 019d (sch 10005)
    3390-12 @ 019e (sch 10006)
    3390-12 @ 0191 (sch 10007)
    3390-12 @ 0192 (sch 10008)
 Scheduler:
    no task max

Shiny, isn’t it? Subchannels 0-2 are missing because the devices (card reader, punch, and printer) don’t support Sense-ID, but I haven’t bothered to set up the “blacklist” entries because I won’t need these devices anytime soon.

Putting the V in HVF

After a number of days trying to avoid having to implement the Dynamic Address Translation (DAT) related code, I finally came to the conclusion that I had to…and so I did…in one afternoon. Woohoo! It actually wasn’t all that bad. Sure, it’s not complete, I still need a way to invalidate entries in the TLB, but right now, the only user that exists is the nucleus, so I just set up the address space, load up a pointer to the first level of page tables (called Region-Third-Table), and toggle the right bit in the Program Status Word (PSW). The DAT hardware turns on, and happily starts translating the addresses from virtual to physical.

One thing that struck me as somewhat wasteful was the fact that if I have, say, 128MB to set up page tables for (each page being 4kB), I need to set up 32768 page table entries (PTEs)! Sure, if they are physically fragmented, then you have no choice, but if they are in a couple of (if not one) contiguous chunks, then what? You still need 32768 entries. Oh, and have I mentioned that each entry is 8 bytes? That means that 128MB needs 256kB of PTEs. Now, you can’t have just PTEs; that would make the translation far too inefficient. You have segment tables (with segment table entries, or STEs) which point to sections of the page tables. As it turns out, each STE can point to a section of only 256 PTEs. In other words, the 32768 PTEs will require 128 STEs. At 8 bytes each, that’s 1kB, which fits in a single page…but…

Now, imagine we have 2 GB to set up page tables for, that’s…
…2147483648 bytes
…524288 pages
…524288 page table entries
…4194304 bytes of page table entries (= 4 MB)
…2048 segment table entries
…16384 bytes of segment table entries (= 16 KB)
…~4MB overhead for 2GB

Still doesn’t sound that bad…but if the hardware allowed some kind of extent-based PTE, or if, for example, the STE had a flag that said “this is the physical address of the beginning of a 1MB chunk of memory” (256 * 4kB = 1MB), a whole lot of memory could be saved.
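The overhead arithmetic can be reproduced with a few lines of C. This is only a sketch of the calculation, using the figures from the text (4kB pages, 8-byte entries, 256 PTEs per STE); the constant, struct, and function names are made up for this example.

```c
#include <stdint.h>

#define DAT_PAGE_SIZE     4096u   /* bytes per page */
#define DAT_ENTRY_SIZE    8u      /* bytes per PTE or STE */
#define DAT_PTES_PER_STE  256u    /* PTEs covered by one STE */

struct dat_overhead {
	uint64_t ptes;        /* number of page table entries */
	uint64_t pte_bytes;   /* memory used by the PTEs */
	uint64_t stes;        /* number of segment table entries */
	uint64_t ste_bytes;   /* memory used by the STEs */
};

/* Table sizes needed to map mem_bytes of memory, rounding up. */
struct dat_overhead dat_overhead_for(uint64_t mem_bytes)
{
	struct dat_overhead o;

	o.ptes = (mem_bytes + DAT_PAGE_SIZE - 1) / DAT_PAGE_SIZE;
	o.pte_bytes = o.ptes * DAT_ENTRY_SIZE;
	o.stes = (o.ptes + DAT_PTES_PER_STE - 1) / DAT_PTES_PER_STE;
	o.ste_bytes = o.stes * DAT_ENTRY_SIZE;
	return o;
}
```

For 2GB this gives 524288 PTEs (4194304 bytes) and 2048 STEs (16384 bytes); for 128MB it gives 32768 PTEs (262144 bytes) and 128 STEs (1024 bytes).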

[ANNOUNCE] HVF v0.11

Hello all!

I would like to announce the first public release of HVF — an open source OS for the zArchitecture written in C.

Currently, the OS does very little. It consists of:

- simple process scheduler
- console layer (currently supports only one 3215 device)
- page allocator
- slab allocator (to provide a libc-like malloc())

Once the system is IPLed, it outputs some information to the console, and then continues to idle. While this is not much there is enough code that it lends itself to (aside from my goal with it — see below):

- being used as a basis for experimenting with zArch
- being used as the beginning of a toy OS

Since I do not have access to a zSeries, I had to resort to developing and testing on Hercules. It is possible that there are issues that need fixing to get things running smoothly on the real thing.

The ultimate goal is to have a VM/370-like OS that runs on the zArchitecture - to allow Linux and other modern OSes to run concurrently on a single machine. Here are a few of the goals on the TODO list:

- nucleus should be all 64-bit (minus the arch mode switching code)
- mostly in C
- support multiple users
- use SIE to virtualize the hardware (S/390 and zArch modes)
- give something to the mainframe hobbyist community to play with :)

Note that this is all for the hypervisor — I’d like to have a CMS-like OS as well, but that’s secondary. (In a couple of days, I’m actually planning to post a list of ideas for the guest OS to the HVF mailing list — see below.)

You can find the released source code in a tarball at:

http://www.josefsipek.net/projects/hvf/src/

I use Git[1] as the version control system. You can browse the history, as well as obtain the source at:

http://repo.or.cz/w/hvf.git

Feel free to grab a copy of the source code, build it (see Documentation/building.txt in the source tree), IPL it, tweak it, and submit patches :)

I have also set up a mailing list as a place to discuss design, comment on code, etc.:

http://lists.josefsipek.net/listinfo/hvf

Currently, the list gets commit messages whenever something changes in the repository but I’m hoping that once people join it’ll be more interesting :)

Then there is the IRC channel where you can catch me pretty much all the time:

server: irc.oftc.net (the OFTC network)
channel: #hvf

And finally, I have decided to use GPLv2 as the license of choice for the code. The major advantage of doing so is the ability to borrow code (with proper citation of the borrowing) from other GPLv2 projects — namely Linux. The extent of the borrowing is restricted to basic building blocks — e.g., atomic variable types, locking primitives, but not much more beyond that.

Josef ’Jeff’ Sipek.

[1] http://git.or.cz/
