Josef “Jeff” Sipek

FAST 2012

Last week I went to FAST ’12. As always, it’s been fun attending. Overall, the talks were good and gave me a couple of ideas. There was definitely a lot of flash and SSD related work — both of which I don’t find all that exciting. There was however a bunch of work related to dedup and backup.

Anyway, here’s my reading list. I’ve skimmed most of the papers, but I want to take a closer look at these.

  • Characteristics of Backup Workloads in Production Systems
  • WAN Optimized Replication of Backup Datasets Using Stream-Informed Delta Compression
  • Recon: Verifying File System Consistency at Runtime
  • Consistency Without Ordering
  • ZZFS: A Hybrid Device and Cloud File System for Spontaneous Users
  • Serving Large-scale Batch Computed Data with Project Voldemort
  • BlueSky: A Cloud-Backed File System for the Enterprise
  • Extracting Flexible, Replayable Models from Large Block Traces
  • iDedup: Latency-aware, Inline Data Deduplication for Primary Storage

Usenix 2009, Part 2

As promised, here’s more of the day-by-day summary of Usenix ’09.


The last day of the conference. As before, I got to the place at 8:30, and had breakfast.

The first session, System Optimization, was interesting. It started with Reducing Seek Overhead with Application-Directed Prefetching. The idea is pretty obvious. You have a library that takes lists of future accesses from the application, and tries to prefetch the data while monitoring the application’s IO accesses. The first deviation from the prefetch-list causes an invocation of a call-back. This allows the application to specify a new prefetch list.
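A minimal sketch of that control flow, not the paper's actual library: the class name `PrefetchMonitor`, its methods, and the integer block numbers are all made up for illustration, and the "prefetch" is only recorded rather than performed.

```python
from typing import Callable, List, Optional


class PrefetchMonitor:
    """Hypothetical monitor: compares real accesses against a declared list."""

    def __init__(self, on_deviation: Callable[[int], Optional[List[int]]]):
        self.on_deviation = on_deviation
        self.pending: List[int] = []     # blocks the application said it will read next
        self.prefetched: List[int] = []  # blocks we would have asked the disk for

    def set_prefetch_list(self, blocks: List[int]) -> None:
        self.pending = list(blocks)
        # A real library would issue the reads here; sorting stands in for
        # reordering the requests to reduce seeking.
        self.prefetched.extend(sorted(blocks))

    def access(self, block: int) -> bool:
        """Record one access; return True if it matched the prefetch list."""
        if not self.pending:
            return False  # no active list, so nothing to deviate from
        if self.pending[0] == block:
            self.pending.pop(0)
            return True
        # First deviation: drop the stale list and ask the application for a new one.
        self.pending = []
        new_list = self.on_deviation(block)
        if new_list:
            self.set_prefetch_list(new_list)
        return False
```

The callback receives the block that broke the prediction, so the application can answer with a fresh list starting from wherever it actually is.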

The second talk of the session, Fido: Fast Inter-Virtual-Machine Communication for Enterprise Appliances, was about a simple and fast way to have multiple VMs communicating. Their target was a collection of virtual machines running in an appliance box. Since the OSes were inherently trustworthy (before virtualization took off and even now, there was one OS that did everything), they achieve zero-copy by mapping all the other OSes into each address space. For example, suppose you have 3 VMs (red, green, blue), their address spaces would be something like:

Fido: address spaces

Each VM gets all the other VMs’ address spaces read-only, and its own read-write. Then a simple message can be exchanged to specify buffer addresses.
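Here is a toy model of that arrangement, assuming each VM's memory is a plain byte buffer. The `SharedMemory` class and the `(owner, offset, length)` descriptor are invented for illustration and are not Fido's real interface.

```python
class SharedMemory:
    """Toy model: every VM can read every region, but write only its own."""

    def __init__(self, vms, size=64):
        self.regions = {vm: bytearray(size) for vm in vms}

    def permission(self, vm, owner):
        # A VM maps its own region read-write and everyone else's read-only.
        return "rw" if vm == owner else "ro"

    def write(self, vm, owner, offset, data):
        if self.permission(vm, owner) != "rw":
            raise PermissionError(f"{vm} cannot write to {owner}'s region")
        self.regions[owner][offset:offset + len(data)] = data
        # The "message" is just a descriptor telling the receiver where to look.
        return (owner, offset, len(data))

    def read(self, reader, descriptor):
        # Any reader may follow a descriptor, since all regions are mapped
        # at least read-only in every VM.
        owner, offset, length = descriptor
        return bytes(self.regions[owner][offset:offset + length])
```

Only the small descriptor travels between VMs. The payload stays where the sender wrote it, which is where the zero-copy comes from.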

The Web, Internet, Data Center session wasn’t very exciting. The one talk that stuck in my head was RCB: A Simple and Practical Framework for Real-time Collaborative Browsing. What they did was some javascript-fu that synchronized the DOM trees between two (or more?) browsers.

The last session for the day (and the conference) was Bugs and Software Updates. It opened with The Beauty and the Beast: Vulnerabilities in Red Hat’s Packages. The authors did some crazy statistics and found that there was some correlation between packages’ dependencies and the number of vulnerabilities. Several audience members pointed out that publishing these findings may cause a feedback loop that either exaggerates this correlation, or causes developers to make dependency choices that make their software seem less likely to have vulnerabilities.

The second talk, Immediate Multi-Threaded Dynamic Software Updates Using Stack Reconstruction, sounded interesting, but I really feel like I need to look at the paper first before drawing any further conclusions.

The last talk of the session, Zephyr: Efficient Incremental Reprogramming of Sensor Nodes using Function Call Indirections and Difference Computation, seemed like 2 talks in one. First, we were told about rsync protocol’s shortcomings, and how they fixed them, and then we were told about their function call indirection scheme to make the deltas smaller. This function call indirection sounded far too much like dynamically linked binaries, with GOT pointer and all that good stuff.
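To see why indirection shrinks a delta, here is a rough model. It is not Zephyr's actual scheme: the function sizes, the call list, and all helper names are assumptions. Call sites either embed the callee's absolute address or a fixed slot in a jump table.

```python
def layout(sizes):
    """Return the start address of each function, packed back to back."""
    addrs, addr = [], 0
    for size in sizes:
        addrs.append(addr)
        addr += size
    return addrs


def direct_call_sites(calls, addrs):
    # Each call site embeds the callee's absolute address.
    return [addrs[callee] for callee in calls]


def indirect_call_sites(calls):
    # Each call site embeds a table slot number, which never moves.
    return list(calls)


def changed(old, new):
    """Count positions whose value differs between two versions."""
    return sum(1 for a, b in zip(old, new) if a != b)


calls = [1, 2, 3, 1, 3]                 # callee index at each call site
old_addrs = layout([100, 40, 60, 80])
new_addrs = layout([120, 40, 60, 80])   # function 0 grew by 20 bytes

direct_changes = changed(direct_call_sites(calls, old_addrs),
                         direct_call_sites(calls, new_addrs))
indirect_changes = changed(indirect_call_sites(calls),
                           indirect_call_sites(calls))
table_changes = changed(old_addrs, new_addrs)
```

In this toy case every direct call site changes when one function grows. With the table, the call sites stay identical and only the table entries for the shifted functions differ, so the changes are packed into one small region.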

After a short break, the last invited talk began. The speaker was David Brin, a science fiction writer. He is one of those people that really knows how to present. He gives off an aura of knowing exactly what he’ll say next. I can’t tell for sure if he does know what he’ll say next, or if he merely has an idea where he wants to get to, and “improvises” to get there.

I went back to the hotel, but got bored soon after. Not knowing what to do, I went for a walk. I just took the first street away from the hotel. It turned out to be a road that went considerably uphill toward the San Diego Medical Campus (or whatever it was called). After some exploring, I got some food (this was the only place I saw that had fast food joints, the hotel area was about as boring as it can get).


I woke up relatively late - 10am. I blame getting used to the Pacific timezone (grr, just in time to fly back!). After packing up, and checking out, I ran into Swami. We went to get breakfast, and talked about all sorts of stuff - grad school, conferences, the good ol’ days at FSL. After some time, he and another student from Wisconsin took off. I decided to go for a walk. I ended up in the same fast food place. There, I started typing up the previous blahg post. After about 2 hours of working on my laptop, I went back and got a shuttle to the airport.

At the airport, I tried to do some work. Before long, I switched to reading a book. Shortly after, it was time to board. The flight itself was mostly uneventful. Sadly, the 2 hour layover in Charlotte, NC was painful. The free wifi that was there a few days earlier had disappeared. I survived it. Two-ish hours later, we landed at DTW. Due to some miscommunication, I ended up without a ride back to Ann Arbor. I managed to call up a friend who drove me back.

Usenix 2009, Part 1

Phew! It’s been a fun couple of days. I’m going to provide a day-by-day summary of what’s been going on.


More or less the entire day was taken up by travel. The flight was mostly uneventful, with a 1 hour layover in Charlotte, NC. The one hour was long enough to get from one terminal to the other and get food, yet short enough that by the time I was done eating, the first class passengers were about to board. For virtually the entire duration of the flight, I was preoccupied with a book I took with me.


Waking up for a 9am event was never easier! I guess it was the timezone difference. Instead of going to bed at about 3am, I went to bed at around midnight (which happens to be 3am on the other coast!).

I woke up at about 6:30 - not due to the alarm. I went back to sleep. At 7:30, I woke up for real. Before long, it was 8:30, and I was already at the conference about to get my proceedings and badge. Free food, namely orange juice and croissants, followed.

Nine o’clock rolled around, and the conference began. The keynote was ok. I wasn’t amazed by it, even though the speaker had valid points.

At 11, the first session began - virtualization. This was one of the sessions I’ve been looking forward to. Well, I was interested in one paper specifically: vNUMA: A Virtual Shared-Memory Multiprocessor. The name summarizes the idea very nicely. Why was I interested in it? Well, a long time ago (the year was 2003), when you couldn’t walk down the street and buy a multi-core system, I had the same idea. It’s a rather obvious idea, as is the name. At first, I was going to hack the Linux kernel to accomplish it (I still have a couple of patches I started), but other things sapped up my time. (The fact that everyone I mentioned it to told me that the paging over ethernet overheads were going to make it impractical made me want to do it even more!) As it turns out, the folks from the University of New South Wales who got a publication out of it started working on it in 2002.

Their implementation is for the Itanium architecture. They said that they chose it because at the time they started, Intel was pushing Itanium as the architecture that’ll replace x86. Unfortunately, I didn’t get to talk to any of them at the conference.

The lunchtime invited talk was nice. It was about how faculty at Harvard tried to make the intro to computer science course fun and appealing, yet still retain the same important intro material. The big thing: they used Amazon’s Elastic Compute Cloud (EC2) instead of the on-campus computer network. They liked being in charge of the system (not having to wait for some admin to take care of a request for some change), but at the same time they didn’t like being the admins themselves. Some of the students were also not exactly “thrilled” about hearing about the cloud so often - especially since they didn’t have to use it in a cloud-way. One of the most memorable parts of the talk was when the speaker whipped out a phone book, asked the audience about how to find something in it, and shortly after proceeded to do binary search - ripping the book in half, then tossing a half across the podium, and then ripping the remainder in half.
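The phone book trick is binary search on a sorted list. A small sketch, where the function name and the list of names are just examples:

```python
def find_name(names, target):
    """Binary search over a sorted list; return the index or -1 if absent."""
    lo, hi = 0, len(names)
    while lo < hi:
        mid = (lo + hi) // 2
        if names[mid] == target:
            return mid
        if names[mid] < target:
            lo = mid + 1   # "rip off" the first half, the target is later
        else:
            hi = mid       # "rip off" the second half, the target is earlier
    return -1
```

Each pass throws away half of what is left, so a book of a thousand pages needs about ten rips.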

Afterward, the networking session took place. The one talk that was fun was StrobeLight: Lightweight Availability Mapping and Anomaly Detection. The summary of the idea: ping the subnet, then some time later, ping it again, and count the number of IPs that changed - essentially XOR and then count ones. It’s a simple idea that apparently works rather well.
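The XOR-and-count step can be sketched in a few lines, assuming each sweep is stored as one bit per address (1 means the host answered). The helper names are mine, not the paper's.

```python
def to_bitmap(responses):
    """Pack a list of up/down results into an integer, one bit per address."""
    bitmap = 0
    for i, up in enumerate(responses):
        if up:
            bitmap |= 1 << i
    return bitmap


def changed_hosts(before, after):
    """Count addresses whose status differs between two sweeps."""
    # XOR leaves a 1 exactly where the two sweeps disagree.
    return bin(before ^ after).count("1")
```

For example, sweeps of `[True, True, False, False]` and `[True, False, True, False]` differ at two addresses, so `changed_hosts` returns 2.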

The next session was about storage. It started with a paper that got a best paper award. One of the authors is Swami - a former FSL member, now at University of Wisconsin-Madison. While he never said it during the talk, they essentially implemented a stackable filesystem. Right after it was a talk by guys from the MIT AI lab, about decentralized deduplication in SAN cluster filesystems.

After a short break, the last invited talk for the day happened. A dude from Sun talked about SunSPOTs. SunSPOT is a small wireless sensor/actuator platform that’s based on Java. It reminded me of several other such platforms, but seemed a lot more polished. Unfortunately, its Java-centric way was rather disappointing. (More or less everyone that knows me, knows that I don’t like Java.)

The day concluded with a poster session & food.


Again, waking up wasn’t a problem. The breakfast happened to be bagels.

9am: the first talk of the distributed systems session, was Object Storage on CRAQ: High-Throughput Chain Replication for Read-Mostly Workloads. This is one of the papers I’m intending to look at. The other two I found less interesting.

11am: the kernel development session had 2 interesting talks. The first, Decaf: Moving Device Drivers to a Modern Language talked about taking device drivers, splitting them into two portions (a performance critical section in C, and a non-performance critical section that could be moved to “a modern language” - read: Java). While I strongly disagree with the language choice, the overall idea is interesting. Java (along with other more managed languages) provides stronger static checking than C. They actually took some Linux drivers, and split them up. I want to go over their evaluation again. The next interesting talk was about executing filesystems in a userspace framework (Rump File System: Kernel Code Reborn). Unlike FUSE and other attempts, this one aims to take in-kernel filesystem code, and execute it in userspace without any modifications.

The lunchtime invited talk was about how teaching students to program is hard, and how some error messages the compiler outputs are completely misleading and confuse students.

I zoned out for most of the 2pm session about automated management. There were emails & other things to catch up on.

The short paper session started off nicely with The Restoration of Early UNIX Artifacts. The speaker mentioned that he managed to get his hands on a copy of first edition UNIX kernel source. While it was slightly mangled up, he managed to get it up and running. After a bit of effort, he got his hands on some near-second edition UNIX userspace. Another short paper was about how Linux kernel developers respond to static analysis bug reports.

The last invited talk for the day was unusual. It was about the Antikythera mechanism. He used Squeak EToys (a Smalltalk environment) to simulate the mechanism. He put the “software” up on the web:

Afterwards, more food. This time without posters. And after the food, there were some BOFs. The one I went to was about ancient UNIX artifacts. There I got to see first edition UNIX running. Really neat. It felt like UNIX - same but different in some ways. The prompt was ’@’; the ’cd’ command didn’t exist, instead you had ’chdir’… that’s right, the “terseness” of UNIX wasn’t always there!; the ’rwx’ bits you see when you do ls -l were different, you had one ’x’ bit, and 2 rw pairs. On my way out, I got more or less dragged into a mobile cluster BOF (or whatever the title was), the most interesting part was when we got to talk about Plan 9 all the way at the end.

Kids Read Comics

If you are into comics and happen to be near Chelsea, MI (about 15 mins from Ann Arbor, MI) on June 12 & 13, you might want to consider going to the Kids Read Comics comic convention. (As the name implies, it’s targeted at a younger crowd, but don’t get discouraged by that.)

The guest list looks quite good (at least in my opinion).

Kids Read Comics

OLS 2008 - Day 4

I’m still a day late when it comes to writing about OLS. Here’s Friday’s list of talks, and other happenings.

The day began with A Practical Guide to using Git (From a Kernel Maintainer) — it was very crowded in the room, so much so that I didn’t really see the slideshow, but since I already know enough about how to use Git, I don’t mind all that much. Good talk.

The next talk which I kinda had to go to was SynergyFS: A Stackable File System Creating Synergies Between Heterogeneous Storage Devices. It was a disaster - and I put that mildly. The first 30 minutes of the 45 minute talk consisted of Samsung branded marketing material showing that solid state disks were better than the regular platter-based disks. Since the marketing people care mostly about Windows users, the propaganda materials consisted of things like a video showing Microsoft Windows Vista booting on two laptops that were identical except for the storage device.

Anyway, about 2/3 of the way through the talk, SynergyFS got mentioned. And that’s when the one quite important bit got mentioned. At the time of the writing of the paper, the filesystem was a “proposed filesystem.” In other words, it didn’t exist. I am not certain if it exists at the moment, and if it does, what state it is in, but I do know (since an audience member asked when/where he could look at the code) that unless one signs an NDA with Samsung, he can’t even look at it. The code is not GPL licensed, since Samsung lawyers apparently see it as a way to lose some magical intellectual property, which as far as I know they never had. There have been papers published about hybrid storage, there have been papers published about fanout stackable filesystems, there have been papers published about fanout stackable filesystems which use different storage technologies (in no particular order: FiST, GreenFS, RAIF, Unionfs).

Overall, I feel like going to the talk was a waste of time. Meh.

Then I lunched.

Well, just before lunch, I was playing around with SELinux on my laptop, and after logging in, the processes weren’t getting the right context. After lunch, I went to SELinux for Consumer Electronic Devices. I walked into the room, and saw the NSA/Tresys/RedHat SELinux developers (including Dave Quigley) clustered in one area of the room. I just couldn’t resist, and I said “SELinux sucks” and then proceeded to walk away. The really amusing thing was all the SELinux people turned around to see who it was that dared to say such a thing. Very amusing. I sat next to them, and mentioned my SELinux problem. Stephen Smalley tried to figure out what the problem was, and in the end, reached the conclusion that somehow, even though the targeted policy was in use, the system was using some information from the strict policy completely confusing everything. I installed the strict policy, and things started working….well, for the most part. I should file this under the Debian bug tracker since it is a bug.

The SELinux talk was ok. It was what I expected…SELinux is kinda bloated for embedded systems. Some time after the talk, I overheard Stephen Smalley talking to Dave, saying that they should look into it a bit.

The next talk which I went to was Around the Linux File System World in 45 minutes. The reason I went to it was because it was being presented by Steve French. It was interesting, as I expected, and I’m going to read through his paper to see what exactly he did for the accounting (and what his thoughts are).

After Steve’s talk, I was going to go to a BOF about MIPS kernel port, but got distracted by people (including Steve).

At first, I wasn’t sure if I was going to go to the keynote (The Joy of Synchronicity) by Mark Shuttleworth (of the Ubuntu, space travel, and other-random-stuff fame). The title alone makes it sound like a hand-wavy, dreamy thing, but in the end I decided to at least spend 5 minutes listening. It was ok. Not great, from a technical perspective, but he did have some interesting ideas…well, it was really all just one idea — open source projects should have regular release schedules. I don’t know if I agree or not. On one hand it’s a nice thing, but at the same time, schedules are quite annoying when you want to make major changes (the KDE 3.x to 4.0 changes come to mind). In the end, I did stay the entire time, but I bailed at the beginning of the Q&A session.

Some food later, I headed to the hotel room to finish up writing notes for the day before. Well, I tried to upgrade my Wordpress install…but more about that later.

Porcelain Incident

Earlier today, July 25 2008 at 9:35 AM EDT, in Les Suites hotel room 1614 located in Ottawa, Ontario, Canada, an unidentified Canadian national going under the alias Shawn, was caught by security moving porcelain items from one of the bathrooms into his bed.

Porcelain Incident, exhibit 1 Porcelain Incident, exhibit 2 Porcelain Incident, exhibit 3

OLS 2008 - Day 3, Part 2

Ok, here’s the rest of what happened yesterday.

My BOF went fairly well. Of course, I have the slides for your perusal. For the demo, I started with following the HOWTO in the Documentation directory of the guilt repository, but soon after, I just deviated, and used the demo time to show people what happens under some cases that they were asking about. I got some useful questions. I even got an email from one of the people asking a question with the details about the question he asked.

After my BOF, a former labmate, Dave Quigley had a BOF about SELinux. During it, I decided that it would be fun to try SELinux again. I installed a bunch of Debian packages, fixed up a few config files based on the SELinux Setup page on the Debian Wiki, and rebooted to have the system relabel itself. That took a looong time. Mostly because I have something close to 300 thousand files on my laptop, and each and every one of them had to get a new extended attribute → every inode had to either set up an in-line xattr, or if there isn’t enough space, a new extent for the 20-40 byte label (I guess closer to 64 bytes with the label name + value).
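A back-of-the-envelope check on those numbers, using the file count and the per-label sizes guessed above. The helper names are made up, and this only counts the label bytes themselves, not any filesystem overhead.

```python
def total_label_bytes(file_count, bytes_per_label):
    """Raw bytes of labels written if every file gets one label."""
    return file_count * bytes_per_label


def as_megabytes(byte_count):
    """Convert bytes to decimal megabytes."""
    return byte_count / 1_000_000


FILES = 300_000
low = as_megabytes(total_label_bytes(FILES, 20))    # 6.0
high = as_megabytes(total_label_bytes(FILES, 64))   # 19.2
```

So the labels add up to somewhere between 6 and roughly 19 MB, which is tiny. If that estimate holds, the slow part is presumably the 300 thousand separate inode updates rather than the amount of data.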

One amusing thing that happened during the BOF was…at one point, Stephen Smalley (SELinux master-mind), was commenting about AppArmor (a competing security system) when he said “AppArmor is better…” well, the ellipsis really is “…than nothing.” :) Someone there actually made this joke.

After the BOF, Shawn Starr, Dave Quigley, and I talked for a couple of hours about a ton of random topics. Fun, fun, fun.

When Shawn and I got to the hotel, I mentioned the fact that he moved parts of the toilet the day before at an ungodly hour (something like 7:45am)…but for whatever reason I said “you moved the porcelain” … that’s where things took a strange turn… The endless stream of porcelain jokes just kept on going on and on.

OLS 2008 - Day 3

Yeah, I really wanted to write this yesterday — since it is about yesterday, but I was too tired when I got to the hotel. Either way, here it is.

The day started at 10am again - I love it. Previous years, presentations started at 9am (except the first day that was 10am). The first talk I attended was about kernel documentation - where it resides, and why the current state is bad. The talk was a bit confusing. At one point, the presenter decided to read some text right from an HTML file, opening it in a text editor instead of a browser. He also seemed to contradict himself a bit … at one point he seemed to have said that HTML was better than plaintext docs, and then some time later, he said the opposite - plaintext docs were better than HTML. I kinda gave up understanding what his point was.

I decided to be lazy, and stayed in the same room for the next talk: On submitting kernel features. I zoned out for quite a bit — I knew a bunch of things already, and it was a bit hard to lex what Andi Kleen (the speaker) was saying.

I was going to go to the ext4 talk. Unfortunately, I got distracted by people on my way to the talk, and before I knew it, I missed most of it. I guess I’ll just have to read the paper.

After lunch, I went to Virtualization of Linux servers: a comparative study. The talk was interesting, and I will read the paper. It showed exactly how much x86 virtualization sucks (at least compared to what’s on the mainframe). I can’t wait to have some time to hack on HVF some more. :)

Then, I got distracted by people, preparation of slides for my BOF about Guilt, pondering about trying SELinux again, etc., etc.

Anyway, I’m going to finish a summary of what happened yesterday later today. Until then…

OLS 2008 - Day 2

Day 2…well…day 1 of the conference. The first talk (a keynote, actually) started at 10am. Unfortunately, I couldn’t use my phone as an alarm clock as the LCD broke earlier this weekend…

Broken LCD

…and yesterday before going to bed, I forgot to set the time on the alarm clock in the hotel room. So, I got up when it felt like a good time to get out of bed. Well, oddly enough, that turned out to be something like 7:30. A bit scary actually.

Anyway, a croissant and some orange juice later, I was at the conference center, with about 45 mins to spare. I ran into Bruce Fields (one of the NFS guys; working at University of Michigan).

The keynote was ok. First, there were some technical problems (the mic and the projector didn’t work). Once those got out of the way, the actual talk began. The speaker was a bit too quiet - not sure if that was his fault or if the amplifier wasn’t set to the right level.

After the keynote, as people started pouring out of the room, I saw Dave Quigley (a former lab mate from FSL) hanging around some SELinux guys (makes sense, since that’s what he’s working on). We talked for a little, and then headed down to one of the talks in the next time slot: Confining the User with SELinux. It wasn’t bad…but being no fan of SELinux, I probably didn’t get all the fun out of it. I am somewhat tempted to give SELinux a try…again.

Seeing that there were no interesting looking talks, Dave, I, and 5 people doing SELinux work went out to get lunch. I had some pretty good burger with blue cheese…I think I’m on a blue cheese binge.

After lunch, it was time for another talk to attend. I decided to check out the “Real Time” vs. “Real Fast”: How to Choose? It wasn’t bad at all. Not being into real-time systems, I learned a few things here and there.

For the next talk, I just stayed in the same room. It wasn’t much fun. The presentation wasn’t the best, and about 15 minutes into it, I realized (talking with some people on IRC) that there was another talk in the other room that I wanted to see. So I left.

The other talk (Tux meets Radar O’Reilly - Linux in military telecom) was good. It was well presented, and while its points were obvious, it stated a couple of them explicitly.

The last talk (A Runtime Code Modification Method for Application Programs) for the day was quite interesting. I am going to check out the project website and read the paper in the proceedings.

After talks, there were BOFs… I went to the ones about and iSCSI HBA with Linux OpeniSCSI.

The BOF was much like last year - and it was a good update to see what’s going on these days with the services they provide. The summary: things work. :) And more will work.

The iSCSI BOF was interesting. I got to hear about some iSCSI offload stuff that Broadcom is working on.

After the BOFs, I waited around for a little bit with Christoph Lameter, Pekka Enberg, and Bruce Fields. Then a bunch of people from the conference (I guess something between 25-35 in total) headed to a rather fancy place for some food. The soup was good. The salmon was good. The ice cream was good.

Anyway, it’s getting late, and I should get some sleep tonight.

OLS 2008 - Day 1

Time has come once again to talk of Ottawa Linux Symposium. Yep, that’s right, it’s that time of the year again (ok, I’m posting it a little bit late).

Today was mostly uneventful. I woke up just before 6, got to the airport, flew to Ottawa, got to the hotel, realized I was about 4 hours early for check-in. I left my bag at the hotel, and went to the conference center. There I ran into Christoph Lameter. Having nothing better to do, we decided to pick up our registration and then go eat something.

When I got to the airport (LaGuardia) in the morning, I went on to continue reading a book I was about 2/3 of the way through - Princess Bride. My reading got briefly interrupted by the boarding call. After boarding the plane, I resumed reading. There is one really nice property these paper things have over electronic ways of eating up time…they don’t need to be turned off for take off and landing. Anyway, I finished the book while waiting for the hotel shuttle at the Ottawa airport. For what it’s worth, it’s a good book that everyone should read.

Around 6pm, a bunch of people I know (but probably shouldn’t associate myself with them in public ;) ) got to the hotel…about 2 hours later, one of them, a weather-loving KDE code monkey (yeah, that’s you Shawn) and I ordered pizza. It was quite tasty, as were the chicken pieces in the blue-cheese sauce.

Anyhow, time to run to the next talk….I’ll try to post at least once a day with random stuff…we’ll see how that goes.
