Josef “Jeff” Sipek

Interactivity During nightly(1), part 2

Back in May, I talked about how I increase the priority of Firefox in order to get decent response times while killing my laptop with a nightly build of illumos. I have been doing this by setting Firefox to the real-time (RT) scheduling class, which has higher priority than most things on the system. This, of course, requires extra privileges.

Today, I realized that I was thinking about the problem the wrong way. What I really should be doing is lowering the priority of the build. This requires no special privileges. How do I do this? In my environment file, I include the following line:

priocntl -s -c FX -p 0 $$

This sets the nightly build script’s scheduling class to fixed (FX) and manually sets the priority to 0. From that point on, the nightly script and any processes it spawns run with a lower priority (zero) than everything else (which tends to be in the 40-59 range).

Interactivity During nightly(1)

Every so often, I do a nightly build of illumos on my laptop. This is a long and very CPU intensive process. During the build (which takes about 2.75 hours), the load average is rarely below 8 and most of the time it hovers in the low twenties. (This is a full debug and non-debug build with lint and all the other checking. If I need a build quickly, I can limit it to just what I need and then we’re talking minutes or seconds.)

Anyway, as you might imagine this sort of load puts some pressure on the CPUs. As a result, some interactive processes suffer a bit. Specifically, Firefox doesn’t get enough cycles to render the modern web (read: JavaScript cesspool). Instead of suffering for almost three hours, I just change Firefox’s scheduling class from IA (interactive) to RT (real time):

# priocntl -s -c RT `pgrep firefox`

This allows Firefox to preempt just about everything on my laptop. This works because Firefox actually yields the CPU properly. This will probably bite me sometime in the future when I end up on a page with such a busted JavaScript turd that it takes over a CPU and won’t let go. Till then, I have a pretty good workaround.

DTrace: qsort use in Firefox, part 2

Earlier, I wrote about some silly qsort behavior in Firefox. I couldn’t help but dig a bit deeper.

Before, we concluded that there were a lot of 8-element, 4-byte element sorts. What are these used for? What part of Firefox is causing these? DTrace to the rescue.

First, let’s change the DTrace command from last time a bit: let’s look at 4-byte element invocations only (arg2 equals 4) and aggregate on the caller function name:

# dtrace -n 'pid1120:libc:qsort:entry/arg2==4/{@[ufunc(ucaller)]=llquantize(arg1, 10,0,6,20)} tick-60sec{printa(@)}'

  1  75455                      :tick-60sec 

  `DPCache_GetUpToDate
           value  ------------- Distribution ------------- count    
             < 1 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 14       
               1 |                                         0        

  `g_array_sort
           value  ------------- Distribution ------------- count    
             700 |                                         0        
             750 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1        
             800 |                                         0        

  `FcFontSetSort
           value  ------------- Distribution ------------- count    
              55 |                                         0        
              60 |@                                        4        
              65 |                                         2        
              70 |                                         0        
              75 |@@@@                                     20       
              80 |@@@@@                                    24       
              85 |@@@@@@@                                  36       
              90 |@@@@@                                    26       
              95 |                                         0        
             100 |@@@@@@@@@@@@@@                           76       
             150 |@@@@                                     22       
             200 |                                         0        

  `_cairo_bentley_ottmann_tessellate_bo_edges
           value  ------------- Distribution ------------- count    
               3 |                                         0        
               4 |@                                        59       
               5 |                                         0        
               6 |                                         32       
               7 |                                         0        
               8 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@            2357     
               9 |                                         0        
              10 |@@                                       137      
              15 |@@@                                      215      
              20 |@                                        52       
              25 |                                         12       
              30 |@                                        58       
              35 |@@                                       200      
              40 |@                                        67       
              45 |                                         3        
              50 |                                         2        
              55 |                                         2        
              60 |                                         7        
              65 |                                         0        
              70 |@                                        67       
              75 |                                         0        
              80 |                                         0        
              85 |                                         0        
              90 |                                         0        
              95 |                                         16       
             100 |                                         0   

We see four unique functions that call qsort. It doesn’t take long to spot the one we were looking for: _cairo_bentley_ottmann_tessellate_bo_edges. Interesting, so it turns out that it wasn’t Firefox itself doing all these sorts (8-element, 4-byte element) but rather the Cairo graphics library. It would also seem that it is the only place that does these sorts. Let’s see how Firefox is involved in this.

# dtrace -n 'pid1120:libc:qsort:entry/arg2==4 && arg1==8/{@[ustack()]=count()} tick-60sec{printa(@)}'

  0  75455                      :tick-60sec 


This counts the qsort calls on 8-element arrays of 4-byte elements, aggregated by stack trace. We get to use ustack() here because we are in userspace (stack() would give us the kernel stack trace). Where is Firefox in this mess? This is the limit of my knowledge of Firefox internals and someone more knowledgeable could tell you more.

Your Turn

Do you use DTrace? Do you have some interesting war stories? Let me know in the comments!

DTrace: qsort use in Firefox

I’ve talked about OpenIndiana a bunch. I’ve mentioned several of its features. Let me tell you about my DTrace experiments from today. Inspired by Con Kolivas, I decided to see how Firefox uses the qsort C function.

First things first, let’s look at what the function signature looks like.

void qsort(void *base, size_t nel, size_t width,
           int (*compar)(const void *, const void *));

The second argument contains the number of elements, and the third contains the size of each element in bytes.

Now, let’s take a look at DTrace. We want the pid provider, which lets us instrument a specific process. In my case, Firefox was pid 1069. pid1069:libc:qsort:entry is the name of the probe that will fire every time qsort in libc is called by Firefox (pid 1069). Let’s aggregate the second argument (the number of elements). To keep things sane, I used the llquantize function. It is a log-linear quantization function that was rather well explained by Bryan Cantrill. (Base 10 with buckets between zero and one million seemed reasonable enough.) Additionally, I wanted DTrace to give me the current histogram every minute, which is why there is the tick-60sec probe.

# dtrace -n 'pid1069:libc:qsort:entry{@=llquantize(arg1, 10,0,6,20)} tick-60sec{printa(@)}'

  1  78738                      :tick-60sec 

           value  ------------- Distribution ------------- count    
             < 1 |                                         2        
               1 |                                         0        
               2 |                                         21       
               3 |                                         2        
               4 |@@@@                                     365      
               5 |                                         1        
               6 |@                                        132      
               7 |                                         0        
               8 |@@@@@@@@@@@@@@@@@@@@@                    1923     
               9 |                                         0        
              10 |@                                        135      
              15 |@@                                       194      
              20 |@                                        134      
              25 |                                         9        
              30 |@@@                                      246      
              35 |                                         31       
              40 |                                         8        
              45 |                                         0        
              50 |                                         0        
              55 |                                         10       
              60 |                                         1        
              65 |                                         10       
              70 |                                         39       
              75 |                                         2        
              80 |@                                        112      
              85 |@                                        56       
              90 |@                                        82       
              95 |                                         1        
             100 |@                                        132      
             150 |@                                        90       
             200 |                                         0        
             250 |                                         0        
             300 |                                         4        
             350 |                                         0        

Interesting! After several minutes of browsing various websites, we can see that Firefox really likes to sort 8-element arrays. (The value column is the bucket for the various array lengths. The count column specifies how many times there was a qsort call for each bucket.) Let’s dig a little deeper. Sorting 1-byte array elements is very different from sorting 1 MB elements. It would be really nice if we could break the histogram down into several histograms, one for each element size. Well, guess what? DTrace lets you do that very easily.

Note that the command changed only a little. Now, in addition to looking at the second argument (the array length), DTrace breaks down the distribution based on the value of the third argument (the array element size). Since I visited different websites and Firefox does caching, the distribution of qsorts is a bit different — but it is still close enough.

# dtrace -n 'pid1069:libc:qsort:entry{@[arg2]=llquantize(arg1, 10,0,6,20)} tick-60sec{printa(@)}'

  4  78738                      :tick-60sec 
           value  ------------- Distribution ------------- count    
               1 |                                         0        
               2 |@@@@@@@@@@@@@@@@@                        60       
               3 |@@                                       9        
               4 |@@@@@                                    17       
               5 |                                         0        
               6 |@@@@@@@@@@@@@@@                          55       
               7 |                                         0        
               8 |@                                        3        
               9 |                                         1        
              10 |                                         0        

           value  ------------- Distribution ------------- count    
             < 1 |@@@@@@@@@@@@@                            2        
               1 |                                         0        
               2 |                                         0        
               3 |                                         0        
               4 |                                         0        
               5 |                                         0        
               6 |                                         0        
               7 |                                         0        
               8 |                                         0        
               9 |                                         0        
              10 |                                         0        
              15 |                                         0        
              20 |                                         0        
              25 |                                         0        
              30 |                                         0        
              35 |                                         0        
              40 |                                         0        
              45 |                                         0        
              50 |                                         0        
              55 |                                         0        
              60 |                                         0        
              65 |                                         0        
              70 |                                         0        
              75 |                                         0        
              80 |                                         0        
              85 |                                         0        
              90 |                                         0        
              95 |                                         0        
             100 |                                         0        
             150 |                                         0        
             200 |@@@@@@@@@@@@@@@@@@@@                     3        
             250 |                                         0        
             300 |@@@@@@@                                  1        
             350 |                                         0        

           value  ------------- Distribution ------------- count    
             < 1 |                                         12       
               1 |                                         0        
               2 |                                         1        
               3 |                                         0        
               4 |@@@@@@@                                  1351     
               5 |                                         0        
               6 |@                                        247      
               7 |                                         0        
               8 |@@@@@@@@@                                1868     
               9 |                                         0        
              10 |@@@                                      594      
              15 |@@                                       422      
              20 |@                                        230      
              25 |                                         4        
              30 |@@@@@@                                   1193     
              35 |@@                                       466      
              40 |                                         57       
              45 |                                         63       
              50 |                                         1        
              55 |                                         18       
              60 |@                                        190      
              65 |@@                                       341      
              70 |@                                        207      
              75 |                                         2        
              80 |                                         56       
              85 |@                                        158      
              90 |                                         46       
              95 |                                         0        
             100 |@@                                       350      
             150 |@                                        206      
             200 |                                         3        
             250 |                                         10       
             300 |                                         8        
             350 |                                         0        
             400 |                                         0        
             450 |                                         0        
             500 |                                         0        
             550 |                                         0        
             600 |                                         0        
             650 |                                         0        
             700 |                                         1        
             750 |                                         10       
             800 |                                         0        
             850 |                                         0        
             900 |                                         0        
             950 |                                         0        
            1000 |                                         0        
            1500 |                                         0        
            2000 |                                         0        
            2500 |                                         0        
            3000 |                                         8        
            3500 |                                         0        

As you can see, there are now three histograms printed because DTrace saw 3 unique values of arg2. The first histogram is for 52-byte array element sorts. There weren’t many of those over the few minutes of browsing I did. The second is for 8-byte elements; there are six of those total! The third distribution is where things get interesting. These are all the sorts of 4-byte array elements. Now we know that the large number of 8-element sorts Firefox performs are on 4-byte element arrays. I wonder what that’s about. We also see that there were eight times that Firefox ended up sorting an array that had somewhere between 3000 and 3500 4-byte elements. Eeek!
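
The bucket boundaries in these histograms (steps of 1 below 10, steps of 5 below 100, steps of 50 below 1000, and so on) follow from the llquantize arguments 10, 0, 6, 20. Here is a rough Python sketch of how a value maps to its bucket. The function name llquantize_bucket is made up for illustration, and treating factor**(high + 1) as the upper limit is my reading of the semantics, so take it as an assumption rather than a reference implementation.

```python
def llquantize_bucket(value, factor=10, low=0, high=6, nsteps=20):
    """Return the lower bound of the log-linear bucket holding `value`.

    Hypothetical helper, not part of DTrace. Returns None for values
    below factor**low (the "< 1" row in the output) and for values at or
    above factor**(high + 1), which is assumed to be the overflow limit.
    """
    if value < factor ** low or value >= factor ** (high + 1):
        return None

    # Find the order of magnitude that contains the value.
    order = low
    while value >= factor ** (order + 1):
        order += 1

    start = factor ** order        # e.g. 1000
    end = factor ** (order + 1)    # e.g. 10000

    # Each order of magnitude is cut into nsteps linear steps, but a
    # step is never narrower than 1 (so 1..9 get one bucket each).
    width = max(end // nsteps, 1)
    return start + (value - start) // width * width


if __name__ == "__main__":
    for n in (8, 17, 120, 3200):
        print(n, "->", llquantize_bucket(n))
```

With these arguments, an array of 3200 elements lands in the bucket labeled 3000, which is why the last populated row above covers lengths from 3000 up to (but not including) 3500.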

DTrace is a really powerful tool. It lets you inspect the operation of a system with minimal disruption (the performance overhead is rather small). I hope to post more analyses of various software in the future.

I should add, this experiment was conducted with Firefox 3.6.12 on OpenIndiana 151a.


Dear Firefox,

You Suck.


Josef ’Jeff’ Sipek.

P.S. xulrunner-stub using 4% CPU when the window is not visible and 36% when re-rendering parts of the page is a bit too excessive.

Memory Leaks

Alright, I think I’ve had just about enough. Why does Amarok eat up 22% of my RAM (1GB) after 4 days of running (and playing music for maybe 18 hours of those 4 days)? Why does Firefox use up 33% of my RAM in 4 days?

Why is it that when I shut down the app, and restart it, the usage is 4–5 times less?

                     Amarok    Firefox 2
before app restart   225 MB    338 MB
after app restart     58 MB     72 MB

The only reasons I can think of are the applications being buggy, or having really crappy defaults.

Buggy Applications

Dear developers, believe it or not, when you allocate memory, you also have to free it when you are done with it. If you don’t, you are committing a crime against humanity known as a “memory leak”. This memory is unusable, and essentially becomes dead weight the process carries around. Since it is not used, the OS may swap it out, and before long, your swap file/partition becomes full of memory that has been leaked.

Contrary to popular belief, freeing memory is really simple.

For you C++ coders (yes, that includes you Amarok folks), you simply use the delete keyword followed by a pointer of what you want to free. For example,

delete some_pointer;

If you are using C, the free function is your friend. Just call it, and make the one argument you give it the pointer to what you want to free. For example,

free(some_pointer);

Now, if you are working on a larger project, there might be wrappers around the memory management (malloc/free, new/delete) functions, but whatever the “free this memory” function is called, USE IT.

I can almost hear all the managed languages fans yell: “Just use a language that does garbage collection, and you won’t have to worry about freeing memory.” Well, you are WRONG!

Garbage collectors maintain graphs of memory allocations, and whenever they notice that some piece of memory is unreachable, they mark it as garbage, and free it. Here’s my favorite example for causing leaks in a garbage collected language:

Suppose that you have implemented a class that works as a stack. You implemented it as an array of elements, and an index into the array to mark the top of the stack. Pushing an element is trivial, you just increment the index, and set the reference in the array to the object you want to store. Popping is really easy, you just decrement the index, and you’re done. Right? WRONG! Decrementing the index changes that one integer variable, but that reference in the array is still valid, and therefore the object is still reachable as far as the garbage collector is concerned. Sure, next time you push into that slot, the previous reference will get broken, and the previous allocation will get freed (assuming that there are no other references). But what if you never push that many elements back onto the stack? What if you experienced some high-load spike? You’ll have a large number of objects incorrectly referenced, tying up memory, and quite possibly making the entire system slower.

How can you solve this? Pretty simple, just reset the reference to some “null” quantity. In Java, that means using the null literal. For example,

some_reference = null;

In Python, None is the proper keyword to use:

some_reference = None
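
To make the stack example concrete, here is a minimal Python sketch. The names Payload, LeakyStack, TidyStack, and survives_pop are all made up for illustration. It uses a weak reference to check whether a popped object is still being kept alive by the stale slot in the array.

```python
import gc
import weakref


class Payload:
    """Stand-in for some large object (hypothetical)."""


class LeakyStack:
    """Array-backed stack whose pop() only moves the index."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.top = 0  # index of the next free slot

    def push(self, obj):
        self.slots[self.top] = obj
        self.top += 1

    def pop(self):
        self.top -= 1
        return self.slots[self.top]  # the slot still references the object


class TidyStack(LeakyStack):
    """Same stack, but pop() clears the slot it vacates."""

    def pop(self):
        self.top -= 1
        obj = self.slots[self.top]
        self.slots[self.top] = None  # drop the stale reference
        return obj


def survives_pop(stack_cls):
    """Push one object, pop it, and report whether it is still alive."""
    stack = stack_cls(4)
    obj = Payload()
    ref = weakref.ref(obj)  # lets us observe the object without owning it
    stack.push(obj)
    del obj                 # the stack now holds the only strong reference
    stack.pop()             # caller discards the popped object
    gc.collect()
    return ref() is not None


if __name__ == "__main__":
    print("leaky stack keeps object alive:", survives_pop(LeakyStack))
    print("tidy stack keeps object alive: ", survives_pop(TidyStack))
```

The only difference between the two classes is the single assignment of None in pop(), which is all it takes to let the collector reclaim the object.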

The lesson is, free the memory you allocated when you are done using it.

Crappy Defaults

Many large applications (Firefox included) have many options you can set that affect their behavior. The default options should cover 95% or more of the users (or at least the greatest majority possible). Why such a high number? Well, suppose you settle for making 90% of your users happy out of the box…that means that 1 in 10 people that try your app will not be happy with the defaults. How many will bother checking if there even are knobs they can turn to make it work the way they want? Not all. Some will just try to install another open source app written by someone else that does pretty much the same thing. So, the default options should make as many people happy as possible.

How does this tie into a third of my RAM being used by Firefox? Simple, I do not know if there are any knobs that would “fix” the problem I am seeing. For all I know, someone decided that it was a great idea to be really aggressive about caching web page content in memory, something that’s fine if you have 16GB RAM, but guess what: most people don’t.

Whatever it is (defaults that don’t make sense or memory leaks), Firefox and Amarok have problems that must get addressed. What is one of the reasons people complain about Microsoft Word? It takes up tons of memory. Well, I don’t feel like throwing over 200 MB of RAM at an application that plays MP3s, displays a playlist, and cover art.

And before someone suggests that I use Firefox 3… I realize that it is all super-duper-better-than-ever, but let’s think for a second. When the original Firefox was released, it was hailed as the non-leaky, light-weight Mozilla. Then, things started to get slow again. Firefox 2 was supposed to be the super-fast, non-leaky browser. What happened? What happened to my >300 MB of RAM? Now, Firefox 3 is all the rage…do you see the pattern yet?

I think this brings up a larger issue. It’s no secret that I do some Linux kernel coding from time to time. In the kernel, there are leaks at times, but it seems that the kernel leaks are effectively non-existent compared to applications like Firefox. Don’t believe me? How come you can have a server run for over a year and it responds just as well after the year as it did when you booted it? Imagine running Firefox for a year without restarting it. Can you even imagine that? The Linux kernel doesn’t seem to be the only “non-leaky” project (there are leaks, but they are very rare, and probably mostly in the ugliest parts of the kernel: device drivers). Apache performs quite well even after running for a while, so does PostgreSQL, and the list goes on and on.

Why is it that Firefox and other projects seem to have so many problems? The only thing I can think of is the quality control that goes into checking new code before it’s committed. In the kernel community, a patch may get rewritten a dozen times, submitted to mailing lists for review, get comments from people familiar with the subsystem, but also from other developers (and budding developers trying to understand the existing code). It takes a lot of effort to get a piece of code into the kernel, but in the end, that code is well written, well reviewed, and it should benefit most users. Do the Firefox, et al., communities do this? I do not know, but somehow, I suspect that it isn’t the case.

Firefox + vim = ?

What would happen if someone took Firefox and combined it with the awesomeness of vim’s UI? Wouldn’t that be slick? Well, wait no longer! Someone has done precisely that!


By default, all you see is just the tab list, and the status line on the bottom of the window. I’ve enabled the menu bars because I’m not ready to make a complete switch ;)

Of course, since it’s still Firefox, everything renders just the same.

Edit: It’s called Vimperator and it is a Firefox extension.


You may have heard me say that Firefox sucks. Well it does, but today, I accidentally found something that makes it suck less than what I thought.

Go to some page with text, and press the slash key and type something! Very nice. I like my applications to understand slash as "search" (vim, less, etc.). Now, if only it did regular expressions ;)
