Josef “Jeff” Sipek

Post Formats in blahgd

After reading What creates a good wikitext dialect depends on how it’s going to be used, I decided to write a short post about how I handled the changing needs that my blahg experienced.

One of the things that I implemented in my blogging software a long time ago was the support for different flavors of markup. This allowed me to switch to a “saner” markup without revisiting all the previous entries. Since every post already has metadata (title, publication time, etc.) it was easy enough to add another field (“fmt”) which in my case is just a simple integer identifying how the post contents should be parsed.

Over the years, there have been four formats:

fmt 0 (removed)
Wordpress compat
fmt 1
“Improved” Wordpress format
fmt 2
raw html
fmt 3
LaTeX-like (my current format of choice)

The formats follow a pretty natural progression.

It all started in January 2009 as an import of about 250 posts from Wordpress. Wordpress stores the posts in a html-esque format. At the very least, it inserts line and paragraph breaks. If I remember correctly, one newline is a line break and two newlines are a paragraph break, but otherwise is passes along HTML more or less unchanged. Suffice to say, I was not in the mood to rewrite all the posts in a different format right away so I implemented something that resembled the Wordpress behavior. I did eventually end up converting these posts to a more modern format and then I removed support for this one.

The next format (fmt 1) was a natural evolution of fmt 0. One thing that drove me nuts about fmt 0 was the handling of line breaks. Since the whole point of blahgd was to allow me to compose entries in a text editor (vim if you must know) and I like my text files to be word wrapped, the transformation of every newline to <br/> was quite annoying. (It would result in jagged lines in the browser!) So, about a month later (February 2009), I made a trivial change to the parsing code to treat a single newline as just whitespace. I called this changed up parser fmt 1. (There are currently 24 posts using this format.)

A couple of months after I added fmt 1, I came to the conclusion that in some cases I just wanted to write raw HTML. And so fmt 2 was born (April 2009). (There are currently 5 posts using this format.)

After using fmt 2 for about a year and a half, I concluded that writing raw HTML is not the way to go. Any time I wanted to change the rendering of a certain thing (e.g., switching between <strong> and <b>), I had to revisit every post. Since I am a big fan of LaTeX, I thought it would be cool to have a LaTeX-like markup. It took a couple of false starts spanning about six months but eventually (February 2011) I got the lex and yacc files working well enough. (There are currently 422 posts using this format.)

While I am reasonably happy with fmt 3, but I do see some limitations that I’d like to address. This will inevitably lead to fmt 4. I am hoping to make a separate post about the fmt 3 limitations and how fmt 4 will address them sometime in the (hopefully) near future.

Post Preview

One of the blogs I’ve been reading for a few months now just had a post about partial vs. full entries on blog front pages. Since I have some opinions on the subject, I decided to comment. My response turned into something sufficiently content-full that I decided that my blahg would be a better place for it. Sorry, Chris :P

First of all, my blog doesn’t support partial post display because… technical reasons. (The sinking feeling of discovering a design mistake in your code really resonated with me about this exact thing.) With that said, I don’t think that partial display is necessarily bad. I feel like any reasonable (this is of course subjective) blogging software should follow these rules:

  1. if we’re displaying a atom/rss feed, display full post
  2. if we’re displaying a single post, display full post
  3. if the post contains magical marker that denotes where to stop the preview, display everything above the marker
  4. display full post

I really dislike when the feeds give me the first sentence and I have to click a link to read more. At the very least, it is inconvenient, and in extreeme cases it feels outright insulting.

I think the post-by-post-basis Chris suggests is the way to go, but in the absence of a user-defined division point I would display the whole thing.

Do I write many posts where I wish I could use this magical marker? No. If that were the case, I’d make supporting this a higher priority. However, there have been a handful of times where I believe that the rest of the post is uninteresting to…well…just about everyone and it is really long. So long, that you might get bored trying to scroll past it. (If you are reading my blahg, I don’t want you to be bored because you had to scroll for too long to skip over an entry — you are my guest, and I am here to entertain you.) This is the time I believe displaying a partial post is good.

I’m hoping that eventually I’ll wrestle with my blogging software sufficiently to eliminate the technical reasons preventing me from introducing and processing this special marker. Not that you’ll really notice anything different. :)

Powered by blahgd