Archive for the 'Programming' Category

Boost Rails performance by generating constants

Thursday, March 11th, 2010

This may be obvious to everyone already, but it was new to me.

Serendeputy hits a few Tokyo Cabinets when it assembles a page. I keep all the tag metadata in the main librarian cabinet, and the site looks up this data pretty often. I was having trouble with these lookups sometimes getting hung up, especially when fetching some of the ancillary information. When that happened, the Passenger Rails process would hang and the site would return 500 errors until I restarted the process. Ugh.

So, I did a couple of things (which I’ll likely talk about later). The one that helped most was probably the simplest.

I ended up adjusting the librarian build to not only populate the appropriate cabinet, but to also generate a Ruby source code file that declares a constant holding the tag metadata hash. I then restart Passenger to pick up the new constant.

Now, Rails only loads this once, and it’s in memory for the rest of the time. This not only saves me the network latency, it makes the individual responses much faster.
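
A minimal sketch of the idea, assuming the tag metadata is a plain hash of strings and symbols. The names `TAG_METADATA` and `write_constants_file`, the sample data, and the file location are all hypothetical; this is not the actual Serendeputy build code.

```ruby
require "tmpdir"

# Hypothetical sample of what the build might hold for each tag.
tag_metadata = {
  "ruby"  => { :label => "Ruby",  :parent => "programming" },
  "emacs" => { :label => "Emacs", :parent => "tools" }
}

# Write a Ruby source file that declares the hash as a frozen constant.
# Hash#inspect emits valid Ruby literal syntax for strings, symbols,
# numbers, arrays and nested hashes, so the file can simply be loaded.
def write_constants_file(path, metadata)
  File.open(path, "w") do |f|
    f.puts "# Generated by the build. Do not edit by hand."
    f.puts "TAG_METADATA = #{metadata.inspect}.freeze"
  end
end

constants_path = File.join(Dir.tmpdir, "tag_metadata_constants.rb")
write_constants_file(constants_path, tag_metadata)

# The app loads the file once at boot; after that, every lookup is a
# plain in-memory hash read.
load constants_path
puts TAG_METADATA["ruby"][:label]
```

The trade-off is the one described above: the data is only as fresh as the last build, so the app has to be restarted to pick up a regenerated file.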

It’s always fun to solve complex problems with simple brute-force actions.

Make everything as simple as possible but no simpler

Tuesday, January 26th, 2010

After living with Serendeputy for the past year and a half, I’ve been able to reduce all the massive complexity of the application into just three core concepts: the gesture, the profile and the list. By making all the interactions on the site and through the API go through these three primitives, I’ve been able to solve the scalability issues and most of the performance issues inherent in mass personalization. Now, it’s like Legos. Shiny, geeky Legos.

Thank goodness Ruby and Tokyo Cabinet are so flexible. If I’d been doing this using the tools of five years ago, I would not have been able to pivot this smoothly.

Look for a couple of fun announcements in the next [real soon now] weeks.

(And yes, Tokyo Cabinet is probably one of the geekiest topics I have up on the site, right up there with Machine Learning.)

I hope Yahoo BOSS doesn’t go away

Thursday, July 30th, 2009

I implemented Yahoo BOSS as the backstop search engine for Serendeputy a couple of months ago.

When you do a search, I first look to see if I have a match among the pre-compiled topics. If I do, I bring you directly there. If there is no direct match, I call Yahoo BOSS’s news search and show you those results.
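
The lookup order can be sketched roughly like this. Everything here is hypothetical: the `TOPICS` table, the `search` method, and the stub lambda that stands in for the external news search call (no real BOSS API is used or implied).

```ruby
# Hypothetical pre-compiled topics, keyed by normalized query.
TOPICS = {
  "red sox" => "/topics/red-sox",
  "ruby"    => "/topics/ruby"
}.freeze

# Returns [:topic, path] on a direct match. Otherwise calls the fallback
# (standing in for an external news search) and returns
# [:results, whatever_the_fallback_returned].
def search(query, fallback)
  key = query.strip.downcase
  if TOPICS.key?(key)
    [:topic, TOPICS[key]]
  else
    [:results, fallback.call(key)]
  end
end

# Stub in place of the real search API call.
stub_news_search = lambda { |q| ["(news result for #{q})"] }

p search("Red Sox", stub_news_search)
p search("tokyo cabinet", stub_news_search)
```

Passing the fallback in as a callable keeps the external service swappable, which matters if the backstop engine ever has to change.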

But now, Microsoft is handling all the search for Yahoo. What does this mean for the future of BOSS? I hope they keep it up and running. Here is the current news.

UPDATE: It looks like Yahoo’s developers don’t know what’s happening with it either.

What specifically does it mean for BOSS? Honestly the team is still absorbing the implications and we just don’t know. We can tell you that BOSS will remain live for the time being. There are many aspects still to be considered. Over the next several days we’ll be working hard to get clarity and will update the community as soon as we can.

Workaround for IE7 mouseout bug

Tuesday, June 30th, 2009

I had a beautifully crafted site (Serendeputy, in case you were wondering) that worked in all the browsers. Then, I noticed that in Internet Explorer 7, the mouseover and mouseout actions weren’t working correctly. Specifically, the mouseover event would fire correctly, adding a hover class that exposed additional information; unfortunately, the mouseout action was way too eager: it would fire when the mouse left any text, not when it left the div.

The users couldn’t access the items the hover class exposed. As soon as they went to click on them, the items disappeared. It is not nice to cruelly taunt your users.

Anyway, Google let me down, and I couldn’t find any good workarounds. So, this is what I did.

I ended up using a different class for IE, despite the icky duplication it caused in the stylesheet. This is what the original beautifully-simple jQuery code looked like:


$("div.story").mouseover(function(e) {
  $(this).addClass("hover");
});

$("div.story").mouseout(function(e) {
  $(this).removeClass("hover");
});

This code worked in every browser but IE7. Here’s what I did for IE7:


$("div.iestory").mouseover(function(e) {
  $(".iestory").removeClass("hover");
  $(this).addClass("hover");
});

So, this removes the class from all possible divs as you enter a new one. It then lights up the hover for the current div. This is hitting a fly with a sledgehammer, but it was the simplest thing I could think of that would possibly work. You don’t really notice the speed hit unless you have tons of these possible divs on the page.

If there’s a better way to do this, please leave it in the comments. If not, then I hope you find this useful.

Built out the vocabulary engine

Wednesday, June 24th, 2009

Today has been one of my more exciting days building. I finally finished up my vocabulary engine.

The vocabulary engine lets me fine-tune my classification engine topic-by-topic and source-by-source. This will allow me to do some pretty sophisticated disambiguation, and I hope that it will make document classification all the more effective.

This is still a hand-defined engine, but I also wrote in the hooks for the machine-learning piece. That’s still a ways away, though.

I’ve been an IA geek for a little over fifteen years at this point. Having the system of my dreams is pretty cool. It’s pretty rewarding to have the system you’ve had in your head for a long time actually exist in the real world. I’m not quite a sculptor, but I have to imagine the feeling is the same.

Tables in Emacs Org Mode

Friday, May 29th, 2009

I run my entire life in Emacs Org Mode. I’ll write up more on that later, but for now, check out this little trick.

You can create tables in Emacs Org Mode by just starting a line with a pipe (|). Then, just add in another pipe when you need to add columns. When you add new lines, it automatically resizes all the columns to the length of the data.

This is incredibly handy when you need to keep a running tab of something in the context of your to-do. (Like, say, if your document server is constantly leaking memory…)

Another little delight of working with Emacs (Aquamacs, in my case).

How to prevent browsers from caching a page in Rails

Tuesday, April 14th, 2009

This took me forever to figure out, so I hope I’ll be able to save someone a few hours of annoyance someday.

Serendeputy is always recalculating, and I needed to make sure that the browsers wouldn’t cache the page when someone clicked off and then hit the back button. This is how I was able to do it.


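# in application_controller.rb
class ApplicationController < ActionController::Base
  # Run before every action (Rails 4 and later call this before_action)
  before_filter :set_cache_buster

  # Tell browsers and proxies not to cache or store the response
  def set_cache_buster
    response.headers["Cache-Control"] = "no-cache, no-store, max-age=0, must-revalidate"
    response.headers["Pragma"] = "no-cache"
    response.headers["Expires"] = "Fri, 01 Jan 1990 00:00:00 GMT"
  end
end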

I just tested this out, and it works on Safari and Firefox on the Mac, and IE7, Firefox and Chrome on the PC.

Hope this helps.

Where am I?

Thursday, April 9th, 2009

Ahh, spring is here. The Red Sox have opened up, the tulips are coming out of the garden, and I’m making good progress on Serendeputy. We (believe it or not) are getting close.

I’ve built the servers out, and they’re up and running now, burning in and breaking in fun new ways. I knew nothing about systems administration, so it’s been a bit of a journey from bare Linux installs to fully-functioning (and even reasonably snappy) servers. My librarian application has been running for a couple of months, the deputy and memcached servers for a couple of weeks. I’m working on the Rails application now, and futzing around with jQuery. I’m on version 0.5 of the application, up to check-in 913 in Subversion, and up to Bug 171 in FogBugz.

I’m going to launch a private-invite beta starting with version 0.7. 0.8 will be the public beta, and I hope to be at 1.0 within a few months. After that, the world.

Thanks for keeping in touch. I’ll write more soon. (Especially about the imploding newspaper industry. I’m not an insider anymore, but I still know how things work. It’s as if the NAA has turned into a giant suicide pact.)

Getting through the Dip

Monday, February 23rd, 2009

So, right now I’m in the middle of a serious re-write of my librarian application (the piece that talks to the rest of the world). It’s moving in the right direction, and it will ensure that the whole building won’t fall over on the first day, but it’s been a horrible slog.

I’ve decided to look at it this way, though: I’m building distance between me and potential competitors. I’ve been deeply involved in this enough to know that it’s something that a YCombinator kid can’t clone in a weekend. (Or so I hope.)

1000.times do
  puts "It's important to keep going through the dip."
end

I also need to re-read my review of The Dip.

The sublime joy of bug-fixing

Friday, September 26th, 2008

Find a bug. Write a test to reproduce it. Fix it. Watch the test pass. Check in the changes.

Realize you’re one step closer to perfection.

How to reload a class in irb

Tuesday, September 16th, 2008

When I’m working interactively in irb and tweak a class, the changes won’t be reflected in the irb session until I reload the class.


irb >> load 'document.rb'

Make sure to add the file extension. Unlike require, load needs the full filename.
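
To see the difference between the two, here is a small sketch that can be pasted into irb. It assumes a scratch directory and a throwaway `document.rb` with a hypothetical `Document` class, written and rewritten from the script itself so the example is self-contained.

```ruby
require "tmpdir"

# Work in a scratch directory with a hypothetical document.rb.
dir  = Dir.mktmpdir
path = File.join(dir, "document.rb")

File.write(path, "class Document; def title; 'first draft'; end; end\n")
require path                      # require works with or without ".rb"
puts Document.new.title           # first draft

# Edit the class on disk, as you would in your editor.
File.write(path, "class Document; def title; 'second draft'; end; end\n")

puts require(path)                # false: already required, file is not re-read
puts Document.new.title           # still first draft

load path                         # load always re-executes the file
puts Document.new.title           # second draft
```

Note that `load` reopens the class rather than replacing it, so methods deleted from the file stay defined in the session until irb is restarted.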

How to test tag attributes using assert_select

Wednesday, September 10th, 2008

There’s undoubtedly a smarter way to do this, but I couldn’t figure it out. I’m doing a functional (controller) test for one of my rails pages. I want to make sure that the picture coming back is exactly 224 pixels high.

I couldn’t find a way to do this using the assert_select syntax, but this workaround seems to work:


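  def test_picture_height
    # the selector itself asserts that an img with a height attribute exists
    assert_select "div.secondary div.lead_picture img[height]" do |elements|
      # stringify the match -- this gives you the whole img tag
      tag = elements.to_s
      # extract the value of the height attribute
      height = tag.gsub(/.*height="(.*?)".*$/, '\1')
      # run the test (expected value first, then actual)
      assert_equal "224", height
    end
  end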

This can be tightened up, but I left it explicit so that I could understand what I was doing. The assert_select makes sure that I’m setting the height; the block makes sure that it’s equal to 224.
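
One way the extraction step might be tightened, shown in plain Ruby outside of Rails. The `tag` string and the `attribute_value` helper are hypothetical; the string stands in for whatever the matched element stringifies to inside the block.

```ruby
# Hypothetical img tag, standing in for the stringified match.
tag = '<img src="/images/lead.jpg" height="224" width="300" />'

# String#[] with a regexp and a capture index returns just that group,
# or nil when the attribute is missing.
def attribute_value(tag, name)
  tag[/\b#{Regexp.escape(name)}="(.*?)"/, 1]
end

puts attribute_value(tag, "height")
```

Unlike the gsub approach, this returns nil rather than the whole tag when the attribute is absent, so a failing assertion message is easier to read.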

Are there easier ways to do this?

How to get Subversion to ignore your Rails log files

Friday, September 5th, 2008

Rails puts the log files in the same part of the tree that ends up under source control. If you don’t tell Subversion to ignore them, you’ll end up with all the log files under source control, which will drive you nuts.

If you already have them in the system:


cd YOUR_RAILS_APP/log
ls -l # Make sure you're in the right directory
svn --force rm * # Force Subversion to kill those files
cd .. # Move back to the rails root level
svn propset svn:ignore "*.log" log

Hope this helps.

Revving the prototypes

Monday, August 25th, 2008

I just started a new Subversion repository, taking to heart Fred Brooks’ advice to “Build one to throw away.”

Of course, I’m planning on building four or five to throw away. If things go extremely well, I’ll have something release-worthy by prototype six…