TransFS DevBlog RSS Feed

By josh

We recently integrated KISSMetrics tracking into our website (more on that in a future post!)… and I was once again faced with the annoying task of carefully updating the mess of javascript that was cluttering up our page in various places.  We’re data & analytics tracking freaks at TransFS.com… so we have a bunch of different analytics packages in place on our site. Managing all of these packages and keeping the tracking javascript clean & simple had finally become a real pain point for me.

At various times, we’ve had Google Analytics, Clicky, MixPanel, and our own internal FunnelCake tracking code all monitoring our visitor traffic.  Each library requires specific javascript code to be added to the page; some insist that the code be added to the top of the <body> element, while others suggest that it go before the closing </body> tag.  To further complicate matters, we want to be able to trigger a manual track() function to track a specific event.  However, it isn’t always convenient to declare these tracking calls at the bottom of the page, or in a controller, or in any single location.  What if we want to track an action in a view?  How can we be sure that the corresponding library has been initialized at the time when the javascript code is inserted?  What if we want to declare a tracking event in a controller?  How can we easily send the same tracking event to all of the different analytics packages we have installed?

So, in a fury of frustration, I banged out a new rails gem for helping with this exact problem.  I’d like to introduce you to Analytical.  This gem is designed to be an all-purpose analytics front-end.  By providing a simple API to a variety of analytics libraries, it allows you to clean up your views and easily track events in your app without worrying about the details of the underlying analytics apis.  Even better, it gracefully deals with queuing up tracking commands for services, so that the tracking commands are always inserted after the initialization scripts have been added to your page.

So, how does it work?  You start off by adding a single line to your application controller:

    analytical :modules=>[:google, :clicky]

This tells analytical which packages you intend to use. You can define the api keys in config/analytical.yml, like so:

   google:
     key: UA-5555555-5
   clicky:
     key: 55555

Then, in your layout file, you need to add a few hooks so that Analytical can insert any initialization code for your modules:

    <!DOCTYPE html>
    <html>
      <head>
        <title>Example</title>
        <%= stylesheet_link_tag :all %>
        <%= javascript_include_tag :defaults %>
        <%= csrf_meta_tag %>
        <% analytical.identify current_user.id, :email=>current_user.email if current_user %>
        <%= raw analytical.head_javascript %>
      </head>
      <body>
        <%= raw analytical.body_prepend_javascript %>
        <%= yield %>
        <%= raw analytical.body_append_javascript %>
      </body>
    </html>

As you can see, I’ve also added a simple identify call to the template, so that we can tell our analytics modules about the current user (if they support per-user identification, like Clicky/KISSMetrics).

From there, using analytical to track events is dead simple. Track a url with:

    analytical.track '/a/really/nice/url'

And an event with:

    analytical.event 'Some Awesome Event', :with=>:data

… and these can be used anyplace in your app: views, controllers, templates, whatever. Since views are rendered before the layout, your commands will be queued up and then inserted immediately after the corresponding analytics library has been initialized. This produces javascript code on your page that might look something like:

    <!-- Analytical Init: KissMetrics --> 
    <script type="text/javascript"> 
        var .... initialization javascript for KISSMetrics ...
    </script> 
    <script type='text/javascript'> 
        _kmq.push(["record", "Visited Tour", {}]);
    </script>

In this example, analytical.event 'Visited Tour' was called in the view… and the event was queued up so it could be inserted immediately after the KISSMetrics library was initialized.
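To make the queuing idea concrete, here is a minimal sketch in plain Ruby. It is not Analytical's actual implementation: the QueuedTracker class, the tracker.record javascript call, and the makeTracker() init script are all made up for illustration. The point is only that commands issued early are stored, and then written out after the initialization script.

```ruby
# Hypothetical sketch of a command queue; not Analytical's real code.
class QueuedTracker
  def initialize(name, init_script)
    @name = name
    @init_script = init_script
    @queue = []
  end

  # Called from controllers or views: remember the command instead of emitting it.
  def event(name, data = {})
    pairs = data.map { |k, v| "#{k.to_s.inspect}: #{v.to_s.inspect}" }.join(", ")
    @queue << "tracker.record(#{name.inspect}, {#{pairs}});"
    nil
  end

  # Called from the layout: emit the init script first, then flush the queue.
  def body_prepend_javascript
    commands = @queue.dup
    @queue.clear
    [
      "<!-- Init: #{@name} -->",
      "<script>#{@init_script}</script>",
      "<script>#{commands.join(' ')}</script>"
    ].join("\n")
  end
end

tracker = QueuedTracker.new("Example", "var tracker = makeTracker();")
tracker.event("Visited Tour")          # queued while the view renders
puts tracker.body_prepend_javascript   # emitted later, from the layout
```

Because the queue is flushed from the layout hook, it does not matter where in the request the event was declared, as long as it happens before the layout is rendered.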

But what about triggering a specific command that is not one of the standard calls that Analytical knows about (track, event, identify)? No problem… you can call any custom method on a module like so:

   analytical.clicky.special_tracking_method :that_only=>:clicky_supports

If you need to output javascript immediately without queuing it up for later insertion (to use in a javascript callback, for instance)… you can do this using the .now accessor:

   <%= analytical.now.track '/some/url' %>

outputs:

   clicky.log('/some/url');
   googleAnalyticsTracker._trackPageview("/some/url");

In our app, we wrap this into a pair of convenience javascript functions that look like:

	var analytical_track = function(page){
		<%= analytical.now.track('page').gsub(/"page"/,'page') %>
	};
	var analytical_event = function(event, data){
		if (!data) { data = {}; }
		<%= analytical.now.event('event', {}).gsub(/"event"/,'event').gsub(/\"?\{\}\"?/,'data') %>
	};

This allows us to call

analytical_track('/some/url');

from our javascript, anywhere in the page, and it’ll hit every module that we have installed. (Note: we do have to be sure that if we use this javascript helper, that we only call it in ajax situations where the page is certain to be fully baked and all analytics libraries are initialized.)
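The gsub trick in those helpers is easier to see in isolation. The sketch below uses a made-up generated string (tracker.trackPageview is hypothetical) and assumes the generated javascript wraps the argument in double quotes; the placeholder string is rendered once on the server, and then the quotes are stripped so the javascript refers to the function argument instead of a literal.

```ruby
# Hypothetical output of a tracking helper called with the literal string "page".
generated = 'tracker.trackPageview("page");'

# Strip the quotes so the javascript refers to the variable `page` instead.
js_body = generated.gsub(/"page"/, 'page')

puts "var analytical_track = function(page){ #{js_body} };"
# prints: var analytical_track = function(page){ tracker.trackPageview(page); };
```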

That’s about it. We’ve had the gem in production for a while now, and it has really cleaned up our tracking code. However, I’d love to get some feedback on how it could be improved!

As of now, I have implemented support for Google Analytics, Clicky, and KISSMetrics… but the library is extremely easy to extend, so I’m hopeful that the community will add support for other services that people would like to use. You can find the code here: http://github.com/jkrall/analytical, so fork away!

Lastly, I’d like to call out the cool Snogmetrics library for providing a lot of the inspiration for Analytical. I stole some great ideas from that project, and adapted it for our broader needs. Thanks Theo!

By josh

I know I’m a few months late to this party… but I finally jumped aboard the Ruby Version Manager (RVM) train today, and wow is it awesome!

I had to wipe my main development machine clean today, and start from scratch… and I was dreading the prospect of digging through a million blog posts to determine which ruby version I should install, grabbing tar files, making sure that I’m on the right patchlevel, etc. Enter RVM…

Ruby Version Manager is a really simple set of scripts that manages your ruby environment, allowing you to install multiple versions of ruby with incredible ease. The source lives here: http://github.com/wayneeseguin/rvm, and you get up and running with the following simple command (copied directly from their great install page):

mkdir -p ~/.rvm/src/ && cd ~/.rvm/src && rm -rf ./rvm/ && git clone --depth 1 git://github.com/wayneeseguin/rvm.git && cd rvm && ./install

Once you install RVM… you can install a ruby version with:

rvm install 1.8.7

It’s that easy! The ruby tar file will be downloaded on-the-fly, compiled, and installed into a sandbox area in your ~/.rvm. You can pick macruby, jruby, 1.8.6, 1.8.7, whatever you want…

To begin using a specific installed version:

rvm --default 1.8.7

The --default flag makes it sticky so that your shells will use this version by default. Now open a new shell, and ruby will be pointing at the correct version:

% which ruby
/Users/krall/.rvm/rubies/ruby-1.8.7-p249/bin/ruby
% ruby -v
ruby 1.8.7 (2010-01-10 patchlevel 249) [i686-darwin10.2.0]

Even better: No more installing gems with “sudo gem install …”! Each ruby version comes with its own set of gems, all residing in a tidy directory structure inside .rvm. So, you can install gems at-will, and they will link up with the correct ruby version at all times.

Great stuff. Thanks Wayne!

By josh

SEOpener, a can opener for SEO

For the past year or so, we’ve been tracking our SEO progress with a tool that we developed internally. The idea was to keep an eye on the daily movement of our domain in the search results for various terms. We also wanted to collect some general statistics on these terms, like how much traffic they receive, the PageRank of the top ranking site, and advertising CPC estimates. All of this was rolled into a set of controllers, models, views, and background-processing code that I’ve now extracted out into a new rails engine plugin: SEOpener!

SEOpener combines data from the Google Ajax Search API, with other datasources such as Yahoo’s keyword estimation tool, and google toolbar pagerank queries, to try to give you a complete picture of how you are positioned on a given search term. It provides a simple admin interface for listing all of your terms, adding new ones, and viewing the top 64 search results for a given term.

SEOpener is designed to be a drop-in solution for existing apps, providing everything you might need to get started in tracking your site’s search rankings. For the most part, it should be fairly easy to install. However, you will need to come up with a background processing solution (you can use cron & workling/starling like we do, or something else) to run the data-scraping jobs at regular intervals.

SEOpener Admin Interface

We’ve been pretty happy with this tool, and have been running it in production for some time now…. but it would be great to get help from the community. Leave a comment if you have any ideas, or even better, fork us and hack away on it!

By josh

We use Google Apps for all of our emailing… and for the most part it works great. However, dealing with their outbound smtp relay can be a real pain. The biggest problem is that it only allows you to send an email if the From header matches the account that is used for authentication. (Note: I’m not suggesting that this is a bad practice, or something wrong with Google… it makes total sense.)

Because of this, I needed to set up different smtp account settings for one of our ActionMailer subclasses than those used by the rest of our site. Unfortunately, because of the way ActionMailer is designed… this isn’t exactly an easy task!

I found this article, which gave me some clues about how to do this. Here’s the solution I came up with:

class MailerWithCustomSmtp < ActionMailer::Base
  SMTP_SETTINGS = {
    :address => "smtp.gmail.com",
    :port => 587,
    :authentication => :plain,
    :user_name => "custom_account@transfs.com",
    :password => 'password',
  }

  def awesome_email(bidder, options={})
    with_custom_smtp_settings do
      subject       'Awesome Email D00D!'
      recipients    'someone@test.com'
      from          'custom_reply_to@transfs.com'
      body          'Hope this works...'
    end
  end

  # Override the deliver! method so that we can reset our custom smtp server settings
  def deliver!(mail = @mail)
    out = super
    reset_smtp_settings if @_temp_smtp_settings
    out
  end

  private

  def with_custom_smtp_settings(&block)
    @_temp_smtp_settings = @@smtp_settings
    @@smtp_settings = SMTP_SETTINGS
    yield
  end

  def reset_smtp_settings
    @@smtp_settings = @_temp_smtp_settings
    @_temp_smtp_settings = nil
  end

end

The idea here is simple… before sending the mail, set the @@smtp_settings to whatever it needs to be. Then, after delivering the email, set it back to whatever it was. Hope this helps someone else!
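The same swap-and-restore pattern can be shown without ActionMailer at all. The sketch below is a variation, not the mailer code above: FakeMailer is a made-up stand-in class, and it restores the settings with an ensure clause rather than in deliver!, so the original settings come back even if the block raises.

```ruby
# Stand-in for a mailer class with class-level settings (hypothetical, not ActionMailer).
class FakeMailer
  @@smtp_settings = { :address => "default.example.com" }

  def self.smtp_settings
    @@smtp_settings
  end

  # Swap in custom settings for the duration of the block, then restore them,
  # even if the block raises.
  def self.with_smtp_settings(custom)
    previous = @@smtp_settings
    @@smtp_settings = custom
    yield
  ensure
    @@smtp_settings = previous
  end
end

FakeMailer.with_smtp_settings(:address => "smtp.gmail.com") do
  puts FakeMailer.smtp_settings[:address]   # custom settings inside the block
end
puts FakeMailer.smtp_settings[:address]     # original settings afterwards
```

Either way, the approach mutates shared class-level state, so it is not safe if several threads send mail at the same time.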

By josh

In our recent switch to EngineYard Cloud… one of the most annoying problems I ran into was getting our WordPress blog configured properly so that it would play nicely with our Rails app. In our previous hosting environment, we ran with an Apache setup… and we used a simple virtualhost config that looked something like this:

  Alias /blog /var/www/apps/tfs_blog
  <Directory /var/www/apps/tfs_blog>
  PassengerEnabled off
  allow from all
  AllowOverride All
  </Directory>

With the move to EngineYard, we have made the switch to Nginx… because, after all, that’s what all the cool kids are using these days. I admit that one reason I was looking forward to Nginx was to rid myself of the awful apache config file syntax. However, I quickly learned that Nginx (or at least the version that is installed by default on the EY Cloud images) requires its own voodoo tricks in order to do seemingly simple things.

In this case, all we want to do is host our blog, alongside our rails app, at /blog. (and of course, this blog is hosted at /devblog) Unfortunately, getting this working properly required endless hours of scouring google for nginx config snippets… until I finally landed on the following, working setup:

location /blog/ {
  alias /data/tfs_blog/;
  index index.php index.html index.htm;
  if (-e $request_filename) {
    break;
  }
  rewrite ^/blog/(.+)$ /blog/index.php?q=$1;
}
location = /blog {
  fastcgi_pass 127.0.0.1:1027;
  fastcgi_param SCRIPT_FILENAME /data/tfs_blog/index.php;
  include /etc/nginx/common/fcgi.conf;
}
location ~ /blog/?.*\.php$ {
  if ($fastcgi_script_name ~ /blog/?(.*)$) {
    set $valid_fastcgi_script_name /$1;
  }
  fastcgi_pass 127.0.0.1:1027;
  fastcgi_index index.php;
  fastcgi_param SCRIPT_FILENAME /data/tfs_blog$valid_fastcgi_script_name;
  include /etc/nginx/common/fcgi.conf;
}

The worst thing about this, aside from the complexity required to get such a simple thing working, is that the error condition you get when the nginx fastcgi setup is misconfigured is incredibly obscure: “No input file specified.” This apparently means that fastcgi is not receiving the SCRIPT_FILENAME parameter properly… but the actual cause of this can be anything from misconfigured permissions to location paths that aren’t grabbing the proper filename from the request uri. As you can see, the final result requires manually capturing the php filename from the uri path using a regex, and then setting the SCRIPT_FILENAME env variable accordingly.
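If the regex capture is hard to picture, here is the same mapping as a small Ruby sketch. It only mirrors the pattern and docroot from the config above; the script_filename helper is made up for illustration and has nothing to do with how nginx evaluates the directive internally.

```ruby
DOCROOT = "/data/tfs_blog"

# Mirror of the nginx regex: capture everything after the /blog prefix,
# then append it to the docroot to get the path handed to fastcgi.
def script_filename(script_name)
  relative = script_name[%r{/blog/?(.*)$}, 1]
  return nil if relative.nil?
  "#{DOCROOT}/#{relative}"
end

puts script_filename("/blog/wp-login.php")        # /data/tfs_blog/wp-login.php
puts script_filename("/blog/wp-admin/index.php")  # /data/tfs_blog/wp-admin/index.php
```

If the capture comes back empty or the prefix is wrong, the resulting path points at a file that does not exist, which is one way to end up with the “No input file specified.” error.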

It’s also worth mentioning that this snippet:

  if (-e $request_filename) {
    break;
  }
  rewrite ^/blog/(.+)$ /blog/index.php?q=$1;

… from the first block is required to make wordpress’s pretty urls map correctly to the cgi handler.

That sure was a lot of work! Hopefully this post will save someone else all the pain and headache… or maybe someone will stumble across this and show me how it can all be done in a couple of simple config lines.

By josh

Engine Yard

We recently underwent a pretty big change here at TransFS.com… all entirely “under the hood”. We decided to migrate from our old hosting setup at Joyent, to our new home at EngineYard Cloud. We’re now running on Amazon EC2 instances, all transparently and easily managed by the excellent admin tools provided by Engine Yard.

Why the switch?

I had several reasons for deciding that Joyent was no longer working for us. First, the Solaris architecture that Joyent uses provided no obvious benefits, and a surprising number of headaches. Solaris has most of the same tools as linux… but many of them take different command-line arguments or are annoyingly lacking in features. For example: “grep -R” for recursive search through files, doesn’t exist on Solaris. Why? I have no idea. But these types of things can become a real pain over time, and while they seem small, they really do matter.

Secondly, Joyent provides no automatic backup solution for your data. I was very surprised when we first switched to Joyent that I had to roll my own backup solution with mysqldump, rsync, and a script that uploaded to Amazon S3. I never felt good about this, because backups are so important, and because it isn’t an area that I’m very confident in… and I was never able to properly test that my backups would save us in the case of massive data loss. When selecting a hosting provider, I really want someone to hand me a pre-built backup solution that I can trust uses industry best-practices and has been carefully tested. EngineYard provided me with much more confidence on this front, and automated daily or hourly backups are dead-simple to set up (no hacking of my own scripts!)

Finally, we have been having some real problems with the memory footprint of our app. It has ballooned to over 150MB per rails instance, sometimes as high as 250MB. Between our Passenger children, background workers, mysql, etc… we kept bumping up against the RAM limit on our Joyent slice. This was obviously our fault, but the process for dealing with it with Joyent made it a real pain to solve. Moving to a two-slice setup, where our mysql db lived on a separate server was an obvious solution… but provisioning and setting up a whole new server was something that I didn’t have time for, and all Joyent really offered was to clone my existing setup. I would have had to deal with everything else myself, and suffer through a lot of downtime while I figured it out. Even switching to a Joyent slice with more RAM was not an instantaneous process. I would have needed to create a support ticket, and then wait for someone to get back to me and clone us over to a new “accelerator” slice. I’m sure this would have worked… but it isn’t nearly as convenient as it should be, in the age of push-button admin interfaces like those provided by EngineYard Cloud. In fact, with EY Cloud, I was able to switch to running our db on a separate server with a single checkbox and re-deploy. Very cool. We’re now running safely under the RAM threshold of the smallest EC2 instance, and I can very easily scale up to more servers or larger servers by selecting a different server type in an EngineYard dropdown.

Custom Chef Recipes

It actually took us a lot longer to switch from Joyent than I had hoped. I began the process of investigating EY Cloud a few months ago… and had delayed the process for a while because I had some real concerns about some parts of our setup. Gems, packages, etc are all super-simple to set up in EY Cloud… but certain things are not as easy as a “checkbox”. For instance, we run Starling & Workling for background processing at TransFS.com. Getting these set up in EY Cloud was non-trivial, because it required writing a custom chef recipe. Chef is a really amazing server configuration tool that uses a ruby DSL to define changes to your server config. EngineYard has some decent chef documentation, but it still took some time to learn how Chef works before I was comfortable enough with it to get our servers set up properly.

However, once you learn to be a master Chef… it gives you amazing power to set up your server exactly the way you want, and in a totally repeatable way. In fact, when I need to add a new application server to our cluster… it should be as simple as flipping a switch, and all of my custom configuration settings will be applied automatically to bring that new server into the proper state for TransFS.com. Very cool. Learning Chef is a necessity to getting up and running with EngineYard Cloud… but it is worth the effort.

There are a few other quirks to EY Cloud that took some work to figure out, such as migrating capistrano hooks… but overall I was very pleased with the process. I had us up-and-running on the new servers in about 30 hours, start to finish, and we only had about 30 minutes of real downtime while I rsync’d our data over.

Anyway, thanks EngineYard for a great experience so far. And thanks as well to Joyent for hosting us through our early days… we’ll miss you, but we’ve found a new home that is much better suited to our needs!

Coming up in the next post: some lessons learned about Nginx, Rails, and WordPress…

By josh

I recently discovered a simple, but really great, ruby idiom… that I’m not sure how I’ve lived without for all this time. You can query strings with a regexp using the [] array operator:

"Grab the number 555 from this string"[/\d+/]
"Grab the number 555 from this string"[/number (\d+)/, 1]

This will return nil, if the regexp is not matched, and the matched text otherwise. You can even supply an index number as a second argument, to return a particular matched substring.
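Here is a slightly fuller example of the same String#[] calls, including the no-match case and a fallback value:

```ruby
text = "Grab the number 555 from this string"

p text[/\d+/]              # => "555"
p text[/number (\d+)/]     # => "number 555"
p text[/number (\d+)/, 1]  # => "555"
p text[/no match here/]    # => nil

# Handy for pulling a value out with a fallback:
count = (text[/\d+/] || "0").to_i
p count                    # => 555
```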

So cool. It is this kind of amazing syntactic sugar that makes Ruby a joy to work with.

By josh

We use a lot of git submodules at TransFS.com. Nearly all of our rails plugins are installed via submodules, as is rails itself. This works very well for keeping multiple repositories synced up with each other… when it works. The biggest problem with git submodules, in our experience, is that they don’t automatically stay synced up when you pull or checkout different branches. Consider this scenario:

  1. I make a change to our funnelcake plugin, and push that change to the github repo
  2. I then jump back to our main TransFS repo, and commit the updated funnel_cake submodule
  3. I push that change to the TransFS repo… and call it a day
  4. Later that day, one of my fellow developers pulls the latest code from github
  5. He forgets to run “git submodule update”
  6. He codes away, commits, and pushes his changes back to github

Now we have a problem… because the latest changes from my colleague have bumped the funnel_cake submodule backwards to its previous state. So, even though the funnel_cake repository has newer code in it, we’re pointing to the submodule commit hash with the older code.

The best trick we’ve found for solving this problem is simple: use a git alias for all pulls and checkouts. Here’s what I added to my .gitconfig:

[alias]
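        # wrapped in sh -c so that extra arguments go to git pull, not to the trailing git status
        pullup = !sh -c 'git pull $@ && git submodule update && git status' -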
        checkup = !sh -c 'git checkout $1 && git submodule update && git status' -

Now, I can run:

git pullup transfs

instead of:

git pull transfs

… and I can be sure that my submodules will be updated properly (and also get a handy status of the repo at the same time).

Same thing with:

git checkup new_fancy_branch

instead of:

git checkout new_fancy_branch

It seems to work well! Now if we could just get everyone to remember to use these commands…
Do you have any good tricks for avoiding these problems with git submodules?

By josh

Here at TransFS.com, we use Selenium tests to verify most of the mission-critical parts of our website.  While a quality RSpec test suite is also very important, there is nothing like an end-to-end integration test that simulates real user behavior to give you confidence that your code is bug-free.  Since we run Ruby on Rails, we take advantage of the handy Selenium On Rails plugin… or more specifically, my fork of this plugin with some of our own modifications.

Why do we run our own custom fork?  There are a few reasons, but originally it had to do with our need for some custom hooks that our CruiseControl.Rb continuous integration setup uses to take screenshots of the tests in-action.  (A topic for another post!)  Most recently, however, I needed to hack the plugin to provide a basic workaround to testing code that uses SWFUpload for file uploading.

For the uninitiated, SWFUpload is a nice Flash app that provides javascript hooks for  file-selection and uploading with support for multiple files at once.  It is a nice solution for web forms that need to upload more than one file, if you are willing to deal with the added cost of adding Flash to your page.  It isn’t ideal in all cases, but it has worked OK for us thus far.  However, testing SWFUpload is a real pain… because Selenium cannot trigger events on flash movie elements.

One solution for this would be to simulate native-OS keyboard and mouse events, and add test commands to click the “upload” button and select a file.  In our CI test environment, however, this was not a good option.  Instead, I’ve opted to test the POST action on the server only… simulating the file upload without actually triggering SWFUpload to do the work.  To accomplish this, I added a custom selenium action that dynamically inserts a form and file field that are targeted at the corresponding SWFUpload endpoint url.

Here’s what it looks like:

// Custom method for uploading a file, simulating SWFUpload
Selenium.prototype.doSwfUpload = function(selector, filename) {
 
	// Grab the SWFUpload element from the DOM
	var doc = this.browserbot.getCurrentWindow().document;
	var swf_uploader = Element.select($(doc.body), selector)[0];
 
	// Slurp the list of params from SWFUpload, and parse them into a Hash
	var flashvars_s = swf_uploader.down('param[name=flashvars]').value;
	var flashvars = {};
	$A(flashvars_s.split(/&/)).each(function(kv){
		var key = unescape(kv.split(/=/)[0]);
		var value = unescape(kv.split(/=/)[1]);
		flashvars[key] = value;
	});
	var params_array = $A(decodeURI(flashvars.params).split(/&amp;/));
	var params = new Hash({});
	params_array.each(function(kv){
		var key = decodeURI(kv.split(/=/)[0]);
		var value = decodeURI(kv.split(/=/)[1]);
		params.set(key, value);
	});
	params.unset('format'); // Remove the format param, since we don't want to request as json
 
	// Grab the SWFUpload form from the hidden IFrame,
	// and insert the key/value params into the form
	var faker_form = Element.select($$('#selenium_fileupload_iframe')[0].contentDocument.body, '#swfupload_faker_form')[0];
	params.each(function(kv) {
		Element.insert(faker_form, { bottom: '<input type="hidden" name="'+kv.key+'" value="'+kv.value+'" />' });
	});
 
	// Assign the selected file to the file field
	netscape.security.PrivilegeManager.enablePrivilege("UniversalFileRead");
	this.browserbot.replaceText(faker_form.down('#swfupload_faker_file'), filename);
 
	// Assign the URL and submit the Form
	faker_form.action = flashvars.uploadURL;
	faker_form.submit();
 
	// Clean up the params
	Element.select(faker_form, 'input[type=hidden]').each(function(e){
		e.remove();
	});
 
	// Retarget the IFrame back to the fileupload frame
	$$('#selenium_fileupload_iframe')[0].contentWindow.location = "TestRunner-fileupload.html";
};

In order to make this work, we also need a new <iframe> added to the selenium-core frameset:

<iframe name="selenium_fileupload_iframe" id="selenium_fileupload_iframe" src="TestRunner-fileupload.html" style="width: 1px; height: 1px"></iframe>

With these changes… we now have an .rsel command in our tests that looks like this:

swf_upload '#SWFUpload_0', "#{RAILS_ROOT}/public/images/test-upload.png"

What this will do is:

  1. Query the SWFUpload <object> element, and grab the <param> elements from inside it
  2. Parse out these parameters, adding them as <input type="hidden"> tags to the hidden form IFrame
  3. Submit the hidden form, POSTing the file contents and parameters to the same action as the SWFUpload form
  4. … and clean itself up

While not a perfect solution… this allows us to test the server-side response to file uploads, and that’s what we cared about most. If you are using SWFUpload in your application, and have run into the same problem, I’d love to hear how you solved it!

By josh

Last March, I wrote about an internal sales funnel visualization and tracking project that we call FunnelCake.

In recent weeks, we’ve made some significant upgrades to this system… making it a much more comprehensive tool for visualizing our sales funnel. Much of the code was refactored (or rewritten from the ground up), and the codebase is now cleaner and more flexible. In addition, we added a bunch of new features, including:

  • Revised conversion stats logic to make numbers more consistent and easier to read
  • Viewing of funnel stats using weekly, bi-weekly, and monthly time windows (fixed to the calendar year, rather than the less-useful “past 14 days” window that was used previously)
  • RESTful design for funnel stages, individual states, visitors, etc
  • Dashboard view for at-a-glance view of primary funnel stage stats
  • Graphing! All funnel stages now show graphs of historical conversions
  • Customized dashboards can be easily built using simple “widget” partials for viewing specific stats, graphs, diagrams, tables, etc
  • All widgets are fully ajax-updating, making the page much more responsive
  • Filtering! All funnel stats can now be viewed through landing-page, referer, and visited-page filters.
  • Caching: all stat calculation methods now use the Rails cache for storing commonly accessed data points… making switching between stats much more efficient.
  • Drill down to individual visitors and view their event path through the site

Overall, these changes have made this a much more useful tool for TransFS.com. We’re able to keep track of our funnel on a daily basis using the simple dashboard, while also having the ability to dig in and investigate specific visitor segments and campaigns. FunnelCake will allow us to answer more complicated questions like: “Do the customers who arrive at our site via google adwords behave differently than those who arrive via organic search?” and “Which step in our signup process is the least effective?”
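As a rough illustration of that last question, here is a small Ruby sketch that computes step-by-step conversion rates and picks out the weakest step. The stage names and counts are made up purely for illustration, and this is not FunnelCake's actual stats logic.

```ruby
# Made-up stage counts, purely for illustration.
stages = [
  ["visited_home", 1000],
  ["started_signup", 200],
  ["completed_signup", 150],
  ["requested_quote", 45]
]

# Conversion rate of each step = visitors reaching the next stage / visitors in this stage.
steps = stages.each_cons(2).map do |(from, from_count), (to, to_count)|
  { :from => from, :to => to, :rate => to_count.to_f / from_count }
end

steps.each do |s|
  puts format("%s -> %s: %.1f%%", s[:from], s[:to], s[:rate] * 100)
end

weakest = steps.min_by { |s| s[:rate] }
puts "Least effective step: #{weakest[:from]} -> #{weakest[:to]}"
```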

Here are some new screenshots of the TransFS.com FunnelCake dashboard:

Funnel Dashboard

Funnel Stage Detail

Funnel Overview

FunnelCake is available as an open-source Rails plugin for anyone to investigate, use, and improve upon.  You can find my GitHub repository here.  If you have any ideas on how to make it better… fork away and send me a pull request!