TransFS DevBlog RSS Feed
For the past year or so, we’ve been tracking our SEO progress with a tool that we developed internally. The idea was to keep an eye on the daily movement of our domain in the search results for various terms. We also wanted to collect some general statistics on these terms, like how much traffic they receive, the PageRank of the top-ranking site, and advertising CPC estimates. All of this was rolled into a set of controllers, models, views, and background-processing code that I’ve now extracted into a new Rails engine plugin: SEOpener!
SEOpener combines data from the Google Ajax Search API with other data sources, such as Yahoo’s keyword estimation tool and Google Toolbar PageRank queries, to try to give you a complete picture of how you are positioned on a given search term. It provides a simple admin interface for listing all of your terms, adding new ones, and viewing the top 64 search results for a given term.
SEOpener is designed to be a drop-in solution for existing apps, providing everything you might need to get started tracking your site’s search rankings. For the most part, it should be fairly easy to install. However, you will need to come up with a background processing solution (you can use cron & workling/starling like we do, or something else) to run the data-scraping jobs at regular intervals.

SEOpener Admin Interface
We’ve been pretty happy with this tool, and have been running it in production for some time now… but it would be great to get help from the community. Leave a comment if you have any ideas, or even better, fork us and hack away on it!
We use Google Apps for all of our emailing… and for the most part it works great. However, dealing with their outbound smtp relay can be a real pain. The biggest problem is that it only allows you to send an email if the From header matches the account that is used for authentication. (Note: I’m not suggesting that this is a bad practice, or something wrong with Google… it makes total sense.)
Because of this, I needed to set up different smtp account settings for one of our ActionMailer subclasses than those used by the rest of our site. Unfortunately, because of the way ActionMailer is designed… this isn’t exactly an easy task!
I found this article, which gave me some clues about how to do this. Here’s the solution I came up with:
class MailerWithCustomSmtp < ActionMailer::Base
  SMTP_SETTINGS = {
    :address => "smtp.gmail.com",
    :port => 587,
    :authentication => :plain,
    :user_name => "custom_account@transfs.com",
    :password => 'password',
  }

  def awesome_email(bidder, options={})
    with_custom_smtp_settings do
      subject 'Awesome Email D00D!'
      recipients 'someone@test.com'
      from 'custom_reply_to@transfs.com'
      body 'Hope this works...'
    end
  end

  # Override the deliver! method so that we can reset our custom smtp server settings
  def deliver!(mail = @mail)
    out = super
    reset_smtp_settings if @_temp_smtp_settings
    out
  end

  private

  def with_custom_smtp_settings(&block)
    @_temp_smtp_settings = @@smtp_settings
    @@smtp_settings = SMTP_SETTINGS
    yield
  end

  def reset_smtp_settings
    @@smtp_settings = @_temp_smtp_settings
    @_temp_smtp_settings = nil
  end
end
The idea here is simple… before sending the mail, set the @@smtp_settings to whatever it needs to be. Then, after delivering the email, set it back to whatever it was. Hope this helps someone else!
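The same save, override, and restore idea can be sketched outside of ActionMailer. This is only an illustration, not the code we run: FakeMailer and with_smtp_settings are hypothetical names, and the sketch uses ensure so that the original settings come back even if the block raises.

```ruby
# Hypothetical stand-in for a mailer class with class-level SMTP settings.
# It is NOT ActionMailer; it only illustrates the swap-and-restore pattern.
class FakeMailer
  @smtp_settings = { :address => "localhost", :port => 25 }

  class << self
    attr_accessor :smtp_settings

    # Swap in temporary settings for the duration of the block, then
    # restore the originals, even if the block raises an exception.
    def with_smtp_settings(temporary)
      original = smtp_settings
      self.smtp_settings = temporary
      yield
    ensure
      self.smtp_settings = original
    end
  end
end

custom = { :address => "smtp.gmail.com", :port => 587 }
port_during = FakeMailer.with_smtp_settings(custom) { FakeMailer.smtp_settings[:port] }
port_after  = FakeMailer.smtp_settings[:port]
```

Here port_during sees the temporary settings, while port_after is back to the defaults. Keep in mind that settings shared at the class level are visible to every thread, so this kind of swap is only safe when mail is delivered by one thread at a time.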
In our recent switch to EngineYard Cloud… one of the most annoying problems I ran into was getting our Wordpress blog configured properly so that it would play nicely with our Rails app. In our previous hosting environment, we ran with an Apache setup… and we used a simple virtualhost config that looked something like this:
Alias /blog /var/www/apps/tfs_blog
<Directory /var/www/apps/tfs_blog>
    PassengerEnabled off
    allow from all
    AllowOverride All
</Directory>
With the move to EngineYard, we have made the switch to Nginx… because, after all, that’s what all the cool kids are using these days. I admit that one reason I was looking forward to Nginx was to rid myself of the awful apache config file syntax. However, I quickly learned that Nginx (or at least the version that is installed by default on the EY Cloud images) requires its own voodoo tricks in order to do seemingly simple things.
In this case, all we want to do is host our blog, alongside our Rails app, at /blog (and of course, this blog is hosted at /devblog). Unfortunately, getting this working properly required endless hours of scouring Google for nginx config snippets… until I finally landed on the following working setup:
location /blog/ {
    alias /data/tfs_blog/;
    index index.php index.html index.htm;
    if (-e $request_filename) {
        break;
    }
    rewrite ^/blog/(.+)$ /blog/index.php?q=$1;
}

location = /blog {
    fastcgi_pass 127.0.0.1:1027;
    fastcgi_param SCRIPT_FILENAME /data/tfs_blog/index.php;
    include /etc/nginx/common/fcgi.conf;
}

location ~ /blog/?.*\.php$ {
    if ($fastcgi_script_name ~ /blog/?(.*)$) {
        set $valid_fastcgi_script_name /$1;
    }
    fastcgi_pass 127.0.0.1:1027;
    fastcgi_index index.php;
    fastcgi_param SCRIPT_FILENAME /data/tfs_blog$valid_fastcgi_script_name;
    include /etc/nginx/common/fcgi.conf;
}
The worst thing about this, aside from the complexity required to get such a simple thing working, is that the error condition you get when the nginx fastcgi setup is misconfigured is incredibly obscure: “No input file specified.” This apparently means that fastcgi is not receiving the SCRIPT_FILENAME parameter properly… but the actual cause of this can be anything from misconfigured permissions to location paths that aren’t grabbing the proper filename from the request uri. As you can see, the final result requires manually capturing the php filename from the uri path using a regex, and then setting the SCRIPT_FILENAME env variable accordingly.
It’s also worth mentioning that this snippet:
if (-e $request_filename) {
break;
}
rewrite ^/blog/(.+)$ /blog/index.php?q=$1;
… from the first block is required to make WordPress’s pretty URLs map correctly to the CGI handler.
That sure was a lot of work! Hopefully this post will save someone else all the pain and headache… or maybe someone will stumble across this and show me how it can all be done in a couple of simple config lines.

We recently underwent a pretty big change here at TransFS.com… all entirely “under the hood”. We decided to migrate from our old hosting setup at Joyent, to our new home at EngineYard Cloud. We’re now running on Amazon EC2 instances, all transparently and easily managed by the excellent admin tools provided by Engine Yard.
Why the switch?
I had several reasons for deciding that Joyent was no longer working for us. First, the Solaris architecture that Joyent uses provided no obvious benefits, and a surprising number of headaches. Solaris has most of the same tools as linux… but many of them take different command-line arguments or are annoyingly lacking in features. For example: “grep -R” for recursive search through files, doesn’t exist on Solaris. Why? I have no idea. But these types of things can become a real pain over time, and while they seem small, they really do matter.
Secondly, Joyent provides no automatic backup solution for your data. I was very surprised when we first switched to Joyent that I had to roll my own backup solution with mysqldump, rsync, and a script that uploaded to Amazon S3. I never felt good about this, because backups are so important, and because it isn’t an area that I’m very confident in… and I was never able to properly test that my backups would save us in the case of massive data loss. When selecting a hosting provider, I really want someone to hand me a pre-built backup solution that I can trust uses industry best-practices and has been carefully tested. EngineYard provided me with much more confidence on this front, and automated daily or hourly backups are dead-simple to set up (no hacking of my own scripts!)
Finally, we have been having some real problems with the memory footprint of our app. It has ballooned to over 150MB per Rails instance, sometimes as high as 250MB. Between our Passenger children, background workers, mysql, etc… we kept bumping up against the RAM limit on our Joyent slice. This was obviously our fault, but the process for dealing with it at Joyent made it a real pain to solve. Moving to a two-slice setup, where our mysql db lived on a separate server, was an obvious solution… but provisioning and setting up a whole new server was something that I didn’t have time for, and all Joyent really offered was to clone my existing setup. I would have had to deal with everything else myself, and suffer through a lot of downtime while I figured it out. Even switching to a Joyent slice with more RAM was not an instantaneous process. I would have needed to create a support ticket, and then wait for someone to get back to me and clone us over to a new “accelerator” slice. I’m sure this would have worked… but it isn’t nearly as convenient as it should be in the age of push-button admin interfaces like those provided by EngineYard Cloud. In fact, with EY Cloud, I was able to switch to running our db on a separate server with a single checkbox and a re-deploy. Very cool. We’re now running safely under the RAM threshold of the smallest EC2 instance, and I can very easily scale up to more servers or larger servers by selecting a different server type in an EngineYard dropdown.
Custom Chef Recipes
It actually took us a lot longer to switch from Joyent than I had hoped. I began the process of investigating EY Cloud a few months ago… and had delayed the process for a while because I had some real concerns about some parts of our setup. Gems, packages, etc are all super-simple to set up in EY Cloud… but certain things are not as easy as a “checkbox”. For instance, we run Starling & Workling for background processing at TransFS.com. Getting these set up in EY Cloud was non-trivial, because it required writing a custom chef recipe. Chef is a really amazing server configuration tool that uses a ruby DSL to define changes to your server config. EngineYard has some decent chef documentation, but it still took some time to learn how Chef works before I was comfortable enough with it to get our servers set up properly.
However, once you learn to be a master Chef… it gives you amazing power to set up your server exactly the way you want, and in a totally repeatable way. In fact, when I need to add a new application server to our cluster… it should be as simple as flipping a switch, and all of my custom configuration settings will be applied automatically to bring that new server into the proper state for TransFS.com. Very cool. Learning Chef is a necessity for getting up and running with EngineYard Cloud… but it is worth the effort.
There are a few other quirks to EY Cloud that took some work to figure out, such as migrating capistrano hooks… but overall I was very pleased with the process. I had us up-and-running on the new servers in about 30 hours, start to finish, and we only had about 30 minutes of real downtime while I rsync’d our data over.
Anyway, thanks EngineYard for a great experience so far. And thanks as well to Joyent for hosting us through our early days… we’ll miss you, but we’ve found a new home that is much better suited to our needs!
Coming up in the next post: some lessons learned about Nginx, Rails, and Wordpress…
A nice ruby regexp idiom
I recently discovered a simple, but really great, ruby idiom… that I’m not sure how I’ve lived without for all this time. You can query strings with a regexp using the [] array operator:
"Grab the number 555 from this string"[/\d+/]
"Grab the number 555 from this string"[/number (\d+)/, 1]
This will return nil, if the regexp is not matched, and the matched text otherwise. You can even supply an index number as a second argument, to return a particular matched substring.
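Here is a short sketch of the cases described above. The variable names are just for illustration, and the named-capture form assumes Ruby 1.9 or later.

```ruby
text = "Grab the number 555 from this string"

# With a regexp only: the matched text, or nil when nothing matches.
first_number = text[/\d+/]
no_match     = text[/zebra/]

# With a second argument: the contents of that capture group.
by_index = text[/number (\d+)/, 1]

# A capture group can also be looked up by name (Ruby 1.9+).
by_name = text[/number (?<digits>\d+)/, "digits"]

# Because a failed match is nil, the idiom pairs well with || for defaults.
port = "no port here"[/:(\d+)/, 1] || "80"
```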
So cool. It is this kind of amazing syntactic sugar that makes Ruby a joy to work with.
We use a lot of git submodules at TransFS.com. Nearly all of our Rails plugins are installed via submodules, as is Rails itself. This works very well for keeping multiple repositories synced up with each other… when it works. The biggest problem with git submodules, in our experience, is that they don’t automatically stay synced up when you pull or check out different branches. Consider this scenario:
- I make a change to our funnelcake plugin, and push that change to the github repo
- I then jump back to our main TransFS repo, and commit the updated funnel_cake submodule
- I push that change to the TransFS repo… and call it a day
- Later that day, one of my fellow developers pulls the latest code from github
- He forgets to run “git submodule update”
- He codes away, commits, and pushes his changes back to github
Now we have a problem… because the latest changes from my colleague have bumped the funnel_cake submodule backwards to its previous state. So, even though the funnel_cake repository has newer code in it, we’re pointing to the submodule commit hash with the older code.
The best trick we’ve found for solving this problem is simple: use a git alias for all pulls and checkouts. Here’s what I added to my .gitconfig:
[alias]
    pullup = !sh -c 'git pull $@ && git submodule update && git status' -
    checkup = !sh -c 'git checkout $1 && git submodule update && git status' -
Now, when I run:
git pullup transfs
instead of:
git pull transfs
… I can be sure that my submodules will be updated properly (and I also get a handy status of the repo at the same time).
Same thing with:
git checkup new_fancy_branch
instead of:
git checkout new_fancy_branch
It seems to work well! Now if we could just get everyone to remember to use these commands…
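For the times when someone does forget, a small check can flag stale submodules before a commit. The sketch below is an assumption-laden illustration rather than something from our toolchain: out_of_sync_submodules is a hypothetical helper, the sample hashes and paths are made up, and it relies on git submodule status prefixing a line with "-" (not initialized), "+" (checked-out commit differs from the recorded one), or "U" (merge conflicts).

```ruby
# Return the paths of submodules whose `git submodule status` line is
# flagged with "-", "+", or "U". Paths containing spaces are not handled.
def out_of_sync_submodules(status_output)
  status_output.each_line.map { |line|
    match = line.match(/\A[-+U]\h+ (\S+)/)
    match && match[1]
  }.compact
end

# Made-up sample output; in practice this would come from
# `git submodule status` run in the superproject.
sample = [
  " 1a2b3c4d vendor/rails (v2.3.2)",
  "+5e6f7a8b vendor/plugins/funnel_cake (heads/master)"
].join("\n")

stale = out_of_sync_submodules(sample)
```

With the sample above, stale contains only vendor/plugins/funnel_cake, since the first line starts with a space and is therefore in sync.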
Do you have any good tricks for avoiding these problems with git submodules?
Here at TransFS.com, we use Selenium tests to verify most of the mission-critical parts of our website. While a quality RSpec test suite is also very important, there is nothing like an end-to-end integration test that simulates real user behavior to give you confidence that your code is bug-free. Since we run Ruby on Rails, we take advantage of the handy Selenium On Rails plugin… or more specifically, my fork of this plugin with some of our own modifications.
Why do we run our own custom fork? There are a few reasons, but originally it had to do with our need for some custom hooks that our CruiseControl.rb continuous integration setup uses to take screenshots of the tests in action. (A topic for another post!) Most recently, however, I needed to hack the plugin to provide a basic workaround for testing code that uses SWFUpload for file uploading.
For the uninitiated, SWFUpload is a nice Flash app that provides javascript hooks for file-selection and uploading with support for multiple files at once. It is a nice solution for web forms that need to upload more than one file, if you are willing to deal with the added cost of adding Flash to your page. It isn’t ideal in all cases, but it has worked OK for us thus far. However, testing SWFUpload is a real pain… because Selenium cannot trigger events on flash movie elements.
One solution for this would be to simulate native-OS keyboard and mouse events, and add test commands to click the “upload” button and select a file. In our CI test environment, however, this was not a good option. Instead, I’ve opted to test the POST action on the server only… simulating the file upload without actually triggering SWFUpload to do the work. To accomplish this, I added a custom selenium action that dynamically inserts a form and file field that are targeted at the corresponding SWFUpload endpoint url.
Here’s what it looks like:
// Custom method for uploading a file, simulating SWFUpload
Selenium.prototype.doSwfUpload = function(selector, filename) {
  // Grab the SWFUpload element from the DOM
  var doc = this.browserbot.getCurrentWindow().document;
  var swf_uploader = Element.select($(doc.body), selector)[0];

  // Slurp the list of params from SWFUpload, and parse them into a Hash
  var flashvars_s = swf_uploader.down('param[name=flashvars]').value;
  var flashvars = {};
  $A(flashvars_s.split(/&/)).each(function(kv){
    var key = unescape(kv.split(/=/)[0]);
    var value = unescape(kv.split(/=/)[1]);
    flashvars[key] = value;
  });
  var params_array = $A(decodeURI(flashvars.params).split(/&/));
  var params = new Hash({});
  params_array.each(function(kv){
    var key = decodeURI(kv.split(/=/)[0]);
    var value = decodeURI(kv.split(/=/)[1]);
    params.set(key, value);
  });
  params.unset('format'); // Remove the format param, since we don't want to request as json

  // Grab the SWFUpload form from the hidden IFrame,
  // and insert the key/value params into the form
  var faker_form = Element.select($$('#selenium_fileupload_iframe')[0].contentDocument.body, '#swfupload_faker_form')[0];
  params.each(function(kv) {
    Element.insert(faker_form, { bottom: '<input type="hidden" name="'+kv.key+'" value="'+kv.value+'" />' });
  });

  // Assign the selected file to the file field
  netscape.security.PrivilegeManager.enablePrivilege("UniversalFileRead");
  this.browserbot.replaceText(faker_form.down('#swfupload_faker_file'), filename);

  // Assign the URL and submit the Form
  faker_form.action = flashvars.uploadURL;
  faker_form.submit();

  // Clean up the params
  Element.select(faker_form, 'input[type=hidden]').each(function(e){ e.remove(); });

  // Retarget the IFrame back to the fileupload frame
  $$('#selenium_fileupload_iframe')[0].contentWindow.location = "TestRunner-fileupload.html";
};
In order to make this work, we also need a new <iframe> added to the selenium-core frameset:
<iframe name="selenium_fileupload_iframe" id="selenium_fileupload_iframe" src="TestRunner-fileupload.html" style="width: 1px; height: 1px"></iframe>
With these changes… we now have an .rsel command in our tests that looks like this:
swf_upload '#SWFUpload_0', "#{RAILS_ROOT}/public/images/test-upload.png"
What this will do is:
- Query the SWFUpload <object> element, and grab the <param> elements from inside it
- Parse out these parameters, adding them as <input type="hidden"> tags to the hidden form IFrame
- Submit the hidden form, POSTing the file contents and parameters to the same action as the SWFUpload form
- … and clean itself up
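The parsing step in the list above (a query string whose params value is itself an encoded query string) can be sketched in Ruby for anyone who wants to experiment with it outside the browser. This is an illustration under assumptions: the sample keys and values are invented, and the exact escaping SWFUpload uses may differ from the form encoding that URI.decode_www_form expects.

```ruby
require "uri"

# Build a made-up flashvars string: the "params" value is itself an
# encoded query string, nested inside the outer one.
inner     = URI.encode_www_form("format" => "json", "iso_id" => "42")
flashvars = URI.encode_www_form("uploadURL" => "/uploads", "params" => inner)

# First pass: decode the outer key/value pairs.
outer = Hash[URI.decode_www_form(flashvars)]

# Second pass: decode the nested params, then drop "format" so the
# simulated request is not treated as a JSON request.
params = Hash[URI.decode_www_form(outer["params"])]
params.delete("format")

upload_url = outer["uploadURL"]
```

After this runs, upload_url is the endpoint the hidden form should post to, and params holds the remaining fields that would become hidden inputs.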
While not a perfect solution… this allows us to test the server-side response to file uploads, and that’s what we cared about most. If you are using SWFUpload in your application, and have run into the same problem, I’d love to hear how you solved it!
FunnelCake Update
Last March, I wrote about an internal sales funnel visualization and tracking project that we call FunnelCake.
In recent weeks, we’ve made some significant upgrades to this system… making it a much more comprehensive tool for visualizing our sales funnel. Much of the code was refactored (or rewritten from the ground up), and the codebase is now cleaner and more flexible. In addition, we added a bunch of new features, including:
- Revised conversion stats logic to make numbers more consistent and easier to read
- Viewing of funnel stats using weekly, bi-weekly, and monthly time windows (fixed to the calendar year, rather than the less-useful “past 14 days” window that was used previously)
- RESTful design for funnel stages, individual states, visitors, etc
- Dashboard for an at-a-glance view of primary funnel stage stats
- Graphing! All funnel stages now show graphs of historical conversions
- Customized dashboards can be easily built using simple “widget” partials for viewing specific stats, graphs, diagrams, tables, etc
- All widgets are fully ajax-updating, making the page much more responsive
- Filtering! All funnel stats can now be viewed through landing-page, referer, and visited-page filters.
- Caching: all stat calculation methods now use the Rails cache for storing commonly accessed data points… making switching between stats much more efficient.
- Drill down to individual visitors and view their event path through the site
Overall, these changes have made this a much more useful tool for TransFS.com. We’re able to keep track of our funnel on a daily basis using the simple dashboard, while also having the ability to dig in and investigate specific visitor segments and campaigns. FunnelCake will allow us to answer more complicated questions like: “Do the customers who arrive at our site via Google AdWords behave differently than those who arrive via organic search?” and “Which step in our signup process is the least effective?”
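To make the second question concrete, here is a rough sketch of the arithmetic behind it. It is not FunnelCake’s actual implementation: the stage names, the visitor counts, and the step_conversions helper are all invented for illustration.

```ruby
# Invented visitor counts per funnel stage, listed in funnel order.
stages = [
  ["visited",   1000],
  ["signed_up",  400],
  ["activated",  300],
  ["paid",        30]
]

# Conversion rate for each pair of adjacent stages.
def step_conversions(stages)
  stages.each_cons(2).map do |(from, from_count), (to, to_count)|
    rate = from_count.zero? ? 0.0 : to_count.to_f / from_count
    { :from => from, :to => to, :rate => rate }
  end
end

conversions = step_conversions(stages)

# The least effective step is the one with the lowest conversion rate.
weakest = conversions.min_by { |step| step[:rate] }
```

With these numbers, the step from activated to paid converts at 10%, far below the other two steps, so that is where weakest points.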
Here are some new screenshots of the TransFS.com FunnelCake dashboard:
FunnelCake is available as an open-source Rails plugin for anyone to investigate, use, and improve upon. You can find my GitHub repository here. If you have any ideas on how to make it better… fork away and send me a pull request!
We’ve recently started using HighRise as our key CRM tool, and so far it looks like it will work very well for our needs. However, hooking our existing processes and our customer data into the web service takes a little bit of time & effort.
First of all, while HighRise has an excellent API… the “official” ruby code on the developers’ site is pretty minimal. Fortunately, some community efforts on GitHub have cleaned up this code and improved it dramatically. I’ve picked up this work and updated it slightly, and hopefully the lessons I learn can be rolled back into my GitHub repository.
One particularly tricky problem, even with a nice ActiveResource library like the one hosted on GitHub, is creating/updating contacts with associated “contact-data”. This includes adding email addresses, phone numbers, etc to contacts. I looked around on the developers forum, and there seem to be a number of people who have had questions about how to accomplish this… but no solid answers. After fiddling around with it for a while, I came up with the following solution:
Highrise::Person.create 'first-name'=>'Test', 'last-name'=>'API',
  'contact_data'=>{
    'email_addresses'=>[
      { 'address'=>'test@test.com', 'location'=>'Work' }
    ]
  }
This code creates the following POST xml:
<?xml version="1.0" encoding="UTF-8"?>
<person>
  <contact-data>
    <email-addresses type="array">
      <email-address>
        <address>test@test.com</address>
        <location>Work</location>
      </email-address>
    </email-addresses>
  </contact-data>
  <last-name>API</last-name>
  <first-name>Test</first-name>
</person>
… And it works great! So hopefully this will help anyone out there who is struggling to get the HighRise API to do what they want via Ruby.
Rails 2.3 was released a few months ago with some great features. Perhaps chief among them was an “official” solution to nested models in forms. This is something that the community has been arguing about for some time, and has been solved in various ways.
With the Rails 2.3 solution, we finally have direct support for updating nested attributes in our models. This allows us to add a simple directive to our model that specifies which associations can be mass-assigned. For instance, in our app, we have the following (simplified) model:
class Iso < ActiveRecord::Base
  has_many :emails
  has_many :phones
  accepts_nested_attributes_for :emails, :allow_destroy => true
  accepts_nested_attributes_for :phones, :allow_destroy => true
end