Upgrading to Rails 2.3.2 from Rails 2.1


We recently upgraded our Rails app at TST Media from Rails 2.1 to Rails 2.3.2. It was a pain in the ass! A little background on our application first: we have 36,600 lines of code, 322 models, 108 controllers, and a relatively weak test suite. We also upgraded the majority of our gems and plugins; we have 32 gems and 16 plugins in our vendor directory. That's 48 third-party open source Ruby libraries our application depends on! Crazy. The application originated back in August 2006, when it first ran on Rails 1.1.2.


git rm -r vendor/rails
git commit -a -m "removing rails to prepare for upgrade"
rake rails:freeze:edge RELEASE=2.3.2

Here are the issues we ran into during the upgrade, many of which were not documented in the release notes:

1. In Rails 2.1 and earlier, the convention when using check_box_tag was to put a hidden_field_tag after it with the same name and a value of 0 if you needed a name/value pair to always be sent for the check box. The check_box form helper did just this. When the box is checked the browser sends both values, and Rails used the check_box_tag value because it came first; when the box is unchecked only the hidden_field_tag value is sent. Well, for some reason, in Rails 2.3 the order is swapped. The hidden_field_tag must now come before the check_box_tag, because the last value sent for a given name is the one that gets used. The Rails 2.3.2 check_box form helper swapped the order as well.
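To see why the order matters, here is a minimal sketch in plain Ruby of "last value wins" parameter parsing. It is not the Rails or Rack parser; `last_value_wins` and the `newsletter` field name are made up for illustration.

```ruby
require "uri"

# Hypothetical helper (not Rails code): parse a form-encoded body and keep
# only the last value seen for each repeated key.
def last_value_wins(form_body)
  URI.decode_www_form(form_body).each_with_object({}) do |(key, value), params|
    params[key] = value
  end
end

# Hidden field first, check box second, as in the Rails 2.3 ordering.
checked_body   = "newsletter=0&newsletter=1" # box checked: both pairs are sent
unchecked_body = "newsletter=0"              # box unchecked: only the hidden field

puts last_value_wins(checked_body).inspect
puts last_value_wins(unchecked_body).inspect
```

With the old ordering (check box first, hidden field second), the same parser would always end up with 0, which is why any hand-written check_box_tag/hidden_field_tag pairs need to be flipped during the upgrade.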

Posted on 11 Jun 2009 14:05 by Luke Ludwig | no comments

Capistrano Tip to Avoid Disk Intensive Removal of Files

At TST Media we host our Rails app at Engine Yard on four slices which all share the same disk via GFS. Any time there is heavy disk activity our sites slow to a crawl. This makes removing files a bit tricky, such as when we want to empty our cache files or when Capistrano removes an old release at the end of a deploy. The strategy we have come up with is to move the files to a specific directory we call the "caches_to_remove" directory, and use a cron task to empty this directory at night when most of our users are asleep. Moving files is extremely fast and not disk intensive, as long as the source and destination are on the same disk of course.

The shell script that the cron runs nightly is simple:

# remove_cache_dirs.sh
rm -rf /data/tst/caches_to_remove
mkdir /data/tst/caches_to_remove
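The "move instead of delete" half of this strategy could look roughly like the following in Ruby. This is only a sketch: `move_for_later_removal` is a hypothetical helper, not part of Capistrano, and the unique suffix is my own assumption to keep two moves of the same directory name from colliding.

```ruby
require "fileutils"

# Hypothetical helper: instead of deleting `path`, rename it into `graveyard`
# under a unique name so the nightly cron job can delete it later.
def move_for_later_removal(path, graveyard)
  FileUtils.mkdir_p(graveyard)
  unique_name = "#{File.basename(path)}.#{Time.now.to_i}.#{Process.pid}"
  destination = File.join(graveyard, unique_name)
  FileUtils.mv(path, destination)
  destination
end
```

A call such as `move_for_later_removal(old_release_dir, "/data/tst/caches_to_remove")` stays cheap only when both paths live on the same filesystem; across filesystems `FileUtils.mv` falls back to copying, which is exactly the disk activity this tip is meant to avoid.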

Posted on 14 May 2009 00:10 by Luke Ludwig | no comments

A Capistrano task for a rolling Mongrel restart and deploy

At TST Media we have our Rails app hosted at Engine Yard. Currently we use Nginx, haproxy, and Mongrel, and have 4 slices each with 4 mongrels. When an HTTP request first comes in to our system it hits the load balancer, which chooses a slice to send it to. The nginx on the given slice picks the request up and sends it on to haproxy. Haproxy chooses a mongrel to send the request to based on availability. When we roll out bug fixes, which we do once every other day or so, the mongrels all restart at once and all the users browsing our sites experience 20-30 seconds of... basically downtime. The browser spins and waits until the mongrels are ready to go. If requests come in at the wrong moment the users may see a 502 Bad Gateway response or a 503 Service Unavailable response, both of which started showing up once we started using haproxy. Clearly this is unacceptable. Soon we hope to switch to Nginx with Phusion Passenger, which may not have this problem. Until then we have started doing rolling restarts, where only one slice is down at a time, which allows us to do small deploys without impact to our users.
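The core of a rolling restart is just a loop that finishes one slice before touching the next. Here is a minimal sketch of that loop in plain Ruby; it is not the actual Capistrano task, and `restart_slice` and `slice_ready` are hypothetical callables you would supply yourself (for example, wrappers around ssh commands and a health check).

```ruby
# Hypothetical sketch: restart one slice at a time so the other slices keep
# serving requests. `restart_slice` and `slice_ready` are callables supplied
# by the caller; neither is a Capistrano API.
def rolling_restart(slices, restart_slice, slice_ready, max_checks = 30, pause = 1)
  slices.each do |slice|
    restart_slice.call(slice)
    checks = 0
    # Wait for this slice to come back before moving on to the next one.
    until slice_ready.call(slice)
      checks += 1
      raise "#{slice} did not come back up" if checks >= max_checks
      sleep pause
    end
  end
end
```

Raising when a slice fails to come back stops the loop, so a bad deploy takes out at most one slice instead of all four.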

Posted on 13 May 2009 08:40 by Luke Ludwig | no comments

RailsConf 2009 and the Danger of Remote Mob Mentality


My first Ruby on Rails Conference was a positive experience.  RailsConf was in Vegas this year, and while I didn't win any money gambling, I did see several good talks and met some interesting Rails developers.

During the Wednesday morning keynote, as Chad Fowler was introducing Chris Wanstrath of GitHub, he asked who uses Git. Basically everyone in the room raised their hand. He went on to say that Rails programmers are like lemmings, which I think is a very interesting observation. It wasn't too long ago that most Rails developers used Subversion, and as soon as the Rails core team switched to Git everyone followed. It wasn't too long ago that test-driven development was an obscure programming practice only used by "Extreme" programmers. Now, if you are working on a Rails project it is a given that you have a decent test suite. And don't forget about REST architecture.... people love REST architecture.

After Timothy Ferriss's disappointing keynote Tuesday night, which served as the source of many jokes throughout the remainder of the conference, everyone was ready for a real hardcore motivational speech. Wow, did Robert Martin deliver in his talk, "What Killed Smalltalk Could Kill Ruby Too." No slides, just Robert Martin pacing on the stage and flinging his note cards into the air when he was done with them. Being a great speaker, he had everyone's rapt attention. He recapped a short history of Smalltalk and why it "died", and outlined what the Ruby and Rails community can do to avoid the same fate. This included doing test-driven development, being professional, not being arrogant towards non-Ruby programmers, and developing more powerful Ruby Integrated Development Environments. He stressed test-driven development quite a bit, as I knew he would given his Extreme Programming background. When the speech finished the crowd gave him a standing ovation. Everyone loved it.

At RailsConf it was apparent to me that Rails developers are a young crowd. I knew this before the conference, but seeing 1300 Ruby on Rails nerds all in the same room made it even more obvious. An analogy to lemmings is clearly extreme, but certainly Rails developers are impressionable. There definitely seems to be a sort of remote mob mentality thing going on, which is a little disturbing. You know those Simpsons episodes where the townspeople group together in a mob and everyone wants to kill Bart? Then someone yells some other new purpose and the mob follows without thinking. Anyway, the point is that I'd like to see Rails developers and other programmers think more for themselves. Everyone's circumstances and projects are different, and pretending that there are a few programming practices, such as test-driven development, that absolutely must be followed to succeed, as Robert Martin implied, is absurd. I would add "Think for Yourself" to Robert Martin's list of what the Rails community must do to avoid the fate of Smalltalk.

At TST Media we spend very little time writing tests and have a weak test suite. Our application code comes to 33,435 lines and our test code to 1,811 lines, a test-to-code ratio of about 0.05. While I would like to see this improved marginally, given our current situation it is simply not worth trading features for a slightly higher quality code base, which is what a better test suite would give us.
 

Posted on 10 May 2009 22:27 by Luke Ludwig | no comments

Ack in project for TextMate 10-20 times faster than Grep in Project

Until yesterday, I had been using Grep in Project for TextMate. While faster than the built-in Find in Project for TextMate, Grep in Project is still rather slow. Many times I would wait 20 seconds or longer for a single grep to finish. Sometimes I would switch to the command line and do the grep manually, but the advantage of having the grep built into TextMate is the ability to click on a result and have the file brought up, which is huge. So yesterday I searched around and found Ack in Project for TextMate. Wow, what a difference. Those 20 second greps are now instantaneous. Very impressive.

 

Posted on 23 Apr 2009 13:10 by Luke Ludwig | no comments

Rails patch for caching 'SHOW FIELDS' for has_and_belongs_to_many associations

Last week I was examining the MySQL slow query logs at work and discovered the following, which led to an easy Rails patch that improved the performance of our app by about 25%.

# Time: 090108 11:05:02
# Query_time: 14.412306  Lock_time: 0.000521  Rows_sent: 2  Rows_examined: 2
SHOW FIELDS FROM `events_page_nodes`;
# Query_time: 14.390774  Lock_time: 0.000556  Rows_sent: 2  Rows_examined: 2
SHOW FIELDS FROM `events_page_nodes`;

Normally 'SHOW FIELDS' queries are moderately fast. I ran one manually just now and it took 0.16 seconds. However, here you can see that these 'SHOW FIELDS' queries took 14 seconds to complete! It turns out that MySQL creates a temporary table on disk for 'SHOW FIELDS' queries, so if the disk is busy with something else these queries can take a while to complete, as seen here.

In development mode these 'SHOW FIELDS' queries are not cached and occur very frequently, but in production mode Rails caches these queries the first time they are called for each model. I noticed that our database was receiving a large number of these 'SHOW FIELDS' queries, which I thought should only occur when a Rails environment is loaded or shortly thereafter when the models are first loaded (e.g. mongrel restarts, a background job, or a cron job).

However, upon inspection it turns out that Rails DOES NOT cache 'SHOW FIELDS' queries for has_and_belongs_to_many associations. So every time a select or an insert is done via a Rails has_and_belongs_to_many association, a 'SHOW FIELDS' on the join table is executed. One way to solve this problem would be to switch to the has_many :through approach, which involves adding a primary key id column to the join table and creating an ActiveRecord model for it, which would then take advantage of the built-in Rails caching of 'SHOW FIELDS'. However, we have 20-some join tables in our application. So instead I patched Rails to cache the 'SHOW FIELDS' queries, which turned out to be rather simple and noticeably improved the performance of our app (see charts below).
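The general idea behind such a patch is plain memoization keyed by table name. The sketch below shows that idea in isolation; it is not the actual patch and does not touch ActiveRecord internals, and `JoinTableColumnCache` is a made-up name.

```ruby
# Hypothetical sketch of the idea, not the author's patch or ActiveRecord
# internals: remember the column list for each join table after the first
# lookup so that later selects and inserts do not trigger SHOW FIELDS again.
class JoinTableColumnCache
  def initialize(&fetch_columns)
    @fetch_columns = fetch_columns # block that runs the real SHOW FIELDS query
    @columns_by_table = {}
  end

  def columns_for(table_name)
    @columns_by_table[table_name] ||= @fetch_columns.call(table_name)
  end
end
```

The trade-off is the usual one for schema caching: a process that has cached a join table's columns will not notice a schema change until it is restarted, which matches how Rails already treats model columns in production mode.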

Posted on 08 Jan 2009 16:43 by Luke Ludwig | 3 comments

Switching from mongrel to mod_rails.

mod_rails for Apache, also called Phusion Passenger, has been released. I switched this blog's deployment from mongrel to mod_rails in less than 20 minutes without any problems. The installation procedure is dead simple. I had to look up one thing in the user's guide, how to serve the Rails app off a sub path of the domain (lukeludwig.com/blog), and I found it quickly. Overall I am extremely impressed and am curious how Passenger will do with higher traffic production sites.

Previously I was running this blog on a single mongrel instance, which is all that is needed given the low traffic on this site. But let's just say that I wrote an article with great content and it got dugg, causing traffic to spike like crazy for a few days. The single mongrel would quickly be overwhelmed and my site would be, for the most part, unusable until I noticed and was able to start up a whole mongrel cluster of, say, a dozen mongrels. My server has 1 GB of RAM, most of which goes unused. Passenger manages the number of Rails instances in a much nicer manner. From a configuration perspective, I set the RailsMaxPoolSize to 12 and I'm done. Passenger will run as many Rails instances as it needs depending on load, up to the max of 12. When traffic slows down, it kills the idle Rails instances to conserve memory. Great stuff.

 

Posted on 13 Apr 2008 09:40 by Luke Ludwig | 2 comments

Use pagination to save memory when iterating over a lot of ActiveRecord objects.

It is very easy to consume a ton of memory when using ActiveRecord's find(:all) method. When transitioning the Team Sport Tech Rails app from file_column to attachment_fu, I wrote a migration to convert all of the photos we had to the new database format and the new location on disk. Without thinking I wrote code like this:

    Photo.find(:all).each do |photo|
      # conversion commands
      # ...
    end

I didn't notice any issues executing this on my laptop, which has 2 GB of RAM, but when I went to run this migration on our staging server over at Engine Yard I had problems. Our staging server only has 640 MB of RAM. Our database has over 100,000 photos. That is an array with 100,000 ActiveRecord objects in memory all at once. As the migration executed on the staging server I could see that the rake task was using all available RAM and paging like crazy, utilizing 600 MB of virtual memory from disk. CPU utilization fluctuated between 0 and 1 percent because of the paging. If adequate memory had been available, this migration would have taken around 10 minutes with the CPU working like crazy. I did a few other things for a while and came back to this migration 2 hours later. It was still working and probably had a long way yet to go.
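The paginated alternative named in the title can be sketched as a small loop that loads one page of records at a time. The helper below is a hypothetical illustration in plain Ruby (`each_in_pages`, the `fetch_page` callable, and the batch size of 500 are all my own assumptions), with the Rails 2.x finder call shown in a comment.

```ruby
# Hypothetical helper: walk a large table one page at a time so that only
# `batch_size` records are held in memory at once. `fetch_page` is any
# callable that takes (offset, limit) and returns an array of records.
def each_in_pages(fetch_page, batch_size = 500)
  offset = 0
  loop do
    page = fetch_page.call(offset, batch_size)
    break if page.empty?
    page.each { |record| yield record }
    offset += page.size
  end
end

# With Rails 2.x finder options, fetch_page could look like this:
#   lambda { |offset, limit|
#     Photo.find(:all, :order => "id", :limit => limit, :offset => offset)
#   }
```

Ordering by id keeps the pages stable as long as the loop body does not delete rows. Rails 2.3 later added find_each and find_in_batches, which cover the same need without a hand-rolled loop.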

Posted on 16 Feb 2008 22:57 by Luke Ludwig | no comments

RMagick has memory problems. MiniMagick is slow. Go ImageScience.

RMagick has memory problems. MiniMagick with Attachment_Fu is slowwww. Go ImageScience.

Most Rails applications have to deal with resizing uploaded images to create thumbnails. The main choices are RMagick, MiniMagick, and ImageScience, all of which come packaged as gems. Alternatively you can write your own, which really isn't that difficult. So which one should you use? First, I would recommend not writing your own, because it is really nice to take advantage of one of the very fine attachment plugins that are available. file_column is the old standby Rails attachment plugin, but it uses RMagick. Attachment_Fu is more flexible since you have the choice of using RMagick, MiniMagick, or ImageScience and can switch between them easily. At TeamSportTech, where I work, I recently transitioned our relatively large Rails application from file_column and RMagick to attachment_fu and ImageScience. Originally I was planning on using MiniMagick instead of ImageScience, but it turns out that MiniMagick is quite slow. The following timing results are from my MacBook running in development mode, and cover uploading a single 3.4 MB JPEG which is resized down to 4 different sizes. MiniMagick consistently took 18 seconds to accomplish this, RMagick 7 seconds, and ImageScience 6 seconds. This doesn't accurately represent a production environment, but I do believe it is a fair comparison. Note that the time to upload is not a factor since this was done entirely on my local computer.

So why is MiniMagick with attachment_fu so slow? And why not use RMagick? RMagick and MiniMagick both use the well-known ImageMagick C libraries. RMagick works by providing Ruby bindings to the ImageMagick libraries, which means that RMagick operates within the Ruby process. It is well documented that RMagick consumes a lot of memory and has memory leaks as well. See Craig Ambrose's article and a Mephisto article. It appears that RMagick 2 provides better memory management.

Posted on 16 Feb 2008 18:35 by Luke Ludwig | 2 comments

Running ar_sendmail with monit

Sending email from a web application, especially blast emails to a lot of people, can take a lot of time. Generally you don't want the user to wait until all the emails have been handed off to the SMTP server. You also probably don't want to tie up an entire mongrel with sending mail. The ar_mailer gem solves this problem in excellent fashion by saving pending emails to the database and having a separate Ruby daemon process periodically check the database and send them. I recently set up one of our Rails apps at work to use ar_mailer. Configuring the app to use ar_mailer was incredibly easy, but it was tricky to get the ar_sendmail daemon process to run under monit. On our production servers, which we have hosted at Engine Yard, we want every process that our application depends on to be monitored by monit.

The primary feature that ar_sendmail lacks in order to play nice with monit is the ability to write a pid file after it starts up and to remove it when the process exits. This has already been pointed out on RubyForge as a feature request. Here is what I did to get ar_sendmail working under monit: (ar_mailer 1.3.1)
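The steps themselves are not reproduced in this excerpt. As a general illustration of the missing feature only, and not the changes actually made to ar_sendmail, a pid file wrapper in Ruby might look like the sketch below; `with_pid_file` is a hypothetical helper.

```ruby
# Hypothetical sketch, not ar_mailer code: write this process's pid to a file
# for the duration of the block and remove the file when the block exits,
# whether it returns normally or raises.
def with_pid_file(pid_path)
  File.open(pid_path, "w") { |file| file.puts(Process.pid) }
  yield
ensure
  File.delete(pid_path) if File.exist?(pid_path)
end
```

A daemon's main loop would run inside the block, and monit would be pointed at the same pid path.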

Posted on 05 Dec 2007 19:35 by Luke Ludwig | 7 comments

Copyright © Mad Marmot
