1. Do you know why I pulled you over, hacker?

    Or: Slow Down to Speed Up

    When it comes to programming languages, I haven’t lived in the same place for more than a few years.  This is good, fun, and keeps me sharp.  But it means I’ve never *really* mastered any particular language.

    So, when I code, I tend to write it like an author writing prose.  That is, I rewrite my draft a couple of times.  After the code (including tests, though not usually 100% TDD) is written, I go over it, looking at the `git diff` and looking at the entire files with an understanding of the full program in my head.  I’m generally doing two things as I go over it:

    One: Critical Analysis

    Does the code do what it should, and do my tests fully express the purpose of the code?  They don’t have to test every line and detail, but one should be able to re-write my code from scratch by looking at just the tests.

    Can the code be more verbose, less verbose, faster, easier to read or more consistent?  There are only three tough problems in computer science (naming variables and off by one errors); I check for those, too.

    Two: What Can I Learn?

    I can generally accomplish everything I need with techniques I already know.  But that’s no fun and it doesn’t make me a better engineer/hacker/person.  So I try to find code (either the stuff I’ve just written or stuff related to it) that can be implemented with a new technique.  

    Maybe I can learn more of Ruby’s meta-programming and create an extension (that will probably help me in the future).  Maybe I want to learn how to drop some C code into my Ruby and make this critical path super quick.  Whatever it is, it should be something I haven’t done before that challenges my knowledge of the language (ps. open source is a great place to find inspiration).  Sometimes I just question the status quo.

    It’s easy to solve a problem, crank out some good code, and call it.  But that’s sub-optimal.  The moral/trick is that efficiency is not always about solving problems quickly and correctly.  Over the long haul, you need to be continuously improving to reach maximum efficiency.  Hopefully I can remember that.

     

    tags:  ruby  hacking  improvement 

  2. FAIL: Your name is invalid. (Ruby UTF-8 regex)

    Oftentimes you’ll want to validate users’ names (or nicknames) in your web applications.  Although I’m not fundamentally opposed to using all the wonderful Unicode (UTF-8 in particular) characters under the sun, it does make it easier to use and understand an app when user names are at least recognizable to you.  If an app is global (like Twitter) this may not be the case.  But you get there organically, and it doesn’t make sense to open the floodgates right now.  So we started with

    validates_format_of :nickname, :with => /\A[a-zA-Z0-9_\.\-]+\Z/
    

    This allows lowercase and uppercase letters, numbers, dot, hyphen and underscore.  A pretty standard start.  But we saw validations fail when user names copied from 3rd party services (Twitter, Facebook, Tumblr) included letters with accents over them.

    Fortunately we are using Ruby 1.9 (with Rails 3.1) and validating with Unicode is straightforward.  We added support for all the Latin extensions with just two tweaks.  First, we need this at the top of the file with the regular expression so Ruby interprets it correctly:

    # encoding: UTF-8
    

    then we add the range from code point U+00C0 to U+02AE, changing our regex to

    validates_format_of :nickname, :with => /\A[\u00c0-\u02aea-zA-Z0-9_\.\-]+\Z/
    

    Once we get bigger in countries that speak other languages, I’ll be adding some more character sets to that regex.
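
    To see the difference between the two patterns outside of Rails, here is a minimal sketch.  The constant names and the valid_nickname? helper are made up for illustration (they are not part of Rails or of our app); only the two regexes come from this post.

    ```ruby
    # The two patterns from this post, pulled out of the validation.
    ASCII_NICKNAME = /\A[a-zA-Z0-9_\.\-]+\Z/
    LATIN_NICKNAME = /\A[\u00c0-\u02aea-zA-Z0-9_\.\-]+\Z/

    # Hypothetical helper: true when the whole name matches the pattern.
    def valid_nickname?(name, pattern = LATIN_NICKNAME)
      !(name =~ pattern).nil?
    end

    accented = "Jos\u00e9.Mu\u00f1oz"   # "José.Muñoz", written with escapes

    valid_nickname?("spinosa_99", ASCII_NICKNAME)  # plain ASCII passes the old pattern
    valid_nickname?(accented, ASCII_NICKNAME)      # accented letters fail the old pattern
    valid_nickname?(accented)                      # and pass the extended one
    valid_nickname?("no spaces")                   # spaces are still rejected
    ```

    One thing worth knowing when trying this: \Z also matches just before a trailing newline, so a nickname like "bob\n" passes both patterns.  If that matters to you, \z anchors at the very end of the string.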

     

    tags:  utf-8  unicode  ruby  regex  fail 

  3. Thread Pooled Beanstalk Client for Ruby 1.9

    We’ve built Shelby with a distributed architecture (the subject of another post) with a highly parallelized social stream processing infrastructure (another post as well).  To accomplish this we’re using the Beanstalk message queue (not using Amazon SQS as we’re not running on EC2) to allow the disparate components of the system to communicate.

    Beanstalk handles publishers and consumers via sockets and has had no trouble taking a pounding from actual multithreaded usage (i.e. multi-core CPU coupled with language and OS support).  Unfortunately, I couldn’t find a Ruby client designed for a truly multithreaded environment.  The best and most popular client I found is beanstalk-client.  While that client is thread safe, the implementation will hang if used naively.  Indeed, there is one very important aspect of Beanstalk itself that should be made perfectly clear: you must reserve and delete a job using the same connection.  You cannot call reserve() on a client before somehow releasing the job previously returned by that client.  This can get messy when multithreaded, so I forked that client…

    The recently open-sourced OvertimeMedia/beanstalk-client-ruby includes a Beanstalk::ThreadedPool class which creates a managed set of Beanstalk clients.  A well-behaved Thread reserves a Beanstalk client with ThreadedPool.reserve_connection() (which will block if no connections are currently available), then calls reserve() on that client.  This connection will not be released back to the pool of available connections until it deletes/releases/buries the job it reserved from Beanstalk.  This is conveniently wrapped into the job itself.

    For example, if you have a threaded work queue and want to process up to 100 Beanstalk jobs at a time…

    @beanstalk_pool = Beanstalk::ThreadedPool.new('localhost', 100, 'tubename')
    while(true)
      job = @beanstalk_pool.reserve_connection.reserve
      @work_queue.process_in_new_thread(job) { |job| work_on(job); job.delete(); }
    end
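
    The blocking behavior described above can be sketched with nothing but Ruby's standard Queue.  This is a toy model under my own assumptions, not the code in the fork: FakeConnection, SketchPool, release and process_jobs are hypothetical names, and no Beanstalk server is involved.

    ```ruby
    require 'thread'

    # Hypothetical stand-in for a single Beanstalk connection.
    FakeConnection = Struct.new(:id)

    # Toy pool: a thread-safe Queue holds the idle connections.
    class SketchPool
      def initialize(size)
        @idle = Queue.new
        size.times { |i| @idle << FakeConnection.new(i) }
      end

      # Blocks the calling thread until a connection is idle.
      def reserve_connection
        @idle.pop
      end

      # Hands a connection back once its job is finished.
      def release(connection)
        @idle << connection
      end

      def idle_count
        @idle.size
      end
    end

    # Runs job_count fake jobs without ever using more connections
    # than the pool holds; returns the ids of the finished jobs.
    def process_jobs(pool, job_count)
      finished = Queue.new
      workers = job_count.times.map do |job_id|
        Thread.new do
          connection = pool.reserve_connection
          begin
            finished << job_id          # the "work" in this sketch
          ensure
            pool.release(connection)    # same connection, released after the job
          end
        end
      end
      workers.each(&:join)
      Array.new(finished.size) { finished.pop }
    end

    done = process_jobs(SketchPool.new(2), 5)
    ```

    The ensure block is the important design choice: a connection goes back to the pool even if the work raises, so a failing job cannot starve the other threads.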
    

    I’ve not implemented the full Beanstalk API with my multithreaded fork as I’m not using the full API.  I selfishly only built out the parts I’m using.  But this fork has been running the Shelby alpha for two weeks and processed millions of jobs without issue.  Please let me know what you think of the client if you use it.  I’d love to improve it and help any others who are using it.

    p.s. It’s tough to take time out to write up these posts when you could be writing some bad ass multithreaded code ;-]

     

    tags:  multithreaded  threads  ruby  shelby  opensource 

  4. Rails DB Backup to S3

    If you don’t believe backing up your data is important, come back when (not if) you lose some important information.  Sensible (Rails) developers, read on…

    I’ve seen (and used) decent backup solutions involving mysqldump, sftp and an additional server.  They’re okay, but with a few plugins/gems your DB backup can be elegant (and Amazon secure).  I happen to like db2s3.  Configure the gem like the README says.  Then add a couple of whenever tasks:

    every(1.hour) { rake "db2s3:backup:full" }
    every(1.day) { rake "db2s3:backup:clean" }

    Holy shit done.  But you will have to pay the S3 bill.  So pick up a penny from the ground every day.  You’ll be $.25 ahead of the game at the end of the month.

    Gotchas?  Of course.  db2s3 relies on aws-s3, so you must have that gem installed (this is not explained in the README).  NBD, but if you happen to be using right_aws with paperclip, like me, you may run into issues due to the gem load order.  So, keep your house in order:

    config.gem "aws-s3", :lib => "aws/s3", :version => '>= 0.6.2'
    config.gem "right_aws"

    Thoughts?

     

    tags:  rails  S3  backup  ruby 

  5. Flash Now!

    In development of a new, greenfield application I had flash messages popping up twice: once for the current page, and then again on the next page.  It turns out I was using flash[]= and then rendering within the same action.  As flash[]= is designed to carry over to the next action (i.e. after you redirect, a new action is performed) it was showing up again.  The solution: flash.now[]=.  This hash is cleared after the current action.  The following code illustrates when to use each:

    def create_application
      #do work
      if it_worked?
        flash[:notice] = "Great success!"
        redirect_to @cash
      else
        flash.now[:error] = "Let's try that again"
        render :action => "new"
      end
    end
    

    The flash[:notice] must be carried over to the action that handles the @cash path (following the redirect), whereas the flash.now[:error] will be shown for the current action (even though it renders a different view) and should not be carried over to the next action.
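
    If the lifecycle is hard to picture, a toy model in plain Ruby may help.  This is only a sketch of the behavior described above, not how Rails implements it: ToyFlash and sweep! are names I made up, with sweep! standing in for whatever happens between two requests.

    ```ruby
    # Toy model of the two flash behaviors; not the Rails implementation.
    class ToyFlash
      def initialize
        @current = {}   # messages visible during this request
        @upcoming = {}  # messages that will survive into the next request
      end

      # flash[:key] = value: visible now AND in the next request.
      def []=(key, value)
        @current[key] = value
        @upcoming[key] = value
      end

      def [](key)
        @current[key]
      end

      # flash.now[:key] = value: visible in this request only.
      def now
        @current
      end

      # Hypothetical hook called between requests.
      def sweep!
        @current = @upcoming
        @upcoming = {}
      end
    end

    flash = ToyFlash.new
    flash[:notice] = "Great success!"           # set before a redirect
    flash.now[:error] = "Let's try that again"  # set before a render
    flash.sweep!                                # the next request begins
    flash[:notice]                              # still there after the redirect
    flash[:error]                               # gone
    ```

    This also shows the double message I ran into: a value set with flash[]= is readable in the same request and again after the sweep.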

    It seems somewhat strange to me that I’m just discovering this now.  I suppose that as I learn more about Rails and web applications in general, I adjust my techniques and start experimenting with new ones (that aren’t always better).  The fact that flash[]= carries over to the next action and flash.now[]= doesn’t has only come into play now that I’m using flash as more of a central messaging system than just the standard notices from scaffolds.

     

    tags:  ruby  rails  fail 

  6. db:fresh - Easily rebuild your DB from scratch

    I suppose the idea behind Rails’ migrations is that once you create them, you don’t change them.  This is usually the right concept, and from this decision grow many of the rake db:* tasks.  Such tasks (migrate, reset, schema:load, schema:dump, and setup) do not re-run migrations that have already been applied, as they assume schema.rb is up to date (rightfully so).  But oftentimes (especially on greenfield development) I will tweak my migrations and want to start from scratch.  So here’s a quick hitter I now have my application templates add to every project I start:

    # Defined at the top level of the rake file: recent versions of Rake only
    # expose namespace/task there, not inside a module body.
    namespace :db do

      desc "drops, creates, migrates and seeds the DB"
      task :fresh => :environment do
        # Run the stock Rails tasks in order, reporting progress along the way
        puts "Dropping DB..."
        Rake::Task["db:drop"].invoke; puts "done."
        puts "Creating DB..."
        Rake::Task["db:create"].invoke; puts "done."
        puts "Migrating DB..."
        Rake::Task["db:migrate"].invoke; puts "done."
        puts "Seeding DB..."
        Rake::Task["db:seed"].invoke; puts "done."
        puts "Your DB is so fresh and so clean clean!"
      end

    end
    

     

    tags:  rails  ruby  db 

  7. 1_000_000 Ruby’s

    Ruby ignores the underscore character within integer literals.  Instead of writing

    bank_account.deposit(:cash => 1000000)

    you can write

    bank_account.deposit(:cash => 1_000_000)

    which is much easier to read.
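
    A quick way to convince yourself that the underscores are purely cosmetic (the variable names here are just for illustration):

    ```ruby
    plain   = 1000000
    grouped = 1_000_000

    same_value   = (plain == grouped)   # the two literals are the same Integer
    printed      = grouped.to_s         # underscores are not kept once parsed
    grouped_rate = 1_234.5              # works in float literals as well
    ```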

    When I’m depositing a million at my ATM however, I will a) refactor that constant out of my code and b) probably be blogging from somewhere other than my bedroom/home office =)

     

    tags:  ruby  knowledge 

  8. Ruby 1.9 fluent API

    I don’t like magic because I don’t understand how it works.  After I understand it, it’s no longer magic, but I usually have a much greater appreciation for the execution.

    I’ve been using Ruby without understanding the language’s nuts and bolts for a while.  So I picked up O’Reilly’s The Ruby Programming Language and am making my way through it.  Along the way I will document some of the useful and interesting pieces that I enjoy…

    A ‘fluent API’ allows coders to conveniently chain many statements together which act on an object (or a series of objects derived from it) in a logically progressive manner.

    indy500_winner = Race.new('Indianapolis').start.run_laps(500).winner

    While elegant, these chains can get very long.  Ruby 1.9 has altered its statement terminator rules to allow continuations on successive lines.  When the first non-whitespace character on a line is a period, it is considered a continuation.

    indy500_winner = Race.new('Indianapolis')
      .start
      .run_laps(250)
      .yellow_flag(10)
      .run_laps(240)
      .winner
    

    Nice, eh?
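
    For the chain to work, every method except the last has to return something that responds to the next call, usually self.  Here is a minimal sketch of a Race class that would support the chain above; the class body is entirely my own invention for illustration, not code from the book.

    ```ruby
    # Hypothetical Race class: each step returns self so calls can be chained.
    class Race
      attr_reader :laps_run, :caution_laps

      def initialize(name)
        @name = name
        @laps_run = 0
        @caution_laps = 0
        @started = false
      end

      def start
        @started = true
        self
      end

      def run_laps(count)
        raise "race has not started" unless @started
        @laps_run += count
        self
      end

      # Laps under yellow still count toward the total.
      def yellow_flag(count)
        @caution_laps += count
        @laps_run += count
        self
      end

      # Ends the chain: returns a description rather than self.
      def winner
        "winner of #{@name} after #{@laps_run} laps"
      end
    end

    indy500_winner = Race.new('Indianapolis')
      .start
      .run_laps(250)
      .yellow_flag(10)
      .run_laps(240)
      .winner
    ```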

     

    tags:  ruby  ruby1.9  syntax  knowledge 
