Friday, November 9, 2007

Synchronizing using Active Record


Recently, I was working on implementing a simple job queue. I needed multiple processes to pull jobs from the queue, and I needed to synchronize access to the job queue. ActiveRecord allows you to pass the :lock option to #find.



def find_new_job
Job.transaction do
job = Job.find(:first, :conditions => "status = 'new'", :lock => true)
if job
job.status = "processing"
job.save!
end
job
end
end


ActiveRecord will use a SELECT ... FOR UPDATE, which locks the rows in the result set. When another query selects any of those rows, it will block until the first transaction is complete. This works well on Postgres and MySQL 5, but Oracle gave me an error:


ActiveRecord::StatementInvalid: OCIError: ORA-02014: cannot select FOR UPDATE from view with DISTINCT, GROUP BY, etc.: select * from (select raw_sql_.*, rownum raw_rnum_ from (SELECT * FROM jobs WHERE (status = 'new') ) raw_sql_ where rownum <= 1) where raw_rnum_ > 0 FOR UPDATE
from ./script/../config/../config/../vendor/rails/activerecord/lib/active_record/connection_adapters/abstract_adapter.rb:128:in `log'
from ./script/../config/../config/../vendor/rails/activerecord/lib/active_record/connection_adapters/oracle_adapter.rb:222:in
`execute'
etc...


The Oracle adapter generates the above SQL for a find(:first), and SELECT .. FOR UPDATE won't work on a subselect. So, changing the method to something like the following will allow it to work in Oracle:

def find_new_job
Job.transaction do
jobs = Job.find(:all, :conditions => "status = 'new'", :lock => true)
job = nil
if jobs.size > 0
job = jobs.first
job.status = "processing"
job.save!
end
job
end
end


I've used this sort of job queue in Rails apps to allow for asynchronous calls to external services. Instead of calling the services synchronously and making the user wait (and use up a mongrel instance), create an entry in the job queue. External ruby processes can poll the job queue looking for work to do. If the user needs to see the results of the job (such as placing an order or authorizing a credit card), the request that created the job redirects to a controller action that polls the status of the job. When the job is completed, the action will redirect to a new action that will display the results of the job.

The same database locking mechanism can also be used as a simple way to synchronize access to resources that are shared across multiple processes on multiple servers. Create a simple ActiveRecord model for a Lock object:

class Lock < ActiveRecord::Base
def self.synchronize(name)
self.transaction do
Lock.find(:first, :conditions => ["name = ?", name], :lock => true)
yield
end
end
end


Create a lock:

Lock.create!(:name => "TestLock")


Then use it like this:

Lock.synchronize("TestLock") do
# access resource here
end


The downside to this is that all locks you want to use need to be created beforehand. If there are a few predetermined locks needed for the application, this approach works fine.

Photo: Gatun locks at the Panama Canal

1 comment:

theturingmachine said...

Hi, this happened to me too using the oracle adapter with backgroundrb, although I took a different approach. What I did was make one query and get the ids of the jobs and then do a query by id on the row I want to lock. Like this:

def find_new_job
Job.transaction do
t_job = Job.find(:first, :conditions => "status = 'new'", :select => 'id')
if t_job
job = Job.find(t_job[:id], :lock => true)
job.status = "processing"
job.save!
end
end
end

I've been trying to change the oracle adapter so you can use lock but haven't figured out how.