svs.io
← Back to blogEasy Tag Search With Sequel
One of the most under-rated libraries in the Ruby world in my opinion is the brilliant SQL builder library Sequel. Sequel allows you to build SQL statements in Ruby and compose them using all kinds of nice logic and Turing-complete programming language goodness. So, why in the world of ActiveRecord and DataMapper would we need something that helps us write SQL?
Well, ActiveRecord and DataMapper are ORMs. They exist in order to persist your objects to the database and work fine as long as the particular object is the correct abstraction through which to approach your desired functionality. Sometimes however, you need more than an ORM. You need to be able to look at all the data in your database holistically. You need something where the level of abstraction is the Relational Algebra, so you can make broad, sweeping statements about your entire database. One typical use case for such a level of abstraction would be a search feature. When you search, you probably want to span a lot of tables in your database and ORMs handle such a use case clumsily - you don’t really want to be firing off one query per table now do you?
In digiDoc, we offer powerful tag search as described in this video here:
You can actually get this entire functionality including stuff not shown in the video in about a 75 lines of Ruby code (150 if you care about readability), and that is what judicious use of Sequel will get you.
Here’s the code -
class Search | |
ALLOWED_MODELS = [:receipt, :business_card, :other_document] | |
SEARCH_COLS = { | |
:receipt => [:title,:organisation], | |
:business_card => [:first_name,:last_name,:organisation,:designation,:locality,:city,:pincode, :email, :twitter, :mobile, :notes], | |
:other_document => [:title,:notes] | |
} | |
NON_SEARCH_KEYS = [:tags, :q, :model, :all, :only, :from, :to, :by, :page] | |
attr_accessor :options, :result, :model_id_column | |
# options is a hash with the following keys | |
# model -> the name of the model to search. will search all models if omitted | |
# q -> the text to search. will also search tags and categories | |
# t -> to explicitly search tags and categories, pass them in this parameter. |See #parse_params for info on what this parameter looks like | |
# from -> the date to search from | |
# to -> the date to search uptil | |
# by -> paginate by this column. i.e. sometimes we want to paginate by first name or last name. | |
# page -> if paginating, whichth page to bring. i.e. you can say {:by => "last_name", :page => 'd'} will return all whose last name starts with d | |
# any other keys get passed on as where clauses | |
def initialize(options) | |
@options = parse_options(options) | |
end | |
# Public: Returns the ids of the objects matching the search criteria | |
def do | |
@result = join_table.select(model_id_column.as(:model_id)) | |
add_additional_where_clauses | |
search_for_text | |
paginate | |
filter_dates | |
filter_tags | |
{model_name.camelcase => @result.distinct.all.map{|h| h.values}.flatten} | |
end | |
private | |
# Private: returns a Sequel#function equivalent to i.e. CONCAT('title',' ','organisation') for Receipt and so on | |
# this is the value against which the search term is compared | |
# model -> the name of the model table to query | |
# returns a Sequel CONCAT function | |
def self.search_function_for(model) | |
concat_cols = search_columns_for(model) | |
c = concat_cols.zip((2..concat_cols.count).map{" "}).flatten.compact | |
Sequel.function(*([:concat] + c)) | |
end | |
def self.search_columns_for(model) | |
SEARCH_COLS[model.to_sym] + [:frozen_category_list, :frozen_tag_list] | |
end | |
# Private: | |
def parse_options(options) | |
options.tap do | |
tags = options.delete(:t) | |
options[:tags] = {:must => [], :must_not => [], :any => [], :only => []} | |
if tags | |
tags = tags.split(",").map(&:strip) if tags.class == String | |
only = tags.delete("!") | |
tags.each do |t| | |
key = {"+" => :must, "-" => :must_not}.fetch(t.strip[0],:any) | |
options[:tags][key] << t.gsub(/[+-]/,"").strip | |
end | |
options[:tags] = options[:tags].select{|k,v| !v.empty?} | |
options[:tags][:only] = options[:tags].delete(:any) if only | |
end | |
end | |
end | |
# Returns a Sequel representation of the model table joined with the taggings table joined with the tags table | |
def join_table | |
DB[model_table].left_join(:taggings,:taggable_id => :id, :taggable_type => model_name.camelcase).left_join(:tags, :id => :tag_id) | |
end | |
def add_additional_where_clauses | |
@result = @result.where(@options.except(*NON_SEARCH_KEYS).symbolize_keys) | |
end | |
def search_for_text | |
if @options[:q] | |
@result = @result.where(Sequel.ilike(Search.search_function_for(model_name), "%#{@options[:q]}%")) | |
end | |
end | |
def paginate | |
if options[:by] and options[:page] | |
@result = @result.where(Sequel.ilike(options[:by].to_sym, "#{@options[:page]}%")) | |
end | |
end | |
def filter_dates | |
if @options[:from] | |
from_date = Date.parse(@options[:from].to_s) | |
if from_date.year == 0 | |
@result = @result.where(:date => nil) | |
else | |
@result = @result.where('date >= ?', from_date) if @options[:from] | |
end | |
end | |
if @options[:to] | |
to_date = Date.parse(@options[:to].to_s) rescue nil | |
@result = @result.where('date <= ?', Date.parse(@options[:to].to_s)) if to_date and to_date.year > 0 | |
end | |
end | |
def filter_tags | |
if @options[:tags] | |
filter_for_any_clause | |
filter_for_must_have_clause | |
filter_for_only_clause | |
filter_for_must_not_clause | |
end | |
end | |
def filter_for_any_clause | |
unless @options[:tags][:any].blank? | |
@result = @result.where(:name => @options[:tags][:any]) | |
end | |
end | |
def filter_for_must_have_clause | |
if !@options[:tags][:must].blank? | |
count_must_have_tags = join_table.select(model_id_column.as(:model_id),Sequel.function(:count, model_id_column).as(:count_must)).where(:name => @options[:tags][:must]).group_by(model_id_column) | |
num = @options[:tags][:must].count | |
@result = @result.left_join(count_must_have_tags, :model_id => model_id_column).where{count_must >= num} | |
end | |
end | |
def filter_for_only_clause | |
if !@options[:tags][:only].blank? | |
count_only_tags = join_table.select(model_id_column.as(:model_id),Sequel.function(:count, model_id_column).as(:count_only)).group_by(model_id_column) | |
num = @options[:tags][:only].count | |
@result = @result.left_join(count_only_tags, :model_id => model_id_column).where(:count_only => num, :name => @options[:tags][:only]) | |
end | |
end | |
def filter_for_must_not_clause | |
if !@options[:tags][:must_not].blank? | |
mid = model_id_column | |
count_must_not_have_tags = join_table.select(model_id_column.as(:model_id)).where(:name => @options[:tags][:must_not]).group_by(model_id_column).having{Sequel.function(:count, mid) > 0} | |
@result = @result.exclude(:receipts__id => count_must_not_have_tags) | |
end | |
end | |
# i.e. "receipt" | |
def model_name | |
@options[:model].to_s | |
end | |
# i.e. :receipts | |
def model_table | |
model_name.pluralize.to_sym | |
end | |
# i.e. :receipts__id | |
def model_id_column | |
"#{model_name.pluralize}__id".to_sym | |
end | |
end |
As you can see, Sequel provides a beautiful abstraction of SQL and allows you to exploit the full power of your data store.
The tag filter bits where we compare for “must have” and “must only have” are based on a clever trick I read on StackOverflow (sorry can’t find the link). Instead of comparing against a given set of tags, what we do is to aggregate the count of tags matching the given criterea per record and then check if the count is more than the number of tags we’re looking for. This would have been extremely difficult without changing the level of abstraction. As Uncle Bob says - perspective is everything.
These days, I am using Sequel in all my projects. I cannot think of a better way to work with aggregation, reporting, search and other stuff that doesn’t have to do with object persistence.
If you liked this article, and would love your web-apps to have similar cool features built with the same level of care and consideration, consider hiring our new Ruby consulting firm - Sealink Consulting. We’re based in Mumbai but available to work remotely. Email svs at this domain to get in touch.