svs.io

← Back to blog

Easy Tag Search With Sequel

One of the most under-rated libraries in the Ruby world in my opinion is the brilliant SQL builder library Sequel. Sequel allows you to build SQL statements in Ruby and compose them using all kinds of nice logic and Turing-complete programming language goodness. So, why in the world of ActiveRecord and DataMapper would we need something that helps us write SQL?

Well, ActiveRecord and DataMapper are ORMs. They exist in order to persist your objects to the database and work fine as long as the particular object is the correct abstraction through which to approach your desired functionality. Sometimes however, you need more than an ORM. You need to be able to look at all the data in your database holistically. You need something where the level of abstraction is the Relational Algebra, so you can make broad, sweeping statements about your entire database. One typical use case for such a level of abstraction would be a search feature. When you search, you probably want to span a lot of tables in your database and ORMs handle such a use case clumsily - you don’t really want to be firing off one query per table now do you?

In digiDoc, we offer powerful tag search as described in this video here:

You can actually get this entire functionality including stuff not shown in the video in about a 75 lines of Ruby code (150 if you care about readability), and that is what judicious use of Sequel will get you.

Here’s the code -

class Search
ALLOWED_MODELS = [:receipt, :business_card, :other_document]
SEARCH_COLS = {
:receipt => [:title,:organisation],
:business_card => [:first_name,:last_name,:organisation,:designation,:locality,:city,:pincode, :email, :twitter, :mobile, :notes],
:other_document => [:title,:notes]
}
NON_SEARCH_KEYS = [:tags, :q, :model, :all, :only, :from, :to, :by, :page]
attr_accessor :options, :result, :model_id_column
# options is a hash with the following keys
# model -> the name of the model to search. will search all models if omitted
# q -> the text to search. will also search tags and categories
# t -> to explicitly search tags and categories, pass them in this parameter. |See #parse_params for info on what this parameter looks like
# from -> the date to search from
# to -> the date to search uptil
# by -> paginate by this column. i.e. sometimes we want to paginate by first name or last name.
# page -> if paginating, whichth page to bring. i.e. you can say {:by => "last_name", :page => 'd'} will return all whose last name starts with d
# any other keys get passed on as where clauses
def initialize(options)
@options = parse_options(options)
end
# Public: Returns the ids of the objects matching the search criteria
def do
@result = join_table.select(model_id_column.as(:model_id))
add_additional_where_clauses
search_for_text
paginate
filter_dates
filter_tags
{model_name.camelcase => @result.distinct.all.map{|h| h.values}.flatten}
end
private
# Private: returns a Sequel#function equivalent to i.e. CONCAT('title',' ','organisation') for Receipt and so on
# this is the value against which the search term is compared
# model -> the name of the model table to query
# returns a Sequel CONCAT function
def self.search_function_for(model)
concat_cols = search_columns_for(model)
c = concat_cols.zip((2..concat_cols.count).map{" "}).flatten.compact
Sequel.function(*([:concat] + c))
end
def self.search_columns_for(model)
SEARCH_COLS[model.to_sym] + [:frozen_category_list, :frozen_tag_list]
end
# Private:
def parse_options(options)
options.tap do
tags = options.delete(:t)
options[:tags] = {:must => [], :must_not => [], :any => [], :only => []}
if tags
tags = tags.split(",").map(&:strip) if tags.class == String
only = tags.delete("!")
tags.each do |t|
key = {"+" => :must, "-" => :must_not}.fetch(t.strip[0],:any)
options[:tags][key] << t.gsub(/[+-]/,"").strip
end
options[:tags] = options[:tags].select{|k,v| !v.empty?}
options[:tags][:only] = options[:tags].delete(:any) if only
end
end
end
# Returns a Sequel representation of the model table joined with the taggings table joined with the tags table
def join_table
DB[model_table].left_join(:taggings,:taggable_id => :id, :taggable_type => model_name.camelcase).left_join(:tags, :id => :tag_id)
end
def add_additional_where_clauses
@result = @result.where(@options.except(*NON_SEARCH_KEYS).symbolize_keys)
end
def search_for_text
if @options[:q]
@result = @result.where(Sequel.ilike(Search.search_function_for(model_name), "%#{@options[:q]}%"))
end
end
def paginate
if options[:by] and options[:page]
@result = @result.where(Sequel.ilike(options[:by].to_sym, "#{@options[:page]}%"))
end
end
def filter_dates
if @options[:from]
from_date = Date.parse(@options[:from].to_s)
if from_date.year == 0
@result = @result.where(:date => nil)
else
@result = @result.where('date >= ?', from_date) if @options[:from]
end
end
if @options[:to]
to_date = Date.parse(@options[:to].to_s) rescue nil
@result = @result.where('date <= ?', Date.parse(@options[:to].to_s)) if to_date and to_date.year > 0
end
end
def filter_tags
if @options[:tags]
filter_for_any_clause
filter_for_must_have_clause
filter_for_only_clause
filter_for_must_not_clause
end
end
def filter_for_any_clause
unless @options[:tags][:any].blank?
@result = @result.where(:name => @options[:tags][:any])
end
end
def filter_for_must_have_clause
if !@options[:tags][:must].blank?
count_must_have_tags = join_table.select(model_id_column.as(:model_id),Sequel.function(:count, model_id_column).as(:count_must)).where(:name => @options[:tags][:must]).group_by(model_id_column)
num = @options[:tags][:must].count
@result = @result.left_join(count_must_have_tags, :model_id => model_id_column).where{count_must >= num}
end
end
def filter_for_only_clause
if !@options[:tags][:only].blank?
count_only_tags = join_table.select(model_id_column.as(:model_id),Sequel.function(:count, model_id_column).as(:count_only)).group_by(model_id_column)
num = @options[:tags][:only].count
@result = @result.left_join(count_only_tags, :model_id => model_id_column).where(:count_only => num, :name => @options[:tags][:only])
end
end
def filter_for_must_not_clause
if !@options[:tags][:must_not].blank?
mid = model_id_column
count_must_not_have_tags = join_table.select(model_id_column.as(:model_id)).where(:name => @options[:tags][:must_not]).group_by(model_id_column).having{Sequel.function(:count, mid) > 0}
@result = @result.exclude(:receipts__id => count_must_not_have_tags)
end
end
# i.e. "receipt"
def model_name
@options[:model].to_s
end
# i.e. :receipts
def model_table
model_name.pluralize.to_sym
end
# i.e. :receipts__id
def model_id_column
"#{model_name.pluralize}__id".to_sym
end
end
view raw gistfile1.rb hosted with ❤ by GitHub

As you can see, Sequel provides a beautiful abstraction of SQL and allows you to exploit the full power of your data store.

The tag filter bits where we compare for “must have” and “must only have” are based on a clever trick I read on StackOverflow (sorry can’t find the link). Instead of comparing against a given set of tags, what we do is to aggregate the count of tags matching the given criterea per record and then check if the count is more than the number of tags we’re looking for. This would have been extremely difficult without changing the level of abstraction. As Uncle Bob says - perspective is everything.

These days, I am using Sequel in all my projects. I cannot think of a better way to work with aggregation, reporting, search and other stuff that doesn’t have to do with object persistence.

If you liked this article, and would love your web-apps to have similar cool features built with the same level of care and consideration, consider hiring our new Ruby consulting firm - Sealink Consulting. We’re based in Mumbai but available to work remotely. Email svs at this domain to get in touch.