Jun 21, 2016

Accent insensitive queries in Rails with PostgreSQL

blogpost cover image
Hello there! Today’s post will be short, but important nonetheless. There are many languages that contain special characters resembling those of basic latin charset. You know, like Spanish or French accentuated letters, or those fancy Polish ones (which was exactly the reason I had to implement this feature). 

Imagine a web app containing information about different companies and their location. Let’s say you look for companies based in my hometown – Łódź. Honestly, I can't imagine a place with a trickier name at the moment.

Now, let's suppose that it isn't anywhere near as easy to enter these characters. Your system may not support them, or you could be using your mobile phone’s onscreen keyboard, with no easy access to these characters. You type 'lodz', and... you don't find anything, maybe except for some company that happened to mistype their city name. Or, you manage to find some of the characters, but mistype one of them ('ź' may happen to be quite challenging).

Enter Unaccent

I actually thought that working this out would be harder than it actually was. I was afraid that I'll need to write some dirty Ruby, or even SQL hack-arounds. Luckily, it turned out that Postgres delivers an extension capable of handling such cases. It is called unaccent, and actually making it work is a piece of cake.

Let's start with a migration that will enable the extension (it's not available by default).

Create a migration:

class EnableUnaccentExtension < ActiveRecord::Migration
  def change
    enable_extension 'unaccent'

and run rake db:migrate.

Now you can use unaccent(column_name) inside your SQL calls:

# company.rb
  def self.search_for_company_in(city)
    where('unaccent(city) ILIKE unaccent(?)', "%#{city}%")

Piece of cake, isn't it? You might actually consider converting the above to a scope, just to make it look somewhat nicer.

What may become a bit troublesome is making this work production-side. You need to have the extension-supporting files installed, and you need appropriate privileges – in most cases either a superuser or database owner privilege. 

For more information about extensions, check the PostgreSQL documentation. If you want to read some more about unaccent extension, also refer to the docs.

That's all folks. Thanks for your attention and happy coding! :)