Apr 25, 2017

Browser automation with Watir - guide (not only) for testers

Let's assume that you're a tester working on some kind of web application. You've got some kind of specification, and you're manually testing the app against it. This approach takes a lot of your time and requires meticulous attention to detail.

After every change, the team and the clients want testing done ASAP. You, on the other hand, don't want to miss anything, so the time it takes probably doesn't meet their expectations.

The rational solution is to automate your tests. But let's not fool ourselves - test automation is hard.

Maybe you tried out Selenium IDE, but found it unwieldy and ineffective. Maybe you started scripting in Selenium, but it feels strange and uncomfortable. Both methods take a lot of time - which you don't have.

What worked for me was Watir (http://watir.github.io/) - an open source Ruby library providing a simple and comprehensive language for automating browsers. It's implemented as a wrapper around Selenium, so it should behave no differently when it comes to browser compatibility.


This section is mostly directed at testers who want to maintain automated test suites - I believe that doing browser automation tests effectively requires something more than a great tool. Before starting, you need to understand some core principles, otherwise it's going to be a big investment that never pays off. If you're a developer wanting to take a look at the API, don't hesitate to go straight to “Gory Details” below.

Unit First

Unit tests are the foundation of regression testing. At Binar::Apps we believe that they're the responsibility of the programmer - ensuring that the code works just as they intended. We maintain our end-to-end tests separately, and believe that it's better to do so.

If you're not sure about unit tests in your project, talk about them with the developers - good unit tests can save the team a lot of stress (you can find our posts concerned with testing here).

I wouldn't recommend focusing on browser automation before you have a strong foundation of unit tests. As a lot of “testing pyramid” articles on the Internet will tell you, basing your quality on automatic end-to-end tests is a bad idea. You still need to have them, though - hence the post you're reading right now.

Source: https://watirmelon.blog/2016/05/18/ama-the-eye-above-my-testing-pyramid/

It's a good idea to take a look at the app's unit tests before you start. You'll spend less time stressing over features that are extensively covered there.

What should I automate?

I like to think of automated browser tests as a tool supporting manual testing. They're supposed to save you time - and you need to keep that in mind. If you forget about it, it's really easy to end up spending more time on writing and maintaining tests than you gain in the end:

Source: https://xkcd.com/1205/

Substituting 100% of your manual tests with browser automation is not only hard. It's also impractical, and certainly not bug-proof.

Take a look at your scripts. Ask yourself these questions:

  1. What is simplest to automate? - basic features with simple prerequisites, preferably not using a lot of JS.

  2. What is hardest to automate? - e.g. JS-heavy flows, tests that are dependent on the state of the app or require processing a lot of data.

  3. Which features are the most important?

  4. Where were previous regressions discovered?

  5. Which tests take you the longest? Which are executed most often?

  6. What is hardest to test manually? (e.g. long forms that require monitoring a lot of data)

I think that answering them should give you a good idea of where to start.
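If you prefer numbers to gut feeling, the answers can be turned into a rough ranking. The sketch below is only an illustration: the `TestCase` struct, the sample cases and the scoring formula are all assumptions made up for this example, not an established metric, so tune them to your project.

```ruby
# Hypothetical scoring of manual test cases (all names and weights are assumptions).
TestCase = Struct.new(:name, :minutes_per_run, :runs_per_month, :difficulty, :critical)

# Minutes of manual work per month, doubled for critical features and
# divided by how hard the test is to automate (1 = trivial, 5 = very hard).
def automation_score(test_case)
  saved = test_case.minutes_per_run * test_case.runs_per_month
  saved *= 2 if test_case.critical
  saved.to_f / test_case.difficulty
end

# Highest score first: these are the candidates to automate early.
def automation_order(test_cases)
  test_cases.sort_by { |test_case| -automation_score(test_case) }
end

cases = [
  TestCase.new('login', 2, 40, 1, true),
  TestCase.new('report export', 15, 4, 4, false),
  TestCase.new('checkout', 10, 20, 3, true)
]

automation_order(cases).each do |test_case|
  puts format('%-15s %.1f', test_case.name, automation_score(test_case))
end
```

With these made-up numbers the short but frequently repeated login test comes out on top, which matches the idea of starting with what is simple and executed most often.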

When to do it?

This one is fairly simple - you should handle your automated tests just like project documentation - start early, update often.

In a perfect world you would even write these tests before the feature is introduced, but - unfortunately - that is not a practical solution.

The Gory Details

I assume you already have Ruby set up on your machine and have some basic knowledge of it. If you don't, don't be afraid - it's a language where code almost writes itself, and I recommend giving it a shot. I won't elaborate on the Ruby setup process here, but I suggest using RVM and Bundler.


First, you'll need to install the 'watir' gem - either manually or through your Gemfile.

Make sure that you have the browsers you want to automate installed, and download the appropriate driver binaries:

Firefox - you should download 'geckodriver' from https://github.com/mozilla/geckodriver/releases

Chrome - you should download 'chromedriver' from https://sites.google.com/a/chromium.org/chromedriver/downloads

Unpack the downloaded archives and put the folder that contains the unpacked driver binaries on your PATH.
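A quick way to check that the drivers really are reachable is to search the PATH yourself. This is a minimal sketch in plain Ruby; `find_on_path` is a helper written for this post (not part of Watir), and it assumes the binaries are called exactly `chromedriver` and `geckodriver` (on Windows they carry an `.exe` suffix).

```ruby
# Returns the full path of the first executable called `name` found in
# the given PATH string, or nil when there is none.
def find_on_path(name, path = ENV.fetch('PATH', ''))
  path.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

%w[chromedriver geckodriver].each do |driver|
  location = find_on_path(driver)
  puts "#{driver}: #{location || 'not found on PATH'}"
end
```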

We're good to go - it's time for the magic to happen.


No time to waste! Enter `irb` in your console and repeat after me:

require 'watir'
# disable waiting for elements that aren't on the page - when fiddling around in the console, the 30s timeout is overkill
Watir.relaxed_locate = false
browser = Watir::Browser.new :chrome
browser.goto 'binarapps.com'
puts browser.title
puts browser.h1.text
browser.element(text: 'Blog').click
puts browser.divs(class: 'blog-post-item').count
posts = browser.divs(class: 'blog-post-item')
puts posts[2].a(class: 'post-title').text
posts.each { |post| puts post.a(class: 'post-title').text }
posts[0].a(class: 'post-title').click
puts browser.h1.text

Wasn't that simple?

Let's take a look at the things we can do.


After requiring the gem and switching off relaxed locating, we create a browser object and bind it to a variable so that we can control it. We can pass the browser type (e.g. :firefox or :chrome) as a parameter. Every interaction with the page has to pass through this variable.

Basic methods:

browser.goto 'binarapps.com' # navigates to the provided address
browser.title                # returns the page title
browser.url                  # returns the current url
browser.back                 # navigates back in history
browser.forward              # navigates forward in history
browser.close                # closes the browser
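In a script (as opposed to an irb session) it's easy to leave a browser window behind when something raises halfway through. One possible way around this is a small wrapper that always calls `close`. `with_browser` is a hypothetical helper written for this post, not a Watir method; it works with any object that responds to `close`.

```ruby
# Yields the given browser and closes it afterwards, even when the
# block raises. The block's return value is passed through.
def with_browser(browser)
  yield browser
ensure
  browser.close
end

# Usage with Watir (commented out, as it needs the gem and a driver):
# require 'watir'
# with_browser(Watir::Browser.new(:chrome)) do |browser|
#   browser.goto 'binarapps.com'
#   puts browser.title
# end
```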

Finding elements

The browser class includes the `Container` module, allowing us to access child elements through their HTML tags (all supported HTML5 tags are included), input type (e.g. “text_field” or “radio”) or through the generic “element” method.

Most elements we interact with also include this module, so we can access their child elements using the very same methods.

These methods take selectors as parameters - a selector is a key-value pair consisting of a property to be checked and the value we're looking for. Most importantly, you can locate elements based on their text or HTML attributes.

The list of handled attributes is finite, but extensive. If you're not sure, just give it a try (remember to replace dashes with underscores, e.g. in “data-something” attributes). If an attribute isn't supported, you can always use `xpath: ''` and `css: ''` locators to make up for it.

If multiple elements fit the description, the first one is selected.

browser.goto 'binarapps.com'
browser.div(class: 'navbar-menu-item')

# locating the “a” child of the selected div
browser.div(text: 'Blog').a

# you can also assign the element to a variable
blog_link = browser.div(text: 'Blog').a

# bypassing the fact that the 'navbar' tag isn't present in the HTML5 specs
navbar = browser.element(tag_name: 'navbar')

# using multiple selectors in a single instruction
browser.element(class: 'navbar-menu-item', text: 'Blog')

# locating by css
browser.element(css: '.navbar > div > .navbar-wrapper')

# locating by xpath
browser.element(xpath: "//navbar/div/div[@class='navbar-wrapper']")
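The dash-to-underscore rule for attribute names mentioned above is easiest to see side by side. The helper below only illustrates the naming convention: `attribute_name` is hypothetical (it is not part of Watir), and `data_something` is the placeholder attribute from the text.

```ruby
# Hypothetical helper (not part of Watir): shows which HTML attribute
# a selector key written with underscores stands for.
def attribute_name(selector_key)
  selector_key.to_s.tr('_', '-')
end

puts attribute_name(:data_something)  # data-something
puts attribute_name(:aria_label)      # aria-label

# The matching Watir locator would then look like this:
# browser.div(data_something: 'value')
```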

Assigning elements to variables

In contrast to Selenium, Watir doesn't bind elements to a specific object on the page, but rather remembers the given set of locators - the variable refers to the object fitting them (more here). This approach has both upsides and downsides. The most important thing to remember is that if multiple elements fit the description, we're not guaranteed that every call will refer to the same one (I've never had a problem of this kind, but it can happen).
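To make the idea more tangible, here is a toy model in plain Ruby. It is not Watir's implementation (the real lookup and caching logic is more involved); `LazyElement` and the array-of-hashes "page" are assumptions invented for this sketch. It only shows why a variable that stores locators can end up pointing at a different element after the page changes.

```ruby
# Toy model (not Watir's code): an element that stores a selector and
# looks the match up again on every call.
class LazyElement
  def initialize(page, selector)
    @page = page          # array of hashes standing in for the DOM
    @selector = selector  # e.g. { class: 'post' }
  end

  # Finds the first node matching every key/value pair, at call time.
  def locate
    @page.find { |node| @selector.all? { |key, value| node[key] == value } }
  end

  def text
    node = locate
    node && node[:text]
  end
end

page = [{ class: 'post', text: 'Old post' }]
first_post = LazyElement.new(page, { class: 'post' })
puts first_post.text # Old post

# The page changes: a newer post is now the first match.
page.unshift({ class: 'post', text: 'New post' })
puts first_post.text # New post
```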

Element properties

We can access the basic properties of the chosen elements using simple methods:

.class_name (as “class” is of course reserved in Ruby)

You can also read most of the element attributes (the same ones we can locate by) this way. If the one you're looking for isn't supported, you can use the 'attribute_value' method, providing the attribute name as a parameter.

browser.goto 'binarapps.com'
browser.div(text: 'Blog').class_name
browser.a(class: 'linkedin').href

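# accessing unsupported attributes ('data-something' is a placeholder attribute name)
browser.div(text: 'Blog').attribute_value('data-something')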

Element presence

Watir allows us to check whether the element with the provided locator is present. There are three methods that serve this purpose:

.present? - checks whether an element exists and is visible (doesn't throw an exception for nonexistent elements)
.visible? - checks whether an existing element is visible on the page (throws an exception if the element doesn't exist)
.exists? - checks whether an element exists (it can be invisible)

Source: https://jkotests.wordpress.com/2012/11/02/checking-for-an-element-exists-visible-present/

browser.goto 'binarapps.com'

# a visible element
browser.div(text: 'Blog').present?
browser.div(text: 'Blog').exists?
browser.div(text: 'Blog').visible?

# an invisible element

# element that doesn't exist
browser.div(text: 'nosuchdiv').present?
browser.div(text: 'nosuchdiv').exists?
browser.div(text: 'nosuchdiv').visible?
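Since `.visible?` can throw for a nonexistent element, it can be handy to fold the three checks into one call that never raises. A minimal sketch, assuming only the `exists?` and `present?` methods described above; `element_state` is a helper name made up for this post.

```ruby
# Classifies an element without risking an exception: `exists?` is
# asked first, and `present?` only for elements that are in the DOM.
def element_state(element)
  return :missing unless element.exists?

  element.present? ? :visible : :hidden
end

# Usage with Watir (commented out, as it needs a live browser):
# puts element_state(browser.div(text: 'Blog'))
# puts element_state(browser.div(text: 'nosuchdiv'))
```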


Basic interactions with all elements may be executed with these methods:


Input elements

Some input elements have special methods we can use to interact with them - for example:

Text fields (text_field) - .set, .value
Radio buttons (radio) - .set, .set?
Select boxes (select_list) - .select, .selected_options

You can find practical examples here - http://watir.github.io/docs/elements/
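Put together, the methods listed above could fill in a whole form. The sketch below is written against a hypothetical sign-up form: the field names ('email', 'country') and the idea of choosing a plan by radio value are assumptions, so adjust the selectors to your own markup.

```ruby
# Fills a hypothetical sign-up form and reads the values back.
# `browser` is expected to be a Watir::Browser (or anything that
# exposes text_field, radio and select_list the same way).
def fill_signup_form(browser, email:, plan:, country:)
  browser.text_field(name: 'email').set(email)
  browser.radio(value: plan).set
  browser.select_list(name: 'country').select(country)

  # Return what the form now holds, so a test can assert on it.
  {
    email: browser.text_field(name: 'email').value,
    plan_selected: browser.radio(value: plan).set?,
    country: browser.select_list(name: 'country').selected_options.map(&:text)
  }
end

# Usage with Watir (commented out, as it needs a live browser):
# fill_signup_form(browser, email: 'me@example.com', plan: 'basic', country: 'Poland')
```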


You can keep groups of items with a common locator as collections. To create them you need to pluralize the element type in the locator (text_field => text_fields; div => divs; checkbox => checkboxes, etc.). Collections allow you to access elements within them, get their number or iterate through them.

browser.goto 'binarapps.com/blog'
posts = browser.divs(class: 'blog-post-item')
posts.length # returns the number of posts; equivalent to browser.divs(class: 'blog-post-item').length
posts.first  # returns the first post
posts.last   # returns the last post
posts[2]     # returns the post with selected index ([2] is the third one - indexes start at 0!)
posts.each { |p| puts p.element(class: 'post-title').text }
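Collections can also be passed to `map` and similar iterator methods, which makes it easy to pull data out of them in one go. A minimal sketch: `post_titles` is a helper made up for this post, and the default 'post-title' class is the one used in the examples above.

```ruby
# Returns the stripped, non-empty title of every item in a collection.
# `title_class` is the class of the link that holds the title.
def post_titles(posts, title_class: 'post-title')
  posts.map { |post| post.a(class: title_class).text.strip }
       .reject(&:empty?)
end

# Usage with Watir (commented out, as it needs a live browser):
# puts post_titles(browser.divs(class: 'blog-post-item'))
```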


You can take a screenshot of the currently open page any time you like.

browser.screenshot.save 'screenshot.png'
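When screenshots are taken from a test suite, a fixed file name gets overwritten on every run. One simple option is to build the name from a label and a timestamp; `screenshot_name` below is a hypothetical helper in plain Ruby, and the naming scheme is just an assumption.

```ruby
# Builds a file name from a label and a timestamp,
# e.g. "blog-index-20170425-101500.png".
def screenshot_name(label, time = Time.now)
  safe_label = label.downcase.gsub(/[^a-z0-9]+/, '-').gsub(/\A-|-\z/, '')
  "#{safe_label}-#{time.strftime('%Y%m%d-%H%M%S')}.png"
end

puts screenshot_name('Blog index')

# Usage with Watir (commented out, as it needs a live browser):
# browser.screenshot.save screenshot_name('Blog index')
```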

Need more?

Extensive documentation is available here - http://www.rubydoc.info/gems/watir

Headless mode

If you've got your tests already scripted, having a browser interface running may be unnecessary and distracting - there is a simple way around this, using the 'headless' gem. Basic usage:

require 'watir'
require 'headless'

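# the headless gem runs the browser inside a virtual display (it relies on Xvfb being installed)
headless = Headless.new
headless.start
browser = Watir::Browser.new :firefox
browser.goto 'binarapps.com/blog'
puts 'it works' if browser.h1.present?
# clean up: close the browser and shut down the virtual display
browser.close
headless.destroy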


That's it for today - hope you've enjoyed the ride :)

I think that Watir is a really powerful tool that allows you to write clean and readable scripts. If you're in need of browser automation, I recommend giving it a try.

Thanks for your attention and happy testing!