Use Advanced User Interactions in Selenium WebDriver to drive keyboard and mouse

How to use Selenium WebDriver’s ActionBuilder for automated testing

Courtney Zhan
Jun 5, 2022

The ActionBuilder in Selenium WebDriver provides a way to set up and perform complex user interactions. Specifically, it groups a series of keyboard and mouse operations and sends them to the browser together.

Mouse interactions

  • click
  • click_and_hold
  • context_click
  • double_click
  • drag_and_drop
  • drag_and_drop_by
  • move_by
  • move_to
  • release

Keyboard interactions

  • key_down
  • key_up
  • send_keys

These functions in Advanced User Interactions are self-explanatory. This article will show some examples of how to use them.

The usage

driver.action + one or more of the operations above + .perform

Check out the ActionBuilder API for more.
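
As a minimal sketch of that pattern, the helper below chains two of the mouse operations listed above and finishes with perform. The name hover_and_click is made up for illustration. It assumes driver is a Selenium::WebDriver driver (from require "selenium-webdriver") and element was returned by find_element.

```ruby
# Hypothetical helper (not part of Selenium): chain two operations and
# send them to the browser in one go.
# Assumes `driver` is a Selenium::WebDriver driver and `element` came
# from driver.find_element.
def hover_and_click(driver, element)
  driver.action           # start a new action chain
        .move_to(element) # queue: move the mouse over the element
        .click            # queue: click at the current mouse position
        .perform          # send the queued actions to the browser
end
```

The operations are only queued as the chain is built. The browser sees them when perform is called.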

Mouse Over

When the mouse is moved over the email field below, a tip (must contains @) shows up.

The HTML fragment

<input id="email" name="email" type="email" style="height:30px; width: 280px;" data-toggle="tooltip" data-placement="right" title="must contains @">

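Test Script

driver.get(site_url + "/html5.html") # site_url: assumed name for the test site's base URL
elem = driver.find_element(:id, "email")
driver.action.move_to(elem).perform # hover over the email field
sleep 1
expect(page_text).to include("must contains")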

move_to takes an element as a parameter, but Selenium can also move to x and y coordinates, for example:

driver.action.move_to_location(0, 0).perform

This moves the mouse over the top-left pixel of the browser window.
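
The list of mouse operations also includes move_by, which moves relative to wherever the mouse currently is. Here is a small sketch combining the two. The helper name move_then_nudge and its parameters are hypothetical, and driver is assumed to be a Selenium::WebDriver driver.

```ruby
# Hypothetical helper: jump to an absolute position, then nudge the mouse
# by a relative amount.
# Assumes `driver` is a Selenium::WebDriver driver.
def move_then_nudge(driver, x, y, right_by, down_by)
  driver.action
        .move_to_location(x, y)     # absolute position in the browser window
        .move_by(right_by, down_by) # relative to the current mouse position
        .perform
end
```

For example, move_then_nudge(driver, 0, 0, 50, 20) would start at the top-left pixel and end 50 pixels to the right and 20 pixels down.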

Double Click

When you double-click the text “Quick Fill”, the password field is filled in automatically.

The HTML fragment

<input type="password" name="password" id="pass">
<span id="quickfill" ondblclick="quick_fill()">Quick Fill(double click)</span>
function quick_fill() {

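Test Script

driver.get(site_url + "/text_field.html") # site_url: assumed name for the test site's base URL
quick_fill_elem = driver.find_element(:id, "quickfill")
# double click to fill
driver.action.double_click(quick_fill_elem).perform
sleep 0.2
expect(driver.find_element(:id, "pass")["value"]).to eq("ABC")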

Click and Hold

Select the item boxes 6 to 8 in the following grid.

Screenshot selecting item boxes 6, 7 and 8 in one mouse hold. Screenshot from website.

The HTML fragment

<ol id="selectable" class="ui-selectable">
<li class="ui-state-default ui-selectee">1</li>
<li class="ui-state-default ui-selectee">2</li>
<li class="ui-state-default ui-selectee">3</li>
<li class="ui-state-default ui-selectee">12</li>

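Test Script

driver.get("")
sleep 1
driver.find_element(:link_text, "Display as grid").click
sleep 1
list_items = driver.find_elements(:xpath, "//ol[@id='selectable']/li")
# boxes 6 to 8 are at indexes 5 to 7
driver.action.click_and_hold(list_items[5]).click_and_hold(list_items[7]).release.perform
sleep 0.5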

Note that click_and_hold is used twice here, once for the start element and once for the end element. The mouse then performs a release to complete the hold.
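
The same idea can be wrapped in a small helper so the box numbers shown on screen map onto array indexes in one place. This is only a sketch. select_boxes is a made-up name, driver is assumed to be a Selenium::WebDriver driver, and items is assumed to be the array returned by find_elements.

```ruby
# Hypothetical helper: select a run of boxes by their on-screen numbers.
# Box numbers start at 1 while Ruby array indexes start at 0, hence the "- 1".
def select_boxes(driver, items, first_box, last_box)
  driver.action
        .click_and_hold(items[first_box - 1]) # press on the first box
        .click_and_hold(items[last_box - 1])  # move to the last box and hold there
        .release                              # let go to finish the selection
        .perform
end
```

With the grid above, select_boxes(driver, list_items, 6, 8) would cover boxes 6, 7 and 8.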

Drag and Drop

Drag Item 1 to the Trash block.

The HTML fragment

<div id="trash" class="ui-droppable over">
<div id="items">
<div class="item ui-draggable ui-draggable-handle" id="item_1" style="position: relative;">
<span>Item 1</span>
<div class="item ui-draggable ui-draggable-handle" id="item_2" style="position: relative;">
<span>Item 2</span>
<div class="item ui-draggable ui-draggable-handle" id="item_3" style="position: relative;">
<span>Item 3</span>

Test Script

driver.get(site_url + "/drag_n_drop.html") # site_url: assumed name for the test site's base URL
drag_from = driver.find_element(:id, "item_1")
target = driver.find_element(:id, "trash")
driver.action.drag_and_drop(drag_from, target).perform
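
The operations list also includes drag_and_drop_by, which drags an element by a pixel offset rather than onto a target element. Here is a minimal sketch. drag_by is a hypothetical wrapper name, and driver is assumed to be a Selenium::WebDriver driver.

```ruby
# Hypothetical helper: drag an element by a pixel offset instead of onto
# a target element.
# Assumes `driver` is a Selenium::WebDriver driver.
def drag_by(driver, element, right_by, down_by)
  driver.action.drag_and_drop_by(element, right_by, down_by).perform
end
```

A call such as drag_by(driver, drag_from, 100, 0) would drag Item 1 100 pixels to the right. The offset is arbitrary, and whether it lands on the Trash block depends on the page layout.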

In some cases, click_and_hold + move_to can be an alternative to drag_and_drop. The below works for the above HTML as well.

driver.action.click_and_hold(drag_from).move_to(target).release.perform



Context Click

Right-clicking will bring up the context menu, where additional actions can be performed.

The HTML fragment

<input type="password" name="password" id="pass">

Test Script

elem = driver.find_element(:id, "pass")
driver.action.context_click(elem).perform

This shows the popup context menu.

Move Mouse by Offset

You can move the mouse to a specific coordinate relative to a web element by providing an offset coordinate.

Below is an image with the linked area defined.

Click the coordinates within the linked area on the WhenWise image to go to the site.

The HTML fragment

<img src="images/agileway_software.png" border="0" width="400" id="agileway_software" usemap="#agileway_software_map">
<map name="agileway_software_map" id="agileway_software_map">
<area shape="rect" coords="13,16,120,42" href="" alt="testwise" title="">
<area shape="rect" coords="13,73,127,100" href="" alt="buildwise" title="">
<area shape="circle" coords="220,30,27" href="" alt="whenwise">

Test Script

elem = driver.find_element(:id, "agileway_software")
driver.action.move_to(elem, 210, 30).click.perform
expect(driver.title).to eq("WhenWise - Booking Made Easy")
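
One thing to check is where the offset is measured from. Image map coords such as 220,30,27 are measured from the image's top-left corner, but my understanding is that later Selenium 4 releases measure move_to offsets from the element's center. If that applies to your version, a conversion like the sketch below may help. The helper name and the 400 x 100 dimensions are made up for illustration.

```ruby
# Hypothetical helper: convert an offset measured from an element's top-left
# corner into one measured from its center.
# Assumption: your Selenium version measures move_to offsets from the center.
def offset_from_center(x, y, width, height)
  [x - width / 2, y - height / 2]
end

# Example with made-up dimensions (a 400 x 100 image):
p offset_from_center(210, 30, 400, 100) # prints [10, -20]

# With a real element, the dimensions could come from elem.size:
#   size = elem.size
#   dx, dy = offset_from_center(210, 30, size.width, size.height)
#   driver.action.move_to(elem, dx, dy).click.perform
```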

Send Key Sequences

Select all the text in a text area and delete it.

HTML Fragment

<textarea id="comments" name="comments"></textarea>

Test Script

driver.find_element(:id, "comments").send_keys("Multi\r\n Line\r\n Comment")
elem = driver.find_element(:id, "comments")
# use Command key for macOS, Control key otherwise
ctrl_key = RUBY_PLATFORM.include?("darwin") ? :command : :control
driver.action.click(elem).key_down(ctrl_key).send_keys("a").key_up(ctrl_key).perform

The above performs the Command + A on macOS (Control + A otherwise). This is also equivalent to:

elem.send_keys([ctrl_key, "a"])

Note that click(elem) is the first action. This is because actions are sent directly to the browser, not to a web element (unlike elem.send_keys(...)), so you must focus the element first. The easiest way to do this is by clicking on it.
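
Putting the whole task together (focus, select all, delete), one possible sketch is shown below. clear_with_keyboard is a hypothetical helper name, and using :backspace for the delete step is an assumption. The platform parameter defaults to RUBY_PLATFORM, as in the script above.

```ruby
# Hypothetical helper: focus a field, select all of its text, then delete it.
# Assumes `driver` is a Selenium::WebDriver driver. `platform` defaults to
# RUBY_PLATFORM so the modifier key matches the machine running the test.
def clear_with_keyboard(driver, element, platform = RUBY_PLATFORM)
  modifier = platform.include?("darwin") ? :command : :control
  driver.action
        .click(element)        # focus the field first
        .key_down(modifier)    # hold Command or Control
        .send_keys("a")        # select all
        .key_up(modifier)      # release the modifier
        .send_keys(:backspace) # delete the selection (assumed delete key)
        .perform
end
```

Calling clear_with_keyboard(driver, elem) on the comments text area should leave it empty.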