Handling pop up windows in Webdriver

In WebDriver, testing popup windows involves switching the driver to the popup window and then running the corresponding actions. The steps involved are:

  1. Identify the popup or new windows
  2. Switch to the new window
  3. Carry out some tasks
  4. Switch back to parent window

You can use the “SwitchTo().Window()” method to switch to a window with a known name. If the name is not known, you can use the “WindowHandles” property to obtain a list of the handles of all open windows, and pass one of those handles to “SwitchTo().Window()”.

The window name is the name the window was given when it was opened, i.e. the “target” attribute of the link or the name argument passed to window.open() in the page’s JavaScript. It is not the value of the <title> tag. This name can then be passed to “SwitchTo().Window()”. For example:

// This method takes the name of a window as a parameter, and then switches to that window.
 
public void SwitchToWindow(string windowName)
{
 
_driver.SwitchTo().Window(windowName);
 
}

If you do not know the name of the window, you can use a different approach. For example:

 ReadOnlyCollection<string> handles = _driver.WindowHandles;
 
            foreach (string windowHandle in handles)
            {
               // Switching to a window returns the driver focused on it, so we can read its title
               string newWindowTitle = _driver.SwitchTo().Window(windowHandle).Title;
               Console.WriteLine("Window Handle is: " + windowHandle + " and Window Title is: " + newWindowTitle);
            }

The above will print out the window handles and window titles of all open windows. You can then pass a newly discovered handle to the SwitchToWindow function described above, or use a new function like the one below, which switches to a window by its title:

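 public void SwitchToWindowByTitle(string windowTitle)
        {
           string parentWindowHandle = _driver.CurrentWindowHandle;
 
            ReadOnlyCollection<string> handles = _driver.WindowHandles;
 
            foreach (string windowHandle in handles)
            {
                IWebDriver newWindow = _driver.SwitchTo().Window(windowHandle);
 
                if (newWindow.Title.Equals(windowTitle))
                {
                    Console.WriteLine("Switching to window with handle " + windowHandle +
                    " and window title " + windowTitle);
                    return; // the driver is now focused on the matching window
                }
            }
 
            // No window matched: go back to where we started instead of staying on the last window visited
            _driver.SwitchTo().Window(parentWindowHandle);
            throw new NoSuchWindowException("No window found with title " + windowTitle);
        }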

Look at the first line of code in the above function: it captures the handle of the current window. When you want to switch back to the parent window, you can simply do this:

 _driver.SwitchTo().Window(parentWindowHandle);
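
The four steps listed at the start of this post can be pulled together into one helper. What follows is only a minimal sketch, not a definitive implementation: PopupHelper, FindNewHandle and RunInPopup are names I have made up for illustration, and it assumes that exactly one new window opens and that it is already open by the time openPopup returns (otherwise you would need to wait for it first).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using OpenQA.Selenium;

// Hypothetical helper, not part of WebDriver itself.
public static class PopupHelper
{
    // Step 1: identify the new window by comparing the handles before and after.
    // Returns null when no new handle has appeared.
    public static string FindNewHandle(IEnumerable<string> before, IEnumerable<string> after)
    {
        return after.Except(before).FirstOrDefault();
    }

    public static void RunInPopup(IWebDriver driver, Action openPopup, Action<IWebDriver> work)
    {
        string parentHandle = driver.CurrentWindowHandle;
        List<string> handlesBefore = driver.WindowHandles.ToList();

        openPopup(); // e.g. click the link that opens the popup

        string popupHandle = FindNewHandle(handlesBefore, driver.WindowHandles);
        if (popupHandle == null)
        {
            throw new InvalidOperationException("No new window was opened.");
        }

        try
        {
            driver.SwitchTo().Window(popupHandle); // step 2: switch to the new window
            work(driver);                          // step 3: carry out some tasks
            driver.Close();                        // closes only the popup, which has focus
        }
        finally
        {
            driver.SwitchTo().Window(parentHandle); // step 4: switch back to the parent
        }
    }
}
```

A test would pass in the click that opens the popup as openPopup and the checks to run inside the popup as work. The finally block means the driver ends up back on the parent window even when one of those checks throws.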

Webdriver – How to Write Automated Tests Using Page Object Design Patterns

Look at this code:

Site.Click("crt1h-yth");
Site.Click("chk-formb1");
Site.Click("pay_02");

and compare it with this:

Shop.AddToCart();
Shop.Checkout();
Shop.MakePayment();

The functionality is the same, but the way it is presented is very different. This is where page objects come in, and that is what this blogpost is about.

So what are page objects?

Page Object is a design pattern which has become popular in test automation for improving test maintenance and reducing code duplication. It is a language-neutral pattern for representing a complete page, or a part of a page, in an object-oriented manner. We use page objects to model the application’s user interface. Page objects expose methods that reflect the things a user can see and do on a page, and they hide the details of telling the browser how to do those things. In short, a page object encapsulates the behaviour of a page.

Your tests then use the methods of this page object class whenever they need to interact with that page of the UI. The benefit is that if the UI changes for the page, the tests themselves don’t need to change; only the code within the page object needs to change. Consequently, all changes to support the new UI are located in one place.

So why use page objects? The main reasons are:

  • Maintenance
  • Readability of scripts
  • Reduced or eliminated duplication
  • Reusability

Consider the following example, which does not use page object design.

namespace SeleniumTests
{
[TestFixture]
public class LinkedInTest
{
private IWebDriver driver;
private string baseURL;
 
[SetUp]
public void SetupTest()
{
driver = new FirefoxDriver();
baseURL = "http://www.linkedin.com/";
 
}
 
[TearDown]
public void TeardownTest()
{
driver.Quit();
}
 
[Test]
public void LoginAndLogoutOfLinkedIn()
{
driver.Navigate().GoToUrl(baseURL);
driver.FindElement(By.Id("session_key-login")).Clear();
driver.FindElement(By.Id("session_key-login")).SendKeys("test@123.com");
driver.FindElement(By.Id("session_password-login")).Clear();
driver.FindElement(By.Id("session_password-login")).SendKeys("awesomePswd");
driver.FindElement(By.Id("btn-login")).Click();
// Check for text that only appears once you are logged in
Assert.IsTrue(driver.PageSource.Contains("LinkedIn Today:"),
                                "Login was unsuccessful");
driver.FindElement(By.LinkText("Sign Out")).Click();
}
 
}
}

There are a few problems with this approach:

  1. There is no clear separation between the test method and the locators of the application (the IDs and link text that I’ve used above); they are all in a single method. If the application changes its identifiers or layout, then the test must change.
  2. Imagine multiple tests that require this login functionality (there can be various scenarios around testing login). The same login code will be repeated again and again in each test, and any change in the UI will mean that all of those tests have to be modified.
  3. The above code is not very readable or maintainable, it contains duplication and it is not reusable: exactly the issues addressed by the Page Object design.

 

By applying the Page Object design technique, I can rewrite the above test in the following way.

First, here is the flow that I will be automating using C#, NUnit and WebDriver:

  1. Navigate to the LinkedIn Homepage
  2. Enter Username
  3. Enter Password
  4. Click Login
  5. Assert Some Text is appearing on the HomePage after login
  6. Logout
  7. Assert that I have signed out

 

Below is the Page Object for the Login Page:

using System;
using OpenQA.Selenium;
 
namespace SeleniumTests
{
    public class LoginPage
    {
        protected readonly IWebDriver WebDriver;
 
        public LoginPage(IWebDriver webdriver)
        {
            this.WebDriver = webdriver;
 
            string title = WebDriver.Title;
 
            if (!title.Equals("World's Largest Professional Network | LinkedIn"))
            {
                throw new InvalidOperationException("This is not the Login Page. Current page is: "
                                                    + title);
            }
        }
 
        /*
        A property to enter username on login page
     */
        public string EnterUsername
        {
            set
            {
                IWebElement usernameField = WebDriver.FindElement(By.Id("session_key-login"));
                usernameField.Clear();
                usernameField.SendKeys(value);
            }
        }
 
        /*
        A property to enter password on login page
     */
        public string EnterPassword
        {
            set
            {
                IWebElement passwordField = WebDriver.FindElement(By.Id("session_password-login"));
                passwordField.Clear();
                passwordField.SendKeys(value);
            }
        }
 
        /*
        A method to click the login button on the page
     */
        public void ClickLogin()
        {
            IWebElement loginButton = WebDriver.FindElement(By.Id("btn-login")); // find the login button
            loginButton.Click(); // click on the login button
        }
    }
}

Here is what the Page Object for the HomePage looks like:

using System;
using OpenQA.Selenium;
 
namespace SeleniumTests
{
    public class HomePage
    {
        protected readonly IWebDriver WebDriver;
 
        public HomePage(IWebDriver webdriver)
        {
            this.WebDriver = webdriver;
 
            string title = WebDriver.Title;
 
            if (!title.Equals("Welcome, Hasan! | LinkedIn"))
            {
                throw new InvalidOperationException("This is not the HomePage. Current page is: "
                                                    + title);
            }
        }
 
        /*
        A method to log out of LinkedIn
        */
        public void Logout()
        {
            // Use the WebDriver field of this page object (there is no "driver" variable in this class)
            WebDriver.FindElement(By.LinkText("Sign Out")).Click();
        }
    }
}

Finally, here is my LinkedIn login Test that consumes the two page objects:

using NUnit.Framework;
using OpenQA.Selenium;
 
namespace SeleniumTests
{
    [TestFixture]
    public class LinkedInTest
    {
        private IWebDriver _driver;
 
        [SetUp]
        public void Setup()
        {
            _driver = SeleniumHelper.GetSelenium();
        }
 
        [TearDown]
        public void TearDown()
        {
            SeleniumHelper.TearDownCurrentSelenium();
        }
 
        [Test]
        public void VerifyThatYouCanLoginAndLogoutOfLinkedIn()
        {
           // Create an instance of the Login Page
            LoginPage login = new LoginPage(_driver);
 
            //Enter Username
            login.EnterUsername = "HasanAziz@test.com";
 
            //Enter Password
            login.EnterPassword = "XYZ123";
 
            // Click Login
            login.ClickLogin();
 
            //Create an instance of the HomePage
            HomePage home = new HomePage(_driver);
 
            //Assert that some text is displayed on the Home Page
            Assert.IsTrue(_driver.PageSource.Contains("LinkedIn Today:"));
 
            //LogOut from LinkedIn
            home.Logout();
 
            // Assert
            Assert.IsTrue(_driver.Title.Equals("Signed Out | LinkedIn"));
        }
    }
}

As you can see from the above, this is much clearer now. The comments in the test are not really needed as they just repeat what the code is doing (something I mentioned earlier in my blogpost “Comments & Bad Comments”), but I have added them here so that you can understand what’s going on.

There is a lot of flexibility when it comes to designing page objects. I could also use techniques like inheritance, function overloading or any other object-oriented programming concept, and instead of having the browser name and LinkedIn URL set as hardcoded values, I could pick them up from a config file. Let’s take another example, the eBay homepage. You can automate it using one page object if you wish, but ideally you would have 5-6 page objects for the home page alone.

Although the flexibility is there, there are a few basic rules that you should adhere to in order to keep your code maintainable:

  1. Do not create all your page objects at once; build only what you need right now. You can spend weeks trying to create page objects for your whole application, and this would be a waste of time. Your page objects will grow as new requirements come in, which in turn will require new test scripts.
  2. Asserts do not belong in page objects; they belong in test scripts. Page objects only interact with the page; they do not decide whether something passes or fails.
  3. The one exception to the above rule is a single verification within the page object: checking that the page, and any important elements within it, loaded correctly. This verification should be done while instantiating the page object. In my example above, both the LoginPage and HomePage constructors check that the expected page is available and ready for requests from the test.
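
As a rough illustration of rule 3 combined with the inheritance idea mentioned earlier, the title check could be pulled up into a base class so that each page object only states which title it expects. This is only a sketch under assumptions: BasePage and SignedOutPage are hypothetical names that do not appear in the example above, and the title string is simply the one asserted at the end of the test.

```csharp
using System;
using OpenQA.Selenium;

namespace SeleniumTests
{
    // Hypothetical base class: holds the driver and performs the single
    // "am I on the right page?" verification while the page object is created.
    public abstract class BasePage
    {
        protected readonly IWebDriver WebDriver;

        protected BasePage(IWebDriver webDriver, string expectedTitle)
        {
            WebDriver = webDriver;

            string title = WebDriver.Title;
            if (!title.Equals(expectedTitle))
            {
                throw new InvalidOperationException(
                    "Expected page \"" + expectedTitle + "\" but current page is: " + title);
            }
        }
    }

    // Hypothetical page object for the page shown after logging out.
    public class SignedOutPage : BasePage
    {
        public SignedOutPage(IWebDriver webDriver)
            : base(webDriver, "Signed Out | LinkedIn")
        {
        }
    }
}
```

With this in place, creating a SignedOutPage in a test fails straight away if the browser is on any other page, and the test keeps its own asserts for everything else.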

Further Reading:

  • There are other Design Patterns, some use Page Factory for instantiating objects. More on this here 
  • Some Information on Page objects can be found on code.google.com
  • Very Useful test design considerations can be found on the Selenium HQ site

Books on this topic:

  • I recommend xUnit Test Patterns: Refactoring Test Code by Gerard Meszaros – Almost 900 pages of patterns that can be used for automation and unit testing.
  • Refactoring: Improving the Design of Existing Code by Martin Fowler and Kent Beck. It’s developer-centric but a very good read.
  • Design Patterns: Elements of Reusable Object-Oriented Software – This is on my reading list but I have heard very good reviews about it.

Webdriver – Drop Down boxes, Check Boxes & Filling in forms

Here is how you can select values in a drop down box. The drop down has an ID of “Domain” and the values Yahoo, Virgin and AOL.

SelectElement select = new SelectElement(_webdriver.FindElement(By.Id("Domain")));

select.DeselectAll();

select.SelectByText("Yahoo");

This will deselect all options, and then select the option with the displayed text “Yahoo”. Note that DeselectAll() is only valid for multi-select lists; on a single-select drop down it throws an exception, so leave that call out there.
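
One way to make the snippet above work for both single-select and multi-select lists is to check the IsMultiple property of SelectElement before deselecting. This is a small sketch rather than the only way to do it; DropDownHelper and SelectOnly are names I have invented for the example.

```csharp
using OpenQA.Selenium;
using OpenQA.Selenium.Support.UI;

// Hypothetical helper; SelectElement lives in WebDriver.Support.dll.
public static class DropDownHelper
{
    // Leaves the list with only the option showing 'visibleText' selected.
    public static void SelectOnly(IWebDriver driver, By locator, string visibleText)
    {
        SelectElement select = new SelectElement(driver.FindElement(locator));

        // DeselectAll() is only allowed on multi-select lists, so guard the call.
        if (select.IsMultiple)
        {
            select.DeselectAll();
        }

        select.SelectByText(visibleText);
    }
}
```

For the drop down above, the call would be DropDownHelper.SelectOnly(_webdriver, By.Id("Domain"), "Yahoo").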

You can select radio buttons and check boxes with the following command:

driver.FindElement(By.Id("SingleJourneyCheckBox")).Click();

You can find out whether this check box or radio button is already selected with this command:

IWebElement checkbox = driver.FindElement(By.Id("SingleJourneyCheckBox"));

if (checkbox.Selected)
{
    // do something here.
}
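
Because Click() toggles a check box, clicking one that is already ticked will untick it. A common way around this is to compare the Selected property with the state you want and only click when they differ. The sketch below shows the idea; CheckBoxHelper and SetChecked are made-up names. It is meant for check boxes, since a radio button cannot be deselected by clicking it again.

```csharp
using OpenQA.Selenium;

// Hypothetical helper for putting a check box into a known state.
public static class CheckBoxHelper
{
    public static void SetChecked(IWebElement checkbox, bool shouldBeChecked)
    {
        // Only click when the current state differs from the desired one,
        // otherwise the click would toggle the box the wrong way.
        if (checkbox.Selected != shouldBeChecked)
        {
            checkbox.Click();
        }
    }
}
```

For example, CheckBoxHelper.SetChecked(driver.FindElement(By.Id("SingleJourneyCheckBox")), true) leaves the box ticked whether or not it was ticked before.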

If you were filling out a form, once you’ve finished you probably want to submit it. One way to do this would be to find the “submit” button and click it:

driver.FindElement(By.Id("submit")).Click(); // Assume the button has the ID "submit"

Alternatively, WebDriver has the convenience method “Submit” on every element. If you call this on an element within a form, WebDriver will walk up the DOM until it finds the enclosing form and then calls submit on that. If the element isn’t in a form, then the NoSuchElementException will be thrown:

element.Submit();

Webdriver Commands & Operations

1. Opening a new page:

driver.Navigate().GoToUrl("http://www.google.com");

WebDriver will wait until the page has fully loaded (that is, the “onload” event has fired) before returning control to your test or script. It’s worth noting that if your page uses a lot of AJAX on load then WebDriver may not know when it has completely loaded. If you need to ensure such pages are fully loaded then you can use “waits”. More on this later.
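
To give a flavour of what a wait looks like, here is a minimal sketch using the WebDriverWait class from WebDriver.Support.dll. WaitHelper and WaitForElement are names I have chosen for the example, and the timeout is whatever suits your application.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Support.UI;

// Hypothetical helper that wraps an explicit wait.
public static class WaitHelper
{
    // Polls the page until the element can be found, or gives up after the timeout.
    public static IWebElement WaitForElement(IWebDriver driver, By locator, int timeoutSeconds)
    {
        WebDriverWait wait = new WebDriverWait(driver, TimeSpan.FromSeconds(timeoutSeconds));

        // WebDriverWait ignores "not found" exceptions while polling, so this
        // keeps retrying FindElement until it succeeds or the wait times out.
        return wait.Until(d => d.FindElement(locator));
    }
}
```

If the element never appears, Until throws a WebDriverTimeoutException, which fails the test with a clearer message than a NoSuchElementException raised part-way through an AJAX load.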

2. Locating UI Elements & Interactions with the page:

Imagine that you have an element displayed on a page with the following code

<input type="text" name="Password" id="Password-ID"/>

You can find this element by using any of the following commands:

Locate By ID:

This is the most efficient and preferred way to locate an element.

IWebElement element;

element = driver.FindElement(By.Id("Password-ID"));

Locate By Name:

Find the input element with matching name attribute.

IWebElement element;

element = driver.FindElement(By.Name("Password"));

Locate By XPath:

WebDriver uses a browser’s native XPath capabilities wherever possible. On browsers that don’t have native XPath support, WebDriver provides its own implementation. This can lead to some unexpected behaviour unless you are aware of the differences between the various XPath engines.

IWebElement element;

element = driver.FindElement(By.XPath("//input[@id='Password-ID']"));

Locate By Class Name:

“Class” in this case refers to the attribute on the DOM element. In practice there are often many DOM elements with the same class name, so be very careful when finding elements using this option. For example, let’s say that you have the following HTML code in your browser:

<div class="cheese"><span>Cheddar</span></div><div class="cheese"><span>Brie</span></div>

If you use the code below, it will only target the first element it finds, i.e. Cheddar.

IWebElement element;

element = driver.FindElement(By.ClassName("cheese"));

But if you want to find multiple elements, you can use the command below:

IList<IWebElement> cheeses = driver.FindElements(By.ClassName("cheese"));

 Locate by Tag Name:

You can also locate an element by its DOM tag name.

For example, say you want to find an iframe which is displayed on the page with the following code:

<iframe src="..."></iframe>

The command you will need is this:

IWebElement frame = driver.FindElement(By.TagName("iframe"));

 Locate By Link Text:

You can find an element by matching its visible text.

Say you have a link displayed with the following code.

<a href="http://www.thetrainline.com/Hotels">Hotels</a>

The command you will need is this:

IWebElement hotelsLink = driver.FindElement(By.LinkText("Hotels"));

 Locate By Partial Link Text:

You can find an element by matching its partial visible text.

Say you have a link displayed with the following code.

<a href="http://www.thetrainline.com/BestFareFinder">Best Fare Finder</a>

The command you will need is this:

IWebElement fareFinderLink = driver.FindElement(By.PartialLinkText("Fare"));

When using all of the above commands to find elements, if no element can be found then a NoSuchElementException will be thrown.
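
If you only want to know whether an element exists, without having to catch that exception, one option is to use FindElements (plural), which returns an empty collection when nothing matches. The helper below is a sketch of that idea; ElementHelper and IsElementPresent are names I have made up, not part of WebDriver.

```csharp
using OpenQA.Selenium;

// Hypothetical helper for presence checks.
public static class ElementHelper
{
    // ISearchContext is implemented by both IWebDriver and IWebElement,
    // so this works for the whole page or for the children of one element.
    public static bool IsElementPresent(ISearchContext context, By locator)
    {
        // FindElements returns an empty collection instead of throwing.
        return context.FindElements(locator).Count > 0;
    }
}
```

For example, ElementHelper.IsElementPresent(driver, By.Id("Password-ID")) returns true or false and can be passed straight to an NUnit assert.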

3. What to do after finding an element:

So you have found an element, what would you like to do with it?

You can enter some text in the text field:

element.SendKeys("some text");

Keep in mind that typing something into a text field will not clear it. Instead, what you type will be appended to what’s already there. You can easily clear the contents of a text field or text area:

element.Clear();

You can click an element:

After finding the element using its link text or partial link text, you can click on it.

element.Click();

Getting Started with Webdriver and CSharp (C#)

Tools you will need:

  1. An IDE – I’m using Visual Studio 2010 but you can use any IDE, for example Visual Studio Express, which is free
  2. NUnit – Download it from here
  3. WebDriver C# Client Drivers – Download them from here

 

Open Visual Studio and create a New Project (Class Library):

 

In your project references, add references to nunit.framework.dll, WebDriver.dll & WebDriver.Support.dll

 

 

You are now ready to write some code. Here is what we will be doing.

  1. We will run our test in the Firefox browser
  2. The test will navigate to www.yahoo.co.uk
  3. Type “selenium hq” into the search box
  4. Click the “Search” button
  5. On the results page we will assert on the title of the page.

 

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using NUnit.Framework;
using OpenQA.Selenium;
using OpenQA.Selenium.Firefox;

namespace GettingStartedWithWebdriver
{
[TestFixture]
public class WebDriverDemo
{
private IWebDriver _driver;
private string _baseUrl;

[SetUp]
public void SetupTest()
{
_driver = new FirefoxDriver(); // we want our tests to run in Firefox
_baseUrl = "http://www.yahoo.co.uk";

}

[TearDown]
public void TeardownTest()
{

_driver.Quit(); // quits webdriver and closes any associated windows
}

[Test]
public void SearchYahooAndAssertOnTitle()
{
_driver.Navigate().GoToUrl(_baseUrl);
_driver.FindElement(By.Id("p_13838465-p")).Clear(); // clear the text field before entering text
_driver.FindElement(By.Id("p_13838465-p")).SendKeys("selenium hq"); // type into the text field
_driver.FindElement(By.Id("search-submit")).Click(); // click on the search button
Assert.AreEqual("selenium hq – Yahoo! Search Results", _driver.Title); // Assert using NUnit on the window title
}

}
}

In case you are wondering where I got the IDs of the fields from, you can use a tool like Firebug in Firefox to inspect elements in the browser.

And there you have it, your first test in WebDriver and C#.

Moving from Selenium RC to Webdriver

Two years ago, while working at The Trainline, I was part of the Automated Refunds project and the main QA/Developer for its automated test framework. The aim was to replace the existing, mainly manual system with a transactional and fully automated refund processing system.

As this application was mainly to be used in Internet Explorer, we decided to build an automated testing framework using Selenium RC, NUnit and C#. This testing framework was a success, with 30,000 lines of code (including the code for 120 tests) and 1,650 assertions.

My Selenium journey started way back in September 2006, while I was still at university and doing my work placement at Volantis. The Selenium RC version back then was 0.8.1 and Selenium IDE was 0.8.2. Selenium has come a long way since then and has become a popular, well-established testing framework. It works with a large number of browsers, lets you write your tests in almost any language, and is open source. Because it is written in JavaScript, it was also quick and easy to add support for new browsers as they were released.

But it was not perfect. I took a lot of shortcuts, cut corners and did things in ugly ways to get the job done. JavaScript, Selenium’s strength, was also its weakness. Browsers impose a very strict security model on any JavaScript that they execute in order to protect users from malicious scripts. Examples of where this security model makes testing harder are uploading a file (IE prevents JavaScript from changing the value of a file input element) and navigating between domains (because of the same-origin policy). The Selenium API has also grown over time, and with that it has become harder to understand how best to use it. I personally found the API difficult to navigate.

To solve these problems, I said hi to Selenium 2, also known as WebDriver! WebDriver takes a different approach to solving the same problem as Selenium. Rather than being a JavaScript application running within the browser, it uses whichever mechanism is most appropriate to control the browser. For Firefox, this means that WebDriver is implemented as an extension. For IE, WebDriver makes use of IE’s Automation controls. According to Selenium HQ:

The primary new feature is the integration of the WebDriver API. This addresses a number of limitations along with providing an alternative, and simpler, programming interface. The goal is to develop an object-oriented API that provides additional support for a larger number of browsers along with improved support for modern advanced web-app testing problems.

Let’s hope that is the case. I am starting to switch over from Selenium RC to WebDriver, and along the way I will be sharing my findings with you. The aim is to one day move the beast (the Automated Refunds testing framework) from Selenium RC to WebDriver. I will also be using Automated Refunds as my guinea pig while learning WebDriver. Watch this space!