Selenium WebDriver for Beginners
The web application market is huge and intensely competitive. To bring a high-quality application to market at a reasonable price, it is important to work with effective test automation tools, preferably open source ones. This is why, out of the many browser automation tools available, open source Selenium became popular within a short span of time. Selenium is made up of several components, and Selenium WebDriver is the latest component added to it.
What is Selenium?
Let’s start with understanding what Selenium is. Selenium is a tool for automating browsers, which means that Selenium makes browsers execute commands according to your scenario. This is why it’s the perfect tool for web application testing, but you are not limited to just that.
Selenium is actually a suite of tools: Selenium IDE, Selenium WebDriver, Selenium Grid and Selenium Standalone Server. Today, we will focus on Selenium WebDriver, also known as Selenium 2.0 or just WebDriver, as it is a very popular and powerful automation tool for web applications.
What is Selenium WebDriver?
Selenium WebDriver is a free, open-source, portable software-testing framework for testing web applications. It provides a common application programming interface (API) for browser automation. In fact, Selenium WebDriver is a library that you call from your code, which executes your commands on the browser of your choice.
How does Selenium WebDriver work?
To understand how WebDriver works, let’s look at the example of a taxi ride. During the ride, three main actors interact:
- the customer, who gives instructions to the driver on how to reach the desired destination,
- the driver, who follows the customer’s guidance and gives commands to the car using the wheel, pedals, etc.,
- and finally, the car, which executes the commands of the taxi driver.
Just as the customer gives directions to the driver, the test engineer writes code that gives commands to WebDriver through the API provided by the WebDriver bindings. Then, as the taxi driver passes commands to the car, WebDriver passes the commands to the web browser. Finally, the browser, like the car, carries out the commands and returns the results.
Selenium WebDriver Architecture
The Selenium WebDriver architecture consists of a language-specific client, a Selenium standalone server, and a browser-specific driver. The architecture is transparent to the users.
First, the user writes tests in an Integrated Development Environment (IDE), using the API provided by the language-specific client of their choice. Supported languages include Java, C#, Python, Ruby, and more. The test code then communicates with the Selenium Standalone Server, which forwards the user's commands to the browser-specific driver, such as ChromeDriver for Google Chrome or GeckoDriver for Firefox. The browser driver launches the browser, performs the commands, and then returns a response.
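The chain described above (client, standalone server, browser driver, browser) can be pictured with a toy model in plain Java. This is only an illustrative sketch and not Selenium code: the class names ArchitectureSketch, Client, StandaloneServer and BrowserDriver are hypothetical, and the "browser" is just a function that echoes the command it received.

```java
import java.util.ArrayList;
import java.util.List;

public class ArchitectureSketch {

    /** Hypothetical stand-in for a browser-specific driver (such as ChromeDriver). */
    interface BrowserDriver {
        String execute(String command);
    }

    /** Hypothetical stand-in for the standalone server: it only forwards commands. */
    static class StandaloneServer {
        private final BrowserDriver driver;
        final List<String> forwarded = new ArrayList<>();

        StandaloneServer(BrowserDriver driver) {
            this.driver = driver;
        }

        String forward(String command) {
            forwarded.add(command);          // remember what passed through
            return driver.execute(command);  // hand the command to the driver
        }
    }

    /** Hypothetical stand-in for the language-specific client used by the test code. */
    static class Client {
        private final StandaloneServer server;

        Client(StandaloneServer server) {
            this.server = server;
        }

        String send(String command) {
            return server.forward(command);
        }
    }

    /** Sends one command through the whole chain and returns the response. */
    public static String run(String command) {
        // The fake driver pretends the browser executed the command.
        BrowserDriver fakeDriver = cmd -> "browser executed: " + cmd;
        Client client = new Client(new StandaloneServer(fakeDriver));
        return client.send(command);
    }

    public static void main(String[] args) {
        System.out.println(run("open http://somepage.com/login"));
    }
}
```

The point of the sketch is that the test code only ever talks to the client. Everything behind it can be swapped (a different browser driver, for instance) without changing the test.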
Scripting with Selenium WebDriver
Your Selenium code uses the Selenium WebDriver library. Here are some of the most important and basic WebDriver commands to put in your script:
- Browser Commands - get, getTitle, getCurrentUrl
- Browser Navigation Commands - back, forward, to, refresh
- WebElement Commands - clear, click, getText, sendKeys
- FindElements Commands - findElement() with a locator (a By query object) as its parameter
Keep in mind that this is not a full list of WebDriver's abilities, but it is already enough to automate a large part of your website's business cases. For example, imagine a login page with a username field (class username), a password field (class pass) and a login button (id login), where we need to check that the user can log in and that the next page’s title is “Success”.
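A minimal sketch of such a test in Java might look like the following. It assumes the Selenium Java bindings are on the classpath and that Chrome with a matching driver is available. The credentials "testuser" and "secret" are made-up placeholders, ChromeDriver is just one possible browser choice, and a plain AssertionError stands in for whatever test framework you use.

```java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

public class LoginTest {
    public static void main(String[] args) {
        // Assumption: Chrome is the browser under test.
        WebDriver driver = new ChromeDriver();
        try {
            // Open the login page.
            driver.get("http://somepage.com/login");

            // Fill in the username and password fields (placeholder credentials).
            driver.findElement(By.className("username")).sendKeys("testuser");
            driver.findElement(By.className("pass")).sendKeys("secret");

            // Click the login button.
            driver.findElement(By.id("login")).click();

            // Check that the next page's title is "Success".
            String title = driver.getTitle();
            if (!"Success".equals(title)) {
                throw new AssertionError(
                        "Expected title \"Success\" but was \"" + title + "\"");
            }
        } finally {
            // Always close the browser, even if the check fails.
            driver.quit();
        }
    }
}
```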
In a test for this scenario:
- driver.get opens the page at the URL http://somepage.com/login
- findElement with a className locator finds the username and password fields, and sendKeys fills them in
- The login button is clicked with the click() method
- The getTitle() method returns the page title
- An assertion checks that the title equals “Success”
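After execution, the test results are typically written to an HTML report by the test framework or build tool that runs the tests (for example TestNG or Gradle), since WebDriver itself has no reporting feature.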
Your Selenium code should be written in your IDE. Future blog posts will explain Selenium scripting in more detail.
Selenium WebDriver Advantages
- It’s free and open source
- Bindings are available for many languages, such as Java, C#, Ruby, Python, Groovy and Node.js
- Supports all major operating systems
- Can be easily integrated with Gradle and Maven
- Can be integrated with CI tools like Jenkins
- Has a large community for asking questions and support
Selenium WebDriver Disadvantages
- Randomly generated elements that are designed to resist automation, such as CAPTCHA images, cannot be tested
- The test engineer needs to know a programming language supported by the Selenium bindings
- It is only for web applications
- No reporting feature out of the box
- No reporting feature for file upload out of the box
- Speed - very large tests are not fast by default (this can be improved in the test code)
To make your script creation easier, use the free Chrome extension recorder. BlazeMeter’s Chrome Extension allows you to record and run your scripts in Selenium, JMeter or both together.