🌐 Complete Tutorial

Java Selenium WebDriver Tutorial

Browser Automation A to Z

Master browser automation from setup to advanced design patterns. Learn the WebDriver API, locators, waits, the Page Object Model, and best practices used in professional QA teams.

⏱ ~4 hrs 📚 12 Topics 💻 Code Examples 🎯 Quiz Each Topic ☕ Java + Maven
01
Introduction

What is Selenium?

Definition

Selenium is a free, open-source suite of tools for automating web browsers. It exposes an API for driving a browser from code, enabling automated functional testing of web applications across different browsers and operating systems.

Selenium was originally created by Jason Huggins in 2004 at ThoughtWorks to automate repetitive manual tests. Today it is governed by the Selenium Project under the Software Freedom Conservancy. The official documentation is at selenium.dev.

The Selenium Suite

Selenium WebDriver
The core API that controls browsers directly through browser-native commands. The focus of this tutorial.
Selenium IDE
Chrome & Firefox extension for recording and replaying scripts — no code needed.
Selenium Grid
Distributes tests across multiple machines and browser versions for parallel execution.
Analogy

Selenium WebDriver is like a remote control for your browser. Just as a TV remote sends signals to change channels, WebDriver sends commands — click here, type this, navigate there — through code instead of your hands.

ℹ️
Selenium 4 — Current Version

Selenium 4 (released October 2021) implements the W3C WebDriver standard natively, removing the older JSON Wire Protocol. It adds Chrome DevTools Protocol (CDP) support, a revamped Grid, and new relative locators.

🧠Quick Quiz — Topic 1

Which component of the Selenium suite is used to run tests in parallel across multiple machines and browsers?

02
Architecture

Selenium WebDriver Architecture

Understanding the architecture helps you troubleshoot failures and write more reliable automation. Selenium 4 uses the W3C WebDriver standard — a REST-like HTTP API for controlling browsers.

W3C WebDriver Protocol

The W3C WebDriver specification defines a platform- and language-neutral wire protocol for remote control of web browsers. Every Selenium 4 command maps to an HTTP request sent to the browser driver executable.

4 Architecture Layers

  1. Test Script (Language Bindings): The Java / Python / C# / JS code you write using the Selenium client library.

  2. Selenium Client Library: Translates your method calls into HTTP requests conforming to the W3C WebDriver protocol.

  3. Browser Driver (ChromeDriver / GeckoDriver etc.): A separate executable that receives HTTP requests and translates them into browser-native instructions.

  4. Real Browser: Chrome, Firefox, Edge, or Safari — the actual browser being automated.

Flow Example

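When you call driver.findElement(By.id("btn")).click(), the client sends two requests to the driver's local port (9515 is ChromeDriver's default when you start it by hand): a POST to /session/{id}/element to locate the element, then a POST to /session/{id}/element/{element id}/click. ChromeDriver translates each request into a native Chrome command, and the browser clicks the element.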
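To make the flow concrete, here is a minimal sketch of the requests behind that one line of Java. It only builds strings and sends nothing. The class name, method names, session id, and element id are all made up for illustration. It also assumes the Java client rewrites By.id as a CSS selector, since the W3C specification defines no "id" locator strategy.

```java
/** Hypothetical helper: builds (but does not send) W3C WebDriver requests. */
final class WireSketch {

    /** Request that locates an element by id, expressed as a CSS selector. */
    static String findById(String sessionId, String id) {
        String body = "{\"using\":\"css selector\",\"value\":\"#" + id + "\"}";
        return "POST /session/" + sessionId + "/element " + body;
    }

    /** Request that clicks a previously located element (empty JSON body). */
    static String click(String sessionId, String elementId) {
        return "POST /session/" + sessionId + "/element/" + elementId + "/click {}";
    }

    public static void main(String[] args) {
        // Both ids below are invented; a real driver returns them in its responses.
        System.out.println(findById("abc123", "btn"));
        System.out.println(click("abc123", "e-42"));
    }
}
```

Seeing the two calls separately explains a common failure: the lookup can succeed while the click fails, because each is its own round trip to the driver.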

⚠️
Driver Version Must Match Browser

ChromeDriver's major version must match your installed Chrome version. A mismatch causes SessionNotCreatedException. Use Selenium Manager (built into Selenium 4.6+) or WebDriverManager to handle this automatically.
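The matching rule itself is simple enough to sketch in plain Java. The helper below is hypothetical and is not part of Selenium; it only compares the first component of two dotted version strings, and the version numbers in the usage note are illustrative.

```java
/** Hypothetical helper illustrating the "major versions must match" rule. */
final class VersionCheck {

    /** Returns the major component of a dotted version string. */
    static int major(String version) {
        int dot = version.indexOf('.');
        return Integer.parseInt(dot < 0 ? version : version.substring(0, dot));
    }

    /** True when browser and driver share the same major version. */
    static boolean compatible(String browserVersion, String driverVersion) {
        return major(browserVersion) == major(driverVersion);
    }
}
```

For example, VersionCheck.compatible("126.0.6478.61", "125.0.6422.141") returns false, which is the situation that ends in SessionNotCreatedException.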

🧠Quick Quiz — Topic 2

In Selenium 4 architecture, what is the role of the Browser Driver (e.g., ChromeDriver)?

03
Setup

Setting Up the Environment

This tutorial uses Java with Maven — the most common enterprise QA setup. The same concepts apply to Python, C#, and JavaScript bindings.

Prerequisites

JDK 11+
Java Development Kit (LTS recommended). Verify: java -version
Apache Maven
Build and dependency tool. Verify: mvn -version
IDE
IntelliJ IDEA Community (recommended) or Eclipse IDE for Java EE
Chrome
Any modern version — driver managed automatically via Selenium Manager

pom.xml — Maven Dependencies

XML — pom.xml
<dependencies>
  <!-- Selenium Java -->
  <dependency>
    <groupId>org.seleniumhq.selenium</groupId>
    <artifactId>selenium-java</artifactId>
    <version>4.21.0</version>
  </dependency>
  <!-- TestNG -->
  <dependency>
    <groupId>org.testng</groupId>
    <artifactId>testng</artifactId>
    <version>7.10.2</version>
    <scope>test</scope>
  </dependency>
</dependencies>

First Test

Java — FirstTest.java
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

public class FirstTest {
    public static void main(String[] args) {
        // Selenium 4.6+ manages driver versions automatically
        WebDriver driver = new ChromeDriver();
        driver.get("https://www.selenium.dev");
        System.out.println("Title: " + driver.getTitle());
        driver.quit();
    }
}
💡
Selenium Manager

From Selenium 4.6 onward, the built-in Selenium Manager automatically downloads the correct browser driver at runtime. You no longer need to manually set the ChromeDriver path.

🧠Quick Quiz — Topic 3

Starting from Selenium 4.6, what handles browser driver management automatically?

04
WebDriver API

WebDriver Core Browser Commands

The WebDriver interface provides top-level methods for controlling the browser session — the building blocks of every Selenium script.

Java — Navigation & Window
WebDriver driver = new ChromeDriver();

// Navigation
driver.get("https://example.com");
driver.navigate().back();
driver.navigate().forward();
driver.navigate().refresh();
driver.navigate().to("https://page2.com");

// Browser info
String title = driver.getTitle();
String url   = driver.getCurrentUrl();
String src   = driver.getPageSource();

// Window
driver.manage().window().maximize();
// Dimension is org.openqa.selenium.Dimension, not java.awt.Dimension
driver.manage().window().setSize(new Dimension(1280, 800));

// Cleanup
driver.close();  // Close current tab only
driver.quit();   // End session + close ALL windows
🚨
Always Call driver.quit()

Skipping driver.quit() leaves browser and driver processes running after the test ends. Always place it in an @AfterMethod (with alwaysRun = true, so it still runs when setup fails) or a finally block to guarantee cleanup even when tests fail.
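The finally-block variant can be sketched without any Selenium dependency. Quittable below is a hypothetical stand-in for the single WebDriver method that matters here; with a real driver you could pass driver::quit.

```java
/** Sketch of guaranteed cleanup; Quittable stands in for WebDriver. */
final class CleanupSketch {

    /** Hypothetical stand-in for WebDriver.quit(). */
    interface Quittable {
        void quit();
    }

    /** Runs the test body and always calls quit(), even if the body throws. */
    static void runWithCleanup(Quittable driver, Runnable testBody) {
        try {
            testBody.run();
        } finally {
            driver.quit();   // reached on success and on failure
        }
    }
}
```

The exception from the test body still propagates to the caller, so the test is reported as failed; the finally block only makes sure the browser is gone first.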

🧠Quick Quiz — Topic 4

What is the key difference between driver.close() and driver.quit()?

05
Element Location

Locators & Strategies

Locators tell WebDriver how to find elements on the page. Choosing the right strategy affects how robust and performant your tests are.

All Locator Types

Locator | Example | Best Used When | Speed
By.id | By.id("login") | Element has a unique id attribute | Fastest
By.name | By.name("email") | Form input with name attribute | Fast
By.cssSelector | By.cssSelector(".btn") | Complex attribute-based selection | Fast
By.xpath | By.xpath("//input[@id='q']") | Text-based paths, parent traversal | Slower
By.className | By.className("card") | Single class on unique element | Medium
By.linkText | By.linkText("Sign In") | Exact anchor link text | Medium
By.tagName | By.tagName("h1") | Fetching all elements of a tag | Medium

XPath Common Patterns

XPath Patterns
// By attribute
//input[@id='username']
//button[@type='submit']

// By text
//button[text()='Login']
//button[contains(text(),'Log')]

// Sibling / parent traversal
//label[text()='Email']/following-sibling::input
//td[text()='John']/parent::tr

// ❌ AVOID — absolute XPath (breaks on any DOM change)
/html/body/div[2]/form/input[1]
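These patterns can be tried out without a browser, using the XPath engine that ships with the JDK. The sketch below is only an approximation: it evaluates expressions against a small, made-up XML fragment rather than a live HTML DOM, and the class name XPathPlayground is hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical playground for testing XPath patterns offline. */
final class XPathPlayground {

    // Small, well-formed stand-in for a login form (invented for this example).
    static final String PAGE =
        "<form>"
      + "<label>Email</label><input id='username'/>"
      + "<button type='submit'>Login</button>"
      + "<button type='button'>Cancel</button>"
      + "</form>";

    /** Counts the nodes that an XPath expression matches in PAGE. */
    static int count(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(PAGE)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count("//button[contains(text(),'Log')]"));
        System.out.println(count("//label[text()='Email']/following-sibling::input"));
    }
}
```

A count of exactly one is what you want from a locator. Zero means the pattern is wrong, and more than one means findElement would silently pick the first match.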
💡
Locator Priority

Use in this order: ID → Name → CSS Selector → XPath. Use XPath only when CSS cannot do the job — e.g., selecting by text content or traversing to a parent element.

🧠Quick Quiz — Topic 5

Which XPath expression correctly selects a button whose text contains "Login"?

06
Interactions

WebElement Interactions

Once located, the WebElement interface provides methods to interact with and read data from elements.

Java — WebElement Methods
WebElement field = driver.findElement(By.id("email"));

// Actions
field.click();
field.sendKeys("user@example.com");  // placeholder address
field.clear();
field.submit();

// Read data
String text  = field.getText();
String value = field.getAttribute("value");
String css   = field.getCssValue("color");

// State checks
boolean vis = field.isDisplayed();
boolean ena = field.isEnabled();
boolean sel = field.isSelected();

HTML select Dropdown — Select Class

Java — Select Dropdown
Select select = new Select(driver.findElement(By.id("country")));

select.selectByVisibleText("India");
select.selectByValue("IN");
select.selectByIndex(2);

String chosen = select.getFirstSelectedOption().getText();

// Special keys
field.sendKeys(Keys.ENTER);
field.sendKeys(Keys.chord(Keys.CONTROL, "a"), Keys.DELETE);
ℹ️
Custom Dropdowns

The Select class only works on native HTML <select> elements. For custom dropdowns built with div/ul/li (React, Angular etc.), locate and click each element directly.

🧠Quick Quiz — Topic 6

Which Selenium class is specifically designed to interact with native HTML <select> dropdowns?

07
Synchronization

Waits & Synchronization

Timing issues are the leading cause of flaky tests. Selenium provides three types of waits to handle page load timing reliably.

Implicit Wait

A global timeout telling WebDriver to poll the DOM for a set duration when any findElement call doesn't immediately find the element. Set once for the whole session.

Java — Implicit Wait
driver.manage().timeouts().implicitlyWait(Duration.ofSeconds(10));

Explicit Wait — Recommended

Java — Explicit Wait
WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(15));

// Wait until visible
WebElement el = wait.until(
    ExpectedConditions.visibilityOfElementLocated(By.id("result"))
);

// Wait until clickable
wait.until(ExpectedConditions.elementToBeClickable(By.id("btn")));

// Wait for page title
wait.until(ExpectedConditions.titleContains("Dashboard"));

// Wait for spinner to disappear
wait.until(ExpectedConditions.invisibilityOfElementLocated(
    By.cssSelector(".spinner")
));

Fluent Wait

Java — Fluent Wait
FluentWait<WebDriver> wait = new FluentWait<>(driver)
    .withTimeout(Duration.ofSeconds(30))
    .pollingEvery(Duration.ofMillis(500))
    // Import org.openqa.selenium.NoSuchElementException, not the java.util class of the same name
    .ignoring(NoSuchElementException.class);

WebElement el = wait.until(d -> d.findElement(By.id("dynamic")));
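What a fluent wait does internally can be modelled in a few lines of plain Java. This is a simplified sketch, not Selenium's implementation: pollUntil is a hypothetical name, and it treats every RuntimeException as "not ready yet", whereas FluentWait ignores only the exception types you list.

```java
import java.time.Duration;
import java.util.function.Supplier;

/** Simplified model of a polling wait; not Selenium's actual code. */
final class PollingWaitSketch {

    /**
     * Calls probe repeatedly until it returns a non-null value or the timeout elapses.
     * RuntimeExceptions thrown by probe are swallowed and polling continues.
     */
    static <T> T pollUntil(Supplier<T> probe, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            try {
                T value = probe.get();
                if (value != null) {
                    return value;   // condition met: return immediately
                }
            } catch (RuntimeException ignored) {
                // comparable to .ignoring(...): keep polling
            }
            if (System.nanoTime() >= deadline) {
                throw new IllegalStateException("Timed out after " + timeout);
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
    }
}
```

The early return is the whole point: a condition that becomes true after 200 ms costs roughly 200 ms, whatever the timeout, while a fixed sleep always costs its full duration.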
⚠️
Never Mix Implicit and Explicit Waits

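Combining them can cause unpredictable wait times: the Selenium documentation gives the example of a 10-second implicit wait plus a 15-second explicit wait producing a timeout after 20 seconds. The official Selenium docs recommend using only one approach; prefer explicit waits for granular control.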

🧠Quick Quiz — Topic 7

What is the key advantage of Explicit Wait over Implicit Wait?

08
Advanced

Advanced Actions & JavascriptExecutor

The Actions class handles complex gestures. JavascriptExecutor lets you run JS directly in the browser for edge cases.

Java — Actions Class
Actions actions = new Actions(driver);

// Hover (reveals dropdown menus)
actions.moveToElement(navMenu).perform();

// Right-click
actions.contextClick(element).perform();

// Double-click
actions.doubleClick(element).perform();

// Drag and drop
actions.dragAndDrop(sourceEl, targetEl).perform();

// Shift + click (multi-select)
actions.keyDown(Keys.SHIFT)
       .click(item1).click(item2)
       .keyUp(Keys.SHIFT).perform();
Java — JavascriptExecutor
JavascriptExecutor js = (JavascriptExecutor) driver;

// Scroll element into view
js.executeScript("arguments[0].scrollIntoView(true);", el);

// Force-click (bypass overlays)
js.executeScript("arguments[0].click();", el);

// Scroll to bottom
js.executeScript("window.scrollTo(0, document.body.scrollHeight);");
🧠Quick Quiz — Topic 8

What method must be called at the end of an Actions chain to actually execute the actions?

09
Popups & Frames

Alerts, iFrames & Multiple Windows

Real applications use JavaScript alerts, iframes, and multiple browser tabs. Selenium's switchTo() API handles each.

JavaScript Alerts

Java — Alerts
wait.until(ExpectedConditions.alertIsPresent());
Alert alert = driver.switchTo().alert();

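String message = alert.getText();  // Read message
alert.sendKeys("input");           // Type into a prompt() dialog, before accepting it
alert.accept();                    // Click OK; this closes the alert
// alert.dismiss();                // Alternative to accept(): click Cancel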

iFrames

Java — iFrames
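// Switch into an iframe (pick one of these)
driver.switchTo().frame(0);                                     // by zero-based index
driver.switchTo().frame("iframeName");                          // by name or id attribute
driver.switchTo().frame(driver.findElement(By.id("myFrame")));  // by WebElement

// Leave the iframe
driver.switchTo().defaultContent();  // back to the top-level document
driver.switchTo().parentFrame();     // up one level only (for nested iframes)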

Multiple Windows / Tabs

Java — Windows
String original = driver.getWindowHandle();
driver.findElement(By.id("openNew")).click();

for (String h : driver.getWindowHandles()) {
    if (!h.equals(original)) {
        driver.switchTo().window(h);
        break;
    }
}
// Switch back when done
driver.switchTo().window(original);
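The loop above assumes exactly two windows are open. When more tabs may already exist, one option is to snapshot the handles before the click and take the set difference afterwards. The helper below is a hypothetical sketch in plain Java; in a real test both sets would come from driver.getWindowHandles().

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: finds the handle that appeared after an action. */
final class WindowHandles {

    /** Returns the single handle present in after but not in before. */
    static String newHandle(Set<String> before, Set<String> after) {
        Set<String> added = new HashSet<>(after);
        added.removeAll(before);
        if (added.size() != 1) {
            throw new IllegalStateException(
                "Expected exactly one new window, found " + added.size());
        }
        return added.iterator().next();
    }
}
```

Failing loudly when zero or several windows appear turns a confusing "element not found in the wrong tab" error into a clear message at the point where things went wrong.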
🧠Quick Quiz — Topic 9

After interacting inside an iframe, which method returns focus to the main page?

10
Design Pattern

Page Object Model (POM)

Page Object Model

POM is a design pattern where each web page has a corresponding Java class. Each class encapsulates that page's locators and interaction methods, separating test logic from page-interaction code.

Analogy

A Page Object is like a user manual for a single page. When a test needs to log in, it calls the LoginPage "manual" — loginPage.login("user","pass") — without knowing how the page works internally.

Benefits

Maintainability
Locator change? Update in one class — not 50 test files.
Reusability
All tests share the same page class methods.
Readability
Tests read like plain English business steps.
Separation of Concerns
Tests describe WHAT; Page Objects describe HOW.
Java — LoginPage.java
public class LoginPage {
    private WebDriver driver;

    @FindBy(id = "username") private WebElement usernameField;
    @FindBy(id = "password") private WebElement passwordField;
    @FindBy(cssSelector = "button[type='submit']") private WebElement loginBtn;

    public LoginPage(WebDriver driver) {
        this.driver = driver;
        PageFactory.initElements(driver, this);
    }

    public DashboardPage login(String user, String pass) {
        usernameField.sendKeys(user);
        passwordField.sendKeys(pass);
        loginBtn.click();
        return new DashboardPage(driver);
    }
}
Java — LoginTest.java
@Test
public void testValidLogin() {
    LoginPage page = new LoginPage(driver);
    DashboardPage dash = page.login("admin", "secret");
    Assert.assertTrue(dash.isLoaded());
}
🧠Quick Quiz — Topic 10

What is the role of PageFactory.initElements(driver, this) in a Page Object class?

11
Test Framework

TestNG Integration

TestNG is one of the most widely used test frameworks for Selenium in Java. It provides lifecycle annotations, grouping, parallel execution, data-driven testing, and HTML reports.

Key Annotations

@BeforeSuite
Runs once before the entire suite — global setup
@BeforeClass
Runs once before the first test in the class
@BeforeMethod
Runs before each test method — open browser here
@Test
Marks a method as a test. Supports priority, groups, enabled
@AfterMethod
Runs after each test method — call driver.quit() here
@DataProvider
Supplies multiple data rows to a single test method

Data-Driven Testing

Java — @DataProvider
@DataProvider(name = "loginData")
public Object[][] credentials() {
    return new Object[][] {
        { "admin",   "pass1", true  },
        { "user2",   "pass2", true  },
        { "invalid", "wrong", false }
    };
}

@Test(dataProvider = "loginData")
public void testLogin(String user, String pass, boolean success) {
    // login() returns the page object for the next page, so keep the result
    DashboardPage dashboard = loginPage.login(user, pass);
    if (success) {
        Assert.assertTrue(dashboard.isLoaded());
    } else {
        Assert.assertTrue(loginPage.getError().contains("Invalid"));
    }
}
🧠Quick Quiz — Topic 11

Which TestNG annotation is best suited for opening a browser before each individual test method?

12
Professional Quality

Best Practices & Common Pitfalls

Knowing the API is only half the battle. These practices separate reliable automation suites from fragile scripts that teams eventually abandon.

Best Practices — Always Follow

  1. Use explicit waits — never Thread.sleep(). Thread.sleep(3000) always waits the full duration. Explicit waits exit as soon as the condition is met, saving seconds per test.

  2. Implement POM from day one. Even for small projects. Locators scattered across test classes become a nightmare after 20+ tests.

  3. Never hard-code test data. Use @DataProvider, external CSV/Excel files, or a TestData factory class.

  4. Always clean up in @AfterMethod. driver.quit() there guarantees cleanup even when tests fail.

  5. Capture screenshots on failure. Use a TestNG ITestListener to auto-save screenshots on test failures.

  6. Use meaningful method names. testLoginWithValidCredentials() is far better than test1().
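For practice 3, the rows a @DataProvider returns can come from a file instead of an inline array. Below is a minimal sketch of the conversion step only, assuming a three-column CSV of user, password, and expected outcome with a header line. The class name CsvRows is hypothetical, and the naive split does not handle quoted fields or embedded commas.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: turns simple CSV text into @DataProvider-style rows. */
final class CsvRows {

    /** Converts CSV text (header line first) into rows of user, password, expectedSuccess. */
    static Object[][] parse(String csv) {
        List<Object[]> rows = new ArrayList<>();
        String[] lines = csv.strip().split("\\R");
        for (int i = 1; i < lines.length; i++) {   // start at 1 to skip the header
            if (lines[i].isBlank()) {
                continue;
            }
            String[] cells = lines[i].split(",", -1);
            rows.add(new Object[] {
                cells[0].trim(), cells[1].trim(), Boolean.parseBoolean(cells[2].trim())
            });
        }
        return rows.toArray(new Object[0][]);
    }
}
```

A @DataProvider method could read the file into a String and return CsvRows.parse(text), which keeps the test method signature from the earlier example unchanged.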

🚨
Anti-Patterns to Avoid

❌ Absolute XPaths like /html/body/div[3]/table/tr[2]/td[1] — break on any DOM change
❌ Thread.sleep() everywhere — slow and still flaky
❌ No assertions — tests pass even when app is broken
❌ Sharing one WebDriver instance across parallel tests
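A common way around the last anti-pattern is to give each test thread its own driver through a ThreadLocal. The sketch below shows the mechanism with a generic holder so that it runs without Selenium; PerThread is a hypothetical name, and in a real suite the type parameter would be WebDriver with a factory such as ChromeDriver::new.

```java
import java.util.function.Supplier;

/** Hypothetical holder: one lazily created instance per thread. */
final class PerThread<T> {

    private final ThreadLocal<T> slot;

    PerThread(Supplier<T> factory) {
        this.slot = ThreadLocal.withInitial(factory);
    }

    /** Returns the calling thread's own instance, creating it on first use. */
    T get() {
        return slot.get();
    }

    /** Drops the calling thread's instance; call this after quitting the driver. */
    void clear() {
        slot.remove();
    }
}
```

Calling clear() in the same teardown that quits the driver matters when the test runner reuses threads, because otherwise the next test on that thread would receive a driver that has already been closed.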

Recommended Project Structure

Project Structure
src/main/java/
  pages/            ← Page Object classes
  utils/            ← DriverFactory, ScreenshotUtil, WaitUtil
  data/             ← TestData constants
src/test/java/
  tests/            ← All @Test classes
  listeners/        ← ITestListener (screenshot on fail)
src/test/resources/
  testng.xml        ← Suite configuration
  config.properties ← Base URL, timeouts, credentials
screenshots/        ← Auto-captured on failure
pom.xml
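Reading config.properties needs nothing beyond the JDK. The sketch below is one possible shape for that utility. The class name TestConfig, the key names base.url and timeout.seconds, and the default values are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.time.Duration;
import java.util.Properties;

/** Hypothetical wrapper around config.properties; key names are assumptions. */
final class TestConfig {

    private final Properties props = new Properties();

    TestConfig(Reader source) {
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Base URL of the application under test, with a local default. */
    String baseUrl() {
        return props.getProperty("base.url", "http://localhost:8080");
    }

    /** Explicit-wait timeout, defaulting to 10 seconds when the key is absent. */
    Duration timeout() {
        return Duration.ofSeconds(Long.parseLong(props.getProperty("timeout.seconds", "10")));
    }
}
```

Taking a Reader rather than a file path keeps the class easy to test: production code can pass a reader over the classpath resource, while a unit test can pass a StringReader.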
🧠Quick Quiz — Topic 12

Why should Thread.sleep() be avoided for synchronization in Selenium tests?
