XPath Cheatsheet

Syntax for selecting nodes in XML/HTML documents


Basics

Node selection
/              root node
//             descendants at any depth (anywhere in document when leading)
.              current node
..             parent node
@              attribute

Examples
/html/body/div         absolute path
//div                  all divs anywhere
//div/p                p children of any div
//div//p               p descendants of any div
//@class               all class attributes
//a/@href              href of all links
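
The path examples above can be tried with Python's standard xml.etree.ElementTree module. This is only a sketch: ElementTree implements a small subset of XPath, paths must be relative (so //div is written .//div), and attribute nodes cannot be selected directly. The sample document is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, not part of the cheatsheet.
doc = ET.fromstring(
    "<html><body>"
    "<div><p>one</p><section><p>two</p></section></div>"
    "<a href='/home' class='nav'>Home</a>"
    "</body></html>"
)

# //div/p : p elements that are direct children of a div
children = [p.text for p in doc.findall(".//div/p")]

# //div//p : p elements at any depth below a div
descendants = [p.text for p in doc.findall(".//div//p")]

# //a/@href : ElementTree returns elements, so read the attribute afterwards
hrefs = [a.get("href") for a in doc.findall(".//a[@href]")]

print(children)     # ['one']
print(descendants)  # ['one', 'two']
print(hrefs)        # ['/home']
```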

Axes

Axis — direction from context node
ancestor::div           ancestor divs
ancestor-or-self::div   self or ancestor divs
child::p                child p elements (default axis)
descendant::span        all descendant spans
descendant-or-self::*   self + descendants
following::p            p after context node (any level, excludes descendants)
following-sibling::li   following siblings
parent::div             parent if div
preceding::p            p before context node (any level, excludes ancestors)
preceding-sibling::li   preceding siblings
self::div               self if div
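
To make the direction of a few axes concrete, here is a rough sketch in plain Python. ElementTree does not support named axes, so the helper functions below (ancestors, following_siblings, preceding_siblings) are hypothetical stand-ins that walk the tree by hand; the sample document is also invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
doc = ET.fromstring(
    "<div id='outer'><ul>"
    "<li>a</li><li id='ctx'>b</li><li>c</li><li>d</li>"
    "</ul></div>"
)

# ElementTree elements have no parent pointer, so build a child -> parent map.
parent_of = {child: parent for parent in doc.iter() for child in parent}

def ancestors(node):
    """Rough equivalent of ancestor::*, nearest ancestor first."""
    while node in parent_of:
        node = parent_of[node]
        yield node

def following_siblings(node):
    """Rough equivalent of following-sibling::*."""
    siblings = list(parent_of[node])
    return siblings[siblings.index(node) + 1:]

def preceding_siblings(node):
    """Rough equivalent of preceding-sibling::*, in document order."""
    siblings = list(parent_of[node])
    return siblings[:siblings.index(node)]

ctx = doc.find(".//li[@id='ctx']")  # the context node
ancestor_tags = [n.tag for n in ancestors(ctx)]
after = [n.text for n in following_siblings(ctx)]
before = [n.text for n in preceding_siblings(ctx)]

print(ancestor_tags)  # ['ul', 'div']
print(after)          # ['c', 'd']
print(before)         # ['a']
```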

Predicates

Filter with []
//li[1]                  first li of each parent
//li[last()]             last li of each parent
//li[position() < 3]     first two li of each parent
//div[@class]            divs with class attr
//div[@class="main"]     exact match
//div[@id="app"]
//input[@type="text"]
//a[@href and @class]    multiple attrs
//p[text()="Hello"]      by text content
//div[count(p) > 2]      divs with 3+ p children
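
Positional predicates count within each parent, not across the whole document. A small sketch with Python's xml.etree.ElementTree (which supports [n], [last()], [@attr] and [@attr='value'], but not position() or count()) shows this on an invented document with two lists:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document with two lists.
doc = ET.fromstring(
    "<body>"
    "<ul><li>a</li><li>b</li></ul>"
    "<ul><li>c</li><li>d</li><li>e</li></ul>"
    "<div class='main'/><div class='side'/><div/>"
    "</body>"
)

# //li[1] : the first li under EACH parent, so one hit per list
first = [li.text for li in doc.findall(".//li[1]")]

# //li[last()] : the last li under each parent
last = [li.text for li in doc.findall(".//li[last()]")]

# //div[@class] and //div[@class="main"]
with_class = len(doc.findall(".//div[@class]"))
main = len(doc.findall(".//div[@class='main']"))

print(first)             # ['a', 'c']
print(last)              # ['b', 'e']
print(with_class, main)  # 2 1
```

To get only the very first li in the document, full XPath uses (//li)[1]; with ElementTree, doc.find(".//li") returns the first match.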

Functions

String functions
contains(@class, "btn")        class contains "btn"
starts-with(@id, "user")       id starts with
normalize-space(text())        trim ends, collapse inner whitespace
string-length(text())          text length
concat("a", "b")               concatenation

Node functions
text()                         text node children (a node test)
name()                         element name
count(//li)                    count nodes
position()                     current position
last()                         last position
not(condition)                 negation
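
For readers more familiar with Python strings, the string functions above map roughly onto the operations below. These are analogues only, not an XPath engine; normalize_space is a hypothetical helper written for this sketch.

```python
import re

def normalize_space(s: str) -> str:
    """Approximate normalize-space(): strip the ends and collapse each run of
    XML whitespace (space, tab, CR, LF) into a single space."""
    return " ".join(part for part in re.split(r"[ \t\r\n]+", s) if part)

raw = "  Hello \n   world  "

trimmed = normalize_space(raw)            # normalize-space(text())
has_btn = "btn" in "btn-large primary"    # contains(@class, "btn")
is_user = "user-42".startswith("user")    # starts-with(@id, "user")
length = len("Hello")                     # string-length(text())
joined = "a" + "b"                        # concat("a", "b")

print(repr(trimmed))                     # 'Hello world'
print(has_btn, is_user, length, joined)  # True True 5 ab
```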

Operators

Comparison
=   !=   <   >   <=   >=

Logical
and     or      not()

Union
//h1 | //h2 | //h3        select all three

Examples
//div[@class="a" or @class="b"]
//input[@type!="hidden"]          has a type attr, and it is not "hidden"
//li[position() >= 2 and position() <= 4]

Common Patterns

# Class contains (substring match: also hits "btn-large", unlike CSS .btn)
//div[contains(@class, "btn")]

# Text contains
//a[contains(text(), "Click")]

# Input following a label
//label[text()="Email"]/following-sibling::input

# Nth li child of each ul
//ul/li[3]

# Not
//div[not(@class)]
//div[not(contains(@class, "hidden"))]

# Wildcard
//*[@id="app"]              any element with id
//div/*                     all direct children
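
Because contains(@class, "btn") is a substring test, it also matches classes such as "btn-group". In XPath 1.0 a whole-token match is usually written //div[contains(concat(" ", normalize-space(@class), " "), " btn ")]. The sketch below shows the difference in plain Python on an invented document; has_class is a hypothetical helper, not a library function.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
doc = ET.fromstring(
    "<body>"
    "<div class='btn primary'>A</div>"
    "<div class='btn-group'>B</div>"
    "<div class='big  btn'>C</div>"
    "<div>D</div>"
    "</body>"
)

def has_class(elem, token):
    """True if `token` is one of the whitespace-separated class names."""
    return token in elem.get("class", "").split()

# Substring test, like contains(@class, "btn")
substring = [d.text for d in doc.iter("div") if "btn" in d.get("class", "")]

# Whole-token test, like CSS .btn
tokens = [d.text for d in doc.iter("div") if has_class(d, "btn")]

print(substring)  # ['A', 'B', 'C']
print(tokens)     # ['A', 'C']
```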

XPath vs CSS Selectors

CSS                          XPath
div                          //div
#id                          //*[@id="id"]
.class                       //*[contains(concat(" ",normalize-space(@class)," ")," class ")]
div > p                      //div/p
div p                        //div//p
div + p                      //div/following-sibling::*[1][self::p]
div ~ p                      //div/following-sibling::p
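:first-child                 [not(preceding-sibling::*)]
:last-child                  [not(following-sibling::*)]
[attr="val"]                 [@attr="val"]
:nth-child(3)                [count(preceding-sibling::*)=2]
p:first-of-type              p[1]
p:last-of-type               p[last()]
p:nth-of-type(3)             p[3]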