


htmlquery is an XPath query package for HTML. It lets you extract data from HTML documents, or evaluate expressions against them, using XPath expressions.



  • #7 Removed deprecated FindEach() and FindEachWithBreak() methods.


  • Find() no longer adds duplicate elements to the returned list. #6


$ go get

Getting Started

Load an HTML document from a URL.

doc, err := htmlquery.LoadURL("")

Load an HTML document from a string.

s := `<html>....</html>`
doc, err := htmlquery.Parse(strings.NewReader(s))

Find all A elements.

list := htmlquery.Find(doc, "//a")

Find all A elements that have an href attribute.

list := htmlquery.Find(doc, "//a[@href]")

Find all A elements and select only their href attributes.

list := htmlquery.Find(doc, "//a/@href")

Find the third A element.

a := htmlquery.FindOne(doc, "//a[3]")

Count all IMG elements.

expr, _ := xpath.Compile("count(//img)")
v := expr.Evaluate(htmlquery.CreateXPathNavigator(doc)).(float64)
fmt.Printf("total count is %f", v)

Quick Tutorial

func main() {
	doc, err := htmlquery.LoadURL("")
	if err != nil {
		panic(err)
	}
	// Find all news items.
	for i, n := range htmlquery.Find(doc, "//ol/li") {
		// Skip list items that contain no link; FindOne returns nil for them.
		if a := htmlquery.FindOne(n, "//a"); a != nil {
			fmt.Printf("%d %s(%s)\n", i, htmlquery.InnerText(a), htmlquery.SelectAttr(a, "href"))
		}
	}
}


List of supported XPath query packages

Name       Description
htmlquery  XPath query package for the HTML document
xmlquery   XPath query package for the XML document
jsonquery  XPath query package for the JSON document


Please let me know if you have any questions.