packer-cn/vendor/github.com/antchfx/htmlquery/README.md

103 lines
2.6 KiB
Markdown

htmlquery
====
[![Build Status](https://travis-ci.org/antchfx/htmlquery.svg?branch=master)](https://travis-ci.org/antchfx/htmlquery)
[![Coverage Status](https://coveralls.io/repos/github/antchfx/htmlquery/badge.svg?branch=master)](https://coveralls.io/github/antchfx/htmlquery?branch=master)
[![GoDoc](https://godoc.org/github.com/antchfx/htmlquery?status.svg)](https://godoc.org/github.com/antchfx/htmlquery)
[![Go Report Card](https://goreportcard.com/badge/github.com/antchfx/htmlquery)](https://goreportcard.com/report/github.com/antchfx/htmlquery)
Overview
====
htmlquery is an XPath query package for HTML, lets you extract data or evaluate from HTML documents by an XPath expression.
Changelogs
===
2019-02-04
- [#7](https://github.com/antchfx/htmlquery/issues/7) Removed deprecated `FindEach()` and `FindEachWithBreak()` methods.
2018-12-28
- Avoid adding duplicate elements to list for `Find()` method. [#6](https://github.com/antchfx/htmlquery/issues/6)
Installation
====
> $ go get github.com/antchfx/htmlquery
Getting Started
====
#### Load HTML document from URL.
```go
doc, err := htmlquery.LoadURL("http://example.com/")
```
#### Load HTML document from string.
```go
s := `<html>....</html>`
doc, err := htmlquery.Parse(strings.NewReader(s))
```
#### Find all A elements.
```go
list := htmlquery.Find(doc, "//a")
```
#### Find all A elements that have `href` attribute.
```go
list := range htmlquery.Find(doc, "//a[@href]")
```
#### Find all A elements and only get `href` attribute self.
```go
list := range htmlquery.Find(doc, "//a/@href")
```
### Find the third A element.
```go
a := htmlquery.FindOne(doc, "//a[3]")
```
#### Evaluate the number of all IMG element.
```go
expr, _ := xpath.Compile("count(//img)")
v := expr.Evaluate(htmlquery.CreateXPathNavigator(doc)).(float64)
fmt.Printf("total count is %f", v)
```
Quick Tutorial
===
```go
func main() {
doc, err := htmlquery.LoadURL("https://www.bing.com/search?q=golang")
if err != nil {
panic(err)
}
// Find all news item.
for i, n := range htmlquery.Find(doc, "//ol/li") {
a := htmlquery.FindOne(n, "//a")
fmt.Printf("%d %s(%s)\n", i, htmlquery.InnerText(a), htmlquery.SelectAttr(a, "href"))
}
}
```
List of supported XPath query packages
===
|Name |Description |
|--------------------------|----------------|
|[htmlquery](https://github.com/antchfx/htmlquery) | XPath query package for the HTML document|
|[xmlquery](https://github.com/antchfx/xmlquery) | XPath query package for the XML document|
|[jsonquery](https://github.com/antchfx/jsonquery) | XPath query package for the JSON document|
Questions
===
Please let me know if you have any questions.