Sitemaps: Add XML sitemaps functionality to WordPress.
While web crawlers are able to discover pages from links within the site and from other sites, XML sitemaps supplement this approach by allowing crawlers to quickly and comprehensively identify all URLs included in the sitemap and learn other signals about those URLs using the associated metadata.
See https://make.wordpress.org/core/2020/06/10/merge-announcement-extensible-core-sitemaps/ for more details.
This feature exposes the sitemap index via `/wp-sitemap.xml` and exposes a variety of new filters and hooks for developers to modify the behavior. Users can disable sitemaps completely by turning off search engine visibility in WordPress admin.
This change also introduces a new `esc_xml()` function to escape strings for output in XML, as well as XML support to `wp_kses_normalize_entities()`.
Props Adrian McShane, afragen, adamsilverstein, casiepa, flixos90, garrett-eclipse, joemcgill, kburgoine, kraftbj, milana_cap, pacifika, pbiron, pfefferle, Ruxandra Gradina, swissspidy, szepeviktor, tangrufus, tweetythierry.
Fixes #50117.
See #3670. See #19998.
Built from https://develop.svn.wordpress.org/trunk@48072
git-svn-id: http://core.svn.wordpress.org/trunk@47839 1a063a9b-81f0-0310-95a4-ce76da25c4cd
2020-06-17 11:24:07 -04:00
<?php
/**
 * Sitemaps: WP_Sitemaps_Renderer class
 *
 * Responsible for rendering Sitemaps data to XML in accordance with sitemap protocol.
 *
 * @package WordPress
 * @subpackage Sitemaps
 * @since 5.5.0
 */

/**
 * Class WP_Sitemaps_Renderer
 *
 * @since 5.5.0
 */
class WP_Sitemaps_Renderer {
	/**
	 * XSL stylesheet for styling a sitemap for web browsers.
	 *
	 * @since 5.5.0
	 *
	 * @var string
	 */
	protected $stylesheet = '';

	/**
	 * XSL stylesheet for styling a sitemap index for web browsers.
	 *
	 * @since 5.5.0
	 *
	 * @var string
	 */
	protected $stylesheet_index = '';

	/**
	 * WP_Sitemaps_Renderer constructor.
	 *
	 * @since 5.5.0
	 */
	public function __construct() {
		$stylesheet_url = $this->get_sitemap_stylesheet_url();

		if ( $stylesheet_url ) {
			$this->stylesheet = '<?xml-stylesheet type="text/xsl" href="' . esc_url( $stylesheet_url ) . '" ?>';
		}

		$stylesheet_index_url = $this->get_sitemap_index_stylesheet_url();

		if ( $stylesheet_index_url ) {
			$this->stylesheet_index = '<?xml-stylesheet type="text/xsl" href="' . esc_url( $stylesheet_index_url ) . '" ?>';
		}
	}

	/**
	 * Gets the URL for the sitemap stylesheet.
	 *
	 * @since 5.5.0
	 *
	 * @return string The sitemap stylesheet url.
	 */
	public function get_sitemap_stylesheet_url() {
		/* @var WP_Rewrite $wp_rewrite */
		global $wp_rewrite;

		$sitemap_url = home_url( '/wp-sitemap.xsl' );

		if ( ! $wp_rewrite->using_permalinks() ) {
			$sitemap_url = add_query_arg( 'sitemap-stylesheet', 'sitemap', home_url( '/' ) );
		}

		/**
		 * Filters the URL for the sitemap stylesheet.
		 *
		 * If a falsy value is returned, no stylesheet will be used and
		 * the "raw" XML of the sitemap will be displayed.
		 *
		 * @since 5.5.0
		 *
		 * @param string $sitemap_url Full URL for the sitemaps xsl file.
		 */
		return apply_filters( 'wp_sitemaps_stylesheet_url', $sitemap_url );
	}

	/**
	 * Gets the URL for the sitemap index stylesheet.
	 *
	 * @since 5.5.0
	 *
	 * @return string The sitemap index stylesheet url.
	 */
	public function get_sitemap_index_stylesheet_url() {
		/* @var WP_Rewrite $wp_rewrite */
		global $wp_rewrite;

		$sitemap_url = home_url( '/wp-sitemap-index.xsl' );

		if ( ! $wp_rewrite->using_permalinks() ) {
			$sitemap_url = add_query_arg( 'sitemap-stylesheet', 'index', home_url( '/' ) );
		}

		/**
		 * Filters the URL for the sitemap index stylesheet.
		 *
		 * If a falsy value is returned, no stylesheet will be used and
		 * the "raw" XML of the sitemap index will be displayed.
		 *
		 * @since 5.5.0
		 *
		 * @param string $sitemap_url Full URL for the sitemaps index xsl file.
		 */
		return apply_filters( 'wp_sitemaps_stylesheet_index_url', $sitemap_url );
	}

	/**
	 * Renders a sitemap index.
	 *
	 * @since 5.5.0
	 *
	 * @param array $sitemaps Array of sitemap URLs.
	 */
	public function render_index( $sitemaps ) {
		header( 'Content-type: application/xml; charset=UTF-8' );

		$this->check_for_simple_xml_availability();

		$index_xml = $this->get_sitemap_index_xml( $sitemaps );

		if ( ! empty( $index_xml ) ) {
			// All output is escaped within get_sitemap_index_xml().
			// phpcs:ignore WordPress.Security.EscapeOutput.OutputNotEscaped
			echo $index_xml;
		}
	}

	/**
	 * Gets XML for a sitemap index.
	 *
	 * @since 5.5.0
	 *
	 * @param array $sitemaps Array of sitemap URLs.
	 * @return string|false A well-formed XML string for a sitemap index. False on error.
	 */
	public function get_sitemap_index_xml( $sitemaps ) {
		$sitemap_index = new SimpleXMLElement(
			sprintf(
				'%1$s%2$s%3$s',
				'<?xml version="1.0" encoding="UTF-8" ?>',
				$this->stylesheet_index,
				'<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" />'
			)
		);

		foreach ( $sitemaps as $entry ) {
			$sitemap = $sitemap_index->addChild( 'sitemap' );

			// Add each element as a child node to the <sitemap> entry.
			foreach ( $entry as $name => $value ) {
				if ( 'loc' === $name ) {
					$sitemap->addChild( $name, esc_url( $value ) );
				} elseif ( 'lastmod' === $name ) {
					$sitemap->addChild( $name, esc_xml( $value ) );
				} else {
					_doing_it_wrong(
						__METHOD__,
						sprintf(
							/* translators: %s: List of element names. */
							__( 'Fields other than %s are not currently supported for the sitemap index.' ),
							implode( ',', array( 'loc', 'lastmod' ) )
						),
						'5.5.0'
					);
				}
			}
		}

		return $sitemap_index->asXML();
	}

	/**
	 * Renders a sitemap.
	 *
	 * @since 5.5.0
	 *
	 * @param array $url_list Array of URLs for a sitemap.
	 */
	public function render_sitemap( $url_list ) {
		header( 'Content-type: application/xml; charset=UTF-8' );

		$this->check_for_simple_xml_availability();

		$sitemap_xml = $this->get_sitemap_xml( $url_list );

		if ( ! empty( $sitemap_xml ) ) {
			// All output is escaped within get_sitemap_xml().
			// phpcs:ignore WordPress.Security.EscapeOutput.OutputNotEscaped
			echo $sitemap_xml;
		}
	}

	/**
	 * Gets XML for a sitemap.
	 *
	 * @since 5.5.0
	 *
	 * @param array $url_list Array of URLs for a sitemap.
	 * @return string|false A well-formed XML string for a sitemap. False on error.
	 */
	public function get_sitemap_xml( $url_list ) {
		$urlset = new SimpleXMLElement(
			sprintf(
				'%1$s%2$s%3$s',
				'<?xml version="1.0" encoding="UTF-8" ?>',
				$this->stylesheet,
				'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" />'
			)
		);

		foreach ( $url_list as $url_item ) {
			$url = $urlset->addChild( 'url' );

			// Add each element as a child node to the <url> entry.
			foreach ( $url_item as $name => $value ) {
				if ( 'loc' === $name ) {
					$url->addChild( $name, esc_url( $value ) );
				} elseif ( in_array( $name, array( 'lastmod', 'changefreq', 'priority' ), true ) ) {
					$url->addChild( $name, esc_xml( $value ) );
				} else {
					_doing_it_wrong(
						__METHOD__,
						sprintf(
							/* translators: %s: List of element names. */
							__( 'Fields other than %s are not currently supported for sitemaps.' ),
							implode( ',', array( 'loc', 'lastmod', 'changefreq', 'priority' ) )
						),
						'5.5.0'
					);
				}
			}
		}

		return $urlset->asXML();
	}

	/**
	 * Checks for the availability of the SimpleXML extension and errors if missing.
	 *
	 * @since 5.5.0
	 */
	private function check_for_simple_xml_availability() {
		if ( ! class_exists( 'SimpleXMLElement' ) ) {
			add_filter(
				'wp_die_handler',
				static function () {
					return '_xml_wp_die_handler';
				}
			);

			wp_die(
				sprintf(
					/* translators: %s: SimpleXML */
					esc_xml( __( 'Could not generate XML sitemap due to missing %s extension' ) ),
					'SimpleXML'
				),
				esc_xml( __( 'WordPress › Error' ) ),
				array(
					'response' => 501, // "Not implemented".
				)
			);
		}
	}
}
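
The following is a minimal standalone sketch, not part of the file above, of the approach `get_sitemap_index_xml()` takes: seed a `SimpleXMLElement` with the XML declaration and an empty `<sitemapindex>` root, then add one `<sitemap>` child per entry. It assumes it runs outside WordPress, so the hypothetical helper `sketch_sitemap_index_xml()` uses PHP's `htmlspecialchars()` as a stand-in for `esc_url()` and `esc_xml()`, and it silently skips unsupported fields instead of calling `_doing_it_wrong()`. The URL and date are placeholders.

```php
<?php
/**
 * Hypothetical helper (not a WordPress function): builds sitemap index XML.
 *
 * @param array $sitemaps List of entries, each an array with 'loc' and optionally 'lastmod'.
 * @return string|false XML string, or false if serialization fails.
 */
function sketch_sitemap_index_xml( array $sitemaps ) {
	$index = new SimpleXMLElement(
		'<?xml version="1.0" encoding="UTF-8" ?>'
		. '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" />'
	);

	foreach ( $sitemaps as $entry ) {
		$sitemap = $index->addChild( 'sitemap' );

		foreach ( $entry as $name => $value ) {
			// Only the two fields the sitemap index supports are emitted.
			if ( in_array( $name, array( 'loc', 'lastmod' ), true ) ) {
				// Assumption: htmlspecialchars() stands in for esc_url()/esc_xml().
				$sitemap->addChild( $name, htmlspecialchars( $value, ENT_XML1 | ENT_QUOTES, 'UTF-8' ) );
			}
		}
	}

	return $index->asXML();
}

$xml = sketch_sitemap_index_xml(
	array(
		array(
			'loc'      => 'https://example.com/wp-sitemap-posts-post-1.xml',
			'lastmod'  => '2020-06-17T11:24:07-04:00',
			'priority' => '0.5', // Unsupported in an index, so it is skipped.
		),
	)
);

echo $xml;
```

Escaping each value before passing it to `addChild()` matters in this sketch because `addChild()` does not escape ampersands in the value itself.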