Sitemaps: Add XML sitemaps functionality to WordPress.
While web crawlers are able to discover pages from links within the site and from other sites, XML sitemaps supplement this approach by allowing crawlers to quickly and comprehensively identify all URLs included in the sitemap and learn other signals about those URLs using the associated metadata.
See https://make.wordpress.org/core/2020/06/10/merge-announcement-extensible-core-sitemaps/ for more details.
This feature exposes the sitemap index via `/wp-sitemap.xml` and exposes a variety of new filters and hooks for developers to modify the behavior. Users can disable sitemaps completely by turning off search engine visibility in WordPress admin.
This change also introduces a new `esc_xml()` function to escape strings for output in XML, as well as XML support to `wp_kses_normalize_entities()`.
Props Adrian McShane, afragen, adamsilverstein, casiepa, flixos90, garrett-eclipse, joemcgill, kburgoine, kraftbj, milana_cap, pacifika, pbiron, pfefferle, Ruxandra Gradina, swissspidy, szepeviktor, tangrufus, tweetythierry.
Fixes #50117.
See #3670. See #19998.
Built from https://develop.svn.wordpress.org/trunk@48072
git-svn-id: http://core.svn.wordpress.org/trunk@47839 1a063a9b-81f0-0310-95a4-ce76da25c4cd
2020-06-17 11:24:07 -04:00

<?php
/**
 * Sitemaps: WP_Sitemaps_Provider class
 *
 * This class is a base class for other sitemap providers to extend and contains shared functionality.
 *
 * @package WordPress
 * @subpackage Sitemaps
 * @since 5.5.0
 */

/**
 * Class WP_Sitemaps_Provider.
 *
 * @since 5.5.0
 */
abstract class WP_Sitemaps_Provider {
	/**
	 * Provider name.
	 *
	 * This will also be used as the public-facing name in URLs.
	 *
	 * @since 5.5.0
	 *
	 * @var string
	 */
	protected $name = '';

	/**
	 * Object type name (e.g. 'post', 'term', 'user').
	 *
	 * @since 5.5.0
	 *
	 * @var string
	 */
	protected $object_type = '';

	/**
	 * Gets a URL list for a sitemap.
	 *
	 * @since 5.5.0
	 *
	 * @param int    $page_num       Page of results.
	 * @param string $object_subtype Optional. Object subtype name. Default empty.
	 * @return array Array of URLs for a sitemap.
	 */
	abstract public function get_url_list( $page_num, $object_subtype = '' );

	/**
	 * Gets the max number of pages available for the object type.
	 *
	 * @since 5.5.0
	 *
	 * @param string $object_subtype Optional. Object subtype. Default empty.
	 * @return int Total number of pages.
	 */
	abstract public function get_max_num_pages( $object_subtype = '' );

	/**
	 * Gets data about each sitemap type.
	 *
	 * @since 5.5.0
	 *
	 * @return array[] Array of sitemap types including object subtype name and number of pages.
	 */
	public function get_sitemap_type_data() {
		$sitemap_data = array();

		$object_subtypes = $this->get_object_subtypes();

		// If there are no object subtypes, include a single sitemap for the
		// entire object type.
		if ( empty( $object_subtypes ) ) {
			$sitemap_data[] = array(
				'name'  => '',
				'pages' => $this->get_max_num_pages(),
			);
			return $sitemap_data;
		}

		// Otherwise, include individual sitemaps for every object subtype.
		foreach ( $object_subtypes as $object_subtype_name => $data ) {
			$object_subtype_name = (string) $object_subtype_name;

			$sitemap_data[] = array(
				'name'  => $object_subtype_name,
				'pages' => $this->get_max_num_pages( $object_subtype_name ),
			);
		}

		return $sitemap_data;
	}

	/**
	 * Lists sitemap pages exposed by this provider.
	 *
	 * The returned data is used to populate the sitemap entries of the index.
	 *
	 * @since 5.5.0
	 *
	 * @return array[] Array of sitemap entries.
	 */
	public function get_sitemap_entries() {
		$sitemaps = array();

		$sitemap_types = $this->get_sitemap_type_data();

		foreach ( $sitemap_types as $type ) {
			for ( $page = 1; $page <= $type['pages']; $page++ ) {
				$sitemap_entry = array(
					'loc' => $this->get_sitemap_url( $type['name'], $page ),
				);

				/**
				 * Filters the sitemap entry for the sitemap index.
				 *
				 * @since 5.5.0
				 *
				 * @param array  $sitemap_entry  Sitemap entry for the post.
				 * @param string $object_type    Object type name.
				 * @param string $object_subtype Object subtype name.
				 *                               Empty string if the object type does not support subtypes.
				 * @param int    $page           Page number of results.
				 */
				$sitemap_entry = apply_filters( 'wp_sitemaps_index_entry', $sitemap_entry, $this->object_type, $type['name'], $page );

				$sitemaps[] = $sitemap_entry;
			}
		}

		return $sitemaps;
	}

	/**
	 * Gets the URL of a sitemap entry.
	 *
	 * @since 5.5.0
	 *
	 * @global WP_Rewrite $wp_rewrite WordPress rewrite component.
	 *
	 * @param string $name The name of the sitemap.
	 * @param int    $page The page of the sitemap.
	 * @return string The composed URL for a sitemap entry.
	 */
	public function get_sitemap_url( $name, $page ) {
		global $wp_rewrite;

		if ( ! $wp_rewrite->using_permalinks() ) {
			return add_query_arg(
				// Accounts for cases where name is not included, e.g. ?sitemap=users&paged=1.
				array_filter(
					array(
						'sitemap'         => $this->name,
						'sitemap-subtype' => $name,
						'paged'           => $page,
					)
				),
				home_url( '/' )
			);
		}

		$basename = sprintf(
			'/wp-sitemap-%1$s.xml',
			implode(
				'-',
				// Accounts for cases where name is not included, e.g. wp-sitemap-users-1.xml.
				array_filter(
					array(
						$this->name,
						$name,
						(string) $page,
					)
				)
			)
		);

		return home_url( $basename );
	}

	/**
	 * Returns the list of supported object subtypes exposed by the provider.
	 *
	 * @since 5.5.0
	 *
	 * @return array List of object subtype objects keyed by their name.
	 */
	public function get_object_subtypes() {
		return array();
	}
}
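
/*
 * Illustrative sketch only, not part of WordPress core. It mirrors how
 * WP_Sitemaps_Provider::get_sitemap_url() appears to compose the file name
 * when pretty permalinks are enabled: empty parts are dropped by
 * array_filter() before the rest are joined with '-'.
 * The function name example_sitemap_basename() and its parameter names are
 * hypothetical and exist only for this example.
 */
function example_sitemap_basename( $provider_name, $subtype_name, $page ) {
	// array_filter() without a callback removes falsy values, so an empty
	// subtype ('') simply disappears from the file name.
	$parts = array_filter(
		array(
			$provider_name,
			$subtype_name,
			(string) $page,
		)
	);

	return sprintf( '/wp-sitemap-%1$s.xml', implode( '-', $parts ) );
}

// Usage: a provider without subtypes versus one with a subtype.
// example_sitemap_basename( 'users', '', 1 );     // '/wp-sitemap-users-1.xml'
// example_sitemap_basename( 'posts', 'page', 2 ); // '/wp-sitemap-posts-page-2.xml'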