WordPress, Git-ified. This repository is just a mirror of the WordPress subversion repository. Please do not send pull requests. Submit pull requests to https://github.com/WordPress/wordpress-develop and patches to https://core.trac.wordpress.org/ instead.
62e0ef411b
Since its introduction in WordPress 6.2 the HTML Tag Processor has provided a way to scan through all of the HTML tags in a document and then read and modify their attributes. In order to reliably do this, it also needed to be aware of other kinds of HTML syntax, but it didn't expose those syntax tokens to consumers of the API. In this patch the Tag Processor introduces a new scanning method and a few helper methods to read information about or from each token. Most significantly, this introduces the ability to read `#text` nodes in the document. What's new in the Tag Processor? ================================ - `next_token()` visits every distinct syntax token in a document. - `get_token_type()` indicates what kind of token it is. - `get_token_name()` returns something akin to `DOMNode.nodeName`. - `get_modifiable_text()` returns the text associated with a token. - `get_comment_type()` indicates why a token represents an HTML comment. Example usage. ============== {{{ <?php function strip_all_tags( $html ) { $text_content = ''; $processor = new WP_HTML_Tag_Processor( $html ); while ( $processor->next_token() ) { if ( '#text' !== $processor->get_token_type() ) { continue; } $text_content .= $processor->get_modifiable_text(); } return $text_content; } }}} What changes in the Tag Processor? ================================== Previously, the Tag Processor would scan the opening and closing tag of every HTML element separately. Now, however, there are special tags which it only visits once, as if those elements were void tags without a closer. These are special tags because their content contains no other HTML or markup, only non-HTML content. - SCRIPT elements contain raw text which is isolated from the rest of the HTML document and fed separately into a JavaScript engine. There are complicated rules to avoid escaping the script context in the HTML. The contents are left verbatim, and character references are not decoded. - TEXTARA and TITLE elements contain plain text which is decoded before display, e.g. transforming `&` into `&`. Any markup which resembles tags is treated as verbatim text and not a tag. - IFRAME, NOEMBED, NOFRAMES, STYLE, and XMP elements are similar to the textarea and title elements, but no character references are decoded. For example, `&` inside a STYLE element is passed to the CSS engine as the literal string `&` and _not_ as `&`. Because it's important not treat this inner content separately from the elements containing it, the Tag Processor combines them when scanning into a single match and makes their content available as modifiable text (see below). This means that the Tag Processor will no longer visit a closing tag for any of these elements unless that tag is unexpected. {{{ <title>There is only a single token in this line</title> <title>There are two tokens in this line></title></title> </title><title>There are still two tokens in this line></title> }}} What are tokens? ================ The term "token" here is a parsing term, which means a primitive unit in HTML. There are only a few kinds of tokens in HTML: - a tag has a name, attributes, and a closing or self-closing flag. - a text node, or `#text` node contains plain text which is displayed in a browser and which is decoded before display. - a DOCTYPE declaration indicates how to parse the document. - a comment is hidden from the display on a page but present in the HTML. There are a few more kinds of tokens that the HTML Tag Processor will recognize, some of which don't exist as concepts in HTML. These mostly comprise XML syntax elements that aren't part of HTML (such as CDATA and processing instructions) and invalid HTML syntax that transforms into comments. What is a funky comment? ======================== This patch treats a specific kind of invalid comment in a special way. A closing tag with an invalid name is considered a "funky comment." In the browser these become HTML comments just like any other, but their syntax is convenient for representing a variety of bits of information in a well-defined way and which cannot be nested or recursive, given the parsing rules handling this invalid syntax. - `</1>` - `</%avatar_url>` - `</{"wp_bit": {"type": "post-author"}}>` - `</[post-author]>` - `</__( 'Save Post' );>` All of these examples become HTML comments in the browser. The content inside the funky content is easily parsable, whereby the only rule is that it starts at the `<` and continues until the nearest `>`. There can be no funky comment inside another, because that would imply having a `>` inside of one, which would actually terminate the first one. What is modifiable text? ======================== Modifiable text is similar to the `innerText` property of a DOM node. It represents the span of text for a given token which may be modified without changing the structure of the HTML document or the token. There is currently no mechanism to change the modifiable text, but this is planned to arrive in a later patch. Tags ==== Most tags have no modifiable text because they have child nodes where text nodes are found. Only the special tags mentioned above have modifiable text. {{{ <div class="post">Another day in HTML</div> └─ tag ──────────┘└─ text node ─────┘└────┴─ tag }}} {{{ <title>Is <img> > <image>?</title> │ └ modifiable text ───┘ │ "Is <img> > <image>?" └─ tag ─────────────────────────────┘ }}} Text nodes ========== Text nodes are entirely modifiable text. {{{ This HTML document has no tags. └─ modifiable text ───────────┘ }}} Comments ======== The modifiable text inside a comment is the portion of the comment that doesn't form its syntax. This applies for a number of invalid comments. {{{ <!-- this is inside a comment --> │ └─ modifiable text ──────┘ │ └─ comment token ───────────────┘ }}} {{{ <!--> This invalid comment has no modifiable text. }}} {{{ <? this is an invalid comment --> │ └─ modifiable text ────────┘ │ └─ comment token ───────────────┘ }}} {{{ <[CDATA[this is an invalid comment]]> │ └─ modifiable text ───────┘ │ └─ comment token ───────────────────┘ }}} Other token types also have modifiable text. Consult the code or tests for further information. Developed in https://github.com/WordPress/wordpress-develop/pull/5683 Discussed in https://core.trac.wordpress.org/ticket/60170 Follows [57575] Props bernhard-reiter, dlh, dmsnell, jonsurrell, zieladam Fixes #60170 Built from https://develop.svn.wordpress.org/trunk@57348 git-svn-id: http://core.svn.wordpress.org/trunk@56854 1a063a9b-81f0-0310-95a4-ce76da25c4cd |
||
---|---|---|
wp-admin | ||
wp-content | ||
wp-includes | ||
index.php | ||
license.txt | ||
readme.html | ||
wp-activate.php | ||
wp-blog-header.php | ||
wp-comments-post.php | ||
wp-config-sample.php | ||
wp-cron.php | ||
wp-links-opml.php | ||
wp-load.php | ||
wp-login.php | ||
wp-mail.php | ||
wp-settings.php | ||
wp-signup.php | ||
wp-trackback.php | ||
xmlrpc.php |
readme.html
<!DOCTYPE html> <html lang="en"> <head> <meta name="viewport" content="width=device-width" /> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <title>WordPress › ReadMe</title> <link rel="stylesheet" href="wp-admin/css/install.css?ver=20100228" type="text/css" /> </head> <body> <h1 id="logo"> <a href="https://wordpress.org/"><img alt="WordPress" src="wp-admin/images/wordpress-logo.png" /></a> </h1> <p style="text-align: center">Semantic Personal Publishing Platform</p> <h2>First Things First</h2> <p>Welcome. WordPress is a very special project to me. Every developer and contributor adds something unique to the mix, and together we create something beautiful that I am proud to be a part of. Thousands of hours have gone into WordPress, and we are dedicated to making it better every day. Thank you for making it part of your world.</p> <p style="text-align: right">— Matt Mullenweg</p> <h2>Installation: Famous 5-minute install</h2> <ol> <li>Unzip the package in an empty directory and upload everything.</li> <li>Open <span class="file"><a href="wp-admin/install.php">wp-admin/install.php</a></span> in your browser. It will take you through the process to set up a <code>wp-config.php</code> file with your database connection details. <ol> <li>If for some reason this does not work, do not worry. It may not work on all web hosts. Open up <code>wp-config-sample.php</code> with a text editor like WordPad or similar and fill in your database connection details.</li> <li>Save the file as <code>wp-config.php</code> and upload it.</li> <li>Open <span class="file"><a href="wp-admin/install.php">wp-admin/install.php</a></span> in your browser.</li> </ol> </li> <li>Once the configuration file is set up, the installer will set up the tables needed for your site. If there is an error, double check your <code>wp-config.php</code> file, and try again. If it fails again, please go to the <a href="https://wordpress.org/support/forums/">WordPress support forums</a> with as much data as you can gather.</li> <li><strong>If you did not enter a password, note the password given to you.</strong> If you did not provide a username, it will be <code>admin</code>.</li> <li>The installer should then send you to the <a href="wp-login.php">login page</a>. Sign in with the username and password you chose during the installation. If a password was generated for you, you can then click on “Profile” to change the password.</li> </ol> <h2>Updating</h2> <h3>Using the Automatic Updater</h3> <ol> <li>Open <span class="file"><a href="wp-admin/update-core.php">wp-admin/update-core.php</a></span> in your browser and follow the instructions.</li> <li>You wanted more, perhaps? That’s it!</li> </ol> <h3>Updating Manually</h3> <ol> <li>Before you update anything, make sure you have backup copies of any files you may have modified such as <code>index.php</code>.</li> <li>Delete your old WordPress files, saving ones you’ve modified.</li> <li>Upload the new files.</li> <li>Point your browser to <span class="file"><a href="wp-admin/upgrade.php">/wp-admin/upgrade.php</a>.</span></li> </ol> <h2>Migrating from other systems</h2> <p>WordPress can <a href="https://wordpress.org/documentation/article/importing-content/">import from a number of systems</a>. First you need to get WordPress installed and working as described above, before using <a href="wp-admin/import.php">our import tools</a>.</p> <h2>System Requirements</h2> <ul> <li><a href="https://secure.php.net/">PHP</a> version <strong>7.0</strong> or greater.</li> <li><a href="https://www.mysql.com/">MySQL</a> version <strong>5.5.5</strong> or greater.</li> </ul> <h3>Recommendations</h3> <ul> <li><a href="https://secure.php.net/">PHP</a> version <strong>7.4</strong> or greater.</li> <li><a href="https://www.mysql.com/">MySQL</a> version <strong>8.0</strong> or greater OR <a href="https://mariadb.org/">MariaDB</a> version <strong>10.4</strong> or greater.</li> <li>The <a href="https://httpd.apache.org/docs/2.2/mod/mod_rewrite.html">mod_rewrite</a> Apache module.</li> <li><a href="https://wordpress.org/news/2016/12/moving-toward-ssl/">HTTPS</a> support.</li> <li>A link to <a href="https://wordpress.org/">wordpress.org</a> on your site.</li> </ul> <h2>Online Resources</h2> <p>If you have any questions that are not addressed in this document, please take advantage of WordPress’ numerous online resources:</p> <dl> <dt><a href="https://wordpress.org/documentation/">HelpHub</a></dt> <dd>HelpHub is the encyclopedia of all things WordPress. It is the most comprehensive source of information for WordPress available.</dd> <dt><a href="https://wordpress.org/news/">The WordPress Blog</a></dt> <dd>This is where you’ll find the latest updates and news related to WordPress. Recent WordPress news appears in your administrative dashboard by default.</dd> <dt><a href="https://planet.wordpress.org/">WordPress Planet</a></dt> <dd>The WordPress Planet is a news aggregator that brings together posts from WordPress blogs around the web.</dd> <dt><a href="https://wordpress.org/support/forums/">WordPress Support Forums</a></dt> <dd>If you’ve looked everywhere and still cannot find an answer, the support forums are very active and have a large community ready to help. To help them help you be sure to use a descriptive thread title and describe your question in as much detail as possible.</dd> <dt><a href="https://make.wordpress.org/support/handbook/appendix/other-support-locations/introduction-to-irc/">WordPress <abbr>IRC</abbr> (Internet Relay Chat) Channel</a></dt> <dd>There is an online chat channel that is used for discussion among people who use WordPress and occasionally support topics. The above wiki page should point you in the right direction. (<a href="https://web.libera.chat/#wordpress">irc.libera.chat #wordpress</a>)</dd> </dl> <h2>Final Notes</h2> <ul> <li>If you have any suggestions, ideas, or comments, or if you (gasp!) found a bug, join us in the <a href="https://wordpress.org/support/forums/">Support Forums</a>.</li> <li>WordPress has a robust plugin <abbr>API</abbr> (Application Programming Interface) that makes extending the code easy. If you are a developer interested in utilizing this, see the <a href="https://developer.wordpress.org/plugins/">Plugin Developer Handbook</a>. You shouldn’t modify any of the core code.</li> </ul> <h2>Share the Love</h2> <p>WordPress has no multi-million dollar marketing campaign or celebrity sponsors, but we do have something even better—you. If you enjoy WordPress please consider telling a friend, setting it up for someone less knowledgeable than yourself, or writing the author of a media article that overlooks us.</p> <p>WordPress is the official continuation of <a href="https://cafelog.com/">b2/cafélog</a>, which came from Michel V. The work has been continued by the <a href="https://wordpress.org/about/">WordPress developers</a>. If you would like to support WordPress, please consider <a href="https://wordpress.org/donate/">donating</a>.</p> <h2>License</h2> <p>WordPress is free software, and is released under the terms of the <abbr>GPL</abbr> (GNU General Public License) version 2 or (at your option) any later version. See <a href="license.txt">license.txt</a>.</p> </body> </html>