8337 Commits

Author SHA1 Message Date
Gary D. Gregory f923a51525 Add comment for reproducible builds
See https://maven.apache.org/guides/mini/guide-reproducible-builds.html
2025-01-11 08:13:24 -05:00
Gary D. Gregory 318f892fbf Make sure JAR files are readable in the TAR file
See also https://issues.apache.org/jira/browse/BCEL-375
2025-01-10 21:32:12 -05:00
dependabot[bot] 84103a7e86
Bump actions/upload-artifact from 4.5.0 to 4.6.0 (#1341)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.5.0 to 4.6.0.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](6f51ac03b9...65c4c4a1dd)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-10 10:18:23 -05:00
Gary Gregory bdcc5a0502 Fix Spotbugs SE_TRANSIENT_FIELD_NOT_RESTORED issues
[ERROR] Medium: The field
org.apache.commons.lang3.builder.DiffBuilder$SDiff.leftSupplier is
transient but isn't set by deserialization
[org.apache.commons.lang3.builder.DiffBuilder$SDiff] In DiffBuilder.java
SE_TRANSIENT_FIELD_NOT_RESTORED
[ERROR] Medium: The field
org.apache.commons.lang3.builder.DiffBuilder$SDiff.rightSupplier is
transient but isn't set by deserialization
[org.apache.commons.lang3.builder.DiffBuilder$SDiff] In DiffBuilder.java
SE_TRANSIENT_FIELD_NOT_RESTORED
2025-01-09 15:29:10 -05:00
Gary Gregory 744a8c30c6 Add RegExUtils methods typed to CharSequence input and deprecate old
versions typed to String
2025-01-09 15:03:29 -05:00
Gary Gregory f43534ec04 Add RegExUtils methods typed to CharSequence input and deprecate old
versions typed to String
2025-01-09 13:20:23 -05:00
Gary Gregory ea47d47ea8 Fix Spotbugs SE_TRANSIENT_FIELD_NOT_RESTORED issues
[ERROR] Medium: The field
org.apache.commons.lang3.builder.DiffBuilder$SDiff.leftSupplier is
transient but isn't set by deserialization
[org.apache.commons.lang3.builder.DiffBuilder$SDiff] In DiffBuilder.java
SE_TRANSIENT_FIELD_NOT_RESTORED
[ERROR] Medium: The field
org.apache.commons.lang3.builder.DiffBuilder$SDiff.rightSupplier is
transient but isn't set by deserialization
[org.apache.commons.lang3.builder.DiffBuilder$SDiff] In DiffBuilder.java
SE_TRANSIENT_FIELD_NOT_RESTORED
2025-01-09 13:13:08 -05:00
Gary Gregory dee0da18ed Undeprecate ObjectUtils.toString(Object) 2025-01-09 13:11:45 -05:00
Gary Gregory aafb046b68 Undeprecate ObjectUtils.toString(Object) 2025-01-09 12:48:10 -05:00
Gary Gregory 65112a297b Javadoc 2025-01-09 10:06:44 -05:00
Gary Gregory e7d001f1b7 Add RegExUtils.removePattern(CharSequence, String) and deprecate
RegExUtils.removePattern(String, String)
2025-01-09 10:01:49 -05:00
Gary Gregory 6e0c4f9cc9 Bump org.apache.commons:commons-parent from 78 to 79 2025-01-09 09:41:22 -05:00
Gary Gregory 6b81f9e6cf Add RegExUtils.replacePattern(CharSequence, String, String) and
deprecate RegExUtils.replacePattern(String, String, String)
2025-01-09 09:26:58 -05:00
Gary Gregory 9bc57f7fed Add RegExUtils.dotAllMatcher(String, CharSequence) and deprecate
RegExUtils.dotAllMatcher(String, String)
2025-01-09 09:19:52 -05:00
Gary Gregory 1ea3bee710 Javadoc current behavior 2025-01-09 08:35:50 -05:00
Gary Gregory bfc2275e33 Remove _ from test method names 2025-01-09 08:18:41 -05:00
Gary Gregory b3ecf47086 Add StringUtilsTrimStripTest.testStripAccentsUnicodeVulgarFractions() 2025-01-09 08:17:40 -05:00
Gary Gregory 0349409132 Rename test method 2025-01-09 08:05:02 -05:00
Gary D. Gregory 4c050914d9 Merge branch 'master' of https://garydgregory@github.com/apache/commons-lang.git 2025-01-07 12:22:45 -05:00
Gary D. Gregory fc3638e0fc Add tests
- Rename test
- Format tweak
2025-01-07 12:22:35 -05:00
Gary D. Gregory d272066e8c Rename test 2025-01-07 12:16:13 -05:00
Gary Gregory f5da9b7bc3
[StringUtils::indexOfAnyBut] redesign due to inconsistent/faulty behavior regarding UTF-16 surrogates #1327 2025-01-06 17:25:08 -05:00
IBue 665f047e55
[StringUtils::indexOfAnyBut] redesign due to inconsistent/faulty behaviour regarding UTF-16 surrogates (#1327)
* [StringUtils::indexOfAnyBut] redesign due to inconsistent/faulty behaviour regarding UTF-16 surrogates

Both signatures of StringUtils::indexOfAnyBut currently behave
inconsistently in matching UTF-16 supplementary characters and single
UTF-16 surrogate characters (i.e. paired and unpaired surrogates), since
they differ unnecessarily in their algorithmic implementations, use
their own incomplete and faulty interpretation of UTF-16 and don't take
full advantage of the standard library.

The example cases below show that they may yield contradictory results
or correct results for the wrong reasons.

This proposal gives a unified algorithmic implementation of both
signatures that
a) is much easier to grasp due to a clear mathematical set approach and
   safe iteration and doesn't become entangled in index arithmetic;
   stresses the set semantics of the 2nd argument
b) fully relies on the standard library for defined UTF-16
   handling/interpretation;
   paired surrogates are merged into one codepoint, unpaired surrogates
   are left as they are
c) scales much better with input sizes and result index position
d) can benefit from current and future improvements in the standard
   library and JVM
   (streams implementation, parallelization, JIT optimization, JEP 218,
   ???…)

The algorithm boils down to:
find index i of first char in cs such that
(cs.codePointAt(i) ∈ {x ∈ codepoints(cs) ∣ x ∉
codepoints(searchChars) })

Examples:
---------

<H>: high-surrogate character
<L>: low-surrogate character
(<H><L>): valid supplementary character
signature 1: StringUtils::indexOfAnyBut(final CharSequence seq,
final CharSequence searchChars)
signature 2: StringUtils::indexOfAnyBut(final CharSequence cs,
final char... searchChars)

Case 1: matching of unpaired high-surrogate
---------seq/cs-------searchChars------exp./new-----sig.1-------sig.2---

 1.1     <H>aaaa      <H>abcd          !found       !found      !found
  sig.2: 'a' happens to follow <H> in searchChars;
  sig.1: 'a' is somewhere in searchChars

 1.2     <H>baaa      <H>abcd          !found       !found      0
  sig.1: 'b' is somewhere in searchChars

 1.3     <H>aaaa      (<H><L>)abcd     0            !found      0
  sig.1: 'a' is somewhere in searchChars

 1.4     aaaa<H>      (<H><L>)abcd     4            !found      !found
  sig.1+2 don't interpret suppl. character

Case 2: matching of unpaired low-surrogate
---------seq/cs-------searchChars------exp./new-----sig.1-------sig.2---

 2.1     <L>aaaa      (<H><L>)abcd     0            !found      !found
  sig.1+2 don't interpret suppl. character

 2.2     aaaa<L>      (<H><L>)abcd     4            !found      !found
  sig.1+2 don't interpret suppl. character

Case 3: matching of supplementary character
---------seq/cs-------------searchChars-----exp./new----sig.1-----sig.2-

 3.1     (<H><L>)aaaa       <L>ab<H>cd      0           !found    0
  sig.1: <L> is somewhere in searchChars

 3.2     (<H><L>)aaaa       abcd            0           1         0
  sig.1 always points to low-surrogate of (fully) unmatched
  suppl. character

 3.3     (<H><L>)aaaa       abcd<H>         0           0         1
 3.4     (<H><L>)aaaa       abcd<L>         0           !found    0
  sig.1: <H> skipped by algorithm

* [StringUtils::indexOfAnyBut] further reduction of algorithm

by simplifying set consideration:
find index i of first char in seq such that (seq.codePointAt(i) ∉ { x ∈
codepoints(searchChars) })

* [StringUtils::indexOfAnyBut] simplify input-sequence iteration

by transforming ListIterator loop into index-based loop,
advancing by Character.charCount(codepoint);
enabling short-circuit processing, avoiding full in-advance processing of
input-sequence

* [StringUtils:indexOfAnyBut] parameterization of test functions

providing a single source-of-truth (arguments stream) for the two
function variants

* [StringUtils:indexOfAnyBut] remove comment

Set::contains of immutable Set has unclear disastrous performance issues
when searching for large values (here: >0xffff) in a set of smaller
values (including JDK 23)

---------

Co-authored-by: IBue <>
2025-01-06 17:22:42 -05:00
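
For reference, the index-based iteration described in commit 665f047e55 above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code merged in #1327: the class name IndexOfAnyButSketch, the use of Collectors.toSet(), and the handling of null or empty input are all choices made for this example.

```java
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch class; not part of Apache Commons Lang.
public final class IndexOfAnyButSketch {

    private static final int INDEX_NOT_FOUND = -1;

    private IndexOfAnyButSketch() {
    }

    /**
     * Returns the index of the first code point in seq that is not
     * contained in the set of code points of searchChars, or -1.
     */
    public static int indexOfAnyBut(final CharSequence seq, final CharSequence searchChars) {
        // Assumption: null or empty input yields "not found".
        if (seq == null || seq.length() == 0 || searchChars == null || searchChars.length() == 0) {
            return INDEX_NOT_FOUND;
        }
        // Paired surrogates are merged into one code point by the standard
        // library; unpaired surrogates are kept as they are.
        final Set<Integer> searchSet = searchChars.codePoints().boxed().collect(Collectors.toSet());
        // Index-based loop that advances by the width of each code point,
        // so it can stop at the first code point outside the set.
        for (int i = 0; i < seq.length();) {
            final int codePoint = Character.codePointAt(seq, i);
            if (!searchSet.contains(codePoint)) {
                return i;
            }
            i += Character.charCount(codePoint);
        }
        return INDEX_NOT_FOUND;
    }
}
```

Taking U+1F600 (\uD83D\uDE00) as the supplementary character, this sketch gives the values in the "exp./new" column for the cases it was checked against: case 1.1 is not found, cases 1.3 and 3.3 return 0, and cases 1.4 and 2.2 return 4.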
Gary Gregory 1f83019d15 Javadoc whitespace 2025-01-06 10:49:11 -05:00
Gary Gregory cc591bc0a2 better comment 2025-01-06 10:22:43 -05:00
Gary Gregory 70ff24dda0 Javadoc: Remove obsolete comments and clarify 2025-01-06 10:20:25 -05:00
Gary D. Gregory e55e2b7475 Javadoc
Remove extra blank lines
2025-01-05 10:50:52 -05:00
Gary D. Gregory b046e22faf Add Predicates
- Refactor to use Predicates,
- Explicitly test deprecated code
2025-01-05 10:49:18 -05:00
Gary D. Gregory 0d46368e15 Java 23 fails on Windows, back to experimental 2025-01-05 09:57:31 -05:00
Gary D. Gregory 835af71a58 Revert Javadoc failOnWarnings
Java 8 says:
[WARNING] javadoc: warning - Error fetching URL:
https://commons.apache.org/proper/commons-text/apidocs/
2025-01-05 09:37:44 -05:00
Gary D. Gregory 461b171885 Add Java 25-ea build as experimental 2025-01-05 09:29:43 -05:00
Gary D. Gregory 54c6018a3f Java 23 build is no longer experimental 2025-01-05 09:28:42 -05:00
Gary D. Gregory b60ef1ef6d Fail Javadoc on warnings 2025-01-05 09:28:01 -05:00
Gary D. Gregory ef3d5d93a3 Add ArrayUtils.startsWith() 2025-01-05 09:14:36 -05:00
Gary D. Gregory 4d77a45601 Add Apache license header 2025-01-03 10:14:07 -05:00
Gary Gregory c307291235 Update notice file copyright end date 2025-01-01 07:59:15 -05:00
dependabot[bot] 62b08ffdba
Bump github/codeql-action from 3.27.9 to 3.28.0 (#1340)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.27.9 to 3.28.0.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](df409f7d92...48ab28a6f5)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-27 10:35:27 -05:00
Gary D. Gregory 7131386054 Replace 2x empty lines with a single one 2024-12-26 09:34:20 -05:00
Gary D. Gregory 1a89cca97e Replace 2x empty lines with a single one 2024-12-26 09:32:44 -05:00
dependabot[bot] 41f68e6f9b
Bump actions/setup-java from 4.5.0 to 4.6.0 (#1339)
Bumps [actions/setup-java](https://github.com/actions/setup-java) from 4.5.0 to 4.6.0.
- [Release notes](https://github.com/actions/setup-java/releases)
- [Commits](8df1039502...7a6d8a8234)

---
updated-dependencies:
- dependency-name: actions/setup-java
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-20 10:22:45 -05:00
dependabot[bot] cef9d7bcd8
Bump actions/upload-artifact from 4.4.3 to 4.5.0 (#1338)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.4.3 to 4.5.0.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](b4b15b8c7c...6f51ac03b9)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-20 10:22:30 -05:00
Gary Gregory 35fee6998b Add BasicThreadFactory.builder() and deprecate
BasicThreadFactory.Builder()

Add BasicThreadFactory.daemon()
2024-12-18 09:53:58 -05:00
Gary Gregory 82640281c8 Use generics 2024-12-18 09:12:25 -05:00
Gary Gregory 95d3485cd4 Javadoc 2024-12-13 20:02:09 -05:00
Gary Gregory 5f31e5b18c
[test] Bump org.apache.commons:commons-text from 1.12.0 to 1.13.0 #1336 2024-12-13 14:37:07 -05:00
dependabot[bot] 6e943dda41
Bump org.apache.commons:commons-text from 1.12.0 to 1.13.0 (#1336)
Bumps org.apache.commons:commons-text from 1.12.0 to 1.13.0.

---
updated-dependencies:
- dependency-name: org.apache.commons:commons-text
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-13 14:35:48 -05:00
dependabot[bot] 48f61ac4f7
Bump github/codeql-action from 3.27.6 to 3.27.9 (#1335)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.27.6 to 3.27.9.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](aa57810251...df409f7d92)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-13 10:29:06 -05:00
Gary Gregory ccb13fb58c Javadoc 2024-12-11 16:54:03 -05:00
Gary Gregory 23f0816589 Javadoc 2024-12-10 07:48:34 -05:00
Gary Gregory f4803450c5 Javadoc: Add @see tag 2024-12-10 06:50:55 -05:00