Commit Graph

1709 Commits

Author SHA1 Message Date
Clint Wylie 2831d79871
update kafka dependency version to 3.9.0 (#17513)
* update kafka dependency version to 3.9.0

* update licenses.yaml
2024-11-27 12:14:05 +05:30
Akshat Jain dd46c7722d
Remove pre-java-11 profile (#17511)
We have removed support for Java 8 in #17466. This PR removes an unused profile pre-java-11 which activated for JDK < 11.
2024-11-26 08:43:20 +01:00
Akshat Jain 17215cd677
Remove support for Java 8 (#17466)
All JDK 8 based CI checks have been removed.
    Images used in Dockerfile(s) have been updated to Java 17 based images.
    Documentation has been updated accordingly.
2024-11-21 15:33:08 +05:30
Rishabh Singh 7f335ff486
Resolve CVEs: Upgrade jetty version and suppress azure cve (#17385) 2024-11-15 10:55:02 +05:30
Nandini Anagondi 32394e55f9
Upgrading org.codehaus to com.fasterxml (#17371) 2024-11-07 10:55:47 +01:00
Gian Merlino 446a8f466f
Update errorprone, mockito, jacoco, checkerframework. (#17414)
* Update errorprone, mockito, jacoco, checkerframework.

This patch updates various build and test dependencies, to see if they
cause unit tests on JDK 21 to behave more reliably.

* Update licenses, tests.

* Remove assertEquals.

* Repair two tests.

* Update some more tests.
2024-10-28 11:34:03 -07:00
Suraj Goel 7306d280cc
Migrate jaxb bind dependency to jakarta (#17370)
- Migrated from javax.xml.bind 2.3.1  to jakarta.xml.bind 2.3.3.
- Minor version is modified to avoid any breaking changes.
2024-10-26 21:24:17 -07:00
Shivam Garg 7d9e6d36fd
Upgraded Protobuf to 3.25.5 (#17249)
* Bump com.google.protobuf:protobuf-java from 3.24.0 to 3.25.5

Bumps [com.google.protobuf:protobuf-java](https://github.com/protocolbuffers/protobuf) from 3.24.0 to 3.25.5.
- [Release notes](https://github.com/protocolbuffers/protobuf/releases)
- [Changelog](https://github.com/protocolbuffers/protobuf/blob/main/protobuf_release.bzl)
- [Commits](https://github.com/protocolbuffers/protobuf/compare/v3.24.0...v3.25.5)

---
updated-dependencies:
- dependency-name: com.google.protobuf:protobuf-java
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* Updated the license

* Updated licenses.yaml

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-06 12:34:02 +05:30
Shivam Garg 93b5a8326b
Upgrade commons-io to 2.17.0 (#17227) 2024-10-04 09:56:56 +02:00
Abhishek Agarwal 421aae39ad
Upgrade avro - minor version (#17230) 2024-10-03 18:02:11 +05:30
Shivam Garg ab361747a8
Migrated commons-lang usages to commons-lang3 (#17156) 2024-09-28 10:28:11 +02:00
Rishabh Singh 953fe11e31
gRPC query extension (#15982)
Revives #14024 and additionally supports,

Native queries
gRPC health check endpoint
This PR doesn't have the shaded module for packaging gRPC and Guava libraries since grpc-query module uses the same Guava version as that of Druid.

The response is gRPC-specific. It provides the result schema along with the results as a binary "blob". Results can be in CSV, JSON array lines or as an array of Protobuf objects. If using Protobuf, the corresponding class must be installed along with the gRPC query extension so it is available to the Broker at runtime.
2024-09-19 12:16:32 +05:30
Misha 6aad9b08dd
Fix low sonatype findings (#17017)
Fixed vulnerabilities
CVE-2021-26291 : Apache Maven is vulnerable to Man-in-the-Middle (MitM) attacks. Various
functions across several files, mentioned below, allow for custom repositories to use the
insecure HTTP protocol. An attacker can exploit this as part of a Man-in-the-Middle (MitM)
attack, taking over or impersonating a repository using the insecure HTTP protocol.
Unsuspecting users may then have the compromised repository defined as a dependency in
their Project Object Model (pom) file and download potentially malicious files from it.
Was fixed by removing outdated tesla-aether library containing vulnerable maven-settings (v3.1.1) package, pull-deps utility updated to use maven resolver instead.

sonatype-2020-0244 : The joni package is vulnerable to Man-in-the-Middle (MitM) attacks.
This project downloads dependencies over HTTP due to an insecure repository configuration
within the .pom file. Consequently, a MitM could intercept requests to the specified
repository and replace the requested dependencies with malicious versions, which can execute
arbitrary code from the application that was built with them.
Was fixed by upgrading joni package to recommended 2.1.34 version
2024-09-16 16:10:25 +05:30
Akshat Jain 6ed8632420
Handle memory leaks from Mockito inline mocks (#17070) 2024-09-15 11:17:25 -07:00
Abhishek Radhakrishnan 7a0d7d1897
Bump up -Xmx2500m from 2GB and keep MaxDirectMemorySize as 2500m as well. (#17056) 2024-09-13 14:54:07 +05:30
Abhishek Radhakrishnan 668169d9a9
Provide `chmod` command for `-XX:OnOutOfMemoryError` from shell script (#17054)
A command line arg -XX:OnOutOfMemoryError='chmod 644 ${project.parent.basedir}/target/*.hprof' was added to collect heap dumps: #17029

This arg is causing problems when running tests from Intellij. Intellij doesn't seem to likechmod 644, but this command works as expected in mvn. So as a workaround, add the chmod 644 ${BASE_DIR/target/*.hprof' command in a shell script that can then be executed when OnOutOfMemoryError happens to make Intellij happy.
2024-09-13 00:17:28 -04:00
Abhishek Radhakrishnan c077daaade
GHA steps to collect and upload heap dumps to debug UT OOM errors (#17029)
* Add GHA steps to tar and upload any heap dumps on failure to debug UT OOM issues.

* Add jvm options to heap dump OnOutOfMemoryError

Co-authored-by: Elliott Freis <108356317+imply-elliott@users.noreply.github.com>

---------

Co-authored-by: Elliott Freis <108356317+imply-elliott@users.noreply.github.com>
2024-09-12 09:06:35 -04:00
Abhishek Agarwal 78775ad398
Prepare master for 32.0.0 release (#17022) 2024-09-10 11:01:20 +05:30
Zoltan Haindrich 26e3c44f4b
Quidem record (#16624)
* enables to launch a fake broker based on test resources (druidtest uri)
* could record queries into new testfiles during usage
* instead of re-purpose Calcite's Hook migrates to use DruidHook which we can add further keys
* added a quidem-ut module which could be the place for tests which could iteract with modules/etc
2024-08-05 14:58:32 +02:00
Alberic Liu 0eaa810e89
Fix the maven warning during build (#16746) 2024-07-18 14:56:15 +08:00
Kashif Faraz 6c87b1637b
Revert "Downgrade the version of Apache Curator from 5.5.0 to 5.3.0 to avoid a bug in the new version (#16425)" (#16688)
This reverts commit cb7c2c1e37.
2024-07-03 11:18:50 +05:30
Zoltan Haindrich ac19b148c2
Upgrade calcite to 1.37.0 (#16504)
* contains Make a full copy of the parser and apply our modifications to it #16503
* some minor api changes pair/entry
* some unnecessary aggregation was removed from a set of queries in `CalciteSubqueryTest`
* `AliasedOperatorConversion` was detecting `CHAR_LENGTH` as not a function ; I've removed the check
  * the field it was using doesn't look maintained that much
  * the `kind` is passed for the created `SqlFunction` so I don't think this check is actually needed
* some decoupled test cases become broken - will be fixed later
* some aggregate related changes: due to the fact that SUM() and COUNT() of no inputs are different
* upgrade avatica to 1.25.0
* `CalciteQueryTest#testExactCountDistinctWithFilter` is now executable

Close apache/druid#16503
2024-06-13 08:47:50 +02:00
dependabot[bot] 80db8cd93b
Bump org.openrewrite.maven:rewrite-maven-plugin from 5.27.0 to 5.31.0 (#16477) 2024-05-21 09:47:05 +02:00
Benedict Jin cb7c2c1e37
Downgrade the version of Apache Curator from 5.5.0 to 5.3.0 to avoid a bug in the new version (#16425) 2024-05-10 15:08:33 +05:30
Zoltan Haindrich 1811674753
Enable quidem tests to use different suppliers (#16382)
* enable quidem uri support for `druidtest:///?ComponentSupplier=Nested` and similar
* changes the way `SqlTestFrameworkConfig` is being applied; all options will have their own annotation (its kinda impossible to detect that an annotation has a set value or its the default)
* enables hierarchical processing of config annotation (was needed to enable class level supplier annotation)
* moves uri processing related string2config stuff into `SqlTestFrameworkConfig`
2024-05-09 09:21:02 +02:00
dependabot[bot] a2223ce821
Bump org.scala-lang:scala-library from 2.13.11 to 2.13.14 (#16364)
Bumps [org.scala-lang:scala-library](https://github.com/scala/scala) from 2.13.11 to 2.13.14.
- [Release notes](https://github.com/scala/scala/releases)
- [Commits](https://github.com/scala/scala/compare/v2.13.11...v2.13.14)

---
updated-dependencies:
- dependency-name: org.scala-lang:scala-library
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-06 22:06:23 +08:00
Alberic Liu 92fb0ff718
upgrade mysql:mysql-connector-java to 8.2.0 (#16024)
* upgrade mysql:mysql-connector-java to 8.2.0

* fix the check errors

* remove unused comment
2024-05-06 21:58:37 +08:00
Jan Werner b16401323b
update dependencies to address CVEs (#16374)
update dependencies to address new batch of CVEs:
- Azure POM from 1.2.19 to 1.2.23 to update transitive dependency nimbus-jose-jwt to address:  CVE-2023-52428
- commons-configuration2 from 2.8.0 to 2.10.1 to address: CVE-2024-29131 CVE-2024-29133
- bcpkix-jdk18on from 1.76 to 1.78.1 to address: CVE-2024-30172 CVE-2024-30171 CVE-2024-29857
2024-05-02 21:35:21 -07:00
Zoltan Haindrich 2d0e86cbdc
Use quidem to run tests (#16249)
* test scoped jdbc driver for druidtest:/// backed DruidAvaticaTestDriver
** DecoupledTestConfig is used inside the URI - this will make it possible to attach to existing things more easily
* DruidQuidemTestBase can be used to create module level set of quidem tests
* added quidem commands: !convertedPlan, !logicalPlan, !druidPlan, !nativePlan
** for these I've used some values of the Hook which was there in calcite
* there are some shortcuts with proxies(they are only used during testing) - we can probably remove those later
2024-05-02 02:12:42 -04:00
Adarsh Sanjeev 9a2d7c28bc
Prepare master branch for 31.0.0 release (#16333) 2024-04-26 09:22:43 +05:30
Jan Werner c45da431fb
update netty and zookeeper dependencies to address CVEs (#16267)
Update dependencies to address CVEs: 
- Update netty from 4.1.107.Final to 4.1.108.Final to address: CVE-2024-29025 
- Update zookeeper from 3.8.3 to 3.8.4 to address: CVE-2024-23944


Release notes:
- Update netty from 4.1.107.Final to 4.1.108.Final to address: CVE-2024-29025 
- Update zookeeper from 3.8.3 to 3.8.4 to address: CVE-2024-23944
2024-04-15 20:40:50 -07:00
sullis f4649fece9
Bump openrewrite plugin + recipes (#16238) 2024-04-08 15:13:57 +05:30
Zoltan Haindrich 0a42342cef
Update Calcite*Test to use junit5 (#16106)
* Update Calcite*Test to use junit5

* change the way temp dirs are handled
* add openrewrite workflow to safeguard upgrade
* replace junitparamrunner with standard junit5 parametered tests
* update a few rules to junit5 api
* lots of boring changes

* cleanup QueryLogHook

* cleanup

* fix compile error: ARRAYS_DATASOURCE

* fix test

* remove enclosed

* empty

+TEST:TDigestSketchSqlAggregatorTest,HllSketchSqlAggregatorTest,DoublesSketchSqlAggregatorTest,ThetaSketchSqlAggregatorTest,ArrayOfDoublesSketchSqlAggregatorTest,BloomFilterSqlAggregatorTest,BloomDimFilterSqlTest,CatalogIngestionTest,CatalogQueryTest,FixedBucketsHistogramQuantileSqlAggregatorTest,QuantileSqlAggregatorTest,MSQArraysTest,MSQDataSketchesTest,MSQExportTest,MSQFaultsTest,MSQInsertTest,MSQLoadedSegmentTests,MSQParseExceptionsTest,MSQReplaceTest,MSQSelectTest,InsertLockPreemptedFaultTest,MSQWarningsTest,SqlMSQStatementResourcePostTest,SqlStatementResourceTest,CalciteSelectJoinQueryMSQTest,CalciteSelectQueryMSQTest,CalciteUnionQueryMSQTest,MSQTestBase,VarianceSqlAggregatorTest,SleepSqlTest,SqlRowTransformerTest,DruidAvaticaHandlerTest,DruidStatementTest,BaseCalciteQueryTest,CalciteArraysQueryTest,CalciteCorrelatedQueryTest,CalciteExplainQueryTest,CalciteExportTest,CalciteIngestionDmlTest,CalciteInsertDmlTest,CalciteJoinQueryTest,CalciteLookupFunctionQueryTest,CalciteMultiValueStringQueryTest,CalciteNestedDataQueryTest,CalciteParameterQueryTest,CalciteQueryTest,CalciteReplaceDmlTest,CalciteScanSignatureTest,CalciteSelectQueryTest,CalciteSimpleQueryTest,CalciteSubqueryTest,CalciteSysQueryTest,CalciteTableAppendTest,CalciteTimeBoundaryQueryTest,CalciteUnionQueryTest,CalciteWindowQueryTest,DecoupledPlanningCalciteJoinQueryTest,DecoupledPlanningCalciteQueryTest,DecoupledPlanningCalciteUnionQueryTest,DrillWindowQueryTest,DruidPlannerResourceAnalyzeTest,IngestTableFunctionTest,QueryTestRunner,SqlTestFrameworkConfig,SqlAggregationModuleTest,ExpressionsTest,GreatestExpressionTest,IPv4AddressMatchExpressionTest,IPv4AddressParseExpressionTest,IPv4AddressStringifyExpressionTest,LeastExpressionTest,TimeFormatOperatorConversionTest,CombineAndSimplifyBoundsTest,FiltrationTest,SqlQueryTest,CalcitePlannerModuleTest,CalcitesTest,DruidCalciteSchemaModuleTest,DruidSchemaNoDataInitTest,InformationSchemaTest,NamedDruidSchemaTest,NamedLookupSchemaTest,NamedSystemSchemaTest,RootSchemaProviderTest,SystemSchemaTest,CalciteTestBase,SqlResourceTest

* use @Nested

* add rule to remove enclosed; upgrade surefire

* remove enclosed

* cleanup

* add comment about surefire exclude
2024-03-19 04:05:12 -07:00
sullis 148ad32e75
netty 4.1.107 (#16027)
* netty 4.1.107

* update licenses.yaml
2024-03-11 15:57:44 +08:00
Jan Werner 834a0ad9f1
update jose4j and corresponding license file (#16078)
Update org.bitbucket.b_c:jose4j from 0.9.3 to 0.9.6. to resolve https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-51775

fixes #16075
2024-03-08 07:36:07 -08:00
Jan Werner a7b2747e56
remove aws-sdk from ranger-extension (#16011)
Fixes # size blowup regression introduced in https://github.com/apache/druid/pull/15443

This PR removes the transitive dependency of ranger-plugins-audit to reduce the size of the compiled artifacts

* add aws-logs-sdk to ensure that all the transitive dependencies are satisfied
* replace aws-bundle-sdk with aws-logs-sdk
* add additional guidance on ranger update, add dependency ignore to satisfy dependency analyzer
* add aws-sdk-logs to list of ignored dependencies to satisfy the maven plugin
* align aws-sdk versions
2024-03-08 07:35:29 -08:00
Zoltan Haindrich bf0995f846
Introduce dynamic table append (#15897) 2024-03-01 04:31:57 -05:00
Jan Werner baaa4a6808
update common-compress to address CVE-2024-25710 CVE-2024-26308 (#16009)
* Update common-compress to 1.26.0 to address CVEs CVE-2024-25710 CVE-2024-26308
* Add commons-codec as a runtime dependency required by common-compress 1.26.0

---------

Co-authored-by: Xavier Léauté <xl+github@xvrl.net>
2024-02-29 14:05:31 -08:00
Jan Werner d6f59d1999
update jetty to address CVE (#16000) 2024-02-29 09:27:31 +08:00
dependabot[bot] 3011829419
Bump log4j.version from 2.18.0 to 2.22.1 (#15934)
* Bump log4j.version from 2.18.0 to 2.22.1

Bumps `log4j.version` from 2.18.0 to 2.22.1.

Updates `org.apache.logging.log4j:log4j-api` from 2.18.0 to 2.22.1

Updates `org.apache.logging.log4j:log4j-core` from 2.18.0 to 2.22.1

Updates `org.apache.logging.log4j:log4j-slf4j-impl` from 2.18.0 to 2.22.1

Updates `org.apache.logging.log4j:log4j-1.2-api` from 2.18.0 to 2.22.1

Updates `org.apache.logging.log4j:log4j-jul` from 2.18.0 to 2.22.1

---
updated-dependencies:
- dependency-name: org.apache.logging.log4j:log4j-api
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: org.apache.logging.log4j:log4j-core
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: org.apache.logging.log4j:log4j-slf4j-impl
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: org.apache.logging.log4j:log4j-1.2-api
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: org.apache.logging.log4j:log4j-jul
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update License

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: frank chen <frank.chen021@outlook.com>
2024-02-23 16:19:35 +08:00
dependabot[bot] 936ba25e85
Bump org.postgresql:postgresql from 42.6.0 to 42.7.2 (#15931)
* Bump org.postgresql:postgresql from 42.6.0 to 42.7.2

Bumps [org.postgresql:postgresql](https://github.com/pgjdbc/pgjdbc) from 42.6.0 to 42.7.2.
- [Release notes](https://github.com/pgjdbc/pgjdbc/releases)
- [Changelog](https://github.com/pgjdbc/pgjdbc/blob/master/CHANGELOG.md)
- [Commits](https://github.com/pgjdbc/pgjdbc/commits)

---
updated-dependencies:
- dependency-name: org.postgresql:postgresql
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update License

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: frank chen <frank.chen021@outlook.com>
2024-02-23 16:19:26 +08:00
Jamie 80942d5754
Feature: add support for ingesting from rabbitmq super streams (#14137)
* Add support for ingesting from Rabbit MQ Super Streams
2024-02-22 10:50:37 +05:30
Zoltan Haindrich bcce0806d7
Support Union in decoupled mode (#15870) 2024-02-21 10:54:50 -05:00
Parth Agrawal 495e66f2e7
CVE Fix: Update json-path version (#15772)
Apache Druid brings the dependency json-path which is affected by CVE-2023-51074.
Its latest version 2.9.0 fixes the above CVE.

Append function has been added to json-path and so the unit test to check for the append function not present has been updated.

---------

Co-authored-by: Xavier Léauté <xvrl@apache.org>
2024-02-14 20:58:27 -08:00
Vishesh Garg 5de39c6251
Resolve CVE issues (#15814)
* Resolve CVE issues

* Update license.yaml
2024-02-01 14:10:12 +05:30
Abhishek Radhakrishnan 9f95a691f7
Extension to read and ingest Delta Lake tables (#15755)
* something

* test commit

* compilation fix

* more compilation fixes (fixme placeholders)

* Comment out druid-kereberos build since it conflicts with newly added transitive deps from delta-lake

Will need to sort out the dependencies later.

* checkpoint

* remove snapshot schema since we can get schema from the row

* iterator bug fix

* json json json

* sampler flow

* empty impls for read(InputStats) and sample()

* conversion?

* conversion, without timestamp

* Web console changes to show Delta Lake

* Asset bug fix and tile load

* Add missing pieces to input source info, etc.

* fix stuff

* Use a different delta lake asset

* Delta lake extension dependencies

* Cleanup

* Add InputSource, module init and helper code to process delta files.

* Test init

* Checkpoint changes

* Test resources and updates

* some fixes

* move to the correct package

* More tests

* Test cleanup

* TODOs

* Test updates

* requirements and javadocs

* Adjust dependencies

* Update readme

* Bump up version

* fixup typo in deps

* forbidden api and checkstyle checks

* Trim down dependencies

* new lines

* Fixup Intellij inspections.

* Add equals() and hashCode()

* chain splits, intellij inspections

* review comments and todo placeholder

* fix up some docs

* null table path and test dependencies. Fixup broken link.

* run prettify

* Different test; fixes

* Upgrade pyspark and delta-spark to latest (3.5.0 and 3.0.0) and regenerate tests

* yank the old test resource.

* add a couple of sad path tests

* Updates to readme based on latest.

* Version support

* Extract Delta DateTime converstions to DeltaTimeUtils class and add test

* More comprehensive split tests.

* Some test renames.

* Cleanup and update instructions.

* add pruneSchema() optimization for table scans.

* Oops, missed the parquet files.

* Update default table and rename schema constants.

* Test setup and misc changes.

* Add class loader logic as the context class loader is unaware about extension classes

* change some table client creation logic.

* Add hadoop-aws, hadoop-common and related exclusions.

* Remove org.apache.hadoop:hadoop-common

* Apply suggestions from code review

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

* Add entry to .spelling to fix docs static check

---------

Co-authored-by: abhishekagarwal87 <1477457+abhishekagarwal87@users.noreply.github.com>
Co-authored-by: Laksh Singla <lakshsingla@gmail.com>
Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
2024-01-30 21:53:50 -08:00
Zoltan Haindrich 2eba20d724
Fix minor build issues and stabilize intellij-inspections runs (#15747)
* Possibly stabilize intellij-inspections

* remove `integration-tests-ex/cases` from excluded projects from initial build
* enable ErrorProne's `CheckedExceptionNotThrown` to get earlier errors than intellij-inspections

* fix ddsketch pom.xml

* fix spellcheck
2024-01-24 15:17:33 +05:30
Hiroshi Fukada 3fe3a65344
New: Add DDSketch in extensions-contrib (#15049)
* New: Add DDSketch-Druid extension

- Based off of http://www.vldb.org/pvldb/vol12/p2195-masson.pdf and uses
 the corresponding https://github.com/DataDog/sketches-java library
- contains tests for post building and using aggregation/post
  aggregation.
- New aggregator: `ddSketch`
- New post aggregators: `quantileFromDDSketch` and
  `quantilesFromDDSketch`

* Fixing easy CodeQL warnings/errors

* Fixing docs, and dependencies

Also moved aggregator ids to AggregatorUtil and PostAggregatorIds

* Adding more Docs and better null/empty handling for aggregators

* Fixing docs, and pom version

* DDSketch documentation format and wording
2024-01-23 20:17:07 +05:30
Karan Kumar c4990f56d6
Prepare main branch for next 30.0.0 release. (#15707) 2024-01-23 15:55:54 +05:30
Gian Merlino d3d0c1c91e
Faster parsing: reduce String usage, list-based input rows. (#15681)
* Faster parsing: reduce String usage, list-based input rows.

Three changes:

1) Reworked FastLineIterator to optionally avoid generating Strings
   entirely, and reduce copying somewhat. Benefits the line-oriented
   JSON, CSV, delimited (TSV), and regex formats.

2) In the delimited (TSV) format, when the delimiter is a single byte,
   split on UTF-8 bytes directly.

3) In CSV and delimited (TSV) formats, use list-based input rows when
   the column list is provided upfront by the user.

* Fix style.

* Fix inspections.

* Restore validation.

* Remove fastutil-extra.

* Exception type.

* Fixes for error messages.

* Fixes for null handling.
2024-01-18 19:18:46 +08:00