Commit Graph

68 Commits

Author SHA1 Message Date
Matthew Burgess b9076ca26e NIFI-7989: Add Update Field Names and Record Writer to UpdateHiveTable processors
NIFI-7989: Only rewrite records if a field name doesn't match a table column name exactly
NIFI-7989: Rewrite records for created tables if Update Field Names is set

This closes #4750.

Signed-off-by: Peter Turcsanyi <turcsanyi@apache.org>
2021-01-13 23:26:44 +01:00
Matthew Burgess f29d6a6046 NIFI-7989: Add support to UpdateHiveTable for creating external tables
NIFI-7989: Add support for creating partitions, quote identifiers
NIFI-7989: Quote table name when getting description

This closes #4697.

Signed-off-by: Peter Turcsanyi <turcsanyi@apache.org>
2020-12-15 22:05:53 +01:00
exceptionfactory cfbcecc4c6
NIFI-7884 Added and applied Distributed File System permissions (#4713) 2020-12-08 15:56:46 -05:00
Matthew Burgess edc060bd92 NIFI-7989: Add UpdateHiveTable processors for data drift capability
NIFI-7989: Allow for optional blank line after optional column and partition headers
NIFI-7989: Incorporated review comments
NIFI-7989: Close Statement when finishing processing
NIFI-7989: Remove database name property, update output table attribute

This closes #4653.

Signed-off-by: Peter Turcsanyi <turcsanyi@apache.org>
2020-11-18 00:39:07 +01:00
Pierre Villard 14ec02f21d
NIFI-7981 - add support for enum type in avro schema
This closes #4648

Signed-off-by: Mike Thomsen <mthomsen@apache.org>
2020-11-05 18:19:55 -05:00
Mohammed Nadeem e2ccfbbacf
NIFI-7859: Support for capturing execution duration of query run as attributes in SelectHiveQL processors
Signed-off-by: Matthew Burgess <mattyb149@apache.org>

This closes #4560
2020-10-01 09:06:01 -04:00
Matthew Burgess 7e145142e1
NIFI-7799: Relogin with Kerberos on connect exception in DBCPConnectionPool (#4519) 2020-09-17 13:33:54 -04:00
Matthew Burgess 45470b0984 NIFI-7740: Add Records Per Transaction and Transactions Per Batch properties to PutHive3Streaming
NIFI-7740: Incorporated review comments

NIFI-7740: Restore RecordsEOFException superclass to SerializationError

This closes #4489.

Signed-off-by: Peter Turcsanyi <turcsanyi@apache.org>
2020-09-01 17:17:02 +02:00
Joe Witt 8baa5c9940
NIFI-7692 updating for next dev release 1.13.0 2020-08-18 14:48:02 -07:00
Joe Witt fb57bcbc11
NIFI-7692-RC1 prepare for next development iteration 2020-08-13 09:20:39 -07:00
Joe Witt 303d6c59ba
NIFI-7692-RC1 prepare release nifi-1.12.0-RC1 2020-08-13 09:20:36 -07:00
Bence Simon 5c2bfcf7d3 NIFI-7369 Adding decimal support for record handling in order to avoid missing precision when reading in records
Signed-off-by: Mark Payne <markap14@hotmail.com>
2020-06-02 15:13:14 -04:00
Matthew Burgess 53a161234e
NIFI-7448: Fix quoting of DDL table name in PutORC
Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

This closes #4269.
2020-05-14 11:41:35 +02:00
Matthew Burgess b7c81d6007
NIFI-7428: Switch hive.version property to set Hive 3 version
This closes #4259
2020-05-06 19:38:23 -04:00
Joe Witt f694e6464f NIFI-7187 adding missing version strings from accumulo bundle pom
- Removed Cat X JSON.org dep inclusion which seems to not be necessary
- Updated a ton of easier/safer looking deps
- Updated tika due to CVE

This closes #4086

Signed-off-by: Mark Payne <markap14@hotmail.com>
2020-03-20 10:07:56 -04:00
Joe Witt 97e250cdaa
NIFI-7244 Updated all tests which dont run well on windows to either work or be ignored on windows
Also dealt with unreliable tests which depend on timing by ignoring them or converting to IT.

Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

This closes #4132.
2020-03-12 19:13:59 +01:00
jstorck 4b6de8d164
NIFI-7025: Wrap Hive 3 calls with UGI.doAs
Updated PutHive3Streaming to wrap calls to Hive in UGI.doAs methods
Fixed misleading logging message after the principal has been authenticated with the KDC
When connecting to unsecured Hive 3, a UGI with "simple" auth will be used

Signed-off-by: Matthew Burgess <mattyb149@apache.org>

This closes #4108
2020-03-04 11:21:27 -05:00
jstorck 1678531638
NIFI-7025: Initial commit adding Kerberos Password feature for Hive components
Kerberos Password property should not support EL, this includes a change to KerberosProperties which is also used by the HDFS processors (AbstractHadoopProcessor)
Added wiring in a KerberosContext to a TestRunner's MockProcessorInitializationContext
Removed synchronization blocks around KerberosUser.checkTGTAndRelogin, since that method is already synchronized
Updated AbstractHadoopProcessor to have a boolean accessor method to determine if explicit keytab configuration is allowed
Removed synchronization block from HiveConnectionPool's getConnection method (in Hive, Hive_1_1, Hive3 modules), since new TGT ticket acquisition is handled by the KerberosUser implementation.  If UGI is used to relogin, synchronization is handled internally by UGI.
Added Kerberos Principal and Kerberos Password properties to Hive, Hive_1_1, and Hive3 components
Hive, Hive_1_1, and Hive3 components now use KerberosUser implementations to authenticate with a KDC

Updated handling of the NIFI_ALLOW_EXPLICIT_KEYTAB environment variable in Hive and Hive3 components.  An accessor method has been added that uses Boolean.parseBoolean, which returns true if the environment variable is set to true, and false otherwise (including when the environment variable is unset).

Addressing PR feedback

Addressing PR feedback

This closes #4102.
2020-03-02 11:28:59 -05:00
jstorck 614136ce51
NIFI-7018: Initial commit of processors extending AbstractHadoopProcessor supporting kerberos passwords
AbstractHadoopProcessor will always authenticate the principal with a KerberosUser implementation and a UGI will be acquired from the Subject associated with the KerberosUser implementation
AbstractHadoopProcessor's getUserGroupInformation method will now attempt to check the TGT and relogin if a KerberosUser impelmentation is available, otherwise it will return the UGI referenced in the HdfsResource instance
Updated AbstractHadoopProcessor's customValidate method to consider the provided password and updated validation failure explanations when a KerberosCredentialsService is specified together with a principal, password, or keytab
Added toString method override to AbstractKerberosUser
Updated Hive/HBase components to be compatible with the KerberosProperties.validatePrincipalWithKeytabOrPassword method
Fixed null ComponentLog in GetHDFSSequenceFileTest

Added package-protected accessor method (getAllowExplicitKeytabEnvironmentVariable) to AbstractHadoopProcessor for determining if the environment variable "NIFI_ALLOW_EXPLICIT_KEYTAB" has been set
AbstractHadoopProcessor will now only fail validation when the NIFI_ALLOW_EXPLICIT_KEYTAB environment variable is set to false if a keytab is provided to allow the user to specify a principal and password
Added AbstractHadoopProcessorSpec to verify validation of principal/keytab/password/kerberos credential service combinations

This closes #4095.
2020-02-28 10:10:19 -05:00
Joe Witt 3de77ebacc
NIFI-7021-RC3 prepare for next development iteration 2020-01-19 14:14:40 -05:00
Joe Witt 633408bce7
NIFI-7021-RC3 prepare release nifi-1.11.0-RC3 2020-01-19 14:14:38 -05:00
Matthew Burgess 0024668dd5
NIFI-6952: Evaluate EL for Hive3ConnectionPool properties
Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

This closes #3937.
2019-12-19 11:51:18 +01:00
Matthew Burgess f398299cfe NIFI-6956: Fixed StreamingException handling in PutHive3Streaming (#3941) 2019-12-17 14:49:04 -05:00
Matthew Burgess 6d1bcc22e7 NIFI-6925: Fixed JoltTransformRecord for RecordReaders, improved MockProcessSession (#3913)
* NIFI-6925: Fixed JoltTransformRecord for RecordReaders, improved MockProcessSession

* Fixed logic for no records, added unit test

* Fixed PutElasticsearchHttpRecord and PutHive3Streaming, same bug as JoltTransformRecord

* Added null checks
2019-12-17 11:42:24 -05:00
Koji Kawamura b3880a4a06
NIFI-5970 Handle multiple input FlowFiles at Put.initConnection
Signed-off-by: Matthew Burgess <mattyb149@apache.org>

This closes #3583
2019-11-19 16:48:14 -05:00
Joe Witt f8c3d877cf
NIFI-6733 updating to next release version for master branch 2019-11-04 13:31:39 -05:00
Joe Witt 418179f5b2
NIFI-6733-RC3 prepare for next development iteration 2019-10-28 15:13:13 -07:00
Joe Witt b217ae20ad
NIFI-6733-RC3 prepare release nifi-1.10.0-RC3 2019-10-28 15:12:57 -07:00
Peter Wicks 555004cdde
NIFI-6684 - Fixing checkstyle violation (#3788)
signed-off by: Peter Wicks <patricker@gmail.com>
2019-10-04 08:41:56 -06:00
James Cheng f1d35f46ce NIFI-6684 Add more property to Hive3ConnectionPool (#3763)
* NiFi-6684 Add more property to Hive3ConnectionPool

signed-off by: Peter Wicks <patricker@gmail.com>
2019-10-03 07:24:29 -06:00
archon 8a8852e73d
NIFI-6480: PutORC/PutParquet can't overwrite file even if set 'Overwrite Files' to true
This closes #3599.

Signed-off-by: Bryan Bende <bbende@apache.org>
2019-09-13 12:08:27 -04:00
korir 3d1bb09ff8
fix deserialization issues with NiFiRecordSerDe for hive3streaming
NIFI-6295: Refactored NiFiRecordSerDe to handle nested complex types

NIFI-6295: Incorporated review comments

NIFI-6295: Refactored unit tests for NiFiRecordSerDe
Signed-off-by: Matthew Burgess <mattyb149@apache.org>

This closes #3684
2019-09-13 09:08:57 -04:00
Bryan Bende 76c2c3fee2
NIFI-6089 Add Parquet record reader and writer
NIFI-5755 Allow PutParquet processor to specify avro write configuration
Review feedback
Additional review feedback

This closes #3679

Signed-off-by: Mike Thomsen <mthomsen@apache.org>
2019-09-04 08:52:17 -04:00
Alessandro D'Armiento 9071e5baa7
- Removed unused AUTOCREATE_PARTITIONS from PutHive3Streaming
- Renamed PARTITION_VALUES to STATIC_PARTITION_VALUES for correctness and better understanding
- STATIC_PARTITION_VALUES descriptions clearly states that having that property filler implies Hive Static Partitioning

NIFI-6536: Additional documentation for Static Partition Values

Signed-off-by: Matthew Burgess <mattyb149@apache.org>

This closes #3631
2019-09-03 15:45:08 -04:00
Tamas Palfy 2baafcc2e6
NIFI-6534 Improve logging in PutHive3Streaming processor
Added logging for PutHive3Streaming when routing to failure

Signed-off-by: Matthew Burgess <mattyb149@apache.org>

This closes #3640
2019-08-09 17:19:26 -04:00
Jeff Storck 1d560e2b02 NIFI-6360 Updated Mockito to 2.28.2, PowerMock to 2.0.2
Fixed test failures in nifi-couchbase-processors, BinaryDocument matcher replaced with ByteArrayDocument
Fixed test failures in nifi-riemann-processors, anyInt() matcher replaced with anyLong() matcher, calling method passes a long, not int
Removed unnecessary method mocks from nifi-toolkit-tls tests, TlsCertificateAuthorityServiceHandlerTest and TlsCertificateSigningRequestPerformerTest, since those were flagged by Mockito as unnecessary (they're unused)
Removed explicit mockito dependency version in nifi-gcp-processors pom to inherit version from nifi's pom.xml
Updated ArgumentMatchers in Kafka 0.10, 0.11, 1.0, and 2.0 processor tests, since in Mockito 2.x, the "any" matchers no longer allow nulls
Updated ArgumentMatchers in nifi-jolt-transform-json-ui, since in Mockito 2.x, the "any" matchers no longer allow nulls
Removed unnecessary method mocks from MetricsReportingTaskTest
Updated TestStandardRemoteGroupPort to return Long instead of Integer for test flowfile.size() invocations
Updated AbstractCassandraProcessor to include keyspaceProperty.getValue() in null check
Updated SimpleProcessLogger and TestSimpleProcessLogger, vararg matching does not work the same in Java 8 and 11
Updated TestStandardProcessScheduler to allow null values during mock invocations, Mockito 2.x no longer allows nulls in those matchers
Updated TestPutHiveStreaming to allow null values during mock invocations, Mockito 2.x no longer allows nulls in those matchers
Updated FetchParquetTest to allow null values during mock invocations, Mockito 2.x no longer allows nulls in those matchers
Updated ControllerSearchServiceTest to allow null values during mock invocations, Mockito 2.x no longer allows nulls in those matchers
Removed usage of Whitebox from GetAzureEventHubTest due to Mockito 2.x, replaced with FieldUtils
Removed usage of Whitebox from StandardOidcIdentityProviderTest due to Mockito 2.x, replaced with FieldUtils
Updated apache-rat-plugin configuration in root POM to make use of useIdeaDefaultExcludes which makes the rat plugin exclude IntelliJ artifacts
Updated several modules to use mockito-core instead of mockito-all (discontinued in Mockito 2.x)
Updated nifi-site-to-site-reporting-task tests to be compatible with Mockito 2.x
Ignored TestPutJMS tests; the tests need to be refactored to work with Mockito 2.x, but the processor is deprecated.  Refactor may be done in a separate PR.
Adjusted several mock interaction iterations to 0 for TestPublishKafkaRecord_* tests.  Mockito 2.x flagged several interactions as unused and were adjusted to 0 interactions.
Updated PowerMock and Mockito dependencies to exclude transitive dependency on bytebuddy, added explicit dependency on bytebuddy 1.9.10 so that PowerMock and Mockito use the same version.  Bytebuddy 1.9.3 (used by PowerMock 2.0.2) did not allow for the mocking of final/private classes, bytebuddy 1.9.10 (used by Mockito 2.28.2) does.
Updated TestSiteToSiteProvenanceReportingTask use of InvocationOnMock.getArgument to use objects for the resulting object rather than primitives
Removed unnecessary stubs from evtx tests, Mockito 2.x defaults to strict mocks
Fixed classloader issue with tests in nifi-windows-event-log-processors module that use JNAJUnitRunner when Mockito mocked JNA classes (Kernel32)
Addressed Mockito-related deprecation warnings
Import cleanup

This closes #3533

Signed-off-by: Mike Thomsen <mikerthomsen@gmail.com>
2019-06-17 12:21:07 -04:00
Andy LoPresto e6c843f465
NIFI-6323 Changed URLs for repositories, project description, and mailing lists to use HTTPS.
NIFI-6323 Changed URLs for splunk.artifactoryonline.com to use HTTPS (certificate validity warning in browsers, but command-line connection using openssl s_client is successful).
NIFI-6323 Changed URLs for XMLNS schema locations to use HTTPS (the XMLNS and schema identifier remain http:// because they are not designed to be resolvable).
NIFI-6323 Fixed Maven XML schema descriptor URLs.

This closes #3497
2019-05-29 14:36:40 -04:00
Matthew Burgess 97690275ac
NIFI-6210: Applied NIFI-5134 Kerberos TGT renewal to Hive3ConnectionPool
This closes #3432.

Signed-off-by: Bryan Bende <bbende@apache.org>
2019-04-12 15:09:52 -04:00
joewitt 0e204f3576
NIFI-6029-RC2 prepare for next development iteration 2019-02-16 21:50:35 -05:00
joewitt 45bb53d2aa
NIFI-6029-RC2 prepare release nifi-1.9.0-RC2 2019-02-16 21:50:15 -05:00
Mark Payne 36c0a99e91 NIFI-5938: Added ability to infer record schema on read from JsonTreeReader, JsonPathReader, XML Reader, and CSV Reader.
- Updates to make UpdateRecord and RecordPath automatically update Record schema when performing update and perform the updates on the first record in UpdateRecord before obtaining Writer Schema. This allows the Writer to  to inherit the Schema of the updated Record instead of the Schema of the Record as it was when it was read.
 - Updated JoltTransformRecord so that schema is inferred on the first transformed object before passing the schema to the Record Writer, so that if writer inherits schema from record, the schema that is inherited is the trans transformed schema
 - Updated LookupRecord to allow for Record fields to be arbitrarily added
 - Implemented ContentClaimInputStream
 - Added controller service for caching schemas
 - UpdatedQueryRecord to cache schemas automatically up to some number of schemas, which will significantly inprove throughput in many cases, especially with inferred schemas.

NIFI-5938: Updated AvroTypeUtil so that if creating an Avro Schema using a field name that is not valid for Avro, it creates a Schema that uses a different, valid field name and adds an alias for the given field name so that the fields still are looked up appropriately. Fixed a bug in finding the appropriate Avro field when aliases are used. Updated ContentClaimInputStream so that if mark() is called followed by multiple calls to reset(), that each reset() call is successful instead of failing after the first one (the JavaDoc for InputStream appears to indicate that the InputStream is free to do either and in fact the InputStream is even free to allow reset() to reset to the beginning of file if mark() is not even called, if it chooses to do so instead of requiring a call to mark()).

NIFI-5938: Added another unit test for AvroTypeUtil

NIFI-5938: If using inferred schema in CSV Reader, do not consider first record as a header line. Also addressed a bug in StandardConfigurationContext that was exposed by CSVReader, in which calling getProperty(PropertyDescriptor) did not properly lookup the canonical representation of the Property Descriptor from the component before attempting to get a default value

Signed-off-by: Matthew Burgess <mattyb149@apache.org>

This closes #3253
2019-02-11 12:56:50 -05:00
Koji Kawamura 05de73d6a0 NIFI-5951 Fix error logging with rollback on failure
Signed-off-by: Matthew Burgess <mattyb149@apache.org>

This closes #3264
2019-01-17 17:11:39 -05:00
Alex Savitsky 3e52ae952d NIFI-5909 added optional settings for date, time, and timestamp formats used to write Records to Elasticsearch
NIFI-5909 added content checks to the unit tests

NIFI-5937 use explicit long value for test dates/times (to not depend on the timezone of test executor)

NIFI-5937 tabs to spaces

Fixing checkstyle violations introduced by https://github.com/apache/nifi/pull/3249 PR)

NIFI-5937 adjusted property descriptions for consistency; limited EL scope to variable registry; added an appropriate validator along with its Maven dependency; moved format initialization to @OnScheduled

NIFI-5909 tabs to spaces

Signed-off-by: Ed <edward.berezitsky@gmail.com>

This closes #3227
2019-01-09 21:55:09 -05:00
kei miyauchi e6e4175d71 NIFI-5841 Fix memory leak of PutHive3Streaming.
This closes #3249.

Signed-off-by: Koji Kawamura <ijokarumawak@apache.org>
2019-01-09 12:06:27 +09:00
Koji Kawamura 595a2decc6
NIFI-5917 Fix TestSelectHiveQL.testNoTimeLimit
Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

This closes #3237.
2019-01-03 11:31:38 +01:00
Matthew Burgess d34789881b NIFI-5904: Fix PutHive3Streaming handling of RecordReaderFactoryException 2018-12-17 14:12:18 -05:00
gkkorir c51512f5e3 NIFI-5891 fix handling of null logical types in Hive3Streaming processor
NIFI-5891: Fixed Checkstyle issues
Signed-off-by: Matthew Burgess <mattyb149@apache.org>

This closes #3216
2018-12-13 10:23:18 -05:00
Matthew Burgess 68a49cfad0 NIFI-5845: Add support for OTHER and SQLXML JDBC types to SQL/Hive processors
NIFI-5845: Incorporated review comments

This closes #3184.

Signed-off-by: Koji Kawamura <ijokarumawak@apache.org>
2018-11-29 09:50:21 +09:00
Matthew Burgess 455e3c1bc8 NIFI-5834: Restore default PutHiveQL error handling behavior
NIFI-5834: Incorporated review comments

This closes #3179.

Signed-off-by: Koji Kawamura <ijokarumawak@apache.org>
2018-11-27 18:10:26 +09:00
Pierre Villard 9e7610ac70 NIFI-5815 - PutORC processor 'Restricted' still requires access to restricted components regardless of restriction
This closes #3169.

Signed-off-by: Koji Kawamura <ijokarumawak@apache.org>
2018-11-14 13:50:00 +09:00