Commit Graph

48 Commits

Author SHA1 Message Date
Jeff Storck 42a1ee011b NIFI-4323 This closes #2360. Wrapped Get/ListHDFS hadoop operations in ugi.doAs calls
NIFI-3472 NIFI-4350 Removed explicit relogin code from HDFS/Hive/HBase components and updated SecurityUtils.loginKerberos to use UGI.loginUserFromKeytab. This brings those components in line with daemon-process-style usage, made possible by NiFi's InstanceClassloader isolation.  Relogin (on ticket expiry/connection failure) can now be properly handled by hadoop-client code implicitly.
NIFI-3472 Added default value (true) for javax.security.auth.useSubjectCredsOnly to bootstrap.conf
NIFI-3472 Added javadoc explaining the removal of explicit relogin threads and usage of UGI.loginUserFromKeytab
Readded Relogin Period property to AbstractHadoopProcessor, and updated its documentation to indicate that it is now a deprecated property
Additional cleanup of code that referenced relogin periods
Marked KerberosTicketRenewer as deprecated

NIFI-3472 Cleaned up imports in TestPutHiveStreaming
2018-01-03 11:31:47 -05:00
Marco Gaido 353fcdda9c NIFI-3660: This closes #2356. Support schema containing a map with an array value in ConvertAvroToORC
Signed-off-by: joewitt <joewitt@apache.org>
2017-12-26 17:46:35 -05:00
Matthew Burgess febb119fac NIFI-4696 - Add Flowfile attribute EL support and per-table concurrency to PutHiveStreaming
Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

This closes #2342.
2017-12-20 12:11:38 +01:00
Koji Kawamura 37ef2839e0 NIFI-4545: Improve Hive processors provenance transit URL
Incorporated review comments:

- Added 'input' to equals() method so that the same table name can appear
as input and output tables.

Signed-off-by: Matthew Burgess <mattyb149@apache.org>

This closes #2239
2017-11-09 12:53:28 -05:00
Matthew Burgess 5cd8a3e729 NIFI-4473: Fixed SelectHiveQL flowfile handling during error conditions
Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

This closes #2247.
2017-11-03 17:03:27 +01:00
Matthew Burgess 1a1e01c568 NIFI-4473 - fixed NPE in parametrized queries
Signed-off-by: Matthew Burgess <mattyb149@apache.org>

This closes #2215
2017-10-18 10:13:03 -04:00
Matthew Burgess 4edafad6e5 NIFI-4473: Add support for large result sets and normalizing Avro names to SelectHiveQL
Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

This closes #2212.
2017-10-18 13:37:31 +02:00
Pierre Villard 9ac88d210a NIFI-4342 - Add EL support to PutHiveStreaming
Signed-off-by: Matthew Burgess <mattyb149@apache.org>

This closes #2120
2017-09-06 13:33:18 -04:00
Maurizio Colleluori 59a32948ea
NIFI-2923 Added evaluation of attribute expressions for Kerberos principal and keytab
Signed-off-by: Bryan Bende <bbende@apache.org>
2017-06-21 17:14:28 -04:00
Matt Burgess fb925fc182 NIFI-3867: Fixed issue with getConnectionURL in HiveConnectionPool using Expression Language
Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

This closes #1847.
2017-05-23 21:28:59 +02:00
Matt Burgess 3353865ce9 NIFI-3867: Add Expression Language support to HiveConnectionPool properties
Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

This closes #1783.
2017-05-15 13:46:29 +02:00
Tim Reardon e9848f4276 NIFI-3881 Fix PutHiveStreaming EL evaluation
Signed-off-by: Matt Burgess <mattyb149@apache.org>

This closes #1791
2017-05-12 14:06:32 -04:00
Koji Kawamura af6f63691c
NIFI-3818: PutHiveStreaming throws IllegalStateException
Changed from async append to sync, as async append breaks the 'recursionSet' check in StandardProcessSession by updating it from multiple threads, resulting in an IllegalStateException.

This closes #1761.

Signed-off-by: Bryan Bende <bbende@apache.org>
2017-05-05 13:25:59 -04:00
Bryan Rosander 0054a9e35f
NIFI-3755 - Restoring Hive exception handling behavior
This closes #1711.

Signed-off-by: Bryan Bende <bbende@apache.org>
2017-04-28 11:45:43 -04:00
Andrew Lim 2664ea093d NIFI-3756 This closes #1715. Update ConvertAvroToORC documentation to note that processor does not support unions of nested data
Signed-off-by: joewitt <joewitt@apache.org>
2017-04-28 00:02:03 -04:00
Koji Kawamura d9acdb54be NIFI-3415: Add Rollback on Failure.
- Added org.apache.nifi.processor.util.pattern package in nifi-processor-utils containing reusable functions to mix-in 'Rollback on Failure' capability.
- Created process pattern classes, Put and PutGroup, which will help standardize Processor implementations.
- Applied Rollback on Failure to PutSQL, PutHiveQL, PutHiveStreaming and PutDatabaseRecord.
- Stop using AbstractProcessor for these processors, as it penalizes FlowFiles being processed when it rolls back a process session. If FlowFiles are penalized, they will not be fetched again until the penalization expires.
- Yield processor when a failure occurs and RollbackOnFailure is enabled. If we neither penalize nor yield, a failed FlowFile is retried too frequently.
- When Rollback on Failure is enabled but the processor is not transactional, discontinue when an error occurs after successful processing.
- Fixed existing issues on PutHiveStreaming:
  - Output FlowFile Avro format was corrupted by concatenating multiple Avro files.
  - Output FlowFile records had incorrect values because of reusing GenericRecord instance.

Signed-off-by: Matt Burgess <mattyb149@apache.org>

This closes #1658
2017-04-27 13:44:56 -04:00
Bryan Rosander 97461657b1 NIFI-3744 - PutHiveStreaming cleanup null fixes
This closes #1698.

Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>
2017-04-26 22:54:49 +02:00
Bryan Bende d90cf846b9 NIFI-3380 Bumping NAR plugin to 1.2.0-SNAPSHOT development to leverage changes from master, adding buildnumber-maven-plugin to nifi-nar-bundles to properly set build info in MANIFEST of NARs
- Refactoring NarDetails to include all info from MANIFEST
- Adding the concept of a Bundle and refactoring NarClassLoaders to pass Bundles to ExtensionManager
- Adding logic to fail start-up when multiple NARs with same coordinates exist, moving Bundle classes to framework API
- Refactoring bundle API to classes and creating BundleCoordinate
- Updating FlowController to use BundleCoordinate

- Updating the UI and DTO model to support showing bundle details that loaded an extension type.
- Adding bundle details for processor canvas node, processor dialogs, controller service dialogs, and reporting task dialogs.
- Updating the formatting of the bundle coordinates.
- Addressing text overflow in the configuration/details dialog.
- Fixing self referencing functions.
- Updating extension UI mapping to incorporate bundle coordinates.
- Discovering custom UIs through the supplied bundles.
- Adding verification methods for creating extensions through the rest api.
- Only returning extensions that are common amongst all nodes.
- Rendering the ghost processors using a dotted border.
- Adding bundle details to the flow.xml.
- Loading NiFi build and version details from the framework NAR.
- Removing properties for build and version details.
- Wiring together front end and back end changes.
- Including bundle coordinates in the component data model.
- Wiring together component data model and flow.xml.
- Addressing issue when resolving unversioned dependent NARs.

Updating unit tests to pass based on framework changes
- Fixing logging of extension types during start up

- Allowing the application to start if there is a compatible bundle found. - Reporting missing bundle when a compatible bundle is not found. - Fixing table height in new component dialogs.

Fixing checkstyle error and increasing test timeout for TestStandardControllerServiceProvider
- Adding ability to change processor type at runtime
- Adding backend code to change type for controller services

- Cleaning up instance classloaders for temp components.
- Creating a dialog for changing the version of a component.
- Updating the formatting of the component type and bundle throughout.
- Updating the new component dialogs to support selecting source group.
- Cleaning up new component dialogs.
- Cleaning up documentation in the cluster node endpoint.

Adding missing include in nifi-web-ui pom compressor plugin
- Refactoring so ConfigurableComponent provides getLogger() and so the nodes provide the ConfigurableComponent
- Creating LoggableComponent to pass around the component, logger, and coordinate within the framework

- Finishing clean up following rebase.

Calling lifecycle methods for add and remove when changing versions of a component
- Introducing verifyCanUpdateBundle(coordinate) to ConfiguredComponent, and adding unit tests

- Ensuring documentation is available for all components. Including those of the same type that are loaded from different bundles.

Adding lookup from ClassLoader to Bundle, adding fix for instance class loading to include all parent NARs, and adding additional unit tests for FlowController
- Adding validation to ensure referenced controller services implement the required API
- Fixing template instantiation to look up compatible bundle

- Requiring services/reporting tasks to be disabled/stopped.
- Only supporting a change version option when the item has multiple versions available.
- Limiting the possible new controller services to the applicable API version.
- Showing the implemented API versions for Controller Services.
- Updating the property descriptor tooltip to indicate the required service requirements.
- Introducing version based sorting in the new component dialog, change version dialog, and new controller service dialog.
- Addressing remainder of the issues from recent rebase.

Ensuring bundles have been added to the flow before proposing a flow, and incorporating bundle information into flow fingerprinting
- Refactoring the way missing bundles work to retain the desired bundle if available
- Fixing logger.isDebugEnabled to be logger.isTraceEnabled

- Auditing when user changes the bundle. - Ensuring bundle details are present in templates.

Moving standard prioritizers to framework NAR and refactoring ExtensionManager logic to handle cases where an extension is in a JAR directly in the lib directory

- Ensuring all nodes attempt to instantiate the same template instance when the available bundles may differ. - Fixing the auditing of copy/paste and template instantiation. - Running additional verification methods when running standalone.

Refactoring controller service invocation handler to allow updating the node used by the invocation handler
- Ensuring the bundles in a proposed flow are compatible with the current instance when the current instance has no flow and is going to accept the proposed flow
- Merging whether multiple versions of the component are available
- Setting NAR plugin back to current released version
- Cleaning up DocGenerator to not process multiple times

Addressing incorrect usage of nf.Common. - Using formatType in the new component type dialogs.

Improving error messages when looking for bundles

Addressing comments from PR. - Fixing references to global nf namespace. - Fixing injection of nfProcessGroupConfiguration in nfComponentVersion. - Fixing web api integration tests.

Not rendering unversioned in help documentation. - Ensuring the isExtentionMissing flag is correct after changing the component type.

Adding synchronization in node classes to ensure changing component can't occur when component is running, introducing MissingBundleException for better reporting when a node can't join cluster due to a missing bundle, and bumping NAR plugin to released version 1.2.0

Adding concept of missing components to fingerprinting to ensure nodes agree on missing components when joining a cluster

NIFI-3380: NIFI-3520: - Fixing hive nar dependency. - Marking DBCPService as provided. - Skipping services that require instance classloading and are cobundled with their service API. - Skipping components that require instance classloading and reference service APIs that are cobundled. - Addressing UI issues in the new component dialogs when re-opening with a filter applied.

Fixing checkstyle issue and adding back assume checks to distributed cache server test

Ensuring new component types are sorted correctly when shown initially.

This closes #1585.
2017-03-24 11:06:44 -04:00
Jeff Storck a61f353051
NIFI-3520 Updated nifi-hdfs-processors POM to depend directly on hadoop-client
- Removed NAR dependency on nifi-hadoop-libraries-nar from nifi-hadoop-nar so that hadoop-client dependencies will be included directly in nifi-hadoop-nar
- Added RequiresInstanceClassLoading annotation to AbstractHadoopProcessor and HiveConnectionPool
- UGI relogins are now performed using doAs
- Added debug-level logging for UGI relogins in KerberosTicketRenewer and AbstractHadoopProcessor

This closes #1539.

Signed-off-by: Bryan Bende <bbende@apache.org>
2017-03-13 12:21:49 -04:00
Bryan Rosander cd8eb775e6 NIFI-3574 - PutHiveStreaming UGI fixes
Signed-off-by: Matt Burgess <mattyb149@apache.org>

This closes #1578
2017-03-10 10:50:52 -05:00
Bryan Rosander cfe899d189 NIFI-3530 - Ensuring configuration and ugi information is passed down to Hive in PutHiveStreaming
NIFI-3530 - closing renewer in PutHiveStreaming.cleanup(), making npe less likely in HiveWriter.toString()

This closes #1544
2017-02-27 15:01:59 -05:00
Ben Schofield d5b139ffd4 NIFI-3418 Add records-per-transaction property to PutHiveStreaming processor
Signed-off-by: Matt Burgess <mattyb149@apache.org>

This closes #1455

Minor whitespace Checkstyle issue fixed
2017-01-30 11:46:42 -05:00
David W. Streever bbc714e73b PutHiveQL and SelectHiveQL Processor enhancements. Added support for multiple statements in a script. Options for delimiters, quotes, escaping, include header and alternate header.
Add support in SelectHiveQL to get script content from the Flow File to bring consistency with patterns used for PutHiveQL and support extra query management.

Changed behavior of using Flowfile to match ExecuteSQL.  Handle query delimiter when embedded.  Added test case for embedded delimiter

Formatting and License Header

Removing dead code.

Signed-off-by: Matt Burgess <mattyb149@apache.org>

Comments to Clarify test case.

Signed-off-by: Matt Burgess <mattyb149@apache.org>

Final whitespace/formatting/typo changes

This closes #1316
2017-01-18 14:47:31 -05:00
Matt Burgess eb5abf809d
NIFI-2927: Added Validation Query to HiveConnectionPool
This closes #1252.

Signed-off-by: Bryan Bende <bbende@apache.org>
2016-11-21 14:21:22 -05:00
Mark Payne c441a8696d NIFI-2850 This closes #1115. Added a migrate() method to ProcessSession and refactored BinFiles and MergeContent to use it
2016-11-09 16:25:03 -05:00
Matt Burgess b52b839895 NIFI-2897: Fixed SelectHiveQL for CSV output of complex types
This closes #1132
2016-10-14 12:35:38 -04:00
d810146 e969a5ffe3 NIFI-2873: Nifi throws UnknownHostException with HA NameNode
Signed-off-by: Matt Burgess <mattyb149@apache.org>

NIFI-2873: Changed test hive-site.xml to use local FS, fixed Checkstyle violations

This closes #1113
2016-10-14 09:23:19 -04:00
Andre F de Miranda 3b408f5601 NIFI-2816 - Clean typos across the code - Part 2. This closes #1085
2016-10-05 13:07:57 -04:00
Andre F de Miranda 446cd44702 NIFI-2816 - Clean typos across the code
This closes #1057.
2016-09-26 17:47:31 +02:00
Matt Burgess 6c9291ad53
NIFI-2765: Fixed Kerberos support for PutHiveStreaming
This closes #1012.

Signed-off-by: Bryan Bende <bbende@apache.org>
2016-09-13 11:13:06 -04:00
Matt Burgess 5d1a4f343f
NIFI-2622: Added support for complex types in SelectHiveQL
This closes #922.

Signed-off-by: Bryan Bende <bbende@apache.org>
2016-08-23 15:52:16 -04:00
Matt Burgess fbec3b9c13
NIFI-2623: Fixed support for binary types in SelectHiveQL
This closes #920.

Signed-off-by: Bryan Bende <bbende@apache.org>
2016-08-23 11:43:46 -04:00
Matt Burgess 46b81058c7 NIFI-2602: Fixed NPE in SelectHiveQL when CSV output and null column value
This closes #898.
2016-08-19 09:56:26 +02:00
Matt Burgess 6874a5d82d NIFI-2593: This closes #891. Fixed handling of nested records/structs in ConvertAvroToORC
2016-08-18 15:15:51 -04:00
Matt Burgess a74bc2c7c7 NIFI-2598: This closes #889. Fixed issue with static init of properties in HiveConnectionPool
2016-08-18 13:38:22 -04:00
Bryan Bende e0e4b3407a NIFI-2574 This closes #887. Fixing NPE when using kerberos keytab location from contexts, and cleaning up hadoop/hbase/hive kerberos variables
2016-08-18 12:19:46 -04:00
joewitt d9633757a6 NIFI-2574 fix spring context definitions
2016-08-17 03:38:31 -07:00
joewitt 7d7401add4 NIFI-2574 Changed NiFiProperties to avoid static initializer and updated all references to it.
2016-08-17 00:10:07 -07:00
Matt Burgess 03d3b3961d NIFI-2577: Increased default stripe size in ConvertAvroToORC to 64MB
This closes #870.
2016-08-16 11:53:05 +02:00
Matt Burgess d9720239f5 NIFI-1663: Add ConvertAvroToORC processor
This closes #727
2016-08-10 12:37:15 -04:00
Matt Burgess 3943d72e95
NIFI-1868: Incorporate PutHiveStreaming review comments
This closes #706.

Signed-off-by: Bryan Bende <bbende@apache.org>
2016-08-04 10:05:56 -04:00
Matt Burgess 59659232c7
NIFI-1868: Downgrade to Hive 1.2.1 and remove ConvertAvroToORC
Signed-off-by: Bryan Bende <bbende@apache.org>
2016-08-04 10:05:45 -04:00
Matt Burgess c2019b9339
NIFI-1868: Add PutHiveStreaming processor
Signed-off-by: Bryan Bende <bbende@apache.org>
2016-08-04 10:05:44 -04:00
Matt Burgess b213ed95e0 NIFI-2422: Fix SelectHiveQL handling of Number types
This closes #744.

Signed-off-by: Bryan Bende <bbende@apache.org>
2016-07-29 15:38:32 -04:00
joewitt f987b21609 NIFI-1157 searched for and resolved all remaining references to deprecated items that were clearly addressable.
2016-07-14 09:32:35 -04:00
Matt Burgess d6391652e0 NIFI-1663: Add ConvertAvroToORC processor
- Code review changes
 - This closes #477.
2016-06-29 12:18:27 -04:00
Aldrin Piri 1bd2cf0d09 NIFI-1811 Renaming MockProcessorLogger to MockComponentLogger for consistency. Removing unused imports from ExecuteScript causing checkstyle failures.
2016-05-19 14:38:41 -04:00
Matt Burgess 106b0fa0fc NIFI-981: Added SelectHiveQL and PutHiveQL processors
This closes #410.

Signed-off-by: Bryan Bende <bbende@apache.org>
2016-05-03 13:51:38 -04:00