Commit Graph

67 Commits

Author SHA1 Message Date
Jeff Storck 42a1ee011b NIFI-4323 This closes #2360. Wrapped Get/ListHDFS hadoop operations in ugi.doAs calls
NIFI-3472 NIFI-4350 Removed explicit relogin code from HDFS/Hive/HBase components and updated SecurityUtils.loginKerberos to use UGI.loginUserFromKeytab. This brings those components in line with daemon-process-style usage, made possible by NiFi's InstanceClassloader isolation.  Relogin (on ticket expiry/connection failure) can now be properly handled by hadoop-client code implicitly.
NIFI-3472 Added default value (true) for javax.security.auth.useSubjectCredsOnly to bootstrap.conf
NIFI-3472 Added javadoc explaining the removal of explicit relogin threads and usage of UGI.loginUserFromKeytab
Re-added Relogin Period property to AbstractHadoopProcessor, and updated its documentation to indicate that it is now a deprecated property
Additional cleanup of code that referenced relogin periods
Marked KerberosTicketRenewer as deprecated

NIFI-3472 Cleaned up imports in TestPutHiveStreaming
2018-01-03 11:31:47 -05:00
Koji Kawamura 84cecfbeea NIFI-4707: Fixed ProcessGroup tree
- Removed duplicated creation of a ParentProcessGroupSearchNode for the
root ProcessGroup.
- Removed duplicated creation of a ParentProcessGroupSearchNode for each
component inside a ProcessGroup.
- Fixed ProcessGroup id hierarchy.
- Fixed filtering logic.
- Added unit tests for filtering by ProcessGroupId and Remote
Input/Output ports.

Signed-off-by: Matthew Burgess <mattyb149@apache.org>

This closes #2351
2018-01-02 14:49:00 -05:00
Matthew Burgess 97dc20e2d9 NIFI-4707: Changed process group parent stack to tree 2018-01-02 14:46:48 -05:00
Koji Kawamura d65e6b2563 NIFI-4707: Improved S2SProvenanceReportingTask
- Simplified consumeEvents method signature
- Refactored ComponentMapHolder methods visibility
- Renamed componentMap to componentNameMap
- Map more metadata from ConnectionStatus for Remote Input/Output Ports
- Support Process Group hierarchy filtering
- Throw an exception when the reporting task fails to send provenance
data to keep current provenance event index so that events can be
consumed again
2018-01-02 14:46:42 -05:00
Matthew Burgess 1f793923a4 NIFI-4707: Build full component map for ID -> Name association in provenance reporting
NIFI-4707: Add process group ID/name to S2SProvReportingTask records

NIFI-4707: Added support for filtering provenance on process group ID

NIFI-4707: Fixed support for provenance in Atlas reporting task

NIFI-4707: Refactored common code into reporting-utils, fixed filtering
2018-01-02 14:46:36 -05:00
Mark Payne c91d99884a NIFI-4717: Several minor bug fixes and performance improvements around record-oriented processors
Signed-off-by: Matthew Burgess <mattyb149@apache.org>

This closes #2359
2017-12-29 10:43:21 -05:00
Koji Kawamura 62e388aa4f NIFI-4709 - Fixed ListAzureBlobStorage timestamp precision handling.
Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

This closes #2354.
2017-12-21 15:15:09 +01:00
Mark Payne f772f2f093 NIFI-4671: This closes #2328. Ensure that Avro Schemas that are created properly denote fields as being nullable iff the schema says they are, for non-top-level fields
Signed-off-by: joewitt <joewitt@apache.org>
2017-12-11 11:46:15 -05:00
joewitt cdc1facf39
NIFI-4664, NIFI-4662, NIFI-4660, NIFI-4659 Moved tests that are timing/threading/network dependent and brittle to integration tests, and un-ignored tests that are ITs. Updated Travis to reduce impact on infra; AppVeyor now skips test runs and only proves that the build works on Windows. This closes #2319
squash
2017-12-06 10:53:09 -05:00
Koji Kawamura 77a51e1a9e NIFI-4544: Improve HDFS processors provenance transit URL
Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

This closes #2238.
2017-11-02 10:10:03 +01:00
Koji Kawamura d914ad2924 NIFI-4547: Add ProvenanceEventConsumer utility class
Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

This closes #2236.
2017-10-30 09:50:44 +01:00
patricker e3482cc772 NIFI-4534 Choose Character Set for CSV Record Read/Write streams
Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

This closes #2229.
2017-10-27 10:34:15 +02:00
patricker fd00df3d2f NIFI-4465 ConvertExcelToCSV Data Formatting and Delimiters
This closes #2194.

Signed-off-by: Koji Kawamura <ijokarumawak@apache.org>
2017-10-17 14:56:49 +09:00
Matt Gilman 6baea8ccff
NIFI-4444:
- Upgrading to Jersey 2.x.
- Updating NOTICE files where necessary.
- Fixing checkstyle issues.

This closes #2206.

Signed-off-by: Andy LoPresto <alopresto@apache.org>
2017-10-12 10:27:02 -07:00
Bryan Bende 9324a2a742 NIFI-4476 Improving logic for determining when to yield in PutTCP/UDP/Syslog/Splunk
Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

This closes #2204.
2017-10-10 09:02:02 +02:00
Takanobu Asanuma 13e42678b6 NIFI-4338. This closes #2143. add docs for ssl configurations in HDFS processors
remove redundant additionalDetails.html and add docs to CapabilityDescription in HDFS processors

revert the modified CapabilityDescriptions in HDFS processors and add it to AbstractHadoopProcessor
2017-10-10 00:17:44 -04:00
Andy LoPresto d4168f5ff1
NIFI-4297
- Upgraded immediately actionable dependency versions from Meterian report.
- Upgraded jackson-core test dependencies for HBase and Elasticsearch modules.
- Only 3 instances of jackson-core < 2.8.6 (Google Cloud Platform and Spark Receiver modules).
- Upgraded version of poi dependency in nifi-email-processors to 3.16.
- Resolving dependency issues after rebasing against 1.5.0-SNAPSHOT.
- Removed jackson-databind from <dependencyManagement> block in nifi/pom.xml and added explicit reference to ${jackson.version} in all referenced artifacts.
- Removed jackson-mapper-asl from <dependencyManagement> block in nifi/pom.xml and added explicit reference to ${jackson.old.version} in all referenced artifacts.
- Removed Jasypt from <dependencyManagement> and added explicit version in test dependency for legacy compatibility.
- This closes #2084
2017-10-05 15:23:52 -04:00
Jeff Storck a57911d3db NIFI-4412-RC2 prepare for next development iteration 2017-09-28 13:45:36 -04:00
Jeff Storck e6508ba7d3 NIFI-4412-RC2 prepare release nifi-1.4.0-RC2 2017-09-28 13:45:21 -04:00
Bryan Bende 6eab91923e NIFI-4418 Adding ListenUDPRecord processor. This closes #2173. 2017-09-25 13:19:23 -04:00
Bryan Bende a813ae113e NIFI-4391 Ensuring channel is closed when unable to connect in SocketChannelSender
NIFI-4391 Adding debug logging of client port upon connection

Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

This closes #2159.
2017-09-21 16:30:10 +02:00
Koji Kawamura 1f67cbf628 NIFI-4004: Use RecordReaderFactory without FlowFile.
- Removed FlowFile from RecordReaderFactory, RecordSetWriterFactory and SchemaAccessStrategy.
- Renamed variable 'allowableValue' to 'strategy' to represent its meaning better.
- Removed creation of temporary FlowFile to resolve Record Schema from ConsumerLease.

- Removed unnecessary 'InputStream content' argument from
  RecordSetWriterFactory.getSchema method.

This closes #1877.
2017-09-08 12:37:40 -04:00
Koji Kawamura e68ff153e8
NIFI-3332: ListXXX to not miss files with the latest processed timestamp
Before this fix, ListXXX processors could miss files that have the same timestamp as the latest processed timestamp of the previous cycle. Because only timestamps were used, it was not possible to determine whether a file had already been processed.

However, storing every single processed identifier, as we used to, will not perform well.
Instead, this commit makes ListXXX store only the identifiers that have the latest timestamp in a cycle, to minimize the amount of state data to store.

NIFI-3332: ListXXX to not miss files with the latest processed timestamp

- Fixed TestAbstractListProcessor to use appropriate time precision.
  Without this fix, arbitrary tests can fail if a generated timestamp does
  not have the desired time unit value, e.g. generated '10:51:00' where
  second precision is tested.
- Fixed TestFTP.basicFileList to use millisecond time precision explicitly
  because FakeFtpServer's time precision is in minutes.
- Changed junit dependency scope to 'provided' as it is needed by
  ListProcessorTestWatcher which is shared among different modules.

This closes #1975.

Signed-off-by: Bryan Bende <bbende@apache.org>
2017-08-28 11:31:04 -04:00
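
The state-tracking idea described in NIFI-3332 above can be illustrated with a minimal sketch. This is not the NiFi implementation; `ListingState` and `list_new_entries` are hypothetical names, and entries are assumed to be simple `(identifier, timestamp)` pairs. Only the newest timestamp and the identifiers that share it are kept between cycles.

```python
from dataclasses import dataclass, field


@dataclass
class ListingState:
    """State kept between listing cycles (hypothetical structure)."""
    latest_timestamp: int = -1
    latest_identifiers: set = field(default_factory=set)


def list_new_entries(entries, state):
    """Return (new_entries, new_state) for one listing cycle.

    entries is an iterable of (identifier, timestamp) pairs.
    """
    # An entry is new if it is strictly newer than the stored timestamp, or
    # if it shares that timestamp but its identifier was not seen before.
    new_entries = [
        (ident, ts)
        for ident, ts in entries
        if ts > state.latest_timestamp
        or (ts == state.latest_timestamp and ident not in state.latest_identifiers)
    ]
    if not new_entries:
        return [], state

    newest = max(ts for _, ts in new_entries)
    identifiers = {ident for ident, ts in new_entries if ts == newest}
    if newest == state.latest_timestamp:
        # Same newest timestamp as last cycle: keep the identifiers already seen.
        identifiers |= state.latest_identifiers
    return new_entries, ListingState(newest, identifiers)
```

With a timestamp-only state, an entry that appears later with the same timestamp as the previous newest entry would be skipped; here it is picked up because its identifier is not in the stored set.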
Koji Kawamura 28ee70222b
NIFI-4069: Make ListXXX work with timestamp precision in seconds or minutes
- Refactored variable names to better represent what they are meant for.
- Added deterministic logic which detects the target filesystem's timestamp precision and adjusts lag time based on it.
- Changed from using System.nanoTime() to System.currentTimeMillis() in tests because the Java File API reports timestamps in milliseconds at the best granularity. Also, System.nanoTime() should not be mixed with epoch milliseconds because it uses an arbitrary origin and is measured differently.
- Changed TestListFile to use a longer interval between the file timestamps used by testFilterAge to provide more consistent test results, because sleep time can be longer with filesystems whose timestamps have seconds precision.
- Added logging at TestListFile.
- Added TestWatcher to dump state in case assertion fails for further investigation.
- Added Timestamp Precision property so that users can set it when auto-detection is not enough
- Adjust timestamps for ages test

This closes #1915.

Signed-off-by: Bryan Bende <bbende@apache.org>
2017-08-28 11:31:03 -04:00
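
One way to picture the precision detection mentioned in NIFI-4069 above is the heuristic below. It is a sketch under assumptions, not the logic in the commit: `detect_precision` and the lag values in `LISTING_LAG_MS` are hypothetical, and timestamps are assumed to be epoch milliseconds.

```python
# Hypothetical lag times per detected precision, in milliseconds.
LISTING_LAG_MS = {"millis": 100, "seconds": 1_000, "minutes": 60_000}


def detect_precision(timestamps_ms):
    """Guess the timestamp precision of a filesystem from observed timestamps."""
    timestamps = list(timestamps_ms)
    if not timestamps:
        # Nothing observed yet: assume the finest precision.
        return "millis"
    if any(ts % 1_000 for ts in timestamps):
        # At least one timestamp carries a sub-second component.
        return "millis"
    if any(ts % 60_000 for ts in timestamps):
        # Whole seconds only, but not all on minute boundaries.
        return "seconds"
    return "minutes"
```

A heuristic like this can guess too coarse a precision when only a few entries are observed, which is one reason an explicit Timestamp Precision property is useful.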
Bryan Bende cf57639396 NIFI-4311 Allowing umask to get set properly before initializing the FileSystem
Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

This closes #2106.
2017-08-22 22:40:26 +02:00
Mark Payne 451f9cf124 NIFI-4142: This closes #2015. Refactored Record Reader/Writer to allow for reading/writing "raw records". Implemented ValidateRecord. Updated Record Reader to take two parameters for nextRecord: (boolean coerceTypes) and (boolean dropUnknownFields)
Signed-off-by: joewitt <joewitt@apache.org>
2017-08-11 22:01:46 -07:00
Bryan Bende 0029f025f8 NIFI-4152 Initial commit of ListenTCPRecord 2017-08-07 22:44:11 +02:00
Wesley-Lawrence 40cde0466a NIFI-4215 NiFi can now parse an Avro schema of a record that references an already defined record, including itself.
This closes #2052.
2017-08-03 15:13:07 -04:00
James Wing 2502b79bae NIFI-4215 Revert Complex Avro Schema Changes
This reverts commit cf49a58ee7.
2017-08-01 21:03:04 -07:00
Wesley-Lawrence cf49a58ee7 NIFI-4215 Allow Complex Avro Schema Parsing
NiFi can now parse an Avro schema of a record that references an already defined record, including itself.

Signed-off-by: James Wing <jvwing@gmail.com>

This closes #2034.
2017-07-30 16:31:39 -07:00
Mark Payne 1d6b486b63 NIFI-4232: Ensure that we handle conversions to Avro Arrays properly. Also, if unable to convert a value to the expected object, include in the log message the (fully qualified) name of the field that is problematic
Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

This closes #2040.
2017-07-27 08:57:25 +02:00
Bryan Bende f87d2a2f57 NIFI-4157 Improvements to PutTCP
Signed-off-by: Bryan Bende <bbende@apache.org>
Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

This closes #1989.
2017-07-19 12:25:42 +02:00
Mark Payne b603cb955d NIFI-4060: Initial implementation of MergeRecord
NIFI-4060: Addressed threading issue with RecordBin being updated after it is completed; fixed issue that caused mime.type attribute not to be written properly if all incoming flowfiles already have a different value for that attribute

NIFI-4060: Bug fixes; improved documentation; added a lot of debug information; updated StandardProcessSession to produce more accurate logs in case of a session being committed/rolled back with open input/output streams
Signed-off-by: Matt Burgess <mattyb149@apache.org>

This closes #1958
2017-07-12 16:36:48 -04:00
Mark Payne 3906d4e1d2 NIFI-1763: Initial implementation of ConfluentSchemaRegistry.
NIFI-1763: Fixed bug where the Confluent Schema Registry Schema Access Writer was not being created

Signed-off-by: Yolanda M. Davis <ymdavis@apache.org>

This closes #1938
2017-07-06 15:52:07 -04:00
Jeff Storck c99100c934
NIFI-4010 Enables EL on Fetch/List/PutSFTP and List/Fetch/Put/DeleteHDFS processor properties
FetchSFTP/ListSFTP/PutSFTP: Private Key Path
ListHDFS/FetchHDFS/PutHDFS/DeleteHDFS: Hadoop Configuration Resources, Kerberos Principal, Kerberos Keytab, Kerberos Relogin Period

This closes #1148
This closes #1930.

Signed-off-by: Bryan Bende <bbende@apache.org>
2017-06-21 17:14:49 -04:00
Maurizio Colleluori 59a32948ea
NIFI-2923 Added evaluation of attribute expressions for Kerberos principal and keytab
Signed-off-by: Bryan Bende <bbende@apache.org>
2017-06-21 17:14:28 -04:00
Maurizio Colleluori 86fa1bba4f
NIFI-2923 Add expression language support to Kerberos parameters used by processors
Signed-off-by: Bryan Bende <bbende@apache.org>
2017-06-21 17:14:27 -04:00
Mark Payne e7dcb6f6c5 NIFI-3921: Allow Record Writers to inherit schema from Record
Signed-off-by: Matt Burgess <mattyb149@apache.org>

This closes #1902
2017-06-09 16:13:25 -04:00
Matt Gilman cc741d2be6
NIFI-3997:
- Bumping to next minor version.
2017-06-08 15:22:51 -04:00
Matt Gilman 1bf0a1a849
Merge branch 'NIFI-3997-RC1' 2017-06-08 14:30:10 -04:00
Bryan Bende b0c9428776 NIFI-4030 Populating default values on GenericRecord from Avro schema if not present in RecordSchema
This closes #1896.
2017-06-07 13:59:40 -04:00
Steve Champagne 45e035686f
NIFI-4029: Allow null Avro default values in HortonworksSchemaRegistry
This closes #1894.

Signed-off-by: Bryan Bende <bbende@apache.org>
2017-06-07 12:03:53 -04:00
Matt Gilman 6ee12e9b47
NIFI-3997-RC1 prepare for next development iteration 2017-06-05 11:07:43 -04:00
Matt Gilman ddb73612bd
NIFI-3997-RC1 prepare release nifi-1.3.0-RC1 2017-06-05 11:07:28 -04:00
Mark Payne 7035694e37
NIFI-3995: Updated Hwx Encoded Schema Ref Writer to write 13 bytes for header instead of 14; added unit test to verify
This closes #1876.

Signed-off-by: Bryan Bende <bbende@apache.org>
2017-06-01 10:29:58 -04:00
Mark Payne a0b2311ff6 NIFI-3995: This closes #1873. No longer use the 14th byte in the header for hwx content-encoded schema reference
Signed-off-by: joewitt <joewitt@apache.org>
2017-05-31 13:41:25 -04:00
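
For readers wondering where 13 bytes comes from in the two NIFI-3995 entries above, the sketch below packs a schema reference header. The field layout (1-byte protocol version, 8-byte schema ID, 4-byte schema version, big-endian) is an assumption for illustration and is not stated in the commit messages; `encode_header` and `decode_header` are hypothetical names.

```python
import struct

# Assumed layout: signed byte + 8-byte long + 4-byte int, big-endian, no padding.
HEADER = struct.Struct(">bqi")


def encode_header(protocol_version, schema_id, schema_version):
    """Pack the assumed schema reference fields into a fixed-size header."""
    return HEADER.pack(protocol_version, schema_id, schema_version)


def decode_header(data):
    """Unpack the header from the start of a byte string."""
    return HEADER.unpack(data[:HEADER.size])
```

Under that assumed layout the header is 1 + 8 + 4 = 13 bytes, so a 14th byte would not belong to it.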
Koji Kawamura 13b59b5621 NIFI-3958: Decimal logical type with undefined precision and scale.
- Oracle NUMBER can return 0 precision and -127 or 0 scale with variable scale NUMBER such as ROWNUM or function result
- Added 'Default Decimal Precision' and 'Default Decimal Scale' properties to ExecuteSQL and QueryDatabaseTable to apply default precision and scale if those are unknown
- Coerce BigDecimal scale to field schema logical type, so that BigDecimals having different scales can be written

Signed-off-by: Matt Burgess <mattyb149@apache.org>

This closes #1851
2017-05-25 14:37:17 -04:00
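
The two ideas in NIFI-3958 above, falling back to defaults when precision is unknown and coercing values to the schema's scale, can be sketched as follows. This is an illustration only: the function names, the default values, the fallback rule, and the rounding mode are all assumptions, and Python's `decimal` module stands in for Java's BigDecimal.

```python
from decimal import Decimal, ROUND_HALF_UP


def effective_precision_scale(precision, scale, default_precision=10, default_scale=0):
    """Fall back to defaults when the driver reports an unknown precision.

    The default values and the exact fallback rule are assumptions.
    """
    if precision > 0:
        return precision, scale
    return default_precision, scale if scale > 0 else default_scale


def coerce_scale(value, scale):
    """Return value adjusted to exactly `scale` fractional digits."""
    # Decimal(1).scaleb(-scale) is 1E-scale, the quantum for the target scale.
    return value.quantize(Decimal(1).scaleb(-scale), rounding=ROUND_HALF_UP)
```

For example, a column reported with precision 0 and scale -127 would resolve to the defaults, and values such as 1.5 and 2.345 would both be written with two fractional digits when the field's scale is 2.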
Mark Payne 08b66b5b6a NIFI-3969: Prevent merging flowfiles prematurely when all bins fill but some are already full and can be processed
Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

This closes #1850.
2017-05-24 19:36:18 +02:00
Pierre Villard ba49b8427c
NIFI-3191 - HDFS Processors Should Allow Choosing LZO Compression
This closes #1802.

Signed-off-by: Bryan Bende <bbende@apache.org>
2017-05-24 11:04:00 -04:00
Mark Payne c49933f03d NIFI-3948: This closes #1834. Added flush() method to RecordWriter and call it when writing a single record to OutputStream for PublishKafkaRecord. Also removed no-longer-used class WriteAvroResult
Signed-off-by: joewitt <joewitt@apache.org>
2017-05-19 23:05:04 -04:00