Commit Graph

145 Commits

Author SHA1 Message Date
Koji Kawamura e68ff153e8
NIFI-3332: ListXXX to not miss files with the latest processed timestamp
Before this fix, it's possible that ListXXX processors can miss files those have the same timestamp as the one which was the latest processed timestamp at the previous cycle. Since it only used timestamps, it was not possible to determine whether a file is already processed or not.

However, storing every single processed identifier as we used to will not perform well.
Instead, this commit makes ListXXX to store only identifiers those have the latest timestamp at a cycle to minimize the amount of state data to store.

NIFI-3332: ListXXX to not miss files with the latest processed timestamp

- Fixed TestAbstractListProcessor to use appropriate time precision.
  Without this fix, arbitrary test can fail if generated timestamp does
  not have the desired time unit value, e.g. generated '10:51:00' where
  second precision is tested.
- Fixed TestFTP.basicFileList to use millisecond time precision explicitly
  because FakeFtpServer's time precision is in minutes.
- Changed junit dependency scope to 'provided' as it is needed by
  ListProcessorTestWatcher which is shared among different modules.

This closes #1975.

Signed-off-by: Bryan Bende <bbende@apache.org>
2017-08-28 11:31:04 -04:00
Koji Kawamura 28ee70222b
NIFI-4069: Make ListXXX work with timestamp precision in seconds or minutes
- Refactored variable names to better represents what those are meant for.
- Added deterministic logic which detects target filesystem timestamp precision and adjust lag time based on it.
- Changed from using System.nanoTime() to System.currentTimeMillis in test because Java File API reports timestamp in milliseconds at the best granularity. Also, System.nanoTime should not be used in mix with epoch milliseconds because it uses arbitrary origin and measured differently.
- Changed TestListFile to use more longer interval between file timestamps those are used by testFilterAge to provide more consistent test result because sleep time can be longer with filesystems whose timestamp in seconds precision.
- Added logging at TestListFile.
- Added TestWatcher to dump state in case assertion fails for further investigation.
- Added Timestamp Precision property so that user can set if auto-detect is not enough
- Adjust timestamps for ages test

This closes #1915.

Signed-off-by: Bryan Bende <bbende@apache.org>
2017-08-28 11:31:03 -04:00
Bryan Bende cf57639396 NIFI-4311 Allowing umask to get set properly before initializing the FileSystem
Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

This closes #2106.
2017-08-22 22:40:26 +02:00
Mark Payne 451f9cf124 NIFI-4142: This closes #2015. Refactored Record Reader/Writer to allow for reading/writing "raw records". Implemented ValidateRecord. Updated Record Reader to take two parameters for nextRecord: (boolean coerceTypes) and (boolean dropUnknownFields)
Signed-off-by: joewitt <joewitt@apache.org>
2017-08-11 22:01:46 -07:00
Bryan Bende 0029f025f8 NIFI-4152 Initial commit of ListenTCPRecord 2017-08-07 22:44:11 +02:00
Wesley-Lawrence 40cde0466a NIFI-4215 NiFi can now parse an Avro schema of a record that references an already defined record, including itself.
This closes #2052.
2017-08-03 15:13:07 -04:00
James Wing 2502b79bae NIFI-4215 Revert Complex Avro Schema Changes
This reverts commit cf49a58ee7.
2017-08-01 21:03:04 -07:00
Wesley-Lawrence cf49a58ee7 NIFI-4215 Allow Complex Avro Schema Parsing
NiFi can now parse an Avro schema of a record that references an already defined record, including itself.

Signed-off-by: James Wing <jvwing@gmail.com>

This closes #2034.
2017-07-30 16:31:39 -07:00
Mark Payne 1d6b486b63 NIFI-4232: Ensure that we handle conversions to Avro Arrays properly. Also, if unable to convert a value to the expected object, include in the log message the (fully qualified) name of the field that is problematic
Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

This closes #2040.
2017-07-27 08:57:25 +02:00
Bryan Bende f87d2a2f57 NIFI-4157 Improvements to PutTCP
Signed-off-by: Bryan Bende <bbende@apache.org>
Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

This closes #1989.
2017-07-19 12:25:42 +02:00
Mark Payne b603cb955d NIFI-4060: Initial implementation of MergeRecord
NIFI-4060: Addressed threading issue with RecordBin being updated after it is completed; fixed issue that caused mime.type attribute not to be written properly if all incoming flowfiles already have a different value for that attribute

NIFI-4060: Bug fixes; improved documentation; added a lot of debug information; updated StandardProcessSession to produce more accurate logs in case of a session being committed/rolled back with open input/output streams
Signed-off-by: Matt Burgess <mattyb149@apache.org>

This closes #1958
2017-07-12 16:36:48 -04:00
Mark Payne 3906d4e1d2 NIFI-1763: Initial implementation of ConfluentSchemaRegistry.
NIFI-1763: Fixed bug where the Confluent Schema Registry Schema Access Writer was not being created

Signed-off-by: Yolanda M. Davis <ymdavis@apache.org>

This closes #1938
2017-07-06 15:52:07 -04:00
Jeff Storck c99100c934
NIFI-4010 Enables EL on Fetch/List/PutSFTP and List/Fetch/Put/DeleteHDFS processor properties
FetchSFTP/ListSFTP/PutSFTP: Private Key Path
ListHDFS/FetchHDFS/PutHDFS/DeleteHDFS: Hadoop Configuration Resources, Kerberos Principal, Kerberos Keytab, Kerberos Relogin Period

This closes #1148
This closes #1930.

Signed-off-by: Bryan Bende <bbende@apache.org>
2017-06-21 17:14:49 -04:00
Maurizio Colleluori 59a32948ea
NIFI-2923 Added evaluation of attribute expressions for Kerberos principal and keytab
Signed-off-by: Bryan Bende <bbende@apache.org>
2017-06-21 17:14:28 -04:00
Maurizio Colleluori 86fa1bba4f
NIFI-2923 Add expression language support to Kerberos parameters used by processors
Signed-off-by: Bryan Bende <bbende@apache.org>
2017-06-21 17:14:27 -04:00
Mark Payne e7dcb6f6c5 NIFI-3921: Allow Record Writers to inherit schema from Record
Signed-off-by: Matt Burgess <mattyb149@apache.org>

This closes #1902
2017-06-09 16:13:25 -04:00
Matt Gilman cc741d2be6
NIFI-3997:
- Bumping to next minor version.
2017-06-08 15:22:51 -04:00
Matt Gilman 1bf0a1a849
Merge branch 'NIFI-3997-RC1' 2017-06-08 14:30:10 -04:00
Bryan Bende b0c9428776 NIFI-4030 Populating default values on GenericRecord from Avro schema if not present in RecordSchema
This closes #1896.
2017-06-07 13:59:40 -04:00
Steve Champagne 45e035686f
NIFI-4029: Allow null Avro default values in HortonworksSchemaRegistry
This closes #1894.

Signed-off-by: Bryan Bende <bbende@apache.org>
2017-06-07 12:03:53 -04:00
Matt Gilman 6ee12e9b47
NIFI-3997-RC1prepare for next development iteration 2017-06-05 11:07:43 -04:00
Matt Gilman ddb73612bd
NIFI-3997-RC1prepare release nifi-1.3.0-RC1 2017-06-05 11:07:28 -04:00
Mark Payne 7035694e37
NIFI-3995: Updated Hwx Encoded Schema Ref Writer to write 13 bytes for header instead of 14; added unit test to verify
This closes #1876.

Signed-off-by: Bryan Bende <bbende@apache.org>
2017-06-01 10:29:58 -04:00
Mark Payne a0b2311ff6 NIFI-3995: This closes #1873. No longer use the 14th byte in the header for hwx content-encoded schema reference
Signed-off-by: joewitt <joewitt@apache.org>
2017-05-31 13:41:25 -04:00
Koji Kawamura 13b59b5621 NIFI-3958: Decimal logical type with undefined precision and scale.
- Oracle NUMBER can return 0 precision and -127 or 0 scale with variable scale NUMBER such as ROWNUM or function result
- Added 'Default Decimal Precision' and 'Default Decimal Scale' property to ExecuteSQL and QueryDatabaseTable to apply default precision and scale if those are unknown
- Coerce BigDecimal scale to field schema logical type, so that BigDecimals having different scale can be written

Signed-off-by: Matt Burgess <mattyb149@apache.org>

This closes #1851
2017-05-25 14:37:17 -04:00
Mark Payne 08b66b5b6a NIFI-3969: Prevent merging flowfiles prematurely when all bins fill but some are already full and can be processed
Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

This closes #1850.
2017-05-24 19:36:18 +02:00
Pierre Villard ba49b8427c
NIFI-3191 - HDFS Processors Should Allow Choosing LZO Compression
This closes #1802.

Signed-off-by: Bryan Bende <bbende@apache.org>
2017-05-24 11:04:00 -04:00
Mark Payne c49933f03d NIFI-3948: This closes #1834. Added flush() method to RecordWriter and call it when writing a single record to OutputStream for PublishKafkaRecord. Also removed no-longer-used class WriteAvroResult
Signed-off-by: joewitt <joewitt@apache.org>
2017-05-19 23:05:04 -04:00
Mark Payne ae9953db64 NIFI-3857: This closes #1825. Added PartitionRecord processor
Signed-off-by: joewitt <joewitt@apache.org>
2017-05-19 02:08:52 -04:00
Mark Payne 9bd0246a96 NIFI-3863: Initial implementation of Lookup Services. Implemented LookupRecord processors. This required some refactoring of RecordSetWriter interface, so refactored that interface and all implementations and references of it 2017-05-19 01:02:41 -04:00
Koji Kawamura 40a9cd4f2e NIFI-3919: Let AvroTypeUtil try every possible type
Before this fix, AvroTypeUtil can throw an Exception before trying every possible data types defined within a union field.

This closes #1816.
2017-05-17 10:53:27 -04:00
Koji Kawamura 33dc3e36fb NIFI-3920: Remove unnecessary code from AvroTypeUtil
- Removed remaining duplicate lines of code left by NIFI-3861 refactoring.
- Added test case that writes Avro record having union field.

This closes #1813.
2017-05-17 09:45:21 -04:00
Koji Kawamura 1811ba5681 NIFI-2624: Avro logical types for ExecuteSQL and QueryDatabaseTable
- Added Logical type support for DECIMAL/NUMBER, DATE, TIME and TIMESTAMP column types.
- Added Logical type 'decimal' to AvroReader so that Avro records with logical types written by ExecuteSQL and QueryDatabaseTable can be consumed by AvroReader.
- Added JdbcCommon.AvroConversionOptions to consolidate conversion options.
- Added 'Use Avro Logical Types' property to ExecuteSQL and QueryDatabaseTable to toggle whether to use Logical types.
- Added 'mime.type' FlowFile attribute as 'application/avro-binary' so that output FlowFiles can be displayed by content viewer.

Signed-off-by: Matt Burgess <mattyb149@apache.org>

This closes #1798
2017-05-15 14:15:23 -04:00
Steve Champagne 382eef2183 NIFI-3879: Allow null Avro default values
- Avro uses their own class for null values, so a standard check against
  null isn't picking them up.

This closes #1792.
2017-05-12 14:58:43 -04:00
Mark Payne b1901d5fe0 NIFI-3838: Initial implementation of RecordPath and UpdateRecord processor
NIFI-3838: Updated version from 1.2.0-SNAPSHOT to 1.3.0-SNAPSHOT; removed unneeded value from AttributeExpression.ResultType enum

NIFI-3838: Addressed PR Review feedback

NIFI-3838: Allow for schemas to be merged together for a record; refactored RecordSetWriterFactory so that there is a method to obtain the schema and then the writer is created with that schema. Added additional unit tests

NIFI-3838: Addressed problems with documentation based on PR Review

NIFI-3838: Fixed checkstyle violation

NIFI-3838: Addressed issue of comparing different types of Number objects

Signed-off-by: Matt Burgess <mattyb149@apache.org>

This closes #1772
2017-05-12 12:36:52 -04:00
Koji Kawamura 0b0ac196ea NIFI-3861: AvroTypeUtil used different constant.
Previous fix #1779 refactored the way to check Logical type to use string constants.
One of those refactoring used wrong constant mistakenly in normalizeValue method.

Fortunately, this defect is harmless since even though normalizeValue did not convert int to Time, DataTypeUtils.convertType does the same conversion.

Signed-off-by: Matt Burgess <mattyb149@apache.org>

This closes #1782
2017-05-11 21:31:07 -04:00
Steve Champagne 6e4db6b11a NIFI-3871: Convert Avro map values
This closes #1787.

- When converting from a raw value to an Avro object, convert the values
  of any Avro map types so that they can be complex types like other
  records.
2017-05-11 20:25:19 -04:00
Koji Kawamura 72de1cbdef NIFI-3861: Fixed AvroReader nullable logical types issue
- AvroReader did not convert logical types if those are defined with union
- Consolidated createSchema method in AvroSchemaRegistry and AvroTypeUtil as both has identical implementation and mai
ntaining both would be error-prone

This closes #1779.
2017-05-10 10:08:22 -04:00
Bryan Bende 3af53419af
NIFI-3770-RC2 prepare for next development iteration 2017-05-05 20:50:28 -04:00
Bryan Bende 3a605af8e0
NIFI-3770-RC2 prepare release nifi-1.2.0-RC2 2017-05-05 20:50:14 -04:00
Mark Payne 9b177fbcba
NIFI-3787: Addressed NPE and ensure that if validation fails due to RuntimeException, that it gets logged. Also clarified documentation for Json Reader services
This closes #1742.

Signed-off-by: Bryan Bende <bbende@apache.org>
2017-05-03 13:13:43 -04:00
Mark Payne daa277b067 NIFI-3775: This closes #1735. Ensure that a '0' byte is written out to Hortonworks Encoded Schema Reference header to indicate that GenericRecord is being used
Signed-off-by: joewitt <joewitt@apache.org>
2017-05-02 17:43:26 -04:00
Jeff Storck 26d90fbccf
NIFI-1833 - Addressed issues from PR review.
NIFI-1833 Moved AbstractListProcessor.java, EntityListing.java, and ListableEntity.java from nifi-standard-processors into nifi-processor-utils
Moved TestAbstractListProcessor.java into nifi-processor-utils
Set nifi-azure-nar's nar dependency back to nifi-standard-services-api-nar
Fixed failing integration tests (ITFetchAzureBlobStorage.java, ITListAzureBlobStorage.java, and ITPutAzureStorageBlob.java) and refactored them to be able to run in parallel

NIFI-1833 Moved security notice info in the additional details documentation into the descriptions of the specific attributes for which those notices are intended
Added displayName usage to properties
Updated exception handling in FetchAzureBlobStorage.java and PutAzureBlobStorage.java to cause flowfiles with Output/InputStreamCallback failures to be routed to the processor's failure relationship
Cleaned up dependencies in pom

NIFI-1833 Removed unnecessary calls to map on Optional in the onTrigger exception handling of FetchAzureBlobStorage.java and PutAzureBlobStorage.java

NIFI-1833 Updates due to nifi-processor-utils being moved under nifi-nar-bundles

This closes #1719.

Signed-off-by: Bryan Rosander <brosander@apache.org>
2017-05-02 14:39:46 -04:00
Mark Payne 07989b8460 NIFI-3739: This closes #1695. Added ConsumeKafkaRecord_0_10 and PublishKafkaRecord_0_10 processors 2017-05-01 18:47:51 -04:00
Bryan Bende 60d88b5a64
NIFI-3724 - Initial commit of Parquet bundle with PutParquet and FetchParquet
- Creating nifi-records-utils to share utility code from record services
- Refactoring Parquet tests to use MockRecorderParser and MockRecordWriter
- Refactoring AbstractPutHDFSRecord to use schema access strategy
- Adding custom validate to AbstractPutHDFSRecord and adding handling of UNION types when writing Records as Avro
- Refactoring project structure to get CS API references out of nifi-commons, introducing nifi-extension-utils under nifi-nar-bundles
- Updating abstract put/fetch processors to obtain the WriteResult and update flow file attributes

This closes #1712.

Signed-off-by: Andy LoPresto <alopresto@apache.org>
2017-05-01 16:10:35 -04:00