- Updated FlowFile Repo / Write Ahead Log so that any update that writes more than 1 MB of data is written to a file inside the FlowFile Repo rather than being buffered in memory
- Update SplitText so that it does not hold FlowFiles that are not the latest version in heap. Doing them from being garbage collected, so while the Process Session is holding the latest version of the FlowFile, SplitText is holding an older version, and this results in two copies of the same FlowFile object
NIFI-5533: Checkpoint
NIFI-5533: Bug Fixes
Signed-off-by: Matthew Burgess <mattyb149@apache.org>
This closes#2974
Added regression test for ProvenanceEventConsumer#isFilteringEnabled().
Changed isFilteringEnabled implementation to be expandable as other attributes are added using Streams.
EL + indentation.
This closes#2973.
Co-authored-by: Andy LoPresto <alopresto@apache.org>
Signed-off-by: Andy LoPresto <alopresto@apache.org>
Adding followings:
- Use separate DistributedMapCache for tracking entities to avoid
conflict with existing code
- Added more validation
- Delete listed entities from cache if reset is needed
- Support Local scope
- Added Initial Listing Target
This closes#2876.
Signed-off-by: Mark Payne <markap14@hotmail.com>
- Create nifi-syslog-utils to move syslog parsing functionalty to a central location shared by the processors and serialization/record system.
- Refactor Processors to use these utils
- Update 5424 syslog classes using simple-syslog-5424 to pick up new changes to support this work, as well as keep dependencies/types from bleeding out to the
processors or readers
- Refactor Syslog5424Event and Parser
- Create Syslog5424RecordReader
- per review, handle blank message differently from eof
- name schema per review
This closes#2816.
Signed-off-by: Bryan Bende <bbende@apache.org>
NIFI-5059 Changed it to use a schema registry.
NIFI-5059 Updated MongoDBLookupService to be a SchemaRegistryService.
NIFI-5059 Added two changes from a code review.
NIFI-5059 Fixed two bad references.
NIFI-5059 Refactored schema strategy handling.
NIFI-5059 Moved schema strategy handling to JsonInferenceSchemaRegistryService.
NIFI-5059 Updated to use new LookupService method.
NIFI-5059 fixed schema inference bug.
NIFI-5059 Added test for schema text strategy
NIFI-5059 incremented version number to make the build work.
NIFI-5059 fixed a stray 1.7.0 reference.
NIFI-5059 Added getDatabase to client service.
NIFI-5059 Added changes requested in a code review.
Signed-off-by: Matthew Burgess <mattyb149@apache.org>
This closes#2619
NIFI-5041: fixes http client version issue
Change-Id: I1b87ec4752ff6e1603025883a72113919aba5dd4
NIFI-5041: fixes Kerberos configuration
Change-Id: I868fdf3ea7cfd28cf415164e420f23bf3f6eefeb
NIFI-5041: adds new NOTICE entries
NIFI-5041: yields processor if no session is available, fixes error handling in session manager thread, fixes error returned in KerberosKeytabSPNegoScheme on authentication failure
Change-Id: I443e063ae21c446980087e5464a4b70373d730f6
NIFI-5041: makes the session manager thread exceptions visible to the users
Change-Id: I33fde5df6933cec2a87a4d82e681d4464f21b459
NIFI-5041: adds special SessionManagerException to identify error occurred on session manager thread
Change-Id: I25a52c025376a0cd238f14bda533d6f5f3e5fb4a
This closes#2630
Signed-off-by: Matthew Burgess <mattyb149@apache.org>
NIFI-950: Still seeing some slow response times when instantiating a large template in cluster mode so making some minor tweaks based on the results of CPU profiling
NIFI-5112: Refactored FlowSerializer so that it creates the desired intermediate data model that can be serialized, separate from serializing. This allows us to hold the FlowController's Read Lock only while creating the data model, not while actually serializing the data. Configured Jersey Client in ThreadPoolRequestReplicator not to look for features using the Service Loader for every request. Updated Template object to hold a DOM Node that represents the template contents instead of having to serialize the DTO, then parse the serialized form as a DOM object each time that it needs to be serialized.
NIFI-5112: Change ThreadPoolRequestReplicator to use OkHttp client instead of Jersey Client
NIFI-5111: Ensure that if a node is no longer cluster coordinator, that it clears any stale heartbeats.
NIFI-5110: Notify StandardProcessScheduler when a component is removed so that it will clean up any resource related to component lifecycle.
NIFI-950: Avoid gathering the Status objects for entire flow when we don't need them; removed unnecessary code
NIFI-950: Bug fixes
NIFI-950: Bug fix; added validation status to ProcessorDTO, ControllerServiceDTO, ReportingTaskDTO; updated DebugFlow to allow for pause time to be set in the customValidate method for testing functionality
NIFI-950: Addressing test failures
NIFI-950: Bug fixes
NIFI-950: Addressing review feedback
NIFI-950: Fixed validation logic in mock framework
This closes#2693
- Forcing FileSystem statistics thread to be interrupted when HDFS processors are stopped
- Stop creating temp components during import from registry, use bundle info instead
This closes#2668.
Signed-off-by: Mark Payne <markap14@hotmail.com>
Fixed dependency issue by providing a local JSON reader
Rebased + fixed conflict + updated versions in pom + EL scope
Signed-off-by: Matthew Burgess <mattyb149@apache.org>
This closes#2575
- take into account input requirement for documentation rendering
- Renamed variable registry scope and added comments
- Doc + change in mock framework to check scope + update of components + UI
The feature allows users to convert from non-integral types to the correct underlying type. The
original behavior is maintained; however, now simple conversions take place automatically for some
logical types (date, time, and timestamp).
This closes#2526.
Signed-off-by: Derek Straka <derek@asterius.io>
Signed-off-by: Mark Payne <markap14@hotmail.com>
address review comments
NIFI-4809 - Added Record Writer property
added unit tests and additional details doc
review comments
Signed-off-by: Matthew Burgess <mattyb149@apache.org>
This closes#2430
Before this fix, MergeContent only processed the first bin even if there
were multiple bins.
There were two unit tests marked with Ignore those had been
failing because of this.
This closes#2444.
Signed-off-by: Mark Payne <markap14@hotmail.com>
- Applied the same scale adjustment not only to BigDecimal inputs, but
also to Double values.
Signed-off-by: Matthew Burgess <mattyb149@apache.org>
This closes#2450
NIFI-3472 NIFI-4350 Removed explicit relogin code from HDFS/Hive/HBase components and updated SecurityUtils.loginKerberos to use UGI.loginUserFromKeytab. This brings those components in line with daemon-process-style usage, made possible by NiFi's InstanceClassloader isolation. Relogin (on ticket expiry/connection failure) can now be properly handled by hadoop-client code implicitly.
NIFI-3472 Added default value (true) for javax.security.auth.useSubjectCredsOnly to bootstrap.conf
NIFI-3472 Added javadoc explaining the removal of explicit relogin threads and usage of UGI.loginUserFromKeytab
Readded Relogin Period property to AbstractHadoopProcessor, and updated its documentation to indicate that it is now a deprecated property
Additional cleanup of code that referenced relogin periods
Marked KerberosTicketRenewer is deprecated
NIFI-3472 Cleaned up imports in TestPutHiveStreaming
- Removed duplicated creation of a ParentProcessGroupSearchNode for the
root ProcessGroup.
- Removed duplicated creation of a ParentProcessGroupSearchNode for each
component inside a ProcessGroup.
- Fixed ProcessGroup id hierarchy.
- Fixed filtering logic.
- Added unit tests for filtering by ProcessGroupId and Remote
Input/Output ports.
Signed-off-by: Matthew Burgess <mattyb149@apache.org>
This closes#2351
- Simplified consumeEvents method signature
- Refactored ComponentMapHolder methods visibility
- Renamed componentMap to componentNameMap
- Map more metadata from ConnectionStatus for Remote Input/Output Ports
- Support Process Group hierachy filtering
- Throw an exception when the reporting task fails to send provenance
data to keep current provenance event index so that events can be
consumed again
NIFI-4707: Add process group ID/name to S2SProvReportingTask records
NIFI-4707: Added support for filtering provenance on process group ID
NIFI-4707: Fixed support for provenance in Atlas reporting task
NIFI-4707: Refactored common code into reporting-utils, fixed filtering
- Upgrading to Jersey 2.x.
- Updating NOTICE files where necessary.
- Fixing checkstyle issues.
This closes#2206.
Signed-off-by: Andy LoPresto <alopresto@apache.org>
remove redundant additionalDetails.html and add docs to CapabilityDescription in HDFS processors
revert the modified CapabilityDescriptions in HDFS processors and add it to AbstractHadoopProcessor
- Upgraded immediately actionable dependency versions from Meterian report.
- Upgraded jackson-core test dependencies for HBase and Elasticsearch modules.
- Only 3 instances of jackson-core < 2.8.6 (Google Cloud Platform and Spark Receiver modules).
- Upgraded version of poi dependency in nifi-email-processors to 3.16.
- Resolving dependency issues after rebasing against 1.5.0-SNAPSHOT.
- Removed jackson-databind from <dependencyManagement> block in nifi/pom.xml and added explicit reference to ${jackson.version} in all referenced artifacts.
- Removed jackson-mapper-asl from <dependencyManagement> block in nifi/pom.xml and added explicit reference to ${jackson.old.version} in all referenced artifacts.
- Removed Jasypt from <dependencyManagement> and added explicit version in test dependency for legacy compatibility.
- This closes#2084
- Removed FlowFile from RecordReaderFactory, RecordSetWriterFactory and SchemaAccessStrategy.
- Renamed variable 'allowableValue' to 'strategy' to represent its meaning better.
- Removed creation of temporal FlowFile to resolve Record Schema from ConsumerLease.
- Removed unnecessary 'InputStream content' argument from
RecordSetWriterFactory.getSchema method.
This closes#1877.
Before this fix, it's possible that ListXXX processors can miss files those have the same timestamp as the one which was the latest processed timestamp at the previous cycle. Since it only used timestamps, it was not possible to determine whether a file is already processed or not.
However, storing every single processed identifier as we used to will not perform well.
Instead, this commit makes ListXXX to store only identifiers those have the latest timestamp at a cycle to minimize the amount of state data to store.
NIFI-3332: ListXXX to not miss files with the latest processed timestamp
- Fixed TestAbstractListProcessor to use appropriate time precision.
Without this fix, arbitrary test can fail if generated timestamp does
not have the desired time unit value, e.g. generated '10:51:00' where
second precision is tested.
- Fixed TestFTP.basicFileList to use millisecond time precision explicitly
because FakeFtpServer's time precision is in minutes.
- Changed junit dependency scope to 'provided' as it is needed by
ListProcessorTestWatcher which is shared among different modules.
This closes#1975.
Signed-off-by: Bryan Bende <bbende@apache.org>
- Refactored variable names to better represents what those are meant for.
- Added deterministic logic which detects target filesystem timestamp precision and adjust lag time based on it.
- Changed from using System.nanoTime() to System.currentTimeMillis in test because Java File API reports timestamp in milliseconds at the best granularity. Also, System.nanoTime should not be used in mix with epoch milliseconds because it uses arbitrary origin and measured differently.
- Changed TestListFile to use more longer interval between file timestamps those are used by testFilterAge to provide more consistent test result because sleep time can be longer with filesystems whose timestamp in seconds precision.
- Added logging at TestListFile.
- Added TestWatcher to dump state in case assertion fails for further investigation.
- Added Timestamp Precision property so that user can set if auto-detect is not enough
- Adjust timestamps for ages test
This closes#1915.
Signed-off-by: Bryan Bende <bbende@apache.org>
NiFi can now parse an Avro schema of a record that references an already defined record, including itself.
Signed-off-by: James Wing <jvwing@gmail.com>
This closes#2034.
NIFI-4060: Addressed threading issue with RecordBin being updated after it is completed; fixed issue that caused mime.type attribute not to be written properly if all incoming flowfiles already have a different value for that attribute
NIFI-4060: Bug fixes; improved documentation; added a lot of debug information; updated StandardProcessSession to produce more accurate logs in case of a session being committed/rolled back with open input/output streams
Signed-off-by: Matt Burgess <mattyb149@apache.org>
This closes#1958
NIFI-1763: Fixed bug where the Confluent Schema Registry Schema Access Writer was not being created
Signed-off-by: Yolanda M. Davis <ymdavis@apache.org>
This closes#1938
- Oracle NUMBER can return 0 precision and -127 or 0 scale with variable scale NUMBER such as ROWNUM or function result
- Added 'Default Decimal Precision' and 'Default Decimal Scale' property to ExecuteSQL and QueryDatabaseTable to apply default precision and scale if those are unknown
- Coerce BigDecimal scale to field schema logical type, so that BigDecimals having different scale can be written
Signed-off-by: Matt Burgess <mattyb149@apache.org>
This closes#1851
- Removed remaining duplicate lines of code left by NIFI-3861 refactoring.
- Added test case that writes Avro record having union field.
This closes#1813.
- Added Logical type support for DECIMAL/NUMBER, DATE, TIME and TIMESTAMP column types.
- Added Logical type 'decimal' to AvroReader so that Avro records with logical types written by ExecuteSQL and QueryDatabaseTable can be consumed by AvroReader.
- Added JdbcCommon.AvroConversionOptions to consolidate conversion options.
- Added 'Use Avro Logical Types' property to ExecuteSQL and QueryDatabaseTable to toggle whether to use Logical types.
- Added 'mime.type' FlowFile attribute as 'application/avro-binary' so that output FlowFiles can be displayed by content viewer.
Signed-off-by: Matt Burgess <mattyb149@apache.org>
This closes#1798
NIFI-3838: Updated version from 1.2.0-SNAPSHOT to 1.3.0-SNAPSHOT; removed unneeded value from AttributeExpression.ResultType enum
NIFI-3838: Addressed PR Review feedback
NIFI-3838: Allow for schemas to be merged together for a record; refactored RecordSetWriterFactory so that there is a method to obtain the schema and then the writer is created with that schema. Added additional unit tests
NIFI-3838: Addressed problems with documentation based on PR Review
NIFI-3838: Fixed checkstyle violation
NIFI-3838: Addressed issue of comparing different types of Number objects
Signed-off-by: Matt Burgess <mattyb149@apache.org>
This closes#1772
Previous fix#1779 refactored the way to check Logical type to use string constants.
One of those refactoring used wrong constant mistakenly in normalizeValue method.
Fortunately, this defect is harmless since even though normalizeValue did not convert int to Time, DataTypeUtils.convertType does the same conversion.
Signed-off-by: Matt Burgess <mattyb149@apache.org>
This closes#1782
This closes#1787.
- When converting from a raw value to an Avro object, convert the values
of any Avro map types so that they can be complex types like other
records.
- AvroReader did not convert logical types if those are defined with union
- Consolidated createSchema method in AvroSchemaRegistry and AvroTypeUtil as both has identical implementation and mai
ntaining both would be error-prone
This closes#1779.
NIFI-1833 Moved AbstractListProcessor.java, EntityListing.java, and ListableEntity.java from nifi-standard-processors into nifi-processor-utils
Moved TestAbstractListProcessor.java into nifi-processor-utils
Set nifi-azure-nar's nar dependency back to nifi-standard-services-api-nar
Fixed failing integration tests (ITFetchAzureBlobStorage.java, ITListAzureBlobStorage.java, and ITPutAzureStorageBlob.java) and refactored them to be able to run in parallel
NIFI-1833 Moved security notice info in the additional details documentation into the descriptions of the specific attributes for which those notices are intended
Added displayName usage to properties
Updated exception handling in FetchAzureBlobStorage.java and PutAzureBlobStorage.java to cause flowfiles with Output/InputStreamCallback failures to be routed to the processor's failure relationship
Cleaned up dependencies in pom
NIFI-1833 Removed unnecessary calls to map on Optional in the onTrigger exception handling of FetchAzureBlobStorage.java and PutAzureBlobStorage.java
NIFI-1833 Updates due to nifi-processor-utils being moved under nifi-nar-bundles
This closes#1719.
Signed-off-by: Bryan Rosander <brosander@apache.org>
- Creating nifi-records-utils to share utility code from record services
- Refactoring Parquet tests to use MockRecorderParser and MockRecordWriter
- Refactoring AbstractPutHDFSRecord to use schema access strategy
- Adding custom validate to AbstractPutHDFSRecord and adding handling of UNION types when writing Records as Avro
- Refactoring project structure to get CS API references out of nifi-commons, introducing nifi-extension-utils under nifi-nar-bundles
- Updating abstract put/fetch processors to obtain the WriteResult and update flow file attributes
This closes#1712.
Signed-off-by: Andy LoPresto <alopresto@apache.org>