- The idea behind NIFI-3356 was to improve the efficiency and throughput of the Provenance Repository, as it is often the bottleneck. While testing the newly designed repository,
a handful of other, fairly minor, changes were made to improve efficiency as well, as these came to light when testing the new repository:
- Use a BufferedOutputStream within StandardProcessSession (via a ClaimCache abstraction) in order to avoid continually writing to FileOutputStream when writing many small FlowFiles
- Updated threading model of MinimalLockingWriteAheadLog - now performs serialization outside of lock and writes to a 'synchronized' OutputStream
- Change minimum scheduling period for components from 30 microseconds to 1 nanosecond. ScheduledExecutor is very inconsistent with timing of task scheduling. With the bored.yield.duration
now present, this value doesn't need to be set to 30 microseconds. This was originally done to avoid processors that had no work from dominating the CPU. However, now that we will yield
when processors have no work, this results in slowing down processors that are able to perform work.
- Allow nifi.properties to specify multiple directories for FlowFile Repository
- If backpressure is engaged while running a batch of sessions, then stop batch processing earlier. This helps FlowFiles to move through the system much more smoothly instead of the
herky-jerky queuing that we previously saw at very high rates of FlowFiles.
- Added NiFi PID to log message when starting nifi. This was simply an update to the log message that provides helpful information.
NIFI-3356: Fixed bug in ContentClaimWriteCache that resulted in data corruption and fixed bug in RepositoryConfiguration that threw exception if cache warm duration was set to empty string
NIFI-3356: Fixed NPE
NIFI-3356: Added debug-level performance monitoring
NIFI-3356: Updates to unit tests that failed after rebasing against master
NIFI-3356: Incorporated PR review feedback
NIFI-3356: Fixed bug where we would delete index directories that are still in use; also added additional debug logging and a simple util class that can be used to textualize provenance event files - useful in debugging
This closes#1493
The 'exec' command added by NIFI-2689 affected restart behavior
negatively as 'exec' command will not execute subsequent commands in the
shell script.
This commit changes 'exec' is added only when 'run' is specified.
This closes#1523.
Signed-off-by: Aldrin Piri <aldrin@apache.org>
Added support for simple Key/Value Schema Registry as Controller Service
Added support for registering multiple schemas as dynamic properties of Schema Registry Controller Service
Added the following 8 processors
- ExtractAvroFieldsViaSchemaRegistry
- TransformAvroToCSVViaSchemaRegistry
- TransformAvroToJsonViaSchemaRegistry
- TransformCSVToAvroViaSchemaRegistry
- TransformCSVToJsonViaSchemaRegistry
- TransformJsonToAvroViaSchemaRegistry
- TransformJsonToCSVViaSchemaRegistry
- UpdateAttributeWithSchemaViaSchemaRegistry
polishing
NIFI-3354 Adding support for HDFS Schema Registry, unions and default values in the Avro Schema and NULL columns in the source CSV
NIFI-3354 Adding support for logicalTypes per the Avro 1.7.7 spec
NIFI-3354 polishing and restructuring CSVUtils
NIFI-3354 renamed processors to address PR comment
NIFI-3354 addressed latest PR comments
- removed HDFS-based ControllerService. It will be migrated into a separate bundle as a true extension.
- removed UpdateAttribute. . . processor
- added mime.type attribute to all Transform* processors
NIFI-3354 added missing L&N entries
This closes pr/1436
Before this fix, files with the latest timestamp within a listing
iteration are always be held back one cycle no matter how old it is.
Signed-off-by: Andre F de Miranda <trixpan@users.noreply.github.com>
- Add 'nifi.flow.configuration.archive.max.count' in nifi.properties
- Change default archive limit so that it uses archive max time(30 days)
and storage (500MB) if no limitation is specified
- Simplified logic to delete old archives
This closes#1460.
Signed-off-by: Koji Kawamura <ijokarumawak@apache.org>
- Added Signal Counter Delta property
- Added Signal Buffer Count property
- Added processor property name and display name
- Changed IOException handling from routing it to failure to throw
RuntimeException, so that NiFi framework can yield the processor for a while and try again
Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>
This closes#1466.
- Added detailed description about how the URL property works with
GetHTMLElement
- Added Expression support with URL
- Made URL property dynamic with ModifyHTMLElement and PutHTMLElement,
since it won't be used to alter HTML element and need not to be
specified. Making it a dynamic property let existing processor configuration stays valid
* add exec to RUN_NIFI_CMD
* remove subshell for else
* tested compatible with runit with these changes
This closes#966.
Signed-off-by: Aldrin Piri <aldrin@apache.org>
* Corrected handling of corrupt journal file records that prevents instance startup and loss of records from corrupt files. Specifically, exception handling was expanded to cover failures on records after the first the same as failures on the first record.
* Adjusted log messages to reflect that the remainder or all of the journal will be skipped, not just the current record.
This closes#1485.
H2 and Kafka broker uses the same default port 9092.
If an user is running Kafka broker on the same machine, or run the unit
tests in parallel, DBCPServiceTest can fail since some of its test
methods connects to port 9092.
This closes#1504.
Signed-off-by: Andy LoPresto <alopresto@apache.org>
* Remove function based on JDK source.
* Add new function to find bytes based on RFC3629.
* Add field name to log entry when field is truncated.
Signed-off-by: Mike Moser <mosermw@apache.org>
This closes#1475
* Credentials service with tests
* Abstract processor definitions
* GCS-themed processors and their corresponding tests
Signed-off-by: James Wing <jvwing@gmail.com>
This closes#1482.
use the FileNameFilter when not passing down explit jar paths
Filter out ^. files when reading lists of files from directories
Signed-off-by: Koji Kawamura <ijokarumawak@apache.org>
- Added configure audits for Transport Protocol, HTTP Proxy Server Host,
Port, User and Password in RemoteProcessGroup configuration
- Added configure audits for enabling/disabling individual remote port
- Added configure audits for Concurrent Tasks and Compressed in Remote
Port configuration
- This closes#1476
* Updated StandardRecordWriter, even though it is now deprecated to consider the encoding behavior of java.io.DataOutputStream.writeUTF() and truncate string values such that the UTF representation will not be longer than that DataOutputStream's 64K UTF format limit.
* Updated the new SchemaRecordWriter class to similarly truncate long Strings that will be written as UTF.
* Add tests to confirm handling of large UTF strings and various edge conditions of UTF string handling.
Signed-off-by: Mike Moser <mosermw@apache.org>
This closes#1469.
- Marked PutKafka Partition Strategy property as deprecated, as Kafka 0.8 client doesn't use 'partitioner.class' as producer property, we don't have to specify it.
- Changed Partition Strategy property from a required one to a dynamic property, so that existing processor config can stay in valid state.
- Fixed partition property to work.
- Route a flow file if it failed to be published due to invalid partition.
This closes#1425
NIFI-2615 Addressing changes from P/R. Specifically, removing .gitignore as it should not be there for a nar. Removed non-used class. Changed name in notice
- Requiring WRITE permissions to the parent resource when attempting to remove a component.
- Updating expired certificates in the REST API integration tests.
This closes#1399.
Signed-off-by: James Wing <jvwing@gmail.com>
- Support counters at Wait/Notify processors so that NiFi flow can be
configured to wait for N signals
- Extract Wait/Notify logics into WaitNotifyProtocol
- Added FragmentAttributes to manage commonly used fragment attributes
- Changed existing split processors to set 'fragment.identifier' and
'fragment.count', so that Wait can use those to wait for all splits
get processed
This closes#1420.
Signed-off-by: Bryan Bende <bbende@apache.org>
Add support in SelectHiveQL to get script content from the Flow File to bring consistency with patterns used for PutHiveQL and support extra query management.
Changed behavior of using Flowfile to match ExecuteSQL. Handle query delimiter when embedded. Added test case for embedded delimiter
Formatting and License Header
PutHiveQL and SelectHiveQL Processor enhancements. Added support for multiple statements in a script. Options for delimiters, quotes, escaping, include header and alternate header.
Add support in SelectHiveQL to get script content from the Flow File to bring consistency with patterns used for PutHiveQL and support extra query management.
Changed behavior of using Flowfile to match ExecuteSQL. Handle query delimiter when embedded. Added test case for embedded delimiter
Removing dead code.
Signed-off-by: Matt Burgess <mattyb149@apache.org>
Comments to Clarify test case.
Signed-off-by: Matt Burgess <mattyb149@apache.org>
Final whitespace/formatting/typo changes
This closes#1316
- Adding additional parameters to be able to limit the size of the provenance response. Specifically, whether the events should be summarized and whether events should be returned incrementally before the query has completed.
- Ensuring the cluster node address is included in provenance events returned.
- Ensuring there is a cluster coordinator before attempting to get the cluster node address.
- Removing exponential back off between provenance requests.
- Ensuring the content viewer url is retrieve before initializing the provenance table.
This closes#1413.
- Using fetch and replace together can provide optimistic locking for
concurrency control.
- Added fetch to get cache entry with its meta data such as revision
number.
- Added replace to update cache only if it has not been updated.
- Added Map Cache protocol version 2 for those new operations.
- Existing operations such as get or put can work with protocol version
1.
This closes#1410.
Signed-off-by: Bryan Bende <bbende@apache.org>
NIFI-3004 Added logic to expire StandardSSLContextService customValidate cache after 5 invocations.
Updated unit test to demonstrate this logic.
This closes#1375.
Signed-off-by: Andy LoPresto <alopresto@apache.org>
- Removing unnecessary authorization check during second phase of connection creation.
- Ensuring that the remote group port returns the correct resource type though not super critical since it is not possible to create policies for remote ports.
This closes#1353.