- Updated FlowFile Repo / Write Ahead Log so that any update that writes more than 1 MB of data is written to a file inside the FlowFile Repo rather than being buffered in memory
- Update SplitText so that it does not hold FlowFiles that are not the latest version in heap. Doing them from being garbage collected, so while the Process Session is holding the latest version of the FlowFile, SplitText is holding an older version, and this results in two copies of the same FlowFile object
NIFI-5533: Checkpoint
NIFI-5533: Bug Fixes
Signed-off-by: Matthew Burgess <mattyb149@apache.org>
This closes#2974
Updated Javadoc for SiteToSiteClient#createTransaction() and HttpClient implementation.
Reverted exception listing in method contract for SiteToSiteClient#createTransaction and HttpClient tion of same.
Reverted import ordering in TestSiteToSiteClient.
Reverted exception listing in TestGetHDFSFileInfo, TestListHDFS, and StandardHttpFlowFileServerProtocol.
Restored @SuppressWarnings annotation and removed unnecessary "public static" keywords from inner classes in SiteToSiteClient.
This closes#2841.
Signed-off-by: Joe Witt <joewitt@apache.org>
NIFI-3678: If we have a transaction ID but then have no more data written to Partition file, we end up with a NPE. Added logic to avoid this and instead return null for the next record when this happens
This closes#1656.
Signed-off-by: Bryan Bende <bbende@apache.org>
- The idea behind NIFI-3356 was to improve the efficiency and throughput of the Provenance Repository, as it is often the bottleneck. While testing the newly designed repository,
a handful of other, fairly minor, changes were made to improve efficiency as well, as these came to light when testing the new repository:
- Use a BufferedOutputStream within StandardProcessSession (via a ClaimCache abstraction) in order to avoid continually writing to FileOutputStream when writing many small FlowFiles
- Updated threading model of MinimalLockingWriteAheadLog - now performs serialization outside of lock and writes to a 'synchronized' OutputStream
- Change minimum scheduling period for components from 30 microseconds to 1 nanosecond. ScheduledExecutor is very inconsistent with timing of task scheduling. With the bored.yield.duration
now present, this value doesn't need to be set to 30 microseconds. This was originally done to avoid processors that had no work from dominating the CPU. However, now that we will yield
when processors have no work, this results in slowing down processors that are able to perform work.
- Allow nifi.properties to specify multiple directories for FlowFile Repository
- If backpressure is engaged while running a batch of sessions, then stop batch processing earlier. This helps FlowFiles to move through the system much more smoothly instead of the
herky-jerky queuing that we previously saw at very high rates of FlowFiles.
- Added NiFi PID to log message when starting nifi. This was simply an update to the log message that provides helpful information.
NIFI-3356: Fixed bug in ContentClaimWriteCache that resulted in data corruption and fixed bug in RepositoryConfiguration that threw exception if cache warm duration was set to empty string
NIFI-3356: Fixed NPE
NIFI-3356: Added debug-level performance monitoring
NIFI-3356: Updates to unit tests that failed after rebasing against master
NIFI-3356: Incorporated PR review feedback
NIFI-3356: Fixed bug where we would delete index directories that are still in use; also added additional debug logging and a simple util class that can be used to textualize provenance event files - useful in debugging
This closes#1493