23 Commits

Author SHA1 Message Date
Mark Payne fd068fe978
NIFI-7557: uses a canonical representation of strings when recovering data from FlowFile Repository in order to avoid using huge amounts of heap when not necessary
- Fixed some problems with unit/integration tests

This closes #4507.

Signed-off-by: Bryan Bende <bbende@apache.org>
2020-09-03 10:21:50 -04:00
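
The canonical-string idea in NIFI-7557 can be illustrated with a minimal sketch. It assumes that recovery deserializes the same attribute names and values many times, so keeping one shared instance per distinct value saves heap. The class name StringCanonicalizer and its methods are hypothetical and are not taken from the NiFi code base.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not NiFi's actual class): hands back one shared instance per distinct string value. */
final class StringCanonicalizer {
    // Maps each distinct value to the first instance that was seen with that value.
    private final Map<String, String> canonical = new HashMap<>();

    /** Returns the shared instance equal to the given value, or null for null input. */
    String canonicalize(final String value) {
        if (value == null) {
            return null;
        }
        final String existing = canonical.putIfAbsent(value, value);
        return existing == null ? value : existing;
    }

    /** Number of distinct values currently retained. */
    int size() {
        return canonical.size();
    }
}
```

A recovery loop would pass each deserialized string through canonicalize() and discard the map once recovery finishes, so that duplicate copies become eligible for garbage collection.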
Mark Payne d68720920f
NIFI-7242: When a Parameter is changed, any property referencing that parameter should have its #onPropertyModified method called. Also renamed Accumulo tests to integration tests because they start embedded servers and connect to them, which caused failures in my environment. Also fixed a bug in TestLengthDelimitedJournal because it was resulting in failures when building locally as well.
Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>

This closes #4134.
2020-03-11 21:00:43 +01:00
Andy LoPresto 2cc467eb58
NIFI-3833 Added encrypted flowfile repository implementation.
Added EncryptedSchemaRepositoryRecordSerde.
Refactored CryptoUtils utility methods for repository encryption configuration validation checks to RepositoryEncryptorUtils.
Added FlowFile repo encryption config container.
Added more logging in cryptographic and serialization operations.
Generalized log messages in shared encryption services.
Added encrypted serde factory.
Added marker impl for encrypted WAL.
Moved validation of FF repo encryption config earlier in startup process.
Refactored duplicate property lookup code in NiFiProperties.
Added title case string helper.
Added validation and warning around misformatted encryption repo properties.
Added unit tests.
Added documentation to User Guide & Admin Guide.
Added screenshot for docs.
Added links to relevant sections of NiFi In-Depth doc to User Guide.
Added flowfile & content repository encryption configuration properties to default nifi.properties.

Signed-off-by: Joe Witt <joewitt@apache.org>
Signed-off-by: Mark Payne <markap14@hotmail.com>

This closes #3968.
2020-01-10 10:44:59 -08:00
Mark Payne eb6085a31d
NIFI-6658: Implement new bin/nifi.sh diagnostics command that is responsible for obtaining diagnostic information about many different parts of nifi, the operating system, etc.
This closes #3727.

Signed-off-by: Bryan Bende <bbende@apache.org>
2019-09-13 10:45:53 -04:00
Mark Payne 6b17c4b134
NIFI-6613: If LengthDelimitedJournal gets poisoned, log the reason and hold onto it so that it can be included as the cause of subsequent Exceptions that are thrown
This closes #3704.

Signed-off-by: Koji Kawamura <ijokarumawak@apache.org>
2019-09-13 09:02:45 +09:00
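
The behavior described in NIFI-6613 can be sketched as follows: the journal remembers why it was poisoned, logs it once, and attaches it as the cause of every later exception. This is only an illustration under those assumptions. PoisonableJournal, poison() and checkState() are hypothetical names, not the LengthDelimitedJournal API.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch (not the real LengthDelimitedJournal): retains the reason the journal was poisoned. */
final class PoisonableJournal {
    private static final Logger LOGGER = Logger.getLogger(PoisonableJournal.class.getName());

    // Null while the journal is healthy; otherwise the first failure that poisoned it.
    private Throwable poisonCause;

    /** Marks the journal as unusable, logging and keeping the first reason reported. */
    synchronized void poison(final Throwable cause) {
        if (poisonCause == null) {
            poisonCause = cause;
            LOGGER.log(Level.SEVERE, "Journal poisoned; further updates will be rejected", cause);
        }
    }

    /** Throws if the journal is poisoned, chaining the original reason as the cause. */
    synchronized void checkState() throws IOException {
        if (poisonCause != null) {
            throw new IOException("Cannot update journal because it was previously poisoned", poisonCause);
        }
    }
}
```

Chaining the original failure means that a stack trace seen long after the first error still points at the root problem rather than only at the rejected update.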
Mark Payne 40dcd1577b
NIFI-6410: Addressed race condition in LengthDelimitedJournal in which a Thread could throw an Exception, then another Thread could update the Journal before the first thread closes it. Added unit test to replicate.
This closes #3561
2019-07-02 10:15:52 -04:00
Mark Payne 58a25cfa5a
NIFI-6110: Updated StandardProcessSession such that if we fail to update the FlowFile Repository, we do not decrement claimant counts for any FlowFiles that were removed. Doing so can cause an issue where a FlowFile is removed, then the FlowFileRepo update fails, resulting in the flowfile being rolled back, but after its claimant count is decremented. It will then be processed again, which can result in the same thing, and we'll end up decrementing the claimant count repeatedly. Also updated LengthDelimitedJournal so that if the overflow directory already exists, it does not fail when trying to create the directory and instead just moves on. Updated unit tests to test both of these new fixes and updated DummyRecordSerde to be thread-safe because the TestLengthDelimitedJournal now needs it to be thread safe due to the new unit test that was added.
This closes #3358.

Signed-off-by: Bryan Bende <bbende@apache.org>
2019-03-08 10:32:38 -05:00
Mark Payne 83ac191736
NIFI-5997: If we swap out data, ensure that we do not increment the size of the queue by the size of the data that we failed to swap out. Also, if the FlowFile Repo does not know about a given swap file, do not restore it on restart
This closes #3290.

Signed-off-by: Bryan Bende <bbende@apache.org>
2019-02-04 14:29:17 -05:00
Mark Payne c425bd2880 NIFI-5533: Be more efficient with heap utilization
- Updated FlowFile Repo / Write Ahead Log so that any update that writes more than 1 MB of data is written to a file inside the FlowFile Repo rather than being buffered in memory
- Update SplitText so that it does not hold FlowFiles that are not the latest version in heap. Holding them prevents them from being garbage collected, so while the Process Session is holding the latest version of the FlowFile, SplitText is holding an older version, and this results in two copies of the same FlowFile object

NIFI-5533: Checkpoint

NIFI-5533: Bug Fixes

Signed-off-by: Matthew Burgess <mattyb149@apache.org>

This closes #2974
2018-10-09 09:18:02 -04:00
Mark Payne 5872eb3c4a NIFI-5331: When checkpointing SequentialAccessWriteAheadLog, if the journal is not healthy, ensure that we roll it over and ensure that if an Exception is thrown when attempting to fsync() or close() the journal, we continue creating a new one.
This closes #2952.
Signed-off-by: Brandon Devries <devriesb@apache.org>
2018-10-04 15:25:53 -04:00
Andy LoPresto f60585a9b6
NIFI-5376 Removed deprecation warnings.
Updated Javadoc for SiteToSiteClient#createTransaction() and HttpClient implementation.
Reverted exception listing in method contract for SiteToSiteClient#createTransaction and HttpClient implementation of same.
Reverted import ordering in TestSiteToSiteClient.
Reverted exception listing in TestGetHDFSFileInfo, TestListHDFS, and StandardHttpFlowFileServerProtocol.
Restored @SuppressWarnings annotation and removed unnecessary "public static" keywords from inner classes in SiteToSiteClient.

This closes #2841.

Signed-off-by: Joe Witt <joewitt@apache.org>
2018-07-09 20:45:34 -07:00
patricker 3be08b7e49 NIFI-5344 Fix Write Ahead Log Unit Test on Windows
This closes #2819

Signed-off-by: zenfenan <zenfenan@apache.org>
2018-06-28 22:39:45 +05:30
Mark Payne 0bcb241db3 NIFI-4774: Implementation of SequentialAccessWriteAheadLog and updates to WriteAheadFlowFileRepository to make use of the new implementation instead of MinimalLockingWriteAheadLog.
Signed-off-by: Matthew Burgess <mattyb149@apache.org>

This closes #2416
2018-02-19 09:26:01 -05:00
Mark Payne 0f2ac39f69 NIFI-3273 This closes #1611. Handle the case of trailing NUL bytes in MinimalLockingWriteAheadLog 2017-04-19 22:08:59 -07:00
Mark Payne 292dd1d66b
NIFI-3678: Ensure that we catch EOFException when reading header information from WAL Partition files; previously, we caught EOFExceptions when reading a 'record' from the WAL but not when reading header info
NIFI-3678: If we have a transaction ID but then have no more data written to Partition file, we end up with a NPE. Added logic to avoid this and instead return null for the next record when this happens

This closes #1656.

Signed-off-by: Bryan Bende <bbende@apache.org>
2017-04-07 10:28:27 -04:00
Mark Payne 091359b450 NIFI-3630 This closes #1632. Use a BufferedOutputStream when checkpointing FlowFile Repository
Signed-off-by: joewitt <joewitt@apache.org>
2017-03-30 16:46:53 -04:00
Mark Payne 96ed405d70 NIFI-3356: Initial implementation of writeahead provenance repository
- The idea behind NIFI-3356 was to improve the efficiency and throughput of the Provenance Repository, as it is often the bottleneck. While testing the newly designed repository,
  a handful of other, fairly minor, changes were made to improve efficiency as well, as these came to light when testing the new repository:

- Use a BufferedOutputStream within StandardProcessSession (via a ClaimCache abstraction) in order to avoid continually writing to FileOutputStream when writing many small FlowFiles
- Updated threading model of MinimalLockingWriteAheadLog - now performs serialization outside of lock and writes to a 'synchronized' OutputStream
- Change minimum scheduling period for components from 30 microseconds to 1 nanosecond. ScheduledExecutor is very inconsistent with timing of task scheduling. With the bored.yield.duration
  now present, this value doesn't need to be set to 30 microseconds. This was originally done to avoid processors that had no work from dominating the CPU. However, now that we will yield
  when processors have no work, this results in slowing down processors that are able to perform work.
- Allow nifi.properties to specify multiple directories for FlowFile Repository
- If backpressure is engaged while running a batch of sessions, then stop batch processing earlier. This helps FlowFiles to move through the system much more smoothly instead of the
  herky-jerky queuing that we previously saw at very high rates of FlowFiles.
- Added NiFi PID to log message when starting nifi. This was simply an update to the log message that provides helpful information.

NIFI-3356: Fixed bug in ContentClaimWriteCache that resulted in data corruption and fixed bug in RepositoryConfiguration that threw exception if cache warm duration was set to empty string

NIFI-3356: Fixed NPE

NIFI-3356: Added debug-level performance monitoring

NIFI-3356: Updates to unit tests that failed after rebasing against master

NIFI-3356: Incorporated PR review feedback

NIFI-3356: Fixed bug where we would delete index directories that are still in use; also added additional debug logging and a simple util class that can be used to textualize provenance event files - useful in debugging

This closes #1493
2017-02-22 12:40:06 -05:00
Mark Payne 1be0871473 NIFI-2854: Refactor repositories and swap files to use schema-based serialization so that nifi can be rolled back to a previous version after an upgrade.
NIFI-2854: Incorporated PR review feedback

NIFI-2854: Implemented feedback from PR Review

NIFI-2854: Ensure that all resources are closed on CompressableRecordReader.close() even if an IOException is thrown when closing one of them

This closes #1202
2016-11-18 14:53:13 -05:00
Mark Payne 62333c9e0a NIFI-1574: Ensure that we never flush a BufferedOutputStream's buffer on close of the write-ahead log 2016-02-29 15:42:54 -05:00
Mark Payne a8b063d61b NIFI-902: Ensure that we close the underlying file stream when we roll over a partition instead of the BufferedOutputStream, which could cause corruption if there was a failure to flush the entire buffer previously. 2015-08-30 19:48:19 -04:00
Mark Payne 5de37f63d9 NIFI-902: Ensure that if we get an IOException during rollover of WAL, we are able to recover 2015-08-28 10:04:58 -04:00
Mark Payne 4baffacc42 NIFI-892: If nifi.flowfile.repository.partitions property is changed, but repository already exists, just use previous value 2015-08-25 09:58:37 -04:00
joewitt aa99884782 NIFI-850 removed nifi parent, updated nifi pom, moved all nifi subdirs up one level, fixed readme. 2015-08-15 13:12:22 -04:00