NIFI-1374 updated admin guide to provide advice for permgen and codecache

Signed-off-by: Aldrin Piri <aldrin@apache.org>
joewitt 2016-01-10 12:41:05 -05:00 committed by Aldrin Piri
parent 035562bb33
commit 684f48ff92
1 changed files with 74 additions and 106 deletions

@ -24,18 +24,22 @@ System Requirements
Apache NiFi can run on something as simple as a laptop, but it can also be clustered across many enterprise-class servers. Therefore, the amount of hardware and memory needed will depend on the size and nature of the dataflow involved. The data is stored on disk while NiFi is processing it. So NiFi needs to have sufficient disk space allocated for its various repositories, particularly the content repository, flowfile repository, and provenance repository (see the <<system_properties>> section for more information about these repositories). NiFi has the following minimum system requirements:
* Requires Java 7 or newer
* Supported Operating Systems:
** Linux
** Unix
** Windows
** Mac OS X
* Supported Web Browsers:
** Internet Explorer 9+ (see note below)
** Mozilla FireFox 24+
** Google Chrome 36+
** Safari 8
**Note** There is a known issue in Internet Explorer (IE) 10 and 11 that can cause problems when moving items on the NiFi graph. If you encounter this problem, we suggest using a browser other than IE. This known issue is described here: https://connect.microsoft.com/IE/Feedback/Details/1050422.
**Note** The default PermGen sizing in Java 7 can result in out-of-memory errors because of the number of classes loaded by NiFi. See the <<bootstrap_properties>> section for more information.
**Note** Under sustained and extremely high throughput the CodeCache settings may need to be tuned to avoid sudden performance loss. See the <<bootstrap_properties>> section for more information.
How to install and start NiFi
-----------------------------
@ -103,14 +107,14 @@ And your distribution may require an edit to /etc/security/limits.d/90-nproc.con
----
Increase the number of TCP socket ports available::
This is particularly important if your flow will be setting up and tearing
down a large number of sockets in a small period of time.
----
sudo sysctl -w net.ipv4.ip_local_port_range="10000 65000"
----
Set how long sockets stay in a TIME_WAIT state when closed::
You don't want your sockets to sit and linger too long given that you want to be
able to quickly set up and tear down new sockets. It is a good idea to read more about
it, but to adjust it, do something like
----
@ -142,17 +146,17 @@ NiFi provides several different configuration options for security purposes. The
|`nifi.security.keystoreType` | The type of Keystore. Must be either `PKCS12` or `JKS`.
|`nifi.security.keystorePasswd` | The password for the Keystore.
|`nifi.security.keyPasswd` | The password for the certificate in the Keystore. If not set, the value of `nifi.security.keystorePasswd` will be used.
|`nifi.security.truststore` | Filename of the Truststore that will be used to authorize those connecting to NiFi. If not set, all who
attempt to connect will be provided access as the 'Anonymous' user.
|`nifi.security.truststoreType` | The type of the Truststore. Must be either `PKCS12` or `JKS`.
|`nifi.security.truststorePasswd` | The password for the Truststore.
|`nifi.security.needClientAuth` | Specifies whether or not connecting clients must authenticate themselves. Specifically this property is used
by the NiFi cluster protocol. If the Truststore properties are not set, this must be `false`. Otherwise, a value
of `true` indicates that nodes in the cluster will be authenticated and must have certificates that are trusted
by the Truststores.
|`nifi.security.anonymous.authorities` | Specifies the roles that should be granted to users that connect over HTTPS anonymously. All users can make
use of anonymous access, however if they have been granted a particular level of access by an administrator
it will take precedence if they access NiFi using a client certificate or once they have logged in.
|==================================================================================================================================================
Once the above properties have been configured, we can enable the User Interface to be accessed over HTTPS instead of HTTP. This is accomplished
@ -163,9 +167,9 @@ be accessible from all network interfaces, a value of `0.0.0.0` should be used.
NOTE: It is important when enabling HTTPS that the `nifi.web.http.port` property be unset.
Similar to `nifi.security.needClientAuth`, the web server can be configured to require certificate based client authentication for users accessing
the User Interface. In order to do this it must be configured to not support username/password authentication (see below) and not grant access to
anonymous users (see `nifi.security.anonymous.authorities` above). Either of these options will configure the web server to WANT certificate based client
authentication. This will allow it to support users with certificates and those without that may be logging in with their credentials or those accessing
anonymously. If username/password authentication and anonymous access are not configured, the web server will REQUIRE certificate based client authentication.
Now that the User Interface has been secured, we can easily secure Site-to-Site connections and inner-cluster communications, as well. This is
@ -179,9 +183,9 @@ NiFi supports user authentication via client certificates or via username/passwo
Provider'. The Login Identity Provider is a pluggable mechanism for authenticating users via their username/password. Which Login Identity Provider
to use is configured in two properties in the _nifi.properties_ file.
The `nifi.login.identity.provider.configuration.file` property specifies the configuration file for Login Identity Providers.
The `nifi.security.user.login.identity.provider` property indicates which of the configured Login Identity Providers should be
used. If this property is not configured, NiFi will not support username/password authentication and will require client
certificates for authenticating users over HTTPS. By default, this property is not configured, meaning that username/password authentication must be
explicitly enabled.
@ -258,13 +262,13 @@ Once NiFi is configured to run securely and an authentication mechanism is confi
to configure who will have access to the system and what types of access those people will have.
NiFi controls this through the use of an 'Authority Provider.' The Authority Provider is a pluggable
mechanism for providing authorizations to different users. Which Authority Provider to use is configured
using two properties in the _nifi.properties_ file.
The `nifi.authority.provider.configuration.file` property specifies the configuration file for Authority Providers.
The `nifi.security.user.authority.provider` property indicates which of the configured Authority Providers should be
used.
By default, the `file-provider` Authority Provider is selected and is configured to use the permissions granted in
the _authorized-users.xml_ file. This is typically sufficient for instances of NiFi that are run in "standalone" mode.
If the NiFi instance is configured to run in a cluster, the node will typically use the `cluster-node-provider`
Provider and the Cluster Manager will typically use the `cluster-ncm-provider` Provider. Both of these Providers
@ -272,8 +276,8 @@ have a default configuration in the _authority-providers.xml_ file but are comme
When using the `cluster-node-provider` Provider, all of the authorization is provided by the Cluster Manager. In this
way, the configuration only has to be maintained in one place and will be consistent across the entire cluster.
When configuring the Cluster Manager or a standalone node, it is necessary to manually designate an ADMIN user
in the _authorized-users.xml_ file, which is located in the root installation's conf directory.
After this ADMIN user has been added, s/he may grant access
to other users, systems, and other instances of NiFi, through the User Interface (UI) without having to manually edit the _authorized-users.xml_
@ -330,9 +334,9 @@ The following roles are available in NiFi:
the lineage of data. Additionally, this role provides the ability to view or download
the content of a FlowFile from a Provenance event (assuming that the content is still
available in the Content Repository and that the Authority Provider also grants access).
This access is not provided to users with Read Only
(unless the user has both Read Only and Provenance roles) because the information provided
to users with this role can potentially be very sensitive in nature, as all FlowFile attributes
and data are exposed. In order to Replay a Provenance event, a user is required to have both
the Provenance role as well as the Data Flow Manager role.
| NiFi | The NiFi Role is intended to be assigned to machines that will interact with an instance of NiFi
@ -367,12 +371,12 @@ cluster, s/he can grant it to the group and avoid having to grant it individuall
Clustering Configuration
------------------------
This section provides a quick overview of NiFi Clustering and instructions on how to set up a basic cluster. In the future, we hope to provide supplemental documentation that covers the NiFi Cluster Architecture in depth.
The design of NiFi clustering is a simple master/slave model where there is a master and one or more slaves.
While the model is that of master and slave, if the master dies, the slaves are all instructed to continue operating
as they were to ensure the dataflow remains live. The absence of the master simply means new slaves cannot join the
cluster and cluster flow changes cannot occur until the master is restored. In NiFi clustering, we call the master
the NiFi Cluster Manager (NCM), and the slaves are called Nodes. See a full description of each in the Terminology section below.
*Why Cluster?* +
@ -388,11 +392,11 @@ NiFi Clustering is unique and has its own terminology. It's important to underst
*Nodes*: Each cluster is made up of the NCM and one or more nodes. The nodes do the actual data processing. (The NCM does not process any data; all data runs through the nodes.) While nodes are connected to a cluster, the DFM may not access the User Interface for any of the individual nodes. The User Interface of a node may only be accessed if the node is manually removed from the cluster.
*Primary Node*: Every cluster has one Primary Node. On this node, it is possible to run "Isolated Processors" (see below). By default, the NCM will elect the first node that connects to the cluster as the Primary Node; however, the DFM may select a new node as the Primary Node in the Cluster Management page of the User Interface if desired. If the cluster restarts, the NCM will "remember" which node was the Primary Node and wait for that node to re-connect before allowing the DFM to make any changes to the dataflow. The ADMIN may adjust how long the NCM waits for the Primary Node to reconnect by adjusting the property _nifi.cluster.manager.safemode.duration_ in the _nifi.properties_ file, which is discussed in the <<system_properties>> section of this document.
*Isolated Processors*: In a NiFi cluster, the same dataflow runs on all the nodes. As a result, every component in the flow runs on every node. However, there may be cases when the DFM would not want every processor to run on every node. The most common case is when using a processor that communicates with an external service using a protocol that does not scale well. For example, the GetSFTP processor pulls from a remote directory, and if the GetSFTP on every node in the cluster tries simultaneously to pull from the same remote directory, there could be race conditions. Therefore, the DFM could configure the GetSFTP on the Primary Node to run in isolation, meaning that it only runs on that node. It could pull in data and -with the proper dataflow configuration- load-balance it across the rest of the nodes in the cluster. Note that while this feature exists, it is also very common to simply use a standalone NiFi instance to pull data and feed it to the cluster. It just depends on the resources available and how the Administrator decides to configure the cluster.
*Heartbeats*: The nodes communicate their health and status to the NCM via "heartbeats", which let the NCM know they are still connected to the cluster and working properly. By default, the nodes emit heartbeats to the NCM every 5 seconds, and if the NCM does not receive a heartbeat from a node within 45 seconds, it disconnects the node due to "lack of heartbeat". (The 5-second and 45-second settings are configurable in the _nifi.properties_ file. See the <<system_properties>> section of this document for more information.) The reason that the NCM disconnects the node is because the NCM needs to ensure that every node in the cluster is in sync, and if a node is not heard from regularly, the NCM cannot be sure it is still in sync with the rest of the cluster. If, after 45 seconds, the node does send a new heartbeat, the NCM will automatically reconnect the node to the cluster. Both the disconnection due to lack of heartbeat and the reconnection once a heartbeat is received are reported to the DFM in the NCM's User Interface.
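The heartbeat rule described above can be sketched in a few lines of Python. This is only an illustration of the timing logic, not NiFi's implementation: the function name, the state strings, and the choice to treat exactly 45 seconds of silence as still connected are all assumptions made for the example.

```python
# Defaults taken from the text above; both are configurable in nifi.properties.
HEARTBEAT_INTERVAL_SEC = 5
DISCONNECT_AFTER_SEC = 45


def node_state(last_heartbeat_sec, now_sec, timeout_sec=DISCONNECT_AFTER_SEC):
    """Classify a node from the time of its most recent heartbeat.

    Hypothetical helper: a node is treated as disconnected once more than
    timeout_sec seconds have passed without a heartbeat. A later heartbeat
    simply moves last_heartbeat_sec forward, so the node reads as connected again.
    """
    silent_for = now_sec - last_heartbeat_sec
    return "DISCONNECTED" if silent_for > timeout_sec else "CONNECTED"


# A node last heard from at t=0 is still connected at t=10 but not at t=46.
print(node_state(0, 10))
print(node_state(0, 46))
# After a new heartbeat at t=50 the same node is connected again at t=52.
print(node_state(50, 52))
```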
*Communication within the Cluster* +
@ -403,11 +407,11 @@ When the DFM makes changes to the dataflow, the NCM communicates those changes t
*Dealing with Disconnected Nodes* +
A DFM may manually disconnect a node from the cluster. But if a node becomes disconnected for any other reason (such as due to lack of heartbeat), the NCM will show a bulletin on the User Interface, and the DFM will not be able to make any changes to the dataflow until the issue of the disconnected node is resolved. The DFM or the Administrator will need to troubleshoot the issue with the node and resolve it before any new changes may be made to the dataflow. However, it is worth noting that just because a node is disconnected does not mean that it is not working; it just means that the NCM cannot communicate with the node.
*Basic Cluster Setup* +
This section describes the setup for a simple two-node, non-secure, unicast cluster comprised of three instances of NiFi:
* The NCM
* Node 1
@ -415,7 +419,7 @@ This section describes the setup for a simple two-node, non-secure, unicast clus
Administrators may install each instance on a separate server; however, it is also perfectly fine to install the NCM and one of the nodes on the same server, as the NCM is very lightweight. Just keep in mind that the ports assigned to each instance must not collide if the NCM and one of the nodes share the same server.
For each instance, certain properties in the _nifi.properties_ file will need to be updated. In particular, the Web and Clustering properties should be evaluated for your situation and adjusted accordingly. All the properties are described in the <<system_properties>> section of this guide; however, in this section, we will focus on the minimum properties that must be set for a simple cluster.
For all three instances, the Cluster Common Properties can be left with the default settings. Note, however, that if you change these settings, they must be set the same on every instance in the cluster (NCM and nodes).
@ -433,7 +437,7 @@ For Node 1, the minimum properties to configure are as follows:
** nifi.cluster.is.node - Set this to _true_.
** nifi.cluster.node.address - Set this to the fully qualified hostname of the node. If left blank, it defaults to "localhost".
** nifi.cluster.node.protocol.port - Set this to an open port that is higher than 1024 (anything lower requires root). If Node 1 and the NCM are on the same server, make sure this port is different from the nifi.cluster.manager.protocol.port.
** nifi.cluster.node.unicast.manager.address - Set this to the NCM's fully qualified hostname.
** nifi.cluster.node.unicast.manager.protocol.port - Set this to exactly the same port that was set on the NCM for the property nifi.cluster.manager.protocol.port.
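As a rough aid, the cluster node settings listed above can be sanity-checked with a short Python script before starting a node. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the hostnames and port numbers in the sample are made up, and the parser only handles plain `key=value` lines. It cannot tell whether a port is actually open on the host.

```python
# Sample values are illustrative only; substitute your own hostnames and ports.
SAMPLE = """
nifi.cluster.is.node=true
nifi.cluster.node.address=node1.example.com
nifi.cluster.node.protocol.port=9002
nifi.cluster.node.unicast.manager.address=ncm.example.com
nifi.cluster.node.unicast.manager.protocol.port=9001
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_node_properties(props, ncm_protocol_port):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    if props.get("nifi.cluster.is.node") != "true":
        problems.append("nifi.cluster.is.node should be true")
    port = props.get("nifi.cluster.node.protocol.port", "")
    if not port.isdigit() or int(port) <= 1024:
        problems.append("nifi.cluster.node.protocol.port should be above 1024")
    if not props.get("nifi.cluster.node.unicast.manager.address"):
        problems.append("nifi.cluster.node.unicast.manager.address is not set")
    manager_port = props.get("nifi.cluster.node.unicast.manager.protocol.port")
    if manager_port != str(ncm_protocol_port):
        problems.append(
            "nifi.cluster.node.unicast.manager.protocol.port should match "
            "the NCM's nifi.cluster.manager.protocol.port"
        )
    return problems


print(check_node_properties(parse_properties(SAMPLE), ncm_protocol_port=9001))
```

The `nifi.cluster.node.address` property is deliberately not checked, because a blank value falls back to "localhost".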
For Node 2, the minimum properties to configure are as follows:
@ -448,7 +452,7 @@ For Node 2, the minimum properties to configure are as follows:
Now, it is possible to start up the cluster. Technically, it does not matter which instance starts up first. However, you could start the NCM first, then Node 1 and then Node 2. Since the first node that connects is automatically elected as the Primary Node, this sequence should create a cluster where Node 1 is the Primary Node. Navigate to the URL for the NCM in your web browser, and the User Interface should look similar to the following:
image:ncm.png["NCM User Interface", width=940]
*Troubleshooting*
@ -475,7 +479,7 @@ take effect only after NiFi has been stopped and restarted.
|====
|*Property*|*Description*
|java|Specifies the fully qualified java command to run. By default, it is simply `java` but could be changed to an absolute path or a reference to an environment variable, such as `$JAVA_HOME/bin/java`
|run.as|The username to run NiFi as. For instance, if NiFi should be run as the 'nifi' user, setting this value to 'nifi' will cause the NiFi Process to be run as the 'nifi' user.
This property is ignored on Windows. For Linux, the specified user may require sudo permissions.
|lib.dir|The _lib_ directory to use for NiFi. By default, this is set to `./lib`
|conf.dir|The _conf_ directory to use for NiFi. By default, this is set to `./conf`
@ -497,10 +501,28 @@ take effect only after NiFi has been stopped and restarted.
configured recipients whenever NiFi is stopped.
|nifi.died.notification.services|This property is a comma-separated list of Notification Service identifiers that correspond to the Notification Services
defined in the `notification.services.file` property. The services with the specified identifiers will be used to notify their
configured recipients if the bootstrap determines that NiFi has unexpectedly died.
|====
*Java 7 PermGen Sizing*
The provided _bootstrap.conf_ file may include lines such as
....
#java.arg.11=-XX:PermSize=128M
#java.arg.12=-XX:MaxPermSize=128M
....
If running on Java 7, it is recommended to uncomment those lines so that the initial and maximum PermGen sizes are larger than the defaults. This is important because NiFi
can load a significant number of classes, which can result in an OutOfMemoryError when PermGen fills up. You might choose a value larger than 128 MB as well.
*Java 7 and 8 Handling of CodeCache*
It has been observed in both Java 7 and Java 8 runtime environments that performance can suddenly drop by more than an order of magnitude after days or weeks of otherwise ideal
behavior. This has only been observed under extremely high load and in cases where considerable Just in Time (JIT) compilation occurs. The core problem is the CodeCache becomes
full and is seemingly not properly garbage collected or grown. When this occurs, JIT compilation seems either to stop or to incur considerable delays, and performance drops.
This is easily overcome by ensuring the following lines are present in the _bootstrap.conf_ file. By default they are there but commented out. Uncomment them for maximum sustained throughput.
....
#java.arg.7=-XX:ReservedCodeCacheSize=256m
#java.arg.8=-XX:CodeCacheFlushingMinimumFreeSpace=10m
#java.arg.9=-XX:+UseCodeCacheFlushing
....
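For administrators who manage many installations, uncommenting these lines can be scripted. The following Python sketch is one possible approach rather than a tool shipped with NiFi: the function name is hypothetical, and it assumes the lines appear in _bootstrap.conf_ in the commented form shown above. It works on the file's text and leaves every other line untouched.

```python
import re

# JVM flags to enable; matched by substring against commented java.arg lines.
CODE_CACHE_FLAGS = (
    "-XX:ReservedCodeCacheSize",
    "-XX:CodeCacheFlushingMinimumFreeSpace",
    "-XX:+UseCodeCacheFlushing",
)

_COMMENTED_ARG = re.compile(r"^\s*#\s*(java\.arg\.\d+=.*)$")


def uncomment_java_args(conf_text, flags=CODE_CACHE_FLAGS):
    """Return conf_text with matching '#java.arg.N=...' lines uncommented."""
    out = []
    for line in conf_text.splitlines(keepends=True):
        body = line.rstrip("\r\n")
        ending = line[len(body):]  # preserve the original line ending
        match = _COMMENTED_ARG.match(body)
        if match and any(flag in match.group(1) for flag in flags):
            out.append(match.group(1) + ending)
        else:
            out.append(line)
    return "".join(out)


example = (
    "#java.arg.7=-XX:ReservedCodeCacheSize=256m\n"
    "#java.arg.11=-XX:PermSize=128M\n"
)
print(uncomment_java_args(example))
```

Passing `flags=("-XX:PermSize", "-XX:MaxPermSize")` would enable the PermGen lines instead. As with any change to _bootstrap.conf_, NiFi must be restarted for the new arguments to take effect.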
[[notification_services]]
Notification Services
@ -520,8 +542,8 @@ The syntax of the XML file is as follows:
<id>some-identifier</id>
<!-- The fully-qualified class name of the Notification Service. -->
<class>org.apache.nifi.bootstrap.notification.email.EmailNotificationService</class>
<!-- Any number of properties can be set using this syntax.
The properties available depend on the Notification Service. -->
<property name="Property Name 1">Property Value</property>
<property name="Another Property Name">Property Value 2</property>
@ -578,10 +600,10 @@ System Properties
The _nifi.properties_ file in the _conf_ directory is the main configuration file for controlling how NiFi runs. This section provides an overview of the properties in this file and includes some notes on how to configure it in a way that will make upgrading easier. *After making changes to this file, restart NiFi in order
for the changes to take effect.*
NOTE: The contents of this file are relatively stable but do change from time to time. It is always a good idea to
review this file when upgrading and pay attention for any changes. Consider configuring items
below marked with an asterisk (*) in such a way that upgrading will be easier. For details, see a full discussion on upgrading
at the end of this section. Note that values for periods of time and data sizes must include the unit of measure,
for example "10 sec" or "10 MB", not simply "10".
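A quick way to catch a missing unit before restarting NiFi is to check each value against a simple pattern. The Python sketch below is an assumption-laden illustration: the function name is hypothetical and the pattern (a number followed by a unit word) is a simplification, not the grammar NiFi itself uses to parse these values.

```python
import re

# Simplified pattern: digits, optional whitespace, then an alphabetic unit.
_VALUE_WITH_UNIT = re.compile(r"^\s*\d+(\.\d+)?\s*[A-Za-z]+\s*$")


def has_unit(value):
    """Return True if value looks like a number followed by a unit of measure."""
    return bool(_VALUE_WITH_UNIT.match(value))


for candidate in ("10 sec", "10 MB", "10"):
    print(candidate, has_unit(candidate))
```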
*Core Properties* +
@ -592,14 +614,14 @@ The first section of the _nifi.properties_ file is for the Core Properties. Thes
|*Property*|*Description*
|nifi.version|The version number of the current release. If upgrading but reusing this file, be sure to update this value.
|nifi.flow.configuration.file*|The location of the flow configuration file (i.e., the file that contains what is currently displayed on the NiFi graph). The default value is ./conf/flow.xml.gz.
|nifi.flow.configuration.archive.dir*|The location of the archive directory where backup copies of the flow.xml are saved. The default value is ./conf/archive.
|nifi.flowcontroller.autoResumeState|Indicates whether -upon restart- the components on the NiFi graph should return to their last state. The default value is _true_.
|nifi.flowcontroller.graceful.shutdown.period|Indicates the shutdown period. The default value is 10 sec.
|nifi.flowservice.writedelay.interval|When many changes are made to the flow.xml, this property specifies how long to wait before writing out the changes, so as to batch the changes into a single write. The default value is 500 ms.
|nifi.administrative.yield.duration|If a component allows an unexpected exception to escape, it is considered a bug. As a result, the framework will pause (or administratively yield) the component for this amount of time. This is done so that the component does not use up massive amounts of system resources, since it is known to have problems in the existing state. The default value is 30 sec.
|nifi.bored.yield.duration|When a component has no work to do (i.e., is "bored"), this is the amount of time it will wait before checking to see if it has new data to work on. This way, it does not use up CPU resources by checking for new work too often. When setting this property, be aware that it could add extra latency for components that do not constantly have work to do, as once they go into this "bored" state, they will wait this amount of time before checking for more work. The default value is 10 millis.
|nifi.authority.provider.configuration.file*|This is the location of the file that specifies how user access is authorized. The default value is ./conf/authority-providers.xml.
|nifi.login.identity.provider.configuration.file*|This is the location of the file that specifies how username/password authentication is performed. This file is
only considered if `nifi.security.user.login.identity.provider` is configured with a provider identifier. The default value is ./conf/login-identity-providers.xml.
|nifi.templates.directory*|This is the location of the directory where flow templates are saved. The default value is ./conf/templates.
|nifi.ui.banner.text|This is banner text that may be configured to display at the top of the User Interface. It is blank by default.
@ -611,7 +633,7 @@ only consider if `nifi.security.user.login.identity.provider` configured with a
*H2 Settings* +
The H2 Settings section defines the settings for the H2 database, which keeps track of user access and flow controller history.
|====
|*Property*|*Description*
@ -637,7 +659,7 @@ to configure it on a separate drive if available.
*Swap Management* +
NiFi keeps FlowFile information in memory (the JVM)
but during surges of incoming data, the FlowFile information can start to take up so much of the JVM that system performance
suffers. To counteract this effect, NiFi "swaps" the FlowFile information to disk temporarily until more JVM space becomes
available again. These properties govern how that process occurs.
@ -708,13 +730,13 @@ nifi.provenance.repository.directory.provenance2=/repos/provenance2 +
Providing three total locations, including _nifi.provenance.repository.directory.default_.
|nifi.provenance.repository.max.storage.time|The maximum amount of time to keep data provenance information. The default value is 24 hours.
|nifi.provenance.repository.max.storage.size|The maximum amount of data provenance information to store at a time. The default is 1 GB.
|nifi.provenance.repository.rollover.time|The amount of time to wait before rolling over the latest data provenance information so that it is available in the User Interface. The default value is 5 mins.
|nifi.provenance.repository.rollover.size|The amount of information to roll over at a time. The default value is 100 MB.
|nifi.provenance.repository.query.threads|The number of threads to use for Provenance Repository queries. The default value is 2.
|nifi.provenance.repository.index.threads|The number of threads to use for indexing Provenance events so that they are searchable. The default value is 1.
For flows that operate on a very high number of FlowFiles, the indexing of Provenance events could become a bottleneck. If this is the case, a bulletin will appear, indicating that
"The rate of the dataflow is exceeding the provenance recording rate. Slowing down flow to accommodate." If this happens, increasing the value of this property
may increase the rate at which the Provenance Repository is able to process these records, resulting in better overall throughput.
|nifi.provenance.repository.compress.on.rollover|Indicates whether to compress the provenance information when rolling it over. The default value is _true_.
|nifi.provenance.repository.always.sync|If set to _true_, any change to the repository will be synchronized to the disk, meaning that NiFi will ask the operating system not to cache the information. This is very expensive and can significantly reduce NiFi performance. However, if it is _false_, data could be lost if there is a sudden power loss or the operating system crashes. The default value is _false_.
|nifi.provenance.repository.journal.count|The number of journal files that should be used to serialize Provenance Event data. Increasing this value will allow more tasks to simultaneously update the repository but will result in more expensive merging of the journal files later. This value should ideally be equal to the number of threads that are expected to update the repository simultaneously, but 16 tends to work well in most environments. The default value is 16.
@ -733,7 +755,7 @@ Providing three total locations, including _nifi.provenance.repository.director
*Component Status Repository* +
The Component Status Repository contains the information for the Component Status History tool in the User Interface. These
properties govern how that tool works.
The buffer.size and snapshot.frequency work together to determine the amount of historical data to retain. As an example to
@ -779,12 +801,12 @@ These properties pertain to the web-based User Interface.
*Security Properties* +
These properties pertain to various security features in NiFi. Many of these properties are covered in more detail in the
Security Configuration section of this Administrator's Guide.
|====
|*Property*|*Description*
|nifi.sensitive.props.key|This is the password used to encrypt any sensitive property values that are configured in processors. By default, it is blank, but the system administrator should provide a value for it. It can be a string of any length. Be aware that once this password is set and one or more sensitive processor properties have been configured, this password should not be changed.
|nifi.sensitive.props.algorithm|The algorithm used to encrypt sensitive properties. The default value is PBEWITHMD5AND256BITAES-CBC-OPENSSL.
|nifi.sensitive.props.provider|The sensitive property provider. The default value is BC.
|nifi.security.keystore*|The full path and name of the keystore. It is blank by default.
@ -870,64 +892,10 @@ Only configure these properties for the cluster manager.
|====
|*Property*|*Description*
|nifi.kerberos.krb5.file*|The location of the krb5 file, if used. It is blank by default. Note that this property is not used to authenticate NiFi users.
Rather, it is made available for extension points, such as Hadoop-based Processors, to use. At this time, only a single krb5 file is allowed to
be specified per NiFi instance, so this property is configured here rather than in individual Processors.
|====
NOTE: *For Upgrading* - Take care when configuring the properties above that are marked with an asterisk (*). To make the upgrade process easier, it is advisable to change the default configurations to locations outside the main root installation directory. In this way, these items can remain in their configured location through an upgrade, and NiFi can find all the repositories and configuration files and pick up where it left off as soon as the old version is stopped and the new version is started. Furthermore, the administrator may reuse this _nifi.properties_ file and any other configuration files without having to re-configure them each time an upgrade takes place. As previously noted, it is important to check for any changes in the _nifi.properties_ file of the new version when upgrading and make sure they are reflected in the _nifi.properties_ file you use.
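
As a rough illustration of the upgrade advice above, the following _nifi.properties_ fragment points several repository and configuration locations at a directory outside the installation root. It is only a sketch: the `/opt/nifi-data` path is a hypothetical location chosen for this example, and it assumes the properties shown are among those marked with an asterisk in the tables above. Check the _nifi.properties_ file shipped with your version for the exact property names and defaults.

[source,properties]
----
# Hypothetical layout: everything that must survive an upgrade lives under
# /opt/nifi-data rather than under the NiFi installation directory.

# Flow definition and its archive (assumed to be asterisk-marked properties)
nifi.flow.configuration.file=/opt/nifi-data/conf/flow.xml.gz
nifi.flow.configuration.archive.dir=/opt/nifi-data/conf/archive/

# H2 database directory (user access and flow controller history)
nifi.database.directory=/opt/nifi-data/database_repository

# FlowFile, content, and provenance repositories
nifi.flowfile.repository.directory=/opt/nifi-data/flowfile_repository
nifi.content.repository.directory.default=/opt/nifi-data/content_repository
nifi.provenance.repository.directory.default=/opt/nifi-data/provenance_repository
----

With a layout like this, the same file can be carried forward to the new version, provided that any properties added or changed in the new version's _nifi.properties_ are merged in first.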