diff --git a/nifi-docs/src/main/asciidoc/administration-guide.adoc b/nifi-docs/src/main/asciidoc/administration-guide.adoc index 769b7bc798..f62b68c6a5 100644 --- a/nifi-docs/src/main/asciidoc/administration-guide.adoc +++ b/nifi-docs/src/main/asciidoc/administration-guide.adoc @@ -40,66 +40,66 @@ Apache NiFi can run on something as simple as a laptop, but it can also be clust * Linux/Unix/OS X ** Decompress and untar into desired installation directory -** Make any desired edits in files found under /conf -*** At a minimum, we recommend editing the _nifi.properties_ file and entering a password for the nifi.sensitive.props.key (see <> below) -** From the /bin directory, execute the following commands by typing ./nifi.sh : +** Make any desired edits in files found under `/conf` +*** At a minimum, we recommend editing the _nifi.properties_ file and entering a password for the `nifi.sensitive.props.key` (see <> below) +** From the `/bin` directory, execute the following commands by typing `./nifi.sh `: *** start: starts NiFi in the background *** stop: stops NiFi that is running in the background *** status: provides the current status of NiFi *** run: runs NiFi in the foreground and waits for a Ctrl-C to initiate shutdown of NiFi *** install: installs NiFi as a service that can then be controlled via -**** service nifi start -**** service nifi stop -**** service nifi status +**** `service nifi start` +**** `service nifi stop` +**** `service nifi status` * Windows ** Decompress into the desired installation directory -** Make any desired edits in the files found under /conf -*** At a minimum, we recommend editing the _nifi.properties_ file and entering a password for the nifi.sensitive.props.key (see <> below) -** Navigate to the /bin directory -** Double-click run-nifi.bat. 
This runs NiFi in the foreground and waits for a Ctrl-C to initiate shutdown of NiFi -** To see the current status of NiFi, double-click status-nifi.bat +** Make any desired edits in the files found under `/conf` +*** At a minimum, we recommend editing the _nifi.properties_ file and entering a password for the `nifi.sensitive.props.key` (see <> below) +** Navigate to the `/bin` directory +** Double-click `run-nifi.bat`. This runs NiFi in the foreground and waits for a Ctrl-C to initiate shutdown of NiFi +** To see the current status of NiFi, double-click `status-nifi.bat` When NiFi first starts up, the following files and directories are created: -* content_repository -* database_repository -* flowfile_repository -* provenance_repository -* work directory -* logs directory -* Within the conf directory, the _flow.xml.gz_ file and the templates directory are created +* `content_repository` +* `database_repository` +* `flowfile_repository` +* `provenance_repository` +* `work` directory +* `logs` directory +* Within the `conf` directory, the _flow.xml.gz_ file is created See the <> section of this guide for more information about configuring NiFi repositories and configuration files. == Configuration Best Practices -NOTE: If you are running on Linux, consider these best practices. Typical Linux defaults are not necessarily well tuned for the needs of an IO intensive application like NiFi. For all of these areas, your distribution's requirements may vary. Use these sections as advice, but +NOTE: If you are running on Linux, consider these best practices. Typical Linux defaults are not necessarily well-tuned for the needs of an IO intensive application like NiFi. For all of these areas, your distribution's requirements may vary. Use these sections as advice, but consult your distribution-specific documentation for how best to achieve these recommendations. Maximum File Handles:: NiFi will at any one time potentially have a very large number of file handles open. 
Increase the limits by -editing '/etc/security/limits.conf' to add +editing _/etc/security/limits.conf_ to add something like ---- * hard nofile 50000 * soft nofile 50000 ---- Maximum Forked Processes:: -NiFi may be configured to generate a significant number of threads. To increase the allowable number edit '/etc/security/limits.conf' +NiFi may be configured to generate a significant number of threads. To increase the allowable number, edit _/etc/security/limits.conf_ ---- * hard nproc 10000 * soft nproc 10000 ---- -And your distribution may require an edit to /etc/security/limits.d/90-nproc.conf by adding +And your distribution may require an edit to _/etc/security/limits.d/90-nproc.conf_ by adding ---- * soft nproc 10000 ---- Increase the number of TCP socket ports available:: This is particularly important if your flow will be setting up and tearing -down a large number of sockets in small period of time. +down a large number of sockets in a small period of time. ---- sudo sysctl -w net.ipv4.ip_local_port_range="10000 65000" ---- @@ -107,23 +107,23 @@ sudo sysctl -w net.ipv4.ip_local_port_range="10000 65000" Set how long sockets stay in a TIMED_WAIT state when closed:: You don't want your sockets to sit and linger too long given that you want to be able to quickly setup and teardown new sockets. It is a good idea to read more about -it but to adjust do something like +it and adjust to something like ---- sudo sysctl -w net.ipv4.netfilter.ip_conntrack_tcp_timeout_time_wait="1" ---- Tell Linux you never want NiFi to swap:: Swapping is fantastic for some applications. It isn't good for something like -NiFi that always wants to be running. To tell Linux you'd like swapping off you -can edit '/etc/sysctl.conf' to add the following line +NiFi that always wants to be running. 
To tell Linux you'd like swapping off, you +can edit _/etc/sysctl.conf_ to add the following line ---- vm.swappiness = 0 ---- -For the partitions handling the various NiFi repos turn off things like 'atime'. -Doing so can cause a surprising bump in throughput. Edit the '/etc/fstab' file -and for the partition(s) of interest add the 'noatime' option. - +For the partitions handling the various NiFi repos, turn off things like `atime`. +Doing so can cause a surprising bump in throughput. Edit the `/etc/fstab` file +and for the partition(s) of interest, add the `noatime` option. +[[security_configuration]] == Security Configuration NiFi provides several different configuration options for security purposes. The most important properties are those under the @@ -164,12 +164,12 @@ accomplished by setting the `nifi.remote.input.secure` and `nifi.cluster.protoco In order to facilitate the secure setup of NiFi, you can use the `tls-toolkit` command line utility to automatically generate the required keystores, truststore, and relevant configuration files. This is especially useful for securing multiple NiFi nodes, which can be a tedious and error-prone process. -Wildcard certificates (i.e. two nodes `node1.nifi.apache.org` and `node2.nifi.apache.org` being assigned the same certificate with a CN or SAN entry of +*.nifi.apache.org+) are *not officially supported* and *not recommended*. There are numerous disadvantages to using wildcard certificates, and a cluster working with wildcard certificates has occurred in previous versions out of lucky accidents, not intentional support. Wildcard SAN entries are acceptable *if* each cert maintains an additional unique SAN entry and CN entry. +Wildcard certificates (i.e. two nodes `node1.nifi.apache.org` and `node2.nifi.apache.org` being assigned the same certificate with a CN or SAN entry of `+*.nifi.apache.org+`) are *not officially supported* and *not recommended*. 
There are numerous disadvantages to using wildcard certificates, and a cluster working with wildcard certificates has occurred in previous versions out of lucky accidents, not intentional support. Wildcard SAN entries are acceptable *if* each cert maintains an additional unique SAN entry and CN entry. ==== Potential issues with wildcard certificates * In many places throughout the codebase, cluster communications use certificate identities many times to identify a node, and if the certificate simply presents a wildcard DN, that doesn’t resolve to a specific node -* Admins may need to provide a custom node identity in `authorizers.xml` for `*.nifi.apache.org` because all proxy actions only resolve to the cert DN (see <>) +* Admins may need to provide a custom node identity in _authorizers.xml_ for `*.nifi.apache.org` because all proxy actions only resolve to the cert DN (see <>) * Admins have no traceability into which node performed an action because they all resolve to the same DN * Admins running multiple instances on the same machine using different ports to identify them can accidentally put `node1` hostname with `node2` port, and the address will resolve fine because it’s using the same certificate, but the host header handler will block it because the `node1` hostname is (correctly) not listed as an acceptable host for `node2` instance * If the wildcard certificate is compromised, all nodes are compromised @@ -178,7 +178,7 @@ NOTE: JKS keystores and truststores are recommended for NiFi. This tool allows The `tls-toolkit` command line tool has two primary modes of operation: -1. Standalone -- generates the certificate authority, keystores, truststores, and nifi.properties files in one command. +1. Standalone -- generates the certificate authority, keystores, truststores, and _nifi.properties_ files in one command. 2. 
Client/Server mode -- uses a Certificate Authority Server that accepts Certificate Signing Requests from clients, signs them, and sends the resulting certificates back. Both client and server validate the other’s identity through a shared secret. ==== Standalone @@ -192,7 +192,7 @@ You can use the following command line options with the `tls-toolkit` in standal * `-c`,`--certificateAuthorityHostname ` Hostname of NiFi Certificate Authority (default: `localhost`) * `-C`,`--clientCertDn ` Generate client certificate suitable for use in browser with specified DN (Can be specified multiple times) * `-d`,`--days ` Number of days issued certificate should be valid for (default: `1095`) -* `-f`,`--nifiPropertiesFile ` Base `nifi.properties` file to update (Embedded file identical to the one in a default NiFi install will be used if not specified) +* `-f`,`--nifiPropertiesFile ` Base _nifi.properties_ file to update (Embedded file identical to the one in a default NiFi install will be used if not specified) * `-g`,`--differentKeyAndKeystorePasswords` Use different generated password for the key and the keystore * `-G`,`--globalPortSequence ` Use sequential ports that are calculated for all hosts according to the provided hostname expressions (Can be specified multiple times, MUST BE SAME FROM RUN TO RUN) * `-h`,`--help` Print help and exit @@ -217,17 +217,17 @@ Hostname Patterns: Examples: -Create 4 sets of keystore, truststore, nifi.properties for localhost along with a client certificate with the given DN: +Create 4 sets of keystore, truststore, _nifi.properties_ for localhost along with a client certificate with the given DN: ---- bin/tls-toolkit.sh standalone -n 'localhost(4)' -C 'CN=username,OU=NIFI' ---- -Create keystore, truststore, nifi.properties for 10 NiFi hostnames in each of 4 subdomains: +Create keystore, truststore, _nifi.properties_ for 10 NiFi hostnames in each of 4 subdomains: ---- bin/tls-toolkit.sh standalone -n 'nifi[01-10].subdomain[1-4].domain' ---- 
-Create 2 sets of keystore, truststore, nifi.properties for 10 NiFi hostnames in each of 4 subdomains along with a client certificate with the given DN: +Create 2 sets of keystore, truststore, _nifi.properties_ for 10 NiFi hostnames in each of 4 subdomains along with a client certificate with the given DN: ---- bin/tls-toolkit.sh standalone -n 'nifi[01-10].subdomain[1-4].domain(2)' -C 'CN=username,OU=NIFI' ---- @@ -612,9 +612,10 @@ NiFi supports user authentication via client certificates, via username/password Username/password authentication is performed by a 'Login Identity Provider'. The Login Identity Provider is a pluggable mechanism for authenticating users via their username/password. Which Login Identity Provider to use is configured in the _nifi.properties_ file. -Currently NiFi offers username/password with Login Identity Providers options for LDAP and Kerberos. +Currently NiFi offers username/password with Login Identity Providers options for <> and <>. + +The `nifi.login.identity.provider.configuration.file` property specifies the configuration file for Login Identity Providers. By default, this property is set to `./conf/login-identity-providers.xml`. -The `nifi.login.identity.provider.configuration.file` property specifies the configuration file for Login Identity Providers. The `nifi.security.user.login.identity.provider` property indicates which of the configured Login Identity Provider should be used. By default, this property is not configured meaning that username/password must be explicitly enabled. @@ -627,7 +628,7 @@ token during authentication. NOTE: NiFi can only be configured for username/password, OpenId Connect, or Apache Knox at a given time. It does not support running each of these concurrently. NiFi will require client certificates for authenticating users over HTTPS if none of these are configured. 
-A secured instance of NiFi cannot be accessed anonymously unless configured to use an LDAP or Kerberos Login Identity Provider, which in turn must be configured to explicitly allow anonymous access. Anonymous access is not currently possible by the default FileAuthorizer (see <>), but is a future effort (link:https://issues.apache.org/jira/browse/NIFI-2730[NIFI-2730^]). +A secured instance of NiFi cannot be accessed anonymously unless configured to use an <> or <> Login Identity Provider, which in turn must be configured to explicitly allow anonymous access. Anonymous access is not currently possible by the default FileAuthorizer (see <>), but is a future effort (link:https://issues.apache.org/jira/browse/NIFI-2730[NIFI-2730^]). NOTE: NiFi does not perform user authentication over HTTP. Using HTTP, all users will be granted all roles. @@ -636,6 +637,14 @@ NOTE: NiFi does not perform user authentication over HTTP. Using HTTP, all users Below is an example and description of configuring a Login Identity Provider that integrates with a Directory Server to authenticate users. +Set the following in _nifi.properties_ to enable LDAP username/password authentication: + +---- +nifi.security.user.login.identity.provider=ldap-provider +---- + +Modify _login-identity-providers.xml_ to enable the `ldap-provider`. Here is the sample provided in the file: + ---- ldap-provider @@ -668,73 +677,75 @@ Below is an example and description of configuring a Login Identity Provider tha ---- -With this configuration, username/password authentication can be enabled by referencing this provider in _nifi.properties_. 
- ----- -nifi.security.user.login.identity.provider=ldap-provider ----- +The `ldap-provider` has the following properties: [options="header,footer"] |================================================================================================================================================== | Property Name | Description -|`Authentication Strategy` | How the connection to the LDAP server is authenticated. Possible values are ANONYMOUS, SIMPLE, LDAPS, or START_TLS. +|`Authentication Strategy` | How the connection to the LDAP server is authenticated. Possible values are `ANONYMOUS`, `SIMPLE`, `LDAPS`, or `START_TLS`. |`Manager DN` | The DN of the manager that is used to bind to the LDAP server to search for users. |`Manager Password` | The password of the manager that is used to bind to the LDAP server to search for users. |`TLS - Keystore` | Path to the Keystore that is used when connecting to LDAP using LDAPS or START_TLS. |`TLS - Keystore Password` | Password for the Keystore that is used when connecting to LDAP using LDAPS or START_TLS. -|`TLS - Keystore Type` | Type of the Keystore that is used when connecting to LDAP using LDAPS or START_TLS (i.e. JKS or PKCS12). +|`TLS - Keystore Type` | Type of the Keystore that is used when connecting to LDAP using LDAPS or START_TLS (i.e. `JKS` or `PKCS12`). |`TLS - Truststore` | Path to the Truststore that is used when connecting to LDAP using LDAPS or START_TLS. |`TLS - Truststore Password` | Password for the Truststore that is used when connecting to LDAP using LDAPS or START_TLS. -|`TLS - Truststore Type` | Type of the Truststore that is used when connecting to LDAP using LDAPS or START_TLS (i.e. JKS or PKCS12). -|`TLS - Client Auth` | Client authentication policy when connecting to LDAP using LDAPS or START_TLS. Possible values are REQUIRED, WANT, NONE. -|`TLS - Protocol` | Protocol to use when connecting to LDAP using LDAPS or START_TLS. (i.e. TLS, TLSv1.1, TLSv1.2, etc). 
+|`TLS - Truststore Type` | Type of the Truststore that is used when connecting to LDAP using LDAPS or START_TLS (i.e. `JKS` or `PKCS12`). +|`TLS - Client Auth` | Client authentication policy when connecting to LDAP using LDAPS or START_TLS. Possible values are `REQUIRED`, `WANT`, `NONE`. +|`TLS - Protocol` | Protocol to use when connecting to LDAP using LDAPS or START_TLS. (i.e. `TLS`, `TLSv1.1`, `TLSv1.2`, etc). |`TLS - Shutdown Gracefully` | Specifies whether the TLS should be shut down gracefully before the target context is closed. Defaults to false. -|`Referral Strategy` | Strategy for handling referrals. Possible values are FOLLOW, IGNORE, THROW. -|`Connect Timeout` | Duration of connect timeout. (i.e. 10 secs). -|`Read Timeout` | Duration of read timeout. (i.e. 10 secs). -|`Url` | Space-separated list of URLs of the LDAP servers (i.e. ldap://:). -|`User Search Base` | Base DN for searching for users (i.e. CN=Users,DC=example,DC=com). -|`User Search Filter` | Filter for searching for users against the 'User Search Base'. (i.e. sAMAccountName={0}). The user specified name is inserted into '{0}'. -|`Identity Strategy` | Strategy to identify users. Possible values are USE_DN and USE_USERNAME. The default functionality if this property is missing is USE_DN in order to retain backward -compatibility. USE_DN will use the full DN of the user entry if possible. USE_USERNAME will use the username the user logged in with. +|`Referral Strategy` | Strategy for handling referrals. Possible values are `FOLLOW`, `IGNORE`, `THROW`. +|`Connect Timeout` | Duration of connect timeout. (i.e. `10 secs`). +|`Read Timeout` | Duration of read timeout. (i.e. `10 secs`). +|`Url` | Space-separated list of URLs of the LDAP servers (i.e. `ldap://:`). +|`User Search Base` | Base DN for searching for users (i.e. `CN=Users,DC=example,DC=com`). +|`User Search Filter` | Filter for searching for users against the `User Search Base`. (i.e. `sAMAccountName={0}`). 
The user specified name is inserted into `{0}`. +|`Identity Strategy` | Strategy to identify users. Possible values are `USE_DN` and `USE_USERNAME`. The default functionality if this property is missing is `USE_DN` in order to retain backward +compatibility. `USE_DN` will use the full DN of the user entry if possible. `USE_USERNAME` will use the username the user logged in with. |`Authentication Expiration` | The duration of how long the user authentication is valid for. If the user never logs out, they will be required to log back in following this duration. |================================================================================================================================================== +NOTE: For changes to _nifi.properties_ and _login-identity-providers.xml_ to take effect, NiFi needs to be restarted. If NiFi is clustered, configuration files must be the same on all nodes. + [[kerberos_login_identity_provider]] === Kerberos Below is an example and description of configuring a Login Identity Provider that integrates with a Kerberos Key Distribution Center (KDC) to authenticate users. ----- - - kerberos-provider - org.apache.nifi.kerberos.KerberosProvider - NIFI.APACHE.ORG - /etc/krb5.conf - 12 hours - ----- - -With this configuration, username/password authentication can be enabled by referencing this provider in _nifi.properties_. +Set the following in _nifi.properties_ to enable Kerberos username/password authentication: ---- nifi.security.user.login.identity.provider=kerberos-provider ---- +Modify _login-identity-providers.xml_ to enable the `kerberos-provider`. 
Here is the sample provided in the file: + +---- + + kerberos-provider + org.apache.nifi.kerberos.KerberosProvider + NIFI.APACHE.ORG + 12 hours + +---- + +The `kerberos-provider` has the following properties: + [options="header,footer"] |================================================================================================================================================== | Property Name | Description -|`Default Realm` | Default realm to provide when user enters incomplete user principal (i.e. NIFI.APACHE.ORG). -|`Kerberos Config File` | Absolute path to Kerberos client configuration file. +|`Default Realm` | Default realm to provide when user enters incomplete user principal (i.e. `NIFI.APACHE.ORG`). |`Authentication Expiration`| The duration of how long the user authentication is valid for. If the user never logs out, they will be required to log back in following this duration. |================================================================================================================================================== See also <> to allow single sign-on access via client Kerberos tickets. +NOTE: For changes to _nifi.properties_ and _login-identity-providers.xml_ to take effect, NiFi needs to be restarted. If NiFi is clustered, configuration files must be the same on all nodes. + [[openid_connect]] === OpenId Connect -To enable authentication via OpenId Connect the following properties must be configured in nifi.properties. +To enable authentication via OpenId Connect the following properties must be configured in _nifi.properties_. [options="header,footer"] |================================================================================================================================================== @@ -744,23 +755,23 @@ To enable authentication via OpenId Connect the following properties must be con |`nifi.security.user.oidc.read.timeout` | Read timeout when communicating with the OpenId Connect Provider. 
|`nifi.security.user.oidc.client.id` | The client id for NiFi after registration with the OpenId Connect Provider. |`nifi.security.user.oidc.client.secret` | The client secret for NiFi after registration with the OpenId Connect Provider. -|`nifi.security.user.oidc.preferred.jwsalgorithm` | The preferred algorithm for for validating identity tokens. If this value is blank, it will default to 'RS256' which is required to be supported -by the OpenId Connect Provider according to the specification. If this value is 'HS256', 'HS384', or 'HS512', NiFi will attempt to validate HMAC protected tokens using the specified client secret. -If this value is 'none', NiFi will attempt to validate unsecured/plain tokens. Other values for this algorithm will attempt to parse as an RSA or EC algorithm to be used in conjunction with the +|`nifi.security.user.oidc.preferred.jwsalgorithm` | The preferred algorithm for validating identity tokens. If this value is blank, it will default to `RS256` which is required to be supported +by the OpenId Connect Provider according to the specification. If this value is `HS256`, `HS384`, or `HS512`, NiFi will attempt to validate HMAC protected tokens using the specified client secret. +If this value is `none`, NiFi will attempt to validate unsecured/plain tokens. Other values for this algorithm will attempt to parse as an RSA or EC algorithm to be used in conjunction with the JSON Web Key (JWK) provided through the jwks_uri in the metadata found at the discovery URL. |================================================================================================================================================== [[apache_knox]] === Apache Knox -To enable authentication via Apache Knox the following properties must be configured in nifi.properties. +To enable authentication via Apache Knox the following properties must be configured in _nifi.properties_. 
[options="header,footer"] |================================================================================================================================================== | Property Name | Description -|`nifi.security.user.knox.url` | The URL for the Apache Knox log in page. +|`nifi.security.user.knox.url` | The URL for the Apache Knox login page. |`nifi.security.user.knox.publicKey` | The path to the Apache Knox public key that will be used to verify the signatures of the authentication tokens in the HTTP Cookie. -|`nifi.security.user.knox.cookieName` | The name of the HTTP Cookie that Apache Knox will generate after successful log in. +|`nifi.security.user.knox.cookieName` | The name of the HTTP Cookie that Apache Knox will generate after successful login. |`nifi.security.user.knox.audiences` | Optional. A comma separate listed of allowed audiences. If set, the audience in the token must be present in this listing. The audience that is populated in the token can be configured in Knox. |================================================================================================================================================== @@ -778,99 +789,161 @@ user has privileges to perform that action. These privileges are defined by poli An 'authorizer' grants users the privileges to manage users and policies by creating preliminary authorizations at startup. -Authorizers are configured using two properties in the 'nifi.properties' file: +Authorizers are configured using two properties in the _nifi.properties_ file: -* The `nifi.authorizer.configuration.file` property specifies the configuration file where authorizers are defined. By default, the 'authorizers.xml' file located in the root installation conf directory is selected. -* The `nifi.security.user.authorizer` property indicates which of the configured authorizers in the 'authorizers.xml' file to use. 
+* The `nifi.authorizer.configuration.file` property specifies the configuration file where authorizers are defined. By default, the _authorizers.xml_ file located in the root installation conf directory is selected. +* The `nifi.security.user.authorizer` property indicates which of the configured authorizers in the _authorizers.xml_ file to use. [[authorizers-setup]] === Authorizers.xml Setup -The 'authorizers.xml' file is used to define and configure available authorizers. The default authorizer is the StandardManagedAuthorizer. The managed authorizer is comprised of a UserGroupProvider +The _authorizers.xml_ file is used to define and configure available authorizers. The default authorizer is the StandardManagedAuthorizer. The managed authorizer is comprised of a UserGroupProvider and a AccessPolicyProvider. The users, group, and access policies will be loaded and optionally configured through these providers. The managed authorizer will make all access decisions based on these provided users, groups, and access policies. During startup there is a check to ensure that there are no two users/groups with the same identity/name. This check is executed regardless of the configured implementation. This is necessary because this is how users/groups are identified and authorized during access decisions. + +==== FileUserGroupProvider + The default UserGroupProvider is the FileUserGroupProvider, however, you can develop additional UserGroupProviders as extensions. The FileUserGroupProvider has the following properties: -* Users File - The file where the FileUserGroupProvider stores users and groups. By default, the 'users.xml' in the 'conf' directory is chosen. -* Legacy Authorized Users File - The full path to an existing authorized-users.xml that will be automatically be used to load the users and groups into the Users File. +* Users File - The file where the FileUserGroupProvider stores users and groups. By default, the _users.xml_ in the `conf` directory is chosen. 
+* Legacy Authorized Users File - The full path to an existing _authorized-users.xml_ that will automatically be used to load the users and groups into the Users File. * Initial User Identity - The identity of a users and systems to seed the Users File. The name of each property must be unique, for example: "Initial User Identity A", "Initial User Identity B", "Initial User Identity C" or "Initial User Identity 1", "Initial User Identity 2", "Initial User Identity 3" -Another option for the UserGroupProvider is the LdapUserGroupProvider. By default, this option is commented out but can be configured in lieu of the FileUserGroupProvider. This will sync users and groups from a directory server and will present them in NiFi UI in read only form. The LdapUserGroupProvider has the following properties: +==== LdapUserGroupProvider -* Authentication Strategy - How the connection to the LDAP server is authenticated. Possible values are ANONYMOUS, SIMPLE, LDAPS, or START_TLS -* Manager DN - The DN of the manager that is used to bind to the LDAP server to search for users. -* Manager Password - The password of the manager that is used to bind to the LDAP server to search for users. -* TLS - Keystore - Path to the Keystore that is used when connecting to LDAP using LDAPS or START_TLS. -* TLS - Keystore Password - Password for the Keystore that is used when connecting to LDAP using LDAPS or START_TLS. -* TLS - Keystore Type - Type of the Keystore that is used when connecting to LDAP using LDAPS or START_TLS (i.e. JKS or PKCS12). -* TLS - Truststore - Path to the Truststore that is used when connecting to LDAP using LDAPS or START_TLS. -* TLS - Truststore Password - Password for the Truststore that is used when connecting to LDAP using LDAPS or START_TLS. -* TLS - Truststore Type - Type of the Truststore that is used when connecting to LDAP using LDAPS or START_TLS (i.e. JKS or PKCS12). 
-* TLS - Client Auth - Client authentication policy when connecting to LDAP using LDAPS or START_TLS. Possible values are REQUIRED, WANT, NONE. -* TLS - Protocol - Protocol to use when connecting to LDAP using LDAPS or START_TLS. (i.e. TLS, TLSv1.1, TLSv1.2, etc). -* TLS - Shutdown Gracefully - Specifies whether the TLS should be shut down gracefully before the target context is closed. Defaults to false. -* Referral Strategy - Strategy for handling referrals. Possible values are FOLLOW, IGNORE, THROW. -* Connect Timeout - Duration of connect timeout. (i.e. 10 secs). -* Read Timeout - Duration of read timeout. (i.e. 10 secs). -* Url - Space-separated list of URLs of the LDAP servers (i.e. ldap://:). -* Page Size - Sets the page size when retrieving users and groups. If not specified, no paging is performed. -* Sync Interval - Duration of time between syncing users and groups. (i.e. 30 mins). Minimum allowable value is 10 secs. -* User Search Base - Base DN for searching for users (i.e. ou=users,o=nifi). Required to search users. -* User Object Class - Object class for identifying users (i.e. person). Required if searching users. -* User Search Scope - Search scope for searching users (ONE_LEVEL, OBJECT, or SUBTREE). Required if searching users. -* User Search Filter - Filter for searching for users against the 'User Search Base' (i.e. (memberof=cn=team1,ou=groups,o=nifi) ). Optional. -* User Identity Attribute - Attribute to use to extract user identity (i.e. cn). Optional. If not set, the entire DN is used. -* User Group Name Attribute - Attribute to use to define group membership (i.e. memberof). Optional. If not set group membership will not be calculated through the users. Will rely on group membership being defined through 'Group Member Attribute' if set. The value of this property is the name of the attribute in the user ldap entry that associates them with a group. The value of that user attribute could be a dn or group name for instance. 
What value is expected is configured in the 'User Group Name Attribute - Referenced Group Attribute'. -* User Group Name Attribute - Referenced Group Attribute - If blank, the value of the attribute defined in 'User Group Name Attribute' is expected to be the full dn of the group. If not blank, this property will define the attribute of the group ldap entry that the value of the attribute defined in 'User Group Name Attribute' is referencing (i.e. name). Use of this property requires that 'Group Search Base' is also configured. -* Group Search Base - Base DN for searching for groups (i.e. ou=groups,o=nifi). Required to search groups. -* Group Object Class - Object class for identifying groups (i.e. groupOfNames). Required if searching groups. -* Group Search Scope - Search scope for searching groups (ONE_LEVEL, OBJECT, or SUBTREE). Required if searching groups. -* Group Search Filter - Filter for searching for groups against the 'Group Search Base'. Optional. -* Group Name Attribute - Attribute to use to extract group name (i.e. cn). Optional. If not set, the entire DN is used. -* Group Member Attribute - Attribute to use to define group membership (i.e. member). Optional. If not set group membership will not be calculated through the groups. Will rely on group membership being defined through 'User Group Name Attribute' if set. The value of this property is the name of the attribute in the group ldap entry that associates them with a user. The value of that group attribute could be a dn or memberUid for instance. What value is expected is configured in the 'Group Member Attribute - Referenced User Attribute'. (i.e. member: cn=User 1,ou=users,o=nifi vs. memberUid: user1) -* Group Member Attribute - Referenced User Attribute - If blank, the value of the attribute defined in 'Group Member Attribute' is expected to be the full dn of the user. 
If not blank, this property will define the attribute of the user ldap entry that the value of the attribute defined in 'Group Member Attribute' is referencing (i.e. uid). Use of this property requires that 'User Search Base' is also configured. (i.e. member: cn=User 1,ou=users,o=nifi vs. memberUid: user1) +Another option for the UserGroupProvider is the LdapUserGroupProvider. By default, this option is commented out but can be configured in lieu of the FileUserGroupProvider. This will sync users and groups from a directory server and will present them in the NiFi UI in read only form. + +The LdapUserGroupProvider has the following properties: + +[options="header,footer"] +|================================================================================================================================================== +| Property Name | Description +|`Authentication Strategy` | How the connection to the LDAP server is authenticated. Possible values are `ANONYMOUS`, `SIMPLE`, `LDAPS`, or `START_TLS`. +|`Manager DN` | The DN of the manager that is used to bind to the LDAP server to search for users. +|`Manager Password` | The password of the manager that is used to bind to the LDAP server to search for users. +|`TLS - Keystore` | Path to the Keystore that is used when connecting to LDAP using LDAPS or START_TLS. +|`TLS - Keystore Password` | Password for the Keystore that is used when connecting to LDAP using LDAPS or START_TLS. +|`TLS - Keystore Type` | Type of the Keystore that is used when connecting to LDAP using LDAPS or START_TLS (i.e. `JKS` or `PKCS12`). +|`TLS - Truststore` | Path to the Truststore that is used when connecting to LDAP using LDAPS or START_TLS. +|`TLS - Truststore Password` | Password for the Truststore that is used when connecting to LDAP using LDAPS or START_TLS. +|`TLS - Truststore Type` | Type of the Truststore that is used when connecting to LDAP using LDAPS or START_TLS (i.e. `JKS` or `PKCS12`). 
+|`TLS - Client Auth` | Client authentication policy when connecting to LDAP using LDAPS or START_TLS. Possible values are `REQUIRED`, `WANT`, `NONE`. +|`TLS - Protocol` | Protocol to use when connecting to LDAP using LDAPS or START_TLS. (i.e. `TLS`, `TLSv1.1`, `TLSv1.2`, etc). +|`TLS - Shutdown Gracefully` | Specifies whether the TLS should be shut down gracefully before the target context is closed. Defaults to false. +|`Referral Strategy` | Strategy for handling referrals. Possible values are `FOLLOW`, `IGNORE`, `THROW`. +|`Connect Timeout` | Duration of connect timeout. (i.e. `10 secs`). +|`Read Timeout` | Duration of read timeout. (i.e. `10 secs`). +|`Url` | Space-separated list of URLs of the LDAP servers (i.e. `ldap://:`). +|`Page Size` | Sets the page size when retrieving users and groups. If not specified, no paging is performed. +|`Sync Interval` | Duration of time between syncing users and groups. (i.e. `30 mins`). Minimum allowable value is `10 secs`. +|`User Search Base` | Base DN for searching for users (i.e. `ou=users,o=nifi`). Required to search users. +|`User Object Class` | Object class for identifying users (i.e. `person`). Required if searching users. +|`User Search Scope` | Search scope for searching users (`ONE_LEVEL`, `OBJECT`, or `SUBTREE`). Required if searching users. +|`User Search Filter` | Filter for searching for users against the `User Search Base` (i.e. `(memberof=cn=team1,ou=groups,o=nifi)`). Optional. +|`User Identity Attribute` | Attribute to use to extract user identity (i.e. `cn`). Optional. If not set, the entire DN is used. +|`User Group Name Attribute` | Attribute to use to define group membership (i.e. `memberof`). Optional. If not set group membership will not be calculated through the users. Will rely on group membership being defined through `Group Member Attribute` if set. The value of this property is the name of the attribute in the user ldap entry that associates them with a group. 
The value of that user attribute could be a dn or group name for instance. What value is expected is configured in the `User Group Name Attribute - Referenced Group Attribute`. +|`User Group Name Attribute - Referenced Group Attribute` | If blank, the value of the attribute defined in `User Group Name Attribute` is expected to be the full dn of the group. If not blank, this property will define the attribute of the group ldap entry that the value of the attribute defined in `User Group Name Attribute` is referencing (i.e. `name`). Use of this property requires that `Group Search Base` is also configured. +|`Group Search Base` | Base DN for searching for groups (i.e. `ou=groups,o=nifi`). Required to search groups. +|`Group Object Class` | Object class for identifying groups (i.e. `groupOfNames`). Required if searching groups. +|`Group Search Scope` | Search scope for searching groups (`ONE_LEVEL`, `OBJECT`, or `SUBTREE`). Required if searching groups. +|`Group Search Filter` | Filter for searching for groups against the `Group Search Base`. Optional. +|`Group Name Attribute` | Attribute to use to extract group name (i.e. `cn`). Optional. If not set, the entire DN is used. +|`Group Member Attribute` | Attribute to use to define group membership (i.e. `member`). Optional. If not set group membership will not be calculated through the groups. Will rely on group membership being defined through `User Group Name Attribute` if set. The value of this property is the name of the attribute in the group ldap entry that associates them with a user. The value of that group attribute could be a dn or memberUid for instance. What value is expected is configured in the `Group Member Attribute - Referenced User Attribute`. (i.e. `member: cn=User 1,ou=users,o=nifi` vs. `memberUid: user1`) +|`Group Member Attribute - Referenced User Attribute` | If blank, the value of the attribute defined in `Group Member Attribute` is expected to be the full dn of the user. 
If not blank, this property will define the attribute of the user ldap entry that the value of the attribute defined in `Group Member Attribute` is referencing (i.e. `uid`). Use of this property requires that `User Search Base` is also configured. (i.e. `member: cn=User 1,ou=users,o=nifi` vs. `memberUid: user1`) +|================================================================================================================================================== + +NOTE: Any identity mapping rules specified in _nifi.properties_ will also be applied to the user identities. Group names are not mapped. + +==== Composite Implementations Another option for the UserGroupProvider are composite implementations. This means that multiple sources/implementations can be configured and composed. For instance, an admin can configure users/groups to be loaded from a file and a directory server. There are two composite implementations, one that supports multiple UserGroupProviders and one that supports multiple UserGroupProviders and a single configurable UserGroupProvider. -The CompositeUserGroupProvider will provide support for retrieving users and groups from multiple sources. The CompositeUserGroupProvider has the following properties: +The CompositeUserGroupProvider will provide support for retrieving users and groups from multiple sources. The CompositeUserGroupProvider has the following property: -* User Group Provider - The identifier of user group providers to load from. The name of each property must be unique, for example: "User Group Provider A", "User Group Provider B", "User Group Provider C" or "User Group Provider 1", "User Group Provider 2", "User Group Provider 3" +[options="header,footer"] +|================================================================================================================================================== +| Property Name | Description +|`User Group Provider [unique key]` | The identifier of user group providers to load from. 
The name of each property must be unique, for example: "User Group Provider A", "User Group Provider B", "User Group Provider C" or "User Group Provider 1", "User Group Provider 2", "User Group Provider 3" +|================================================================================================================================================== + +NOTE: Any identity mapping rules specified in _nifi.properties_ are not applied in this implementation. This behavior would need to be applied by the base implementation. The CompositeConfigurableUserGroupProvider will provide support for retrieving users and groups from multiple sources. Additionally, a single configurable user group provider is required. Users from the configurable user group provider are configurable, however users loaded from one of the User Group Provider [unique key] will not be. The CompositeConfigurableUserGroupProvider has the following properties: -* Configurable User Group Provider - A configurable user group provider. -* User Group Provider - The identifier of user group providers to load from. The name of each property must be unique, for example: "User Group Provider A", "User Group Provider B", "User Group Provider C" or "User Group Provider 1", "User Group Provider 2", "User Group Provider 3" +[options="header,footer"] +|================================================================================================================================================== +| Property Name | Description +|`Configurable User Group Provider` | A configurable user group provider. +|`User Group Provider [unique key]` | The identifier of user group providers to load from. 
The name of each property must be unique, for example: "User Group Provider A", "User Group Provider B", "User Group Provider C" or "User Group Provider 1", "User Group Provider 2", "User Group Provider 3" +|================================================================================================================================================== + +==== FileAccessPolicyProvider The default AccessPolicyProvider is the FileAccessPolicyProvider, however, you can develop additional AccessPolicyProvider as extensions. The FileAccessPolicyProvider has the following properties: -* User Group Provider - The identifier for an User Group Provider defined above that will be used to access users and groups for use in the managed access policies. -* Authorizations File - The file where the FileAccessPolicyProvider will store policies. -* Initial Admin Identity - The identity of an initial admin user that will be granted access to the UI and given the ability to create additional users, groups, and policies. The value of this property could be a DN when using certificates or LDAP, or a Kerberos principal. This property will only be used when there are no other policies defined. If this property is specified then a Legacy Authorized Users File can not be specified. -* Legacy Authorized Users File - The full path to an existing authorized-users.xml that will be automatically converted to the new authorizations model. If this property is specified then an Initial Admin Identity can not be specified, and this property will only be used when there are no other users, groups, and policies defined. -* Node Identity - The identity of a NiFi cluster node. When clustered, a property for each node should be defined, so that every node knows about every other node. If not clustered these properties can be ignored. 
The name of each property must be unique, for example for a three node cluster: "Node Identity A", "Node Identity B", "Node Identity C" or "Node Identity 1", "Node Identity 2", "Node Identity 3" +[options="header,footer"] +|================================================================================================================================================== +| Property Name | Description +|`User Group Provider` | The identifier for a User Group Provider defined above that will be used to access users and groups for use in the managed access policies. +|`Authorizations File` | The file where the FileAccessPolicyProvider will store policies. +|`Initial Admin Identity` | The identity of an initial admin user that will be granted access to the UI and given the ability to create additional users, groups, and policies. The value of this property could be a DN when using certificates or LDAP, or a Kerberos principal. This property will only be used when there are no other policies defined. If this property is specified then a Legacy Authorized Users File cannot be specified. +|`Legacy Authorized Users File` | The full path to an existing _authorized-users.xml_ that will be automatically converted to the new authorizations model. If this property is specified then an Initial Admin Identity cannot be specified, and this property will only be used when there are no other users, groups, and policies defined. +|`Node Identity` | The identity of a NiFi cluster node. When clustered, a property for each node should be defined, so that every node knows about every other node. If not clustered, these properties can be ignored.
The name of each property must be unique, for example for a three node cluster: "Node Identity A", "Node Identity B", "Node Identity C" or "Node Identity 1", "Node Identity 2", "Node Identity 3" -The identities configured in the Initial Admin Identity, the Node Identity properties, or discovered in a Legacy Authorized Users File must be available in the configured User Group Provider. +|================================================================================================================================================== -The default authorizer is the StandardManagedAuthorizer, however, you can develop additional authorizers as extensions. The StandardManagedAuthorizer has the following properties: +NOTE: The identities configured in the Initial Admin Identity, the Node Identity properties, or discovered in a Legacy Authorized Users File must be available in the configured User Group Provider. -* Access Policy Provider - The identifier for an Access Policy Provider defined above. +NOTE: Any users in the legacy users file must be found in the configured User Group Provider. -The FileAuthorizer has been replaced with the more granular StandardManagedAuthorizer approach described above. However, it is still available for backwards compatibility reasons. The -FileAuthorizer has the following properties. +NOTE: Any identity mapping rules specified in _nifi.properties_ will also be applied to the node identities, + so the values should be the unmapped identities (i.e. full DN from a certificate). This identity must be found + in the configured User Group Provider. -* Authorizations File - The file where the FileAuthorizer stores policies. By default, the 'authorizations.xml' in the 'conf' directory is chosen. -* Users File - The file where the FileAuthorizer stores users and groups. By default, the 'users.xml' in the 'conf' directory is chosen. 
-* Initial Admin Identity - The identity of an initial admin user that is granted access to the UI and given the ability to create additional users, groups, and policies. This property is only used when there are no other users, groups, and policies defined. -* Legacy Authorized Users File - The full path to an existing authorized-users.xml that is automatically converted to the multi-tenant authorization model. This property is only used when there are no other users, groups, and policies defined. -* Node Identity - The identity of a NiFi cluster node. When clustered, a property for each node should be defined, so that every node knows about every other node. If not clustered, these properties can be ignored. +==== StandardManagedAuthorizer + +The default authorizer is the StandardManagedAuthorizer, however, you can develop additional authorizers as extensions. The StandardManagedAuthorizer has the following property: + +[options="header,footer"] +|================================================================================================================================================== +| Property Name | Description +|`Access Policy Provider` | The identifier for an Access Policy Provider defined above. +|================================================================================================================================================== + + +==== FileAuthorizer + +The FileAuthorizer has been replaced with the more granular StandardManagedAuthorizer approach described above. However, it is still available for backwards compatibility reasons. The FileAuthorizer has the following properties: + +[options="header,footer"] +|================================================================================================================================================== +| Property Name | Description +|`Authorizations File` | The file where the FileAuthorizer stores policies. By default, the _authorizations.xml_ in the `conf` directory is chosen. 
+|`Users File` | The file where the FileAuthorizer stores users and groups. By default, the _users.xml_ in the `conf` directory is chosen. +|`Initial Admin Identity` | The identity of an initial admin user that is granted access to the UI and given the ability to create additional users, groups, and policies. This property is only used when there are no other users, groups, and policies defined. +|`Legacy Authorized Users File` | The full path to an existing _authorized-users.xml_ that is automatically converted to the multi-tenant authorization model. This property is only used when there are no other users, groups, and policies defined. +|`Node Identity` | The identity of a NiFi cluster node. When clustered, a property for each node should be defined, so that every node knows about every other node. If not clustered, these properties can be ignored. +|================================================================================================================================================== + +NOTE: Any identity mapping rules specified in _nifi.properties_ will also be applied to the initial admin identity, so the value should be the unmapped identity. + +NOTE: Any identity mapping rules specified in _nifi.properties_ will also be applied to the node identities, so the values should be the unmapped identities (i.e. full DN from a certificate). [[initial-admin-identity]] ==== Initial Admin Identity (New NiFi Instance) -If you are setting up a secured NiFi instance for the first time, you must manually designate an “Initial Admin Identity” in the 'authorizers.xml' file. This initial admin user is granted access to the UI and given the ability to create additional users, groups, and policies. The value of this property could be a DN (when using certificates or LDAP) or a Kerberos principal. If you are the NiFi administrator, add yourself as the “Initial Admin Identity”. 
+If you are setting up a secured NiFi instance for the first time, you must manually designate an “Initial Admin Identity” in the _authorizers.xml_ file. This initial admin user is granted access to the UI and given the ability to create additional users, groups, and policies. The value of this property could be a DN (when using certificates or LDAP) or a Kerberos principal. If you are the NiFi administrator, add yourself as the “Initial Admin Identity”. + +After you have edited and saved the _authorizers.xml_ file, restart NiFi. The “Initial Admin Identity” user and administrative policies are added to the _users.xml_ and _authorizations.xml_ files during restart. Once NiFi starts, the “Initial Admin Identity” user is able to access the UI and begin managing users, groups, and policies. + +NOTE: For a brand new secure flow, providing the "Initial Admin Identity" gives that user access to get into the UI and to manage users, groups and policies. But if that user wants to start modifying the flow, they need to grant themselves policies for the root process group. The system is unable to do this automatically because in a new flow the UUID of the root process group is not permanent until the _flow.xml.gz_ is generated. If the NiFi instance is an upgrade from an existing _flow.xml.gz_ or a 1.x instance going from unsecure to secure, then the "Initial Admin Identity" user is automatically given the privileges to modify the flow. + +Some common use cases are described below. + +===== File-based (LDAP Authentication) Here is an example LDAP entry using the name John Smith: @@ -902,6 +975,8 @@ Here is an example LDAP entry using the name John Smith: ---- +===== File-based (Kerberos Authentication) + Here is an example Kerberos entry using the name John Smith and realm `NIFI.APACHE.ORG`: ---- @@ -932,11 +1007,9 @@ Here is an example Kerberos entry using the name John Smith and realm `NIFI.APAC ---- -After you have edited and saved the 'authorizers.xml' file, restart NiFi. 
The “Initial Admin Identity” user and administrative policies are added to the 'users.xml' and 'authorizations.xml' files during restart. Once NiFi starts, the “Initial Admin Identity” user is able to access the UI and begin managing users, groups, and policies. +===== LDAP-based Users/Groups Referencing User DN -NOTE: For a brand new secure flow, providing the "Initial Admin Identity" gives that user access to get into the UI and to manage users, groups and policies. But if that user wants to start modifying the flow, they need to grant themselves policies for the root process group. The system is unable to do this automatically because in a new flow the UUID of the root process group is not permanent until the flow.xml.gz is generated. If the NiFi instance is an upgrade from an existing flow.xml.gz or a 1.x instance going from unsecure to secure, then the "Initial Admin Identity" user is automatically given the privileges to modify the flow. - -Here is an example loading users and groups from LDAP. Group membership will be driven through the member attribute of each group. Authorization will still use file based access policies: +Here is an example loading users and groups from LDAP. Group membership will be driven through the member attribute of each group. Authorization will still use file-based access policies: ---- dn: cn=User 1,ou=users,o=nifi @@ -1025,9 +1098,11 @@ member: cn=User 2,ou=users,o=nifi ---- -The 'Initial Admin Identity' value would have loaded from the cn from John Smith's entry based on the 'User Identity Attribute' value. +The `Initial Admin Identity` value would have loaded from the cn from John Smith's entry based on the `User Identity Attribute` value. -Here is an example loading users and groups from LDAP. Group membership will be driven through the member attribute of each group. 
Authorization will still use file based access policies: +===== LDAP-based Users/Groups Referencing User Attribute + +Here is an example loading users and groups from LDAP. Group membership will be driven through the member uid attribute of each group. Authorization will still use file-based access policies: ---- dn: uid=User 1,ou=Users,dc=local @@ -1077,7 +1152,7 @@ memberUid: user2 30 mins - ou=Groups,dc=local + ou=Users,dc=local posixAccount ONE_LEVEL @@ -1111,9 +1186,36 @@ memberUid: user2 ---- +===== Composite - File and LDAP-based Users/Groups + Here is an example composite implementation loading users and groups from LDAP and a local file. Group membership will be driven through the member attribute of each group. The users from LDAP will be read only while the users loaded from the file will be configurable in UI. ---- +dn: cn=User 1,ou=users,o=nifi +objectClass: organizationalPerson +objectClass: person +objectClass: inetOrgPerson +objectClass: top +cn: User 1 +sn: User1 +uid: user1 + +dn: cn=User 2,ou=users,o=nifi +objectClass: organizationalPerson +objectClass: person +objectClass: inetOrgPerson +objectClass: top +cn: User 2 +sn: User2 +uid: user2 + +dn: cn=admins,ou=groups,o=nifi +objectClass: groupOfNames +objectClass: top +cn: admins +member: cn=User 1,ou=users,o=nifi +member: cn=User 2,ou=users,o=nifi + file-user-group-provider @@ -1191,12 +1293,12 @@ Here is an example composite implementation loading users and groups from LDAP a ---- -In this example, the users and groups are loaded from LDAP but the servers are managed in a local file. The 'Initial Admin Identity' value came from an attribute in a LDAP entry based on the 'User Identity Attribute'. The 'Node Identity' values are established in the local file using the 'Initial User Identity' properties. +In this example, the users and groups are loaded from LDAP but the servers are managed in a local file. 
The `Initial Admin Identity` value came from an attribute in an LDAP entry based on the `User Identity Attribute`. The `Node Identity` values are established in the local file using the `Initial User Identity` properties. [[legacy-authorized-users]] ==== Legacy Authorized Users (NiFi Instance Upgrade) -If you are upgrading from a 0.x NiFi instance, you can convert your previously configured users and roles to the multi-tenant authorization model. In the 'authorizers.xml' file, specify the location of your existing 'authorized-users.xml' file in the “Legacy Authorized Users File” property. +If you are upgrading from a 0.x NiFi instance, you can convert your previously configured users and roles to the multi-tenant authorization model. In the _authorizers.xml_ file, specify the location of your existing _authorized-users.xml_ file in the `Legacy Authorized Users File` property. Here is an example entry: @@ -1228,9 +1330,9 @@ Here is an example entry: ---- -After you have edited and saved the 'authorizers.xml' file, restart NiFi. Users and roles from the 'authorized-users.xml' file are converted and added as identities and policies in the 'users.xml' and 'authorizations.xml' files. Once the application starts, users who previously had a legacy Administrator role can access the UI and begin managing users, groups, and policies. +After you have edited and saved the _authorizers.xml_ file, restart NiFi. Users and roles from the _authorized-users.xml_ file are converted and added as identities and policies in the _users.xml_ and _authorizations.xml_ files. Once the application starts, users who previously had a legacy Administrator role can access the UI and begin managing users, groups, and policies.
-The following tables summarize the global and component policies assigned to each legacy role if the NiFi instance has an existing 'flow.xml.gz': +The following tables summarize the global and component policies assigned to each legacy role if the NiFi instance has an existing _flow.xml.gz_: ===== Global Access Policies [cols=">s,^s,^s,^s,^s,^s,^s", options="header"] @@ -1265,9 +1367,9 @@ The following tables summarize the global and component policies assigned to eac For details on the individual policies in the table, see <>. -NOTE: NiFi fails to restart if values exist for both the “Initial Admin Identity” and “Legacy Authorized Users File” properties. You can specify only one of these values to initialize authorizations. +NOTE: NiFi fails to restart if values exist for both the `Initial Admin Identity` and `Legacy Authorized Users File` properties. You can specify only one of these values to initialize authorizations. -NOTE: Do not manually edit the 'authorizations.xml' file. Create authorizations only during initial setup and afterwards using the NiFi UI. +NOTE: Do not manually edit the _authorizations.xml_ file. Create authorizations only during initial setup and afterwards using the NiFi UI. [[cluster-node-identities]] ==== Cluster Node Identities @@ -1312,7 +1414,7 @@ cn=nifi-2,ou=people,dc=example,dc=com ---- -NOTE: In a cluster, all nodes must have the same 'authorizations.xml' and 'users.xml'. The only exception is if a node has empty 'authorizations.xml' and 'user.xml' files prior to joining the cluster. In this scenario, the node inherits them from the cluster during startup. +NOTE: In a cluster, all nodes must have the same _authorizations.xml_ and _users.xml_. The only exception is if a node has empty _authorizations.xml_ and _users.xml_ files prior to joining the cluster. In this scenario, the node inherits them from the cluster during startup.
Now that initial authorizations have been created, additional users, groups and authorizations can be created and managed in the NiFi UI. @@ -1620,7 +1722,7 @@ image:user2-edit-connection.png["User2 Edit Connection"] This section provides an overview of the capabilities of NiFi to encrypt and decrypt data. -The `EncryptContent` processor allows for the encryption and decryption of data, both internal to NiFi and integrated with external systems, such as `openssl` and other data sources and consumers. +The EncryptContent processor allows for the encryption and decryption of data, both internal to NiFi and integrated with external systems, such as `openssl` and other data sources and consumers. [[key-derivation-functions]] === Key Derivation Functions @@ -1673,7 +1775,7 @@ Here are the KDFs currently supported by NiFi (primarily in the `EncryptContent` * link:http://security.stackexchange.com/a/26253/16485[Scrypt vs. Bcrypt (as of 2010)^] * link:http://security.stackexchange.com/a/6415/16485[Bcrypt vs PBKDF2^] * link:http://wildlyinaccurate.com/bcrypt-choosing-a-work-factor/[Choosing a work factor for Bcrypt^] -* link:https://docs.spring.io/spring-security/site/docs/current/apidocs/org/springframework/security/crypto/bcrypt/BCrypt.html[Spring Security Bcrypt^] +* link:https://docs.spring.io/spring-security/site/docs/current/api/org/springframework/security/crypto/bcrypt/BCrypt.html[Spring Security Bcrypt^] * link:https://www.openssl.org/docs/man1.1.0/crypto/EVP_BytesToKey.html[OpenSSL EVP BytesToKey PKCS#1v1.5^] * link:https://wiki.openssl.org/index.php/Manual:PKCS5_PBKDF2_HMAC(3)[OpenSSL PBKDF2 KDF^] * link:http://security.stackexchange.com/a/29139/16485[OpenSSL KDF flaws description^] @@ -1810,7 +1912,7 @@ If no administrator action is taken, the configuration values remain unencrypted [[encrypt-config_tool]] === Encrypt-Config Tool -The `encrypt-config` command line tool (invoked as `./bin/encrypt-config.sh` or `bin\encrypt-config.bat`) reads from a 
'nifi.properties' file with plaintext sensitive configuration values, prompts for a master password or raw hexadecimal key, and encrypts each value. It replaces the plain values with the protected value in the same file, or writes to a new 'nifi.properties' file if specified. +The `encrypt-config` command line tool (invoked as `./bin/encrypt-config.sh` or `bin\encrypt-config.bat`) reads from a _nifi.properties_ file with plaintext sensitive configuration values, prompts for a master password or raw hexadecimal key, and encrypts each value. It replaces the plain values with the protected value in the same file, or writes to a new _nifi.properties_ file if specified. The default encryption algorithm utilized is AES/GCM 128/256-bit. 128-bit is used if the JCE Unlimited Strength Cryptographic Jurisdiction Policy files are not installed, and 256-bit is used if they are installed. @@ -1818,27 +1920,27 @@ You can use the following command line options with the `encrypt-config` tool: * `-h`,`--help` Prints this usage message * `-v`,`--verbose` Sets verbose mode (default false) - * `-n`,`--niFiProperties ` The nifi.properties file containing unprotected config values (will be overwritten) - * `-l`,`--loginIdentityProviders ` The login-identity-providers.xml file containing unprotected config values (will be overwritten) - * `-a`,`--authorizers ` The authorizers.xml file containing unprotected config values (will be overwritten) - * `-f`,`--flowXml ` The flow.xml.gz file currently protected with old password (will be overwritten) - * `-b`,`--bootstrapConf ` The bootstrap.conf file to persist master key - * `-o`,`--outputNiFiProperties ` The destination nifi.properties file containing protected config values (will not modify input nifi.properties) - * `-i`,`--outputLoginIdentityProviders ` The destination login-identity-providers.xml file containing protected config values (will not modify input login-identity-providers.xml) - * `-u`,`--outputAuthorizers ` The destination 
authorizers.xml file containing protected config values (will not modify input authorizers.xml) - * `-g`,`--outputFlowXml ` The destination flow.xml.gz file containing protected config values (will not modify input flow.xml.gz) + * `-n`,`--niFiProperties ` The _nifi.properties_ file containing unprotected config values (will be overwritten) + * `-l`,`--loginIdentityProviders ` The _login-identity-providers.xml_ file containing unprotected config values (will be overwritten) + * `-a`,`--authorizers ` The _authorizers.xml_ file containing unprotected config values (will be overwritten) + * `-f`,`--flowXml ` The _flow.xml.gz_ file currently protected with old password (will be overwritten) + * `-b`,`--bootstrapConf ` The _bootstrap.conf_ file to persist master key + * `-o`,`--outputNiFiProperties ` The destination _nifi.properties_ file containing protected config values (will not modify input _nifi.properties_) + * `-i`,`--outputLoginIdentityProviders ` The destination _login-identity-providers.xml_ file containing protected config values (will not modify input _login-identity-providers.xml_) + * `-u`,`--outputAuthorizers ` The destination _authorizers.xml_ file containing protected config values (will not modify input _authorizers.xml_) + * `-g`,`--outputFlowXml ` The destination _flow.xml.gz_ file containing protected config values (will not modify input _flow.xml.gz_) * `-k`,`--key ` The raw hexadecimal key to use to encrypt the sensitive properties * `-e`,`--oldKey ` The old raw hexadecimal key to use during key migration * `-p`,`--password ` The password from which to derive the key to use to encrypt the sensitive properties * `-w`,`--oldPassword ` The old password from which to derive the key during migration * `-r`,`--useRawKey` If provided, the secure console will prompt for the raw key value in hexadecimal form - * `-m`,`--migrate` If provided, the nifi.properties and/or login-identity-providers.xml sensitive properties will be re-encrypted with a new key - 
* `-x`,`--encryptFlowXmlOnly` If provided, the properties in flow.xml.gz will be re-encrypted with a new key but the nifi.properties and/or login-identity-providers.xml files will not be modified - * `-s`,`--propsKey ` The password or key to use to encrypt the sensitive processor properties in flow.xml.gz - * `-A`,`--newFlowAlgorithm ` The algorithm to use to encrypt the sensitive processor properties in flow.xml.gz - * `-P`,`--newFlowProvider ` The security provider to use to encrypt the sensitive processor properties in flow.xml.gz + * `-m`,`--migrate` If provided, the _nifi.properties_ and/or _login-identity-providers.xml_ sensitive properties will be re-encrypted with a new key + * `-x`,`--encryptFlowXmlOnly` If provided, the properties in _flow.xml.gz_ will be re-encrypted with a new key but the _nifi.properties_ and/or _login-identity-providers.xml_ files will not be modified + * `-s`,`--propsKey ` The password or key to use to encrypt the sensitive processor properties in _flow.xml.gz_ + * `-A`,`--newFlowAlgorithm ` The algorithm to use to encrypt the sensitive processor properties in _flow.xml.gz_ + * `-P`,`--newFlowProvider ` The security provider to use to encrypt the sensitive processor properties in _flow.xml.gz_ -As an example of how the tool works, assume that you have installed the tool on a machine supporting 256-bit encryption and with the following existing values in the 'nifi.properties' file: +As an example of how the tool works, assume that you have installed the tool on a machine supporting 256-bit encryption and with the following existing values in the _nifi.properties_ file: ---- # security properties # @@ -1865,7 +1967,7 @@ encrypt-config.sh -n nifi.properties ---- -As a result, the 'nifi.properties' file is overwritten with protected properties and sibling encryption identifiers (`aes/gcm/256`, the currently supported algorithm): +As a result, the _nifi.properties_ file is overwritten with protected properties and sibling encryption 
identifiers (`aes/gcm/256`, the currently supported algorithm): ---- # security properties # @@ -1886,7 +1988,7 @@ nifi.security.truststoreType= nifi.security.truststorePasswd= ---- -Additionally, the 'bootstrap.conf' file is updated with the encryption key as follows: +Additionally, the _bootstrap.conf_ file is updated with the encryption key as follows: ---- # Master key in hexadecimal format for encrypted sensitive configuration values @@ -1895,11 +1997,11 @@ nifi.bootstrap.sensitive.key=0123456789ABCDEFFEDCBA98765432100123456789ABCDEFFED Sensitive configuration values are encrypted by the tool by default, however you can encrypt any additional properties, if desired. To encrypt additional properties, specify them as comma-separated values in the `nifi.sensitive.props.additional.keys` property. -If the 'nifi.properties' file already has valid protected values, those property values are not modified by the tool. +If the _nifi.properties_ file already has valid protected values, those property values are not modified by the tool. -When applied to 'login-identity-providers.xml' and 'authorizers.xml', the property elements are updated with an `encryption` attribute: +When applied to _login-identity-providers.xml_ and _authorizers.xml_, the property elements are updated with an `encryption` attribute: -Example of protected login-identity-providers.xml: +Example of protected _login-identity-providers.xml_: ---- @@ -1916,7 +2018,7 @@ Example of protected login-identity-providers.xml: ---- -Example of protected authorizers.xml: +Example of protected _authorizers.xml_: --- @@ -1936,7 +2038,7 @@ Example of protected authorizers.xml: [encrypt_config_property_migration] === Sensitive Property Key Migration -In order to change the key used to encrypt the sensitive values, indicate *migration mode* using the `-m` or `--migrate` flag, provide the new key or password using the `-k` or `-p` flags as usual, and provide the existing key or password using `-e` or `-w` respectively. 
This will allow the toolkit to decrypt the existing values and re-encrypt them, and update `bootstrap.conf` with the new key. Only one of the key or password needs to be specified for each phase (old vs. new), and any combination is sufficient: +In order to change the key used to encrypt the sensitive values, indicate *migration mode* using the `-m` or `--migrate` flag, provide the new key or password using the `-k` or `-p` flags as usual, and provide the existing key or password using `-e` or `-w` respectively. This will allow the toolkit to decrypt the existing values and re-encrypt them, and update _bootstrap.conf_ with the new key. Only one of the key or password needs to be specified for each phase (old vs. new), and any combination is sufficient: * old key -> new key * old key -> new password @@ -1946,17 +2048,17 @@ In order to change the key used to encrypt the sensitive values, indicate *migra [encrypt_config_flow_migration] === Existing Flow Migration -This tool can also be used to change the value of `nifi.sensitive.props.key` for an existing flow. The tool will read the existing `flow.xml.gz` and decrypt any sensitive component properties using the original key, -then re-encrypt the sensitive properties with the new key, and write out a new version of the `flow.xml.gz`, or overwrite the existing one. +This tool can also be used to change the value of `nifi.sensitive.props.key` for an existing flow. The tool will read the existing _flow.xml.gz_ and decrypt any sensitive component properties using the original key, +then re-encrypt the sensitive properties with the new key, and write out a new version of the _flow.xml.gz_, or overwrite the existing one. -The current sensitive properties key is not provided as a command-line argument, as it is read directly from `nifi.properties`. 
As this file is a required parameter, the `-x`/`--encryptFlowXmlOnly` flags tell the tool *not* to attempt to encrypt the properties in `nifi.properties`, but rather to *only* update the `nifi.sensitive.props.key` value with the new key. The exception to this is if the `nifi.properties` is *already* encrypted, the new sensitive property key will also be encrypted before being written to `nifi.properties`. +The current sensitive properties key is not provided as a command-line argument, as it is read directly from _nifi.properties_. As this file is a required parameter, the `-x`/`--encryptFlowXmlOnly` flags tell the tool *not* to attempt to encrypt the properties in _nifi.properties_, but rather to *only* update the `nifi.sensitive.props.key` value with the new key. The exception to this is if the _nifi.properties_ is *already* encrypted, the new sensitive property key will also be encrypted before being written to _nifi.properties_. -The following command would migrate the sensitive properties key in place, meaning it would overwrite the existing `flow.xml.gz` and `nifi.properties`: +The following command would migrate the sensitive properties key in place, meaning it would overwrite the existing _flow.xml.gz_ and _nifi.properties_: ---- ./encrypt-config.sh -f /path/to/flow.xml.gz -n ./path/to/nifi.properties -s newpassword -x ---- -The following command would migrate the sensitive properties key and write out a separate `flow.xml.gz` and `nifi.properties`: +The following command would migrate the sensitive properties key and write out a separate _flow.xml.gz_ and _nifi.properties_: ---- ./encrypt-config.sh -f ./path/to/src/flow.xml.gz -g /path/to/dest/flow.xml.gz -n /path/to/src/nifi.properties -o /path/to/dest/nifi.properties -s newpassword -x ---- @@ -1991,7 +2093,7 @@ and clustered environments. 
These utilities include: * Node Manager -- The node manager tool allows administrators to perform a status check on a node as well as to connect, disconnect, or remove nodes that are part of a cluster. * File Manager -- The file manager tool allows administrators to backup, install or restore a NiFi installation from backup. -The admin toolkit is bundled with the nifi-toolkit and can be executed with scripts found in the _bin_ folder. +The admin toolkit is bundled with the nifi-toolkit and can be executed with scripts found in the `bin` folder. === Prerequisites for Running Admin Toolkit in a Secure Environment For secured nodes and clusters, two policies should be configured in advance: @@ -2021,7 +2123,7 @@ The following are available options: * `-b`,`--bootstrapConf ` Existing Bootstrap Configuration file (required) * `-d`,`--nifiInstallDir ` NiFi Root Folder (required) * `-h`,`--help` Help Text (optional) -* `-l`,`--level ` Status level of bulletin – INFO, WARN, ERROR +* `-l`,`--level ` Status level of bulletin – `INFO`, `WARN`, `ERROR` * `-m`,`--message ` Bulletin message (required) * `-p`,`--proxyDN ` Proxy or User DN (required for secured nodes) * `-v`,`--verbose` Verbose messaging (optional) @@ -2046,7 +2148,7 @@ displays if the node is not part of a cluster) as well as obtaining the status o from a cluster and need to be connected or removed, a list of urls of connected nodes should be provided to send the required command to the active cluster. Node Manager supports NiFi version 1.0.0 and higher. Node Manager is also available in -'node-manager.bat' file for use on Windows machines. +_node-manager.bat_ file for use on Windows machines. To connect, disconnect, or remove a node from a cluster: @@ -2105,7 +2207,7 @@ Disconnect: When a node is disconnected from the cluster, the node itself should appear as disconnected and the cluster should have a bulletin indicating the disconnect request was received. 
The cluster should also show _n-1/n_ -nodes available in the cluster. For example, if 1 node is disconnected from a 3-node cluster, then 2 of 3 nodes +nodes available in the cluster. For example, if 1 node is disconnected from a 3-node cluster, then "2 of 3" nodes should show on the remaining nodes in the cluster. Changes to the flow should not be allowed on the cluster with a disconnected node. @@ -2120,7 +2222,7 @@ Remove: When the remove command is executed the node should show as disconnected from a cluster. The nodes remaining in the cluster should show _n-1/n-1_ nodes. For example, if 1 node is removed from a 3-node cluster, then the remaining 2 nodes -should show 2 of 2 nodes). The cluster should allow a flow to be adjusted. The removed node can rejoin the +should show "2 of 2" nodes. The cluster should allow a flow to be adjusted. The removed node can rejoin the cluster if restarted and the flow for the cluster has not changed. If the flow was changed, the flow template of the removed node should be deleted before restarting the node to allow it to obtain the cluster flow (otherwise an uninheritable flow file exception may occur). @@ -2129,7 +2231,7 @@ an uninheritable flow file exception may occur). The File Manager utility allows system administrators to take a backup of an existing NiFi installation, install a new version of NiFi in a designated location (while migrating any previous configuration settings) or restore an installation from a previous backup. -File Manager supports NiFi version 1.0.0 and higher and is available in 'file-manager.bat' file for use on Windows machines. +File Manager supports NiFi version 1.0.0 and higher and is available in _file-manager.bat_ file for use on Windows machines. To show help: @@ -2199,8 +2301,10 @@ folder of the current installation) to the new installation as well as migrate c Restore: The restore operation allows an existing installation to revert back to a previous installation. 
Using an existing backup directory (created from the backup operation) -the FileManager utility will restore libraries, scripts and documents as well as revert to previous configurations. NOTE: If repositories were changed due to the installation -of a newer version of NiFi these may no longer be compatible during restore. In that scenario exclude the -m option to ensure new repositories will be created or, if repositories +the FileManager utility will restore libraries, scripts and documents as well as revert to previous configurations. + +NOTE: If repositories were changed due to the installation +of a newer version of NiFi these may no longer be compatible during restore. In that scenario exclude the `-m` option to ensure new repositories will be created or, if repositories live outside of the NiFi directory, remove them so they can be recreated on startup after restore. @@ -2238,7 +2342,7 @@ NiFi Clustering is unique and has its own terminology. It's important to underst [template="glossary", id="terminology"] *Terminology* + -*NiFi Cluster Coordinator*: A NiFi Cluster Cluster Coordinator is the node in a NiFi cluster that is responsible for carrying out +*NiFi Cluster Coordinator*: A NiFi Cluster Coordinator is the node in a NiFi cluster that is responsible for carrying out tasks to manage which nodes are allowed in the cluster and providing the most up-to-date flow to newly joining nodes. When a DataFlow Manager manages a dataflow in a cluster, they are able to do so through the User Interface of any node in the cluster. Any change made is then replicated to all nodes in the cluster. @@ -2308,7 +2412,7 @@ some number of Nodes have cast votes (configured by setting the `nifi.cluster.fl a flow is elected to be the "correct" copy of the flow. All nodes that have incompatible flows are then disconnected from the cluster while those with compatible flows inherit the cluster's flow. 
Election is performed according to the "popular vote" with the caveat that the winner will never be an "empty flow" unless all flows are empty. This -allows an administrator to remove a node's `flow.xml.gz` file and restart the node, knowing that the node's flow will +allows an administrator to remove a node's _flow.xml.gz_ file and restart the node, knowing that the node's flow will not be voted to be the "correct" flow unless no other flow is found. *Basic Cluster Setup* + @@ -2329,31 +2433,31 @@ For each Node, the minimum properties to configure are as follows: * Under the _State Management section_, set the `nifi.state.management.provider.cluster` property to the identifier of the Cluster State Provider. Ensure that the Cluster State Provider has been configured in the _state-management.xml_ file. See <> for more information. -* Under _Cluster Node_ Properties, set the following: -** nifi.cluster.is.node - Set this to _true_. -** nifi.cluster.node.address - Set this to the fully qualified hostname of the node. If left blank, it defaults to "localhost". -** nifi.cluster.node.protocol.port - Set this to an open port that is higher than 1024 (anything lower requires root). -** nifi.cluster.node.protocol.threads - The number of threads that should be used to communicate with other nodes in the cluster. This property - defaults to 10. A thread pool is used for replicating requests to all nodes, and the +* Under _Cluster Node Properties_, set the following: +** `nifi.cluster.is.node` - Set this to _true_. +** `nifi.cluster.node.address` - Set this to the fully qualified hostname of the node. If left blank, it defaults to `localhost`. +** `nifi.cluster.node.protocol.port` - Set this to an open port that is higher than 1024 (anything lower requires root). +** `nifi.cluster.node.protocol.threads` - The number of threads that should be used to communicate with other nodes in the cluster. This property + defaults to `10`. 
A thread pool is used for replicating requests to all nodes, and the thread pool will never have fewer than this number of threads. It will grow as needed up to the maximum value set by the `nifi.cluster.node.protocol.max.threads` property. -** nifi.cluster.node.protocol.max.threads - The maximum number of threads that should be used to communicate with other nodes in the cluster. This property - defaults to 50. A thread pool is used for replication requests to all nodes, and the thread pool will have a "core" size that is configured by the +** `nifi.cluster.node.protocol.max.threads` - The maximum number of threads that should be used to communicate with other nodes in the cluster. This property + defaults to `50`. A thread pool is used for replication requests to all nodes, and the thread pool will have a "core" size that is configured by the `nifi.cluster.node.protocol.threads` property. However, if necessary, the thread pool will increase the number of active threads to the limit set by this property. -** nifi.zookeeper.connect.string - The Connect String that is needed to connect to Apache ZooKeeper. This is a comma-separted list - of hostname:port pairs. For example, localhost:2181,localhost:2182,localhost:2183. This should contain a list of all ZooKeeper +** `nifi.zookeeper.connect.string` - The Connect String that is needed to connect to Apache ZooKeeper. This is a comma-separated list + of hostname:port pairs. For example, `localhost:2181,localhost:2182,localhost:2183`. This should contain a list of all ZooKeeper instances in the ZooKeeper quorum. -** nifi.zookeeper.root.node - The root ZNode that should be used in ZooKeeper. ZooKeeper provides a directory-like structure +** `nifi.zookeeper.root.node` - The root ZNode that should be used in ZooKeeper. ZooKeeper provides a directory-like structure for storing data. Each 'directory' in this structure is referred to as a ZNode. 
This denotes the root ZNode, or 'directory', - that should be used for storing data. The default value is _/root_. This is important to set correctly, as which cluster + that should be used for storing data. The default value is `/root`. This is important to set correctly, as which cluster the NiFi instance attempts to join is determined by which ZooKeeper instance it connects to and the ZooKeeper Root Node that is specified. -** nifi.cluster.flow.election.max.wait.time - Specifies the amount of time to wait before electing a Flow as the "correct" Flow. +** `nifi.cluster.flow.election.max.wait.time` - Specifies the amount of time to wait before electing a Flow as the "correct" Flow. If the number of Nodes that have voted is equal to the number specified by the `nifi.cluster.flow.election.max.candidates` - property, the cluster will not wait this long. The default value is _5 mins_. Note that the time starts as soon as the first vote + property, the cluster will not wait this long. The default value is `5 mins`. Note that the time starts as soon as the first vote is cast. -** nifi.cluster.flow.election.max.candidates - Specifies the number of Nodes required in the cluster to cause early election +** `nifi.cluster.flow.election.max.candidates` - Specifies the number of Nodes required in the cluster to cause early election of Flows. This allows the Nodes in the cluster to avoid having to wait a long time before starting processing if we reach at least this number of nodes in the cluster. @@ -2364,9 +2468,9 @@ image:ncm.png["Clustered User Interface"] *Troubleshooting* -If you encounter issues and your cluster does not work as described, investigate the nifi-app.log and nifi-user.log -files on the nodes. If needed, you can change the logging level to DEBUG by editing the conf/logback.xml file. 
Specifically, -set the level="DEBUG" in the following line (instead of "INFO"): +If you encounter issues and your cluster does not work as described, investigate the _nifi-app.log_ and _nifi-user.log_ +files on the nodes. If needed, you can change the logging level to DEBUG by editing the `conf/logback.xml` file. Specifically, +set the `level="DEBUG"` in the following line (instead of `"INFO"`): ---- @@ -2392,9 +2496,9 @@ Providers. The _nifi.properties_ file contains three different properties that a |==== |*Property*|*Description* -|nifi.state.management.configuration.file|The first is the property that specifies an external XML file that is used for configuring the local and/or cluster-wide State Providers. This XML file may contain configurations for multiple providers -|nifi.state.management.provider.local|The property that provides the identifier of the local State Provider configured in this XML file -|nifi.state.management.provider.cluster|Similarly, the property provides the identifier of the cluster-wide State Provider configured in this XML file. +|`nifi.state.management.configuration.file`|The first is the property that specifies an external XML file that is used for configuring the local and/or cluster-wide State Providers. This XML file may contain configurations for multiple providers +|`nifi.state.management.provider.local`|The property that provides the identifier of the local State Provider configured in this XML file +|`nifi.state.management.provider.cluster`|Similarly, the property provides the identifier of the cluster-wide State Provider configured in this XML file. |==== This XML file consists of a top-level `state-management` element, which has one or more `local-provider` and zero or more `cluster-provider` @@ -2407,12 +2511,12 @@ Once these State Providers have been configured in the _state-management.xml_ fi referenced by their identifiers. 
By default, the Local State Provider is configured to be a `WriteAheadLocalStateProvider` that persists the data to the -_$NIFI_HOME/state/local_ directory. The default Cluster State Provider is configured to be a `ZooKeeperStateProvider`. The default +`$NIFI_HOME/state/local` directory. The default Cluster State Provider is configured to be a `ZooKeeperStateProvider`. The default ZooKeeper-based provider must have its `Connect String` property populated before it can be used. It is also advisable, if multiple NiFi instances will use the same ZooKeeper instance, that the value of the `Root Node` property be changed. For instance, one might set the value to `/nifi//production`. A `Connect String` takes the form of comma separated : tuples, such as -my-zk-server1:2181,my-zk-server2:2181,my-zk-server3:2181. In the event a port is not specified for any of the hosts, the ZooKeeper default of -2181 is assumed. +`my-zk-server1:2181,my-zk-server2:2181,my-zk-server3:2181`. In the event a port is not specified for any of the hosts, the ZooKeeper default of +`2181` is assumed. When adding data to ZooKeeper, there are two options for Access Control: `Open` and `CreatorOnly`. If the `Access Control` property is set to `Open`, then anyone is allowed to log into ZooKeeper and have full permissions to see, change, delete, or administer the data. @@ -2446,8 +2550,8 @@ embedded ZooKeeper server. 
|==== |*Property*|*Description* -|nifi.state.management.embedded.zookeeper.start|Specifies whether or not this instance of NiFi should run an embedded ZooKeeper server -|nifi.state.management.embedded.zookeeper.properties|Properties file that provides the ZooKeeper properties to use if `nifi.state.management.embedded.zookeeper.start` is set to true +|`nifi.state.management.embedded.zookeeper.start`|Specifies whether or not this instance of NiFi should run an embedded ZooKeeper server +|`nifi.state.management.embedded.zookeeper.properties`|Properties file that provides the ZooKeeper properties to use if `nifi.state.management.embedded.zookeeper.start` is set to `true` |==== This can be accomplished by setting the `nifi.state.management.embedded.zookeeper.start` property in _nifi.properties_ to `true` on those nodes @@ -2462,7 +2566,7 @@ with the list of ZooKeeper servers. The servers are specified as properties in t configured as :[:]. For example, `myhost:2888:3888`. This list of nodes should be the same nodes in the NiFi cluster that have the `nifi.state.management.embedded.zookeeper.start` property set to `true`. Also note that because ZooKeeper will be listening on these ports, the firewall may need to be configured to open these ports for incoming traffic, at least between nodes in the cluster. Additionally, the port to -listen on for client connections must be opened in the firewall. The default value for this is _2181_ but can be configured via the _clientPort_ property +listen on for client connections must be opened in the firewall. The default value for this is `2181` but can be configured via the _clientPort_ property in the _zookeeper.properties_ file. When using an embedded ZooKeeper, the ./__conf/zookeeper.properties__ file has a property named `dataDir`. By default, this value is set to `./state/zookeeper`. @@ -2529,13 +2633,13 @@ Kerberos client libraries be installed. 
This is accomplished in Fedora-based Lin [source] yum install krb5-workstation -Once this is complete, the /etc/krb5.conf will need to be configured appropriately for your organization’s Kerberos environment. +Once this is complete, the _/etc/krb5.conf_ will need to be configured appropriately for your organization’s Kerberos environment. [[zk_kerberos_server]] ==== Kerberizing Embedded ZooKeeper Server -The krb5.conf file on the systems with the embedded zookeeper servers should be identical to the one on the system where the krb5kdc service is running. +The _krb5.conf_ file on the systems with the embedded ZooKeeper servers should be identical to the one on the system where the krb5kdc service is running. When using the embedded ZooKeeper server, we may choose to secure the server by using Kerberos. All nodes configured to launch an embedded ZooKeeper and using Kerberos should follow these steps. @@ -2557,12 +2661,12 @@ kadmin: xst -k zookeeper-server.keytab zookeeper/myHost.example.com@EXAMPLE.COM This will create a file in the current directory named `zookeeper-server.keytab`. We can now copy that file into the `$NIFI_HOME/conf/` directory. We should ensure that only the user that will be running NiFi is allowed to read this file. -We will need to repeat the above steps for each of the instances of NiFi that will be running the embedded ZooKeeper server, being sure to replace _myHost.example.com_ with -__myHost2.example.com__, or whatever fully qualified hostname the ZooKeeper server will be run on. +We will need to repeat the above steps for each of the instances of NiFi that will be running the embedded ZooKeeper server, being sure to replace `myHost.example.com` with +`myHost2.example.com`, or whatever fully qualified hostname the ZooKeeper server will be run on.
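As a minimal sketch of the "only the user running NiFi may read this file" step, the keytab can be made owner-read-only with `chmod`. The helper name `restrict_keytab` and the `/opt/nifi` fallback path are assumptions for illustration and are not part of NiFi. The sketch also assumes the file is already owned by the account that runs NiFi.

```shell
#!/bin/sh
# Hypothetical helper: make a keytab readable by its owner only.
# Assumes the file is already owned by the user that runs NiFi.
restrict_keytab() {
  chmod 400 "$1"
}

# Assumption: NIFI_HOME is set; /opt/nifi is only an illustrative fallback.
keytab="${NIFI_HOME:-/opt/nifi}/conf/zookeeper-server.keytab"

# Only touch the file if it exists, so the sketch is safe to run anywhere.
if [ -f "$keytab" ]; then
  restrict_keytab "$keytab"
  ls -l "$keytab"
fi
```

The same helper can be applied on each host after its own keytab has been copied into place.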
Now that we have our KeyTab for each of the servers that will be running NiFi, we will need to configure NiFi’s embedded ZooKeeper server to use this configuration. ZooKeeper uses the Java Authentication and Authorization Service (JAAS), so we need to create a JAAS-compatible file. In the `$NIFI_HOME/conf/` directory, create a file -named `zookeeper-jaas.conf` (this file will already exist if the Client has already been configured to authenticate via Kerberos. That’s okay, just add to the file). +named _zookeeper-jaas.conf_ (this file will already exist if the Client has already been configured to authenticate via Kerberos. That’s okay, just add to the file). We will add the following snippet to this file: [source] @@ -2575,15 +2679,15 @@ Server { principal="zookeeper/myHost.example.com@EXAMPLE.COM"; }; -Be sure to replace the value of _principal_ above with the appropriate Principal, including the fully qualified domain name of the server. +Be sure to replace the value of `principal` above with the appropriate Principal, including the fully qualified domain name of the server. -Next, we need to tell NiFi to use this as our JAAS configuration. This is done by setting a JVM System Property, so we will edit the `conf/bootstrap.conf` file. +Next, we need to tell NiFi to use this as our JAAS configuration. This is done by setting a JVM System Property, so we will edit the _conf/bootstrap.conf_ file. If the Client has already been configured to use Kerberos, this is not necessary, as it was done above. Otherwise, we will add the following line to our _bootstrap.conf_ file: [source] java.arg.15=-Djava.security.auth.login.config=./conf/zookeeper-jaas.conf -Note: this additional line in the file doesn’t have to be number 15, it just has to be added to the bootstrap.conf file, use whatever number is appropriate for your configuration. +NOTE: This additional line in the file doesn’t have to be number 15, it just has to be added to the _bootstrap.conf_ file.
Use whatever number is appropriate for your configuration. We will want to initialize our Kerberos ticket by running the following command: @@ -2592,7 +2696,7 @@ kinit -kt zookeeper-server.keytab "zookeeper/myHost.example.com@EXAMPLE.COM" Again, be sure to replace the Principal with the appropriate value, including your realm and your fully qualified hostname. -Finally, we need to tell the Kerberos server to use the SASL Authentication Provider. To do this, we edit the `$NIFI_HOME/conf/zookeeper.properties` file and add the following +Finally, we need to tell the ZooKeeper server to use the SASL Authentication Provider. To do this, we edit the _$NIFI_HOME/conf/zookeeper.properties_ file and add the following lines: [source] @@ -2602,8 +2706,8 @@ kerberos.removeRealmFromPrincipal=true jaasLoginRenew=3600000 requireClientAuthScheme=sasl -The kerberos.removeHostFromPrincipal and the kerberos.removeRealmFromPrincipal properties are used to normalize the user principal name before comparing an identity to acls -applied on a Znode. By default the full principal is used however setting the removeHostFromPrincipal and removeRealmFromPrincipal kerberos properties to true will instruct +The `kerberos.removeHostFromPrincipal` and the `kerberos.removeRealmFromPrincipal` properties are used to normalize the user principal name before comparing an identity to ACLs +applied on a Znode. By default, the full principal is used; however, setting the `kerberos.removeHostFromPrincipal` and the `kerberos.removeRealmFromPrincipal` properties to `true` will instruct Zookeeper to remove the host and the realm from the logged in user's identity for comparison. In cases where NiFi nodes (within the same cluster) use principals that have different host(s)/realm(s) values, these kerberos properties can be configured to ensure that the nodes' identity will be normalized and that the nodes will have appropriate access to shared Znodes in Zookeeper.
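To make the effect of these two properties concrete, here is a minimal Python sketch of the normalization. It is an illustration only, not ZooKeeper's actual implementation: the function name `normalize_principal` and its parameters are hypothetical, and it assumes principals of the simple form `primary/host@REALM`.

```python
def normalize_principal(principal, remove_host=True, remove_realm=True):
    """Illustrative only: mimic the identity used for ACL comparison.

    Assumes a principal shaped like 'primary/host@REALM'; the host and
    realm parts are optional.
    """
    # Split off the realm (everything after the first '@').
    name, _, realm = principal.partition("@")
    # Split the remaining name into primary and host (instance) parts.
    primary, _, host = name.partition("/")

    result = primary
    if host and not remove_host:
        result += "/" + host
    if realm and not remove_realm:
        result += "@" + realm
    return result


if __name__ == "__main__":
    for node in ("nifi/node1.example.com@EXAMPLE.COM",
                 "nifi/node2.example.com@EXAMPLE.COM"):
        # With both flags set, every node maps to the same identity.
        print(node, "->", normalize_principal(node))
```

Under these assumptions, both example principals normalize to `nifi`, which is why nodes whose principals differ only in host or realm can still share access to the same Znodes.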
@@ -2616,7 +2720,7 @@ Now, we can start NiFi, and the embedded ZooKeeper server will use Kerberos as t [[zk_kerberos_client]] ==== Kerberizing NiFi's ZooKeeper Client -Note: The NiFi nodes running the embedded zookeeper server will also need to follow the below procedure since they will also be acting as a client at +NOTE: The NiFi nodes running the embedded zookeeper server will also need to follow the below procedure since they will also be acting as a client at the same time. The preferred mechanism for authenticating users with ZooKeeper is to use Kerberos. In order to use Kerberos to authenticate, we must configure a few @@ -2639,11 +2743,11 @@ kadmin: xst -k nifi.keytab nifi@EXAMPLE.COM This keytab file can be copied to the other NiFi nodes with embedded zookeeper servers. -This will create a file in the current directory named `nifi.keytab`. We can now copy that file into the _$NIFI_HOME/conf/_ directory. We should ensure +This will create a file in the current directory named _nifi.keytab_. We can now copy that file into the `$NIFI_HOME/conf/` directory. We should ensure that only the user that will be running NiFi is allowed to read this file. Next, we need to configure NiFi to use this KeyTab for authentication. Since ZooKeeper uses the Java Authentication and Authorization Service (JAAS), we need to -create a JAAS-compatible file. In the `$NIFI_HOME/conf/` directory, create a file named `zookeeper-jaas.conf` and add to it the following snippet: +create a JAAS-compatible file. In the `$NIFI_HOME/conf/` directory, create a file named _zookeeper-jaas.conf_ and add to it the following snippet: [source] Client { @@ -2662,15 +2766,15 @@ We add the following line anywhere in this file in order to tell the NiFi JVM to [source] java.arg.15=-Djava.security.auth.login.config=./conf/zookeeper-jaas.conf -Finally we need to update `nifi.properties` to ensure that NiFi knows to apply SASL specific ACLs for the Znodes it will create in Zookeeper for cluster management. 
-To enable this, in the `$NIFI_HOME/conf/nifi.properties` file and edit the following properties as shown below: +Finally, we need to update _nifi.properties_ to ensure that NiFi knows to apply SASL-specific ACLs for the Znodes it will create in ZooKeeper for cluster management. +To enable this, edit the following properties in the _$NIFI_HOME/conf/nifi.properties_ file as shown below: [source] nifi.zookeeper.auth.type=sasl nifi.zookeeper.kerberos.removeHostFromPrincipal=true nifi.zookeeper.kerberos.removeRealmFromPrincipal=true -Note: The kerberos.removeHostFromPrincipal and kerberos.removeRealmFromPrincipal should be consistent with what is set in Zookeeper configuration. +NOTE: The `kerberos.removeHostFromPrincipal` and `kerberos.removeRealmFromPrincipal` properties should be consistent with what is set in the ZooKeeper configuration. We can initialize our Kerberos ticket by running the following command: @@ -2688,7 +2792,7 @@ in the following locations: - _conf/zookeeper.properties_ file should use FQDN for `server.1`, `server.2`, ..., `server.N` values. - The `Connect String` property of the ZooKeeperStateProvider - - The /etc/hosts file should also resolve the FQDN to an IP address that is *not* _127.0.0.1_. + - The _/etc/hosts_ file should also resolve the FQDN to an IP address that is *not* `127.0.0.1`. Failure to do so may result in errors similar to the following: @@ -2758,53 +2862,54 @@ Before you begin, confirm that: + |==== |*Required Information*|*Description* -|Source ZooKeeper hostname (*sourceHostname*)|The hostname must be one of the hosts running in the ZooKeeper ensemble, which can be found in /conf/zookeeper.properties. Any of the hostnames declared in the *server.N* properties can be used. -|Destination ZooKeeper hostname (*destinationHostname*)|The hostname must be one of the hosts running in the ZooKeeper ensemble, which can be found in /conf/zookeeper.properties. Any of the hostnames declared in the *server.N* properties can be used.
-|Source ZooKeeper port (*sourceClientPort*)|This can be found in *zookeeper.properties* of the /conf/zookeeper.properties. The port is specified in the *clientPort* property. -|Destination ZooKeeper port (*destinationClientPort*)|This can be found in *zookeeper.properties* of the /conf/zookeeper.properties. The port is specified in the *clientPort* property. +|Source ZooKeeper hostname (*sourceHostname*)|The hostname must be one of the hosts running in the ZooKeeper ensemble, which can be found in _/conf/zookeeper.properties_. Any of the hostnames declared in the `server.N` properties can be used. +|Destination ZooKeeper hostname (*destinationHostname*)|The hostname must be one of the hosts running in the ZooKeeper ensemble, which can be found in _/conf/zookeeper.properties_. Any of the hostnames declared in the `server.N` properties can be used. +|Source ZooKeeper port (*sourceClientPort*)|This can be found in _/conf/zookeeper.properties_. The port is specified in the `clientPort` property. +|Destination ZooKeeper port (*destinationClientPort*)|This can be found in _/conf/zookeeper.properties_. The port is specified in the `clientPort` property. |Export data path|Determine the path that will store a json file containing the export of data from ZooKeeper. It must be readable and writable by the user running the zk-migrator tool. -|Source ZooKeeper Authentication Information|This information is in /conf/state-management.xml. For NiFi 0.x, if Creator Only is specified in state-management.xml, you need to supply authentication information using the `-a,--auth` argument with the values from the Username and Password properties in state-management.xml. For NiFi 1.x, supply authentication information using the `-k,--krb-conf` argument. -+ -If the state-management.xml specifies Open, no authentication is required. -|Destination ZooKeeper Authentication Information|This information is in /conf/state-management.xml. 
For NiFi 0.x, if Creator Only is specified in state-management.xml, you need to supply authentication information using the `-a,--auth` argument with the values from the Username and Password properties in state-management.xml. For NiFi 1.x, supply authentication information using the `-k,--krb-conf` argument. -+ -If the state-management.xml specifies Open, no authentication is required. -|Root path to which NiFi writes data in Source ZooKeeper (*sourceRootPath*)|This information can be found in /conf/state-management.xml under the Root Node property in the cluster-provider element. (default: /nifi) -|Root path to which NiFi writes data in Destination ZooKeeper (*destinationRootPath*)|This information can be found in /conf/state-management.xml under the Root Node property in the cluster-provider element. +|Source ZooKeeper Authentication Information|This information is in _/conf/state-management.xml_. For NiFi 0.x, if Creator Only is specified in _state-management.xml_, you need to supply authentication information using the `-a,--auth` argument with the values from the Username and Password properties in _state-management.xml_. For NiFi 1.x, supply authentication information using the `-k,--krb-conf` argument. + +If the _state-management.xml_ specifies Open, no authentication is required. +|Destination ZooKeeper Authentication Information|This information is in _/conf/state-management.xml_. For NiFi 0.x, if Creator Only is specified in _state-management.xml_, you need to supply authentication information using the `-a,--auth` argument with the values from the Username and Password properties in state-management.xml. For NiFi 1.x, supply authentication information using the `-k,--krb-conf` argument. + +If the _state-management.xml_ specifies Open, no authentication is required. 
+|Root path to which NiFi writes data in Source ZooKeeper (*sourceRootPath*)|This information can be found in `/conf/state-management.xml` under the Root Node property in the cluster-provider element. (default: `/nifi`) +|Root path to which NiFi writes data in Destination ZooKeeper (*destinationRootPath*)|This information can be found in _/conf/state-management.xml_ under the Root Node property in the cluster-provider element. |==== 2. Stop all processors in the NiFi flow. If you are migrating between two NiFi installations, the flows on both must be stopped. 3. Export the NiFi component data from the source ZooKeeper. The following command reads from the specified ZooKeeper running on the given hostname:port, using the provided path to the data, and authenticates with ZooKeeper using the given username and password. The data read from ZooKeeper is written to the file provided. * For NiFi 0.x ** For an open ZooKeeper: -*** zk-migrator.sh -r -z *sourceHostname:sourceClientPort*/*sourceRootPath*/components -f /*path*/*to*/*export*/*zk-source-data.json* +*** `zk-migrator.sh -r -z sourceHostname:sourceClientPort/sourceRootPath/components -f /path/to/export/zk-source-data.json` ** For a ZooKeeper using username:password for authentication: -*** zk-migrator.sh -r -z *sourceHostname:sourceClientPort*/*sourceRootPath*/components -a -f /*path*/*to*/*export*/*zk-source-data.json* +*** `zk-migrator.sh -r -z sourceHostname:sourceClientPort/sourceRootPath/components -a -f /path/to/export/zk-source-data.json` + * For NiFi 1.x ** For an open ZooKeeper: -*** zk-migrator.sh -r -z *sourceHostname:sourceClientPort*/*sourceRootPath*/components -f /*path*/*to*/*export*/*zk-source-data.json* +*** `zk-migrator.sh -r -z sourceHostname:sourceClientPort/sourceRootPath/components -f /path/to/export/zk-source-data.json` ** For a ZooKeeper using Kerberos for authentication: -*** zk-migrator.sh -r -z *sourceHostname:sourceClientPort*/*sourceRootPath*/components -k 
/*path*/*to*/*jaasconfig*/*jaas-config.conf* -f /*path*/*to*/*export*/*zk-source-data.json* +*** `zk-migrator.sh -r -z sourceHostname:sourceClientPort/sourceRootPath/components -k /path/to/jaasconfig/jaas-config.conf -f /path/to/export/zk-source-data.json` -4. (Optional) If you have used the new NiFi installation to do any processing, you can also export its ZooKeeper data as a backup prior to performing the migration. +4. (Optional) If you have used the new NiFi installation to do any processing, you can also export its ZooKeeper data as a backup prior to performing the migration. * For an open ZooKeeper: -** zk-migrator.sh -r -z *destinationHostname:destinationClientPort*/*destinationRootPath*/components -f /*path*/*to*/*export*/*zk-destination-backup-data.json* +** `zk-migrator.sh -r -z destinationHostname:destinationClientPort/destinationRootPath/components -f /path/to/export/zk-destination-backup-data.json` * For a ZooKeeper using Kerberos for authentication: -** zk-migrator.sh -r -z *destinationHostname:destinationClientPort*/*destinationRootPath*/components -k /*path*/*to*/*jaasconfig*/*jaas-config.conf* -f /*path*/*to*/*export*/*zk-destination-backup-data.json* +** `zk-migrator.sh -r -z destinationHostname:destinationClientPort/destinationRootPath/components -k /path/to/jaasconfig/jaas-config.conf -f /path/to/export/zk-destination-backup-data.json` 5. Migrate the ZooKeeper data to the destination ZooKeeper. If the source and destination ZooKeepers are the same, the `--ignore-source` option can be added to the following examples. 
* For an open ZooKeeper: -** zk-migrator.sh -s -z *destinationHostname:destinationClientPort*/*destinationRootPath*/components -f /*path*/*to*/*export*/*zk-source-data.json* +** `zk-migrator.sh -s -z destinationHostname:destinationClientPort/destinationRootPath/components -f /path/to/export/zk-source-data.json` * For a ZooKeeper using Kerberos for authentication: -** zk-migrator.sh -s -z *destinationHostname:destinationClientPort*/*destinationRootPath*/components -k /*path*/*to*/*jaasconfig*/*jaas-config.conf* -f /*path*/*to*/*export*/*zk-source-data.json* +** `zk-migrator.sh -s -z destinationHostname:destinationClientPort/destinationRootPath/components -k /path/to/jaasconfig/jaas-config.conf -f /path/to/export/zk-source-data.json` 6. Once the migration has completed successfully, start the processors in the NiFi flow. Processing should continue from the point at which it was stopped when the NiFi flow was stopped. [[bootstrap_properties]] == Bootstrap Properties -The _bootstrap.conf_ file in the _conf_ directory allows users to configure settings for how NiFi should be started. +The _bootstrap.conf_ file in the `conf` directory allows users to configure settings for how NiFi should be started. This includes parameters, such as the size of the Java Heap, what Java command to run, and Java System Properties. Here, we will address the different properties that are made available in the file. Any changes to this file will @@ -2812,28 +2917,28 @@ take effect only after NiFi has been stopped and restarted. |==== |*Property*|*Description* -|java|Specifies the fully qualified java command to run. By default, it is simply `java` but could be changed to an absolute path or a reference an environment variable, such as `$JAVA_HOME/bin/java` -|run.as|The username to run NiFi as. For instance, if NiFi should be run as the 'nifi' user, setting this value to 'nifi' will cause the NiFi Process to be run as the 'nifi' user. 
+|`java`|Specifies the fully qualified java command to run. By default, it is simply `java` but could be changed to an absolute path or a reference to an environment variable, such as `$JAVA_HOME/bin/java` +|`run.as`|The username to run NiFi as. For instance, if NiFi should be run as the `nifi` user, setting this value to `nifi` will cause the NiFi Process to be run as the `nifi` user. This property is ignored on Windows. For Linux, the specified user may require sudo permissions. -|lib.dir|The _lib_ directory to use for NiFi. By default, this is set to `./lib` -|conf.dir|The _conf_ directory to use for NiFi. By default, this is set to `./conf` -|graceful.shutdown.seconds|When NiFi is instructed to shutdown, the Bootstrap will wait this number of seconds for the process to shutdown cleanly. At this amount of time, - if the service is still running, the Bootstrap will "kill" the process, or terminate it abruptly. -|java.arg.N|Any number of JVM arguments can be passed to the NiFi JVM when the process is started. These arguments are defined by adding properties to _bootstrap.conf_ that - begin with `java.arg.`. The rest of the property name is not relevant, other than to different property names, and will be ignored. The default includes +|`lib.dir`|The _lib_ directory to use for NiFi. By default, this is set to `./lib` +|`conf.dir`|The _conf_ directory to use for NiFi. By default, this is set to `./conf` +|`graceful.shutdown.seconds`|When NiFi is instructed to shut down, the Bootstrap will wait this number of seconds for the process to shut down cleanly. After this amount of time, + if the service is still running, the Bootstrap will `kill` the process, or terminate it abruptly. +|`java.arg.N`|Any number of JVM arguments can be passed to the NiFi JVM when the process is started. These arguments are defined by adding properties to _bootstrap.conf_ that + begin with `java.arg.`.
The rest of the property name is not relevant, other than to differentiate property names, and will be ignored. The default includes properties for minimum and maximum Java Heap size, the garbage collector to use, etc. -|notification.services.file|When NiFi is started, or stopped, or when the Bootstrap detects that NiFi has died, the Bootstrap is able to send notifications of these events +|`notification.services.file`|When NiFi is started, or stopped, or when the Bootstrap detects that NiFi has died, the Bootstrap is able to send notifications of these events to interested parties. This is configured by specifying an XML file that defines which notification services can be used. More about this file can be found in the <> section. -|notification.max.attempts|If a notification service is configured but is unable to perform its function, it will try again up to a maximum number of attempts. This property +|`notification.max.attempts`|If a notification service is configured but is unable to perform its function, it will try again up to a maximum number of attempts. This property configures what that maximum number of attempts is. The default value is `5`. -|nifi.start.notification.services|This property is a comma-separated list of Notification Service identifiers that correspond to the Notification Services +|`nifi.start.notification.services`|This property is a comma-separated list of Notification Service identifiers that correspond to the Notification Services defined in the `notification.services.file` property. The services with the specified identifiers will be used to notify their configured recipients whenever NiFi is started. 
-|nifi.stop.notification.services|This property is a comma-separated list of Notification Service identifiers that correspond to the Notification Services +|`nifi.stop.notification.services`|This property is a comma-separated list of Notification Service identifiers that correspond to the Notification Services defined in the `notification.services.file` property. The services with the specified identifiers will be used to notify their configured recipients whenever NiFi is stopped. -|nifi.died.notification.services|This property is a comma-separated list of Notification Service identifiers that correspond to the Notification Services +|`nifi.died.notification.services`|This property is a comma-separated list of Notification Service identifiers that correspond to the Notification Services defined in the `notification.services.file` property. The services with the specified identifiers will be used to notify their configured recipients if the bootstrap determines that NiFi has unexpectedly died. |==== @@ -2873,20 +2978,20 @@ It has the following properties available: |==== |*Property*|*Required*|*Description* -|SMTP Hostname|true|The hostname of the SMTP Server that is used to send Email Notifications -|SMTP Port|true|The Port used for SMTP communications -|SMTP Username|true|Username for the SMTP account -|SMTP Password||Password for the SMTP account -|SMTP Auth||Flag indicating whether authentication should be used -|SMTP TLS||Flag indicating whether TLS should be enabled -|SMTP Socket Factory||javax.net.ssl.SSLSocketFactory -|SMTP X-Mailer Header||X-Mailer used in the header of the outgoing email -|Content Type||Mime Type used to interpret the contents of the email, such as text/plain or text/html -|From|true|Specifies the Email address to use as the sender. 
Otherwise, a "friendly name" can be used as the From address, but the value +|`SMTP Hostname`|true|The hostname of the SMTP Server that is used to send Email Notifications +|`SMTP Port`|true|The Port used for SMTP communications +|`SMTP Username`|true|Username for the SMTP account +|`SMTP Password`||Password for the SMTP account +|`SMTP Auth`||Flag indicating whether authentication should be used +|`SMTP TLS`||Flag indicating whether TLS should be enabled +|`SMTP Socket Factory`||`javax.net.ssl.SSLSocketFactory` +|`SMTP X-Mailer Header`||X-Mailer used in the header of the outgoing email +|`Content Type`||Mime Type used to interpret the contents of the email, such as `text/plain` or `text/html` +|`From`|true|Specifies the Email address to use as the sender. Otherwise, a "friendly name" can be used as the From address, but the value must be enclosed in double-quotes. -|To||The recipients to include in the To-Line of the email -|CC||The recipients to include in the CC-Line of the email -|BCC||The recipients to include in the BCC-Line of the email +|`To`||The recipients to include in the To-Line of the email +|`CC`||The recipients to include in the CC-Line of the email +|`BCC`||The recipients to include in the BCC-Line of the email |==== @@ -2916,16 +3021,16 @@ It has the following properties available: |==== |*Property*|*Required*|*Description* -|URL|true|The URL to send the notification to. Expression language is supported. -|Connection timeout||Max wait time for connection to remote service. Expression language is supported. This defaults to 10s. -|Write timeout||Max wait time for remote service to read the request sent. Expression language is supported. This defaults to 10s. -|Truststore Filename||The fully-qualified filename of the Truststore -|Truststore Type||The Type of the Truststore. 
Either JKS or PKCS12 -|Truststore Password||The password for the Truststore -|Keystore Filename||The fully-qualified filename of the Keystore -|Keystore Type||The password for the Keystore -|Keystore Password||The password for the key. If this is not specified, but the Keystore Filename, Password, and Type are specified, then the Keystore Password will be assumed to be the same as the Key Password. -|SSL Protocol||The algorithm to use for this SSL context. This can either be "SSL" or "TLS". +|`URL`|true|The URL to send the notification to. Expression language is supported. +|`Connection timeout`||Max wait time for connection to remote service. Expression language is supported. This defaults to `10s`. +|`Write timeout`||Max wait time for remote service to read the request sent. Expression language is supported. This defaults to `10s`. +|`Truststore Filename`||The fully-qualified filename of the Truststore +|`Truststore Type`||The Type of the Truststore. Either `JKS` or `PKCS12` +|`Truststore Password`||The password for the Truststore +|`Keystore Filename`||The fully-qualified filename of the Keystore +|`Keystore Type`||The Type of the Keystore +|`Keystore Password`||The password for the key. If this is not specified, but the Keystore Filename, Password, and Type are specified, then the Keystore Password will be assumed to be the same as the Key Password. +|`SSL Protocol`||The algorithm to use for this SSL context. This can be either `SSL` or `TLS`. |==== In addition to the properties above, dynamic properties can be added. They will be added as headers to the HTTP request. Expression language is supported. @@ -2954,7 +3059,7 @@ A complete example of configuring the HTTP service could look like the following When running Apache NiFi behind a proxy there are a couple of key items to be aware of during deployment.
* NiFi is comprised of a number of web applications (web UI, web API, documentation, custom UIs, data viewers, etc), so the mapping needs to be configured for the *root path*. That way all context -paths are passed through accordingly. For instance, if only the `/nifi` context path was mapped, the custom UI for `UpdateAttribute` will not work, since it is available at `/update-attribute-ui-`. +paths are passed through accordingly. For instance, if only the `/nifi` context path was mapped, the custom UI for UpdateAttribute will not work, since it is available at `/update-attribute-ui-`. * NiFi's REST API will generate URIs for each component on the graph. Since requests are coming through a proxy, certain elements of the URIs being generated need to be overridden. Without overriding, the users will be able to view the dataflow on the canvas but will be unable to modify existing components. Requests will be attempting to call back directly to NiFi, not through the @@ -3009,12 +3114,12 @@ documentation of the proxy for guidance for your deployment environment and use ** By default, if NiFi is running securely it will only accept HTTP requests with a Host header matching the host[:port] that it is bound to. If NiFi is to accept requests directed to a different host[:port] the expected values need to be configured. This may be required when running behind a proxy or in a containerized environment. This is configured in a comma -separated list in _nifi.properties_ using the `nifi.web.proxy.host` property (e.g. localhost:18443, proxyhost:443). IPv6 addresses are accepted. Please refer to +separated list in _nifi.properties_ using the `nifi.web.proxy.host` property (e.g. `localhost:18443, proxyhost:443`). IPv6 addresses are accepted. Please refer to RFC 5952 Sections link:https://tools.ietf.org/html/rfc5952#section-4[4] and link:https://tools.ietf.org/html/rfc5952#section-6[6] for additional details. 
** NiFi will only accept HTTP requests with a X-ProxyContextPath or X-Forwarded-Context header if the value is whitelisted in the `nifi.web.proxy.context.path` property in _nifi.properties_. This property accepts a comma separated list of expected values. In the event an incoming request has an X-ProxyContextPath or X-Forwarded-Context header value that is not -present in the whitelist, the "An unexpected error has occurred" page will be shown and an error will be written to the nifi-app.log. +present in the whitelist, the "An unexpected error has occurred" page will be shown and an error will be written to the _nifi-app.log_. * Additional configurations at both proxy server and NiFi cluster are required to make NiFi Site-to-Site work behind reverse proxies. See <> for details. @@ -3030,8 +3135,8 @@ The following properties must be set in _nifi.properties_ to enable Kerberos ser |==== |*Property*|*Required*|*Description* -|Service Principal|true|The service principal used by NiFi to communicate with the KDC -|Keytab Location|true|The file path to the keytab containing the service principal +|`Service Principal`|true|The service principal used by NiFi to communicate with the KDC +|`Keytab Location`|true|The file path to the keytab containing the service principal |==== See <> for complete documentation. @@ -3073,7 +3178,7 @@ root@kdc:~# [[system_properties]] == System Properties -The _nifi.properties_ file in the _conf_ directory is the main configuration file for controlling how NiFi runs. This section provides an overview of the properties in this file and includes some notes on how to configure it in a way that will make upgrading easier. *After making changes to this file, restart NiFi in order +The _nifi.properties_ file in the `conf` directory is the main configuration file for controlling how NiFi runs. This section provides an overview of the properties in this file and includes some notes on how to configure it in a way that will make upgrading easier. 
*After making changes to this file, restart NiFi in order for the changes to take effect.* NOTE: The contents of this file are relatively stable but do change from time to time. It is always a good idea to @@ -3088,39 +3193,39 @@ The first section of the _nifi.properties_ file is for the Core Properties. Thes |=== |*Property*|*Description* -|nifi.flow.configuration.file*|The location of the flow configuration file (i.e., the file that contains what is currently displayed on the NiFi graph). The default value is `./conf/flow.xml.gz`. -|nifi.flow.configuration.archive.enabled*|Specifies whether NiFi creates a backup copy of the flow automatically when the flow is updated. The default value is `true`. -|nifi.flow.configuration.archive.dir*|The location of the archive directory where backup copies of the flow.xml are saved. The default value is `./conf/archive`. NiFi removes old archive files to limit disk usage based on archived file lifespan, total size, and number of files, as specified with `nifi.flow.configuration.archive.max.time`, `max.storage` and `max.count` properties respectively. If none of these limitation for archiving is specified, NiFi uses default conditions, that is `30 days` for max.time and `500 MB` for max.storage. + -This cleanup mechanism takes into account only automatically created archived flow.xml files. If there are other files or directories in this archive directory, NiFi will ignore them. Automatically created archives have filename with ISO 8601 format timestamp prefix followed by '_'. That is T+_. For example, `20160706T160719+0900_flow.xml.gz`. NiFi checks filenames when it cleans archive directory. If you would like to keep a particular archive in this directory without worrying about NiFi deleting it, you can do so by copying it with a different filename pattern. -|nifi.flow.configuration.archive.max.time*|The lifespan of archived flow.xml files. 
NiFi will delete expired archive files when it updates flow.xml if this property is specified. Expiration is determined based on current system time and the last modified timestamp of an archived flow.xml. If no archive limitation is specified in nifi.properties, NiFi removes archives older than `30 days`. -|nifi.flow.configuration.archive.max.storage*|The total data size allowed for the archived flow.xml files. NiFi will delete the oldest archive files until the total archived file size becomes less than this configuration value, if this property is specified. If no archive limitation is specified in nifi.properties, NiFi uses `500 MB` for this. -|nifi.flow.configuration.archive.max.count*|The number of archive files allowed. NiFi will delete the oldest archive files so that only N latest archives can be kept, if this property is specified. -|nifi.flowcontroller.autoResumeState|Indicates whether -upon restart- the components on the NiFi graph should return to their last state. The default value is `true`. -|nifi.flowcontroller.graceful.shutdown.period|Indicates the shutdown period. The default value is `10 secs`. -|nifi.flowservice.writedelay.interval|When many changes are made to the flow.xml, this property specifies how long to wait before writing out the changes, so as to batch the changes into a single write. The default value is `500 ms`. -|nifi.administrative.yield.duration|If a component allows an unexpected exception to escape, it is considered a bug. As a result, the framework will pause (or administratively yield) the component for this amount of time. This is done so that the component does not use up massive amounts of system resources, since it is known to have problems in the existing state. The default value is `30 secs`. -|nifi.bored.yield.duration|When a component has no work to do (i.e., is "bored"), this is the amount of time it will wait before checking to see if it has new data to work on. 
This way, it does not use up CPU resources by checking for new work too often. When setting this property, be aware that it could add extra latency for components that do not constantly have work to do, as once they go into this "bored" state, they will wait this amount of time before checking for more work. The default value is `10 ms`. -|nifi.queue.backpressure.count|When drawing a new connection between two components, this is the default value for that connection's back pressure object threshold. The default is `10000` and the value must be an integer. -|nifi.queue.backpressure.size|When drawing a new connection between two components, this is the default value for that connection's back pressure data size threshold. The default is `1 GB` and the value must be a data size including the unit of measure. -|nifi.authorizer.configuration.file*|This is the location of the file that specifies how authorizers are defined. The default value is `./conf/authorizers.xml`. -|nifi.login.identity.provider.configuration.file*|This is the location of the file that specifies how username/password authentication is performed. This file is +|`nifi.flow.configuration.file`*|The location of the flow configuration file (i.e., the file that contains what is currently displayed on the NiFi graph). The default value is `./conf/flow.xml.gz`. +|`nifi.flow.configuration.archive.enabled`*|Specifies whether NiFi creates a backup copy of the flow automatically when the flow is updated. The default value is `true`. +|`nifi.flow.configuration.archive.dir`*|The location of the archive directory where backup copies of the _flow.xml_ are saved. The default value is `./conf/archive`. NiFi removes old archive files to limit disk usage based on archived file lifespan, total size, and number of files, as specified with `nifi.flow.configuration.archive.max.time`, `max.storage` and `max.count` properties respectively. 
If none of these archive limitations is specified, NiFi uses default conditions, that is `30 days` for max.time and `500 MB` for max.storage. + +This cleanup mechanism takes into account only automatically created archived _flow.xml_ files. If there are other files or directories in this archive directory, NiFi will ignore them. Automatically created archives have a filename with an ISO 8601 format timestamp prefix followed by `<original-filename>`. That is `<year><month><day>T<hour><minute><second>+<timezone offset>_<original filename>`. For example, `20160706T160719+0900_flow.xml.gz`. NiFi checks filenames when it cleans the archive directory. If you would like to keep a particular archive in this directory without worrying about NiFi deleting it, you can do so by copying it with a different filename pattern. +|`nifi.flow.configuration.archive.max.time`*|The lifespan of archived _flow.xml_ files. NiFi will delete expired archive files when it updates _flow.xml_ if this property is specified. Expiration is determined based on current system time and the last modified timestamp of an archived _flow.xml_. If no archive limitation is specified in _nifi.properties_, NiFi removes archives older than `30 days`. +|`nifi.flow.configuration.archive.max.storage`*|The total data size allowed for the archived _flow.xml_ files. NiFi will delete the oldest archive files until the total archived file size becomes less than this configuration value, if this property is specified. If no archive limitation is specified in _nifi.properties_, NiFi uses `500 MB` for this. +|`nifi.flow.configuration.archive.max.count`*|The number of archive files allowed. NiFi will delete the oldest archive files so that only N latest archives can be kept, if this property is specified. +|`nifi.flowcontroller.autoResumeState`|Indicates whether -upon restart- the components on the NiFi graph should return to their last state. The default value is `true`. +|`nifi.flowcontroller.graceful.shutdown.period`|Indicates the shutdown period. The default value is `10 secs`.
+|`nifi.flowservice.writedelay.interval`|When many changes are made to the _flow.xml_, this property specifies how long to wait before writing out the changes, so as to batch the changes into a single write. The default value is `500 ms`. +|`nifi.administrative.yield.duration`|If a component allows an unexpected exception to escape, it is considered a bug. As a result, the framework will pause (or administratively yield) the component for this amount of time. This is done so that the component does not use up massive amounts of system resources, since it is known to have problems in the existing state. The default value is `30 secs`. +|`nifi.bored.yield.duration`|When a component has no work to do (i.e., is "bored"), this is the amount of time it will wait before checking to see if it has new data to work on. This way, it does not use up CPU resources by checking for new work too often. When setting this property, be aware that it could add extra latency for components that do not constantly have work to do, as once they go into this "bored" state, they will wait this amount of time before checking for more work. The default value is `10 ms`. +|`nifi.queue.backpressure.count`|When drawing a new connection between two components, this is the default value for that connection's back pressure object threshold. The default is `10000` and the value must be an integer. +|`nifi.queue.backpressure.size`|When drawing a new connection between two components, this is the default value for that connection's back pressure data size threshold. The default is `1 GB` and the value must be a data size including the unit of measure. +|`nifi.authorizer.configuration.file`*|This is the location of the file that specifies how authorizers are defined. The default value is `./conf/authorizers.xml`. +|`nifi.login.identity.provider.configuration.file`*|This is the location of the file that specifies how username/password authentication is performed. 
This file is only considered if `nifi.security.user.login.identity.provider` is configured with a provider identifier. The default value is `./conf/login-identity-providers.xml`. -|nifi.templates.directory*|This is the location of the directory where flow templates are saved (for backward compatibility only). Templates are stored in the flow.xml.gz starting with NiFi 1.0. The template directory can be used to (bulk) import templates into the flow.xml.gz automatically on NiFi startup. The default value is `./conf/templates`. -|nifi.ui.banner.text|This is banner text that may be configured to display at the top of the User Interface. It is blank by default. -|nifi.ui.autorefresh.interval|The interval at which the User Interface auto-refreshes. The default value is `30 secs`. -|nifi.nar.library.directory|The location of the nar library. The default value is `./lib` and probably should be left as is. + +|`nifi.templates.directory`*|This is the location of the directory where flow templates are saved (for backward compatibility only). Templates are stored in the _flow.xml.gz_ starting with NiFi 1.0. The template directory can be used to (bulk) import templates into the _flow.xml.gz_ automatically on NiFi startup. The default value is `./conf/templates`. +|`nifi.ui.banner.text`|This is banner text that may be configured to display at the top of the User Interface. It is blank by default. +|`nifi.ui.autorefresh.interval`|The interval at which the User Interface auto-refreshes. The default value is `30 secs`. +|`nifi.nar.library.directory`|The location of the nar library. The default value is `./lib` and probably should be left as is. + + -*NOTE*: Additional library directories can be specified by using the *_nifi.nar.library.directory._* prefix with unique suffixes and separate paths as values. + +*NOTE*: Additional library directories can be specified by using the `nifi.nar.library.directory.` prefix with unique suffixes and separate paths as values. 
+ + For example, to provide two additional library locations, a user could also specify additional properties with keys of: + + -nifi.nar.library.directory.lib1=/nars/lib1 + -nifi.nar.library.directory.lib2=/nars/lib2 + +`nifi.nar.library.directory.lib1=/nars/lib1` + +`nifi.nar.library.directory.lib2=/nars/lib2` + + Providing three total locations, including `nifi.nar.library.directory`. -|nifi.nar.working.directory|The location of the nar working directory. The default value is `./work/nar` and probably should be left as is. -|nifi.documentation.working.directory|The documentation working directory. The default value is `./work/docs/components` and probably should be left as is. -|nifi.processor.scheduling.timeout|Time to wait for a Processor's life-cycle operation (@OnScheduled and @OnUnscheduled) to finish before other life-cycle operation (e.g., stop) could be invoked. The default value is `1 min`. +|`nifi.nar.working.directory`|The location of the nar working directory. The default value is `./work/nar` and probably should be left as is. +|`nifi.documentation.working.directory`|The documentation working directory. The default value is `./work/docs/components` and probably should be left as is. +|`nifi.processor.scheduling.timeout`|Time to wait for a Processor's life-cycle operation (`@OnScheduled` and `@OnUnscheduled`) to finish before other life-cycle operation (e.g., *stop*) could be invoked. The default value is `1 min`. |=== @@ -3131,11 +3236,11 @@ for components to persist state. See the <> section for more i |==== |*Property*|*Description* -|nifi.state.management.configuration.file|The XML file that contains configuration for the local and cluster-wide State Providers. The default value is `./conf/state-management.xml`. -|nifi.state.management.provider.local|The ID of the Local State Provider to use. This value must match the value of the `id` element of one of the `local-provider` elements in the _state-management.xml_ file. 
-|nifi.state.management.provider.cluster|The ID of the Cluster State Provider to use. This value must match the value of the `id` element of one of the `cluster-provider` elements in the _state-management.xml_ file. This value is ignored if not clustered but is required for nodes in a cluster. -|nifi.state.management.embedded.zookeeper.start|Specifies whether or not this instance of NiFi should start an embedded ZooKeeper Server. This is used in conjunction with the ZooKeeperStateProvider. -|nifi.state.management.embedded.zookeeper.properties|Specifies a properties file that contains the configuration for the embedded ZooKeeper Server that is started (if the `nifi.state.management.embedded.zookeeper.start` property is set to `true`) +|`nifi.state.management.configuration.file`|The XML file that contains configuration for the local and cluster-wide State Providers. The default value is `./conf/state-management.xml`. +|`nifi.state.management.provider.local`|The ID of the Local State Provider to use. This value must match the value of the `id` element of one of the `local-provider` elements in the _state-management.xml_ file. +|`nifi.state.management.provider.cluster`|The ID of the Cluster State Provider to use. This value must match the value of the `id` element of one of the `cluster-provider` elements in the _state-management.xml_ file. This value is ignored if not clustered but is required for nodes in a cluster. +|`nifi.state.management.embedded.zookeeper.start`|Specifies whether or not this instance of NiFi should start an embedded ZooKeeper Server. This is used in conjunction with the ZooKeeperStateProvider. 
+|`nifi.state.management.embedded.zookeeper.properties`|Specifies a properties file that contains the configuration for the embedded ZooKeeper Server that is started (if the `nifi.state.management.embedded.zookeeper.start` property is set to `true`) |==== @@ -3145,8 +3250,8 @@ The H2 Settings section defines the settings for the H2 database, which keeps tr |==== |*Property*|*Description* -|nifi.database.directory*|The location of the H2 database directory. The default value is `./database_repository`. -|nifi.h2.url.append|This property specifies additional arguments to add to the connection string for the H2 database. The default value should be used and should not be changed. It is: `;LOCK_TIMEOUT=25000;WRITE_DELAY=0;AUTO_SERVER=FALSE`. +|`nifi.database.directory`*|The location of the H2 database directory. The default value is `./database_repository`. +|`nifi.h2.url.append`|This property specifies additional arguments to add to the connection string for the H2 database. The default value should be used and should not be changed. It is: `;LOCK_TIMEOUT=25000;WRITE_DELAY=0;AUTO_SERVER=FALSE`. |==== @@ -3158,8 +3263,8 @@ to configure it on a separate drive if available. |==== |*Property*|*Description* -|nifi.flowfile.repository.implementation|The FlowFile Repository implementation. The default value is `org.apache.nifi.controller.repository.WriteAheadFlowFileRepository` and should only be changed with caution. To store flowfiles in memory instead of on disk (accepting data loss in the event of power/machine failure or a restart of NiFi), set this property to `org.apache.nifi.controller.repository.VolatileFlowFileRepository`. -|nifi.flowfile.repository.wal.implementation|If the repository implementation is configured to use the `WriteAheadFlowFileRepository`, this property can be used to specify which implementation of the +|`nifi.flowfile.repository.implementation`|The FlowFile Repository implementation. 
The default value is `org.apache.nifi.controller.repository.WriteAheadFlowFileRepository` and should only be changed with caution. To store flowfiles in memory instead of on disk (accepting data loss in the event of power/machine failure or a restart of NiFi), set this property to `org.apache.nifi.controller.repository.VolatileFlowFileRepository`. +|`nifi.flowfile.repository.wal.implementation`|If the repository implementation is configured to use the `WriteAheadFlowFileRepository`, this property can be used to specify which implementation of the Write-Ahead Log should be used. The default value is `org.apache.nifi.wali.SequentialAccessWriteAheadLog`. This version of the write-ahead log was added in version 1.6.0 of Apache NiFi and was developed in order to address an issue that exists in the older implementation. In the event of power loss or an operating system crash, the old implementation was susceptible to recovering FlowFiles incorrectly. This could potentially lead to the wrong attributes or content being assigned to a FlowFile upon restart, following the power loss or OS crash. However, one can still choose to opt into @@ -3167,10 +3272,10 @@ using the previous implementation and accept that risk, if desired (for example, To do so, set the value of this property to `org.wali.MinimalLockingWriteAheadLog`. If the value of this property is changed, upon restart, NiFi will still recover the records written using the previously configured repository and delete the files written by the previously configured implementation. -|nifi.flowfile.repository.directory*|The location of the FlowFile Repository. The default value is `./flowfile_repository`. -|nifi.flowfile.repository.partitions|The number of partitions. The default value is `256`. -|nifi.flowfile.repository.checkpoint.interval| The FlowFile Repository checkpoint interval. The default value is `2 mins`. 
-|nifi.flowfile.repository.always.sync|If set to `true`, any change to the repository will be synchronized to the disk, meaning that NiFi will ask the operating system not to cache the information. This is very expensive and can significantly reduce NiFi performance. However, if it is `false`, there could be the potential for data loss if either there is a sudden power loss or the operating system crashes. The default value is `false`. +|`nifi.flowfile.repository.directory`*|The location of the FlowFile Repository. The default value is `./flowfile_repository`. +|`nifi.flowfile.repository.partitions`|The number of partitions. The default value is `256`. +|`nifi.flowfile.repository.checkpoint.interval`| The FlowFile Repository checkpoint interval. The default value is `2 mins`. +|`nifi.flowfile.repository.always.sync`|If set to `true`, any change to the repository will be synchronized to the disk, meaning that NiFi will ask the operating system not to cache the information. This is very expensive and can significantly reduce NiFi performance. However, if it is `false`, there could be the potential for data loss if either there is a sudden power loss or the operating system crashes. The default value is `false`. |==== === Swap Management @@ -3182,12 +3287,12 @@ available again. These properties govern how that process occurs. |==== |*Property*|*Description* -|nifi.swap.manager.implementation|The Swap Manager implementation. The default value is `org.apache.nifi.controller.FileSystemSwapManager` and should not be changed. -|nifi.queue.swap.threshold|The queue threshold at which NiFi starts to swap FlowFile information to disk. The default value is `20000`. -|nifi.swap.in.period|The swap in period. The default value is `5 sec`. -|nifi.swap.in.threads|The number of threads to use for swapping in. The default value is `1`. -|nifi.swap.out.period|The swap out period. The default value is `5 sec`. -|nifi.swap.out.threads|The number of threads to use for swapping out. 
The default value is `4`. +|`nifi.swap.manager.implementation`|The Swap Manager implementation. The default value is `org.apache.nifi.controller.FileSystemSwapManager` and should not be changed. +|`nifi.queue.swap.threshold`|The queue threshold at which NiFi starts to swap FlowFile information to disk. The default value is `20000`. +|`nifi.swap.in.period`|The swap in period. The default value is `5 sec`. +|`nifi.swap.in.threads`|The number of threads to use for swapping in. The default value is `1`. +|`nifi.swap.out.period`|The swap out period. The default value is `5 sec`. +|`nifi.swap.out.threads`|The number of threads to use for swapping out. The default value is `4`. |==== === Content Repository @@ -3200,40 +3305,40 @@ FlowFile Repository, if also on that disk, could become corrupt. To avoid this s |==== |*Property*|*Description* -|nifi.content.repository.implementation|The Content Repository implementation. The default value is `org.apache.nifi.controller.repository.FileSystemRepository` and should only be changed with caution. To store flowfile content in memory instead of on disk (at the risk of data loss in the event of power/machine failure), set this property to `org.apache.nifi.controller.repository.VolatileContentRepository`. +|`nifi.content.repository.implementation`|The Content Repository implementation. The default value is `org.apache.nifi.controller.repository.FileSystemRepository` and should only be changed with caution. To store flowfile content in memory instead of on disk (at the risk of data loss in the event of power/machine failure), set this property to `org.apache.nifi.controller.repository.VolatileContentRepository`. |==== === File System Content Repository Properties |==== |*Property*|*Description* -|nifi.content.repository.implementation|The Content Repository implementation. The default value is `org.apache.nifi.controller.repository.FileSystemRepository` and should only be changed with caution. 
To store flowfile content in memory instead of on disk (at the risk of data loss in the event of power/machine failure), set this property to `org.apache.nifi.controller.repository.VolatileContentRepository`. -|nifi.content.claim.max.appendable.size|The maximum size for a content claim. The default value is `1 MB`. -|nifi.content.claim.max.flow.files|The maximum number of FlowFiles to assign to one content claim. The default value is `100`. -|nifi.content.repository.directory.default*|The location of the Content Repository. The default value is `./content_repository`. + +|`nifi.content.repository.implementation`|The Content Repository implementation. The default value is `org.apache.nifi.controller.repository.FileSystemRepository` and should only be changed with caution. To store flowfile content in memory instead of on disk (at the risk of data loss in the event of power/machine failure), set this property to `org.apache.nifi.controller.repository.VolatileContentRepository`. +|`nifi.content.claim.max.appendable.size`|The maximum size for a content claim. The default value is `1 MB`. +|`nifi.content.claim.max.flow.files`|The maximum number of FlowFiles to assign to one content claim. The default value is `100`. +|`nifi.content.repository.directory.default`*|The location of the Content Repository. The default value is `./content_repository`. + + -*NOTE*: Multiple content repositories can be specified by using the *_nifi.content.repository.directory._* prefix with unique suffixes and separate paths as values. + +*NOTE*: Multiple content repositories can be specified by using the `nifi.content.repository.directory.` prefix with unique suffixes and separate paths as values. 
+ + For example, to provide two additional locations to act as part of the content repository, a user could also specify additional properties with keys of: + + -nifi.content.repository.directory.content1=/repos/content1 + -nifi.content.repository.directory.content2=/repos/content2 + +`nifi.content.repository.directory.content1=/repos/content1` + +`nifi.content.repository.directory.content2=/repos/content2` + + Providing three total locations, including `nifi.content.repository.directory.default`. -|nifi.content.repository.archive.max.retention.period|If archiving is enabled (see nifi.content.repository.archive.enabled below), then +|`nifi.content.repository.archive.max.retention.period`|If archiving is enabled (see `nifi.content.repository.archive.enabled` below), then this property specifies the maximum amount of time to keep the archived data. The default value is `12 hours`. -|nifi.content.repository.archive.max.usage.percentage|If archiving is enabled (see nifi.content.repository.archive.enabled below), then this property must have a value that indicates the content repository disk usage percentage at which archived data begins to be removed. If the archive is empty and content repository disk usage is above this percentage, then archiving is temporarily disabled. Archiving will resume when disk usage is below this percentage. The default value is `50%`. -|nifi.content.repository.archive.enabled|To enable content archiving, set this to _true_ and specify a value for the `nifi.content.repository.archive.max.usage.percentage` property above. Content archiving enables the provenance UI to view or replay content that is no longer in a dataflow queue. By default, archiving is enabled. -|nifi.content.repository.always.sync|If set to `true`, any change to the repository will be synchronized to the disk, meaning that NiFi will ask the operating system not to cache the information. This is very expensive and can significantly reduce NiFi performance. 
However, if it is `false`, there could be the potential for data loss if either there is a sudden power loss or the operating system crashes. The default value is `false`. -|nifi.content.viewer.url|The URL for a web-based content viewer if one is available. It is blank by default. +|`nifi.content.repository.archive.max.usage.percentage`|If archiving is enabled (see `nifi.content.repository.archive.enabled` below), then this property must have a value that indicates the content repository disk usage percentage at which archived data begins to be removed. If the archive is empty and content repository disk usage is above this percentage, then archiving is temporarily disabled. Archiving will resume when disk usage is below this percentage. The default value is `50%`. +|`nifi.content.repository.archive.enabled`|To enable content archiving, set this to `true` and specify a value for the `nifi.content.repository.archive.max.usage.percentage` property above. Content archiving enables the provenance UI to view or replay content that is no longer in a dataflow queue. By default, archiving is enabled. +|`nifi.content.repository.always.sync`|If set to `true`, any change to the repository will be synchronized to the disk, meaning that NiFi will ask the operating system not to cache the information. This is very expensive and can significantly reduce NiFi performance. However, if it is `false`, there could be the potential for data loss if either there is a sudden power loss or the operating system crashes. The default value is `false`. +|`nifi.content.viewer.url`|The URL for a web-based content viewer if one is available. It is blank by default. |==== === Volatile Content Repository Properties |==== |*Property*|*Description* -|nifi.volatile.content.repository.max.size|The Content Repository maximum size in memory. The default value is `100 MB`. -|nifi.volatile.content.repository.block.size|The Content Repository block size. The default value is `32 KB`. 
+|`nifi.volatile.content.repository.max.size`|The Content Repository maximum size in memory. The default value is `100 MB`. +|`nifi.volatile.content.repository.block.size`|The Content Repository block size. The default value is `32 KB`. |==== === Provenance Repository @@ -3242,8 +3347,8 @@ The Provenance Repository contains the information related to Data Provenance. T |==== |*Property*|*Description* -|nifi.provenance.repository.implementation|The Provenance Repository implementation. The default value is `org.apache.nifi.provenance.PersistentProvenanceRepository`. -Two additional repositories are available as well. +|`nifi.provenance.repository.implementation`|The Provenance Repository implementation. The default value is `org.apache.nifi.provenance.PersistentProvenanceRepository`. +Three additional repositories are available as well. To store provenance events in memory instead of on disk (in which case all events will be lost on restart, and events will be evicted in a first-in-first-out order), set this property to `org.apache.nifi.provenance.VolatileProvenanceRepository`. This leaves a configurable number of Provenance Events in the Java heap, so the number of events that can be retained is very limited. @@ -3256,7 +3361,7 @@ redesigns. When used in a NiFi instance that is responsible for processing large The `WriteAheadProvenanceRepository` was then written to provide the same capabilities as the `PersistentProvenanceRepository` while providing far better performance. Changing to the `WriteAheadProvenanceRepository` is easy to accomplish, as the two repositories support most of the same properties. -*Note Well*, however, the following caveat: The `WriteAheadProvenanceRepository` will make use of the Provenance data stored by the `PersistentProvenanceRepository`. However, the +*NOTE:* The `WriteAheadProvenanceRepository` will make use of the Provenance data stored by the `PersistentProvenanceRepository`. 
However, the `PersistentProvenanceRepository` may not be able to read the data written by the `WriteAheadProvenanceRepository`. Therefore, once the Provenance Repository is changed to use the `WriteAheadProvenanceRepository`, it cannot be changed back to the `PersistentProvenanceRepository` without deleting the data in the Provenance Repository. It is therefore recommended that before changing the implementation, users ensure that their version of NiFi is stable, in case any issue arises that causes the user to need to roll back to @@ -3268,95 +3373,95 @@ at this time. |==== |*Property*|*Description* -|nifi.provenance.repository.directory.default*|The location of the Provenance Repository. The default value is `./provenance_repository`. + +|`nifi.provenance.repository.directory.default`*|The location of the Provenance Repository. The default value is `./provenance_repository`. + + -*NOTE*: Multiple provenance repositories can be specified by using the *_nifi.provenance.repository.directory._* prefix with unique suffixes and separate paths as values. + +*NOTE*: Multiple provenance repositories can be specified by using the `nifi.provenance.repository.directory.` prefix with unique suffixes and separate paths as values. + + For example, to provide two additional locations to act as part of the provenance repository, a user could also specify additional properties with keys of: + + -nifi.provenance.repository.directory.provenance1=/repos/provenance1 + -nifi.provenance.repository.directory.provenance2=/repos/provenance2 + +`nifi.provenance.repository.directory.provenance1=/repos/provenance1` + +`nifi.provenance.repository.directory.provenance2=/repos/provenance2` + + Providing three total locations, including `nifi.provenance.repository.directory.default`. -|nifi.provenance.repository.max.storage.time|The maximum amount of time to keep data provenance information. The default value is `24 hours`. 
-|nifi.provenance.repository.max.storage.size|The maximum amount of data provenance information to store at a time. The default value is `1 GB`. -|nifi.provenance.repository.rollover.time|The amount of time to wait before rolling over the latest data provenance information so that it is available in the User Interface. The default value is `30 secs`. -|nifi.provenance.repository.rollover.size|The amount of information to roll over at a time. The default value is `100 MB`. -|nifi.provenance.repository.query.threads|The number of threads to use for Provenance Repository queries. The default value is `2`. -|nifi.provenance.repository.index.threads|The number of threads to use for indexing Provenance events so that they are searchable. The default value is `2`. +|`nifi.provenance.repository.max.storage.time`|The maximum amount of time to keep data provenance information. The default value is `24 hours`. +|`nifi.provenance.repository.max.storage.size`|The maximum amount of data provenance information to store at a time. The default value is `1 GB`. +|`nifi.provenance.repository.rollover.time`|The amount of time to wait before rolling over the latest data provenance information so that it is available in the User Interface. The default value is `30 secs`. +|`nifi.provenance.repository.rollover.size`|The amount of information to roll over at a time. The default value is `100 MB`. +|`nifi.provenance.repository.query.threads`|The number of threads to use for Provenance Repository queries. The default value is `2`. +|`nifi.provenance.repository.index.threads`|The number of threads to use for indexing Provenance events so that they are searchable. The default value is `2`. For flows that operate on a very high number of FlowFiles, the indexing of Provenance events could become a bottleneck. If this is the case, a bulletin will appear, indicating that "The rate of the dataflow is exceeding the provenance recording rate. Slowing down flow to accommodate." 
If this happens, increasing the value of this property may increase the rate at which the Provenance Repository is able to process these records, resulting in better overall throughput. -|nifi.provenance.repository.compress.on.rollover|Indicates whether to compress the provenance information when rolling it over. The default value is `true`. -|nifi.provenance.repository.always.sync|If set to `true`, any change to the repository will be synchronized to the disk, meaning that NiFi will ask the operating system not to cache the information. This is very expensive and can significantly reduce NiFi performance. However, if it is `false`, there could be the potential for data loss if either there is a sudden power loss or the operating system crashes. The default value is `false`. -|nifi.provenance.repository.journal.count|The number of journal files that should be used to serialize Provenance Event data. Increasing this value will allow more tasks to simultaneously update the repository but will result in more expensive merging of the journal files later. This value should ideally be equal to the number of threads that are expected to update the repository simultaneously, but 16 tends to work well in must environments. The default value is `16`. -|nifi.provenance.repository.indexed.fields|This is a comma-separated list of the fields that should be indexed and made searchable. Fields that are not indexed will not be searchable. Valid fields are: `EventType, FlowFileUUID, Filename, TransitURI, ProcessorID, AlternateIdentifierURI, Relationship, Details`. The default value is: `EventType, FlowFileUUID, Filename, ProcessorID`. -|nifi.provenance.repository.indexed.attributes|This is a comma-separated list of FlowFile Attributes that should be indexed and made searchable. It is blank by default. But some good examples to consider are 'filename', 'uuid', and 'mime.type' as well as any custom attritubes you might use which are valuable for your use case. 
-|nifi.provenance.repository.index.shard.size|Large values for the shard size will result in more Java heap usage when searching the Provenance Repository but should provide better performance. The default value is `500 MB`. -|nifi.provenance.repository.max.attribute.length|Indicates the maximum length that a FlowFile attribute can be when retrieving a Provenance Event from the repository. If the length of any attribute exceeds this value, it will be truncated when the event is retrieved. The default value is `65536`. +|`nifi.provenance.repository.compress.on.rollover`|Indicates whether to compress the provenance information when rolling it over. The default value is `true`. +|`nifi.provenance.repository.always.sync`|If set to `true`, any change to the repository will be synchronized to the disk, meaning that NiFi will ask the operating system not to cache the information. This is very expensive and can significantly reduce NiFi performance. However, if it is `false`, there could be the potential for data loss if either there is a sudden power loss or the operating system crashes. The default value is `false`. +|`nifi.provenance.repository.journal.count`|The number of journal files that should be used to serialize Provenance Event data. Increasing this value will allow more tasks to simultaneously update the repository but will result in more expensive merging of the journal files later. This value should ideally be equal to the number of threads that are expected to update the repository simultaneously, but 16 tends to work well in most environments. The default value is `16`. +|`nifi.provenance.repository.indexed.fields`|This is a comma-separated list of the fields that should be indexed and made searchable. Fields that are not indexed will not be searchable. Valid fields are: `EventType`, `FlowFileUUID`, `Filename`, `TransitURI`, `ProcessorID`, `AlternateIdentifierURI`, `Relationship`, `Details`.
The default value is: `EventType, FlowFileUUID, Filename, ProcessorID`. +|`nifi.provenance.repository.indexed.attributes`|This is a comma-separated list of FlowFile Attributes that should be indexed and made searchable. It is blank by default. But some good examples to consider are `filename`, `uuid`, and `mime.type` as well as any custom attributes you might use which are valuable for your use case. +|`nifi.provenance.repository.index.shard.size`|Large values for the shard size will result in more Java heap usage when searching the Provenance Repository but should provide better performance. The default value is `500 MB`. +|`nifi.provenance.repository.max.attribute.length`|Indicates the maximum length that a FlowFile attribute can be when retrieving a Provenance Event from the repository. If the length of any attribute exceeds this value, it will be truncated when the event is retrieved. The default value is `65536`. |==== === Volatile Provenance Repository Properties |==== |*Property*|*Description* -|nifi.provenance.repository.buffer.size|The Provenance Repository buffer size. The default value is `100000`. +|`nifi.provenance.repository.buffer.size`|The Provenance Repository buffer size. The default value is `100000` provenance events. |==== === Write Ahead Provenance Repository Properties |==== |*Property*|*Description* -|nifi.provenance.repository.directory.default*|The location of the Provenance Repository. The default value is `./provenance_repository`. + +|`nifi.provenance.repository.directory.default`*|The location of the Provenance Repository. The default value is `./provenance_repository`. + + - *NOTE*: Multiple provenance repositories can be specified by using the *_nifi.provenance.repository.directory._* prefix with unique suffixes and separate paths as values. + + *NOTE*: Multiple provenance repositories can be specified by using the `nifi.provenance.repository.directory.` prefix with unique suffixes and separate paths as values.
+ + For example, to provide two additional locations to act as part of the provenance repository, a user could also specify additional properties with keys of: + + - nifi.provenance.repository.directory.provenance1=/repos/provenance1 + - nifi.provenance.repository.directory.provenance2=/repos/provenance2 + + `nifi.provenance.repository.directory.provenance1=/repos/provenance1` + + `nifi.provenance.repository.directory.provenance2=/repos/provenance2` + + Providing three total locations, including `nifi.provenance.repository.directory.default`. -|nifi.provenance.repository.max.storage.time|The maximum amount of time to keep data provenance information. The default value is `24 hours`. -|nifi.provenance.repository.max.storage.size|The maximum amount of data provenance information to store at a time. +|`nifi.provenance.repository.max.storage.time`|The maximum amount of time to keep data provenance information. The default value is `24 hours`. +|`nifi.provenance.repository.max.storage.size`|The maximum amount of data provenance information to store at a time. The default value is `1 GB`. The Data Provenance capability can consume a great deal of storage space because so much data is kept. For production environments, values of 1-2 TB or more is not uncommon. The repository will write to a single "event file" (or set of "event files" if multiple storage locations are defined, as described above) for some period of time (defined by the - nifi.provenance.repository.rollover.time and nifi.provenance.repository.rollover.size properties). Data is always aged off one file at a time, + `nifi.provenance.repository.rollover.time` and `nifi.provenance.repository.rollover.size` properties). Data is always aged off one file at a time, so it is not advisable to write to a single "event file" for a tremendous amount of time, as it will prevent old data from aging off as smoothly. 
-|nifi.provenance.repository.rollover.time|The amount of time to wait before rolling over the "event file" that the repository is writing to. -|nifi.provenance.repository.rollover.size|The amount of data to write to a single "event file." The default value is `100 MB`. For production - environments where a very large amount of Data Provenance is generated, a value of 1 GB is also very reasonable. -|nifi.provenance.repository.query.threads|The number of threads to use for Provenance Repository queries. The default value is `2`. -|nifi.provenance.repository.index.threads|The number of threads to use for indexing Provenance events so that they are searchable. The default value is `2`. +|`nifi.provenance.repository.rollover.time`|The amount of time to wait before rolling over the "event file" that the repository is writing to. +|`nifi.provenance.repository.rollover.size`|The amount of data to write to a single "event file." The default value is `100 MB`. For production + environments where a very large amount of Data Provenance is generated, a value of `1 GB` is also very reasonable. +|`nifi.provenance.repository.query.threads`|The number of threads to use for Provenance Repository queries. The default value is `2`. +|`nifi.provenance.repository.index.threads`|The number of threads to use for indexing Provenance events so that they are searchable. The default value is `2`. For flows that operate on a very high number of FlowFiles, the indexing of Provenance events could become a bottleneck. If this happens, increasing the value of this property may increase the rate at which the Provenance Repository is able to process these records, resulting in better overall throughput. It is advisable to use at least 1 thread per storage location (i.e., if there are 3 storage locations, at least 3 threads should be used). For high throughput environments, where more CPU and disk I/O is available, it may make sense to increase this value significantly. 
Typically going beyond 2-4 threads per storage location is not valuable. However, this can be tuned depending on the CPU resources available compared to the I/O resources. -|nifi.provenance.repository.compress.on.rollover|Indicates whether to compress the provenance information when an "event file" is rolled over. The default value is `true`. -|nifi.provenance.repository.always.sync|If set to `true`, any change to the repository will be synchronized to the disk, meaning that NiFi will ask the operating system +|`nifi.provenance.repository.compress.on.rollover`|Indicates whether to compress the provenance information when an "event file" is rolled over. The default value is `true`. +|`nifi.provenance.repository.always.sync`|If set to `true`, any change to the repository will be synchronized to the disk, meaning that NiFi will ask the operating system not to cache the information. This is very expensive and can significantly reduce NiFi performance. However, if it is `false`, there could be the potential for data loss if either there is a sudden power loss or the operating system crashes. The default value is `false`. -|nifi.provenance.repository.indexed.fields|This is a comma-separated list of the fields that should be indexed and made searchable. - Fields that are not indexed will not be searchable. Valid fields are: `EventType, FlowFileUUID, Filename, TransitURI, ProcessorID, - AlternateIdentifierURI, Relationship, Details`. The default value is: `EventType, FlowFileUUID, Filename, ProcessorID`. -|nifi.provenance.repository.indexed.attributes|This is a comma-separated list of FlowFile Attributes that should be indexed and made searchable. It is blank by default. - But some good examples to consider are 'filename' and 'mime.type' as well as any custom attributes you might use which are valuable for your use case. -|nifi.provenance.repository.index.shard.size|The repository uses Apache Lucene to performing indexing and searching capabilities. 
This value indicates how large a Lucene Index should +|`nifi.provenance.repository.indexed.fields`|This is a comma-separated list of the fields that should be indexed and made searchable. + Fields that are not indexed will not be searchable. Valid fields are: `EventType`, `FlowFileUUID`, `Filename`, `TransitURI`, `ProcessorID`, + `AlternateIdentifierURI`, `Relationship`, `Details`. The default value is: `EventType, FlowFileUUID, Filename, ProcessorID`. +|`nifi.provenance.repository.indexed.attributes`|This is a comma-separated list of FlowFile Attributes that should be indexed and made searchable. It is blank by default. + But some good examples to consider are `filename` and `mime.type` as well as any custom attributes you might use which are valuable for your use case. +|`nifi.provenance.repository.index.shard.size`|The repository uses Apache Lucene to provide indexing and searching capabilities. This value indicates how large a Lucene Index should become before the Repository starts writing to a new Index. Large values for the shard size will result in more Java heap usage when searching the Provenance Repository but should provide better performance. The default value is `500 MB`. However, this is due to the fact that defaults are tuned for very small environments where most users begin to use NiFi. - For production environments, it is advisable to change this value to *4 to 8 GB*. Once all Provenance Events in the index have been aged off from the "event files," the index + For production environments, it is advisable to change this value to `4` to `8 GB`. Once all Provenance Events in the index have been aged off from the "event files," the index will be destroyed as well. -|nifi.provenance.repository.max.attribute.length|Indicates the maximum length that a FlowFile attribute can be when retrieving a Provenance Event from the repository.
+|`nifi.provenance.repository.max.attribute.length`|Indicates the maximum length that a FlowFile attribute can be when retrieving a Provenance Event from the repository. If the length of any attribute exceeds this value, it will be truncated when the event is retrieved. The default value is `65536`. -|nifi.provenance.repository.concurrent.merge.threads|Apache Lucene creates several "segments" in an Index. These segments are periodically merged together in order to provide faster +|`nifi.provenance.repository.concurrent.merge.threads`|Apache Lucene creates several "segments" in an Index. These segments are periodically merged together in order to provide faster querying. This property specifies the maximum number of threads that are allowed to be used for *each* of the storage directories. The default value is `2`. For high throughput environments, it is advisable to set the number of index threads larger than the number of merge threads * the number of storage locations. For example, if there are 2 storage - locations and the number of index threads is set to 8, then the number of merge threads should likely be less than 4. While it is not critical that this be done, setting the + locations and the number of index threads is set to `8`, then the number of merge threads should likely be less than `4`. While it is not critical that this be done, setting the number of merge threads larger than this can result in all index threads being used to merge, which would cause the NiFi flow to periodically pause while indexing is happening, resulting in some data being processed with much higher latency than other data. 
-|nifi.provenance.repository.warm.cache.frequency|Each time that a Provenance query is run, the query must first search the Apache Lucene indices (at least, in most cases - there are +|`nifi.provenance.repository.warm.cache.frequency`|Each time that a Provenance query is run, the query must first search the Apache Lucene indices (at least, in most cases - there are some queries that are run often and the results are cached to avoid searching the Lucene indices). When a Lucene index is opened for the first time, it can be very expensive and take several seconds. This is compounded by having many different indices, and can result in a Provenance query taking much longer. After the index has been opened, the Operating System's disk cache will typically hold onto enough data to make re-opening the index much faster - at least for a period of time, until the disk cache evicts this data. If this value is set, @@ -3373,12 +3478,12 @@ All of the properties defined above (see <> tool in NiFi Toolkit. - |nifi.provenance.repository.encryption.key.id.*|Allows for additional keys to be specified for the `StaticKeyProvider`. For example, the line `nifi.provenance.repository.encryption.key.id.Key2=012...210` would provide an available key `Key2`. +|`nifi.provenance.repository.debug.frequency`|Controls the number of events processed between DEBUG statements documenting the performance metrics of the repository. This value is only used when DEBUG level statements are enabled in the log configuration. + |`nifi.provenance.repository.encryption.key.provider.implementation`|This is the fully-qualified class name of the **key provider**. A key provider is the datastore interface for accessing the encryption key to protect the provenance events. There are currently two implementations -- `StaticKeyProvider` which reads a key directly from _nifi.properties_, and `FileBasedKeyProvider` which reads *n* many keys from an encrypted file. 
The interface is extensible, and HSM-backed or other providers are expected in the future. + |`nifi.provenance.repository.encryption.key.provider.location`|The path to the key definition resource (empty for `StaticKeyProvider`, `./keys.nkp` or similar path for `FileBasedKeyProvider`). For future providers like an HSM, this may be a connection string or URL. + |`nifi.provenance.repository.encryption.key.id`|The active key ID to use for encryption (e.g. `Key1`). + |`nifi.provenance.repository.encryption.key`|The key to use for `StaticKeyProvider`. The key format is hex-encoded (`0123456789ABCDEFFEDCBA98765432100123456789ABCDEFFEDCBA9876543210`) but can also be encrypted using the <> tool in NiFi Toolkit. + |`nifi.provenance.repository.encryption.key.id.`*|Allows for additional keys to be specified for the `StaticKeyProvider`. For example, the line `nifi.provenance.repository.encryption.key.id.Key2=012...210` would provide an available key `Key2`. |==== The simplest configuration is below: @@ -3398,17 +3503,17 @@ nifi.provenance.repository.encryption.key=0123456789ABCDEFFEDCBA9876543210012345 The Component Status Repository contains the information for the Component Status History tool in the User Interface. These properties govern how that tool works. -The buffer.size and snapshot.frequency work together to determine the amount of historical data to retain. As an example to +The `buffer.size` and `snapshot.frequency` work together to determine the amount of historical data to retain. As an example to configure two days worth of historical data with a data point snapshot occurring every 5 minutes you would configure -snapshot.frequency to be "5 mins" and the buffer.size to be "576". To further explain this example for every 60 minutes there +`snapshot.frequency` to be "5 mins" and the `buffer.size` to be "576". To further explain this example for every 60 minutes there are 12 (60 / 5) snapshot windows for that time period.
To keep that data for 48 hours (12 * 48) you end up with a buffer size of 576. |==== |*Property*|*Description* -|nifi.components.status.repository.implementation|The Component Status Repository implementation. The default value is `org.apache.nifi.controller.status.history.VolatileComponentStatusRepository` and should not be changed. -|nifi.components.status.repository.buffer.size|Specifies the buffer size for the Component Status Repository. The default value is `1440`. -|nifi.components.status.snapshot.frequency|This value indicates how often to present a snapshot of the components' status history. The default value is `1 min`. +|`nifi.components.status.repository.implementation`|The Component Status Repository implementation. The default value is `org.apache.nifi.controller.status.history.VolatileComponentStatusRepository` and should not be changed. +|`nifi.components.status.repository.buffer.size`|Specifies the buffer size for the Component Status Repository. The default value is `1440`. +|`nifi.components.status.snapshot.frequency`|This value indicates how often to present a snapshot of the components' status history. The default value is `1 min`. |==== @@ -3416,18 +3521,18 @@ of 576. === Site to Site Properties These properties govern how this instance of NiFi communicates with remote instances of NiFi when Remote Process Groups are configured in the dataflow. -Remote Process Groups can choose transport protocol from RAW and HTTP. Properties named with _nifi.remote.input.socket.*_ are RAW transport protocol specific. Similarly, _nifi.remote.input.http.*_ are HTTP transport protocol specific properties. +Remote Process Groups can choose transport protocol from RAW and HTTP. Properties named with `nifi.remote.input.socket.$$*$$` are RAW transport protocol specific. Similarly, `nifi.remote.input.http.$$*$$` are HTTP transport protocol specific properties. 
|==== |*Property*|*Description* -|nifi.remote.input.host|The host name that will be given out to clients to connect to this NiFi instance for Site-to-Site communication. By default, it is the value from InetAddress.getLocalHost().getHostName(). On UNIX-like operating systems, this is typically the output from the `hostname` command. -|nifi.remote.input.secure|This indicates whether communication between this instance of NiFi and remote NiFi instances should be secure. By default, it is set to `false`. In order for secure site-to-site to work, set the property to `true`. Many other Security Properties (below) must also be configured. -|nifi.remote.input.socket.port|The remote input socket port for Site-to-Site communication. By default, it is blank, but it must have a value in order to use RAW socket as transport protocol for Site-to-Site. -|nifi.remote.input.http.enabled|Specifies whether HTTP Site-to-Site should be enabled on this host. By default, it is set to `true`. + +|`nifi.remote.input.host`|The host name that will be given out to clients to connect to this NiFi instance for Site-to-Site communication. By default, it is the value from `InetAddress.getLocalHost().getHostName()`. On UNIX-like operating systems, this is typically the output from the `hostname` command. +|`nifi.remote.input.secure`|This indicates whether communication between this instance of NiFi and remote NiFi instances should be secure. By default, it is set to `false`. In order for secure site-to-site to work, set the property to `true`. Many other <> must also be configured. +|`nifi.remote.input.socket.port`|The remote input socket port for Site-to-Site communication. By default, it is blank, but it must have a value in order to use RAW socket as transport protocol for Site-to-Site. +|`nifi.remote.input.http.enabled`|Specifies whether HTTP Site-to-Site should be enabled on this host. By default, it is set to `true`. 
+ Whether a Site-to-Site client uses HTTP or HTTPS is determined by `nifi.remote.input.secure`. If it is set to `true`, then requests are sent as HTTPS to `nifi.web.https.port`. If set to `false`, HTTP requests are sent to `nifi.web.http.port`. -|nifi.remote.input.http.transaction.ttl|Specifies how long a transaction can stay alive on the server. By default, it is set to `30 secs`. + +|`nifi.remote.input.http.transaction.ttl`|Specifies how long a transaction can stay alive on the server. By default, it is set to `30 secs`. + If a Site-to-Site client hasn’t proceeded to the next action after this period of time, the transaction is discarded from the remote NiFi instance. For example, when a client creates a transaction but doesn’t send or receive flow files, or when a client sends or receives flow files but doesn’t confirm that transaction. -|nifi.remote.contents.cache.expiration|Specifies how long NiFi should cache information about a remote NiFi instance when communicating via Site-to-Site. By default, NiFi will cache the + +|`nifi.remote.contents.cache.expiration`|Specifies how long NiFi should cache information about a remote NiFi instance when communicating via Site-to-Site. By default, NiFi will cache the + responses from the remote system for `30 secs`. This allows NiFi to avoid constantly making HTTP requests to the remote system, which is particularly important when this instance of NiFi + has many instances of Remote Process Groups. |==== @@ -3435,41 +3540,41 @@ has many instances of Remote Process Groups. [[site_to_site_reverse_proxy_properties]] === Site to Site Routing Properties for Reverse Proxies -Site-to-Site requires peer-to-peer communication between a client and a remote NiFi node. E.g. if a remote NiFi cluster has 3 nodes, nifi0, nifi1 and nifi2, then a client requests have to be reachable to each of those remote node. +Site-to-Site requires peer-to-peer communication between a client and a remote NiFi node. E.g. 
if a remote NiFi cluster has 3 nodes (`nifi0`, `nifi1` and `nifi2`) then client requests have to be reachable to each of those remote nodes. If a NiFi cluster is planned to receive/transfer data from/to Site-to-Site clients over the internet or a company firewall, a reverse proxy server can be deployed in front of the NiFi cluster nodes as a gateway to route client requests to upstream NiFi nodes, to reduce number of servers and ports those have to be exposed. In such environment, the same NiFi cluster would also be expected to be accessed by Site-to-Site clients within the same network. Sending FlowFiles to itself for load distribution among NiFi cluster nodes can be a typical example. In this case, client requests should be routed directly to a node without going through the reverse proxy. -In order to support such deployments, remote NiFi clusters need to expose its Site-to-Site endpoints dynamically based on client request contexts. Following properties configure how peers should be exposed to clients. A routing definition consists of 4 properties, 'when', 'hostname', 'port', and 'secure', grouped by 'protocol' and 'name'. Multiple routing definitions can be configured. 'protocol' represents Site-to-Site transport protocol, i.e. raw or http. +In order to support such deployments, remote NiFi clusters need to expose their Site-to-Site endpoints dynamically based on client request contexts. The following properties configure how peers should be exposed to clients. A routing definition consists of 4 properties, `when`, `hostname`, `port`, and `secure`, grouped by `protocol` and `name`. Multiple routing definitions can be configured. `protocol` represents Site-to-Site transport protocol, i.e. `RAW` or `HTTP`. |==== |*Property*|*Description* -|nifi.remote.route.{protocol}.{name}.when|Boolean value, 'true' or 'false'. Controls whether the routing definition for this name should be used.
-|nifi.remote.route.{protocol}.{name}.hostname|Specify hostname that will be introduced to Site-to-Site clients for further communications. -|nifi.remote.route.{protocol}.{name}.port|Specify port number that will be introduced to Site-to-Site clients for further communications. -|nifi.remote.route.{protocol}.{name}.secure|Boolean value, 'true' or 'false'. Specify whether the remote peer should be accessed via secure protocol. Defaults to 'false'. +|`nifi.remote.route.{protocol}.{name}.when`|Boolean value, `true` or `false`. Controls whether the routing definition for this name should be used. +|`nifi.remote.route.{protocol}.{name}.hostname`|Specify hostname that will be introduced to Site-to-Site clients for further communications. +|`nifi.remote.route.{protocol}.{name}.port`|Specify port number that will be introduced to Site-to-Site clients for further communications. +|`nifi.remote.route.{protocol}.{name}.secure`|Boolean value, `true` or `false`. Specify whether the remote peer should be accessed via secure protocol. Defaults to `false`. |==== All of above routing properties can use NiFi Expression Language to compute target peer description from request context. Available variables are: |=== |*Variable name*|*Description* -|s2s.{source\|target}.hostname|Hostname of the source where the request came from, and the original target. -|s2s.{source\|target}.port|Same as above, for ports. Source port may not be useful as it is just a client side TCP port. -|s2s.{source\|target}.secure|Same as above, for secure or not. -|s2s.protocol|The name of Site-to-Site protocol being used, RAW or HTTP. -|s2s.request|The name of current request type, SiteToSiteDetail or Peers. See Site-to-Site protocol sequence below for detail. -|HTTP request headers|HTTP request header values can be referred by its name. +|`s2s.{source\|target}.hostname`|Hostname of the source where the request came from, and the original target. +|`s2s.{source\|target}.port`|Same as above, for ports. 
Source port may not be useful as it is just a client side TCP port. +|`s2s.{source\|target}.secure`|Same as above, for secure or not. +|`s2s.protocol`|The name of Site-to-Site protocol being used, `RAW` or `HTTP`. +|`s2s.request`|The name of current request type, `SiteToSiteDetail` or `Peers`. See Site-to-Site protocol sequence below for detail. +|`HTTP request headers`|HTTP request header values can be referred by its name. |=== ==== Site to Site protocol sequence Configuring these properties correctly would require some understandings on Site-to-Site protocol sequence. -1. A client initiates Site-to-Site protocol by sending a HTTP(S) request to the specified remote URL to get remote cluster Site-to-Site information. Specifically, to '/nifi-api/site-to-site'. This request is called 'SiteToSiteDetail'. +1. A client initiates Site-to-Site protocol by sending a HTTP(S) request to the specified remote URL to get remote cluster Site-to-Site information. Specifically, to '_/nifi-api/site-to-site_'. This request is called `SiteToSiteDetail`. 2. A remote NiFi node responds with its input and output ports, and TCP port numbers for RAW and TCP transport protocols. -3. The client sends another request to get remote peers using the TCP port number returned at #2. From this request, raw socket communication is used for RAW transport protocol, while HTTP keeps using HTTP(S). This request is called 'Peers'. +3. The client sends another request to get remote peers using the TCP port number returned at #2. From this request, raw socket communication is used for RAW transport protocol, while HTTP keeps using HTTP(S). This request is called `Peers`. 4. A remote NiFi node responds with list of available remote peers containing hostname, port, secure and workload such as the number of queued FlowFiles. From this point, further communication is done between the client and the remote NiFi node. 5. The client decides which peer to transfer data from/to, based on workload information. 6. 
The client sends a request to create a transaction to a remote NiFi node. @@ -3486,15 +3591,15 @@ Setting correct HTTP headers at reverse proxies are crucial for NiFi to work cor There are two types of requests-to-NiFi-node mapping techniques those can be applied at reverse proxy servers. One is 'Server name to Node' and the other is 'Port number to Node'. -With 'Server name to Node', the same port can be used to route requests to different upstream NiFi nodes based on the requested server name (e.g. nifi0.example.com, nifi1.example.com). Host name resolution should be configured to map different host names to the same reverse proxy address, that can be done by adding /etc/hosts file or DNS server entries. Also, if clients to reverse proxy uses HTTPS, reverse proxy server certificate should have wildcard common name or SAN to be accessed by different host names. +With 'Server name to Node', the same port can be used to route requests to different upstream NiFi nodes based on the requested server name (e.g. `nifi0.example.com`, `nifi1.example.com`). Host name resolution should be configured to map different host names to the same reverse proxy address, that can be done by adding /etc/hosts file or DNS server entries. Also, if clients to reverse proxy uses HTTPS, reverse proxy server certificate should have wildcard common name or SAN to be accessed by different host names. Some reverse proxy technologies do not support server name routing rules, in such case, use 'Port number to Node' technique. 'Port number to Node' mapping requires N open port at a reverse proxy for a NiFi cluster consists of N nodes. -Refer following examples for actual configurations. +Refer to the following examples for actual configurations. ==== Site to Site and Reverse Proxy Examples -Here are some example reverse proxy and NiFi setups to illustrate how configuration files look like. +Here are some example reverse proxy and NiFi setups to illustrate what configuration files look like. 
Client1 in the following diagrams represents a client that does not have direct access to NiFi nodes, and it accesses through the reverse proxy, while Client2 has direct access. @@ -3504,14 +3609,14 @@ In this example, Nginx is used as a reverse proxy. image:s2s-rproxy-servername.svg["Server name to Node mapping"] -1. Client1 initiates Site-to-Site protocol, the request is routed to one of upstream NiFi nodes. The NiFi node computes Site-to-Site port for RAW. By the routing rule 'example1' in nifi.properties shown below, port 10443 is returned. -2. Client1 asks peers to 'nifi.example.com:10443', the request is routed to 'nifi0:8081'. The NiFi node computes available peers, by 'example1' routing rule, 'nifi0:8081' is converted to 'nifi0.example.com:10443', so are nifi1 and nifi2. As a result, 'nifi0.example.com:10443', 'nifi1.example.com:10443' and 'nifi2.example.com:10443' are returned. -3. Client1 decides to use 'nifi2.example.com:10443' for further communication. -4. On the other hand, Client2 has two URIs for Site-to-Site bootstrap URIs, and initiates the protocol using one of them. The 'example1' routing does not match this for this request, and port 8081 is returned. -5. Client2 asks peers from 'nifi1:8081'. The 'example1' does not match, so the original 'nifi0:8081', 'nifi1:8081' and 'nifi2:8081' are returned as they are. -6. Client2 decides to use 'nifi2:8081' for further communication. +1. Client1 initiates Site-to-Site protocol, the request is routed to one of upstream NiFi nodes. The NiFi node computes Site-to-Site port for RAW. By the routing rule *example1* in _nifi.properties_ shown below, port 10443 is returned. +2. Client1 asks peers to `nifi.example.com:10443`, the request is routed to `nifi0:8081`. The NiFi node computes available peers, by *example1* routing rule, `nifi0:8081` is converted to `nifi0.example.com:10443`, so are `nifi1` and `nifi2`. 
As a result, `nifi0.example.com:10443`, `nifi1.example.com:10443` and `nifi2.example.com:10443` are returned. +3. Client1 decides to use `nifi2.example.com:10443` for further communication. +4. On the other hand, Client2 has two URIs for Site-to-Site bootstrap URIs, and initiates the protocol using one of them. The *example1* routing does not match this for this request, and port 8081 is returned. +5. Client2 asks peers from `nifi1:8081`. The *example1* does not match, so the original `nifi0:8081`, `nifi1:8081` and `nifi2:8081` are returned as they are. +6. Client2 decides to use `nifi2:8081` for further communication. -Routing rule 'example1' is defined in nifi.properties (all node has the same routing configuration): +Routing rule *example1* defined in _nifi.properties_ (all nodes have the same routing configuration): .... # S2S Routing for RAW, using server name to node nifi.remote.route.raw.example1.when=\ @@ -3524,7 +3629,7 @@ nifi.remote.route.raw.example1.secure=true .... -nginx.conf +_nginx.conf_ : .... http { @@ -3580,9 +3685,9 @@ stream { image:s2s-rproxy-portnumber.svg["Port number to Node mapping"] -The 'example2' routing maps original host names (nifi0, 1 and 2) to different proxy ports (10443, 10444 and 10445) using 'equals and 'ifElse' expressions. +The *example2* routing maps original host names (`nifi0`, `nifi1` and `nifi2`) to different proxy ports (`10443`, `10444` and `10445`) using `equals` and `ifElse` expressions. -nifi.properties (all node has the same routing configuration) +Routing rule *example2* defined in _nifi.properties_ (all nodes have the same routing configuration): .... # S2S Routing for RAW, using port number to node nifi.remote.route.raw.example2.when=\ @@ -3598,7 +3703,7 @@ ${s2s.target.hostname:equals('nifi2'):ifElse('10445',\ nifi.remote.route.raw.example2.secure=true .... -nginx.conf +_nginx.conf_ : .... http { # Same as example 1. 
@@ -3634,7 +3739,7 @@ stream { image:s2s-rproxy-http.svg["Server name to Node mapping"] -nifi.properties (all node has the same routing configuration) +Routing rule *example3* defined in _nifi.properties_ (all nodes have the same routing configuration): .... # S2S Routing for HTTP nifi.remote.route.http.example3.when=${X-ProxyHost:contains('.example.com')} @@ -3643,7 +3748,7 @@ nifi.remote.route.http.example3.port=443 nifi.remote.route.http.example3.secure=true .... -nginx.conf +_nginx.conf_ : .... http { upstream nifi_cluster { @@ -3691,67 +3796,68 @@ These properties pertain to the web-based User Interface. |==== |*Property*|*Description* -|nifi.web.war.directory|This is the location of the web war directory. The default value is `./lib`. -|nifi.web.http.host|The HTTP host. It is blank by default. -|nifi.web.http.port|The HTTP port. The default value is `8080`. -|nifi.web.http.port.forwarding|The port which forwards incoming HTTP requests to `nifi.web.http.host`. This property is designed to be used with 'port forwarding', when NiFi has to be started by a non-root user for better security, yet it needs to be accessed via low port to go through a firewall. For example, to expose NiFi via HTTP protocol on port 80, but actually listening on port 8080, you need to configure OS level port forwarding such as `iptables` (Linux/Unix) or `pfctl` (OS X) that redirects requests from 80 to 8080. Then set `nifi.web.http.port` as 8080, and `nifi.web.http.port.forwarding` as 80. It is blank by default. -|nifi.web.http.network.interface*|The name of the network interface to which NiFi should bind for HTTP requests. It is blank by default. + +|`nifi.web.war.directory`|This is the location of the web war directory. The default value is `./lib`. +|`nifi.web.http.host`|The HTTP host. It is blank by default. +|`nifi.web.http.port`|The HTTP port. The default value is `8080`. +|`nifi.web.http.port.forwarding`|The port which forwards incoming HTTP requests to `nifi.web.http.host`. 
This property is designed to be used with 'port forwarding', when NiFi has to be started by a non-root user for better security, yet it needs to be accessed via low port to go through a firewall. For example, to expose NiFi via HTTP protocol on port 80, but actually listening on port 8080, you need to configure OS level port forwarding such as `iptables` (Linux/Unix) or `pfctl` (OS X) that redirects requests from 80 to 8080. Then set `nifi.web.http.port` as 8080, and `nifi.web.http.port.forwarding` as 80. It is blank by default. +|`nifi.web.http.network.interface`*|The name of the network interface to which NiFi should bind for HTTP requests. It is blank by default. + + -*NOTE*: Multiple network interfaces can be specified by using the *_nifi.web.http.network.interface._* prefix with unique suffixes and separate network interface names as values. + +*NOTE*: Multiple network interfaces can be specified by using the `nifi.web.http.network.interface.` prefix with unique suffixes and separate network interface names as values. + + For example, to provide two additional network interfaces, a user could also specify additional properties with keys of: + + -nifi.web.http.network.interface.eth0=eth0 + -nifi.web.http.network.interface.eth1=eth1 + +`nifi.web.http.network.interface.eth0=eth0` + +`nifi.web.http.network.interface.eth1=eth1` + + Providing three total network interfaces, including `nifi.web.http.network.interface.default`. -|nifi.web.https.host|The HTTPS host. It is blank by default. -|nifi.web.https.port|The HTTPS port. It is blank by default. When configuring NiFi to run securely, this port should be configured. -|nifi.web.https.port.forwarding|Same as `nifi.web.http.port.forwarding`, but with HTTPS for secure communication. It is blank by default. -|nifi.web.https.network.interface*|The name of the network interface to which NiFi should bind for HTTPS requests. It is blank by default. + +|`nifi.web.https.host`|The HTTPS host. It is blank by default. 
+|`nifi.web.https.port`|The HTTPS port. It is blank by default. When configuring NiFi to run securely, this port should be configured. +|`nifi.web.https.port.forwarding`|Same as `nifi.web.http.port.forwarding`, but with HTTPS for secure communication. It is blank by default. +|`nifi.web.https.network.interface`*|The name of the network interface to which NiFi should bind for HTTPS requests. It is blank by default. + + -*NOTE*: Multiple network interfaces can be specified by using the *_nifi.web.https.network.interface._* prefix with unique suffixes and separate network interface names as values. + +*NOTE*: Multiple network interfaces can be specified by using the `nifi.web.https.network.interface.` prefix with unique suffixes and separate network interface names as values. + + For example, to provide two additional network interfaces, a user could also specify additional properties with keys of: + + -nifi.web.https.network.interface.eth0=eth0 + -nifi.web.https.network.interface.eth1=eth1 + +`nifi.web.https.network.interface.eth0=eth0` + +`nifi.web.https.network.interface.eth1=eth1` + + Providing three total network interfaces, including `nifi.web.https.network.interface.default`. -|nifi.web.jetty.working.directory|The location of the Jetty working directory. The default value is `./work/jetty`. -|nifi.web.jetty.threads|The number of Jetty threads. The default value is `200`. -|nifi.web.max.header.size|The maximum size allowed for request and response headers. The default value is 16 KB. -|nifi.web.proxy.host|A comma separated list of allowed HTTP Host header values to consider when NiFi is running securely and will be receiving requests to a different host[:port] than it is bound to. +|`nifi.web.jetty.working.directory`|The location of the Jetty working directory. The default value is `./work/jetty`. +|`nifi.web.jetty.threads`|The number of Jetty threads. The default value is `200`. 
+|`nifi.web.max.header.size`|The maximum size allowed for request and response headers. The default value is `16 KB`. +|`nifi.web.proxy.host`|A comma separated list of allowed HTTP Host header values to consider when NiFi is running securely and will be receiving requests to a different host[:port] than it is bound to. For example, when running in a Docker container or behind a proxy (e.g. localhost:18443, proxyhost:443). By default, this value is blank meaning NiFi should only allow requests sent to the host[:port] that NiFi is bound to. -|nifi.web.proxy.context.path|A comma separated list of allowed HTTP X-ProxyContextPath or X-Forwarded-Context header values to consider. By default, this value is +|`nifi.web.proxy.context.path`|A comma separated list of allowed HTTP X-ProxyContextPath or X-Forwarded-Context header values to consider. By default, this value is blank meaning all requests containing a proxy context path are rejected. Configuring this property would allow requests where the proxy path is contained in this listing. |==== +[[security_properties]] === Security Properties These properties pertain to various security features in NiFi. Many of these properties are covered in more detail in the -Security Configuration section of this Administrator's Guide. +<> section of this Administrator's Guide. |==== |*Property*|*Description* -|nifi.sensitive.props.key|This is the password used to encrypt any sensitive property values that are configured in processors. By default, it is blank, but the system administrator should provide a value for it. It can be a string of any length, although the recommended minimum length is 10 characters. Be aware that once this password is set and one or more sensitive processor properties have been configured, this password should not be changed. -|nifi.sensitive.props.algorithm|The algorithm used to encrypt sensitive properties. The default value is `PBEWITHMD5AND256BITAES-CBC-OPENSSL`. 
-|nifi.sensitive.props.provider|The sensitive property provider. The default value is `BC`. -|nifi.sensitive.props.additional.keys|The comma separated list of properties in `nifi.properties` to encrypt in addition to the default sensitive properties (see <>). -|nifi.security.keystore*|The full path and name of the keystore. It is blank by default. -|nifi.security.keystoreType|The keystore type. It is blank by default. -|nifi.security.keystorePasswd|The keystore password. It is blank by default. -|nifi.security.keyPasswd|The key password. It is blank by default. -|nifi.security.truststore*|The full path and name of the truststore. It is blank by default. -|nifi.security.truststoreType|The truststore type. It is blank by default. -|nifi.security.truststorePasswd|The truststore password. It is blank by default. -|nifi.security.needClientAuth|This indicates whether client authentication in the cluster protocol. It is blank by default. -|nifi.security.user.authorizer|Specifies which of the configured Authorizers in the authorizers.xml file to use. By default, it is set to `file-provider`. -|nifi.security.user.login.identity.provider|This indicates what type of login identity provider to use. The default value is blank, can be set to the identifier from a provider +|`nifi.sensitive.props.key`|This is the password used to encrypt any sensitive property values that are configured in processors. By default, it is blank, but the system administrator should provide a value for it. It can be a string of any length, although the recommended minimum length is 10 characters. Be aware that once this password is set and one or more sensitive processor properties have been configured, this password should not be changed. +|`nifi.sensitive.props.algorithm`|The algorithm used to encrypt sensitive properties. The default value is `PBEWITHMD5AND256BITAES-CBC-OPENSSL`. +|`nifi.sensitive.props.provider`|The sensitive property provider. The default value is `BC`. 
+|`nifi.sensitive.props.additional.keys`|The comma-separated list of properties in _nifi.properties_ to encrypt in addition to the default sensitive properties (see <>). +|`nifi.security.keystore`*|The full path and name of the keystore. It is blank by default. +|`nifi.security.keystoreType`|The keystore type. It is blank by default. +|`nifi.security.keystorePasswd`|The keystore password. It is blank by default. +|`nifi.security.keyPasswd`|The key password. It is blank by default. +|`nifi.security.truststore`*|The full path and name of the truststore. It is blank by default. +|`nifi.security.truststoreType`|The truststore type. It is blank by default. +|`nifi.security.truststorePasswd`|The truststore password. It is blank by default. +|`nifi.security.needClientAuth`|This indicates whether client authentication is required in the cluster protocol. It is blank by default. +|`nifi.security.user.authorizer`|Specifies which of the configured Authorizers in the _authorizers.xml_ file to use. By default, it is set to `file-provider`. +|`nifi.security.user.login.identity.provider`|This indicates what type of login identity provider to use. The default value is blank; it can be set to the identifier from a provider in the file specified in `nifi.login.identity.provider.configuration.file`. Setting this property will trigger NiFi to support username/password authentication. -|nifi.security.ocsp.responder.url|This is the URL for the Online Certificate Status Protocol (OCSP) responder if one is being used. It is blank by default. -|nifi.security.ocsp.responder.certificate|This is the location of the OCSP responder certificate if one is being used. It is blank by default. +|`nifi.security.ocsp.responder.url`|This is the URL for the Online Certificate Status Protocol (OCSP) responder if one is being used. It is blank by default. +|`nifi.security.ocsp.responder.certificate`|This is the location of the OCSP responder certificate if one is being used. It is blank by default.
|==== === Identity Mapping Properties @@ -3771,9 +3877,9 @@ nifi.security.identity.mapping.transform.kerb=NONE The last segment of each property is an identifier used to associate the pattern with the replacement value. When a user makes a request to NiFi, their identity is checked to see if it matches each of those patterns in lexicographical order. For the first one that matches, the replacement specified in the `nifi.security.identity.mapping.value.xxxx` property is used. So a login with `CN=localhost, OU=Apache NiFi, O=Apache, L=Santa Monica, ST=CA, C=US` matches the DN mapping pattern above and the DN mapping value `$1@$2` is applied. The user is normalized to `localhost@Apache NiFi`. -In addition to mapping a transform may be applied. The supported versions are NONE (no transform applied), LOWER (identity lowercased), and UPPER (identity uppercased). If not specified, the default value is NONE. +In addition to mapping, a transform may be applied. The supported versions are `NONE` (no transform applied), `LOWER` (identity lowercased), and `UPPER` (identity uppercased). If not specified, the default value is `NONE`. -NOTE: These mappings are also applied to the "Initial Admin Identity", "Cluster Node Identity", and any legacy users in the authorizers.xml file as well as users imported from LDAP (See <>). +NOTE: These mappings are also applied to the "Initial Admin Identity", "Cluster Node Identity", and any legacy users in the _authorizers.xml_ file as well as users imported from LDAP (See <>). Group names can also be mapped. The following example will accept the existing group name but will lowercase it. This may be helpful when used in conjunction with an external authorizer. @@ -3783,7 +3889,7 @@ nifi.security.group.mapping.value.anygroup=$1 nifi.security.group.mapping.transform.anygroup=LOWER ---- -NOTE: These mappings are applied to any legacy groups referenced in the authorizers.xml as well as groups imported from LDAP. 
+NOTE: These mappings are applied to any legacy groups referenced in the _authorizers.xml_ as well as groups imported from LDAP. === Cluster Common Properties @@ -3791,8 +3897,8 @@ When setting up a NiFi cluster, these properties should be configured the same w |==== |*Property*|*Description* -|nifi.cluster.protocol.heartbeat.interval|The interval at which nodes should emit heartbeats to the Cluster Coordinator. The default value is `5 sec`. -|nifi.cluster.protocol.is.secure|This indicates whether cluster communications are secure. The default value is `false`. +|`nifi.cluster.protocol.heartbeat.interval`|The interval at which nodes should emit heartbeats to the Cluster Coordinator. The default value is `5 sec`. +|`nifi.cluster.protocol.is.secure`|This indicates whether cluster communications are secure. The default value is `false`. |==== === Cluster Node Properties @@ -3801,23 +3907,23 @@ Configure these properties for cluster nodes. |==== |*Property*|*Description* -|nifi.cluster.is.node|Set this to `true` if the instance is a node in a cluster. The default value is `false`. -|nifi.cluster.node.address|The fully qualified address of the node. It is blank by default. -|nifi.cluster.node.protocol.port|The node's protocol port. It is blank by default. -|nifi.cluster.node.protocol.threads|The number of threads that should be used to communicate with other nodes +|`nifi.cluster.is.node`|Set this to `true` if the instance is a node in a cluster. The default value is `false`. +|`nifi.cluster.node.address`|The fully qualified address of the node. It is blank by default. +|`nifi.cluster.node.protocol.port`|The node's protocol port. It is blank by default. +|`nifi.cluster.node.protocol.threads`|The number of threads that should be used to communicate with other nodes in the cluster. This property defaults to `10`, but for large clusters, this value may need to be larger. 
-|nifi.cluster.node.protocol.max.threads|The maximum number of threads that should be used to communicate with other nodes in the cluster. This property defaults to `50`. -|nifi.cluster.node.event.history.size|When the state of a node in the cluster is changed, an event is generated +|`nifi.cluster.node.protocol.max.threads`|The maximum number of threads that should be used to communicate with other nodes in the cluster. This property defaults to `50`. +|`nifi.cluster.node.event.history.size`|When the state of a node in the cluster is changed, an event is generated and can be viewed in the Cluster page. This value indicates how many events to keep in memory for each node. The default value is `25`. -|nifi.cluster.node.connection.timeout|When connecting to another node in the cluster, specifies how long this node should wait before considering +|`nifi.cluster.node.connection.timeout`|When connecting to another node in the cluster, specifies how long this node should wait before considering the connection a failure. The default value is `5 secs`. -|nifi.cluster.node.read.timeout|When communicating with another node in the cluster, specifies how long this node should wait to receive information +|`nifi.cluster.node.read.timeout`|When communicating with another node in the cluster, specifies how long this node should wait to receive information from the remote node before considering the communication with the node a failure. The default value is `5 secs`. -|nifi.cluster.node.max.concurrent.requests|The maximum number of outstanding web requests that can be replicated to nodes in the cluster. If this number of requests is exceeded, the embedded Jetty server will return a "409: Conflict" response. This property defaults to `100`. -|nifi.cluster.firewall.file|The location of the node firewall file. 
This is a file that may be used to list all the nodes that are allowed to connect +|`nifi.cluster.node.max.concurrent.requests`|The maximum number of outstanding web requests that can be replicated to nodes in the cluster. If this number of requests is exceeded, the embedded Jetty server will return a "409: Conflict" response. This property defaults to `100`. +|`nifi.cluster.firewall.file`|The location of the node firewall file. This is a file that may be used to list all the nodes that are allowed to connect to the cluster. It provides an additional layer of security. This value is blank by default, meaning that no firewall file is to be used. -|nifi.cluster.flow.election.max.wait.time|Specifies the amount of time to wait before electing a Flow as the "correct" Flow. If the number of Nodes that have voted is equal to the number specified by the `nifi.cluster.flow.election.max.candidates` property, the cluster will not wait this long. The default value is `5 mins`. Note that the time starts as soon as the first vote is cast. -|nifi.cluster.flow.election.max.candidates|Specifies the number of Nodes required in the cluster to cause early election of Flows. This allows the Nodes in the cluster to avoid having to wait a long time before starting processing if we reach at least this number of nodes in the cluster. +|`nifi.cluster.flow.election.max.wait.time`|Specifies the amount of time to wait before electing a Flow as the "correct" Flow. If the number of Nodes that have voted is equal to the number specified by the `nifi.cluster.flow.election.max.candidates` property, the cluster will not wait this long. The default value is `5 mins`. Note that the time starts as soon as the first vote is cast. +|`nifi.cluster.flow.election.max.candidates`|Specifies the number of Nodes required in the cluster to cause early election of Flows. 
This allows the Nodes in the cluster to avoid having to wait a long time before starting processing if we reach at least this number of nodes in the cluster. |==== [[claim_management]] @@ -3837,7 +3943,7 @@ will time out after some period of time. These properties determines how these l |==== |*Property*|*Description* -|nifi.cluster.request.replication.claim.timeout|Specifies how long to wait before considering a lock 'expired' and automatically +|`nifi.cluster.request.replication.claim.timeout`|Specifies how long to wait before considering a lock 'expired' and automatically unlocking. |==== @@ -3850,12 +3956,12 @@ to join a cluster. |==== |*Property*|*Description* -|nifi.zookeeper.connect.string|The Connect String that is needed to connect to Apache ZooKeeper. This is a comma-separated list +|`nifi.zookeeper.connect.string`|The Connect String that is needed to connect to Apache ZooKeeper. This is a comma-separated list of hostname:port pairs. For example, `localhost:2181,localhost:2182,localhost:2183`. This should contain a list of all ZooKeeper instances in the ZooKeeper quorum. This property must be specified to join a cluster and has no default value. -|nifi.zookeeper.connect.timeout|How long to wait when connecting to ZooKeeper before considering the connection a failure. The default value is `3 secs`. -|nifi.zookeeper.session.timeout|How long to wait after losing a connection to ZooKeeper before the session is expired. The default value is `3 secs`. -|nifi.zookeeper.root.node|The root ZNode that should be used in ZooKeeper. ZooKeeper provides a directory-like structure +|`nifi.zookeeper.connect.timeout`|How long to wait when connecting to ZooKeeper before considering the connection a failure. The default value is `3 secs`. +|`nifi.zookeeper.session.timeout`|How long to wait after losing a connection to ZooKeeper before the session is expired. The default value is `3 secs`. +|`nifi.zookeeper.root.node`|The root ZNode that should be used in ZooKeeper. 
ZooKeeper provides a directory-like structure for storing data. Each 'directory' in this structure is referred to as a ZNode. This denotes the root ZNode, or 'directory', that should be used for storing data. The default value is `/root`. This is important to set correctly, as which cluster the NiFi instance attempts to join is determined by which ZooKeeper instance it connects to and the ZooKeeper Root Node @@ -3867,19 +3973,19 @@ that is specified. |==== |*Property*|*Description* -|nifi.kerberos.krb5.file*|The location of the krb5 file, if used. It is blank by default. At this time, only a single krb5 file is allowed to +|`nifi.kerberos.krb5.file`*|The location of the krb5 file, if used. It is blank by default. At this time, only a single krb5 file is allowed to be specified per NiFi instance, so this property is configured here to support SPNEGO and service principals rather than in individual Processors. If necessary the krb5 file can support multiple realms. Example: `/etc/krb5.conf` -|nifi.kerberos.service.principal*|The name of the NiFi Kerberos service principal, if used. It is blank by default. Note that this property is for NiFi to authenticate as a client other systems. +|`nifi.kerberos.service.principal`*|The name of the NiFi Kerberos service principal, if used. It is blank by default. Note that this property is for NiFi to authenticate as a client to other systems. Example: `nifi/nifi.example.com` or `nifi/nifi.example.com@EXAMPLE.COM` -|nifi.kerberos.service.keytab.location*|The file path of the NiFi Kerberos keytab, if used. It is blank by default. Note that this property is for NiFi to authenticate as a client other systems. +|`nifi.kerberos.service.keytab.location`*|The file path of the NiFi Kerberos keytab, if used. It is blank by default. Note that this property is for NiFi to authenticate as a client to other systems. Example: `/etc/nifi.keytab` -|nifi.kerberos.spnego.principal*|The name of the NiFi Kerberos service principal, if used.
It is blank by default. Note that this property is used to authenticate NiFi users. +|`nifi.kerberos.spnego.principal`*|The name of the NiFi Kerberos service principal, if used. It is blank by default. Note that this property is used to authenticate NiFi users. Example: `HTTP/nifi.example.com` or `HTTP/nifi.example.com@EXAMPLE.COM` -|nifi.kerberos.spnego.keytab.location*|The file path of the NiFi Kerberos keytab, if used. It is blank by default. Note that this property is used to authenticate NiFi users. +|`nifi.kerberos.spnego.keytab.location`*|The file path of the NiFi Kerberos keytab, if used. It is blank by default. Note that this property is used to authenticate NiFi users. Example: `/etc/http-nifi.keytab` -|nifi.kerberos.spengo.authentication.expiration*|The expiration duration of a successful Kerberos user authentication, if used. The default value is `12 hours`. +|`nifi.kerberos.spnego.authentication.expiration`*|The expiration duration of a successful Kerberos user authentication, if used. The default value is `12 hours`. |==== [[custom_properties]] @@ -3894,7 +4000,7 @@ To configure custom properties for use with NiFi’s Expression Language: |==== |*Property*|*Description* -|nifi.variable.registry.properties|This is a comma-separated list of file location paths for one or more custom property files. +|`nifi.variable.registry.properties`|This is a comma-separated list of file location paths for one or more custom property files. |==== * Restart your NiFi instance(s) for the updates to be picked up.
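
As a rough illustration of the custom property steps above, the following sketch assumes a hypothetical custom property file at `./conf/custom.properties`. The file name and the `archive.dir` property are made up for this example and are not defaults shipped with NiFi.

In _nifi.properties_:

----
# Hypothetical path; adjust to wherever the custom property file actually lives
nifi.variable.registry.properties=./conf/custom.properties
----

In the hypothetical _custom.properties_ file:

----
# Illustrative custom property, not a built-in NiFi property
archive.dir=/data/archive
----

After a restart, a processor property that supports Expression Language could then reference the value as `${archive.dir}`.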