NIFI-4678 This closes #2321. change all the old-fashioned headings in docs to modern ones

Signed-off-by: joewitt <joewitt@apache.org>
edwarzjl 2017-12-06 17:03:22 +08:00 committed by joewitt
parent a12abc24e5
commit 68016ddbe8
8 changed files with 86 additions and 172 deletions


@ -14,13 +14,11 @@
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
NiFi System Administrator's Guide = NiFi System Administrator's Guide
=================================
Apache NiFi Team <dev@nifi.apache.org> Apache NiFi Team <dev@nifi.apache.org>
:homepage: http://nifi.apache.org :homepage: http://nifi.apache.org
System Requirements == System Requirements
-------------------
Apache NiFi can run on something as simple as a laptop, but it can also be clustered across many enterprise-class servers. Therefore, the amount of hardware and memory needed will depend on the size and nature of the dataflow involved. The data is stored on disk while NiFi is processing it. So NiFi needs to have sufficient disk space allocated for its various repositories, particularly the content repository, flowfile repository, and provenance repository (see the <<system_properties>> section for more information about these repositories). NiFi has the following minimum system requirements: Apache NiFi can run on something as simple as a laptop, but it can also be clustered across many enterprise-class servers. Therefore, the amount of hardware and memory needed will depend on the size and nature of the dataflow involved. The data is stored on disk while NiFi is processing it. So NiFi needs to have sufficient disk space allocated for its various repositories, particularly the content repository, flowfile repository, and provenance repository (see the <<system_properties>> section for more information about these repositories). NiFi has the following minimum system requirements:
* Requires Java 8 or newer * Requires Java 8 or newer
@ -37,8 +35,7 @@ Apache NiFi can run on something as simple as a laptop, but it can also be clust
**Note** Under sustained and extremely high throughput the CodeCache settings may need to be tuned to avoid sudden performance loss. See the <<bootstrap_properties>> section for more information. **Note** Under sustained and extremely high throughput the CodeCache settings may need to be tuned to avoid sudden performance loss. See the <<bootstrap_properties>> section for more information.
How to install and start NiFi == How to install and start NiFi
-----------------------------
* Linux/Unix/OS X * Linux/Unix/OS X
** Decompress and untar into desired installation directory ** Decompress and untar into desired installation directory
@ -76,8 +73,7 @@ When NiFi first starts up, the following files and directories are created:
See the <<system_properties>> section of this guide for more information about configuring NiFi repositories and configuration files. See the <<system_properties>> section of this guide for more information about configuring NiFi repositories and configuration files.
Configuration Best Practices == Configuration Best Practices
----------------------------
NOTE: If you are running on Linux, consider these best practices. Typical Linux defaults are not necessarily well tuned for the needs of an IO intensive application like NiFi. For all of these areas, your distribution's requirements may vary. Use these sections as advice, but NOTE: If you are running on Linux, consider these best practices. Typical Linux defaults are not necessarily well tuned for the needs of an IO intensive application like NiFi. For all of these areas, your distribution's requirements may vary. Use these sections as advice, but
consult your distribution-specific documentation for how best to achieve these recommendations. consult your distribution-specific documentation for how best to achieve these recommendations.
@ -127,8 +123,7 @@ Doing so can cause a surprising bump in throughput. Edit the '/etc/fstab' file
and for the partition(s) of interest add the 'noatime' option. and for the partition(s) of interest add the 'noatime' option.
Security Configuration == Security Configuration
----------------------
NiFi provides several different configuration options for security purposes. The most important properties are those under the NiFi provides several different configuration options for security purposes. The most important properties are those under the
"security properties" heading in the _nifi.properties_ file. In order to run securely, the following properties must be set: "security properties" heading in the _nifi.properties_ file. In order to run securely, the following properties must be set:
@ -164,8 +159,7 @@ Now that the User Interface has been secured, we can easily secure Site-to-Site
accomplished by setting the `nifi.remote.input.secure` and `nifi.cluster.protocol.is.secure` properties, respectively, to `true`. accomplished by setting the `nifi.remote.input.secure` and `nifi.cluster.protocol.is.secure` properties, respectively, to `true`.
TLS Generation Toolkit === TLS Generation Toolkit
~~~~~~~~~~~~~~~~~~~~~~
In order to facilitate the secure setup of NiFi, you can use the `tls-toolkit` command line utility to automatically generate the required keystores, truststore, and relevant configuration files. This is especially useful for securing multiple NiFi nodes, which can be a tedious and error-prone process. In order to facilitate the secure setup of NiFi, you can use the `tls-toolkit` command line utility to automatically generate the required keystores, truststore, and relevant configuration files. This is especially useful for securing multiple NiFi nodes, which can be a tedious and error-prone process.
@ -176,8 +170,7 @@ The `tls-toolkit` command line tool has two primary modes of operation:
1. Standalone -- generates the certificate authority, keystores, truststores, and nifi.properties files in one command. 1. Standalone -- generates the certificate authority, keystores, truststores, and nifi.properties files in one command.
2. Client/Server mode -- uses a Certificate Authority Server that accepts Certificate Signing Requests from clients, signs them, and sends the resulting certificates back. Both client and server validate the other's identity through a shared secret. 2. Client/Server mode -- uses a Certificate Authority Server that accepts Certificate Signing Requests from clients, signs them, and sends the resulting certificates back. Both client and server validate the other's identity through a shared secret.
Standalone ==== Standalone
^^^^^^^^^^
Standalone mode is invoked by running `./bin/tls-toolkit.sh standalone -h` which prints the usage information along with descriptions of options that can be specified. Standalone mode is invoked by running `./bin/tls-toolkit.sh standalone -h` which prints the usage information along with descriptions of options that can be specified.
You can use the following command line options with the `tls-toolkit` in standalone mode: You can use the following command line options with the `tls-toolkit` in standalone mode:
@ -228,8 +221,7 @@ bin/tls-toolkit.sh standalone -n 'nifi[01-10].subdomain[1-4].domain(2)' -C 'CN=u
---- ----
Client/Server ==== Client/Server
^^^^^^^^^^^^^
Client/Server mode relies on a long-running Certificate Authority (CA) to issue certificates. The CA can be stopped when you're not bringing nodes online. Client/Server mode relies on a long-running Certificate Authority (CA) to issue certificates. The CA can be stopped when you're not bringing nodes online.
@ -279,8 +271,7 @@ After running the client you will have the CA's certificate, a keystore, a tru
For a client certificate that can be easily imported into the browser, specify: `-T PKCS12` For a client certificate that can be easily imported into the browser, specify: `-T PKCS12`
[[user_authentication]] [[user_authentication]]
User Authentication == User Authentication
-------------------
NiFi supports user authentication via client certificates, via username/password, via Apache Knox, or via OpenId Connect (http://openid.net/connect). NiFi supports user authentication via client certificates, via username/password, via Apache Knox, or via OpenId Connect (http://openid.net/connect).
@ -306,8 +297,7 @@ A secured instance of NiFi cannot be accessed anonymously unless configured to u
NOTE: NiFi does not perform user authentication over HTTP. Using HTTP, all users will be granted all roles. NOTE: NiFi does not perform user authentication over HTTP. Using HTTP, all users will be granted all roles.
[[ldap_login_identity_provider]] [[ldap_login_identity_provider]]
Lightweight Directory Access Protocol (LDAP) === Lightweight Directory Access Protocol (LDAP)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Below is an example and description of configuring a Login Identity Provider that integrates with a Directory Server to authenticate users. Below is an example and description of configuring a Login Identity Provider that integrates with a Directory Server to authenticate users.
@ -376,8 +366,7 @@ compatibility. USE_DN will use the full DN of the user entry if possible. USE_US
|================================================================================================================================================== |==================================================================================================================================================
[[kerberos_login_identity_provider]] [[kerberos_login_identity_provider]]
Kerberos === Kerberos
~~~~~~~~
Below is an example and description of configuring a Login Identity Provider that integrates with a Kerberos Key Distribution Center (KDC) to authenticate users. Below is an example and description of configuring a Login Identity Provider that integrates with a Kerberos Key Distribution Center (KDC) to authenticate users.
@ -408,8 +397,7 @@ nifi.security.user.login.identity.provider=kerberos-provider
See also <<kerberos_service>> to allow single sign-on access via client Kerberos tickets. See also <<kerberos_service>> to allow single sign-on access via client Kerberos tickets.
[[openid_connect]] [[openid_connect]]
OpenId Connect === OpenId Connect
~~~~~~~~~~~~~~
To enable authentication via OpenId Connect the following properties must be configured in nifi.properties. To enable authentication via OpenId Connect the following properties must be configured in nifi.properties.
@ -428,8 +416,7 @@ JSON Web Key (JWK) provided through the jwks_uri in the metadata found at the di
|================================================================================================================================================== |==================================================================================================================================================
[[apache_knox]] [[apache_knox]]
Apache Knox === Apache Knox
~~~~~~~~~~~
To enable authentication via Apache Knox the following properties must be configured in nifi.properties. To enable authentication via Apache Knox the following properties must be configured in nifi.properties.
@ -444,8 +431,7 @@ this listing. The audience that is populated in the token can be configured in K
|================================================================================================================================================== |==================================================================================================================================================
[[multi-tenant-authorization]] [[multi-tenant-authorization]]
Multi-Tenant Authorization == Multi-Tenant Authorization
--------------------------
After you have configured NiFi to run securely and with an authentication mechanism, you must configure who has access to the system, and the level of their access. After you have configured NiFi to run securely and with an authentication mechanism, you must configure who has access to the system, and the level of their access.
You can do this using 'multi-tenant authorization'. Multi-tenant authorization enables multiple groups of users (tenants) to command, control, and observe different You can do this using 'multi-tenant authorization'. Multi-tenant authorization enables multiple groups of users (tenants) to command, control, and observe different
@ -453,8 +439,7 @@ parts of the dataflow, with varying levels of authorization. When an authenticat
user has privileges to perform that action. These privileges are defined by policies that you can apply system-wide or to individual components. user has privileges to perform that action. These privileges are defined by policies that you can apply system-wide or to individual components.
[[authorizer-configuration]] [[authorizer-configuration]]
Authorizer Configuration === Authorizer Configuration
~~~~~~~~~~~~~~~~~~~~~~~~
An 'authorizer' grants users the privileges to manage users and policies by creating preliminary authorizations at startup. An 'authorizer' grants users the privileges to manage users and policies by creating preliminary authorizations at startup.
@ -464,8 +449,7 @@ Authorizers are configured using two properties in the 'nifi.properties' file:
* The `nifi.security.user.authorizer` property indicates which of the configured authorizers in the 'authorizers.xml' file to use. * The `nifi.security.user.authorizer` property indicates which of the configured authorizers in the 'authorizers.xml' file to use.
[[authorizers-setup]] [[authorizers-setup]]
Authorizers.xml Setup === Authorizers.xml Setup
~~~~~~~~~~~~~~~~~~~~~
The 'authorizers.xml' file is used to define and configure available authorizers. The default authorizer is the StandardManagedAuthorizer. The managed authorizer is comprised of a UserGroupProvider The 'authorizers.xml' file is used to define and configure available authorizers. The default authorizer is the StandardManagedAuthorizer. The managed authorizer is comprised of a UserGroupProvider
and an AccessPolicyProvider. The users, groups, and access policies will be loaded and optionally configured through these providers. The managed authorizer will make all access decisions based on and an AccessPolicyProvider. The users, groups, and access policies will be loaded and optionally configured through these providers. The managed authorizer will make all access decisions based on
@ -549,8 +533,7 @@ FileAuthorizer has the following properties.
* Node Identity - The identity of a NiFi cluster node. When clustered, a property for each node should be defined, so that every node knows about every other node. If not clustered, these properties can be ignored. * Node Identity - The identity of a NiFi cluster node. When clustered, a property for each node should be defined, so that every node knows about every other node. If not clustered, these properties can be ignored.
[[initial-admin-identity]] [[initial-admin-identity]]
Initial Admin Identity (New NiFi Instance) ==== Initial Admin Identity (New NiFi Instance)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If you are setting up a secured NiFi instance for the first time, you must manually designate an “Initial Admin Identity” in the 'authorizers.xml' file. This initial admin user is granted access to the UI and given the ability to create additional users, groups, and policies. The value of this property could be a DN (when using certificates or LDAP) or a Kerberos principal. If you are the NiFi administrator, add yourself as the “Initial Admin Identity”. If you are setting up a secured NiFi instance for the first time, you must manually designate an “Initial Admin Identity” in the 'authorizers.xml' file. This initial admin user is granted access to the UI and given the ability to create additional users, groups, and policies. The value of this property could be a DN (when using certificates or LDAP) or a Kerberos principal. If you are the NiFi administrator, add yourself as the “Initial Admin Identity”.
@ -876,8 +859,7 @@ Here is an example composite implementation loading users and groups from LDAP a
In this example, the users and groups are loaded from LDAP but the servers are managed in a local file. The 'Initial Admin Identity' value came from an attribute in an LDAP entry based on the 'User Identity Attribute'. The 'Node Identity' values are established in the local file using the 'Initial User Identity' properties. In this example, the users and groups are loaded from LDAP but the servers are managed in a local file. The 'Initial Admin Identity' value came from an attribute in an LDAP entry based on the 'User Identity Attribute'. The 'Node Identity' values are established in the local file using the 'Initial User Identity' properties.
[[legacy-authorized-users]] [[legacy-authorized-users]]
Legacy Authorized Users (NiFi Instance Upgrade) ==== Legacy Authorized Users (NiFi Instance Upgrade)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If you are upgrading from a 0.x NiFi instance, you can convert your previously configured users and roles to the multi-tenant authorization model. In the 'authorizers.xml' file, specify the location of your existing 'authorized-users.xml' file in the “Legacy Authorized Users File” property. If you are upgrading from a 0.x NiFi instance, you can convert your previously configured users and roles to the multi-tenant authorization model. In the 'authorizers.xml' file, specify the location of your existing 'authorized-users.xml' file in the “Legacy Authorized Users File” property.
@ -952,8 +934,7 @@ NOTE: NiFi fails to restart if values exist for both the “Initial Admin Identi
NOTE: Do not manually edit the 'authorizations.xml' file. Create authorizations only during initial setup and afterwards using the NiFi UI. NOTE: Do not manually edit the 'authorizations.xml' file. Create authorizations only during initial setup and afterwards using the NiFi UI.
[[cluster-node-identities]] [[cluster-node-identities]]
Cluster Node Identities ==== Cluster Node Identities
^^^^^^^^^^^^^^^^^^^^^^^
If you are running NiFi in a clustered environment, you must specify the identities for each node. The authorization policies required for the nodes to communicate are created during startup. If you are running NiFi in a clustered environment, you must specify the identities for each node. The authorization policies required for the nodes to communicate are created during startup.
@ -1000,8 +981,7 @@ NOTE: In a cluster, all nodes must have the same 'authorizations.xml' and 'users
Now that initial authorizations have been created, additional users, groups and authorizations can be created and managed in the NiFi UI. Now that initial authorizations have been created, additional users, groups and authorizations can be created and managed in the NiFi UI.
[[config-users-access-policies]] [[config-users-access-policies]]
Configuring Users & Access Policies === Configuring Users & Access Policies
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Depending on the capabilities of the configured UserGroupProvider and AccessPolicyProvider the users, groups, and policies will be configurable in the UI. If the extensions are not configurable the Depending on the capabilities of the configured UserGroupProvider and AccessPolicyProvider the users, groups, and policies will be configurable in the UI. If the extensions are not configurable the
users, groups, and policies will be read-only in the UI. If the configured authorizer does not use UserGroupProvider and AccessPolicyProvider the users and policies may or may not be visible and users, groups, and policies will be read-only in the UI. If the configured authorizer does not use UserGroupProvider and AccessPolicyProvider the users and policies may or may not be visible and
@ -1017,8 +997,7 @@ This section assumes the users, groups, and policies are configurable in the UI
NOTE: Instructions requiring interaction with the UI assume the application is being accessed by User1, a user with administrator privileges, such as the “Initial Admin Identity” user or a converted legacy admin user (see <<authorizers-setup>>). NOTE: Instructions requiring interaction with the UI assume the application is being accessed by User1, a user with administrator privileges, such as the “Initial Admin Identity” user or a converted legacy admin user (see <<authorizers-setup>>).
[[creating-users-groups]] [[creating-users-groups]]
Creating Users and Groups ==== Creating Users and Groups
^^^^^^^^^^^^^^^^^^^^^^^^^
From the UI, select “Users” from the Global Menu. This opens a dialog to create and manage users and groups. From the UI, select “Users” from the Global Menu. This opens a dialog to create and manage users and groups.
@ -1034,8 +1013,7 @@ To create a group, select the “Group” radio button, enter the name of the gr
image:group-creation-dialog.png["Group Creation Dialog"] image:group-creation-dialog.png["Group Creation Dialog"]
[[access-policies]] [[access-policies]]
Access Policies ==== Access Policies
^^^^^^^^^^^^^^^
You can manage the ability for users and groups to view or modify NiFi resources using 'access policies'. There are two types of access policies that can be applied to a resource: You can manage the ability for users and groups to view or modify NiFi resources using 'access policies'. There are two types of access policies that can be applied to a resource:
@ -1142,8 +1120,7 @@ NOTE: “View the policies” and “modify the policies” component-level acce
NOTE: You cannot modify the users/groups on an inherited policy. Users and groups can only be added or removed from a parent policy or an override policy. NOTE: You cannot modify the users/groups on an inherited policy. Users and groups can only be added or removed from a parent policy or an override policy.
[[viewing-policies-users]] [[viewing-policies-users]]
Viewing Policies on Users ==== Viewing Policies on Users
^^^^^^^^^^^^^^^^^^^^^^^^^
From the UI, select “Users” from the Global Menu. This opens the NiFi Users dialog. From the UI, select “Users” from the Global Menu. This opens the NiFi Users dialog.
@ -1156,8 +1133,7 @@ image:user-policies-detail.png["User Policies Detail"]
The User Policies window displays the global and component level policies that have been set for the chosen user. Select the Go To icon (image:iconGoTo.png["Go To Icon"]) to navigate to that component in the canvas. The User Policies window displays the global and component level policies that have been set for the chosen user. Select the Go To icon (image:iconGoTo.png["Go To Icon"]) to navigate to that component in the canvas.
[[access-policy-config-examples]] [[access-policy-config-examples]]
Access Policy Configuration Examples ==== Access Policy Configuration Examples
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The most effective way to understand how to create and apply access policies is to walk through some common examples. The following scenarios assume User1 is an administrator and User2 is a newly added user that has only been given access to the UI. The most effective way to understand how to create and apply access policies is to walk through some common examples. The following scenarios assume User1 is an administrator and User2 is a newly added user that has only been given access to the UI.
@ -1288,16 +1264,14 @@ Being added to both the view and modify policies for the process group, User2 ca
image:user2-edit-connection.png["User2 Edit Connection"] image:user2-edit-connection.png["User2 Edit Connection"]
[[encryption]] [[encryption]]
Encryption Configuration == Encryption Configuration
------------------------
This section provides an overview of the capabilities of NiFi to encrypt and decrypt data. This section provides an overview of the capabilities of NiFi to encrypt and decrypt data.
The `EncryptContent` processor allows for the encryption and decryption of data, both internal to NiFi and integrated with external systems, such as `openssl` and other data sources and consumers. The `EncryptContent` processor allows for the encryption and decryption of data, both internal to NiFi and integrated with external systems, such as `openssl` and other data sources and consumers.
[[key-derivation-functions]] [[key-derivation-functions]]
Key Derivation Functions === Key Derivation Functions
~~~~~~~~~~~~~~~~~~~~~~~~
Key Derivation Functions (KDF) are mechanisms by which human-readable information, usually a password or other secret information, is translated into a cryptographic key suitable for data protection. For further information, read https://en.wikipedia.org/wiki/Key_derivation_function[the Wikipedia entry on Key Derivation Functions]. Key Derivation Functions (KDF) are mechanisms by which human-readable information, usually a password or other secret information, is translated into a cryptographic key suitable for data protection. For further information, read https://en.wikipedia.org/wiki/Key_derivation_function[the Wikipedia entry on Key Derivation Functions].
Currently, KDFs are ingested by `CipherProvider` implementations and return a fully-initialized `Cipher` object to be used for encryption or decryption. Due to the use of a `CipherProviderFactory`, the KDFs are not customizable at this time. Future enhancements will include the ability to provide custom cost parameters to the KDF at initialization time. As a work-around, `CipherProvider` instances can be initialized with custom cost parameters in the constructor but this is not currently supported by the `CipherProviderFactory`. Currently, KDFs are ingested by `CipherProvider` implementations and return a fully-initialized `Cipher` object to be used for encryption or decryption. Due to the use of a `CipherProviderFactory`, the KDFs are not customizable at this time. Future enhancements will include the ability to provide custom cost parameters to the KDF at initialization time. As a work-around, `CipherProvider` instances can be initialized with custom cost parameters in the constructor but this is not currently supported by the `CipherProviderFactory`.
@ -1337,8 +1311,7 @@ Here are the KDFs currently supported by NiFi (primarily in the `EncryptContent`
** This KDF was added in v0.5.0. ** This KDF was added in v0.5.0.
** This KDF performs no operation on the input and is a marker to indicate the raw key is provided to the cipher. The key must be provided in hexadecimal encoding and be of a valid length for the associated cipher/algorithm. ** This KDF performs no operation on the input and is a marker to indicate the raw key is provided to the cipher. The key must be provided in hexadecimal encoding and be of a valid length for the associated cipher/algorithm.
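The raw-key requirement above (hexadecimal encoding, valid length for the cipher) can be illustrated with a minimal Python sketch. It is not NiFi code: the function name `parse_raw_key` and the default of AES key sizes (16, 24, or 32 bytes) are assumptions made for illustration only.

```python
# Hypothetical helper, not part of NiFi: checks that a raw key is valid hex
# and has a length acceptable to the chosen cipher.
AES_KEY_LENGTHS = (16, 24, 32)  # assumption: AES-128/192/256 key sizes in bytes


def parse_raw_key(hex_key, valid_lengths=AES_KEY_LENGTHS):
    """Decode a hex-encoded key and verify its length in bytes."""
    try:
        key = bytes.fromhex(hex_key)
    except ValueError as exc:
        raise ValueError("key is not valid hexadecimal") from exc
    if len(key) not in valid_lengths:
        raise ValueError(
            "key is %d bytes; expected one of %s" % (len(key), valid_lengths)
        )
    return key
```

For a different cipher, pass that cipher's key sizes as `valid_lengths`.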
Additional Resources ==== Additional Resources
^^^^^^^^^^^^^^^^^^^^
* http://stackoverflow.com/a/30308723/70465[Explanation of optimal scrypt cost parameters and relationships] * http://stackoverflow.com/a/30308723/70465[Explanation of optimal scrypt cost parameters and relationships]
* http://csrc.nist.gov/publications/nistpubs/800-132/nist-sp800-132.pdf[NIST Special Publication 800-132] * http://csrc.nist.gov/publications/nistpubs/800-132/nist-sp800-132.pdf[NIST Special Publication 800-132]
@ -1353,22 +1326,19 @@ Additional Resources
* https://wiki.openssl.org/index.php/Manual:PKCS5_PBKDF2_HMAC(3)[OpenSSL PBKDF2 KDF] * https://wiki.openssl.org/index.php/Manual:PKCS5_PBKDF2_HMAC(3)[OpenSSL PBKDF2 KDF]
* http://security.stackexchange.com/a/29139/16485[OpenSSL KDF flaws description] * http://security.stackexchange.com/a/29139/16485[OpenSSL KDF flaws description]
Salt and IV Encoding === Salt and IV Encoding
~~~~~~~~~~~~~~~~~~~~
Initially, the `EncryptContent` processor had a single method of deriving the encryption key from a user-provided password. This is now referred to as `NiFiLegacy` mode, effectively `MD5 digest, 1000 iterations`. In v0.4.0, another method of deriving the key, `OpenSSL PKCS#5 v1.5 EVP_BytesToKey` was added for compatibility with content encrypted outside of NiFi using the `openssl` command-line tool. Both of these <<key-derivation-functions, Key Derivation Functions>> (KDF) had hard-coded digest functions and iteration counts, and the salt format was also hard-coded. With v0.5.0, additional KDFs are introduced with variable iteration counts, work factors, and salt formats. In addition, _raw keyed encryption_ was also introduced. This required the capacity to encode arbitrary salts and Initialization Vectors (IV) into the cipher stream in order to be recovered by NiFi or a follow-on system to decrypt these messages. Initially, the `EncryptContent` processor had a single method of deriving the encryption key from a user-provided password. This is now referred to as `NiFiLegacy` mode, effectively `MD5 digest, 1000 iterations`. In v0.4.0, another method of deriving the key, `OpenSSL PKCS#5 v1.5 EVP_BytesToKey` was added for compatibility with content encrypted outside of NiFi using the `openssl` command-line tool. Both of these <<key-derivation-functions, Key Derivation Functions>> (KDF) had hard-coded digest functions and iteration counts, and the salt format was also hard-coded. With v0.5.0, additional KDFs are introduced with variable iteration counts, work factors, and salt formats. In addition, _raw keyed encryption_ was also introduced. This required the capacity to encode arbitrary salts and Initialization Vectors (IV) into the cipher stream in order to be recovered by NiFi or a follow-on system to decrypt these messages.
For the existing KDFs, the salt format has not changed. For the existing KDFs, the salt format has not changed.
NiFi Legacy ==== NiFi Legacy
^^^^^^^^^^^
The first 8 or 16 bytes of the input are the salt. The salt length is determined based on the selected algorithm's cipher block length. If the cipher block size cannot be determined (such as with a stream cipher like `RC4`), the default value of 8 bytes is used. On decryption, the salt is read in and combined with the password to derive the encryption key and IV. The first 8 or 16 bytes of the input are the salt. The salt length is determined based on the selected algorithm's cipher block length. If the cipher block size cannot be determined (such as with a stream cipher like `RC4`), the default value of 8 bytes is used. On decryption, the salt is read in and combined with the password to derive the encryption key and IV.
image:nifi-legacy-salt.png["NiFi Legacy Salt Encoding"] image:nifi-legacy-salt.png["NiFi Legacy Salt Encoding"]
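The NiFi Legacy layout described above can be sketched in Python as follows. This is an illustration, not NiFi's implementation: the function names are hypothetical, and the sketch assumes the salt length equals the cipher block size when that size is 8 or 16 bytes, falling back to 8 bytes otherwise.

```python
# Hypothetical sketch of splitting NiFi Legacy input into salt and cipher text.
DEFAULT_SALT_LENGTH = 8  # used when the block size cannot be determined


def legacy_salt_length(block_size):
    """Return the salt length in bytes for a given cipher block size.

    Assumption: a block size of 8 or 16 bytes maps to a salt of the same
    length; anything else (for example None for a stream cipher) uses 8.
    """
    if block_size in (8, 16):
        return block_size
    return DEFAULT_SALT_LENGTH


def split_legacy_input(data, block_size):
    """Split the input into (salt, cipher_text)."""
    length = legacy_salt_length(block_size)
    if len(data) < length:
        raise ValueError("input is shorter than the expected salt")
    return data[:length], data[length:]
```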
OpenSSL PKCS#5 v1.5 EVP_BytesToKey ==== OpenSSL PKCS#5 v1.5 EVP_BytesToKey
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OpenSSL allows for salted or unsalted key derivation. _*Unsalted key derivation is a security risk and is not recommended.*_ If a salt is present, the first 8 bytes of the input are the ASCII string "`Salted__`" (`0x53 61 6C 74 65 64 5F 5F`) and the next 8 bytes are the ASCII-encoded salt. On decryption, the salt is read in and combined with the password to derive the encryption key and IV. If there is no salt header, the entire input is considered to be the cipher text. OpenSSL allows for salted or unsalted key derivation. _*Unsalted key derivation is a security risk and is not recommended.*_ If a salt is present, the first 8 bytes of the input are the ASCII string "`Salted__`" (`0x53 61 6C 74 65 64 5F 5F`) and the next 8 bytes are the ASCII-encoded salt. On decryption, the salt is read in and combined with the password to derive the encryption key and IV. If there is no salt header, the entire input is considered to be the cipher text.
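As a rough illustration of the header check described above, the following Python sketch separates the salt from the cipher text. The function name is hypothetical and the code is not taken from NiFi or OpenSSL. It only inspects the first 16 bytes and performs no key derivation.

```python
# Hypothetical sketch: detect the OpenSSL salt header and extract the salt.
OPENSSL_SALT_HEADER = b"Salted__"  # 0x53 61 6C 74 65 64 5F 5F
OPENSSL_SALT_LENGTH = 8


def split_openssl_input(data):
    """Return (salt, cipher_text); salt is None if no header is present."""
    if not data.startswith(OPENSSL_SALT_HEADER):
        # No salt header: the entire input is treated as cipher text.
        return None, data
    start = len(OPENSSL_SALT_HEADER)
    end = start + OPENSSL_SALT_LENGTH
    if len(data) < end:
        raise ValueError("salt header present but salt is truncated")
    return data[start:end], data[end:]
```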
@ -1376,8 +1346,7 @@ image:openssl-salt.png["OpenSSL Salt Encoding"]
For new KDFs, each of which allow for non-deterministic IVs, the IV must be stored alongside the cipher text. This is not a vulnerability, as the IV is not required to be secret, but simply to be unique for messages encrypted using the same key to reduce the success of cryptographic attacks. For these KDFs, the output consists of the salt, followed by the salt delimiter, UTF-8 string "`NiFiSALT`" (`0x4E 69 46 69 53 41 4C 54`) and then the IV, followed by the IV delimiter, UTF-8 string "`NiFiIV`" (`0x4E 69 46 69 49 56`), followed by the cipher text. For new KDFs, each of which allow for non-deterministic IVs, the IV must be stored alongside the cipher text. This is not a vulnerability, as the IV is not required to be secret, but simply to be unique for messages encrypted using the same key to reduce the success of cryptographic attacks. For these KDFs, the output consists of the salt, followed by the salt delimiter, UTF-8 string "`NiFiSALT`" (`0x4E 69 46 69 53 41 4C 54`) and then the IV, followed by the IV delimiter, UTF-8 string "`NiFiIV`" (`0x4E 69 46 69 49 56`), followed by the cipher text.
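The salt, delimiter, IV, delimiter, cipher text layout described above can be sketched in Python. The function name is hypothetical and this is not NiFi's parser. It splits on the first occurrence of each delimiter, so it assumes the salt and IV do not themselves contain the delimiter bytes.

```python
# Hypothetical sketch: split output of the newer KDFs into its three parts.
SALT_DELIMITER = b"NiFiSALT"  # 0x4E 69 46 69 53 41 4C 54
IV_DELIMITER = b"NiFiIV"      # 0x4E 69 46 69 49 56


def split_salt_iv_cipher_text(data):
    """Return (salt, iv, cipher_text) from salt|NiFiSALT|iv|NiFiIV|cipher text."""
    salt, found, rest = data.partition(SALT_DELIMITER)
    if not found:
        raise ValueError("salt delimiter not found")
    iv, found, cipher_text = rest.partition(IV_DELIMITER)
    if not found:
        raise ValueError("IV delimiter not found")
    return salt, iv, cipher_text
```

A production parser would more likely read fixed-length fields or bound the search for each delimiter, since random salt or IV bytes could in principle match a delimiter.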
-Bcrypt, Scrypt, PBKDF2
-^^^^^^^^^^^^^^^^^^^^^^
+==== Bcrypt, Scrypt, PBKDF2
image:bcrypt-salt.png["Bcrypt Salt & IV Encoding"]
@@ -1385,8 +1354,7 @@ image:scrypt-salt.png["Scrypt Salt & IV Encoding"]
image:pbkdf2-salt.png["PBKDF2 Salt & IV Encoding"]
-Java Cryptography Extension (JCE) Limited Strength Jurisdiction Policies
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+=== Java Cryptography Extension (JCE) Limited Strength Jurisdiction Policies
Because of US export regulations, default JVMs have http://docs.oracle.com/javase/7/docs/technotes/guides/security/SunProviders.html#importlimits[limits imposed on the strength of cryptographic operations] available to them. For example, AES operations are limited to `128 bit keys` by default. While `AES-128` is cryptographically safe, this can have unintended consequences, specifically on Password-based Encryption (PBE).
@@ -1459,8 +1427,7 @@ A number of PBE algorithms provided by NiFi impose strict limits on the length o
|7
|===
-Allow Insecure Cryptographic Modes
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+=== Allow Insecure Cryptographic Modes
By default, the `Allow Insecure Cryptographic Modes` property in `EncryptContent` processor settings is set to `not-allowed`. This means that if a password of fewer than `10` characters is provided, a validation error will occur. 10 characters is a conservative estimate and does not take into consideration full entropy calculations, patterns, etc.
@@ -1480,8 +1447,7 @@ If it is not possible to install the unlimited strength jurisdiction policies, t
It is preferable to request upstream/downstream systems to switch to https://cwiki.apache.org/confluence/display/NIFI/Encryption+Information[keyed encryption] or use a "strong" https://cwiki.apache.org/confluence/display/NIFI/Key+Derivation+Function+Explanations[Key Derivation Function (KDF) supported by NiFi].
-Encrypted Passwords in Configuration Files
-------------------------------------------
+== Encrypted Passwords in Configuration Files
In order to facilitate the secure setup of NiFi, you can use the `encrypt-config` command line utility to encrypt raw configuration values that NiFi decrypts in memory on startup. This extensible protection scheme transparently allows NiFi to use raw values in operation, while protecting them at rest. In the future, hardware security modules (HSM) and external secure storage mechanisms will be integrated, but for now, an AES encryption provider is the default implementation.
@@ -1490,8 +1456,7 @@ This is a change in behavior; prior to 1.0, all configuration values were stored
If no administrator action is taken, the configuration values remain unencrypted.
[[encrypt-config_tool]]
-Encrypt-Config Tool
-~~~~~~~~~~~~~~~~~~~
+=== Encrypt-Config Tool
The `encrypt-config` command line tool (invoked as `./bin/encrypt-config.sh` or `bin\encrypt-config.bat`) reads from a 'nifi.properties' file with plaintext sensitive configuration values, prompts for a master password or raw hexadecimal key, and encrypts each value. It replaces the plain values with the protected value in the same file, or writes to a new 'nifi.properties' file if specified.
@@ -1596,8 +1561,7 @@ When applied to 'login-identity-providers.xml', the property elements are update
----
[encrypt_config_property_migration]
-Sensitive Property Key Migration
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+=== Sensitive Property Key Migration
In order to change the key used to encrypt the sensitive values, indicate *migration mode* using the `-m` or `--migrate` flag, provide the new key or password using the `-k` or `-p` flags as usual, and provide the existing key or password using `-e` or `-w` respectively. This will allow the toolkit to decrypt the existing values and re-encrypt them, and update `bootstrap.conf` with the new key. Only one of the key or password needs to be specified for each phase (old vs. new), and any combination is sufficient:
@@ -1607,8 +1571,7 @@ In order to change the key used to encrypt the sensitive values, indicate *migra
* old password -> new password
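The four old/new combinations map onto the migration flags in a mechanical way, which the following Python sketch makes explicit. The helper `migration_args` is hypothetical and builds only the secret-related arguments named above (`-m`, `-k`, `-p`, `-e`, `-w`); any other arguments the tool needs, such as file locations, are deliberately left out rather than guessed.

```python
def migration_args(old_secret, new_secret, old_is_password, new_is_password):
    """Build the migration-mode arguments for one old -> new combination."""
    new_flag = "-p" if new_is_password else "-k"  # new password / new raw key
    old_flag = "-w" if old_is_password else "-e"  # existing password / existing raw key
    return ["-m", new_flag, new_secret, old_flag, old_secret]


# old password -> new key (values are placeholders, not real secrets)
args = migration_args("oldPassword123", "0123456789ABCDEFFEDCBA9876543210",
                      old_is_password=True, new_is_password=False)
```

Passing secrets as command-line arguments exposes them to shell history and process listings, so a list like this is best suited to controlled automation.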
[encrypt_config_flow_migration]
-Existing Flow Migration
-~~~~~~~~~~~~~~~~~~~~~~~
+=== Existing Flow Migration
This tool can also be used to change the value of `nifi.sensitive.props.key` for an existing flow. The tool will read the existing `flow.xml.gz` and decrypt any sensitive component properties using the original key,
then re-encrypt the sensitive properties with the new key, and write out a new version of the `flow.xml.gz`, or overwrite the existing one.
@@ -1626,8 +1589,7 @@ The following command would migrate the sensitive properties key and write out a
----
[[encrypt-config_password]]
-Password Key Derivation
-~~~~~~~~~~~~~~~~~~~~~~~
+=== Password Key Derivation
Instead of providing a 32 or 64 character raw hexadecimal key, you can provide a password from which the key will be derived. As of 1.0.0, the password must be at least 12 characters, and the key will be derived using `SCrypt` with the parameters:
@@ -1643,14 +1605,12 @@ As of August 2016, these values are determined to be strong for this threat mode
NOTE: While fixed salts are counter to best practices, a static salt is necessary for deterministic key derivation without additional storage of the salt value.
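To illustrate why a static salt makes the derivation deterministic, here is a minimal Python sketch using the standard library's `hashlib.scrypt`. The salt and the cost parameters below are illustrative placeholders chosen for this example; they are not the values NiFi uses, which are not shown in this excerpt. The helper name `derive_key_hex` is also hypothetical.

```python
import hashlib

# Illustrative placeholders only: NOT the salt or parameters NiFi uses.
STATIC_SALT = b"example-static-salt"
N, R, P = 2 ** 14, 8, 1
KEY_LENGTH_BYTES = 32        # 32 bytes = 64 hexadecimal characters
MIN_PASSWORD_LENGTH = 12     # minimum stated for 1.0.0


def derive_key_hex(password: str) -> str:
    """Derive a hex-encoded key from a password with scrypt and a fixed salt."""
    if len(password) < MIN_PASSWORD_LENGTH:
        raise ValueError("password must be at least 12 characters")
    key = hashlib.scrypt(password.encode("utf-8"), salt=STATIC_SALT,
                         n=N, r=R, p=P, dklen=KEY_LENGTH_BYTES)
    return key.hex()


first = derive_key_hex("thisIsABadPassword")
second = derive_key_hex("thisIsABadPassword")
```

Both calls yield the same key because nothing random enters the derivation, which is what lets the key be recomputed from the password alone without storing a salt.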
[[encrypt-config_secure_prompt]]
-Secure Prompt
-~~~~~~~~~~~~~
+=== Secure Prompt
If you prefer not to provide the password or raw key in the command-line invocation of the tool, leaving these arguments absent will prompt a secure console read of the password (by default) or raw key (if the `-r` flag is provided at invocation).
[[admin-toolkit]]
-Administrative Tools
---------------------
+== Administrative Tools
The admin toolkit contains command line utilities for administrators to support NiFi maintenance in standalone
and clustered environments. These utilities include:
@@ -1872,8 +1832,7 @@ live outside of the NiFi directory, remove them so they can be recreated on star
[[clustering]]
-Clustering Configuration
-------------------------
+== Clustering Configuration
This section provides a quick overview of NiFi Clustering and instructions on how to set up a basic cluster.
In the future, we hope to provide supplemental documentation that covers the NiFi Cluster Architecture in depth.
@@ -2045,8 +2004,7 @@ set the level="DEBUG" in the following line (instead of "INFO"):
[[state_management]]
-State Management
-----------------
+== State Management
NiFi provides a mechanism for Processors, Reporting Tasks, Controller Services, and the framework itself to persist state. This
allows a Processor, for example, to resume from the place where it left off after NiFi is restarted. Additionally, it allows for
@@ -2474,8 +2432,7 @@ If the state-management.xml specifies Open, no authentication is required.
6. Once the migration has completed successfully, start the processors in the NiFi flow. Processing should continue from the point at which it was stopped when the NiFi flow was stopped.
[[bootstrap_properties]]
-Bootstrap Properties
---------------------
+== Bootstrap Properties
The _bootstrap.conf_ file in the _conf_ directory allows users to configure settings for how NiFi should be started.
This includes parameters, such as the size of the Java Heap, what Java command to run, and Java System Properties.
@@ -2511,8 +2468,7 @@ take effect only after NiFi has been stopped and restarted.
|====
[[notification_services]]
-Notification Services
----------------------
+== Notification Services
When the NiFi bootstrap starts or stops NiFi, or detects that it has died unexpectedly, it is able to notify configured recipients. Currently,
the only mechanisms supplied are to send an e-mail or HTTP POST notification. The notification services configuration file
is an XML file where the notification capabilities are configured.
@@ -2623,8 +2579,7 @@ A complete example of configuring the HTTP service could look like the following
....
[[proxy_configuration]]
-Proxy Configuration
--------------------
+== Proxy Configuration
When running Apache NiFi behind a proxy there are a couple of key items to be aware of during deployment.
* NiFi is comprised of a number of web applications (web UI, web API, documentation, custom UIs, data viewers, etc), so the mapping needs to be configured for the *root path*. That way all context
@@ -2680,8 +2635,7 @@ documentation of the proxy for guidance for your deployment environment and use
....
[[kerberos_service]]
-Kerberos Service
-----------------
+== Kerberos Service
NiFi can be configured to use Kerberos SPNEGO (or "Kerberos Service") for authentication. In this scenario, users will hit the REST endpoint `/access/kerberos` and the server will respond with a `401` status code and the challenge response header `WWW-Authenticate: Negotiate`. This communicates to the browser to use the GSS-API and load the user's Kerberos ticket and provide it as a Base64-encoded header value in the subsequent request. It will be of the form `Authorization: Negotiate YII...`. NiFi will attempt to validate this ticket with the KDC. If it is successful, the user's _principal_ will be returned as the identity, and the flow will follow login/credential authentication, in that a JWT will be issued in the response to prevent the unnecessary overhead of Kerberos authentication on every subsequent request.
If the ticket cannot be validated, it will return with the appropriate error response code. The user will then be able to provide their Kerberos credentials to the login form if the `KerberosLoginIdentityProvider` has been configured. See <<kerberos_login_identity_provider>> login identity provider for more details.
NiFi will only respond to Kerberos SPNEGO negotiation over an HTTPS connection, as unsecured requests are never authenticated.
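The challenge and response headers in this exchange can be sketched in Python as follows. This is only an illustration of the header handling described above: the function names are hypothetical, the ticket bytes are a stand-in rather than a real Kerberos ticket, and obtaining a ticket through the GSS-API is out of scope.

```python
import base64


def is_negotiate_challenge(status_code: int, headers: dict) -> bool:
    """True when a response is the 401 + 'WWW-Authenticate: Negotiate' challenge."""
    challenge = headers.get("WWW-Authenticate", "")
    return status_code == 401 and challenge.strip().lower().startswith("negotiate")


def authorization_value(ticket: bytes) -> str:
    """Build the Authorization header value carrying the Base64-encoded ticket."""
    return "Negotiate " + base64.b64encode(ticket).decode("ascii")


# Placeholder bytes standing in for a ticket obtained via the GSS-API.
fake_ticket = b"\x60\x82\x01\x00"
challenged = is_negotiate_challenge(401, {"WWW-Authenticate": "Negotiate"})
header_value = authorization_value(fake_ticket)
```

The sketch looks up the header with exact casing for brevity; a real client would treat HTTP header names case-insensitively and would send the request only over HTTPS.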
@@ -2697,8 +2651,7 @@ The following properties must be set in _nifi.properties_ to enable Kerberos ser
See <<kerberos_properties>> for complete documentation.
[[kerberos_service_notes]]
-Notes
-~~~~~
+=== Notes
* Kerberos is case-sensitive in many places and the error messages (or lack thereof) may not be sufficiently explanatory. Check the case sensitivity of the service principal in your configuration files. Convention is `HTTP/fully.qualified.domain@REALM`.
* Browsers have varying levels of restriction when dealing with SPNEGO negotiations. Some will provide the local Kerberos ticket to any domain that requests it, while others whitelist the trusted domains. See link:http://docs.spring.io/autorepo/docs/spring-security-kerberos/1.0.2.BUILD-SNAPSHOT/reference/htmlsingle/#browserspnegoconfig[Spring Security Kerberos - Reference Documentation: Appendix E. Configure browsers for SPNEGO Negotiation] for common browsers.
@@ -2733,8 +2686,7 @@ root@kdc:~#
....
[[system_properties]]
-System Properties
------------------
+== System Properties
The _nifi.properties_ file in the _conf_ directory is the main configuration file for controlling how NiFi runs. This section provides an overview of the properties in this file and includes some notes on how to configure it in a way that will make upgrading easier. *After making changes to this file, restart NiFi in order
for the changes to take effect.*

View File

@@ -14,8 +14,7 @@
// See the License for the specific language governing permissions and
// limitations under the License.
//
-NiFi Developer's Guide
-======================
+= NiFi Developer's Guide
Apache NiFi Team <dev@nifi.apache.org>
:homepage: http://nifi.apache.org

View File

@@ -14,14 +14,12 @@
// See the License for the specific language governing permissions and
// limitations under the License.
//
-Apache NiFi Expression Language Guide
-=====================================
+= Apache NiFi Expression Language Guide
Apache NiFi Team <dev@nifi.apache.org>
:homepage: http://nifi.apache.org
[[overview]]
-Overview
---------
+== Overview
All data in Apache NiFi is represented by an abstraction called a FlowFile.
A FlowFile is comprised of two major pieces: content and attributes.
The content portion of the FlowFile represents the data on which to operate.
@@ -48,8 +46,7 @@ and manipulate their values.
[[structure]]
-Structure of a NiFi Expression
-------------------------------
+== Structure of a NiFi Expression
The NiFi Expression Language always begins with the start delimiter `${` and ends
with the end delimiter `}`. Between the start and end delimiters is the text of the

View File

@@ -14,14 +14,12 @@
// See the License for the specific language governing permissions and
// limitations under the License.
//
-Getting Started with Apache NiFi
-================================
+= Getting Started with Apache NiFi
Apache NiFi Team <dev@nifi.apache.org>
:homepage: http://nifi.apache.org
-Who is This Guide For?
-----------------------
+== Who is This Guide For?
This guide is written for users who have never used, have had limited exposure to, or only accomplished specific tasks within NiFi.
This guide is not intended to be an exhaustive instruction manual or a reference guide. The
@@ -42,8 +40,7 @@ link:overview.html[Overview] documentation.
-Terminology Used in This Guide
-------------------------------
+== Terminology Used in This Guide
In order to talk about NiFi, there are a few key terms that readers should be familiar with.
We will explain those NiFi-specific terms here, at a high level.
@@ -58,8 +55,7 @@ splitting, merging, and processing FlowFiles. It is the most important building
dataflows.
-Downloading and Installing NiFi
--------------------------------
+== Downloading and Installing NiFi
NiFi can be downloaded from the link:http://nifi.apache.org/download.html[NiFi Downloads Page]. There are two packaging options
available: a "tarball" that is tailored more to Linux and a zip file that is more applicable for Windows users. Mac OS X users
@@ -74,8 +70,7 @@ For information on how to configure the instance of NiFi (for example, to config
configuration, or the port that NiFi is running on), see the link:administration-guide.html[Admin Guide].
-Starting NiFi
--------------
+== Starting NiFi
Once NiFi has been downloaded and installed as described above, it can be started by using the mechanism
appropriate for your operating system.
@@ -113,8 +108,7 @@ and `sudo service nifi stop`. Additionally, the running status can be checked vi
-I Started NiFi. Now What?
--------------------------
+== I Started NiFi. Now What?
Now that NiFi has been started, we can bring up the User Interface (UI) in order to create and monitor our dataflow.
To get started, open a web browser and navigate to `http://localhost:8080/nifi`. The port can be changed by
@@ -272,8 +266,7 @@ link:user-guide.html[User Guide].
-What Processors are Available
------------------------------
+== What Processors are Available
In order to create an effective dataflow, the users must understand what types of Processors are available to them.
NiFi contains many different Processors out of the box. These Processors provide capabilities to ingest data from
@@ -440,8 +433,7 @@ categorizing them by their functions.
a message from SQS, perform some processing on it, and then delete the object from the queue only after it has successfully completed processing.
-Working With Attributes
------------------------
+== Working With Attributes
Each FlowFile is created with several Attributes, and these Attributes will change over the life of
the FlowFile. The concept of a FlowFile is extremely powerful and provides three primary benefits.
First, it allows the user to make routing decisions in the flow so that FlowFiles that meet some criteria
@@ -565,14 +557,12 @@ cause a tooltip to show, which explains what the function does, the arguments th
-Custom Properties Within Expression Language
---------------------------------------------
+== Custom Properties Within Expression Language
In addition to using FlowFile attributes, you can also define custom properties for Expression Language use. Defining custom properties gives you additional flexibility in processing and configuring dataflows. For example, you can refer to custom properties for connection, server, and service properties. Once you have created custom properties, you can identify their location in the `nifi.variable.registry.properties` field in the 'nifi.properties' file. After you have updated the 'nifi.properties' file and restarted NiFi, you are able to use custom properties as needed.
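As a small illustration of how the `nifi.variable.registry.properties` field points at custom property files, the Python sketch below reads that field out of the text of a 'nifi.properties' file. The helper name and the file paths are hypothetical, and the sketch assumes the field holds a comma-separated list of file locations.

```python
REGISTRY_KEY = "nifi.variable.registry.properties"


def variable_registry_files(properties_text: str):
    """Return the custom property file locations listed under REGISTRY_KEY."""
    for line in properties_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == REGISTRY_KEY:
            # Assumption: multiple locations are separated by commas.
            return [path.strip() for path in value.split(",") if path.strip()]
    return []


sample = """
# hypothetical excerpt of nifi.properties
nifi.variable.registry.properties=./conf/custom.properties, ./conf/env.properties
"""
files = variable_registry_files(sample)
```

Since the field is read at startup, edits to it take effect only after NiFi is restarted, as the paragraph above notes.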
-Working With Templates
-----------------------
+== Working With Templates
As we use Processors to build more and more complex dataflows in NiFi, we often will find that we string together the same sequence
of Processors to perform some task. This can become tedious and inefficient. To address this, NiFi provides a concept of Templates.
@@ -610,8 +600,7 @@ There are a few important notes to remember when working with templates:
- If a component that is included in the template references a Controller Service, the Controller Service will also be added to the template. This means that each time that the template is added to the graph, it will create a copy of the Controller Service.
-Monitoring NiFi
----------------
+== Monitoring NiFi
As data flows through your dataflow in NiFi, it is important to understand how well your system is performing in order to assess if you
will require more resources and in order to assess the health of your current resources. NiFi provides a few mechanisms for monitoring
@@ -658,8 +647,7 @@ If the framework emits a bulletin, we will also see a bulletin indicator highlig
In the Global Menu is a Bulletin Board option. Clicking this option will take us to the bulletin board where we can see all bulletins that occur across the NiFi instance and can filter based on the component, the message, etc.
-Data Provenance
----------------
+== Data Provenance
NiFi keeps a very granular level of detail about each piece of data that it ingests. As the data is processed through
the system and is transformed, routed, split, aggregated, and distributed to other endpoints, this information is
@@ -734,8 +722,7 @@ was introduced by a JOIN event, in which we were waiting for more FlowFiles to j
see where this is occurring is a very powerful feature that will help users to understand how the enterprise is operating.
-Where To Go For More Information
---------------------------------
+== Where To Go For More Information
The NiFi community has built up a significant amount of documentation on how to use the software. The following guides are available, in
addition to this Getting Started Guide:

View File

@@ -14,13 +14,11 @@
// See the License for the specific language governing permissions and
// limitations under the License.
//
-Apache NiFi In Depth
-====================
+= Apache NiFi In Depth
Apache NiFi Team <dev@nifi.apache.org>
:homepage: http://nifi.apache.org
-Intro
------
+== Intro
This advanced level document is aimed at providing an in-depth look at the implementation and design decisions of NiFi. It assumes the reader has read enough of the other documentation to know the basics of NiFi.
FlowFiles are at the heart of NiFi and its flow-based design. A FlowFile is a data record, which consists of a pointer to its content (payload) and attributes to support the content, that is associated with one or more provenance events. The attributes are key/value pairs that act as the metadata for the FlowFile, such as the FlowFile filename. The content is the actual data or the payload of the file. Provenance is a record of what has happened to the FlowFile. Each one of these parts has its own repository (repo) for storage.

@ -14,13 +14,11 @@
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
Apache NiFi Overview = Apache NiFi Overview
====================
Apache NiFi Team <dev@nifi.apache.org> Apache NiFi Team <dev@nifi.apache.org>
:homepage: http://nifi.apache.org :homepage: http://nifi.apache.org
What is Apache NiFi? == What is Apache NiFi?
--------------------
Put simply NiFi was built to automate the flow of data between systems. While Put simply NiFi was built to automate the flow of data between systems. While
the term 'dataflow' is used in a variety of contexts, we use it here the term 'dataflow' is used in a variety of contexts, we use it here
to mean the automated and managed flow of information between systems. This to mean the automated and managed flow of information between systems. This
@ -66,8 +64,7 @@ complexity, the rate of change necessary to adapt, and that at scale
the edge case becomes common occurrence. NiFi is built to help tackle these the edge case becomes common occurrence. NiFi is built to help tackle these
modern dataflow challenges. modern dataflow challenges.
The core concepts of NiFi == The core concepts of NiFi
-------------------------
NiFi's fundamental design concepts closely relate to the main ideas of Flow Based NiFi's fundamental design concepts closely relate to the main ideas of Flow Based
Programming <<fbp>>. Here are some of Programming <<fbp>>. Here are some of
@ -121,8 +118,7 @@ A few of these benefits include:
* Error handling becomes as natural as the happy-path rather than a coarse grained catch-all * Error handling becomes as natural as the happy-path rather than a coarse grained catch-all
* The points at which data enters and exits the system as well as how it flows through are well understood and easily tracked * The points at which data enters and exits the system as well as how it flows through are well understood and easily tracked
NiFi Architecture == NiFi Architecture
-----------------
image::zero-master-node.png["NiFi Architecture Diagram"] image::zero-master-node.png["NiFi Architecture Diagram"]
NiFi executes within a JVM on a host operating system. The primary NiFi executes within a JVM on a host operating system. The primary
@ -152,8 +148,7 @@ image::zero-master-cluster.png["NiFi Cluster Architecture Diagram"]
Starting with the NiFi 1.0 release, a Zero-Master Clustering paradigm is employed. Each node in a NiFi cluster performs the same tasks on the data, but each operates on a different set of data. Apache ZooKeeper elects a single node as the Cluster Coordinator, and failover is handled automatically by ZooKeeper. All cluster nodes report heartbeat and status information to the Cluster Coordinator. The Cluster Coordinator is responsible for disconnecting and connecting nodes. Additionally, every cluster has one Primary Node, also elected by ZooKeeper. As a DataFlow manager, you can interact with the NiFi cluster through the user interface (UI) of any node. Any change you make is replicated to all nodes in the cluster, allowing for multiple entry points. Starting with the NiFi 1.0 release, a Zero-Master Clustering paradigm is employed. Each node in a NiFi cluster performs the same tasks on the data, but each operates on a different set of data. Apache ZooKeeper elects a single node as the Cluster Coordinator, and failover is handled automatically by ZooKeeper. All cluster nodes report heartbeat and status information to the Cluster Coordinator. The Cluster Coordinator is responsible for disconnecting and connecting nodes. Additionally, every cluster has one Primary Node, also elected by ZooKeeper. As a DataFlow manager, you can interact with the NiFi cluster through the user interface (UI) of any node. Any change you make is replicated to all nodes in the cluster, allowing for multiple entry points.
Performance Expectations and Characteristics of NiFi == Performance Expectations and Characteristics of NiFi
----------------------------------------------------
NiFi is designed to fully leverage the capabilities of the underlying host system NiFi is designed to fully leverage the capabilities of the underlying host system
on which it is operating. This maximization of resources is particularly strong with on which it is operating. This maximization of resources is particularly strong with
regard to CPU and disk. For additional details, see the best practices and configuration tips in the Administration Guide. regard to CPU and disk. For additional details, see the best practices and configuration tips in the Administration Guide.
@ -192,8 +187,7 @@ is afforded by the JVM. JVM garbage collection becomes a very important
factor to both restricting the total practical heap size, as well as optimizing factor to both restricting the total practical heap size, as well as optimizing
how well the application runs over time. NiFi jobs can be I/O intensive when reading the same content regularly. Configure a large enough disk to optimize performance. how well the application runs over time. NiFi jobs can be I/O intensive when reading the same content regularly. Configure a large enough disk to optimize performance.
High Level Overview of Key NiFi Features == High Level Overview of Key NiFi Features
----------------------------------------
This section provides a 20,000-foot view of NiFi's cornerstone fundamentals, so that you can understand the Apache NiFi big picture and some of its most interesting features. The key feature categories include flow management, ease of use, security, extensible architecture, and flexible scaling model. This section provides a 20,000-foot view of NiFi's cornerstone fundamentals, so that you can understand the Apache NiFi big picture and some of its most interesting features. The key feature categories include flow management, ease of use, security, extensible architecture, and flexible scaling model.
Flow Management:: Flow Management::
@ -283,8 +277,7 @@ Flexible Scaling Model::
References == References
----------
[bibliography] [bibliography]
- [[[eip]]] Gregor Hohpe. Enterprise Integration Patterns [online]. Retrieved: 27 Dec 2014, from: http://www.enterpriseintegrationpatterns.com/ - [[[eip]]] Gregor Hohpe. Enterprise Integration Patterns [online]. Retrieved: 27 Dec 2014, from: http://www.enterpriseintegrationpatterns.com/
- [[[soa]]] Wikipedia. Service Oriented Architecture [online]. Retrieved: 27 Dec 2014, from: http://en.wikipedia.org/wiki/Service-oriented_architecture - [[[soa]]] Wikipedia. Service Oriented Architecture [online]. Retrieved: 27 Dec 2014, from: http://en.wikipedia.org/wiki/Service-oriented_architecture
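
The heading changes in this commit follow a mechanical pattern: a title line followed by a row of `=` becomes `= Title`, and a title line followed by a row of `-` becomes `== Title`. As a rough illustration only (not the method the commit author used), the Python sketch below applies that rewrite to a block of text. The function name `convert_headings`, the `tolerance` parameter, and the rule that an underline must be within one character of the title length are assumptions made for this example. Only the two underline characters that appear in this diff are handled.

```python
# Underline character -> single-line heading prefix, as seen in this commit.
UNDERLINE_TO_PREFIX = {"=": "=", "-": "=="}


def convert_headings(text, tolerance=1):
    """Rewrite two-line (underlined) titles as single-line titles.

    `tolerance` is an assumed limit on how far the underline length may
    differ from the title length; it is not taken from the commit.
    """
    lines = text.split("\n")
    out = []
    i = 0
    while i < len(lines):
        title = lines[i]
        underline = lines[i + 1] if i + 1 < len(lines) else ""
        char = underline[:1]
        # An underline is at least two copies of one known character.
        is_underline = (
            char in UNDERLINE_TO_PREFIX
            and len(underline) >= 2
            and underline == char * len(underline)
        )
        # Skip blank lines and lines that start with markup or punctuation.
        looks_like_title = bool(title.strip()) and title[0].isalnum()
        if (
            is_underline
            and looks_like_title
            and abs(len(title) - len(underline)) <= tolerance
        ):
            out.append(f"{UNDERLINE_TO_PREFIX[char]} {title}")
            i += 2  # consume the title and its underline
        else:
            out.append(title)
            i += 1
    return "\n".join(out)


if __name__ == "__main__":
    sample = "Apache NiFi Overview\n====================\n\nLogging In\n---------\n"
    print(convert_headings(sample))
```

The length check is what keeps a `----` listing delimiter after a long paragraph line from being treated as an underline. It can still misfire when a delimiter follows a line of nearly the same length, so the output of a script like this would need to be reviewed by hand.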

@ -14,14 +14,12 @@
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
Apache NiFi RecordPath Guide = Apache NiFi RecordPath Guide
============================
Apache NiFi Team <dev@nifi.apache.org> Apache NiFi Team <dev@nifi.apache.org>
:homepage: http://nifi.apache.org :homepage: http://nifi.apache.org
[[overview]] [[overview]]
Overview == Overview
--------
Apache NiFi offers a very robust set of Processors that are capable of ingesting, processing, Apache NiFi offers a very robust set of Processors that are capable of ingesting, processing,
routing, transforming, and delivering data of any format. This is possible because the NiFi routing, transforming, and delivering data of any format. This is possible because the NiFi
framework itself is data-agnostic. It doesn't care whether your data is a 100-byte JSON message framework itself is data-agnostic. It doesn't care whether your data is a 100-byte JSON message
@ -76,8 +74,7 @@ Enter the NiFi RecordPath language. RecordPath is intended to be a simple, easy-
[[structure]] [[structure]]
Structure of a RecordPath == Structure of a RecordPath
-------------------------
A Record in NiFi is made up of (potentially) many fields, and each of these fields could actually be itself a Record. This means that A Record in NiFi is made up of (potentially) many fields, and each of these fields could actually be itself a Record. This means that
a Record can be thought of as having a hierarchical, or nested, structure. We talk about an "inner Record" as being the child of the a Record can be thought of as having a hierarchical, or nested, structure. We talk about an "inner Record" as being the child of the

@ -14,14 +14,12 @@
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
// //
Apache NiFi User Guide = Apache NiFi User Guide
======================
Apache NiFi Team <dev@nifi.apache.org> Apache NiFi Team <dev@nifi.apache.org>
:homepage: http://nifi.apache.org :homepage: http://nifi.apache.org
Introduction == Introduction
------------
Apache NiFi is a dataflow system based on the concepts of flow-based programming. It supports Apache NiFi is a dataflow system based on the concepts of flow-based programming. It supports
powerful and scalable directed graphs of data routing, transformation, and system mediation logic. NiFi has powerful and scalable directed graphs of data routing, transformation, and system mediation logic. NiFi has
a web-based user interface for design, control, feedback, and monitoring of dataflows. It is highly configurable a web-based user interface for design, control, feedback, and monitoring of dataflows. It is highly configurable
@ -33,8 +31,7 @@ See the link:administration-guide.html[System Administrators Guide] for infor
use a supported web browser to view the UI. use a supported web browser to view the UI.
Browser Support == Browser Support
---------------
[options="header"] [options="header"]
|====================== |======================
|Browser |Version |Browser |Version
@ -65,8 +62,7 @@ In environments where your browser width is less than 800 pixels and the height
UI may become unavailable. UI may become unavailable.
[template="glossary", id="terminology"] [template="glossary", id="terminology"]
Terminology == Terminology
-----------
*DataFlow Manager*: A DataFlow Manager (DFM) is a NiFi user who has permissions to add, remove, and modify components of a NiFi dataflow. *DataFlow Manager*: A DataFlow Manager (DFM) is a NiFi user who has permissions to add, remove, and modify components of a NiFi dataflow.
*FlowFile*: The FlowFile represents a single piece of data in NiFi. A FlowFile is made up of two components: *FlowFile*: The FlowFile represents a single piece of data in NiFi. A FlowFile is made up of two components:
@ -134,8 +130,7 @@ Terminology
[[User_Interface]] [[User_Interface]]
NiFi User Interface == NiFi User Interface
-------------------
The NiFi UI provides mechanisms for creating automated dataflows, as well as visualizing, The NiFi UI provides mechanisms for creating automated dataflows, as well as visualizing,
editing, monitoring, and administering those dataflows. The UI can be broken down into several segments, editing, monitoring, and administering those dataflows. The UI can be broken down into several segments,
@ -180,8 +175,7 @@ breadcrumbs is a link that will take you back up to that level in the flow.
image::nifi-navigation.png["NiFi Navigation"] image::nifi-navigation.png["NiFi Navigation"]
[[UI-with-multi-tenant-authorization]] [[UI-with-multi-tenant-authorization]]
Accessing the UI with Multi-Tenant Authorization == Accessing the UI with Multi-Tenant Authorization
------------------------------------------------
Multi-tenant authorization enables multiple groups of users (tenants) to command, control, and observe different parts of the dataflow, Multi-tenant authorization enables multiple groups of users (tenants) to command, control, and observe different parts of the dataflow,
with varying levels of authorization. When an authenticated user attempts to view or modify a NiFi resource, the system checks whether the with varying levels of authorization. When an authenticated user attempts to view or modify a NiFi resource, the system checks whether the
user has privileges to perform that action. These privileges are defined by policies that you can apply system wide or to individual user has privileges to perform that action. These privileges are defined by policies that you can apply system wide or to individual
@ -223,8 +217,7 @@ If you are unable to view or modify a NiFi resource, contact your System Adminis
link:administration-guide.html[System Administrators Guide] for more information. link:administration-guide.html[System Administrators Guide] for more information.
[[logging-in]] [[logging-in]]
Logging In == Logging In
---------
If NiFi is configured to run securely, users will be able to request access to the DataFlow. For information on configuring NiFi to run If NiFi is configured to run securely, users will be able to request access to the DataFlow. For information on configuring NiFi to run
securely, see the link:administration-guide.html[System Administrators Guide]. If NiFi supports anonymous access, users will be given access securely, see the link:administration-guide.html[System Administrators Guide]. If NiFi supports anonymous access, users will be given access
@ -238,8 +231,7 @@ image::login.png["Log In"]
[[building-dataflow]] [[building-dataflow]]
Building a DataFlow == Building a DataFlow
-------------------
A DFM is able to build an automated dataflow using the NiFi UI. Simply drag components from the toolbar to the canvas, A DFM is able to build an automated dataflow using the NiFi UI. Simply drag components from the toolbar to the canvas,
configure the components to meet specific needs, and connect configure the components to meet specific needs, and connect
@ -2100,8 +2092,7 @@ When switching between implementation "families" (i.e. `VolatileProvenanceReposi
* Corruption -- when a disk is filled or corrupted, there have been reported issues with the repository becoming corrupted and recovery steps are necessary. This is likely to continue to be an issue with the encrypted repository, although still limited in scope to individual records (i.e. an entire repository file won't be irrecoverable due to the encryption). * Corruption -- when a disk is filled or corrupted, there have been reported issues with the repository becoming corrupted and recovery steps are necessary. This is likely to continue to be an issue with the encrypted repository, although still limited in scope to individual records (i.e. an entire repository file won't be irrecoverable due to the encryption).
[[other_management_features]] [[other_management_features]]
Other Management Features == Other Management Features
-------------------------
In addition to the Summary Page, Data Provenance Page, Template Management Page, and Bulletin Board Page, there are In addition to the Summary Page, Data Provenance Page, Template Management Page, and Bulletin Board Page, there are
other tools in the Global Menu (see <<User_Interface>>) that are useful to the DFM. Select Flow Configuration History to view other tools in the Global Menu (see <<User_Interface>>) that are useful to the DFM. Select Flow Configuration History to view