NIFI-3701 Documentation improvements for 1.x.

This closes #1733.

Signed-off-by: Andy LoPresto <alopresto@apache.org>
Andrew Lim 2017-05-02 16:12:58 -04:00 committed by Andy LoPresto
parent 26d90fbccf
commit 580d65dfde
2 changed files with 54 additions and 44 deletions


@@ -462,30 +462,39 @@ Here is an example entry:
After you have edited and saved the 'authorizers.xml' file, restart NiFi. Users and roles from the 'authorized-users.xml' file are converted and added as identities and policies in the 'users.xml' and 'authorizations.xml' files. Once the application starts, users who previously had a legacy Administrator role can access the UI and begin managing users, groups, and policies.
Here is a summary of policies assigned to each legacy role if the NiFi instance has an existing flow.xml.gz:
The following tables summarize the global and component policies assigned to each legacy role if the NiFi instance has an existing 'flow.xml.gz':
===== Global Access Policies
[cols=">s,^s,^s,^s,^s,^s,^s", options="header"]
|==========================
| | Admin | DFM | Monitor | Provenance | NiFi | Proxy
|view the UI |* |* |* | | |
|view the controller |* |* |* | |* |
|modify the controller | |* | | | |
|view system diagnostics | |* |* | | |
|view the dataflow |* |* |* | | |
|modify the dataflow | |* | | | |
|view the users/groups |* | | | | |
|modify the users/groups |* | | | | |
|view policies |* | | | | |
|modify policies |* | | | | |
|access the controller - view |* |* |* | |* |
|access the controller - modify | |* | | | |
|query provenance | | | |* | |
|access restricted components | |* | | | |
|view the data | |* | |* | |*
|modify the data | |* | | | |*
|access all policies - view |* | | | | |
|access all policies - modify |* | | | | |
|access users/user groups - view |* | | | | |
|access users/user groups - modify |* | | | | |
|retrieve site-to-site details | | | | |* |
|send proxy user requests | | | | | |*
|view system diagnostics | |* |* | | |
|proxy user requests | | | | | |*
|access counters | | | | | |
|==========================
For details on the policies in the table, see <<access-policies>>.
===== Component Access Policies on the Root Process Group
[cols=">s,^s,^s,^s,^s,^s,^s", options="header"]
|==========================
| | Admin | DFM | Monitor | Provenance | NiFi | Proxy
|view the component |* |* |* | | |
|modify the component | |* | | | |
|view the data | |* | |* | |*
|modify the data | |* | | | |*
|==========================
For details on the individual policies in the table, see <<access-policies>>.
NOTE: NiFi fails to restart if values exist for both the “Initial Admin Identity” and “Legacy Authorized Users File” properties. You can specify only one of these values to initialize authorizations.
@@ -518,7 +527,7 @@ cn=nifi-2,ou=people,dc=example,dc=com
</authorizers>
----
NOTE: In a cluster, all nodes must have the same 'authorizations.xml'. If a node has a different 'authorizations.xml', it cannot join the cluster. The only exception is if a node has an empty 'authorizations.xml'. In this scenario, the node inherits the 'authorizations.xml' from the cluster.
NOTE: In a cluster, all nodes must have the same 'authorizations.xml' and 'users.xml'. The only exception is if a node has empty 'authorizations.xml' and 'users.xml' files prior to joining the cluster. In this scenario, the node inherits them from the cluster during startup.
Now that initial authorizations have been created, additional users, groups and authorizations can be created and managed in the NiFi UI.
@@ -638,7 +647,7 @@ Component level access policies govern the following component level authorizati
|modify the policies
|Allows users to modify the list of users who can view/modify a component
|retrieve data via site-to-site
|receive data via site-to-site
|Allows a port to receive data from NiFi instances
|send data via site-to-site
@@ -647,6 +656,8 @@ Component level access policies govern the following component level authorizati
NOTE: You can apply access policies to all component types except connections. Connection authorizations are inferred by the individual access policies on the source and destination components of the connection, as well as the access policy of the process group containing the components. This is discussed in more detail in the <<creating-a-connection>> and <<editing-a-connection>> examples below.
NOTE: In order to access List Queue or Delete Queue for a connection, a user requires permission to the "view the data" and "modify the data" policies on the component. In a clustered environment, all nodes must be added to these policies as well, as a user request could be replicated through any node in the cluster.
[[access-policy-inheritance]]
===== Access Policy Inheritance
@@ -1977,7 +1988,7 @@ This cleanup mechanism takes into account only automatically created archived fl
|nifi.authorizer.configuration.file*|This is the location of the file that specifies how authorizers are defined. The default value is ./conf/authorizers.xml.
|nifi.login.identity.provider.configuration.file*|This is the location of the file that specifies how username/password authentication is performed. This file is
only considered if `nifi.security.user.login.identity.provider` is configured with a provider identifier. The default value is ./conf/login-identity-providers.xml.
|nifi.templates.directory*|This is the location of the directory where flow templates are saved. The default value is ./conf/templates.l
|nifi.templates.directory*|This is the location of the directory where flow templates are saved (for backward compatibility only). Templates are stored in the flow.xml.gz starting with NiFi 1.0. The template directory can be used to (bulk) import templates into the flow.xml.gz automatically on NiFi startup. The default value is ./conf/templates.
|nifi.ui.banner.text|This is banner text that may be configured to display at the top of the User Interface. It is blank by default.
|nifi.ui.autorefresh.interval|The interval at which the User Interface auto-refreshes. The default value is 30 sec.
|nifi.nar.library.directory|The location of the nar library. The default value is ./lib and probably should be left as is. +


@@ -834,7 +834,8 @@ image:connection-settings.png["Connection Settings"]
The Connection name is optional. If not specified, the name shown for the Connection will be the names of the Relationships
that are active for the Connection.
File expiration is a concept by which data that cannot be processed in a timely fashion can be automatically removed from the flow.
===== FlowFile Expiration
FlowFile expiration is a concept by which data that cannot be processed in a timely fashion can be automatically removed from the flow.
This is useful, for example, when the volume of data is expected to exceed the volume that can be sent to a remote site.
In this case, the expiration can be used in conjunction with Prioritizers to ensure that the highest priority data is
processed first and then anything that cannot be processed within a certain time period (one hour, for example) can be dropped. The expiration period is based on the time that the data entered the NiFi instance. In other words, if the file expiration on a given connection is set to '1 hour', and a file that has been in the NiFi instance for one hour reaches that connection, it will expire. The default
@@ -842,13 +843,18 @@ value of `0 sec` indicates that the data will never expire. When a file expirati
image:file_expiration_clock.png["File Expiration Indicator"]
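
The expiration rule described above can be sketched in a few lines of Python. This is an illustrative model only, not NiFi's implementation: the function name `is_expired` and its parameters are hypothetical, and all times are plain seconds.

```python
def is_expired(entered_at: float, now: float, expiration_secs: float) -> bool:
    """Return True if a FlowFile should be dropped at this connection.

    entered_at      -- time (seconds) the data entered the NiFi instance
    now             -- current time (seconds)
    expiration_secs -- the connection's expiration period; 0 means never expire
    """
    if expiration_secs == 0:
        # The default of `0 sec` disables expiration.
        return False
    # Age is measured from entry into the instance, not entry into this queue.
    return (now - entered_at) >= expiration_secs
```

With an expiration of one hour (3600 seconds), a FlowFile that entered the instance an hour ago expires as soon as it reaches the connection, even if it has only just arrived in that particular queue.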
===== Back Pressure
NiFi provides two configuration elements for Back Pressure. These thresholds indicate how much data should be
allowed to exist in the queue before the component that is the source of the Connection is no longer scheduled to run.
This allows the system to avoid being overrun with data. The first option provided is the ``Back pressure object threshold.''
This is the number of FlowFiles that can be in the queue before back pressure is applied. The second configuration option
is the ``Back pressure data size threshold.'' This specifies the maximum amount of data (in size) that should be queued up before
applying back pressure. This value is configured by entering a number followed by a data size (`B` for bytes, `KB` for
kilobytes, `MB` for megabytes, `GB` for gigabytes, or `TB` for terabytes). When back pressure is enabled, small progress bars appear on the connection label, so the DFM can see it at-a-glance when looking at a flow on the canvas. The progress bars change color based on the queue percentage: Green (0-60%), Yellow (61-85%) and Red (86-100%).
kilobytes, `MB` for megabytes, `GB` for gigabytes, or `TB` for terabytes).
NOTE: By default, each new connection has a Back Pressure Object Threshold of 10,000 objects and a Back Pressure Data Size Threshold of 1 GB.
When back pressure is enabled, small progress bars appear on the connection label, so the DFM can see it at-a-glance when looking at a flow on the canvas. The progress bars change color based on the queue percentage: Green (0-60%), Yellow (61-85%) and Red (86-100%).
image:back_pressure_indicators.png["Back Pressure Indicator Bars"]
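
As a rough illustration of the color ranges above, the following Python sketch maps a queue's fill level to an indicator color. It is not taken from the NiFi code base: the function name `back_pressure_color` is hypothetical, and the handling of fractional percentages (anything above 60 counts as yellow, anything above 85 as red) is an assumption, since the text only lists whole-number ranges.

```python
def back_pressure_color(queued: int, threshold: int) -> str:
    """Return the indicator color for a queue relative to its threshold.

    queued    -- number of FlowFiles (or bytes) currently in the queue
    threshold -- the configured back pressure threshold, in the same unit
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    # Cap at 100% so an over-full queue is still reported as red.
    percent = min(100.0, 100.0 * queued / threshold)
    if percent <= 60:
        return "green"
    if percent <= 85:
        return "yellow"
    return "red"
```

For example, with an object threshold of 10,000, a queue holding 6,000 FlowFiles would be green and one holding 9,000 would be red. The same function could be applied to the data size threshold by passing byte counts instead.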
@@ -860,6 +866,7 @@ When the queue is completely full, the Connection is highlighted in red.
image:back_pressure_full.png["Back Pressure Queue Full"]
===== Prioritization
The right-hand side of the tab provides the ability to prioritize the data in the queue so that higher priority data is
processed first. Prioritizers can be dragged from the top (`Available prioritizers') to the bottom (`Selected prioritizers').
Multiple prioritizers can be selected. The prioritizer that is at the top of the `Selected prioritizers' list is the highest
@@ -874,6 +881,7 @@ The following prioritizers are available:
- *OldestFlowFileFirstPrioritizer*: Given two FlowFiles, the one that is oldest in the dataflow will be processed first. 'This is the default scheme that is used if no prioritizers are selected.'
- *PriorityAttributePrioritizer*: Given two FlowFiles that both have a "priority" attribute, the one that has the highest priority value will be processed first. Note that an UpdateAttribute processor should be used to add the "priority" attribute to the FlowFiles before they reach a connection that has this prioritizer set. Values for the "priority" attribute may be alphanumeric, where "a" is a higher priority than "z", and "1" is a higher priority than "9", for example.
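
The ordering described for the PriorityAttributePrioritizer can be approximated with the Python sketch below. It is a simplification for illustration, not NiFi's implementation: FlowFiles are modeled as plain dictionaries, the helper name `order_by_priority` is hypothetical, values are compared as plain strings, and placing FlowFiles without a "priority" attribute last is an assumption.

```python
def order_by_priority(flowfiles):
    """Return FlowFiles in processing order, highest priority first.

    Each FlowFile is a dict of attributes. A smaller "priority" string
    means a higher priority, so "a" sorts before "z" and "1" before "9".
    """
    def sort_key(flowfile):
        priority = flowfile.get("priority")
        # Assumption: FlowFiles lacking the attribute go to the end.
        return (priority is None, priority or "")

    # sorted() is stable, so FlowFiles with equal priority keep their order.
    return sorted(flowfiles, key=sort_key)
```

Given FlowFiles with "priority" values of "z", "m" and "a", this returns them in the order "a", "m", "z".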
===== Changing Configuration and Context Menu Options
After a connection has been drawn between two components, the connection's configuration may be changed, and the connection may be moved to a new destination; however, the processors on either side of the connection must be stopped before a configuration or destination change may be made.
image:nifi-connection.png["Connection"]
@@ -1007,35 +1015,26 @@ value entered is not valid, the Remote Process Group will not be valid and will
==== Configure Site-to-Site server NiFi instance
[[Site-to-Site_Input_Port]]
*Input Port*: In order to allow another NiFi instance to push data to your local instance, you can simply drag an <<input_port,Input Port>> onto the Root Process Group
of your canvas. After entering a name for the port, it will be added to your flow. You can now right-click on the Input Port and choose Configure in order
to adjust the name and the number of concurrent tasks that are used for the port. If Site-to-Site is configured to run securely, you will also be given
the ability to adjust who has access to the port. If secure, only those who have been granted access to communicate with the port will be able to see
that the port exists.
*Retrieve Site-to-Site Details*: If your NiFi is running securely, in order for another NiFi instance to retrieve information from your instance, it needs to be added to the Global Access "retrieve site-to-site details" policy. This will allow the other instance to query your instance for details such as name, description, available peers (nodes when clustered), statistics, OS port information and available Input and Output ports. Utilizing Input and Output ports in a secured instance requires additional policy configuration as described below.
After being given access to a particular port, in order to see that port, the operator of a remote NiFi instance may need to right-click on their Remote
Process Group and choose to "Refresh" the flow.
[[Site-to-Site_Input_Port]]
*Input Port*: In order to allow another NiFi instance to push data to your local instance, you can simply drag an <<input_port,Input Port>> onto the Root Process Group of your canvas. After entering a name for the port, it will be added to your flow. You can now right-click on the Input Port and choose Configure in order to adjust the name and the number of concurrent tasks that are used for the port.
If Site-to-Site is configured to run securely, you will need to manage the port's "receive data via site-to-site" component access policy. Only those users who have been added to the policy will be able to communicate with the port.
[[Site-to-Site_Output_Port]]
*Output Port*: Similar to an Input Port, a DataFlow Manager may choose to add an <<output_port,Output Port>> to the Root Process Group. The Output Port allows an
authorized NiFi instance to remotely connect to your instance and pull data from the Output Port. Configuring the Output Port will again allow the
DFM to control how many concurrent tasks are allowed, as well as which NiFi instances are authorized to pull data from the instance being configured.
*Output Port*: Similar to an Input Port, a DataFlow Manager may choose to add an <<output_port,Output Port>> to the Root Process Group. The Output Port allows an authorized NiFi instance to remotely connect to your instance and pull data from the Output Port. Configuring the Output Port and managing the port's access policies will again allow the DFM to control how many concurrent tasks are allowed, as well as which users are authorized to pull data from the instance being configured.
In addition to other instances of NiFi, some other applications may use a Site-to-Site client in order to push data to or receive data from a NiFi instance.
For example, NiFi provides an Apache Storm spout and an Apache Spark Receiver that are able to pull data from NiFi's Root Group Output Ports.
[[Site-to-Site_Access_Control]]
*Access Control*: If your instance of NiFi is running securely, the first time that a client establishes a connection to your instance, the client will be forbidden and
a request for an account for that client will automatically be generated. The client will need to be granted the 'NiFi' role in order to communicate
via Site-to-Site. For more information on managing user accounts, see the
link:administration-guide.html#controlling-levels-of-access[Controlling Levels of Access]
section of the link:administration-guide.html[Admin Guide].
In addition to other instances of NiFi, some other applications may use a Site-to-Site client in order to push data to or receive data from a NiFi instance. For example, NiFi provides an Apache Storm spout and an Apache Spark Receiver that are able to pull data from NiFi's Root Group Output Ports.
For information on how to enable and configure Site-to-Site on a NiFi instance, see the
link:administration-guide.html#site_to_site_properties[Site-to-Site Properties] section of the
link:administration-guide.html[System Administrators Guide].
For information on how to configure access policies, see the
link:administration-guide.html#access-policies[Access Policies] section of the
link:administration-guide.html[System Administrators Guide].
=== Example Dataflow