mirror of https://github.com/apache/nifi.git
NIFI-13854 Updated Getting Started Guide for 2.0.0 (#9362)
Signed-off-by: David Handermann <exceptionfactory@apache.org>
parent 8e867be875
commit fa2a01c823
@@ -16,7 +16,7 @@
//
= Getting Started with Apache NiFi
Apache NiFi Team <dev@nifi.apache.org>
:homepage: https://nifi.apache.org
:linkattrs:

@@ -60,16 +60,8 @@ dataflows.

WARNING: Before proceeding, check the Admin Guide to confirm you have the <<administration-guide.adoc#system_requirements,minimum system requirements>> to install and run NiFi.

NiFi can be downloaded from the link:https://nifi.apache.org/download/[NiFi Downloads page^]. After downloading the version of NiFi
that you would like to use, simply extract the zip archive to the location that you wish to run the application from.
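
For example, a minimal sketch of that step on Linux or macOS, assuming the 2.0.0 binary zip was downloaded to the current directory (adjust the file name and target path to the release you actually downloaded):

[source,shell]
----
# Extract the downloaded archive and move into the resulting directory
unzip nifi-2.0.0-bin.zip -d /opt
cd /opt/nifi-2.0.0
----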

For information on how to configure the instance of NiFi (for example, to configure security, data storage
configuration, or the port that NiFi is running on), see the link:administration-guide.html[Admin Guide].
@@ -83,7 +75,7 @@ appropriate for your operating system.

=== For Windows Users

For Windows users, navigate to the folder where NiFi was installed. Within this folder is a subfolder
named `bin`. Navigate to this subfolder and run `nifi.cmd start`.

This will launch NiFi and leave it running in the foreground. To shut down NiFi, select the window that
was launched and hold the Ctrl key while pressing C.
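
For example, a quick sketch from a Command Prompt, assuming NiFi was extracted to `C:\nifi-2.0.0` (adjust the path to your installation):

[source,shell]
----
cd C:\nifi-2.0.0\bin
nifi.cmd start
----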
@@ -104,18 +96,6 @@ be used.

If NiFi was installed with Homebrew, run the command `nifi start` or `nifi stop` from anywhere in your file system to start or stop NiFi.
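
For example (a sketch, assuming the Homebrew-managed `nifi` launcher is on your `PATH`):

[source,shell]
----
nifi start
nifi stop
----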

== I Started NiFi. Now What?

@@ -188,14 +168,16 @@ for the Processor. The properties that are available depend on the type of Proce

for each type. Properties that are in bold are required properties. The Processor cannot be started until all required
properties have been configured. The most important property to configure for GetFile is the directory from which
to pick up files. If we set the directory name to `./data-in`, this will cause the Processor to start picking up
any data in the `data-in` subdirectory of the NiFi Home directory.
We can choose to configure several different Properties for this Processor.
If unsure what a particular Property does, we can hover over the Info icon (
image:iconInfo2.png["Info"]
)
next to the Property Name with the mouse in order to read a description of the property. Additionally, the
tooltip that is displayed will provide the default value for that property if one exists,
information about whether the property supports the Expression Language (see the <<ExpressionLanguage>> section below),
whether the property is sensitive and will be encrypted at rest, and a history of previously configured values for that property.

In order for this property to be valid, create a directory named `data-in` in the NiFi home directory and then
click the `Ok` button to close the dialog.
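
A quick sketch of creating that directory and dropping in a test file, assuming a `NIFI_HOME` environment variable points at your NiFi installation directory (the variable name is just an illustration):

[source,shell]
----
# NIFI_HOME is assumed to point at the NiFi installation directory
mkdir "$NIFI_HOME/data-in"
echo "hello nifi" > "$NIFI_HOME/data-in/example.txt"
----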
@@ -220,7 +202,7 @@ transfers to the `success` Relationship.

In order to address this, let's add another Processor that we can connect the GetFile Processor to, by following
the same steps above. This time, however, we will simply log the attributes that exist for the FlowFile. To do this,
we will add a LogAttribute Processor.

We can now send the output of the GetFile Processor to the LogAttribute Processor. Hover over the GetFile Processor
with the mouse and a Connection Icon (
@@ -256,8 +238,8 @@ image:iconStop.png[Stopped]

). The LogAttribute Processor, however, is now invalid because its `success` Relationship has not been connected to
anything. Let's address this by signaling that data that is routed to `success` by LogAttribute should be "Auto Terminated,"
meaning that NiFi should consider the FlowFile's processing complete and "drop" the data. To do this, we configure the
LogAttribute Processor. On the Relationships tab, we can check the `terminate` box next to the `success` Relationship
to Auto Terminate the data. Clicking the `Apply` button will close the dialog and show that both Processors are now stopped.
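
Once both Processors are started (covered in the next section), one quick way to confirm that data is flowing is to watch the application log, where LogAttribute writes its output. A sketch, assuming the default logging configuration and that your shell is in the NiFi installation directory:

[source,shell]
----
# Follow the application log to see the attributes that LogAttribute emits
tail -f logs/nifi-app.log
----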

=== Starting and Stopping Processors
@@ -281,10 +263,11 @@ corner of the Processor, but nothing is shown there if there are currently no ta

With each Processor having the ability to expose multiple different Properties and Relationships, it can be challenging
to remember how all of the different pieces work for each Processor. To address this, you are able to right-click
on a Processor and choose the `View Documentation` menu item. This will provide you with the Processor's usage information, such as a
description of the Processor, the different Relationships that are available, when the different Relationships are used,
Properties that are exposed by the Processor and their documentation, as well as which FlowFile Attributes (if any) are
expected on incoming FlowFiles and which Attributes (if any) are added to outgoing FlowFiles. Some Processors also document
the specific configurations needed to accomplish common use cases.

=== Other Components
@@ -310,7 +293,7 @@ categorizing them by their functions.

=== Data Transformation
- *CompressContent*: Compress or Decompress Content
- *ConvertCharacterSet*: Convert the character set used to encode the content from one character set to another
- *EncryptContentAge* / *EncryptContentPGP*: Encrypt or Decrypt Content
- *ReplaceText*: Use Regular Expressions to modify textual Content
- *TransformXml*: Apply an XSLT transform to XML Content
- *JoltTransformJSON*: Apply a JOLT specification to transform JSON Content
@@ -318,7 +301,7 @@ categorizing them by their functions.

=== Routing and Mediation
- *ControlRate*: Throttle the rate at which data can flow through one part of the flow
- *DetectDuplicate*: Monitor for duplicate FlowFiles, based on some user-defined criteria. Often used in conjunction
with CryptographicHashContent
- *DistributeLoad*: Load balance or sample data by distributing only a portion of data to each user-defined Relationship
- *MonitorActivity*: Sends a notification when a user-defined period of time elapses without any data coming through a particular
point in the flow. Optionally send a notification when dataflow resumes.
@@ -335,8 +318,6 @@ categorizing them by their functions.

=== Database Access
- *ExecuteSQL*: Executes a user-defined SQL SELECT command, writing the results to a FlowFile in Avro format
- *PutSQL*: Updates a database by executing the SQL DML statement defined by the FlowFile's content

[[AttributeExtraction]]
=== Attribute Extraction
@@ -348,8 +329,7 @@ categorizing them by their functions.

Content or extract the value into the user-named Attribute.
- *ExtractText*: User supplies one or more Regular Expressions that are then evaluated against the textual content of the FlowFile, and the
values that are extracted are then added as user-named Attributes.
- *CryptographicHashContent*: Performs a hashing function against the content of a FlowFile and adds the hash value as an Attribute.
- *IdentifyMimeType*: Evaluates the content of a FlowFile in order to determine what type of file the FlowFile encapsulates. This Processor is
capable of detecting many different MIME Types, such as images, word processor documents, text, and compression formats just to name
a few.
@@ -374,12 +354,9 @@ categorizing them by their functions.

the data from one location to another location and is not to be used for copying the data.
- *GetSFTP*: Downloads the contents of a remote file via SFTP into NiFi and then deletes the original file. This Processor is expected to move
the data from one location to another location and is not to be used for copying the data.
- *ConsumeJMS*: Downloads a message from a JMS Queue or Topic and creates a FlowFile based on the contents of the JMS message. The JMS Properties are
optionally copied over as Attributes, as well. This Processor also supports durable topic subscriptions.
- *InvokeHTTP*: Can download data from a remote HTTP server. See the <<HTTP>> section below.
- *ListenHTTP*: Starts an HTTP (or HTTPS) Server and listens for incoming connections. For any incoming POST request, the contents of the request
are written out as a FlowFile, and a 200 response is returned.
- *ListenUDP*: Listens for incoming UDP packets and creates a FlowFile per packet or per bundle of packets (depending on configuration) and
@@ -387,16 +364,16 @@ categorizing them by their functions.

- *GetHDFS*: Monitors a user-specified directory in HDFS. Whenever a new file enters HDFS, it is copied into NiFi and deleted from HDFS. This
Processor is expected to move the file from one location to another location and is not to be used for copying the data. This Processor is also
expected to be run On Primary Node only, if run within a cluster. In order to copy the data from HDFS and leave it intact, or to stream the data
from multiple nodes in the cluster, see the ListHDFS Processor. _HDFS components are available via NiFi plugin extension._
- *ListHDFS* / *FetchHDFS*: ListHDFS monitors a user-specified directory in HDFS and emits a FlowFile containing the filename for each file that it
encounters. It then persists this state across the entire NiFi cluster by way of a Distributed Cache. These FlowFiles can then be fanned out across
the cluster and sent to the FetchHDFS Processor, which is responsible for fetching the actual content of those files and emitting FlowFiles that contain
the content fetched from HDFS. _HDFS components are available via NiFi plugin extension._
- *FetchS3Object*: Fetches the contents of an object from the Amazon Web Services (AWS) Simple Storage Service (S3). The outbound FlowFile contains the contents
received from S3.
- *ConsumeKafka*: Fetches messages from Apache Kafka. The messages can be emitted as a FlowFile per message or can be batched together using a user-specified delimiter.
- *GetMongo*: Executes a user-specified query against MongoDB and writes the contents to a new FlowFile.
- *ConsumeTwitter*: Allows Users to register a filter to listen to the X/Twitter "garden hose" or Enterprise endpoint, creating a FlowFile for each post
that is received.

=== Data Egress / Sending Data
@@ -404,12 +381,13 @@ categorizing them by their functions.

- *PutFile*: Writes the contents of a FlowFile to a directory on the local (or network attached) file system.
- *PutFTP*: Copies the contents of a FlowFile to a remote FTP Server.
- *PutSFTP*: Copies the contents of a FlowFile to a remote SFTP Server.
- *InvokeHTTP*: Sends the contents of a FlowFile to a remote HTTP server. See the <<HTTP>> section below.
- *PublishJMS*: Sends the contents of a FlowFile as a JMS message to a JMS broker, optionally adding JMS Properties based on Attributes.
- *PutSQL*: Executes the contents of a FlowFile as a SQL DML statement (INSERT, UPDATE, or DELETE). The contents of the FlowFile must be a valid
SQL statement. Attributes can be used as parameters so that the contents of the FlowFile can be a parameterized SQL statement, in order to avoid
SQL injection attacks (see the sketch after this list).
- *PublishKafka*: Sends the contents of a FlowFile as a message to Apache Kafka. The FlowFile can be sent as a single message or a delimiter, such as a
new-line, can be specified in order to send many messages for a single FlowFile.
- *PutMongo*: Sends the contents of a FlowFile to Mongo as an INSERT or an UPDATE.

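For illustration, a sketch of how a parameterized statement for PutSQL might look, assuming a hypothetical `users(id, name)` table; the `sql.args.N.type` and `sql.args.N.value` attribute names follow the convention described in the PutSQL documentation, with types given as JDBC type codes:

----
# FlowFile content (the statement to execute):
INSERT INTO users (id, name) VALUES (?, ?)

# FlowFile attributes supplying the parameters:
sql.args.1.type  = 4        # JDBC type 4 = INTEGER
sql.args.1.value = 42
sql.args.2.type  = 12       # JDBC type 12 = VARCHAR
sql.args.2.value = alice
----
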
=== Splitting and Aggregation
@@ -433,18 +411,12 @@ categorizing them by their functions.

- *SplitContent*: Splits a single FlowFile into potentially many FlowFiles, similarly to SegmentContent. However, with SplitContent, the splitting
is not performed on arbitrary byte boundaries but rather a byte sequence is specified on which to split the content.

[[HTTP]]
=== HTTP
- *ListenHTTP*: Starts an HTTP (or HTTPS) Server and listens for incoming connections. For any incoming POST request, the contents of the request
are written out as a FlowFile, and a 200 response is returned. (See the sketch after this list.)
- *InvokeHTTP*: Can send a wide variety of HTTP Requests to a server, as configured by the user. A GET request can download data from an HTTP server.
A POST request can send the contents of a FlowFile in the body of the request to an HTTP server.
- *HandleHttpRequest* / *HandleHttpResponse*: The HandleHttpRequest Processor is a Source Processor that starts an embedded HTTP(S) server
similarly to ListenHTTP. However, it does not send a response to the client. Instead, the FlowFile is sent out with the body of the HTTP request
as its contents and attributes for all of the typical Servlet parameters, headers, etc. as Attributes. The HandleHttpResponse then is able to
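
As a quick sketch of sending data into ListenHTTP from the command line, assuming the Processor was configured with a Listening Port of `8081` and a Base Path of `contentListener` (both are Processor properties; adjust to your configuration):

[source,shell]
----
# POST a small payload; ListenHTTP writes the body out as a FlowFile
curl -X POST --data 'hello nifi' http://localhost:8081/contentListener
----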
@@ -522,8 +494,8 @@ While this may seem confusing at first, the section below on <<ExpressionLanguag

here.

In addition to always adding a defined set of Attributes, the UpdateAttribute Processor has an Advanced UI that allows the user
to configure a set of rules governing which Attributes should be added, and when. To access this capability, right-click on the UpdateAttribute
Processor and choose the `Advanced` menu item. This will provide a UI that is tailored specifically
to this Processor, rather than the simple Properties table that is provided for all Processors. Within this UI, the user is able
to configure a rules engine, essentially, specifying rules that must match in order to have the configured Attributes added
to the FlowFile.
@@ -553,10 +525,10 @@ to that Relationship. All other FlowFiles will be routed to 'unmatched'.

As we extract Attributes from FlowFiles' contents and add user-defined Attributes, they don't do us much good as an operator unless
we have some mechanism by which we can use them. The NiFi Expression Language allows us to access and manipulate FlowFile Attribute
values as we configure our flows. Not all Processor properties allow the Expression Language to be used, but many do. In order to
determine whether or not a property supports the Expression Language, a user can hover over the Info icon (
image:iconInfo2.png["Info"]
) in the Properties tab of the Processor Configure dialog. This will provide a tooltip that shows a description of the property
and whether the property supports the Expression Language.

For properties that do support the Expression Language, it is used by adding an expression within the opening `${` tag and the closing
`}` tag. An expression can be as simple as an attribute name. For example, to reference the `uuid` Attribute, we can simply use the
@@ -601,9 +573,9 @@ are currently queued across the entire flow, as well as the total size of those

If the NiFi instance is in a cluster, we will also see an indicator here telling us how many nodes are in the cluster and how many are currently
connected. In this case, the number of active threads and the queue size reflect the sum across all nodes that are currently connected.
It is important to note that this active thread count only captures threads used by Processors that are on the graph.
When broken down by node in the cluster (Global Menu -> Cluster), the active thread count is more comprehensive and includes these plus any
other threads (Input and Output Ports, Funnels, Remote Process Groups, Reporting Tasks, etc.).

=== Component Statistics

@@ -612,10 +584,10 @@ by the component. These statistics provide information about how much data has b

window and allows us to see things like the number of FlowFiles that have been consumed by a Processor, as well as the number of FlowFiles
that have been emitted by the Processor.

The connections between Processors also expose several statistics about items that pass through the connection.

It may also be valuable to see historical values for these metrics and, if clustered, how the different nodes compare to one another.
In order to see this information, we can right-click on a component and choose the `View Status History` menu item. This will show us a graph that spans
the time since NiFi was started, or up to 24 hours, whichever is less. The amount of time that is shown here can be extended or reduced
by changing the configuration in the properties file.

@@ -656,9 +628,9 @@ choose which Attributes will be important to your specific dataflows and make th

[[EventDetails]]
=== Event Details
Once we have performed our search, our table will be populated only with the events that match the search criteria. From here, we
can click the kebab icon (
image:iconKebab.png["Menu"]
) on the right-hand side of the table and choose `View Details` for that event:

image:event-details.png[Event Details]

@@ -692,10 +664,10 @@ this iterative development of the flow until it is processing the data exactly a

=== Lineage Graph

In addition to viewing the details of a Provenance event, we can also view the lineage of the FlowFile involved.
Click the kebab icon (
image:iconKebab.png["Menu"]
) on the right-hand side of the table and choose `Show Lineage` for that event.
This provides us with a graphical representation of exactly what happened to that piece of data as it traversed the system:

image:lineage-graph-annotated.png[Lineage Graph]
@@ -722,7 +694,7 @@ addition to this Getting Started Guide:

lengthy discussions of all of the different components that comprise the application. This guide is written with the NiFi Operator as its
audience. It provides information on each of the different components available in NiFi and explains how to use the different features
provided by the application.
- link:administration-guide.html[Administrator's Guide] - A guide for setting up and administering Apache NiFi for production environments.
This guide provides information about the different system-level settings, such as setting up clusters of NiFi and securing access to the
web UI and data.
- link:expression-language-guide.html[Expression Language Guide] - A far more exhaustive guide for understanding the Expression Language than
@@ -734,12 +706,10 @@ addition to this Getting Started Guide:

- link:https://cwiki.apache.org/confluence/display/NIFI/Contributor+Guide[Contributor's Guide^] - A guide explaining how to contribute
work back to the Apache NiFi community so that others can make use of it.

In addition to the guides provided here, you can browse the different
link:https://nifi.apache.org/community/contact/[NiFi Mailing Lists^] or send an e-mail to one of the mailing lists at
link:mailto:users@nifi.apache.org[users@nifi.apache.org] or
link:mailto:dev@nifi.apache.org[dev@nifi.apache.org].

Many of the members of the NiFi community are available on link:https://apachenifi.slack.com[Apache NiFi on Slack^]
and also actively monitor X/Twitter for posts that mention @apachenifi.