NIFI-13171 Removed nifi-kafka-connector in nifi-external module

This closes #8800

Signed-off-by: David Handermann <exceptionfactory@apache.org>
Joseph Witt 2024-05-09 16:33:38 -07:00 committed by exceptionfactory
parent 44f6e2b833
commit 87776e369e
30 changed files with 0 additions and 4371 deletions


@@ -40,8 +40,6 @@ env:
-pl -:minifi-integration-tests
-pl -:minifi-assembly
-pl -:nifi-assembly
-pl -:nifi-kafka-connector-assembly
-pl -:nifi-kafka-connector-tests
-pl -:nifi-toolkit-assembly
-pl -:nifi-registry-assembly
-pl -:nifi-registry-toolkit-assembly


@@ -1,487 +0,0 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
# Introduction
Apache NiFi is a very powerful tool for authoring and running dataflows. It provides many capabilities that are necessary for large-scale
enterprise deployments, such as data persistence and resilience, data lineage and traceability, and multi-tenancy. All of this, however, requires
that an administrator be responsible for ensuring that the NiFi process is running and operational, and generally, adding more capabilities
results in more complexity.
There are times, however, when users don't need all of the power of NiFi and would like to run in a much simpler form factor. A common use case
is to use NiFi to pull data from many different sources, perform manipulations (e.g., convert JSON to Avro), filter some records, and then to publish
the data to Apache Kafka. Another common use case is to pull data from Apache Kafka, perform some manipulations and filtering, and then publish the
data elsewhere.
For deployments where NiFi acts only as a bridge into and out of Kafka, it may be simpler to operationalize such a deployment by having the dataflow
run within Kafka Connect.
The NiFi Kafka Connector allows users to do just that!
# Stateless NiFi
When a dataflow is to be run within Kafka Connect, it is run using the Stateless NiFi dataflow engine. For more information, see the README of the
Stateless NiFi module.
Stateless NiFi differs from the standard NiFi engine in a few ways. For one, Stateless NiFi is an engine that is designed to be embedded. This makes it
very convenient to run within the Kafka Connect framework.
Stateless NiFi does not provide a user interface (UI) or a REST API and does not support modifying the dataflow while it is running.
Stateless NiFi does not persist FlowFile content to disk but rather holds the content in memory. Stateless NiFi operates on data in a First-In-First-Out order,
rather than using data prioritizers. Dataflows built for Stateless NiFi should have a single source and a single destination (or multiple destinations, if
data is expected to be routed to exactly one of them, such as a 'Failure' destination and a 'Success' destination).
Stateless NiFi does not currently provide access to data lineage/provenance.
Stateless NiFi does not support cyclic graphs. While it is common and desirable in standard NiFi deployments to have a 'failure' relationship from a Processor
route back to the same processor, this can result in a StackOverflowException in Stateless NiFi. The preferred approach in Stateless NiFi is to create
an Output Port for failures and route the data to that Output Port.
# Configuring Stateless NiFi Connectors
NiFi supports two different Kafka Connectors: a source connector and a sink connector. Each of these has many of the same configuration
elements, but there are some differences. The configuration for each of these connectors is described below.
In order to build a flow that will be run in Kafka Connect, the dataflow must first be built. This is accomplished using a standard deployment
of NiFi. So, while NiFi does not have to be deployed in a production environment in order to use the Kafka Connector, it is necessary in a development
environment for building the actual dataflow that is to be deployed.
To build a flow for running in Kafka Connect, the flow should be built within a Process Group. Once ready, the Process Group can be exported by right-clicking
on the Process Group and clicking "Download flow" or by saving the flow to a NiFi Registry instance.
## Source Connector
The NiFi Source Connector is responsible for obtaining data from one source and delivering that data to Kafka. The dataflow should not attempt to
deliver directly to Kafka itself via a Processor such as PublishKafka. Instead, the data should be routed to an Output Port. Any FlowFile delivered
to that Output Port will be obtained by the connector and delivered to Kafka. It is important to note that each FlowFile is delivered as a single Kafka
record. Therefore, if a FlowFile contains thousands of JSON records totaling 5 MB, for instance, it will be important to split those FlowFiles up into
individual Records using a SplitRecord processor before transferring the data to the Output Port. Otherwise, this would result in a single Kafka record
that is 5 MB.
In order to deploy the Source Connector, the nifi-kafka-connector-<version>-bin.tar.gz must first be unpacked into the directory where Kafka Connect
is configured to load connectors from. For example:
```
cd /opt/kafka/kafka-connectors
tar xzvf ~/Downloads/nifi-kafka-connector-1.13.0-bin.tar.gz
```
At this point, if Kafka Connect is already running, it must be restarted in order to pick up the new Connector. This connector supplies both the
NiFi Source Connector and the NiFi Sink Connector.
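How Kafka Connect is restarted depends on how it is deployed. As a rough sketch, assuming a distributed worker started from the scripts that ship with Kafka:
```
# Stop the existing Connect worker process first (for example, via your service manager), then
# start it again so that it re-scans its plugin directories and picks up the NiFi connector.
bin/connect-distributed.sh config/connect-distributed.properties
```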
Next, we must create a configuration JSON that tells Kafka Connect how to deploy an instance of the Connector. This is the standard Kafka Connect
JSON configuration. However, it does require a few different configuration elements specific to the dataflow. Let us consider a dataflow that listens
for Syslog events, converts them into JSON, and then delivers the JSON to Kafka. The dataflow would consist of a single ListenSyslog Processor, with
the "success" relationship going to an Output Port with name `Syslog Messages`.
After creating this simple dataflow, we must place the dataflow in a location where the Kafka Connect connector is able to retrieve it. We could simply
copy the file to each Connect node. Or, we could host the dataflow somewhere that it can be pulled by each Connect instance. To download the flow, right-click
on the Process Group containing our dataflow and choose "Download flow" from the menu. From there, we can upload the JSON to Github as a Gist, for example.
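Whichever hosting option is chosen, it is worth verifying that each Connect node can actually reach the flow definition. For example, using the illustrative Gist URL from the configuration shown below:
```
curl -s https://gist.githubusercontent.com/user/6123f4b890f402c0b512888ccc92537e/raw/5b41a9eb6db09c54a4d325fec78a6e19c9abf9f2/syslog-to-kafka.json | head
```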
### Configuring Source Connector
An example configuration JSON to run this dataflow would then look like this (note that this cannot be copied and pasted as JSON as is,
as it includes annotations (1), (2), etc. for illustrative purposes):
```
{
"name": "syslog-to-kafka",
"config": {
(1) "connector.class": "org.apache.nifi.kafka.connect.StatelessNiFiSourceConnector",
(2) "key.converter": "org.apache.kafka.connect.storage.StringConverter",
(3) "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
(4) "tasks.max": "1",
(5) "name": "syslog-to-kafka",
(6) "working.directory": "./working/stateless",
(7) "topics": "syslog-gateway-json",
(8) "nexus.url": "https://repo1.maven.org/maven2/",
(9) "flow.snapshot": "https://gist.githubusercontent.com/user/6123f4b890f402c0b512888ccc92537e/raw/5b41a9eb6db09c54a4d325fec78a6e19c9abf9f2/syslog-to-kafka.json",
(10) "key.attribute": "syslog.hostname",
(11) "output.port": "Syslog Messages",
(12) "header.attribute.regex": "syslog.*",
(13) "krb5.file": "/etc/krb5.conf",
(14) "dataflow.timeout": "30 sec",
(15) "parameter.Syslog Port": "19944",
(16) "extensions.directory": "/tmp/stateless-extensions"
}
}
```
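For convenience, here is the same configuration with the annotation markers removed, so that it can be submitted to Kafka Connect directly (all values are illustrative and should be adjusted for your environment):
```
{
  "name": "syslog-to-kafka",
  "config": {
    "connector.class": "org.apache.nifi.kafka.connect.StatelessNiFiSourceConnector",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "tasks.max": "1",
    "name": "syslog-to-kafka",
    "working.directory": "./working/stateless",
    "topics": "syslog-gateway-json",
    "nexus.url": "https://repo1.maven.org/maven2/",
    "flow.snapshot": "https://gist.githubusercontent.com/user/6123f4b890f402c0b512888ccc92537e/raw/5b41a9eb6db09c54a4d325fec78a6e19c9abf9f2/syslog-to-kafka.json",
    "key.attribute": "syslog.hostname",
    "output.port": "Syslog Messages",
    "header.attribute.regex": "syslog.*",
    "krb5.file": "/etc/krb5.conf",
    "dataflow.timeout": "30 sec",
    "parameter.Syslog Port": "19944",
    "extensions.directory": "/tmp/stateless-extensions"
  }
}
```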
The first two elements, `name` and `config`, are standard for all Kafka Connect deployments. Within the `config` element are several different fields that
will be explained here.
`(1) connector.class`: This is the name of the class to use for the Connector. The given value indicates that we want to use Stateless NiFi as a source for
our data. If the desire was instead to publish data from Kafka to another destination, we would use the value `org.apache.nifi.kafka.connect.StatelessNiFiSinkConnector`.
`(2) key.converter`: This specifies how to interpret the Kafka key. When the `key.attribute` field is specified, as in `(10)` above, the value of the named
FlowFile attribute is used as the Kafka message key. In this example, the value of the "syslog.hostname" FlowFile attribute in NiFi
will be used as the Kafka message key.
`(3) value.converter`: Specifies how to convert the payload of the NiFi FlowFile into bytes that can be written to Kafka. Generally, with Stateless NiFi, it is
recommended to use the `value.converter` of `org.apache.kafka.connect.converters.ByteArrayConverter` as NiFi already has the data serialized as a byte array
and is very adept at formatting the data as it needs to be.
`(4) tasks.max`: The maximum number of tasks/threads to use to run the dataflow. Unlike the standard NiFi engine, with Stateless NiFi, the entire dataflow is run from start
to finish with a single thread. However, multiple threads can be used to run multiple copies of the dataflow, each with their own data.
`(5) name`: The name of the connector instance. This should match the top-level `name` field.
`(6) working.directory`: Optional. Specifies a directory on the Connect server that NiFi should use for unpacking extensions that it needs to perform the dataflow.
If not specified, defaults to `/tmp/nifi-stateless-working`.
`(7) topics`: The name of the topic to deliver data to. All FlowFiles will be delivered to the topic whose name is specified here. However, it may be
advantageous to determine the topic individually for each FlowFile. To achieve this, simply ensure that the dataflow specifies the topic name in an attribute,
and then use `topic.name.attribute` to specify the name of that attribute instead of `topics`. For example, if we wanted a separate Kafka topic for each
Syslog sender, we could omit `topics` and instead provide `"topic.name.attribute": "syslog.hostname"`.
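As a sketch, routing each Syslog sender to its own topic might look like this, showing only the topic-related portion of the `config` element and assuming the dataflow populates the `syslog.hostname` attribute:
```
{
    "connector.class": "org.apache.nifi.kafka.connect.StatelessNiFiSourceConnector",
    "topic.name.attribute": "syslog.hostname"
}
```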
`(8) nexus.url`: Traditional NiFi is deployed with many different extensions. In addition to that, many other third party extensions have been developed but
are not included in the distribution due to size constraints. It is important that the NiFi Kafka Connector not attempt to bundle all possible extensions. As
a result, Connect can be configured with the URL of a Nexus server. The example above points to Maven Central, which holds the released versions of the NiFi
extensions. When a connector is started, it will first identify which extensions are necessary to run the dataflow, determine which extensions are available,
and then automatically download any necessary extensions that it currently does not have available. If configuring a Nexus instance that has multiple repositories,
the name of the repository should be included in the URL. For example: `https://nexus-private.myorganization.org/nexus/repository/my-repository/`.
If the property is not specified, the necessary extensions (used by the flow) must be provided in the `extensions.directory` before deploying the connector.
`(9) flow.snapshot`: Specifies the dataflow to run. This is the file that was downloaded by right-clicking on the Process Group in NiFi and
clicking "Download flow". The dataflow can be stored external to the configured and the location can be represented as an HTTP (or HTTPS URL), or a filename.
If specifying a filename, that file must be present on all Kafka Connect nodes. Because of that, it may be simpler to host the dataflow somewhere.
Alternatively, the contents of the dataflow may be "Stringified" and included directly as the value for this property. This can be done, for example,
using the `jq` tool, such as `jq -R -s '.' <dataflow_file.json>` and the output can then be included in the Kafka Connect configuration JSON.
The process of escaping the JSON and including it within the Kafka Connect configuration may be less desirable if building the configuration manually,
but it can be beneficial if deploying from an automated system. It is important to note that if using a file or URL to specify the dataflow, it is important
that the contents of that file/URL not be overwritten in order to change the dataflow. Doing so can result in different Kafka Connect tasks running different
versions of the dataflow. Instead, a new file/URL should be created for the new version, and the Kafka Connect task should be updated to point to the new version.
This will allow Kafka Connect to properly update all tasks.
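As a sketch of that automated approach, assuming `jq` is available, the flow has been saved locally as `syslog-to-kafka.json`, and `connector-template.json` is a hypothetical connector configuration that is missing only the `flow.snapshot` value:
```
# Escape the flow definition into a single JSON string
FLOW=$(jq -R -s '.' syslog-to-kafka.json)

# Embed the escaped flow as the value of flow.snapshot in the connector configuration
jq --arg flow "$FLOW" '.config["flow.snapshot"] = ($flow | fromjson)' connector-template.json > connector.json
```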
`(10) key.attribute`: Optional. Specifies the name of a FlowFile attribute that should be used to specify the key of the Kafka record. If not specified, the Kafka record
will not have a key associated with it. If specified, but the attribute does not exist on a particular FlowFile, it will also have no key associated with it.
`(11) output.port`: Optional. The name of the Output Port in the NiFi dataflow that FlowFiles should be pulled from. If the dataflow contains exactly one port, this property
is optional and can be omitted. However, if the dataflow contains multiple ports (for example, a 'Success' and a 'Failure' port), this property must be specified.
If any FlowFile is sent to any port other than the specified Port, it is considered a failure. The session is rolled back and no data is collected.
`(12) header.attribute.regex`: Optional. A Java Regular Expression that will be evaluated against all FlowFile attribute names. For any attribute whose name matches the Regular Expression,
the Kafka record will have a header whose name matches the attribute name and whose value matches the attribute value. If not specified, the Kafka record will have no
headers added to it.
`(13) krb5.file`: Optional. Specifies the krb5.conf file to use if the dataflow interacts with any services that are secured via Kerberos. If not specified, will default
to `/etc/krb5.conf`.
`(14) dataflow.timeout`: Optional. Specifies the maximum amount of time to wait for the dataflow to complete. If the dataflow does not complete before this timeout,
the thread will be interrupted, and the dataflow is considered to have failed. The session will be rolled back and the connector will trigger the flow again. If not
specified, defaults to `30 secs`.
`(15) parameter.XYZ`: Optional. Specifies a Parameter to use in the Dataflow. In this case, the JSON field name is `parameter.Syslog Port`. Therefore, any Parameter Context
in the dataflow that has a parameter with name `Syslog Port` will get the value specified (i.e., `19944` in this case). If the dataflow has child Process Groups, and those child
Process Groups have their own Parameter Contexts, this value will be used for any and all Parameter Contexts containing a Parameter by the name of `Syslog Port`. If the Parameter
should be applied only to a specific Parameter Context, the name of the Parameter Context may be supplied and separated from the Parameter Name with a colon. For example,
`parameter.Syslog Context:Syslog Port`. In this case, the only Parameter Context whose `Syslog Port` parameter would be set would be the Parameter Context whose name is `Syslog Context`.
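As a sketch, the two forms would appear in the `config` element as follows (only the parameter entries are shown, and in practice one or the other would typically be used). The first applies to every Parameter Context that has a `Syslog Port` parameter, while the second applies only to the `Syslog Context` Parameter Context:
```
{
    "parameter.Syslog Port": "19944",
    "parameter.Syslog Context:Syslog Port": "19944"
}
```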
`(16) extensions.directory`: Specifies the directory to add any downloaded extensions to. If not specified, the extensions will be written to the same directory that the
connector lives in. Because this directory may not be writable, and to aid in upgrade scenarios, it is highly recommended that this property be configured.
### Transactional sources
Unlike with standard NiFi deployments, Stateless NiFi keeps the contents of FlowFiles in memory. It is important to understand that as long as the source of the data is
replayable and transactional, there is no concern over data loss. This is handled by treating the entire dataflow as a single transaction. Once data is obtained
from some source component, the data is transferred to the next processor in the flow. At that point, in a standard NiFi deployment, the processor would acknowledge the data
and NiFi would take ownership of that data, having persisted it to disk.
With Stateless NiFi, however, the data will not yet be acknowledged. Instead, the data will be transferred to the next processor in the flow, and it will perform
its task. The data is then transferred to the next processor in the flow and it will complete its task. This is repeated until all data is queued up at the Output Port.
At that point, the NiFi connector will provide the data to Kafka. Only after Kafka has acknowledged the records does the connector acknowledge this to NiFi. And only
at that point is the session committed, allowing the source processor to acknowledge receipt of the data.
As a result, if NiFi is restarted while processing some piece of data, the source processor will not have acknowledged the data and as a result is able to replay the
data, resulting in no data loss.
## Sink Connector
The NiFi Sink Connector is responsible for obtaining data from Kafka and delivering data to some other service. The dataflow should not attempt to
source data directly from Kafka itself via a Processor such as ConsumeKafka. Instead, the data should be received from an Input Port. Each Kafka record
will be enqueued into the outbound connection of that Input Port so that the next processor in the flow is able to process it.
It is important to note that each Kafka record is delivered to the dataflow as a single FlowFile. Depending on the destination, it may be advantageous to merge
many of these Kafka records together before delivering them. For example, if delivering to HDFS, we certainly do not want to send each individual Kafka
message to HDFS as a separate file. For more information and restrictions on merging data within Stateless NiFi, see the [Merging](#merging) section below.
In order to deploy a connector to Kafka Connect, we must create a configuration JSON that tells Kafka Connect how to deploy an instance of the Connector.
This is the standard Kafka Connect JSON configuration. However, it does require a few different configuration elements specific to the dataflow. Let us consider a dataflow that receives
events from Kafka and delivers them to HDFS. The dataflow would consist of a single PutHDFS Processor, with
the "success" relationship going to an Output Port with name `Success` and the "failure" relationship going to an Output Port with name `Failure`.
The PutHDFS processor would be fed data from an Input Port with name `Input`.
After creating this simple dataflow, we must place the dataflow in a location where the Kafka Connect connector is able to retrieve it. We could simply
copy the file to each Connect node. Or, we could host the dataflow somewhere that it can be pulled by each Connect instance. To do this, right-click
on the Process Group containing our dataflow and choose Download flow. From there, we can upload the JSON to Github as a Gist, for example.
### Configuring Sink Connector
An example configuration JSON to run this dataflow would then look like this (note that this cannot be copied and pasted as JSON as is,
as it includes annotations (1), (2), etc. for illustrative purposes):
```
{
"name": "kafka-to-hdfs",
"config": {
(1) "connector.class": "org.apache.nifi.kafka.connect.StatelessNiFiSinkConnector",
(2) "key.converter": "org.apache.kafka.connect.storage.StringConverter",
(3) "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
(4) "tasks.max": "1",
(5) "name": "kafka-to-hdfs",
(6) "working.directory": "./working/stateless",
(7) "topics": "syslog-gateway-json",
(8) "nexus.url": "https://repo1.maven.org/maven2/",
(9) "flow.snapshot": "https://gist.githubusercontent.com/user/6123f4b890f402c0b512888ccc92537e/raw/5b41a9eb6db09c54a4d325fec78a6e19c9abf9f2/kafka-to-hdfs.json",
(10) "input.port": "Syslog Messages",
(11) "failure.ports": "",
(12) "headers.as.attributes.regex": "syslog.*",
(13) "krb5.file": "/etc/krb5.conf",
(14) "dataflow.timeout": "30 sec",
(15) "parameter.Directory": "/syslog",
(16) "extensions.directory": "/tmp/stateless-extensions"
}
}
```
The first two elements, `name` and `config`, are standard for all Kafka Connect deployments. Within the `config` element are several different fields that
will be explained here.
`(1) connector.class`: This is the name of the class to use for the Connector. The given value indicates that we want to use Stateless NiFi as a sink, delivering
data from Kafka to some other destination. If the desire was instead to use a dataflow as a source of data for Kafka, we would use the value `org.apache.nifi.kafka.connect.StatelessNiFiSourceConnector`.
`(2) key.converter`: This specifies how to deserialize the key of each Kafka record before it is handed to the connector. The `StringConverter` shown here treats
the record key as a string.
`(3) value.converter`: Specifies how to convert the value of each Kafka record into the bytes that become the contents of the NiFi FlowFile. Generally, with Stateless NiFi, it is
recommended to use the `value.converter` of `org.apache.kafka.connect.converters.ByteArrayConverter`, as NiFi works directly with the raw bytes
and is very adept at interpreting and formatting the data as it needs to be.
`(4) tasks.max`: The maximum number of tasks/threads to use to run the dataflow. Unlike standard NiFi deployments, with Stateless NiFi, the entire dataflow is run from start
to finish with a single thread. However, multiple threads can be used to run multiple copies of the dataflow.
`(5) name`: The name of the connector instance. This should match the top-level `name` field.
`(6) working.directory`: Optional. Specifies a directory on the Connect server that NiFi should use for unpacking extensions that it needs to perform the dataflow.
If not specified, defaults to `/tmp/nifi-stateless-working`.
`(7) topics`: A comma-separated list of Kafka topics to source data from.
`(8) nexus.url`: Traditional NiFi is deployed with many different extensions. In addition to that, many other third party extensions have been developed but
are not included in the distribution due to size constraints. It is important that the NiFi Kafka Connector not attempt to bundle all possible extensions. As
a result, Connect can be configured with the URL of a Nexus server. The example above points to Maven Central, which holds the released versions of the NiFi
extensions. When a connector is started, it will first identify which extensions are necessary to run the dataflow, determine which extensions are available,
and then automatically download any necessary extensions that it currently does not have available. If configuring a Nexus instance that has multiple repositories,
the name of the repository should be included in the URL. For example: `https://nexus-private.myorganization.org/nexus/repository/my-repository/`.
If the property is not specified, the necessary extensions (used by the flow) must be provided in the `extensions.directory` before deploying the connector.
`(9) flow.snapshot`: Specifies the dataflow to run. This is the file that was downloaded by right-clicking on the Process Group in NiFi and
clicking "Download flow". The dataflow can be stored external to the configured and the location can be represented as an HTTP (or HTTPS URL), or a filename.
If specifying a filename, that file must be present on all Kafka Connect nodes. Because of that, it may be simpler to host the dataflow somewhere.
Alternatively, the contents of the dataflow may be "Stringified" and included directly as the value for this property. This can be done, for example,
using the `jq` tool, such as `cat <dataflow_file.json> | jq -R -s '.'` and the output can then be included in the Kafka Connect configuration JSON.
The process of escaping the JSON and including it within the Kafka Connect configuration may be less desirable if building the configuration manually,
but it can be beneficial if deploying from an automated system. It is important to note that if using a file or URL to specify the dataflow, it is important
that the contents of that file/URL not be overwritten in order to change the dataflow. Doing so can result in different Kafka Connect tasks running different
versions of the dataflow. Instead, a new file/URL should be created for the new version, and the Kafka Connect task should be updated to point to the new version.
This will allow Kafka Connect to properly update all tasks.
`(10) input.port`: Optional. The name of the Input Port in the NiFi dataflow that Kafka records should be sent to. If the dataflow contains exactly one Input Port,
this property is optional and can be omitted. However, if the dataflow contains multiple Input Ports, this property must be specified.
`(11) failure.ports`: Optional. A comma-separated list of Output Ports that should be considered failure conditions. If any FlowFile is routed to an Output Port,
and the name of that Output Port is provided in this list, the dataflow is considered a failure and the session is rolled back. The dataflow will then wait a bit
and attempt to process the Kafka records again. If data is transferred to an Output Port that is not in the list of failure ports, the data will simply be discarded.
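For the example flow described above, with an Input Port named `Input` and Output Ports named `Success` and `Failure`, the relevant entries of the `config` element would look like this:
```
{
    "input.port": "Input",
    "failure.ports": "Failure"
}
```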
`(12) headers.as.attributes.regex`: Optional. A Java Regular Expression that will be evaluated against all Kafka record headers. For any header whose key matches the
Regular Expression, the header will be added to the FlowFile as an attribute. The attribute name will be the same as the header key, and the attribute value will be
the same as the header value.
`(13) krb5.file`: Optional. Specifies the krb5.conf file to use if the dataflow interacts with any services that are secured via Kerberos. If not specified, will default
to `/etc/krb5.conf`.
`(14) dataflow.timeout`: Optional. Specifies the maximum amount of time to wait for the dataflow to complete. If the dataflow does not complete before this timeout,
the thread will be interrupted, and the dataflow is considered to have failed. The session will be rolled back and the connector will trigger the flow again. If not
specified, defaults to `30 secs`.
`(15) parameter.XYZ`: Optional. Specifies a Parameter to use in the Dataflow. In this case, the JSON field name is `parameter.Directory`. Therefore, any Parameter Context
in the dataflow that has a parameter with name `Directory` will get the value specified (i.e., `/syslog` in this case). If the dataflow has child Process Groups, and those child
Process Groups have their own Parameter Contexts, this value will be used for any and all Parameter Contexts containing a Parameter by the name of `Directory`. If the Parameter
should be applied only to a specific Parameter Context, the name of the Parameter Context may be supplied and separated from the Parameter Name with a colon. For example,
`parameter.HDFS:Directory`. In this case, the only Parameter Context whose `Directory` parameter would be set would be the Parameter Context whose name is `HDFS`.
`(16) extensions.directory`: Specifies the directory to add any downloaded extensions to. If not specified, the extensions will be written to the same directory that the
connector lives in. Because this directory may not be writable, and to aid in upgrade scenarios, it is highly recommended that this property be configured.
<a name="merging"></a>
### Merging
NiFi supports many different Processors that can be used as sinks for Apache Kafka data. For services that play well in the world of streaming, these
can often be delivered directly to a sink. For example, a PublishJMS processor is happy to receive many small messages. However, if the data is to be
sent to S3 or to HDFS, those services will perform much better if the data is first batched, or merged, together. The MergeContent and MergeRecord
processors are extremely popular in NiFi for this reason. They allow many small FlowFiles to be merged together into one larger FlowFile.
With the standard NiFi engine, we can simply set a minimum and maximum size for the merged data along with a timeout. However, with Stateless NiFi and Kafka Connect,
this may not work as well, because only a limited number of FlowFiles will be made available to the Processor. We can still use these Processors
to merge the data together, but with a bit of a limitation.
If MergeContent / MergeRecord are triggered but do not have enough FlowFiles to create a batch, the processor will do nothing. If there are more FlowFiles
queued up than the configured maximum number of entries, the Processor will merge up to that number of FlowFiles but then leave the rest sitting in the queue.
The next invocation will then not have enough FlowFiles to create a batch, so the remaining FlowFiles will stay queued. In either of these situations, the result can be
that the dataflow constantly triggers the merge processor, which makes no progress, and as a result the dataflow times out and rolls back the entire session.
Therefore, it is advisable that the MergeContent / MergeRecord processors be configured with a `Minimum Number of Entries` of `1` and a very large value for the
`Maximum Number of Entries` property (for example 1000000). Kafka Connect properties such as `offset.flush.timeout.ms` may be used to control
the amount of data that gets merged together.
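For example, the flush-related settings live in the Kafka Connect worker configuration (a sketch; suitable values depend on the desired batch size and latency):
```
# connect-distributed.properties (Kafka Connect worker configuration)
offset.flush.interval.ms=60000
offset.flush.timeout.ms=10000
```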
# Installing the NiFi Connector
Now that we have covered how to build a dataflow that can be used as a Kafka Connector, and we've discussed how to build the configuration for that connector,
all that is left is to describe how to deploy the Connector itself.
In order to deploy the NiFi Connector, the nifi-kafka-connector-<version>-bin.tar.gz must first be unpacked into the directory where Kafka Connect
is configured to load connectors from (this is configured in the Kafka Connect properties file). For example:
```
cd /opt/kafka/kafka-connectors
tar xzvf ~/Downloads/nifi-kafka-connector-1.13.0-bin.tar.gz
```
At this point, if Kafka Connect is already running, it must be restarted in order to pick up the new Connector. It is not necessary to copy the connector
and restart Kafka Connect each time we want to deploy a dataflow as a connector - only when installing the NiFi connector initially.
This packaged connector supplies both the NiFi Source Connector and the NiFi Sink Connector.
# Deploying a NiFi Dataflow as a Connector
Once the NiFi connector is installed, we are ready to deploy our dataflow! Depending on the dataflow, we may need a source or a sink connector. Please see
the above sections on creating the appropriate configuration for the connector of interest.
Assuming that we have created the appropriate JSON configuration file for our connector and named it `syslog-to-kafka.json`, we can deploy the flow.
We do this by making a RESTful POST call to Kafka Connect:
```
connect-configurations $ curl -X POST kafka-01:8083/connectors -H "Content-Type: application/json" -d @syslog-to-kafka.json
```
This should produce a response similar to this (formatted for readability):
```
{
"name": "syslog-to-kafka",
"config": {
"connector.class": "org.apache.nifi.kafka.connect.StatelessNiFiSourceConnector",
"tasks.max": "1",
"working.directory": "./working/stateless",
"name": "syslog-to-kafka",
"topic.name": "syslog-gateway-json",
"parameter.Syslog Port": "19944"
"nexus.url": "https://repo1.maven.org/maven2/",
"flow.snapshot": "https://gist.githubusercontent.com/user/6123f4b890f402c0b512888ccc92537e/raw/5b41a9eb6db09c54a4d325fec78a6e19c9abf9f2/syslog-to-kafka.json",
},
"tasks": [],
"type": "source"
}
```
At this point, the connector has been deployed, and we can see in the Kafka Connect logs that the connector was successfully deployed.
The logs will contain many initialization messages, but should contain a message such as:
```
[2020-12-15 10:44:04,175] INFO NiFi Stateless Engine and Dataflow created and initialized in 2146 millis (org.apache.nifi.stateless.flow.StandardStatelessDataflowFactory:202)
```
This indicates that the engine has successfully parsed the dataflow and started the appropriate components.
Depending on the hardware, this may take only a few milliseconds or it may take a few seconds. However, this assumes that the Connector
already has access to all of the NiFi extensions that it needs. If that is not the case, startup may take longer as it downloads the extensions
that are necessary. This is described in the next section.
Similarly, we can view which connectors are deployed:
```
connect-configurations $ curl kafka-01:8083/connectors
```
Which should produce an output such as:
```
["syslog-to-kafka"]
```
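We can also check the state of the connector and its tasks using the standard Kafka Connect REST API:
```
connect-configurations $ curl kafka-01:8083/connectors/syslog-to-kafka/status
```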
We can then terminate our deployment:
```
curl -X DELETE kafka-01:8083/connectors/syslog-to-kafka
```
# Sourcing Extensions
The list of NiFi extensions that are available numbers into the hundreds, with more being developed by the NiFi community continually.
It would not be ideal to package all NiFi extensions into the Connector. Therefore, the NiFi Kafka Connector includes no extensions
at all. This results in the connector occupying a measly 27 MB at the time of this writing.
Obviously, though, the extensions must be downloaded at some point. When a connector is deployed and started up, one of the first things that the
Connector does is to obtain the dataflow configuration, and then extract the information about which extensions are necessary for running the dataflow.
The Connector will then examine its own set of downloaded extensions and determine which ones it is missing. It will then create a list of the missing extensions
and begin downloading them.
In order to do this, the connect configuration must specify where to download the extensions. This is the reason for the "nexus.url" property that is described
in both the Source Connector and the Sink Connector. Once downloaded, the extensions are placed in the configured extensions directory (configured via the
`extensions.directory` configuration element).
If the `extensions.directory` is not explicitly specified in the connector configuration, extensions will be added to the NAR Directory
(configured via the `nar.directory` configuration element). If this is not specified, it is auto-detected to be the same directory
that the NiFi Kafka Connector was installed in.
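Because the installation directory is often not writable by the Connect process, it may be simplest to create a dedicated extensions directory ahead of time. A sketch, assuming the Connect worker runs as a `kafka` user:
```
mkdir -p /tmp/stateless-extensions
chown kafka:kafka /tmp/stateless-extensions
```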
# Mapping of NiFi Features
There are some features that exist in NiFi that have natural analogues in Kafka Connect. These are discussed here.
#### State Management
In NiFi, a Processor is capable of storing state about the work that it has accomplished. This is particularly important for source
components such as ListS3. This Processor keeps state about the data that it has already seen so that it does not constantly list the same files repeatedly.
When using the Stateless NiFi Source Connector, the state that is stored by these processors is provided to Kafka Connect and is stored within Kafka itself
as the "Source Offsets" and "Source Partition" information. This allows a task to be restarted and resume where it left off. If a Processor stores
"Local State" in NiFi, it will be stored in Kafka using a "Source Partition" that corresponds directly to that task. As a result, each Task is analogous
to a Node in a NiFi cluster. If the Processor stores "Cluster-Wide State" in NiFi, the state will be stored in Kafka using a "Source Partition" that corresponds
to a cluster-wide state.
Kafka Connect does not allow for state to be stored for Sink Tasks.
#### Primary Node
NiFi provides several processors that are expected to run only on a single node in the cluster. This is accomplished by setting the Execution Node to
"Primary Node Only" in the scheduling tab when configuring a NiFi Processor. When using the Source Connector, if any source processor in the configured
dataflow is set to run on Primary Node Only, only a single task will ever run, even if the `tasks.max` configuration element is set to a large value. In this
case, a warning will be logged if attempting to use multiple tasks for a dataflow that has a source processor configured for Primary Node Only. Because Processors
should only be scheduled on Primary Node Only if they are sources of data, this is ignored for all Sink Tasks and for any Processor in a Source Task that has
incoming connections.
#### Processor Yielding
When a Processor determines that it is not capable of performing any work (for example, because the system that the Processor is pulling from has no more data to pull),
it may choose to yield. This means that the Processor will not run for some amount of time. The amount of time can be configured in a NiFi dataflow by
configuring the Processor and going to the Settings tab and updating the "Yield Duration" property. When using the Source Connector, if a Processor chooses
to yield, the Source Connector will pause for the configured amount of time before triggering the dataflow to run again.


@@ -1,291 +0,0 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
APACHE NIFI SUBCOMPONENTS:
The Apache NiFi project contains subcomponents with separate copyright
notices and license terms. Your use of the source code for the these
subcomponents is subject to the terms and conditions of the following
licenses.
The binary distribution of this product bundles 'Antlr 3' which is available
under a "3-clause BSD" license. For details see http://www.antlr3.org/license.html
Copyright (c) 2010 Terence Parr
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
Neither the name of the author nor the names of its contributors may be used
to endorse or promote products derived from this software without specific
prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
THE POSSIBILITY OF SUCH DAMAGE.
The binary distribution of this product bundles 'Bouncy Castle JDK 1.5'
under an MIT style license.
Copyright (c) 2000 - 2015 The Legion of the Bouncy Castle Inc. (http://www.bouncycastle.org)
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
This product bundles 'asm' which is available under a 3-Clause BSD style license.
For details see http://asm.ow2.org/asmdex-license.html
Copyright (c) 2012 France Télécom
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. Neither the name of the copyright holders nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
THE POSSIBILITY OF SUCH DAMAGE.


@@ -1,138 +0,0 @@
nifi-kafka-connector-assembly
Copyright 2014-2024 The Apache Software Foundation
This product includes software developed at
The Apache Software Foundation (http://www.apache.org/).
******************
Apache Software License v2
******************
The following binary components are provided under the Apache Software License v2
(ASLv2) Apache Commons Codec
The following NOTICE information applies:
Apache Commons Codec
Copyright 2002-2014 The Apache Software Foundation
src/test/org/apache/commons/codec/language/DoubleMetaphoneTest.java
contains test data from http://aspell.net/test/orig/batch0.tab.
Copyright (C) 2002 Kevin Atkinson (kevina@gnu.org)
===============================================================================
The content of package org.apache.commons.codec.language.bm has been translated
from the original php source code available at http://stevemorse.org/phoneticinfo.htm
with permission from the original authors.
Original source copyright:
Copyright (c) 2008 Alexander Beider & Stephen P. Morse.
(ASLv2) Apache Commons Configuration
The following NOTICE information applies:
Apache Commons Configuration
Copyright 2001-2017 The Apache Software Foundation
This product includes software developed at
The Apache Software Foundation (http://www.apache.org/).
(ASLv2) Apache Commons Lang
The following NOTICE information applies:
Apache Commons Lang
Copyright 2001-2015 The Apache Software Foundation
This product includes software from the Spring Framework,
under the Apache License 2.0 (see: StringUtils.containsWhitespace())
(ASLv2) Apache Commons Text
The following NOTICE information applies:
Apache Commons Text
Copyright 2001-2018 The Apache Software Foundation
(ASLv2) Apache HttpComponents
The following NOTICE information applies:
Apache HttpClient
Copyright 1999-2015 The Apache Software Foundation
Apache HttpCore
Copyright 2005-2015 The Apache Software Foundation
Apache HttpMime
Copyright 1999-2013 The Apache Software Foundation
This project contains annotations derived from JCIP-ANNOTATIONS
Copyright (c) 2005 Brian Goetz and Tim Peierls. See http://www.jcip.net
(ASLv2) Jackson JSON processor
The following NOTICE information applies:
# Jackson JSON processor
Jackson is a high-performance, Free/Open Source JSON processing library.
It was originally written by Tatu Saloranta (tatu.saloranta@iki.fi), and has
been in development since 2007.
It is currently developed by a community of developers, as well as supported
commercially by FasterXML.com.
## Licensing
Jackson core and extension components may licensed under different licenses.
To find the details that apply to this artifact see the accompanying LICENSE file.
For more information, including possible other licensing options, contact
FasterXML.com (http://fasterxml.com).
## Credits
A list of contributors may be found from CREDITS file, which is included
in some artifacts (usually source distributions); but is always available
from the source code management (SCM) system project uses.
(ASLv2) JSON-SMART
The following NOTICE information applies:
Copyright 2011 JSON-SMART authors
(ASLv2) JsonPath
The following NOTICE information applies:
Copyright 2011 JsonPath authors
(ASLv2) Spring Security
The following NOTICE information applies:
Spring Security 5
Copyright (c) 2002-2021 Pivotal, Inc.
(ASLv2) ASM Based Accessors Helper Used By JSON Smart (net.minidev:accessors-smart:jar:1.2 - http://www.minidev.net/)
The following NOTICE information applies:
ASM Based Accessors Helper Used By JSON Smart 1.2
Copyright 2017, Uriel Chemouni
(ASLv2) BCrypt Password Hashing Function (at.favre.lib:bcrypt:jar:0.9.0 - https://github.com/patrickfav/bcrypt)
The following NOTICE information applies:
BCrypt Password Hashing Function 0.9.0
Copyright 2018 Patrick Favre-Bulle
(ASLv2) Bytes Utility Library (at.favre.lib:bytes:jar:1.3.0 - https://github.com/patrickfav/bytes-java)
The following NOTICE information applies:
Bytes Utility Library 1.3.0
Copyright 2017 Patrick Favre-Bulle
************************
Common Development and Distribution License 1.1
************************
The following binary components are provided under the Common Development and Distribution License 1.1. See project link for details.
(CDDL 1.1) (GPL2 w/ CPE) aopalliance-repackaged (org.glassfish.hk2.external:aopalliance-repackaged:jar:2.5.0-b42 - https://javaee.github.io/glassfish/)
(CDDL 1.1) (GPL2 w/ CPE) hk2-api (org.glassfish.hk2:hk2-api:jar:2.5.0-b42 - https://javaee.github.io/glassfish/)
(CDDL 1.1) (GPL2 w/ CPE) hk2-utils (org.glassfish.hk2:hk2-utils:jar:2.5.0-b42 - https://javaee.github.io/glassfish/)
(CDDL 1.1) (GPL2 w/ CPE) hk2-locator (org.glassfish.hk2:hk2-locator:jar:2.5.0-b42 - https://javaee.github.io/glassfish/)
(CDDL 1.1) (GPL2 w/ CPE) javax.annotation API (javax.annotation:javax.annotation-api:jar:1.2 - http://jcp.org/en/jsr/detail?id=250)
(CDDL 1.1) (GPL2 w/ CPE) javax.inject:1 as OSGi bundle (org.glassfish.hk2.external:javax.inject:jar:2.4.0-b25 - https://hk2.java.net/external/javax.inject)
(CDDL 1.1) (GPL2 w/ CPE) javax.ws.rs-api (javax.ws.rs:javax.ws.rs-api:jar:2.1 - http://jax-rs-spec.java.net)
(CDDL 1.1) (GPL2 w/ CPE) jersey-client (org.glassfish.jersey.core:jersey-client:jar:2.26 - https://jersey.github.io/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-common (org.glassfish.jersey.core:jersey-common:jar:2.26 - https://jersey.github.io/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-entity-filtering (org.glassfish.jersey.ext:jersey-entity-filtering:jar:2.26 - https://jersey.github.io/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-hk2 (org.glassfish.jersey.inject:jersey-hk2:jar:2.26 - https://jersey.github.io/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-media-json-jackson (org.glassfish.jersey.media:jersey-media-json-jackson:jar:2.26 - https://jersey.github.io/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-media-multipart (org.glassfish.jersey.media:jersey-media-multipart:jar:2.26 - https://jersey.github.io/)
(CDDL 1.1) (GPL2 w/ CPE) MIME Streaming Extension (org.jvnet.mimepull:mimepull:jar:1.9.3 - http://mimepull.java.net)
(CDDL 1.1) (GPL2 w/ CPE) OSGi resource locator bundle (org.glassfish.hk2:osgi-resource-locator:jar:1.0.1 - http://glassfish.org/osgi-resource-locator)

View File

@ -1,185 +0,0 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<parent>
<artifactId>nifi-kafka-connect</artifactId>
<groupId>org.apache.nifi</groupId>
<version>2.0.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>
<artifactId>nifi-kafka-connector-assembly</artifactId>
<dependencies>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-kafka-connector</artifactId>
<version>2.0.0-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-api</artifactId>
<version>2.0.0-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-stateless-api</artifactId>
<version>2.0.0-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-stateless-bootstrap</artifactId>
<version>2.0.0-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
</dependency>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-framework-api</artifactId>
<version>2.0.0-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-server-api</artifactId>
<version>2.0.0-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-runtime</artifactId>
<version>2.0.0-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-nar-utils</artifactId>
<version>2.0.0-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-properties</artifactId>
<version>2.0.0-SNAPSHOT</version>
</dependency>
<!-- SLF4J bridges to include -->
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>jcl-over-slf4j</artifactId>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>log4j-over-slf4j</artifactId>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>jul-to-slf4j</artifactId>
</dependency>
<!-- NAR files to include -->
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-jetty-nar</artifactId>
<version>2.0.0-SNAPSHOT</version>
<type>nar</type>
</dependency>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-stateless-nar</artifactId>
<version>2.0.0-SNAPSHOT</version>
<type>nar</type>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<configuration>
<finalName>nifi-kafka-connector-assembly-${project.version}</finalName>
<attach>true</attach>
<appendAssemblyId>false</appendAssemblyId>
</configuration>
<executions>
<execution>
<id>make shared resource</id>
<goals>
<goal>single</goal>
</goals>
<phase>package</phase>
<configuration>
<archiverConfig>
<defaultDirectoryMode>0775</defaultDirectoryMode>
<directoryMode>0775</directoryMode>
<fileMode>0664</fileMode>
</archiverConfig>
<descriptors>
<descriptor>src/main/assembly/dependencies.xml</descriptor>
</descriptors>
<tarLongFileMode>posix</tarLongFileMode>
<formats>
<format>zip</format>
</formats>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
<profiles>
<profile>
<id>targz</id>
<activation>
<activeByDefault>false</activeByDefault>
</activation>
<build>
<plugins>
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<configuration>
<finalName>nifi-kafka-connector-assembly-${project.version}</finalName>
<attach>true</attach>
<appendAssemblyId>false</appendAssemblyId>
</configuration>
<executions>
<execution>
<id>make shared resource</id>
<goals>
<goal>single</goal>
</goals>
<phase>package</phase>
<configuration>
<archiverConfig>
<defaultDirectoryMode>0775</defaultDirectoryMode>
<directoryMode>0775</directoryMode>
<fileMode>0664</fileMode>
</archiverConfig>
<descriptors>
<descriptor>src/main/assembly/dependencies.xml</descriptor>
</descriptors>
<tarLongFileMode>posix</tarLongFileMode>
<formats>
<format>tar.gz</format>
</formats>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
</profile>
</profiles>
</project>

View File

@ -1,58 +0,0 @@
<?xml version="1.0"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<assembly>
<id>bin</id>
<includeBaseDirectory>true</includeBaseDirectory>
<baseDirectory>nifi-kafka-connector-${project.version}</baseDirectory>
<dependencySets>
<!-- Write out all dependency artifacts to directory -->
<dependencySet>
<scope>runtime</scope>
<useProjectArtifact>false</useProjectArtifact>
<outputDirectory>.</outputDirectory>
<directoryMode>0770</directoryMode>
<fileMode>0664</fileMode>
<useTransitiveFiltering>true</useTransitiveFiltering>
</dependencySet>
</dependencySets>
<!-- Explicitly pull in LICENSE, NOTICE, and README -->
<files>
<file>
<source>../README.md</source>
<outputDirectory>./</outputDirectory>
<destName>README.md</destName>
<fileMode>0644</fileMode>
<filtered>true</filtered>
</file>
<file>
<source>./LICENSE</source>
<outputDirectory>./</outputDirectory>
<destName>LICENSE</destName>
<fileMode>0644</fileMode>
<filtered>true</filtered>
</file>
<file>
<source>./NOTICE</source>
<outputDirectory>./</outputDirectory>
<destName>NOTICE</destName>
<fileMode>0644</fileMode>
<filtered>true</filtered>
</file>
</files>
</assembly>

View File

@ -1,128 +0,0 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<parent>
<artifactId>nifi-kafka-connect</artifactId>
<groupId>org.apache.nifi</groupId>
<version>2.0.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>
<artifactId>nifi-kafka-connector-tests</artifactId>
<build>
<plugins>
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<configuration>
<finalName>nifi-kafka-connector</finalName>
<attach>false</attach>
</configuration>
<executions>
<execution>
<id>prepare integration test dependencies</id>
<goals>
<goal>single</goal>
</goals>
<phase>generate-resources</phase>
<configuration>
<archiverConfig>
<defaultDirectoryMode>0775</defaultDirectoryMode>
<directoryMode>0775</directoryMode>
<fileMode>0664</fileMode>
</archiverConfig>
<descriptors>
<descriptor>src/main/assembly/dependencies.xml</descriptor>
</descriptors>
<tarLongFileMode>posix</tarLongFileMode>
<formats>
<format>dir</format>
</formats>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.rat</groupId>
<artifactId>apache-rat-plugin</artifactId>
<configuration>
<excludes combine.children="append">
<exclude>src/test/resources/flows/Generate_Data.json</exclude>
<exclude>src/test/resources/flows/Write_To_File.json</exclude>
</excludes>
</configuration>
</plugin>
</plugins>
</build>
<dependencies>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-kafka-connector-assembly</artifactId>
<version>2.0.0-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-api</artifactId>
<version>2.0.0-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-stateless-api</artifactId>
<version>2.0.0-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-python-framework-api</artifactId>
<version>2.0.0-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-kafka-connector</artifactId>
<version>2.0.0-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>connect-api</artifactId>
<version>2.6.3</version>
<scope>test</scope>
</dependency>
<!-- Dependencies for integration tests. These must be excluded from the main assembly -->
<!-- TODO: Probably should separate this into a separate integration test module -->
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-system-test-extensions-nar</artifactId>
<version>2.0.0-SNAPSHOT</version>
<type>nar</type>
</dependency>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-standard-nar</artifactId>
<version>2.0.0-SNAPSHOT</version>
<type>nar</type>
</dependency>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-standard-shared-nar</artifactId>
<version>2.0.0-SNAPSHOT</version>
<type>nar</type>
</dependency>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-standard-services-api-nar</artifactId>
<version>2.0.0-SNAPSHOT</version>
<type>nar</type>
</dependency>
</dependencies>
</project>

View File

@ -1,33 +0,0 @@
<?xml version="1.0"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<assembly>
<id>bin</id>
<includeBaseDirectory>true</includeBaseDirectory>
<baseDirectory>nars</baseDirectory>
<dependencySets>
<!-- Write out all dependency artifacts to nars directory -->
<dependencySet>
<scope>runtime</scope>
<useProjectArtifact>false</useProjectArtifact>
<outputDirectory>.</outputDirectory>
<directoryMode>0770</directoryMode>
<fileMode>0664</fileMode>
<useTransitiveFiltering>true</useTransitiveFiltering>
</dependencySet>
</dependencySets>
</assembly>

View File

@ -1,140 +0,0 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.nifi.kafka.connect;
import org.apache.kafka.connect.errors.RetriableException;
import org.apache.kafka.connect.sink.SinkRecord;
import org.apache.kafka.connect.sink.SinkTaskContext;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.TestInfo;
import org.mockito.Mockito;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertNotNull;
import static org.junit.jupiter.api.Assertions.assertThrows;
import static org.junit.jupiter.api.Assertions.assertTrue;
public class StatelessNiFiSinkTaskIT {
private final File DEFAULT_OUTPUT_DIRECTORY = new File("target/sink-output");
@Test
public void testSimpleFlow(TestInfo testInfo) throws IOException {
final StatelessNiFiSinkTask sinkTask = new StatelessNiFiSinkTask();
sinkTask.initialize(Mockito.mock(SinkTaskContext.class));
final Map<String, String> properties = createDefaultProperties(testInfo);
sinkTask.start(properties);
final SinkRecord record = new SinkRecord("topic", 0, null, "key", null, "Hello World", 0L);
final File[] files = DEFAULT_OUTPUT_DIRECTORY.listFiles();
if (files != null) {
for (final File file : files) {
assertTrue(file.delete(), "Failed to delete existing file " + file.getAbsolutePath());
}
}
sinkTask.put(Collections.singleton(record));
sinkTask.flush(Collections.emptyMap());
final File[] outputFiles = DEFAULT_OUTPUT_DIRECTORY.listFiles();
assertNotNull(outputFiles);
assertEquals(1, outputFiles.length);
final File outputFile = outputFiles[0];
final String output = new String(Files.readAllBytes(outputFile.toPath()));
assertEquals("Hello World", output);
sinkTask.stop();
}
@Test
public void testParameters(TestInfo testInfo) throws IOException {
final StatelessNiFiSinkTask sinkTask = new StatelessNiFiSinkTask();
sinkTask.initialize(Mockito.mock(SinkTaskContext.class));
final Map<String, String> properties = createDefaultProperties(testInfo);
properties.put("parameter.Directory", "target/sink-output-2");
sinkTask.start(properties);
final SinkRecord record = new SinkRecord("topic", 0, null, "key", null, "Hello World", 0L);
final File outputDir = new File("target/sink-output-2");
final File[] files = outputDir.listFiles();
if (files != null) {
for (final File file : files) {
assertTrue(file.delete(), "Failed to delete existing file " + file.getAbsolutePath());
}
}
sinkTask.put(Collections.singleton(record));
sinkTask.flush(Collections.emptyMap());
final File[] outputFiles = outputDir.listFiles();
assertNotNull(outputFiles);
assertEquals(1, outputFiles.length);
final File outputFile = outputFiles[0];
final String output = new String(Files.readAllBytes(outputFile.toPath()));
assertEquals("Hello World", output);
sinkTask.stop();
}
@Test
public void testWrongOutputPort(TestInfo testInfo) {
final StatelessNiFiSinkTask sinkTask = new StatelessNiFiSinkTask();
sinkTask.initialize(Mockito.mock(SinkTaskContext.class));
final Map<String, String> properties = createDefaultProperties(testInfo);
properties.put(StatelessNiFiSinkConfig.FAILURE_PORTS, "Success, Failure");
sinkTask.start(properties);
final SinkRecord record = new SinkRecord("topic", 0, null, "key", null, "Hello World", 0L);
final File[] files = DEFAULT_OUTPUT_DIRECTORY.listFiles();
if (files != null) {
for (final File file : files) {
assertTrue(file.delete(), "Failed to delete existing file " + file.getAbsolutePath());
}
}
assertThrows(RetriableException.class, () -> {
sinkTask.put(Collections.singleton(record));
sinkTask.flush(Collections.emptyMap());
}, "Expected RetriableException to be thrown");
}
private Map<String, String> createDefaultProperties(TestInfo testInfo) {
final Map<String, String> properties = new HashMap<>();
properties.put(StatelessNiFiCommonConfig.DATAFLOW_TIMEOUT, "30 sec");
properties.put(StatelessNiFiSinkConfig.INPUT_PORT_NAME, "In");
properties.put(StatelessNiFiCommonConfig.FLOW_SNAPSHOT, "src/test/resources/flows/Write_To_File.json");
properties.put(StatelessNiFiCommonConfig.NAR_DIRECTORY, "target/nifi-kafka-connector-bin/nars");
properties.put(StatelessNiFiCommonConfig.WORKING_DIRECTORY, "target/nifi-kafka-connector-bin/working");
properties.put(StatelessNiFiCommonConfig.DATAFLOW_NAME, testInfo.getTestMethod().get().getName());
return properties;
}
}

View File

@ -1,288 +0,0 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.nifi.kafka.connect;
import org.apache.kafka.connect.errors.RetriableException;
import org.apache.kafka.connect.header.Header;
import org.apache.kafka.connect.header.Headers;
import org.apache.kafka.connect.source.SourceRecord;
import org.apache.kafka.connect.source.SourceTaskContext;
import org.apache.kafka.connect.storage.OffsetStorageReader;
import org.apache.nifi.components.state.Scope;
import org.apache.nifi.stateless.flow.StatelessDataflow;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.TestInfo;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertNotNull;
import static org.junit.jupiter.api.Assertions.assertNull;
import static org.junit.jupiter.api.Assertions.assertThrows;
import static org.junit.jupiter.api.Assertions.assertTrue;
public class StatelessNiFiSourceTaskIT {
@Test
public void testSimpleFlow(TestInfo testInfo) throws InterruptedException {
final StatelessNiFiSourceTask sourceTask = new StatelessNiFiSourceTask();
sourceTask.initialize(createContext());
final Map<String, String> properties = createDefaultProperties(testInfo);
sourceTask.start(properties);
final List<SourceRecord> sourceRecords = sourceTask.poll();
assertEquals(1, sourceRecords.size());
final SourceRecord record = sourceRecords.get(0);
assertEquals("Hello World", new String((byte[]) record.value()));
assertNull(record.key());
assertEquals("my-topic", record.topic());
sourceTask.stop();
}
@Test
public void testKeyAttribute(TestInfo testInfo) throws InterruptedException {
final StatelessNiFiSourceTask sourceTask = new StatelessNiFiSourceTask();
sourceTask.initialize(createContext());
final Map<String, String> properties = createDefaultProperties(testInfo);
properties.put(StatelessNiFiSourceConfig.KEY_ATTRIBUTE, "greeting");
sourceTask.start(properties);
final List<SourceRecord> sourceRecords = sourceTask.poll();
assertEquals(1, sourceRecords.size());
final SourceRecord record = sourceRecords.get(0);
final Object key = record.key();
assertEquals("hello", key);
assertEquals("my-topic", record.topic());
sourceTask.stop();
}
@Test
public void testTopicNameAttribute(TestInfo testInfo) throws InterruptedException {
final StatelessNiFiSourceTask sourceTask = new StatelessNiFiSourceTask();
sourceTask.initialize(createContext());
final Map<String, String> properties = createDefaultProperties(testInfo);
properties.put(StatelessNiFiSourceConfig.TOPIC_NAME_ATTRIBUTE, "greeting");
sourceTask.start(properties);
final List<SourceRecord> sourceRecords = sourceTask.poll();
assertEquals(1, sourceRecords.size());
final SourceRecord record = sourceRecords.get(0);
assertEquals("hello", record.topic());
sourceTask.stop();
}
@Test
public void testHeaders(TestInfo testInfo) throws InterruptedException {
final StatelessNiFiSourceTask sourceTask = new StatelessNiFiSourceTask();
sourceTask.initialize(createContext());
final Map<String, String> properties = createDefaultProperties(testInfo);
properties.put(StatelessNiFiSourceConfig.HEADER_REGEX, "uuid|greeting|num.*");
sourceTask.start(properties);
final List<SourceRecord> sourceRecords = sourceTask.poll();
assertEquals(1, sourceRecords.size());
final SourceRecord record = sourceRecords.get(0);
assertEquals("my-topic", record.topic());
final Map<String, String> headerValues = new HashMap<>();
final Headers headers = record.headers();
for (final Header header : headers) {
headerValues.put(header.key(), (String) header.value());
}
assertEquals("hello", headerValues.get("greeting"));
assertTrue(headerValues.containsKey("uuid"));
assertTrue(headerValues.containsKey("number"));
sourceTask.stop();
}
@Test
public void testTransferToWrongPort(TestInfo testInfo) {
final StatelessNiFiSourceTask sourceTask = new StatelessNiFiSourceTask();
sourceTask.initialize(createContext());
final Map<String, String> properties = createDefaultProperties(testInfo);
properties.put(StatelessNiFiSourceConfig.OUTPUT_PORT_NAME, "Another");
sourceTask.start(properties);
assertThrows(RetriableException.class, () -> sourceTask.poll(), "Expected RetriableException to be thrown");
}
@Test
public void testStateRecovered(TestInfo testInfo) {
final OffsetStorageReader offsetStorageReader = new OffsetStorageReader() {
@Override
public <T> Map<String, Object> offset(final Map<String, T> partition) {
if ("CLUSTER".equals(partition.get(StatelessNiFiSourceConfig.STATE_MAP_KEY))) {
final String serializedStateMap = "{\"version\":4,\"stateValues\":{\"abc\":\"123\"}}";
return Collections.singletonMap("c6562d38-4994-3fcc-ac98-1da34de1916f", serializedStateMap);
}
return null;
}
@Override
public <T> Map<Map<String, T>, Map<String, Object>> offsets(final Collection<Map<String, T>> partitions) {
return Collections.emptyMap();
}
};
final StatelessNiFiSourceTask sourceTask = new StatelessNiFiSourceTask();
sourceTask.initialize(createContext(offsetStorageReader));
final Map<String, String> properties = createDefaultProperties(testInfo);
properties.put(StatelessNiFiSourceConfig.OUTPUT_PORT_NAME, "Another");
sourceTask.start(properties);
final StatelessDataflow dataflow = sourceTask.getDataflow();
final Map<String, String> localStates = dataflow.getComponentStates(Scope.LOCAL);
final Map<String, String> clusterStates = dataflow.getComponentStates(Scope.CLUSTER);
assertFalse(clusterStates.isEmpty());
assertTrue(localStates.isEmpty());
}
@Test
public void testStateProvidedAndRecovered(TestInfo testInfo) throws InterruptedException {
final StatelessNiFiSourceTask sourceTask = new StatelessNiFiSourceTask();
sourceTask.initialize(createContext());
final Map<String, String> properties = createDefaultProperties(testInfo);
sourceTask.start(properties);
final List<SourceRecord> sourceRecords = sourceTask.poll();
assertEquals(1, sourceRecords.size());
final SourceRecord record = sourceRecords.get(0);
assertEquals("Hello World", new String((byte[]) record.value()));
assertNull(record.key());
assertEquals("my-topic", record.topic());
final Map<String, ?> sourceOffset = record.sourceOffset();
assertNotNull(sourceOffset);
assertEquals(1, sourceOffset.size());
final String generateProcessorId = sourceOffset.keySet().iterator().next();
final String serializedStateMap = "{\"version\":\"1\",\"stateValues\":{\"count\":\"1\"}}";
final Map<String, ?> expectedSourceOffset = Collections.singletonMap(generateProcessorId, serializedStateMap);
assertEquals(expectedSourceOffset, sourceOffset);
final Map<String, ?> sourcePartition = record.sourcePartition();
final Map<String, ?> expectedSourcePartition = Collections.singletonMap("task.index", "1");
assertEquals(expectedSourcePartition, sourcePartition);
sourceTask.stop();
final OffsetStorageReader offsetStorageReader = new OffsetStorageReader() {
@Override
public <T> Map<String, Object> offset(final Map<String, T> partition) {
if (sourcePartition.equals(partition)) {
return Collections.singletonMap(generateProcessorId, serializedStateMap);
}
return null;
}
@Override
public <T> Map<Map<String, T>, Map<String, Object>> offsets(final Collection<Map<String, T>> partitions) {
return Collections.emptyMap();
}
};
sourceTask.initialize(createContext(offsetStorageReader));
sourceTask.start(properties);
final StatelessDataflow dataflow = sourceTask.getDataflow();
final Map<String, String> localStates = dataflow.getComponentStates(Scope.LOCAL);
final Map<String, String> clusterStates = dataflow.getComponentStates(Scope.CLUSTER);
assertTrue(clusterStates.isEmpty());
assertFalse(localStates.isEmpty());
final String generateProcessorState = localStates.get(generateProcessorId);
assertEquals(serializedStateMap, generateProcessorState);
}
private Map<String, String> createDefaultProperties(TestInfo testInfo) {
final Map<String, String> properties = new HashMap<>();
properties.put(StatelessNiFiCommonConfig.DATAFLOW_TIMEOUT, "30 sec");
properties.put(StatelessNiFiSourceConfig.OUTPUT_PORT_NAME, "Out");
properties.put(StatelessNiFiSourceConfig.TOPIC_NAME, "my-topic");
properties.put(StatelessNiFiSourceConfig.KEY_ATTRIBUTE, "kafka.key");
properties.put(StatelessNiFiCommonConfig.FLOW_SNAPSHOT, "src/test/resources/flows/Generate_Data.json");
properties.put(StatelessNiFiCommonConfig.NAR_DIRECTORY, "target/nifi-kafka-connector-bin/nars");
properties.put(StatelessNiFiCommonConfig.WORKING_DIRECTORY, "target/nifi-kafka-connector-bin/working");
properties.put(StatelessNiFiCommonConfig.DATAFLOW_NAME, testInfo.getTestMethod().get().getName());
properties.put(StatelessNiFiSourceConfig.STATE_MAP_KEY, "1");
return properties;
}
private SourceTaskContext createContext() {
final OffsetStorageReader offsetStorageReader = createOffsetStorageReader();
return createContext(offsetStorageReader);
}
private SourceTaskContext createContext(final OffsetStorageReader offsetStorageReader) {
return new SourceTaskContext() {
@Override
public Map<String, String> configs() {
return Collections.emptyMap();
}
@Override
public OffsetStorageReader offsetStorageReader() {
return offsetStorageReader;
}
};
}
private OffsetStorageReader createOffsetStorageReader() {
return new OffsetStorageReader() {
@Override
public <T> Map<String, Object> offset(final Map<String, T> partition) {
return Collections.emptyMap();
}
@Override
public <T> Map<Map<String, T>, Map<String, Object>> offsets(final Collection<Map<String, T>> partitions) {
return Collections.emptyMap();
}
};
}
}

View File

@ -1,222 +0,0 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.nifi.kafka.connect;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;
import java.io.File;
import java.io.IOException;
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertTrue;
import static org.junit.jupiter.api.io.CleanupMode.ALWAYS;
public class WorkingDirectoryUtilsTest {
@Test
public void testDeleteNonexistentFile(@TempDir(cleanup = ALWAYS) File tempDir) {
File nonexistentFile = new File(tempDir, "testFile");
WorkingDirectoryUtils.purgeDirectory(nonexistentFile);
assertFalse(nonexistentFile.exists());
}
@Test
public void testDeleteFlatFile(@TempDir(cleanup = ALWAYS) File tempDir) throws IOException {
File file = new File(tempDir, "testFile");
file.createNewFile();
WorkingDirectoryUtils.purgeDirectory(file);
assertFalse(file.exists());
}
@Test
public void testDeleteDirectoryWithContents(@TempDir(cleanup = ALWAYS) File tempDir) throws IOException {
File directory = new File(tempDir, "directory");
File subDirectory = new File(directory, "subDirectory");
File subDirectoryContent = new File(subDirectory, "subDirectoryContent");
File directoryContent = new File(directory, "directoryContent");
directory.mkdir();
subDirectory.mkdir();
subDirectoryContent.createNewFile();
directoryContent.createNewFile();
WorkingDirectoryUtils.purgeDirectory(directory);
assertFalse(directory.exists());
}
@Test
public void testPurgeUnpackedNarsEmptyRootDirectory(@TempDir(cleanup = ALWAYS) File tempDir) {
File rootDirectory = new File(tempDir, "rootDirectory");
rootDirectory.mkdir();
WorkingDirectoryUtils.purgeIncompleteUnpackedNars(rootDirectory);
assertTrue(rootDirectory.exists());
}
@Test
public void testPurgeUnpackedNarsRootDirectoryWithFilesOnly(@TempDir(cleanup = ALWAYS) File tempDir) throws IOException {
File rootDirectory = new File(tempDir, "rootDirectory");
File directoryContent1 = new File(rootDirectory, "file1");
File directoryContent2 = new File(rootDirectory, "file2");
rootDirectory.mkdir();
directoryContent1.createNewFile();
directoryContent2.createNewFile();
WorkingDirectoryUtils.purgeIncompleteUnpackedNars(rootDirectory);
assertTrue(rootDirectory.exists() && directoryContent1.exists() && directoryContent2.exists());
}
@Test
public void testPurgeUnpackedNars(@TempDir(cleanup = ALWAYS) File tempDir) throws IOException {
File rootDirectory = new File(tempDir, "rootDirectory");
rootDirectory.mkdir();
TestDirectoryStructure testDirectoryStructure = new TestDirectoryStructure(rootDirectory);
WorkingDirectoryUtils.purgeIncompleteUnpackedNars(testDirectoryStructure.getRootDirectory());
assertTrue(testDirectoryStructure.isConsistent());
}
@Test
public void testWorkingDirectoryIntegrityRestored(@TempDir(cleanup = ALWAYS) File tempDir) throws IOException {
/*
workingDirectory
- nar
- extensions
- *TestDirectoryStructure*
- narDirectory
- narFile
- extensions
- *TestDirectoryStructure*
- additionalDirectory
- workingDirectoryFile
*/
File workingDirectory = new File(tempDir, "workingDirectory");
File nar = new File(workingDirectory, "nar");
File narExtensions = new File(nar, "extensions");
File narDirectory = new File(nar, "narDirectory");
File narFile = new File(nar, "narFile");
File extensions = new File(workingDirectory, "extensions");
File additionalDirectory = new File(workingDirectory, "additionalDirectory");
File workingDirectoryFile = new File(workingDirectory, "workingDirectoryFile");
workingDirectory.mkdir();
nar.mkdir();
narExtensions.mkdir();
narDirectory.mkdir();
narFile.createNewFile();
extensions.mkdir();
additionalDirectory.mkdir();
workingDirectoryFile.createNewFile();
TestDirectoryStructure narExtensionsStructure = new TestDirectoryStructure(narExtensions);
TestDirectoryStructure extensionsStructure = new TestDirectoryStructure(extensions);
WorkingDirectoryUtils.reconcileWorkingDirectory(workingDirectory);
assertTrue(workingDirectory.exists()
&& nar.exists()
&& narExtensionsStructure.isConsistent()
&& narDirectory.exists()
&& narFile.exists()
&& extensionsStructure.isConsistent()
&& additionalDirectory.exists()
&& workingDirectoryFile.exists()
);
}
private class TestDirectoryStructure {
/*
rootDirectory
- subDirectory1-nar-unpacked
- subDirectory1File1
- nar-digest
- subDirectory2
- subDirectory2File1
- subDirectory3-nar-unpacked
- subDirectory3Dir1
- subDirectory3Dir1File1
- subDirectory3File1
- fileInRoot
*/
File rootDirectory;
File subDirectory1;
File subDirectory2;
File subDirectory3;
File fileInRoot;
File subDirectory1File1;
File subDirectory1File2;
File subDirectory2File1;
File subDirectory3Dir1;
File subDirectory3File1;
File subDirectory3Dir1File1;
public TestDirectoryStructure(final File rootDirectory) throws IOException {
this.rootDirectory = rootDirectory;
subDirectory1 = new File(rootDirectory, "subDirectory1-" + WorkingDirectoryUtils.NAR_UNPACKED_SUFFIX);
subDirectory2 = new File(rootDirectory, "subDirectory2");
subDirectory3 = new File(rootDirectory, "subDirectory3-" + WorkingDirectoryUtils.NAR_UNPACKED_SUFFIX);
fileInRoot = new File(rootDirectory, "fileInRoot");
subDirectory1File1 = new File(subDirectory1, "subDirectory1File1");
subDirectory1File2 = new File(subDirectory1, WorkingDirectoryUtils.HASH_FILENAME);
subDirectory2File1 = new File(subDirectory2, "subDirectory2File1");
subDirectory3Dir1 = new File(subDirectory3, "subDirectory3Dir1");
subDirectory3File1 = new File(subDirectory3, "subDirectory3File1");
subDirectory3Dir1File1 = new File(subDirectory3Dir1, "subDirectory3Dir1File1");
subDirectory1.mkdir();
subDirectory2.mkdir();
subDirectory3.mkdir();
fileInRoot.createNewFile();
subDirectory1File1.createNewFile();
subDirectory1File2.createNewFile();
subDirectory2File1.createNewFile();
subDirectory3File1.createNewFile();
subDirectory3Dir1.mkdir();
subDirectory3Dir1File1.createNewFile();
}
public File getRootDirectory() {
return rootDirectory;
}
/**
* Checks that all directories ending in 'nar-unpacked' that contain a file named 'nar-digest' still exist,
* and that the directory ending in 'nar-unpacked' without a 'nar-digest' file has been removed along with all of its contents.
* @return true if the above is met.
*/
public boolean isConsistent() {
return (rootDirectory.exists()
&& subDirectory1.exists() && subDirectory1File1.exists() && subDirectory1File2.exists()
&& subDirectory2.exists() && subDirectory2File1.exists()
&& !(subDirectory3.exists() || subDirectory3Dir1.exists() || subDirectory3File1.exists() || subDirectory3Dir1File1.exists())
&& fileInRoot.exists());
}
}
}

View File

@ -1,167 +0,0 @@
{
"flowContents": {
"identifier": "49c7d1d6-d925-3e92-8765-797530ef8a8c",
"name": "Generate Data",
"comments": "",
"position": {
"x": 888.0,
"y": 224.0
},
"processGroups": [],
"remoteProcessGroups": [],
"processors": [
{
"identifier": "c6562d38-4994-3fcc-ac98-1da34de1916f",
"name": "GenerateFlowFile",
"comments": "",
"position": {
"x": 1046.0,
"y": 170.0
},
"bundle": {
"group": "org.apache.nifi",
"artifact": "nifi-system-test-extensions-nar",
"version": "1.12.1"
},
"style": {},
"type": "org.apache.nifi.processors.tests.system.GenerateFlowFile",
"properties": {
"File Size": "1 KB",
"Text": "Hello World",
"Batch Size": "1",
"number": "${nextInt()}",
"greeting": "hello"
},
"propertyDescriptors": {
"character-set": {
"name": "character-set",
"displayName": "Character Set",
"identifiesControllerService": false,
"sensitive": false
},
"File Size": {
"name": "File Size",
"displayName": "File Size",
"identifiesControllerService": false,
"sensitive": false
},
"mime-type": {
"name": "mime-type",
"displayName": "Mime Type",
"identifiesControllerService": false,
"sensitive": false
},
"generate-ff-custom-text": {
"name": "generate-ff-custom-text",
"displayName": "Custom Text",
"identifiesControllerService": false,
"sensitive": false
},
"Batch Size": {
"name": "Batch Size",
"displayName": "Batch Size",
"identifiesControllerService": false,
"sensitive": false
},
"Unique FlowFiles": {
"name": "Unique FlowFiles",
"displayName": "Unique FlowFiles",
"identifiesControllerService": false,
"sensitive": false
},
"Data Format": {
"name": "Data Format",
"displayName": "Data Format",
"identifiesControllerService": false,
"sensitive": false
}
},
"schedulingPeriod": "0 sec",
"schedulingStrategy": "TIMER_DRIVEN",
"executionNode": "ALL",
"penaltyDuration": "30 sec",
"yieldDuration": "1 sec",
"bulletinLevel": "WARN",
"runDurationMillis": 0,
"concurrentlySchedulableTaskCount": 1,
"autoTerminatedRelationships": [],
"scheduledState": "ENABLED",
"componentType": "PROCESSOR",
"groupIdentifier": "49c7d1d6-d925-3e92-8765-797530ef8a8c"
}
],
"inputPorts": [],
"outputPorts": [
{
"identifier": "22dca4db-f1e7-3381-8e3e-ba5d308ede67",
"name": "Out",
"position": {
"x": 1104.0,
"y": 472.0
},
"type": "OUTPUT_PORT",
"concurrentlySchedulableTaskCount": 1,
"allowRemoteAccess": false,
"componentType": "OUTPUT_PORT",
"groupIdentifier": "49c7d1d6-d925-3e92-8765-797530ef8a8c"
},
{
"identifier": "22dca4db-f1e7-3381-8e3e-ba5d308e0000",
"name": "Another",
"position": {
"x": 404.0,
"y": 472.0
},
"type": "OUTPUT_PORT",
"concurrentlySchedulableTaskCount": 1,
"allowRemoteAccess": false,
"componentType": "OUTPUT_PORT",
"groupIdentifier": "49c7d1d6-d925-3e92-8765-797530ef8a8c"
}
],
"connections": [
{
"identifier": "6674c5df-af6d-38f7-bbf3-3aa1b6f3ae7f",
"name": "",
"source": {
"id": "c6562d38-4994-3fcc-ac98-1da34de1916f",
"type": "PROCESSOR",
"groupId": "49c7d1d6-d925-3e92-8765-797530ef8a8c",
"name": "GenerateFlowFile",
"comments": ""
},
"destination": {
"id": "22dca4db-f1e7-3381-8e3e-ba5d308ede67",
"type": "OUTPUT_PORT",
"groupId": "49c7d1d6-d925-3e92-8765-797530ef8a8c",
"name": "Out"
},
"labelIndex": 1,
"zIndex": 0,
"selectedRelationships": [
"success"
],
"backPressureObjectThreshold": 10000,
"backPressureDataSizeThreshold": "1 GB",
"flowFileExpiration": "0 sec",
"prioritizers": [],
"bends": [],
"loadBalanceStrategy": "DO_NOT_LOAD_BALANCE",
"partitioningAttribute": "",
"loadBalanceCompression": "DO_NOT_COMPRESS",
"componentType": "CONNECTION",
"groupIdentifier": "49c7d1d6-d925-3e92-8765-797530ef8a8c"
}
],
"labels": [],
"funnels": [],
"controllerServices": [],
"variables": {},
"flowFileConcurrency": "UNBOUNDED",
"flowFileOutboundPolicy": "STREAM_WHEN_AVAILABLE",
"componentType": "PROCESS_GROUP"
},
"externalControllerServices": {},
"parameterContexts": {},
"flowEncodingVersion": "1.0"
}

View File

@ -1,312 +0,0 @@
{
"flowContents": {
"identifier": "52e5e828-d33e-3892-9576-30404bfcdae0",
"name": "Write To File",
"comments": "",
"position": {
"x": -87.1781433064217,
"y": 477.5978349604865
},
"processGroups": [],
"remoteProcessGroups": [],
"processors": [
{
"identifier": "bb6fb270-4461-3231-88b6-c744a095f492",
"name": "PutFile",
"comments": "",
"position": {
"x": 980.0000001338867,
"y": 456.99999844406426
},
"bundle": {
"group": "org.apache.nifi",
"artifact": "nifi-standard-nar",
"version": "1.13.0-SNAPSHOT"
},
"style": {},
"type": "org.apache.nifi.processors.standard.PutFile",
"properties": {
"Group": null,
"Owner": null,
"Create Missing Directories": "true",
"Permissions": null,
"Maximum File Count": null,
"Last Modified Time": null,
"Directory": "#{Directory}",
"Conflict Resolution Strategy": "replace"
},
"propertyDescriptors": {
"Group": {
"name": "Group",
"displayName": "Group",
"identifiesControllerService": false,
"sensitive": false
},
"Owner": {
"name": "Owner",
"displayName": "Owner",
"identifiesControllerService": false,
"sensitive": false
},
"Create Missing Directories": {
"name": "Create Missing Directories",
"displayName": "Create Missing Directories",
"identifiesControllerService": false,
"sensitive": false
},
"Permissions": {
"name": "Permissions",
"displayName": "Permissions",
"identifiesControllerService": false,
"sensitive": false
},
"Maximum File Count": {
"name": "Maximum File Count",
"displayName": "Maximum File Count",
"identifiesControllerService": false,
"sensitive": false
},
"Last Modified Time": {
"name": "Last Modified Time",
"displayName": "Last Modified Time",
"identifiesControllerService": false,
"sensitive": false
},
"Directory": {
"name": "Directory",
"displayName": "Directory",
"identifiesControllerService": false,
"sensitive": false
},
"Conflict Resolution Strategy": {
"name": "Conflict Resolution Strategy",
"displayName": "Conflict Resolution Strategy",
"identifiesControllerService": false,
"sensitive": false
}
},
"schedulingPeriod": "0 sec",
"schedulingStrategy": "TIMER_DRIVEN",
"executionNode": "ALL",
"penaltyDuration": "30 sec",
"yieldDuration": "1 sec",
"bulletinLevel": "WARN",
"runDurationMillis": 0,
"concurrentlySchedulableTaskCount": 1,
"autoTerminatedRelationships": [],
"scheduledState": "ENABLED",
"componentType": "PROCESSOR",
"groupIdentifier": "52e5e828-d33e-3892-9576-30404bfcdae0"
}
],
"inputPorts": [
{
"identifier": "c870b7f4-e922-3fa6-a1dd-b368a768d5f3",
"name": "Another",
"position": {
"x": 1376.0000001338867,
"y": 212.99999844406426
},
"type": "INPUT_PORT",
"concurrentlySchedulableTaskCount": 1,
"allowRemoteAccess": false,
"componentType": "INPUT_PORT",
"groupIdentifier": "52e5e828-d33e-3892-9576-30404bfcdae0"
},
{
"identifier": "8482cb2e-bc3b-3249-9b73-81f536949a62",
"name": "In",
"position": {
"x": 791.0000001338867,
"y": 214.99999844406426
},
"type": "INPUT_PORT",
"concurrentlySchedulableTaskCount": 1,
"allowRemoteAccess": false,
"componentType": "INPUT_PORT",
"groupIdentifier": "52e5e828-d33e-3892-9576-30404bfcdae0"
}
],
"outputPorts": [
{
"identifier": "2d89bbd9-1d56-3969-8334-47acfee60762",
"name": "Failure",
"position": {
"x": 1305.0000001338867,
"y": 751.9999984440643
},
"type": "OUTPUT_PORT",
"concurrentlySchedulableTaskCount": 1,
"allowRemoteAccess": false,
"componentType": "OUTPUT_PORT",
"groupIdentifier": "52e5e828-d33e-3892-9576-30404bfcdae0"
},
{
"identifier": "2b4b8e53-6fec-3114-911f-f13f944a2434",
"name": "Success",
"position": {
"x": 720.0,
"y": 752.0
},
"type": "OUTPUT_PORT",
"concurrentlySchedulableTaskCount": 1,
"allowRemoteAccess": false,
"componentType": "OUTPUT_PORT",
"groupIdentifier": "52e5e828-d33e-3892-9576-30404bfcdae0"
}
],
"connections": [
{
"identifier": "cb47f40e-2da9-30a8-9f9e-da9d1dda6a12",
"name": "",
"source": {
"id": "bb6fb270-4461-3231-88b6-c744a095f492",
"type": "PROCESSOR",
"groupId": "52e5e828-d33e-3892-9576-30404bfcdae0",
"name": "PutFile",
"comments": ""
},
"destination": {
"id": "2d89bbd9-1d56-3969-8334-47acfee60762",
"type": "OUTPUT_PORT",
"groupId": "52e5e828-d33e-3892-9576-30404bfcdae0",
"name": "Failure"
},
"labelIndex": 1,
"zIndex": 0,
"selectedRelationships": [
"failure"
],
"backPressureObjectThreshold": 10000,
"backPressureDataSizeThreshold": "1 GB",
"flowFileExpiration": "0 sec",
"prioritizers": [],
"bends": [],
"loadBalanceStrategy": "DO_NOT_LOAD_BALANCE",
"partitioningAttribute": "",
"loadBalanceCompression": "DO_NOT_COMPRESS",
"componentType": "CONNECTION",
"groupIdentifier": "52e5e828-d33e-3892-9576-30404bfcdae0"
},
{
"identifier": "a985c4a4-e34e-3aa5-88c0-3f3564ab76fc",
"name": "",
"source": {
"id": "c870b7f4-e922-3fa6-a1dd-b368a768d5f3",
"type": "INPUT_PORT",
"groupId": "52e5e828-d33e-3892-9576-30404bfcdae0",
"name": "Another"
},
"destination": {
"id": "bb6fb270-4461-3231-88b6-c744a095f492",
"type": "PROCESSOR",
"groupId": "52e5e828-d33e-3892-9576-30404bfcdae0",
"name": "PutFile",
"comments": ""
},
"labelIndex": 1,
"zIndex": 0,
"selectedRelationships": [
""
],
"backPressureObjectThreshold": 10000,
"backPressureDataSizeThreshold": "1 GB",
"flowFileExpiration": "0 sec",
"prioritizers": [],
"bends": [],
"loadBalanceStrategy": "DO_NOT_LOAD_BALANCE",
"partitioningAttribute": "",
"loadBalanceCompression": "DO_NOT_COMPRESS",
"componentType": "CONNECTION",
"groupIdentifier": "52e5e828-d33e-3892-9576-30404bfcdae0"
},
{
"identifier": "87250c74-a39e-3f3f-ad48-d4bcf6bd66d7",
"name": "",
"source": {
"id": "8482cb2e-bc3b-3249-9b73-81f536949a62",
"type": "INPUT_PORT",
"groupId": "52e5e828-d33e-3892-9576-30404bfcdae0",
"name": "In"
},
"destination": {
"id": "bb6fb270-4461-3231-88b6-c744a095f492",
"type": "PROCESSOR",
"groupId": "52e5e828-d33e-3892-9576-30404bfcdae0",
"name": "PutFile",
"comments": ""
},
"labelIndex": 1,
"zIndex": 0,
"selectedRelationships": [
""
],
"backPressureObjectThreshold": 10000,
"backPressureDataSizeThreshold": "1 GB",
"flowFileExpiration": "0 sec",
"prioritizers": [],
"bends": [],
"loadBalanceStrategy": "DO_NOT_LOAD_BALANCE",
"partitioningAttribute": "",
"loadBalanceCompression": "DO_NOT_COMPRESS",
"componentType": "CONNECTION",
"groupIdentifier": "52e5e828-d33e-3892-9576-30404bfcdae0"
},
{
"identifier": "cd6c9606-841e-30bd-b7bd-85261067fdef",
"name": "",
"source": {
"id": "bb6fb270-4461-3231-88b6-c744a095f492",
"type": "PROCESSOR",
"groupId": "52e5e828-d33e-3892-9576-30404bfcdae0",
"name": "PutFile",
"comments": ""
},
"destination": {
"id": "2b4b8e53-6fec-3114-911f-f13f944a2434",
"type": "OUTPUT_PORT",
"groupId": "52e5e828-d33e-3892-9576-30404bfcdae0",
"name": "Success"
},
"labelIndex": 1,
"zIndex": 0,
"selectedRelationships": [
"success"
],
"backPressureObjectThreshold": 10000,
"backPressureDataSizeThreshold": "1 GB",
"flowFileExpiration": "0 sec",
"prioritizers": [],
"bends": [],
"loadBalanceStrategy": "DO_NOT_LOAD_BALANCE",
"partitioningAttribute": "",
"loadBalanceCompression": "DO_NOT_COMPRESS",
"componentType": "CONNECTION",
"groupIdentifier": "52e5e828-d33e-3892-9576-30404bfcdae0"
}
],
"labels": [],
"funnels": [],
"controllerServices": [],
"variables": {},
"parameterContextName": "WriteToFile",
"flowFileConcurrency": "UNBOUNDED",
"flowFileOutboundPolicy": "STREAM_WHEN_AVAILABLE",
"componentType": "PROCESS_GROUP"
},
"externalControllerServices": {},
"parameterContexts": {
"WriteToFile": {
"name": "WriteToFile",
"parameters": [
{
"name": "Directory",
"description": "",
"sensitive": false,
"value": "target/sink-output"
}
]
}
},
"flowEncodingVersion": "1.0"
}

View File

@ -1,63 +0,0 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<parent>
<artifactId>nifi-kafka-connect</artifactId>
<groupId>org.apache.nifi</groupId>
<version>2.0.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>
<artifactId>nifi-kafka-connector</artifactId>
<dependencies>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>connect-api</artifactId>
<version>2.6.3</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-clients</artifactId>
<version>2.6.3</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
</dependency>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-api</artifactId>
<version>2.0.0-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-stateless-api</artifactId>
<version>2.0.0-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-stateless-bootstrap</artifactId>
<version>2.0.0-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-utils</artifactId>
<version>2.0.0-SNAPSHOT</version>
</dependency>
</dependencies>
</project>

View File

@ -1,276 +0,0 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.nifi.kafka.connect;
import org.apache.nifi.stateless.bootstrap.StatelessBootstrap;
import org.apache.nifi.stateless.config.ExtensionClientDefinition;
import org.apache.nifi.stateless.config.ParameterOverride;
import org.apache.nifi.stateless.config.SslContextDefinition;
import org.apache.nifi.stateless.engine.StatelessEngineConfiguration;
import org.apache.nifi.stateless.flow.DataflowDefinition;
import org.apache.nifi.stateless.flow.StatelessDataflow;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;
import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.jar.JarFile;
import java.util.jar.Manifest;
import java.util.regex.Pattern;
public class StatelessKafkaConnectorUtil {
private static final String UNKNOWN_VERSION = "<Unable to determine Stateless NiFi Kafka Connector Version>";
private static final Logger logger = LoggerFactory.getLogger(StatelessKafkaConnectorUtil.class);
private static final Lock unpackNarLock = new ReentrantLock();
protected static final Pattern STATELESS_BOOTSTRAP_FILE_PATTERN = Pattern.compile("nifi-stateless-bootstrap-(.*).jar");
public static String getVersion() {
final File bootstrapJar = detectBootstrapJar();
if (bootstrapJar == null) {
return UNKNOWN_VERSION;
}
try (final JarFile jarFile = new JarFile(bootstrapJar)) {
final Manifest manifest = jarFile.getManifest();
if (manifest != null) {
return manifest.getMainAttributes().getValue("Implementation-Version");
}
} catch (IOException e) {
logger.warn("Could not determine Version of NiFi Stateless Kafka Connector", e);
return UNKNOWN_VERSION;
}
return UNKNOWN_VERSION;
}
public static StatelessDataflow createDataflow(final StatelessNiFiCommonConfig config) {
final StatelessEngineConfiguration engineConfiguration = createEngineConfiguration(config);
final List<ParameterOverride> parameterOverrides = config.getParameterOverrides();
final String dataflowName = config.getDataflowName();
final DataflowDefinition dataflowDefinition;
final StatelessBootstrap bootstrap;
try {
final Map<String, String> dataflowDefinitionProperties = new HashMap<>();
config.setFlowDefinition(dataflowDefinitionProperties);
dataflowDefinitionProperties.put(StatelessNiFiCommonConfig.BOOTSTRAP_FLOW_NAME, dataflowName);
MDC.setContextMap(Collections.singletonMap("dataflow", dataflowName));
StatelessDataflow dataflow;
// Use a lock to ensure that only a single thread is calling StatelessBootstrap.bootstrap().
// We do this because the bootstrap() method will expand all NAR files into the working directory.
// If we have multiple Connector instances, or multiple tasks, we don't want several threads all
// unpacking NARs at the same time, as it could potentially result in the working directory becoming corrupted.
unpackNarLock.lock();
try {
WorkingDirectoryUtils.reconcileWorkingDirectory(engineConfiguration.getWorkingDirectory());
bootstrap = StatelessBootstrap.bootstrap(engineConfiguration, StatelessNiFiSourceTask.class.getClassLoader());
dataflowDefinition = bootstrap.parseDataflowDefinition(dataflowDefinitionProperties, parameterOverrides);
dataflow = bootstrap.createDataflow(dataflowDefinition);
} finally {
unpackNarLock.unlock();
}
return dataflow;
} catch (final Exception e) {
throw new RuntimeException("Failed to bootstrap Stateless NiFi Engine", e);
}
}
private static StatelessEngineConfiguration createEngineConfiguration(final StatelessNiFiCommonConfig config) {
final File narDirectory;
final String narDirectoryFilename = config.getNarDirectory();
if (narDirectoryFilename == null) {
narDirectory = detectNarDirectory();
} else {
narDirectory = new File(narDirectoryFilename);
}
final String dataflowName = config.getDataflowName();
final File baseWorkingDirectory;
final String workingDirectoryFilename = config.getWorkingDirectory();
if (workingDirectoryFilename == null) {
baseWorkingDirectory = StatelessNiFiCommonConfig.DEFAULT_WORKING_DIRECTORY;
} else {
baseWorkingDirectory = new File(workingDirectoryFilename);
}
final File workingDirectory = new File(baseWorkingDirectory, dataflowName);
final File extensionsDirectory;
final String extensionsDirectoryFilename = config.getExtensionsDirectory();
if (extensionsDirectoryFilename == null) {
extensionsDirectory = StatelessNiFiCommonConfig.DEFAULT_EXTENSIONS_DIRECTORY;
} else {
extensionsDirectory = new File(extensionsDirectoryFilename);
}
final SslContextDefinition sslContextDefinition = createSslContextDefinition(config);
return new StatelessEngineConfiguration() {
@Override
public File getWorkingDirectory() {
return workingDirectory;
}
@Override
public File getNarDirectory() {
return narDirectory;
}
@Override
public File getExtensionsDirectory() {
return extensionsDirectory;
}
@Override
public Collection<File> getReadOnlyExtensionsDirectories() {
return Collections.emptyList();
}
@Override
public File getKrb5File() {
return new File(config.getKrb5File());
}
@Override
public Optional<File> getContentRepositoryDirectory() {
return Optional.empty();
}
@Override
public SslContextDefinition getSslContext() {
return sslContextDefinition;
}
@Override
public String getSensitivePropsKey() {
return config.getSensitivePropsKey();
}
@Override
public List<ExtensionClientDefinition> getExtensionClients() {
final List<ExtensionClientDefinition> extensionClientDefinitions = new ArrayList<>();
final String nexusBaseUrl = config.getNexusBaseUrl();
if (nexusBaseUrl != null) {
final ExtensionClientDefinition definition = new ExtensionClientDefinition();
definition.setUseSslContext(false);
definition.setExtensionClientType("nexus");
definition.setCommsTimeout("30 secs");
definition.setBaseUrl(nexusBaseUrl);
extensionClientDefinitions.add(definition);
}
return extensionClientDefinitions;
}
@Override
public String getStatusTaskInterval() {
return "1 min";
}
};
}
private static SslContextDefinition createSslContextDefinition(final StatelessNiFiCommonConfig config) {
final String truststoreFile = config.getTruststoreFile();
if (truststoreFile == null || truststoreFile.trim().isEmpty()) {
return null;
}
final SslContextDefinition sslContextDefinition;
sslContextDefinition = new SslContextDefinition();
sslContextDefinition.setTruststoreFile(truststoreFile);
sslContextDefinition.setTruststorePass(config.getTruststorePassword());
sslContextDefinition.setTruststoreType(config.getTruststoreType());
final String keystoreFile = config.getKeystoreFile();
if (keystoreFile != null && !keystoreFile.trim().isEmpty()) {
sslContextDefinition.setKeystoreFile(keystoreFile);
sslContextDefinition.setKeystoreType(config.getKeystoreType());
final String keystorePass = config.getKeystorePassword();
sslContextDefinition.setKeystorePass(keystorePass);
final String explicitKeyPass = config.getKeystoreKeyPassword();
final String keyPass = (explicitKeyPass == null || explicitKeyPass.trim().isEmpty()) ? keystorePass : explicitKeyPass;
sslContextDefinition.setKeyPass(keyPass);
}
return sslContextDefinition;
}
private static URLClassLoader getConnectClassLoader() {
final ClassLoader classLoader = StatelessKafkaConnectorUtil.class.getClassLoader();
if (!(classLoader instanceof URLClassLoader)) {
throw new IllegalStateException("No configuration value was set for the " +
StatelessNiFiCommonConfig.NAR_DIRECTORY +
" configuration property, and was unable to determine the NAR directory automatically");
}
return (URLClassLoader) classLoader;
}
private static File detectBootstrapJar() {
final URLClassLoader urlClassLoader = getConnectClassLoader();
for (final URL url : urlClassLoader.getURLs()) {
final String artifactFilename = url.getFile();
if (artifactFilename == null) {
continue;
}
final File artifactFile = new File(artifactFilename);
if (STATELESS_BOOTSTRAP_FILE_PATTERN.matcher(artifactFile.getName()).matches()) {
return artifactFile;
}
}
return null;
}
private static File detectNarDirectory() {
final File bootstrapJar = detectBootstrapJar();
if (bootstrapJar == null) {
final URLClassLoader urlClassLoader = getConnectClassLoader();
logger.error("ClassLoader that loaded Stateless Kafka Connector did not contain nifi-stateless-bootstrap." +
" URLs that were present: {}", Arrays.asList(urlClassLoader.getURLs()));
throw new IllegalStateException("No configuration value was set for the " +
StatelessNiFiCommonConfig.NAR_DIRECTORY +
" configuration property, and was unable to determine the NAR directory automatically");
}
final File narDirectory = bootstrapJar.getParentFile();
logger.info("Detected NAR Directory to be {}", narDirectory.getAbsolutePath());
return narDirectory;
}
}
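
The utility above is the single bootstrap path shared by the sink and source tasks: it resolves the NAR, extensions, and working directories, builds a StatelessEngineConfiguration, and returns a ready-to-initialize dataflow. A minimal sketch of that call path follows; the class name, flow name, and snapshot path are illustrative only, and the sketch assumes the connector and its NARs are already installed so that directory auto-detection can succeed.

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.nifi.kafka.connect.StatelessKafkaConnectorUtil;
import org.apache.nifi.kafka.connect.StatelessNiFiSinkConfig;
import org.apache.nifi.stateless.flow.StatelessDataflow;

public class CreateDataflowSketch {

    public static void main(final String[] args) {
        // Illustrative connector properties; "name" and "flow.snapshot" are the two values every deployment supplies.
        final Map<String, String> properties = new HashMap<>();
        properties.put("name", "kafka-to-storage");
        properties.put("flow.snapshot", "/opt/nifi/flows/kafka-to-storage.json");

        final StatelessNiFiSinkConfig config = new StatelessNiFiSinkConfig(properties);

        // Unpacks NARs into the per-dataflow working directory (guarded by the lock shown above) and builds the flow.
        final StatelessDataflow dataflow = StatelessKafkaConnectorUtil.createDataflow(config);
        dataflow.initialize();

        // ... enqueue or pull data by triggering the dataflow, then release resources when the task stops.
        dataflow.shutdown();
    }
}
```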

View File

@ -1,269 +0,0 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.nifi.kafka.connect;
import org.apache.kafka.common.config.AbstractConfig;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.common.config.types.Password;
import org.apache.nifi.kafka.connect.validators.ConnectDirectoryExistsValidator;
import org.apache.nifi.kafka.connect.validators.ConnectHttpUrlValidator;
import org.apache.nifi.kafka.connect.validators.FlowSnapshotValidator;
import org.apache.nifi.stateless.config.ParameterOverride;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import static org.apache.kafka.common.config.ConfigDef.NonEmptyStringWithoutControlChars.nonEmptyStringWithoutControlChars;
public abstract class StatelessNiFiCommonConfig extends AbstractConfig {
private static final Logger logger = LoggerFactory.getLogger(StatelessNiFiCommonConfig.class);
public static final String NAR_DIRECTORY = "nar.directory";
public static final String EXTENSIONS_DIRECTORY = "extensions.directory";
public static final String WORKING_DIRECTORY = "working.directory";
public static final String FLOW_SNAPSHOT = "flow.snapshot";
public static final String KRB5_FILE = "krb5.file";
public static final String NEXUS_BASE_URL = "nexus.url";
public static final String DATAFLOW_TIMEOUT = "dataflow.timeout";
public static final String DATAFLOW_NAME = "name";
public static final String TRUSTSTORE_FILE = "security.truststore";
public static final String TRUSTSTORE_TYPE = "security.truststoreType";
public static final String TRUSTSTORE_PASSWORD = "security.truststorePasswd";
public static final String KEYSTORE_FILE = "security.keystore";
public static final String KEYSTORE_TYPE = "security.keystoreType";
public static final String KEYSTORE_PASSWORD = "security.keystorePasswd";
public static final String KEY_PASSWORD = "security.keyPasswd";
public static final String SENSITIVE_PROPS_KEY = "sensitive.props.key";
public static final String BOOTSTRAP_SNAPSHOT_URL = "nifi.stateless.flow.snapshot.url";
public static final String BOOTSTRAP_SNAPSHOT_FILE = "nifi.stateless.flow.snapshot.file";
public static final String BOOTSTRAP_SNAPSHOT_CONTENTS = "nifi.stateless.flow.snapshot.contents";
public static final String BOOTSTRAP_FLOW_NAME = "nifi.stateless.flow.name";
public static final String DEFAULT_KRB5_FILE = "/etc/krb5.conf";
public static final String DEFAULT_DATAFLOW_TIMEOUT = "60 sec";
public static final File DEFAULT_WORKING_DIRECTORY = new File("/tmp/nifi-stateless-working");
public static final File DEFAULT_EXTENSIONS_DIRECTORY = new File("/tmp/nifi-stateless-extensions");
public static final String DEFAULT_SENSITIVE_PROPS_KEY = "nifi-stateless";
public static final String FLOW_GROUP = "Flow";
public static final String DIRECTORIES_GROUP = "Directories";
public static final String TLS_GROUP = "TLS";
public static final String KERBEROS_GROUP = "Kerberos";
public static final String NEXUS_GROUP = "Nexus";
public static final String SECURITY_GROUP = "Security";
public static final String RECORD_GROUP = "Record";
protected static final Pattern PARAMETER_WITH_CONTEXT_PATTERN = Pattern.compile("parameter\\.(.*?):(.*)");
protected static final Pattern PARAMETER_WITHOUT_CONTEXT_PATTERN = Pattern.compile("parameter\\.(.*)");
protected StatelessNiFiCommonConfig(ConfigDef definition, Map<?, ?> originals, Map<String, ?> configProviderProps, boolean doLog) {
super(definition, originals, configProviderProps, doLog);
}
protected StatelessNiFiCommonConfig(ConfigDef definition, Map<?, ?> originals) {
super(definition, originals);
}
protected StatelessNiFiCommonConfig(ConfigDef definition, Map<?, ?> originals, boolean doLog) {
super(definition, originals, doLog);
}
public String getNarDirectory() {
return getString(NAR_DIRECTORY);
}
public String getExtensionsDirectory() {
return getString(EXTENSIONS_DIRECTORY);
}
public String getWorkingDirectory() {
return getString(WORKING_DIRECTORY);
}
public String getDataflowName() {
return getString(DATAFLOW_NAME);
}
public String getKrb5File() {
return getString(KRB5_FILE);
}
public String getNexusBaseUrl() {
return getString(NEXUS_BASE_URL);
}
public String getDataflowTimeout() {
return getString(DATAFLOW_TIMEOUT);
}
public String getKeystoreFile() {
return getString(KEYSTORE_FILE);
}
public String getKeystoreType() {
return getString(KEYSTORE_TYPE);
}
public String getKeystorePassword() {
return getOptionalPassword(KEYSTORE_PASSWORD);
}
public String getKeystoreKeyPassword() {
return getOptionalPassword(KEY_PASSWORD);
}
public String getTruststoreFile() {
return getString(TRUSTSTORE_FILE);
}
public String getTruststoreType() {
return getString(TRUSTSTORE_TYPE);
}
public String getTruststorePassword() {
return getOptionalPassword(TRUSTSTORE_PASSWORD);
}
public String getSensitivePropsKey() {
return getOptionalPassword(SENSITIVE_PROPS_KEY);
}
/**
* Populates the properties with the data flow definition parameters
*
* @param dataflowDefinitionProperties The properties to populate.
*/
public void setFlowDefinition(final Map<String, String> dataflowDefinitionProperties) {
String configuredFlowSnapshot = getString(FLOW_SNAPSHOT);
if (configuredFlowSnapshot.startsWith("http://") || configuredFlowSnapshot.startsWith("https://")) {
logger.debug("Configured Flow Snapshot appears to be a URL. Will use {} property to configured Stateless NiFi", StatelessNiFiCommonConfig.BOOTSTRAP_SNAPSHOT_URL);
dataflowDefinitionProperties.put(StatelessNiFiCommonConfig.BOOTSTRAP_SNAPSHOT_URL, configuredFlowSnapshot);
} else if (configuredFlowSnapshot.trim().startsWith("{")) {
logger.debug("Configured Flow Snapshot appears to be JSON. Will use {} property to configured Stateless NiFi", StatelessNiFiCommonConfig.BOOTSTRAP_SNAPSHOT_CONTENTS);
dataflowDefinitionProperties.put(StatelessNiFiCommonConfig.BOOTSTRAP_SNAPSHOT_CONTENTS, configuredFlowSnapshot);
} else {
logger.debug("Configured Flow Snapshot appears to be a File. Will use {} property to configured Stateless NiFi", StatelessNiFiCommonConfig.BOOTSTRAP_SNAPSHOT_FILE);
final File flowSnapshotFile = new File(configuredFlowSnapshot);
dataflowDefinitionProperties.put(StatelessNiFiCommonConfig.BOOTSTRAP_SNAPSHOT_FILE, flowSnapshotFile.getAbsolutePath());
}
}
/**
* Collect Parameter Context values that override standard properties
*
* @return The parameter overrides of the flow.
*/
public List<ParameterOverride> getParameterOverrides() {
final List<ParameterOverride> parameterOverrides = new ArrayList<>();
for (final Map.Entry<String, String> entry : originalsStrings().entrySet()) {
final String parameterValue = entry.getValue();
ParameterOverride parameterOverride = null;
final Matcher matcher = StatelessNiFiCommonConfig.PARAMETER_WITH_CONTEXT_PATTERN.matcher(entry.getKey());
if (matcher.matches()) {
final String contextName = matcher.group(1);
final String parameterName = matcher.group(2);
parameterOverride = new ParameterOverride(contextName, parameterName, parameterValue);
} else {
final Matcher noContextMatcher = StatelessNiFiCommonConfig.PARAMETER_WITHOUT_CONTEXT_PATTERN.matcher(entry.getKey());
if (noContextMatcher.matches()) {
final String parameterName = noContextMatcher.group(1);
parameterOverride = new ParameterOverride(parameterName, parameterValue);
}
}
if (parameterOverride != null) {
parameterOverrides.add(parameterOverride);
}
}
return parameterOverrides;
}
protected String getOptionalPassword(String key) {
Password password = getPassword(key);
return password == null ? null : password.value();
}
/**
* Adds the flow definition related common configs to a config definition.
*
* @param configDef The config def to extend.
*/
protected static void addFlowConfigElements(final ConfigDef configDef) {
configDef.define(FLOW_SNAPSHOT, ConfigDef.Type.STRING, null, new FlowSnapshotValidator(), ConfigDef.Importance.HIGH,
"Specifies the dataflow to run. This may be a file containing the dataflow, a URL that points to a dataflow, or a String containing the entire dataflow as an escaped JSON.",
FLOW_GROUP, 0, ConfigDef.Width.NONE, "Flow snapshot");
}
/**
* Adds the directory, NAR, kerberos and TLS common configs to a config definition.
*
* @param configDef The config def to extend.
*/
protected static void addCommonConfigElements(final ConfigDef configDef) {
configDef.define(NAR_DIRECTORY, ConfigDef.Type.STRING, null, new ConnectDirectoryExistsValidator(), ConfigDef.Importance.HIGH,
"Specifies the directory that stores the NiFi Archives (NARs)", DIRECTORIES_GROUP, 0, ConfigDef.Width.NONE, "NAR directory");
configDef.define(EXTENSIONS_DIRECTORY, ConfigDef.Type.STRING, null, ConfigDef.Importance.HIGH,
"Specifies the directory that stores the extensions that will be downloaded (if any) from the configured Extension Client",
DIRECTORIES_GROUP, 1, ConfigDef.Width.NONE, "Extensions directory");
configDef.define(WORKING_DIRECTORY, ConfigDef.Type.STRING, null, ConfigDef.Importance.HIGH,
"Specifies the temporary working directory for expanding NiFi Archives (NARs)",
DIRECTORIES_GROUP, 2, ConfigDef.Width.NONE, "Working directory");
configDef.define(DATAFLOW_NAME, ConfigDef.Type.STRING, ConfigDef.NO_DEFAULT_VALUE, nonEmptyStringWithoutControlChars(), ConfigDef.Importance.HIGH, "The name of the dataflow.");
configDef.define(
KRB5_FILE, ConfigDef.Type.STRING, DEFAULT_KRB5_FILE, ConfigDef.Importance.MEDIUM,
"Specifies the krb5.conf file to use if connecting to Kerberos-enabled services",
KERBEROS_GROUP, 0, ConfigDef.Width.NONE, "krb5.conf file");
configDef.define(
NEXUS_BASE_URL, ConfigDef.Type.STRING, null, new ConnectHttpUrlValidator(), ConfigDef.Importance.MEDIUM,
"Specifies the Base URL of the Nexus instance to source extensions from",
NEXUS_GROUP, 0, ConfigDef.Width.NONE, "Nexus base URL");
configDef.define(
DATAFLOW_TIMEOUT, ConfigDef.Type.STRING, DEFAULT_DATAFLOW_TIMEOUT, ConfigDef.Importance.MEDIUM,
"Specifies the amount of time to wait for the dataflow to finish processing input before considering the dataflow a failure",
FLOW_GROUP, 1, ConfigDef.Width.NONE, "Dataflow processing timeout");
configDef.define(TRUSTSTORE_FILE, ConfigDef.Type.STRING, null, ConfigDef.Importance.MEDIUM,
"Filename of the truststore that Stateless NiFi should use for connecting to NiFi Registry and for Site-to-Site communications." +
" If not specified, communications will occur only over http, not https.", TLS_GROUP, 0, ConfigDef.Width.NONE, "Truststore file");
configDef.define(TRUSTSTORE_TYPE, ConfigDef.Type.STRING, null, ConfigDef.Importance.MEDIUM,
"The type of the Truststore file. Either JKS or PKCS12.", TLS_GROUP, 1, ConfigDef.Width.NONE, "Truststore type");
configDef.define(TRUSTSTORE_PASSWORD, ConfigDef.Type.PASSWORD, null, ConfigDef.Importance.MEDIUM,
"The password for the truststore.", TLS_GROUP, 2, ConfigDef.Width.NONE, "Truststore password");
configDef.define(KEYSTORE_FILE, ConfigDef.Type.STRING, null, ConfigDef.Importance.MEDIUM,
"Filename of the keystore that Stateless NiFi should use for connecting to NiFi Registry and for Site-to-Site communications.",
TLS_GROUP, 3, ConfigDef.Width.NONE, "Keystore file");
configDef.define(KEYSTORE_TYPE, ConfigDef.Type.STRING, null, ConfigDef.Importance.MEDIUM,
"The type of the Keystore file. Either JKS or PKCS12.", TLS_GROUP, 4, ConfigDef.Width.NONE, "Keystore type");
configDef.define(KEYSTORE_PASSWORD, ConfigDef.Type.PASSWORD, null, ConfigDef.Importance.MEDIUM,
"The password for the keystore.", TLS_GROUP, 5, ConfigDef.Width.NONE, "Keystore password");
configDef.define(KEY_PASSWORD, ConfigDef.Type.PASSWORD, null, ConfigDef.Importance.MEDIUM,
"The password for the key in the keystore. If not provided, the password is assumed to be the same as the keystore password.",
TLS_GROUP, 6, ConfigDef.Width.NONE, "Keystore key password");
configDef.define(SENSITIVE_PROPS_KEY, ConfigDef.Type.PASSWORD, DEFAULT_SENSITIVE_PROPS_KEY, ConfigDef.Importance.MEDIUM, "A key that components can use for encrypting and decrypting " +
"sensitive values.", SECURITY_GROUP, 0, ConfigDef.Width.NONE, "Sensitive properties key");
}
}
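
Two behaviours of this base config are easy to miss when reading the key definitions alone: flow.snapshot accepts a URL, an inline JSON document, or a local file path (see setFlowDefinition above), and any connector property whose key starts with parameter. is turned into a Parameter Context override by getParameterOverrides. A small sketch under those rules; the flow path, context name, and parameter names are invented for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.nifi.kafka.connect.StatelessNiFiSinkConfig;
import org.apache.nifi.stateless.config.ParameterOverride;

public class ParameterOverrideSketch {

    public static void main(final String[] args) {
        final Map<String, String> properties = new HashMap<>();
        properties.put("name", "syslog-to-kafka");                               // required dataflow name
        properties.put("flow.snapshot", "/opt/nifi/flows/syslog-to-kafka.json"); // file form; a URL or inline JSON also works
        properties.put("parameter.Syslog Context:Syslog Port", "19944");         // override a parameter in a named context
        properties.put("parameter.Batch Size", "1000");                          // override a parameter regardless of context

        final StatelessNiFiSinkConfig config = new StatelessNiFiSinkConfig(properties);

        final List<ParameterOverride> overrides = config.getParameterOverrides();
        System.out.println(overrides.size() + " parameter overrides collected"); // 2
    }
}
```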

View File

@ -1,113 +0,0 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.nifi.kafka.connect;
import org.apache.kafka.common.config.ConfigDef;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
public class StatelessNiFiSinkConfig extends StatelessNiFiCommonConfig {
public static final String INPUT_PORT_NAME = "input.port";
public static final String FAILURE_PORTS = "failure.ports";
public static final String HEADERS_AS_ATTRIBUTES_REGEX = "headers.as.attributes.regex";
public static final String HEADER_ATTRIBUTE_NAME_PREFIX = "attribute.prefix";
protected static final ConfigDef CONFIG_DEF = createConfigDef();
public StatelessNiFiSinkConfig(Map<?, ?> originals) {
super(CONFIG_DEF, originals);
}
protected StatelessNiFiSinkConfig(ConfigDef definition, Map<?, ?> originals) {
super(definition, originals);
}
/**
* @return The input port name to use when feeding the flow. Can be null, which means the single available input port will be used.
*/
public String getInputPortName() {
return getString(INPUT_PORT_NAME);
}
/**
* @return The output ports to handle as failure ports. FlowFiles sent to these ports will cause the Connector to retry.
*/
public Set<String> getFailurePorts() {
final List<String> configuredPorts = getList(FAILURE_PORTS);
if (configuredPorts == null) {
return Collections.emptySet();
}
return new LinkedHashSet<>(configuredPorts);
}
public String getHeadersAsAttributesRegex() {
return getString(HEADERS_AS_ATTRIBUTES_REGEX);
}
public String getHeaderAttributeNamePrefix() {
return getString(HEADER_ATTRIBUTE_NAME_PREFIX);
}
protected static ConfigDef createConfigDef() {
ConfigDef configDef = new ConfigDef();
StatelessNiFiCommonConfig.addCommonConfigElements(configDef);
addFlowConfigs(configDef);
addSinkConfigs(configDef);
return configDef;
}
/**
* Add the flow definition related common configs to a config definition.
*
* @param configDef The config def to extend.
*/
protected static void addFlowConfigs(ConfigDef configDef) {
StatelessNiFiCommonConfig.addFlowConfigElements(configDef);
configDef.define(INPUT_PORT_NAME, ConfigDef.Type.STRING, null, ConfigDef.Importance.HIGH,
"The name of the Input Port to push data to", StatelessNiFiCommonConfig.FLOW_GROUP, 100,
ConfigDef.Width.NONE, "Input port name");
configDef.define(
StatelessNiFiSinkConfig.FAILURE_PORTS, ConfigDef.Type.LIST, null, ConfigDef.Importance.MEDIUM,
"A list of Output Ports that are considered failures. If any FlowFile is routed to an Output Ports whose name is provided in this property," +
" the session is rolled back and is considered a failure", FLOW_GROUP, 200, ConfigDef.Width.NONE, "Failure ports");
}
/**
* Add sink configs to a config definition.
*
* @param configDef The config def to extend.
*/
protected static void addSinkConfigs(ConfigDef configDef) {
configDef.define(
HEADERS_AS_ATTRIBUTES_REGEX, ConfigDef.Type.STRING, null, ConfigDef.Importance.MEDIUM,
"A regular expression to evaluate against Kafka message header keys. Any message header whose key matches the regular expression" +
" will be added to the FlowFile as an attribute. The name of the attribute will match the header key (with an optional prefix, as " +
"defined by the attribute.prefix configuration) and the header value will be added as the attribute value.",
RECORD_GROUP, 0, ConfigDef.Width.NONE, "Headers as Attributes regex");
configDef.define(
HEADER_ATTRIBUTE_NAME_PREFIX, ConfigDef.Type.STRING, null, ConfigDef.Importance.MEDIUM,
"A prefix to add to the key of each header that matches the headers.as.attributes.regex Regular Expression. For example," +
" if a header has the key MyHeader and a value of MyValue, and the headers.as.attributes.regex is set to My.* and this property" +
" is set to kafka. then the FlowFile that is created for the Kafka message will have an attribute" +
" named kafka.MyHeader with a value of MyValue.",
RECORD_GROUP, 1, ConfigDef.Width.NONE, "Headers as Attributes prefix");
}
}
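
The interplay of headers.as.attributes.regex and attribute.prefix is easiest to see with concrete values. The following standalone sketch only mirrors the mapping those two descriptions spell out (it is not the connector's own code path); the regex, prefix, and header names are the example values from the text above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class HeaderToAttributeSketch {

    public static void main(final String[] args) {
        final Pattern headersAsAttributesRegex = Pattern.compile("My.*"); // headers.as.attributes.regex
        final String attributePrefix = "kafka.";                          // attribute.prefix

        final Map<String, String> kafkaHeaders = Map.of(
                "MyHeader", "MyValue",
                "other-header", "not exported");

        // Headers whose keys match the regex become FlowFile attributes, with the prefix prepended to the key.
        final Map<String, String> flowFileAttributes = new HashMap<>();
        kafkaHeaders.forEach((key, value) -> {
            if (headersAsAttributesRegex.matcher(key).matches()) {
                flowFileAttributes.put(attributePrefix + key, value);
            }
        });

        System.out.println(flowFileAttributes); // {kafka.MyHeader=MyValue}
    }
}
```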

View File

@ -1,66 +0,0 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.nifi.kafka.connect;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.connect.connector.Task;
import org.apache.kafka.connect.sink.SinkConnector;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class StatelessNiFiSinkConnector extends SinkConnector {
private Map<String, String> properties;
@Override
public void start(final Map<String, String> properties) {
this.properties = new HashMap<>(properties);
}
@Override
public Class<? extends Task> taskClass() {
return StatelessNiFiSinkTask.class;
}
@Override
public List<Map<String, String>> taskConfigs(final int maxTasks) {
final List<Map<String, String>> configs = new ArrayList<>();
for (int i = 0; i < maxTasks; i++) {
configs.add(new HashMap<>(properties));
}
return configs;
}
@Override
public void stop() {
}
@Override
public ConfigDef config() {
return StatelessNiFiSinkConfig.CONFIG_DEF;
}
@Override
public String version() {
return StatelessKafkaConnectorUtil.getVersion();
}
}

View File

@ -1,276 +0,0 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.nifi.kafka.connect;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.config.ConfigException;
import org.apache.kafka.connect.errors.RetriableException;
import org.apache.kafka.connect.header.Header;
import org.apache.kafka.connect.sink.SinkRecord;
import org.apache.kafka.connect.sink.SinkTask;
import org.apache.nifi.controller.queue.QueueSize;
import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.stateless.flow.DataflowTrigger;
import org.apache.nifi.stateless.flow.StatelessDataflow;
import org.apache.nifi.stateless.flow.TriggerResult;
import org.apache.nifi.util.FormatUtils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.nio.charset.StandardCharsets;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.TimeUnit;
import java.util.regex.Pattern;
public class StatelessNiFiSinkTask extends SinkTask {
private static final Logger logger = LoggerFactory.getLogger(StatelessNiFiSinkTask.class);
private StatelessDataflow dataflow;
private String inputPortName;
private Set<String> failurePortNames;
private long timeoutMillis;
private Pattern headerNameRegex;
private String headerNamePrefix;
private QueueSize queueSize;
private String dataflowName;
private long backoffMillis = 0L;
@Override
public String version() {
return StatelessKafkaConnectorUtil.getVersion();
}
@Override
public void start(final Map<String, String> properties) {
logger.info("Starting Sink Task");
StatelessNiFiSinkConfig config = createConfig(properties);
final String timeout = config.getDataflowTimeout();
timeoutMillis = (long) FormatUtils.getPreciseTimeDuration(timeout, TimeUnit.MILLISECONDS);
dataflowName = config.getDataflowName();
final String regex = config.getHeadersAsAttributesRegex();
headerNameRegex = regex == null ? null : Pattern.compile(regex);
headerNamePrefix = config.getHeaderAttributeNamePrefix();
if (headerNamePrefix == null) {
headerNamePrefix = "";
}
dataflow = StatelessKafkaConnectorUtil.createDataflow(config);
dataflow.initialize();
// Determine input port name. If input port is explicitly set, use the value given. Otherwise, if only one port exists, use that. Otherwise, throw ConfigException.
inputPortName = config.getInputPortName();
if (inputPortName == null) {
final Set<String> inputPorts = dataflow.getInputPortNames();
if (inputPorts.isEmpty()) {
throw new ConfigException("The dataflow specified for <" + dataflowName + "> does not have an Input Port at the root level. Dataflows used for a Kafka Connect Sink Task "
+ "must have at least one Input Port at the root level.");
}
if (inputPorts.size() > 1) {
throw new ConfigException("The dataflow specified for <" + dataflowName + "> has multiple Input Ports at the root level (" + inputPorts
+ "). The " + StatelessNiFiSinkConfig.INPUT_PORT_NAME + " property must be set to indicate which of these Ports Kafka records should be sent to.");
}
inputPortName = inputPorts.iterator().next();
}
// Validate the input port
if (!dataflow.getInputPortNames().contains(inputPortName)) {
throw new ConfigException("The dataflow specified for <" + dataflowName + "> does not have Input Port with name <" + inputPortName + "> at the root level. Existing Input Port names are "
+ dataflow.getInputPortNames());
}
// Determine the failure Ports, if any are given.
failurePortNames = config.getFailurePorts();
// Validate the failure ports
final Set<String> outputPortNames = dataflow.getOutputPortNames();
for (final String failurePortName : failurePortNames) {
if (!outputPortNames.contains(failurePortName)) {
throw new ConfigException("Dataflow was configured with a Failure Port of " + failurePortName
+ " but there is no Port with that name in the dataflow. Valid Port names are " + outputPortNames);
}
}
}
@Override
public void put(final Collection<SinkRecord> records) {
logger.debug("Enqueuing {} Kafka messages", records.size());
for (final SinkRecord record : records) {
final Map<String, String> attributes = createAttributes(record);
final byte[] contents = getContents(record.value());
queueSize = dataflow.enqueue(contents, attributes, inputPortName);
}
}
/**
* Creates a config instance to be used by the task.
*
* @param properties The properties to use in the config.
* @return The config instance.
*/
protected StatelessNiFiSinkConfig createConfig(Map<String, String> properties) {
return new StatelessNiFiSinkConfig(properties);
}
private void backoff() {
// Start the backoff period at a 1-second base and double it on each failure, capping at 10 seconds.
if (backoffMillis == 0L) {
backoffMillis = 1000L;
}
backoffMillis = Math.min(backoffMillis * 2, 10_000L);
context.timeout(backoffMillis);
}
private void resetBackoff() {
backoffMillis = 0L;
}
private synchronized void triggerDataflow() {
final long start = System.nanoTime();
while (dataflow.isFlowFileQueued()) {
final DataflowTrigger trigger = dataflow.trigger();
try {
final Optional<TriggerResult> resultOptional = trigger.getResult(timeoutMillis, TimeUnit.MILLISECONDS);
if (resultOptional.isPresent()) {
final TriggerResult result = resultOptional.get();
if (result.isSuccessful()) {
// Verify that data was only transferred to the expected Input Port
verifyOutputPortContents(trigger, result);
// Acknowledge the data so that the session can be committed
result.acknowledge();
resetBackoff();
} else {
retry(trigger, "Dataflow " + dataflowName + " failed to execute properly", result.getFailureCause().orElse(null));
}
} else {
retry(trigger, "Timed out waiting for dataflow " + dataflowName + " to complete", null);
}
} catch (final InterruptedException e) {
Thread.currentThread().interrupt();
dataflow.purge();
throw new RuntimeException("Interrupted while waiting for dataflow to complete", e);
}
}
context.requestCommit();
final long nanos = System.nanoTime() - start;
if (queueSize != null) {
logger.debug("Ran dataflow with {} messages ({}) in {} nanos", queueSize.getObjectCount(), FormatUtils.formatDataSize(queueSize.getByteCount()), nanos);
}
}
private void retry(final DataflowTrigger trigger, final String explanation, final Throwable cause) {
logger.error(explanation, cause);
trigger.cancel();
// We don't want to keep running as fast as possible, as doing so may overwhelm a destination system that is already struggling.
// This is analogous to ProcessContext.yield() in NiFi parlance.
backoff();
// We will throw a RetriableException, which will redeliver all messages. So we need to purge anything currently in the dataflow.
dataflow.purge();
// Because a background thread may have triggered the dataflow, we need to note that the last trigger was unsuccessful so the subsequent
// call to either put() or flush() will throw a RetriableException. This will result in the data being redelivered/retried.
throw new RetriableException(explanation, cause);
}
private void verifyOutputPortContents(final DataflowTrigger trigger, final TriggerResult result) {
for (final String failurePort : failurePortNames) {
final List<FlowFile> flowFiles = result.getOutputFlowFiles(failurePort);
if (flowFiles != null && !flowFiles.isEmpty()) {
logger.error("Dataflow transferred FlowFiles to Port {}, which is configured as a Failure Port. Rolling back session.", failurePort);
trigger.cancel();
throw new RetriableException("Data was transferred to Failure Port " + failurePort);
}
}
}
@Override
public void flush(final Map<TopicPartition, OffsetAndMetadata> currentOffsets) {
super.flush(currentOffsets);
triggerDataflow();
}
private byte[] getContents(final Object value) {
if (value == null) {
return new byte[0];
}
if (value instanceof String) {
return ((String) value).getBytes(StandardCharsets.UTF_8);
}
if (value instanceof byte[]) {
return (byte[]) value;
}
throw new IllegalArgumentException("Unsupported message type: the Message value was " + value + " but was expected to be a byte array or a String");
}
private Map<String, String> createAttributes(final SinkRecord record) {
final Map<String, String> attributes = new HashMap<>();
attributes.put("kafka.topic", record.topic());
attributes.put("kafka.offset", String.valueOf(record.kafkaOffset()));
attributes.put("kafka.partition", String.valueOf(record.kafkaPartition()));
attributes.put("kafka.timestamp", String.valueOf(record.timestamp()));
final Object key = record.key();
if (key instanceof String) {
attributes.put("kafka.key", (String) key);
}
if (headerNameRegex != null) {
for (final Header header : record.headers()) {
if (headerNameRegex.matcher(header.key()).matches()) {
final String attributeName = headerNamePrefix + header.key();
final String attributeValue = String.valueOf(header.value());
attributes.put(attributeName, attributeValue);
}
}
}
return attributes;
}
@Override
public void stop() {
logger.info("Shutting down Sink Task");
if (dataflow != null) {
dataflow.shutdown();
}
}
}
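
For dataflow authors, the most useful part of the task above is the FlowFile attribute contract established by createAttributes(): every record enqueued at the Input Port carries kafka.topic, kafka.offset, kafka.partition, and kafka.timestamp, plus kafka.key when the record key is a String and any header attributes matched by the configured regex. A simplified, standalone illustration with made-up record coordinates:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SinkAttributeContractSketch {

    public static void main(final String[] args) {
        // Hypothetical record coordinates; the real values come from the SinkRecord handed to put().
        final Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("kafka.topic", "events");
        attributes.put("kafka.offset", String.valueOf(42L));
        attributes.put("kafka.partition", String.valueOf(3));
        attributes.put("kafka.timestamp", String.valueOf(1_700_000_000_000L));
        attributes.put("kafka.key", "order-123"); // only present when the record key is a String

        // A flow might route on these, e.g. RouteOnAttribute with ${kafka.topic:equals('events')}.
        System.out.println(attributes);
    }
}
```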

View File

@ -1,120 +0,0 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.nifi.kafka.connect;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.nifi.kafka.connect.validators.ConnectRegularExpressionValidator;
import java.util.Map;
public class StatelessNiFiSourceConfig extends StatelessNiFiCommonConfig {
public static final String OUTPUT_PORT_NAME = "output.port";
public static final String TOPIC_NAME = "topics";
public static final String TOPIC_NAME_ATTRIBUTE = "topic.name.attribute";
public static final String KEY_ATTRIBUTE = "key.attribute";
public static final String HEADER_REGEX = "header.attribute.regex";
public static final String STATE_MAP_KEY = "task.index";
protected static final ConfigDef CONFIG_DEF = createConfigDef();
public StatelessNiFiSourceConfig(Map<?, ?> originals) {
super(CONFIG_DEF, originals);
}
protected StatelessNiFiSourceConfig(ConfigDef definition, Map<?, ?> originals) {
super(definition, originals);
}
/**
* @return The output port name to use when reading from the flow. Can be null, which means the single available output port will be used.
*/
public String getOutputPortName() {
return getString(OUTPUT_PORT_NAME);
}
public String getTopicName() {
return getString(TOPIC_NAME);
}
public String getTopicNameAttribute() {
return getString(TOPIC_NAME_ATTRIBUTE);
}
public String getKeyAttribute() {
return getString(KEY_ATTRIBUTE);
}
public String getHeaderRegex() {
return getString(HEADER_REGEX);
}
public String getStateMapKey() {
return originalsStrings().get(STATE_MAP_KEY);
}
protected static ConfigDef createConfigDef() {
ConfigDef configDef = new ConfigDef();
StatelessNiFiCommonConfig.addCommonConfigElements(configDef);
addFlowConfigs(configDef);
addSourceConfigs(configDef);
return configDef;
}
/**
* Add the flow definition related common configs to a config definition.
*
* @param configDef The config def to extend.
*/
protected static void addFlowConfigs(ConfigDef configDef) {
StatelessNiFiCommonConfig.addFlowConfigElements(configDef);
configDef.define(StatelessNiFiSourceConfig.OUTPUT_PORT_NAME, ConfigDef.Type.STRING, null,
ConfigDef.Importance.HIGH, "The name of the Output Port to pull data from",
FLOW_GROUP, 100, ConfigDef.Width.NONE, "Output port name");
}
/**
* Add source configs to a config definition.
*
* @param configDef The config def to extend.
*/
protected static void addSourceConfigs(ConfigDef configDef) {
configDef.define(
StatelessNiFiSourceConfig.TOPIC_NAME, ConfigDef.Type.STRING, null, ConfigDef.Importance.HIGH,
"The name of the Kafka topic to send data to. Either the topics or topic.name.attribute configuration must be specified.",
RECORD_GROUP, 0, ConfigDef.Width.NONE, "Topic name");
configDef.define(
StatelessNiFiSourceConfig.TOPIC_NAME_ATTRIBUTE, ConfigDef.Type.STRING, null, ConfigDef.Importance.MEDIUM,
"Specifies the name of a FlowFile attribute to use for determining which Kafka Topic a FlowFile"
+ " will be sent to. Either the " + StatelessNiFiSourceConfig.TOPIC_NAME + " or " + StatelessNiFiSourceConfig.TOPIC_NAME_ATTRIBUTE +
" configuration must be specified. If both are specified, the " + StatelessNiFiSourceConfig.TOPIC_NAME_ATTRIBUTE
+ " will be preferred, but if a FlowFile does not have the specified attribute name, then the " + StatelessNiFiSourceConfig.TOPIC_NAME +
" property will serve as the default topic name to use.",
RECORD_GROUP, 1, ConfigDef.Width.NONE, "Topic name attribute");
configDef.define(StatelessNiFiSourceConfig.KEY_ATTRIBUTE, ConfigDef.Type.STRING, null, ConfigDef.Importance.MEDIUM,
"Specifies the name of a FlowFile attribute to use for determining the Kafka Message key. If not"
+ " specified, the message key will be null. If specified, the value of the attribute with the given name will be used as the message key.",
RECORD_GROUP, 100, ConfigDef.Width.NONE, "Record key attribute");
configDef.define(
StatelessNiFiSourceConfig.HEADER_REGEX, ConfigDef.Type.STRING, null, new ConnectRegularExpressionValidator(), ConfigDef.Importance.MEDIUM,
"Specifies a Regular Expression to evaluate against all FlowFile attributes. Any attribute whose name matches the Regular Expression" +
" will be converted into a Kafka message header with the name of the attribute used as header key and the value of the attribute used as the header"
+ " value.",
RECORD_GROUP, 200, ConfigDef.Width.NONE, "Record header attribute regex");
}
}
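
Taken together, the source-side keys above describe how each FlowFile leaving the flow becomes a Kafka record: output.port selects the Output Port, topics or topic.name.attribute selects the destination topic (the attribute wins when present), key.attribute supplies the message key, and header.attribute.regex exports matching attributes as headers. A sketch of a plausible configuration; every value here is illustrative.

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.nifi.kafka.connect.StatelessNiFiSourceConfig;

public class SourceConfigSketch {

    public static void main(final String[] args) {
        final Map<String, String> properties = new HashMap<>();
        properties.put("name", "syslog-source");
        properties.put("flow.snapshot", "/opt/nifi/flows/syslog-source.json");
        properties.put("output.port", "Out");                     // root-level Output Port to pull from
        properties.put("topics", "syslog");                       // default topic
        properties.put("topic.name.attribute", "target.topic");   // per-FlowFile override, preferred when the attribute exists
        properties.put("key.attribute", "syslog.hostname");       // FlowFile attribute used as the message key
        properties.put("header.attribute.regex", "syslog\\..*");  // attributes exported as Kafka headers

        final StatelessNiFiSourceConfig config = new StatelessNiFiSourceConfig(properties);
        System.out.println(config.getOutputPortName() + " -> " + config.getTopicName()); // Out -> syslog
    }
}
```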

View File

@ -1,87 +0,0 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.nifi.kafka.connect;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.connect.connector.Task;
import org.apache.kafka.connect.source.SourceConnector;
import org.apache.nifi.stateless.flow.StatelessDataflow;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class StatelessNiFiSourceConnector extends SourceConnector {
private StatelessNiFiSourceConfig config;
private boolean primaryNodeOnly;
@Override
public void start(final Map<String, String> properties) {
config = createConfig(properties);
final StatelessDataflow dataflow = StatelessKafkaConnectorUtil.createDataflow(config);
primaryNodeOnly = dataflow.isSourcePrimaryNodeOnly();
dataflow.shutdown();
}
@Override
public Class<? extends Task> taskClass() {
return StatelessNiFiSourceTask.class;
}
@Override
public List<Map<String, String>> taskConfigs(final int maxTasks) {
final int numTasks = primaryNodeOnly ? 1 : maxTasks;
final List<Map<String, String>> configs = new ArrayList<>();
for (int i = 0; i < numTasks; i++) {
final Map<String, String> taskConfig = new HashMap<>(config.originalsStrings());
taskConfig.put("task.index", String.valueOf(i));
configs.add(taskConfig);
}
return configs;
}
@Override
public void stop() {
}
@Override
public ConfigDef config() {
return StatelessNiFiSourceConfig.CONFIG_DEF;
}
@Override
public String version() {
return StatelessKafkaConnectorUtil.getVersion();
}
/**
* Creates a config instance to be used by the Connector.
*
* @param properties Properties to use in the config.
* @return The config instance.
*/
protected StatelessNiFiSourceConfig createConfig(Map<String, String> properties) {
return new StatelessNiFiSourceConfig(properties);
}
}
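
One detail of the connector above deserves a pointer: when the flow's source must run on the primary node only, a single task config is produced, and every task config carries its own task.index, which the source task later uses as its offset-storage partition key. A standalone mirror of that fan-out loop, separate from the connector itself:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TaskConfigFanOutSketch {

    static List<Map<String, String>> fanOut(final Map<String, String> connectorProps, final boolean primaryNodeOnly, final int maxTasks) {
        final int numTasks = primaryNodeOnly ? 1 : maxTasks;
        final List<Map<String, String>> configs = new ArrayList<>();
        for (int i = 0; i < numTasks; i++) {
            final Map<String, String> taskConfig = new HashMap<>(connectorProps);
            taskConfig.put("task.index", String.valueOf(i)); // distinct state partition per task
            configs.add(taskConfig);
        }
        return configs;
    }

    public static void main(final String[] args) {
        System.out.println(fanOut(Map.of("name", "demo"), false, 3).size()); // 3 task configs
        System.out.println(fanOut(Map.of("name", "demo"), true, 3).size());  // 1 task config
    }
}
```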

View File

@ -1,295 +0,0 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.nifi.kafka.connect;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.kafka.common.config.ConfigException;
import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.errors.RetriableException;
import org.apache.kafka.connect.header.ConnectHeaders;
import org.apache.kafka.connect.source.SourceRecord;
import org.apache.kafka.connect.source.SourceTask;
import org.apache.nifi.components.state.Scope;
import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.stateless.flow.DataflowTrigger;
import org.apache.nifi.stateless.flow.StatelessDataflow;
import org.apache.nifi.stateless.flow.TriggerResult;
import org.apache.nifi.util.FormatUtils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.regex.Pattern;
public class StatelessNiFiSourceTask extends SourceTask {
private static final Logger logger = LoggerFactory.getLogger(StatelessNiFiSourceTask.class);
private static final long FAILURE_YIELD_MILLIS = 1000L;
private StatelessDataflow dataflow;
private String outputPortName;
private String topicName;
private String topicNameAttribute;
private TriggerResult triggerResult;
private String keyAttributeName;
private Pattern headerAttributeNamePattern;
private long timeoutMillis;
private String dataflowName;
private long failureYieldExpiration = 0L;
private final Map<String, String> clusterStatePartitionMap = Collections.singletonMap(StatelessNiFiSourceConfig.STATE_MAP_KEY, "CLUSTER");
private Map<String, String> localStatePartitionMap = new HashMap<>();
private final AtomicLong unacknowledgedRecords = new AtomicLong(0L);
@Override
public String version() {
return StatelessKafkaConnectorUtil.getVersion();
}
@Override
public void start(final Map<String, String> properties) {
logger.info("Starting Source Task");
StatelessNiFiSourceConfig config = createConfig(properties);
final String timeout = config.getDataflowTimeout();
timeoutMillis = (long) FormatUtils.getPreciseTimeDuration(timeout, TimeUnit.MILLISECONDS);
topicName = config.getTopicName();
topicNameAttribute = config.getTopicNameAttribute();
keyAttributeName = config.getKeyAttribute();
if (topicName == null && topicNameAttribute == null) {
throw new ConfigException("Either the topic.name or topic.name.attribute configuration must be specified");
}
final String headerRegex = config.getHeaderRegex();
headerAttributeNamePattern = headerRegex == null ? null : Pattern.compile(headerRegex);
dataflow = StatelessKafkaConnectorUtil.createDataflow(config);
dataflow.initialize();
// Determine the name of the Output Port to retrieve data from
dataflowName = config.getDataflowName();
outputPortName = config.getOutputPortName();
if (outputPortName == null) {
final Set<String> outputPorts = dataflow.getOutputPortNames();
if (outputPorts.isEmpty()) {
throw new ConfigException("The dataflow specified for <" + dataflowName + "> does not have an Output Port at the root level. Dataflows used for a Kafka Connect Source Task "
+ "must have at least one Output Port at the root level.");
}
if (outputPorts.size() > 1) {
throw new ConfigException("The dataflow specified for <" + dataflowName + "> has multiple Output Ports at the root level (" + outputPorts
+ "). The " + StatelessNiFiSourceConfig.OUTPUT_PORT_NAME + " property must be set to indicate which of these Ports Kafka records should be retrieved from.");
}
outputPortName = outputPorts.iterator().next();
}
final String taskIndex = config.getStateMapKey();
localStatePartitionMap.put(StatelessNiFiSourceConfig.STATE_MAP_KEY, taskIndex);
final Map<String, String> localStateMap = (Map<String, String>) (Map) context.offsetStorageReader().offset(localStatePartitionMap);
final Map<String, String> clusterStateMap = (Map<String, String>) (Map) context.offsetStorageReader().offset(clusterStatePartitionMap);
dataflow.setComponentStates(localStateMap, Scope.LOCAL);
dataflow.setComponentStates(clusterStateMap, Scope.CLUSTER);
}
@Override
public List<SourceRecord> poll() throws InterruptedException {
final long yieldExpiration = Math.max(failureYieldExpiration, dataflow.getSourceYieldExpiration());
final long now = System.currentTimeMillis();
final long yieldMillis = yieldExpiration - now;
if (yieldMillis > 0) {
// If source component has yielded, we don't want to trigger it again until the yield expiration expires, in order to avoid
// overloading the source system.
logger.debug("Source of NiFi flow has opted to yield for {} milliseconds. Will pause dataflow until that time period has elapsed.", yieldMillis);
Thread.sleep(yieldMillis);
return null;
}
if (unacknowledgedRecords.get() > 0) {
// If we have records that haven't yet been acknowledged, we want to return null instead of running.
// We need to wait for the last results to complete before triggering the dataflow again.
return null;
}
logger.debug("Triggering dataflow");
final long start = System.nanoTime();
final DataflowTrigger trigger = dataflow.trigger();
final Optional<TriggerResult> resultOptional = trigger.getResult(timeoutMillis, TimeUnit.MILLISECONDS);
if (!resultOptional.isPresent()) {
logger.warn("Dataflow timed out after waiting {} milliseconds. Will cancel the execution.", timeoutMillis);
trigger.cancel();
return null;
}
triggerResult = resultOptional.get();
if (!triggerResult.isSuccessful()) {
logger.error("Dataflow {} failed to execute properly", dataflowName, triggerResult.getFailureCause().orElse(null));
trigger.cancel();
failureYieldExpiration = System.currentTimeMillis() + FAILURE_YIELD_MILLIS; // delay next execution for 1 second to avoid constantly failing and utilizing huge amounts of resources
return null;
}
// Verify that data was only transferred to the expected Output Port
verifyFlowFilesTransferredToProperPort(triggerResult, outputPortName, trigger);
final long nanos = System.nanoTime() - start;
final List<FlowFile> outputFlowFiles = triggerResult.getOutputFlowFiles(outputPortName);
final List<SourceRecord> sourceRecords = new ArrayList<>(outputFlowFiles.size());
Map<String, ?> componentState = dataflow.getComponentStates(Scope.CLUSTER);
final Map<String, ?> partitionMap;
if (componentState == null || componentState.isEmpty()) {
componentState = dataflow.getComponentStates(Scope.LOCAL);
partitionMap = localStatePartitionMap;
} else {
partitionMap = clusterStatePartitionMap;
}
try {
for (final FlowFile flowFile : outputFlowFiles) {
final byte[] contents = triggerResult.readContentAsByteArray(flowFile);
final SourceRecord sourceRecord = createSourceRecord(flowFile, contents, componentState, partitionMap);
sourceRecords.add(sourceRecord);
}
} catch (final Exception e) {
logger.error("Failed to obtain contents of Output FlowFiles in order to form Kafka Record", e);
triggerResult.abort(e);
failureYieldExpiration = System.currentTimeMillis() + FAILURE_YIELD_MILLIS; // delay next execution for 1 second to avoid constantly failing and utilizing huge amounts of resources
return null;
}
logger.debug("Returning {} records from poll() method (took {} nanos to run dataflow)", sourceRecords.size(), nanos);
// If there is at least one record, we don't want to acknowledge the trigger result until Kafka has committed the Record.
// This is handled by incrementing the unacknowledgedRecords count. Then, Kafka Connect will call this.commitRecord() for each record.
// The commitRecord() call will then decrement the number of unacknowledgedRecords, and when all unacknowledged Records have been
// acknowledged, it will acknowledge the trigger result.
//
// However, if there are no records, this.commitRecord() will never be called. As a result, we need to ensure that we acknowledge the trigger result here.
if (sourceRecords.size() > 0) {
unacknowledgedRecords.addAndGet(sourceRecords.size());
} else {
triggerResult.acknowledge();
}
return sourceRecords;
}
/**
* Creates a config instance to be used by the task.
*
* @param properties The properties to use in the config.
* @return The config instance.
*/
protected StatelessNiFiSourceConfig createConfig(Map<String, String> properties) {
return new StatelessNiFiSourceConfig(properties);
}
private void verifyFlowFilesTransferredToProperPort(final TriggerResult triggerResult, final String expectedPortName, final DataflowTrigger trigger) {
final Map<String, List<FlowFile>> flowFileOutputMap = triggerResult.getOutputFlowFiles();
for (final Map.Entry<String, List<FlowFile>> entry : flowFileOutputMap.entrySet()) {
final String portName = entry.getKey();
final List<FlowFile> flowFiles = entry.getValue();
if (!flowFiles.isEmpty() && !expectedPortName.equals(portName)) {
logger.error("Dataflow transferred FlowFiles to Port {} but was expecting data to be transferred to {}. Rolling back session.", portName, expectedPortName);
trigger.cancel();
throw new RetriableException("Data was transferred to unexpected port. Expected: " + expectedPortName + ". Actual: " + portName);
}
}
}
private SourceRecord createSourceRecord(final FlowFile flowFile, final byte[] contents, final Map<String, ?> componentState, final Map<String, ?> partitionMap) {
final Schema valueSchema = (contents == null || contents.length == 0) ? null : Schema.BYTES_SCHEMA;
// Kafka Connect currently gives us no way to determine the number of partitions that a given topic has.
// Therefore, we have no way to partition based on an attribute or anything like that, unless we left it up to
// the dataflow developer to know how many partitions exist a priori and explicitly set an attribute in the range of 0..max,
// but that is not a great solution. Kafka does support using a Simple Message Transform to change the partition of a given
// record, so that may be the best solution.
final Integer topicPartition = null;
final String topic;
if (topicNameAttribute == null) {
topic = topicName;
} else {
final String attributeValue = flowFile.getAttribute(topicNameAttribute);
topic = attributeValue == null ? topicName : attributeValue;
}
final ConnectHeaders headers = new ConnectHeaders();
if (headerAttributeNamePattern != null) {
for (final Map.Entry<String, String> entry : flowFile.getAttributes().entrySet()) {
if (headerAttributeNamePattern.matcher(entry.getKey()).matches()) {
final String headerName = entry.getKey();
final String headerValue = entry.getValue();
headers.add(headerName, headerValue, Schema.STRING_SCHEMA);
}
}
}
final Object key = keyAttributeName == null ? null : flowFile.getAttribute(keyAttributeName);
final Schema keySchema = key == null ? null : Schema.STRING_SCHEMA;
final Long timestamp = System.currentTimeMillis();
return new SourceRecord(partitionMap, componentState, topic, topicPartition, keySchema, key, valueSchema, contents, timestamp, headers);
}
@Override
public void commitRecord(final SourceRecord record, final RecordMetadata metadata) throws InterruptedException {
super.commitRecord(record, metadata);
final long unacked = unacknowledgedRecords.decrementAndGet();
logger.debug("SourceRecord {} committed; number of unacknowledged FlowFiles is now {}", record, unacked);
if (unacked < 1) {
logger.debug("Acknowledging trigger result");
triggerResult.acknowledge();
}
}
@Override
public void stop() {
logger.info("Shutting down Source Task for " + dataflowName);
if (dataflow != null) {
dataflow.shutdown();
}
}
// Available for testing
protected StatelessDataflow getDataflow() {
return dataflow;
}
}

View File

@ -1,100 +0,0 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.nifi.kafka.connect;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.io.File;
import java.util.Arrays;
public class WorkingDirectoryUtils {
protected static final String NAR_UNPACKED_SUFFIX = "nar-unpacked";
protected static final String HASH_FILENAME = "nar-digest";
private static final Logger logger = LoggerFactory.getLogger(WorkingDirectoryUtils.class);
/**
* Goes through the nar/extensions and extensions directories within the working directory
* and deletes every directory whose name ends in "nar-unpacked" and does not have a
* "nar-digest" file in it.
* @param workingDirectory File object pointing to the working directory.
*/
public static void reconcileWorkingDirectory(final File workingDirectory) {
purgeIncompleteUnpackedNars(new File(new File(workingDirectory, "nar"), "extensions"));
purgeIncompleteUnpackedNars(new File(workingDirectory, "extensions"));
}
/**
* Receives a directory as a parameter and goes through every directory within it that ends in
* "nar-unpacked". If a directory ending in "nar-unpacked" does not have a file named
* "nar-digest" within it, it gets deleted with all of its contents.
* @param directory A File object pointing to the directory that is supposed to contain
* further directories whose name ends in "nar-unpacked".
*/
public static void purgeIncompleteUnpackedNars(final File directory) {
final File[] unpackedDirs = directory.listFiles(file -> file.isDirectory() && file.getName().endsWith(NAR_UNPACKED_SUFFIX));
if (unpackedDirs == null || unpackedDirs.length == 0) {
logger.debug("Found no unpacked NARs in {}", directory);
if (logger.isDebugEnabled()) {
logger.debug("Directory contains: {}", Arrays.deepToString(directory.listFiles()));
}
return;
}
for (final File unpackedDir : unpackedDirs) {
final File narHashFile = new File(unpackedDir, HASH_FILENAME);
if (narHashFile.exists()) {
logger.debug("Already successfully unpacked {}", unpackedDir);
} else {
purgeDirectory(unpackedDir);
}
}
}
/**
* Delete a directory with all of its contents.
* @param directory The directory to be deleted.
*/
public static void purgeDirectory(final File directory) {
if (directory.exists()) {
deleteRecursively(directory);
logger.debug("Cleaned up {}", directory);
}
}
private static void deleteRecursively(final File fileOrDirectory) {
if (fileOrDirectory.isDirectory()) {
final File[] files = fileOrDirectory.listFiles();
if (files != null) {
for (final File file : files) {
deleteRecursively(file);
}
}
}
deleteQuietly(fileOrDirectory);
}
private static void deleteQuietly(final File file) {
final boolean deleted = file.delete();
if (!deleted) {
logger.debug("Failed to cleanup temporary file {}", file);
}
}
}
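
The utility above is invoked from the bootstrap path (under the unpack lock) before NARs are expanded, so a previously interrupted unpack cannot poison later startups. A minimal usage sketch; the working directory path is the per-dataflow directory derived from the default base and an illustrative flow name.

```java
import java.io.File;

import org.apache.nifi.kafka.connect.WorkingDirectoryUtils;

public class WorkingDirectoryCleanupSketch {

    public static void main(final String[] args) {
        // Default base is /tmp/nifi-stateless-working; the dataflow name is appended per flow.
        final File workingDirectory = new File("/tmp/nifi-stateless-working/my-flow");

        // Removes any *nar-unpacked directory that lacks its nar-digest marker file.
        WorkingDirectoryUtils.reconcileWorkingDirectory(workingDirectory);
    }
}
```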

View File

@ -1,51 +0,0 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.nifi.kafka.connect.validators;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.common.config.ConfigException;
import java.io.File;
public class ConnectDirectoryExistsValidator implements ConfigDef.Validator {
@Override
public void ensureValid(final String name, final Object value) {
if (value == null) {
return;
}
if (!(value instanceof String)) {
throw new ConfigException("Invalid value for property " + name + ": The configured value is expected to be the path to a directory");
}
final File file = new File((String) value);
if (!file.exists()) {
throw new ConfigException("The value " + value + " configured for the property " + name + " is not valid because no directory exists at " + file.getAbsolutePath());
}
if (!file.isDirectory()) {
throw new ConfigException("The value " + value + " configured for the property " + name + " is not valid because " + file.getAbsolutePath() + " is not a directory");
}
final File[] files = file.listFiles();
if (files == null) {
throw new ConfigException("The value " + value + " configured for the property " + name + " is not valid because could not obtain a listing of files in directory "
+ file.getAbsolutePath());
}
}
}
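
For context, a validator such as this is attached to a configuration property through Kafka Connect's ConfigDef. The sketch below is illustrative only; the property name, default, and documentation string are assumptions, not the connector's actual definitions:

import org.apache.kafka.common.config.ConfigDef;
import org.apache.nifi.kafka.connect.validators.ConnectDirectoryExistsValidator;

public class ExampleConfigDefFactory {

    static ConfigDef exampleConfigDef() {
        // define(name, type, defaultValue, validator, importance, documentation);
        // ensureValid runs against supplied values and returns early when the value is null.
        return new ConfigDef().define(
                "extensions.directory",                 // illustrative property name
                ConfigDef.Type.STRING,
                null,                                   // no default directory
                new ConnectDirectoryExistsValidator(),
                ConfigDef.Importance.MEDIUM,
                "A directory that must already exist on the Connect worker.");
    }
}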

View File

@ -1,45 +0,0 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.nifi.kafka.connect.validators;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.common.config.ConfigException;
import java.net.URI;
public class ConnectHttpUrlValidator implements ConfigDef.Validator {
@Override
public void ensureValid(final String name, final Object value) {
if (value == null) {
return;
}
if (!(value instanceof String)) {
throw new ConfigException("Invalid value for property " + name + ": The configured value is expected to be a URL");
}
try {
    final String protocol = URI.create((String) value).toURL().getProtocol();
    if (!protocol.equals("http") && !protocol.equals("https")) {
        throw new ConfigException("Invalid value for property " + name + ": The value must be an http or https URL");
    }
} catch (final ConfigException e) {
    // Rethrow the more specific protocol message instead of masking it below
    throw e;
} catch (final Exception e) {
    throw new ConfigException("Invalid value for property " + name + ": The value is not a valid URL");
}
}
}

View File

@ -1,43 +0,0 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.nifi.kafka.connect.validators;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.common.config.ConfigException;
import java.util.regex.Pattern;
public class ConnectRegularExpressionValidator implements ConfigDef.Validator {
@Override
public void ensureValid(final String name, final Object value) {
if (value == null) {
return;
}
if (!(value instanceof String)) {
throw new ConfigException("Invalid value for property " + name + ": The configured value is expected to be a URL");
}
try {
Pattern.compile((String) value);
} catch (final Exception e) {
throw new ConfigException("Invalid value for property " + name + ": The configured value is not a valid Java Regular Expression");
}
}
}

View File

@ -1,54 +0,0 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.nifi.kafka.connect.validators;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.common.config.ConfigException;
import java.io.File;
public class FlowSnapshotValidator implements ConfigDef.Validator {
@Override
public void ensureValid(final String name, final Object value) {
if (value == null) {
return;
}
if (!(value instanceof String)) {
throw new ConfigException("Invalid value for property " + name + ": The configured value is expected to be the path to a file");
}
final String configuredValue = (String) value;
if (configuredValue.startsWith("http://") || configuredValue.startsWith("https://")) {
return;
}
if (configuredValue.trim().startsWith("{")) {
return;
}
final File file = new File(configuredValue);
if (!file.exists()) {
throw new ConfigException("The value " + value + " configured for the property " + name + " is not valid because no file exists at " + file.getAbsolutePath());
}
if (file.isDirectory()) {
throw new ConfigException("The value " + value + " configured for the property " + name + " is not valid because " + file.getAbsolutePath() + " is a directory, not a file");
}
}
}
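
To make the accepted forms concrete, the sketch below exercises the validator with the three kinds of values it recognizes; the property name and paths are illustrative, not taken from the connector's configuration:

import org.apache.kafka.common.config.ConfigException;
import org.apache.nifi.kafka.connect.validators.FlowSnapshotValidator;

public class FlowSnapshotValidatorSketch {

    public static void main(final String[] args) {
        final FlowSnapshotValidator validator = new FlowSnapshotValidator();

        // URLs and inline JSON are accepted without touching the filesystem;
        // the JSON check only verifies that the trimmed value starts with '{'.
        validator.ensureValid("flow.snapshot", "https://example.com/flows/my-flow.json");
        validator.ensureValid("flow.snapshot", "{ \"flowContents\": { } }");

        // Any other value is treated as a local path and must point to an existing file.
        try {
            validator.ensureValid("flow.snapshot", "/path/to/missing-flow.json");
        } catch (final ConfigException e) {
            System.out.println("Rejected as expected: " + e.getMessage());
        }
    }
}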

View File

@ -1,32 +0,0 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<parent>
<artifactId>nifi-external</artifactId>
<groupId>org.apache.nifi</groupId>
<version>2.0.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>
<packaging>pom</packaging>
<artifactId>nifi-kafka-connect</artifactId>
<modules>
<module>nifi-kafka-connector</module>
<module>nifi-kafka-connector-assembly</module>
<module>nifi-kafka-connector-tests</module>
</modules>
</project>

View File

@ -1,29 +0,0 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi</artifactId>
<version>2.0.0-SNAPSHOT</version>
</parent>
<artifactId>nifi-external</artifactId>
<packaging>pom</packaging>
<modules>
<module>nifi-kafka-connect</module>
</modules>
</project>

View File

@ -37,7 +37,6 @@
<module>nifi-assembly</module>
<module>nifi-docs</module>
<module>nifi-maven-archetypes</module>
<module>nifi-external</module>
<module>nifi-docker</module>
<module>nifi-system-tests</module>
<module>minifi</module>