dfs.federation.router.default.nameserviceId
Nameservice identifier of the default subcluster to monitor.
dfs.federation.router.default.nameservice.enable
true
The default subcluster is enabled to read and write files.
dfs.federation.router.rpc.enable
true
If true, the RPC service to handle client requests in the router is
enabled.
dfs.federation.router.rpc-address
0.0.0.0:8888
RPC address that handles all client requests.
The value of this property will take the form of router-host1:rpc-port.
dfs.federation.router.rpc-bind-host
The actual address the RPC server will bind to. If this optional address is
set, it overrides only the hostname portion of
dfs.federation.router.rpc-address. This is useful for making the Router
listen on all interfaces by setting it to 0.0.0.0.
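A minimal sketch of how the address and bind-host settings combine, assuming overrides are placed in a site file such as hdfs-rbf-site.xml. The hostname router1.example.com is a hypothetical placeholder.

```xml
<configuration>
  <!-- Address advertised to clients; the hostname is a made-up placeholder. -->
  <property>
    <name>dfs.federation.router.rpc-address</name>
    <value>router1.example.com:8888</value>
  </property>
  <!-- Overrides only the hostname portion, so the RPC server binds to 0.0.0.0:8888. -->
  <property>
    <name>dfs.federation.router.rpc-bind-host</name>
    <value>0.0.0.0</value>
  </property>
</configuration>
```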
dfs.federation.router.handler.count
10
The number of server threads for the router to handle RPC requests from
clients.
dfs.federation.router.handler.queue.size
100
The size of the call queue for the handlers that process RPC client requests.
dfs.federation.router.reader.count
1
The number of readers for the router to handle RPC client requests.
dfs.federation.router.reader.queue.size
100
The size of the queue for the readers that process RPC client requests.
dfs.federation.router.connection.creator.queue-size
100
Size of async connection creator queue.
dfs.federation.router.connection.pool-size
1
Size of the pool of connections from the router to namenodes.
dfs.federation.router.connection.min-active-ratio
0.5f
Minimum active ratio of connections from the router to namenodes.
dfs.federation.router.connection.clean.ms
10000
Time interval, in milliseconds, to check if the connection pool should
remove unused connections.
dfs.federation.router.enable.multiple.socket
false
Whether to enable multiple downstream sockets. If true, ConnectionPool
will use a new socket when creating a new connection for the same user,
and RouterRPCClient will get better throughput. It is best used together
with dfs.federation.router.max.concurrency.per.connection to get
better throughput with fewer sockets. For example, enable
dfs.federation.router.enable.multiple.socket and
set dfs.federation.router.max.concurrency.per.connection = 20.
dfs.federation.router.max.concurrency.per.connection
1
The maximum number of requests that a connection can handle concurrently.
When the number of requests being processed by a socket is less than this value,
a new request will be processed by this socket. When
dfs.federation.router.enable.multiple.socket is enabled, it is best to
set this value greater than 1, such as 20, to avoid frequent socket
creation and idle sockets when a nameservice has a jittery request load.
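The pairing recommended above can be written as the following fragment. This is a sketch only; the value 20 is the example figure from the description, not a tuned recommendation.

```xml
<configuration>
  <!-- Allow a new socket per new connection for the same user. -->
  <property>
    <name>dfs.federation.router.enable.multiple.socket</name>
    <value>true</value>
  </property>
  <!-- Let each connection carry up to 20 concurrent requests
       so that fewer sockets are needed. -->
  <property>
    <name>dfs.federation.router.max.concurrency.per.connection</name>
    <value>20</value>
  </property>
</configuration>
```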
dfs.federation.router.connection.pool.clean.ms
60000
Time interval, in milliseconds, to check if the connection manager should
remove unused connection pools.
dfs.federation.router.metrics.enable
true
If the metrics in the router are enabled.
dfs.federation.router.dn-report.time-out
1000
Timeout, in milliseconds, for getDatanodeReport.
dfs.federation.router.dn-report.cache-expire
10s
Expiration time, in seconds, for the cached datanode report.
dfs.federation.router.enable.get.dn.usage
true
If true, the getNodeUsage method in RBFMetrics will return an up-to-date
result collected from the downstream nameservices, but this takes a long
time and ties up thread resources. If false, it will return a mock result with all values set to 0.
dfs.federation.router.metrics.class
org.apache.hadoop.hdfs.server.federation.metrics.FederationRPCPerformanceMonitor
Class to monitor the RPC system in the router. It must implement the
RouterRpcMonitor interface.
dfs.federation.router.admin.enable
true
If true, the RPC admin service to handle client requests in the router is
enabled.
dfs.federation.router.admin-address
0.0.0.0:8111
RPC address that handles the admin requests.
The value of this property will take the form of router-host1:rpc-port.
dfs.federation.router.admin-bind-host
The actual address the RPC admin server will bind to. If this optional
address is set, it overrides only the hostname portion of
dfs.federation.router.admin-address. This is useful for making the
Router listen on all interfaces by setting it to 0.0.0.0.
dfs.federation.router.admin.handler.count
1
The number of server threads for the router to handle RPC requests from
admin.
dfs.federation.router.admin.mount.check.enable
false
If true, adding or updating a mount table entry will include a destination check to make
sure the file exists in the downstream namenodes, and changes to the mount table
will fail if the file doesn't exist in any of the destination namenodes.
dfs.federation.router.http-address
0.0.0.0:50071
HTTP address that handles the web requests to the Router.
The value of this property will take the form of router-host1:http-port.
dfs.federation.router.http-bind-host
The actual address the HTTP server will bind to. If this optional
address is set, it overrides only the hostname portion of
dfs.federation.router.http-address. This is useful for making the
Router listen on all interfaces by setting it to 0.0.0.0.
dfs.federation.router.https-address
0.0.0.0:50072
HTTPS address that handles the web requests to the Router.
The value of this property will take the form of router-host1:https-port.
dfs.federation.router.https-bind-host
The actual address the HTTPS server will bind to. If this optional
address is set, it overrides only the hostname portion of
dfs.federation.router.https-address. This is useful for making the
Router listen on all interfaces by setting it to 0.0.0.0.
dfs.federation.router.http.enable
true
If the HTTP service to handle client requests in the router is enabled.
dfs.federation.router.fs-limits.max-component-length
0
Defines the maximum number of bytes in UTF-8 encoding in each
component of a path on the Router side. A value of 0 disables the check.
Multiple size unit suffixes are supported (case insensitive). It acts like
the configuration dfs.namenode.fs-limits.max-component-length on the NameNode
side.
dfs.federation.router.file.resolver.client.class
org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver
Class to resolve files to subclusters. To enable multiple subclusters for a mount point,
set to org.apache.hadoop.hdfs.server.federation.resolver.MultipleDestinationMountTableResolver.
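As a sketch, enabling multiple subclusters for a mount point amounts to overriding the resolver class with the one named in the description:

```xml
<configuration>
  <!-- Replace the default MountTableResolver with the multi-destination resolver. -->
  <property>
    <name>dfs.federation.router.file.resolver.client.class</name>
    <value>org.apache.hadoop.hdfs.server.federation.resolver.MultipleDestinationMountTableResolver</value>
  </property>
</configuration>
```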
dfs.federation.router.namenode.resolver.client.class
org.apache.hadoop.hdfs.server.federation.resolver.MembershipNamenodeResolver
Class to resolve the namenode for a subcluster.
dfs.federation.router.store.enable
true
If true, the Router connects to the State Store.
dfs.federation.router.store.serializer
org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreSerializerPBImpl
Class to serialize State Store records.
dfs.federation.router.store.driver.class
org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreZooKeeperImpl
Class to implement the State Store. There are three implementation classes currently
being supported:
org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreFileImpl,
org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreFileSystemImpl and
org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreZooKeeperImpl.
These implementation classes use a local file, a filesystem and ZooKeeper as the backend, respectively.
ZooKeeper is the default State Store backend.
dfs.federation.router.store.connection.test
60000
How often to check for the connection to the State Store in milliseconds.
dfs.federation.router.store.driver.zk.parent-path
/hdfs-federation
The parent path in ZooKeeper for StateStoreZooKeeperImpl.
dfs.federation.router.store.driver.zk.async.max.threads
-1
Maximum number of threads for StateStoreZooKeeperImpl in async mode.
This is the only driver class that currently supports the setting:
org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreZooKeeperImpl.
The default value is -1, which means StateStoreZooKeeperImpl works in sync mode.
Use a positive integer value to enable async mode.
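A hedged sketch of a ZooKeeper-backed State Store running in async mode. The thread count of 4 is an arbitrary illustrative value; the class and parent path are the defaults listed above.

```xml
<configuration>
  <property>
    <name>dfs.federation.router.store.driver.class</name>
    <value>org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreZooKeeperImpl</value>
  </property>
  <property>
    <name>dfs.federation.router.store.driver.zk.parent-path</name>
    <value>/hdfs-federation</value>
  </property>
  <!-- Any positive integer enables async mode; 4 is only an example. -->
  <property>
    <name>dfs.federation.router.store.driver.zk.async.max.threads</name>
    <value>4</value>
  </property>
</configuration>
```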
dfs.federation.router.cache.ttl
1m
How often to refresh the State Store caches in milliseconds. This setting
supports multiple time unit suffixes as described in
dfs.heartbeat.interval. If no suffix is specified then milliseconds is
assumed.
dfs.federation.router.store.membership.expiration
300000
Expiration time in milliseconds for a membership record.
dfs.federation.router.store.membership.expiration.deletion
-1
Deletion time in milliseconds for a membership record. If an expired
membership record exists beyond this time, it will be deleted. If this
value is negative, the deletion is disabled.
dfs.federation.router.heartbeat.enable
true
If true, the Router heartbeats into the State Store.
dfs.federation.router.heartbeat.interval
5000
How often the Router should heartbeat into the State Store in milliseconds.
dfs.federation.router.health.monitor.timeout
30s
Time out for Router to obtain HAServiceStatus from NameNode.
dfs.federation.router.heartbeat-state.interval
5s
How often the Router should heartbeat its state into the State Store in
milliseconds. This setting supports multiple time unit suffixes as
described in dfs.federation.router.quota-cache.update.interval.
dfs.federation.router.namenode.heartbeat.enable
If true, get namenode heartbeats and send into the State Store.
If not explicitly specified takes the same value as for
dfs.federation.router.heartbeat.enable.
dfs.federation.router.store.router.expiration
5m
Expiration time in milliseconds for a router state record. This setting
supports multiple time unit suffixes as described in
dfs.federation.router.quota-cache.update.interval.
dfs.federation.router.store.router.expiration.deletion
-1
Deletion time in milliseconds for a router state record. If an expired
router state record exists beyond this time, it will be deleted. If this
value is negative, the deletion is disabled.
dfs.federation.router.safemode.enable
true
dfs.federation.router.safemode.extension
30s
Time after startup that the Router is in safe mode. This setting
supports multiple time unit suffixes as described in
dfs.heartbeat.interval. If no suffix is specified then milliseconds is
assumed.
dfs.federation.router.safemode.expiration
3m
Time without being able to reach the State Store before the Router enters safe mode. This
setting supports multiple time unit suffixes as described in
dfs.heartbeat.interval. If no suffix is specified then milliseconds is
assumed.
dfs.federation.router.monitor.namenode
The identifier of the namenodes to monitor and heartbeat.
dfs.federation.router.monitor.namenode.nameservice.resolution-enabled
false
Determines if the given monitored namenode address is a domain name which needs to
be resolved.
This is used by router to resolve namenodes.
dfs.federation.router.monitor.namenode.nameservice.resolver.impl
Nameservice resolver implementation used by the Router.
Effective when
dfs.federation.router.monitor.namenode.nameservice.resolution-enabled is on.
dfs.federation.router.monitor.localnamenode.enable
true
If true, the Router should monitor the namenode in the local machine.
dfs.federation.router.mount-table.max-cache-size
10000
Maximum number of mount table cache entries to have.
By default, remove cache entries if we have more than 10k.
dfs.federation.router.mount-table.cache.enable
true
Set to true to enable mount table cache (Path to Remote Location cache).
Disabling the cache is recommended when a large number of unique paths are queried.
dfs.federation.router.quota.enable
false
Set to true to enable quota system in Router. When it's enabled, setting
or clearing sub-cluster's quota directly is not recommended since Router
Admin server will override sub-cluster's quota with global quota.
dfs.federation.router.quota-cache.update.interval
60s
Interval time for updating quota usage cache in Router.
This property is used only if the value of
dfs.federation.router.quota.enable is true.
This setting supports multiple time unit suffixes as described
in dfs.heartbeat.interval. If no suffix is specified then milliseconds
is assumed.
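A small sketch of turning on the quota system together with its cache update interval. The 2m value is illustrative and relies on the time unit suffix support described above.

```xml
<configuration>
  <property>
    <name>dfs.federation.router.quota.enable</name>
    <value>true</value>
  </property>
  <!-- Only used when quota is enabled; a bare number would be read as milliseconds. -->
  <property>
    <name>dfs.federation.router.quota-cache.update.interval</name>
    <value>2m</value>
  </property>
</configuration>
```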
dfs.federation.router.client.thread-size
32
Max threads size for the RouterClient to execute concurrent
requests.
dfs.federation.router.client.retry.max.attempts
3
Max retry attempts for the RouterClient talking to the Router.
dfs.federation.router.client.reject.overload
false
Set to true to reject client requests when we run out of RPC client
threads.
dfs.federation.router.client.allow-partial-listing
true
Whether the Router can return a partial list of files in a multi-destination mount point when one of the subclusters is unavailable.
True may return a partial list of files if a subcluster is down.
False will fail the request if one is unavailable.
dfs.federation.router.client.mount-status.time-out
1s
Set a timeout for the Router when listing folders containing mount
points. In this process, the Router checks the mount table and then it
checks permissions in the subcluster. After the time out, we return the
default values.
dfs.federation.router.connect.max.retries.on.timeouts
0
Maximum number of retries for the IPC Client when connecting to the
subclusters. By default, it doesn't let the IPC retry and the Router
handles it.
dfs.federation.router.connect.timeout
2s
Time out for the IPC client connecting to the subclusters. This should be
short as the Router has knowledge of the state of the Routers.
dfs.federation.router.keytab.file
The keytab file used by router to login as its
service principal. The principal name is configured with
dfs.federation.router.kerberos.principal.
dfs.federation.router.kerberos.principal
The Router service principal. This is typically set to
router/_HOST@REALM.TLD. Each Router will substitute _HOST with its
own fully qualified hostname at startup. The _HOST placeholder
allows using the same configuration setting on every Router
in an HA setup.
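A sketch of the keytab and principal pair for a secured Router. The keytab path is a hypothetical location, and REALM.TLD stands in for the actual realm.

```xml
<configuration>
  <!-- Hypothetical keytab path; use the location where the keytab is actually deployed. -->
  <property>
    <name>dfs.federation.router.keytab.file</name>
    <value>/etc/security/keytabs/router.service.keytab</value>
  </property>
  <!-- _HOST is replaced by each Router with its own fully qualified hostname at startup. -->
  <property>
    <name>dfs.federation.router.kerberos.principal</name>
    <value>router/_HOST@REALM.TLD</value>
  </property>
</configuration>
```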
dfs.federation.router.kerberos.principal.hostname
Optional. The hostname for the Router containing this
configuration file. Will be different for each machine.
Defaults to current hostname.
dfs.federation.router.kerberos.internal.spnego.principal
${dfs.web.authentication.kerberos.principal}
The server principal used by the Router for web UI SPNEGO
authentication when Kerberos security is enabled. This is
typically set to HTTP/_HOST@REALM.TLD. The SPNEGO server principal
begins with the prefix HTTP/ by convention.
If the value is '*', the web server will attempt to login with
every principal specified in the keytab file
dfs.web.authentication.kerberos.keytab.
dfs.federation.router.mount-table.cache.update
false
Set true to enable MountTableRefreshService. This service
updates mount table cache immediately after adding, modifying or
deleting the mount table entries. If this service is not enabled,
the mount table cache is refreshed periodically by
StateStoreCacheUpdateService.
dfs.federation.router.mount-table.cache.update.timeout
1m
This property defines how long to wait for all the
admin servers to finish their mount table cache update. This setting
supports multiple time unit suffixes as described in
dfs.federation.router.safemode.extension.
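A sketch of enabling the immediate mount table refresh along with its wait timeout. The timeout shown is the default value.

```xml
<configuration>
  <!-- Enable MountTableRefreshService instead of relying only on periodic refresh. -->
  <property>
    <name>dfs.federation.router.mount-table.cache.update</name>
    <value>true</value>
  </property>
  <!-- How long to wait for all admin servers to finish their cache update. -->
  <property>
    <name>dfs.federation.router.mount-table.cache.update.timeout</name>
    <value>1m</value>
  </property>
</configuration>
```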
dfs.federation.router.mount-table.cache.update.client.max.time
5m
Remote router mount table cache is updated through
RouterClient(RPC call). To improve performance, RouterClient
connections are cached but it should not be kept in cache forever.
This property defines the max time a connection can be cached. This
setting supports multiple time unit suffixes as described in
dfs.federation.router.safemode.extension.
dfs.federation.router.secret.manager.class
org.apache.hadoop.hdfs.server.federation.router.security.token.ZKDelegationTokenSecretManagerImpl
Class that implements the state store for delegation tokens.
The default implementation uses ZooKeeper as the backend to store delegation tokens.
dfs.federation.router.top.num.token.realowners
10
The number of top real owners by tokens count to report in the JMX metrics.
Real owners are the effective users whose credentials are used to generate
the tokens.
dfs.federation.router.fairness.policy.controller.class
org.apache.hadoop.hdfs.server.federation.fairness.NoRouterRpcFairnessPolicyController
No fairness policy handler is used by default. To enable fairness,
StaticFairnessPolicyController should be configured.
dfs.federation.router.fairness.handler.count.EXAMPLENAMESERVICE
Dedicated handler count for nameservice EXAMPLENAMESERVICE. The handler
resource (configured by dfs.federation.router.handler.count) is controlled
internally by Semaphore permits. Two requirements have to be satisfied.
1) All downstream nameservices need this config; otherwise no permit will
be given and no proxying will happen. 2) If the special *concurrent*
nameservice is specified, the sum of all configured values must be smaller than or
equal to the total number of router handlers; if the special *concurrent*
nameservice is not specified, the sum of all configured values must be strictly
smaller than the number of router handlers, so that the remainder is allocated to the
concurrent calls.
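A worked sketch of the second requirement, assuming two hypothetical downstream nameservices named ns1 and ns2 and no explicit *concurrent* entry. With 10 handlers, 4 + 4 = 8 is strictly smaller than 10, so the remaining 2 permits are left for concurrent calls. A fairness controller must also be set through dfs.federation.router.fairness.policy.controller.class for these counts to take effect.

```xml
<configuration>
  <!-- Total number of router handlers. -->
  <property>
    <name>dfs.federation.router.handler.count</name>
    <value>10</value>
  </property>
  <!-- ns1 and ns2 are hypothetical nameservice IDs; every downstream
       nameservice needs its own entry. -->
  <property>
    <name>dfs.federation.router.fairness.handler.count.ns1</name>
    <value>4</value>
  </property>
  <property>
    <name>dfs.federation.router.fairness.handler.count.ns2</name>
    <value>4</value>
  </property>
</configuration>
```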
dfs.federation.router.fairness.acquire.timeout
1s
The maximum time to wait for a permit.
dfs.federation.router.federation.rename.bandwidth
10
Specify bandwidth per map in MB.
dfs.federation.router.federation.rename.map
10
Max number of concurrent maps to use for copy.
dfs.federation.router.federation.rename.delay
1000
Specify the delay duration (in milliseconds) when the job needs to retry.
dfs.federation.router.federation.rename.diff
0
Specify the threshold of the diff entries used in the incremental copy
stage.
dfs.federation.router.federation.rename.option
NONE
Specify the action to take when renaming across namespaces. The option can be NONE
or DISTCP.
dfs.federation.router.federation.rename.force.close.open.file
true
Force close all open files when there is no diff in the DIFF_DISTCP stage.
dfs.federation.router.federation.rename.trash
trash
This option has 3 values: trash (move the source path to trash), delete
(delete the source path directly) and skip (skip both trash and deletion).
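A sketch of switching cross-namespace rename from NONE to DISTCP. The bandwidth, map count and trash values shown are the defaults listed above, repeated only to show how the settings group together.

```xml
<configuration>
  <!-- Use DISTCP for renames across namespaces instead of the default NONE. -->
  <property>
    <name>dfs.federation.router.federation.rename.option</name>
    <value>DISTCP</value>
  </property>
  <!-- Bandwidth per map in MB. -->
  <property>
    <name>dfs.federation.router.federation.rename.bandwidth</name>
    <value>10</value>
  </property>
  <!-- Max number of concurrent maps used for the copy. -->
  <property>
    <name>dfs.federation.router.federation.rename.map</name>
    <value>10</value>
  </property>
  <!-- One of trash, delete or skip. -->
  <property>
    <name>dfs.federation.router.federation.rename.trash</name>
    <value>trash</value>
  </property>
</configuration>
```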
dfs.federation.router.observer.read.default
false
Whether observer reads are enabled. This is a default for all nameservices.
The default can be inverted for individual namespaces by adding them to
dfs.federation.router.observer.read.overrides.
dfs.federation.router.observer.read.overrides
Comma-separated list of namespaces for which to invert the default configuration,
dfs.federation.router.observer.read.default, for whether to enable observer reads.
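A sketch of how the default and the overrides interact, using hypothetical namespace names ns1 and ns2. With the default left at false, listing them in the overrides inverts it, so observer reads are enabled only for those two namespaces.

```xml
<configuration>
  <!-- Observer reads are off for all nameservices by default. -->
  <property>
    <name>dfs.federation.router.observer.read.default</name>
    <value>false</value>
  </property>
  <!-- ns1 and ns2 are made-up names; the default is inverted for each listed namespace. -->
  <property>
    <name>dfs.federation.router.observer.read.overrides</name>
    <value>ns1,ns2</value>
  </property>
</configuration>
```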
dfs.federation.router.observer.federated.state.propagation.maxsize
5
The maximum size of the federated state to send in the RPC header. Sending the federated
state removes the need to msync on every read call, but at the expense of having a larger
header. The cost tradeoff between the larger header and always msync'ing depends on the number
of namespaces in use and the latency of the msync requests.