dfs.federation.router.default.nameserviceId Nameservice identifier of the default subcluster to monitor.
dfs.federation.router.default.nameservice.enable true The default subcluster is enabled to read and write files.
dfs.federation.router.rpc.enable true If true, the RPC service to handle client requests in the router is enabled.
dfs.federation.router.rpc-address 0.0.0.0:8888 RPC address that handles all client requests. The value of this property will take the form of router-host1:rpc-port.
dfs.federation.router.rpc-bind-host The actual address the RPC server will bind to. If this optional address is set, it overrides only the hostname portion of dfs.federation.router.rpc-address. This is useful for making the Router listen on all interfaces by setting it to 0.0.0.0.
dfs.federation.router.handler.count 10 The number of server threads for the router to handle RPC requests from clients.
dfs.federation.router.handler.queue.size 100 The size of the queue for the number of handlers to handle RPC client requests.
dfs.federation.router.reader.count 1 The number of readers for the router to handle RPC client requests.
dfs.federation.router.reader.queue.size 100 The size of the queue for the number of readers for the router to handle RPC client requests.
dfs.federation.router.connection.creator.queue-size 100 Size of async connection creator queue.
dfs.federation.router.connection.pool-size 1 Size of the pool of connections from the router to namenodes.
dfs.federation.router.connection.min-active-ratio 0.5f Minimum active ratio of connections from the router to namenodes.
dfs.federation.router.connection.clean.ms 10000 Time interval, in milliseconds, to check if the connection pool should remove unused connections.
dfs.federation.router.enable.multiple.socket false Whether to enable multiple downstream sockets. If true, ConnectionPool will use a new socket when creating a new connection for the same user, and RouterRPCClient will get better throughput.
It is best used together with dfs.federation.router.max.concurrency.per.connection to get better throughput with fewer sockets, for example by enabling dfs.federation.router.enable.multiple.socket and setting dfs.federation.router.max.concurrency.per.connection = 20.
dfs.federation.router.max.concurrency.per.connection 1 The maximum number of requests that a connection can handle concurrently. When the number of requests being processed by a socket is less than this value, new requests will be processed by this socket. When dfs.federation.router.enable.multiple.socket is enabled, it is best to set this value greater than 1, such as 20, to avoid frequent socket creation and idle sockets in the case of a nameservice with jittery request traffic.
dfs.federation.router.connection.pool.clean.ms 60000 Time interval, in milliseconds, to check if the connection manager should remove unused connection pools.
dfs.federation.router.metrics.enable true If the metrics in the router are enabled.
dfs.federation.router.dn-report.time-out 1000 Time out, in milliseconds, for getDatanodeReport.
dfs.federation.router.dn-report.cache-expire 10s Expiration time in seconds for the cached datanode report.
dfs.federation.router.enable.get.dn.usage true If true, the getNodeUsage method in RBFMetrics will return an up-to-date result collected from downstream nameservices, but it will take a long time and take up thread resources. If false, it will return a mock result with all 0.
dfs.federation.router.metrics.class org.apache.hadoop.hdfs.server.federation.metrics.FederationRPCPerformanceMonitor Class to monitor the RPC system in the router. It must implement the RouterRpcMonitor interface.
dfs.federation.router.admin.enable true If true, the RPC admin service to handle client requests in the router is enabled.
dfs.federation.router.admin-address 0.0.0.0:8111 RPC address that handles the admin requests. The value of this property will take the form of router-host1:rpc-port.
dfs.federation.router.admin-bind-host The actual address the RPC admin server will bind to. If this optional address is set, it overrides only the hostname portion of dfs.federation.router.admin-address. This is useful for making the Router listen on all interfaces by setting it to 0.0.0.0.
dfs.federation.router.admin.handler.count 1 The number of server threads for the router to handle RPC requests from admin.
dfs.federation.router.admin.mount.check.enable false If true, add/update mount table will include a destination check to make sure the file exists in downstream namenodes, and changes to the mount table will fail if the file doesn't exist in any of the destination namenodes.
dfs.federation.router.http-address 0.0.0.0:50071 HTTP address that handles the web requests to the Router. The value of this property will take the form of router-host1:http-port.
dfs.federation.router.http-bind-host The actual address the HTTP server will bind to. If this optional address is set, it overrides only the hostname portion of dfs.federation.router.http-address. This is useful for making the Router listen on all interfaces by setting it to 0.0.0.0.
dfs.federation.router.https-address 0.0.0.0:50072 HTTPS address that handles the web requests to the Router. The value of this property will take the form of router-host1:https-port.
dfs.federation.router.https-bind-host The actual address the HTTPS server will bind to. If this optional address is set, it overrides only the hostname portion of dfs.federation.router.https-address. This is useful for making the Router listen on all interfaces by setting it to 0.0.0.0.
dfs.federation.router.http.enable true If the HTTP service to handle client requests in the router is enabled.
dfs.federation.router.fs-limits.max-component-length 0 Defines the maximum number of bytes in UTF-8 encoding in each component of a path at the Router side. A value of 0 will disable the check. Supports multiple size unit suffixes (case insensitive).
It acts like the dfs.namenode.fs-limits.max-component-length configuration at the NameNode side.
dfs.federation.router.file.resolver.client.class org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver Class to resolve files to subclusters. To enable multiple subclusters for a mount point, set to org.apache.hadoop.hdfs.server.federation.resolver.MultipleDestinationMountTableResolver.
dfs.federation.router.namenode.resolver.client.class org.apache.hadoop.hdfs.server.federation.resolver.MembershipNamenodeResolver Class to resolve the namenode for a subcluster.
dfs.federation.router.store.enable true If true, the Router connects to the State Store.
dfs.federation.router.store.serializer org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreSerializerPBImpl Class to serialize State Store records.
dfs.federation.router.store.driver.class org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreZooKeeperImpl Class to implement the State Store. Three implementation classes are currently supported: org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreFileImpl, org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreFileSystemImpl and org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreZooKeeperImpl. These implementation classes use the local file, filesystem and ZooKeeper as a backend respectively. By default, ZooKeeper is used as the State Store.
dfs.federation.router.store.connection.test 60000 How often to check for the connection to the State Store in milliseconds.
dfs.federation.router.store.driver.zk.parent-path /hdfs-federation The parent path in ZooKeeper for StateStoreZooKeeperImpl.
dfs.federation.router.store.driver.zk.async.max.threads -1 Maximum number of threads for StateStoreZooKeeperImpl in async mode. The only class currently supported is org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreZooKeeperImpl.
The default value is -1, which means StateStoreZooKeeperImpl works in sync mode. Use a positive integer value to enable async mode.
dfs.federation.router.cache.ttl 1m How often to refresh the State Store caches in milliseconds. This setting supports multiple time unit suffixes as described in dfs.heartbeat.interval. If no suffix is specified then milliseconds is assumed.
dfs.federation.router.store.membership.expiration 300000 Expiration time in milliseconds for a membership record.
dfs.federation.router.store.membership.expiration.deletion -1 Deletion time in milliseconds for a membership record. If an expired membership record exists beyond this time, it will be deleted. If this value is negative, the deletion is disabled.
dfs.federation.router.heartbeat.enable true If true, the Router heartbeats into the State Store.
dfs.federation.router.heartbeat.interval 5000 How often the Router should heartbeat into the State Store in milliseconds.
dfs.federation.router.health.monitor.timeout 30s Time out for the Router to obtain HAServiceStatus from the NameNode.
dfs.federation.router.heartbeat-state.interval 5s How often the Router should heartbeat its state into the State Store in milliseconds. This setting supports multiple time unit suffixes as described in dfs.federation.router.quota-cache.update.interval.
dfs.federation.router.namenode.heartbeat.enable If true, get namenode heartbeats and send them into the State Store. If not explicitly specified, it takes the same value as dfs.federation.router.heartbeat.enable.
dfs.federation.router.store.router.expiration 5m Expiration time in milliseconds for a router state record. This setting supports multiple time unit suffixes as described in dfs.federation.router.quota-cache.update.interval.
dfs.federation.router.store.router.expiration.deletion -1 Deletion time in milliseconds for a router state record. If an expired router state record exists beyond this time, it will be deleted. If this value is negative, the deletion is disabled.
dfs.federation.router.safemode.enable true
dfs.federation.router.safemode.extension 30s Time after startup that the Router is in safe mode. This setting supports multiple time unit suffixes as described in dfs.heartbeat.interval. If no suffix is specified then milliseconds is assumed.
dfs.federation.router.safemode.expiration 3m Time without being able to reach the State Store to enter safe mode. This setting supports multiple time unit suffixes as described in dfs.heartbeat.interval. If no suffix is specified then milliseconds is assumed.
dfs.federation.router.monitor.namenode The identifier of the namenodes to monitor and heartbeat.
dfs.federation.router.monitor.namenode.nameservice.resolution-enabled false Determines if the given monitored namenode address is a domain name which needs to be resolved. This is used by the router to resolve namenodes.
dfs.federation.router.monitor.namenode.nameservice.resolver.impl Nameservice resolver implementation used by the router. Effective when dfs.federation.router.monitor.namenode.nameservice.resolution-enabled is on.
dfs.federation.router.monitor.localnamenode.enable true If true, the Router should monitor the namenode in the local machine.
dfs.federation.router.mount-table.max-cache-size 10000 Maximum number of mount table cache entries to have. By default, remove cache entries if we have more than 10k.
dfs.federation.router.mount-table.cache.enable true Set to true to enable the mount table cache (Path to Remote Location cache). Disabling the cache is recommended when a large number of unique paths are queried.
dfs.federation.router.quota.enable false Set to true to enable the quota system in the Router. When it is enabled, setting or clearing a sub-cluster's quota directly is not recommended, since the Router Admin server will override the sub-cluster's quota with the global quota.
dfs.federation.router.quota-cache.update.interval 60s Interval time for updating the quota usage cache in the Router.
This property is used only if the value of dfs.federation.router.quota.enable is true. This setting supports multiple time unit suffixes as described in dfs.heartbeat.interval. If no suffix is specified then milliseconds is assumed.
dfs.federation.router.client.thread-size 32 Max threads size for the RouterClient to execute concurrent requests.
dfs.federation.router.client.retry.max.attempts 3 Max retry attempts for the RouterClient talking to the Router.
dfs.federation.router.client.reject.overload false Set to true to reject client requests when we run out of RPC client threads.
dfs.federation.router.client.allow-partial-listing true If the Router can return a partial list of files in a multi-destination mount point when one of the subclusters is unavailable. True may return a partial list of files if a subcluster is down. False will fail the request if one is unavailable.
dfs.federation.router.client.mount-status.time-out 1s Set a timeout for the Router when listing folders containing mount points. In this process, the Router checks the mount table and then it checks permissions in the subcluster. After the time out, we return the default values.
dfs.federation.router.connect.max.retries.on.timeouts 0 Maximum number of retries for the IPC Client when connecting to the subclusters. By default, it doesn't let the IPC retry and the Router handles it.
dfs.federation.router.connect.timeout 2s Time out for the IPC client connecting to the subclusters. This should be short as the Router has knowledge of the state of the Routers.
dfs.federation.router.keytab.file The keytab file used by router to login as its service principal. The principal name is configured with dfs.federation.router.kerberos.principal.
dfs.federation.router.kerberos.principal The Router service principal. This is typically set to router/_HOST@REALM.TLD. Each Router will substitute _HOST with its own fully qualified hostname at startup.
The _HOST placeholder allows using the same configuration setting on both Routers in an HA setup.
dfs.federation.router.kerberos.principal.hostname Optional. The hostname for the Router containing this configuration file. Will be different for each machine. Defaults to current hostname.
dfs.federation.router.kerberos.internal.spnego.principal ${dfs.web.authentication.kerberos.principal} The server principal used by the Router for web UI SPNEGO authentication when Kerberos security is enabled. This is typically set to HTTP/_HOST@REALM.TLD. The SPNEGO server principal begins with the prefix HTTP/ by convention. If the value is '*', the web server will attempt to login with every principal specified in the keytab file dfs.web.authentication.kerberos.keytab.
dfs.federation.router.mount-table.cache.update false Set to true to enable MountTableRefreshService. This service updates the mount table cache immediately after adding, modifying or deleting mount table entries. If this service is not enabled, the mount table cache is refreshed periodically by StateStoreCacheUpdateService.
dfs.federation.router.mount-table.cache.update.timeout 1m This property defines how long to wait for all the admin servers to finish their mount table cache update. This setting supports multiple time unit suffixes as described in dfs.federation.router.safemode.extension.
dfs.federation.router.mount-table.cache.update.client.max.time 5m The remote router mount table cache is updated through RouterClient (RPC call). To improve performance, RouterClient connections are cached, but they should not be kept in the cache forever. This property defines the max time a connection can be cached. This setting supports multiple time unit suffixes as described in dfs.federation.router.safemode.extension.
dfs.federation.router.secret.manager.class org.apache.hadoop.hdfs.server.federation.router.security.token.ZKDelegationTokenSecretManagerImpl Class to implement the state store for delegation tokens.
The default implementation uses ZooKeeper as the backend to store delegation tokens.
dfs.federation.router.top.num.token.realowners 10 The number of top real owners by token count to report in the JMX metrics. Real owners are the effective users whose credentials are used to generate the tokens.
dfs.federation.router.fairness.policy.controller.class org.apache.hadoop.hdfs.server.federation.fairness.NoRouterRpcFairnessPolicyController No fairness policy handler by default; for fairness, StaticFairnessPolicyController should be configured.
dfs.federation.router.fairness.handler.count.EXAMPLENAMESERVICE Dedicated handler count for nameservice EXAMPLENAMESERVICE. The handler resource (configured by dfs.federation.router.handler.count) is controlled internally by Semaphore permits. Two requirements have to be satisfied. 1) All downstream nameservices need this config; otherwise no permit will be given and thus no proxying will happen. 2) If the special *concurrent* nameservice is specified, the sum of all configured values must be smaller than or equal to the total number of router handlers; if the special *concurrent* nameservice is not specified, the sum of all configured values must be strictly smaller than the number of router handlers, so that the remainder is allocated to the concurrent calls.
dfs.federation.router.fairness.acquire.timeout 1s The maximum time to wait for a permit.
dfs.federation.router.federation.rename.bandwidth 10 Specify bandwidth per map in MB.
dfs.federation.router.federation.rename.map 10 Max number of concurrent maps to use for copy.
dfs.federation.router.federation.rename.delay 1000 Specify the delay duration (in milliseconds) when the job needs to retry.
dfs.federation.router.federation.rename.diff 0 Specify the threshold of the diff entries that is used in the incremental copy stage.
dfs.federation.router.federation.rename.option NONE Specify the action when renaming across namespaces. The option can be NONE or DISTCP.
dfs.federation.router.federation.rename.force.close.open.file true Force close all open files when there is no diff in the DIFF_DISTCP stage.
dfs.federation.router.federation.rename.trash trash This option has 3 values: trash (move the source path to trash), delete (delete the source path directly) and skip (skip both trash and deletion).
dfs.federation.router.observer.read.default false Whether observer reads are enabled. This is a default for all nameservices. The default can be inverted for individual namespaces by adding them to dfs.federation.router.observer.read.overrides.
dfs.federation.router.observer.read.overrides Comma-separated list of namespaces for which to invert the default configuration, dfs.federation.router.observer.read.default, for whether to enable observer reads.
dfs.federation.router.observer.federated.state.propagation.maxsize 5 The maximum size of the federated state to send in the RPC header. Sending the federated state removes the need to msync on every read call, but at the expense of having a larger header. The cost tradeoff between the larger header and always msync'ing depends on the number of namespaces in use and the latency of the msync requests.
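
As a rough illustration of how a few of the properties above might be overridden together, the fragment below sketches a Router-side site file. Several details are assumptions rather than facts stated in the table: the file is assumed to be hdfs-rbf-site.xml, the nameservice names ns0 and ns1 are hypothetical, and the `concurrent` key suffix and the fully qualified StaticRouterRpcFairnessPolicyController class name should be verified against the Hadoop release in use. The handler counts are picked only so that they add up to the default dfs.federation.router.handler.count of 10.

```xml
<configuration>
  <!-- Multiple downstream sockets, paired with a per-connection concurrency
       of 20 as suggested in the property descriptions. -->
  <property>
    <name>dfs.federation.router.enable.multiple.socket</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.federation.router.max.concurrency.per.connection</name>
    <value>20</value>
  </property>

  <!-- Static fairness policy. The class name below is an assumption;
       the table only refers to it as StaticFairnessPolicyController. -->
  <property>
    <name>dfs.federation.router.fairness.policy.controller.class</name>
    <value>org.apache.hadoop.hdfs.server.federation.fairness.StaticRouterRpcFairnessPolicyController</value>
  </property>

  <!-- Every downstream nameservice gets a dedicated handler count.
       ns0 and ns1 are hypothetical nameservice identifiers. -->
  <property>
    <name>dfs.federation.router.fairness.handler.count.ns0</name>
    <value>4</value>
  </property>
  <property>
    <name>dfs.federation.router.fairness.handler.count.ns1</name>
    <value>4</value>
  </property>
  <!-- Special "concurrent" entry (key suffix assumed): 4 + 4 + 2 = 10,
       which does not exceed the 10 router handlers. -->
  <property>
    <name>dfs.federation.router.fairness.handler.count.concurrent</name>
    <value>2</value>
  </property>

  <!-- Observer reads stay off by default and are inverted (enabled)
       only for the hypothetical nameservice ns1. -->
  <property>
    <name>dfs.federation.router.observer.read.default</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.federation.router.observer.read.overrides</name>
    <value>ns1</value>
  </property>
</configuration>
```

If the `concurrent` entry were omitted in this sketch, the dedicated counts would need to sum to strictly less than 10, for example 4 and 4, so that the remaining handlers could serve the concurrent calls.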