Compare commits

...

729 Commits

Author SHA1 Message Date
YuCheng Hu 6ed3394bf6 Bump maven-deploy-plugin.version to 3.0.0 2023-06-10 15:51:46 -04:00
hfutatzhanghb 35158db711
HDFS-17023. RBF: Record proxy time when call invokeConcurrent method. (#5683). Contributed by farmmamba.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Reviewed-by: Simbarashe Dzinamarira <sdzinamarira@linkedin.com>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-06-10 00:06:28 +05:30
Steve Loughran 7a45ef4164
MAPREDUCE-7435. Manifest Committer OOM on abfs (#5519)
This modifies the manifest committer so that the list of files
to rename is passed between stages as a file of
writeable entries on the local filesystem.

The map of directories to create is still passed in memory;
this map is built across all tasks, so even if many tasks
created files, if they all write into the same set of directories
the memory needed is O(directories) with the
task count not a factor.

The _SUCCESS file reports on heap size through gauges.
This should give a warning if there are problems.

Contributed by Steve Loughran
2023-06-09 17:00:59 +01:00
zhangshuyan 9c989515ba
HDFS-17037. Consider nonDfsUsed when running balancer. (#5715). Contributed by Shuyan Zhang.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2023-06-09 16:19:08 +08:00
Steve Loughran 7bb09f1010
HADOOP-18752. Change fs.s3a.directory.marker.retention to "keep" (#5689)
This 
1. changes the default value of fs.s3a.directory.marker.retention
   to "keep"
2. no longer prints a message when an S3A FS instance is
   instantiated with any option other than delete.

Switching to marker retention improves performance
on any S3 bucket, as there are no needless marker DELETE requests,
leading to a reduction in write IOPS and in any delays waiting
for the DELETE call to finish.

There are *very* significant improvements on versioned buckets,
where tombstone markers slow down LIST operations: the more
tombstones there are, the worse query planning gets.

Having versioning enabled on production stores is the foundation
of any data protection strategy, so this has tangible benefits
in production.

It is *not* compatible with older hadoop releases; specifically
- Hadoop branch 2 < 2.10.2
- Any release of Hadoop 3.0.x and Hadoop 3.1.x
- Hadoop 3.2.0 and 3.2.1
- Hadoop 3.3.0
Incompatible releases have no problems reading data in stores
where markers are retained, but can get confused when deleting
or renaming directories.

If you are still using older versions to write data, and cannot
yet upgrade, switch the option back to "delete", as in the sketch below.
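
A minimal sketch (not part of the original commit message), assuming
only hadoop-common on the classpath; the property name and its
"keep"/"delete" values are those described above:

    import org.apache.hadoop.conf.Configuration;

    public class MarkerRetentionExample {
      public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Revert to the pre-HADOOP-18752 default when older Hadoop
        // releases still write to the same bucket.
        conf.set("fs.s3a.directory.marker.retention", "delete");
      }
    }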

Contributed by Steve Loughran
2023-06-08 12:12:29 +01:00
hfutatzhanghb 0e6bd09ae3
HDFS-17003. Erasure Coding: invalidate wrong block after reporting bad blocks from datanode (#5643). Contributed by hfutatzhanghb.
Reviewed-by: Stephen O'Donnell <sodonnel@apache.org>
Reviewed-by: zhangshuyan <zqingchai@gmail.com>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2023-06-08 18:06:51 +08:00
hfutatzhanghb ddae78b0ec
HDFS-17035. FsVolumeImpl#getActualNonDfsUsed may return negative value. (#5708). Contributed by farmmamba.
Reviewed-by: Shuyan Zhang <zqingchai@gmail.com>
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-06-08 14:28:01 +05:30
huhaiyang 0c209961f8
HDFS-17019. Optimize the logic for reconfigure slow peer enable for Namenode (#5671)
2023-06-08 10:05:49 +08:00
Viraj Jasani 1dbaba8e70
HADOOP-18740. S3A prefetch cache blocks should be accessed by RW locks (#5675)
Contributed by Viraj Jasani
2023-06-07 14:05:52 +01:00
slfan1989 9de13f879a
YARN-11502. Refactor AMRMProxy#FederationInterceptor#registerApplicationMaster. (#5705) 2023-06-05 15:54:41 -07:00
slfan1989 e6937d7076
YARN-11425. [Hotfix] Modify Expiration Time Unit error. (#5712) 2023-06-05 15:51:39 -07:00
slfan1989 fd3c3ae068
YARN-11500. Fix typos in hadoop-yarn-server-common#federation. (#5702) 2023-06-05 15:49:36 -07:00
zhtttylz d9980ab40f
HDFS-17029. Support getECPolicies API in WebHDFS (#5698). Contributed by Hualong Zhang.
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Reviewed-by: Tao Li <tomscut@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-06-05 17:33:37 +05:30
caozhiqiang 5d6ca13c5c
HDFS-16983. Fix concat operation doesn't honor dfs.permissions.enabled (#5561). Contributed by caozhiqiang.
Reviewed-by: zhangshuyan <zqingchai@gmail.com>
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-06-05 16:42:59 +05:30
slfan1989 241398de3b
YARN-11492. Improve createJerseyClient#setConnectTimeout Code. (#5636). Contributed by Shilun Fan.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-06-05 16:36:07 +05:30
mudit-97 e69a077af8
YARN-11497: Support removal of only selective node states in untracked removal flow (#5681)
Co-authored-by: mudit.sharma <mudit.sharma@flipkart.com>
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2023-06-05 17:36:10 +08:00
hfutatzhanghb 2243cfd225
HDFS-17028. RBF: Optimize debug logs of class ConnectionPool and other related class. (#5694). Contributed by farmmamba.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-06-05 14:34:46 +05:30
Ayush Saxena 1d0c9ab433
Revert "HADOOP-18207. Introduce hadoop-logging module (#5503)"
This reverts commit 03a499821c.
2023-06-05 09:34:40 +05:30
Xianming Lei ee94f6cdcb
YARN-11277. Trigger log-dir deletion by size for NonAggregatingLogHandler. (#4797)
Reviewed-by: Akira Ajisaka <aajisaka@apache.org>
Reviewed-by: Ashutosh Gupta <ashugpt@amazon.com>
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2023-06-05 11:08:06 +08:00
Szilard Nemeth e0a339223a HADOOP-18709. Add curator based ZooKeeper communication support over SSL/TLS into the common library. Contributed by Ferenc Erdelyi 2023-06-04 14:40:41 -04:00
Viraj Jasani 03a499821c
HADOOP-18207. Introduce hadoop-logging module (#5503)
Reviewed-by: Duo Zhang <zhangduo@apache.org>
2023-06-02 18:07:34 -07:00
Steve Loughran 160b9fc3c9
HADOOP-18755. openFile builder new optLong() methods break hbase-filesystem (#5704)
This is a followup to 
HADOOP-18724. Open file fails with NumberFormatException for S3AFileSystem

Contributed by Steve Loughran
2023-06-01 14:31:08 +01:00
smarthan 9f1e23cc67
HDFS-17031. Reduce some repeated codes in RouterRpcServer. (#5701). Contributed by Chengwei Wang.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Reviewed-by: Simbarashe Dzinamarira <sdzinamarira@linkedin.com>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-06-01 08:32:38 +05:30
NishthaShah f8b7ddf69c
HDFS-16996. Fix flaky testFsCloseAfterClusterShutdown in TestFileCreation (#5697). Contributed by Nishtha Shah.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-06-01 08:23:48 +05:30
Simbarashe Dzinamarira d92a5815f4
HDFS-17027. RBF: Adds auto-msync support for clients connecting to routers. (#5693) 2023-05-31 10:20:19 -07:00
Marcono1234 9acf462d26
HDFS-17000. Fix faulty loop condition in TestDFSStripedOutputStreamUpdatePipeline (#5699). Contributed by Marcono1234.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-31 16:10:07 +05:30
hchaverri 124313d215
HDFS-17026. RBF: NamenodeHeartbeatService should update JMX report with configurable frequency. (#5691). Contributed by hchaverri.
Signed-off-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2023-05-31 14:26:31 +08:00
slfan1989 86c250a54a
YARN-7720. Race condition between second app attempt and UAM timeout when first attempt node is down. (#5672) 2023-05-29 10:37:08 -07:00
Xianming Lei 97afb33c73
YARN-11276. Add LRU cache for RMWebServices.getApps. (#4793)
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2023-05-26 20:46:00 +08:00
slfan1989 b977065cc4
YARN-11478. [Federation] SQLFederationStateStore Support Store ApplicationSubmitData. (#5663) 2023-05-24 11:43:20 -07:00
Steve Loughran e6b54f7f68
Revert "HADOOP-18706. Improve S3ABlockOutputStream recovery (#5563)"
This reverts commit 372631c566.

Reverted due to HADOOP-18744.
2023-05-24 19:22:22 +01:00
hfutatzhanghb e9740cb17a
HDFS-16908. Fix javadoc of field IncrementalBlockReportManager#readyToSend. (#5351). Contributed by farmmamba.
Reviewed-by: Tao Li <tomscut@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-23 16:12:50 +05:30
Tamas Domok aeb3f6f1a8
YARN-11490. Reverting YARN-11211 and eliminating the use of DefaultMetricsSystem during configuration validation (#5644) 2023-05-23 10:36:37 +02:00
Ashutosh Gupta a98d15804a
MAPREDUCE-7419. Upgrade Junit 4 to 5 in hadoop-mapreduce-client-common (#5028). Contributed by Ashutosh Gupta.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-23 11:37:49 +05:30
Viraj Jasani 3b65b5d68f
HDFS-17020. RBF: mount table addAll should print failed records in std error (#5674) 2023-05-22 18:39:52 -07:00
Gautham B A afe850ca2c
HADOOP-18746. Install Python 3 for Windows 10 docker image (#5679)
* This PR installs Python 3.10.11 for
  Windows 10 Docker image to fix
  the issue with building mvnsite.
* After installing Python 3.10.11, it
  creates the hardlink python -> python3
  as required by the script.
2023-05-21 21:10:04 +05:30
hfutatzhanghb 5b22dc6ace
HDFS-16909. Improve ReplicaMap#mergeAll method. (#5353). Contributed by ZhangHB.
Reviewed-by: zhangshuyan <zqingchai@gmail.com>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2023-05-21 19:08:58 +08:00
huhaiyang af933f3a4f
HDFS-17017. Fix the issue of arguments number limit in report command in DFSAdmin (#5667). Contributed by Haiyang Hu.
Reviewed-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-21 09:03:34 +05:30
Xianming Lei 0110e24ed8
YARN-11496. Improve TimelineService log format. (#5677). Contributed by Xianming Lei.
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-20 14:57:45 +05:30
NishthaShah 9a524ede87
HDFS-17022. Fix the exception message to print the Identifier pattern (#5678). Contributed by Nishtha Shah.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-20 14:40:23 +05:30
zhtttylz 408dbf318e
HDFS-17014. HttpFS Add Support getStatus API (#5660). Contributed by Hualong Zhang.
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-20 14:37:38 +05:30
NishthaShah 5272ed8670
HADOOP-17518. Update the regex to A-Z (#5669). Contributed by Nishtha Shah.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-20 06:21:13 +05:30
Keyao Li 0914b3e792
HDFS-16697. Add logs if resources are not available in NameNodeResourcePolicy. (#5569). Contributed by ECFuzz.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-19 18:38:01 +05:30
Keyao Li 339bc7b3a6
HDFS-16653. Improve error messages in ShortCircuitCache. (#5568). Contributed by ECFuzz.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-19 07:43:18 +05:30
Xianming Lei 441fb23293
HDFS-17018. Improve dfsclient log format. (#5668). Contributed by Xianming Lei.
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-19 06:25:43 +05:30
Patrick GRANDJEAN 4627242c44
HADOOP-18652. Path.suffix raises NullPointerException (#5653). Contributed by Patrick Grandjean.
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-19 05:16:55 +05:30
LiuGuH f6770dee47
HDFS-16979. RBF: Add proxyuser port in hdfsauditlog (#5552). Contributed by liuguanghua.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Reviewed-by: Simbarashe Dzinamarira <sdzinamarira@linkedin.com>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-19 05:02:16 +05:30
slfan1989 bba663038d
YARN-8898. Fix FederationInterceptor#allocate to set application priority in allocateResponse. (#5645) 2023-05-18 11:57:38 -07:00
Peter Szucs ff8eac517a
YARN-11463. Node Labels root directory creation doesn't have a retry logic - 2nd addendum (#5670) 2023-05-18 14:48:43 +02:00
jianghuazhu 78cc528739
HDFS-17012. Remove unused DFSConfigKeys#DFS_DATANODE_PMEM_CACHE_DIRS_DEFAULT. (#5659). Contributed by JiangHua Zhu.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-18 14:27:19 +05:30
Tsz-Wo Nicholas Sze 350dcaf616
HDFS-17010. Add a subtree test to TestSnapshotDiffReport. (#5656) 2023-05-18 15:53:26 +08:00
wangzhaohui 03163f9de2
HDFS-17011. Fix the metric of "HttpPort" at DataNodeInfo (#5657). Contributed by Zhaohui Wang.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-18 12:12:11 +05:30
slfan1989 5ddaf2e133
YARN-11493. [Federation] ConfiguredRMFailoverProxyProvider Supports Randomly Select an Router. (#5651) 2023-05-17 11:09:10 -07:00
Viraj Jasani 8e17385141
HDFS-17009. RBF: state store putAll should also return failed records (#5664) 2023-05-17 09:33:34 -07:00
liang3zy22 482897a0f6
Fix typos in HDFS documents. (#5665) 2023-05-17 09:28:01 -07:00
Steve Loughran a90c722143
HADOOP-18724. [FOLLOW-UP] cherrypick changes from branch-3.3 backport (#5662)
* move FileContext.copy() onto optLong()
* move FileUtil onto optLong()

This brings trunk into sync with the branch-3.3 changes
2023-05-16 18:16:24 +01:00
Viraj Jasani bef40e9427
HADOOP-18688. S3A audit header to include count of items in delete ops (#5621)
The auditor-generated http referrer URL now includes the count of keys
to delete in the "ks" query parameter

Contributed by Viraj Jasani
2023-05-16 10:40:16 +01:00
Chun Chen 11af08d67a
YARN-11489. Fix memory leak of DelegationTokenRenewer futures in DelegationTokenRenewerPoolTracker. (#5629). Contributed by Chun Chen.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2023-05-14 21:38:04 +08:00
smarthan 251439d769
HDFS-16985. Fix data missing issue when delete local block file. (#5564). Contributed by Chengwei Wang.
Reviewed-by: Shuyan Zhang <zqingchai@gmail.com>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2023-05-14 21:33:38 +08:00
slfan1989 e0938b4c2a
YARN-11495. Fix typos in hadoop-yarn-server-web-proxy. (#5652). Contributed by Shilun Fan.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-13 11:41:38 +05:30
slfan1989 2f87f716fa
YARN-3660. BackPort [GPG] Federation Global Policy Generator (service hook only). (#5625) 2023-05-12 18:12:05 -07:00
Viraj Jasani 5d0cc455f5
HDFS-17008. Fix RBF JDK 11 javadoc warnings (#5648) 2023-05-12 17:27:13 -07:00
zhtttylz 0c77629849
HDFS-17001. Support getStatus API in WebHDFS (#5628). Contributed by Hualong Zhang.
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-13 03:49:13 +05:30
Steve Loughran ad1e3a0f5b
HADOOP-18724. (followup) remove deprecation on optLong/optDouble methods (#5650)
Somehow @Deprecated crept into the declaration of the
new FSBuilder optLong/optDouble methods.
2023-05-12 15:22:37 +01:00
susheel-gupta 0f3406ac34
YARN-11312: [UI2] Refresh buttons don't work after EmberJS upgrade (#5649) 2023-05-12 14:20:26 +02:00
WangYuanben 905bfa84a8
HDFS-16965. Add switch to decide whether to enable native codec. (#5520). Contributed by WangYuanben.
Reviewed-by: Tao Li <tomscut@apache.org>
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-12 04:12:02 +05:30
Steve Loughran e76c09ac3b
HADOOP-18724. Open file fails with NumberFormatException for S3AFileSystem (#5611)
This:

1. Adds optLong, optDouble, mustLong and mustDouble
   methods to the FSBuilder interface to let callers explicitly
   pass in long and double arguments.
2. The opt() and must() builder calls which take float/double values
   now only set long values instead, so as to avoid problems
   related to overloaded methods resulting in a ".0" being appended
   to a long value.
3. All of the relevant opt/must calls in the hadoop codebase move to
   the new methods
4. And the s3a code is resilient to parse errors in its numeric options;
   it will downgrade to the default.

This is nominally incompatible, but the floating-point builder methods
were never used: nothing currently expects floating point numbers.

For anyone who wants to safely set numeric builder options across all compatible
releases, convert the number to a string and then use the opt(String, String)
and must(String, String) methods.
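
A short sketch (not from the commit) of that cross-version approach,
assuming the standard "fs.option.openfile.length" hint; the path and
length parameters are hypothetical:

    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class OpenFileExample {
      // Passing the numeric option as a string works on releases with
      // or without the new optLong() method.
      static FSDataInputStream openWithKnownLength(
          FileSystem fs, Path path, long len) throws Exception {
        return fs.openFile(path)
            .opt("fs.option.openfile.length", Long.toString(len))
            .build()
            .get();
      }
    }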

Contributed by Steve Loughran
2023-05-11 17:57:25 +01:00
Viraj Jasani fe61d8f073
HDFS-16978. RBF: Admin command to support bulk add of mount points (#5554). Contributed by Viraj Jasani.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Reviewed-by: Simbarashe Dzinamarira <sdzinamarira@linkedin.com>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-11 08:45:34 +05:30
zhtttylz 5084e881ef
HDFS-16990. HttpFS Add Support getFileLinkStatus API (#5602). Contributed by Hualong Zhang.
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-10 23:10:19 +05:30
slfan1989 690db3c34b
YARN-11479. [Federation] ZookeeperFederationStateStore Support Store ApplicationSubmitData. (#5631) 2023-05-10 09:37:47 -07:00
cxzl25 be50d221f5
YARN-11467. RM failover may fail when the nodes.exclude-path file does not exist (#5565) 2023-05-10 15:16:33 +08:00
slfan1989 d95b5c679d
YARN-11424. [Federation] Router Supports DeregisterSubCluster. (#5363) 2023-05-09 16:17:23 -07:00
slfan1989 a2dda0ce03
HADOOP-18359. Update commons-cli from 1.2 to 1.5. (#5095). Contributed by Shilun Fan.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-10 01:42:12 +05:30
zhangshuyan 03bf8f982a
HDFS-16999. Fix wrong use of processFirstBlockReport(). (#5622). Contributed by Shuyan Zhang.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2023-05-09 10:05:28 +08:00
slfan1989 bdeca45294
YARN-11340. [Federation] Improve SQLFederationStateStore DataSource Config. (#5403) 2023-05-08 10:13:09 -07:00
Gautham B A a80e3dba3b
HADOOP-18734. Create qbt.sh symlink on Windows (#5626) 2023-05-08 09:55:15 -07:00
ZanderXu 4ee92efb73
HDFS-16865. The source path is always / after RBF proxied the complete, addBlock and getAdditionalDatanode RPC. (#5200). Contributed by ZanderXu.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-06 08:50:21 +05:30
slfan1989 cda9863d54
YARN-11477. [Federation] MemoryFederationStateStore Support Store ApplicationSubmitData. (#5616) 2023-05-05 13:43:40 -07:00
slfan1989 eab4c33d09
YARN-11470. FederationStateStoreFacade Cache Support Guava Cache. (#5609) 2023-05-05 13:41:38 -07:00
Gautham B A c974710d8e
HADOOP-18729. Fix mvnsite on Windows 10 (#5618) 2023-05-05 13:08:58 -07:00
Chris 372631c566
HADOOP-18706. Improve S3ABlockOutputStream recovery (#5563)
Contributed by Chris Bevard
2023-05-05 11:57:42 +01:00
Szilard Nemeth c7699d3dcd YARN-11079. Make an AbstractParentQueue to store common ParentQueue and ManagedParentQueue functionality. Contributed by Susheel Gupta 2023-05-04 22:16:18 -04:00
Viraj Jasani ceb8878d4f
HDFS-16998. RBF: Add ops metrics for getSlowDatanodeReport in RouterClientActivity (#5615) 2023-05-04 09:45:40 -07:00
Dongjoon Hyun 27776ac45e
HADOOP-18727. Fix WriteOperations.listMultipartUploads function description (#5613)
Contributed by Dongjoon Hyun
2023-05-04 13:03:48 +01:00
Peter Szucs bd607951c0
YARN-11463. Node Labels root directory creation doesn't have a retry logic - addendum (#5614) 2023-05-04 12:27:25 +02:00
Gautham B A 0d06fd77de
HADOOP-18134. Setup Jenkins nightly CI for Windows 10 (#5062)
This PR gets Yetus to run on Windows 10
against the Hadoop codebase. It introduces
the following changes to allow us to setup
the nightly CI on Jenkins for Hadoop on
Windows 10.
* Hadoop personality changes for Yetus.
  Additional arguments have been passed,
  which are necessary to build and test
  Hadoop on Windows 10.
* Docker image for building Hadoop on
  Windows 10.
  Installs the necessary tools that are
  needed to run Yetus.
* dev-support/jenkins.sh file.
  Passing of some flags, which are needed
  for the nightly CI, is handled here.
2023-05-03 22:44:54 +05:30
slfan1989 476f60a806
YARN-10144. Federation: Add missing FederationClientInterceptor APIs. (#5587) 2023-05-03 09:21:56 -07:00
slfan1989 c1d10f3872
YARN-9049. Add application submit data to state store. (#5606) 2023-05-03 09:19:54 -07:00
Tak Lon (Stephen) Wu 0e46388474
HADOOP-18671. Add recoverLease(), setSafeMode(), isFileClosed() as interfaces to hadoop-common (#5553)
The HDFS lease APIs have been replicated as interfaces in hadoop-common so other filesystems can
also implement them.  Applications which use the leasing APIs should migrate to the new
interface where possible.
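
A hedged usage sketch: assuming the new interface is
org.apache.hadoop.fs.LeaseRecoverable with a recoverLease(Path)
method, an application can probe for it instead of casting to
DistributedFileSystem:

    import java.io.IOException;

    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.LeaseRecoverable;
    import org.apache.hadoop.fs.Path;

    public class LeaseRecoveryExample {
      // Returns true if the lease was recovered; false if this
      // filesystem does not implement lease recovery at all.
      static boolean tryRecoverLease(FileSystem fs, Path path)
          throws IOException {
        if (fs instanceof LeaseRecoverable) {
          return ((LeaseRecoverable) fs).recoverLease(path);
        }
        return false;
      }
    }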

Contributed by Stephen Wu
2023-05-03 11:05:55 +01:00
slfan1989 668c0a0930
YARN-11379. [Federation] Support mapAttributesToNodes, getGroupsForUser API's for Federation. (#5596) 2023-05-02 13:46:06 -07:00
zhangshuyan fddc9769a5
HADOOP-18726. Set the locale to avoid printing useless logs. (#5612). Contributed by Shuyan Zhang.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2023-05-03 00:09:36 +08:00
Viraj Jasani bfcf5dd03b
HADOOP-18697. S3A prefetch: failure of ITestS3APrefetchingInputStream#testRandomReadLargeFile (#5580)
Contributed by Viraj Jasani
2023-05-02 15:21:46 +01:00
Szilard Nemeth 73ca64a3ba
YARN-11450. Improvements for TestYarnConfigurationFields and TestConfigurationFieldsBase (#5455) 2023-05-02 15:52:57 +02:00
slfan1989 87e17b2713
YARN-11437. [hotfix][Federation] SQLFederationStateStore Support Version. (#5598) 2023-05-01 16:01:50 -07:00
Gautham B A 5147106b59
HADOOP-18725. Avoid cross-platform build for irrelevant Dockerfile changes (#5610) 2023-05-01 09:35:50 -07:00
SevenAddSix 1079890ae3
HDFS-16707. RBF: Expose RouterRpcFairnessPolicyController related request record metrics for each nameservice to Prometheus (#4665). Contributed by Jiale Qi.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-05-01 21:22:26 +05:30
Pralabh Kumar d75c6d9d57
HADOOP-18715. Add debug log for getting details of tokenKindMap (#5608). Contributed by Pralabh Kumar.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-04-29 17:28:49 +05:30
fanluoo 1a2cd965a7
HDFS-16897. Fix abundant Broken pipe exception in BlockSender (#5329). Contributed by fanluo.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-04-29 17:24:56 +05:30
Hexiaoqiao 70c0aa342e
YARN-11482. Fix bug of DRF comparison in DominantResourceFairnessComparator2 in fair scheduler. (#5607). Contributed by Xiaoqiao He.
Reviewed-by: Shilun Fan <slfan1989@apache.org>
2023-04-29 11:18:42 +08:00
slfan1989 5ed7e912dc
YARN-11469. Refactor FederationStateStoreFacade Cache Code. (#5570)
Co-authored-by: slfan1989 <louj1988@@>
2023-04-28 14:11:13 -07:00
wangzhaohui 0e63152218
HDFS-16995. Remove unused parameters at NameNodeHttpServer#initWebHdfs (#5601). Contributed by Zhaohui Wang.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-04-28 05:25:01 +05:30
Tsz-Wo Nicholas Sze d9576bb9ee
HDFS-16972. Delete a snapshot may deleteCurrentFile. (#5532) 2023-04-27 09:17:47 -07:00
Riya Khandelwal 60a7e8acaa
YARN-11459 Changed label called "max resource" on UIv1 and UIv2 (#5527) 2023-04-27 15:25:25 +02:00
yl09099 245fde17d7
YARN-11474. The yarn queue list is displayed on the CLI (#5577)
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2023-04-27 18:05:48 +08:00
Steve Loughran eb749ddd4d
HADOOP-18695. S3A: reject multipart copy requests when disabled (#5548)
Contributed by Steve Loughran.
2023-04-27 10:59:46 +01:00
slfan1989 55eebcf277
YARN-11378. [Federation] Support checkForDecommissioningNodes、refreshClusterMaxPriority API's for Federation. (#5551) 2023-04-25 14:12:38 -07:00
slfan1989 a716459cdf
YARN-11437. [Federation] SQLFederationStateStore Support Version. (#5589) 2023-04-25 14:09:45 -07:00
cxzl25 5af0845076
HDFS-16672. Fix lease interval comparison in BlockReportLeaseManager (#4598). Contributed by dzcxzl.
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-04-26 02:11:11 +05:30
Steve Loughran b6b9bd67bb
MAPREDUCE-7437. MR Fetcher class to use an AtomicInteger to generate IDs. (#5579)
...as until now it wasn't thread safe

Contributed by Steve Loughran
2023-04-25 19:53:40 +01:00
Sebastian Baunsgaard 6aac6cb212
HADOOP-18660. Filesystem Spelling Mistake (#5475). Contributed by Sebastian Baunsgaard.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-04-25 21:44:04 +05:30
cxzl25 2f66f0b83a
HADOOP-18694. Client.Connection#updateAddress needs to ensure that address is resolved before updating (#5542). Contributed by dzcxzl.
Reviewed-by: Steve Vaughan <email@stevevaughan.me>
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-04-25 03:52:49 +05:30
zhtttylz c9e0af9961
HDFS-16981. Support getFileLinkStatus API in WebHDFS (#5572). Contributed by Hualong Zhang.
Reviewed-by: Simbarashe Dzinamarira <sdzinamarira@linkedin.com>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-04-25 03:30:56 +05:30
Tsz-Wo Nicholas Sze dc78849f27
HDFS-16975. FileWithSnapshotFeature.isCurrentFileDeleted is not reloaded from FSImage. (#5546) 2023-04-24 09:04:28 -07:00
Tamas Domok 05e6dc19ea
HADOOP-18705. ABFS should exclude incompatible credential providers. (#5560)
Contributed by Tamas Domok.
2023-04-24 15:46:40 +01:00
zhangshuyan 6a23c376c9
HDFS-16986. EC: Fix locationBudget in getListing(). (#5582). Contributed by Shuyan Zhang.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2023-04-24 18:53:25 +08:00
Doroszlai, Attila 5b23224970
HADOOP-18714. Wrong StringUtils.join() called in AbstractContractRootDirectoryTest (#5578) 2023-04-24 09:17:12 +02:00
wangzhaohui 51dcbd1d61
HDFS-16988. Improve NameServices info at JournalNode web UI (#5584). Contributed by Zhaohui Wang.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-04-24 01:13:02 +05:30
PJ Fanning b683769fc9
HADOOP-18712. Upgrade to jetty 9.4.51 due to cve (#5574). Contributed by PJ Fanning.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-04-24 01:01:51 +05:30
dependabot[bot] 3b7783c549
HADOOP-18689. Bump jettison from 1.5.3 to 1.5.4 in /hadoop-project (#5502)
Co-authored-by: Ayush Saxena <ayushsaxena@apache.org>
2023-04-22 16:19:21 +05:30
PJ Fanning ad49ddda0e
HADOOP-18711. upgrade nimbus jwt jar due to issues in its embedded shaded json-smart code. (#5573). Contributed by PJ Fanning.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-04-22 14:01:09 +05:30
LiuGuH 742e07d9c3
HADOOP-18710. Add RPC metrics for response time (#5545). Contributed by liuguanghua.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-04-22 01:06:08 +05:30
Ashutosh Gupta 964c1902c8
YARN-11463. Node Labels root directory creation doesn't have a retry logic (#5562)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
2023-04-21 14:03:22 +02:00
Christos Bisias 9e24ed2196
HADOOP-18691. Add a CallerContext getter on the Schedulable interface (#5540) 2023-04-20 10:11:25 -07:00
PJ Fanning 0918c87fa2
HADOOP-18687. Remove json-smart dependency. (#5549). Contributed by PJ Fanning.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-04-20 18:28:09 +05:30
Nikita Eshkeev d07356e60e
HADOOP-18597. Simplify single node instructions for creating directories for Map Reduce. (#5305)
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-04-20 16:12:44 +05:30
Neil 1ff7a65b9f
HDFS-16954. RBF: The operation of renaming a multi-subcluster directory to a single-cluster directory should throw IOException. (#5483). Contributed by Max Xie.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-04-20 15:19:18 +05:30
Ayush Saxena 9e3d5c754b
Revert "HADOOP-18687. Remove json-smart dependency. (#5549). Contributed by PJ Fanning."
This reverts commit b6c0ec796e.
2023-04-20 10:26:08 +05:30
PJ Fanning b6c0ec796e
HADOOP-18687. Remove json-smart dependency. (#5549). Contributed by PJ Fanning.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-04-20 00:47:22 +05:30
rdingankar 5119d0c72f
HDFS-16982 Use the right Quantiles Array for Inverse Quantiles snapshot (#5556) 2023-04-18 10:47:37 -07:00
slfan1989 a258f1f235
YARN-11326. [Federation] Add RM FederationStateStoreService Metrics. (#4963) 2023-04-18 09:13:08 -07:00
slfan1989 635521db4c
YARN-11438. [Federation] ZookeeperFederationStateStore Support Version. (#5537) 2023-04-18 09:05:52 -07:00
Viraj Jasani 0e3aafe6c0
HADOOP-18399. S3A Prefetch - SingleFilePerBlockCache to use LocalDirAllocator (#5054)
Contributed by Viraj Jasani
2023-04-18 16:37:48 +01:00
Steve Loughran 405ed1dde6
HADOOP-18470. Hadoop 3.3.5 release wrap-up (#5558)
Post-release updates of the branches

* Add jdiff xml files from 3.3.5 release.
* Declare 3.3.5 as the latest stable release.
* Copy release notes.
2023-04-18 10:12:07 +01:00
Steve Loughran 6ea10cf41b
HADOOP-18696. ITestS3ABucketExistence arn test failures. (#5557)
Explicitly sets the fs.s3a.endpoint.region to eu-west-1 so
the ARN-referenced fs creation fails with unknown store
rather than IllegalArgumentException.

Steve Loughran
2023-04-17 10:18:33 +01:00
yl09099 2c4d6bf33d
YARN-11465. Improved YarnClient Log Format (#5550)
Co-authored-by: yl09099 <shaq376260428@163.com>
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2023-04-17 09:27:52 +08:00
Dongjoon Hyun 0d1b4a3556
HADOOP-18590. Publish SBOM artifacts (#5555). Contributed by Dongjoon Hyun. 2023-04-15 21:35:43 +05:30
slfan1989 0bcdea7912
YARN-11239. Optimize FederationClientInterceptor audit log. (#5127) 2023-04-14 13:09:18 -07:00
zhangshuyan 0185afafea
HDFS-16974. Consider volumes average load of each DataNode when choosing target. (#5541). Contributed by Shuyan Zhang.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2023-04-14 10:33:30 +08:00
dependabot[bot] f1936d29f1
HADOOP-18693. Bump derby from 10.10.2.0 to 10.14.2.0 in /hadoop-project (#5427)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-04-13 10:25:17 -07:00
Melissa You 2b60d0c1f4
HDFS-16971. Add read metrics for remote reads in FileSystem Statistics #5534 (#5536) 2023-04-13 09:07:42 -07:00
slfan1989 06f9bdffa6
YARN-10846. Add dispatcher metrics to NM. (#4687) 2023-04-12 09:53:20 -07:00
slfan1989 dd6d0ac510
YARN-11462. Fix Typo of hadoop-yarn-common. (#5539)
Co-authored-by: Shilun Fan <slfan1989@apache.org>
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2023-04-12 11:08:23 +08:00
Steve Loughran 7c3d94a032
HADOOP-18637. S3A to support upload of files greater than 2 GB using DiskBlocks (#5543)
Contributed By: HarshitGupta and Steve Loughran
2023-04-12 05:17:45 +05:30
slfan1989 bffa49a64f
YARN-11377. [Federation] Support addToClusterNodeLabels、removeFromClusterNodeLabels、replaceLabelsOnNode API's for Federation. (#5525) 2023-04-11 09:47:58 -07:00
Sadanand Shenoy 74ddf69f80
HDFS-16911. Distcp with snapshot diff to support Ozone filesystem. (#5364) 2023-04-10 14:03:16 -07:00
rdingankar 3e2ae1da00
HDFS-16949 Introduce inverse quantiles for metrics where higher numer… (#5495) 2023-04-10 08:56:00 -07:00
mjwiq e45451f9c7
HADOOP-18687. hadoop-auth: remove unnecessary dependency on json-smart (#5524)
Contributed by Michiel de Jong
2023-04-06 16:00:33 +01:00
zhtttylz 523ff81624
HDFS-16952. Support getLinkTarget API in WebHDFS (#5517)
Co-authored-by: Zhtttylz <hualong.z@hotmail.com>
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2023-04-06 19:44:47 +08:00
Simbarashe Dzinamarira 47c22e388e
HDFS-16943. RBF: Implements MySQL based StateStoreDriver. (#5469) 2023-04-05 16:44:29 -07:00
Viraj Jasani 422bf3b24c
HDFS-16973. RBF: MountTableResolver cache size lookup should take read lock (#5533) 2023-04-05 14:06:38 -07:00
slfan1989 69b90b5698
YARN-11436. [Federation] MemoryFederationStateStore Support Version. (#5518) 2023-04-05 10:35:24 -07:00
HarshitGupta11 dfb2ca0a64
HADOOP-18684. S3A filesystem to support binding to other URI schemes (#5521)
Contributed by Harshit Gupta
2023-04-05 12:42:11 +01:00
Viraj Jasani 937caf7de9
HDFS-16967. RBF: File based state stores should allow concurrent access to the records (#5523)
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Reviewed-by: Simbarashe Dzinamarira <sdzinamarira@linkedin.com>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2023-04-04 22:39:53 +09:00
Chris Nauroth 14c5810d5e HADOOP-18680: Insufficient heap during full test runs in Docker container.
Closes #5522

Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-04-03 22:53:29 +00:00
zhtttylz 811441d5bc
HDFS-16951. Add description of GETSERVERDEFAULTS to WebHDFS doc (#5491)
Co-authored-by: Zhtttylz <hualong.z@hotmail.com>
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2023-04-01 18:18:20 +08:00
slfan1989 eb1d3ebe2f
YARN-11442. Refactor FederationInterceptorREST Code. (#5420) 2023-03-31 15:29:18 -07:00
sreeb-msft 389b3ea6e3
HADOOP-18012. ABFS: Enable config controlled ETag check for Rename idempotency (#5488)
To support recovery of network failures during rename, the abfs client
fetches the etag of the source file, and when recovering from a
failure, uses this tag to determine whether the rename succeeded
before the failure happened.

* This works for files, but not directories
* It adds the overhead of a HEAD request before each rename.
* The option can be disabled by setting "fs.azure.enable.rename.resilience"
  to false (see the sketch after this list).
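
A minimal sketch, assuming only that the configuration key is the one
named above; not part of the original change:

    import org.apache.hadoop.conf.Configuration;

    public class RenameResilienceExample {
      public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Opt out of etag-based rename recovery to avoid the extra
        // HEAD request before each rename.
        conf.setBoolean("fs.azure.enable.rename.resilience", false);
      }
    }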

Contributed by Sree Bhattacharyya
2023-03-31 19:15:15 +01:00
Galsza 016362a28b
HADOOP-18548. Hadoop Archive tool (HAR) should acquire delegation tokens from source and destination file systems (#5355)
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
2023-03-30 07:12:02 +08:00
Viraj Jasani b4bcbb9515
HDFS-16959. RBF: State store cache loading metrics (#5497) 2023-03-29 10:43:13 -07:00
slfan1989 5bc8f25327
YARN-11446. [Federation] Add updateSchedulerConfiguration, getSchedulerConfiguration REST APIs for Router. (#5476) 2023-03-28 09:33:19 -07:00
slfan1989 aa602381c5
YARN-11426. Improve YARN NodeLabel Memory Display. (#5335)
Co-authored-by: slfan1989 <louj1988@@>
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Reviewed-by: Chris Nauroth <cnauroth@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2023-03-28 22:48:46 +08:00
zhangshuyan 700147b4ac
HDFS-16964. Improve processing of excess redundancy after failover. (#5510). Contributed by Shuyan Zhang.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2023-03-28 16:14:59 +08:00
Jinhu Wu b5e8269d9b
HADOOP-18458: AliyunOSSBlockOutputStream to support heap/off-heap buffer before uploading data to OSS (#4912) 2023-03-28 14:27:01 +08:00
slfan1989 926993cb73
YARN-11376. [Federation] Support updateNodeResource、refreshNodesResources API's for Federation. (#5496) 2023-03-27 09:27:21 -07:00
Anmol Asrani 762d3ddb43
HADOOP-18146: ABFS: Added changes for expect hundred continue header (#4039)
This change lets the client react pre-emptively to server load, without first
hitting a 503 response and the exponential backoff which follows. This stops
performance suffering so much as an account's capacity limits are approached.

Contributed by Anmol Asrani
2023-03-27 12:43:34 +01:00
Andras Katona ee01c64c6c
HADOOP-18676. jettison dependency override in hadoop-common lib (#5513) 2023-03-27 09:59:02 +02:00
Ayush Saxena b82bcbd8ad
Revert "HADOOP-18676. Fixing jettison vulnerability of hadoop-common lib (#5507)"
This reverts commit 72b0122706.
2023-03-25 12:04:28 +05:30
Andras Katona 72b0122706
HADOOP-18676. Fixing jettison vulnerability of hadoop-common lib (#5507)
* HADOOP-18587. Fixing jettison vulnerability of hadoop-common lib

* no need for excluding, let it come

Change-Id: Ia6e4ad351158dd4b0510dec34bbde531a60e7654
2023-03-24 16:31:45 +01:00
Tamas Domok 69748aae32
YARN-11461. Fix NPE in determineMissingParents (auto queue creation / CS). (#5506)
Change-Id: Iaaaf43a545588eaff8a0a20f6f3c27258a45f390
2023-03-24 09:38:53 +01:00
Kidd5368 5cf62d1498
HDFS-16948. Update log of BlockManager#chooseExcessRedundancyStriped when EC internal block is moved by balancer. (#5474). Contributed by Kidd53685368.
Reviewed-by: zhangshuyan <zqingchai@gmail.com>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2023-03-23 17:00:23 +08:00
zhaixiaojuan@loongson.cn 028cde0006
HADOOP-18644. Add bswap support for LoongArch64. (#5453). Contributed by zhaixiaojuan.
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
2023-03-23 11:08:59 +08:00
Ayush Saxena e3cb9573e1
HADOOP-18662. ListFiles with recursive fails with FNF. (#5477). Contributed by Ayush Saxena.
Reviewed-by: Steve Loughran <stevel@apache.org>
2023-03-23 08:30:08 +05:30
Yubi Lee 67e02a92e0
HADOOP-18666. A whitelist of endpoints to skip Kerberos authentication doesn't work for ResourceManager and Job History Server (#5480) 2023-03-22 10:54:41 +09:00
Viraj Jasani 0dbe1d3284
HADOOP-18668. Path capability probe for truncate is only honored by RawLocalFileSystem (#5492) 2023-03-21 10:23:16 +08:00
Viraj Jasani 9a8287c36f
HADOOP-18669. Remove Log4Json Layout (#5493) 2023-03-21 10:07:06 +08:00
Viraj Jasani f8d0949f7d
HDFS-16953. RBF: Mount table store APIs should update cache only if state store record is successfully updated (#5482) 2023-03-18 14:43:25 -07:00
Viraj Jasani b6a9d7b442
HADOOP-18631. (ADDENDUM) Use LogCapturer to match audit log pattern and remove hdfs async audit log configs (#5451) 2023-03-18 06:33:50 +08:00
slfan1989 fa723ae839
YARN-11445. [Federation] Add getClusterInfo, getClusterUserInfo REST APIs for Router. (#5472) 2023-03-17 11:59:45 -07:00
Pranav Saxena 759ddebb13
HADOOP-18647. x-ms-client-request-id to identify the retry of an API. (#5437)
The x-ms-client-request-id now includes a field to indicate a call is a retry of a previous
operation

Contributed by Pranav Saxena
2023-03-15 20:03:22 +00:00
Masatake Iwasaki 7c42d0f7da
HADOOP-17746. Compatibility table in directory_markers.md doesn't render right. (#3116)
Contributed by Masatake Iwasaki
2023-03-15 17:10:42 +00:00
Viraj Jasani 15935fa865
HDFS-16947. RBF NamenodeHeartbeatService to report error for not being able to register namenode in state store (#5470) 2023-03-16 00:59:55 +08:00
Viraj Jasani cf4a678ce9
HADOOP-18649. CLA and CRLA appenders to be replaced with RFA (#5448) 2023-03-16 00:46:17 +08:00
Viraj Jasani 405bfa2800
HADOOP-18654. Remove unused custom appender TaskLogAppender (#5457) 2023-03-16 00:45:37 +08:00
Stephen O'Donnell eee2ea075d
HDFS-16942. Addendum. Send error to datanode if FBR is rejected due to bad lease (#5478). Contributed by Stephen O'Donnell. 2023-03-15 10:03:00 +05:30
Viraj Jasani aff840c59c
HADOOP-18653. LogLevel servlet to determine log impl before using setLevel (#5456)
The log level can only be set on Log4J log implementations;
probes are used to downgrade to a warning when other
logging back ends are used

Contributed by Viraj Jasani
2023-03-13 12:30:12 +00:00
Steve Loughran 09469bf47d
HADOOP-18661. Fix bin/hadoop usage script terminology. (#5473)
Followup to HADOOP-13209: s/slaves/workers/ in
the usage message you get when you type "bin/hadoop"

Contributed by Steve Loughran
2023-03-13 12:24:36 +00:00
PJ Fanning 476340c699
HADOOP-18658. snakeyaml dependency: upgrade to v2.0 (#5467). Contributed by PJ Fanning.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-03-13 10:08:04 +05:30
Stephen O'Donnell ca6f5afb6d
HDFS-16942. Send error to datanode if FBR is rejected due to bad lease (#5460) 2023-03-11 16:40:07 +00:00
nao 734f7abfb8
HADOOP-18646. Upgrade Netty to 4.1.89.Final to fix CVE-2022-41881 (#5435)
This fixes CVE-2022-41881.

This also upgrades io.opencensus dependencies to 0.12.3
 
Contributed by Aleksandr Nikolaev
2023-03-10 15:27:22 +00:00
slfan1989 b406060c6b
YARN-8972. [Router] Add support to prevent DoS attack over ApplicationSubmissionContext size. (#5382) 2023-03-08 13:29:30 -08:00
rohit-kb 487368c4b9
HADOOP-18655. Upgrade kerby to 2.0.3 due to CVE-2023-25613 (#5458)
Upgrade kerby to 2.0.3 due to the CVE https://nvd.nist.gov/vuln/detail/CVE-2023-25613


Contributed by Rohit Kumar Badeau
2023-03-08 15:31:03 +00:00
Pranav Saxena 358bf80c94
HADOOP-18606. ABFS: Add reason in x-ms-client-request-id on a retried API call. (#5299)
Contributed by Pranav Saxena
2023-03-07 17:02:13 +00:00
slfan1989 927401886a
HDFS-16934. TestDFSAdmin.testAllDatanodesReconfig regression (#5434)
Contributed by Shilun Fan
2023-03-06 15:26:53 +00:00
zhangshuyan 2cb0c35fc1
HDFS-16939. Fix the thread safety bug in LowRedundancyBlocks. (#5450). Contributed by Shuyan Zhang.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2023-03-06 20:10:31 +08:00
Varun Saxena 2a0dc2ab2f YARN-11383. Workflow priority mappings is case sensitive (#5171)
Contributed by Aparajita Choudhary
2023-03-05 21:25:16 +05:30
ZanderXu 6bd2444815
HDFS-16923. [SBN read] getlisting RPC to observer will throw NPE if path does not exist (#5400)
Signed-off-by: Erik Krogen <xkrogen@apache.org>
2023-03-01 16:18:38 -08:00
Viraj Jasani e1ca466bdb
HADOOP-18648. Avoid loading kms log4j properties dynamically by KMSWebServer (#5441) 2023-03-02 08:02:07 +08:00
Tom 162288bc0a
HDFS-16896 clear ignoredNodes list when we clear deadnode list on ref… (#5322)
HDFS-16896 clear ignoredNodes list when we clear deadnode list on refetchLocations. 
ignoredNodes list is only used on hedged read codepath

Co-authored-by: Tom McCormick <tmccormi@linkedin.com>
2023-03-01 19:47:04 +00:00
Viraj Jasani 2ab7eb4caa
HDFS-16935. Fix TestFsDatasetImpl#testReportBadBlocks (#5432)
Contributed by Viraj Jasani
2023-03-01 18:53:10 +00:00
Szilard Nemeth 8f6be3678d MAPREDUCE-7434. Fix ShuffleHandler tests. Contributed by Tamas Domok 2023-03-01 16:10:05 +01:00
Viraj Jasani 28d2753d2f
HADOOP-18645. Provide keytab file key name with ServiceStateException (#5433)
Signed-off-by: Tao Li <tomscut@apache.org>
2023-03-01 09:34:12 +08:00
slfan1989 bcc51ce2c5
YARN-11375. [Federation] Support refreshAdminAcls、refreshServiceAcls API's for Federation. (#5312) 2023-02-28 14:44:00 -08:00
Steve Loughran dcd9dc6983
HADOOP-18641. Cloud connector dependency and LICENSE fixup. (#5429)
POM and LICENSE fixup of transient dependencies
* Exclude hadoop-cloud-storage imports which come in with hadoop-common
* Add explicit import of hadoop's org.codehaus.jettison declaration
  to hadoop-aliyun
* Tune aliyun jars imports
* Update LICENSE-binary for the current set of libraries.

Contributed by Steve Loughran
2023-02-28 10:48:54 +00:00
rdingankar 0ca5686034
HDFS-16917 Add transfer rate quantile metrics for DataNode reads (#5397)
Co-authored-by: Ravindra Dingankar <rdingankar@linkedin.com>
2023-02-27 18:26:32 +00:00
Simbarashe Dzinamarira 61f369c43e
HDFS-16890: RBF: Ensures router periodically refreshes its record of a namespace's state. (#5298) 2023-02-27 17:56:24 +00:00
slfan1989 8798b94ee1
YARN-11221. [Federation] Add replaceLabelsOnNodes, replaceLabelsOnNode REST APIs for Router. (#5302) 2023-02-27 09:34:39 -08:00
Viraj Jasani a90238c0b8
HADOOP-18631. Migrate Async appenders to log4j properties (#5418) 2023-02-26 01:47:44 +08:00
slfan1989 25ebd0b8b1
YARN-11222. [Federation] Add addToClusterNodeLabels, removeFromClusterNodeLabels REST APIs for Router. (#5328) 2023-02-24 10:52:57 -08:00
slfan1989 27a54955f9
YARN-5604. [Federation] Add versioning for FederationStateStore. (#5394) 2023-02-24 10:51:19 -08:00
Steve Loughran 4067facae6
HADOOP-18470. Remove HDFS RBF text in the 3.3.5 index.md file
+ add a link to Mukund's ApacheCon talk

Change-Id: I3d04b385ff1312aabf2a81d034f54f124d544a54
2023-02-23 13:23:35 +00:00
Steve Loughran e2d7919dc1
Revert "HADOOP-18590. Publish SBOM artifacts (#5281)"
Causes HADOOP-18641: cyclonedx maven plugin breaks on recent maven releases.

This reverts commit 6f99558c2e.
2023-02-23 11:23:53 +00:00
Owen O'Malley 8025a60ae7
HDFS-16901: Minor fix for unit test.
Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
2023-02-22 16:37:49 -08:00
Simbarashe Dzinamarira 4cc33e5e37
HDFS-16901: RBF: Propagates real user's username via the caller context, when a proxy user is being used. (#5346) 2023-02-22 21:58:44 +00:00
slfan1989 2e997d818d
YARN-11370. [Federation] Refactor MemoryFederationStateStore code. (#5126) 2023-02-22 12:37:35 -08:00
Ayush Saxena e8a6b2c2c4
HADOOP-18582. Addendum: Skip unnecessary cleanup logic in DistCp. (#5409)
Followup to the original HADOOP-18582.

Temporary path cleanup is re-enabled for -append jobs
as these will create temporary files when creating or overwriting files.

Contributed by Ayush Saxena
2023-02-22 19:29:41 +00:00
hchaverr fb31393b65
HADOOP-18535. Implement token storage solution based on MySQL
Fixes #1240

Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
2023-02-22 10:38:50 -08:00
Steve Loughran 11a220c6e7
HADOOP-18636 LocalDirAllocator cannot recover from directory tree deletion (#5412)
Even though DiskChecker.mkdirsWithExistsCheck() will create the directory tree,
it is only called *after* the enumeration of directories with available
space has completed.

Directories which don't exist are reported as having 0 space; therefore
the mkdirs code is never reached.

Adding a simple mkdirs() (without bothering to check the outcome)
ensures that if a dir has been deleted then it will be reconstructed
if possible. If it can't be, it will still have 0 bytes of space
reported and so be excluded from the allocation.
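
An illustrative sketch of that defensive pattern (not the actual
LocalDirAllocator source):

    import java.io.File;

    public class DirSpaceCheck {
      // Recreate the directory (best effort, outcome deliberately
      // ignored) before measuring free space, so a deleted tree is
      // rebuilt instead of being reported as 0 bytes and skipped.
      static long usableSpace(File dir) {
        dir.mkdirs();
        return dir.getUsableSpace();
      }
    }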

Contributed by Steve Loughran
2023-02-22 11:48:12 +00:00
susheel-gupta 49b8ac19f2
YARN-11408. Add a check of autoQueueCreation is disabled for emitDefaultUserLimitFactor method (#5278)
* added a check of autoQueueCreationV2Disabled

Change-Id: If1e36c5969d270c1b81a4bbd2e883fa819c81f20

* added check of AutoCreateChildQueueDisabled

Change-Id: Ia011b4393ccd8d4d419a2e46b06a5237d050851c

* removed auto-create-child-queue-enabled check and emit

Change-Id: I7a154124519ecbd81379b46a238707c16db1e82a
2023-02-22 09:46:42 +01:00
Ayush Saxena fe5bb49ad9
Revert "YARN-11404. Add junit5 dependency to hadoop-mapreduce-client-app to fix few unit test failure. Contributed by Susheel Gupta"
This reverts commit 8eda456d37.
2023-02-22 07:28:13 +05:30
slfan1989 4e6e2f318c
YARN-11394. Fix hadoop-yarn-server-resourcemanager module Java Doc Errors. (#5288)
Contributed by Shilun Fan
2023-02-21 14:39:32 +00:00
nao acf82d4d55
HADOOP-18622. Upgrade ant to 1.10.13 (#5360). Contributed by Aleksandr Nikolaev.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-02-21 16:48:49 +05:30
Viraj Jasani 88914cada0
HDFS-16925. Namenode audit log to only include IP address of client (#5407)
Reviewed-by: Takanobu Asanuma <tasanuma@apache.org>
Signed-off-by: Tao Li <tomscut@apache.org>
2023-02-21 08:26:32 +08:00
zhtttylz a3b500d046
HDFS-16916. Improve the use of JUnit Test in DFSClient (#5404)
Co-authored-by: Zhtttylz <hualong.z@hotmail.com>
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2023-02-18 13:17:37 +08:00
slfan1989 7e486038ea
YARN-11439. Fix Typo of hadoop-yarn-ui README.md. (#5405)
Co-authored-by: Shilun Fan <slfan1989@apache.org>
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2023-02-18 13:14:01 +08:00
Arnout Engelen 02fd87a4d8
HADOOP-18627. Add stronger wording in 'secure mode' introduction (#5406)
Make it clearer that 'secure mode' is generally not optional when deploying Hadoop.

Contributed by Arnout Engelen
2023-02-17 16:30:41 +00:00
Steve Loughran 10e7ca481c
YARN-11441. Revert YARN-10495.
This reverts commit 7d3c8ef606.
2023-02-17 15:05:06 +00:00
Bryan Beaudreault 7e19bc31b6
HADOOP-18215. Enhance WritableName to be able to return aliases for classes that use serializers (#4215) 2023-02-16 18:13:25 +00:00
Mehakmeet Singh 7a0903b743
HADOOP-18633. fix test AbstractContractDistCpTest#testDistCpUpdateCheckFileSkip (#5401)
Contributed by: Mehakmeet Singh
2023-02-16 10:09:06 +05:30
hfutatzhanghb 723535b788
HDFS-16914. Add some logs for updateBlockForPipeline RPC. (#5381)
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Tao Li <tomscut@apache.org>
2023-02-16 09:25:50 +08:00
slfan1989 a5f48eacca
YARN-11425. [Federation] Router Supports SubClusterCleaner. (#5326) 2023-02-15 14:40:34 -08:00
slfan1989 c3706597a3
YARN-11349. [Federation] Router Support DelegationToken With SQL. (#5244) 2023-02-15 14:38:41 -08:00
Ankit Saurabh f4f2793f3b
HADOOP-18351. Reduce excess logging of errors during S3A prefetching reads (#5274)
Contributed by Ankit Saurabh
2023-02-15 18:28:42 +00:00
Zita Dombi 4cbe19f3a2
HDFS-16761. Namenode UI for Datanodes page not loading if any data node is down (#5390) 2023-02-15 16:16:04 +00:00
Steve Loughran d56977e909
HADOOP-18470. More in the 3.3.5 index.html about security (#5383)
Expands on the comments in cluster config to tell people
they shouldn't be running a cluster without a private VLAN
in cloud, that Knox is good here, and that unsecured clusters
without a VLAN are just computation-as-a-service to crypto miners

Contributed by Steve Loughran
2023-02-14 17:22:59 +00:00
SimhadriGovindappa e2ab35084a
HADOOP-18630. Add gh-pages in asf.yaml to deploy the current trunk doc (#5393). Contributed by Simhadri Govindappa.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-02-14 18:13:29 +05:30
Viraj Jasani 021fcc6c5e
HADOOP-18628. IPC Server Connection should log host name before returning VersionMismatch error (#5385)
Contributed by Viraj Jasani
2023-02-14 11:48:48 +00:00
GuoPhilipse fe0541b58d
HDFS-16913. Fix some flaky unit tests that often time out (#5377)
Co-authored-by: gf13871 <gf13871@ly.com>
Reviewed-by: Tao Li <tomscut@apache.org>
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2023-02-14 15:29:38 +08:00
Ayush Saxena 1def35d802
HADOOP-18524. Addendum: Deploy Hadoop trunk version website. (#5389). Contributed by Ayush Saxena.
Reviewed-by: Vinayakumar B <vinayakumarb@apache.org>
2023-02-14 11:05:41 +05:30
Viraj Jasani 90de1ff151
HADOOP-18206 Cleanup the commons-logging references and restrict its usage in future (#5315) 2023-02-14 03:24:06 +08:00
Ayush Saxena 30f560554d
HADOOP-18524. Deploy Hadoop trunk version website. (#5386). Contributed by Ayush Saxena.
Reviewed-by: Akira Ajisaka <aajisaka@apache.org>
2023-02-14 00:03:02 +05:30
Tamas Domok e4b5314991
MAPREDUCE-7433. Remove unused mapred/LoggingHttpResponseEncoder.java. (#5388) 2023-02-13 16:21:27 +01:00
Steve Vaughan f42c89dffb
HDFS-16904. Close webhdfs during TestSymlinkHdfs teardown (#5372)
This is a followup to the original patch, 08f58ecf07, which it supersedes
* Switch to org.apache.hadoop.io.IOUtils and closeStream.
* Use cleanupWithLogger to include error logging

Contributed by Steve Vaughan Jr
2023-02-13 14:31:32 +00:00
hfutatzhanghb f3c4277576
HDFS-16882. RBF: Add cache hit rate metric in MountTableResolver#getDestinationForPath (#5276)
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Tao Li <tomscut@apache.org>
2023-02-11 08:00:43 +08:00
Owen O'Malley 26fba8701c
HDFS-18324. Fix race condition in closing IPC connections. (#5371) 2023-02-10 17:51:03 +00:00
Tamas Domok 151b71d7af
MAPREDUCE-7431. ShuffleHandler refactor and fix after Netty4 upgrade. (#5311) 2023-02-10 17:40:21 +01:00
Viraj Jasani 17c8cdf63c
HDFS-16907. ADDENDUM: Remove unused variables from testDataNodeMXBeanLastHeartbeats. (#5373)
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2023-02-10 00:33:20 +05:30
Mehakmeet Singh 9e4f50d8a0
HADOOP-18596. Distcp -update to use modification time while checking for file skip. (#5308)
Adding toggleable support for modification time during distcp -update between two stores with incompatible checksum comparison.

Contributed by: Mehakmeet Singh <mehakmeet.singh.behl@gmail.com>
2023-02-09 21:31:09 +05:30
huhaiyang 113a9e40cb
HADOOP-18625. Fix method name of RPC.Builder#setnumReaders (#5301)
Changes method name of RPC.Builder#setnumReaders to setNumReaders()

The original method is still there, just marked deprecated.
It is the one which should be used when working with older branches.
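
A hedged sketch of the renamed setter (the protocol class and instance
are hypothetical; other builder parameters are omitted):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.ipc.RPC;

    public class RpcBuilderExample {
      static RPC.Server build(Configuration conf, Class<?> protocol,
          Object impl) throws Exception {
        return new RPC.Builder(conf)
            .setProtocol(protocol)
            .setInstance(impl)
            .setNumReaders(4) // was setnumReaders() before this change
            .build();
      }
    }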

Contributed by Haiyang Hu
2023-02-09 13:28:34 +00:00
huhaiyang d5c046518e
HDFS-16910. Fix incorrect initialization of RandomAccessFile that caused decreased flush performance for JN (#5359) 2023-02-09 10:47:57 +08:00
Viraj Jasani 4fcceff535
HADOOP-18620 Avoid using grizzly-http-* APIs (#5356) 2023-02-09 10:45:07 +08:00
slfan1989 af20841fb1
YARN-11217. [Federation] Add dumpSchedulerLogs REST APIs for Router. (#5272) 2023-02-08 11:48:38 -08:00
Steve Vaughan 08f58ecf07
HDFS-16904. Close webhdfs during TestSymlinkHdfs teardown (#5342)
Contributed by Steve Vaughan Jr
2023-02-08 17:15:42 +00:00
He Xiaoqiao 3ba058a894
HDFS-16898. Remove write lock for processCommandFromActor of DataNode to reduce impact on heartbeat (#5330). Contributed by ZhangHB.
Reviewed-by: zhangshuyan <zqingchai@gmail.com>
Reviewed-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2023-02-08 11:19:07 +08:00
He Xiaoqiao 7e919212c4
Revert "HDFS-16898. Make write lock fine-grain in method processCommandFromActor (#5330). Contributed by ZhangHB."
This reverts commit eb04ecd29d.
2023-02-08 11:15:22 +08:00
hfutatzhanghb eb04ecd29d
HDFS-16898. Make write lock fine-grain in method processCommandFromActor (#5330). Contributed by ZhangHB.
Reviewed-by: zhangshuyan <zqingchai@gmail.com>
Reviewed-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2023-02-08 10:33:28 +08:00
hchaverri d310642626
HDFS-16895. [RBF] NamenodeHeartbeatService should use credentials of logged in user 2023-02-07 18:45:05 +00:00
gardenia 8714403dc7
HADOOP-18621. Resource leak in CryptoOutputStream.close() (#5347)
When closing, we need to wrap the flush() in a try .. finally;
otherwise, when flush() throws, it stops completion of the remainder
of the close activities, in particular the close of the underlying
wrapped stream object, resulting in a resource leak.
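
A sketch of the general pattern (not the CryptoOutputStream source):

    import java.io.FilterOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;

    class SafeCloseStream extends FilterOutputStream {
      SafeCloseStream(OutputStream out) { super(out); }

      @Override
      public void close() throws IOException {
        try {
          flush();
        } finally {
          out.close(); // always reached, so the wrapped stream is released
        }
      }
    }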

Contributed by Colm Dougan
2023-02-07 12:01:57 +00:00
Viraj Jasani f02c452cf1
HDFS-16907. Add LastHeartbeatResponseTime for BP service actor (#5349)
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Tao Li <tomscut@apache.org>
2023-02-07 09:54:40 +08:00
slfan1989 a6a9fe17e0
YARN-3657. Federation maintenance mechanisms (simple CLI and command propagation). (#5348) 2023-02-06 11:47:07 -08:00
Steve Vaughan 5f5157ac53
HADOOP-18612. Avoid mixing canonical and non-canonical when performing comparisons (#5339)
Contributed by Steve Vaughan Jr
2023-02-06 18:28:29 +00:00
Steve Vaughan aed6fcee5b
HADOOP-18576. Java 11 JavaDoc fails due to missing package comments (#5344)
Add JavaDoc comments to package-info.java to avoid errors resulting from the use of Hadoop annotations.

Contributed by Steve Vaughan Jr
2023-02-06 18:17:57 +00:00
hfutatzhanghb be564f5c20
HDFS-16903. Fix javadoc of LightWeightResizableGSet class (#5338) 2023-02-06 13:21:28 +09:00
sunhao 0ae075a2af
HDFS-16848. RBF: Improve StateStoreZooKeeperImpl performance (#5147) 2023-02-05 09:33:05 +08:00
jokercurry dad73b76c0
YARN-11419. Remove redundant exception capture in NMClientAsyncImpl and improve readability in ContainerShellWebSocket, etc (#5309)
Co-authored-by: smallzhongfeng <982458633@qq.com>
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2023-02-04 10:29:19 +08:00
Viraj Jasani bce388fd3f
HDFS-16902. Add Namenode status to BPServiceActor metrics and improve logging in offerService (#5334)
Reviewed-by: Mingliang Liu <liuml07@apache.org>
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Tao Li <tomscut@apache.org>
2023-02-03 09:11:02 +08:00
Ankit Saurabh 22f6d55b71
HADOOP-18246. Reduce lower limit on fs.s3a.prefetch.block.size to 1 byte. (#5120)
The minimum value of fs.s3a.prefetch.block.size is now 1

Contributed by Ankit Saurabh
2023-02-02 18:45:21 +00:00
Viraj Jasani ad0cff2f97
HADOOP-18592. Sasl connection failure should log remote address. (#5294)
Contributed by Viraj Jasani <vjasani@apache.org>

Signed-off-by: Chris Nauroth <cnauroth@apache.org>
Signed-off-by: Steve Loughran <stevel@apache.org>
Signed-off-by: Mingliang Liu <liuml07@apache.org>
2023-02-01 10:15:20 -08:00
Masatake Iwasaki 6d325d9d09 HADOOP-18598. maven site generation doesn't include javadocs. (#5319)
Reviewed-by: Chris Nauroth <cnauroth@apache.org>
(cherry picked from commit 004121f9cc)
2023-01-31 10:50:41 +00:00
Masatake Iwasaki a70f84098f
HADOOP-18601. Fix build failure with docs profile. (#5331)
Reviewed-by: Steve Loughran <stevel@apache.org>
2023-01-31 19:44:19 +09:00
huhaiyang 88c8ac750d
HDFS-16888. BlockManager#maxReplicationStreams, replicationStreamsHardLimit, blocksReplWorkMultiplier and PendingReconstructionBlocks#timeout should be volatile (#5296)
Reviewed-by: Tao Li <tomscut@apache.org>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2023-01-31 17:46:38 +09:00
Wei-Chiu Chuang 9d47108b50
HADOOP-18584. [NFS GW] Fix regression after netty4 migration. (#5252)
Reviewed-by: Tsz-Wo Nicholas Sze <szetszwo@apache.org>
2023-01-31 01:17:04 +08:00
Ayush Saxena 952d707240
HADOOP-18604. Add compile platform in the hadoop version output. (#5327). Contributed by Ayush Saxena.
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
2023-01-28 14:19:19 +05:30
Szilard Nemeth b677d40ab5 HADOOP-18602. Remove netty3 dependency 2023-01-27 16:32:50 +01:00
Steve Loughran 970ebaeded
HADOOP-17717. Update wildfly openssl to 1.1.3.Final. (#5310)
Contributed by Wei-Chiu Chuang
2023-01-27 11:50:17 +00:00
slfan1989 468135a4d9
YARN-11218. [Federation] Add getActivities, getBulkActivities REST APIs for Router. (#5284) 2023-01-26 11:14:05 -08:00
Szilard Nemeth cf1b3711cb YARN-10965. Centralize queue resource calculation based on CapacityVectors. Contributed by Andras Gyori 2023-01-26 19:45:54 +01:00
Szilard Nemeth 815cde9810 YARN-6971. Clean up different ways to create resources. Contributed by Riya Khandelwal 2023-01-25 17:28:29 +01:00
Szilard Nemeth 29f2230cb6 YARN-5607. Document TestContainerResourceUsage#waitForContainerCompletion. Contributed by Susheel Gupta 2023-01-25 15:13:24 +01:00
Szilard Nemeth 8eda456d37 YARN-11404. Add junit5 dependency to hadoop-mapreduce-client-app to fix few unit test failure. Contributed by Susheel Gupta 2023-01-25 15:06:20 +01:00
kevin wan 3b7b79b37a
HADOOP-18582. skip unnecessary cleanup logic in distcp (#5251)
Co-authored-by: 万康 <mingge@xiaohongshu.com>
Reviewed-by: Steve Loughran <stevel@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
2023-01-24 15:49:32 -08:00
slfan1989 3f767a61b1
YARN-8900. [Follow Up] Fix FederationInterceptorREST#invokeConcurrent Inaccurate Order of Subclusters. (#5260) 2023-01-19 17:13:55 -08:00
zhtttylz 72b760130a
HDFS-16893. Standardize the usage of DFSClient debug log (#5303)
Co-authored-by: Zhtttylz <hualong.z@hotmail.com>
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2023-01-19 07:56:41 +08:00
Viraj Jasani 04f3573f6a
HDFS-16891 Avoid the overhead of copy-on-write exception list while loading inodes sub sections in parallel (#5300)
Reviewed-by: Stephen O'Donnell <sodonnell@apache.org>
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
2023-01-18 13:13:41 -08:00
slfan1989 442a5fb285
YARN-11320. [Federation] Add getSchedulerInfo REST APIs for Router. (#5217) 2023-01-17 09:36:19 -08:00
Nikita Eshkeev 4de31123ce
Fix "the the" and friends typos (#5267)
Signed-off-by: Nikita Eshkeev <neshkeev@yandex.ru>
2023-01-17 03:33:59 +08:00
PJ Fanning d81d98388c
HADOOP-18575: followup: try to avoid repeatedly hitting exceptions when transformer factories do not support attributes (#5253)
Part of HADOOP-18469 and the hardening of XML/XSL parsers.
Followup to the main HADOOP-18575 patch, to improve performance when
working with xml/xsl engines which don't support the relevant attributes.

Include this change when backporting.

Contributed by PJ Fanning.
2023-01-16 13:15:37 +00:00
Ashutosh Gupta 38453f8589
MAPREDUCE-7413. Upgrade Junit 4 to 5 in hadoop-mapreduce-client-hs-plugins (#5023)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2023-01-16 18:19:28 +09:00
Ashutosh Gupta 082266516a
MAPREDUCE-7417. Upgrade Junit 4 to 5 in hadoop-mapreduce-client-uploader (#5019)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Reviewed-by: Shilun Fan <slfan1989@apache.org>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2023-01-16 17:22:04 +09:00
slfan1989 168fa07801
YARN-11409. Fix typo in ResourceManager#webapp module. (#5285) 2023-01-13 11:53:13 +08:00
slfan1989 4520448327
YARN-11374. [Federation] Support refreshSuperUserGroupsConfiguration and refreshUserToGroupsMappings APIs for Federation. (#5193) 2023-01-12 15:04:09 -08:00
Viraj Jasani 1263e024b9
HDFS-16887 Log start and end of phase/step in startup progress (#5292)
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
2023-01-12 14:26:52 -08:00
skysiders 36bf54aba0
MAPREDUCE-7375 JobSubmissionFiles don't set right permission after mkdirs (#4237)
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
2023-01-12 13:48:29 -08:00
huangxiaoping a90e424d9f
HADOOP-18591. Fix a typo in Trash (#5291)
Signed-off-by: Tao Li <tomscut@apache.org>
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
2023-01-12 13:21:21 -08:00
slfan1989 3d21cff263
YARN-11413. Fix Junit Test ERROR Introduced By YARN-6412. (#5289)
* YARN-11413. Fix Junit Test ERROR Introduced By YARN-6412.

* YARN-11413. Fix CheckStyle.

* YARN-11413. Fix CheckStyle.

Co-authored-by: slfan1989 <louj1988@@>
2023-01-12 14:29:05 +01:00
Simbarashe Dzinamarira f26d8bc9bd
HDFS-16886: Fixes error in documentation for StateStoreRecordsOperations. (#5290) 2023-01-11 19:46:51 +00:00
Szilard Nemeth 7f6cc196f8
YARN-11190. CS Mapping rule bug: User matcher does not work correctly for usernames with dot (#4471) 2023-01-11 13:23:04 +01:00
huhaiyang e3b09b7512
HDFS-16884. Fix TestFsDatasetImpl#testConcurrentWriteAndDeleteBlock failed (#5280)
Reviewed-by: Takanobu Asanuma <tasanuma@apache.org>
Signed-off-by: Tao Li <tomscut@apache.org>
2023-01-11 09:12:26 +08:00
Chengbing Liu 4cf304de45
HDFS-16872. Fix log throttling by declaring LogThrottlingHelper as static members (#5246)
Co-authored-by: Chengbing Liu <liuchengbing@qiyi.com>
Signed-off-by: Erik Krogen <xkrogen@apache.org>
2023-01-10 10:03:25 -08:00
huhaiyang f3cff032e6
HDFS-16885. Fix TestHdfsConfigFields#testCompareConfigurationClassAgainstXml failed (#5283) 2023-01-10 09:02:56 +08:00
Dongjoon Hyun 6f99558c2e
HADOOP-18590. Publish SBOM artifacts (#5281)
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
2023-01-09 16:41:06 -08:00
Simbarashe Dzinamarira b56d483258
HDFS-16876: Changes cleanup of shared RouterStateIdContext to be driven by namenodeResolver data. (#5282) 2023-01-09 23:55:13 +00:00
ahmarsuhail 9c6eeb699e
HADOOP-18320. Fixes typos in Delegation Tokens documentation. (#4499)
Contributed By: Ahmar Suhail
2023-01-09 22:18:41 +05:30
Riya Khandelwal dd49077aed
YARN-6412 aux-services classpath not documented (#5242) 2023-01-09 15:30:24 +01:00
Surendra Singh Lilhore a65d24488a
HADOOP-18581: Handle Server KDC re-login when Server and Client run … (#5248)
* HADOOP-18581 : Handle Server KDC re-login when Server and Client run in same JVM.
2023-01-08 23:55:06 +05:30
Simbarashe Dzinamarira cd19da1309
HDFS-16877: Enables state context for namenode in TestObserverWithRouter (#5257) 2023-01-07 00:18:35 +00:00
PJ Fanning b9eb760ed2
HADOOP-18587: upgrade to jettison 1.5.3 due to cve (#5270)
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
2023-01-06 15:35:50 -08:00
Tsz-Wo Nicholas Sze 5022003e0f
HDFS-16881. Warn if AccessControlEnforcer runs for a long time to check permission. (#5268) 2023-01-05 09:31:52 +08:00
Yubi Lee 4511c360b9
HDFS-16883. Duplicate field name in hdfs-default.xml (#5271). Contributed by YUBI LEE. 2023-01-05 04:58:43 +05:30
huhaiyang 35ce60eadd
HDFS-16879. EC: Fsck -blockId shows number of redundant internal block replicas for EC Blocks (#5264) 2023-01-04 11:38:32 +08:00
slfan1989 0926fa5a2c
YARN-11225. [Federation] Add postDelegationToken, postDelegationTokenExpiration, cancelDelegationToken REST APIs for Router. (#5185) 2023-01-03 02:14:02 -08:00
susheel-gupta c44c9f984b
YARN-11393. Fs2cs could be extended to set ULF to -1 upon conversion (#5201) 2023-01-02 15:35:16 +01:00
Ayush Saxena b93b1c69cc
HADOOP-18586. Update the year to 2023. (#5265). Contributed by Ayush Saxena.
Reviewed-by: Takanobu Asanuma <tasanuma@apache.org>
2023-01-01 22:36:33 +05:30
Chris Nauroth 6b67373d10
YARN-11388: Prevent resource leaks in TestClientRMService. (#5187)
Signed-off-by: Shilun Fan <slfan1989@apache.org>
2022-12-28 11:00:27 -08:00
curie71 9668a85d40
YARN-11392. Audit log missing in ClientRMService (#5250). Contributed by Beibei Zhao.
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
2022-12-27 15:58:53 -08:00
Neil d25c1be517
HDFS-16861. RBF. Truncate API always fails when dirs use AllResolver order on Router (#5184)
Co-authored-by: xiezhineng <xiezhineng@corp.netease.com>
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Tao Li <tomscut@apache.org>
2022-12-26 21:19:33 +08:00
Akira Ajisaka 049d1762bd
MAPREDUCE-7428. Fix failing MapReduce tests due to the JUnit upgrades in WebServicesTestUtils (#5243)
Removed JUnit APIs from WebServicesTestUtils and TestContainerLogsUtils.
They are used by MapReduce modules as well as YARN modules, so the
APIs need to be removed to upgrade the JUnit version on a per-module basis.
Also, this effectively reverts the prior fix in #5209 because it didn't actually
fix the issue.
2022-12-24 04:33:35 +09:00
Bence Kosztolnik bf8ab83cd0 YARN-11395. RM UI: RMAttemptBlock cannot render FINAL_SAVING. Contributed by Bence Kosztolnik
- In YARN-1345, the removal of FINAL_SAVING was missed from RMAttemptBlock
- The same issue was present after YARN-1345 in YARN-4411
- The YARN-4411 logic was applied in this commit for FINAL_SAVING
2022-12-23 17:15:36 +01:00
ZanderXu df093ef9af
HDFS-16831. [RBF SBN] GetNamenodesForNameserviceId should shuffle Observer NameNodes every time (#5098) 2022-12-23 14:16:42 +08:00
slfan1989 17035da46e
YARN-11226. [Federation] Add createNewReservation, submitReservation, updateReservation, deleteReservation REST APIs for Router. (#5175) 2022-12-22 11:25:09 -08:00
susheel-gupta e6056d128a
YARN-10879. Incorrect WARN text in ACL check for application tag based placement (#5231)
Change-Id: Id892e38fe4c834b1743a0df2f0a40146d3d5a878
2022-12-22 17:20:53 +01:00
ZanderXu 15b52fb6a4
HDFS-16689. Standby NameNode crashes when transitioning to Active with in-progress tailer (#4744)
Signed-off-by: Erik Krogen <xkrogen@apache.org>
Co-authored-by: zengqiang.xu <zengqiang.xu@shopee.com>
2022-12-21 10:06:01 -08:00
David Dillon b63b777c84
HDFS-16873. FileStatus compareTo: specify ordering by path (#5219) 2022-12-21 10:11:55 +08:00
ZanderXu 8d221255f2
HDFS-16764. [SBN Read] ObserverNamenode should throw ObserverRetryOnActiveException instead of FileNotFoundException during processing of addBlock rpc (#4872)
Signed-off-by: Erik Krogen <xkrogen@apache.org>
Co-authored-by: zengqiang.xu <zengqiang.xu@shopee.com>
2022-12-20 15:50:58 -08:00
Daniel-009497 7ff326129d
HDFS-16871 DiskBalancer process may throw IllegalArgumentException when the target DataNode has capital letter in hostname (#5240) 2022-12-21 07:12:02 +08:00
陈爽-Jack Chen f6605f1b3a
HADOOP-18438: AliyunOSSFileSystemStore deleteObjects interface should return the objects that failed to delete (#4857)
Merged to trunk; thanks @chenshuang778 for the contribution
2022-12-20 13:57:49 +08:00
Steve Loughran 52c72fafe4
HADOOP-18470. Update index.md with section on ABFS prefetching 2022-12-19 13:04:26 +00:00
PJ Fanning 6a07b5dc10
HADOOP-18575. Make XML transformer factory more lenient (#5224)
Due diligence followup to
HADOOP-18469. Add secure XML parser factories to XMLUtils (#4940)

Contributed by P J Fanning
2022-12-18 12:25:10 +00:00
Steve Loughran 33785fc5ad
HADOOP-18577. Followup: javadoc fix (#5232)
Fixes a javadoc error which came with
HADOOP-18577. ABFS: Add probes of readahead fix (#5205)

Part of the HADOOP-18521 ABFS readahead fix; MUST be included.

Contributed by Steve Loughran
2022-12-18 12:19:33 +00:00
Chengbing Liu ca3526da92
HADOOP-18567. LogThrottlingHelper: properly trigger dependent recorders in cases of infrequent logging (#5215)
Signed-off-by: Erik Krogen <xkrogen@apache.org>
Co-authored-by: Chengbing Liu <liuchengbing@qiyi.com>
2022-12-16 09:15:11 -08:00
Xing Lin f7bdf6c667
HDFS-16852. Skip KeyProviderCache shutdown hook registration if already shutting down (#5160)
Signed-off-by: Erik Krogen <xkrogen@apache.org>
2022-12-16 08:46:14 -08:00
Happy-shi c5b42d59d2
HDFS-16866. Fix a typo in Dispatcher (#5202)
Signed-off-by: Tao Li <tomscut@apache.org>
2022-12-16 11:07:41 +08:00
Steve Loughran cf1244492d
HADOOP-18577. ABFS: Add probes of readahead fix (#5205)
Followup patch to  HADOOP-18456 as part of HADOOP-18521,
ABFS ReadBufferManager buffer sharing across concurrent HTTP requests

Add probes of the readahead fix to aid in checking the safety of the
hadoop ABFS client across different releases.

* ReadBufferManager constructor logs the fact it is safe at TRACE
* AbfsInputStream declares it is fixed in toString()
  by including fs.azure.capability.readahead.safe" in the
  result.

The ABFS FileSystem hasPathCapability("fs.azure.capability.readahead.safe")
probe returns true to indicate the client's readahead manager has been fixed
to be safe when prefetching.

All Hadoop releases for which this probe returns false,
and for which the probe "fs.capability.etags.available"
returns true, are at risk of returning invalid data when reading
ADLS Gen2/Azure storage data.

Contributed by Steve Loughran.
2022-12-15 17:08:25 +00:00
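A minimal sketch of how a client might run the probes described above (the path argument and class name are illustrative; the capability strings are from the commit message):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadaheadProbe {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path path = new Path(args[0]);              // e.g. an abfs:// path
    FileSystem fs = path.getFileSystem(conf);
    // true only on releases where the ReadBufferManager fix is present
    boolean readaheadSafe =
        fs.hasPathCapability(path, "fs.azure.capability.readahead.safe");
    boolean etags =
        fs.hasPathCapability(path, "fs.capability.etags.available");
    System.out.println("readahead safe: " + readaheadSafe
        + ", etags available: " + etags);
  }
}
```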
Steve Loughran 5f08e51b72
HADOOP-18561. Update commons-net to 3.9.0 (#5214)
Addresses CVE-2021-37533, which *only* relates to FTP.

Applications not using the ftp:// filesystem are unaffected;
as anyone who has used it will know, it is very minimal and
so rarely used that it is not a critical part of the project.

Furthermore, the FTP-related issue is at worst information leakage
if someone connects to a malicious server.

This is a due diligence PR rather than an emergency fix.

Contributed by Steve Loughran
2022-12-15 16:45:05 +00:00
Steve Loughran f7b1bb4dcc
HADOOP-18573. Improve error reporting on non-standard kerberos names (#5221)
The kerberos RPC does not declare any restriction on
characters used in kerberos names, though
implementations MAY be more restrictive.

If the kerberos controller supports the use of non-conventional
principal names *and the kerberos admin chooses to use them*,
this can confuse some of the parsing.

The obvious solution is for the enterprise admins to "not do that"
as a lot of things break, bits of hadoop included.

Harden the hadoop code slightly so at least we fail more gracefully,
so people can then get in touch with their sysadmin and tell them
to stop it.
2022-12-15 11:42:36 +00:00
Mehakmeet Singh 32414cfe46
HADOOP-18574. Changing log level of IOStatistics increment to make the DEBUG logs less noisy (#5223)
Contributed by: Mehakmeet Singh
2022-12-15 10:19:18 +05:30
slfan1989 6172c3192d
YARN-11358. [Federation] Add FederationInterceptor#allow-partial-result config. (#5056) 2022-12-14 14:37:56 -08:00
Steve Loughran aaf92fe183
HADOOP-18526. Leak of S3AInstrumentation instances via hadoop Metrics references (#5144)
This has triggered an OOM in a process which was churning through s3a fs
instances; the increased memory footprint of IOStatistics amplified what
must have been a long-standing issue with FS instances being created
and not closed()

*  Makes sure instrumentation is closed when the FS is closed.
*  Uses a weak reference from metrics to instrumentation, so even
   if the FS wasn't closed (see HADOOP-18478), this back reference
   would not cause the S3AInstrumentation reference to be retained.
*  If S3AFileSystem is configured to log at TRACE it will log the
   calling stack of initialize(), so help identify where the
   instance is being created. This should help track down
   the cause of instance leakage.

Contributed by Steve Loughran.
2022-12-14 18:21:03 +00:00
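As a usage sketch under stated assumptions (the bucket name is hypothetical): code which deliberately churns through filesystem instances should create them with FileSystem.newInstance() and close them, rather than relying on GC:

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class InstancePerTask {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // newInstance() bypasses the FileSystem cache, so the caller owns
    // the instance and must close it to release its instrumentation.
    try (FileSystem fs =
        FileSystem.newInstance(new URI("s3a://example-bucket/"), conf)) {
      fs.exists(new Path("/"));
    } // close() now also releases the S3AInstrumentation, per this change
  }
}
```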
slfan1989 63b9a6a2b6
YARN-11350. [Federation] Router Support DelegationToken With ZK. (#5131) 2022-12-14 09:09:38 -08:00
Doroszlai, Attila 4de8791deb
HADOOP-18569. NFS Gateway may release buffer too early (#5212)
(cherry picked from commit df4812df65)
2022-12-14 15:55:44 +01:00
Steve Loughran 1cecf8ab70
HADOOP-18183. s3a audit logs to publish range start/end of GET requests. (#5110)
The start and end of the range are set in a new audit param "rg",
e.g. "?rg=100-200"

Contributed by Ankit Saurabh
2022-12-14 14:01:28 +00:00
Ashutosh Gupta 85ec7969a7
MAPREDUCE-7428. Fix failures related to Junit 4 to Junit 5 upgrade in org.apache.hadoop.mapreduce.v2.app.webapp (#5209)
Contributed by: Ashutosh Gupta
2022-12-14 12:54:08 +00:00
curie71 fdcbc8b072
HDFS-16868. Fix audit log duplicate issue when an ACE occurs in FSNamesystem. (#5206). Contributed by Beibei Zhao.
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2022-12-13 12:24:51 +08:00
slfan1989 a71aaef9a9
YARN-11385. Fix hadoop-yarn-server-common module Java Doc Errors. (#5182) 2022-12-10 15:03:49 -08:00
Jack Richard Buggins a46b20d25f
HADOOP-18329. Support for IBM Semeru JVM > 11.0.15.0 Vendor Name Changes (#4537)
The static boolean PlatformName.IBM_JAVA now identifies
Java 11+ IBM Semeru runtimes as IBM JVM releases.

Contributed by Jack Buggins.
2022-12-10 14:27:05 +00:00
Steve Loughran 0a7dfcc332
HADOOP-18546. Followup: ITestReadBufferManager fix (#5198)
This is a followup to the original HADOOP-18546
patch; cherry-picks of that should include this
or follow up with it.

Removes risk of race conditions in assertions
of ITestReadBufferManager on the state of the in-progress
and completed queues by removing assertions brittle
to race conditions in scheduling/network IO

* Waits for all the executor pool shutdown to complete before
  making any assertions
* Assertions that there are no in progress reads MUST be
  cut as there may be some and they won't be cancelled.
* Assertions that the completed list is without buffers
  of a closed stream are brittle because if there was
  an in progress stream which completed after stream.close()
  then it will end up in the list.

Contributed by Steve Loughran
2022-12-09 13:47:11 +00:00
Anurag P e76616f690
HDFS-16860 Upgrade moment.min.js to 2.29.4 (#5194) 2022-12-09 11:18:44 +05:30
dingshun3016 2fa540dca1
HDFS-16858. Dynamically adjust max slow disks to exclude. (#5180)
Reviewed-by: Chris Nauroth <cnauroth@apache.org>
Reviewed-by: slfan1989 <55643692+slfan1989@users.noreply.github.com>
Signed-off-by: Tao Li <tomscut@apache.org>
2022-12-09 08:10:04 +08:00
K0K0V0K ee7d1787cd
YARN-11390. TestResourceTrackerService.testNodeRemovalNormally: Shutdown nodes should be 0 now expected: <1> but was: <0> (#5190)
Reviewed-by: Peter Szucs
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
2022-12-08 09:52:19 -08:00
Oleksandr Shevchenko 0a4528cd7f
HADOOP-18563. Misleading AWS SDK S3 timeout configuration comment (#5197)
Contributed by Oleksandr Shevchenko
2022-12-08 15:07:59 +00:00
Pranav Saxena c67c2b7569
HADOOP-18546. ABFS: Disable purging list of in-progress reads in abfs stream close() (#5176)
This addresses HADOOP-18521, "ABFS ReadBufferManager buffer sharing
across concurrent HTTP requests" by not trying to cancel
in progress reads.

It supersedes HADOOP-18528, which disables the prefetching.
If that patch is applied *after* this one, prefetching
will be disabled.

As well as changing the default value in the code,
core-default.xml is updated to set
fs.azure.enable.readahead = true

As a result, if Configuration.get("fs.azure.enable.readahead")
returns a non-null value, then it can be inferred that
it was set in core-default.xml (the fix is present)
or in core-site.xml (someone asked for it).

Contributed by Pranav Saxena.
2022-12-07 20:15:45 +00:00
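The inference described above can be expressed in a few lines (a sketch, not code from the patch):

```java
import org.apache.hadoop.conf.Configuration;

public class ReadaheadFixCheck {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Non-null means the key was declared somewhere: core-default.xml
    // (this fix is present) or core-site.xml (explicitly configured).
    String value = conf.get("fs.azure.enable.readahead");
    System.out.println(value == null
        ? "key absent: fix not present and not explicitly set"
        : "fs.azure.enable.readahead = " + value);
  }
}
```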
Murali Krishna 2e88096266
HADOOP-18538. Upgrade kafka to 2.8.2 (#5164)
Signed-off-by: Brahma Reddy Battula <brahma@apache.org>
2022-12-06 22:27:46 +05:30
slfan1989 f71fd885be
YARN-11373. [Federation] Support refreshQueues refreshNodes API's for Federation. (#5146) 2022-12-06 08:17:05 -08:00
Akshat Bordia 86ac1ad9e5
YARN-10978. Fix ApplicationClassLoader to Correctly Expand Glob for Windows Path (#3558) 2022-12-06 16:39:49 +05:30
Gautham B A dadd3d9138
YARN-11386. Fix issue with classpath resolution (#5183)
* This PR ensures that all the special notations such as
  <CPS> are resolved before getting added to classpath.
2022-12-06 16:32:26 +05:30
Steve Loughran b666075a41
HADOOP-18560. AvroFSInput opens a stream twice and discards the second one without closing (#5186)
This is needed for branches with the hadoop-common changes of
HADOOP-16202 (Enhanced openFile()).
2022-12-06 09:58:51 +00:00
Steve Loughran 84b33b897c
HADOOP-18470. index.md update for 3.3.5 release 2022-12-05 16:13:24 +00:00
ZanderXu 8a9bdb1edc
HDFS-16837. [RBF SBN] ClientGSIContext should merge RouterFederatedStates to get the max state id for each namespace (#5123) 2022-12-05 16:15:47 +08:00
dingshun3016 02afb9ebe1
HDFS-16809. EC striped block is not sufficient when doing in maintenance. (#5050) 2022-12-05 16:34:51 +09:00
slfan1989 60e0fe8709
YARN-11381. Fix hadoop-yarn-common module Java Doc Errors. (#5179) 2022-12-02 10:56:17 -08:00
slfan1989 4af4997e11
YARN-11158. Support (Create/Renew/Cancel) DelegationToken API's for Federation. (#5104) 2022-12-01 13:20:21 -08:00
Szilard Nemeth 5440c75c4a YARN-10946. AbstractCSQueue: Create separate class for constructing Queue API objects. Contributed by Peter Szucs 2022-12-01 15:11:58 +01:00
litao 2067fcb646
HDFS-16550. Allow JN edit cache size to be set as a fraction of heap memory (#4209) 2022-11-30 07:44:21 -08:00
Anmol Asrani 7786600744
HADOOP-18457. ABFS: Support account level throttling (#5034)
This allows abfs request throttling to be shared across all
abfs connections talking to containers belonging to the same abfs storage
account, as that is the level at which IO throttling is applied.

The option is enabled/disabled in the configuration option
"fs.azure.account.throttling.enabled";
the default is "true".

Contributed by Anmol Asrani
2022-11-30 13:05:31 +00:00
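A sketch of toggling the option programmatically (the config key is from the commit message above; the class is illustrative):

```java
import org.apache.hadoop.conf.Configuration;

public class AbfsThrottlingConfig {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Default is true: throttling state is shared per storage account.
    // Set false to fall back to per-connection throttling.
    conf.setBoolean("fs.azure.account.throttling.enabled", true);
    System.out.println(
        conf.getBoolean("fs.azure.account.throttling.enabled", true));
  }
}
```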
Kidd5368 72749a4ff8
HDFS-16839. Consider EC reconstruction work when determining if a node is busy (#5128)
Co-authored-by: Takanobu Asanuma <tasanuma@apache.org>
Reviewed-by: Tao Li <tomscut@apache.org>
2022-11-30 09:43:15 +08:00
Owen O'Malley 03471a736c
HDFS-16851: RBF: Add a utility to dump the StateStore. (#5155) 2022-11-29 22:12:35 +00:00
HarshitGupta11 0ef572abed
HADOOP-18530. ChecksumFileSystem::readVectored might return byte buffers not positioned at 0 (#5168)
Contributed by Harshit Gupta
2022-11-29 14:51:22 +00:00
caozhiqiang 35c65005d0
HDFS-16846. EC: Only EC blocks should be affected by max-streams-hard-limit configuration (#5143)
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2022-11-29 10:51:21 +09:00
Simbarashe Dzinamarira 909aeca86c
HDFS-16845: Adds configuration flag to allow clients to use router observer reads without using the ObserverReadProxyProvider. (#5142) 2022-11-29 00:49:10 +00:00
Simbarashe Dzinamarira ec2856d79c
HDFS-16847: RBF: Prevents StateStoreFileSystemImpl from committing tmp file after encountering an IOException. (#5145) 2022-11-29 00:47:01 +00:00
slfan1989 f93167e678
YARN-11380. Fix hadoop-yarn-api module Java Doc Errors. (#5152). Contributed by Shilun Fan.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-11-28 18:54:32 +05:30
sreeb-msft 1a7acc403b
HADOOP-18498. ABFS: Remove unwanted ? prefix from SAS Tokens (#5136)
This commit parses SAS Tokens and removes the unwanted prefix of '?' from them, if present.

At present, SAS Tokens are provided to the driver through customer implementations of the SASTokenProvider interface. The SAS token providers should not assume that the token will be the first query parameter in the URIs that communicate with the backend. However, it was observed that certain public interfaces provided by Storage to generate SAS can include the '?' as the first character of the SAS Token, which would ideally be the case when it is the first query parameter. Thus, tokens that contain this prefix will lead to an error in the driver due to a clash of query parameters.

To avoid failures for use of such SAS tokens, after receiving the SAS Token from the provider, the code checks whether a '?' prefix is present; if so, it is removed before further usage of the token. This way, users do not have to manually remove the prefix before passing it on as a configuration.

Contributed by Sree Bhattacharya
2022-11-28 11:38:13 +00:00
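The check-and-strip step described above amounts to something like this sketch (class and method names are hypothetical, not the driver's actual code):

```java
public final class SasTokenSanitizer {
  private SasTokenSanitizer() {
  }

  /** Strip a leading '?' from a SAS token, if present. */
  public static String stripPrefix(String sasToken) {
    if (sasToken != null && sasToken.startsWith("?")) {
      return sasToken.substring(1);
    }
    return sasToken;
  }

  public static void main(String[] args) {
    // prints the same token twice: with and without the '?' prefix stripped
    System.out.println(stripPrefix("?sv=2021-06-08&sig=abc"));
    System.out.println(stripPrefix("sv=2021-06-08&sig=abc"));
  }
}
```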
PJ Fanning e09e81abe4
HADOOP-18496: remove unused okhttp.version (#5140). Contributed by PJ Fanning.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-11-27 18:59:40 +05:30
slfan1989 1ddc9091f6
YARN-11381. Fix hadoop-yarn-common module Java Doc Errors. (#5153). Contributed by Shilun Fan.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-11-26 21:01:07 +05:30
ZanderXu 87429f443a
HDFS-16779. Add ErasureCodingPolicy information to the response description for GETFILESTATUS in WebHDFS.md (#4922) 2022-11-25 09:26:28 +08:00
ZanderXu e0974298ce
HDFS-16826. [RBF SBN] ConnectionManager should advance the client stateId for each request (#5086) 2022-11-25 09:23:33 +08:00
huhaiyang ef84d21867
HDFS-16841. Enhance the function of DebugAdmin#VerifyECCommand (#5137) 2022-11-24 09:17:27 +08:00
ZanderXu bcc3d2a20e
HDFS-16838. Fix NPE in testAddRplicaProcessorForAddingReplicaInMap (#5125)
Reviewed-by: Xing Lin <xinglin@linkedin.com>
Reviewed-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Tao Li <tomscut@apache.org>
2022-11-24 09:07:27 +08:00
huhaiyang dfa9edacce
HDFS-16840. Enhance the usage description about oiv in HDFSCommands.md and OfflineImageViewerPB (#5129)
Reviewed-by: Takanobu Asanuma <tasanuma@apache.org>
Reviewed-by: ZanderXu <zanderxu@apache.org>
Signed-off-by: Tao Li <tomscut@apache.org>
2022-11-24 09:00:27 +08:00
Ashutosh Gupta 2c1158e858
HADOOP-18531. Fix assertion failure in ITestS3APrefetchingInputStream (#5149)
This patch MUST be applied to all branches containing HADOOP-18378
so as to ensure reliable test runs.

Contributed by Ashutosh Gupta
2022-11-23 17:47:39 +00:00
huhaiyang ac958777af
HDFS-16813. Remove parameter validation logic such as dfs.namenode.decommission.blocks.per.interval in DatanodeAdminManager#activate (#5063) 2022-11-23 09:50:57 +08:00
slfan1989 7cb22eb72d
YARN-11371. [Federation] Refactor FederationInterceptorREST#createNewApplication\submitApplication Use FederationActionRetry. (#5130) 2022-11-22 14:38:24 -08:00
Szilard Nemeth 3c37a01654 YARN-8262. get_executable in container-executor should provide meaningful error codes. Contributed by Susheel Gupta 2022-11-22 13:37:55 +01:00
litao 8f971b0e54
HDFS-16547. [SBN read] Namenode in safe mode should not be transferred to observer state (#4201)
Signed-off-by: Erik Krogen <xkrogen@apache.org>
Reviewed-by: Zengqiang Xu <xuzq_zander@163.com>
2022-11-21 10:14:07 -08:00
zhengchenyu dc2fba45fe
HDFS-16832. [SBN READ] Follow-on to HDFS-16732. Fix NPE when checking the block location of an empty directory (#5099)
Signed-off-by: Erik Krogen <xkrogen@apache.org>
Reviewed-by: Zengqiang Xu <xuzq_zander@163.com>
2022-11-21 08:26:16 -08:00
GuoPhilipse 069bd973d8
HADOOP-18532. Update command usage in FileSystemShell.md (#5141)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-11-21 15:55:46 +09:00
Ashutosh Gupta 696d042054
HADOOP-8728. Display (fs -text) shouldn't hard-depend on Writable serialized sequence files. (#5010)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-11-21 14:54:50 +09:00
Ashutosh Gupta 2e993fdf4e
YARN-6946. Upgrade JUnit from 4 to 5 in hadoop-yarn-common (#4717)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-11-21 14:40:02 +09:00
Ashutosh Gupta dcde414570
MAPREDUCE-7422. Upgrade Junit 4 to 5 in hadoop-mapreduce-examples (#5029)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-11-21 14:36:35 +09:00
Owen O'Malley c71a68ca80
HDFS-16844: Adds resiliency when StateStore gets exceptions. (#5138)
Allows the StateStore to stay up when there are errors reading the data.
2022-11-18 17:24:08 +00:00
Owen O'Malley 1ea5db52dd
HADOOP-18324. Interrupting RPC Client calls can lead to thread exhaustion. (#4527)
* Exactly 1 sending thread per RPC connection.
* If the calling thread is interrupted before the socket write, the request will be skipped instead of being sent anyway.
* If the calling thread is interrupted during the socket write, the write will finish.
* RPC requests will be written to the socket in the order received.
* Sending thread is only started by the receiving thread.
* The sending thread periodically checks the shouldCloseConnection flag.
2022-11-18 16:24:45 +00:00
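A minimal, self-contained sketch of the single-sender pattern listed above (an illustration under stated assumptions, not Hadoop's actual ipc.Client code):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// One sender thread per connection drains a queue in arrival order, so an
// interrupted caller never kills or corrupts the shared sending path.
class ConnectionSender {
  private final BlockingQueue<byte[]> pending = new LinkedBlockingQueue<>();
  private final Thread sender;

  ConnectionSender(OutputStream socketOut) {
    sender = new Thread(() -> {
      try {
        while (!Thread.currentThread().isInterrupted()) {
          byte[] rpc = pending.take();   // FIFO: requests keep their order
          socketOut.write(rpc);          // a write, once started, completes
        }
      } catch (InterruptedException | IOException e) {
        // connection teardown path
      }
    }, "rpc-sender");
    sender.setDaemon(true);
    sender.start();
  }

  /** Called by application threads; skips the send if already interrupted. */
  void send(byte[] rpc) throws InterruptedException {
    if (Thread.currentThread().isInterrupted()) {
      throw new InterruptedException("call interrupted before send");
    }
    pending.put(rpc);
  }

  public static void main(String[] args) throws Exception {
    ConnectionSender s = new ConnectionSender(new ByteArrayOutputStream());
    s.send("hello".getBytes(StandardCharsets.UTF_8));
  }
}
```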
Hu Xinqiu 7d39abd799
HADOOP-18429. fix infinite loop in MutableGaugeFloat#incr(float) (#4823) 2022-11-17 17:50:39 +08:00
slfan1989 eccd2d0492
YARN-11359. [Federation] Routing admin invocations transparently to multiple RMs. (#5057) 2022-11-16 18:00:38 -08:00
Szilard Nemeth 142df247ed YARN-11369. Commons.compress throws an IllegalArgumentException with large uids after 1.21. Contributed by Benjamin Teke 2022-11-16 13:07:05 +01:00
Lei Yang cd929457c9
HDFS-16836: StandbyCheckpointer shouldn't trigger rollback fs image after RU is finalized (#5135)
Co-authored-by: Lei Yang <leyang@linkedin.com>
2022-11-15 23:06:37 +00:00
Mehakmeet Singh 69e50c7b44
HADOOP-18528. Disable abfs prefetching by default (#5134)
Disables block prefetching on ABFS InputStreams, by setting
fs.azure.enable.readahead to false in core-default.xml and
the matching java constant.

This prevents
HADOOP-18521. ABFS ReadBufferManager buffer sharing across concurrent HTTP requests.

Once a fix for that is committed, this change can be reverted.

Contributed by Mehakmeet Singh.
2022-11-15 14:28:41 +00:00
Ashutosh Gupta a48e8c9beb
MAPREDUCE-5608. Replace and deprecate mapred.tasktracker.indexcache.mb (#5014)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-11-14 11:07:40 +09:00
slfan1989 04b31d7ecf
MAPREDUCE-7390. Remove WhiteBox in mapreduce module. (#4462)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-11-14 10:45:20 +09:00
ZanderXu d3c1c453f0
HDFS-16785. Avoid holding write lock to improve performance when adding volume. (#4945). Contributed by ZanderXu.
Signed-off-by: Tao Li <tomscut@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2022-11-13 22:01:35 +08:00
Szilard Nemeth 22c9f28f4d YARN-10005. Code improvements in MutableCSConfigurationProvider. Contributed by Peter Szucs 2022-11-12 18:35:49 +01:00
PJ Fanning d340c4a7a1
HADOOP-18496. Upgrade okhttp3 and dependencies due to kotlin CVEs (#5035)
Updates okhttp3 and okio so their transitive dependency on Kotlin
stdlib is free from recent CVEs.

okhttp3:okhttp => 4.10.0
okio:okio => 3.2.0
kotlin stdlib => 1.6.20

kotlin CVEs fixed:
 CVE-2022-24329
 CVE-2020-29582
 
Contributed by PJ Fanning.
2022-11-12 14:14:19 +00:00
Szilard Nemeth 5bb11cecea HADOOP-15327. Upgrade MR ShuffleHandler to use Netty4 #3259. Contributed by Szilard Nemeth. 2022-11-11 09:05:01 +01:00
Simbarashe Dzinamarira 552ee44eba
HDFS-16834: Removes request stateID consistency constraint between clients in different connection pools. (#5121)
Reviewed-by: Tao Li <tomscut@apache.org>
Signed-off-by: Zander Xu <zanderxu@apache.org>
2022-11-11 15:26:31 +08:00
slfan1989 b398a7b003
YARN-11367. [Federation] Fix DefaultRequestInterceptorREST Client NPE. (#5100) 2022-11-09 10:25:10 -08:00
zhengchenyu f68f1a4578
HADOOP-18433. Fix main thread name. (#4838) 2022-11-09 19:18:31 +08:00
ted12138 7002e214b8
HADOOP-18502. MutableStat should return 0 when there is no change (#5058) 2022-11-09 10:21:43 +08:00
Steve Loughran 7f9ca101e2
HADOOP-18517. ABFS: Add fs.azure.enable.readahead option to disable readahead (#5103)
* HADOOP-18517. ABFS: Add fs.azure.enable.readahead option to disable readahead

Adds new config option to turn off readahead
* also allows it to be passed in through openFile(),
* extends ITestAbfsReadWriteAndSeek to use the option, including one
  replicated test...that shows that turning it off is slower.

Important: this does not address the critical data corruption issue
HADOOP-18521. ABFS ReadBufferManager buffer sharing across concurrent HTTP requests

What it does do is provide a way to completely bypass the ReadBufferManager.
To mitigate the problem, either fs.azure.enable.readahead needs to be set to
false, or "fs.azure.readaheadqueue.depth" to 0 -this still goes near the
(broken) ReadBufferManager code, but doesn't trigger the bug.

For safe reading of files through the ABFS connector, readahead MUST be disabled
or the followup fix to HADOOP-18521 applied

Contributed by Steve Loughran
2022-11-08 11:43:04 +00:00
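A sketch of both ways to disable readahead on input streams, per the description above (the path argument and class name are illustrative; the keys are from the commit message):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DisableAbfsReadahead {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Option 1: filesystem-wide, before the FS instance is created.
    conf.setBoolean("fs.azure.enable.readahead", false);
    Path path = new Path(args[0]);          // e.g. an abfs:// path
    FileSystem fs = path.getFileSystem(conf);

    // Option 2: per-stream, through the openFile() builder.
    try (FSDataInputStream in = fs.openFile(path)
        .opt("fs.azure.enable.readahead", "false")
        .build().get()) {
      System.out.println(in.read());
    }
  }
}
```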
slfan1989 845cf8bc28
YARN-11368. [Federation] Improve Yarn Router's Federation Page style. (#5105) 2022-11-07 10:13:23 -08:00
Simbarashe Dzinamarira 44b8bb7224
HDFS-16821: Fixes regression in HDFS-13522 that enables observer reads by default (#5078) 2022-11-07 08:43:04 -08:00
Takanobu Asanuma 660530205e
HDFS-16833. NameNode should log internal EC blocks instead of the EC block group when it receives block reports. (#5106)
Reviewed-by: Tao Li <tomscut@apache.org>
2022-11-07 10:49:48 +09:00
slfan1989 5d6ab15860
YARN-11354. [Federation] Add Yarn Router's NodeLabel Web Page. (#5073) 2022-11-04 14:39:57 -07:00
Steve Vaughan 2ba982a061
MAPREDUCE-7386. Maven parallel builds (skipping tests) fail (#4415)
Contributed by Steve Vaughan Jr
2022-11-04 11:50:43 +00:00
huhaiyang e9319e696c
HDFS-16811. Support DecommissionBackoffMonitor parameters reconfigurable (#5068)
Signed-off-by: Tao Li <tomscut@apache.org>
2022-11-04 14:18:59 +08:00
slfan1989 b90dfdff3f
YARN-11366. Improve equals, hashCode(), toString() methods of the Federation Base Object. (#5096) 2022-11-03 21:33:53 -07:00
Ashutosh Gupta e62ba16a02
HADOOP-18484. Upgrade hsqldb to v2.7.1 to mitigate CVE-2022-41853 (#4991) 2022-11-02 08:41:27 +01:00
Ashutosh Gupta 83acb55981
YARN-11364. Docker Container to accept docker Image name with sha256 digest (#5092)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Reviewed-by: slfan1989 <55643692+slfan1989@users.noreply.github.com>
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
2022-11-01 14:44:35 -07:00
Ashutosh Gupta 69225ae5b9
YARN-11363. Remove unused TimelineVersionWatcher and TimelineVersion from hadoop-yarn-server-tests (#5091)
Reviewed-by: slfan1989 <55643692+slfan1989@users.noreply.github.com>
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
2022-11-01 14:02:06 -07:00
wangteng13 388f2f182f
Document fix for MAPREDUCE-7425 (#5090)
Reviewed-by: Ashutosh Gupta <ashutosh.gupta@st.niituniversity.in>
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
2022-11-01 13:34:59 -07:00
PJ Fanning 7ba304d1c6
HADOOP-18512: upgrade woodstox-core to 5.4.0 for security fix (#5087). Contributed by PJ Fanning.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-11-02 00:11:41 +05:30
Daniel Carl Jones 0b577992ef
HADOOP-18482. ITestS3APrefetchingInputStream to skip if CSV test file unavailable (#4983)
Contributed by Danny Jones
2022-10-31 21:19:34 +00:00
Steve Loughran 3b10cb5a3b
HADOOP-18507. VectorIO FileRange type to support a "reference" field (#5076)
Contributed by Steve Loughran
2022-10-31 21:12:13 +00:00
Gautham B A b1f418f802
YARN-11365. Fix NM class not found on Windows (#5093) 2022-10-31 09:19:07 -07:00
sabertiger af7dd660e0
HADOOP-18233. Possible race condition with TemporaryAWSCredentialsProvider (#5024)
This fixes a race condition with the TemporaryAWSCredentialsProvider,
one which has existed for a long time but which only surfaced
(usually in Spark) when the bucket existence probe was disabled
by setting fs.s3a.bucket.probe to 0 -a performance speedup
which was made the default in HADOOP-17454.

Contributed by Jimmy Wong.
2022-10-31 12:43:30 +00:00
Ashutosh Gupta cbe02c2e77
YARN-11264. Upgrade JUnit from 4 to 5 in hadoop-yarn-server-tests (#4776)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-10-30 04:33:57 +09:00
Samrat e04c9e810b
MAPREDUCE-7426. Fix typo in StartEndTimeBase (#4894)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-10-30 04:23:30 +09:00
Ashutosh Gupta c096803387
YARN-11339. Upgrade Junit 4 to 5 in hadoop-yarn-services-api (#4995)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-10-30 03:07:16 +09:00
PJ Fanning d6a65a4180
HADOOP-18472. Upgrade to snakeyaml 1.33 (#4958)
Reviewed-by: Dinesh Chitlangia <dineshc@apache.org>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-10-30 02:30:41 +09:00
Ashutosh Gupta 2aae7ffe08
YARN-11337. Upgrade Junit 4 to 5 in hadoop-yarn-applications-mawo (#4993)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-10-30 02:25:53 +09:00
slfan1989 070a2d4880
YARN-11332. [Federation] Improve FederationClientInterceptor#ThreadPool thread pool configuration. (#4982) 2022-10-28 15:39:04 -07:00
slfan1989 b1cd88c598
YARN-11229. [Federation] Add checkUserAccessToQueue REST APIs for Router. (#4929) 2022-10-28 15:37:35 -07:00
Chris Nauroth bfb84cd7f6
YARN-11360: Add number of decommissioning/shutdown nodes to YARN cluster metrics. (#5060) 2022-10-28 11:07:01 -07:00
jianghuazhu 88f7f5bc01
HDFS-16802. Print options when accessing ClientProtocol#rename2(). (#5013)
Reviewed-by: Tao Li <tomscut@apache.org>
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
Signed-off-by: Zander Xu <zanderxu@apache.org>
2022-10-28 16:31:20 +08:00
M1eyu2018 8396caa484
HDFS-16716. Improve appendToFile command: support appending to a file with a new block (#4697)
Reviewed-by: xuzq <15040255127@163.com>
Signed-off-by: Tao Li <tomscut@apache.org>
2022-10-27 19:03:15 +08:00
huhaiyang d26c35b228
HDFS-16817. Remove useless DataNode lock related configuration (#5072)
Reviewed-by: litao <tomleescut@gmail.com>
Signed-off-by: ZanderXu <zanderxu@apache.org>
2022-10-27 17:28:19 +08:00
Takanobu Asanuma 545a556883
HDFS-16822. HostRestrictingAuthorizationFilter should pass through requests if they don't access WebHDFS API. (#5079)
Reviewed-by: Ashutosh Gupta <ashugpt@amazon.com>
Reviewed-by: Tao Li <tomscut@apache.org>
2022-10-27 14:39:01 +09:00
slfan1989 ba77530ff4
YARN-11357. Fix FederationClientInterceptor#submitApplication Can't Update SubClusterId (#5055) 2022-10-26 16:42:22 -07:00
Bence Kosztolnik 562b693374 YARN-11356. Upgrade DataTables to 1.11.5 to fix CVEs. Contributed by Bence Kosztolnik. 2022-10-26 22:29:01 +02:00
Mehakmeet Singh fba46aa5bb
HADOOP-18499. S3A to support HTTPS web proxies (#5051)
The option "fs.s3a.proxy.ssl.enabled" controls
whether the s3a connects to a proxy over HTTP (default) or HTTPS.
Set to "true" to use HTTPS.

Contributed by Mehakmeet Singh
2022-10-26 11:45:20 +01:00
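A configuration sketch (host and port are placeholders; fs.s3a.proxy.host and fs.s3a.proxy.port are the long-standing companion options):

```java
import org.apache.hadoop.conf.Configuration;

public class S3AProxyConfig {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.set("fs.s3a.proxy.host", "proxy.example.com");   // placeholder
    conf.setInt("fs.s3a.proxy.port", 8443);               // placeholder
    // New in this change: connect to the proxy itself over HTTPS.
    conf.setBoolean("fs.s3a.proxy.ssl.enabled", true);
  }
}
```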
Wang Yu 37bff63c0f
Refactor CallerContext's constructor to eliminate duplicate code (#5070)
Reviewed-by: Tao Li <tomscut@apache.org>
Reviewed-by: Zander Xu <zanderxu@apache.org>
2022-10-26 06:40:31 +08:00
FuzzingTeam f140506d67
HADOOP-18504. Fixed an unhandled NullPointerException in class KeyProvider (#5064)
Contributed by FuzzingTeam
2022-10-25 18:07:49 +01:00
Ashutosh Gupta 0a26d84df1
HADOOP-9946. NumAllSinks metrics shows lower value than NumActiveSinks (#5002)
Reviewed-by: Akira Ajisaka <aajisaka@apache.org>
2022-10-25 17:22:25 +08:00
Ashutosh Gupta e6edbf1b4b
YARN-11338. Upgrade Junit 4 to 5 in hadoop-yarn-applications-unmanaged-am-launcher (#4994)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-10-25 17:21:12 +09:00
Ashutosh Gupta 21b7790866
YARN-11336. Upgrade Junit 4 to 5 in hadoop-yarn-applications-catalog-webapp (#4992)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-10-25 16:52:31 +09:00
slfan1989 454157a384
YARN-11345. [Federation] Refactoring Yarn Router's Application Web Page. (#5030) 2022-10-24 09:35:16 -07:00
Gautham B A 833750f72a
HADOOP-18506. Update build instructions for Windows using VS2019 (#5066) 2022-10-24 09:28:29 -07:00
PJ Fanning aac87ffe76
MAPREDUCE-7411: use secure XML parsers in mapreduce modules (#4980)
Lockdown of parsers in hadoop-mapreduce.

Follow-on to HADOOP-18469. Add secure XML parser factories to XMLUtils

Contributed by P J Fanning
2022-10-21 14:02:11 +01:00
slfan1989 d93e6f0cbb
YARN-11295. [Federation] Router Support DelegationToken in MemoryStore mode. (#5032) 2022-10-20 13:14:38 -07:00
Willi Raschkowski c4aa41aa80
HADOOP-18500. Upgrade maven-shade-plugin to 3.3.0 (#5045)
Contributed by Willi Raschkowski
2022-10-20 18:47:33 +01:00
FuzzingTeam 7f69e09290
HADOOP-18471. Fixed ArrayIndexOutOfBoundsException in DefaultStringifier (#4957)
Contributed by FuzzingTeam
2022-10-20 18:12:17 +01:00
Sneha Vijayarajan a996d889ec
HADOOP-17767. ABFS: Update test scripts (#3124)
Contributed by Sneha Vijayarajan
2022-10-20 18:07:04 +01:00
slfan1989 9adf0ca089
YARN-11342. [Federation] Refactor getNewApplication, submitApplication Use FederationActionRetry. (#5005) 2022-10-20 09:22:24 -07:00
jianghuazhu c5c00f3d2c
HDFS-16803. Improve some annotations in hdfs module. (#5031) 2022-10-20 10:58:23 +08:00
slfan1989 48b6f9f335
YARN-11328. Refactoring part of the code of SQLFederationStateStore. (#4976) 2022-10-19 16:11:28 -07:00
Viraj Jasani 8aa04b0b24
HADOOP-18189 S3APrefetchingInputStream to support status probes when closed (#5036)
Contributed by Viraj Jasani
2022-10-19 14:38:11 +01:00
Daniel Carl Jones 6207ac47e0
HADOOP-18304. Improve user-facing S3A committers documentation (#4478)
Contributed by: Daniel Carl Jones
2022-10-19 12:56:47 +05:30
Steve Loughran d80db6c9e5
HADOOP-18476. Abfs and S3A FileContext bindings to close wrapped filesystems in finalizer (#4966)
This is to try and close the underlying filesystems when the FileContext APIs are used.
Without this, threads may be leaked
2022-10-18 14:53:02 +01:00
slfan1989 ee886cacd7
YARN-11247. Remove unused classes introduced by YARN-9615. (#4720) 2022-10-18 09:40:39 -04:00
Hexiaoqiao babb050fa3
HADOOP-18497. Upgrade commons-text version to fix CVE-2022-42889. (#5037). Contributed by PJ Fanning.
Co-authored-by: He Xiaoqiao <hexiaoqiao@apache.org>
Reviewed-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2022-10-18 11:28:56 +08:00
Ankit Saurabh 2d91daab5e
HADOOP-18156. Address JavaDoc warnings in classes like MarkerTool, S3ObjectAttributes, etc (#4965)
Contributed by Ankit Saurabh
2022-10-17 18:10:47 +01:00
Ashutosh Gupta 9a8aff69ff
HDFS-6874. Add GETFILEBLOCKLOCATIONS operation to HttpFS (#4750)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
2022-10-17 18:56:15 +08:00
ZanderXu 136291d2d5
HADOOP-18462. InstrumentedWriteLock should consider Reentrant case (#4919). Contributed by ZanderXu.
Reviewed-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2022-10-17 12:44:25 +08:00
PJ Fanning 4ff6c9b8de
HADOOP-18493: upgrade jackson-databind to 2.12.7.1 (#5011). Contributed by PJ Fanning.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-10-17 10:03:10 +05:30
Szilard Nemeth b0d5182c31 YARN-10680. Revisit try blocks without catch blocks but having finally blocks. Contributed by Susheel Gupta 2022-10-15 21:51:08 +02:00
ahmarsuhail 77e551a478
HADOOP-18481. AWS v2 SDK upgrade log to not warn about standard AWS Credential Providers. (#4973)
The AWS SDKV2 upgrade log no longer warns about instantiation
of the v1 SDK credential providers which are commonly used in
s3a configurations:

* com.amazonaws.auth.EnvironmentVariableCredentialsProvider
* com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper
* com.amazonaws.auth.InstanceProfileCredentialsProvider

When the hadoop-aws module moves to the v2 SDK, references to these
credential providers will be rewritten to their v2 equivalents.

Follow-on to HADOOP-18382. "Upgrade AWS SDK to V2 - Prerequisites"

Contributed by Ahmar Suhail
2022-10-14 10:48:09 +01:00
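For context, a sketch of the kind of s3a configuration the log message concerns, listing the v1 provider classes named above through the standard fs.s3a.aws.credentials.provider option (the class wrapper is illustrative):

```java
import org.apache.hadoop.conf.Configuration;

public class S3ACredentialProviders {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // v1 SDK provider classes; these no longer trigger the upgrade warning
    // and will be rewritten to v2 equivalents when hadoop-aws moves to v2.
    conf.set("fs.s3a.aws.credentials.provider",
        "com.amazonaws.auth.EnvironmentVariableCredentialsProvider,"
        + "com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper,"
        + "com.amazonaws.auth.InstanceProfileCredentialsProvider");
  }
}
```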
slfan1989 5b52123c9d
YARN-8041. [Router] Federation: Improve Router REST API Metrics. (#4938) 2022-10-13 16:54:36 -07:00
slfan1989 1962851356
YARN-11294. [Federation] Router Support DelegationToken store/update/remove Token With MemoryStateStore. (#4915) 2022-10-13 16:52:22 -07:00
slfan1989 647457e6ab
YARN-11327. [Federation] Refactoring Yarn Router's Node Web Page. (#5009) 2022-10-13 14:05:30 -07:00
PJ Fanning bfce21ee08
YARN-11330. use secure XML parsers (#4981)
Move construction of XML parsers in YARN
modules to using the locked-down parser factory
of HADOOP-18469.

One exception: GpuDeviceInformationParser still supports DTD resolution;
all other features are disabled.

Contributed by P J Fanning
2022-10-13 18:19:19 +01:00
monthonk 9439d8e4e4
HADOOP-18292. Fix s3 select tests when running against unsupported storage class (#4489)
Follow-on from HADOOP-12020.

Contributed by Monthon Klongklaew
2022-10-13 13:13:36 +01:00
slfan1989 3ff8f58f8c
HADOOP-18360. Update commons-csv from 1.0 to 1.9.0. (#4928). Contributed by fanshilun.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-10-13 12:10:54 +05:30
slfan1989 1ff7e84caf
YARN-11334. Improve SubClusterState#fromString parameter and LogMessage. (#4988). Contributed by fanshilun.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-10-13 12:06:24 +05:30
Ashutosh Gupta d6b1e1eeb6
HDFS-16777. datatables@1.10.17 sonatype-2020-0988 vulnerability (#5003) 2022-10-12 14:51:12 -07:00
Gautham B A 5694d7e25f
Add Dockerfile_windows_10 (#4936) 2022-10-12 09:30:54 -07:00
slfan1989 d78b0b39a6
YARN-11323. [Federation] Improve ResourceManager Handler FinishApps. (#4954) 2022-10-11 14:53:02 -07:00
slfan1989 9e16f1f883
YARN-11317. [Federation] Refactoring Yarn Router's About Web Page. (#4946) 2022-10-11 13:30:48 -07:00
slfan1989 82a88a8ae6
YARN-11315. [Federation] YARN Federation Router Supports Cross-Origin. (#4934) 2022-10-11 10:18:50 -07:00
Gautham B A 2122733c30
Add .yetus/excludes.txt (#4984) 2022-10-11 09:23:34 -07:00
belugabehr 03d600fa82
HADOOP-17779: Lock File System Creator Semaphore Uninterruptibly (#3158) 2022-10-11 11:56:32 +01:00
huhaiyang d14b88c698
HDFS-16774. Improve async delete replica on datanode (#4903)
HDFS-16774. Improve async delete replica on datanode to reduce the probability of ReplicationNotFoundException

Co-authored-by: Haiyang Hu <haiyang.hu@shopee.com>
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
2022-10-11 16:22:56 +08:00
PJ Fanning 4fe079f85f
HDFS-16795. Use secure XML parsers (#4979)
Contributed by P J Fanning
2022-10-10 18:56:35 +01:00
Szilard Nemeth 0c515b0ef0 YARN-6766. Add helper method in FairSchedulerAppsBlock to print app info. Contributed by Riya Khandelwal 2022-10-10 15:30:33 +02:00
ZanderXu 62ff4e36cf
HDFS-16787. Remove the redundant lock in DataSetLockManager#removeLock (#4948). Contributed by ZanderXu.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2022-10-10 20:05:02 +08:00
ZanderXu b0b2cb4a16
HDFS-16783. Remove the redundant lock in deepCopyReplica and getFinalizedBlocks (#4942). Contributed by ZanderXu.
Reviewed-by: Mingxiang Li <liaiphag0@gmail.com>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2022-10-10 19:52:22 +08:00
Mukund Thakur be70bbb4be
HADOOP-18460. checkIfVectoredIOStopped before populating the buffers (#4986)
Contributed by Mukund Thakur
2022-10-10 11:17:45 +01:00
Steve Loughran 540a660429
HADOOP-18480. Upgrade aws sdk to 1.12.316 (#4972)
Contributed by Steve Loughran
2022-10-10 10:23:50 +01:00
Ashutosh Gupta 9a7d0e7ed0
YARN-11260. Upgrade JUnit from 4 to 5 in hadoop-yarn-server-timelineservice (#4775)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-10-10 00:28:03 +09:00
ZanderXu b0bfd09c41
HDFS-16798. SerialNumberMap should decrease current counter if the item exists. (#4987). Contributed by ZanderXu.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2022-10-09 18:58:09 +08:00
PJ Fanning 5eddec8c46
HADOOP-18468: Upgrade jettison to 1.5.1 to fix CVE-2022-40149 (#4937)
Contributed by PJ Fanning
2022-10-07 15:44:01 +01:00
Ashutosh Gupta 062c50db6b
MAPREDUCE-7370. Parallelize MultipleOutputs#close call (#4248). Contributed by Ashutosh Gupta.
Reviewed-by: Akira Ajisaka <aajisaka@apache.org>
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
2022-10-06 15:23:05 -07:00
PJ Fanning 8336b91329
HADOOP-18469. Add secure XML parser factories to XMLUtils (#4940)
Add to XMLUtils a set of methods to create secure XML Parsers/transformers, locking down DTD, schema, XXE exposure.

Use these wherever XML parsers are created.

Contributed by PJ Fanning
2022-10-06 19:30:51 +01:00
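A sketch of the general hardening pattern such factory methods apply, using standard JAXP features (not necessarily the exact XMLUtils implementation):

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

public class SecureXmlParser {
  public static DocumentBuilder newSecureDocumentBuilder() throws Exception {
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    dbf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
    // Forbid DOCTYPE declarations entirely: blocks XXE and entity bombs.
    dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
    // Belt and braces: no external entities or entity/XInclude expansion.
    dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
    dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
    dbf.setXIncludeAware(false);
    dbf.setExpandEntityReferences(false);
    return dbf.newDocumentBuilder();
  }
}
```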
slfan1989 b31b3ea0f6
YARN-11187. Remove WhiteBox in yarn module. (#4463)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-10-06 23:13:33 +09:00
Daniel Carl Jones 7ec762a5fd
HADOOP-18465. Fix S3A SSE test skip when encryption is disabled (#4925)
Contributed by Daniel Carl Jones
2022-10-06 12:42:01 +01:00
Alessandro Passaro 1675a28e5a
HADOOP-18378. Implement lazy seek in S3A prefetching. (#4955)
Make S3APrefetchingInputStream.seek() completely lazy. Calls to seek() will not affect the current buffer nor interfere with prefetching, until read() is called.

This change allows various usage patterns to benefit from prefetching, e.g. when calling readFully(position, buffer) in a loop for contiguous positions, the intermediate internal calls to seek() will be no-ops and prefetching will have the same performance as in a sequential read.

Contributed by Alessandro Passaro.
2022-10-06 12:00:41 +01:00
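A usage sketch of the readFully(position, buffer) loop the description mentions (the path argument, buffer size, and class name are illustrative):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ContiguousPositionedReads {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path path = new Path(args[0]);                 // e.g. an s3a:// path
    FileSystem fs = path.getFileSystem(conf);
    long len = fs.getFileStatus(path).getLen();
    byte[] buf = new byte[1024 * 1024];
    try (FSDataInputStream in = fs.open(path)) {
      // With lazy seek, the internal seek() between these contiguous
      // positioned reads is a no-op, so prefetching stays sequential.
      for (long pos = 0; pos + buf.length <= len; pos += buf.length) {
        in.readFully(pos, buf);
      }
    }
  }
}
```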
Steve Loughran 38b2ed2151
HADOOP-18442. Remove openstack support (#4855)
Contributed by Steve Loughran
2022-10-06 11:49:38 +01:00
slfan1989 1a9faf123d
YARN-11313. [Federation] Add SQLServer Script and Supported DB Version in Federation.md. (#4927) 2022-10-05 13:25:20 -07:00
slfan1989 a708ff96f1
YARN-11324. Fix some PBImpl classes to avoid NPE. (#4961) 2022-10-04 14:30:20 -04:00
slfan1989 874a004347
YARN-11318. Improve FederationInterceptorREST#createInterceptorForSubCluster Use WebAppUtils. (#4939) 2022-10-04 09:18:01 -07:00
slfan1989 22bd5e3b53
YARN-11238. Optimizing FederationClientInterceptor Call with Parallelism. (#4904) 2022-10-04 09:17:00 -07:00
Riya Khandelwal 07581f1ab2
YARN-6169. Improve message on empty configuration file (#4952) 2022-10-03 23:31:06 -04:00
Navink 4891bf5049
HDFS-13369. Fix for FSCK Report broken with RequestHedgingProxyProvider (#4917)
Contributed-by: navinko <nakumr@cloudera.com>
2022-09-30 23:28:12 +08:00
Mukund Thakur e22f5e75ae
HADOOP-18463. Add an integration test to process data asynchronously during vectored read. (#4921)
part of HADOOP-18103.

Contributed by: Mukund Thakur
2022-09-28 23:16:47 +05:30
slfan1989 42d883937d
YARN-11310. [Federation] Refactoring Yarn Router's Federation Web Page. (#4924) 2022-09-27 13:31:52 -07:00
slfan1989 bfd6415827
YARN-11290. Improve Query Condition of FederationStateStore#getApplicationsHomeSubCluster. (#4846) 2022-09-27 13:28:52 -07:00
Mukund Thakur 735e35d648
HADOOP-18347. S3A Vectored IO to use bounded thread pool. (#4918)
part of HADOOP-18103.

Also introducing a config fs.s3a.vectored.active.ranged.reads
to configure the maximum number of range reads a
single input stream can have active (downloading, or queued)
to the central FileSystem instance's pool of queued operations.
This stops a single stream overloading the shared thread pool.

Contributed by: Mukund Thakur
2022-09-27 21:13:07 +05:30
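A usage sketch of the vectored read API whose thread pool this change bounds (the path and ranges are illustrative; the config key is from the commit message):

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileRange;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class VectoredReadDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Cap the range reads a single stream may have active at once.
    conf.setInt("fs.s3a.vectored.active.ranged.reads", 4);
    Path path = new Path(args[0]);               // e.g. an s3a:// path
    FileSystem fs = path.getFileSystem(conf);
    List<FileRange> ranges = Arrays.asList(
        FileRange.createFileRange(0, 4096),
        FileRange.createFileRange(65536, 4096));
    try (FSDataInputStream in = fs.open(path)) {
      in.readVectored(ranges, ByteBuffer::allocate);
      for (FileRange r : ranges) {
        ByteBuffer data = r.getData().get();     // waits for that range
        System.out.println(r.getOffset() + ": " + data.remaining() + " bytes");
      }
    }
  }
}
```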
Ashutosh Gupta d9f435f6ac
HDFS-16766. XML External Entity (XXE) attacks can occur while processing XML received from an untrusted source (#4886)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-09-27 15:44:25 +09:00
slfan1989 5d20988f9f
YARN-11308. Router Page display the db username and password in mask mode. (#4908) 2022-09-26 15:54:17 -07:00
slfan1989 0e65f4cc04
YARN-11316. Fix Yarn federation.md table format. (#4935) 2022-09-26 15:47:08 -07:00
slfan1989 aeba204fa2
YARN-11306. [Federation] Refactor NM#FederationInterceptor#recover Code. (#4897) 2022-09-26 15:46:06 -07:00
Viraj Jasani 648071e197
HADOOP-18466. Limit the findbugs suppression IS2_INCONSISTENT_SYNC to S3AFileSystem field (#4926)
Follow-on to HADOOP-18455.

Contributed by Viraj Jasani
2022-09-26 18:56:58 +01:00
Ashutosh Gupta 7923cac86b
HADOOP-18443. Upgrade snakeyaml to 1.32 (#4906)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-09-25 23:49:48 +09:00
Ashutosh Gupta 603e9bd745
YARN-11271. Upgrade JUnit from 4 to 5 in hadoop-yarn-server-timelineservice-hbase-common (#4774)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-09-25 23:30:58 +09:00
Ashutosh Gupta fd0415c44a
YARN-11270. Upgrade JUnit from 4 to 5 in hadoop-yarn-server-timelineservice-hbase-client (#4773)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-09-25 23:26:06 +09:00
Xing Lin 747fb92107
HADOOP-18444 Add Support for localized trash for ViewFileSystem in Trash.moveToAppropriateTrash (#4869)
* HADOOP-18444 Add Support for localized trash for ViewFileSystem in Trash.moveToAppropriateTrash

Signed-off-by: Xing Lin <xinglin@linkedin.com>
2022-09-23 10:37:51 -07:00
Steve Loughran 0676495950
HADOOP-18456. NullPointerException in ObjectListingIterator. (#4909)
This problem surfaced in impala integration tests
   IMPALA-11592. TestLocalCatalogRetries.test_fetch_metadata_retry fails in S3 build
after the change
  HADOOP-17461. Add thread-level IOStatistics Context
The actual GC race condition came with
 HADOOP-18091. S3A auditing leaks memory through ThreadLocal references

The fix for this is, if our hypothesis is correct, in WeakReferenceMap.create()
where a strong reference to the new value is kept in a local variable
*and referred to later* so that the JVM will not GC it.

Along with the fix, extra assertions ensure that if the problem is not fixed,
applications will fail faster/more meaningfully. 

Contributed by Steve Loughran.
2022-09-23 09:54:31 +01:00
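A sketch of the fix hypothesised above, in a generic weak-valued map (class and method names are hypothetical, not the real org.apache.hadoop.util.WeakReferenceMap):

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// The newly created value is held in a strong local variable and returned,
// so the JVM cannot collect it between map insertion and first use.
class WeakValueMap<K, V> {
  private final Map<K, WeakReference<V>> map = new ConcurrentHashMap<>();
  private final Function<K, V> factory;

  WeakValueMap(Function<K, V> factory) {
    this.factory = factory;
  }

  V create(K key) {
    V value = factory.apply(key);          // strong reference in a local...
    map.put(key, new WeakReference<>(value));
    return value;                          // ...referred to later: no GC race
  }

  V get(K key) {
    WeakReference<V> ref = map.get(key);
    V value = ref == null ? null : ref.get();
    return value != null ? value : create(key);
  }
}
```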
Kidd5368 9a29075f91
HDFS-16776 Erasure Coding: The length of targets should be checked when DN gets a reconstruction task (#4901) 2022-09-23 12:27:56 +09:00
slfan1989 e526f48fa4
YARN-11296. [Federation] Fix SQLFederationStateStore#Sql script bug. (#4858) 2022-09-22 16:40:47 -07:00
PJ Fanning e6d2c336cb
HADOOP-18341: upgrade commons-configuration2 to 2.8.0 and commons-text to 1.9 (#4578)
Reviewed-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2022-09-22 09:45:20 +09:00
Viraj Jasani 084b68e380
HADOOP-18455. S3A prefetching executor should be closed (#4879)
follow-on patch to HADOOP-18186. 

Contributed by: Viraj Jasani
2022-09-22 00:22:41 +05:30
Samrat 740e1ef357
HDFS-16706. ViewFS doc points to wrong mount table name (#4803) 2022-09-21 16:55:20 +05:30
Ashutosh Gupta 917aef75fc
YARN-11255. Support loading alternative docker client config from system environment (#4884)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
2022-09-21 16:54:02 +05:30
slfan1989 4c5a7cc6fc
YARN-11307. Fix Yarn Router Broken Link. (#4905) 2022-09-20 23:10:09 -04:00
slfan1989 4d9bb81b16
HADOOP-18451. Update hsqldb.version from 2.3.4 to 2.5.2. (#4880) 2022-09-20 11:10:51 -07:00
GuoPhilipse 40ab9c8ba7
HDFS-16341. Fix BlockPlacementPolicy details in hdfs defaults. (#3691). Contributed by guophilipse.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-09-20 23:16:24 +05:30
Ashutosh Gupta 2950c5405b
HADOOP-16674. Fix when TestDNS.testRDNS can fail with ServiceUnavailableException (#4802). Contributed by Ashutosh Gupta.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-09-20 23:09:59 +05:30
slfan1989 fd687bb4c4
YARN-11305. Fix TestLogAggregationService#testLocalFileDeletionAfterUpload Failed After YARN-11241. (#4893). Contributed by fanshilun.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-09-20 23:06:24 +05:30
Ashutosh Gupta a134628d1b
YARN-11303. Upgrade jquery ui to 1.13.2 to mitigate CVE-2022-31160 (#4895) 2022-09-20 09:50:03 -07:00
Viraj Jasani 5b1657278c
HADOOP-18377. hadoop-aws build to add a -prefetch profile to run all tests with prefetching (#4914)
Contributed by Viraj Jasani
2022-09-20 10:26:13 +01:00
ZanderXu e68006cd70
HDFS-16772. refreshHostsReader should use the latest configuration (#4890) 2022-09-19 13:27:07 -07:00
slfan1989 f52b900a5f
YARN-11283. Fix Typo of NodeManager amrmproxy. (#4899) 2022-09-19 13:16:25 -07:00
slfan1989 342c4856b8
YARN-11293. [Federation] StoreNewMasterKey/removeStoredMasterKey With MemoryStateStore. (#4852) 2022-09-19 13:14:55 -07:00
GuoPhilipse 620dd37712
HADOOP-18118. [Follow on] Fix test failure in TestHttpServer (#4900)
Signed-off-by: Erik Krogen <xkrogen@apache.org>
2022-09-19 09:10:00 -07:00
Ashutosh Gupta 30c36ef25a
HADOOP-18400. Fix file split duplicating records from a succeeding split when reading BZip2 text files (#4732)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-09-19 13:45:05 +09:00
ZanderXu a73c4804d8
HADOOP-18446. [SBN read] Add a re-queue metric to RpcMetrics to quantify the number of re-queued RPCs (#4871)
Signed-off-by: Erik Krogen <xkrogen@apache.org>
Co-authored-by: zengqiang.xu <zengqiang.xu@shopee.com>
2022-09-16 10:09:01 -07:00
Ashutosh Gupta 0f03299eba
HADOOP-16769. LocalDirAllocator to provide diagnostics when file creation fails (#4842)
The patch provides detailed diagnostics of file creation failure in LocalDirAllocator.

Contributed by: Ashutosh Gupta
2022-09-16 13:17:26 +05:30
ZanderXu 43c1ebae16
HDFS-16771. Follow-up for HDFS-16659: JN should tersely print logs about NewerTxnIdException (#4882)
Signed-off-by: Erik Krogen <xkrogen@apache.org>
Co-authored-by: zengqiang.xu <zengqiang.xu@shopee.com>
2022-09-15 12:44:36 -07:00
Ashutosh Gupta 59d3c20118
MAPREDUCE-7407. Avoid stopContainer() on dead node (#4779) 2022-09-15 10:30:36 -07:00
Simbarashe Dzinamarira 6422eaf301
HDFS-16767: RBF: Support observer node in Router-Based Federation.
Fixes #4127

Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
2022-09-14 16:00:14 -07:00
GuoPhilipse ce54b7e55d
HADOOP-18118. Fix KMS Accept Queue Size default value to 500 (#3972) 2022-09-15 00:53:25 +08:00
Colm O hEigeartaigh 272844ee57
HADOOP-15072 - Update Apache Kerby to 2.0.2 (#4473) 2022-09-15 00:43:25 +08:00
slfan1989 86b84ed74e
HADOOP-18452. Fix TestKMS#testKMSHAZooKeeperDelegationToken Failed By Hadoop-18427. (#4885) 2022-09-14 09:13:58 -07:00
Renukaprasad C 8ce71c8882
[Documentation] RBF: Duplicate statement to be removed for better readability (#4881) 2022-09-13 09:24:05 -07:00
slfan1989 88545a875d
YARN-11301. Fix NoClassDefFoundError: org/junit/platform/launcher/core/LauncherFactory after YARN-11269. (#4878)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-09-13 17:08:16 +09:00
Ashutosh Gupta 55d8a91b2c
YARN-11261. Upgrade JUnit from 4 to 5 in hadoop-yarn-server-web-proxy (#4777)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-09-13 17:05:24 +09:00
slfan1989 3ce353395b
YARN-7614. [RESERVATION] Support ListReservation APIs in Federation Router. (#4843) 2022-09-12 12:33:21 -07:00
Ashutosh Gupta 65a027b112
YARN-11241. Add uncleaning option for local app log file with log-aggregation enabled (#4703)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-09-12 23:32:20 +09:00
slfan1989 cde1f3af21
HADOOP-18302. Remove WhiteBox in hadoop-common module. (#4457)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-09-12 23:28:16 +09:00
caozhiqiang 1923096adb
Allow block reconstruction pending timeout to be refreshable (#4567)
Reviewed-by: Hiroyuki Adachi <hadachi@yahoo-corp.jp>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2022-09-12 11:45:01 +09:00
Ashutosh Gupta 21bae31d58
YARN-11265. Upgrade JUnit from 4 to 5 in hadoop-yarn-server-sharedcachemanager (#4772)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-09-11 03:31:28 +09:00
slfan1989 cdcb448b78
YARN-11286. Make AsyncDispatcher#printEventDetailsExecutor thread pool parameter configurable. (#4824). Contributed by fanshilun.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-09-10 23:00:50 +05:30
slfan1989 b2760520c3
YARN-11274. Improve Nodemanager#NodeStatusUpdaterImpl Log. (#4783). Contributed by fanshilun.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-09-10 22:57:06 +05:30
Simbarashe Dzinamarira e77d54d1ee
HDFS-13522: Add federated nameservices states to client protocol and propagate it between routers and clients.
Fixes #4311

Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
2022-09-09 15:43:28 -07:00
slfan1989 e76ffbf102
YARN-11297. [Federation] Improve Yarn Router Reservation Submission Code. (#4863) 2022-09-09 10:39:00 -07:00
Mukund Thakur 8732625f50
HADOOP-18439. Fix VectoredIO for LocalFileSystem when checksum is enabled. (#4862)
part of HADOOP-18103.

While merging ranges in ChecksumFs, they are rounded up based on the
checksum chunk size, which can push some ranges past the EOF; those
ranges must be clamped, or the actual reads fail with EOFException.

Contributed By: Mukund Thakur
2022-09-09 21:46:08 +05:30
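A sketch of the clamping logic this fix implies; the names below are illustrative stand-ins, not the actual range-merging code in ChecksumFileSystem:

/** Illustrative sketch of clamping checksum-rounded ranges to EOF. */
final class RangeClampSketch {
  /**
   * Round an (offset, length) read range out to checksum-chunk
   * boundaries, then clamp the end to the file length so the
   * expanded range can never cross EOF.
   * @return {start, length} of the adjusted range
   */
  static long[] roundAndClamp(long offset, long length,
                              long chunkSize, long fileLength) {
    long start = (offset / chunkSize) * chunkSize;             // round down
    long end = ((offset + length + chunkSize - 1) / chunkSize)
        * chunkSize;                                           // round up
    end = Math.min(end, fileLength);                           // clamp to EOF
    return new long[] {start, end - start};
  }
}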
9uapaw 5b85af87f0 YARN-11278. Fixed Ambiguous error message in mutation API. Contributed by Ashutosh Gupta. 2022-09-09 14:38:41 +02:00
Viraj Jasani 56387cce57
HADOOP-18186. s3a prefetching to use SemaphoredDelegatingExecutor for submitting work (#4796)
Contributed by Viraj Jasani
2022-09-09 11:32:20 +01:00
ZanderXu 4a01fadb94
HDFS-16756. RBF proxies the client's user by the login user to enable CacheEntry (#4853). Contributed by ZanderXu.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-09-09 12:47:36 +05:30
slfan1989 29c4d8d8f7
YARN-11298. Improve Yarn Router Junit Test Close MockRM. (#4870) 2022-09-08 11:42:36 -07:00
slfan1989 0db3ee5b4b
HADOOP-18427. Improve ZKDelegationTokenSecretManager#startThead With recommended methods. (#4812) 2022-09-08 11:41:21 -07:00
Mehakmeet Singh 03961b10c2
HADOOP-18416. fix ITestS3AIOStatisticsContext test failure (#4806)
Follow on to HADOOP-17461.

Contributed by: Mehakmeet Singh
2022-09-08 21:03:18 +05:30
Ashutosh Gupta 832d0e0d76
HADOOP-18443. Upgrade snakeyaml to 1.31 to mitigate CVE-2022-25857 (#4856)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Brahma Reddy Battula <brahma@apache.org>
2022-09-08 19:58:38 +05:30
PJ Fanning 42c8f61fec
HADOOP-18441. Remove hadoop custom ServicesResourceTransformer (#4850). Contributed by PJ Fanning.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-09-07 17:11:12 +05:30
Erik Krogen c664f953c9
HADOOP-18426. Use weighted calculation for MutableStat mean/variance to fix accuracy. (#4844). Contributed by Erik Krogen.
Co-authored-by: Shuyan Zhang <zqingchai@gmail.com>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2022-09-07 13:49:56 +08:00
Ayush Saxena cc41ad63f9
HADOOP-18388. Allow dynamic groupSearchFilter in LdapGroupsMapping. (#4798)
* HADOOP-18388. Allow dynamic groupSearchFilter in LdapGroupsMapping.
2022-09-06 18:38:51 -04:00
ZanderXu c947c326e8
HDFS-16659. JournalNode should throw NewerTxnIdException when SinceTxId is bigger than HighestWrittenTxId (#4560)
Co-authored-by: Zander Xu <zengqiang.xu@shopee.com>
Signed-off-by: Erik Krogen <xkrogen@apache.org>
2022-09-06 10:12:55 -07:00
Sumangala Patki 7bcf853ff4
HADOOP-17873. ABFS: Fix transient failures in ITestAbfsStreamStatistics and ITestAbfsRestOperationException (#3699)
Successor for the reverted PR #3341, using the hadoop @VisibleForTesting attribute

Contributed by Sumangala Patki
2022-09-06 11:00:52 +01:00
ZanderXu be4c638e4c
HDFS-16748. RBF: DFSClient should uniquely identify writing files by namespace id and iNodeId via RBF (#4813). Contributed by ZanderXu.
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-09-05 17:46:15 +05:30
ZanderXu ac42519ade
HDFS-16593. Correct the BlocksRemoved metric on DataNode side (#4353). Contributed by ZanderXu.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2022-09-05 19:35:48 +08:00
slfan1989 7bf95d7949
YARN-11289. [Federation] Improve NM FederationInterceptor removeAppFromRegistry. (#4836) 2022-09-02 10:41:31 -07:00
slfan1989 1965708d49
YARN-11273. Federation StateStore: Support storage/retrieval of Reservations With SQL. (#4817) 2022-09-02 10:39:58 -07:00
slfan1989 b266f852d7
YARN-11284. [Federation] Improve UnmanagedAMPoolManager WithoutBlock ServiceStop (#4814) 2022-09-02 10:28:38 -07:00
slfan1989 3a96de7756
YARN-6667. Handle containerId duplicate without failing the heartbeat in Federation Interceptor. (#4810) 2022-09-02 10:25:26 -07:00
ZanderXu 7b239a80fe
HDFS-16750. NameNode should use NameNode.getRemoteUser() to log audit event to avoid possible NPE (#4821) 2022-09-02 10:23:03 -07:00
sreeb-msft c48ed3e96c
HADOOP-18408. ABFS: ITestAbfsManifestCommitProtocol fails on nonHNS configuration (#4758)
ITestAbfsManifestCommitProtocol  to set requireRenameResilience to false for nonHNS configuration  (#4758)

Contributed by Sree Bhattacharyya
2022-09-02 12:33:12 +01:00
slfan1989 37e213c3fc
YARN-11177. Support getNewReservation, submit / update/ Reservation API's for Federation. (#4764) 2022-09-01 16:35:20 -07:00
monthonk 20560401ec
HADOOP-18339. S3A storage class option only picked up when buffering writes to disk. (#4669)
Follow-up to HADOOP-12020 Support configuration of different S3 storage classes; 
S3 storage class is now set when buffering to heap/bytebuffers, and when
creating directory markers

Contributed by Monthon Klongklaew
2022-09-01 18:14:32 +01:00
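For illustration, a hedged sketch of opting in to a storage class; the option name fs.s3a.create.storage.class follows the S3A documentation for HADOOP-12020, and the value shown is just one example:

import org.apache.hadoop.conf.Configuration;

public class S3AStorageClassExample {
  public static Configuration withStorageClass() {
    Configuration conf = new Configuration();
    // After this fix the storage class is applied for all
    // fs.s3a.fast.upload.buffer modes, not only buffering to disk.
    conf.set("fs.s3a.create.storage.class", "intelligent_tiering");
    return conf;
  }
}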
Steve Vaughan 2dd8b1342e
HDFS-16755. TestQJMWithFaults.testUnresolvableHostName() can fail due to unexpected host resolution (#4833)
Use ".invalid" domain from IETF RFC 2606 to ensure that the host doesn't resolve.

Contributed by Steve Vaughan Jr
2022-09-01 14:00:15 +01:00
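A small sketch of why the RFC 2606 reserved domain works for this kind of test; the hostname below is made up:

import java.net.InetAddress;
import java.net.UnknownHostException;

public class UnresolvableHostCheck {
  public static void main(String[] args) {
    try {
      // RFC 2606 reserves ".invalid", so a compliant resolver
      // must never return an address for it.
      InetAddress.getByName("some-host.invalid");
      System.out.println("unexpectedly resolved");
    } catch (UnknownHostException expected) {
      System.out.println("did not resolve, as the test requires");
    }
  }
}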
slfan1989 33edbed54c
YARN-11272. Federation StateStore: Support storage/retrieval of Reservations With Zk. (#4781) 2022-08-31 10:15:15 -07:00
Mukund Thakur 19830c98bc
HADOOP-18391. Improvements in VectoredReadUtils#readVectored() for direct buffers (#4787)
part of HADOOP-18103.

Contributed By: Mukund Thakur
2022-08-31 21:41:41 +05:30
9uapaw 84081a8cae MAPREDUCE-7409. Make shuffle key length configurable. Contributed by Ashutosh Gupta. 2022-08-31 17:32:51 +02:00
Steve Loughran c69e16b297
HADOOP-18410. S3AInputStream.unbuffer() does not release http connections (#4766)
HADOOP-16202 "Enhance openFile()" added asynchronous draining of the 
remaining bytes of an S3 HTTP input stream for those operations
(unbuffer, seek) where it could avoid blocking the active
thread.

This patch fixes the asynchronous stream draining to work and so
return the stream back to the http pool. Without this, whenever
unbuffer() or seek() was called on a stream and an asynchronous
drain triggered, the connection was not returned; eventually
the pool would be empty and subsequent S3 requests would
fail with the message "Timeout waiting for connection from pool"

The root cause was that, even though the values passed in to drain() were
copied into method parameters, the lambda expression passed to submit()
still captured the instance fields directly

operation = client.submit(
 () -> drain(uri, streamStatistics,
       false, reason, remaining,
       object, wrappedStream));  /* here */

Those fields were only read during the async execution, by which point
they could already have been set to null (or reassigned by a subsequent read).

A new SDKStreamDrainer class performs the draining; this is a Callable
and can be submitted directly to the executor pool.

The class is used in both the classic and prefetching s3a input streams.

Also, calling unbuffer() switches the S3AInputStream from adaptive
to random IO mode; that is, it is considered a cue that future
IO will not be sequential, whole-file reads.

Contributed by Steve Loughran.
2022-08-31 11:16:52 +01:00
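A minimal sketch of the capture hazard described above, with made-up names; the real fix replaced the lambda with the SDKStreamDrainer class rather than the local-copy variant shown here:

import java.util.concurrent.CompletableFuture;

class DrainCaptureSketch {
  private byte[] buffer = new byte[1024]; // nulled out by close()

  CompletableFuture<Void> drainBuggy() {
    // BUG: the lambda reads this.buffer when it eventually runs;
    // by then close() may already have set the field to null.
    return CompletableFuture.runAsync(() -> release(buffer));
  }

  CompletableFuture<Void> drainFixed() {
    // FIX: copy the field into an (effectively final) local; the
    // lambda captures the copy, which no later close() can null.
    final byte[] toDrain = buffer;
    return CompletableFuture.runAsync(() -> release(toDrain));
  }

  private static void release(byte[] data) {
    // stand-in for returning the stream to the http pool
  }
}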
Gautham B A c334ba89ad
HADOOP-18428. Parameterize platform toolset version (#4815)
* This PR adds an option,
  use.platformToolsetVersion, that
  makes the build system use
  the given platform toolset version.
* This also makes sure that
  win-vs-upgrade.cmd does not get
  executed when the
  use.platformToolsetVersion
  option is specified.
2022-08-30 22:41:03 +05:30
slfan1989 8a47ed6f84
YARN-11287. Fix NoClassDefFoundError: org/junit/platform/launcher/core/LauncherFactory after YARN-10793 (#4828)
Co-authored-by: slfan1989 <louj1988@@>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-08-30 20:41:04 +09:00
Masatake Iwasaki 22835be63d
HADOOP-18375. Fix failure of shelltest for hadoop_add_ldlibpath. (#4652) 2022-08-30 19:33:29 +09:00
Ashutosh Gupta 90dba8b614
YARN-11245. Upgrade JUnit from 4 to 5 in hadoop-yarn-csi (#4778)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-08-30 17:26:06 +09:00
Samrat 2c05015716
YARN-11196. NUMA support in DefaultContainerExecutor (#4742) 2022-08-30 10:39:41 +05:30
zhangshuyan0 71778a6cc5
HDFS-16735. Reduce the number of HeartbeatManager loops. (#4780). Contributed by Shuyan Zhang.
Signed-off-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2022-08-29 11:30:21 +08:00
slfan1989 c60a900583
YARN-11275. [Federation] Add batchFinishApplicationMaster in UAMPoolManager. (#4792) 2022-08-27 10:17:00 -07:00
slfan1989 0075ef15c2
YARN-8482. [Router] Add cache for fast answers to getApps. (#4769) 2022-08-27 10:14:55 -07:00
slfan1989 4031b0774e
YARN-11253. Add Configuration to delegationToken RemoverScanInterval. (#4751) 2022-08-27 10:02:59 -07:00
ZanderXu 5567154f71
HDFS-16734. RBF: fix some bugs when handling getContentSummary RPC (#4763) 2022-08-26 16:04:33 -07:00
slfan1989 f8b9dd911c
YARN-11219. [Federation] Add getAppActivities, getAppStatistics REST APIs for Router. (#4757) 2022-08-26 16:01:17 -07:00
Gautham B A 5736b34b2a
HDFS-16736. Link to Boost library in libhdfspp (#4782) 2022-08-26 09:11:44 -07:00
zhengchenyu 231a4468cd
HDFS-16732. [SBN READ] Avoid get location from observer when the block report is delayed (#4756)
Signed-off-by: Erik Krogen <xkrogen@apache.org>
2022-08-25 10:37:25 -07:00
ahmarsuhail 7fb9c306e2
HADOOP-18382. AWS SDK v2 upgrade prerequisites (#4698)
This patch prepares the hadoop-aws module for a future
migration to using the v2 AWS SDK (HADOOP-18073)

That upgrade will be incompatible; this patch prepares
for it:
-marks some credential providers and other 
 classes and methods as @deprecated.
-updates site documentation
-reduces the visibility of the s3 client;
 other than for testing, it is kept private to
 the S3AFileSystem class.
-logs some warnings when deprecated APIs are used.

The warning messages are printed only once
per JVM's life. To disable them, set the
log level of org.apache.hadoop.fs.s3a.SDKV2Upgrade
to ERROR
 
Contributed by Ahmar Suhail
2022-08-25 17:36:48 +01:00
ZanderXu 1691cccc89
HDFS-16738. Invalid CallerContext caused NullPointerException (#4791) 2022-08-25 17:12:27 +08:00
Ayush Saxena 880686d1e3
Revert "HADOOP-18417. Upgrade to M7 of surefire plugin (#4795)"
This reverts commit 1ff121041c.
2022-08-25 03:44:49 +05:30
ZanderXu 8d4f51c432
HDFS-16728. RBF throw IndexOutOfBoundsException with disableNameServices (#4734). Contributed by ZanderXu.
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-08-24 20:27:15 +05:30
slfan1989 75aff247ae
YARN-11240. Fix incorrect placeholder in yarn-module. (#4678). Contributed by fanshilun
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-08-24 20:06:36 +05:30
slfan1989 052d7f286e
HADOOP-18361. Update commons-net from 3.6 to 3.8.0. (#4683). Contributed by fanshilun.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-08-24 20:05:17 +05:30
Steve Loughran de37fd37d6
MAPREDUCE-7403. manifest-committer dynamic partitioning support. (#4728)
Declares its compatibility with Spark's dynamic
output partitioning by having the stream capability
"mapreduce.job.committer.dynamic.partitioning"

Requires a Spark release with SPARK-40034, which
does the probing before deciding whether to 
accept/rejecting instantiation with
dynamic partition overwrite set

This feature can be declared as supported by
any other PathOutputCommitter implementations
whose algorithm and destination filesystem
are compatible.

None of the S3A committers are compatible.

The classic FileOutputCommitter is, but it
does not declare itself as such out of our fear
of changing that code. The Spark-side code
will automatically infer compatibility if
the created committer is of that class or
a subclass.

Contributed by Steve Loughran.
2022-08-24 11:18:19 +01:00
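A sketch of the capability probe a caller might perform, assuming only what the message above states (the capability string and the standard StreamCapabilities probe):

import org.apache.hadoop.fs.StreamCapabilities;
import org.apache.hadoop.mapreduce.lib.output.PathOutputCommitter;

final class DynamicPartitioningProbe {
  static final String CAPABILITY =
      "mapreduce.job.committer.dynamic.partitioning";

  static boolean supportsDynamicPartitioning(PathOutputCommitter committer) {
    // A committer declares support by implementing StreamCapabilities
    // and answering true for the capability string above.
    return committer instanceof StreamCapabilities
        && ((StreamCapabilities) committer).hasCapability(CAPABILITY);
  }
}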
Steve Vaughan 1ff121041c
HADOOP-18417. Upgrade to M7 of surefire plugin (#4795)
This addresses an issue where the plugin's default classpath for executing tests fails to include org.junit.platform.launcher.core.LauncherFactory.

Contributed by: Steve Vaughan Jr
2022-08-24 11:04:04 +01:00
Simba Dzinamarira 4890ba5052
HADOOP-18406: Adds alignment context to call path for creating RPC proxy with multiple connections per user.
Fixes #4748

Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
2022-08-23 17:00:57 -07:00
ZanderXu c37f01d95b
HDFS-16724. RBF should support get the information about ancestor mount points (#4719) 2022-08-23 13:25:42 -07:00
Simba Dzinamarira a3b1bafa34
HDFS-16669: Enhance client protocol to propagate last seen state IDs for multiple nameservices.
Fixes #4584

Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
2022-08-23 11:12:50 -07:00
Steve Vaughan 6fbc38db95
HDFS-16686. GetJournalEditServlet fails to authorize valid Kerberos request (#4724) 2022-08-23 08:03:57 -07:00
ZanderXu 183f09b1da
HDFS-16717. Replace NPE with IOException in DataNode.class (#4699). Contributed by ZanderXu.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2022-08-23 18:17:32 +08:00
Viraj Jasani c249db80c2
HADOOP-18380. fs.s3a.prefetch.block.size to be read through longBytesOption (#4762)
Contributed by Viraj Jasani.
2022-08-23 10:49:04 +01:00
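What reading through longBytesOption buys, sketched here with Configuration.getLongBytes(); the 8 MB default is illustrative, not the documented default:

import org.apache.hadoop.conf.Configuration;

public class PrefetchBlockSizeExample {
  public static long readBlockSize(Configuration conf) {
    // getLongBytes() accepts suffixed values such as "8M" or "128K"
    // and returns the size in bytes.
    return conf.getLongBytes("fs.s3a.prefetch.block.size",
        8 * 1024 * 1024);
  }
}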
slfan1989 eda4bb5dcd
YARN-11250. Capture the Performance Metrics of ZookeeperFederationStateStore. (#4738) 2022-08-22 14:09:20 -07:00
Steve Vaughan 17daad34d4
HADOOP-18279. Cancel fileMonitoringTimer even if trustManager isn't defined (#4767) 2022-08-22 12:22:23 -07:00
Mukund Thakur 231e095802
HADOOP-18407. Improve readVectored() api spec (#4760)
part of HADOOP-18103.

Contributed By: Mukund Thakur
2022-08-22 23:19:29 +05:30
Steve Vaughan a9e5fb3313
HDFS-16684. Exclude the current JournalNode (#4723)
Exclude bound local addresses, including the use of a wildcard address in the bound host configurations.
* Allow sync attempts with unresolved addresses
* Update the comments.
* Remove unused import

Signed-off-by: stack <stack@apache.org>
2022-08-22 09:52:45 -07:00
Ashutosh Gupta c294a414b9
YARN-9425. Make initialDelay configurable for FederationStateStoreService#scheduledExecutorService (#4731). Contributed by groot and Shen Yinjie.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-08-22 03:40:00 +05:30
jianghuazhu 7f176d080c
HDFS-16729. RBF: fix some unreasonably annotated docs. (#4745)
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-08-21 07:29:31 +09:00
Clara Fang c870171182
YARN-11254. hadoop-minikdc dependency duplicated in hadoop-yarn-server-nodemanager (#4755)
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-08-21 07:09:42 +09:00
Ashutosh Gupta b253b3be9f
YARN-11269. Upgrade JUnit from 4 to 5 in hadoop-yarn-server-timeline-pluginstorage (#4771)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-08-21 06:52:23 +09:00
slfan1989 f75c58a1ca
YARN-11252. Yarn Federation Router Supports Update / Delete Reservation in MemoryStore. (#4741) 2022-08-18 21:13:43 -07:00
Viraj Jasani 7f030250b4
HADOOP-18403. Fix FileSystem leak in ITestS3AAWSCredentialsProvider (#4737)
Contributed By: Viraj Jasani
2022-08-19 04:14:43 +05:30
Steve Vaughan b7d4dc61bf
HADOOP-18365. Update the remote address when a change is detected (#4692)
Avoid reconnecting to the old address after detecting that the address has been updated.

* Fix Checkstyle line length violation
* Keep ConnectionId as Immutable for map key

The ConnectionId is used as a key in the connections map, and updating the remoteId caused problems with the cleanup of connections when the removeMethod was used.

Instead of updating the address within the remoteId, use the removeMethod to cleanup references to the current identifier and then replace it with a new identifier using the updated address.

* Use final to protect immutable ConnectionId

Mark non-test fields as private and final, and add a missing accessor.

* Use a stable hashCode to allow safe IP addr changes
* Add test that updated address is used

Once the address has been updated, it should be used in future calls.  Check to ensure that a second request succeeds and that it uses the existing updated address instead of having to re-resolve.

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Signed-off-by: sokui
Signed-off-by: XanderZu
Signed-off-by: stack <stack@apache.org>
2022-08-18 09:21:23 -07:00
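A sketch of the stable-key idea from the commit above; this is an illustrative class, not Hadoop's actual ConnectionId:

import java.net.InetSocketAddress;
import java.util.Objects;

final class ConnectionKey {
  private final String host;   // logical host, never re-resolved here
  private final int port;
  private final String protocol;

  ConnectionKey(String host, int port, String protocol) {
    this.host = host;
    this.port = port;
    this.protocol = protocol;
  }

  /** Resolve fresh on demand instead of caching a stale address. */
  InetSocketAddress currentAddress() {
    return new InetSocketAddress(host, port);
  }

  // hashCode()/equals() use only immutable fields, so the key stays
  // stable in the connections map even when the IP address changes.
  @Override
  public int hashCode() {
    return Objects.hash(host, port, protocol);
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof ConnectionKey)) {
      return false;
    }
    ConnectionKey other = (ConnectionKey) o;
    return port == other.port
        && host.equals(other.host)
        && protocol.equals(other.protocol);
  }
}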
Ashutosh Gupta d09dd4a0b9
HADOOP-18385. ITestS3ACannedACLs failure; fixed by adding in a span (#4736)
Contributed by Ashutosh Gupta
2022-08-18 13:57:43 +01:00
Steve Loughran 682931a6ac
HADOOP-18028. High performance S3A input stream (#4752)
This is the preview release of the HADOOP-18028 S3A performance input stream.
It is still stabilizing, but ready to test.

Contains

HADOOP-18028. High performance S3A input stream (#4109)
	Contributed by Bhalchandra Pandit.

HADOOP-18180. Replace use of twitter util-core with java futures (#4115)
	Contributed by PJ Fanning.

HADOOP-18177. Document prefetching architecture. (#4205)
	Contributed by Ahmar Suhail

HADOOP-18175. fix test failures with prefetching s3a input stream (#4212)
 Contributed by Monthon Klongklaew

HADOOP-18231.  S3A prefetching: fix failing tests & drain stream async.  (#4386)

	* adds in new test for prefetching input stream
	* creates streamStats before opening stream
	* updates numBlocks calculation method
	* fixes ITestS3AOpenCost.testOpenFileLongerLength
	* drains stream async
	* fixes failing unit test

	Contributed by Ahmar Suhail

HADOOP-18254. Disable S3A prefetching by default. (#4469)
	Contributed by Ahmar Suhail

HADOOP-18190. Collect IOStatistics during S3A prefetching (#4458)

	This adds iOStatisticsConnection to the S3PrefetchingInputStream class, with
	new statistic names in StreamStatistics.

	This stream is not (yet) IOStatisticsContext aware.

	Contributed by Ahmar Suhail

HADOOP-18379 rebase feature/HADOOP-18028-s3a-prefetch to trunk
HADOOP-18187. Convert s3a prefetching to use JavaDoc for fields and enums.
HADOOP-18318. Update class names to be clear they belong to S3A prefetching
	Contributed by Steve Loughran
2022-08-18 13:53:06 +01:00
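A hedged sketch of opting in to the preview stream; the option names below follow the JIRAs listed above (HADOOP-18254 added the enabled switch) but should be verified against the release:

import org.apache.hadoop.conf.Configuration;

public class S3APrefetchOptIn {
  public static Configuration enablePrefetching() {
    Configuration conf = new Configuration();
    // Disabled by default per HADOOP-18254; opt in explicitly.
    conf.setBoolean("fs.s3a.prefetch.enabled", true);
    // Illustrative tuning values, not recommendations.
    conf.set("fs.s3a.prefetch.block.size", "8M");
    conf.setInt("fs.s3a.prefetch.block.count", 8);
    return conf;
  }
}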
slfan1989 cd72f7e042
YARN-11224. [Federation] Add getAppQueue, updateAppQueue REST APIs for Router. (#4747) 2022-08-17 13:13:07 -07:00
Steve Vaughan e40b3a3089
HDFS-4043. Namenode Kerberos Login does not use proper hostname for host qualified hdfs principal name. (#4693) 2022-08-17 12:03:33 -07:00
Ashutosh Gupta 5cc8c574d1
HDFS-16676. DatanodeAdminManager$Monitor reports a node as invalid continuously (#4626)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
2022-08-18 02:25:09 +08:00
Ashutosh Gupta 86abeb401e
HDFS-16730. Update the doc that append to EC files is supported (#4749)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
2022-08-17 05:41:41 +08:00
Ashutosh Gupta f02ff1afe2
YARN-11248. Add unit test for FINISHED_CONTAINERS_PULLED_BY_AM event on DECOMMISSIONING (#4721)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-08-16 19:07:05 +09:00
Viraj Jasani d55d76e1e2
HADOOP-18371. S3A FS init to log at debug when fs.s3a.create.storage.class is unset (#4730)
Contributed By: Viraj Jasani
2022-08-16 03:55:58 +05:30
xuzq 622ca0d51f
HDFS-16705. RBF: support configurable healthMonitor timeout and cache the NN and client proxies in NamenodeHeartbeatService (#4662) 2022-08-15 13:55:16 -07:00
slfan1989 eff3b8c59a
YARN-10885. Make FederationStateStoreFacade#getApplicationHomeSubCluster use JCache. (#4701) 2022-08-15 13:46:40 -07:00
xuzq b1d4af2492
HDFS-16704. Datanode return empty response instead of NPE for GetVolumeInfo during restarting (#4661). Contributed by ZanderXu.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2022-08-15 20:57:27 +08:00
Hui Fei 86cc96c493
Revert "HDFS-16689. NameNode may crash when transitioning to Active with in-progress tailer if there are some abnormal JNs. (#4628)" (#4743) 2022-08-15 19:48:05 +08:00
Steve Loughran 906ae5138e
HADOOP-18402. S3A committer NPE in spark job abort (#4735)
JobID.toString() and TaskID.toString() to only be called
when the IDs are not null.

This doesn't surface in MapReduce, but Spark SQL can trigger it
during job abort, where abortJob() may be invoked with an
incomplete TaskContext.

This patch MUST be applied to branches containing
HADOOP-17833. "Improve Magic Committer Performance."

Contributed by Steve Loughran.
2022-08-15 11:20:59 +01:00
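The guard itself is simple; a sketch with made-up fallback strings, not the committer's actual message text:

import org.apache.hadoop.mapreduce.JobID;
import org.apache.hadoop.mapreduce.TaskID;

final class IdTextSketch {
  static String jobIdText(JobID jobId) {
    // Spark SQL may abort with an incomplete TaskContext,
    // so the IDs can legitimately be null here.
    return jobId == null ? "(no job id)" : jobId.toString();
  }

  static String taskIdText(TaskID taskId) {
    return taskId == null ? "(no task id)" : taskId.toString();
  }
}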
Steve Loughran eee59a8372
Revert "HADOOP-18402. S3A committer NPE in spark job abort (#4735)"
(managed to commit through the github ui before I'd got the message done)

This reverts commit ad83e95046.
2022-08-15 11:20:36 +01:00
Steve Loughran ad83e95046
HADOOP-18402. S3A committer NPE in spark job abort (#4735)
jobId.toString() to only be called when the ID isn't null.

this doesn't surface in MR, but spark seems to manage it

Change-Id: I06692ef30a4af510c660d7222292932a8d4b5147
2022-08-15 11:18:47 +01:00
slfan1989 ab88e4b65d
YARN-11223. [Federation] Add getAppPriority, updateApplicationPriority REST APIs for Router. (#4733) 2022-08-14 19:22:16 -07:00
Paul King d0fdb1d6e0
HADOOP-18404. Fix broken link to wiki help page in org.apache.hadoop.util.Shell (#4718). Contributed by Paul King.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-08-14 19:28:22 +05:30
slfan1989 d383cc4525
YARN-11236. Implement FederationReservationHomeSubClusterStore With MemoryStore. (#4711) 2022-08-13 10:37:20 -07:00
zhengchenyu 9f6bbc90a8
YARN-11148. In federation and security mode, nm recover may fail. (#4308) 2022-08-13 10:33:16 -07:00
kevins-29 b737869e01
HADOOP-18383. Codecs with @DoNotPool annotation are not closed causing memory leak (#4585) 2022-08-12 16:05:13 -07:00
xuzq e0c8c6eed4
HDFS-16678. RBF should support disabling getNodeUsage() in RBFMetrics (#4606) 2022-08-12 12:01:58 -07:00
xuzq 521e65acfe
HDFS-16723. Replace incorrect SafeModeException with StandbyException in RouterRpcServer.class (#4716). Contributed by ZanderXu.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-08-13 00:21:11 +05:30
Viraj Jasani 8c9533a0f8
HADOOP-18397. Shutdown AWSSecurityTokenService when its resources are no longer in use (#4722)
Contributed by Viraj Jasani.
2022-08-12 11:59:15 +01:00
xuzq 59619ad247
HDFS-16689. NameNode may crash when transitioning to Active with in-progress tailer if there are some abnormal JNs. (#4628) 2022-08-12 12:19:28 +08:00
Steve Vaughan 2005582a28
HDFS-16702. MiniDFSCluster should report cause of exception in assert error (#4680)
When the MiniDFSCluster detects that an exception caused an exit, it should include that exception as the cause of the AssertionError that it throws. The current AssertionError simply reports the message "Test resulted in an unexpected exit" and provides a stack trace to the location of the check for an exit exception. This patch adds the original exception as the cause of the AssertionError.
2022-08-11 13:52:39 -07:00
slfan1989 6ca2d3f848
YARN-6539. Create SecureLogin inside Router. (#4712) 2022-08-11 13:25:51 -07:00
xuzq 09cabaad68
HDFS-13274. RBF: Extend RouterRpcClient to use multiple sockets (#4531) 2022-08-11 13:23:32 -07:00
Mukund Thakur b28e4c6904
HADOOP-18392. Propagate vectored s3a input stream stats to file system stats. (#4704)
part of HADOOP-18103.

Contributed By: Mukund Thakur
2022-08-12 01:42:00 +05:30
huaxiangsun e9509ac467
HADOOP-18340. deleteOnExit does not work with S3AFileSystem (#4608)
Contributed by Huaxiang Sun
2022-08-11 20:25:13 +01:00
Yubi Lee c0bbdca97e
HADOOP-18398. Prevent AvroRecord*.class from being included non-test jar (#4727)
Contributed by Yubi Lee.
2022-08-11 20:12:41 +01:00
slfan1989 133e8aabf0
YARN-11227. [Federation] Add getAppTimeout, getAppTimeouts, updateApplicationTimeout REST APIs for Router. (#4715) 2022-08-10 14:53:46 -07:00
slfan1989 ffa9ed93a4
YARN-6572. Refactoring Router services to use common util classes for pipeline creations. (#4594) 2022-08-09 14:44:29 -07:00
Ashutosh Gupta 92abd99450
YARN-11237. Fix Bug while disabling proxy failover with Federation (#4658) 2022-08-08 13:29:27 -07:00
slfan1989 977f4b6165
MAPREDUCE-7385. Improve JobEndNotifier#httpNotification With recommended methods. (#4403). Contributed by fanshilun.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-08-09 00:59:03 +05:30
xuzq 895f7c51fd
HDFS-16709. Remove redundant cast in FSEditLogOp.class (#4667). Contributed by ZanderXu.
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-08-09 00:54:01 +05:30
Viraj Jasani 06f0f7db79
HADOOP-18373. IOStatisticsContext tuning (#4705)
The name of the option to enable/disable thread level statistics is
"fs.iostatistics.thread.level.enabled".

There is also an enabled() probe in IOStatisticsContext which can
be used to see whether thread level statistics are active.

Contributed by Viraj Jasani
2022-08-08 10:42:57 +01:00
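Enabling the renamed option programmatically, as a sketch (setting it in core-site.xml works equally well):

import org.apache.hadoop.conf.Configuration;

public class ThreadLevelStatsOptIn {
  public static Configuration enableThreadLevelStats() {
    Configuration conf = new Configuration();
    // Option name as renamed by HADOOP-18373.
    conf.setBoolean("fs.iostatistics.thread.level.enabled", true);
    return conf;
  }
}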
slfan1989 d8d3325d2f
HADOOP-18387. Fix incorrect placeholder in hadoop-common (#4679). Contributed by fanshilun. 2022-08-08 02:35:39 +05:30
Ashutosh Gupta 1cda2dcb6e
YARN-10793. Upgrade Junit from 4 to 5 in hadoop-yarn-server-applicationhistoryservice (#4603)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-08-07 16:15:47 +09:00
slfan1989 52c2d99889
YARN-11228. [Federation] Add getAppAttempts, getAppAttempt REST APIs for Router. (#4695) 2022-08-06 09:36:26 -07:00
xuzq 25ccdc77af
HDFS-16648. Add isDebugEnabled check for debug blockLogs in some classes (#4529) 2022-08-06 21:34:01 +08:00
Ashutosh Gupta bd0f9a46e1
HADOOP-18390. Fix out of sync import for HADOOP-18321 (#4694)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-08-06 21:51:23 +09:00
ahmarsuhail b5642c5638
HADOOP-18366. ITestS3Select.testSelectSeekFullLandsat is timing out. (#4702)
Reduces size of data read to 1 MB

Contributed by Ahmar Suhail
2022-08-05 14:13:04 +01:00
Steve Loughran 62dbefd8f2
HADOOP-18305. Release Hadoop 3.3.4: upstream changelog and jdiff files
Add the r3.3.4 changelog, release notes and jdiff xml files.
2022-08-05 14:06:22 +01:00
Ayush Saxena 080e67039d
HADOOP-17234. Addendum. Add .asf.yaml to allow github and jira integration. (#4686). Contributed by Ayush Saxena.
Reviewed-by: Mingliang Liu <liuml07@apache.org>
2022-08-05 08:34:56 +05:30
slfan1989 6f7c4c74ea
YARN-11235. Refactor Policy Code and Define getReservationHomeSubcluster (#4656) 2022-08-04 10:16:08 -07:00
Ashutosh Gupta 0aa08ef543
HADOOP-18363. Fix bug preventing hadoop-metrics2 from emitting metrics to > 1 Ganglia servers (#4627)
* HADOOP-18363. Fix bug preventing hadoop-metrics2 from emitting metrics to > 1 Ganglia servers
2022-08-04 18:26:38 +05:30
zhangshuyan0 dbf73e16b1
HADOOP-18364. All method metrics related to the RPC protocol should be initialized. (#4624). Contributed by Shuyan Zhang.
Reviewed-by: Erik Krogen <xkrogen@apache.org>
Reviewed-by: Chao Sun <sunchao@apache.org>
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2022-08-04 13:00:37 +08:00
xuzq 8eebf40b1a
HDFS-16642. Moving the selecting inputstream from journalnode in EditLogTailer outof FSNLock (#4497) 2022-08-04 11:04:28 +08:00
Mukund Thakur 66dec9d322
HADOOP-18355. Update previous index properly while validating overlapping ranges. (#4647)
part of HADOOP-18103.

Contributed By: Mukund Thakur
2022-08-04 04:08:04 +05:30
slfan1989 c5ec727435
YARN-11230. [Federation] Add getContainer, signalToContainer REST APIs for Router. (#4689) 2022-08-03 11:21:48 -07:00
slfan1989 6463f86f83
YARN-11029. Refactor AMRMProxy Service code and Added Some Metrics. (#4650) 2022-08-03 09:38:00 -07:00
slfan1989 c5eba323bc
YARN-6972. Adding RM ClusterId in AppInfo. (#4673) 2022-08-03 09:35:40 -07:00
Steve Vaughan 0fc7dd8228
HDFS-16687. RouterFsckServlet replicates code from DfsServlet base class (#4681) 2022-08-03 09:14:11 -07:00
xuzq 0f36539d60
HDFS-16712. Fix incorrect placeholder in DataNode.java (#4672). Contributed by ZanderXu.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-08-03 13:01:41 +05:30
Ashutosh Gupta 69f6fdb757
HADOOP-18301. Upgrade commons-io to 2.11.0 (#4455)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-08-03 10:44:39 +09:00
slfan1989 1f0a71a92b
YARN-8973. [Router] Add missing methods in RMWebProtocol. (#4664) 2022-08-02 14:07:09 -07:00
slfan1989 57da4bb0a1
YARN-11220. [Federation] Add getLabelsToNodes, getClusterNodeLabels, getLabelsOnNode REST APIs for Router (#4657) 2022-08-02 12:09:55 -07:00
ahmarsuhail 123d1aa884
HADOOP-18368. Fixes ITestCustomSigner for access point names with '-' (#4634)
Contributed By: Ahmar Suhail <ahmarsu@amazon.co.uk>
2022-08-02 01:49:42 +05:30
slfan1989 13fbfd5dea
HADOOP-18358. Update commons-math3 from 3.1.1 to 3.6.1. (#4619). Contributed by fanshilun.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-08-02 01:48:47 +05:30
slfan1989 2680f17eb4
YARN-11180. Refactor some code of getNewApplication, submitApplication etc. (#4618) 2022-07-29 08:23:11 -07:00
slfan1989 e994635a95
YARN-11212. [Federation] Add getNodeToLabels REST APIs for Router. (#4614) 2022-07-28 11:53:04 -07:00
Mukund Thakur a5b12c8010
HADOOP-18227. Add input stream IOStats for vectored IO api in S3A. (#4636)
part of HADOOP-18103.

Contributed By: Mukund Thakur
2022-07-28 21:57:37 +05:30
9uapaw bf570bd4ac YARN-11063. Support auto queue creation template wildcards for arbitrary queue depths. Contributed by Bence Kosztolnik. 2022-07-28 17:32:20 +02:00
Steve Loughran 95a85875d0
HADOOP-18344. (followup) AWS SDK 1.12.262: update LICENSE-binary
Update LICENSE-binary with the new AWS SDK version.
Followup to #4637.

Contributed by Steve Loughran
2022-07-28 11:37:28 +01:00
Steve Loughran 58ed621304
HADOOP-18344. Upgrade AWS SDK to 1.12.262 (#4637)
Fixes CVE-2018-7489 in shaded jackson.

+Add more commands in testing.md
 to the CLI tests needed when qualifying
 a release

Contributed by Steve Loughran
2022-07-28 11:29:38 +01:00
xuzq f80fab2b90
HDFS-16671. RBF: RouterRpcFairnessPolicyController supports configurable permit acquire timeout (#4597) 2022-07-28 15:42:53 +08:00
xuzq a5adc27c99
HDFS-16658. Change logLevel from DEBUG to INFO if logEveryBlock is true (#4559). Contributed by ZanderXu.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2022-07-28 15:30:22 +08:00
slfan1989 06ac327e88
HDFS-16619. Fix HttpHeaders.Values And HttpHeaders.Names Deprecated Import (#4406)
Co-authored-by: slfan1989 <louj1988@@>
2022-07-28 11:07:24 +08:00
xuzq 24560f2eb5
HDFS-16660. Improve Code With Lambda in IPCLoggerChannel class (#4561) 2022-07-27 18:53:05 -07:00
ahmarsuhail c92ff0b4f1
HADOOP-18372. ILoadTestS3ABulkDeleteThrottling failing. (#4642)
Contributed by Ahmar Suhail
2022-07-27 17:19:57 +01:00
Mehakmeet Singh 4c8cd61961
HADOOP-17461. Collect thread-level IOStatistics. (#4352)
This adds a thread-level collector of IOStatistics, IOStatisticsContext,
which can be:
* Retrieved for a thread and cached for access from other
  threads.
* reset() to record new statistics.
* Queried for live statistics through the
  IOStatisticsSource.getIOStatistics() method.
* Queries for a statistics aggregator for use in instrumented
  classes.
* Asked to create a serializable copy in snapshot()

The goal is to make it possible for applications with multiple
threads performing different work items simultaneously
to be able to collect statistics on the individual threads,
and so generate aggregate reports on the total work performed
for a specific job, query or similar unit of work.

Some changes in IOStatistics-gathering classes are needed for 
this feature
* Caching the active context's aggregator in the object's
  constructor
* Updating it in close()

Slightly more work is needed in multithreaded code,
such as the S3A committers, which collect statistics across
all threads used in task and job commit operations.

Currently the IOStatisticsContext-aware classes are:
* The S3A input stream, output stream and list iterators.
* RawLocalFileSystem's input and output streams.
* The S3A committers.
* The TaskPool class in hadoop-common, which propagates
  the active context into scheduled worker threads.

Collection of statistics in the IOStatisticsContext
is disabled process-wide by default until the feature 
is considered stable.

To enable the collection, set the option
fs.thread.level.iostatistics.enabled
to "true" in core-site.xml;
	
Contributed by Mehakmeet Singh and Steve Loughran
2022-07-26 20:41:22 +01:00
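A usage sketch built only from the methods named above; getCurrentIOStatisticsContext() is assumed to be the static accessor, and exact signatures may differ by release:

import org.apache.hadoop.fs.statistics.IOStatisticsContext;
import org.apache.hadoop.fs.statistics.IOStatisticsSnapshot;

public class ThreadStatsDemo {
  public static IOStatisticsSnapshot runAndReport(Runnable workItem) {
    IOStatisticsContext ctx =
        IOStatisticsContext.getCurrentIOStatisticsContext();
    ctx.reset();           // start a fresh measurement window
    workItem.run();        // perform filesystem I/O on this thread
    return ctx.snapshot(); // serializable copy for aggregation
  }
}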
KevinWikant 213ea03758
YARN-11210. Fix YARN RMAdminCLI retry logic for non-retryable kerbero… (#4563)
Co-authored-by: Kevin Wikant <wikak@amazon.com>
2022-07-26 09:21:37 +05:30
xuzq 01a2e0f6bd
HDFS-16533. COMPOSITE_CRC failed between replicated file and striped file due to invalid requested length. (#4155)
Co-authored-by: zengqiang.xu <zengqiang.xu@shopee.com>
2022-07-26 04:30:00 +08:00
slfan1989 bf8782d0ac
YARN-10883. [Router] Router Audit Log Add Client IP Address. (#4426) 2022-07-25 11:55:40 -07:00
skysiders 9fe96238d2
MAPREDUCE-7372 MapReduce set permission too late in copyJar method (#4026). Contributed by Zhang Dongsheng.
Reviewed-by: Steve Loughran <stevel@apache.org>
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
2022-07-25 11:38:59 -07:00
Gautham B A 6ba2c53720
HDFS-16681. Do not pass GCC flags for MSVC in libhdfspp (#4615)
* This PR ensures that the GCC flag
  -Wno-missing-field-initializers isn't passed
  for MSVC.
2022-07-25 22:55:38 +05:30
slfan1989 edeb99548a
YARN-11161. Support getAttributesToNodes, getClusterNodeAttributes, getNodesToAttributes API's for Federation (#4610) 2022-07-25 10:05:45 -07:00
Neil 2f49eec5dd
HDFS-16655. OIV: print out erasure coding policy name in oiv Delimited output (#4541). Contributed by Max Xie.
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
2022-07-25 17:39:25 +08:00
Gautham B A 8f83d9f56d
HDFS-16680. Skip libhdfspp Valgrind tests on Windows (#4611)
* The CMake test libhdfs_mini_stress_valgrind
  requires Valgrind.
* This PR skips this test on Windows since
  Valgrind isn't available.
2022-07-23 23:22:13 +05:30
Gautham B A 7de9b5ee27
HDFS-16467. Ensure Protobuf generated headers are included first (#4601)
* This PR ensures that the Protobuf generated headers
  are always included first, even when these headers
  are included transitively.
* This problem is specific to Windows only.
2022-07-23 23:20:15 +05:30
slfan1989 63db1a85e3
YARN-11203. Fix typo in hadoop-yarn-server-router module. (#4510). Contributed by fanshilun.
Reviewed-by: Fei Hui <feihui.ustc@gmail.com>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-07-23 20:28:45 +05:30
xuzq 2c96357051
HDFS-15079. RBF: Namenode needs to use the actual client Id and callId when going through RBF proxy. (#4530) 2022-07-23 22:19:37 +08:00
slfan1989 5c84cb81ba
YARN-8900. [Router] Federation: routing getContainers REST invocations transparently to multiple RMs (#4543) 2022-07-22 17:06:38 -07:00
Masatake Iwasaki 221eb2d68d Make upstream aware of 3.2.4 release.
(cherry picked from commit 817b8fdd38)
2022-07-22 04:08:36 +00:00
Masatake Iwasaki 3cce41a1f6 Make upstream aware of 3.2.4 release.
(cherry picked from commit e1637a57df)
2022-07-22 02:27:19 +00:00
slfan1989 2f6916a313
HDFS-16605. Improve Code With Lambda in hadoop-hdfs-rbf moudle. (#4375) 2022-07-21 18:42:55 -07:00
wangzhaohui 08a940d5dd
HDFS-16640. RBF: Show datanode IP list when click DN histogram in Router (#4488) 2022-07-21 16:21:31 -07:00
slfan1989 838020ce3b
YARN-11160. Support getResourceProfiles, getResourceProfile API's for Federation (#4540) 2022-07-21 11:57:24 -07:00
Szilard Nemeth f4b635c4dc YARN-11211. QueueMetrics leaks Configuration objects when validation API is called multiple times. Contributed by Andras Gyori 2022-07-21 14:20:34 +02:00
ashutoshpant bac2219e3c
HADOOP-18330. S3AFileSystem removes Path when calling createS3Client (#4572)
Adds a new parameter object in s3ClientCreationParameters that holds 
the full s3a path URI

Contributed by Ashutosh Pant
2022-07-21 10:16:39 +01:00
Gautham B A d07256a96d
HDFS-16667. Use malloc for buffer allocation in uriparser2 (#4576)
* MSVC doesn't support variable-length arrays, i.e.
  array sizes specified by runtime variables.
* This PR uses malloc to fix this issue.
2022-07-20 21:57:28 +05:30
Ashutosh Gupta e664f81ce7
HADOOP-18333. Upgrade jetty version to 9.4.48.v20220622 (#4553)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
2022-07-21 00:15:39 +08:00
Gautham B A 6415eb04e8
HDFS-16665. Fix duplicate sources for HDFS test (#4573)
* The library target hdfspp_test_shim_static is
  built using the following sources, which
  causes duplicate symbols to be defined -
  - hdfs_shim.c
  - ${LIBHDFSPP_BINDING_C}/hdfs.cc
* ${LIBHDFSPP_BINDING_C}/hdfs.cc is redundant
  and removing this fixes the issue.
2022-07-19 21:39:06 +05:30
Gautham B A 4fb799e6c5
HDFS-16464. Create only libhdfspp static libraries for Windows (#4571)
* Added the appropriate CMake flags and
  commands to enable only statically
  linked libraries and executables to
  be built on Windows.
2022-07-19 21:37:22 +05:30
Gautham B A 21b8952125
HDFS-16666. Pass CMake args for Windows in pom.xml (#4574)
* This PR passes the necessary CMake args in the
  pom.xml needed for building HDFS native client
  on Windows.
* These arguments are exposed as maven options
  and can be passed from the command-line.
2022-07-19 10:45:59 +05:30
Wei-Chiu Chuang a55ace7bc0
HADOOP-18079. Upgrade Netty to 4.1.77. (#3977)
Upgrade netty to address

CVE-2019-20444,
CVE-2019-20445
CVE-2022-24823

Contributed by Wei-Chiu Chuang
2022-07-18 10:41:00 +01:00
PJ Fanning 34e548cb62
HADOOP-18332: remove rs-api dependency as it conflicts with jsr311-api (#4547)
This downgrades jackson from the version switched to in
    HADOOP-18033 (2.13.0), to Jackson 2.12.7.
    This removes the dependency on javax.ws.rs-api,
    so avoiding runtime problems with applications using
    jersey-core v1 and/or jsr311-api.
    
    The 2.12.7 release still contains the fix for CVE-2020-36518.
    
    Contributed by PJ Fanning
2022-07-17 21:37:54 +05:30
Gautham B A 440f4c2b28
HDFS-16654. Link OpenSSL lib for CMake deps check (#4538)
* The check_c_source_compiles fails on Windows
  while linking with an "unable to resolve
  external symbol" error.
* This PR links OpenSSL lib for this check to
  fix this issue.
2022-07-17 20:47:30 +05:30
xuzq 8774f17868
HADOOP-13144. Enhancing IPC client throughput via multiple connections per user (#4542) 2022-07-15 14:18:46 -07:00
RuinanGu 9376b65989
HDFS-16566 Erasure Coding: Recovery may cause excess replicas when a busy DN exists (#4252) 2022-07-16 04:52:12 +08:00
Murali Krishna 2835174a4c
HDFS-16652. Upgrade jquery datatable version references to v1.10.19 (#4562) 2022-07-14 18:27:07 +05:30
Samrat 84ce592a85
YARN-11198. clean up numa resources from statestore (#4546)
* YARN-11198. clean up numa resources from levelDB

Co-authored-by: Deb <dbsamrat@3c22fba1b03f.ant.amazon.com>
2022-07-14 11:07:48 +05:30
xuzq 6f9c4359ec
HDFS-16283. RBF: reducing the load of renewLease() RPC (#4524). Contributed by ZanderXu.
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-07-14 07:26:40 +05:30
Ashutosh Gupta f1bd4e117e
HADOOP-18336.Tag FSDataInputStream.getWrappedStream() @Public/@Stable (#4555)
Contributed by: Ashutosh Gupta
2022-07-13 12:56:56 +01:00
HerCath 4c4a940da2
HADOOP-18217. ExitUtil synchronized blocks reduced. #4255
Reduce the ExitUtil synchronized block scopes so that System.exit
and Runtime.halt calls aren't within their boundaries; this way
ExitUtil wrappers no longer block each other.

Enlarged catches to all Throwables (not just Exceptions).

Contributed by Remi Catherinot
2022-07-13 12:35:44 +01:00
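The shape of the change, as an illustrative sketch rather than the real ExitUtil code:

final class ExitScopeSketch {
  private static final Object LOCK = new Object();
  private static volatile Throwable firstExit;

  static void terminate(int status, Throwable t) {
    synchronized (LOCK) {
      // only bookkeeping happens under the lock
      if (firstExit == null) {
        firstExit = t;
      }
    }
    // System.exit() runs shutdown hooks and can block indefinitely;
    // keeping it outside the synchronized block means one exiting
    // thread no longer blocks every other ExitUtil caller.
    System.exit(status);
  }
}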
Ashutosh Gupta 0ca4868aa2
HADOOP-18294. Ensure build folder exists before writing checksum file. ProtocRunner#writeChecksums (#4446)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-07-12 20:15:26 +09:00
Ashutosh Gupta 4e8c0b902e
MAPREDUCE-7201. Make Job History File Permissions configurable (#4507)
* MAPREDUCE-7201. Make Job History File Permissions configurable

Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
2022-07-11 11:34:52 +05:30
lmccay e11ba5930e
HADOOP-18074 - Partial/Incomplete groups list can be returned in LDAP… (#4503)
* HADOOP-18074 - Partial/Incomplete groups list can be returned in LDAP groups lookup
2022-07-11 01:03:44 -04:00
Akira Ajisaka 9b1d3579b4
Revert "MAPREDUCE-7388. Remove unused variable _eof in GzipCodec.cc (#4429)"
This reverts commit fac895828f.
2022-07-09 03:05:42 +09:00
cfg1234 fac895828f
MAPREDUCE-7388. Remove unused variable _eof in GzipCodec.cc (#4429)
Co-authored-by: cWX456268 <chenfengge1@huawei.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-07-09 02:51:49 +09:00
Ayush Saxena 96f8e5b6f4
HADOOP-15789. DistCp does not clean staging folder if class extends DistCp. Contributed by Lawrence Andrews. (#4534)
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-07-08 17:04:20 +05:30
Gautham B A 8e39e35bea
HDFS-16466. Implement Linux permission flags on Windows (#4526)
* HDFS-16466. Implement Linux permission flags on Windows

* statinfo.cc uses POSIX permission flags.
  These flags aren't available for Windows.
* This PR implements the equivalent flags
  on Windows to make this cross platform
  compatible.
2022-07-08 09:29:13 +05:30
2431 changed files with 423989 additions and 43542 deletions

.asf.yaml

@@ -14,6 +14,8 @@
# limitations under the License.
github:
  ghp_path: /
  ghp_branch: gh-pages
  enabled_merge_buttons:
    squash: true
    merge: false
@@ -22,4 +24,4 @@ notifications:
  commits: common-commits@hadoop.apache.org
  issues: common-issues@hadoop.apache.org
  pullrequests: common-issues@hadoop.apache.org
  jira_options: link label worklog
  jira_options: comment link label

.github/workflows/website.yml (vendored, new file, 59 lines)

@@ -0,0 +1,59 @@
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
name: website
# Controls when the action will run.
on:
  push:
    branches: [ trunk ]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Hadoop trunk
        uses: actions/checkout@v3
        with:
          repository: apache/hadoop
      - name: Set up JDK 8
        uses: actions/setup-java@v3
        with:
          java-version: '8'
          distribution: 'temurin'
      - name: Cache local Maven repository
        uses: actions/cache@v3
        with:
          path: ~/.m2/repository
          key: ${{ runner.os }}-maven-${{ hashFiles('**/pom.xml') }}
          restore-keys: |
            ${{ runner.os }}-maven-
      - name: Build Hadoop maven plugins
        run: cd hadoop-maven-plugins && mvn --batch-mode install
      - name: Build Hadoop
        run: mvn clean install -DskipTests -DskipShade
      - name: Build document
        run: mvn clean site
      - name: Stage document
        run: mvn site:stage -DstagingDirectory=${GITHUB_WORKSPACE}/staging/
      - name: Deploy to GitHub Pages
        uses: peaceiris/actions-gh-pages@v3
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./staging/hadoop-project
          user_name: 'github-actions[bot]'
          user_email: 'github-actions[bot]@users.noreply.github.com'

.yetus/excludes.txt (new file, 17 lines)

@@ -0,0 +1,17 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
dev-support/docker/Dockerfile_windows_10

BUILDING.txt

@@ -492,39 +492,66 @@ Building on CentOS 8
----------------------------------------------------------------------------------
Building on Windows
Building on Windows 10
----------------------------------------------------------------------------------
Requirements:
* Windows System
* Windows 10
* JDK 1.8
* Maven 3.0 or later
* Boost 1.72
* Protocol Buffers 3.7.1
* CMake 3.19 or newer
* Visual Studio 2010 Professional or Higher
* Windows SDK 8.1 (if building CPU rate control for the container executor)
* zlib headers (if building native code bindings for zlib)
* Maven 3.0 or later (maven.apache.org)
* Boost 1.72 (boost.org)
* Protocol Buffers 3.7.1 (https://github.com/protocolbuffers/protobuf/releases)
* CMake 3.19 or newer (cmake.org)
* Visual Studio 2019 (visualstudio.com)
* Windows SDK 8.1 (optional, if building CPU rate control for the container executor. Get this from
http://msdn.microsoft.com/en-us/windows/bg162891.aspx)
* Zlib (zlib.net, if building native code bindings for zlib)
* Git (preferably, get this from https://git-scm.com/download/win since the package also contains
Unix command-line tools that are needed during packaging).
* Python (python.org, for generation of docs using 'mvn site')
* Internet connection for first build (to fetch all Maven and Hadoop dependencies)
* Unix command-line tools from GnuWin32: sh, mkdir, rm, cp, tar, gzip. These
tools must be present on your PATH.
* Python ( for generation of docs using 'mvn site')
Unix command-line tools are also included with the Windows Git package which
can be downloaded from http://git-scm.com/downloads
If using Visual Studio, it must be Professional level or higher.
Do not use Visual Studio Express. It does not support compiling for 64-bit,
which is problematic if running a 64-bit system.
The Windows SDK 8.1 is available to download at:
http://msdn.microsoft.com/en-us/windows/bg162891.aspx
Cygwin is not required.
----------------------------------------------------------------------------------
Building guidelines:
The Hadoop repository provides a Dockerfile for building Hadoop on Windows 10, located at
dev-support/docker/Dockerfile_windows_10. It is highly recommended to use it to create the
Docker image for building Hadoop on Windows 10, since you then don't have to install anything
other than Docker and need no additional steps to align the environment with the necessary
paths etc.
However, if you prefer not to use Docker, Dockerfile_windows_10 is still immensely useful as a
guide to all the steps involved in creating the environment needed to build Hadoop on Windows 10.
Building using Docker:
We first need to build the Docker image for building Hadoop on Windows 10. Run this command from
the root of the Hadoop repository.
> docker build -t hadoop-windows-10-builder -f .\dev-support\docker\Dockerfile_windows_10 .\dev-support\docker\
Start the container with the image that we just built.
> docker run --rm -it hadoop-windows-10-builder
You can now clone the Hadoop repo inside this container and proceed with the build.
NOTE:
While it may seem appealing to mount the locally cloned (on the host filesystem) Hadoop
repository into the container (using the -v option), we have seen such builds fail because Maven
could not locate some files. Thus, we suggest cloning the Hadoop repository into a non-mounted
folder inside the container and proceeding with the build. When the build completes, you may use
the "docker cp" command to copy the built Hadoop tar.gz file from the Docker container to the
host filesystem. If you would still like to mount the Hadoop codebase, a workaround is to copy
the mounted Hadoop codebase into another folder (one that doesn't point to a mount) in the
container's filesystem and use that copy for building.
However, we noticed no build issues when the Maven repository from the host filesystem was mounted
into the container. One may use this to greatly reduce the build time. Assuming that the Maven
repository is located at D:\Maven\Repository in the host filesystem, one can use the following
command to mount the same onto the default Maven repository location while launching the container.
> docker run --rm -v D:\Maven\Repository:C:\Users\ContainerAdministrator\.m2\repository -it hadoop-windows-10-builder
Building:
Keep the source code tree in a short path to avoid running into problems related
@@ -540,6 +567,24 @@ configure the bit-ness of the build, and set several optional components.
Several tests require that the user must have the Create Symbolic Links
privilege.
To simplify the installation of the Boost, Protocol Buffers, OpenSSL and Zlib dependencies, we can
use vcpkg (https://github.com/Microsoft/vcpkg.git). Upon cloning the vcpkg repo, check out commit
7ffa425e1db8b0c3edf9c50f2f3a0f25a324541d to get the required versions of the dependencies
mentioned above.
> git clone https://github.com/Microsoft/vcpkg.git
> cd vcpkg
> git checkout 7ffa425e1db8b0c3edf9c50f2f3a0f25a324541d
> .\bootstrap-vcpkg.bat
> .\vcpkg.exe install boost:x64-windows
> .\vcpkg.exe install protobuf:x64-windows
> .\vcpkg.exe install openssl:x64-windows
> .\vcpkg.exe install zlib:x64-windows
Set the following environment variables -
(Assuming that vcpkg was checked out at C:\vcpkg)
> set PROTOBUF_HOME=C:\vcpkg\installed\x64-windows
> set MAVEN_OPTS=-Xmx2048M -Xss128M
All Maven goals are the same as described above with the exception that
native code is built by enabling the 'native-win' Maven profile. -Pnative-win
is enabled by default when building on Windows since the native components
@@ -557,6 +602,24 @@ the zlib 1.2.7 source tree.
http://www.zlib.net/
Build command:
The following command builds all the modules in the Hadoop project and generates the tar.gz file in
hadoop-dist/target upon successful build. Run these commands from an
"x64 Native Tools Command Prompt for VS 2019" which can be found under "Visual Studio 2019" in the
Windows start menu. If you're using the Docker image from Dockerfile_windows_10, you'll be
logged into "x64 Native Tools Command Prompt for VS 2019" automatically when you start the
container.
> set classpath=
> set PROTOBUF_HOME=C:\vcpkg\installed\x64-windows
> mvn clean package -Dhttps.protocols=TLSv1.2 -DskipTests -DskipDocs -Pnative-win,dist^
-Drequire.openssl -Drequire.test.libhadoop -Pyarn-ui -Dshell-executable=C:\Git\bin\bash.exe^
-Dtar -Dopenssl.prefix=C:\vcpkg\installed\x64-windows^
-Dcmake.prefix.path=C:\vcpkg\installed\x64-windows^
-Dwindows.cmake.toolchain.file=C:\vcpkg\scripts\buildsystems\vcpkg.cmake -Dwindows.cmake.build.type=RelWithDebInfo^
-Dwindows.build.hdfspp.dll=off -Dwindows.no.sasl=on -Duse.platformToolsetVersion=v142
----------------------------------------------------------------------------------
Building distributions:

LICENSE-binary

@@ -210,22 +210,22 @@ hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/static/nvd3-1.8.5.* (css and js
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/checker/AbstractFuture.java
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/checker/TimeoutFuture.java
com.aliyun:aliyun-java-sdk-core:3.4.0
com.aliyun:aliyun-java-sdk-ecs:4.2.0
com.aliyun:aliyun-java-sdk-ram:3.0.0
com.aliyun:aliyun-java-sdk-core:4.5.10
com.aliyun:aliyun-java-sdk-kms:2.11.0
com.aliyun:aliyun-java-sdk-ram:3.1.0
com.aliyun:aliyun-java-sdk-sts:3.0.0
com.aliyun.oss:aliyun-sdk-oss:3.13.2
com.amazonaws:aws-java-sdk-bundle:1.11.901
com.amazonaws:aws-java-sdk-bundle:1.12.316
com.cedarsoftware:java-util:1.9.0
com.cedarsoftware:json-io:2.5.1
com.fasterxml.jackson.core:jackson-annotations:2.13.2
com.fasterxml.jackson.core:jackson-core:2.13.2
com.fasterxml.jackson.core:jackson-databind:2.13.2.2
com.fasterxml.jackson.jaxrs:jackson-jaxrs-base:2.13.2
com.fasterxml.jackson.jaxrs:jackson-jaxrs-json-provider:2.13.2
com.fasterxml.jackson.module:jackson-module-jaxb-annotations:2.13.2
com.fasterxml.jackson.core:jackson-annotations:2.12.7
com.fasterxml.jackson.core:jackson-core:2.12.7
com.fasterxml.jackson.core:jackson-databind:2.12.7.1
com.fasterxml.jackson.jaxrs:jackson-jaxrs-base:2.12.7
com.fasterxml.jackson.jaxrs:jackson-jaxrs-json-provider:2.12.7
com.fasterxml.jackson.module:jackson-module-jaxb-annotations:2.12.7
com.fasterxml.uuid:java-uuid-generator:3.1.4
com.fasterxml.woodstox:woodstox-core:5.3.0
com.fasterxml.woodstox:woodstox-core:5.4.0
com.github.davidmoten:rxjava-extras:0.8.0.17
com.github.stephenc.jcip:jcip-annotations:1.0-1
com.google:guice:4.0
@ -240,18 +240,17 @@ com.google.guava:guava:20.0
com.google.guava:guava:27.0-jre
com.google.guava:listenablefuture:9999.0-empty-to-avoid-conflict-with-guava
com.microsoft.azure:azure-storage:7.0.0
com.nimbusds:nimbus-jose-jwt:9.8.1
com.squareup.okhttp3:okhttp:4.9.3
com.squareup.okio:okio:1.6.0
com.nimbusds:nimbus-jose-jwt:9.31
com.squareup.okhttp3:okhttp:4.10.0
com.squareup.okio:okio:3.2.0
com.zaxxer:HikariCP:4.0.3
commons-beanutils:commons-beanutils:1.9.3
commons-cli:commons-cli:1.2
commons-beanutils:commons-beanutils:1.9.4
commons-cli:commons-cli:1.5.0
commons-codec:commons-codec:1.11
commons-collections:commons-collections:3.2.2
commons-daemon:commons-daemon:1.0.13
commons-io:commons-io:2.8.0
commons-logging:commons-logging:1.1.3
commons-net:commons-net:3.6
commons-net:commons-net:3.9.0
de.ruedigermoeller:fst:2.50
io.grpc:grpc-api:1.26.0
io.grpc:grpc-context:1.26.0
@ -260,18 +259,36 @@ io.grpc:grpc-netty:1.26.0
io.grpc:grpc-protobuf:1.26.0
io.grpc:grpc-protobuf-lite:1.26.0
io.grpc:grpc-stub:1.26.0
io.netty:netty:3.10.6.Final
io.netty:netty-all:4.1.42.Final
io.netty:netty-buffer:4.1.27.Final
io.netty:netty-codec:4.1.27.Final
io.netty:netty-codec-http:4.1.27.Final
io.netty:netty-codec-http2:4.1.27.Final
io.netty:netty-codec-socks:4.1.27.Final
io.netty:netty-common:4.1.27.Final
io.netty:netty-handler:4.1.27.Final
io.netty:netty-handler-proxy:4.1.27.Final
io.netty:netty-resolver:4.1.27.Final
io.netty:netty-transport:4.1.27.Final
io.netty:netty-all:4.1.77.Final
io.netty:netty-buffer:4.1.77.Final
io.netty:netty-codec:4.1.77.Final
io.netty:netty-codec-dns:4.1.77.Final
io.netty:netty-codec-haproxy:4.1.77.Final
io.netty:netty-codec-http:4.1.77.Final
io.netty:netty-codec-http2:4.1.77.Final
io.netty:netty-codec-memcache:4.1.77.Final
io.netty:netty-codec-mqtt:4.1.77.Final
io.netty:netty-codec-redis:4.1.77.Final
io.netty:netty-codec-smtp:4.1.77.Final
io.netty:netty-codec-socks:4.1.77.Final
io.netty:netty-codec-stomp:4.1.77.Final
io.netty:netty-codec-xml:4.1.77.Final
io.netty:netty-common:4.1.77.Final
io.netty:netty-handler:4.1.77.Final
io.netty:netty-handler-proxy:4.1.77.Final
io.netty:netty-resolver:4.1.77.Final
io.netty:netty-resolver-dns:4.1.77.Final
io.netty:netty-transport:4.1.77.Final
io.netty:netty-transport-rxtx:4.1.77.Final
io.netty:netty-transport-sctp:4.1.77.Final
io.netty:netty-transport-udt:4.1.77.Final
io.netty:netty-transport-classes-epoll:4.1.77.Final
io.netty:netty-transport-native-unix-common:4.1.77.Final
io.netty:netty-transport-classes-kqueue:4.1.77.Final
io.netty:netty-resolver-dns-classes-macos:4.1.77.Final
io.netty:netty-transport-native-epoll:4.1.77.Final
io.netty:netty-transport-native-kqueue:4.1.77.Final
io.netty:netty-resolver-dns-native-macos:4.1.77.Final
io.opencensus:opencensus-api:0.12.3
io.opencensus:opencensus-contrib-grpc-metrics:0.12.3
io.reactivex:rxjava:1.3.8
@ -282,16 +299,15 @@ javax.inject:javax.inject:1
log4j:log4j:1.2.17
net.java.dev.jna:jna:5.2.0
net.minidev:accessors-smart:1.2
net.minidev:json-smart:2.4.7
org.apache.avro:avro:1.9.2
org.apache.commons:commons-collections4:4.2
org.apache.commons:commons-compress:1.21
org.apache.commons:commons-configuration2:2.1.1
org.apache.commons:commons-csv:1.0
org.apache.commons:commons-configuration2:2.8.0
org.apache.commons:commons-csv:1.9.0
org.apache.commons:commons-digester:1.8.1
org.apache.commons:commons-lang3:3.12.0
org.apache.commons:commons-math3:3.1.1
org.apache.commons:commons-text:1.4
org.apache.commons:commons-math3:3.6.1
org.apache.commons:commons-text:1.10.0
org.apache.commons:commons-validator:1.6
org.apache.curator:curator-client:5.2.0
org.apache.curator:curator-framework:5.2.0
@ -305,46 +321,49 @@ org.apache.htrace:htrace-core:3.1.0-incubating
org.apache.htrace:htrace-core4:4.1.0-incubating
org.apache.httpcomponents:httpclient:4.5.6
org.apache.httpcomponents:httpcore:4.4.10
org.apache.kafka:kafka-clients:2.8.1
org.apache.kerby:kerb-admin:1.0.1
org.apache.kerby:kerb-client:1.0.1
org.apache.kerby:kerb-common:1.0.1
org.apache.kerby:kerb-core:1.0.1
org.apache.kerby:kerb-crypto:1.0.1
org.apache.kerby:kerb-identity:1.0.1
org.apache.kerby:kerb-server:1.0.1
org.apache.kerby:kerb-simplekdc:1.0.1
org.apache.kerby:kerb-util:1.0.1
org.apache.kerby:kerby-asn1:1.0.1
org.apache.kerby:kerby-config:1.0.1
org.apache.kerby:kerby-pkix:1.0.1
org.apache.kerby:kerby-util:1.0.1
org.apache.kerby:kerby-xdr:1.0.1
org.apache.kerby:token-provider:1.0.1
org.apache.kafka:kafka-clients:2.8.2
org.apache.kerby:kerb-admin:2.0.3
org.apache.kerby:kerb-client:2.0.3
org.apache.kerby:kerb-common:2.0.3
org.apache.kerby:kerb-core:2.0.3
org.apache.kerby:kerb-crypto:2.0.3
org.apache.kerby:kerb-identity:2.0.3
org.apache.kerby:kerb-server:2.0.3
org.apache.kerby:kerb-simplekdc:2.0.3
org.apache.kerby:kerb-util:2.0.3
org.apache.kerby:kerby-asn1:2.0.3
org.apache.kerby:kerby-config:2.0.3
org.apache.kerby:kerby-pkix:2.0.3
org.apache.kerby:kerby-util:2.0.3
org.apache.kerby:kerby-xdr:2.0.3
org.apache.kerby:token-provider:2.0.3
org.apache.solr:solr-solrj:8.8.2
org.apache.yetus:audience-annotations:0.5.0
org.apache.zookeeper:zookeeper:3.6.3
org.codehaus.jettison:jettison:1.1
org.eclipse.jetty:jetty-annotations:9.4.44.v20210927
org.eclipse.jetty:jetty-http:9.4.44.v20210927
org.eclipse.jetty:jetty-io:9.4.44.v20210927
org.eclipse.jetty:jetty-jndi:9.4.44.v20210927
org.eclipse.jetty:jetty-plus:9.4.44.v20210927
org.eclipse.jetty:jetty-security:9.4.44.v20210927
org.eclipse.jetty:jetty-server:9.4.44.v20210927
org.eclipse.jetty:jetty-servlet:9.4.44.v20210927
org.eclipse.jetty:jetty-util:9.4.44.v20210927
org.eclipse.jetty:jetty-util-ajax:9.4.44.v20210927
org.eclipse.jetty:jetty-webapp:9.4.44.v20210927
org.eclipse.jetty:jetty-xml:9.4.44.v20210927
org.eclipse.jetty.websocket:javax-websocket-client-impl:9.4.44.v20210927
org.eclipse.jetty.websocket:javax-websocket-server-impl:9.4.44.v20210927
org.codehaus.jettison:jettison:1.5.4
org.eclipse.jetty:jetty-annotations:9.4.51.v20230217
org.eclipse.jetty:jetty-http:9.4.51.v20230217
org.eclipse.jetty:jetty-io:9.4.51.v20230217
org.eclipse.jetty:jetty-jndi:9.4.51.v20230217
org.eclipse.jetty:jetty-plus:9.4.51.v20230217
org.eclipse.jetty:jetty-security:9.4.51.v20230217
org.eclipse.jetty:jetty-server:9.4.51.v20230217
org.eclipse.jetty:jetty-servlet:9.4.51.v20230217
org.eclipse.jetty:jetty-util:9.4.51.v20230217
org.eclipse.jetty:jetty-util-ajax:9.4.51.v20230217
org.eclipse.jetty:jetty-webapp:9.4.51.v20230217
org.eclipse.jetty:jetty-xml:9.4.51.v20230217
org.eclipse.jetty.websocket:javax-websocket-client-impl:9.4.51.v20230217
org.eclipse.jetty.websocket:javax-websocket-server-impl:9.4.51.v20230217
org.ehcache:ehcache:3.3.1
org.ini4j:ini4j:0.5.4
org.jetbrains.kotlin:kotlin-stdlib:1.4.10
org.jetbrains.kotlin:kotlin-stdlib-common:1.4.10
org.lz4:lz4-java:1.7.1
org.objenesis:objenesis:2.6
org.xerial.snappy:snappy-java:1.0.5
org.yaml:snakeyaml:1.16
org.wildfly.openssl:wildfly-openssl:1.0.7.Final
org.yaml:snakeyaml:2.0
org.wildfly.openssl:wildfly-openssl:1.1.3.Final
--------------------------------------------------------------------------------
@ -408,7 +427,7 @@ hadoop-tools/hadoop-sls/src/main/html/js/thirdparty/bootstrap.min.js
hadoop-tools/hadoop-sls/src/main/html/js/thirdparty/jquery.js
hadoop-tools/hadoop-sls/src/main/html/css/bootstrap.min.css
hadoop-tools/hadoop-sls/src/main/html/css/bootstrap-responsive.min.css
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/*
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.11.5/*
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/jquery
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/jt/jquery.jstree.js
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/resources/TERMINAL
@ -416,7 +435,7 @@ hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanage
bootstrap v3.3.6
broccoli-asset-rev v2.4.2
broccoli-funnel v1.0.1
datatables v1.10.8
datatables v1.11.5
em-helpers v0.5.13
em-table v0.1.6
ember v2.2.0
@ -491,7 +510,6 @@ javax.annotation:javax.annotation-api:1.3.2
javax.servlet:javax.servlet-api:3.1.0
javax.servlet.jsp:jsp-api:2.1
javax.websocket:javax.websocket-api:1.0
javax.ws.rs:javax.ws.rs-api:2.1.1
javax.ws.rs:jsr311-api:1.1.1
javax.xml.bind:jaxb-api:2.2.11
@ -500,12 +518,14 @@ Eclipse Public License 1.0
--------------------------
junit:junit:4.13.2
org.jacoco:org.jacoco.agent:0.8.5
HSQL License
------------
org.hsqldb:hsqldb:2.3.4
org.hsqldb:hsqldb:2.7.1
JDOM License

View File

@ -252,7 +252,7 @@ hadoop-tools/hadoop-sls/src/main/html/js/thirdparty/bootstrap.min.js
hadoop-tools/hadoop-sls/src/main/html/js/thirdparty/jquery.js
hadoop-tools/hadoop-sls/src/main/html/css/bootstrap.min.css
hadoop-tools/hadoop-sls/src/main/html/css/bootstrap-responsive.min.css
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.10.18/*
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/dt-1.11.5/*
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/jquery
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/jt/jquery.jstree.js
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/resources/TERMINAL

View File

@ -20,6 +20,20 @@
# Override these to match Apache Hadoop's requirements
personality_plugins "all,-ant,-gradle,-scalac,-scaladoc"
# These flags are needed to run Yetus against Hadoop on Windows.
WINDOWS_FLAGS="-Pnative-win
-Dhttps.protocols=TLSv1.2
-Drequire.openssl
-Drequire.test.libhadoop
-Dshell-executable=${BASH_EXECUTABLE}
-Dopenssl.prefix=${VCPKG_INSTALLED_PACKAGES}
-Dcmake.prefix.path=${VCPKG_INSTALLED_PACKAGES}
-Dwindows.cmake.toolchain.file=${CMAKE_TOOLCHAIN_FILE}
-Dwindows.cmake.build.type=RelWithDebInfo
-Dwindows.build.hdfspp.dll=off
-Dwindows.no.sasl=on
-Duse.platformToolsetVersion=v142"
## @description Globals specific to this personality
## @audience private
## @stability evolving
@ -87,17 +101,30 @@ function hadoop_order
echo "${hadoopm}"
}
## @description Retrieves the Hadoop project version defined in the root pom.xml
## @audience private
## @stability evolving
## @returns 0 on success, 1 on failure
function load_hadoop_version
{
if [[ -f "${BASEDIR}/pom.xml" ]]; then
HADOOP_VERSION=$(grep '<version>' "${BASEDIR}/pom.xml" \
| head -1 \
| "${SED}" -e 's|^ *<version>||' -e 's|</version>.*$||' \
| cut -f1 -d- )
return 0
else
return 1
fi
}
## @description Determine if it is safe to run parallel tests
## @audience private
## @stability evolving
## @param ordering
function hadoop_test_parallel
{
if [[ -f "${BASEDIR}/pom.xml" ]]; then
HADOOP_VERSION=$(grep '<version>' "${BASEDIR}/pom.xml" \
| head -1 \
| "${SED}" -e 's|^ *<version>||' -e 's|</version>.*$||' \
| cut -f1 -d- )
if load_hadoop_version; then
export HADOOP_VERSION
else
return 1
@ -262,7 +289,10 @@ function hadoop_native_flags
Windows_NT|CYGWIN*|MINGW*|MSYS*)
echo \
"${args[@]}" \
-Drequire.snappy -Drequire.openssl -Pnative-win
-Drequire.snappy \
-Pdist \
-Dtar \
"${WINDOWS_FLAGS}"
;;
*)
echo \
@ -405,7 +435,10 @@ function personality_modules
extra="${extra} ${flags}"
fi
extra="-Ptest-patch ${extra}"
if [[ "$IS_WINDOWS" && "$IS_WINDOWS" == 1 ]]; then
extra="-Ptest-patch -Pdist -Dtar ${WINDOWS_FLAGS} ${extra}"
fi
for module in $(hadoop_order ${ordering}); do
# shellcheck disable=SC2086
personality_enqueue_module ${module} ${extra}
@ -548,17 +581,28 @@ function shadedclient_rebuild
big_console_header "Checking client artifacts on ${repostatus} with shaded clients"
extra="-Dtest=NoUnitTests -Dmaven.javadoc.skip=true -Dcheckstyle.skip=true -Dspotbugs.skip=true"
if [[ "$IS_WINDOWS" && "$IS_WINDOWS" == 1 ]]; then
if load_hadoop_version; then
export HADOOP_HOME="${SOURCEDIR}/hadoop-dist/target/hadoop-${HADOOP_VERSION}-SNAPSHOT"
else
yetus_error "[WARNING] Unable to extract the Hadoop version and thus HADOOP_HOME is not set. Some tests may fail."
fi
extra="${WINDOWS_FLAGS} ${extra}"
fi
echo_and_redirect "${logfile}" \
"${MAVEN}" "${MAVEN_ARGS[@]}" verify -fae --batch-mode -am \
"${modules[@]}" \
-Dtest=NoUnitTests -Dmaven.javadoc.skip=true -Dcheckstyle.skip=true -Dspotbugs.skip=true
"${MAVEN}" "${MAVEN_ARGS[@]}" verify -fae --batch-mode -am "${modules[@]}" "${extra}"
big_console_header "Checking client artifacts on ${repostatus} with non-shaded clients"
echo_and_redirect "${logfile}" \
"${MAVEN}" "${MAVEN_ARGS[@]}" verify -fae --batch-mode -am \
"${modules[@]}" \
-DskipShade -Dtest=NoUnitTests -Dmaven.javadoc.skip=true -Dcheckstyle.skip=true -Dspotbugs.skip=true
-DskipShade -Dtest=NoUnitTests -Dmaven.javadoc.skip=true -Dcheckstyle.skip=true \
-Dspotbugs.skip=true "${extra}"
count=$("${GREP}" -c '\[ERROR\]' "${logfile}")
if [[ ${count} -gt 0 ]]; then

View File

@ -171,7 +171,17 @@ if [[ -n "${GPGBIN}" && ! "${HADOOP_SKIP_YETUS_VERIFICATION}" = true ]]; then
fi
fi
if ! (gunzip -c "${TARBALL}.gz" | tar xpf -); then
if [[ "$IS_WINDOWS" && "$IS_WINDOWS" == 1 ]]; then
gunzip -c "${TARBALL}.gz" | tar xpf -
# One of the entries in the Yetus tarball is a symlink, qbt.sh.
# Symlink creation fails on Windows unless this CI runs as Admin or
# Developer Mode is enabled.
# Thus, we create the qbt.sh symlink ourselves and move it to the target directory.
YETUS_PRECOMMIT_DIR="${YETUS_PREFIX}-${HADOOP_YETUS_VERSION}/lib/precommit"
ln -s "${YETUS_PRECOMMIT_DIR}/test-patch.sh" qbt.sh
mv qbt.sh "${YETUS_PRECOMMIT_DIR}"
elif ! (gunzip -c "${TARBALL}.gz" | tar xpf -); then
yetus_error "ERROR: ${TARBALL}.gz is corrupt. Investigate and then remove ${HADOOP_PATCHPROCESS} to try again."
exit 1
fi

View File

@ -74,7 +74,7 @@ ENV PATH "${PATH}:/opt/protobuf/bin"
###
# Avoid out of memory errors in builds
###
ENV MAVEN_OPTS -Xms256m -Xmx1536m
ENV MAVEN_OPTS -Xms256m -Xmx3072m
# Skip gpg verification when downloading Yetus via yetus-wrapper
ENV HADOOP_SKIP_YETUS_VERIFICATION true

View File

@ -0,0 +1,124 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Dockerfile for installing the necessary dependencies for building Hadoop.
# See BUILDING.txt.
FROM mcr.microsoft.com/windows:ltsc2019
# Disable the progress bar to speed up downloads.
# hadolint ignore=SC2086
RUN powershell $Global:ProgressPreference = 'SilentlyContinue'
# Restore the default Windows shell for correct batch processing.
SHELL ["cmd", "/S", "/C"]
# Install Visual Studio 2019 Build Tools.
RUN curl -SL --output vs_buildtools.exe https://aka.ms/vs/16/release/vs_buildtools.exe \
&& (start /w vs_buildtools.exe --quiet --wait --norestart --nocache \
--installPath "%ProgramFiles(x86)%\Microsoft Visual Studio\2019\BuildTools" \
--add Microsoft.VisualStudio.Workload.VCTools \
--add Microsoft.VisualStudio.Component.VC.ASAN \
--add Microsoft.VisualStudio.Component.VC.Tools.x86.x64 \
--add Microsoft.VisualStudio.Component.Windows10SDK.19041 \
|| IF "%ERRORLEVEL%"=="3010" EXIT 0) \
&& del /q vs_buildtools.exe
# Install Chocolatey.
RUN powershell -NoProfile -ExecutionPolicy Bypass -Command "iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))"
RUN setx PATH "%PATH%;%ALLUSERSPROFILE%\chocolatey\bin"
# Install git.
RUN choco install git.install -y
RUN powershell Copy-Item -Recurse -Path 'C:\Program Files\Git' -Destination C:\Git
# Install vcpkg.
# hadolint ignore=DL3003
RUN powershell git clone https://github.com/microsoft/vcpkg.git \
&& cd vcpkg \
&& git checkout 7ffa425e1db8b0c3edf9c50f2f3a0f25a324541d \
&& .\bootstrap-vcpkg.bat
RUN powershell .\vcpkg\vcpkg.exe install boost:x64-windows
RUN powershell .\vcpkg\vcpkg.exe install protobuf:x64-windows
RUN powershell .\vcpkg\vcpkg.exe install openssl:x64-windows
RUN powershell .\vcpkg\vcpkg.exe install zlib:x64-windows
ENV PROTOBUF_HOME "C:\vcpkg\installed\x64-windows"
# Install Azul Java 8 JDK.
RUN powershell Invoke-WebRequest -URI https://cdn.azul.com/zulu/bin/zulu8.62.0.19-ca-jdk8.0.332-win_x64.zip -OutFile $Env:TEMP\zulu8.62.0.19-ca-jdk8.0.332-win_x64.zip
RUN powershell Expand-Archive -Path $Env:TEMP\zulu8.62.0.19-ca-jdk8.0.332-win_x64.zip -DestinationPath "C:\Java"
ENV JAVA_HOME "C:\Java\zulu8.62.0.19-ca-jdk8.0.332-win_x64"
RUN setx PATH "%PATH%;%JAVA_HOME%\bin"
# Install Apache Maven.
RUN powershell Invoke-WebRequest -URI https://archive.apache.org/dist/maven/maven-3/3.8.6/binaries/apache-maven-3.8.6-bin.zip -OutFile $Env:TEMP\apache-maven-3.8.6-bin.zip
RUN powershell Expand-Archive -Path $Env:TEMP\apache-maven-3.8.6-bin.zip -DestinationPath "C:\Maven"
RUN setx PATH "%PATH%;C:\Maven\apache-maven-3.8.6\bin"
ENV MAVEN_OPTS '-Xmx2048M -Xss128M'
# Install CMake 3.19.0.
RUN powershell Invoke-WebRequest -URI https://cmake.org/files/v3.19/cmake-3.19.0-win64-x64.zip -OutFile $Env:TEMP\cmake-3.19.0-win64-x64.zip
RUN powershell Expand-Archive -Path $Env:TEMP\cmake-3.19.0-win64-x64.zip -DestinationPath "C:\CMake"
RUN setx PATH "%PATH%;C:\CMake\cmake-3.19.0-win64-x64\bin"
# Install zstd 1.5.4.
RUN powershell Invoke-WebRequest -Uri https://github.com/facebook/zstd/releases/download/v1.5.4/zstd-v1.5.4-win64.zip -OutFile $Env:TEMP\zstd-v1.5.4-win64.zip
RUN powershell Expand-Archive -Path $Env:TEMP\zstd-v1.5.4-win64.zip -DestinationPath "C:\ZStd"
RUN setx PATH "%PATH%;C:\ZStd"
# Install libopenssl 3.1.0 needed for rsync 3.2.7.
RUN powershell Invoke-WebRequest -Uri https://repo.msys2.org/msys/x86_64/libopenssl-3.1.0-1-x86_64.pkg.tar.zst -OutFile $Env:TEMP\libopenssl-3.1.0-1-x86_64.pkg.tar.zst
RUN powershell zstd -d $Env:TEMP\libopenssl-3.1.0-1-x86_64.pkg.tar.zst -o $Env:TEMP\libopenssl-3.1.0-1-x86_64.pkg.tar
RUN powershell mkdir "C:\LibOpenSSL"
RUN powershell tar -xvf $Env:TEMP\libopenssl-3.1.0-1-x86_64.pkg.tar -C "C:\LibOpenSSL"
# Install libxxhash 0.8.1 needed for rsync 3.2.7.
RUN powershell Invoke-WebRequest -Uri https://repo.msys2.org/msys/x86_64/libxxhash-0.8.1-1-x86_64.pkg.tar.zst -OutFile $Env:TEMP\libxxhash-0.8.1-1-x86_64.pkg.tar.zst
RUN powershell zstd -d $Env:TEMP\libxxhash-0.8.1-1-x86_64.pkg.tar.zst -o $Env:TEMP\libxxhash-0.8.1-1-x86_64.pkg.tar
RUN powershell mkdir "C:\LibXXHash"
RUN powershell tar -xvf $Env:TEMP\libxxhash-0.8.1-1-x86_64.pkg.tar -C "C:\LibXXHash"
# Install libzstd 1.5.4 needed for rsync 3.2.7.
RUN powershell Invoke-WebRequest -Uri https://repo.msys2.org/msys/x86_64/libzstd-1.5.4-1-x86_64.pkg.tar.zst -OutFile $Env:TEMP\libzstd-1.5.4-1-x86_64.pkg.tar.zst
RUN powershell zstd -d $Env:TEMP\libzstd-1.5.4-1-x86_64.pkg.tar.zst -o $Env:TEMP\libzstd-1.5.4-1-x86_64.pkg.tar
RUN powershell mkdir "C:\LibZStd"
RUN powershell tar -xvf $Env:TEMP\libzstd-1.5.4-1-x86_64.pkg.tar -C "C:\LibZStd"
# Install rsync 3.2.7.
RUN powershell Invoke-WebRequest -Uri https://repo.msys2.org/msys/x86_64/rsync-3.2.7-2-x86_64.pkg.tar.zst -OutFile $Env:TEMP\rsync-3.2.7-2-x86_64.pkg.tar.zst
RUN powershell zstd -d $Env:TEMP\rsync-3.2.7-2-x86_64.pkg.tar.zst -o $Env:TEMP\rsync-3.2.7-2-x86_64.pkg.tar
RUN powershell mkdir "C:\RSync"
RUN powershell tar -xvf $Env:TEMP\rsync-3.2.7-2-x86_64.pkg.tar -C "C:\RSync"
# Copy the dependencies of rsync 3.2.7.
RUN powershell Copy-Item -Path "C:\LibOpenSSL\usr\bin\*.dll" -Destination "C:\Program` Files\Git\usr\bin"
RUN powershell Copy-Item -Path "C:\LibXXHash\usr\bin\*.dll" -Destination "C:\Program` Files\Git\usr\bin"
RUN powershell Copy-Item -Path "C:\LibZStd\usr\bin\*.dll" -Destination "C:\Program` Files\Git\usr\bin"
RUN powershell Copy-Item -Path "C:\RSync\usr\bin\*" -Destination "C:\Program` Files\Git\usr\bin"
# Install Python 3.10.11.
RUN powershell Invoke-WebRequest -Uri https://www.python.org/ftp/python/3.10.11/python-3.10.11-embed-amd64.zip -OutFile $Env:TEMP\python-3.10.11-embed-amd64.zip
RUN powershell Expand-Archive -Path $Env:TEMP\python-3.10.11-embed-amd64.zip -DestinationPath "C:\Python3"
RUN powershell New-Item -ItemType HardLink -Value "C:\Python3\python.exe" -Path "C:\Python3\python3.exe"
RUN setx path "%PATH%;C:\Python3"
# We get strange Javadoc errors without this.
RUN setx classpath ""
RUN git config --global core.longpaths true
RUN setx PATH "%PATH%;C:\Program Files\Git\usr\bin"
# Define the entry point for the docker container.
ENTRYPOINT ["C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\BuildTools\\VC\\Auxiliary\\Build\\vcvars64.bat", "&&", "cmd.exe"]

View File

@ -48,7 +48,7 @@ is_platform_change() {
declare in_path
in_path="${SOURCEDIR}"/"${1}"
for path in "${SOURCEDIR}"/dev-support/docker/Dockerfile* "${SOURCEDIR}"/dev-support/docker/pkg-resolver/*.json; do
for path in "${DOCKERFILE}" "${SOURCEDIR}"/dev-support/docker/pkg-resolver/*.json; do
if [ "${in_path}" == "${path}" ]; then
echo "Found C/C++ platform related changes in ${in_path}"
return 0
@ -114,22 +114,47 @@ function check_ci_run() {
function run_ci() {
TESTPATCHBIN="${WORKSPACE}/${YETUS}/precommit/src/main/shell/test-patch.sh"
# this must be clean for every run
if [[ -d "${PATCHDIR}" ]]; then
rm -rf "${PATCHDIR:?}"
fi
mkdir -p "${PATCHDIR}"
if [[ "$IS_WINDOWS" && "$IS_WINDOWS" == 1 ]]; then
echo "Building in a Windows environment, skipping some Yetus related settings"
else
# run in docker mode and specifically point to our
# Dockerfile since we don't want to use the auto-pulled version.
YETUS_ARGS+=("--docker")
YETUS_ARGS+=("--dockerfile=${DOCKERFILE}")
YETUS_ARGS+=("--mvn-custom-repos")
YETUS_ARGS+=("--dockermemlimit=22g")
# if given a JIRA issue, process it. If CHANGE_URL is set
# (e.g., Github Branch Source plugin), process it.
# otherwise exit, because we don't want Hadoop to do a
# full build. We wouldn't normally do this check for smaller
# projects. :)
if [[ -n "${JIRA_ISSUE_KEY}" ]]; then
YETUS_ARGS+=("${JIRA_ISSUE_KEY}")
elif [[ -z "${CHANGE_URL}" ]]; then
echo "Full build skipped" >"${PATCHDIR}/report.html"
exit 0
# test with Java 8 and 11
YETUS_ARGS+=("--java-home=/usr/lib/jvm/java-8-openjdk-amd64")
YETUS_ARGS+=("--multijdkdirs=/usr/lib/jvm/java-11-openjdk-amd64")
YETUS_ARGS+=("--multijdktests=compile")
fi
if [[ "$IS_NIGHTLY_BUILD" && "$IS_NIGHTLY_BUILD" == 1 ]]; then
YETUS_ARGS+=("--empty-patch")
YETUS_ARGS+=("--branch=${BRANCH_NAME}")
else
# this must be clean for every run
if [[ -d "${PATCHDIR}" ]]; then
rm -rf "${PATCHDIR:?}"
fi
mkdir -p "${PATCHDIR}"
# if given a JIRA issue, process it. If CHANGE_URL is set
# (e.g., Github Branch Source plugin), process it.
# otherwise exit, because we don't want Hadoop to do a
# full build. We wouldn't normally do this check for smaller
# projects. :)
if [[ -n "${JIRA_ISSUE_KEY}" ]]; then
YETUS_ARGS+=("${JIRA_ISSUE_KEY}")
elif [[ -z "${CHANGE_URL}" ]]; then
echo "Full build skipped" >"${PATCHDIR}/report.html"
exit 0
fi
# write Yetus report as GitHub comment (YETUS-1102)
YETUS_ARGS+=("--github-write-comment")
YETUS_ARGS+=("--github-use-emoji-vote")
fi
YETUS_ARGS+=("--patch-dir=${PATCHDIR}")
@ -156,7 +181,6 @@ function run_ci() {
# changing these to higher values may cause problems
# with other jobs on systemd-enabled machines
YETUS_ARGS+=("--proclimit=5500")
YETUS_ARGS+=("--dockermemlimit=22g")
# -1 spotbugs issues that show up prior to the patch being applied
YETUS_ARGS+=("--spotbugs-strict-precheck")
@ -175,30 +199,15 @@ function run_ci() {
# much attention to them
YETUS_ARGS+=("--tests-filter=checkstyle")
# run in docker mode and specifically point to our
# Dockerfile since we don't want to use the auto-pulled version.
YETUS_ARGS+=("--docker")
YETUS_ARGS+=("--dockerfile=${DOCKERFILE}")
YETUS_ARGS+=("--mvn-custom-repos")
# effectively treat dev-support as a custom maven module
YETUS_ARGS+=("--skip-dirs=dev-support")
# help keep the ASF boxes clean
YETUS_ARGS+=("--sentinel")
# test with Java 8 and 11
YETUS_ARGS+=("--java-home=/usr/lib/jvm/java-8-openjdk-amd64")
YETUS_ARGS+=("--multijdkdirs=/usr/lib/jvm/java-11-openjdk-amd64")
YETUS_ARGS+=("--multijdktests=compile")
# custom javadoc goals
YETUS_ARGS+=("--mvn-javadoc-goals=process-sources,javadoc:javadoc-no-fork")
# write Yetus report as GitHub comment (YETUS-1102)
YETUS_ARGS+=("--github-write-comment")
YETUS_ARGS+=("--github-use-emoji-vote")
"${TESTPATCHBIN}" "${YETUS_ARGS[@]}"
}

View File

@ -98,13 +98,6 @@
<createSourcesJar>true</createSourcesJar>
<shadeSourcesContent>true</shadeSourcesContent>
</configuration>
<dependencies>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-maven-plugins</artifactId>
<version>${project.version}</version>
</dependency>
</dependencies>
<executions>
<execution>
<phase>package</phase>
@ -254,8 +247,7 @@
</relocation>
</relocations>
<transformers>
<!-- Needed until MSHADE-182 -->
<transformer implementation="org.apache.hadoop.maven.plugin.shade.resource.ServicesResourceTransformer"/>
<transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
<transformer implementation="org.apache.maven.plugins.shade.resource.ApacheLicenseResourceTransformer"/>
<transformer implementation="org.apache.maven.plugins.shade.resource.DontIncludeResourceTransformer">
<resource>NOTICE.txt</resource>

View File

@ -671,13 +671,6 @@
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<dependencies>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-maven-plugins</artifactId>
<version>${project.version}</version>
</dependency>
</dependencies>
<executions>
<execution>
<phase>package</phase>
@ -704,7 +697,6 @@
<exclude>org.bouncycastle:*</exclude>
<!-- Leave snappy that includes native methods which cannot be relocated. -->
<exclude>org.xerial.snappy:*</exclude>
<exclude>javax.ws.rs:javax.ws.rs-api</exclude>
</excludes>
</artifactSet>
<filters>
@ -1053,8 +1045,7 @@
</relocation>
</relocations>
<transformers>
<!-- Needed until MSHADE-182 -->
<transformer implementation="org.apache.hadoop.maven.plugin.shade.resource.ServicesResourceTransformer"/>
<transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
<transformer implementation="org.apache.maven.plugins.shade.resource.ApacheLicenseResourceTransformer"/>
<transformer implementation="org.apache.maven.plugins.shade.resource.DontIncludeResourceTransformer">
<resources>

View File

@ -128,13 +128,6 @@
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<dependencies>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-maven-plugins</artifactId>
<version>${project.version}</version>
</dependency>
</dependencies>
<executions>
<execution>
<phase>package</phase>
@ -155,6 +148,7 @@
<!-- Leave javax APIs that are stable -->
<!-- the jdk ships part of the javax.annotation namespace, so if we want to relocate this we'll have to carve it out by class :( -->
<exclude>com.google.code.findbugs:jsr305</exclude>
<exclude>io.netty:*</exclude>
<exclude>io.dropwizard.metrics:metrics-core</exclude>
<exclude>org.eclipse.jetty:jetty-servlet</exclude>
<exclude>org.eclipse.jetty:jetty-security</exclude>
@ -163,7 +157,8 @@
<exclude>org.bouncycastle:*</exclude>
<!-- Leave snappy that includes native methods which cannot be relocated. -->
<exclude>org.xerial.snappy:*</exclude>
<exclude>javax.ws.rs:javax.ws.rs-api</exclude>
<!-- leave out kotlin classes -->
<exclude>org.jetbrains.kotlin:*</exclude>
</excludes>
</artifactSet>
<filters>
@ -398,8 +393,7 @@
-->
</relocations>
<transformers>
<!-- Needed until MSHADE-182 -->
<transformer implementation="org.apache.hadoop.maven.plugin.shade.resource.ServicesResourceTransformer"/>
<transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
<transformer implementation="org.apache.maven.plugins.shade.resource.ApacheLicenseResourceTransformer"/>
<transformer implementation="org.apache.maven.plugins.shade.resource.DontIncludeResourceTransformer">
<resources>

View File

@ -69,6 +69,10 @@
<groupId>com.github.pjfanning</groupId>
<artifactId>jersey-json</artifactId>
</exclusion>
<exclusion>
<groupId>org.codehaus.jettison</groupId>
<artifactId>jettison</artifactId>
</exclusion>
<exclusion>
<groupId>com.sun.jersey</groupId>
<artifactId>jersey-server</artifactId>
@ -182,6 +186,10 @@
<groupId>com.github.pjfanning</groupId>
<artifactId>jersey-json</artifactId>
</exclusion>
<exclusion>
<groupId>org.codehaus.jettison</groupId>
<artifactId>jettison</artifactId>
</exclusion>
<exclusion>
<groupId>io.netty</groupId>
<artifactId>netty</artifactId>
@ -233,6 +241,10 @@
<groupId>com.github.pjfanning</groupId>
<artifactId>jersey-json</artifactId>
</exclusion>
<exclusion>
<groupId>org.codehaus.jettison</groupId>
<artifactId>jettison</artifactId>
</exclusion>
<exclusion>
<groupId>com.sun.jersey</groupId>
<artifactId>jersey-servlet</artifactId>
@ -290,6 +302,10 @@
<groupId>com.github.pjfanning</groupId>
<artifactId>jersey-json</artifactId>
</exclusion>
<exclusion>
<groupId>org.codehaus.jettison</groupId>
<artifactId>jettison</artifactId>
</exclusion>
<exclusion>
<groupId>io.netty</groupId>
<artifactId>netty</artifactId>

View File

@ -127,11 +127,6 @@
<artifactId>hadoop-azure-datalake</artifactId>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-openstack</artifactId>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-cos</artifactId>

View File

@ -110,20 +110,8 @@
<groupId>org.bouncycastle</groupId>
<artifactId>bcprov-jdk15on</artifactId>
</exclusion>
<!-- HACK. Transitive dependency for nimbus-jose-jwt. Needed for
packaging. Please re-check this version when updating
nimbus-jose-jwt. Please read HADOOP-14903 for more details.
-->
<exclusion>
<groupId>net.minidev</groupId>
<artifactId>json-smart</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>net.minidev</groupId>
<artifactId>json-smart</artifactId>
</dependency>
<dependency>
<groupId>org.apache.zookeeper</groupId>
<artifactId>zookeeper</artifactId>

View File

@ -18,6 +18,10 @@
package org.apache.hadoop.util;
import java.security.AccessController;
import java.security.PrivilegedAction;
import java.util.Arrays;
import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;
@ -33,10 +37,10 @@ public class PlatformName {
* per the java-vm.
*/
public static final String PLATFORM_NAME =
(System.getProperty("os.name").startsWith("Windows")
? System.getenv("os") : System.getProperty("os.name"))
+ "-" + System.getProperty("os.arch")
+ "-" + System.getProperty("sun.arch.data.model");
(System.getProperty("os.name").startsWith("Windows") ?
System.getenv("os") : System.getProperty("os.name"))
+ "-" + System.getProperty("os.arch") + "-"
+ System.getProperty("sun.arch.data.model");
/**
* The java vendor name used in this platform.
@ -44,10 +48,60 @@ public class PlatformName {
public static final String JAVA_VENDOR_NAME = System.getProperty("java.vendor");
/**
* A public static variable to indicate the current java vendor is
* IBM java or not.
* Define a system class accessor that is open to changes in underlying implementations
* of the system class loader modules.
*/
public static final boolean IBM_JAVA = JAVA_VENDOR_NAME.contains("IBM");
private static final class SystemClassAccessor extends ClassLoader {
public Class<?> getSystemClass(String className) throws ClassNotFoundException {
return findSystemClass(className);
}
}
/**
* A public static variable to indicate whether the current Java vendor is
* IBM and the type is Java Technology Edition, which provides its
* own implementations of many security packages and cipher suites.
* Note that these are not provided in Semeru runtimes:
* See https://developer.ibm.com/languages/java/semeru-runtimes for details.
*/
public static final boolean IBM_JAVA = JAVA_VENDOR_NAME.contains("IBM") &&
hasIbmTechnologyEditionModules();
private static boolean hasIbmTechnologyEditionModules() {
return Arrays.asList(
"com.ibm.security.auth.module.JAASLoginModule",
"com.ibm.security.auth.module.Win64LoginModule",
"com.ibm.security.auth.module.NTLoginModule",
"com.ibm.security.auth.module.AIX64LoginModule",
"com.ibm.security.auth.module.LinuxLoginModule",
"com.ibm.security.auth.module.Krb5LoginModule"
).stream().anyMatch((module) -> isSystemClassAvailable(module));
}
/**
* In rare cases where behaviour differs based on the JVM vendor, this method
* should be used to test for a unique JVM class provided by that vendor rather
* than relying on the vendor property. For example, if one JVM provides a
* different Kerberos login module, testing that the login module is loadable
* before configuring it is preferable to using the vendor data.
*
* @param className the name of a class in the JVM to test for
* @return true if the class is available, false otherwise.
*/
private static boolean isSystemClassAvailable(String className) {
return AccessController.doPrivileged((PrivilegedAction<Boolean>) () -> {
try {
// Using ClassLoader.findSystemClass() instead of
// Class.forName(className, false, null) because Class.forName with a null
// ClassLoader only looks at the boot ClassLoader with Java 9 and above
// which doesn't look at all the modules available to the findSystemClass.
new SystemClassAccessor().getSystemClass(className);
return true;
} catch (Exception ignored) {
return false;
}
});
}
public static void main(String[] args) {
System.out.println(PLATFORM_NAME);

View File
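
The vendor probe above can be exercised outside Hadoop. A minimal standalone sketch of the same pattern, assuming only the JDK (the AccessController.doPrivileged wrapper is omitted for brevity, and the probed class name is just the IBM example from the list above):

public class VendorProbe {
    // findSystemClass() is protected in ClassLoader, so a trivial subclass
    // exposes it, exactly as PlatformName.SystemClassAccessor does.
    private static final class SystemClassAccessor extends ClassLoader {
        Class<?> getSystemClass(String className) throws ClassNotFoundException {
            return findSystemClass(className);
        }
    }

    static boolean isSystemClassAvailable(String className) {
        try {
            new SystemClassAccessor().getSystemClass(className);
            return true;
        } catch (Exception ignored) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Prints true only on an IBM Java Technology Edition runtime.
        System.out.println(isSystemClassAvailable(
            "com.ibm.security.auth.module.Krb5LoginModule"));
    }
}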

@ -24,7 +24,7 @@ This filter must be configured in front of all the web application resources tha
The Hadoop Auth and dependent JAR files must be in the web application classpath (commonly the `WEB-INF/lib` directory).
Hadoop Auth uses SLF4J-API for logging. Auth Maven POM dependencies define the SLF4J API dependency but it does not define the dependency on a concrete logging implementation, this must be addded explicitly to the web application. For example, if the web applicationan uses Log4j, the SLF4J-LOG4J12 and LOG4J jar files must be part part of the web application classpath as well as the Log4j configuration file.
Hadoop Auth uses SLF4J-API for logging. The Auth Maven POM declares the SLF4J API dependency but does not declare a concrete logging implementation; this must be added explicitly to the web application. For example, if the web application uses Log4j, the SLF4J-LOG4J12 and LOG4J jar files must be part of the web application classpath, as well as the Log4j configuration file.
### Common Configuration parameters

View File

@ -379,21 +379,6 @@
<Bug code="JLM" />
</Match>
<!--
OpenStack Swift FS module -closes streams in a different method
from where they are opened.
-->
<Match>
<Class name="org.apache.hadoop.fs.swift.snative.SwiftNativeOutputStream"/>
<Method name="uploadFileAttempt"/>
<Bug pattern="OBL_UNSATISFIED_OBLIGATION"/>
</Match>
<Match>
<Class name="org.apache.hadoop.fs.swift.snative.SwiftNativeOutputStream"/>
<Method name="uploadFilePartAttempt"/>
<Bug pattern="OBL_UNSATISFIED_OBLIGATION"/>
</Match>
<!-- code from maven source, null value is checked at callee side. -->
<Match>
<Class name="org.apache.hadoop.util.ComparableVersion$ListItem" />

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@ -176,13 +176,16 @@
</exclusions>
</dependency>
<dependency>
<groupId>com.sun.jersey</groupId>
<artifactId>jersey-server</artifactId>
<scope>compile</scope>
<!--
adding jettison as a direct dependency (jersey-json's transitive jettison 1.1 is vulnerable),
so those who depend on hadoop-common externally will get the non-vulnerable jettison
-->
<groupId>org.codehaus.jettison</groupId>
<artifactId>jettison</artifactId>
</dependency>
<dependency>
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
<groupId>com.sun.jersey</groupId>
<artifactId>jersey-server</artifactId>
<scope>compile</scope>
</dependency>
<dependency>
@ -200,11 +203,6 @@
<artifactId>assertj-core</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.glassfish.grizzly</groupId>
<artifactId>grizzly-http-servlet</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>commons-beanutils</groupId>
<artifactId>commons-beanutils</artifactId>
@ -342,6 +340,14 @@
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>io.netty</groupId>
<artifactId>netty-handler</artifactId>
</dependency>
<dependency>
<groupId>io.netty</groupId>
<artifactId>netty-transport-native-epoll</artifactId>
</dependency>
<dependency>
<groupId>io.dropwizard.metrics</groupId>
<artifactId>metrics-core</artifactId>
@ -383,6 +389,11 @@
<artifactId>mockwebserver</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.squareup.okio</groupId>
<artifactId>okio-jvm</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>dnsjava</groupId>
<artifactId>dnsjava</artifactId>
@ -649,9 +660,10 @@
<goal>exec</goal>
</goals>
<configuration>
<executable>${basedir}/../../dev-support/bin/shelldocs</executable>
<executable>${shell-executable}</executable>
<workingDirectory>src/site/markdown</workingDirectory>
<arguments>
<argument>${basedir}/../../dev-support/bin/shelldocs</argument>
<argument>--skipprnorep</argument>
<argument>--output</argument>
<argument>${basedir}/src/site/markdown/UnixShellAPI.md</argument>
@ -841,6 +853,36 @@
</execution>
</executions>
</plugin>
<plugin>
<!--Sets the skip.platformToolsetDetection to true if use.platformToolsetVersion is specified.
This implies that the automatic detection of which platform toolset to use will be skipped
and the one specified with use.platformToolsetVersion will be used.-->
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-antrun-plugin</artifactId>
<version>1.8</version>
<executions>
<execution>
<phase>validate</phase>
<goals>
<goal>run</goal>
</goals>
<configuration>
<exportAntProperties>true</exportAntProperties>
<target>
<condition property="skip.platformToolsetDetection" value="true" else="false">
<isset property="use.platformToolsetVersion"/>
</condition>
<!--Unfortunately, Maven doesn't have a way to negate a flag, thus we declare a
property which holds the negated value of skip.platformToolsetDetection.-->
<condition property="skip.platformToolsetDetection.negated" value="false" else="true">
<isset property="use.platformToolsetVersion"/>
</condition>
<echo>Skip platform toolset version detection = ${skip.platformToolsetDetection}</echo>
</target>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>exec-maven-plugin</artifactId>
@ -852,6 +894,7 @@
<goal>exec</goal>
</goals>
<configuration>
<skip>${skip.platformToolsetDetection}</skip>
<executable>${basedir}\..\..\dev-support\bin\win-vs-upgrade.cmd</executable>
<arguments>
<argument>${basedir}\src\main\winutils</argument>
@ -866,6 +909,7 @@
<goal>exec</goal>
</goals>
<configuration>
<skip>${skip.platformToolsetDetection}</skip>
<executable>msbuild</executable>
<arguments>
<argument>${basedir}/src/main/winutils/winutils.sln</argument>
@ -878,6 +922,27 @@
</arguments>
</configuration>
</execution>
<execution>
<id>compile-ms-winutils-using-build-tools</id>
<phase>compile</phase>
<goals>
<goal>exec</goal>
</goals>
<configuration>
<skip>${skip.platformToolsetDetection.negated}</skip>
<executable>msbuild</executable>
<arguments>
<argument>${basedir}/src/main/winutils/winutils.sln</argument>
<argument>/nologo</argument>
<argument>/p:Configuration=Release</argument>
<argument>/p:OutDir=${project.build.directory}/bin/</argument>
<argument>/p:IntermediateOutputPath=${project.build.directory}/winutils/</argument>
<argument>/p:WsceConfigDir=${wsce.config.dir}</argument>
<argument>/p:WsceConfigFile=${wsce.config.file}</argument>
<argument>/p:PlatformToolset=${use.platformToolsetVersion}</argument>
</arguments>
</configuration>
</execution>
<execution>
<id>convert-ms-native-dll</id>
<phase>generate-sources</phase>
@ -885,6 +950,7 @@
<goal>exec</goal>
</goals>
<configuration>
<skip>${skip.platformToolsetDetection}</skip>
<executable>${basedir}\..\..\dev-support\bin\win-vs-upgrade.cmd</executable>
<arguments>
<argument>${basedir}\src\main\native</argument>
@ -899,6 +965,7 @@
<goal>exec</goal>
</goals>
<configuration>
<skip>${skip.platformToolsetDetection}</skip>
<executable>msbuild</executable>
<arguments>
<argument>${basedir}/src/main/native/native.sln</argument>
@ -919,6 +986,35 @@
</arguments>
</configuration>
</execution>
<execution>
<id>compile-ms-native-dll-using-build-tools</id>
<phase>compile</phase>
<goals>
<goal>exec</goal>
</goals>
<configuration>
<skip>${skip.platformToolsetDetection.negated}</skip>
<executable>msbuild</executable>
<arguments>
<argument>${basedir}/src/main/native/native.sln</argument>
<argument>/nologo</argument>
<argument>/p:Configuration=Release</argument>
<argument>/p:OutDir=${project.build.directory}/bin/</argument>
<argument>/p:CustomZstdPrefix=${zstd.prefix}</argument>
<argument>/p:CustomZstdLib=${zstd.lib}</argument>
<argument>/p:CustomZstdInclude=${zstd.include}</argument>
<argument>/p:RequireZstd=${require.zstd}</argument>
<argument>/p:CustomOpensslPrefix=${openssl.prefix}</argument>
<argument>/p:CustomOpensslLib=${openssl.lib}</argument>
<argument>/p:CustomOpensslInclude=${openssl.include}</argument>
<argument>/p:RequireOpenssl=${require.openssl}</argument>
<argument>/p:RequireIsal=${require.isal}</argument>
<argument>/p:CustomIsalPrefix=${isal.prefix}</argument>
<argument>/p:CustomIsalLib=${isal.lib}</argument>
<argument>/p:PlatformToolset=${use.platformToolsetVersion}</argument>
</arguments>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
@ -1151,7 +1247,7 @@
<id>src-test-compile-protoc-legacy</id>
<phase>generate-test-sources</phase>
<goals>
<goal>compile</goal>
<goal>test-compile</goal>
</goals>
<configuration>
<skip>false</skip>
@ -1160,7 +1256,7 @@
com.google.protobuf:protoc:${protobuf.version}:exe:${os.detected.classifier}
</protocArtifact>
<includeDependenciesInDescriptorSet>false</includeDependenciesInDescriptorSet>
<protoSourceRoot>${basedir}/src/test/proto</protoSourceRoot>
<protoTestSourceRoot>${basedir}/src/test/proto</protoTestSourceRoot>
<outputDirectory>${project.build.directory}/generated-test-sources/java</outputDirectory>
<clearOutputDirectory>false</clearOutputDirectory>
<includes>

View File

@ -26,9 +26,9 @@ MYNAME="${BASH_SOURCE-$0}"
function hadoop_usage
{
hadoop_add_option "buildpaths" "attempt to add class files from build tree"
hadoop_add_option "hostnames list[,of,host,names]" "hosts to use in slave mode"
hadoop_add_option "hostnames list[,of,host,names]" "hosts to use in worker mode"
hadoop_add_option "loglevel level" "set the log4j level for this command"
hadoop_add_option "hosts filename" "list of hosts to use in slave mode"
hadoop_add_option "hosts filename" "list of hosts to use in worker mode"
hadoop_add_option "workers" "turn on worker mode"
hadoop_add_subcommand "checknative" client "check native Hadoop and compression libraries availability"

View File

@ -16,7 +16,7 @@
# limitations under the License.
# Run a Hadoop command on all slave hosts.
# Run a Hadoop command on all worker hosts.
function hadoop_usage
{

View File

@ -53,6 +53,10 @@
# variable is REQUIRED on ALL platforms except OS X!
# export JAVA_HOME=
# The language environment in which Hadoop runs. Use the English
# environment to ensure that logs are printed as expected.
export LANG=en_US.UTF-8
# Location of Hadoop. By default, Hadoop will attempt to determine
# this location based upon its execution path.
# export HADOOP_HOME=

View File

@ -75,14 +75,6 @@ log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
#
# TaskLog Appender
#
log4j.appender.TLA=org.apache.hadoop.mapred.TaskLogAppender
log4j.appender.TLA.layout=org.apache.log4j.PatternLayout
log4j.appender.TLA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
#
# HDFS block state change log from block manager
#

View File

@ -98,7 +98,7 @@ public class ConfServlet extends HttpServlet {
if (FORMAT_JSON.equals(format)) {
Configuration.dumpConfiguration(conf, propertyName, out);
} else if (FORMAT_XML.equals(format)) {
conf.writeXml(propertyName, out);
conf.writeXml(propertyName, out, conf);
} else {
throw new BadFormatException("Bad format: " + format);
}

View File

@ -37,6 +37,7 @@ import org.apache.hadoop.util.StringUtils;
public class ConfigRedactor {
private static final String REDACTED_TEXT = "<redacted>";
private static final String REDACTED_XML = "******";
private List<Pattern> compiledPatterns;
@ -84,4 +85,19 @@ public class ConfigRedactor {
}
return false;
}
/**
* Given a key / value pair, decides whether or not to redact and returns
* either the original value or text indicating it has been redacted.
*
* @param key param key.
* @param value param value; returned unchanged if the key is not sensitive.
* @return Original value, or text indicating it has been redacted
*/
public String redactXml(String key, String value) {
if (configIsSensitive(key)) {
return REDACTED_XML;
}
return value;
}
}

View File

@ -24,7 +24,6 @@ import com.ctc.wstx.io.SystemId;
import com.ctc.wstx.stax.WstxInputFactory;
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonGenerator;
import org.apache.hadoop.classification.VisibleForTesting;
import java.io.BufferedInputStream;
import java.io.DataInput;
@ -87,6 +86,7 @@ import org.apache.hadoop.thirdparty.com.google.common.base.Charsets;
import org.apache.commons.collections.map.UnmodifiableMap;
import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;
import org.apache.hadoop.classification.VisibleForTesting;
import org.apache.hadoop.fs.CommonConfigurationKeysPublic;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
@ -98,18 +98,19 @@ import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.alias.CredentialProvider;
import org.apache.hadoop.security.alias.CredentialProvider.CredentialEntry;
import org.apache.hadoop.security.alias.CredentialProviderFactory;
import org.apache.hadoop.thirdparty.com.google.common.base.Strings;
import org.apache.hadoop.util.Preconditions;
import org.apache.hadoop.util.ReflectionUtils;
import org.apache.hadoop.util.StringInterner;
import org.apache.hadoop.util.StringUtils;
import org.apache.hadoop.util.XMLUtils;
import org.codehaus.stax2.XMLStreamReader2;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.apache.hadoop.util.Preconditions;
import org.apache.hadoop.thirdparty.com.google.common.base.Strings;
import static org.apache.commons.lang3.StringUtils.isBlank;
import static org.apache.commons.lang3.StringUtils.isNotBlank;
@ -3593,16 +3594,18 @@ public class Configuration implements Iterable<Map.Entry<String,String>>,
* </ul>
* @param propertyName xml property name.
* @param out the writer to write to.
* @param config configuration.
* @throws IOException raised on errors performing I/O.
*/
public void writeXml(@Nullable String propertyName, Writer out)
public void writeXml(@Nullable String propertyName, Writer out, Configuration config)
throws IOException, IllegalArgumentException {
Document doc = asXmlDocument(propertyName);
ConfigRedactor redactor = config != null ? new ConfigRedactor(this) : null;
Document doc = asXmlDocument(propertyName, redactor);
try {
DOMSource source = new DOMSource(doc);
StreamResult result = new StreamResult(out);
TransformerFactory transFactory = TransformerFactory.newInstance();
TransformerFactory transFactory = XMLUtils.newSecureTransformerFactory();
Transformer transformer = transFactory.newTransformer();
// Important to not hold Configuration log while writing result, since
@ -3614,11 +3617,16 @@ public class Configuration implements Iterable<Map.Entry<String,String>>,
}
}
public void writeXml(@Nullable String propertyName, Writer out)
throws IOException, IllegalArgumentException {
writeXml(propertyName, out, null);
}
/**
* Return the XML DOM corresponding to this Configuration.
*/
private synchronized Document asXmlDocument(@Nullable String propertyName)
throws IOException, IllegalArgumentException {
private synchronized Document asXmlDocument(@Nullable String propertyName,
ConfigRedactor redactor) throws IOException, IllegalArgumentException {
Document doc;
try {
doc = DocumentBuilderFactory
@ -3641,13 +3649,13 @@ public class Configuration implements Iterable<Map.Entry<String,String>>,
propertyName + " not found");
} else {
// given property is found, write single property
appendXMLProperty(doc, conf, propertyName);
appendXMLProperty(doc, conf, propertyName, redactor);
conf.appendChild(doc.createTextNode("\n"));
}
} else {
// append all elements
for (Enumeration<Object> e = properties.keys(); e.hasMoreElements();) {
appendXMLProperty(doc, conf, (String)e.nextElement());
appendXMLProperty(doc, conf, (String)e.nextElement(), redactor);
conf.appendChild(doc.createTextNode("\n"));
}
}
@ -3663,7 +3671,7 @@ public class Configuration implements Iterable<Map.Entry<String,String>>,
* @param propertyName
*/
private synchronized void appendXMLProperty(Document doc, Element conf,
String propertyName) {
String propertyName, ConfigRedactor redactor) {
// skip writing if given property name is empty or null
if (!Strings.isNullOrEmpty(propertyName)) {
String value = properties.getProperty(propertyName);
@ -3676,8 +3684,11 @@ public class Configuration implements Iterable<Map.Entry<String,String>>,
propNode.appendChild(nameNode);
Element valueNode = doc.createElement("value");
valueNode.appendChild(doc.createTextNode(
properties.getProperty(propertyName)));
String propertyValue = properties.getProperty(propertyName);
if (redactor != null) {
propertyValue = redactor.redactXml(propertyName, propertyValue);
}
valueNode.appendChild(doc.createTextNode(propertyValue));
propNode.appendChild(valueNode);
Element finalNode = doc.createElement("final");

View File
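
Together with the ConfServlet change above, the new three-argument writeXml() means a servlet-driven XML dump now redacts sensitive values. A hedged usage sketch (the property name and value are illustrative; whether a key counts as sensitive depends on hadoop.security.sensitive-config-keys):

import java.io.StringWriter;

import org.apache.hadoop.conf.Configuration;

public class RedactedDumpDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration(false);
        conf.set("fs.s3a.demo.secret.key", "s3cr3t"); // matches a default sensitive pattern
        StringWriter out = new StringWriter();
        // Passing a non-null Configuration switches on the ConfigRedactor path,
        // so the sensitive value is written as ****** rather than the secret.
        conf.writeXml(null, out, conf);
        System.out.println(out);
    }
}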

@ -241,12 +241,15 @@ public class CryptoOutputStream extends FilterOutputStream implements
return;
}
try {
flush();
if (closeOutputStream) {
super.close();
codec.close();
try {
flush();
} finally {
if (closeOutputStream) {
super.close();
codec.close();
}
freeBuffers();
}
freeBuffers();
} finally {
closed = true;
}

View File
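
The reworked close() above nests try/finally so that a failure in flush() can no longer skip closing the wrapped stream or freeing the buffers. A minimal sketch of the same pattern, with hypothetical method names:

import java.io.IOException;

public class SafeClose {
    private boolean closed;

    void flush() throws IOException { /* may throw */ }
    void closeDelegate() throws IOException { /* close the wrapped stream */ }
    void freeBuffers() { /* release direct buffers */ }

    public void close() throws IOException {
        if (closed) {
            return;
        }
        try {
            try {
                flush();
            } finally {
                // runs even if flush() threw
                closeDelegate();
                freeBuffers();
            }
        } finally {
            // mark closed exactly once, whatever happened above
            closed = true;
        }
    }
}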

@ -639,13 +639,14 @@ public abstract class KeyProvider implements Closeable {
public abstract void flush() throws IOException;
/**
* Split the versionName in to a base name. Converts "/aaa/bbb/3" to
* Split the versionName in to a base name. Converts "/aaa/bbb@3" to
* "/aaa/bbb".
* @param versionName the version name to split
* @return the base name of the key
* @throws IOException raised on errors performing I/O.
*/
public static String getBaseName(String versionName) throws IOException {
Objects.requireNonNull(versionName, "VersionName cannot be null");
int div = versionName.lastIndexOf('@');
if (div == -1) {
throw new IOException("No version in key path " + versionName);

View File
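
The corrected javadoc now matches the '@' separator the code actually splits on; a brief hedged illustration (the version names are examples only):

import java.io.IOException;

import org.apache.hadoop.crypto.key.KeyProvider;

public class BaseNameDemo {
    public static void main(String[] args) throws IOException {
        // Splits on the last '@': prints /aaa/bbb
        System.out.println(KeyProvider.getBaseName("/aaa/bbb@3"));
        // "no-version" would throw IOException: no version in the key path.
        // null now fails fast with a NullPointerException thanks to
        // Objects.requireNonNull, instead of an opaque NPE in lastIndexOf.
    }
}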

@ -60,7 +60,6 @@ public class AvroFSInput implements Closeable, SeekableInput {
FS_OPTION_OPENFILE_READ_POLICY_SEQUENTIAL)
.withFileStatus(status)
.build());
fc.open(p);
}
@Override

View File

@ -174,6 +174,7 @@ public abstract class ChecksumFileSystem extends FilterFileSystem {
private static final int HEADER_LENGTH = 8;
private int bytesPerSum = 1;
private long fileLen = -1L;
public ChecksumFSInputChecker(ChecksumFileSystem fs, Path file)
throws IOException {
@ -320,6 +321,18 @@ public abstract class ChecksumFileSystem extends FilterFileSystem {
return HEADER_LENGTH + (dataOffset/bytesPerSum) * FSInputChecker.CHECKSUM_SIZE;
}
/**
* Calculate length of file if not already cached.
* @return file length.
* @throws IOException any IOE.
*/
private long getFileLength() throws IOException {
if (fileLen == -1L) {
fileLen = fs.getFileStatus(file).getLen();
}
return fileLen;
}
/**
* Find the checksum ranges that correspond to the given data ranges.
* @param dataRanges the input data ranges, which are assumed to be sorted
@ -371,13 +384,28 @@ public abstract class ChecksumFileSystem extends FilterFileSystem {
IntBuffer sums = sumsBytes.asIntBuffer();
sums.position(offset / FSInputChecker.CHECKSUM_SIZE);
ByteBuffer current = data.duplicate();
int numChunks = data.remaining() / bytesPerSum;
int numFullChunks = data.remaining() / bytesPerSum;
boolean partialChunk = ((data.remaining() % bytesPerSum) != 0);
int totalChunks = numFullChunks;
if (partialChunk) {
totalChunks++;
}
CRC32 crc = new CRC32();
// check each chunk to ensure they match
for(int c = 0; c < numChunks; ++c) {
// set the buffer position and the limit
current.limit((c + 1) * bytesPerSum);
for(int c = 0; c < totalChunks; ++c) {
// set the buffer position to the start of every chunk.
current.position(c * bytesPerSum);
if (c == numFullChunks) {
// The last chunk may contain less than a full chunk of
// data, so set the limit accordingly.
int lastIncompleteChunk = data.remaining() % bytesPerSum;
current.limit((c * bytesPerSum) + lastIncompleteChunk);
} else {
// set the buffer limit to end of every chunk.
current.limit((c + 1) * bytesPerSum);
}
// compute the crc
crc.reset();
crc.update(current);
@ -396,11 +424,34 @@ public abstract class ChecksumFileSystem extends FilterFileSystem {
return data;
}
/**
* Validates range parameters.
* In the case of checksum FS, the file length has already been
* calculated, so we fail fast here.
* @param ranges requested ranges.
* @param fileLength length of file.
* @throws EOFException end of file exception.
*/
private void validateRangeRequest(List<? extends FileRange> ranges,
final long fileLength) throws EOFException {
for (FileRange range : ranges) {
VectoredReadUtils.validateRangeRequest(range);
if (range.getOffset() + range.getLength() > fileLength) {
final String errMsg = String.format("Requested range [%d, %d) is beyond EOF for path %s",
range.getOffset(), range.getLength(), file);
LOG.warn(errMsg);
throw new EOFException(errMsg);
}
}
}
@Override
public void readVectored(List<? extends FileRange> ranges,
IntFunction<ByteBuffer> allocate) throws IOException {
final long length = getFileLength();
validateRangeRequest(ranges, length);
// If the stream doesn't have checksums, just delegate.
VectoredReadUtils.validateVectoredReadRanges(ranges);
if (sums == null) {
datas.readVectored(ranges, allocate);
return;
@ -410,15 +461,18 @@ public abstract class ChecksumFileSystem extends FilterFileSystem {
List<CombinedFileRange> dataRanges =
VectoredReadUtils.mergeSortedRanges(Arrays.asList(sortRanges(ranges)), bytesPerSum,
minSeek, maxReadSizeForVectorReads());
// While merging the ranges above, they are rounded up based on the value of bytesPerSum,
// which can push some ranges past EOF; those ranges must be clamped here, else the
// actual reads will fail with an EOFException.
for (CombinedFileRange range : dataRanges) {
if (range.getOffset() + range.getLength() > length) {
range.setLength((int) (length - range.getOffset()));
}
}
List<CombinedFileRange> checksumRanges = findChecksumRanges(dataRanges,
bytesPerSum, minSeek, maxSize);
sums.readVectored(checksumRanges, allocate);
datas.readVectored(dataRanges, allocate);
// The data read here is correct (contents of dataRanges were verified).
// A known issue remains below: testVectoredReadMultipleRanges fails,
// most likely while slicing the merged data back into the smaller
// user ranges.
for(CombinedFileRange checksumRange: checksumRanges) {
for(FileRange dataRange: checksumRange.getUnderlying()) {
// when we have both the ranges, validate the checksum

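For illustration, a minimal standalone sketch of the per-chunk CRC32 pattern used above, assuming the buffer's position is zero (this is not the ChecksumFileSystem code itself, just the same clipping logic):

import java.nio.ByteBuffer;
import java.util.zip.CRC32;

public class ChunkChecksumSketch {
  // One CRC32 per bytesPerSum-sized chunk; the final partial chunk is clipped.
  // Assumes data.position() == 0.
  static long[] chunkCrcs(ByteBuffer data, int bytesPerSum) {
    int remaining = data.remaining();
    int fullChunks = remaining / bytesPerSum;
    int totalChunks = fullChunks + ((remaining % bytesPerSum != 0) ? 1 : 0);
    long[] crcs = new long[totalChunks];
    CRC32 crc = new CRC32();
    for (int c = 0; c < totalChunks; c++) {
      ByteBuffer chunk = data.duplicate();
      chunk.position(c * bytesPerSum);
      // clip the last chunk to whatever data is left
      chunk.limit(Math.min((c + 1) * bytesPerSum, remaining));
      crc.reset();
      crc.update(chunk);
      crcs[c] = crc.getValue();
    }
    return crcs;
  }
}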
View File

@ -417,6 +417,14 @@ public class CommonConfigurationKeys extends CommonConfigurationKeysPublic {
/** How often to retry a ZooKeeper operation in milliseconds. */
public static final String ZK_RETRY_INTERVAL_MS =
ZK_PREFIX + "retry-interval-ms";
/** Keystore location for ZooKeeper client connection over SSL. */
public static final String ZK_SSL_KEYSTORE_LOCATION = ZK_PREFIX + "ssl.keystore.location";
/** Keystore password for ZooKeeper client connection over SSL. */
public static final String ZK_SSL_KEYSTORE_PASSWORD = ZK_PREFIX + "ssl.keystore.password";
/** Truststore location for ZooKeeper client connection over SSL. */
public static final String ZK_SSL_TRUSTSTORE_LOCATION = ZK_PREFIX + "ssl.truststore.location";
/** Truststore password for ZooKeeper client connection over SSL. */
public static final String ZK_SSL_TRUSTSTORE_PASSWORD = ZK_PREFIX + "ssl.truststore.password";
public static final int ZK_RETRY_INTERVAL_MS_DEFAULT = 1000;
/** Default domain name resolver for hadoop to use. */
public static final String HADOOP_DOMAINNAME_RESOLVER_IMPL =
@ -475,4 +483,21 @@ public class CommonConfigurationKeys extends CommonConfigurationKeysPublic {
* default hadoop temp dir on local system: {@value}.
*/
public static final String HADOOP_TMP_DIR = "hadoop.tmp.dir";
/**
* Thread-level IOStats Support.
* {@value}
*/
public static final String IOSTATISTICS_THREAD_LEVEL_ENABLED =
"fs.iostatistics.thread.level.enabled";
/**
* Default value for Thread-level IOStats Support is true.
*/
public static final boolean IOSTATISTICS_THREAD_LEVEL_ENABLED_DEFAULT =
true;
public static final String HADOOP_SECURITY_RESOLVER_IMPL =
"hadoop.security.resolver.impl";
}

View File

@ -1000,6 +1000,7 @@ public class CommonConfigurationKeysPublic {
String.join(",",
"secret$",
"password$",
"username$",
"ssl.keystore.pass$",
"fs.s3.*[Ss]ecret.?[Kk]ey",
"fs.s3a.*.server-side-encryption.key",

View File

@ -163,5 +163,11 @@ public final class CommonPathCapabilities {
public static final String ETAGS_PRESERVED_IN_RENAME =
"fs.capability.etags.preserved.in.rename";
/**
* Does this Filesystem support lease recovery operations such as
* {@link LeaseRecoverable#recoverLease(Path)} and {@link LeaseRecoverable#isFileClosed(Path)}?
* Value: {@value}.
*/
public static final String LEASE_RECOVERABLE = "fs.capability.lease.recoverable";
}

View File

@ -256,9 +256,8 @@ public class DelegationTokenRenewer
try {
action.cancel();
} catch (InterruptedException ie) {
LOG.error("Interrupted while canceling token for " + fs.getUri()
+ "filesystem");
LOG.debug("Exception in removeRenewAction: {}", ie);
LOG.error("Interrupted while canceling token for {} filesystem.", fs.getUri());
LOG.debug("Exception in removeRenewAction.", ie);
}
}
}

View File

@ -28,6 +28,34 @@ import org.apache.hadoop.classification.InterfaceStability;
* The base interface which various FileSystem FileContext Builder
* interfaces can extend, and which underlying implementations
* will then implement.
* <p>
* HADOOP-16202 expanded the opt() and must() arguments with
* method overloading, but HADOOP-18724 identified mapping problems:
* passing a long value in to {@code opt()} could end up invoking
* {@code opt(string, double)}, which could then trigger parse failures.
* <p>
* To fix this without breaking existing code or forcing a recompilation:
* <ol>
* <li>A new method to explicitly set a long value is added:
* {@link #optLong(String, long)}
* </li>
* <li>A new method to explicitly set a double value is added:
* {@link #optDouble(String, double)}
* </li>
* <li>
* All of {@link #opt(String, long)}, {@link #opt(String, float)} and
* {@link #opt(String, double)} invoke {@link #optLong(String, long)}.
* </li>
* <li>
* The same changes have been applied to {@code must()} methods.
* </li>
* </ol>
* Forwarding the existing double/float setters to the long setters ensures
* that existing code will still link, but it will always set a long value.
* If you need to write code which works correctly with all hadoop releases,
* convert the option to a string explicitly and then call {@link #opt(String, String)}
* or {@link #must(String, String)} as appropriate.
*
* @param <S> Return type on the {@link #build()} call.
* @param <B> type of builder itself.
*/
@ -50,7 +78,9 @@ public interface FSBuilder<S, B extends FSBuilder<S, B>> {
* @return generic type B.
* @see #opt(String, String)
*/
B opt(@Nonnull String key, boolean value);
default B opt(@Nonnull String key, boolean value) {
return opt(key, Boolean.toString(value));
}
/**
* Set optional int parameter for the Builder.
@ -60,17 +90,25 @@ public interface FSBuilder<S, B extends FSBuilder<S, B>> {
* @return generic type B.
* @see #opt(String, String)
*/
B opt(@Nonnull String key, int value);
default B opt(@Nonnull String key, int value) {
return optLong(key, value);
}
/**
* Set optional float parameter for the Builder.
* This parameter is converted to a long and passed
* to {@link #optLong(String, long)}; all
* decimal precision is lost.
*
* @param key key.
* @param value value.
* @return generic type B.
* @see #opt(String, String)
* @deprecated use {@link #optDouble(String, double)}
*/
B opt(@Nonnull String key, float value);
@Deprecated
default B opt(@Nonnull String key, float value) {
return optLong(key, (long) value);
}
/**
* Set optional long parameter for the Builder.
@ -78,19 +116,27 @@ public interface FSBuilder<S, B extends FSBuilder<S, B>> {
* @param key key.
* @param value value.
* @return generic type B.
* @see #opt(String, String)
* @deprecated use {@link #optLong(String, long)} where possible.
*/
B opt(@Nonnull String key, long value);
default B opt(@Nonnull String key, long value) {
return optLong(key, value);
}
/**
* Set optional double parameter for the Builder.
*
* Pass an optional double parameter for the Builder.
* This parameter is converted to a long and passed
* to {@link #optLong(String, long)}; all
* decimal precision is lost.
* @param key key.
* @param value value.
* @return generic type B.
* @see #opt(String, String)
* @deprecated use {@link #optDouble(String, double)}
*/
B opt(@Nonnull String key, double value);
@Deprecated
default B opt(@Nonnull String key, double value) {
return optLong(key, (long) value);
}
/**
* Set an array of string values as optional parameter for the Builder.
@ -102,6 +148,30 @@ public interface FSBuilder<S, B extends FSBuilder<S, B>> {
*/
B opt(@Nonnull String key, @Nonnull String... values);
/**
* Set optional long parameter for the Builder.
*
* @param key key.
* @param value value.
* @return generic type B.
* @see #opt(String, String)
*/
default B optLong(@Nonnull String key, long value) {
return opt(key, Long.toString(value));
}
/**
* Set optional double parameter for the Builder.
*
* @param key key.
* @param value value.
* @return generic type B.
* @see #opt(String, String)
*/
default B optDouble(@Nonnull String key, double value) {
return opt(key, Double.toString(value));
}
/**
* Set mandatory option to the Builder.
*
@ -122,7 +192,9 @@ public interface FSBuilder<S, B extends FSBuilder<S, B>> {
* @return generic type B.
* @see #must(String, String)
*/
B must(@Nonnull String key, boolean value);
default B must(@Nonnull String key, boolean value) {
return must(key, Boolean.toString(value));
}
/**
* Set mandatory int option.
@ -132,17 +204,24 @@ public interface FSBuilder<S, B extends FSBuilder<S, B>> {
* @return generic type B.
* @see #must(String, String)
*/
B must(@Nonnull String key, int value);
default B must(@Nonnull String key, int value) {
return mustLong(key, value);
}
/**
* Set mandatory float option.
* This parameter is converted to a long and passed
* to {@link #mustLong(String, long)}; all
* decimal precision is lost.
*
* @param key key.
* @param value value.
* @return generic type B.
* @see #must(String, String)
* @deprecated use {@link #mustDouble(String, double)} to set floating point.
*/
B must(@Nonnull String key, float value);
@Deprecated
default B must(@Nonnull String key, float value) {
return mustLong(key, (long) value);
}
/**
* Set mandatory long option.
@ -152,17 +231,24 @@ public interface FSBuilder<S, B extends FSBuilder<S, B>> {
* @return generic type B.
* @see #must(String, String)
*/
B must(@Nonnull String key, long value);
@Deprecated
default B must(@Nonnull String key, long value) {
return mustLong(key, (long) value);
}
/**
* Set mandatory double option.
* Set mandatory long option, despite passing in a floating
* point value.
*
* @param key key.
* @param value value.
* @return generic type B.
* @see #must(String, String)
*/
B must(@Nonnull String key, double value);
@Deprecated
default B must(@Nonnull String key, double value) {
return mustLong(key, (long) value);
}
/**
* Set a string array as mandatory option.
@ -174,6 +260,30 @@ public interface FSBuilder<S, B extends FSBuilder<S, B>> {
*/
B must(@Nonnull String key, @Nonnull String... values);
/**
* Set mandatory long parameter for the Builder.
*
* @param key key.
* @param value value.
* @return generic type B.
* @see #must(String, String)
*/
default B mustLong(@Nonnull String key, long value) {
return must(key, Long.toString(value));
}
/**
* Set mandatory double parameter for the Builder.
*
* @param key key.
* @param value value.
* @return generic type B.
* @see #must(String, String)
*/
default B mustDouble(@Nonnull String key, double value) {
return must(key, Double.toString(value));
}
/**
* Instantiate the object which was being built.
*

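For illustration, a minimal sketch of the typed setters in use through openFile(); the path and option values are made up, and optLong() only exists once this change is in:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import static org.apache.hadoop.util.functional.FutureIO.awaitFuture;

public class OpenFileBuilderSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    try (FSDataInputStream in = awaitFuture(
        fs.openFile(new Path("/tmp/example.dat"))
            .opt("fs.option.openfile.read.policy", "random")
            // explicit long setter: no risk of resolving to the double overload
            .optLong("fs.option.openfile.length", 1_000_000L)
            .build())) {
      System.out.println(in.read());
    }
  }
}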
View File

@ -144,7 +144,8 @@ public class FSDataInputStream extends DataInputStream
*
* @return the underlying input stream
*/
@InterfaceAudience.LimitedPrivate({"HDFS"})
@InterfaceAudience.Public
@InterfaceStability.Stable
public InputStream getWrappedStream() {
return in;
}

View File

@ -2231,7 +2231,7 @@ public class FileContext implements PathCapabilities {
InputStream in = awaitFuture(openFile(qSrc)
.opt(FS_OPTION_OPENFILE_READ_POLICY,
FS_OPTION_OPENFILE_READ_POLICY_WHOLE_FILE)
.opt(FS_OPTION_OPENFILE_LENGTH,
.optLong(FS_OPTION_OPENFILE_LENGTH,
fs.getLen()) // file length hint for object stores
.build());
try (OutputStream out = create(qDst, createFlag)) {

View File

@ -55,6 +55,15 @@ public interface FileRange {
*/
void setData(CompletableFuture<ByteBuffer> data);
/**
* Get any reference passed in to the file range constructor.
* This is not used by any implementation code; it is to help
* bind this API to libraries retrieving multiple stripes of
* data in parallel.
* @return a reference or null.
*/
Object getReference();
/**
* Factory method to create a FileRange object.
* @param offset starting offset of the range.
@ -62,6 +71,17 @@ public interface FileRange {
* @return a new instance of FileRangeImpl.
*/
static FileRange createFileRange(long offset, int length) {
return new FileRangeImpl(offset, length);
return new FileRangeImpl(offset, length, null);
}
/**
* Factory method to create a FileRange object.
* @param offset starting offset of the range.
* @param length length of the range.
* @param reference nullable reference to store in the range.
* @return a new instance of FileRangeImpl.
*/
static FileRange createFileRange(long offset, int length, Object reference) {
return new FileRangeImpl(offset, length, reference);
}
}

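A small sketch of how a caller might use the new reference parameter to map completed ranges back to its own objects; the Stripe class here is hypothetical:

import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileRange;

public class FileRangeReferenceSketch {
  static class Stripe {            // hypothetical caller-side bookkeeping
    final int index;
    Stripe(int index) { this.index = index; }
  }

  static void readStripes(FSDataInputStream in) throws Exception {
    Stripe s0 = new Stripe(0);
    List<FileRange> ranges = Arrays.asList(
        FileRange.createFileRange(0, 1024, s0));
    in.readVectored(ranges, ByteBuffer::allocate);
    for (FileRange r : ranges) {
      ByteBuffer data = r.getData().get();          // wait for completion
      Stripe stripe = (Stripe) r.getReference();    // recover the caller's object
      System.out.println("stripe " + stripe.index + " read " + data.remaining() + " bytes");
    }
  }
}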
View File

@ -402,7 +402,8 @@ public class FileStatus implements Writable, Comparable<Object>,
}
/**
* Compare this FileStatus to another FileStatus
* Compare this FileStatus to another FileStatus based on lexicographical
* order of path.
* @param o the FileStatus to be compared.
* @return a negative integer, zero, or a positive integer as this object
* is less than, equal to, or greater than the specified object.
@ -412,7 +413,8 @@ public class FileStatus implements Writable, Comparable<Object>,
}
/**
* Compare this FileStatus to another FileStatus.
* Compare this FileStatus to another FileStatus based on lexicographical
* order of path.
* This method was added back by HADOOP-14683 to keep binary compatibility.
*
* @param o the FileStatus to be compared.

View File

@ -21,7 +21,6 @@ import javax.annotation.Nonnull;
import java.io.Closeable;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InterruptedIOException;
import java.lang.ref.WeakReference;
import java.lang.ref.ReferenceQueue;
import java.net.URI;
@ -1544,6 +1543,39 @@ public abstract class FileSystem extends Configured
public abstract FSDataOutputStream append(Path f, int bufferSize,
Progressable progress) throws IOException;
/**
* Append to an existing file (optional operation).
* @param f the existing file to be appended.
* @param appendToNewBlock whether to append data to a new block
* instead of the end of the last partial block
* @throws IOException IO failure
* @throws UnsupportedOperationException if the operation is unsupported
* (default).
* @return output stream.
*/
public FSDataOutputStream append(Path f, boolean appendToNewBlock) throws IOException {
return append(f, getConf().getInt(IO_FILE_BUFFER_SIZE_KEY,
IO_FILE_BUFFER_SIZE_DEFAULT), null, appendToNewBlock);
}
/**
* Append to an existing file (optional operation).
* This method exists to be overridden by subclasses such as DistributedFileSystem.
* @param f the existing file to be appended.
* @param bufferSize the size of the buffer to be used.
* @param progress for reporting progress if it is not null.
* @param appendToNewBlock whether to append data to a new block
* instead of the end of the last partial block
* @throws IOException IO failure
* @throws UnsupportedOperationException if the operation is unsupported
* (default).
* @return output stream.
*/
public FSDataOutputStream append(Path f, int bufferSize,
Progressable progress, boolean appendToNewBlock) throws IOException {
return append(f, bufferSize, progress);
}
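A minimal usage sketch; whether the data really starts a new block depends on the filesystem, since the base implementation above simply ignores the flag:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AppendSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // request that appended data start a new block rather than fill the last partial one
    try (FSDataOutputStream out = fs.append(new Path("/tmp/log.txt"), true)) {
      out.writeBytes("another line\n");
    }
  }
}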
/**
* Concat existing files together.
* @param trg the path to the target destination.
@ -2381,8 +2413,14 @@ public abstract class FileSystem extends Configured
if (stat.isFile()) { // file
curFile = stat;
} else if (recursive) { // directory
itors.push(curItor);
curItor = listLocatedStatus(stat.getPath());
try {
RemoteIterator<LocatedFileStatus> newDirItor = listLocatedStatus(stat.getPath());
itors.push(curItor);
curItor = newDirItor;
} catch (FileNotFoundException ignored) {
LOGGER.debug("Directory {} deleted while attempting for recursive listing",
stat.getPath());
}
}
}
@ -3564,9 +3602,9 @@ public abstract class FileSystem extends Configured
} catch (IOException | RuntimeException e) {
// exception raised during initialization.
// log summary at warn and full stack at debug
LOGGER.warn("Failed to initialize fileystem {}: {}",
LOGGER.warn("Failed to initialize filesystem {}: {}",
uri, e.toString());
LOGGER.debug("Failed to initialize fileystem", e);
LOGGER.debug("Failed to initialize filesystem", e);
// then (robustly) close the FS, so as to invoke any
// cleanup code.
IOUtils.cleanupWithLogger(LOGGER, fs);
@ -3647,11 +3685,7 @@ public abstract class FileSystem extends Configured
// to construct an instance.
try (DurationInfo d = new DurationInfo(LOGGER, false,
"Acquiring creator semaphore for %s", uri)) {
creatorPermits.acquire();
} catch (InterruptedException e) {
// acquisition was interrupted; convert to an IOE.
throw (IOException)new InterruptedIOException(e.toString())
.initCause(e);
creatorPermits.acquireUninterruptibly();
}
FileSystem fsToClose = null;
try {
@ -3908,6 +3942,7 @@ public abstract class FileSystem extends Configured
private volatile long bytesReadDistanceOfThreeOrFour;
private volatile long bytesReadDistanceOfFiveOrLarger;
private volatile long bytesReadErasureCoded;
private volatile long remoteReadTimeMS;
/**
* Add another StatisticsData object to this one.
@ -3925,6 +3960,7 @@ public abstract class FileSystem extends Configured
this.bytesReadDistanceOfFiveOrLarger +=
other.bytesReadDistanceOfFiveOrLarger;
this.bytesReadErasureCoded += other.bytesReadErasureCoded;
this.remoteReadTimeMS += other.remoteReadTimeMS;
}
/**
@ -3943,6 +3979,7 @@ public abstract class FileSystem extends Configured
this.bytesReadDistanceOfFiveOrLarger =
-this.bytesReadDistanceOfFiveOrLarger;
this.bytesReadErasureCoded = -this.bytesReadErasureCoded;
this.remoteReadTimeMS = -this.remoteReadTimeMS;
}
@Override
@ -3991,6 +4028,10 @@ public abstract class FileSystem extends Configured
public long getBytesReadErasureCoded() {
return bytesReadErasureCoded;
}
public long getRemoteReadTimeMS() {
return remoteReadTimeMS;
}
}
private interface StatisticsAggregator<T> {
@ -4218,6 +4259,14 @@ public abstract class FileSystem extends Configured
}
}
/**
* Increment the time taken to read bytes from remote in the statistics.
* @param durationMS time taken in ms to read bytes from remote
*/
public void increaseRemoteReadTime(final long durationMS) {
getThreadStatistics().remoteReadTimeMS += durationMS;
}
/**
* Apply the given aggregator to all StatisticsData objects associated with
* this Statistics object.
@ -4365,6 +4414,25 @@ public abstract class FileSystem extends Configured
return bytesRead;
}
/**
* Get total time taken in ms for bytes read from remote.
* @return time taken in ms for remote bytes read.
*/
public long getRemoteReadTime() {
return visitAll(new StatisticsAggregator<Long>() {
private long remoteReadTimeMS = 0;
@Override
public void accept(StatisticsData data) {
remoteReadTimeMS += data.remoteReadTimeMS;
}
public Long aggregate() {
return remoteReadTimeMS;
}
});
}
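A small sketch of reading the new counter back; the scheme lookup assumes the statistics for this filesystem class have already been registered, which happens on first use:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class RemoteReadTimeSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    FileSystem.Statistics stats =
        FileSystem.getStatistics(fs.getUri().getScheme(), fs.getClass());
    // total milliseconds spent reading bytes from remote, aggregated across threads
    System.out.println("remote read ms: " + stats.getRemoteReadTime());
  }
}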
/**
* Get all statistics data.
* MR or other frameworks can use the method to get all statistics at once.

View File

@ -47,7 +47,8 @@ public class FileSystemStorageStatistics extends StorageStatistics {
"bytesReadDistanceOfOneOrTwo",
"bytesReadDistanceOfThreeOrFour",
"bytesReadDistanceOfFiveOrLarger",
"bytesReadErasureCoded"
"bytesReadErasureCoded",
"remoteReadTimeMS"
};
private static class LongStatisticIterator
@ -107,6 +108,8 @@ public class FileSystemStorageStatistics extends StorageStatistics {
return data.getBytesReadDistanceOfFiveOrLarger();
case "bytesReadErasureCoded":
return data.getBytesReadErasureCoded();
case "remoteReadTimeMS":
return data.getRemoteReadTimeMS();
default:
return null;
}

View File

@ -484,7 +484,7 @@ public class FileUtil {
in = awaitFuture(srcFS.openFile(src)
.opt(FS_OPTION_OPENFILE_READ_POLICY,
FS_OPTION_OPENFILE_READ_POLICY_WHOLE_FILE)
.opt(FS_OPTION_OPENFILE_LENGTH,
.optLong(FS_OPTION_OPENFILE_LENGTH,
srcStatus.getLen()) // file length hint for object stores
.build());
out = dstFS.create(dst, overwrite);

View File

@ -0,0 +1,46 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.fs;
import java.io.IOException;
/**
* Whether the given Path of the FileSystem has the capability to perform lease recovery.
*/
public interface LeaseRecoverable {
/**
* Start the lease recovery of a file.
*
* @param file path to a file.
* @return true if the file is already closed, and it does not require lease recovery.
* @throws IOException if an error occurs during lease recovery.
* @throws UnsupportedOperationException if lease recovery is not supported by this filesystem.
*/
boolean recoverLease(Path file) throws IOException;
/**
* Get the close status of a file.
* @param file The string representation of the path to the file
* @return return true if file is closed
* @throws IOException If an I/O error occurred
* @throws UnsupportedOperationException if isFileClosed is not supported by this filesystem.
*/
boolean isFileClosed(Path file) throws IOException;
}
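A sketch of driving lease recovery generically, probing the path capability added in CommonPathCapabilities above before casting; recoverLease() may need to be retried by the caller until it returns true:

import org.apache.hadoop.fs.CommonPathCapabilities;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LeaseRecoverable;
import org.apache.hadoop.fs.Path;

public class LeaseRecoverySketch {
  // returns true once the file is closed; callers typically retry until then
  static boolean tryRecover(FileSystem fs, Path file) throws Exception {
    if (fs instanceof LeaseRecoverable
        && fs.hasPathCapability(file, CommonPathCapabilities.LEASE_RECOVERABLE)) {
      LeaseRecoverable lr = (LeaseRecoverable) fs;
      return lr.isFileClosed(file) || lr.recoverLease(file);
    }
    return false;
  }
}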

View File

@ -396,6 +396,10 @@ public class LocalDirAllocator {
Context ctx = confChanged(conf);
int numDirs = ctx.localDirs.length;
int numDirsSearched = 0;
// Max capacity in any directory
long maxCapacity = 0;
String errorText = null;
IOException diskException = null;
//remove the leading slash from the path (to make sure that the uri
//resolution results in a valid path on the dir being checked)
if (pathStr.startsWith("/")) {
@ -410,7 +414,14 @@ public class LocalDirAllocator {
//build the "roulette wheel"
for(int i =0; i < ctx.dirDF.length; ++i) {
availableOnDisk[i] = ctx.dirDF[i].getAvailable();
final DF target = ctx.dirDF[i];
// attempt to recreate the dir so that getAvailable() is valid
// if it fails, getAvailable() will return 0, so the dir will
// be declared unavailable.
// return value is logged at debug to keep spotbugs quiet.
final boolean b = new File(target.getDirPath()).mkdirs();
LOG.debug("mkdirs of {}={}", target, b);
availableOnDisk[i] = target.getAvailable();
totalAvailable += availableOnDisk[i];
}
@ -444,9 +455,18 @@ public class LocalDirAllocator {
int dirNum = ctx.getAndIncrDirNumLastAccessed(randomInc);
while (numDirsSearched < numDirs) {
long capacity = ctx.dirDF[dirNum].getAvailable();
if (capacity > maxCapacity) {
maxCapacity = capacity;
}
if (capacity > size) {
returnPath =
createPath(ctx.localDirs[dirNum], pathStr, checkWrite);
try {
returnPath = createPath(ctx.localDirs[dirNum], pathStr,
checkWrite);
} catch (IOException e) {
errorText = e.getMessage();
diskException = e;
LOG.debug("DiskException caught for dir {}", ctx.localDirs[dirNum], e);
}
if (returnPath != null) {
ctx.getAndIncrDirNumLastAccessed(numDirsSearched);
break;
@ -462,8 +482,13 @@ public class LocalDirAllocator {
}
//no path found
throw new DiskErrorException("Could not find any valid local " +
"directory for " + pathStr);
String newErrorText = "Could not find any valid local directory for " +
pathStr + " with requested size " + size +
" as the max capacity in any directory is " + maxCapacity;
if (errorText != null) {
newErrorText = newErrorText + " due to " + errorText;
}
throw new DiskErrorException(newErrorText, diskException);
}
/** Creates a file on the local FS. Pass size as

View File

@ -465,7 +465,12 @@ public class Path
* @return a new path with the suffix added
*/
public Path suffix(String suffix) {
return new Path(getParent(), getName()+suffix);
Path parent = getParent();
if (parent == null) {
return new Path("/", getName() + suffix);
}
return new Path(parent, getName() + suffix);
}
@Override

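A tiny sketch of the case this fixes: suffixing the root path, whose getParent() is null, previously failed and now resolves against "/":

import org.apache.hadoop.fs.Path;

public class PathSuffixSketch {
  public static void main(String[] args) {
    Path root = new Path("/");
    // getParent() of the root is null; with the fix this prints "/data"
    System.out.println(root.suffix("data"));
  }
}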
View File

@ -114,6 +114,16 @@ public interface PositionedReadable {
* As a result of the call, each range will have FileRange.setData(CompletableFuture)
* called with a future that when complete will have a ByteBuffer with the
* data from the file's range.
* <p>
* The position returned by getPos() after readVectored() is undefined.
* </p>
* <p>
* If a file is changed while the readVectored() operation is in progress, the output is
* undefined. Some ranges may have old data, some may have new and some may have both.
* </p>
* <p>
* While a readVectored() operation is in progress, normal read api calls may block.
* </p>
* @param ranges the byte ranges to read
* @param allocate the function to allocate ByteBuffer
* @throws IOException any IOE.

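A minimal sketch of a vectored read honouring this contract: wait on each range's future and make no assumptions about the stream position afterwards (offsets and lengths are made up):

import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

import org.apache.hadoop.fs.FileRange;
import org.apache.hadoop.fs.PositionedReadable;

import static org.apache.hadoop.util.functional.FutureIO.awaitFuture;

public class VectoredReadContractSketch {
  static void read(PositionedReadable in) throws Exception {
    List<FileRange> ranges = Arrays.asList(
        FileRange.createFileRange(0, 4096),
        FileRange.createFileRange(65536, 4096));
    in.readVectored(ranges, ByteBuffer::allocateDirect);
    for (FileRange r : ranges) {
      ByteBuffer buf = awaitFuture(r.getData()); // block until this range completes
      // consume buf; do not rely on getPos() of the stream here
      System.out.println("read " + buf.remaining() + " bytes");
    }
  }
}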
View File

@ -57,6 +57,8 @@ import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.impl.StoreImplementationUtils;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.hadoop.fs.statistics.IOStatistics;
import org.apache.hadoop.fs.statistics.IOStatisticsAggregator;
import org.apache.hadoop.fs.statistics.IOStatisticsContext;
import org.apache.hadoop.fs.statistics.IOStatisticsSource;
import org.apache.hadoop.fs.statistics.BufferedIOStatisticsOutputStream;
import org.apache.hadoop.fs.statistics.impl.IOStatisticsStore;
@ -156,11 +158,19 @@ public class RawLocalFileSystem extends FileSystem {
/** Reference to the bytes read counter for slightly faster counting. */
private final AtomicLong bytesRead;
/**
* Thread level IOStatistics aggregator to update in close().
*/
private final IOStatisticsAggregator
ioStatisticsAggregator;
public LocalFSFileInputStream(Path f) throws IOException {
name = pathToFile(f);
fis = new FileInputStream(name);
bytesRead = ioStatistics.getCounterReference(
STREAM_READ_BYTES);
ioStatisticsAggregator =
IOStatisticsContext.getCurrentIOStatisticsContext().getAggregator();
}
@Override
@ -193,9 +203,13 @@ public class RawLocalFileSystem extends FileSystem {
@Override
public void close() throws IOException {
fis.close();
if (asyncChannel != null) {
asyncChannel.close();
try {
fis.close();
if (asyncChannel != null) {
asyncChannel.close();
}
} finally {
ioStatisticsAggregator.aggregate(ioStatistics);
}
}
@ -278,6 +292,7 @@ public class RawLocalFileSystem extends FileSystem {
// new capabilities.
switch (capability.toLowerCase(Locale.ENGLISH)) {
case StreamCapabilities.IOSTATISTICS:
case StreamCapabilities.IOSTATISTICS_CONTEXT:
case StreamCapabilities.VECTOREDIO:
return true;
default:
@ -407,9 +422,19 @@ public class RawLocalFileSystem extends FileSystem {
STREAM_WRITE_EXCEPTIONS)
.build();
/**
* Thread level IOStatistics aggregator to update in close().
*/
private final IOStatisticsAggregator
ioStatisticsAggregator;
private LocalFSFileOutputStream(Path f, boolean append,
FsPermission permission) throws IOException {
File file = pathToFile(f);
// store the aggregator before attempting any IO.
ioStatisticsAggregator =
IOStatisticsContext.getCurrentIOStatisticsContext().getAggregator();
if (!append && permission == null) {
permission = FsPermission.getFileDefault();
}
@ -436,10 +461,17 @@ public class RawLocalFileSystem extends FileSystem {
}
/*
* Just forward to the fos
* Close the fos; update the IOStatisticsContext.
*/
@Override
public void close() throws IOException { fos.close(); }
public void close() throws IOException {
try {
fos.close();
} finally {
ioStatisticsAggregator.aggregate(ioStatistics);
}
}
@Override
public void flush() throws IOException { fos.flush(); }
@Override
@ -485,6 +517,7 @@ public class RawLocalFileSystem extends FileSystem {
// new capabilities.
switch (capability.toLowerCase(Locale.ENGLISH)) {
case StreamCapabilities.IOSTATISTICS:
case StreamCapabilities.IOSTATISTICS_CONTEXT:
return true;
default:
return StoreImplementationUtils.isProbeForSyncable(capability);
@ -1293,4 +1326,9 @@ public class RawLocalFileSystem extends FileSystem {
return super.hasPathCapability(path, capability);
}
}
@VisibleForTesting
static void setUseDeprecatedFileStatus(boolean useDeprecatedFileStatus) {
RawLocalFileSystem.useDeprecatedFileStatus = useDeprecatedFileStatus;
}
}

View File

@ -0,0 +1,50 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.fs;
import java.io.IOException;
/**
* Whether the given filesystem is in any status of safe mode.
*/
public interface SafeMode {
/**
* Enter, leave, or get safe mode.
*
* @param action One of {@link SafeModeAction} LEAVE, ENTER, GET, FORCE_EXIT.
* @throws IOException if setting the safe mode fails.
* @return true if the action was accepted; false if it was rejected.
*/
default boolean setSafeMode(SafeModeAction action) throws IOException {
return setSafeMode(action, false);
}
/**
* Enter, leave, or get safe mode.
*
* @param action One of {@link SafeModeAction} LEAVE, ENTER, GET, FORCE_EXIT.
* @param isChecked If true check only for Active metadata node / NameNode's status,
* else check first metadata node / NameNode's status.
* @throws IOException if setting the safe mode fails.
* @return true if the action was accepted; false if it was rejected.
*/
boolean setSafeMode(SafeModeAction action, boolean isChecked) throws IOException;
}
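A sketch of using the interface without binding to HDFS classes; setSafeMode(GET) reports whether the store is currently in safe mode:

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.SafeMode;
import org.apache.hadoop.fs.SafeModeAction;

public class SafeModeSketch {
  static boolean inSafeMode(FileSystem fs) throws Exception {
    if (fs instanceof SafeMode) {
      return ((SafeMode) fs).setSafeMode(SafeModeAction.GET);
    }
    return false; // this filesystem has no safe mode concept
  }
}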

View File

@ -0,0 +1,41 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.fs;
/**
* An identical copy from org.apache.hadoop.hdfs.protocol.HdfsConstants.SafeModeAction, that helps
* the other file system implementation to define {@link SafeMode}.
*/
public enum SafeModeAction {
/**
* Start entering safe mode.
*/
ENTER,
/**
* Gracefully exit from safe mode.
*/
LEAVE,
/**
* Force Exit from safe mode.
*/
FORCE_EXIT,
/**
* Get the status of the safe mode.
*/
GET;
}

View File

@ -84,7 +84,7 @@ public interface StreamCapabilities {
* Support for vectored IO api.
* See {@code PositionedReadable#readVectored(List, IntFunction)}.
*/
String VECTOREDIO = "readvectored";
String VECTOREDIO = "in:readvectored";
/**
* Stream abort() capability implemented by {@link Abortable#abort()}.
@ -93,6 +93,12 @@ public interface StreamCapabilities {
*/
String ABORTABLE_STREAM = CommonPathCapabilities.ABORTABLE_STREAM;
/**
* Streams that support IOStatistics context and capture thread-level
* IOStatistics.
*/
String IOSTATISTICS_CONTEXT = "fs.capability.iocontext.supported";
/**
* Capabilities that a stream can support and be queried for.
*/

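Callers probing for vectored IO should use the constant rather than a literal, since its value changes here; a minimal sketch:

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.StreamCapabilities;

public class CapabilityProbeSketch {
  static boolean supportsVectoredIO(FSDataInputStream in) {
    // resolves to "in:readvectored" after this change, "readvectored" before it
    return in.hasCapability(StreamCapabilities.VECTOREDIO);
  }
}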
View File

@ -23,8 +23,10 @@ import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.viewfs.ViewFileSystem;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import static org.apache.hadoop.fs.viewfs.Constants.*;
/**
* Provides a trash facility which supports pluggable Trash policies.
@ -67,7 +69,7 @@ public class Trash extends Configured {
* Hence we get the file system of the fully-qualified resolved-path and
* then move the path p to the trashbin in that volume,
* @param fs - the filesystem of path p
* @param p - the path being deleted - to be moved to trasg
* @param p - the path being deleted - to be moved to trash
* @param conf - configuration
* @return false if the item is already in the trash or trash is disabled
* @throws IOException on error
@ -94,6 +96,27 @@ public class Trash extends Configured {
LOG.warn("Failed to get server trash configuration", e);
throw new IOException("Failed to get server trash configuration", e);
}
/*
* In HADOOP-18144, we changed getTrashRoot() in ViewFileSystem to return a
* viewFS path, instead of a targetFS path. moveToTrash works for
* ViewFileSystem now. ViewFileSystem will do path resolution internally by
* itself.
*
* When localized trash flag is enabled:
* 1). if fs is a ViewFileSystem, we can initialize Trash() with a
* ViewFileSystem object;
* 2). When fs is not a ViewFileSystem, the only place we would need to
* resolve a path is for symbolic links. However, symlink is not
* enabled in Hadoop due to the complexity of supporting it
* (HADOOP-10019).
*/
if (conf.getBoolean(CONFIG_VIEWFS_TRASH_FORCE_INSIDE_MOUNT_POINT,
CONFIG_VIEWFS_TRASH_FORCE_INSIDE_MOUNT_POINT_DEFAULT)) {
Trash trash = new Trash(fs, conf);
return trash.moveToTrash(p);
}
Trash trash = new Trash(fullyResolvedFs, conf);
return trash.moveToTrash(fullyResolvedPath);
}

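A minimal sketch of calling the static helper this logic appears to live in (presumably Trash.moveToAppropriateTrash(), given the javadoc above); with the viewfs flag enabled, a ViewFileSystem path is now passed straight through:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.Trash;

public class TrashSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path victim = new Path("/tmp/old.dat");
    FileSystem fs = victim.getFileSystem(conf);
    // resolves the right Trash instance and moves the path into it
    boolean moved = Trash.moveToAppropriateTrash(fs, victim, conf);
    System.out.println("moved to trash: " + moved);
  }
}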
View File

@ -30,6 +30,7 @@ import java.util.function.IntFunction;
import org.apache.hadoop.fs.impl.CombinedFileRange;
import org.apache.hadoop.util.Preconditions;
import org.apache.hadoop.util.functional.Function4RaisingIOE;
/**
* Utility class which implements helper methods used
@ -37,6 +38,8 @@ import org.apache.hadoop.util.Preconditions;
*/
public final class VectoredReadUtils {
private static final int TMP_BUFFER_MAX_SIZE = 64 * 1024;
/**
* Validate a single range.
* @param range file range.
@ -114,7 +117,12 @@ public final class VectoredReadUtils {
FileRange range,
ByteBuffer buffer) throws IOException {
if (buffer.isDirect()) {
buffer.put(readInDirectBuffer(stream, range));
readInDirectBuffer(range.getLength(),
buffer,
(position, buffer1, offset, length) -> {
stream.readFully(position, buffer1, offset, length);
return null;
});
buffer.flip();
} else {
stream.readFully(range.getOffset(), buffer.array(),
@ -122,13 +130,34 @@ public final class VectoredReadUtils {
}
}
private static byte[] readInDirectBuffer(PositionedReadable stream,
FileRange range) throws IOException {
// if we need to read data from a direct buffer and the stream doesn't
// support it, we allocate a byte array to use.
byte[] tmp = new byte[range.getLength()];
stream.readFully(range.getOffset(), tmp, 0, tmp.length);
return tmp;
/**
* Read bytes from stream into a byte buffer using an
* intermediate byte array.
* @param length number of bytes to read.
* @param buffer buffer to fill.
* @param operation operation to use for reading data.
* @throws IOException any IOE.
*/
public static void readInDirectBuffer(int length,
ByteBuffer buffer,
Function4RaisingIOE<Integer, byte[], Integer,
Integer, Void> operation) throws IOException {
if (length == 0) {
return;
}
int readBytes = 0;
int position = 0;
int tmpBufferMaxSize = Math.min(TMP_BUFFER_MAX_SIZE, length);
byte[] tmp = new byte[tmpBufferMaxSize];
while (readBytes < length) {
int currentLength = (readBytes + tmpBufferMaxSize) < length ?
tmpBufferMaxSize
: (length - readBytes);
operation.apply(position, tmp, 0, currentLength);
buffer.put(tmp, 0, currentLength);
position = position + currentLength;
readBytes = readBytes + currentLength;
}
}
/**
@ -210,6 +239,7 @@ public final class VectoredReadUtils {
if (sortedRanges[i].getOffset() < prev.getOffset() + prev.getLength()) {
throw new UnsupportedOperationException("Overlapping ranges are not supported");
}
prev = sortedRanges[i];
}
return Arrays.asList(sortedRanges);
}
@ -277,9 +307,16 @@ public final class VectoredReadUtils {
FileRange request) {
int offsetChange = (int) (request.getOffset() - readOffset);
int requestLength = request.getLength();
// Create a new buffer that is backed by the original contents
// The buffer will have position 0 and the same limit as the original one
readData = readData.slice();
// Change the offset and the limit of the buffer as the reader wants to see
// only relevant data
readData.position(offsetChange);
readData.limit(offsetChange + requestLength);
// Create a new buffer after the limit change so that only that portion of the data is
// returned to the reader.
readData = readData.slice();
return readData;
}

View File

@ -90,6 +90,11 @@ public final class AuditConstants {
*/
public static final String PARAM_PROCESS = "ps";
/**
* Header: Range for GET request data: {@value}.
*/
public static final String PARAM_RANGE = "rg";
/**
* Task Attempt ID query header: {@value}.
*/
@ -110,4 +115,9 @@ public final class AuditConstants {
*/
public static final String PARAM_TIMESTAMP = "ts";
/**
* Number of files to be deleted as part of the bulk delete request.
*/
public static final String DELETE_KEYS_SIZE = "ks";
}

View File

@ -44,11 +44,13 @@ import static org.apache.hadoop.util.Preconditions.checkNotNull;
* with option support.
*
* <code>
* .opt("foofs:option.a", true)
* .opt("foofs:option.b", "value")
* .opt("fs.s3a.open.option.caching", true)
* .opt("fs.option.openfile.read.policy", "random, adaptive")
* .opt("fs.s3a.open.option.etag", "9fe4c37c25b")
* .must("foofs:cache", true)
* .must("barfs:cache-size", 256 * 1024 * 1024)
* .optLong("fs.option.openfile.length", 1_500_000_000_000)
* .must("fs.option.openfile.buffer.size", 256_000)
* .mustLong("fs.option.openfile.split.start", 256_000_000)
* .mustLong("fs.option.openfile.split.end", 512_000_000)
* .build();
* </code>
*
@ -64,6 +66,7 @@ import static org.apache.hadoop.util.Preconditions.checkNotNull;
*/
@InterfaceAudience.Public
@InterfaceStability.Unstable
@SuppressWarnings({"deprecation", "unused"})
public abstract class
AbstractFSBuilderImpl<S, B extends FSBuilder<S, B>>
implements FSBuilder<S, B> {
@ -178,10 +181,7 @@ public abstract class
*/
@Override
public B opt(@Nonnull final String key, boolean value) {
mandatoryKeys.remove(key);
optionalKeys.add(key);
options.setBoolean(key, value);
return getThisBuilder();
return opt(key, Boolean.toString(value));
}
/**
@ -191,18 +191,17 @@ public abstract class
*/
@Override
public B opt(@Nonnull final String key, int value) {
mandatoryKeys.remove(key);
optionalKeys.add(key);
options.setInt(key, value);
return getThisBuilder();
return optLong(key, value);
}
@Override
public B opt(@Nonnull final String key, final long value) {
mandatoryKeys.remove(key);
optionalKeys.add(key);
options.setLong(key, value);
return getThisBuilder();
return optLong(key, value);
}
@Override
public B optLong(@Nonnull final String key, final long value) {
return opt(key, Long.toString(value));
}
/**
@ -212,10 +211,7 @@ public abstract class
*/
@Override
public B opt(@Nonnull final String key, float value) {
mandatoryKeys.remove(key);
optionalKeys.add(key);
options.setFloat(key, value);
return getThisBuilder();
return optLong(key, (long) value);
}
/**
@ -225,10 +221,17 @@ public abstract class
*/
@Override
public B opt(@Nonnull final String key, double value) {
mandatoryKeys.remove(key);
optionalKeys.add(key);
options.setDouble(key, value);
return getThisBuilder();
return optLong(key, (long) value);
}
/**
* Set optional double parameter for the Builder.
*
* @see #opt(String, String)
*/
@Override
public B optDouble(@Nonnull final String key, double value) {
return opt(key, Double.toString(value));
}
/**
@ -264,10 +267,22 @@ public abstract class
*/
@Override
public B must(@Nonnull final String key, boolean value) {
mandatoryKeys.add(key);
optionalKeys.remove(key);
options.setBoolean(key, value);
return getThisBuilder();
return must(key, Boolean.toString(value));
}
@Override
public B mustLong(@Nonnull final String key, final long value) {
return must(key, Long.toString(value));
}
/**
* Set mandatory double parameter for the Builder.
*
* @see #must(String, String)
*/
@Override
public B mustDouble(@Nonnull final String key, double value) {
return must(key, Double.toString(value));
}
/**
@ -277,44 +292,22 @@ public abstract class
*/
@Override
public B must(@Nonnull final String key, int value) {
mandatoryKeys.add(key);
optionalKeys.remove(key);
options.setInt(key, value);
return getThisBuilder();
return mustLong(key, value);
}
@Override
public B must(@Nonnull final String key, final long value) {
mandatoryKeys.add(key);
optionalKeys.remove(key);
options.setLong(key, value);
return getThisBuilder();
return mustLong(key, value);
}
/**
* Set mandatory float option.
*
* @see #must(String, String)
*/
@Override
public B must(@Nonnull final String key, float value) {
mandatoryKeys.add(key);
optionalKeys.remove(key);
options.setFloat(key, value);
return getThisBuilder();
public B must(@Nonnull final String key, final float value) {
return mustLong(key, (long) value);
}
/**
* Set mandatory double option.
*
* @see #must(String, String)
*/
@Override
public B must(@Nonnull final String key, double value) {
mandatoryKeys.add(key);
optionalKeys.remove(key);
options.setDouble(key, value);
return getThisBuilder();
return mustLong(key, (long) value);
}
/**

View File

@ -29,10 +29,10 @@ import java.util.List;
* together into a single read for efficiency.
*/
public class CombinedFileRange extends FileRangeImpl {
private ArrayList<FileRange> underlying = new ArrayList<>();
private List<FileRange> underlying = new ArrayList<>();
public CombinedFileRange(long offset, long end, FileRange original) {
super(offset, (int) (end - offset));
super(offset, (int) (end - offset), null);
this.underlying.add(original);
}

View File

@ -0,0 +1,95 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.fs.impl;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.store.LogExactlyOnce;
/**
* Class to help with use of FSBuilder.
*/
public class FSBuilderSupport {
private static final Logger LOG =
LoggerFactory.getLogger(FSBuilderSupport.class);
public static final LogExactlyOnce LOG_PARSE_ERROR = new LogExactlyOnce(LOG);
/**
* Options which are parsed.
*/
private final Configuration options;
/**
* Constructor.
* @param options the configuration options from the builder.
*/
public FSBuilderSupport(final Configuration options) {
this.options = options;
}
public Configuration getOptions() {
return options;
}
/**
* Get a long value with resilience to unparseable values.
* Negative values are replaced with the default.
* @param key key to look up
* @param defVal default value
* @return long value
*/
public long getPositiveLong(String key, long defVal) {
long l = getLong(key, defVal);
if (l < 0) {
LOG.debug("The option {} has a negative value {}, replacing with the default {}",
key, l, defVal);
l = defVal;
}
return l;
}
/**
* Get a long value with resilience to unparseable values.
* @param key key to look up
* @param defVal default value
* @return long value
*/
public long getLong(String key, long defVal) {
final String v = options.getTrimmed(key, "");
if (v.isEmpty()) {
return defVal;
}
try {
return options.getLong(key, defVal);
} catch (NumberFormatException e) {
final String msg = String.format(
"The option %s value \"%s\" is not a long integer; using the default value %s",
key, v, defVal);
// not a long; warn once, then fall back to the default.
LOG_PARSE_ERROR.warn(msg);
LOG.debug("{}", msg, e);
return defVal;
}
}
}

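A small sketch of the fallback behaviour; the option key and values are made up:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.impl.FSBuilderSupport;

public class BuilderSupportSketch {
  public static void main(String[] args) {
    Configuration options = new Configuration(false);
    options.set("example.buffer.size", "not-a-number"); // hypothetical key, bad value
    FSBuilderSupport support = new FSBuilderSupport(options);
    // warns once, then falls back to the default instead of throwing
    long size = support.getPositiveLong("example.buffer.size", 65536);
    System.out.println(size); // 65536
  }
}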
View File

@ -34,9 +34,21 @@ public class FileRangeImpl implements FileRange {
private int length;
private CompletableFuture<ByteBuffer> reader;
public FileRangeImpl(long offset, int length) {
/**
* nullable reference to store in the range.
*/
private final Object reference;
/**
* Create.
* @param offset offset in file
* @param length length of data to read.
* @param reference nullable reference to store in the range.
*/
public FileRangeImpl(long offset, int length, Object reference) {
this.offset = offset;
this.length = length;
this.reference = reference;
}
@Override
@ -71,4 +83,9 @@ public class FileRangeImpl implements FileRange {
public CompletableFuture<ByteBuffer> getData() {
return reader;
}
@Override
public Object getReference() {
return reference;
}
}

View File

@ -0,0 +1,97 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.fs.impl;
import java.lang.ref.WeakReference;
import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.metrics2.MetricsCollector;
import org.apache.hadoop.metrics2.MetricsSource;
import static java.util.Objects.requireNonNull;
/**
* A weak referenced metrics source which avoids hanging on to large objects
* if somehow they don't get fully closed/cleaned up.
* The JVM may clean up all objects which are only weakly referenced whenever
* it does a GC, <i>even if there is no memory pressure</i>.
* To avoid these refs being removed, always keep a strong reference around
* somewhere.
*/
@InterfaceAudience.Private
public class WeakRefMetricsSource implements MetricsSource {
/**
* Name to know when unregistering.
*/
private final String name;
/**
* Underlying metrics source.
*/
private final WeakReference<MetricsSource> sourceWeakReference;
/**
* Constructor.
* @param name Name to know when unregistering.
* @param source metrics source
*/
public WeakRefMetricsSource(final String name, final MetricsSource source) {
this.name = name;
this.sourceWeakReference = new WeakReference<>(requireNonNull(source));
}
/**
* If the weak reference is non null, update the metrics.
* @param collector to contain the resulting metrics snapshot
* @param all if true, return all metrics even if unchanged.
*/
@Override
public void getMetrics(final MetricsCollector collector, final boolean all) {
MetricsSource metricsSource = sourceWeakReference.get();
if (metricsSource != null) {
metricsSource.getMetrics(collector, all);
}
}
/**
* Name to know when unregistering.
* @return the name passed in during construction.
*/
public String getName() {
return name;
}
/**
* Get the source; it will be null if the reference has been GC'd.
* @return the source reference
*/
public MetricsSource getSource() {
return sourceWeakReference.get();
}
@Override
public String toString() {
return "WeakRefMetricsSource{" +
"name='" + name + '\'' +
", sourceWeakReference is " +
(sourceWeakReference.get() == null ? "unset" : "set") +
'}';
}
}

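A sketch of registering a source through the weak reference, assuming the hadoop-common metrics system; the caller must hold its own strong reference to the source:

import org.apache.hadoop.fs.impl.WeakRefMetricsSource;
import org.apache.hadoop.metrics2.MetricsSource;
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;

public class WeakRefRegisterSketch {
  // register via a weak reference so the metrics system cannot pin the source in memory
  static void register(String name, MetricsSource source) {
    DefaultMetricsSystem.instance().register(name, "weakly referenced source",
        new WeakRefMetricsSource(name, source));
  }
}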
View File

@ -18,12 +18,15 @@
package org.apache.hadoop.fs.impl;
import java.lang.ref.WeakReference;
import java.util.function.Consumer;
import java.util.function.Function;
import javax.annotation.Nullable;
import org.apache.hadoop.util.WeakReferenceMap;
import static java.util.Objects.requireNonNull;
/**
* A WeakReferenceMap for threads.
* @param <V> value type of the map
@ -35,20 +38,55 @@ public class WeakReferenceThreadMap<V> extends WeakReferenceMap<Long, V> {
super(factory, referenceLost);
}
/**
* Get the value for the current thread, creating if needed.
* @return an instance.
*/
public V getForCurrentThread() {
return get(currentThreadId());
}
/**
* Remove the reference for the current thread.
* @return any reference value which existed.
*/
public V removeForCurrentThread() {
return remove(currentThreadId());
}
/**
* Get the current thread ID.
* @return thread ID.
*/
public long currentThreadId() {
return Thread.currentThread().getId();
}
/**
* Set the new value for the current thread.
* @param newVal new reference to set for the active thread.
* @return the previously set value, possibly null
*/
public V setForCurrentThread(V newVal) {
return put(currentThreadId(), newVal);
requireNonNull(newVal);
long id = currentThreadId();
// if the same object is already in the map, just return it.
WeakReference<V> existingWeakRef = lookup(id);
// The looked up reference could be one of
// 1. null: nothing there
// 2. valid but get() == null : reference lost by GC.
// 3. different from the new value
// 4. the same as the old value
if (resolve(existingWeakRef) == newVal) {
// case 4: do nothing, return the new value
return newVal;
} else {
// cases 1, 2, 3: update the map and return the old value
return put(id, newVal);
}
}
}
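A small usage sketch; the value type and factory are made up:

import org.apache.hadoop.fs.impl.WeakReferenceThreadMap;

public class ThreadMapSketch {
  public static void main(String[] args) {
    // one StringBuilder per thread, created on demand; entries may be GC'd
    WeakReferenceThreadMap<StringBuilder> perThread =
        new WeakReferenceThreadMap<>(id -> new StringBuilder(), null);
    StringBuilder mine = perThread.getForCurrentThread();
    // re-setting the same instance is now a no-op rather than a fresh put()
    perThread.setForCurrentThread(mine);
  }
}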

View File

@ -0,0 +1,76 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.apache.hadoop.fs.impl.prefetch;
import java.io.Closeable;
import java.io.IOException;
import java.nio.ByteBuffer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.LocalDirAllocator;
/**
* Provides functionality necessary for caching blocks of data read from FileSystem.
*/
public interface BlockCache extends Closeable {
/**
* Indicates whether the given block is in this cache.
*
* @param blockNumber the id of the given block.
* @return true if the given block is in this cache, false otherwise.
*/
boolean containsBlock(int blockNumber);
/**
* Gets the blocks in this cache.
*
* @return the blocks in this cache.
*/
Iterable<Integer> blocks();
/**
* Gets the number of blocks in this cache.
*
* @return the number of blocks in this cache.
*/
int size();
/**
* Gets the block having the given {@code blockNumber}.
*
* @param blockNumber the id of the desired block.
* @param buffer contents of the desired block are copied to this buffer.
* @throws IOException if there is an error reading the given block.
*/
void get(int blockNumber, ByteBuffer buffer) throws IOException;
/**
* Puts the given block in this cache.
*
* @param blockNumber the id of the given block.
* @param buffer contents of the given block to be added to this cache.
* @param conf the configuration.
* @param localDirAllocator the local dir allocator instance.
* @throws IOException if there is an error writing the given block.
*/
void put(int blockNumber, ByteBuffer buffer, Configuration conf,
LocalDirAllocator localDirAllocator) throws IOException;
}

View File

@ -0,0 +1,250 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.apache.hadoop.fs.impl.prefetch;
import static org.apache.hadoop.fs.impl.prefetch.Validate.checkNotNegative;
import static org.apache.hadoop.fs.impl.prefetch.Validate.checkPositiveInteger;
import static org.apache.hadoop.fs.impl.prefetch.Validate.checkWithinRange;
/**
* Holds information about blocks of data in a file.
*/
public final class BlockData {
// State of each block of data.
enum State {
/** Data is not yet ready to be read from this block (still being prefetched). */
NOT_READY,
/** A read of this block has been enqueued in the prefetch queue. */
QUEUED,
/** This block has been fully read; its data is ready to be consumed. */
READY,
/** This block has been cached in the local disk cache. */
CACHED
}
/**
* State of all blocks in a file.
*/
private State[] state;
/**
* The size of a file.
*/
private final long fileSize;
/**
* The file is divided into blocks of this size.
*/
private final int blockSize;
/**
* The file has this many blocks.
*/
private final int numBlocks;
/**
* Constructs an instance of {@link BlockData}.
* @param fileSize the size of a file.
* @param blockSize the file is divided into blocks of this size.
* @throws IllegalArgumentException if fileSize is negative.
* @throws IllegalArgumentException if blockSize is negative, or zero when fileSize is non-zero.
*/
public BlockData(long fileSize, int blockSize) {
checkNotNegative(fileSize, "fileSize");
if (fileSize == 0) {
checkNotNegative(blockSize, "blockSize");
} else {
checkPositiveInteger(blockSize, "blockSize");
}
this.fileSize = fileSize;
this.blockSize = blockSize;
this.numBlocks =
(fileSize == 0)
? 0
: ((int) (fileSize / blockSize)) + (fileSize % blockSize > 0
? 1
: 0);
this.state = new State[this.numBlocks];
for (int b = 0; b < this.numBlocks; b++) {
setState(b, State.NOT_READY);
}
}
/**
* Gets the size of each block.
* @return the size of each block.
*/
public int getBlockSize() {
return blockSize;
}
/**
* Gets the size of the associated file.
* @return the size of the associated file.
*/
public long getFileSize() {
return fileSize;
}
/**
* Gets the number of blocks in the associated file.
* @return the number of blocks in the associated file.
*/
public int getNumBlocks() {
return numBlocks;
}
/**
* Indicates whether the given block is the last block in the associated file.
* @param blockNumber the id of the desired block.
* @return true if the given block is the last block in the associated file, false otherwise.
* @throws IllegalArgumentException if blockNumber is invalid.
*/
public boolean isLastBlock(int blockNumber) {
if (fileSize == 0) {
return false;
}
throwIfInvalidBlockNumber(blockNumber);
return blockNumber == (numBlocks - 1);
}
/**
* Gets the id of the block that contains the given absolute offset.
* @param offset the absolute offset to check.
* @return the id of the block that contains the given absolute offset.
* @throws IllegalArgumentException if offset is invalid.
*/
public int getBlockNumber(long offset) {
throwIfInvalidOffset(offset);
return (int) (offset / blockSize);
}
/**
* Gets the size of the given block.
* @param blockNumber the id of the desired block.
* @return the size of the given block.
*/
public int getSize(int blockNumber) {
if (fileSize == 0) {
return 0;
}
if (isLastBlock(blockNumber)) {
return (int) (fileSize - (((long) blockSize) * (numBlocks - 1)));
} else {
return blockSize;
}
}
/**
* Indicates whether the given absolute offset is valid.
* @param offset absolute offset in the file.
* @return true if the given absolute offset is valid, false otherwise.
*/
public boolean isValidOffset(long offset) {
return (offset >= 0) && (offset < fileSize);
}
/**
* Gets the start offset of the given block.
* @param blockNumber the id of the given block.
* @return the start offset of the given block.
* @throws IllegalArgumentException if blockNumber is invalid.
*/
public long getStartOffset(int blockNumber) {
throwIfInvalidBlockNumber(blockNumber);
return blockNumber * (long) blockSize;
}
/**
* Gets the relative offset corresponding to the given block and the absolute offset.
* @param blockNumber the id of the given block.
* @param offset absolute offset in the file.
* @return the relative offset corresponding to the given block and the absolute offset.
* @throws IllegalArgumentException if either blockNumber or offset is invalid.
*/
public int getRelativeOffset(int blockNumber, long offset) {
throwIfInvalidOffset(offset);
return (int) (offset - getStartOffset(blockNumber));
}
/**
* Gets the state of the given block.
* @param blockNumber the id of the given block.
* @return the state of the given block.
* @throws IllegalArgumentException if blockNumber is invalid.
*/
public State getState(int blockNumber) {
throwIfInvalidBlockNumber(blockNumber);
return state[blockNumber];
}
/**
* Sets the state of the given block to the given value.
* @param blockNumber the id of the given block.
* @param blockState the target state.
* @throws IllegalArgumentException if blockNumber is invalid.
*/
public void setState(int blockNumber, State blockState) {
throwIfInvalidBlockNumber(blockNumber);
state[blockNumber] = blockState;
}
// Debug helper.
public String getStateString() {
StringBuilder sb = new StringBuilder();
int blockNumber = 0;
while (blockNumber < numBlocks) {
State tstate = getState(blockNumber);
int endBlockNumber = blockNumber;
while ((endBlockNumber < numBlocks) && (getState(endBlockNumber)
== tstate)) {
endBlockNumber++;
}
sb.append(
String.format("[%03d ~ %03d] %s%n", blockNumber, endBlockNumber - 1,
tstate));
blockNumber = endBlockNumber;
}
return sb.toString();
}
private void throwIfInvalidBlockNumber(int blockNumber) {
checkWithinRange(blockNumber, "blockNumber", 0, numBlocks - 1);
}
private void throwIfInvalidOffset(long offset) {
checkWithinRange(offset, "offset", 0, fileSize - 1);
}
}
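
Usage sketch (illustrative only, not part of the file above; the sizes are arbitrary): a 250-byte file with a 100-byte block size splits into three blocks, the last of which is partial.

import org.apache.hadoop.fs.impl.prefetch.BlockData;

public class BlockDataSketch {
  public static void main(String[] args) {
    // 250-byte file, 100-byte blocks => blocks 0..2 with sizes 100, 100, 50.
    BlockData blockData = new BlockData(250, 100);
    System.out.println(blockData.getNumBlocks());             // 3
    System.out.println(blockData.getBlockNumber(150));        // 1: offset 150 is in block 1
    System.out.println(blockData.getStartOffset(2));          // 200
    System.out.println(blockData.getRelativeOffset(2, 230));  // 30
    System.out.println(blockData.getSize(2));                 // 50: the last, partial block
    System.out.println(blockData.isLastBlock(2));             // true
  }
}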

View File

@ -0,0 +1,145 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.apache.hadoop.fs.impl.prefetch;
import java.io.Closeable;
import java.io.IOException;
import java.nio.ByteBuffer;
import static org.apache.hadoop.fs.impl.prefetch.Validate.checkNotNegative;
import static org.apache.hadoop.fs.impl.prefetch.Validate.checkNotNull;
/**
* Provides read access to the underlying file one block at a time.
*
* This class is the simplest form of a {@code BlockManager} that does
* not perform any prefetching or caching.
*/
public abstract class BlockManager implements Closeable {
/**
* Information about each block of the underlying file.
*/
private final BlockData blockData;
/**
* Constructs an instance of {@code BlockManager}.
*
* @param blockData information about each block of the underlying file.
*
* @throws IllegalArgumentException if blockData is null.
*/
public BlockManager(BlockData blockData) {
checkNotNull(blockData, "blockData");
this.blockData = blockData;
}
/**
* Gets block data information.
*
* @return instance of {@code BlockData}.
*/
public BlockData getBlockData() {
return blockData;
}
/**
* Gets the block having the given {@code blockNumber}.
*
* The entire block is read into memory and returned as a {@code BufferData}.
* The blocks are treated as a limited resource and must be released when
* one is done reading them.
*
* @param blockNumber the number of the block to be read and returned.
* @return {@code BufferData} having data from the given block.
*
* @throws IOException if there is an error reading the given block.
* @throws IllegalArgumentException if blockNumber is negative.
*/
public BufferData get(int blockNumber) throws IOException {
checkNotNegative(blockNumber, "blockNumber");
int size = blockData.getSize(blockNumber);
ByteBuffer buffer = ByteBuffer.allocate(size);
long startOffset = blockData.getStartOffset(blockNumber);
read(buffer, startOffset, size);
buffer.flip();
return new BufferData(blockNumber, buffer);
}
/**
* Reads into the given {@code buffer} {@code size} bytes from the underlying file
* starting at {@code startOffset}.
*
* @param buffer the buffer to read data in to.
* @param startOffset the offset at which reading starts.
* @param size the number of bytes to read.
* @return number of bytes read.
* @throws IOException if there is an error reading from the underlying file.
*/
public abstract int read(ByteBuffer buffer, long startOffset, int size) throws IOException;
/**
* Releases resources allocated to the given block.
*
* @param data the {@code BufferData} to release.
*
* @throws IllegalArgumentException if data is null.
*/
public void release(BufferData data) {
checkNotNull(data, "data");
// Do nothing because we allocate a new buffer each time.
}
/**
* Requests optional prefetching of the given block.
*
* @param blockNumber the id of the block to prefetch.
*
* @throws IllegalArgumentException if blockNumber is negative.
*/
public void requestPrefetch(int blockNumber) {
checkNotNegative(blockNumber, "blockNumber");
// Do nothing because we do not support prefetches.
}
/**
* Requests cancellation of any previously issued prefetch requests.
*/
public void cancelPrefetches() {
// Do nothing because we do not support prefetches.
}
/**
* Requests that the given block should be copied to the cache. Optional operation.
*
* @param data the {@code BufferData} instance to optionally cache.
*/
public void requestCaching(BufferData data) {
// Do nothing because we do not support caching.
}
@Override
public void close() {
}
}
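
A sketch of a concrete subclass (hypothetical; the real implementations read from remote stores): only the abstract read() needs to be supplied. Here the "file" is an in-memory byte array.

import java.nio.ByteBuffer;

import org.apache.hadoop.fs.impl.prefetch.BlockData;
import org.apache.hadoop.fs.impl.prefetch.BlockManager;

class ByteArrayBlockManager extends BlockManager {
  private final byte[] source;

  ByteArrayBlockManager(byte[] source, int blockSize) {
    super(new BlockData(source.length, blockSize));
    this.source = source;
  }

  @Override
  public int read(ByteBuffer buffer, long startOffset, int size) {
    // Copy the requested range into the caller-supplied buffer.
    buffer.put(source, (int) startOffset, size);
    return size;
  }
}

Callers then obtain blocks via get(blockNumber) and hand each BufferData back through release(data) once done reading it.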

View File

@ -0,0 +1,425 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.apache.hadoop.fs.impl.prefetch;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import static org.apache.hadoop.fs.impl.prefetch.Validate.checkNotNegative;
/**
* Block level operations performed on a file.
* This class is meant to be used by {@code BlockManager}.
* It is separated out in its own file due to its size.
*
* This class is used for debugging/logging. Calls to this class
* can be safely removed without affecting the overall operation.
*/
public final class BlockOperations {
private static final Logger LOG = LoggerFactory.getLogger(BlockOperations.class);
/**
* Operation kind.
*/
public enum Kind {
UNKNOWN("??", "unknown", false),
CANCEL_PREFETCHES("CP", "cancelPrefetches", false),
CLOSE("CX", "close", false),
CACHE_PUT("C+", "putC", true),
GET_CACHED("GC", "getCached", true),
GET_PREFETCHED("GP", "getPrefetched", true),
GET_READ("GR", "getRead", true),
PREFETCH("PF", "prefetch", true),
RELEASE("RL", "release", true),
REQUEST_CACHING("RC", "requestCaching", true),
REQUEST_PREFETCH("RP", "requestPrefetch", true);
private String shortName;
private String name;
private boolean hasBlock;
Kind(String shortName, String name, boolean hasBlock) {
this.shortName = shortName;
this.name = name;
this.hasBlock = hasBlock;
}
private static Map<String, Kind> shortNameToKind = new HashMap<>();
public static Kind fromShortName(String shortName) {
if (shortNameToKind.isEmpty()) {
for (Kind kind : Kind.values()) {
shortNameToKind.put(kind.shortName, kind);
}
}
return shortNameToKind.get(shortName);
}
}
public static class Operation {
private final Kind kind;
private final int blockNumber;
private final long timestamp;
public Operation(Kind kind, int blockNumber) {
this.kind = kind;
this.blockNumber = blockNumber;
this.timestamp = System.nanoTime();
}
public Kind getKind() {
return kind;
}
public int getBlockNumber() {
return blockNumber;
}
public long getTimestamp() {
return timestamp;
}
public void getSummary(StringBuilder sb) {
if (kind.hasBlock) {
sb.append(String.format("%s(%d)", kind.shortName, blockNumber));
} else {
sb.append(String.format("%s", kind.shortName));
}
}
public String getDebugInfo() {
if (kind.hasBlock) {
return String.format("--- %s(%d)", kind.name, blockNumber);
} else {
return String.format("... %s()", kind.name);
}
}
}
public static class End extends Operation {
private Operation op;
public End(Operation op) {
super(op.kind, op.blockNumber);
this.op = op;
}
@Override
public void getSummary(StringBuilder sb) {
sb.append("E");
super.getSummary(sb);
}
@Override
public String getDebugInfo() {
return "***" + super.getDebugInfo().substring(3);
}
public double duration() {
return (getTimestamp() - op.getTimestamp()) / 1e9;
}
}
private ArrayList<Operation> ops;
private boolean debugMode;
public BlockOperations() {
this.ops = new ArrayList<>();
}
public synchronized void setDebug(boolean state) {
debugMode = state;
}
private synchronized Operation add(Operation op) {
if (debugMode) {
LOG.info(op.getDebugInfo());
}
ops.add(op);
return op;
}
public Operation getPrefetched(int blockNumber) {
checkNotNegative(blockNumber, "blockNumber");
return add(new Operation(Kind.GET_PREFETCHED, blockNumber));
}
public Operation getCached(int blockNumber) {
checkNotNegative(blockNumber, "blockNumber");
return add(new Operation(Kind.GET_CACHED, blockNumber));
}
public Operation getRead(int blockNumber) {
checkNotNegative(blockNumber, "blockNumber");
return add(new Operation(Kind.GET_READ, blockNumber));
}
public Operation release(int blockNumber) {
checkNotNegative(blockNumber, "blockNumber");
return add(new Operation(Kind.RELEASE, blockNumber));
}
public Operation requestPrefetch(int blockNumber) {
checkNotNegative(blockNumber, "blockNumber");
return add(new Operation(Kind.REQUEST_PREFETCH, blockNumber));
}
public Operation prefetch(int blockNumber) {
checkNotNegative(blockNumber, "blockNumber");
return add(new Operation(Kind.PREFETCH, blockNumber));
}
public Operation cancelPrefetches() {
return add(new Operation(Kind.CANCEL_PREFETCHES, -1));
}
public Operation close() {
return add(new Operation(Kind.CLOSE, -1));
}
public Operation requestCaching(int blockNumber) {
checkNotNegative(blockNumber, "blockNumber");
return add(new Operation(Kind.REQUEST_CACHING, blockNumber));
}
public Operation addToCache(int blockNumber) {
checkNotNegative(blockNumber, "blockNumber");
return add(new Operation(Kind.CACHE_PUT, blockNumber));
}
public Operation end(Operation op) {
return add(new End(op));
}
private static void append(StringBuilder sb, String format, Object... args) {
sb.append(String.format(format, args));
}
public synchronized String getSummary(boolean showDebugInfo) {
StringBuilder sb = new StringBuilder();
for (Operation op : ops) {
if (op != null) {
if (showDebugInfo) {
sb.append(op.getDebugInfo());
sb.append("\n");
} else {
op.getSummary(sb);
sb.append(";");
}
}
}
sb.append("\n");
getDurationInfo(sb);
return sb.toString();
}
public synchronized void getDurationInfo(StringBuilder sb) {
Map<Kind, DoubleSummaryStatistics> durations = new HashMap<>();
for (Operation op : ops) {
if (op instanceof End) {
End endOp = (End) op;
DoubleSummaryStatistics stats = durations.get(endOp.getKind());
if (stats == null) {
stats = new DoubleSummaryStatistics();
durations.put(endOp.getKind(), stats);
}
stats.accept(endOp.duration());
}
}
List<Kind> kinds = Arrays.asList(
Kind.GET_CACHED,
Kind.GET_PREFETCHED,
Kind.GET_READ,
Kind.CACHE_PUT,
Kind.PREFETCH,
Kind.REQUEST_CACHING,
Kind.REQUEST_PREFETCH,
Kind.CANCEL_PREFETCHES,
Kind.RELEASE,
Kind.CLOSE
);
for (Kind kind : kinds) {
append(sb, "%-18s : ", kind);
DoubleSummaryStatistics stats = durations.get(kind);
if (stats == null) {
append(sb, "--\n");
} else {
append(
sb,
"#ops = %3d, total = %5.1f, min: %3.1f, avg: %3.1f, max: %3.1f\n",
stats.getCount(),
stats.getSum(),
stats.getMin(),
stats.getAverage(),
stats.getMax());
}
}
}
public synchronized void analyze(StringBuilder sb) {
Map<Integer, List<Operation>> blockOps = new HashMap<>();
// Group-by block number.
for (Operation op : ops) {
if (op.blockNumber < 0) {
continue;
}
List<Operation> perBlockOps =
blockOps.computeIfAbsent(op.blockNumber, b -> new ArrayList<>());
perBlockOps.add(op);
}
List<Integer> prefetchedNotUsed = new ArrayList<>();
List<Integer> cachedNotUsed = new ArrayList<>();
for (Map.Entry<Integer, List<Operation>> entry : blockOps.entrySet()) {
Integer blockNumber = entry.getKey();
List<Operation> perBlockOps = entry.getValue();
Map<Kind, Integer> kindCounts = new HashMap<>();
Map<Kind, Integer> endKindCounts = new HashMap<>();
for (Operation op : perBlockOps) {
if (op instanceof End) {
int endCount = endKindCounts.getOrDefault(op.kind, 0) + 1;
endKindCounts.put(op.kind, endCount);
} else {
int count = kindCounts.getOrDefault(op.kind, 0) + 1;
kindCounts.put(op.kind, count);
}
}
for (Kind kind : kindCounts.keySet()) {
int count = kindCounts.getOrDefault(kind, 0);
int endCount = endKindCounts.getOrDefault(kind, 0);
if (count != endCount) {
append(sb, "[%d] %s : #ops(%d) != #end-ops(%d)\n", blockNumber, kind, count, endCount);
}
if (count > 1) {
append(sb, "[%d] %s = %d\n", blockNumber, kind, count);
}
}
int prefetchCount = kindCounts.getOrDefault(Kind.PREFETCH, 0);
int getPrefetchedCount = kindCounts.getOrDefault(Kind.GET_PREFETCHED, 0);
if ((prefetchCount > 0) && (getPrefetchedCount < prefetchCount)) {
prefetchedNotUsed.add(blockNumber);
}
int cacheCount = kindCounts.getOrDefault(Kind.CACHE_PUT, 0);
int getCachedCount = kindCounts.getOrDefault(Kind.GET_CACHED, 0);
if ((cacheCount > 0) && (getCachedCount < cacheCount)) {
cachedNotUsed.add(blockNumber);
}
}
if (!prefetchedNotUsed.isEmpty()) {
append(sb, "Prefetched but not used: %s\n", getIntList(prefetchedNotUsed));
}
if (!cachedNotUsed.isEmpty()) {
append(sb, "Cached but not used: %s\n", getIntList(cachedNotUsed));
}
}
private static String getIntList(Iterable<Integer> nums) {
List<String> numList = new ArrayList<>();
for (Integer n : nums) {
numList.add(n.toString());
}
return String.join(", ", numList);
}
public static BlockOperations fromSummary(String summary) {
BlockOperations ops = new BlockOperations();
ops.setDebug(true);
Pattern blockOpPattern = Pattern.compile("([A-Z+]+)(\\(([0-9]+)?\\))?");
String[] tokens = summary.split(";");
for (String token : tokens) {
Matcher matcher = blockOpPattern.matcher(token);
if (!matcher.matches()) {
String message = String.format("Unknown summary format: %s", token);
throw new IllegalArgumentException(message);
}
String shortName = matcher.group(1);
String blockNumberStr = matcher.group(3);
int blockNumber = (blockNumberStr == null) ? -1 : Integer.parseInt(blockNumberStr);
Kind kind = Kind.fromShortName(shortName);
Kind endKind = null;
if (kind == null) {
if (shortName.charAt(0) == 'E') {
endKind = Kind.fromShortName(shortName.substring(1));
}
}
if (kind == null && endKind == null) {
String message = String.format("Unknown short name: %s (token = %s)", shortName, token);
throw new IllegalArgumentException(message);
}
if (kind != null) {
ops.add(new Operation(kind, blockNumber));
} else {
Operation op = null;
for (int i = ops.ops.size() - 1; i >= 0; i--) {
op = ops.ops.get(i);
if ((op.blockNumber == blockNumber) && (op.kind == endKind) && !(op instanceof End)) {
ops.add(new End(op));
break;
}
}
if (op == null) {
LOG.warn("Start op not found: {}({})", endKind, blockNumber);
}
}
}
return ops;
}
}
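
A sketch of the summary notation (the string below is hand-written using the short names defined in Kind; the "E" prefix marks an end operation):

import org.apache.hadoop.fs.impl.prefetch.BlockOperations;

public class BlockOperationsSketch {
  public static void main(String[] args) {
    // "GR" is the short name of Kind.GET_READ, so this records a read of
    // block 0 followed by its matching end operation.
    BlockOperations ops = BlockOperations.fromSummary("GR(0);EGR(0)");
    // Prints one line per operation using the long names, then durations.
    System.out.println(ops.getSummary(true));
  }
}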

View File

@ -0,0 +1,195 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.apache.hadoop.fs.impl.prefetch;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;
import java.util.concurrent.ArrayBlockingQueue;
import static org.apache.hadoop.fs.impl.prefetch.Validate.checkNotNull;
/**
* Manages a fixed pool of resources.
*
* Avoids creating a new resource if a previously created instance is already available.
*/
public abstract class BoundedResourcePool<T> extends ResourcePool<T> {
/**
* The size of this pool. Fixed at creation time.
*/
private final int size;
/**
* Items currently available in the pool.
*/
private ArrayBlockingQueue<T> items;
/**
* Items that have been created so far (regardless of whether they are currently available).
*/
private Set<T> createdItems;
/**
* Constructs a resource pool of the given size.
*
* @param size the size of this pool. Cannot be changed post creation.
*
* @throws IllegalArgumentException if size is zero or negative.
*/
public BoundedResourcePool(int size) {
Validate.checkPositiveInteger(size, "size");
this.size = size;
this.items = new ArrayBlockingQueue<>(size);
// The created items are identified based on their object reference.
this.createdItems = Collections.newSetFromMap(new IdentityHashMap<T, Boolean>());
}
/**
* Acquires a resource, blocking if necessary until one becomes available.
*/
@Override
public T acquire() {
return this.acquireHelper(true);
}
/**
* Acquires a resource if one is immediately available; otherwise returns null.
*/
@Override
public T tryAcquire() {
return this.acquireHelper(false);
}
/**
* Releases a previously acquired resource.
*
* @throws IllegalArgumentException if item is null.
*/
@Override
public void release(T item) {
checkNotNull(item, "item");
synchronized (createdItems) {
if (!createdItems.contains(item)) {
throw new IllegalArgumentException("This item is not a part of this pool");
}
}
// Return if this item was released earlier.
// We cannot use items.contains() because that check is not based on reference equality.
for (T entry : items) {
if (entry == item) {
return;
}
}
try {
items.put(item);
} catch (InterruptedException e) {
throw new IllegalStateException("release() should never block", e);
}
}
@Override
public synchronized void close() {
for (T item : createdItems) {
close(item);
}
items.clear();
items = null;
createdItems.clear();
createdItems = null;
}
/**
* Derived classes may implement a way to cleanup each item.
*/
@Override
protected synchronized void close(T item) {
// Do nothing in this class. Allow overriding classes to take any cleanup action.
}
/**
* Number of items created so far. Mostly for testing purposes.
* @return the count.
*/
public int numCreated() {
synchronized (createdItems) {
return createdItems.size();
}
}
/**
* Number of items available to be acquired. Mostly for testing purposes.
* @return the number available.
*/
public synchronized int numAvailable() {
return (size - numCreated()) + items.size();
}
// For debugging purposes.
@Override
public synchronized String toString() {
return String.format(
"size = %d, #created = %d, #in-queue = %d, #available = %d",
size, numCreated(), items.size(), numAvailable());
}
/**
* Derived classes must implement a way to create an instance of a resource.
*/
protected abstract T createNew();
private T acquireHelper(boolean canBlock) {
// Prefer reusing an item if one is available.
// That avoids unnecessarily creating new instances.
T result = items.poll();
if (result != null) {
return result;
}
synchronized (createdItems) {
// Create a new instance if allowed by the capacity of this pool.
if (createdItems.size() < size) {
T item = createNew();
createdItems.add(item);
return item;
}
}
if (canBlock) {
try {
// Block for an instance to be available.
return items.take();
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
return null;
}
} else {
return null;
}
}
}
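
A minimal concrete pool, as a sketch (illustrative only; BufferPool below builds an equivalent anonymous subclass): the only method a subclass must supply is createNew().

import java.nio.ByteBuffer;

import org.apache.hadoop.fs.impl.prefetch.BoundedResourcePool;

class ByteBufferResourcePool extends BoundedResourcePool<ByteBuffer> {
  private final int bufferSize;

  ByteBufferResourcePool(int size, int bufferSize) {
    super(size);
    this.bufferSize = bufferSize;
  }

  @Override
  protected ByteBuffer createNew() {
    // Called at most 'size' times over the lifetime of the pool.
    return ByteBuffer.allocate(bufferSize);
  }
}

Callers acquire with acquire() (blocking) or tryAcquire() (non-blocking) and must return every instance via release().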

View File

@ -0,0 +1,319 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.apache.hadoop.fs.impl.prefetch;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Future;
import java.util.zip.CRC32;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
/**
* Holds the state of a ByteBuffer that is in use by {@code CachingBlockManager}.
*
* This class is not meant to be of general use. It exists in its own file due to its size.
* We use the term block and buffer interchangeably in this file because one buffer
* holds exactly one block of data.
*
* Holding all of the state associated with a block allows us to validate and control
* state transitions in a synchronized fashion.
*/
public final class BufferData {
private static final Logger LOG = LoggerFactory.getLogger(BufferData.class);
public enum State {
/**
* Unknown / invalid state.
*/
UNKNOWN,
/**
* Buffer has been acquired but has no data.
*/
BLANK,
/**
* This block is being prefetched.
*/
PREFETCHING,
/**
* This block is being added to the local cache.
*/
CACHING,
/**
* This block has data and is ready to be read.
*/
READY,
/**
* This block is no longer in-use and should not be used once in this state.
*/
DONE
}
/**
* Number of the block associated with this buffer.
*/
private final int blockNumber;
/**
* The buffer associated with this block.
*/
private ByteBuffer buffer;
/**
* Current state of this block.
*/
private volatile State state;
/**
* Future of the action being performed on this block (eg, prefetching or caching).
*/
private Future<Void> action;
/**
* Checksum of the buffer contents once in READY state.
*/
private long checksum = 0;
/**
* Constructs an instance of this class.
*
* @param blockNumber Number of the block associated with this buffer.
* @param buffer The buffer associated with this block.
*
* @throws IllegalArgumentException if blockNumber is negative.
* @throws IllegalArgumentException if buffer is null.
*/
public BufferData(int blockNumber, ByteBuffer buffer) {
Validate.checkNotNegative(blockNumber, "blockNumber");
Validate.checkNotNull(buffer, "buffer");
this.blockNumber = blockNumber;
this.buffer = buffer;
this.state = State.BLANK;
}
/**
* Gets the id of this block.
*
* @return the id of this block.
*/
public int getBlockNumber() {
return this.blockNumber;
}
/**
* Gets the buffer associated with this block.
*
* @return the buffer associated with this block.
*/
public ByteBuffer getBuffer() {
return this.buffer;
}
/**
* Gets the state of this block.
*
* @return the state of this block.
*/
public State getState() {
return this.state;
}
/**
* Gets the checksum of data in this block.
*
* @return the checksum of data in this block.
*/
public long getChecksum() {
return this.checksum;
}
/**
* Computes CRC32 checksum of the given buffer's contents.
*
* @param buffer the buffer whose content's checksum is to be computed.
* @return the computed checksum.
*/
public static long getChecksum(ByteBuffer buffer) {
ByteBuffer tempBuffer = buffer.duplicate();
tempBuffer.rewind();
CRC32 crc32 = new CRC32();
crc32.update(tempBuffer);
return crc32.getValue();
}
public synchronized Future<Void> getActionFuture() {
return this.action;
}
/**
* Indicates that a prefetch operation is in progress.
*
* @param actionFuture the {@code Future} of a prefetch action.
*
* @throws IllegalArgumentException if actionFuture is null.
*/
public synchronized void setPrefetch(Future<Void> actionFuture) {
Validate.checkNotNull(actionFuture, "actionFuture");
this.updateState(State.PREFETCHING, State.BLANK);
this.action = actionFuture;
}
/**
* Indicates that a caching operation is in progress.
*
* @param actionFuture the {@code Future} of a caching action.
*
* @throws IllegalArgumentException if actionFuture is null.
*/
public synchronized void setCaching(Future<Void> actionFuture) {
Validate.checkNotNull(actionFuture, "actionFuture");
this.throwIfStateIncorrect(State.PREFETCHING, State.READY);
this.state = State.CACHING;
this.action = actionFuture;
}
/**
* Marks the completion of reading data into the buffer.
* The buffer cannot be modified once in this state.
*
* @param expectedCurrentState the collection of states from which transition to READY is allowed.
*/
public synchronized void setReady(State... expectedCurrentState) {
if (this.checksum != 0) {
throw new IllegalStateException("Checksum cannot be changed once set");
}
this.buffer = this.buffer.asReadOnlyBuffer();
this.checksum = getChecksum(this.buffer);
this.buffer.rewind();
this.updateState(State.READY, expectedCurrentState);
}
/**
* Indicates that this block is no longer of use and can be reclaimed.
*/
public synchronized void setDone() {
if (this.checksum != 0) {
if (getChecksum(this.buffer) != this.checksum) {
throw new IllegalStateException("checksum changed after setReady()");
}
}
this.state = State.DONE;
this.action = null;
}
/**
* Updates the current state to the specified value.
* Asserts that the current state is as expected.
* @param newState the state to transition to.
* @param expectedCurrentState the collection of states from which
* transition to {@code newState} is allowed.
*
* @throws IllegalArgumentException if newState is null.
* @throws IllegalArgumentException if expectedCurrentState is null.
*/
public synchronized void updateState(State newState,
State... expectedCurrentState) {
Validate.checkNotNull(newState, "newState");
Validate.checkNotNull(expectedCurrentState, "expectedCurrentState");
this.throwIfStateIncorrect(expectedCurrentState);
this.state = newState;
}
/**
* Helper that asserts the current state is one of the expected values.
*
* @param states the collection of allowed states.
*
* @throws IllegalArgumentException if states is null.
*/
public void throwIfStateIncorrect(State... states) {
Validate.checkNotNull(states, "states");
if (this.stateEqualsOneOf(states)) {
return;
}
List<String> statesStr = new ArrayList<String>();
for (State s : states) {
statesStr.add(s.toString());
}
String message = String.format(
"Expected buffer state to be '%s' but found: %s",
String.join(" or ", statesStr), this);
throw new IllegalStateException(message);
}
public boolean stateEqualsOneOf(State... states) {
State currentState = this.state;
for (State s : states) {
if (currentState == s) {
return true;
}
}
return false;
}
public String toString() {
return String.format(
"[%03d] id: %03d, %s: buf: %s, checksum: %d, future: %s",
this.blockNumber,
System.identityHashCode(this),
this.state,
this.getBufferStr(this.buffer),
this.checksum,
this.getFutureStr(this.action));
}
private String getFutureStr(Future<Void> f) {
if (f == null) {
return "--";
} else {
return f.isDone() ? "done" : "not done";
}
}
private String getBufferStr(ByteBuffer buf) {
if (buf == null) {
return "--";
} else {
return String.format(
"(id = %d, pos = %d, lim = %d)",
System.identityHashCode(buf),
buf.position(), buf.limit());
}
}
}
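
The normal life cycle of a buffer, as a sketch (illustrative; a completed future stands in for a real prefetch task):

import java.nio.ByteBuffer;
import java.util.concurrent.CompletableFuture;

import org.apache.hadoop.fs.impl.prefetch.BufferData;

public class BufferDataSketch {
  public static void main(String[] args) {
    BufferData data = new BufferData(0, ByteBuffer.allocate(8)); // starts BLANK
    data.setPrefetch(CompletableFuture.completedFuture(null));   // BLANK -> PREFETCHING
    data.setReady(BufferData.State.PREFETCHING);                 // PREFETCHING -> READY
    data.setDone();                                              // READY -> DONE
    System.out.println(data.getState());                         // DONE
  }
}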

View File

@ -0,0 +1,323 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.apache.hadoop.fs.impl.prefetch;
import java.io.Closeable;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Future;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import static java.util.Objects.requireNonNull;
import static org.apache.hadoop.fs.impl.prefetch.Validate.checkNotNegative;
import static org.apache.hadoop.fs.impl.prefetch.Validate.checkState;
import static org.apache.hadoop.util.Preconditions.checkArgument;
import static org.apache.hadoop.util.Preconditions.checkNotNull;
/**
* Manages a fixed pool of {@code ByteBuffer} instances.
* <p>
* Avoids creating a new buffer if a previously created buffer is already available.
*/
public class BufferPool implements Closeable {
private static final Logger LOG = LoggerFactory.getLogger(BufferPool.class);
/**
* Max number of buffers in this pool.
*/
private final int size;
/**
* Size in bytes of each buffer.
*/
private final int bufferSize;
/*
Invariants for internal state.
-- a buffer is either in this.pool or in this.allocated
-- transition between this.pool <==> this.allocated must be atomic
-- only one buffer allocated for a given blockNumber
*/
/**
* Underlying bounded resource pool.
*/
private BoundedResourcePool<ByteBuffer> pool;
/**
* Allows associating metadata to each buffer in the pool.
*/
private Map<BufferData, ByteBuffer> allocated;
/**
* Prefetching stats.
*/
private PrefetchingStatistics prefetchingStatistics;
/**
* Initializes a new instance of the {@code BufferPool} class.
* @param size number of buffers in this pool.
* @param bufferSize size in bytes of each buffer.
* @param prefetchingStatistics statistics for this stream.
* @throws IllegalArgumentException if size is zero or negative.
* @throws IllegalArgumentException if bufferSize is zero or negative.
*/
public BufferPool(int size,
int bufferSize,
PrefetchingStatistics prefetchingStatistics) {
Validate.checkPositiveInteger(size, "size");
Validate.checkPositiveInteger(bufferSize, "bufferSize");
this.size = size;
this.bufferSize = bufferSize;
this.allocated = new IdentityHashMap<BufferData, ByteBuffer>();
this.prefetchingStatistics = requireNonNull(prefetchingStatistics);
this.pool = new BoundedResourcePool<ByteBuffer>(size) {
@Override
public ByteBuffer createNew() {
ByteBuffer buffer = ByteBuffer.allocate(bufferSize);
prefetchingStatistics.memoryAllocated(bufferSize);
return buffer;
}
};
}
/**
* Gets a list of all blocks in this pool.
* @return a list of all blocks in this pool.
*/
public List<BufferData> getAll() {
synchronized (allocated) {
return Collections.unmodifiableList(new ArrayList<>(allocated.keySet()));
}
}
/**
* Acquires a {@code ByteBuffer}, blocking if necessary until one becomes available.
* @param blockNumber the id of the block to acquire.
* @return the acquired block's {@code BufferData}.
*/
public synchronized BufferData acquire(int blockNumber) {
BufferData data;
final int maxRetryDelayMs = 600 * 1000;
final int statusUpdateDelayMs = 120 * 1000;
Retryer retryer = new Retryer(10, maxRetryDelayMs, statusUpdateDelayMs);
do {
if (retryer.updateStatus()) {
if (LOG.isDebugEnabled()) {
LOG.debug("waiting to acquire block: {}", blockNumber);
LOG.debug("state = {}", this);
}
releaseReadyBlock(blockNumber);
}
data = tryAcquire(blockNumber);
}
while ((data == null) && retryer.continueRetry());
if (data != null) {
return data;
} else {
String message =
String.format("Wait failed for acquire(%d)", blockNumber);
throw new IllegalStateException(message);
}
}
/**
* Acquires a buffer if one is immediately available. Otherwise returns null.
* @param blockNumber the id of the block to try acquire.
* @return the acquired block's {@code BufferData} or null.
*/
public synchronized BufferData tryAcquire(int blockNumber) {
return acquireHelper(blockNumber, false);
}
private synchronized BufferData acquireHelper(int blockNumber,
boolean canBlock) {
checkNotNegative(blockNumber, "blockNumber");
releaseDoneBlocks();
BufferData data = find(blockNumber);
if (data != null) {
return data;
}
ByteBuffer buffer = canBlock ? pool.acquire() : pool.tryAcquire();
if (buffer == null) {
return null;
}
buffer.clear();
data = new BufferData(blockNumber, buffer.duplicate());
synchronized (allocated) {
checkState(find(blockNumber) == null, "buffer data already exists");
allocated.put(data, buffer);
}
return data;
}
/**
* Releases resources for any blocks marked as 'done'.
*/
private synchronized void releaseDoneBlocks() {
for (BufferData data : getAll()) {
if (data.stateEqualsOneOf(BufferData.State.DONE)) {
release(data);
}
}
}
/**
* If no blocks were released after calling releaseDoneBlocks() a few times,
* we may end up waiting forever. To avoid that situation, we try releasing
* a 'ready' block farthest away from the given block.
*/
private synchronized void releaseReadyBlock(int blockNumber) {
BufferData releaseTarget = null;
for (BufferData data : getAll()) {
if (data.stateEqualsOneOf(BufferData.State.READY)) {
if (releaseTarget == null) {
releaseTarget = data;
} else {
if (distance(data, blockNumber) > distance(releaseTarget,
blockNumber)) {
releaseTarget = data;
}
}
}
}
if (releaseTarget != null) {
LOG.warn("releasing 'ready' block: {}", releaseTarget);
releaseTarget.setDone();
}
}
private int distance(BufferData data, int blockNumber) {
return Math.abs(data.getBlockNumber() - blockNumber);
}
/**
* Releases a previously acquired resource.
* @param data the {@code BufferData} instance to release.
* @throws IllegalArgumentException if data is null.
* @throws IllegalArgumentException if data cannot be released due to its state.
*/
public synchronized void release(BufferData data) {
checkNotNull(data, "data");
synchronized (data) {
checkArgument(
canRelease(data),
String.format("Unable to release buffer: %s", data));
ByteBuffer buffer = allocated.get(data);
if (buffer == null) {
// Likely released earlier.
return;
}
buffer.clear();
pool.release(buffer);
allocated.remove(data);
}
releaseDoneBlocks();
}
@Override
public synchronized void close() {
for (BufferData data : getAll()) {
Future<Void> actionFuture = data.getActionFuture();
if (actionFuture != null) {
actionFuture.cancel(true);
}
}
int currentPoolSize = pool.numCreated();
pool.close();
pool = null;
allocated.clear();
allocated = null;
prefetchingStatistics.memoryFreed(currentPoolSize * bufferSize);
}
// For debugging purposes.
@Override
public String toString() {
StringBuilder sb = new StringBuilder();
sb.append(pool.toString());
sb.append("\n");
List<BufferData> allData = new ArrayList<>(getAll());
Collections.sort(allData,
(d1, d2) -> d1.getBlockNumber() - d2.getBlockNumber());
for (BufferData data : allData) {
sb.append(data.toString());
sb.append("\n");
}
return sb.toString();
}
// Number of ByteBuffers created so far.
public synchronized int numCreated() {
return pool.numCreated();
}
// Number of ByteBuffers available to be acquired.
public synchronized int numAvailable() {
releaseDoneBlocks();
return pool.numAvailable();
}
private BufferData find(int blockNumber) {
synchronized (allocated) {
for (BufferData data : allocated.keySet()) {
if ((data.getBlockNumber() == blockNumber)
&& !data.stateEqualsOneOf(BufferData.State.DONE)) {
return data;
}
}
}
return null;
}
private boolean canRelease(BufferData data) {
return data.stateEqualsOneOf(
BufferData.State.DONE,
BufferData.State.READY);
}
}
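
A usage sketch (illustrative only; EmptyPrefetchingStatistics, shown later in this change, stubs out the statistics callbacks):

import org.apache.hadoop.fs.impl.prefetch.BufferData;
import org.apache.hadoop.fs.impl.prefetch.BufferPool;
import org.apache.hadoop.fs.impl.prefetch.EmptyPrefetchingStatistics;

public class BufferPoolSketch {
  public static void main(String[] args) {
    // At most two 1 KB buffers.
    BufferPool pool = new BufferPool(2, 1024, EmptyPrefetchingStatistics.getInstance());
    BufferData block0 = pool.acquire(0);  // returns a BLANK buffer for block 0
    block0.setDone();                     // only DONE or READY buffers may be released
    pool.release(block0);
    pool.close();
  }
}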

View File

@ -0,0 +1,654 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.apache.hadoop.fs.impl.prefetch;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.LocalDirAllocator;
import org.apache.hadoop.fs.statistics.DurationTracker;
import static java.util.Objects.requireNonNull;
import static org.apache.hadoop.fs.impl.prefetch.Validate.checkNotNegative;
import static org.apache.hadoop.io.IOUtils.cleanupWithLogger;
/**
* Provides read access to the underlying file one block at a time.
* Improves read performance by prefetching and locally caching blocks.
*/
public abstract class CachingBlockManager extends BlockManager {
private static final Logger LOG = LoggerFactory.getLogger(CachingBlockManager.class);
private static final int TIMEOUT_MINUTES = 60;
/**
* Asynchronous tasks are performed in this pool.
*/
private final ExecutorServiceFuturePool futurePool;
/**
* Pool of shared ByteBuffer instances.
*/
private BufferPool bufferPool;
/**
* Size of the in-memory cache in terms of number of blocks.
* Total memory consumption is up to bufferPoolSize * blockSize.
*/
private final int bufferPoolSize;
/**
* Local block cache.
*/
private BlockCache cache;
/**
* Error counts. For testing purposes.
*/
private final AtomicInteger numCachingErrors;
private final AtomicInteger numReadErrors;
/**
* Operations performed by this block manager.
*/
private final BlockOperations ops;
private boolean closed;
/**
* If a single caching operation takes more than this time (in seconds),
* we disable caching to prevent further performance degradation caused by caching.
*/
private static final int SLOW_CACHING_THRESHOLD = 5;
/**
* Once set to true, any further caching requests will be ignored.
*/
private final AtomicBoolean cachingDisabled;
private final PrefetchingStatistics prefetchingStatistics;
private final Configuration conf;
private final LocalDirAllocator localDirAllocator;
/**
* Constructs an instance of a {@code CachingBlockManager}.
*
* @param futurePool asynchronous tasks are performed in this pool.
* @param blockData information about each block of the underlying file.
* @param bufferPoolSize size of the in-memory cache in terms of number of blocks.
* @param prefetchingStatistics statistics for this stream.
* @param conf the configuration.
* @param localDirAllocator the local dir allocator instance.
* @throws IllegalArgumentException if bufferPoolSize is zero or negative.
*/
public CachingBlockManager(
ExecutorServiceFuturePool futurePool,
BlockData blockData,
int bufferPoolSize,
PrefetchingStatistics prefetchingStatistics,
Configuration conf,
LocalDirAllocator localDirAllocator) {
super(blockData);
Validate.checkPositiveInteger(bufferPoolSize, "bufferPoolSize");
this.futurePool = requireNonNull(futurePool);
this.bufferPoolSize = bufferPoolSize;
this.numCachingErrors = new AtomicInteger();
this.numReadErrors = new AtomicInteger();
this.cachingDisabled = new AtomicBoolean();
this.prefetchingStatistics = requireNonNull(prefetchingStatistics);
if (this.getBlockData().getFileSize() > 0) {
this.bufferPool = new BufferPool(bufferPoolSize, this.getBlockData().getBlockSize(),
this.prefetchingStatistics);
this.cache = this.createCache();
}
this.ops = new BlockOperations();
this.ops.setDebug(false);
this.conf = requireNonNull(conf);
this.localDirAllocator = localDirAllocator;
}
/**
* Gets the block having the given {@code blockNumber}.
*
* @throws IllegalArgumentException if blockNumber is negative.
*/
@Override
public BufferData get(int blockNumber) throws IOException {
checkNotNegative(blockNumber, "blockNumber");
BufferData data;
final int maxRetryDelayMs = bufferPoolSize * 120 * 1000;
final int statusUpdateDelayMs = 120 * 1000;
Retryer retryer = new Retryer(10, maxRetryDelayMs, statusUpdateDelayMs);
boolean done;
do {
if (closed) {
throw new IOException("this stream is already closed");
}
data = bufferPool.acquire(blockNumber);
done = getInternal(data);
if (retryer.updateStatus()) {
LOG.warn("waiting to get block: {}", blockNumber);
LOG.info("state = {}", this.toString());
}
}
while (!done && retryer.continueRetry());
if (done) {
return data;
} else {
String message = String.format("Wait failed for get(%d)", blockNumber);
throw new IllegalStateException(message);
}
}
private boolean getInternal(BufferData data) throws IOException {
Validate.checkNotNull(data, "data");
// Opportunistic check without locking.
if (data.stateEqualsOneOf(
BufferData.State.PREFETCHING,
BufferData.State.CACHING,
BufferData.State.DONE)) {
return false;
}
synchronized (data) {
// Reconfirm state after locking.
if (data.stateEqualsOneOf(
BufferData.State.PREFETCHING,
BufferData.State.CACHING,
BufferData.State.DONE)) {
return false;
}
int blockNumber = data.getBlockNumber();
if (data.getState() == BufferData.State.READY) {
BlockOperations.Operation op = ops.getPrefetched(blockNumber);
ops.end(op);
return true;
}
data.throwIfStateIncorrect(BufferData.State.BLANK);
read(data);
return true;
}
}
/**
* Releases resources allocated to the given block.
*
* @throws IllegalArgumentException if data is null.
*/
@Override
public void release(BufferData data) {
if (closed) {
return;
}
Validate.checkNotNull(data, "data");
BlockOperations.Operation op = ops.release(data.getBlockNumber());
bufferPool.release(data);
ops.end(op);
}
@Override
public synchronized void close() {
if (closed) {
return;
}
closed = true;
final BlockOperations.Operation op = ops.close();
// Cancel any prefetches in progress.
cancelPrefetches();
cleanupWithLogger(LOG, cache);
ops.end(op);
LOG.info(ops.getSummary(false));
bufferPool.close();
bufferPool = null;
}
/**
* Requests optional prefetching of the given block.
* The block is prefetched only if we can acquire a free buffer.
*
* @throws IllegalArgumentException if blockNumber is negative.
*/
@Override
public void requestPrefetch(int blockNumber) {
checkNotNegative(blockNumber, "blockNumber");
if (closed) {
return;
}
// We initiate a prefetch only if we can acquire a buffer from the shared pool.
BufferData data = bufferPool.tryAcquire(blockNumber);
if (data == null) {
return;
}
// Opportunistic check without locking.
if (!data.stateEqualsOneOf(BufferData.State.BLANK)) {
// The block is ready or being prefetched/cached.
return;
}
synchronized (data) {
// Reconfirm state after locking.
if (!data.stateEqualsOneOf(BufferData.State.BLANK)) {
// The block is ready or being prefetched/cached.
return;
}
BlockOperations.Operation op = ops.requestPrefetch(blockNumber);
PrefetchTask prefetchTask = new PrefetchTask(data, this, Instant.now());
Future<Void> prefetchFuture = futurePool.executeFunction(prefetchTask);
data.setPrefetch(prefetchFuture);
ops.end(op);
}
}
/**
* Requests cancellation of any previously issued prefetch requests.
*/
@Override
public void cancelPrefetches() {
BlockOperations.Operation op = ops.cancelPrefetches();
for (BufferData data : bufferPool.getAll()) {
// We add blocks being prefetched to the local cache so that the prefetch is not wasted.
if (data.stateEqualsOneOf(BufferData.State.PREFETCHING, BufferData.State.READY)) {
requestCaching(data);
}
}
ops.end(op);
}
private void read(BufferData data) throws IOException {
synchronized (data) {
try {
readBlock(data, false, BufferData.State.BLANK);
} catch (IOException e) {
LOG.error("error reading block {}", data.getBlockNumber(), e);
throw e;
}
}
}
private void prefetch(BufferData data, Instant taskQueuedStartTime) throws IOException {
synchronized (data) {
prefetchingStatistics.executorAcquired(
Duration.between(taskQueuedStartTime, Instant.now()));
readBlock(
data,
true,
BufferData.State.PREFETCHING,
BufferData.State.CACHING);
}
}
private void readBlock(BufferData data, boolean isPrefetch, BufferData.State... expectedState)
throws IOException {
if (closed) {
return;
}
BlockOperations.Operation op = null;
DurationTracker tracker = null;
synchronized (data) {
try {
if (data.stateEqualsOneOf(BufferData.State.DONE, BufferData.State.READY)) {
// DONE : Block was released, likely due to caching being disabled on slow perf.
// READY : Block was already fetched by another thread. No need to re-read.
return;
}
data.throwIfStateIncorrect(expectedState);
int blockNumber = data.getBlockNumber();
// Prefer reading from cache over reading from network.
if (cache.containsBlock(blockNumber)) {
op = ops.getCached(blockNumber);
cache.get(blockNumber, data.getBuffer());
data.setReady(expectedState);
return;
}
if (isPrefetch) {
tracker = prefetchingStatistics.prefetchOperationStarted();
op = ops.prefetch(data.getBlockNumber());
} else {
op = ops.getRead(data.getBlockNumber());
}
long offset = getBlockData().getStartOffset(data.getBlockNumber());
int size = getBlockData().getSize(data.getBlockNumber());
ByteBuffer buffer = data.getBuffer();
buffer.clear();
read(buffer, offset, size);
buffer.flip();
data.setReady(expectedState);
} catch (Exception e) {
if (isPrefetch && tracker != null) {
tracker.failed();
}
numReadErrors.incrementAndGet();
data.setDone();
throw e;
} finally {
if (op != null) {
ops.end(op);
}
if (isPrefetch) {
prefetchingStatistics.prefetchOperationCompleted();
if (tracker != null) {
tracker.close();
}
}
}
}
}
/**
* Read task that is submitted to the future pool.
*/
private static class PrefetchTask implements Supplier<Void> {
private final BufferData data;
private final CachingBlockManager blockManager;
private final Instant taskQueuedStartTime;
PrefetchTask(BufferData data, CachingBlockManager blockManager, Instant taskQueuedStartTime) {
this.data = data;
this.blockManager = blockManager;
this.taskQueuedStartTime = taskQueuedStartTime;
}
@Override
public Void get() {
try {
blockManager.prefetch(data, taskQueuedStartTime);
} catch (Exception e) {
LOG.info("error prefetching block {}. {}", data.getBlockNumber(), e.getMessage());
LOG.debug("error prefetching block {}", data.getBlockNumber(), e);
}
return null;
}
}
private static final BufferData.State[] EXPECTED_STATE_AT_CACHING =
new BufferData.State[] {
BufferData.State.PREFETCHING, BufferData.State.READY
};
/**
* Requests that the given block should be copied to the local cache.
* The block must not be accessed by the caller after calling this method
* because it will be released asynchronously relative to the caller.
*
* @throws IllegalArgumentException if data is null.
*/
@Override
public void requestCaching(BufferData data) {
if (closed) {
return;
}
if (cachingDisabled.get()) {
data.setDone();
return;
}
Validate.checkNotNull(data, "data");
// Opportunistic check without locking.
if (!data.stateEqualsOneOf(EXPECTED_STATE_AT_CACHING)) {
return;
}
synchronized (data) {
// Reconfirm state after locking.
if (!data.stateEqualsOneOf(EXPECTED_STATE_AT_CACHING)) {
return;
}
if (cache.containsBlock(data.getBlockNumber())) {
data.setDone();
return;
}
BufferData.State state = data.getState();
BlockOperations.Operation op = ops.requestCaching(data.getBlockNumber());
Future<Void> blockFuture;
if (state == BufferData.State.PREFETCHING) {
blockFuture = data.getActionFuture();
} else {
CompletableFuture<Void> cf = new CompletableFuture<>();
cf.complete(null);
blockFuture = cf;
}
CachePutTask task =
new CachePutTask(data, blockFuture, this, Instant.now());
Future<Void> actionFuture = futurePool.executeFunction(task);
data.setCaching(actionFuture);
ops.end(op);
}
}
private void addToCacheAndRelease(BufferData data, Future<Void> blockFuture,
Instant taskQueuedStartTime) {
prefetchingStatistics.executorAcquired(
Duration.between(taskQueuedStartTime, Instant.now()));
if (closed) {
return;
}
if (cachingDisabled.get()) {
data.setDone();
return;
}
try {
blockFuture.get(TIMEOUT_MINUTES, TimeUnit.MINUTES);
if (data.stateEqualsOneOf(BufferData.State.DONE)) {
// There was an error during prefetch.
return;
}
} catch (Exception e) {
LOG.info("error waiting on blockFuture: {}. {}", data, e.getMessage());
LOG.debug("error waiting on blockFuture: {}", data, e);
data.setDone();
return;
}
if (cachingDisabled.get()) {
data.setDone();
return;
}
BlockOperations.Operation op = null;
synchronized (data) {
try {
if (data.stateEqualsOneOf(BufferData.State.DONE)) {
return;
}
if (cache.containsBlock(data.getBlockNumber())) {
data.setDone();
return;
}
op = ops.addToCache(data.getBlockNumber());
ByteBuffer buffer = data.getBuffer().duplicate();
buffer.rewind();
cachePut(data.getBlockNumber(), buffer);
data.setDone();
} catch (Exception e) {
numCachingErrors.incrementAndGet();
LOG.info("error adding block to cache after wait: {}. {}", data, e.getMessage());
LOG.debug("error adding block to cache after wait: {}", data, e);
data.setDone();
}
if (op != null) {
BlockOperations.End endOp = (BlockOperations.End) ops.end(op);
if (endOp.duration() > SLOW_CACHING_THRESHOLD) {
if (!cachingDisabled.getAndSet(true)) {
String message = String.format(
"Caching disabled because of slow operation (%.1f sec)", endOp.duration());
LOG.warn(message);
}
}
}
}
}
protected BlockCache createCache() {
return new SingleFilePerBlockCache(prefetchingStatistics);
}
protected void cachePut(int blockNumber, ByteBuffer buffer) throws IOException {
if (closed) {
return;
}
cache.put(blockNumber, buffer, conf, localDirAllocator);
}
private static class CachePutTask implements Supplier<Void> {
private final BufferData data;
// Block being asynchronously fetched.
private final Future<Void> blockFuture;
// Block manager that manages this block.
private final CachingBlockManager blockManager;
private final Instant taskQueuedStartTime;
CachePutTask(
BufferData data,
Future<Void> blockFuture,
CachingBlockManager blockManager,
Instant taskQueuedStartTime) {
this.data = data;
this.blockFuture = blockFuture;
this.blockManager = blockManager;
this.taskQueuedStartTime = taskQueuedStartTime;
}
@Override
public Void get() {
blockManager.addToCacheAndRelease(data, blockFuture, taskQueuedStartTime);
return null;
}
}
/**
* Number of ByteBuffers available to be acquired.
*
* @return the number of available buffers.
*/
public int numAvailable() {
return bufferPool.numAvailable();
}
/**
* Number of caching operations completed.
*
* @return the number of cached buffers.
*/
public int numCached() {
return cache.size();
}
/**
* Number of errors encountered when caching.
*
* @return the number of errors encountered when caching.
*/
public int numCachingErrors() {
return numCachingErrors.get();
}
/**
* Number of errors encountered when reading.
*
* @return the number of errors encountered when reading.
*/
public int numReadErrors() {
return numReadErrors.get();
}
BufferData getData(int blockNumber) {
return bufferPool.tryAcquire(blockNumber);
}
@Override
public String toString() {
StringBuilder sb = new StringBuilder();
sb.append("cache(");
sb.append(cache.toString());
sb.append("); ");
sb.append("pool: ");
sb.append(bufferPool.toString());
return sb.toString();
}
}
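
A sketch of the shape of a concrete subclass (hypothetical; in this change the real subclass lives in the S3A connector and issues ranged reads against the object store):

import java.nio.ByteBuffer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.LocalDirAllocator;
import org.apache.hadoop.fs.impl.prefetch.BlockData;
import org.apache.hadoop.fs.impl.prefetch.CachingBlockManager;
import org.apache.hadoop.fs.impl.prefetch.EmptyPrefetchingStatistics;
import org.apache.hadoop.fs.impl.prefetch.ExecutorServiceFuturePool;

class ByteArrayCachingBlockManager extends CachingBlockManager {
  private final byte[] source;

  ByteArrayCachingBlockManager(ExecutorServiceFuturePool futurePool,
      byte[] source, int blockSize, int bufferPoolSize,
      Configuration conf, LocalDirAllocator dirAllocator) {
    super(futurePool, new BlockData(source.length, blockSize), bufferPoolSize,
        EmptyPrefetchingStatistics.getInstance(), conf, dirAllocator);
    this.source = source;
  }

  @Override
  public int read(ByteBuffer buffer, long startOffset, int size) {
    // A real subclass would read this range from the remote store.
    buffer.put(source, (int) startOffset, size);
    return size;
  }
}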

View File

@ -0,0 +1,80 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.apache.hadoop.fs.impl.prefetch;
import java.time.Duration;
import org.apache.hadoop.fs.statistics.DurationTracker;
import static org.apache.hadoop.fs.statistics.IOStatisticsSupport.stubDurationTracker;
/**
* Empty implementation of the prefetching statistics interface.
*/
public final class EmptyPrefetchingStatistics
implements PrefetchingStatistics {
private static final EmptyPrefetchingStatistics
EMPTY_PREFETCHING_STATISTICS =
new EmptyPrefetchingStatistics();
private EmptyPrefetchingStatistics() {
}
public static EmptyPrefetchingStatistics getInstance() {
return EMPTY_PREFETCHING_STATISTICS;
}
@Override
public DurationTracker prefetchOperationStarted() {
return stubDurationTracker();
}
@Override
public void blockAddedToFileCache() {
}
@Override
public void blockRemovedFromFileCache() {
}
@Override
public void prefetchOperationCompleted() {
}
@Override
public void executorAcquired(Duration timeInQueue) {
}
@Override
public void memoryAllocated(int size) {
}
@Override
public void memoryFreed(int size) {
}
}

View File

@ -0,0 +1,88 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.apache.hadoop.fs.impl.prefetch;
import java.util.Locale;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;
import org.slf4j.Logger;
import org.apache.hadoop.util.concurrent.HadoopExecutors;
/**
* A FuturePool implementation backed by a java.util.concurrent.ExecutorService.
*
* If a piece of work has started, it cannot (currently) be cancelled.
*
* This class is a simplified version of <code>com.twitter:util-core_2.11</code>
* ExecutorServiceFuturePool designed to avoid depending on that Scala library.
* One problem with using a Scala library is that many downstream projects
* (e.g. Apache Spark) use Scala, and they might want to use a different version of Scala
* from the version that Hadoop chooses to use.
*
*/
public class ExecutorServiceFuturePool {
private final ExecutorService executor;
public ExecutorServiceFuturePool(ExecutorService executor) {
this.executor = executor;
}
/**
* @param f function to run in future on executor pool
* @return future
* @throws java.util.concurrent.RejectedExecutionException if the task cannot be accepted for execution
* @throws NullPointerException if f param is null
*/
public Future<Void> executeFunction(final Supplier<Void> f) {
return executor.submit(f::get);
}
/**
* @param r runnable to run in future on executor pool
* @return future
* @throws java.util.concurrent.RejectedExecutionException if the task cannot be accepted for execution
* @throws NullPointerException if r param is null
*/
@SuppressWarnings("unchecked")
public Future<Void> executeRunnable(final Runnable r) {
return (Future<Void>) executor.submit(r::run);
}
/**
* Utility to shutdown the {@link ExecutorService} used by this class. Will wait up to a
* certain timeout for the ExecutorService to gracefully shutdown.
*
* @param logger Logger
* @param timeout the maximum time to wait
* @param unit the time unit of the timeout argument
*/
public void shutdown(Logger logger, long timeout, TimeUnit unit) {
HadoopExecutors.shutdown(executor, logger, timeout, unit);
}
public String toString() {
return String.format(Locale.ROOT, "ExecutorServiceFuturePool(executor=%s)", executor);
}
}
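
A minimal usage sketch (not part of the patch; the demo class name and pool size are illustrative): wrap a JDK executor, submit a unit of work, then shut the pool down.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import org.slf4j.LoggerFactory;
import org.apache.hadoop.fs.impl.prefetch.ExecutorServiceFuturePool;

public class FuturePoolDemo {
  public static void main(String[] args) throws Exception {
    ExecutorService executor = Executors.newFixedThreadPool(2);
    ExecutorServiceFuturePool pool = new ExecutorServiceFuturePool(executor);
    // Submit a unit of work; the returned future completes when it has run.
    Future<Void> done = pool.executeRunnable(() -> System.out.println("prefetch"));
    done.get();
    // Stop the underlying executor, waiting up to 10 seconds for a graceful shutdown.
    pool.shutdown(LoggerFactory.getLogger(FuturePoolDemo.class), 10, TimeUnit.SECONDS);
  }
}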


@ -0,0 +1,301 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.apache.hadoop.fs.impl.prefetch;
import java.nio.ByteBuffer;
import static org.apache.hadoop.fs.impl.prefetch.Validate.checkNotNegative;
import static org.apache.hadoop.fs.impl.prefetch.Validate.checkNotNull;
import static org.apache.hadoop.fs.impl.prefetch.Validate.checkPositiveInteger;
import static org.apache.hadoop.fs.impl.prefetch.Validate.checkState;
import static org.apache.hadoop.fs.impl.prefetch.Validate.checkWithinRange;
/**
* Provides functionality related to tracking the position within a file.
*
* The file is accessed through an in-memory buffer. The absolute position within
* the file is the sum of the start offset of the buffer within the file and the
* relative offset of the current access location within the buffer.
*
* A file is made up of equal sized blocks. The last block may be of a smaller size.
* The size of a buffer associated with this file is typically the same as block size.
*/
public final class FilePosition {
/**
* Holds block based information about a file.
*/
private BlockData blockData;
/**
* Information about the buffer in use.
*/
private BufferData data;
/**
* Provides access to the underlying file.
*/
private ByteBuffer buffer;
/**
* Start offset of the buffer relative to the start of a file.
*/
private long bufferStartOffset;
/**
* Offset where reading starts relative to the start of a file.
*/
private long readStartOffset;
// Read stats after a seek (mostly for debugging use).
private int numSingleByteReads;
private int numBytesRead;
private int numBufferReads;
/**
* Constructs an instance of {@link FilePosition}.
*
* @param fileSize size of the associated file.
* @param blockSize size of each block within the file.
*
* @throws IllegalArgumentException if fileSize is negative.
* @throws IllegalArgumentException if blockSize is zero or negative.
*/
public FilePosition(long fileSize, int blockSize) {
checkNotNegative(fileSize, "fileSize");
if (fileSize == 0) {
checkNotNegative(blockSize, "blockSize");
} else {
checkPositiveInteger(blockSize, "blockSize");
}
this.blockData = new BlockData(fileSize, blockSize);
// The position is valid only when a valid buffer is associated with this file.
this.invalidate();
}
/**
* Associates a buffer with this file.
*
* @param bufferData the buffer associated with this file.
* @param startOffset Start offset of the buffer relative to the start of a file.
* @param readOffset Offset where reading starts relative to the start of a file.
*
* @throws IllegalArgumentException if bufferData is null.
* @throws IllegalArgumentException if startOffset is negative.
* @throws IllegalArgumentException if readOffset is negative.
* @throws IllegalArgumentException if readOffset is outside the range [startOffset, buffer end].
*/
public void setData(BufferData bufferData,
long startOffset,
long readOffset) {
checkNotNull(bufferData, "bufferData");
checkNotNegative(startOffset, "startOffset");
checkNotNegative(readOffset, "readOffset");
checkWithinRange(
readOffset,
"readOffset",
startOffset,
startOffset + bufferData.getBuffer().limit());
data = bufferData;
buffer = bufferData.getBuffer().duplicate();
bufferStartOffset = startOffset;
readStartOffset = readOffset;
setAbsolute(readOffset);
resetReadStats();
}
public ByteBuffer buffer() {
throwIfInvalidBuffer();
return buffer;
}
public BufferData data() {
throwIfInvalidBuffer();
return data;
}
/**
* Gets the current absolute position within this file.
*
* @return the current absolute position within this file.
*/
public long absolute() {
throwIfInvalidBuffer();
return bufferStartOffset + relative();
}
/**
* If the given {@code pos} lies within the current buffer, updates the current position to
* the specified value and returns true; otherwise returns false without changing the position.
*
* @param pos the absolute position to change the current position to if possible.
* @return true if the current position was updated, false otherwise.
*/
public boolean setAbsolute(long pos) {
if (isValid() && isWithinCurrentBuffer(pos)) {
int relativePos = (int) (pos - bufferStartOffset);
buffer.position(relativePos);
return true;
} else {
return false;
}
}
/**
* Gets the current position within this file relative to the start of the associated buffer.
*
* @return the current position within this file relative to the start of the associated buffer.
*/
public int relative() {
throwIfInvalidBuffer();
return buffer.position();
}
/**
* Determines whether the given absolute position lies within the current buffer.
*
* @param pos the position to check.
* @return true if the given absolute position lies within the current buffer, false otherwise.
*/
public boolean isWithinCurrentBuffer(long pos) {
throwIfInvalidBuffer();
long bufferEndOffset = bufferStartOffset + buffer.limit();
return (pos >= bufferStartOffset) && (pos <= bufferEndOffset);
}
/**
* Gets the id of the current block.
*
* @return the id of the current block.
*/
public int blockNumber() {
throwIfInvalidBuffer();
return blockData.getBlockNumber(bufferStartOffset);
}
/**
* Determines whether the current block is the last block in this file.
*
* @return true if the current block is the last block in this file, false otherwise.
*/
public boolean isLastBlock() {
return blockData.isLastBlock(blockNumber());
}
/**
* Determines if the current position is valid.
*
* @return true if the current position is valid, false otherwise.
*/
public boolean isValid() {
return buffer != null;
}
/**
* Marks the current position as invalid.
*/
public void invalidate() {
buffer = null;
bufferStartOffset = -1;
data = null;
}
/**
* Gets the absolute offset at which the current buffer starts.
*
* @return the absolute offset at which the current buffer starts.
*/
public long bufferStartOffset() {
throwIfInvalidBuffer();
return bufferStartOffset;
}
/**
* Determines whether the current buffer has been fully read.
*
* @return true if the current buffer has been fully read, false otherwise.
*/
public boolean bufferFullyRead() {
throwIfInvalidBuffer();
return (bufferStartOffset == readStartOffset)
&& (relative() == buffer.limit())
&& (numBytesRead == buffer.limit());
}
public void incrementBytesRead(int n) {
numBytesRead += n;
if (n == 1) {
numSingleByteReads++;
} else {
numBufferReads++;
}
}
public int numBytesRead() {
return numBytesRead;
}
public int numSingleByteReads() {
return numSingleByteReads;
}
public int numBufferReads() {
return numBufferReads;
}
private void resetReadStats() {
numBytesRead = 0;
numSingleByteReads = 0;
numBufferReads = 0;
}
public String toString() {
StringBuilder sb = new StringBuilder();
if (buffer == null) {
sb.append("currentBuffer = null");
} else {
int pos = buffer.position();
int val;
if (pos >= buffer.limit()) {
val = -1;
} else {
val = buffer.get(pos);
}
String currentBufferState =
String.format("%d at pos: %d, lim: %d", val, pos, buffer.limit());
sb.append(String.format(
"block: %d, pos: %d (CBuf: %s)%n",
blockNumber(), absolute(),
currentBufferState));
}
return sb.toString();
}
private void throwIfInvalidBuffer() {
checkState(buffer != null, "'buffer' must not be null");
}
}
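
The absolute/relative bookkeeping above is plain arithmetic; a standalone sketch of the invariant (illustrative only, not this class's API):

import java.nio.ByteBuffer;

public class PositionArithmeticDemo {
  public static void main(String[] args) {
    long bufferStartOffset = 8L * 1024 * 1024;  // the buffer holds the block starting here
    ByteBuffer buffer = ByteBuffer.allocate(1024);
    buffer.position(100);                       // relative offset within the buffer
    // absolute position in the file = buffer start offset + position within the buffer
    long absolute = bufferStartOffset + buffer.position();
    System.out.println(absolute);               // prints 8388708
  }
}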


@ -0,0 +1,67 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.apache.hadoop.fs.impl.prefetch;
import java.time.Duration;
import org.apache.hadoop.fs.statistics.DurationTracker;
import org.apache.hadoop.fs.statistics.IOStatisticsSource;
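/**
 * Statistics updated by the prefetcher as blocks are fetched, cached,
 * and freed; also an IOStatisticsSource.
 */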
public interface PrefetchingStatistics extends IOStatisticsSource {
/**
* A prefetch operation has started.
* @return duration tracker
*/
DurationTracker prefetchOperationStarted();
/**
* A block has been saved to the file cache.
*/
void blockAddedToFileCache();
/**
* A block has been removed from the file cache.
*/
void blockRemovedFromFileCache();
/**
* A prefetch operation has completed.
*/
void prefetchOperationCompleted();
/**
* An executor has been acquired, either for prefetching or caching.
* @param timeInQueue time taken to acquire an executor.
*/
void executorAcquired(Duration timeInQueue);
/**
* A new buffer has been added to the buffer pool.
* @param size size of the new buffer
*/
void memoryAllocated(int size);
/**
* Previously allocated memory has been freed.
* @param size size of memory freed.
*/
void memoryFreed(int size);
}


@ -0,0 +1,71 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.apache.hadoop.fs.impl.prefetch;
import java.io.Closeable;
/**
* Manages a fixed pool of resources.
*
* Avoids creating a new resource if a previously created instance is already available.
*/
public abstract class ResourcePool<T> implements Closeable {
/**
* Acquires a resource, blocking if necessary until one becomes available.
*
* @return the acquired resource instance.
*/
public abstract T acquire();
/**
* Acquires a resource if one is immediately available; otherwise returns null
* without blocking.
* @return the acquired resource instance (if immediately available) or null.
*/
public abstract T tryAcquire();
/**
* Releases a previously acquired resource.
*
* @param item the resource to release.
*/
public abstract void release(T item);
@Override
public void close() {
}
/**
* Derived classes may implement a way to cleanup each item.
*
* @param item the resource to close.
*/
protected void close(T item) {
// Do nothing in this class. Allow overriding classes to take any cleanup action.
}
/**
* Derived classes must implement a way to create an instance of a resource.
*
* @return the created instance.
*/
protected abstract T createNew();
}
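
A minimal subclass sketch (illustrative; not the pool the prefetcher ships with), backed by a blocking queue of pre-allocated buffers:

import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import org.apache.hadoop.fs.impl.prefetch.ResourcePool;

public class ByteBufferPool extends ResourcePool<ByteBuffer> {
  private final BlockingQueue<ByteBuffer> queue;
  private final int bufferSize;

  public ByteBufferPool(int count, int bufferSize) {
    this.bufferSize = bufferSize;
    this.queue = new ArrayBlockingQueue<>(count);
    for (int i = 0; i < count; i++) {
      queue.add(createNew());
    }
  }

  @Override
  public ByteBuffer acquire() {
    try {
      return queue.take();           // block until a buffer is available
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while acquiring a buffer", e);
    }
  }

  @Override
  public ByteBuffer tryAcquire() {
    return queue.poll();             // null if none is immediately available
  }

  @Override
  public void release(ByteBuffer item) {
    item.clear();                    // make the buffer reusable before returning it
    queue.offer(item);
  }

  @Override
  protected ByteBuffer createNew() {
    return ByteBuffer.allocate(bufferSize);
  }
}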


@ -0,0 +1,93 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.apache.hadoop.fs.impl.prefetch;
import static org.apache.hadoop.fs.impl.prefetch.Validate.checkGreater;
import static org.apache.hadoop.fs.impl.prefetch.Validate.checkPositiveInteger;
/**
* Provides retry related functionality.
*/
public class Retryer {
/* Maximum amount of delay (in ms) before retry fails. */
private int maxDelay;
/* Per retry delay (in ms). */
private int perRetryDelay;
/**
* The time interval (in ms) at which status update would be made.
*/
private int statusUpdateInterval;
/* Current delay. */
private int delay;
/**
* Initializes a new instance of the {@code Retryer} class.
*
* @param perRetryDelay per retry delay (in ms).
* @param maxDelay maximum amount of delay (in ms) before retry fails.
* @param statusUpdateInterval time interval (in ms) at which status update would be made.
*
* @throws IllegalArgumentException if perRetryDelay is zero or negative.
* @throws IllegalArgumentException if maxDelay is less than or equal to perRetryDelay.
* @throws IllegalArgumentException if statusUpdateInterval is zero or negative.
*/
public Retryer(int perRetryDelay, int maxDelay, int statusUpdateInterval) {
checkPositiveInteger(perRetryDelay, "perRetryDelay");
checkGreater(maxDelay, "maxDelay", perRetryDelay, "perRetryDelay");
checkPositiveInteger(statusUpdateInterval, "statusUpdateInterval");
this.perRetryDelay = perRetryDelay;
this.maxDelay = maxDelay;
this.statusUpdateInterval = statusUpdateInterval;
}
/**
* Returns true if retrying should continue, false otherwise.
*
* @return true if the caller should retry, false otherwise.
*/
public boolean continueRetry() {
if (this.delay >= this.maxDelay) {
return false;
}
try {
Thread.sleep(this.perRetryDelay);
} catch (InterruptedException e) {
// Ignore the interrupt, as required by the semantics of this class.
}
this.delay += this.perRetryDelay;
return true;
}
/**
* Returns true if status update interval has been reached.
*
* @return true if status update interval has been reached.
*/
public boolean updateStatus() {
return (this.delay > 0) && this.delay % this.statusUpdateInterval == 0;
}
}
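
Callers drive the polling loop themselves; a hedged sketch of the intended pattern (the readiness check is hypothetical):

import org.apache.hadoop.fs.impl.prefetch.Retryer;

public class RetryerDemo {
  public static void main(String[] args) {
    // 100 ms between attempts, give up after 1000 ms, report every 500 ms.
    Retryer retryer = new Retryer(100, 1000, 500);
    do {
      if (workDone()) {              // hypothetical condition being polled
        return;
      }
      if (retryer.updateStatus()) {
        System.out.println("still waiting...");
      }
    } while (retryer.continueRetry());
    System.out.println("gave up after reaching the maximum delay");
  }

  private static boolean workDone() {
    return false;                    // stand-in for the real readiness check
  }
}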


@ -0,0 +1,489 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.apache.hadoop.fs.impl.prefetch;
import java.io.File;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.OpenOption;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.nio.file.attribute.PosixFilePermission;
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import org.apache.hadoop.thirdparty.com.google.common.collect.ImmutableSet;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.LocalDirAllocator;
import static java.util.Objects.requireNonNull;
import static org.apache.hadoop.fs.impl.prefetch.Validate.checkNotNull;
/**
* Provides functionality necessary for caching blocks of data read from FileSystem.
* Each cache block is stored on the local disk as a separate file.
*/
public class SingleFilePerBlockCache implements BlockCache {
private static final Logger LOG = LoggerFactory.getLogger(SingleFilePerBlockCache.class);
/**
* Blocks stored in this cache.
*/
private final Map<Integer, Entry> blocks = new ConcurrentHashMap<>();
/**
* Number of times a block was read from this cache.
* Used for determining cache utilization factor.
*/
private int numGets = 0;
private boolean closed;
private final PrefetchingStatistics prefetchingStatistics;
/**
* Timeout to be used by close, while acquiring prefetch block write lock.
*/
private static final int PREFETCH_WRITE_LOCK_TIMEOUT = 5;
/**
* Lock timeout unit to be used by the thread while acquiring prefetch block write lock.
*/
private static final TimeUnit PREFETCH_WRITE_LOCK_TIMEOUT_UNIT = TimeUnit.SECONDS;
/**
* POSIX permissions applied to each temporary cache file created by this cache.
*/
private static final Set<PosixFilePermission> TEMP_FILE_ATTRS =
ImmutableSet.of(PosixFilePermission.OWNER_READ, PosixFilePermission.OWNER_WRITE);
/**
* Cache entry.
* Each block is stored as a separate file.
*/
private static final class Entry {
private final int blockNumber;
private final Path path;
private final int size;
private final long checksum;
private final ReentrantReadWriteLock lock;
private enum LockType {
READ,
WRITE
}
Entry(int blockNumber, Path path, int size, long checksum) {
this.blockNumber = blockNumber;
this.path = path;
this.size = size;
this.checksum = checksum;
this.lock = new ReentrantReadWriteLock();
}
@Override
public String toString() {
return String.format(
"([%03d] %s: size = %d, checksum = %d)",
blockNumber, path, size, checksum);
}
/**
* Take the read or write lock.
*
* @param lockType type of the lock.
*/
private void takeLock(LockType lockType) {
if (LockType.READ == lockType) {
lock.readLock().lock();
} else if (LockType.WRITE == lockType) {
lock.writeLock().lock();
}
}
/**
* Release the read or write lock.
*
* @param lockType type of the lock.
*/
private void releaseLock(LockType lockType) {
if (LockType.READ == lockType) {
lock.readLock().unlock();
} else if (LockType.WRITE == lockType) {
lock.writeLock().unlock();
}
}
/**
* Try to take the read or write lock within the given timeout.
*
* @param lockType type of the lock.
* @param timeout the time to wait for the given lock.
* @param unit the time unit of the timeout argument.
* @return true if the lock of the given lock type was acquired.
*/
private boolean takeLock(LockType lockType, long timeout, TimeUnit unit) {
try {
if (LockType.READ == lockType) {
return lock.readLock().tryLock(timeout, unit);
} else if (LockType.WRITE == lockType) {
return lock.writeLock().tryLock(timeout, unit);
}
} catch (InterruptedException e) {
LOG.warn("Thread interrupted while trying to acquire {} lock", lockType, e);
Thread.currentThread().interrupt();
}
return false;
}
}
/**
* Constructs an instance of a {@code SingleFilePerBlockCache}.
*
* @param prefetchingStatistics statistics for this stream.
*/
public SingleFilePerBlockCache(PrefetchingStatistics prefetchingStatistics) {
this.prefetchingStatistics = requireNonNull(prefetchingStatistics);
}
/**
* Indicates whether the given block is in this cache.
*/
@Override
public boolean containsBlock(int blockNumber) {
return blocks.containsKey(blockNumber);
}
/**
* Gets the blocks in this cache.
*/
@Override
public Iterable<Integer> blocks() {
return Collections.unmodifiableList(new ArrayList<>(blocks.keySet()));
}
/**
* Gets the number of blocks in this cache.
*/
@Override
public int size() {
return blocks.size();
}
/**
* Gets the block having the given {@code blockNumber}.
*
* @throws IllegalArgumentException if buffer is null.
*/
@Override
public void get(int blockNumber, ByteBuffer buffer) throws IOException {
if (closed) {
return;
}
checkNotNull(buffer, "buffer");
Entry entry = getEntry(blockNumber);
entry.takeLock(Entry.LockType.READ);
try {
buffer.clear();
readFile(entry.path, buffer);
buffer.rewind();
validateEntry(entry, buffer);
} finally {
entry.releaseLock(Entry.LockType.READ);
}
}
protected int readFile(Path path, ByteBuffer buffer) throws IOException {
int numBytesRead = 0;
int numBytes;
// try-with-resources ensures the channel is closed even if a read fails.
try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
while ((numBytes = channel.read(buffer)) > 0) {
numBytesRead += numBytes;
}
}
buffer.limit(buffer.position());
return numBytesRead;
}
private Entry getEntry(int blockNumber) {
Validate.checkNotNegative(blockNumber, "blockNumber");
Entry entry = blocks.get(blockNumber);
if (entry == null) {
throw new IllegalStateException(String.format("block %d not found in cache", blockNumber));
}
numGets++;
return entry;
}
/**
* Puts the given block in this cache.
*
* @param blockNumber the block number, used as a key for blocks map.
* @param buffer buffer contents of the given block to be added to this cache.
* @param conf the configuration.
* @param localDirAllocator the local dir allocator instance.
* @throws IOException if either local dir allocator fails to allocate file or if IO error
* occurs while writing the buffer content to the file.
* @throws IllegalArgumentException if buffer is null, or if buffer.limit() is zero or negative.
*/
@Override
public void put(int blockNumber, ByteBuffer buffer, Configuration conf,
LocalDirAllocator localDirAllocator) throws IOException {
if (closed) {
return;
}
checkNotNull(buffer, "buffer");
if (blocks.containsKey(blockNumber)) {
Entry entry = blocks.get(blockNumber);
entry.takeLock(Entry.LockType.READ);
try {
validateEntry(entry, buffer);
} finally {
entry.releaseLock(Entry.LockType.READ);
}
return;
}
Validate.checkPositiveInteger(buffer.limit(), "buffer.limit()");
Path blockFilePath = getCacheFilePath(conf, localDirAllocator);
long size = Files.size(blockFilePath);
if (size != 0) {
String message =
String.format("[%d] temp file already has data. %s (%d)",
blockNumber, blockFilePath, size);
throw new IllegalStateException(message);
}
writeFile(blockFilePath, buffer);
long checksum = BufferData.getChecksum(buffer);
Entry entry = new Entry(blockNumber, blockFilePath, buffer.limit(), checksum);
blocks.put(blockNumber, entry);
// Update stream_read_blocks_in_cache stats only after blocks map is updated with new file
// entry to avoid any discrepancy related to the value of stream_read_blocks_in_cache.
// If stream_read_blocks_in_cache were updated before the blocks map here, closing the
// input stream could remove the cache file before the blocks map holds the new entry,
// leading to an incorrect value of stream_read_blocks_in_cache.
prefetchingStatistics.blockAddedToFileCache();
}
private static final Set<? extends OpenOption> CREATE_OPTIONS =
EnumSet.of(StandardOpenOption.WRITE,
StandardOpenOption.CREATE,
StandardOpenOption.TRUNCATE_EXISTING);
protected void writeFile(Path path, ByteBuffer buffer) throws IOException {
buffer.rewind();
// try-with-resources ensures the channel is closed even if a write fails.
try (WritableByteChannel writeChannel = Files.newByteChannel(path, CREATE_OPTIONS)) {
while (buffer.hasRemaining()) {
writeChannel.write(buffer);
}
}
}
/**
* Return temporary file created based on the file path retrieved from local dir allocator.
*
* @param conf The configuration object.
* @param localDirAllocator Local dir allocator instance.
* @return Path of the temporary file created.
* @throws IOException if IO error occurs while local dir allocator tries to retrieve path
* from local FS or file creation fails or permission set fails.
*/
protected Path getCacheFilePath(final Configuration conf,
final LocalDirAllocator localDirAllocator)
throws IOException {
return getTempFilePath(conf, localDirAllocator);
}
@Override
public void close() throws IOException {
if (closed) {
return;
}
closed = true;
LOG.info(getStats());
int numFilesDeleted = 0;
for (Entry entry : blocks.values()) {
boolean lockAcquired = entry.takeLock(Entry.LockType.WRITE, PREFETCH_WRITE_LOCK_TIMEOUT,
PREFETCH_WRITE_LOCK_TIMEOUT_UNIT);
if (!lockAcquired) {
LOG.error("Cache file {} deletion would not be attempted as write lock could not"
+ " be acquired within {} {}", entry.path, PREFETCH_WRITE_LOCK_TIMEOUT,
PREFETCH_WRITE_LOCK_TIMEOUT_UNIT);
continue;
}
try {
Files.deleteIfExists(entry.path);
prefetchingStatistics.blockRemovedFromFileCache();
numFilesDeleted++;
} catch (IOException e) {
LOG.debug("Failed to delete cache file {}", entry.path, e);
} finally {
entry.releaseLock(Entry.LockType.WRITE);
}
}
if (numFilesDeleted > 0) {
LOG.info("Deleted {} cache files", numFilesDeleted);
}
}
@Override
public String toString() {
StringBuilder sb = new StringBuilder();
sb.append("stats: ");
sb.append(getStats());
sb.append(", blocks:[");
sb.append(getIntList(blocks()));
sb.append("]");
return sb.toString();
}
private void validateEntry(Entry entry, ByteBuffer buffer) {
if (entry.size != buffer.limit()) {
String message = String.format(
"[%d] entry.size(%d) != buffer.limit(%d)",
entry.blockNumber, entry.size, buffer.limit());
throw new IllegalStateException(message);
}
long checksum = BufferData.getChecksum(buffer);
if (entry.checksum != checksum) {
String message = String.format(
"[%d] entry.checksum(%d) != buffer checksum(%d)",
entry.blockNumber, entry.checksum, checksum);
throw new IllegalStateException(message);
}
}
/**
* Produces a human readable list of blocks for the purpose of logging.
* This method minimizes the length of returned list by converting
* a contiguous list of blocks into a range.
* For example,
* 1, 3, 4, 5, 6, 8 becomes 1, 3~6, 8.
*/
private String getIntList(Iterable<Integer> nums) {
List<String> numList = new ArrayList<>();
List<Integer> numbers = new ArrayList<Integer>();
for (Integer n : nums) {
numbers.add(n);
}
Collections.sort(numbers);
int index = 0;
while (index < numbers.size()) {
int start = numbers.get(index);
int prev = start;
int end = start;
while ((++index < numbers.size()) && ((end = numbers.get(index)) == prev + 1)) {
prev = end;
}
if (start == prev) {
numList.add(Integer.toString(start));
} else {
numList.add(String.format("%d~%d", start, prev));
}
}
return String.join(", ", numList);
}
private String getStats() {
StringBuilder sb = new StringBuilder();
sb.append(String.format(
"#entries = %d, #gets = %d",
blocks.size(), numGets));
return sb.toString();
}
private static final String CACHE_FILE_PREFIX = "fs-cache-";
/**
* Determine if the cache space is available on the local FS.
*
* @param fileSize The size of the file.
* @param conf The configuration.
* @param localDirAllocator Local dir allocator instance.
* @return True if the given file size is less than the available free space on local FS,
* False otherwise.
*/
public static boolean isCacheSpaceAvailable(long fileSize, Configuration conf,
LocalDirAllocator localDirAllocator) {
try {
Path cacheFilePath = getTempFilePath(conf, localDirAllocator);
long freeSpace = new File(cacheFilePath.toString()).getUsableSpace();
LOG.info("fileSize = {}, freeSpace = {}", fileSize, freeSpace);
Files.deleteIfExists(cacheFilePath);
return fileSize < freeSpace;
} catch (IOException e) {
LOG.error("isCacheSpaceAvailable", e);
return false;
}
}
// The suffix (file extension) of each serialized index file.
private static final String BINARY_FILE_SUFFIX = ".bin";
/**
* Create a temporary file at a path retrieved from the local dir allocator
* instance. The file is created with a .bin suffix and granted the POSIX
* permissions in TEMP_FILE_ATTRS.
*
* @param conf the configuration.
* @param localDirAllocator the local dir allocator instance.
* @return path of the file created.
* @throws IOException if IO error occurs while local dir allocator tries to retrieve path
* from local FS or file creation fails or permission set fails.
*/
private static Path getTempFilePath(final Configuration conf,
final LocalDirAllocator localDirAllocator) throws IOException {
org.apache.hadoop.fs.Path path =
localDirAllocator.getLocalPathForWrite(CACHE_FILE_PREFIX, conf);
File dir = new File(path.getParent().toUri().getPath());
String prefix = path.getName();
File tmpFile = File.createTempFile(prefix, BINARY_FILE_SUFFIX, dir);
Path tmpFilePath = Paths.get(tmpFile.toURI());
return Files.setPosixFilePermissions(tmpFilePath, TEMP_FILE_ATTRS);
}
}
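
A small usage sketch (not from the patch; "hadoop.tmp.dir" is used here as the allocator's config key, whereas the S3A stream passes its own buffer-directory key):

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.LocalDirAllocator;
import org.apache.hadoop.fs.impl.prefetch.EmptyPrefetchingStatistics;
import org.apache.hadoop.fs.impl.prefetch.SingleFilePerBlockCache;

public class BlockCacheDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    LocalDirAllocator allocator = new LocalDirAllocator("hadoop.tmp.dir");
    SingleFilePerBlockCache cache =
        new SingleFilePerBlockCache(EmptyPrefetchingStatistics.getInstance());
    ByteBuffer block = ByteBuffer.wrap("hello".getBytes(StandardCharsets.UTF_8));
    cache.put(0, block, conf, allocator);   // writes block 0 to a local cache file
    ByteBuffer readBack = ByteBuffer.allocate(5);
    cache.get(0, readBack);                 // reads it back; size and checksum are validated
    System.out.println(new String(readBack.array(), StandardCharsets.UTF_8));
    cache.close();                          // deletes the cache files
  }
}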


@ -0,0 +1,399 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.apache.hadoop.fs.impl.prefetch;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collection;
import static org.apache.hadoop.util.Preconditions.checkArgument;
/**
* A superset of the Validate class in Apache Commons Lang3.
* <p>
* It provides consistent message strings for frequently encountered checks,
* which simplifies callers: they supply only the name of the argument that
* failed a check rather than the entire message.
*/
public final class Validate {
private Validate() {
}
/**
* Validates that the given reference argument is not null.
* @param obj the argument reference to validate.
* @param argName the name of the argument being validated.
*/
public static void checkNotNull(Object obj, String argName) {
checkArgument(obj != null, "'%s' must not be null.", argName);
}
/**
* Validates that the given integer argument is not zero or negative.
* @param value the argument value to validate
* @param argName the name of the argument being validated.
*/
public static void checkPositiveInteger(long value, String argName) {
checkArgument(value > 0, "'%s' must be a positive integer.", argName);
}
/**
* Validates that the given integer argument is not negative.
* @param value the argument value to validate
* @param argName the name of the argument being validated.
*/
public static void checkNotNegative(long value, String argName) {
checkArgument(value >= 0, "'%s' must not be negative.", argName);
}
/**
* Validates that the expression (that checks a required field is present) is true.
* @param isPresent indicates whether the given argument is present.
* @param argName the name of the argument being validated.
*/
public static void checkRequired(boolean isPresent, String argName) {
checkArgument(isPresent, "'%s' is required.", argName);
}
/**
* Validates that the expression (that checks a field is valid) is true.
* @param isValid indicates whether the given argument is valid.
* @param argName the name of the argument being validated.
*/
public static void checkValid(boolean isValid, String argName) {
checkArgument(isValid, "'%s' is invalid.", argName);
}
/**
* Validates that the expression (that checks a field is valid) is true.
* @param isValid indicates whether the given argument is valid.
* @param argName the name of the argument being validated.
* @param validValues the list of values that are allowed.
*/
public static void checkValid(boolean isValid,
String argName,
String validValues) {
checkArgument(isValid, "'%s' is invalid. Valid values are: %s.", argName,
validValues);
}
/**
* Validates that the given string is not null and has non-zero length.
* @param arg the argument reference to validate.
* @param argName the name of the argument being validated.
*/
public static void checkNotNullAndNotEmpty(String arg, String argName) {
checkNotNull(arg, argName);
checkArgument(
!arg.isEmpty(),
"'%s' must not be empty.",
argName);
}
/**
* Validates that the given array is not null and has at least one element.
* @param <T> the type of array's elements.
* @param array the argument reference to validate.
* @param argName the name of the argument being validated.
*/
public static <T> void checkNotNullAndNotEmpty(T[] array, String argName) {
checkNotNull(array, argName);
checkNotEmpty(array.length, argName);
}
/**
* Validates that the given array is not null and has at least one element.
* @param array the argument reference to validate.
* @param argName the name of the argument being validated.
*/
public static void checkNotNullAndNotEmpty(byte[] array, String argName) {
checkNotNull(array, argName);
checkNotEmpty(array.length, argName);
}
/**
* Validates that the given array is not null and has at least one element.
* @param array the argument reference to validate.
* @param argName the name of the argument being validated.
*/
public static void checkNotNullAndNotEmpty(short[] array, String argName) {
checkNotNull(array, argName);
checkNotEmpty(array.length, argName);
}
/**
* Validates that the given array is not null and has at least one element.
* @param array the argument reference to validate.
* @param argName the name of the argument being validated.
*/
public static void checkNotNullAndNotEmpty(int[] array, String argName) {
checkNotNull(array, argName);
checkNotEmpty(array.length, argName);
}
/**
* Validates that the given array is not null and has at least one element.
* @param array the argument reference to validate.
* @param argName the name of the argument being validated.
*/
public static void checkNotNullAndNotEmpty(long[] array, String argName) {
checkNotNull(array, argName);
checkNotEmpty(array.length, argName);
}
/**
* Validates that the given iterable is not null and has at least one element.
* @param <T> the type of iterable's elements.
* @param iter the argument reference to validate.
* @param argName the name of the argument being validated.
*/
public static <T> void checkNotNullAndNotEmpty(Iterable<T> iter,
String argName) {
checkNotNull(iter, argName);
int minNumElements = iter.iterator().hasNext() ? 1 : 0;
checkNotEmpty(minNumElements, argName);
}
/**
* Validates that the given collection is not null and contains exactly the given number of elements.
* @param <T> the type of collection's elements.
* @param collection the argument reference to validate.
* @param numElements the expected number of elements in the collection.
* @param argName the name of the argument being validated.
*/
public static <T> void checkNotNullAndNumberOfElements(
Collection<T> collection, int numElements, String argName) {
checkNotNull(collection, argName);
checkArgument(
collection.size() == numElements,
"Number of elements in '%s' must be exactly %s, %s given.",
argName,
numElements,
collection.size()
);
}
/**
* Validates that the given two values are equal.
* @param value1 the first value to check.
* @param value1Name the name of the first argument.
* @param value2 the second value to check.
* @param value2Name the name of the second argument.
*/
public static void checkValuesEqual(
long value1,
String value1Name,
long value2,
String value2Name) {
checkArgument(
value1 == value2,
"'%s' (%s) must equal '%s' (%s).",
value1Name,
value1,
value2Name,
value2);
}
/**
* Validates that the first value is an integer multiple of the second value.
* @param value1 the first value to check.
* @param value1Name the name of the first argument.
* @param value2 the second value to check.
* @param value2Name the name of the second argument.
*/
public static void checkIntegerMultiple(
long value1,
String value1Name,
long value2,
String value2Name) {
checkArgument(
(value1 % value2) == 0,
"'%s' (%s) must be an integer multiple of '%s' (%s).",
value1Name,
value1,
value2Name,
value2);
}
/**
* Validates that the first value is greater than the second value.
* @param value1 the first value to check.
* @param value1Name the name of the first argument.
* @param value2 the second value to check.
* @param value2Name the name of the second argument.
*/
public static void checkGreater(
long value1,
String value1Name,
long value2,
String value2Name) {
checkArgument(
value1 > value2,
"'%s' (%s) must be greater than '%s' (%s).",
value1Name,
value1,
value2Name,
value2);
}
/**
* Validates that the first value is greater than or equal to the second value.
* @param value1 the first value to check.
* @param value1Name the name of the first argument.
* @param value2 the second value to check.
* @param value2Name the name of the second argument.
*/
public static void checkGreaterOrEqual(
long value1,
String value1Name,
long value2,
String value2Name) {
checkArgument(
value1 >= value2,
"'%s' (%s) must be greater than or equal to '%s' (%s).",
value1Name,
value1,
value2Name,
value2);
}
/**
* Validates that the first value is less than or equal to the second value.
* @param value1 the first value to check.
* @param value1Name the name of the first argument.
* @param value2 the second value to check.
* @param value2Name the name of the second argument.
*/
public static void checkLessOrEqual(
long value1,
String value1Name,
long value2,
String value2Name) {
checkArgument(
value1 <= value2,
"'%s' (%s) must be less than or equal to '%s' (%s).",
value1Name,
value1,
value2Name,
value2);
}
/**
* Validates that the given value is within the given range of values.
* @param value the value to check.
* @param valueName the name of the argument.
* @param minValueInclusive inclusive lower limit for the value.
* @param maxValueInclusive inclusive upper limit for the value.
*/
public static void checkWithinRange(
long value,
String valueName,
long minValueInclusive,
long maxValueInclusive) {
checkArgument(
(value >= minValueInclusive) && (value <= maxValueInclusive),
"'%s' (%s) must be within the range [%s, %s].",
valueName,
value,
minValueInclusive,
maxValueInclusive);
}
/**
* Validates that the given value is within the given range of values.
* @param value the value to check.
* @param valueName the name of the argument.
* @param minValueInclusive inclusive lower limit for the value.
* @param maxValueInclusive inclusive upper limit for the value.
*/
public static void checkWithinRange(
double value,
String valueName,
double minValueInclusive,
double maxValueInclusive) {
checkArgument(
(value >= minValueInclusive) && (value <= maxValueInclusive),
"'%s' (%s) must be within the range [%s, %s].",
valueName,
value,
minValueInclusive,
maxValueInclusive);
}
/**
* Validates that the given path exists.
* @param path the path to check.
* @param argName the name of the argument being validated.
*/
public static void checkPathExists(Path path, String argName) {
checkNotNull(path, argName);
checkArgument(Files.exists(path), "Path %s (%s) does not exist.", argName,
path);
}
/**
* Validates that the given path exists and is a directory.
* @param path the path to check.
* @param argName the name of the argument being validated.
*/
public static void checkPathExistsAsDir(Path path, String argName) {
checkPathExists(path, argName);
checkArgument(
Files.isDirectory(path),
"Path %s (%s) must point to a directory.",
argName,
path);
}
/**
* Validates that the given path exists and is a file.
* @param path the path to check.
* @param argName the name of the argument being validated.
*/
public static void checkPathExistsAsFile(Path path, String argName) {
checkPathExists(path, argName);
checkArgument(Files.isRegularFile(path),
"Path %s (%s) must point to a file.", argName, path);
}
/**
* Check state.
* @param expression expression which must hold.
* @param format format string
* @param args arguments for the error string
* @throws IllegalStateException if the state is not valid.
*/
public static void checkState(boolean expression,
String format,
Object... args) {
if (!expression) {
throw new IllegalStateException(String.format(format, args));
}
}
private static void checkNotEmpty(int arraySize, String argName) {
checkArgument(
arraySize > 0,
"'%s' must have at least one element.",
argName);
}
}
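
A brief usage sketch (argument names and limits are illustrative):

import org.apache.hadoop.fs.impl.prefetch.Validate;

public class ValidateDemo {
  static void configure(int blockSize, int bufferCount) {
    // Each failed check raises IllegalArgumentException with a consistent
    // message, e.g. "'blockSize' must be a positive integer."
    Validate.checkPositiveInteger(blockSize, "blockSize");
    Validate.checkWithinRange(bufferCount, "bufferCount", 1, 64);
    Validate.checkIntegerMultiple(blockSize, "blockSize", 1024, "1KiB");
  }

  public static void main(String[] args) {
    configure(8192, 16);   // passes silently
    configure(-1, 16);     // throws IllegalArgumentException
  }
}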


@ -15,17 +15,14 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.fs.swift;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;
/**
* A path filter that accepts everything.
*/
public class AcceptAllFilter implements PathFilter {
@Override
public boolean accept(Path file) {
return true;
}
}
/**
* Block caching for use in object store clients.
*/
@InterfaceAudience.Private
@InterfaceStability.Unstable
package org.apache.hadoop.fs.impl.prefetch;
import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;


@ -15,6 +15,11 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
/**
* Filesystem implementations that allow Hadoop to read directly from
* the local file system.
*/
@InterfaceAudience.LimitedPrivate({"HDFS", "MapReduce"})
@InterfaceStability.Unstable
package org.apache.hadoop.fs.local;


@ -333,15 +333,24 @@ class CopyCommands {
*/
public static class AppendToFile extends CommandWithDestination {
public static final String NAME = "appendToFile";
public static final String USAGE = "<localsrc> ... <dst>";
public static final String USAGE = "[-n] <localsrc> ... <dst>";
public static final String DESCRIPTION =
"Appends the contents of all the given local files to the " +
"given dst file. The dst file will be created if it does " +
"not exist. If <localSrc> is -, then the input is read " +
"from stdin.";
"from stdin. The -n option uses the NEW_BLOCK create flag while appending the file.";
private static final int DEFAULT_IO_LENGTH = 1024 * 1024;
boolean readStdin = false;
private boolean appendToNewBlock = false;
public boolean isAppendToNewBlock() {
return appendToNewBlock;
}
public void setAppendToNewBlock(boolean appendToNewBlock) {
this.appendToNewBlock = appendToNewBlock;
}
// commands operating on local paths have no need for glob expansion
@Override
@ -372,6 +381,9 @@ class CopyCommands {
throw new IOException("missing destination argument");
}
CommandFormat cf = new CommandFormat(2, Integer.MAX_VALUE, "n");
cf.parse(args);
appendToNewBlock = cf.getOpt("n");
getRemoteDestination(args);
super.processOptions(args);
}
@ -385,7 +397,8 @@ class CopyCommands {
}
InputStream is = null;
try (FSDataOutputStream fos = dst.fs.append(dst.path)) {
try (FSDataOutputStream fos = appendToNewBlock ?
dst.fs.append(dst.path, true) : dst.fs.append(dst.path)) {
if (readStdin) {
if (args.size() == 0) {
IOUtils.copyBytes(System.in, fos, DEFAULT_IO_LENGTH);

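A usage note (paths illustrative): after this change, hadoop fs -appendToFile -n local.txt /data/dst appends local.txt to /data/dst using the NEW_BLOCK create flag, so the appended data starts in a new block; omitting -n keeps the existing behavior of appending to the file's last block.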

@ -47,7 +47,6 @@ import org.apache.hadoop.io.DataInputBuffer;
import org.apache.hadoop.io.DataOutputBuffer;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;
import org.apache.hadoop.util.ReflectionUtils;
@ -217,8 +216,8 @@ class Display extends FsCommand {
protected class TextRecordInputStream extends InputStream {
SequenceFile.Reader r;
Writable key;
Writable val;
Object key;
Object val;
DataInputBuffer inbuf;
DataOutputBuffer outbuf;
@ -228,10 +227,8 @@ class Display extends FsCommand {
final Configuration lconf = getConf();
r = new SequenceFile.Reader(lconf,
SequenceFile.Reader.file(fpath));
key = ReflectionUtils.newInstance(
r.getKeyClass().asSubclass(Writable.class), lconf);
val = ReflectionUtils.newInstance(
r.getValueClass().asSubclass(Writable.class), lconf);
key = ReflectionUtils.newInstance(r.getKeyClass(), lconf);
val = ReflectionUtils.newInstance(r.getValueClass(), lconf);
inbuf = new DataInputBuffer();
outbuf = new DataOutputBuffer();
}
@ -240,8 +237,11 @@ class Display extends FsCommand {
public int read() throws IOException {
int ret;
if (null == inbuf || -1 == (ret = inbuf.read())) {
if (!r.next(key, val)) {
key = r.next(key);
if (key == null) {
return -1;
} else {
val = r.getCurrentValue(val);
}
byte[] tmp = key.toString().getBytes(StandardCharsets.UTF_8);
outbuf.write(tmp, 0, tmp.length);


@ -633,7 +633,7 @@ public class PathData implements Comparable<PathData> {
return awaitFuture(fs.openFile(path)
.opt(FS_OPTION_OPENFILE_READ_POLICY,
policy)
.opt(FS_OPTION_OPENFILE_LENGTH,
.optLong(FS_OPTION_OPENFILE_LENGTH,
stat.getLen()) // file length hint for object stores
.build());
}


@ -15,6 +15,10 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
/**
* Support for the execution of a file system command.
*/
@InterfaceAudience.Private
@InterfaceStability.Unstable
package org.apache.hadoop.fs.shell;


@ -0,0 +1,99 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.fs.statistics;
import org.apache.hadoop.fs.statistics.impl.IOStatisticsContextIntegration;
import static java.util.Objects.requireNonNull;
/**
* An interface defined to capture thread-level IOStatistics by using a
* per-thread context.
* <p>
* Statistics-generating classes should collect the aggregator in their
* constructors and update it <i>across all threads</i>.
* <p>
* The {@link #snapshot()} call creates a snapshot of the statistics.
* <p>
* The {@link #reset()} call resets the statistics in the context so
* that later snapshots will return only the incremental data.
*/
public interface IOStatisticsContext extends IOStatisticsSource {
/**
* Get the IOStatisticsAggregator for the context.
*
* @return return the aggregator for the context.
*/
IOStatisticsAggregator getAggregator();
/**
* Capture the snapshot of the context's IOStatistics.
*
* @return IOStatisticsSnapshot for the context.
*/
IOStatisticsSnapshot snapshot();
/**
* Get a unique ID for this context, for logging
* purposes.
*
* @return an ID unique for all contexts in this process.
*/
long getID();
/**
* Reset the context's IOStatistics.
*/
void reset();
/**
* Get the current thread's IOStatisticsContext.
*
* @return instance of IOStatisticsContext for the current thread.
*/
static IOStatisticsContext getCurrentIOStatisticsContext() {
// the null check is just a safety check to highlight exactly where a null value would
// be returned if HADOOP-18456 has resurfaced.
return requireNonNull(
IOStatisticsContextIntegration.getCurrentIOStatisticsContext(),
"Null IOStatisticsContext");
}
/**
* Set the IOStatisticsContext for the current thread.
* @param statisticsContext IOStatistics context instance for the
* current thread. If null, the context is reset.
*/
static void setThreadIOStatisticsContext(
IOStatisticsContext statisticsContext) {
IOStatisticsContextIntegration.setThreadIOStatisticsContext(
statisticsContext);
}
/**
* Static probe to check if thread-level IO statistics are enabled.
*
* @return true if thread-level IO statistics are enabled.
*/
static boolean enabled() {
return IOStatisticsContextIntegration.isIOStatisticsThreadLevelEnabled();
}
}
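
A short usage sketch built only from the methods above:

import org.apache.hadoop.fs.statistics.IOStatisticsContext;
import org.apache.hadoop.fs.statistics.IOStatisticsSnapshot;

public class ContextDemo {
  public static void main(String[] args) {
    if (!IOStatisticsContext.enabled()) {
      return;   // thread-level statistics collection is switched off
    }
    IOStatisticsContext context =
        IOStatisticsContext.getCurrentIOStatisticsContext();
    // ... perform filesystem work on this thread ...
    IOStatisticsSnapshot snapshot = context.snapshot();
    System.out.println(snapshot);
    context.reset();   // later snapshots will report only incremental data
  }
}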


@ -0,0 +1,75 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.fs.statistics;
import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;
/**
* Setter for IOStatistics entries.
* These operations have been in the read/write API
* {@code IOStatisticsStore} since IOStatistics
* was added; extracting them into their own interface allows
* {@link IOStatisticsSnapshot} to support them as well.
* These are the simple setters; they don't provide increments,
* decrements, or calculation of min/max/mean etc.
* @since The interface and IOStatisticsSnapshot support was added <i>after</i> Hadoop 3.3.5
*/
@InterfaceAudience.Public
@InterfaceStability.Evolving
public interface IOStatisticsSetters extends IOStatistics {
/**
* Set a counter.
*
* No-op if the counter is unknown.
* @param key statistics key
* @param value value to set
*/
void setCounter(String key, long value);
/**
* Set a gauge.
*
* @param key statistics key
* @param value value to set
*/
void setGauge(String key, long value);
/**
* Set a maximum.
* @param key statistics key
* @param value value to set
*/
void setMaximum(String key, long value);
/**
* Set a minimum.
* @param key statistics key
* @param value value to set
*/
void setMinimum(String key, long value);
/**
* Set a mean statistic to a given value.
* @param key statistic key
* @param value new value.
*/
void setMeanStatistic(String key, MeanStatistic value);
}
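
A minimal sketch of the setters in use: IOStatisticsSnapshot implements this interface (see the next hunk), and the statistic keys here are from StreamStatisticNames.

import org.apache.hadoop.fs.statistics.IOStatisticsSnapshot;

public class SettersDemo {
  public static void main(String[] args) {
    IOStatisticsSnapshot snapshot = new IOStatisticsSnapshot();
    snapshot.setCounter("stream_read_operations", 42);
    snapshot.setGauge("stream_read_active_memory_in_use", 1 << 20);
    System.out.println(snapshot);
  }
}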


@ -62,7 +62,8 @@ import static org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.snapshotM
@InterfaceAudience.Public
@InterfaceStability.Evolving
public final class IOStatisticsSnapshot
implements IOStatistics, Serializable, IOStatisticsAggregator {
implements IOStatistics, Serializable, IOStatisticsAggregator,
IOStatisticsSetters {
private static final long serialVersionUID = -1762522703841538084L;
@ -222,6 +223,33 @@ public final class IOStatisticsSnapshot
return meanStatistics;
}
@Override
public synchronized void setCounter(final String key, final long value) {
counters().put(key, value);
}
@Override
public synchronized void setGauge(final String key, final long value) {
gauges().put(key, value);
}
@Override
public synchronized void setMaximum(final String key, final long value) {
maximums().put(key, value);
}
@Override
public synchronized void setMinimum(final String key, final long value) {
minimums().put(key, value);
}
@Override
public void setMeanStatistic(final String key, final MeanStatistic value) {
meanStatistics().put(key, value);
}
@Override
public String toString() {
return ioStatisticsToString(this);


@ -47,7 +47,7 @@ public final class StreamStatisticNames {
public static final String STREAM_READ_ABORTED = "stream_aborted";
/**
* Bytes read from an input stream in read() calls.
* Bytes read from an input stream in read()/readVectored() calls.
* Does not include bytes read and then discarded in seek/close etc.
* These are the bytes returned to the caller.
* Value: {@value}.
@ -110,6 +110,34 @@ public final class StreamStatisticNames {
public static final String STREAM_READ_OPERATIONS =
"stream_read_operations";
/**
* Count of readVectored() operations in an input stream.
* Value: {@value}.
*/
public static final String STREAM_READ_VECTORED_OPERATIONS =
"stream_read_vectored_operations";
/**
* Count of bytes discarded during readVectored() operation
* in an input stream.
* Value: {@value}.
*/
public static final String STREAM_READ_VECTORED_READ_BYTES_DISCARDED =
"stream_read_vectored_read_bytes_discarded";
/**
* Count of incoming file ranges during readVectored() operation.
* Value: {@value}
*/
public static final String STREAM_READ_VECTORED_INCOMING_RANGES =
"stream_read_vectored_incoming_ranges";
/**
* Count of combined file ranges during readVectored() operation.
* Value: {@value}
*/
public static final String STREAM_READ_VECTORED_COMBINED_RANGES =
"stream_read_vectored_combined_ranges";
/**
* Count of incomplete read() operations in an input stream,
* that is, when the bytes returned were less than that requested.
@ -387,6 +415,46 @@ public final class StreamStatisticNames {
public static final String BLOCKS_RELEASED
= "blocks_released";
/**
* Total number of prefetching operations executed.
*/
public static final String STREAM_READ_PREFETCH_OPERATIONS
= "stream_read_prefetch_operations";
/**
* Total number of blocks in the disk cache.
*/
public static final String STREAM_READ_BLOCKS_IN_FILE_CACHE
= "stream_read_blocks_in_cache";
/**
* Total number of active prefetch operations.
*/
public static final String STREAM_READ_ACTIVE_PREFETCH_OPERATIONS
= "stream_read_active_prefetch_operations";
/**
* Total bytes of memory in use by this input stream.
*/
public static final String STREAM_READ_ACTIVE_MEMORY_IN_USE
= "stream_read_active_memory_in_use";
/**
* Count/duration of reading a remote block.
*
* Value: {@value}.
*/
public static final String STREAM_READ_REMOTE_BLOCK_READ
= "stream_read_block_read";
/**
* Count/duration of acquiring a buffer and reading into it.
*
* Value: {@value}.
*/
public static final String STREAM_READ_BLOCK_ACQUIRE_AND_READ
= "stream_read_block_acquire_read";
private StreamStatisticNames() {
}


@ -0,0 +1,81 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.fs.statistics.impl;
import org.apache.hadoop.fs.statistics.IOStatistics;
import org.apache.hadoop.fs.statistics.IOStatisticsAggregator;
import org.apache.hadoop.fs.statistics.IOStatisticsContext;
import org.apache.hadoop.fs.statistics.IOStatisticsSnapshot;
/**
* Empty IOStatistics context: a no-op for all operations,
* returning an empty snapshot when asked.
*/
final class EmptyIOStatisticsContextImpl implements IOStatisticsContext {
private static final IOStatisticsContext EMPTY_CONTEXT = new EmptyIOStatisticsContextImpl();
private EmptyIOStatisticsContextImpl() {
}
/**
* Create a new empty snapshot.
* A new one is always created for isolation.
*
* @return a statistics snapshot
*/
@Override
public IOStatisticsSnapshot snapshot() {
return new IOStatisticsSnapshot();
}
@Override
public IOStatisticsAggregator getAggregator() {
return EmptyIOStatisticsStore.getInstance();
}
@Override
public IOStatistics getIOStatistics() {
return EmptyIOStatistics.getInstance();
}
@Override
public void reset() {}
/**
* The ID is always 0.
* As the real context implementation counter starts at 1,
* we are guaranteed to have unique IDs even between them and
* the empty context.
* @return 0
*/
@Override
public long getID() {
return 0;
}
/**
* Get the single instance.
* @return an instance.
*/
static IOStatisticsContext getInstance() {
return EMPTY_CONTEXT;
}
}


@ -16,7 +16,7 @@
* limitations under the License.
*/
package org.apache.hadoop.fs.s3a.statistics.impl;
package org.apache.hadoop.fs.statistics.impl;
import javax.annotation.Nullable;
import java.time.Duration;
@ -25,7 +25,6 @@ import java.util.concurrent.atomic.AtomicLong;
import org.apache.hadoop.fs.statistics.IOStatistics;
import org.apache.hadoop.fs.statistics.MeanStatistic;
import org.apache.hadoop.fs.statistics.impl.IOStatisticsStore;
/**
* This may seem odd having an IOStatisticsStore which does nothing

View File

@@ -0,0 +1,128 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.fs.statistics.impl;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.apache.hadoop.fs.statistics.IOStatistics;
import org.apache.hadoop.fs.statistics.IOStatisticsAggregator;
import org.apache.hadoop.fs.statistics.IOStatisticsContext;
import org.apache.hadoop.fs.statistics.IOStatisticsSnapshot;
/**
* Implementation of {@link IOStatisticsContext}.
*
* A context for per-thread IOStatistics collection: it captures each
* worker thread's work in FS streams and stores it as an
* IOStatisticsSnapshot.
*
* As IOStatisticsSnapshot is Serializable, the current thread's
* statistics can be moved between applications.
*/
public final class IOStatisticsContextImpl implements IOStatisticsContext {
private static final Logger LOG =
LoggerFactory.getLogger(IOStatisticsContextImpl.class);
/**
* Thread ID.
*/
private final long threadId;
/**
* Unique ID.
*/
private final long id;
/**
* IOStatistics to aggregate.
*/
private final IOStatisticsSnapshot ioStatistics = new IOStatisticsSnapshot();
/**
* Constructor.
* @param threadId thread ID
* @param id instance ID.
*/
public IOStatisticsContextImpl(final long threadId, final long id) {
this.threadId = threadId;
this.id = id;
}
@Override
public String toString() {
return "IOStatisticsContextImpl{" +
"id=" + id +
", threadId=" + threadId +
", ioStatistics=" + ioStatistics +
'}';
}
/**
* Get the IOStatisticsAggregator of the context.
* @return the instance of IOStatisticsAggregator for this context.
*/
@Override
public IOStatisticsAggregator getAggregator() {
return ioStatistics;
}
/**
* Returns a snapshot of the current thread's IOStatistics.
*
* @return IOStatisticsSnapshot of the context.
*/
@Override
public IOStatisticsSnapshot snapshot() {
LOG.debug("Taking snapshot of IOStatisticsContext id {}", id);
return new IOStatisticsSnapshot(ioStatistics);
}
/**
* Reset this thread's statistics.
*/
@Override
public void reset() {
LOG.debug("clearing IOStatisticsContext id {}", id);
ioStatistics.clear();
}
@Override
public IOStatistics getIOStatistics() {
return ioStatistics;
}
/**
* ID of this context.
* @return ID.
*/
@Override
public long getID() {
return id;
}
/**
* Get the thread ID.
* @return thread ID.
*/
public long getThreadID() {
return threadId;
}
}
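
Because IOStatisticsSnapshot is Serializable, the snapshot taken from a context can be moved between processes with ordinary Java serialization. A sketch under that assumption; the helper class and file path are illustrative:

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

import org.apache.hadoop.fs.statistics.IOStatisticsContext;
import org.apache.hadoop.fs.statistics.IOStatisticsSnapshot;
import org.apache.hadoop.fs.statistics.impl.IOStatisticsContextIntegration;

public final class SnapshotShipper {
  /** Capture the current thread's statistics and write them to a file. */
  public static void ship(String path) throws IOException {
    IOStatisticsContext ctx =
        IOStatisticsContextIntegration.getCurrentIOStatisticsContext();
    // snapshot() copies the aggregated statistics for isolation.
    IOStatisticsSnapshot snapshot = ctx.snapshot();
    try (ObjectOutputStream oos =
             new ObjectOutputStream(new FileOutputStream(path))) {
      oos.writeObject(snapshot);
    }
  }
}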

View File

@@ -0,0 +1,181 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.fs.statistics.impl;
import java.lang.ref.WeakReference;
import java.util.concurrent.atomic.AtomicLong;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.apache.hadoop.classification.VisibleForTesting;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.impl.WeakReferenceThreadMap;
import org.apache.hadoop.fs.statistics.IOStatisticsContext;
import static org.apache.hadoop.fs.CommonConfigurationKeys.IOSTATISTICS_THREAD_LEVEL_ENABLED;
import static org.apache.hadoop.fs.CommonConfigurationKeys.IOSTATISTICS_THREAD_LEVEL_ENABLED_DEFAULT;
/**
* A utility class for IOStatisticsContext which creates and returns the
* currently active context. Its static methods let callers obtain the
* current context and start aggregating IOStatistics into it.
*
* A static initializer works out whether the feature to collect
* thread-level IOStatistics is enabled, and the corresponding
* implementation class is used accordingly.
*
* A weak-reference thread map keeps track of the per-thread contexts;
* the references are cleaned up at GC time, avoiding long-lived
* memory leaks.
*/
public final class IOStatisticsContextIntegration {
private static final Logger LOG =
LoggerFactory.getLogger(IOStatisticsContextIntegration.class);
/**
* Is thread-level IO Statistics enabled?
*/
private static boolean isThreadIOStatsEnabled;
/**
* ID for next instance to create.
*/
public static final AtomicLong INSTANCE_ID = new AtomicLong(1);
/**
* Map of active IOStatistics contexts, one per worker thread.
* Weak references are used so that entries are cleaned up during GC,
* avoiding memory leaks from long-lived references.
*/
private static final WeakReferenceThreadMap<IOStatisticsContext>
ACTIVE_IOSTATS_CONTEXT =
new WeakReferenceThreadMap<>(
IOStatisticsContextIntegration::createNewInstance,
IOStatisticsContextIntegration::referenceLostContext
);
static {
// Work out if the current context has thread level IOStatistics enabled.
final Configuration configuration = new Configuration();
isThreadIOStatsEnabled =
configuration.getBoolean(IOSTATISTICS_THREAD_LEVEL_ENABLED,
IOSTATISTICS_THREAD_LEVEL_ENABLED_DEFAULT);
}
/**
* Static probe to check whether thread-level IO statistics are enabled.
*
* @return true if thread-level IO statistics are enabled.
*/
public static boolean isIOStatisticsThreadLevelEnabled() {
return isThreadIOStatsEnabled;
}
/**
* Private constructor for a utility class to be used in IOStatisticsContext.
*/
private IOStatisticsContextIntegration() {}
/**
* Create a new IOStatisticsContext instance for a filesystem to use.
* @param key Thread ID that represents which thread the context belongs to.
* @return an instance of IOStatisticsContext.
*/
private static IOStatisticsContext createNewInstance(Long key) {
IOStatisticsContextImpl instance =
new IOStatisticsContextImpl(key, INSTANCE_ID.getAndIncrement());
LOG.debug("Created instance {}", instance);
return instance;
}
/**
* Callback invoked when a thread's reference to its IOStatisticsContext is lost.
* @param key ThreadID.
*/
private static void referenceLostContext(Long key) {
LOG.debug("Reference lost for threadID for the context: {}", key);
}
/**
* Get the current thread's IOStatisticsContext instance. If no instance is
* present for this thread ID, create one using the factory.
* @return instance of IOStatisticsContext.
*/
public static IOStatisticsContext getCurrentIOStatisticsContext() {
return isThreadIOStatsEnabled
? ACTIVE_IOSTATS_CONTEXT.getForCurrentThread()
: EmptyIOStatisticsContextImpl.getInstance();
}
/**
* Set the IOStatisticsContext for the current thread.
* @param statisticsContext IOStatistics context instance for the
* current thread. If null, the context is reset.
*/
public static void setThreadIOStatisticsContext(
IOStatisticsContext statisticsContext) {
if (isThreadIOStatsEnabled) {
if (statisticsContext == null) {
// new value is null, so remove it
ACTIVE_IOSTATS_CONTEXT.removeForCurrentThread();
} else {
// the setter is efficient in that it does not create a new
// reference if the context is unchanged.
ACTIVE_IOSTATS_CONTEXT.setForCurrentThread(statisticsContext);
}
}
}
/**
* Get thread ID specific IOStatistics values if
* statistics are enabled and the thread ID is in the map.
* @param testThreadId thread ID.
* @return IOStatisticsContext if found in the map.
*/
@VisibleForTesting
public static IOStatisticsContext getThreadSpecificIOStatisticsContext(long testThreadId) {
LOG.debug("IOStatsContext thread ID required: {}", testThreadId);
if (!isThreadIOStatsEnabled) {
return null;
}
// lookup the weakRef IOStatisticsContext for the thread ID in the
// ThreadMap.
WeakReference<IOStatisticsContext> ioStatisticsSnapshotWeakReference =
ACTIVE_IOSTATS_CONTEXT.lookup(testThreadId);
if (ioStatisticsSnapshotWeakReference != null) {
return ioStatisticsSnapshotWeakReference.get();
}
return null;
}
/**
* A method for tests to enable thread-level IOStatisticsContext,
* overriding whatever was set in the configuration.
*/
@VisibleForTesting
public static void enableIOStatisticsContext() {
if (!isThreadIOStatsEnabled) {
LOG.info("Enabling Thread IOStatistics..");
isThreadIOStatsEnabled = true;
}
}
}
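
Since the map is keyed by thread, work handed to a pool thread does not aggregate into the submitter's context unless the context is propagated explicitly. A sketch built on the two static methods above; the helper class and ioWork callback are illustrative:

import java.util.concurrent.ExecutorService;

import org.apache.hadoop.fs.statistics.IOStatisticsContext;
import org.apache.hadoop.fs.statistics.impl.IOStatisticsContextIntegration;

public final class ContextPropagation {
  public static void runWithCallerContext(ExecutorService executor,
      Runnable ioWork) {
    // Capture the submitting thread's context...
    IOStatisticsContext callerContext =
        IOStatisticsContextIntegration.getCurrentIOStatisticsContext();
    executor.submit(() -> {
      // ...and bind it to the worker so the worker's stream I/O
      // aggregates into the caller's statistics.
      IOStatisticsContextIntegration.setThreadIOStatisticsContext(callerContext);
      try {
        ioWork.run();
      } finally {
        // Unbind so a recycled pool thread stops aggregating here.
        IOStatisticsContextIntegration.setThreadIOStatisticsContext(null);
      }
    });
  }
}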

View File

@@ -24,6 +24,7 @@ import java.util.concurrent.atomic.AtomicLong;
import org.apache.hadoop.fs.statistics.IOStatistics;
import org.apache.hadoop.fs.statistics.IOStatisticsAggregator;
import org.apache.hadoop.fs.statistics.DurationTrackerFactory;
import org.apache.hadoop.fs.statistics.IOStatisticsSetters;
import org.apache.hadoop.fs.statistics.MeanStatistic;
/**
@@ -31,6 +32,7 @@ import org.apache.hadoop.fs.statistics.MeanStatistic;
* use in classes which track statistics for reporting.
*/
public interface IOStatisticsStore extends IOStatistics,
IOStatisticsSetters,
IOStatisticsAggregator,
DurationTrackerFactory {
@@ -56,24 +58,6 @@ public interface IOStatisticsStore extends IOStatistics,
*/
long incrementCounter(String key, long value);
/**
* Set a counter.
*
* No-op if the counter is unknown.
* @param key statistics key
* @param value value to set
*/
void setCounter(String key, long value);
/**
* Set a gauge.
*
* No-op if the gauge is unknown.
* @param key statistics key
* @param value value to set
*/
void setGauge(String key, long value);
/**
* Increment a gauge.
* <p>
@@ -85,14 +69,6 @@ public interface IOStatisticsStore extends IOStatistics,
*/
long incrementGauge(String key, long value);
/**
* Set a maximum.
* No-op if the maximum is unknown.
* @param key statistics key
* @param value value to set
*/
void setMaximum(String key, long value);
/**
* Increment a maximum.
* <p>
@@ -104,16 +80,6 @@ public interface IOStatisticsStore extends IOStatistics,
*/
long incrementMaximum(String key, long value);
/**
* Set a minimum.
* <p>
* No-op if the minimum is unknown.
* </p>
* @param key statistics key
* @param value value to set
*/
void setMinimum(String key, long value);
/**
* Increment a minimum.
* <p>
@@ -147,16 +113,6 @@ public interface IOStatisticsStore extends IOStatistics,
*/
void addMaximumSample(String key, long value);
/**
* Set a mean statistic to a given value.
* <p>
* No-op if the key is unknown.
* </p>
* @param key statistic key
* @param value new value.
*/
void setMeanStatistic(String key, MeanStatistic value);
/**
* Add a sample to the mean statistics.
* <p>
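
The set* methods are not gone: they move up into the new IOStatisticsSetters superinterface, so existing callers compile unchanged. A sketch of that, assuming the iostatisticsStore() builder factory in org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding:

import org.apache.hadoop.fs.statistics.impl.IOStatisticsStore;

import static org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.iostatisticsStore;

public final class StoreSettersDemo {
  public static void main(String[] args) {
    IOStatisticsStore store = iostatisticsStore()
        .withCounters("op_count")
        .withGauges("queue_depth")
        .build();
    // Still valid: the setters are now inherited from
    // IOStatisticsSetters rather than declared on the store itself.
    store.setCounter("op_count", 100);
    store.setGauge("queue_depth", 4);
    store.incrementCounter("op_count", 1);
  }
}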

View File

@@ -67,6 +67,17 @@ public interface IOStatisticsStoreBuilder {
IOStatisticsStoreBuilder withDurationTracking(
String... prefixes);
/**
* A value which is tracked with counter/min/max/mean.
* Similar to {@link #withDurationTracking(String...)}
* but without the failure option and with the same name
* across all categories.
* @param prefixes prefixes to add.
* @return the builder
*/
IOStatisticsStoreBuilder withSampleTracking(
String... prefixes);
/**
* Build the collector.
* @return a new collector.

View File

@@ -92,6 +92,18 @@ final class IOStatisticsStoreBuilderImpl implements
return this;
}
@Override
public IOStatisticsStoreBuilderImpl withSampleTracking(
final String... prefixes) {
for (String p : prefixes) {
withCounters(p);
withMinimums(p);
withMaximums(p);
withMeanStatistics(p);
}
return this;
}
@Override
public IOStatisticsStore build() {
return new IOStatisticsStoreImpl(counters, gauges, minimums,
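
For each prefix, withSampleTracking() registers a counter, a minimum, a maximum and a mean statistic under the same name. One plausible way to feed such a statistic, using the sample methods on IOStatisticsStore; the prefix and value are illustrative, and iostatisticsStore() is assumed as above:

import org.apache.hadoop.fs.statistics.impl.IOStatisticsStore;

import static org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.iostatisticsStore;

public final class SampleTrackingDemo {
  public static void main(String[] args) {
    IOStatisticsStore store = iostatisticsStore()
        .withSampleTracking("request_size")
        .build();
    long bytes = 8192;                          // hypothetical sample
    store.incrementCounter("request_size", 1);  // one more sample seen
    store.addMinimumSample("request_size", bytes);
    store.addMaximumSample("request_size", bytes);
    store.addMeanStatisticSample("request_size", bytes);
  }
}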

View File

@@ -190,7 +190,7 @@ final class IOStatisticsStoreImpl extends WrappedIOStatistics
return counter.get();
} else {
long l = incAtomicLong(counter, value);
LOG.debug("Incrementing counter {} by {} with final value {}",
LOG.trace("Incrementing counter {} by {} with final value {}",
key, value, l);
return l;
}

View File

@@ -97,7 +97,7 @@ import org.eclipse.jetty.server.SecureRequestCustomizer;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.ServerConnector;
import org.eclipse.jetty.server.SslConnectionFactory;
import org.eclipse.jetty.server.handler.AllowSymLinkAliasChecker;
import org.eclipse.jetty.server.SymlinkAllowedResourceAliasChecker;
import org.eclipse.jetty.server.handler.ContextHandlerCollection;
import org.eclipse.jetty.server.handler.HandlerCollection;
import org.eclipse.jetty.server.handler.RequestLogHandler;
@@ -144,7 +144,7 @@ public final class HttpServer2 implements FilterContainer {
public static final String HTTP_SOCKET_BACKLOG_SIZE_KEY =
"hadoop.http.socket.backlog.size";
public static final int HTTP_SOCKET_BACKLOG_SIZE_DEFAULT = 128;
public static final int HTTP_SOCKET_BACKLOG_SIZE_DEFAULT = 500;
public static final String HTTP_MAX_THREADS_KEY = "hadoop.http.max.threads";
public static final String HTTP_ACCEPTOR_COUNT_KEY =
"hadoop.http.acceptor.count";
@@ -497,7 +497,12 @@
prefix -> this.conf.get(prefix + "type")
.equals(PseudoAuthenticationHandler.TYPE))
) {
server.initSpnego(conf, hostName, usernameConfKey, keytabConfKey);
server.initSpnego(
conf,
hostName,
getFilterProperties(conf, authFilterConfigurationPrefixes),
usernameConfKey,
keytabConfKey);
}
for (URI ep : endpoints) {
@@ -939,7 +944,7 @@
handler.setHttpOnly(true);
handler.getSessionCookieConfig().setSecure(true);
logContext.setSessionHandler(handler);
logContext.addAliasCheck(new AllowSymLinkAliasChecker());
logContext.addAliasCheck(new SymlinkAllowedResourceAliasChecker(logContext));
setContextAttributes(logContext, conf);
addNoCacheFilter(logContext);
defaultContexts.put(logContext, true);
@@ -958,7 +963,7 @@
handler.setHttpOnly(true);
handler.getSessionCookieConfig().setSecure(true);
staticContext.setSessionHandler(handler);
staticContext.addAliasCheck(new AllowSymLinkAliasChecker());
staticContext.addAliasCheck(new SymlinkAllowedResourceAliasChecker(staticContext));
setContextAttributes(staticContext, conf);
defaultContexts.put(staticContext, true);
}
@@ -1340,8 +1345,12 @@
}
private void initSpnego(Configuration conf, String hostName,
String usernameConfKey, String keytabConfKey) throws IOException {
Properties authFilterConfigurationPrefixes, String usernameConfKey, String keytabConfKey)
throws IOException {
Map<String, String> params = new HashMap<>();
for (Map.Entry<Object, Object> entry : authFilterConfigurationPrefixes.entrySet()) {
params.put(String.valueOf(entry.getKey()), String.valueOf(entry.getValue()));
}
String principalInConf = conf.get(usernameConfKey);
if (principalInConf != null && !principalInConf.isEmpty()) {
params.put("kerberos.principal", SecurityUtil.getServerPrincipal(
@@ -1967,4 +1976,8 @@
return metrics;
}
@VisibleForTesting
List<ServerConnector> getListeners() {
return listeners;
}
}
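
The accept-queue default rises from 128 to 500; deployments that want the old behaviour can pin it through the existing key. A minimal sketch (helper class is illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.http.HttpServer2;

public final class BacklogConfig {
  public static Configuration withOldBacklog() {
    Configuration conf = new Configuration();
    // Restore the pre-change accept-queue length; the key is unchanged.
    conf.setInt(HttpServer2.HTTP_SOCKET_BACKLOG_SIZE_KEY, 128);
    return conf;
  }
}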

View File

@@ -15,6 +15,10 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
/**
* Support for embedded HTTP services.
*/
@InterfaceAudience.LimitedPrivate({"HBase", "HDFS", "MapReduce"})
@InterfaceStability.Unstable
package org.apache.hadoop.http;

Some files were not shown because too many files have changed in this diff Show More