24797 Commits

Author SHA1 Message Date
daimin 9071c9646c
Fix thread safety of EC decoding during concurrent preads (#3881)
(cherry picked from commit 0e74f1e467)
2022-02-11 10:20:45 +08:00
Ayush Saxena 5b47b9f360
HADOOP-18096. Distcp: Sync moves filtered file to home directory rather than deleting. (#3940). Contributed by Ayush Saxena.
Reviewed-by: Steve Loughran <stevel@apache.org>
Reviewed-by: stack <stack@apache.org>
2022-02-11 02:05:14 +05:30
Steve Loughran 088684ec60
HADOOP-18091. S3A auditing leaks memory through ThreadLocal references (#3930)
Adds a new map type WeakReferenceMap, which stores weak
references to values, and a WeakReferenceThreadMap subclass
to more closely resemble a thread local type, as it is a
map of threadId to value.

Construct it with a factory method and optional callback
for notification on loss and regeneration.

 WeakReferenceThreadMap<WrappingAuditSpan> activeSpan =
      new WeakReferenceThreadMap<>(
          (k) -> getUnbondedSpan(),
          this::noteSpanReferenceLost);

This is used in ActiveAuditManagerS3A for span tracking.
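
A minimal sketch of the idea described above, assuming a simplified design:
a map from thread ID to a weakly referenced value, a factory that creates
a value on demand, and a callback invoked when a value has been garbage
collected and must be regenerated. The class name WeakThreadMapSketch and
its methods are hypothetical; this is not the Hadoop WeakReferenceThreadMap
implementation or its API.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical sketch; not the Hadoop WeakReferenceThreadMap class.
final class WeakThreadMapSketch<V> {

  // Thread ID -> weak reference to that thread's value.
  private final Map<Long, WeakReference<V>> map = new ConcurrentHashMap<>();
  // Creates a value for a key that has no live value.
  private final Function<Long, V> factory;
  // Notified when a previously stored value was garbage collected.
  private final Consumer<Long> referenceLost;

  WeakThreadMapSketch(Function<Long, V> factory, Consumer<Long> referenceLost) {
    this.factory = factory;
    this.referenceLost = referenceLost;
  }

  // Look up (or create) the value belonging to the calling thread.
  V getForCurrentThread() {
    return get(Thread.currentThread().getId());
  }

  // Each key is expected to be used only by its owning thread,
  // so the check-then-put sequence below is not made atomic.
  V get(long key) {
    WeakReference<V> ref = map.get(key);
    V value = (ref == null) ? null : ref.get();
    if (value != null) {
      return value;
    }
    if (ref != null) {
      // An entry existed, but its referent has been collected.
      referenceLost.accept(key);
    }
    V created = factory.apply(key);
    map.put(key, new WeakReference<>(created));
    return created;
  }

  // Drop the entry for the calling thread.
  void removeForCurrentThread() {
    map.remove(Thread.currentThread().getId());
  }

  public static void main(String[] args) {
    WeakThreadMapSketch<StringBuilder> spans = new WeakThreadMapSketch<>(
        id -> new StringBuilder("span-for-thread-" + id),
        id -> System.out.println("reference lost for thread " + id));
    // While 'span' is strongly held here, repeated lookups return it.
    StringBuilder span = spans.getForCurrentThread();
    System.out.println(span == spans.getForCurrentThread());
  }
}
```

Because the map holds only weak references, a value stays alive only while
something else holds it strongly; a long-lived thread therefore does not pin
the value in memory the way a ThreadLocal entry would.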

Relates to
* HADOOP-17511. Add an Audit plugin point for S3A
* HADOOP-18094. Disable S3A auditing by default.

Contributed by Steve Loughran.

Change-Id: Ibf7bb082fd47298f7ebf46d92f56e80ca9b2aaf8
2022-02-10 12:33:40 +00:00
Joey Krabacher 84de16028d
HADOOP-18114. Documentation correction in assumed_roles.md (#3949)
Fixes typo in hadoop-aws/assumed_roles.md

Contributed by Joey Krabacher

Change-Id: I2b77bd7793ae0433196b77042d5f400d0bcbe745
2022-02-09 10:47:24 +00:00
singer-bin ce7cabb771
HDFS-16437 ReverseXML processor doesn't accept XML files without the … (#3926)
(cherry picked from commit 125e3b6160)
2022-02-06 13:36:57 +08:00
daimin 709e617a84
HDFS-16403. Improve FUSE IO performance by supporting FUSE parameter max_background (#3842)
Reviewed-by: Istvan Fajth <pifta@apache.org>
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
(cherry picked from commit d69938994e)
2022-02-06 13:06:35 +08:00
Abhishek Das 8b03514eaf HADOOP-18100: Change scope of inner classes in InodeTree to make them accessible outside package
Fixes #3950

Signed-off-by: Owen O'Malley <omalley@apache.org>

Cherry-picked from 3684c7f6 by Owen O'Malley
2022-02-04 12:13:10 -08:00
Petre Bogdan Stolojan 87ff57765a
HADOOP-18085. S3 SDK Upgrade causes AccessPoint ARN endpoint mistranslation (#3902)
Part of HADOOP-17198. Support S3 Access Points.

HADOOP-18068. "upgrade AWS SDK to 1.12.132" broke the access point endpoint
translation.

Correct endpoints should start with "s3-accesspoint.", after SDK upgrade they start with
"s3.accesspoint-" which messes up tests + region detection by the SDK.

Contributed by Bogdan Stolojan

Change-Id: I0c0181628ab803afc39036003777eaec79aa378c
2022-02-04 16:22:24 +00:00
Petre Bogdan Stolojan a8d7acf1a8
HADOOP-17951. Improve S3A checking of S3 Access Point existence (#3516)
Follow-on to HADOOP-17198. Support S3 Access Points

Contributed by Bogdan Stolojan

Change-Id: I0932476c64e1967eb0cb3e0f00060fac5d2bae72
2022-02-04 16:22:04 +00:00
Petre Bogdan Stolojan 664075f35d
HADOOP-17198. Support S3 Access Points (#3260)
Add support for S3 Access Points. This provides extra security as it
ensures applications are not working with buckets belonging to third parties.

To bind a bucket to an access point, set the access point (ap) ARN,
which must be done for each specific bucket, using the pattern

fs.s3a.bucket.$BUCKET.accesspoint.arn = ARN

* Use the global/bucket option `fs.s3a.accesspoint.required` to
mandate that buckets declare their access point.
* This is not compatible with S3Guard.

Consult the documentation for further details.

Contributed by Bogdan Stolojan

(this commit contains the changes to TestArnResource from HADOOP-18068,
 "upgrade AWS SDK to 1.12.132" so that it works with the later SDK.)

Change-Id: I3fac213e52ca6ec1c813effb8496c353964b8e1b
2022-02-04 16:21:35 +00:00
KevinWikant 7171e2190e HDFS-16443. Fix edge case where DatanodeAdminDefaultMonitor doubly enqueues a DatanodeDescriptor on exception (#3942)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 089e06de21)
2022-01-31 13:19:40 +09:00
KevinWikant 5e2eac6c41
HDFS-16303. Improve handling of datanode lost while decommissioning (#3921)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-01-31 13:18:36 +09:00
secfree 110104da38 HDFS-16169. Fix TestBlockTokenWithDFSStriped#testEnd2End failure (#3850)
Reviewed-by: Fei Hui <feihui.ustc@gmail.com>
Reviewed-by: litao <tomleescut@gmail.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 39cad5f28f)
2022-01-28 17:05:32 +09:00
Akira Ajisaka 8032b680fb YARN-10561. Upgrade node.js to 12.22.1 and yarn to 1.22.5 in YARN application catalog webapp (#2591)
Reviewed-by: Masatake Iwasaki <iwasakims@apache.org>
(cherry picked from commit 9cb535caf2)
2022-01-28 15:52:33 +09:00
litao b5d2e00f81 HDFS-16427. Add debug log for BlockManager#chooseExcessRedundancyStriped (#3888)
(cherry picked from commit 6136d630a3)
2022-01-27 13:44:03 +09:00
Xing Lin d613776b64
HADOOP-18093. Better exception handling for testFileStatusOnMountLink() in ViewFsBaseTest.java (#3918). Contributed by Xing Lin. (#3929)
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
(cherry picked from commit 0d17b629ff)
2022-01-26 21:55:32 +05:30
litao ef1a2b478b HDFS-16398. Reconfig block report parameters for datanode (#3831)
(cherry picked from commit c2ff39006f)

 Conflicts:
	hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
2022-01-26 17:36:17 +09:00
Wei-Chiu Chuang ff3a88b9c2
HDFS-16423. Balancer should not get blocks on stale storages (#3883) (#3924)
Reviewed-by: litao <tomleescut@gmail.com>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
(cherry picked from commit db2c3200e6)

 Conflicts:
	hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestGetBlocks.java
	hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancerWithHANameNodes.java

Co-authored-by: qinyuren <1476659627@qq.com>
2022-01-26 11:54:13 +08:00
Bryan Beaudreault bd13d73334 HDFS-16262. Async refresh of cached locations in DFSInputStream (#3527)
(cherry picked from commit 94b884ae55)
2022-01-25 11:43:47 +00:00
daimin 728ed10a7c
HDFS-16430. Add validation to maximum blocks in EC group when adding an EC policy (#3899). Contributed by daimin.
Reviewed-by: tomscut <litao@bigo.sg>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
(cherry picked from commit 5ef335da1e)

 Conflicts:
	hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ErasureCodingPolicyManager.java
2022-01-25 09:58:20 +08:00
Steve Loughran 4fd0389153
HADOOP-18094. Disable S3A auditing by default.
See HADOOP-18091. S3A auditing leaks memory through ThreadLocal references

* Adds a new option fs.s3a.audit.enabled to control whether or not auditing
is enabled. This is false by default.

* When false, the S3A auditing manager is NoopAuditManagerS3A,
which was formerly only used for unit tests and
during filesystem initialization.

* When true, ActiveAuditManagerS3A is used for managing auditing,
allowing auditing events to be reported.

* updates documentation and tests.

This patch does not fix the underlying leak. When auditing is enabled,
long-lived threads will retain references to the audit managers
of S3A filesystem instances which have already been closed.

Contributed by Steve Loughran.

Change-Id: I671e594cd59e8ca77a1f65be791ad0ae9530b8d9
2022-01-24 14:04:23 +00:00
dependabot[bot] 55192570a1 YARN-11065. Bump follow-redirects from 1.13.3 to 1.14.7 in hadoop-yarn-ui (#3890)
Bumps [follow-redirects](https://github.com/follow-redirects/follow-redirects) from 1.13.3 to 1.14.7.
- [Release notes](https://github.com/follow-redirects/follow-redirects/releases)
- [Commits](https://github.com/follow-redirects/follow-redirects/compare/v1.13.3...v1.14.7)

---
updated-dependencies:
- dependency-name: follow-redirects
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit dae33cf935)
2022-01-20 21:45:53 +09:00
liubingxing d6ff60df65
HDFS-16352. return the real datanode numBlocks in #getDatanodeStorageReport (#3714). Contributed by liubingxing.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
(cherry picked from commit d8dea6f52a)
2022-01-20 18:47:00 +08:00
Anmol Asrani 9b221b9599
HADOOP-18084. ABFS: Add testfilePath while verifying test contents are read correctly (#3903)
Contributed by: Anmol Asrani

Change-Id: I6e71bf349a74032f453398c7ae66f9c3305be190
2022-01-19 10:18:05 +00:00
litao f9c0bc094a HDFS-16399. Reconfig cache report parameters for datanode (#3841)
(cherry picked from commit e355646330)
2022-01-19 18:43:15 +09:00
litao 11fe5279b0 HDFS-16400. Reconfig DataXceiver parameters for datanode (#3843)
Reviewed-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
(cherry picked from commit f02374df92)
2022-01-19 18:42:48 +09:00
litao cdaf4d89f9 HDFS-16331. Make dfs.blockreport.intervalMsec reconfigurable (#3676)
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
(cherry picked from commit 52ec65fd10)
2022-01-19 18:40:41 +09:00
Viraj Jasani 831c11c47a HDFS-16139. Update BPServiceActor Scheduler's nextBlockReportTime atomically (#3228). Contributed by Viraj Jasani.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
(cherry picked from commit b038042ece)
2022-01-19 16:01:00 +09:00
qinyuren 1c71d6e9fe HDFS-16426. Fix nextBlockReportTime when trigger full block report force (#3887)
(cherry picked from commit fcb1076699)
2022-01-19 13:44:02 +09:00
Steve Loughran 8ccc586af6
HADOOP-17409. Remove s3guard from S3A module (#3534)
Completely removes S3Guard support from the S3A codebase.

If the connector is configured to use any metastore other than
the null and local stores (i.e. DynamoDB is selected) the s3a client
will raise an exception and refuse to initialize.

This is to ensure that there is no mix of S3Guard enabled and disabled
deployments with the same configuration but different hadoop releases
-it must be turned off completely.

The "hadoop s3guard" command has been retained -but the supported
subcommands have been reduced to those which are not purely S3Guard
related: "bucket-info" and "uploads".

This is a major change in terms of the number of files
changed; before cherry picking subsequent s3a patches into
older releases, this patch will probably need backporting
first.

Goodbye S3Guard, your work is done. Time to die.

Contributed by Steve Loughran.
2022-01-18 18:04:48 +00:00
Steve Loughran 47ba977ca9
HADOOP-18068. upgrade AWS SDK to 1.12.132 (#3864)
With this update, the versions of key shaded dependencies are

  jackson    2.12.3
  httpclient 4.5.13

This backport patch does not include the TestArn changes needed
for the test to work with this version of the SDK; it is only
to be applied to branches without HADOOP-17198. "Support S3 Access Points".
If that patch is backported later, that test suite MUST be
updated to the latest version.

Contributed by Steve Loughran

Change-Id: I8d2b71781ee8472b16469531f9cd0de32dd3356f
2022-01-18 12:20:12 +00:00
Viraj Jasani 5e9e779ed2
HADOOP-17152. Provide Hadoop's own Lists utility to reduce dependency on Guava (#3061)
Change-Id: I52e55b9d9826ad661e9ad7dc15f007aa168f0fe1
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2022-01-18 11:57:25 +00:00
Gera Shegalov 6c58f83b78 YARN-11055. Add missing newline in cgroups-operations.c (#3851)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit a94e9fcbde)
2022-01-17 16:21:13 +09:00
Xiangyi Zhu b5e7f59e53
HDFS-16043. Add markedDeleteBlockScrubberThread to delete blocks asynchronously (#3882). Contributed by Xiangyi Zhu.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2022-01-15 23:18:05 +08:00
Jackson Wang 926222a0d0 HDFS-16420. Avoid deleting unique data blocks when deleting redundancy striped blocks. (#3880)
Reviewed-by: litao <tomleescut@gmail.com>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
(cherry picked from commit d8862822d2)
2022-01-14 22:40:58 +09:00
ahmarsuhail 6649c2813e
HADOOP-16223. Remove misleading fs.s3a.delegation.tokens.enabled prompt (#3879)
Contributed by Ahmar Suhail

Change-Id: I6a33043831a059325c58b0f76c925e52c6ae14f7
2022-01-12 17:27:53 +00:00
Mukund Thakur 60c1c6d93c HADOOP-18065. ExecutorHelper.logThrowableFromAfterExecute() is too noisy. (#3860)
Downgrading warn logs to debug in case of InterruptedException

Contributed By: Mukund Thakur
2022-01-10 13:52:02 +05:30
monthonk 7dd8e900f8
HADOOP-14334. S3 SSEC tests to downgrade when running against a mandatory encryption object store (#3870)
Contributed by Monthon Klongklaew

Change-Id: Ib275c9690bbc90170c6a442ded198fe006c20bc1
2022-01-09 18:06:27 +00:00
Ayush Saxena 5edb33b5ed
HADOOP-18056. DistCp: Filter duplicates in the source paths. (#3825). Contributed by Ayush Saxena.
Reviewed-by: tomscut <litao@bigo.sg>
Reviewed-by: Steve Loughran <stevel@apache.org>
2022-01-05 23:53:55 +05:30
Ashutosh Gupta 6b83fe4a00 HDFS-16410. Insecure Xml parsing in OfflineEditsXmlLoader (#3854)
Contributed by Ashutosh Gupta
2022-01-05 10:00:23 -08:00
liever18 3a82899493 HDFS-16408. Ensure LeaseRecheckIntervalMs is greater than zero (#3856)
(cherry picked from commit e1d0aa9ee5)
2022-01-05 15:46:21 +00:00
Wei-Chiu Chuang 350b51f287 Make upstream aware of 3.3.1 release
2022-01-04 14:48:49 -08:00
Ashutosh Gupta bad3a0964c HDFS-16409. Fix typo: testHasExeceptionsReturnsCorrectValue -> testHasExceptionsReturnsCorrectValue (#3835)
Reviewed-by: Fei Hui <feihui.ustc@gmail.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 9eea0e28f2)
2022-01-04 13:25:56 +09:00
Ayush Saxena 53249a40db
HADOOP-18061. Update the year to 2022. (#3845). Contributed by Ayush Saxena.
Reviewed-by: Akira Ajisaka <aajisaka@apache.org>
2022-01-04 07:59:45 +05:30
jianghuazhu fd75c4a158 HADOOP-18063. Remove unused import AbstractJavaKeyStoreProvider in Shell class. (#3846)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 7398a0f1b2)
2022-01-04 11:26:07 +09:00
Igor Dvorzhak 5d72fdfcb2 HADOOP-13464. Upgrade Gson dependency to version 2.8.9 (#2524)
Change-Id: Ifd3fb9ec6ebfc8874bb799bc198219511fe55a2f

Update pom.xml

Update pom.xml

(cherry picked from commit 795054882a)
2021-12-30 21:37:14 +00:00
Ashutosh Gupta 6535a183b2 HDFS-14099. Unknown frame descriptor when decompressing multiple frames (#3836)
Co-authored-by: xuzq <xuzengqiang@kuaishou.com>
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit caab29ec88)
2021-12-28 22:00:48 +09:00
Akira Ajisaka cd30687a15 Revert "HDFS-14099. Unknown frame descriptor when decompressing multiple frames (#3836)"
This reverts commit 05b43f2057.
2021-12-28 21:51:40 +09:00
Ashutosh Gupta 05b43f2057 HDFS-14099. Unknown frame descriptor when decompressing multiple frames (#3836)
Co-authored-by: xuzq <xuzengqiang@kuaishou.com>
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit caab29ec88)
2021-12-28 21:49:06 +09:00
Akira Ajisaka 7cd52000e0 HADOOP-18045. Disable TestDynamometerInfra (#3829)
Reviewed-by: Fei Hui <feihui.ustc@gmail.com>
(cherry picked from commit dba139cd0f)
2021-12-28 13:23:05 +09:00