Commit Graph

5602 Commits

Author SHA1 Message Date
Owen O'Malley c52b97d084 HDFS-16495: RBF should prepend the client ip rather than append it.
Fixes #4054

Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
2022-03-14 10:30:29 -07:00
Takanobu Asanuma 52fb9d7ce2 HADOOP-18014. CallerContext should not include some characters. (#3698)
Reviewed-by: Viraj Jasani <vjasani@apache.org>
Reviewed-by: Mingliang Liu <liuml07@apache.org>
Reviewed-by: Hui Fei <ferhui@apache.org>

Cherry-picked from 9c887e5b by Owen O'Malley
2022-03-14 10:29:37 -07:00
litao 0029f22d7d HADOOP-18003. Add a method appendIfAbsent for CallerContext (#3644)
Cherry-picked from 573b358f by Owen O'Malley
2022-03-14 10:29:23 -07:00
litao f9d40ed7b7 HDFS-16266. Add remote port information to HDFS audit log (#3538)
Reviewed-by: Akira Ajisaka <aajisaka@apache.org>
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
Cherry-picked from 359b03c8 by Owen O'Malley
2022-03-14 10:29:11 -07:00
Hui Fei 8e129e5b8d HDFS-13293. RBF: The RouterRPCServer should transfer client IP via CallerContext to NamenodeRpcServer (#2363)
Cherry-picked from 518a212c by Owen O'Malley
2022-03-14 10:28:55 -07:00
Fei Hui 5a38ed2f22 HADOOP-17276. Extend CallerContext to make it include many items (#2327)
Cherry-picked from d0d10f7e by Owen O'Malley
2022-03-14 10:28:38 -07:00
Wei-Chiu Chuang 743db6e7b4
HADOOP-18155. Refactor tests in TestFileUtil (#4063)
(cherry picked from commit d0fa9b5775)

 Conflicts:
	hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileUtil.java
	hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestFileUtil.java

Co-authored-by: Gautham B A <gautham.bangalore@gmail.com>
2022-03-14 09:40:17 +09:00
Mukund Thakur e0619b702a HADOOP-18112: Implement paging during multi object delete. (#4045)
Multi object delete of size more than 1000 is not supported by S3 and 
fails with MalformedXML error. So implementing paging of requests to 
reduce the number of keys in a single request. Page size can be configured
using "fs.s3a.bulk.delete.page.size" 

 Contributed By: Mukund Thakur
2022-03-11 13:16:51 +05:30
Chao Sun b174aaed57 Make upstream aware of 3.3.2 release 2022-03-02 19:10:30 -08:00
Owen O'Malley 1a3060d41e
HADOOP-18139: Allow configuration of zookeeper server principal.
Fixes #4024

Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
2022-02-24 15:15:13 -08:00
Steve Loughran 94a0a04113
HADOOP-18136. Verify FileUtils.unTar() handling of missing .tar files.
Contributed by Steve Loughran

Change-Id: I3856afa821dbc8c2e3cb1cbe33793ec1734e2e24
2022-02-21 17:09:36 +00:00
Chentao Yu d14a7c6ee5 HADOOP-18109. Ensure that default permissions of directories under internal ViewFS directories are the same as directories on target filesystems. Contributed by Chentao Yu. (3953)
(cherry picked from commit 19d90e62fb)
2022-02-15 16:48:13 -08:00
GuoPhilipse 7512714475
HDFS-16449. Fix hadoop web site release notes and changelog not available (#3967)
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit b68964336d)
2022-02-14 05:40:16 +09:00
daimin 9071c9646c
Fix thread safety of EC decoding during concurrent preads (#3881)
(cherry picked from commit 0e74f1e467)
2022-02-11 10:20:45 +08:00
Steve Loughran 088684ec60
HADOOP-18091. S3A auditing leaks memory through ThreadLocal references (#3930)
Adds a new map type WeakReferenceMap, which stores weak
references to values, and a WeakReferenceThreadMap subclass
to more closely resemble a thread local type, as it is a
map of threadId to value.

Construct it with a factory method and optional callback
for notification on loss and regeneration.

 WeakReferenceThreadMap<WrappingAuditSpan> activeSpan =
      new WeakReferenceThreadMap<>(
          (k) -> getUnbondedSpan(),
          this::noteSpanReferenceLost);

This is used in ActiveAuditManagerS3A for span tracking.

Relates to
* HADOOP-17511. Add an Audit plugin point for S3A
* HADOOP-18094. Disable S3A auditing by default.

Contributed by Steve Loughran.

Change-Id: Ibf7bb082fd47298f7ebf46d92f56e80ca9b2aaf8
2022-02-10 12:33:40 +00:00
Abhishek Das 8b03514eaf HADOOP-18100: Change scope of inner classes in InodeTree to make them accessible outside package
Fixes #3950

Signed-off-by: Owen O'Malley <omalley@apache.org>

Cherry-picked from 3684c7f6 by Owen O'Malley
2022-02-04 12:13:10 -08:00
Petre Bogdan Stolojan 664075f35d
HADOOP-17198. Support S3 Access Points (#3260)
Add support for S3 Access Points. This provides extra security as it
ensures applications are not working with buckets belong to third parties.

To bind a bucket to an access point, set the access point (ap) ARN,
which must be done for each specific bucket, using the pattern

fs.s3a.bucket.$BUCKET.accesspoint.arn = ARN

* The global/bucket option `fs.s3a.accesspoint.required` to
mandate that buckets must declare their access point.
* This is not compatible with S3Guard.

Consult the documentation for further details.

Contributed by Bogdan Stolojan

(this commit contains the changes to TestArnResource from HADOOP-18068,
 "upgrade AWS SDK to 1.12.132" so that it works with the later SDK.)

Change-Id: I3fac213e52ca6ec1c813effb8496c353964b8e1b
2022-02-04 16:21:35 +00:00
Xing Lin d613776b64
HADOOP-18093. Better exception handling for testFileStatusOnMountLink() in ViewFsBaseTest.java (#3918). Contributed by Xing Lin. (#3929)
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
(cherry picked from commit 0d17b629ff)
2022-01-26 21:55:32 +05:30
Steve Loughran 8ccc586af6
HADOOP-17409. Remove s3guard from S3A module (#3534)
Completely removes S3Guard support from the S3A codebase.

If the connector is configured to use any metastore other than
the null and local stores (i.e. DynamoDB is selected) the s3a client
will raise an exception and refuse to initialize.

This is to ensure that there is no mix of S3Guard enabled and disabled
deployments with the same configuration but different hadoop releases
-it must be turned off completely.

The "hadoop s3guard" command has been retained -but the supported
subcommands have been reduced to those which are not purely S3Guard
related: "bucket-info" and "uploads".

This is major change in terms of the number of files
changed; before cherry picking subsequent s3a patches into
older releases, this patch will probably need backporting
first.

Goodbye S3Guard, your work is done. Time to die.

Contributed by Steve Loughran.
2022-01-18 18:04:48 +00:00
Viraj Jasani 5e9e779ed2
HADOOP-17152. Provide Hadoop's own Lists utility to reduce dependency on Guava (#3061)
Change-Id: I52e55b9d9826ad661e9ad7dc15f007aa168f0fe1
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2022-01-18 11:57:25 +00:00
ahmarsuhail 6649c2813e
HADOOP-16223. Remove misleading fs.s3a.delegation.tokens.enabled prompt (#3879)
Contributed by Ahmar Suhail

Change-Id: I6a33043831a059325c58b0f76c925e52c6ae14f7
2022-01-12 17:27:53 +00:00
Mukund Thakur 60c1c6d93c HADOOP-18065. ExecutorHelper.logThrowableFromAfterExecute() is too noisy. (#3860)
Downgrading warn logs to debug in case of InterruptedException

Contributed By: Mukund Thakur
2022-01-10 13:52:02 +05:30
Wei-Chiu Chuang 350b51f287 Make upstream aware of 3.3.1 release 2022-01-04 14:48:49 -08:00
jianghuazhu fd75c4a158 HADOOP-18063. Remove unused import AbstractJavaKeyStoreProvider in Shell class. (#3846)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 7398a0f1b2)
2022-01-04 11:26:07 +09:00
Ashutosh Gupta 6535a183b2 HDFS-14099. Unknown frame descriptor when decompressing multiple frames (#3836)
Co-authored-by: xuzq <xuzengqiang@kuaishou.com>
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit caab29ec88)
2021-12-28 22:00:48 +09:00
Akira Ajisaka cd30687a15 Revert "HDFS-14099. Unknown frame descriptor when decompressing multiple frames (#3836)"
This reverts commit 05b43f2057.
2021-12-28 21:51:40 +09:00
Ashutosh Gupta 05b43f2057 HDFS-14099. Unknown frame descriptor when decompressing multiple frames (#3836)
Co-authored-by: xuzq <xuzengqiang@kuaishou.com>
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit caab29ec88)
2021-12-28 21:49:06 +09:00
Dhananjay Badaya e7b1f87665 HADOOP-13500. Synchronizing iteration of Configuration properties object (#3775)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 4483607a4e)
2021-12-17 16:07:46 +09:00
Akira Ajisaka 35c5c6bb83 HADOOP-18040. Use maven.test.failure.ignore instead of ignoreTestFailure (#3774)
Reviewed-by: Masatake Iwasaki <iwasakims@apache.org>
(cherry picked from commit 9b9e2ef87f)

 Conflicts:
	hadoop-tools/hadoop-federation-balance/pom.xml
2021-12-10 01:38:26 +09:00
Steve Loughran 67eaf5aa9f
HADOOP-17979. Add Interface EtagSource to allow FileStatus subclasses to provide etags (#3633)
Contributed by Steve Loughran

Change-Id: I596205d788f623114c12962941445432e2036c34
2021-11-29 16:20:55 +00:00
smarthan bc40a41064 HADOOP-18023. Allow cp command to run with multi threads. (#3721)
(cherry picked from commit 932a78fe38)
2021-11-29 12:47:02 +00:00
Viraj Jasani 6094e1ec9a
HDFS-16171. De-flake testDecommissionStatus (#3280)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 6342d5e523)

 Conflicts:
	hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDecommissioningStatus.java
2021-11-25 14:15:10 +09:00
Istvan Fajth 48e95d8109 HADOOP-17975. Fallback to simple auth does not work for a secondary DistributedFileSystem instance. (#3579)
(cherry picked from commit ae3ba45db5)
2021-11-24 10:47:49 +00:00
smarthan cbb3ba135c HADOOP-17998. Allow get command to run with multi threads. (#3645)
(cherry picked from commit 63018dc73f)

 Conflicts:
	hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CopyCommands.java
2021-11-22 12:14:32 +00:00
Abhishek Das f456dc1837 HADOOP-17999. No-op implementation of setWriteChecksum and setVerifyChecksum in ViewFileSystem. Contributed by Abhishek Das. (#3639)
(cherry picked from commit 54a1d78e16)
2021-11-16 22:40:24 -08:00
litao 026d5860cb
HDFS-16315. Add metrics related to Transfer and NativeCopy for DataNode (#3666) 2021-11-17 11:06:53 +09:00
Chao Sun e079fa6577 Preparing for 3.3.3 development 2021-11-16 16:02:34 -08:00
litao 340dee4469
HDFS-16319. Add metrics doc for ReadLockLongHoldCount and WriteLockLongHoldCount (#3653). Contributed by tomscut.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2021-11-14 20:12:13 +05:30
litao 421013825f
HADOOP-18005. Correct log format for LdapGroupsMapping (#3647). Contributed by tomscut.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2021-11-14 20:01:09 +05:30
Steve Loughran a68671eaf7
HADOOP-17928. Syncable: S3A to warn and downgrade (#3585)
This switches the default behavior of S3A output streams
to warning that Syncable.hsync() or hflush() have been
called; it's not considered an error unless the defaults
are overridden.

This avoids breaking applications which call the APIs,
at the risk of people trying to use S3 as a safe store
of streamed data (HBase WALs, audit logs etc).

Contributed by Steve Loughran.

Change-Id: I0a02ec1e622343619f147f94158c18928a73a885
2021-11-04 14:41:42 +00:00
Mehakmeet Singh bd077c3814
HADOOP-17953. S3A: Tests to lookup global or per-bucket configuration for encryption algorithm (#3525)
Followup to S3-CSE work of HADOOP-13887

Contributed by Mehakmeet Singh
2021-10-21 12:03:50 +01:00
Szilard Nemeth 6f45666d0b HADOOP-17857. Check real user ACLs in addition to proxied user ACLs. Contributed by Eric Payne
(cherry picked from commit 5428d36b56)
2021-10-19 20:40:30 +00:00
Steve Loughran b8f3e54ff7 HADOOP-17945. JsonSerialization raises EOFException reading JSON data stored on google GCS (#3501)
Contributed By: Steve Loughran
2021-10-19 15:36:10 +05:30
Xing Lin af920f138b HADOOP-16532. Fix TestViewFsTrash to use the correct homeDir. Contributed by Xing Lin. (#3514)
(cherry picked from commit 97c0f96879)
2021-10-13 14:58:08 -07:00
Masatake Iwasaki 9e2936f8d1
HADOOP-17424. Replace HTrace with No-Op tracer (#3520)
(cherry picked from commit 1a205cc3ad)

 Conflicts:
	hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java
	hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
	hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/tracing/TestTracing.java

Co-authored-by: Siyao Meng <50227127+smengcl@users.noreply.github.com>
2021-10-12 00:07:09 +09:00
Viraj Jasani 77ee5a4266
HADOOP-17950. Provide replacement for deprecated APIs of commons-io IOUtils (#3515)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 8071dbb9c6)
2021-10-07 11:00:19 +09:00
Ahmed Hussein 2cdc6a245d HADOOP-17930. implement non-guava Precondition checkState (#3522)
Reviewed-by: Viraj Jasani <vjasani@apache.org>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
(cherry picked from commit c36f9402dc)
2021-10-07 10:57:20 +09:00
Viraj Jasani fd3069d70c HADOOP-17947. Additional element types for VisibleForTesting (ADDENDUM) (#3521)
(cherry picked from commit 783e4805e7)
2021-10-06 02:18:54 +09:00
Mehakmeet Singh 769059c2f5
HADOOP-17871. S3A CSE: minor tuning (#3412)
This migrates the fs.s3a-server-side encryption configuration options
to a name which covers client-side encryption too.

fs.s3a.server-side-encryption-algorithm becomes fs.s3a.encryption.algorithm
fs.s3a.server-side-encryption.key becomes fs.s3a.encryption.key

The existing keys remain valid, simply deprecated and remapped
to the new values. If you want server-side encryption options
to be picked up regardless of hadoop versions, use
the old keys.

(the old key also works for CSE, though as no version of Hadoop
with CSE support has shipped without this remapping, it's less
relevant)

Contributed by: Mehakmeet Singh

Change-Id: I51804b21b287dbce18864f0a6ad17126aba2b281
2021-10-05 11:39:25 +01:00
Mehakmeet Singh aee975a136
HADOOP-13887. Support S3 client side encryption (S3-CSE) using AWS-SDK (#2706)
This (big!) patch adds support for client side encryption in AWS S3,
with keys managed by AWS-KMS.

Read the documentation in encryption.md very, very carefully before
use and consider it unstable.

S3-CSE is enabled in the existing configuration option
"fs.s3a.server-side-encryption-algorithm":

fs.s3a.server-side-encryption-algorithm=CSE-KMS
fs.s3a.server-side-encryption.key=<KMS_KEY_ID>

You cannot enable CSE and SSE in the same client, although
you can still enable a default SSE option in the S3 console.

* Filesystem list/get status operations subtract 16 bytes from the length
  of all files >= 16 bytes long to compensate for the padding which CSE
  adds.
* The SDK always warns about the specific algorithm chosen being
  deprecated. It is critical to use this algorithm for ranged
  GET requests to work (i.e. random IO). Ignore.
* Unencrypted files CANNOT BE READ.
  The entire bucket SHOULD be encrypted with S3-CSE.
* Uploading files may be a bit slower as blocks are now
  written sequentially.
* The Multipart Upload API is disabled when S3-CSE is active.

Contributed by Mehakmeet Singh

Change-Id: Ie1a27a036a39db66a67e9c6d33bc78d54ea708a0
2021-10-05 11:37:41 +01:00