5565 Commits

Author SHA1 Message Date
Steve Loughran 708a0ce21b
HADOOP-13704. Optimized S3A getContentSummary()
Optimize the scan for s3 by performing a deep tree listing,
inferring directory counts from the paths returned.

Contributed by Ahmar Suhail.

Change-Id: I26ffa8c6f65fd11c68a88d6e2243b0eac6ffd024
2022-03-22 13:21:12 +00:00
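The deep-listing idea in HADOOP-13704 above can be illustrated with a small standalone sketch. It is not the S3A implementation: the `Entry` and `Summary` types, the relative-key layout, and the convention that a key ending in "/" is a directory marker are all assumptions made for this example. It shows how directory counts can be inferred from the paths in one flat listing, without listing each directory separately.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Sketch: derive file and directory counts from one flat, deep listing. */
final class DeepListingSummary {

    /** Hypothetical listed object; the key is relative to the summarized directory. */
    static final class Entry {
        final String key;
        final long size;

        Entry(String key, long size) {
            this.key = key;
            this.size = size;
        }
    }

    /** Totals gathered from the listing (directories below the base only). */
    static final class Summary {
        final long fileCount;
        final long directoryCount;
        final long length;

        Summary(long fileCount, long directoryCount, long length) {
            this.fileCount = fileCount;
            this.directoryCount = directoryCount;
            this.length = length;
        }
    }

    static Summary summarize(List<Entry> listing) {
        Set<String> directories = new HashSet<>();
        long files = 0;
        long bytes = 0;
        for (Entry entry : listing) {
            String path = entry.key;
            if (path.endsWith("/")) {
                // Assumed convention: a key ending in "/" is a directory marker.
                path = path.substring(0, path.length() - 1);
                if (!path.isEmpty()) {
                    directories.add(path);
                }
            } else {
                files++;
                bytes += entry.size;
            }
            // Every parent prefix of the path is an implied directory.
            int slash = path.lastIndexOf('/');
            while (slash > 0) {
                path = path.substring(0, slash);
                if (!directories.add(path)) {
                    break; // this directory and its ancestors are already recorded
                }
                slash = path.lastIndexOf('/');
            }
        }
        return new Summary(files, directories.size(), bytes);
    }

    public static void main(String[] args) {
        Summary s = summarize(Arrays.asList(
            new Entry("a/b/c.txt", 10),
            new Entry("a/d.txt", 5),
            new Entry("e/", 0)));
        System.out.println(s.fileCount + " files, "
            + s.directoryCount + " directories, " + s.length + " bytes");
    }
}
```

The set of directory paths means a directory is counted once, however many files sit beneath it.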
Owen O'Malley 1d5650c4d0
HDFS-13248: Namenode needs to use the actual client IP when going through the
RBF proxy. There is a new configuration knob dfs.namenode.ip-proxy-users that configures
the list of users that can set their client IP address using the client context.

Fixes #4081
2022-03-21 09:27:35 -07:00
Abhishek Das da9970dd69 HADOOP-18129: Change URI to String in INodeLink to reduce memory footprint of ViewFileSystem
Fixes #3996
2022-03-17 17:25:55 -07:00
Steve Loughran 9037f9a334
HADOOP-18162. hadoop-common support for MAPREDUCE-7341 Manifest Committer
* New statistic names in StoreStatisticNames
  (for joint use with s3a committers)
* Improvements to IOStatistics implementation classes
* RateLimiting wrapper to guava RateLimiter
* S3A committer Tasks moved over as TaskPool and
  added support for RemoteIterator
* JsonSerialization.load() to fail fast if source does not exist

+ tests.

This commit is a prerequisite for the main MAPREDUCE-7341 Manifest Committer
patch.

Contributed by Steve Loughran

Change-Id: Ia92e2ab5083ac3d8d3d713a4d9cb3e9e0278f654
2022-03-17 11:20:53 +00:00
Xing Lin 8b8158f02d
HADOOP-18144: getTrashRoot in ViewFileSystem should return a path in ViewFS.
To get the new behavior, define fs.viewfs.trash.force-inside-mount-point to be true.

If the trash root for path p is in the same mount point as path p,
and one of:
* The mount point isn't at the top of the target fs.
* The resolved path of path p is root (e.g. it is the fallback FS).
* The trash root isn't in user's target fs home directory.
get the corresponding viewFS path for the trash root and return it.
Otherwise, use <mnt>/.Trash/<user>.

Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
2022-03-14 11:29:48 -07:00
Owen O'Malley 7b5eac27ff
HDFS-16495: RBF should prepend the client ip rather than append it.
Fixes #4054

Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
2022-03-14 10:21:35 -07:00
Mukund Thakur 672e380c4f
HADOOP-18112: Implement paging during multi object delete. (#4045)
S3 does not support multi-object delete requests of more than 1000 keys;
they fail with a MalformedXML error. Requests are therefore paged to
reduce the number of keys in a single request. The page size can be
configured using "fs.s3a.bulk.delete.page.size".

 Contributed By: Mukund Thakur
2022-03-11 13:05:45 +05:30
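The paging described in HADOOP-18112 above amounts to splitting the key list into bounded pages and issuing one delete request per page. The following is a minimal sketch of that splitting step only; the class and method names are hypothetical, and the actual S3A code and its call to the S3 client are not shown.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: split a list of keys into pages no larger than a configured size. */
final class BulkDeletePaging {

    /**
     * Partition keys into consecutive pages of at most pageSize elements.
     * pageSize stands in for the value of "fs.s3a.bulk.delete.page.size".
     */
    static <T> List<List<T>> paginate(List<T> keys, int pageSize) {
        if (pageSize < 1) {
            throw new IllegalArgumentException("page size must be positive: " + pageSize);
        }
        List<List<T>> pages = new ArrayList<>();
        for (int start = 0; start < keys.size(); start += pageSize) {
            int end = Math.min(start + pageSize, keys.size());
            // Copy the sublist so each page is independent of the source list.
            pages.add(new ArrayList<>(keys.subList(start, end)));
        }
        return pages;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            keys.add("dir/file-" + i);
        }
        // Each page would become one multi-object delete request.
        for (List<String> page : paginate(keys, 1000)) {
            System.out.println("delete request with " + page.size() + " keys");
        }
    }
}
```

With a page size at or below 1000, no single request exceeds the limit mentioned in the commit message.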
Gautham B A d0fa9b5775
HADOOP-18155. Refactor tests in TestFileUtil (#4053) 2022-03-10 22:02:38 +05:30
Duo Zhang db36747e83
HADOOP-17526 Use Slf4jRequestLog for HttpRequestLog (#4050)
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2022-03-10 10:15:09 +08:00
Viraj Jasani 66b72406bd
HADOOP-18131. Upgrade maven enforcer plugin and relevant dependencies (#4000)
Reviewed-by: Akira Ajisaka <aajisaka@apache.org>
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2022-03-08 17:27:04 +09:00
Viraj Jasani 278568203b
HDFS-16481. Provide support to set Http and Rpc ports in MiniJournalCluster (#4028). Contributed by Viraj Jasani. 2022-03-04 22:17:48 +05:30
Chao Sun f800b65b40 Make upstream aware of 3.3.2 release 2022-03-02 19:14:50 -08:00
ted12138 902a7935e9
HADOOP-18128. Fix typo issues of outputstream.md (#4025) 2022-03-02 18:25:56 +08:00
Ayush Saxena d05655d2ad
Revert "HADOOP-18082. Add debug log when RPC#Reader gets a Call. (#3891). Contributed by JiangHua Zhu."
Exposes a race condition which leads to test failures in YARN (HADOOP-18143).

This reverts commit 2025243fbf.
2022-02-28 21:44:24 +05:30
Owen O'Malley 12fa38d546
HADOOP-18139: Allow configuration of zookeeper server principal.
Fixes #4024

Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
2022-02-24 15:01:50 -08:00
monthonk 1f157f802d
HADOOP-17386. Change default fs.s3a.buffer.dir to be under Yarn container path on yarn applications (#3908)
Co-authored-by: Monthon Klongklaew <monthonk@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-02-22 13:50:27 +09:00
jianghuazhu 589695c6a9
HDFS-16316. Improve DirectoryScanner: add regular file check related block. (#3861) 2022-02-22 10:15:19 +08:00
Steve Loughran cae749b076
HADOOP-18136. Verify FileUtils.unTar() handling of missing .tar files.
Contributed by Steve Loughran

Change-Id: I73af19d2e2e41f4ba686c470726a80c3903a1950
2022-02-21 17:08:56 +00:00
jianghuazhu 2025243fbf
HADOOP-18082. Add debug log when RPC#Reader gets a Call. (#3891). Contributed by JiangHua Zhu. 2022-02-17 01:49:45 +05:30
Chentao Yu 19d90e62fb HADOOP-18109. Ensure that default permissions of directories under internal ViewFS directories are the same as directories on target filesystems. Contributed by Chentao Yu. (#3953) 2022-02-15 15:58:24 -08:00
GuoPhilipse b68964336d
HDFS-16449. Fix hadoop web site release notes and changelog not available (#3967)
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-02-14 05:38:28 +09:00
daimin 0e74f1e467
Fix thread safety of EC decoding during concurrent preads (#3881) 2022-02-11 10:20:00 +08:00
Xing Lin ca8ba24051 HADOOP-18110. ViewFileSystem: Add Support for Localized Trash Root
Fixes #3956
2022-02-10 16:43:04 -08:00
Steve Loughran efdec92cab
HADOOP-18091. S3A auditing leaks memory through ThreadLocal references (#3930)
Adds a new map type WeakReferenceMap, which stores weak
references to values, and a WeakReferenceThreadMap subclass
to more closely resemble a thread local type, as it is a
map of threadId to value.

Construct it with a factory method and optional callback
for notification on loss and regeneration.

    WeakReferenceThreadMap<WrappingAuditSpan> activeSpan =
        new WeakReferenceThreadMap<>(
            (k) -> getUnbondedSpan(),
            this::noteSpanReferenceLost);

This is used in ActiveAuditManagerS3A for span tracking.

Relates to
* HADOOP-17511. Add an Audit plugin point for S3A
* HADOOP-18094. Disable S3A auditing by default.

Contributed by Steve Loughran.
2022-02-10 12:31:41 +00:00
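To make the weak-reference map in HADOOP-18091 above more concrete, here is a simplified standalone sketch of the pattern it describes: values are held through weak references, a factory regenerates a missing value, and an optional callback is told when a previously stored value was lost. `WeakValueMap` and `WeakThreadMap` are hypothetical names for this example, not the Hadoop classes, and the sketch does not try to make the check-then-create step atomic.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;
import java.util.function.Function;

/** Sketch: a map that holds its values only through weak references. */
class WeakValueMap<K, V> {
    private final Map<K, WeakReference<V>> references = new ConcurrentHashMap<>();
    private final Function<? super K, ? extends V> factory;
    private final Consumer<? super K> referenceLost; // may be null

    WeakValueMap(Function<? super K, ? extends V> factory,
                 Consumer<? super K> referenceLost) {
        this.factory = factory;
        this.referenceLost = referenceLost;
    }

    /** Return the live value for the key, creating it through the factory if absent. */
    V get(K key) {
        WeakReference<V> ref = references.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value != null) {
            return value;
        }
        if (ref != null && referenceLost != null) {
            // An entry existed but its value has been garbage collected.
            referenceLost.accept(key);
        }
        V created = factory.apply(key);
        references.put(key, new WeakReference<>(created));
        return created;
    }
}

/** Sketch: the same map keyed by thread ID, to resemble a thread local. */
final class WeakThreadMap<V> extends WeakValueMap<Long, V> {
    WeakThreadMap(Function<? super Long, ? extends V> factory,
                  Consumer<? super Long> referenceLost) {
        super(factory, referenceLost);
    }

    V getForCurrentThread() {
        return get(Thread.currentThread().getId());
    }
}
```

Because the map never holds a strong reference, a value stays alive only while some caller still references it. That is the property that avoids the leak a plain ThreadLocal would cause.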
Abhishek Das 3684c7f66a
HADOOP-18100: Change scope of inner classes in InodeTree to make them accessible outside package
Fixes #3950

Signed-off-by: Owen O'Malley <omalley@apache.org>
2022-02-03 16:28:04 -08:00
Ayush Saxena aeae5716cc
Revert "HADOOP-18024. SocketChannel is not closed when IOException happens in Server$Listener.doAccept (#3719)"
This reverts commit 6ed01585eb.

Breaks TestIPC#testIOEOnListenerAccept
2022-02-01 14:11:25 +05:30
Li MingXiang e17c96a40a
HDFS-16429. Add DataSetLockManager to manage fine-grain locks for FsDataSetImpl. (#3900). Contributed by limingxiang.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2022-01-27 16:53:21 +08:00
Viraj Jasani 4faac58841
HADOOP-18089. Test coverage for Async profiler servlets (#3913)
Reviewed-by: Akira Ajisaka <akiraaj@amazon.com>
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
2022-01-26 11:24:16 +08:00
Xing Lin 0d17b629ff
HADOOP-18093. Better exception handling for testFileStatusOnMountLink() in ViewFsBaseTest.java (#3918). Contributed by Xing Lin.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-01-25 19:40:18 +05:30
Steve Loughran 14ba19af06
HADOOP-17409. Remove s3guard from S3A module (#3534)
Completely removes S3Guard support from the S3A codebase.

If the connector is configured to use any metastore other than
the null and local stores (i.e. DynamoDB is selected) the s3a client
will raise an exception and refuse to initialize.

This is to ensure that there is no mix of S3Guard enabled and disabled
deployments with the same configuration but different hadoop releases
-it must be turned off completely.

The "hadoop s3guard" command has been retained -but the supported
subcommands have been reduced to those which are not purely S3Guard
related: "bucket-info" and "uploads".

This is a major change in terms of the number of files
changed; before cherry picking subsequent s3a patches into
older releases, this patch will probably need backporting
first.

Goodbye S3Guard, your work is done. Time to die.

Contributed by Steve Loughran.
2022-01-17 18:08:57 +00:00
ahmarsuhail 7542677470
HADOOP-16223. Remove misleading fs.s3a.delegation.tokens.enabled prompt (#3879)
Contributed by Ahmar Suhail
2022-01-12 17:25:17 +00:00
Viraj Jasani 93294f0329
HADOOP-18077. ProfileOutputServlet unable to proceed due to NPE (#3875) 2022-01-12 16:20:34 +08:00
litao 39efbc6b6f
HDFS-16404. Fix typo for CachingGetSpaceUsed (#3844). Contributed by tomscut. 2022-01-09 16:41:10 +08:00
Viraj Jasani f64fda0f00
HADOOP-18055. Async Profiler endpoint for Hadoop daemons (#3824)
Reviewed-by: Akira Ajisaka <aajisaka@apache.org>
2022-01-06 17:56:49 +08:00
Mukund Thakur da0a6ba1ce
HADOOP-18065. ExecutorHelper.logThrowableFromAfterExecute() is too noisy. (#3860)
Downgrading warn logs to debug in case of InterruptedException

Contributed By: Mukund Thakur
2022-01-06 10:54:27 +05:30
jianghuazhu 7398a0f1b2
HADOOP-18063. Remove unused import AbstractJavaKeyStoreProvider in Shell class. (#3846)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-01-04 11:25:13 +09:00
jianghuazhu 43afd1753a
HDFS-16394. RPCMetrics increases the number of handlers in processing. (#3822) 2021-12-31 16:40:14 +08:00
Ashutosh Gupta caab29ec88
HDFS-14099. Unknown frame descriptor when decompressing multiple frames (#3836)
Co-authored-by: xuzq <xuzengqiang@kuaishou.com>
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2021-12-28 21:44:38 +09:00
Viraj Jasani 04b6b9a87b
HADOOP-16908. Prune Jackson 1 from the codebase and restrict its usage for future (#3789)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2021-12-20 16:01:34 +09:00
Dhananjay Badaya 4483607a4e
HADOOP-13500. Synchronizing iteration of Configuration properties object (#3775)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2021-12-17 16:05:46 +09:00
PHILO-HE 8e08f43e03
HDFS-16014: Fix an issue in checking native pmdk lib by 'hadoop checknative' command (#3762) 2021-12-14 14:45:12 +05:30
Wei-Chiu Chuang d7c5400fbc
HADOOP-17982. OpensslCipher initialization error should log a WARN message. (#3599)
Change-Id: I070fc4784679b3be73aa3a11201bbae23c20ad4e
2021-12-10 18:14:04 +09:00
Akira Ajisaka 9b9e2ef87f
HADOOP-18040. Use maven.test.failure.ignore instead of ignoreTestFailure (#3774)
Reviewed-by: Masatake Iwasaki <iwasakims@apache.org>
2021-12-10 01:36:31 +09:00
Haoze Wu 6ed01585eb
HADOOP-18024. SocketChannel is not closed when IOException happens in Server$Listener.doAccept (#3719) 2021-12-08 18:48:43 +09:00
Andras Gyori 47ea0d734f
HADOOP-18021. Provide a public wrapper of Configuration#substituteVars (#3710)
Contributed by Andras Gyori
2021-12-03 16:44:58 +00:00
Desmond Sisson df4197592f
HADOOP-18029: Update CompressionCodecFactory to handle uppercase file extensions (#3739)
Co-authored-by: Desmond Sisson <sissonde@amazon.com>
2021-12-01 15:36:54 -08:00
smarthan 932a78fe38
HADOOP-18023. Allow cp command to run with multi threads. (#3721) 2021-11-29 12:45:08 +00:00
Takanobu Asanuma 9c887e5b82
HADOOP-18014. CallerContext should not include some characters. (#3698)
Reviewed-by: Viraj Jasani <vjasani@apache.org>
Reviewed-by: Mingliang Liu <liuml07@apache.org>
Reviewed-by: Hui Fei <ferhui@apache.org>
2021-11-25 14:05:04 +09:00
huhaiyang 99b161dec7
HADOOP-17995. Stale record should be removed when DataNodePeerMetrics#dumpSendPacketDownstreamAvgInfoAsJson (#3708) 2021-11-25 10:20:42 +08:00
Steve Loughran 98fe0d0fc3
HADOOP-17979. Add Interface EtagSource to allow FileStatus subclasses to provide etags (#3633)
Contributed by Steve Loughran
2021-11-24 17:33:12 +00:00