25979 Commits

Author SHA1 Message Date
Gautham B A 6415eb04e8
HDFS-16665. Fix duplicate sources for HDFS test (#4573)
* The library target hdfspp_test_shim_static is
  built using the following sources, which
  causes duplicate symbols to be defined -
  - hdfs_shim.c
  - ${LIBHDFSPP_BINDING_C}/hdfs.cc
* ${LIBHDFSPP_BINDING_C}/hdfs.cc is redundant
  and removing this fixes the issue.
2022-07-19 21:39:06 +05:30
Gautham B A 4fb799e6c5
HDFS-16464. Create only libhdfspp static libraries for Windows (#4571)
* Added the appropriate CMake flags and
  commands to enable only statically
  linked libraries and executables to
  be built on Windows.
2022-07-19 21:37:22 +05:30
Gautham B A 21b8952125
HDFS-16666. Pass CMake args for Windows in pom.xml (#4574)
* This PR passes the necessary CMake args in the
  pom.xml needed for building HDFS native client
  on Windows.
* These arguments are exposed as maven options
  and can be passed from the command-line.
2022-07-19 10:45:59 +05:30
Wei-Chiu Chuang a55ace7bc0
HADOOP-18079. Upgrade Netty to 4.1.77. (#3977)
Upgrade Netty to address:

CVE-2019-20444,
CVE-2019-20445,
CVE-2022-24823

Contributed by Wei-Chiu Chuang
2022-07-18 10:41:00 +01:00
PJ Fanning 34e548cb62
HADOOP-18332: remove rs-api dependency as it conflicts with jsr311-api (#4547)
This downgrades jackson from the version switched to in
HADOOP-18033 (2.13.0), to Jackson 2.12.7.
This removes the dependency on javax.ws.rs-api,
so avoiding runtime problems with applications using
jersey-core v1 and/or jsr311-api.

The 2.12.7 release still contains the fix for CVE-2020-36518.

Contributed by PJ Fanning
2022-07-17 21:37:54 +05:30
Gautham B A 440f4c2b28
HDFS-16654. Link OpenSSL lib for CMake deps check (#4538)
* The check_c_source_compiles fails on Windows
  while linking with an "unable to resolve
  external symbol" error.
* This PR links OpenSSL lib for this check to
  fix this issue.
2022-07-17 20:47:30 +05:30
xuzq 8774f17868
HADOOP-13144. Enhancing IPC client throughput via multiple connections per user (#4542) 2022-07-15 14:18:46 -07:00
RuinanGu 9376b65989
HDFS-16566 Erasure Coding: Recovery may cause excess replicas when busy DN exists (#4252) 2022-07-16 04:52:12 +08:00
Murali Krishna 2835174a4c
HDFS-16652. Upgrade jquery datatable version references to v1.10.19 (#4562) 2022-07-14 18:27:07 +05:30
Samrat 84ce592a85
YARN-11198. clean up numa resources from statestore (#4546)
* YARN-11198. clean up numa resources from levelDB

Co-authored-by: Deb <dbsamrat@3c22fba1b03f.ant.amazon.com>
2022-07-14 11:07:48 +05:30
xuzq 6f9c4359ec
HDFS-16283. RBF: reducing the load of renewLease() RPC (#4524). Contributed by ZanderXu.
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-07-14 07:26:40 +05:30
Ashutosh Gupta f1bd4e117e
HADOOP-18336.Tag FSDataInputStream.getWrappedStream() @Public/@Stable (#4555)
Contributed by: Ashutosh Gupta
2022-07-13 12:56:56 +01:00
HerCath 4c4a940da2
HADOOP-18217. ExitUtil synchronized blocks reduced. #4255
Reduce the ExitUtil synchronized block scopes so System.exit 
and Runtime.halt calls aren't within their boundaries,
so ExitUtil wrappers do not block each other.

Enlarged catches to all Throwables (not just Exceptions).

Contributed by Remi Catherinot
2022-07-13 12:35:44 +01:00
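The lock-scope change described in the HADOOP-18217 entry above can be illustrated with a minimal sketch. It is not Hadoop's ExitUtil: the `ExitGuard` class, its `terminate` and `first_code` methods, and the injected `do_exit` callback are all hypothetical names chosen for this example. The point is only that shared state is recorded under the lock while the potentially blocking exit call runs outside it.

```python
import threading


class ExitGuard:
    """Hypothetical exit wrapper; illustrates the locking pattern only."""

    def __init__(self, do_exit):
        self._lock = threading.Lock()
        self._first_code = None   # first exit code ever requested
        self._do_exit = do_exit   # injected so the sketch can be tested

    def terminate(self, code):
        # Hold the lock only long enough to record shared state.
        with self._lock:
            if self._first_code is None:
                self._first_code = code
        # The potentially blocking call runs outside the lock, so a
        # second caller never waits on the first caller's exit.
        self._do_exit(code)

    def first_code(self):
        with self._lock:
            return self._first_code
```

Because the lock here is not reentrant, a `do_exit` callback that itself calls `first_code()` would deadlock if the exit call were still made inside the synchronized section.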
Ashutosh Gupta 0ca4868aa2
HADOOP-18294. Ensure build folder exists before writing checksum file.ProtocRunner#writeChecksums (#4446)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-07-12 20:15:26 +09:00
Ashutosh Gupta 4e8c0b902e
MAPREDUCE-7201.Make Job History File Permissions configurable (#4507)
* MAPREDUCE-7201.Make Job History File Permissions configurable

Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
2022-07-11 11:34:52 +05:30
lmccay e11ba5930e
HADOOP-18074 - Partial/Incomplete groups list can be returned in LDAP… (#4503)
* HADOOP-18074 - Partial/Incomplete groups list can be returned in LDAP groups lookup
2022-07-11 01:03:44 -04:00
Akira Ajisaka 9b1d3579b4
Revert "MAPREDUCE-7388. Remove unused variable _eof in GzipCodec.cc (#4429)"
This reverts commit fac895828f.
2022-07-09 03:05:42 +09:00
cfg1234 fac895828f
MAPREDUCE-7388. Remove unused variable _eof in GzipCodec.cc (#4429)
Co-authored-by: cWX456268 <chenfengge1@huawei.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-07-09 02:51:49 +09:00
Ayush Saxena 96f8e5b6f4
HADOOP-15789. DistCp does not clean staging folder if class extends DistCp. Contributed by Lawrence Andrews. (#4534)
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-07-08 17:04:20 +05:30
Gautham B A 8e39e35bea
HDFS-16466. Implement Linux permission flags on Windows (#4526)
* HDFS-16466. Implement Linux permission flags on Windows

* statinfo.cc uses POSIX permission flags.
  These flags aren't available for Windows.
* This PR implements the equivalent flags
  on Windows to make this cross platform
  compatible.
2022-07-08 09:29:13 +05:30
Jinhu Wu 3ec4b932c1
HADOOP-18313: AliyunOSSBlockOutputStream should not mark the temporary file for deletion (#4502)
HADOOP-18313: AliyunOSSBlockOutputStream should not mark the temporary file for deletion. Contributed by wujinhu.
2022-07-06 14:23:46 +08:00
Ashutosh Gupta a432925f74
HADOOP-18321.Fix when to read an additional record from a BZip2 text file split (#4521)
* HADOOP-18321.Fix when to read an additional record from a BZip2 text file split

Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com> and Reviewed by Akira Ajisaka.
2022-07-06 10:00:14 +05:30
slfan1989 161b1fac2e
YARN-11169. Support moveApplicationAcrossQueues, getQueueInfo API's for Federation. (#4464) 2022-07-05 11:24:29 -07:00
cxzl25 ea46f49b04
HDFS-16638. Add isDebugEnabled check for debug blockLogs in BlockManager. (#4480). Contributed by dzcxzl.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-07-04 02:36:29 +05:30
jianghuazhu a333674785
HDFS-16647. Delete unused NameNode#FS_HDFS_IMPL_KEY. (#4525). Contributed by JiangHua Zhu.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-07-03 01:47:02 +05:30
Ashutosh Gupta 9e206fe0b6
HADOOP-18297. Upgrade dependency-check-maven to 7.1.1 (#4449)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
2022-07-03 04:04:55 +08:00
Ashutosh Gupta 151bb31c47
YARN-9403.GET /apps/{appid}/entities/YARN_APPLICATION accesses application table instead of entity table (#4516)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
2022-07-02 21:59:28 +05:30
Ashutosh Gupta 57cbde9abf
YARN-10287.Update scheduler-conf corrupts the CS configuration when removing queue which is referred in queue mapping (#4515)
* YARN-10287.Update scheduler-conf corrupts the CS configuration when removing queue which is referred in queue mapping

Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
2022-07-02 21:58:56 +05:30
Tamas Domok 3cad632709 YARN-11202. Optimize ClientRMService.getApplications. Contributed by Tamas Domok.
Change-Id: I55ddb46fd0e4cdb644747d6d43083215f10861b5
2022-07-01 10:50:48 +02:00
9uapaw 2d133a54ac YARN-11204. Various MapReduce tests fail with NPE in AggregatedLogDeletionService.stopRMClient. Contributed by Szilard Nemeth. 2022-06-29 15:12:04 +02:00
slfan1989 073b8ea1d5
HADOOP-18284. Remove Unnecessary semicolon ';' (#4422). Contributed by fanshilun. 2022-06-29 15:20:41 +05:30
jianghuazhu 321a4844ad
HADOOP-18314. Add some description for PowerShellFencer. (#4505) 2022-06-28 19:06:39 -07:00
hchaverr cf33164857
HDFS-16591. Setup JaasConfiguration in ZKCuratorManager when SASL is enabled
Fixes #4447
Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
2022-06-28 16:44:02 -07:00
Samrat 7eefdf8642
YARN-11195. Adding document to enable numa (#4501)
Contributed by Samrat Deb.
2022-06-28 17:46:43 +05:30
Ashutosh Gupta a177232ebc
YARN-9822.TimelineCollectorWebService#putEntities blocked when ATSV2 HBase is down (#4492)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
2022-06-28 09:32:07 +05:30
swamirishi 43112bd472
HADOOP-18306: Warnings should not be shown on cli console when linux user not present on client (#4474). Contributed by swamirishi. 2022-06-27 17:20:58 -07:00
Mehakmeet Singh 823f5ee0d4
HADOOP-18242. ABFS Rename Failure when tracking metadata is in an incomplete state (#4331)
ABFS rename fails intermittently when the Storage-blob tracking
metadata is in an incomplete state. This surfaces as the error code
404 and an error message of "RenameDestinationParentPathNotFound"

To mitigate this issue, when a request fails with this response,
the ABFS client issues a HEAD call on the source file
and then retries the rename operation.

ABFS filesystem statistics track when this occurs with new counters
  rename_recovery
  metadata_incomplete_rename_failures
  rename_path_attempts

This is a very rare occurrence and appears to be triggered under certain
heavy load conditions, just as with HADOOP-18163.

Contributed by Mehakmeet Singh.
2022-06-27 19:06:59 +01:00
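The recovery sequence described in the HADOOP-18242 entry above (fail with 404 and "RenameDestinationParentPathNotFound", issue a HEAD on the source, retry the rename) might look roughly like the sketch below. It is not the ABFS client code: `RenameError`, `rename_with_recovery`, and the `client` object with `rename` and `head` methods are assumptions made for illustration, and the statistics counters are left out.

```python
class RenameError(Exception):
    """Hypothetical error type carrying an HTTP status and error code."""

    def __init__(self, status, code):
        super().__init__(f"{status} {code}")
        self.status = status
        self.code = code


def rename_with_recovery(client, src, dst):
    """Rename src to dst, retrying once after a HEAD on the source.

    `client` is a hypothetical object exposing rename(src, dst) and
    head(path); it stands in for whatever issues the real requests.
    """
    try:
        return client.rename(src, dst)
    except RenameError as e:
        recoverable = (
            e.status == 404
            and e.code == "RenameDestinationParentPathNotFound"
        )
        if not recoverable:
            raise
    # Recovery path: HEAD the source, then retry the rename once.
    client.head(src)
    return client.rename(src, dst)
```

Any other failure, or a second failure on the retry, propagates to the caller unchanged in this sketch.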
Colm O hEigeartaigh 25f8bdcd21
HADOOP-18308 - Update to Apache LDAP API 2.0.x (#4477)
Update the dependencies of the LDAP libraries used for testing:

ldap-api.version = 2.0.0
apacheds.version = 2.0.0.AM26

Contributed by Colm O hEigeartaigh.
2022-06-27 11:15:18 +01:00
Ashutosh Gupta b7edc6c60c
HDFS-16633. Fixing when Reserved Space For Replicas is not released in some cases (#4452)
* HDFS-16633.Reserved Space For Replicas is not released in some cases

Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
2022-06-24 18:35:00 +05:30
Ashutosh Gupta 734b6f19ad
YARN-9874.Remove unnecessary LevelDb write call in LeveldbConfigurationStore#confirmMutation (#4487)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
2022-06-23 21:59:27 +05:30
Ashutosh Gupta 4abb2ba58c
YARN-10320.Replace FSDataInputStream#read with readFully in Log Aggregation (#4486)
* YARN-10320.Replace FSDataInputStream#read with readFully in Log Aggregation

Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
2022-06-23 21:58:32 +05:30
slfan1989 0af4bb3b42
YARN-11192. TestRouterWebServicesREST failing after YARN-9827. (#4484). Contributed by fanshilun.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-06-23 13:21:36 +05:30
Ashutosh Gupta dd819f7904
HADOOP-18271.Remove unused Imports in Hadoop Common project (#4392)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
2022-06-23 12:30:28 +05:30
Igor Dvorzhak 77d1b194c7
HADOOP-18300. Upgrade Gson dependency to version 2.9.0 (#4454)
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
2022-06-22 16:37:22 -07:00
Steve Loughran e1842b2a74
HADOOP-18103. Add a high-performance vectored read API. (#4476)
This feature adds methods for ranged vectored read operations
in PositionedReadable.

All streams which implement that interface support the new API.

The default implementation reads each range in the vector
sequentially.

However, specific implementations may provide higher performance
versions. This is done in two places:

* Local FileSystem/Checksum FileSystem
* The S3A client.

The S3A client first coalesces adjacent and "nearby" ranges
together, then fetches each range in separate HTTP GET requests,
executed in parallel. As such it delivers significant speedups
to applications reading separate blocks of data from the same
file, columnar data format libraries in particular.

This is the merge commit of the feature branch; the work is in

HADOOP-11867. Add a high-performance vectored read API.
HADOOP-18104. S3A: Add configs to configure minSeekForVectorReads and maxReadSizeForVectorReads.
HADOOP-18107. Adding scale test for vectored reads for large file
HADOOP-18105. Implement buffer pooling with weak references.
HADOOP-18106. Handle memory fragmentation in S3A Vectored IO.

Contributed By: Owen O'Malley and Mukund Thakur
2022-06-22 18:19:23 +01:00
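The coalescing of adjacent and "nearby" ranges described in the HADOOP-18103 entry above can be sketched as follows. This is an illustrative approximation, not the S3A code: the function name `merge_ranges` and its parameters `min_seek` and `max_merged` are assumptions that only loosely mirror the minimum-seek and maximum-merged-size settings named in HADOOP-18104, and the exact comparison rules are a guess.

```python
def merge_ranges(ranges, min_seek, max_merged):
    """Group (offset, length) ranges into larger combined reads.

    Two neighbours are combined when the gap between them is smaller
    than `min_seek` and the combined span does not exceed `max_merged`.
    Returns a list of (offset, length, members) tuples, where `members`
    lists the original ranges served by that combined read.
    """
    merged = []  # entries are (start, end, members)
    for off, length in sorted(ranges):
        end = off + length
        if merged:
            m_off, m_end, members = merged[-1]
            gap = off - m_end
            new_end = max(m_end, end)
            if gap < min_seek and new_end - m_off <= max_merged:
                merged[-1] = (m_off, new_end, members + [(off, length)])
                continue
        merged.append((off, end, [(off, length)]))
    return [(start, end - start, members) for start, end, members in merged]
```

For example, with `min_seek=64` and `max_merged=1024`, the ranges `(0, 100)` and `(120, 50)` fall into one combined read while `(1000, 10)` stays separate. With `max_merged=0` no two non-empty ranges are ever combined in this sketch.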
Mukund Thakur 4d1f6f9b99 HADOOP-18106: Handle memory fragmentation in S3A Vectored IO. (#4445)
part of HADOOP-18103.
Handle memory fragmentation in the S3A vectored IO implementation by
allocating smaller buffers, each sized to a user-requested range, filling
them directly from the remote S3 stream, and skipping undesired
data in between ranges.
This patch also aborts active vectored reads when the stream is
closed or unbuffer() is called.

Contributed By: Mukund Thakur
2022-06-22 17:29:32 +01:00
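The buffer strategy described in the HADOOP-18106 entry above (one buffer per requested range, filled straight from the stream, with the bytes between ranges discarded) can be sketched like this. It is a simplified stand-in rather than the S3A implementation: `fill_ranges` is a hypothetical name, the stream is any binary file-like object supporting `read` and `readinto`, and the ranges are assumed to be sorted and non-overlapping.

```python
import io


def fill_ranges(stream, start, ranges):
    """Read sorted, non-overlapping (offset, length) ranges from `stream`.

    `stream` is positioned at absolute offset `start`. Each range gets
    its own exactly-sized buffer; gap bytes are drained and dropped.
    """
    pos = start
    buffers = []
    for off, length in ranges:
        # Drain the unwanted bytes between the previous range and this one.
        skip = off - pos
        while skip > 0:
            chunk = stream.read(min(skip, 8192))
            if not chunk:
                raise EOFError("stream ended while skipping a gap")
            skip -= len(chunk)
        # Fill a buffer sized to this range directly from the stream.
        buf = bytearray(length)
        view = memoryview(buf)
        filled = 0
        while filled < length:
            n = stream.readinto(view[filled:])
            if not n:
                raise EOFError("stream ended inside a requested range")
            filled += n
        buffers.append(buf)
        pos = off + length
    return buffers
```

A quick way to try it is with an in-memory stream, for example `fill_ranges(io.BytesIO(data), 0, [(10, 5), (20, 3)])`, which returns one 5-byte and one 3-byte buffer instead of a single 13-byte one.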
Mukund Thakur 0d49bd2004 HADOOP-18105 Implement buffer pooling with weak references (#4263)
part of HADOOP-18103.
Required for the vectored IO feature. None of the current buffer pool
implementations is complete. ElasticByteBufferPool doesn't use
weak references and could lead to memory leak errors, and
DirectBufferPool doesn't support caller preferences for direct
or heap buffers and has only a fixed-length buffer implementation.

Contributed By: Mukund Thakur
2022-06-22 17:29:32 +01:00
Mukund Thakur 1408dd89a7 HADOOP-18107 Adding scale test for vectored reads for large file (#4273)
part of HADOOP-18103.

Contributed By: Mukund Thakur
2022-06-22 17:29:32 +01:00
Mukund Thakur 5db0f34e29 HADOOP-18104: S3A: Add configs to configure minSeekForVectorReads and maxReadSizeForVectorReads (#3964)
Part of HADOOP-18103.
Introduces fs.s3a.vectored.read.min.seek.size and fs.s3a.vectored.read.max.merged.size
to configure the minimum seek and maximum merged read size during a vectored IO operation in the S3A connector.
These properties define how the ranges will be merged. To completely
disable merging, set fs.s3a.vectored.read.max.merged.size to 0.

Contributed By: Mukund Thakur
2022-06-22 17:29:32 +01:00
Mukund Thakur 2daf0a814f HADOOP-11867. Add a high-performance vectored read API. (#3904)
part of HADOOP-18103.
Add support for a multiple-range vectored read API in PositionedReadable.
The default iterates through the ranges to read each synchronously,
but the intent is that FSDataInputStream subclasses can provide more
efficient readers, especially in object store implementations.

Also added an implementation in S3A where smaller ranges are merged and
sliced byte buffers are returned to the readers. All the merged ranges are
fetched from S3 asynchronously.

Contributed By: Owen O'Malley and Mukund Thakur
2022-06-22 17:29:32 +01:00
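The default behaviour described in the HADOOP-11867 entry above, where each range is read one after another and the result is handed back through a future, can be sketched as below. This only illustrates the shape of such an API and is not the PositionedReadable code: `read_vectored_default` and the `read_at(offset, length)` callback are hypothetical names.

```python
from concurrent.futures import Future


def read_vectored_default(read_at, ranges):
    """Sequential fallback: read each (offset, length) range in turn.

    `read_at(offset, length)` is a hypothetical positioned-read callback.
    Each result is wrapped in an already-completed Future so callers can
    use the same interface as an asynchronous implementation would offer.
    """
    futures = []
    for off, length in ranges:
        fut = Future()
        try:
            fut.set_result(read_at(off, length))
        except Exception as exc:
            # A failed range is reported through its own future.
            fut.set_exception(exc)
        futures.append(fut)
    return futures
```

A faster implementation could return futures that complete later, for instance after parallel fetches, without changing how callers consume the results.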