Commit Graph

26018 Commits

Author SHA1 Message Date
Steve Loughran 36bbde2fda
HADOOP-18181. Move prefetch common classes to hadoop common (#4690)
Contains:
HADOOP-18187. Convert s3a prefetching to use JavaDoc for fields and enums.
HADOOP-18318. Update class names to be clear they belong to S3A prefetching

Contributed by Steve Loughran
2022-08-17 09:50:49 +01:00
Steve Loughran e23f70a03c
HADOOP-18379 rebase feature/HADOOP-18028-s3a-prefetch to trunk
Fixes the build and a test failure (ITestS3ARequesterPays) which
was always there if you tested without prefetching enabled.

Change-Id: I4503c64856cfeb453b558808065b38455e1fce23
2022-07-28 14:58:03 +01:00
ahmarsuhail a9dbd7d62f
HADOOP-18190. Collect IOStatistics during S3A prefetching (#4458)
This adds iOStatisticsConnection to the S3PrefetchingInputStream class, with
new statistic names in StreamStatistics.

This stream is not (yet) IOStatisticsContext aware.


Contributed by Ahmar Suhail
2022-07-28 14:24:56 +01:00
ahmarsuhail 6a3b9f1723
HADOOP-18254. Disable S3A prefetching by default. (#4469)
Contributed by Ahmar Suhail <ahmarsu@amazon.co.uk>
2022-07-28 14:24:52 +01:00
ahmarsuhail 515cba7d2e
HADOOP-18231. S3A prefetching: fix failing tests & drain stream async. (#4386)
* adds in new test for prefetching input stream
* creates streamStats before opening stream
* updates numBlocks calculation method
* fixes ITestS3AOpenCost.testOpenFileLongerLength
* drains stream async
* fixes failing unit test


Contributed by Ahmar Suhail
2022-07-28 14:23:56 +01:00
Ahmar Suhail 3c06960a31
fixes compilation errors in tests 2022-07-28 14:23:56 +01:00
monthonk 9abc77b19e
HADOOP-18175. fix test failures with prefetching s3a input stream (#4212)
Contributed by Monthon Klongklaew
2022-07-28 14:23:56 +01:00
ahmarsuhail 538ddf8532
HADOOP-18177. Document prefetching architecture. (#4205)
Contributed by Ahmar Suhail
2022-07-28 14:23:56 +01:00
PJ Fanning 5a1f4dd5c1
HADOOP-18180. Replace use of twitter util-core with java futures in S3A prefetching stream (#4115)
Contributed by PJ Fanning.
2022-07-28 14:23:51 +01:00
Steve Loughran fd24290aa4
HADOOP-18028. High performance S3A input stream (#4109)
This is the initial merge of the HADOOP-18028 S3A performance input stream.
This patch on its own is incomplete and must be accompanied by all other commits
with HADOOP-18028 in their git commit message. Consult the JIRA for that list.

Contributed by Bhalchandra Pandit.
2022-07-28 14:19:53 +01:00
Steve Loughran 95a85875d0
HADOOP-18344. (followup) AWS SDK 1.12.262: update LICENSE-binary
Update LICENSE-binary with the new AWS SDK version.
Followup to #4637.

Contributed by Steve Loughran
2022-07-28 11:37:28 +01:00
Steve Loughran 58ed621304
HADOOP-18344. Upgrade AWS SDK to 1.12.262 (#4637)
Fixes CVE-2018-7489 in shaded jackson.

Also adds more commands in testing.md to the CLI tests needed when
qualifying a release.

Contributed by Steve Loughran
2022-07-28 11:29:38 +01:00
xuzq f80fab2b90
HDFS-16671. RBF: RouterRpcFairnessPolicyController supports configurable permit acquire timeout (#4597) 2022-07-28 15:42:53 +08:00
xuzq a5adc27c99
HDFS-16658. Change logLevel from DEBUG to INFO if logEveryBlock is true (#4559). Contributed by ZanderXu.
Signed-off-by: He Xiaoqiao <hexiaoqiao@apache.org>
2022-07-28 15:30:22 +08:00
slfan1989 06ac327e88
HDFS-16619. Fix HttpHeaders.Values And HttpHeaders.Names Deprecated Import (#4406)
Co-authored-by: slfan1989 <louj1988@@>
2022-07-28 11:07:24 +08:00
xuzq 24560f2eb5
HDFS-16660. Improve Code With Lambda in IPCLoggerChannel class (#4561) 2022-07-27 18:53:05 -07:00
ahmarsuhail c92ff0b4f1
HADOOP-18372. ILoadTestS3ABulkDeleteThrottling failing. (#4642)
Contributed by Ahmar Suhail
2022-07-27 17:19:57 +01:00
Mehakmeet Singh 4c8cd61961
HADOOP-17461. Collect thread-level IOStatistics. (#4352)
This adds a thread-level collector of IOStatistics, IOStatisticsContext,
which can be:
* Retrieved for a thread and cached for access from other
  threads.
* reset() to record new statistics.
* Queried for live statistics through the
  IOStatisticsSource.getIOStatistics() method.
* Queried for a statistics aggregator for use in instrumented
  classes.
* Asked to create a serializable copy in snapshot()

The goal is to make it possible for applications with multiple
threads performing different work items simultaneously
to be able to collect statistics on the individual threads,
and so generate aggregate reports on the total work performed
for a specific job, query or similar unit of work.

Some changes in IOStatistics-gathering classes are needed for
this feature:
* Caching the active context's aggregator in the object's
  constructor
* Updating it in close()

Slightly more work is needed in multithreaded code,
such as the S3A committers, which collect statistics across
all threads used in task and job commit operations.

Currently the IOStatisticsContext-aware classes are:
* The S3A input stream, output stream and list iterators.
* RawLocalFileSystem's input and output streams.
* The S3A committers.
* The TaskPool class in hadoop-common, which propagates
  the active context into scheduled worker threads.

Collection of statistics in the IOStatisticsContext
is disabled process-wide by default until the feature 
is considered stable.

To enable the collection, set the option
fs.thread.level.iostatistics.enabled
to "true" in core-site.xml.
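
As a minimal sketch of that setting, assuming the standard Hadoop XML
configuration layout (the surrounding configuration element is shown only
for context and would normally already exist in the file):

```xml
<!-- core-site.xml: enable thread-level IOStatistics collection,
     which is disabled process-wide by default -->
<configuration>
  <property>
    <name>fs.thread.level.iostatistics.enabled</name>
    <value>true</value>
  </property>
</configuration>
```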
	
Contributed by Mehakmeet Singh and Steve Loughran
2022-07-26 20:41:22 +01:00
KevinWikant 213ea03758
YARN-11210. Fix YARN RMAdminCLI retry logic for non-retryable kerbero… (#4563)
Co-authored-by: Kevin Wikant <wikak@amazon.com>
2022-07-26 09:21:37 +05:30
xuzq 01a2e0f6bd
HDFS-16533. COMPOSITE_CRC failed between replicated file and striped file due to invalid requested length. (#4155)
Co-authored-by: zengqiang.xu <zengqiang.xu@shopee.com>
2022-07-26 04:30:00 +08:00
slfan1989 bf8782d0ac
YARN-10883. [Router] Router Audit Log Add Client IP Address. (#4426) 2022-07-25 11:55:40 -07:00
skysiders 9fe96238d2
MAPREDUCE-7372 MapReduce set permission too late in copyJar method (#4026). Contributed by Zhang Dongsheng.
Reviewed-by: Steve Loughran <stevel@apache.org>
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
2022-07-25 11:38:59 -07:00
Gautham B A 6ba2c53720
HDFS-16681. Do not pass GCC flags for MSVC in libhdfspp (#4615)
* This PR ensures that the GCC flag
  -Wno-missing-field-initializers isn't passed
  for MSVC.
2022-07-25 22:55:38 +05:30
slfan1989 edeb99548a
YARN-11161. Support getAttributesToNodes, getClusterNodeAttributes, getNodesToAttributes API's for Federation (#4610) 2022-07-25 10:05:45 -07:00
Neil 2f49eec5dd
HDFS-16655. OIV: print out erasure coding policy name in oiv Delimited output (#4541). Contributed by Max Xie.
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
2022-07-25 17:39:25 +08:00
Gautham B A 8f83d9f56d
HDFS-16680. Skip libhdfspp Valgrind tests on Windows (#4611)
* The CMake test libhdfs_mini_stress_valgrind
  requires Valgrind.
* This PR skips this test on Windows since
  Valgrind isn't available.
2022-07-23 23:22:13 +05:30
Gautham B A 7de9b5ee27
HDFS-16467. Ensure Protobuf generated headers are included first (#4601)
* This PR ensures that the Protobuf generated headers
  are always included first, even when these headers
  are included transitively.
* This problem is specific to Windows only.
2022-07-23 23:20:15 +05:30
slfan1989 63db1a85e3
YARN-11203. Fix typo in hadoop-yarn-server-router module. (#4510). Contributed by fanshilun.
Reviewed-by: Fei Hui <feihui.ustc@gmail.com>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-07-23 20:28:45 +05:30
xuzq 2c96357051
HDFS-15079. RBF: Namenode needs to use the actual client Id and callId when going through RBF proxy. (#4530) 2022-07-23 22:19:37 +08:00
slfan1989 5c84cb81ba
YARN-8900. [Router] Federation: routing getContainers REST invocations transparently to multiple RMs (#4543) 2022-07-22 17:06:38 -07:00
Masatake Iwasaki 221eb2d68d Make upstream aware of 3.2.4 release.
(cherry picked from commit 817b8fdd38)
2022-07-22 04:08:36 +00:00
Masatake Iwasaki 3cce41a1f6 Make upstream aware of 3.2.4 release.
(cherry picked from commit e1637a57df)
2022-07-22 02:27:19 +00:00
slfan1989 2f6916a313
HDFS-16605. Improve Code With Lambda in hadoop-hdfs-rbf module. (#4375) 2022-07-21 18:42:55 -07:00
wangzhaohui 08a940d5dd
HDFS-16640. RBF: Show datanode IP list when click DN histogram in Router (#4488) 2022-07-21 16:21:31 -07:00
slfan1989 838020ce3b
YARN-11160. Support getResourceProfiles, getResourceProfile API's for Federation (#4540) 2022-07-21 11:57:24 -07:00
Szilard Nemeth f4b635c4dc YARN-11211. QueueMetrics leaks Configuration objects when validation API is called multiple times. Contributed by Andras Gyori 2022-07-21 14:20:34 +02:00
ashutoshpant bac2219e3c
HADOOP-18330. S3AFileSystem removes Path when calling createS3Client (#4572)
Adds a new parameter object in s3ClientCreationParameters that holds 
the full s3a path URI

Contributed by Ashutosh Pant
2022-07-21 10:16:39 +01:00
Gautham B A d07256a96d
HDFS-16667. Use malloc for buffer allocation in uriparser2 (#4576)
* Windows doesn't support variables for specifying
  the array size.
* This PR uses malloc to fix this issue.
2022-07-20 21:57:28 +05:30
Ashutosh Gupta e664f81ce7
HADOOP-18333. Upgrade jetty version to 9.4.48.v20220622 (#4553)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
2022-07-21 00:15:39 +08:00
Gautham B A 6415eb04e8
HDFS-16665. Fix duplicate sources for HDFS test (#4573)
* The library target hdfspp_test_shim_static is
  built using the following sources, which
  causes duplicate symbols to be defined -
  - hdfs_shim.c
  - ${LIBHDFSPP_BINDING_C}/hdfs.cc
* ${LIBHDFSPP_BINDING_C}/hdfs.cc is redundant
  and removing this fixes the issue.
2022-07-19 21:39:06 +05:30
Gautham B A 4fb799e6c5
HDFS-16464. Create only libhdfspp static libraries for Windows (#4571)
* Added the appropriate CMake flags and
  commands to enable only statically
  linked libraries and executables to
  be built on Windows.
2022-07-19 21:37:22 +05:30
Gautham B A 21b8952125
HDFS-16666. Pass CMake args for Windows in pom.xml (#4574)
* This PR passes the necessary CMake args in the
  pom.xml needed for building HDFS native client
  on Windows.
* These arguments are exposed as maven options
  and can be passed from the command-line.
2022-07-19 10:45:59 +05:30
Wei-Chiu Chuang a55ace7bc0
HADOOP-18079. Upgrade Netty to 4.1.77. (#3977)
Upgrade netty to address:

CVE-2019-20444
CVE-2019-20445
CVE-2022-24823

Contributed by Wei-Chiu Chuang
2022-07-18 10:41:00 +01:00
PJ Fanning 34e548cb62
HADOOP-18332: remove rs-api dependency as it conflicts with jsr311-api (#4547)
This downgrades jackson from the version switched to in
HADOOP-18033 (2.13.0), to Jackson 2.12.7.
This removes the dependency on javax.ws.rs-api,
so avoiding runtime problems with applications using
jersey-core v1 and/or jsr311-api.

The 2.12.7 release still contains the fix for CVE-2020-36518.

Contributed by PJ Fanning
2022-07-17 21:37:54 +05:30
Gautham B A 440f4c2b28
HDFS-16654. Link OpenSSL lib for CMake deps check (#4538)
* The check_c_source_compiles fails on Windows
  while linking with an "unable to resolve
  external symbol" error.
* This PR links OpenSSL lib for this check to
  fix this issue.
2022-07-17 20:47:30 +05:30
xuzq 8774f17868
HADOOP-13144. Enhancing IPC client throughput via multiple connections per user (#4542) 2022-07-15 14:18:46 -07:00
RuinanGu 9376b65989
HDFS-16566 Erasure Coding: Recovery may cause excess replicas when busy DN exists (#4252) 2022-07-16 04:52:12 +08:00
Murali Krishna 2835174a4c
HDFS-16652. Upgrade jquery datatable version references to v1.10.19 (#4562) 2022-07-14 18:27:07 +05:30
Samrat 84ce592a85
YARN-11198. clean up numa resources from statestore (#4546)
* YARN-11198. clean up numa resources from levelDB

Co-authored-by: Deb <dbsamrat@3c22fba1b03f.ant.amazon.com>
2022-07-14 11:07:48 +05:30
xuzq 6f9c4359ec
HDFS-16283. RBF: reducing the load of renewLease() RPC (#4524). Contributed by ZanderXu.
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-07-14 07:26:40 +05:30