24642 Commits

Author SHA1 Message Date
Viraj Jasani fd3069d70c HADOOP-17947. Additional element types for VisibleForTesting (ADDENDUM) (#3521)
(cherry picked from commit 783e4805e7)
2021-10-06 02:18:54 +09:00
Steve Loughran 6f7b45641a
HADOOP-17922. move to fs.s3a.encryption.algorithm - JCEKS integration (#3466)
The ordering of the resolution of new and deprecated s3a encryption options
& secrets is the same when JCEKS and other hadoop credentials stores are used
to store them as when they are in XML files: per-bucket settings always take
priority over global values, even when the bucket-level options use the
old option names.

Contributed by Mehakmeet Singh and Steve Loughran

Change-Id: I871672071efa2eb6b600cb2658fceeef57f658a3
2021-10-05 11:39:43 +01:00
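
The resolution order described in HADOOP-17922 above can be sketched as an ordered lookup. This is a minimal Python model, not the S3A implementation: plain dicts stand in for a Hadoop configuration, the `fs.s3a.bucket.<name>.` prefix for per-bucket options is not stated in the commit message, and checking the new name before the old one within each level is an assumption made for illustration.

```python
# Illustrative model only; a dict stands in for a Hadoop Configuration.
NEW_SUFFIX = "encryption.algorithm"
OLD_SUFFIX = "server-side-encryption-algorithm"


def resolve_encryption_algorithm(conf, bucket):
    """Return the effective algorithm for a bucket, or None if unset."""
    lookup_order = [
        # Per-bucket settings come first, under either option name.
        f"fs.s3a.bucket.{bucket}.{NEW_SUFFIX}",
        f"fs.s3a.bucket.{bucket}.{OLD_SUFFIX}",
        # Global values are consulted only if no per-bucket value exists.
        f"fs.s3a.{NEW_SUFFIX}",
        f"fs.s3a.{OLD_SUFFIX}",
    ]
    for key in lookup_order:
        if key in conf:
            return conf[key]
    return None


conf = {
    "fs.s3a.encryption.algorithm": "SSE-KMS",
    "fs.s3a.bucket.logs.server-side-encryption-algorithm": "AES256",
}
print(resolve_encryption_algorithm(conf, "logs"))   # per-bucket value, old name
print(resolve_encryption_algorithm(conf, "other"))  # falls back to the global value
```

The point of the ordering is that a bucket-level value set under the old name still beats a global value set under the new name.
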
Mehakmeet Singh 769059c2f5
HADOOP-17871. S3A CSE: minor tuning (#3412)
This migrates the S3A server-side encryption configuration options
to names which cover client-side encryption too.

fs.s3a.server-side-encryption-algorithm becomes fs.s3a.encryption.algorithm
fs.s3a.server-side-encryption.key becomes fs.s3a.encryption.key

The existing keys remain valid; they are simply deprecated and remapped
to the new names. If you want server-side encryption options
to be picked up regardless of Hadoop version, use
the old keys.

(the old key also works for CSE, though as no version of Hadoop
with CSE support has shipped without this remapping, it's less
relevant)

Contributed by: Mehakmeet Singh

Change-Id: I51804b21b287dbce18864f0a6ad17126aba2b281
2021-10-05 11:39:25 +01:00
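
The key remapping in HADOOP-17871 above can be modelled as a small translation table. This is a hedged sketch in Python rather than Hadoop's own deprecation mechanism: the function name `remap_deprecated` is hypothetical, and the rule that an explicitly set new key is not overwritten by the old one is an assumption.

```python
# Old option name -> new option name, as listed in the commit message.
DEPRECATED_KEYS = {
    "fs.s3a.server-side-encryption-algorithm": "fs.s3a.encryption.algorithm",
    "fs.s3a.server-side-encryption.key": "fs.s3a.encryption.key",
}


def remap_deprecated(conf):
    """Return a copy of conf with deprecated keys moved to their new names."""
    # Copy every key that is not deprecated unchanged.
    remapped = {k: v for k, v in conf.items() if k not in DEPRECATED_KEYS}
    for old_key, new_key in DEPRECATED_KEYS.items():
        if old_key in conf:
            # Assumption: a value already set under the new name is kept.
            remapped.setdefault(new_key, conf[old_key])
    return remapped


old_style = {
    "fs.s3a.server-side-encryption-algorithm": "SSE-KMS",
    "fs.s3a.server-side-encryption.key": "example-key-id",
}
print(remap_deprecated(old_style))
```

A configuration written with the old keys yields the same effective settings after remapping, which is why the commit message recommends the old keys when the same file must work across Hadoop versions.
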
Mehakmeet Singh abb367aec6
HADOOP-17817. HADOOP-17823. S3A to raise IOE if both S3-CSE and S3Guard enabled (#3239)
S3A S3Guard tests to skip if S3-CSE is enabled (#3263)

    Follow on to
    * HADOOP-13887. Encrypt S3A data client-side with AWS SDK (S3-CSE)

    If the S3A bucket is set up to use S3-CSE encryption, all tests which turn
    on S3Guard are skipped, so they don't raise any exceptions about
    incompatible configurations.

Contributed by Mehakmeet Singh

Change-Id: I9f4188109b56a1f4e5a31fae265d980c5795db1e
2021-10-05 11:38:57 +01:00
Mehakmeet Singh aee975a136
HADOOP-13887. Support S3 client side encryption (S3-CSE) using AWS-SDK (#2706)
This (big!) patch adds support for client side encryption in AWS S3,
with keys managed by AWS-KMS.

Read the documentation in encryption.md very, very carefully before
use and consider it unstable.

S3-CSE is enabled in the existing configuration option
"fs.s3a.server-side-encryption-algorithm":

fs.s3a.server-side-encryption-algorithm=CSE-KMS
fs.s3a.server-side-encryption.key=<KMS_KEY_ID>

You cannot enable CSE and SSE in the same client, although
you can still enable a default SSE option in the S3 console.

* Filesystem list/get status operations subtract 16 bytes from the length
  of all files >= 16 bytes long to compensate for the padding which CSE
  adds.
* The SDK always warns that the specific algorithm chosen is
  deprecated. This algorithm is required for ranged GET requests
  (i.e. random IO) to work, so ignore the warning.
* Unencrypted files CANNOT BE READ.
  The entire bucket SHOULD be encrypted with S3-CSE.
* Uploading files may be a bit slower as blocks are now
  written sequentially.
* The Multipart Upload API is disabled when S3-CSE is active.

Contributed by Mehakmeet Singh

Change-Id: Ie1a27a036a39db66a67e9c6d33bc78d54ea708a0
2021-10-05 11:37:41 +01:00
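
The length adjustment described in HADOOP-13887 above (list/get status subtracting 16 bytes from files of at least 16 bytes) amounts to the following arithmetic. This is a minimal Python sketch; the names `CSE_PADDING_LENGTH` and `reported_length` are hypothetical and not taken from the patch.

```python
CSE_PADDING_LENGTH = 16  # bytes of padding added by CSE, per the commit message


def reported_length(stored_length):
    """Length shown to callers for an object of the given stored size."""
    if stored_length >= CSE_PADDING_LENGTH:
        return stored_length - CSE_PADDING_LENGTH
    # Objects shorter than the padding are reported unchanged.
    return stored_length


for size in (0, 15, 16, 1040):
    print(size, "->", reported_length(size))
```

The rule depends only on the stored size, so any object of 16 bytes or more in the bucket has its reported length reduced.
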
Viraj Jasani da011baf85 HADOOP-17947. Provide alternative to Guava VisibleForTesting (#3505)
Reviewed-by: Steve Loughran <stevel@apache.org>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
(cherry picked from commit 5b1d594005)
2021-10-05 10:01:07 +09:00
Josh Elser feeaebeb84
HADOOP-17934. ABFS: Make sure the AbfsHttpOperation is non-null before using it (#3477)
Contributed by: Josh Elser

Change-Id: I24a2e0322d8cae2d72d65c7f3d8a74580a418317
2021-10-04 20:54:39 +01:00
Ahmed Hussein 31b44c519c
HADOOP-17929. implement non-guava Precondition checkArgument (#3473)
Reviewed-by: Viraj Jasani <vjasani@apache.org>
(cherry picked from commit 0c498f21de)
2021-10-01 16:49:07 +08:00
litao 5ed4274f38
HADOOP-17938. Print lockWarningThreshold in InstrumentedLock#logWarni… (#3485)
Reviewed-by: Hui Fei <ferhui@apache.org>
(cherry picked from commit 211db3fe08)
2021-10-01 10:24:33 +08:00
AngersZhuuuu 4475d8bfe7
HDFS-16235. Fix Deadlock in LeaseRenewer for static remove method (#3472)
(cherry picked from commit 5f9321a5d4)
2021-09-29 18:36:36 +08:00
He Xiaoqiao 06954af6f0
HDFS-14575. LeaseRenewer#daemon threads leak in DFSClient. Contributed by Renukaprasad C.
Co-authored-by: Tao Yang <taoyang1@apache.org>
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
(cherry picked from commit 10b79a26fe)
2021-09-29 18:36:36 +08:00
Warren Zhu e33509b094
HADOOP-17941. Update xerces to 2.12.1 (#3496)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 1db5eb43ad)
2021-09-29 18:49:52 +09:00
Neil 88deac0479
YARN-10970. Standby RM should expose prom endpoint (#3480)
Reviewed-by: Adam Antal <adamantal@apache.org>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 4bd0c36189)
2021-09-29 15:48:02 +09:00
Takanobu Asanuma f00ab40b4d HADOOP-17940. Upgrade Kafka to 2.8.1 (#3488)
Reviewed-by: Masatake Iwasaki <iwasakims@apache.org>
(cherry picked from commit 2068b0041c)
2021-09-28 13:31:53 +09:00
Chao Sun 6931b70a00
HADOOP-17936. Fix test failure after reverting HADOOP-16878 from branch-3.3 (#3478) 2021-09-27 13:56:44 -07:00
Dongjoon Hyun ca7fb6a813
HADOOP-17939. Support building on Apple Silicon (#3486)
Support building on Apple Silicon with ARM CPUs by using the x86_64 version of protoc.

Contributed by Dongjoon Hyun

Change-Id: I4b8330098822f1fd28f0113650eb709d53bcc690
2021-09-27 13:27:56 +01:00
Wei-Chiu Chuang bb08de559a HDFS-16233. Do not use exception handler to implement copy-on-write for EnumCounters. (#3468)
(cherry picked from commit 87632bbacf)
2021-09-24 09:13:15 -07:00
Akira Ajisaka f1c0dc84cd
HDFS-15977. Call explicit_bzero only if it is available. (#2914)
Reviewed-by: Masatake Iwasaki <iwasakims@apache.org>
Reviewed-by: Inigo Goiri <inigoiri@apache.org>
(cherry picked from commit f0241ec216)

 Conflicts:
	hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/lib/x-platform/syscall_linux.cc
2021-09-24 11:50:12 +09:00
Chao Sun ff26a7700d Revert "HADOOP-16878. FileUtil.copy() to throw IOException if the source and destination are the same (#2383)"
This reverts commit 54c40cbf49.
2021-09-23 15:04:27 -07:00
Chao Sun 9fd0832a99 Revert "MAPREDUCE-7303. Fix TestJobResourceUploader failures after HADOOP-16878. Contributed by Peter Bacsko."
This reverts commit c40f0f1eb3.
2021-09-23 15:04:26 -07:00
Peter Bacsko 99d84b941b YARN-9606. Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient. Contributed by Bilwa S T. 2021-09-23 13:58:56 +02:00
Ayush Saxena 80d2bda51c
HDFS-15288. Add Available Space Rack Fault Tolerant BPP. Contributed by Ayush Saxena.(Addendum) 2021-09-23 10:33:16 +05:30
Ayush Saxena 38f529abf7
HDFS-16223. AvailableSpaceRackFaultTolerantBlockPlacementPolicy should use chooseRandomWithStorageTypeTwoTrial() for better performance. (#3424). Contributed by Ayush Saxena. 2021-09-23 08:54:01 +05:30
Ayush Saxena 3355126062
HDFS-15288. Add Available Space Rack Fault Tolerant BPP. Contributed by Ayush Saxena. 2021-09-23 08:52:20 +05:30
Rintaro Ikeda a5657b9657 HADOOP-17926. Maven-eclipse-plugin is no longer needed since Eclipse can import Maven projects by itself. (#3465)
(cherry picked from commit 962068d2d8)
2021-09-22 13:08:30 +00:00
Mehakmeet Singh 8e5620cd9e
HADOOP-17195. ABFS: OutOfMemory error while uploading huge files (#3446)
Addresses the problem of processes running out of memory when
there are many ABFS output streams queuing data to upload,
especially when the network upload bandwidth is less than the rate
data is generated.

ABFS Output streams now buffer their blocks of data to
"disk", "bytebuffer" or "array", as set in
"fs.azure.data.blocks.buffer"

When buffering via disk, the location for temporary storage
is set in "fs.azure.buffer.dir"

For safe scaling: use "disk" (default); for performance, when
confident that upload bandwidth will never be a bottleneck,
experiment with the memory options.

The number of blocks a single stream can have queued for uploading
is set in "fs.azure.block.upload.active.blocks".
The default value is 20.

Contributed by Mehakmeet Singh.
2021-09-22 11:19:16 +01:00
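
The ABFS buffering options named in HADOOP-17195 above can be summarised as a small settings reader. This Python sketch only models the option names and defaults given in the commit message; the function `abfs_upload_settings`, the validation rules, and returning `None` for an unset buffer directory are assumptions for illustration.

```python
VALID_BUFFERS = ("disk", "bytebuffer", "array")
DEFAULT_BUFFER = "disk"        # default per the commit message
DEFAULT_ACTIVE_BLOCKS = 20     # default per the commit message


def abfs_upload_settings(conf):
    """Read the block-upload options from a dict standing in for a configuration."""
    buffer_type = conf.get("fs.azure.data.blocks.buffer", DEFAULT_BUFFER)
    if buffer_type not in VALID_BUFFERS:
        raise ValueError(f"unsupported buffer type: {buffer_type!r}")

    active_blocks = int(
        conf.get("fs.azure.block.upload.active.blocks", DEFAULT_ACTIVE_BLOCKS)
    )
    if active_blocks < 1:
        # Assumption: at least one block must be allowed in the queue.
        raise ValueError("active blocks must be at least 1")

    settings = {"buffer": buffer_type, "active_blocks": active_blocks}
    if buffer_type == "disk":
        # The temporary storage location only matters when buffering to disk.
        settings["buffer_dir"] = conf.get("fs.azure.buffer.dir")
    return settings


print(abfs_upload_settings({}))
print(abfs_upload_settings({"fs.azure.data.blocks.buffer": "bytebuffer",
                            "fs.azure.block.upload.active.blocks": "4"}))
```

With the memory-backed options, each stream can hold up to the configured number of queued blocks in memory, which is why the commit message suggests "disk" for safe scaling.
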
sumangala-patki dd30db78e7
HADOOP-17290. ABFS: Add Identifiers to Client Request Header (#2520)
Contributed by Sumangala Patki.

(cherry picked from commit 35570e414a)
2021-09-21 16:45:51 +01:00
Neil 9700d98eac
HADOOP-17893. Improve PrometheusSink for Namenode TopMetrics (#3426)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit ae2c5ccfcf)
2021-09-21 10:44:51 +09:00
Rintaro Ikeda 92af6cd3bc HADOOP-17919. Fix command line example in Hadoop Cluster Setup documentation. (#3453)
(cherry picked from commit 607c20c612)
2021-09-17 13:34:07 +00:00
Steve Loughran 9188fa8cce
HADOOP-17126. implement non-guava Precondition checkNotNull
This adds a new class org.apache.hadoop.util.Preconditions which is

* @Private/@Unstable
* Intended to allow us to move off Google Guava
* Is designed to be trivially backportable
  (i.e. contains no references to guava classes internally)

Please use this instead of the guava equivalents, where possible.

Contributed by: Ahmed Hussein

Change-Id: Ic392451bcfe7d446184b7c995734bcca8c07286e
2021-09-17 11:06:59 +01:00
Liang-Chi Hsieh 103ef9c711 HADOOP-17891. Fix compilation error under skipShade (ADDENDUM) (#3441) 2021-09-16 10:12:28 -07:00
Eric Badger 52ba50fd3c YARN-10935. AM Total Queue Limit goes below per-user AM Limit if parent is full. Contributed by Eric Payne.
(cherry picked from commit 43f0a34dd4)
2021-09-16 16:46:44 +00:00
wangzhaohui 2f73ac1c14 HDFS-16181. [SBN Read] Fix display of JournalNode metric RpcRequestCacheMissAmount (#3317)
Co-authored-by: wangzhaohui8 <wangzhaohui8@jd.com>

(cherry picked from commit 232fd7cae170de8c6b52c14841a47dca8735c6d2)
2021-09-15 10:02:13 -07:00
Liang-Chi Hsieh b9715c2931 HADOOP-17891. Exclude snappy-java and lz4-java from relocation in shaded hadoop client libraries (#3385) 2021-09-14 11:21:41 -07:00
Szilard Nemeth 6c68211062 YARN-10870. Missing user filtering check -> yarn.webapp.filter-entity-list-by-user for RM Scheduler page. Contributed by Gergely Pollak 2021-09-14 18:08:34 +02:00
Tamas Domok 8e4ac01135
YARN-10901. Permission checking error on an existing directory in LogAggregationFileController#verifyAndCreateRemoteLogDir (#3409)
Co-authored-by: Tamas Domok <tdomok@cloudera.com>
2021-09-14 17:34:32 +02:00
EungsopYoo 51a4a23e37
HDFS-16198. Short circuit read leaks Slot objects when InvalidToken exception is thrown (#3359)
Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org>
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
(cherry picked from commit c4c5883d8b)
2021-09-14 16:27:59 +08:00
Symious c0f32f3cf8 HDFS-16221. RBF: Add usage of refreshCallQueue for Router (#3421)
(cherry picked from commit 7f6553af75)
2021-09-13 10:48:09 +08:00
Symious 8affaa6312 HDFS-16210. RBF: Add the option of refreshCallQueue to RouterAdmin (#3379)
(cherry picked from commit c0890e6d04)
2021-09-10 10:01:27 +08:00
sumangala-patki 1cb9e747eb
HADOOP-17618. ABFS: Partially obfuscate SAS object IDs in Logs (#2845)
Contributed by Sumangala Patki

(cherry picked from commit 3450522c2f)
2021-09-09 14:04:12 +01:00
Adam Binford 59a955dfa0
HADOOP-17804. Expose prometheus metrics only after a flush and dedupe with tag values (#3369)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 4ced012f33)
2021-09-09 16:51:04 +09:00
Ahmed Hussein 1f61944e3b HDFS-16207. Remove NN logs stack trace for non-existent xattr query (#3375)
Change-Id: Ibde523b20a6b8ac92991da52583e625a018d2ee6
2021-09-09 05:27:13 +00:00
Steve Loughran a2242df10a
HADOOP-17894. CredentialProviderFactory.getProviders() recursion loading JCEKS file from S3A (#3393)
* CredentialProviderFactory to detect and report on recursion.
* S3AFS to remove incompatible providers.
* Integration Test for this.

Contributed by Steve Loughran.

Change-Id: Ia247b3c9fe8488ffdb7f57b40eb6e37c57e522ef
2021-09-08 17:00:20 +01:00
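
The "remove incompatible providers" step in HADOOP-17894 above can be illustrated as filtering a provider path so that a filesystem never loads credentials stored on itself. This is a hedged Python sketch of the general idea, not the Hadoop code: the function names are hypothetical, and the provider URI shape `jceks://<scheme>@<host>/<path>` used in the example is an assumption.

```python
from urllib.parse import urlparse


def underlying_scheme(provider_uri):
    """Filesystem scheme a provider URI points at, e.g. 's3a' or 'file'."""
    netloc = urlparse(provider_uri).netloc
    if "@" in netloc:
        # Assumed form: jceks://s3a@bucket/path -> scheme is before the '@'.
        return netloc.split("@", 1)[0]
    return netloc


def exclude_providers_on(provider_path, fs_scheme):
    """Drop providers stored on fs_scheme from a comma-separated path."""
    kept = []
    for uri in provider_path.split(","):
        uri = uri.strip()
        if uri and underlying_scheme(uri) != fs_scheme:
            kept.append(uri)
    return ",".join(kept)


path = "jceks://s3a@bucket/secrets.jceks,jceks://file/tmp/local.jceks"
print(exclude_providers_on(path, "s3a"))
```

Dropping the self-referential entry avoids the loop in which initialising the filesystem requires credentials that can only be read through that same filesystem.
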
Masatake Iwasaki 76393e1359 HADOOP-17899. Avoid using implicit dependency on junit-jupiter-api. (#3399)
(cherry picked from commit ce7a5bfbd3)
2021-09-08 09:11:39 +00:00
Masatake Iwasaki 5926ccde77 HADOOP-17897. Allow nested blocks in switch case in checkstyle settings. (#3394)
(cherry picked from commit e183ec8998)
2021-09-08 04:59:05 +00:00
Mukund Thakur 3b1c594355 HADOOP-17156. ABFS: Release the byte buffers held by input streams in close() (#3285)
Contributed By: Mukund Thakur
2021-09-07 15:29:22 +05:30
Yellow Flash 09e8e5c5cb
HADOOP-17870. Http Filesystem to qualify relative paths. (#3338)
Contributed by Yellowflash

Change-Id: I217da06a1a2e5c0ca2b324f8e21baa0846f64858
2021-09-07 10:54:35 +01:00
Chris Nauroth cc90b4f987 HADOOP-15129. Datanode caches namenode DNS lookup failure and cannot startup (#3348)
Co-authored-by: Karthik Palaniappan

Change-Id: Id079a5319e5e83939d5dcce5fb9ebe3715ee864f
2021-09-03 18:48:07 +00:00
Viraj Jasani 7a4eaeb8bf
HADOOP-17874. ExceptionsHandler to add terse/suppressed Exceptions in thread-safe manner (#3343)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit 99a157fa4a)
2021-09-03 10:27:06 +09:00
Ahmed Hussein fc5d67dfb4 HADOOP-17886. Upgrade ant to 1.10.11 (#3371)
(cherry picked from commit 051207375b)
2021-09-02 16:17:32 -05:00