25116 Commits

Author SHA1 Message Date
Steve Loughran daa33aafff
HADOOP-18577. ABFS: Add probes of readahead fix (#5205)
Followup patch to  HADOOP-18456 as part of HADOOP-18521,
ABFS ReadBufferManager buffer sharing across concurrent HTTP requests

Add probes of the readahead fix to aid in checking the safety of the
hadoop ABFS client across different releases.

* ReadBufferManager constructor logs the fact it is safe at TRACE
* AbfsInputStream declares it is fixed in toString()
  by including "fs.azure.capability.readahead.safe" in the
  result.

The ABFS FileSystem hasPathCapability("fs.azure.capability.readahead.safe")
probe returns true to indicate the client's readahead manager has been fixed
to be safe when prefetching.

All Hadoop releases for which this probe returns false
and for which the probe "fs.capability.etags.available"
returns true are at risk of returning invalid data when reading
ADLS Gen2/Azure storage data.
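
The rule above can be sketched as a small check. This is only an illustration, not code from the patch: `ReadaheadRiskCheck` and `isAtRisk` are hypothetical names, and the `Predicate<String>` is an assumed stand-in for calling `hasPathCapability(path, capability)` on a real ABFS FileSystem instance.

```java
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical helper; not part of Hadoop. */
final class ReadaheadRiskCheck {
  static final String READAHEAD_SAFE = "fs.azure.capability.readahead.safe";
  static final String ETAGS_AVAILABLE = "fs.capability.etags.available";

  /**
   * @param hasCapability assumed stand-in for fs.hasPathCapability(path, name)
   * @return true if the client matches the at-risk pattern:
   *         readahead probe false, etags probe true
   */
  static boolean isAtRisk(Predicate<String> hasCapability) {
    return !hasCapability.test(READAHEAD_SAFE)
        && hasCapability.test(ETAGS_AVAILABLE);
  }

  public static void main(String[] args) {
    // Simulated client that reports etags but not the readahead fix.
    Set<String> unfixedClient = Set.of(ETAGS_AVAILABLE);
    System.out.println("at risk: " + isAtRisk(unfixedClient::contains));
  }
}
```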

Contributed by Steve Loughran.
2022-12-15 17:11:22 +00:00
Steve Loughran 65892a7759
HADOOP-18573. Improve error reporting on non-standard kerberos names (#5221)
The kerberos RFC does not declare any restriction on
characters used in kerberos names, though
implementations MAY be more restrictive.

If the kerberos controller supports the use of non-conventional
principal names *and the kerberos admin chooses to use them*
this can confuse some of the parsing.

The obvious solution is for the enterprise admins to "not do that"
as a lot of things break, bits of hadoop included.

Harden the hadoop code slightly so at least we fail more gracefully,
so people can then get in touch with their sysadmin and tell them
to stop it.
2022-12-15 11:44:12 +00:00
Mehakmeet Singh 1009d2560f
HADOOP-18574. Changing log level of IOStatistics increment to make the DEBUG logs less noisy (#5223)
Contributed by: Mehakmeet Singh
2022-12-15 11:43:49 +00:00
Steve Loughran ba55f370a9
HADOOP-18526. Leak of S3AInstrumentation instances via hadoop Metrics references (#5144)
This has triggered an OOM in a process which was churning through s3a fs
instances; the increased memory footprint of IOStatistics amplified what
must have been a long-standing issue with FS instances being created
and not closed()

*  Makes sure instrumentation is closed when the FS is closed.
*  Uses a weak reference from metrics to instrumentation, so even
   if the FS wasn't closed (see HADOOP-18478), this back reference
   would not cause the S3AInstrumentation reference to be retained.
*  If S3AFileSystem is configured to log at TRACE it will log the
   calling stack of initialize(), to help identify where the
   instance is being created. This should help track down
   the cause of instance leakage.
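
The weak back-reference in the second bullet can be illustrated as follows. This is a minimal sketch under assumptions: `Instrumentation` and `MetricsSource` are hypothetical stand-ins, not the S3A or Hadoop metrics classes, and the value 42 is a placeholder.

```java
import java.lang.ref.WeakReference;

/** Hypothetical stand-ins; not the S3A classes. */
final class WeakMetricsSketch {
  /** Stands in for the per-filesystem instrumentation object. */
  static final class Instrumentation {
    long bytesRead() {
      return 42L; // placeholder value for the sketch
    }
  }

  /** Stands in for the long-lived metrics-side registration. */
  static final class MetricsSource {
    // Weak reference: this registration alone does not keep
    // the instrumentation reachable.
    private final WeakReference<Instrumentation> ref;

    MetricsSource(Instrumentation instrumentation) {
      this.ref = new WeakReference<>(instrumentation);
    }

    /** Returns the current value, or -1 once the instrumentation was collected. */
    long sample() {
      Instrumentation i = ref.get();
      return i == null ? -1L : i.bytesRead();
    }
  }

  public static void main(String[] args) {
    Instrumentation live = new Instrumentation();
    MetricsSource source = new MetricsSource(live);
    System.out.println(source.sample());
    // Touch the strong reference afterwards so it stays reachable above.
    System.out.println(live.bytesRead());
  }
}
```

Once nothing else holds the instrumentation strongly, the garbage collector is free to reclaim it even though the metrics side still holds its `WeakReference`.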

Contributed by Steve Loughran.
2022-12-14 18:23:04 +00:00
Steve Loughran 654082773c
HADOOP-18183. s3a audit logs to publish range start/end of GET requests. (#5110)
The start and end of the range is set in a new audit param "rg",
e.g. "?rg=100-200"
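
A minimal sketch of how such a parameter could be formatted. `AuditRangeSketch` and `rangeParam` are hypothetical names, and the argument validation is an assumption of this sketch, not behaviour taken from the patch.

```java
/** Hypothetical helper; not the S3A auditing code. */
final class AuditRangeSketch {
  /** Builds the "rg" audit parameter for a GET covering start to end. */
  static String rangeParam(long start, long end) {
    // Validation is an assumption added for the sketch.
    if (start < 0 || end < start) {
      throw new IllegalArgumentException("invalid range " + start + "-" + end);
    }
    return "rg=" + start + "-" + end;
  }

  public static void main(String[] args) {
    System.out.println("?" + rangeParam(100, 200)); // prints ?rg=100-200
  }
}
```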

Contributed by Ankit Saurabh
2022-12-14 16:51:46 +00:00
Doroszlai, Attila 6202348502
HADOOP-18569. NFS Gateway may release buffer too early (#5207) (#5211)
(cherry picked from commit df4812df65)
2022-12-14 15:56:16 +01:00
Jack Richard Buggins c5b360fd15
HADOOP-18329. Support for IBM Semeru JVM > 11.0.15.0 Vendor Name Changes (#4537) (#5208)
The static boolean PlatformName.IBM_JAVA now identifies
Java 11+ IBM Semeru runtimes as IBM JVM releases.

Contributed by Jack Buggins.
2022-12-12 17:28:56 +00:00
Pranav Saxena 50a0f33cc9
HADOOP-18546. ABFS. disable purging list of in progress reads in abfs stream close() (#5176)
This addresses HADOOP-18521, "ABFS ReadBufferManager buffer sharing
across concurrent HTTP requests" by not trying to cancel
in progress reads.

It supersedes HADOOP-18528, which disables the prefetching.
If that patch is applied *after* this one, prefetching
will be disabled.

As well as changing the default value in the code,
core-default.xml is updated to set
fs.azure.enable.readahead = true

As a result, if Configuration.get("fs.azure.enable.readahead")
returns a non-null value, then it can be inferred that
it was set either in core-default.xml (the fix is present)
or in core-site.xml (someone asked for it).
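
The inference can be sketched like this. A plain `Map` is an assumed stand-in for a loaded Hadoop `Configuration`, and `ReadaheadOptionProbe` and `optionDeclared` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper; a Map stands in for a loaded Configuration. */
final class ReadaheadOptionProbe {
  static final String ENABLE_READAHEAD = "fs.azure.enable.readahead";

  /**
   * True if the option has any value, meaning it came either from
   * core-default.xml or from core-site.xml.
   */
  static boolean optionDeclared(Map<String, String> conf) {
    return conf.get(ENABLE_READAHEAD) != null;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    System.out.println(optionDeclared(conf)); // option absent
    conf.put(ENABLE_READAHEAD, "true");
    System.out.println(optionDeclared(conf)); // option present
  }
}
```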

Note: this commit contains the followup commit:
That is needed to avoid race conditions in the test.

Contributed by Pranav Saxena.
2022-12-09 13:49:14 +00:00
K0K0V0K 8b748c1cb8 YARN-11390. TestResourceTrackerService.testNodeRemovalNormally: Shutdown nodes should be 0 now expected: <1> but was: <0> (#5190)
Reviewed-by: Peter Szucs
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
(cherry picked from commit ee7d1787cd)
2022-12-08 18:15:08 +00:00
Oleksandr Shevchenko dafc9ef8b6
HADOOP-18563. Misleading AWS SDK S3 timeout configuration comment (#5197)
Contributed by Oleksandr Shevchenko
2022-12-08 15:12:58 +00:00
Steve Loughran 95890f3368
HADOOP-18560. AvroFSInput opens a stream twice and discards the second one without closing (#5186)
This is needed for branches with  the hadoop-common changes of
HADOOP-16202. Enhanced openFile()
2022-12-06 09:59:51 +00:00
Steve Loughran 36889005f7
HADOOP-18470. index.md update for 3.3.5 release 2022-12-05 16:22:40 +00:00
dingshun3016 ebd2407d48 HDFS-16809. EC striped block is not sufficient when doing in maintenance. (#5050)
(cherry picked from commit 02afb9ebe1)
2022-12-05 16:39:49 +09:00
Ashutosh Gupta fa2a0a603a HDFS-16633. Fixing when Reserved Space For Replicas is not released on some cases (#4452)
* HDFS-16633. Reserved Space For Replicas is not released on some cases

Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
(cherry picked from commit b7edc6c60c)

 Conflicts:
	hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
	hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestSpaceReservation.java
2022-12-01 15:00:32 +09:00
Anmol Asrani 1cc8cb68f2
HADOOP-18457. ABFS: Support account level throttling (#5034)
This allows abfs request throttling to be shared across all
abfs connections talking to containers belonging to the same abfs storage
account, as that is the level at which IO throttling is applied.

The feature is enabled or disabled through the configuration option
"fs.azure.account.throttling.enabled".
The default is "true".

Contributed by Anmol Asrani
2022-11-30 13:14:11 +00:00
Kidd5368 8c7f2ddc10
HDFS-16839 It should consider EC reconstruction work when we determine if a node is busy (#5128)
Co-authored-by: Takanobu Asanuma <tasanuma@apache.org>
Reviewed-by: Tao Li <tomscut@apache.org>
(cherry picked from commit 72749a4ff8)
2022-11-29 17:45:12 -08:00
Owen O'Malley 6b23e70539 HDFS-16851: RBF: Add a utility to dump the StateStore. (#5155) 2022-11-29 14:19:19 -08:00
HarshitGupta11 f29d9a11bc
HADOOP-18530. ChecksumFileSystem::readVectored might return byte buffers not positioned at 0 (#5168)
Contributed by Harshit Gupta
2022-11-29 14:52:25 +00:00
Simbarashe Dzinamarira 9d37ee082c HDFS-16847: RBF: Prevents StateStoreFileSystemImpl from committing tmp file after encountering an IOException. (#5145) 2022-11-28 16:54:25 -08:00
sreeb-msft 00249619a0
HADOOP-18498. ABFS: Remove unwanted ? prefix from SAS Tokens (#5136)
This commit parses SAS Tokens and removes the unwanted prefix of '?' from them, if present.

At present, SAS Tokens are provided to the driver through customer implementations of the SASTokenProvider interface. The SAS token providers should not assume that the token will be the first query parameter in the URIs that communicate with the backend. However, it was observed that certain public interfaces provided by Storage to generate SAS can include the '?' as the first character of the SAS Token, which would ideally be the case when it is the first query parameter. Thus, tokens that contain this prefix will lead to an error in the driver due to a clash of query parameters.

To avoid failures for use of such SAS tokens, after receiving the SAS Token from the provider, the code checks for whether any ? prefix is present or not. If yes, it is removed before further usage of the token. This way, users would not have to manually remove the prefix before passing it on as a configuration.
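
The prefix handling described above amounts to something like the following sketch. `SasTokenSketch` and `stripLeadingQuestionMark` are hypothetical names, and the token values are made up for illustration; this is not the driver's actual code.

```java
/** Hypothetical helper; not the ABFS driver code. */
final class SasTokenSketch {
  /** Removes a single leading '?' from the token, if one is present. */
  static String stripLeadingQuestionMark(String sasToken) {
    if (sasToken != null && sasToken.startsWith("?")) {
      return sasToken.substring(1);
    }
    return sasToken;
  }

  public static void main(String[] args) {
    // Illustrative token only; not a real SAS token.
    System.out.println(stripLeadingQuestionMark("?sv=2020&sig=abc")); // prints sv=2020&sig=abc
  }
}
```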

Contributed by Sree Bhattacharya
2022-11-28 11:40:06 +00:00
zhengchenyu 4addf31ef4 HDFS-16832. [SBN READ] Follow-on to HDFS-16732. Fix NPE when check the block location of empty directory (#5099)
Signed-off-by: Erik Krogen <xkrogen@apache.org>
Reviewed-by: Zengqiang Xu <xuzq_zander@163.com>

(cherry picked from commit dc2fba45fe)
2022-11-21 08:33:16 -08:00
Owen O'Malley 8ff54dac58 HDFS-16844: RBF: Adds resiliency when StateStore gets exceptions. (#5138)
Allows the StateStore to stay up when there are errors reading the data.
2022-11-18 09:29:08 -08:00
Owen O'Malley 9b3ffe960e HADOOP-18324. Interrupting RPC Client calls can lead to thread exhaustion. (#4527)
* Exactly one sending thread per RPC connection.
* If the calling thread is interrupted before the socket write, the call is skipped instead of being sent anyway.
* If the calling thread is interrupted during the socket write, the write will finish.
* RPC requests will be written to the socket in the order received.
* Sending thread is only started by the receiving thread.
* The sending thread periodically checks the shouldCloseConnection flag.
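
The ordering and skip rules in the list above can be sketched with a single-threaded executor. This is an illustration under assumptions, not the RPC client code: `SingleSenderSketch` and `Call` are hypothetical, the `cancelled` flag stands in for the caller having been interrupted before the write, and a list stands in for the socket. How the sending thread is started and the `shouldCloseConnection` check are not modelled.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch; not the Hadoop IPC client. */
final class SingleSenderSketch {
  static final class Call {
    final int id;
    // Stand-in for "calling thread was interrupted before the write".
    final AtomicBoolean cancelled = new AtomicBoolean(false);

    Call(int id) {
      this.id = id;
    }
  }

  // One sender thread per connection; tasks run in submission order.
  private final ExecutorService sender = Executors.newSingleThreadExecutor();
  // Stand-in for the socket: records the ids actually written.
  private final List<Integer> wire = new CopyOnWriteArrayList<>();

  void submit(Call call) {
    sender.execute(() -> {
      if (call.cancelled.get()) {
        return; // skipped: never written to the socket
      }
      wire.add(call.id); // a write that has started always completes
    });
  }

  /** Stops the sender, waits for queued calls, and returns what was written. */
  List<Integer> closeAndGetWire() {
    sender.shutdown();
    try {
      sender.awaitTermination(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while draining", e);
    }
    return List.copyOf(wire);
  }
}
```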
2022-11-18 08:29:28 -08:00
Lei Yang b68520d2a5 HDFS-16836: StandbyCheckpointer shouldn't trigger rollback fs image after RU is finalized (#5135)
Co-authored-by: Lei Yang <leyang@linkedin.com>
2022-11-15 15:10:41 -08:00
Mehakmeet Singh 9e53ed3602
HADOOP-18528. Disable abfs prefetching by default (#5134)
Disables block prefetching on ABFS InputStreams, by setting
fs.azure.enable.readahead to false in core-default.xml and
the matching java constant.

This prevents
HADOOP-18521. ABFS ReadBufferManager buffer sharing across concurrent HTTP requests.

Once a fix for that is committed, this change can be reverted.

Contributed by Mehakmeet Singh.
2022-11-15 14:29:33 +00:00
huhaiyang 033ceca090
HDFS-16811. Support DecommissionBackoffMonitor parameters reconfigurable (#5122)
Signed-off-by: Tao Li <tomscut@apache.org>
2022-11-10 13:37:09 +08:00
Steve Loughran b1ea32f91c
HADOOP-18517. ABFS: Add fs.azure.enable.readahead option to disable readahead (#5103)
* HADOOP-18517. ABFS: Add fs.azure.enable.readahead option to disable readahead

Adds new config option to turn off readahead
* also allows it to be passed in through openFile(),
* extends ITestAbfsReadWriteAndSeek to use the option, including one
  replicated test...that shows that turning it off is slower.

Important: this does not address the critical data corruption issue
HADOOP-18521. ABFS ReadBufferManager buffer sharing across concurrent HTTP requests

What it does do is provide a way to completely bypass the ReadBufferManager.
To mitigate the problem, either set fs.azure.enable.readahead to false
or set "fs.azure.readaheadqueue.depth" to 0; the latter still goes near the (broken)
ReadBufferManager code, but doesn't trigger the bug.

For safe reading of files through the ABFS connector, readahead MUST be disabled
or the followup fix to HADOOP-18521 applied
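
A deployment check for the two mitigations could look like the sketch below. A plain `Map` is an assumed stand-in for the effective configuration, and `ReadaheadMitigationSketch` and `isMitigated` are hypothetical names. The sketch deliberately treats an unset option as "not mitigated", because defaults differ between releases.

```java
import java.util.Map;

/** Hypothetical helper; a Map stands in for the effective configuration. */
final class ReadaheadMitigationSketch {
  static final String ENABLE_READAHEAD = "fs.azure.enable.readahead";
  static final String QUEUE_DEPTH = "fs.azure.readaheadqueue.depth";

  /** True only if one of the two mitigations is explicitly configured. */
  static boolean isMitigated(Map<String, String> conf) {
    boolean readaheadOff =
        "false".equalsIgnoreCase(trim(conf.get(ENABLE_READAHEAD)));
    boolean depthZero = "0".equals(trim(conf.get(QUEUE_DEPTH)));
    return readaheadOff || depthZero;
  }

  private static String trim(String s) {
    return s == null ? null : s.trim();
  }

  public static void main(String[] args) {
    System.out.println(isMitigated(Map.of(ENABLE_READAHEAD, "false")));
    System.out.println(isMitigated(Map.of()));
  }
}
```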

Contributed by Steve Loughran
2022-11-08 13:41:31 +00:00
Steve Loughran c392075761
HADOOP-18507. VectorIO FileRange type to support a "reference" field (#5076)
Contributed by Steve Loughran
2022-11-08 13:35:42 +00:00
Melissa You d33ee67151
HADOOP-18520. Backport HADOOP-18427 and HADOOP-18452 to branch-3.3 (#5118)
* HADOOP-18427. Improve ZKDelegationTokenSecretManager#startThead With recommended methods. (#4812)

* HADOOP-18452. Fix TestKMS#testKMSHAZooKeeperDelegationToken Failed By HADOOP-18427. (#4885)

Co-authored-by: slfan1989 <55643692+slfan1989@users.noreply.github.com>
2022-11-07 18:48:29 -08:00
Melissa You 02aedd7811
HADOOP-18519. Backport HDFS-15383 and HADOOP-17835 to branch-3.3 (#5112)
* HDFS-15383. RBF: Add support for router delegation token without watch (#2047)

Improves the router's performance for delegation-token operations by removing the router's watchers on tokens. In our experience, the huge number of watches inside Zookeeper degrades Zookeeper's performance severely; the current limit is about 1.2-1.5 million.

* HADOOP-17835. Use CuratorCache implementation instead of PathChildrenCache / TreeCache (#3266)

Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
Co-authored-by: lfengnan <lfengnan@uber.com>
Co-authored-by: Viraj Jasani <vjasani@apache.org>
Co-authored-by: Melissa You <myou@myou-mn1.linkedin.biz>
2022-11-07 13:29:50 -08:00
Melissa You 853ffb545a
HADOOP-18515. Backport HADOOP-17612 to branch-3.3(Upgrade Zookeeper to 3.6.3 and Curator to 5.2.0) (#5097)
* HADOOP-17612. Upgrade Zookeeper to 3.6.3 and Curator to 5.2.0 (#3241)

Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
Co-authored-by: Viraj Jasani <vjasani@apache.org>
Co-authored-by: Melissa You <myou@myou-mn1.linkedin.biz>
2022-11-05 09:28:24 -07:00
Ashutosh Gupta 7b84f6458b
HADOOP-18484. Upgrade hsqldb to v2.7.1 to mitigate CVE-2022-41853 (#5101) 2022-11-04 11:00:17 +01:00
Ashutosh Gupta 0961014262 YARN-11364. Docker Container to accept docker Image name with sha256 digest (#5092)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Reviewed-by: slfan1989 <55643692+slfan1989@users.noreply.github.com>
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
(cherry picked from commit 83acb55981)
2022-11-01 21:45:04 +00:00
Ashutosh Gupta c5830d237b YARN-11363. Remove unused TimelineVersionWatcher and TimelineVersion from hadoop-yarn-server-tests (#5091)
Reviewed-by: slfan1989 <55643692+slfan1989@users.noreply.github.com>
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
(cherry picked from commit 69225ae5b9)
2022-11-01 21:10:46 +00:00
wangteng13 4da1cad680 document fix for MAPREDUCE-7425 (#5090)
Reviewed-by: Ashutosh Gupta <ashutosh.gupta@st.niituniversity.in>
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
(cherry picked from commit 388f2f182f)
2022-11-01 20:36:17 +00:00
PJ Fanning d88a6ee962
HADOOP-18512: upgrade woodstox-core to 5.4.0 for security fix (#5087). Contributed by PJ Fanning.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-11-02 00:14:01 +05:30
sabertiger 1fdc6c5322
HADOOP-18233. Possible race condition with TemporaryAWSCredentialsProvider (#5024)
This fixes a race condition with the TemporaryAWSCredentialsProvider,
one which has existed for a long time but which only surfaced
(usually in Spark) when the bucket existence probe was disabled
by setting fs.s3a.bucket.probe to 0, a performance speedup
which was made the default in HADOOP-17454.

Contributed by Jimmy Wong.
2022-10-31 17:50:49 +00:00
PJ Fanning 41e3c7edaf
HADOOP-18472. Upgrade to snakeyaml 1.33 (#4958)
Reviewed-by: Dinesh Chitlangia <dineshc@apache.org>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
(cherry picked from commit d6a65a4180)

 Conflicts:
	LICENSE-binary
	hadoop-project/pom.xml
2022-10-30 02:32:44 +09:00
Chris Nauroth 33293d4ba4 YARN-11360: Add number of decommissioning/shutdown nodes to YARN cluster metrics. (#5060)
(cherry picked from commit bfb84cd7f6)
2022-10-28 18:13:57 +00:00
Mehakmeet Singh 6accb7809f
HADOOP-18499. S3A to support HTTPS web proxies (#5083)
The option "fs.s3a.proxy.ssl.enabled" controls
whether the S3A connector connects to a proxy over HTTP (default) or HTTPS.
Set it to "true" to use HTTPS.

Contributed by Mehakmeet Singh
2022-10-27 20:17:57 +05:30
M1eyu2018 cbac2c4875 HDFS-16716. Improve appendToFile command: support appending on file with new block (#4697)
Reviewed-by: xuzq <15040255127@163.com>
Signed-off-by: Tao Li <tomscut@apache.org>
2022-10-27 19:11:51 +08:00
Takanobu Asanuma 53143409a8 HDFS-16822. HostRestrictingAuthorizationFilter should pass through requests if they don't access WebHDFS API. (#5079)
Reviewed-by: Ashutosh Gupta <ashugpt@amazon.com>
Reviewed-by: Tao Li <tomscut@apache.org>
(cherry picked from commit 545a556883)
2022-10-27 14:40:07 +09:00
PJ Fanning bd276092b0 MAPREDUCE-7411: use secure XML parsers in mapreduce modules (#4980)
Lockdown of parsers in hadoop-mapreduce.

Follow-on to HADOOP-18469. Add secure XML parser factories to XMLUtils

Contributed by P J Fanning
2022-10-26 11:04:29 +01:00
Viraj Jasani 36a0e818ec HDFS-16016. BPServiceActor to provide new thread to handle IBR (#2998)
Contributed by Viraj Jasani

(cherry picked from commit c1bf3cb0da)
2022-10-24 15:16:38 +09:00
Takanobu Asanuma 198bc444de
HDFS-16566 Erasure Coding: Recovery may cause excess replicas when busy DN exists (#4252) (#5059)
(cherry picked from commit 9376b65989)

Co-authored-by: RuinanGu <57645247+RuinanGu@users.noreply.github.com>
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
Reviewed-by: Ashutosh Gupta <ashugpt@amazon.com>
2022-10-22 13:14:04 +09:00
Steve Loughran 19f8e4f34d
YARN-11330. use secure XML parsers (#4981)
Move construction of XML parsers in YARN
modules to using the locked-down parser factory
of HADOOP-18469.

One exception: GpuDeviceInformationParser still supports DTD resolution;
all other features are disabled.

Contributed by P J Fanning
2022-10-21 14:16:22 +01:00
SevenAddSix 237814a9b3 HDFS-16480. Fix typo: indicies -> indices (#4020)
(cherry picked from commit 5eab9719cb)
2022-10-21 17:32:58 +09:00
Hui Fei 0c2234fd8e HDFS-15803. EC: Remove unnecessary method (getWeight) in StripedReconstructionInfo. Contributed by huhaiyang
(cherry picked from commit 66ecee333e)
2022-10-21 17:31:30 +09:00
FuzzingTeam 1f414ab847
HADOOP-18471. Fixed ArrayIndexOutOfBoundsException in DefaultStringifier (#4957)
Contributed by FuzzingTeam
2022-10-20 18:14:24 +01:00
Steve Loughran 75b04010a2
HDFS-16795. Use secure XML parsers (#4979)
Move construction of XML parsers in HDFS
modules to using the locked-down parser factory
of HADOOP-18469.

Contributed by P J Fanning
2022-10-20 17:48:58 +01:00