hadoop

Commit Graph

Author	SHA1	Message	Date
PJ Fanning	f856611121	HADOOP-18587: upgrade to jettison 1.5.3 due to cve (#5270 ) Signed-off-by: Chris Nauroth <cnauroth@apache.org> (cherry picked from commit `b9eb760ed2`)	2023-01-06 23:41:18 +00:00
Ayush Saxena	f63f20259b	HADOOP-18586. Update the year to 2023. (#5265 ). Contributed by Ayush Saxena. Reviewed-by: Takanobu Asanuma <tasanuma@apache.org>	2023-01-01 22:45:23 +05:30
Chris Nauroth	1f270d8a5e	YARN-11388: Prevent resource leaks in TestClientRMService. (#5187 ) Signed-off-by: Shilun Fan <slfan1989@apache.org> (cherry picked from commit `6b67373d10`)	2022-12-28 19:02:02 +00:00
curie71	290dc7817c	YARN-11392 Audit Log missing in ClientRMService (#5250 ). Contributed by Beibei Zhao. Signed-off-by: Chris Nauroth <cnauroth@apache.org> (cherry picked from commit `9668a85d40`)	2022-12-28 00:00:18 +00:00
Bence Kosztolnik	7190fcf713	YARN-11395. RM UI, RMAttemptBlock can not render FINAL_SAVING. Contributed by Bence Kosztolnik - In the YARN-1345 remove of FINAL_SAVING was missed from RMAttemptBlock - Same issue was present after YARN-1345 in YARN-4411 - YARN-4411 logic was applied in this commit for FINAL_SAVING	2022-12-23 17:20:35 +01:00
susheel-gupta	f9fac84f43	YARN-10879. Incorrect WARN text in ACL check for application tag based placement (#5231 ) Change-Id: Id892e38fe4c834b1743a0df2f0a40146d3d5a878	2022-12-22 17:26:05 +01:00
Steve Loughran	cda1d45a61	HADOOP-18470. Update index md with section on ABFS prefetching	2022-12-19 13:03:57 +00:00
Steve Loughran	223046cb64	HADOOP-18561. Update commons-net to 3.9.0 (#5214 ) Addresses CVE-2021-37533, which only relates to FTP. The ftp:// filesystem, which, as anyone who has used it will know, is very minimal and so rarely used, is not a critical part of the project. Furthermore, the FTP-related issue is at worst information leakage if someone connects to a malicious server. This is a due diligence PR rather than an emergency fix. Contributed by Steve Loughran	2022-12-19 11:57:47 +00:00
PJ Fanning	29b6df563b	HADOOP-18575. Make XML transformer factory more lenient (#5224 ) Due diligence followup to HADOOP-18469. Add secure XML parser factories to XMLUtils (#4940) Contributed by P J Fanning	2022-12-18 12:26:11 +00:00
Steve Loughran	c59444b160	HADOOP-18577. Followup: javadoc fix (#5232 ) Fixes a javadoc error which came with HADOOP-18577. ABFS: Add probes of readahead fix (#5205) Part of the HADOOP-18521 ABFS readahead fix; MUST be included. Contributed by Steve Loughran	2022-12-18 12:20:41 +00:00
Chengbing Liu	bfc916e7b0	HADOOP-18567. LogThrottlingHelper: properly trigger dependent recorders in cases of infrequent logging (#5215 ) Signed-off-by: Erik Krogen <xkrogen@apache.org> Co-authored-by: Chengbing Liu <liuchengbing@qiyi.com> (cherry picked from commit `ca3526da92`)	2022-12-16 09:16:51 -08:00
Xing Lin	d43fa95043	HDFS-16852. Skip KeyProviderCache shutdown hook registration if already shutting down (#5160 ) Signed-off-by: Erik Krogen <xkrogen@apache.org> (cherry picked from commit `f7bdf6c667`)	2022-12-16 08:47:33 -08:00
Steve Loughran	daa33aafff	HADOOP-18577. ABFS: Add probes of readahead fix (#5205 ) Followup patch to HADOOP-18456 as part of HADOOP-18521, ABFS ReadBufferManager buffer sharing across concurrent HTTP requests. Adds probes of the readahead fix to aid in checking the safety of the hadoop ABFS client across different releases. * ReadBufferManager constructor logs the fact it is safe at TRACE * AbfsInputStream declares it is fixed in toString() by including "fs.azure.capability.readahead.safe" in the result. The ABFS FileSystem hasPathCapability("fs.azure.capability.readahead.safe") probe returns true to indicate the client's readahead manager has been fixed to be safe when prefetching. All Hadoop releases for which this probe returns false and for which the probe "fs.capability.etags.available" returns true are at risk of returning invalid data when reading ADLS Gen2/Azure storage data. Contributed by Steve Loughran.	2022-12-15 17:11:22 +00:00
Steve Loughran	65892a7759	HADOOP-18573. Improve error reporting on non-standard kerberos names (#5221 ) The kerberos RPC does not declare any restriction on characters used in kerberos names, though implementations MAY be more restrictive. If the kerberos controller supports the use of non-conventional principal names and the kerberos admin chooses to use them, this can confuse some of the parsing. The obvious solution is for the enterprise admins to "not do that" as a lot of things break, bits of hadoop included. Harden the hadoop code slightly so at least we fail more gracefully, so people can then get in touch with their sysadmin and tell them to stop it.	2022-12-15 11:44:12 +00:00
Mehakmeet Singh	1009d2560f	HADOOP-18574. Changing log level of IOStatistics increment to make the DEBUG logs less noisy (#5223 ) Contributed by: Mehakmeet Singh	2022-12-15 11:43:49 +00:00
Steve Loughran	ba55f370a9	HADOOP-18526. Leak of S3AInstrumentation instances via hadoop Metrics references (#5144 ) This has triggered an OOM in a process which was churning through s3a fs instances; the increased memory footprint of IOStatistics amplified what must have been a long-standing issue with FS instances being created and not closed() * Makes sure instrumentation is closed when the FS is closed. * Uses a weak reference from metrics to instrumentation, so even if the FS wasn't closed (see HADOOP-18478), this back reference would not cause the S3AInstrumentation reference to be retained. * If S3AFileSystem is configured to log at TRACE it will log the calling stack of initialize(), to help identify where the instance is being created. This should help track down the cause of instance leakage. Contributed by Steve Loughran.	2022-12-14 18:23:04 +00:00
Steve Loughran	654082773c	HADOOP-18183. s3a audit logs to publish range start/end of GET requests. (#5110 ) The start and end of the range is set in a new audit param "rg", e.g "?rg=100-200" Contributed by Ankit Saurabh	2022-12-14 16:51:46 +00:00
Doroszlai, Attila	6202348502	HADOOP-18569. NFS Gateway may release buffer too early (#5207 ) (#5211 ) (cherry picked from commit `df4812df65`)	2022-12-14 15:56:16 +01:00
Jack Richard Buggins	c5b360fd15	HADOOP-18329. Support for IBM Semeru JVM > 11.0.15.0 Vendor Name Changes (#4537 ) (#5208 ) The static boolean PlatformName.IBM_JAVA now identifies Java 11+ IBM Semeru runtimes as IBM JVM releases. Contributed by Jack Buggins.	2022-12-12 17:28:56 +00:00
Pranav Saxena	50a0f33cc9	HADOOP-18546. ABFS. disable purging list of in progress reads in abfs stream close() (#5176 ) This addresses HADOOP-18521, "ABFS ReadBufferManager buffer sharing across concurrent HTTP requests" by not trying to cancel in progress reads. It supersedes HADOOP-18528, which disables the prefetching. If that patch is applied after this one, prefetching will be disabled. As well as changing the default value in the code, core-default.xml is updated to set fs.azure.enable.readahead = true. As a result, if Configuration.get("fs.azure.enable.readahead") returns a non-null value, then it can be inferred that it was set either in core-default.xml (the fix is present) or in core-site.xml (someone asked for it). Note: this commit contains the followup commit: That is needed to avoid race conditions in the test. Contributed by Pranav Saxena.	2022-12-09 13:49:14 +00:00
K0K0V0K	8b748c1cb8	YARN-11390. TestResourceTrackerService.testNodeRemovalNormally: Shutdown nodes should be 0 now expected: <1> but was: <0> (#5190 ) Reviewed-by: Peter Szucs Signed-off-by: Chris Nauroth <cnauroth@apache.org> (cherry picked from commit `ee7d1787cd`)	2022-12-08 18:15:08 +00:00
Oleksandr Shevchenko	dafc9ef8b6	HADOOP-18563. Misleading AWS SDK S3 timeout configuration comment (#5197 ) Contributed by Oleksandr Shevchenko	2022-12-08 15:12:58 +00:00
Steve Loughran	95890f3368	HADOOP-18560. AvroFSInput opens a stream twice and discards the second one without closing (#5186 ) This is needed for branches with the hadoop-common changes of HADOOP-16202. Enhanced openFile()	2022-12-06 09:59:51 +00:00
Steve Loughran	36889005f7	HADOOP-18470. index.md update for 3.3.5 release	2022-12-05 16:22:40 +00:00
dingshun3016	ebd2407d48	HDFS-16809. EC striped block is not sufficient when doing in maintenance. (#5050 ) (cherry picked from commit `02afb9ebe1`)	2022-12-05 16:39:49 +09:00
Ashutosh Gupta	fa2a0a603a	HDFS-16633. Fixing when Reserved Space For Replicas is not released on some cases (#4452 ) * HDFS-16633.Reserved Space For Replicas is not released on some cases Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com> (cherry picked from commit `b7edc6c60c`) Conflicts: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestSpaceReservation.java	2022-12-01 15:00:32 +09:00
Anmol Asrani	1cc8cb68f2	HADOOP-18457. ABFS: Support account level throttling (#5034 ) This allows abfs request throttling to be shared across all abfs connections talking to containers belonging to the same abfs storage account -as that is the level at which IO throttling is applied. The option is enabled/disabled in the configuration option "fs.azure.account.throttling.enabled"; The default is "true" Contributed by Anmol Asrani	2022-11-30 13:14:11 +00:00
Kidd5368	8c7f2ddc10	HDFS-16839 It should consider EC reconstruction work when we determine if a node is busy (#5128 ) Co-authored-by: Takanobu Asanuma <tasanuma@apache.org> Reviewed-by: Tao Li <tomscut@apache.org> (cherry picked from commit `72749a4ff8`)	2022-11-29 17:45:12 -08:00
Owen O'Malley	6b23e70539	HDFS-16851: RBF: Add a utility to dump the StateStore. (#5155 )	2022-11-29 14:19:19 -08:00
HarshitGupta11	f29d9a11bc	HADOOP-18530. ChecksumFileSystem::readVectored might return byte buffers not positioned at 0 (#5168 ) Contributed by Harshit Gupta	2022-11-29 14:52:25 +00:00
Simbarashe Dzinamarira	9d37ee082c	HDFS-16847: RBF: Prevents StateStoreFileSystemImpl from committing tmp file after encountering an IOException. (#5145 )	2022-11-28 16:54:25 -08:00
sreeb-msft	00249619a0	HADOOP-18498. ABFS: Remove unwanted ? prefix from SAS Tokens (#5136 ) This commit parses SAS Tokens and removes the unwanted prefix of '?' from them, if present. At present, SAS Tokens are provided to the driver through customer implementations of the SASTokenProvider interface. The SAS token providers should not assume that the token will be the first query parameter in the URIs that communicate with the backend. However, it was observed that certain public interfaces provided by Storage to generate SAS can include the '?' as the first character of the SAS Token, which would ideally be the case when it is the first query parameter. Thus, tokens that contain this prefix will lead to an error in the driver due to a clash of query parameters. To avoid failures for use of such SAS tokens, after receiving the SAS Token from the provider, the code checks for whether any ? prefix is present or not. If yes, it is removed before further usage of the token. This way, users would not have to manually remove the prefix before passing it on as a configuration. Contributed by Sree Bhattacharya	2022-11-28 11:40:06 +00:00
zhengchenyu	4addf31ef4	HDFS-16832. [SBN READ] Follow-on to HDFS-16732. Fix NPE when check the block location of empty directory (#5099 ) Signed-off-by: Erik Krogen <xkrogen@apache.org> Reviewed-by: Zengqiang Xu <xuzq_zander@163.com> (cherry picked from commit `dc2fba45fe`)	2022-11-21 08:33:16 -08:00
Owen O'Malley	8ff54dac58	HDFS-16844: RBF: Adds resiliency when StateStore gets exceptions. (#5138 ) Allows the StateStore to stay up when there are errors reading the data.	2022-11-18 09:29:08 -08:00
Owen O'Malley	9b3ffe960e	HADOOP-18324. Interrupting RPC Client calls can lead to thread exhaustion. (#4527 ) * Exactly 1 sending thread per RPC connection. * If the calling thread is interrupted before the socket write, it will be skipped instead of sending it anyway. * If the calling thread is interrupted during the socket write, the write will finish. * RPC requests will be written to the socket in the order received. * Sending thread is only started by the receiving thread. * The sending thread periodically checks the shouldCloseConnection flag.	2022-11-18 08:29:28 -08:00
Lei Yang	b68520d2a5	HDFS-16836: StandbyCheckpointer shouldn't trigger rollback fs image after RU is finalized (#5135 ) Co-authored-by: Lei Yang <leyang@linkedin.com>	2022-11-15 15:10:41 -08:00
Mehakmeet Singh	9e53ed3602	HADOOP-18528. Disable abfs prefetching by default (#5134 ) Disables block prefetching on ABFS InputStreams, by setting fs.azure.enable.readahead to false in core-default.xml and the matching java constant. This prevents HADOOP-18521. ABFS ReadBufferManager buffer sharing across concurrent HTTP requests. Once a fix for that is committed, this change can be reverted. Contributed by Mehakmeet Singh.	2022-11-15 14:29:33 +00:00
huhaiyang	033ceca090	HDFS-16811. Support DecommissionBackoffMonitor parameters reconfigurable (#5122 ) Signed-off-by: Tao Li <tomscut@apache.org>	2022-11-10 13:37:09 +08:00
Steve Loughran	b1ea32f91c	HADOOP-18517. ABFS: Add fs.azure.enable.readahead option to disable readahead (#5103 ) * HADOOP-18517. ABFS: Add fs.azure.enable.readahead option to disable readahead Adds new config option to turn off readahead * also allows it to be passed in through openFile(), * extends ITestAbfsReadWriteAndSeek to use the option, including one replicated test...that shows that turning it off is slower. Important: this does not address the critical data corruption issue HADOOP-18521. ABFS ReadBufferManager buffer sharing across concurrent HTTP requests. What it does do is provide a way to completely bypass the ReadBufferManager. To mitigate the problem, either fs.azure.enable.readahead needs to be set to false, or "fs.azure.readaheadqueue.depth" set to 0 -this still goes near the (broken) ReadBufferManager code, but doesn't trigger the bug. For safe reading of files through the ABFS connector, readahead MUST be disabled or the followup fix to HADOOP-18521 applied. Contributed by Steve Loughran	2022-11-08 13:41:31 +00:00
Steve Loughran	c392075761	HADOOP-18507. VectorIO FileRange type to support a "reference" field (#5076 ) Contributed by Steve Loughran	2022-11-08 13:35:42 +00:00
Melissa You	d33ee67151	HADOOP-18520. Backport HADOOP-18427 and HADOOP-18452 to branch-3.3 (#5118 ) * HADOOP-18427. Improve ZKDelegationTokenSecretManager#startThead With recommended methods. (#4812) * HADOOP-18452. Fix TestKMS#testKMSHAZooKeeperDelegationToken Failed By HADOOP-18427. (#4885) Co-authored-by: slfan1989 <55643692+slfan1989@users.noreply.github.com>	2022-11-07 18:48:29 -08:00
Melissa You	02aedd7811	HADOOP-18519. Backport HDFS-15383 and HADOOP-17835 to branch-3.3 (#5112 ) * HDFS-15383. RBF: Add support for router delegation token without watch (#2047) Improves the router's performance for delegation-token-related operations. It achieves the goal by removing watchers from the router on tokens since, based on our experience, the huge number of watches inside Zookeeper is degrading Zookeeper's performance pretty hard. The current limit is about 1.2-1.5 million. * HADOOP-17835. Use CuratorCache implementation instead of PathChildrenCache / TreeCache (#3266) Signed-off-by: Akira Ajisaka <aajisaka@apache.org> Co-authored-by: lfengnan <lfengnan@uber.com> Co-authored-by: Viraj Jasani <vjasani@apache.org> Co-authored-by: Melissa You <myou@myou-mn1.linkedin.biz>	2022-11-07 13:29:50 -08:00
Melissa You	853ffb545a	HADOOP-18515. Backport HADOOP-17612 to branch-3.3(Upgrade Zookeeper to 3.6.3 and Curator to 5.2.0) (#5097 ) * HADOOP-17612. Upgrade Zookeeper to 3.6.3 and Curator to 5.2.0 (#3241) Signed-off-by: Akira Ajisaka <aajisaka@apache.org> Co-authored-by: Viraj Jasani <vjasani@apache.org> Co-authored-by: Melissa You <myou@myou-mn1.linkedin.biz>	2022-11-05 09:28:24 -07:00
Ashutosh Gupta	7b84f6458b	HADOOP-18484. Upgrade hsqldb to v2.7.1 to mitigate CVE-2022-41853 (#5101 )	2022-11-04 11:00:17 +01:00
Ashutosh Gupta	0961014262	YARN-11364. Docker Container to accept docker Image name with sha256 digest (#5092 ) Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com> Reviewed-by: slfan1989 <55643692+slfan1989@users.noreply.github.com> Signed-off-by: Chris Nauroth <cnauroth@apache.org> (cherry picked from commit `83acb55981`)	2022-11-01 21:45:04 +00:00
Ashutosh Gupta	c5830d237b	YARN-11363. Remove unused TimelineVersionWatcher and TimelineVersion from hadoop-yarn-server-tests (#5091 ) Reviewed-by: slfan1989 <55643692+slfan1989@users.noreply.github.com> Signed-off-by: Chris Nauroth <cnauroth@apache.org> (cherry picked from commit `69225ae5b9`)	2022-11-01 21:10:46 +00:00
wangteng13	4da1cad680	document fix for MAPREDUCE-7425 (#5090 ) Reviewed-by: Ashutosh Gupta <ashutosh.gupta@st.niituniversity.in> Signed-off-by: Chris Nauroth <cnauroth@apache.org> (cherry picked from commit `388f2f182f`)	2022-11-01 20:36:17 +00:00
PJ Fanning	d88a6ee962	HADOOP-18512: upgrade woodstox-core to 5.4.0 for security fix (#5087 ). Contributed by PJ Fanning. Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>	2022-11-02 00:14:01 +05:30
sabertiger	1fdc6c5322	HADOOP-18233. Possible race condition with TemporaryAWSCredentialsProvider (#5024 ) This fixes a race condition with the TemporaryAWSCredentialsProvider, one which has existed for a long time but which only surfaced (usually in Spark) when the bucket existence probe was disabled by setting fs.s3a.bucket.probe to 0 -a performance speedup which was made the default in HADOOP-17454. Contributed by Jimmy Wong.	2022-10-31 17:50:49 +00:00
PJ Fanning	41e3c7edaf	HADOOP-18472. Upgrade to snakeyaml 1.33 (#4958 ) Reviewed-by: Dinesh Chitlangia <dineshc@apache.org> Signed-off-by: Akira Ajisaka <aajisaka@apache.org> (cherry picked from commit `d6a65a4180`) Conflicts: LICENSE-binary hadoop-project/pom.xml	2022-10-30 02:32:44 +09:00
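
The HADOOP-18498 entry above describes removing an unwanted leading '?' from a SAS token before the token is appended to a request URI. A minimal sketch of that idea follows. It is not the ABFS driver's code: the class name `SasTokenPrefixSketch`, the method `stripLeadingQuestionMark`, the token value, and the example URI are all hypothetical and chosen only for illustration.

```java
// Illustrative only: a hypothetical helper, not the actual ABFS implementation.
final class SasTokenPrefixSketch {
    private SasTokenPrefixSketch() {
    }

    /**
     * Returns the token without a single leading '?'.
     * Tokens without the prefix (and null) are returned unchanged.
     */
    static String stripLeadingQuestionMark(String sasToken) {
        if (sasToken != null && sasToken.startsWith("?")) {
            return sasToken.substring(1);
        }
        return sasToken;
    }

    public static void main(String[] args) {
        // Made-up token of the kind a SAS generator might return.
        String fromProvider = "?sv=2021-08-06&sig=EXAMPLE";
        String cleaned = stripLeadingQuestionMark(fromProvider);
        // The token is appended after other query parameters, so a
        // leading '?' here would corrupt the query string.
        String uri = "https://example.invalid/container/file?timeout=90&" + cleaned;
        System.out.println(uri);
    }
}
```

Only one leading character is removed in this sketch; whether the real driver handles repeated prefixes or whitespace is not stated in the commit message, so that behaviour is left out.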


25128 Commits All Branches Search
