hadoop

Commit Graph

Author	SHA1	Message	Date
Viraj Jasani	0ad7d7c677	HADOOP-18697. S3A prefetch: failure of ITestS3APrefetchingInputStream#testRandomReadLargeFile (#5580 ) Contributed by Viraj Jasani	2023-05-02 15:45:37 +01:00
Ayush Saxena	a226016c52	HADOOP-18662. ListFiles with recursive fails with FNF. (#5477 ). Contributed by Ayush Saxena. Reviewed-by: Steve Loughran <stevel@apache.org>	2023-05-02 20:12:22 +05:30
Pralabh Kumar	6b6bd82bf0	HADOOP-18715. Add debug log for getting details of tokenKindMap (#5608 ). Contributed by Pralabh Kumar. Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>	2023-04-29 17:30:00 +05:30
Viraj Jasani	05edfee1f3	HADOOP-18399. S3A Prefetch - SingleFilePerBlockCache to use LocalDirAllocator (#5054 ) Contributed by Viraj Jasani	2023-04-28 12:03:30 +01:00
Ankit Saurabh	312b776833	HADOOP-18351. Reduce excess logging of errors during S3A prefetching reads (#5274 ) Contributed by Ankit Saurabh	2023-04-28 12:03:30 +01:00
Alessandro Passaro	0f1a3f23a5	HADOOP-18378. Implement lazy seek in S3A prefetching. (#4955 ) Make S3APrefetchingInputStream.seek() completely lazy. Calls to seek() will not affect the current buffer nor interfere with prefetching, until read() is called. This change allows various usage patterns to benefit from prefetching, e.g. when calling readFully(position, buffer) in a loop for contiguous positions the intermediate internal calls to seek() will be noops and prefetching will have the same performance as in a sequential read. Contributed by Alessandro Passaro.	2023-04-28 12:03:30 +01:00
Viraj Jasani	f07be3bec2	HADOOP-18455. S3A prefetching executor should be closed (#4879 ) follow-on patch to HADOOP-18186. Contributed by: Viraj Jasani	2023-04-28 12:03:30 +01:00
Steve Loughran	4ce763a322	HADOOP-18028. High performance S3A input stream (#4752 ) This is the preview release of the HADOOP-18028 S3A performance input stream. It is still stabilizing, but ready to test. Contains HADOOP-18028. High performance S3A input stream (#4109) Contributed by Bhalchandra Pandit. HADOOP-18180. Replace use of twitter util-core with java futures (#4115) Contributed by PJ Fanning. HADOOP-18177. Document prefetching architecture. (#4205) Contributed by Ahmar Suhail HADOOP-18175. fix test failures with prefetching s3a input stream (#4212) Contributed by Monthon Klongklaew HADOOP-18231. S3A prefetching: fix failing tests & drain stream async. (#4386) * adds in new test for prefetching input stream * creates streamStats before opening stream * updates numBlocks calculation method * fixes ITestS3AOpenCost.testOpenFileLongerLength * drains stream async * fixes failing unit test Contributed by Ahmar Suhail HADOOP-18254. Disable S3A prefetching by default. (#4469) Contributed by Ahmar Suhail HADOOP-18190. Collect IOStatistics during S3A prefetching (#4458) This adds iOStatisticsConnection to the S3PrefetchingInputStream class, with new statistic names in StreamStatistics. This stream is not (yet) IOStatisticsContext aware. Contributed by Ahmar Suhail HADOOP-18379 rebase feature/HADOOP-18028-s3a-prefetch to trunk HADOOP-18187. Convert s3a prefetching to use JavaDoc for fields and enums. HADOOP-18318. Update class names to be clear they belong to S3A prefetching Contributed by Steve Loughran	2023-04-28 12:03:29 +01:00
Sebastian Baunsgaard	919c3f615b	HADOOP-18660. Filesystem Spelling Mistake (#5475 ). Contributed by Sebastian Baunsgaard. Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>	2023-04-25 19:59:54 +01:00
Steve Loughran	21cf507db3	HADOOP-17450. Add Public IOStatistics API -missed backport (#5590 ) This cherrypicks SemaphoredDelegatingExecutor HADOOP-17450 changes from trunk somehow they didn't get into the main IOStatistics backport to branch-3.3	2023-04-25 15:02:56 +01:00
Doroszlai, Attila	13d3cfd311	HADOOP-18714. Wrong StringUtils.join() called in AbstractContractRootDirectoryTest (#5588 ) (cherry picked from commit `5b23224970`)	2023-04-24 15:49:20 +02:00
Nikita Eshkeev	7a32e7cc38	HADOOP-18597. Simplify single node instructions for creating directories for Map Reduce. (#5305 ) Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>	2023-04-24 01:14:09 +05:30
Christos Bisias	57ff8bdb67	HADOOP-18691. Add a CallerContext getter on the Schedulable interface (#5540 )	2023-04-20 10:13:33 -07:00
Steve Loughran	a505940a2f	HADOOP-18470. Hadoop 3.3.5 release wrap-up (#5558 ) Post-release updates of the branches * Add jdiff xml files from 3.3.5 release. * Declare 3.3.5 as the latest stable release. * Copy release notes.	2023-04-18 10:12:41 +01:00
Viraj Jasani	20d3b9cc46	HADOOP-18620 Avoid using grizzly-http-* APIs (#5356 ) (#5374 )	2023-03-30 07:13:10 +08:00
Steve Loughran	0dd4e500b0	HADOOP-18661. Fix bin/hadoop usage script terminology. (#5473 ) Followup to HADOOP-13209: s/slaves/r/workers in the usage message you get when you type "bin/hadoop" Contributed by Steve Loughran	2023-03-13 12:24:10 +00:00
rdingankar	94b3c6dd90	HDFS-16917 Add transfer rate quantile metrics for DataNode reads (#5397 ) Co-authored-by: Ravindra Dingankar <rdingankar@linkedin.com>	2023-02-27 15:49:26 -08:00
Simbarashe Dzinamarira	5fe19a0f01	HDFS-16901: RBF: Propagates real user's username via the caller context, when a proxy user is being used. (#5346 )	2023-02-24 13:32:23 -08:00
hchaverr	eab7215354	HADOOP-18535. Implement token storage solution based on MySQL Fixes #1240 Signed-off-by: Owen O'Malley <oomalley@linkedin.com>	2023-02-22 14:02:13 -08:00
Steve Loughran	ee71318d72	HADOOP-18636 LocalDirAllocator cannot recover from directory tree deletion (#5412 ) Even though DiskChecker.mkdirsWithExistsCheck() will create the directory tree, it is only called after the enumeration of directories with available space has completed. Directories which don't exist are reported as having 0 space, therefore the mkdirs code is never reached. Adding a simple mkdirs() -without bothering to check the outcome- ensures that if a dir has been deleted then it will be reconstructed if possible. If it can't it will still have 0 bytes of space reported and so be excluded from the allocation. Contributed by Steve Loughran	2023-02-22 11:50:17 +00:00
Arnout Engelen	477f17be97	HADOOP-18627. Add stronger wording in 'secure mode' introduction (#5406 ) Make it more clear that when deploying Hadoop 'secure mode' is generally not optional. Contributed by Arnout Engelen	2023-02-17 16:31:21 +00:00
Bryan Beaudreault	aa6c51364a	HADOOP-18215. Enhance WritableName to be able to return aliases for classes that use serializers (#4215 )	2023-02-16 11:38:20 -08:00
Viraj Jasani	8c9c68c19e	HADOOP-18628. IPC Server Connection should log host name before returning VersionMismatch error (#5385 ) Contributed by Viraj Jasani	2023-02-15 18:23:44 +00:00
Steve Loughran	cd2401d2cc	HADOOP-18470. More in the 3.3.5 index.html about security (#5383 ) Expands on the comments in cluster config to tell people they shouldn't be running a cluster without a private VLAN in cloud, that Knox is good here, and unsecured clusters without a VLAN are just computation-as-a-service to crypto miners Contributed by Steve Loughran	2023-02-14 17:25:20 +00:00
Owen O'Malley	9e7a9fd46d	HDFS-18324. Fix race condition in closing IPC connections. (#5371 )	2023-02-10 13:56:52 -08:00
huhaiyang	de08baded6	HADOOP-18625. Fix method name of RPC.Builder#setnumReaders (#5301 ) Changes method name of RPC.Builder#setnumReaders to setNumReaders() The original method is still there, just marked deprecated. It is the one which should be used when working with older branches. Contributed by Haiyang Hu	2023-02-09 13:29:47 +00:00
gardenia	752f6d8213	HADOOP-18621. Resource leak in CryptoOutputStream.close() (#5347 ) When closing we need to wrap the flush() in a try .. finally, otherwise when flush throws it will stop completion of the remainder of the close activities and in particular the close of the underlying wrapped stream object resulting in a resource leak. Contributed by Colm Dougan	2023-02-07 12:04:00 +00:00
Steve Vaughan	221221d6fb	HADOOP-18612. Avoid mixing canonical and non-canonical when performing comparisons (#5339 ) Contributed by Steve Vaughan Jr	2023-02-06 18:30:45 +00:00
Steve Vaughan	7b6a69faaa	HADOOP-18279. Cancel fileMonitoringTimer even if trustManager isn't defined (#4789 ) Co-authored-by: Steve Vaughan Jr <s_vaughan@apple.com>	2023-02-01 13:33:34 -08:00
Viraj Jasani	f3fa4af5dc	HADOOP-18592. Sasl connection failure should log remote address. (#5294 ) Contributed by Viraj Jasani <vjasani@apache.org> Signed-off-by: Chris Nauroth <cnauroth@apache.org> Signed-off-by: Steve Loughran <stevel@apache.org> Signed-off-by: Mingliang Liu <liuml07@apache.org>	2023-02-01 10:16:42 -08:00
Wei-Chiu Chuang	4836f1ec37	HADOOP-18584. [NFS GW] Fix regression after netty4 migration. (#5252 ) Reviewed-by: Tsz-Wo Nicholas Sze <szetszwo@apache.org> (cherry picked from commit `9d47108b50`)	2023-02-01 05:33:01 -08:00
Ayush Saxena	73f3196db5	HADOOP-18604. Add compile platform in the hadoop version output. (#5327 ). Contributed by Ayush Saxena. Signed-off-by: Chris Nauroth <cnauroth@apache.org>	2023-01-28 14:20:27 +05:30
PJ Fanning	ada06aa22e	HADOOP-18575: followup: try to avoid repeatedly hitting exceptions when transformer factories do not support attributes (#5253 ) Part of HADOOP-18469 and the hardening of XML/XSL parsers. Followup to the main HADOOP-18575 patch, to improve performance when working with xml/xsl engines which don't support the relevant attributes. Include this change when backporting. Contributed by PJ Fanning.	2023-01-16 15:48:15 +00:00
huangxiaoping	f5e9901e6d	HADOOP-18591. Fix a typo in Trash (#5291 ) Signed-off-by: Tao Li <tomscut@apache.org> Signed-off-by: Chris Nauroth <cnauroth@apache.org> (cherry picked from commit `a90e424d9f`)	2023-01-12 21:22:25 +00:00
Chengbing Liu	af96e0f5b3	HDFS-16872. Fix log throttling by declaring LogThrottlingHelper as static members (#5246 ) Co-authored-by: Chengbing Liu <liuchengbing@qiyi.com> Signed-off-by: Erik Krogen <xkrogen@apache.org> (cherry picked from commit `4cf304de45`)	2023-01-10 10:04:05 -08:00
PJ Fanning	29b6df563b	HADOOP-18575. Make XML transformer factory more lenient (#5224 ) Due diligence followup to HADOOP-18469. Add secure XML parser factories to XMLUtils (#4940) Contributed by P J Fanning	2022-12-18 12:26:11 +00:00
Chengbing Liu	bfc916e7b0	HADOOP-18567. LogThrottlingHelper: properly trigger dependent recorders in cases of infrequent logging (#5215 ) Signed-off-by: Erik Krogen <xkrogen@apache.org> Co-authored-by: Chengbing Liu <liuchengbing@qiyi.com> (cherry picked from commit `ca3526da92`)	2022-12-16 09:16:51 -08:00
Steve Loughran	65892a7759	HADOOP-18573. Improve error reporting on non-standard kerberos names (#5221 ) The kerberos RPC does not declare any restriction on characters used in kerberos names, though implementations MAY be more restrictive. If the kerberos controller supports the use of non-conventional principal names and the kerberos admin chooses to use them, this can confuse some of the parsing. The obvious solution is for the enterprise admins to "not do that" as a lot of things break, bits of hadoop included. Harden the hadoop code slightly so at least we fail more gracefully, so people can then get in touch with their sysadmin and tell them to stop it.	2022-12-15 11:44:12 +00:00
Mehakmeet Singh	1009d2560f	HADOOP-18574. Changing log level of IOStatistics increment to make the DEBUG logs less noisy (#5223 ) Contributed by: Mehakmeet Singh	2022-12-15 11:43:49 +00:00
Steve Loughran	ba55f370a9	HADOOP-18526. Leak of S3AInstrumentation instances via hadoop Metrics references (#5144 ) This has triggered an OOM in a process which was churning through s3a fs instances; the increased memory footprint of IOStatistics amplified what must have been a long-standing issue with FS instances being created and not closed() * Makes sure instrumentation is closed when the FS is closed. * Uses a weak reference from metrics to instrumentation, so even if the FS wasn't closed (see HADOOP-18478), this back reference would not cause the S3AInstrumentation reference to be retained. * If S3AFileSystem is configured to log at TRACE it will log the calling stack of initialize(), so help identify where the instance is being created. This should help track down the cause of instance leakage. Contributed by Steve Loughran.	2022-12-14 18:23:04 +00:00
Steve Loughran	654082773c	HADOOP-18183. s3a audit logs to publish range start/end of GET requests. (#5110 ) The start and end of the range is set in a new audit param "rg", e.g "?rg=100-200" Contributed by Ankit Saurabh	2022-12-14 16:51:46 +00:00
Doroszlai, Attila	6202348502	HADOOP-18569. NFS Gateway may release buffer too early (#5207 ) (#5211 ) (cherry picked from commit `df4812df65`)	2022-12-14 15:56:16 +01:00
Jack Richard Buggins	c5b360fd15	HADOOP-18329. Support for IBM Semeru JVM > 11.0.15.0 Vendor Name Changes (#4537 ) (#5208 ) The static boolean PlatformName.IBM_JAVA now identifies Java 11+ IBM Semeru runtimes as IBM JVM releases. Contributed by Jack Buggins.	2022-12-12 17:28:56 +00:00
Pranav Saxena	50a0f33cc9	HADOOP-18546. ABFS. disable purging list of in progress reads in abfs stream close() (#5176 ) This addresses HADOOP-18521, "ABFS ReadBufferManager buffer sharing across concurrent HTTP requests" by not trying to cancel in progress reads. It supersedes HADOOP-18528, which disables the prefetching. If that patch is applied after this one, prefetching will be disabled. As well as changing the default value in the code, core-default.xml is updated to set fs.azure.enable.readahead = true As a result, if Configuration.get("fs.azure.enable.readahead") returns a non-null value, then it can be inferred that it was set either in core-default.xml (the fix is present) or in core-site.xml (someone asked for it). Note: this commit contains the followup commit: That is needed to avoid race conditions in the test. Contributed by Pranav Saxena.	2022-12-09 13:49:14 +00:00
Steve Loughran	95890f3368	HADOOP-18560. AvroFSInput opens a stream twice and discards the second one without closing (#5186 ) This is needed for branches with the hadoop-common changes of HADOOP-16202. Enhanced openFile()	2022-12-06 09:59:51 +00:00
Steve Loughran	36889005f7	HADOOP-18470. index.md update for 3.3.5 release	2022-12-05 16:22:40 +00:00
HarshitGupta11	f29d9a11bc	HADOOP-18530. ChecksumFileSystem::readVectored might return byte buffers not positioned at 0 (#5168 ) Contributed by Harshit Gupta	2022-11-29 14:52:25 +00:00
Owen O'Malley	9b3ffe960e	HADOOP-18324. Interrupting RPC Client calls can lead to thread exhaustion. (#4527 ) * Exactly 1 sending thread per an RPC connection. * If the calling thread is interrupted before the socket write, it will be skipped instead of sending it anyways. * If the calling thread is interrupted during the socket write, the write will finish. * RPC requests will be written to the socket in the order received. * Sending thread is only started by the receiving thread. * The sending thread periodically checks the shouldCloseConnection flag.	2022-11-18 08:29:28 -08:00
Mehakmeet Singh	9e53ed3602	HADOOP-18528. Disable abfs prefetching by default (#5134 ) Disables block prefetching on ABFS InputStreams, by setting fs.azure.enable.readahead to false in core-default.xml and the matching java constant. This prevents HADOOP-18521. ABFS ReadBufferManager buffer sharing across concurrent HTTP requests. Once a fix for that is committed, this change can be reverted. Contributed by Mehakmeet Singh.	2022-11-15 14:29:33 +00:00
Steve Loughran	c392075761	HADOOP-18507. VectorIO FileRange type to support a "reference" field (#5076 ) Contributed by Steve Loughran	2022-11-08 13:35:42 +00:00

5735 Commits