Commit Graph

5153 Commits

Author SHA1 Message Date
Sahil Takiar 5906268f0d HADOOP-16321: ITestS3ASSL+TestOpenSSLSocketFactory failing with java.lang.UnsatisfiedLinkErrors 2019-05-21 11:30:45 -06:00
Ben Roling a36274d699
HADOOP-16085. S3Guard: use object version or etags to protect against inconsistent read after replace/overwrite.
Contributed by Ben Roling.

S3Guard will now track the etag of uploaded files and, if an S3
bucket is versioned, the object version.

You can then control how to react to a mismatch between the data
in the DynamoDB table and that in the store: warn, fail, or, when
using versions, return the original value.

This adds two new columns to the table: etag and version.
This is transparent to older S3A clients, but when such clients
add/update data to the S3Guard table, they will not add these values.
As a result, the etag/version checks will not work with files uploaded by older clients.

For a consistent experience, upgrade all clients to use the latest Hadoop version.
2019-05-19 22:29:54 +01:00
Alexis Daboville 4cb3da6ac7
HADOOP-16248. MutableQuantiles leak memory under heavy load.
Contributed by Alexis Daboville.
2019-05-17 15:15:22 +01:00
Sahil Takiar b067f8acaa HADOOP-16050: s3a SSL connections should use OpenSSL
(cherry picked from commit aebf229c17)
2019-05-16 08:57:54 -06:00
David Mollitor 2713dcf6e9
HADOOP-16307. Intern User Name and Group Name in FileStatus.
Author:    David Mollitor
2019-05-16 16:02:07 +02:00
Bharat Viswanadham d4c8858586
HADOOP-16247. NPE in FsUrlConnection. Contributed by Karthik Palanisamy. 2019-05-15 17:41:36 -07:00
Inigo Goiri 389e640f0c HADOOP-16161. NetworkTopology#getWeightUsingNetworkLocation return unexpected result. Contributed by He Xiaoqiao. 2019-05-13 11:46:16 -07:00
Akira Ajisaka f257497b0f HADOOP-16299. [JDK 11] Build fails without specifying -Djavac.version=11
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2019-05-09 14:49:46 +09:00
Prabhu Joseph 96dc5cedfe
HADOOP-16293. AuthenticationFilterInitializer doc has speudo instead of pseudo.
Author:    Prabhu Joseph
2019-05-08 10:18:20 +01:00
Peter Bacsko 713e8a27ae HADOOP-16238. Add the possibility to set SO_REUSEADDR in IPC Server Listener. Contributed by Peter Bacsko.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2019-05-07 17:48:27 -07:00
Siyao Meng 93f2283a69 HADOOP-16289. Allow extra jsvc startup option in hadoop_start_secure_daemon in hadoop-functions.sh. Contributed by Siyao Meng.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2019-05-06 15:47:33 -07:00
Vinayakumar B f1875b205e HADOOP-16059. Use SASL Factories Cache to Improve Performance. Contributed by Ayush Saxena. 2019-05-03 11:22:14 +05:30
Giovanni Matteo Fumarola 7a3188d054 HADOOP-16282. Avoid FileStream to improve performance. Contributed by Ayush Saxena. 2019-05-02 12:58:42 -07:00
Sahil Takiar 4877f0aa51 HDFS-3246: pRead equivalent for direct read path (#597)
HDFS-3246: pRead equivalent for direct read path

Contributed by Sahil Takiar
2019-04-30 14:52:16 -07:00
Ben Roling 0af4011580
HADOOP-16221. S3Guard: add option to fail operation on metadata write failure. 2019-04-30 11:53:26 +01:00
Sean Mackrory a703dae25e HADOOP-16222. Fix new deprecations after guava 27.0 update in trunk. Contributed by Gabor Bota. 2019-04-24 10:39:00 -06:00
Anu Engineer f4ab9370f5 HADOOP-16026: Replace incorrect use of system property user.name.
Contributed by Dinesh Chitlangia.
2019-04-22 14:02:13 -07:00
Inigo Goiri fb1c549139 HDFS-14374. Expose total number of delegation tokens in AbstractDelegationTokenSecretManager. Contributed by CR Hota. 2019-04-22 13:32:08 -07:00
Erik Krogen 1ddb48872f HADOOP-16265. Fix bug causing Configuration#getTimeDuration to use incorrect units when the default value is used. Contributed by starphin. 2019-04-22 08:16:57 -07:00
Zsombor Gegesy 008766c119 HADOOP-15014. KMS should log the IP address of the clients. Contributed by Zsombor Gegesy.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2019-04-16 05:28:18 -07:00
Kenneth Yang b1120d27ab
HADOOP-16249. Make CallerContext LimitedPrivate scope to Public.
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2019-04-16 11:18:21 +09:00
Gabor Bota 1943db5571
HADOOP-16237. Fix new findbugs issues after updating guava to 27.0-jre.
Author:    Gabor Bota <gabor.bota@cloudera.com>
2019-04-12 18:28:38 -07:00
Sahil Takiar 2382f63fc0
HADOOP-14747. S3AInputStream to implement CanUnbuffer.
Author:    Sahil Takiar <stakiar@cloudera.com>
2019-04-12 18:12:02 -07:00
Chen Liang 626fec652b HDFS-13699. Add DFSClient sending handshake token to DataNode, and allow DataNode overwrite downstream QOP. Contributed by Chen Liang. 2019-04-12 17:37:51 -07:00
Arpit Agarwal 87407553ef
HADOOP-16243. Change Log Level to trace in NetUtils.java. Contributed by chencan. 2019-04-10 13:21:04 -07:00
Giovanni Matteo Fumarola 813cee1a18 HDFS-14420. Fix typo in KeyShell console. Contributed by Hu Xiaodong. 2019-04-10 11:18:40 -07:00
Todd Lipcon 65deb1ac42 HADOOP-16179. hadoop-common pom should not depend on kerb-simplekdc
The hadoop-common pom currently has a dependency on kerb-simplekdc. In
fact, the only classes used from Kerby are in kerb-core and kerb-util
(which is a transitive dependency from kerb-core). Depending on
kerb-simplekdc pulls a bunch of other unnecessary classes into the
hadoop-common classpath.

This changes the hadoop-common pom to depend only on kerb-core.

hadoop-minikdc already had the appropriate dependency on kerb-simplekdc
so it continues to pull in what it needs.

Signed-off-by: Todd Lipcon <todd@apache.org>
2019-04-10 08:49:46 -07:00
Akira Ajisaka bb8dda2bf9
HADOOP-12890. Fix typo in AbstractService. Contributed by Gabor Liptak. 2019-04-08 15:26:12 +09:00
Akira Ajisaka ab645b3caa
HADOOP-14635. Javadoc correction for AccessControlList#buildACL. Contributed by Yeliang Cang. 2019-04-08 15:18:45 +09:00
Akira Ajisaka 72f4b9cd68
HADOOP-15242. Fix typos in hadoop-functions.sh. Contributed by Ray Chiang. 2019-04-08 13:20:21 +09:00
Akira Ajisaka 0d47d283a6
HADOOP-10848. Cleanup calling of sun.security.krb5.Config. 2019-04-08 10:02:34 +09:00
David Mollitor c90736350b
HADOOP-16208. Do Not Log InterruptedException in Client.
Contributed by David Mollitor.
2019-04-04 16:15:57 +01:00
Inigo Goiri 7b5b783f66 HDFS-14327. Using FQDN instead of IP to access servers with DNS resolving. Contributed by Fengnan Li. 2019-04-03 16:11:13 -07:00
Siyao Meng e62cbcbc83 HADOOP-16011. OsSecureRandom very slow compared to other SecureRandom implementations. Contributed by Siyao Meng.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2019-04-03 14:29:52 -07:00
Steve Loughran 366186d999
HADOOP-16233. S3AFileStatus to declare that isEncrypted() is always true (#685)
This is needed to fix up some confusion about caching of job.addCache() handling of S3A paths; all parent dirs - the files are downloaded by the NM without using the DTs of the user submitting the job. This means that when you submit jobs to an EC2 cluster with lower IAM permissions than the user, cached resources don't get downloaded and the job doesn't start.

Production code changes:
* S3AFileStatus Adds "true" to the superclass's encrypted flag during construction.

Tests
* Base AbstractContractOpenTest can control whether zero byte files created in tests are encrypted. Not done via an XML attribute, just a subclass point. Thoughts?
* Verify that the filecache considers paths to not have the permissions which trigger reduce-privilege downloads
* And extend ITestDelegatedMRJob to test a completely different bucket (open street map), to verify that cached resources do get their tokens picked up

Docs:
* Advise FS developers to say all files are encrypted. It's otherwise harmless and it'll stop other people seeing impossible to debug error messages on app launch.

Contributed by Steve Loughran.

Change-Id: Ifaae4c9d735ccc5eafeebd2584b65daf2d4e5da3
2019-04-03 21:23:40 +01:00
Gabor Bota d7979079ea HADOOP-16210. Update guava to 27.0-jre in hadoop-project trunk. Contributed by Gabor Bota. 2019-04-03 12:59:39 -06:00
Sahil Takiar 3b0c5016b2
HDFS-14394: Add -std=c99 / -std=gnu99 to libhdfs compile flags
Signed-off-by: Todd Lipcon <todd@apache.org>
2019-04-03 10:56:33 -07:00
Akira Ajisaka aaaf856f4b
HADOOP-16226. new Path(String str) does not remove all the trailing slashes of str 2019-04-03 13:16:59 +09:00
Lokesh Jain cf268114c9 HDFS-13960. hdfs dfs -checksum command should optionally show block size in output. Contributed by Lokesh Jain.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2019-04-02 12:24:55 -07:00
Steve Loughran 61d19110d4
HADOOP-16218. Findbugs warning of null param to non-nullable method in Configuration with Guava update. (#655)
Change-Id: I461e518ce9a4730b91a8138ad55b39e9a4b0a4b8
2019-04-02 09:15:11 +01:00
Akira Ajisaka ebd0d21538
HADOOP-16225. Fix links to the developer mailing lists in DownstreamDev.md. Contributed by Wanqiang Ji. 2019-04-02 10:51:47 +09:00
Xiaoyu Yao f41f938b2e
HADOOP-16199. KMSLoadBlanceClientProvider does not select token correctly. Contributed by Xiaoyu Yao.
This closes #642.
2019-03-28 21:55:31 -07:00
Gabor Bota b5db238383
HADOOP-15999. S3Guard: Better support for out-of-band operations.
Author:    Gabor Bota
2019-03-28 15:59:25 +00:00
David Mollitor d18d0859eb
HADOOP-16181. HadoopExecutors shutdown Cleanup.
Author:    David Mollitor <david.mollitor@cloudera.com>
2019-03-22 10:29:27 +00:00
David Mollitor 246ab77f28
HADOOP-16196. Path Parameterize Comparable.
Author:    David Mollitor <david.mollitor@cloudera.com>
2019-03-22 10:26:24 +00:00
Eric Yang 5f6e225166 YARN-9363. Replaced debug logging with SLF4J parameterized log message.
Contributed by Prabhu Joseph
2019-03-18 13:57:18 -04:00
Eric Yang 5446e3cb8a HADOOP-16167. Fixed Hadoop shell script for Ubuntu 18.
Contributed by Daniel Templeton
2019-03-18 13:04:49 -04:00
Erik Krogen 8c95cb9d6b HADOOP-16192. Fix CallQueue backoff bugs: perform backoff when add() is used and update backoff when refreshed. 2019-03-18 08:13:43 -07:00
Shweta Yakkali 2db38abffc HDFS-14328. [Clean-up] Remove NULL check before instanceof in TestGSet
(Contributed by Shweta Yakkali via Daniel Templeton)

Change-Id: I5b9f0e66664714d7c5bbfa30492a09f770626711
2019-03-18 15:10:26 +01:00
Eric Yang 2064ca015d YARN-9349. Changed logging to use slf4j api.
Contributed by Prabhu Joseph
2019-03-15 19:20:59 -04:00
Ben Roling 6fa229891e
HADOOP-15625. S3A input stream to use etags/version number to detect changed source files.
Author: Ben Roling <ben.roling@gmail.com>

Initial patch from Brahma Reddy Battula.
2019-03-13 20:37:11 +00:00
Erik Krogen 66357574ae HDFS-14346. Add better time precision to Configuration#getTimeDuration, allowing return unit and default unit to be specified independently. Contributed by Chao Sun. 2019-03-13 13:15:56 -07:00
Matt Foley f74159c8fc HADOOP-16166. TestRawLocalFileSystemContract fails with build Docker container running on Mac.
Also provided similar fix for Windows.
2019-03-13 09:33:24 -07:00
Konstantin V Shvachko 2a54feabb2 HDFS-14347. [SBN Read] Restore a comment line mistakenly removed in ProtobufRpcEngine. Contributed by Fengnan Li. 2019-03-11 18:59:15 -07:00
Konstantin V Shvachko 4ad295a4f1 HDFS-14270.[SBN Read] Add trace level logging for stateId in RPC Server. Contributed by Shweta Yakkali. 2019-03-11 13:48:06 -07:00
Steve Loughran 0cbe9ad8c2
HADOOP-16109. Parquet reading S3AFileSystem causes EOF
Nobody gets seek right. No matter how many times they think they have.

Reproducible test from: Dave Christianson
Fixed seek() logic: Steve Loughran
2019-03-09 16:00:34 +00:00
Praveen Krishna 2b94e51a8f
HADOOP-16114. NetUtils#canonicalizeHost gives different value for same host.
Author:    Praveen Krishna <praveenkrishna@tutanota.com>
2019-03-07 11:06:34 +00:00
Vivek Ratnavel Subramanian a55fc36299 HDDS-1093. Configuration tab in OM/SCM ui is not displaying the correct values. 2019-03-06 17:43:57 -08:00
Sahil Takiar 618e009ac0 HDFS-14111. hdfsOpenFile on HDFS causes unnecessary IO from file offset 0. Contributed by Sahil Takiar.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2019-03-06 15:04:06 -08:00
Wei-Chiu Chuang 6192c1fe3b Revert "HDFS-14111. hdfsOpenFile on HDFS causes unnecessary IO from file offset 0. Contributed by Sahil Takiar."
This reverts commit f5a4b43a49.
2019-03-06 15:02:18 -08:00
Sahil Takiar f5a4b43a49 HDFS-14111. hdfsOpenFile on HDFS causes unnecessary IO from file offset 0. Contributed by Sahil Takiar.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2019-03-06 14:58:45 -08:00
Eric Yang 3c5b7136e2 HADOOP-16150. Added concat method to ChecksumFS as unsupported operation.
Contributed by Steve Loughran

(cherry picked from commit 8b517e7ad6)
2019-03-05 13:32:00 -05:00
Stephen O'Donnell 686c0141ef
HADOOP-16140. hadoop fs expunge to add -immediate option to purge trash immediately.
Contributed by Stephen O'Donnell.

Signed-off-by: Steve Loughran <stevel@apache.org>
2019-03-05 14:09:00 +00:00
Prabhu Joseph e40e2d6ad5
YARN-7243. Moving logging APIs over to slf4j in hadoop-yarn-server-resourcemanager.
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2019-03-05 14:10:08 +09:00
coder-chenzhi fe7551f21b
HADOOP-16162. Remove unused Job Summary Appender configurations from log4j.properties
This closes #551

Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2019-03-05 13:41:33 +09:00
David Mollitor 9fcd89ab93
HADOOP-16148. Cleanup LineReader Unit Test.
Contributed by David Mollitor.

Signed-off-by: Steve Loughran <stevel@apache.org>
2019-03-04 23:08:12 +00:00
Ajay Kumar 0d61facd37 HADOOP-15889. Add hadoop.token configuration parameter to load tokens. Contributed by Íñigo Goiri 2019-02-28 10:34:28 -08:00
Márton Elek 84c4966a5a
HADOOP-16067. Incorrect Format Debug Statement KMSACLs. Contributed by Charan Hebri. 2019-02-28 12:15:47 +01:00
Eric Yang feccd282fe HADOOP-16107. Update ChecksumFileSystem createFile/openFile API to generate checksum.
Contributed by Steve Loughran
2019-02-27 15:53:41 -05:00
Tsz Wo Nicholas Sze 9192f71e21 HADOOP-16127. In ipc.Client, put a new connection could happen after stop. 2019-02-26 15:14:21 -08:00
Abhishek Modi 52b2eab575
HADOOP-16093. Move DurationInfo from hadoop-aws to hadoop-common org.apache.hadoop.util.
Contributed by Abhishek Modi
2019-02-26 17:10:41 +00:00
Surendra Singh Lilhore 59ba3552d3 HDFS-14299. ViewFs: Correct error message for read only operations. Contributed by hu xiaodong. 2019-02-26 12:40:32 +05:30
Konstantin V Shvachko a6ab37192a HDFS-14130. [SBN read] Make ZKFC ObserverNode aware. Contributed by xiangheng and Konstantin Shvachko. 2019-02-25 14:35:02 -08:00
Inigo Goiri ba4e7bd192 HADOOP-16125. Support multiple bind users in LdapGroupsMapping. Contributed by Lukas Majercak. 2019-02-25 13:39:13 -08:00
Tsz Wo Nicholas Sze 0edb0c51dc HADOOP-16126. ipc.Client.stop() may sleep too long to wait for all connections. 2019-02-25 13:15:28 -08:00
Yongjun Zhang f7a27cdee4 HDFS-14118. Support using DNS to resolve nameservices to IP addresses. Contributed by Fengnan Li. 2019-02-23 09:35:36 -08:00
George Huang 9daf43c6fa HADOOP-16129. Misc. bug fixes for KMS Benchmark. Contributed by George Huang.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2019-02-22 17:52:09 -08:00
Daryn Sharp a87e458432 HADOOP-15813. Enable more reliable SSL connection reuse. Contributed by Daryn Sharp.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2019-02-20 18:17:04 -08:00
Shweta Yakkali 371a6db59a HDFS-14273. Fix checkstyle issues in BlockLocation's method javadoc
(Contributed by Shweta Yakkali via Daniel Templeton)

Change-Id: I546aa4a0fe7f83b53735acd9925f366b2f1a00e2
2019-02-20 15:36:53 -08:00
George Huang 0525d85d57 HADOOP-15967. KMS Benchmark Tool. Contributed by George Huang.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2019-02-19 15:24:30 -08:00
Akira Ajisaka dabfeab785
YARN-9308. fairscheduler-statedump.log gets generated regardless of service again after the merge of HDFS-7240. Contributed by Wilfred Spiegelenburg. 2019-02-15 14:49:49 +09:00
Erik Krogen 64f28f9efa HDFS-14162. [SBN read] Allow Balancer to work with Observer node. Add a new ProxyCombiner allowing for multiple related protocols to be combined. Allow AlignmentContext to be passed in NameNodeProxyFactory. Contributed by Erik Krogen. 2019-02-14 11:22:04 -08:00
Siyao Meng fd026863b1 HDFS-14241. Provide feedback on successful renameSnapshot and deleteSnapshot. Contributed by Siyao Meng.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2019-02-13 19:48:02 -08:00
Chen Liang 024c87291c HDFS-13617. Allow wrapping NN QOP into token in encrypted message. Contributed by Chen Liang 2019-02-13 12:40:31 -08:00
Vinayakumar B 00c5ffaee2 HADOOP-16108. Tail Follow Interval Should Allow To Specify The Sleep Interval To Save Unnecessary RPC's. Contributed by Ayush Saxena. 2019-02-13 16:44:32 +05:30
Yiqun Lin 7b11b404a3 HADOOP-16097. Provide proper documentation for FairCallQueue. Contributed by Erik Krogen. 2019-02-13 11:16:04 +08:00
Xiaoyu Yao ca4e46a05e HDDS-1075. Fix CertificateUtil#parseRSAPublicKey charsetName. Contributed by Siddharth Wagle. 2019-02-11 12:09:14 -08:00
Steve Loughran 668817a6ce
Revert "HADOOP-15954. ABFS: Enable owner and group conversion for MSI and login user using OAuth."
(accidentally mixed in two patches)

This reverts commit fa8cd1bf28.
2019-02-07 21:57:22 +00:00
Wangda Tan 308f3168fa Make upstream aware of 3.1.2 release
Change-Id: I397bc6ef75498726df4763bd07a8bf8fe1c38365
2019-02-05 14:03:18 -08:00
Da Zhou fa8cd1bf28
HADOOP-15954. ABFS: Enable owner and group conversion for MSI and login user using OAuth.
Contributed by Da Zhou and Junhua Gu.
2019-02-05 19:23:15 +00:00
Steve Loughran f365957c63
HADOOP-15229. Add FileSystem builder-based openFile() API to match createFile();
S3A to implement S3 Select through this API.

The new openFile() API is asynchronous, and implemented across FileSystem and FileContext.

The MapReduce V2 inputs are moved to this API, and you can actually set must/may
options to pass in.

This is more useful for setting things like s3a seek policy than for S3 select,
as the existing input format/record readers can't handle S3 select output where
the stream is shorter than the file length, and splitting plain text is suboptimal.
Future work is needed there.

In the meantime, any/all filesystem connectors are now free to add their own filesystem-specific
configuration parameters which can be set in jobs and used to set filesystem input stream
options (seek policy, retry, encryption secrets, etc).

Contributed by Steve Loughran
2019-02-05 11:51:02 +00:00
Steve Loughran 7f46d13dac
HADOOP-16079. Token.toString faulting if any token listed can't load.
Contributed by Steve Loughran.
2019-02-01 14:31:47 +00:00
Inigo Goiri bcc3a79f58 HADOOP-16084. Fix the comment for getClass in Configuration. Contributed by Fengnan Li. 2019-01-31 10:06:05 -08:00
Akira Ajisaka 1129288cf5
HADOOP-14178. Move Mockito up to version 2.23.4. Contributed by Akira Ajisaka and Masatake Iwasaki. 2019-01-29 18:29:56 -08:00
Hanisha Koneru b3bc94ebfd HDFS-14236. Lazy persist copy/ put fails with ViewFs. 2019-01-29 16:45:44 -08:00
Wei-Chiu Chuang d1714c20e9 Revert "HDFS-14084. Need for more stats in DFSClient. Contributed by Pranay Singh."
This reverts commit 1d523279da.
2019-01-29 15:43:09 -08:00
Pranay Singh 1d523279da HDFS-14084. Need for more stats in DFSClient. Contributed by Pranay Singh.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2019-01-25 09:02:57 -08:00
Eric Yang 0dd35e218f HADOOP-15922. Fixed DelegationTokenAuthenticator URL decoding for doAs user.
Contributed by He Xiaoqiao
2019-01-22 18:59:36 -05:00
Akira Ajisaka a463cf75a0
HADOOP-15787. [JDK11] TestIPC.testRTEDuringConnectionSetup fails. Contributed by Zsolt Venczel. 2019-01-22 10:19:05 +09:00
Sunil G 2e2508b8e3 Make 3.2.0 aware to other branches 2019-01-21 21:24:51 +05:30