1214 Commits

Author SHA1 Message Date
Steve Loughran 511df1e837 HADOOP-16430. S3AFilesystem.delete to incrementally update s3guard with deletions
Contributed by Steve Loughran.

This overlaps the scanning for directory entries with batched calls to S3 DELETE and updates of the S3Guard tables.
It also uses S3Guard to list the files to delete, so it finds newly created files even when S3 listings are not consistent.

For paths which the client considers S3Guard to be authoritative, we also do a recursive LIST of the store and delete files; this is to find unindexed files and to guarantee that the delete(path, true) call really does delete everything underneath.

Change-Id: Ice2f6e940c506e0b3a78fa534a99721b1698708e
2019-09-05 14:25:15 +01:00
Abhishek Modi 16576fde8e YARN-9754. Add support for arbitrary DAG AM Simulator. Contributed by Abhishek Modi. 2019-08-29 11:43:40 +05:30
HUAN-PING SU 6f068cf53f HADOOP-16416. mark DynamoDBMetadataStore.deleteTrackingValueMap as final. Contributed by kevin su.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2019-08-27 16:17:32 -07:00
Akira Ajisaka 567091aa9b
HADOOP-15958. Revisiting LICENSE and NOTICE files.
This closes #1307

Reviewed-by: Masatake Iwasaki <iwasakims@apache.org>
2019-08-27 13:47:12 +09:00
Ewan Higgs 23e532d739 Revert "HADOOP-16193. Add extra S3A MPU test to see what happens if a file is created during the MPU. Contributed by Steve Loughran"
This reverts commit 69ddb36876.
2019-08-26 12:37:26 +02:00
Akira Ajisaka 84b1982060
YARN-9774. Fix order of arguments for assertEquals in TestSLSUtils. Contributed by Nikhil Navadiya. 2019-08-23 14:39:31 +09:00
Erik Krogen 63c295e298 HDFS-14755. [Dynamometer] Enhance compatibility of Dynamometer with branch-2 builds. Contributed by Takanobu Asanuma. 2019-08-22 09:57:12 -07:00
Steve Loughran 61b2df2331
HADOOP-16470. Make last AWS credential provider in default auth chain EC2ContainerCredentialsProviderWrapper.
Contributed by Steve Loughran.

Contains HADOOP-16471. Restore (documented) fs.s3a.SharedInstanceProfileCredentialsProvider.

Change-Id: I06b99b57459cac80bf743c5c54f04e59bb54c2f8
2019-08-22 17:27:56 +01:00
Ewan Higgs 69ddb36876 HADOOP-16193. Add extra S3A MPU test to see what happens if a file is created during the MPU. Contributed by Steve Loughran 2019-08-22 13:56:47 +02:00
Takanobu Asanuma ee7c261e1e
HDFS-14763. Fix package name of audit log class in Dynamometer document (#1335) 2019-08-22 18:37:16 +09:00
Abhishek Modi 3ad1fcfc8b YARN-9752. Add support for allocation id in SLS. Contributed by Abhishek Modi 2019-08-21 20:39:51 +05:30
bibinchundatt 10ec31d20e YARN-9765. SLS runner crashes when run with metrics turned off. Contributed by Abhishek Modi. 2019-08-21 13:48:21 +05:30
KAI XIE c765584eb2 HADOOP-16158. DistCp to support checksum validation when copy blocks in parallel (#919)
* DistCp to support checksum validation when copy blocks in parallel

* address review comments

* add checksums comparison test for combine mode
2019-08-18 18:46:31 -07:00
Steve Loughran 0e4b757955 HADOOP-16500 S3ADelegationTokens to only log at debug on startup (#1269). Contributed by Steve Loughran.
Change-Id: Ifafc15f32791911976d7ebc36fb6e8853f59ed41
2019-08-14 10:50:26 +02:00
Erik Krogen 274966e675 HDFS-14717. [Dynamometer] Remove explicit search for JUnit dependency JAR from Dynamometer Client as it is packaged in the primary JAR. Contributed by Kevin Su. 2019-08-13 08:52:59 -07:00
Steve Loughran 189dc10884 HADOOP-16481. ITestS3GuardDDBRootOperations.test_300_MetastorePrune needs to set region. (#1209). Contributed by Steve Loughran. 2019-08-09 17:33:08 +02:00
Steve Loughran e25a5c2eab HADOOP-16499. S3A retry policy to be exponential (#1246). Contributed by Steve Loughran. 2019-08-09 15:52:37 +02:00
Da Zhou 43a91f820a
HADOOP-16315. ABFS: transform full UPN for named user in AclStatus
Contributed by Da Zhou

Change-Id: Ibc78322415fcbeff89c06c8586c53f5695550290
2019-08-09 12:38:13 +01:00
Abhishek Modi a92b7a5491 YARN-9694. UI always show default-rack for all the nodes while running SLS. 2019-08-09 11:41:16 +05:30
bilaharith 5840df86d7
HADOOP-16479. ABFS FileStatus.getModificationTime returns localized time instead of UTC.
Contributed by Bilahari T H

Change-Id: I532055baaadfd7c324710e4b25f60cdf0378bdc0
2019-08-08 19:08:48 +01:00
Steve Loughran b01efe5cf6
HADOOP-16472. findbugs warning on LocalMetadataStore.ttlTimeProvider sync
Contributed by Steve Loughran.

Moved the setter and addAncestors to synchronized

Change-Id: Ib362c66d1b8c9124eca7db9a44274ac08d0b3be6
2019-08-02 22:30:48 +01:00
Felipe Lopes bca86bd289
HADOOP-16469. Update committers.md
Contributed by Felipe Lopes.

Change-Id: I5c05b878bde073aeb45bf22340183893f85269e1
2019-07-30 12:47:55 +01:00
Gabor Bota 7b219778e0
HADOOP-16433. S3Guard: Filter expired entries and tombstones when listing with MetadataStore.listChildren().
Contributed by Gabor Bota.

This pulls the tracking of the lastUpdated timestamp of metadata entries up from the DDB metastore into all s3guard stores, and then uses this to filter out expired tombstones from listings.

Change-Id: I80f121236b49c75a024116f65a3ef29d3580b462
2019-07-24 18:11:43 +01:00
sunlisheng a1251addff
HADOOP-16431. Remove useless log in IOUtils.java and ExceptionDiags.java.
This closes #1091

Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2019-07-24 10:04:39 +09:00
Steve Loughran 4317d33232
HADOOP-16380. S3Guard to determine empty directory status for all non-root directories.
Contributed by Steve Loughran and Gabor Bota.

This
* Asks S3Guard to determine the empty directory status.
* Has S3A's root directory rm("/") command always return false (as abfs does)
* Documents that object stores MAY do this
* Overloads ContractTestUtils.assertDeleted to let assertions declare that the source directory does not need to exist. This stops inconsistencies in directory listings failing a root test.

It avoids a recent regression (HADOOP-16279) where, if there was a tombstone above the first element found in a directory listing, the directory would be considered empty when in fact there were child entries. That could downgrade an rm(path, recursive) to a no-op, while also confusing rename(src, dest), as dest could be mistaken for an empty directory and so permit the copy above it, rather than reject it with "destination path exists and is not empty".

Change-Id: I136a3d1a5a48a67e6155d790a40ff558d0d2c108
2019-07-23 14:52:03 +01:00
Ayush Saxena e60f5e2572 HADOOP-16440. Distcp can not preserve timestamp with -delete option. Contributed by ludun. 2019-07-20 13:11:14 +05:30
Arun Singh 0b45293abb
HADOOP-16404. ABFS default blocksize change (256MB from 512MB)
Contributed by: Arun Singh
2019-07-19 20:21:28 -07:00
Sean Mackrory 7f1b76ca35
HADOOP-13868. [s3a] New default for S3A multi-part configuration (#1125) 2019-07-19 09:49:59 -06:00
lqjaclee cd967c75a7
HADOOP-15847. S3Guard testConcurrentTableCreations to set R/W capacity == 0
Contributed by lqjaclee

Change-Id: I4a4d5b29f2677c188799479e4db38f07fa0591d1
2019-07-19 14:46:55 +01:00
Josh Rosen d545f9c290 HADOOP-16437 documentation typo fix: fs.s3a.experimental.input.fadvise
Fix fs.s3a.experimental.fadvise to fs.s3a.experimental.input.fadvise 

Contributed by: Josh Rosen
2019-07-18 23:19:38 +01:00
Gabor Bota c58e11bf52
HADOOP-16383. Pass ITtlTimeProvider instance in initialize method in MetadataStore interface. Contributed by Gabor Bota. (#1009) 2019-07-17 16:24:39 +02:00
Steve Loughran 19a001826f
Revert "HDFS-9913. DistCp to add -useTrash to move deleted files to Trash."
Reverting due to test failures if ~/.Trash not present during test setup.

This reverts commit ee3115f488.

Change-Id: Icbeeb261570b9131ff99d765ac0945c335b26658
2019-07-17 13:13:24 +01:00
Shen Yinjie ee3115f488
HDFS-9913. DistCp to add -useTrash to move deleted files to Trash.
Contributed by Shen Yinjie.

Change-Id: I03ac7d22ab1054f8e5de4aa7552909c734438f4a
2019-07-17 11:50:46 +01:00
Sean Mackrory 5672efa5c7
HADOOP-15729. [s3a] Allow core threads to time out. (#1075) 2019-07-16 18:14:23 -06:00
Steve Loughran b15ef7dc3d
HADOOP-16384: S3A: Avoid inconsistencies between DDB and S3.
Contributed by Steve Loughran

Contains

- HADOOP-16397. Hadoop S3Guard Prune command to support a -tombstone option.
- HADOOP-16406. ITestDynamoDBMetadataStore.testProvisionTable times out intermittently

This patch doesn't fix the underlying problem but it

* changes some tests to clean up better
* does a lot more logging of operations against DDB, if enabled
* adds an entry point to dump the state of the metastore and s3 tables (precursor to fsck)
* adds a purge entry point to help clean up after a test run has got a store into a mess
* s3guard prune command adds -tombstone option to only clear tombstones

The outcome is that tests should pass consistently and if problems occur we have better diagnostics.

Change-Id: I3eca3f5529d7f6fec398c0ff0472919f08f054eb
2019-07-12 13:02:25 +01:00
Steve Loughran 6a3433bffd
HADOOP-16357. TeraSort Job failing on S3 DirectoryStagingCommitter: destination path exists.
Contributed by Steve Loughran.

This patch

* changes the default for the staging committer to append, as we get for the classic FileOutputFormat committer
* adds a check for the dest path being a file not a dir
* adds tests for this
* changes AbstractCommitTerasortIT to not use the simple parser, so it fails if the file is present.

Change-Id: Id53742958ed1cf321ff96c9063505d64f3254f53
2019-07-11 18:15:34 +01:00
Yiqun Lin 5043840b1d HDFS-14410. Make Dynamometer documentation properly compile onto the Hadoop site. Contributed by Erik Krogen. 2019-07-11 23:47:27 +08:00
Erik Krogen 32925d04d9 HDFS-14640. [Dynamometer] Fix TestDynamometerInfra failure. Contributed by Erik Krogen. 2019-07-11 08:34:39 -07:00
Erik Krogen fc0656dd30 HADOOP-16418. [Dynamometer] Fix checkstyle and findbugs warnings. Contributed by Erik Krogen. 2019-07-11 08:29:57 -07:00
Steve Loughran c7b5f858a0
HADOOP-16393. S3Guard init command uses global settings, not those of target bucket.
Contributed by Steve Loughran.

Change-Id: I226a91ab8d7758340f8d221aa80a7abf9a0d3e8f
2019-07-10 20:57:02 +01:00
Erik Krogen 90b10a0d54 HDFS-14622. [Dynamometer] Update XML FsImage parsing logic to ignore non-INodeSection entries to fix an issue caused by the presence of Centralized Cache Management functionality. Contributed by Erik Krogen. 2019-07-10 09:59:11 -07:00
Masatake Iwasaki 738c09349e HADOOP-16411. Fix javadoc warnings in hadoop-dynamometer.
Signed-off-by: Masatake Iwasaki <iwasakims@apache.org>
2019-07-09 09:44:49 +09:00
Sean Mackrory de6b7bc67a HADOOP-16409. Allow authoritative mode on non-qualified paths. Contributed by Sean Mackrory 2019-07-08 19:27:07 +02:00
Sean Mackrory 34747c373f
HADOOP-16396. Allow authoritative mode on a subdirectory. (#1043) 2019-07-03 12:04:47 -06:00
Erik Krogen ab0b180ddb HDFS-12345 Add Dynamometer to hadoop-tools, a tool for scale testing the HDFS NameNode with real metadata and workloads. Contributed by Erik Krogen. 2019-06-25 08:07:39 -07:00
kkori 366f3deec5 HADOOP-16390. escape javadoc in S3AUtils public methods
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2019-06-25 17:47:37 +09:00
Takanobu Asanuma 98d2065643 HDFS-12564. Add the documents of swebhdfs configurations on the client side. Contributed by Takanobu Asanuma.
Signed-off-by: Wei-Chiu Chuang <weichiu@apache.org>
2019-06-20 20:17:24 -07:00
Steve Loughran e02eb24e0a
HADOOP-15183. S3Guard store becomes inconsistent after partial failure of rename.
Contributed by Steve Loughran.

Change-Id: I825b0bc36be960475d2d259b1cdab45ae1bb78eb
2019-06-20 09:56:40 +01:00
Sahil Takiar 28291a9e8a
HADOOP-16379: S3AInputStream.unbuffer should merge input stream stats into fs-wide stats
Contributed by Sahil Takiar

Change-Id: I2bcfaaea00d12c633757069402dcd0b91a5f5c05
2019-06-20 09:42:27 +01:00
Robert Levas 450c070a8f
HADOOP-16340. ABFS driver continues to retry on IOException responses from REST operations.
Contributed by Robert Levas.

This makes the HttpException constructor protected rather than public, so it is possible
to implement custom subclasses of this exception: exceptions which will not be retried.

Change-Id: Ie8aaa23a707233c2db35948784908b6778ff3a8f
2019-06-19 17:43:14 +01:00