2112 Commits

Author SHA1 Message Date
Ashutosh Gupta 30c36ef25a
HADOOP-18400. Fix file split duplicating records from a succeeding split when reading BZip2 text files (#4732)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-09-19 13:45:05 +09:00
Ashutosh Gupta 59d3c20118
MAPREDUCE-7407. Avoid stopContainer() on dead node (#4779) 2022-09-15 10:30:36 -07:00
sreeb-msft c48ed3e96c
HADOOP-18408. ABFS: ITestAbfsManifestCommitProtocol fails on nonHNS configuration (#4758)
ITestAbfsManifestCommitProtocol now sets requireRenameResilience to false for nonHNS configurations (#4758)

Contributed by Sree Bhattacharyya
2022-09-02 12:33:12 +01:00
9uapaw 84081a8cae MAPREDUCE-7409. Make shuffle key length configurable. Contributed by Ashutosh Gupta. 2022-08-31 17:32:51 +02:00
Steve Loughran de37fd37d6
MAPREDUCE-7403. manifest-committer dynamic partitioning support. (#4728)
Declares its compatibility with Spark's dynamic
output partitioning by having the stream capability
"mapreduce.job.committer.dynamic.partitioning"

Requires a Spark release with SPARK-40034, which
probes for the capability before deciding whether to
accept or reject instantiation with
dynamic partition overwrite set.

This feature can be declared as supported by
any other PathOutputCommitter implementations
whose algorithm and destination filesystem
are compatible.

None of the S3A committers are compatible.

The classic FileOutputCommitter is, but it
does not declare itself as such out of our fear
of changing that code. The Spark-side code
will automatically infer compatibility if
the created committer is of that class or
a subclass.

Contributed by Steve Loughran.
2022-08-24 11:18:19 +01:00
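The compatibility check described in the MAPREDUCE-7403 entry above can be sketched roughly as follows. This is an illustrative model only, not the Hadoop or Spark code: the class names are stand-ins, `has_capability` and `supports_dynamic_partitioning` are hypothetical helpers, and the order of the two checks is an assumption. Only the capability string is taken from the commit message.

```python
"""Illustrative model of the dynamic-partitioning compatibility check."""

# Capability name quoted in the MAPREDUCE-7403 commit message.
DYNAMIC_PARTITIONING = "mapreduce.job.committer.dynamic.partitioning"


class PathOutputCommitter:
    """Stand-in base class: declares no capabilities by default."""

    def has_capability(self, capability: str) -> bool:
        return False


class FileOutputCommitter(PathOutputCommitter):
    """Stand-in for the classic committer: compatible, but does not declare it."""


class ManifestCommitter(PathOutputCommitter):
    """Stand-in for a committer that declares the capability."""

    def has_capability(self, capability: str) -> bool:
        return capability == DYNAMIC_PARTITIONING


class S3ALikeCommitter(PathOutputCommitter):
    """Stand-in for a committer that is not compatible and declares nothing."""


def supports_dynamic_partitioning(committer: PathOutputCommitter) -> bool:
    """Hypothetical caller-side check (assumed order of tests)."""
    # The classic committer and its subclasses are inferred compatible by type,
    # because that class does not declare the capability itself.
    if isinstance(committer, FileOutputCommitter):
        return True
    # Any other committer must declare the capability explicitly.
    return committer.has_capability(DYNAMIC_PARTITIONING)
```

In this model a committer that neither declares the capability nor derives from the classic committer is rejected, which matches the statement that none of the S3A committers are compatible.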
slfan1989 977f4b6165
MAPREDUCE-7385. Improve JobEndNotifier#httpNotification with recommended methods. (#4403). Contributed by fanshilun.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-08-09 00:59:03 +05:30
Ashutosh Gupta bd0f9a46e1
HADOOP-18390. Fix out of sync import for HADOOP-18321 (#4694)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-08-06 21:51:23 +09:00
skysiders 9fe96238d2
MAPREDUCE-7372. MapReduce sets permission too late in copyJar method (#4026). Contributed by Zhang Dongsheng.
Reviewed-by: Steve Loughran <stevel@apache.org>
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
2022-07-25 11:38:59 -07:00
PJ Fanning 34e548cb62
HADOOP-18332: remove rs-api dependency as it conflicts with jsr311-api (#4547)
This downgrades Jackson from the version switched to in
HADOOP-18033 (2.13.0), to Jackson 2.12.7.
This removes the dependency on javax.ws.rs-api,
so avoiding runtime problems with applications using
jersey-core v1 and/or jsr311-api.

The 2.12.7 release still contains the fix for CVE-2020-36518.

Contributed by PJ Fanning
2022-07-17 21:37:54 +05:30
Ashutosh Gupta 4e8c0b902e
MAPREDUCE-7201. Make Job History File Permissions configurable (#4507)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
2022-07-11 11:34:52 +05:30
Akira Ajisaka 9b1d3579b4
Revert "MAPREDUCE-7388. Remove unused variable _eof in GzipCodec.cc (#4429)"
This reverts commit fac895828f.
2022-07-09 03:05:42 +09:00
cfg1234 fac895828f
MAPREDUCE-7388. Remove unused variable _eof in GzipCodec.cc (#4429)
Co-authored-by: cWX456268 <chenfengge1@huawei.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-07-09 02:51:49 +09:00
Ashutosh Gupta a432925f74
HADOOP-18321. Fix when to read an additional record from a BZip2 text file split (#4521)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Reviewed-by: Akira Ajisaka
2022-07-06 10:00:14 +05:30
slfan1989 073b8ea1d5
HADOOP-18284. Remove unnecessary semicolon ';' (#4422). Contributed by fanshilun. 2022-06-29 15:20:41 +05:30
Steve Loughran c9ddbd210c
MAPREDUCE-7391. TestLocalDistributedCacheManager failing after HADOOP-16202 (#4472)
Fixing a mockito-based test which broke when HADOOP-16202
changed the methods being invoked.

Contributed by Steve Loughran
2022-06-22 12:52:41 +01:00
Christian Bartolomäus ef36457b53
MAPREDUCE-7389. Fix typo in description of property (#4440). Contributed by Christian Bartolomaus. 2022-06-21 19:24:11 +05:30
Ashutosh Gupta 36c4be819f
MAPREDUCE-7369. Fix MapReduce tasks timing out when more time is spent in MultipleOutputs#close (#4247)
Contributed by Ravuri Sushma sree.

Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
2022-06-20 17:01:01 +09:00
slfan1989 10fc865d3c
MAPREDUCE-7387. Fix TestJHSSecurity#testDelegationToken AssertionError due to HDFS-16563 (#4428). Contributed by fanshilun.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2022-06-20 12:14:04 +05:30
Steve Loughran e199da3fae
HADOOP-17833. Improve Magic Committer performance (#3289)
Speed up the magic committer, with the key changes being:

* Writes under __magic always retain directory markers.
* File creation under __magic skips all overwrite checks,
  including the LIST call intended to stop files being
  created over dirs.
* mkdirs under __magic probes the path for existence
  but does not look any further.

Extra parallelism in task and job commit directory scanning.
Use of createFile and openFile with parameters which allow for
HEAD checks to be skipped.

The committer can write the summary _SUCCESS file to the path
`fs.s3a.committer.summary.report.directory`, which can be in a
different file system/bucket if desired, using the job id as
the filename. 

Also: HADOOP-15460. S3A FS to add `fs.s3a.create.performance`

Application code can set the createFile() option
fs.s3a.create.performance to true to disable the same
safety checks when writing under magic directories.
Use with care.

The createFile option prefix `fs.s3a.create.header.`
can be used to add custom headers to S3 objects when
created.


Contributed by Steve Loughran.
2022-06-17 19:11:35 +01:00
Ashutosh Gupta 9c3330c22f
MAPREDUCE-7377. Remove unused imports in MapReduce project (#4299)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-05-14 01:34:19 +09:00
Szilard Nemeth f143e99428 MAPREDUCE-7379. RMContainerRequestor#makeRemoteRequest has confusing log message. Contributed by Ashutosh Gupta 2022-05-11 16:55:19 +02:00
Ayush Saxena 665ada6d21
MAPREDUCE-7376. AggregateWordCount fetches wrong results. (#4257). Contributed by Ayush Saxena.
Reviewed-by: Steve Loughran <stevel@apache.org>
2022-05-09 22:56:14 +05:30
Ashutosh Gupta fb13c1e4a8
MAPREDUCE-7246. In MapredAppMasterRest#Mapreduce_Application_Master_Info_API, updating the datatype of appId to "string". (#4223)
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2022-04-25 14:29:35 +09:00
Steve Loughran 6999acf520
HADOOP-16202. Enhanced openFile(): mapreduce and YARN changes. (#2584/2)
These changes ensure that sequential files are opened with the
right read policy, and split start/end is passed in.

As well as offering opportunities for filesystem clients to
choose fetch/cache/seek policies, the settings ensure that
text files on an S3 bucket where the default policy
is "random" will still be processed efficiently.

This commit depends on the associated hadoop-common patch,
which must be committed first.

Contributed by Steve Loughran.

Change-Id: Ic6713fd752441cf42ebe8739d05c2293a5db9f94
2022-04-24 17:33:05 +01:00
Kengo Seki dc4a680da8
MAPREDUCE-7373. Building MapReduce NativeTask fails on Fedora 34+ (#4120) 2022-03-30 22:47:45 +09:00
Steve Loughran 7328c34ba5
MAPREDUCE-7341. Add an intermediate manifest committer for Azure and GCS
This is a MapReduce/Spark output committer optimized for
performance and correctness on Azure ADLS Gen 2 storage
(via the abfs connector) and Google Cloud Storage
(via the external gcs connector library).

* It is safe to use with HDFS; however, it has not been optimized
  for that use.
* It is *not* safe for use with S3, and will fail if an attempt
  is made to do so.

Contributed by Steve Loughran

Change-Id: I6f3502e79c578b9fd1a8c1485f826784b5421fca
2022-03-17 11:24:13 +00:00
Viraj Jasani 66b72406bd
HADOOP-18131. Upgrade maven enforcer plugin and relevant dependencies (#4000)
Reviewed-by: Akira Ajisaka <aajisaka@apache.org>
Reviewed-by: Wei-Chiu Chuang <weichiu@apache.org>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2022-03-08 17:27:04 +09:00
Viraj Jasani 08c803ea30
MAPREDUCE-7371. DistributedCache alternative APIs should not use DistributedCache APIs internally (#3855) 2022-01-09 00:18:10 +09:00
Stamatis Zampetakis bface2ac6c
MAPREDUCE-7368. DBOutputFormat.DBRecordWriter#write must throw exception when it fails. (#3671). Contributed by Stamatis Zampetakis.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2021-12-08 16:40:11 +05:30
Viraj Jasani 53edd0de5a
HADOOP-18033. Upgrade fasterxml Jackson to 2.13.0 (#3749)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2021-12-08 16:52:22 +09:00
Viraj Jasani 215388beea
HADOOP-18022. Add restrict-imports-enforcer-rule for Guava Preconditions and remove remaining usages (#3712)
Reviewed-by: Akira Ajisaka <aajisaka@apache.org>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2021-11-29 17:37:30 +09:00
Viraj Jasani 516f36c6f1
HADOOP-17967. Keep restrict-imports-enforcer-rule for Guava VisibleForTesting in hadoop-main pom (#3555) 2021-10-21 16:54:25 +09:00
Viraj Jasani 1151edf12e
HADOOP-17956. Replace all default Charset usage with UTF-8 (#3529)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2021-10-14 13:07:24 +09:00
Viraj Jasani b1ad4eab9a
HADOOP-17959. Replace Guava VisibleForTesting by Hadoop's own annotation in hadoop-cloud-storage-project and hadoop-mapreduce-project modules (#3537)
Reviewed-by: Ahmed Hussein <ahussein@apache.org>
2021-10-11 16:22:50 +09:00
Viraj Jasani 8071dbb9c6
HADOOP-17950. Provide replacement for deprecated APIs of commons-io IOUtils (#3515)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2021-10-07 10:58:29 +09:00
Dongjoon Hyun f5148ca542
MAPREDUCE-7363. Rename JobClientUnitTest to TestJobClients (#3487) 2021-09-28 09:50:01 -07:00
Chao Sun 2ee294b1b1 Revert "MAPREDUCE-7303. Fix TestJobResourceUploader failures after HADOOP-16878. Contributed by Peter Bacsko."
This reverts commit 7bc305db5d.
2021-09-25 09:29:33 -07:00
lzx404243 6187f76f11
MAPREDUCE-7311. Clear filesystem statistics after tests in TestTaskProgressReporter (#2500)
Co-authored-by: Zhengxi Li <zli89@illinois.edu>
2021-09-01 13:47:09 +09:00
lzx404243 7b5be74228
MAPREDUCE-7342. Stop RMService in TestClientRedirect.testRedirect() (#2968)
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2021-08-30 08:39:33 +09:00
jenny e31169c864
MAPREDUCE-7258. HistoryServerRest.html#Task_Counters_API, modify the jobTaskCounters's itemName from taskcounterGroup to taskCounterGroup (#1808)
Co-authored-by: chenjuanni <chenjuanni@inspur.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
2021-08-02 15:36:53 +09:00
Viraj Jasani e95c3259de
MAPREDUCE-7356. Remove some duplicate dependencies from mapreduce-client's child poms (#3193). Contributed by Viraj Jasani.
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
2021-07-13 19:30:13 +05:30
Viraj Jasani 618c9218ee
HADOOP-17788. Replace IOUtils#closeQuietly usages by Hadoop's own utility (#3171)
Reviewed-by: Steve Loughran <stevel@apache.org>
Reviewed-by: Akira Ajisaka <aajisaka@apache.org>
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2021-07-08 16:03:40 +09:00
Eric Payne 7581413156 MAPREDUCE-7353: Mapreduce job fails when NM is stopped. Contributed by Bilwa S T (BilwaST) 2021-07-07 20:43:44 +00:00
Shubham Gupta 3f4221ec34
MAPREDUCE-7351. CleanupJob during handling of SIGTERM signal (#3176)
Co-authored-by: Shubham Gupta <gshubham@microsoft.com>
2021-07-07 09:08:15 +05:30
Jim Brennan 7c7d02edbd YARN-10824. Title not set for JHS and NM webpages. Contributed by Bilwa S T. 2021-06-25 20:32:08 +00:00
Viraj Jasani 6e11461eaa
MAPREDUCE-7354. Use empty array constant present in TaskCompletionEvent to avoid creating redundant objects (#3123)
Reviewed-by: Hui Fei <ferhui@apache.org>
Reviewed-by: Akira Ajisaka <aajisaka@apache.org>
2021-06-21 16:46:06 +09:00
Viraj Jasani 4ef27a596f
HADOOP-17753. Keep restrict-imports-enforcer-rule for Guava Lists in top level hadoop-main pom (#3087) 2021-06-11 12:15:52 +09:00
Viraj Jasani 207c92753f
MAPREDUCE-7350. Replace Guava Lists usage by Hadoop's own Lists in hadoop-mapreduce-project (#3074) 2021-06-07 11:51:29 +09:00
Viraj Jasani 986d0a4f1d
HADOOP-17732. Keep restrict-imports-enforcer-rule for Guava Sets in hadoop-main pom (#3049)
Signed-off-by: Takanobu Asanuma <tasanuma@apache.org>
2021-05-26 17:14:31 +09:00
Akira Ajisaka 8a489ce78e
MAPREDUCE-7348. TestFrameworkUploader#testNativeIO fails. (#3053)
Reviewed-by: Hui Fei <ferhui@apache.org>
2021-05-26 15:47:56 +09:00