From d763c997071752c351ceb689716c8d44803f29b4 Mon Sep 17 00:00:00 2001
From: Viraj Jasani
Date: Tue, 22 Feb 2022 08:30:38 +0530
Subject: [PATCH] HADOOP-18125. Utility to identify git commit / Jira fixVersion discrepancies for RC preparation (#3991)

Signed-off-by: Wei-Chiu Chuang
(cherry picked from commit 697e5d463640a7107a622262eb2d333d0458fd8b)
---
 dev-support/git-jira-validation/README.md     | 134 ++++++++++++++++++
 .../git_jira_fix_version_check.py             | 118 +++++++++++++++
 .../git-jira-validation/requirements.txt      |  18 +++
 3 files changed, 270 insertions(+)
 create mode 100644 dev-support/git-jira-validation/README.md
 create mode 100644 dev-support/git-jira-validation/git_jira_fix_version_check.py
 create mode 100644 dev-support/git-jira-validation/requirements.txt

diff --git a/dev-support/git-jira-validation/README.md b/dev-support/git-jira-validation/README.md
new file mode 100644
index 00000000000..308c54228d1
--- /dev/null
+++ b/dev-support/git-jira-validation/README.md
@@ -0,0 +1,134 @@
+
+
+Apache Hadoop Git/Jira FixVersion validation
+============================================================
+
+Git commits in Apache Hadoop contains Jira number of the format
+HADOOP-XXXX or HDFS-XXXX or YARN-XXXX or MAPREDUCE-XXXX.
+While creating a release candidate, we also include changelist
+and this changelist can be identified based on Fixed/Closed Jiras
+with the correct fix versions. However, sometimes we face few
+inconsistencies between fixed Jira and Git commit message.
+
+git_jira_fix_version_check.py script takes care of
+identifying all git commits with commit
+messages with any of these issues:
+
+1. commit is reverted as per commit message
+2. commit does not contain Jira number format in message
+3. Jira does not have expected fixVersion
+4. Jira has expected fixVersion, but it is not yet resolved
+
+Moreover, this script also finds any resolved Jira with expected
+fixVersion but without any corresponding commit present.
+
+This should be useful as part of RC preparation.
+
+git_jira_fix_version_check supports python3 and it required
+installation of jira:
+
+```
+$ python3 --version
+Python 3.9.7
+
+$ python3 -m venv ./venv
+
+$ ./venv/bin/pip install -r dev-support/git-jira-validation/requirements.txt
+
+$ ./venv/bin/python dev-support/git-jira-validation/git_jira_fix_version_check.py
+
+```
+
+The script also requires below inputs:
+```
+1. First commit hash to start excluding commits from history:
+   Usually we can provide latest commit hash from last tagged release
+   so that the script will only loop through all commits in git commit
+   history before this commit hash. e.g for 3.3.2 release, we can provide
+   git hash: fa4915fdbbbec434ab41786cb17b82938a613f16
+   because this commit bumps up hadoop pom versions to 3.3.2:
+   https://github.com/apache/hadoop/commit/fa4915fdbbbec434ab41786cb17b82938a613f16
+
+2. Fix Version:
+   Exact fixVersion that we would like to compare all Jira's fixVersions
+   with. e.g for 3.3.2 release, it should be 3.3.2.
+
+3. JIRA Project Name:
+   The exact name of Project as case-sensitive e.g HADOOP / OZONE
+
+4. Path of project's working dir with release branch checked-in:
+   Path of project from where we want to compare git hashes from. Local fork
+   of the project should be up-to date with upstream and expected release
+   branch should be checked-in.
+
+5. Jira server url (default url: https://issues.apache.org/jira):
+   Default value of server points to ASF Jiras but this script can be
+   used outside of ASF Jira too.
+```
+
+
+Example of script execution:
+```
+JIRA Project Name (e.g HADOOP / OZONE etc): HADOOP
+First commit hash to start excluding commits from history: fa4915fdbbbec434ab41786cb17b82938a613f16
+Fix Version: 3.3.2
+Jira server url (default: https://issues.apache.org/jira):
+Path of project's working dir with release branch checked-in: /Users/vjasani/Documents/src/hadoop-3.3/hadoop
+
+Check git status output and verify expected branch
+
+On branch branch-3.3.2
+Your branch is up to date with 'origin/branch-3.3.2'.
+
+nothing to commit, working tree clean
+
+
+Jira/Git commit message diff starting: ##############################################
+Jira not present with version: 3.3.2. Commit: 8cd8e435fb43a251467ca74fadcb14f21a3e8163 HADOOP-17198. Support S3 Access Points (#3260) (branch-3.3.2) (#3955)
+WARN: Jira not found. Commit: 8af28b7cca5c6020de94e739e5373afc69f399e5 Updated the index as per 3.3.2 release
+WARN: Jira not found. Commit: e42e483d0085aa46543ebcb1196dd155ddb447d0 Make upstream aware of 3.3.1 release
+Commit seems reverted. Commit: 6db1165380cd308fb74c9d17a35c1e57174d1e09 Revert "HDFS-14099. Unknown frame descriptor when decompressing multiple frames (#3836)"
+Commit seems reverted. Commit: 1e3f94fa3c3d4a951d4f7438bc13e6f008f228f4 Revert "HDFS-16333. fix balancer bug when transfer an EC block (#3679)"
+Jira not present with version: 3.3.2. Commit: ce0bc7b473a62a580c1227a4de6b10b64b045d3a HDFS-16344. Improve DirectoryScanner.Stats#toString (#3695)
+Jira not present with version: 3.3.2. Commit: 30f0629d6e6f735c9f4808022f1a1827c5531f75 HDFS-16339. Show the threshold when mover threads quota is exceeded (#3689)
+Jira not present with version: 3.3.2. Commit: e449daccf486219e3050254d667b74f92e8fc476 YARN-11007. Correct words in YARN documents (#3680)
+Commit seems reverted. Commit: 5c189797828e60a3329fd920ecfb99bcbccfd82d Revert "HDFS-16336. Addendum: De-flake TestRollingUpgrade#testRollback (#3686)"
+Jira not present with version: 3.3.2. Commit: 544dffd179ed756bc163e4899e899a05b93d9234 HDFS-16171. De-flake testDecommissionStatus (#3280)
+Jira not present with version: 3.3.2. Commit: c6914b1cb6e4cab8263cd3ae5cc00bc7a8de25de HDFS-16350. Datanode start time should be set after RPC server starts successfully (#3711)
+Jira not present with version: 3.3.2. Commit: 328d3b84dfda9399021ccd1e3b7afd707e98912d HDFS-16336. Addendum: De-flake TestRollingUpgrade#testRollback (#3686)
+Jira not present with version: 3.3.2. Commit: 3ae8d4ccb911c9ababd871824a2fafbb0272c016 HDFS-16336. De-flake TestRollingUpgrade#testRollback (#3686)
+Jira not present with version: 3.3.2. Commit: 15d3448e25c797b7d0d401afdec54683055d4bb5 HADOOP-17975. Fallback to simple auth does not work for a secondary DistributedFileSystem instance. (#3579)
+Jira not present with version: 3.3.2. Commit: dd50261219de71eaa0a1ad28529953e12dfb92e0 YARN-10991. Fix to ignore the grouping "[]" for resourcesStr in parseResourcesString method (#3592)
+Jira not present with version: 3.3.2. Commit: ef462b21bf03b10361d2f9ea7b47d0f7360e517f HDFS-16332. Handle invalid token exception in sasl handshake (#3677)
+WARN: Jira not found. Commit: b55edde7071419410ea5bea4ce6462b980e48f5b Also update hadoop.version to 3.3.2
+...
+...
+...
+Found first commit hash after which git history is redundant. commit: fa4915fdbbbec434ab41786cb17b82938a613f16
+Exiting successfully
+Jira/Git commit message diff completed: ##############################################
+
+Any resolved Jira with fixVersion 3.3.2 but corresponding commit not present
+Starting diff: ##############################################
+HADOOP-18066 is marked resolved with fixVersion 3.3.2 but no corresponding commit found
+HADOOP-17936 is marked resolved with fixVersion 3.3.2 but no corresponding commit found
+Completed diff: ##############################################
+
+
+```
+
diff --git a/dev-support/git-jira-validation/git_jira_fix_version_check.py b/dev-support/git-jira-validation/git_jira_fix_version_check.py
new file mode 100644
index 00000000000..c2e12a13aae
--- /dev/null
+++ b/dev-support/git-jira-validation/git_jira_fix_version_check.py
@@ -0,0 +1,118 @@
+#!/usr/bin/env python3
+############################################################################
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+############################################################################
+"""An application to assist Release Managers with ensuring that histories in
+Git and fixVersions in JIRA are in agreement. See README.md for a detailed
+explanation.
+"""
+
+
+import os
+import re
+import subprocess
+
+from jira import JIRA
+
+jira_project_name = input("JIRA Project Name (e.g HADOOP / OZONE etc): ") \
+                    or "HADOOP"
+# Define project_jira_keys with - appended. e.g for HADOOP Jiras,
+# project_jira_keys should include HADOOP-, HDFS-, YARN-, MAPREDUCE-
+project_jira_keys = [jira_project_name + '-']
+if jira_project_name == 'HADOOP':
+    project_jira_keys.append('HDFS-')
+    project_jira_keys.append('YARN-')
+    project_jira_keys.append('MAPREDUCE-')
+
+first_exclude_commit_hash = input("First commit hash to start excluding commits from history: ")
+fix_version = input("Fix Version: ")
+
+jira_server_url = input(
+    "Jira server url (default: https://issues.apache.org/jira): ") \
+        or "https://issues.apache.org/jira"
+
+jira = JIRA(server=jira_server_url)
+
+local_project_dir = input("Path of project's working dir with release branch checked-in: ")
+os.chdir(local_project_dir)
+
+GIT_STATUS_MSG = subprocess.check_output(['git', 'status']).decode("utf-8")
+print('\nCheck git status output and verify expected branch\n')
+print(GIT_STATUS_MSG)
+
+print('\nJira/Git commit message diff starting: ##############################################')
+
+issue_set_from_commit_msg = set()
+
+for commit in subprocess.check_output(['git', 'log', '--pretty=oneline']).decode(
+        "utf-8").splitlines():
+    if commit.startswith(first_exclude_commit_hash):
+        print("Found first commit hash after which git history is redundant. commit: "
+              + first_exclude_commit_hash)
+        print("Exiting successfully")
+        break
+    if re.search('revert', commit, re.IGNORECASE):
+        print("Commit seems reverted. \t\t\t Commit: " + commit)
+        continue
+    ACTUAL_PROJECT_JIRA = None
+    for project_jira in project_jira_keys:
+        if project_jira in commit:
+            ACTUAL_PROJECT_JIRA = project_jira
+            break
+    if not ACTUAL_PROJECT_JIRA:
+        print("WARN: Jira not found. \t\t\t Commit: " + commit)
+        continue
+    JIRA_NUM = ''
+    for c in commit.split(ACTUAL_PROJECT_JIRA)[1]:
+        if c.isdigit():
+            JIRA_NUM = JIRA_NUM + c
+        else:
+            break
+    issue = jira.issue(ACTUAL_PROJECT_JIRA + JIRA_NUM)
+    EXPECTED_FIX_VERSION = False
+    for version in issue.fields.fixVersions:
+        if version.name == fix_version:
+            EXPECTED_FIX_VERSION = True
+            break
+    if not EXPECTED_FIX_VERSION:
+        print("Jira not present with version: " + fix_version + ". \t Commit: " + commit)
+        continue
+    if issue.fields.status is None or issue.fields.status.name not in ('Resolved', 'Closed'):
+        print("Jira is not resolved yet? \t\t Commit: " + commit)
+    else:
+        # This means Jira corresponding to current commit message is resolved with expected
+        # fixVersion.
+        # This is no-op by default, if needed, convert to print statement.
+        issue_set_from_commit_msg.add(ACTUAL_PROJECT_JIRA + JIRA_NUM)
+
+print('Jira/Git commit message diff completed: ##############################################')
+
+print('\nAny resolved Jira with fixVersion ' + fix_version
+      + ' but corresponding commit not present')
+print('Starting diff: ##############################################')
+all_issues_with_fix_version = jira.search_issues(
+    'project=' + jira_project_name + ' and status in (Resolved,Closed) and fixVersion='
+    + fix_version)
+
+for issue in all_issues_with_fix_version:
+    if issue.key not in issue_set_from_commit_msg:
+        print(issue.key + ' is marked resolved with fixVersion ' + fix_version
+              + ' but no corresponding commit found')
+
+print('Completed diff: ##############################################')
diff --git a/dev-support/git-jira-validation/requirements.txt b/dev-support/git-jira-validation/requirements.txt
new file mode 100644
index 00000000000..ae7535a119f
--- /dev/null
+++ b/dev-support/git-jira-validation/requirements.txt
@@ -0,0 +1,18 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+jira==3.1.1
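
Note (not part of the patch): the per-commit classification that the script performs on each `git log --pretty=oneline` line can be illustrated without a Jira connection. The sketch below is only an approximation under stated assumptions: `classify_commit` and `JIRA_KEY_RE` are hypothetical names, the prefixes are hard-coded for the HADOOP case, and a regular expression stands in for the script's substring check and digit scan. One behavioural difference follows from that swap: when a message mentions several keys, the regex picks the first one in the text, while the script picks the first match in `project_jira_keys` order.

```python
import re

# Hypothetical helper for illustration; prefixes mirror the
# project_jira_keys list built by the script for the HADOOP project.
JIRA_KEY_RE = re.compile(r"\b(HADOOP|HDFS|YARN|MAPREDUCE)-(\d+)")


def classify_commit(oneline):
    """Classify one `git log --pretty=oneline` line.

    Returns a (kind, jira_key) tuple where kind is one of
    "reverted", "no_jira" or "jira".
    """
    # Same revert heuristic as the script: any case-insensitive "revert".
    if re.search("revert", oneline, re.IGNORECASE):
        return ("reverted", None)
    match = JIRA_KEY_RE.search(oneline)
    if match is None:
        return ("no_jira", None)
    # group(0) is the full key, e.g. "HDFS-16344".
    return ("jira", match.group(0))
```

Only lines classified as "jira" would go on to the fixVersion and status lookups; the other two kinds correspond to the "Commit seems reverted" and "WARN: Jira not found" messages in the example output.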