HADOOP-18470. More in the 3.3.5 index.html about security (#5383)
Expands on the comments in cluster config to tell people they shouldn't be running a cluster without a private VLAN in cloud, that Knox is good here, and unsecured clusters without a VLAN are just computation-as-a-service to crypto miners.

Contributed by Steve Loughran
parent e2ab35084a
commit d56977e909
@@ -35,6 +35,8 @@ These instructions do not cover integration with any Kerberos services,
 -everyone bringing up a production cluster should include connecting to their
 organisation's Kerberos infrastructure as a key part of the deployment.
 
+See [Security](./SecureMode.html) for details on how to secure a cluster.
+
 Prerequisites
 -------------
 
@@ -24,7 +24,7 @@ Users are encouraged to read the full set of release notes.
 This page provides an overview of the major changes.
 
 Azure ABFS: Critical Stream Prefetch Fix
----------------------------------------------
+----------------------------------------
 
 The abfs has a critical bug fix
 [HADOOP-18546](https://issues.apache.org/jira/browse/HADOOP-18546).
@@ -120,25 +120,76 @@ be vulnerable, and the ugprades should also reduce the number of false
 positives security scanners report.
 
 We have not been able to upgrade every single dependency to the latest
-version there is. Some of those changes are just going to be incompatible.
-If you have concerns about the state of a specific library, consult the pache JIRA
-issue tracker to see whether a JIRA has been filed, discussions have taken place about
+version there is. Some of those changes are fundamentally incompatible.
+If you have concerns about the state of a specific library, consult the Apache JIRA
+issue tracker to see if an issue has been filed, discussions have taken place about
 the library in question, and whether or not there is already a fix in the pipeline.
 *Please don't file new JIRAs about dependency-X.Y.Z having a CVE without
 searching for any existing issue first*
 
-As an open source project, contributions in this area are always welcome,
+As an open-source project, contributions in this area are always welcome,
 especially in testing the active branches, testing applications downstream of
 those branches and of whether updated dependencies trigger regressions.
 
 
+Security Advisory
+=================
+
+Hadoop HDFS is a distributed filesystem allowing remote
+callers to read and write data.
+
+Hadoop YARN is a distributed job submission/execution
+engine allowing remote callers to submit arbitrary
+work into the cluster.
+
+Unless a Hadoop cluster is deployed with
+[caller authentication with Kerberos](./hadoop-project-dist/hadoop-common/SecureMode.html),
+anyone with network access to the servers has unrestricted access to the data
+and the ability to run whatever code they want in the system.
+
+In production, there are generally three deployment patterns which
+can, with care, keep data and computing resources private.
+1. Physical cluster: *configure Hadoop security*, usually bonded to the
+enterprise Kerberos/Active Directory systems.
+Good.
+1. Cloud: transient or persistent single or multiple user/tenant cluster
+with private VLAN *and security*.
+Good.
+Consider [Apache Knox](https://knox.apache.org/) for managing remote
+access to the cluster.
+1. Cloud: transient single user/tenant cluster with private VLAN
+*and no security at all*.
+Requires careful network configuration as this is the sole
+means of securing the cluster..
+Consider [Apache Knox](https://knox.apache.org/) for managing
+remote access to the cluster.
+
+*If you deploy a Hadoop cluster in-cloud without security, and without configuring a VLAN
+to restrict access to trusted users, you are implicitly sharing your data and
+computing resources with anyone with network access*
+
+If you do deploy an insecure cluster this way then port scanners will inevitably
+find it and submit crypto-mining jobs. If this happens to you, please do not report
+this as a CVE or security issue: it is _utterly predictable_. Secure *your cluster* if
+you want to remain exclusively *your cluster*.
+
+Finally, if you are using Hadoop as a service deployed/managed by someone else,
+do determine what security their products offer and make sure it meets your requirements.
+
+
 Getting Started
 ===============
 
 The Hadoop documentation includes the information you need to get started using
 Hadoop. Begin with the
 [Single Node Setup](./hadoop-project-dist/hadoop-common/SingleCluster.html)
 which shows you how to set up a single-node Hadoop installation.
 Then move on to the
 [Cluster Setup](./hadoop-project-dist/hadoop-common/ClusterSetup.html)
 to learn how to set up a multi-node Hadoop installation.
+
+Before deploying Hadoop in production, read
+[Hadoop in Secure Mode](./hadoop-project-dist/hadoop-common/SecureMode.html),
+and follow its instructions to secure your cluster.
+
+
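
Reviewer note: the Security Advisory text in this commit distinguishes clusters with Hadoop security enabled from clusters with none. The following is a minimal sketch, not part of the commit, of how a `core-site.xml` could be checked for the two standard properties that turn on Kerberos authentication and service-level authorization (`hadoop.security.authentication` and `hadoop.security.authorization`). The helper names `load_properties` and `security_warnings` and the sample XML strings are hypothetical and exist only for illustration. An empty warning list does not show that a cluster is secure; network isolation, keytabs and the rest of the Secure Mode guide still apply.

```python
import xml.etree.ElementTree as ET


def load_properties(xml_text):
    """Parse Hadoop-style <configuration> XML into a name -> value dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def security_warnings(props):
    """Return a list of warnings for settings that leave security off."""
    warnings = []
    # Hadoop falls back to "simple" (no caller authentication) when unset.
    auth = props.get("hadoop.security.authentication", "simple").lower()
    if auth != "kerberos":
        warnings.append("hadoop.security.authentication is not 'kerberos'")
    # Service-level authorization is disabled ("false") when unset.
    authz = props.get("hadoop.security.authorization", "false").lower()
    if authz != "true":
        warnings.append("hadoop.security.authorization is not 'true'")
    return warnings


# Hypothetical sample configurations, for illustration only.
INSECURE = """<configuration>
  <property>
    <name>hadoop.security.authentication</name>
    <value>simple</value>
  </property>
</configuration>"""

SECURE = """<configuration>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for label, text in (("insecure", INSECURE), ("secure", SECURE)):
        print(label, security_warnings(load_properties(text)))
```

A check like this only covers the first deployment pattern's precondition; for the "private VLAN and no security" pattern the warnings are expected, and the network configuration is what has to be reviewed instead.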