HADOOP-17928. Syncable: S3A to warn and downgrade (#3585)
This switches the default behavior of S3A output streams to warning that Syncable.hsync() or hflush() have been called; it's not considered an error unless the defaults are overridden. This avoids breaking applications which call the APIs, at the risk of people trying to use S3 as a safe store of streamed data (HBase WALs, audit logs etc). Contributed by Steve Loughran.
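The behavior described above can be illustrated with a small, self-contained sketch. It does not use the real Hadoop classes: `SketchStream`, `syncableCallCount()` and `syncIfSupported()` are hypothetical names invented for this illustration, and counting the call before rejecting it is an assumption, not something this commit confirms. The sketch only models the two modes (warn and continue, or throw) and the caller-side probe with `hasCapability("hsync")`.

```java
import java.util.concurrent.atomic.AtomicLong;

public class SyncableDowngradeSketch {

  /** Hypothetical model of an output stream that cannot honor Syncable. */
  static final class SketchStream {
    private final boolean downgradeSyncableExceptions;
    private final AtomicLong syncableCalls = new AtomicLong();

    SketchStream(boolean downgradeSyncableExceptions) {
      this.downgradeSyncableExceptions = downgradeSyncableExceptions;
    }

    /** This model never claims to support hsync or hflush. */
    boolean hasCapability(String capability) {
      return false;
    }

    void hflush() {
      handleSyncable("hflush");
    }

    void hsync() {
      handleSyncable("hsync");
    }

    long syncableCallCount() {
      return syncableCalls.get();
    }

    /** Count the call (an assumption), then either warn or reject it. */
    private void handleSyncable(String operation) {
      syncableCalls.incrementAndGet();
      if (!downgradeSyncableExceptions) {
        throw new UnsupportedOperationException(
            operation + " is not supported by this stream");
      }
      System.err.println(
          "WARN: " + operation + "() called; no data was flushed or saved");
    }
  }

  /** Caller-side pattern: probe the stream before relying on hsync(). */
  static boolean syncIfSupported(SketchStream stream) {
    if (!stream.hasCapability("hsync")) {
      return false;
    }
    stream.hsync();
    return true;
  }

  public static void main(String[] args) {
    SketchStream lenient = new SketchStream(true);
    lenient.hsync();
    System.out.println("calls recorded: " + lenient.syncableCallCount());

    SketchStream strict = new SketchStream(false);
    try {
      strict.hflush();
    } catch (UnsupportedOperationException e) {
      System.out.println("rejected: " + e.getMessage());
    }
    System.out.println("probed sync performed: " + syncIfSupported(strict));
  }
}
```

In neither mode is any data persisted; the flag only decides whether the caller is told so by a warning or by an exception.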
parent 2f35cc36cd
commit 6c6d1b64d4
@@ -2205,7 +2205,16 @@
   </description>
 </property>
 
+<property>
+  <name>fs.s3a.downgrade.syncable.exceptions</name>
+  <value>true</value>
+  <description>
+    Warn but continue when applications use Syncable.hsync when writing
+    to S3A.
+  </description>
+</property>
+
 <!-- Azure file system properties -->
 <property>
   <name>fs.AbstractFileSystem.wasb.impl</name>
   <value>org.apache.hadoop.fs.azure.Wasb</value>
@@ -387,7 +387,7 @@ public final class Constants {
    * Value: {@value}.
    */
   public static final boolean DOWNGRADE_SYNCABLE_EXCEPTIONS_DEFAULT =
-      false;
+      true;
 
   /**
    * The capacity of executor queues for operations other than block
@@ -924,30 +924,48 @@ connector isn't saving any data at all. The `Syncable` API, especially the
 `hsync()` call, are critical for applications such as HBase to safely
 persist data.
 
-The S3A connector throws an `UnsupportedOperationException` when these API calls
-are made, because the guarantees absolutely cannot be met: nothing is being flushed
-or saved.
+When configured to do so, the S3A connector throws an `UnsupportedOperationException`
+when these API calls are made, because the API guarantees absolutely cannot be met:
+_nothing is being flushed or saved_.
 
-* Applications which intend to invoke the Syncable APIs call `hasCapability("hsync")` on
+* Applications which intend to invoke the Syncable APIs should call `hasCapability("hsync")` on
 the stream to see if they are supported.
 * Or catch and downgrade `UnsupportedOperationException`.
 
 These recommendations _apply to all filesystems_.
 
-To downgrade the S3A connector to simply warning of the use of
+For consistency with other filesystems, S3A output streams
+do not by default reject the `Syncable` calls -instead
+they print a warning of its use.
+
+
+The count of invocations of the two APIs are collected in the S3A filesystem
+Statistics/IOStatistics and so their use can be monitored.
+
+To switch the S3A connector to rejecting all use of
 `hsync()` or `hflush()` calls, set the option
-`fs.s3a.downgrade.syncable.exceptions` to true.
+`fs.s3a.downgrade.syncable.exceptions` to `false`.
 
 ```xml
 <property>
   <name>fs.s3a.downgrade.syncable.exceptions</name>
-  <value>true</value>
+  <value>false</value>
 </property>
 ```
 
-The count of invocations of the two APIs are collected
-in the S3A filesystem Statistics/IOStatistics and so
-their use can be monitored.
+Regardless of the setting, the `Syncable` API calls do not work.
+Telling the store to *not* downgrade the calls is a way to
+1. Prevent applications which require Syncable to work from being deployed
+against S3.
+2. Identify applications which are making the calls even though they don't
+need to. These applications can then be fixed -something which may take
+time.
+
+Put differently: it is safest to disable downgrading syncable exceptions.
+However, enabling the downgrade stops applications unintentionally using the API
+from breaking.
+
+*Tip*: try turning it on in staging environments to see what breaks.
 
 ### `RemoteFileChangedException` and read-during-overwrite
 
@@ -141,6 +141,10 @@ public class TestS3ABlockOutputStream extends AbstractS3AMockTest {
    */
   @Test
   public void testSyncableUnsupported() throws Exception {
+    final S3ABlockOutputStream.BlockOutputStreamBuilder
+        builder = mockS3ABuilder();
+    builder.withDowngradeSyncableExceptions(false);
+    stream = spy(new S3ABlockOutputStream(builder));
     intercept(UnsupportedOperationException.class, () -> stream.hflush());
     intercept(UnsupportedOperationException.class, () -> stream.hsync());
   }