<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!--
| Generated by Apache Maven Doxia at 2023-03-25
| Rendered using Apache Maven Stylus Skin 1.5
-->
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Apache Hadoop 3.4.0-SNAPSHOT &#x2013; Output: OutputStream, Syncable and StreamCapabilities</title>
<style type="text/css" media="all">
@import url("../css/maven-base.css");
@import url("../css/maven-theme.css");
@import url("../css/site.css");
</style>
<link rel="stylesheet" href="../css/print.css" type="text/css" media="print" />
<meta name="Date-Revision-yyyymmdd" content="20230325" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>
<body class="composite">
<div id="banner">
<a href="http://hadoop.apache.org/" id="bannerLeft">
<img src="http://hadoop.apache.org/images/hadoop-logo.jpg" alt="" />
</a>
<a href="http://www.apache.org/" id="bannerRight">
<img src="http://www.apache.org/images/asf_logo_wide.png" alt="" />
</a>
<div class="clear">
<hr/>
</div>
</div>
<div id="breadcrumbs">
<div class="xright"> <a href="http://wiki.apache.org/hadoop" class="externalLink">Wiki</a>
|
<a href="https://gitbox.apache.org/repos/asf/hadoop.git" class="externalLink">git</a>
|
<a href="http://hadoop.apache.org/" class="externalLink">Apache Hadoop</a>
&nbsp;| Last Published: 2023-03-25
&nbsp;| Version: 3.4.0-SNAPSHOT
</div>
<div class="clear">
<hr/>
</div>
</div>
<div id="leftColumn">
<div id="navcolumn">
<h5>General</h5>
<ul>
<li class="none">
<a href="../../../index.html">Overview</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/SingleCluster.html">Single Node Setup</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/ClusterSetup.html">Cluster Setup</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/CommandsManual.html">Commands Reference</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/FileSystemShell.html">FileSystem Shell</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/Compatibility.html">Compatibility Specification</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/DownstreamDev.html">Downstream Developer's Guide</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/AdminCompatibilityGuide.html">Admin Compatibility Guide</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/InterfaceClassification.html">Interface Classification</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/filesystem/index.html">FileSystem Specification</a>
</li>
</ul>
<h5>Common</h5>
<ul>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/CLIMiniCluster.html">CLI Mini Cluster</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/FairCallQueue.html">Fair Call Queue</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/NativeLibraries.html">Native Libraries</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/Superusers.html">Proxy User</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/RackAwareness.html">Rack Awareness</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/SecureMode.html">Secure Mode</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/ServiceLevelAuth.html">Service Level Authorization</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/HttpAuthentication.html">HTTP Authentication</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/CredentialProviderAPI.html">Credential Provider API</a>
</li>
<li class="none">
<a href="../../../hadoop-kms/index.html">Hadoop KMS</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/Tracing.html">Tracing</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/UnixShellGuide.html">Unix Shell Guide</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/registry/index.html">Registry</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/AsyncProfilerServlet.html">Async Profiler</a>
</li>
</ul>
<h5>HDFS</h5>
<ul>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/HdfsDesign.html">Architecture</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html">User Guide</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/HDFSCommands.html">Commands Reference</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html">NameNode HA With QJM</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithNFS.html">NameNode HA With NFS</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/ObserverNameNode.html">Observer NameNode</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/Federation.html">Federation</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/ViewFs.html">ViewFs</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/ViewFsOverloadScheme.html">ViewFsOverloadScheme</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html">Snapshots</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/HdfsEditsViewer.html">Edits Viewer</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/HdfsImageViewer.html">Image Viewer</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html">Permissions and HDFS</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/HdfsQuotaAdminGuide.html">Quotas and HDFS</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/LibHdfs.html">libhdfs (C API)</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/WebHDFS.html">WebHDFS (REST API)</a>
</li>
<li class="none">
<a href="../../../hadoop-hdfs-httpfs/index.html">HttpFS</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/ShortCircuitLocalReads.html">Short Circuit Local Reads</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html">Centralized Cache Management</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html">NFS Gateway</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html">Rolling Upgrade</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/ExtendedAttributes.html">Extended Attributes</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html">Transparent Encryption</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/HdfsMultihoming.html">Multihoming</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html">Storage Policies</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/MemoryStorage.html">Memory Storage Support</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/SLGUserGuide.html">Synthetic Load Generator</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/HDFSErasureCoding.html">Erasure Coding</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/HDFSDiskbalancer.html">Disk Balancer</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/HdfsUpgradeDomain.html">Upgrade Domain</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/HdfsDataNodeAdminGuide.html">DataNode Admin</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs-rbf/HDFSRouterFederation.html">Router Federation</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/HdfsProvidedStorage.html">Provided Storage</a>
</li>
</ul>
<h5>MapReduce</h5>
<ul>
<li class="none">
<a href="../../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html">Tutorial</a>
</li>
<li class="none">
<a href="../../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapredCommands.html">Commands Reference</a>
</li>
<li class="none">
<a href="../../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html">Compatibility with 1.x</a>
</li>
<li class="none">
<a href="../../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/EncryptedShuffle.html">Encrypted Shuffle</a>
</li>
<li class="none">
<a href="../../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/PluggableShuffleAndPluggableSort.html">Pluggable Shuffle/Sort</a>
</li>
<li class="none">
<a href="../../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/DistributedCacheDeploy.html">Distributed Cache Deploy</a>
</li>
<li class="none">
<a href="../../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/SharedCacheSupport.html">Support for YARN Shared Cache</a>
</li>
</ul>
<h5>MapReduce REST APIs</h5>
<ul>
<li class="none">
<a href="../../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapredAppMasterRest.html">MR Application Master</a>
</li>
<li class="none">
<a href="../../../hadoop-mapreduce-client/hadoop-mapreduce-client-hs/HistoryServerRest.html">MR History Server</a>
</li>
</ul>
<h5>YARN</h5>
<ul>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/YARN.html">Architecture</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/YarnCommands.html">Commands Reference</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html">Capacity Scheduler</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/FairScheduler.html">Fair Scheduler</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/ResourceManagerRestart.html">ResourceManager Restart</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html">ResourceManager HA</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/ResourceModel.html">Resource Model</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/NodeLabel.html">Node Labels</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/NodeAttributes.html">Node Attributes</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/WebApplicationProxy.html">Web Application Proxy</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/TimelineServer.html">Timeline Server</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/TimelineServiceV2.html">Timeline Service V.2</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html">Writing YARN Applications</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/YarnApplicationSecurity.html">YARN Application Security</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/NodeManager.html">NodeManager</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/DockerContainers.html">Running Applications in Docker Containers</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/RuncContainers.html">Running Applications in runC Containers</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/NodeManagerCgroups.html">Using CGroups</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/SecureContainer.html">Secure Containers</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/ReservationSystem.html">Reservation System</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/GracefulDecommission.html">Graceful Decommission</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/OpportunisticContainers.html">Opportunistic Containers</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/Federation.html">YARN Federation</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/SharedCache.html">Shared Cache</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/UsingGpus.html">Using GPU</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/UsingFPGA.html">Using FPGA</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/PlacementConstraints.html">Placement Constraints</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/YarnUI2.html">YARN UI2</a>
</li>
</ul>
<h5>YARN REST APIs</h5>
<ul>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/WebServicesIntro.html">Introduction</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html">Resource Manager</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/NodeManagerRest.html">Node Manager</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/TimelineServer.html#Timeline_Server_REST_API_v1">Timeline Server</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/TimelineServiceV2.html#Timeline_Service_v.2_REST_API">Timeline Service V.2</a>
</li>
</ul>
<h5>YARN Service</h5>
<ul>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/yarn-service/Overview.html">Overview</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/yarn-service/QuickStart.html">QuickStart</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/yarn-service/Concepts.html">Concepts</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/yarn-service/YarnServiceAPI.html">Yarn Service API</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/yarn-service/ServiceDiscovery.html">Service Discovery</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-site/yarn-service/SystemServices.html">System Services</a>
</li>
</ul>
<h5>Hadoop Compatible File Systems</h5>
<ul>
<li class="none">
<a href="../../../hadoop-aliyun/tools/hadoop-aliyun/index.html">Aliyun OSS</a>
</li>
<li class="none">
<a href="../../../hadoop-aws/tools/hadoop-aws/index.html">Amazon S3</a>
</li>
<li class="none">
<a href="../../../hadoop-azure/index.html">Azure Blob Storage</a>
</li>
<li class="none">
<a href="../../../hadoop-azure-datalake/index.html">Azure Data Lake Storage</a>
</li>
<li class="none">
<a href="../../../hadoop-cos/cloud-storage/index.html">Tencent COS</a>
</li>
<li class="none">
<a href="../../../hadoop-huaweicloud/cloud-storage/index.html">Huaweicloud OBS</a>
</li>
</ul>
<h5>Auth</h5>
<ul>
<li class="none">
<a href="../../../hadoop-auth/index.html">Overview</a>
</li>
<li class="none">
<a href="../../../hadoop-auth/Examples.html">Examples</a>
</li>
<li class="none">
<a href="../../../hadoop-auth/Configuration.html">Configuration</a>
</li>
<li class="none">
<a href="../../../hadoop-auth/BuildingIt.html">Building</a>
</li>
</ul>
<h5>Tools</h5>
<ul>
<li class="none">
<a href="../../../hadoop-streaming/HadoopStreaming.html">Hadoop Streaming</a>
</li>
<li class="none">
<a href="../../../hadoop-archives/HadoopArchives.html">Hadoop Archives</a>
</li>
<li class="none">
<a href="../../../hadoop-archive-logs/HadoopArchiveLogs.html">Hadoop Archive Logs</a>
</li>
<li class="none">
<a href="../../../hadoop-distcp/DistCp.html">DistCp</a>
</li>
<li class="none">
<a href="../../../hadoop-federation-balance/HDFSFederationBalance.html">HDFS Federation Balance</a>
</li>
<li class="none">
<a href="../../../hadoop-gridmix/GridMix.html">GridMix</a>
</li>
<li class="none">
<a href="../../../hadoop-rumen/Rumen.html">Rumen</a>
</li>
<li class="none">
<a href="../../../hadoop-resourceestimator/ResourceEstimator.html">Resource Estimator Service</a>
</li>
<li class="none">
<a href="../../../hadoop-sls/SchedulerLoadSimulator.html">Scheduler Load Simulator</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/Benchmarking.html">Hadoop Benchmarking</a>
</li>
<li class="none">
<a href="../../../hadoop-dynamometer/Dynamometer.html">Dynamometer</a>
</li>
</ul>
<h5>Reference</h5>
<ul>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/release/">Changelog and Release Notes</a>
</li>
<li class="none">
<a href="../../../api/index.html">Java API docs</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/UnixShellAPI.html">Unix Shell API</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/Metrics.html">Metrics</a>
</li>
</ul>
<h5>Configuration</h5>
<ul>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/core-default.xml">core-default.xml</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs/hdfs-default.xml">hdfs-default.xml</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-hdfs-rbf/hdfs-rbf-default.xml">hdfs-rbf-default.xml</a>
</li>
<li class="none">
<a href="../../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml">mapred-default.xml</a>
</li>
<li class="none">
<a href="../../../hadoop-yarn/hadoop-yarn-common/yarn-default.xml">yarn-default.xml</a>
</li>
<li class="none">
<a href="../../../hadoop-kms/kms-default.html">kms-default.xml</a>
</li>
<li class="none">
<a href="../../../hadoop-hdfs-httpfs/httpfs-default.html">httpfs-default.xml</a>
</li>
<li class="none">
<a href="../../../hadoop-project-dist/hadoop-common/DeprecatedProperties.html">Deprecated Properties</a>
</li>
</ul>
<a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy">
<img alt="Built by Maven" src="../images/logos/maven-feather.png"/>
</a>
</div>
</div>
<div id="bodyColumn">
<div id="contentBox">
<!---
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<ul>
<li><a href="#Introduction">Introduction</a></li>
<li><a href="#How_data_is_written_to_a_filesystem">How data is written to a filesystem</a></li>
<li><a href="#Output_Stream_Model">Output Stream Model</a>
<ul>
<li><a href="#a"></a>
<ul>
<li><a href="#Visibility_of_Flushed_Data">Visibility of Flushed Data</a></li></ul></li>
<li><a href="#State_of_Stream_and_File_System_after_Filesystem.create.28.29">State of Stream and File System after FileSystem.create()</a></li>
<li><a href="#State_of_Stream_and_File_System_after_Filesystem.append.28.29">State of Stream and File System after FileSystem.append()</a>
<ul>
<li><a href="#Persisting_data">Persisting data</a></li></ul></li></ul></li>
<li><a href="#Class_FSDataOutputStream">Class FSDataOutputStream</a></li>
<li><a href="#Class_java.io.OutputStream">Class java.io.OutputStream</a>
<ul>
<li><a href="#write.28Stream.2C_data.29">write(Stream, data)</a>
<ul>
<li><a href="#Preconditions">Preconditions</a></li>
<li><a href="#Postconditions">Postconditions</a></li></ul></li>
<li><a href="#write.28Stream.2C_byte.5B.5D_data.2C_int_offset.2C_int_len.29">write(Stream, byte[] data, int offset, int len)</a>
<ul>
<li><a href="#Preconditions">Preconditions</a></li>
<li><a href="#Postconditions">Postconditions</a></li></ul></li>
<li><a href="#write.28byte.5B.5D_data.29">write(byte[] data)</a></li>
<li><a href="#flush.28.29"> flush()</a>
<ul>
<li><a href="#Preconditions">Preconditions</a></li>
<li><a href="#Postconditions">Postconditions</a></li></ul></li>
<li><a href="#close.28.29">close()</a></li>
<li><a href="#HDFS_and_OutputStream.close.28.29">HDFS and OutputStream.close()</a></li></ul></li>
<li><a href="#org.apache.hadoop.fs.Syncable">org.apache.hadoop.fs.Syncable</a>
<ul>
<li><a href="#Syncable.hflush.28.29">Syncable.hflush()</a>
<ul>
<li><a href="#Preconditions">Preconditions</a></li>
<li><a href="#Postconditions">Postconditions</a></li>
<li><a href="#hflush.28.29_Performance">hflush() Performance</a></li></ul></li>
<li><a href="#Syncable.hsync.28.29"> Syncable.hsync()</a>
<ul>
<li><a href="#Preconditions">Preconditions</a></li>
<li><a href="#Postconditions">Postconditions</a></li></ul></li></ul></li>
<li><a href="#Interface_StreamCapabilities">Interface StreamCapabilities</a></li>
<li><a href="#interface_CanSetDropBehind"> interface CanSetDropBehind</a></li>
<li><a href="#Durability.2C_Concurrency.2C_Consistency_and_Visibility_of_stream_output.">Durability, Concurrency, Consistency and Visibility of stream output.</a>
<ul>
<li><a href="#Durability"> Durability</a></li>
<li><a href="#Concurrency"> Concurrency</a></li>
<li><a href="#Consistency_and_Visibility">Consistency and Visibility</a></li></ul></li>
<li><a href="#Issues_with_the_Hadoop_Output_Stream_model."> Issues with the Hadoop Output Stream model.</a>
<ul>
<li><a href="#HDFS"> HDFS</a>
<ul>
<li><a href="#HDFS:_hsync.28.29_only_syncs_the_latest_block">HDFS: hsync() only syncs the latest block</a></li>
<li><a href="#HDFS:_delayed_visibility_of_metadata_updates.">HDFS: delayed visibility of metadata updates.</a></li></ul></li>
<li><a href="#Local_Filesystem.2C_file:">Local Filesystem, file:</a></li>
<li><a href="#Checksummed_output_streams"> Checksummed output streams</a></li>
<li><a href="#Object_Stores"> Object Stores</a>
<ul>
<li><a href="#Visibility_of_newly_created_objects">Visibility of newly created objects</a></li>
<li><a href="#Visibility_of_the_output_of_a_stream_after_close.28.29">Visibility of the output of a stream after close()</a></li></ul></li></ul></li>
<li><a href="#Implementors_notes."> Implementors notes.</a>
<ul>
<li><a href="#Always_implement_Syncable_-even_if_just_to_throw_UnsupportedOperationException">Always implement Syncable -even if just to throw UnsupportedOperationException</a></li>
<li><a href="#StreamCapabilities">StreamCapabilities</a></li>
<li><a href="#Metadata_updates">Metadata updates</a></li>
<li><a href="#Does_close.28.29_synchronize_and_persist_data.3F">Does close() synchronize and persist data?</a></li></ul></li></ul>
<h1>Output: <code>OutputStream</code>, <code>Syncable</code> and <code>StreamCapabilities</code></h1><section>
<h2><a name="Introduction"></a>Introduction</h2>
<p>This document covers the Output Streams within the context of the <a href="index.html">Hadoop File System Specification</a>.</p>
<p>It uses the filesystem model defined in <a href="model.html">A Model of a Hadoop Filesystem</a> with the notation defined in <a href="Notation.html">notation</a>.</p>
<p>The target audiences are:</p>
<ol style="list-style-type: decimal">
<li>Users of the APIs. While <code>java.io.OutputStream</code> is a standard interface, this document clarifies how it is implemented in HDFS and elsewhere. The Hadoop-specific interfaces <code>Syncable</code> and <code>StreamCapabilities</code> are new; <code>Syncable</code> is notable in offering durability and visibility guarantees which exceed those of <code>OutputStream</code>.</li>
<li>Implementors of File Systems and clients.</li>
</ol></section><section>
<h2><a name="How_data_is_written_to_a_filesystem"></a>How data is written to a filesystem</h2>
<p>The core mechanism to write data to files through the Hadoop FileSystem APIs is through <code>OutputStream</code> subclasses obtained through calls to <code>FileSystem.create()</code>, <code>FileSystem.append()</code>, or <code>FSDataOutputStreamBuilder.build()</code>.</p>
<p>These all return instances of <code>FSDataOutputStream</code>, through which data can be written through various <code>write()</code> methods. After a stream&#x2019;s <code>close()</code> method is called, all data written to the stream MUST be persisted to the filesystem and visible to all other clients attempting to read data from that path via <code>FileSystem.open()</code>.</p>
<p>As well as operations to write the data, Hadoop&#x2019;s <code>OutputStream</code> implementations provide methods to flush buffered data back to the filesystem, so as to ensure that the data is reliably persisted and/or visible to other callers. This is done via the <code>Syncable</code> interface. It was originally intended that the presence of this interface could be interpreted as a guarantee that the stream supported its methods. However, this has proven impossible to guarantee, as the static nature of the interface is incompatible with filesystems whose syncability semantics may vary on a store/path basis. As an example, erasure coded files in HDFS do not support the Sync operations, even though they are implemented as a subclass of an output stream which is <code>Syncable</code>.</p>
<p>To address this, a new interface has been added: <code>StreamCapabilities</code>. This allows callers to probe the exact capabilities of a stream, even transitively through a chain of streams.</p></section><section>
<h2><a name="Output_Stream_Model"></a>Output Stream Model</h2>
<p>For this specification, an output stream can be viewed as a list of bytes stored in the client; <code>hsync()</code> and <code>hflush()</code> are the operations which propagate the data so that it is visible to other readers of the file and/or made durable.</p>
<div class="source">
<div class="source">
<pre>buffer: List[byte]
</pre></div></div>
<p>A flag, <code>open</code>, tracks whether the stream is open: after the stream is closed no more data may be written to it:</p>
<div class="source">
<div class="source">
<pre>open: bool
buffer: List[byte]
</pre></div></div>
<p>The destination path of the stream, <code>path</code>, can be tracked to form a triple <code>path, open, buffer</code>:</p>
<div class="source">
<div class="source">
<pre>Stream = (path: Path, open: Boolean, buffer: byte[])
</pre></div></div>
<section><section>
<h4><a name="Visibility_of_Flushed_Data"></a>Visibility of Flushed Data</h4>
<p>(Immediately) after <code>Syncable</code> operations which flush data to the filesystem, the data at the stream&#x2019;s destination path MUST match that of <code>buffer</code>. That is, the following condition MUST hold:</p>
<div class="source">
<div class="source">
<pre>FS'.Files(path) == buffer
</pre></div></div>
<p>Any client reading the data at the path MUST see the new data. The <code>Syncable</code> operations differ in their durability guarantees, not visibility of data.</p></section></section><section>
<h3><a name="State_of_Stream_and_File_System_after_Filesystem.create.28.29"></a>State of Stream and File System after <code>FileSystem.create()</code></h3>
<p>The output stream returned by a <code>FileSystem.create(path)</code> or <code>FileSystem.createFile(path).build()</code> call within a filesystem <code>FS</code> can be modeled as a triple containing an empty buffer:</p>
<div class="source">
<div class="source">
<pre>Stream' = (path, true, [])
</pre></div></div>
<p>The filesystem <code>FS'</code> MUST contain a 0-byte file at the path:</p>
<div class="source">
<div class="source">
<pre>FS' = FS where data(FS', path) == []
</pre></div></div>
<p>Thus, the initial state of <code>Stream'.buffer</code> is implicitly consistent with the data at the filesystem.</p>
<p><i>Object Stores</i>: see caveats in the &#x201c;Object Stores&#x201d; section below.</p></section><section>
<h3><a name="State_of_Stream_and_File_System_after_Filesystem.append.28.29"></a>State of Stream and File System after <code>FileSystem.append()</code></h3>
<p>The output stream returned from a call of <code>FileSystem.append(path, buffersize, progress)</code> within a filesystem <code>FS</code> can be modeled as a stream whose <code>buffer</code> is initialized to that of the original file:</p>
<div class="source">
<div class="source">
<pre>Stream' = (path, true, data(FS, path))
</pre></div></div>
<section>
<h4><a name="Persisting_data"></a>Persisting data</h4>
<p>When the stream writes data back to its store, be it in any supported flush operation, in the <code>close()</code> operation, or at any other time the stream chooses to do so, the contents of the file are replaced with the current buffer:</p>
<div class="source">
<div class="source">
<pre>Stream' = (path, true, buffer)
FS' = FS where data(FS', path) == buffer
</pre></div></div>
<p>After a call to <code>close()</code>, the stream is closed for all operations other than <code>close()</code>; they MAY fail with <code>IOException</code> or <code>RuntimeException</code>.</p>
<div class="source">
<div class="source">
<pre>Stream' = (path, false, [])
</pre></div></div>
<p>The <code>close()</code> operation MUST be idempotent, with the sole attempt to write the data made in the first invocation.</p>
<ol style="list-style-type: decimal">
<li>If <code>close()</code> succeeds, subsequent calls are no-ops.</li>
<li>If <code>close()</code> fails, again, subsequent calls are no-ops. They MAY rethrow the previous exception, but they MUST NOT retry the write.</li>
</ol><!-- ============================================================= -->
<!-- CLASS: FSDataOutputStream -->
<!-- ============================================================= -->
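<p>One way to meet the idempotency requirement is to guard <code>close()</code> with a flag that is set exactly once. The following is a minimal sketch, assuming a wrapper around an arbitrary inner <code>OutputStream</code>; <code>IdempotentCloseStream</code> is a hypothetical class, not a Hadoop one, and it chooses to rethrow the earlier failure, which the specification permits but does not require.</p>
<div class="source">
<div class="source">
<pre>import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical wrapper illustrating an idempotent close(); not a Hadoop class. */
class IdempotentCloseStream extends OutputStream {
  private final OutputStream inner;
  private final AtomicBoolean closed = new AtomicBoolean(false);
  private volatile IOException closeFailure;

  IdempotentCloseStream(OutputStream inner) {
    this.inner = inner;
  }

  @Override
  public void write(int b) throws IOException {
    if (closed.get()) {
      throw new IOException("Stream is closed");
    }
    inner.write(b);
  }

  @Override
  public void close() throws IOException {
    if (!closed.compareAndSet(false, true)) {
      // Second and later calls: never retry the write.
      // Rethrowing the earlier failure is optional (MAY).
      if (closeFailure != null) {
        throw closeFailure;
      }
      return;
    }
    try {
      inner.close();  // the sole attempt to write the data
    } catch (IOException e) {
      closeFailure = e;
      throw e;
    }
  }
}
</pre></div></div>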
</section></section></section><section>
<h2><a name="Class_FSDataOutputStream"></a><a name="fsdataoutputstream"></a>Class <code>FSDataOutputStream</code></h2>
<div class="source">
<div class="source">
<pre>public class FSDataOutputStream
    extends DataOutputStream
    implements Syncable, CanSetDropBehind, StreamCapabilities {
  // ...
}
</pre></div></div>
<p>The <code>FileSystem.create()</code>, <code>FileSystem.append()</code> and <code>FSDataOutputStreamBuilder.build()</code> calls return an instance of a class <code>FSDataOutputStream</code>, a subclass of <code>java.io.OutputStream</code>.</p>
<p>The base class wraps an <code>OutputStream</code> instance, one which may implement <code>Syncable</code>, <code>CanSetDropBehind</code> and <code>StreamCapabilities</code>.</p>
<p>This document covers the requirements of such implementations.</p>
<p>HDFS&#x2019;s <code>FileSystem</code> implementation, <code>DistributedFileSystem</code>, returns an instance of <code>HdfsDataOutputStream</code>. This implementation has at least two behaviors which are not explicitly declared by the base Java implementation:</p>
<ol style="list-style-type: decimal">
<li>
<p>Writes are synchronized: more than one thread can write to the same output stream. This is a use pattern which HBase relies on.</p>
</li>
<li>
<p><code>OutputStream.flush()</code> is a no-op when the file is closed. Apache Druid has called <code>flush()</code> on a closed stream in the past: <a class="externalLink" href="https://issues.apache.org/jira/browse/HADOOP-14346">HADOOP-14346</a>.</p>
</li>
</ol>
<p>As the HDFS implementation is considered the de-facto specification of the FileSystem APIs, the fact that <code>write()</code> is thread-safe is significant.</p>
<p>For compatibility, not only SHOULD other FS clients be thread-safe, but new HDFS features, such as encryption and Erasure Coding SHOULD also implement consistent behavior with the core HDFS output stream.</p>
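<p>As an illustration of what thread-safe writes mean for an implementor, the sketch below serializes every <code>write()</code> call on the stream&#x2019;s own lock. It is an assumption-laden example: <code>SynchronizedWriteStream</code> is a hypothetical class, and it does not show how <code>HdfsDataOutputStream</code> is actually implemented.</p>
<div class="source">
<div class="source">
<pre>import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical stream whose writes are serialized; not the HDFS implementation. */
class SynchronizedWriteStream extends OutputStream {
  private final OutputStream inner;
  private long bytesWritten;

  SynchronizedWriteStream(OutputStream inner) {
    this.inner = inner;
  }

  @Override
  public synchronized void write(int b) throws IOException {
    inner.write(b);
    bytesWritten++;
  }

  @Override
  public synchronized void write(byte[] b, int off, int len) throws IOException {
    // The whole range is written while holding the lock, so arrays
    // written by different threads are not interleaved with each other.
    inner.write(b, off, len);
    bytesWritten += len;
  }

  synchronized long getBytesWritten() {
    return bytesWritten;
  }

  public static void main(String[] args) throws Exception {
    SynchronizedWriteStream out =
        new SynchronizedWriteStream(new ByteArrayOutputStream());
    byte[] record = new byte[16];
    Runnable writer = () -&gt; {
      for (int i = 0; i &lt; 1000; i++) {
        try {
          out.write(record, 0, record.length);
        } catch (IOException e) {
          throw new RuntimeException(e);
        }
      }
    };
    Thread t1 = new Thread(writer);
    Thread t2 = new Thread(writer);
    t1.start();
    t2.start();
    t1.join();
    t2.join();
    System.out.println(out.getBytesWritten());
  }
}
</pre></div></div>
<p>Because both the delegated write and the counter update happen under the same lock, two threads sharing one stream cannot lose an update or split each other&#x2019;s records.</p>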
<p>Put differently:</p>
<p><i>It isn&#x2019;t enough for Output Streams to implement the core semantics of <code>java.io.OutputStream</code>: they need to implement the extra semantics of <code>HdfsDataOutputStream</code>, especially for HBase to work correctly.</i></p>
<p>The concurrent <code>write()</code> call is the most significant tightening of the Java specification.</p></section><section>
<h2><a name="Class_java.io.OutputStream"></a><a name="outputstream"></a>Class <code>java.io.OutputStream</code></h2>
<p>A Java <code>OutputStream</code> allows applications to write a sequence of bytes to a destination. In a Hadoop filesystem, that destination is the data under a path in the filesystem.</p>
<div class="source">
<div class="source">
<pre>public abstract class OutputStream implements Closeable, Flushable {
public abstract void write(int b) throws IOException;
public void write(byte b[]) throws IOException;
public void write(byte b[], int off, int len) throws IOException;
public void flush() throws IOException;
public void close() throws IOException;
}
</pre></div></div>
<section>
<h3><a name="write.28Stream.2C_data.29"></a><a name="writedata:_int"></a><code>write(Stream, data)</code></h3>
<p>Writes a byte of data to the stream.</p><section>
<h4><a name="Preconditions"></a>Preconditions</h4>
<div class="source">
<div class="source">
<pre>Stream.open else raise ClosedChannelException, PathIOException, IOException
</pre></div></div>
<p>The exception <code>java.nio.channels.ClosedChannelException</code> is raised in the HDFS output streams when trying to write to a closed file. This exception does not include the destination path, and <code>Exception.getMessage()</code> is <code>null</code>. It is therefore of limited value in stack traces. Implementors may wish to raise exceptions with more detail, such as a <code>PathIOException</code>.</p></section><section>
<h4><a name="Postconditions"></a>Postconditions</h4>
<p>The buffer has the lower 8 bits of the data argument appended to it.</p>
<div class="source">
<div class="source">
<pre>Stream'.buffer = Stream.buffer + [data &amp; 0xff]
</pre></div></div>
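<p>As a small illustration of this postcondition, the sketch below uses the JDK&#x2019;s <code>java.io.ByteArrayOutputStream</code> as a stand-in for a filesystem stream; the class name <code>LowByteWriteDemo</code> and its helper method are hypothetical and exist only for this example.</p>

```java
import java.io.ByteArrayOutputStream;

// Hypothetical demo class: shows that only the low 8 bits of write(int) are kept.
class LowByteWriteDemo {

  /** Writes a single int to an in-memory stream and returns the stored bytes. */
  static byte[] writeOne(int data) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    out.write(data); // the upper 24 bits of the argument are discarded
    return out.toByteArray();
  }
}
```

<p>Calling <code>LowByteWriteDemo.writeOne(0x1FF)</code> stores a single byte whose unsigned value is <code>0xFF</code>, matching <code>data &amp; 0xff</code>.</p>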
<p>There may be an explicit limit on the size of cached data, or an implicit limit based on the available capacity of the destination filesystem. When a limit is reached, <code>write()</code> SHOULD fail with an <code>IOException</code>.</p></section></section><section>
<h3><a name="write.28Stream.2C_byte.5B.5D_data.2C_int_offset.2C_int_len.29"></a><a name="writebufferoffsetlen"></a><code>write(Stream, byte[] data, int offset, int len)</code></h3><section>
<h4><a name="Preconditions"></a>Preconditions</h4>
<p>The preconditions are all defined in <code>OutputStream.write()</code>.</p>
<div class="source">
<div class="source">
<pre>Stream.open else raise ClosedChannelException, PathIOException, IOException
data != null else raise NullPointerException
offset &gt;= 0 else raise IndexOutOfBoundsException
len &gt;= 0 else raise IndexOutOfBoundsException
offset &lt;= data.length else raise IndexOutOfBoundsException
offset + len &lt;= data.length else raise IndexOutOfBoundsException
</pre></div></div>
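<p>A minimal sketch of how an implementation might enforce these preconditions follows. It assumes the bounds of <code>java.io.OutputStream</code>, where a write ending exactly at <code>data.length</code> is valid. The class <code>WritePreconditions</code> and its methods are hypothetical helpers, not part of Hadoop.</p>

```java
import java.io.IOException;
import java.nio.channels.ClosedChannelException;

// Hypothetical helper, not part of Hadoop: validates write(data, offset, len) arguments.
class WritePreconditions {

  static void check(boolean open, byte[] data, int offset, int len) throws IOException {
    if (!open) {
      throw new ClosedChannelException();
    }
    if (data == null) {
      throw new NullPointerException("data");
    }
    // Written as a subtraction so that offset + len cannot overflow.
    if (offset < 0 || len < 0 || len > data.length - offset) {
      throw new IndexOutOfBoundsException(
          "offset=" + offset + " len=" + len + " length=" + data.length);
    }
  }

  /** Returns the simple class name of the exception raised, or "ok" if none. */
  static String outcome(boolean open, byte[] data, int offset, int len) {
    try {
      check(open, data, offset, len);
      return "ok";
    } catch (IOException | RuntimeException e) {
      return e.getClass().getSimpleName();
    }
  }
}
```

<p>An implementation wanting more useful diagnostics could raise an exception which names the destination path in place of the bare <code>ClosedChannelException</code> used here.</p>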
<p>After the operation has returned, the buffer may be re-used. The outcome of updates to the buffer while the <code>write()</code> operation is in progress is undefined.</p></section><section>
<h4><a name="Postconditions"></a>Postconditions</h4>
<div class="source">
<div class="source">
<pre>Stream'.buffer = Stream.buffer + data[offset...(offset + len)]
</pre></div></div>
</section></section><section>
<h3><a name="write.28byte.5B.5D_data.29"></a><a name="writebuffer"></a><code>write(byte[] data)</code></h3>
<p>This is defined as the equivalent of:</p>
<div class="source">
<div class="source">
<pre>write(data, 0, data.length)
</pre></div></div>
</section><section>
<h3><a name="flush.28.29"></a><a name="flush"></a> <code>flush()</code></h3>
<p>Requests that the data is flushed. The specification of <code>OutputStream.flush()</code> declares that this SHOULD write data to the &#x201c;intended destination&#x201d;.</p>
<p>It explicitly precludes any guarantees about durability.</p>
<p>For that reason, this document doesn&#x2019;t provide any normative specifications of behaviour.</p><section>
<h4><a name="Preconditions"></a>Preconditions</h4>
<p>None.</p></section><section>
<h4><a name="Postconditions"></a>Postconditions</h4>
<p>None.</p>
<p>If the implementation chooses to implement a stream-flushing operation, the data may be saved to the file system such that it becomes visible to others:</p>
<div class="source">
<div class="source">
<pre>FS' = FS where data(FS', path) == buffer
</pre></div></div>
<p>When a stream is closed, <code>flush()</code> SHOULD downgrade to being a no-op, if it was not one already. This accommodates applications and libraries which invoke <code>flush()</code> on a stream that has already been closed.</p>
<p><i>Issue</i>: Should <code>flush()</code> forward to <code>hflush()</code>?</p>
<p>No. Or at least, make it optional.</p>
<p>There&#x2019;s a lot of application code which assumes that <code>flush()</code> is low cost and should be invoked after writing every single line of output, after writing small 4KB blocks or similar.</p>
<p>Forwarding this to a full flush across a distributed filesystem, or worse, a distant object store, is very inefficient. Filesystem clients which convert a <code>flush()</code> to an <code>hflush()</code> will eventually have to roll back that feature: <a class="externalLink" href="https://issues.apache.org/jira/browse/HADOOP-16548">HADOOP-16548</a>.</p></section></section><section>
<h3><a name="close.28.29"></a><a name="close"></a><code>close()</code></h3>
<p>The <code>close()</code> operation saves all data to the filesystem and releases any resources used for writing data.</p>
<p>The <code>close()</code> call is expected to block until the write has completed (as with <code>Syncable.hflush()</code>), possibly until it has been written to durable storage.</p>
<p>After <code>close()</code> completes, the data in a file MUST be visible and consistent with the data most recently written. The metadata of the file MUST be consistent with the data and the write history itself (i.e. any modification time fields updated).</p>
<p>After <code>close()</code> is invoked, all subsequent <code>write()</code> calls on the stream MUST fail with an <code>IOException</code>.</p>
<p>Any locking/leaseholding mechanism MUST release its lock/lease.</p>
<div class="source">
<div class="source">
<pre>Stream'.open = false
FS' = FS where data(FS', path) == buffer
</pre></div></div>
<p>The <code>close()</code> call MAY fail during its operation.</p>
<ol style="list-style-type: decimal">
<li>Callers of the API MUST expect some calls to <code>close()</code> to fail and SHOULD code appropriately. Catching and swallowing exceptions, while common, is not always the ideal solution.</li>
<li>Even after a failure, <code>close()</code> MUST place the stream into a closed state. Follow-on calls to <code>close()</code> are ignored, and calls to other methods rejected. That is: callers cannot be expected to call <code>close()</code> repeatedly until it succeeds.</li>
<li>The duration of the <code>close()</code> operation is undefined. Operations which rely on acknowledgements from remote systems to meet the persistence guarantees implicitly have to await these acknowledgements. Some Object Store output streams upload the entire data file in the <code>close()</code> operation. This can take a large amount of time. The fact that many user applications assume that <code>close()</code> is both fast and does not fail means that this behavior is dangerous.</li>
</ol>
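<p>The rules above can be sketched with a small in-memory stream. This is an illustration under stated assumptions, not a Hadoop class: <code>CloseOnceStream</code>, its <code>failOnClose</code> flag (which simulates a failing final upload) and the two helper methods are all hypothetical.</p>

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stream, not a Hadoop class: illustrates the close() rules above.
class CloseOnceStream extends OutputStream {
  private final AtomicBoolean closed = new AtomicBoolean(false);
  private final boolean failOnClose; // simulates a failing final upload
  private int bytesWritten;

  CloseOnceStream(boolean failOnClose) {
    this.failOnClose = failOnClose;
  }

  @Override
  public synchronized void write(int b) throws IOException {
    if (closed.get()) {
      throw new IOException("stream is closed");
    }
    bytesWritten++;
  }

  @Override
  public void flush() {
    // Nothing is buffered in this sketch; flush() is a no-op, also after close().
  }

  @Override
  public void close() throws IOException {
    // The state flips to closed before any work, so a failure still leaves it closed.
    if (!closed.compareAndSet(false, true)) {
      return; // follow-on calls are ignored
    }
    if (failOnClose) {
      throw new IOException("simulated failure while closing");
    }
  }

  boolean isClosed() {
    return closed.get();
  }

  synchronized int bytesWritten() {
    return bytesWritten;
  }

  /** Returns true if close() raised an IOException. */
  static boolean closeFails(CloseOnceStream s) {
    try {
      s.close();
      return false;
    } catch (IOException e) {
      return true;
    }
  }

  /** Returns true if write() raised an IOException. */
  static boolean writeFails(CloseOnceStream s) {
    try {
      s.write(1);
      return false;
    } catch (IOException e) {
      return true;
    }
  }
}
```

<p>With <code>failOnClose</code> set, the first <code>close()</code> raises an exception, yet the stream is closed afterwards: a second <code>close()</code> returns silently and a later <code>write()</code> is rejected.</p>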
<p>Recommendations for safe use by callers</p>
<ul>
<li>Do plan for exceptions being raised, either in catching and logging or by throwing the exception further up. Catching and silently swallowing exceptions may hide serious problems.</li>
<li>Heartbeat operations SHOULD take place on a separate thread, so that a long delay in <code>close()</code> does not block the thread so long that the heartbeat times out.</li>
</ul>
<p>Implementors:</p>
<ul>
<li>Have a look at <a class="externalLink" href="https://issues.apache.org/jira/browse/HADOOP-16785">HADOOP-16785</a> to see examples of complications in close.</li>
<li>Incrementally writing blocks before a close operation results in behavior which better matches client expectations: write failures surface earlier, and <code>close()</code> becomes more a matter of housekeeping than the actual upload.</li>
<li>If block uploads are executed in separate threads, the output stream <code>close()</code> call MUST block until all the asynchronous uploads have completed; any error raised MUST be reported. If multiple errors were raised, the stream can choose which to propagate. What is important is: when <code>close()</code> returns without an error, applications expect the data to have been successfully written.</li>
</ul></section><section>
<h3><a name="HDFS_and_OutputStream.close.28.29"></a>HDFS and <code>OutputStream.close()</code></h3>
<p>HDFS does not immediately <code>sync()</code> the output of a written file to disk on <code>OutputStream.close()</code> unless <code>dfs.datanode.synconclose</code> is set to true. This has caused <a class="externalLink" href="https://issues.apache.org/jira/browse/ACCUMULO-1364">problems in some applications</a>.</p>
<p>Applications which absolutely require the guarantee that a file has been persisted MUST call <code>Syncable.hsync()</code> <i>before</i> the file is closed.</p></section></section><section>
<h2><a name="org.apache.hadoop.fs.Syncable"></a><a name="syncable"></a><code>org.apache.hadoop.fs.Syncable</code></h2>
<div class="source">
<div class="source">
<pre>@InterfaceAudience.Public
@InterfaceStability.Stable
public interface Syncable {
/** Flush out the data in client's user buffer. After the return of
* this call, new readers will see the data.
* @throws IOException if any error occurs
*/
void hflush() throws IOException;
/** Similar to posix fsync, flush out the data in client's user buffer
* all the way to the disk device (but the disk may have it in its cache).
* @throws IOException if error occurs
*/
void hsync() throws IOException;
}
</pre></div></div>
<p>The purpose of the <code>Syncable</code> interface is to provide guarantees that data is written to a filesystem for both visibility and durability.</p>
<p><i>SYNC-1</i>: An <code>OutputStream</code> which implements <code>Syncable</code> and does not raise <code>UnsupportedOperationException</code> on invocations is making an explicit declaration that it can meet those guarantees.</p>
<p><i>SYNC-2</i>: If a stream declares the interface as implemented, but does not provide durability, the interface&#x2019;s methods MUST raise <code>UnsupportedOperationException</code>.</p>
<p>The <code>Syncable</code> interface has been implemented by other classes than subclasses of <code>OutputStream</code>, such as <code>org.apache.hadoop.io.SequenceFile.Writer</code>.</p>
<p><i>SYNC-3</i>: The fact that a class implements <code>Syncable</code> does not guarantee that <code>extends OutputStream</code> holds.</p>
<p>That is, for any class <code>C</code>: <code>(C instanceof Syncable)</code> does not imply <code>(C instanceof OutputStream)</code>.</p>
<p>This specification only covers the required behavior of <code>OutputStream</code> subclasses which implement <code>Syncable</code>.</p>
<p><i>SYNC-4:</i> The return value of <code>FileSystem.create(Path)</code> is an instance of <code>FSDataOutputStream</code>.</p>
<p><i>SYNC-5:</i> <code>FSDataOutputStream implements Syncable</code></p>
<p>SYNC-5 and SYNC-1 imply that all output streams which can be created with <code>FileSystem.create(Path)</code> must support the semantics of <code>Syncable</code>. This is demonstrably not true: <code>FSDataOutputStream</code> simply downgrades to a <code>flush()</code> if its wrapped stream is not <code>Syncable</code>. Therefore the declarations SYNC-1 and SYNC-2 do not hold: you cannot trust <code>Syncable</code>.</p>
<p>Put differently: <i>callers MUST NOT rely on the presence of the interface as evidence that the semantics of <code>Syncable</code> are supported</i>. Instead they MUST be dynamically probed for using the <code>StreamCapabilities</code> interface, where available.</p><section>
<h3><a name="Syncable.hflush.28.29"></a><a name="syncable.hflush"></a><code>Syncable.hflush()</code></h3>
<p>Flush out the data in client&#x2019;s user buffer. After the return of this call, new readers will see the data. The <code>hflush()</code> operation does not contain any guarantees as to the durability of the data, only its visibility.</p>
<p>Thus implementations may cache the written data in memory: visible to all, but not yet persisted.</p><section>
<h4><a name="Preconditions"></a>Preconditions</h4>
<div class="source">
<div class="source">
<pre>hasCapability(Stream, &quot;hflush&quot;)
Stream.open else raise IOException
</pre></div></div>
</section><section>
<h4><a name="Postconditions"></a>Postconditions</h4>
<div class="source">
<div class="source">
<pre>FS' = FS where data(path) == cache
</pre></div></div>
<p>After the call returns, the data MUST be visible to all new callers of <code>FileSystem.open(path)</code> and <code>FileSystem.openFile(path).build()</code>.</p>
<p>There is no requirement or guarantee that clients with an existing <code>DataInputStream</code> created by a call to <code>open(FS, path)</code> will see the updated data, nor is there a guarantee that they <i>will not</i> in a current or subsequent read.</p>
<p>Implementation note: as a correct <code>hsync()</code> implementation MUST also offer all the semantics of an <code>hflush()</code> call, implementations of <code>hflush()</code> may just invoke <code>hsync()</code>:</p>
<div class="source">
<div class="source">
<pre>public void hflush() throws IOException {
hsync();
}
</pre></div></div>
</section><section>
<h4><a name="hflush.28.29_Performance"></a><code>hflush()</code> Performance</h4>
<p>The <code>hflush()</code> call MUST block until the store has acknowledged that the data has been received and is now visible to others. This can be slow, as it will include the time to upload any outstanding data from the client, and for the filesystem itself to process it.</p>
<p>Often Filesystems only offer the <code>Syncable.hsync()</code> guarantees: persistence as well as visibility. This means the time to return can be even greater.</p>
<p>Application code MUST NOT call <code>hflush()</code> or <code>hsync()</code> at the end of every line or, unless they are writing a WAL, at the end of every record. Use with care.</p></section></section><section>
<h3><a name="Syncable.hsync.28.29"></a><a name="syncable.hsync"></a> <code>Syncable.hsync()</code></h3>
<p>Similar to POSIX <code>fsync()</code>, this call saves the data in client&#x2019;s user buffer all the way to the disk device (but the disk may have it in its cache).</p>
<p>That is: it is a requirement for the underlying FS to save all the data to the disk hardware itself, where it is expected to be durable.</p><section>
<h4><a name="Preconditions"></a>Preconditions</h4>
<div class="source">
<div class="source">
<pre>hasCapability(Stream, &quot;hsync&quot;)
Stream.open else raise IOException
</pre></div></div>
</section><section>
<h4><a name="Postconditions"></a>Postconditions</h4>
<div class="source">
<div class="source">
<pre>FS' = FS where data(path) == buffer
</pre></div></div>
<p><i>Implementations are required to block until that write has been acknowledged by the store.</i></p>
<p>This is so the caller can be confident that once the call has returned successfully, the data has been written.</p></section></section></section><section>
<h2><a name="Interface_StreamCapabilities"></a><a name="streamcapabilities"></a>Interface <code>StreamCapabilities</code></h2>
<div class="source">
<div class="source">
<pre>@InterfaceAudience.Public
@InterfaceStability.Evolving
</pre></div></div>
<p>The <code>org.apache.hadoop.fs.StreamCapabilities</code> interface exists to allow callers to dynamically determine the behavior of a stream.</p>
<div class="source">
<div class="source">
<pre> public boolean hasCapability(String capability) {
switch (capability.toLowerCase(Locale.ENGLISH)) {
case StreamCapabilities.HSYNC:
case StreamCapabilities.HFLUSH:
return supportFlush;
default:
return false;
}
}
</pre></div></div>
<p>Once a stream has been closed, a <code>hasCapability()</code> call MUST do one of</p>
<ul>
<li>return the capabilities of the open stream.</li>
<li>return false.</li>
</ul>
<p>That is: it MUST NOT raise an exception about the file being closed;</p>
<p>See <a href="pathcapabilities.html">pathcapabilities</a> for specifics on the <code>PathCapabilities</code> API; the requirements are similar: a stream MUST NOT return true for a capability for which it lacks support, be it because</p>
<ul>
<li>The capability is unknown.</li>
<li>The capability is known and known to be unsupported.</li>
</ul>
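<p>The rules above (no exception once the stream is closed, and <code>false</code> for unknown or unsupported capabilities) can be sketched as follows. So that the example compiles without Hadoop on the classpath, <code>CapabilityProbe</code> is a hypothetical stand-in for <code>StreamCapabilities</code>, and <code>ProbedStream</code> with its <code>syncIfSupported()</code> helper is likewise illustrative only. The sketch chooses to return <code>false</code> after close, which is one of the two permitted outcomes.</p>

```java
import java.util.Locale;

// Stand-in for org.apache.hadoop.fs.StreamCapabilities so the sketch compiles without Hadoop.
interface CapabilityProbe {
  boolean hasCapability(String capability);
}

// Hypothetical stream: supports hsync/hflush only if constructed with supportFlush=true.
class ProbedStream implements CapabilityProbe {
  private final boolean supportFlush;
  private boolean closed;
  private int syncCount;

  ProbedStream(boolean supportFlush) {
    this.supportFlush = supportFlush;
  }

  @Override
  public boolean hasCapability(String capability) {
    if (capability == null || closed) {
      return false; // never raise an exception from a probe
    }
    switch (capability.toLowerCase(Locale.ENGLISH)) {
      case "hsync":
      case "hflush":
        return supportFlush;
      default:
        return false; // unknown capabilities are reported as unsupported
    }
  }

  void hsync() {
    if (!supportFlush) {
      throw new UnsupportedOperationException("hsync");
    }
    syncCount++;
  }

  void close() {
    closed = true;
  }

  int syncCount() {
    return syncCount;
  }

  /** Caller-side pattern: probe first, sync only when declared. Returns true if synced. */
  static boolean syncIfSupported(ProbedStream stream) {
    if (stream.hasCapability("hsync")) {
      stream.hsync();
      return true;
    }
    return false;
  }
}
```

<p>A caller using this pattern never relies on the mere presence of an interface: a stream created with <code>supportFlush=false</code> is simply not synced, and the caller can decide whether that is acceptable.</p>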
<p>Standard stream capabilities are defined in <code>StreamCapabilities</code>; consult the javadocs for the complete set of options.</p>
<table border="0" class="bodyTable">
<thead>
<tr class="a">
<th> Name </th>
<th> Probes for support of </th></tr>
</thead><tbody>
<tr class="b">
<td> <code>dropbehind</code> </td>
<td> <code>CanSetDropBehind.setDropBehind()</code> </td></tr>
<tr class="a">
<td> <code>hsync</code> </td>
<td> <code>Syncable.hsync()</code> </td></tr>
<tr class="b">
<td> <code>hflush</code> </td>
<td> <code>Syncable.hflush()</code>. Deprecated: probe for <code>HSYNC</code> only. </td></tr>
<tr class="a">
<td> <code>in:readahead</code> </td>
<td> <code>CanSetReadahead.setReadahead()</code> </td></tr>
<tr class="b">
<td> <code>in:unbuffer</code> </td>
<td> <code>CanUnbuffer.unbuffer()</code> </td></tr>
<tr class="a">
<td> <code>in:readbytebuffer</code> </td>
<td> <code>ByteBufferReadable#read(ByteBuffer)</code> </td></tr>
<tr class="b">
<td> <code>in:preadbytebuffer</code> </td>
<td> <code>ByteBufferPositionedReadable#read(long, ByteBuffer)</code> </td></tr>
</tbody>
</table>
<p>Stream implementations MAY add their own custom options. These MUST be prefixed with <code>fs.SCHEMA.</code>, where <code>SCHEMA</code> is the schema of the filesystem.</p></section><section>
<h2><a name="interface_CanSetDropBehind"></a><a name="cansetdropbehind"></a> interface <code>CanSetDropBehind</code></h2>
<div class="source">
<div class="source">
<pre>@InterfaceAudience.Public
@InterfaceStability.Evolving
public interface CanSetDropBehind {
/**
* Configure whether the stream should drop the cache.
*
* @param dropCache Whether to drop the cache. null means to use the
* default value.
* @throws IOException If there was an error changing the dropBehind
* setting.
* UnsupportedOperationException If this stream doesn't support
* setting the drop-behind.
*/
void setDropBehind(Boolean dropCache)
throws IOException, UnsupportedOperationException;
}
</pre></div></div>
<p>This interface allows callers to change policies used inside HDFS.</p>
<p>Implementations MUST return <code>true</code> for the call</p>
<div class="source">
<div class="source">
<pre>StreamCapabilities.hasCapability(&quot;dropbehind&quot;);
</pre></div></div>
</section><section>
<h2><a name="Durability.2C_Concurrency.2C_Consistency_and_Visibility_of_stream_output."></a><a name="durability-of-output"></a>Durability, Concurrency, Consistency and Visibility of stream output.</h2>
<p>These are the aspects of the system behaviour which are not directly covered in this (very simplistic) filesystem model, but which are visible in production.</p><section>
<h3><a name="Durability"></a><a name="durability"></a> Durability</h3>
<ol style="list-style-type: decimal">
<li><code>OutputStream.write()</code> MAY persist the data, synchronously or asynchronously</li>
<li><code>OutputStream.flush()</code> flushes data to the destination. There are no strict persistence requirements.</li>
<li><code>Syncable.hflush()</code> synchronously sends all outstanding data to the destination filesystem. After returning to the caller, the data MUST be visible to other readers; it MAY be durable. That is: it does not have to be persisted, merely guaranteed to be consistently visible to all clients attempting to open a new stream reading data at the path.</li>
<li><code>Syncable.hsync()</code> MUST transmit the data as per <code>hflush</code> and persist that data to the underlying durable storage.</li>
<li><code>close()</code>: the first call to <code>close()</code> MUST flush out all remaining data in the buffers, and persist it, as with a call to <code>hsync()</code>.</li>
</ol>
<p>Many applications call <code>flush()</code> far too often, such as at the end of every line written. If this triggered an update of the data in persistent storage and any accompanying metadata, distributed stores would overload fast. Thus: <code>flush()</code> is often treated at most as a cue to flush data to the network buffers, but not as a commitment to write any data.</p>
<p>It is only the <code>Syncable</code> interface which offers guarantees.</p>
<p>The two <code>Syncable</code> operations <code>hsync()</code> and <code>hflush()</code> differ purely by the extra guarantee of <code>hsync()</code>: the data must be persisted. If <code>hsync()</code> is implemented, then <code>hflush()</code> can be implemented simply by invoking <code>hsync()</code>:</p>
<div class="source">
<div class="source">
<pre>public void hflush() throws IOException {
hsync();
}
</pre></div></div>
<p>This is perfectly acceptable as an implementation: the semantics of <code>hflush()</code> are satisfied. What is not acceptable is downgrading <code>hsync()</code> to <code>hflush()</code>, as the durability guarantee is no longer met.</p></section><section>
<h3><a name="Concurrency"></a><a name="concurrency"></a> Concurrency</h3>
<ol style="list-style-type: decimal">
<li>
<p>The outcome of more than one process writing to the same file is undefined.</p>
</li>
<li>
<p>An input stream opened to read a file <i>before the file was opened for writing</i> MAY fetch data updated by writes to an OutputStream. Because of buffering and caching, this is not a requirement &#x2014;and if an input stream does pick up updated data, the point at which the updated data is read is undefined. This surfaces in object stores where a <code>seek()</code> call which closes and re-opens the connection may pick up updated data, while forward stream reads do not. Similarly, in block-oriented filesystems, the data may be cached a block at a time &#x2014;and changes only picked up when a different block is read.</p>
</li>
<li>
<p>A filesystem MAY allow the destination path to be manipulated while a stream is writing to it &#x2014;for example, <code>rename()</code> of the path or a parent; <code>delete()</code> of a path or parent. In such a case, the outcome of future write operations on the output stream is undefined. Some filesystems MAY implement locking to prevent conflict. However, this tends to be rare on distributed filesystems, for reasons well known in the literature.</p>
</li>
<li>
<p>The Java API specification of <code>java.io.OutputStream</code> does not require an instance of the class to be thread safe. However, <code>org.apache.hadoop.hdfs.DFSOutputStream</code> has a stronger thread safety model (possibly unintentionally). This fact is relied upon in Apache HBase, as discovered in HADOOP-11708. Implementations SHOULD be thread safe. <i>Note</i>: even the <code>DFSOutputStream</code> synchronization model permits the output stream to have <code>close()</code> invoked while awaiting an acknowledgement from datanode or namenode writes in an <code>hsync()</code> operation.</p>
</li>
</ol></section><section>
<h3><a name="Consistency_and_Visibility"></a><a name="consistency"></a>Consistency and Visibility</h3>
<p>There is no requirement for the data to be immediately visible to other applications, not until a specific call to flush buffers or persist it to the underlying storage medium is made.</p>
<p>If an output stream is created with <code>FileSystem.create(path, overwrite==true)</code> and there is an existing file at the path, that is, <code>exists(FS, path)</code> holds, then the existing data is immediately unavailable; the data at the end of the path MUST consist of an empty byte sequence <code>[]</code>, with consistent metadata.</p>
<div class="source">
<div class="source">
<pre>exists(FS, path)
(Stream', FS') = create(FS, path)
exists(FS', path)
getFileStatus(FS', path).getLen() = 0
</pre></div></div>
<p>The metadata of a file (<code>length(FS, path)</code> in particular) SHOULD be consistent with the contents of the file after <code>flush()</code> and <code>sync()</code>.</p>
<div class="source">
<div class="source">
<pre>(Stream', FS') = create(FS, path)
(Stream'', FS'') = write(Stream', data)
(Stream''', FS''') = hsync(Stream'')
exists(FS''', path)
getFileStatus(FS''', path).getLen() = len(data)
</pre></div></div>
<p><i>HDFS does not do this except when the write crosses a block boundary</i>; to do otherwise would overload the Namenode. Other stores MAY copy this behavior.</p>
<p>As a result, while a file is being written <code>length(Filesystem, Path)</code> MAY be less than the length of <code>data(Filesystem, Path)</code>.</p>
<p>The metadata MUST be consistent with the contents of a file after the <code>close()</code> operation.</p>
<p>After the contents of an output stream have been persisted (<code>hflush()/hsync()</code>) all new <code>open(FS, Path)</code> operations MUST return the updated data.</p>
<p>After <code>close()</code> has been invoked on an output stream, a call to <code>getFileStatus(path)</code> MUST return the final metadata of the written file, including length and modification time. The metadata of the file returned in any of the FileSystem <code>list</code> operations MUST be consistent with this metadata.</p>
<p>The value of <code>getFileStatus(path).getModificationTime()</code> is not defined while a stream is being written to. The timestamp MAY be updated while a file is being written, especially after a <code>Syncable.hsync()</code> call. The timestamps MUST be updated after the file is closed to that of a clock value observed by the server during the <code>close()</code> call. It is <i>likely</i> to be in the time and time zone of the filesystem, rather than that of the client.</p>
<p>Formally, if a <code>close()</code> operation triggers an interaction with a server which starts at server-side time <code>t1</code> and completes at time <code>t2</code> with a successfully written file, then the last modification time SHOULD be a time <code>t</code> where <code>t1 &lt;= t &lt;= t2</code>.</p></section></section><section>
<h2><a name="Issues_with_the_Hadoop_Output_Stream_model."></a><a name="issues"></a> Issues with the Hadoop Output Stream model.</h2>
<p>There are some known issues with the output stream model as offered by Hadoop, specifically about the guarantees about when data is written and persisted &#x2014;and when the metadata is synchronized. These are where implementation aspects of HDFS and the &#x201c;Local&#x201d; filesystem do not follow the simple model of the filesystem used in this specification.</p><section>
<h3><a name="HDFS"></a><a name="hdfs-issues"></a> HDFS</h3><section>
<h4><a name="HDFS:_hsync.28.29_only_syncs_the_latest_block"></a>HDFS: <code>hsync()</code> only syncs the latest block</h4>
<p>The reference implementation, <code>DFSOutputStream</code>, will block until an acknowledgement is received from the datanodes: that is, all hosts in the replica write chain have successfully written the file.</p>
<p>Callers may therefore expect that, once the method call returns, the visibility and durability guarantees hold; other implementations must maintain these guarantees.</p>
<p>Note, however, that the reference <code>DFSOutputStream.hsync()</code> call only actually persists <i>the current block</i>. If there has been a series of writes since the last sync, such that a block boundary has been crossed, the <code>hsync()</code> call claims only to write the most recent block.</p>
<p>From the javadocs of <code>DFSOutputStream.hsync(EnumSet&lt;SyncFlag&gt; syncFlags)</code>:</p>
<blockquote>
<p>Note that only the current block is flushed to the disk device. To guarantee durable sync across block boundaries the stream should be created with {@link CreateFlag#SYNC_BLOCK}.</p>
</blockquote>
<p>This is an important HDFS implementation detail which must not be ignored by anyone relying on HDFS to provide a Write-Ahead-Log or other database structure where the requirement of the application is that &#x201c;all preceding bytes MUST have been persisted before the commit flag in the WAL is flushed&#x201d;.</p>
<p>See [Stonebraker81], Michael Stonebraker, <i>Operating System Support for Database Management</i>, 1981, for a discussion on this topic.</p>
<p>If you do need <code>hsync()</code> to have synced every block in a very large write, call it regularly.</p></section><section>
<h4><a name="HDFS:_delayed_visibility_of_metadata_updates."></a>HDFS: delayed visibility of metadata updates.</h4>
<p>That HDFS file metadata often lags the content of a file being written to is not something everyone expects, nor convenient for any program trying to pick up updated data in a file being written. Most visible is the length of a file returned in the various <code>list</code> commands and <code>getFileStatus</code> &#x2014;this is often out of date.</p>
<p>As HDFS only supports file growth in its output operations, this means that the size of the file as listed in the metadata may be less than or equal to the number of available bytes &#x2014;but never larger. This is a guarantee which is also held</p>
<p>One algorithm to determine whether a file in HDFS is updated is:</p>
<ol style="list-style-type: decimal">
<li>Remember the last read position <code>pos</code> in the file, using <code>0</code> if this is the initial read.</li>
<li>Use <code>getFileStatus(FS, Path)</code> to query the updated length of the file as recorded in the metadata.</li>
<li>If <code>Status.length &gt; pos</code>, the file has grown.</li>
<li>If the number has not changed, then:
<ol style="list-style-type: decimal">
<li>Reopen the file.</li>
<li><code>seek(pos)</code> to that location</li>
<li>If <code>read() != -1</code>, there is new data.</li>
</ol>
</li>
</ol>
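<p>The steps above can be sketched in plain Java. To keep the example self-contained it runs against a local file: <code>Files.size()</code> stands in for <code>getFileStatus(FS, Path).getLen()</code>, and <code>RandomAccessFile</code> stands in for reopening the file and calling <code>seek(pos)</code>. The class <code>FileGrowthCheck</code> and its methods are hypothetical, and on a local filesystem the length check alone normally succeeds, so the fallback read matters mainly for stores whose metadata lags.</p>

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper using local java.io/java.nio calls as stand-ins for
// getFileStatus(), open() and seek() on a Hadoop FileSystem.
class FileGrowthCheck {

  /** Returns true if there is data beyond the last read position pos. */
  static boolean hasNewData(Path file, long pos) throws IOException {
    // Steps 2 and 3: compare the length recorded in the metadata with pos.
    if (Files.size(file) > pos) {
      return true;
    }
    // Step 4: the metadata may lag, so reopen, seek and try to read one byte.
    try (RandomAccessFile in = new RandomAccessFile(file.toFile(), "r")) {
      in.seek(pos);
      return in.read() != -1;
    }
  }

  /** Self-check on a temporary file; returns {resultBeforeAppend, resultAfterAppend}. */
  static boolean[] demo() {
    try {
      Path tmp = Files.createTempFile("growth", ".bin");
      try {
        Files.write(tmp, new byte[] {1, 2, 3});
        boolean beforeAppend = hasNewData(tmp, 3); // nothing past offset 3 yet
        Files.write(tmp, new byte[] {4}, StandardOpenOption.APPEND);
        boolean afterAppend = hasNewData(tmp, 3);  // one new byte at offset 3
        return new boolean[] {beforeAppend, afterAppend};
      } finally {
        Files.deleteIfExists(tmp);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

<p>A reader would call <code>hasNewData()</code> with the position of its last read, and resume reading from that position whenever the method returns <code>true</code>.</p>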
<p>This algorithm works for filesystems which are consistent with metadata and data, as well as HDFS. What is important to know is that, for an open file <code>getFileStatus(FS, path).getLen() == 0</code> does not imply that <code>data(FS, path)</code> is empty.</p>
<p>When an output stream in HDFS is closed, the newly written data is not immediately written to disk unless HDFS is deployed with <code>dfs.datanode.synconclose</code> set to true. Otherwise it is cached and written to disk later.</p></section></section><section>
<h3><a name="Local_Filesystem.2C_file:"></a><a name="local-issues"></a>Local Filesystem, <code>file:</code></h3>
<p><code>LocalFileSystem</code>, <code>file:</code>, (or any other <code>FileSystem</code> implementation based on <code>ChecksumFileSystem</code>) has a different issue. If an output stream is obtained from <code>create()</code> and <code>FileSystem.setWriteChecksum(false)</code> has <i>not</i> been called on the filesystem, then the stream only flushes as much local data as can be written to full checksummed blocks of data.</p>
<p>That is, the hsync/hflush operations are not guaranteed to write all the pending data until the file is finally closed.</p>
<p>For this reason, the local filesystem accessed via <code>file://</code> URLs does not support <code>Syncable</code> unless <code>setWriteChecksum(false)</code> was called on that FileSystem instance so as to disable checksum creation. After that, checksums are no longer generated for any file.</p></section><section>
<h3><a name="Checksummed_output_streams"></a><a name="checksummed-fs-issues"></a> Checksummed output streams</h3>
<p>Because <code>org.apache.hadoop.fs.FSOutputSummer</code> and <code>org.apache.hadoop.fs.ChecksumFileSystem.ChecksumFSOutputSummer</code> implement the underlying checksummed output stream used by HDFS and other filesystems, they provide some of the core semantics of the output stream behavior.</p>
<ol style="list-style-type: decimal">
<li>The <code>close()</code> call is unsynchronized, re-entrant and may attempt to close the stream more than once.</li>
<li>It is possible to call <code>write(int)</code> on a closed stream (but not <code>write(byte[], int, int)</code>).</li>
<li>It is possible to call <code>flush()</code> on a closed stream.</li>
</ol>
<p>Behaviors 1 and 2 really have to be considered bugs to fix, albeit with care.</p>
<p>Behavior 3 has to be considered a de facto standard, for other implementations to copy.</p></section><section>
<h3><a name="Object_Stores"></a><a name="object-store-issues"></a> Object Stores</h3>
<p>Object store streams MAY buffer the entire stream&#x2019;s output until the final <code>close()</code> operation triggers a single <code>PUT</code> of the data and materialization of the final output.</p>
<p>This significantly changes their behaviour compared to that of POSIX filesystems and that specified in this document.</p><section>
<h4><a name="Visibility_of_newly_created_objects"></a>Visibility of newly created objects</h4>
<p>There is no guarantee that any file will be visible at the path of an output stream after the output stream is created.</p>
<p>That is: while <code>create(FS, path, boolean)</code> returns a new stream</p>
<div class="source">
<div class="source">
<pre>Stream' = (path, true, [])
</pre></div></div>
<p>The other postcondition of the operation, <code>data(FS', path) == []</code> MAY NOT hold, in which case:</p>
<ol style="list-style-type: decimal">
<li><code>exists(FS, path)</code> MAY return false.</li>
<li>If a file was created with <code>overwrite = True</code>, the existing data MAY still be visible: <code>data(FS', path) = data(FS, path)</code>.</li>
<li>
<p>The check for existing data in a <code>create()</code> call with <code>overwrite=False</code> may take place in the <code>create()</code> call itself, in the <code>close()</code> call prior to/during the write, or at some point in between. In the special case that the object store supports an atomic <code>PUT</code> operation, the check for existence of existing data and the subsequent creation of data at the path together contain a race condition: other clients may create data at the path between the existence check and the subsequent write.</p>
</li>
<li>
<p>Calls to <code>create(FS, Path, overwrite=false)</code> MAY succeed, returning a new <code>OutputStream</code>, even while another stream is open and writing to the destination path.</p>
</li>
</ol>
<p>This allows for the following sequence of operations, which would raise an exception in the second <code>create()</code> call if invoked against HDFS:</p>
<div class="source">
<div class="source">
<pre>Stream1 = create(FS, path, false)
sleep(200)
Stream2 = create(FS, path, false)
Stream1.write('a')
Stream1.close()
Stream2.close()
</pre></div></div>
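<p>The sequence above can be modelled without any object store client. The sketch below is an illustration under stated assumptions, not the behaviour of a particular store: <code>ObjectPath</code> and <code>BufferedUpload</code> are hypothetical classes, the existence check is assumed to happen in <code>create()</code>, and the last <code>close()</code> is assumed to win.</p>

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;

/** Hypothetical stand-in for one object path in a store; null means "no object". */
class ObjectPath {
  private byte[] data;

  synchronized boolean exists() {
    return data != null;
  }

  synchronized byte[] read() {
    return data == null ? null : data.clone();
  }

  /** Models an atomic PUT: the object appears complete, or not at all. */
  synchronized void put(byte[] bytes) {
    data = bytes.clone();
  }

  /** Models create(path, overwrite=false) with the existence check done up front. */
  BufferedUpload create() {
    if (exists()) {
      throw new IllegalStateException("object already exists");
    }
    return new BufferedUpload(this);
  }
}

/** Buffers every write in memory and issues a single PUT in close(). */
class BufferedUpload extends OutputStream {
  private final ObjectPath target;
  private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
  private boolean closed;

  BufferedUpload(ObjectPath target) {
    this.target = target;
  }

  @Override
  public void write(int b) {
    buffer.write(b);   // nothing reaches the store yet
  }

  @Override
  public void close() {
    if (closed) {
      return;          // later close() calls are no-ops
    }
    closed = true;
    target.put(buffer.toByteArray());
  }
}
```

<p>In this model both <code>create()</code> calls succeed because nothing is visible at the path until a stream is closed. After the first stream is closed the object holds the byte <code>'a'</code>; closing the second, empty stream then replaces it with a zero-byte object.</p>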
<p>Clients do not create a 0-byte file in the <code>create()</code> call because it would cause problems after <code>close()</code>: the marker file could get returned in <code>open()</code> calls instead of the final data.</p></section><section>
<h4><a name="Visibility_of_the_output_of_a_stream_after_close.28.29"></a>Visibility of the output of a stream after <code>close()</code></h4>
<p>One guarantee which object stores SHOULD make is the same as that of POSIX filesystems: after a stream&#x2019;s <code>close()</code> call returns, the data MUST be persisted durably and visible to all callers. Unfortunately, even that guarantee is not always met:</p>
<ol style="list-style-type: decimal">
<li>
<p>Existing data on a path MAY be visible for an indeterminate period of time.</p>
</li>
<li>
<p>If the store has any form of create inconsistency or buffering of negative existence probes, then even after the stream&#x2019;s <code>close()</code> operation has returned, <code>getFileStatus(FS, path)</code> and <code>open(FS, path)</code> may fail with a <code>FileNotFoundException</code>.</p>
</li>
</ol>
<p>In their favour, the atomicity of the store&#x2019;s PUT operations does offer its own guarantee: a newly created object is either absent or all of its data is present. The act of instantiating the object, while potentially exhibiting create inconsistency, is atomic. Applications may be able to use that fact to their advantage.</p>
<p>The <a href="abortable.html">Abortable</a> interface exposes this ability to abort an output stream before its data is made visible, so it can be used for checkpointing and similar operations.</p></section></section></section><section>
<h2><a name="Implementors_notes."></a><a name="implementors"></a> Implementors notes.</h2><section>
<h3><a name="Always_implement_Syncable_-even_if_just_to_throw_UnsupportedOperationException"></a>Always implement <code>Syncable</code> -even if just to throw <code>UnsupportedOperationException</code></h3>
<p>Because <code>FSDataOutputStream</code> silently downgrades <code>Syncable.hflush()</code> and <code>Syncable.hsync()</code> to <code>wrappedStream.flush()</code>, callers of the API MAY be misled into believing that their data has been flushed/synced after syncing to a stream which does not support the APIs.</p>
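<p>A minimal sketch of a stream which rejects the calls rather than letting them be downgraded is shown below. To keep it self-contained it does not use the Hadoop classes: <code>SyncableModel</code> is a local stand-in declared for this example, and <code>NoSyncStream</code> is a hypothetical stream which buffers its data in memory.</p>

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Local stand-in for the two Syncable methods; not the Hadoop interface. */
interface SyncableModel {
  void hflush() throws IOException;

  void hsync() throws IOException;
}

/**
 * Hypothetical stream which only buffers its data and so cannot honour
 * hflush()/hsync(). It rejects the calls instead of mapping them to flush().
 */
class NoSyncStream extends OutputStream implements SyncableModel {
  private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

  @Override
  public void write(int b) {
    buffer.write(b);
  }

  @Override
  public void hflush() {
    // Failing loudly tells the caller that no visibility guarantee exists.
    throw new UnsupportedOperationException("hflush is not supported");
  }

  @Override
  public void hsync() {
    // Failing loudly tells the caller that no durability guarantee exists.
    throw new UnsupportedOperationException("hsync is not supported");
  }
}
```

<p>A caller which depends on durability then fails at the <code>hsync()</code> call instead of continuing with data that was never synced.</p>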
<p>Implementations which cannot support the API SHOULD still implement it, but throw <code>UnsupportedOperationException</code>.</p></section><section>
<h3><a name="StreamCapabilities"></a><code>StreamCapabilities</code></h3>
<p>Implementors of filesystem clients SHOULD implement the <code>StreamCapabilities</code> interface and its <code>hasCapability()</code> method to declare whether or not an output stream offers the visibility and durability guarantees of <code>Syncable</code>.</p>
<p>Implementations of <code>StreamCapabilities.hasCapability()</code> MUST NOT declare that they support the <code>hflush</code> and <code>hsync</code> capabilities on streams where this is not true.</p>
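<p>As an illustration, the two cases might look like the sketch below. <code>CapabilitiesModel</code> is a local stand-in for the capability probe, declared here so that the example compiles on its own, and both stream classes are hypothetical.</p>

```java
/** Local stand-in for the capability probe; not the Hadoop interface. */
interface CapabilitiesModel {
  boolean hasCapability(String capability);
}

/** Hypothetical stream whose hflush()/hsync() requests are passed on to the store. */
class SyncingStream implements CapabilitiesModel {
  @Override
  public boolean hasCapability(String capability) {
    // Declared because the requests really are forwarded;
    // any other capability name is never claimed.
    return "hflush".equals(capability) || "hsync".equals(capability);
  }
}

/** Hypothetical stream that buffers everything until close(). */
class BufferingStream implements CapabilitiesModel {
  @Override
  public boolean hasCapability(String capability) {
    // No visibility or durability guarantees to declare.
    return false;
  }
}
```

<p>A caller can then probe the stream before relying on <code>hsync()</code>, and choose another strategy when the probe returns false.</p>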
<p>Sometimes streams pass their data to the store, but the far end may not sync it all the way to disk. That is not something the client can determine. Here, if the client code passes hflush/hsync requests on to the distributed FS, it SHOULD declare that it supports them.</p></section><section>
<h3><a name="Metadata_updates"></a>Metadata updates</h3>
<p>Implementors MAY NOT update a file&#x2019;s metadata (length, date, &#x2026;) after every <code>hsync()</code> call. HDFS doesn&#x2019;t, except when the written data crosses a block boundary.</p></section><section>
<h3><a name="Does_close.28.29_synchronize_and_persist_data.3F"></a>Does <code>close()</code> synchronize and persist data?</h3>
<p>By default, HDFS does not immediately write data to disk when a stream is closed; it will be asynchronously saved to disk.</p>
<p>This does not mean that users do not expect it.</p>
<p>The behavior as implemented is similar to the write-back aspects of NFS&#x2019;s <a class="externalLink" href="https://docstore.mik.ua/orelly/networking_2ndEd/nfs/ch07_04.htm">caching</a>. <code>DFSClient.close()</code> performs an <code>hflush()</code> to upload all data from the client to the datanodes.</p>
<ol style="list-style-type: decimal">
<li><code>close()</code> SHALL return once the guarantees of <code>hflush()</code> are met: the data is visible to others.</li>
<li>For durability guarantees, <code>hsync()</code> MUST be called first.</li>
</ol></section></section>
</div>
</div>
<div class="clear">
<hr/>
</div>
<div id="footer">
<div class="xright">
&#169; 2008-2023
Apache Software Foundation
- <a href="http://maven.apache.org/privacy-policy.html">Privacy Policy</a>.
Apache Maven, Maven, Apache, the Apache feather logo, and the Apache Maven project logos are trademarks of The Apache Software Foundation.
</div>
<div class="clear">
<hr/>
</div>
</div>
</body>
</html>