When making a GetBucketLocation request, Amazon may route the request
to the bucket's region. When the request is made with the v4 signer,
it may fail because of the resulting region mismatch. Concretely, a
request to test.s3.amazonaws.com may resolve to
s3-us-west-2-w.amazonaws.com. The request itself is signed for the
us-east-1 region (the s3.amazonaws.com endpoint), but then fails when
the DNS resolution points to a us-west-2 endpoint.
Bucket-in-path works around this for GetBucketLocation requests.
That means that every GetBucketLocation request will be of the form:
https://s3.amazonaws.com/{bucket}?location. This ensures that jclouds
requests will not be subjected to Amazon's routing/DNS pointers.
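A minimal sketch of the two request forms, using plain java.net.URI
(the bucket name and endpoints are illustrative):

    // Path-style: the bucket lives in the path, so DNS always resolves the
    // region-neutral s3.amazonaws.com endpoint and the v4 signature matches.
    String bucket = "test";
    java.net.URI pathStyle = java.net.URI.create(
        "https://s3.amazonaws.com/" + bucket + "?location");

    // Virtual-hosted style: the bucket is part of the host name, which may
    // resolve to a region-specific endpoint such as s3-us-west-2-w.amazonaws.com.
    java.net.URI hostStyle = java.net.URI.create(
        "https://" + bucket + ".s3.amazonaws.com/?location");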
Fixes: JCLOUDS-1213
Prefer the uppercase L suffix for long literals: readers can confuse
a lowercase l with the digit 1. Found via error-prone. Fixed via:
find -name \*.java | xargs sed -i 's/\( [0-9][0-9]*\)l/\1L/g'
find -name \*.java | xargs sed -i 's/\(([0-9][0-9]*\)l/\1L/g'
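For example (an illustrative literal, not taken from the diff):

    long delay = 500l;       // the lowercase suffix reads like 5001 at a glance
    long delayFixed = 500L;  // unambiguous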
Previously we created a new builder instead of reusing the one the
method was modifying, and did not preserve the content encoding.
Addresses AWSS3BlobIntegrationLiveTest.testPutInputStream test
failures.
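A minimal sketch of the intended pattern, assuming jclouds'
BlobBuilder.PayloadBlobBuilder and ContentMetadata accessors (the
method actually touched here may differ):

    import org.jclouds.blobstore.domain.Blob;
    import org.jclouds.blobstore.domain.BlobBuilder.PayloadBlobBuilder;

    // Reuse the builder the caller handed in and carry the source blob's
    // content encoding across, instead of starting from a fresh builder.
    static PayloadBlobBuilder preserveContentEncoding(PayloadBlobBuilder builder, Blob source) {
       String encoding = source.getMetadata().getContentMetadata().getContentEncoding();
       if (encoding != null) {
          builder.contentEncoding(encoding);
       }
       return builder;
    }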
AWS S3 MPU ETags are hashes of the individual part ETags, but some
implementations, specifically S3Proxy with the filesystem provider,
represent multi-part objects as a single object. Remove these ETag
checks since they add nothing.
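For reference, the AWS-style multi-part ETag is computed roughly as
below (a sketch; the format is not guaranteed by the S3 API, which is
why the checks were dropped):

    import java.security.MessageDigest;

    // MD5 over the concatenated binary MD5s of each part, hex-encoded,
    // with "-<partCount>" appended (e.g. a two-part upload ends in "-2").
    static String multipartETag(byte[][] parts) throws Exception {
       MessageDigest outer = MessageDigest.getInstance("MD5");
       for (byte[] part : parts) {
          outer.update(MessageDigest.getInstance("MD5").digest(part));
       }
       StringBuilder hex = new StringBuilder();
       for (byte b : outer.digest()) {
          hex.append(String.format("%02x", b));
       }
       return hex + "-" + parts.length;
    }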
Canned ACLs are simpler than the full XML ACL API and better
supported by non-AWS S3 implementations, e.g., Ceph and S3Proxy.
Further, this makes the provider more consistent with bucket and
object creation, which only supports setting canned ACLs.
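A sketch of what a canned ACL looks like on the wire, using plain
HttpURLConnection (endpoint, object name, and ACL value are
illustrative; signing omitted):

    import java.net.HttpURLConnection;
    import java.net.URL;

    // A canned ACL is a single request header rather than an XML policy body.
    URL url = new URL("https://s3.amazonaws.com/my-bucket/my-object");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("PUT");
    conn.setRequestProperty("x-amz-acl", "public-read");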
When constructing the query path, the S3 provider does not properly
handle encoded paths. For example, if a blob named %20 is to be placed
into the blob store, S3 would end up placing a blob named " " (what
%20 represents). This occurs because the S3 provider examines the
URI's path portion (which is presented to the caller in decoded form)
and does not re-encode it afterwards. Instead, we should call
getRawPath() to avoid this issue.
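The difference between the two accessors, shown with java.net.URI:

    import java.net.URI;

    // A blob literally named "%20" must stay percent-encoded in the URI.
    URI uri = URI.create("https://s3.amazonaws.com/bucket/%2520");
    uri.getPath();     // "/bucket/%20"   -- decoded; the escaping is lost
    uri.getRawPath();  // "/bucket/%2520" -- exactly what must go on the wire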
There are two issues on the decoding path:
1. Given a blob named " ", the S3 API will throw a RuntimeException
due to a NULL check: the key parsed from the XML content " " (the
blob name) comes back as NULL.
2. Given a blob named "%20 ", the S3 API will generate a URI for a
blob named "%20%20", which is also incorrect. The correct URI would
be "%2520%20" (escaping the "%" and the trailing " ").
The first issue is due to the currentOrNull() helper, which calls
trim() on the string and then compares the result to an empty string.
This means that a blob named " " will be trimmed to "" and the method
will return NULL. Passing null as the key then fails in a number of
places (notably, appendPath()).
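A simplified stand-in for the helper illustrates the problem:

    // Trims, then maps the empty string to null.
    static String currentOrNull(String current) {
       String trimmed = current.trim();
       return trimmed.isEmpty() ? null : trimmed;
    }

    currentOrNull(" ");   // returns null, so a blob named " " loses its key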
The second issue is due to the appendPath() method in the jclouds
Uris class. appendPath() calls urlDecode() and passes the result to
path(), and path() in turn also calls urlDecode(). After these
transformations, a properly encoded path of the form %2520%20 turns
into "%20 " and then "  " (two spaces). The path is then encoded
again, resulting in "%20%20" (which is wrong).
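The double decode can be reproduced with java.net.URLDecoder standing
in for the jclouds urlDecode() helper:

    import java.net.URLDecoder;
    import java.nio.charset.StandardCharsets;

    String encoded = "%2520%20";  // correct encoding of a blob named "%20 "
    String once = URLDecoder.decode(encoded, StandardCharsets.UTF_8);   // "%20 "
    String twice = URLDecoder.decode(once, StandardCharsets.UTF_8);     // "  "
    // Re-encoding "  " yields "%20%20", so the original name "%20 " is lost.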
The patch adds delimiter option support to the generic blob store
interface. A live integration test is added to verify that jclouds
correctly lists objects separated by a delimiter.
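A sketch of how the option might be used from the portable API,
assuming a delimiter(String) method on ListContainerOptions (the
option this change introduces):

    import org.jclouds.blobstore.BlobStore;
    import org.jclouds.blobstore.domain.PageSet;
    import org.jclouds.blobstore.domain.StorageMetadata;
    import org.jclouds.blobstore.options.ListContainerOptions;

    // List a single "level" of a container by treating "/" as a separator;
    // keys sharing a prefix up to the delimiter collapse into one entry.
    static PageSet<? extends StorageMetadata> listTopLevel(BlobStore blobStore, String container) {
       return blobStore.list(container, new ListContainerOptions().delimiter("/"));
    }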