docs(aio): add more docs about `aio-builds-setup`

This commit is contained in:
Georgios Kalpakas 2017-03-09 22:12:01 +02:00 committed by Chuck Jazdzewski
parent b804a488c5
commit 3bb59902f7
14 changed files with 562 additions and 34 deletions


@ -1,32 +0,0 @@
# VM Setup Instructions
- Set up secrets (access tokens, passwords, etc)
- Set up docker
- Attach persistent disk
- Build docker image (+ checkout repo)
- Run image (+ setup for run on boot)
## Build image
- `<aio-builds-setup-dir>/build.sh [<name>[:<tag>] [--build-arg <NAME>=<value> ...]]`
## Run image
- `sudo docker run \
-d \
--dns 127.0.0.1 \
--name <instance-name> \
-p 80:80 \
-p 443:443 \
--restart unless-stopped \
[-v <host-cert-dir>:/etc/ssl/localcerts:ro] \
-v <host-secrets-dir>:/aio-secrets:ro \
-v <host-builds-dir>:/var/www/aio-builds \
[-v <host-logs-dir>:/var/log/aio] \
<name>[:<tag>]
`
## Questions
- Do we care to keep logs (e.g. cron, nginx, aio-upload-server, aio-clean-up, pm2) outside of the container?
- Instead of creating new comments for each commit, update the original comment?


@ -0,0 +1,28 @@
# VM Setup Instructions
## Overview
- [General overview](overview--general.md)
- [Security model](overview--security-model.md)
- [Available Commands](overview--scripts-and-commands.md)
## Setting up the VM
- [Set up secrets](vm-setup--set-up-secrets.md)
- [Set up docker](vm-setup--set-up-docker.md)
- [Attach persistent disk](vm-setup--attach-persistent-disk.md)
- [Create host directories and files](vm-setup--create-host-dirs-and-files.md)
- [Create docker image](vm-setup--create-docker-image.md)
## Configuring the docker image
- [Available environment variables](image-config--environment-variables.md)
## Starting the docker container
- [Start docker container](vm-setup--start-docker-container.md)
## Miscellaneous
- [Debug docker container](misc--debug-docker-container.md)
- [Integrate with CI](misc--integrate-with-ci.md)


@ -0,0 +1,52 @@
# Image config - Environment variables
Below is a list of environment variables that can be configured when creating the docker image (as
described [here](vm-setup--create-docker-image.md)). An up-to-date list of the configurable
environment variables and their default values can be found in the
[Dockerfile](../dockerbuild/Dockerfile).
**Note:**
Each variable has a `TEST_` prefixed counterpart, which is used for testing purposes. In most cases
you don't need to specify values for those.
- `AIO_BUILDS_DIR`:
The directory (inside the container) where the uploaded build artifacts are kept.
- `AIO_DOMAIN_NAME`:
The domain name of the server.
- `AIO_GITHUB_ORGANIZATION`:
The GitHub organization whose teams are whitelisted for accepting uploads.
See also `AIO_GITHUB_TEAM_SLUGS`.
- `AIO_GITHUB_TEAM_SLUGS`:
A comma-separated list of teams whose members are allowed to upload PRs.
See also `AIO_GITHUB_ORGANIZATION`.
- `AIO_NGINX_HOSTNAME`:
The internal hostname for accessing the nginx server. This is mostly used for performing a
periodic health-check.
- `AIO_NGINX_PORT_HTTP`:
The port number on which nginx listens for HTTP connections. This should be mapped to the
corresponding port on the host VM (as described [here](vm-setup--start-docker-container.md)).
- `AIO_NGINX_PORT_HTTPS`:
The port number on which nginx listens for HTTPS connections. This should be mapped to the
corresponding port on the host VM (as described [here](vm-setup--start-docker-container.md)).
- `AIO_REPO_SLUG`:
The repository slug (in the form `<user>/<repo>`) for which PRs will be uploaded.
- `AIO_UPLOAD_HOSTNAME`:
The internal hostname for accessing the Node.js upload-server. This is used by nginx for
delegating upload requests and also for performing a periodic health-check.
- `AIO_UPLOAD_MAX_SIZE`:
The maximum allowed size for the uploaded gzip archive containing the build artifacts. Files
larger than this will be rejected.
- `AIO_UPLOAD_PORT`:
The port number on which the Node.js upload-server listens for HTTP connections. This is used by
nginx for delegating upload requests and also for performing a periodic health-check.
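To double-check the values that a running container ended up with, the relevant variables can be
printed from the host. This is just a quick inspection sketch, assuming the container is named
`<instance-name>` (see [here](vm-setup--start-docker-container.md)):
```
# Print the effective AIO_* (and TEST_AIO_*) variables of a running container.
sudo docker exec <instance-name> printenv | grep AIO_ | sort
```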


@ -0,0 +1,12 @@
# Miscellaneous - Debug docker container
TODO (gkalpak): Add docs. Mention:
- `aio-health-check`
- `aio-verify-setup`
- Test nginx accessible at:
- `http://$TEST_AIO_NGINX_HOSTNAME:$TEST_AIO_NGINX_PORT_HTTP`
- `https://$TEST_AIO_NGINX_HOSTNAME:$TEST_AIO_NGINX_PORT_HTTPS`
- Test upload-server accessible at:
- `http://$TEST_AIO_UPLOAD_HOSTNAME:$TEST_AIO_UPLOAD_PORT`
- Local DNS (via dnsmasq) maps the above hostnames to 127.0.0.1
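As a rough sketch of such a debugging session (assuming a container named `<instance-name>` and
`curl` being available inside the container), one might run:
```
# Run the basic health-check from inside the container.
sudo docker exec <instance-name> aio-health-check

# Run the e2e-like setup verification tests.
sudo docker exec <instance-name> aio-verify-setup

# Hit the test servers from inside the container
# (the hostnames are resolved to 127.0.0.1 by the local dnsmasq).
sudo docker exec <instance-name> bash -c 'curl -i "http://$TEST_AIO_NGINX_HOSTNAME:$TEST_AIO_NGINX_PORT_HTTP"'
sudo docker exec <instance-name> bash -c 'curl -ik "https://$TEST_AIO_NGINX_HOSTNAME:$TEST_AIO_NGINX_PORT_HTTPS"'
sudo docker exec <instance-name> bash -c 'curl -i "http://$TEST_AIO_UPLOAD_HOSTNAME:$TEST_AIO_UPLOAD_PORT"'
```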


@ -0,0 +1,12 @@
# Miscellaneous - Integrate with CI
TODO (gkalpak): Add docs. Mention:
- Travis' JWT addon (+ limitations).
Relevant files: `.travis.yml`
- Testing on CI.
Relevant files: `ci/test-aio.sh`, `aio/aio-builds-setup/scripts/test.sh`
- Preverifying on CI.
Relevant files: `ci/deploy.sh`, `aio/aio-builds-setup/scripts/travis-preverify-pr.sh`
- Deploying from CI.
Relevant files: `ci/deploy.sh`, `aio/scripts/deploy-preview.sh`


@ -0,0 +1,84 @@
# Overview - General
## Objective
Whenever a PR job is run on Travis, we want to build `angular.io` and upload the build artifacts to
a publicly accessible server so that collaborators (developers, designers, authors, etc) can preview
the changes without having to check out and build the app locally.
## Source code
In order to make it easier to administer the server and version-control the setup, we are using
[docker](https://www.docker.com) to run a container on a VM. The Dockerfile and all other files
necessary for creating the docker container are stored (and versioned) along with the angular.io
project's source code (currently part of the angular/angular repo) in the `aio-builds-setup/`
directory.
## Setup
The VM is hosted on [Google Compute Engine](https://cloud.google.com/compute/). The host OS is
debian:jessie. For more info on how to set up the host VM, take a look at the "Setting up the VM"
section of the [TOC](_TOC.md).
## Security model
Since we are managing a public server, it is important to take appropriate measures in order to
prevent abuse. For more details on the challenges and the chosen approach take a look at the
[security model](overview--security-model.md).
## The 10,000 feet view
This section gives a brief summary of the various operations performed on CI and by the docker
container:
### On CI (Travis)
- Build job completes successfully (i.e. build succeeds and tests pass).
- The CI script checks whether the build job was initiated by a PR against the angular/angular
master branch.
- The CI script checks whether the PR has touched any files inside the angular.io project directory
(currently `aio/`).
- The CI script checks whether the author of the PR is a member of one of the whitelisted GitHub
teams (and therefore allowed to upload).
**Note:**
For security reasons, the same checks are also performed on the server. Performing them on CI as
well is optional, but serves two purposes:
1. It avoids the wasted overhead associated with uploads that are going to be rejected (e.g.
building the artifacts, sending them to the server, running checks on the server, etc).
2. It avoids failing the build (due to an error response from the server) or requiring additional
logic for detecting the reason of the failure.
- The CI script gzips and uploads the build artifacts to the server.
More info on how to set things up on CI can be found [here](misc--integrate-with-ci.md).
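The upload step itself is intentionally only summarized here. As a rough, hypothetical sketch (the
archive name, endpoint path and header below are placeholders, not the real protocol; the
authoritative version lives in the CI scripts referenced in [the CI docs](misc--integrate-with-ci.md)),
it could look like this:
```
# Hypothetical sketch of the "gzip and upload" step; paths, endpoint and header are placeholders.
tar -czf aio-snapshot.tgz -C aio/dist .
curl --fail -X POST \
    --header "Authorization: Token $NGBUILDS_IO_KEY" \
    --data-binary "@aio-snapshot.tgz" \
    "https://<domain-name>/<upload-path>/$TRAVIS_PULL_REQUEST/$TRAVIS_PULL_REQUEST_SHA"
```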
### Uploading build artifacts
- nginx receives upload request.
- nginx checks that the uploaded gzip archive does not exceed the specified max file size, stores it
in a temporary location and passes the filepath to the Node.js upload-server.
- The upload-server verifies that the uploaded file is not trying to overwrite an existing build,
and runs several checks to determine whether the request should be accepted (more details can be
found [here](overview--security-model.md)).
- The upload-server deploys the artifacts to a sub-directory named after the PR number and SHA:
`<PR>/<SHA>/`
- The upload-server posts a comment on the corresponding PR on GitHub, mentioning the SHA and the
link where the preview can be found.
### Serving build artifacts
- nginx receives a request for an uploaded resource on a subdomain corresponding to the PR and SHA.
E.g.: `pr<PR>-<SHA>.ngbuilds.io/path/to/resource`
- nginx maps the subdomain to the correct sub-directory and serves the resource.
E.g.: `/<PR>/<SHA>/path/to/resource`
### Removing obsolete artifacts
In order to avoid flooding the disk with unnecessary build artifacts, there is a cronjob that runs a
clean-up task once a day. The task retrieves all open PRs from GitHub and removes all directories
that do not correspond to an open PR.
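Conceptually, the clean-up task boils down to something like the following sketch (the paths and the
use of `curl`/`jq` are illustrative only; the actual logic is implemented by the `aio-clean-up`
command described in [the commands overview](overview--scripts-and-commands.md)):
```
# Conceptual sketch only; the real `aio-clean-up` implementation may differ.
BUILDS_DIR=/var/www/aio-builds
REPO_SLUG=angular/angular

# Get the numbers of all currently open PRs (paging omitted for brevity).
open_prs=$(curl -s "https://api.github.com/repos/$REPO_SLUG/pulls?state=open&per_page=100" |
           jq -r '.[].number')

# Remove every build directory whose name does not match an open PR.
for dir in "$BUILDS_DIR"/*; do
  pr=$(basename "$dir")
  if ! echo "$open_prs" | grep -qx "$pr"; then
    echo "Removing obsolete artifacts for PR #$pr"
    rm -rf "$dir"
  fi
done
```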
### Health-check
The docker service runs a periodic health-check that verifies the running conditions of the
container. This includes verifying the status of specific system services, the responsiveness of
nginx and the upload-server, and internet connectivity.
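Assuming the health-check is wired up via docker's `HEALTHCHECK` mechanism, its latest result can be
inspected from the host VM. For example (with a container named `<instance-name>`):
```
# Show the current health status of the container (e.g. healthy/unhealthy/starting).
sudo docker inspect --format '{{.State.Health.Status}}' <instance-name>

# Show the recorded output of the most recent health-check runs.
sudo docker inspect --format '{{json .State.Health}}' <instance-name>
```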


@ -0,0 +1,55 @@
# Overview - Scripts and Commands
This is an overview of the available scripts and commands.
## Scripts
The scripts are located inside `<aio-builds-setup-dir>/scripts/`. The following scripts are
available:
- `build.sh`:
Can be used for creating a preconfigured docker image.
See [here](vm-setup--create-docker-image.md) for more info.
- `test.sh`:
Can be used for running the tests for `<aio-builds-setup-dir>/dockerbuild/scripts-js/`. This is
useful for CI integration. See [here](misc--integrate-with-ci.md) for more info.
- `travis-preverify-pr.sh`:
Can be used for "preverifying" a PR before uploading the artifacts to the server. It checks that
the author of the PR is a member of one of the specified GitHub teams and is therefore allowed to upload
build artifacts. This is useful for CI integration. See [here](misc--integrate-with-ci.md) for
more info.
## Commands
The following commands are available globally from inside the docker container. They are either used
by the container to perform its various operations or can be used ad-hoc, mainly for testing
purposes. Each command is backed by a corresponding script inside
`<aio-builds-setup-dir>/dockerbuild/scripts-sh/`.
- `aio-clean-up`:
Cleans up the builds directory by removing the artifacts that do not correspond to an open PR.
_It is run as a daily cronjob._
- `aio-health-check`:
Runs a basic health-check, verifying that the necessary services are running, the servers are
responding and there is a working internet connection.
_It is used periodically by docker for determining the container's health status._
- `aio-init`:
Initializes the container (mainly by starting the necessary services).
_It is run (by default) when starting the container._
- `aio-upload-server-prod`:
Spins up a Node.js upload-server instance.
_It is used in `aio-init` (see above) during initialization._
- `aio-upload-server-test`:
Spins up a Node.js upload-server instance for tests.
_It is used in `aio-verify-setup` (see below) for running tests._
- `aio-verify-setup`:
Runs a suite of e2e-like tests, mainly verifying the correct (inter)operation of nginx and the
Node.js upload-server.
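For example (a sketch only, assuming a running container named `<instance-name>`), the scripts are
run on the host (or on CI), while the commands can be run ad-hoc from inside the container via
`docker exec`:
```
# On the host (or on CI): run the tests for `dockerbuild/scripts-js/`.
<aio-builds-setup-dir>/scripts/test.sh

# Inside a running container: trigger an ad-hoc clean-up and a health-check.
sudo docker exec <instance-name> aio-clean-up
sudo docker exec <instance-name> aio-health-check
```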


@ -0,0 +1,116 @@
# Overview - Security model
Whenever a PR job is run on Travis, we want to build `angular.io` and upload the build artifacts to
a publicly accessible server so that collaborators (developers, designers, authors, etc) can preview
the changes without having to check out and build the app locally.
This document discusses the security considerations associated with uploading build artifacts as
part of the CI setup and serving them publicly.
## Security objectives
- **Prevent uploading arbitrary content to our servers.**
Since there is no restriction on who can submit a PR, we cannot allow just any PR's build artifacts to
be uploaded.
- **Prevent overwriting other people's uploaded content.**
There needs to be a mechanism in place to ensure that the uploaded content does indeed correspond
to the PR indicated by its URL.
- **Prevent arbitrary access on the server.**
Since the PR author has full control over the build artifacts that would be uploaded, we must
ensure that the uploaded files will not enable arbitrary access to the server or expose sensitive
info.
## Issues / Caveats
- Because the PR author can change the scripts run on CI, any security mechanisms must be immune to
such changes.
- For security reasons, encrypted Travis variables are not available to PRs, so we can't rely on
them to implement security.
## Implemented approach
### In a nutshell
The implemented approach can be broken up into the following sub-tasks:
1. Verify which PR the uploaded artifacts correspond to.
2. Determine the author of the PR.
3. Check whether the PR author is a member of some whitelisted GitHub team.
4. Deploy the artifacts to the corresponding PR's directory.
5. Prevent overwriting previously deployed artifacts (which ensures that the guarantees established
during deployment will remain valid until the artifacts are removed).
6. Prevent uploaded files from accessing anything outside their directory.
### Implementation details
This section describes how each of the aforementioned sub-tasks is accomplished:
1. **Verify which PR the uploaded artifacts correspond to.**
We are taking advantage of Travis' [JWT addon](https://docs.travis-ci.com/user/jwt). By sharing
a secret between Travis (which keeps it private but uses it to sign a JWT) and the server (which
uses it to verify the authenticity of the JWT), we can accomplish the following:
a. Verify that the upload request comes from Travis.
b. Determine the PR that these artifacts correspond to (since Travis puts that information into
the JWT, without the PR author being able to modify it).
_Note:_
_There are currently certain limitations in the implementation of the JWT addon._
_See the next section for more details._
2. **Determine the author of the PR.**
Once we have securely associated the uploaded artifacts with a PR, we retrieve the PR's metadata -
including the author's username - using the [GitHub API](https://developer.github.com/v3/).
To avoid rate-limit restrictions, we use a Personal Access Token (issued by
[@mary-poppins](https://github.com/mary-poppins)).
3. **Check whether the PR author is a member of some whitelisted GitHub team.**
Again using the GitHub API, we can verify the author's membership in one of the
whitelisted/trusted GitHub teams. For this operation, we need a Personal Access Token with the
`read:org` scope issued by a user that can "see" the specified GitHub organization.
Here too, we use a token issued by @mary-poppins. (A sketch of such a membership check is shown
after this list.)
4. **Deploy the artifacts to the corresponding PR's directory.**
With the preceding steps, we have verified that the uploaded artifacts have been uploaded by
Travis and correspond to a PR whose author is a member of a trusted team. Essentially, as long as
sub-tasks 1, 2 and 3 can be securely accomplished, it is possible to "project" the trust we have
in a team's members through the PR and Travis to the build artifacts.
5. **Prevent overwriting previously deployed artifacts.**
In order to enforce this restriction (and ensure that the deployed artifacts' validity is
preserved throughout their "lifetime"), the server that handles the upload (currently a Node.js
Express server) rejects uploads that target an existing directory.
_Note: A PR can contain multiple uploads; one for each SHA that was built on Travis._
6. **Prevent uploaded files from accessing anything outside their directory.**
Nginx (which is used to serve the uploaded artifacts) has been configured to not follow symlinks
outside of the directory where the build artifacts are stored.
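To illustrate sub-task 3, a membership check along these lines can be performed against the
[GitHub API](https://developer.github.com/v3/). This is only a sketch (the upload-server implements
its own version); `<org>`, `<team-slug>` and `<username>` are placeholders and `GITHUB_TOKEN` stands
for the Personal Access Token mentioned above:
```
# Sketch only: check whether <username> is an active member of team <team-slug> in org <org>.
# `GITHUB_TOKEN` is a Personal Access Token with the `read:org` scope.

# Find the team's id from its slug.
team_id=$(curl -s -H "Authorization: token $GITHUB_TOKEN" "https://api.github.com/orgs/<org>/teams" |
          jq -r '.[] | select(.slug == "<team-slug>") | .id')

# Check the user's membership in that team (`state` should be `active`).
curl -s -H "Authorization: token $GITHUB_TOKEN" \
     "https://api.github.com/teams/$team_id/memberships/<username>" |
  jq -r '.state'
```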
## Assumptions / Things to keep in mind
- Each trusted PR author has full control over the content that is uploaded for their PRs. Part of
the security model relies on the trustworthiness of these authors.
- If anyone gets access to the `PREVIEW_DEPLOYMENT_TOKEN` (a.k.a. `NGBUILDS_IO_KEY` on
angular/angular) variable generated for each Travis job, they will be able to impersonate the
corresponding PR's author on the preview server for as long as the token is valid (currently 90
mins). Because of this, the value of the `PREVIEW_DEPLOYMENT_TOKEN` should not be made publicly
accessible (e.g. by printing it on the Travis job log).
- Travis only allows specific whitelisted property names to be used with the JWT addon. The only
known such property at the time is `SAUCE_ACCESS_KEY` (used for integration with SauceLabs). In
order to be able to actually use the JWT addon we had to name the encrypted variable
`SAUCE_ACCESS_KEY` (which we later re-assign to `NGBUILDS_IO_KEY`).


@ -13,5 +13,8 @@
## Mount disk on boot
- ``echo UUID=`sudo blkid -s UUID -o value /dev/disk/by-id/google-aio-builds` \
/mnt/disks/aio-builds ext4 discard,defaults,nofail 0 2 | sudo tee -a /etc/fstab``
- Run:
```
echo UUID=`sudo blkid -s UUID -o value /dev/disk/by-id/google-aio-builds` \
/mnt/disks/aio-builds ext4 discard,defaults,nofail 0 2 | sudo tee -a /etc/fstab
```


@ -0,0 +1,32 @@
# VM setup - Create docker image
## Checkout repository
- `git clone <repo-url>`
## Build docker image
- `<aio-builds-setup-dir>/scripts/build.sh [<name>[:<tag>] [--build-arg <NAME>=<value> ...]]`
- You can override the default environment variables inside the image by passing new values using
`--build-arg`.
**Note:** The build script has to execute docker commands with `sudo`.
## Example
The following commands would create a docker image from GitHub repo `foo/bar` to be deployed on the
`foobar-builds.io` domain and accepting PR deployments from authors that are members of the
`bar-core` and `bar-docs-authors` teams of organization `foo`:
- `git clone https://github.com/foo/bar.git foobar`
- Run:
```
./foobar/aio-builds-setup/scripts/build.sh foobar-builds \
--build-arg AIO_REPO_SLUG=foo/bar \
--build-arg AIO_DOMAIN_NAME=foobar-builds.io \
--build-arg AIO_GITHUB_ORGANIZATION=foo \
--build-arg AIO_GITHUB_TEAM_SLUGS=bar-core,bar-docs-authors
```
A full list of the available environment variables can be found
[here](image-config--environment-variables.md).


@ -0,0 +1,74 @@
# VM setup - Create host directories and files
## Create directory with secrets
For security reasons, sensitive info (such as tokens and passwords) is not hardcoded into the
docker image, nor passed as environment variables at runtime. Secrets are passed to the docker
container from the host VM as files inside a directory. Each file's name is the name of the variable
and the file content is the value. These are read from inside the running container when necessary.
More info on how to create the `secrets` directory and files can be found
[here](vm-setup--set-up-secrets.md).
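As a minimal sketch (the directory path and the `GITHUB_TOKEN` file name below are hypothetical; the
actual secret names are listed in [the secrets docs](vm-setup--set-up-secrets.md)), creating such a
directory could look like this:
```
# Sketch only: directory path and secret name are hypothetical.
sudo mkdir -p <host-secrets-dir>
echo -n '<secret-value>' | sudo tee <host-secrets-dir>/GITHUB_TOKEN > /dev/null
sudo chmod 400 <host-secrets-dir>/*
```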
## Create directory for build artifacts
The uploaded build artifacts should be kept in a directory outside the docker container, so it is
easier to replace the container without losing the uploaded builds. For portability across VMs, a
persistent disk can be used (as described [here](vm-setup--attach-persistent-disk.md)).
**Note:** The directories created inside that directory will be owned by user `www-data`.
## Create SSL certificates (Optional for dev)
The host VM can attach a directory containing the SSL certificate and key to be used by the nginx
server for serving the uploaded build artifacts. More info on how to attach the directory when
starting the container can be found [here](vm-setup--start-docker-container.md).
In order for the container to be able to find the certificate and key, they should be named
`<DOMAIN_NAME>.crt` and `<DOMAIN_NAME>.key` respectively. For example, for a domain name
`ngbuilds.io`, nginx will look for files `ngbuilds.io.crt` and `ngbuilds.io.key`. For more info on
how to specify the domain name, see [here](vm-setup--create-docker-image.md).
If no directory is attached, nginx will use an internal self-signed certificate. This is convenient
during development, but is not suitable for production.
**Note:**
Since nginx needs to be able to serve requests for both the main domain as well as any subdomain
(e.g. `ngbuilds.io/` and `foo-bar.ngbuilds.io/`), the provided certificate needs to be a wildcard
certificate covering both the domain and subdomains.
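For local development, a self-signed wildcard certificate/key pair can be generated manually, e.g.
with `openssl`. The command below is only a sketch for the `ngbuilds.io` domain; note that modern
clients may additionally require `subjectAltName` entries covering both `ngbuilds.io` and
`*.ngbuilds.io`:
```
# Sketch only: create a self-signed wildcard certificate/key pair for development.
openssl req -x509 -nodes -newkey rsa:2048 -days 365 \
    -subj '/CN=*.ngbuilds.io' \
    -keyout <host-cert-dir>/ngbuilds.io.key \
    -out <host-cert-dir>/ngbuilds.io.crt
```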
## Create directory for logs (Optional)
Optionally, a logs directory can be passed to the docker container for storing non-system-related
logs. If not provided, the logs are kept locally on the container and will be lost whenever the
container is replaced (e.g. when updating to use a newer version of the docker image).
The following log files are kept in this directory:
- `clean-up.log`:
Output of the `aio-clean-up` command, run as a cronjob for cleaning up the build artifacts of
closed PRs.
- `init.log`:
Output of the `aio-init` command, run (by default) when starting the container.
- `nginx/{access,error}.log`:
The access and error logs produced by the nginx server while serving "production" files.
- `nginx-test/{access,error}.log`:
The access and error logs produced by the nginx server while serving "test" files. This is only
used when running tests locally from inside the container, e.g. with the `aio-verify-setup`
command. (See [here](overview--scripts-and-commands.md) for more info.)
- `upload-server-{prod,test,verify-setup}-*.log`:
The logs produced by the Node.js upload-server while serving either:
- `-prod`: "Production" files (e.g. during normal operation).
- `-test`: "Test" files (e.g. when a test instance is started with the `aio-upload-server-test`
command).
- `-verify-setup`: "Test" files, but while running `aio-verify-setup`.
(See [here](overview--scripts-and-commands.md) for more info on the commands mentioned above.)
- `verify-setup.log`:
The output of the `aio-verify-setup` command (e.g. Jasmine output), except for upload-server
output which is logged to `upload-server-verify-setup-*.log` (see above).


@ -0,0 +1,92 @@
# VM setup - Start docker container
## The `docker run` command
Once everything has been set up and configured, a docker container can be started with the following
command:
```
sudo docker run \
-d \
--dns 127.0.0.1 \
--name <instance-name> \
-p 80:80 \
-p 443:443 \
--restart unless-stopped \
[-v <host-cert-dir>:/etc/ssl/localcerts:ro] \
-v <host-secrets-dir>:/aio-secrets:ro \
-v <host-builds-dir>:/var/www/aio-builds \
[-v <host-logs-dir>:/var/log/aio] \
<name>[:<tag>]
```
Below is the same command with inline comments explaining each option. The API docs for `docker run`
can be found [here](https://docs.docker.com/engine/reference/run/).
```
sudo docker run \
# Start as a daemon.
-d \
# Use the local DNS server.
# (This is necessary for mapping internal URLs, e.g. for the Node.js upload-server.)
--dns 127.0.0.1 \
# Use `<instance-name>` as an alias for the container.
# Useful for running `docker` commands, e.g.: `docker stop <instance-name>`
--name <instance-name> \
# Map ports of the host VM (left) to ports of the docker container (right).
-p 80:80 \
-p 443:443 \
# Automatically restart the container (unless it was explicitly stopped by the user).
# (This ensures that the container will be automatically started on boot.)
--restart unless-stopped \
# The directory that contains the SSL certificates.
# (See [here](vm-setup--create-host-dirs-and-files.md) for more info.)
# If not provided, the container will use self-signed certificates.
[-v <host-cert-dir>:/etc/ssl/localcerts:ro] \
# The directory that contains the secrets (e.g. GitHub token, JWT secret, etc).
# (See [here](vm-setup--set-up-secrets.md) for more info.)
-v <host-secrets-dir>:/aio-secrets:ro \
# The uploaded build artifacts will be stored to and served from this directory.
# (If you are using a persistent disk - as described [here](vm-setup--attach-persistent-disk.md) -
# this will be a directory inside the disk.)
-v <host-builds-dir>:/var/www/aio-builds \
# The directory where the logs are being kept.
# (See [here](vm-setup--create-host-dirs-and-files.md) for more info.)
# If not provided, the logs will be kept inside the container, which means they will be lost
# whenever a new container is created.
[-v <host-logs-dir>:/var/log/aio] \
# The name of the docker image to use (and an optional tag; defaults to `latest`).
# (See [here](vm-setup--create-docker-image.md) for instructions on how to create the image.)
<name>[:<tag>]
```
## Example
The following command would start a docker container based on the previously created `foobar-builds`
docker image, alias it as 'foobar-builds-1' and map predefined directories on the host VM to be used
by the container for accessing secrets and SSL certificates and keeping the build artifacts and logs.
```
sudo docker run \
-d \
--dns 127.0.0.1 \
--name foobar-builds-1 \
-p 80:80 \
-p 443:443 \
--restart unless-stopped \
-v /etc/ssl/localcerts:/etc/ssl/localcerts:ro \
-v /foobar-secrets:/aio-secrets:ro \
-v /mnt/disks/foobar-builds:/var/www/aio-builds \
-v /foobar-logs:/var/log/aio \
foobar-builds
```