Document production hosting (#2661)
@@ -1,37 +1,245 @@
# Production hosting

[Operations issues][operations issues]
[#ops chat room][ops discord]

[operations issues]: https://github.com/badges/shields/issues?q=is%3Aissue+is%3Aopen+label%3Aoperations
[ops discord]: https://discordapp.com/channels/308323056592486420/480747695879749633

In production, a deploy commit checks in two config files:

- `.env`
- `config/local-shields-io-production.yml`

The `.env` file sets `NODE_CONFIG_ENV`, which bootstraps the configuration process. The rest of the configuration is loaded from three sources:

- `config/local-shields-io-production.yml` (secrets)
- [`config/shields-io-production.yml`](../config/shields-io-production.yml) (non-secrets)
- [`config/default.yml`](../config/default.yml)

| Component      | Subcomponent    | People with access                                                                         |
| -------------- | --------------- | ------------------------------------------------------------------------------------------ |
| Badge servers  | Account owner   | @espadrine                                                                                 |
| Badge servers  | ssh, logs       | @espadrine                                                                                 |
| Badge servers  | Deployment      | @espadrine, @paulmelnikow                                                                  |
| Badge servers  | Admin endpoints | @espadrine, @paulmelnikow                                                                  |
| Cloudflare     | Account owner   | @espadrine                                                                                 |
| Cloudflare     | Admin access    | @espadrine, @paulmelnikow                                                                  |
| GitHub         | OAuth app       | @espadrine ([could be transferred to the badges org][oauth transfer])                      |
| DNS            | Account owner   | @olivierlacan                                                                              |
| Sentry         | Error reports   | @espadrine, @paulmelnikow                                                                  |
| Frontend       | Deployment      | Technically anyone with push access but in practice must be deployed with the badge server |
| Metrics server | Owner           | @platan                                                                                    |
| UptimeRobot    | Account owner   | @paulmelnikow                                                                              |
| More metrics   | Owner           | @RedSparr0w                                                                                |

There are [too many bottlenecks][issue 2577]!

These settings are currently set in `config/local-shields-io-production.yml`:

- bintray_apikey
- bintray_user
- gh_client_id
- gh_client_secret
- gh_oauth_state
- libraries_io_api_key
- sentry_dsn
- shields_secret
- sl_insight_apiToken
- sl_insight_userUuid
- wheelmap_token

## Badge servers

There are three public badge servers on OVH VPSs.

| Cname                       | Hostname             | Type | IP             | Location           |
| --------------------------- | -------------------- | ---- | -------------- | ------------------ |
| [s0.shields-server.com][s0] | vps71670.vps.ovh.ca  | VPS  | 192.99.59.72   | Quebec, Canada     |
| [s1.shields-server.com][s1] | vps244529.ovh.net    | VPS  | 51.254.114.150 | Gravelines, France |
| [s2.shields-server.com][s2] | vps117870.vps.ovh.ca | VPS  | 149.56.96.133  | Quebec, Canada     |

- These are single-core virtual hosts with 2 GB RAM ([VPS SSD 1][]).
- The Node version (v9.4.0 at time of writing) and dependency versions on the
  servers can be inspected in Sentry, but only when an error occurs.
- The servers use self-signed SSL certificates. ([#1460][issue 1460])
- After accepting the certificate, you can debug an individual server using
  the links above.
- The scripts that start the server live in the [ServerScript][] repo. However,
  updates must be pulled manually. They are not updated as part of the deploy process.
- The server runs SSH.
- Deploys are made using a git post-receive hook.
- The server uses systemd to automatically restart the server when it crashes.
- Provisioning additional servers is a manual process which is yet to be
  documented.

[s0]: https://s0.shields-server.com/index.html
[s1]: https://s1.shields-server.com/index.html
[s2]: https://s2.shields-server.com/index.html
[vps ssd 1]: https://www.ovh.com/world/vps/vps-ssd.xml
[issue 1460]: https://github.com/badges/shields/issues/1460
[serverscript]: https://github.com/badges/ServerScript

## Main Server Sysadmin

- Servers in DNS round-robin:
  - s0.shields-server.com: 192.99.59.72 (vps71670.vps.ovh.ca)
  - s1.shields-server.com: 51.254.114.150 (vps244529.ovh.net)
  - s2.shields-server.com: 149.56.96.133 (vps117870.vps.ovh.ca)
- Self-signed TLS certificates, but `img.shields.io` is behind Cloudflare, which provides signed certificates.
- Using systemd to automatically restart the server when it crashes.

See https://github.com/badges/ServerScript for helper admin scripts.

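Because each server can be reached directly, it may be handy to walk the three hostnames in a loop when debugging. The following is only a sketch: the function name `badge_server_urls` is made up for illustration, and the commented-out `curl` call assumes you are willing to skip certificate verification because the servers use self-signed certificates.

```sh
#!/bin/sh
# Print the debug URL for each of the three badge servers.
# The hostnames come from this document; the function name is illustrative.
badge_server_urls() {
  for n in 0 1 2; do
    printf 'https://s%s.shields-server.com/index.html\n' "$n"
  done
}

# Example (not run here): the certificates are self-signed, so curl needs
# --insecure to talk to a server directly.
#   badge_server_urls | while read -r url; do
#     curl --insecure --silent --output /dev/null \
#       --write-out "%{http_code} $url\n" "$url"
#   done
badge_server_urls
```
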
## Attached state

Shields has mercifully little persistent state:

1. The GitHub tokens we collect are saved on each server in JSON files on disk.
   They can be fetched from the [GitHub auth admin endpoint][] for debugging.
2. The analytics data is also saved on each server in JSON files on disk.
3. The server keeps a few caches in memory. These are neither persisted nor
   inspectable.
   - The [request cache][]
   - The [regular-update cache][]
   - The [raster cache][]

[github auth admin endpoint]: https://github.com/badges/shields/blob/master/services/github/auth/admin.js
[request cache]: https://github.com/badges/shields/blob/master/lib/request-handler.js#L29-L30
[regular-update cache]: https://github.com/badges/shields/blob/master/lib/regular-update.js
[raster cache]: https://github.com/badges/shields/blob/master/gh-badges/lib/svg-to-img.js#L9-L10
[oauth transfer]: https://developer.github.com/apps/managing-oauth-apps/transferring-ownership-of-an-oauth-app/

## Configuration

To bootstrap the configuration process,
[the script that starts the server][start-shields.sh] sets a single
environment variable:

```
NODE_CONFIG_ENV=shields-io-production
```

With that variable set, the server ([using `config`][config]) reads these
files:

- [`local-shields-io-production.yml`][local-shields-io-production.yml].
  This file contains secrets which are checked in with a deploy commit.
- [`shields-io-production.yml`][shields-io-production.yml]. This file
  contains non-secrets which are checked in to the main repo.
- [`default.yml`][default.yml]. This file contains defaults.

[start-shields.sh]: https://github.com/badges/ServerScript/blob/master/start-shields.sh#L7
[config]: https://github.com/lorenwest/node-config/wiki/Configuration-Files
[local-shields-io-production.yml]: ../config/local-shields-io-production.example.yml
[shields-io-production.yml]: ../config/shields-io-production.yml
[default.yml]: ../config/default.yml

The project ships with `dotenv`, but there is no `.env` in production.

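To make the layering concrete, here is a small sketch that prints the three files for a given environment name, lowest precedence first. It assumes node-config's usual merge order (defaults, then the environment file, then the `local-` file, with later files overriding earlier ones); the function name `config_files` is hypothetical.

```sh
#!/bin/sh
# List the config files for an environment, lowest precedence first.
# Assumption: node-config merges default, then <env>, then local-<env>.
config_files() {
  env_name=$1
  printf 'config/default.yml\n'
  printf 'config/%s.yml\n' "$env_name"
  printf 'config/local-%s.yml\n' "$env_name"
}

config_files shields-io-production
```

Under that assumption, a secret set in `local-shields-io-production.yml` would override any value with the same key in the other two files.
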
## Badge CDN

Sitting in front of the three servers is a Cloudflare Free account which
provides several services:

- Global CDN, caching, and SSL gateway for `img.shields.io`
- Analytics through the Cloudflare dashboard
- DNS hosting for `shields.io`

Cloudflare is configured to respect the servers' cache headers.

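Since Cloudflare follows the servers' cache headers, checking what a badge response actually sends can help when debugging caching. The sketch below pulls `max-age` out of a `Cache-Control` header read from standard input. The function name `cache_max_age` and the sample header are illustrative only and do not reflect the values the servers really send.

```sh
#!/bin/sh
# Extract max-age (in seconds) from a Cache-Control header read on stdin.
# Header names are case-insensitive, so match without regard to case.
cache_max_age() {
  tr -d '\r' |
    grep -i '^cache-control:' |
    sed -n 's/.*max-age=\([0-9][0-9]*\).*/\1/p' |
    head -n 1
}

# Illustrative input. Real headers could come from something like:
#   curl --silent --head "$badge_url" | cache_max_age
printf 'HTTP/1.1 200 OK\r\nCache-Control: max-age=86400\r\n\r\n' | cache_max_age
```
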
## Frontend

The frontend is served by [GitHub Pages][] via the [gh-pages branch][gh-pages]. SSL is enforced.

`shields.io` resolves to the GitHub Pages hosts. It is not proxied through
Cloudflare.

Technically any maintainer can push to `gh-pages`, but in practice the frontend must be deployed
with the badge server via the deployment process described below.

[github pages]: https://pages.github.com/
[gh-pages]: https://github.com/badges/shields/tree/gh-pages

## Deployment

To set things up for deployment:

1. Get your SSH key added to the server.
2. Clone a fresh copy of the repository, dedicated for deployment.
   (Not required, but recommended; and lets you use `npm ci` below.)
3. Add remotes:

   ```sh
   git remote add s0 root@s0.shields-server.com:/home/m/shields.git
   git remote add s1 root@s1.shields-server.com:/home/m/shields.git
   git remote add s2 root@s2.shields-server.com:/home/m/shields.git
   ```

   `origin` should point to GitHub as usual.

4. Since the deploy uses `git worktree`, make sure you have git 2.5 or later.

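The git version requirement can be checked before the first deploy. This is a minimal sketch; `version_at_least` is a helper written for this example, and it compares only the major and minor components.

```sh
#!/bin/sh
# Succeed when version $1 (major.minor[.patch]) is at least $2 (major.minor).
version_at_least() {
  have_major=${1%%.*}
  rest=${1#*.}
  have_minor=${rest%%.*}
  want_major=${2%%.*}
  want_minor=${2#*.}
  [ "$have_major" -gt "$want_major" ] ||
    { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# `git --version` prints e.g. "git version 2.39.2"; the third field is the version.
installed=$(git --version 2>/dev/null | awk '{ print $3 }')
if [ -n "$installed" ] && version_at_least "$installed" 2.5; then
  echo "git $installed supports git worktree"
else
  echo "git 2.5 or later is required for git worktree" >&2
fi
```
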
To deploy:

1. Use `git fetch` to obtain a current copy of
   `local-shields-io-production.yml` from the server (or obtain the current
   version of that file some other way). Save it in `config/`.
2. Check out the commit you want to deploy.
3. Run `npm ci`. **This is super important for the frontend build!**
4. Run `make deploy-s0` to make a canary deploy.
5. Check the canary deploy:
   - [Visit the server][s0]. Don't forget that most of the preview badges
     are static!
   - Look for errors in [Sentry][].
   - Keep an eye on the [status page][status].
6. After a little while (usually 10–60 minutes), finish the deploy:
   `make push-s1 push-s2 deploy-gh-pages`.

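The command sequence in these steps could be wrapped in a small script that prints each command before anything is run for real. This is a sketch only: `DRY_RUN`, `run`, `canary_deploy`, and `finish_deploy` are names invented for the example, and only the `npm` and `make` invocations come from the steps above.

```sh
#!/bin/sh
# Print (or run) the deploy commands from the steps above.
# DRY_RUN and the function names are illustrative, not part of the Makefile.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

canary_deploy() {
  # npm ci must succeed before the canary deploy to s0 is attempted.
  run npm ci && run make deploy-s0
}

finish_deploy() {
  run make push-s1 push-s2 deploy-gh-pages
}

canary_deploy
echo "Check s0, Sentry, and the status page before running finish_deploy."
```

Keeping the canary and the final push in separate functions mirrors the pause between steps 4 and 6, so the second half is never run by accident.
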
To roll back, check out the commit you want to roll back to and repeat those
steps.

To see which commit is deployed to a server, run `git ls-remote` and then
`git log` on the `HEAD` ref. There will be two deploy commits preceded by the
commit which was deployed.

Be careful not to push the deploy commits to GitHub.

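As a rough illustration of the first half of that check, the sketch below picks the `HEAD` hash out of `git ls-remote` output. The helper name `head_commit` and the sample hash are made up; `git ls-remote` prints one `<hash><TAB><ref>` pair per line.

```sh
#!/bin/sh
# Print the commit hash that `git ls-remote` reports for HEAD.
# Input lines look like "<hash><TAB><ref>".
head_commit() {
  awk '$2 == "HEAD" { print $1; exit }'
}

# Illustrative input with a made-up hash. In practice:
#   git ls-remote s0 | head_commit
printf '%s\t%s\n' \
  '1111111111111111111111111111111111111111' 'HEAD' \
  '1111111111111111111111111111111111111111' 'refs/heads/master' |
  head_commit
```

Once the objects have been fetched from the server, running `git log -3` on that hash should show the two deploy commits followed by the commit that was deployed.
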
`make deploy-s0` does the following:

1. Creates a working tree in `/tmp`.
2. In that tree, runs `features` and `examples` to generate data files
   needed for the frontend.
3. Builds and checks in the built frontend.
4. Checks in `local-shields-io-production.yml`.
5. Pushes to s0, which updates dependencies and then restarts itself.

`make push-s1 push-s2 deploy-gh-pages` does the following:

1. Pushes the same working tree to s1 and s2.
2. Creates a new working tree for the frontend.
3. Adds a commit cleaning out the index.
4. Adds another commit with the built frontend.
5. Pushes to `gh-pages`.

## DNS

I'm not sure where the DNS is registered.

## Logs

Logs are available on the individual servers via SSH.

## Error reporting

[Error reporting][sentry] is one of the most useful tools we have for monitoring
the server. It's generously donated by [Sentry][sentry home]. We bundle
[`raven`][raven] into the application, and the Sentry DSN is configured via
`local-shields-io-production.yml` (see [documentation][sentry configuration]).

[sentry]: https://sentry.io/shields/
[raven]: https://www.npmjs.com/package/raven
[sentry home]: https://sentry.io/shields/
[sentry configuration]: https://github.com/badges/shields/blob/master/doc/self-hosting.md#sentry

## Monitoring

Request performance is monitored in three places:

- [Status][] (using [UptimeRobot][])
- [Server metrics][] using Prometheus and Grafana
- [@RedSparr0w's monitor][monitor] which posts [notifications][] to a private
  [#monitor chat room][monitor discord]

Overall server performance is monitored using Prometheus and Grafana.
Coming soon! ([#2068][issue 2068])

[status]: https://status.shields.io/
[server metrics]: https://metrics.shields.io/
[uptimerobot]: https://uptimerobot.com/
[monitor]: https://shields.redsparr0w.com/1568/
[notifications]: http://shields.redsparr0w.com/discord_notification
[monitor discord]: https://discordapp.com/channels/308323056592486420/470700909182320646
[issue 2068]: https://github.com/badges/shields/issues/2068

## Analytics

The server analytics data is public and can be fetched from the
[analytics endpoint][] or using the [analytics script][].

[analytics endpoint]: https://github.com/badges/shields/blob/master/lib/analytics.js
[analytics script]: https://github.com/badges/ServerScript/blob/master/stats.js

## Known limitations

1. The only way to inspect the commit on the server is with `git ls-remote`.
2. The production deploy installs `devDependencies`. It does not honor
   `package-lock.json`. ([#1988][issue 1988])

[issue 2577]: https://github.com/badges/shields/issues/2577
[issue 1988]: https://github.com/badges/shields/issues/1988