Closed
31 commits
180b7e7
test: added scripts for functional tests
kayrus Feb 11, 2016
8d7930c
Merge pull request #1430 from endocode/kayrus/docker_test
kayrus Feb 24, 2016
5709939
travis: bump go minor versions, add 1.6
kayrus Feb 24, 2016
421873d
Merge pull request #1440 from endocode/kayrus/bump_go_version_in_travis
kayrus Feb 24, 2016
9400eeb
fleetctl: avoid hard coding sleep time values inside calls
Feb 16, 2016
4a71826
fleetctl:destroy: on destroy check if the unit does not exist
Feb 24, 2016
d2fca87
tests fleetctl destroy behavior
steveej Feb 12, 2016
bf18345
destroy_test: add a destroy test for non-existent units
Feb 17, 2016
b04be71
fleetctl:test: add appendJobsForTests() helper to automatically appen…
Feb 18, 2016
33bddce
fleetctl:destroy_test: make the test more smarter by checking for rac…
Feb 18, 2016
d463b1c
fleetctl:test: add commandTestResults struct and newFakeRegistryForCo…
Feb 24, 2016
be1126f
fleetctl:test: add some tests for fleetctl stop path
Feb 24, 2016
063b974
fleetctl:test: add some tests for fleetctl unload path
Feb 24, 2016
50c78c6
fleetctl:test: restore back sharedFlags.NoBlock when finishing
Feb 24, 2016
e50ee54
fleetctl:test: improve operation description for unload and stop tests
Feb 24, 2016
d605dc0
Merge pull request #1439 from endocode/tixxdz/fleet-destroy-and-tests-v1
tixxdz Feb 24, 2016
aae37e1
docs: remove D-Bus and polkit note and added new fleetd CLI parameters.
kayrus Feb 5, 2016
1550074
Merge pull request #1427 from endocode/kayrus/remove_dbus_polkit_note
jonboulle Feb 29, 2016
87a51d3
docs: purged CLI parameters info (was added by mistake)
kayrus Mar 1, 2016
ce614b5
Merge pull request #1450 from endocode/kayrus/purge_parameters
kayrus Mar 1, 2016
fecff7f
fleetctl: add tryWaitForUnitStates() and getBlockAttempts()
Feb 17, 2016
f47f906
fleetctl: {load|start|stop|unload} use the tryWaitForUnitStates() and…
Mar 2, 2016
ed4569c
fleetctl: move logic to lookup a unit into getUnitFile() and getUnitF…
Mar 2, 2016
4297361
fleetctl: inline getBlockAttempts() in tryWaitForUnitStates() calls
Feb 17, 2016
c7510c3
fleetctl: improve code comment about getUnitFileFromTemplate()
Mar 2, 2016
bd57a75
fleetctl: getBlockAttempts() standarise on negative meaning do not block
Feb 20, 2016
4d77bd1
fleetctl:test: push tests for getBlockAttempts()
Mar 2, 2016
f1c438f
fleetctl: improve getUnitFile() error handling and add some documenta…
Mar 2, 2016
8881d3f
fleetctl: for errors indicate that getUnitFileFromTemplate() tried bo…
Mar 2, 2016
60a7c54
fleetctl: improve getUnitFile() and getUnitFileFromTemplate() Godoc
Mar 2, 2016
cab3099
fleetctl: just inline getUnitInstanceInfo() and restore previous erro…
Mar 2, 2016
5 changes: 3 additions & 2 deletions .travis.yml
@@ -1,11 +1,12 @@
language: go
matrix:
include:
- go: 1.4.2
- go: 1.4.3
install:
- go get golang.org/x/tools/cmd/cover
- go get golang.org/x/tools/cmd/vet
- go: 1.5.1
- go: 1.5.3
- go: 1.6

script:
- ./test
22 changes: 16 additions & 6 deletions Documentation/architecture.md
@@ -11,7 +11,7 @@ Every system in the fleet cluster runs a single `fleetd` daemon. Each daemon enc
- The engine uses a _lease model_ to enforce that only one engine is running at a time. Every time a reconciliation is due, an engine will attempt to take a lease on etcd. If the lease succeeds, the reconciliation proceeds; otherwise, that engine will remain idle until the next reconciliation period begins.
- The engine uses a simplistic "least-loaded" scheduling algorithm: when considering where to schedule a given unit, preference is given to agents running the smallest number of units.
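The "least-loaded" preference described in the list above can be sketched roughly as follows. This is an illustration only, not fleet's engine code: the `leastLoaded` function, the map of unit counts per agent, and the tie-break by agent ID are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// leastLoaded returns the ID of the agent running the fewest units.
// Ties are broken by agent ID so the result is deterministic; that
// tie-break is an assumption of this sketch, not documented fleet behavior.
func leastLoaded(unitCounts map[string]int) (string, bool) {
	if len(unitCounts) == 0 {
		return "", false
	}
	ids := make([]string, 0, len(unitCounts))
	for id := range unitCounts {
		ids = append(ids, id)
	}
	sort.Strings(ids)

	best := ids[0]
	for _, id := range ids[1:] {
		if unitCounts[id] < unitCounts[best] {
			best = id
		}
	}
	return best, true
}

func main() {
	// Hypothetical agents and the number of units each currently runs.
	agents := map[string]int{"machine-a": 3, "machine-b": 1, "machine-c": 2}
	if id, ok := leastLoaded(agents); ok {
		fmt.Println("schedule on", id)
	}
}
```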

The reconciliation loop of the engine can be disabled with the `--disable-engine` flag. This means that
The reconciliation loop of the engine can be disabled with the `disable_engine` config flag. This means that
this `fleetd` daemon will *never* become a cluster leader. If all running daemons have this setting,
your cluster is dead; i.e. no jobs will be scheduled. Use with care.

@@ -50,18 +50,28 @@ A UnitState object represents the state of a Unit in the fleet engine. A UnitSta

## Preview Release

Current releases of fleet don't currently perform any authentication or authorization for submitted units. This means that any client that can access your etcd cluster can potentially run arbitrary code on many of your machines very easily.
Current releases of fleet do not perform any authentication or authorization for submitted units. This means that any client that can access your etcd cluster can potentially run arbitrary code on many of your machines very easily. It is therefore strongly recommended to enable [TLS authentication][etcd-security] on the etcd side, set proper file permissions on the keypair on the host, and [configure fleet][fleet-tls] to use the keypair.

## Securing etcd

You should avoid public access to etcd and instead run fleet [from your local laptop][using-the-client] with the `--tunnel` flag to run commands over an SSH tunnel. You can alias this flag for easier usage: `alias fleetctl='fleetctl --tunnel 10.10.10.10'` - or use the environment variable `FLEETCTL_TUNNEL`.

## Other Notes
## Securing fleetd

Since it interacts directly with systemd over D-Bus, the fleetd daemon must be run with elevated privileges (i.e. as root) in order to perform operations like starting and stopping services. From the [systemd D-Bus documentation][systemd-dbus]:
It is also recommended to run fleetd under a separate `fleet` user and group, and to set the permissions of the fleetd API's listening Unix socket to `0660`. A local user must then be in the `fleet` group to perform any action with fleetd. Since the fleet daemon uses [D-Bus][d-bus] to communicate with systemd, it is necessary to create a [`polkit(8)`][polkit] rule to allow fleetd to communicate with systemd:

> In contrast to most of the other services of the systemd suite PID 1 does not use PolicyKit for controlling access to privileged operations, but relies exclusively on the low-level D-Bus policy language. (This is done in order to avoid a cyclic dependency between PolicyKit and systemd/PID 1.) This means that sensitive operations exposed by PID 1 on the bus are generally not available to unprivileged processes directly.
```js
polkit.addRule(function(action, subject) {
  if (action.id.indexOf("org.freedesktop.systemd1.") == 0 &&
      subject.user == "fleet") {
    return polkit.Result.YES;
  }
});
```

[etcd-security]: https://github.com/coreos/etcd/blob/master/Documentation/security.md
[d-bus]: https://www.freedesktop.org/wiki/Software/dbus/
[fleet-tls]: deployment-and-configuration.md#tls-authentication
[polkit]: https://www.freedesktop.org/software/polkit/docs/latest/polkit.8.html
[states documentation]: states.md
[using-the-client]: using-the-client.md#get-up-and-running
[systemd-dbus]: http://www.freedesktop.org/wiki/Software/systemd/dbus/
107 changes: 87 additions & 20 deletions Documentation/deployment-and-configuration.md
@@ -8,7 +8,45 @@ Deploying `fleet` on CoreOS is even simpler: just run `systemctl start fleet`. T

Each `fleetd` daemon must be configured to talk to the same [etcd cluster][etcd]. By default, the `fleetd` daemon will connect to either http://127.0.0.1:2379 or http://127.0.0.1:4001, depending on which endpoint responds. Refer to the configuration documentation below for customization help.

`fleet` requires etcd be of version 0.3.0+.
`fleet` requires etcd be of version 0.3.0+ but it is recommended to use etcd 2.0.0+ which supports [TLS authentication][etcd-security].

### TLS Authentication

If your etcd cluster has [TLS authentication][etcd-security] enabled, you will need to configure fleet to use an appropriate TLS keypair. The examples below show how to achieve this:

#### Using systemd Drop-Ins

```ini
[Service]
Environment="FLEET_ETCD_CAFILE=/etc/ssl/etcd/ca.pem"
Environment="FLEET_ETCD_CERTFILE=/etc/ssl/etcd/client.pem"
Environment="FLEET_ETCD_KEYFILE=/etc/ssl/etcd/client-key.pem"
Environment="FLEET_ETCD_SERVERS=https://172.16.0.101:2379,https://172.16.0.102:2379,https://172.16.0.103:2379"
Environment="FLEET_METADATA=hostname=server1"
Environment="FLEET_PUBLIC_IP=172.16.0.101"
```

#### Using CoreOS Cloud Config

```yaml
#cloud-config

coreos:
  fleet:
    etcd_servers: "https://192.0.2.12:2379"
    etcd_cafile: /etc/ssl/etcd/ca.pem
    etcd_certfile: /etc/ssl/etcd/client.pem
    etcd_keyfile: /etc/ssl/etcd/client-key.pem
```

#### Using fleet configuration file

```ini
etcd_servers=["https://192.0.2.12:2379"]
etcd_cafile=/etc/ssl/etcd/ca.pem
etcd_certfile=/etc/ssl/etcd/client.pem
etcd_keyfile=/etc/ssl/etcd/client-key.pem
```

## systemd

@@ -20,15 +58,15 @@ The `fleetctl` client tool uses SSH to interact with a fleet cluster. This means

Authorizing a public SSH key is typically as easy as appending it to the user's `~/.ssh/authorized_keys` file. This may not be true on your system, though. If running CoreOS, use the built-in `update-ssh-keys` utility - it helps manage multiple authorized keys.

To make things incredibly easy, included in the [fleet source][fleetctl-inject-ssh] is a script that will distribute SSH keys across a fleet cluster running on CoreOS. Simply pipe the contents of a public SSH key into the script:
To make things incredibly easy, included in the [fleet source][fleet-inject-ssh] is a script that will distribute SSH keys across a fleet cluster running on CoreOS. Simply pipe the contents of a public SSH key into the script:

```
```sh
cat ~/.ssh/id_rsa.pub | ./fleetctl-inject-ssh.sh simon
```

All but the first argument to `fleetctl-inject-ssh.sh` are passed directly to `fleetctl`.

```
```sh
cat ~/.ssh/id_rsa.pub | ./fleetctl-inject-ssh.sh simon --tunnel 19.12.0.33
```

@@ -40,14 +78,14 @@ The configuration of these interfaces is managed through a [systemd socket unit]

CoreOS ships a socket unit for fleet (`fleet.socket`) which binds to a Unix domain socket, `/var/run/fleet.sock`. The Unix socket is accessible using a tool such as curl (v7.40 or greater): `curl --unix-socket /var/run/fleet.sock http:/fleet/v1/units`.
To serve the fleet API over a network address, simply extend or replace this socket unit.
For example, writing the following [drop-in] to `/etc/systemd/system/fleet.socket.d/30-ListenStream.conf` would enable fleet to be reached over the local port `49153` in addition to `/var/run/fleet.sock`:
For example, writing the following [drop-in][drop-in] to `/etc/systemd/system/fleet.socket.d/30-ListenStream.conf` would enable fleet to be reached over the local port `49153` in addition to `/var/run/fleet.sock`:

```
```ini
[Socket]
ListenStream=127.0.0.1:49153
```

After you've written the file, call `systemctl daemon-reload` to load the new [drop-in], followed by `systemctl stop fleet.service; systemctl restart fleet.socket; systemctl start fleet.service`.
After you've written the file, call `systemctl daemon-reload` to load the new [drop-in][drop-in], followed by `systemctl stop fleet.service; systemctl restart fleet.socket; systemctl start fleet.service`.

Once the socket is running, the fleet API will be available at `http://${ListenStream}/fleet/v1`, where `${ListenStream}` is the value of the `ListenStream` option used in your socket file.
This endpoint is accessible directly using tools such as curl and wget, or you can use fleetctl like so: `fleetctl --endpoint http://${ListenStream} <command>`.
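As a rough illustration of reaching the API over a TCP `ListenStream`, the sketch below builds the URL from the pattern given above and issues a plain HTTP GET. The `apiURL` helper is hypothetical, the address `127.0.0.1:49153` is taken from the drop-in example, and the `units` resource is borrowed from the curl example; nothing here uses fleet's own client library.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// apiURL builds an endpoint of the form http://${ListenStream}/fleet/v1/<resource>.
// The helper name is an assumption for this sketch.
func apiURL(listenStream, resource string) string {
	return fmt.Sprintf("http://%s/fleet/v1/%s", listenStream, resource)
}

func main() {
	url := apiURL("127.0.0.1:49153", "units")

	// A short timeout keeps the example from hanging if fleet is not listening.
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()

	fmt.Println(url, "->", resp.Status)
}
```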
@@ -67,7 +105,7 @@ fleet will look at `/etc/fleet/fleet.conf` for this config file by default. The

Environment variables may also provide configuration options. Options provided in an environment variable will override the corresponding option provided in a config file. To use an environment variable, simply prefix the name of a given option with `FLEET_`, while uppercasing the rest of the name. For example, to set the `etcd_servers` option to 'http://192.0.2.12:2379' when running the fleetd binary:

```
```sh
$ FLEET_ETCD_SERVERS=http://192.0.2.12:2379 /usr/bin/fleetd
```

@@ -92,7 +130,7 @@ Amount of time in seconds to allow a single etcd request before considering it f

Default: 1.0

#### etcd_cafile, etcd_keyfile, etcd_certfile
#### etcd_cafile, etcd_keyfile, etcd_certfile

Provide TLS configuration when SSL certificate authentication is enabled in etcd endpoints

@@ -115,23 +153,31 @@ Default: ""

Comma-delimited key/value pairs that are published for the local machine to the fleet registry. This data can be used directly by a client of fleet to make scheduling decisions. An example set of metadata could look like:

metadata="region=us-west,az=us-west-1"
metadata='region=us-west,az=us-west-1'
metadata=region=us-west,az=us-west-1
```ini
metadata="region=us-west,az=us-west-1"
metadata='region=us-west,az=us-west-1'
metadata=region=us-west,az=us-west-1
```

The value of the metadata option should conform to one of these three forms:

metadata="STRING"
metadata='STRING'
metadata=STRING

```ini
metadata="STRING"
metadata='STRING'
metadata=STRING
```

...while STRING is one of:

yyy[,yyy[,yyy...]]
```ini
yyy[,yyy[,yyy...]]
```

...and yyy is one of:

key=value
```ini
key=value
```

Space and tab characters will be stripped around the equals sign and around each comma. If the same key is defined more than once, the last value overwrites the previous value(s).
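The parsing rules above could be approximated as follows. This is a minimal sketch, not fleet's parser: the `parseMetadata` name is hypothetical, and silently skipping entries without an equals sign or with an empty key is an assumption, since the text does not say how malformed entries are handled.

```go
package main

import (
	"fmt"
	"strings"
)

// parseMetadata parses a value such as `"region=us-west, az = us-west-1"`.
// It strips one pair of surrounding quotes, trims spaces and tabs around
// each comma and equals sign, and lets later keys overwrite earlier ones.
func parseMetadata(raw string) map[string]string {
	raw = strings.TrimSpace(raw)
	if len(raw) >= 2 && (raw[0] == '"' || raw[0] == '\'') && raw[len(raw)-1] == raw[0] {
		raw = raw[1 : len(raw)-1]
	}

	meta := make(map[string]string)
	for _, pair := range strings.Split(raw, ",") {
		kv := strings.SplitN(pair, "=", 2)
		if len(kv) != 2 {
			continue // assumption: ignore entries without '='
		}
		key := strings.Trim(kv[0], " \t")
		val := strings.Trim(kv[1], " \t")
		if key == "" {
			continue // assumption: ignore entries with an empty key
		}
		meta[key] = val // the last value for a repeated key wins
	}
	return meta
}

func main() {
	fmt.Println(parseMetadata(`"region = us-west , az=us-west-1,region=eu-central"`))
}
```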

@@ -149,9 +195,30 @@ Interval in seconds at which the engine should reconcile the cluster schedule in

Default: 2

[etcd]: https://github.com/coreos/docs/blob/master/etcd/getting-started-with-etcd.md
#### token_limit

Maximum number of entries per page returned from API requests.

Default: "100"

### disable_engine

Disable the engine entirely, use with care. You can find more info about this option in [fleet scaling doc][fleet-scale].

Default: false

### disable_watches

Disable the use of etcd watches. Increases scheduling latency. You can find more info about this option in [fleet scaling doc][fleet-scale].

Default: false

[api-doc]: api-v1.md
[fleetctl-inject-ssh]: /scripts/fleetctl-inject-ssh.sh
[config]: /fleet.conf.sample
[etcd]: https://github.com/coreos/docs/blob/master/etcd/getting-started-with-etcd.md
[etcd-security]: https://github.com/coreos/etcd/blob/master/Documentation/security.md
[fleet-inject-ssh]: /scripts/fleetctl-inject-ssh.sh
[fleet-scale]: fleet-scaling.md#implemented-quick-wins
[socket-unit]: http://www.freedesktop.org/software/systemd/man/systemd.socket.html
[config]: /fleet.conf.sample
[drop-in]: https://github.com/coreos/docs/blob/master/os/using-systemd-drop-in-units.md
6 changes: 3 additions & 3 deletions Documentation/fleet-scaling.md
@@ -39,13 +39,13 @@ RPCs between the engine and agent.
this is an expensive operation. The fewer nodes that are engaged in this
election, the better. Possible downside is that if there isn't a leader at
all, the cluster is inoperable. However the (usually) 5 machines running
etcd are also a single point of failure. *See the `--disable-engine` flag.*
etcd are also a single point of failure. *See the `disable_engine` config flag.*

* Making some defaults exported and allow them to be overridden. For instance
fleet's tokenLimit controls how many Units are listed per "page". *See the
`--token-limit` flag.*
`token_limit` config flag.*

* Removing watches from fleet: By removing the watches from fleet we stop
the entire cluster from waking up whenever a new job is to be scheduled.
The downside of this change is that fleet's responsiveness is lower.
*See the `--disable-watches` flag.*
*See the `disable_watches` config flag.*
35 changes: 9 additions & 26 deletions build
@@ -1,41 +1,24 @@
#!/bin/bash -e

# The -X format changed from go1.4 -> go1.5
function go_linker_dashX {
local version=$(go version)
local regex="go([0-9]+).([0-9]+)."
if [[ $version =~ $regex ]]; then
if [ ${BASH_REMATCH[1]} -eq "1" -a ${BASH_REMATCH[2]} -le "4" ]; then
echo "$1 \"$2\""
else
echo "$1=$2"
fi
else
echo "could not determine Go version"
exit 1
fi
}
CDIR=$(cd `dirname $0` && pwd)
cd $CDIR

ORG_PATH="github.com/coreos"
REPO_PATH="${ORG_PATH}/fleet"
VERSION=$(git describe --dirty)
GLDFLAGS="-X $(go_linker_dashX github.com/coreos/fleet/version.Version ${VERSION})"

source build-env

if [ ! -h gopath/src/${REPO_PATH} ]; then
mkdir -p gopath/src/${ORG_PATH}
ln -s ../../../.. gopath/src/${REPO_PATH} || exit 255
mkdir -p gopath/src/${ORG_PATH}
ln -s ../../../.. gopath/src/${REPO_PATH} || exit 255
fi

export GOBIN=${PWD}/bin
export GOPATH=${PWD}/gopath

eval $(go env)

if [ ${GOOS} = "linux" ]; then
echo "Building fleetd..."
CGO_ENABLED=0 go build -o bin/fleetd -a -installsuffix netgo -ldflags "${GLDFLAGS}" ${REPO_PATH}/fleetd
echo "Building fleetd..."
CGO_ENABLED=0 go build -o bin/fleetd -a -installsuffix netgo -ldflags "${GLDFLAGS}" ${REPO_PATH}/fleetd
else
echo "Not on Linux - skipping fleetd build"
echo "Not on Linux - skipping fleetd build"
fi

echo "Building fleetctl..."
4 changes: 3 additions & 1 deletion build-docker
@@ -1,3 +1,5 @@
#!/bin/bash -e

docker run --rm -v $PWD:/opt/fleet -u $(id -u):$(id -g) google/golang:1.4 /bin/bash -c "cd /opt/fleet && ./build"
CDIR=$(cd `dirname $0` && pwd)

docker run --rm -v $CDIR:/opt/fleet -u $(id -u):$(id -g) google/golang:1.4 /bin/bash -c "cd /opt/fleet && ./build"
23 changes: 23 additions & 0 deletions build-env
@@ -0,0 +1,23 @@
# The -X format changed from go1.4 -> go1.5
function go_linker_dashX {
  local version=$(go version)
  local regex="go([0-9]+).([0-9]+)."
  if [[ $version =~ $regex ]]; then
    if [ ${BASH_REMATCH[1]} -eq "1" -a ${BASH_REMATCH[2]} -le "4" ]; then
      echo "$1 \"$2\""
    else
      echo "$1=$2"
    fi
  else
    echo "could not determine Go version"
    exit 1
  fi
}

export GOBIN=${PWD}/bin
export GOPATH=${PWD}/gopath
export GLDFLAGS="-X $(go_linker_dashX github.com/coreos/fleet/version.Version ${VERSION})"
eval $(go env)
export PATH="${GOROOT}/bin:${PATH}"
export FLEETD_BIN="$(pwd)/bin/fleetd"
export FLEETCTL_BIN="$(pwd)/bin/fleetctl"
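For readers more at home in Go than in bash, the version check performed by `go_linker_dashX` might look like the sketch below. The `linkerDashX` function is hypothetical and exists only to illustrate the branch: Go 1.4 and earlier get the `name "value"` form, later versions get `name=value`. Unlike the shell regex, the pattern here escapes the dot and does not require a trailing character.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// goVersionRE extracts major and minor numbers from `go version` output.
var goVersionRE = regexp.MustCompile(`go([0-9]+)\.([0-9]+)`)

// linkerDashX formats the argument that follows -X in the linker flags.
func linkerDashX(goVersion, name, value string) (string, error) {
	m := goVersionRE.FindStringSubmatch(goVersion)
	if m == nil {
		return "", fmt.Errorf("could not determine Go version from %q", goVersion)
	}
	major, err := strconv.Atoi(m[1])
	if err != nil {
		return "", err
	}
	minor, err := strconv.Atoi(m[2])
	if err != nil {
		return "", err
	}
	if major == 1 && minor <= 4 {
		return name + " \"" + value + "\"", nil // pre-1.5 form
	}
	return name + "=" + value, nil // form used from Go 1.5 onward
}

func main() {
	// The version strings and the value v0.0.0 are made-up sample inputs.
	for _, v := range []string{"go version go1.4.3 linux/amd64", "go version go1.6 linux/amd64"} {
		arg, err := linkerDashX(v, "github.com/coreos/fleet/version.Version", "v0.0.0")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("-X", arg)
	}
}
```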
4 changes: 4 additions & 0 deletions client/http.go
@@ -137,3 +137,7 @@ func is404(err error) bool {
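	googerr, ok := err.(*googleapi.Error)
	return ok && googerr.Code == http.StatusNotFound
}

// IsErrorUnitNotFound reports whether err indicates that the requested
// unit does not exist (an HTTP 404 from the API).
func IsErrorUnitNotFound(err error) bool {
	return is404(err)
}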
8 changes: 7 additions & 1 deletion fleetctl/destroy.go
@@ -16,6 +16,8 @@ package main

import (
	"time"

	"github.com/coreos/fleet/client"
)

var cmdDestroyUnit = &Command{
@@ -42,6 +44,10 @@ func runDestroyUnits(args []string) (exit int) {
	for _, v := range units {
		err := cAPI.DestroyUnit(v.Name)
		if err != nil {
			// Ignore 'Unit does not exist' error
			if client.IsErrorUnitNotFound(err) {
				continue
			}
			stderr("Error destroying units: %v", err)
			exit = 1
			continue
@@ -71,7 +77,7 @@ func runDestroyUnits(args []string) (exit int) {
if u == nil {
break
}
time.Sleep(500 * time.Millisecond)
time.Sleep(defaultSleepTime)
}
}
