34 changes: 31 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
@@ -27,6 +27,7 @@ Among the distinguishing factors:
- Built-in HTTP(S) index server to read/write indexes
- Reflinking matching blocks (rather than copying) from seed files if supported by the filesystem (currently only Btrfs and XFS)
- catar archives can be created from standard tar archives, and they can also be extracted to GNU tar format.
- Optional chunk store encryption with XChaCha20-Poly1305 or AES-256-GCM.

## Terminology

@@ -69,7 +70,7 @@ catar archives can also be extracted to GNU tar archive streams. All files in th

## Tool

The tool is provided for convenience. It uses the desync library and makes most features of it available in a consistent fashion. It does not match upsteam casync's syntax exactly, but tries to be similar at least.
The tool is provided for convenience. It uses the desync library and makes most features of it available in a consistent fashion. It does not match upstream casync's syntax exactly, but tries to be similar at least.

### Installation

@@ -233,6 +234,25 @@ If the client configures the HTTP chunk server to be uncompressed (`chunk-server

Compressed and uncompressed chunks can live in the same store and don't interfere with each other. A store that's configured for compressed chunks by configuring it client-side will not see the uncompressed chunks that may be present. `prune` and `verify` too will ignore any chunks written in the other format. Both kinds of chunks can be accessed by multiple clients concurrently and independently.

### Chunk Encryption

Chunks can be encrypted with a symmetric algorithm on a per-store basis. To use encryption, it has to be enabled in the [configuration](#configuration) file and an algorithm needs to be specified. A single instance of desync can use multiple stores at the same time, each with a different (or the same) encryption mode and key. Encrypted chunks are stored with file extensions containing the algorithm and a key identifier. If the password for a store is changed, all existing chunks in it become "invisible" since the extension no longer matches. To change the key, chunks have to be re-encrypted with the new key, either into the same store or, better, into a new one. Create a new store, then either re-chunk the data, or use `desync cache -c <new-store> -s <old-store> <index>` to decrypt the chunks from the old store and re-encrypt them with the new key in the new store.
For all available algorithms, the 256-bit encryption key is derived from the configured password by hashing it with SHA256. Encryption nonces or IVs are generated randomly per chunk, which can weaken encryption in some modes when used on very large chunk stores; see the notes below.

| ID | Algorithm | Key | Nonce/IV | Notes |
|:---:|:---:|:---:|:---:|:---:|
| `xchacha20-poly1305` | XChaCha20-Poly1305 (AEAD) | 256bit | 192bit | Default |
| `aes-256-gcm` | AES 256bit Galois Counter Mode (AEAD) | 256bit | 96bit | Don't use for large chunk stores (>2<sup>32</sup> chunks) |
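The store-size caveat in the table follows from the birthday bound on randomly generated nonces. A back-of-the-envelope sketch (not from the desync documentation itself): for a store with $n$ chunks and $b$-bit random nonces, the probability that any two chunks share a nonce is approximately

$$P_{\text{collision}} \approx \frac{n^2}{2^{b+1}}$$

Keeping this below the commonly used $2^{-32}$ comfort threshold requires $n \lesssim 2^{(b-31)/2}$: about $2^{32}$ chunks for the 96-bit nonce that Go's `cipher.NewGCM` uses for AES-GCM, but an astronomically large store for XChaCha20-Poly1305's 192-bit nonce.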

Chunk extensions in stores are chosen based on compression or encryption settings as follows:

| Compressed | Encrypted | Extension | Example |
|:---:|:---:|:---:|:---:|
| no | no | n/a | `fbef/fbef1a00ced..9280ce78` |
| yes | no | `.cacnk` | `fbef/fbef1a00ced..9280ce78.cacnk` |
| no | yes | `.<algorithm>-<keyID>` | `fbef/fbef1a00ced..9280ce78.aes-256-gcm-635af003` |
| yes | yes | `.cacnk.<algorithm>-<keyID>` | `fbef/fbef1a00ced..9280ce78.cacnk.aes-256-gcm-635af003` |

### Configuration

For most use cases, it is sufficient to use the tool's default configuration not requiring a config file. Having a config file `$HOME/.config/desync/config.json` allows for further customization of timeouts, error retry behaviour or credentials that can't be set via command-line options or environment variables. All values have sensible defaults if unconfigured. Only add configuration for values that differ from the defaults. To view the current configuration, use `desync config`. If no config file is present, this will show the defaults. To create a config file allowing custom values, use `desync config -w` which will write the current configuration to the file, then edit the file.
@@ -242,17 +262,20 @@ Available configuration values:
- `http-timeout` *DEPRECATED, see `store-options.<Location>.timeout`* - HTTP request timeout used in HTTP stores (not S3) in nanoseconds
- `http-error-retry` *DEPRECATED, see `store-options.<Location>.error-retry`* - Number of times to retry failed chunk requests from HTTP stores
- `s3-credentials` - Defines credentials for use with S3 stores. Especially useful if more than one S3 store is used. The key in the config needs to be the URL scheme and host used for the store, excluding the path, but including the port number if used in the store URL. It is also possible to use a [standard aws credentials file](https://docs.aws.amazon.com/cli/latest/userguide/cli-config-files.html) in order to store s3 credentials.
- `store-options` - Allows customization of chunk and index stores, for example comression settings, timeouts, retry behavior and keys. Not all options are applicable to every store, some of these like `timeout` are ignored for local stores. Some of these options, such as the client certificates are overwritten with any values set in the command line. Note that the store location used in the command line needs to match the key under `store-options` exactly for these options to be used. Watch out for trailing `/` in URLs.
- `store-options` - Allows customization of chunk and index stores, for example compression settings, timeouts, retry behavior and keys. Not all options are applicable to every store, some of these like `timeout` are ignored for local stores. Some of these options, such as the client certificates are overwritten with any values set in the command line. Note that the store location used in the command line needs to match the key under `store-options` exactly for these options to be used. Watch out for trailing `/` in URLs.
- `timeout` - Time limit for chunk read or write operation in nanoseconds. Default: 1 minute. If set to a negative value, timeout is infinite.
- `error-retry` - Number of times to retry failed chunk requests. Default: 0.
- `error-retry-base-interval` - Number of nanoseconds to wait before first retry attempt. Retry attempt number N for the same request will wait N times this interval. Default: 0.
- `client-cert` - Cerificate file to be used for stores where the server requires mutual SSL.
- `client-cert` - Certificate file to be used for stores where the server requires mutual SSL.
- `client-key` - Key file to be used for stores where the server requires mutual SSL.
- `ca-cert` - Certificate file containing trusted certs or CAs.
- `trust-insecure` - Trust any certificate presented by the server.
- `skip-verify` - Disables data integrity verification when reading chunks to improve performance. Only recommended when chaining chunk stores with the `chunk-server` command using compressed stores.
- `uncompressed` - Reads and writes uncompressed chunks from/to this store. This can improve performance, especially for local stores or caches. Compressed and uncompressed chunks can coexist in the same store, but only one kind is read or written by one client.
- `http-auth` - Value of the Authorization header in HTTP requests. This could be a bearer token with `"Bearer <token>"` or a Base64-encoded username and password pair for basic authentication like `"Basic dXNlcjpwYXNzd29yZAo="`.
- `encryption` - Must be set to `true` to encrypt chunks in the store.
- `encryption-password` - Encryption password to use for all chunks in the store.
- `encryption-algorithm` - Optional, symmetric encryption algorithm. Default `xchacha20-poly1305`.

#### Example config

@@ -287,6 +310,11 @@ Available configuration values:
},
"/path/to/local/cache": {
"uncompressed": true
},
"/path/to/encrypted/store": {
"encryption": true,
"encryption-algorithm": "xchacha20-poly1305",
"encryption-password": "mystorepassword"
}
}
}
19 changes: 16 additions & 3 deletions cmd/desync/chunkserver.go
@@ -24,6 +24,8 @@ type chunkServerOptions struct {
skipVerifyWrite bool
uncompressed bool
logFile string
encryptionAlg string
encryptionPw string
}

func newChunkServerCommand(ctx context.Context) *cobra.Command {
@@ -68,6 +70,8 @@ needing to restart the server. This can be done under load as well.
flags.BoolVar(&opt.skipVerifyWrite, "skip-verify-write", true, "don't verify chunk data written to this server (faster)")
flags.BoolVarP(&opt.uncompressed, "uncompressed", "u", false, "serve uncompressed chunks")
flags.StringVar(&opt.logFile, "log", "", "request log file or - for STDOUT")
flags.StringVar(&opt.encryptionPw, "encryption-password", "", "serve chunks encrypted with this password")
flags.StringVar(&opt.encryptionAlg, "encryption-algorithm", "xchacha20-poly1305", "encryption algorithm")
addStoreOptions(&opt.cmdStoreOptions, flags)
addServerOptions(&opt.cmdServerOptions, flags)
return cmd
@@ -127,9 +131,18 @@ func runChunkServer(ctx context.Context, opt chunkServerOptions, args []string)
}
defer s.Close()

var converters desync.Converters
if !opt.uncompressed {
converters = desync.Converters{desync.Compressor{}}
// Build the converters. In this case, the "storage" side is what is served
// up by the server towards the client. The StoreOptions struct already has
// logic to build the converters from options so use that instead of repeating
// it here.
converters, err := desync.StoreOptions{
Uncompressed: opt.uncompressed,
Encryption: opt.encryptionPw != "",
EncryptionAlgorithm: opt.encryptionAlg,
EncryptionPassword: opt.encryptionPw,
}.StorageConverters()
if err != nil {
return err
}

handler := desync.NewHTTPHandler(s, opt.writable, opt.skipVerifyWrite, converters, opt.auth)
40 changes: 40 additions & 0 deletions cmd/desync/chunkserver_test.go
@@ -181,3 +181,43 @@ func startChunkServer(t *testing.T, args ...string) (string, context.CancelFunc)
time.Sleep(time.Second)
return addr, cancel
}

func TestChunkServerEncryption(t *testing.T) {
outdir := t.TempDir()

// Start a (writable) server, it'll expect compressed+encrypted chunks over
// the wire while storing them only compressed in the local store
addr, cancel := startChunkServer(t, "-s", outdir, "-w", "--skip-verify-read=false", "--skip-verify-write=false", "--encryption-password", "testpassword")
defer cancel()
store := fmt.Sprintf("http://%s/", addr)

// Build a client config. The client needs to be set up to talk to the HTTP chunk server
// compressed+encrypted. Create a temp JSON config for that HTTP store and load it.
cfgFile = filepath.Join(outdir, "config.json")
cfgFileContent := fmt.Sprintf(`{"store-options": {"%s":{"encryption": true, "encryption-password": "testpassword"}}}`, store)
require.NoError(t, ioutil.WriteFile(cfgFile, []byte(cfgFileContent), 0644))
initConfig()

// Run a "chop" command to send some chunks (encrypted) over HTTP, then have the server
// store them un-encrypted in its local store.
chopCmd := newChopCommand(context.Background())
chopCmd.SetArgs([]string{"-s", store, "testdata/blob1.caibx", "testdata/blob1"})
chopCmd.SetOutput(ioutil.Discard)
_, err := chopCmd.ExecuteC()
require.NoError(t, err)

// Now read it all back over HTTP (again encrypted) and re-assemble the test file
extractFile := filepath.Join(outdir, "blob1")
extractCmd := newExtractCommand(context.Background())
extractCmd.SetArgs([]string{"-s", store, "testdata/blob1.caibx", extractFile})
extractCmd.SetOutput(ioutil.Discard)
_, err = extractCmd.ExecuteC()
require.NoError(t, err)

// Not actually necessary, but for good measure let's compare the blobs
blobIn, err := ioutil.ReadFile("testdata/blob1")
require.NoError(t, err)
blobOut, err := ioutil.ReadFile(extractFile)
require.NoError(t, err)
require.Equal(t, blobIn, blobOut)
}
23 changes: 23 additions & 0 deletions compress.go
Expand Up @@ -21,3 +21,26 @@ func Compress(src []byte) ([]byte, error) {
func Decompress(dst, src []byte) ([]byte, error) {
return decoder.DecodeAll(src, dst)
}

// Compression layer converter. Compresses/decompresses chunk data
// to and from storage. Implements the converter interface.
type Compressor struct{}

var _ converter = Compressor{}

func (d Compressor) toStorage(in []byte) ([]byte, error) {
return Compress(in)
}

func (d Compressor) fromStorage(in []byte) ([]byte, error) {
return Decompress(nil, in)
}

func (d Compressor) equal(c converter) bool {
_, ok := c.(Compressor)
return ok
}

func (d Compressor) storageExtension() string {
return ".cacnk"
}
6 changes: 0 additions & 6 deletions const.go
@@ -137,9 +137,3 @@ var (
CaFormatTableTailMarker: "CaFormatTableTailMarker",
}
)

// CompressedChunkExt is the file extension used for compressed chunks
const CompressedChunkExt = ".cacnk"

// UncompressedChunkExt is the file extension of uncompressed chunks
const UncompressedChunkExt = ""
35 changes: 17 additions & 18 deletions coverter.go → converter.go
@@ -1,5 +1,7 @@
package desync

import "strings"

// Converters are modifiers for chunk data, such as compression or encryption.
// They are used to prepare chunk data for storage, or to read it from storage.
// The order of the conversion layers matters. When plain data is prepared for
@@ -63,6 +65,16 @@ func (s Converters) equal(c Converters) bool {
return true
}

// Extension to be used in storage. Concatenation of converter
// extensions in order (towards storage).
func (s Converters) storageExtension() string {
var ext strings.Builder
for _, layer := range s {
ext.WriteString(layer.storageExtension())
}
return ext.String()
}

// converter is a storage data modifier layer.
type converter interface {
// Convert data from its original form to storage format.
@@ -75,23 +87,10 @@ type converter interface {
// the output may be used for the next conversion layer.
fromStorage([]byte) ([]byte, error)

equal(converter) bool
}

// Compression layer
type Compressor struct{}

var _ converter = Compressor{}

func (d Compressor) toStorage(in []byte) ([]byte, error) {
return Compress(in)
}
// Returns the file extension that should be used for a
// chunk when stored. Usually a concatenation of layers.
storageExtension() string

func (d Compressor) fromStorage(in []byte) ([]byte, error) {
return Decompress(nil, in)
}

func (d Compressor) equal(c converter) bool {
_, ok := c.(Compressor)
return ok
// True if one converter matches another exactly.
equal(converter) bool
}
125 changes: 125 additions & 0 deletions encrypt.go
@@ -0,0 +1,125 @@
package desync

import (
"bytes"
"crypto/aes"
"crypto/cipher"
"crypto/rand"
"crypto/sha256"
"errors"
"fmt"

"golang.org/x/crypto/chacha20poly1305"
)

// xchacha20poly1305 is an encryption layer for chunk storage. It
// encrypts/decrypts to/from storage using XChaCha20-Poly1305 AEAD.
// The key is generated from a passphrase with SHA256.
type xchacha20poly1305 struct {
key []byte
aead cipher.AEAD

// Chunk extension with identifier derived from the key.
extension string
}

var _ converter = xchacha20poly1305{}

func NewXChaCha20Poly1305(passphrase string) (xchacha20poly1305, error) {
key := sha256.Sum256([]byte(passphrase))
keyHash := sha256.Sum256(key[:])
extension := fmt.Sprintf(".xchacha20-poly1305-%x", keyHash[:4])
aead, err := chacha20poly1305.NewX(key[:])
return xchacha20poly1305{key: key[:], aead: aead, extension: extension}, err
}

// encrypt for storage. The nonce is prepended to the data.
func (d xchacha20poly1305) toStorage(in []byte) ([]byte, error) {
out := make([]byte, d.aead.NonceSize(), d.aead.NonceSize()+len(in)+d.aead.Overhead())
nonce := out[:d.aead.NonceSize()]
if _, err := rand.Read(nonce); err != nil {
return nil, err
}
return d.aead.Seal(out, nonce, in, nil), nil
}

// decrypt from storage. The nonce is taken from the start of the
// chunk data. This by itself does not verify integrity. That
// is achieved by the existing chunk validation.
func (d xchacha20poly1305) fromStorage(in []byte) ([]byte, error) {
if len(in) < d.aead.NonceSize() {
return nil, errors.New("no nonce prefix found in chunk, not encrypted or wrong algorithm")
}
nonce := in[:d.aead.NonceSize()]
return d.aead.Open(nil, nonce, in[d.aead.NonceSize():], nil)
}

func (d xchacha20poly1305) equal(c converter) bool {
other, ok := c.(xchacha20poly1305)
if !ok {
return false
}
return bytes.Equal(d.key, other.key)
}

func (d xchacha20poly1305) storageExtension() string {
return d.extension
}

// aes256gcm is an encryption layer for chunk storage. It
// encrypts/decrypts to/from storage using AES 256 GCM.
// The key is generated from a passphrase with SHA256.
type aes256gcm struct {
key []byte
aead cipher.AEAD

// Chunk extension with identifier derived from the key.
extension string
}

var _ converter = aes256gcm{}

func NewAES256GCM(passphrase string) (aes256gcm, error) {
key := sha256.Sum256([]byte(passphrase))
keyHash := sha256.Sum256(key[:])
extension := fmt.Sprintf(".aes-256-gcm-%x", keyHash[:4])
block, err := aes.NewCipher(key[:])
if err != nil {
return aes256gcm{}, err
}
aead, err := cipher.NewGCM(block)
return aes256gcm{key: key[:], aead: aead, extension: extension}, err
}

// encrypt for storage. The nonce is prepended to the data.
func (d aes256gcm) toStorage(in []byte) ([]byte, error) {
out := make([]byte, d.aead.NonceSize(), d.aead.NonceSize()+len(in)+d.aead.Overhead())
nonce := out[:d.aead.NonceSize()]
if _, err := rand.Read(nonce); err != nil {
return nil, err
}
return d.aead.Seal(out, nonce, in, nil), nil
}

// decrypt from storage. The nonce is taken from the start of the
// chunk data. This by itself does not verify integrity. That
// is achieved by the existing chunk validation.
func (d aes256gcm) fromStorage(in []byte) ([]byte, error) {
if len(in) < d.aead.NonceSize() {
return nil, errors.New("no nonce prefix found in chunk, not encrypted or wrong algorithm")
}
nonce := in[:d.aead.NonceSize()]
return d.aead.Open(nil, nonce, in[d.aead.NonceSize():], nil)
}

func (d aes256gcm) equal(c converter) bool {
other, ok := c.(aes256gcm)
if !ok {
return false
}
return bytes.Equal(d.key, other.key)
}

func (d aes256gcm) storageExtension() string {
return d.extension
}