diff --git a/README.md b/README.md index bef0d46..cf465c6 100644 --- a/README.md +++ b/README.md @@ -13,8 +13,8 @@ Among the distinguishing factors: - Supported on MacOS, though there could be incompatibilities when exchanging catar-files between Linux and Mac for example since devices and filemodes differ slightly. \*BSD should work as well but hasn't been tested. Windows supports a subset of commands. - Where the upstream command has chosen to optimize for storage efficiency (f/e, being able to use local files as "seeds", building temporary indexes into them), this command chooses to optimize for runtime performance (maintaining a local explicit chunk store, avoiding the need to reindex) at cost to storage efficiency. - Where the upstream command has chosen to take full advantage of Linux platform features, this client chooses to implement a minimum featureset and, while high-value platform-specific features (such as support for btrfs reflinks into a decompressed local chunk cache) might be added in the future, the ability to build without them on other platforms will be maintained. -- Both, SHA512/256 and SHA256 are supported hash functions. -- Only chunk stores using zstd compression as well uncompressed are supported at this point. +- Both SHA512/256 and SHA256 are supported hash functions. +- Only chunk stores using zstd compression as well as uncompressed are supported at this point. - Supports local stores as well as remote stores (as client) over SSH, SFTP and HTTP - Built-in HTTP(S) chunk server that can proxy multiple local or remote stores and also supports caching and deduplication for concurrent requests. - Drop-in replacement for casync on SSH servers when serving chunks read-only @@ -22,7 +22,7 @@ Among the distinguishing factors: - Supports chunking with the same algorithm used by casync (see `make` command) but executed in parallel. 
Results are identical to what casync produces, same chunks and index files, but with significantly better performance. For example, up to 10x faster than casync if the chunks are already present in the store. If the chunks are new, it heavily depends on I/O, but it's still likely several times faster than casync. - While casync supports very small min chunk sizes, optimizations in desync require min chunk sizes larger than the window size of the rolling hash used (currently 48 bytes). The tool's default chunk sizes match the defaults used in casync, min 16k, avg 64k, max 256k. - Allows FUSE mounting of blob indexes -- S3/GC protocol support to access chunk stores for read operations and some some commands that write chunks +- S3/GC protocol support to access chunk stores for read operations and some commands that write chunks - Stores and retrieves index files from remote index stores such as HTTP, SFTP, Google Storage and S3 - Built-in HTTP(S) index server to read/write indexes - Reflinking matching blocks (rather than copying) from seed files if supported by the filesystem (currently only Btrfs and XFS) @@ -34,7 +34,7 @@ The documentation below uses terms that may not be clear to readers not already - **chunk** - A chunk is a section of data from a file. Typically it's between 16kB and 256kB. Chunks are identified by the SHA512-256 checksum of their uncompressed data. Files are split into several chunks with the `make` command which tries to find chunk boundaries intelligently using the algorithm outlined in this [blog post](http://0pointer.net/blog/casync-a-tool-for-distributing-file-system-images.html). By default, chunks are stored as files compressed with [zstd](https://github.com/facebook/zstd) and extension `.cacnk`. - **chunk store** - Location, either local or remote that stores chunks. In its most basic form, a chunk store can be a local directory, containing chunk files named after the checksum of the chunk. 
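As described above, a chunk's identity is the SHA512/256 checksum of its uncompressed data, and a local store names chunk files after that checksum. A minimal Go sketch of the idea (the `chunkID` helper is illustrative, not desync's actual API):

```go
package main

import (
	"crypto/sha512"
	"encoding/hex"
	"fmt"
)

// chunkID returns the hex-encoded SHA512/256 digest of a chunk's
// uncompressed data. A local store would typically keep the chunk
// in a file named after this ID, with a .cacnk extension when
// compressed.
func chunkID(data []byte) string {
	sum := sha512.Sum512_256(data)
	return hex.EncodeToString(sum[:])
}

func main() {
	id := chunkID([]byte("example chunk data"))
	fmt.Printf("%s.cacnk (%d hex chars)\n", id, len(id))
}
```

Since SHA512/256 truncates the SHA-512 state to 256 bits, the resulting IDs are 64 hex characters, the same length as SHA256 IDs, which is why both hash functions can share the same store layout.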
Other protocols like HTTP, S3, GC, SFTP and SSH are available as well. -- **index** - Indexes are data structures containing references to chunks and their location within a file. An index is a small representation of a much larger file. Given an index and a chunk store, it's possible to re-assemble the large file or make it available via a FUSE mount. Indexes are produced during chunking operations such as the `create` command. The most common file extension for an index is `.caibx`. When catar archives are chunked, the extension `.caidx` is used instead. +- **index** - Indexes are data structures containing references to chunks and their location within a file. An index is a small representation of a much larger file. Given an index and a chunk store, it's possible to re-assemble the large file or make it available via a FUSE mount. Indexes are produced during chunking operations such as the `make` command. The most common file extension for an index is `.caibx`. When catar archives are chunked, the extension `.caidx` is used instead. - **index store** - Index stores are used to keep index files. It could simply be a local directory, or accessed over SFTP, S3, GC or HTTP. - **catar** - Archives of directory trees, similar to what is produced by the `tar` command. These commonly have the `.catar` extension. - **caidx** - Index file of a chunked catar. @@ -42,7 +42,7 @@ The documentation below uses terms that may not be clear to readers not already ## Parallel chunking -One of the significant differences to casync is that desync attempts to make chunking faster by utilizing more CPU resources, chunking data in parallel. Depending on the chosen degree of concurrency, the file is split into N equal parts and each part is chunked independently. While the chunking of each part is ongoing, part1 is trying to align with part2, and part3 is trying to align with part4 and so on. Alignment is achieved once a common split point is found in the overlapping area. 
If a common split point is found, the process chunking the previous part stops, eg. part1 chunker stops, part2 chunker keeps going until it aligns with part3 and so on until all split points have been found. Once all split points have been determined, the file is opened again (N times) to read, compress and store the chunks. While in most cases this process achieves significantly reduced chunking times at the cost of CPU, there are edge cases where chunking is only about as fast as upstream casync (with more CPU usage). This is the case if no split points can be found in the data between min and max chunk size as is the case if most or all of the file consists of 0-bytes. In this situation, the concurrent chunking processes for each part will not align with each other and a lot of effort is wasted. The table below shows how the type of data that is being chunked can influence runtime of each operation. `make` refers to the process of chunking, while `extract` refers to re-assembly of blobs from chunks. +One of the significant differences to casync is that desync attempts to make chunking faster by utilizing more CPU resources, chunking data in parallel. Depending on the chosen degree of concurrency, the file is split into N equal parts and each part is chunked independently. While the chunking of each part is ongoing, part1 is trying to align with part2, and part3 is trying to align with part4 and so on. Alignment is achieved once a common split point is found in the overlapping area. If a common split point is found, the process chunking the previous part stops, e.g. part1 chunker stops, part2 chunker keeps going until it aligns with part3 and so on until all split points have been found. Once all split points have been determined, the file is opened again (N times) to read, compress and store the chunks. 
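The alignment described here works because split points are content-defined: a boundary depends only on the last 48 bytes seen, not on where a chunker started reading, so chunkers that begin at different offsets converge on the same boundaries once their windows cover the same data. A toy buzhash-style sketch of that property, using the same boundary condition as desync's chunker (`hash % discriminator == discriminator-1`); the hash table, constants, and function names are illustrative, not desync's actual algorithm:

```go
package main

import (
	"fmt"
	"math/bits"
)

// windowSize matches the rolling-hash window mentioned above (48 bytes).
const windowSize = 48

// table assigns each byte value a pseudo-random 32-bit pattern. A real
// implementation uses a fixed, well-chosen table; this one is generated
// with a simple xorshift purely for illustration.
var table [256]uint32

func init() {
	s := uint32(2654435761)
	for i := range table {
		s ^= s << 13
		s ^= s >> 17
		s ^= s << 5
		table[i] = s
	}
}

// splitPoints slides a buzhash-style rolling hash over data and records
// every position where the boundary condition holds. Because the hash
// depends only on the last windowSize bytes, two chunkers starting at
// different offsets find identical boundaries once their windows
// overlap -- which is what lets parallel chunkers align.
func splitPoints(data []byte, discriminator uint32) []int {
	if len(data) < windowSize {
		return nil
	}
	var h uint32
	for i := 0; i < windowSize; i++ {
		h = bits.RotateLeft32(h, 1) ^ table[data[i]]
	}
	var pts []int
	for i := windowSize; i < len(data); i++ {
		if h%discriminator == discriminator-1 {
			pts = append(pts, i)
		}
		// Roll forward: rotate, cancel the byte leaving the window
		// (its pattern has been rotated windowSize times by now),
		// and mix in the byte entering the window.
		h = bits.RotateLeft32(h, 1) ^
			bits.RotateLeft32(table[data[i-windowSize]], windowSize) ^
			table[data[i]]
	}
	return pts
}

func main() {
	data := make([]byte, 1<<16)
	for i := range data {
		data[i] = byte(i * 7)
	}
	fmt.Println("boundaries found:", len(splitPoints(data, 64)))
}
```

Note that a file of all 0-bytes never changes the window content, so the hash never varies and no boundary is ever found between min and max chunk size; that is exactly the degenerate case described above where the parallel chunkers cannot align.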
While in most cases this process achieves significantly reduced chunking times at the cost of CPU, there are edge cases where chunking is only about as fast as upstream casync (with more CPU usage). This is the case if no split points can be found in the data between min and max chunk size as is the case if most or all of the file consists of 0-bytes. In this situation, the concurrent chunking processes for each part will not align with each other and a lot of effort is wasted. The table below shows how the type of data that is being chunked can influence runtime of each operation. `make` refers to the process of chunking, while `extract` refers to re-assembly of blobs from chunks. Command | Mostly/All 0-bytes | Typical data ------------ | ------------- | ------------ @@ -54,7 +54,7 @@ extract | Extremely fast - Effectively the speed of a truncate() syscall | Fast Copy-on-write filesystems such as Btrfs and XFS support cloning of blocks between files in order to save disk space as well as improve extraction performance. To utilize this feature, desync uses several seeds to clone sections of files rather than reading the data from chunk-stores and copying it in place: - A built-in seed for Null-chunks (a chunk of Max chunk size containing only 0 bytes). This can significantly reduce disk usage of files with large 0-byte ranges, such as VM images. This will effectively turn an eager-zeroed VM disk into a sparse disk while retaining all the advantages of eager-zeroed disk images. -- A build-in Self-seed. As chunks are being written to the destination file, the file itself becomes a seed. If one chunk, or a series of chunks is used again later in the file, it'll be cloned from the position written previously. This saves storage when the file contains several repetitive sections. +- A built-in Self-seed. As chunks are being written to the destination file, the file itself becomes a seed. 
If one chunk, or a series of chunks is used again later in the file, it'll be cloned from the position written previously. This saves storage when the file contains several repetitive sections. - Seed files and their indexes can be provided when extracting a file. For this feature, it's necessary to already have the index plus its blob on disk. So for example `image-v1.vmdk` and `image-v1.vmdk.caibx` can be used as seed for the extract operation of `image-v2.vmdk`. The amount of additional disk space required to store `image-v2.vmdk` will be the delta between it and `image-v1.vmdk`. ![chunks-from-seeds](doc/seed.png) @@ -63,13 +63,13 @@ Even if cloning is not available, seeds are still useful. `desync` automatically ## Reading and writing tar streams -In addition to packing local filesystem trees into catar archives, it is possible to read a tar archive stream. Various tar formats such as GNU and BSD tar are supported. See [https://golang.org/pkg/archive/tar/](https://golang.org/pkg/archive/tar/) for details on supported formats. When reading from tar archives, the content is no re-ordered and written to the catar in the same order. This may create output files that are different when comparing to using the local filesystem as input since the order depends entirely on how the tar file is created. Since the catar format does not support hardlinks, the input tar stream needs to follow hardlinks for desync to process them correctly. See the `--hard-dereference` option in the tar utility. +In addition to packing local filesystem trees into catar archives, it is possible to read a tar archive stream. Various tar formats such as GNU and BSD tar are supported. See [https://golang.org/pkg/archive/tar/](https://golang.org/pkg/archive/tar/) for details on supported formats. When reading from tar archives, the content is not re-ordered and written to the catar in the same order. 
This may create output files that are different when comparing to using the local filesystem as input since the order depends entirely on how the tar file is created. Since the catar format does not support hardlinks, the input tar stream needs to follow hardlinks for desync to process them correctly. See the `--hard-dereference` option in the tar utility. catar archives can also be extracted to GNU tar archive streams. All files in the output stream are ordered the same as in the catar. ## Tool -The tool is provided for convenience. It uses the desync library and makes most features of it available in a consistent fashion. It does not match upsteam casync's syntax exactly, but tries to be similar at least. +The tool is provided for convenience. It uses the desync library and makes most features of it available in a consistent fashion. It does not match upstream casync's syntax exactly, but tries to be similar at least. ### Installation @@ -95,14 +95,14 @@ cd desync/cmd/desync && go install - `chop` - split a blob according to an existing caibx and store the chunks in a local store - `pull` - serve chunks using the casync protocol over stdin/stdout. Set `CASYNC_REMOTE_PATH=desync` on the client to use it. - `tar` - pack a catar file, optionally chunk the catar and create an index file. -- `untar` - unpack a catar file or an index referencing a catar. Device entries in tar files are unsuppored and `--no-same-owner` and `--no-same-permissions` options are ignored on Windows. +- `untar` - unpack a catar file or an index referencing a catar. Device entries in tar files are unsupported and `--no-same-owner` and `--no-same-permissions` options are ignored on Windows. - `prune` - remove unreferenced chunks from a local, S3 or GC store. Use with caution, can lead to data loss. 
- `verify-index` - verify that an index file matches a given blob -- `chunk-server` - start a HTTP(S) chunk server/store -- `index-server` - start a HTTP(S) index server/store +- `chunk-server` - start an HTTP(S) chunk server/store +- `index-server` - start an HTTP(S) index server/store - `make` - split a blob into chunks and create an index file - `mount-index` - FUSE mount a blob index. Will make the blob available as single file inside the mountpoint. -- `info` - Show information about an index file, such as number of chunks and optionally chunks from an index that a re present in a store +- `info` - Show information about an index file, such as number of chunks and optionally chunks from an index that are present in a store - `inspect-chunks` - Show detailed information about chunks stored in an index file - `mtree` - Print the content of an archive or index in mtree-compatible format. @@ -118,7 +118,7 @@ cd desync/cmd/desync && go install - `-l` Listening address for the HTTP chunk server. Can be used multiple times to run on more than one interface or more than one port. Only supported by the `chunk-server` command. - `-m` Specify the min/avg/max chunk sizes in kb. Only applicable to the `make` command. Defaults to 16:64:256 and for best results the min should be avg/4 and the max should be 4*avg. - `-i` When packing/unpacking an archive, don't create/read an archive file but instead store/read the chunks and use an index file (caidx) for the archive. Only applicable to `tar` and `untar` commands. -- `-t` Trust all certificates presented by HTTPS stores. Allows the use of self-signed certs when using a HTTPS chunk server. +- `-t` Trust all certificates presented by HTTPS stores. Allows the use of self-signed certs when using an HTTPS chunk server. - `--key` Key file in PEM format used for HTTPS `chunk-server` and `index-server` commands. 
Also requires a certificate with `--cert` - `--cert` Certificate file in PEM format used for HTTPS `chunk-server` and `index-server` commands. Also requires `--key`. - `-k` Keep partially assembled files in place when `extract` fails or is interrupted. The command can then be restarted and it'll not have to retrieve completed parts again. Also use this option to write to block devices. @@ -134,12 +134,12 @@ cd desync/cmd/desync && go install ### Caching -The `-c <store>` option can be used to either specify an existing store to act as cache or to populate a new store. Whenever a chunk is requested, it is first looked up in the cache before routing the request to the next (possibly remote) store. Any chunks downloaded from the main stores are added to the cache. In addition, when a chunk is read from the cache and it is a local store, mtime of the chunk is updated to allow for basic garbage collection based on file age. The cache store is expected to be writable. If the cache contains an invalid chunk (checksum does not match the chunk ID), the operation will fail. Invalid chunks are not skipped or removed from the cache automatically. `verfiy -r` can be used to +The `-c <store>` option can be used to either specify an existing store to act as cache or to populate a new store. Whenever a chunk is requested, it is first looked up in the cache before routing the request to the next (possibly remote) store. Any chunks downloaded from the main stores are added to the cache. In addition, when a chunk is read from the cache and it is a local store, mtime of the chunk is updated to allow for basic garbage collection based on file age. The cache store is expected to be writable. If the cache contains an invalid chunk (checksum does not match the chunk ID), the operation will fail. Invalid chunks are not skipped or removed from the cache automatically. `verify -r` can be used to evict bad chunks from a local store or cache.
### Multiple chunk stores -One of the main features of desync is the ability to combine/chain multiple chunk stores of different types and also combine it with a cache store. For example, for a command that reads chunks when assembling a blob, stores can be chained in the command line like so: `-s <store1> -s <store2> -s <store3>`. A chunk will first be requested from `store1`, and if not found there, the request will be routed to `<store2>` and so on. Typically, the fastest chunk store should be listed first to improve performance. It is also possible to combine multiple chunk stores with a cache. In most cases the cache would be a local store, but that is not a requirement. When combining stores and a cache like so: `-s <store1> -s <store2> -c <cache>`, a chunk request will first be routed to the cache store, then to store1 followed by store2. Any chunks that is not yet in the cache will be stored there upon first request. +One of the main features of desync is the ability to combine/chain multiple chunk stores of different types and also combine them with a cache store. For example, for a command that reads chunks when assembling a blob, stores can be chained in the command line like so: `-s <store1> -s <store2> -s <store3>`. A chunk will first be requested from `store1`, and if not found there, the request will be routed to `<store2>` and so on. Typically, the fastest chunk store should be listed first to improve performance. It is also possible to combine multiple chunk stores with a cache. In most cases the cache would be a local store, but that is not a requirement. When combining stores and a cache like so: `-s <store1> -s <store2> -c <cache>`, a chunk request will first be routed to the cache store, then to store1 followed by store2. Any chunk that is not yet in the cache will be stored there upon first request. Not all types of stores support all operations. The table below lists the supported operations on all store types. @@ -147,7 +147,7 @@ Not all types of stores support all operations.
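The cache-first lookup order described above can be sketched as a minimal read-through chain; the `store` interface and the types here are simplifications for illustration, not desync's actual API:

```go
package main

import (
	"errors"
	"fmt"
)

var errChunkMissing = errors.New("chunk missing")

// store is a minimal stand-in for a chunk store; desync's real store
// types are richer than this.
type store interface {
	GetChunk(id string) ([]byte, error)
}

// memStore is an in-memory chunk store used for the sketch.
type memStore map[string][]byte

func (m memStore) GetChunk(id string) ([]byte, error) {
	b, ok := m[id]
	if !ok {
		return nil, errChunkMissing
	}
	return b, nil
}

func (m memStore) StoreChunk(id string, b []byte) { m[id] = b }

// cachedChain tries the cache first, then each store in order, and
// populates the cache on the first hit -- the lookup order described
// for `-s <store1> -s <store2> -c <cache>`.
type cachedChain struct {
	cache  memStore
	stores []store
}

func (c *cachedChain) GetChunk(id string) ([]byte, error) {
	if b, err := c.cache.GetChunk(id); err == nil {
		return b, nil
	}
	for _, s := range c.stores {
		if b, err := s.GetChunk(id); err == nil {
			c.cache.StoreChunk(id, b)
			return b, nil
		}
	}
	return nil, errChunkMissing
}

func main() {
	chain := &cachedChain{
		cache:  memStore{},
		stores: []store{memStore{}, memStore{"0123": []byte("chunk")}},
	}
	b, _ := chain.GetChunk("0123")
	fmt.Printf("got %d bytes; cached: %v\n", len(b), chain.cache["0123"] != nil)
}
```

Listing the fastest store first pays off directly in this scheme: the chain stops at the first store that has the chunk, so slower stores are only consulted for misses.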
The table below lists the suppor | --- | :---: | :---: | :---: | :---: | :---: | | Read chunks | yes | yes | yes | yes | yes | | Write chunks | yes | yes | yes | yes | no | -| Use as cache | yes | yes | yes |yes | no | +| Use as cache | yes | yes | yes | yes | no | | Prune | yes | yes | no | yes | no | | Verify | yes | yes | no | no | no | @@ -157,7 +157,7 @@ Given stores with identical content (same chunks in each), it is possible to gro ### Dynamic store configuration -Some long-running processes, namely `chunk-server` and `mount-index` may require a reconfiguration without having to restart them. This can be achieved by starting them with the `--store-file` options which provides the arguments that are normally passed via command line flags `--store` and `--cache` from a JSON file instead. Once the server is running, a SIGHUP to the process will trigger a reload of the configuration and replace the stores internally without restart. This can be done under load. If the configuration in the file is found to be invalid, and error is printed to STDERR and the reload ignored. The structure of the store-file is as follows: +Some long-running processes, namely `chunk-server` and `mount-index`, may require a reconfiguration without having to restart them. This can be achieved by starting them with the `--store-file` option, which provides the arguments that are normally passed via command line flags `--store` and `--cache` from a JSON file instead. Once the server is running, a SIGHUP to the process will trigger a reload of the configuration and replace the stores internally without restart. This can be done under load. If the configuration in the file is found to be invalid, an error is printed to STDERR and the reload ignored.
The structure of the store-file is as follows: ```json { @@ -169,7 +169,7 @@ Some long-running processes, namely `chunk-server` and `mount-index` may require } ``` -This can be combined with store failover by providing the same syntax as is used in the command-line, for example `{"stores":["/path/to/main|/path/to/backup"]}`, See [Examples](#examples) for details on how to use the `--store-file` option. +This can be combined with store failover by providing the same syntax as is used in the command-line, for example `{"stores":["/path/to/main|/path/to/backup"]}`, see [Examples](#examples) for details on how to use the `--store-file` option. ### Remote indexes @@ -223,7 +223,7 @@ s3+https://example.com/bucket/prefix?lookup=auto ### Compressed vs Uncompressed chunk stores -By default, desync reads and writes chunks in compressed form to all supported stores. This is in line with upstream casync's goal of storing in the most efficient way. It is however possible to change this behavior by providing desync with a config file (see Configuration section below). Disabling compression and store chunks uncompressed may reduce latency in some use-cases and improve performance. desync supports reading and writing uncompressed chunks to SFTP, S3, HTTP and local stores and caches. If more than one store is used, each of those can be configured independently, for example it's possible to read compressed chunks from S3 while using a local uncompressed cache for best performance. However, care needs to be taken when using the `chunk-server` command and building chains of chunk store proxies to avoid shifting the decompression load onto the server (it's possible this is actually desirable). +By default, desync reads and writes chunks in compressed form to all supported stores. This is in line with upstream casync's goal of storing in the most efficient way. It is however possible to change this behavior by providing desync with a config file (see Configuration section below). 
Disabling compression and storing chunks uncompressed may reduce latency in some use-cases and improve performance. desync supports reading and writing uncompressed chunks to SFTP, S3, HTTP and local stores and caches. If more than one store is used, each of those can be configured independently, for example it's possible to read compressed chunks from S3 while using a local uncompressed cache for best performance. However, care needs to be taken when using the `chunk-server` command and building chains of chunk store proxies to avoid shifting the decompression load onto the server (it's possible this is actually desirable). In the setup below, a client reads chunks from an HTTP chunk server which itself gets chunks from S3. @@ -299,7 +299,7 @@ Available configuration values: } ``` -#### Example aws credentials +#### Example AWS credentials ```ini [default] @@ -481,13 +481,13 @@ Start a chunk server with a store-file, this allows the configuration to be re-r ```text # Create store file -echo '{"stores": ["http://192.168.1.1/"], "cache": "/tmp/cache"}` > stores.json +echo '{"stores": ["http://192.168.1.1/"], "cache": "/tmp/cache"}' > stores.json # Start the server desync chunk-server --store-file stores.json -l :8080 # Modify -echo '{"stores": ["http://192.168.1.2/"], "cache": "/tmp/cache"}` > stores.json +echo '{"stores": ["http://192.168.1.2/"], "cache": "/tmp/cache"}' > stores.json # Reload killall -1 desync @@ -543,17 +543,17 @@ FUSE mount a chunked and remote index file. First a (small) index file is read f desync cat -s http://192.168.1.1/store http://192.168.1.2/small.caibx | desync mount-index -s http://192.168.1.1/store - /mnt/point ``` -Long-running FUSE mount that may need to have its store setup changed without unmounting. This can be done by using the `--store-file` option rather than speicifying store+cache in the command line. The process will then reload the file when a SIGHUP is sent. 
+Long-running FUSE mount that may need to have its store setup changed without unmounting. This can be done by using the `--store-file` option rather than specifying store+cache in the command line. The process will then reload the file when a SIGHUP is sent. ```text # Create the store file -echo '{"stores": ["http://192.168.1.1/"], "cache": "/tmp/cache"}` > stores.json +echo '{"stores": ["http://192.168.1.1/"], "cache": "/tmp/cache"}' > stores.json # Start the mount desync mount-index --store-file stores.json index.caibx /some/mnt # Modify the store setup -echo '{"stores": ["http://192.168.1.2/"], "cache": "/tmp/cache"}` > stores.json +echo '{"stores": ["http://192.168.1.2/"], "cache": "/tmp/cache"}' > stores.json # Reload killall -1 desync @@ -595,7 +595,7 @@ desync --config /path/to/client.json extract -s http://127.0.0.1:8080/ /path/to/ HTTPS chunk server using key and certificate signed by custom CA. ```text -# Building the CA and server certficate +# Building the CA and server certificate openssl genrsa -out ca.key 4096 openssl req -x509 -new -nodes -key ca.key -sha256 -days 3650 -out ca.crt openssl genrsa -out server.key 2048 @@ -612,7 +612,7 @@ desync extract --ca-cert ca.crt -s https://hostname:8443/ image.iso.caibx image. HTTPS chunk server with client authentication (mutual-TLS). ```text -# Building the CA, server and client certficates +# Building the CA, server and client certificates openssl genrsa -out ca.key 4096 openssl req -x509 -new -nodes -key ca.key -sha256 -days 3650 -out ca.crt openssl genrsa -out server.key 2048 diff --git a/assemble.go b/assemble.go index ce9c81e..ca13980 100644 --- a/assemble.go +++ b/assemble.go @@ -7,7 +7,7 @@ import ( "os" ) -// InvalidSeedAction represent the action that we will take if a seed +// InvalidSeedAction represents the action that we will take if a seed // happens to be invalid. 
There are currently three options: // - fail with an error // - skip the invalid seed and try to continue @@ -108,7 +108,7 @@ func AssembleFile(ctx context.Context, name string, idx Index, s Store, seeds [] ChunksTotal: len(idx.Chunks), } - // Determine is the target exists and create it if not + // Determine if the target exists and create it if not info, err := os.Stat(name) switch { case os.IsNotExist(err): // File doesn't exist yet => create it @@ -147,7 +147,7 @@ func AssembleFile(ctx context.Context, name string, idx Index, s Store, seeds [] defer ns.close() seeds = append([]Seed{ns}, seeds...) - // Start a self-seed which will become usable once chunks are written contigously + // Start a self-seed which will become usable once chunks are written contiguously // beginning at position 0. There is no need to add this to the seeds list because // when we create a plan it will be empty. ss, err := newSelfSeed(name, idx) diff --git a/assemble_test.go b/assemble_test.go index 99427bf..025b8b6 100644 --- a/assemble_test.go +++ b/assemble_test.go @@ -81,7 +81,7 @@ func TestExtract(t *testing.T) { } os.Remove(out1.Name()) - // This one is a complete file matching what we exepct at the end + // This one is a complete file matching what we expect at the end out2, err := ioutil.TempFile("", "out2") if err != nil { t.Fatal(err) diff --git a/cache.go b/cache.go index 5d7de6e..6ce71c6 100644 --- a/cache.go +++ b/cache.go @@ -62,7 +62,7 @@ func (c Cache) Close() error { return c.s.Close() } -// New cache which GetChunk() function will return ChunkMissing error instead of ChunkInvalid +// RepairableCache is a cache whose GetChunk() function will return ChunkMissing error instead of ChunkInvalid // so caller can redownload invalid chunk from store type RepairableCache struct { l WriteStore diff --git a/chunk.go b/chunk.go index 9bda441..c18ca0c 100644 --- a/chunk.go +++ b/chunk.go @@ -41,7 +41,7 @@ func NewChunkWithID(id ChunkID, b []byte, skipVerify bool) (*Chunk, error) { } 
// NewChunkFromStorage builds a new chunk from data that is not in plain format. -// It uses raw storage format from it source and the modifiers are used to convert +// It uses raw storage format from its source and the modifiers are used to convert // into plain data as needed. func NewChunkFromStorage(id ChunkID, b []byte, modifiers Converters, skipVerify bool) (*Chunk, error) { c := &Chunk{id: id, storage: b, converters: modifiers} diff --git a/chunker.go b/chunker.go index 802386f..1dacca5 100644 --- a/chunker.go +++ b/chunker.go @@ -190,12 +190,12 @@ func (c *Chunker) Next() (uint64, []byte, error) { pos++ - // didn't find a boundry before reaching the max? + // didn't find a boundary before reaching the max? if pos >= m { return c.split(pos, nil) } - // Did we find a boundry? + // Did we find a boundary? if c.hValue%c.hDiscriminator == c.hDiscriminator-1 { return c.split(pos, nil) } @@ -247,7 +247,7 @@ func (c *Chunker) Avg() uint64 { return c.avg } // Max returns the maximum chunk size func (c *Chunker) Max() uint64 { return c.max } -// Hash implements the rolling hash algorithm used to find chunk bounaries +// Hash implements the rolling hash algorithm used to find chunk boundaries // in a stream of bytes. type Hash struct { value uint32 diff --git a/chunkstorage.go b/chunkstorage.go index 4de76bb..16abe52 100644 --- a/chunkstorage.go +++ b/chunkstorage.go @@ -5,7 +5,7 @@ import ( ) // ChunkStorage stores chunks in a writable store. It can be safely used by multiple goroutines and -// contains an internal cache of what chunks have been store previously. +// contains an internal cache of what chunks have been stored previously. 
type ChunkStorage struct { sync.Mutex ws WriteStore diff --git a/cmd/desync/chop.go b/cmd/desync/chop.go index 1d31e76..21eebcc 100644 --- a/cmd/desync/chop.go +++ b/cmd/desync/chop.go @@ -23,11 +23,11 @@ func newChopCommand(ctx context.Context) *cobra.Command { cmd := &cobra.Command{ Use: "chop <index> <file>", - Short: "Reads chunks from a file according to an index", + Short: "Read chunks from a file according to an index", Long: `Reads the index and extracts all referenced chunks from the file into a store, local or remote. -Does not modify the input file or index in any. It's used to populate a chunk +Does not modify the input file or index in any way. It's used to populate a chunk store by chopping up a file according to an existing index. To exclude chunks that are known to exist in the target store already, use --ignore which will skip any chunks from the given index. The same can be achieved by providing the diff --git a/cmd/desync/chunkserver.go b/cmd/desync/chunkserver.go index 6791ad4..bbab6a4 100644 --- a/cmd/desync/chunkserver.go +++ b/cmd/desync/chunkserver.go @@ -36,12 +36,12 @@ func newChunkServerCommand(ctx context.Context) *cobra.Command { reading from multiple local or remote stores as well as a local cache. If --cert and --key are provided, the server will serve over HTTPS. The -w option enables writing to this store, but this is only allowed when just one upstream -chunk store is provided. The option --skip-verify-write disables validation of -chunks written to this server which bypasses checksum validation as well as -the necessary decompression step to calculate it to improve performance. If -u -is used, only uncompressed chunks are being served (and accepted). If the -upstream store serves compressed chunks, everything will have to be decompressed -server-side so it's better to also read from uncompressed upstream stores. +chunk store is provided.
The option --skip-verify-write disables hash validation +of chunks written to this server, avoiding the decompression step needed to +calculate checksums, which improves performance. If -u is used, only uncompressed +chunks are served (and accepted). If the upstream store serves compressed chunks, +everything will have to be decompressed server-side, so it's better to also read +from uncompressed upstream stores. While --concurrency does not limit the number of clients that can be served concurrently, it does influence connection pools to remote upstream stores and diff --git a/cmd/desync/chunkserver_test.go b/cmd/desync/chunkserver_test.go index a984095..7237017 100644 --- a/cmd/desync/chunkserver_test.go +++ b/cmd/desync/chunkserver_test.go @@ -168,7 +168,7 @@ func startChunkServer(t *testing.T, args ...string) (string, context.CancelFunc) // Flush any handlers that were registered in the default mux before http.DefaultServeMux = &http.ServeMux{} - // Start the server in a gorountine. Cancel the context when done + // Start the server in a goroutine. Cancel the context when done ctx, cancel := context.WithCancel(context.Background()) cmd := newChunkServerCommand(ctx) cmd.SetArgs(append(args, "-l", addr)) diff --git a/cmd/desync/config.go b/cmd/desync/config.go index 448f137..e43c1d7 100644 --- a/cmd/desync/config.go +++ b/cmd/desync/config.go @@ -99,7 +99,7 @@ func newConfigCommand(ctx context.Context) *cobra.Command { Short: "Show or write config file", Long: `Shows the current internal configuration settings, either the defaults, the values from $HOME/.config/desync/config.json or the specified config file.
The -output can be used to create a custom config file writing it to the specified file +output can be used to create a custom config file by writing it to the specified file or $HOME/.config/desync/config.json by default.`, Example: ` desync config desync --config desync.json config -w`, @@ -138,14 +138,14 @@ func runConfig(ctx context.Context, write bool) error { return err } -// Global config in the main packe defining the defaults. Those can be +// Global config in the main package defining the defaults. Those can be // overridden by loading a config file or in the command line. var cfg Config var cfgFile string // Look for $HOME/.config/desync and if present, load into the global config // instance. Values defined in the file will be set accordingly, while anything -// that's not in the file will retain it's default values. +// that's not in the file will retain its default values. func initConfig() { var defaultLocation bool if cfgFile == "" { diff --git a/cmd/desync/extract.go b/cmd/desync/extract.go index 0834b72..e7e17b4 100644 --- a/cmd/desync/extract.go +++ b/cmd/desync/extract.go @@ -42,9 +42,9 @@ set the path by writing the index file path, followed by a colon and the data pa If several seed files and indexes are available, the -seed-dir option can be used to automatically select all .caibx files in a directory as seeds. Use '-' to read the index from STDIN. If a seed is invalid, by default the extract operation will be -aborted. With the -skip-invalid-seeds, the invalid seeds will be discarded and the -extraction will continue without them. Otherwise with the -regenerate-invalid-seeds, -the eventual invalid seed indexes will be regenerated, in memory, by using the +aborted. With --skip-invalid-seeds, the invalid seeds will be discarded and the +extraction will continue without them. 
Otherwise with --regenerate-invalid-seeds, +any invalid seed indexes will be regenerated, in memory, by using the available data, and neither data nor indexes will be changed on disk. Also, if the seed changes while processing, its invalid chunks will be taken from the self seed, or the store, instead of aborting.`, diff --git a/cmd/desync/indexserver.go b/cmd/desync/indexserver.go index 9215e76..57e3c5a 100644 --- a/cmd/desync/indexserver.go +++ b/cmd/desync/indexserver.go @@ -32,7 +32,7 @@ func newIndexServerCommand(ctx context.Context) *cobra.Command { Use: "index-server", Short: "Server for indexes over HTTP(S)", Long: `Starts an HTTP index server that can be used as remote store. It supports -reading from a single local or a proxying to a remote store. +reading from a single local store or proxying to a remote store. If --cert and --key are provided, the server will serve over HTTPS. The -w option enables writing to this store.`, Example: ` desync index-server -s sftp://192.168.1.1/indexes -l :8080`, @@ -127,7 +127,7 @@ func serve(ctx context.Context, opt cmdServerOptions, addresses ...string) error return err } if ok := certPool.AppendCertsFromPEM(b); !ok { - return errors.New("no client CA certficates found in client-ca file") + return errors.New("no client CA certificates found in client-ca file") } tlsConfig.ClientCAs = certPool } diff --git a/cmd/desync/indexserver_test.go b/cmd/desync/indexserver_test.go index 34d9e8c..652d006 100644 --- a/cmd/desync/indexserver_test.go +++ b/cmd/desync/indexserver_test.go @@ -81,7 +81,7 @@ func startIndexServer(t *testing.T, args ...string) (string, context.CancelFunc) // Flush any handlers that were registered in the default mux before http.DefaultServeMux = &http.ServeMux{} - // Start the server in a gorountine. Cancel the context when done + // Start the server in a goroutine. 
Cancel the context when done ctx, cancel := context.WithCancel(context.Background()) cmd := newIndexServerCommand(ctx) cmd.SetArgs(append(args, "-l", addr)) diff --git a/cmd/desync/mount-index.go b/cmd/desync/mount-index.go index 73d61ef..9ac7003 100644 --- a/cmd/desync/mount-index.go +++ b/cmd/desync/mount-index.go @@ -41,7 +41,7 @@ All chunks that are accessed by the mount are retrieved from the store and writt the file as read operations are performed. Once all chunks have been accessed, the COR file is fully populated. On termination, a .state file is written containing information about which chunks of the index have or have not been read. A state file is -only valid for a one cache-file and one index. When re-using it with a different index, +only valid for one cache file and one index. When re-using it with a different index, data corruption can occur. This command supports the --store-file option which can be used to define the stores @@ -64,7 +64,7 @@ needing to restart the server. This can be done under load as well. 
flags.StringVarP(&opt.corFile, "cor-file", "", "", "use a copy-on-read sparse file as cache") flags.StringVarP(&opt.StateSaveFile, "cor-state-save", "", "", "file to store the state for copy-on-read") flags.StringVarP(&opt.StateInitFile, "cor-state-init", "", "", "copy-on-read state init file") - flags.IntVarP(&opt.StateInitConcurrency, "cor-init-n", "", 10, "number of gorooutines to use for initialization (with --cor-state-init)") + flags.IntVarP(&opt.StateInitConcurrency, "cor-init-n", "", 10, "number of goroutines to use for initialization (with --cor-state-init)") addStoreOptions(&opt.cmdStoreOptions, flags) return cmd } diff --git a/cmd/desync/options.go b/cmd/desync/options.go index b1f39c8..99d302b 100644 --- a/cmd/desync/options.go +++ b/cmd/desync/options.go @@ -94,7 +94,7 @@ func (o cmdServerOptions) validate() error { func addServerOptions(o *cmdServerOptions, f *pflag.FlagSet) { f.StringVar(&o.cert, "cert", "", "cert file in PEM format, requires --key") f.StringVar(&o.key, "key", "", "key file in PEM format, requires --cert") - f.BoolVar(&o.mutualTLS, "mutual-tls", false, "require valid client certficate") + f.BoolVar(&o.mutualTLS, "mutual-tls", false, "require valid client certificate") f.StringVar(&o.clientCA, "client-ca", "", "acceptable client certificate or CA") f.StringVar(&o.auth, "authorization", "", "expected value of the authorization header in requests") } diff --git a/cmd/desync/tar.go b/cmd/desync/tar.go index 79cb555..f614e94 100644 --- a/cmd/desync/tar.go +++ b/cmd/desync/tar.go @@ -40,7 +40,7 @@ less disk space is required as no intermediary catar is created. There can however be a difference in performance depending on file size. By default, input is read from local disk. Using --input-format=tar, -the input can be a tar file or stream to STDIN with '-'. +the input can be a tar file or a stream from STDIN with '-'. 
`, Example: ` desync tar documents.catar $HOME/Documents desync tar -i -s /path/to/local pics.caidx $HOME/Pictures`, diff --git a/cmd/desync/verifyindex.go b/cmd/desync/verifyindex.go index 0d3c3f4..0d5578c 100644 --- a/cmd/desync/verifyindex.go +++ b/cmd/desync/verifyindex.go @@ -16,7 +16,7 @@ func newVerifyIndexCommand(ctx context.Context) *cobra.Command { cmd := &cobra.Command{ Use: "verify-index <index> <file>", - Short: "Verifies an index matches a file", + Short: "Verify an index matches a file", Long: `Verifies an index file matches the content of a blob. Use '-' to read the index from STDIN.`, Example: ` desync verify-index sftp://192.168.1.1/myIndex.caibx largefile.bin`, diff --git a/consoleindex.go b/consoleindex.go index da6527b..a2ddf59 100644 --- a/consoleindex.go +++ b/consoleindex.go @@ -26,7 +26,7 @@ func (s ConsoleIndexStore) GetIndex(string) (i Index, e error) { return IndexFromReader(os.Stdin) } -// StoreIndex writes the provided indes to STDOUT. The name is ignored. +// StoreIndex writes the provided index to STDOUT. The name is ignored. func (s ConsoleIndexStore) StoreIndex(name string, idx Index) error { _, err := idx.WriteTo(os.Stdout) return err diff --git a/coverter.go b/coverter.go index a4692ea..23d78fe 100644 --- a/coverter.go +++ b/coverter.go @@ -38,7 +38,7 @@ func (s Converters) fromStorage(in []byte) ([]byte, error) { return b, nil } -// Returns true is conversion involves compression. Typically +// Returns true if conversion involves compression. Typically // used to determine the correct file-extension. func (s Converters) hasCompression() bool { for _, layer := range s { @@ -65,12 +65,12 @@ func (s Converters) equal(c Converters) bool { // converter is a storage data modifier layer. type converter interface { - // Convert data from it's original form to storage format. + // Convert data from its original form to storage format. // The input could be plain data, or the output of a prior // converter.
toStorage([]byte) ([]byte, error) - // Convert data from it's storage format towards it's plain + // Convert data from its storage format towards its plain // form. The input could be encrypted or compressed, while // the output may be used for the next conversion layer. fromStorage([]byte) ([]byte, error) diff --git a/dedupqueue.go b/dedupqueue.go index a012934..53b847a 100644 --- a/dedupqueue.go +++ b/dedupqueue.go @@ -10,7 +10,7 @@ var _ Store = &DedupQueue{} // DedupQueue wraps a store and provides deduplication of incoming chunk requests. This is useful when // a burst of requests for the same chunk is received and the chunk store serving those is slow. With // the DedupQueue wrapper, concurrent requests for the same chunk will result in just one request to the -// upstread store. Implements the Store interface. +// upstream store. Implements the Store interface. type DedupQueue struct { store Store mu sync.Mutex @@ -126,7 +126,7 @@ func (r *request) wait() (interface{}, error) { return r.data, r.err } -// Set the result data and marks this request as comlete. +// Set the result data and marks this request as complete. func (r *request) markDone(data interface{}, err error) { r.data = data r.err = err diff --git a/digest.go b/digest.go index 9de1633..87ebd06 100644 --- a/digest.go +++ b/digest.go @@ -16,13 +16,13 @@ type HashAlgorithm interface { Algorithm() crypto.Hash } -// SHA512-256 hashing algoritm for Digest. +// SHA512-256 hashing algorithm for Digest. type SHA512256 struct{} func (h SHA512256) Sum(data []byte) [32]byte { return sha512.Sum512_256(data) } func (h SHA512256) Algorithm() crypto.Hash { return crypto.SHA512_256 } -// SHA256 hashing algoritm for Digest. +// SHA256 hashing algorithm for Digest. 
type SHA256 struct{} func (h SHA256) Sum(data []byte) [32]byte { return sha256.Sum256(data) } diff --git a/doc.go b/doc.go index 2d56f42..7d6b4cf 100644 --- a/doc.go +++ b/doc.go @@ -1,7 +1,7 @@ /* Package desync implements data structures, protocols and features of https://github.com/systemd/casync in order to allow support for additional -platforms and improve performace by way of concurrency and caching. +platforms and improve performance by way of concurrency and caching. Supports the following casync data structures: catar archives, caibx/caidx index files, castr stores (local or remote). diff --git a/failover.go b/failover.go index c507bf4..36e5f50 100644 --- a/failover.go +++ b/failover.go @@ -91,7 +91,7 @@ func (g *FailoverGroup) current() (Store, int) { return g.stores[g.active], g.active } -// Fail over to the next available store after recveiving an error from i (the active). We +// Fail over to the next available store after receiving an error from i (the active). We // need i to know which store returned the error as there could be failures from concurrent // requests. Another request could have initiated the failover already. So ignore if i is not // (no longer) the active store. diff --git a/fileseed.go b/fileseed.go index 6962a8f..b6f0e80 100644 --- a/fileseed.go +++ b/fileseed.go @@ -40,7 +40,7 @@ func NewIndexSeed(dstFile string, srcFile string, index Index) (*FileSeed, error // and a nil SeedSegment. func (s *FileSeed) LongestMatchWith(chunks []IndexChunk) (int, SeedSegment) { s.mu.RLock() - // isInvalid can be concurrently read or wrote. Use a mutex to avoid a race + // isInvalid can be concurrently read or written. Use a mutex to avoid a race if len(chunks) == 0 || len(s.index.Chunks) == 0 || s.isInvalid { return 0, nil } diff --git a/gcs.go b/gcs.go index cf56d34..f4e6a30 100644 --- a/gcs.go +++ b/gcs.go @@ -77,7 +77,7 @@ func (s GCStoreBase) String() string { return s.Location } -// Close the GCS base store. 
NOP opertation but needed to implement the store interface. +// Close the GCS base store. NOP operation but needed to implement the store interface. func (s GCStoreBase) Close() error { return nil } // NewGCStore creates a chunk store with Google Storage backing. The URL diff --git a/index.go b/index.go index 02d74f5..8d9f1c3 100644 --- a/index.go +++ b/index.go @@ -76,7 +76,7 @@ func IndexFromReader(r io.Reader) (c Index, err error) { c.Chunks[i].Start = lastOffset c.Chunks[i].Size = r.Offset - lastOffset lastOffset = r.Offset - // Check the max size of the chunk only. The min apperently doesn't apply + // Check the max size of the chunk only. The min apparently doesn't apply // to the last chunk. if c.Chunks[i].Size > c.Index.ChunkSizeMax { return c, fmt.Errorf("chunk size %d is larger than maximum %d", c.Chunks[i].Size, c.Index.ChunkSizeMax) @@ -180,7 +180,7 @@ func ChunkStream(ctx context.Context, c Chunker, ws WriteStore, n int) (Index, e // Feed the workers, stop if there are any errors. To keep the index list in // order, we calculate the checksum here before handing them over to the - // workers for compression and storage. That could probablybe optimized further + // workers for compression and storage. That could probably be optimized further var num int // chunk #, so we can re-assemble the index in the right order later loop: for { diff --git a/ioctl_linux.go b/ioctl_linux.go index 206584e..1f67d4a 100644 --- a/ioctl_linux.go +++ b/ioctl_linux.go @@ -21,7 +21,7 @@ const blkGetSize64 = 0x80081272 const fiCloneRange = 0x4020940d // CanClone tries to determine if the filesystem allows cloning of blocks between -// two files. It'll create two tempfiles in the same dirs and attempt to perfom +// two files. It'll create two tempfiles in the same dirs and attempt to perform // a 0-byte long block clone. If that's successful it'll return true. 
func CanClone(dstFile, srcFile string) bool { dst, err := ioutil.TempFile(filepath.Dir(dstFile), ".tmp") diff --git a/local.go b/local.go index c337a65..518459f 100644 --- a/local.go +++ b/local.go @@ -121,7 +121,7 @@ func (s LocalStore) Verify(ctx context.Context, n int, repair bool, w io.Writer) }() } - // Go trough all chunks underneath Base, filtering out other files, then feed + // Go through all chunks underneath Base, filtering out other files, then feed // the IDs to the workers err := filepath.Walk(s.Base, func(path string, info os.FileInfo, err error) error { // See if we're meant to stop @@ -167,7 +167,7 @@ func (s LocalStore) Verify(ctx context.Context, n int, repair bool, w io.Writer) // Prune removes any chunks from the store that are not contained in a list // of chunks func (s LocalStore) Prune(ctx context.Context, ids map[ChunkID]struct{}) error { - // Go trough all chunks underneath Base, filtering out other directories and files + // Go through all chunks underneath Base, filtering out other directories and files err := filepath.Walk(s.Base, func(path string, info os.FileInfo, err error) error { // See if we're meant to stop select { diff --git a/localfs.go b/localfs.go index 2aac134..6943ed0 100644 --- a/localfs.go +++ b/localfs.go @@ -23,7 +23,7 @@ type LocalFS struct { sErr error } -// LocalFSOptions influence the behavior of the filesystem when reading from or writing too it. +// LocalFSOptions influence the behavior of the filesystem when reading from or writing to it. type LocalFSOptions struct { // Only used when reading from the filesystem. Will only return // files from the same device as the first read operation. @@ -35,7 +35,7 @@ type LocalFSOptions struct { // Ignore the incoming permissions when writing files. Use the current default instead. NoSamePermissions bool - // Reads all timestamps as zero. Used in tar operations to avoid unneccessary changes. + // Reads all timestamps as zero. 
Used in tar operations to avoid unnecessary changes. NoTime bool } diff --git a/make.go b/make.go index dfd4820..92fe24b 100644 --- a/make.go +++ b/make.go @@ -12,7 +12,7 @@ import ( // IndexFromFile chunks a file in parallel and returns an index. It does not // store chunks! Each concurrent chunker starts filesize/n bytes apart and -// splits independently. Each chunk worker tries to sync with it's next +// splits independently. Each chunk worker tries to sync with its next // neighbor and if successful stops processing letting the next one continue. // The main routine reads and assembles a list of (confirmed) chunks from the // workers, starting with the first worker. @@ -160,7 +160,7 @@ type pChunker struct { chunker Chunker // starting position in the stream for this worker, needed to calculate - // the absolute position of every boundry that is returned + // the absolute position of every boundary that is returned offset uint64 once sync.Once @@ -197,7 +197,7 @@ func (c *pChunker) start(ctx context.Context) { if len(b) == 0 { // TODO: If this worker reached the end of the stream and it's not the // last one, we should probably stop all following workers. Meh, shouldn't - // be happening for large file or save significant CPU for small ones. + // be happening for large files or save significant CPU for small ones. c.eof = true return } diff --git a/mount-index.go b/mount-index.go index ea44f28..ea6eded 100644 --- a/mount-index.go +++ b/mount-index.go @@ -22,7 +22,7 @@ type MountFS interface { } // IndexMountFS is used to FUSE mount an index file (as a blob, not an archive). -// It present a single file underneath the mountpoint. +// It presents a single file underneath the mountpoint. 
type IndexMountFS struct { fs.Inode diff --git a/mtreefs.go b/mtreefs.go index e09b638..ddd63e5 100644 --- a/mtreefs.go +++ b/mtreefs.go @@ -86,7 +86,7 @@ func (fs MtreeFS) CreateDevice(n NodeDevice) error { return nil } -// Converts filenames into an mtree-compatible format following the rules outined in mtree(5): +// Converts filenames into an mtree-compatible format following the rules outlined in mtree(5): // // When encoding file or pathnames, any backslash character or character outside of the 95 // printable ASCII characters must be encoded as a backslash followed by three octal digits. diff --git a/nullchunk.go b/nullchunk.go index d04c555..6b31f8e 100644 --- a/nullchunk.go +++ b/nullchunk.go @@ -5,7 +5,7 @@ package desync // the chunking algorithm does not produce split boundaries, which results // in many chunks of 0-bytes of size MAX (max chunk size). The NullChunk can be // used to make requesting this kind of chunk more efficient by serving it -// from memory, rather that request it from disk or network and decompress +// from memory, rather than request it from disk or network and decompress // it repeatedly. type NullChunk struct { Data []byte @@ -13,7 +13,7 @@ type NullChunk struct { } // NewNullChunk returns an initialized chunk consisting of 0-bytes of 'size' -// which must mach the max size used in the index to be effective +// which must match the max size used in the index to be effective func NewNullChunk(size uint64) *NullChunk { b := make([]byte, int(size)) return &NullChunk{ diff --git a/nullseed.go b/nullseed.go index 26033e7..efdc30b 100644 --- a/nullseed.go +++ b/nullseed.go @@ -109,7 +109,7 @@ func (s *nullChunkSection) WriteInto(dst *os.File, offset, length, blocksize uin return 0, 0, fmt.Errorf("unable to copy %d bytes to %s : wrong size", length, dst.Name()) } - // When cloning isn'a available we'd normally have to copy the 0 bytes into + // When cloning isn't available we'd normally have to copy the 0 bytes into // the target range. 
But if that's already blank (because it's a new/truncated // file) there's no need to copy 0 bytes. if !s.canReflink { diff --git a/protocol.go b/protocol.go index b45ff62..8906f2b 100644 --- a/protocol.go +++ b/protocol.go @@ -66,7 +66,7 @@ func (p *Protocol) RecvHello() (uint64, error) { return 0, err } if m.Type != CaProtocolHello { - return 0, fmt.Errorf("expected protocl hello, got %x", m.Type) + return 0, fmt.Errorf("expected protocol hello, got %x", m.Type) } if len(m.Body) != 8 { return 0, fmt.Errorf("unexpected length of hello msg, got %d, expected 8", len(m.Body)) diff --git a/remotessh.go b/remotessh.go index 6ea79cd..e9b730f 100644 --- a/remotessh.go +++ b/remotessh.go @@ -71,7 +71,7 @@ func (r *RemoteSSH) String() string { // StartProtocol initiates a connection to the remote store server using // the value in CASYNC_SSH_PATH (default "ssh"), and executes the command in // CASYNC_REMOTE_PATH (default "casync"). It then performs the HELLO handshake -// to initialze the connection +// to initialize the connection func StartProtocol(u *url.URL) (*Protocol, error) { sshCmd := os.Getenv("CASYNC_SSH_PATH") if sshCmd == "" { diff --git a/s3.go b/s3.go index 6431e90..a99c6b9 100644 --- a/s3.go +++ b/s3.go @@ -77,7 +77,7 @@ func (s S3StoreBase) Close() error { return nil } // NewS3Store creates a chunk store with S3 backing. The URL // should be provided like this: s3+http://host:port/bucket // Credentials are passed in via the environment variables S3_ACCESS_KEY -// and S3S3_SECRET_KEY, or via the desync config file. +// and S3_SECRET_KEY, or via the desync config file. 
func NewS3Store(location *url.URL, s3Creds *credentials.Credentials, region string, opt StoreOptions, lookupType minio.BucketLookupType) (s S3Store, e error) { b, err := NewS3StoreBase(location, s3Creds, region, opt, lookupType) if err != nil { diff --git a/s3index.go b/s3index.go index 2941eb9..bfff4f6 100644 --- a/s3index.go +++ b/s3index.go @@ -20,7 +20,7 @@ type S3IndexStore struct { // NewS3IndexStore creates an index store with S3 backing. The URL // should be provided like this: s3+http://host:port/bucket // Credentials are passed in via the environment variables S3_ACCESS_KEY -// and S3S3_SECRET_KEY, or via the desync config file. +// and S3_SECRET_KEY, or via the desync config file. func NewS3IndexStore(location *url.URL, s3Creds *credentials.Credentials, region string, opt StoreOptions, lookupType minio.BucketLookupType) (s S3IndexStore, e error) { b, err := NewS3StoreBase(location, s3Creds, region, opt, lookupType) if err != nil { diff --git a/selfseed.go b/selfseed.go index 38feacf..86a5220 100644 --- a/selfseed.go +++ b/selfseed.go @@ -46,7 +46,7 @@ func (s *selfSeed) add(segment IndexSegment) { // Advance pos until we find a chunk we don't yet have recorded while recording // the chunk positions we do have in the position map used to find seed matches. // Since it's guaranteed that the numbers are only increasing, we drop old numbers - // from the cache map to keep it's size to a minimum and only store out-of-sequence + // from the cache map to keep its size to a minimum and only store out-of-sequence // numbers for { // See if we can advance the write pointer in the self-seed which requires diff --git a/sftp.go b/sftp.go index aba5426..69cfdcd 100644 --- a/sftp.go +++ b/sftp.go @@ -88,7 +88,7 @@ func newSFTPStoreBase(location *url.URL, opt StoreOptions) (*SFTPStoreBase, erro // StoreObject adds a new object to a writable index or chunk store. 
func (s *SFTPStoreBase) StoreObject(name string, r io.Reader) error { // Write to a tempfile on the remote server. This is not 100% guaranteed to not - // conflict between gorouties, there's no tempfile() function for remote servers. + // conflict between goroutines, there's no tempfile() function for remote servers. // Use a large enough random number instead to build a tempfile tmpfile := name + strconv.Itoa(rand.Int()) d := path.Dir(name) diff --git a/sparse-file_test.go b/sparse-file_test.go index 9eac7c0..322a212 100644 --- a/sparse-file_test.go +++ b/sparse-file_test.go @@ -97,7 +97,7 @@ func TestSparseFileRead(t *testing.T) { require.Equal(t, fromBlob, fromSparse) } - // Read the whole file. After this is should match the whole blob + // Read the whole file. After this it should match the whole blob whole := make([]byte, index.Length()) _, err = h.ReadAt(whole, 0) require.NoError(t, err) diff --git a/store.go b/store.go index d4d985d..a682444 100644 --- a/store.go +++ b/store.go @@ -111,7 +111,7 @@ func (o *StoreOptions) UnmarshalJSON(data []byte) error { // Returns data converters that convert between plain and storage-format. Each layer // represents a modification such as compression or encryption and is applied in order -// depending the direction of data. If data is written to storage, the layer's toStorage +// depending on the direction of data. If data is written to storage, the layer's toStorage // method is called in the order they are returned. If data is read, the fromStorage // method is called in reverse order. func (o *StoreOptions) converters() []converter { diff --git a/swapstore.go b/swapstore.go index 4b0ef81..ed25905 100644 --- a/swapstore.go +++ b/swapstore.go @@ -19,7 +19,7 @@ type SwapStore struct { mu sync.RWMutex } -// SwapWriteStore does ther same as SwapStore but implements WriteStore as well. +// SwapWriteStore does the same as SwapStore but implements WriteStore as well. 
type SwapWriteStore struct { SwapStore } @@ -56,21 +56,21 @@ func (s *SwapStore) String() string { return s.s.String() } -// Close the store. NOP opertation, needed to implement Store interface. +// Close the store. NOP operation, needed to implement Store interface. func (s *SwapStore) Close() error { s.mu.RLock() defer s.mu.RUnlock() return s.s.Close() } -// Close the store. NOP opertation, needed to implement Store interface. +// Close the store. NOP operation, needed to implement Store interface. func (s *SwapStore) Swap(new Store) error { s.mu.Lock() defer s.mu.Unlock() _, oldWritable := s.s.(WriteStore) _, newWritable := new.(WriteStore) if oldWritable && !newWritable { - return errors.New("a writable store can obly be updated with another writable one") + return errors.New("a writable store can only be updated with another writable one") } s.s.Close() // Close the old store s.s = new diff --git a/tar.go b/tar.go index 3777918..c91f4fc 100644 --- a/tar.go +++ b/tar.go @@ -11,7 +11,7 @@ import ( // TarFeatureFlags are used as feature flags in the header of catar archives. These // should be used in index files when chunking a catar as well. TODO: Find out what -// CaFormatWithPermissions is as that's not set incasync-produced catar archives. +// CaFormatWithPermissions is as that's not set in casync-produced catar archives. 
const TarFeatureFlags uint64 = CaFormatWith32BitUIDs | CaFormatWithNSecTime | CaFormatWithPermissions | @@ -130,7 +130,7 @@ func tar(ctx context.Context, enc FormatEncoder, fs *fsBufReader, f *File) (n in } items = append(items, FormatGoodbyeItem{ - Offset: uint64(start), // This is tempoary, it needs to be re-calculated later as offset from the goodbye marker + Offset: uint64(start), // This is temporary, it needs to be re-calculated later as offset from the goodbye marker Size: uint64(n - start), Hash: SipHash([]byte(name)), }) diff --git a/types.go b/types.go index c8df06d..8876f11 100644 --- a/types.go +++ b/types.go @@ -21,7 +21,7 @@ func ChunkIDFromSlice(b []byte) (ChunkID, error) { return c, nil } -// ChunkIDFromString converts a SHA512/56 encoded as string into a ChunkID +// ChunkIDFromString converts a SHA512/256 hash encoded as a string into a ChunkID func ChunkIDFromString(id string) (ChunkID, error) { b, err := hex.DecodeString(id) if err != nil { diff --git a/untar.go b/untar.go index f176a09..d0817f4 100644 --- a/untar.go +++ b/untar.go @@ -48,7 +48,7 @@ loop: } // UnTarIndex takes an index file (of a chunked catar), re-assembles the catar -// and decodes it on-the-fly into the target directory 'dst'. Uses n gorountines +// and decodes it on-the-fly into the target directory 'dst'. Uses n goroutines // to retrieve and decompress the chunks. func UnTarIndex(ctx context.Context, fs FilesystemWriter, index Index, s Store, n int, pb ProgressBar) error { type requestJob struct { @@ -119,7 +119,7 @@ func UnTarIndex(ctx context.Context, fs FilesystemWriter, index Index, s Store, return nil }) - // Assember - Read from data channels push the chunks into the pipe that untar reads from + // Assembler - Read from data channels and push the chunks into the pipe that untar reads from g.Go(func() error { defer w.Close() // No more chunks to come, stop the untar loop: