This package is also needed to create and manipulate netcdf data files with Python.

To build RNNLIB, run:

``` shell
$ cmake -DCMAKE_BUILD_TYPE=Release .
$ cmake --build .
```

The cmake run creates the binaries `rnnlib`, `rnnsynth` and `gradient_check` in the current directory.

It is recommended that you add the directory containing the `rnnlib` binary to your path;
otherwise the tools in the `utilities` directory will not work.

Project files for integrated development environments can be generated by cmake. Run `cmake --help`
to get the list of supported IDEs.


# Handwriting synthesis

Step into `examples/online_prediction` and go through the steps below to prepare the
training data, train the model, and finally plot the results of the synthesis.

## Downloading online handwriting dataset
Start by registering and downloading the pen-stroke data from
http://www.iam.unibe.ch/~fkiwww/iamondb/data/lineStrokes-all.tar.gz
Text labels for the strokes can be found at
http://www.iam.unibe.ch/~fkiwww/iamondb/data/ascii-all.tar.gz
Then unpack `./lineStrokes` and `./ascii` under `examples/online_prediction`.
The data format in the downloaded files cannot be used as is;
further preprocessing is required to convert pen coordinates to offsets from the
previous point and merge them into a single file in netcdf format.
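The core of that conversion is a running difference over consecutive points. A small awk sketch with made-up coordinates (the real dataset is XML and is handled by the preprocessing scripts, so this is purely illustrative):

``` shell
# Convert absolute pen positions (x y end_of_stroke per line) into
# offsets from the previous point; the first point becomes (0, 0).
# Input values below are made up for illustration.
printf '100 200 0\n103 205 0\n110 195 1\n' | awk '
NR == 1 { px = $1; py = $2; print 0, 0, $3; next }
{ print $1 - px, $2 - py, $3; px = $1; py = $2 }'
```

Each output line is then one network input point: x offset, y offset, end-of-stroke flag.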

## Preparing the training data

Run `./build_netcdf.sh` to split the dataset into training and validation sets.
The same script does all necessary preprocessing, including normalisation
of the input, and produces the corresponding `online.nc` and `online_validation.nc`
files for use with rnnlib.

Each point in the input sequences from `online.nc` consists of three numbers:
the x and y offset from the previous point, and the binary end-of-stroke feature.

## Gradient check

To gain some confidence that the build is fine, run the gradient check:

``` shell
gradient_check --autosave=false check_synth2.config
```

## Training

too slow convergence rate.

### Step 1

``` shell
rnnlib --verbose=false synth1d.config
```

where `synth1d.config` is the first-step configuration file that defines the network topology:
3 LSTM hidden layers of 400 cells, 20 gaussian mixtures as the output layer, 10 mixtures
Somewhere between training epochs 10-15 it will find the optimal solution and will do
"early stopping" after 20 epochs without improvement. "Early" here takes 3 days on an Intel
Sandy Bridge CPU. Normally training can be stopped as soon as the loss starts rising
for 2-3 consecutive epochs.
The best solution found is stored in the `synth1d@<time step>.best_loss.save` file.
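The "loss rising for 2-3 consecutive epochs" rule above can be checked mechanically once the per-epoch loss values have been extracted from the rnnlib log. A sketch with made-up loss numbers (the log-parsing step itself is omitted and the values are illustrative):

``` shell
# Report the epoch at which the loss has risen for 3 consecutive epochs.
# One loss value per line; the numbers below are invented for illustration.
printf '%s\n' 2.31 2.10 1.95 1.97 2.02 2.08 | awk '
NR > 1 && $1 > prev  { up++ }
NR > 1 && $1 <= prev { up = 0 }
{ prev = $1 }
up >= 3 { print "stop after epoch " NR; exit }'
```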

### Step 2

training unlike in Step 1. Therefore one must be more patient to declare early stopping and
wait for 20 epochs with loss worse than the best result so far. Rnnlib has an implementation
of the MDL regulariser, which is used in this step. The command line is as follows:

``` shell
rnnlib --mdl=true --mdlOptimiser=rmsprop from_step1.best_loss.save
```

### Synthesis

Handwriting synthesis is done by the `rnnsynth` binary using the network parameters obtained in
step 2:

``` shell
rnnsynth from_step2.best_loss.save
```

The character sequence is given on stdin and the output is written to stdout. The output sequence
has the same format as the input: each data point has x,y offsets and an end-of-stroke flag.

### Plotting the results

The rnnsynth output is a sequence of x,y offsets and end-of-stroke flags. To visualise it one
can use the `show_pen.m` Octave script:

``` shell
octave:>show_pen('/tmp/trace1')
```

where `/tmp/trace1` contains the stdout from rnnsynth.
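If Octave is not at hand, the trace can still be reconstructed with a running sum over the offsets, since the absolute pen positions are just the cumulative x,y deltas; any plotting tool can then draw the result. A sketch with illustrative values (not actual rnnsynth output):

``` shell
# Turn offset triplets (dx dy end_of_stroke) back into absolute
# positions by accumulating the deltas; input values are made up.
printf '0 0 0\n3 5 0\n7 -10 1\n' | awk '
{ x += $1; y += $2; print x, y, $3 }'
```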

### Rnnlib configuration file

Configuration options are explained at http://sourceforge.net/p/rnnl/wiki/Home/. Since then
a few things have been added:
* `lstm1d` as a hiddenType layer type - an optimised LSTM layer for 1d input
* `rmsprop` optimiser type
* `mixtures=N` where N is the number of gaussians in the output layer
* `charWindowSize=N` where N is the number of gaussians in the character window layer
* `skipConnections=true|false` - whether to add skip connections; default is true
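As a sketch only, such options might appear in a configuration file like this (the `option value` line format and all values here are assumptions; check the wiki and the shipped `synth1d.config` for the authoritative syntax):

```
hiddenType lstm1d
optimiser rmsprop
mixtures 20
charWindowSize 10
skipConnections true
```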

# Contact
