Hi 👋 This is a Python implementation of sequence-labeling algorithms for Dependency, Constituency and Graph Parsing.
- Dependency Parsing:
  - Absolute and relative indexing (Strzyz et al., 2019).
  - PoS-tag relative indexing (Strzyz et al., 2019).
  - Bracketing encoding ($k$-planar) (Strzyz et al., 2020).
  - $4$-bit projective encoding (Gómez-Rodríguez et al., 2023).
  - $7$-bit $2$-planar encoding (Gómez-Rodríguez et al., 2023).
  - Hierarchical bracketing encoding (Ezquerro et al., 2025a, OFFICIAL).
  - Hexa-Tagging projective encoding (Amini et al., 2023).
  - Arc-Eager transition-based system (Nivre and Fernández-González, 2002).
  - Biaffine dependency parser (Dozat & Manning, 2017).
- Graph Parsing:
  - Absolute and relative indexing (Ezquerro et al., 2024, OFFICIAL).
  - Bracketing encoding ($k$-planar) (Ezquerro et al., 2024, OFFICIAL).
  - Hierarchical bracketing encoding (Ezquerro et al., 2025b, OFFICIAL).
  - $4k$-bit encoding (Ezquerro et al., 2024, OFFICIAL).
  - $6k$-bit encoding (Ezquerro et al., 2024, OFFICIAL).
  - Covington graph parser (Covington, 2001).
  - Biaffine graph parser (Dozat & Manning, 2018).
- Constituency Parsing:
  - Absolute and relative indexing (Gómez-Rodríguez and Vilares, 2018).
  - Tetra-Tagging (Kitaev and Klein, 2020).
It is also the official repository of the following papers:
- Dependency Graph Parsing as Sequence Labeling (Ezquerro et al., 2024).
- Hierarchical Bracketing Encodings for Dependency Parsing as Tagging (Ezquerro et al., 2025a).
- Hierarchical Bracketing Encodings Work for Dependency Graphs (Ezquerro et al., 2025b).
- Bringing Emerging Architectures to Sequence Labeling in NLP (Ezquerro et al., 2025c).
Please feel free to reach out if you want to collaborate or add new parsers to SePar!
This code was tested with Python >=3.12 on a GPU system with NVIDIA drivers (>=535) and CUDA (>=12.4) installed. Use requirements.txt to install the dependencies in an existing environment:
```sh
pip install -r requirements.txt
```
Our code allows running the official evaluation of constituency (Sekine and Collins, 1997) and graph (Oepen et al., 2015) parsers. Please follow the instructions to download and install the EVALB executable and the SDP toolkit. For constituency and graph parsers, use the `--evalb` argument to compute the official evaluation:
- For constituency parsers, the `--evalb` argument requires three paths: (i) the EVALB executable, (ii) the labeled parameter file and (iii) the unlabeled parameter file.
- For graph parsers, the `--evalb` argument is the path to the `run.sh` file from the SDP toolkit.
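For instance, here is a minimal sketch of how these paths might be passed (the EVALB and SDP locations and parameter-file names are placeholders for your local installation; the full training syntax is described below):
```sh
# Constituency parser: EVALB executable + labeled and unlabeled parameter
# files (hypothetical paths; adjust to your installation).
python3 run.py con-idx -c <conf> -d 0 \
    --evalb EVALB/evalb EVALB/labeled.prm EVALB/unlabeled.prm \
    train --train <train-path> --dev <dev-path> --test <test-paths> -o <output-folder>

# Graph parser: path to the run.sh script of the SDP toolkit (hypothetical path).
python3 run.py grp-idx -c <conf> -d 0 --evalb sdp/run.sh \
    train --train <train-path> --dev <dev-path> --test <test-paths> -o <output-folder>
```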
You can train, evaluate and predict with different parsers from the terminal using the run.py script. Each parser has a string identifier that is passed as the first argument of run.py. The following table shows the available parsers and their configuration (modifiable through terminal arguments).
| Identifier | Parser | Paper | Arguments | Default |
|---|---|---|---|---|
| `dep-idx` | Absolute and relative indexing | Strzyz et al. (2019) | `rel` ∈ {`true`, `false`} | `false` |
| `dep-pos` | PoS-tag relative indexing | Strzyz et al. (2019) | | |
| `dep-bracket` | Bracketing encoding ($k$-planar) | Strzyz et al. (2020) | `k` | `1` |
| `dep-bit4` | $4$-bit projective encoding | Gómez-Rodríguez et al. (2023) | `proj` ∈ {`None`, `head`, `head+path`, `path`} | `None` |
| `dep-bit7` | $7$-bit $2$-planar encoding | Gómez-Rodríguez et al. (2023) | | |
| `dep-hexa` | Hexa-Tagging | Amini et al. (2023) | `proj` ∈ {`head`, `head+path`, `path`} | `head` |
| `dep-hier` | Hierarchical bracketing | Ezquerro et al. (2025a) | `variant` ∈ {`proj`, `head`, `head+path`, `path`, `nonp`} | `proj` |
| `dep-eager` | Arc-Eager system | Nivre and Fernández-González (2002) | `stack`, `buffer`, `proj` ∈ {`None`, `head`, `head+path`, `path`} | `1`, `1`, `None` |
| `dep-biaffine` | Biaffine dependency parser | Dozat & Manning (2017) | | |
| `con-idx` | Absolute and relative indexing | Gómez-Rodríguez and Vilares (2018) | `rel` ∈ {`true`, `false`} | `false` |
| `con-tetra` | Tetra-Tagging | Kitaev and Klein (2020) | | |
| `grp-idx` | Absolute and relative indexing | Ezquerro et al. (2024) | `rel` ∈ {`true`, `false`} | `false` |
| `grp-bracket` | Bracketing encoding ($k$-planar) | Ezquerro et al. (2024) | `k` | `2` |
| `grp-hier` | Hierarchical bracketing encoding | Ezquerro et al. (2025b) | | |
| `grp-bit4k` | $4k$-bit encoding | Ezquerro et al. (2024) | `k` | `3` |
| `grp-bit6k` | $6k$-bit encoding | Ezquerro et al. (2024) | `k` | `3` |
| `grp-cov` | Covington | Covington (2001) | | |
| `grp-biaffine` | Biaffine graph parser | Dozat & Manning (2018) | | |
To train a parser from scratch, the run.py script should follow this syntax:
```sh
python3 run.py <parser-identifier> <specific-args> \
    -c <conf> -d <device> (--load <pt-path> --seed <seed> --evalb <*evalb-paths>) \
    train --train <train-path> --dev <dev-path> --test <test-paths> \
    -o <output-folder> (--run-name <run-name>)
```
where:
- `<parser-identifier>` is the identifier specified in the table above (e.g. `dep-idx`),
- `<specific-args>` are the specific arguments of each parser (e.g. `--rel` for `dep-idx`),
- `<conf>` is the model configuration file (see some examples in the configs folder),
- `<device>` is the CUDA integer device,
- `<train-path>`, `<dev-path>` and `<test-paths>` are the paths to the training, development and test sets (multiple test paths are possible),
- `<output-folder>` is a folder to store the training results (including the `parser.pt` file).
And optionally:
- `<pt-path>`: Whether to load the parser from an existing `.pt` file.
- `<seed>`: Specify a different seed value. By default, this code always uses the seed `123`.
- `<*evalb-paths>`: Paths to the evaluation scripts. This option is only available for constituency and graph parsers (see the Evaluation instructions).
- `<run-name>`: wandb identifier.
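As a concrete illustration, here is a hypothetical training run for the bracketing dependency parser, assuming the table's arguments map to flags of the same name (e.g. `--k`, just as `--rel` for `dep-idx`); the configuration file, data paths and output folder are placeholders:
```sh
# Hypothetical example: train a 2-planar bracketing dependency parser on
# placeholder CoNLL-U files, using GPU 0 and the default seed.
python3 run.py dep-bracket --k 2 \
    -c configs/<conf> -d 0 --seed 123 \
    train --train data/train.conllu --dev data/dev.conllu --test data/test.conllu \
    -o results/dep-bracket
```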
W&B logging: SePar also allows model debugging with wandb. Please follow these instructions to create an account and connect it with your local installation. Note that SePar still works without a wandb account.
SePar supports distributed training with FSDP2 by running the run.py script with torchrun. Use the `CUDA_VISIBLE_DEVICES` variable to hide specific GPUs.
```sh
CUDA_VISIBLE_DEVICES=<devices> torchrun --nproc_per_node <num-devices> \
    run.py <parser-identifier> <specific-args> \
    -c <conf> (--load <pt-path> --seed <seed>) \
    train --train <train-path> --dev <dev-path> --test <test-paths> \
    -o <output-folder> (--run-name <run-name>)
```
where `<devices>` is the comma-separated list of GPU identifiers and `<num-devices>` is the number of GPUs used.
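For instance, a hypothetical two-GPU run (configuration, data paths and output folder are placeholders):
```sh
# Hypothetical example: distributed training on GPUs 0 and 1.
CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node 2 \
    run.py dep-idx --rel true \
    -c configs/<conf> \
    train --train data/train.conllu --dev data/dev.conllu --test data/test.conllu \
    -o results/dep-idx
```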
> [!WARNING]
> As introduced in this tutorial, FSDP2 requires manually specifying which modules or layers are sharded between GPUs for a better parameter distribution. In `separ/utils/shard.py` we include a function `recursive_shard()` that only shards large Transformer layers (specifically, those corresponding to the pretrained models included in the configs folder). We suggest manually adding more layers when training with other LLMs. Do not hesitate to reach out to us if you need any help!
Evaluation with a trained parser is also performed with the run.py script.
```sh
python3 run.py <parser-identifier> <specific-args> --load <pt-path> -c <conf> -d <device> \
    eval <input> (--output <output> --batch-size <batch-size>)
```
where:
- `<parser-identifier>` is the identifier specified in the table above (e.g. `dep-idx`),
- `<specific-args>` are the specific arguments of each parser (e.g. `--rel` for `dep-idx`),
- `<pt-path>` is the path where the parser has been stored (e.g. the `parser.pt` file created after training),
- `<conf>` is the model configuration file (see some examples in the configs folder),
- `<device>` is the CUDA integer device (e.g. `0`),
- `<input>` is the annotated file on which to perform the evaluation.
And optionally:
- `<output>`: Folder to store the resulting metrics.
- `<batch-size>`: Inference batch size. By default it is set to 100.
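For example, a hypothetical evaluation of a trained parser (all paths are placeholders):
```sh
# Hypothetical example: evaluate a trained parser on a placeholder test file.
python3 run.py dep-idx --load results/dep-idx/parser.pt -c configs/<conf> -d 0 \
    eval data/test.conllu --output results/dep-idx --batch-size 50
```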
Prediction with a trained parser is also performed with the run.py script.
```sh
python3 run.py <parser-identifier> <specific-args> --load <pt-path> -c <conf> -d <device> \
    predict <input> <output> (--batch-size <batch-size>)
```
where:
- `<parser-identifier>` is the identifier specified in the table above (e.g. `dep-idx`),
- `<specific-args>` are the specific arguments of each parser (e.g. `--rel` for `dep-idx`),
- `<pt-path>` is the path where the parser has been stored (e.g. the `parser.pt` file created after training),
- `<conf>` is the model configuration file (see some examples in the configs folder),
- `<device>` is the CUDA integer device (e.g. `0`),
- `<input>` is the input file to parse,
- `<output>` is the file where the predictions are stored.
And optionally:
- `<batch-size>`: Inference batch size. By default it is set to 100.
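For example, a hypothetical prediction run (all paths are placeholders):
```sh
# Hypothetical example: parse a placeholder input file and store the predictions.
python3 run.py dep-idx --load results/dep-idx/parser.pt -c configs/<conf> -d 0 \
    predict data/test.conllu results/dep-idx/pred.conllu --batch-size 50
```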
Check the docs folder for specific examples running different dependency (`docs/dep.md`), constituency (`docs/con.md`) and semantic (`docs/grp.md`) parsers. Each document contains specific instructions to reproduce the results of the original papers:
- Dependency Graph Parsing as Sequence Labeling (Ezquerro et al., 2024).
- Hierarchical Bracketing Encodings for Dependency Parsing as Tagging (Ezquerro et al., 2025a).
- Hierarchical Bracketing Encodings Work for Dependency Graphs (Ezquerro et al., 2025b).
- Bringing Emerging Architectures to Sequence Labeling in NLP (Ezquerro et al., 2025c).
The `docs/examples.ipynb` notebook includes some examples of how to use the implemented classes and methods to parse and linearize input graphs/trees.