BillGates98/DLinker

DLinker

RDF Data Linking tool

DLinker is an RDF data linking tool.

  • Depends on four hyperparameters (measure_level, alpha_predicate, alpha and phi):
  • 'measure_level': the strength of the similarity search measure
  • 'alpha_predicate': acceptance threshold for similar predicates
  • 'alpha': acceptance threshold for similar literals
  • 'phi': number of accepted similarity pairs
  • 'validation': path to the validation file; pass it once the file has been placed in the validation path
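
As an illustration of how these flags fit together, here is a minimal sketch of a command-line parser that accepts the same options passed to ./job.sh in the evaluations. The function name `build_parser`, the argument types (integers for 'measure_level' and 'phi', floats for the two thresholds) and the choice of which flags are required are assumptions made for this example; the actual parsing inside the DLinker scripts may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags used with ./job.sh (not DLinker's own code)."""
    parser = argparse.ArgumentParser(description="DLinker hyperparameters (sketch)")
    parser.add_argument("--input_path", required=True,
                        help="folder holding the pair of RDF files to link")
    parser.add_argument("--output", required=True,
                        help="folder where results are written")
    # Assumed types: the README examples use 0, 1 and 2 for measure_level.
    parser.add_argument("--measure_level", type=int, required=True,
                        help="strength of the similarity search measure")
    parser.add_argument("--alpha_predicate", type=float, required=True,
                        help="acceptance threshold for similar predicates")
    parser.add_argument("--alpha", type=float, required=True,
                        help="acceptance threshold for similar literals")
    parser.add_argument("--phi", type=int, required=True,
                        help="number of accepted similarity pairs")
    parser.add_argument("--validation", default=None,
                        help="optional path to the file of valid owl:sameAs pairs")
    return parser


# Parse the HOBBIT and SPATEN settings from an explicit list instead of sys.argv.
args = build_parser().parse_args([
    "--input_path", "./inputs/spaten_hobbit/",
    "--output", "./outputs/spaten_hobbit",
    "--alpha_predicate", "1",
    "--alpha", "0.3",
    "--phi", "1",
    "--measure_level", "0",
    "--validation", "./validations/spaten_hobbit/valid_same_as.nt",
])
print(args.alpha_predicate, args.alpha, args.phi, args.measure_level)
```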

Evaluations

  • HOBBIT and SPATEN (url): sh ./job.sh --input_path ./inputs/spaten_hobbit/ --output ./outputs/spaten_hobbit --alpha_predicate 1 --alpha 0.3 --phi 1 --measure_level 0 --validation ./validations/spaten_hobbit/valid_same_as.nt
    • Precision: 1.0
    • Recall: 1.0
    • F-measure: 1.0
  • Doremus data (url): sh ./job.sh --input_path ./inputs/doremus/ --output ./outputs/doremus/ --alpha_predicate 1 --alpha 0.88 --phi 2 --measure_level 2 --validation ./validations/doremus/valid_same_as.ttl
    • Precision: 0.966
    • Recall: 1.0
    • F-measure: 0.983
  • SPIMBENCH data (url): sh ./job.sh --input_path ./inputs/spimbench/ --output ./outputs/spimbench/ --alpha_predicate 1 --alpha 1 --phi 1 --measure_level 1 --validation ./validations/spimbench/valid_same_as.ttl
    • Precision: 0.786
    • Recall: 1.0
    • F-measure: 0.880
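
The F-measure values above are consistent with the usual harmonic mean of precision and recall. The short sketch below recomputes them from the reported precision and recall; the helper name `f_measure` is introduced for this example only and is not taken from './score_computation.py'.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1); returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the evaluations above.
reported = [
    ("HOBBIT and SPATEN", 1.0, 1.0),
    ("Doremus", 0.966, 1.0),
    ("SPIMBENCH", 0.786, 1.0),
]

for name, precision, recall in reported:
    print(f"{name}: F-measure = {f_measure(precision, recall):.3f}")
# HOBBIT and SPATEN: F-measure = 1.000
# Doremus: F-measure = 0.983
# SPIMBENCH: F-measure = 0.880
```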

Make sure the file './outputs/spimbench/similars_predicates.csv' contains the content below:

predicate_1,value_1,predicate_2,value_2,similarities
<http://www.bbc.co.uk/ontologies/creativework/title>,http://www.bbc.co.uk/ontologies/creativework/title, <http://www.bbc.co.uk/ontologies/creativework/title>,http://www.bbc.co.uk/ontologies/creativework/title,1.0
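
One way to check this file programmatically is to read it with Python's standard csv module. The sketch below parses the two lines shown above from an in-memory string; the helper name `load_predicate_pairs` is hypothetical, and `skipinitialspace=True` is used because the sample row has a space after one of the commas.

```python
import csv
import io

# The header and row shown above, kept in memory so the sketch runs without the file.
SAMPLE = (
    "predicate_1,value_1,predicate_2,value_2,similarities\n"
    "<http://www.bbc.co.uk/ontologies/creativework/title>,"
    "http://www.bbc.co.uk/ontologies/creativework/title, "
    "<http://www.bbc.co.uk/ontologies/creativework/title>,"
    "http://www.bbc.co.uk/ontologies/creativework/title,1.0\n"
)


def load_predicate_pairs(handle):
    """Return (predicate_1, predicate_2, similarity) tuples from a CSV file handle."""
    reader = csv.DictReader(handle, skipinitialspace=True)
    return [
        (row["predicate_1"], row["predicate_2"], float(row["similarities"]))
        for row in reader
    ]


pairs = load_predicate_pairs(io.StringIO(SAMPLE))
for predicate_1, predicate_2, similarity in pairs:
    print(predicate_1, predicate_2, similarity)
```

To check the real file, the same helper can be called on `open("./outputs/spimbench/similars_predicates.csv", newline="")`.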

Features

  • Takes only pairs of files in the inputs path ('./inputs/'); the files can have any names. Example: 'source.ttl' and 'target.ttl'
  • Computes predicate pairs
  • Computes similar literals
  • Once the data is in place, the whole pipeline can be launched with the sbatch file ('./job.sh'); do not forget the hyperparameters
  • Places results in the output path ('./outputs')
  • Place the valid pairs in the file '/validations/valid_same_as.ttl' and pass it with the 'validation' argument
  • Computes the similarity score with the Python script ('./score_computation.py')

Tech

DLinker relies on the following elements to work properly:

  • [Python >=3.8] - an interpreted, multi-paradigm, multi-platform programming language
  • [Visual Studio Code Editor] - awesome text editor
  • [markdown-it] - Markdown parser done right. Fast and easy to extend.

Installation

DLinker requires Python (>=3.8) and spaCy (>=3.4.1) to run.

Install the dependencies:

pip install spacy

Shell script contained in the ./job.sh file:

#!/bin/bash

# Forward every command-line argument, unchanged, to both pipeline stages.
# echo "All params: $*"

# Stage 1: build the candidate entity pairs.
python3.8 ./candidate_entities_pairs.py "$@"
# Stage 2: compute the similarity scores.
python3.8 ./score_computation.py "$@"

Expected output after running on the HOBBIT and SPATEN datasets:

sh ./job.sh --input_path ./inputs/spaten_hobbit/ --output ./outputs/spaten_hobbit --alpha_predicate 1 --alpha 0.3 --phi 1 --measure_level 0 --validation ./validations/spaten_hobbit/valid_same_as.nt

@prefix owl: <http://www.w3.org/2002/07/owl#> .

<http://www.spaten.com/trace-data#162531> owl:sameAs <http://www.hobbit.e79638702-1458-413d-a054-06ba82203597> .

<http://www.spaten.com/trace-data#207815> owl:sameAs <http://www.hobbit.ea985b39b-df18-43d2-aac1-21e41e04c910> .

<http://www.spaten.com/trace-data#21948> owl:sameAs <http://www.hobbit.ea0e27f34-3e0a-48ff-9a27-b4f3052187e4> .

<http://www.spaten.com/trace-data#332418> owl:sameAs <http://www.hobbit.e6ec56f99-f954-431c-b644-49e3f52c4608> .

<http://www.spaten.com/trace-data#402929> owl:sameAs <http://www.hobbit.e7dd07dff-791f-4611-af13-d2ab783b8880> .

<http://www.spaten.com/trace-data#44152> owl:sameAs <http://www.hobbit.e0cadb061-dfc0-4992-b9ec-77b03e869c19> .

...
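
To compare links like these against a reference file, the owl:sameAs pairs can be extracted and treated as sets. The following is only a sketch under stated assumptions: the helper names `parse_same_as` and `precision_recall` are hypothetical, the regular expression handles just the simple one-triple-per-line layout shown above (prefixed or full-IRI form of owl:sameAs), and it is not how './score_computation.py' is known to work.

```python
import re

# Matches "<subject> owl:sameAs <object> ." in the prefixed form shown above,
# or with the full IRI as it would appear in an N-Triples file.
SAME_AS = re.compile(
    r"<([^>]+)>\s+(?:owl:sameAs|<http://www\.w3\.org/2002/07/owl#sameAs>)\s+<([^>]+)>\s*\."
)


def parse_same_as(text):
    """Return the set of (source, target) IRI pairs linked by owl:sameAs."""
    return set(SAME_AS.findall(text))


def precision_recall(predicted, reference):
    """Precision and recall of predicted links against reference links."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# First two links of the expected output above.
OUTPUT = """@prefix owl: <http://www.w3.org/2002/07/owl#> .

<http://www.spaten.com/trace-data#162531> owl:sameAs <http://www.hobbit.e79638702-1458-413d-a054-06ba82203597> .

<http://www.spaten.com/trace-data#207815> owl:sameAs <http://www.hobbit.ea985b39b-df18-43d2-aac1-21e41e04c910> .
"""

links = parse_same_as(OUTPUT)
# For illustration only, the reference is taken to be identical to the output.
precision, recall = precision_recall(links, set(links))
print(len(links), precision, recall)
```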
