Concrete steps to improve usability:

Tasks

- [ ] Include a small sample dataset and a script to generate it.
- [ ] Add a CLI (argparse/typer) with clear arguments and defaults.
- [ ] Provide pytest unit tests for the dedupe logic; add CI to run tests.
- [ ] Add a Dockerfile (and optional compose) to run the pipeline in isolation.
- [ ] Expand README with a quickstart showing end-to-end execution.

Benefits

- Anyone can clone and run the example with confidence.
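
As a possible starting point for the sample-dataset and CLI tasks, here is a minimal sketch using `argparse` and the standard library only. The project's actual dedupe logic and schema are not shown here, so everything in it is an assumption made for illustration: the `id`/`email` columns, the function names (`generate_sample`, `dedupe`, `write_csv`, `build_parser`, `main`), and the "keep the first row per key" rule are hypothetical stand-ins, not the real pipeline.

```python
import argparse
import csv
import random
from pathlib import Path

# Hypothetical schema, assumed for illustration only.
FIELDS = ["id", "email"]


def generate_sample(n_rows=20, dup_rate=0.3, seed=0):
    """Build a small synthetic dataset in which some rows repeat an earlier row."""
    rng = random.Random(seed)  # seeded so the sample is reproducible
    rows = []
    for i in range(n_rows):
        if rows and rng.random() < dup_rate:
            # Copy a random earlier row to create an exact duplicate.
            rows.append(dict(rng.choice(rows)))
        else:
            rows.append({"id": str(i), "email": f"user{i}@example.com"})
    return rows


def dedupe(rows, key="email"):
    """Stand-in dedupe: keep the first row seen for each value of `key`."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


def write_csv(rows, path):
    """Write rows to `path` as CSV with a header line."""
    with Path(path).open("w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


def build_parser():
    """CLI with explicit defaults so the tool runs with no arguments."""
    parser = argparse.ArgumentParser(
        description="Generate a sample dataset and write a deduplicated copy."
    )
    parser.add_argument("--rows", type=int, default=20, help="rows to generate")
    parser.add_argument("--dup-rate", type=float, default=0.3,
                        help="probability that a row duplicates an earlier one")
    parser.add_argument("--seed", type=int, default=0, help="random seed")
    parser.add_argument("--out", type=Path, default=Path("sample_deduped.csv"),
                        help="output CSV path")
    return parser


def main(argv=None):
    """Entry point: generate, dedupe, write, and report row counts."""
    args = build_parser().parse_args(argv)
    rows = generate_sample(args.rows, args.dup_rate, args.seed)
    unique = dedupe(rows)
    write_csv(unique, args.out)
    print(f"{len(rows)} rows generated, {len(unique)} kept -> {args.out}")
    return 0
```

Calling `main()` under an `if __name__ == "__main__":` guard would turn this into a script. Keeping `dedupe` a pure function that takes and returns a list of rows also makes the pytest task straightforward, since a test can pass in a few hand-written rows and assert on the result without touching the filesystem.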