@Humanshere
Contributor

@Humanshere commented Dec 31, 2025

Issue: #72
I did some digging (here is an informal log pastebin).
I landed on zstd. It fills the same role as zlib but did better in my tests; the other algorithms I tried were not as balanced in the time-size tradeoff.

Here are the benchmark results on diff.c:
[image: benchmark results on diff.c]

and it is pretty clear that zstd is the most balanced
(look at level 3: much faster than the other algorithms, with a comparable compression ratio).
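For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of a time/ratio benchmark. It is not the script behind the numbers above: the `benchmark` helper, the chosen levels, and the stand-in sample input are all assumptions made for illustration, and zstd is only included when `compression.zstd` (Python 3.14+) can be imported.

```python
import bz2
import lzma
import time
import zlib


def benchmark(data: bytes, codecs: dict) -> list:
    """Return (name, ratio, seconds) per codec, where ratio = original / compressed size."""
    rows = []
    for name, compress in codecs.items():
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        rows.append((name, len(data) / len(packed), elapsed))
    return rows


# Standard-library codecs; the levels are illustrative choices, not tuned values.
CODECS = {
    "zlib-6": lambda d: zlib.compress(d, 6),
    "bz2-9": lambda d: bz2.compress(d, 9),
    "lzma-6": lambda d: lzma.compress(d, preset=6),
}

# zstd ships with the standard library only from Python 3.14 onwards.
try:
    from compression import zstd

    CODECS["zstd-3"] = lambda d: zstd.compress(d, level=3)
except ImportError:
    pass  # older Python: compare the remaining codecs only

if __name__ == "__main__":
    # Stand-in input (assumption); swap in the bytes of the real file to test.
    sample = b"static int diff_example(void) { return 0; }\n" * 2000
    for name, ratio, seconds in benchmark(sample, CODECS):
        print(f"{name:8s} ratio={ratio:6.1f} time={seconds * 1000:.2f} ms")
```

A single timing run like this is noisy, so repeating each measurement and taking the best time would give steadier numbers.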
Now let me explain (what I could understand of) the internal workings of zstd.
Basically, it is an evolution of the DEFLATE algorithm used by zlib.
DEFLATE uses

  1. LZ77 (in an oversimplified way: instead of storing ABABABABABABABAB...AB, it stores AB once and then a back-reference meaning "go back 2 characters and copy n characters")
  2. Huffman coding (learned this in 2nd sem :) ): shorter codes for frequently used symbols, so that the average size is reduced.
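The two DEFLATE ingredients above can be sketched in a few lines. This is a toy illustration under my own assumptions, not how zlib implements them: `lz77_encode`, `lz77_decode`, and `huffman_code_lengths` are hypothetical helpers, the search is a naive greedy scan over a tiny window, and the Huffman part only computes code lengths rather than actual bit strings.

```python
import heapq
from collections import Counter


def lz77_encode(data: str, window: int = 32) -> list:
    """Greedy toy LZ77: emit literal characters or (distance, length) back-references."""
    tokens, i = [], 0
    while i < len(data):
        best_len, best_dist = 0, 0
        for j in range(max(0, i - window), i):
            length = 0
            # A match may run past position i (overlap); that is how long runs get encoded.
            while i + length < len(data) and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= 3:  # short matches are not worth a back-reference
            tokens.append((best_dist, best_len))
            i += best_len
        else:
            tokens.append(data[i])
            i += 1
    return tokens


def lz77_decode(tokens: list) -> str:
    """Rebuild the text by copying one character at a time from `distance` back."""
    out = []
    for tok in tokens:
        if isinstance(tok, tuple):
            dist, length = tok
            for _ in range(length):
                out.append(out[-dist])
        else:
            out.append(tok)
    return "".join(out)


def huffman_code_lengths(text: str) -> dict:
    """Return {symbol: code length in bits} for a Huffman code built from symbol counts."""
    counts = Counter(text)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries: (weight, unique tie breaker, {symbol: depth so far})
    heap = [(w, n, {sym: 0}) for n, (sym, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**a, **b}.items()}  # one level deeper
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


tokens = lz77_encode("AB" * 8)
print(tokens)  # ['A', 'B', (2, 14)]
print(lz77_decode(tokens) == "AB" * 8)  # True
print(huffman_code_lengths("aaaaaaaabbbc"))  # 'a' gets 1 bit, 'b' and 'c' get 2 bits each
```

In DEFLATE the two stages are chained: the literals and back-references produced by the LZ77 stage are what gets Huffman coded.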

zstd improves on these:

  1. It uses an optimized version of LZ77, from what I could gather, with better pattern matching.
  2. It employs Finite State Entropy (FSE) along with Huffman. I couldn't wrap my head around the inner workings, but the abstract idea is that it can effectively spend a fractional number of bits per symbol (by maintaining a "state", a number that evolves as you encode more data), saving space without compromising on time.
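To see why fractional bits matter, here is a small worked example. It does not implement FSE; it only compares the cost of a whole-bit Huffman code with the theoretical (Shannon entropy) cost for a skewed two-symbol source. The probabilities are made up for illustration.

```python
import math


def entropy_bits(probs) -> float:
    """Shannon entropy: the average number of bits per symbol an ideal coder needs."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Made-up source: 'A' appears 90% of the time, 'B' 10% of the time.
probs = {"A": 0.9, "B": 0.1}

# With only two symbols, Huffman has to give each one a 1-bit code.
huffman_avg = 1.0

ideal_avg = entropy_bits(probs.values())
ideal_cost = {sym: -math.log2(p) for sym, p in probs.items()}

print(f"Huffman: {huffman_avg:.3f} bits/symbol")
print(f"Ideal:   {ideal_avg:.3f} bits/symbol")  # about 0.469
print({sym: round(bits, 3) for sym, bits in ideal_cost.items()})  # A ~0.152, B ~3.322
```

Huffman cannot charge less than 1 bit for 'A' even though its ideal cost is about 0.15 bits. Coders in the arithmetic/ANS family, which FSE belongs to, can get close to the ideal figure.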

If an external dependency is not allowed, I would go for zlib.
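One way to keep both options open is sketched below. This is not the code in data.py; `compress_blob` and `decompress_blob` are hypothetical names, and the one-byte tag format is an assumption made for this example. It uses zstd when the standard-library module `compression.zstd` (Python 3.14+) is importable and falls back to zlib otherwise.

```python
import zlib

try:
    from compression import zstd  # in the standard library from Python 3.14
except ImportError:
    zstd = None  # older Python: fall back to zlib only


def compress_blob(data: bytes) -> bytes:
    """Hypothetical helper: prefix a 1-byte tag so the reader knows which codec was used."""
    if zstd is not None:
        return b"Z" + zstd.compress(data, level=3)
    return b"D" + zlib.compress(data, 6)


def decompress_blob(blob: bytes) -> bytes:
    """Inverse of compress_blob; dispatches on the tag byte."""
    tag, payload = blob[:1], blob[1:]
    if tag == b"Z":
        if zstd is None:
            raise RuntimeError("blob was written with zstd, which is unavailable here")
        return zstd.decompress(payload)
    if tag == b"D":
        return zlib.decompress(payload)
    raise ValueError(f"unknown codec tag {tag!r}")


original = b"hello compression " * 100
assert decompress_blob(compress_blob(original)) == original
```

The catch with a fallback like this is that data written on Python 3.14 cannot be read on older versions, so sticking to zlib everywhere is the simpler choice if files move between environments.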

@OpenGitBot

Hey @Humanshere

Thanks for opening this PR 🚀. Mentor will review your pull request soon and till then, keep contributing and stay calm.

Thanks for contributing in OpenCode'25 ✨✨!

@Humanshere
Contributor Author

Also, I have implemented compression in data.py because I didn't know what else to commit.
Do let me know if anything needs to be done/undone prior to merge.

@Rational-Idiot
Contributor

But the problem is that zstd isn't in the Python standard library until Python 3.14, which hasn't had widespread adoption yet and is still in the bug-fix period of development, which means using zstd would add a dependency 😮

@Humanshere
Contributor Author

@Rational-Idiot that is why I mentioned in the last line: zlib, if I have to stick to the Python standard library.

@Rational-Idiot
Contributor

40 points 🧔‍♀️

@Rational-Idiot merged commit cfe26ed into opencodeiiita:main Jan 3, 2026
@OpenGitBot

Hey @Humanshere

Your PR has been merged 🥳🥳 and you have earned 40 points.

Thanks for contributing in OpenCode'25✨✨
