- doc must be converted to a string before it is passed on to the re step:
  doc = str(doc).lower()  # convert to a string so non-string input does not raise an error
- When converting the MD5 digest to binary, the token's encoding must be specified:
  h = bin(int(md5(token.encode('utf-8')).hexdigest(), 16))  # encode to bytes to avoid the encoding error
- Dictionary iteration needs updating for Python 3:
  for _, token in token_dict.items():  # iteritems() from Python 2 is replaced by items() in Python 3
if __name__ == '__main__':
    # Just for demonstration
    doc = data  # {doc_id, doc}
    binary_hash = simhash(doc)
    print(binary_hash)
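For reference, the three changes above could fit together roughly as in the sketch below. This is not the repository's actual simhash implementation: the tokenization (`\w+` word runs), the frequency weighting, the `HASH_BITS` constant, and the fixed-width output format are all assumptions made for illustration. Only the `str(doc).lower()`, `token.encode('utf-8')`, and `.items()` calls come from this issue.

```python
import re
from collections import Counter
from hashlib import md5

HASH_BITS = 128  # MD5 digests are 128 bits wide (hypothetical constant name)


def simhash(doc):
    """Return a 128-character binary SimHash string for doc (illustrative sketch)."""
    # Fix 1: coerce to str so non-string input (int, None, ...) does not break re.
    doc = str(doc).lower()
    # Assumed tokenization: runs of word characters, weighted by frequency.
    token_dict = Counter(re.findall(r"\w+", doc))

    weights = [0] * HASH_BITS
    # Fix 3: dict.items() replaces Python 2's iteritems().
    for token, count in token_dict.items():
        # Fix 2: md5() requires bytes in Python 3, so encode the token first.
        h = int(md5(token.encode("utf-8")).hexdigest(), 16)
        for i in range(HASH_BITS):
            weights[i] += count if (h >> i) & 1 else -count

    # Bit i of the fingerprint is 1 when the weighted vote for that bit is positive.
    fingerprint = sum(1 << i for i, w in enumerate(weights) if w > 0)
    return format(fingerprint, "0{}b".format(HASH_BITS))
```

With these changes, calling `simhash(12345)` or `simhash(None)` no longer raises, because the input is stringified before tokenization.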
raphaiela