some issues with code including python versions #3

@shoaib-intro

Description

  1. doc needs to be converted to a string before the re (regular expression) step; otherwise non-string input raises an error.
    doc = str(doc).lower()  # convert to str so non-string input does not raise an error

  2. The token must be encoded to bytes before it is passed to md5; in Python 3, hashing a str directly raises a TypeError.
    h = bin(int(md5(token.encode('utf-8')).hexdigest(), 16))  # encode first to avoid the TypeError

  3. Python 3 removed dict.iteritems(); use items() to iterate over the dictionary.
    for _, token in token_dict.items():  # items() replaces Python 2's iteritems()

if __name__ == '__main__':
    # Just for demonstration
    doc = data # {doc_id, doc}
    binary_hash = simhash(doc)
    print(binary_hash)
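
For reference, here is a minimal sketch of how the three fixes might fit together under Python 3. It is not the repository's implementation: the function name `simhash_sketch`, the `\w+` tokenizer, the `enumerate`-based `token_dict`, and the equal weighting of tokens are all assumptions made for illustration. It also keeps the hash as an integer and formats it at the end, rather than calling `bin()` per token, so the result always has a fixed width of 128 bits.

```python
import re
from hashlib import md5

HASH_BITS = 128  # an md5 digest is 128 bits wide


def simhash_sketch(doc):
    """Hypothetical minimal SimHash; an illustration, not the repo's code."""
    # Fix 1: coerce to str so non-string input (int, None, ...) works with re.
    doc = str(doc).lower()

    # Assumed tokenizer: runs of word characters. The real code may differ.
    tokens = re.findall(r"\w+", doc)
    token_dict = dict(enumerate(tokens))

    weights = [0] * HASH_BITS
    # Fix 3: items() instead of Python 2's iteritems().
    for _, token in token_dict.items():
        # Fix 2: encode the token to bytes before hashing it with md5.
        h = int(md5(token.encode("utf-8")).hexdigest(), 16)
        for i in range(HASH_BITS):
            # Each token votes +1 for a set bit and -1 for a cleared bit.
            weights[i] += 1 if (h >> i) & 1 else -1

    # A bit is set in the fingerprint when its votes sum to a positive value.
    fingerprint = 0
    for i, weight in enumerate(weights):
        if weight > 0:
            fingerprint |= 1 << i

    # Zero-padded binary string without the '0b' prefix that bin() would add.
    return format(fingerprint, "0{}b".format(HASH_BITS))
```

Calling `simhash_sketch(12345)` or `simhash_sketch("some text")` returns a 128-character string of 0s and 1s in both cases, which is the behaviour fix 1 is meant to guarantee.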
