Skip to content

Google Hangouts parser for very large (or small, doesn't matter) json files

Notifications You must be signed in to change notification settings

Kaitlyn019/hangouts_parser

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 

Repository files navigation

hangouts_parser

Google Hangouts parser for very large (or small, doesn't matter) json files

Recently I had an issue with parsing my google hangouts takeout file. At 1.6 GB, it was too large for my laptop to run with the standard json package and too large for any website I usually use to parse the takeout files. This was my solution :)

Basic information: this uses the ijson package which iteratively goes through the file.

To start: populate groups and contacts hash with get_chats(jsonfile), or load from a csv with import_groups(file) and import_contacts(file)

The following are the two main methods:

get_chats(jsonfile)

  • Paramenter: JSON file, RETURNS: Nothing
  • Creates a hash of all groups: key is chat ID, values is a hash with name and participants gaia_id
  • Creates a hash of all contacts: key is gaia ids, value is username
  • Uncomment print_chats to print the chats, uncomment print(contacts) to print the contacts

get_messages(jsonfile, id, output=None)

  • Parameter: JSON file, chat ID - can be found in groups hash, Output file
  • Returns: Dataframe of messages

About

Google Hangouts parser for very large (or small, doesn't matter) json files

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages