Description
The current graph implementation relies heavily on iterating through lists. Given the size of the graph is not in
Another possible improvement is to write the actual implementation in C and dispatch to it from Python. However, since it is not yet clear how queries could be batched, and there are no benchmarks for simple hash-map based optimizations to compare against, this might turn out to be a premature optimization.
The issue will be divided into:
- Developing a benchmark method that reproduces the query run times actually observed from sensors
- Creating simple hash-map based optimizations in Python itself and checking their effectiveness
- Analyzing whether a compiled graph implementation is actually required
Developing the benchmark
- Neighbor access benchmark: For a given graph, minimize the time needed to access the neighbors of a given node
- Edge access benchmark: Given a node, minimize the time required to obtain its incoming and outgoing edges
- Area sweep benchmark: Given a node or point (x, y), a direction, an angle, and a distance, find all entities within that sector of the graph
For developing the benchmark, we use rich-bench. A copy of the original graph implementation is kept alongside the new implementation, and individual benchmark functions simulate the final output of each query.
To ensure that the results are not skewed by a particular graph structure, the tests run on multiple randomly generated graphs, with the final result being the mean over all of them. Only the mean time is considered.
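The methodology above can be sketched roughly as follows. This is a minimal illustration, not the actual benchmark code: the graph representation, sizes, and the `neighbors_by_scan` baseline are assumptions standing in for the current list-iterating implementation, and plain `timeit` is used in place of rich-bench.

```python
import random
import statistics
import timeit

def make_random_graph(n_nodes, n_edges, seed):
    """Build a random directed graph as a plain edge list, mimicking the
    list-based representation the current implementation iterates over."""
    rng = random.Random(seed)
    nodes = list(range(n_nodes))
    edges = [(rng.choice(nodes), rng.choice(nodes)) for _ in range(n_edges)]
    return nodes, edges

def neighbors_by_scan(edges, node):
    """Baseline neighbor access: a full scan of the edge list."""
    return [dst for src, dst in edges if src == node]

def bench_neighbor_access(n_graphs=5, n_nodes=1_000, n_edges=5_000):
    """Time neighbor access on several randomly generated graphs and
    report the mean, so no single graph structure biases the result."""
    times = []
    for seed in range(n_graphs):
        nodes, edges = make_random_graph(n_nodes, n_edges, seed)
        node = nodes[seed % n_nodes]
        times.append(timeit.timeit(lambda: neighbors_by_scan(edges, node),
                                   number=100))
    return statistics.mean(times)

mean_time = bench_neighbor_access()
```

In the real setup, each such function would be paired with its optimized counterpart in a rich-bench benchmark file so the two can be compared directly.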
Preliminary Graph Specific Optimization
Neighbor access and edge access are specific to graphs, but the area sweep is a more general data-chunking problem.
Neighbor access is solved directly once edge access is solved, and edge access can be handled with a hash map. The extra memory overhead will be minimal for moderately sized graphs (< 1 million edges) since we store ids rather than the actual objects. For very large graphs, however, this breaks down: the hash map itself will need to be chunked.
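A minimal sketch of the hash-map idea, under the assumption that edges are stored as `(src, dst)` id pairs: the index is built once, after which edge access and neighbor access become dictionary lookups instead of list scans, and only integer ids are stored.

```python
from collections import defaultdict

def build_edge_index(edges):
    """Index edges by endpoint id. Returns two maps:
    node id -> ids of outgoing edges, node id -> ids of incoming edges."""
    outgoing = defaultdict(list)
    incoming = defaultdict(list)
    for edge_id, (src, dst) in enumerate(edges):
        outgoing[src].append(edge_id)
        incoming[dst].append(edge_id)
    return outgoing, incoming

def neighbors(outgoing, edges, node):
    """Neighbor access falls out of edge access: follow the
    outgoing-edge ids and read off the destination nodes."""
    return [edges[e][1] for e in outgoing[node]]

edges = [(0, 1), (0, 2), (1, 2)]
out_idx, in_idx = build_edge_index(edges)
```

For a node with k incident edges, each query now costs O(k) instead of O(E), at the price of one list of ids per node.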
For this reason, a chunking system will be developed that is somewhat general but still tailored to the area sweep benchmark, and integrated with the graph's create_iter methods to provide some initial structure.
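One plausible shape for such a chunking system is a fixed-size spatial grid; the sketch below is an assumption about the design, not the planned implementation, and the `GridIndex` name and its methods are hypothetical. An area sweep then only inspects the cells that can overlap the query circle, rather than every entity.

```python
import math
from collections import defaultdict

class GridIndex:
    """Chunk entities into square cells keyed by integer grid coordinates."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # (cx, cy) -> [(entity_id, x, y)]

    def insert(self, entity_id, x, y):
        key = (int(x // self.cell_size), int(y // self.cell_size))
        self.cells[key].append((entity_id, x, y))

    def area_sweep(self, x, y, direction, half_angle, distance):
        """Entities within `distance` of (x, y) whose bearing lies within
        `half_angle` radians of `direction`. Only cells inside the
        bounding square of the query circle are scanned."""
        r = int(distance // self.cell_size) + 1
        cx, cy = int(x // self.cell_size), int(y // self.cell_size)
        hits = []
        for gx in range(cx - r, cx + r + 1):
            for gy in range(cy - r, cy + r + 1):
                for entity_id, ex, ey in self.cells.get((gx, gy), ()):
                    dx, dy = ex - x, ey - y
                    if math.hypot(dx, dy) > distance:
                        continue
                    # Normalize the angular difference into [-pi, pi].
                    diff = (math.atan2(dy, dx) - direction + math.pi) \
                        % (2 * math.pi) - math.pi
                    if abs(diff) <= half_angle:
                        hits.append(entity_id)
        return hits
```

The same cell structure could later back the create_iter methods: iterating a region becomes iterating the cells it covers.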
Future of the Optimization
The initial chunking method will become the base for developing the memory engine and its custom stores for Gamms-specific data.