Description
This is a tracking issue for solving precision issues.
Problem Overview
Precision is one of the major obstacles to the adoption of hyperbolic geometry in machine learning.
As shown in Representation Tradeoffs for Hyperbolic Embeddings, there is a tradeoff between precision and dimensionality when representing points in hyperbolic space with floats, independent of the model that is used.
Hyperlib should have a solution to this in its core components. Ideally the solution will satisfy the following.
- reasonably efficient: it doesn't incur significant overhead compared to Euclidean methods and is GPU compatible
- easy to use: it's abstracted away from the API so that a casual user doesn't have to touch it
- general: it's general enough to be used with different models of hyperbolic space
Approaches
Hope for the best
We see many papers that simply accept the precision errors and try to mitigate them, or go to higher dimensions.
For example, our current approach in the Poincare model is to cast to tf.float64, which only gets us 53 bits of precision.
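To make the float64 limit concrete, here is a minimal sketch in plain Python (whose floats are IEEE doubles, the same format as tf.float64). It assumes the Poincare ball with curvature -1, where a point at hyperbolic distance d from the origin has Euclidean norm tanh(d/2). The helper names are hypothetical and not part of Hyperlib. Since 1 - tanh(d/2) is approximately 2e^{-d}, the norm rounds to exactly 1 once d exceeds roughly 38, and the point can no longer be told apart from the boundary.

```python
import math

def poincare_radius(distance: float) -> float:
    """Euclidean norm of a Poincare-ball point at the given hyperbolic distance from the origin."""
    return math.tanh(distance / 2.0)

def distance_from_origin(radius: float) -> float:
    """Inverse of poincare_radius: d = 2 * artanh(r). Returns inf once r has rounded to 1."""
    if radius >= 1.0:
        return math.inf
    return 2.0 * math.atanh(radius)

# d = 10 survives the round trip; d = 50 collapses onto the boundary.
for d in (10.0, 50.0):
    r = poincare_radius(d)
    print(d, r, distance_from_origin(r))
```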
Multiprecision
In the Sarkar embeddings, we use the multi-precision library mpmath to represent points. As far as multiprecision arithmetic goes, it is fast (assuming it is using the gmpy backend). However, its support for vector operations is not good and it cannot easily interoperate with numpy or tensorflow. Also, we do not yet have a good method to automatically determine the precision setting (for example, sarkar_embedding uses far too much precision by default).
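One possible starting point for choosing the precision automatically is sketched below. This is a heuristic under stated assumptions, not an existing Hyperlib or mpmath function. It assumes the Poincare ball with curvature -1 and that the largest hyperbolic distance from the origin is known in advance. Because 1 - tanh(d/2) is approximately 2e^{-d}, separating the norm of such a point from 1 takes about d * log2(e) bits. The `guard_bits` margin is an arbitrary illustrative value.

```python
import math

def required_bits(max_distance: float, guard_bits: int = 16) -> int:
    """Heuristic binary precision for points up to max_distance from the origin.

    Hypothetical helper for illustration; not part of Hyperlib or mpmath.
    """
    if max_distance < 0:
        raise ValueError("max_distance must be non-negative")
    # 1 - tanh(d/2) ~ 2 * exp(-d), so the gap to the boundary needs about d * log2(e) bits.
    bits_for_gap = math.ceil(max_distance * math.log2(math.e))
    return bits_for_gap + guard_bits

print(required_bits(30.0))   # compare against the 53 bits of float64
print(required_bits(100.0))
```

The result could then be assigned to mpmath's binary precision setting (`mp.prec`) before computing the embedding, rather than relying on a fixed default.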
Avoiding the Problem
One common approach to avoid precision errors, especially in hyperbolic SGD, is to keep points in the (Euclidean) tangent space, do all operations there, and map into hyperbolic space only when needed. We should definitely experiment with and support this method in Hyperlib. This will work for all models via the exponential map. However, it only solves part of the problem.
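A minimal sketch of the tangent-space idea, assuming the Poincare ball with curvature -1 and the tangent space at the origin, where the exponential map is exp_0(v) = tanh(|v|) v/|v| and its inverse is log_0(x) = artanh(|x|) x/|x|. The function names `exp0` and `log0` are illustrative and not Hyperlib's API.

```python
import math

def exp0(v):
    """Exponential map at the origin of the Poincare ball: tangent vector -> point in the ball."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        return [0.0 for _ in v]
    # Unit direction (c / norm) scaled by the radius tanh(norm).
    return [math.tanh(norm) * (c / norm) for c in v]

def log0(x):
    """Logarithmic map at the origin: point in the ball -> tangent vector (inverse of exp0)."""
    norm = math.sqrt(sum(c * c for c in x))
    if norm == 0.0:
        return [0.0 for _ in x]
    # math.atanh raises ValueError if norm has already rounded to 1.0.
    return [math.atanh(norm) * (c / norm) for c in x]

v_small = [0.3, 0.4]
print(log0(exp0(v_small)))   # round trip recovers v_small up to rounding

v_large = [30.0, 0.0]
print(exp0(v_large))         # lands on the boundary in float64
```

The tangent vector `v_large` is stored without any trouble, but its image under `exp0` has already rounded onto the boundary, so anything computed from the mapped point (distances, for instance) still hits the precision limit. This is one way to see why the approach only solves part of the problem.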
Multi-Component Float
Multi-Component Floats (MCF) are an alternate representation for floats that can be vectorized, proposed by Yu and De Sa as a way to do calculations in the upper half-space model. IMO this is the most promising approach if it can be extended to other models of hyperbolic space.
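To illustrate the general idea behind multi-component representations, here is a minimal two-component sketch built on Knuth's error-free addition (`two_sum`). It is not Yu and De Sa's implementation, and `add_float` is a hypothetical helper. A value is stored as an unevaluated sum of two floats (hi, lo), so a small term that a single float64 would drop is carried in the second component.

```python
def two_sum(a: float, b: float):
    """Knuth's error-free addition: returns (s, e) with s = fl(a + b) and s + e equal to a + b."""
    s = a + b
    b_virtual = s - a
    e = (a - (s - b_virtual)) + (b - b_virtual)
    return s, e

def add_float(x, f: float):
    """Add a plain float f to a two-component value x = (hi, lo), then renormalize."""
    s, e = two_sum(x[0], f)
    return two_sum(s, e + x[1])

x = two_sum(1.0, 1e-20)       # the pair (1.0, 1e-20) represents 1 + 1e-20
print((1.0 + 1e-20) - 1.0)    # plain float64: the small term is lost
print(add_float(x, -1.0))     # two components: the small term survives
```

Because the routine uses only additions and subtractions with no branching, the same arithmetic should apply elementwise to array types, which is what makes this family of representations attractive for vectorized and GPU code.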
Todos
- Spike: implement MCF for the upper half-space model