-
Notifications
You must be signed in to change notification settings - Fork 72
Description
When installing lb-foraging with certain versions of gym, the import hangs for a (very) long time.
Consider a simple script, import_tests.py:
from time import time
start = time()
import lbforaging
print(f"time = {time() - start:.3f}s")With gym==0.21.*:
$ python import_tests.py
time = 0.650s
vs. with gym==0.22.*:
$ python import_tests.py
time = 1188.446s
This massive bottleneck is caused by the way gym registers environments from v0.22 onwards. Consider the gym/envs/registrations.py::EnvRegistry::register() method in v0.21:
def register(self, id, **kwargs):
if self._ns is not None:
if "/" in id:
namespace, id = id.split("/")
logger.warn(
f"Custom namespace '{namespace}' is being overrode by namespace '{self._ns}'. "
"If you are developing a plugin you shouldn't specify a namespace in `register` calls. "
"The namespace is specified through the entry point key."
)
id = f"{self._ns}/{id}"
if id in self.env_specs:
logger.warn("Overriding environment {}".format(id))
self.env_specs[id] = EnvSpec(id, **kwargs)See how this is different in v0.22:
def register(self, id: str, **kwargs) -> None:
spec = EnvSpec(id, **kwargs)
if self._ns is not None:
if spec.namespace is not None:
logger.warn(
f"Custom namespace `{spec.namespace}` is being overridden "
f"by namespace `{self._ns}`. If you are developing a "
"plugin you shouldn't specify a namespace in `register` "
"calls. The namespace is specified through the "
"entry point package metadata."
)
# Replace namespace
spec.namespace = self._ns
try:
# Get all versions of this spec.
versions = self.env_specs.versions(spec.namespace, spec.name)
# We raise an error if the user is attempting to initialize an
# unversioned environment when a versioned one already exists.
latest_versioned_spec = max(
filter(lambda spec: isinstance(spec.version, int), versions),
key=lambda spec: cast(int, spec.version),
default=None,
)
unversioned_spec = next(
filter(lambda spec: spec.version is None, versions), None
)
# Trying to register an unversioned spec when versioned spec exists
if unversioned_spec and spec.version is not None:
message = (
"Can't register the versioned environment "
f"`{spec.id}` when the unversioned environment "
f"`{unversioned_spec.id}` of the same name already exists."
)
raise error.RegistrationError(message)
elif latest_versioned_spec and spec.version is None:
message = (
f"Can't register the unversioned environment `{spec.id}` "
f"when version `{latest_versioned_spec.version}` "
"of the same name already exists. Note: the default "
"behavior is that the `gym.make` with the unversioned "
"environment will return the latest versioned environment."
)
raise error.RegistrationError(message)
# We might not find this namespace or name in which case
# we should continue to register the environment.
except (error.NamespaceNotFound, error.NameNotFound):
pass
finally:
if spec.id in self.env_specs:
logger.warn(f"Overriding environment {id}")
self.env_specs[spec.id] = specSpecifically, the issue here is the addition of:
https://github.com/openai/gym/blob/95063a08943e1f587c58be7435d94f81ccac8fd9/gym/envs/registration.py#L559
versions = self.env_specs.versions(spec.namespace, spec.name)which calls
https://github.com/openai/gym/blob/95063a08943e1f587c58be7435d94f81ccac8fd9/gym/envs/registration.py#L220
self._assert_name_exists(namespace, name)which eventually hits:
https://github.com/openai/gym/blob/95063a08943e1f587c58be7435d94f81ccac8fd9/gym/envs/registration.py#L293
suggestions = difflib.get_close_matches(name, self.names(namespace), n=1)Herein lies the problem: lbforaging attempts to register a large number of environments (9720 by default). With each new environment, gym checks whether any environment has already been registered with a similar name. Naturally, this fuzzy match scales really badly with a large number of environments being registered.
v0.23 of gym takes the same approach to registration as v0.22, and thus faces the same issue. In v0.24-v0.26, the registration no longer looks for a fuzzy match (with difflib), but there is still a bottleneck due to an iteration over all registered envs:
https://github.com/openai/gym/blob/dcd185843a62953e27c2d54dc8c2d647d604b635/gym/envs/registration.py#L379-L403
def _check_spec_register(spec: EnvSpec):
"""Checks whether the spec is valid to be registered. Helper function for `register`."""
global registry
latest_versioned_spec = max(
(
spec_
for spec_ in registry.values()
if spec_.namespace == spec.namespace
and spec_.name == spec.name
and spec_.version is not None
),
key=lambda spec_: int(spec_.version), # type: ignore
default=None,
)
unversioned_spec = next(
(
spec_
for spec_ in registry.values()
if spec_.namespace == spec.namespace
and spec_.name == spec.name
and spec_.version is None
),
None,
)thus still yielding bad performance (definitely not as bad as v0.22 & v0.23 though):
$ python import_tests.py
time = 8.575s
I'm not sure what the solution for lbforaging is here?
- Advise users that gym v0.22 & v0.23 are incompatible
- Don't pre-register loads of environment configs. Perhaps do a lazy registration only when user asks for a given set-up? Not sure if/why this isn't a preferred approach.
Note: I am aware that gym has officially moved to https://github.com/Farama-Foundation/Gymnasium, but the issue remains there with gymnasium v0.27:
https://github.com/Farama-Foundation/Gymnasium/blob/6f35e7f87fc5b455b8cc70e366016c463fa52850/gymnasium/envs/registration.py#L298-L321