Static Registration
Overview
Static registration is the method used by spice_cloud to locate registries in clients’ projects. This approach leverages Abstract Syntax Trees (AST) and selective code execution to efficiently parse registrations, ensuring the process is both fast and lightweight.
The code is located in the app/spice_cloud/parser/ folder within the spice repository. As indicated by the path, the code is server-side and should not be executed directly by clients.
The parsing is executed on the project route (/api/project/{project_id}/registries).
Algorithm Details
The algorithm processes blobs downloaded from the client’s repository and operates exclusively on Python modules and notebooks.
The entry point is the main:load_registries function, which:
- Searches for registered features (main:find_features_in_project).
- Builds static registries from the features (main:build_registries).
Registrations and features are parsed using a combination of AST (through the astroid library) and the exec and eval functions. The core idea of the algorithm is to act as a minimal Python interpreter, selectively identifying and executing only the specific pieces of code related to each feature to obtain their values.
Detailed Steps
- Feature Search: find_features_in_project searches for features in each module and notebook. search_registration is called to search for features in a specific module or notebook. When a feature is found, it is saved as a StaticFeature object.
- Registry Building: Once all files have been parsed, build_registries builds static registries (StaticRegistry) from the static features.
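The two phases above can be sketched as follows. This is a minimal illustration, not the code in app/spice_cloud/parser/: it uses the standard-library ast module instead of astroid, the fields of StaticFeature and StaticRegistry are hypothetical, only literal name= values are handled, and the registry is identified by its variable name rather than by inferring a Registry object.

```python
import ast
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class StaticFeature:
    # Hypothetical shape: the real StaticFeature likely carries more attributes.
    name: str
    registry: str
    module: str


@dataclass
class StaticRegistry:
    # Hypothetical shape: a named collection of static features.
    name: str
    features: list = field(default_factory=list)


def search_registration(module_name, source):
    """Yield a StaticFeature for each `<registry>.register(..., name="...")` call."""
    for node in ast.walk(ast.parse(source)):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
            continue
        if node.func.attr != "register" or not isinstance(node.func.value, ast.Name):
            continue
        for kw in node.keywords:
            # Only literal names are handled in this sketch.
            if kw.arg == "name" and isinstance(kw.value, ast.Constant):
                yield StaticFeature(kw.value.value, node.func.value.id, module_name)


def find_features_in_project(blobs):
    """Collect features from every module in a {module_name: source} mapping."""
    return [f for name, src in blobs.items() for f in search_registration(name, src)]


def build_registries(features):
    """Group features by the registry they were registered on."""
    grouped = defaultdict(list)
    for feature in features:
        grouped[feature.registry].append(feature)
    return {name: StaticRegistry(name, feats) for name, feats in grouped.items()}


blobs = {
    "a.py": 'metrics.register(lambda: 1, name="clicks")\n',
    "b.py": 'metrics.register(lambda: 2, name="views")\n'
            'models.register(lambda: 3, name="churn")\n',
}
registries = build_registries(find_features_in_project(blobs))
```

Separating the search from the grouping means registries are only assembled once every file has been parsed, so a registry whose features are spread across several modules ends up as a single StaticRegistry.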
Inference
The inference module is a crucial component of the system, with its infer_node function taking an AST node and returning its computed value. This function is used to compute several entities:
- Registries
- Feature attributes
- Loop variables
Registries are identified by looking for calls to functions and methods named register, either as decorators or inline. infer_node is then used to verify that the call is made on the register method of a Registry object and to obtain the registry name.
When a registration is found, infer_node processes the keyword arguments passed to register and the callable being registered to extract feature attributes.
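To illustrate the two call shapes, here is a small sketch that finds register calls used as decorators and inline, then reads their keyword arguments. It assumes the standard-library ast module rather than astroid, and the helper names (is_register_call, find_register_calls) are hypothetical. It matches on the method name only and does not verify that the receiver is a Registry object, which is the part infer_node handles.

```python
import ast

SOURCE = '''
@registry.register(name="decorated")
def compute():
    return 1

registry.register(lambda: 2, name="inline")
other.unrelated(name="ignored")
'''


def is_register_call(node):
    """True for calls shaped like `<something>.register(...)`."""
    return (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Attribute)
        and node.func.attr == "register"
    )


def _literal_keywords(call):
    # ast.literal_eval only accepts literals, so no client code is executed here.
    return {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords if kw.arg}


def find_register_calls(source):
    """Return (kind, keyword_values) for each register call found."""
    tree = ast.parse(source)
    decorator_calls = set()
    found = []
    # First pass: register calls used as decorators.
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            for deco in node.decorator_list:
                if is_register_call(deco):
                    decorator_calls.add(id(deco))
                    found.append(("decorator", _literal_keywords(deco)))
    # Second pass: every other register call counts as inline.
    for node in ast.walk(tree):
        if is_register_call(node) and id(node) not in decorator_calls:
            found.append(("inline", _literal_keywords(node)))
    return found


calls = find_register_calls(SOURCE)
```

A bare `@registry.register` decorator (without parentheses) is not a call node and would need separate handling; this sketch leaves that case out.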
If a for-loop is encountered during the search for registrations, infer_node extracts each loop variable, allowing it to recognize features registered within the loop. For example, in the following case, the parser correctly identifies three features: a_1, a_2, and a_3:
for num in [1, 2, 3]:
    registry.register(lambda: ..., name=f"a_{num}")
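One way to picture the loop handling is to compute the iterable, then evaluate the name= expression once per loop value. The sketch below assumes a literal iterable, a single loop variable, and the standard-library ast module; names_registered_in_loops is a hypothetical helper, not part of the parser.

```python
import ast

SOURCE = '''
for num in [1, 2, 3]:
    registry.register(lambda: ..., name=f"a_{num}")
'''


def names_registered_in_loops(source):
    """Expand each top-level for-loop and return the registered feature names."""
    names = []
    for loop in ast.parse(source).body:
        if not (isinstance(loop, ast.For) and isinstance(loop.target, ast.Name)):
            continue
        # Assumption: the iterable is a literal, so literal_eval is enough here.
        values = ast.literal_eval(loop.iter)
        for call in (n for n in ast.walk(loop) if isinstance(n, ast.Call)):
            if not (isinstance(call.func, ast.Attribute) and call.func.attr == "register"):
                continue
            for kw in call.keywords:
                if kw.arg != "name":
                    continue
                code = compile(ast.Expression(kw.value), "<name>", "eval")
                for value in values:
                    # Evaluate only the `name=` expression with the loop variable bound.
                    # Emptying builtins reduces exposure but is not a sandbox.
                    names.append(eval(code, {"__builtins__": {}}, {loop.target.id: value}))
    return names


loop_names = names_registered_in_loops(SOURCE)
```

Only the expression passed as name= is compiled and evaluated; the loop body itself is never executed.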
infer_node works in two steps:
- Explore the node being inferred with the _Visitor class, using the visitor pattern (see an example on the astroid project). This class retrieves only the nodes necessary to compute the target node’s value, efficiently ignoring all irrelevant nodes.
- Compute the node value by using exec to execute the code of the extracted nodes and eval to infer the value of the target node.
  - ⚠️ We must be careful with this specific part, as it is dangerous to directly execute Python code from clients.
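The two steps can be approximated with the standard library as shown below. This is a simplified sketch, not the actual infer_node: _DependencyCollector is a hypothetical stand-in for _Visitor, only simple top-level assignments are followed, and a name assigned more than once resolves to its last assignment.

```python
import ast


class _DependencyCollector(ast.NodeVisitor):
    """Collect the names an expression reads (hypothetical stand-in for _Visitor)."""

    def __init__(self):
        self.names = set()

    def visit_Name(self, node):
        if isinstance(node.ctx, ast.Load):
            self.names.add(node.id)


def _names_read(node):
    collector = _DependencyCollector()
    collector.visit(node)
    return collector.names


def infer_expression(source, expression):
    """Evaluate `expression` after running only the assignments it depends on."""
    module = ast.parse(source)
    # Map each simple top-level assignment target to its position and statement.
    assignments = {}
    for index, stmt in enumerate(module.body):
        if (
            isinstance(stmt, ast.Assign)
            and len(stmt.targets) == 1
            and isinstance(stmt.targets[0], ast.Name)
        ):
            assignments[stmt.targets[0].id] = (index, stmt)

    # Step 1: follow name dependencies transitively, starting from the target.
    target = ast.parse(expression, mode="eval")
    needed, pending = {}, _names_read(target)
    while pending:
        name = pending.pop()
        if name in assignments and name not in needed:
            needed[name] = assignments[name]
            pending |= _names_read(assignments[name][1].value)

    # Step 2: exec the selected statements in source order, then eval the target.
    selected = [stmt for _, stmt in sorted(needed.values(), key=lambda pair: pair[0])]
    namespace = {}
    exec(compile(ast.Module(body=selected, type_ignores=[]), "<selected>", "exec"), namespace)
    return eval(compile(target, "<target>", "eval"), namespace)


SOURCE = '''
import os
PREFIX = "user"
VERSION = 2
UNUSED = expensive_call()
NAME = f"{PREFIX}_v{VERSION}"
'''
inferred = infer_expression(SOURCE, "NAME")
```

In this example the import and the UNUSED assignment are never executed, so the undefined expensive_call does not raise. The selected statements are still client code run with exec, so the warning above applies to this sketch as well.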
Caveats
Currently, the parser works well in most scenarios, but there are several ways it differs from import-time parsing:
- When astroid fails to find the AST node corresponding to the registered callable, some feature attributes may not be correctly identified.
  - This issue could be resolved by developing custom code to infer the registered callable, rather than relying on astroid (in feature:_get_registered_callable).
- The parser recursively computes for-loop iterables even if the loops don’t contain registrations. This can increase parsing time.
  - This issue could be resolved by first identifying registrations and then checking whether they are within a for-loop.
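The registration-first idea from the last caveat might look like the following sketch. It assumes the standard-library ast module, which has no parent links, so a child-to-parent map is built first (astroid nodes expose a parent attribute, which would make that step unnecessary). The helper name loops_containing_registrations is hypothetical.

```python
import ast

SOURCE = '''
for i in range(10**6):
    pass

for num in [1, 2]:
    registry.register(lambda: ..., name=f"a_{num}")

registry.register(lambda: ..., name="b")
'''


def loops_containing_registrations(source):
    """Return the line numbers of for-loops that enclose a register call."""
    tree = ast.parse(source)
    # Build a child -> parent map, since stdlib ast nodes do not link to parents.
    parents = {
        child: parent
        for parent in ast.walk(tree)
        for child in ast.iter_child_nodes(parent)
    }

    loop_lines = set()
    for node in ast.walk(tree):
        is_register = (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "register"
        )
        if not is_register:
            continue
        # Walk up from the registration; only enclosing loops need their iterable inferred.
        ancestor = parents.get(node)
        while ancestor is not None:
            if isinstance(ancestor, ast.For):
                loop_lines.add(ancestor.lineno)
            ancestor = parents.get(ancestor)
    return sorted(loop_lines)


loops_to_infer = loops_containing_registrations(SOURCE)
```

With this ordering, the first loop in the example is never selected, so its large iterable would not need to be computed.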