System Design¶

What Data to Cache?¶

All of it.

Well, all of the data in the subset that we care about. It would be awesome if the cache was only required to store the data that is actually returned from the API, however it can’t know that it is able to satisfy any given query with the data is had unless it has everything.

So, we will pick a subset of entity types, and a subset of fields on those types, and we will assume that we have every entity that will ever be requested via the API.

A good boundary (for us) is to cut off old projects, however the cache will have no knowledge of this.

When to Cache Data?¶

We have a few opportunities to capture data as it is created/modified, and can manufacture a few more:

1. The event log¶

Anything a user does in the UI will be reflected in the event log. Anything done by a script with generate_events active will also end up in the log.

2. Results of pass-through requests¶

Anything returned from a request we can’t handle (e.g. an update, create, a read on a entity/field that is uncached or a filter we don’t support) can be inspected and cached.

3. Periodic scans¶

Even if a script does not generate events, it will still affect the updated_at field on entities. Ergo, we can periodically scan every entity type, and request all entities which have been changed since the last scan.

4. Executing all requests asynchronously¶

Even if we are able to fully satisfy a read request, we can pass that request through anyways and make sure that the data we returned matches.

Where to Cache Data?¶

We use PostgreSQL (or SQLite in development) to store the data.

Shotgun also uses PostgreSQL, but is almost guaranteed to use it differently.

Schema Rules¶

Basic entities are stored in tables sharing the name of the entity type, e.g.: asset and shot.
Special fields (e.g. caching meta-data) are prefixed with an underscore.
Columns generally take the field’s name.
All fields representing Shotgun data will be left nullable, unless we are absolutely certain they are not (e.g. checkbox).
multi_entity fields will be represented as an association table named:
```
{entity_type}_{field_name}
```
We will not attempt to represent the *Connection entities, as we have so little of an understanding of how they work that we feel that we shouldn’t even try.

We will also not attempt to represent the bi-directional nature of Shotgun’s multi-entity relationships (since we cannot trivially determine them with the public API), and instead rely upon the event log to update the inverse association.
entity fields will be encoded with the entity type in {field_name}__type and the id in {field_name}__id (note the double underscores).

Caveats¶

The primary caveat that we must make is that in order for the cache to be up to date (delayed only by the event loop) is that all script API keys must generate events.

This is often disabled on some scripts in order to stop the massive amount of email they can generate. However, there is no other way for the cache to be immediately notified of changes.

In the future (i.e. when these features are implemented), you can reduce the effect of non-event-generating scripts by decreasing the delay between update scans, or you can execute all requests asynchronously.