Advanced Examples

These are examples of some of the more advanced features that Estoult has.

Associations

Currently only two kinds of associations are supported: one-to-one and one-to-many. You can create them with the Association class.

Here’s an example where we have an Organisation that has many Users and one Admin. Where an Admin is also a User.

from esoult import Field, Association


class User(db.Schema):

    __tablename__ = "users"

    id = Field(int, null=False)
    organisation_id = Field(int)
    name = Field(str)

class Admin(db.Schema):

    __tablename__ = "admins"

    id = Field(int, null=False)
    organisation_id = Field(int)
    user_id = Field(int)

    user = Association.has_one(User, on=["user_id", "id"])

class Organisation(db.Schema):

    __tablename__ = "organisations"

    id = Field(int, null=False)
    name = Field(str)

    admin = Association.has_one(Admin, on=["id", "organisation_id"])
    users = Association.has_many(User, on=["id", "organisation_id"])

If you need to create a relationship on a schema that has not yet been defined, you can supply a string of the module path of the schema instead of the schema path itself:

class Organisation(db.Schema):
    # ...
    admin = Association.has_one("schemas.base.Admin", on=["id", "organisation_id"])
    # ...

With associations we can now preload to retrieve all the data from Organisation to User without having to construct multiple queries.

organisations = (
   Query(Organisation)
   .select()
   .preload({Organisation.admin: [Admin.user])
   .preload(Organisation.users)
   .execute()
)

To retrieve all the fields in the associated schema you can have the association field by itself, like with .preload(Organisation.users). To define specific fields and associations in the schema (Estoult will not recursively go through every association by default for performance reasons, so they must be specified), pass in a dictionary with the key as the owner field and the value as a list of fields.

This select will return a list of rows looking something like this:

[
   {
      "id": 1,
      "name": "Org Name",
      "admin": {
         "user": {"id": 2, "name": "Real Name"}
      },
      "users": [
         {"id": 3, "name": "Fake Name"}
         ...
      ]
   }
   ...
]

Note that we aren’t getting the id in the admin field because we only specified Admin.user. If instead, we wanted the id of admin but not of user we would need to change the preload:

# From this
.preload({Organisation.admin: [Admin.user])
# To this
.preload({Organisation.admin: [Admin.id, {Admin.user: [User.name]}]})

For creating and updating with associations you pass in the same structure you would get from a preload.

new_org = {
   "name": "Les Fans D'Astolfo",
   "admin": {
      "user": {"name": "Test Account"}
   },
   "users": [
      # This is updated
      {"id": 4, "name": "Justin Duch"},
      # This is inserted
      {"name": "Emilie Rousseau"}
   ],
}

org = Organisation.insert(new_org)

For each association, if there is a primary key supplied it will update the row, otherwise it will do an insert. This is the same no matter if the parent is inserting or updating.

Disabling wildcard selects on a schema

If performance is of importance to you, it is recommended to never use a wildcard (*) in your SQL select queries. For larger tables a select * is often detrimental to performance. Unfortunately, most ORMs (and Estoult) make wildcards too easy to do, as they are the default when no fields are specifically provided in a query.

But unlike most ORMs, Estoult doesn’t suck and actually cares about performant SQL, so you can easily remove the ability to use a * with the class variable _allow_wildcard_select in your schema.

class VeryBigTable(db.schema):
    __tablename__ = "very_big_table"
    _allow_wildcard_select = False

    id = Field(str)
    field1 = Field(str)
    field2 = Field(str)

Because wildcards can still be useful for prototyping, and aren’t a big of a problem for smaller tables, this allows you to pick and choose which schemas can use wildcards. You could also disallow them on every schema by default by using a base class, and separately allow them in each schema for smaller tables.

Now running a select query without specifying fields will raise an error:

# ILLEGAL! GO TO GAOL!
Query(VeryBigTable).select().execute()

# Do this instead :3
Query(VeryBigTable).select(VeryBigTable.field2).execute()

This also applies to preloading associations:

# BIG NO NO!
Query(VeryBigTableParent)
    .select()
    .preload(VeryBigTableParent.very_big_tables)
    .execute()

# Very fast and correct, good job!
Query(VeryBigTableParent)
    .select()
    .preload({VeryBigTableParent.very_big_tables: [VeryBigTable.field1]})
    .execute()

Procedurally building queries

Imagine you have a service to sync your music library between multiple streaming platforms.

You’d have one Schema to keep track of your general music collection and several links to each of your services.

class Song(db.schema):
    __tablename__ = "songs"

    id = Field(str)
    title = Field(str)
    artist = Field(str)


class Spotfiy(db.schema):
    __tablename__ = "spotify"

    song_id = Field(str, "song_id", primary_key=True)
    spotify_id = Field(str, "spotify_id")


class AppleMusic(db.schema):
    __tablename__ = "applemusic"

    song_id = Field(str, "song_id", primary_key=True)
    applemusic_id = Field(str, "applemusic_id")


class Soundcloud(db.schema):
    __tablename__ = "soundcloud"

    song_id = Field(str, "song_id", primary_key=True)
    soundcloud_id = Field(str, "soundcloud_id")

To populate the tables, you use each of the services API’s to fetch all your music, add each song to the Song table and the service’s corresponding id to it’s table with the song_id we made.

Note

I have no idea if any of these services let you do this.

But for some reason, because you are a terrible programmer, this process creates a bunch of “orphaned” songs, where a row in the Song table has no rows in either Spotify, AppleMusic, or Soundcloud pointing to it. And instead of fixing the underlying issue, you just decide to write a SQL statement to remove orphaned songs to run every once in a while.

The SQL would look like this.

delete from songs
where song.id in (
    select song.id
    from songs
             left join spotify on spotify.song_id = songs.id
             left join applemusic on applemusic.song_id = songs.id
             left join soundcloud on soundcloud.song_id = songs.id
    where spotify.spotify_id is null
      and applemusic.applemusic_id is null
      and soundcloud.soundcloud_id is null
)

You’d easily be able to write this in Estoult, but you know that you are intending to add more streaming services in the future and wont be bothered to keep having to change the query.

Instead we can create this query procedurally to minimise the amount of things you’ll have to change.

First, let’s make a mapping for our tables.

links = {
    "spotify": Spotfiy
    "applemusic": AppleMusic,
    "soundcloud": Soundcloud,
}

Now let’s start on the “select” subquery. We first start by selecting the Song.id.

select_query = Query(Song).select(Song.id)

And we can add our left joins for every service.

for schema in links.values():
    select_query.left_join(schema, on=[Song.id, schema.song_id])

For our where we would use list comprehension and then de-construct the list into function arguments.

select_query.where(
    *[op.is_null(s[f"{n}_id"]) for n, s in links.items()]
)

You can access a Schema’s fields as if they were a dict. In op.is_null(s[f"{n}_id"]) we are doing just that, where s and n are the schema and name we get from links.items().

Finally, we can add it to a delete function.

Query(Song).delete().where(op.in_(Song.id, select_query)).execute()

Now we only need to change the links dictionary instead of messing around with the query.

This is what it looks like all together for reference:

links = {
    "spotify": Spotfiy
    "applemusic": AppleMusic,
    "soundcloud": Soundcloud,
}

select_query = Query(Song).select(Song.id)

for schema in links.values():
    select_query.left_join(schema, on=[Song.id, schema.song_id])

select_query.where(
    *[op.is_null(s[f"{n}_id"]) for n, s in links.items()]
)

Query(Song).delete().where(op.in_(Song.id, select_query)).execute()