Semantically significant apis: a case study with sqlc // Work Dad Dev

I’ve been developing an URL shortener in my spare time: shortn.cc. It works as expected, though the frontend UI definitely needs work (hey, backend engineer for most of my career! Cut me some slack).

It is built using a Go stack with sqlite. As I’ve discussed before, I’m not a fan of ORMs and instead use a named query library called sqlc to create queries and generate a storage API for my service. It’s been working quite well and it’s a relief to be able to write SQL queries that get typed checked at API generation time instead of at runtime.

shortn.cc tracks clicks on its URLs without having to track users across the internet. This means I know nothing about the user clicking the URL except what I can get from the IP (location data) and the HTTP headers they sent me. This gives me less information and potentially false ones, if the caller is manually setting HTTP headers or using a VPN. But, I think it’s a fine trade-off that preserves user anonymity as much as possible while performing the URL redirect.

sqlc allows me to generate an API for my storage so that I can call queries.GetFullURL(shortURL) in my service code. But, the common storage API generated by sqlc has a sparse interface for creating transactions and after a year of experimenting, I evolved an API that seems obvious in retrospect but took me using the code in multiple projects, fixing bugs from the original designs and settling on trade-offs that made sense for me, resulting in a semantically significant API for transactions.

Let me take youi through my journey as we understand the sqlc generated API, what makes it cumbersome for transactions and the issues I had with each attempt that finally led me to the current API.

Generated code

sqlc works by setting up a yaml config with:

A reference to a schema file or migration files directory.
A reference to a queries file.
A setup for package names.

For our example, here is the sqlc.yaml file I use in most of my projects:

version: 2
sql:
  - engine: "sqlite"
    schema: "storage/datastore/migrations"
    queries: "storage/datastore/queries.sql"
    gen:
      go:
        package: "datastore"
        out: "storage/datastore"

In my projects I have a storage/datastore package, which is where the SQL generated code will be placed. It also has a queries.sql file which is where I write all my datastore queries, and a migrations directory where I store my SQL migration (ddl) files. When you call sqlc generate, it goes through the files in the migration directory in file name order, parses them, then parses the queries.sql file and if the queries are correct (it checks for table and columns names), it generates 3 files: db.go, models.go and queries.sql.go.

queries.sql.go has the API generated from queries.sql and defines a Queries struct with one method per query. models.go has the Go structs that map to the tables defined in the SQL migration files. And db.go is the db interface that will be the same for all database drivers that are compatible with database/sql¹.

Looking at the generated code for db.go:

package datastore

import (
    "context"
    "database/sql"
)

type DBTX interface {
    ExecContext(context.Context, string, ...any) (sql.Result, error)
    PrepareContext(context.Context, string) (*sql.Stmt, error)
    QueryContext(context.Context, string, ...any) (*sql.Rows, error)
    QueryRowContext(context.Context, string, ...any) *sql.Row
}

func New(db DBTX) *Queries {
    return &Queries{db: db}

}

type Queries struct {
    db DBTX
}

func (q *Queries) WithTx(tx *sql.Tx) *Queries {
    return &Queries{
        db: tx,
    }
}

It has a New constructor that accepts a DBTX interface and returns a *Queries instance. The methods on the DBTX interface are the same ones available on an *sql.DB type, so you can open the database, create the *datastore.Queries type and use it instead of *sql.DB directly:

db, err := sql.Open(“sqlite”, “/tmp/datastore.db”)

if err != nil {
    // Do something
}
defer db.Close()

queries := datastore.New(db)
svc := service.New(queries)
...

As you can see my service depends on *datastore.Queries, not on *sql.DB. Assuming we have Queries.GetFullURL (returns the full url for the short one), Queries.StoreShortURL (stores a short URL in the database) and Queries.UseQuota (uses one URL quota if user still has any), we could have two service methods CreateShortURL and ResolveShortURL that uses the storage API as such:

type Service struct {
    queries *datastore.Queries
}

func (s *Service) ResolveShortURL(ctx context.Context, short string) (string, error) {
    full, err := s.queries.GetFullURL(ctx, short)
    if err != nil {
        return “”, err
    }

    return full, nil
}

func (s *Service) CreateShortURL(ctx context.Context, full string) (string, error) {
    if err := s.queries.UseQuota(ctx); err != nil {
        return “”, fmt.Errorf(“quota exhausted: %w”, err)
    }

    short := s.generateShort()
    err := s.queries.StoreShortURL(ctx, datastore.StoreShortURLParams{
        Short: short,
        Full: full,
    })


    if err != nil {
        return “”, fmt.Errorf(“error storing short url: %w”, err)
    }

    return short, nil
}

Pretty straight forward. ResolveShortURL just calls Queries.GetFullURL and returns the full IRL if it exists. CreateShortURL tries to consume URL quota and if it succeeds, creates a short URL and calls Queries.StoreShortURL to store it in the database.

Most of you can see the issue here, right?

Where is my transaction?

While ResolveShortURL is fine, CreateShortURL has a big problem: its 2 database calls are not in the same transaction. So if UseQuota succeeds but StoreShortURL doesn’t, we would have consumed quota without creating the URL and the user would be very sad.

The solution, of course, is to call UseQuota and StoreShortURL in a transaction. That’s what the Queries.WithTx method is for: you pass it an *sql.Tx and it returns a new *Queries struct that is bound to the transaction. Except… where can I get the *sql.Tx from?

It should come from the *sql.DB.BeginTx call. But, for that, I would need a reference to the *sql.DB used by *Queries because it does not expose the *sql.DB for this purpose.

Moreover, the control of the transaction (calling Commit/Rollback when appropriate) is also external to calling the *Queries methods. Fundamentally, sqlc gives you named SQL queries only. You are still in charge of managing the db calls.

Towards a better API

The design decision from sqlc makes sense: all it’s doing is wrapping SQL queries with methods, so handling db transactions should still be in control of the caller.

Adding any transaction management to the library would bring it closer to being an ORM and I can appreciate the tradeoff, but it makes for a bad abstraction. From the perspective of my Go code, I’m not dealing with SQL queries, only with API calls. Having to handle transactions manually seems backwards.

A good abstraction would steer you away from trying to solve a common problem with semantically significant APIs. Being able to express a transaction given you have a *Queries type would make the code much easier to understand and I experimented with multiple APIs over the course of a year until I landed on one that works well.

Attempt #1: keep a reference to the `*sql.DB`

Since I need the reference to *sql.DB anyways, I just kept it side-by-side with the *Queries instance. I can now rewrite CreateShortURL to use a transaction:

type Service struct {
    queries *datastore.Queries
    db *sql.DB
}

// Nothing changes here, no transaction needed.
func (s *Service) ResolveShortURL(ctx context.Context, short string) (string, error) {
    full, err := s.queries.GetFullURL(ctx, short)
    if err != nil {
        return “”, err
    }

    return full, nil
}

func (s *Service) CreateShortURL(ctx context.Context, full string) (string, error) {
    // Create the transaction.
    tx, err := s.db.BeginTx(ctx, nil)
    if err != nil {
        return “”, err
    }

    // If tx.Commit is called successfully, Rollback will have no effect.
    defer tx.Rollback()

    queries := s.queries.WithTx(tx)
    if err := queries.UseQuota(ctx); err != nil {
        return “”, fmt.Errorf(“quota exhausted: %w”, err)
    }

    short := s.generateShort()
    err = queries.StoreShortURL(ctx, datastore.StoreShortURLParams{
        Short: short,
        Full: full,
    })

    if err != nil {
        return “”, fmt.Errorf(“error storing short url: %w”, err)
    }

    // Commit and return the short url.
    return short, tx.Commit()
}

This now works as expected. If UseQuota doesn’t fail but StoreShortURL does, the quota will not have been consumed. Notice the code looks pretty much the same, except that I’m creating and managing the transaction.

If you have a single or maybe two methods you want to run in a transaction, this is fine. But it gets old soon and at some point you want to hide the transaction mechanism because, while it is important to know when you are running in a transaction (semantically significant), the mechanics are a distraction.

Attempt #2: `datastore.DB` with Transaction method.

Let’s make the transaction more obvious. If instead of keeping an explicit reference to *sql.DB, we create a new type datastore.DB that exposes a Transaction method, we can have it deal with the mechanics and use a function parameter that gets called with the properly setup transaction:

package datastore

type DB struct {
    // Embed *Queries, so all their methods are available in DB.
    *Queries
    rdbms *sql.DB
}

func (db *DB) Transaction(ctx context.Context, f func(*DB) error) (tErr error) {
    defer func() {
        // Rollback the transaction in case there is an error.
        if tErr != nil {
            rbErr := tx.Rollback()
            if rbErr != nil {
                tErr = errors.Join(tErr, rbErr)
            }
        }
    }()

    // Create the transaction
    tx, err := db.rdbms.BeginTx(ctx, nil)
    if err != nil {
        return fmt.Errorf("error creating transaction: %w", err)
    }

    // Create the new DB instance that uses the transaction.
    txdb := DB{
        rdbms:   db.rdbms,
        Queries: db.Queries.WithTx(tx),
    }

    // Call the function parameter with it and handle the transaction.
    if err := f(txdb); err != nil {
        return fmt.Errorf("transaction error: %w", err)
    }

    return tx.Commit()
}

Now, reimplementing the service, we get:

type Service struct {
    queries datastore.DB
}

// Nothing changes here, no transaction needed.
func (s *Service) ResolveShortURL(ctx context.Context, short string) (string, error) {
    full, err := s.queries.GetFullURL(ctx, short)
    if err != nil {
        return “”, err
    }

    return full, nil
}

func (s *Service) CreateShortURL(ctx context.Context, full string) (string, error) {
    var short string

    return short, s.queries.Transaction(ctx, func(queries datastore.DB) error {
        if err := queries.UseQuota(ctx); err != nil {
            return fmt.Errorf(“quota exhausted: %w”, err)
        }

        // Make sure I don’t shadow the short variable.
        short = s.generateShort()

        // Hmm…
        err := s.queries.StoreShortURL(ctx, datastore.StoreShortURLParams{
            Short: short,
            Full: full,
        })

        if err != nil {
            return “”, fmt.Errorf(“error storing short url: %w”, err)

        }
        return nil
    })

}

This is much better now. The Transaction method with the closure defined within the call makes it obvious what is happening and other than wrapping the code within the closure, it looks exactly the same as before minus the *sql.Tx mechanics.

But, both attempts have a problem: there is nothing in the APIs that prevent me from using the original s.queries instance that is not bound to the transaction. In fact, one of the first bugs I had (and I left it here in the example) was that StoreShortURL was using s.queries instead of queries. The code semantics is incorrect but there is nothing syntactically incorrect, so it compiled and worked, until it didn’t.

The design of both attempts had as a goal not changing code that didn’t need the transaction, so the *Queries instance should be available for that purpose. The trade-off was, unless you have good testing for concurrent access, you will inevitably mix them and have unwanted behavior. You could also argue that naming the transaction-bound instance differently would have made it easier to spot the problem in code reviews, but a better design would make it impossible to happen in the first place.

Attempt #3: transactions all the way down.

This version make *Queries an unexported field instead of embedding it in datastore.DB and the function parameter expects *Queries instead datastore.DB as the function argument:

type DB struct {
    queries *Queries
    rdbms *sql.DB
}

func (db *DB) Transaction(ctx context.Context, f func(*Queries) error) error {
    defer func() {
        // Rollback the transaction in case there is an error.
        if tErr != nil {
            rbErr := tx.Rollback()
            if rbErr != nil {
                tErr = errors.Join(tErr, rbErr)
            }
        }
    }()


    tx, err := db.rdbms.BeginTx(ctx, nil)
    if err != nil {
        return fmt.Errorf("error creating transaction: %w", err)
    }

    queries := db.queries.WithTx(tx)
    if err := f(queries); err != nil {
        return fmt.Errorf("transaction error: %w", err)
        return err
    }

    return tx.Commit()
}

With this, we now require callers to always call Transaction and the only way to use *Queries methods is through the function parameter, making it impossible to use incorrectly:

type Service struct {
    db datastore.DB
}

// Rewrite to use the Transaction API.
func (s *Service) ResolveShortURL(ctx context.Context, short string) (string, error) {
    var full string

    return full, s.db.Transaction(ctx, func(queries *datastore.Queries) error {
        var err error
        full, err = queries.GetFullURL(ctx, short)
        return err
    })
}

func (s *Service) CreateShortURL(ctx context.Context, full string) (string, error) {
    var short string

    return short, s.db.Transaction(ctx, func(queries *datastore.Queries) error {
        if err := queries.UseQuota(ctx); err != nil {
            return fmt.Errorf(“quota exhausted: %w”, err)
        }

        // Make sure I don’t shadow the short variable.
        short = s.generateShort()

        // Calling s.db.queries would have resulted in a compiler error.
        err := queries.StoreShortURL(ctx, datastore.StoreShortURLParams{
            Short: short,
            Full: full,
        })

        if err != nil {
            return “”, fmt.Errorf(“error storing short url: %w”, err)
        }
        return nil
    })
}

We now have a semantically significant API that prevents us from having unwanted transaction behaviors. The trade-off is that every call that involves the storage API must be wrapped in a function parameter.

Appendix: sqlite and repo example

I created a repo with a storage API. It is actually an echo golang service template that I use for all of my projects. It uses sqlc and SQLite and out of the box you get user authentication. You need to define a .env file with the correct environment flags (see config/config.go). You can install air which will run it in and live reload if you make changes.. Or, just call run.sh and it will start the service.

It has, however, a different implementation of the api from attempt #3: it uses generics so that I can use datatstore.DB with different *Queries types and instead of Transaction it has Read and Write methods. The reason for that would be a post of its own but in short, sqlite only allows a single writer at a time and by creating separate connections for the readers and writers, I avoid dealing with user-level errors that come from having concurrent writes. Otherwise, the idea is exactly the same.

Notes

There are specialized interfaces for specialized drivers e.g pgx/v5 for PostgreSQL but we can ignore those for now. ↩︎