The problem with ORM libraries

Writing a data access layer is tedious for developers who are not familiar with SQL, and the paradox of choice among ORM libraries does not help. Many developers simply pick the most-starred ORM library on GitHub and end up with unexpected behavior because some of its features over-engineer their requirements.

The object/relational mapping problem is hard because, essentially, an ORM library synchronizes two quite different representations of data. It has been debated for years by well-known authors, including OrmHate by Martin Fowler (not against ORMs, but worth mentioning anyway), Object-Relational Mapping Is the Vietnam of Computer Science by Jeff Atwood, The Vietnam of Computer Science by Ted Neward, ORM Is an Anti-Pattern by Laurie Voss, and many others.

Taking your baby steps with a micro ORM seems to be a good choice for simplicity and maintainability. In this tutorial, I will show you how to build a micro ORM, following the approach taken by drone, an elegantly structured Go project.

The design of a data mapper

Before jumping into the implementation details, let's summarize the definition of a data mapper.

Data mapper is a layer of mappers that moves data between objects and a database while keeping them independent of each other and the mapper itself.

From this definition, a data mapper has three parts: an object, a mapper, and a database connection. In this post, I use a User object, which exists in many systems, and a UserStore that groups data access actions on a data store. UserStore defines the domain logic for manipulating data in the data store. Each action contains a mapper that moves data between application objects and the database. For example, the Create method uses a mapper to translate a User object into an INSERT SQL statement that creates a new user record in the table. The database connection is established in the main function and passed in by dependency injection.
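The wiring from main can be sketched as follows. Note that DB, Connect, and NewUserStore here are hypothetical stand-ins for the project's real constructors; the point is only that the store receives the shared connection and never opens one itself:

```go
package main

import "fmt"

// DB is a hypothetical stand-in for the post's connection wrapper;
// the real one would wrap an *sqlx.DB pool.
type DB struct{ dsn string }

// Connect is a hypothetical constructor that opens the pool once, in main.
func Connect(driver, dsn string) (*DB, error) { return &DB{dsn: dsn}, nil }

// userStore implements the UserStore actions against the injected connection.
type userStore struct{ db *DB }

// NewUserStore receives the shared connection by dependency injection.
func NewUserStore(db *DB) *userStore { return &userStore{db: db} }

func main() {
	db, err := Connect("sqlite3", ":memory:")
	if err != nil {
		panic(err)
	}
	users := NewUserStore(db) // the store never opens its own connection
	fmt.Println(users.db.dsn)
}
```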

Figure 1 depicts the relationship between the User object, the UserStore mapper interface, and the database.

Figure 1: Relationship between User object, UserStore and database

The following User struct carries the user's identity (Login, Email), presentation (Avatar), state (Active), and tracking fields (LastLogin, Created, Updated). The type int64 is used for the tracking fields to store Unix timestamps: a timestamp decouples the stored value from the server's time zone and avoids parsing string formats.

// User represents a user of the system.
type User struct {
  ID        int64
  Login     string
  Email     string
  Avatar    string
  Active    bool
  LastLogin int64
  Created   int64
  Updated   int64
}

The UserStore methods are divided into commands (Create, Update, Delete) and queries (Find, List, FindLogin, FindActive, Count) for write and read operations respectively. The command group changes the state of a User object, while the query group only reads it.

// UserStore defines operations for working with users.
type UserStore interface {

  // Create persists a new user to the datastore.
  Create(context.Context, *User) error

  // Update persists an updated user to the datastore.
  Update(context.Context, *User) error

  // Delete deletes a user from the datastore.
  Delete(context.Context, *User) error

  // Find returns a user from the data store.
  Find(context.Context, int64) (*User, error)

  // List returns a list of users from the datastore.
  List(context.Context) ([]*User, error)

  // FindLogin returns a user from the datastore by username.
  FindLogin(context.Context, string) (*User, error)

  // FindActive returns a list of active users from the datastore.
  FindActive(context.Context) ([]*User, error)

  // Count returns a count of users.
  Count(context.Context) (int64, error)
}

I use context.Context as the first parameter of every method so that each call can carry deadlines, cancellation signals, and other request-scoped values.

Database connection

Although the built-in database/sql package provides a generic SQL interface, it does not provide named query statements and parameter binding, which sqlx does. A common Locker interface locks and unlocks a shared resource (SQLite for testing), and the Driver type distinguishes between database drivers.

// Driver defines the database driver.
type Driver int

// Database driver enums.
const (
  Sqlite Driver = iota + 1
  Mysql
  Postgres
)

// DB is a pool of zero or more underlying connections to
// the database.
type DB struct {
  conn   *sqlx.DB
  lock   Locker
  driver Driver
}

// A Locker represents an object that can be locked and unlocked.
type Locker interface {
  Lock()
  Unlock()
  RLock()
  RUnlock()
}

The command/query pattern is implemented with the Queryer, Binder, and Execer helper interfaces. Binder maps named parameters to build a query, Queryer reads rows from the database and returns a *sql.Row or *sql.Rows, and Execer performs an execution with a given query.

type (
  // Queryer interface defines a set of methods for
  // querying the database.
  Queryer interface {
    Query(query string, args ...interface{}) (*sql.Rows, error)
    QueryRow(query string, args ...interface{}) *sql.Row
  }

  // Binder interface defines database field bindings.
  Binder interface {
    BindNamed(query string, arg interface{}) (string, []interface{}, error)
  }

  // Execer interface defines a set of methods for executing
  // read and write commands against the database.
  Execer interface {
    Queryer
    Exec(query string, args ...interface{}) (sql.Result, error)
  }
)

View and Lock wrap a read or write to the database in a safe action with the desired lock.

// View executes a function within the context of a managed read-only
// transaction. Any error that is returned from the function is returned
// from the View() method.
func (db *DB) View(fn func(Queryer, Binder) error) error {
  db.lock.RLock()
  err := fn(db.conn, db.conn)
  db.lock.RUnlock()
  return err
}

// Lock obtains a write lock to the database (sqlite only) and executes
// a function. Any error that is returned from the function is returned
// from the Lock() method.
func (db *DB) Lock(fn func(Execer, Binder) error) error {
  db.lock.Lock()
  err := fn(db.conn, db.conn)
  db.lock.Unlock()
  return err
}

The Update action requires rolling back the executed statements when an error happens. A deferred function wraps recover and commit to support atomic transactions.

// Update executes a function within the context of a read-write managed
// transaction. If no error is returned from the function then the
// transaction is committed. If an error is returned then the entire
// transaction is rolled back. Any error that is returned from the function
// or returned from the commit is returned from the Update() method.
func (db *DB) Update(fn func(Execer, Binder) error) (err error) {
  db.lock.Lock()
  defer db.lock.Unlock()

  tx, err := db.conn.Begin()
  if err != nil {
    return err
  }

  defer func() {
    if p := recover(); p != nil {
      err = tx.Rollback()
      debug.PrintStack()
    } else if err != nil {
      tx.Rollback()
    } else {
      err = tx.Commit()
    }
  }()

  err = fn(tx, db.conn)
  return err
}
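To see the commit/rollback decision in isolation, here is a standard-library-only sketch of the same deferred pattern, with a hypothetical fakeTx standing in for *sql.Tx:

```go
package main

import "fmt"

// fakeTx records whether Commit or Rollback ran, standing in for *sql.Tx.
type fakeTx struct{ outcome string }

func (t *fakeTx) Commit() error   { t.outcome = "commit"; return nil }
func (t *fakeTx) Rollback() error { t.outcome = "rollback"; return nil }

// update mirrors the defer in DB.Update: commit on success,
// roll back on error or panic.
func update(tx *fakeTx, fn func() error) (err error) {
	defer func() {
		if p := recover(); p != nil {
			tx.Rollback()
			err = fmt.Errorf("panic: %v", p)
		} else if err != nil {
			tx.Rollback()
		} else {
			err = tx.Commit()
		}
	}()
	err = fn()
	return err
}

func main() {
	ok := &fakeTx{}
	update(ok, func() error { return nil })

	bad := &fakeTx{}
	update(bad, func() error { panic("boom") })

	fmt.Println(ok.outcome, bad.outcome)
}
```

Because err is a named return value, the deferred function can inspect and overwrite it after fn has run, which is what makes the pattern work.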

The mapper details

The queryBase constant declares the fields shared by every SELECT statement, so the mapping is reused by the child queries. For example, queryKey is our domain logic for finding a user by a given id. SELECT * FROM users would leave it unclear which fields the database returns; listing them explicitly guarantees the field order and data types.

const queryBase = `
SELECT
user_id,
user_login,
user_email,
user_avatar,
user_active,
user_last_login,
user_created,
user_updated
`

const queryKey = queryBase + `
FROM users
WHERE user_id = :user_id
`

We do the field mapping as follows:

// helper function converts the User structure to a set
// of named query parameters.
func toParams(u *core.User) map[string]interface{} {
	return map[string]interface{}{
		"user_id":         u.ID,
		"user_login":      u.Login,
		"user_email":      u.Email,
		"user_avatar":     u.Avatar,
		"user_active":     u.Active,
		"user_last_login": u.LastLogin,
		"user_created":    u.Created,
		"user_updated":    u.Updated,
	}
}

// helper function scans the sql.Row and copies the column
// values to the destination object.
func scanRow(scanner db.Scanner, dest *core.User) error {
	return scanner.Scan(
		&dest.ID,
		&dest.Login,
		&dest.Email,
		&dest.Avatar,
		&dest.Active,
		&dest.LastLogin,
		&dest.Created,
		&dest.Updated,
	)
}

The use of interface{} gives us a generic type to map each field in the User struct to a column in a SQL record. The drawback is that the compiler cannot type-check the mapper at compile time. We can overcome this issue by writing a unit test.
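Such a unit test can be as small as asserting that toParams stays in sync with the column list from queryBase, so a renamed struct field fails fast. A minimal sketch, with the types duplicated to keep it self-contained:

```go
package main

import "fmt"

type User struct {
	ID        int64
	Login     string
	Email     string
	Avatar    string
	Active    bool
	LastLogin int64
	Created   int64
	Updated   int64
}

// toParams is duplicated from the post so this example runs on its own.
func toParams(u *User) map[string]interface{} {
	return map[string]interface{}{
		"user_id":         u.ID,
		"user_login":      u.Login,
		"user_email":      u.Email,
		"user_avatar":     u.Avatar,
		"user_active":     u.Active,
		"user_last_login": u.LastLogin,
		"user_created":    u.Created,
		"user_updated":    u.Updated,
	}
}

func main() {
	// The columns selected by queryBase, in order.
	cols := []string{
		"user_id", "user_login", "user_email", "user_avatar",
		"user_active", "user_last_login", "user_created", "user_updated",
	}
	params := toParams(&User{ID: 1, Login: "octocat"})
	for _, c := range cols {
		if _, ok := params[c]; !ok {
			panic("missing parameter: " + c)
		}
	}
	fmt.Println(len(params) == len(cols))
}
```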

If we have many SQL statements, we can take advantage of the editor's multi-selection to update similar names everywhere (Cmd+D in Sublime Text, for example).

// Find returns a user from the datastore.
func (s *userStore) Find(ctx context.Context, id int64) (*core.User, error) {
	out := &core.User{ID: id}
	err := s.db.View(func(queryer db.Queryer, binder db.Binder) error {
		params := toParams(out)
		query, args, err := binder.BindNamed(queryKey, params)
		if err != nil {
			return err
		}
		row := queryer.QueryRow(query, args...)
		return scanRow(row, out)
	})
	return out, err
}
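List and FindActive return multiple rows, so a companion helper to scanRow is natural. The scanRows helper below is a hypothetical sketch: the rows interface and fakeRows stand in for *sql.Rows, and the User type is trimmed to the Login field for brevity:

```go
package main

import "fmt"

// rows abstracts *sql.Rows so the helper can be shown without a database.
type rows interface {
	Next() bool
	Scan(dest ...interface{}) error
	Err() error
}

type User struct {
	Login string
}

// scanRows drains the cursor into a slice, one scanRow-style call per row.
func scanRows(r rows) ([]*User, error) {
	var out []*User
	for r.Next() {
		u := new(User)
		if err := r.Scan(&u.Login); err != nil {
			return nil, err
		}
		out = append(out, u)
	}
	return out, r.Err()
}

// fakeRows feeds fixed values, standing in for a real result set.
type fakeRows struct {
	logins []string
	i      int
}

func (f *fakeRows) Next() bool {
	if f.i >= len(f.logins) {
		return false
	}
	f.i++
	return true
}

func (f *fakeRows) Scan(dest ...interface{}) error {
	*dest[0].(*string) = f.logins[f.i-1]
	return nil
}

func (f *fakeRows) Err() error { return nil }

func main() {
	users, err := scanRows(&fakeRows{logins: []string{"octocat", "hubot"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(len(users), users[0].Login)
}
```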

Wrap-up

I have illustrated the baby steps to create a micro ORM for mapping User data. This pattern provides a just-enough technique for solving the problem. More details can be found on GitHub, with a real-world implementation.

I hope this article helps you figure out how to design good code in Go when working with SQL databases.