
Introduction to ZIO Schema

ZIO Schema is a ZIO-based library for modeling the schema of data structures as first-class values.


Introduction

ZIO Schema helps us to solve some of the most common problems in distributed computing, such as serialization, deserialization, and data migration.

It turns a compile-time construct (the type of a data structure) into a runtime construct (a value that can be read, manipulated, and composed at runtime). A schema describes the structure of a data type. ZIO Schema reifies that structure: it builds a high-level description of any data type and makes that description a first-class value.
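As a minimal sketch of what "a type turned into a value" can look like, the snippet below derives a schema for a hypothetical Person case class and uses it to convert a typed value into the generic DynamicValue representation and back. The object name SchemaAsValueExample is an illustrative choice, and the zio-schema and zio-schema-derivation modules are assumed to be on the classpath.

```scala
import zio.schema.{DeriveSchema, DynamicValue, Schema}

final case class Person(name: String, age: Int)

object Person {
  // The schema is an ordinary value, derived from the case class definition.
  implicit val schema: Schema[Person] = DeriveSchema.gen
}

// Hypothetical example object, not part of the library.
object SchemaAsValueExample {
  val john: Person = Person("John", 43)

  // Convert the typed value into a generic, schema-driven representation.
  val dynamic: DynamicValue = Person.schema.toDynamic(john)

  // Convert it back; the result is an Either because the conversion can fail.
  val roundTripped: Either[String, Person] = Person.schema.fromDynamic(dynamic)

  def main(args: Array[String]): Unit =
    println(roundTripped)
}
```

Because the schema is a plain value, it can be passed to functions, stored, or inspected like any other data.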

Creating a schema for a data type helps us to write codecs for that data type. The library therefore hosts a range of functionality useful for writing codecs and protocols such as JSON, Protobuf, CSV, and so forth.
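To illustrate how one schema can drive several codecs, here is a hedged sketch that builds both a JSON and a Protobuf codec from the same derived schema. It assumes the zio-schema-json and zio-schema-protobuf modules are available and that JsonCodec.schemaBasedBinaryCodec is the JSON entry point in the version in use; the Person type and the CodecsFromSchemaExample object are illustrative names.

```scala
import java.nio.charset.StandardCharsets

import zio.Chunk
import zio.schema.codec.{BinaryCodec, JsonCodec, ProtobufCodec}
import zio.schema.{DeriveSchema, Schema}

final case class Person(name: String, age: Int)

object Person {
  implicit val schema: Schema[Person] = DeriveSchema.gen
}

// Hypothetical example object, not part of the library.
object CodecsFromSchemaExample {
  // Both codecs are obtained from the same implicit Schema[Person].
  // Assumption: schemaBasedBinaryCodec exists in the zio-schema-json version in use.
  val jsonCodec: BinaryCodec[Person]     = JsonCodec.schemaBasedBinaryCodec[Person]
  val protobufCodec: BinaryCodec[Person] = ProtobufCodec.protobufCodec[Person]

  val john: Person = Person("John", 43)

  val jsonBytes: Chunk[Byte]     = jsonCodec.encode(john)
  val protobufBytes: Chunk[Byte] = protobufCodec.encode(john)

  def main(args: Array[String]): Unit = {
    // JSON is text, so the bytes can be shown as a UTF-8 string.
    println(new String(jsonBytes.toArray, StandardCharsets.UTF_8))
    // Protobuf is binary, so only the size is printed here.
    println(s"Protobuf payload: ${protobufBytes.size} bytes")
  }
}
```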

What Problems Does ZIO Schema Solve?

With schema descriptions that can be automatically derived for case classes and sealed traits, ZIO Schema provides powerful features for free:

  1. Metaprogramming without macros, reflection, or complicated implicit derivations.
    1. Creating serialization and deserialization codecs for any supported protocol (JSON, Protobuf, etc.)
    2. Deriving standard type classes (Eq, Show, Ordering, etc.) from the structure of the data
    3. Default values for data types
  2. Automate ETL (Extract, Transform, Load) pipelines
    1. Diffing: diffing between two values of the same type
    2. Patching: applying a diff to a value to update it
    3. Migration: migrating values from one type to another
  3. Computations as data: Not only can we turn types into values, but we can also turn computations into values. This opens up a whole new world of possibilities concerning distributed computing.
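The diffing, patching, and default-value features listed above can be sketched as follows. This is an illustrative example rather than a definitive recipe: Person and DiffPatchExample are hypothetical names, and the snippet relies on the diff, patch, and defaultValue methods of Schema.

```scala
import zio.schema.{DeriveSchema, Schema}

final case class Person(name: String, age: Int)

object Person {
  implicit val schema: Schema[Person] = DeriveSchema.gen
}

// Hypothetical example object, not part of the library.
object DiffPatchExample {
  val before: Person = Person("John", 43)
  val after: Person  = Person("John", 44)

  // Diffing: describe how `before` differs from `after` as a value.
  val delta = Person.schema.diff(before, after)

  // Patching: apply the diff to the old value; this is expected to
  // reproduce `after`, wrapped in an Either because patching can fail.
  val patched: Either[String, Person] = Person.schema.patch(before, delta)

  // Default values: the schema can also produce a default instance, if one exists.
  val default: Either[String, Person] = Person.schema.defaultValue

  def main(args: Array[String]): Unit = {
    println(patched)
    println(default)
  }
}
```

The diff itself is a value, so in principle it can be stored or sent elsewhere and applied later.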

When our data structures need to be serialized, deserialized, persisted, or transported across the wire, ZIO Schema lets us focus on data modeling and automatically handles all the low-level, messy details for us.

ZIO Schema is used by a growing number of ZIO libraries, including ZIO Flow, ZIO Redis, ZIO SQL and ZIO DynamoDB.

Installation

To use this library, we need to add the following lines to our build.sbt file:

libraryDependencies += "dev.zio" %% "zio-schema" % "0.4.15"
libraryDependencies += "dev.zio" %% "zio-schema-avro" % "0.4.15"
libraryDependencies += "dev.zio" %% "zio-schema-bson" % "0.4.15"
libraryDependencies += "dev.zio" %% "zio-schema-json" % "0.4.15"
libraryDependencies += "dev.zio" %% "zio-schema-msg-pack" % "0.4.15"
libraryDependencies += "dev.zio" %% "zio-schema-protobuf" % "0.4.15"
libraryDependencies += "dev.zio" %% "zio-schema-thrift" % "0.4.15"
libraryDependencies += "dev.zio" %% "zio-schema-zio-test" % "0.4.15"

// Required for the automatic generic derivation of schemas
libraryDependencies += "dev.zio" %% "zio-schema-derivation" % "0.4.15"
libraryDependencies += "org.scala-lang" % "scala-reflect" % scalaVersion.value % "provided"

Example

In this simple example, we first create a schema for Person and then encode a Person instance using the Protobuf protocol:

import zio._
import zio.stream._
import zio.schema.codec.{BinaryCodec, ProtobufCodec}
import zio.schema.{DeriveSchema, Schema}

final case class Person(name: String, age: Int)

object Person {
  implicit val schema: Schema[Person] = DeriveSchema.gen
  val protobufCodec: BinaryCodec[Person] = ProtobufCodec.protobufCodec
}

object Main extends ZIOAppDefault {
  def run =
    ZStream
      .succeed(Person("John", 43))
      .via(Person.protobufCodec.streamEncoder)
      .runCollect
      .flatMap(x =>
        Console.printLine(s"Encoded data with protobuf codec: ${toHex(x)}")
      )

  def toHex(chunk: Chunk[Byte]): String =
    chunk.map("%02X".format(_)).mkString
}

Here is the output of running the above program:

Encoded data with protobuf codec: 0A044A6F686E102B
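The same codec can also decode those bytes back into a Person. The following is a small sketch of that round trip using the non-streaming encode and decode methods of BinaryCodec; the RoundTripExample object is a hypothetical name introduced for illustration.

```scala
import zio.Chunk
import zio.schema.codec.{BinaryCodec, ProtobufCodec}
import zio.schema.{DeriveSchema, Schema}

final case class Person(name: String, age: Int)

object Person {
  implicit val schema: Schema[Person] = DeriveSchema.gen
  val protobufCodec: BinaryCodec[Person] = ProtobufCodec.protobufCodec
}

// Hypothetical example object, not part of the library.
object RoundTripExample {
  val john: Person = Person("John", 43)

  // Encode the whole value at once instead of going through a stream.
  val bytes: Chunk[Byte] = Person.protobufCodec.encode(john)

  // Decoding returns an Either, since the input bytes may be malformed.
  val decoded = Person.protobufCodec.decode(bytes)

  def main(args: Array[String]): Unit =
    decoded match {
      case Right(person) => println(s"Decoded: $person")
      case Left(error)   => println(s"Decoding failed: $error")
    }
}
```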

Resources