Avro Schema Resolution & Projection – Part 5

Introduction

This is the final part of a multi-part series on Apache Avro. It covers Schema Resolution and Projection, which are among Avro's main capabilities and provide enhanced versioning support. When multiple writers use different, evolving writer schemas to write binary Avro, it becomes very difficult to build a universal reader that understands every writer schema. In these scenarios, the reader maintains its own reader schema, which must be compatible with the schemas used by the writers. This is powerful functionality because it enables Schema Evolution.

Don’t forget to look at our other blogs in this series: Part 1, Part 2, Part 3, Part 4

Use Case

Let’s create a reader schema that differs from the writer schema and see how resolution and projection work with Avro.

What we want to do:

  • Create a reader’s Schema
  • Write a Java program to read the binary Avro written by a different writer Schema

Solution

Create a reader’s Schema:

  • We have made some changes to the reader’s Product Schema. The following changes were made to the original Product Schema, and the result was saved as product_reader.avsc
    • Removed the product_description, product_status, product_category, and product_hash fields
    • Changed the name of the “price” field to “cost” and added “price” as an alias
    • Added a new field, “discount”
  • Resolution & Projection:
    • The reader Schema has evolved (Schema Evolution) by adding a new field (discount) that the writer doesn’t have. For resolution to succeed, such a field needs a default value in the reader Schema, because the written data carries no value for it.
    • The reader Schema is interested in only some of the fields written by the writer. This is called Projection, and it is very useful when the writer schema has a large number of fields but the reader needs only a few of them.
    • The reader Schema uses “aliases”, which allow the reader to use different field names in the Schema used to read the Avro data than in the Schema originally used to write it. Aliases are another useful technique for evolving Avro Schemas.
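
To make the three rules above concrete, here is a minimal sketch in plain Java that mimics them on an in-memory record. It is a toy model and not the Avro library: with Avro itself, this work happens inside the datum reader once it knows both the writer and the reader schema. The names ResolutionSketch, ReaderField, resolve, and demo are made up for illustration. The product_id and product_name fields, the sample values, and the 0.0 default for discount are assumptions as well, since the actual schema files are not shown here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of Avro-style record resolution. This is NOT the Avro library. */
final class ResolutionSketch {

    /** One reader-schema field: its name, the writer-side names it accepts, and a default. */
    static final class ReaderField {
        final String name;
        final List<String> aliases;
        final Object defaultValue; // null means "no default declared" in this sketch

        ReaderField(String name, List<String> aliases, Object defaultValue) {
            this.name = name;
            this.aliases = aliases;
            this.defaultValue = defaultValue;
        }
    }

    /** Builds the record the reader sees from the record the writer wrote. */
    static Map<String, Object> resolve(Map<String, Object> writerRecord,
                                       List<ReaderField> readerFields) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (ReaderField field : readerFields) {
            if (writerRecord.containsKey(field.name)) {
                // Same name on both sides: copy the value.
                result.put(field.name, writerRecord.get(field.name));
                continue;
            }
            // Alias: the writer may have used an older name for this field.
            String matched = null;
            for (String alias : field.aliases) {
                if (writerRecord.containsKey(alias)) {
                    matched = alias;
                    break;
                }
            }
            if (matched != null) {
                result.put(field.name, writerRecord.get(matched));
            } else if (field.defaultValue != null) {
                // Evolution: a reader-only field is filled from its default.
                result.put(field.name, field.defaultValue);
            } else {
                throw new IllegalStateException(
                        "No writer field and no default for: " + field.name);
            }
        }
        // Projection: writer fields the reader never asked for are simply not copied.
        return result;
    }

    /** Runs the scenario from this post on one hypothetical product record. */
    static Map<String, Object> demo() {
        Map<String, Object> written = new LinkedHashMap<>();
        written.put("product_id", 1001);          // hypothetical field
        written.put("product_name", "Widget");    // hypothetical field
        written.put("product_description", "A sample widget");
        written.put("product_status", "ACTIVE");
        written.put("product_category", "tools");
        written.put("product_hash", "abc123");
        written.put("price", 19.99);

        List<ReaderField> reader = Arrays.asList(
                new ReaderField("product_id", Collections.<String>emptyList(), null),
                new ReaderField("product_name", Collections.<String>emptyList(), null),
                new ReaderField("cost", Arrays.asList("price"), null),
                new ReaderField("discount", Collections.<String>emptyList(), 0.0)); // assumed default
        return resolve(written, reader);
    }

    public static void main(String[] args) {
        // prints {product_id=1001, product_name=Widget, cost=19.99, discount=0.0}
        System.out.println(demo());
    }
}
```

The four removed fields never reach the reader, the value written as “price” shows up as “cost”, and “discount” is filled from its default. If the default were missing, the sketch would throw, which mirrors the rule that a reader-only field without a default cannot be resolved.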

Write a Java program to read the binary Avro written by a different writer Schema:

  •  Verify:

Conclusion

  • Schema Evolution, Resolution, and Projection are among the best capabilities of the Avro serialization framework, and together they enable enhanced versioning support.
  • With these capabilities, writers and readers do not have to rely on the same Schema, which enables loose coupling between them.
