Kappa Vault
This article expands on “Data Vault Agility on Snowflake” (see: bit.ly/337Jhp3, under “Multiple source system cadence and how to handle it”).
What is Kappa Architecture?
Coined by Jay Kreps in 2014 (see: bit.ly/3CtWZQo), Kappa challenges the thinking and design behind the Lambda architecture. The premise behind Lambda is the separation of batch and streaming workloads into a batch layer and a speed layer respectively. If you want the data now, you must accept a margin of error in the results; batch will “catch up” and consolidate the data in the serving layer with a higher degree of accuracy. This is an implementation of eventual consistency. Kappa, on the other hand, is a stream-first approach: both workloads are handled by a single stream processing engine that can deal with batch and streaming, something Snowflake natively supports today with Snowpipe and Kafka, see: bit.ly/3fqV9oZ.
A vast array of algorithms has been developed over the years that incorporate this margin of error in stream-only analytics, all with the same principle in mind: if you want the answer now, it might not be accurate; if you can wait, we can provide the accuracy you desire.
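To make that trade-off concrete, here is a minimal sketch of one such algorithm: a K-Minimum-Values (KMV) distinct counter. It is an illustration only, not something taken from the referenced articles. The class name `KMVSketch`, the parameter `k`, and the `customer-…` keys are all hypothetical. The sketch answers "how many distinct keys have we seen?" immediately and in bounded memory, with a relative error of roughly 1/sqrt(k), while an exact set (standing in for what a batch layer would compute) gives the precise answer at the cost of holding every key.

```python
import hashlib
import heapq


class KMVSketch:
    """Approximate distinct counter that keeps only the k smallest hash values."""

    def __init__(self, k=256):
        self.k = k
        self._heap = []        # negated hashes, so the largest kept hash is at index 0
        self._members = set()  # hashes currently kept, used to skip duplicates

    @staticmethod
    def _hash01(item):
        # Map the item to a pseudo-uniform float in [0, 1).
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") / 2**64

    def add(self, item):
        h = self._hash01(item)
        if h in self._members:
            return  # already tracked, a repeat of a known key
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, -h)
            self._members.add(h)
        elif h < -self._heap[0]:
            # New hash is smaller than the largest kept one: swap it in.
            evicted = -heapq.heapreplace(self._heap, -h)
            self._members.discard(evicted)
            self._members.add(h)

    def estimate(self):
        if len(self._heap) < self.k:
            # Fewer than k distinct hashes seen, so the count is exact.
            return float(len(self._heap))
        # k uniform hashes fit below the k-th smallest value, so scale up.
        return (self.k - 1) / -self._heap[0]


if __name__ == "__main__":
    sketch = KMVSketch(k=256)   # the "answer now" path, bounded memory
    exact = set()               # the "answer later" path, stores every key
    for i in range(50_000):
        key = f"customer-{i % 10_000}"  # hypothetical business keys
        sketch.add(key)
        exact.add(key)
    print("approximate:", round(sketch.estimate()), "exact:", len(exact))
```

Raising `k` narrows the margin of error at the cost of more memory, which is the same dial the paragraph above describes: accuracy now versus accuracy later.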