Binary serialisation

Binary serialisation: better, stronger, faster Duncan Coutts March 2014, Fun in the Afternoon 2014

.

Well-Typed .

The Haskell Consultants .

Binary serialisation use cases Two different use case for reading/writing binary data Serialisation of Haskell values

Externally defined formats

I

Don’t really care about the binary format

I

Binary format details are critical

I

Only need to interoperate with same program

I

Need to interoperate with other systems

I

Want to be able to serialise direct to/from internal types

I

Some manual conversion to internal types acceptable

I

e.g. "I just want to save a data file and read it back later"

I

e.g. "I want to read these X509 certificates"

.

Well-Typed .

Binary serialisation use cases Two different use case for reading/writing binary data Serialisation of Haskell values

Externally defined formats

I


I

Binary format details are critical

I


I

Need to interoperate with other systems

I


I

Some manual conversion to internal types acceptable

I

e.g. "I just want to save a data file and read it back later"

I

e.g. "I want to read these X509 certificates"

The binary and cereal packages do both (confusing!) .

Well-Typed .

Binary serialisation use cases Two different use case for reading/writing binary data Serialisation of Haskell values I


I


I


I

e.g. "I just want to save a data file and read it back later" This talk is about the serialisation of Haskell values case .

Well-Typed .

Goals

Goals of this work I

I

replace the serialisation layer in the binary package I get binary into the Haskell Platform much better serialisation format I I I

I

schema-free decoding allow for versioning/migration more compact

significantly faster

.

Well-Typed .

Headline results Much better serialisation format I I

I I

CBOR (like MessagePack but better designed) schema-free decoding (e.g. generic conversion to JSON) allows for versioning/migration 1/2 encoded file size

Dramatically faster I I I

3-5x faster encoding 2-3x faster decoding “real world” benchmarks, not micro-benchmarks

Buy-in from the other binary package hackers .

Well-Typed .

The existing data format

The binary package’s data format I

Very simple and direct

I

Fixed-size “as is” encodings

I

No type information

I

No headers

I

No redundancy

Examples I

Int encoded as 8-bytes big endian

I

List as length prefixed sequence (8-bytes again)

I

Tuples as sequence

I

Sum types use 1-byte constructor tag

.

Well-Typed .

The binary package’s data format

It’s a bad format I

not very compact

I

no redundancy

.

Well-Typed .

The binary package’s data format

It’s a bad format I

not very compact

I

no redundancy

The lack of redundancy is the greater problem

.

Well-Typed .

The binary package’s data format The current format makes developers nervous. Lose the types that generated the data and you’re in trouble: I I I

can’t tell what anything is can’t tell where one bit of data ends and the next begins practically impossible to recover

No redundancy means no chance to detect or localise errors. Debugging incorrect Binary instances is hard. Error information if things go wrong is nigh on useless *** Exception: Data.Binary.Get.runGet at position 5844572: demandInput: not enough bytes .

Well-Typed .

The binary package’s data format Annoying or unsuitable in some use cases

Example Haddock cannot read .haddock files created by older versions Thus .haddock files not shareable via hackage while hoogle text DB files are shareable

.

Well-Typed .

The binary package’s data format Annoying or unsuitable in some use cases

Example Haddock cannot read .haddock files created by older versions Thus .haddock files not shareable via hackage while hoogle text DB files are shareable I do not trust it for medium/long term data storage

Example The hackage server uses cereal (via acid-state) for most data but there’s a separate backup/restore system that dumps all data in text or other standard formats. .

Well-Typed .

A new data format

What do we want in a data format?

Programmers like text formats I

easy to ‘eyeball’ the data

I

can usually tell where data begins and ends

I

can usually guess what data is

I

easier to manage ad-hoc upgrades and migrations

I

can often recover data written by old programs

.

Well-Typed .

What do we want in a binary data format?

Properties we want in binary data format I

compact encoded size

I

fast encoding and decoding decode without a schema (ie the program types)

I

I I

‘eyeball’ the data testing, debugging and recovery

I

redundancy to help error checking

I

option to change schema while still reading old data files

.

Well-Typed .

Existing formats we could steal Various off-the-shelf binary formats I

MessagePack

I

Protocol buffers

I

BSON

I

Thrift

I

BERT

I

CBOR

Many of these provide schema-less decoding. Several provide equivalent of JSON “universal” data model. Just steal one: re-use an existing specification and tools.

.

Well-Typed .

CBOR

We are picking the CBOR format I

“Concise Binary Object Representation”

I

Specified in RFC 7049 (proposed standard)

I

JSON-like data model

I

Self-describing low level structure

I

Much like MessagePack but better designed

I

Reasonably flexible and extensible

.

Well-Typed .

CBOR CBOR data items have a 1-byte header I I I I

specifies type (integers, bytes, string, lists etc) specifies format variable length encodings small values directly in header

Examples Value 10

Binary (hex) 1A

Value [1, 2, 3]

Binary (hex) 83 01 02 03

100

18 64

[1, [2, 3]]

82 01 82 02 03

"" "a"

60

[ 1, 2, 3]

9F 01 02 03 FF

61 61

( "a", "b")

7F 61 61 61 62 FF

[]

80

100000.0

FA 47 C3 50 00 .

Well-Typed .

Encoding constructors data T = C1 Int Int | C2 String | C3 We will encode constructors as CBOR arrays, with a constructor tag: Haskell value C1 3 4

CBOR notation [0, 3, 4]

C2 "hi"

[1, "hi" ]

C3

[2]

Even single-constructor types will use a tag.

.

Well-Typed .

A mechanism for versioning / migration data T = C1 Int Int | C2 String [Int] | C3

-- previously just: C2 String

When we add or change a constructor we can use a new tag. The decoder may recognise the old tag and decode and covert. Haskell value old C2 "hi"

CBOR notation [1, "hi" ]

new C2 "hi" [ ]

[3, "hi", [ ]]

.

Well-Typed .

A mechanism for versioning / migration data T = C1 Int Int | C2 String [Int] | C3

-- previously just: C2 String

When we add or change a constructor we can use a new tag. The decoder may recognise the old tag and decode and covert. Haskell value old C2 "hi"

CBOR notation [1, "hi" ]

new C2 "hi" [ ]

[3, "hi", [ ]]

This is not automatic, but it is possible. Further library work is needed (like safecopy). .

Well-Typed .

Benchmarks

Benchmark setup

“real world” benchmark rather than micro-benchmark I

Serialising .cabal file data structures I I

I I

I

100 for ‘small’ criterion benchmarks 10k for ‘big’ GC/alloc benchmarks

Mostly lists and deeply nested trees of custom constructors Comparing new code against I binary I cereal I aeson, just for reference Prototype using MessagePack format rather than CBOR

.

Well-Typed .

Encoded file size (MB)

Serialised data size

100

50

0 aeson binary cereal new I I

binary and cereal produce identical files new format is 1/2 size of binary/cereal .

Well-Typed .

Time (milliseconds)

Encode and decode time encode decode 100

50

0 aeson

binary

I

5x faster encode than cereal

I

3x faster decode than binary

new

cereal

.

Well-Typed .

Time (milliseconds)

Encode and decode time encode decode decode + deepseq

100

50

0 aeson

binary

new

cereal

Deepseq issue needs work

.

Well-Typed .

Decoding for large amounts of data

Time (seconds)

2 1.5 1 0.5 0 binary I I

cereal

new

streaming decode to avoid all results in memory at once aeson is off the scale at 25sec .

Well-Typed .

Decoding allocation

Allocation (GB)

8 6 4 2 0 binary I I

cereal

new

dramatically less allocation allocation costs mirror decode time .

Well-Typed .

Optimisation

Existing binary approach We’ll focus on serialisation.

.

Well-Typed .

Existing binary approach We’ll focus on serialisation. Current serialisation uses a Builder monoid newtype Builder = Builder { runBuilder :: (Buffer → IO L.ByteString) → Buffer → IO L.ByteString } data Buffer = Buffer !(ForeignPtr Word8) !Int !Int !Int Continuation style, passing a Buffer around. Performs well in micro-benchmarks and badly for “real world” tree-structured data. .

Well-Typed .

Why it’s slow

Tricky...

.

Well-Typed .

Why it’s slow

Tricky... It generates a lot of code. Writing an optimal code generator is harder than writing optimal code. Every buffer check needs a resume continuation (which duplicates code). Allocates large multi-argument closures (continuations). Allocates many Buffer structures.

.

Well-Typed .

Illustrative example

data TwoInts = TwoInts {-# UNPACK #-} {-# UNPACK #-}

!Int !Int

encodeTwoInts :: TwoInts → Builder encodeTwoInts (TwoInts a b) = word64BE (fromIntegral a) word64BE (fromIntegral b) Simplifying for the example by using the Builder layer, rather than Put writer monad.

.

Well-Typed .

Encoder core (simplified) encodeTwoInts1 = λa b k ptr1 ptr2 s → case (minusAddr # ptr2 ptr1) < # 16 of False → case writeWord8OffAddr # ptr1 0 (uncheckedShiftRL # a 56) s of -- repeat 7 more times for a and 8 for b... s0 → k (BufferRange (plusAddr # ptr1 16) ptr2 s0 )

True → (# s, BufferFull 16 ptr1 $ λbuf → case buf of BufferRange ptr10 ptr20 → case writeWord8OffAddr # ptr10 0 (uncheckedShiftRL # a 56) s of -- repeat 7 more times for a and 8 for b... s0 → k (BufferRange (plusAddr # ptr10 16) ptr20 s0 ) #) .

Well-Typed .

The new trick The trick is to use a deep embedding of the builder actions data FlatTerm = OutWord !Word FlatTerm | OutInt !Int FlatTerm | OutBytes !ByteString FlatTerm ... | OutStreamEnd Even though a deep embedding involves allocation and indirection! Credit to Simon Meier for trying this crazy idea.

.

Well-Typed .

The new trick The trick is to use a deep embedding of the builder actions data FlatTerm = OutWord !Word FlatTerm | OutInt !Int FlatTerm | OutBytes !ByteString FlatTerm ... | OutStreamEnd The Binary instances produce the FlatTerm data structure. newtype Encoding = Encoding (FlatTerm → FlatTerm) class Binary a where encode :: a → Encoding .

Well-Typed .

The new trick The trick is to use a deep embedding of the builder actions data FlatTerm = OutWord !Word FlatTerm | OutInt !Int FlatTerm | OutBytes !ByteString FlatTerm ... | OutStreamEnd Fast “interpreter” function runs through the FlatTerm , doing the format encoding details and writing out to a buffer toBuilder :: Encoding → B.Builder

.

Well-Typed .

Encode core (simplified)

encodeTwoInts1 = λti k → case ti of TwoInts a b → OutListLen 3 (OutWord 1 (OutInt a (OutInt b k)))

Allocates the whole encode description in one go. No messing. All dynamic checks left to the interpreter.

.

Well-Typed .

Encoder interpreter toBuilder :: Encoding → B.Builder toBuilder (Encoding f) → B.builder (step (f OutStreamEnd)) step toks k (BI.BufferRange optr optr_end) = go toks optr where go toks ! optr | optr ‘plusPtr‘ bound 6 optr_end = case toks of OutWord x toks0 → B.runB encWord x optr >>= go toks0 OutFloat x toks0 → B.runB encFloat x optr >>= go toks0 ... | otherwise = return $ B.bufferFull bound op (step toks k) Heavily optimised low-level interpreter loop.

.

Well-Typed .

Decoder deep embedding Also apply deep embedding trick to the decoder newtype Decoder a = Decoder { runDecoder :: ∀r. (a → DecodeAction r) → DecodeAction r } data DecodeAction a = ConsumeToken (TermToken → DecodeAction a) | Fail String | Done a data TermToken = TkWord !Word | TkInt !Int | TkBytes !B.ByteString ...

.

Well-Typed .

Decoder interpreter go_fast :: DecodeAction a → B.ByteString → SlowPath a go_fast (ConsumeToken k) !bs | B.length bs > 9 -- plenty of space = case decodeDispatch (B.unsafeHead bs) bs of RsToken sz tok → go_fast (k tok) (B.unsafeDrop sz bs) RsString sz len → SlowString (B.unsafeDrop sz bs) k len RsBytes sz len → SlowBytes (B.unsafeDrop sz bs) k len RsFail msg → SlowFail bs msg go_fast da !bs = go_fast_end da bs Fast path / slow path optimisation I I I

loop while we still have enough input buffer return reason for falling out of the fast path slow path fixes things up and calls fast path again .

Well-Typed .

Thanks! Questions?

.

Well-Typed .

More decoder stuff...

Decode code

decodeTwoInts :: Decoder TwoInts decodeTwoInts = do expectListOfLen 3 expectTagOf 1 x ← decode y ← decode return (TwoInts x y) Common pattern

.

Well-Typed .

Decode core (simplified) decodeTwoInts1 = λk → let mkTwoInts = ConsumeToken $ λtk1 → case tk1 of TkInt n1 → ConsumeToken $ λtk2 → case tk2 of TkInt n2 → k (TwoInts n1 n2) checkTag = ConsumeToken $ λtk → case tk of TkWord tag → case eqWord # 1 tag of False → . . . True → mkTwoInts in ConsumeToken $ λtk → case tk of TkListLen len → case eqWord # 3 len of False → . . . True → checkTag Has to allocate six closures! .

Well-Typed .

Decode optimisation idea Want to minimise closure allocation decodeTwoInts :: Decoder TwoInts decodeTwoInts = do expectListOfLen 3 expectTagOf 1 x ← decode y ← decode return (TwoInts x y) Necessarily has two closures, due to x and y .

.

Well-Typed .

Decode optimisation idea

Perhaps a stack approach? decodeTwoInts :: Decoder TwoInts decodeTwoInts = do expectListOfLen 3 expectTagOf 1 pushDecode pushDecode transformStack (λx y → TwoInts x y)

.

Well-Typed .

Binary serialisation

Binary serialisation

Suggest Documents

Korea pharmaceuticals serialisation policy & national traceability ...

meeting current and future serialisation challenges - Aesica

as a standardised serialisation for ISO 24615 â SynAF

Binary Vectors and Super-binary Vectors - Springer

Binary Molecular Nomenclature Rules for Binary ... - ScienceGeek.net

Binary Vectors and Super-binary Vectors

Binary Adder

BINARY QSAR

On the Conversion between Non-Binary and Binary Constraint ...

Overview Arithmetic circuits Binary half adder Binary full adder Full ...

Dynamical Model of Binary Asteroid Systems Using Binary

Binary and Millisecond Pulsarswww.researchgate.net › publication › fulltext › Binary-an

on self-inverse binary matrices over the binary ... - Semantic Scholar

On the Conversion between Non-Binary and Binary Constraint

The primordial binary population II: Recovering the binary population

Binary Phase Diagrams and Thermodynamic ... - COREwww.researchgate.net › publication › fulltext › Binary-Ph

Binary Image Watermarking with Tampering Localization in Binary ...

reaching consensus on a binary state by exchanging binary ... - MIRLab

Formalization of Binary Fields and N-dimensional Binary Vector ...

Interlaced Binary Search: Multiple Items' Binary Search ... - Google Sites

BINARY NUMBER SYSTEM 4

Binary Adders and Subtractors

Text as Binary Sequence

Binary Huff Curves - CiteSeerX

Binary serialisation