API Reference¶
Overview¶
This document describes the API provided by binary-data. It is organized into sections by use case: using binary data in a tool, extending the library with a custom binary format, and the internal API.
Class hierarchy¶
The class hierarchy is rooted in <frame>. Several direct subclasses exist; some are open and may be subclassed. Only those combinations of direct subclasses that have been needed so far are defined (other combinations may be needed in the future).
-
<frame>
Abstract Class¶ - Discussion
The abstract superclass of all frames. Several generic functions are defined on this class.
- Superclasses
- Operations
-
<leaf-frame>
Abstract Class¶ - Discussion
The abstract superclass of all frames without any further structure.
- Superclasses
- Operations
-
<fixed-size-frame>
Abstract Class¶ - Discussion
The abstract superclass of all frames with a static length. The specialization of frame-size calls field-size on the object class of the given instance.
- Superclasses
-
<variable-size-frame>
Abstract Class¶ - Discussion
The abstract superclass of all frames with a variable length.
- Superclasses
-
<translated-frame>
Abstract Class¶ - Discussion
The abstract superclass of all frames with a conversion into a native Dylan type.
- Superclasses
-
<untranslated-frame>
Abstract Class¶ - Discussion
Abstract superclass of all frames with a custom class instance.
- Superclasses
-
<fixed-size-untranslated-frame>
Abstract Class¶ - Discussion
Abstract superclass for fixed sized frames without a translation.
- Superclasses
-
<variable-size-untranslated-frame>
Abstract Class¶ - Discussion
Abstract superclass for variable sized frames without a translation. This is the direct superclass of <container-frame>.
- Superclasses
-
<fixed-size-translated-leaf-frame>
Open Abstract Class¶ - Discussion
Superclass of all fixed size leaf frames with a translation, mainly used for bit vectors represented as Dylan <integer>.
- Superclasses
-
<variable-size-translated-leaf-frame>
Open Abstract Class¶ - Discussion
Superclass of all variable size leaf frames with a translation (currently unused).
- Superclasses
-
<fixed-size-untranslated-leaf-frame>
Open Abstract Class¶ - Discussion
Superclass of all fixed size leaf frames without a translation, mainly used for byte vectors (IP addresses, MAC addresses, …); see its subclass <fixed-size-byte-vector-frame>.
- Superclasses
-
<variable-size-untranslated-leaf-frame>
Open Abstract Class¶ - Discussion
Superclass of all variable size leaf frames without a translation (for example, class <raw-frame> and class <externally-delimited-string>).
- Superclasses
-
<null-frame>
Class¶ - Discussion
A concrete zero size leaf frame without a translation. This frame type can be used as one of the types of a variably-typed field to make the field optional. A field of type <null-frame> is considered to be missing from the container frame. Conversion of a <null-frame> to a string or vice versa is not supported, because it would not be meaningful.
- Superclasses
-
<container-frame>
Open Abstract Class¶ Superclass of all binary data definitions using the define binary-data macro.
- Superclasses
- Operations
field-count
-
<header-frame>
Open Abstract Class¶ Superclass of all binary data definitions which support layering and thus have a header and a payload.
-
<variably-typed-container-frame>
Open Abstract Class¶ Superclass of all binary data definitions which have an abstract header followed by more fields. In the header, a specific <layering-field> determines which subclass to instantiate.
- Superclasses
Tool API¶
Parsing Frames¶
-
parse-frame
Open Generic function¶ Parses the given binary packet as frame-type, resulting in an instance of the frame-type and the number of consumed bits.
- Signature
parse-frame frame-type packet #rest rest #key #all-keys => result consumed-bits
- Parameters
frame-type – Any subclass of <frame>.
packet – The byte vector as <sequence>.
rest (#rest) – An instance of <object>.
- Values
result – An instance of the given frame-type.
consumed-bits – The number of bits consumed as <integer>.
-
read-frame
Open Generic function¶ Converts a given string to an instance of the given leaf frame type.
- Signature
read-frame frame-type string => frame
- Parameters
frame-type – An instance of subclass(<leaf-frame>).
string – An instance of <string>.
- Values
frame – An instance of <object>.
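As a hedged illustration of the parse-frame signature, the sketch below parses two bytes as <2byte-big-endian-unsigned-integer>, one of the predefined frame types named later in this reference. The function name parse-example is hypothetical, and the module setup (a module that uses common-dylan, byte-vector, format-out, and binary-data) is an assumption.

```dylan
// Sketch only: assumes the enclosing module uses common-dylan,
// byte-vector, format-out, and binary-data.
define function parse-example () => ()
  // Build a two-byte packet by hand.
  let packet = make(<byte-vector>, size: 2, fill: 0);
  packet[0] := #x01;
  packet[1] := #x02;
  // parse-frame returns the parsed result and the number of bits consumed.
  let (value, consumed-bits)
    = parse-frame(<2byte-big-endian-unsigned-integer>, packet);
  format-out("parsed %= using %d bits\n", value, consumed-bits);
end function;
```

Binding both return values keeps the consumed bit count available, which a caller would need in order to continue parsing at the right offset.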
Assembling Frames¶
Information about Frames¶
-
frame-size
Open Generic function¶ Returns the length in bits for the given frame.
- Signature
frame-size frame => length
- Parameters
frame – An instance of <frame>.
- Values
length – The size in bits, an instance of <integer>.
-
summary
Open Generic function¶ Returns a human-readable string that summarizes the frame. The string can be customized in binary-data-definer.
-
packet
Open Generic function¶ Underlying byte vector of the given <container-frame>.
- Signature
packet frame => byte-vector
- Parameters
frame – An instance of <container-frame>.
- Values
byte-vector – An instance of <byte-sequence>.
-
parent
Sealed Generic function¶ If the frame is the payload of another layer, returns the frame of the upper layer; otherwise returns false.
- Signature
parent frame => parent-frame
- Parameters
frame – An instance of <container-frame> or <variable-size-byte-vector-frame>.
- Values
parent-frame – Either the <container-frame> of the upper layer or #f.
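The functions in this section can be combined to inspect a frame. The following is a minimal sketch, assuming a frame that was obtained from parse-frame and a module that uses common-dylan, format-out, standard-io, and binary-data; describe-frame is a hypothetical helper, not part of the library.

```dylan
// Sketch only: describe-frame is a hypothetical helper.
define function describe-frame (frame :: <container-frame>) => ()
  format-out("%s\n", summary(frame));
  format-out("size: %d bits\n", frame-size(frame));
  // parent returns #f when the frame is not the payload of another layer.
  let upper = parent(frame);
  if (upper)
    format-out("carried by: %s\n", summary(upper));
  else
    format-out("no enclosing layer\n");
  end if;
  // Print the underlying bytes in hexadecimal.
  hexdump(*standard-output*, packet(frame));
end function;
```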
Information about Frame Types¶
-
fields
Open Generic function¶ Returns a vector of <field> for the given <container-frame>.
- Signature
fields frame-type => fields
- Parameters
frame-type – Any subclass of <container-frame>.
- Values
fields – A <simple-vector> containing all fields.
Note
The current API also allows instances of <container-frame>; this should be revised.
-
frame-name
Open Generic function¶ Returns the name of the frame type.
- Signature
frame-name frame-type => name
- Parameters
frame-type – Any subclass of <container-frame>.
- Values
name – A <string> with the human-readable frame name.
Note
The current API also allows instances of <container-frame>; this should be revised.
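To illustrate how fields and frame-name fit together, the sketch below lists the field names of a frame type. print-field-names is a hypothetical helper; field-name is the <field> operation listed in the Fields section, and its result is printed with %= because this reference does not state its type.

```dylan
// Sketch only: assumes a module using common-dylan, format-out,
// and binary-data.
define function print-field-names
    (frame-type :: subclass(<container-frame>)) => ()
  format-out("%s\n", frame-name(frame-type));
  // fields returns a <simple-vector> of <field> objects.
  for (field in fields(frame-type))
    format-out("  %=\n", field-name(field));
  end for;
end function;
```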
Fields¶
Syntactic sugar in the define binary-data domain-specific language instantiates these fields.
-
<field>
Abstract Class¶ The abstract superclass of all fields.
- Superclasses
- Init-Keywords
name – The name of this field.
fixup – A unary Dylan function computing the value of this field, used if no default is supplied and the client provides none; defaults to #f.
init-value – The default value if the client did not provide any; defaults to $unsupplied.
static-end – A Dylan expression determining the end; defaults to $unknown-at-compile-time.
static-length – A Dylan expression determining the length; defaults to $unknown-at-compile-time.
static-start – A Dylan expression determining the start; defaults to $unknown-at-compile-time.
dynamic-end – A unary Dylan function computing the end; defaults to #f.
dynamic-length – A unary Dylan function computing the length; defaults to #f.
dynamic-start – A unary Dylan function computing the start; defaults to #f.
getter – The getter method to extract this field's value out of a concrete frame.
setter – The setter method to set this field to a concrete value in a concrete frame.
index – An <integer> which is the index of this field in its <container-frame>.
- Discussion
All keyword arguments correspond to a slot, which can be accessed.
- Operations
field-name(<field>)
fixup-function(<field>)
init-value(<field>)
static-start(<field>)
static-length(<field>)
static-end(<field>)
getter(<field>)
setter(<field>)
See also
-
<variably-typed-field>
Class¶ The class for fields of dynamic type.
- Superclasses
- Init-Keywords
type-function – A unary Dylan function computing the type of the field, defaults to payload-type.
See also
-
<statically-typed-field>
Abstract Class¶ The abstract superclass of all statically typed fields.
Note
restrict type in source code!
-
<single-field>
Class¶ The common field. Nothing interesting going on here.
- Superclasses
-
<enum-field>
Class¶ An enumeration field to map <integer> to <symbol>.
- Superclasses
- Init-Keywords
mapping – A mapping from keys to values as <collection>.
-
<layering-field>
Class¶ The layering field is used in <header-frame> and <variably-typed-container-frame> to determine the concrete type of the payload or which subclass to use.
- Superclasses
- Discussion
The fixup-function slot is bound to use the available layering information, so there is no need to specify a fixup.
-
<repeated-field>
Abstract Class¶ Abstract superclass of repeated fields. The init-value slot is bound to #().
- Superclasses
-
<count-repeated-field>
Class¶ A repeated field whose number of repetitions is determined externally.
- Superclasses
- Init-Keywords
count – A unary function returning the number of occurrences.
Layering of frames¶
-
payload-type
Function¶ Returns the type of the payload. It is a wrapper around lookup-layer that returns <raw-frame> if lookup-layer returns false.
- Signature
payload-type frame => payload-type
- Parameters
frame – An instance of <container-frame>.
- Values
payload-type – An instance of <type>.
-
lookup-layer
Open Generic function¶ Given a frame-type and a key, returns the type of the payload.
-
reverse-lookup-layer
Open Generic function¶ Given a frame type and a payload, returns the value for the layering field.
Note
Check whether it can work with other types than integers
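The fallback described for payload-type can be sketched as follows. This is not the library's implementation: payload-type-for is a hypothetical name, the argument order of lookup-layer is assumed from its one-line description (frame type first, then key), and an <integer> key is assumed.

```dylan
// Sketch only: mirrors the documented behaviour "use lookup-layer,
// fall back to <raw-frame> when it returns false".
define function payload-type-for
    (frame-type :: subclass(<header-frame>), key :: <integer>)
 => (type :: <type>)
  // In Dylan, a | b evaluates to a unless a is #f, in which case it yields b.
  lookup-layer(frame-type, key) | <raw-frame>
end function;
```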
Database of Binary Data Formats¶
Note
Rename to $binary-data-registry or similar. Also, narrow types for the functions in this section.
-
$protocols
Constant¶ A hash table with all defined binary formats. Insertion is done by a call of define binary-data.
- Type
- Value
Mapping of <symbol> to subclasses of <container-frame>.
-
find-protocol
Function¶ Looks for the given name in the hash table $protocols. Signals an error if no protocol with the given name can be found.
-
find-protocol-field
Function¶ Queries a field by name in a given binary data format. Signals an error if no such field is known in the binary data format.
Utilities¶
-
hexdump
Generic function¶ Prints the given data in hexadecimal on the given stream.
- Signature
hexdump stream data => ()
- Parameters
stream – An instance of <stream>.
data – An instance of <sequence>.
- Discussion
Prints 8 bytes in hexadecimal, separated by single spaces, followed by two spaces and another 8 bytes.
If the given data has more than 16 elements, it prints multiple lines and prefixes each with a line number (as a 16 bit hexadecimal number).
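A short usage sketch for hexdump, assuming a module that uses common-dylan, byte-vector, standard-io, and binary-data. The helper name hexdump-example and the fill value are arbitrary choices for illustration.

```dylan
// Sketch only: dumps 20 bytes, which is more than 16 elements,
// so the multi-line form described above applies.
define function hexdump-example () => ()
  let data = make(<byte-vector>, size: 20, fill: #xab);
  hexdump(*standard-output*, data);
end function;
```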
-
byte-offset
Function¶ Computes the number of bytes for a given number of bits. A synonym for rcurry(ash, -3).
-
bit-offset
Function¶ Computes the number of bits which do not fit into a byte for a given number of bits. A synonym for curry(logand, 7).
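The arithmetic behind these two helpers can be sketched with stand-in functions that use only standard Dylan. The names whole-bytes and leftover-bits are hypothetical, chosen so they do not clash with the library. The sketch assumes that converting bits to bytes means a right shift by three, which in Dylan is ash with a count of -3.

```dylan
// Stand-in for byte-offset: floor division by 8 via an arithmetic right shift.
define function whole-bytes (bits :: <integer>) => (bytes :: <integer>)
  ash(bits, -3)
end function;

// Stand-in for bit-offset: keep the lowest three bits, i.e. bits modulo 8.
define function leftover-bits (bits :: <integer>) => (remainder :: <integer>)
  logand(bits, 7)
end function;

// For 19 bits: whole-bytes(19) is 2 and leftover-bits(19) is 3,
// because 19 = 2 * 8 + 3.
```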
-
byte-aligned
Function¶ Checks that the given number of bits can be represented in full bytes, otherwise signals an <alignment-error>.
- Signature
byte-aligned bits
- Parameters
bits – An instance of <integer>.
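Because byte-aligned signals a condition instead of returning a value, a caller that wants a boolean answer can wrap it in a handler. The following is a minimal sketch, assuming <alignment-error> is exported by the library; aligned? is a hypothetical helper.

```dylan
// Sketch only: turns the signalled <alignment-error> into #f.
define function aligned? (bits :: <integer>) => (result :: <boolean>)
  block ()
    byte-aligned(bits);
    #t
  exception (e :: <alignment-error>)
    #f
  end block
end function;
```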
-
data
Generic function¶ Returns the underlying byte vector of a wrapper object, used for several untranslated leaf frames.
- Signature
data (object) => (#rest results)
- Parameters
object – An instance of <object>.
- Values
#rest results – An instance of <object>.
Note
This should be removed from the API or become internal.
Errors¶
Extension API¶
Extending Binary Data Formats¶
This domain-specific language defines a subclass of <container-frame>, and lots of boilerplate.
-
define binary-data
Defining Macro¶ - Macro Call
define [abstract] binary-data *binary-format-name* ([*super-binary-format*])
  [summary *summary*] [;]
  [over *over-spec* *] [;]
  [length *length-expression*] [;]
  [*field-spec*] [;]
end
- Parameters
binary-format-name – A standard Dylan class name.
super-binary-format – A standard Dylan name, used as the superclass.
summary – A Dylan expression consisting of a format-string and a list of arguments.
over-spec – A pair of binary format and value.
length-expression – A Dylan expression computing the length of a frame instance.
field-spec – A list of fields for this binary format.
- Discussion
Defines the binary data class binary-format-name, which is a subclass of super-binary-format. The body offers syntactic sugar for specializing the pretty printer (summary specializes summary), providing a custom length implementation (length specializes container-frame-size), and providing binary format layering information via over-spec (<layering-field>). The remaining body is a list of field-spec. Each field-spec line corresponds to a slot in the defined class. Additionally, each field-spec instantiates an object of <field> to store the static metadata. The vector of fields is available via the method fields.
summary: *format-string* *format-arguments*
This generates a method implementation for summary. Each of the format-arguments is applied to the frame instance.
over-spec: *over-binary-format* *layering-value*
The over-binary-format should be a subclass of <header-frame> or <variably-typed-container-frame>. The layering-value will be registered for the specified over-binary-format.
field-spec: [*field-attribute*] field *field-name* [:: *field-type*] [= *default-value*], [*keyword-arguments* *] [;]
field-attribute: variably-typed | layering | repeated | enum
mapping: { *key* <=> *value* }
field-name: Each field has a unique field-name, which is used as the name of the getter and setter methods.
field-type: The field-type can be any subclass of <frame>; it is required unless the variably-typed attribute is provided.
default-value: The default-value should be an instance of the given field-type.
field-attribute: Syntactic sugar for some common patterns is available via attributes.
variably-typed instantiates a <variably-typed-field>.
layering instantiates a <layering-field>.
repeated instantiates a <repeated-field>.
enum instantiates an <enum-field>.
keyword-arguments: Depending on the field type, various keywords are supported. Many values are standard Dylan expressions in which the current frame object is implicitly bound to frame; these are indicated by frame-expression.
fixup: A frame-expression computing the field value if no default was supplied, and the client didn’t provide one (handy for length fields).
start: A frame-expression computing the start bit of the field in the frame.
end: A frame-expression computing the end bit of the field in the frame.
length: A frame-expression computing the length of the field.
static-start: A Dylan expression stating the start of the field in the frame.
static-end: A Dylan expression stating the end of the field in the frame.
static-length: A Dylan expression stating the length of the field.
type-function: A frame-expression computing the type of this <variably-typed-field>.
count: A frame-expression computing the number of repetitions of this <count-repeated-field>.
reached-end?: A frame-expression returning a <boolean> indicating whether this <self-delimited-repeated-field> has reached its end.
mappings: A mapping for <enum-field> between values and <symbol>.
The list of fields is instantiated once for each binary data definition. If a static start offset, length, and end offset can be trivially computed (using constant folding), this is done during macro processing.
Several generic functions can be specialized on the binary-format-name for custom behaviour:
Note
rename start, end, length to dynamic-start, dynamic-end, dynamic-length
Note
Check whether those field attributes compose in some way
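The grammar of define binary-data is easier to follow with a concrete definition. The sketch below defines two hypothetical formats, <sample-header> and <sample-payload>, using only frame types that this reference lists as predefined. The field names and the layering value 1 are invented for illustration, and the definitions have not been checked against a particular release of the library.

```dylan
// Sketch only: a layered header with a one-byte layering field.
define binary-data <sample-header> (<header-frame>)
  // summary takes a format string followed by field getters.
  summary "SAMPLE version %=", version;
  field version :: <4bit-unsigned-integer> = 1;
  field flags :: <4bit-unsigned-integer> = 0;
  // The layering field determines the concrete payload type.
  layering field payload-kind :: <unsigned-byte>;
  // type-function is spelled out here; payload-type is the documented default.
  variably-typed field payload, type-function: payload-type(frame);
end;

// Sketch only: registered as the payload of <sample-header>
// whenever payload-kind has the value 1.
define binary-data <sample-payload> (<container-frame>)
  over <sample-header> 1;
  field value :: <2byte-big-endian-unsigned-integer>;
end;
```

Each field line yields a getter and setter named after the field, so version(frame) and payload(frame) would be available on instances of <sample-header>.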
-
fixup!
Open Generic function¶ Fixes data in an assembled container frame.
- Signature
fixup! frame => ()
- Parameters
frame – A union of <container-frame> and <raw-frame>. Usually specialized on a subclass of <unparsed-container-frame>.
- Discussion
Used for post-assembly of certain fields, such as checksum calculations in IPv4, ICMP, and TCP frames, or compression of domain names in DNS fragments.
Defining a Custom Leaf Frame¶
A common structure in binary data formats is a sequence of consecutive ranges of bits or bytes, each with a different meaning. Some macros are available to define frame types for common patterns.
-
field-size
Open Generic function¶ Returns the static size of a given frame type. Should be specialized for custom fixed sized frames.
-
high-level-type
Open Generic function¶ For translated frames, return the native Dylan type. Otherwise identity.
- Signature
high-level-type frame-type => type
- Parameters
frame-type – An instance of subclass(<frame>).
- Values
type – An instance of <type>.
-
assemble-frame-into
Open Generic function¶ Shuffle the bits in the given packet so that the frame is encoded correctly.
- Signature
assemble-frame-into frame packet => length
- Parameters
frame – An instance of <frame>.
packet – An instance of <stretchy-vector-subsequence>.
- Values
length – An instance of <integer>.
-
assemble-frame-into-as
Open Generic function¶ Shuffle the bits in the given packet so that the frame is encoded correctly as the given frame-type.
- Signature
assemble-frame-into-as frame-type frame packet => length
- Parameters
frame-type – A subclass of <translated-frame>.
frame – An instance of <object>.
packet – An instance of <stretchy-vector-subsequence>.
- Values
length – An instance of <integer>.
-
define n-bit-unsigned-integer
Defining Macro¶ Describes an <integer> represented by a bit vector of arbitrary size.
- Macro Call
define n-bit-unsigned-integer (*class-name* ; *bits* ) end
- Parameters
class-name – A Dylan class name which is defined by this macro.
bits – The number of bits represented by this frame.
- Discussion
Defines the class class-name with <unsigned-integer-bit-frame> as its superclass.
There are several predefined classes of the form <Kbit-unsigned-integer> with K between 1 and 15, and 20.
- Operations
high-level-type returns limited(<integer>, min: 0, max: 2 ^ bits - 1).
field-size returns bits.
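As a hedged example of the macro call, the sketch below defines a 24 bit frame type, a width that is not among the predefined classes. The class name <24bit-unsigned-integer> follows the naming pattern of the predefined classes and is otherwise an arbitrary choice.

```dylan
// Sketch only: defines a frame type for a 24 bit unsigned integer.
define n-bit-unsigned-integer (<24bit-unsigned-integer>; 24) end;

// Per the Operations listed above, field-size for this class returns 24,
// and high-level-type returns limited(<integer>, min: 0, max: 2 ^ 24 - 1).
```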
-
define n-byte-unsigned-integer
Defining Macro¶ Describes an <integer> represented by a byte vector of arbitrary size and encoding (little or big endian).
- Macro Call
define n-byte-unsigned-integer (*class-name-prefix* ; *bytes*) end
- Parameters
class-name-prefix – A prefix for the class name which is defined by this macro.
bytes – The number of bytes represented by this frame.
- Discussion
Defines the classes class-name-prefix-big-endian-unsigned-integer> (superclass <big-endian-unsigned-integer-byte-frame>) and class-name-prefix-little-endian-unsigned-integer> (superclass <little-endian-unsigned-integer-byte-frame>).
The following classes are predefined: <2byte-big-endian-unsigned-integer>, <2byte-little-endian-unsigned-integer>, <3byte-big-endian-unsigned-integer>, and <3byte-little-endian-unsigned-integer>.
- Operations
high-level-type returns limited(<integer>, min: 0, max: 2 ^ (8 * *bytes*) - 1).
field-size returns bytes * 8.
-
define n-byte-vector
Defining Macro¶ Defines a class with an underlying fixed size byte vector.
- Macro Call
define n-byte-vector (*class-name* , *bytes*) end
- Parameters
class-name – A standard Dylan class name.
bytes – The number of bytes represented by this frame.
- Discussion
Defines the class class-name, as a subclass of <fixed-size-byte-vector-frame>. Calls define leaf-frame-constructor with the given class-name (without surrounding angle brackets).
- Operations
field-size returns bytes * 8.
-
define leaf-frame-constructor
Defining Macro¶ Defines constructors for a given name.
- Macro Call
define leaf-frame-constructor (*constructor-name*) end
- Parameters
constructor-name – The name of the constructor.
- Discussion
Defines the generic function constructor-name and three specializations:
- Operations
constructor-name <byte-vector>, which calls parse-frame.
constructor-name <collection>, which converts the <collection> into a <byte-vector> and calls constructor-name.
constructor-name <string>, which calls read-frame.
Predefined Leaf Frames¶
-
<unsigned-integer-bit-frame>
Abstract Class¶ The superclass of all bit frames. Concrete classes are defined with the define n-bit-unsigned-integer macro.
- Superclasses
- Operations
See also
-
<boolean-bit>
Class¶ A single bit, at the Dylan level a <boolean>.
The high-level-type returns <boolean>. The field-size returns 1.
- Superclasses
-
<unsigned-byte>
Class¶ A single byte, represented as a <byte>.
- Operations
high-level-type returns <byte>.
field-size returns 8.
- Superclasses
-
<variable-size-byte-vector>
Abstract Class¶ A byte vector of arbitrary size, provided externally.
- Superclasses
-
<externally-delimited-string>
Class¶ A <string> of a certain length, externally delimited. The conversion method as is specialised on <string> and <externally-delimited-string>.
- Superclasses
Note
This should be a variable-size translated leaf frame, if that is possible.
-
<raw-frame>
Class¶ The bottom of the type hierarchy: if nothing is known, a <raw-frame> is all you can have. hexdump can be used to inspect the frame contents.
- Superclasses
-
<fixed-size-byte-vector-frame>
Open Abstract Class¶ A vector of any number of bytes with a custom representation. Used for IP addresses and MAC addresses, among others.
- Superclasses
- Init-Keywords
data – The underlying byte vector.
- Operations
See also
-
<big-endian-unsigned-integer-byte-frame>
Abstract Class¶ A frame representing an <integer> of a certain size, depending on the size of the underlying byte vector.
The macro define n-byte-unsigned-integer defines subclasses with a certain size.
- Superclasses
- Operations
See also
32 Bit Frames¶
The <integer> type in Dylan is represented by only 30 bits, so 32 bit frames which should be represented as a <number> require a workaround. The workaround consists of using <fixed-size-byte-vector-frame> and converting to <double-float> values.
Note
This hack is awful and should be replaced by native 32 bit integers, or machine words.
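The conversion at the heart of this workaround can be sketched in standard Dylan. The function below is a hypothetical stand-in, not the library's byte-vector-to-float-be; it only shows why a <double-float> can carry an unsigned 32 bit value that does not fit into a 30 bit <integer>.

```dylan
// Sketch only: interprets the bytes of a sequence, most significant
// byte first, as one unsigned number held in a <double-float>.
define function bytes-to-double-be (bytes :: <sequence>)
 => (result :: <double-float>)
  let result = 0.0d0;
  for (byte in bytes)
    // Shift the accumulated value by one byte and add the next byte.
    result := result * 256.0d0 + as(<double-float>, byte);
  end for;
  result
end function;

// For the four bytes #xff, #xff, #xff, #xff the result is 2 ^ 32 - 1,
// which a <double-float> represents exactly.
```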
-
<big-endian-unsigned-integer-4byte>
Class¶ - Superclasses
-
<little-endian-unsigned-integer-4byte>
Class¶ - Superclasses
-
big-endian-unsigned-integer-4byte
Generic function¶ - Signature
big-endian-unsigned-integer-4byte (data) => (#rest results)
- Parameters
data – An instance of <object>.
- Values
#rest results – An instance of <object>.
-
little-endian-unsigned-integer-4byte
Generic function¶ - Signature
little-endian-unsigned-integer-4byte (data) => (#rest results)
- Parameters
data – An instance of <object>.
- Values
#rest results – An instance of <object>.
-
byte-vector-to-float-be
Function¶ - Signature
byte-vector-to-float-be (bv) => (res)
- Parameters
bv – An instance of <stretchy-byte-vector-subsequence>.
- Values
res – An instance of <float>.
-
byte-vector-to-float-le
Function¶ - Signature
byte-vector-to-float-le (bv) => (res)
- Parameters
bv – An instance of <stretchy-byte-vector-subsequence>.
- Values
res – An instance of <float>.
-
float-to-byte-vector-be
Function¶ - Signature
float-to-byte-vector-be (float) => (res)
- Parameters
float – An instance of <float>.
- Values
res – An instance of <byte-vector>.
-
float-to-byte-vector-le
Function¶ - Signature
float-to-byte-vector-le (float) => (res)
- Parameters
float – An instance of <float>.
- Values
res – An instance of <byte-vector>.
Stretchy Vector Subsequences¶
The underlying byte vector which is used in binary data is a <stretchy-byte-vector>. To allow zero-copy parsing and to provide each frame parser only with a byte vector of the required size for the type, there is a <stretchy-vector-subsequence> which tracks the byte vector together with a start and end index.
Note
Should live in a separate module and types can be narrowed a bit further.
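The zero-copy idea can be sketched with a small stand-alone class in standard Dylan. <byte-view>, view-element, and narrow-view are hypothetical names for illustration and are not the library's <stretchy-vector-subsequence> API; only the data, start, and end init-keywords mirror the ones listed for that class below.

```dylan
// Sketch only: a view shares the underlying vector and remembers a window.
define class <byte-view> (<object>)
  constant slot view-data :: <vector>, required-init-keyword: data:;
  constant slot view-start :: <integer> = 0, init-keyword: start:;
  constant slot view-end :: <integer>, required-init-keyword: end:;
end class;

// Read one element, with the index relative to the start of the view.
define function view-element (view :: <byte-view>, index :: <integer>)
 => (byte :: <object>)
  let position = view.view-start + index;
  if (index < 0 | position >= view.view-end)
    error("index %d is outside the view", index);
  end if;
  view.view-data[position]
end function;

// Narrow a view without copying: the new view points at the same data.
define function narrow-view
    (view :: <byte-view>, first-index :: <integer>, last-index :: <integer>)
 => (narrowed :: <byte-view>)
  make(<byte-view>,
       data: view.view-data,
       start: view.view-start + first-index,
       end: view.view-start + last-index)
end function;
```

Handing each nested parser a narrowed view means it can only see the bytes that belong to its own frame, while no bytes are ever copied.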
-
<stretchy-vector-subsequence>
Abstract Class¶ - Superclasses
<vector>
- Init-Keywords
data –
end –
start –
-
subsequence
Generic function¶ - Signature
subsequence (seq) => (#rest results)
- Parameters
seq – An instance of <object>.
- Values
#rest results – An instance of <object>.
-
<stretchy-byte-vector-subsequence>
Class¶ - Superclasses
-
decode-integer
Generic function¶ - Signature
decode-integer (seq count) => (#rest results)
- Parameters
seq – An instance of <object>.
count – An instance of <object>.
- Values
#rest results – An instance of <object>.
-
encode-integer
Generic function¶ - Signature
encode-integer (value seq count) => (#rest results)
- Parameters
value – An instance of <object>.
seq – An instance of <object>.
count – An instance of <object>.
- Values
#rest results – An instance of <object>.