Core Types
Fennel supports the following data types, expressed as native Python type hints.
- int: Implemented as a signed 8-byte integer (int64).
- float: Implemented as an 8-byte double-precision floating-point number (float64).
- Decimal[int]: Implemented as a signed 16-byte integer (int128), with the int value as the precision.
- bool: Implemented as a standard 1-byte boolean.
- str: Arbitrary sequence of UTF-8 characters. As in most programming languages, str doesn't support arbitrary binary bytes.
- bytes: Arbitrary sequence of binary bytes. This is useful for storing binary data.
- List[T]: List of elements of any other valid type T. Unlike Python lists, all elements must have the same type.
- Dict[str, T]: Map from str to data of any valid type T. Fennel does not support dictionaries with arbitrary key types; please reach out to Fennel support if you have use cases requiring that.
- Optional[T]: Same as Python Optional; permits either None or values of type T.
- Embedding[int]: Denotes a list of floats of the given fixed length, e.g. Embedding[32] describes a list of 32 floats. This is the same as list[float] but enforces the list length, which is important for dot products and similar operations on embeddings.
- datetime: Describes a timestamp, stored internally as an 8-byte signed integer counting microseconds since the Unix epoch in UTC (so the minimum granularity is one microsecond). Can be natively parsed from multiple formats.
- date: Describes a date, stored internally as an 8-byte signed integer counting days since the Unix epoch in UTC. Can be natively parsed from multiple formats.
- struct: Describes the equivalent of a struct or dataclass: a container with a fixed set of fields of fixed types.
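To make the timestamp and date encodings concrete, here is a minimal standard-library sketch of the two integer representations described above. The helper names `to_epoch_micros` and `to_epoch_days` are hypothetical and are not part of Fennel's API; the sketch only illustrates the arithmetic.

```python
from datetime import date, datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_micros(ts: datetime) -> int:
    """Microseconds since the Unix epoch for a timezone-aware datetime."""
    delta = ts.astimezone(timezone.utc) - EPOCH
    # Integer arithmetic on the timedelta fields avoids float rounding.
    return (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds


def to_epoch_days(d: date) -> int:
    """Whole days since 1970-01-01."""
    return (d - date(1970, 1, 1)).days


micros = to_epoch_micros(datetime(2024, 1, 1, 12, 0, 0, 250, tzinfo=timezone.utc))
days = to_epoch_days(date(2024, 1, 1))
print(micros, days)
```

Both results fit comfortably in an 8-byte signed integer, which is why a single int64 is enough for either type.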
Fennel uses a strong type system: once data has been ingested, it is never
auto-coerced across types. For instance, supplying an int where a float is
expected is an error, raised either at compile time or at runtime.
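The no-coercion rule can be illustrated with a small plain-Python sketch. The helpers `is_exact_type` and `require_type` are hypothetical names used only for illustration; this is not how Fennel implements its checks. The sketch also assumes that bool is not accepted where int is expected.

```python
def is_exact_type(value, expected: type) -> bool:
    """Return True only if value's type is exactly `expected` (no coercion)."""
    # type(...) is compared directly instead of using isinstance, so that
    # bool (a subclass of int in Python) is not accepted where int is expected.
    return type(value) is expected


def require_type(value, expected: type):
    """Return value unchanged, or raise TypeError instead of coercing it."""
    if not is_exact_type(value, expected):
        raise TypeError(
            f"expected {expected.__name__}, got {type(value).__name__}: {value!r}"
        )
    return value


price = require_type(3.0, float)  # accepted: already a float
try:
    require_type(3, float)  # rejected: an int is not silently widened
except TypeError as err:
    print(err)
```

In practice this means numeric literals should be written with the intended type from the start, for example 3.0 rather than 3 for a float field.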
```python
# imports for data types
from typing import List, Optional, Dict
from datetime import datetime
from fennel.dtypes import struct

# imports for datasets
from fennel.datasets import dataset, field
from fennel.lib import meta

@struct  # like dataclass but verifies that fields have valid Fennel types
class Address:
    street: str
    city: str
    state: str
    zip_code: Optional[str]

@meta(owner="[email protected]")
@dataset
class Student:
    id: int = field(key=True)
    name: str
    grades: Dict[str, float]
    honors: bool
    classes: List[str]
    address: Address  # Address is now a valid Fennel type
    signup_time: datetime
```
Type Restrictions
Fennel type restrictions allow you to put additional constraints on base types and restrict the set of valid values in some form.
- regex("<pattern>"): Restriction on the base type str. Permits only strings matching the given regex pattern.
- between(T, low, high): Restriction on the base type int or float. Permits only numbers between low and high (both inclusive by default). Either bound can be made exclusive by setting strict_min or strict_max to True respectively.
- oneof(T, [values...]): Restricts a type T to accept only one of the given values. oneof can be thought of as a more general version of enum. For the restriction to be valid, all the values must themselves be of type T.
```python
# imports for data types
from datetime import datetime, timezone
from fennel.dtypes import oneof, between, regex

# imports for datasets
from fennel.datasets import dataset, field
from fennel.lib import meta
from fennel.connectors import source, Webhook

webhook = Webhook(name="fennel_webhook")

@meta(owner="[email protected]")
@source(webhook.endpoint("UserInfoDataset"), disorder="14d", cdc="upsert")
@dataset
class UserInfoDataset:
    user_id: int = field(key=True)
    name: str
    age: between(int, 0, 100, strict_min=True)
    gender: oneof(str, ["male", "female", "non-binary"])
    email: regex(r"[^@]+@[^@]+\.[^@]+")
    timestamp: datetime
```
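To show what each restriction accepts and rejects, here is a rough plain-Python sketch of the three checks. The functions `check_between`, `check_oneof`, and `check_regex` are hypothetical helpers, not Fennel APIs. The sketch assumes the regex must match the whole string; Fennel's actual matching semantics may differ.

```python
import re


def check_between(value, low, high, strict_min=False, strict_max=False) -> bool:
    """Inclusive range check; a strict flag makes that bound exclusive."""
    above = value > low if strict_min else value >= low
    below = value < high if strict_max else value <= high
    return above and below


def check_oneof(value, allowed) -> bool:
    """Accept only values that appear in the allowed list."""
    return value in allowed


def check_regex(value: str, pattern: str) -> bool:
    """Assumption: the whole string must match the pattern."""
    return re.fullmatch(pattern, value) is not None


# Mirrors the fields of UserInfoDataset above.
print(check_between(0, 0, 100, strict_min=True))    # lower bound is exclusive
print(check_between(100, 0, 100, strict_min=True))  # upper bound stays inclusive
print(check_oneof("male", ["male", "female", "non-binary"]))
print(check_regex("user@example.com", r"[^@]+@[^@]+\.[^@]+"))
```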
Type Restriction Composition
These restricted types act as regular types -- they can be mixed/matched to form complex composite types. For instance, the following are all valid Fennel types:
- list[regex('^[0-9]{5}$')]: a list of strings, each matching a five-digit US zip code
- oneof(Optional[int], [None, 0, 1]): a nullable type whose only valid non-null values are 0 and 1
Data belonging to the restricted types is still stored & transmitted (e.g. in json encoding) as a regular base type. It's just that Fennel will reject data of base type that doesn't meet the restriction.
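The following sketch illustrates both points: composite restrictions are checked element by element, and the validated data is still encoded as ordinary base types. The validators `valid_zip_list` and `valid_nullable_bit` are hypothetical stand-ins for the two composite types listed above, written in plain Python rather than with Fennel's own machinery.

```python
import json
import re

ZIP = re.compile(r"^[0-9]{5}$")


def valid_zip_list(values) -> bool:
    """Stand-in for a list of five-digit zip code strings."""
    return all(isinstance(v, str) and ZIP.fullmatch(v) is not None for v in values)


def valid_nullable_bit(value) -> bool:
    """Stand-in for oneof(Optional[int], [None, 0, 1])."""
    return value is None or (type(value) is int and value in (0, 1))


record = {"zips": ["94105", "10001"], "flag": None}
assert valid_zip_list(record["zips"]) and valid_nullable_bit(record["flag"])

# The encoded form carries no trace of the restriction: just strings and null.
encoded = json.dumps(record)
print(encoded)
```

A consumer reading the JSON sees only lists, strings, and null, so the restriction has to be enforced where the data is written.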
Duration
Fennel lets you express durations in an easy-to-read, natural-language format, using the unit symbols below:
| Symbol | Unit | 
|---|---|
| y | Year | 
| w | Week | 
| d | Day | 
| h | Hour | 
| m | Minute | 
| s | Second | 
There is no shortcut for month because the length of a month varies too
much: a month can be 28, 29, 30, or 31 days long. A common convention in ML
is to use 4 weeks to describe a month.
A year is hardcoded to be exactly 365 days and doesn't account for variations such as leap years.
```text
"7h" -> 7 hours
"12d" -> 12 days
"2y" -> 2 years
"3h 20m 4s" -> 3 hours 20 minutes and 4 seconds
"2y 4w" -> 2 years and 4 weeks
```
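As a rough illustration of how such strings map to a length of time, the sketch below converts a duration string to seconds using the unit table above, with a year fixed at 365 days. The function `duration_to_seconds` is a hypothetical helper, not Fennel's parser. It assumes components are separated by whitespace and that each one is a whole number followed by a single unit symbol.

```python
import re

SECONDS_PER_UNIT = {
    "y": 365 * 86_400,  # a year is exactly 365 days
    "w": 7 * 86_400,
    "d": 86_400,
    "h": 3_600,
    "m": 60,
    "s": 1,
}
_TOKEN = re.compile(r"(\d+)([ywdhms])")


def duration_to_seconds(text: str) -> int:
    """Sum the components of a duration string such as '3h 20m 4s'."""
    tokens = text.split()
    if not tokens:
        raise ValueError("empty duration")
    total = 0
    for token in tokens:
        match = _TOKEN.fullmatch(token)
        if match is None:
            raise ValueError(f"invalid duration component: {token!r}")
        count, unit = match.groups()
        total += int(count) * SECONDS_PER_UNIT[unit]
    return total


print(duration_to_seconds("3h 20m 4s"))
print(duration_to_seconds("2y 4w"))
```

Because there is no month symbol, a string like "1mo" is rejected by this sketch; "4w" is the conventional substitute.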