Core Types

Fennel supports the following data types, expressed as native Python type hints.

int

Implemented as a signed 8-byte integer (int64).

float

Implemented as a signed 8-byte float with double precision.

Decimal[int]

Implemented as a signed 16-byte integer (int128), with the int parameter giving the precision.

bool

Implemented as standard 1 byte boolean

str

Arbitrary sequence of UTF-8 characters. As in most programming languages, str does not support arbitrary binary bytes; use bytes for that.

bytes

Arbitrary sequence of binary bytes. This is useful for storing binary data.

List[T]

List of elements of any other valid type T. Unlike Python lists, all elements must have the same type.

dict[T]

Map from str to data of any valid type T.

Fennel does not support dictionaries with arbitrary types for keys - please reach out to Fennel support if you have use cases requiring that.

Optional[T]

Same as Python Optional - permits either None or values of type T.

Embedding[int]

Denotes a list of floats of the given fixed length, i.e. Embedding[32] describes a list of 32 floats. This is the same as list[float] but enforces the list length, which is important for dot products and other similar operations on embeddings.
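To illustrate why a fixed length matters, here is a minimal plain-Python sketch. It is not Fennel's implementation; `check_embedding` and `dot` are hypothetical helpers introduced only for this example.

```python
from typing import List


def check_embedding(values: List[float], dim: int) -> List[float]:
    """Return values unchanged if it holds exactly `dim` floats, else raise."""
    if len(values) != dim:
        raise ValueError(f"expected {dim} floats, got {len(values)}")
    if not all(type(v) is float for v in values):
        raise TypeError("every element of an embedding must be a float")
    return values


def dot(a: List[float], b: List[float]) -> float:
    """Dot product; only well defined when both vectors have equal length."""
    if len(a) != len(b):
        raise ValueError("dot product needs vectors of equal length")
    return sum(x * y for x, y in zip(a, b))


# Both vectors are validated against the same dimension before use.
u = check_embedding([1.0, 2.0, 3.0], 3)
v = check_embedding([4.0, 5.0, 6.0], 3)
similarity = dot(u, v)  # 1*4 + 2*5 + 3*6
```

Checking the length once, at the type level, means downstream operations never have to handle mismatched vectors.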

datetime

Describes a timestamp, stored internally as an 8-byte signed integer counting microseconds since the Unix epoch in UTC (so the finest granularity is one microsecond). Can be natively parsed from multiple formats.
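The microseconds-since-epoch representation can be reproduced with the Python standard library. This is only a sketch of the encoding described above, not Fennel's own parsing code; `to_micros` and `from_micros` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone

# The Unix epoch as a timezone-aware UTC datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_micros(ts: datetime) -> int:
    """Microseconds since the Unix epoch for a timezone-aware datetime."""
    return (ts - EPOCH) // timedelta(microseconds=1)


def from_micros(micros: int) -> datetime:
    """Inverse of to_micros; the result is expressed in UTC."""
    return EPOCH + timedelta(microseconds=micros)


ts = datetime(2024, 1, 1, 12, 30, 0, 250000, tzinfo=timezone.utc)
micros = to_micros(ts)          # integer that fits in 8 signed bytes
restored = from_micros(micros)  # round-trips without losing precision
```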

date

Describes a date, stored internally as an 8-byte signed integer counting days since the Unix epoch in UTC. Can be natively parsed from multiple formats.
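The days-since-epoch encoding can be sketched the same way with the standard library. The helper names `to_days` and `from_days` are assumptions made for this example and are not part of Fennel.

```python
from datetime import date, timedelta

EPOCH_DATE = date(1970, 1, 1)


def to_days(d: date) -> int:
    """Number of whole days between the Unix epoch and d."""
    return (d - EPOCH_DATE).days


def from_days(days: int) -> date:
    """Inverse of to_days."""
    return EPOCH_DATE + timedelta(days=days)


days = to_days(date(2024, 1, 1))
restored_date = from_days(days)
```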

struct {k1: T1, k2: T2, ...}

Describes the equivalent of a struct or dataclass: a container with a fixed set of fields of fixed types.

Note

Fennel uses a strong type system: after ingestion, data is not automatically coerced across types. For instance, it is a compile-time or runtime error if a value is expected to be of type float but an int is received instead.

```python
# imports for data types
from typing import List, Optional, Dict
from datetime import datetime
from fennel.dtypes import struct

# imports for datasets
from fennel.datasets import dataset, field
from fennel.lib import meta

@struct  # like dataclass but verifies that fields have valid Fennel types
class Address:
    street: str
    city: str
    state: str
    zip_code: Optional[str]

@meta(owner="[email protected]")
@dataset
class Student:
    id: int = field(key=True)
    name: str
    grades: Dict[str, float]
    honors: bool
    classes: List[str]
    address: Address  # Address is now a valid Fennel type
    signup_time: datetime
```
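The no-coercion rule from the note above can be pictured with a plain-Python check. This is a sketch of the idea only; `strict_check` is a hypothetical helper and does not reflect how Fennel performs validation internally.

```python
def strict_check(value, expected: type):
    """Accept value only if its type is exactly `expected` (no coercion)."""
    if type(value) is not expected:
        raise TypeError(
            f"expected {expected.__name__}, got {type(value).__name__}"
        )
    return value


accepted = strict_check(3.0, float)  # already a float, so it passes

try:
    strict_check(3, float)  # an int is not silently widened to float
    coerced = True
except TypeError:
    coerced = False
```

Plain Python would happily use `3` wherever a float is expected; a strict check like this rejects it, so conversions have to happen explicitly before the data is handed over.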

Type Restrictions

Fennel type restrictions allow you to put additional constraints on base types and restrict the set of valid values in some form.

regex: regex("<pattern>")

Restriction on the base type of str. Permits only the strings matching the given regex pattern.

between: between(T, low, high)

Restriction on the base type of int or float. Permits only the numbers between low and high (both inclusive by default). Either bound can be made exclusive by setting strict_min or strict_max to True respectively.

oneof: oneof(T, [values...])

Restricts a type T to only accept one of the given values as valid values. oneof can be thought of as a more general version of enum.

For the restriction to be valid, all the values must themselves be of type T.

```python
# imports for data types
from datetime import datetime, timezone
from fennel.dtypes import oneof, between, regex

# imports for datasets
from fennel.datasets import dataset, field
from fennel.lib import meta
from fennel.connectors import source, Webhook

webhook = Webhook(name="fennel_webhook")

@meta(owner="[email protected]")
@source(webhook.endpoint("UserInfoDataset"), disorder="14d", cdc="upsert")
@dataset
class UserInfoDataset:
    user_id: int = field(key=True)
    name: str
    age: between(int, 0, 100, strict_min=True)
    gender: oneof(str, ["male", "female", "non-binary"])
    email: regex(r"[^@]+@[^@]+\.[^@]+")
    timestamp: datetime
```
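To make the acceptance rules of the three restrictions concrete, the following plain-Python sketch mimics them with hypothetical `check_*` helpers. It is not Fennel code. In particular, whether Fennel matches a regex against the whole string or a substring is not stated here, so the sketch assumes a whole-string match.

```python
import re


def check_between(value, low, high, strict_min=False, strict_max=False):
    """True if value lies in the range; a strict flag excludes that bound."""
    above = value > low if strict_min else value >= low
    below = value < high if strict_max else value <= high
    return above and below


def check_oneof(value, allowed):
    """True if value is one of the explicitly allowed values."""
    return value in allowed


def check_regex(value, pattern):
    """True if the whole string matches pattern (assumed semantics)."""
    return re.fullmatch(pattern, value) is not None


# Mirrors the UserInfoDataset fields in the example above.
age_zero_ok = check_between(0, 0, 100, strict_min=True)  # lower bound excluded
age_max_ok = check_between(100, 0, 100, strict_min=True)  # upper bound included
gender_ok = check_oneof("female", ["male", "female", "non-binary"])
email_ok = check_regex("ada@example.com", r"[^@]+@[^@]+\.[^@]+")
```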

Type Restriction Composition

These restricted types act as regular types: they can be mixed and matched to form complex composite types. For instance, the following are all valid Fennel types:

  • list[regex('^[0-9]{5}$')] - list of strings, each matching a five-digit US zip code
  • oneof(Optional[int], [None, 0, 1]) - a nullable type that only takes 0 or 1 as valid values

Note

Data belonging to the restricted types is still stored & transmitted (e.g. in json encoding) as a regular base type. It's just that Fennel will reject data of base type that doesn't meet the restriction.
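As a rough illustration of this point, the sketch below validates two composite restricted values in plain Python and then encodes them as ordinary JSON. The helpers `valid_zip_list` and `valid_flag` are hypothetical and only mirror the two composite types listed above; they are not Fennel APIs.

```python
import json
import re
from typing import List, Optional

ZIP = re.compile(r"[0-9]{5}")


def valid_zip_list(values: List[str]) -> bool:
    """Every element must be a five-digit string."""
    return all(ZIP.fullmatch(v) is not None for v in values)


def valid_flag(value: Optional[int]) -> bool:
    """Accept None, 0 or 1; the type check keeps True/False out."""
    return value is None or (type(value) is int and value in (0, 1))


record = {"zips": ["94103", "10001"], "flag": None}
is_valid = valid_zip_list(record["zips"]) and valid_flag(record["flag"])

# The restriction leaves no trace in the encoding: plain strings and null.
encoded = json.dumps(record)
```

A consumer that decodes `encoded` sees only base types, so the restriction has to be enforced at the point where data enters the system.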

Duration

Fennel lets you express durations in an easy-to-read, natural-language form as described below:

Symbol  Unit
y       Year
w       Week
d       Day
h       Hour
m       Minute
s       Second

There is no shortcut for month because the length of a month varies: depending on the month, it can be 28, 29, 30 or 31 days. A common convention in ML is to use 4 weeks to describe a month.

Note

A year is hardcoded to be exactly 365 days and doesn't take into account variance like leap years.

```text
"7h" -> 7 hours
"12d" -> 12 days
"3h 20m 4s" -> 3 hours 20 minutes and 4 seconds
"2y" -> 2 years
"2y 4w" -> 2 years and 4 weeks
```
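The duration syntax above can be turned into a `timedelta` with a few lines of Python. This is a minimal sketch, not Fennel's parser: `parse_duration` is a hypothetical helper, and it assumes components are separated by whitespace as in the examples. The real parser may accept other spellings.

```python
import re
from datetime import timedelta

# Seconds per unit, following the table above; a year is exactly 365 days.
UNIT_SECONDS = {
    "y": 365 * 24 * 3600,
    "w": 7 * 24 * 3600,
    "d": 24 * 3600,
    "h": 3600,
    "m": 60,
    "s": 1,
}

# One component: a non-negative integer followed by a single unit symbol.
TOKEN = re.compile(r"(\d+)([ywdhms])")


def parse_duration(text: str) -> timedelta:
    """Parse strings such as "3h 20m 4s" into a timedelta."""
    parts = text.split()
    if not parts:
        raise ValueError("empty duration")
    total = 0
    for part in parts:
        match = TOKEN.fullmatch(part)
        if match is None:
            raise ValueError(f"invalid duration component: {part!r}")
        total += int(match.group(1)) * UNIT_SECONDS[match.group(2)]
    return timedelta(seconds=total)


short = parse_duration("3h 20m 4s")
long = parse_duration("2y 4w")  # 2 * 365 days plus 4 * 7 days
```

Because there is no month symbol, a string such as "1mo" is rejected rather than guessed at.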