Performance Tuning

Derived Extractors

For certain common extractor patterns, Fennel can derive extractors directly from the feature definitions. These derived extractors carry metadata that allows the Fennel backend to significantly improve performance: for these extractors, Fennel avoids calling into the Python runtime that is generally needed to execute extractor logic.

A feature can have at most one extractor, whether Python-based or derived.

The following extractor types are supported:

  1. Dataset lookup extractors. These extractors perform a lookup on a single field of a dataset, optionally supply a default value for missing rows, and assign the output to a single feature. Here's an example of a manually written extractor of this form:
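from datetime import datetime
import pandas as pd
# Import paths assume the Fennel Python client; adjust them to your version.
from fennel.datasets import dataset, field
from fennel.featuresets import extractor, feature, featureset
from fennel.lib.metadata import meta
from fennel.lib.schema import inputs, outputs

@meta(owner="[email protected]")
@dataset
class User:
    uid: int = field(key=True)
    name: str
    timestamp: datetime

@meta(owner="[email protected]")
@featureset
class UserFeatures:
    uid: int = feature(id=1)
    name: str = feature(id=2)

    @extractor(depends_on=[User])
    @inputs(uid)
    @outputs(name)
    def func(cls, ts: pd.Series, uids: pd.Series):
        # Look up the User row for each uid as of the given timestamps.
        names, found = User.lookup(ts, uid=uids)
        # Missing rows come back null; replace them with the default value.
        names.fillna("Unknown", inplace=True)
        return names[["name"]]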
  2. Aliases. These extractors map an input feature to an output feature in one direction only.
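
To make the two patterns concrete, here is a minimal pure-Python sketch of what each kind of extractor computes. It is an illustration of the semantics only, not Fennel's implementation; USER_ROWS, lookup_extractor and alias_extractor are hypothetical names introduced for this example.

```python
from typing import Any, Dict, List, Optional

# Hypothetical in-memory stand-in for the User dataset: key (uid) -> row.
USER_ROWS: Dict[int, Dict[str, Any]] = {
    1: {"name": "Alice"},
    2: {"name": "Bob"},
}


def lookup_extractor(
    keys: List[int], field: str, default: Optional[Any] = None
) -> List[Any]:
    """Return `field` for each key, or `default` when no row exists for the key."""
    out = []
    for key in keys:
        row = USER_ROWS.get(key)
        out.append(row[field] if row is not None else default)
    return out


def alias_extractor(values: List[Any]) -> List[Any]:
    """Copy the input feature's values to the output feature unchanged."""
    return list(values)


# uid 3 has no row, so the default is used for it.
names = lookup_extractor([1, 2, 3], field="name", default="Unknown")
uids = alias_extractor([1, 2, 3])
```

Both functions are fully described by their arguments, which is the kind of information a backend could use to evaluate them without running user-written Python.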


Derived extractors are declared with the feature.extract() function. Here is an example:

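@meta(owner="[email protected]")
@featureset
class Request:
    user_id: int = feature(id=1)

@meta(owner="[email protected]")
@featureset
class UserFeaturesDerived:
    # Alias: uid takes its value from Request.user_id.
    uid: int = feature(id=1).extract(feature=Request.user_id)
    # Lookup: name is read from User.name, with a default for missing rows.
    name: str = feature(id=2).extract(field=User.name, default="Unknown")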

In this example, UserFeaturesDerived.uid is an alias to Request.user_id. Aliasing is specified via the feature kwarg. UserFeaturesDerived.name specifies a lookup extractor, with the same functionality as the extractor func defined above. The lookup extractor uses the following arguments:

  • field - The dataset field to do a lookup on
  • default - An optional default value for rows not found
  • provider - A featureset that provides the values matching the keys of the dataset to look up. The input feature name must match the field name. If not provided, as in the above example, then the current featureset is assumed to be the provider. If the input feature that the extractor needs does not match the name of the field, an alias extractor can be defined, as is the case with UserFeaturesDerived.uid in the above example.
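
The provider rule above can be sketched in plain Python. This models only the name-matching logic described in the list, under the assumption that a featureset can be represented as a dict from feature name to values; Featureset and resolve_lookup_input are hypothetical names and not part of the Fennel API.

```python
from typing import Any, Dict, List, Optional

# Hypothetical model: a featureset is a dict from feature name to its values.
Featureset = Dict[str, List[Any]]


def resolve_lookup_input(
    key_field: str,
    current: Featureset,
    provider: Optional[Featureset] = None,
) -> List[Any]:
    """Pick the feature that supplies lookup keys for the dataset key `key_field`.

    The provider defaults to the current featureset, and the input feature
    must have the same name as the dataset key field.
    """
    source = provider if provider is not None else current
    if key_field not in source:
        raise KeyError(f"provider has no feature named {key_field!r}")
    return source[key_field]


request = {"user_id": [7, 8]}
# An alias gives the current featureset a feature whose name matches the key.
user_features = {"uid": list(request["user_id"])}
request2 = {"uid": [9]}

keys_default = resolve_lookup_input("uid", current=user_features)
keys_explicit = resolve_lookup_input("uid", current=user_features, provider=request2)
```

Passing `request` as the provider would raise a KeyError in this sketch, because its feature is named user_id rather than uid. That is the situation the alias extractor resolves.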

Here is an example where the input provider is a different featureset:

@meta(owner="[email protected]")
@featureset
class Request2:
    uid: int = feature(id=1)

@meta(owner="[email protected]")
@featureset
class UserFeaturesDerived2:
    # Request2.uid supplies the lookup keys because its name matches User.uid.
    name: str = feature(id=1).extract(
        field=User.name, provider=Request2, default="Unknown"
    )