All

Function to check if all the elements in a boolean list are True.

Returns

Expr

Returns an expression object denoting the result of the all operation.

Only works when the list's elements are of type bool or Optional[bool]. For an empty list, returns an expression denoting True. If the list has one or more None elements, the result becomes None.

from typing import List, Optional

import pandas as pd
import pytest
from fennel.expr import col

expr = col("x").list.all()

# works for lists of bool or Optional[bool]
assert expr.typeof(schema={"x": List[bool]}) == bool
assert expr.typeof(schema={"x": List[Optional[bool]]}) == Optional[bool]
assert (
    expr.typeof(schema={"x": Optional[List[Optional[bool]]]})
    == Optional[bool]
)

# lists of any other element type are rejected
with pytest.raises(Exception):
    expr.typeof(schema={"x": List[str]})

# can be evaluated as well
df = pd.DataFrame(
    {"x": [[True, True], [True, False], [], None, [True, None]]}
)
schema = {"x": Optional[List[Optional[bool]]]}
assert expr.eval(df, schema=schema).tolist() == [
    True,
    False,
    True,
    pd.NA,
    pd.NA,
]

Checking if all elements of a list are True

Errors

Use of invalid types:

The list namespace must be invoked on an expression that evaluates to list or optional of list. All can only be invoked on lists of bools (or optionals of bool).
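
The null handling of all can be restated in plain Python. This is a minimal sketch, not Fennel's implementation: list_all is a hypothetical helper, and it assumes a missing list is represented by None.

```python
from typing import List, Optional


def list_all(xs: Optional[List[Optional[bool]]]) -> Optional[bool]:
    """Hypothetical reference model of the documented `all` rules."""
    if xs is None:
        # A missing list gives a missing result.
        return None
    if any(x is None for x in xs):
        # One or more None elements make the whole result None.
        return None
    # The empty list falls through here and yields True.
    return all(xs)
```

Applied to the rows of the example dataframe, this model gives True, False, True, None and None.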

Any

Function to check if a boolean list contains any True value.

Returns

Expr

Returns an expression object denoting the result of the any operation.

Only works when the list's elements are of type bool or Optional[bool]. For an empty list, returns an expression denoting False. If the list has one or more None elements, the result becomes None, unless the list also contains True, in which case the result is still True.

from typing import List, Optional

import pandas as pd
import pytest
from fennel.expr import col

expr = col("x").list.any()

# works for lists of bool or Optional[bool]
assert expr.typeof(schema={"x": List[bool]}) == bool
assert expr.typeof(schema={"x": List[Optional[bool]]}) == Optional[bool]
assert (
    expr.typeof(schema={"x": Optional[List[Optional[bool]]]})
    == Optional[bool]
)

# lists of any other element type are rejected
with pytest.raises(Exception):
    expr.typeof(schema={"x": List[str]})

# can be evaluated as well
df = pd.DataFrame(
    {"x": [[True, True], [True, False], [], None, [True, None]]}
)
schema = {"x": Optional[List[Optional[bool]]]}
assert expr.eval(df, schema=schema).tolist() == [
    True,
    True,
    False,
    pd.NA,
    True,
]

Checking if the list has any True value

Errors

Use of invalid types:

The list namespace must be invoked on an expression that evaluates to list or optional of list. Any can only be invoked on lists of bool (or optionals of bool).
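
The rule that a True element wins over None elements is easiest to see as an ordered series of checks. The sketch below is a plain-Python model of the documented behavior; list_any is a hypothetical name, not part of Fennel.

```python
from typing import List, Optional


def list_any(xs: Optional[List[Optional[bool]]]) -> Optional[bool]:
    """Hypothetical reference model of the documented `any` rules."""
    if xs is None:
        # A missing list gives a missing result.
        return None
    if any(x is True for x in xs):
        # A single True settles the answer, even next to None elements.
        return True
    if any(x is None for x in xs):
        # No True was found, but an unknown element might have been True.
        return None
    # Only False elements, or an empty list.
    return False
```

The order of the checks matters: testing for None elements before testing for True would turn [True, None] into None instead of True.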

At

Function to get the value of the element at a given index of the list.

Parameters

index:Expr

The index of the list element to return. This expression is expected to evaluate to an int. Fennel supports indexing by negative integers as well.

from typing import List, Optional

import pandas as pd
import pytest
from fennel.expr import col

expr = col("x").list.at(col("y"))

# at works only for list types; the index can be int or Optional[int]
assert expr.typeof(schema={"x": List[int], "y": int}) == Optional[int]
assert expr.typeof(schema={"x": List[str], "y": int}) == Optional[str]

# a float index is rejected
schema = {"x": Optional[List[float]], "y": float}
with pytest.raises(Exception):
    expr.typeof(schema=schema)

# can be evaluated with a dataframe
df = pd.DataFrame(
    {
        "x": [[1, 2, 3], [4, 5, None], [4, 5, None], None],
        "y": [1, 5, 0, 4],
    }
)
schema = {"x": Optional[List[Optional[int]]], "y": int}
assert expr.eval(df, schema=schema).tolist() == [2, pd.NA, 4, pd.NA]

# schema of column must be list of something
with pytest.raises(ValueError):
    expr.typeof(schema={"x": int})

Getting the value of a list's element at given index

from typing import List, Optional

import pandas as pd
from fennel.expr import col

expr = col("x").list.at(col("y"))

# negative indices down to -len(list) are allowed and index from the end;
# beyond that, they return None like other out-of-bounds indices
df = pd.DataFrame(
    {
        "x": [[1, 2, 3], [4, 5, None], [4, 5, None], None],
        "y": [-1, -5, -2, -4],
    }
)
schema = {"x": Optional[List[Optional[int]]], "y": int}
assert expr.eval(df, schema=schema).tolist() == [3, pd.NA, 5, pd.NA]

Also works with negative indices

Returns

Expr

Returns an expression object denoting the value of the list at the given index. If the index is out of bounds for the list, None is returned. Consequently, for a list of elements of type T, at always returns Optional[T].

Fennel also supports negative indices: -1 maps to the last element of the list, -2 to the second last element of the list and so on. Negative indices smaller than -len start returning None like other out-of-bound indices.

Errors

Use of invalid types:

The list namespace must be invoked on an expression that evaluates to list or optional of list. Similarly, index must evaluate to an element of type int or Optional[int].
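
The bounds rule for at, including negative indices, can be sketched in plain Python. list_at is a hypothetical helper, not Fennel's code. Returning None for a None index is an assumption made here; the text above only states that Optional[int] indices are accepted.

```python
from typing import List, Optional, TypeVar

T = TypeVar("T")


def list_at(xs: Optional[List[Optional[T]]], index: Optional[int]) -> Optional[T]:
    """Hypothetical reference model of the documented `at` rules."""
    if xs is None or index is None:
        # Assumption: a missing list or a missing index gives None.
        return None
    n = len(xs)
    if index < -n or index >= n:
        # Out of bounds in either direction gives None instead of an error.
        return None
    # Valid indices run from -n to n - 1; negatives count from the end.
    return xs[index]
```

Because every out-of-bounds lookup collapses to None, the result is typed Optional[T] even when neither the list nor its elements are optional.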

Contains

Function to check if the given list contains a given element.

Parameters

item:Expr

The item to look for. contains checks whether the base list contains this item.

from typing import List, Optional

import pandas as pd
import pytest
from fennel.expr import col

expr = col("x").list.contains(col("y"))

# contains works only for list types
assert expr.typeof(schema={"x": List[int], "y": int}) == bool
assert (
    expr.typeof(schema={"x": Optional[List[float]], "y": float})
    == Optional[bool]
)

# however doesn't work if item is not of the same type as the list elements
with pytest.raises(ValueError):
    expr.typeof(schema={"x": List[int], "y": str})

# can be evaluated with a dataframe
df = pd.DataFrame(
    {
        "x": [[1, 2, 3], [4, 5, None], [4, 5, None], None, []],
        "y": [1, 5, 3, 4, None],
    }
)
schema = {"x": Optional[List[Optional[int]]], "y": Optional[int]}
assert expr.eval(df, schema=schema).tolist() == [
    True,
    True,
    pd.NA,
    pd.NA,
    False,
]

# schema of column must be list of something
with pytest.raises(ValueError):
    expr.typeof(schema={"x": int})

Checking if a list contains a given item

Returns

Expr

Returns an expression object denoting the result of the contains expression. The resulting expression is of type bool, or Optional[bool] if either the list or the item is nullable.

Note that Fennel expressions borrow semantics from SQL and treat None as an unknown value. As a result, the following rules apply to contains in the presence of nulls:

  • If the base list itself is None, the result is None regardless of the item.
  • If the item is None, the result is None regardless of the list's contents, unless the list is empty, in which case the result is False (if the list is empty, no value of the item can be present in it).
  • If the item is not None and is present in the list, the result is True.
  • However, if the item is not None and is not present in the list, but the list has some None element, the result is still None (any of the None values in the list may have been that item, so we can't say).
  • If the item is not None, is not present in the list, and the list has no None elements, the result is False.

This is somewhat (but not exactly) similar to Spark's array_contains function.
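
The null rules above read naturally as an ordered decision procedure. The following plain-Python sketch restates them. list_contains is a hypothetical helper used only for illustration, not Fennel's implementation.

```python
from typing import List, Optional, TypeVar

T = TypeVar("T")


def list_contains(xs: Optional[List[Optional[T]]], item: Optional[T]) -> Optional[bool]:
    """Hypothetical reference model of the documented `contains` rules."""
    if xs is None:
        # A missing list gives None regardless of the item.
        return None
    if len(xs) == 0:
        # Nothing can be present in an empty list, even an unknown item.
        return False
    if item is None:
        # An unknown item against a non-empty list is unknown.
        return None
    if item in xs:
        return True
    if any(x is None for x in xs):
        # Not found, but an unknown element might have been the item.
        return None
    return False
```

The empty-list check comes before the None-item check, which is what makes the last row of the example dataframe evaluate to False rather than None.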

Note

If you are interested in checking if a list has any None elements, a better way of doing that is to use hasnull.

Errors

Use of invalid types:

The list namespace must be invoked on an expression that evaluates to list or optional of list. Similarly, if the list is of type List[T] (or Optional[List[T]]), item must evaluate to a value of type T or Optional[T].

Filter

Function to filter a list down to elements satisfying a predicate.

Parameters

var:str

The variable name to which each element of the list is bound, one at a time.

predicate:Expr

The predicate expression to be used to filter the list down. This must evaluate to bool for each element of the list. Note that this expression can refer to the element under consideration via var(name) where name is the first argument given to the filter operation (see example for details).

Returns

Expr

Returns an expression object denoting the filtered list.

from typing import List, Optional

import pandas as pd
import pytest
from fennel.expr import col, var

# "x" names the variable bound to each element; var("x") refers to it
expr = col("x").list.filter("x", var("x") % 2 == 0)

# works as long as predicate is valid and evaluates to bool
assert expr.typeof(schema={"x": List[int]}) == List[int]
assert expr.typeof(schema={"x": List[float]}) == List[float]

# the predicate is not valid for strings
with pytest.raises(Exception):
    expr.typeof(schema={"x": List[str]})

# can be evaluated as well
df = pd.DataFrame({"x": [[1, 2, 3], [], [1, 2, -2], None, [1, 3]]})
schema = {"x": Optional[List[int]]}
assert expr.eval(df, schema=schema).tolist() == [
    [2],
    [],
    [2, -2],
    pd.NA,
    [],
]

Filtering the list to only even numbers

Errors

Use of invalid types:

The list namespace must be invoked on an expression that evaluates to list or optional of list.
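
As a rough mental model, filter behaves like a list comprehension applied per row, with a missing list passed through unchanged. In the sketch below an ordinary Python callable stands in for the var-based predicate expression. list_filter is a hypothetical helper, not Fennel's API.

```python
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def list_filter(xs: Optional[List[T]], predicate: Callable[[T], bool]) -> Optional[List[T]]:
    """Hypothetical reference model of the documented `filter` behavior."""
    if xs is None:
        # A missing list stays missing.
        return None
    # Keep elements, in their original order, for which the predicate holds.
    return [x for x in xs if predicate(x)]


def is_even(x: int) -> bool:
    # Plays the role of the predicate var("x") % 2 == 0.
    return x % 2 == 0
```

With is_even as the predicate, this reproduces the rows of the example above, including [1, 2, -2] becoming [2, -2].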

Has Null

Function to check if the given list has any None values.

from typing import List, Optional

import pandas as pd
import pytest
from fennel.expr import col

expr = col("x").list.hasnull()

# hasnull works for any list type or optional list type
assert expr.typeof(schema={"x": List[int]}) == bool
assert expr.typeof(schema={"x": Optional[List[float]]}) == Optional[bool]

# can be evaluated with a dataframe
df = pd.DataFrame({"x": [[1, 2, 3], [4, 5, None], [], None]})
schema = {"x": Optional[List[Optional[int]]]}
assert expr.eval(df, schema=schema).tolist() == [False, True, False, pd.NA]

# schema of column must be list of something
with pytest.raises(ValueError):
    expr.typeof(schema={"x": int})

Checking if a list has any null values

Returns

Expr

Returns an expression object denoting the result of the hasnull function. The resulting expression is of type bool or Optional[bool] depending on the input being nullable.

Errors

Use of invalid types:

The list namespace must be invoked on an expression that evaluates to list or optional of list.
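
Unlike contains, hasnull gives a definite answer for every non-missing list. A minimal plain-Python model, with list_hasnull as a hypothetical name that is not part of Fennel:

```python
from typing import Any, List, Optional


def list_hasnull(xs: Optional[List[Any]]) -> Optional[bool]:
    """Hypothetical reference model of the documented `hasnull` behavior."""
    if xs is None:
        # Only a missing list produces a missing result.
        return None
    # An empty list has no elements, so it has no None elements either.
    return any(x is None for x in xs)
```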

Len

Function to get the length of a list.

from typing import List, Optional

import pandas as pd
import pytest
from fennel.expr import col

expr = col("x").list.len()

# len works for any list type or optional list type
assert expr.typeof(schema={"x": List[int]}) == int
assert expr.typeof(schema={"x": Optional[List[float]]}) == Optional[int]

# can be evaluated with a dataframe
df = pd.DataFrame({"x": [[1, 2, 3], [4, 5], [], None]})
schema = {"x": Optional[List[int]]}
assert expr.eval(df, schema=schema).tolist() == [3, 2, 0, pd.NA]

# schema of column must be list of something
with pytest.raises(ValueError):
    expr.typeof(schema={"x": int})

Getting the length of a list

Returns

Expr

Returns an expression object denoting the result of the len function. The resulting expression is of type int or Optional[int] depending on the input being nullable.

Errors

Use of invalid types:

The list namespace must be invoked on an expression that evaluates to list or optional of list.

Map

Function to map each element of a list to get another list of the same size.

Parameters

var:str

The variable name to which each element of the list is bound, one at a time.

expr:Expr

The expression to be used to transform each element of the list. Note that this expression can refer to the element under consideration via var(name) where name is the first argument given to the map operation (see example for details).

Returns

Expr

Returns an expression object denoting the transformed list.

from typing import List, Optional

import pandas as pd
from fennel.expr import col, var

# "x" names the variable bound to each element; var("x") refers to it
expr = col("x").list.map("x", var("x") % 2)

# works as long as the mapped expression is valid
assert expr.typeof(schema={"x": List[int]}) == List[int]
assert expr.typeof(schema={"x": List[Optional[int]]}) == List[Optional[int]]

# can be evaluated as well
df = pd.DataFrame({"x": [[1, 2, 3], [], [1, 2, None], None, [1, 3]]})
schema = {"x": Optional[List[Optional[int]]]}
expected = [[1, 0, 1], [], [1, 0, pd.NA], pd.NA, [1, 1]]
assert expr.eval(df, schema=schema).tolist() == expected

Transforming the list to get another list

Errors

Use of invalid types:

The list namespace must be invoked on an expression that evaluates to list or optional of list.
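
A plain-Python model of map makes the two levels of null handling explicit: a missing list stays missing, while a None element is handled by the mapped expression itself. Both helper names below are hypothetical. mod2 assumes, based on the example output above, that the % expression propagates None.

```python
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def list_map(xs: Optional[List[T]], fn: Callable[[T], U]) -> Optional[List[U]]:
    """Hypothetical reference model of the documented `map` behavior."""
    if xs is None:
        # A missing list stays missing.
        return None
    # The output always has the same length as the input.
    return [fn(x) for x in xs]


def mod2(x: Optional[int]) -> Optional[int]:
    # Plays the role of var("x") % 2; assumed to propagate None.
    return None if x is None else x % 2
```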

Max

Function to get the maximum value of a list.

Returns

Expr

Returns an expression object denoting the max value of a list.

Only works when the list is of type int/float (or their optional versions). For an empty list, returns an expression denoting 'None'. If the list has one or more None elements, the result becomes None.

from typing import List, Optional

import pandas as pd
import pytest
from fennel.expr import col

expr = col("x").list.max()

# works for lists of int/float or their optional versions;
# the result is always optional because an empty list has no max
assert expr.typeof(schema={"x": List[int]}) == Optional[int]
assert expr.typeof(schema={"x": Optional[List[float]]}) == Optional[float]

with pytest.raises(Exception):
    expr.typeof(schema={"x": List[str]})

# can be evaluated as well
df = pd.DataFrame({"x": [[1, 2, 3], [4, 5, None], [], None]})
schema = {"x": Optional[List[Optional[int]]]}
assert expr.eval(df, schema=schema).tolist() == [3, pd.NA, pd.NA, pd.NA]

Taking the maximum value of a list

Errors

Use of invalid types:

The list namespace must be invoked on an expression that evaluates to list or optional of list. Max can only be invoked on lists of ints/floats (or optionals of ints/floats).
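
The three situations in which max yields None (missing list, empty list, any None element) can be summarized in a short plain-Python model. list_max is a hypothetical helper, not Fennel's implementation.

```python
from typing import List, Optional, Union

Number = Union[int, float]


def list_max(xs: Optional[List[Optional[Number]]]) -> Optional[Number]:
    """Hypothetical reference model of the documented `max` rules."""
    if xs is None or len(xs) == 0:
        # A missing list and an empty list both have no maximum.
        return None
    if any(x is None for x in xs):
        # One unknown element makes the maximum unknown.
        return None
    return max(xs)
```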

Mean

Function to get the mean of the values of a list.

Returns

Expr

Returns an expression object denoting the mean value of a list.

Only works when the list is of type int/float (or their optional versions). For an empty list, returns an expression denoting 'None'. If the list has one or more None elements, the result becomes None.

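The output of this expression is float-typed regardless of whether the list elements are ints or floats. Note that in the example below, typeof reports Optional[float] even for a non-optional List[int] input, which is consistent with an empty list having no mean.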

from typing import List, Optional

import pandas as pd
import pytest
from fennel.expr import col

expr = col("x").list.mean()

# works for lists of int/float or their optional versions
assert expr.typeof(schema={"x": List[int]}) == Optional[float]
assert expr.typeof(schema={"x": Optional[List[float]]}) == Optional[float]

with pytest.raises(Exception):
    expr.typeof(schema={"x": List[str]})

# can be evaluated as well
df = pd.DataFrame({"x": [[1, 2, 3], [4, 5, None], [], None]})
schema = {"x": Optional[List[Optional[int]]]}
assert expr.eval(df, schema=schema).tolist() == [2.0, pd.NA, pd.NA, pd.NA]

Taking the average value of a list

Errors

Use of invalid types:

The list namespace must be invoked on an expression that evaluates to list or optional of list. Mean can only be invoked on lists of ints/floats (or optionals of ints/floats).
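
A plain-Python model shows why mean is float-valued even for int lists and why the empty list needs special handling. list_mean is a hypothetical helper that only restates the documented rules.

```python
from typing import List, Optional, Union

Number = Union[int, float]


def list_mean(xs: Optional[List[Optional[Number]]]) -> Optional[float]:
    """Hypothetical reference model of the documented `mean` rules."""
    if xs is None or len(xs) == 0:
        # No list, or no elements to average (also avoids dividing by zero).
        return None
    if any(x is None for x in xs):
        # One unknown element makes the mean unknown.
        return None
    # True division always produces a float, even for int inputs.
    return sum(xs) / len(xs)
```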

Min

Function to get the min value of a list.

Returns

Expr

Returns an expression object denoting the min value of a list.

Only works when the list is of type int/float (or their optional versions). For an empty list, returns an expression denoting 'None'. If the list has one or more None elements, the result becomes None.

from typing import List, Optional

import pandas as pd
import pytest
from fennel.expr import col

expr = col("x").list.min()

# works for lists of int/float or their optional versions;
# the result is always optional because an empty list has no min
assert expr.typeof(schema={"x": List[int]}) == Optional[int]
assert expr.typeof(schema={"x": Optional[List[float]]}) == Optional[float]

with pytest.raises(Exception):
    expr.typeof(schema={"x": List[str]})

# can be evaluated as well
df = pd.DataFrame({"x": [[1, 2, 3], [4, 5, None], [], None]})
schema = {"x": Optional[List[Optional[int]]]}
assert expr.eval(df, schema=schema).tolist() == [1, pd.NA, pd.NA, pd.NA]

Taking the minimum value of a list

Errors

Use of invalid types:

The list namespace must be invoked on an expression that evaluates to list or optional of list. Min can only be invoked on lists of ints/floats (or optionals of ints/floats).

Sum

Function to get the sum of values of a list.

Returns

Expr

Returns an expression object denoting the sum of the values of the list.

Only works when the list is of type int/float (or their optional versions). For an empty list, returns an expression denoting '0'. If the list has one or more None elements, the whole sum becomes None.

from typing import List, Optional

import pandas as pd
import pytest
from fennel.expr import col

expr = col("x").list.sum()

# works for lists of int/float or their optional versions
assert expr.typeof(schema={"x": List[int]}) == int
assert expr.typeof(schema={"x": Optional[List[float]]}) == Optional[float]

with pytest.raises(Exception):
    expr.typeof(schema={"x": List[str]})

# can be evaluated as well
df = pd.DataFrame({"x": [[1, 2, 3], [4, 5, None], [], None]})
schema = {"x": Optional[List[Optional[int]]]}
assert expr.eval(df, schema=schema).tolist() == [6, pd.NA, 0, pd.NA]

Summing the values of a list

Errors

Use of invalid types:

The list namespace must be invoked on an expression that evaluates to list or optional of list. Sum can only be invoked on lists of ints/floats (or optionals of ints/floats).
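
Unlike max, min and mean, sum has a well-defined value for the empty list, which is why its result is not forced to be optional. A minimal plain-Python model, with list_sum as a hypothetical name that is not part of Fennel:

```python
from typing import List, Optional, Union

Number = Union[int, float]


def list_sum(xs: Optional[List[Optional[Number]]]) -> Optional[Number]:
    """Hypothetical reference model of the documented `sum` rules."""
    if xs is None:
        # A missing list gives a missing result.
        return None
    if any(x is None for x in xs):
        # One unknown element makes the whole sum unknown.
        return None
    # The sum of an empty list is 0.
    return sum(xs)
```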