All
Function to check if all the elements in a boolean list are True.
Returns
Returns an expression object denoting the result of the all operation.
Only works when the list is of type bool or Optional[bool]. For an empty list, returns an expression denoting True. If the list has one or more None elements, the result becomes None.
```python
from typing import List, Optional

import pandas as pd
import pytest
from fennel.expr import col

expr = col("x").list.all()

# works for lists of bool or Optional[bool]
assert expr.typeof(schema={"x": List[bool]}) == bool
assert expr.typeof(schema={"x": List[Optional[bool]]}) == Optional[bool]
assert (
    expr.typeof(schema={"x": Optional[List[Optional[bool]]]})
    == Optional[bool]
)

# lists of any other element type are rejected
with pytest.raises(Exception):
    expr.typeof(schema={"x": List[str]})

# can be evaluated as well
df = pd.DataFrame(
    {"x": [[True, True], [True, False], [], None, [True, None]]}
)
schema = {"x": Optional[List[Optional[bool]]]}
assert expr.eval(df, schema=schema).tolist() == [
    True,
    False,
    True,
    pd.NA,
    pd.NA,
]
```
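The rules stated for all can be restated as a small plain-Python model of what happens on a single row. This is only a sketch of the documented behavior, not Fennel's implementation; the helper name list_all is hypothetical.

```python
from typing import List, Optional


def list_all(values: Optional[List[Optional[bool]]]) -> Optional[bool]:
    """Hypothetical model of the documented `all` semantics for one row."""
    if values is None:
        return None  # a null list gives a null result
    if any(v is None for v in values):
        return None  # one or more None elements make the result None
    return all(values)  # an empty list gives True


rows = [[True, True], [True, False], [], None, [True, None]]
results = [list_all(r) for r in rows]
print(results)
```

The rows are the same ones used in the evaluation example, with None standing in for pd.NA.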
Errors
The list namespace must be invoked on an expression that evaluates to list or optional of list. All can only be invoked on lists of bools (or optionals of bool).
Any
Function to check if a boolean list contains any True value.
Returns
Returns an expression object denoting the result of the any operation.
Only works when the list is of type bool (or optional bool). For an empty list, returns an expression denoting False. If the list has one or more None elements, the result becomes None unless it also has True, in which case the result is still True.
```python
from typing import List, Optional

import pandas as pd
import pytest
from fennel.expr import col

expr = col("x").list.any()

# works for lists of bool or Optional[bool]
assert expr.typeof(schema={"x": List[bool]}) == bool
assert expr.typeof(schema={"x": List[Optional[bool]]}) == Optional[bool]
assert (
    expr.typeof(schema={"x": Optional[List[Optional[bool]]]})
    == Optional[bool]
)

# lists of any other element type are rejected
with pytest.raises(Exception):
    expr.typeof(schema={"x": List[str]})

# can be evaluated as well
df = pd.DataFrame(
    {"x": [[True, True], [True, False], [], None, [True, None]]}
)
schema = {"x": Optional[List[Optional[bool]]]}
assert expr.eval(df, schema=schema).tolist() == [
    True,
    True,
    False,
    pd.NA,
    True,
]
```
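The interplay between True and None is the subtle part of any: a single True settles the answer, while a None only matters when no True is present. The following plain-Python sketch models that ordering for one row. It is an illustration of the documented rules, not Fennel's code, and list_any is a hypothetical helper name.

```python
from typing import List, Optional


def list_any(values: Optional[List[Optional[bool]]]) -> Optional[bool]:
    """Hypothetical model of the documented `any` semantics for one row."""
    if values is None:
        return None  # a null list gives a null result
    if any(v is True for v in values):
        return True  # a True element wins even if None is also present
    if any(v is None for v in values):
        return None  # no True, but an unknown element: result is unknown
    return False  # covers the empty list and all-False lists


rows = [[True, True], [True, False], [], None, [True, None], [False, None]]
results = [list_any(r) for r in rows]
print(results)
```

The last row, [False, None], is not part of the example above; it is added here to show the case where None decides the outcome.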
Errors
The list namespace must be invoked on an expression that evaluates to list or optional of list. Any can only be invoked on lists of bool (or optionals of bool).
At
Function to get the value of the element at a given index of the list.
Parameters
index: The index at which the list's value needs to be evaluated. This expression is expected to evaluate to an int. Fennel supports indexing by negative integers as well.
```python
from typing import List, Optional

import pandas as pd
import pytest
from fennel.expr import col

expr = col("x").list.at(col("y"))

# at works only on list types; index can be int or Optional[int]
assert expr.typeof(schema={"x": List[int], "y": int}) == Optional[int]
assert expr.typeof(schema={"x": List[str], "y": int}) == Optional[str]

# a float index is rejected
schema = {"x": Optional[List[float]], "y": float}
with pytest.raises(Exception):
    expr.typeof(schema=schema)

# can be evaluated with a dataframe
df = pd.DataFrame(
    {
        "x": [[1, 2, 3], [4, 5, None], [4, 5, None], None],
        "y": [1, 5, 0, 4],
    }
)
schema = {"x": Optional[List[Optional[int]]], "y": int}
assert expr.eval(df, schema=schema).tolist() == [2, pd.NA, 4, pd.NA]

# schema of column must be list of something
with pytest.raises(ValueError):
    expr.typeof(schema={"x": int})
```
```python
from typing import List, Optional

import pandas as pd
from fennel.expr import col

expr = col("x").list.at(col("y"))

# negative indices down to -len(list) are allowed and index from the end;
# beyond that, the result is None like other out-of-bounds indices
df = pd.DataFrame(
    {
        "x": [[1, 2, 3], [4, 5, None], [4, 5, None], None],
        "y": [-1, -5, -2, -4],
    }
)
schema = {"x": Optional[List[Optional[int]]], "y": int}
assert expr.eval(df, schema=schema).tolist() == [3, pd.NA, 5, pd.NA]
```
Returns
Returns an expression object denoting the value of the list at the given index. If the index is out of bounds of the list's length, None is returned. Consequently, for a list of elements of type T, at always returns Optional[T].
Fennel also supports negative indices: -1 maps to the last element of the list, -2 to the second last element of the list and so on. Negative indices smaller than -len return None like other out-of-bounds indices.
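The bounds rule described here differs from plain Python indexing, which raises IndexError instead of returning None. A minimal plain-Python sketch of the documented behavior for one row follows; list_at is a hypothetical helper used only for illustration and is not part of Fennel.

```python
from typing import List, Optional, TypeVar

T = TypeVar("T")


def list_at(values: Optional[List[T]], index: int) -> Optional[T]:
    """Hypothetical model of the documented `at` semantics for one row."""
    if values is None:
        return None  # a null list gives a null result
    n = len(values)
    if index < -n or index >= n:
        return None  # out of bounds in either direction
    return values[index]  # negative indices count from the end


lists = [[1, 2, 3], [4, 5, None], [4, 5, None], None]
forward = [list_at(x, y) for x, y in zip(lists, [1, 5, 0, 4])]
backward = [list_at(x, y) for x, y in zip(lists, [-1, -5, -2, -4])]
print(forward, backward)
```

Because an in-bounds element can itself be None and an out-of-bounds index also yields None, the result type has to be Optional[T] in every case.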
Errors
The list namespace must be invoked on an expression that evaluates to list or optional of list. Similarly, index must evaluate to an element of type int or Optional[int].
Contains
Function to check if the given list contains a given element.
Parameters
item: contains checks whether the base list contains this item.
```python
from typing import List, Optional

import pandas as pd
import pytest
from fennel.expr import col

expr = col("x").list.contains(col("y"))

# contains works only on list types
assert expr.typeof(schema={"x": List[int], "y": int}) == bool
assert (
    expr.typeof(schema={"x": Optional[List[float]], "y": float})
    == Optional[bool]
)

# however doesn't work if item is not of the same type as the list elements
with pytest.raises(ValueError):
    expr.typeof(schema={"x": List[int], "y": str})

# can be evaluated with a dataframe
df = pd.DataFrame(
    {
        "x": [[1, 2, 3], [4, 5, None], [4, 5, None], None, []],
        "y": [1, 5, 3, 4, None],
    }
)
schema = {"x": Optional[List[Optional[int]]], "y": Optional[int]}
assert expr.eval(df, schema=schema).tolist() == [
    True,
    True,
    pd.NA,
    pd.NA,
    False,
]

# schema of column must be list of something
with pytest.raises(ValueError):
    expr.typeof(schema={"x": int})
```
Returns
Returns an expression object denoting the result of the contains expression. The resulting expression is of type bool or Optional[bool] depending on either of input/item being nullable.
Note that Fennel expressions borrow semantics from SQL and treat None as an unknown value. As a result, the following rules apply to contains in presence of nulls:
- If the base list itself is None, the result is None regardless of the item.
- If the item is None, the result is None regardless of the list, unless the list is empty, in which case the answer is False (after all, if the list is empty, no matter the value of the item, it's not present in the list).
- If the item is not None and is present in the list, the answer is True.
- However, if the item is not None, is not present in the list, but the list has some None element, the result is still None (because the None values in the list may have been that element; we just can't say).
This is somewhat (but not exactly) similar to Spark's array_contains function.
If you are interested in checking if a list has any None elements, a better way of doing that is to use hasnull.
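The null-handling rules for contains are easier to follow when written as an ordered sequence of checks. The plain-Python sketch below models them for one row; it is an illustration under the rules stated above, not Fennel's implementation, and list_contains is a hypothetical name.

```python
from typing import List, Optional


def list_contains(
    values: Optional[List[Optional[int]]], item: Optional[int]
) -> Optional[bool]:
    """Hypothetical model of the documented `contains` null semantics."""
    if values is None:
        return None  # null list: unknown regardless of the item
    if len(values) == 0:
        return False  # nothing can be present in an empty list
    if item is None:
        return None  # unknown item against a non-empty list
    if item in values:
        return True  # a definite match
    if any(v is None for v in values):
        return None  # a None element might have been the item
    return False  # no match and no unknown elements


lists = [[1, 2, 3], [4, 5, None], [4, 5, None], None, []]
items = [1, 5, 3, 4, None]
results = [list_contains(x, y) for x, y in zip(lists, items)]
print(results)
```

The order of the checks matters: the empty-list test comes before the None-item test so that a None item against an empty list gives False rather than None.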
Errors
The list namespace must be invoked on an expression that evaluates to list or optional of list. Similarly, item must evaluate to an element of type T or Optional[T] if the list itself was of type List[T] (or Optional[List[T]]).
Filter
Function to filter a list down to elements satisfying a predicate.
Parameters
The variable name to which each element of the list is bound, one at a time.
The predicate expression to be used to filter the list down. This must evaluate to bool for each element of the list. Note that this expression can refer to the element under consideration via var(name) where name is the first argument given to the filter operation (see example for details).
Returns
Returns an expression object denoting the filtered list.
```python
from typing import List, Optional

import pandas as pd
import pytest
from fennel.expr import col, var

# "x" names the element variable; var("x") refers to it in the predicate
expr = col("x").list.filter("x", var("x") % 2 == 0)

# works as long as predicate is valid and evaluates to bool
assert expr.typeof(schema={"x": List[int]}) == List[int]
assert expr.typeof(schema={"x": List[float]}) == List[float]

# the modulo predicate is not valid for strings
with pytest.raises(Exception):
    expr.typeof(schema={"x": List[str]})

# can be evaluated as well
df = pd.DataFrame({"x": [[1, 2, 3], [], [1, 2, -2], None, [1, 3]]})
schema = {"x": Optional[List[int]]}
assert expr.eval(df, schema=schema).tolist() == [
    [2],
    [],
    [2, -2],
    pd.NA,
    [],
]
```
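For readers more used to Python comprehensions, the filter operation corresponds roughly to the sketch below, where the lambda parameter plays the role of the bound variable name. This is a plain-Python analogy rather than Fennel code, and list_filter is a hypothetical helper.

```python
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def list_filter(
    values: Optional[List[T]], predicate: Callable[[T], bool]
) -> Optional[List[T]]:
    """Hypothetical model of `filter` for one row; a null list stays null."""
    if values is None:
        return None
    # keep only the elements for which the predicate holds
    return [v for v in values if predicate(v)]


rows = [[1, 2, 3], [], [1, 2, -2], None, [1, 3]]
# the lambda parameter `x` is the analogue of var("x") in the expression
results = [list_filter(r, lambda x: x % 2 == 0) for r in rows]
print(results)
```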
Errors
The list namespace must be invoked on an expression that evaluates to list or optional of list.
Has Null
Function to check if the given list has any None values.
```python
from typing import List, Optional

import pandas as pd
import pytest
from fennel.expr import col

expr = col("x").list.hasnull()

# hasnull works for any list type or optional list type
assert expr.typeof(schema={"x": List[int]}) == bool
assert expr.typeof(schema={"x": Optional[List[float]]}) == Optional[bool]

# can be evaluated with a dataframe
df = pd.DataFrame({"x": [[1, 2, 3], [4, 5, None], [], None]})
schema = {"x": Optional[List[Optional[int]]]}
assert expr.eval(df, schema=schema).tolist() == [False, True, False, pd.NA]

# schema of column must be list of something
with pytest.raises(ValueError):
    expr.typeof(schema={"x": int})
```
Returns
Returns an expression object denoting the result of the hasnull function. The resulting expression is of type bool or Optional[bool] depending on the input being nullable.
Errors
The list namespace must be invoked on an expression that evaluates to list or optional of list.
Len
Function to get the length of a list.
```python
from typing import List, Optional

import pandas as pd
import pytest
from fennel.expr import col

expr = col("x").list.len()

# len works for any list type or optional list type
assert expr.typeof(schema={"x": List[int]}) == int
assert expr.typeof(schema={"x": Optional[List[float]]}) == Optional[int]

# can be evaluated with a dataframe
df = pd.DataFrame({"x": [[1, 2, 3], [4, 5], [], None]})
schema = {"x": Optional[List[int]]}
assert expr.eval(df, schema=schema).tolist() == [3, 2, 0, pd.NA]

# schema of column must be list of something
with pytest.raises(ValueError):
    expr.typeof(schema={"x": int})
```
Returns
Returns an expression object denoting the result of the len function. The resulting expression is of type int or Optional[int] depending on the input being nullable.
Errors
The list namespace must be invoked on an expression that evaluates to list or optional of list.
Map
Function to map each element of a list to get another list of the same size.
Parameters
The variable name to which each element of the list is bound, one at a time.
The expression to be used to transform each element of the list. Note that this expression can refer to the element under consideration via var(name) where name is the first argument given to the map operation (see example for details).
Returns
Returns an expression object denoting the transformed list.
```python
from typing import List, Optional

import pandas as pd
from fennel.expr import col, var

# "x" names the element variable; var("x") refers to it in the expression
expr = col("x").list.map("x", var("x") % 2)

# works as long as the mapping expression is valid
assert expr.typeof(schema={"x": List[int]}) == List[int]
assert expr.typeof(schema={"x": List[Optional[int]]}) == List[Optional[int]]

# can be evaluated as well
df = pd.DataFrame({"x": [[1, 2, 3], [], [1, 2, None], None, [1, 3]]})
schema = {"x": Optional[List[Optional[int]]]}
expected = [[1, 0, 1], [], [1, 0, pd.NA], pd.NA, [1, 1]]
assert expr.eval(df, schema=schema).tolist() == expected
```
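In plain-Python terms, map behaves roughly like a list comprehension that preserves the list's length and leaves a null list untouched. The sketch below is an analogy only; list_map is a hypothetical helper, and the None handling inside the lambda stands in for the way the example shows a None element staying null after the modulo.

```python
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def list_map(
    values: Optional[List[T]], transform: Callable[[T], U]
) -> Optional[List[U]]:
    """Hypothetical model of `map` for one row; a null list stays null."""
    if values is None:
        return None
    # one output element per input element, so the length is preserved
    return [transform(v) for v in values]


rows = [[1, 2, 3], [], [1, 2, None], None, [1, 3]]
# the lambda parameter `x` is the analogue of var("x"); None stays None
results = [list_map(r, lambda x: None if x is None else x % 2) for r in rows]
print(results)
```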
Errors
The list namespace must be invoked on an expression that evaluates to list or optional of list.
Max
Function to get the maximum value of a list.
Returns
Returns an expression object denoting the max value of a list.
Only works when the list is of type int/float (or their optional versions). For an empty list, returns an expression denoting None. If the list has one or more None elements, the result becomes None.
```python
from typing import List, Optional

import pandas as pd
import pytest
from fennel.expr import col

expr = col("x").list.max()

# works for lists of int/float or their optional versions
assert expr.typeof(schema={"x": List[int]}) == Optional[int]
assert expr.typeof(schema={"x": Optional[List[float]]}) == Optional[float]

# non-numeric lists are rejected
with pytest.raises(Exception):
    expr.typeof(schema={"x": List[str]})

# can be evaluated as well
df = pd.DataFrame({"x": [[1, 2, 3], [4, 5, None], [], None]})
schema = {"x": Optional[List[Optional[int]]]}
assert expr.eval(df, schema=schema).tolist() == [3, pd.NA, pd.NA, pd.NA]
```
Errors
The list namespace must be invoked on an expression that evaluates to list or optional of list. Max can only be invoked on lists of ints/floats (or optionals of ints/floats).
Mean
Function to get the mean of the values of a list.
Returns
Returns an expression object denoting the mean value of a list.
Only works when the list is of type int/float (or their optional versions). For an empty list, returns an expression denoting None. If the list has one or more None elements, the result becomes None.
The output type of this expression is either float or Optional[float] depending on the inputs.
```python
from typing import List, Optional

import pandas as pd
import pytest
from fennel.expr import col

expr = col("x").list.mean()

# works for lists of int/float or their optional versions
assert expr.typeof(schema={"x": List[int]}) == Optional[float]
assert expr.typeof(schema={"x": Optional[List[float]]}) == Optional[float]

# non-numeric lists are rejected
with pytest.raises(Exception):
    expr.typeof(schema={"x": List[str]})

# can be evaluated as well
df = pd.DataFrame({"x": [[1, 2, 3], [4, 5, None], [], None]})
schema = {"x": Optional[List[Optional[int]]]}
assert expr.eval(df, schema=schema).tolist() == [2.0, pd.NA, pd.NA, pd.NA]
```
Errors
The list namespace must be invoked on an expression that evaluates to list or optional of list. Mean can only be invoked on lists of ints/floats (or optionals of ints/floats).
Min
Function to get the min value of a list.
Returns
Returns an expression object denoting the min value of a list.
Only works when the list is of type int/float (or their optional versions). For an empty list, returns an expression denoting None. If the list has one or more None elements, the result becomes None.
```python
from typing import List, Optional

import pandas as pd
import pytest
from fennel.expr import col

expr = col("x").list.min()

# works for lists of int/float or their optional versions
assert expr.typeof(schema={"x": List[int]}) == Optional[int]
assert expr.typeof(schema={"x": Optional[List[float]]}) == Optional[float]

# non-numeric lists are rejected
with pytest.raises(Exception):
    expr.typeof(schema={"x": List[str]})

# can be evaluated as well
df = pd.DataFrame({"x": [[1, 2, 3], [4, 5, None], [], None]})
schema = {"x": Optional[List[Optional[int]]]}
assert expr.eval(df, schema=schema).tolist() == [1, pd.NA, pd.NA, pd.NA]
```
Errors
The list namespace must be invoked on an expression that evaluates to list or optional of list. Min can only be invoked on lists of ints/floats (or optionals of ints/floats).
Sum
Function to get the sum of values of a list.
Returns
Returns an expression object denoting the sum of the values of the list.
Only works when the list is of type int/float (or their optional versions). For an empty list, returns an expression denoting 0. If the list has one or more None elements, the whole sum becomes None.
```python
from typing import List, Optional

import pandas as pd
import pytest
from fennel.expr import col

expr = col("x").list.sum()

# works for lists of int/float or their optional versions
assert expr.typeof(schema={"x": List[int]}) == int
assert expr.typeof(schema={"x": Optional[List[float]]}) == Optional[float]

# non-numeric lists are rejected
with pytest.raises(Exception):
    expr.typeof(schema={"x": List[str]})

# can be evaluated as well
df = pd.DataFrame({"x": [[1, 2, 3], [4, 5, None], [], None]})
schema = {"x": Optional[List[Optional[int]]]}
assert expr.eval(df, schema=schema).tolist() == [6, pd.NA, 0, pd.NA]
```
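Sum, min, max and mean share the same null propagation but differ on empty lists: sum gives 0, while the other three give None. The plain-Python sketch below puts the four side by side for comparison. It models the documented behavior only; the helper names are hypothetical and none of this is Fennel's implementation.

```python
from typing import List, Optional, Union

Number = Union[int, float]
NumList = Optional[List[Optional[Number]]]


def _has_unknown(values: NumList) -> bool:
    """True if the list is null or holds at least one None element."""
    return values is None or any(v is None for v in values)


def list_sum(values: NumList) -> Optional[Number]:
    if _has_unknown(values):
        return None
    return sum(values)  # an empty list sums to 0


def list_min(values: NumList) -> Optional[Number]:
    if _has_unknown(values) or len(values) == 0:
        return None  # no minimum exists for an empty list
    return min(values)


def list_max(values: NumList) -> Optional[Number]:
    if _has_unknown(values) or len(values) == 0:
        return None  # no maximum exists for an empty list
    return max(values)


def list_mean(values: NumList) -> Optional[float]:
    if _has_unknown(values) or len(values) == 0:
        return None  # avoids dividing by zero on an empty list
    return sum(values) / len(values)  # always a float, even for ints


rows = [[1, 2, 3], [4, 5, None], [], None]
sums = [list_sum(r) for r in rows]
mins = [list_min(r) for r in rows]
maxs = [list_max(r) for r in rows]
means = [list_mean(r) for r in rows]
print(sums, mins, maxs, means)
```

This difference on empty lists is consistent with the typeof results in the examples: sum over List[int] is typed int, whereas min, max and mean over a non-optional list are still typed as optional.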
Errors
The list namespace must be invoked on an expression that evaluates to list or optional of list. Sum can only be invoked on lists of ints/floats (or optionals of ints/floats).