Functions
Use built-in data manipulation and analysis functions to operate on the underlying database that stores the chain data. These functions are typically used in operations such as DataChain.filter and DataChain.mutate.
Functions are organized by category and accessed through their respective modules. For example, string functions are accessed via func.string.length(), array functions via func.array.contains(), etc.
Global Function Access
Only a subset of functions is available directly from datachain.func (e.g., func.length). Access most functions through their specific module namespace (e.g., func.string.length) to avoid naming conflicts.
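To illustrate why namespaced access avoids naming conflicts, here is a minimal sketch in plain Python. The `string` and `array` objects below are hypothetical stand-ins built with `SimpleNamespace`, not the real datachain.func modules. The sketch only assumes that two modules could each expose a function called `length`.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for two function modules; these are NOT the real
# datachain.func implementations, only plain Python callables.
string = SimpleNamespace(length=lambda value: len(str(value)))
array = SimpleNamespace(length=lambda value: len(list(value)))

# Namespaced access keeps both functions usable side by side.
text_length = string.length("hello")
items_length = array.length(["a", "b"])

# In a flat namespace, only one function can own the name `length`;
# the later binding silently replaces the earlier one.
flat = {}
flat["length"] = string.length
flat["length"] = array.length
flat_owner = "array" if flat["length"] is array.length else "string"
```

With a flat namespace, whichever `length` is bound last wins, which is why the module-qualified form is the safer default.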
Function Categories
DataChain provides several categories of functions for different types of operations:
- Aggregate Functions - Functions for aggregating data like sum, count, avg, etc.
- Array Functions - Functions for working with arrays and lists
- Conditional Functions - Functions for conditional logic like ifelse, case, etc.
- Numeric Functions - Functions for numeric operations and computations
- Path Functions - Functions for working with file paths
- Random Functions - Functions for generating random values
- String Functions - Functions for string manipulation and processing
- Window Functions - Functions for window operations
 
Usage
from datachain.func import aggregate, array, conditional, numeric, path, random, string, window

# `dc` is assumed to be an existing DataChain instance with these columns.
# Access functions through their module namespaces
dc.mutate(
    text_length=string.length("text_column"),
    contains_item=array.contains("array_column", "value"),
    file_extension=path.file_ext("file_path"),
)

# Some commonly used functions are also available directly.
# Note: importing sum this way shadows Python's built-in sum in this module.
from datachain.func import sum, count, length, ifelse
dc.mutate(total=sum("amount"))