Fenic 0.4.0 Released: Declarative Tools, MCP, and HuggingFace — plus major DX & reliability gains
TL;DR: Upgrade now to unlock declarative tool creation for function calling, a production‑ready MCP server, GPT‑5 & Claude Opus 4.1 support, a new HuggingFace connector, directory loaders, local metrics tables, richer catalog metadata, clearer errors, and big performance & stability improvements.
```bash
pip install --upgrade fenic
```
What’s in it for you
- Declarative tools for agents: Define function‑calling tools as data — less boilerplate, safer types, faster iteration.
- Assistant integrations out‑of‑the‑box: Full MCP server so Claude Code, Gemini CLI, Cursor & friends can use your Fenic tools directly.
- Latest models, simpler ops: GPT‑5 & Claude Opus 4.1 support, plus provider key validation to fail fast.
- More data, fewer hops: Read HuggingFace datasets via `hf://…` URIs and load entire directories into DataFrames.
- See cost & performance: Built‑in local metrics table to analyze latency and spend per pipeline.
- Catalog you can trust: Descriptions on views/tables, thread‑safe local catalog, logical types in cloud catalog.
- Cleaner DX: Crisper errors (e.g., `union()` schema mismatches), handy `null()`/`empty()` helpers, clearer S3 auth behavior.
- Faster & sturdier: Thread‑safe concurrency, Rust regex validation, smarter retries and resource cleanup.
- Async UDFs for concurrent I/O: Run API/DB/MCP calls in parallel with ordering, retries, and timeouts—without leaving DataFrame semantics.
Declarative tool creation (catalog-backed) (⭐️)
Build LLM tools by declaring what the tool does and its parameters, then register it in the Fenic catalog. Catalog tools are type-safe, discoverable, and automatically consumable by MCP servers and fenic-serve.
Why it matters
- Drop up to 70% of agent boilerplate.
- Strongly-typed params via ToolParam with automatic validation.
- Tools are versionable metadata—easy to diff, review, and reuse.
- One definition, many runtimes (programmatic servers, ASGI, CLI).
```python
from fenic.api.session import Session
from fenic.core.mcp.types import ToolParam

session = Session.get_or_create()

# Your DataFrame query that implements the tool
df = session.create_dataframe(...)  # e.g., search, transform, summarize

# Register the tool in the catalog
session.catalog.create_tool(
    tool_name="my_tool",
    tool_description="A tool that searches documents",
    tool_query=df,  # The DataFrame query to execute
    tool_params=[
        ToolParam(
            name="search_term",
            description="The term to search for",
            allowed_values=None,  # Or e.g. ["bug", "feature", "note"]
            default_value="default",  # Optional
        ),
        ToolParam(
            name="limit",
            description="Max results to return",
            default_value=10,
        ),
    ],
    result_limit=50,  # Max rows to return
)
```
Pair this with Fenic’s semantic operators and you can roll out production‑grade agent tools in minutes.
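To make that pairing concrete, here is a hypothetical sketch of a tool_query that filters by a declared parameter and then applies a semantic operator. It assumes a `fc.tool_param(name, type)` expression is how declared parameters bind into the query plan (check the declarative-tools docs for the exact binding API), and the table and column names are made up:

```python
import fenic as fc
from fenic.api.session import Session
from fenic.core.mcp.types import ToolParam
from fenic.core.types import StringType

session = Session.get_or_create()

# Hypothetical: filter a docs table by a declared "category" parameter,
# then summarize matching rows with a semantic operator.
docs = session.table("docs")  # assumed existing table
tool_query = docs.filter(
    fc.col("category") == fc.tool_param("category", StringType)  # assumed binding API
).semantic.map("body", prompt="Summarize this document in two sentences.", model="gpt-5")

session.catalog.create_tool(
    tool_name="summarize_category",
    tool_description="Summarize documents in a given category",
    tool_query=tool_query,
    tool_params=[ToolParam(name="category", description="Category to summarize")],
)
```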
MCP servers: run Fenic tools anywhere
Fenic ships a complete Model Context Protocol (MCP) server with multiple ways to run it. Pick the style that fits your deployment and integrate assistants without leaving your data plane.
Programmatic — synchronous
```python
from fenic.api.mcp import create_mcp_server, run_mcp_server_sync
from fenic.api.session import Session

session = Session.get_or_create()
tools = session.catalog.list_tools()

server = create_mcp_server(
    session,
    "MyServer",
    tools=tools,  # list of ParameterizedToolDefinition
    concurrency_limit=8,
)

run_mcp_server_sync(
    server,
    transport="http",  # or "stdio"
    stateless_http=True,
    port=8000,
    host="127.0.0.1",
    path="/mcp",
)
```
Programmatic — asynchronous
```python
import asyncio

from fenic.api.mcp import create_mcp_server, run_mcp_server_async
from fenic.api.session import Session

session = Session.get_or_create()

async def main():
    tools = session.catalog.list_tools()
    server = create_mcp_server(session, "MyServer", tools=tools)
    await run_mcp_server_async(
        server,
        transport="http",
        stateless_http=True,
        port=8000,
        host="127.0.0.1",
    )

asyncio.run(main())
```
ASGI application (production-ready)
```python
from fenic.api.mcp import create_mcp_server, run_mcp_server_asgi
from fenic.api.session import Session

session = Session.get_or_create()
tools = session.catalog.list_tools()
server = create_mcp_server(session, "MyServer", tools=tools)

app = run_mcp_server_asgi(
    server,
    stateless_http=True,
    port=8000,
    host="127.0.0.1",
    path="/mcp",
)

# Launch with any ASGI server, e.g.:
# uvicorn myapp:app
```
CLI: fenic-serve
```bash
# Run with all catalog tools
fenic-serve

# Run with specific catalog tools
fenic-serve --tools sales_by_product sales_by_customer

# HTTP transport (default)
fenic-serve --transport http --port 8000 --host 127.0.0.1

# stdio transport (for direct tool integration)
fenic-serve --transport stdio

# Custom config + selected tools
fenic-serve --config-file ./session.config.json --tools my_tool

# Stateful HTTP sessions
fenic-serve --stateful-http
```
Direct server methods
```python
server = create_mcp_server(session, "MyServer", tools=[...])

# Synchronous
server.run(transport="http", stateless_http=True)

# Asynchronous (call from within an async function)
await server.run_async(transport="stdio")

# Get ASGI app directly
app = server.http_app(stateless_http=True, port=8000)
```
Transport options
- http: default; ideal for web services and APIs
- stdio: direct tool integration (e.g., Claude Desktop)
Key parameters
- stateless_http: when True, no session state is kept across requests; set False for stateful sessions
- concurrency_limit: max concurrent tool executions (default 8)
- transport: "http" or "stdio"
- port / host / path: HTTP server configuration
Async UDFs: concurrent I/O inside your DataFrames (⭐️)
Run network and tool-bound work in parallel without leaving DataFrame semantics. Async UDFs let you fan out API calls and database lookups across rows concurrently, while preserving type safety, input order, and predictable resource usage.
Why it matters
- Throughput for I/O workloads: Maximize parallelism on slow endpoints with bounded concurrency, retries, and timeouts.
- Production-safe by design: Ordered results, cooperative cancellation, and memory-aware buffering prevent tail-latency blowups.
- Fenic-native ergonomics: Keep transformations declarative—no bespoke asyncio plumbing required.
Key features
- Configurable concurrency — max_concurrency caps in-flight tasks
- Automatic retries — exponential backoff for transient failures
- Timeouts — per-item timeout_seconds to avoid hangs
- Ordered results — output matches input row order
- Resource management — bounded buffers; cancels pending work on error
- Type safety — declared return_type with clear, actionable errors
Usage example
```python
import aiohttp

import fenic as fc
from fenic.core.types import IntegerType

@fc.async_udf(
    return_type=IntegerType,
    max_concurrency=10,
    timeout_seconds=5,
    num_retries=2,
)
async def fetch_score(user_id: int) -> int:
    async with aiohttp.ClientSession() as session:
        async with session.get(f"https://api.example.com/score/{user_id}") as resp:
            data = await resp.json()
            return data["score"]

# Apply to a DataFrame
df = df.select(
    fc.col("user_id"),
    fetch_score(fc.col("user_id")).alias("score"),
)
```
Great for
- Parallel API calls inside DataFrame transforms
- Low-latency lookups against services/DBs
Under the hood, Fenic uses a unified event loop, smart buffer management, and cooperative cancellation. Individual failures return None (instead of failing the entire batch), keeping pipelines resilient.
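Because failed rows surface as None rather than aborting the batch, you can handle them explicitly downstream. A minimal sketch, assuming null-check predicates like `is_not_null()`/`is_null()` exist on column expressions (consistent with the snake_case DataFrame API, but verify against the docs):

```python
# Rows whose fetch_score call failed (after retries/timeouts) come back as None.
# Split them out for auditing instead of failing the whole pipeline.
scored = df.select(
    fc.col("user_id"),
    fetch_score(fc.col("user_id")).alias("score"),
)

ok = scored.filter(fc.col("score").is_not_null())  # assumed predicate
failed = scored.filter(fc.col("score").is_null())  # assumed predicate
failed.show()  # inspect failures, then retry or drop them
```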
Enhanced AI model support
GPT‑5 integration
- Specialized parameters including verbosity control and minimal reasoning modes for cost‑efficient runs.
- Smoother high‑throughput batch operations.
Claude Opus 4.1
- Access Anthropic’s latest capabilities with easy provider switching.
Provider key validation
- Validate API keys at session init with clear, fail‑fast errors.
- Eliminate surprise runtime failures from missing/misconfigured credentials.
```python
from fenic.api.session import Session

# Provider keys ARE validated during session creation.
# If any configured model has invalid keys, this will fail immediately.
session = Session.get_or_create()  # ← Validation happens HERE

# By the time you get here, all keys have already been validated
df.semantic.map("content", prompt="...", model="gpt-5")
```
Data processing enhancements
HuggingFace connector
Read datasets directly from the ML ecosystem using a simple URI scheme.
```python
# HuggingFace connector: use the hf:// scheme
df = session.read.csv("hf://datasets/squad/default/train.csv")

# or for Parquet files:
df = session.read.parquet("hf://datasets/cais/mmlu/astronomy/*.parquet")

df = df.semantic.extract("context", schema=QuestionAnswer, model="gpt-5")
```
Directory content loading
Turn folders into DataFrames for batch processing — recursively, with file metadata extracted automatically.
```python
from fenic.core.types import MarkdownType

df = session.read.docs(
    "/data/logs/",
    data_type=MarkdownType,
    recursive=True,
)
```
Local metrics table
Track inference metrics (latency, cost, model, tokens) locally for inspection and optimization.
```python
metrics = session.table("fenic_system.query_metrics")
metrics.select("model", "latency_ms", "cost_usd").order_by("latency_ms").show()
```
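To roll spend and latency up per model, the same DataFrame API applies. A sketch, assuming PySpark-style `group_by`/`agg` with `sum`/`avg` aggregate helpers under `fc` (the exact aggregate API may differ; see the docs):

```python
import fenic as fc

# Total spend and average latency per model (assumed aggregate helpers).
summary = (
    session.table("fenic_system.query_metrics")
    .group_by("model")
    .agg(
        fc.sum("cost_usd").alias("total_cost_usd"),
        fc.avg("latency_ms").alias("avg_latency_ms"),
    )
)
summary.order_by("total_cost_usd").show()
```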
Catalog improvements
- Add descriptions to views/tables for self‑documenting pipelines.
- Thread‑safe local catalog for safer concurrent work.
- Logical types in the cloud catalog and richer metadata.
```python
# Catalog improvements
session.catalog.create_table(
    "my_table",
    schema,
    description="My table description",
)
```
Developer experience (DX)
Better errors (e.g., `union()`)
More actionable messages on schema mismatches with suggested fixes and clearer stack traces.
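For example, a mismatched union now fails with a message that names the differing columns. A sketch with hypothetical data, assuming `create_dataframe` accepts a list of dicts:

```python
# Two frames with mismatched schemas: "amount" vs. "total".
df1 = session.create_dataframe([{"id": 1, "amount": 9.99}])
df2 = session.create_dataframe([{"id": 2, "total": 4.50}])

# Raises a schema-mismatch error that points at the offending columns
# (with a suggested fix), instead of an opaque internal failure.
df1.union(df2)
```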
Utility functions
- `null(data_type)` - creates null values
- `empty(data_type)` - creates empty values (empty arrays/structs, or null for primitives)
```python
# Utility functions
from fenic.api.functions import empty, null
from fenic.core.types import ArrayType, IntegerType

# Create a null-valued column
df = df.with_column("null_col", null(IntegerType))

# Create an empty array column
df = df.with_column("empty_array", empty(ArrayType(IntegerType)))
```
S3 integration
Friendlier messages when credentials are missing, automatic env detection, and support for multiple auth methods.
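In practice, a plain S3 read picks up credentials from the environment (env vars, shared profile, or instance role), and a missing-credentials failure now says which auth method to configure. A sketch with a hypothetical bucket and path:

```python
# Credentials are auto-detected from the environment; no extra config needed
# when AWS_PROFILE or AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY are set.
df = session.read.parquet("s3://my-bucket/events/2024/*.parquet")  # hypothetical path
df.show()
```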
Performance & reliability
- Thread‑safe operations.
- Rust regex validation for `rlike`/`ilike`/`like`.
- Async UDF flow reduces tool‑call latency.
- OpenAI retry logic handles intermittent 404s during batch processing.
- Clean shutdowns prevent memory leaks (proper event‑loop and task cancellation).
Upgrading from v0.3.x
No breaking changes; simply run:
```bash
pip install --upgrade fenic
```
Documentation & examples
- Enriched MCP server example README
- Fixed group‑by docstrings & examples
- Updated example notebooks; Colab‑friendly main README
- Refreshed clustering API docs
Try it out & tell us what you build
- Upgrade: `pip install --upgrade fenic`
- Explore: read the latest docs at docs.fenic.ai
- Engage: ⭐ the repo, open issues, or hop into Discord
Links
- GitHub: https://github.com/typedef-ai/fenic
- Release Notes: https://github.com/typedef-ai/fenic/releases/tag/v0.4.0
- Docs: https://docs.fenic.ai
- Example Notebooks: https://github.com/typedef-ai/fenic/tree/main/examples
Thank you for being part of the Fenic community — your feedback drives every release.
— The Fenic Team