Francisco Utrera


Founding AI Engineer at Embrace.ai, where I own the entire AI stack — from LLM orchestration and RAG to agentic tool execution and document processing.

I've built, from scratch, a production multi-model inference pipeline (OpenAI, Claude, Gemini), a provider-agnostic LLM SDK with ReAct agents, a RAG system with vector search and neural reranking, an agentic tool execution framework with curated API endpoint definitions and automatic credential handling, and a multi-modal document extraction pipeline. Across 68 microservices, I'm the sole or primary author of the AI core, knowledge intelligence, and document processing layers.

Before Embrace, I worked on adversarial robustness and transfer learning research at UC Berkeley, publishing work on how adversarially trained networks transfer better to downstream tasks.

What I Build

LLM Orchestration — Multi-model routing across OpenAI, Claude, and Gemini with provider-agnostic abstractions, type-safe tool definitions via Zod, and ReAct agent loops with automatic tool execution
RAG Architecture — Embedding pipelines, Pinecone vector search with overfetch, Cohere neural reranking, metadata hydration, and contextual document enrichment via Claude and Gemini
Agentic Tool Execution — Curated API endpoint definitions per platform (Zendesk, Jira, Confluence, Salesforce) with automatic credential handling. Code sandbox execution via ECS Fargate. Citation tracking back to source documents.
Document Intelligence — Multi-format extraction (PDF, DOCX, images, audio/video), intelligent strategy selection, LLM-powered enrichment with multi-modal element processing, and autonomous article generation from knowledge gaps
Production Infrastructure — AWS Lambda + ECS Fargate, SQS/SNS event-driven architecture, GraphQL federation across 68 microservices, Inngest durable workflows, Langfuse observability with per-model cost attribution

LinkedIn fjutrera
Medium @fjulozada
GitHub @utrerf
Email fjulozada@gmail.com