Austin Lally

Hi, I'm Austin.

I'm an AI Engineer.

I'm open to new opportunities.

Current Interests

Longer-Horizon Agents

Memory, multi-step planning, and self-evaluation

Better Retrieval

Cross-modal grounding, ranking from sparse user signals, and agentic review

Generative Consistency

Evaluation, automatic quality gating, and self-improving pipelines

AI Observability

Metrics, alerts, and visualizations for nondeterministic systems

Expertise

LLM & Agentic Systems

  • Multi-agent orchestration with role-specialized agents
  • Prompt assembly, context management, streaming responses
  • Layered guardrails: input classification, output inspection, top-level safety prompts

Search & Retrieval

  • Semantic and hybrid search over millions of records
  • Embedding pipelines and multimodal ingestion
  • Engagement-based ranking models and reranking

Generative AI

  • Identity-conditioned image generation across hosted model providers
  • SFace-based similarity evaluation and runtime quality gating
  • LLM-driven content pipelines with structured outputs

Production AI Engineering

  • Real-time orchestration over WebSockets with low-latency streaming
  • Inference optimization via ONNX for content-safety classifiers
  • Terraform-deployed services on AWS and Azure

Technology

AI/ML

  • PyTorch
  • HuggingFace Transformers
  • ONNX
  • BERT/RoBERTa
  • CLIP
  • BGE

LLM & Generative

  • Multi-agent orchestration
  • AutoGen
  • RAG
  • LLM guardrails
  • Diffusion models
  • ComfyUI

Search & Retrieval

  • Semantic / hybrid search
  • Embedding pipelines
  • Azure AI Search
  • FAISS
  • Reranking models

Backend & Infra

  • Python
  • FastAPI
  • TypeScript
  • WebSockets
  • Docker
  • Azure
  • AWS
  • Terraform
  • Databricks

Experience

AI Engineer

Cortina Productions

2024 - present

Founder

WxH Inc.

2021 - 2024

Director of Engineering

Concentric Sky

2011 - 2019

Education

M.S. Computer Science

Machine Learning / AI focus

Oregon State University

2022

B.S. Computer Science

Math minor

University of Oregon

2011

Projects

Theodore Roosevelt Presidential Library

  • FastAPI • WebSockets • Azure

Production multi-agent conversational AI for a permanent museum installation. A primary TR-voice agent answers visitors in-character, supported by classification, context tracking, and note-taking agents.

  • Engineered real-time LLM orchestration over WebSockets handling kiosk connections, prompt assembly, context management, and low-latency streaming responses via Azure Speech
  • Implemented layered guardrails: upstream input classification, downstream output inspection, and a fixed top-level prompt layer enforcing safety constraints across CMS-driven content
  • Designed prompt assembly to keep responses on-character within strict latency budgets
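As a rough illustration of the layered guardrails described above, here is a minimal Python sketch. It is not the production code: the class name, the callables, the fallback line, and the wording of the safety prompt are all hypothetical stand-ins, and the real classifier, LLM call, and output inspector are simply passed in as functions.

```python
from dataclasses import dataclass
from typing import Callable

# Fixed top-level safety layer. It always comes first and is not editable
# through the CMS. The wording here is a hypothetical placeholder.
SAFETY_PROMPT = "Stay in character. Decline unsafe or off-topic requests politely."
# Hypothetical canned reply used whenever a layer rejects the exchange.
FALLBACK = "I'd rather talk about something else. What else would you like to know?"


@dataclass
class GuardedResponder:
    classify_input: Callable[[str], bool]   # True if the visitor input is acceptable
    generate: Callable[[str, str], str]     # (system_prompt, user_text) -> reply
    inspect_output: Callable[[str], bool]   # True if the reply is safe to deliver

    def respond(self, cms_prompt: str, user_text: str) -> str:
        # Layer 1: upstream input classification.
        if not self.classify_input(user_text):
            return FALLBACK
        # Layer 2: the fixed safety prompt is placed ahead of CMS-driven content.
        system_prompt = f"{SAFETY_PROMPT}\n\n{cms_prompt}"
        reply = self.generate(system_prompt, user_text)
        # Layer 3: downstream output inspection.
        if not self.inspect_output(reply):
            return FALLBACK
        return reply
```

Keeping each layer behind a plain callable means a classifier or inspector can be swapped or tested in isolation without touching the orchestration path.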

The National Archives

  • FastAPI • Azure AI Search • CLIP/BGE

Semantic retrieval and ranking system powering 30 interactive gallery stations in The American Story exhibit, surfacing relevant matches from a 2M-record curated set drawn from a 15M-record corpus.

  • Engineered multimodal ingestion pipeline with LLM-based enrichment and CLIP/BGE embeddings to build the retrieval indexes
  • Trained an engagement-based ranking model to filter the corpus and provide query-time signals
  • Built offline filtering and a query-time ranking system using user-labeled feedback collected through a purpose-built evaluation interface
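The split between offline filtering and query-time ranking can be sketched in a few lines of Python. This is an assumption-heavy illustration, not the deployed system: the `Record` fields, the engagement cutoff, and the linear blend weight are hypothetical, and the similarity scores are assumed to come from an embedding search that is not shown.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Record:
    record_id: str
    engagement: float  # predicted engagement in [0, 1] (hypothetical field)


def filter_corpus(records: List[Record], min_engagement: float = 0.5) -> List[Record]:
    """Offline step: keep only records the ranking model scores above a cutoff."""
    return [r for r in records if r.engagement >= min_engagement]


def rank(candidates: List[Record], similarity: Dict[str, float],
         weight: float = 0.7, top_k: int = 10) -> List[Record]:
    """Query-time step: blend semantic similarity with the engagement signal.

    `similarity` maps record_id to a query similarity in [0, 1]. The linear
    blend and its weight are illustrative, not tuned values.
    """
    scored = [
        (weight * similarity.get(r.record_id, 0.0) + (1 - weight) * r.engagement, r)
        for r in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in scored[:top_k]]
```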

Identity-Conditioned Photo Platform

  • Gemini • GPT • AWS • Terraform

Identity-conditioned image generation that places visitors into themed scenes — powering permanent installations at the Theodore Roosevelt Presidential Library and another museum, plus a conference-expo kiosk used to sell the capability.

  • Built the expo kiosk as an async end-to-end product: mobile QR-loaded selfie form, generation backend, kiosk display, and an admin panel for switching between theme modes (TRPL, spy disguises, dinosaurs, and more)
  • Added an LLM-driven theme-creation agent — visitors chat to define a new theme, the agent emits a structured prompt config, and admins review before publishing it to the kiosk
  • Integrated Gemini and GPT Image behind a unified API with SFace-based identity-similarity evaluation for runtime quality gating, deployed via Terraform on AWS (Elastic Beanstalk, S3, CloudFront)
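Runtime quality gating of this kind can be sketched as a retry loop around a similarity check. The Python below is a minimal illustration under stated assumptions: `generate` and `embed` are hypothetical callables standing in for the image provider and the face-embedding model, and the threshold and attempt count are placeholder values rather than the ones used in production.

```python
import math
from typing import Callable, Optional, Sequence, TypeVar

Image = TypeVar("Image")


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two embedding vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def generate_with_gate(
    reference: Sequence[float],
    generate: Callable[[int], Image],           # attempt index -> generated image
    embed: Callable[[Image], Sequence[float]],  # image -> face embedding
    threshold: float = 0.4,                     # placeholder, not a tuned value
    max_attempts: int = 3,
) -> Optional[Image]:
    """Return the first image whose face embedding is close enough to the
    reference selfie embedding, or None if every attempt fails the gate."""
    for attempt in range(max_attempts):
        image = generate(attempt)
        if cosine_similarity(reference, embed(image)) >= threshold:
            return image
    return None
```

Returning `None` rather than the best failed attempt leaves the caller free to decide whether to show a fallback or ask the visitor for a new selfie.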

Architecture Visualizations

  • Python • ComfyUI • Gemini

Image-generation service for an architecture museum that produces concept views of buildings using a two-tier hybrid pipeline.

  • Local diffusion models for fast, low-cost previews
  • Gemini for the final polished hero output

Editorial Sound Bite Surfacing

  • Whisper • GPT batch • Claude

LLM-driven workflow that surfaces candidate sound bites from interview transcripts for editorial review — decomposed so the producing team could own the final matching step themselves in Claude.

  • Whisper transcribes interviews; a batch GPT pass tags each utterance with structured metadata
  • Final brief-to-utterance matching handed off to producers as a Claude prompt — putting frontier AI directly into the editorial workflow

Content Safety Classifier

  • .NET 8 • ONNX • BERT/RoBERTa

BERT/RoBERTa-based profanity classifiers trained as a content-safety guardrail layer, optimized for low-latency inference via ONNX in a .NET 8 runtime.

  • Fine-tuned transformer classifiers for profanity detection
  • Exported the models to ONNX and optimized them for ONNX Runtime inference in .NET 8 services

Width by Height

  • Flutter • Swift • Firebase

iPad-first app for designing artwork arrangements. Import your artwork or browse The Met, photograph your space, drag and drop, and preview arrangements in your room with AR.

  • Built the Flutter app with Firebase integration
  • Implemented ML models from current research to predict room geometry from RGB images
  • Used ARKit to detect, perspective-correct, and predict size of scanned artwork

Pre-AI focus · academic & client work

Earlier Work

Room Type Classifier

Graduate coursework · MobileNet · Flutter

Fine-tuned a MobileNet-based classifier on indoor scene datasets and deployed it into a Flutter app for on-device inference.

Intelligent Agents & Decision Making

Graduate coursework · Reinforcement Learning

Trained an agent to navigate a simulated frozen lake using two reinforcement-learning approaches and compared the results.

Data Augmentation for Video CV

Graduate coursework · Literature Review

Surveyed the literature on data augmentation for video-based computer vision tasks to identify promising areas for future research.

PORTL

Concentric Sky · Angular · Play · MongoDB

Search platform for local cultural events. Aggregation pipeline merged event data from upstream APIs; admin frontend supported manual create, edit, import, and export.

Digital Promise

Concentric Sky · Angular · Django · MySQL

Microcredential authoring and assessment platform. Admins authored assessments and criteria, recipients completed assessments, and reviewers moderated awards.

MasteryTrack

Concentric Sky · React · Django

Tracking and assessment system for mastery-based learning. Teachers track student progress against learning objectives and surface targeted support.

A little bit about me

I grew up in Eugene, Oregon. After high school I spent six years in the Air Force as an Airborne Cryptologic Linguist in Korean. Software had always pulled at me, so when I separated I went to the University of Oregon for Computer Science.

I spent the next eight years at Concentric Sky, an Oregon software studio, working across web, mobile, and backend on a steady stream of new client projects. I grew from engineer to Director of Engineering, taking on architecture, team leadership, and long-term technical direction.

In 2019, my wife and I decided to travel the world. We bought a one-way ticket to the Azores. We saw a wide swath of western Europe (and Morocco!) before COVID made us turn our attention back homeward.

That detour landed me back in school. I did a master's at Oregon State focused on machine learning, and along the way my wife and I founded WxH — a computer-vision iPad app for visualizing artwork in real spaces. We built the product, secured non-dilutive funding, and learned an enormous amount about everything outside of the code.

Today I'm based in the Washington, DC area, building AI for permanent museum installations. The work sits at a sweet spot for me: the engineering constraints are real, and the results are public-facing in a way I find rewarding.