I build production-grade AI systems — reliable, observable, and built to scale. Four years of software engineering across financial services and complex domains, combined with an accelerated CS degree, means I ship AI that's fast to deploy and hard to break.
I'm a software developer and AI specialist completing an accelerated BS + MS in Computer Science with a focus on AI and LLM systems. My background isn't the usual route into engineering, and that's exactly what makes me better at the job.
I started in operational management — owning complex workflows, working within compliance frameworks, making decisions where accuracy mattered. Then I became an Application Developer at TD Bank, shipping software to millions of users. Each step added something the previous one couldn't.
Now I build AI systems that are reliable by design: eval-first, guardrails from day one, observability as a first-class concern.
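Eval-first in practice means retrieval and prompt changes run against a golden dataset before they ship, and CI blocks the deploy on regression. Here's a minimal sketch of that gate using RAGAS; the sample record and the 0.9 threshold are illustrative, and the result API varies slightly between RAGAS versions:

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy

# Hypothetical golden record: a question, the pipeline's answer,
# and the contexts the retriever returned for it.
golden = Dataset.from_dict({
    "question": ["What is the claim filing deadline?"],
    "answer": ["Claims must be filed within 90 days of service."],
    "contexts": [["Policy 4.2: claims are due within 90 days of service."]],
})

scores = evaluate(golden, metrics=[faithfulness, answer_relevancy])

# Gate the deploy: fail CI if faithfulness regresses below the bar.
assert scores["faithfulness"] >= 0.9, "Eval gate failed: faithfulness dropped"
```

The snippets below show how the other two pillars, guardrails and observability, fit into the serving path.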
```python
from langchain.vectorstores import Pinecone
from langchain.embeddings import OpenAIEmbeddings
from langchain.retrievers import BM25Retriever, EnsembleRetriever
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQAWithSourcesChain
from .reranker import CohereReranker
from .guardrails import HallucinationGuard


def build_rag_pipeline(
    index_name: str,
    docs: list[str],
    model: str = "gpt-4o",
) -> HallucinationGuard:
    # Dense retriever over Pinecone embeddings
    dense = Pinecone.from_existing_index(
        index_name=index_name,
        embedding=OpenAIEmbeddings(),
    ).as_retriever(search_kwargs={"k": 20})

    # Sparse retriever for keyword precision
    sparse = BM25Retriever.from_texts(docs)
    sparse.k = 20

    # Hybrid: 60% dense, 40% sparse
    hybrid = EnsembleRetriever(
        retrievers=[dense, sparse],
        weights=[0.6, 0.4],
    )

    # Rerank top-20 down to top-5 to fit the context window
    reranked = CohereReranker(base_retriever=hybrid, top_n=5)

    llm = ChatOpenAI(model=model, temperature=0)

    chain = RetrievalQAWithSourcesChain.from_chain_type(
        llm=llm,
        retriever=reranked,
        return_source_documents=True,
    )

    # Wrap with a hallucination guard before returning
    return HallucinationGuard(chain=chain, threshold=0.85)
```
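Calling the pipeline looks like this. The guard is a local wrapper, so I'm assuming here that it preserves the chain's call signature; the index name and corpus are illustrative:

```python
corpus = ["Refunds for annual plans are available within 30 days."]  # illustrative docs

qa = build_rag_pipeline(index_name="support-docs", docs=corpus)
result = qa({"question": "What's the refund window for annual plans?"})
print(result["answer"], result["sources"])
```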
```python
import json

from fastapi import Request, Response
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

from .telemetry import risk_counter, latency_hist

analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()

PII_ENTITIES = [
    "PERSON", "EMAIL_ADDRESS", "PHONE_NUMBER",
    "US_SSN", "CREDIT_CARD", "MEDICAL_LICENSE",
]


async def guardrail_middleware(
    request: Request,
    call_next,
) -> Response:
    raw = await request.body()
    body = json.loads(raw or b"{}")
    prompt = body.get("prompt", "")

    # Detect PII in the incoming prompt
    results = analyzer.analyze(
        text=prompt,
        entities=PII_ENTITIES,
        language="en",
    )

    if results:
        # Track risk event in OpenTelemetry
        risk_counter.add(1, {"entity_types": [r.entity_type for r in results]})
        # Anonymize before forwarding to LLM
        clean = anonymizer.anonymize(text=prompt, analyzer_results=results)
        body["prompt"] = clean.text

    # Rebuild the receive channel so downstream handlers see the sanitised
    # body (reading the body above consumed the original stream)
    sanitised = json.dumps(body).encode("utf-8")

    async def receive():
        return {"type": "http.request", "body": sanitised}

    with latency_hist.time():
        response = await call_next(Request(request.scope, receive))

    return response
```
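Registering the middleware on an app is one line; the module path below is hypothetical:

```python
from fastapi import FastAPI

from .guardrail import guardrail_middleware  # hypothetical module path

app = FastAPI()
app.middleware("http")(guardrail_middleware)
```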
```python
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.memory import MemorySaver

from .nodes import triage, route_task, execute, audit_log
from .state import WorkflowState


def build_agent_graph():
    graph = StateGraph(WorkflowState)

    # Register nodes
    graph.add_node("triage", triage)
    graph.add_node("route", route_task)
    graph.add_node("execute", execute)
    graph.add_node("audit", audit_log)

    # Human-in-the-loop: high-risk tasks stop for review instead of executing
    graph.add_edge(START, "triage")
    graph.add_edge("triage", "route")
    graph.add_conditional_edges(
        "route",
        lambda s: "human_review" if s["risk_level"] > 0.7 else "execute",
        {"human_review": END, "execute": "execute"},
    )
    graph.add_edge("execute", "audit")
    graph.add_edge("audit", END)

    # Persist state between turns for long-running workflows
    memory = MemorySaver()
    return graph.compile(checkpointer=memory)


# Usage
app = build_agent_graph()
result = app.invoke(
    {"task": "schedule_followup", "patient_id": "P-8821"},
    config={"configurable": {"thread_id": "wf-001"}},
)
```
| Category | Tools & Technologies |
| --- | --- |
| LLMs & Orchestration | LangChain, LlamaIndex, LangGraph, OpenAI API, Hugging Face, AWS Bedrock, RAG Pipelines, Prompt Engineering |
| Vector Databases | Pinecone, FAISS, Weaviate, pgvector, Chroma, Qdrant |
| Cloud & DevOps | AWS SageMaker, AWS Lambda, Docker, Kubernetes, GitHub Actions, Terraform, CI/CD |
| Languages & APIs | Python, FastAPI, SQL, JavaScript, TypeScript, Bash, REST APIs |
| Data & ML | Embeddings, Fine-Tuning, Feature Pipelines, PostgreSQL, Redis, Kafka |
| LLMOps | MLflow, RAGAS, Prometheus, OpenTelemetry, LLM Evals, Guardrails, PII Redaction |
Open to AI Engineering roles in any industry. If you're building production LLM systems that need to be reliable, scalable, and genuinely useful, I'd love to talk.