University of Arizona · BIO5 Institute
PH&AI Summer School · 2nd Edition
AI Fluency Track · Deep Dive

Agentic Engineering.

Past the chat window — into your filesystem, your shell, and your research stack. Vibe coding · MCP · sandboxes · safety.

Tyson Swetnam, PhD
Associate Professor · BIO5 Institute
Grand Challenges Research Building
Tucson, AZ · June 2026
PH&AI · AI Fluency
02 / The Shift
What's Different This Time

The model isn't reading your prompt. It's reading your filesystem.

Yesterday · the basics deck

Prompt → response.

A chat window. One turn at a time. The model sees what you paste. You see what it writes. The boundary between "your computer" and "the AI" is a textbox.

Today · this deck

Prompt → tool calls → effects.

The model reads your repo, edits files, runs your tests, calls APIs, queries databases — through standardized tool protocols. You approve actions, not paragraphs.

Everything in the basics deck still applies — CRAFT, role, format. But the failure modes change: an unclear prompt no longer wastes a turn. It wastes a file.

03 / Outcome
What You'll Leave With

Four lenses for the agentic stack.

01. Vibe coding.

The IDE / extension / CLI / browser landscape. How to choose a surface for the work.

02. Local filesystems.

What an agent on your machine can actually do — and how to scope its authority.

03. MCP.

The Model Context Protocol — USB-C for AI. Standard plug for tools, data, resources.

04. Sandboxes.

Containers, VMs, hosted notebooks. Where you let the agent run hot without setting the lab on fire.

Companion page: tyson-swetnam.github.io/intro-gpt/vibe/ and /research/. Every section below has a deeper write-up there.

Part 01 · Vibe Coding
Part 01

Vibe coding.

An LLM, your IDE, your repo. The fastest way to ship code you don't fully understand — and the fastest way to break something you do.

05 / Origin
February 2025 · Karpathy on X

Where the term came from — and what it actually means.

There's a new kind of coding I call vibe coding, where you fully give in to the vibes, embrace exponentials, and forget that the code even exists.

Andrej Karpathy · Feb 2 2025
Working definition

Using an LLM to generate and edit code directly inside your IDE — the model is a collaborative partner, not a search engine you copy-paste from.

The trade

Speed in exchange for meaningful authority over your machine. Files, network, shell. Worth knowing what you handed over.

06 / Surfaces
Pick The Surface That Matches The Job

Four families. Different authority, different blast radius.

01 · Desktop IDE

Standalone editors.

Full-fat editing experience. Agent sees the open repo.

  • Claude Desktop
  • VS Code
  • Cursor
  • Positron (Posit)
  • Windsurf
  • Antigravity
02 · VS Code ext

Inside your editor.

Bring the agent to where you already work. Some extensions let you bring your own model (BYOM).

  • Claude Code
  • Gemini CLI Companion
  • OpenAI Codex
  • GitHub Copilot
  • Cline
  • Roo Code
03 · CLI

Terminal-first.

Scriptable and headless. Easy to wire into pipelines and CI.

  • Aider
  • Claude Code CLI
  • OpenAI Codex CLI
  • Gemini CLI
  • OpenCode.ai
04 · Browser

Sandboxed by default.

No local filesystem. Code runs in a hosted Python or Node env.

  • Claude Code (web)
  • ChatGPT
  • Google Gemini
  • OpenWebUI (self-host)

Authority does not increase left to right. It is highest wherever the agent runs on your machine (01 · Desktop IDE, and equally 02 and 03), because those surfaces touch your real files. 04 · Browser has the least.

07 / Choosing
A First-Pass Decision Table

Match the surface to the situation. Try one. Switch when it stops fitting.

| If you are… | Surface | Reach for | Notes |
|---|---|---|---|
| New to vibe coding, exploring | Browser | ChatGPT / Gemini / Claude.ai | No install, sandboxed Python, zero blast radius. |
| A researcher with R / Python notebooks | Desktop IDE | Positron · Cursor · VS Code | Native data-science tooling, agent has repo context. |
| Comfortable in terminal, scripting workflows | CLI | Claude Code CLI · Aider · Gemini CLI | Pipe into CI, run headless against many repos. |
| Working with sensitive data on your laptop | BYOM | Cline + Ollama · Aider + local LLM | Self-hosted model; nothing leaves the machine. |
| Already deep in GitHub PR review | VS Code ext | GitHub Copilot · Claude Code | Lives in the editor you already have open. |

None of these are permanent. The cost of switching is one afternoon of muscle memory.

Part 02 · Local Filesystems
Part 02

Your filesystem is a tool.

Once you grant filesystem access, the agent can reach everything your user account can. That's the deal. Understand what it can reach.

09 / Capabilities
What An Agent On Your Machine Can Actually Do

Four authorities. Granted with one click. Worth saying out loud.

FS.
Filesystem.

Read, modify, and delete files anywhere your user has permission. Not just the open repo — your home directory, your Downloads, your Dropbox.

NET.
Network.

Make API calls, fetch URLs, exfiltrate data, install packages. The agent has your IP and your connectivity.

SH.
Shell.

Execute arbitrary commands — file deletion, force pushes, cloud sync. The terminal is the terminal.

ENV.
Environment.

Read environment variables — including secrets your shell exposes. API keys, database URLs, auth tokens.

Heads-up
LLMs occasionally hallucinate package names. Attackers register them on PyPI / npm. If your agent installs dependencies without review, it can pull in malicious code. Read the requirements.txt diff.
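
One way to make "read the requirements.txt diff" concrete is to list only the package names an agent added, so each one can be looked up by hand before anything is installed. The sketch below is a minimal illustration, not a full requirements parser: the helper names `package_names` and `newly_added` and the package `Pandas_Utils_Helper` are hypothetical, and it only handles simple `name==version` style lines.

```python
import re


def package_names(requirements_text: str) -> set[str]:
    """Return normalized package names from requirements.txt-style text."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options (-r, -e)
            continue
        # The name is the leading run of characters before any specifier or extra.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            # Normalize: lowercase, and treat runs of - _ . as one hyphen.
            names.add(re.sub(r"[-_.]+", "-", match.group(0)).lower())
    return names


def newly_added(before: str, after: str) -> list[str]:
    """Names present in the new file but not in the old one."""
    return sorted(package_names(after) - package_names(before))


# Hypothetical before/after contents of a requirements.txt edited by an agent.
before = "numpy==1.26.4\npandas>=2.0\n"
after = "numpy==1.26.4\npandas>=2.0\nPandas_Utils_Helper==0.1  # added by agent\n"

print(newly_added(before, after))  # → ['pandas-utils-helper']
```

Anything this prints is a name to verify on PyPI (maintainer, age, download history) before approving the install.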
10 / Practices
Practices That Keep This Manageable

Seven habits. Adopt them before you point an agent at code that matters.

01 · Review
Read every command before approving. Most tools prompt — don't auto-approve everything.
02 · Scope
Launch the agent from the project directory with a project-specific virtual environment active, not from your home directory.
03 · Secrets
Never store secrets in code. Use environment variables and secret managers.
04 · Sudo
Be cautious with sudo or administrator privileges. Agents rarely need them.
05 · Monitor
Actively watch agent actions when you're learning a new tool. Trust comes from observation.
06 · Policy
Follow your institution's security and privacy policies. UA has classifications. So does your IRB.
07 · Sandbox
For sensitive work, use containers or VMs — covered in Part 04.
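
Habit 03 interacts with the ENV authority from the previous slide: secrets kept in environment variables are still readable by anything you launch from that shell. A minimal sketch of one mitigation is to start the tool with a filtered copy of the environment. The names `SECRET_HINTS`, `scrubbed_env`, and `run_scrubbed` are hypothetical, and the name-matching heuristic is an assumption that will both miss some secrets and drop some harmless variables.

```python
import os
import subprocess

# Assumption: variables whose names contain these fragments hold credentials.
SECRET_HINTS = ("KEY", "TOKEN", "SECRET", "PASSWORD", "DATABASE_URL")


def scrubbed_env(env: dict[str, str]) -> dict[str, str]:
    """Copy env, leaving out variables whose names look like credentials."""
    return {
        name: value
        for name, value in env.items()
        if not any(hint in name.upper() for hint in SECRET_HINTS)
    }


def run_scrubbed(argv: list[str]) -> subprocess.CompletedProcess:
    """Run a command so that it inherits only the filtered environment."""
    return subprocess.run(
        argv,
        env=scrubbed_env(dict(os.environ)),
        capture_output=True,
        text=True,
        check=False,
    )


# Made-up values, for illustration only.
demo = {
    "PATH": "/usr/bin",
    "OPENAI_API_KEY": "sk-example",
    "DATABASE_URL": "postgres://example",
}
print(sorted(scrubbed_env(demo)))  # → ['PATH']
```

A secret manager that injects credentials only into the one process that needs them is the sturdier option; this filter is a stopgap for tools that inherit your whole shell.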
Part 03 · Model Context Protocol
Part 03 · MCP

USB-C for AI.

One protocol, any tool. The plug that lets the same agent talk to your filesystem, your database, your calendar, your lab instruments.

12 / MCP · What
Model Context Protocol

An open protocol that standardizes how applications provide context to LLMs.

Before MCP: every agent shipped its own plugin system. Connecting Claude to your filesystem meant one integration; connecting Cursor to the same filesystem meant another.

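After MCP: write one server. Any compatible client can use it. The "M×N" problem becomes "M + N": 4 clients and 6 tools need 24 bespoke integrations before MCP, but only 10 implementations (4 clients + 6 servers) after.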

Maintained by

Anthropic, open spec at modelcontextprotocol.io. Implementations from Anthropic, OpenAI, and the community.

Tools

Functions the model can call — read a file, run a query, send an email.

Resources

Data the model can pull in — files, database rows, API responses.

Prompts

Reusable templates the server publishes for the client to surface.

13 / MCP · How
Anatomy Of An MCP Connection

One host. One protocol. Many servers, each scoped to one job.

Host / Client
Claude Desktop

The app the user talks to. Manages the model, the chat, and the list of attached servers.

also: Cursor · VS Code · Cline · Claude Code

MCP · JSON-RPC over stdio / Streamable HTTP (formerly HTTP + SSE)
Servers · each a separate process
  • filesystem: read · write · list
  • github: issues · prs
  • postgres: query · schema
  • slack: channels · messages
  • your-own-thing: lab gear · REDCap

Each server is its own process with its own permissions. You enable them one at a time in the host's config — and you revoke them the same way.
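
Revoking is just removing the server's entry from the host's config and restarting the host. As a minimal sketch, assuming the config uses the top-level `mcpServers` mapping that Claude Desktop's config file uses, the function below returns a copy with one server removed. The name `revoke_server` and the server commands in the example are hypothetical.

```python
import copy


def revoke_server(config: dict, name: str) -> dict:
    """Return a copy of the host config with one MCP server entry removed."""
    updated = copy.deepcopy(config)  # leave the caller's config untouched
    updated.get("mcpServers", {}).pop(name, None)  # no error if already absent
    return updated


# Hypothetical config with two attached servers.
config = {
    "mcpServers": {
        "filesystem": {"command": "example-filesystem-server", "args": ["/project"]},
        "slack": {"command": "example-slack-server", "args": []},
    }
}

after = revoke_server(config, "slack")
print(sorted(after["mcpServers"]))   # → ['filesystem']
print(sorted(config["mcpServers"]))  # → ['filesystem', 'slack']
```

Keeping the edit as a pure function makes it easy to diff the old and new config before writing anything back to disk.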

14 / MCP · Config
From Zero To Connected

Connecting Claude Desktop to your project folder — the actual JSON.

File: claude_desktop_config.json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-filesystem",
        "/Users/tswetnam/research/phai-2026"
      ]
    },
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres", "..."]
    }
  }
}
What this buys you

Claude can now read & write files in your project folder and query your local Postgres — through tool calls you approve in the chat.

Scoping rule

The path you pass in is the leash — the filesystem server cannot reach outside it. Use this. Point each MCP at the narrowest scope that does the job.
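
The scoping rule amounts to a containment check: resolve the requested path, then confirm it is still under the allowed root. The sketch below illustrates that idea only and is not the filesystem server's actual implementation. The function name `is_inside` and the example paths are hypothetical.

```python
from pathlib import Path


def is_inside(root: str, requested: str) -> bool:
    """True if `requested`, resolved relative to `root`, stays under `root`."""
    root_path = Path(root).resolve()
    # resolve() collapses ".." segments and follows symlinks, so a path
    # cannot climb out of the root by spelling itself cleverly.
    target = (root_path / requested).resolve()
    return target == root_path or root_path in target.parents


root = "/tmp/project"  # hypothetical allowed directory

print(is_inside(root, "data/cohort.csv"))       # → True
print(is_inside(root, "../other/secrets.env"))  # → False
print(is_inside(root, "/etc/passwd"))           # → False
```

Comparing against `target.parents` rather than using a string prefix test avoids treating a sibling such as `/tmp/project-old` as if it were inside `/tmp/project`.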

Part 04 · AI Sandboxes
Part 04

Let it run hot.

The point of a sandbox: somewhere the agent can experiment, install random packages, and break things without taking your laptop down with it.

16 / Sandboxes
Three Tiers Of Isolation

Pick the tier that matches the data sensitivity, not your comfort.

Tier 01 · Soft

Hosted notebook.

ChatGPT's Python sandbox, Gemini code execution, Claude artifacts. The agent runs code in someone else's container.

  • Zero install
  • No local file access
  • Ephemeral by default
  • Data leaves your machine
Tier 02 · Medium

Container / devcontainer.

Docker or VS Code devcontainer. Agent runs against your real repo but inside an isolated filesystem. Outbound network stays open unless you restrict it.

  • Reproducible env
  • Limited host access
  • Survives destructive commands
  • Network still escapes
Tier 03 · Hard

VM / data enclave.

Full virtual machine or institutional enclave. For HIPAA, FERPA, CUI, or anything where leakage is unacceptable.

  • Strong isolation
  • Auditable
  • Often offline
  • Heavy to spin up

Public health work crosses all three tiers in a week. Synthetic data → soft. Cohort exploration → medium. Patient-level analysis → hard. Match the tier to the row.
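
For Tier 02, the "network still escapes" caveat can be closed at launch time. The sketch below only assembles a `docker run` command line. It does not run it, and it is one possible set of flags rather than a vetted hardening profile. The function name `sandbox_command`, the repo path, the resource limits, and the choice of the `python:3.12-slim` image are assumptions.

```python
import shlex


def sandbox_command(repo_dir: str, image: str, allow_network: bool = False) -> list[str]:
    """Build a docker run argv that mounts one repo and little else."""
    argv = [
        "docker", "run", "--rm", "--interactive", "--tty",
        "--volume", f"{repo_dir}:/workspace",  # only this directory is shared
        "--workdir", "/workspace",
        "--cap-drop", "ALL",                   # drop Linux capabilities
        "--pids-limit", "512",                 # cap runaway process creation
        "--memory", "4g",                      # cap memory use
    ]
    if not allow_network:
        argv += ["--network", "none"]          # no outbound network at all
    argv += [image, "bash"]
    return argv


# Hypothetical repo path; print the command for review instead of running it.
print(shlex.join(sandbox_command("/home/me/research/demo", "python:3.12-slim")))
```

With the network off, package installs inside the container will fail. A common compromise is to install dependencies with `allow_network=True`, then rerun the agent without it.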

17 / The Loop
The Agentic Loop

Plan → act → observe → revise. The model that does this in a container is the one you trust.

01 · Goal
02 · Plan
03 · Act · tool call
04 · Observe
05 · Revise & loop
Where humans belong

At step 03 approval for high-stakes actions, and at step 04 observation for everything. Don't auto-approve write tools.

Why the sandbox matters

Steps 03→04 may run 50 iterations before surfacing to you. Each iteration touches the FS. A bad plan in a sandbox is a wasted minute; on your laptop it's a restore.

Part 05 · Agentic Research
Part 05

Putting it to work.

Literature review · hypothesis generation · code & data analysis. The same stack, pointed at a real research problem.

19 / Literature
Literature Review & Synthesis

Four tools. Four jobs. Don't reach for the same one twice.

P.
Perplexity

Search & summarize

The "what's out there on X?" tool. Web-first, citations inline, fast.

G.
Gemini Deep Research

Multi-step report

You pose a research question, it produces a structured report over many minutes.

N.
NotebookLM

Your own corpus

Upload your PDFs / docs / audio. Answers are grounded in, and cite, your own sources. Still spot-check the citations.

S.
ScholarAI / Semantic Scholar

Peer-reviewed

Custom GPTs and tools pinned to academic indexes. For when "web" isn't enough.

Rule of thumb: Perplexity for scoping, ScholarAI for citations you'll actually cite, NotebookLM for synthesizing what you've already collected.

20 / Hypothesis
Hypothesis Generation Via Role-Play

Roles aren't just style. They route which patterns the model surfaces.

Pattern 01 · Domain expert
I want you to act as a data scientist
with complete knowledge of R, the
tidyverse, and RStudio.

Write the code to:
1. Create a new R project env
2. Load Palmer Penguins
3. Plot regressions of body mass,
   bill length & depth by species

Output as R + RMarkdown with text
and code in ``` blocks.
Pattern 02 · Talk to a dead scientist
I want you to respond as though
you are the mathematician
Benoit Mandelbrot.

Explain the relationship of
lacunarity and fractal dimension
for a self-affine series.

Show results using mathematical
equations in LaTeX or MathJax.

Try with and without web search enabled. The deltas tell you which claims the model is generating vs. retrieving.

21 / Code Execution
Where The Code Actually Runs

Same prompt. Five very different places it could execute.

| Surface | Where it runs | Sees your files? | Good for |
|---|---|---|---|
| ChatGPT Python tool | OpenAI sandbox | No | Quick plots, data wrangling, "show me what this CSV looks like" |
| Gemini code execution | Google sandbox | No | Inline Python results in chat, large-context analysis |
| Claude Code (CLI / IDE) | Your machine | Yes (built-in tools and MCP) | Real repo work, multi-file refactors, test runs |
| Jupyter AI | Your kernel | Yes | Notebook-native AI in JupyterLab; great for R / Python research |
| Cline + Ollama | Your machine, local model | Yes | Sensitive code; nothing leaves the laptop |

Match the row to the data classification. Synthetic test data → row 1. De-identified cohort → row 3. IRB-sensitive → row 5, or push to an institutional enclave.

22 / Safety
Coding Safely With AI

Six categories. Work through them in order before pointing an agent at code that matters.

01 · Review every line.
Correctness, efficiency, maintainability. Test edge cases. Check for SQL injection, XSS, weak auth, secret leakage. If you don't understand a block, ask the AI to explain it.
02 · Local execution risks.
FS, network, shell, env vars. Review commands before approving. Project-scoped venvs. Never store secrets in code. Avoid admin privileges.
03 · Bias, licensing, IP.
Models reproduce non-inclusive identifiers, deprecated patterns, licensed code. AI-generated code is "common practice," not "best practice." Check license compatibility. Document AI use.
04 · Privacy & data handling.
With a hosted model, prompts, file contents, terminal output, and project metadata all leave your machine. Don't share PHI/PII in prompts. For sensitive code: Cline + Ollama, Aider + local LLM.
05 · Accessibility.
Ask for WCAG review on UI code. Alt text, ARIA, color contrast, keyboard nav. Audit identifiers and comments for inclusive language.
06 · Environmental footprint.
Don't use a frontier model when a smaller one will do. Cache results. Avoid agentic loops that fire speculative requests. Cumulative compute is the cost.
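
As one concrete instance of the review in category 01, SQL injection usually shows up as a query assembled by string formatting. The sketch below uses Python's standard `sqlite3` module with an in-memory database to contrast that pattern with a parameterized query. The table, column, and function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE participants (id INTEGER PRIMARY KEY, site TEXT)")
conn.executemany(
    "INSERT INTO participants (site) VALUES (?)", [("tucson",), ("phoenix",)]
)


def count_unsafe(conn: sqlite3.Connection, site: str) -> int:
    """The pattern to flag in review: user input pasted into the SQL text."""
    query = f"SELECT COUNT(*) FROM participants WHERE site = '{site}'"
    return conn.execute(query).fetchone()[0]


def count_safe(conn: sqlite3.Connection, site: str) -> int:
    """Parameterized query: the driver passes the value separately from the SQL."""
    query = "SELECT COUNT(*) FROM participants WHERE site = ?"
    return conn.execute(query, (site,)).fetchone()[0]


hostile = "nowhere' OR '1'='1"  # input crafted to rewrite the WHERE clause

print(count_unsafe(conn, hostile))  # → 2
print(count_safe(conn, hostile))    # → 0
```

The unsafe version counts every row because the input rewrote the WHERE clause. The safe version treats the same input as an ordinary string that matches no site.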
23 / Resources
Bookmarks

Where to keep reading after the room empties out.

Vibe Coding
tyson-swetnam.github.io/intro-gpt/vibe/
Research
tyson-swetnam.github.io/intro-gpt/research/
Agentic AI
tyson-swetnam.github.io/intro-gpt/agentic/
MCP
modelcontextprotocol.io · /intro-gpt/mcp/
Sandboxes
tyson-swetnam.github.io/intro-gpt/ai_sandboxes/
Local models
ollama.com · /intro-gpt/ollama/
Claude Code
docs.anthropic.com/en/docs/claude-code
End · Q&A
Now point one at something.

Boot a sandbox.

Pick a real task from this week. Spin up a devcontainer. Attach one MCP. Watch the agent loop. Approve every write. Decide what to delegate next.

Tyson Swetnam, PhD
tswetnam@arizona.edu · @tswetnam
BIO5 Institute
University of Arizona