5 Thoughts On Agent Engineering
While agents became the buzzword of 2024, their real-world usage has so far fallen short. The idea of agentic systems is likely to embed itself further into software engineering discourse, even though most developers still have only a vague understanding of what an agent is and of how to design and engineer for this new paradigm.
It’s not hard to imagine this paradigm becoming as practical and commonplace as web browsers are for office work. But the path there is far from clear-cut: there are still many open questions about how agents should be designed so that they can be easily reused, adapted, deployed and monitored.
My read is that the utility of these tools is sharply bifurcated right now. It is hard to find engineers who do not lean on AI assistance in some capacity, yet I’ve seen plenty of senior and staff-level engineers state plainly that they do not find it helpful or see reason (yet) to adapt their workflow to incorporate AI. At this point, choosing not to use these tools probably puts you at a real disadvantage. Here are some of my initial thoughts on the topic:
From Generalization to Inner Activations
When people talk about deep learning models getting larger, the focus is usually on the size of the model weights. But as the number of parameters increases, the intermediate values generated inside the model grow as well. These values are used during the forward pass, but are often forgotten afterward.
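To make the scaling concrete, here is a rough sketch that counts both weights and forward-pass activations. It assumes a plain fully connected network, and the function name and layer sizes are made up for illustration; the post itself may look at other architectures.

```python
def mlp_footprint(layer_sizes, batch_size):
    """Count weights and forward-pass activations for a plain MLP.

    layer_sizes: e.g. [784, 512, 10] (input width, hidden widths, output width).
    """
    # Each dense layer has (fan_in * fan_out) weights plus fan_out biases.
    params = sum(i * o + o for i, o in zip(layer_sizes, layer_sizes[1:]))
    # Each layer emits one output vector per example in the batch.
    activations = batch_size * sum(layer_sizes[1:])
    return params, activations


# Widening the hidden layers grows both numbers, not just the weights.
for width in (256, 1024):
    params, acts = mlp_footprint([784, width, width, 10], batch_size=64)
    print(f"hidden width {width}: {params:,} parameters, {acts:,} activations per batch")
```

Note that the activation count also scales with batch size, while the parameter count does not.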
sending objects/files over websockets
This post comes from me messing around with pushing a Python object (and later a whole project folder) over a socket to a remote server and then running a Python program on that server. I thought this would be useful because I wanted to iterate on a model on a few different remote machines (e.g. a GPU server and a faster CPU server) without having to commit and push every change. I found that while it works alright for a single file, it becomes messy once a project spans multiple files or its layout and dependencies grow more complex.
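As a minimal sketch of the single-object case, assuming plain stdlib sockets with `pickle` and a length prefix (the post may frame messages differently, and the helper names here are my own):

```python
import pickle
import socket
import struct


def send_obj(sock, obj):
    """Pickle obj and send it with a 4-byte big-endian length prefix."""
    payload = pickle.dumps(obj)
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def _recv_exact(sock, n):
    """Read exactly n bytes, since recv() may return fewer than requested."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_obj(sock):
    """Receive one length-prefixed pickled object."""
    (length,) = struct.unpack("!I", _recv_exact(sock, 4))
    # Only unpickle data from a machine you trust: pickle can execute code.
    return pickle.loads(_recv_exact(sock, length))


# Demo on a connected local pair; a real setup would use a TCP connection.
local, remote = socket.socketpair()
send_obj(local, {"lr": 0.01, "layers": [64, 32]})
received = recv_obj(remote)
local.close()
remote.close()
```

The length prefix matters because a stream socket has no message boundaries, so the receiver needs to know where one object ends.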
TCAV 101
TCAV (Testing with Concept Activation Vectors) is an interpretability method that measures how sensitive a model’s predictions are to human-chosen “concepts”, using the activations of a chosen layer. This post is me working through my own implementation of it and trying it on a non-image, time-series dataset (the SWaT water-treatment data) to see how well the idea holds up outside of the image tasks it usually gets shown on.
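A toy sketch of the scoring step, with one loud simplification: TCAV normally trains a linear classifier to separate concept activations from random ones and takes its normal vector as the CAV, whereas this sketch uses a difference of means as a stand-in. All numbers and function names are made up for illustration, and in practice the gradients would come from an autograd framework.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def concept_activation_vector(concept_acts, random_acts):
    """Direction pointing from random activations toward concept activations.

    Simplification: TCAV trains a linear classifier and uses its normal
    vector; the difference of means here is a stand-in for that.
    """
    c, r = mean_vector(concept_acts), mean_vector(random_acts)
    return [ci - ri for ci, ri in zip(c, r)]


def tcav_score(gradients, cav):
    """Fraction of inputs whose class logit increases along the CAV."""
    positive = sum(
        1 for g in gradients if sum(gi * vi for gi, vi in zip(g, cav)) > 0
    )
    return positive / len(gradients)


# Toy layer activations (made-up numbers, 2 units wide).
concept_acts = [[2.0, 0.1], [1.8, -0.1]]
random_acts = [[0.1, 0.0], [-0.1, 0.2]]
cav = concept_activation_vector(concept_acts, random_acts)

# Made-up gradients of one class logit with respect to the same layer.
gradients = [[0.5, 0.0], [0.3, 0.1], [-0.2, 0.0], [0.4, -0.1]]
score = tcav_score(gradients, cav)
print(f"TCAV score: {score}")
```

A score near 0.5 would suggest the concept direction has no consistent effect on the class; scores near 0 or 1 suggest a consistent negative or positive influence.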
using fastText to classify phone call dialogue
This is the first part of using fastText to tag lines of dialogue from sales-call transcripts. fastText is an open-source library for supervised and unsupervised learning of text representations and classifiers, and the thing I like about it is how fast and simple it is to get a model going compared to something like sklearn or spaCy. Here I walk through setting it up, getting the call data into the format it wants, training a supervised model, and a few things that helped the results.
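For the data-format step, fastText's supervised mode expects one example per line with the label prefixed by `__label__`. Here is a small sketch of that conversion; the labels, the example lines, and the normalization choices are hypothetical and not taken from the post.

```python
import re


def to_fasttext_line(label, text):
    """Format one example as '__label__<label> <text>' on a single line."""
    # Light normalization: lowercase, pad punctuation with spaces, collapse whitespace.
    text = re.sub(r"([.!?,'/()])", r" \1 ", text.lower())
    text = " ".join(text.split())
    return f"__label__{label.replace(' ', '_')} {text}"


# Hypothetical labelled dialogue lines; the real tags and transcripts differ.
examples = [
    ("greeting", "Hi, thanks for taking my call!"),
    ("pricing", "How much does the annual plan cost?"),
]
lines = [to_fasttext_line(label, text) for label, text in examples]
print("\n".join(lines))
```

Written to a file such as `train.txt`, lines like these can be passed to `fasttext.train_supervised(input="train.txt")` in the official Python bindings, or to the `fasttext supervised` command.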
using kubernetes jobs for one-off ingestion of CSVs
This post comes from wanting a repeatable way to get a one-off CSV into Postgres as the first step of a machine learning pipeline, with everything running on Kubernetes locally. I set up a local Kubernetes cluster with Docker for Mac, install Postgres and the dashboard with Helm, and then use a Kubernetes job with pgfutter to ingest the CSV into the database. It is overkill for a single CSV, but the point is to have a standardized pipeline I can reuse and later extend to model training.
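Roughly, the Job that does the ingestion has the shape sketched below, built here as a Python dict and printed as JSON (which `kubectl apply -f` accepts alongside YAML). The image name, job name, and CSV path are hypothetical, and the database connection settings and volume mounts a real job needs are left out.

```python
import json


def csv_ingest_job(name, image, csv_path):
    """Build a Kubernetes Job manifest that runs pgfutter once on one CSV."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "backoffLimit": 2,  # retry a failed pod at most twice
            "template": {
                "spec": {
                    "restartPolicy": "Never",  # Jobs require Never or OnFailure
                    "containers": [
                        {
                            "name": "ingest",
                            "image": image,
                            "command": ["pgfutter", "csv", csv_path],
                        }
                    ],
                }
            },
        },
    }


# Hypothetical image name and path; database connection settings are omitted.
manifest = csv_ingest_job("csv-ingest", "example/pgfutter:latest", "/data/input.csv")
print(json.dumps(manifest, indent=2))
```

Because a Job runs to completion rather than staying up like a Deployment, it fits a one-off ingestion step that a later pipeline stage can depend on.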
downloading files from kaggle
Often it can be a hassle to get files from Kaggle onto a remote server. This is a quick walkthrough of a script I wrote using RoboBrowser to log into Kaggle and download a competition’s files from the command line. You could also use this for other sites where you need to log in with a username/password to download a file.
telegram bot
This post is my personal introduction to using Telegram and Bluemix. I wanted to set up a simple bot in Python, run it in a Docker container, and deploy it on Bluemix. It is super simple, but the intent is to show how to do the basics before integrating peripheral APIs and other processes.
TensorFlow 0.7.0 Dockerfile with Python 3
In 2015, Google released TensorFlow, a new deep learning framework and tensor library that was similar in many ways to Theano. I enjoy using it a lot more than Theano, mostly because of Theano’s long compile times when using Keras, and because of TensorBoard. This post will not go into detail about using Theano, TensorFlow, or Keras. It is about how I built a Docker image for a slightly older NVIDIA card. For my purposes, the image lets me use multiple GPUs in isolation, executing a model on one card without affecting the other.
using generative neural nets in keras to create ‘on-the-fly’ dialogue
This is an old walkthrough of using Keras to train a character-level LSTM on YouTube captions, then using speech-to-text to generate a possible next line of dialogue. The code is no longer current, but it shows the basic flow I was playing with: collect sales-call subtitles, train a small generative model, speak a prompt, and have the model suggest what could come next.
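The training data for a character-level model like this is usually built by sliding a fixed-size window over the text and predicting the next character. A minimal sketch of that preparation step follows; the window size and function name are my own assumptions, and the Keras model itself is not shown.

```python
def make_char_windows(text, window=5, step=1):
    """Slice text into (input window, next character) training pairs."""
    chars = sorted(set(text))
    char_to_index = {ch: i for i, ch in enumerate(chars)}
    pairs = []
    for start in range(0, len(text) - window, step):
        context = text[start:start + window]
        target = text[start + window]
        # Encode characters as integer indices; one-hot encoding would come next.
        pairs.append(([char_to_index[ch] for ch in context], char_to_index[target]))
    return pairs, char_to_index


pairs, vocab = make_char_windows("hello there", window=5)
print(len(pairs), "training pairs over", len(vocab), "distinct characters")
```

At generation time the same mapping is used in reverse: the model predicts an index, it is turned back into a character, and the window slides forward by one.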