TCAV 101

TCAV (Testing with Concept Activation Vectors) is an interpretability method that measures how sensitive a model’s predictions are to human-chosen “concepts”, as those concepts are represented in the activations of a chosen layer. In this post I work through my own implementation and try it on a non-image, time-series dataset (the SWaT water-treatment data) to see how well the idea holds up outside the image tasks it is usually demonstrated on.
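
To make the idea concrete before getting into the details, here is a rough sketch of the two core pieces as I understand them: fit a linear classifier that separates a layer's activations for concept examples from activations for random examples, take the unit normal of its decision boundary as the concept activation vector (CAV), and then report the fraction of inputs whose class-logit gradient (with respect to that layer) has a positive component along the CAV. The names `fit_cav` and `tcav_score` are made up for this illustration, the activations and gradients are synthetic toy arrays rather than anything from a real model or from SWaT, and the hand-rolled logistic regression is just a stand-in for whatever linear classifier you prefer.

```python
import numpy as np

def fit_cav(concept_acts, random_acts, steps=500, lr=0.1):
    """Hypothetical helper: fit a logistic regression (plain gradient descent)
    separating concept activations from random ones, and return the unit
    normal of the decision boundary as the CAV."""
    X = np.vstack([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted P(concept)
        w -= lr * (X.T @ (p - y)) / len(y)      # gradient of mean log loss w.r.t. w
        b -= lr * np.mean(p - y)                # gradient of mean log loss w.r.t. b
    return w / np.linalg.norm(w)

def tcav_score(grads, cav):
    """Hypothetical helper: fraction of examples whose directional derivative
    along the CAV is positive. `grads` holds d(logit)/d(activation), one row
    per example, and is assumed to come from the model under study."""
    return float(np.mean(grads @ cav > 0))

# Toy data (assumption): 4-dimensional "activations" where the concept
# shifts the first dimension.
rng = np.random.default_rng(0)
concept_acts = rng.normal(size=(50, 4)) + np.array([2.0, 0.0, 0.0, 0.0])
random_acts = rng.normal(size=(50, 4))
cav = fit_cav(concept_acts, random_acts)

# Toy gradients (assumption): every example's logit increases along dim 0.
grads = np.tile(np.array([1.0, 0.0, 0.0, 0.0]), (10, 1))
score = tcav_score(grads, cav)
```

In a real run the rows of `grads` would come from backpropagating the class logit to the chosen layer for each input of that class, and the score would be compared against scores from CAVs trained on random-versus-random splits to check that it is not a fluke.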