Tag: Activations
From Generalization to Inner Activations
When people talk about deep learning models getting larger, the focus is usually on the size of the model weights. But as the number of parameters increases, the intermediate values generated inside the model grow as well. These values are used during the forward pass, but are often forgotten afterward.
Tag: CCA
From Generalization to Inner Activations
When people talk about deep learning models getting larger, the focus is usually on the size of the model weights. But as the number of parameters increases, the intermediate values generated inside the model grow as well. These values are used during the forward pass, but are often forgotten afterward.
Tag: Generalization
From Generalization to Inner Activations
When people talk about deep learning models getting larger, the focus is usually on the size of the model weights. But as the number of parameters increases, the intermediate values generated inside the model grow as well. These values are used during the forward pass, but are often forgotten afterward.
Tag: Inner-Representations
From Generalization to Inner Activations
When people talk about deep learning models getting larger, the focus is usually on the size of the model weights. But as the number of parameters increases, the intermediate values generated inside the model grow as well. These values are used during the forward pass, but are often forgotten afterward.
Tag: Perturbation
From Generalization to Inner Activations
When people talk about deep learning models getting larger, the focus is usually on the size of the model weights. But as the number of parameters increases, the intermediate values generated inside the model grow as well. These values are used during the forward pass, but are often forgotten afterward.
Tag: SVD
From Generalization to Inner Activations
When people talk about deep learning models getting larger, the focus is usually on the size of the model weights. But as the number of parameters increases, the intermediate values generated inside the model grow as well. These values are used during the forward pass, but are often forgotten afterward.
Tag: Python
sending objects/files over websockets
This post comes from me messing around with pushing a Python object (and later a whole project folder) over a socket to a remote server and then running a Python program on that server. I thought this would be useful because I wanted to iterate on a model on a few different remote machines (e.g. a GPU server and a faster CPU server) without having to commit and push every change. I found that while it works alright for a single file, it becomes messy once a project spans multiple files or grows in layout and dependency complexity.
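As a rough illustration of the idea in this excerpt, here is a minimal sketch of pushing a Python object over a socket. It assumes a plain stream socket with a 4-byte length prefix and `pickle` for serialization, not the websocket protocol itself; the helper names (`send_obj`, `recv_obj`, `roundtrip`) are hypothetical and not taken from the post.

```python
import pickle
import socket
import struct


def send_obj(sock, obj):
    """Pickle obj and send it with a 4-byte big-endian length prefix."""
    payload = pickle.dumps(obj)
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def _recv_exact(sock, n):
    """Read exactly n bytes, since recv() may return fewer than requested."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed before full message arrived")
        buf.extend(chunk)
    return bytes(buf)


def recv_obj(sock):
    """Read one length-prefixed message and unpickle it."""
    (length,) = struct.unpack("!I", _recv_exact(sock, 4))
    # Assumption: the sender is trusted. Unpickling untrusted data is unsafe.
    return pickle.loads(_recv_exact(sock, length))


def roundtrip(obj):
    """Demo helper: send obj through a local socket pair and read it back."""
    sender, receiver = socket.socketpair()
    try:
        send_obj(sender, obj)
        return recv_obj(receiver)
    finally:
        sender.close()
        receiver.close()
```

Calling `roundtrip({"lr": 0.01, "layers": [64, 32]})` should hand back an equal dictionary. A real setup would replace the socket pair with a connection to the remote server.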
downloading files from kaggle
Often it can be a hassle to get files from Kaggle onto a remote server. This is a quick walkthrough of a script I wrote using RoboBrowser to log into Kaggle and download a competition’s files from the command line. You could also use this for other sites where you need to log in with a username/password to download a file.
Tag: Websockets
sending objects/files over websockets
This post comes from me messing around with pushing a Python object (and later a whole project folder) over a socket to a remote server and then running a Python program on that server. I thought this would be useful because I wanted to iterate on a model on a few different remote machines (e.g. a GPU server and a faster CPU server) without having to commit and push every change. I found that while it works alright for a single file, it becomes messy once a project spans multiple files or grows in layout and dependency complexity.
Tag: Deep Learning
TCAV 101
TCAV (Testing with Concept Activation Vectors) is an interpretability method that measures how sensitive a model’s layers are to human-chosen “concepts”. This post is me working through my own implementation of it and trying it on a non-image, time-series dataset (the SWaT water-treatment data) to see how well the idea holds up outside of the image tasks it usually gets shown on.
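To make "sensitivity to a concept" a bit more concrete, here is a minimal sketch of the final scoring step only. It assumes the concept activation vector (`cav`) and the per-example gradients of a class logit with respect to the layer's activations have already been computed elsewhere; the function name and inputs are hypothetical and not taken from the post's implementation.

```python
def tcav_score(gradients, cav):
    """Fraction of examples whose directional derivative along the CAV is positive.

    gradients: list of gradient vectors (one per example) of the class logit
               with respect to the chosen layer's activations.
    cav:       the concept activation vector for that layer, same length.
    """
    if not gradients:
        raise ValueError("need at least one gradient vector")
    positive = 0
    for grad in gradients:
        # Directional derivative = dot product of the gradient with the CAV.
        directional = sum(g * c for g, c in zip(grad, cav))
        if directional > 0:
            positive += 1
    return positive / len(gradients)
```

A score near 1.0 would suggest that moving activations toward the concept tends to raise the class logit for most examples, and a score near 0.0 the opposite.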
Tag: Research
TCAV 101
TCAV (Testing with Concept Activation Vectors) is an interpretability method that measures how sensitive a model’s layers are to human-chosen “concepts”. This post is me working through my own implementation of it and trying it on a non-image, time-series dataset (the SWaT water-treatment data) to see how well the idea holds up outside of the image tasks it usually gets shown on.
Tag: Fasttext
using fastText to classify phone call dialogue
This is the first part of using fastText to tag lines of dialogue from sales-call transcripts. fastText is an open-source library for supervised and unsupervised learning of text representations and classifiers, and the thing I like about it is how fast and simple it is to get a model going compared to something like sklearn or spaCy. Here I walk through setting it up, getting the call data into the format it wants, training a supervised model, and a few things that helped the results.
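For the "format it wants" step, fastText's supervised mode reads one example per line, with each label prefixed by `__label__` (the default prefix) followed by the text. A minimal sketch of producing such lines is below; the function name and the whitespace handling are assumptions for illustration, not the post's actual preprocessing.

```python
def to_fasttext_line(label, text):
    """Format one transcript line as a fastText supervised training example."""
    # Labels must not contain whitespace, so join any words with underscores.
    safe_label = "_".join(label.split())
    # Collapse newlines and repeated spaces so each example stays on one line.
    clean_text = " ".join(text.split())
    return f"__label__{safe_label} {clean_text}"
```

For example, `to_fasttext_line("greeting", "Hi,  thanks for\ncalling")` gives `__label__greeting Hi, thanks for calling`, and writing one such line per utterance to a text file yields a training file in the expected layout.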
Tag: Nlp
using fastText to classify phone call dialogue
This is the first part of using fastText to tag lines of dialogue from sales-call transcripts. fastText is an open-source library for supervised and unsupervised learning of text representations and classifiers, and the thing I like about it is how fast and simple it is to get a model going compared to something like sklearn or spaCy. Here I walk through setting it up, getting the call data into the format it wants, training a supervised model, and a few things that helped the results.
Tag: Textclassification
using fastText to classify phone call dialogue
This is the first part of using fastText to tag lines of dialogue from sales-call transcripts. fastText is an open-source library for supervised and unsupervised learning of text representations and classifiers, and the thing I like about it is how fast and simple it is to get a model going compared to something like sklearn or spaCy. Here I walk through setting it up, getting the call data into the format it wants, training a supervised model, and a few things that helped the results.
Tag: Docker
using kubernetes jobs for one-off ingestion of CSVs
This post comes from wanting a repeatable way to get a one-off CSV into Postgres as the first step of a machine learning pipeline, with everything running on Kubernetes locally. I set up a local Kubernetes cluster with Docker for Mac, install Postgres and the dashboard with Helm, and then use a Kubernetes job with pgfutter to ingest the CSV into the database. It is overkill for a single CSV, but the point is to have a standardized pipeline I can reuse and later extend to model training.
TensorFlow 0.7.0 Dockerfile with Python 3
In 2015, Google released TensorFlow, a new deep learning framework and tensor library that was similar in many ways to Theano. I enjoy using it a lot more than Theano, mostly because of Theano’s long compile times when using Keras, and because of TensorBoard. This post will not go into detail about using Theano, TensorFlow, or Keras. It is about how I built a Docker image for a slightly older NVIDIA card that, for my purposes, can use multiple GPUs in isolation and run a model on one card without affecting the other.
Tag: Kubernetes
using kubernetes jobs for one-off ingestion of CSVs
This post comes from wanting a repeatable way to get a one-off CSV into Postgres as the first step of a machine learning pipeline, with everything running on Kubernetes locally. I set up a local Kubernetes cluster with Docker for Mac, install Postgres and the dashboard with Helm, and then use a Kubernetes job with pgfutter to ingest the CSV into the database. It is overkill for a single CSV, but the point is to have a standardized pipeline I can reuse and later extend to model training.
Tag: Postgres
using kubernetes jobs for one-off ingestion of CSVs
This post comes from wanting a repeatable way to get a one-off CSV into Postgres as the first step of a machine learning pipeline, with everything running on Kubernetes locally. I set up a local Kubernetes cluster with Docker for Mac, install Postgres and the dashboard with Helm, and then use a Kubernetes job with pgfutter to ingest the CSV into the database. It is overkill for a single CSV, but the point is to have a standardized pipeline I can reuse and later extend to model training.
Tag: Kaggle
downloading files from kaggle
Often it can be a hassle to get files from Kaggle onto a remote server. This is a quick walkthrough of a script I wrote using RoboBrowser to log into Kaggle and download a competition’s files from the command line. You could also use this for other sites where you need to log in with a username/password to download a file.
Tag: Bot
telegram bot
This post is my personal introduction to using Telegram and Bluemix. I wanted to set up a simple bot in Python, run it in a Docker container, and deploy it on Bluemix. It is super simple, but the intent is to show how to do the basics before integrating peripheral APIs and other processes.
Tag: Telegram
telegram bot
This post is my personal introduction to using Telegram and Bluemix. I wanted to set up a simple bot in Python, run it in a Docker container, and deploy it on Bluemix. It is super simple, but the intent is to show how to do the basics before integrating peripheral APIs and other processes.
Tag: Dockerfile
TensorFlow 0.7.0 Dockerfile with Python 3
In 2015, Google released TensorFlow, a new deep learning framework and tensor library that was similar in many ways to Theano. I enjoy using it a lot more than Theano, mostly because of Theano’s long compile times when using Keras, and because of TensorBoard. This post will not go into detail about using Theano, TensorFlow, or Keras. It is about how I built a Docker image for a slightly older NVIDIA card that, for my purposes, can use multiple GPUs in isolation and run a model on one card without affecting the other.
Tag: Python3
TensorFlow 0.7.0 Dockerfile with Python 3
In 2015, Google released TensorFlow, a new deep learning framework and tensor library that was similar in many ways to Theano. I enjoy using it a lot more than Theano, mostly because of Theano’s long compile times when using Keras, and because of TensorBoard. This post will not go into detail about using Theano, TensorFlow, or Keras. It is about how I built a Docker image for a slightly older NVIDIA card that, for my purposes, can use multiple GPUs in isolation and run a model on one card without affecting the other.
Tag: Tensorflow
TensorFlow 0.7.0 Dockerfile with Python 3
In 2015, Google released TensorFlow, a new deep learning framework and tensor library that was similar in many ways to Theano. I enjoy using it a lot more than Theano, mostly because of Theano’s long compile times when using Keras, and because of TensorBoard. This post will not go into detail about using Theano, TensorFlow, or Keras. It is about how I built a Docker image for a slightly older NVIDIA card that, for my purposes, can use multiple GPUs in isolation and run a model on one card without affecting the other.
Tag: Deeplearning
using generative neural nets in keras to create ‘on-the-fly’ dialogue
This is an old walkthrough of using Keras to train a character-level LSTM on YouTube captions, then using speech-to-text to generate a possible next line of dialogue. The code is no longer current, but it shows the basic flow I was playing with: collect sales-call subtitles, train a small generative model, speak a prompt, and have the model suggest what could come next.
Tag: Keras
using generative neural nets in keras to create ‘on-the-fly’ dialogue
This is an old walkthrough of using Keras to train a character-level LSTM on YouTube captions, then using speech-to-text to generate a possible next line of dialogue. The code is no longer current, but it shows the basic flow I was playing with: collect sales-call subtitles, train a small generative model, speak a prompt, and have the model suggest what could come next.
Tag: Rnn
using generative neural nets in keras to create ‘on-the-fly’ dialogue
This is an old walkthrough of using Keras to train a character-level LSTM on YouTube captions, then using speech-to-text to generate a possible next line of dialogue. The code is no longer current, but it shows the basic flow I was playing with: collect sales-call subtitles, train a small generative model, speak a prompt, and have the model suggest what could come next.