sending objects/files over websockets
This post comes from me messing around with pushing a Python object (and later a whole project folder) over a socket to a remote server and then running it there. I thought this would be useful because I wanted to iterate on a model on a few different remote machines (e.g. a GPU server and a faster CPU server) without having to commit and push every change. I found that while it works alright for a single file, it becomes messy once a project spans multiple files or grows in layout and dependency complexity.
Updates
1/24/23 Ran into this library cloudpickle and it reminded me of this post.
3/18/21: Found this outrun which seems to be basically a better and more thought out way to do a lot of this.
Introduction
Recently I became interested in better workflows for machine learning projects. One of the main issues I’ve experienced is developing locally without being able to easily test locally. Usually this is because my local computer is slower and older than the models and datasets I work with call for, so early code decisions that run fine on the training server perform poorly on it. Another case is when the remote server sits behind a VPN and a relay hop, which adds noticeable latency between saving a file and running it when I use something such as VSCode (although you can try something such as a ProxyJump in your ssh config). Regardless of the problem, I have yet to find a workflow that I find both versatile and simple.
One of my recent ideas was to push a Python object over a socket to a remote server and execute it there (the examples below use plain asyncio TCP streams rather than the WebSocket protocol). I’m sure anyone reading this will understand that it is “unsafe” in many regards, since the server runs whatever it receives. Regardless, I think being able to quickly (with a single command or hotkey) run a model while you are iterating on it, without having to commit it first (one of the other frequent workflows I have seen), is incredibly useful.
Implementation
To implement this idea, you will need the dill package. dill serializes a Python object, such as a class instance or a function, into bytes; the client then sends those bytes over TCP, and the server deserializes the object and runs it.
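As a rough illustration of the serialization step on its own, with no network involved, here is a minimal round trip. The `Greeter` class and `round_trip` helper are made up for this sketch, and it falls back to the standard-library pickle (which has the same dumps/loads interface) if dill is not installed.

```python
import pickle

try:
    import dill as serializer  # third-party: pip install dill
except ImportError:
    serializer = pickle  # fallback so the sketch still runs without dill


class Greeter:  # hypothetical class, stands in for a model or job
    def __init__(self, name):
        self.name = name

    def run(self):
        return f"hello from {self.name}"


def round_trip(obj):
    payload = serializer.dumps(obj)    # object -> bytes (what the client would send)
    clone = serializer.loads(payload)  # bytes -> object (what the server would do)
    return payload, clone


payload, clone = round_trip(Greeter("client"))
print(type(payload).__name__, clone.run())
```

Everything between `dumps` and `loads` is just bytes, so the transport can be anything that moves bytes from one machine to another.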
Here is the basic example for the client:
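import asyncio
import os

import dill

server_url = os.environ.get("SERVER_URL", "127.0.0.1")
# Environment variables arrive as strings, so cast the port to int.
server_port = int(os.environ.get("SERVER_PORT", 8888))


class Foo:
    def __init__(self, val=10):
        self.val = val

    def run(self):
        print(f"running Foo.run() with val={self.val}")


async def tcp_client():
    obj = Foo(val=1)
    # recurse=True makes dill trace and pickle only the globals the object refers to.
    data = dill.dumps(obj, recurse=True)
    reader, writer = await asyncio.open_connection(server_url, server_port)
    writer.write(data)
    await writer.drain()
    # Closing the connection signals EOF, which is what the server waits for.
    writer.close()
    await writer.wait_closed()


if __name__ == "__main__":
    asyncio.run(tcp_client())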
and then the server:
import asyncio
import os

import dill

server_url = os.environ.get("SERVER_URL", "127.0.0.1")
server_port = int(os.environ.get("SERVER_PORT", 8888))


async def handle_func(reader, writer):
    # Read until the client closes its side of the connection (EOF).
    data = await reader.read(-1)
    # Unpickling can execute arbitrary code: only accept data from clients you trust.
    obj = dill.loads(data)
    addr = writer.get_extra_info("peername")
    print(f"got obj: {obj} - addr: {addr}")
    obj.run()
    writer.close()
    await writer.wait_closed()


async def main():
    server = await asyncio.start_server(handle_func, server_url, server_port)
    addr = server.sockets[0].getsockname()
    print(f"Serving on {addr}")
    async with server:
        await server.serve_forever()


asyncio.run(main())
Most of this is similar to the example in the official Python docs. While this works, it won’t work so well once your model/project goes beyond one file, as serializing the related files and loading them on the remote was not something I was able to figure out. I’m still very interested in this, but from looking at marshal, pyro5, dill, etc. I was not able to get it to work correctly if, for instance, you have a run.py and a model.py.
After messing around with this and having issues getting dill to work with a Python class defined outside of the file being run, I wanted to see if it was feasible with another method.
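A likely explanation for the run.py/model.py trouble, as far as I understand dill and pickle: a class that lives in an importable module is stored by reference (module name plus class name) rather than by value, so the remote can only rebuild it if the same module is importable there. The sketch below shows that behaviour with the standard-library pickle as a stand-in for dill; `remote_demo_model` and `demo_pickle_by_reference` are names invented for the illustration.

```python
import importlib
import os
import pickle
import sys
import tempfile

MODULE_SOURCE = (
    "class Model:\n"
    "    def run(self):\n"
    "        return 'method-body-marker'\n"
)


def demo_pickle_by_reference():
    with tempfile.TemporaryDirectory() as tmp:
        # Write a throwaway module that plays the role of model.py.
        with open(os.path.join(tmp, "remote_demo_model.py"), "w") as f:
            f.write(MODULE_SOURCE)
        sys.path.insert(0, tmp)
        try:
            module = importlib.import_module("remote_demo_model")
            payload = pickle.dumps(module.Model())
        finally:
            # Simulate the remote machine: the module is no longer importable.
            sys.path.remove(tmp)
            sys.modules.pop("remote_demo_model", None)

    # The payload carries the names needed to look the class up again...
    has_names = b"remote_demo_model" in payload and b"Model" in payload
    # ...but none of the class body.
    has_body = b"method-body-marker" in payload
    try:
        pickle.loads(payload)
        error = None
    except ModuleNotFoundError as exc:
        error = exc
    return has_names, has_body, error


print(demo_pickle_by_reference())
```

If that is indeed the cause, either the remote needs its own copy of model.py or the serializer has to be told to ship the module by value. cloudpickle, mentioned in the updates, documents a `register_pickle_by_value` function for this case, which I have not tried here.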
Sending a folder
Instead, the alternative solution for a bigger project is to create a tar of the project, send the tar over the socket, and unpack and run it on the server. At this point the transport is not so important; the only advantage of a socket over something like HTTP may be piping results/inputs back and forth.
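As an aside on piping results back: the examples in this post signal "end of upload" by closing the connection, which leaves no way to reply. One common alternative, sketched here under my own assumptions rather than taken from the code in this post, is to prefix each message with its length so both sides can keep the connection open. The helper names `send_frame`, `recv_frame`, and `round_trip` are hypothetical.

```python
import asyncio
import struct

HEADER = struct.Struct("!I")  # 4-byte big-endian payload length


async def send_frame(writer, payload: bytes) -> None:
    # Length prefix first, then the payload itself.
    writer.write(HEADER.pack(len(payload)) + payload)
    await writer.drain()


async def recv_frame(reader) -> bytes:
    # Read exactly one header, then exactly as many bytes as it announces.
    (size,) = HEADER.unpack(await reader.readexactly(HEADER.size))
    return await reader.readexactly(size)


async def handle(reader, writer):
    payload = await recv_frame(reader)
    # Stand-in for "unpack and run the job": just report what arrived.
    await send_frame(writer, b"received %d bytes" % len(payload))
    writer.close()
    await writer.wait_closed()


async def round_trip(payload: bytes) -> bytes:
    # Port 0 lets the OS pick a free port for this local demo.
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        await send_frame(writer, payload)
        reply = await recv_frame(reader)
        writer.close()
        await writer.wait_closed()
    return reply


if __name__ == "__main__":
    print(asyncio.run(round_trip(b"x" * 5000)))
```

The same two helpers could carry a tar archive in one direction and captured output in the other.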
This time the client looks like so:
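import asyncio
import os
import tarfile

server_url = os.environ.get("SERVER_URL", "127.0.0.1")
server_port = int(os.environ.get("SERVER_PORT", 8888))


async def tcp_send_folder(tar_path):
    # Read the whole archive into memory; fine for small projects.
    with open(tar_path, "rb") as f:
        data = f.read()
    reader, writer = await asyncio.open_connection(server_url, server_port)
    writer.write(data)
    await writer.drain()
    # Closing the connection tells the server the upload is complete.
    writer.close()
    await writer.wait_closed()


def create_tarfile(folder):
    tar_path = "out/client/send.tar.gz"
    # tarfile.open does not create missing parent directories.
    os.makedirs(os.path.dirname(tar_path), exist_ok=True)
    with tarfile.open(tar_path, mode="w:gz") as tar:
        tar.add(folder)
    return tar_path


if __name__ == "__main__":
    asyncio.run(tcp_send_folder(create_tarfile("src")))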
and the server as such:
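import asyncio
import os
import tarfile

server_url = os.environ.get("SERVER_URL", "127.0.0.1")
server_port = int(os.environ.get("SERVER_PORT", 8888))


def run_folder(tar_path):
    # extractall trusts the paths stored in the archive: only use it on archives you created.
    with tarfile.open(tar_path) as tar:
        tar.extractall(path="out/run")
    # This import resolves only if the directory that contains out/ is on sys.path
    # (for example, when the server script lives in and is started from that directory).
    from out.run.src import main

    main.run()


async def handle_folder(reader, writer):
    addr = writer.get_extra_info("peername")
    outfile = "out/server/out.tar.gz"
    os.makedirs(os.path.dirname(outfile), exist_ok=True)
    with open(outfile, "wb") as f:
        # Copy the upload to disk in 1 KiB chunks until the client closes the connection.
        while True:
            data = await reader.read(1024)
            if not data:
                break
            f.write(data)
    writer.close()
    await writer.wait_closed()
    run_folder(outfile)
    print(f"done running upload from {addr}...")


async def main():
    server = await asyncio.start_server(handle_folder, server_url, server_port)
    addr = server.sockets[0].getsockname()
    print(f"Serving on {addr}")
    async with server:
        await server.serve_forever()


asyncio.run(main())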
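One limitation of importing the uploaded code inside the server process is that Python caches imports, so a long-running server would keep executing the first version of `main` it loaded. A possible workaround, sketched here as an assumption and not something tested against the server above, is to extract each upload into a scratch directory and run it in a fresh interpreter. `run_archive` is a hypothetical helper, and it assumes the archive contains `src/main.py` with a `run()` function, as in the example.

```python
import subprocess
import sys
import tarfile
import tempfile


def run_archive(tar_path):
    """Extract tar_path into a scratch directory and call src.main.run() in a new process."""
    with tempfile.TemporaryDirectory() as workdir:
        with tarfile.open(tar_path) as tar:
            # Only for archives you created yourself: extractall trusts member paths.
            tar.extractall(path=workdir)
        result = subprocess.run(
            # -c puts the working directory on sys.path, so "src" is importable.
            [sys.executable, "-c", "from src import main; main.run()"],
            cwd=workdir,
            capture_output=True,
            text=True,
        )
    return result.returncode, result.stdout
```

Because every upload gets its own process, a crash or a stale import in one run does not affect the next, and the captured output could be sent back to the client.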
Other
It’s hard to say how useful this is, although some ML/AI researchers may find it helpful for dispatching experiments if they have a workflow that allows it. I know of other tooling that is directionally similar, such as DVC CML, but it often relies on creating a git commit and pushing to a remote that then runs the experiments. Other possible ways to achieve this include building an executable with pyinstaller/cx_freeze (slow) or making the core functionality serializable (not ideal), but I have yet to find a best way to do this.