I'm working on a containerized API where, in two steps (one asynchronous and one synchronous), the user can interact with the output generated by the first step through a front-end service.
I wasn't entirely sure whether to post this here or on SO, but since the question is more about the design than the implementation I decided to post it here; let me know if you think it's better suited for SO.
The flow looks as follows. The user has requested some asynchronous job, for which they received a unique identifier.

1. The job produces a model that the user wants to test.
2. They send a POST request to a front-end service with the unique identifier and the custom data (JSON) they want to test the model with.
3. The front-end service starts a Kubernetes Job (see the sketch after this list).
4. The job's Init Container requests the model.
5. The job's main container loads the model.
6. The front-end service somehow sends a compute request with the user-supplied JSON to the job's main container.
7. The job's main container performs a computation with the model and data. The response is passed on to the user via the front-end service.
8. The pod stays up for some time, so that similar requests for the same model don't require spinning up a new pod every time.
9. After some time the pod shuts down, finishing the job.
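To make steps 3–5 concrete, the front-end would start the Job roughly like this. This is just a minimal sketch using the Python `kubernetes` client; the image names, labels, and the shared `emptyDir` volume are placeholders, not the actual setup.

```python
# Sketch of step 3: the front-end creates a per-request Job whose Init
# Container fetches the model (step 4) and whose main container loads and
# serves it (step 5). Image names and labels are placeholders.
from kubernetes import client, config

def start_model_job(model_id: str, namespace: str = "default") -> str:
    config.load_incluster_config()        # the front-end runs inside the cluster
    job_name = f"model-{model_id}"

    job = {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": job_name, "labels": {"model-id": model_id}},
        "spec": {
            "template": {
                "metadata": {"labels": {"model-id": model_id}},
                "spec": {
                    "restartPolicy": "Never",
                    # Init Container: fetch the model onto a shared volume
                    "initContainers": [{
                        "name": "fetch-model",
                        "image": "registry.example.com/model-fetcher:latest",
                        "args": ["--model-id", model_id],
                        "volumeMounts": [{"name": "model", "mountPath": "/model"}],
                    }],
                    # Main container: load the model and handle compute requests
                    "containers": [{
                        "name": "model-server",
                        "image": "registry.example.com/model-server:latest",
                        "volumeMounts": [{"name": "model", "mountPath": "/model"}],
                    }],
                    "volumes": [{"name": "model", "emptyDir": {}}],
                },
            },
        },
    }
    client.BatchV1Api().create_namespaced_job(namespace=namespace, body=job)
    return job_name
```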
I'm having trouble with step 6 (and, by extension, step 8). As far as I know, pods created by a Job can't be reached through a Service. And even if that were possible, multiple requests for different models can occur concurrently, so the service would have to dynamically differentiate between the pods.
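To illustrate what I mean by "dynamically differentiate": as far as I can tell, the front-end would have to create a Service per job, keyed on that job's labels, something like the sketch below. The `job-name` label is the one Kubernetes adds to a Job's pods; the names and ports are placeholders.

```python
# Sketch of one Service per Job, selecting on the `job-name` label that
# Kubernetes adds to pods created by a Job. Assumes config has already been
# loaded (see the previous sketch); names and ports are placeholders.
from kubernetes import client

def create_job_service(job_name: str, namespace: str = "default") -> None:
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": job_name},                   # one Service per job
        "spec": {
            "selector": {"job-name": job_name},           # matches only that job's pod(s)
            "ports": [{"port": 80, "targetPort": 8080}],  # model-server's HTTP port
        },
    }
    client.CoreV1Api().create_namespaced_service(namespace=namespace, body=service)
```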
In the first iteration of this project the back-end container was able to dynamically load new models, but after review it was decided that this was not desirable. So now, in order to load a new model, the container has to be restarted, with the Init Container retrieving the correct data.
My first thought was to let the back-end job send a request to retrieve the data itself (see the sketch after this list), but that creates several problems:
1. The front-end service has to store the request JSON in a database, even though it's only read once, because the back-end's request can be routed to a different front-end pod.
2. How would the job know to request new data? (step 8)
3. How are the results sent to the user?
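For reference, this is roughly what I had in mind, seen from the job's main container. The `/input` and `/result` endpoints on the front-end are hypothetical; the problems above show up in the comments.

```python
# Sketch of the first idea: the main container pulls the request JSON from the
# front-end, computes, and pushes the result back. Endpoints are hypothetical.
import os
import time
import requests

FRONTEND = os.environ["FRONTEND_URL"]   # e.g. http://frontend.default.svc (placeholder)
JOB_ID = os.environ["JOB_ID"]


def compute(data: dict) -> dict:
    """Placeholder for the real computation with the loaded model (step 7)."""
    return {"echo": data}


def main() -> None:
    while True:
        # Problem 2: the job has no way of knowing *when* new data is
        # available for it, so all it can do is poll.
        resp = requests.get(f"{FRONTEND}/jobs/{JOB_ID}/input", timeout=10)
        if resp.status_code == 404:
            time.sleep(5)
            continue

        result = compute(resp.json())

        # Problem 3: the result has to travel back through the front-end,
        # which may be a different pod than the one that received the input
        # (hence problem 1: the input has to live in a shared database).
        requests.post(f"{FRONTEND}/jobs/{JOB_ID}/result", json=result, timeout=10)


if __name__ == "__main__":
    main()
```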
My second thought was to forgo steps 8 and 9, let the job run to completion, and have the front-end poll the job status and, once it's finished, read the logs. At least, that's how the example in the Job documentation does it. This would mean that the job's logs must be reserved for output, though, which seems like bad design.
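The front-end side of that would look roughly like the following sketch (Python `kubernetes` client; the namespace, timeout and polling interval are placeholders).

```python
# Sketch of the second idea: wait for the Job to complete, then read the pod's
# logs as the "output".
import time
from kubernetes import client, config

def wait_and_read_logs(job_name: str, namespace: str = "default", timeout: int = 300) -> str:
    config.load_incluster_config()
    batch = client.BatchV1Api()
    core = client.CoreV1Api()

    deadline = time.time() + timeout
    while time.time() < deadline:
        job = batch.read_namespaced_job(name=job_name, namespace=namespace)
        if job.status.succeeded:
            break
        time.sleep(2)
    else:
        raise TimeoutError(f"job {job_name} did not finish in time")

    # Pods created by a Job carry a `job-name` label, so they can be found that way.
    pods = core.list_namespaced_pod(namespace=namespace,
                                    label_selector=f"job-name={job_name}")
    # The entire log stream is the result, which is what feels like bad design.
    return core.read_namespaced_pod_log(name=pods.items[0].metadata.name,
                                        namespace=namespace)
```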
We could build on this, though, and write to the database instead of to the logs. This shares problem 1 of my first idea, in the sense that the database will contain read-once data, but so far it seems to be the only workable solution.
What are your thoughts? Is this the way to go, or do you perhaps see an entirely different way to encapsulate this behavior?