-1

In my machine learning pipeline, I have all the arguments collected into a dictionary.

args = {'save_model': True,
        'learning_rate': 0.01,
        'batch_size': 4,
        'model': 'my_model',
        'momentum': 0.9,
        'random_brighness': 0.5,
        'random_flipping': 0.5,
        ...}

Then I have a bunch of functions that take the entire args dictionary as an input. Each of these only uses a small subset of all the arguments. Is there anything wrong with this design?

model = get_model(args)
data = get_data(args)
transformed_data = transform_data(data, args)

From the perspective of the function, it looks like:

def get_trainer(args):
    loss_function = args['loss_function']
    optimizer = args['optimizer']
    class_weights = args['class_weights']
    ...

versus:

def get_trainer(loss_function, optimizer, class_weights):
    ...
jss367

1 Answer

4

The main problem I have with that choice is that it gives up one of the benefits of the type system: validation at compile/design time rather than at runtime. When all your functions take a dictionary, they A) do not tell the reader of the code which parameters they need, and B) do not allow the compiler, interpreter, or IDE to validate function calls.
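To make point A concrete, here is a minimal sketch using the standard `inspect` module. The two function names are hypothetical stand-ins for the `get_trainer` variants in the question, and their bodies are left empty on purpose; only the signatures matter here.

```python
import inspect


def get_trainer_dict(args):
    # Hypothetical dict-based variant: the signature says nothing about
    # which keys the body will read.
    ...


def get_trainer_explicit(loss_function, optimizer, class_weights):
    # Hypothetical explicit variant: the signature is the documentation.
    ...


# Parameter names as seen by a reader, an IDE, or any tooling that
# inspects the function without running it.
dict_params = list(inspect.signature(get_trainer_dict).parameters)
explicit_params = list(inspect.signature(get_trainer_explicit).parameters)

print(dict_params)      # ['args']
print(explicit_params)  # ['loss_function', 'optimizer', 'class_weights']
```

Anything that works from the signature (autocomplete, generated docs, static checkers) sees only `args` for the first function.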

With explicit arguments, if you add or remove needed parameters at some point, you, the interpreter and your IDE can easily find the call sites that don't match the new signature. With a generic dict, you have to remember and check them yourself or wait for a runtime error at some point.
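A small sketch of where the failure surfaces in each style. The function names, bodies, and values below are hypothetical, loosely based on the `get_trainer` example in the question.

```python
def get_trainer_dict(args):
    # Hypothetical dict-based variant: required keys are only visible
    # in the body.
    loss_function = args['loss_function']
    optimizer = args['optimizer']
    class_weights = args['class_weights']
    return loss_function, optimizer, class_weights


def get_trainer_explicit(loss_function, optimizer, class_weights):
    # Hypothetical explicit variant.
    return loss_function, optimizer, class_weights


# 'class_weights' was forgotten at the call site.
args = {'loss_function': 'cross_entropy', 'optimizer': 'sgd'}

# Dict version: the call is accepted, and the KeyError appears only when
# the body reaches the missing key, possibly far from the call site.
try:
    get_trainer_dict(args)
except KeyError as err:
    dict_error = f"KeyError: {err}"

# Explicit version: the call is rejected before the body runs, and an IDE
# or static checker can flag the same call without executing anything.
try:
    get_trainer_explicit(loss_function='cross_entropy', optimizer='sgd')
except TypeError as err:
    explicit_error = f"TypeError: {err}"

print(dict_error)
print(explicit_error)
```

In the explicit version the error message names the missing parameter and points at the call, which is also the place that needs fixing.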

This is equivalent, in more statically typed languages, to defining all arguments as "object" or "Any": it saves you the time of defining types, but pushes errors and validation to runtime, which is always costlier.

  • I do the checking at runtime. Just a wrapper that falls into the debugger and lists all keys that were present that the function didn’t ask about. – gnasher729 Jan 22 '21 at 20:58
  • For example, if a function inquired about “random_brightness” and handled its absence correctly, my wrapper would fall into the debugger and tell the user about the “random_brighness” key. – gnasher729 Jan 23 '21 at 15:02
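A rough sketch of the kind of runtime wrapper these comments describe. This is not gnasher729's actual code: `TrackingDict`, `UnusedKeysError`, `report_unused_keys`, and `augment` are all hypothetical names, and the sketch raises an exception where the comment mentions dropping into the debugger (calling `breakpoint()` at that point would be the closer equivalent). It also assumes each function receives only the keys intended for it; with one shared pipeline-wide dict, every function would report many unused keys.

```python
import functools


class UnusedKeysError(Exception):
    """Raised when a function leaves some supplied keys unread."""

    def __init__(self, func_name, unused):
        super().__init__(f"{func_name} never read: {unused}")
        self.unused = unused


class TrackingDict(dict):
    """Dict that remembers which keys were looked up."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.accessed = set()

    def __getitem__(self, key):
        self.accessed.add(key)
        return super().__getitem__(key)

    def get(self, key, default=None):
        # dict.get does not go through __getitem__, so track it here too.
        self.accessed.add(key)
        return super().get(key, default)


def report_unused_keys(func):
    """Wrap func(args) and complain about keys it never asked for."""

    @functools.wraps(func)
    def wrapper(args):
        tracked = TrackingDict(args)
        result = func(tracked)
        unused = sorted(set(tracked) - tracked.accessed)
        if unused:
            raise UnusedKeysError(func.__name__, unused)
        return result

    return wrapper


@report_unused_keys
def augment(args):
    # Handles a missing key "correctly", which would hide the typo.
    return args.get('random_brightness', 0.0)


try:
    augment({'random_brighness': 0.5})
except UnusedKeysError as err:
    unused = err.unused

print(unused)  # ['random_brighness']
```

This catches the misspelled key, but only when the function actually runs, which is the runtime-versus-design-time trade-off the answer describes.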