
Implementation and example for EVM code are very complicated and almost impossible to understand #59

@siebenkopf

Description

I have some doubts about the implementation of the EVM code, which are the following:

  1. Most importantly, the return types are neither standard nor intuitive:
    a) Instead of turning the function into a generator, you should rather return the values in a useful data type (see next point).
    b) Instead of returning a (nested) 2-element tuple, I would rather return a (nested) dictionary with the first element of the tuple as key, and the second as value.
    c) When returning from the inference function, the first element is always the same string, "probs":
    yield ("probs", (batch_to_process, probs))

    There is no reason to provide this as a result; it will be ignored by any function that calls this one.

I do understand why you are returning it this way -- to be able to call it in parallel processes -- but handling dictionaries in the calling function is at least as simple.
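To illustrate point 1b, here is a minimal sketch of how the yielded tuples could be grouped into a nested dictionary in the calling function. It assumes the generator yields `("probs", (batch, probs))` tuples as quoted above; `collect_results` and `fake_evm_training` are hypothetical names, not part of the repository:

```python
def collect_results(generator):
    """Group yielded (key, (batch, value)) tuples into a nested dict."""
    results = {}
    for key, (batch, value) in generator:
        results.setdefault(key, {})[batch] = value
    return results

def fake_evm_training():
    # Stand-in for the real generator, just to make the sketch runnable.
    yield ("probs", (0, [0.1, 0.9]))
    yield ("probs", (1, [0.8, 0.2]))

results = collect_results(fake_evm_training())
# results == {"probs": {0: [0.1, 0.9], 1: [0.8, 0.2]}}
```

This keeps the generator-based implementation intact while giving callers a plain dictionary to work with.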

  2. Providing input parameters in an argument-parser structure is very uncommon:
    Any function that wants to use your functions will need to either take your provided arg parser or replicate its structure. The more common way is to have separate parameters with proper default values. Then you can collect your parameters in a dictionary and pass them to the function, such as:
params = dict(tailsize=0.1, distance_metric="euclidean")
results = EVM_Training(..., **params)
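A minimal sketch of what such a signature could look like. The parameter names mirror the example above, and the function body is a stand-in, not the real implementation:

```python
def evm_training(features, labels, tailsize=0.1,
                 distance_metric="euclidean", chunk_size=100):
    # Stand-in body: a real implementation would train the model here.
    # With explicit keyword parameters, callers no longer need argparse.
    return {"tailsize": tailsize, "distance_metric": distance_metric,
            "chunk_size": chunk_size}

params = dict(tailsize=0.25, distance_metric="cosine")
config = evm_training(None, None, **params)
```

A command-line front end can still forward parsed arguments via something like `evm_training(features, labels, **vars(parser.parse_args()))`, so nothing is lost for script users.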
  3. As far as I understood your code, the chunk size is based on the number of samples, of which there can be many per class. However, the concatenation of features is tested for equality with the chunk size:

    if len(temp) == args.chunk_size:

    This test might never be met when more samples are added to the collection at once than the defined chunk size.
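A small self-contained sketch of the boundary issue and one possible fix (all names are illustrative, not taken from the repository): when a whole batch of samples is appended at once, the length can jump past an exact `==` check, whereas testing `>=` and slicing off full chunks guarantees that no flush is missed:

```python
def chunked(class_batches, chunk_size):
    """Yield fixed-size chunks even when batches of samples arrive at once."""
    temp = []
    for batch in class_batches:          # each batch may hold many samples
        temp.extend(batch)
        while len(temp) >= chunk_size:   # an `==` test here could be skipped
            yield temp[:chunk_size]
            temp = temp[chunk_size:]
    if temp:                             # flush the remainder
        yield temp

chunks = list(chunked([[1, 2, 3], [4, 5], [6, 7, 8, 9]], chunk_size=4))
# chunks == [[1, 2, 3, 4], [5, 6, 7, 8], [9]]
```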

  4. Since Example.ipynb is supposed to serve as an example of how to use your functions, there are many parts that are very difficult to understand: https://github.com/Vastlab/vast/blob/2ad7fa23cb8f0f1acb71d6b6d9aced6a70562e0c/vast/opensetAlgos/Example.ipynb
    a) Cell 2 uses some pre-defined data that is downloaded from your web page, but it is nowhere explained what this data contains.
    b) Line 2 in cell 3 contains a nested list, map, list, lambda, apply construction that is impossible to follow, especially when we do not know what the data looks like.
    c) Cell 10 processes the complicated (see my point 1 above) output of your EVM_Training function using a combination of dict, list, zip, *, [1], which does not help in any way to understand how to handle the results of the training.
    d) Cell 11 contains a cat, list, dict, list, zip, *, [1], values combination. I have no idea what this is doing.

I would highly recommend changing the implementation to be more user-friendly and providing an example that showcases how to use your functions with simple Python, i.e., without any of cat, zip, *, [1], values, list, dict, and especially not several of them combined in one line (as far as possible).
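To make the point concrete, here is an illustration with made-up data (I do not claim these are the exact expressions from cells 10 and 11) of how such a dense one-liner compares to an explicit loop producing the same result:

```python
# Made-up data in the shape the training generator yields.
pairs = [("probs", (0, [0.1, 0.9])), ("probs", (1, [0.8, 0.2]))]

# Dense version: hard to follow without knowing the data layout.
dense = dict(list(zip(*pairs))[1])

# Explicit version: same result, readable at a glance.
readable = {}
for _key, (batch, probs) in pairs:
    readable[batch] = probs
```

Both produce `{0: [0.1, 0.9], 1: [0.8, 0.2]}`, but only the loop tells the reader what the data actually looks like.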
