Skip to content

Conversation

@henrikjacobsenfys
Copy link
Member

@henrikjacobsenfys henrikjacobsenfys commented Oct 17, 2025

Adding SampleModel which can contain ModelComponents and calculate the model at given x, with and without detailed balance.

@henrikjacobsenfys henrikjacobsenfys added [scope] enhancement Adds/improves features (major.MINOR.patch) [priority] high Should be prioritized soon labels Oct 17, 2025
@henrikjacobsenfys henrikjacobsenfys added this to the First release milestone Oct 17, 2025
@henrikjacobsenfys
Copy link
Member Author

henrikjacobsenfys commented Oct 17, 2025

Code coverage is not 100%, but I think all the essentials are there, and I'd rather get feedback on the important things than polish tests that may need to change anyway :)

Edit: apparently codecov is upset about the missing tests, so I added tests and discovered a minor bug in the process.

@codecov
Copy link

codecov bot commented Oct 17, 2025

Codecov Report

❌ Patch coverage is 99.66443% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 97.67%. Comparing base (01b9490) to head (23c8660).

Files with missing lines Patch % Lines
.../easydynamics/sample_model/component_collection.py 99.08% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop      #56      +/-   ##
===========================================
+ Coverage    96.60%   97.67%   +1.06%     
===========================================
  Files           12       13       +1     
  Lines          412      644     +232     
===========================================
+ Hits           398      629     +231     
- Misses          14       15       +1     
Flag Coverage Δ
unittests 97.67% <99.66%> (+1.06%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@henrikjacobsenfys henrikjacobsenfys changed the title A Sample Model Oct 17, 2025
Copy link
Member

@rozyczko rozyczko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

part 1 after a quick look

Numeric = Union[float, int]


class SampleModel(ObjBase, MutableMapping):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't you inherit from TheoreticalModelBase or at least from BaseObjectCollection?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably? I'm not exactly sure

params = []
for comp in self.components.values():
params.extend(comp.get_parameters())
return params
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or just

    from itertools import chain
    
    # Create generator for temperature parameter
    temp_params = (self._temperature,) if self._temperature is not None else ()
    
    # Create generator for component parameters
    comp_params = (param for comp in self.components.values() 
                   for param in comp.get_parameters())
    
    # Chain them together and return as list
    return list(chain(temp_params, comp_params))

(slight teaching aspect for itertools - a VERY important python module)

if is_not_fixed and is_independent:
fit_parameters.append(parameter)

return fit_parameters
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, not a bad loop (easy to see what's being done) but if you want to make it more "modular" you could use list comprehension, e.g.

    def is_fit_parameter(param: Parameter) -> bool:
        """Check if a parameter can be used for fitting."""
        return (not getattr(param, "fixed", False) and 
                getattr(param, "_independent", True))
    
    return [param for param in self.get_parameters() if is_fit_parameter(param)]

@henrikjacobsenfys
Copy link
Member Author

@rozyczko Would you have another look when you have time? I'm working on other things the rest of the day and am away tomorrow, so no rush.

Copy link
Member

@rozyczko rozyczko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more comments after pondering on the file for too long ;)

"""

CollectionBase.__init__(self, name=name)
TheoreticalModelBase.__init__(self, name=name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be just

super().__init__(name=name)

to properly handle multiple inheritances. Otherwise it violates MRO best practices and makes maintenance more difficult (by being too explicit)

-------
List[ModelComponent]
"""
return list(self)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The class mixes direct list access (list(self)) with a components property. Pick one approach:

  • Either make components the primary interface and remove list(self) calls
  • Or fully embrace the collection interface and remove the components property

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I will embrace the collection

Name to assign to the component. If None, uses the component's own name.
"""
if not isinstance(component, ModelComponent):
raise TypeError("component must be an instance of ModelComponent.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe uppercase Component since it is the start of the sentence?


# Add initial components if provided. Mostly used for serialization.
if data:
# Just to be safe
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment adds no value. Either explain WHY it's needed or remove it, please.



class SampleModel(CollectionBase, TheoreticalModelBase):
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would probably be useful to add __len__ so querying the model is simpler

>>> model = SampleModel()
>>> model.add_component(Gaussian(name="g1"))
>>> len(model)

To do this, just query super():

def __len__(self) -> int:
        return super().__len__()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already works! :)

self.insert(index=len(self), value=component)

def remove_component(self, name: str):
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why find all indices if you only delete the first? This is either inefficient or you expect duplicate names?
Just delete the first index found.

List[ModelComponent]
"""
return list(self)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this creation of a new list every time components() is accessed is pretty wasteful. Either cache it or use list(self) directly as said earlier.

) # noqa: E501

def convert_unit(self, unit: Union[str, sc.Unit]) -> None:
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please consider adding the rollback mechanism, which you did in ModelComponent.

Comment on lines 206 to 211
if not self.components:
raise ValueError("No components in the model to evaluate.")
result = None
for component in list(self):
value = component.evaluate(x)
result = value if result is None else result + value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or just

if not self.components:
    raise ValueError("No components in the model to evaluate.")
return sum(component.evaluate(x) for component in self)


class SampleModel(CollectionBase, TheoreticalModelBase):
"""
A model of the scattering from a sample, combining multiple model components.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, you could extend the API to include __contains__ so we can do:

>>> model = SampleModel()
>>> gaussian = Gaussian(name="g1")
>>> model.add_component(gaussian)
>>> "g1" in model
True
>>> gaussian in model
True
>>> "nonexistent" in model
False

with something like

def __contains__(self, item: Union[str, ModelComponent]) -> bool:
    if isinstance(item, str):
        # Check by component name
        return any(comp.name == item for comp in self)
    elif isinstance(item, ModelComponent):
        # Check by component instance
        return any(comp is item for comp in self)
    else:
        return False

@henrikjacobsenfys
Copy link
Member Author

As always, very useful comments :) here's the updated version

Copy link

@damskii9992 damskii9992 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First and by far the biggest comment.
Do you WANT your SampleModel to BE a list and dictionary? Or do you want it to contain a list/dictionary?
It should only be the former if you REALLY want to be able to iterate over your model, slice it to get components, or do len(sample_model) instead of, for example len(sample_model.components).
I don't really think you want it to be a list/dictionary . . .
And even if you did, you should probably write up your own collections class, as relying on CollectionBase is dangerous as it is subject to be removed (or at least heavily modified).

@henrikjacobsenfys
Copy link
Member Author

First and by far the biggest comment. Do you WANT your SampleModel to BE a list and dictionary? Or do you want it to contain a list/dictionary? It should only be the former if you REALLY want to be able to iterate over your model, slice it to get components, or do len(sample_model) instead of, for example len(sample_model.components). I don't really think you want it to be a list/dictionary . . . And even if you did, you should probably write up your own collections class, as relying on CollectionBase is dangerous as it is subject to be removed (or at least heavily modified).

Well, I honestly don't care too much exactly what it is. I want to have a collection of components that I/the user can easily access and work with, I want it to be integrated with the rest of corelib, i.e. reuse as much of corelib as possible, which also helps minimizing the amount of code I have to write/maintain in dynamics-lib. CollectionBase checks all those boxes. I'm not attached to it in any way, though, so happy enough to do it in a different way.

Would you recommend the components to be stored as a dict or list?

@henrikjacobsenfys henrikjacobsenfys marked this pull request as draft November 21, 2025 13:42
@henrikjacobsenfys
Copy link
Member Author

@damskii9992 Could I have a quick sanity check of the changes to ModelComponent and Gaussian to use ModelBase before I make the same changes to all the other components?

self,
name="ModelComponent",
display_name="ModelComponent",
unit: Optional[Union[str, sc.Unit]] = "meV",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't use Optional or Union for type hinting, from Python 3.10 and forward you simply write str | sc.Unit
Also, Optional essentially means it can also be a None, so it shouldn't even have been used here.

def free_all_parameters(self):
"""Free all parameters in the model component."""
for p in self.get_parameters():
for p in self.get_all_parameters():

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and above should be get_fittable_parameters since you can't set dependent Parameters fixed attribute.


old_unit = self._unit
pars = self.get_parameters()
pars = self.get_all_parameters()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be get_all_variables so you also convert DescriptorNumbers? You had a Plancks constant descriptor somewhere right? Shouldn't that one also be converted?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No I don't think so, since Planck's constant can't be converted to e.g. microeV. So I'll leave it as it is for now, and figure out the units issue when I get to the DiffusionModel

def __init__(
self,
name="ModelComponent",
display_name="ModelComponent",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also remove the default display_name, it is not a mandatory attribute :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, but I like names... I'll probably remove it

@henrikjacobsenfys
Copy link
Member Author

@damskii9992 @rozyczko all ready for another review when you have time :)

@henrikjacobsenfys henrikjacobsenfys marked this pull request as ready for review December 12, 2025 05:11
@henrikjacobsenfys henrikjacobsenfys changed the title Sample Model ComponentCollection Dec 12, 2025
Copy link
Member

@rozyczko rozyczko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One important bug needs to be addressed.
The rest is for consideration

self,
display_name: str = "MyComponentCollection",
unit: str | sc.Unit = "meV",
components: List[ModelComponent] = [],
Copy link
Member

@rozyczko rozyczko Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh my... Mutable Default Argument bug. Using [] as a default argument is a well-known Python anti-pattern.
Python’s default arguments are evaluated once when the function is defined, not each time the function is called
Use None and create the list inside the function:

components: List[ModelComponent] | None = None,

then

components = components or []

Check out this example:

def append_to(element, to=[]):
    to.append(element)
    return to

then call it twice with different args

my_list = append_to(12)
print(my_list)

my_other_list = append_to(42)
print(my_other_list)

What did you expect in the second case? [42]? :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this madness?!

"""
Gaussian function: area/(width*sqrt(2pi)) * exp(-0.5*((x - center)/width)^2)
If the center is not provided, it will be centered at 0 and fixed, which is typically what you want in QENS.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

your component functions are missing get_all_variables Implementation, relying on the parent class implementation.
The parent ModelBase.get_all_variables() may not find _area, _center, _width since they're not passed to super().init() anymore.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am indeed relying on the parent class implementation, but it works :)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the new ModelBase classs' get_all_variables method automatically finds the parameters of of the class if they are defined as properties which return a Parameter :)

super().__init__(
display_name=display_name,
unit=unit,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gaussian calls super().__init__() first (which relies on self._unit), but _unit is set inside ModelComponent.__init__ via validate_unit. However, other components (Lorentzian, DHO) set self._unit = unit before calling super. This is inconsistent

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, the Gaussian does it right, setting self._unit = unit before calling super is redundant. Will fix.

)

def get_parameters(self) -> list[Parameter]:
def get_all_variables(self) -> list[DescriptorBase]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This overrides get_all_variables properly but other components use get_all_parameters. The naming is inconsistent across the codebase

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, this is because of how the coefficients of the polynomial are created and stored. ModelBase does not look through lists of Parameters, so I have to do it manually. All the other get_all_ from ModelBase rely on get_all_variables, so this was the only one that needed to be overridden.

It may make more sense to update ModelBase to also look through lists to check for Parameters (and Descriptors)

Parameters
----------
name : str
Name of the sample model.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Says name but param is display_name

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Damn, thought I had gotten them all

@@ -0,0 +1,296 @@
import warnings
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code mixes old-style Union[X, Y] with new-style X | Y:

(old)
Union[Numeric, list, np.ndarray, sc.Variable, sc.DataArray]
(new)
Numeric | list | np.ndarray | sc.Variable | sc.DataArray

Pick one style for consistency.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to consistently use the new style, but clearly I'm not good at consistency. Should be better now

@@ -0,0 +1,372 @@
from copy import copy
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given this is part of the EasyScience ecosystem which relies heavily on as_dict()/from_dict() serialization, there are no tests for serializing/deserializing ComponentCollection.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Copilot was happy to write those tests :)

Copy link

@damskii9992 damskii9992 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not reviewed the unit tests yet, but I feel like there is plenty here to look at first.

name="ModelComponent",
unit: Optional[Union[str, sc.Unit]] = "meV",
**kwargs: Any,
display_name: str = None,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typehint should be str | None

):
self.validate_unit(unit)
super().__init__(name=name, **kwargs)
super().__init__(display_name=display_name)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No custom unique_name? Is that on purpose or did you just forget it?

name: Optional[str] = "Polynomial",
coefficients: Optional[Sequence[Union[Numeric, Parameter]]] = (0.0,),
unit: Union[str, sc.Unit] = "meV",
display_name: str = "Polynomial",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this missing a typehint of None?

unit: str | sc.Unit = "meV",
):
self.validate_unit(unit)
super().__init__(display_name=display_name, unit=unit)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also (possibly) missing a custom unique_name? I won't write it on the rest of the places.

param = Parameter(name=f"{display_name}_c{i}", value=float(coef))
else:
raise TypeError(
"Each coefficient must be either a numeric value or a Parameter."

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're allowing the user to pass Parameters? I thought we decided against that . . .

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I needed it to make serialisation work with the old base...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, I'll keep it for now, not use it unless I feel like I really need to, then remove it once SampleModel exists :)

for component in self.components:
if hasattr(component, "area"):
area_params.append(component.area)
total_area += component.area.value

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you need to check their units? Remember that we can add Parameters together. You might want to use that here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! I'll do that

raise ValueError("Total area is zero; cannot normalize.")

if not np.isfinite(total_area):
raise ValueError("Total area is not finite; cannot normalize.")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would this ever happen? We should not allow Parameters to have infinite values and then this could never be infinite.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this also accounts for NaN's. Either way I think it's fine to have a check even if it will never be used if things go as we expect :)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary checks does make code slower, but only by a little. Just wanted to point it out in case you hadn't realized ^^


if not self.components:
raise ValueError("No components in the model to evaluate.")
return sum(component.evaluate(x) for component in self.components)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this even work? You are trying to sum the contributions element-wise, right? So if x is an array, you would be trying to use sum on a tuple of arrays/lists.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix all free parameters in the model.
"""
for param in self.get_all_parameters():
param.fixed = True

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As previously mentioned, this and below should be get_fittable_parameters as you can't set the fixed attribute on dependent Parameters.


if isinstance(item, str):
# Check by component name
return any(comp.display_name == item for comp in self.components)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In all of this class, you should really think about where you use display_name and where you use unique_name.
display_name is ONLY meant for displaying, any and all logic should be using the unique_name.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

[priority] high Should be prioritized soon [scope] enhancement Adds/improves features (major.MINOR.patch)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants