model_discretization_api.py (new file, 209 additions, 0 deletions)
'''
A framework for efficient model image rasterization in astropy.

This is a GSoC 2015 project and is therefore on a relatively short time frame.
We focus first on developing a robust solution to a simple but common use case,
one that can later be extended to more complex models and solutions.

Use case: Generate a sky model image of discrete sources, e.g. something like
photutils.datasets.make_100gaussians_image(), but faster and with each source model
convolved with an instrument PSF.

[Member comment]
@patti – I think it might be helpful to add code for this use case that should actually work and be efficient once this is implemented.

The primary challenge, and motivation, is maximizing speed while maintaining precision. A
straightforward approach is to use oversampling for discretization and integration within a
bounded area around the source that contains the majority of its flux (e.g. 5 sigma).

The key steps are:
1. Model generation
- Convolution of a source model with an instrument PSF model
- Fitting the convolved model to image data if required
2. Model discretization/Image rasterization
- Integration of the model over pixel areas
- Reprojection if required

Most of the necessary pieces already exist. Relevant packages
include astropy.modeling, astropy.convolution, astropy.nddata, sherpa, galfit, reproject,
... and others.

Outlined here is a rough draft of the proposed API for community discussion.

'''
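# A minimal sketch of the use case above, written against the API *proposed* in this
# document: ConvolvedModel, render_model_to_sky, SkyImage and the `normalization` keyword
# are proposal names, not existing astropy API, and the source positions and fluxes are
# made-up illustration values.
import numpy as np
from astropy.wcs import WCS
from astropy.modeling.models import Gaussian2D
from astropy.convolution import Gaussian2DKernel

psf = Gaussian2DKernel(2)
sky_image = SkyImage(np.zeros((500, 500)), wcs=WCS())
for x_0, y_0, flux in [(100, 120, 5.0), (250, 300, 2.5), (400, 50, 1.0)]:
    source = Gaussian2D(flux, x_0, y_0, 3, 3, normalization='integral')
    convolved = ConvolvedModel(source, psf, mode='oversample', factor=10)
    sky_image.add(render_model_to_sky(convolved, wcs=sky_image.wcs))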

######################
# ConvolvedModel API #
######################

from astropy.modeling import ConvolvedModel
from astropy.modeling.models import Box2D, Lorentz1D, Gaussian2D
from astropy.convolution import Gaussian2DKernel, Gaussian1DKernel

# Instantiate a ConvolvedModel; it takes a source model and a convolution kernel
# like:
source = Box2D(1, 0, 0, 1, 1)
psf = Gaussian2DKernel(1)
convolved_model = ConvolvedModel(source, psf, mode='oversample', factor=10)

# mode will be used both for convolution and later integration;
# if a different mode is desired for convolution it should be passed via the psf kernel

# 1D models are supported as well
source = Lorentz1D(1, 0, 1)
psf = Gaussian1DKernel(1)
convolved_model = ConvolvedModel(source, psf)

# should there be a Convolved1DModel and Convolved2DModel?
# should this be an extension of the CompoundModel class?
[Member comment]
Probably not exactly: the CompoundModel class is specifically for handling models that are composed out of multiple models that have been combined with arithmetic and/or composition operators. That said, I can see why you would suggest that here, since you're basically convolving the source model with some convolution kernel, so in that sense you're composing the source model with the convolution operator.

So the way Perry and I talked about doing this is instead of making a ConvolvedModel that takes the source as an argument, as proposed here, there might be some ConvolutionModel that is instantiated just with the desired kernel and then would run a source model through it like:

convolution = ConvolutionModel(psf)
convolved_model = source | convolution

This would require that the ConvolutionModel overrides how the pipe operator is implemented for models, since it has a slightly different sense than it does in normal compound models. In normal compound models, model_1 | model_2 means "pass the output of model_1 in as the input to model_2". However, in this case the input to the convolution model is the entire model on the LHS (as well as its input, if that makes sense). This special case could definitely be supported though, I think. The last time we talked about this I said I would whip up a prototype, but I haven't had a chance. I can show you more exactly what I mean later if that sounds interesting.

However, if the scheme I'm proposing doesn't work out, something like the above example could be made to work too. I just like the source | convolution scheme since it fits in more nicely overall with the existing syntax for compound models :)

[Member comment]
To me, what a convolution operator implies in this context is that the form of the input model must be returned as a regularly sampled set of values (effectively a table look-up) so that a discrete convolution can be applied. For example, if the input model is purely analytic (or even a table look-up on a different or irregular grid), the model must be evaluated on a grid that the convolver can use. In effect the convolver has to turn the input into a lookup-table model and operate on that. The convolver needs info on the sampling interval and range that it is expected to deal with, and uses that to evaluate the input. Is there any other way to deal with it?
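As a rough illustration of the point above (not part of the proposed API): with existing astropy functionality one can already turn an analytic model into a regularly sampled array and apply a discrete convolution to it; the grid ranges below are arbitrary illustration values.

from astropy.convolution import Gaussian2DKernel, convolve, discretize_model
from astropy.modeling.models import Gaussian2D

source = Gaussian2D(1, 0, 0, 2, 2)
# evaluate the analytic model on a regular pixel grid (effectively a lookup table) ...
sampled = discretize_model(source, x_range=(-25, 26), y_range=(-25, 26),
                           mode='oversample', factor=10)
# ... then convolve that sampled array with the discrete PSF kernel
convolved_image = convolve(sampled, Gaussian2DKernel(1))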


# later we can think about using any model as a convolution kernel
# and even implement the analytical solution for the common Gaussian case
# or other cases where analytical solutions are known
source = Gaussian2D(1, 0, 0, 1, 1)
psf = Gaussian2D(1, 0, 0, 1, 1)
convolved_model = ConvolvedModel(source, psf)


# the convolution method can be set via a string, to choose between the astropy.convolution
# methods
[Member comment]
I think it might be better if ConvolvedModel is an abstract base class, and the "method" should be handled using distinct classes, e.g. FFTConvolvedModel(source, psf) and StandardConvolvedModel(source, psf). Then instead of the user passing in a function, they subclass ConvolvedModel.

[Member comment]
There could still be a factory function or class that then creates an object of different types depending on the method parameter as a convenience for the user.

[Member comment]
For now I don't see any advantage of one method over the other (function vs. subclassing).
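A minimal sketch of the factory idea from the comments above (FFTConvolvedModel and StandardConvolvedModel are the hypothetical subclasses mentioned, not existing classes):

def convolved_model_factory(source, psf, method='fft', **kwargs):
    """Convenience factory that picks the ConvolvedModel subclass from the method string."""
    classes = {'fft': FFTConvolvedModel, 'standard': StandardConvolvedModel}
    return classes[method](source, psf, **kwargs)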

convolved_model = ConvolvedModel(source, psf, method='fft')
convolved_model = ConvolvedModel(source, psf, method='standard')

# or by passing a function that should be used. This is mainly necessary because
# the performance of the astropy convolution algorithms is rather weak...
[Member comment]
As I recall, the astropy convolution stuff was written specifically because scipy was problematic for some use cases. So can you clarify what situation it failed in? Regardless, for this API having a custom option is good no matter what.

[Member comment]
The astropy convolution routines can deal with NaN values at the price of weak performance (in terms of speed...). For this use case NaN values won't be an issue, so it would make sense to just go for the fastest solution.

import scipy.signal
convolved_model = ConvolvedModel(source, psf, method=scipy.signal.fftconvolve)

# recommendation would be to use scipy.signal.fftconvolve which is very fast

# additional keyword arguments can be passed to the convolution function
convolved_model = ConvolvedModel(source, psf, method='standard', mode='oversample')

# the convolved_model can be fit
[Member comment]
@patti – Can you please spell out this complete use case where a convolved model is fit?
I just discussed this with @adonath and we think it's important to see where the PSF and model are gridded and how those grids relate.
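A rough sketch of how such a fit could look (ConvolvedModel is the hypothetical class proposed here; the PSF kernel, the model and the data all share the same pixel grid, which is what keeps the gridding consistent):

import numpy as np
from astropy.modeling.fitting import LevMarLSQFitter
from astropy.modeling.models import Gaussian2D
from astropy.convolution import Gaussian2DKernel

psf = Gaussian2DKernel(2)                       # kernel sampled on the image pixel grid
source = Gaussian2D(1, 25, 25, 3, 3)            # initial guess, pixel coordinates
convolved_model = ConvolvedModel(source, psf, mode='oversample', factor=10)

yy, xx = np.mgrid[0:51, 0:51]                   # pixel grid of the image
data = convolved_model(xx, yy) + np.random.normal(0, 0.01, xx.shape)  # simulated data
fitter = LevMarLSQFitter()
fitted_model = fitter(convolved_model, xx, yy, data)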


# Further notes:
# 1. Implementation of this class should be straightforward and not much effort
# 2. Testing could be set up using Gaussian models, where the analytical solution is
# known
# 3. Later one could think about assigning the corresponding Fourier transforms to analytical
# models and using them for the convolution, which will probably perform better:
# source model and PSF are evaluated analytically in Fourier space, multiplied and transformed
# back to normal space.
# 4. As convolution is a linear operation, the derivative of a convolution behaves like
# http://en.wikipedia.org/wiki/Convolution#Differentiation; this relation can be used
# to implement `fit_deriv` methods for convolved models (see the sketch below).
[Member comment]
👍
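A rough sketch of how note 4 could be used (hypothetical helper, not existing astropy API): the derivative of source * psf with respect to a source parameter equals (d source / d parameter) * psf, so a convolved model's fit_deriv can reuse the source model's fit_deriv.

import scipy.signal

def convolved_fit_deriv(x, source_model, psf_array, *params):
    # source_model.fit_deriv returns one derivative array per model parameter
    derivs = source_model.fit_deriv(x, *params)
    return [scipy.signal.fftconvolve(d, psf_array, mode='same') for d in derivs]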



#########################
# Model integration API #
#########################

import numpy as np
from astropy.modeling.models import Disk2D, Gaussian1D

# to avoid bias when the scale of the model is similar to the pixel size, models have to
# be integrated over pixel areas. For this purpose models could define an integrate
# method that takes the lower and upper bounds of the integration
# (see also https://github.com/astropy/astropy/issues/1586)
model = Gaussian1D(1, 0, 1)
model.integrate(0, np.inf)
[Member comment]
In the past there has been some argument that rather than a Model.integrate method, there should be some kind of general integrate() function that can take models as inputs, among other types of callable objects. I don't recall exactly what the argument for this was (@perrygreenfield might have some thoughts). But I think that even if we do go that route, it will make sense to have Model.integrate methods as well, if nothing else so that individual models can implement analytic integrals where possible (though where that's not possible it should still be able to fall back on numerical integration techniques).

[Member comment]
The main idea is that it ought to work for other things without forcing a user to cast it explicitly into a model. The function could do that behind the scenes though.

[Member comment]
But it would be extremely useful for subclasses to be able to define analytic integrals when it makes sense. So I think an integrate method is a critical option for some modeling use cases. (Similarly, a deriv, which is not the same as derivative w.r.t. parameters, i.e. the current fit_deriv)

[Member comment]
As @eteq said, the main idea is that subclasses can define analytical integrals or clever specific solutions to integrate (for a Gaussian see https://github.com/astropy/photutils/blob/master/photutils/psf.py#L252). For now I don't see the need for an extra integrate() function, because there is scipy.integrate, which can deal with any callable...
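A small standalone sketch of the idea (Model.integrate does not exist yet): a Gaussian1D can implement the integral analytically via the error function, while a generic fallback simply hands the model, which is callable, to scipy.integrate.quad.

import numpy as np
from scipy.integrate import quad
from scipy.special import erf
from astropy.modeling.models import Gaussian1D

def gaussian1d_integrate(model, x_lo, x_hi):
    # analytic integral of amplitude * exp(-(x - mean)**2 / (2 * stddev**2)) over [x_lo, x_hi]
    norm = model.amplitude.value * model.stddev.value * np.sqrt(np.pi / 2)
    scale = np.sqrt(2) * model.stddev.value
    return norm * (erf((x_hi - model.mean.value) / scale) - erf((x_lo - model.mean.value) / scale))

def numeric_integrate(model, x_lo, x_hi):
    # generic numerical fallback for models without an analytic integral
    return quad(model, x_lo, x_hi)[0]

model = Gaussian1D(1, 0, 1)
gaussian1d_integrate(model, -0.5, 0.5), numeric_integrate(model, -0.5, 0.5)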


# it should also work with arrays of bounds of course
x_lo = np.arange(-10, 11) - 0.5
x_hi = np.arange(-10, 11) + 0.5
model.integrate((x_lo, x_hi))
[Member comment]
Another thing that should be possible: integrate a 1D model with other jacobians. I.e., a factor of r for integrating in polar coordinates or r^2 for spherical.

[Member comment]
@eteq – Agree it would be nice. Another use case could be integrating spectral energy distributions (SEDs) to get the integral photon flux or energy flux in an energy band. But it's pretty hard to impossible to do in a simple API that's not use-case-specific, no?

[Member comment]
What would be the additional requirement to the integrate method when integrating SEDs?

[Member comment]
For broad-band SED integration there are (at least) two things to consider to implement it well:

  • The numerical integration should be done with x = log(energy) and y = log(flux) axes.
  • To compute the integral count flux in an energy band one has to integrate y = log(flux); to get the integral energy flux one has to integrate y = log(flux * energy ** 2). I think this is similar to the use case @eteq mentioned of adding factors of r or r^2 to polar and spherical coordinate integrals, i.e. there is a coordinate transform involved; that's why I mentioned the SED use case here.

Examples of SED integration Python code are here, here and here.

Like I said, it would be nice if the Astropy model integration methods supported such computation somehow, but I don't have a suggestion how it should be implemented. So my suggestion here would be: if someone posts a concrete idea for this feature here within ~ the next week, let's try to flesh it out, otherwise punt on it as possible future work or something to be done elsewhere on top of astropy.modeling.
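Not a proposal for the API, just a rough numerical illustration of the log-grid integration described above; the power-law SED and the energy band are made-up values.

import numpy as np

def sed(energy):
    # made-up power-law photon flux density dN/dE (arbitrary units)
    return 1e-11 * energy ** -2.5

e_min, e_max = 1.0, 100.0
log_e = np.linspace(np.log(e_min), np.log(e_max), 1000)
energy = np.exp(log_e)

# substituting u = log(E) gives dE = E du, so the integrand gains a factor of E
photon_flux = np.trapz(sed(energy) * energy, log_e)        # integral photon flux
energy_flux = np.trapz(sed(energy) * energy ** 2, log_e)   # integral energy flux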


# for 2D models it would be
y_lo = np.arange(-10, 11) - 0.5
y_hi = np.arange(-10, 11) + 0.5
model.integrate((x_lo, x_hi, y_lo, y_hi))
[Member comment]
I might prefer here that for N-D models, there would be N bounds arguments, each a separate (lo, hi) 2-tuple.
How would this work for vectors/paths? Or is that not being considered for now?

[Member comment]
I agree, the API should be designed for N-D models. Concerning the vector/path integrations, I'm not sure whether there is a simple way to deal with that. The (x_lo, x_hi, y_lo, y_hi) syntax implies an integration along the x-axis and y-axis over a rectangular area (or pixel), which might be only the most common use case. Another use case @cdeil and I discussed is HEALPix grids (where pixels are not necessarily rectangular...). So I think it's worth thinking about a more general approach. E.g. one could just pass the pixel's center and a region definition over which the model should be integrated, but this will quickly get very complicated, because there might not be analytical solutions anymore...

[Member comment]
One important question is how evaluate and integrate take their inputs.
At the moment evaluate takes x for 1D and x, y as separate arguments for 2D; I'm not sure how it's done for 3D or N-D.

For integrate this API proposal mostly takes a single positional argument (a tuple) as input, i.e. it does this differently than evaluate. Note that the evaluate methods e.g. in scipy.interpolate also take single points and xi arguments for N-D, see here for an example: http://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.griddata.html

What do we want for evaluate and integrate in Astropy models?
Is changing the evaluate signature up for discussion at this point?

[Member comment]
Note this PR and discussion in numpy on how Cartesian grids are generated: numpy/numpy#5874
(I think this is the input format for the scipy.interpolate functions, but this format is not used in Astropy, right?)


# whenever possible (e.g. for polynomials) the integrate function would be implemented
# analytically; for all others a standard numerical integration routine would be used,
# e.g. from scipy.integrate
# the integration mode can be passed to the integrate method if it is not already an
# attribute of the model
model.integrate((x_lo, x_hi, y_lo, y_hi), mode='integrate')
[Member comment]
👍

# for convenience one could incorporate the functionality into the standard model API
# as is done for the convolution kernels
model = Gaussian1D(1, 0, 1, mode='center')
model = Gaussian1D(1, 0, 1, mode='linear_interp')
model = Gaussian1D(1, 0, 1, mode='integrate')
model = Gaussian1D(1, 0, 1, mode='oversample', factor=10)
[Member comment]
Could you explain a bit more about this? What do "mode" and "factor" apply to when instantiating a model?

[Member comment]
This is just an option for convenience. The idea is that the 'mode' argument sets the evaluation mode of the model. So mode='center' corresponds to what is done now: the model is just evaluated at the center of the pixel. mode='oversample' would use oversampling to evaluate the model, etc. The problem is that this only works easily for pixel coordinates and uniform grids; otherwise the model needs to know about the sample size (which is related to the API definition of the .integrate() method).
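As a rough illustration of what mode='oversample' would do internally (plain numpy, not a proposed implementation): the model is evaluated on a sub-pixel grid and the sub-samples are averaged back onto the pixel grid.

import numpy as np
from astropy.modeling.models import Gaussian1D

model = Gaussian1D(1, 0, 1)
x = np.arange(-10, 11)          # pixel centers
factor = 10
step = 1.0 / factor

# sample points covering each pixel with `factor` sub-samples ...
x_fine = np.arange(x[0] - 0.5 + step / 2, x[-1] + 0.5, step)
# ... then average each group of sub-samples back to one value per pixel
oversampled = model(x_fine).reshape(len(x), factor).mean(axis=1)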


# the given mode could also be used when the model is called for evaluation,
# e.g. for a 2D model evaluated on a pixel grid:
x = np.arange(-10, 11)
y = np.arange(-10, 11)
xx, yy = np.meshgrid(x, y)
model_2d = Gaussian2D(1, 0, 0, 1, 1, mode='oversample', factor=10)
model_2d(xx, yy)


# models that are spatially confined to a certain region, such as Gaussian2D, Disk2D,
# Delta2D, Box2D, ..., should have the possibility to apply a different normalization,
# where the amplitude parameter corresponds to the integral of the model and not to the
# peak value (see e.g.: https://gammapy.readthedocs.org/en/latest/api/gammapy.morphology.Shell2D.html#gammapy.morphology.Shell2D)
# This is very useful for modeling sources, because the amplitude parameter then
# corresponds to the total flux of the source...
model = Gaussian2D(1, 0, 0, 1, 1, normalization='peak')
model = Gaussian2D(1, 0, 0, 1, 1, normalization='integral')
[Member comment]
Will have to give some thought to how this would be implemented within the current Model implementation, but 👍 in principle. The above examples show the models being instantiated with amplitude=1. Would this just be overridden somehow? What would model.amplitude return?

This might also be a case for something I've thought about before, for other use cases, of "partial" models that are instantiated without values specified up front for all their parameters. In the general case of "partial" models, any unspecified parameters would have to be supplied at evaluation time. However, in the case of a normalized model you might just leave the amplitude unspecified.

Or in the meantime you could just specify a default amplitude like in the above examples, but still normalize the output which is fine.

[Member comment]
The easiest solution would be to define a second evaluate method, where the amplitude parameter corresponds to the integral over the model and not to the peak value (that's how it was done here: https://gammapy.readthedocs.org/en/latest/api/gammapy.morphology.Shell2D.html#gammapy.morphology.Shell2D). This requires detailed documentation of this behaviour. A second option (which is probably more elegant)
would be to introduce a model.integral parameter that is related by some function to the model.amplitude parameter (tied and fixed would have to be defined correspondingly). The advantage of the normalized models is not only about obtaining the total integral (or flux for source models...) of the model, but also that the parametrisation using the integral as amplitude parameter usually behaves much more stably in the fit (e.g. for a Gaussian, A * exp(-x ** 2 / (2 * sigma ** 2)) vs. A / sqrt(2 * pi * sigma ** 2) * exp(-x ** 2 / (2 * sigma ** 2))).
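For the Gaussian case the relation between the two parametrisations is straightforward; a plain numpy illustration (not proposed API):

import numpy as np
from astropy.modeling.models import Gaussian2D

total_flux = 5.0                    # desired integral of the model
x_stddev, y_stddev = 2.0, 3.0
# peak amplitude that makes the 2D Gaussian integrate to total_flux
amplitude = total_flux / (2 * np.pi * x_stddev * y_stddev)
model = Gaussian2D(amplitude, 0, 0, x_stddev, y_stddev)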


# alternatively one could introduce new NormGaussian2D, NormBox2D, ... models, but that
# would probably be a lot of code duplication?
[Member comment]
Probably. I like your idea of incorporating normalization into models somehow (with the caveat that it won't make sense for all models).


# all models that are 'normalisable' should have an `extent` attribute that specifies
# a rectangular region where the model is nonzero or has 99.99% containment (or a similar
# criterion). A reasonable value would be chosen as default and would be given as multiples
# of the model's width parameters. Model evaluation can later be limited to the region
# defined by `extent` for better performance (see the sketch after the comments below)
Gaussian2D.extent = dict(x=5 * x_stddev, y=5 * y_stddev)
Disk2D.extent = dict(x=R_0, y=R_0)
[Member comment]
This aspect is something that I find a bit tricky, because while one should be able to come up with some sensible defaults, users may also want to customize this, and the exact implementation of this is going to vary from model to model.

[Member comment]
I agree with @embray that this seems dangerous to require

[Member comment]
+1 to having it optional. If it's not there, then the model is evaluated on the whole image by default.

[Member comment]
Yes, it should definitely be optional; note that this is also related to astropy/astropy#2190.
I will think about a general method for finding suitable default values...
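A rough sketch of how an optional extent could speed up evaluation (the extent values follow the proposal above; plain numpy with pixel coordinates, not a proposed implementation):

import numpy as np
from astropy.modeling.models import Gaussian2D

image = np.zeros((200, 200))
model = Gaussian2D(1, 120.0, 80.0, 2.0, 3.0)
extent = dict(x=5 * model.x_stddev.value, y=5 * model.y_stddev.value)

# evaluate the model only inside the rectangle defined by `extent`
x_lo = max(int(model.x_mean.value - extent['x']), 0)
x_hi = min(int(model.x_mean.value + extent['x']) + 1, image.shape[1])
y_lo = max(int(model.y_mean.value - extent['y']), 0)
y_hi = min(int(model.y_mean.value + extent['y']) + 1, image.shape[0])
yy, xx = np.mgrid[y_lo:y_hi, x_lo:x_hi]
image[y_lo:y_hi, x_lo:x_hi] += model(xx, yy)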


# models should have an attribute that defines a reasonable sample size for the model;
# warnings could be raised when models are undersampled.
Gaussian2D.sample_size = width / 5.


###############################
# Model Image Computation API #
###############################

[Member comment]
One really important thing here: adding fake noise to make simulated images. This really, really should use quantities. I.e., effective_area = 100*u.m**2, exposure_time = 5*u.min, quantum_efficiency = 0.9, and then something like sky_image.simulate(effective_area, exposure_time, quantum_efficiency) would yield an "image" that's an NDData with units in u.ADU.

[Member comment]
(using np.random.poisson to produce the simulated image)

[Member comment]
@eteq – What exactly are you proposing here?
If the model evaluate returns a quantity, it'll just be used, i.e. the output sky image will also be a quantity.
Other than that, I don't think we should introduce checks or attach units ourselves, because it would be restricting.
Maybe Joe wants to render an image in flux units and Jane in counts units and Jim in surface brightness units?
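A rough sketch of the kind of computation described above (plain numpy and astropy.units, not a proposed sky_image.simulate method; the flux image and instrument numbers are made up):

import numpy as np
import astropy.units as u

# made-up sky model image in photon flux units
flux_image = np.full((100, 100), 1e-3) * u.photon / (u.cm ** 2 * u.s)

effective_area = 100 * u.m ** 2
exposure_time = 5 * u.min
quantum_efficiency = 0.9

# expected counts per pixel, then Poisson noise for the simulated image
expected_counts = (flux_image * effective_area * exposure_time * quantum_efficiency).to(u.photon)
simulated_image = np.random.poisson(expected_counts.value)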


# functionality is added to NDData via NDDataMixin classes
from astropy.nddata import NDData

class SkyImage(NDData, NDSlicingMixin, NDWCSCutoutMixin): pass

# a sky image object is created as follows
from astropy.wcs import WCS
sky_image = SkyImage(np.zeros((501, 10001)), wcs=WCS())

# it has a coordinates attribute, that returns a SkyCoord object
# with every pixel containing the corresponding position
sky_image.coordinates

# one can make rectangular cutouts of the sky image using
cutout = sky_image.extract(position, extent)
cutout = sky_image.extract(position, extent, copy=True)

# which returns a SkyImage object, but with modified WCS transform
# data can be copied optionally

# to render the model on a sky image a new function should be defined
source_image = render_model_to_sky(model, wcs=WCS())

# which would again return a SkyImage object, centered on the pixel nearest to the
# source position, with its size defined by `extent`. The pixel values are computed
# according to the model's mode.

# The source image can then be added to sky_image
sky_image.add(source_image)

# later `.add` could use any kind of resampling/reprojection
# for now we just assume that `sky_image` and `source_image`
# pixels are aligned (which should be checked...)

# Combining it all,
sky_image.add_sources(models)

# models should be an iterable of model instances, one for each source.
# Parallelization can then speed things up.

sky_image.add_sources(models, distribute=True, N_core=None)
[Member comment]
This is where my lack of domain knowledge comes in: would you possibly be using different models (i.e. different types of models) for different sources, or could they all be the same model but with varying parameters for each source? For the latter case the (poorly understood, but very useful) feature of model sets could be employed. I don't understand the use case here well enough to know if this is an appropriate use for that feature or not.

[Member comment]
@embray – There is the use case that one wants to make an image of 3 Gaussians, 4 Sersic profiles and 1 polynomial, i.e. different types of sources that can't be represented as one model set.
Of course one could make groups of model sets and still use the model sets feature, but then it gets complicated for the user and also implementing it would be hard ... one would need sets of extents defining the bounding boxes where the models should be evaluated.
Even if it's possible to make this work, I think it's too complex and we shouldn't use model sets to implement this.
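A rough sketch of what add_sources could do internally (render_model_to_sky, SkyImage.add and the keyword names are the proposal's, not existing API; multiprocessing is just one possible way to distribute the per-source rendering):

from functools import partial
from multiprocessing import Pool

def _render(model, wcs):
    # render one source onto its own small image (proposed function, see above)
    return render_model_to_sky(model, wcs=wcs)

def add_sources(sky_image, models, distribute=False, N_core=None):
    """Render each source model and add it to the sky image."""
    render = partial(_render, wcs=sky_image.wcs)
    if distribute:
        with Pool(processes=N_core) as pool:
            source_images = pool.map(render, models)
    else:
        source_images = [render(model) for model in models]
    for source_image in source_images:
        sky_image.add(source_image)
    return sky_image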