astrobase.varbase.trends module

Contains light curve trend-removal tools, such as external parameter decorrelation (EPD) and smoothing.

astrobase.varbase.trends.smooth_magseries_ndimage_medfilt(mags, windowsize)[source]

This smooths the magseries with a median filter that reflects the array at the boundary.

See https://docs.scipy.org/doc/scipy/reference/tutorial/ndimage.html for details.

Parameters:
  • mags (np.array) – The input mags/flux time-series to smooth.
  • windowsize (int) – This is a odd integer containing the smoothing window size.
Returns:

The smoothed mag/flux time-series array.

Return type:

np.array

astrobase.varbase.trends.smooth_magseries_signal_medfilt(mags, windowsize)[source]

This smooths the magseries with a simple median filter.

This function pads with zeros near the boundary, see:

https://stackoverflow.com/questions/24585706/scipy-medfilt-wrong-result

Typically this is bad.

Parameters:
  • mags (np.array) – The input mags/flux time-series to smooth.
  • windowsize (int) – This is a odd integer containing the smoothing window size.
Returns:

The smoothed mag/flux time-series array.

Return type:

np.array

astrobase.varbase.trends.smooth_magseries_gaussfilt(mags, windowsize, windowfwhm=7)[source]

This smooths the magseries with a Gaussian kernel.

Parameters:
  • mags (np.array) – The input mags/flux time-series to smooth.
  • windowsize (int) – This is a odd integer containing the smoothing window size.
  • windowfwhm (int) – This is an odd integer containing the FWHM of the applied Gaussian window function.
Returns:

The smoothed mag/flux time-series array.

Return type:

np.array

astrobase.varbase.trends.smooth_magseries_savgol(mags, windowsize, polyorder=2)[source]

This smooths the magseries with a Savitsky-Golay filter.

Parameters:
  • mags (np.array) – The input mags/flux time-series to smooth.
  • windowsize (int) – This is a odd integer containing the smoothing window size.
  • polyorder (int) – This is an integer containing the polynomial degree order to use when generating the Savitsky-Golay filter.
Returns:

The smoothed mag/flux time-series array.

Return type:

np.array

astrobase.varbase.trends.epd_magseries(times, mags, errs, fsv, fdv, fkv, xcc, ycc, bgv, bge, iha, izd, magsarefluxes=False, epdsmooth_sigclip=3.0, epdsmooth_windowsize=21, epdsmooth_func=<function smooth_magseries_savgol>, epdsmooth_extraparams=None)[source]

Detrends a magnitude series using External Parameter Decorrelation.

Requires a set of external parameters similar to those present in HAT light curves. At the moment, the HAT light-curve-specific external parameters are:

  • S: the ‘fsv’ column in light curves,
  • D: the ‘fdv’ column in light curves,
  • K: the ‘fkv’ column in light curves,
  • x coords: the ‘xcc’ column in light curves,
  • y coords: the ‘ycc’ column in light curves,
  • background value: the ‘bgv’ column in light curves,
  • background error: the ‘bge’ column in light curves,
  • hour angle: the ‘iha’ column in light curves,
  • zenith distance: the ‘izd’ column in light curves

S, D, and K are defined as follows:

  • S -> measure of PSF sharpness (~1/sigma^2 sosmaller S = wider PSF)
  • D -> measure of PSF ellipticity in xy direction
  • K -> measure of PSF ellipticity in cross direction

S, D, K are related to the PSF’s variance and covariance, see eqn 30-33 in A. Pal’s thesis: https://arxiv.org/abs/0906.3486

NOTE: The errs are completely ignored and returned unchanged (except for sigclip and finite filtering).

Parameters:
  • times,mags,errs (np.array) – The input mag/flux time-series to detrend.
  • fsv (np.array) – Array containing the external parameter S of the same length as times.
  • fdv (np.array) – Array containing the external parameter D of the same length as times.
  • fkv (np.array) – Array containing the external parameter K of the same length as times.
  • xcc (np.array) – Array containing the external parameter x-coords of the same length as times.
  • ycc (np.array) – Array containing the external parameter y-coords of the same length as times.
  • bgv (np.array) – Array containing the external parameter background value of the same length as times.
  • bge (np.array) – Array containing the external parameter background error of the same length as times.
  • iha (np.array) – Array containing the external parameter hour angle of the same length as times.
  • izd (np.array) – Array containing the external parameter zenith distance of the same length as times.
  • magsarefluxes (bool) – Set this to True if mags actually contains fluxes.
  • epdsmooth_sigclip (float or int or sequence of two floats/ints or None) –

    This specifies how to sigma-clip the input LC before fitting the EPD function to it.

    If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.

    If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.

    If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.

  • epdsmooth_windowsize (int) – This is the number of LC points to smooth over to generate a smoothed light curve that will be used to fit the EPD function.
  • epdsmooth_func (Python function) –

    This sets the smoothing filter function to use. A Savitsky-Golay filter is used to smooth the light curve by default. The functions that can be used with this kwarg are listed in varbase.trends. If you want to use your own function, it MUST have the following signature:

    def smoothfunc(mags_array, window_size, **extraparams)
    

    and return a numpy array of the same size as mags_array with the smoothed time-series. Any extra params can be provided using the extraparams dict.

  • epdsmooth_extraparams (dict) – This is a dict of any extra filter params to supply to the smoothing function.
Returns:

Returns a dict of the following form:

{'times':the input times after non-finite elems removed,
 'mags':the EPD detrended mag values (the EPD mags),
 'errs':the errs after non-finite elems removed,
 'fitcoeffs':EPD fit coefficient values,
 'fitinfo':the full tuple returned by scipy.leastsq,
 'fitmags':the EPD fit function evaluated at times,
 'mags_median': this is median of the EPD mags,
 'mags_mad': this is the MAD of EPD mags}

Return type:

dict

astrobase.varbase.trends.epd_magseries_extparams(times, mags, errs, externalparam_arrs, initial_coeff_guess, magsarefluxes=False, epdsmooth_sigclip=3.0, epdsmooth_windowsize=21, epdsmooth_func=<function smooth_magseries_savgol>, epdsmooth_extraparams=None, objective_func=<function _epd_residual2>, objective_kwargs=None, optimizer_func=<function least_squares>, optimizer_kwargs=None)[source]

This does EPD on a mag-series with arbitrary external parameters.

Parameters:
  • times,mags,errs (np.array) – The input mag/flux time-series to run EPD on.
  • externalparam_arrs (list of np.arrays) – This is a list of ndarrays of external parameters to decorrelate against. These should all be the same size as times, mags, errs.
  • initial_coeff_guess (np.array) – An array of initial fit coefficients to pass into the objective function.
  • epdsmooth_sigclip (float or int or sequence of two floats/ints or None) –

    This specifies how to sigma-clip the input LC before smoothing it and fitting the EPD function to it. The actual LC will not be sigma-clipped.

    If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.

    If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.

    If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.

  • epdsmooth_windowsize (int) – This is the number of LC points to smooth over to generate a smoothed light curve that will be used to fit the EPD function.
  • epdsmooth_func (Python function) –

    This sets the smoothing filter function to use. A Savitsky-Golay filter is used to smooth the light curve by default. The functions that can be used with this kwarg are listed in varbase.trends. If you want to use your own function, it MUST have the following signature:

    def smoothfunc(mags_array, window_size, **extraparams)
    

    and return a numpy array of the same size as mags_array with the smoothed time-series. Any extra params can be provided using the extraparams dict.

  • epdsmooth_extraparams (dict) – This is a dict of any extra filter params to supply to the smoothing function.
  • objective_func (Python function) –

    The function that calculates residuals between the model and the smoothed mag-series. This must have the following signature:

    def objective_func(fit_coeffs,
                       times,
                       mags,
                       errs,
                       *external_params,
                       **objective_kwargs)
    

    where times, mags, errs are arrays of the sigma-clipped and smoothed time-series, fit_coeffs is an array of EPD fit coefficients, external_params is a tuple of the passed in external parameter arrays, and objective_kwargs is a dict of any optional kwargs to pass into the objective function.

    This should return the value of the residual based on evaluating the model function (and any weights based on errs or times).

  • objective_kwargs (dict or None) – A dict of kwargs to pass into the objective_func function.
  • optimizer_func (Python function) –

    The function that minimizes the residual between the model and the smoothed mag-series using the objective_func. This should have a signature similar to one of the optimizer functions in scipy.optimize, i.e.:

    def optimizer_func(objective_func, initial_coeffs, args=(), ...)
    

    and return a scipy.optimize.OptimizeResult. We’ll rely on the .success attribute to determine if the EPD fit was successful, and the .x attribute to get the values of the fit coefficients.

  • optimizer_kwargs (dict or None) – A dict of kwargs to pass into the optimizer_func function.
Returns:

Returns a dict of the following form:

{'times':the input times after non-finite elems removed,
 'mags':the EPD detrended mag values (the EPD mags),
 'errs':the errs after non-finite elems removed,
 'fitcoeffs':EPD fit coefficient values,
 'fitinfo':the result returned by the optimizer function,
 'mags_median': this is the median of the EPD mags,
 'mags_mad': this is the MAD of EPD mags}

Return type:

dict

astrobase.varbase.trends.rfepd_magseries(times, mags, errs, externalparam_arrs, magsarefluxes=False, epdsmooth=True, epdsmooth_sigclip=3.0, epdsmooth_windowsize=21, epdsmooth_func=<function smooth_magseries_savgol>, epdsmooth_extraparams=None, rf_subsample=1.0, rf_ntrees=300, rf_extraparams={'criterion': 'mse', 'n_jobs': -1, 'oob_score': False})[source]

This uses a RandomForestRegressor to de-correlate the given magseries.

Parameters:
  • times,mags,errs (np.array) – The input mag/flux time-series to run EPD on.
  • externalparam_arrs (list of np.arrays) – This is a list of ndarrays of external parameters to decorrelate against. These should all be the same size as times, mags, errs.
  • epdsmooth (bool) – If True, sets the training LC for the RandomForestRegress to be a smoothed version of the sigma-clipped light curve provided in times, mags, errs.
  • epdsmooth_sigclip (float or int or sequence of two floats/ints or None) –

    This specifies how to sigma-clip the input LC before smoothing it and fitting the EPD function to it. The actual LC will not be sigma-clipped.

    If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.

    If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.

    If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.

  • epdsmooth_windowsize (int) – This is the number of LC points to smooth over to generate a smoothed light curve that will be used to fit the EPD function.
  • epdsmooth_func (Python function) –

    This sets the smoothing filter function to use. A Savitsky-Golay filter is used to smooth the light curve by default. The functions that can be used with this kwarg are listed in varbase.trends. If you want to use your own function, it MUST have the following signature:

    def smoothfunc(mags_array, window_size, **extraparams)
    

    and return a numpy array of the same size as mags_array with the smoothed time-series. Any extra params can be provided using the extraparams dict.

  • epdsmooth_extraparams (dict) – This is a dict of any extra filter params to supply to the smoothing function.
  • rf_subsample (float) – Defines the fraction of the size of the mags array to use for training the random forest regressor.
  • rf_ntrees (int) – This is the number of trees to use for the RandomForestRegressor.
  • rf_extraprams (dict) – This is a dict of any extra kwargs to provide to the RandomForestRegressor instance used.
Returns:

Returns a dict with decorrelated mags and the usual info from the RandomForestRegressor: variable importances, etc.

Return type:

dict