astrobase.periodbase.kbls module

Contains the implementation of the Kovacs et al. (2002) Box Least Squares (BLS) period-search algorithm for periodbase.

astrobase.periodbase.kbls.bls_serial_pfind(times, mags, errs, magsarefluxes=False, startp=0.1, endp=100.0, stepsize=0.0005, mintransitduration=0.01, maxtransitduration=0.4, nphasebins=200, autofreq=True, periodepsilon=0.1, nbestpeaks=5, sigclip=10.0, endp_timebase_check=True, verbose=True, get_stats=True)[source]

Runs the Box Least Squares Fitting Search for transit-shaped signals.

Based on eebls.f from Kovacs et al. 2002 and python-bls from Foreman-Mackey et al. 2015. This is the serial version (which is good enough in most cases because BLS in Fortran is fairly fast). If nfreq > 5e5, this will take a while.

Parameters:
  • times,mags,errs (np.array) – The magnitude/flux time-series to search for transits.
  • magsarefluxes (bool) – If the input measurement values in mags and errs are in fluxes, set this to True.
  • startp,endp (float) – The minimum and maximum periods to consider for the transit search.
  • stepsize (float) – The step-size in frequency to use when constructing a frequency grid for the period search.
  • mintransitduration,maxtransitduration (float) – The minimum and maximum transit durations (in units of phase) to consider for the transit search.
  • nphasebins (int) – The number of phase bins to use in the period search.
  • autofreq (bool) –

    If this is True, the values of stepsize and nphasebins will be ignored, and these, along with a frequency-grid, will be determined based on the following relations:

    nphasebins = int(ceil(2.0/mintransitduration))
    if nphasebins > 3000:
        nphasebins = 3000
    
    stepsize = 0.25*mintransitduration/(times.max()-times.min())
    
    minfreq = 1.0/endp
    maxfreq = 1.0/startp
    nfreq = int(ceil((maxfreq - minfreq)/stepsize))
    
  • periodepsilon (float) – The fractional difference between successive values of ‘best’ periods when sorting by periodogram power to consider them as separate periods (as opposed to part of the same periodogram peak). This is used to avoid broad peaks in the periodogram and make sure the ‘best’ periods returned are all actually independent.
  • nbestpeaks (int) – The number of ‘best’ peaks to return from the periodogram results, starting from the global maximum of the periodogram peak values.
  • sigclip (float or int or sequence of two floats/ints or None) –

    If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.

    If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.

    If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.

  • endp_timebase_check (bool) – If True, will check if the endp value is larger than the time-base of the observations. If it is, will change the endp value such that it is half of the time-base. If False, will allow an endp larger than the time-base of the observations.
  • verbose (bool) – If this is True, will indicate progress and details about the frequency grid used for the period search.
  • get_stats (bool) –

    If True, runs bls_stats_singleperiod() for each of the best periods in the output and injects the output into the output dict so you only have to run this function to get the periods and their stats.

    The output dict from this function will then contain a ‘stats’ key containing a list of dicts with statistics for each period in resultdict['nbestperiods']. These dicts will contain fit values of transit parameters after a trapezoid transit model is fit to the phased light curve at each period in resultdict['nbestperiods'], i.e. fit values for period, epoch, transit depth, duration, ingress duration, and the SNR of the transit.

    NOTE: make sure to check the ‘fit_status’ key for each resultdict['stats'] item to confirm that the trapezoid transit model fit succeeded and that the stats calculated are valid.
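
The interplay of periodepsilon and nbestpeaks can be illustrated with a short sketch. This is not the astrobase implementation: select_best_peaks is a hypothetical helper, and the exact comparison the library uses to decide that two periods belong to the same periodogram peak may differ.

```python
def select_best_peaks(periods, powers, nbestpeaks=5, periodepsilon=0.1):
    """Pick up to nbestpeaks periods, skipping near-duplicates of accepted ones.

    Hypothetical helper for illustration; not part of astrobase.
    """
    # Visit periodogram samples from highest to lowest power.
    order = sorted(range(len(periods)), key=lambda i: powers[i], reverse=True)

    best_periods, best_powers = [], []
    for i in order:
        candidate = periods[i]
        # Accept the candidate only if it differs from every accepted period
        # by more than periodepsilon, taken as a fraction of the candidate.
        if all(abs(candidate - p) > periodepsilon * candidate
               for p in best_periods):
            best_periods.append(candidate)
            best_powers.append(powers[i])
        if len(best_periods) == nbestpeaks:
            break
    return best_periods, best_powers
```

With periods [1.00, 1.02, 2.0, 2.05, 5.0] sorted by decreasing power, the samples at 1.02 and 2.05 would be treated as part of the peaks at 1.00 and 2.0 and skipped.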

Returns:

This function returns a dict, referred to as an lspinfo dict in other astrobase functions that operate on periodogram results. This is a standardized format across all astrobase period-finders, and is of the form below:

{'bestperiod': the best period value in the periodogram,
 'bestlspval': the periodogram peak associated with the best period,
 'nbestpeaks': the input value of nbestpeaks,
 'nbestlspvals': nbestpeaks-size list of best period peak values,
 'nbestperiods': nbestpeaks-size list of best periods,
 'stats': BLS stats for each best period,
 'lspvals': the full array of periodogram powers,
 'frequencies': the full array of frequencies considered,
 'periods': the full array of periods considered,
 'blsresult': the result dict from the eebls.f wrapper function,
 'stepsize': the actual stepsize used,
 'nfreq': the actual nfreq used,
 'nphasebins': the actual nphasebins used,
 'mintransitduration': the input mintransitduration,
 'maxtransitduration': the input maxtransitduration,
 'method':'bls' -> the name of the period-finder method,
 'kwargs':{ dict of all of the input kwargs for record-keeping}}

Return type:

dict
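
The autofreq relations listed under the parameters above can be collected into one small function. This is a sketch using only the standard library; autofreq_grid is a hypothetical name, and times is taken to be any sequence of floats rather than an np.array.

```python
from math import ceil


def autofreq_grid(times, startp, endp, mintransitduration):
    """Apply the documented autofreq relations (hypothetical helper)."""
    # Number of phase bins, capped at 3000.
    nphasebins = int(ceil(2.0 / mintransitduration))
    if nphasebins > 3000:
        nphasebins = 3000

    # Frequency step scales with the minimum transit duration and the
    # inverse of the observation time-base.
    stepsize = 0.25 * mintransitduration / (max(times) - min(times))

    minfreq = 1.0 / endp
    maxfreq = 1.0 / startp
    nfreq = int(ceil((maxfreq - minfreq) / stepsize))

    return {'nphasebins': nphasebins, 'stepsize': stepsize,
            'minfreq': minfreq, 'maxfreq': maxfreq, 'nfreq': nfreq}


# Example: a 64-day time-base searched between 0.5 and 4.0 days.
grid = autofreq_grid([0.0, 16.0, 64.0], startp=0.5, endp=4.0,
                     mintransitduration=0.0625)
```

Because nfreq grows with the time-base and with 1/mintransitduration, evaluating these relations before a run gives a quick check on whether nfreq will exceed the 5e5 level mentioned above.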

astrobase.periodbase.kbls.bls_parallel_pfind(times, mags, errs, magsarefluxes=False, startp=0.1, endp=100.0, stepsize=0.0001, mintransitduration=0.01, maxtransitduration=0.4, nphasebins=200, autofreq=True, nbestpeaks=5, periodepsilon=0.1, sigclip=10.0, endp_timebase_check=True, verbose=True, nworkers=None, get_stats=True)[source]

Runs the Box Least Squares Fitting Search for transit-shaped signals.

Based on eebls.f from Kovacs et al. 2002 and python-bls from Foreman-Mackey et al. 2015. Breaks up the full frequency space into chunks and passes them to parallel BLS workers.

NOTE: the combined BLS spectrum produced by this function is not identical to that produced by running BLS in one shot for the entire frequency space. There are differences on the order of 1.0e-3 or so in the respective peak values, but peaks appear at the same frequencies for both methods. This is likely due to different aliasing caused by smaller chunks of the frequency space used by the parallel workers in this function. When in doubt, confirm results for this parallel implementation by comparing to those from the serial implementation above.
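
One way to picture the chunking is the sketch below, which splits a uniform frequency grid into contiguous pieces, one per worker. This is an assumption about the general approach, not the library's code: chunk_frequency_grid is a hypothetical helper, and the chunk boundaries astrobase actually uses may differ.

```python
from math import ceil


def chunk_frequency_grid(minfreq, stepsize, nfreq, nworkers):
    """Split a uniform grid into (chunk_minfreq, chunk_nfreq) pairs.

    Hypothetical helper for illustration; not part of astrobase.
    """
    # Each worker gets at most this many frequencies.
    chunksize = int(ceil(nfreq / nworkers))

    chunks = []
    for start in range(0, nfreq, chunksize):
        # The last chunk may be shorter than the others.
        chunk_nfreq = min(chunksize, nfreq - start)
        chunks.append((minfreq + start * stepsize, chunk_nfreq))
    return chunks


# Example: 10 frequencies shared among 4 workers.
chunks = chunk_frequency_grid(minfreq=0.25, stepsize=0.125, nfreq=10,
                              nworkers=4)
```

Each worker would then run BLS over its own (chunk_minfreq, chunk_nfreq) range, and the per-chunk spectra would be concatenated in frequency order.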

Parameters:
  • times,mags,errs (np.array) – The magnitude/flux time-series to search for transits.
  • magsarefluxes (bool) – If the input measurement values in mags and errs are in fluxes, set this to True.
  • startp,endp (float) – The minimum and maximum periods to consider for the transit search.
  • stepsize (float) – The step-size in frequency to use when constructing a frequency grid for the period search.
  • mintransitduration,maxtransitduration (float) – The minimum and maximum transit durations (in units of phase) to consider for the transit search.
  • nphasebins (int) – The number of phase bins to use in the period search.
  • autofreq (bool) –

    If this is True, the values of stepsize and nphasebins will be ignored, and these, along with a frequency-grid, will be determined based on the following relations:

    nphasebins = int(ceil(2.0/mintransitduration))
    if nphasebins > 3000:
        nphasebins = 3000
    
    stepsize = 0.25*mintransitduration/(times.max()-times.min())
    
    minfreq = 1.0/endp
    maxfreq = 1.0/startp
    nfreq = int(ceil((maxfreq - minfreq)/stepsize))
    
  • periodepsilon (float) – The fractional difference between successive values of ‘best’ periods when sorting by periodogram power to consider them as separate periods (as opposed to part of the same periodogram peak). This is used to avoid broad peaks in the periodogram and make sure the ‘best’ periods returned are all actually independent.
  • nbestpeaks (int) – The number of ‘best’ peaks to return from the periodogram results, starting from the global maximum of the periodogram peak values.
  • sigclip (float or int or sequence of two floats/ints or None) –

    If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.

    If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.

    If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.

  • endp_timebase_check (bool) – If True, will check if the endp value is larger than the time-base of the observations. If it is, will change the endp value such that it is half of the time-base. If False, will allow an endp larger than the time-base of the observations.
  • verbose (bool) – If this is True, will indicate progress and details about the frequency grid used for the period search.
  • nworkers (int or None) – The number of parallel workers to launch for period-search. If None, nworkers = NCPUS.
  • get_stats (bool) –

    If True, runs bls_stats_singleperiod() for each of the best periods in the output and injects the output into the output dict so you only have to run this function to get the periods and their stats.

    The output dict from this function will then contain a ‘stats’ key containing a list of dicts with statistics for each period in resultdict['nbestperiods']. These dicts will contain fit values of transit parameters after a trapezoid transit model is fit to the phased light curve at each period in resultdict['nbestperiods'], i.e. fit values for period, epoch, transit depth, duration, ingress duration, and the SNR of the transit.

    NOTE: make sure to check the ‘fit_status’ key for each resultdict['stats'] item to confirm that the trapezoid transit model fit succeeded and that the stats calculated are valid.
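
The behavior described for endp_timebase_check can be sketched in a few lines. resolve_endp is a hypothetical helper written from the parameter description above, not taken from the astrobase source.

```python
def resolve_endp(times, endp, endp_timebase_check=True):
    """Return the maximum period to search (hypothetical helper)."""
    timebase = max(times) - min(times)
    # If the check is enabled and endp exceeds the time-base, fall back to
    # half of the time-base, as described for endp_timebase_check.
    if endp_timebase_check and endp > timebase:
        return timebase / 2.0
    return endp
```

For observations spanning 50 days and the default endp=100.0, this yields an effective maximum period of 25 days unless the check is disabled.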

Returns:

This function returns a dict, referred to as an lspinfo dict in other astrobase functions that operate on periodogram results. This is a standardized format across all astrobase period-finders, and is of the form below:

{'bestperiod': the best period value in the periodogram,
 'bestlspval': the periodogram peak associated with the best period,
 'nbestpeaks': the input value of nbestpeaks,
 'nbestlspvals': nbestpeaks-size list of best period peak values,
 'nbestperiods': nbestpeaks-size list of best periods,
 'stats': list of stats dicts returned for each best period,
 'lspvals': the full array of periodogram powers,
 'frequencies': the full array of frequencies considered,
 'periods': the full array of periods considered,
 'blsresult': list of result dicts from eebls.f wrapper functions,
 'stepsize': the actual stepsize used,
 'nfreq': the actual nfreq used,
 'nphasebins': the actual nphasebins used,
 'mintransitduration': the input mintransitduration,
 'maxtransitduration': the input maxtransitduration,
 'method':'bls' -> the name of the period-finder method,
 'kwargs':{ dict of all of the input kwargs for record-keeping}}

Return type:

dict

astrobase.periodbase.kbls.bls_stats_singleperiod(times, mags, errs, period, magsarefluxes=False, sigclip=10.0, perioddeltapercent=10, nphasebins=200, mintransitduration=0.01, maxtransitduration=0.4, ingressdurationfraction=0.1, verbose=True)[source]

This calculates the SNR, depth, duration, a refit period, and time of center-transit for a single period.

The equation used for SNR is:

SNR = (transit model depth / RMS of LC with transit model subtracted)
      * sqrt(number of points in transit)
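
As a rough sketch of the SNR equation above: transit_snr is a hypothetical helper, the RMS is taken here as the root-mean-square of the residuals, and the in-transit flags are assumed to be supplied by the caller. The library may compute the scatter and the in-transit selection differently.

```python
from math import sqrt


def transit_snr(mags, model, in_transit, depth):
    """SNR = (depth / RMS of residuals) * sqrt(points in transit).

    Hypothetical helper for illustration; not part of astrobase.
    """
    # Light curve with the transit model subtracted.
    residuals = [m - f for m, f in zip(mags, model)]
    rms = sqrt(sum(r * r for r in residuals) / len(residuals))

    npoints_in_transit = sum(1 for flag in in_transit if flag)
    return (depth / rms) * sqrt(npoints_in_transit)


# Made-up example: a box-shaped dip of depth 2.0 (in magnitudes, so the
# in-transit values are larger) with +/-0.5 scatter around the model.
model = [0.0, 0.0, 2.0, 2.0, 2.0, 2.0, 0.0, 0.0]
mags = [0.5, -0.5, 2.5, 1.5, 2.5, 1.5, 0.5, -0.5]
in_transit = [value > 1.0 for value in model]
snr = transit_snr(mags, model, in_transit, depth=2.0)
```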

NOTE: you should set the kwargs sigclip, nphasebins, mintransitduration, maxtransitduration to what you used for an initial BLS run to detect transits in the input light curve to match those input conditions.

Parameters:
  • times,mags,errs (np.array) – These contain the magnitude/flux time-series and any associated errors.
  • period (float) – The period to search around and refit the transits. This will be used to calculate the start and end periods of a rerun of BLS to calculate the stats.
  • magsarefluxes (bool) – Set to True if the input measurements in mags are actually fluxes and not magnitudes.
  • sigclip (float or int or sequence of two floats/ints or None) –

    If a single float or int, a symmetric sigma-clip will be performed using the number provided as the sigma-multiplier to cut out from the input time-series.

    If a list of two ints/floats is provided, the function will perform an ‘asymmetric’ sigma-clip. The first element in this list is the sigma value to use for fainter flux/mag values; the second element in this list is the sigma value to use for brighter flux/mag values. For example, sigclip=[10., 3.], will sigclip out greater than 10-sigma dimmings and greater than 3-sigma brightenings. Here the meaning of “dimming” and “brightening” is set by physics (not the magnitude system), which is why the magsarefluxes kwarg must be correctly set.

    If sigclip is None, no sigma-clipping will be performed, and the time-series (with non-finite elems removed) will be passed through to the output.

  • perioddeltapercent (float) –

    The width of the period window to search around the provided period, given as a percentage of that period. The period range searched will then be:

    [period - (perioddeltapercent/100.0)*period,
     period + (perioddeltapercent/100.0)*period]
    
  • nphasebins (int) – The number of phase bins to use in the BLS run.
  • mintransitduration (float) – The minimum transit duration in phase to consider.
  • maxtransitduration (float) – The maximum transit duration in phase to consider.
  • ingressdurationfraction (float) – The fraction of the transit duration to use to generate an initial value of the transit ingress duration for the BLS model refit. This will be fit by this function.
  • verbose (bool) – If True, will indicate progress and any problems encountered.
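
The symmetric and asymmetric sigclip conventions described above can be sketched as follows. This is an illustration only: sigclip_mask is a hypothetical helper, and it assumes the median as the center and a plain population standard deviation as the spread, which may not match the estimators astrobase uses.

```python
from statistics import median, pstdev


def sigclip_mask(mags, sigclip, magsarefluxes=False):
    """Return a list of booleans, True where a point survives the clip.

    Hypothetical helper for illustration; not part of astrobase.
    """
    if sigclip is None:
        return [True] * len(mags)

    if isinstance(sigclip, (int, float)):
        # Symmetric clip: same limit for dimmings and brightenings.
        dim_sigma = bright_sigma = float(sigclip)
    else:
        # Asymmetric clip: [dimming limit, brightening limit].
        dim_sigma, bright_sigma = sigclip

    center = median(mags)
    spread = pstdev(mags)

    keep = []
    for value in mags:
        deviation = value - center
        # A dimming is a positive deviation in magnitudes but a negative
        # deviation in fluxes, hence the sign flip.
        dimming = -deviation if magsarefluxes else deviation
        keep.append(-bright_sigma * spread <= dimming <= dim_sigma * spread)
    return keep
```

With sigclip=[10., 3.] and magnitudes as input, a point 4 sigma fainter than the median is kept while a point 4 sigma brighter is removed; with magsarefluxes=True the roles of larger and smaller values swap.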
Returns:

A dict of the following form is returned:

{'period': the refit best period,
 'epoch': the refit epoch (i.e. mid-transit time),
 'snr':the SNR of the transit,
 'transitdepth':the depth of the transit,
 'transitduration':the duration of the transit,
 'ingressduration':the ingress duration, if the trapezoid fit succeeded,
 'npoints_in_transit':the number of LC points in transit,
 'fit_status': 'ok' or 'trapezoid model fit failed,...',
 'nphasebins':the input value of nphasebins,
 'transingressbin':the phase bin containing transit ingress,
 'transegressbin':the phase bin containing transit egress,
 'blsmodel':the full BLS model used along with its parameters,
 'subtractedmags':BLS model - phased light curve,
 'phasedmags':the phased light curve,
 'phases': the phase values}

You should check the ‘fit_status’ key in this returned dict for a value of ‘ok’. If it is ‘trapezoid model fit failed, using box model’, you may not want to trust the transit period and epoch found.

Return type:

dict
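
A minimal sketch of the recommended 'fit_status' check, applied to a list of stats dicts such as the one stored under the 'stats' key by the period-finders above. split_by_fit_status is a hypothetical helper, and the sample dicts hold made-up values with only two of the documented keys.

```python
def split_by_fit_status(stats_list):
    """Return (ok_indices, failed_indices) for a list of stats dicts.

    Hypothetical helper for illustration; not part of astrobase.
    """
    ok, failed = [], []
    for index, stats in enumerate(stats_list):
        # Only an exact 'ok' counts as a successful trapezoid fit.
        if stats.get('fit_status') == 'ok':
            ok.append(index)
        else:
            failed.append(index)
    return ok, failed


# Made-up stats dicts for two candidate periods.
sample_stats = [
    {'period': 3.2, 'fit_status': 'ok'},
    {'period': 6.4,
     'fit_status': 'trapezoid model fit failed, using box model'},
]
ok_indices, failed_indices = split_by_fit_status(sample_stats)
```

Periods listed in failed_indices fell back to the box model, so their refit period and epoch deserve extra scrutiny.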

astrobase.periodbase.kbls.bls_snr(blsdict, times, mags, errs, assumeserialbls=False, magsarefluxes=False, sigclip=10.0, npeaks=None, perioddeltapercent=10, ingressdurationfraction=0.1, verbose=True)[source]

Calculates the signal to noise ratio for each best peak in the BLS periodogram, along with transit depth, duration, and refit period and epoch.

The following equation is used for SNR:

SNR = (transit model depth / RMS of LC with transit model subtracted)
      * sqrt(number of points in transit)
Parameters:
  • blsdict (dict) –

    This is an lspinfo dict produced by either bls_parallel_pfind or bls_serial_pfind in this module, or by your own BLS function. If you provide results in a dict from an external BLS function, make sure this matches the form below:

    {'bestperiod': the best period value in the periodogram,
     'bestlspval': the periodogram peak associated with the best period,
     'nbestpeaks': the input value of nbestpeaks,
     'nbestlspvals': nbestpeaks-size list of best period peak values,
     'nbestperiods': nbestpeaks-size list of best periods,
     'lspvals': the full array of periodogram powers,
     'frequencies': the full array of frequencies considered,
     'periods': the full array of periods considered,
     'blsresult': list of result dicts from eebls.f wrapper functions,
     'stepsize': the actual stepsize used,
     'nfreq': the actual nfreq used,
     'nphasebins': the actual nphasebins used,
     'mintransitduration': the input mintransitduration,
     'maxtransitduration': the input maxtransitduration,
     'method':'bls' -> the name of the period-finder method,
     'kwargs':{ dict of all of the input kwargs for record-keeping}}
    
  • times,mags,errs (np.array) – These contain the magnitude/flux time-series and any associated errors.
  • assumeserialbls (bool) – If this is True, this function will not rerun BLS around each best peak in the input lspinfo dict to refit the periods and epochs. Such a rerun is usually required for results from bls_parallel_pfind, so leave this set to False if you use results from that function. The parallel method breaks up the frequency space into chunks for speed, and its results may not exactly match those from a regular BLS run.
  • magsarefluxes (bool) – Set to True if the input measurements in mags are actually fluxes and not magnitudes.
  • npeaks (int or None) – This controls how many of the periods in blsdict['nbestperiods'] to find the SNR for. If it’s None, then this will calculate the SNR for all of them. If it’s an integer between 1 and len(blsdict['nbestperiods']), it will calculate the SNR for only the specified number of peak periods, starting from the best period.
  • perioddeltapercent (float) –

    The width of the period window to search around the provided period, given as a percentage of that period. The period range searched will then be:

    [period - (perioddeltapercent/100.0)*period,
     period + (perioddeltapercent/100.0)*period]
    
  • ingressdurationfraction (float) – The fraction of the transit duration to use to generate an initial value of the transit ingress duration for the BLS model refit. This will be fit by this function.
  • verbose (bool) – If True, will indicate progress and any problems encountered.
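
The period window implied by perioddeltapercent can be computed directly from the relation given above. refit_period_range is a hypothetical helper name used only for this sketch.

```python
def refit_period_range(period, perioddeltapercent=10):
    """Return (startp, endp) of the window searched around period.

    Hypothetical helper for illustration; not part of astrobase.
    """
    # perioddeltapercent is a percentage, so convert it to a fraction first.
    delta = (perioddeltapercent / 100.0) * period
    return period - delta, period + delta


# Example: a 25 percent window around a 4-day period.
startp, endp = refit_period_range(4.0, perioddeltapercent=25)
```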
Returns:

A dict of the following form is returned:

{'npeaks': the number of periodogram peaks requested to get SNR for,
 'period': list of refit best periods for each requested peak,
 'epoch': list of refit epochs (i.e. mid-transit times),
 'snr':list of SNRs of the transit for each requested peak,
 'transitdepth':list of depths of the transits,
 'transitduration':list of durations of the transits,
 'nphasebins':the input value of nphasebins,
 'transingressbin':the phase bin containing transit ingress,
 'transegressbin':the phase bin containing transit egress,
 'allblsmodels':the full BLS models used along with their parameters,
 'allsubtractedmags':BLS models - phased light curves,
 'allphasedmags':the phased light curves,
 'allphases': the phase values}

Return type:

dict
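
When passing a dict from an external BLS function as blsdict, a quick pre-check against the documented form can catch missing entries early. The sketch below is an assumption-laden convenience: missing_lspinfo_keys is a hypothetical helper, the key list is copied from the blsdict description above, and whether bls_snr actually reads every one of these keys is not stated here.

```python
# Keys listed in the documented blsdict form.
LSPINFO_KEYS = (
    'bestperiod', 'bestlspval', 'nbestpeaks', 'nbestlspvals',
    'nbestperiods', 'lspvals', 'frequencies', 'periods', 'blsresult',
    'stepsize', 'nfreq', 'nphasebins', 'mintransitduration',
    'maxtransitduration', 'method', 'kwargs',
)


def missing_lspinfo_keys(blsdict):
    """Return the documented keys absent from blsdict, in listed order.

    Hypothetical helper for illustration; not part of astrobase.
    """
    return [key for key in LSPINFO_KEYS if key not in blsdict]


# Made-up external result that lacks most of the documented keys.
external_result = {'bestperiod': 3.2, 'nbestperiods': [3.2, 6.4],
                   'method': 'bls'}
missing = missing_lspinfo_keys(external_result)
```

An empty list means the dict at least has the documented shape; it says nothing about whether the values themselves are valid.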