astrobase.lcproc.varthreshold module

This contains functions to investigate where to set a threshold for several variability indices to distinguish between variable and non-variable stars.

astrobase.lcproc.varthreshold.variability_threshold(featuresdir, outfile, magbins=array([ 8., 8.25, 8.5, 8.75, 9., 9.25, 9.5, 9.75, 10., 10.25, 10.5, 10.75, 11., 11.25, 11.5, 11.75, 12., 12.25, 12.5, 12.75, 13., 13.25, 13.5, 13.75, 14., 14.25, 14.5, 14.75, 15., 15.25, 15.5, 15.75, 16. ]), maxobjects=None, timecols=None, magcols=None, errcols=None, lcformat='hat-sql', lcformatdir=None, min_lcmad_stdev=5.0, min_stetj_stdev=2.0, min_iqr_stdev=2.0, min_inveta_stdev=2.0, verbose=True)[source]

This generates a list of objects whose Stetson J, IQR, and 1.0/eta values exceed a threshold, selecting them as potential variable stars.

Use this to pare down the objects to review and put through period-finding. Thresholding is done per magnitude bin, which should work better than a single cut across the entire magnitude range, since photometric scatter generally grows toward fainter magnitudes. Set the magnitude bins using the magbins kwarg.
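To illustrate why a per-bin cut can behave differently from a global one, here is a minimal sketch in plain Python. It is not the astrobase implementation: the helper name flag_by_magbin, the half-open bin convention, and the use of median plus a multiple of the population standard deviation are all assumptions made for this example.

```python
from bisect import bisect_right
from statistics import median, pstdev


def flag_by_magbin(mags, index_values, magbins, nsigma=2.0):
    """Return sorted indices of objects whose variability index exceeds
    median + nsigma * stdev of the objects in the same magnitude bin.

    Hypothetical helper for illustration only. Bins are treated as
    half-open intervals [magbins[i], magbins[i+1]).
    """
    nbins = len(magbins) - 1
    members = [[] for _ in range(nbins)]

    # Assign each object to its magnitude bin; skip objects outside the bins.
    for i, mag in enumerate(mags):
        b = bisect_right(magbins, mag) - 1
        if 0 <= b < nbins:
            members[b].append(i)

    flagged = []
    for idxs in members:
        if len(idxs) < 2:
            continue  # too few objects to characterize the distribution
        vals = [index_values[i] for i in idxs]
        cut = median(vals) + nsigma * pstdev(vals)
        flagged.extend(i for i in idxs if index_values[i] > cut)
    return sorted(flagged)


# Toy data: the faint bin is noisier, and each bin holds one outlier.
mags = [8.1, 8.2, 8.3, 8.4, 8.5, 9.1, 9.2, 9.3, 9.4, 9.5]
vals = [0.01, 0.01, 0.01, 0.01, 0.10, 0.05, 0.05, 0.05, 0.05, 0.50]

per_bin = flag_by_magbin(mags, vals, [8.0, 9.0, 10.0])  # [4, 9]
single_cut = flag_by_magbin(mags, vals, [8.0, 10.0])    # [9]
```

With two bins, both outliers are flagged. With one bin spanning the full range, the noisier faint objects inflate the cut and the bright outlier at index 4 is missed.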

FIXME: implement a voting classifier here. This will choose variables based on the thresholds in IQR, Stetson J, and inveta, using weighting carried over from the variability recovery sims.

Parameters:
  • featuresdir (str) – This is the directory containing variability feature pickles created by astrobase.lcproc.lcpfeatures.parallel_varfeatures() or similar.
  • outfile (str) – This is the output pickle file that will contain all the threshold information.
  • magbins (np.array of floats) – This sets the magnitude bins to use for calculating thresholds.
  • maxobjects (int or None) – This is the number of objects to process. If None, all objects with feature pickles in featuresdir will be processed.
  • timecols (list of str or None) – The timecol keys to use from the lcdict in calculating the thresholds.
  • magcols (list of str or None) – The magcol keys to use from the lcdict in calculating the thresholds.
  • errcols (list of str or None) – The errcol keys to use from the lcdict in calculating the thresholds.
  • lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves.
  • lcformatdir (str or None) – If this is provided, gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
  • min_lcmad_stdev,min_stetj_stdev,min_iqr_stdev,min_inveta_stdev (float or np.array) – These are the standard deviation multipliers for the distributions of the light curve median absolute deviation (MAD), the Stetson J variability index, the light curve interquartile range, and the 1/eta variability index, respectively. These multipliers set the minimum values of these measures to use for selecting variable stars. If provided as floats, the same value will be used for all magbins. If provided as np.arrays of size = magbins.size - 1, they will be used to apply possibly different sigma cuts for each magbin.
  • verbose (bool) – If True, will report progress and warn about any problems.
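The float-or-array convention for the min_*_stdev multipliers can be pictured as expanding every input to one multiplier per bin, where the number of bins is one less than the number of bin edges. The sketch below uses plain Python lists; expand_multiplier is a hypothetical helper written for this illustration, not a function in astrobase, and the actual handling inside the library may differ.

```python
def expand_multiplier(value, magbins):
    """Return one sigma multiplier per magnitude bin.

    Hypothetical helper: a scalar is repeated for every bin, while a
    sequence must already hold exactly len(magbins) - 1 values.
    """
    nbins = len(magbins) - 1  # N bin edges define N - 1 bins

    if isinstance(value, (int, float)):
        return [float(value)] * nbins

    values = [float(v) for v in value]
    if len(values) != nbins:
        raise ValueError(
            "expected %d per-bin multipliers, got %d" % (nbins, len(values))
        )
    return values


magbins = [8.0, 9.0, 10.0, 11.0]                       # 4 edges, 3 bins
same_everywhere = expand_multiplier(2.0, magbins)       # [2.0, 2.0, 2.0]
looser_when_faint = expand_multiplier([2.0, 2.5, 3.0], magbins)
```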
Returns:

Contains all of the variability threshold information along with indices into the array of the object IDs chosen as variables.

Return type:

dict
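A call might look like the following sketch. The wrapper name select_variable_candidates and the multiplier values are assumptions chosen for illustration, and the paths are supplied by the caller. Only keyword arguments that appear in the signature above are used. The keys of the returned dict are not listed in this document, so the example does not rely on any of them.

```python
def select_variable_candidates(featuresdir, outfile):
    """Run per-magnitude-bin thresholding with stricter-than-default cuts.

    Hypothetical wrapper for illustration. astrobase is imported inside
    the function so that defining it does not require the package.
    """
    from astrobase.lcproc.varthreshold import variability_threshold

    # Raise the Stetson J, IQR, and 1/eta multipliers from their default
    # of 2.0 to 3.0 (illustrative values); other kwargs keep their defaults.
    return variability_threshold(
        featuresdir,
        outfile,
        min_stetj_stdev=3.0,
        min_iqr_stdev=3.0,
        min_inveta_stdev=3.0,
        verbose=True,
    )
```

The threshold information is returned as a dict and also written to the pickle named by outfile, which can then be passed to plot_variability_thresholds().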

astrobase.lcproc.varthreshold.plot_variability_thresholds(varthreshpkl, xmin_lcmad_stdev=5.0, xmin_stetj_stdev=2.0, xmin_iqr_stdev=2.0, xmin_inveta_stdev=2.0, lcformat='hat-sql', lcformatdir=None, magcols=None)[source]

This makes plots for the variability threshold distributions.

Parameters:
  • varthreshpkl (str) – The pickle produced by variability_threshold() above.
  • xmin_lcmad_stdev,xmin_stetj_stdev,xmin_iqr_stdev,xmin_inveta_stdev (float or np.array) – Threshold values that override the ones in varthreshpkl. If provided, the thresholds will be plotted accordingly instead of using the ones in the input pickle directly.
  • lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves.
  • lcformatdir (str or None) – If this is provided, gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search for them in. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
  • magcols (list of str or None) – The magcol keys to use from the lcdict.
Returns:

The file name of the threshold plot generated.

Return type:

str
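As a usage sketch, the plotting function takes the pickle path written by variability_threshold(). The wrapper name plot_thresholds and the override value are assumptions for illustration. Only arguments that appear in the signature above are used.

```python
def plot_thresholds(varthreshpkl, stetj_sigma=3.0):
    """Plot the threshold distributions, overriding the Stetson J cut.

    Hypothetical wrapper for illustration. astrobase is imported inside
    the function so that defining it does not require the package.
    """
    from astrobase.lcproc.varthreshold import plot_variability_thresholds

    # The other xmin_* overrides are left at their defaults.
    plotfile = plot_variability_thresholds(
        varthreshpkl,
        xmin_stetj_stdev=stetj_sigma,
    )
    return plotfile  # file name of the generated plot
```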