astrobase.lcproc.lcvfeatures module

This contains functions to generate variability features for large collections of light curves. These features are useful later for variable star classification.

astrobase.lcproc.lcvfeatures.get_varfeatures(lcfile, outdir, timecols=None, magcols=None, errcols=None, mindet=1000, lcformat='hat-sql', lcformatdir=None)[source]

This runs astrobase.varclass.varfeatures.all_nonperiodic_features() on a single LC file.

Parameters:
  • lcfile (str) – The input light curve to process.
  • outdir (str) – The directory where the output variability features pickle will be written.
  • timecols (list of str or None) – The timecol keys to use from the lcdict in calculating the features.
  • magcols (list of str or None) – The magcol keys to use from the lcdict in calculating the features.
  • errcols (list of str or None) – The errcol keys to use from the lcdict in calculating the features.
  • mindet (int) – The minimum number of LC points required to generate variability features.
  • lcformat (str) – The formatkey associated with your light curve format, as previously passed to the lcproc.register_lcformat function. This is used to look up how to read the light curve specified in lcfile.
  • lcformatdir (str or None) – If provided, the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
Returns:

The path to the generated variability features pickle for the input LC, with results for each magcol in the input magcols or in the light curve format’s default magcol list.

Return type:

str
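
The effect of mindet can be illustrated with a small standalone sketch. It assumes that only points finite in time, magnitude, and error count as detections, and that a light curve with exactly mindet such points passes; neither detail is confirmed by the text above. The helper has_enough_detections is hypothetical and not part of astrobase.

```python
import math


def has_enough_detections(times, mags, errs, mindet=1000):
    """Return True if at least `mindet` points are finite in all three columns.

    Hypothetical helper for illustration only; not part of astrobase.
    """
    ngood = sum(
        1
        for t, m, e in zip(times, mags, errs)
        if math.isfinite(t) and math.isfinite(m) and math.isfinite(e)
    )
    # Assumption: the threshold is inclusive (exactly `mindet` points is enough).
    return ngood >= mindet


# Toy light curve: five points, one of which has a NaN magnitude.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
mags = [12.1, 12.0, float("nan"), 12.2, 12.1]
errs = [0.01, 0.01, 0.01, 0.02, 0.01]

enough_for_3 = has_enough_detections(times, mags, errs, mindet=3)
enough_for_5 = has_enough_detections(times, mags, errs, mindet=5)
```

With the default mindet=1000, sparse light curves would be skipped under this reading, so lowering mindet is worth considering for short time series.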

astrobase.lcproc.lcvfeatures.serial_varfeatures(lclist, outdir, maxobjects=None, timecols=None, magcols=None, errcols=None, mindet=1000, lcformat='hat-sql', lcformatdir=None)[source]

This runs variability feature extraction for a list of LCs.

Parameters:
  • lclist (list of str) – The list of light curve file names to process.
  • outdir (str) – The directory where the output varfeatures pickle files will be written.
  • maxobjects (int or None) – The maximum number of LCs to process from lclist. If None, all LCs are processed.
  • timecols (list of str or None) – The timecol keys to use from the lcdict in calculating the features.
  • magcols (list of str or None) – The magcol keys to use from the lcdict in calculating the features.
  • errcols (list of str or None) – The errcol keys to use from the lcdict in calculating the features.
  • mindet (int) – The minimum number of LC points required to generate variability features.
  • lcformat (str) – The formatkey associated with your light curve format, as previously passed to the lcproc.register_lcformat function. This is used to look up how to read the light curves specified in lclist.
  • lcformatdir (str or None) – If provided, the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
Returns:

A list of paths to the generated variability features pickles for the input LCs, with results for each magcol in the input magcols or in the light curve format’s default magcol list.

Return type:

list of str
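
The serial driver pattern described above (truncate the list to maxobjects, then process each LC in turn) can be sketched as follows. This is a minimal illustration, not the library's implementation: run_serial and fake_worker are hypothetical names, the output file naming is made up, and the worker is injected so the sketch runs without astrobase installed.

```python
def run_serial(lclist, outdir, worker, maxobjects=None):
    """Apply `worker(lcfile, outdir)` to each LC in turn and collect the results.

    `worker` is a hypothetical stand-in for a single-LC function such as
    get_varfeatures.
    """
    # Slicing with None keeps the whole list, so maxobjects=None means "all".
    selected = lclist[:maxobjects]
    return [worker(lcfile, outdir) for lcfile in selected]


def fake_worker(lcfile, outdir):
    # Made-up naming scheme, for illustration only.
    return f"{outdir}/varfeatures-{lcfile}.pkl"


results = run_serial(["a.lc", "b.lc", "c.lc"], "out", fake_worker, maxobjects=2)
all_results = run_serial(["a.lc", "b.lc", "c.lc"], "out", fake_worker)
```

The returned list has the same order as the input, which makes it straightforward to pair each output with its source light curve.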

astrobase.lcproc.lcvfeatures.parallel_varfeatures(lclist, outdir, maxobjects=None, timecols=None, magcols=None, errcols=None, mindet=1000, lcformat='hat-sql', lcformatdir=None, nworkers=2)[source]

This runs variability feature extraction in parallel for all LCs in lclist.

Parameters:
  • lclist (list of str) – The list of light curve file names to process.
  • outdir (str) – The directory where the output varfeatures pickle files will be written.
  • maxobjects (int or None) – The maximum number of LCs to process from lclist. If None, all LCs are processed.
  • timecols (list of str or None) – The timecol keys to use from the lcdict in calculating the features.
  • magcols (list of str or None) – The magcol keys to use from the lcdict in calculating the features.
  • errcols (list of str or None) – The errcol keys to use from the lcdict in calculating the features.
  • mindet (int) – The minimum number of LC points required to generate variability features.
  • lcformat (str) – The formatkey associated with your light curve format, as previously passed to the lcproc.register_lcformat function. This is used to look up how to read the light curves specified in lclist.
  • lcformatdir (str or None) – If provided, the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
  • nworkers (int) – The number of parallel workers to launch.
Returns:

A dict mapping each input LC file name to its generated variability features pickle, with results for each magcol in the input magcols or in the light curve format’s default magcol list.

Return type:

dict
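
The shape of the returned dict (input LC file name mapped to its result) can be sketched with the standard library. This is only an illustration under stated assumptions: run_parallel and fake_worker are hypothetical names, and the sketch uses a thread pool from concurrent.futures so that it runs anywhere. For CPU-bound feature extraction a process pool would be the natural substitution; the text above does not say which kind of worker the library launches.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel(lclist, outdir, worker, maxobjects=None, nworkers=2):
    """Run `worker(lcfile, outdir)` concurrently and return {lcfile: result}.

    `worker` is a hypothetical stand-in for a single-LC function such as
    get_varfeatures.
    """
    selected = lclist[:maxobjects]
    with ThreadPoolExecutor(max_workers=nworkers) as executor:
        # executor.map preserves input order, so zip pairs each LC with
        # its own result even if the workers finish out of order.
        results = list(executor.map(lambda lcfile: worker(lcfile, outdir), selected))
    return dict(zip(selected, results))


def fake_worker(lcfile, outdir):
    # Made-up naming scheme, for illustration only.
    return f"{outdir}/varfeatures-{lcfile}.pkl"


resultdict = run_parallel(["a.lc", "b.lc", "c.lc"], "out", fake_worker, nworkers=2)
```

Keying the results by input file name means a caller can look up the output for any particular light curve without tracking list positions.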

astrobase.lcproc.lcvfeatures.parallel_varfeatures_lcdir(lcdir, outdir, fileglob=None, maxobjects=None, timecols=None, magcols=None, errcols=None, recursive=True, mindet=1000, lcformat='hat-sql', lcformatdir=None, nworkers=2)[source]

This runs parallel variability feature extraction for a directory of LCs.

Parameters:
  • lcdir (str) – The directory of light curve files to process.
  • outdir (str) – The directory where the output varfeatures pickle files will be written.
  • fileglob (str or None) – The file glob to use when looking for light curve files in lcdir. If None, the default file glob associated with this LC format will be used.
  • maxobjects (int or None) – The maximum number of LCs to process from lcdir. If None, all LCs found are processed.
  • timecols (list of str or None) – The timecol keys to use from the lcdict in calculating the features.
  • magcols (list of str or None) – The magcol keys to use from the lcdict in calculating the features.
  • errcols (list of str or None) – The errcol keys to use from the lcdict in calculating the features.
  • mindet (int) – The minimum number of LC points required to generate variability features.
  • lcformat (str) – The formatkey associated with your light curve format, as previously passed to the lcproc.register_lcformat function. This is used to look up how to find and read the light curves in lcdir.
  • lcformatdir (str or None) – If provided, the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories lcproc knows to search. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
  • nworkers (int) – The number of parallel workers to launch.
Returns:

A dict mapping each input LC file name to its generated variability features pickle, with results for each magcol in the input magcols or in the light curve format’s default magcol list.

Return type:

dict
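
How fileglob and recursive might interact when collecting light curves from lcdir can be sketched with the standard glob module. This is an assumption about the search behavior, not a description of the library's code: find_lcfiles is a hypothetical helper, and the "*.lc" pattern and directory layout are invented for the demonstration.

```python
import glob
import os
import tempfile


def find_lcfiles(lcdir, fileglob, recursive=True):
    """Return sorted paths under `lcdir` that match `fileglob`.

    Hypothetical helper for illustration only; not part of astrobase.
    """
    if recursive:
        # "**" matches zero or more directory levels when recursive=True,
        # so files directly inside lcdir are found as well.
        pattern = os.path.join(lcdir, "**", fileglob)
    else:
        pattern = os.path.join(lcdir, fileglob)
    return sorted(glob.glob(pattern, recursive=recursive))


# Build a throwaway directory tree to demonstrate the two search modes.
with tempfile.TemporaryDirectory() as lcdir:
    os.makedirs(os.path.join(lcdir, "field1"))
    for relpath in ("a.lc", os.path.join("field1", "b.lc"), "notes.txt"):
        with open(os.path.join(lcdir, relpath), "w") as handle:
            handle.write("")

    deep = [os.path.relpath(p, lcdir) for p in find_lcfiles(lcdir, "*.lc")]
    shallow = [
        os.path.relpath(p, lcdir)
        for p in find_lcfiles(lcdir, "*.lc", recursive=False)
    ]
```

Under this reading, recursive=True picks up light curves in subdirectories, while recursive=False restricts the search to the top level of lcdir.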