astrobase.lcproc.lcsfeatures module

This module contains functions that extract star magnitude and color features for large numbers of light curves. These features are useful later for variable star classification.

astrobase.lcproc.lcsfeatures.get_starfeatures(lcfile, outdir, kdtree, objlist, lcflist, neighbor_radius_arcsec, deredden=True, custom_bandpasses=None, lcformat='hat-sql', lcformatdir=None)[source]

This runs the functions from the astrobase.varclass.starfeatures module on a single light curve file.

Parameters:
  • lcfile (str) – This is the LC file to extract star features for.
  • outdir (str) – This is the directory to write the output pickle to.
  • kdtree (scipy.spatial.cKDTree) – This is a scipy.spatial.KDTree or cKDTree used to calculate neighbor proximity features. This is for the light curve catalog this object is in.
  • objlist (np.array) – This is a Numpy array of object IDs in the same order as the kdtree.data np.array. This is for the light curve catalog this object is in.
  • lcflist (np.array) – This is a Numpy array of light curve filenames in the same order as kdtree.data. This is for the light curve catalog this object is in.
  • neighbor_radius_arcsec (float) – This indicates the radius in arcsec to search for neighbors for this object using the light curve catalog’s kdtree, objlist, lcflist, and in GAIA.
  • deredden (bool) – This controls if the colors and any color classifications will be dereddened using 2MASS DUST.
  • custom_bandpasses (dict or None) –

    This is a dict used to define any custom bandpasses in the in_objectinfo dict you want to make this function aware of and generate colors for. Use the format below for this dict:

    {
    '<bandpass_key_1>':{'dustkey':'<twomass_dust_key_1>',
                        'label':'<band_label_1>',
                        'colors':[['<bandkey1>-<bandkey2>',
                                   '<BAND1> - <BAND2>'],
                                  ['<bandkey3>-<bandkey4>',
                                   '<BAND3> - <BAND4>']]},
    .
    ...
    .
    '<bandpass_key_N>':{'dustkey':'<twomass_dust_key_N>',
                        'label':'<band_label_N>',
                        'colors':[['<bandkey1>-<bandkey2>',
                                   '<BAND1> - <BAND2>'],
                                  ['<bandkey3>-<bandkey4>',
                                   '<BAND3> - <BAND4>']]},
    }
    

    Where:

    bandpass_key is a key to use to refer to this bandpass in the objectinfo dict, e.g. ‘sdssg’ for SDSS g band

    twomass_dust_key is the key to use in the 2MASS DUST result table for reddening per band-pass. For example, given the following DUST result table (using http://irsa.ipac.caltech.edu/applications/DUST/):

    |Filter_name|LamEff |A_over_E_B_V_SandF|A_SandF|A_over_E_B_V_SFD|A_SFD|
    |char       |float  |float             |float  |float           |float|
    |           |microns|                  |mags   |                |mags |
     CTIO U       0.3734              4.107   0.209            4.968 0.253
     CTIO B       0.4309              3.641   0.186            4.325 0.221
     CTIO V       0.5517              2.682   0.137            3.240 0.165
    .
    .
    ...
    

    The twomass_dust_key for ‘vmag’ would be ‘CTIO V’. If you want to skip DUST lookup and want to pass in a specific reddening magnitude for your bandpass, use a float for the value of twomass_dust_key. If you want to skip DUST lookup entirely for this bandpass, use None for the value of twomass_dust_key.

    band_label is the label to use for this bandpass, e.g. ‘W1’ for WISE-1 band, ‘u’ for SDSS u, etc.

    The ‘colors’ list contains color definitions for all colors you want to generate using this bandpass. This list contains elements of the form:

    ['<bandkey1>-<bandkey2>','<BAND1> - <BAND2>']
    

    where the first item gives the pair of bandpass keys making up this color, and the second item is the label for this color to be used by the frontends. An example:

    ['sdssu-sdssg','u - g']
    
  • lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curve specified in lcfile.
  • lcformatdir (str or None) – If this is provided, it gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories that lcproc searches. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
Returns:

Path to the output pickle containing all of the star features for this object.

Return type:

str
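
As a rough illustration of the custom_bandpasses layout described above, the sketch below builds a small dict and checks that every entry carries the three documented keys. The helper check_custom_bandpasses, the band keys 'band1' and 'band2', their labels, and the reddening value 0.25 are hypothetical and are not part of astrobase; only 'vmag' with the 'CTIO V' dust key comes from the description above.

```python
# Keys that every bandpass entry is documented to carry.
REQUIRED_KEYS = ('dustkey', 'label', 'colors')

custom_bandpasses = {
    # dustkey is a Filter_name from the DUST result table
    'vmag': {'dustkey': 'CTIO V', 'label': 'V',
             'colors': [['vmag-band1', 'V - B1']]},
    # a float passes in a specific reddening magnitude (hypothetical value)
    'band1': {'dustkey': 0.25, 'label': 'B1',
              'colors': [['band1-band2', 'B1 - B2']]},
    # None skips the DUST lookup entirely for this bandpass
    'band2': {'dustkey': None, 'label': 'B2', 'colors': []},
}


def check_custom_bandpasses(bandpasses):
    """Return a list of problems; an empty list means the layout looks OK.

    Hypothetical helper for illustration only, not an astrobase function.
    """
    problems = []
    for bandkey, spec in bandpasses.items():
        for key in REQUIRED_KEYS:
            if key not in spec:
                problems.append(f"{bandkey}: missing '{key}'")
        dustkey = spec.get('dustkey')
        if dustkey is not None and not isinstance(dustkey, (str, float)):
            problems.append(f"{bandkey}: dustkey must be str, float, or None")
        for color in spec.get('colors', []):
            # each color is ['<bandkey1>-<bandkey2>', '<label>']
            if len(color) != 2 or '-' not in color[0]:
                problems.append(f"{bandkey}: bad color definition {color!r}")
    return problems


print(check_custom_bandpasses(custom_bandpasses))
```

A dict that passes this check could then be handed to get_starfeatures through its custom_bandpasses argument, together with the kdtree, objlist, and lcflist taken from the light curve catalog.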

astrobase.lcproc.lcsfeatures.serial_starfeatures(lclist, outdir, lc_catalog_pickle, neighbor_radius_arcsec, maxobjects=None, deredden=True, custom_bandpasses=None, lcformat='hat-sql', lcformatdir=None)[source]

This drives the get_starfeatures function for a collection of LCs.

Parameters:
  • lclist (list of str) – The list of light curve file names to process.
  • outdir (str) – The output directory where the results will be placed.
  • lc_catalog_pickle (str) –

    The path to a catalog pickle containing a dict with at least:

    • an object ID array accessible with dict[‘objects’][‘objectid’]
    • an LC filename array accessible with dict[‘objects’][‘lcfname’]
    • a scipy.spatial.KDTree or cKDTree object to use for finding neighbors for each object accessible with dict[‘kdtree’]

    A catalog pickle of the form needed can be produced using astrobase.lcproc.catalogs.make_lclist() or astrobase.lcproc.catalogs.filter_lclist().

  • neighbor_radius_arcsec (float) – This indicates the radius in arcsec to search for neighbors for this object using the light curve catalog’s kdtree, objlist, lcflist, and in GAIA.
  • maxobjects (int or None) – The number of objects to process from lclist.
  • deredden (bool) – This controls if the colors and any color classifications will be dereddened using 2MASS DUST.
  • custom_bandpasses (dict or None) –

    This is a dict used to define any custom bandpasses in the in_objectinfo dict you want to make this function aware of and generate colors for. Use the format below for this dict:

    {
    '<bandpass_key_1>':{'dustkey':'<twomass_dust_key_1>',
                        'label':'<band_label_1>',
                        'colors':[['<bandkey1>-<bandkey2>',
                                   '<BAND1> - <BAND2>'],
                                  ['<bandkey3>-<bandkey4>',
                                   '<BAND3> - <BAND4>']]},
    .
    ...
    .
    '<bandpass_key_N>':{'dustkey':'<twomass_dust_key_N>',
                        'label':'<band_label_N>',
                        'colors':[['<bandkey1>-<bandkey2>',
                                   '<BAND1> - <BAND2>'],
                                  ['<bandkey3>-<bandkey4>',
                                   '<BAND3> - <BAND4>']]},
    }
    

    Where:

    bandpass_key is a key to use to refer to this bandpass in the objectinfo dict, e.g. ‘sdssg’ for SDSS g band

    twomass_dust_key is the key to use in the 2MASS DUST result table for reddening per band-pass. For example, given the following DUST result table (using http://irsa.ipac.caltech.edu/applications/DUST/):

    |Filter_name|LamEff |A_over_E_B_V_SandF|A_SandF|A_over_E_B_V_SFD|A_SFD|
    |char       |float  |float             |float  |float           |float|
    |           |microns|                  |mags   |                |mags |
     CTIO U       0.3734              4.107   0.209            4.968 0.253
     CTIO B       0.4309              3.641   0.186            4.325 0.221
     CTIO V       0.5517              2.682   0.137            3.240 0.165
    .
    .
    ...
    

    The twomass_dust_key for ‘vmag’ would be ‘CTIO V’. If you want to skip DUST lookup and want to pass in a specific reddening magnitude for your bandpass, use a float for the value of twomass_dust_key. If you want to skip DUST lookup entirely for this bandpass, use None for the value of twomass_dust_key.

    band_label is the label to use for this bandpass, e.g. ‘W1’ for WISE-1 band, ‘u’ for SDSS u, etc.

    The ‘colors’ list contains color definitions for all colors you want to generate using this bandpass. This list contains elements of the form:

    ['<bandkey1>-<bandkey2>','<BAND1> - <BAND2>']
    

    where the first item gives the pair of bandpass keys making up this color, and the second item is the label for this color to be used by the frontends. An example:

    ['sdssu-sdssg','u - g']
    
  • lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in lclist.
  • lcformatdir (str or None) – If this is provided, it gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories that lcproc searches. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
Returns:

A list of all star features pickles produced.

Return type:

list of str
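
Before starting a long run, it may help to confirm that the catalog pickle has the entries listed under lc_catalog_pickle above. The following is a minimal sketch; missing_catalog_entries and load_and_check are hypothetical helpers and not part of astrobase.

```python
import pickle


def missing_catalog_entries(catalog):
    """List the documented catalog entries that are absent from `catalog`.

    Hypothetical helper: checks for catalog['objects']['objectid'],
    catalog['objects']['lcfname'], and catalog['kdtree'].
    """
    missing = []
    objects = catalog.get('objects', {})
    for column in ('objectid', 'lcfname'):
        if column not in objects:
            missing.append(f"objects/{column}")
    if 'kdtree' not in catalog:
        missing.append('kdtree')
    return missing


def load_and_check(lc_catalog_pickle):
    """Load a catalog pickle from disk and report any missing entries."""
    with open(lc_catalog_pickle, 'rb') as infd:
        catalog = pickle.load(infd)
    return missing_catalog_entries(catalog)


# A stand-in catalog; a real one would hold arrays and a KDTree object.
example = {'objects': {'objectid': ['obj-1'], 'lcfname': ['obj-1.lc']}}
print(missing_catalog_entries(example))
```

An empty list suggests the pickle has the minimum structure these functions expect. It does not verify that the arrays are in the same order as kdtree.data.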

astrobase.lcproc.lcsfeatures.parallel_starfeatures(lclist, outdir, lc_catalog_pickle, neighbor_radius_arcsec, maxobjects=None, deredden=True, custom_bandpasses=None, lcformat='hat-sql', lcformatdir=None, nworkers=2)[source]

This runs get_starfeatures in parallel for all light curves in lclist.

Parameters:
  • lclist (list of str) – The list of light curve file names to process.
  • outdir (str) – The output directory where the results will be placed.
  • lc_catalog_pickle (str) –

    The path to a catalog pickle containing a dict with at least:

    • an object ID array accessible with dict[‘objects’][‘objectid’]
    • an LC filename array accessible with dict[‘objects’][‘lcfname’]
    • a scipy.spatial.KDTree or cKDTree object to use for finding neighbors for each object accessible with dict[‘kdtree’]

    A catalog pickle of the form needed can be produced using astrobase.lcproc.catalogs.make_lclist() or astrobase.lcproc.catalogs.filter_lclist().

  • neighbor_radius_arcsec (float) – This indicates the radius in arcsec to search for neighbors for this object using the light curve catalog’s kdtree, objlist, lcflist, and in GAIA.
  • maxobjects (int or None) – The number of objects to process from lclist.
  • deredden (bool) – This controls if the colors and any color classifications will be dereddened using 2MASS DUST.
  • custom_bandpasses (dict or None) –

    This is a dict used to define any custom bandpasses in the in_objectinfo dict you want to make this function aware of and generate colors for. Use the format below for this dict:

    {
    '<bandpass_key_1>':{'dustkey':'<twomass_dust_key_1>',
                        'label':'<band_label_1>',
                        'colors':[['<bandkey1>-<bandkey2>',
                                   '<BAND1> - <BAND2>'],
                                  ['<bandkey3>-<bandkey4>',
                                   '<BAND3> - <BAND4>']]},
    .
    ...
    .
    '<bandpass_key_N>':{'dustkey':'<twomass_dust_key_N>',
                        'label':'<band_label_N>',
                        'colors':[['<bandkey1>-<bandkey2>',
                                   '<BAND1> - <BAND2>'],
                                  ['<bandkey3>-<bandkey4>',
                                   '<BAND3> - <BAND4>']]},
    }
    

    Where:

    bandpass_key is a key to use to refer to this bandpass in the objectinfo dict, e.g. ‘sdssg’ for SDSS g band

    twomass_dust_key is the key to use in the 2MASS DUST result table for reddening per band-pass. For example, given the following DUST result table (using http://irsa.ipac.caltech.edu/applications/DUST/):

    |Filter_name|LamEff |A_over_E_B_V_SandF|A_SandF|A_over_E_B_V_SFD|A_SFD|
    |char       |float  |float             |float  |float           |float|
    |           |microns|                  |mags   |                |mags |
     CTIO U       0.3734              4.107   0.209            4.968 0.253
     CTIO B       0.4309              3.641   0.186            4.325 0.221
     CTIO V       0.5517              2.682   0.137            3.240 0.165
    .
    .
    ...
    

    The twomass_dust_key for ‘vmag’ would be ‘CTIO V’. If you want to skip DUST lookup and want to pass in a specific reddening magnitude for your bandpass, use a float for the value of twomass_dust_key. If you want to skip DUST lookup entirely for this bandpass, use None for the value of twomass_dust_key.

    band_label is the label to use for this bandpass, e.g. ‘W1’ for WISE-1 band, ‘u’ for SDSS u, etc.

    The ‘colors’ list contains color definitions for all colors you want to generate using this bandpass. This list contains elements of the form:

    ['<bandkey1>-<bandkey2>','<BAND1> - <BAND2>']
    

    where the first item gives the pair of bandpass keys making up this color, and the second item is the label for this color to be used by the frontends. An example:

    ['sdssu-sdssg','u - g']
    
  • lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves specified in lclist.
  • lcformatdir (str or None) – If this is provided, it gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories that lcproc searches. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
  • nworkers (int) – The number of parallel workers to launch.
Returns:

A dict with key:val pairs of the input light curve filename and the output star features pickle for each LC processed.

Return type:

dict
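
The general shape of such a parallel driver can be sketched as follows: truncate the list to maxobjects, fan the files out to nworkers workers, and collect a dict keyed by input filename. This is not astrobase's implementation. The worker here is a stand-in that only builds a hypothetical output path, a thread pool is used to keep the example self-contained, and the 'starfeatures-<name>.pkl' naming is an assumption.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def fake_starfeatures_worker(task):
    """Stand-in worker: returns a hypothetical output pickle path.

    A real worker would extract the star features and write the pickle.
    """
    lcfile, outdir = task
    basename = os.path.splitext(os.path.basename(lcfile))[0]
    return os.path.join(outdir, f"starfeatures-{basename}.pkl")


def run_in_parallel(lclist, outdir, maxobjects=None, nworkers=2):
    """Map each light curve in lclist to its output pickle path."""
    # Assumption: maxobjects keeps only the first N entries of lclist.
    if maxobjects is not None:
        lclist = lclist[:maxobjects]
    tasks = [(lcfile, outdir) for lcfile in lclist]
    with ThreadPoolExecutor(max_workers=nworkers) as executor:
        # executor.map preserves input order, so zip pairs up correctly
        outputs = list(executor.map(fake_starfeatures_worker, tasks))
    return dict(zip(lclist, outputs))


results = run_in_parallel(
    ['lcs/obj1.csv', 'lcs/obj2.csv', 'lcs/obj3.csv'],
    'features',
    maxobjects=2,
)
for lcfile, featurepkl in results.items():
    print(lcfile, '->', featurepkl)
```

The returned mapping makes it straightforward to look up the features pickle for any given input light curve after the run.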

astrobase.lcproc.lcsfeatures.parallel_starfeatures_lcdir(lcdir, outdir, lc_catalog_pickle, neighbor_radius_arcsec, fileglob=None, maxobjects=None, deredden=True, custom_bandpasses=None, lcformat='hat-sql', lcformatdir=None, nworkers=2, recursive=True)[source]

This runs parallel star feature extraction for a directory of LCs.

Parameters:
  • lcdir (str) – The directory to search for light curves.
  • outdir (str) – The output directory where the results will be placed.
  • lc_catalog_pickle (str) –

    The path to a catalog pickle containing a dict with at least:

    • an object ID array accessible with dict[‘objects’][‘objectid’]
    • an LC filename array accessible with dict[‘objects’][‘lcfname’]
    • a scipy.spatial.KDTree or cKDTree object to use for finding neighbors for each object accessible with dict[‘kdtree’]

    A catalog pickle of the form needed can be produced using astrobase.lcproc.catalogs.make_lclist() or astrobase.lcproc.catalogs.filter_lclist().

  • neighbor_radius_arcsec (float) – This indicates the radius in arcsec to search for neighbors for this object using the light curve catalog’s kdtree, objlist, lcflist, and in GAIA.
  • fileglob (str or None) – The UNIX file glob to use to search for the light curves in lcdir. If None, the default value for the light curve format specified will be used.
  • maxobjects (int or None) – The number of objects to process from the light curves found in lcdir.
  • deredden (bool) – This controls if the colors and any color classifications will be dereddened using 2MASS DUST.
  • custom_bandpasses (dict or None) –

    This is a dict used to define any custom bandpasses in the in_objectinfo dict you want to make this function aware of and generate colors for. Use the format below for this dict:

    {
    '<bandpass_key_1>':{'dustkey':'<twomass_dust_key_1>',
                        'label':'<band_label_1>',
                        'colors':[['<bandkey1>-<bandkey2>',
                                   '<BAND1> - <BAND2>'],
                                  ['<bandkey3>-<bandkey4>',
                                   '<BAND3> - <BAND4>']]},
    .
    ...
    .
    '<bandpass_key_N>':{'dustkey':'<twomass_dust_key_N>',
                        'label':'<band_label_N>',
                        'colors':[['<bandkey1>-<bandkey2>',
                                   '<BAND1> - <BAND2>'],
                                  ['<bandkey3>-<bandkey4>',
                                   '<BAND3> - <BAND4>']]},
    }
    

    Where:

    bandpass_key is a key to use to refer to this bandpass in the objectinfo dict, e.g. ‘sdssg’ for SDSS g band

    twomass_dust_key is the key to use in the 2MASS DUST result table for reddening per band-pass. For example, given the following DUST result table (using http://irsa.ipac.caltech.edu/applications/DUST/):

    |Filter_name|LamEff |A_over_E_B_V_SandF|A_SandF|A_over_E_B_V_SFD|A_SFD|
    |char       |float  |float             |float  |float           |float|
    |           |microns|                  |mags   |                |mags |
     CTIO U       0.3734              4.107   0.209            4.968 0.253
     CTIO B       0.4309              3.641   0.186            4.325 0.221
     CTIO V       0.5517              2.682   0.137            3.240 0.165
    .
    .
    ...
    

    The twomass_dust_key for ‘vmag’ would be ‘CTIO V’. If you want to skip DUST lookup and want to pass in a specific reddening magnitude for your bandpass, use a float for the value of twomass_dust_key. If you want to skip DUST lookup entirely for this bandpass, use None for the value of twomass_dust_key.

    band_label is the label to use for this bandpass, e.g. ‘W1’ for WISE-1 band, ‘u’ for SDSS u, etc.

    The ‘colors’ list contains color definitions for all colors you want to generate using this bandpass. This list contains elements of the form:

    ['<bandkey1>-<bandkey2>','<BAND1> - <BAND2>']
    

    where the first item gives the pair of bandpass keys making up this color, and the second item is the label for this color to be used by the frontends. An example:

    ['sdssu-sdssg','u - g']
    
  • lcformat (str) – This is the formatkey associated with your light curve format, which you previously passed in to the lcproc.register_lcformat function. This will be used to look up how to find and read the light curves found in lcdir.
  • lcformatdir (str or None) – If this is provided, it gives the path to a directory where you’ve stored your lcformat description JSONs, other than the usual directories that lcproc searches. Use this along with lcformat to specify an LC format JSON file that’s not currently registered with lcproc.
  • nworkers (int) – The number of parallel workers to launch.
Returns:

A dict with key:val pairs of the input light curve filename and the output star features pickle for each LC processed.

Return type:

dict
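
To preview which files a given fileglob would pick up in lcdir, a search along the following lines may be useful. This is a sketch using the standard glob module. find_light_curves is a hypothetical helper, and the assumption that recursive=True corresponds to a '**' pattern is ours, not something the documentation above confirms.

```python
import glob
import os
import tempfile


def find_light_curves(lcdir, fileglob, recursive=True):
    """Return sorted paths in lcdir matching fileglob (hypothetical helper)."""
    if recursive:
        # '**' matches lcdir itself and any depth of subdirectories
        pattern = os.path.join(lcdir, '**', fileglob)
    else:
        pattern = os.path.join(lcdir, fileglob)
    return sorted(glob.glob(pattern, recursive=recursive))


# Demonstrate on a throwaway directory tree with made-up file names.
with tempfile.TemporaryDirectory() as lcdir:
    os.makedirs(os.path.join(lcdir, 'sub'))
    for relpath in ('a.csv', os.path.join('sub', 'b.csv'), 'notes.txt'):
        open(os.path.join(lcdir, relpath), 'w').close()

    top_only = [os.path.basename(f)
                for f in find_light_curves(lcdir, '*.csv', recursive=False)]
    everything = [os.path.basename(f)
                  for f in find_light_curves(lcdir, '*.csv')]

print(top_only)
print(everything)
```

Checking the match list first avoids launching workers on an empty or unexpectedly large set of files.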