astrobase.hatsurveys.hatlc module

This contains functions to read HAT sqlite (“sqlitecurves”) and CSV light curves generated by the new HAT data server.

The most useful functions in this module are:

read_csvlc(lcfile):

    This reads a CSV light curve produced by the HAT data server into an
    lcdict.

    lcfile is the HAT gzipped CSV LC (with a .hatlc.csv.gz extension)

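For example, a minimal sketch (the file name here is hypothetical):

from astrobase.hatsurveys import hatlc

# read a HAT data server CSV light curve into an lcdict
lcdict = hatlc.read_csvlc('HAT-123-0123456-hatlc.csv.gz')

# print the object info and LC metadata to stdout
hatlc.describe(lcdict)
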
And:

read_and_filter_sqlitecurve(lcfile, columns=None, sqlfilters=None,
                            raiseonfail=False, forcerecompress=False):

    This reads a sqlitecurve file and optionally filters it, returning an
    lcdict along with a status message.

    Returns columns requested in columns. If None, then returns all columns
    present in the latest columnlist in the lightcurve. See COLUMNDEFS for
    the full list of HAT LC columns.

    If sqlfilters is not None, it must be a list of text SQL filters that
    apply to the columns in the lightcurve.

    This returns an lcdict with an added 'lcfiltersql' key that indicates
    what the parsed SQL filter string was.

    If forcerecompress = True, will recompress the un-gzipped sqlitecurve
    even if the gzipped form exists on disk already.

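For example, a sketch that reads only a couple of columns with a filter applied (the file, column, and filter names here are illustrative; see COLUMNDEFS for the columns actually available):

from astrobase.hatsurveys import hatlc

# returns a two-element tuple: the lcdict and a status message
lcdict, msg = hatlc.read_and_filter_sqlitecurve(
    'HAT-123-0123456-hatlc.sqlite.gz',
    columns=['rjd', 'aep_000'],
    sqlfilters=['aep_000 < 12.0'],
)

if lcdict is not None:
    print(lcdict['lcfiltersql'])  # the parsed SQL filter string
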
Finally:

describe(lcdict):

    This describes the metadata of the light curve.

Command line usage

You can call this module directly from the command line:

If you just have this file alone:

$ chmod +x hatlc.py
$ ./hatlc.py --help

If astrobase is installed with pip, etc., this will be on your path already:

$ hatlc --help

These should give you the following:

usage: hatlc.py [-h] [--describe] hatlcfile

read a HAT LC of any format and output to stdout

positional arguments:
  hatlcfile   path to the light curve you want to read and pipe to stdout

optional arguments:
  -h, --help  show this help message and exit
  --describe  don't dump the columns, show only object info and LC metadata

Either invocation will dump any recognized HAT LC to stdout (or just dump the description if requested).

Other useful functions

Two other functions that might be useful:

normalize_lcdict(lcdict, timecol='rjd', magcols='all', mingap=4.0,
                 normto='sdssr', debugmode=False):

    This normalizes magnitude columns (specified in the magcols keyword
    argument) in an lcdict obtained from reading a HAT light curve. This
    normalization is done by finding 'timegroups' in each magnitude column,
    assuming that these belong to different 'eras' separated by a specified
    gap in the mingap keyword argument, and thus may be offset vertically
    from one another. Measurements within a timegroup are normalized to zero
    using the median magnitude of the timegroup. Once all timegroups have
    been processed this way, the whole time series is then re-normalized to
    the specified value in the normto keyword argument.

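For example, a sketch assuming lcdict was read by one of the functions above:

from astrobase.hatsurveys import hatlc

# normalize each timegroup to zero, then re-normalize the whole
# time series to the catalog SDSS r magnitude; this modifies the
# lcdict in place and also returns it
lcdict = hatlc.normalize_lcdict(lcdict, timecol='rjd',
                                magcols='all', mingap=4.0,
                                normto='sdssr')
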
And:

normalize_lcdict_byinst(lcdict, magcols='all', normto='sdssr',
                        normkeylist=('stf','ccd','flt','fld','prj','exp'),
                        debugmode=False)

    This normalizes magnitude columns (specified in the magcols keyword
    argument) in an lcdict obtained from reading a HAT light curve. This
    normalization is done by generating a normalization key using columns in
    the lcdict that specify various instrument properties. The default
    normalization key (specified in the normkeylist kwarg) is a combination
    of:

    - HAT station IDs ('stf')
    - camera position ID ('ccd'; useful for HATSouth observations)
    - camera filters ('flt')
    - observed HAT field names ('fld')
    - HAT project IDs ('prj')
    - camera exposure times ('exp')

    with the assumption that measurements with identical normalization keys
    belong to a single 'era'. Measurements within an era are normalized to
    zero using the median magnitude of the era. Once all eras have been
    processed this way, the whole time series is then re-normalized to the
    specified value in the normto keyword argument.

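For example, a sketch that keys the normalization on a subset of the default normkeylist:

from astrobase.hatsurveys import hatlc

# treat each unique (station, filter, project) combination as an
# 'era'; this modifies the lcdict in place and also returns it
lcdict = hatlc.normalize_lcdict_byinst(
    lcdict,
    magcols='all',
    normto='sdssr',
    normkeylist=('stf', 'flt', 'prj'),
)
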
There’s an IPython notebook describing the use of this module and accompanying modules from the astrobase package at:

https://github.com/waqasbhatti/astrobase-notebooks/blob/master/lightcurve-work.ipynb

astrobase.hatsurveys.hatlc.read_and_filter_sqlitecurve(lcfile, columns=None, sqlfilters=None, raiseonfail=False, returnarrays=True, forcerecompress=False, quiet=True)

This reads a HAT sqlitecurve and optionally filters it.

Parameters:
  • lcfile (str) – The path to the HAT sqlitecurve file.
  • columns (list) – A list of columns to extract from the light curve file. If None, then returns all columns present in the latest columnlist in the light curve.
  • sqlfilters (list of str) – If not None, it must be a list of text SQL filters that apply to the columns in the lightcurve.
  • raiseonfail (bool) – If this is True, an Exception when reading the LC will crash the function instead of failing silently and returning None as the result.
  • returnarrays (bool) – If this is True, the output lcdict contains columns as np.arrays instead of lists. You generally want this to be True.
  • forcerecompress (bool) – If True, the sqlitecurve will be recompressed even if a compressed version of it is found. This usually happens when sqlitecurve opening is interrupted by the OS for some reason, leaving behind both a gzipped and an un-gzipped copy. By default, this function refuses to overwrite the existing gzipped version, so that if the un-gzipped version is corrupt but the gzipped one isn’t, the light curve can still be recovered.
  • quiet (bool) – If True, will not warn about any problems, even if the light curve reading fails (the only clue then will be the return value of None). This is useful for batch processing of large numbers of light curves.
Returns:

tuple – A two-element tuple is returned, with the first element being the lcdict and the second being a status message.

Return type:

(lcdict, status_message)

astrobase.hatsurveys.hatlc.describe(lcdict, returndesc=False, offsetwith=None)

This describes the light curve object and columns present.

Parameters:
  • lcdict (dict) – The input lcdict to parse for column and metadata info.
  • returndesc (bool) – If True, returns the description string as an str instead of just printing it to stdout.
  • offsetwith (str) – This is a character to offset the output description lines by. This is useful to add comment characters like ‘#’ to the output description lines.
Returns:

If returndesc is True, returns the description lines as a str, otherwise returns nothing.

Return type:

str or None
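
For example, a sketch that prefixes each description line with a comment character:

from astrobase.hatsurveys import hatlc

# returndesc=True returns the description as a str instead of
# printing it; offsetwith='#' prefixes each line so the output
# can be pasted into a commented file header
desc = hatlc.describe(lcdict, returndesc=True, offsetwith='#')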

astrobase.hatsurveys.hatlc.read_lcc_csvlc(lcfile)

This reads a CSV LC produced by an LCC-Server instance.

Parameters:
  • lcfile (str) – The LC file to read.
Returns:

Returns an lcdict that’s readable by most astrobase functions for further processing.

Return type:

dict

astrobase.hatsurveys.hatlc.describe_lcc_csv(lcdict, returndesc=False)

This describes the LCC CSV format light curve file.

Parameters:
  • lcdict (dict) – The input lcdict to parse for column and metadata info.
  • returndesc (bool) – If True, returns the description string as an str instead of just printing it to stdout.
Returns:

If returndesc is True, returns the description lines as a str, otherwise returns nothing.

Return type:

str or None
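
For example, a sketch for an LCC-Server CSV light curve (the file name here is hypothetical):

from astrobase.hatsurveys import hatlc

# read the LCC-Server CSV and print its object info and metadata
lcdict = hatlc.read_lcc_csvlc('object-csvlc.gz')
hatlc.describe_lcc_csv(lcdict)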

astrobase.hatsurveys.hatlc.read_csvlc(lcfile)

This reads a HAT data server or LCC-Server produced CSV light curve into an lcdict.

This will automatically figure out the format of the file provided. Currently, it can read:

  • CSV light curves produced by the HAT data server (.hatlc.csv.gz files)
  • CSV light curves produced by an LCC-Server instance

Parameters:
  • lcfile (str) – The light curve file to read.
Returns:

Returns an lcdict that can be read and used by many astrobase processing functions.

Return type:

dict

astrobase.hatsurveys.hatlc.find_lc_timegroups(lctimes, mingap=4.0)

This finds the time gaps in the light curve, so we can figure out which times are for consecutive observations and which represent gaps between seasons.

Parameters:
  • lctimes (np.array) – This is the input array of times, assumed to be in some form of JD.
  • mingap (float) – The minimum difference (in days) between consecutive measurements required to consider them as parts of different timegroups. By default, this is set to 4.0 days.
Returns:

A tuple of the form below is returned, containing the number of time groups found and Python slice objects for each group:

(ngroups, [slice(start_ind_1, end_ind_1), ...])

Return type:

tuple
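
For example, a minimal sketch with two synthetic observing seasons:

import numpy as np
from astrobase.hatsurveys.hatlc import find_lc_timegroups

# two groups of times separated by a 100-day gap
times = np.concatenate([np.arange(0.0, 10.0, 0.1),
                        np.arange(110.0, 120.0, 0.1)])

ngroups, groups = find_lc_timegroups(times, mingap=4.0)
print(ngroups)           # -> 2
print(times[groups[0]])  # the times in the first group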

astrobase.hatsurveys.hatlc.normalize_lcdict(lcdict, timecol='rjd', magcols='all', mingap=4.0, normto='sdssr', debugmode=False, quiet=False)

This normalizes magcols in lcdict using timecol to find timegroups.

Parameters:
  • lcdict (dict) – The input lcdict to process.
  • timecol (str) – The key in the lcdict that is to be used to extract the time column.
  • magcols ('all' or list of str) – If this is ‘all’, all of the columns in the lcdict that are indicated to be magnitude measurement columns are normalized. If this is a list of str, must contain the keys of the lcdict specifying which magnitude columns will be normalized.
  • mingap (float) – The minimum difference (in days) between consecutive measurements required to consider them as parts of different timegroups. By default, this is set to 4.0 days.
  • normto ({'globalmedian', 'zero', 'jmag', 'hmag', 'kmag', 'bmag', 'vmag', 'sdssg', 'sdssr', 'sdssi'}) – This indicates which column will be the normalization target. If this is ‘globalmedian’, the normalization will be to the global median of each LC column. If this is ‘zero’, will normalize to 0.0 for each LC column. Otherwise, will normalize to the value of one of the other keys in the lcdict[‘objectinfo’][magkey], meaning the normalization will be to some form of catalog magnitude.
  • debugmode (bool) – If True, will indicate progress as time-groups are found and processed.
  • quiet (bool) – If True, will not emit any messages when processing.
Returns:

Returns the lcdict with the magnitude measurements normalized as specified. The normalization happens IN PLACE.

Return type:

dict

astrobase.hatsurveys.hatlc.normalize_lcdict_byinst(lcdict, magcols='all', normto='sdssr', normkeylist=('stf', 'ccd', 'flt', 'fld', 'prj', 'exp'), debugmode=False, quiet=False)

This is a function to normalize light curves across all instrument combinations present.

Use this to normalize a light curve containing a variety of:

  • HAT station IDs (‘stf’)
  • camera IDs (‘ccd’)
  • filters (‘flt’)
  • observed field names (‘fld’)
  • HAT project IDs (‘prj’)
  • exposure times (‘exp’)
Parameters:
  • lcdict (dict) – The input lcdict to process.
  • magcols ('all' or list of str) – If this is ‘all’, all of the columns in the lcdict that are indicated to be magnitude measurement columns are normalized. If this is a list of str, must contain the keys of the lcdict specifying which magnitude columns will be normalized.
  • normto ({'zero', 'jmag', 'hmag', 'kmag', 'bmag', 'vmag', 'sdssg', 'sdssr', 'sdssi'}) – This indicates which column will be the normalization target. If this is ‘zero’, will normalize to 0.0 for each LC column. Otherwise, will normalize to the value of one of the other keys in the lcdict[‘objectinfo’][magkey], meaning the normalization will be to some form of catalog magnitude.
  • normkeylist (list of str) – These are the column keys to use to form the normalization index. Measurements in the specified magcols with identical normalization index values will be considered as part of a single measurement ‘era’, and will be normalized to zero. Once all eras have been normalized this way, the final light curve will be re-normalized as specified in normto.
  • debugmode (bool) – If True, will indicate progress as eras are found and processed.
  • quiet (bool) – If True, will not emit any messages when processing.
Returns:

Returns the lcdict with the magnitude measurements normalized as specified. The normalization happens IN PLACE.

Return type:

dict

astrobase.hatsurveys.hatlc.main()

This is called when we’re executed from the commandline.

The current usage from the command-line is described below:

usage: hatlc [-h] [--describe] hatlcfile

read a HAT LC of any format and output to stdout

positional arguments:
  hatlcfile   path to the light curve you want to read and pipe to stdout

optional arguments:
  -h, --help  show this help message and exit
  --describe  don't dump the columns, show only object info and LC metadata