Fetch JWST TSO programs from STScI

This Python script shows how to fetch and parse the entire catalog of JWST TSO prgrams from the STScI website.

This tutorial is split into three sections:


1. Fetch TSO files

STScI provides the information for each JWST program in two files:

  • the Astronomers Proposal Tool (APT) file (in .aptx format)
  • the status file (in .xml format)

The APT files have to be then converted .xml format to be further parsed. This can be done using the STScI’s APT software. This will be done under the hood by Gen TSO, all we need to set is the path to the APT executable:

After you installed APT, search in your Application folders. The executable will be located inside the bin subfolder. We will need to get the absolute path to it, It should be something like:

/Applications/APT\\ 2024.7.1/bin/apt

The following script shows how to fetch the APT and status for the programs:

import gen_tso.catalogs as cat
from gen_tso.utils import KNOWN_PROGRAMS
# KNOWN_PROGRAMS is a list containing all known TSO PIDs up to cycle 4


# Path to APT executable:
apt_command = '/Applications/APT\\ 2025.1/bin/apt'

# Fetch all TSO programs:
programs = KNOWN_PROGRAMS
cat.fetch_jwst_programs(programs, apt_command)

After you installed APT on Linux, search for the folder where it was installed. The path to the executable should be something like this:

/home/user/ast/apt/APT_2025.1/APT

The following script shows how to fetch the APT and status for the programs:

import gen_tso.catalogs as cat
from gen_tso.utils import KNOWN_PROGRAMS
# KNOWN_PROGRAMS is a list containing all known TSO PIDs up to cycle 4


# Path to APT executable:
apt_command = '/home/user/ast/apt/APT_2024.5.2/APT'

# Fetch all TSO programs:
programs = KNOWN_PROGRAMS
cat.fetch_jwst_programs(programs, apt_command)

Either on a Linux or OSX machine, you can create an environmental variable in your bash pointing to the APT executable, e.g.:

export APT='/Applications/APT\ 2025.1/bin/apt'

then you can directly use that environmental variable in the script. The following script shows how to fetch the APT and status for the programs:

import gen_tso.catalogs as cat
from gen_tso.utils import KNOWN_PROGRAMS
# KNOWN_PROGRAMS is a list containing all known TSO PIDs up to cycle 4


# APT's environmental variable:
apt_command = 'APT'

# Fetch all TSO programs:
programs = KNOWN_PROGRAMS
cat.fetch_jwst_programs(programs, apt_command)

To process one program takes a few seconds. Thus, to process the entire catalog will take some ~10 min.

Fetch specific programs

If you need the observations from a single program, you can use the programs argument with the desired PID(s):

import gen_tso.catalogs as cat

# Fetch a single program
apt_command = 'APT'
programs = 3712
cat.fetch_jwst_programs(programs, apt_command)

If you want observations from specific programs, you can use the programs argument with the desired PID(s):

import gen_tso.catalogs as cat

# Fetch a set of programs
apt_command = 'APT'
programs = [2759, 3557, 3730, 4098]
cat.fetch_jwst_programs(programs, apt_command)
Note

By default, the cat.fetch_jwst_programs() function will store the files into a default folder inside the Gen TSO package, you can find that folder with this script:

>>> from gen_tso.utils import ROOT
>>> print(f'{ROOT}data/programs/')
/home/user/.../gen_tso/data/programs/

To place them to a custom folder, use the output_folder argument.



2. Parse files to database

Once we have downloaded the .xml files of the JWST TSO programs, we can parse them to a human readable format and save their content to a csv file if desired. Here’s how:

from gen_tso import catalogs as cat
from gen_tso.utils import KNOWN_PROGRAMS
# KNOWN_PROGRAMS is a list containing all known TSO PIDs up to cycle 4


# A list of observations from all TSO programs
observations = cat.parse_program(KNOWN_PROGRAMS)

Set the to_csv argument to save the observations to a .csv file.

from gen_tso import catalogs as cat
from gen_tso.utils import KNOWN_PROGRAMS
# KNOWN_PROGRAMS is a list containing all known TSO PIDs up to cycle 4


# Parse and save all programs to default .csv file
to_csv = 'jwst_tso_programs.csv'
observations = cat.parse_program(KNOWN_PROGRAMS, to_csv=to_csv)


# Thats all! Now we can take a look at the content:
observations[98]
{'category': 'ERS',
 'pid': '1366',
 'pi': 'Batalha',
 'cycle': 1,
 'observation': '3',
 'visit': '1',
 'status': 'Executed',
 'event': 'transit',
 'target': 'WASP-39',
 'target_in_program': 'WASP-39',
 'ra': '14:29:18.3955',
 'dec': '-03:26:40.20',
 'instrument': 'NIRSPEC',
 'mode': 'BOTS',
 'disperser': 'G395H',
 'filter': 'F290LP',
 'subarray': 'SUB2048',
 'readout': 'NRSRAPID',
 'groups': 70,
 'integrations': 465,
 'duration': 10.56,
 'date_start': datetime.datetime(2022, 7, 30, 20, 46, 32),
 'date_end': datetime.datetime(2022, 7, 31, 6, 21, 30),
 'plan_window': None,
 'proprietary_period': 0,
 'label': 'Transmission - WASP-39b : nirspec g395h',
 'special_reqs': ['OrientRange', 'PeriodZeroPhase', 'TimeSeriesObservation'],
 'phase_reqs': {'ZeroPhase': '2457792.356338',
  'Period': '4.05527999 Days',
  'PhaseStart': '0.95248',
  'PhaseEnd': '0.96275'},
 'period': 4.05527999,
 'phase_start': 0.957615,
 'phase_duration': 0.10850052304279982}

Set the to_csv argument to a filename path to save the outputs programs to the specified file:

from gen_tso import catalogs as cat

# Save to a custom csv file:
pid = 3712
observations = cat.parse_program(pid, to_csv='jwst_program_3712.csv')


# That's all! now we can take a look at the content:
observations[0]
{'category': 'GO',
 'pid': '3712',
 'pi': 'Cubillos',
 'cycle': 2,
 'observation': '1',
 'visit': '1',
 'status': 'Executed',
 'event': 'transit',
 'target': 'WASP-69',
 'target_in_program': 'WASP-69',
 'ra': '21:00:6.2319',
 'dec': '-05:05:41.49',
 'instrument': 'NIRSPEC',
 'mode': 'BOTS',
 'disperser': 'G395H',
 'filter': 'F290LP',
 'subarray': 'SUB2048',
 'readout': 'NRSRAPID',
 'groups': 5,
 'integrations': 4115,
 'duration': 8.27,
 'date_start': datetime.datetime(2024, 9, 30, 6, 39, 59),
 'date_end': datetime.datetime(2024, 9, 30, 14, 28, 11),
 'plan_window': None,
 'proprietary_period': 12,
 'label': 'NIRSpec/G395H WASP-69b',
 'special_reqs': ['PeriodZeroPhase', 'TimeSeriesObservation'],
 'phase_reqs': {'ZeroPhase': '2455748.83422',
  'Period': '3.8681382 Days',
  'PhaseStart': '0.957128',
  'PhaseEnd': '0.967900'},
 'period': 3.8681382,
 'phase_start': 0.962514,
 'phase_duration': 0.0890824772841191}

Default csv file

Note that there’s a special case that users might consider. gen_tso has a default csv file location that’s used to store/maintain all known programs.

Use the following script if you want to manually updated that file. Just be warned that if you do, make sure to store all known programs in there, otherwise gen_tso will be omitting programs that were not stored in there.

from gen_tso import catalogs as cat
from gen_tso.utils import KNOWN_PROGRAMS, ROOTS

# Default csv file
to_csv = f'{ROOT}data/programs/jwst_tso_programs.csv'
observations = cat.parse_program(KNOWN_PROGRAMS, to_csv=to_csv)

3. Playground

Now that we have a database of JWST TSO programs at hand, users can load them using the cat.load_programs() function.

Here are a few things you can do with this database.

List phase-curve progams

Here’s another example to show all phase-curve programs

import gen_tso.catalogs as cat


# Without arguments, this will load all programs
observations = cat.load_programs()

# Use the catalog to cross check with NASA target names
catalog = cat.Catalog()
targets = [
    target for target in catalog.targets
    if target.is_transiting
]

# List all phase curve programs
ntotal = 0
print(' N      Program  Obs  Target        Status')
print('---------------  -------------------------')
for obs in observations:
    event = obs['event']
    host = cat.get_observation_host(obs, targets)
    # Host not yet in NASA Exoplanet Archive
    if host is None:
        host = obs['target']
    if event == 'phase curve':
        ntotal += 1
        category = obs['category']
        pid = obs['pid']
        oid = obs['observation']
        status = obs['status']
        print(f"{ntotal:2}  {category:>6} {pid}  {oid:>3}  {host:12s}  {status}")
 N      Program  Obs  Target        Status
---------------  -------------------------
 1     GTO 1201    1  WASP-121      Executed
 2     GTO 1201    2  LTT 9779      Executed
 3     GTO 1224    2  WASP-43       Executed
 4     ERS 1366   11  WASP-43       Executed
 5      GO 1729    1  WASP-121      Executed
 6      GO 1803    1  GJ 1214       Executed
 7      GO 2008    1  HD 80606      Implementation
 8      GO 2158    1  NGTS-10       Executed
 9      GO 2159    1  K2-141        Executed
10      GO 2347    1  K2-141        Executed
11      GO 2488    1  HD 80606      Executed
12      GO 2508    1  GJ 367        Executed
13  SURVEY 2765    1  TOI-849       Withdrawn
14  SURVEY 2765    2  TOI-849       Withdrawn
15  SURVEY 2765    3  TOI-849       Withdrawn
16  SURVEY 2765    4  TOI-2109      Withdrawn
17  SURVEY 2765    5  TOI-2109      Withdrawn
18  SURVEY 2765    6  TOI-2109      Withdrawn
19      GO 3077    1  TRAPPIST-1    Executed
20      GO 3077    2  TRAPPIST-1    Executed
21      GO 3231    1  LTT 9779      Executed
22      GO 3263    1  TOI-1685      Executed
23      GO 3315    3  K2-22         Executed
24      GO 3784    1  TOI-2445      Executed
25      GO 3860    1  TOI-561       Executed
26      GO 4008    1  LHS 3844      Executed
27      GO 5022    1  WASP-103      Implementation
28      GO 5268    1  WASP-76       Executed
29      GO 7255    1  KELT-9        Implementation
30      GO 7255    2  KELT-9        Implementation
31      GO 8739    1  BD+05 4868    Implementation
32      GO 8864    1  TOI-1807      Implementation
33      GO 8864    2  TOI-2260      Implementation
34      GO 8864    3  TOI-431       Implementation
35      GO 8864    4  TOI-2431      Implementation
36      GO 8864    5  TOI-6255      Implementation
37      GO 8877    1  TOI-2109      Implementation

All programs for a target

Use the grouped=True argument to get a list of observations grouped per target (host)

import gen_tso.catalogs as cat

# Observations grouper per host target
observations = cat.load_programs(grouped=True)


# Use the catalog to cross check with NASA target names
catalog = cat.Catalog()
targets = [
    target for target in catalog.targets
    if target.is_transiting
]

# List all programs targeting 'LHS 1140' (as named in NASA archive):
ntotal = 0
print(' N      Program  Obs  Target    Planets  Event  Status')
print('---------------  -----------------------------  ------')
for programs in observations:
    host = cat.get_observation_host(programs[0], targets)
    if host != 'LHS 1140':
        continue
    for obs in programs:
        ntotal += 1
        event = obs['event']
        category = obs['category']
        pid = obs['pid']
        oid = obs['observation']
        target = obs['target']
        planets = obs['planets']
        status = obs['status']
        print(f"{ntotal:2}  {category:>6} {pid}  {oid:>3}  {target:8s}  {planets}  {event:7}  {status}")
 N      Program  Obs  Target    Planets  Event  Status
---------------  -----------------------------  ------
 1      GO 2334    1  G 268-38  ['b']  transit  Executed
 2      GO 2334    2  G 268-38  ['b']  transit  Executed
 3      GO 3730    9  LHS 1140  ['c']  eclipse  Executed
 4      GO 3730   10  LHS 1140  ['c']  eclipse  Executed
 5      GO 3730   11  LHS 1140  ['c']  eclipse  Executed
 6      DD 6543    1  LHS 1140  ['b']  transit  Executed
 7      DD 6543    2  LHS 1140  ['b']  transit  Executed
 8      GO 7073   21  LHS 1140  ['b']  transit  Implementation
 9      GO 7073   22  LHS 1140  ['b']  transit  Implementation
10      GO 7073   23  LHS 1140  ['b']  transit  Implementation
11      GO 7073   24  LHS 1140  ['b']  transit  Implementation
12      GO 7073   25  LHS 1140  ['b']  transit  Implementation
13      GO 7073   26  LHS 1140  ['b']  transit  Implementation
14      GO 7073   31  LHS 1140  ['c']  transit  Implementation
15      GO 7073   32  LHS 1140  ['c']  transit  Implementation
16      GO 7073   33  LHS 1140  ['c']  transit  Implementation

Get targeted planet(s)

Now that we have the JWST programs information, we can cross-check with the NASA catalog of exoplanet to find out which planet(s) are targeted by each observation. This script find’s which planet belongs to each observation, and displays the info for multi-planet systems:

Note

Note that the planet information is already contained in the cat.load_programs() output. This script is mostly a demo.

import gen_tso.catalogs as cat

# Observations grouper per host target
observations = cat.load_programs()


# Will need to cross check with NASA catalog of transiting planets
catalog = cat.Catalog()
targets = [
    target for target in catalog.targets
    if target.is_transiting
]

# Find the planet (by host name, labels, or phase constraints)
for obs in observations:
    obs['planets'] = cat.get_planet_letters(obs, targets)


# Show planets in multi-planet systems
print('PID   Obs  Target        Planets     Event type')
print('-----------------------------------------------')
for obs in observations:
    host = cat.get_observation_host(obs, targets)
    pid = obs['pid']
    oid = obs['observation']
    target = obs['target']
    all_planets = [
         target.planet
         for target in targets
         if target.host == host
    ]
    if len(all_planets) == 0:
        print(f"{pid}  {oid:>3}  {target:12s}  Couldn't determine planet, target not in NASA database")

    if len(all_planets) > 1:
        event = obs['event']
        planets = obs['planets']
        print(f"{pid}  {oid:>3}  {host:12s}  {repr(planets):10}  {event}")
PID   Obs  Target        Planets     Event type
-----------------------------------------------
1177    7  TRAPPIST-1    ['b']       eclipse
1177    8  TRAPPIST-1    ['b']       eclipse
1177    9  TRAPPIST-1    ['b']       eclipse
1177   10  TRAPPIST-1    ['b']       eclipse
1177   11  TRAPPIST-1    ['b']       eclipse
1201  101  TRAPPIST-1    ['f']       transit
1201  102  TRAPPIST-1    ['f']       transit
1201  103  TRAPPIST-1    ['f']       transit
1201  104  TRAPPIST-1    ['f']       transit
1201  105  TRAPPIST-1    ['f']       transit
1201  111  TRAPPIST-1    ['d']       transit
1201  112  TRAPPIST-1    ['d']       transit
1201  301  L 98-59       ['c']       transit
1201  302  L 98-59       ['c']       transit
1201  311  L 98-59       ['d']       transit
...
6193    1  TOI-700       ['d']       transit
6193    2  TOI-700       ['e']       transit
6456    1  TRAPPIST-1    ['b', 'e']  transit
6456    2  TRAPPIST-1    ['b', 'e']  transit
6456    3  TRAPPIST-1    ['b', 'e']  transit
6456    4  TRAPPIST-1    ['b', 'e']  transit
6456    5  TRAPPIST-1    ['b', 'e']  transit
6456   15  TRAPPIST-1    ['b', 'e']  transit
6457    1  LP 791-18     ['d']       eclipse
6457    2  LP 791-18     ['d']       eclipse
6457    3  LP 791-18     ['d']       eclipse
6457    4  LP 791-18     ['d']       eclipse
6457    5  LP 791-18     ['d']       eclipse
6491    1  Kepler-167    ['e']       transit
...
8004    1  TOI-431       ['d']       transit
8004    2  TOI-431       ['d']       transit
8004    3  HIP 9618      ['b']       transit
8004    4  HIP 9618      ['b']       transit
8739    1  BD+05 4868    Couldn't determine planet, target not in NASA database
8864    3  TOI-431       ['b']       phase curve
9025    3  WASP-84       ['b']       transit

Plot spectral coverage

This script combines the various information from the programs to display the wavelength coverage for the systems targeted in cycle 4.

import gen_tso.catalogs as cat
import gen_tso.pandeia_io as jwst
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.image import BboxImage
from matplotlib.transforms import Bbox, TransformedBbox
plt.ion()


# First define a few convenient functions to handle the data
def get_band_width(inst, mode, disperser, filter, subarray):
    """Get the wavelength edges for the observing configuration"""
    wl_min2, wl_max2 = 0, 0
    if mode == 'lrs':
        mode = 'lrsslitless'
        filter = 'None'
    elif mode == 'mrs':
        mode = 'mrs_ts'
        filter = disperser[0:disperser.index('(')]
    elif mode == 'imaging ts':
        mode = 'imaging_ts'
        subarray = 'imager'
    elif mode == 'grismr ts':
        mode = 'lw_tsgrism'
        filters = filter.split(',')
        filter = filters[1]
        if 'dhs0' in disperser:
            mode2 = 'sw_tsgrism'
            filter2 = filters[0]
            wl_min2, wl_max2 = jwst.get_bandwidths(
                inst, mode2, subarray, filter2,
            )[2:4]
    elif mode == 'soss':
        filter = 'clear'
        subarray = subarray.replace('full', 'sossfull')
    elif mode == 'bots':
        filter = f'{disperser}/{filter}'
    wl_min, wl_max = jwst.get_bandwidths(inst, mode, subarray, filter)[2:4]

    if wl_min2 != 0:
        return [wl_min, wl_min2], [wl_max, wl_max2]
    return [wl_min], [wl_max]


def make_passband(wl_min, wl_max):
    """Make a colormap array coloring specified wl ranges"""
    ncol = 64
    band = np.zeros((2,ncol,4))
    wl = np.logspace(np.log10(0.65), np.log10(18), ncol)
    for i in range(ncol):
        band[:,i] = plt.cm.rainbow(i/(ncol-1))
    passband = np.zeros(ncol, bool)
    for wmin, wmax in zip(wl_min, wl_max):
        passband |= (wl >= wmin) & (wl<=wmax)
    band[:,:,3] = passband
    return band


def collect_configs(programss):
    """
    Split observations by mode and event, count them, and
    collect wavelenght coverage.
    """
    configs = {}
    for j,obs in enumerate(programs):
        inst = obs['instrument'].lower()
        mode = obs['mode'].lower()
        disperser = obs['disperser'].lower()
        filter = obs['filter'].lower()
        subarray = obs['subarray'].lower()

        if mode == 'lrs':
            label = 'LRS'
        elif mode == 'mrs':
            label = f"MRS/{obs['disperser']}"
        elif mode == 'imaging ts':
            label = obs['filter']
        elif mode == 'soss':
            label = 'SOSS'
        elif mode == 'bots':
            label = obs['disperser']
        elif mode == 'grismr ts':
            filters = obs['filter'].split(',')
            if 'dhs0' in disperser:
                label = '/'.join(filters)
            else:
                label = filters[1]
        event = obs['event'][0].upper()
        if label not in configs:
            configs[label] = {'T': 0, 'E': 0, 'P': 0}
        configs[label][event] += 1

        wl_min, wl_max = get_band_width(inst, mode, disperser, filter, subarray)
        configs[label]['wl_min'] = wl_min
        configs[label]['wl_max'] = wl_max
    return configs


def make_label(config):
    label = ''
    if config['T'] > 0:
        label += f"{config['T']}T  "
    if config['E'] > 0:
        label += f"{config['E']}E  "
    if config['P'] > 0:
        label += f"{config['P']}PC"
    return label.strip()

# Now lets plot
# JWST observations grouped by host:
observations = cat.load_programs(grouped=True)

# For target cross-checking only with transiting planets
targets = [
    target for target in cat.Catalog().targets
    if target.is_transiting
]


# Plot planets targeted in Cycle 4
fs = 10
dx = 0.24
h = 0.0138
dtx = 0.002

x0 = 0.01
y0 = 0.93
fig = plt.figure(0)
plt.clf()
fig.set_size_inches(7.5,9)
plt.subplots_adjust(0.0, 0.0, 1.0, 1.0)
ax = plt.subplot(111)
plt.axis('off')
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)
title = 'Exoplanets with Cycle 4 TSOs'
ax.text(0.5, 0.97, title, fontsize=15, weight='bold', ha='center')
for programs in observations:
    cycles = [obs['cycle'] for obs in programs]
    if 4 not in cycles:
        continue
    if y0 < 0.02:
        x0 += 0.345
        y0 = 0.93
    host = cat.get_observation_host(programs[0], targets)
    if host is None:
        host = programs[0]['target']
    all_planets = np.concatenate([obs['planets'] for obs in programs])
    for planet in np.unique(all_planets):
        planet_name = f'{host} {planet}'
        ax.text(x0, y0, planet_name, fontsize=fs, ha='left', va='bottom')
        progs = [obs for obs in programs if planet in obs['planets']]
        configs = collect_configs(progs)
        for label,config in configs.items():
            x1 = x0 + dx
            y1 = y0 - h
            band = make_passband(config['wl_min'], config['wl_max'])
            ax.plot(
                [x0, x1, x1, x0, x0], [y0, y0, y1, y1, y0],
                lw=0.5, c='0.5',
            )
            bbox = TransformedBbox(Bbox.from_bounds(x0,y1,dx,h), ax.transAxes)
            ax.add_artist(BboxImage(bbox, cmap='rainbow', data=band))
            obs_label = make_label(config)
            ax.text(x1+dtx, y1, obs_label, fontsize=fs-1, ha='left', va='bottom')
            b0 = np.where(band[0,:,3])[0][0]/64
            xt = x0 + dx*b0 + dtx * (-1)**int(b0 >= 0.2)
            yt = y0-0.9*h
            ha = 'left' if b0 < 0.2 else 'right'
            col = 'w' if b0 < 0.2 else '0.5'
            ax.text(xt, yt, label, color=col, fontsize=fs-4, ha=ha, va='bottom')
            y0 -= h
        y0 -= 1.5*h
    y0 -= 1.2*h
plt.savefig('family_portrait_cycle4.png', dpi=300)

Back to top