API Reference

fyrd.queue

The core class in this file is the Queue() class which does most of the queue management. In addition, get_cluster_environment() attempts to autodetect the cluster type (torque, slurm, normal) and sets the global cluster type for the whole file. Finally, the wait() function accepts a list of jobs and will block until those jobs are complete.

The Queue class relies on a few simple queue parsers defined by the torque_queue_parser and slurm_queue_parser functions. These call qstat -x or squeue and sacct to get job information, and yield a simple tuple of that data with the following members:

job_id, name, userid, partition, state, node-list, node-count, cpu-per-node, exit-code

The Queue class then converts this information into a Queue.QueueJob object and adds it to the internal jobs dictionary within the Queue class. This dictionary is the basis for all of the other functionality provided by the Queue class. It can be accessed directly, or sliced by accessing the completed, queued, and running attributes of the Queue class, which simply divide up the jobs dictionary to make finding information easy.
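As a rough illustration of the flow described above, the sketch below turns parser-style tuples into a dictionary keyed by job ID and then slices it by state. The ParsedJob tuple, the sample rows, and the jobs_in_state helper are hypothetical stand-ins for this example, not fyrd's actual classes or output.

```python
from collections import namedtuple

# Field order follows the tuple members listed above; the field names
# themselves are illustrative.
ParsedJob = namedtuple(
    "ParsedJob",
    ["job_id", "name", "userid", "partition", "state",
     "nodelist", "numnodes", "cpus_per_node", "exit_code"],
)

# Made-up rows standing in for what a queue parser might yield.
rows = [
    ("101", "align", "alice", "normal", "running", ["node1"], 1, 4, None),
    ("102", "sort", "alice", "normal", "pending", [], 1, 2, None),
    ("103", "merge", "bob", "long", "completed", ["node2"], 1, 8, 0),
]

# Build a {job_id: job} dictionary, mirroring the idea behind Queue.jobs.
jobs = {row[0]: ParsedJob(*row) for row in rows}


def jobs_in_state(job_dict, state):
    """Return the subset of job_dict whose state equals `state`."""
    return {jid: job for jid, job in job_dict.items() if job.state == state}


running = jobs_in_state(jobs, "running")
completed = jobs_in_state(jobs, "completed")
print(sorted(running), sorted(completed))
```

Keeping one dictionary and deriving the per-state views from it means there is a single place to refresh when the queue is polled again.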

fyrd.queue.Queue

class fyrd.queue.Queue(user=None, partition=None, qtype=None)[source]

Bases: object

A wrapper for all defined batch systems.

jobs

A dictionary of all jobs in this queue in the form: {jobid: Queue.QueueJob}

Type:dict
finished

A dictionary of all completed jobs, same format as jobs

Type:dict
bad

A dictionary of all jobs with failed or unknown states, same format as jobs

Type:dict
active_job_count

Total jobs in the queue (including array job children)

Type:int
max_jobs

The maximum number of jobs allowed in the queue

Type:int
can_submit

True if active_job_count < max_jobs, False otherwise

Type:bool
job_states

A list of the different states of jobs in this queue

Type:list
active_job_count

A count of all jobs that are either pending or running in the current queue

Type:int
can_submit

True if total active jobs is less than max_jobs

Type:bool
users

A set of all users with active jobs

Type:set
job_states

A set of all current job states

Type:set
wait(jobs, return_disp=False)[source]

Block until all jobs in jobs are complete.

get(jobs)[source]

Get all results from a bunch of Job objects.

wait_to_submit(max_jobs=None)[source]

Block until fewer running/pending jobs in queue than max_jobs.

update()[source]

Refresh the list of jobs from the server.

get_jobs(key)[source]

Return a dict of jobs where state matches key.

get_user_jobs(users)[source]

Return a dict of all jobs owned by each user in users.

Can filter by user, queue type or partition on initialization.

Parameters:
  • user (str) – Optional username to filter the queue with. If user=’self’ or ‘current’, the current user will be used.
  • partition (str) – Optional partition to filter the queue with.
  • qtype (str) – one of the defined batch queues (e.g. ‘slurm’)

Methods

Queue.wait(jobs, return_disp=False, notify=True)[source]

Block until all jobs in jobs are complete.

Update time is dependent upon the queue_update parameter in your ~/.fyrd/config.txt file.

Parameters:
  • jobs (list) – List of either fyrd.job.Job, fyrd.queue.QueueJob, job_id
  • return_disp (bool, optional) – If a job disappears from the queue, return ‘disappeared’ instead of True
  • notify (str, True, or False, optional) – If True, both notification address and wait_time must be set in the [notify] section of the config. A notification email will be sent if the time exceeds this time. This is the default. If a string is passed, notification is forced and the string must be the to address. False means no notification
Returns:

True on success, False or None on failure, unless return_disp is True and the job disappears, in which case ‘disappeared’ is returned.

Return type:

bool or str

Queue.get(jobs)[source]

Get all results from a bunch of Job objects.

Parameters:jobs (list) – List of fyrd.Job objects
Returns:job_results – {job_id: Job}
Return type:dict
Raises:fyrd.ClusterError – If any job fails or goes missing.
Queue.wait_to_submit(max_jobs=None)[source]

Block until fewer running/queued jobs in queue than max_jobs.

Parameters:max_jobs (int) – Override self.max_jobs for wait
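The blocking behaviour described for wait_to_submit() can be pictured as a simple polling loop. The sketch below is only an illustration under stated assumptions: wait_until_below, count_active, and the injectable sleep argument are hypothetical names, and the real method queries the batch system rather than a Python callable.

```python
import time


def wait_until_below(count_active, max_jobs, poll_seconds=1.0, sleep=time.sleep):
    """Block until count_active() < max_jobs; return how many checks were made."""
    checks = 1
    while count_active() >= max_jobs:
        sleep(poll_seconds)
        checks += 1
    return checks


# Simulated queue that drains by one job per check; sleeping is stubbed
# out so the example finishes immediately.
remaining = [5, 4, 3, 2]
polls = wait_until_below(lambda: remaining.pop(0), max_jobs=3,
                         sleep=lambda seconds: None)
print(polls)
```

Passing the sleep function as an argument keeps the loop easy to exercise without real delays.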
Queue.test_job_in_queue(job_id, array_id=None)[source]

Check to make sure job is in self.

Tries 12 times with 1 second between each. If found returns True, else False.

Parameters:
  • job_id (str) –
  • array_id (str, optional) –
Returns:

exists

Return type:

bool
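The retry behaviour described for test_job_in_queue() (12 attempts, one second apart) follows a common bounded-polling pattern. A minimal sketch, assuming a hypothetical check callable in place of the real queue lookup:

```python
import time


def found_within(check, tries=12, delay=1.0, sleep=time.sleep):
    """Call check() up to `tries` times, pausing `delay` seconds between attempts."""
    for attempt in range(tries):
        if check():
            return True
        # No need to pause after the final failed attempt.
        if attempt < tries - 1:
            sleep(delay)
    return False


# A job that only becomes visible on the third look; pauses are recorded
# instead of actually sleeping.
answers = iter([False, False, True])
pauses = []
visible = found_within(lambda: next(answers), sleep=pauses.append)
print(visible, pauses)
```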

Queue.get_jobs(key)[source]

Return a dict of jobs where state matches key.

Queue.get_user_jobs(users)[source]

Filter jobs by user.

Parameters:users (list) – A list of users/owners
Returns:A filtered job dictionary of {job_id: QueueJob} for all jobs owned by the queried users.
Return type:dict
Queue.update()[source]

Refresh the list of jobs from the server, limit queries.

Queue.check_dependencies(dependencies)[source]

Check if dependencies are running.

Parameters:dependencies (list) – List of job IDs
Returns:‘active’ if dependencies are running or queued, ‘good’ if completed, ‘bad’ if failed, cancelled, or suspended, ‘absent’ otherwise.
Return type:str
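To make the four return values concrete, here is one way such a classification could be written. This is a sketch, not fyrd's implementation: the state names, the dependency_status helper, and in particular the precedence (bad before active before good) are assumptions made for the example.

```python
# Hypothetical state groupings, loosely following slurm-style names.
ACTIVE_STATES = {"running", "pending", "queued"}
GOOD_STATES = {"completed"}
BAD_STATES = {"failed", "cancelled", "suspended"}


def dependency_status(dependencies, states):
    """Summarize a list of job IDs using a {job_id: state} mapping.

    Assumed precedence: any bad job wins, then any active job, then
    'good' only if every dependency completed; anything else is 'absent'.
    """
    seen = [states.get(dep) for dep in dependencies]
    if any(state in BAD_STATES for state in seen):
        return "bad"
    if any(state in ACTIVE_STATES for state in seen):
        return "active"
    if seen and all(state in GOOD_STATES for state in seen):
        return "good"
    return "absent"


states = {"1": "completed", "2": "running", "3": "failed"}
print(dependency_status(["1"], states))
print(dependency_status(["1", "2"], states))
print(dependency_status(["1", "3"], states))
print(dependency_status(["9"], states))
```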

fyrd.queue Jobs

Hold information about individual jobs: QueueJob about primary jobs, QueueChild about individual array jobs (which are stored in the children attribute of QueueJob objects).

class fyrd.queue.QueueJob[source]

A very simple class to store info about jobs in the queue.

Only used for torque and slurm queues.

id

Job ID

Type:int
name

Job name

Type:str
owner

User who owns the job

Type:str
threads

Number of cores used by the job

Type:int
queue

The queue/partition the job is running in

Type:str
state

Current state of the job, normalized to slurm states

Type:str
nodes

List of nodes job is running on

Type:list
exitcode

Exit code of completed job

Type:int
disappeared

Job cannot be found in the queue anymore

Type:bool
array_job

This job is an array job and has children

Type:bool
children

If array job, list of child job numbers

Type:dict

Initialize.

class fyrd.queue.QueueChild(parent)[source]

A very simple class to store info about child jobs in the queue.

Only used for torque and slurm queues.

id

Job ID

Type:int
name

Job name

Type:str
owner

User who owns the job

Type:str
threads

Number of cores used by the job

Type:int
queue

The queue/partition the job is running in

Type:str
state

Current state of the job, normalized to slurm states

Type:str
nodes

List of nodes job is running on

Type:list
exitcode

Exit code of completed job

Type:int
disappeared

Job cannot be found in the queue anymore

Type:bool
parent

Backref to parent job

Type:QueueJob

Initialize with a parent.

fyrd.queue.QueueError

exception fyrd.queue.QueueError[source]

Simple Exception wrapper.

fyrd.job

Job management is handled by the Job() class. This is a very large class that defines all the methods required to build and submit a job to the cluster.

It accepts keyword arguments defined in fyrd.options on initialization, which are then fleshed out using profile information from the config files defined by fyrd.conf.

The primary argument on initialization is the function or script to submit.

Examples:

Job('ls -lah | grep myfile')
Job(print, ('hi',))
Job('echo hostname', profile='tiny')
Job(huge_function, args=(1, 2), kwargs={'hi': 'there'},
    profile='long', cores=28, mem='200GB')

fyrd.job.Job

class fyrd.Job(command, args=None, kwargs=None, name=None, qtype=None, profile=None, queue=None, **kwds)[source]

Bases: object

Information about a single job on the cluster.

Holds information about submit time, number of cores, the job script, and more.

Below are the core attributes and methods required to use this class; note that this is an incomplete list.

id

The ID number for the job, only set once the job has been submitted

Type:str
name

The name of the job

Type:str
command

The function or shell script that will be submitted

Type:str or callable
args

A list of arguments to the shell script or function in command

Type:list
kwargs

A dictionary of keyword arguments to the function (not shell script) in command

Type:dict
state
A slurm-style one word description of the state of the job, one of:
  • Not_Submitted
  • queued
  • running
  • completed
  • failed
Type:str
submitted
Type:bool
written
Type:bool
done
Type:bool
running
Type:bool
dependencies

A list of dependencies associated with this job

Type:list
out

The output of the function or a copy of stdout for a script

Type:str
stdout

Any output to STDOUT

Type:str
stderr

Any output to STDERR

Type:str
exitcode

The exitcode of the running processes (the script runner if the Job is a function).

Type:int
submit_time

A datetime object for the time of submission

Type:datetime
start

A datetime object for time execution started on the remote node.

Type:datetime
end

A datetime object for time execution ended on the remote node.

Type:datetime
runtime

A timedelta object containing runtime.

Type:timedelta
files

A list of script files associated with this job

Type:list
nodes

A list of nodes associated with this job

Type:list
modules

A list of modules associated with this job

Type:list
clean_files

If True, auto-delete script and function files on job completion

Type:bool
clean_outputs

If True, auto-delete script outputs and error files on job completion

Type:bool
kwds

Keyword arguments to the batch system (e.g. mem, cores, walltime), this is initialized by taking every additional keyword argument to the Job. e.g. Job(‘echo hi’, profile=large, walltime=‘00:20:00’, mem=‘2GB’) will result in kwds containing {walltime: ‘00:20:00’, mem: ‘2GB’}. There is no need to alter this manually.

Type:dict
submit_args

List of parsed submit arguments that will be passed at runtime to the submit function. Generated within the Job object, no need to set manually, use the kwds attribute instead.

Type:list
initialize()[source]

Use attributes to prep job for running

gen_scripts()[source]

Create script files (but do not write them)

write(overwrite=True)[source]

Write scripts to files

submit(wait_on_max_queue=True)[source]

Submit the job if it is ready and the queue is sufficiently open.

resubmit(wait_on_max_queue=True)[source]

Clean all internal states with scrub() and then resubmit

kill(confirm=True)[source]

Immediately kill the currently running job

clean(delete_outputs=True, get_outputs=True)[source]

Delete any files created by this object

scrub(confirm=True)[source]

Clean everything and reset to an unrun state.

update(fetch_info=True)[source]

Update our status from the queue

wait()[source]

Block until the job is done

get()[source]

Block until the job is done and then return the output (stdout if job is a script), by default saves all outputs to self (i.e. .out, .stdout, .stderr) and deletes all intermediate files before returning. If save argument is False, does not delete the output files by default.

Notes

Printing or reproducing the class will display detailed job information.

Both wait() and get() will update the queue every few seconds (defined by the queue_update item in the config) and add queue information to the job as they go.

If the job disappears from the queue with no information, it will be listed as ‘completed’.

All jobs have a .submission attribute, which is a Script object containing the submission script for the job and the file name, plus a ‘written’ bool that checks if the file exists.

In addition, some batch systems (e.g. SLURM) have an .exec_script attribute, which is a Script object containing the shell command to run. This difference is due to the fact that some SLURM systems execute multiple lines of the submission file at the same time.

Finally, if the job command is a function, this object will also contain a .function attribute, which contains the script to run the function.

Initialization function arguments.

Parameters:
  • command (function/str) – The command or function to execute.
  • args (tuple/dict, optional) – Optional arguments to add to command, particularly useful for functions.
  • kwargs (dict, optional) – Optional keyword arguments to pass to the command, only used for functions.
  • name (str, optional) – Optional name of the job. If not defined, guessed. If a job of the same name is already queued, an integer job number (not the queue number) will be added, ie. <name>.1
  • qtype (str, optional) – Override the default queue type
  • profile (str, optional) – The name of a profile saved in the conf
  • queue (fyrd.queue.Queue, optional) – An already initiated Queue class to use.
  • kwds – All other keywords are parsed into cluster keywords by the options system. For available keywords see fyrd.option_help()
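The way extra keyword arguments end up in the kwds attribute is ordinary Python **kwds collection. The toy class below only illustrates that mechanism; SketchJob is a hypothetical stand-in and performs none of the profile lookup or option validation that the real Job class does.

```python
class SketchJob:
    """Toy stand-in showing how extra keywords can be gathered."""

    def __init__(self, command, args=None, kwargs=None, name=None,
                 profile=None, **kwds):
        self.command = command
        self.args = args
        self.kwargs = kwargs
        self.name = name
        self.profile = profile
        # Every keyword not named in the signature lands here.
        self.kwds = kwds


job = SketchJob("echo hi", profile="large", walltime="00:20:00", mem="2GB")
print(job.kwds)
```

Named parameters such as profile are consumed by the signature, so only the batch-system keywords remain in kwds.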

Methods

Job.initialize()[source]

Make self runnable using set attributes.

Job.gen_scripts()[source]

Create the script objects from the set parameters.

Job.write(overwrite=True)[source]

Write all scripts.

Parameters:overwrite (bool, optional) – Overwrite existing files, defaults to True.
Returns:self
Return type:Job
Job.clean(delete_outputs=None, get_outputs=True)[source]

Delete all scripts created by this module, if they were written.

Parameters:
  • delete_outputs (bool, optional) – also delete all output and err files, but get their contents first.
  • get_outputs (bool, optional) – if delete_outputs, save outputs before deleting.
Returns:

self

Return type:

Job

Job.scrub(confirm=True)[source]

Clean everything and reset to an unrun state.

Parameters:confirm (bool, optional) – Get user input before proceeding
Returns:self
Return type:Job
Job.submit(wait_on_max_queue=True, additional_keywords=None, max_jobs=None)[source]

Submit this job.

To disable max_queue_len, set it to 0. None will allow override by the default settings in the config file, and any positive integer will be interpreted to be the maximum queue length.

Parameters:
  • wait_on_max_queue (bool, optional) – Block until queue limit is below the maximum before submitting.
  • additional_keywords (dict, optional) – Pass this dictionary to the batch system submission function, not necessary.
  • max_jobs (int, optional) – Override the maximum number of jobs to wait for
Returns:

self

Return type:

Job

Job.resubmit(wait_on_max_queue=True, cancel_running=None)[source]

Attempt to auto resubmit, deletes prior files.

Parameters:
  • wait_on_max_queue (bool, optional) – Block until queue limit is below the maximum before submitting.
  • cancel_running (bool or None, optional) – If the job is currently running, cancel it before resubmitting. If None (default), will ask the user.
To disable max_queue_len, set it to 0. None will allow override by the default settings in the config file, and any positive integer will be interpreted to be the maximum queue length.
Returns:

self

Return type:

Job

Job.get_keywords()[source]

Return a list of the keyword arguments used to make the job.

Job.set_keywords(kwds, replace=False)[source]

Set the job keywords, just updates self.kwds.

Parameters:
  • kwds (dict) – Set of valid arguments.
  • replace (bool, optional) – Overwrite the keyword arguments instead of updating.
Job.wait()[source]

Block until job completes.

Returns:success – True if exitcode == 0, False if not, ‘disappeared’ if job lost from queue.
Return type:bool or str
Job.get(save=True, cleanup=None, delete_outfiles=None, del_no_save=None, raise_on_error=True)[source]

Block until job completed and return output of script/function.

By default saves all outputs to this class and deletes all intermediate files.

Parameters:
  • save (bool, optional) – Save all outputs to the class also (advised)
  • cleanup (bool, optional) – Clean all intermediate files after job completes.
  • delete_outfiles (bool, optional) – Clean output files after job completes.
  • del_no_save (bool, optional) – Delete output files even if save is False
  • raise_on_error (bool, optional) – If the returned output is an Exception, raise it.
Returns:

Function output if Function, else STDOUT

Return type:

str

Job.get_output(save=True, delete_file=None, update=True, raise_on_error=True)[source]

Get output of function or script.

This is the same as stdout for a script, or the function output for a function.

By default, output file is kept unless delete_file is True or self.clean_files is True.

Parameters:
  • save (bool, optional) – Save the output to self.out, default True. Would be a good idea to set to False if the output is huge.
  • delete_file (bool, optional) – Delete the output file when getting
  • update (bool, optional) – Update job info from queue first.
  • raise_on_error (bool, optional) – If the returned output is an Exception, raise it.
Returns:

output – The output of the script or function. Always a string if script.

Return type:

anything

Job.get_stdout(save=True, delete_file=None, update=True)[source]

Get stdout of function or script, same for both.

By default, output file is kept unless delete_file is True or self.clean_files is True.

Also sets self.start and self.end from the contents of STDOUT if possible.

Parameters:
  • save (bool, optional) – Save the output to self.stdout, default True. Would be a good idea to set to False if the output is huge.
  • delete_file (bool, optional) – Delete the stdout file when getting
  • update (bool, optional) – Update job info from queue first.
Returns:The contents of STDOUT, with runtime info and trailing newline removed.
Return type:str
Job.get_stderr(save=True, delete_file=None, update=True)[source]

Get stderr of function or script, same for both.

By default, output file is kept unless delete_file is True or self.clean_files is True.

Parameters:
  • save (bool, optional) – Save the output to self.stderr, default True. Would be a good idea to set to False if the output is huge.
  • delete_file (bool, optional) – Delete the stderr file when getting
  • update (bool, optional) – Update job info from queue first.
Returns:

The contents of STDERR, with trailing newline removed.

Return type:

str

Job.get_times(update=True, stdout=None)[source]

Get the start and end times of the function or script, same for both.

Sets self.start and self.end from the contents of STDOUT if possible.

Parameters:
  • update (bool, optional) – Update job info from queue first.
  • stdout (str, optional) – Pass existing stdout for use
Returns:

  • start (datetime.datetime)
  • end (datetime.datetime)

Job.get_exitcode(update=True, stdout=None)[source]

Try to get the exitcode.

Parameters:
  • update (bool, optional) – Update job info from queue first.
  • stdout (str, optional) – Pass existing stdout for use
Returns:

exitcode

Return type:

int

Job.update(fetch_info=True)[source]

Update status from the queue.

Parameters:fetch_info (bool, optional) – Fetch basic job info if complete.
Returns:self
Return type:Job
Job.update_queue_info()[source]

Set (and return) queue_info from the queue even if done.

Job.fetch_outputs(save=True, delete_files=None, get_stats=True)[source]

Save all outputs in their current state. No return value.

This method does not wait for job completion, but merely gets the outputs. To wait for job completion, use get() instead.

Parameters:
  • save (bool, optional) – Save all outputs to the class also (advised)
  • delete_files (bool, optional) – Delete the output files when getting, only used if save is True
  • get_stats (bool, optional) – Try to get exitcode.

fyrd.submission_scripts

This module defines two classes that are used to build the actual jobs for submission, including writing the files. Function is actually a child class of Script.

class fyrd.submission_scripts.Script(file_name, script)[source]

Bases: object

A script string plus a file name.

Initialize the script and file name.

clean(delete_output=None)[source]

Delete any files made by us.

exists

True if file is on disk, False if not.

write(overwrite=True)[source]

Write the script file.

class fyrd.submission_scripts.Function(file_name, function, args=None, kwargs=None, imports=None, syspaths=None, pickle_file=None, outfile=None)[source]

Bases: fyrd.submission_scripts.Script

A special Script used to run a function.

Create a function wrapper.

NOTE: Function submission will fail if the parent file’s code is not wrapped in an if __name__ == '__main__' wrapper.

Parameters:
  • file_name (str) – A root name to the outfiles
  • function (callable) – Function handle.
  • args (tuple, optional) – Arguments to the function as a tuple.
  • kwargs (dict, optional) – Named keyword arguments to pass in the function call
  • imports (list, optional) – A list of imports, if not provided, defaults to all current imports, which may not work if you use complex imports. The list can include the import call, or just be a name, e.g [‘from os import path’, ‘sys’]
  • syspaths (list, optional) – Paths to be included in submitted function
  • pickle_file (str, optional) – The file to hold the function.
  • outfile (str, optional) – The file to hold the output.
clean(delete_output=False)[source]

Delete the input pickle file and any scripts.

Parameters:delete_output (bool, optional) – Delete the output pickle file too.
write(overwrite=True)[source]

Write the pickle file and call the parent Script write function.

fyrd.batch_systems

All batch systems are defined here.

fyrd.batch_systems functions

fyrd.batch_systems.get_cluster_environment(overwrite=False)[source]

Detect the local cluster environment and set MODE globally.

Detect the current batch system by looking for command line utilities. Order is important here, so we hard code the batch system lookups.

Paths to files can also be set in the config file.

Parameters:overwrite (bool, optional) – If True, run checks anyway, otherwise just accept MODE if it is already set.
Returns:MODE
Return type:str
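Detection by looking for command line utilities can be sketched with shutil.which. The candidate list, its ordering, and the detect_mode and fake_which helpers below are assumptions for illustration; the actual lookup order and commands used by fyrd may differ.

```python
import shutil

# Checked in order, first hit wins. Both the order and the command names
# are assumptions made for this sketch.
CANDIDATES = [("slurm", "sbatch"), ("torque", "qsub")]


def detect_mode(which=shutil.which):
    """Return the first batch system whose submit command is found, else None."""
    for mode, command in CANDIDATES:
        if which(command):
            return mode
    return None


def fake_which(available):
    """Build a stand-in for shutil.which that only knows `available` commands."""
    return lambda command: "/usr/bin/" + command if command in available else None


print(detect_mode(fake_which({"qsub"})))
print(detect_mode(fake_which({"qsub", "sbatch"})))
print(detect_mode(fake_which(set())))
```

Hard-coding the order matters on machines where more than one scheduler's tools are installed, since the first match decides the mode.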
fyrd.batch_systems.check_queue(qtype=None)[source]

Check if both MODE and qtype are valid.

First checks the MODE global and autodetects its value; if that fails, no other tests are done and the qtype argument is ignored.

After MODE is found to be a reasonable value, the queried queue is tested for functionality. If qtype is defined, this queue is tested, else the queue in MODE is tested.

Tests are defined per batch system.

Parameters:qtype (str) –
Returns:batch_system_functional
Return type:bool
Raises:ClusterError – If MODE or qtype is not in DEFINED_SYSTEMS

See also

get_cluster_environment()
Auto detect the batch environment
get_batch_system()
Return the batch system module
fyrd.batch_systems.get_batch_system(qtype=None)[source]

Return a batch_system module.

fyrd.batch_systems.options

All keyword arguments are defined in dictionaries in the options.py file, alongside functions to manage those dictionaries. Of particular importance is option_help(), which can display all of the keyword arguments as a string or a table. check_arguments() checks a dictionary to make sure that the arguments are allowed (i.e. defined); it is called on all keyword arguments in the package.

To see keywords, run fyrd keywords from the console or fyrd.option_help() from a python session.

In general, option handling works as follows. All hard-coded keyword arguments must contain a dictionary entry for ‘torque’ and ‘slurm’, as well as a type declaration. If the type is NoneType, then the option is assumed to be a boolean option. If it has a type, check_arguments() attempts to cast the value to that type, and specific idiosyncrasies are handled in this step, e.g. memory is converted into an integer of MB. Once the arguments are sanitized, format() is called on the string held in either the ‘torque’ or the ‘slurm’ values, and the formatted string is then used as an option. If the type is a list/tuple, the ‘sjoin’ and ‘tjoin’ dictionary keys must exist, and are used to handle joining.

The following two functions are used to manage this formatting step.

option_to_string() will take an option/value pair and return an appropriate string that can be used in the current queue mode. If the option is not implemented in the current mode, a debug message is printed to the console and an empty string is returned.

options_to_string() is a wrapper around option_to_string() and can handle a whole dictionary of arguments; it explicitly handles arguments that cannot be managed using a simple string format.
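The cast-then-format step described above can be sketched as follows. Everything in this block is hypothetical: the OPTIONS table, its template strings, the 1 GB = 1024 MB conversion, and the render_option and mem_to_mb helpers are illustrative and are not copied from fyrd's options.py.

```python
import re

# Hypothetical option table: per-system templates plus a type declaration.
OPTIONS = {
    "cores": {"type": int,
              "slurm": "#SBATCH --cpus-per-task={}",
              "torque": "#PBS -l ppn={}"},
    "mem": {"type": int,
            "slurm": "#SBATCH --mem={}",
            "torque": "#PBS -l mem={}MB"},
    "features": {"type": list,
                 "slurm": "#SBATCH --constraint={}",
                 "torque": "#PBS -l feature={}",
                 "sjoin": "&", "tjoin": ":"},
}


def mem_to_mb(value):
    """Convert strings such as '2GB' or '512MB' to an integer number of MB."""
    match = re.fullmatch(r"(\d+)\s*(MB|GB)?", str(value).strip(), re.IGNORECASE)
    if not match:
        raise ValueError("Unrecognized memory value: {!r}".format(value))
    number = int(match.group(1))
    unit = (match.group(2) or "MB").upper()
    return number * 1024 if unit == "GB" else number


def render_option(option, value, qtype):
    """Cast value to the declared type, then fill the template for qtype."""
    spec = OPTIONS[option]
    if option == "mem":
        value = mem_to_mb(value)
    elif spec["type"] is list:
        joiner = spec["sjoin"] if qtype == "slurm" else spec["tjoin"]
        value = joiner.join(value)
    else:
        value = spec["type"](value)
    return spec[qtype].format(value)


print(render_option("mem", "2GB", "slurm"))
print(render_option("cores", "4", "torque"))
print(render_option("features", ["a", "b"], "slurm"))
```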

fyrd.batch_systems.options.option_help(mode='string', qtype=None, tablefmt='simple')[source]

Print a string to stdout displaying information on all options.

The possible run modes for this extension are:

string Return a formatted string
print Print the string to stdout
list Return a simple list of keywords
table Return a table of lists
merged_table Combine all keywords into a single table
Parameters:
  • mode ({'string', 'print', 'list', 'table', 'merged_table'}, optional) –
  • qtype (str, optional) – If provided only return info on that queue type.
  • tablefmt (str, optional) –

    A tabulate-style table format, one of:

    'plain', 'simple', 'grid', 'pipe', 'orgtbl',
    'rst', 'mediawiki', 'latex', 'latex_booktabs'
    
Returns:

A formatted string

Return type:

str

fyrd.batch_systems.options.sanitize_arguments(kwds)[source]

Run check_arguments, but return unmatched keywords as is.

fyrd.batch_systems.options.split_keywords(kwargs)[source]

Split a dictionary of keyword arguments into two dictionaries.

The first dictionary will contain valid arguments for fyrd, the second will contain all others.

Returns:valid_args, other_args
Return type:dict
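Splitting a keyword dictionary into recognized and unrecognized parts can be pictured with two dictionary comprehensions. The KNOWN set and the split_known helper are hypothetical; the real function consults the package's option definitions.

```python
# Hypothetical set of recognized option names for this sketch.
KNOWN = {"cores", "mem", "walltime"}


def split_known(kwargs, known=KNOWN):
    """Return (valid_args, other_args) based on membership in `known`."""
    valid = {key: val for key, val in kwargs.items() if key in known}
    other = {key: val for key, val in kwargs.items() if key not in known}
    return valid, other


valid_args, other_args = split_known({"cores": 4, "mem": "8GB", "verbose": True})
print(valid_args, other_args)
```

This pattern is handy when one function accepts both cluster options and arguments meant for the function being submitted.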
fyrd.batch_systems.options.check_arguments(kwargs)[source]

Make sure all keywords are allowed.

Raises OptionsError on error, returns sanitized dictionary on success.

Note: Checks in SYNONYMS if argument is not recognized, raises OptionsError
if it is not found there either.
fyrd.batch_systems.options.options_to_string(option_dict, qtype=None)[source]

Return a multi-line string for job submission.

This function pre-parses options and then passes them to the parse_strange_options function of each batch system, before using the option_to_string function to parse the remaining options.

Parameters:
  • option_dict (dict) – Dict in format {option: value} where value can be None. If value is None, default used.
  • qtype (str) – The defined batch system
Returns:

  • parsed_options (str) – A multi-line string of parsed options
  • runtime_options (list) – A list of parsed options to be used at submit time

fyrd.batch_systems.options.option_to_string(option, value=None, qtype=None)[source]

Return a string with an appropriate flag for slurm or torque.

Parameters:
  • option (str) – An allowed option defined in options.all_options
  • value (str, optional) – A value for that option if required (if None, default used)
  • qtype (str, optional) – One of the defined batch systems
Returns:

A string with the appropriate flags for the active queue.

Return type:

str

fyrd.conf

fyrd.conf handles the config (~/.fyrd/config.txt) file and the profiles (~/.fyrd/profiles.txt) file.

Profiles are combinations of keyword arguments that can be called in any of the submission functions. Both the config and profiles are just ConfigParser objects, conf.py merely adds an abstraction layer on top of this to maintain the integrity of the files.

config

The config has three sections (and no defaults):

  • queue — sets options for handling the queue
  • jobs — sets options for submitting jobs
  • jobqueue — local option handling, will be removed in the future

For a complete reference, see the config documentation: Configuration

Options can be managed with the get_option() and set_option() functions, but it is actually easier to use the console script:

fyrd conf list
fyrd conf edit max_jobs 3000
fyrd.conf.get_option(section=None, key=None, default=None)[source]

Get a single key or section.

All args are optional, if they are missing, the parent section or entire config will be returned.

Parameters:
  • section (str) – The config section to use (e.g. queue), if None, all sections returned.
  • key (str) – The config key to get (e.g. ‘max_jobs’), if None, whole section returned.
  • default – If the key does not exist, create it with this default value.
Returns:

Option value if key exists, None if no key exists.

Return type:

option_value

See also

set_option()
Set an option
get_config()
Get the entire config
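Since the config is described as a ConfigParser object, the section/key fallback behaviour of get_option() can be approximated with the standard library alone. The lookup helper and the sample values below are hypothetical, and the documented default parameter (which creates a missing key) is deliberately left out of this sketch.

```python
from configparser import ConfigParser

# In-memory config with made-up values; no file is read or written.
config = ConfigParser()
config.read_dict({"queue": {"max_jobs": "1000"}, "jobs": {"clean_files": "True"}})


def lookup(section=None, key=None):
    """Return one value, one section as a dict, or every section."""
    if section is None:
        return {name: dict(config[name]) for name in config.sections()}
    if not config.has_section(section):
        return None
    if key is None:
        return dict(config[section])
    # fallback=None makes a missing key return None instead of raising.
    return config.get(section, key, fallback=None)


print(lookup("queue", "max_jobs"))
print(lookup("queue"))
print(sorted(lookup()))
```

Note that ConfigParser stores values as strings, so any numeric interpretation has to happen after the lookup.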
fyrd.conf.set_option(section, key, value)[source]

Write a config key to the config file.

Parameters:
  • section (str) – Section of the config file to use.
  • key (str) – Key to add.
  • value – Value to add for key.
Returns:

Return type:

ConfigParser

fyrd.conf.delete(section, key)[source]

Delete a config item.

Parameters:
  • section (str) – Section of config file.
  • key (str) – Key to delete
Returns:

Return type:

ConfigParser

fyrd.conf.load_config()[source]

Load config from the config file.

If any section or key from DEFAULTS is not present in the config, it is added back, enforcing a minimal configuration.

Returns:
Return type:ConfigParser
fyrd.conf.write_config()[source]

Write the current config to CONFIG_FILE.

fyrd.conf.create_config(cnf=None, def_queue=None)[source]

Create an initial config file.

Gets all information from the file-wide DEFAULTS constant and overwrites specific keys using the values in cnf.

This means that any records in the cnf dict that are not present in DEFAULTS will be ignored, and any records that are absent will be populated from DEFAULTS.

Parameters:
  • cnf (dict) – A dictionary of config defaults.
  • def_queue (str) – A name for a queue to add to the default profile.
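The merge rule described above (unknown records ignored, missing records filled from DEFAULTS) can be sketched with plain dictionaries. The DEFAULTS contents here are invented, and the assumption that cnf is nested by section is mine; the real function may expect a different layout.

```python
import copy

# Invented defaults for illustration only.
DEFAULTS = {
    "queue": {"max_jobs": 1000, "sleep_len": 1},
    "jobs": {"clean_files": True},
}


def merge_with_defaults(cnf=None, defaults=DEFAULTS):
    """Overlay cnf on a copy of defaults, dropping unknown sections and keys."""
    merged = copy.deepcopy(defaults)
    for section, values in (cnf or {}).items():
        if section not in merged:
            continue  # unknown section: ignored
        for key, value in values.items():
            if key in merged[section]:
                merged[section][key] = value
            # unknown key: ignored
    return merged


result = merge_with_defaults({"queue": {"max_jobs": 50, "bogus": 1}, "extra": {"a": 1}})
print(result)
```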
fyrd.conf.create_config_interactive(prompt=True)[source]

Interact with the user to create a new config.

Uses readline autocompletion to make setup easier.

Parameters:prompt (bool) – Ask for confirmation before beginning wizard.

profiles

Profiles are wrapped in a Profile() class to make attribute access easy, but they are fundamentally just dictionaries of keyword arguments. They can be created with fyrd.conf.Profile(name, {keywds}) and then written to a file with the write() method.

The easiest way to interact with profiles is not with the class but with the get_profile(), set_profile(), and del_profile() functions. These make it very easy to go from a dictionary of keywords to a profile.

Profiles can then be called with the profile= keyword in any submission function or Job class.

As with the config, profile management is the easiest and most stable when using the console script:

fyrd profile list
fyrd profile add very_long walltime:120:00:00
fyrd profile edit default partition:normal cores:4 mem:10GB
fyrd profile delete small

fyrd.conf.Profile

class fyrd.conf.Profile(name, kwds)[source]

Bases: object

A job submission profile. Just a thin wrapper around a dict.

name
Type:str
kwds
Type:dict
write : Write self to config file

Set up bare minimum attributes.

Parameters:
  • name (str) – Name of the profile
  • kwds (dict) – Dictionary of keyword arguments (will be validated).
write()[source]

Write self to config file.
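A "thin wrapper around a dict" with attribute access can be built with __getattr__. The SketchProfile class below is a hypothetical illustration of that idea only; it does not validate keywords or write anything to disk as the real Profile does.

```python
class SketchProfile:
    """Toy profile: a name plus a dict whose keys read like attributes."""

    def __init__(self, name, kwds):
        self.name = name
        self.kwds = dict(kwds)

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails, so name and
        # kwds themselves never reach this method once they are set.
        try:
            return self.__dict__["kwds"][key]
        except KeyError:
            raise AttributeError(key) from None


small = SketchProfile("small", {"cores": 1, "mem": "2GB"})
print(small.name, small.cores, small.kwds["mem"])
```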

fyrd.conf.set_profile(name, kwds, update=True)[source]

Write profile to config file.

Parameters:
  • name (str) – The name of the profile to add/edit.
  • kwds (dict) – Keyword arguments to add to the profile.
  • update (bool) – Update the profile rather than overwriting it.
fyrd.conf.get_profile(profile=None, allow_none=True)[source]

Return a profile if it exists; if profile is None, return all profiles.

Will return None if profile is supplied but does not exist.

Parameters:
  • profile (str) – The name of a profile to search for.
  • allow_none (bool) – If True, return None if no profile matches, otherwise raise a ValueError.
Returns:

The requested profile.

Return type:

fyrd.conf.Profile

fyrd.helpers

The helpers are all high level functions that are not required for the library but make difficult jobs easy to assist in the goal of trivially easy cluster submission.

The functions in fyrd.basic below are different in that they provide simple job submission and management, while the functions in fyrd.helpers allow the submission of many jobs.

fyrd.helpers.jobify(name=None, profile=None, qtype=None, submit=True, **kwds)[source]

Decorator to make any function a job.

Will make any function return a Job object that will execute the function on the cluster.

If submit is True, the job will be submitted when it is returned.

Usage:

@fyrd.jobify(name='my_job', profile='small', mem='8GB',
             time='00:10:00', imports=['from time import sleep'])
def do_something(file_path, iteration_count=24):
    for i in range(iteration_count):
        print(file_path + str(i))
        sleep(1)
    return file_path

job = do_something('my_file.txt')
out = job.get()
Parameters:
  • name (str, optional) – Optional name of the job. If not defined, guessed. If a job of the same name is already queued, an integer job number (not the queue number) will be added, ie. <name>.1
  • qtype (str, optional) – Override the default queue type
  • profile (str, optional) – The name of a profile saved in the conf
  • submit (bool, optional) – Submit the Job before returning it
  • kwds – All other keywords are parsed into cluster keywords by the options system. For available keywords see fyrd.option_help()
Returns:

A Job class initialized with the decorated function.

Return type:

fyrd.job.Job

Examples

>>> import fyrd
>>> @fyrd.jobify(name='test_job', mem='1GB')
... def test(string, iterations=4):
...     """This does basically nothing!"""
...     outstring = ""
...     for i in range(iterations):
...         outstring += "Version {0}: {1}".format(i, string)
...     return outstring
>>> j = test('hi')
>>> j.get()
'Version 0: hiVersion 1: hiVersion 2: hiVersion 3: hi'
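The decorator pattern behind jobify can be sketched without a cluster. In this sketch SketchJob and sketch_jobify are hypothetical names, and the function runs locally when get() is called; a real job would be handed to the batch system instead.

```python
import functools


class SketchJob:
    """Hypothetical stand-in for a job: stores a call and runs it on get()."""

    def __init__(self, function, args, kwargs, name):
        self.function = function
        self.args = args
        self.kwargs = kwargs
        self.name = name
        self.submitted = False

    def submit(self):
        # A real cluster job would be sent to the batch system here.
        self.submitted = True
        return self

    def get(self):
        # Run locally; a real job would wait for the cluster and fetch output.
        if not self.submitted:
            self.submit()
        return self.function(*self.args, **self.kwargs)


def sketch_jobify(name=None, submit=True):
    """Decorator factory: calling the decorated function returns a SketchJob."""
    def decorator(function):
        @functools.wraps(function)
        def wrapper(*args, **kwargs):
            job = SketchJob(function, args, kwargs, name or function.__name__)
            return job.submit() if submit else job
        return wrapper
    return decorator


@sketch_jobify(name="test_job")
def repeat(string, iterations=4):
    return "".join("Version {0}: {1}".format(i, string) for i in range(iterations))


job = repeat("hi")
result = job.get()
```

The key point is that calling the decorated function does not return its result; it returns an object whose get() method does.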
fyrd.helpers.parapply(jobs, df, func, args=(), profile=None, applymap=False, merge_axis=0, merge_apply=False, name='parapply', imports=None, direct=True, **kwds)[source]

Split a dataframe, run apply in parallel, return result.

This function will split a dataframe into however many pieces are requested with the jobs argument, run apply in parallel by submitting the jobs to the cluster, and then recombine the outputs.

If the ‘clean_files’ and ‘clean_outputs’ arguments are not passed, we delete all intermediate files and output files by default.

This function will take any keyword arguments accepted by Job, which can be found by running fyrd.options.option_help(). It also accepts any of the keywords accepted by pandas.DataFrame.apply().

Parameters:
  • jobs (int) – Number of pieces to split the dataframe into
  • df (DataFrame) – Any pandas DataFrame
  • args (tuple) – Positional arguments to pass to the function, keyword arguments can just be passed directly.
  • profile (str) – A fyrd cluster profile to use
  • applymap (bool) – Run applymap() instead of apply()
  • merge_axis (int) – Which axis to merge on, 0 or 1, default is 1 as apply transposes columns
  • merge_apply (bool) – Apply the function on the merged dataframe also
  • name (str) – A prefix name for all of the jobs
  • imports (list) – A list of imports in any format, e.g. [‘import numpy’, ‘scipy’, ‘from numpy import mean’]
  • direct (bool) – Whether to run the function directly or to return a Job. Default True.
  • Any keyword arguments recognized by fyrd will be used for job submission.
  • Additional keyword arguments will be passed to DataFrame.apply().
Returns:

A recombined DataFrame: concatenated version of original split DataFrame

Return type:

DataFrame

Example

>>> import numpy
>>> import pandas
>>> import fyrd
>>> df = pandas.DataFrame([[0, 1], [2, 6], [9, 24], [13, 76], [4, 12]])
>>> df['sum'] = fyrd.helpers.parapply(2, df, lambda x: x[0]+x[1], axis=1)
>>> df
    0   1  sum
0   0   1    1
1   2   6    8
2   9  24   33
3  13  76   89
4   4  12   16
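The split, apply, and recombine steps can be sketched with plain lists and local threads. The names split_rows and sketch_parapply are hypothetical, and a thread pool stands in for cluster jobs; this is an illustration of the pattern, not fyrd's implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(rows, pieces):
    """Split rows into contiguous chunks, giving any remainder to the first chunks."""
    base, extra = divmod(len(rows), pieces)
    chunks, start = [], 0
    for i in range(pieces):
        size = base + (1 if i < extra else 0)
        chunks.append(rows[start:start + size])
        start += size
    return chunks


def apply_chunk(chunk, func):
    """Apply func to every row of one chunk."""
    return [func(row) for row in chunk]


def sketch_parapply(pieces, rows, func):
    chunks = split_rows(rows, pieces)
    with ThreadPoolExecutor(max_workers=pieces) as pool:
        # pool.map preserves the order of the chunks.
        results = list(pool.map(lambda chunk: apply_chunk(chunk, func), chunks))
    # Concatenate the per-chunk results back into one list.
    return [value for chunk in results for value in chunk]


rows = [[0, 1], [2, 6], [9, 24], [13, 76], [4, 12]]
sums = sketch_parapply(2, rows, lambda row: row[0] + row[1])
```

Because the chunks are contiguous and recombined in order, the result lines up with the original rows.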

See also

parapply_summary()
Merge results of parapply using applied function
splitrun()
Run a command in parallel on a split file
fyrd.helpers.parapply_summary(jobs, df, func, args=(), profile=None, applymap=False, name='parapply', imports=None, direct=True, **kwds)[source]

Run parapply for a function with summary stats.

Instead of returning the concatenated result, merge the result using the same function as was used during apply.

This works best for summary functions like .mean(), which do a linear operation on a whole dataframe or series.

Parameters:
  • jobs (int) – Number of pieces to split the dataframe into
  • df (DataFrame) – Any pandas DataFrame
  • args (tuple) – Positional arguments to pass to the function, keyword arguments can just be passed directly.
  • profile (str) – A fyrd cluster profile to use
  • applymap (bool) – Run applymap() instead of apply()
  • merge_axis (int) – Which axis to merge on, 0 or 1, default is 1 as apply transposes columns
  • merge_apply (bool) – Apply the function on the merged dataframe also
  • name (str) – A prefix name for all of the jobs
  • imports (list) – A list of imports in any format, e.g. [‘import numpy’, ‘scipy’, ‘from numpy import mean’]
  • direct (bool) – Whether to run the function directly or to return a Job. Default True.
  • Any keyword arguments recognized by fyrd will be used for job submission.
  • Additional keyword arguments will be passed to DataFrame.apply().
Returns:

A recombined DataFrame

Return type:

DataFrame

Example

>>> import numpy
>>> import pandas
>>> import fyrd
>>> df = pandas.DataFrame([[0, 1], [2, 6], [9, 24], [13, 76], [4, 12]])
>>> df = fyrd.helpers.parapply_summary(2, df, numpy.mean)
>>> df
0     6.083333
1    27.166667
dtype: float64
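The values in the example above are a mean of chunk means, not the mean of the whole column. They are consistent with the five rows being split into chunks of three and two; that chunking is inferred from the output and is an assumption here. A minimal sketch with plain lists (summary_of_chunks is a hypothetical helper):

```python
from statistics import mean

column0 = [0, 2, 9, 13, 4]
column1 = [1, 6, 24, 76, 12]


def summary_of_chunks(values, sizes, func):
    """Apply func to each chunk, then apply func again to the chunk results."""
    chunks, start = [], 0
    for size in sizes:
        chunks.append(values[start:start + size])
        start += size
    return func([func(chunk) for chunk in chunks])


# Assumed chunk sizes: three rows, then two rows.
merged0 = summary_of_chunks(column0, (3, 2), mean)
merged1 = summary_of_chunks(column1, (3, 2), mean)

# The mean over each whole column, for comparison.
overall0 = mean(column0)
overall1 = mean(column1)
```

With uneven chunks the merged values (about 6.08 and 27.17) differ from the whole-column means (5.6 and 23.8), so this approach gives exact results only when the summary function combines cleanly across chunks, for example with equal chunk sizes for a mean, or with functions such as sum, min, or max.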

See also

parapply()
Run a command in parallel on a DataFrame without merging the result

fyrd.helpers.splitrun(jobs, infile, inheader, command, args=None, kwargs=None, name=None, qtype=None, profile=None, outfile=None, outheader=False, merge_func=None, direct=True, **kwds)[source]

Split a file, run command in parallel, return result.

This function will split a file into however many pieces are requested with the jobs argument, and run command on each.

Accepts exactly the same arguments as the Job class, with the exception of the first three and last four arguments, which are:

  • the number of jobs
  • the file to work on
  • whether the input file has a header
  • an optional output file
  • whether the output file has a header
  • an optional function to use to merge the resulting list, only used if there is no outfile.
  • whether to run directly or to return a Job. If direct is True, this function will just run and thus block until complete, if direct is False, the function will submit as a Job and return that Job.

Note: If command is a string, .format(file={file}) will be called on it, where file is each split file. If command is a function, then there must be an argument in either args or kwargs that contains {file}. It will be replaced with the path to the file, again by the format command.

If outfile is specified, there must also be an ‘{outfile}’ line in any script or an ‘{outfile}’ argument in either args or kwargs. When this function completes, the file at outfile will contain the concatenated output files of all of the jobs.

If the ‘clean_files’ and ‘clean_outputs’ arguments are not passed, we delete all intermediate files and output files by default.

The intermediate files will be stored in the ‘scriptpath’ directory.

Any header line is kept at the top of the file.

Primary return value varies and is decided in this order:

If outfile:
the absolute path to that file
If merge_func:
the result of merge_func(list), where list is the list of outputs.
Else:
a list of results

If direct is False, this function returns a fyrd.job.Job object which will return the results described above on get().
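The placeholder substitution and the order in which the return value is chosen can be sketched as two small functions. Both format_command and choose_result are hypothetical names used only for illustration.

```python
import os


def format_command(command, split_file):
    """Fill the {file} placeholder in a string command for one split file."""
    return command.format(file=split_file)


def choose_result(outputs, outfile=None, merge_func=None):
    """Pick the return value in the documented order of precedence."""
    if outfile:
        # An output file wins: return its absolute path.
        return os.path.abspath(outfile)
    if merge_func:
        # Otherwise merge the list of per-job outputs.
        return merge_func(outputs)
    # Otherwise return the raw list of outputs.
    return outputs


command = format_command("wc -l {file}", "/tmp/input.split.0")
merged = choose_result([1, 2, 3], merge_func=sum)
raw = choose_result([1, 2, 3])
path = choose_result([1, 2, 3], outfile="combined.txt", merge_func=sum)
```

Note that merge_func is ignored whenever outfile is given, which matches the statement that it is only used if there is no outfile.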

Parameters:
  • jobs (int) – Number of pieces to split the file into
  • infile (str) – The path to the file to be split.
  • inheader (bool) – Does the input file have a header?
  • command (function/str) – The command or function to execute.
  • args (tuple/dict) – Optional arguments to add to command, particularly useful for functions.
  • kwargs (dict) – Optional keyword arguments to pass to the command, only used for functions.
  • name (str) – Optional name of the job. If not defined, guessed. If a job of the same name is already queued, an integer job number (not the queue number) will be added, ie. <name>.1
  • qtype (str) – Override the default queue type
  • profile (str) – The name of a profile saved in the conf
  • outfile (str) – The path to the expected output file.
  • outheader (bool) – Does the outfile have a header?
  • merge_func (function) – An optional function used to merge the output list if there is no outfile.
  • direct (bool) – Whether to run the function directly or to return a Job. Default True.
  • All other keywords are parsed into cluster keywords by the options system. For available keywords see fyrd.option_help()
Returns:

See description above

Return type:

Varies

fyrd.basic

This module holds high level functions to make job submission easy, allowing the user to skip multiple steps and to avoid using the Job class directly.

submit(), make_job(), and make_job_file() all create Job objects in the background and allow users to submit jobs. All of these functions accept the exact same arguments as the Job class does, and all of them return a Job object.

submit_file() is different: it simply submits a pre-formed job file, either one that has been written by this software or by any other method. The function makes no attempt to fix arguments to allow submission on multiple clusters; it just submits the file.

clean() takes a list of job objects and runs the clean() method on all of them. clean_dir() uses known directory and suffix information to clean out all job files from any directory.

fyrd.basic.submit()[source]

Submit a script to the cluster.

Parameters:
  • command (function/str) – The command or function to execute.
  • args (tuple/dict, optional) – Optional arguments to add to command, particularly useful for functions.
  • kwargs (dict, optional) – Optional keyword arguments to pass to the command, only used for functions.
  • name (str, optional) – Optional name of the job. If not defined, guessed. If a job of the same name is already queued, an integer job number (not the queue number) will be added, ie. <name>.1
  • qtype (str, optional) – Override the default queue type
  • profile (str, optional) – The name of a profile saved in the conf
  • queue (fyrd.queue.Queue, optional) – An already initiated Queue class to use.
  • kwds – All other keywords are parsed into cluster keywords by the options system. For available keywords see fyrd.option_help()
Returns:

Return type:

Job object

fyrd.basic.make_job()[source]

Make a job compatible with the chosen cluster but do not submit.

Parameters:
  • command (function/str) – The command or function to execute.
  • args (tuple/dict, optional) – Optional arguments to add to command, particularly useful for functions.
  • kwargs (dict, optional) – Optional keyword arguments to pass to the command, only used for functions.
  • name (str, optional) – Optional name of the job. If not defined, guessed. If a job of the same name is already queued, an integer job number (not the queue number) will be added, ie. <name>.1
  • qtype (str, optional) – Override the default queue type
  • profile (str, optional) – The name of a profile saved in the conf
  • queue (fyrd.queue.Queue, optional) – An already initiated Queue class to use.
  • kwds – All other keywords are parsed into cluster keywords by the options system. For available keywords see fyrd.option_help()
Returns:

Return type:

Job object

fyrd.basic.make_job_file()[source]

Make a job file compatible with the chosen cluster.

Parameters:
  • command (function/str) – The command or function to execute.
  • args (tuple/dict, optional) – Optional arguments to add to command, particularly useful for functions.
  • kwargs (dict, optional) – Optional keyword arguments to pass to the command, only used for functions.
  • name (str, optional) – Optional name of the job. If not defined, guessed. If a job of the same name is already queued, an integer job number (not the queue number) will be added, ie. <name>.1
  • qtype (str, optional) – Override the default queue type
  • profile (str, optional) – The name of a profile saved in the conf
  • queue (fyrd.queue.Queue, optional) – An already initiated Queue class to use.
  • kwds – All other keywords are parsed into cluster keywords by the options system. For available keywords see fyrd.option_help()
Returns:

Path to job file

Return type:

str

fyrd.basic.submit_file()[source]

Submit an existing job file to the cluster.

This function is independent of the Job object and just submits a file using a cluster appropriate method.

Parameters:
  • script_file (str) – The path to the file to submit
  • dependencies (str or list of strings, optional) – A job number or list of job numbers to depend on
  • qtype (str, optional) – The name of the queue system to use, auto-detected if not given.
  • submit_args (dict) – A dictionary of keyword arguments for the submission script.
Returns:

job_number

Return type:

str

fyrd.basic.clean()[source]

Delete all files in jobs list or single Job object.

Parameters:
  • jobs (fyrd.job.Job or list of fyrd.job.Job) – Job objects to clean
  • clean_outputs (bool) – Also clean outputs.
fyrd.basic.clean_dir()[source]

Delete all files made by this module in directory.

CAUTION: The clean() function will delete EVERY file with extensions matching these:

.<suffix>.err
.<suffix>.out
.<suffix>.out.func.pickle
.<suffix>.sbatch & .<suffix>.script for slurm mode
.<suffix>.qsub for torque mode
.<suffix>.job for local mode
_func.<suffix>.py
_func.<suffix>.py.pickle.in
_func.<suffix>.py.pickle.out

Note

This function will change in the future to use batch system defined paths.

Parameters:
  • directory (str) – The directory to run in, defaults to the current directory.
  • suffix (str) – Override the default suffix.
  • qtype (str) – Only run on files of this qtype
  • confirm (bool) – Ask the user before deleting the files
  • delete_outputs (bool) – Delete all output files too.
Returns:

A set of deleted files

Return type:

list
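Because matching is done purely on file endings, it can help to preview what would be matched before deleting anything. The sketch below only lists candidates and never deletes; job_file_endings and find_job_files are hypothetical helpers, and the suffix "cluster" is an assumed value chosen for the demonstration.

```python
import os
import tempfile


def job_file_endings(suffix):
    """Build the file endings listed in the caution above for one suffix."""
    templates = (
        ".{0}.err", ".{0}.out", ".{0}.out.func.pickle",
        ".{0}.sbatch", ".{0}.script", ".{0}.qsub", ".{0}.job",
        "_func.{0}.py", "_func.{0}.py.pickle.in", "_func.{0}.py.pickle.out",
    )
    return tuple(template.format(suffix) for template in templates)


def find_job_files(directory, suffix):
    """Return the sorted names in directory that end with a known job-file ending."""
    endings = job_file_endings(suffix)
    return sorted(name for name in os.listdir(directory) if name.endswith(endings))


# Demonstrate on a throwaway directory with a few empty files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.cluster.err", "a.cluster.sbatch", "a_func.cluster.py", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    candidates = find_job_files(tmp, "cluster")
```

Any unrelated file that happens to share one of these endings would be matched too, which is the reason for the caution.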

fyrd.run

A library of useful functions used throughout the fyrd package.

These include functions to handle data, format outputs, handle file opening, run commands, check file extensions, get user input, and search and format imports.

These functions are not intended to be accessed directly and so documentation is limited.

exception fyrd.run.CommandError[source]

Bases: Exception

A custom exception.

class fyrd.run.CustomFormatter(prog, indent_increment=2, max_help_position=24, width=None)[source]

Bases: argparse.ArgumentDefaultsHelpFormatter, argparse.RawDescriptionHelpFormatter

Custom argparse formatting.

fyrd.run.block_read(files, size=65536)[source]

Iterate through a file by blocks.

fyrd.run.check_pid(pid)[source]

Check for the existence of a unix pid.

fyrd.run.cmd(command, args=None, stdout=None, stderr=None, tries=1)[source]

Run command and return status, output, stderr.

Parameters:
  • command (str) – Path to executable.
  • args (tuple, optional) – Tuple of arguments.
  • stdout (str, optional) – File or open file like object to write STDOUT to.
  • stderr (str, optional) – File or open file like object to write STDERR to.
  • tries (int, optional) – Number of times to try to execute. 1+
Returns:

  • exit_code (int)
  • STDOUT (str)
  • STDERR (str)
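A command runner with retries that returns the same three values could look roughly like this. It is a sketch built on the standard library's subprocess module; run_command is a hypothetical name and the retry policy (stop at the first zero exit code) is an assumption.

```python
import subprocess
import sys


def run_command(command, args=(), tries=1):
    """Run command up to `tries` times and return (exit_code, stdout, stderr)."""
    if tries < 1:
        raise ValueError("tries must be at least 1")
    for _ in range(tries):
        proc = subprocess.run(
            [command, *args],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
        if proc.returncode == 0:
            break  # Success, no need to retry.
    return proc.returncode, proc.stdout.rstrip(), proc.stderr.rstrip()


# Use the current Python interpreter as a portable test command.
code, out, err = run_command(sys.executable, ("-c", "print('hello')"), tries=2)
failed_code, _, _ = run_command(sys.executable, ("-c", "import sys; sys.exit(3)"), tries=2)
```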

fyrd.run.cmd_or_file(string)[source]

If string is a file, return the contents, else return the string.

Parameters:string (str) – Path to a file or any other string
Returns:script – Either the contents of the file if string is a file or just the contents of string.
Return type:str
fyrd.run.count_lines(infile, force_blocks=False)[source]

Return the line count of a file as quickly as possible.

Uses wc if available, otherwise does a rapid read.
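The fallback of reading by blocks can be sketched as a generator that yields fixed-size chunks plus a counter of newline bytes. The names read_blocks and count_lines_by_blocks are hypothetical; the 65536 default matches the size shown for block_read().

```python
import os
import tempfile


def read_blocks(handle, size=65536):
    """Yield successive blocks of at most `size` bytes from an open binary file."""
    while True:
        block = handle.read(size)
        if not block:
            break
        yield block


def count_lines_by_blocks(path, size=65536):
    """Count newline bytes without loading the whole file into memory."""
    with open(path, "rb") as handle:
        return sum(block.count(b"\n") for block in read_blocks(handle, size))


# Write a small temporary file and count its lines with a tiny block size.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("line\n" * 1000)
line_total = count_lines_by_blocks(tmp.name, size=64)
os.remove(tmp.name)
```

Counting newline bytes undercounts by one if the last line has no trailing newline, which is the same convention wc -l uses.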

fyrd.run.exp_file(infile)[source]

Return an expanded path to a file.

fyrd.run.export_globals(function)[source]

Add a function’s globals to the current globals.

fyrd.run.export_imports(function, kwds)[source]

Get imports from a function and from kwds.

Also sets globals and adds path to module to sys path.

Parameters:
  • function (callable) – A function handle
  • kwds (dict) – A dictionary of keyword arguments
Returns:

imports + sys.path.append for module path

Return type:

list

fyrd.run.export_run(function, args, kwargs)[source]

Execute a function after first exporting all imports.

fyrd.run.file_getter(file_strings, variables, extra_vars=None, max_count=None)[source]

Get a list of files and variable values using the search string.

The file strings can contain standard unix glob (like *) and variable containing strings in the form {name}.

For example, a file_string of {dir}/*.txt will match every file that ends in .txt in every directory relative to the current path.

The result for a directory named test with two files named 1.txt and 2.txt is a list of:

[(('dir/1.txt'), {'dir': 'test'}),
 (('dir/2.txt'), {'dir': 'test'})]

This is repeated for every file_string in file_strings, and the following tests are done:

  1. All file_strings must result in identical numbers of files
  2. All variables must have only a single value in every file string

If there are multiple file_strings, they are added to the result x in order, but the dictionary remains the same as variables must be shared. If multiple file_strings are provided the results are combined by alphabetical order.

Parameters:
  • file_strings (list of str) – List of search strings, e.g. */*, */*.txt, {dir}/*.txt or {dir}/{file}.txt
  • variables (list of str) – List of variables to look for
  • extra_vars (list of str, optional) –

    A list of additional variables specified in a very precise format:

    new_var:orig_var:regex:sub_str
    
    or
    
    new_var:value
    

    The orig_var must correspond to a variable in variables. new_var will be generated by running re.sub(regex, sub_str, string), where string is the result of orig_var for the given file set.

  • max_count (int, optional) – Max number of file_strings to parse, default is all.
Returns:

A list of files. Each list item will be a two-item tuple of (files, variables). Files will be a tuple with the same length as max_count, or file_strings if max_count is None. Variables will be a dictionary of all variables and extra_vars for this file set. e.g.:

[((file1, dir1, file2), {var1: val, var2: val})]

Return type:

list

Raises:

ValueError – Raised if any of the above tests are not met.

fyrd.run.file_type(infile)[source]

Return file type after stripping gz or bz2.

fyrd.run.get_all_imports(function, kwds, prot=False)[source]

Get all imports from a function and from kwds.

Parameters:
  • function (callable) – A function handle
  • kwds (dict) – A dictionary of keyword arguments
  • prot (bool) – Wrap all imports in a try statement
Returns:

Imports

Return type:

list

fyrd.run.get_function_path(function)[source]

Return path to module defining a function if it exists.

fyrd.run.get_imports(function, mode='string')[source]

Build a list of potentially useful imports from a function handle.

Gets:

  • All modules from globals()
  • All modules from the function’s globals()
  • All functions from the function’s globals()

Modes:

string:
Return a list of strings formatted as unprotected import calls
prot:
Similar to string, but with try..except blocks
list:
Return two lists: (import name, module name) for modules and (import name, function name, module name) for functions
Parameters:
  • function (callable) – A function handle
  • mode (str) – A string corresponding to one of the above modes
Returns:

Return type:

str or list

fyrd.run.get_input(message, valid_answers=None, default=None)[source]

Get input from the command line and check answers.

Allows input to work with python 2/3

Parameters:
  • message (str) – A message to print, an additional space will be added.
  • valid_answers (list) – A list of answers to accept, if None, ignored. Case insensitive. There is one special option here: ‘yesno’, this allows all case insensitive variations of y/n/yes/no.
  • default (str) – The default answer.
Returns:

response

Return type:

str

fyrd.run.get_pbar(iterable, name=None, unit=None, **kwargs)[source]

Return a tqdm progress bar iterable.

If progressbar is set to False in the config, will not be shown.

fyrd.run.get_yesno(message, default=None)[source]

Get yes/no answer from user.

Parameters:
  • message (str) – A message to print, an additional space will be added.
  • default ({'y', 'n'}, optional) – One of {‘y’, ‘n’}, the default if the user gives no answer. If None, answer forced.
Returns:

True on yes, False on no

Return type:

bool

fyrd.run.import_function(function, mode='string')[source]

Return an import string for the function.

Attempts to resolve the parent module also, if the parent module is a file, ie it isn’t __main__, the import string will include a call to sys.path.append to ensure the module is importable.

If this function isn’t defined by a module, returns an empty string.

Parameters:mode ({'string', 'list'}, optional) – string/list, return as a unified string or a list.
fyrd.run.indent(string, prefix=' ')[source]

Replicate python3’s textwrap.indent for python2.

Parameters:
  • string (str) – Any string.
  • prefix (str) – What to indent with.
Returns:

Indented string

Return type:

str

fyrd.run.is_exc(x)[source]

Check if x is the output of sys.exc_info().

Returns:True if matched the output of sys.exc_info().
Return type:bool
fyrd.run.is_exe(fpath)[source]

Return True if fpath is executable.

fyrd.run.is_file_type(infile, types)[source]

Return True if infile is one of types.

Parameters:
  • infile (str) – Any file name
  • types (list) – String or list/tuple of strings (e.g [‘bed’, ‘gtf’])
Returns:

is_file_type

Return type:

bool

fyrd.run.listify(iterable)[source]

Try to force any iterable into a list sensibly.

fyrd.run.merge_lists(lists)[source]

Turn a list of lists into a single list.

fyrd.run.normalize_imports(imports, prot=True)[source]

Take a heterogeneous list of imports and normalize it.

Parameters:
  • imports (list) – A list of strings, formatted differently.
  • prot (bool) – Protect imports with try..except blocks
Returns:

A list of strings that can be used for imports

Return type:

list
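One way such normalization could work is sketched below: bare module names gain an "import" prefix, and each statement is optionally wrapped in try..except. The function normalize_import_lines is hypothetical, and the exact strings fyrd produces may differ.

```python
def normalize_import_lines(imports, prot=True):
    """Turn mixed import descriptions into executable import statements."""
    lines = []
    for item in imports:
        item = item.strip()
        # A bare module name such as 'scipy' becomes 'import scipy'.
        if not item.startswith(("import ", "from ")):
            item = "import " + item
        if prot:
            # Protect the import so a missing module does not abort the script.
            item = "try:\n    {0}\nexcept ImportError:\n    pass".format(item)
        lines.append(item)
    return lines


plain = normalize_import_lines(
    ["import numpy", "scipy", "from numpy import mean"], prot=False
)

# Protected imports can be executed even if a module is missing.
namespace = {}
for statement in normalize_import_lines(["os", "not_a_real_module_xyz"]):
    exec(statement, namespace)
```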

fyrd.run.open_zipped(infile, mode='r')[source]

Open a regular, gzipped, or bz2 file.

If infile is a file handle or text device, it is returned without changes.

Returns:
Return type:text mode file handle.
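Choosing an opener by file extension can be sketched with the standard gzip and bz2 modules. The name open_maybe_zipped is hypothetical, and detecting compression from the extension (rather than from file contents) is an assumption of this sketch.

```python
import bz2
import gzip
import os
import tempfile


def open_maybe_zipped(infile, mode="r"):
    """Open a plain, .gz, or .bz2 file in text mode; pass open handles through."""
    if hasattr(infile, "read") or hasattr(infile, "write"):
        return infile  # Already a file-like object.
    # gzip.open and bz2.open default to binary, so request text mode explicitly.
    text_mode = mode if ("t" in mode or "b" in mode) else mode + "t"
    if infile.endswith(".gz"):
        return gzip.open(infile, text_mode)
    if infile.endswith(".bz2"):
        return bz2.open(infile, text_mode)
    return open(infile, mode)


# Round-trip a short string through a gzipped file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt.gz")
    with open_maybe_zipped(path, "w") as handle:
        handle.write("hello\n")
    with open_maybe_zipped(path) as handle:
        round_trip = handle.read()
```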
fyrd.run.opt_split(opt, split_on)[source]

Split options by chars in split_on, merge all into single list.

Parameters:
  • opt (list) – A list of strings, can be a single string.
  • split_on (list) – A list of characters to use to split the options.
Returns:

A single merged list of split options, uniqueness guaranteed, order not.

Return type:

list
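The split-and-merge behaviour can be sketched as repeated splitting followed by de-duplication through a set, which is why uniqueness is guaranteed and order is not. The name split_options is hypothetical, and stripping whitespace and dropping empty pieces are assumptions of this sketch.

```python
def split_options(opt, split_on):
    """Split every option on each character in split_on and merge the results."""
    if isinstance(opt, str):
        opt = [opt]  # Accept a single string as well as a list.
    pieces = list(opt)
    for char in split_on:
        pieces = [part for piece in pieces for part in piece.split(char)]
    # A set removes duplicates but does not preserve order.
    return list({piece.strip() for piece in pieces if piece.strip()})


merged = split_options(["mem:4GB,cores:2", "cores:2;time:01:00:00"], [",", ";"])
```

Callers that need a stable order should sort the result themselves.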

fyrd.run.parse_glob(string, get_vars=None)[source]

Return a list of files that match a simple regex glob.

Parameters:
  • string (str) –
  • get_vars (list) – A list of variable names to search for. The string must contain these variables in the form {variable}. These variables will be temporarily replaced with a * and then run through glob.glob to generate a list of files. This list is then parsed to create the output.
Returns:

Keys are all files that match the string, values are None if get_vars is not passed. If get_vars is passed, the values are dictionaries of {‘variable’: ‘result’}. e.g. for {name}.txt and hi.txt:

{hi.txt: {name: 'hi'}}

Return type:

dict

Raises:

ValueError – If blank or numeric variable names are used or if get_vars returns multiple different names for a file.
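The approach described for get_vars, replacing each {variable} with * for globbing and then recovering the variable values from the matched paths, can be sketched as follows. The function glob_with_variables is hypothetical, and it assumes each variable appears only once in the pattern.

```python
import glob
import os
import re
import tempfile


def glob_with_variables(pattern, get_vars=None):
    """Glob for pattern and extract {variable} values from each matching path."""
    get_vars = get_vars or []
    glob_pattern = pattern
    for var in get_vars:
        glob_pattern = glob_pattern.replace("{" + var + "}", "*")

    # Build a regex that mirrors the pattern, capturing each variable by name.
    regex = ""
    for part in re.split(r"(\{\w+\}|\*)", pattern):
        if part == "*":
            regex += "[^/]*"
        elif part.startswith("{") and part[1:-1] in get_vars:
            regex += "(?P<{0}>[^/]*)".format(part[1:-1])
        else:
            regex += re.escape(part)
    compiled = re.compile(regex + "$")

    results = {}
    for path in glob.glob(glob_pattern):
        if not get_vars:
            results[path] = None
            continue
        match = compiled.match(path)
        if match:
            results[path] = match.groupdict()
    return results


# Demonstrate on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("hi.txt", "bye.txt", "skip.csv"):
        open(os.path.join(tmp, name), "w").close()
    matches = glob_with_variables(os.path.join(tmp, "{name}.txt"), ["name"])
found_names = sorted(values["name"] for values in matches.values())
```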

fyrd.run.replace_argument(args, find_string, replace_string, error=True)[source]

Replace find_string with replace string in a tuple or dict.

If dict, the values are replaced, not the keys.

Note: args can also be a list, in which case the first item is assumed to be a tuple, and the second a dictionary

Parameters:
  • args (list/tuple/dict) – Tuple or dict of args
  • find_string (str) – A string to search for
  • replace_string (str) – A string to replace with
  • error (bool) – Raise ValueError if replacement fails
Returns:

Return type:

The same object as was passed, with alterations made.
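Replacing a placeholder inside a tuple or inside the values of a dictionary can be sketched as below. The name replace_in_args is hypothetical, and this sketch covers only the tuple and dict cases, not the combined list form mentioned in the note.

```python
def replace_in_args(args, find_string, replace_string, error=True):
    """Replace find_string in tuple items or dict values; keys are left alone."""
    found = False

    def swap(value):
        nonlocal found
        if isinstance(value, str) and find_string in value:
            found = True
            return value.replace(find_string, replace_string)
        return value  # Non-string values pass through unchanged.

    if isinstance(args, dict):
        result = {key: swap(value) for key, value in args.items()}
    else:
        result = tuple(swap(value) for value in args)
    if error and not found:
        raise ValueError("{0} not found in arguments".format(find_string))
    return result


new_args = replace_in_args(("cat", "{file}", 3), "{file}", "/tmp/a.txt")
new_kwargs = replace_in_args({"path": "{file}", "n": 2}, "{file}", "/tmp/a.txt")
```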

fyrd.run.split_file(infile, parts, outpath='', keep_header=False)[source]

Split a file in parts and return a list of paths.

Note

Linux specific (uses wc).

If keep_header is True, the top line is stripped off the infile prior to splitting and assumed to be the header.

Parameters:
  • outpath (str, optional) – The directory to save the split files.
  • keep_header (bool, optional) – Add the header line to the top of every file.
Returns:

Paths to split files.

Return type:

list

fyrd.run.string_getter(string)[source]

Parse a string for {}, {#}, and {string}.

Parameters:string (str) –
Returns:
  • ints (set) – A set of ints containing all {#} values
  • vrs (set) – A set of {string} values
Raises:ValueError – If both {} and {#} are passed
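Placeholder parsing of this kind can be sketched with string.Formatter from the standard library, which already understands the {}, {#}, and {name} forms. The name find_placeholders is hypothetical.

```python
from string import Formatter


def find_placeholders(string):
    """Return (set of {#} indexes, set of {name} variables) found in string."""
    ints, names, bare = set(), set(), False
    for _, field, _, _ in Formatter().parse(string):
        if field is None:
            continue  # Plain text with no placeholder.
        if field == "":
            bare = True  # An empty {} placeholder.
        elif field.isdigit():
            ints.add(int(field))
        else:
            names.add(field)
    if bare and ints:
        raise ValueError("Cannot mix {} and {#} placeholders")
    return ints, names


ints, names = find_placeholders("{0}/{dir}/{1}.txt")
```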
fyrd.run.syspath_fmt(syspaths)[source]

Take a list of paths and return a list of sys.path.append strings.

fyrd.run.update_syspaths(function, kwds=None)[source]

Add function path to ‘syspaths’ in kwds.

fyrd.run.which(program)[source]

Replicate the UNIX which command.

Taken verbatim from:
stackoverflow.com/questions/377017/test-if-executable-exists-in-python
Parameters:program (str) – Name of executable to test.
Returns:Path to the program or None on failure.
Return type:str or None
fyrd.run.write_iterable(iterable, outfile)[source]

Write all elements of iterable to outfile.

fyrd.logme

This is a package I wrote myself and keep using because I like it. It provides syslog style leveled logging (e.g. ‘debug’->’info’->’warn’->’error’->’critical’) and it implements colors and timestamped messages.

The minimum print level can be set module wide at runtime by changing fyrd.logme.MIN_LEVEL.

fyrd.logme.log(message, level='info', logfile=None, also_write=None, min_level=None, kind=None)[source]

Print a string to logfile.

Levels display as:

verbose:  <timestamp> VERBOSE -->
debug:    <timestamp> DEBUG -->
info:     <timestamp> INFO -->
warn:     <timestamp> WARNING -->
error:    <timestamp> ERROR -->
critical: <timestamp> CRITICAL -->
Parameters:
  • message (str, optional) – The message to print.
  • logfile (file or logging object, optional) – Optional file to log to, defaults to STDERR. Can provide a logging object
  • level ({'debug', 'info', 'warn', 'error', 'normal'}, optional) – Will only print if level > MIN_LEVEL
  • also_write ({'stdout', 'stderr'}, optional) – Print to STDOUT or STDERR also. These only have an effect if the output is not already set to the same device.
  • min_level (str, deprecated) – Retained for backwards compatibility, min_level should be set using the logme.MIN_LEVEL constant.
  • kind (str, deprecated) – synonym for level, kept to retain backwards compatibility

Logging with timestamps and optional log files.

Print a timestamped message to a logfile, STDERR, or STDOUT.

If STDERR or STDOUT are used, colored flags are added. Colored flags are INFO, WARNING, ERROR, or CRITICAL.

It is possible to write to both logfile and STDOUT/STDERR using the also_write argument.

If level is ‘error’ or ‘critical’, error is written to STDERR unless also_write == -1

MIN_LEVEL can also be provided, logs will only print if level > MIN_LEVEL. Level order: critical>error>warn>info>debug>verbose

Usage:

import logme as lm
lm.log("Screw up!", <outfile>,
    level='debug'|'info'|'warn'|'error'|'normal',
    also_write='stderr'|'stdout')

Example:

lm.log('Hi')
Prints: 20160223 11:46:24.969 | INFO --> Hi
lm.log('Hi', level='debug')
Prints nothing
lm.MIN_LEVEL = 'debug'
lm.log('Hi', level='debug')
Prints: 20160223 11:46:24.969 | DEBUG --> Hi

Note: Uses terminal colors and STDERR, not compatible with non-unix systems
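The level filtering shown in the example can be sketched in a few lines. This is not the logme implementation: sketch_log is a hypothetical function, colors are omitted, the default minimum level of 'info' is inferred from the example, and a message is printed when its level is at or above the minimum, which is what the example output implies.

```python
import sys
from datetime import datetime

# Ordered from least to most severe.
LEVELS = ["verbose", "debug", "info", "warn", "error", "critical"]
FLAGS = {"warn": "WARNING"}  # Display names that differ from the level name.
MIN_LEVEL = "info"  # Assumed default, inferred from the example above.


def sketch_log(message, level="info", logfile=None, min_level=None):
    """Print a timestamped message if level is at or above the minimum level."""
    threshold = min_level or MIN_LEVEL
    if LEVELS.index(level) < LEVELS.index(threshold):
        return None  # Below the minimum level, so nothing is printed.
    stamp = datetime.now().strftime("%Y%m%d %H:%M:%S.%f")[:-3]
    flag = FLAGS.get(level, level.upper())
    line = "{0} | {1} --> {2}".format(stamp, flag, message)
    print(line, file=logfile or sys.stderr)
    return line


shown = sketch_log("Hi")
hidden = sketch_log("Hi", level="debug")
debug_shown = sketch_log("Hi", level="debug", min_level="debug")
```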
