cloudos_cli.jobs.job

This module contains the main class used to create and manage jobs.

Classes

Job(cloudos_url, apikey, cromwell_token, ...)

Class to store and operate jobs.

class cloudos_cli.jobs.job.Job(cloudos_url, apikey, cromwell_token, workspace_id, project_name, workflow_name, last=False, verify=True, mainfile=None, importsfile=None, repository_platform='github', project_id=<property object>, workflow_id=<property object>)[source]

Bases: Cloudos

Class to store and operate jobs.

Parameters:
  • cloudos_url (string) – The CloudOS service url.

  • apikey (string) – Your CloudOS API key.

  • cromwell_token (string) – Cromwell server token.

  • workspace_id (string) – The specific Cloudos workspace id.

  • project_name (string) – The name of a CloudOS project.

  • workflow_name (string) – The name of a CloudOS workflow or pipeline.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

  • mainfile (string) – The name of the mainFile used by the workflow. Required for WDL pipelines as different mainFiles could be loaded for a single pipeline.

  • importsfile (string) – The name of the importsFile used by the workflow. Optional and only used for WDL pipelines as different importsFiles could be loaded for a single pipeline.

  • repository_platform (string) – The name of the repository platform of the workflow.

  • project_id (string) – The CloudOS project id for a given project name.

  • workflow_id (string) – The CloudOS workflow id for a given workflow_name.

  • last (bool) – When workflows are duplicated, use the latest imported workflow (by date).
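
As a usage sketch, the constructor arguments can be assembled as below. All values are hypothetical placeholders; instantiating Job performs real API calls to resolve the project and workflow ids, so the call itself is shown commented out.

```python
# Hypothetical connection details; replace with your own workspace values.
job_args = dict(
    cloudos_url="https://cloudos.example.com",  # your CloudOS service url
    apikey="YOUR_API_KEY",
    cromwell_token=None,                        # only needed for WDL/Cromwell jobs
    workspace_id="YOUR_WORKSPACE_ID",
    project_name="my-project",
    workflow_name="my-pipeline",
    repository_platform="github",
)

# from cloudos_cli.jobs.job import Job
# job = Job(**job_args)  # resolves project_id and workflow_id via the API
```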

abort_job(job, workspace_id, verify=True)

Abort a job.

Parameters:
  • job (string) – The CloudOS job id of the job to abort.

  • workspace_id (string) – The CloudOS workspace id in which the job is running.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

Returns:

r – The server response

Return type:

requests.models.Response

apikey: str
clone_or_resume_job(source_job_id, queue_name=None, cost_limit=None, master_instance=None, job_name=None, nextflow_version=None, branch=None, profile=None, do_not_save_logs=None, use_fusion=None, accelerate_saving_results=False, resumable=None, project_name=None, parameters=None, verify=True, mode=None)[source]

Clone or resume an existing job with optional parameter overrides.

Parameters:
  • source_job_id (str) – The CloudOS job ID to clone/resume from.

  • queue_name (str, optional) – Name of the job queue to use.

  • cost_limit (float, optional) – Job cost limit override.

  • master_instance (str, optional) – Master instance type override.

  • job_name (str, optional) – New job name.

  • nextflow_version (str, optional) – Nextflow version override.

  • branch (str, optional) – Git branch override.

  • profile (str, optional) – Nextflow profile override.

  • do_not_save_logs (bool, optional) – Whether to save logs override.

  • use_fusion (bool, optional) – Whether to use fusion filesystem override.

  • accelerate_saving_results (bool, optional) – Whether to accelerate saving results override.

  • resumable (bool, optional) – Whether to make the job resumable or not.

  • project_name (str, optional) – Project name override (will look up new project ID).

  • parameters (list, optional) – List of parameter overrides in format [‘param1=value1’, ‘param2=value2’].

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

  • mode (str, optional) – The mode to use for the job (e.g. “clone”, “resume”).

Returns:

The CloudOS job ID of the cloned/resumed job.

Return type:

str
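
The parameters argument expects ‘name=value’ strings. The helper below is not part of the library; it only illustrates the documented format and how such overrides could be split into a mapping.

```python
def parse_overrides(parameters):
    """Split 'param=value' strings into a dict, using the first '=' as separator."""
    overrides = {}
    for p in parameters:
        name, _, value = p.partition("=")
        overrides[name] = value
    return overrides

# Format documented for clone_or_resume_job: ['param1=value1', 'param2=value2']
overrides = parse_overrides(["input=s3://bucket/data.csv", "genome=GRCh38"])
```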

cloudos_url: str
convert_nextflow_to_json(job_config, parameter, array_parameter, array_file_header, is_module, example_parameters, git_commit, git_tag, git_branch, project_id, workflow_id, job_name, resumable, save_logs, batch, job_queue_id, nextflow_profile, nextflow_version, instance_type, instance_disk, storage_mode, lustre_size, execution_platform, hpc_id, workflow_type, cromwell_id, azure_worker_instance_type, azure_worker_instance_disk, azure_worker_instance_spot, cost_limit, use_mountpoints, accelerate_saving_results, docker_login, command, cpus, memory)[source]

Converts a nextflow.config file into a json formatted dict.

Parameters:
  • job_config (string) – Path to a nextflow.config file with parameters scope.

  • parameter (tuple) – Tuple of strings indicating the parameters to pass to the pipeline call. They are in the following form: (‘param1=param1val’, ‘param2=param2val’, …)

  • array_parameter (tuple) – Tuple of strings indicating the parameters to pass to the pipeline call for array jobs. They are in the following form: (‘param1=param1val’, ‘param2=param2val’, …)

  • array_file_header (string) – The header of the file containing the array parameters. It is used to add the necessary column index for array file columns.

  • is_module (bool) – Whether the job is a module or not. If True, the job will be submitted as a module.

  • example_parameters (list) – A list of dicts, with the parameters required for the API request in JSON format. It is typically used to run curated pipelines using the already available example parameters.

  • git_commit (string) – The git commit hash of the pipeline to use. Equivalent to -r option in Nextflow. If not specified, the last commit of the default branch will be used.

  • git_tag (string) – The tag of the pipeline to use. If not specified, the last commit of the default branch will be used.

  • git_branch (string) – The branch of the pipeline to use. If not specified, the last commit of the default branch will be used.

  • project_id (string) – The CloudOS project id for a given project name.

  • workflow_id (string) – The CloudOS workflow id for a given workflow_name.

  • job_name (string) – The name to assign to the job.

  • resumable (bool) – Whether to create a resumable job or not.

  • save_logs (bool) – Whether to save job logs or not.

  • batch (bool) – Whether to create an AWS batch job or not.

  • job_queue_id (string) – Job queue Id to use in the batch job.

  • nextflow_profile (string) – A comma separated string with the profiles to be used.

  • nextflow_version (string) – Nextflow version to use when executing the workflow in CloudOS.

  • instance_type (string) – Name of the instance type to be used for the job master node, e.g. c5.xlarge for AWS EC2.

  • instance_disk (int) – The disk space of the master node instance, in GB.

  • storage_mode (string) – Either ‘lustre’ or ‘regular’. Indicates if the user wants to select regular or lustre storage.

  • lustre_size (int) – The lustre storage to be used when --storage-mode=lustre, in GB. It should be 1200 or a multiple of it.

  • execution_platform (string ['aws'|'azure'|'hpc']) – The execution platform implemented in your CloudOS.

  • hpc_id (string) – The ID of your HPC in CloudOS.

  • workflow_type (str) – The type of workflow to run. It could be ‘nextflow’, ‘wdl’ or ‘docker’.

  • cromwell_id (str) – Cromwell server ID.

  • azure_worker_instance_type (str) – The worker node instance type to be used in Azure.

  • azure_worker_instance_disk (int) – The disk size in GB for the worker node to be used in Azure.

  • azure_worker_instance_spot (bool) – Whether the Azure worker nodes have to be spot instances or not.

  • cost_limit (float) – Job cost limit. -1 means no cost limit.

  • use_mountpoints (bool) – Whether or not to use AWS S3 mountpoints for quicker file staging.

  • accelerate_saving_results (bool) – Whether to save results directly to cloud storage bypassing the master node.

  • docker_login (bool) – Whether to use private docker images, provided the users have linked their docker.io accounts.

  • command (string) – The command to run in bash jobs.

  • cpus (int) – The number of CPUs to use for the bash job task’s master node.

  • memory (int) – The amount of memory, in GB, to use for the bash job task’s master node.

Returns:

params – A JSON formatted dict.

Return type:

dict
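
To picture the conversion, here is a toy nextflow.config with a parameters scope and a deliberately simplified, illustrative parser. The library’s actual parsing is more complete; this sketch only handles one ‘key = value’ pair per line.

```python
config_text = """
params {
    reads = 's3://bucket/reads/*.fastq.gz'
    genome = 'GRCh38'
    paired_end = true
}
"""

def params_scope_to_dict(text):
    """Naive sketch: collect key = value lines inside the params { ... } block."""
    params, inside = {}, False
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("params") and line.endswith("{"):
            inside = True
            continue
        if inside and line == "}":
            break
        if inside and "=" in line:
            key, _, value = line.partition("=")
            params[key.strip()] = value.strip().strip("'\"")
    return params

params = params_scope_to_dict(config_text)
```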

create_project(workspace_id, project_name, verify=True)

Create a new project in CloudOS.

Parameters:
  • workspace_id (str) – The CloudOS workspace ID where the project will be created.

  • project_name (str) – The name for the new project.

  • verify ([bool | str], optional) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file. Default is True.

Returns:

The ID of the newly created project.

Return type:

str

Raises:

BadRequestException – If the request to create the project fails with a status code indicating an error.

cromwell_switch(workspace_id, action, verify=True)

Restart or stop the Cromwell server.

Parameters:
  • workspace_id (string) – The CloudOS workspace id in which to restart/stop the Cromwell server.

  • action (string [restart|stop]) – The action to perform.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

Returns:

r – The server response

Return type:

requests.models.Response

cromwell_token: str
delete_job_results(job_id, mode, verify=True)[source]

Delete job results folder.

Parameters:
  • job_id (str) – The CloudOS job ID whose results folder is to be deleted.

  • mode (str) – The mode to use for deletion (e.g., “analysisResults” or “workDirectory”).

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file. Default is True.

Returns:

A dictionary containing the deletion response from the API.

Return type:

dict

Raises:
  • BadRequestException – If the request fails with a status code indicating an error.

  • ValueError – If the folder ID is invalid or the folder does not exist.

detect_workflow(workflow_name, workspace_id, verify=True, last=False)

Detects workflow type: nextflow or wdl.

Parameters:
  • workflow_name (string) – Name of the workflow.

  • workspace_id (string) – The CloudOS workspace id from which to collect the workflows.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

  • last (bool) – When workflows are duplicated, use the latest imported workflow (by date).

Returns:

wt – The workflow type detected

Return type:

string [‘nextflow’|’wdl’]

docker_workflow_param_processing(param, project_name)[source]

Processes a Docker workflow parameter and determines its type and associated metadata.

Parameters:
  • param (str) – The parameter string in the format ‘--param_name=value’. It can represent a file path, a glob pattern, or a simple text value.

  • project_name (str) – The name of the current project to use if no specific project is extracted from the parameter.

Returns:

A dictionary containing the processed parameter details. The structure of the dictionary depends on the type of the parameter:

  • For glob patterns:

    {
        "name": str,           # Parameter name without leading dashes.
        "prefix": str,         # Prefix ('--' or '-') based on the parameter format.
        "globPattern": str,    # The glob pattern extracted from the parameter.
        "parameterKind": str,  # Always "globPattern".
        "folder": str          # Folder ID associated with the glob pattern.
    }

  • For file paths:

    {
        "name": str,           # Parameter name without leading dashes.
        "prefix": str,         # Prefix ('--' or '-') based on the parameter format.
        "parameterKind": str,  # Always "dataItem".
        "dataItem": {
            "kind": str,       # Always "File".
            "item": str        # File ID associated with the file path.
        }
    }

  • For text values:

    {
        "name": str,           # Parameter name without leading dashes.
        "prefix": str,         # Prefix ('--' or '-') based on the parameter format.
        "parameterKind": str,  # Always "textValue".
        "textValue": str       # The text value extracted from the parameter.
    }

Return type:

dict

Notes

  • The function uses helper methods extract_project, classify_pattern, and get_file_or_folder_id to process the parameter.

  • If the parameter represents a file path or glob pattern, the function retrieves the corresponding file or folder ID from the cloud workspace.

  • If the parameter does not match any specific pattern or file extension, it is treated as a simple text value.
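A rough illustration of the classification logic described above (hypothetical and simplified; the real method also resolves file and folder IDs via the API, which this sketch omits):

```python
import re

def classify_param(param):
    """Sketch: split '--name=value' and guess the parameter kind from the value."""
    name_part, _, value = param.partition("=")
    prefix = "--" if name_part.startswith("--") else "-"
    name = name_part.lstrip("-")
    if re.search(r"[*?\[\]]", value):          # glob metacharacters
        kind = "globPattern"
    elif re.search(r"\.\w+$", value) or "/" in value:  # looks like a file path
        kind = "dataItem"
    else:
        kind = "textValue"
    return {"name": name, "prefix": prefix, "parameterKind": kind}
```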

fetch_cloudos_id(apikey, cloudos_url, resource, workspace_id, name, mainfile=None, importsfile=None, repository_platform='github', verify=True)[source]

Fetch the CloudOS id for a given name.

Parameters:
  • apikey (string) – Your CloudOS API key

  • cloudos_url (string) – The CloudOS service url.

  • resource (string) – The resource you want to fetch from. E.g.: projects.

  • workspace_id (string) – The specific Cloudos workspace id.

  • name (string) – The name of a CloudOS resource element.

  • mainfile (string) – The name of the mainFile used by the workflow. Only used when resource == ‘workflows’. Required for WDL pipelines as different mainFiles could be loaded for a single pipeline.

  • importsfile (string) – The name of the importsFile used by the workflow. Optional and only used for WDL pipelines as different importsFiles could be loaded for a single pipeline.

  • repository_platform (string) – The name of the repository platform where the workflow resides.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

Returns:

The CloudOS id for the given resource name (e.g., the project id for a project name).

Return type:

string

fix_boolean_strings(obj)[source]

Recursively convert string booleans (‘True’, ‘False’) into real booleans inside dicts, lists, or nested structures.
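Its behaviour can be sketched as follows (an illustrative re-implementation, not the library’s code):

```python
def fix_boolean_strings_sketch(obj):
    """Recursively turn 'True'/'False' strings into real booleans."""
    if isinstance(obj, dict):
        return {k: fix_boolean_strings_sketch(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [fix_boolean_strings_sketch(v) for v in obj]
    if obj == "True":
        return True
    if obj == "False":
        return False
    return obj  # anything else passes through unchanged

fixed = fix_boolean_strings_sketch({"resumable": "True", "runs": [{"spot": "False"}]})
```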

get_cromwell_status(workspace_id, verify=True)

Get Cromwell server status from CloudOS.

Parameters:
  • workspace_id (string) – The CloudOS workspace id from which to check the Cromwell status.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

Returns:

r – The server response

Return type:

requests.models.Response

get_field_from_jobs_endpoint(job_id, field=None, verify=True)[source]

Get the value of a given field from the jobs endpoint for a job.

Parameters:
  • job_id (str) – The CloudOS job ID to query.

  • field (str, optional) – The name of the field to retrieve from the job entry.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

Returns:

The value of the requested field.

Return type:

str

get_folder_deletion_status(folder_id, workspace_id, verify=True)

Get deletion status of a specific folder by ID.

Simple API wrapper to query the folders API for a specific folder with its deletion status (ready/deleting/etc).

Parameters:
  • folder_id (str) – The CloudOS folder ID.

  • workspace_id (str) – The CloudOS workspace ID.

  • verify ([bool | str], optional) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file. Default is True.

Returns:

The API response containing folder information with status.

Return type:

Response

Raises:

BadRequestException – If the request fails with a status code indicating an error.

get_folder_items_deletion_status(folder_id, workspace_id, verify=True)

Get deletion status of items within a folder.

Simple API wrapper to query the datasets API for items in a folder with their deletion status (ready/deleting).

Parameters:
  • folder_id (str) – The CloudOS folder ID.

  • workspace_id (str) – The CloudOS workspace ID.

  • verify ([bool | str], optional) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file. Default is True.

Returns:

The API response containing folders and files with their status.

Return type:

Response

Raises:

BadRequestException – If the request fails with a status code indicating an error.

get_job_list(workspace_id, last_n_jobs=None, page=None, page_size=None, archived=False, verify=True, filter_status=None, filter_job_name=None, filter_project=None, filter_workflow=None, filter_job_id=None, filter_only_mine=False, filter_owner=None, filter_queue=None, last=False)

Get jobs from a CloudOS workspace with optional filtering.

Fetches jobs page by page, applies all filters after fetching. Stops when enough jobs are collected or no more jobs are available.

Parameters:
  • workspace_id (string) – The CloudOS workspace id from which to collect the jobs.

  • last_n_jobs ([int | 'all'], default=None) – How many of the last jobs from the user to retrieve. You can specify a very large int or ‘all’ to get all user’s jobs. When specified, page and page_size parameters are ignored.

  • page (int, default=None) – Response page to get when not using last_n_jobs.

  • page_size (int, default=None) – Number of jobs to retrieve per page when not using last_n_jobs. Maximum allowed value is 100.

  • archived (bool, default=False) – When True, only the archived jobs are retrieved.

  • verify ([bool|string], default=True) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

  • filter_status (string, optional) – Filter jobs by status (e.g., ‘completed’, ‘running’, ‘failed’).

  • filter_job_name (string, optional) – Filter jobs by name.

  • filter_project (string, optional) – Filter jobs by project name (will be resolved to project ID).

  • filter_workflow (string, optional) – Filter jobs by workflow name (will be resolved to workflow ID).

  • filter_job_id (string, optional) – Filter jobs by specific job ID.

  • filter_only_mine (bool, optional) – Filter to show only jobs belonging to the current user.

  • filter_owner (string, optional) – Filter jobs by owner username (will be resolved to user ID).

  • filter_queue (string, optional) – Filter jobs by queue name (will be resolved to queue ID). Only applies to jobs running in batch environment. Non-batch jobs are preserved in results as they don’t use queues.

  • last (bool, optional) – When workflows are duplicated, use the latest imported workflow (by date).

Returns:

r – A list of dicts, each corresponding to a job from the user and the workspace.

Return type:

list
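
Since filters are applied client-side after fetching, the filtering step can be pictured like this (the job dicts and field names below are mock illustrations, not the exact API schema):

```python
jobs = [
    {"id": "j1", "status": "completed", "name": "rnaseq-run"},
    {"id": "j2", "status": "failed", "name": "qc-run"},
    {"id": "j3", "status": "completed", "name": "qc-run"},
]

def apply_filters(jobs, filter_status=None, filter_job_name=None):
    """Keep only jobs matching every filter that was provided."""
    out = jobs
    if filter_status is not None:
        out = [j for j in out if j["status"] == filter_status]
    if filter_job_name is not None:
        out = [j for j in out if j["name"] == filter_job_name]
    return out

completed_qc = apply_filters(jobs, filter_status="completed", filter_job_name="qc-run")
```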

get_job_logs(j_id, workspace_id, verify=True)

Get the location of the logs for the specified job.

get_job_relatedness(workspace_id, workdir_folder_id, limit=100, verify=True)[source]

Get ALL related jobs that share the same working directory folder.

This method retrieves all jobs sharing the same working directory folder, using pagination internally to fetch all results from the API.

Parameters:
  • workspace_id (str) – The CloudOS workspace ID.

  • workdir_folder_id (str) – The working directory folder ID to filter jobs by.

  • limit (int) – Batch size for API requests (default: 100). This parameter is kept for backwards compatibility but fetches all jobs regardless.

  • verify ([bool | str], optional) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file. Default is True.

Returns:

A dictionary where keys are job IDs and values are dictionaries containing job details: status, name, user name, user surname, _id, createdAt, runTime, and computeCostSpent.

Return type:

dict

Raises:

BadRequestException – If the request fails with a status code indicating an error.

get_job_request_payload(job_id, verify=True)[source]

Get the original request payload for a job.

Parameters:
  • job_id (str) – The CloudOS job ID to get the payload for.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

Returns:

The original job request payload.

Return type:

dict

get_job_results(j_id, workspace_id, verify=True)

Get the location of the results for the specified job.

get_job_status(j_id, workspace_id=None, verify=True)

Get job status from CloudOS.

Parameters:
  • j_id (string) – The CloudOS job id of the job just launched.

  • workspace_id (string) – The CloudOS workspace id from which to check the job status.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

Returns:

r – The server response

Return type:

requests.models.Response

get_job_workdir(j_id, workspace_id, verify=True)

Get the working directory for the specified job.

get_parent_job(workspace_id, folder_id, verify=True)[source]

Get the parent job of a given folder.

Parameters:
  • workspace_id (str) – The CloudOS workspace ID.

  • folder_id (str) – The ID of the folder whose parent job is to be retrieved.

  • verify ([bool | str], optional) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file. Default is True.

Returns:

A dictionary containing details of the parent job.

Return type:

dict

Raises:

BadRequestException – If the request fails with a status code indicating an error.

get_project_id_from_name(workspace_id, project_name, verify=True)

Retrieve the project ID from its name.

Parameters:
  • workspace_id (str) – The CloudOS workspace ID to search for the project.

  • project_name (str) – The name of the project to search for.

  • verify ([bool | str], optional) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file. Default is True.

Returns:

The server response containing project details.

Return type:

dict

Raises:

BadRequestException – If the request to retrieve the project fails with a status code indicating an error.

get_project_list(workspace_id, verify=True, get_all=True, page=1, page_size=10, max_page_size=100)

Get all the projects from a CloudOS workspace.

Parameters:
  • workspace_id (string) – The CloudOS workspace id from which to collect the projects.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

  • get_all (bool) – Whether to get all available projects or just the indicated page.

  • page (int) – The page number to retrieve, from the paginated response.

  • page_size (int) – The number of projects per page. From 1 to 1000.

  • max_page_size (int) – Max page size defined by the API server. It is currently 1000.

Returns:

r – The server response

Return type:

requests.models.Response

get_results_deletion_status(job_id, workspace_id, verify=True)

Get the deletion status of a specific job’s results folder.

This method orchestrates finding the job’s results folder and retrieving the deletion status of items within it.

Parameters:
  • job_id (str) – The CloudOS job ID.

  • workspace_id (str) – The CloudOS workspace ID.

  • verify ([bool | str], optional) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file. Default is True.

Returns:

A dictionary containing the deletion status information with the following structure:

{
    "job_id": str,               # The job ID
    "job_name": str,             # The job name
    "results_folder_id": str,    # The ID of the job’s results folder
    "results_folder_name": str,  # The name of the job’s results folder
    "items": dict                # Dictionary with ‘folders’ and ‘files’ arrays containing items and their status
}

Return type:

dict

Raises:
  • BadRequestException – If the request fails with a status code indicating an error.

  • ValueError – If the job’s results folder is not found.
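
Given a response shaped like the documented structure (the dict below is a mock, not real API output), the per-item deletion status can be inspected like so:

```python
# Mock response shaped like the documented return structure.
status = {
    "job_id": "abc123",
    "job_name": "rnaseq-run",
    "results_folder_id": "f1",
    "results_folder_name": "results",
    "items": {
        "folders": [{"name": "fastqc", "status": "deleting"}],
        "files": [{"name": "report.html", "status": "ready"}],
    },
}

# Collect the names of all items still being deleted.
pending = [
    item["name"]
    for kind in ("folders", "files")
    for item in status["items"][kind]
    if item["status"] == "deleting"
]
```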

get_storage_contents(cloud_name, cloud_meta, container, path, workspace_id, verify)

Retrieves the contents of a storage container from the specified cloud service.

This method fetches the contents of a specified path within a storage container on a cloud service (e.g., AWS S3 or Azure Blob). The request is authenticated using an API key and requires valid parameters such as the workspace ID and path.

Parameters:
  • cloud_name (str) – The name of the cloud service (e.g., ‘aws’ or ‘azure’).

  • cloud_meta (dict) – Cloud-specific metadata used for the storage request.

  • container (str) – The name of the storage container or bucket.

  • path (str) – The file path or directory within the storage container.

  • workspace_id (str) – The identifier of the workspace or team.

  • verify (bool) – Whether to verify SSL certificates for the request.

Returns:

A list of contents retrieved from the specified cloud storage.

Return type:

list

Raises:

BadRequestException – If the request to retrieve the contents fails with a status code indicating an error.

get_user_info(verify=True)

Gets user information from the users/me endpoint.

Parameters:

verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

Returns:

r – The server response content

Return type:

requests.models.Response.content

get_workdir_deletion_status(job_id, workspace_id, verify=True)

Get the deletion status of a specific job’s working directory.

This method retrieves the deletion status of the job’s working directory using the folders API endpoint.

Parameters:
  • job_id (str) – The CloudOS job ID.

  • workspace_id (str) – The CloudOS workspace ID.

  • verify ([bool | str], optional) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file. Default is True.

Returns:

A dictionary containing the deletion status information with the following structure:

{
    "job_id": str,               # The job ID
    "job_name": str,             # The job name
    "workdir_folder_id": str,    # The ID of the job’s working directory folder
    "workdir_folder_name": str,  # The name of the job’s working directory folder
    "status": str,               # The deletion status
    "items": dict                # Full folder object with metadata
}

Return type:

dict

Raises:
  • BadRequestException – If the request fails with a status code indicating an error.

  • ValueError – If the job’s working directory is not found or not accessible.

get_workflow_content(workspace_id, workflow_name, verify=True, last=False, max_page_size=100)

Retrieve the workflow content from API.

Parameters:
  • workspace_id (str) – The CloudOS workspace ID to search for the workflow.

  • workflow_name (str) – The name of the workflow to search for.

  • verify ([bool | str], optional) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file. Default is True.

  • last (bool, optional) – When workflows are duplicated, use the latest imported workflow (by date).

  • max_page_size (int, optional) – Max page size defined by the API server.

Returns:

The server response containing workflow details.

Return type:

dict

Raises:

BadRequestException – If the request to retrieve the workflow fails with a status code indicating an error.

get_workflow_list(workspace_id, verify=True, get_all=True, page=1, page_size=10, max_page_size=100, archived_status=False)

Get all the workflows from a CloudOS workspace.

Parameters:
  • workspace_id (string) – The CloudOS workspace id from which to collect the workflows.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

  • get_all (bool) – Whether to get all available curated workflows or just the indicated page.

  • page (int) – The page number to retrieve, from the paginated response.

  • page_size (int) – The number of workflows per page. From 1 to 1000.

  • max_page_size (int) – Max page size defined by the API server. It is currently 1000.

  • archived_status (bool) – Whether to retrieve archived workflows or not.

Returns:

r – A list of dicts, each corresponding to a workflow.

Return type:

list

get_workflow_max_pagination(workspace_id, workflow_name, verify=True)

Retrieve the maximum number of workflow pages from the API.

Parameters:
  • workspace_id (str) – The CloudOS workspace ID to search for the workflow.

  • workflow_name (str) – The name of the workflow to search for.

  • verify ([bool | str], optional) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file. Default is True.

Returns:

The server response with max pagination for workflows.

Return type:

int

Raises:

BadRequestException – If the request to retrieve the workflow fails with a status code indicating an error.

importsfile: str = None
is_module(workflow_name, workspace_id, verify=True, last=False)

Detects whether the workflow is a system module or not.

System modules use fixed queues, so this check is important to properly manage queue selection.

Parameters:
  • workflow_name (string) – Name of the workflow.

  • workspace_id (string) – The CloudOS workspace id from which to collect the workflows.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

  • last (bool) – When workflows are duplicated, use the latest imported workflow (by date).

Returns:

True if the workflow is a system module, False otherwise.

Return type:

bool

last: bool = False
mainfile: str = None
static process_job_list(r, all_fields=False)

Process a job list from a self.get_job_list call.

Parameters:
  • r (list) – A list of dicts, each corresponding to a job from the user and the workspace.

  • all_fields (bool. Default=False) – Whether to return a reduced version of the DataFrame containing only the selected columns or the full DataFrame.

Returns:

df – A DataFrame with the requested columns from the jobs.

Return type:

pandas.DataFrame
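
The reduction performed by these static helpers can be pictured with pandas (mock job dicts and an illustrative column subset; the real helper selects its own set of fields):

```python
import pandas as pd

# Mock list of job dicts, as returned by a get_job_list-style call.
jobs = [
    {"_id": "j1", "name": "rnaseq-run", "status": "completed", "computeCostSpent": 120},
    {"_id": "j2", "name": "qc-run", "status": "failed", "computeCostSpent": 15},
]

df = pd.DataFrame(jobs)
# With all_fields=False, only a reduced set of columns would be kept.
reduced = df[["_id", "name", "status"]]
```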

static process_project_list(r, all_fields=False)

Process a server response from a self.get_project_list call.

Parameters:
  • r (requests.models.Response) – A list of dicts, each corresponding to a project.

  • all_fields (bool. Default=False) – Whether to return a reduced version of the DataFrame containing only the selected columns or the full DataFrame.

Returns:

df – A DataFrame with the requested columns from the projects.

Return type:

pandas.DataFrame

static process_workflow_list(r, all_fields=False)

Process a server response from a self.get_workflow_list call.

Parameters:
  • r (list) – A list of dicts, each corresponding to a workflow.

  • all_fields (bool. Default=False) – Whether to return a reduced version of the DataFrame containing only the selected columns or the full DataFrame.

Returns:

df – A DataFrame with the requested columns from the workflows.

Return type:

pandas.DataFrame

property project_id: str
project_name: str
reorder_job_list(my_jobs_df, filename='my_jobs.csv')

Save a job list DataFrame to a CSV file with renamed and ordered columns.

Parameters:
  • my_jobs_df (pandas.DataFrame) – A DataFrame containing job information from process_job_list.

  • filename (str) – The name of the file to save the DataFrame to. Default is ‘my_jobs.csv’.

Returns:

Saves the DataFrame to a CSV file with renamed and ordered columns.

Return type:

None

repository_platform: str = 'github'
resolve_user_id(filter_owner, workspace_id, verify=True)

Resolve a username or display name to a user ID.

Parameters:
  • filter_owner (str) – The username or display name to search for.

  • workspace_id (str) – The CloudOS workspace ID.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

Returns:

The user ID corresponding to the filter_owner.

Return type:

str

Raises:

ValueError – If the user cannot be found or if there’s an error during the search.

retrieve_cols_from_array_file(array_file, ds, separator, verify_ssl)[source]

Retrieve metadata for columns from an array file stored in a directory.

This method fetches the metadata of an array file by interacting with a directory service and making an API call to retrieve the file’s metadata.

Parameters:
  • array_file (str) – The path to the array file whose metadata is to be retrieved.

  • ds (object) – The directory service object used to list folder content.

  • separator (str) – The separator used in the array file.

  • verify_ssl (bool) – Whether to verify SSL certificates during the API request.

Raises:
  • ValueError – If the specified file is not found in the directory.

  • BadRequestException – If the API request to retrieve metadata fails with a status code >= 400.

Returns:

The HTTP response object containing the metadata of the array file.

Return type:

Response
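
The server-side metadata call cannot be reproduced offline, but the column information involved can be approximated locally by splitting the array file's header line on the configured separator; this helper is illustrative, not the method's implementation:

```python
# Approximate the array-file column metadata from a header line.
def header_columns(first_line, separator):
    return [{"name": name, "index": i}
            for i, name in enumerate(first_line.rstrip("\n").split(separator))]

print(header_columns("sample,fastq_1,fastq_2", ","))
```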

save_job_list_to_csv(my_jobs_df, filename='my_jobs.csv')
send_job(job_config=None, parameter=(), array_parameter=(), array_file_header=None, is_module=False, example_parameters=[], git_commit=None, git_tag=None, git_branch=None, job_name='new_job', resumable=False, save_logs=True, batch=True, job_queue_id=None, nextflow_profile=None, nextflow_version='22.10.8', instance_type='c5.xlarge', instance_disk=500, storage_mode='regular', lustre_size=1200, execution_platform='aws', hpc_id=None, workflow_type='nextflow', cromwell_id=None, azure_worker_instance_type='Standard_D4as_v4', azure_worker_instance_disk=100, azure_worker_instance_spot=False, cost_limit=30.0, use_mountpoints=False, accelerate_saving_results=False, docker_login=False, verify=True, command=None, cpus=1, memory=4)[source]

Send a job to CloudOS.

Parameters:
  • job_config (string) – Path to a nextflow.config file with parameters scope.

  • parameter (tuple) – Tuple of strings indicating the parameters to pass to the pipeline call. They are in the following form: (‘param1=param1val’, ‘param2=param2val’, …)

  • array_parameter (tuple) – Tuple of strings indicating the parameters to pass to the pipeline call for array jobs. They are in the following form: (‘param1=param1val’, ‘param2=param2val’, …)

  • array_file_header (string) – The header of the file containing the array parameters. It is used to add the necessary column index for array file columns.

  • example_parameters (list) – A list of dicts, with the parameters required for the API request in JSON format. It is typically used to run curated pipelines using the already available example parameters.

  • git_commit (string) – The git commit hash of the pipeline to use. Equivalent to -r option in Nextflow. If not specified, the last commit of the default branch will be used.

  • git_tag (string) – The tag of the pipeline to use. If not specified, the last commit of the default branch will be used.

  • git_branch (string) – The branch of the pipeline to use. If not specified, the last commit of the default branch will be used.

  • job_name (string) – The name to assign to the job.

  • resumable (bool) – Whether to create a resumable job or not.

  • save_logs (bool) – Whether to save job logs or not.

  • batch (bool) – Whether to create an AWS batch job or not.

  • job_queue_id (string) – Job queue Id to use in the batch job.

  • nextflow_profile (string) – A comma separated string with the profiles to be used.

  • nextflow_version (string) – Nextflow version to use when executing the workflow in CloudOS.

  • instance_type (string) – Name of the instance type to be used for the job master node, e.g. ‘c5.xlarge’ for AWS EC2.

  • instance_disk (int) – The disk space of the master node instance, in GB.

  • storage_mode (string) – Either ‘lustre’ or ‘regular’. Indicates if the user wants to select regular or lustre storage.

  • lustre_size (int) – The lustre storage to be used when --storage-mode=lustre, in GB. It should be 1200 or a multiple of 1200.

  • execution_platform (string ['aws'|'azure'|'hpc']) – The execution platform implemented in your CloudOS.

  • hpc_id (string) – The ID of your HPC in CloudOS.

  • workflow_type (str) – The type of workflow to run. It could be ‘nextflow’, ‘wdl’ or ‘docker’.

  • cromwell_id (str) – Cromwell server ID.

  • azure_worker_instance_type (str) – The worker node instance type to be used in azure.

  • azure_worker_instance_disk (int) – The disk size in GB for the worker node to be used in azure.

  • azure_worker_instance_spot (bool) – Whether the azure worker nodes have to be spot instances or not.

  • cost_limit (float) – Job cost limit. -1 means no cost limit.

  • use_mountpoints (bool) – Whether to use AWS S3 mountpoints for quicker file staging or not.

  • accelerate_saving_results (bool) – Whether to save results directly to cloud storage bypassing the master node.

  • docker_login (bool) – Whether to use private docker images, provided the users have linked their docker.io accounts.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

  • command (string) – The command to run in bash jobs.

  • cpus (int) – The number of CPUs to use for the bash jobs task’s master node.

  • memory (int) – The amount of memory, in GB, to use for the bash job task’s master node.

Returns:

j_id – The CloudOS job id of the job just launched.

Return type:

string
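
The ‘param1=param1val’ form expected by the parameter tuple can be parsed as sketched below. The payload keys (‘prefix’, ‘name’, ‘parameterKind’, ‘textValue’) are assumptions about the request shape, shown for illustration only:

```python
# Sketch of turning the `parameter` tuple into per-parameter dicts.
# Key names are hypothetical, not the documented API payload.
def parse_parameters(parameter):
    params = []
    for p in parameter:
        if "=" not in p:
            raise ValueError(f"Parameter '{p}' is not in 'name=value' form")
        name, value = p.split("=", 1)
        params.append({"prefix": "--", "name": name.lstrip("-"),
                       "parameterKind": "textValue", "textValue": value})
    return params

print(parse_parameters(("input=s3://bucket/data.csv", "outdir=results")))
```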

setup_params_array_file(custom_script_path, ds_custom, command, separator)[source]

Sets up a dictionary representing command parameters, including support for custom scripts and array files, to be used in job execution.

Parameters:
  • custom_script_path (str) – Path to the custom script file. If None, the command is treated as text.

  • ds_custom (object) – An object providing access to folder content listing functionality.

  • command (str) – The command to be executed, either as text or the name of a custom script.

  • separator (str) – The separator to be used for the array file.

Returns:

A dictionary containing the command parameters, including:
  • ‘command’: The command name or text.

  • ‘customScriptFile’ (optional): Details of the custom script file if provided.

  • ‘arrayFile’: Details of the array file and its separator.

Return type:

dict
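
The returned dict shape described above can be sketched as follows; whether the real method populates exactly these nested values is an assumption:

```python
# Sketch of setup_params_array_file's return value: a command-parameter
# dict with an optional custom-script entry and an array-file entry.
def build_command_params(command, separator, custom_script=None):
    params = {"command": command,
              "arrayFile": {"separator": separator}}
    if custom_script is not None:
        params["customScriptFile"] = custom_script
    return params

print(build_command_params("run.sh", ",", {"name": "run.sh"}))
```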

static split_array_file_params(array_parameter, workflow_type, array_file_header)[source]

Splits and processes array parameters for a given workflow type and array file header.

Parameters:
  • array_parameter (list) – A list of strings representing array parameters in the format “key=value”.

  • workflow_type (str) – The type of workflow, e.g., ‘docker’.

  • array_file_header (list) – A list of dictionaries representing the header of the array file. Each dictionary should contain “name” and “index” keys.

Returns:

A dictionary containing processed parameter details, including:
  • prefix (str): The prefix for the parameter (e.g., ‘--’ or ‘-’).

  • name (str): The name of the parameter with leading dashes stripped.

  • parameterKind (str): The kind of parameter, set to “arrayFileColumn”.

  • columnName (str): The name of the column derived from the parameter value.

  • columnIndex (int): The index of the column in the array file header.

Return type:

dict

Raises:

ValueError – If an array parameter does not contain a ‘=’ character or is improperly formatted.
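
The documented splitting logic can be sketched for a single parameter; the prefix-detection rule here is an assumption, and only the documented output keys are reproduced:

```python
# Local sketch of split_array_file_params for one "key=value" entry:
# map the value to a column in the array-file header.
def split_array_param(param, array_file_header):
    if "=" not in param:
        raise ValueError(f"Array parameter '{param}' must contain '='")
    name, column = param.split("=", 1)
    prefix = "--" if name.startswith("--") else "-"
    index = next(h["index"] for h in array_file_header if h["name"] == column)
    return {"prefix": prefix, "name": name.lstrip("-"),
            "parameterKind": "arrayFileColumn",
            "columnName": column, "columnIndex": index}

header = [{"name": "fastq_1", "index": 1}]
print(split_array_param("--reads=fastq_1", header))
```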

update_parameter_value(parameters, param_name, new_value)[source]

Update a parameter value in the parameters list.

Parameters:
  • parameters (list) – List of parameter dictionaries.

  • param_name (str) – Name of the parameter to update.

  • new_value (str) – New value for the parameter.

Returns:

True if parameter was found and updated, False otherwise.

Return type:

bool
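
A minimal sketch of this update-in-place behaviour; the ‘name’/‘value’ keys are assumptions about the parameter-dict layout:

```python
# Sketch of update_parameter_value: find a parameter by name,
# update its value in place, and report whether it was found.
def update_parameter(parameters, param_name, new_value):
    for param in parameters:
        if param.get("name") == param_name:
            param["value"] = new_value
            return True
    return False

params = [{"name": "outdir", "value": "old"}]
print(update_parameter(params, "outdir", "results"), params)
```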

verify: bool | str = True
wait_job_completion(job_id, workspace_id, wait_time=3600, request_interval=30, verbose=False, verify=True)

Checks job status from CloudOS and waits for its completion.

Parameters:
  • job_id (string) – The CloudOS job id of the job just launched.

  • workspace_id (string) – The CloudOS workspace id in which to check the job status.

  • wait_time (int) – Max time to wait (in seconds) for job completion.

  • request_interval (int) – Time interval (in seconds) to request job status.

  • verbose (bool) – Whether to output status on every request or not.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

Returns:

A dict with three elements collected from the job status: ‘name’, ‘id’, ‘status’.

Return type:

dict
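
The polling loop described above can be sketched with a stubbed status function standing in for the CloudOS status endpoint; the terminal status names are assumptions:

```python
import time

# Polling sketch: ask for status every request_interval seconds
# until a terminal status is seen or wait_time is exhausted.
def wait_for_completion(get_status, wait_time=3600, request_interval=30):
    elapsed = 0
    while elapsed <= wait_time:
        status = get_status()
        if status["status"] in ("completed", "failed", "aborted"):
            return status
        time.sleep(request_interval)
        elapsed += request_interval
    return get_status()

statuses = iter([{"name": "j", "id": "1", "status": "running"},
                 {"name": "j", "id": "1", "status": "completed"}])
print(wait_for_completion(lambda: next(statuses), request_interval=0))
```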

workflow_content_query(workspace_id, workflow_name, verify=True, query='workflowType', last=False)
property workflow_id: str
workflow_import(workspace_id, workflow_url, workflow_name, repository_project_id, workflow_docs_link='', repository_id=None, verify=True)

Imports workflows to CloudOS.

Parameters:
  • workspace_id (string) – The CloudOS workspace id to import the workflow into.

  • workflow_url (string) – The URL of the workflow. Only Github or Bitbucket are allowed.

  • workflow_name (string) – A name for the imported pipeline in CloudOS.

  • repository_project_id (int) – The repository project ID.

  • workflow_docs_link (string) – Link to the documentation URL.

  • repository_id (int) – The repository ID. Only required for GitHub repositories.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

Returns:

workflow_id – The newly imported workflow ID.

Return type:

string
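
Since only GitHub or Bitbucket URLs are accepted, a local pre-flight check along those lines might look like this (the exact server-side validation is an assumption):

```python
from urllib.parse import urlparse

# Check whether a workflow URL points at a supported repository host.
def is_supported_repo(workflow_url):
    host = urlparse(workflow_url).netloc.lower()
    return host.endswith("github.com") or host.endswith("bitbucket.org")

print(is_supported_repo("https://github.com/org/pipeline"))   # True
print(is_supported_repo("https://gitlab.com/org/pipeline"))   # False
```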

workflow_name: str
workspace_id: str