cloudos_cli.jobs.job

This is the main class to create jobs.

Classes

Job(cloudos_url, apikey, cromwell_token, ...)

Class to store and operate jobs.

class cloudos_cli.jobs.job.Job(cloudos_url, apikey, cromwell_token, workspace_id, project_name, workflow_name, last=False, verify=True, mainfile=None, importsfile=None, repository_platform='github', project_id=<property object>, workflow_id=<property object>)[source]

Bases: Cloudos

Class to store and operate jobs.

Parameters:
  • cloudos_url (string) – The CloudOS service url.

  • apikey (string) – Your CloudOS API key.

  • cromwell_token (string) – Cromwell server token.

  • workspace_id (string) – The specific Cloudos workspace id.

  • project_name (string) – The name of a CloudOS project.

  • workflow_name (string) – The name of a CloudOS workflow or pipeline.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

  • mainfile (string) – The name of the mainFile used by the workflow. Required for WDL pipelines as different mainFiles could be loaded for a single pipeline.

  • importsfile (string) – The name of the importsFile used by the workflow. Optional and only used for WDL pipelines as different importsFiles could be loaded for a single pipeline.

  • repository_platform (string) – The name of the repository platform of the workflow.

  • project_id (string) – The CloudOS project id for a given project name.

  • workflow_id (string) – The CloudOS workflow id for a given workflow_name.

  • last (bool) – When workflows are duplicated, use the latest imported workflow (by date). Default is False.

abort_job(job, workspace_id, verify=True)

Abort a job.

Parameters:
  • job (string) – The CloudOS job id of the job to abort.

  • workspace_id (string) – The specific CloudOS workspace id.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

Returns:

r – The server response

Return type:

requests.models.Response

apikey: str
clone_or_resume_job(source_job_id, queue_name=None, cost_limit=None, master_instance=None, job_name=None, nextflow_version=None, branch=None, profile=None, do_not_save_logs=None, use_fusion=None, resumable=None, project_name=None, parameters=None, verify=True, mode=None)[source]

Clone or resume an existing job with optional parameter overrides.

Parameters:
  • source_job_id (str) – The CloudOS job ID to clone/resume from.

  • queue_name (str, optional) – Name of the job queue to use.

  • cost_limit (float, optional) – Job cost limit override.

  • master_instance (str, optional) – Master instance type override.

  • job_name (str, optional) – New job name.

  • nextflow_version (str, optional) – Nextflow version override.

  • branch (str, optional) – Git branch override.

  • profile (str, optional) – Nextflow profile override.

  • do_not_save_logs (bool, optional) – Whether to save logs override.

  • use_fusion (bool, optional) – Whether to use fusion filesystem override.

  • resumable (bool, optional) – Whether to make the job resumable or not.

  • project_name (str, optional) – Project name override (will look up new project ID).

  • parameters (list, optional) – List of parameter overrides in format [‘param1=value1’, ‘param2=value2’].

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

  • mode (str, optional) – The mode to use for the job (e.g. “clone”, “resume”).

Returns:

The CloudOS job ID of the cloned/resumed job.

Return type:

str
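The `parameters` argument uses plain `'name=value'` strings. A minimal stdlib sketch of how such overrides can be split into a mapping; `parse_overrides` is an illustrative helper, not part of cloudos-cli:

```python
def parse_overrides(parameters):
    """Split ['param1=value1', 'param2=value2'] items into a dict.

    Raises ValueError on items lacking an '=' separator. Only the first
    '=' splits the item, so values may themselves contain '='.
    """
    overrides = {}
    for item in parameters:
        if "=" not in item:
            raise ValueError(f"Malformed parameter override: {item!r}")
        name, _, value = item.partition("=")
        overrides[name] = value
    return overrides
```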

cloudos_url: str
convert_nextflow_to_json(job_config, parameter, array_parameter, array_file_header, is_module, example_parameters, git_commit, git_tag, git_branch, project_id, workflow_id, job_name, resumable, save_logs, batch, job_queue_id, nextflow_profile, nextflow_version, instance_type, instance_disk, storage_mode, lustre_size, execution_platform, hpc_id, workflow_type, cromwell_id, azure_worker_instance_type, azure_worker_instance_disk, azure_worker_instance_spot, cost_limit, use_mountpoints, docker_login, command, cpus, memory)[source]

Converts a nextflow.config file into a json formatted dict.

Parameters:
  • job_config (string) – Path to a nextflow.config file with parameters scope.

  • parameter (tuple) – Tuple of strings indicating the parameters to pass to the pipeline call. They are in the following form: (‘param1=param1val’, ‘param2=param2val’, …)

  • array_parameter (tuple) – Tuple of strings indicating the parameters to pass to the pipeline call for array jobs. They are in the following form: (‘param1=param1val’, ‘param2=param2val’, …)

  • array_file_header (string) – The header of the file containing the array parameters. It is used to add the necessary column index for array file columns.

  • is_module (bool) – Whether the job is a module or not. If True, the job will be submitted as a module.

  • example_parameters (list) – A list of dicts, with the parameters required for the API request in JSON format. It is typically used to run curated pipelines using the already available example parameters.

  • git_commit (string) – The git commit hash of the pipeline to use. Equivalent to -r option in Nextflow. If not specified, the last commit of the default branch will be used.

  • git_tag (string) – The tag of the pipeline to use. If not specified, the last commit of the default branch will be used.

  • git_branch (string) – The branch of the pipeline to use. If not specified, the last commit of the default branch will be used.

  • project_id (string) – The CloudOS project id for a given project name.

  • workflow_id (string) – The CloudOS workflow id for a given workflow_name.

  • job_name (string) – The name to assign to the job.

  • resumable (bool) – Whether to create a resumable job or not.

  • save_logs (bool) – Whether to save job logs or not.

  • batch (bool) – Whether to create an AWS batch job or not.

  • job_queue_id (string) – Job queue Id to use in the batch job.

  • nextflow_profile (string) – A comma separated string with the profiles to be used.

  • nextflow_version (string) – Nextflow version to use when executing the workflow in CloudOS.

  • instance_type (string) – Name of the instance type to be used for the job master node, for example for AWS EC2 c5.xlarge

  • instance_disk (int) – The disk space of the master node instance, in GB.

  • storage_mode (string) – Either ‘lustre’ or ‘regular’. Indicates if the user wants to select regular or lustre storage.

  • lustre_size (int) – The lustre storage to be used when --storage-mode=lustre, in GB. It should be 1200 or a multiple of it.

  • execution_platform (string ['aws'|'azure'|'hpc']) – The execution platform implemented in your CloudOS.

  • hpc_id (string) – The ID of your HPC in CloudOS.

  • workflow_type (str) – The type of workflow to run. It could be ‘nextflow’, ‘wdl’ or ‘docker’.

  • cromwell_id (str) – Cromwell server ID.

  • azure_worker_instance_type (str) – The worker node instance type to be used in azure.

  • azure_worker_instance_disk (int) – The disk size in GB for the worker node to be used in azure.

  • azure_worker_instance_spot (bool) – Whether the azure worker nodes have to be spot instances or not.

  • cost_limit (float) – Job cost limit. -1 means no cost limit.

  • use_mountpoints (bool) – Whether to use or not AWS S3 mountpoint for quicker file staging.

  • docker_login (bool) – Whether to use private docker images, provided the users have linked their docker.io accounts.

  • command (string) – The command to run in bash jobs.

  • cpus (int) – The number of CPUs to use for the bash jobs task’s master node.

  • memory (int) – The amount of memory, in GB, to use for the bash job task’s master node.

Returns:

params – A JSON formatted dict.

Return type:

dict
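The first thing this conversion needs is the `params` scope of the nextflow.config file. An illustrative stdlib sketch of that extraction step, assuming a flat `params { key = value }` layout (the real method builds the full JSON request body and handles far more cases):

```python
def parse_params_scope(config_text):
    """Collect key = value pairs inside a flat `params { ... }` block."""
    params = {}
    in_params = False
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("params") and line.endswith("{"):
            in_params = True           # entering the params scope
        elif in_params and line == "}":
            in_params = False          # leaving the params scope
        elif in_params and "=" in line:
            key, _, value = line.partition("=")
            params[key.strip()] = value.strip().strip("'\"")
    return params
```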

create_project(workspace_id, project_name, verify=True)

Create a new project in CloudOS.

Parameters:
  • workspace_id (str) – The CloudOS workspace ID where the project will be created.

  • project_name (str) – The name for the new project.

  • verify ([bool | str], optional) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file. Default is True.

Returns:

The ID of the newly created project.

Return type:

str

Raises:

BadRequestException – If the request to create the project fails with a status code indicating an error.

cromwell_switch(workspace_id, action, verify=True)

Restart or stop the Cromwell server.

Parameters:
  • workspace_id (string) – The CloudOS workspace id in which to restart or stop the Cromwell server.

  • action (string [restart|stop]) – The action to perform.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

Returns:

r – The server response

Return type:

requests.models.Response

cromwell_token: str
detect_workflow(workflow_name, workspace_id, verify=True, last=False)

Detects workflow type: nextflow or wdl.

Parameters:
  • workflow_name (string) – Name of the workflow.

  • workspace_id (string) – The CloudOS workspace id from which to collect the workflows.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

  • last (bool) – When workflows are duplicated, use the latest imported workflow (by date).

Returns:

wt – The workflow type detected.

Return type:

string [‘nextflow’|’wdl’]

docker_workflow_param_processing(param, project_name)[source]

Processes a Docker workflow parameter and determines its type and associated metadata.

Parameters:
  • param (str) – The parameter string in the format ‘--param_name=value’. It can represent a file path, a glob pattern, or a simple text value.

  • project_name (str) – The name of the current project to use if no specific project is extracted from the parameter.

Returns:

dict – A dictionary containing the processed parameter details. The structure of the dictionary depends on the type of the parameter:
  • For glob patterns:

    {
        "name": str,           # Parameter name without leading dashes.
        "prefix": str,         # Prefix ('--' or '-') based on the parameter format.
        "globPattern": str,    # The glob pattern extracted from the parameter.
        "parameterKind": str,  # Always "globPattern".
        "folder": str          # Folder ID associated with the glob pattern.
    }

  • For file paths:

    {
        "name": str,           # Parameter name without leading dashes.
        "prefix": str,         # Prefix ('--' or '-') based on the parameter format.
        "parameterKind": str,  # Always "dataItem".
        "dataItem": {
            "kind": str,       # Always "File".
            "item": str        # File ID associated with the file path.
        }
    }

  • For text values:

    {
        "name": str,           # Parameter name without leading dashes.
        "prefix": str,         # Prefix ('--' or '-') based on the parameter format.
        "parameterKind": str,  # Always "textValue".
        "textValue": str       # The text value extracted from the parameter.
    }

Notes

  • The function uses helper methods extract_project, classify_pattern, and get_file_or_folder_id to process the parameter.

  • If the parameter represents a file path or glob pattern, the function retrieves the corresponding file or folder ID from the cloud workspace.

  • If the parameter does not match any specific pattern or file extension, it is treated as a simple text value.

fetch_cloudos_id(apikey, cloudos_url, resource, workspace_id, name, mainfile=None, importsfile=None, repository_platform='github', verify=True)[source]

Fetch the cloudos id for a given name.

Parameters:
  • apikey (string) – Your CloudOS API key

  • cloudos_url (string) – The CloudOS service url.

  • resource (string) – The resource you want to fetch from. E.g.: projects.

  • workspace_id (string) – The specific Cloudos workspace id.

  • name (string) – The name of a CloudOS resource element.

  • mainfile (string) – The name of the mainFile used by the workflow. Only used when resource == ‘workflows’. Required for WDL pipelines as different mainFiles could be loaded for a single pipeline.

  • importsfile (string) – The name of the importsFile used by the workflow. Optional and only used for WDL pipelines as different importsFiles could be loaded for a single pipeline.

  • repository_platform (string) – The name of the repository platform where the workflow resides.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

Returns:

cloudos_id – The CloudOS id for the given resource name.

Return type:

string

get_cromwell_status(workspace_id, verify=True)

Get Cromwell server status from CloudOS.

Parameters:
  • workspace_id (string) – The CloudOS workspace id from which to check the Cromwell status.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

Returns:

r – The server response

Return type:

requests.models.Response

get_field_from_jobs_endpoint(job_id, field=None, verify=True)[source]

Get the value of a given field from the jobs endpoint for a job.

Parameters:
  • job_id (str) – The CloudOS job ID to query.

  • field (str, optional) – The name of the field to retrieve from the jobs endpoint response.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

Returns:

The value of the requested field, e.g. the resume work directory id.

Return type:

str

get_job_list(workspace_id, last_n_jobs=30, page=1, archived=False, verify=True, filter_status=None, filter_job_name=None, filter_project=None, filter_workflow=None, filter_job_id=None, filter_only_mine=False, filter_owner=None, filter_queue=None, last=False)

Get jobs from a CloudOS workspace with optional filtering.

Fetches jobs page by page, applies all filters after fetching. Stops when enough jobs are collected or no more jobs are available.

Parameters:
  • workspace_id (string) – The CloudOS workspace id from which to collect the jobs.

  • last_n_jobs ([int | 'all']) – How many of the user's most recent jobs to retrieve. You can specify a very large int or 'all' to get all of the user's jobs.

  • page (int) – Response page to get (ignored when using filters - starts from page 1).

  • archived (bool) – When True, only the archived jobs are retrieved.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

  • filter_status (string, optional) – Filter jobs by status (e.g., ‘completed’, ‘running’, ‘failed’).

  • filter_job_name (string, optional) – Filter jobs by name.

  • filter_project (string, optional) – Filter jobs by project name (will be resolved to project ID).

  • filter_workflow (string, optional) – Filter jobs by workflow name (will be resolved to workflow ID).

  • filter_job_id (string, optional) – Filter jobs by specific job ID.

  • filter_only_mine (bool, optional) – Filter to show only jobs belonging to the current user.

  • filter_owner (string, optional) – Filter jobs by owner username (will be resolved to user ID).

  • filter_queue (string, optional) – Filter jobs by queue name (will be resolved to queue ID). Only applies to jobs running in batch environment. Non-batch jobs are preserved in results as they don’t use queues.

  • last (bool, optional) – When workflows are duplicated, use the latest imported workflow (by date).

Returns:

r – A list of dicts, each corresponding to a job from the user and the workspace.

Return type:

list
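As noted above, the filters are applied client-side after each page is fetched. A small stdlib sketch of that post-fetch filtering over job dicts; the `status` and `name` keys are illustrative, not the exact API field names:

```python
def filter_jobs(jobs, status=None, job_name=None):
    """Keep jobs matching every provided filter; None means 'no filter'."""
    kept = []
    for job in jobs:
        if status is not None and job.get("status") != status:
            continue
        if job_name is not None and job.get("name") != job_name:
            continue
        kept.append(job)
    return kept
```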

get_job_logs(j_id, workspace_id, verify=True)

Get the location of the logs for the specified job.

get_job_request_payload(job_id, verify=True)[source]

Get the original request payload for a job.

Parameters:
  • job_id (str) – The CloudOS job ID to get the payload for.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

Returns:

The original job request payload.

Return type:

dict

get_job_results(j_id, workspace_id, verify=True)

Get the location of the results for the specified job.

get_job_status(j_id, verify=True)

Get job status from CloudOS.

Parameters:
  • j_id (string) – The CloudOS job id of the job just launched.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

Returns:

r – The server response

Return type:

requests.models.Response

get_job_workdir(j_id, workspace_id, verify=True)

Get the working directory for the specified job.

get_project_id_from_name(workspace_id, project_name, verify=True)

Retrieve the project ID from its name.

Parameters:
  • workspace_id (str) – The CloudOS workspace ID to search for the project.

  • project_name (str) – The name of the project to search for.

  • verify ([bool | str], optional) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file. Default is True.

Returns:

The server response containing project details.

Return type:

dict

Raises:

BadRequestException – If the request to retrieve the project fails with a status code indicating an error.

get_project_list(workspace_id, verify=True, get_all=True, page=1, page_size=10, max_page_size=100)

Get all the projects from a CloudOS workspace.

Parameters:
  • workspace_id (string) – The CloudOS workspace id from which to collect the projects.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

  • get_all (bool) – Whether to get all available projects or just the indicated page.

  • page (int) – The page number to retrieve, from the paginated response.

  • page_size (int) – The number of projects per page. From 1 to 1000.

  • max_page_size (int) – Max page size defined by the API server. It is currently 1000.

Returns:

r – The server response

Return type:

requests.models.Response

get_storage_contents(cloud_name, cloud_meta, container, path, workspace_id, verify)

Retrieves the contents of a storage container from the specified cloud service.

This method fetches the contents of a specified path within a storage container on a cloud service (e.g., AWS S3 or Azure Blob). The request is authenticated using an API key and requires valid parameters such as the workspace ID and path.

Parameters:
  • cloud_name (str) – The name of the cloud service (e.g., ‘aws’ or ‘azure’).

  • cloud_meta (dict) – Additional cloud-specific metadata needed for the request.

  • container (str) – The name of the storage container or bucket.

  • path (str) – The file path or directory within the storage container.

  • workspace_id (str) – The identifier of the workspace or team.

  • verify (bool) – Whether to verify SSL certificates for the request.

Returns:

A list of contents retrieved from the specified cloud storage.

Return type:

list

Raises:
BadRequestException – If the request to retrieve the contents fails with a status code indicating an error.

get_user_info(verify=True)

Gets user information from the users/me endpoint.

Parameters:

verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

Returns:

r – The server response content

Return type:

requests.models.Response.content

get_workflow_content(workspace_id, workflow_name, verify=True, last=False, max_page_size=100)

Retrieve the workflow content from API.

Parameters:
  • workspace_id (str) – The CloudOS workspace ID to search for the workflow.

  • workflow_name (str) – The name of the workflow to search for.

  • verify ([bool | str], optional) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file. Default is True.

  • last (bool, optional) – When workflows are duplicated, use the latest imported workflow (by date). Default is False.

  • max_page_size (int, optional) – Max page size defined by the API server.

Returns:

The server response containing workflow details.

Return type:

dict

Raises:

BadRequestException – If the request to retrieve the workflow fails with a status code indicating an error.

get_workflow_list(workspace_id, verify=True, get_all=True, page=1, page_size=10, max_page_size=100, archived_status=False)

Get all the workflows from a CloudOS workspace.

Parameters:
  • workspace_id (string) – The CloudOS workspace id from which to collect the workflows.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

  • get_all (bool) – Whether to get all available curated workflows or just the indicated page.

  • page (int) – The page number to retrieve, from the paginated response.

  • page_size (int) – The number of workflows per page. From 1 to 1000.

  • max_page_size (int) – Max page size defined by the API server. It is currently 1000.

  • archived_status (bool) – Whether to retrieve archived workflows or not.

Returns:

r – A list of dicts, each corresponding to a workflow.

Return type:

list

get_workflow_max_pagination(workspace_id, workflow_name, verify=True)

Retrieve the maximum number of workflow pages from the API.

Parameters:
  • workspace_id (str) – The CloudOS workspace ID to search for the workflow.

  • workflow_name (str) – The name of the workflow to search for.

  • verify ([bool | str], optional) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file. Default is True.

Returns:

The server response with max pagination for workflows.

Return type:

int

Raises:

BadRequestException – If the request to retrieve the workflow pagination fails with a status code indicating an error.

importsfile: str = None
is_module(workflow_name, workspace_id, verify=True, last=False)

Detects whether the workflow is a system module or not.

System modules use fixed queues, so this check is important to properly manage queue selection.

Parameters:
  • workflow_name (string) – Name of the workflow.

  • workspace_id (string) – The CloudOS workspace id from which to collect the workflows.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

  • last (bool) – When workflows are duplicated, use the latest imported workflow (by date).

Returns:

True if the workflow is a system module, False otherwise.

Return type:

bool

last: bool = False
mainfile: str = None
static process_job_list(r, all_fields=False)

Process a job list from a self.get_job_list call.

Parameters:
  • r (list) – A list of dicts, each corresponding to a job from the user and the workspace.

  • all_fields (bool. Default=False) – Whether to return a reduced version of the DataFrame containing only the selected columns or the full DataFrame.

Returns:

df – A DataFrame with the requested columns from the jobs.

Return type:

pandas.DataFrame
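Unless `all_fields=True`, the raw job dicts are reduced to a set of selected columns. A stdlib-only sketch of that column selection; the column names are illustrative, and the real method returns a pandas DataFrame rather than a list of dicts:

```python
def select_columns(jobs, columns):
    """Project each job dict onto the requested keys (missing -> None)."""
    return [{col: job.get(col) for col in columns} for job in jobs]
```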

static process_project_list(r, all_fields=False)

Process a server response from a self.get_project_list call.

Parameters:
  • r (list) – A list of dicts, each corresponding to a project.

  • all_fields (bool. Default=False) – Whether to return a reduced version of the DataFrame containing only the selected columns or the full DataFrame.

Returns:

df – A DataFrame with the requested columns from the projects.

Return type:

pandas.DataFrame

static process_workflow_list(r, all_fields=False)

Process a server response from a self.get_workflow_list call.

Parameters:
  • r (list) – A list of dicts, each corresponding to a workflow.

  • all_fields (bool. Default=False) – Whether to return a reduced version of the DataFrame containing only the selected columns or the full DataFrame.

Returns:

df – A DataFrame with the requested columns from the workflows.

Return type:

pandas.DataFrame

property project_id: str
project_name: str
reorder_job_list(my_jobs_df, filename='my_jobs.csv')

Save a job list DataFrame to a CSV file with renamed and ordered columns.

Parameters:
  • my_jobs_df (pandas.DataFrame) – A DataFrame containing job information from process_job_list.

  • filename (str) – The name of the file to save the DataFrame to. Default is ‘my_jobs.csv’.

Returns:

None. The DataFrame is saved to a CSV file with renamed and ordered columns.

Return type:

None

repository_platform: str = 'github'
resolve_user_id(filter_owner, workspace_id, verify=True)

Resolve a username or display name to a user ID.

Parameters:
  • filter_owner (str) – The username or display name to search for.

  • workspace_id (str) – The CloudOS workspace ID.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

Returns:

The user ID corresponding to the filter_owner.

Return type:

str

Raises:

ValueError – If the user cannot be found or if there’s an error during the search.

retrieve_cols_from_array_file(array_file, ds, separator, verify_ssl)[source]

Retrieve metadata for columns from an array file stored in a directory.

This method fetches the metadata of an array file by interacting with a directory service and making an API call to retrieve the file’s metadata.

Parameters:
  • array_file (str) – The path to the array file whose metadata is to be retrieved.

  • ds (object) – The directory service object used to list folder content.

  • separator (str) – The separator used in the array file.

  • verify_ssl (bool) – Whether to verify SSL certificates during the API request.

Raises:
  • ValueError – If the specified file is not found in the directory.

  • BadRequestException – If the API request to retrieve metadata fails with a status code >= 400.

Returns:

The HTTP response object containing the metadata of the array file.

Return type:

Response

save_job_list_to_csv(my_jobs_df, filename='my_jobs.csv')
send_job(job_config=None, parameter=(), array_parameter=(), array_file_header=None, is_module=False, example_parameters=[], git_commit=None, git_tag=None, git_branch=None, job_name='new_job', resumable=False, save_logs=True, batch=True, job_queue_id=None, nextflow_profile=None, nextflow_version='22.10.8', instance_type='c5.xlarge', instance_disk=500, storage_mode='regular', lustre_size=1200, execution_platform='aws', hpc_id=None, workflow_type='nextflow', cromwell_id=None, azure_worker_instance_type='Standard_D4as_v4', azure_worker_instance_disk=100, azure_worker_instance_spot=False, cost_limit=30.0, use_mountpoints=False, docker_login=False, verify=True, command=None, cpus=1, memory=4)[source]

Send a job to CloudOS.

Parameters:
  • job_config (string) – Path to a nextflow.config file with parameters scope.

  • parameter (tuple) – Tuple of strings indicating the parameters to pass to the pipeline call. They are in the following form: (‘param1=param1val’, ‘param2=param2val’, …)

  • array_parameter (tuple) – Tuple of strings indicating the parameters to pass to the pipeline call for array jobs. They are in the following form: (‘param1=param1val’, ‘param2=param2val’, …)

  • array_file_header (string) – The header of the file containing the array parameters. It is used to add the necessary column index for array file columns.

  • example_parameters (list) – A list of dicts, with the parameters required for the API request in JSON format. It is typically used to run curated pipelines using the already available example parameters.

  • git_commit (string) – The git commit hash of the pipeline to use. Equivalent to -r option in Nextflow. If not specified, the last commit of the default branch will be used.

  • git_tag (string) – The tag of the pipeline to use. If not specified, the last commit of the default branch will be used.

  • git_branch (string) – The branch of the pipeline to use. If not specified, the last commit of the default branch will be used.

  • job_name (string) – The name to assign to the job.

  • resumable (bool) – Whether to create a resumable job or not.

  • save_logs (bool) – Whether to save job logs or not.

  • batch (bool) – Whether to create an AWS batch job or not.

  • job_queue_id (string) – Job queue Id to use in the batch job.

  • nextflow_profile (string) – A comma separated string with the profiles to be used.

  • nextflow_version (string) – Nextflow version to use when executing the workflow in CloudOS.

  • instance_type (string) – Name of the instance type to be used for the job master node, for example for AWS EC2 c5.xlarge

  • instance_disk (int) – The disk space of the master node instance, in GB.

  • storage_mode (string) – Either ‘lustre’ or ‘regular’. Indicates if the user wants to select regular or lustre storage.

  • lustre_size (int) – The lustre storage to be used when --storage-mode=lustre, in GB. It should be 1200 or a multiple of it.

  • execution_platform (string ['aws'|'azure'|'hpc']) – The execution platform implemented in your CloudOS.

  • hpc_id (string) – The ID of your HPC in CloudOS.

  • workflow_type (str) – The type of workflow to run. It could be ‘nextflow’, ‘wdl’ or ‘docker’.

  • cromwell_id (str) – Cromwell server ID.

  • azure_worker_instance_type (str) – The worker node instance type to be used in azure.

  • azure_worker_instance_disk (int) – The disk size in GB for the worker node to be used in azure.

  • azure_worker_instance_spot (bool) – Whether the azure worker nodes have to be spot instances or not.

  • cost_limit (float) – Job cost limit. -1 means no cost limit.

  • use_mountpoints (bool) – Whether to use or not AWS S3 mountpoint for quicker file staging.

  • docker_login (bool) – Whether to use private docker images, provided the users have linked their docker.io accounts.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

  • command (string) – The command to run in bash jobs.

  • cpus (int) – The number of CPUs to use for the bash jobs task’s master node.

  • memory (int) – The amount of memory, in GB, to use for the bash job task’s master node.

Returns:

j_id – The CloudOS job id of the job just launched.

Return type:

string

setup_params_array_file(custom_script_path, ds_custom, command, separator)[source]

Sets up a dictionary representing command parameters, including support for custom scripts and array files, to be used in job execution.

Parameters:
  • custom_script_path (str) – Path to the custom script file. If None, the command is treated as text.

  • ds_custom (object) – An object providing access to folder content listing functionality.

  • command (str) – The command to be executed, either as text or the name of a custom script.

  • separator (str) – The separator to be used for the array file.

Returns:

A dictionary containing the command parameters, including:
  • ”command”: The command name or text.

  • ”customScriptFile” (optional): Details of the custom script file if provided.

  • ”arrayFile”: Details of the array file and its separator.

Return type:

dict

static split_array_file_params(array_parameter, workflow_type, array_file_header)[source]

Splits and processes array parameters for a given workflow type and array file header.

Parameters:
  • array_parameter (list) – A list of strings representing array parameters in the format “key=value”.

  • workflow_type (str) – The type of workflow, e.g., ‘docker’.

  • array_file_header (list) – A list of dictionaries representing the header of the array file. Each dictionary should contain “name” and “index” keys.

Returns:

A dictionary containing processed parameter details, including:
  • prefix (str): The prefix for the parameter (e.g., “--” or “-”).

  • name (str): The name of the parameter with leading dashes stripped.

  • parameterKind (str): The kind of parameter, set to “arrayFileColumn”.

  • columnName (str): The name of the column derived from the parameter value.

  • columnIndex (int): The index of the column in the array file header.

Return type:

dict

Raises:

ValueError – If an array parameter does not contain a ‘=’ character or is improperly formatted.
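A sketch of the mapping this method documents: a ‘key=value’ item plus the array-file header become an “arrayFileColumn” parameter dict. Reconstructed from the description above for illustration, not copied from the library source:

```python
def split_array_param(item, array_file_header):
    """Build the documented arrayFileColumn dict for one array parameter.

    `array_file_header` is a list of {"name", "index"} dicts, as described
    in the docs; raises ValueError on items without an '=' separator.
    """
    if "=" not in item:
        raise ValueError(f"Improperly formatted array parameter: {item!r}")
    key, _, value = item.partition("=")
    prefix = "--" if key.startswith("--") else "-"
    name = key.lstrip("-")
    # Look up the column index of the referenced header column.
    index = next(h["index"] for h in array_file_header if h["name"] == value)
    return {"prefix": prefix, "name": name,
            "parameterKind": "arrayFileColumn",
            "columnName": value, "columnIndex": index}
```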

update_parameter_value(parameters, param_name, new_value)[source]

Update a parameter value in the parameters list.

Parameters:
  • parameters (list) – List of parameter dictionaries.

  • param_name (str) – Name of the parameter to update.

  • new_value (str) – New value for the parameter.

Returns:

True if parameter was found and updated, False otherwise.

Return type:

bool
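A minimal sketch of the behavior described above: scan the parameter dicts for a matching name and overwrite its value. The "name"/"textValue" keys follow the parameter dicts shown elsewhere on this page; the real method may handle other parameter kinds:

```python
def update_text_parameter(parameters, param_name, new_value):
    """Return True if a parameter named `param_name` was found and updated."""
    for param in parameters:
        if param.get("name") == param_name:
            param["textValue"] = new_value  # update in place
            return True
    return False
```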

verify: bool | str = True
wait_job_completion(job_id, wait_time=3600, request_interval=30, verbose=False, verify=True)

Checks job status from CloudOS and waits for its completion.

Parameters:
  • job_id (string) – The CloudOS job id of the job just launched.

  • wait_time (int) – Max time to wait (in seconds) for job completion.

  • request_interval (int) – Time interval (in seconds) to request job status.

  • verbose (bool) – Whether to output status on every request or not.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

Returns:

A dict with three elements collected from the job status: ‘name’, ‘id’, ‘status’.

Return type:

dict
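The polling behavior described above can be sketched as follows. The status checker is injected so the loop can be shown without a live CloudOS connection; the terminal status names are assumptions for illustration:

```python
import time

def wait_for_completion(get_status, wait_time=3600, request_interval=30):
    """Poll `get_status()` until a terminal state or the time limit."""
    deadline = time.monotonic() + wait_time
    while True:
        status = get_status()
        if status in ("completed", "failed", "aborted"):
            return status  # terminal state reached
        if time.monotonic() >= deadline:
            return status  # still non-terminal: caller treats as a timeout
        time.sleep(request_interval)
```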

workflow_content_query(workspace_id, workflow_name, verify=True, query='workflowType', last=False)
property workflow_id: str
workflow_import(workspace_id, workflow_url, workflow_name, repository_project_id, workflow_docs_link='', repository_id=None, verify=True)

Imports workflows to CloudOS.

Parameters:
  • workspace_id (string) – The CloudOS workspace id to import the workflow into.

  • workflow_url (string) – The URL of the workflow. Only Github or Bitbucket are allowed.

  • workflow_name (string) – A name for the imported pipeline in CloudOS.

  • repository_project_id (int) – The repository project ID.

  • workflow_docs_link (string) – Link to the documentation URL.

  • repository_id (int) – The repository ID. Only required for GitHub repositories.

  • verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.

Returns:

workflow_id – The newly imported workflow ID.

Return type:

string

workflow_name: str
workspace_id: str