cloudos_cli.jobs.job
This is the main class to create jobs.
Classes
Job | Class to store and operate jobs.
- class cloudos_cli.jobs.job.Job(cloudos_url, apikey, cromwell_token, workspace_id, project_name, workflow_name, last=False, verify=True, mainfile=None, importsfile=None, repository_platform='github', project_id=<property object>, workflow_id=<property object>)[source]¶
Bases:
Cloudos
Class to store and operate jobs.
- Parameters:
cloudos_url (string) – The CloudOS service url.
apikey (string) – Your CloudOS API key.
cromwell_token (string) – Cromwell server token.
workspace_id (string) – The specific Cloudos workspace id.
project_name (string) – The name of a CloudOS project.
workflow_name (string) – The name of a CloudOS workflow or pipeline.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
mainfile (string) – The name of the mainFile used by the workflow. Required for WDL pipelines as different mainFiles could be loaded for a single pipeline.
importsfile (string) – The name of the importsFile used by the workflow. Optional and only used for WDL pipelines as different importsFiles could be loaded for a single pipeline.
repository_platform (string) – The name of the repository platform of the workflow.
project_id (string) – The CloudOS project id for a given project name.
workflow_id (string) – The CloudOS workflow id for a given workflow_name.
last (bool) – When workflows are duplicated, use the latest imported workflow (by date).
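The constructor arguments above can be gathered as in the following minimal sketch. The helpers `build_job_kwargs` and `make_job` are hypothetical names introduced here, the URL and credentials are placeholders, and passing `cromwell_token=None` for non-WDL use is an assumption. Creating the `Job` may also contact the CloudOS server to resolve `project_id` and `workflow_id`, so the import and instantiation are kept in a separate function that is not called here.

```python
def build_job_kwargs(cloudos_url, apikey, workspace_id, project_name,
                     workflow_name, cromwell_token=None, verify=True):
    """Collect the documented constructor arguments into one dict (hypothetical helper)."""
    return {
        "cloudos_url": cloudos_url,
        "apikey": apikey,
        "cromwell_token": cromwell_token,  # assumption: None when no Cromwell server is used
        "workspace_id": workspace_id,
        "project_name": project_name,
        "workflow_name": workflow_name,
        "verify": verify,  # True, False, or a path to an SSL certificate file
    }


def make_job(**kwargs):
    """Instantiate Job; requires the cloudos_cli package and a reachable server."""
    # Imported lazily so build_job_kwargs works without the package installed.
    from cloudos_cli.jobs.job import Job
    return Job(**kwargs)


kwargs = build_job_kwargs(
    cloudos_url="https://cloudos.example.org",  # placeholder URL
    apikey="MY_API_KEY",                        # placeholder value
    workspace_id="WORKSPACE_ID",                # placeholder value
    project_name="my-project",
    workflow_name="my-workflow",
)
```

With real values in place, `make_job(**kwargs)` would return the job object whose methods are described below.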
- abort_job(job, workspace_id, verify=True)¶
Abort a job.
- Parameters:
job (string) – The CloudOS job id of the job to abort.
workspace_id (string) – The CloudOS workspace id.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
- Returns:
r – The server response
- Return type:
requests.models.Response
- apikey: str¶
- clone_or_resume_job(source_job_id, queue_name=None, cost_limit=None, master_instance=None, job_name=None, nextflow_version=None, branch=None, profile=None, do_not_save_logs=None, use_fusion=None, resumable=None, project_name=None, parameters=None, verify=True, mode=None)[source]¶
Clone or resume an existing job with optional parameter overrides.
- Parameters:
source_job_id (str) – The CloudOS job ID to clone/resume from.
queue_name (str, optional) – Name of the job queue to use.
cost_limit (float, optional) – Job cost limit override.
master_instance (str, optional) – Master instance type override.
job_name (str, optional) – New job name.
nextflow_version (str, optional) – Nextflow version override.
branch (str, optional) – Git branch override.
profile (str, optional) – Nextflow profile override.
do_not_save_logs (bool, optional) – Override for whether to skip saving the job logs.
use_fusion (bool, optional) – Whether to use fusion filesystem override.
resumable (bool, optional) – Whether to make the job resumable or not.
project_name (str, optional) – Project name override (will look up new project ID).
parameters (list, optional) – List of parameter overrides in format [‘param1=value1’, ‘param2=value2’].
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
mode (str, optional) – The mode to use for the job (e.g. “clone”, “resume”).
- Returns:
The CloudOS job ID of the cloned/resumed job.
- Return type:
str
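The `parameters` argument takes strings of the form 'param1=value1'. As a rough illustration of that format only, the sketch below splits such strings into a dict. `parse_overrides` is a hypothetical helper and not part of cloudos_cli, and the library's own parsing may differ.

```python
def parse_overrides(parameters):
    """Turn ['param1=value1', ...] into {'param1': 'value1', ...} (hypothetical helper)."""
    overrides = {}
    for item in parameters or []:
        # Split on the first '=' only, so values may themselves contain '='.
        name, sep, value = item.partition("=")
        if not sep or not name:
            raise ValueError(f"Expected 'name=value', got {item!r}")
        overrides[name] = value
    return overrides


overrides = parse_overrides(["reads=s3://bucket/a=b.fq", "threads=4"])
```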
- cloudos_url: str¶
- convert_nextflow_to_json(job_config, parameter, array_parameter, array_file_header, is_module, example_parameters, git_commit, git_tag, git_branch, project_id, workflow_id, job_name, resumable, save_logs, batch, job_queue_id, nextflow_profile, nextflow_version, instance_type, instance_disk, storage_mode, lustre_size, execution_platform, hpc_id, workflow_type, cromwell_id, azure_worker_instance_type, azure_worker_instance_disk, azure_worker_instance_spot, cost_limit, use_mountpoints, docker_login, command, cpus, memory)[source]¶
Converts a nextflow.config file into a json formatted dict.
- Parameters:
job_config (string) – Path to a nextflow.config file with parameters scope.
parameter (tuple) – Tuple of strings indicating the parameters to pass to the pipeline call. They are in the following form: (‘param1=param1val’, ‘param2=param2val’, …)
array_parameter (tuple) – Tuple of strings indicating the parameters to pass to the pipeline call for array jobs. They are in the following form: (‘param1=param1val’, ‘param2=param2val’, …)
array_file_header (string) – The header of the file containing the array parameters. It is used to add the necessary column index for array file columns.
is_module (bool) – Whether the job is a module or not. If True, the job will be submitted as a module.
example_parameters (list) – A list of dicts, with the parameters required for the API request in JSON format. It is typically used to run curated pipelines using the already available example parameters.
git_commit (string) – The git commit hash of the pipeline to use. Equivalent to -r option in Nextflow. If not specified, the last commit of the default branch will be used.
git_tag (string) – The tag of the pipeline to use. If not specified, the last commit of the default branch will be used.
git_branch (string) – The branch of the pipeline to use. If not specified, the last commit of the default branch will be used.
project_id (string) – The CloudOS project id for a given project name.
workflow_id (string) – The CloudOS workflow id for a given workflow_name.
job_name (string) – The name to assign to the job.
resumable (bool) – Whether to create a resumable job or not.
save_logs (bool) – Whether to save job logs or not.
batch (bool) – Whether to create an AWS batch job or not.
job_queue_id (string) – Job queue Id to use in the batch job.
nextflow_profile (string) – A comma separated string with the profiles to be used.
nextflow_version (string) – Nextflow version to use when executing the workflow in CloudOS.
instance_type (string) – Name of the instance type to be used for the job master node, for example, c5.xlarge on AWS EC2.
instance_disk (int) – The disk space of the master node instance, in GB.
storage_mode (string) – Either ‘lustre’ or ‘regular’. Indicates if the user wants to select regular or lustre storage.
lustre_size (int) – The lustre storage to be used when --storage-mode=lustre, in GB. It should be 1200 or a multiple of it.
execution_platform (string ['aws'|'azure'|'hpc']) – The execution platform implemented in your CloudOS.
hpc_id (string) – The ID of your HPC in CloudOS.
workflow_type (str) – The type of workflow to run. It could be ‘nextflow’, ‘wdl’ or ‘docker’.
cromwell_id (str) – Cromwell server ID.
azure_worker_instance_type (str) – The worker node instance type to be used in azure.
azure_worker_instance_disk (int) – The disk size in GB for the worker node to be used in azure.
azure_worker_instance_spot (bool) – Whether the azure worker nodes have to be spot instances or not.
cost_limit (float) – Job cost limit. -1 means no cost limit.
use_mountpoints (bool) – Whether to use AWS S3 mountpoints for quicker file staging.
docker_login (bool) – Whether to use private docker images, provided the users have linked their docker.io accounts.
command (string) – The command to run in bash jobs.
cpus (int) – The number of CPUs to use for the bash job task’s master node.
memory (int) – The amount of memory, in GB, to use for the bash job task’s master node.
- Returns:
params – A JSON formatted dict.
- Return type:
dict
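Two of the parameters above carry constraints that are easy to check before submitting: `storage_mode` must be ‘lustre’ or ‘regular’, and `lustre_size` should be 1200 or a multiple of it. The sketch below shows one way to validate them up front. `check_storage` is a hypothetical helper, and whether the library performs the same check is not stated here.

```python
def check_storage(storage_mode, lustre_size):
    """Validate the documented storage constraints (hypothetical helper)."""
    if storage_mode not in ("regular", "lustre"):
        raise ValueError(f"storage_mode must be 'regular' or 'lustre', got {storage_mode!r}")
    # The size constraint only matters when lustre storage is selected.
    if storage_mode == "lustre" and (lustre_size <= 0 or lustre_size % 1200 != 0):
        raise ValueError(f"lustre_size must be a multiple of 1200 GB, got {lustre_size}")
    return True


ok_regular = check_storage("regular", 1200)
ok_lustre = check_storage("lustre", 2400)
```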
- create_project(workspace_id, project_name, verify=True)¶
Create a new project in CloudOS.
- Parameters:
workspace_id (str) – The CloudOS workspace ID where the project will be created.
project_name (str) – The name for the new project.
verify ([bool | str], optional) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file. Default is True.
- Returns:
The ID of the newly created project.
- Return type:
str
- Raises:
BadRequestException – If the request to create the project fails with a status code indicating an error.
- cromwell_switch(workspace_id, action, verify=True)¶
Restart or stop the Cromwell server.
- Parameters:
workspace_id (string) – The CloudOS workspace id in which to restart or stop the Cromwell server.
action (string [restart|stop]) – The action to perform.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
- Returns:
r – The server response
- Return type:
requests.models.Response
- cromwell_token: str¶
- detect_workflow(workflow_name, workspace_id, verify=True, last=False)¶
Detects workflow type: nextflow or wdl.
- Parameters:
workflow_name (string) – Name of the workflow.
workspace_id (string) – The CloudOS workspace id from which to collect the workflows.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
- Returns:
wt – The workflow type detected
- Return type:
string [‘nextflow’|’wdl’]
- docker_workflow_param_processing(param, project_name)[source]¶
Processes a Docker workflow parameter and determines its type and associated metadata.
- Parameters:
param (str) – The parameter string in the format ‘--param_name=value’. It can represent a file path, a glob pattern, or a simple text value.
project_name (str) – The name of the current project to use if no specific project is extracted from the parameter.
- Returns:
A dictionary containing the processed parameter details. The structure of the dictionary depends on the type of the parameter.
- For glob patterns:
“name” (str): Parameter name without leading dashes.
“prefix” (str): Prefix (‘--’ or ‘-’) based on the parameter format.
“globPattern” (str): The glob pattern extracted from the parameter.
“parameterKind” (str): Always “globPattern”.
“folder” (str): Folder ID associated with the glob pattern.
- For file paths:
“name” (str): Parameter name without leading dashes.
“prefix” (str): Prefix (‘--’ or ‘-’) based on the parameter format.
“parameterKind” (str): Always “dataItem”.
“dataItem” (dict): Contains “kind” (str), always “File”, and “item” (str), the file ID associated with the file path.
- For text values:
“name” (str): Parameter name without leading dashes.
“prefix” (str): Prefix (‘--’ or ‘-’) based on the parameter format.
“parameterKind” (str): Always “textValue”.
“textValue” (str): The text value extracted from the parameter.
- Return type:
dict
Notes
The function uses helper methods extract_project, classify_pattern, and get_file_or_folder_id to process the parameter.
If the parameter represents a file path or glob pattern, the function retrieves the corresponding file or folder ID from the cloud workspace.
If the parameter does not match any specific pattern or file extension, it is treated as a simple text value.
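For the simplest case, a plain text value, the structure described above can be sketched as follows. `text_value_param` is a hypothetical stand-in written for illustration. It does not look up file or folder IDs and does not call `extract_project`, `classify_pattern` or `get_file_or_folder_id`, so it covers only the “textValue” branch.

```python
def text_value_param(param):
    """Build the documented text-value dict from '--name=value' (hypothetical helper)."""
    if "=" not in param:
        raise ValueError(f"Expected '--name=value', got {param!r}")
    flag, value = param.split("=", 1)
    name = flag.lstrip("-")                  # parameter name without leading dashes
    prefix = flag[: len(flag) - len(name)]   # the dashes that were stripped: '--' or '-'
    return {
        "name": name,
        "prefix": prefix,
        "parameterKind": "textValue",
        "textValue": value,
    }


long_form = text_value_param("--threads=4")
short_form = text_value_param("-t=4")
```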
- fetch_cloudos_id(apikey, cloudos_url, resource, workspace_id, name, mainfile=None, importsfile=None, repository_platform='github', verify=True)[source]¶
Fetch the cloudos id for a given name.
- Parameters:
apikey (string) – Your CloudOS API key
cloudos_url (string) – The CloudOS service url.
resource (string) – The resource you want to fetch from. E.g.: projects.
workspace_id (string) – The specific Cloudos workspace id.
name (string) – The name of a CloudOS resource element.
mainfile (string) – The name of the mainFile used by the workflow. Only used when resource == ‘workflows’. Required for WDL pipelines as different mainFiles could be loaded for a single pipeline.
importsfile (string) – The name of the importsFile used by the workflow. Optional and only used for WDL pipelines as different importsFiles could be loaded for a single pipeline.
repository_platform (string) – The name of the repository platform where the workflow resides.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
- Returns:
project_id – The CloudOS id of the named resource, for example the project id for a given project name.
- Return type:
string
- get_cromwell_status(workspace_id, verify=True)¶
Get Cromwell server status from CloudOS.
- Parameters:
workspace_id (string) – The CloudOS workspace id from which to check the Cromwell status.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
- Returns:
r – The server response
- Return type:
requests.models.Response
- get_field_from_jobs_endpoint(job_id, field=None, verify=True)[source]¶
Get the resume work directory id for a job.
- Parameters:
job_id (str) – The CloudOS job ID to get the resume work directory for.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
- Returns:
The resume work directory id.
- Return type:
str
- get_job_list(workspace_id, last_n_jobs=30, page=1, archived=False, verify=True, filter_status=None, filter_job_name=None, filter_project=None, filter_workflow=None, filter_job_id=None, filter_only_mine=False, filter_owner=None, filter_queue=None, last=False)¶
Get jobs from a CloudOS workspace with optional filtering.
Fetches jobs page by page, applies all filters after fetching. Stops when enough jobs are collected or no more jobs are available.
- Parameters:
workspace_id (string) – The CloudOS workspace id from which to collect the jobs.
last_n_jobs ([int | 'all']) – How many of the last jobs from the user to retrieve. You can specify a very large int or ‘all’ to get all user’s jobs.
page (int) – Response page to get (ignored when using filters - starts from page 1).
archived (bool) – When True, only the archived jobs are retrieved.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
filter_status (string, optional) – Filter jobs by status (e.g., ‘completed’, ‘running’, ‘failed’).
filter_job_name (string, optional) – Filter jobs by name.
filter_project (string, optional) – Filter jobs by project name (will be resolved to project ID).
filter_workflow (string, optional) – Filter jobs by workflow name (will be resolved to workflow ID).
filter_job_id (string, optional) – Filter jobs by specific job ID.
filter_only_mine (bool, optional) – Filter to show only jobs belonging to the current user.
filter_owner (string, optional) – Filter jobs by owner username (will be resolved to user ID).
filter_queue (string, optional) – Filter jobs by queue name (will be resolved to queue ID). Only applies to jobs running in batch environment. Non-batch jobs are preserved in results as they don’t use queues.
last (bool, optional) – When workflows are duplicated, use the latest imported workflow (by date).
- Returns:
r – A list of dicts, each corresponding to a job from the user and the workspace.
- Return type:
list
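The fetch-then-filter loop described for `get_job_list` (fetch page by page, filter after fetching, stop when enough jobs are collected or no more are available) can be sketched as below. This illustrates the general pattern only: `collect_jobs`, `fetch_page` and `keep` are hypothetical names, and the in-memory `pages` dict stands in for the real API responses.

```python
def collect_jobs(fetch_page, keep, last_n_jobs=30):
    """Fetch pages until enough filtered jobs are collected (hypothetical helper)."""
    collected = []
    page = 1
    while last_n_jobs == "all" or len(collected) < last_n_jobs:
        jobs = fetch_page(page)
        if not jobs:  # no more jobs available
            break
        # Filters are applied after each page has been fetched.
        collected.extend(job for job in jobs if keep(job))
        page += 1
    if last_n_jobs == "all":
        return collected
    return collected[:last_n_jobs]


# Fake paginated data standing in for server responses.
pages = {
    1: [{"name": "a", "status": "completed"}, {"name": "b", "status": "failed"}],
    2: [{"name": "c", "status": "completed"}],
}


def fetch(page):
    return pages.get(page, [])


def is_completed(job):
    return job["status"] == "completed"


all_completed = collect_jobs(fetch, is_completed, last_n_jobs="all")
first_completed = collect_jobs(fetch, is_completed, last_n_jobs=1)
```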
- get_job_logs(j_id, workspace_id, verify=True)¶
Get the location of the logs for the specified job
- get_job_request_payload(job_id, verify=True)[source]¶
Get the original request payload for a job.
- Parameters:
job_id (str) – The CloudOS job ID to get the payload for.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
- Returns:
The original job request payload.
- Return type:
dict
- get_job_results(j_id, workspace_id, verify=True)¶
Get the location of the results for the specified job
- get_job_status(j_id, verify=True)¶
Get job status from CloudOS.
- Parameters:
j_id (string) – The CloudOS job id of the job just launched.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
- Returns:
r – The server response
- Return type:
requests.models.Response
- get_job_workdir(j_id, workspace_id, verify=True)¶
Get the working directory for the specified job
- get_project_id_from_name(workspace_id, project_name, verify=True)¶
Retrieve the project ID from its name.
- Parameters:
workspace_id (str) – The CloudOS workspace ID to search for the project.
project_name (str) – The name of the project to search for.
verify ([bool | str], optional) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file. Default is True.
- Returns:
The server response containing project details.
- Return type:
dict
- Raises:
BadRequestException – If the request to retrieve the project fails with a status code indicating an error.
- get_project_list(workspace_id, verify=True, get_all=True, page=1, page_size=10, max_page_size=100)¶
Get all the projects from a CloudOS workspace.
- Parameters:
workspace_id (string) – The CloudOS workspace id from which to collect the projects.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
get_all (bool) – Whether to get all available projects or just the indicated page.
page (int) – The page number to retrieve, from the paginated response.
page_size (int) – The number of projects per page. From 1 to 1000.
max_page_size (int) – Max page size defined by the API server. It is currently 1000.
- Returns:
r – The server response
- Return type:
requests.models.Response
- get_storage_contents(cloud_name, cloud_meta, container, path, workspace_id, verify)¶
Retrieves the contents of a storage container from the specified cloud service.
This method fetches the contents of a specified path within a storage container on a cloud service (e.g., AWS S3 or Azure Blob). The request is authenticated using an API key and requires valid parameters such as the workspace ID and path.
- Parameters:
cloud_name (str) – The name of the cloud service (e.g., ‘aws’ or ‘azure’).
container (str) – The name of the storage container or bucket.
path (str) – The file path or directory within the storage container.
workspace_id (str) – The identifier of the workspace or team.
verify (bool) – Whether to verify SSL certificates for the request.
- Returns:
A list of contents retrieved from the specified cloud storage.
- Return type:
list
- Raises:
BadRequestException – If the request to retrieve the contents fails with a status code indicating an error.
- get_user_info(verify=True)¶
Gets user information from users/me endpoint
- Parameters:
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
- Returns:
r – The server response content
- Return type:
requests.models.Response.content
- get_workflow_content(workspace_id, workflow_name, verify=True, last=False, max_page_size=100)¶
Retrieve the workflow content from API.
- Parameters:
workspace_id (str) – The CloudOS workspace ID to search for the workflow.
workflow_name (str) – The name of the workflow to search for.
verify ([bool | str], optional) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file. Default is True.
- Returns:
The server response containing workflow details.
- Return type:
dict
- Raises:
BadRequestException – If the request to retrieve the workflow fails with a status code indicating an error.
- get_workflow_list(workspace_id, verify=True, get_all=True, page=1, page_size=10, max_page_size=100, archived_status=False)¶
Get all the workflows from a CloudOS workspace.
- Parameters:
workspace_id (string) – The CloudOS workspace id from which to collect the workflows.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
get_all (bool) – Whether to get all available curated workflows or just the indicated page.
page (int) – The page number to retrieve, from the paginated response.
page_size (int) – The number of workflows per page. From 1 to 1000.
max_page_size (int) – Max page size defined by the API server. It is currently 1000.
archived_status (bool) – Whether to retrieve archived workflows or not.
- Returns:
r – A list of dicts, each corresponding to a workflow.
- Return type:
list
- get_workflow_max_pagination(workspace_id, workflow_name, verify=True)¶
Retrieve the maximum number of workflow pages from the API.
- Parameters:
workspace_id (str) – The CloudOS workspace ID to search for the workflow.
workflow_name (str) – The name of the workflow to search for.
verify ([bool | str], optional) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file. Default is True.
- Returns:
The server response with max pagination for workflows.
- Return type:
int
- Raises:
BadRequestException – If the request to retrieve the workflows fails with a status code indicating an error.
- importsfile: str = None¶
- is_module(workflow_name, workspace_id, verify=True, last=False)¶
Detects whether the workflow is a system module or not.
System modules use fixed queues, so this check is important to properly manage queue selection.
- Parameters:
workflow_name (string) – Name of the workflow.
workspace_id (string) – The CloudOS workspace id from which to collect the workflows.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
- Returns:
True if the workflow is a system module, False otherwise.
- Return type:
bool
- last: bool = False¶
- mainfile: str = None¶
- static process_job_list(r, all_fields=False)¶
Process a job list from a self.get_job_list call.
- Parameters:
r (list) – A list of dicts, each corresponding to a job from the user and the workspace.
all_fields (bool. Default=False) – Whether to return a reduced version of the DataFrame containing only the selected columns or the full DataFrame.
- Returns:
df – A DataFrame with the requested columns from the jobs.
- Return type:
pandas.DataFrame
- static process_project_list(r, all_fields=False)¶
Process a server response from a self.get_project_list call.
- Parameters:
r (requests.models.Response) – A list of dicts, each corresponding to a project.
all_fields (bool. Default=False) – Whether to return a reduced version of the DataFrame containing only the selected columns or the full DataFrame.
- Returns:
df – A DataFrame with the requested columns from the projects.
- Return type:
pandas.DataFrame
- static process_workflow_list(r, all_fields=False)¶
Process a server response from a self.get_workflow_list call.
- Parameters:
r (list) – A list of dicts, each corresponding to a workflow.
all_fields (bool. Default=False) – Whether to return a reduced version of the DataFrame containing only the selected columns or the full DataFrame.
- Returns:
df – A DataFrame with the requested columns from the workflows.
- Return type:
pandas.DataFrame
- property project_id: str¶
- project_name: str¶
- reorder_job_list(my_jobs_df, filename='my_jobs.csv')¶
Save a job list DataFrame to a CSV file with renamed and ordered columns.
- Parameters:
my_jobs_df (pandas.DataFrame) – A DataFrame containing job information from process_job_list.
filename (str) – The name of the file to save the DataFrame to. Default is ‘my_jobs.csv’.
- Returns:
Saves the DataFrame to a CSV file with renamed and ordered columns.
- Return type:
None
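The idea of saving rows with renamed and ordered columns can be illustrated with the standard library alone. The method above works on a pandas DataFrame, so the sketch below is a swapped-in `csv` version, not the library's code. `write_ordered_csv` and the column names in `column_map` are hypothetical.

```python
import csv
import io


def write_ordered_csv(rows, column_map, handle):
    """Write rows with columns renamed and ordered per column_map (hypothetical helper)."""
    # column_map maps original key -> output header; its order sets the column order.
    writer = csv.DictWriter(handle, fieldnames=list(column_map.values()),
                            lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow({new: row.get(old, "") for old, new in column_map.items()})


rows = [{"_id": "abc", "status": "completed", "name": "run1"}]
column_map = {"name": "Job name", "_id": "Job ID", "status": "Status"}  # assumed names

buffer = io.StringIO()
write_ordered_csv(rows, column_map, buffer)
csv_text = buffer.getvalue()
```

Writing to an `io.StringIO` keeps the example free of side effects. Passing an open file handle instead would produce a file such as my_jobs.csv.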
- repository_platform: str = 'github'¶
- resolve_user_id(filter_owner, workspace_id, verify=True)¶
Resolve a username or display name to a user ID.
- Parameters:
filter_owner (str) – The username or display name to search for.
workspace_id (str) – The CloudOS workspace ID.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
- Returns:
The user ID corresponding to the filter_owner.
- Return type:
str
- Raises:
ValueError – If the user cannot be found or if there’s an error during the search.
- retrieve_cols_from_array_file(array_file, ds, separator, verify_ssl)[source]¶
Retrieve metadata for columns from an array file stored in a directory.
This method fetches the metadata of an array file by interacting with a directory service and making an API call to retrieve the file’s metadata.
- Parameters:
array_file (str) – The path to the array file whose metadata is to be retrieved.
ds (object) – The directory service object used to list folder content.
separator (str) – The separator used in the array file.
verify_ssl (bool) – Whether to verify SSL certificates during the API request.
- Raises:
ValueError – If the specified file is not found in the directory.
BadRequestException – If the API request to retrieve metadata fails with a status code >= 400.
- Returns:
The HTTP response object containing the metadata of the array file.
- Return type:
Response
- save_job_list_to_csv(my_jobs_df, filename='my_jobs.csv')¶
- send_job(job_config=None, parameter=(), array_parameter=(), array_file_header=None, is_module=False, example_parameters=[], git_commit=None, git_tag=None, git_branch=None, job_name='new_job', resumable=False, save_logs=True, batch=True, job_queue_id=None, nextflow_profile=None, nextflow_version='22.10.8', instance_type='c5.xlarge', instance_disk=500, storage_mode='regular', lustre_size=1200, execution_platform='aws', hpc_id=None, workflow_type='nextflow', cromwell_id=None, azure_worker_instance_type='Standard_D4as_v4', azure_worker_instance_disk=100, azure_worker_instance_spot=False, cost_limit=30.0, use_mountpoints=False, docker_login=False, verify=True, command=None, cpus=1, memory=4)[source]¶
Send a job to CloudOS.
- Parameters:
job_config (string) – Path to a nextflow.config file with parameters scope.
parameter (tuple) – Tuple of strings indicating the parameters to pass to the pipeline call. They are in the following form: (‘param1=param1val’, ‘param2=param2val’, …)
array_parameter (tuple) – Tuple of strings indicating the parameters to pass to the pipeline call for array jobs. They are in the following form: (‘param1=param1val’, ‘param2=param2val’, …)
array_file_header (string) – The header of the file containing the array parameters. It is used to add the necessary column index for array file columns.
example_parameters (list) – A list of dicts, with the parameters required for the API request in JSON format. It is typically used to run curated pipelines using the already available example parameters.
git_commit (string) – The git commit hash of the pipeline to use. Equivalent to -r option in Nextflow. If not specified, the last commit of the default branch will be used.
git_tag (string) – The tag of the pipeline to use. If not specified, the last commit of the default branch will be used.
git_branch (string) – The branch of the pipeline to use. If not specified, the last commit of the default branch will be used.
job_name (string) – The name to assign to the job.
resumable (bool) – Whether to create a resumable job or not.
save_logs (bool) – Whether to save job logs or not.
batch (bool) – Whether to create an AWS batch job or not.
job_queue_id (string) – Job queue Id to use in the batch job.
nextflow_profile (string) – A comma separated string with the profiles to be used.
nextflow_version (string) – Nextflow version to use when executing the workflow in CloudOS.
instance_type (string) – Name of the instance type to be used for the job master node, for example, c5.xlarge on AWS EC2.
instance_disk (int) – The disk space of the master node instance, in GB.
storage_mode (string) – Either ‘lustre’ or ‘regular’. Indicates if the user wants to select regular or lustre storage.
lustre_size (int) – The lustre storage to be used when --storage-mode=lustre, in GB. It should be 1200 or a multiple of it.
execution_platform (string ['aws'|'azure'|'hpc']) – The execution platform implemented in your CloudOS.
hpc_id (string) – The ID of your HPC in CloudOS.
workflow_type (str) – The type of workflow to run. It could be ‘nextflow’, ‘wdl’ or ‘docker’.
cromwell_id (str) – Cromwell server ID.
azure_worker_instance_type (str) – The worker node instance type to be used in azure.
azure_worker_instance_disk (int) – The disk size in GB for the worker node to be used in azure.
azure_worker_instance_spot (bool) – Whether the azure worker nodes have to be spot instances or not.
cost_limit (float) – Job cost limit. -1 means no cost limit.
use_mountpoints (bool) – Whether to use AWS S3 mountpoints for quicker file staging.
docker_login (bool) – Whether to use private docker images, provided the users have linked their docker.io accounts.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
command (string) – The command to run in bash jobs.
cpus (int) – The number of CPUs to use for the bash job task’s master node.
memory (int) – The amount of memory, in GB, to use for the bash job task’s master node.
- Returns:
j_id – The CloudOS job id of the job just launched.
- Return type:
string
- setup_params_array_file(custom_script_path, ds_custom, command, separator)[source]¶
Sets up a dictionary representing command parameters, including support for custom scripts and array files, to be used in job execution.
- Parameters:
custom_script_path (str) – Path to the custom script file. If None, the command is treated as text.
ds_custom (object) – An object providing access to folder content listing functionality.
command (str) – The command to be executed, either as text or the name of a custom script.
separator (str) – The separator to be used for the array file.
- Returns:
- A dictionary containing the command parameters, including:
“command”: The command name or text.
“customScriptFile” (optional): Details of the custom script file if provided.
“arrayFile”: Details of the array file and its separator.
- Return type:
dict
- static split_array_file_params(array_parameter, workflow_type, array_file_header)[source]¶
Splits and processes array parameters for a given workflow type and array file header.
- Parameters:
array_parameter (list) – A list of strings representing array parameters in the format “key=value”.
workflow_type (str) – The type of workflow, e.g., ‘docker’.
array_file_header (list) – A list of dictionaries representing the header of the array file. Each dictionary should contain “name” and “index” keys.
- Returns:
- A dictionary containing processed parameter details, including:
prefix (str): The prefix for the parameter (e.g., “--” or “-”).
name (str): The name of the parameter with leading dashes stripped.
parameterKind (str): The kind of parameter, set to “arrayFileColumn”.
columnName (str): The name of the column derived from the parameter value.
columnIndex (int): The index of the column in the array file header.
- Return type:
dict
- Raises:
ValueError – If an array parameter does not contain a ‘=’ character or is improperly formatted.
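A minimal sketch of the mapping described above, from a single “key=value” string to an “arrayFileColumn” dict, might look like this. `array_column_param` is a hypothetical helper that handles one parameter at a time and ignores `workflow_type`. Raising ValueError when the column is missing from the header is an assumption of this sketch.

```python
def array_column_param(array_parameter, array_file_header):
    """Map '--name=column' onto the documented arrayFileColumn dict (hypothetical helper)."""
    if "=" not in array_parameter:
        raise ValueError(f"Expected 'key=value', got {array_parameter!r}")
    flag, column_name = array_parameter.split("=", 1)
    name = flag.lstrip("-")                  # leading dashes stripped
    prefix = flag[: len(flag) - len(name)]   # '--', '-' or ''
    for column in array_file_header:
        if column["name"] == column_name:
            return {
                "prefix": prefix,
                "name": name,
                "parameterKind": "arrayFileColumn",
                "columnName": column_name,
                "columnIndex": column["index"],
            }
    # Assumption: an unknown column is treated as an error.
    raise ValueError(f"Column {column_name!r} not found in array file header")


header = [{"name": "id", "index": 0}, {"name": "fastq", "index": 1}]
processed = array_column_param("--reads=fastq", header)
```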
- update_parameter_value(parameters, param_name, new_value)[source]¶
Update a parameter value in the parameters list.
- Parameters:
parameters (list) – List of parameter dictionaries.
param_name (str) – Name of the parameter to update.
new_value (str) – New value for the parameter.
- Returns:
True if parameter was found and updated, False otherwise.
- Return type:
bool
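The find-and-update behaviour can be sketched as follows. `update_text_value` is a hypothetical stand-in, and it assumes each parameter dict stores its name under “name” and its value under “textValue”, as in the text-value structure shown for docker_workflow_param_processing. The real method may handle other parameter kinds differently.

```python
def update_text_value(parameters, param_name, new_value):
    """Update the first parameter named param_name in place (hypothetical helper)."""
    for parameter in parameters:
        if parameter.get("name") == param_name:
            parameter["textValue"] = new_value  # assumed key for the stored value
            return True
    return False


params = [
    {"name": "threads", "prefix": "--", "parameterKind": "textValue", "textValue": "4"},
    {"name": "mode", "prefix": "--", "parameterKind": "textValue", "textValue": "fast"},
]
found = update_text_value(params, "threads", "8")
missing = update_text_value(params, "genome", "hg38")
```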
- verify: bool | str = True¶
- wait_job_completion(job_id, wait_time=3600, request_interval=30, verbose=False, verify=True)¶
Checks job status from CloudOS and waits for its completion.
- Parameters:
job_id (string) – The CloudOS job id of the job just launched.
wait_time (int) – Max time to wait (in seconds) for job completion.
request_interval (int) – Time interval (in seconds) to request job status.
verbose (bool) – Whether to output status on every request or not.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
- Returns:
A dict with three elements collected from the job status: ‘name’, ‘id’, ‘status’.
- Return type:
dict
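The polling behaviour implied by `wait_time` and `request_interval` can be sketched as a simple loop. This is an illustration under stated assumptions rather than the library's implementation: `wait_for` and `get_status` are hypothetical names, the set of final statuses is assumed, and the sleep function is injectable so the example runs instantly.

```python
import time

FINAL_STATUSES = {"completed", "failed", "aborted"}  # assumed set of final states


def wait_for(get_status, job_id, wait_time=3600, request_interval=30,
             sleep=time.sleep):
    """Poll get_status until a final status or the time budget is spent (hypothetical)."""
    elapsed = 0
    while True:
        info = get_status(job_id)  # expected: dict with 'name', 'id', 'status'
        if info["status"] in FINAL_STATUSES or elapsed >= wait_time:
            return {"name": info["name"], "id": info["id"], "status": info["status"]}
        sleep(request_interval)
        elapsed += request_interval


# Fake status source: two 'running' answers, then 'completed'.
_answers = iter(["running", "running", "completed"])


def fake_status(job_id):
    return {"name": "demo", "id": job_id, "status": next(_answers)}


result = wait_for(fake_status, "job-1", sleep=lambda seconds: None)
timed_out = wait_for(lambda job_id: {"name": "slow", "id": job_id, "status": "running"},
                     "job-2", wait_time=60, request_interval=30,
                     sleep=lambda seconds: None)
```

When the time budget runs out first, the sketch returns the last status seen, so the caller can tell a timeout from a finished job by inspecting ‘status’.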
- workflow_content_query(workspace_id, workflow_name, verify=True, query='workflowType', last=False)¶
- property workflow_id: str¶
- workflow_import(workspace_id, workflow_url, workflow_name, repository_project_id, workflow_docs_link='', repository_id=None, verify=True)¶
Imports workflows to CloudOS.
- Parameters:
workspace_id (string) – The CloudOS workspace id from which to collect the projects.
workflow_url (string) – The URL of the workflow. Only GitHub or Bitbucket are allowed.
workflow_name (string) – A name for the imported pipeline in CloudOS.
repository_project_id (int) – The repository project ID.
workflow_docs_link (string) – Link to the documentation URL.
repository_id (int) – The repository ID. Only required for GitHub repositories.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
- Returns:
workflow_id – The newly imported workflow ID.
- Return type:
string
- workflow_name: str¶
- workspace_id: str¶