cloudos_cli.datasets.datasets
This is the main class for file explorer (datasets).
Classes
Datasets – Class for file explorer.
- class cloudos_cli.datasets.datasets.Datasets(cloudos_url, apikey, cromwell_token, workspace_id, project_name, verify=True, project_id=<property object>)[source]¶
Bases:
Cloudos
Class for file explorer.
- Parameters:
cloudos_url (string) – The CloudOS service url.
apikey (string) – Your CloudOS API key.
workspace_id (string) – The specific CloudOS workspace id.
project_name (string) – The name of a CloudOS project.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
project_id (string) – The CloudOS project id for a given project name.
cromwell_token (str)
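As a minimal sketch of how the constructor arguments above could be gathered, the helper below reads them from environment variables. The variable names (CLOUDOS_URL, CLOUDOS_APIKEY and so on) and the helper itself are hypothetical and not part of cloudos_cli; whether cromwell_token may be None is also an assumption.

```python
import os


def build_datasets_kwargs(env=None):
    """Collect the documented constructor arguments from environment variables.

    The environment variable names used here are hypothetical.
    """
    env = os.environ if env is None else env
    # verify accepts either a bool or the path to an SSL certificate file.
    ca_bundle = env.get("CLOUDOS_CA_BUNDLE")
    return {
        "cloudos_url": env["CLOUDOS_URL"],
        "apikey": env["CLOUDOS_APIKEY"],
        # Assumption: the token may be left unset when Cromwell is not used.
        "cromwell_token": env.get("CLOUDOS_CROMWELL_TOKEN"),
        "workspace_id": env["CLOUDOS_WORKSPACE_ID"],
        "project_name": env["CLOUDOS_PROJECT_NAME"],
        "verify": ca_bundle if ca_bundle else True,
    }
```

With cloudos_cli installed, the resulting dict could presumably be passed as Datasets(**build_datasets_kwargs()).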
- abort_job(job, workspace_id, verify=True)¶
Abort a job.
- Parameters:
job (string) – The CloudOS job id of the job to abort.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
- Returns:
r – The server response
- Return type:
requests.models.Response
- apikey: str¶
- cloudos_url: str¶
- copy_item(item, destination_id, destination_kind)[source]¶
Copy a file or folder (S3, Azure or Virtual) to a destination in CloudOS.
- create_project(workspace_id, project_name, verify=True)¶
Create a new project in CloudOS.
- Parameters:
workspace_id (str) – The CloudOS workspace ID where the project will be created.
project_name (str) – The name for the new project.
verify ([bool | str], optional) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file. Default is True.
- Returns:
The ID of the newly created project.
- Return type:
str
- Raises:
BadRequestException – If the request to create the project fails with a status code indicating an error.
- create_virtual_folder(name, parent_id, parent_kind)[source]¶
Create a new virtual folder in CloudOS under a given parent.
- Parameters:
name (str) – The name of the new folder.
parent_id (str) – The ID of the parent (can be a Dataset or a Folder).
parent_kind (str) – The type of the parent: either “Dataset” or “Folder”.
- Returns:
response – The response object from the CloudOS API.
- Return type:
requests.Response
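Because parent_kind only accepts two values, a caller may want to check it before sending the request. The wrapper below is a sketch under that assumption; create_folder_checked is a hypothetical helper, and client stands for any object exposing create_virtual_folder with the signature documented above.

```python
VALID_PARENT_KINDS = ("Dataset", "Folder")


def create_folder_checked(client, name, parent_id, parent_kind):
    """Validate parent_kind locally, then call create_virtual_folder."""
    if parent_kind not in VALID_PARENT_KINDS:
        raise ValueError(
            f"parent_kind must be one of {VALID_PARENT_KINDS}, got {parent_kind!r}"
        )
    return client.create_virtual_folder(name, parent_id, parent_kind)
```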
- cromwell_switch(workspace_id, action, verify=True)¶
Restart or stop the Cromwell server.
- Parameters:
workspace_id (string) – The CloudOS workspace id in which to restart or stop Cromwell.
action (string [restart|stop]) – The action to perform.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
- Returns:
r – The server response
- Return type:
requests.models.Response
- cromwell_token: str¶
- delete_item(item_id, kind)[source]¶
Delete a file or folder in CloudOS.
- Parameters:
item_id (str) – The ID of the file or folder to delete.
kind (str) – Must be either “File” or “Folder”.
- Returns:
response – The response object from the CloudOS API.
- Return type:
requests.Response
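When several items have to be removed, the calls can be looped and the HTTP status of each response collected. This is only a sketch: delete_items is a hypothetical helper, client is any object exposing delete_item as documented above, and the returned object is assumed to behave like requests.Response (which provides status_code).

```python
def delete_items(client, items):
    """Delete several items and report the HTTP status code of each call.

    items is an iterable of (item_id, kind) pairs, kind being "File" or "Folder".
    """
    items = list(items)
    # Validate every kind first so that nothing is deleted on bad input.
    for item_id, kind in items:
        if kind not in ("File", "Folder"):
            raise ValueError(f"kind must be 'File' or 'Folder', got {kind!r}")
    return {
        item_id: client.delete_item(item_id, kind).status_code
        for item_id, kind in items
    }
```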
- detect_workflow(workflow_name, workspace_id, verify=True, last=False)¶
Detects workflow type: nextflow or wdl.
- Parameters:
workflow_name (string) – Name of the workflow.
workspace_id (string) – The CloudOS workspace id from which to collect the workflows.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
- Returns:
wt – The workflow type detected
- Return type:
string [‘nextflow’|’wdl’]
- fetch_project_id(workspace_id, project_name, verify=True)[source]¶
Fetch the project id for a given name.
- Parameters:
workspace_id (string) – The specific CloudOS workspace id.
project_name (string) – The name of a CloudOS project element.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
- Returns:
project_id – The CloudOS project id for a given project name.
- Return type:
string
- get_cromwell_status(workspace_id, verify=True)¶
Get Cromwell server status from CloudOS.
- Parameters:
workspace_id (string) – The CloudOS workspace id for which to check the Cromwell status.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
- Returns:
r – The server response
- Return type:
requests.models.Response
- get_job_list(workspace_id, last_n_jobs=30, page=1, archived=False, verify=True, filter_status=None, filter_job_name=None, filter_project=None, filter_workflow=None, filter_job_id=None, filter_only_mine=False, filter_owner=None, filter_queue=None, last=False)¶
Get jobs from a CloudOS workspace with optional filtering.
Fetches jobs page by page, applies all filters after fetching. Stops when enough jobs are collected or no more jobs are available.
- Parameters:
workspace_id (string) – The CloudOS workspace id from which to collect the jobs.
last_n_jobs ([int | 'all']) – How many of the last jobs from the user to retrieve. You can specify a very large int or ‘all’ to get all user’s jobs.
page (int) – Response page to get (ignored when using filters - starts from page 1).
archived (bool) – When True, only the archived jobs are retrieved.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
filter_status (string, optional) – Filter jobs by status (e.g., ‘completed’, ‘running’, ‘failed’).
filter_job_name (string, optional) – Filter jobs by name.
filter_project (string, optional) – Filter jobs by project name (will be resolved to project ID).
filter_workflow (string, optional) – Filter jobs by workflow name (will be resolved to workflow ID).
filter_job_id (string, optional) – Filter jobs by specific job ID.
filter_only_mine (bool, optional) – Filter to show only jobs belonging to the current user.
filter_owner (string, optional) – Filter jobs by owner username (will be resolved to user ID).
filter_queue (string, optional) – Filter jobs by queue name (will be resolved to queue ID). Only applies to jobs running in batch environment. Non-batch jobs are preserved in results as they don’t use queues.
last (bool, optional) – When workflows are duplicated, use the latest imported workflow (by date).
- Returns:
r – A list of dicts, each corresponding to a job from the user and the workspace.
- Return type:
list
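The fetch-then-filter behaviour described for get_job_list can be illustrated with a generic loop. This is a sketch of the idea, not the library's implementation: collect_jobs, fetch_page and keep are hypothetical names, and fetch_page(page) is assumed to return an empty list once no more jobs are available.

```python
def collect_jobs(fetch_page, keep, last_n_jobs):
    """Fetch pages until enough matching jobs are collected or pages run out.

    fetch_page(page) returns a list of job dicts (empty when exhausted);
    keep(job) is the filter applied after each page is fetched.
    """
    collected = []
    page = 1  # filtering always starts from the first page
    while last_n_jobs == "all" or len(collected) < last_n_jobs:
        jobs = fetch_page(page)
        if not jobs:
            break  # no more jobs available
        collected.extend(job for job in jobs if keep(job))
        page += 1
    return collected if last_n_jobs == "all" else collected[:last_n_jobs]
```

Filtering after fetching means a page may contribute fewer jobs than its size, which is why the loop keeps requesting pages until the requested count is reached.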
- get_job_logs(j_id, workspace_id, verify=True)¶
Get the location of the logs for the specified job
- get_job_results(j_id, workspace_id, verify=True)¶
Get the location of the results for the specified job
- get_job_status(j_id, verify=True)¶
Get job status from CloudOS.
- Parameters:
j_id (string) – The CloudOS job id of the job just launched.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
- Returns:
r – The server response
- Return type:
requests.models.Response
- get_job_workdir(j_id, workspace_id, verify=True)¶
Get the working directory for the specified job
- get_project_id_from_name(workspace_id, project_name, verify=True)¶
Retrieve the project ID from its name.
- Parameters:
workspace_id (str) – The CloudOS workspace ID to search for the project.
project_name (str) – The name of the project to search for.
verify ([bool | str], optional) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file. Default is True.
- Returns:
The server response containing project details.
- Return type:
dict
- Raises:
BadRequestException – If the request to retrieve the project fails with a status code indicating an error.
- get_project_list(workspace_id, verify=True, get_all=True, page=1, page_size=10, max_page_size=100)¶
Get all the projects from a CloudOS workspace.
- Parameters:
workspace_id (string) – The CloudOS workspace id from which to collect the projects.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
get_all (bool) – Whether to get all available projects or just the indicated page.
page (int) – The page number to retrieve, from the paginated response.
page_size (int) – The number of projects per page. From 1 to 1000.
max_page_size (int) – Max page size defined by the API server. It is currently 1000.
- Returns:
r – The server response
- Return type:
requests.models.Response
- get_storage_contents(cloud_name, cloud_meta, container, path, workspace_id, verify)¶
Retrieves the contents of a storage container from the specified cloud service.
This method fetches the contents of a specified path within a storage container on a cloud service (e.g., AWS S3 or Azure Blob). The request is authenticated using an API key and requires valid parameters such as the workspace ID and path.
- Parameters:
cloud_name (str) – The name of the cloud service (e.g., ‘aws’ or ‘azure’).
container (str) – The name of the storage container or bucket.
path (str) – The file path or directory within the storage container.
workspace_id (str) – The identifier of the workspace or team.
verify (bool) – Whether to verify SSL certificates for the request.
- Returns:
A list of contents retrieved from the specified cloud storage.
- Return type:
list
- Raises:
BadRequestException – If the request to retrieve the contents fails with a status code indicating an error.
- get_user_info(verify=True)¶
Gets user information from users/me endpoint
- Parameters:
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
- Returns:
r – The server response content
- Return type:
requests.models.Response.content
- get_workflow_content(workspace_id, workflow_name, verify=True, last=False, max_page_size=100)¶
Retrieve the workflow content from API.
- Parameters:
workspace_id (str) – The CloudOS workspace ID to search for the workflow.
workflow_name (str) – The name of the workflow to search for.
verify ([bool | str], optional) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file. Default is True.
- Returns:
The server response containing workflow details.
- Return type:
dict
- Raises:
BadRequestException – If the request to retrieve the workflow fails with a status code indicating an error.
- get_workflow_list(workspace_id, verify=True, get_all=True, page=1, page_size=10, max_page_size=100, archived_status=False)¶
Get all the workflows from a CloudOS workspace.
- Parameters:
workspace_id (string) – The CloudOS workspace id from which to collect the workflows.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
get_all (bool) – Whether to get all available curated workflows or just the indicated page.
page (int) – The page number to retrieve, from the paginated response.
page_size (int) – The number of workflows per page. From 1 to 1000.
max_page_size (int) – Max page size defined by the API server. It is currently 1000.
archived_status (bool) – Whether to retrieve archived workflows or not.
- Returns:
r – A list of dicts, each corresponding to a workflow.
- Return type:
list
- get_workflow_max_pagination(workspace_id, workflow_name, verify=True)¶
Retrieve the maximum number of workflow pages from the API.
- Parameters:
workspace_id (str) – The CloudOS workspace ID to search for the workflow.
workflow_name (str) – The name of the workflow to search for.
verify ([bool | str], optional) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file. Default is True.
- Returns:
The server response with max pagination for workflows.
- Return type:
int
- Raises:
BadRequestException – If the request to retrieve the workflow fails with a status code indicating an error.
- is_module(workflow_name, workspace_id, verify=True, last=False)¶
Detects whether the workflow is a system module or not.
System modules use fixed queues, so this check is important to properly manage queue selection.
- Parameters:
workflow_name (string) – Name of the workflow.
workspace_id (string) – The CloudOS workspace id from which to collect the workflows.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
- Returns:
True if the workflow is a system module, False otherwise.
- Return type:
bool
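One way the is_module check might feed into queue selection is sketched below. pick_queue is a hypothetical helper, and returning None to mean "do not request a queue" is an assumption about how a caller would treat system modules, not documented behaviour.

```python
def pick_queue(client, workflow_name, workspace_id, requested_queue=None):
    """Return the queue name to request, or None for system modules."""
    if client.is_module(workflow_name, workspace_id):
        # Assumption: system modules use a fixed queue that cannot be overridden.
        return None
    return requested_queue
```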
- list_azure_container_content(container_name, storage_account_name, path)[source]¶
List contents of an Azure Blob container path.
- Parameters:
container_name (str)
storage_account_name (str)
path (str)
- list_datasets_content(folder_name)[source]¶
- Uses:
apikey (string) – Your CloudOS API key.
cloudos_url (string) – The CloudOS service url.
workspace_id (string) – The specific CloudOS workspace id.
project_id (string) – The specific project id.
folder_name (string) – The requested folder name.
- list_folder_content(path=None)[source]¶
Wrapper to list contents of a CloudOS folder.
- Parameters:
path (str, optional) – A path like ‘TopFolder’, ‘TopFolder/Subfolder’, or deeper. If None, lists all top-level datasets in the project.
- Returns:
JSON response from the appropriate CloudOS endpoint.
- Return type:
dict
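To make the path convention concrete, the sketch below splits a path into its folder segments, with None standing for the project's top level. split_folder_path is a hypothetical helper for illustration; how the library itself walks the path is not shown in this page.

```python
def split_folder_path(path=None):
    """Split 'TopFolder/Subfolder' into segments; None means the top level."""
    if path is None:
        return []
    # Ignore leading, trailing and doubled slashes.
    return [segment for segment in path.split("/") if segment]
```

An empty result corresponds to listing all top-level datasets, one segment to a top-level folder, and each further segment to one level of nesting.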
- list_project_content()[source]¶
Fetch the information of the directories present in the projects.
- Uses:
apikey (string) – Your CloudOS API key.
cloudos_url (string) – The CloudOS service url.
workspace_id (string) – The specific CloudOS workspace id.
project_id (string) – The specific project id.
- list_s3_folder_content(s3_bucket_name, s3_relative_path)[source]¶
- Uses:
apikey (string) – Your CloudOS API key.
cloudos_url (string) – The CloudOS service url.
workspace_id (string) – The specific CloudOS workspace id.
project_id (string) – The specific project id.
s3_bucket_name (string) – The S3 bucket name.
s3_relative_path (string) – The relative path in the S3 bucket.
- list_virtual_folder_content(folder_id)[source]¶
- Uses:
apikey (string) – Your CloudOS API key.
cloudos_url (string) – The CloudOS service url.
workspace_id (string) – The specific CloudOS workspace id.
project_id (string) – The specific project id.
folder_id (string) – The id of the folder whose contents are to be listed.
- move_files_and_folders(source_id, source_kind, target_id, target_kind)[source]¶
Move a file or folder to another location in CloudOS.
- Parameters:
source_id (str) – The ID of the file or folder to move.
source_kind (str) – The kind of the item to move.
target_id (str) – The ID of the target to move the item into.
target_kind (str) – The kind of the target.
- Returns:
response – The response object from the CloudOS API.
- Return type:
requests.Response
- static process_job_list(r, all_fields=False)¶
Process a job list from a self.get_job_list call.
- Parameters:
r (list) – A list of dicts, each corresponding to a job from the user and the workspace.
all_fields (bool. Default=False) – Whether to return a reduced version of the DataFrame containing only the selected columns or the full DataFrame.
- Returns:
df – A DataFrame with the requested columns from the jobs.
- Return type:
pandas.DataFrame
- static process_project_list(r, all_fields=False)¶
Process a server response from a self.get_project_list call.
- Parameters:
r (requests.models.Response) – A list of dicts, each corresponding to a project.
all_fields (bool. Default=False) – Whether to return a reduced version of the DataFrame containing only the selected columns or the full DataFrame.
- Returns:
df – A DataFrame with the requested columns from the projects.
- Return type:
pandas.DataFrame
- static process_workflow_list(r, all_fields=False)¶
Process a server response from a self.get_workflow_list call.
- Parameters:
r (list) – A list of dicts, each corresponding to a workflow.
all_fields (bool. Default=False) – Whether to return a reduced version of the DataFrame containing only the selected columns or the full DataFrame.
- Returns:
df – A DataFrame with the requested columns from the workflows.
- Return type:
pandas.DataFrame
- property project_id: str¶
- project_name: str¶
- rename_item(item_id, new_name, kind)[source]¶
Rename a file or folder in CloudOS.
- Parameters:
item_id (str) – The ID of the file or folder to rename.
new_name (str) – The new name to assign to the item.
kind (str) – Either “File” or “Folder”
- Returns:
response – The response object from the CloudOS API.
- Return type:
requests.Response
- reorder_job_list(my_jobs_df, filename='my_jobs.csv')¶
Save a job list DataFrame to a CSV file with renamed and ordered columns.
- Parameters:
my_jobs_df (pandas.DataFrame) – A DataFrame containing job information from process_job_list.
filename (str) – The name of the file to save the DataFrame to. Default is ‘my_jobs.csv’.
- Returns:
Saves the DataFrame to a CSV file with renamed and ordered columns.
- Return type:
None
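The three job-list methods are documented as feeding into one another: get_job_list returns a list of dicts, process_job_list turns it into a DataFrame, and reorder_job_list writes that DataFrame to CSV. A minimal sketch of chaining them follows; export_jobs is a hypothetical helper and client is any object exposing those three methods with the signatures shown in this page.

```python
def export_jobs(client, workspace_id, filename="my_jobs.csv", **filters):
    """Fetch jobs, convert them to a DataFrame and write them to a CSV file."""
    # Extra keyword arguments (e.g. filter_status) are passed to get_job_list.
    jobs = client.get_job_list(workspace_id, **filters)
    jobs_df = client.process_job_list(jobs, all_fields=False)
    client.reorder_job_list(jobs_df, filename=filename)
    return jobs_df
```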
- resolve_user_id(filter_owner, workspace_id, verify=True)¶
Resolve a username or display name to a user ID.
- Parameters:
filter_owner (str) – The username or display name to search for.
workspace_id (str) – The CloudOS workspace ID.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
- Returns:
The user ID corresponding to the filter_owner.
- Return type:
str
- Raises:
ValueError – If the user cannot be found or if there’s an error during the search.
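Since resolve_user_id raises ValueError when the user cannot be found, a caller that prefers a sentinel value can wrap the call. The helper name resolve_owner_or_none is hypothetical, and client stands for any object exposing resolve_user_id as documented above.

```python
def resolve_owner_or_none(client, filter_owner, workspace_id, verify=True):
    """Return the user ID for filter_owner, or None if it cannot be resolved."""
    try:
        return client.resolve_user_id(filter_owner, workspace_id, verify=verify)
    except ValueError:
        return None
```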
- save_job_list_to_csv(my_jobs_df, filename='my_jobs.csv')¶
- verify: bool | str = True¶
- wait_job_completion(job_id, wait_time=3600, request_interval=30, verbose=False, verify=True)¶
Checks job status from CloudOS and waits for its completion.
- Parameters:
job_id (string) – The CloudOS job id of the job just launched.
wait_time (int) – Max time to wait (in seconds) for job completion.
request_interval (int) – Time interval (in seconds) to request job status.
verbose (bool) – Whether to output status on every request or not.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
- Returns:
A dict with three elements collected from the job status: ‘name’, ‘id’, ‘status’.
- Return type:
dict
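The interaction between wait_time and request_interval can be pictured as a simple polling loop. The sketch below is not the library's code: wait_for_job and get_status are hypothetical names, and the terminal status names in done are assumptions.

```python
import time


def wait_for_job(get_status, job_id, wait_time=3600, request_interval=30,
                 done=("completed", "failed", "aborted"), sleep=time.sleep):
    """Poll get_status(job_id) until a terminal status or until wait_time elapses.

    get_status returns a dict containing at least 'name', 'id' and 'status'.
    The status names in `done` are assumed, not taken from the library.
    """
    elapsed = 0
    while True:
        info = get_status(job_id)
        if info["status"] in done or elapsed >= wait_time:
            # Keep only the three documented elements.
            return {key: info[key] for key in ("name", "id", "status")}
        sleep(request_interval)
        elapsed += request_interval
```

Passing sleep as a parameter keeps the loop testable without real waiting; with the defaults, at most wait_time / request_interval sleeps occur before the last observed status is returned.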
- workflow_content_query(workspace_id, workflow_name, verify=True, query='workflowType', last=False)¶
- workflow_import(workspace_id, workflow_url, workflow_name, repository_project_id, workflow_docs_link='', repository_id=None, verify=True)¶
Imports workflows to CloudOS.
- Parameters:
workspace_id (string) – The CloudOS workspace id from which to collect the projects.
workflow_url (string) – The URL of the workflow. Only GitHub or Bitbucket are allowed.
workflow_name (string) – A name for the imported pipeline in CloudOS.
repository_project_id (int) – The repository project ID.
workflow_docs_link (string) – Link to the documentation URL.
repository_id (int) – The repository ID. Only required for GitHub repositories.
verify ([bool|string]) – Whether to use SSL verification or not. Alternatively, if a string is passed, it will be interpreted as the path to the SSL certificate file.
- Returns:
workflow_id – The newly imported workflow ID.
- Return type:
string
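The two constraints stated for workflow_import (only GitHub or Bitbucket URLs, and repository_id required for GitHub) can be checked before calling it. The sketch below assumes the public hosts github.com and bitbucket.org; check_import_args and ALLOWED_HOSTS are hypothetical, and self-hosted instances are not covered.

```python
from urllib.parse import urlparse

# Assumed public hosts; self-hosted instances would need other entries.
ALLOWED_HOSTS = {"github.com": "github", "bitbucket.org": "bitbucket"}


def check_import_args(workflow_url, repository_id=None):
    """Return the detected platform, or raise ValueError for unusable arguments."""
    host = (urlparse(workflow_url).hostname or "").lower()
    platform = ALLOWED_HOSTS.get(host)
    if platform is None:
        raise ValueError(
            f"Only GitHub or Bitbucket URLs are allowed, got {workflow_url!r}"
        )
    if platform == "github" and repository_id is None:
        raise ValueError("repository_id is required for GitHub repositories")
    return platform
```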
- workspace_id: str¶