Title: | Sagemaker core classes, methods and functions |
---|---|
Description: | Contains core classes, methods and functions that support `AWS Sagemaker R Software Development Kit (SDK)`. |
Authors: | Dyfan Jones [aut, cre], Amazon.com, Inc. [cph] |
Maintainer: | Dyfan Jones <[email protected]> |
License: | Apache License (>= 2.0) |
Version: | 0.1.2.9002 |
Built: | 2024-11-21 03:34:57 UTC |
Source: | https://github.com/DyfanJones/sagemaker-r-core |
Other contributors:
Amazon.com, Inc. [copyright holder]
Appends the project tag to the list of tags, if it exists.
.append_project_tags(tags = NULL, working_dir = NULL)
tags |
: the list of tags to append to. |
working_dir |
: the working directory to start looking. |
A possibly extended list of tags that includes the project id
Generate models for JumpStart, and optionally apply filters to result.
.generate_jumpstart_model_versions( filter = Constant$new(BooleanValues$`TRUE`), region = JUMPSTART_DEFAULT_REGION_NAME(), list_incomplete_models = FALSE )
filter |
(Union[Operator, str]): Optional. The filter to apply to generate models. This can be either an `Operator` type filter (e.g. `And("task == ic", "framework == pytorch")`), or simply a string filter, which is serialized into an Identity filter (e.g. `"task == ic"`). If this argument is not supplied, all models will be generated. (Default: Constant(BooleanValues$TRUE)). |
region |
(str): Optional. The AWS region from which to retrieve JumpStart metadata regarding models. (Default: JUMPSTART_DEFAULT_REGION_NAME()). |
list_incomplete_models |
(bool): Optional. Whether to include a model when it lacks the metadata fields requested by the filter and the filter therefore cannot be resolved to an include/exclude decision. By default, these models are omitted from results. (Default: False). |
Static class for storing the JumpStart models cache.
get_model_header()
Returns model header from JumpStart models cache.
.JumpStartModelsAccessor$get_model_header(region, model_id, version)
region
(str): region for which to retrieve header.
model_id
(str): model id to retrieve.
version
(str): semantic version to retrieve for the model id.
get_model_specs()
Returns model specs from JumpStart models cache.
.JumpStartModelsAccessor$get_model_specs(region, model_id, version)
region
(str): region for which to retrieve specs.
model_id
(str): model id to retrieve.
version
(str): semantic version to retrieve for the model id.
set_cache_kwargs()
Sets cache kwargs, clears the cache.
.JumpStartModelsAccessor$set_cache_kwargs(cache_kwargs, region = NULL)
cache_kwargs
(str): cache kwargs to validate.
region
(str): Optional. The region to validate along with the kwargs.
reset_cache()
Resets cache, optionally allowing cache kwargs to be passed to the new cache.
.JumpStartModelsAccessor$reset_cache(cache_kwargs = NULL, region = NULL)
cache_kwargs
(str): cache kwargs to validate.
region
(str): The region to validate along with the kwargs.
get_manifest()
Return entire JumpStart models manifest.
.JumpStartModelsAccessor$get_manifest(cache_kwargs = NULL, region = NULL)
cache_kwargs
(Dict[str, Any]): Optional. Cache kwargs to use. (Default: None).
region
(str): Optional. The region to use for the cache. (Default: None).
clone()
The objects of this class are cloneable with this method.
.JumpStartModelsAccessor$clone(deep = FALSE)
deep
Whether to make a deep clone.
Invokes the docker pull command for the given image.
.pull_image(image)
image |
(str): The docker image to pull. |
This class is responsible for creating the directories and configuration files that the docker containers will use for either training or serving.
new()
Initialize a SageMakerContainer instance. It uses a :class:'sagemaker.session.Session' for general interaction with user configuration, such as getting the default SageMaker S3 bucket. However, this class does not call any of the SageMaker APIs.
.SageMakerContainer$new( instance_type, instance_count, image, sagemaker_session = NULL, container_entrypoint = NULL, container_arguments = NULL )
instance_type
(str): The instance type to use. Either 'local' or 'local_gpu'
instance_count
(int): The number of instances to create.
image
(str): docker image to use.
sagemaker_session
(sagemaker.session.Session): a sagemaker session to use when interacting with SageMaker.
container_entrypoint
(str): the container entrypoint to execute
container_arguments
(str): the container entrypoint arguments
process()
Run a processing job locally using docker-compose.
.SageMakerContainer$process( processing_inputs, processing_output_config, environment, processing_job_name )
processing_inputs
(dict): The processing input specification.
processing_output_config
(dict): The processing output configuration specification.
environment
(dict): The environment collection for the processing job.
processing_job_name
(str): Name of the local processing job being run.
train()
Run a training job locally using docker-compose.
.SageMakerContainer$train( input_data_config, output_data_config, hyperparameters, job_name )
input_data_config
(dict): The Input Data Configuration, this contains data such as the channels to be used for training.
output_data_config
: The configuration of the output data.
hyperparameters
(dict): The HyperParameters for the training job.
job_name
(str): Name of the local training job being run.
(str): Location of the trained model.
serve()
Host a local endpoint using docker-compose.
.SageMakerContainer$serve(model_dir, environment)
model_dir
(str): Path pointing to a local file or an s3:// location.
environment
a dictionary of environment variables to be passed to the hosting container.
stop_serving()
Stop the serving container. The serving container runs in async mode to allow the SDK to do other tasks.
.SageMakerContainer$stop_serving()
retrieve_artifacts()
Get the model artifacts from all the container nodes. Used after training completes to gather the data from all the individual containers. Like the official SageMaker Training Service, it overwrites duplicate files if multiple containers have the same file names.
.SageMakerContainer$retrieve_artifacts( compose_data, output_data_config, job_name )
compose_data
(list): Docker-Compose configuration in dictionary format.
output_data_config
: The configuration of the output data.
job_name
: The name of the job.
Local path to the collected model artifacts.
write_processing_config_files()
Write the config files for the processing containers. This method writes the hyperparameters, resources and input data configuration files.
.SageMakerContainer$write_processing_config_files( host, environment, processing_inputs, processing_output_config, processing_job_name )
host
(str): Host to write the configuration for
environment
(dict): Environment variable collection.
processing_inputs
(dict): Processing inputs.
processing_output_config
(dict): Processing output configuration.
processing_job_name
(str): Processing job name.
write_config_files()
Write the config files for the training containers. This method writes the hyperparameters, resources and input data configuration files.
.SageMakerContainer$write_config_files( host, hyperparameters, input_data_config )
host
(str): Host to write the configuration for
hyperparameters
(dict): Hyperparameters for training.
input_data_config
(dict): Training input channels to be used for training.
NULL
clone()
The objects of this class are cloneable with this method.
.SageMakerContainer$clone(deep = FALSE)
deep
Whether to make a deep clone.
Static class for storing the SageMaker settings.
set_sagemaker_version()
Set SageMaker version.
.SageMakerSettings$set_sagemaker_version(version)
version
(str): python sagemaker version
get_sagemaker_version()
Return SageMaker version.
.SageMakerSettings$get_sagemaker_version()
clone()
The objects of this class are cloneable with this method.
.SageMakerSettings$clone(deep = FALSE)
deep
Whether to make a deep clone.
Raise an exception if the payload is beyond the size in MB threshold.
.validate_payload_size(payload, size)
payload |
: data that will be checked |
size |
(int): max size in MB |
bool: True if within bounds. If size = 0, it always returns True.
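The size check described above can be sketched in base R. This is a minimal illustration and not the package's implementation: `check_payload_size()` is a hypothetical helper, and measuring character payloads in bytes and everything else by `length()` is an assumption.

```r
# Hypothetical helper illustrating a payload size check in MB.
check_payload_size <- function(payload, size_mb) {
  # A limit of 0 disables the check (mirrors the documented size = 0 case).
  if (size_mb == 0) return(TRUE)
  # Assumption: character payloads are measured in bytes,
  # any other payload (e.g. a raw vector) by its length.
  n_bytes <- if (is.character(payload)) {
    sum(nchar(payload, type = "bytes"))
  } else {
    length(payload)
  }
  if (n_bytes > size_mb * 1024^2) {
    stop(
      sprintf("Payload of %d bytes exceeds the %s MB limit.", n_bytes, format(size_mb)),
      call. = FALSE
    )
  }
  TRUE
}

check_payload_size("abc", 1)     # within bounds
check_payload_size(raw(10), 0)   # size 0 never fails
```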
No-op if this is not a JumpStart model related resource.
add_jumpstart_tags( tags = NULL, inference_model_uri = NULL, inference_script_uri = NULL, training_model_uri = NULL, training_script_uri = NULL )
tags |
(Optional[List[Dict[str, str]]]): Current tags for JumpStart inference or training job. (Default: None). |
inference_model_uri |
(Optional[str]): S3 URI for inference model artifact. (Default: None). |
inference_script_uri |
(Optional[str]): S3 URI for inference script tarball. (Default: None). |
training_model_uri |
(Optional[str]): S3 URI for training model artifact. (Default: None). |
training_script_uri |
(Optional[str]): S3 URI for training script tarball. (Default: None). |
Adds `tag_key` to `curr_tags` if `uri` corresponds to a JumpStart model.
add_single_jumpstart_tag(uri, tag_key, curr_tags)
uri |
(str): URI which may correspond to a JumpStart model. |
tag_key |
(enums.JumpStartTag): Custom tag to apply to current tags if the URI corresponds to a JumpStart model. |
curr_tags |
(Optional[List]): Current tags associated with `Estimator` or `Model`. |
And operator class for filtering JumpStart content.
sagemaker.core::Operand -> sagemaker.core::Operator -> And
new()
Instantiates And object.
And$new(...)
...
(Operand): Operand for And-ing.
eval()
Evaluates operator.
And$eval()
clone()
The objects of this class are cloneable with this method.
And$clone(deep = FALSE)
deep
Whether to make a deep clone.
Converts paws lists with 'UpperCamelCase' names to and from an R object with standard R members. Clients invoke to_paws on an instance of ApiObject to transform the ApiObject into a paws representation. Clients invoke from_paws on a sub-class of ApiObject to instantiate an instance of that class from a paws representation.
new()
Initialize ApiObject class
ApiObject$new(...)
...
:
from_paws()
Construct an instance of this ApiObject from a paws response.
ApiObject$from_paws(paws_list, ...)
paws_list
(list): A dictionary of a paws response.
...
: Arbitrary keyword arguments
to_paws()
Convert an object to a paws representation.
ApiObject$to_paws(obj)
obj
(dict): The object to convert to paws.
format()
Return a string representation of this ApiObject.
ApiObject$format()
clone()
The objects of this class are cloneable with this method.
ApiObject$clone(deep = FALSE)
deep
Whether to make a deep clone.
This function looks for timestamps that match the ones produced by name_from_base().
base_from_name(name)
name |
(str): The resource name. |
str: The base name, as extracted from the resource name.
Other sagemaker_utils: .aws_partition(), .download_files_under_prefix(), base_name_from_image(), build_dict(), common_variables, create_tar_file(), download_file(), download_folder(), get_config_value(), get_short_version(), name_from_base(), name_from_image(), regional_hostname(), repack_model(), retries(), sagemaker_short_timestamp(), sagemaker_timestamp(), secondary_training_status_changed(), secondary_training_status_message(), sts_regional_endpoint(), unique_name_from_base()
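As a rough illustration of how a timestamp suffix might be stripped from a resource name, the base R sketch below uses a regular expression. The helper `strip_timestamp()` is hypothetical, and the two timestamp layouts it recognises (a long `YYYY-MM-DD-HH-MM-SS-mmm` form and a short `YYMMDD-HHMM` form) are assumptions, not something this manual specifies.

```r
# Hypothetical helper: remove a trailing timestamp from a resource name.
strip_timestamp <- function(name) {
  # Assumed layouts: "-2024-11-21-03-34-57-123" (long) or "-241121-0334" (short).
  long  <- "[0-9]{4}-[0-9]{2}-[0-9]{2}-[0-9]{2}-[0-9]{2}-[0-9]{2}-[0-9]{3}"
  short <- "[0-9]{6}-[0-9]{4}"
  pattern <- sprintf("^(.+)-(%s|%s)$", long, short)
  # Return the name unchanged when no timestamp suffix is found.
  if (grepl(pattern, name, perl = TRUE)) {
    sub(pattern, "\\1", name, perl = TRUE)
  } else {
    name
  }
}

strip_timestamp("my-job-2024-11-21-03-34-57-123")  # "my-job"
strip_timestamp("my-job")                          # "my-job"
```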
Extract the base name of the image to use as the 'algorithm name' for the job.
base_name_from_image(image)
image |
(str): Image name. |
str: Algorithm name, as extracted from the image name.
Other sagemaker_utils: .aws_partition(), .download_files_under_prefix(), base_from_name(), build_dict(), common_variables, create_tar_file(), download_file(), download_folder(), get_config_value(), get_short_version(), name_from_base(), name_from_image(), regional_hostname(), repack_model(), retries(), sagemaker_short_timestamp(), sagemaker_timestamp(), secondary_training_status_changed(), secondary_training_status_message(), sts_regional_endpoint(), unique_name_from_base()
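One way to extract a base name from an image name is to drop any registry or repository prefix and any tag. The sketch below is an assumption about what "base name" means here, and `image_base_name()` is a hypothetical helper, not the package function.

```r
# Hypothetical helper: keep only the last path component, without the tag.
image_base_name <- function(image) {
  # Optional "<registry>/<path>/" prefix, the name itself, optional ":<tag>".
  pattern <- "^(.+/)?([^:/]+)(:[^:]+)?$"
  if (grepl(pattern, image, perl = TRUE)) {
    sub(pattern, "\\2", image, perl = TRUE)
  } else {
    image
  }
}

image_base_name("123456789012.dkr.ecr.us-east-1.amazonaws.com/my-algo:1.0")  # "my-algo"
image_base_name("my-algo")                                                   # "my-algo"
```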
This is a status value that an `Operand` can resolve to.
BooleanValues
An object of class BooleanValues (inherits from Enum, environment) of length 4.
Return a dict of key and value pair if value is not None, otherwise return an empty dict.
build_dict(key, value = NULL)
key |
(str): input key |
value |
(str): input value |
dict: dict of key and value or an empty dict.
Other sagemaker_utils: .aws_partition(), .download_files_under_prefix(), base_from_name(), base_name_from_image(), common_variables, create_tar_file(), download_file(), download_folder(), get_config_value(), get_short_version(), name_from_base(), name_from_image(), regional_hostname(), repack_model(), retries(), sagemaker_short_timestamp(), sagemaker_timestamp(), secondary_training_status_changed(), secondary_training_status_message(), sts_regional_endpoint(), unique_name_from_base()
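The behaviour described for build_dict() maps naturally onto a named list in R. The following is a minimal sketch under that assumption; `build_named_list()` is a hypothetical stand-in, not the package function.

```r
# Hypothetical helper: a one-element named list, or an empty list for NULL.
build_named_list <- function(key, value = NULL) {
  if (is.null(value)) return(list())
  stats::setNames(list(value), key)
}

build_named_list("InstanceType", "ml.m5.large")  # list(InstanceType = "ml.m5.large")
build_named_list("InstanceType")                 # list()
```

A helper of this shape is convenient when assembling request arguments with `c()`, because missing values simply contribute nothing.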
Helper function to return help documentation for sagemaker R6 classes.
cls_help(cls)
cls |
(R6::R6Class): R6 class |
Other r_utils: Enum(), IsSubR6Class(), format_class(), is_list_named(), is_tarfile(), islistempty(), pkg_method(), retry_api_call(), rsplit(), split_str(), write_bin()
Create a class containing all the parameters. It can be used when calling `Model$compile_model()`.
target_instance_type
Identifies the device that you want to run your model on after compilation
input_shape
Specifies the name and shape of the expected inputs for your trained model
output_path
Specifies where to store the compiled model
framework
The framework that is used to train the original model
framework_version
The version of the framework
compile_max_run
Timeout in seconds for compilation
tags
List of tags for labelling a compilation job
job_name
The name of the compilation job
target_platform_os
Target Platform OS
target_platform_arch
Target Platform Architecture
target_platform_accelerator
Target Platform Accelerator
compiler_options
Additional parameters for compiler
new()
Initialize CompilationInput class
CompilationInput$new( target_instance_type = NULL, input_shape = NULL, output_path = NULL, framework = NULL, framework_version = NULL, compile_max_run = 15 * 60, tags = NULL, job_name = NULL, target_platform_os = NULL, target_platform_arch = NULL, target_platform_accelerator = NULL, compiler_options = NULL )
target_instance_type
(str): Identifies the device that you want to run your model on after compilation, for example: ml_c5. For allowed strings see https://docs.aws.amazon.com/sagemaker/latest/dg/API_OutputConfig.html.
input_shape
(str): Specifies the name and shape of the expected inputs for your trained model in json dictionary form
output_path
(str): Specifies where to store the compiled model
framework
(str, optional): The framework that is used to train the original model. Allowed values: 'mxnet', 'tensorflow', 'keras', 'pytorch', 'onnx', 'xgboost' (default: None)
framework_version
(str, optional): The version of the framework (default: None)
compile_max_run
(int, optional): Timeout in seconds for compilation (default: 15 * 60). After this amount of time Amazon SageMaker Neo terminates the compilation job regardless of its current status.
tags
(list[dict], optional): List of tags for labelling a compilation job. For more, see https://docs.aws.amazon.com/sagemaker/latest/dg/API_Tag.html.
job_name
(str, optional): The name of the compilation job (default: None)
target_platform_os
(str, optional): Target Platform OS, for example: 'LINUX'. (default: None) For allowed strings see https://docs.aws.amazon.com/sagemaker/latest/dg/API_OutputConfig.html. It can be used instead of target_instance_family.
target_platform_arch
(str, optional): Target Platform Architecture, for example: 'X86_64'. (default: None) For allowed strings see https://docs.aws.amazon.com/sagemaker/latest/dg/API_OutputConfig.html. It can be used instead of target_instance_family.
target_platform_accelerator
(str, optional): Target Platform Accelerator, for example: 'NVIDIA'. (default: None) For allowed strings see https://docs.aws.amazon.com/sagemaker/latest/dg/API_OutputConfig.html. It can be used instead of target_instance_family.
compiler_options
(dict, optional): Additional parameters for compiler. (default: None) Compiler Options are TargetPlatform / target_instance_family specific. See https://docs.aws.amazon.com/sagemaker/latest/dg/API_OutputConfig.html for details.
format()
format class
CompilationInput$format()
clone()
The objects of this class are cloneable with this method.
CompilationInput$clone(deep = FALSE)
deep
Whether to make a deep clone.
Constant operator class for filtering JumpStart content.
sagemaker.core::Operand -> sagemaker.core::Operator -> Constant
new()
Instantiates Constant operator object.
Constant$new(constant)
constant
(BooleanValues): Value of constant.
eval()
Evaluates constant
Constant$eval()
clone()
The objects of this class are cloneable with this method.
Constant$clone(deep = FALSE)
deep
Whether to make a deep clone.
Create a definition for executing a container as part of a SageMaker model.
container_def( image_uri, model_data_url = NULL, env = NULL, container_mode = NULL, image_config = NULL )
image_uri |
(str): Docker image to run for this container. |
model_data_url |
(str): S3 URI of data required by this container, e.g. SageMaker training job model artifacts (default: None). |
env |
(dict[str, str]): Environment variables to set inside the container (default: None). |
container_mode |
(str): The model container mode. Valid modes:
|
image_config |
(dict[str, str]): Specifies whether the image of model container is pulled from ECR, or private registry in your VPC. By default it is set to pull model container image from ECR. (default: None). |
dict[str, str]: A complete container definition object usable with the CreateModel API if passed via 'PrimaryContainers' field.
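To make the shape of a container definition more concrete, here is a minimal base R sketch that assembles a list using the field names of the SageMaker CreateModel `ContainerDefinition` structure (`Image`, `Environment`, `ModelDataUrl`, `Mode`, `ImageConfig`). `make_container_def()` is a hypothetical helper; omitting unset optional fields and defaulting `Environment` to an empty list are assumptions about reasonable behaviour, not a description of container_def() itself.

```r
# Hypothetical helper: build a container definition as a plain named list.
make_container_def <- function(image_uri, model_data_url = NULL, env = NULL,
                               container_mode = NULL, image_config = NULL) {
  # Assumption: Environment defaults to an empty list when env is NULL.
  def <- list(Image = image_uri, Environment = if (is.null(env)) list() else env)
  # Assumption: optional fields are only included when supplied.
  if (!is.null(model_data_url)) def$ModelDataUrl <- model_data_url
  if (!is.null(container_mode)) def$Mode <- container_mode
  if (!is.null(image_config)) def$ImageConfig <- image_config
  def
}

# Example values are placeholders.
str(make_container_def("my-image:latest", model_data_url = "s3://my-bucket/model.tar.gz"))
```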
Create all the intermediate directories required for relative_path to exist within destination_directory. This assumes that relative_path is a directory located within root_dir.
copy_directory_structure(destination_directory, relative_path)
destination_directory |
(str): root of the destination directory where the directory structure will be created. |
relative_path |
(str): relative path that will be created within destination_directory |
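Creating the intermediate directories can be sketched with base R alone. `make_nested_dirs()` below is a hypothetical helper that illustrates the idea; it is not the package's implementation.

```r
# Hypothetical helper: ensure relative_path exists inside destination_directory.
make_nested_dirs <- function(destination_directory, relative_path) {
  target <- file.path(destination_directory, relative_path)
  # recursive = TRUE creates every missing parent directory as well.
  if (!dir.exists(target)) dir.create(target, recursive = TRUE)
  invisible(target)
}

root <- tempfile("dest_")
make_nested_dirs(root, file.path("a", "b", "c"))
dir.exists(file.path(root, "a", "b", "c"))  # TRUE
```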
A class containing parameters which can be used to create a SageMaker Model.
instance_type
type of EC2 instance that will be used for model deployment
accelerator_type
elastic inference accelerator type.
new()
Initialize CreateModelInput class
CreateModelInput$new(instance_type = NULL, accelerator_type = NULL)
instance_type
(str): type of EC2 instance that will be used for model deployment.
accelerator_type
(str): elastic inference accelerator type.
format()
format class
CreateModelInput$format()
clone()
The objects of this class are cloneable with this method.
CreateModelInput$clone(deep = FALSE)
deep
Whether to make a deep clone.
Wrap a function with a deprecation warning.
deprecated_function(func, name)
func |
(function): Function to wrap in a deprecation warning. |
name |
(str): The name that has been deprecated. |
The modified function
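Wrapping a function with a deprecation warning is a small closure in R. The sketch below shows the general pattern; `wrap_deprecated()` and the exact warning text are hypothetical and may differ from what deprecated_function() emits.

```r
# Hypothetical helper: return a function that warns, then calls `func`.
wrap_deprecated <- function(func, name) {
  # Force the arguments so the closure captures their current values.
  force(func)
  force(name)
  function(...) {
    warning(sprintf("%s is deprecated.", name), call. = FALSE)
    func(...)
  }
}

old_sum <- wrap_deprecated(sum, "old_sum")
suppressWarnings(old_sum(1, 2))  # 3, with a deprecation warning when not suppressed
```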
A deprecated specification for a JumpStart model does not mean the whole model is deprecated. There may be more recent specifications available for this model. For example, all specifications before version `2.0.0` may be deprecated; in that case, the SDK raises this exception only when specifications `1.*` are accessed.
sagemaker.core::SagemakerError -> DeprecatedJumpStartModelError
new()
Instantiates DeprecatedJumpStartModelError exception.
DeprecatedJumpStartModelError$new( model_id = NULL, version = NULL, message = NULL )
model_id
(Optional[str]): model ID of deprecated JumpStart model. (Default: None).
version
(Optional[str]): version of deprecated JumpStart model. (Default: None).
message
(Optional[str]): error message
clone()
The objects of this class are cloneable with this method.
DeprecatedJumpStartModelError$clone(deep = FALSE)
deep
Whether to make a deep clone.
Download a single file from S3 into a local path.
download_file(bucket_name, path, target, sagemaker_session)
bucket_name |
(str): S3 bucket name |
path |
(str): file path within the bucket |
target |
(str): destination directory for the downloaded file. |
sagemaker_session |
(sagemaker.session.Session): a sagemaker session to interact with S3. |
Other sagemaker_utils: .aws_partition(), .download_files_under_prefix(), base_from_name(), base_name_from_image(), build_dict(), common_variables, create_tar_file(), download_folder(), get_config_value(), get_short_version(), name_from_base(), name_from_image(), regional_hostname(), repack_model(), retries(), sagemaker_short_timestamp(), sagemaker_timestamp(), secondary_training_status_changed(), secondary_training_status_message(), sts_regional_endpoint(), unique_name_from_base()
Download a single file from an S3 URL into a local path.
download_file_from_url(url, dst, sagemaker_session)
url |
(str): S3 URL of the file to download. |
dst |
(str): destination directory for the downloaded file. |
sagemaker_session |
(sagemaker.session.Session): a sagemaker session to interact with S3. |
Download a folder from S3 to a local path
download_folder(bucket_name, prefix, target, sagemaker_session)
bucket_name |
(str): S3 bucket name |
prefix |
(str): S3 prefix within the bucket that will be downloaded. Can be a single file. |
target |
(str): destination path where the downloaded items will be placed |
sagemaker_session |
(sagemaker.session.Session): a sagemaker session to interact with S3. |
Other sagemaker_utils: .aws_partition(), .download_files_under_prefix(), base_from_name(), base_name_from_image(), build_dict(), common_variables, create_tar_file(), download_file(), get_config_value(), get_short_version(), name_from_base(), name_from_image(), regional_hostname(), repack_model(), retries(), sagemaker_short_timestamp(), sagemaker_timestamp(), secondary_training_status_changed(), secondary_training_status_message(), sts_regional_endpoint(), unique_name_from_base()
This object stores the value itself as well as a timestamp so that this element can be invalidated if it becomes too old.
new()
Initialize an `Element` instance for `LRUCache`.
Element$new(value, creation_time)
value
(ValType): Value that is stored in cache.
creation_time
(datetime.datetime): Time at which cache item was created.
clone()
The objects of this class are cloneable with this method.
Element$clone(deep = FALSE)
deep
Whether to make a deep clone.
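The idea of pairing a cached value with its creation time, so that stale entries can be invalidated, can be sketched without R6. The functions `new_element()` and `is_expired()` are hypothetical, as is the `max_age_secs` expiry parameter; the manual does not describe how `LRUCache` decides that an element is too old.

```r
# Hypothetical sketch: a cache element is a value plus its creation time.
new_element <- function(value, creation_time = Sys.time()) {
  list(value = value, creation_time = creation_time)
}

# Assumption: an element is stale once it is older than max_age_secs.
is_expired <- function(element, max_age_secs, now = Sys.time()) {
  age <- as.numeric(difftime(now, element$creation_time, units = "secs"))
  age > max_age_secs
}

t0 <- as.POSIXct("2024-01-01 00:00:00", tz = "UTC")
el <- new_element("cached manifest", creation_time = t0)
is_expired(el, max_age_secs = 60, now = t0 + 30)   # FALSE
is_expired(el, max_age_secs = 60, now = t0 + 120)  # TRUE
```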
Evaluates model filter with cached model spec value, returns boolean.
evaluate_filter_expression(model_filter, cached_model_value)
model_filter |
(ModelFilter): The model filter for evaluation. |
cached_model_value |
(Any): The value in the model manifest/spec that should be used to evaluate the filter. |
Parse the model ID, return a tuple framework, task, rest-of-id.
extract_framework_task_model(model_id)
model_id |
(str): The model ID for which to extract the framework/task/model. |
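A minimal sketch of splitting a model ID into framework, task and the rest of the ID follows. It assumes that the parts are separated by hyphens and that IDs with fewer than three parts yield empty strings; `split_model_id()` is a hypothetical helper and the example ID is illustrative only.

```r
# Hypothetical helper: split "<framework>-<task>-<rest>" into its parts.
split_model_id <- function(model_id) {
  parts <- strsplit(model_id, "-", fixed = TRUE)[[1]]
  # Assumption: IDs with fewer than three parts cannot be parsed.
  if (length(parts) < 3) {
    return(list(framework = "", task = "", name = ""))
  }
  list(
    framework = parts[1],
    task = parts[2],
    # Everything after the second hyphen is kept intact.
    name = paste(parts[-(1:2)], collapse = "-")
  )
}

str(split_model_id("pytorch-ic-mobilenet-v2"))
```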
Amazon SageMaker channel configurations for file system data sources.
config
(dict[str, dict])
A SageMaker file system `DataSource`.
new()
Create a new file system input used by a SageMaker training job.
FileSystemInput$new( file_system_id, file_system_type = c("FSxLustre", "EFS"), directory_path, file_system_access_mode = c("ro", "rw"), content_type = NULL )
file_system_id
(str): An Amazon file system ID starting with 'fs-'.
file_system_type
(str): The type of file system used for the input. Valid values: 'EFS', 'FSxLustre'.
directory_path
(str): Absolute or normalized path to the root directory (mount point) in the file system. Reference: https://docs.aws.amazon.com/efs/latest/ug/mounting-fs.html and https://docs.aws.amazon.com/fsx/latest/LustreGuide/mount-fs-auto-mount-onreboot.html
file_system_access_mode
(str): Permissions for read and write. Valid values: 'ro' or 'rw'. Defaults to 'ro'.
content_type
:
format()
format class
FileSystemInput$format()
clone()
The objects of this class are cloneable with this method.
FileSystemInput$clone(deep = FALSE)
deep
Whether to make a deep clone.
Enum class for filter operators for JumpStart models.
FilterOperators
An object of class FilterOperators (inherits from Enum, environment) of length 4.
Extract the framework and Python version from the image name.
framework_name_from_image(image_uri)
image_uri |
(str): Image URI, which should be one of the following forms:
|
tuple: A tuple containing:
str: The framework name
str: The Python version
str: The image tag
str: If the TensorFlow image is script mode
Extract the framework version from the image tag.
framework_version_from_tag(image_tag)
image_tag |
(str): Image tag, which should take the form '<framework_version>-<device>-<py_version>' |
str: The framework version.
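Given the documented tag form '<framework_version>-<device>-<py_version>', the version can be pulled out with a regular expression. The sketch below assumes that the device is `cpu` or `gpu` and that the Python version looks like `py` followed by digits; `version_from_tag()` is a hypothetical helper that returns NULL for tags that do not match.

```r
# Hypothetical helper: extract the framework version from an image tag.
version_from_tag <- function(image_tag) {
  # Assumed form: "<framework_version>-<cpu|gpu>-py<digits>".
  pattern <- "^(.*)-(cpu|gpu)-(py[0-9]+)$"
  if (!grepl(pattern, image_tag, perl = TRUE)) return(NULL)
  sub(pattern, "\\1", image_tag, perl = TRUE)
}

version_from_tag("1.8.0-gpu-py3")  # "1.8.0"
version_from_tag("latest")         # NULL
```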
Return an Instance of :class:'sagemaker.local.data.BatchStrategy' according to 'strategy'
get_batch_strategy_instance(strategy, splitter)
strategy |
(str): Either 'SingleRecord' or 'MultiRecord' |
splitter |
(:class:'sagemaker.local.data.Splitter'): splitter to get the data from. |
:class:'sagemaker.local.data.BatchStrategy': an Instance of a BatchStrategy
The instance can handle the provided data_source URI. data_source can be either file:// or s3://
get_data_source_instance(data_source, sagemaker_session)
data_source |
(str): a valid URI that points to a data source. |
sagemaker_session |
(:class:'sagemaker.session.Session'): a SageMaker Session to interact with S3 if required. |
sagemaker.local.data.DataSource: an Instance of a Data Source
Return the role ARN whose credentials are used to call the API.
get_execution_role(sagemaker_session = NULL)
sagemaker_session |
(Session): Current sagemaker session |
(str): The role ARN
Keys are `JumpStartVersionedModelId` objects, values are `JumpStartModelHeader` objects.
get_formatted_manifest(manifest)
manifest |
: Placeholder |
If no URIs belong to JumpStart, return None.
get_jumpstart_base_name_if_jumpstart_model(uris)
uris |
(Optional[str]): URI to test for association with JumpStart. |
Returns regionalized content bucket name for JumpStart.
get_jumpstart_content_bucket(region)
region |
(str): AWS region |
Returns formatted string indicating where JumpStart is launched.
get_jumpstart_launched_regions_message()
Get arguments for create_model_package method.
get_model_package_args( content_types, response_types, inference_instances, transform_instances, model_package_name = NULL, model_package_group_name = NULL, model_data = NULL, image_uri = NULL, model_metrics = NULL, metadata_properties = NULL, marketplace_cert = FALSE, approval_status = NULL, description = NULL, tags = NULL, container_def_list = NULL, drift_check_baselines = NULL )
content_types |
(list): The supported MIME types for the input data. |
response_types |
(list): The supported MIME types for the output data. |
inference_instances |
(list): A list of the instance types that are used to generate inferences in real-time. |
transform_instances |
(list): A list of the instance types on which a transformation job can be run or on which an endpoint can be deployed. |
model_package_name |
(str): Model Package name, exclusive to 'model_package_group_name', using 'model_package_name' makes the Model Package un-versioned (default: None). |
model_package_group_name |
(str): Model Package Group name, exclusive to 'model_package_name', using 'model_package_group_name' makes the Model Package versioned (default: None). |
model_data |
: Placeholder |
image_uri |
(str): Inference image uri for the container. Model class' self.image will be used if it is None (default: None). |
model_metrics |
(ModelMetrics): ModelMetrics object (default: None). |
metadata_properties |
(MetadataProperties): MetadataProperties object (default: None). |
marketplace_cert |
(bool): A boolean value indicating if the Model Package is certified for AWS Marketplace (default: False). |
approval_status |
(str): Model Approval Status, values can be "Approved", "Rejected", or "PendingManualApproval" (default: "PendingManualApproval"). |
description |
(str): Model Package description (default: None). |
tags |
: Placeholder |
container_def_list |
(list): A list of container definitions. |
drift_check_baselines |
(DriftCheckBaselines): DriftCheckBaselines object (default: None). |
list: A dictionary of method argument names and values.
Retrieve web url describing pretrained model.
get_model_url( model_id, model_version, region = JUMPSTART_DEFAULT_REGION_NAME() )
model_id |
(str): The model ID for which to retrieve the url. |
model_version |
(str): The model version for which to retrieve the url. |
region |
(str): Optional. The region from which to retrieve metadata. (Default: JUMPSTART_DEFAULT_REGION_NAME()) |
Get the model parallelism parameters provided by the user.
get_mp_parameters(distribution)
distribution |
: distribution dictionary defined by the user. |
params: dictionary containing model parallelism parameters used for training.
If the sagemaker library version has not been set, this function calls `parse_sagemaker_version` to retrieve the version and set the constant.
get_sagemaker_version()
Return short version in the format of x.x
get_short_version(framework_version)
framework_version |
(str): The version string to be shortened. |
str: The short version string
Other sagemaker_utils: .aws_partition(), .download_files_under_prefix(), base_from_name(), base_name_from_image(), build_dict(), common_variables, create_tar_file(), download_file(), download_folder(), get_config_value(), name_from_base(), name_from_image(), regional_hostname(), repack_model(), retries(), sagemaker_short_timestamp(), sagemaker_timestamp(), secondary_training_status_changed(), secondary_training_status_message(), sts_regional_endpoint(), unique_name_from_base()
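The "x.x" shortening performed by get_short_version() can be pictured with the following base R sketch. The helper name `short_version_sketch` is hypothetical and the code is an assumption about the behaviour, not the package's source.

```r
# Hypothetical sketch: keep only the first two dot-separated components.
short_version_sketch <- function(framework_version) {
  parts <- strsplit(framework_version, ".", fixed = TRUE)[[1]]
  paste(head(parts, 2), collapse = ".")
}

short_version_sketch("1.15.2")
```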
The instance returned is according to the specified 'split_type'.
get_splitter_instance(split_type = NULL)
split_type |
(str): either 'Line' or 'RecordIO'. Can be left as None to signal no data split will happen. |
sagemaker.core::Splitter: an instance of a Splitter.
Return the value of a tag whose key matches the given “tag_key“.
get_tag_value(tag_key, tag_array)
tag_key |
(str): AWS tag for which to search. |
tag_array |
(List[Dict[str, str]]): List of AWS tags, each formatted as dicts. |
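A minimal sketch of the lookup, assuming each tag is a list with `Key` and `Value` elements (the standard AWS tag shape). The function name `get_tag_value_sketch` is hypothetical, and returning `NULL` when no tag matches is an assumption made for this example; the package may handle a missing key differently.

```r
# Hypothetical sketch, not the package implementation.
get_tag_value_sketch <- function(tag_key, tag_array) {
  for (tag in tag_array) {
    if (identical(tag$Key, tag_key)) {
      return(tag$Value)
    }
  }
  NULL  # assumption: no match yields NULL
}

tags <- list(
  list(Key = "project", Value = "demo"),
  list(Key = "team", Value = "ml")
)
get_tag_value_sketch("team", tags)
```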
Clones the Git repo containing the training code and serving code. This method also validates “git_config“, and sets “entry_point“, “source_dir“ and “dependencies“ to the right file or directory in the cloned repo.
git_clone_repo(git_config, entry_point, source_dir = NULL, dependencies = NULL)
git_config |
(dict[str, str]): Git configurations used for cloning files, including “repo“, “branch“, “commit“, “2FA_enabled“, “username“, “password“ and “token“. The “repo“ field is required; all other fields are optional. “repo“ specifies the Git repository where your training script is stored. If you don't provide “branch“, the default value 'master' is used. If you don't provide “commit“, the latest commit in the specified branch is used. “2FA_enabled“, “username“, “password“ and “token“ are used for authentication. If “2FA_enabled“ is not provided, 2FA is treated as disabled. For GitHub and GitHub-like repos, when SSH URLs are provided, it doesn't matter whether 2FA is enabled or disabled; you should either have no passphrase for the SSH key pairs, or have the ssh-agent configured so that you are not prompted for an SSH passphrase when you run 'git clone' with SSH URLs. When https URLs are provided: if 2FA is disabled, either token or username+password is used for authentication if provided (token takes priority); if 2FA is enabled, only token is used for authentication if provided. If the required authentication info is not provided, the SDK tries to use local credential storage to authenticate. If that also fails, an error is thrown. For CodeCommit repos, 2FA is not supported, so '2FA_enabled' should not be provided. CodeCommit has no tokens, so 'token' should not be provided either. When 'repo' is an SSH URL, the requirements are the same as for GitHub-like repos. When 'repo' is an https URL, username+password is used for authentication if provided; otherwise, the SDK tries to use either the CodeCommit credential helper or local credential storage for authentication. |
entry_point |
(str): A relative location to the Python source file which should be executed as the entry point to training or model hosting in the Git repo. |
source_dir |
(str): A relative location to a directory with other training or model hosting source code dependencies aside from the entry point file in the Git repo (default: None). The structure within this directory is preserved when training on Amazon SageMaker. |
dependencies |
(list[str]): A list of relative locations to directories with any additional libraries that will be exported to the container in the Git repo (default: []). |
dict: A dict that contains the updated values of entry_point, source_dir and dependencies.
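For orientation, a “git_config“ built from the fields documented above could look like the sketch below. The repository URL, branch, and file paths are placeholders invented for this example, and the commented call only mirrors the usage line; it is not run here.

```r
# Illustrative configuration only; all values are placeholders.
git_config <- list(
  repo = "https://github.com/example-org/example-repo.git",  # required
  branch = "main"                                            # optional
)

# A call would then follow the documented signature, for example:
# updated <- git_clone_repo(
#   git_config,
#   entry_point = "train.py",   # hypothetical path inside the repo
#   source_dir = "src"          # hypothetical directory inside the repo
# )
```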
Possible modes for validating hyperparameters.
HyperparameterValidationMode
An object of class HyperparameterValidationMode (inherits from Enum, environment) of length 3.
Identity operator class for filtering JumpStart content.
sagemaker.core::Operand
-> sagemaker.core::Operator
-> Identity
new()
Instantiates Identity object.
Identity$new(operand)
operand
(Union[Operand, str]): Operand for identity operation.
eval()
Evaluates operator.
Identity$eval()
clone()
The objects of this class are cloneable with this method.
Identity$clone(deep = FALSE)
deep
Whether to make a deep clone.
Class to create and format sagemaker docker images stored in ECR
retrieve()
Retrieves the ECR URI for the Docker image matching the given arguments of inbuilt AWS Sagemaker models.
ImageUris$retrieve( framework, region, version = NULL, py_version = NULL, instance_type = NULL, accelerator_type = NULL, image_scope = NULL, container_version = NULL, distribution = NULL, base_framework_version = NULL, training_compiler_config = NULL, model_id = NULL, model_version = NULL, tolerate_vulnerable_model = FALSE, tolerate_deprecated_model = FALSE, sdk_version = NULL, inference_tool = NULL, serverless_inference_config = NULL )
framework
(str): The name of the framework or algorithm.
region
(str): The AWS region.
version
(str): The framework or algorithm version. This is required if there is more than one supported version for the given framework or algorithm.
py_version
(str): The Python version. This is required if there is more than one supported Python version for the given framework version.
instance_type
(str): The SageMaker instance type. For supported types, see https://aws.amazon.com/sagemaker/pricing/instance-types. This is required if there are different images for different processor types.
accelerator_type
(str): Elastic Inference accelerator type. For more, see https://docs.aws.amazon.com/sagemaker/latest/dg/ei.html.
image_scope
(str): The image type, i.e. what it is used for. Valid values: "training", "inference", "eia". If “accelerator_type“ is set, “image_scope“ is ignored.
container_version
(str): the version of docker image
distribution
(dict): A dictionary with information on how to run distributed training (default: None).
base_framework_version
(str):
training_compiler_config
(:class:'~sagemaker.training_compiler.TrainingCompilerConfig'): A configuration class for the SageMaker Training Compiler (default: None).
model_id
(str): The JumpStart model ID for which to retrieve the image URI (default: None).
model_version
(str): The version of the JumpStart model for which to retrieve the image URI (default: None).
tolerate_vulnerable_model
(bool): “True“ if vulnerable versions of model specifications should be tolerated without an exception raised. If “False“, raises an exception if the script used by this version of the model has dependencies with known security vulnerabilities. (Default: False).
tolerate_deprecated_model
(bool): True if deprecated versions of model specifications should be tolerated without an exception raised. If False, raises an exception if the version of the model is deprecated. (Default: False).
sdk_version
(str): the version of python-sdk that will be used in the image retrieval. (default: None).
inference_tool
(str): the tool that will be used to aid in the inference. Valid values: "neuron", None (default: None).
serverless_inference_config
(sagemaker.core::ServerlessInferenceConfig): Specifies configuration related to the serverless endpoint. Instance type is not provided in serverless inference, so this is used to determine the processor type.
str: the ECR URI for the corresponding SageMaker Docker image.
get_training_image_uri()
Retrieves the image URI for training.
ImageUris$get_training_image_uri( region, framework, framework_version = NULL, py_version = NULL, image_uri = NULL, distribution = NULL, compiler_config = NULL, tensorflow_version = NULL, pytorch_version = NULL, instance_type = NULL )
region
(str): The AWS region to use for image URI.
framework
(str): The framework for which to retrieve an image URI.
framework_version
(str): The framework version for which to retrieve an image URI (default: NULL).
py_version
(str): The python version to use for the image (default: NULL).
image_uri
(str): If an image URI is supplied, it is returned (default: NULL).
distribution
(dict): A dictionary with information on how to run distributed training (default: NULL).
compiler_config
(:class:'~sagemaker.training_compiler.TrainingCompilerConfig'): A configuration class for the SageMaker Training Compiler (default: NULL).
tensorflow_version
(str): The version of TensorFlow to use. (default: NULL)
pytorch_version
(str): The version of PyTorch to use. (default: NULL)
instance_type
(str): The instance type to use. (default: NULL)
str: The image URI string.
format()
format class
ImageUris$format()
clone()
The objects of this class are cloneable with this method.
ImageUris$clone(deep = FALSE)
deep
Whether to make a deep clone.
This method returns True if both arguments are not None, False if both arguments are None, and raises an exception if exactly one argument is None.
is_jumpstart_model_input(model_id, version)
model_id |
(str): Optional. Model ID of the JumpStart model. |
version |
(str): Optional. Version of the JumpStart model. |
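The three-way rule described above (both set, neither set, or exactly one set) can be sketched in base R as follows. The function name `is_jumpstart_input_sketch` and the error message are hypothetical; this is an illustration of the rule, not the package's source.

```r
# Hypothetical sketch of the both-or-neither check.
is_jumpstart_input_sketch <- function(model_id = NULL, version = NULL) {
  has_id <- !is.null(model_id)
  has_version <- !is.null(version)
  # Exactly one of the two supplied is treated as a usage error.
  if (xor(has_id, has_version)) {
    stop("Specify both `model_id` and `version`, or neither.")
  }
  has_id && has_version
}

is_jumpstart_input_sketch("some-model-id", "1.0.0")
is_jumpstart_input_sketch()
```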
Returns True if URI corresponds to a JumpStart-hosted model.
is_jumpstart_model_uri(uri)
uri |
(Optional[str]): uri for inference/training job. |
Check if list is named
is_list_named(x)
x |
: object |
Other r_utils: Enum(), IsSubR6Class(), cls_help(), format_class(), is_tarfile(), islistempty(), pkg_method(), retry_api_call(), rsplit(), split_str(), write_bin()
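One plausible reading of "named" for is_list_named() is that the object is a list and every element has a non-empty name. The sketch below encodes that reading; both the criterion and the name `is_list_named_sketch` are assumptions, not the package's source.

```r
# Hypothetical sketch: a list counts as named only if every element has a non-empty name.
is_list_named_sketch <- function(x) {
  is.list(x) && !is.null(names(x)) && all(nzchar(names(x)))
}

is_list_named_sketch(list(a = 1, b = 2))
is_list_named_sketch(list(a = 1, 2))
```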
Check the magic bytes at offset 257. If they match "ustar" including the null terminator, the file is probably a tar. https://www.gnu.org/software/tar/manual/html_node/Standard.html
is_tarfile(path)
path |
A character string giving the path to the file to check. |
Other r_utils: Enum(), IsSubR6Class(), cls_help(), format_class(), is_list_named(), islistempty(), pkg_method(), retry_api_call(), rsplit(), split_str(), write_bin()
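The magic-byte check described for is_tarfile() can be sketched in base R as below. Offset 257 is zero-based, so the six bytes of interest are positions 258 to 263 in R's one-based indexing. The name `is_tarfile_sketch` is hypothetical and the code is an illustration, not the package's source; note that requiring the null terminator means archives whose magic field is "ustar" followed by spaces would not match this particular check.

```r
# Hypothetical sketch: compare bytes 257..262 (zero-based) with "ustar\0".
is_tarfile_sketch <- function(path) {
  con <- file(path, "rb")
  on.exit(close(con))
  header <- readBin(con, what = "raw", n = 263)
  if (length(header) < 263) {
    return(FALSE)  # too short to contain the magic field
  }
  identical(header[258:263], c(charToRaw("ustar"), as.raw(0)))
}

# Build a minimal 512-byte header block carrying the magic bytes.
tar_like <- tempfile()
block <- raw(512)
block[258:263] <- c(charToRaw("ustar"), as.raw(0))
writeBin(block, tar_like)
is_tarfile_sketch(tar_like)
```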
Validation check of an S3 URI.
is.s3_uri(x)
x |
(str): Character string to check for being an S3 URI. |
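A simple way to picture such a check is a pattern match on the `s3://` scheme followed by a bucket name. The exact rule the package applies is not stated here, so the pattern and the name `is_s3_uri_sketch` below are assumptions for illustration.

```r
# Hypothetical sketch: single string, "s3://" scheme, non-empty bucket, optional key.
is_s3_uri_sketch <- function(x) {
  is.character(x) && length(x) == 1 && grepl("^s3://[^/]+(/.*)?$", x)
}

is_s3_uri_sketch("s3://my-bucket/path/to/model.tar.gz")
is_s3_uri_sketch("https://example.com/model.tar.gz")
```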
Data class for the s3 cached content keys.
sagemaker.core::JumpStartDataHolderType
-> JumpStartCachedS3ContentKey
new()
Instantiates JumpStartCachedS3ContentKey object.
JumpStartCachedS3ContentKey$new(file_type, s3_key)
file_type
(JumpStartS3FileType): JumpStart file type.
s3_key
(str): object key in s3.
clone()
The objects of this class are cloneable with this method.
JumpStartCachedS3ContentKey$clone(deep = FALSE)
deep
Whether to make a deep clone.
Data class for the s3 cached content values.
sagemaker.core::JumpStartDataHolderType
-> JumpStartCachedS3ContentValue
new()
Instantiates JumpStartCachedS3ContentValue object.
JumpStartCachedS3ContentValue$new(formatted_content, md5_hash = NULL)
formatted_content
(Union[Dict[JumpStartVersionedModelId, JumpStartModelHeader], JumpStartModelSpecs]): Formatted content for model specs and mappings from versioned model IDs to specs.
md5_hash
(str): md5_hash for stored file content from s3.
clone()
The objects of this class are cloneable with this method.
JumpStartCachedS3ContentValue$clone(deep = FALSE)
deep
Whether to make a deep clone.
Allows objects to be added to dicts and sets, and improves string representation. This class overrides the “__eq__“ and “__hash__“ methods so that different objects with the same attributes/types can be compared.
format()
Returns “__repr__“ string of object.
JumpStartDataHolderType$format()
clone()
The objects of this class are cloneable with this method.
JumpStartDataHolderType$clone(deep = FALSE)
deep
Whether to make a deep clone.
Data class for JumpStart ECR specs.
sagemaker.core::JumpStartDataHolderType
-> JumpStartECRSpecs
new()
Initializes a JumpStartECRSpecs object from its json representation.
JumpStartECRSpecs$new(spec)
spec
(Dict[str, Any]): Dictionary representation of spec.
from_json()
Sets fields in object based on json.
JumpStartECRSpecs$from_json(json_obj)
json_obj
(Dict[str, Any]): Dictionary representation of spec.
to_json()
Returns json representation of JumpStartECRSpecs object in list format.
JumpStartECRSpecs$to_json()
clone()
The objects of this class are cloneable with this method.
JumpStartECRSpecs$clone(deep = FALSE)
deep
Whether to make a deep clone.
Data class for JumpStart environment variable definitions in the hosting container.
sagemaker.core::JumpStartDataHolderType
-> JumpStartEnvironmentVariable
new()
Initializes a JumpStartEnvironmentVariable object from its json representation.
JumpStartEnvironmentVariable$new(spec)
spec
(Dict[str, Any]): Dictionary representation of environment variable.
from_json()
Sets fields in object based on json.
JumpStartEnvironmentVariable$from_json(json_obj)
json_obj
(Dict[str, Any]): Dictionary representation of environment variable.
to_json()
Returns json representation of JumpStartEnvironmentVariable object.
JumpStartEnvironmentVariable$to_json()
clone()
The objects of this class are cloneable with this method.
JumpStartEnvironmentVariable$clone(deep = FALSE)
deep
Whether to make a deep clone.
Data class for JumpStart hyperparameter definition in the training container.
sagemaker.core::JumpStartDataHolderType
-> JumpStartHyperparameter
new()
Initializes a JumpStartHyperparameter object from its json representation.
JumpStartHyperparameter$new(spec)
spec
(Dict[str, Any]): Dictionary representation of hyperparameter.
from_json()
Sets fields in object based on json.
JumpStartHyperparameter$from_json(json_obj)
json_obj
(Dict[str, Any]): Dictionary representation of hyperparameter.
to_json()
Returns json representation of JumpStartHyperparameter object.
JumpStartHyperparameter$to_json()
clone()
The objects of this class are cloneable with this method.
JumpStartHyperparameter$clone(deep = FALSE)
deep
Whether to make a deep clone.
Exception raised for bad hyperparameters of a JumpStart model.
sagemaker.core::SagemakerError
-> JumpStartHyperparametersError
clone()
The objects of this class are cloneable with this method.
JumpStartHyperparametersError$clone(deep = FALSE)
deep
Whether to make a deep clone.
Data class for launched region info.
sagemaker.core::JumpStartDataHolderType
-> JumpStartLaunchedRegionInfo
content_bucket
Name of JumpStart s3 content bucket associated with region.
region_name
Name of JumpStart launched region.
new()
Instantiates JumpStartLaunchedRegionInfo object.
JumpStartLaunchedRegionInfo$new(content_bucket, region_name)
content_bucket
(str): Name of JumpStart s3 content bucket associated with region.
region_name
(str): Name of JumpStart launched region.
clone()
The objects of this class are cloneable with this method.
JumpStartLaunchedRegionInfo$clone(deep = FALSE)
deep
Whether to make a deep clone.
Data class for JumpStart model header.
sagemaker.core::JumpStartDataHolderType
-> JumpStartModelHeader
new()
Initializes a JumpStartModelHeader object from its json representation.
JumpStartModelHeader$new(header)
header
(Dict[str, str]): Dictionary representation of header.
to_json()
Returns json representation of JumpStartModelHeader object in list format.
JumpStartModelHeader$to_json()
from_json()
Sets fields in object based on json of header.
JumpStartModelHeader$from_json(json_obj)
json_obj
(Dict[str, str]): Dictionary representation of header.
clone()
The objects of this class are cloneable with this method.
JumpStartModelHeader$clone(deep = FALSE)
deep
Whether to make a deep clone.
The manifest and specs associated with JumpStart models provide the information necessary for launching JumpStart models from the SageMaker SDK.
new()
Initialize a “JumpStartModelsCache“ instance.
JumpStartModelsCache$new( region = JUMPSTART_DEFAULT_REGION_NAME(), max_s3_cache_items = JUMPSTART_DEFAULT_MAX_S3_CACHE_ITEMS, s3_cache_expiration_horizon = JUMPSTART_DEFAULT_S3_CACHE_EXPIRATION_HORIZON, max_semantic_version_cache_items = JUMPSTART_DEFAULT_MAX_SEMANTIC_VERSION_CACHE_ITEMS, semantic_version_cache_expiration_horizon = JUMPSTART_DEFAULT_SEMANTIC_VERSION_CACHE_EXPIRATION_HORIZON, manifest_file_s3_key = JUMPSTART_DEFAULT_MANIFEST_FILE_S3_KEY, s3_bucket_name = NULL )
region
(str): AWS region to associate with cache. Default: region associated with boto3 session.
max_s3_cache_items
(int): Maximum number of items to store in s3 cache. Default: 20.
s3_cache_expiration_horizon
(datetime.timedelta): Maximum time to hold items in s3 cache before invalidation. Default: 6 hours.
max_semantic_version_cache_items
(int): Maximum number of items to store in semantic version cache. Default: 20.
semantic_version_cache_expiration_horizon
(datetime.timedelta): Maximum time to hold items in semantic version cache before invalidation. Default: 6 hours.
manifest_file_s3_key
(str): The key in S3 corresponding to the sdk metadata manifest.
s3_bucket_name
(Optional[str]): S3 bucket to associate with cache. Default: JumpStart-hosted content bucket for region.
set_region()
Set region for cache. Clears cache after new region is set.
JumpStartModelsCache$set_region(region)
region
AWS region to associate with cache.
get_region()
Return region for cache.
JumpStartModelsCache$get_region()
set_manifest_file_s3_key()
Set manifest file s3 key. Clears cache after new key is set.
JumpStartModelsCache$set_manifest_file_s3_key(key)
key
(str): The key in S3 corresponding to the sdk metadata manifest.
get_manifest_file_s3_key()
Return manifest file s3 key for cache.
JumpStartModelsCache$get_manifest_file_s3_key()
set_s3_bucket_name()
Set s3 bucket used for cache.
JumpStartModelsCache$set_s3_bucket_name()
s3_bucket_name
(str): S3 bucket to associate with cache.
get_bucket()
Return bucket used for cache.
JumpStartModelsCache$get_bucket()
get_manifest()
Return entire JumpStart models manifest.
JumpStartModelsCache$get_manifest()
get_header()
Return header for a given JumpStart model ID and semantic version.
JumpStartModelsCache$get_header()
model_id
(str): model ID for which to get a header.
semantic_version_str
(str): The semantic version for which to get a header.
get_specs()
Return specs for a given JumpStart model ID and semantic version.
JumpStartModelsCache$get_specs()
model_id
(str): model ID for which to get specs.
semantic_version_str
(str): The semantic version for which to get specs.
clear()
Clears the model ID/version and s3 cache.
JumpStartModelsCache$clear()
clone()
The objects of this class are cloneable with this method.
JumpStartModelsCache$clone(deep = FALSE)
deep
Whether to make a deep clone.
Data class for JumpStart model specs.
sagemaker.core::JumpStartDataHolderType
-> JumpStartModelSpecs
new()
Initializes a JumpStartModelSpecs object from its json representation.
JumpStartModelSpecs$new(spec)
spec
(Dict[str, Any]): Dictionary representation of spec.
from_json()
Sets fields in object based on json of spec.
JumpStartModelSpecs$from_json(json_obj)
json_obj
(Dict[str, Any]): Dictionary representation of spec.
to_json()
Returns json representation of JumpStartModelSpecs object.
JumpStartModelSpecs$to_json()
clone()
The objects of this class are cloneable with this method.
JumpStartModelSpecs$clone(deep = FALSE)
deep
Whether to make a deep clone.
Type of files published in JumpStart S3 distribution buckets.
JumpStartS3FileType
An object of class JumpStartS3FileType (inherits from Enum, environment) of length 2.
Enum class for JumpStart script scopes.
JumpStartScriptScope
An object of class JumpStartScriptScope (inherits from Enum, environment) of length 2.
Enum class for tag keys to apply to JumpStart models.
JumpStartTag
An object of class JumpStartTag (inherits from Enum, environment) of length 4.
Data class for versioned model IDs.
sagemaker.core::JumpStartDataHolderType
-> JumpStartVersionedModelId
new()
Instantiates JumpStartVersionedModelId object.
JumpStartVersionedModelId$new(model_id, version)
model_id
(str): JumpStart model ID.
version
(str): JumpStart model version.
clone()
The objects of this class are cloneable with this method.
JumpStartVersionedModelId$clone(deep = FALSE)
deep
Whether to make a deep clone.
Split records by new line.
sagemaker.core::Splitter
-> LineSplitter
split()
Split a file into records using a specific strategy. This LineSplitter splits the file on each line break.
LineSplitter$split(file)
file
(str): path to the file to split
list: for the individual records that were split from the file
clone()
The objects of this class are cloneable with this method.
LineSplitter$clone(deep = FALSE)
deep
Whether to make a deep clone.
List frameworks for JumpStart, and optionally apply filters to result.
list_jumpstart_frameworks( filter = Constant$new(BooleanValues$`TRUE`), region = JUMPSTART_DEFAULT_REGION_NAME() )
filter |
(Union[Operator, str]): Optional. The filter to apply to list frameworks. This can be either an “Operator“ type filter (e.g. “And("task == ic", "framework == pytorch")“), or simply a string filter which will get serialized into an Identity filter. (e.g. “"task == ic"“). If this argument is not supplied, all frameworks will be listed. (Default: Constant(BooleanValues$TRUE)). |
region |
(str): Optional. The AWS region from which to retrieve JumpStart metadata regarding models. (Default: JUMPSTART_DEFAULT_REGION_NAME()). |
List scripts for JumpStart, and optionally apply filters to result.
list_jumpstart_scripts( filter = Constant$new(BooleanValues$`TRUE`), region = JUMPSTART_DEFAULT_REGION_NAME() )
filter |
(Union[Operator, str]): Optional. The filter to apply to list scripts. This can be either an “Operator“ type filter (e.g. “And("task == ic", "framework == pytorch")“), or simply a string filter which will get serialized into an Identity filter. (e.g. “"task == ic"“). If this argument is not supplied, all scripts will be listed. (Default: Constant(BooleanValues$TRUE)). |
region |
(str): Optional. The AWS region from which to retrieve JumpStart metadata regarding models. (Default: JUMPSTART_DEFAULT_REGION_NAME()). |
List tasks for JumpStart, and optionally apply filters to result.
list_jumpstart_tasks( filter = Constant$new(BooleanValues$`TRUE`), region = JUMPSTART_DEFAULT_REGION_NAME() )
filter |
(Union[Operator, str]): Optional. The filter to apply to list tasks. This can be either an “Operator“ type filter (e.g. “And("task == ic", "framework == pytorch")“), or simply a string filter which will get serialized into an Identity filter. (e.g. “"task == ic"“). If this argument is not supplied, all tasks will be listed. (Default: Constant(BooleanValues$'TRUE')). |
region |
(str): Optional. The AWS region from which to retrieve JumpStart metadata regarding models. (Default: JUMPSTART_DEFAULT_REGION_NAME()). |
Represents a data source within the local filesystem.
sagemaker.core::DataSource
-> LocalFileDataSource
new()
Initialize LocalFileDataSource class
LocalFileDataSource$new(root_path)
root_path
(str):
get_file_list()
Retrieve the list of absolute paths to all the files in this data source.
LocalFileDataSource$get_file_list()
List[str] List of absolute paths.
get_root_dir()
Retrieve the absolute path to the root directory of this data source.
LocalFileDataSource$get_root_dir()
str: absolute path to the root directory of this data source.
clone()
The objects of this class are cloneable with this method.
LocalFileDataSource$clone(deep = FALSE)
deep
Whether to make a deep clone.
A SageMaker Runtime client that calls a local endpoint only.
new()
Initializes a LocalSageMakerRuntimeClient.
LocalSagemakerRuntimeClient$new(config = NULL)
config
(list): Optional configuration for this client. In particular only the local port is read.
invoke_endpoint()
Invoke the endpoint.
LocalSagemakerRuntimeClient$invoke_endpoint( Body, EndpointName, ContentType = NULL, Accept = NULL, CustomAttributes = NULL, TargetModel = NULL, TargetVariant = NULL, InferenceId = NULL )
Body
: Input data for which you want the model to provide inference.
EndpointName
: The name of the endpoint that you specified when you created the endpoint using the CreateEndpoint API.
ContentType
: The MIME type of the input data in the request body (Default value = None)
Accept
: The desired MIME type of the inference in the response (Default value = None)
CustomAttributes
: Provides additional information about a request for an inference submitted to a model hosted at an Amazon SageMaker endpoint (Default value = None)
TargetModel
: The model to request for inference when invoking a multi-model endpoint (Default value = None)
TargetVariant
: Specify the production variant to send the inference request to when invoking an endpoint that is running two or more variants (Default value = None)
InferenceId
: If you provide a value, it is added to the captured data when you enable data capture on the endpoint (Default value = None)
object: Inference for the given input.
clone()
The objects of this class are cloneable with this method.
LocalSagemakerRuntimeClient$clone(deep = FALSE)
deep
Whether to make a deep clone.
A SageMaker “Session“ class for Local Mode.
sagemaker.core::Session
-> LocalSession
sagemaker.core::Session$account_id()
sagemaker.core::Session$auto_ml()
sagemaker.core::Session$compile_model()
sagemaker.core::Session$create_endpoint()
sagemaker.core::Session$create_endpoint_config()
sagemaker.core::Session$create_endpoint_config_from_existing()
sagemaker.core::Session$create_feature_group()
sagemaker.core::Session$create_model()
sagemaker.core::Session$create_model_from_job()
sagemaker.core::Session$create_model_package_from_algorithm()
sagemaker.core::Session$create_model_package_from_containers()
sagemaker.core::Session$create_monitoring_schedule()
sagemaker.core::Session$create_tuning_job()
sagemaker.core::Session$default_bucket()
sagemaker.core::Session$delete_endpoint()
sagemaker.core::Session$delete_endpoint_config()
sagemaker.core::Session$delete_feature_group()
sagemaker.core::Session$delete_model()
sagemaker.core::Session$delete_monitoring_schedule()
sagemaker.core::Session$describe_auto_ml_job()
sagemaker.core::Session$describe_feature_group()
sagemaker.core::Session$describe_model()
sagemaker.core::Session$describe_monitoring_schedule()
sagemaker.core::Session$describe_processing_job()
sagemaker.core::Session$describe_training_job()
sagemaker.core::Session$describe_transform_job()
sagemaker.core::Session$describe_tuning_job()
sagemaker.core::Session$download_athena_query_result()
sagemaker.core::Session$download_data()
sagemaker.core::Session$endpoint_from_job()
sagemaker.core::Session$endpoint_from_model_data()
sagemaker.core::Session$endpoint_from_production_variants()
sagemaker.core::Session$expand_role()
sagemaker.core::Session$format()
sagemaker.core::Session$get_caller_identity_arn()
sagemaker.core::Session$get_query_execution()
sagemaker.core::Session$help()
sagemaker.core::Session$list_candidates()
sagemaker.core::Session$list_monitoring_executions()
sagemaker.core::Session$list_monitoring_schedules()
sagemaker.core::Session$list_s3_files()
sagemaker.core::Session$list_tags()
sagemaker.core::Session$logs_for_auto_ml_job()
sagemaker.core::Session$logs_for_processing_job()
sagemaker.core::Session$logs_for_transform_job()
sagemaker.core::Session$process()
sagemaker.core::Session$read_s3_file()
sagemaker.core::Session$start_monitoring_schedule()
sagemaker.core::Session$start_query_execution()
sagemaker.core::Session$stop_monitoring_schedule()
sagemaker.core::Session$stop_processing_job()
sagemaker.core::Session$stop_training_job()
sagemaker.core::Session$stop_transform_job()
sagemaker.core::Session$stop_tuning_job()
sagemaker.core::Session$train()
sagemaker.core::Session$transform()
sagemaker.core::Session$tune()
sagemaker.core::Session$update_endpoint()
sagemaker.core::Session$update_monitoring_schedule()
sagemaker.core::Session$update_training_job()
sagemaker.core::Session$upload_data()
sagemaker.core::Session$upload_string_as_file_body()
sagemaker.core::Session$wait_for_athena_query()
sagemaker.core::Session$wait_for_auto_ml_job()
sagemaker.core::Session$wait_for_compilation_job()
sagemaker.core::Session$wait_for_edge_packaging_job()
sagemaker.core::Session$wait_for_endpoint()
sagemaker.core::Session$wait_for_job()
sagemaker.core::Session$wait_for_model_package()
sagemaker.core::Session$wait_for_processing_job()
sagemaker.core::Session$wait_for_transform_job()
sagemaker.core::Session$wait_for_tuning_job()
sagemaker.core::Session$was_processing_job_successful()
new()
This class provides alternative Local Mode implementations for the functionality of :class:'~sagemaker.session.Session'.
LocalSession$new( paws_session = NULL, s3_endpoint_url = NULL, disable_local_code = FALSE )
paws_session
(PawsSession): The underlying AWS credentials passed to the paws SDK.
s3_endpoint_url
(str): Override the default endpoint URL for Amazon S3, if set (default: None).
disable_local_code
(bool): Set “True“ to override the default AWS configuration chain to disable the “local.local_code“ setting, which may not be supported for some SDK features (default: False).
logs_for_job()
A no-op method meant to override the sagemaker client.
LocalSession$logs_for_job(job_name, wait = FALSE, poll = 5, log_type = "All")
job_name
(str):
wait
(boolean): (Default value = False)
poll
(int): (Default value = 5)
log_type
(str):
clone()
The objects of this class are cloneable with this method.
LocalSession$clone(deep = FALSE)
deep
Whether to make a deep clone.
Other Session: PawsSession, Session
An LRU cache evicts items in order of use, so that the least recently used items are the first to be removed. If you attempt to retrieve a cache item that is older than the expiration time, the item is invalidated.
Element
Class describes the values in the cache
new()
Initialize an “LRUCache“ instance.
LRUCache$new(max_cache_items, expiration_horizon, retrieval_function)
max_cache_items
(int): Maximum number of items to store in cache.
expiration_horizon
(datetime.timedelta): Maximum time duration a cache element can persist before being invalidated.
retrieval_function
(Callable[[KeyType, ValType], ValType]): Function which maps cache keys and current values to new values. This function must have kwarg arguments “key“ and “value“. This function is called as a fallback when the key is not found in the cache, or a key has expired.
clear()
Deletes all elements from the cache.
LRUCache$clear()
get()
Returns value corresponding to key in cache.
LRUCache$get(key, data_source_fallback = TRUE)
key
(KeyType): Key in cache to retrieve.
data_source_fallback
(Optional[bool]): True if data should be retrieved if it's stale or not in cache. Default: True.
put()
Adds key to cache using “retrieval_function“. If value is provided, this is used instead. If the key is already in cache, the old element is removed. If the cache size exceeds the size limit, old elements are removed in order to meet the limit.
LRUCache$put(key, value = NULL)
key
(KeyType): Key in cache to retrieve.
value
(Optional[ValType]): Value to store for key. Default: None.
clone()
The objects of this class are cloneable with this method.
LRUCache$clone(deep = FALSE)
deep
Whether to make a deep clone.
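The interplay of eviction, expiration, and the retrieval fallback can be sketched with a small closure-based cache in base R. This is not the package's R6 class: the constructor name `new_lru_cache_sketch`, the use of seconds for `expiration_horizon`, the `size()` helper, and the error raised when fallback is disabled are all assumptions made for this illustration.

```r
# Hypothetical sketch of an LRU cache with expiry; not the package's LRUCache class.
new_lru_cache_sketch <- function(max_cache_items, expiration_horizon, retrieval_function) {
  store <- list()  # named list, ordered from least to most recently used

  put <- function(key, value = NULL) {
    # Fall back to the retrieval function when no value is supplied.
    if (is.null(value)) value <- retrieval_function(key = key, value = NULL)
    store[[key]] <<- NULL  # drop any existing element for this key
    store[[key]] <<- list(value = value, created = Sys.time())
    while (length(store) > max_cache_items) store[[1]] <<- NULL  # evict least recently used
    invisible(value)
  }

  get <- function(key, data_source_fallback = TRUE) {
    element <- store[[key]]
    age <- if (is.null(element)) 0 else
      as.numeric(difftime(Sys.time(), element$created, units = "secs"))
    expired <- !is.null(element) && age > expiration_horizon
    if (!is.null(element) && !expired) {
      store[[key]] <<- NULL      # re-insert to mark as most recently used
      store[[key]] <<- element
      return(element$value)
    }
    if (expired) store[[key]] <<- NULL  # invalidate stale element
    if (!data_source_fallback) stop("Key not in cache or expired: ", key)
    put(key)
  }

  list(
    get = get,
    put = put,
    clear = function() store <<- list(),
    size = function() length(store)
  )
}

# Usage: a cache of two items whose retrieval function upper-cases the key.
cache <- new_lru_cache_sketch(2, 3600, function(key, value) toupper(key))
first <- cache$get("a")
```

Re-inserting an element on every hit is what keeps the list ordered by recency, so evicting from the front removes the least recently used entry.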
The location returned is a potential concatenation of two parts: (1) code_location_key_prefix, if it exists; (2) model_name, or a name derived from the image.
model_code_key_prefix(code_location_key_prefix, model_name, image)
code_location_key_prefix |
(str): the s3 key prefix from code_location |
model_name |
(str): the name of the model |
image |
(str): the image from which a default name can be extracted |
str: the key prefix to be used in uploading code
For a given filter string "task == ic", the key corresponds to "task" and the value corresponds to "ic", with the operation being "==".
sagemaker.core::JumpStartDataHolderType
-> ModelFilter
new()
Instantiates “ModelFilter“ object.
ModelFilter$new(key, value, operator)
key
(str): The key in metadata for the model filter.
value
(str): The value of the metadata for the model filter.
operator
(str): The operator used in the model filter.
clone()
The objects of this class are cloneable with this method.
ModelFilter$clone(deep = FALSE)
deep
Whether to make a deep clone.
The ML framework as referenced in the prefix of the model ID. This value does not necessarily correspond to the container name.
ModelFramework
An object of class ModelFramework (inherits from Enum, environment) of length 8.
Can handle uploading to S3.
move_to_destination(source, destination, job_name, sagemaker_session)
source |
(str): root directory to move |
destination |
(str): file:// or s3:// URI that source will be moved to. |
job_name |
(str): SageMaker job name. |
sagemaker_session |
(sagemaker.Session): a sagemaker_session to interact with S3 if needed |
(str): destination URI
Will group up as many records as possible within the payload specified.
sagemaker.core::BatchStrategy
-> MultiRecordStrategy
pad()
Group together as many records as possible to fit in the specified size.
MultiRecordStrategy$pad(file, size = 6)
file
(str): file path to read the records from.
size
(int): maximum size in MB that each group of records will be fitted to. Passing 0 means unlimited size.
generator of records
clone()
The objects of this class are cloneable with this method.
MultiRecordStrategy$clone(deep = FALSE)
deep
Whether to make a deep clone.
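The grouping idea behind this strategy can be sketched in base R as follows. `group_records` is a hypothetical helper that works on a character vector of already-split records; the package's method reads records from a file through a splitter, which this sketch does not attempt to reproduce.

```r
# Hypothetical helper: concatenate records into groups that stay within `size_mb`.
group_records <- function(records, size_mb = 6) {
  limit <- size_mb * 1024^2            # limit in bytes; 0 is treated as unlimited
  groups <- character(0)
  buffer <- ""
  for (record in records) {
    candidate <- paste0(buffer, record)
    if (limit == 0 || nchar(candidate, type = "bytes") <= limit) {
      buffer <- candidate              # the record still fits in the current group
    } else {
      if (nzchar(buffer)) groups <- c(groups, buffer)  # flush the full group
      buffer <- record                 # start a new group with this record
    }
  }
  if (nzchar(buffer)) groups <- c(groups, buffer)
  groups
}

# With no size limit, all records end up in a single group.
group_records(c("row1\n", "row2\n", "row3\n"), size_mb = 0)
```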
This function assures that the total length of the resulting string is not longer than the specified max length, trimming the input parameter if necessary.
name_from_base(base, max_length = 63, short = FALSE)
base |
(str): String used as prefix to generate the unique name. |
max_length |
(int): Maximum length for the resulting string (default: 63). |
short |
(bool): Whether or not to use a truncated timestamp (default: False). |
str: Input parameter with appended timestamp.
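The trimming rule can be sketched as below. This is an illustration under assumptions, not the package's code: `name_from_base_sketch` is hypothetical, the base and timestamp are assumed to be joined with a hyphen, and the timestamp is passed in explicitly so the result is reproducible.

```r
# Hypothetical re-implementation for illustration only.
name_from_base_sketch <- function(base, timestamp, max_length = 63) {
  keep <- max_length - nchar(timestamp) - 1   # leave room for "-" and the timestamp
  paste0(substr(base, 1, keep), "-", timestamp)
}

# The base is cut to six characters so the whole name fits in 30 characters.
name_from_base_sketch("my-training-job", "2024-11-21-03-34-57-123", max_length = 30)
```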
Other sagemaker_utils: .aws_partition(), .download_files_under_prefix(), base_from_name(), base_name_from_image(), build_dict(), common_variables, create_tar_file(), download_file(), download_folder(), get_config_value(), get_short_version(), name_from_image(), regional_hostname(), repack_model(), retries(), sagemaker_short_timestamp(), sagemaker_timestamp(), secondary_training_status_changed(), secondary_training_status_message(), sts_regional_endpoint(), unique_name_from_base()
Create a training job name based on the image name and a timestamp.
name_from_image(image, max_length = 63L)
image |
(str): Image name. |
max_length |
(int): Maximum length for the resulting string (default: 63). |
str: Training job name using the algorithm from the image name and a timestamp.
Other sagemaker_utils: .aws_partition(), .download_files_under_prefix(), base_from_name(), base_name_from_image(), build_dict(), common_variables, create_tar_file(), download_file(), download_folder(), get_config_value(), get_short_version(), name_from_base(), regional_hostname(), repack_model(), retries(), sagemaker_short_timestamp(), sagemaker_timestamp(), secondary_training_status_changed(), secondary_training_status_message(), sts_regional_endpoint(), unique_name_from_base()
Does not split records, essentially reads the whole file.
sagemaker.core::Splitter
-> NoneSplitter
split()
Split a file into records using a specific strategy. For this NoneSplitter there is no actual split happening and the file is returned as a whole.
NoneSplitter$split(file)
file
(str): path to the file to split
generator for the individual records that were split from the file
clone()
The objects of this class are cloneable with this method.
NoneSplitter$clone(deep = FALSE)
deep
Whether to make a deep clone.
Not operator class for filtering JumpStart content.
sagemaker.core::Operand
-> sagemaker.core::Operator
-> Not
new()
Instantiates Not object.
Not$new(operand)
operand
(Operand): Operand for Not-ing.
eval()
Evaluates operator.
Not$eval()
clone()
The objects of this class are cloneable with this method.
Not$clone(deep = FALSE)
deep
Whether to make a deep clone.
Operand class for filtering JumpStart content.
resolved_value
Getter method for resolved_value.
new()
Initialize Operand Class
Operand$new(unresolved_value, resolved_value = BooleanValues$UNEVALUATED)
unresolved_value
(Any): The unresolved value of the operator.
resolved_value
(BooleanValues): The resolved value of the operator.
validate_operand()
Validate operand and return “Operand“ object.
Operand$validate_operand(operand)
operand
(Any): The operand to validate.
eval()
Evaluates operand.
Operand$eval()
format()
format class
Operand$format()
clone()
The objects of this class are cloneable with this method.
Operand$clone(deep = FALSE)
deep
Whether to make a deep clone.
An operator in this case corresponds to an operand that is also an operation. For example, given the expression “(True or True) and True“, “(True or True)“ is an operand to an “And“ expression, but is also itself an operator. “(True or True) and True“ would also be considered an operator.
sagemaker.core::Operand
-> Operator
new()
Initializes “Operator“ instance.
Operator$new( resolved_value = BooleanValues$UNEVALUATED, unresolved_value = NULL )
resolved_value
(BooleanValues): Optional. The resolved value of the operator. (Default: BooleanValues.UNEVALUATED).
unresolved_value
(Any): Optional. The unresolved value of the operator. (Default: None).
eval()
Evaluates operator.
Operator$eval()
clone()
The objects of this class are cloneable with this method.
Operator$clone(deep = FALSE)
deep
Whether to make a deep clone.
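The idea that an operator is itself an operand of an enclosing operator can be shown with a small recursive evaluator. This sketch uses plain nested lists and R logicals; `eval_expr` and the `op`/`operands` fields are hypothetical and do not mirror the package's R6 classes or its `BooleanValues` enum.

```r
# Hypothetical mini-evaluator for nested boolean expressions.
eval_expr <- function(expr) {
  if (is.logical(expr)) return(expr)               # a plain operand
  args <- vapply(expr$operands, eval_expr, logical(1))
  switch(expr$op,
    and = all(args),
    or  = any(args),
    not = !args[[1]],
    stop("unknown operator: ", expr$op)
  )
}

# (TRUE or TRUE) and TRUE: the inner `or` is an operand of the outer `and`.
inner <- list(op = "or", operands = list(TRUE, TRUE))
outer <- list(op = "and", operands = list(inner, TRUE))
eval_expr(outer)
```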
Or operator class for filtering JumpStart content.
sagemaker.core::Operand
-> sagemaker.core::Operator
-> Or
new()
Instantiates Or object.
Or$new(...)
...
(Operand): Operand for Or-ing.
eval()
Evaluates operator.
Or$eval()
clone()
The objects of this class are cloneable with this method.
Or$clone(deep = FALSE)
deep
Whether to make a deep clone.
Parse filter string and return a serialized “ModelFilter“ object.
parse_filter_string(filter_string)
filter_string |
(str): The filter string to be serialized to an object. |
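As a rough illustration of how a filter string breaks down into key, operator and value, the base R sketch below handles only `==` and `!=`. `parse_filter_sketch` is hypothetical and returns a plain list rather than a `ModelFilter` object; the set of operators the package accepts may differ.

```r
# Hypothetical parser: splits "key <op> value" into its three parts.
parse_filter_sketch <- function(filter_string) {
  pattern <- "^\\s*(.+?)\\s*(==|!=)\\s*(.+?)\\s*$"
  m <- regmatches(filter_string, regexec(pattern, filter_string, perl = TRUE))[[1]]
  if (length(m) == 0) stop("cannot parse filter string: ", filter_string)
  # m[1] is the full match; the capture groups follow.
  list(key = m[2], operator = m[3], value = m[4])
}

parse_filter_sketch("task == ic")
```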
Split an S3 URI into bucket and key.
split_s3_uri(url) parse_s3_url(url)
url |
(str): s3 uri to split into bucket and key |
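A minimal sketch of the split, assuming the result is a list with `bucket` and `key` elements (the package's actual return shape is not stated here). `split_s3_uri_sketch` is a hypothetical stand-in.

```r
# Hypothetical helper: separate an S3 URI into its bucket and key.
split_s3_uri_sketch <- function(url) {
  if (!startsWith(url, "s3://")) stop("expecting an 's3' scheme, got: ", url)
  path <- sub("^s3://", "", url)
  list(
    bucket = sub("/.*$", "", path),      # text before the first "/"
    key    = sub("^[^/]*/?", "", path)   # everything after it ("" if absent)
  )
}

split_s3_uri_sketch("s3://my-bucket/path/to/model.tar.gz")
```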
Function reads “__version__“ variable in “sagemaker“ module. In order to maintain compatibility with the “packaging.version“ library, versions with fewer than 2, or more than 3, periods are rejected. All versions that cannot be parsed with “packaging.version“ are also rejected.
parse_sagemaker_version()
Class to convert lists to Paws api calls.
to_camel_case()
Convert a snake case string to camel case.
PawsFunctions$to_camel_case(snake_case)
snake_case
(str): String to convert to camel case.
(str): String converted to camel case.
to_snake_case()
Convert a camel case string to snake case.
PawsFunctions$to_snake_case(name)
name
(str): String to convert to snake case.
(str): String converted to snake case.
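The two case conversions can be approximated with regular expressions in base R. The functions below are hypothetical sketches, and they assume that "camel case" here means the UpperCamelCase used in paws responses.

```r
# Hypothetical sketches of the two conversions.
to_camel_case_sketch <- function(snake_case) {
  # Upper-case the first letter and every letter that follows an underscore.
  gsub("(^|_)([a-z])", "\\U\\2", snake_case, perl = TRUE)
}

to_snake_case_sketch <- function(name) {
  # Insert "_" before each capital that follows a lower-case letter or digit.
  tolower(gsub("([a-z0-9])([A-Z])", "\\1_\\2", name, perl = TRUE))
}

to_camel_case_sketch("training_job_name")
to_snake_case_sketch("TrainingJobName")
```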
from_paws()
Convert an UpperCamelCase paws response to a snake case representation.
PawsFunctions$from_paws( paws_list, paws_name_to_member_name, member_name_to_type )
paws_list
(list[str, ?]): A paws response dictionary.
paws_name_to_member_name
(dict[str, str]): A map from paws name to snake_case name. If a given paws name is not in the map then a default mapping is applied.
member_name_to_type
(list[str, (ApiObject, boolean)]): A map from snake case name to a type description tuple. The first element of the tuple, a subclass of ApiObject, is the type of the mapped object. The second element indicates whether the mapped element is a collection or singleton.
list: Paws response in snake case.
to_paws()
Convert a dict of snake case names to values into a paws UpperCamelCase representation.
PawsFunctions$to_paws( member_vars, member_name_to_paws_name, member_name_to_type )
member_vars
(list[str, ?]): A map from snake case name to value.
member_name_to_paws_name
(list[str, ?]): A map from snake_case name to paws name.
member_name_to_type
(list): A map from UpperCamelCase name to a type description tuple. The first element of the tuple, a subclass of ApiObject, is the type of the mapped object. The second element indicates whether the mapped element is a collection or singleton.
(list): Snake case input converted to a paws UpperCamelCase list.
clone()
The objects of this class are cloneable with this method.
PawsFunctions$clone(deep = FALSE)
deep
Whether to make a deep clone.
A session stores configuration state and allows you to create paws service clients.
aws_access_key_id
aws access key
aws_secret_access_key
aws secret access key
aws_session_token
aws session token
region_name
Default region when creating new connections
profile_name
The name of a profile to use.
endpoint
The complete URL to use for the constructed client.
credentials
Formatted aws credentials to pass to paws objects
new()
Initialize PawsSession class
PawsSession$new( aws_access_key_id = NULL, aws_secret_access_key = NULL, aws_session_token = NULL, region_name = NULL, profile_name = NULL, endpoint = NULL, config = list() )
aws_access_key_id
(str): AWS access key ID
aws_secret_access_key
(str): AWS secret access key
aws_session_token
(str): AWS temporary session token
region_name
(str): Default region when creating new connections
profile_name
(str): The name of a profile to use. If not given, then the default profile is used.
endpoint
(str): The complete URL to use for the constructed client.
config
(list): Optional paws configuration of credentials, endpoint, and/or region.
client()
Create a low-level service client by name.
PawsSession$client(service_name, config = NULL)
service_name
(str): The name of a service, e.g. 's3' or 'ec2'. A list of available services can be found at https://paws-r.github.io/docs/
config
(list): Optional paws configuration of credentials, endpoint, and/or region.
format()
format class
PawsSession$format()
clone()
The objects of this class are cloneable with this method.
PawsSession$clone(deep = FALSE)
deep
Whether to make a deep clone.
Other Session: LocalSession, Session
Create a definition for executing a pipeline of containers as part of a SageMaker model.
pipeline_container_def(models, instance_type = NULL)
models |
(list[sagemaker.Model]): this will be a list of “sagemaker.Model“ objects in the order the inference should be invoked. |
instance_type |
(str): The EC2 instance type to deploy this Model to. For example, 'ml.p2.xlarge' (default: None). |
list[dict[str, str]]: list of container definition objects usable with the CreateModel API for inference pipelines if passed via 'Containers' field.
This is also part of a “CreateEndpointConfig“ request.
production_variant( model_name, instance_type = NULL, initial_instance_count = NULL, variant_name = "AllTraffic", initial_weight = 1, accelerator_type = NULL, serverless_inference_config = NULL )
model_name |
(str): The name of the SageMaker model this production variant references. |
instance_type |
(str): The EC2 instance type for this production variant. For example, 'ml.c4.8xlarge'. |
initial_instance_count |
(int): The initial instance count for this production variant (default: 1). |
variant_name |
(string): The “VariantName“ of this production variant (default: 'AllTraffic'). |
initial_weight |
(int): The relative “InitialVariantWeight“ of this production variant (default: 1). |
accelerator_type |
(str): Type of Elastic Inference accelerator for this production variant. For example, 'ml.eia1.medium'. For more information: https://docs.aws.amazon.com/sagemaker/latest/dg/ei.html |
serverless_inference_config |
(list): Specifies configuration dict related to serverless endpoint. The dict is converted from sagemaker.model_monitor.ServerlessInferenceConfig object (default: None) |
dict[str, str]: A SageMaker `ProductionVariant` description
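To give a sense of the shape of such a description, the sketch below builds a named list using field names from the `ProductionVariant` structure of the CreateEndpointConfig API. `production_variant_sketch` is hypothetical; how the package handles defaults, accelerators and serverless configuration is not reproduced.

```r
# Hypothetical builder for a ProductionVariant-style named list.
production_variant_sketch <- function(model_name, instance_type = NULL,
                                      initial_instance_count = NULL,
                                      variant_name = "AllTraffic",
                                      initial_weight = 1) {
  variant <- list(
    ModelName = model_name,
    VariantName = variant_name,
    InitialVariantWeight = initial_weight,
    InstanceType = instance_type,
    InitialInstanceCount = initial_instance_count
  )
  Filter(Negate(is.null), variant)   # drop fields that were not supplied
}

str(production_variant_sketch("my-model", "ml.c4.8xlarge", 1))
```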
Raise warning for deprecated python versions
python_deprecation_warning(framework, latest_supported_version)
framework |
(str): model framework |
latest_supported_version |
(str): latest supported version |
Eagerly read a collection of amazon Record protobuf objects from raw object
read_records_io(obj)
obj |
(raw): raw object |
A boto based Active Record class based on convention of CRUD operations.
sagemaker.core::ApiObject
-> Record
new()
Init Record.
Record$new(sagemaker_session = NULL, ...)
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the estimator creates one using the default AWS configuration chain.
...
parameters passed to 'R6' class 'ApiObject'
with_paws()
Update this ApiObject with a paws response.
Record$with_paws(paws_list)
paws_list
(dict): A dictionary of a paws response.
clone()
The objects of this class are cloneable with this method.
Record$clone(deep = FALSE)
deep
Whether to make a deep clone.
Not useful for string content.
sagemaker.core::Splitter
-> RecordIOSplitter
split()
Split a file into records using a specific strategy. This RecordIOSplitter splits the data into individual RecordIO records.
RecordIOSplitter$split(file)
file
(str): path to the file to split
generator for the individual records that were split from the file
clone()
The objects of this class are cloneable with this method.
RecordIOSplitter$clone(deep = FALSE)
deep
Whether to make a deep clone.
This won't throw any exception when the source directory does not exist.
recursive_copy(source, destination)
source |
(str): source path |
destination |
(str): destination path |
We need this function because the AWS SDK does not yet honor the “region_name“ parameter when creating an AWS STS client. For the list of regional endpoints, see https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_enable-regions.html#id_credentials_region-endpoints.
regional_hostname(service_name, region)
service_name |
(str): Name of the service to resolve an endpoint for (e.g., s3) |
region |
(str): AWS region name |
str: AWS STS regional endpoint
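A regional hostname typically follows the pattern `service.region.amazonaws.com`, with a `.cn` suffix in the China regions. The sketch below covers only those two cases; `regional_hostname_sketch` is hypothetical, and other partitions that the package may handle are deliberately left out.

```r
# Hypothetical helper covering only the standard and China partitions.
regional_hostname_sketch <- function(service_name, region) {
  suffix <- if (startsWith(region, "cn-")) "amazonaws.com.cn" else "amazonaws.com"
  paste(service_name, region, suffix, sep = ".")
}

regional_hostname_sketch("sts", "us-west-2")
```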
Other sagemaker_utils: .aws_partition(), .download_files_under_prefix(), base_from_name(), base_name_from_image(), build_dict(), common_variables, create_tar_file(), download_file(), download_folder(), get_config_value(), get_short_version(), name_from_base(), name_from_image(), repack_model(), retries(), sagemaker_short_timestamp(), sagemaker_timestamp(), secondary_training_status_changed(), secondary_training_status_message(), sts_regional_endpoint(), unique_name_from_base()
Raises warning, if not None.
remove_arg(name, arg = NULL)
name |
(str): name of deprecated argument |
arg |
(str): the argument to check |
Raises warning, if present.
removed_kwargs(name, kwargs)
name |
(str): name of deprecated argument |
kwargs |
(str): keyword arguments dict |
Raise a warning for a no-op in sagemaker>=2
removed_warning(phrase, sdk_version = NULL)
phrase |
(str): the prefix phrase of the warning message. |
sdk_version |
(str): the sdk version of removal of support. |
Raises warning, if present.
renamed_kwargs(old_name, new_name, value, kwargs)
old_name |
(str): name of deprecated argument |
new_name |
(str): name of the new argument |
value |
(str): value associated with new name, if supplied |
kwargs |
(list): keyword arguments dict |
value of the keyword argument, if present
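The pattern of accepting a deprecated argument name while warning about it can be sketched as follows. `renamed_kwargs_sketch` is hypothetical and the warning text is invented for illustration; the package's exact message and its handling of `kwargs` may differ.

```r
# Hypothetical helper: prefer the value passed under the old name, with a warning.
renamed_kwargs_sketch <- function(old_name, new_name, value, kwargs) {
  if (old_name %in% names(kwargs)) {
    warning(sprintf("%s has been renamed to %s", old_name, new_name), call. = FALSE)
    value <- kwargs[[old_name]]
  }
  value
}

# No warning is raised when the old name is absent from kwargs.
renamed_kwargs_sketch("image", "image_uri", "my-image-uri", list())
```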
Raise a warning for a rename in sagemaker>=2
renamed_warning(phrase)
phrase |
(str): the prefix phrase of the warning message. |
This function does the following:
- uncompresses the model tarball from S3 or the local system into a temp folder
- replaces the inference code from the model with the new code provided
- compresses the new model tarball and saves it in S3 or the local file system
repack_model( inference_script, source_directory, dependencies, model_uri, repacked_model_uri, sagemaker_session, kms_key = NULL )
inference_script |
(str): path or basename of the inference script that will be packed into the model |
source_directory |
(str): path including all the files that will be packed into the model |
dependencies |
(list[str]): A list of paths to directories (absolute or relative) with any additional libraries that will be exported to the |
model_uri |
(str): S3 or file system location of the original model tar |
repacked_model_uri |
(str): path or file system location where the new model will be saved |
sagemaker_session |
(sagemaker.session.Session): a sagemaker session to interact with S3. |
kms_key |
(str): KMS key ARN for encrypting the repacked model file |
str: path to the new packed model
Other sagemaker_utils: .aws_partition(), .download_files_under_prefix(), base_from_name(), base_name_from_image(), build_dict(), common_variables, create_tar_file(), download_file(), download_folder(), get_config_value(), get_short_version(), name_from_base(), name_from_image(), regional_hostname(), retries(), sagemaker_short_timestamp(), sagemaker_timestamp(), secondary_training_status_changed(), secondary_training_status_message(), sts_regional_endpoint(), unique_name_from_base()
Retries until max retry count is reached.
retries(max_retry_count, exception_message_prefix, seconds_to_sleep = 2)
max_retry_count |
(int): The retry count. |
exception_message_prefix |
(str): The message to include in the exception on failure. |
seconds_to_sleep |
(int): The number of seconds to sleep between executions. |
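A retry loop with these three parameters might look like the sketch below. It is an assumption-laden illustration: `with_retries` is hypothetical, it wraps a function `fn` rather than following the package's calling convention, and the error message wording is invented.

```r
# Hypothetical wrapper: call `fn` until it succeeds or the retry count is exhausted.
with_retries <- function(fn, max_retry_count, exception_message_prefix,
                         seconds_to_sleep = 2) {
  for (attempt in seq_len(max_retry_count)) {
    result <- tryCatch(fn(attempt), error = function(e) e)
    if (!inherits(result, "error")) return(result)   # success: stop retrying
    Sys.sleep(seconds_to_sleep)                       # wait before the next attempt
  }
  stop(sprintf("'%s' has reached the maximum retry count of %s",
               exception_message_prefix, max_retry_count))
}

# Succeeds on the third attempt; no sleep is used to keep the example fast.
with_retries(
  function(attempt) if (attempt < 3) stop("not ready") else "ready",
  max_retry_count = 5, exception_message_prefix = "demo", seconds_to_sleep = 0
)
```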
Other sagemaker_utils: .aws_partition(), .download_files_under_prefix(), base_from_name(), base_name_from_image(), build_dict(), common_variables, create_tar_file(), download_file(), download_folder(), get_config_value(), get_short_version(), name_from_base(), name_from_image(), regional_hostname(), repack_model(), sagemaker_short_timestamp(), sagemaker_timestamp(), secondary_training_status_changed(), secondary_training_status_message(), sts_regional_endpoint(), unique_name_from_base()
Split string from the right
rsplit(str, separator = "\\.", maxsplit)
str |
: string to be split |
separator |
(str): Method splits string starting from the right (default '\.') |
maxsplit |
(number): The maxsplit defines the maximum number of splits. |
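Splitting from the right means that only the right-most `maxsplit` separators are used as cut points. A base R sketch of that behaviour, under the assumption that `separator` is a regular expression (as the default `"\\."` suggests); `rsplit_sketch` and its default `maxsplit = 1` are hypothetical.

```r
# Hypothetical helper: split on the right-most `maxsplit` matches of `separator`.
rsplit_sketch <- function(str, separator = "\\.", maxsplit = 1) {
  m <- gregexpr(separator, str)[[1]]
  if (m[1] == -1) return(str)                        # separator not found
  keep <- utils::tail(seq_along(m), maxsplit)        # right-most matches only
  starts <- m[keep]
  ends <- starts + attr(m, "match.length")[keep]     # first character after each match
  substring(str, c(1, ends), c(starts - 1, nchar(str)))
}

rsplit_sketch("sagemaker.core.Session", maxsplit = 1)
```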
Other r_utils: Enum(), IsSubR6Class(), cls_help(), format_class(), is_list_named(), is_tarfile(), islistempty(), pkg_method(), retry_api_call(), split_str(), write_bin()
Returns the arguments joined by a slash ("/"), similarly to “file.path()“ (on Unix). If the first argument is "s3://", then that is preserved.
s3_path_join(...)
... |
: The strings to join with a slash. |
character: The joined string.
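The joining rule, including preservation of a leading `s3://`, can be sketched in base R. `s3_path_join_sketch` is a hypothetical approximation; how the package treats repeated or trailing slashes is an assumption here.

```r
# Hypothetical helper: join path parts with "/" while keeping a leading "s3://".
s3_path_join_sketch <- function(...) {
  parts <- c(...)
  has_scheme <- length(parts) > 0 && startsWith(parts[1], "s3://")
  if (has_scheme) parts[1] <- sub("^s3://", "", parts[1])
  parts <- gsub("^/+|/+$", "", parts)               # trim slashes at the joins
  joined <- paste(parts[nzchar(parts)], collapse = "/")
  if (has_scheme) paste0("s3://", joined) else joined
}

s3_path_join_sketch("s3://bucket", "prefix/", "file.csv")
```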
The contents will be downloaded and then processed as local data.
sagemaker.core::DataSource
-> S3DataSource
new()
Create an S3DataSource instance.
S3DataSource$new(bucket, prefix, sagemaker_session)
bucket
(str): S3 bucket name
prefix
(str): S3 prefix path to the data
sagemaker_session
(sagemaker.session.Session): a sagemaker_session with the desired settings to talk to S3
get_file_list()
Retrieve the list of absolute paths to all the files in this data source.
S3DataSource$get_file_list()
List(str): List of absolute paths.
get_root_dir()
Retrieve the absolute path to the root directory of this data source.
S3DataSource$get_root_dir()
str: absolute path to the root directory of this data source.
clone()
The objects of this class are cloneable with this method.
S3DataSource$clone(deep = FALSE)
deep
Whether to make a deep clone.
Contains static methods for downloading directories or files from S3.
download()
Static method that downloads a given S3 uri to the local machine.
S3Downloader$download( s3_uri, local_path, kms_key = NULL, sagemaker_session = NULL )
s3_uri
(str): An S3 uri to download from.
local_path
(str): A local path to download the file(s) to.
kms_key
(str): The KMS key to use to decrypt the files.
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the estimator creates one using the default AWS configuration chain.
read_file()
Static method that returns the contents of an s3 uri file body as a string.
S3Downloader$read_file(s3_uri, sagemaker_session = NULL)
s3_uri
(str): An S3 uri that refers to a single file.
sagemaker_session
(sagemaker.session.Session): AWS session to use. Automatically generates one if not provided.
str: The body of the file.
list()
Static method that lists the contents of an S3 uri.
S3Downloader$list(s3_uri, sagemaker_session = NULL)
s3_uri
(str): The S3 base uri to list objects in.
sagemaker_session
(sagemaker.session.Session): AWS session to use. Automatically generates one if not provided.
[str]: The list of S3 URIs in the given S3 base uri.
format()
format class
S3Downloader$format()
clone()
The objects of this class are cloneable with this method.
S3Downloader$clone(deep = FALSE)
deep
Whether to make a deep clone.
Contains static methods for uploading directories or files to S3
upload()
Static method that uploads a given file or directory to S3.
S3Uploader$upload( local_path = NULL, desired_s3_uri = NULL, kms_key = NULL, sagemaker_session = NULL )
local_path
(str): Path (absolute or relative) of local file or directory to upload.
desired_s3_uri
(str): The desired S3 location to upload to. It is the prefix to which the local filename will be added.
kms_key
(str): The KMS key to use to encrypt the files.
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the estimator creates one using the default AWS configuration chain.
The S3 uri of the uploaded file(s).
upload_string_as_file_body()
Static method that uploads a string as a file body to S3.
S3Uploader$upload_string_as_file_body( body, desired_s3_uri = NULL, kms_key = NULL, sagemaker_session = NULL )
body
(str): String representing the body of the file.
desired_s3_uri
(str): The desired S3 uri to upload to.
kms_key
(str): The KMS key to use to encrypt the files.
sagemaker_session
(sagemaker.session.Session): AWS session to use. Automatically generates one if not provided.
str: The S3 uri of the uploaded file(s).
format()
format class
S3Uploader$format()
clone()
The objects of this class are cloneable with this method.
S3Uploader$clone(deep = FALSE)
deep
Whether to make a deep clone.
Return a timestamp that is relatively short in length
sagemaker_short_timestamp()
Other sagemaker_utils: .aws_partition(), .download_files_under_prefix(), base_from_name(), base_name_from_image(), build_dict(), common_variables, create_tar_file(), download_file(), download_folder(), get_config_value(), get_short_version(), name_from_base(), name_from_image(), regional_hostname(), repack_model(), retries(), sagemaker_timestamp(), secondary_training_status_changed(), secondary_training_status_message(), sts_regional_endpoint(), unique_name_from_base()
Return a timestamp with millisecond precision.
sagemaker_timestamp()
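A timestamp with millisecond precision can be produced in base R as sketched below. The hyphen-separated layout and the use of UTC are assumptions made for illustration; `sagemaker_timestamp_sketch` is hypothetical and the package's exact format may differ.

```r
# Hypothetical helper: date and time to the second, plus a 3-digit millisecond suffix.
sagemaker_timestamp_sketch <- function(now = Sys.time()) {
  ms <- floor((as.numeric(now) %% 1) * 1000)        # fractional second in milliseconds
  sprintf("%s-%03d", format(now, "%Y-%m-%d-%H-%M-%S", tz = "UTC"), as.integer(ms))
}

sagemaker_timestamp_sketch()
```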
Other sagemaker_utils: .aws_partition(), .download_files_under_prefix(), base_from_name(), base_name_from_image(), build_dict(), common_variables, create_tar_file(), download_file(), download_folder(), get_config_value(), get_short_version(), name_from_base(), name_from_image(), regional_hostname(), repack_model(), retries(), sagemaker_short_timestamp(), secondary_training_status_changed(), secondary_training_status_message(), sts_regional_endpoint(), unique_name_from_base()
Returns true if training job's secondary status message has changed.
secondary_training_status_changed( current_job_description = NULL, prev_job_description = NULL )
current_job_description |
(str): Current job description, returned from DescribeTrainingJob call. |
prev_job_description |
(str): Previous job description, returned from DescribeTrainingJob call. |
boolean: Whether the secondary status message of a training job changed or not.
Other sagemaker_utils: .aws_partition(), .download_files_under_prefix(), base_from_name(), base_name_from_image(), build_dict(), common_variables, create_tar_file(), download_file(), download_folder(), get_config_value(), get_short_version(), name_from_base(), name_from_image(), regional_hostname(), repack_model(), retries(), sagemaker_short_timestamp(), sagemaker_timestamp(), secondary_training_status_message(), sts_regional_endpoint(), unique_name_from_base()
Returns a string containing the last modified time and the secondary training job status message.
secondary_training_status_message( job_description = NULL, prev_description = NULL )
job_description |
(str): Returned response from DescribeTrainingJob call |
prev_description |
(str): Previous job description from DescribeTrainingJob call |
str: Job status string to be printed.
Other sagemaker_utils: .aws_partition(), .download_files_under_prefix(), base_from_name(), base_name_from_image(), build_dict(), common_variables, create_tar_file(), download_file(), download_folder(), get_config_value(), get_short_version(), name_from_base(), name_from_image(), regional_hostname(), repack_model(), retries(), sagemaker_short_timestamp(), sagemaker_timestamp(), secondary_training_status_changed(), sts_regional_endpoint(), unique_name_from_base()
Manage interactions with the Amazon SageMaker APIs and any other AWS services needed. This class provides convenient methods for manipulating entities and resources that Amazon SageMaker uses, such as training jobs, endpoints, and input datasets in S3. AWS service calls are delegated to an underlying paws session, which by default is initialized using the AWS configuration chain. When you make an Amazon SageMaker API call that accesses an S3 bucket location and one is not specified, the “Session“ creates a default bucket based on a naming convention which includes the current AWS account ID.
paws_region_name
Returns aws region associated with Session
new()
Creates a new instance of this R6 class (R6::R6Class). Initialize a SageMaker Session.
Session$new( paws_session = NULL, sagemaker_client = NULL, sagemaker_runtime_client = NULL, default_bucket = NULL )
paws_session
(PawsSession): The underlying AWS credentials passed to paws SDK.
sagemaker_client
(sagemaker): Client which makes Amazon SageMaker service calls other than “InvokeEndpoint“ (default: None). Estimators created using this “Session“ use this client. If not provided, one will be created using this instance's “paws session“.
sagemaker_runtime_client
(sagemakerruntime): Client which makes “InvokeEndpoint“ calls to Amazon SageMaker (default: None). Predictors created using this “Session“ use this client. If not provided, one will be created using this instance's “paws session“.
default_bucket
(str): The default Amazon S3 bucket to be used by this session. This will be created the next time an Amazon S3 bucket is needed (by calling default_bucket()). If not provided, a default bucket will be created based on the following format: "sagemaker-region-aws-account-id". Example: "sagemaker-my-custom-bucket".
upload_data()
Upload local file or directory to S3. If a single file is specified for upload, the resulting S3 object key is `key_prefix/filename` (filename does not include the local path, if any specified). If a directory is specified for upload, the API uploads all content, recursively, preserving relative structure of subdirectories. The resulting object key names are: `key_prefix/relative_subdirectory_path/filename`.
Session$upload_data(path, bucket = NULL, key_prefix = "data", ...)
path
(str): Path (absolute or relative) of local file or directory to upload.
bucket
(str): Name of the S3 Bucket to upload to (default: None). If not specified, the default bucket of the “Session“ is used (if default bucket does not exist, the “Session“ creates it).
key_prefix
(str): Optional S3 object key name prefix (default: 'data'). S3 uses the prefix to create a directory structure for the bucket content that it displays in the S3 console.
...
(any): Optional extra arguments that may be passed to the upload operation. Similar to ExtraArgs parameter in S3 upload_file function. Please refer to the ExtraArgs parameter documentation here: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/s3-uploading-files.html#the-extraargs-parameter
str: The S3 URI of the uploaded file(s). If a file is specified in the path argument, the URI format is: “s3://bucket name/key_prefix/original_file_name“. If a directory is specified in the path argument, the URI format is “s3://bucket name/key_prefix“.
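The key naming rules for single files and directories can be previewed locally without touching S3. The sketch below only computes the object keys that such an upload would produce; `s3_keys_for_upload` is a hypothetical helper and performs no upload.

```r
# Hypothetical helper: compute the S3 object keys an upload of `path` would create.
s3_keys_for_upload <- function(path, key_prefix = "data") {
  if (!dir.exists(path)) {
    # Single file: the key is key_prefix/filename, without the local directory.
    return(paste(key_prefix, basename(path), sep = "/"))
  }
  rel <- list.files(path, recursive = TRUE)   # paths relative to `path`
  paste(key_prefix, rel, sep = "/")           # keeps the subdirectory structure
}

s3_keys_for_upload("local/dir/train.csv")
```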
upload_string_as_file_body()
Upload a string as a file body.
Session$upload_string_as_file_body(body, bucket, key, kms_key = NULL)
body
(str): String representing the body of the file.
bucket
(str): Name of the S3 Bucket to upload to (default: None). If not specified, the default bucket of the “Session“ is used (if default bucket does not exist, the “Session“ creates it).
key
(str): S3 object key. This is the s3 path to the file.
kms_key
(str): The KMS key to use for encrypting the file.
str: The S3 URI of the uploaded file. The URI format is: “s3://bucket name/key“.
download_data()
Download file or directory from S3.
Session$download_data(path, bucket, key_prefix = "", ...)
path
(str): Local path where the file or directory should be downloaded to.
bucket
(str): Name of the S3 Bucket to download from.
key_prefix
(str): Optional S3 object key name prefix.
...
(any): Optional extra arguments that may be passed to the download operation. Please refer to the ExtraArgs parameter in the boto3 documentation here: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/s3-example-download-file.html
NULL invisibly
read_s3_file()
Read a single file from S3.
Session$read_s3_file(bucket, key_prefix)
bucket
(str): Name of the S3 Bucket to download from.
key_prefix
(str): S3 object key name prefix.
str: The body of the s3 file as a string.
list_s3_files()
Lists the S3 files given an S3 bucket and key.
Session$list_s3_files(bucket, key_prefix = NULL)
bucket
(str): Name of the S3 Bucket to download from.
key_prefix
(str): S3 object key name prefix.
(str): The list of files at the S3 path.
default_bucket()
Return the name of the default bucket to use in relevant Amazon SageMaker interactions.
Session$default_bucket()
(str): The name of the default bucket, which is of the form: “sagemaker-region-AWS account ID“.
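The naming scheme can be illustrated with a small sketch. `default_bucket_name()` is a hypothetical helper, and the region and account ID are placeholder values; the real method resolves both from the session.

```r
# Hypothetical helper mirroring the documented "sagemaker-<region>-<account id>" form.
default_bucket_name <- function(region, account_id) {
  sprintf("sagemaker-%s-%s", region, account_id)
}

bucket <- default_bucket_name("eu-west-1", "123456789012")
print(bucket)
```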
train()
Create an Amazon SageMaker training job.
Session$train( input_mode, input_config, role, job_name, output_config = NULL, resource_config = NULL, vpc_config = NULL, hyperparameters = NULL, stop_condition = NULL, tags = NULL, metric_definitions = NULL, enable_network_isolation = FALSE, image_uri = NULL, algorithm_arn = NULL, encrypt_inter_container_traffic = FALSE, use_spot_instances = FALSE, checkpoint_s3_uri = NULL, checkpoint_local_path = NULL, experiment_config = NULL, debugger_rule_configs = NULL, debugger_hook_config = NULL, tensorboard_output_config = NULL, enable_sagemaker_metrics = NULL, profiler_rule_configs = NULL, profiler_config = NULL, environment = NULL, retry_strategy = NULL )
input_mode
(str): The input mode that the algorithm supports. Valid modes:
'File': Amazon SageMaker copies the training dataset from the S3 location to a directory in the Docker container.
'Pipe': Amazon SageMaker streams data directly from S3 to the container via a Unix-named pipe.
input_config
(list): A list of Channel objects. Each channel is a named input source. Please refer to the format details described: https://botocore.readthedocs.io/en/latest/reference/services/sagemaker.html#SageMaker.Client.create_training_job
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. You must grant sufficient permissions to this role.
job_name
(str): Name of the training job being created.
output_config
(dict): The S3 URI where you want to store the training results and optional KMS key ID.
resource_config
(dict): Contains values for ResourceConfig:
instance_count (int): Number of EC2 instances to use for training. The key in resource_config is 'InstanceCount'.
instance_type (str): Type of EC2 instance to use for training, for example, 'ml.c4.xlarge'. The key in resource_config is 'InstanceType'.
vpc_config
(dict): Contains values for VpcConfig:
subnets (list[str]): List of subnet ids. The key in vpc_config is 'Subnets'.
security_group_ids (list[str]): List of security group ids. The key in vpc_config is 'SecurityGroupIds'.
hyperparameters
(dict): Hyperparameters for model training. The hyperparameters are made accessible as a dict[str, str] to the training code on SageMaker. For convenience, this accepts other types for keys and values, but “str()“ will be called to convert them before training.
stop_condition
(dict): Defines when training shall finish. Contains entries that can be understood by the service like “MaxRuntimeInSeconds“.
tags
(list[dict]): List of tags for labeling a training job. For more, see https://docs.aws.amazon.com/sagemaker/latest/dg/API_Tag.html.
metric_definitions
(list[dict]): A list of dictionaries that defines the metric(s) used to evaluate the training jobs. Each dictionary contains two keys: 'Name' for the name of the metric, and 'Regex' for the regular expression used to extract the metric from the logs.
enable_network_isolation
(bool): Whether to request for the training job to run with network isolation or not.
image_uri
(str): Docker image_uri containing training code.
algorithm_arn
(str): Algorithm Arn from Marketplace.
encrypt_inter_container_traffic
(bool): Specifies whether traffic between training containers is encrypted for the training job (default: “False“).
use_spot_instances
(bool): Whether to use spot instances for training.
checkpoint_s3_uri
(str): The S3 URI in which to persist checkpoints that the algorithm persists (if any) during training. (default: “None“).
checkpoint_local_path
(str): The local path that the algorithm writes its checkpoints to. SageMaker will persist all files under this path to 'checkpoint_s3_uri' continually during training. On job startup the reverse happens: data from the S3 location is downloaded to this path before the algorithm is started. If the path is unset then SageMaker assumes the checkpoints will be provided under '/opt/ml/checkpoints/'. (Default: NULL).
experiment_config
(dict): Experiment management configuration. Dictionary contains three optional keys, 'ExperimentName', 'TrialName', and 'TrialComponentDisplayName'. (Default: NULL).
debugger_rule_configs
Configuration information for debugging rules
debugger_hook_config
Configuration information for the debugger hook
tensorboard_output_config
Configuration information for tensorboard output
enable_sagemaker_metrics
(bool): Enable SageMaker Metrics Time Series. For more information see: https://docs.aws.amazon.com/sagemaker/latest/dg/API_AlgorithmSpecification.html#SageMaker-Type-AlgorithmSpecification-EnableSageMakerMetricsTimeSeries (Default: NULL).
profiler_rule_configs
(list[dict]): A list of profiler rule configurations.
profiler_config
(dict): Configuration for how profiling information is emitted with SageMaker Profiler. (default: “None“).
environment
(dict[str, str]): Environment variables to be set for use during the training job (default: “None“).
retry_strategy
(dict): Defines RetryStrategy for InternalServerFailures. max_retry_attempts (int): Number of times a job should be retried. The key in RetryStrategy is 'MaxRetryAttempts'.
str: ARN of the training job, if it is created.
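The nested arguments are easier to follow as concrete R lists. The sketch below follows the shape of the CreateTrainingJob request; every bucket, name and URI is a placeholder. The commented call assumes an existing `Session` object named `sess`. The last lines check that the 'Regex' of a metric definition extracts a value from a made-up log line.

```r
# One named input channel (placeholder S3 location).
input_config <- list(
  list(
    ChannelName = "training",
    DataSource = list(
      S3DataSource = list(
        S3DataType = "S3Prefix",
        S3Uri = "s3://my-bucket/train/",
        S3DataDistributionType = "FullyReplicated"
      )
    )
  )
)
output_config <- list(S3OutputPath = "s3://my-bucket/output/")
resource_config <- list(
  InstanceCount = 1L,
  InstanceType = "ml.c4.xlarge",
  VolumeSizeInGB = 30L
)
stop_condition <- list(MaxRuntimeInSeconds = 3600L)
metric_definitions <- list(
  list(Name = "validation:rmse", Regex = "validation-rmse=([0-9.]+)")
)

# Check the regex against a made-up log line before submitting a job.
log_line <- "epoch 3 validation-rmse=0.4821"
m <- regmatches(log_line, regexec(metric_definitions[[1]]$Regex, log_line))[[1]]
rmse <- as.numeric(m[2])

# With an existing Session object `sess` (needs AWS credentials; not run here):
# sess$train(
#   input_mode = "File", input_config = input_config,
#   role = "MySageMakerRole", job_name = "my-training-job",
#   output_config = output_config, resource_config = resource_config,
#   stop_condition = stop_condition, image_uri = "<training image URI>",
#   metric_definitions = metric_definitions
# )
```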
update_training_job()
Calls the UpdateTrainingJob API for the given job name and returns the response.
Session$update_training_job( job_name, profiler_rule_configs = NULL, profiler_config = NULL )
job_name
(str): Name of the training job being updated.
profiler_rule_configs
(list): List of profiler rule configurations. (default: “None“).
profiler_config
(dict): Configuration for how profiling information is emitted with SageMaker Profiler. (default: “None“).
process()
Create an Amazon SageMaker processing job.
Session$process( inputs = NULL, output_config = NULL, job_name = NULL, resources = NULL, stopping_condition = NULL, app_specification = NULL, environment = NULL, network_config = NULL, role_arn, tags = NULL, experiment_config = NULL )
inputs
([dict]): List of up to 10 ProcessingInput dictionaries.
output_config
(dict): A config dictionary, which contains a list of up to 10 ProcessingOutput dictionaries, as well as an optional KMS key ID.
job_name
(str): The name of the processing job. The name must be unique within an AWS Region in an AWS account. Names should have minimum length of 1 and maximum length of 63 characters.
resources
(dict): Encapsulates the resources, including ML instances and storage, to use for the processing job.
stopping_condition
(dict[str,int]): Specifies a limit to how long the processing job can run, in seconds.
app_specification
(dict[str,str]): Configures the processing job to run the given image. Details are in the processing container specification.
environment
(dict): Environment variables to start the processing container with.
network_config
(dict): Specifies networking options, such as network traffic encryption between processing containers, whether to allow inbound and outbound network calls to and from processing containers, and VPC subnets and security groups to use for VPC-enabled processing jobs.
role_arn
(str): The Amazon Resource Name (ARN) of an IAM role that Amazon SageMaker can assume to perform tasks on your behalf.
tags
([dict[str,str]]): A list of dictionaries containing key-value pairs.
experiment_config
(dict): Experiment management configuration. Dictionary contains three optional keys, 'ExperimentName', 'TrialName', and 'TrialComponentDisplayName'. (Default: NULL).
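A sketch of the smaller arguments for a processing job, assuming the list shapes of the CreateProcessingJob request. `valid_job_name()` is a hypothetical helper that checks only the documented length limit of 1 to 63 characters; the image URI and names are placeholders.

```r
# Hypothetical validator for the documented 1-63 character limit.
valid_job_name <- function(name) {
  is.character(name) && length(name) == 1 &&
    nchar(name) >= 1 && nchar(name) <= 63
}

resources <- list(
  ClusterConfig = list(
    InstanceCount = 1L,
    InstanceType = "ml.m5.xlarge",
    VolumeSizeInGB = 30L
  )
)
stopping_condition <- list(MaxRuntimeInSeconds = 1800L)
app_specification <- list(ImageUri = "<processing image URI>")
experiment_config <- list(
  ExperimentName = "my-experiment",
  TrialName = "my-trial",
  TrialComponentDisplayName = "processing"
)

ok_name <- valid_job_name("my-processing-job")
too_long <- valid_job_name(strrep("a", 64))
```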
create_monitoring_schedule()
Create an Amazon SageMaker monitoring schedule.
Session$create_monitoring_schedule( monitoring_schedule_name, schedule_expression = NULL, statistics_s3_uri = NULL, constraints_s3_uri = NULL, monitoring_inputs = NULL, monitoring_output_config = NULL, instance_count = 1, instance_type = NULL, volume_size_in_gb = NULL, volume_kms_key = NULL, image_uri = NULL, entrypoint = NULL, arguments = NULL, record_preprocessor_source_uri = NULL, post_analytics_processor_source_uri = NULL, max_runtime_in_seconds = NULL, environment = NULL, network_config = NULL, role_arn = NULL, tags = NULL )
monitoring_schedule_name
(str): The name of the monitoring schedule. The name must be unique within an AWS Region in an AWS account. Names should have a minimum length of 1 and a maximum length of 63 characters.
schedule_expression
(str): The cron expression that dictates the monitoring execution schedule.
statistics_s3_uri
(str): The S3 uri of the statistics file to use.
constraints_s3_uri
(str): The S3 uri of the constraints file to use.
monitoring_inputs
([dict]): List of MonitoringInput dictionaries.
monitoring_output_config
(dict): A config dictionary, which contains a list of MonitoringOutput dictionaries, as well as an optional KMS key ID.
instance_count
(int): The number of instances to run.
instance_type
(str): The type of instance to run.
volume_size_in_gb
(int): Size of the volume in GB.
volume_kms_key
(str): KMS key to use when encrypting the volume.
image_uri
(str): The image uri to use for monitoring executions.
entrypoint
(str): The entrypoint to the monitoring execution image.
arguments
(str): The arguments to pass to the monitoring execution image.
record_preprocessor_source_uri
(str or None): The S3 uri that points to the script that pre-processes the dataset (only applicable to first-party images).
post_analytics_processor_source_uri
(str or None): The S3 uri that points to the script that post-processes the dataset (only applicable to first-party images).
max_runtime_in_seconds
(int): Specifies a limit to how long the processing job can run, in seconds.
environment
(dict): Environment variables to start the monitoring execution container with.
network_config
(dict): Specifies networking options, such as network traffic encryption between processing containers, whether to allow inbound and outbound network calls to and from processing containers, and VPC subnets and security groups to use for VPC-enabled processing jobs.
role_arn
(str): The Amazon Resource Name (ARN) of an IAM role that Amazon SageMaker can assume to perform tasks on your behalf.
tags
([dict[str,str]]): A list of dictionaries containing key-value pairs.
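For `schedule_expression`, the sketch below builds hourly and daily expressions. It assumes the `cron(0 * ? * * *)` and `cron(0 <hour> ? * * *)` forms used by SageMaker Model Monitor; both helpers are hypothetical and not part of this package.

```r
# Hypothetical helpers; the expression forms are an assumption (see lead-in).
hourly_cron <- function() "cron(0 * ? * * *)"

daily_cron <- function(hour = 0L) {
  stopifnot(hour >= 0, hour <= 23)
  sprintf("cron(0 %d ? * * *)", as.integer(hour))
}

schedule_expression <- daily_cron(9)
print(schedule_expression)
```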
update_monitoring_schedule()
Update an Amazon SageMaker monitoring schedule.
Session$update_monitoring_schedule( monitoring_schedule_name, schedule_expression = NULL, statistics_s3_uri = NULL, constraints_s3_uri = NULL, monitoring_inputs = NULL, monitoring_output_config = NULL, instance_count = NULL, instance_type = NULL, volume_size_in_gb = NULL, volume_kms_key = NULL, image_uri = NULL, entrypoint = NULL, arguments = NULL, record_preprocessor_source_uri = NULL, post_analytics_processor_source_uri = NULL, max_runtime_in_seconds = NULL, environment = NULL, network_config = NULL, role_arn = NULL )
monitoring_schedule_name
(str): The name of the monitoring schedule. The name must be unique within an AWS Region in an AWS account. Names should have a minimum length of 1 and a maximum length of 63 characters.
schedule_expression
(str): The cron expression that dictates the monitoring execution schedule.
statistics_s3_uri
(str): The S3 uri of the statistics file to use.
constraints_s3_uri
(str): The S3 uri of the constraints file to use.
monitoring_inputs
([dict]): List of MonitoringInput dictionaries.
monitoring_output_config
(dict): A config dictionary, which contains a list of MonitoringOutput dictionaries, as well as an optional KMS key ID.
instance_count
(int): The number of instances to run.
instance_type
(str): The type of instance to run.
volume_size_in_gb
(int): Size of the volume in GB.
volume_kms_key
(str): KMS key to use when encrypting the volume.
image_uri
(str): The image uri to use for monitoring executions.
entrypoint
(str): The entrypoint to the monitoring execution image.
arguments
(str): The arguments to pass to the monitoring execution image.
record_preprocessor_source_uri
(str or None): The S3 uri that points to the script that pre-processes the dataset (only applicable to first-party images).
post_analytics_processor_source_uri
(str or None): The S3 uri that points to the script that post-processes the dataset (only applicable to first-party images).
max_runtime_in_seconds
(int): Specifies a limit to how long the processing job can run, in seconds.
environment
(dict): Environment variables to start the monitoring execution container with.
network_config
(dict): Specifies networking options, such as network traffic encryption between processing containers, whether to allow inbound and outbound network calls to and from processing containers, and VPC subnets and security groups to use for VPC-enabled processing jobs.
role_arn
(str): The Amazon Resource Name (ARN) of an IAM role that Amazon SageMaker can assume to perform tasks on your behalf.
start_monitoring_schedule()
Starts a monitoring schedule.
Session$start_monitoring_schedule(monitoring_schedule_name)
monitoring_schedule_name
(str): The name of the Amazon SageMaker Monitoring Schedule to start.
stop_monitoring_schedule()
Stops a monitoring schedule.
Session$stop_monitoring_schedule(monitoring_schedule_name)
monitoring_schedule_name
(str): The name of the Amazon SageMaker Monitoring Schedule to stop.
delete_monitoring_schedule()
Deletes a monitoring schedule.
Session$delete_monitoring_schedule(monitoring_schedule_name)
monitoring_schedule_name
(str): The name of the Amazon SageMaker Monitoring Schedule to delete.
describe_monitoring_schedule()
Calls the DescribeMonitoringSchedule API for the given monitoring schedule name and returns the response.
Session$describe_monitoring_schedule(monitoring_schedule_name)
monitoring_schedule_name
(str): The name of the monitoring schedule to describe.
dict: A dictionary response with the monitoring schedule description.
list_monitoring_executions()
Lists the monitoring executions associated with the given monitoring_schedule_name.
Session$list_monitoring_executions( monitoring_schedule_name, sort_by = "ScheduledTime", sort_order = "Descending", max_results = 100 )
monitoring_schedule_name
(str): The monitoring_schedule_name for which to retrieve the monitoring executions.
sort_by
(str): The field to sort by. Can be one of: "CreationTime", "ScheduledTime", "Status". Default: "ScheduledTime".
sort_order
(str): The sort order. Can be one of: "Ascending", "Descending". Default: "Descending".
max_results
(int): The maximum number of results to return. Must be between 1 and 100.
dict: Dictionary of monitoring schedule executions.
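The allowed values above can be checked before calling the API. `check_list_args()` is a hypothetical helper, not the package's own validation; the keys of the returned list follow the ListMonitoringExecutions request.

```r
# Hypothetical pre-flight check for the documented argument constraints.
check_list_args <- function(sort_by = "ScheduledTime",
                            sort_order = "Descending",
                            max_results = 100) {
  sort_by <- match.arg(sort_by, c("CreationTime", "ScheduledTime", "Status"))
  sort_order <- match.arg(sort_order, c("Ascending", "Descending"))
  stopifnot(is.numeric(max_results), max_results >= 1, max_results <= 100)
  list(
    SortBy = sort_by,
    SortOrder = sort_order,
    MaxResults = as.integer(max_results)
  )
}

args <- check_list_args(sort_by = "Status", max_results = 25)
str(args)
```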
list_monitoring_schedules()
Lists the monitoring schedules, optionally filtered by endpoint name.
Session$list_monitoring_schedules( endpoint_name = NULL, sort_by = "CreationTime", sort_order = "Descending", max_results = 100 )
endpoint_name
(str): The name of the endpoint to filter on. If not provided, does not filter on it. Default: None.
sort_by
(str): The field to sort by. Can be one of: "Name", "CreationTime", "Status". Default: "CreationTime".
sort_order
(str): The sort order. Can be one of: "Ascending", "Descending". Default: "Descending".
max_results
(int): The maximum number of results to return. Must be between 1 and 100.
dict: Dictionary of monitoring schedules.
was_processing_job_successful()
Calls the DescribeProcessingJob API for the given job name and returns True if the job was successful, False otherwise.
Session$was_processing_job_successful(job_name)
job_name
(str): The name of the processing job to describe.
bool: Whether the processing job was successful.
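The check can be sketched against a describe response. This assumes success means a `ProcessingJobStatus` of "Completed" in the DescribeProcessingJob response; `job_succeeded()` is a hypothetical stand-in, and the responses are hand-written stubs rather than real API output.

```r
# Hypothetical stand-in: TRUE only when the job status is "Completed".
job_succeeded <- function(job_desc) {
  identical(job_desc$ProcessingJobStatus, "Completed")
}

done <- job_succeeded(list(ProcessingJobStatus = "Completed"))
failed <- job_succeeded(list(ProcessingJobStatus = "Failed"))
```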
describe_processing_job()
Calls the DescribeProcessingJob API for the given job name and returns the response.
Session$describe_processing_job(job_name)
job_name
(str): The name of the processing job to describe.
dict: A dictionary response with the processing job description.
stop_processing_job()
Calls the StopProcessingJob API for the given job name.
Session$stop_processing_job(job_name)
job_name
(str): The name of the processing job to stop.
stop_training_job()
Calls the StopTrainingJob API for the given job name.
Session$stop_training_job(job_name)
job_name
(str): The name of the training job to stop.
describe_training_job()
Calls the DescribeTrainingJob API for the given job name and returns the response.
Session$describe_training_job(job_name)
job_name
(str): The name of the training job to describe.
dict: A dictionary response with the training job description.
auto_ml()
Create an Amazon SageMaker AutoML job.
Session$auto_ml( input_config, output_config, auto_ml_job_config, role, job_name, problem_type = NULL, job_objective = NULL, generate_candidate_definitions_only = FALSE, tags = NULL )
input_config
(list[dict]): A list of Channel objects. Each channel contains "DataSource" and "TargetAttributeName", "CompressionType" is an optional field.
output_config
(dict): The S3 URI where you want to store the training results and optional KMS key ID.
auto_ml_job_config
(dict): A dict of AutoMLJob config, containing "StoppingCondition", "SecurityConfig", optionally contains "VolumeKmsKeyId".
role
(str): The Amazon Resource Name (ARN) of an IAM role that Amazon SageMaker can assume to perform tasks on your behalf.
job_name
(str): A string that can be used to identify an AutoMLJob. Each AutoMLJob should have a unique job name.
problem_type
(str): The type of problem of this AutoMLJob. Valid values are "Regression", "BinaryClassification", "MultiClassClassification". If None, SageMaker AutoMLJob will infer the problem type automatically.
job_objective
(dict): AutoMLJob objective, contains "AutoMLJobObjectiveType" (optional), "MetricName" and "Value".
generate_candidate_definitions_only
(bool): Indicates whether to only generate candidate definitions. If True, AutoML.list_candidates() cannot be called. Default: False.
tags
([dict[str,str]]): A list of dictionaries containing key-value pairs.
NULL invisibly
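A sketch of `input_config` and a `problem_type` check for an AutoML job. The S3 location and target column are placeholders, the data source shape is assumed to follow the CreateAutoMLJob request, and `check_problem_type()` is a hypothetical helper.

```r
# One channel with the two required fields named in the documentation.
input_config <- list(
  list(
    DataSource = list(
      S3DataSource = list(
        S3DataType = "S3Prefix",
        S3Uri = "s3://my-bucket/automl/train/"
      )
    ),
    TargetAttributeName = "label"
  )
)

# Hypothetical check: NULL is allowed and leaves inference to the service.
check_problem_type <- function(problem_type = NULL) {
  valid <- c("Regression", "BinaryClassification", "MultiClassClassification")
  if (is.null(problem_type)) return(NULL)
  stopifnot(problem_type %in% valid)
  problem_type
}

problem_type <- check_problem_type("BinaryClassification")
```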
describe_auto_ml_job()
Calls the DescribeAutoMLJob API for the given job name and returns the response.
Session$describe_auto_ml_job(job_name)
job_name
(str): The name of the AutoML job to describe.
dict: A dictionary response with the AutoML Job description.
list_candidates()
Returns the list of candidates of an AutoML job for a given name.
Session$list_candidates( job_name, status_equals = NULL, candidate_name = NULL, candidate_arn = NULL, sort_order = NULL, sort_by = NULL, max_results = NULL )
job_name
(str): The name of the AutoML job. If None, will use object's latest_auto_ml_job name.
status_equals
(str): Filter the result by candidate status. Possible values are "Completed", "InProgress", "Failed", "Stopped", "Stopping".
candidate_name
(str): The name of a specified candidate to list. Default to NULL.
candidate_arn
(str): The Arn of a specified candidate to list. Default to NULL.
sort_order
(str): The order that the candidates will be listed in result. Default to NULL.
sort_by
(str): The value that the candidates will be sorted by. Default to NULL.
max_results
(int): The number of candidates to list in the results, between 1 and 100. Default to None. If None, will return all the candidates.
list: A list of dictionaries with candidates information
wait_for_auto_ml_job()
Wait for an Amazon SageMaker AutoML job to complete.
Session$wait_for_auto_ml_job(job, poll = 5)
job
(str): Name of the auto ml job to wait for.
poll
(int): Polling interval in seconds (default: 5).
(dict): Return value from the “DescribeAutoMLJob“ API.
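The waiting behaviour amounts to a polling loop. The sketch below is not the package's implementation: `wait_for_job()`, its `max_polls` cap and the stubbed describe function are all hypothetical, and the set of terminal statuses is an assumption.

```r
# Poll a describe function until the job reaches a terminal status.
wait_for_job <- function(describe_fn, poll = 5, max_polls = 100) {
  terminal <- c("Completed", "Failed", "Stopped")  # assumed terminal states
  for (i in seq_len(max_polls)) {
    desc <- describe_fn()
    if (desc$AutoMLJobStatus %in% terminal) return(desc)
    Sys.sleep(poll)
  }
  stop("job did not reach a terminal state")
}

# Stub that reports "InProgress" twice, then "Completed".
statuses <- c("InProgress", "InProgress", "Completed")
calls <- 0
fake_describe <- function() {
  calls <<- calls + 1
  list(AutoMLJobStatus = statuses[calls])
}

result <- wait_for_job(fake_describe, poll = 0)
```

With a real session, `describe_fn` could wrap `Session$describe_auto_ml_job(job_name)`.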
logs_for_auto_ml_job()
Display the logs for a given AutoML job, optionally tailing them until the job is complete. If the output is a tty or a Jupyter cell, it will be color-coded based on which instance the log entry is from.
Session$logs_for_auto_ml_job(job_name, wait = FALSE, poll = 10)
job_name
(str): Name of the Auto ML job to display the logs for.
wait
(bool): Whether to keep looking for new log entries until the job completes (Default: FALSE).
poll
(int): The interval in seconds between polling for new log entries and job completion (Default: 10).
compile_model()
Create an Amazon SageMaker Neo compilation job.
Session$compile_model( input_model_config, output_model_config, role, job_name, stop_condition, tags )
input_model_config
(dict): the trained model and the Amazon S3 location where it is stored.
output_model_config
(dict): Identifies the Amazon S3 location where you want Amazon SageMaker Neo to save the results of the compilation job.
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker Neo compilation jobs use this role to access model artifacts. You must grant sufficient permissions to this role.
job_name
(str): Name of the compilation job being created.
stop_condition
(dict): Defines when compilation job shall finish. Contains entries that can be understood by the service like “MaxRuntimeInSeconds“.
tags
(list[dict]): List of tags for labeling a compile model job. For more, see https://docs.aws.amazon.com/sagemaker/latest/dg/API_Tag.html.
str: ARN of the compile model job, if it is created.
tune()
Create an Amazon SageMaker hyperparameter tuning job.
Session$tune( job_name, strategy = c("Bayesian", "Random"), objective_type, objective_metric_name, max_jobs, max_parallel_jobs, parameter_ranges, static_hyperparameters, input_mode, metric_definitions, role, input_config, output_config, resource_config, stop_condition, tags, warm_start_config, enable_network_isolation = FALSE, image_uri = NULL, algorithm_arn = NULL, early_stopping_type = "Off", encrypt_inter_container_traffic = FALSE, vpc_config = NULL, use_spot_instances = FALSE, checkpoint_s3_uri = NULL, checkpoint_local_path = NULL )
job_name
(str): Name of the tuning job being created.
strategy
(str): Strategy to be used for hyperparameter estimations.
objective_type
(str): The type of the objective metric for evaluating training jobs. This value can be either 'Minimize' or 'Maximize'.
objective_metric_name
(str): Name of the metric for evaluating training jobs.
max_jobs
(int): Maximum total number of training jobs to start for the hyperparameter tuning job.
max_parallel_jobs
(int): Maximum number of parallel training jobs to start.
parameter_ranges
(dict): Dictionary of parameter ranges. These parameter ranges can be one of three types: Continuous, Integer, or Categorical.
static_hyperparameters
(dict): Hyperparameters for model training. These hyperparameters remain unchanged across all of the training jobs for the hyperparameter tuning job. The hyperparameters are made accessible as a dictionary for the training code on SageMaker.
input_mode
(str): The input mode that the algorithm supports. Valid modes:
'File' - Amazon SageMaker copies the training dataset from the S3 location to a directory in the Docker container.
'Pipe' - Amazon SageMaker streams data directly from S3 to the container via a Unix-named pipe.
metric_definitions
(list[dict]): A list of dictionaries that defines the metric(s) used to evaluate the training jobs. Each dictionary contains two keys: 'Name' for the name of the metric, and 'Regex' for the regular expression used to extract the metric from the logs. This should be defined only for jobs that don't use an Amazon algorithm.
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. You must grant sufficient permissions to this role.
input_config
(list): A list of Channel objects. Each channel is a named input source. Please refer to the format details described: https://botocore.readthedocs.io/en/latest/reference/services/sagemaker.html#SageMaker.Client.create_training_job
output_config
(dict): The S3 URI where you want to store the training results and optional KMS key ID.
resource_config
(dict): Contains values for ResourceConfig:
instance_count (int): Number of EC2 instances to use for training. The key in resource_config is 'InstanceCount'.
instance_type (str): Type of EC2 instance to use for training, for example, 'ml.c4.xlarge'. The key in resource_config is 'InstanceType'.
stop_condition
(dict): When training should finish, e.g. “MaxRuntimeInSeconds“.
tags
(list[dict]): List of tags for labeling the tuning job. For more, see https://docs.aws.amazon.com/sagemaker/latest/dg/API_Tag.html.
warm_start_config
(dict): Configuration defining the type of warm start and other required configurations.
enable_network_isolation
(bool): Specifies whether to isolate the training container (Default: FALSE).
image_uri
(str): Docker image containing training code.
algorithm_arn
(str): Resource ARN for training algorithm created on or subscribed from AWS Marketplace (Default: NULL).
early_stopping_type
(str): Specifies whether early stopping is enabled for the job. Can be either 'Auto' or 'Off'. If set to 'Off', early stopping will not be attempted. If set to 'Auto', early stopping of some training jobs may happen, but is not guaranteed to.
encrypt_inter_container_traffic
(bool): Specifies whether traffic between training containers is encrypted for the training jobs started for this hyperparameter tuning job (Default: FALSE).
vpc_config
(dict): Contains values for VpcConfig (default: None):
subnets (list[str]): List of subnet ids. The key in vpc_config is 'Subnets'.
security_group_ids (list[str]): List of security group ids. The key in vpc_config is 'SecurityGroupIds'.
use_spot_instances
(bool): Whether to use spot instances for training.
checkpoint_s3_uri
(str): The S3 URI in which to persist checkpoints that the algorithm persists (if any) during training. (Default: NULL).
checkpoint_local_path
(str): The local path that the algorithm writes its checkpoints to. SageMaker will persist all files under this path to 'checkpoint_s3_uri' continually during training. On job startup the reverse happens: data from the S3 location is downloaded to this path before the algorithm is started. If the path is unset then SageMaker assumes the checkpoints will be provided under '/opt/ml/checkpoints/'. (Default: NULL).
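The three range types for `parameter_ranges` can be written as nested R lists. The sketch below assumes the key names of the ParameterRanges structure in the CreateHyperParameterTuningJob request, where bounds are passed as strings; the hyperparameter names and values are placeholders.

```r
parameter_ranges <- list(
  ContinuousParameterRanges = list(
    list(Name = "eta", MinValue = "0.01", MaxValue = "0.3", ScalingType = "Auto")
  ),
  IntegerParameterRanges = list(
    list(Name = "max_depth", MinValue = "3", MaxValue = "10", ScalingType = "Auto")
  ),
  CategoricalParameterRanges = list(
    list(Name = "booster", Values = list("gbtree", "dart"))
  )
)

# Collect the tuned hyperparameter names across all three range types.
tuned_names <- unlist(lapply(parameter_ranges, function(ranges) {
  vapply(ranges, function(r) r$Name, character(1))
}), use.names = FALSE)
print(tuned_names)
```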
create_tuning_job()
Create an Amazon SageMaker hyperparameter tuning job. This method supports creating tuning jobs with single or multiple training algorithms (estimators), while the “tune()“ method above only supports creating tuning jobs with single training algorithm.
Session$create_tuning_job( job_name, tuning_config, training_config = NULL, training_config_list = NULL, warm_start_config = NULL, tags = NULL )
job_name
(str): Name of the tuning job being created.
tuning_config
(dict): Configuration to launch the tuning job.
training_config
(dict): Configuration to launch training jobs under the tuning job using a single algorithm.
training_config_list
(list[dict]): A list of configurations to launch training jobs under the tuning job using one or multiple algorithms. Either training_config or training_config_list should be provided, but not both.
warm_start_config
(dict): Configuration defining the type of warm start and other required configurations.
tags
(list[dict]): List of tags for labeling the tuning job. For more, see https://docs.aws.amazon.com/sagemaker/latest/dg/API_Tag.html.
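The "either one, but not both" rule for `training_config` and `training_config_list` can be sketched as a simple guard. `check_training_configs()` is a hypothetical helper, not the package's own check.

```r
# Hypothetical guard: exactly one of the two arguments must be supplied.
check_training_configs <- function(training_config = NULL,
                                   training_config_list = NULL) {
  if (is.null(training_config) == is.null(training_config_list)) {
    stop("Provide exactly one of training_config or training_config_list")
  }
  invisible(TRUE)
}

single_ok <- check_training_configs(training_config = list(RoleArn = "role"))
both <- try(
  check_training_configs(list(RoleArn = "role"), list(list(RoleArn = "role"))),
  silent = TRUE
)
```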
describe_tuning_job()
Calls the DescribeHyperParameterTuningJob API for the given job name and returns the response.
Session$describe_tuning_job(job_name)
job_name
(str): The name of the hyperparameter tuning job to describe.
dict: A dictionary response with the hyperparameter tuning job description.
stop_tuning_job()
Stop the Amazon SageMaker hyperparameter tuning job with the specified name.
Session$stop_tuning_job(name)
name
(str): Name of the Amazon SageMaker hyperparameter tuning job.
transform()
Create an Amazon SageMaker transform job.
Session$transform( job_name = NULL, model_name = NULL, strategy = NULL, max_concurrent_transforms = NULL, max_payload = NULL, env = NULL, input_config = NULL, output_config = NULL, resource_config = NULL, experiment_config = NULL, tags = NULL, data_processing = NULL, model_client_config = NULL )
job_name
(str): Name of the transform job being created.
model_name
(str): Name of the SageMaker model being used for the transform job.
strategy
(str): The strategy used to decide how to batch records in a single request. Possible values are 'MultiRecord' and 'SingleRecord'.
max_concurrent_transforms
(int): The maximum number of HTTP requests to be made to each individual transform container at one time.
max_payload
(int): Maximum size of the payload in a single HTTP request to the container in MB.
env
(dict): Environment variables to be set for use during the transform job.
input_config
(dict): A dictionary describing the input data (and its location) for the job.
output_config
(dict): A dictionary describing the output location for the job.
resource_config
(dict): A dictionary describing the resources to complete the job.
experiment_config
(dict): A dictionary describing the experiment configuration for the job. Dictionary contains three optional keys, 'ExperimentName', 'TrialName', and 'TrialComponentDisplayName'.
tags
(list[dict]): List of tags for labeling a transform job. For more, see https://docs.aws.amazon.com/sagemaker/latest/dg/API_Tag.html.
data_processing
(dict): A dictionary describing config for combining the input data and transformed data.
model_client_config
(dict): A dictionary describing the model configuration for the job. Dictionary contains two optional keys, 'InvocationsTimeoutInSeconds', and 'InvocationsMaxRetries'.
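A sketch of the configuration lists for a transform job, assuming the shapes of the CreateTransformJob request. All S3 locations and values are placeholders, and the strategy check is illustrative only.

```r
input_config <- list(
  DataSource = list(
    S3DataSource = list(S3DataType = "S3Prefix", S3Uri = "s3://my-bucket/batch-in/")
  ),
  ContentType = "text/csv",
  SplitType = "Line"
)
output_config <- list(S3OutputPath = "s3://my-bucket/batch-out/")
resource_config <- list(InstanceCount = 1L, InstanceType = "ml.m5.xlarge")
model_client_config <- list(
  InvocationsTimeoutInSeconds = 600L,
  InvocationsMaxRetries = 1L
)

# The documented batching strategies.
strategy <- "MultiRecord"
stopifnot(strategy %in% c("MultiRecord", "SingleRecord"))
```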
create_model()
Create an Amazon SageMaker “Model“. Specify the S3 location of the model artifacts and Docker image containing the inference code. Amazon SageMaker uses this information to deploy the model in Amazon SageMaker. This method can also be used to create a Model for an Inference Pipeline if you pass the list of container definitions through the container_defs parameter.
Session$create_model( name, role, container_defs = NULL, vpc_config = NULL, enable_network_isolation = FALSE, primary_container = NULL, tags = NULL )
name
(str): Name of the Amazon SageMaker “Model“ to create.
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. You must grant sufficient permissions to this role.
container_defs
(list[dict[str, str]] or [dict[str, str]]): A single container definition or a list of container definitions which will be invoked sequentially while performing the prediction. If the list contains only one container, then it'll be passed to SageMaker Hosting as the “PrimaryContainer“ and otherwise, it'll be passed as “Containers“. You can also specify the return value of “sagemaker.get_container_def()“ or “sagemaker.pipeline_container_def()“, which are used to create more advanced container configurations, including model containers which need artifacts from S3.
vpc_config
(dict[str, list[str]]): The VpcConfig set on the model (default: None)
'Subnets' (list[str]): List of subnet ids.
'SecurityGroupIds' (list[str]): List of security group ids.
enable_network_isolation
(bool): Whether the model requires network isolation or not.
primary_container
(str or dict[str, str]): Docker image which defines the inference code. You can also specify the return value of “sagemaker.container_def()“, which is used to create more advanced container configurations, including model containers which need artifacts from S3. This field is deprecated, please use container_defs instead.
tags
(list[list[str, str]]): Optional. The list of tags to add to the model. Example: tags = list(list('Key'= 'tagname', 'Value'= 'tagvalue')). For more information about tags, see https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.add_tags
str: Name of the Amazon SageMaker “Model“ created.
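As a rough illustration of the argument shapes described above, the following sketch builds a single container definition, a VPC configuration, and a tag list as plain R lists. All resource names, the account ID, and the environment variable are hypothetical. The container keys (Image, ModelDataUrl, Environment) follow the SageMaker CreateModel API and are assumed to be passed through unchanged. The final call is left commented out because it needs a configured session object.

```r
# Hypothetical values throughout; replace with your own resources.
container <- list(
  Image = "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-inference:latest",
  ModelDataUrl = "s3://my-bucket/model/model.tar.gz",
  Environment = list(LOG_LEVEL = "info")
)

# Keys as documented for vpc_config.
vpc_config <- list(
  Subnets = list("subnet-0123456789abcdef0"),
  SecurityGroupIds = list("sg-0123456789abcdef0")
)

# Same shape as the documented tags example.
tags <- list(list(Key = "project", Value = "demo"))

create_model_args <- list(
  name = "demo-model",
  role = "MySageMakerRole",
  container_defs = container,
  vpc_config = vpc_config,
  tags = tags
)

# With a configured session (assumed to exist as `session`):
# model_name <- do.call(session$create_model, create_model_args)
```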
create_model_from_job()
Create an Amazon SageMaker “Model“ from a SageMaker Training Job.
Session$create_model_from_job( training_job_name, name = NULL, role = NULL, image_uri = NULL, model_data_url = NULL, env = NULL, enable_network_isolation = FALSE, vpc_config_override = "VPC_CONFIG_DEFAULT", tags = NULL )
training_job_name
(str): The Amazon SageMaker Training Job name.
name
(str): The name of the SageMaker “Model“ to create (default: None). If not specified, the training job name is used.
role
(str): The “ExecutionRoleArn“ IAM Role ARN for the “Model“, specified either by an IAM role name or role ARN. If None, the “RoleArn“ from the SageMaker Training Job will be used.
image_uri
(str): The Docker image reference (default: None). If None, it defaults to the Training Image in “training_job_name“.
model_data_url
(str): S3 location of the model data (default: None). If None, defaults to the “ModelS3Artifacts“ of “training_job_name“.
env
(dict[string,string]): Model environment variables (default: NULL).
enable_network_isolation
(bool): Whether the model requires network isolation or not.
vpc_config_override
(dict[str, list[str]]): Optional override for VpcConfig set on the model. Default: use VpcConfig from training job.
'Subnets' (list[str]) List of subnet ids.
'SecurityGroupIds' (list[str]) List of security group ids.
tags
(list[list[str, str]]): Optional. The list of tags to add to the model. For more, see https://docs.aws.amazon.com/sagemaker/latest/dg/API_Tag.html.
str: The name of the created “Model“.
create_model_package_from_algorithm()
Create a SageMaker Model Package from the results of training with an Algorithm Package
Session$create_model_package_from_algorithm( name, description = NULL, algorithm_arn = NULL, model_data = NULL )
name
(str): ModelPackage name
description
(str): Model Package description
algorithm_arn
(str): arn or name of the algorithm used for training.
model_data
(str): s3 URI to the model artifacts produced by training
create_model_package_from_containers()
Get request dictionary for CreateModelPackage API.
Session$create_model_package_from_containers( containers = NULL, content_types = NULL, response_types = NULL, inference_instances = NULL, transform_instances = NULL, model_package_name = NULL, model_package_group_name = NULL, model_metrics = NULL, metadata_properties = NULL, marketplace_cert = FALSE, approval_status = "PendingManualApproval", description = NULL, drift_check_baselines = NULL )
containers
(list): A list of inference containers that can be used for inference specifications of Model Package (default: None).
content_types
(list): The supported MIME types for the input data (default: None).
response_types
(list): The supported MIME types for the output data (default: None).
inference_instances
(list): A list of the instance types that are used to generate inferences in real-time (default: None).
transform_instances
(list): A list of the instance types on which a transformation job can be run or on which an endpoint can be deployed (default: None).
model_package_name
(str): Model Package name, exclusive to 'model_package_group_name', using 'model_package_name' makes the Model Package un-versioned (default: None).
model_package_group_name
(str): Model Package Group name, exclusive to 'model_package_name', using 'model_package_group_name' makes the Model Package versioned (default: None).
model_metrics
(ModelMetrics): ModelMetrics object (default: None).
metadata_properties
(MetadataProperties): MetadataProperties object (default: None)
marketplace_cert
(bool): A boolean value indicating if the Model Package is certified for AWS Marketplace (default: False).
approval_status
(str): Model Approval Status, values can be "Approved", "Rejected", or "PendingManualApproval" (default: "PendingManualApproval").
description
(str): Model Package description (default: None).
drift_check_baselines
(DriftCheckBaselines): DriftCheckBaselines object (default: None).
wait_for_model_package()
Wait for an Amazon SageMaker model package to complete.
Session$wait_for_model_package(model_package_name, poll = 5)
model_package_name
(str): Name of the model package to wait for.
poll
(int): Polling interval in seconds (default: 5).
dict: Return value from the “DescribeModelPackage“ API.
describe_model()
Calls the DescribeModel API for the given model name.
Session$describe_model(name)
name
(str): The name of the SageMaker model.
dict: A dictionary response with the model description.
create_endpoint_config()
Create an Amazon SageMaker endpoint configuration. The endpoint configuration identifies the Amazon SageMaker model (created using the “CreateModel“ API) and the hardware configuration on which to deploy the model. Provide this endpoint configuration to the “CreateEndpoint“ API, which then launches the hardware and deploys the model.
Session$create_endpoint_config( name, model_name, initial_instance_count, instance_type, accelerator_type = NULL, tags = NULL, kms_key = NULL, data_capture_config_dict = NULL )
name
(str): Name of the Amazon SageMaker endpoint configuration to create.
model_name
(str): Name of the Amazon SageMaker “Model“.
initial_instance_count
(int): Minimum number of EC2 instances to launch. The actual number of active instances for an endpoint at any given time varies due to autoscaling.
instance_type
(str): Type of EC2 instance to launch, for example, 'ml.c4.xlarge'.
accelerator_type
(str): Type of Elastic Inference accelerator to attach to the instance. For example, 'ml.eia1.medium'. For more information: https://docs.aws.amazon.com/sagemaker/latest/dg/ei.html
tags
(list[list[str, str]]): Optional. The list of tags to add to the endpoint config.
Example: tags = list(list('Key'= 'tagname', 'Value'= 'tagvalue'))
For more information about tags, see
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.add_tags
kms_key
(str): The KMS key that is used to encrypt the data on the storage volume attached to the instance hosting the endpoint.
data_capture_config_dict
(dict): Specifies configuration related to Endpoint data capture for use with Amazon SageMaker Model Monitoring. Default: None.
str: Name of the endpoint configuration created.
create_endpoint_config_from_existing()
Create an Amazon SageMaker endpoint configuration from an existing one, updating any values that were passed in. The endpoint configuration identifies the Amazon SageMaker model (created using the “CreateModel“ API) and the hardware configuration on which to deploy the model. Provide this endpoint configuration to the “CreateEndpoint“ API, which then launches the hardware and deploys the model.
Session$create_endpoint_config_from_existing( existing_config_name, new_config_name, new_tags = NULL, new_kms_key = NULL, new_data_capture_config_list = NULL, new_production_variants = NULL )
existing_config_name
(str): Name of the existing Amazon SageMaker endpoint configuration.
new_config_name
(str): Name of the Amazon SageMaker endpoint configuration to create.
new_tags
(List[list[str, str]]): Optional. The list of tags to add to the endpoint config. If not specified, the tags of the existing endpoint configuration are used. If any of the existing tags are reserved AWS ones (i.e. begin with "aws"), they are not carried over to the new endpoint configuration.
new_kms_key
(str): The KMS key that is used to encrypt the data on the storage volume attached to the instance hosting the endpoint (default: None). If not specified, the KMS key of the existing endpoint configuration is used.
new_data_capture_config_list
(dict): Specifies configuration related to Endpoint data capture for use with Amazon SageMaker Model Monitoring (default: None). If not specified, the data capture configuration of the existing endpoint configuration is used.
new_production_variants
(list[dict]): The configuration for which model(s) to host and the resources to deploy for hosting the model(s). If not specified, the “ProductionVariants“ of the existing endpoint configuration is used.
str: Name of the endpoint configuration created.
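The rule for new_tags says that reserved AWS tags are not carried over to the new endpoint configuration. A minimal sketch of that filtering step, assuming tags are lists with a Key element and that "reserved" means the key begins with "aws" as stated above (drop_reserved_tags is a hypothetical helper, not part of the package):

```r
# Hypothetical helper: keep only tags whose key does not start with "aws".
drop_reserved_tags <- function(tags) {
  Filter(function(tag) !startsWith(tag$Key, "aws"), tags)
}

existing_tags <- list(
  list(Key = "aws:cloudformation:stack-name", Value = "my-stack"),
  list(Key = "team", Value = "ml-platform")
)

carried_over <- drop_reserved_tags(existing_tags)
```

Only the "team" tag would be carried over in this example.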
create_endpoint()
Create an Amazon SageMaker “Endpoint“ according to the endpoint configuration specified in the request. Once the “Endpoint“ is created, client applications can send requests to obtain inferences. The endpoint configuration is created using the “CreateEndpointConfig“ API.
Session$create_endpoint(endpoint_name, config_name, tags = NULL, wait = TRUE)
endpoint_name
(str): Name of the Amazon SageMaker “Endpoint“ being created.
config_name
(str): Name of the Amazon SageMaker endpoint configuration to deploy.
tags
(list[list[str, str]]): Optional. The list of tags to add to the endpoint.
wait
(bool): Whether to wait for the endpoint deployment to complete before returning (Default: TRUE).
str: Name of the Amazon SageMaker “Endpoint“ created.
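The three calls documented above (create_model, create_endpoint_config, create_endpoint) are typically chained. The sketch below shows one way to do that using only the argument names from the signatures above. deploy_sketch is a hypothetical helper, and it assumes each method returns the created resource name as documented; session is any object exposing these methods.

```r
# Hypothetical helper chaining the three documented Session methods.
deploy_sketch <- function(session, name, role, container_def, instance_type,
                          initial_instance_count = 1) {
  # 1. Register the model (artifacts + inference image).
  model_name <- session$create_model(
    name = name, role = role, container_defs = container_def
  )
  # 2. Describe the hardware on which to host it.
  config_name <- session$create_endpoint_config(
    name = name,
    model_name = model_name,
    initial_instance_count = initial_instance_count,
    instance_type = instance_type
  )
  # 3. Launch the endpoint and block until deployment finishes.
  session$create_endpoint(
    endpoint_name = name, config_name = config_name, wait = TRUE
  )
}
```

Reusing one name for the model, the configuration, and the endpoint is a simplification made for this sketch.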
update_endpoint()
Update an Amazon SageMaker “Endpoint“ according to the endpoint configuration specified in the request
Session$update_endpoint(endpoint_name, endpoint_config_name, wait = TRUE)
endpoint_name
(str): Name of the Amazon SageMaker “Endpoint“ to update.
endpoint_config_name
(str): Name of the Amazon SageMaker endpoint configuration to deploy.
wait
(bool): Whether to wait for the endpoint deployment to complete before returning (Default: TRUE).
str: Name of the Amazon SageMaker “Endpoint“ being updated.
delete_endpoint()
Delete an Amazon SageMaker “Endpoint“.
Session$delete_endpoint(endpoint_name)
endpoint_name
(str): Name of the Amazon SageMaker “Endpoint“ to delete.
delete_endpoint_config()
Delete an Amazon SageMaker endpoint configuration.
Session$delete_endpoint_config(endpoint_config_name)
endpoint_config_name
(str): Name of the Amazon SageMaker endpoint configuration to delete.
delete_model()
Delete an Amazon SageMaker Model.
Session$delete_model(model_name)
model_name
(str): Name of the Amazon SageMaker model to delete.
list_tags()
List the tags given an Amazon Resource Name
Session$list_tags(resource_arn, max_results = 50)
resource_arn
(str): The Amazon Resource Name (ARN) for which to get the tags list.
max_results
(int): The maximum number of results to include in a single page. This method takes care of that abstraction and returns a full list.
wait_for_job()
Wait for an Amazon SageMaker training job to complete.
Session$wait_for_job(job, poll = 5)
job
(str): Name of the training job to wait for.
poll
(int): Polling interval in seconds (default: 5).
(dict): Return value from the “DescribeTrainingJob“ API.
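The wait_for_* methods share one pattern: call a describe API every poll seconds until the job leaves an in-progress state, then return the last response. A minimal base R sketch of that loop follows. The function name, the Status field, and the in-progress values are assumptions for illustration; real responses use job-specific fields such as TrainingJobStatus.

```r
# Hypothetical polling loop; `describe` is any zero-argument function
# returning a list with a `Status` element.
wait_for_status <- function(describe, poll = 5,
                            in_progress = c("InProgress", "Stopping")) {
  repeat {
    description <- describe()
    if (!(description$Status %in% in_progress)) {
      return(description)
    }
    Sys.sleep(poll)
  }
}

# Demo with a fake describe function that finishes on the third call.
n_calls <- 0
fake_describe <- function() {
  n_calls <<- n_calls + 1
  list(Status = if (n_calls < 3) "InProgress" else "Completed")
}
result <- wait_for_status(fake_describe, poll = 0)
```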
wait_for_processing_job()
Wait for an Amazon SageMaker Processing job to complete.
Session$wait_for_processing_job(job, poll = 5)
job
(str): Name of the processing job to wait for.
poll
(int): Polling interval in seconds (Default: 5).
(dict): Return value from the “DescribeProcessingJob“ API.
wait_for_compilation_job()
Wait for an Amazon SageMaker Neo compilation job to complete.
Session$wait_for_compilation_job(job, poll = 5)
job
(str): Name of the compilation job to wait for.
poll
(int): Polling interval in seconds (Default: 5).
(dict): Return value from the “DescribeCompilationJob“ API.
wait_for_edge_packaging_job()
Wait for an Amazon SageMaker Edge packaging job to complete.
Session$wait_for_edge_packaging_job(job, poll = 5)
job
(str): Name of the edge packaging job to wait for.
poll
(int): Polling interval in seconds (default: 5).
(dict): Return value from the “DescribeEdgePackagingJob“ API.
wait_for_tuning_job()
Wait for an Amazon SageMaker hyperparameter tuning job to complete.
Session$wait_for_tuning_job(job, poll = 5)
job
(str): Name of the tuning job to wait for.
poll
(int): Polling interval in seconds (default: 5).
(dict): Return value from the “DescribeHyperParameterTuningJob“ API.
describe_transform_job()
Calls the DescribeTransformJob API for the given job name and returns the response.
Session$describe_transform_job(job_name)
job_name
(str): The name of the transform job to describe.
dict: A dictionary response with the transform job description.
wait_for_transform_job()
Wait for an Amazon SageMaker transform job to complete.
Session$wait_for_transform_job(job, poll = 5)
job
(str): Name of the transform job to wait for.
poll
(int): Polling interval in seconds (default: 5).
(dict): Return value from the “DescribeTransformJob“ API.
stop_transform_job()
Stop the Amazon SageMaker batch transform job with the specified name.
Session$stop_transform_job(name)
name
(str): Name of the Amazon SageMaker batch transform job.
wait_for_endpoint()
Wait for an Amazon SageMaker endpoint deployment to complete.
Session$wait_for_endpoint(endpoint, poll = 30)
endpoint
(str): Name of the “Endpoint“ to wait for.
poll
(int): Polling interval in seconds (Default: 30).
dict: Return value from the “DescribeEndpoint“ API.
endpoint_from_job()
Create an “Endpoint“ using the results of a successful training job. Specify the job name, Docker image containing the inference code, and hardware configuration to deploy the model. Internally, the API creates an Amazon SageMaker model (that describes the model artifacts and the Docker image containing inference code), an endpoint configuration (describing the hardware to deploy for hosting the model), and an “Endpoint“ (launches the EC2 instances and deploys the model on them). In response, the API returns the endpoint name to which you can send requests for inferences.
Session$endpoint_from_job( job_name, initial_instance_count, instance_type, deployment_image = NULL, name = NULL, role = NULL, wait = TRUE, model_environment_vars = NULL, vpc_config_override = "VPC_CONFIG_DEFAULT", accelerator_type = NULL, data_capture_config = NULL )
job_name
(str): Name of the training job to deploy the results of.
initial_instance_count
(int): Minimum number of EC2 instances to launch. The actual number of active instances for an endpoint at any given time varies due to autoscaling.
instance_type
(str): Type of EC2 instance to deploy to an endpoint for prediction, for example, 'ml.c4.xlarge'.
deployment_image
(str): The Docker image which defines the inference code to be used as the entry point for accepting prediction requests. If not specified, uses the image used for the training job.
name
(str): Name of the “Endpoint“ to create. If not specified, uses the training job name.
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. You must grant sufficient permissions to this role.
wait
(bool): Whether to wait for the endpoint deployment to complete before returning (Default: True).
model_environment_vars
(dict[str, str]): Environment variables to set on the model container (Default: NULL).
vpc_config_override
(dict[str, list[str]]): Overrides VpcConfig set on the model. Default: use VpcConfig from training job.
'Subnets' (list[str]): List of subnet ids.
'SecurityGroupIds' (list[str]): List of security group ids.
accelerator_type
(str): Type of Elastic Inference accelerator to attach to the instance. For example, 'ml.eia1.medium'. For more information: https://docs.aws.amazon.com/sagemaker/latest/dg/ei.html
data_capture_config
(DataCaptureConfig): Specifies configuration related to Endpoint data capture for use with Amazon SageMaker Model Monitoring. Default: None.
str: Name of the “Endpoint“ that is created.
endpoint_from_model_data()
Create and deploy to an “Endpoint“ using existing model data stored in S3.
Session$endpoint_from_model_data( model_s3_location, deployment_image, initial_instance_count, instance_type, name = NULL, role = NULL, wait = TRUE, model_environment_vars = NULL, model_vpc_config = NULL, accelerator_type = NULL, data_capture_config = NULL )
model_s3_location
(str): S3 URI of the model artifacts to use for the endpoint.
deployment_image
(str): The Docker image which defines the runtime code to be used as the entry point for accepting prediction requests.
initial_instance_count
(int): Minimum number of EC2 instances to launch. The actual number of active instances for an endpoint at any given time varies due to autoscaling.
instance_type
(str): Type of EC2 instance to deploy to an endpoint for prediction, e.g. 'ml.c4.xlarge'.
name
(str): Name of the “Endpoint“ to create. If not specified, uses a name generated by combining the image name with a timestamp.
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. You must grant sufficient permissions to this role.
wait
(bool): Whether to wait for the endpoint deployment to complete before returning (Default: True).
model_environment_vars
(dict[str, str]): Environment variables to set on the model container (Default: NULL).
model_vpc_config
(dict[str, list[str]]): The VpcConfig set on the model (default: None)
'Subnets' (list[str]): List of subnet ids.
'SecurityGroupIds' (list[str]): List of security group ids.
accelerator_type
(str): Type of Elastic Inference accelerator to attach to the instance. For example, 'ml.eia1.medium'. For more information: https://docs.aws.amazon.com/sagemaker/latest/dg/ei.html
data_capture_config
(DataCaptureConfig): Specifies configuration related to Endpoint data capture for use with Amazon SageMaker Model Monitoring. Default: None.
str: Name of the “Endpoint“ that is created.
endpoint_from_production_variants()
Create a SageMaker “Endpoint“ from a list of production variants.
Session$endpoint_from_production_variants( name, production_variants, tags = NULL, kms_key = NULL, wait = TRUE, data_capture_config_list = NULL )
name
(str): The name of the “Endpoint“ to create.
production_variants
(list[dict[str, str]]): The list of production variants to deploy.
tags
(list[dict[str, str]]): A list of key-value pairs for tagging the endpoint (Default: None).
kms_key
(str): The KMS key that is used to encrypt the data on the storage volume attached to the instance hosting the endpoint.
wait
(bool): Whether to wait for the endpoint deployment to complete before returning (Default: True).
data_capture_config_list
(list): Specifies configuration related to Endpoint data capture for use with Amazon SageMaker Model Monitoring. Default: None.
str: The name of the created “Endpoint“.
expand_role()
Expand an IAM role name into an ARN. If the role is already in the form of an ARN, then the role is simply returned. Otherwise we retrieve the full ARN and return it.
Session$expand_role(role)
role
(str): An AWS IAM role (either name or full ARN).
str: The corresponding AWS IAM role ARN.
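The branching described for expand_role can be sketched in a few lines. This is an illustration only: expand_role_sketch and lookup_arn are hypothetical, the "arn:" prefix check is an assumed way of recognising an ARN, and the real method retrieves the ARN from AWS rather than from a supplied function.

```r
# Hypothetical sketch: return ARNs unchanged, otherwise resolve the name.
expand_role_sketch <- function(role, lookup_arn) {
  if (startsWith(role, "arn:")) {
    return(role)
  }
  lookup_arn(role)
}

# Fake lookup standing in for a call to IAM (account ID is made up).
fake_lookup <- function(name) {
  sprintf("arn:aws:iam::123456789012:role/%s", name)
}

from_name <- expand_role_sketch("MySageMakerRole", fake_lookup)
from_arn <- expand_role_sketch(from_name, fake_lookup)
```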
get_caller_identity_arn()
Returns the ARN of the user or role whose credentials are used to call the API.
Session$get_caller_identity_arn()
str: The ARN of the user or role.
logs_for_job()
Display the logs for a given training job, optionally tailing them until the job is complete. If the output is a tty or a Jupyter cell, it will be color-coded based on which instance the log entry is from.
Session$logs_for_job(job_name, wait = FALSE, poll = 10, log_type = "All")
job_name
(str): Name of the training job to display the logs for.
wait
(bool): Whether to keep looking for new log entries until the job completes (Default: False).
poll
(int): The interval in seconds between polling for new log entries and job completion (Default: 10).
log_type
(str): Type of logs to return from building sagemaker process
logs_for_processing_job()
Display the logs for a given processing job, optionally tailing them until the job is complete.
Session$logs_for_processing_job(job_name, wait = FALSE, poll = 10)
job_name
(str): Name of the processing job to display the logs for.
wait
(bool): Whether to keep looking for new log entries until the job completes (Default: False).
poll
(int): The interval in seconds between polling for new log entries and job completion (Default: 10).
logs_for_transform_job()
Display the logs for a given transform job, optionally tailing them until the job is complete. If the output is a tty or a Jupyter cell, it will be color-coded based on which instance the log entry is from.
Session$logs_for_transform_job(job_name, wait = FALSE, poll = 10)
job_name
(str): Name of the transform job to display the logs for.
wait
(bool): Whether to keep looking for new log entries until the job completes (Default: FALSE).
poll
(int): The interval in seconds between polling for new log entries and job completion (Default: 10).
delete_feature_group()
Deletes a FeatureGroup in the FeatureStore service.
Session$delete_feature_group(feature_group_name)
feature_group_name
(str): name of the feature group to be deleted.
create_feature_group()
Creates a FeatureGroup in the FeatureStore service.
Session$create_feature_group( feature_group_name, record_identifier_name, event_time_feature_name, feature_definitions, role_arn, online_store_config = NULL, offline_store_config = NULL, description = NULL, tags = NULL )
feature_group_name
(str): name of the FeatureGroup.
record_identifier_name
(str): name of the record identifier feature.
event_time_feature_name
(str): name of the event time feature.
feature_definitions
(Sequence[Dict[str, str]]): list of feature definitions.
role_arn
(str): ARN of the role that will be used to execute the API.
online_store_config
(Dict[str, str]): dict containing the configuration of the feature online store.
offline_store_config
(Dict[str, str]): dict containing the configuration of the feature offline store.
description
(str): description of the FeatureGroup.
tags
(List[Dict[str, str]]): list of tags for labeling a FeatureGroup.
Response dict from service.
describe_feature_group()
Describe a FeatureGroup by name in FeatureStore service.
Session$describe_feature_group(feature_group_name, next_token = NULL)
feature_group_name
(str): name of the FeatureGroup to describe.
next_token
(str): next_token to get next page of features.
Response dict from service.
start_query_execution()
Start Athena query execution.
Session$start_query_execution( catalog, database, query_string, output_location, kms_key = NULL )
catalog
(str): name of the data catalog.
database
(str): name of the data catalog database.
query_string
(str): SQL expression.
output_location
(str): S3 location of the output file.
kms_key
(str): KMS key ID that will be used to encrypt the result, if given.
Response dict from the service.
get_query_execution()
Get execution status of the Athena query.
Session$get_query_execution(query_execution_id)
query_execution_id
(str): execution ID of the Athena query.
wait_for_athena_query()
Wait for Athena query to finish.
Session$wait_for_athena_query(query_execution_id, poll = 5)
query_execution_id
(str): execution ID of the Athena query.
poll
(int): time interval to poll get_query_execution API.
download_athena_query_result()
Download query result file from S3.
Session$download_athena_query_result( bucket, prefix, query_execution_id, filename )
bucket
(str): name of the S3 bucket where the result file is stored.
prefix
(str): S3 prefix of the result file.
query_execution_id
(str): execution ID of the Athena query.
filename
(str): name of the downloaded file.
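The four Athena methods above are usually used in sequence: start a query, wait for it, then download the result file. The sketch below chains them using the documented argument names. run_athena_query is a hypothetical helper, and it assumes the start response exposes the ID as QueryExecutionId (the field name used by the Athena API) and that the output location is built from the same bucket and prefix used for the download.

```r
# Hypothetical helper chaining the documented Athena methods.
run_athena_query <- function(session, catalog, database, query_string,
                             bucket, prefix, filename) {
  started <- session$start_query_execution(
    catalog = catalog,
    database = database,
    query_string = query_string,
    output_location = sprintf("s3://%s/%s", bucket, prefix)
  )
  # Assumed response field, following the Athena API.
  query_id <- started$QueryExecutionId

  session$wait_for_athena_query(query_id, poll = 5)
  session$download_athena_query_result(
    bucket = bucket,
    prefix = prefix,
    query_execution_id = query_id,
    filename = filename
  )
  invisible(query_id)
}
```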
account_id()
Get the AWS account id of the caller.
Session$account_id()
AWS account ID.
help()
Return class documentation
Session$help()
format()
format class
Session$format()
clone()
The objects of this class are cloneable with this method.
Session$clone(deep = FALSE)
deep
Whether to make a deep clone.
Other Session: LocalSession, PawsSession
For configuring channel shuffling using a seed. For more detail, see the AWS documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/API_ShuffleConfig.html
seed
value used to seed the shuffled sequence.
new()
Create a ShuffleConfig.
ShuffleConfig$new(seed)
seed
(numeric): value used to seed the shuffled sequence.
format()
format class
ShuffleConfig$format()
clone()
The objects of this class are cloneable with this method.
ShuffleConfig$clone(deep = FALSE)
deep
Whether to make a deep clone.
If a single record does not fit within the payload specified it will throw a RuntimeError.
sagemaker.core::BatchStrategy -> SingleRecordStrategy
pad()
Group together as many records as possible to fit in the specified size. This SingleRecordStrategy will not group any record and will return them one by one as long as they are within the maximum size.
SingleRecordStrategy$pad(file, size = 6)
file
(str): file path to read the records from.
size
(int): maximum size in MB that each group of records will be fitted to. Passing 0 means unlimited size.
generator of records
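The single-record behaviour can be pictured with a small base R sketch. It works on an in-memory character vector rather than a file, returns a list rather than a generator, and assumes 1 MB means 1024^2 bytes. pad_single_records is a hypothetical name, not the package implementation.

```r
# Hypothetical sketch: emit records one by one, failing if any is too large.
pad_single_records <- function(records, size = 6) {
  max_bytes <- size * 1024^2  # assumption: 1 MB = 1024^2 bytes
  sizes <- nchar(records, type = "bytes")
  # size = 0 means unlimited, so the check is skipped.
  if (size > 0 && any(sizes > max_bytes)) {
    stop("Record is larger than the maximum payload size of ", size, " MB")
  }
  as.list(records)
}

batches <- pad_single_records(c("1,2,3", "4,5,6"), size = 6)
```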
clone()
The objects of this class are cloneable with this method.
SingleRecordStrategy$clone(deep = FALSE)
deep
Whether to make a deep clone.
Enum class for special supported filter keys.
SpecialSupportedFilterKeys
An object of class SpecialSupportedFilterKeys (inherits from Enum, environment) of length 3.
We need this function because the AWS SDK does not yet honor the “region_name“ parameter when creating an AWS STS client. For the list of regional endpoints, see https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_enable-regions.html#id_credentials_region-endpoints.
sts_regional_endpoint(region)
region |
(str): AWS region name |
str: AWS STS regional endpoint
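For the standard AWS partition, regional STS endpoints follow the pattern https://sts.<region>.amazonaws.com. A minimal sketch of building such a URL is shown here. sts_regional_endpoint_sketch is a hypothetical name, and the sketch deliberately ignores other partitions (for example the China regions), which use a different domain.

```r
# Hypothetical sketch for the standard "aws" partition only.
sts_regional_endpoint_sketch <- function(region) {
  sprintf("https://sts.%s.amazonaws.com", region)
}

endpoint <- sts_regional_endpoint_sketch("eu-west-1")
```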
Other sagemaker_utils: .aws_partition(), .download_files_under_prefix(), base_from_name(), base_name_from_image(), build_dict(), common_variables, create_tar_file(), download_file(), download_folder(), get_config_value(), get_short_version(), name_from_base(), name_from_image(), regional_hostname(), repack_model(), retries(), sagemaker_short_timestamp(), sagemaker_timestamp(), secondary_training_status_changed(), secondary_training_status_message(), unique_name_from_base()
Returns True if “tag_key“ is in the “tag_array“.
tag_key_in_array(tag_key, tag_array)
tag_key |
(str): the tag key to check if it's already in the “tag_array“. |
tag_array |
(List[Dict[str, str]]): array of tags to check for “tag_key“. |
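The check can be expressed in one line of base R. The following is a sketch of the idea, not the package source; it assumes each tag is a list with a Key element, as in the tag examples elsewhere in this manual.

```r
# Hypothetical re-implementation: TRUE if any tag has the given key.
tag_key_in_array_sketch <- function(tag_key, tag_array) {
  any(vapply(tag_array, function(tag) identical(tag$Key, tag_key), logical(1)))
}

tags <- list(
  list(Key = "project", Value = "demo"),
  list(Key = "owner", Value = "data-science")
)

has_owner <- tag_key_in_array_sketch("owner", tags)
has_stage <- tag_key_in_array_sketch("stage", tags)
```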
Package source files and upload a compressed tar file to S3. The S3 location will be “s3://<bucket>/s3_key_prefix/sourcedir.tar.gz“. If directory is an S3 URI, an UploadedCode object will be returned, but nothing will be uploaded to S3 (this allows reuse of code already in S3). If directory is None, the script will be added to the archive at “./<basename of script>“. If directory is not None, the (recursive) contents of the directory will be added to the archive. directory is treated as the base path of the archive, and the script name is assumed to be a filename or relative path inside the directory.
tar_and_upload_dir( sagemaker_session, bucket, s3_key_prefix, script, directory = NULL, dependencies = NULL, kms_key = NULL )
sagemaker_session |
(sagemaker.Session): Session used to access S3. |
bucket |
(str): S3 bucket to which the compressed file is uploaded. |
s3_key_prefix |
(str): Prefix for the S3 key. |
script |
(str): Script filename or path. |
directory |
(str): Optional. Directory containing the source file. If it starts with "s3://", no action is taken. |
dependencies |
(List[str]): Optional. A list of paths to directories (absolute or relative) containing additional libraries that will be copied into /opt/ml/lib |
kms_key |
(str): Optional. KMS key ID used to upload objects to the bucket (default: None). |
sagemaker.fw_utils.UserCode: An object with the S3 bucket and key (S3 prefix) and script name.
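Two small pieces of the behaviour described for tar_and_upload_dir lend themselves to a sketch: composing the upload location and detecting a directory that is already an S3 URI. Both helpers below are hypothetical illustrations written in base R; they do not archive or upload anything.

```r
# Hypothetical helper: where the archive would be uploaded.
source_archive_uri <- function(bucket, s3_key_prefix) {
  sprintf("s3://%s/%s/sourcedir.tar.gz", bucket, s3_key_prefix)
}

# Hypothetical helper: TRUE when `directory` already points at S3,
# in which case nothing needs to be uploaded.
is_s3_uri <- function(directory) {
  !is.null(directory) && startsWith(tolower(directory), "s3://")
}

uri <- source_archive_uri("my-bucket", "jobs/demo/source")
```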
Amazon SageMaker channel configurations for S3 data sources.
config
A SageMaker “DataSource“ referencing a SageMaker “S3DataSource“.
new()
See AWS documentation on the “CreateTrainingJob“ API for more details on the parameters.
TrainingInput$new( s3_data, distribution = NULL, compression = NULL, content_type = NULL, record_wrapping = NULL, s3_data_type = "S3Prefix", input_mode = NULL, attribute_names = NULL, target_attribute_name = NULL, shuffle_config = NULL )
s3_data
(str): Defines the location of s3 data to train on.
distribution
(str): Valid values: 'FullyReplicated', 'ShardedByS3Key' (default: 'FullyReplicated').
compression
(str): Valid values: 'Gzip', None (default: None). This is used only in Pipe input mode.
content_type
(str): MIME type of the input data (default: None).
record_wrapping
(str): Valid values: 'RecordIO' (default: None).
s3_data_type
(str): Valid values: 'S3Prefix', 'ManifestFile', 'AugmentedManifestFile'. If 'S3Prefix', “s3_data“ defines a prefix of s3 objects to train on. All objects with s3 keys beginning with “s3_data“ will be used to train. If 'ManifestFile' or 'AugmentedManifestFile', then “s3_data“ defines a single S3 manifest file or augmented manifest file (respectively), listing the S3 data to train on. Both the ManifestFile and AugmentedManifestFile formats are described in the SageMaker API documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/API_S3DataSource.html
input_mode
(str): Optional override for this channel's input mode (default: None). By default, channels will use the input mode defined on “sagemaker.estimator.EstimatorBase.input_mode“, but they will ignore that setting if this parameter is set. * None - Amazon SageMaker will use the input mode specified in the “Estimator“ * 'File' - Amazon SageMaker copies the training dataset from the S3 location to a local directory. * 'Pipe' - Amazon SageMaker streams data directly from S3 to the container via a Unix-named pipe.
attribute_names
(list[str]): A list of one or more attribute names to use that are found in a specified AugmentedManifestFile.
target_attribute_name
(str): The name of the attribute that will be predicted (classified) in a SageMaker AutoML job. It is required if the input is for a SageMaker AutoML job.
shuffle_config
(ShuffleConfig): If specified this configuration enables shuffling on this channel. See the SageMaker API documentation for more info: https://docs.aws.amazon.com/sagemaker/latest/dg/API_ShuffleConfig.html
format()
format class
TrainingInput$format()
clone()
The objects of this class are cloneable with this method.
TrainingInput$clone(deep = FALSE)
deep
Whether to make a deep clone.
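A short usage sketch for TrainingInput, using only parameters and valid values listed above. The bucket path and seed are made up, and the constructor calls are guarded so the snippet still runs when the sagemaker.core package is not installed. It assumes TrainingInput and ShuffleConfig are exported from sagemaker.core.

```r
# Arguments taken from the documented signature; the S3 path is hypothetical.
training_args <- list(
  s3_data = "s3://my-bucket/train/",
  distribution = "ShardedByS3Key",
  content_type = "text/csv",
  s3_data_type = "S3Prefix",
  input_mode = "File"
)

# Only construct the objects when the package is available.
if (requireNamespace("sagemaker.core", quietly = TRUE)) {
  training_args$shuffle_config <- sagemaker.core::ShuffleConfig$new(seed = 42)
  train_input <- do.call(sagemaker.core::TrainingInput$new, training_args)
}
```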
Create a class containing all the parameters. It can be used when calling “sagemaker.transformer.Transformer.transform()“
data
Place holder
data_type
Place holder
content_type
Place holder
compression_type
Place holder
split_type
Place holder
input_filter
Place holder
output_filter
Place holder
join_source
Place holder
model_client_config
Place holder
new()
Initialize TransformInput class
TransformInput$new( data = NULL, data_type = "S3Prefix", content_type = NULL, compression_type = NULL, split_type = NULL, input_filter = NULL, output_filter = NULL, join_source = NULL, model_client_config = NULL )
data
(str): Place holder
data_type
(str): Place holder
content_type
(str): Place holder
compression_type
(str): Place holder
split_type
(str): Place holder
input_filter
(str): Place holder
output_filter
(str): Place holder
join_source
(str): Place holder
model_client_config
(str): Place holder
format()
format class
TransformInput$format()
clone()
The objects of this class are cloneable with this method.
TransformInput$clone(deep = FALSE)
deep
Whether to make a deep clone.
Create a unique name from base str
unique_name_from_base(base, max_length = 63)
base |
(str): String used as prefix to generate the unique name. |
max_length |
(int): Maximum length for the resulting string (default: 63). |
str: Input parameter with appended timestamp.
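One way to produce such a name is to trim the base so that a timestamp and a short random suffix still fit within max_length. The sketch below illustrates that idea in base R. The exact format (separator, Unix timestamp, four hex digits) is an assumption made for this example and may differ from what the package generates.

```r
# Hypothetical sketch: "<trimmed base>-<unix timestamp>-<4 hex digits>".
unique_name_sketch <- function(base, max_length = 63) {
  ts <- as.character(as.integer(Sys.time()))
  suffix <- sprintf("%04x", sample.int(16^4, 1) - 1L)
  # Reserve room for the two separators, the timestamp, and the suffix.
  available <- max_length - 2 - nchar(ts) - nchar(suffix)
  paste(substr(base, 1, available), ts, suffix, sep = "-")
}

short_name <- unique_name_sketch("training-job")
long_name <- unique_name_sketch(strrep("a", 100))
```

Trimming the base rather than the suffix keeps the unique part intact when the base is long.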
Other sagemaker_utils: .aws_partition(), .download_files_under_prefix(), base_from_name(), base_name_from_image(), build_dict(), common_variables, create_tar_file(), download_file(), download_folder(), get_config_value(), get_short_version(), name_from_base(), name_from_image(), regional_hostname(), repack_model(), retries(), sagemaker_short_timestamp(), sagemaker_timestamp(), secondary_training_status_changed(), secondary_training_status_message(), sts_regional_endpoint()
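As an illustration of how a base string can be turned into a unique name that still respects `max_length`, the base R sketch below trims the base and appends a Unix timestamp plus a short random hex suffix. The helper `sketch_unique_name()` and the exact suffix layout are assumptions made for this example; they are not taken from the package source.

```r
# Hypothetical helper, not part of sagemaker.core.
sketch_unique_name <- function(base, max_length = 63) {
  # Seconds since the epoch, as a string
  ts <- as.character(as.integer(Sys.time()))
  # Four random hex digits to reduce collisions within the same second
  suffix <- sprintf("%04x", sample.int(16^4, 1) - 1L)
  # Room left for the base after two "-" separators, the timestamp and the suffix
  available <- max_length - 2L - nchar(ts) - nchar(suffix)
  trimmed <- substr(base, 1, available)
  paste(trimmed, ts, suffix, sep = "-")
}

name <- sketch_unique_name("my-training-job")
nchar(name) <= 63
```

Trimming the base rather than the suffix keeps the part that makes the name unique intact when the base is long.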
Updates the tags for the `sagemaker.model.Model.deploy` command with any JumpStart tags.
update_inference_tags_with_jumpstart_training_tags( inference_tags, training_tags )
inference_tags |
(Optional[List[Dict[str, str]]]): Custom tags to apply to the inference job. |
training_tags |
(Optional[List[Dict[str, str]]]): Tags from training job. |
Validate hyperparameters for JumpStart models.
validate_hyperparameters( model_id, model_version, hyperparameters, validation_mode = HyperparameterValidationMode$VALIDATE_PROVIDED, region = JUMPSTART_DEFAULT_REGION_NAME() )
model_id |
(str): Model ID of the model for which to validate hyperparameters. |
model_version |
(str): Version of the model for which to validate hyperparameters. |
hyperparameters |
(dict): Hyperparameters to validate. |
validation_mode |
(HyperparameterValidationMode): Method of validation to use with hyperparameters. If set to `VALIDATE_PROVIDED`, only hyperparameters provided to this function are validated and missing hyperparameters are ignored. If set to `VALIDATE_ALGORITHM`, all algorithm hyperparameters are validated. If set to `VALIDATE_ALL`, all hyperparameters for the model are validated. |
region |
(str): Region for which to validate hyperparameters. (Default: JumpStart default region). |
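The difference between the three validation modes can be sketched in base R as a choice of which hyperparameter names get checked. Everything below is hypothetical: the `specs` list, the scope labels `"algorithm"` and `"container"`, and the helper `select_to_validate()` are assumptions for illustration, not the package's internal representation.

```r
# Hypothetical hyperparameter specs: each entry has a name and a scope.
specs <- list(
  list(name = "epochs", scope = "algorithm"),
  list(name = "learning_rate", scope = "algorithm"),
  list(name = "sagemaker_program", scope = "container")
)

# Return the names of the hyperparameters that a given mode would validate.
select_to_validate <- function(specs, provided, mode) {
  all_names <- vapply(specs, function(s) s$name, character(1))
  scopes <- vapply(specs, function(s) s$scope, character(1))
  switch(mode,
    # Only what the caller supplied; missing hyperparameters are ignored
    VALIDATE_PROVIDED = intersect(all_names, names(provided)),
    # Every algorithm-scoped hyperparameter, supplied or not
    VALIDATE_ALGORITHM = all_names[scopes == "algorithm"],
    # Every hyperparameter known for the model
    VALIDATE_ALL = all_names,
    stop("Unknown validation mode: ", mode, call. = FALSE)
  )
}

select_to_validate(specs, list(epochs = "3"), "VALIDATE_PROVIDED")
select_to_validate(specs, list(epochs = "3"), "VALIDATE_ALGORITHM")
```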
Validate the configuration dictionary for model parallelism.
validate_mp_config(config)
config |
(list): Dictionary holding configuration keys and values. |
Validate the smdistributed entry of the distribution argument. Two strategies are currently supported, 'dataparallel' and 'modelparallel', and only one strategy can be specified at a time. This function checks that the requested strategy is supported, that no more than one strategy is requested, and that the smdistributed dict is syntactically correct. It also performs strategy-specific validations.
validate_smdistributed( instance_type, framework_name, framework_version, py_version, distribution, image_uri = NULL )
instance_type |
(str): A string representing the type of training instance selected. |
framework_name |
(str): A string representing the name of framework selected. |
framework_version |
(str): A string representing the framework version selected. |
py_version |
(str): A string representing the python version selected. |
distribution |
(dict): A dictionary with information to enable distributed training. (Defaults to None if distributed training is not enabled.) |
image_uri |
(str): A string representing a Docker image URI. |
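The "supported strategy" and "one strategy at a time" checks described above can be sketched in base R as follows. The helper `check_smdistributed()` and the assumed shape of `distribution` (a named list with an `smdistributed` entry keyed by strategy name) are illustrative assumptions; the strategy-specific and instance-type checks that the real function performs are not reproduced here.

```r
SUPPORTED_STRATEGIES <- c("dataparallel", "modelparallel")

# Hypothetical helper, not part of sagemaker.core.
check_smdistributed <- function(distribution) {
  smd <- distribution[["smdistributed"]]
  # Nothing to validate when smdistributed is not requested
  if (is.null(smd)) return(invisible(TRUE))
  if (!is.list(smd) || is.null(names(smd))) {
    stop("'smdistributed' must be a named list", call. = FALSE)
  }
  strategies <- names(smd)
  if (length(strategies) != 1) {
    stop("Exactly one smdistributed strategy can be specified at a time", call. = FALSE)
  }
  if (!strategies %in% SUPPORTED_STRATEGIES) {
    stop(sprintf("Unsupported smdistributed strategy: '%s'", strategies), call. = FALSE)
  }
  invisible(TRUE)
}

check_smdistributed(list(smdistributed = list(dataparallel = list(enabled = TRUE))))
```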
Validate that the source directory exists and that it contains the user script.
validate_source_dir(script, directory)
script |
(str): Script filename. |
directory |
(str): Directory containing the source file. |
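A check of this kind can be written with base R file utilities alone. The sketch below uses a hypothetical helper, `check_source_dir()`, and assumes that `script` is a file name relative to `directory`; the error messages are invented for the example.

```r
# Hypothetical helper, not part of sagemaker.core.
check_source_dir <- function(script, directory) {
  if (!dir.exists(directory)) {
    stop(sprintf('Directory "%s" does not exist.', directory), call. = FALSE)
  }
  if (!file.exists(file.path(directory, script))) {
    stop(sprintf('No file named "%s" was found in directory "%s".', script, directory),
         call. = FALSE)
  }
  invisible(TRUE)
}

# Usage with a throwaway directory
src <- file.path(tempdir(), "src-example")
dir.create(src, showWarnings = FALSE)
file.create(file.path(src, "train.R"))
check_source_dir("train.R", src)
```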
Validates framework and model arguments to enforce version or image specification.
validate_version_or_image_args(framework_version, py_version, image_uri)
framework_version |
(str): The version of the framework. |
py_version |
(str): The version of Python. |
image_uri |
(str): The URI of the image. |
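One plausible reading of "version or image specification" is that callers must supply either both `framework_version` and `py_version`, or an `image_uri`. The base R sketch below encodes that reading; the rule itself and the helper name `check_version_or_image()` are assumptions, since the manual does not spell out the exact condition.

```r
# Hypothetical helper, not part of sagemaker.core.
check_version_or_image <- function(framework_version = NULL,
                                   py_version = NULL,
                                   image_uri = NULL) {
  versions_missing <- is.null(framework_version) || is.null(py_version)
  # An explicit image URI makes the version arguments unnecessary
  if (versions_missing && is.null(image_uri)) {
    stop("Specify both framework_version and py_version, or provide image_uri.",
         call. = FALSE)
  }
  invisible(TRUE)
}

check_version_or_image(framework_version = "1.8.0", py_version = "py3")
check_version_or_image(image_uri = "example-registry/example-image:latest")
```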
Used for hosting environment variables and training hyperparameters.
VariableScope
An object of class VariableScope (inherits from Enum, environment) of length 2.
Possible types for hyperparameters and environment variables.
VariableTypes
An object of class VariableTypes (inherits from Enum, environment) of length 4.
Verifies that an acceptable model_id, version, scope, and region combination is provided.
verify_model_region_and_return_specs( model_id, version, scope, region, tolerate_vulnerable_model = FALSE, tolerate_deprecated_model = FALSE )
model_id |
(Optional[str]): model ID of the JumpStart model to verify and obtain specs for. |
version |
(Optional[str]): version of the JumpStart model to verify and obtain specs for. |
scope |
(Optional[str]): scope of the JumpStart model to verify. |
region |
(Optional[str]): region of the JumpStart model to verify and obtain specs for. |
tolerate_vulnerable_model |
(bool): True if vulnerable versions of model specifications should be tolerated (exception not raised). If False, raises an exception if the script used by this version of the model has dependencies with known security vulnerabilities. (Default: False). |
tolerate_deprecated_model |
(bool): True if deprecated models should be tolerated (exception not raised). False if these models should raise an exception. (Default: False). |
Extracts subnets and security group ids as lists from a VpcConfig dict
vpc_from_list(vpc_config, do_sanitize = FALSE)
vpc_config |
(list): a VpcConfig list containing 'Subnets' and 'SecurityGroupIds' |
do_sanitize |
(bool): whether to sanitize the VpcConfig dict before extracting values |
list as (subnets, security_group_ids). If the vpc_config parameter is None, returns (None, None).
Other vpc_utils: vpc_configuration_env, vpc_sanitize(), vpc_to_list()
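The extraction described here can be sketched in base R as below. `sketch_vpc_from_list()` is a hypothetical stand-in that omits the optional sanitizing step, and it assumes that the documented `(None, None)` result corresponds to a list of two `NULL`s in R.

```r
# Hypothetical helper, not part of sagemaker.core.
sketch_vpc_from_list <- function(vpc_config) {
  # No configuration: return a pair of NULLs
  if (is.null(vpc_config)) return(list(NULL, NULL))
  list(vpc_config[["Subnets"]], vpc_config[["SecurityGroupIds"]])
}

cfg <- list(Subnets = c("subnet-aaa", "subnet-bbb"), SecurityGroupIds = "sg-123")
parts <- sketch_vpc_from_list(cfg)
parts[[1]]  # subnets
parts[[2]]  # security group ids
```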
Checks that an instance of VpcConfig has the expected keys and values, removes unexpected keys, and raises ValueErrors if any expectations are violated.
vpc_sanitize(vpc_config = NULL)
vpc_config |
(list): a VpcConfig dict containing 'Subnets' and 'SecurityGroupIds' |
A valid VpcConfig dict containing only 'Subnets' and 'SecurityGroupIds' from the vpc_config parameter. If the vpc_config parameter is None, returns None.
Other vpc_utils: vpc_configuration_env, vpc_from_list(), vpc_to_list()
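The sanitizing behaviour described above (keep the two expected keys, drop the rest, fail on missing values) can be sketched in base R. `sketch_vpc_sanitize()` is a hypothetical re-implementation with invented error messages; the package function may apply further checks that this sketch does not.

```r
VPC_KEYS <- c("Subnets", "SecurityGroupIds")

# Hypothetical helper, not part of sagemaker.core.
sketch_vpc_sanitize <- function(vpc_config = NULL) {
  if (is.null(vpc_config)) return(NULL)
  if (!is.list(vpc_config) || length(vpc_config) == 0) {
    stop("vpc_config must be a non-empty list", call. = FALSE)
  }
  for (key in VPC_KEYS) {
    value <- vpc_config[[key]]
    if (is.null(value) || length(value) == 0) {
      stop(sprintf("vpc_config is missing a non-empty '%s' entry", key), call. = FALSE)
    }
  }
  # Keep only the expected keys, in a fixed order
  vpc_config[VPC_KEYS]
}

cfg <- list(Subnets = "subnet-aaa", SecurityGroupIds = "sg-123", Extra = "ignored")
names(sketch_vpc_sanitize(cfg))
```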
Prepares a VpcConfig list containing the keys 'Subnets' and 'SecurityGroupIds'. This is the dict format expected by the SageMaker CreateTrainingJob and CreateModel APIs. See https://docs.aws.amazon.com/sagemaker/latest/dg/API_VpcConfig.html
vpc_to_list(subnets, security_group_ids)
subnets |
(list): list of subnet IDs to use in VpcConfig |
security_group_ids |
(list): list of security group IDs to use in VpcConfig |
A VpcConfig dict containing the keys 'Subnets' and 'SecurityGroupIds'. If either or both parameters are None, returns None.
Other vpc_utils: vpc_configuration_env, vpc_from_list(), vpc_sanitize()
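The shape of the resulting VpcConfig can be illustrated with a short base R sketch. `sketch_vpc_to_list()` is a hypothetical stand-in that follows the description above, assuming Python's `None` maps to `NULL` in R.

```r
# Hypothetical helper, not part of sagemaker.core.
sketch_vpc_to_list <- function(subnets, security_group_ids) {
  # Both pieces are required to form a VpcConfig
  if (is.null(subnets) || is.null(security_group_ids)) return(NULL)
  list(Subnets = subnets, SecurityGroupIds = security_group_ids)
}

vpc <- sketch_vpc_to_list(c("subnet-aaa", "subnet-bbb"), "sg-123")
str(vpc)
```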
Raise this exception only if the scope of attributes accessed in the specifications has vulnerabilities. For example, a model training script may have vulnerabilities, but not the hosting scripts. In such a case, raise a `VulnerableJumpStartModelError` only when accessing the training specifications.
sagemaker.core::SagemakerError -> VulnerableJumpStartModelError
new()
Instantiates VulnerableJumpStartModelError exception.
VulnerableJumpStartModelError$new( model_id = NULL, version = NULL, vulnerabilities = NULL, scope = NULL, message = NULL )
model_id
(Optional[str]): model ID of vulnerable JumpStart model. (Default: None).
version
(Optional[str]): version of vulnerable JumpStart model. (Default: None).
vulnerabilities
(Optional[List[str]]): vulnerabilities associated with model. (Default: None).
scope
(str): JumpStart script scopes
message
(Optional[str]): error message
clone()
The objects of this class are cloneable with this method.
VulnerableJumpStartModelError$clone(deep = FALSE)
deep
Whether to make a deep clone.
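The scope-specific rule described for this exception can be sketched with base R conditions. The example does not use the R6 class itself: the `specs` structure, the helper `get_scope_specs()`, and the condition built with `structure()` are assumptions that only mimic the idea of raising an error for the scope that is actually accessed.

```r
# Hypothetical model specs: only the training scope has known vulnerabilities.
specs <- list(
  model_id = "example-model",
  version = "1.0.0",
  vulnerabilities = list(training = c("example-pkg==0.1"), inference = character(0))
)

# Return the specs, raising a classed error only if the requested scope is affected.
get_scope_specs <- function(specs, scope) {
  found <- specs$vulnerabilities[[scope]]
  if (length(found) > 0) {
    cond <- structure(
      list(
        message = sprintf(
          "Model '%s' (version '%s') has known vulnerabilities in its %s scope: %s",
          specs$model_id, specs$version, scope, paste(found, collapse = ", ")
        ),
        call = NULL,
        scope = scope,
        vulnerabilities = found
      ),
      class = c("VulnerableJumpStartModelError", "error", "condition")
    )
    stop(cond)
  }
  specs
}

# Accessing the inference scope succeeds; the training scope raises.
inference_specs <- get_scope_specs(specs, "inference")
caught <- tryCatch(
  get_scope_specs(specs, "training"),
  VulnerableJumpStartModelError = function(e) e$scope
)
```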
Warn the user that training will not fully leverage all the GPU cores if parameter server is enabled and a multi-GPU instance is selected. Distributed training with the default parameter server setup doesn't support multi-GPU instances.
warn_if_parameter_server_with_multi_gpu(training_instance_type, distribution)
training_instance_type |
(str): A string representing the type of training instance selected. |
distribution |
(dict): A dictionary with information to enable distributed training. (Defaults to None if distributed training is not enabled.) |
Write matrix to dense tensor file.
write_matrix_to_dense_tensor(file, array, labels = NULL)
file |
(str): file location |
array |
(array): |
labels |
(str): |
Writes a Matrix sparse matrix to a sparse tensor
write_spmatrix_to_sparse_tensor(file, array, labels = NULL)
file |
(str): file location |
array |
(array): |
labels |
(str): |