Title: | sagemaker machine learning developed by amazon |
---|---|
Description: | `sagemaker` machine learning developed by amazon. |
Authors: | Dyfan Jones [aut, cre], Amazon.com, Inc. [cph] |
Maintainer: | Dyfan Jones <[email protected]> |
License: | Apache License (>= 2.0) |
Version: | 0.2.0 |
Built: | 2024-12-28 04:25:52 UTC |
Source: | https://github.com/DyfanJones/sagemaker-r-mlframework |
Base class for either PySpark or SparkJars.
sagemaker.common::Processor
-> sagemaker.common::ScriptProcessor
-> .SparkProcessorBase
new()
Initialize a `.SparkProcessorBase` instance. The `.SparkProcessorBase` class handles Amazon SageMaker processing tasks for jobs using SageMaker Spark.
.SparkProcessorBase$new( role, instance_type, instance_count, framework_version = NULL, py_version = NULL, container_version = NULL, image_uri = NULL, volume_size_in_gb = 30, volume_kms_key = NULL, output_kms_key = NULL, max_runtime_in_seconds = NULL, base_job_name = NULL, sagemaker_session = NULL, env = NULL, tags = NULL, network_config = NULL )
role
(str): An AWS IAM role name or ARN. The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
instance_type
(str): Type of EC2 instance to use for processing, for example, 'ml.c4.xlarge'.
instance_count
(int): The number of instances to run the processing job with.
framework_version
(str): The version of SageMaker PySpark.
py_version
(str): The version of Python.
container_version
(str): The version of the Spark container.
image_uri
(str): The container image to use for training.
volume_size_in_gb
(int): Size in GB of the EBS volume to use for storing data during processing (default: 30).
volume_kms_key
(str): A KMS key for the processing volume.
output_kms_key
(str): The KMS key id for all ProcessingOutputs.
max_runtime_in_seconds
(int): Timeout in seconds. After this amount of time Amazon SageMaker terminates the job regardless of its current status.
base_job_name
(str): Prefix for processing name. If not specified, the processor generates a default job name, based on the training image name and current timestamp.
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the processor creates one using the default AWS configuration chain.
env
(dict): Environment variables to be passed to the processing job.
tags
([dict]): List of tags to be passed to the processing job.
network_config
(sagemaker.network.NetworkConfig): A NetworkConfig object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets.
get_run_args()
For processors (sagemaker.spark.processing.PySparkProcessor, sagemaker.spark.processing.SparkJar) that have special run() arguments, the returned object contains the normalized arguments for passing to sagemaker.workflow.steps.ProcessingStep.
.SparkProcessorBase$get_run_args( code, inputs = NULL, outputs = NULL, arguments = NULL )
code
(str): This can be an S3 URI or a local path to a file with the framework script to run.
inputs
(list[sagemaker.processing.ProcessingInput]): Input files for the processing job. These must be provided as sagemaker.processing.ProcessingInput objects (default: None).
outputs
(list[sagemaker.processing.ProcessingOutput]): Outputs for the processing job. These can be specified as either path strings or sagemaker.processing.ProcessingOutput objects (default: None).
arguments
(list[str]): A list of string arguments to be passed to a processing job (default: None).
Returns a RunArgs object.
run()
Runs a processing job.
.SparkProcessorBase$run( submit_app, inputs = NULL, outputs = NULL, arguments = NULL, wait = TRUE, logs = TRUE, job_name = NULL, experiment_config = NULL, kms_key = NULL )
submit_app
(str): .py or .jar file to submit to Spark as the primary application
inputs
(list[sagemaker.processing.ProcessingInput]): Input files for the processing job. These must be provided as sagemaker.processing.ProcessingInput objects (default: None).
outputs
(list[sagemaker.processing.ProcessingOutput]): Outputs for the processing job. These can be specified as either path strings or sagemaker.processing.ProcessingOutput objects (default: None).
arguments
(list[str]): A list of string arguments to be passed to a processing job (default: None).
wait
(bool): Whether the call should wait until the job completes (default: True).
logs
(bool): Whether to show the logs produced by the job. Only meaningful when wait is True (default: True).
job_name
(str): Processing job name. If not specified, the processor generates a default job name, based on the base job name and current timestamp.
experiment_config
(dict[str, str]): Experiment management configuration. Dictionary contains three optional keys: 'ExperimentName', 'TrialName', and 'TrialComponentDisplayName'.
kms_key
(str): The ARN of the KMS key that is used to encrypt the user code file (default: None).
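The `experiment_config` argument above accepts only three optional keys. A minimal base R sketch of a pre-flight check for that shape follows; `check_experiment_config()` is a hypothetical helper written for illustration, and the package itself may validate this differently or not at all.

```r
# Hypothetical validator (not part of the package): rejects keys other than
# the three optional ones documented for experiment_config.
allowed_keys <- c("ExperimentName", "TrialName", "TrialComponentDisplayName")

check_experiment_config <- function(config) {
  # NULL means "no experiment tracking", which is always acceptable.
  if (is.null(config)) {
    return(TRUE)
  }
  unknown <- setdiff(names(config), allowed_keys)
  if (length(unknown) > 0) {
    stop("Unknown experiment_config keys: ", paste(unknown, collapse = ", "))
  }
  TRUE
}

# All keys are optional, so a partial list is fine.
experiment_config <- list(ExperimentName = "my-experiment", TrialName = "trial-1")
config_ok <- check_experiment_config(experiment_config)

# A misspelled key is caught before any job is submitted.
bad_result <- tryCatch(
  check_experiment_config(list(ExperimentNam = "typo")),
  error = function(e) conditionMessage(e)
)
```

Running such a check before calling `run()` surfaces typos locally instead of after a request has been sent.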
start_history()
Starts a Spark history server.
.SparkProcessorBase$start_history(spark_event_logs_s3_uri = NULL)
spark_event_logs_s3_uri
(str): S3 URI where Spark events are stored.
terminate_history_server()
Terminates the Spark history server.
.SparkProcessorBase$terminate_history_server()
clone()
The objects of this class are cloneable with this method.
.SparkProcessorBase$clone(deep = FALSE)
deep
Whether to make a deep clone.
A class for creating and interacting with SageMaker AutoML jobs.
new()
Initialize the AutoML class.
AutoML$new( role, target_attribute_name, output_kms_key = NULL, output_path = NULL, base_job_name = NULL, compression_type = NULL, sagemaker_session = NULL, volume_kms_key = NULL, encrypt_inter_container_traffic = FALSE, vpc_config = NULL, problem_type = NULL, max_candidates = NULL, max_runtime_per_training_job_in_seconds = NULL, total_job_runtime_in_seconds = NULL, job_objective = NULL, generate_candidate_definitions_only = FALSE, tags = NULL )
role
:
target_attribute_name
:
output_kms_key
:
output_path
:
base_job_name
:
compression_type
:
sagemaker_session
:
volume_kms_key
:
encrypt_inter_container_traffic
:
vpc_config
:
problem_type
:
max_candidates
:
max_runtime_per_training_job_in_seconds
:
total_job_runtime_in_seconds
:
job_objective
:
generate_candidate_definitions_only
:
tags
:
fit()
Create an AutoML Job with the input dataset.
AutoML$fit(inputs = NULL, wait = TRUE, logs = TRUE, job_name = NULL)
inputs
(str or list[str] or AutoMLInput): Local path or S3 Uri where the training data is stored. Or an AutoMLInput object. If a local path is provided, the dataset will be uploaded to an S3 location.
wait
(bool): Whether the call should wait until the job completes (default: True).
logs
(bool): Whether to show the logs produced by the job. Only meaningful when wait is True (default: True). If `wait` is False, `logs` will be set to False as well.
job_name
(str): Training job name. If not specified, the estimator generates a default job name, based on the training image name and current timestamp.
attach()
Attach to an existing AutoML job. Creates and returns an AutoML object bound to an existing AutoML job.
AutoML$attach(auto_ml_job_name, sagemaker_session = NULL)
auto_ml_job_name
(str): AutoML job name
sagemaker_session
(sagemaker.session.Session): A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, the one originally associated with the `AutoML` instance is used.
sagemaker.automl.AutoML: An `AutoML` instance with the attached AutoML job.
describe_auto_ml_job()
Returns the job description of an AutoML job for the given job name.
AutoML$describe_auto_ml_job(job_name = NULL)
job_name
(str): The name of the AutoML job to describe. If None, will use object's latest_auto_ml_job name.
dict: A dictionary response with the AutoML Job description.
best_candidate()
Returns the best candidate of an AutoML job for a given name.
AutoML$best_candidate(job_name = NULL)
job_name
(str): The name of the AutoML job. If None, will use object's .current_auto_ml_job_name.
dict: A dictionary with information of the best candidate.
list_candidates()
Returns the list of candidates of an AutoML job for a given name.
AutoML$list_candidates( job_name = NULL, status_equals = NULL, candidate_name = NULL, candidate_arn = NULL, sort_order = NULL, sort_by = NULL, max_results = NULL )
job_name
(str): The name of the AutoML job. If None, will use object's .current_job name.
status_equals
(str): Filter the result with candidate status, values could be "Completed", "InProgress", "Failed", "Stopped", "Stopping"
candidate_name
(str): The name of a specified candidate to list. Defaults to None.
candidate_arn
(str): The ARN of a specified candidate to list. Defaults to None.
sort_order
(str): The order in which the candidates are listed in the result. Defaults to None.
sort_by
(str): The value by which the candidates are sorted. Defaults to None.
max_results
(int): The number of candidates to list in the results, between 1 and 100. Defaults to None. If None, all candidates are returned.
list: A list of dictionaries with candidates information.
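The constraints documented for `status_equals` and `max_results` can be checked locally before calling `list_candidates()`. The sketch below uses base R only; `check_candidate_filters()` is a hypothetical helper, not a function exported by the package.

```r
# Allowed candidate statuses, as listed in the status_equals description.
valid_status <- c("Completed", "InProgress", "Failed", "Stopped", "Stopping")

# Hypothetical pre-flight check for two of the list_candidates() filters.
check_candidate_filters <- function(status_equals = NULL, max_results = NULL) {
  if (!is.null(status_equals) && !(status_equals %in% valid_status)) {
    stop("status_equals must be one of: ", paste(valid_status, collapse = ", "))
  }
  if (!is.null(max_results) && (max_results < 1 || max_results > 100)) {
    stop("max_results must be between 1 and 100")
  }
  invisible(TRUE)
}

# Valid filters pass silently.
filters_ok <- check_candidate_filters(status_equals = "Completed", max_results = 10)

# An out-of-range max_results is rejected with a readable message.
filters_error <- tryCatch(
  check_candidate_filters(max_results = 500),
  error = function(e) conditionMessage(e)
)
```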
create_model()
Creates a model from a given candidate or the best candidate from the job.
AutoML$create_model( name, sagemaker_session = NULL, candidate = NULL, vpc_config = NULL, enable_network_isolation = FALSE, model_kms_key = NULL, predictor_cls = NULL, inference_response_keys = NULL )
name
(str): The pipeline model name.
sagemaker_session
(sagemaker.session.Session): A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, the one originally associated with the `AutoML` instance is used.
candidate
(CandidateEstimator or dict): a CandidateEstimator used for deploying to a SageMaker Inference Pipeline. If None, the best candidate will be used. If the candidate input is a dict, a CandidateEstimator will be created from it.
vpc_config
(dict): Specifies a VPC that your training jobs and hosted models have access to. Contents include "SecurityGroupIds" and "Subnets".
enable_network_isolation
(bool): Isolates the training container. No inbound or outbound network calls can be made, except for calls between peers within a training cluster for distributed training. Default: False
model_kms_key
(str): KMS key ARN used to encrypt the repacked model archive file if the model is repacked
predictor_cls
(callable[string, sagemaker.session.Session]): A function to call to create a predictor (default: None). If specified, `deploy()` returns the result of invoking this function on the created endpoint name.
inference_response_keys
(list): List of keys for response content. The order of the keys will dictate the content order in the response.
PipelineModel object.
deploy()
Deploy a candidate to a SageMaker Inference Pipeline.
AutoML$deploy( initial_instance_count, instance_type, serializer = NULL, deserializer = NULL, candidate = NULL, sagemaker_session = NULL, name = NULL, endpoint_name = NULL, tags = NULL, wait = TRUE, vpc_config = NULL, enable_network_isolation = FALSE, model_kms_key = NULL, predictor_cls = NULL, inference_response_keys = NULL )
initial_instance_count
(int): The initial number of instances to run in the `Endpoint` created from this `Model`.
instance_type
(str): The EC2 instance type to deploy this Model to. For example, 'ml.p2.xlarge'.
serializer
(sagemaker.serializers.BaseSerializer): A serializer object, used to encode data for an inference endpoint (default: None). If `serializer` is not None, then `serializer` will override the default serializer. The default serializer is set by the `predictor_cls`.
deserializer
(sagemaker.deserializers.BaseDeserializer): A deserializer object, used to decode data from an inference endpoint (default: None). If `deserializer` is not None, then `deserializer` will override the default deserializer. The default deserializer is set by the `predictor_cls`.
candidate
(CandidateEstimator or dict): a CandidateEstimator used for deploying to a SageMaker Inference Pipeline. If None, the best candidate will be used. If the candidate input is a dict, a CandidateEstimator will be created from it.
sagemaker_session
(sagemaker.session.Session): A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, the one originally associated with the `AutoML` instance is used.
name
(str): The pipeline model name. If None, a default model name will be selected on each `deploy`.
endpoint_name
(str): The name of the endpoint to create (default: None). If not specified, a unique endpoint name will be created.
tags
(List[dict[str, str]]): The list of tags to attach to this specific endpoint.
wait
(bool): Whether the call should wait until the deployment of model completes (default: True).
vpc_config
(dict): Specifies a VPC that your training jobs and hosted models have access to. Contents include "SecurityGroupIds" and "Subnets".
enable_network_isolation
(bool): Isolates the training container. No inbound or outbound network calls can be made, except for calls between peers within a training cluster for distributed training. Default: False
model_kms_key
(str): KMS key ARN used to encrypt the repacked model archive file if the model is repacked
predictor_cls
(callable[string, sagemaker.session.Session]): A function to call to create a predictor (default: None). If specified, `deploy()` returns the result of invoking this function on the created endpoint name.
inference_response_keys
(list): List of keys for response content. The order of the keys will dictate the content order in the response.
callable[string, sagemaker.session.Session] or `None`: If `predictor_cls` is specified, the invocation of `self.predictor_cls` on the created endpoint name. Otherwise, `None`.
validate_and_update_inference_response()
Validates the requested inference keys and updates response content. On validation, also updates the inference containers to emit appropriate response content in the inference response.
AutoML$validate_and_update_inference_response( inference_containers, inference_response_keys )
inference_containers
(list): list of inference containers
inference_response_keys
(list): list of inference response keys
format()
format class
AutoML$format()
clone()
The objects of this class are cloneable with this method.
AutoML$clone(deep = FALSE)
deep
Whether to make a deep clone.
Accepts parameters that specify an S3 input for an AutoML job and provides a method to turn those parameters into a dictionary.
new()
Convert an S3 URI or a list of S3 URIs to an AutoMLInput object.
AutoMLInput$new(inputs, target_attribute_name, compression = NULL)
inputs
(str, list[str]): A string or a list of strings pointing to the S3 location(s) where input data is stored.
target_attribute_name
(str): the target attribute name for regression or classification.
compression
(str): if training data is compressed, the compression type. The default value is None.
to_request_list()
Generates a request dictionary using the parameters provided to the class.
AutoMLInput$to_request_list()
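To make the request structure concrete, here is a base R sketch of what such a list could look like. It is a re-implementation for illustration only: the standalone function `to_request_list()` below is hypothetical, and the field names are an assumption modelled on the `InputDataConfig` channel shape of the CreateAutoMLJob API rather than taken from this package's source.

```r
# Hypothetical stand-in for AutoMLInput$to_request_list(); builds one channel
# entry per S3 URI. Field names are assumed from the CreateAutoMLJob API.
to_request_list <- function(inputs, target_attribute_name, compression = NULL) {
  lapply(as.list(inputs), function(s3_uri) {
    entry <- list(
      DataSource = list(
        S3DataSource = list(S3DataType = "S3Prefix", S3Uri = s3_uri)
      ),
      TargetAttributeName = target_attribute_name
    )
    # Only include the compression type when the data is actually compressed.
    if (!is.null(compression)) {
      entry$CompressionType <- compression
    }
    entry
  })
}

request <- to_request_list(
  c("s3://my-bucket/train/", "s3://my-bucket/extra/"),
  target_attribute_name = "label",
  compression = "Gzip"
)
str(request)
```

A single URI and a vector of URIs go through the same path, which mirrors the `(str, list[str])` type documented for `inputs`.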
format()
format class
AutoMLInput$format()
clone()
The objects of this class are cloneable with this method.
AutoMLInput$clone(deep = FALSE)
deep
Whether to make a deep clone.
A class for interacting with CreateAutoMLJob API.
sagemaker.common::.Job
-> AutoMLJob
new()
Initialize AutoMLJob class
AutoMLJob$new(sagemaker_session, job_name = NULL, inputs = NULL)
sagemaker_session
(sagemaker.session.Session): A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, the one originally associated with the `AutoMLJob` instance is used.
job_name
:
inputs
(str, list[str]): Parameters used when calling `AutoML$fit()`.
start_new()
Create a new Amazon SageMaker AutoML job from auto_ml.
AutoMLJob$start_new(auto_ml, inputs)
auto_ml
(sagemaker.automl.AutoML): AutoML object created by the user.
inputs
(str, list[str]): Parameters used when calling `AutoML$fit()`.
sagemaker.automl.AutoMLJob: Constructed object that captures all information about the started AutoML job.
describe()
Prints out a response from the DescribeAutoMLJob API call.
AutoMLJob$describe()
wait()
Wait for the AutoML job to finish.
AutoMLJob$wait(logs = TRUE)
logs
(bool): indicate whether to output logs.
format()
format class
AutoMLJob$format()
clone()
The objects of this class are cloneable with this method.
AutoMLJob$clone(deep = FALSE)
deep
Whether to make a deep clone.
A class for SageMaker AutoML Job Candidate
new()
Constructor of CandidateEstimator.
CandidateEstimator$new(candidate, sagemaker_session = NULL)
candidate
(dict): a dictionary of candidate returned by AutoML.list_candidates() or AutoML.best_candidate().
sagemaker_session
(sagemaker.session.Session): A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, one is created using the default AWS configuration chain.
get_steps()
Get the step job of a candidate so that users can construct estimators/transformers
CandidateEstimator$get_steps()
list: a list of dictionaries that provide information about each step job's name, type, inputs and description
fit()
Rerun a candidate's step jobs with new input datasets or security config.
CandidateEstimator$fit( inputs, candidate_name = NULL, volume_kms_key = NULL, encrypt_inter_container_traffic = FALSE, vpc_config = NULL, wait = TRUE, logs = TRUE )
inputs
(str or list[str]): Local path or S3 Uri where the training data is stored. If a local path is provided, the dataset will be uploaded to an S3 location.
candidate_name
(str): Name of the candidate to be rerun. If None, the candidate's original name will be used.
volume_kms_key
(str): The KMS key id to encrypt data on the storage volume attached to the ML compute instance(s).
encrypt_inter_container_traffic
(bool): Whether to encrypt all communications between ML compute instances in distributed training. Default: False.
vpc_config
(dict): Specifies a VPC that jobs and hosted models have access to. Control access to and from training and model containers by configuring the VPC.
wait
(bool): Whether the call should wait until all jobs complete (default: True).
logs
(bool): Whether to show the logs produced by the job. Only meaningful when wait is True (default: True).
format()
format class
CandidateEstimator$format()
clone()
The objects of this class are cloneable with this method.
CandidateEstimator$clone(deep = FALSE)
deep
Whether to make a deep clone.
A class that maintains an AutoML Candidate step's name, inputs, type, and description.
name
Name of the candidate step -> (str)
inputs
Inputs of the candidate step -> (dict)
type
Type of the candidate step, Training or Transform -> (str)
description
Description of candidate step job -> (dict)
new()
Initialize CandidateStep Class
CandidateStep$new(name, inputs, step_type, description)
name
(str): Name of the candidate step
inputs
(dict): Inputs of the candidate step
step_type
(str): Type of the candidate step, Training or Transform
description
(dict): Description of candidate step job
format()
format class
CandidateStep$format()
clone()
The objects of this class are cloneable with this method.
CandidateStep$clone(deep = FALSE)
deep
Whether to make a deep clone.
Handle end-to-end training and deployment of custom Chainer code.
sagemaker.mlcore::EstimatorBase
-> sagemaker.mlcore::Framework
-> Chainer
.use_mpi
Entry point is run as an MPI script.
.num_processes
Total number of processes to run the entry point with
.process_slots_per_host
The number of processes that can run on each instance.
.additional_mpi_options
String of options to the 'mpirun' command used to run the entry point.
.module
mimic python module
new()
This `Estimator` executes a Chainer script in a managed Chainer execution environment, within a SageMaker Training Job. The managed Chainer environment is an Amazon-built Docker container that executes functions defined in the supplied `entry_point` Python script. Training is started by calling `fit` on this Estimator. After training is complete, calling `deploy` creates a hosted SageMaker endpoint and returns a `ChainerPredictor` instance that can be used to perform inference against the hosted model. Technical documentation on preparing Chainer scripts for SageMaker training and using the Chainer Estimator is available on the project home-page: https://github.com/aws/sagemaker-python-sdk
Chainer$new( entry_point, use_mpi = NULL, num_processes = NULL, process_slots_per_host = NULL, additional_mpi_options = NULL, source_dir = NULL, hyperparameters = NULL, framework_version = NULL, py_version = NULL, image_uri = NULL, ... )
entry_point
(str): Path (absolute or relative) to the Python source file which should be executed as the entry point to training. If `source_dir` is specified, then `entry_point` must point to a file located at the root of `source_dir`.
use_mpi
(bool): If true, entry point is run as an MPI script. By default, the Chainer Framework runs the entry point with 'mpirun' if more than one instance is used.
num_processes
(int): Total number of processes to run the entry point with. By default, the Chainer Framework runs one process per GPU (on GPU instances), or one process per host (on CPU instances).
process_slots_per_host
(int): The number of processes that can run on each instance. By default, this is set to the number of GPUs on the instance (on GPU instances), or one (on CPU instances).
additional_mpi_options
(str): String of options to the 'mpirun' command used to run the entry point. For example, '-X NCCL_DEBUG=WARN' will pass that option string to the mpirun command.
source_dir
(str): Path (absolute or relative) to a directory with any other training source code dependencies aside from the entry point file (default: None). The structure within this directory is preserved when training on Amazon SageMaker.
hyperparameters
(dict): Hyperparameters that will be used for training (default: None). The hyperparameters are made accessible as a dict[str, str] to the training code on SageMaker. For convenience, this accepts other types for keys and values, but `str()` will be called to convert them before training.
framework_version
(str): Chainer version you want to use for executing your model training code. Defaults to `None`. Required unless `image_uri` is provided. List of supported versions: https://github.com/aws/sagemaker-python-sdk#chainer-sagemaker-estimators.
py_version
(str): Python version you want to use for executing your model training code. Defaults to `None`. Required unless `image_uri` is provided.
image_uri
(str): If specified, the estimator will use this image for training and hosting, instead of selecting the appropriate SageMaker official image based on framework_version and py_version. It can be an ECR URL or a Docker Hub image and tag, for example `123412341234.dkr.ecr.us-west-2.amazonaws.com/my-custom-image:1.0` or `custom-image:latest`. If `framework_version` or `py_version` are `None`, then `image_uri` is required. If `image_uri` is also `None`, a `ValueError` is raised.
...
: Additional kwargs passed to the `Framework` constructor.
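The defaults described for `use_mpi`, `num_processes`, and `process_slots_per_host` interact, so a small worked sketch may help. The base R helper below, `resolve_mpi_settings()`, is hypothetical, as is its `gpus_per_instance` argument. It only mirrors the documented defaults and assumes that the total process count is the per-host slot count multiplied by the number of instances. Where the real defaults are resolved (SDK or container) is not stated here.

```r
# Hypothetical helper mirroring the documented Chainer MPI defaults.
# gpus_per_instance is an assumed input: 0 for CPU instances.
resolve_mpi_settings <- function(instance_count, gpus_per_instance,
                                 use_mpi = NULL, num_processes = NULL,
                                 process_slots_per_host = NULL) {
  # Default: use mpirun only when more than one instance is requested.
  if (is.null(use_mpi)) {
    use_mpi <- instance_count > 1
  }
  # Default: one slot per GPU, or a single slot on CPU instances.
  if (is.null(process_slots_per_host)) {
    process_slots_per_host <- max(gpus_per_instance, 1)
  }
  # Default (assumption): every slot on every host runs one process.
  if (is.null(num_processes)) {
    num_processes <- process_slots_per_host * instance_count
  }
  list(
    use_mpi = use_mpi,
    num_processes = num_processes,
    process_slots_per_host = process_slots_per_host
  )
}

# Two GPU instances with four GPUs each.
gpu_settings <- resolve_mpi_settings(instance_count = 2, gpus_per_instance = 4)

# A single CPU instance.
cpu_settings <- resolve_mpi_settings(instance_count = 1, gpus_per_instance = 0)
```

Under these assumptions the GPU case resolves to eight processes under MPI, while the CPU case runs one process without MPI. Any explicitly supplied value overrides the corresponding default.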
hyperparameters()
Return hyperparameters used by your custom Chainer code during training.
Chainer$hyperparameters()
create_model()
Create a SageMaker `ChainerModel` object that can be deployed to an `Endpoint`.
Chainer$create_model( model_server_workers = NULL, role = NULL, vpc_config_override = "VPC_CONFIG_DEFAULT", entry_point = NULL, source_dir = NULL, dependencies = NULL, ... )
model_server_workers
(int): Optional. The number of worker processes used by the inference server. If None, server will use one worker per vCPU.
role
(str): The `ExecutionRoleArn` IAM Role ARN for the `Model`, which is also used during transform jobs. If not specified, the role from the Estimator will be used.
vpc_config_override
(dict[str, list[str]]): Optional override for VpcConfig set on the model. Default: use subnets and security groups from this Estimator. Keys: 'Subnets' (list[str]), a list of subnet ids; 'SecurityGroupIds' (list[str]), a list of security group ids.
entry_point
(str): Path (absolute or relative) to the local Python source file which should be executed as the entry point to training. If `source_dir` is specified, then `entry_point` must point to a file located at the root of `source_dir`. If not specified, the training entry point is used.
source_dir
(str): Path (absolute or relative) to a directory with any other serving source code dependencies aside from the entry point file. If not specified, the model source directory from training is used.
dependencies
(list[str]): A list of paths to directories (absolute or relative) with any additional libraries that will be exported to the container. If not specified, the dependencies from training are used. This is not supported with "local code" in Local Mode.
...
: Additional kwargs passed to the ChainerModel constructor.
sagemaker.chainer.model.ChainerModel: A SageMaker `ChainerModel` object. See `ChainerModel` for full details.
clone()
The objects of this class are cloneable with this method.
Chainer$clone(deep = FALSE)
deep
Whether to make a deep clone.
A Chainer SageMaker `Model` that can be deployed to a SageMaker `Endpoint`.
sagemaker.mlcore::ModelBase
-> sagemaker.mlcore::Model
-> sagemaker.mlcore::FrameworkModel
-> ChainerModel
new()
Initialize a ChainerModel.
ChainerModel$new( model_data, role, entry_point, image_uri = NULL, framework_version = NULL, py_version = NULL, predictor_cls = ChainerPredictor, model_server_workers = NULL, ... )
model_data
(str): The S3 location of a SageMaker model data `.tar.gz` file.
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
entry_point
(str): Path (absolute or relative) to the Python source file which should be executed as the entry point to model hosting. If `source_dir` is specified, then `entry_point` must point to a file located at the root of `source_dir`.
image_uri
(str): A Docker image URI (default: None). If not specified, a default image for Chainer will be used. If `framework_version` or `py_version` are `None`, then `image_uri` is required. If `image_uri` is also `None`, a `ValueError` is raised.
framework_version
(str): Chainer version you want to use for executing your model training code. Defaults to `None`. Required unless `image_uri` is provided.
py_version
(str): Python version you want to use for executing your model training code. Defaults to `None`. Required unless `image_uri` is provided.
predictor_cls
(callable[str, sagemaker.session.Session]): A function to call to create a predictor with an endpoint name and SageMaker `Session`. If specified, `deploy()` returns the result of invoking this function on the created endpoint name.
model_server_workers
(int): Optional. The number of worker processes used by the inference server. If None, server will use one worker per vCPU.
...
: Keyword arguments passed to the `FrameworkModel` initializer.
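The rule linking `image_uri`, `framework_version`, and `py_version` can be expressed as a short check. This is a base R sketch under the assumption that the rule is exactly as documented above; `validate_versions()` is a hypothetical helper, and the package's own error message and condition class may differ.

```r
# Hypothetical check: both versions are required unless an image URI is given.
validate_versions <- function(framework_version = NULL, py_version = NULL,
                              image_uri = NULL) {
  versions_missing <- is.null(framework_version) || is.null(py_version)
  if (versions_missing && is.null(image_uri)) {
    stop("framework_version and py_version are required unless image_uri is provided")
  }
  invisible(TRUE)
}

# A custom image alone is sufficient.
image_only <- validate_versions(image_uri = "custom-image:latest")

# Both versions without an image are also sufficient (values are placeholders).
versions_only <- validate_versions(framework_version = "5.0.0", py_version = "py3")

# Supplying only one version and no image fails.
version_error <- tryCatch(
  validate_versions(framework_version = "5.0.0"),
  error = function(e) conditionMessage(e)
)
```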
prepare_container_def()
Return a container definition with framework configuration set in model environment variables.
ChainerModel$prepare_container_def( instance_type = NULL, accelerator_type = NULL )
instance_type
(str): The EC2 instance type to deploy this Model to. For example, 'ml.p2.xlarge'.
accelerator_type
(str): The Elastic Inference accelerator type to deploy to the instance for loading and making inferences to the model. For example, 'ml.eia1.medium'.
dict[str, str]: A container definition object usable with the CreateModel API.
serving_image_uri()
Create a URI for the serving image.
ChainerModel$serving_image_uri( region_name, instance_type, accelerator_type = NULL )
region_name
(str): AWS region where the image is uploaded.
instance_type
(str): SageMaker instance type. Used to determine device type (cpu/gpu/family-specific optimized).
accelerator_type
(str): The Elastic Inference accelerator type to deploy to the instance for loading and making inferences to the model. For example, 'ml.eia1.medium'.
str: The appropriate image URI based on the given parameters.
clone()
The objects of this class are cloneable with this method.
ChainerModel$clone(deep = FALSE)
deep
Whether to make a deep clone.
This is able to serialize Python lists, dictionaries, and numpy arrays to multidimensional tensors for Chainer inference.
sagemaker.mlcore::PredictorBase
-> sagemaker.mlcore::Predictor
-> ChainerPredictor
new()
Initialize a `ChainerPredictor`.
ChainerPredictor$new( endpoint_name, sagemaker_session = NULL, serializer = NumpySerializer$new(), deserializer = NumpyDeserializer$new() )
endpoint_name
(str): The name of the endpoint to perform inference on.
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the estimator creates one using the default AWS configuration chain.
serializer
(sagemaker.serializers.BaseSerializer): Optional. Default serializes input data to .npy format. Handles lists and numpy arrays.
deserializer
(sagemaker.deserializers.BaseDeserializer): Optional. Default parses the response from .npy format to numpy array.
clone()
The objects of this class are cloneable with this method.
ChainerPredictor$clone(deep = FALSE)
deep
Whether to make a deep clone.
Factorization Machines combine the advantages of Support Vector Machines with factorization models. It is an extension of a linear model that is designed to capture interactions between features within high dimensional sparse datasets economically.
sagemaker.mlcore::EstimatorBase
-> sagemaker.mlcore::AmazonAlgorithmEstimatorBase
-> FactorizationMachines
repo_name
sagemaker repo name for framework
repo_version
version of framework
.module
mimic python module
num_factors
Dimensionality of factorization.
predictor_type
Type of predictor 'binary_classifier' or 'regressor'.
epochs
Number of training epochs to run.
clip_gradient
Clip the gradient by projecting onto the box [-clip_gradient, +clip_gradient]
eps
Small value to avoid division by 0.
rescale_grad
If set, multiplies the gradient with rescale_grad before updating
bias_lr
Non-negative learning rate for the bias term.
linear_lr
Non-negative learning rate for linear terms.
factors_lr
Non-negative learning rate for factorization terms.
bias_wd
Non-negative weight decay for the bias term.
linear_wd
Non-negative weight decay for linear terms.
factors_wd
Non-negative weight decay for factorization terms.
bias_init_method
Initialization method for the bias term: 'normal', 'uniform' or 'constant'.
bias_init_scale
Non-negative range for initialization of the bias term that takes effect when bias_init_method parameter is 'uniform'
bias_init_sigma
Non-negative standard deviation for initialization of the bias term that takes effect when bias_init_method parameter is 'normal'.
bias_init_value
Initial value of the bias term that takes effect when bias_init_method parameter is 'constant'.
linear_init_method
Initialization method for linear term: normal', 'uniform' or 'constant'.
linear_init_scale
Non-negative range for initialization of linear terms that takes effect when linear_init_method parameter is 'uniform'.
linear_init_sigma
Non-negative standard deviation for initialization of linear terms that takes effect when linear_init_method parameter is 'normal'.
linear_init_value
Initial value of linear terms that takes effect when linear_init_method parameter is 'constant'.
factors_init_method
Initialization method for factorization term: 'normal', 'uniform' or 'constant'.
factors_init_scale
Non-negative range for initialization of factorization terms that takes effect when factors_init_method parameter is 'uniform'.
factors_init_sigma
Non-negative standard deviation for initialization of factorization terms that takes effect when factors_init_method parameter is 'normal'.
factors_init_value
Initial value of factorization terms that takes effect when factors_init_method parameter is 'constant'.
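The pairing between each `*_init_method` and the companion parameter that takes effect can be summarised with a small lookup. This is a minimal sketch based only on the field descriptions above; `effective_init_param` is a hypothetical helper and not part of the package.

```r
# Return the name of the companion hyperparameter that takes effect
# for a given term ("bias", "linear" or "factors") and init method.
effective_init_param <- function(term = c("bias", "linear", "factors"), method) {
  term <- match.arg(term)
  suffix <- switch(method,
    uniform = "scale",   # range of the uniform initialization
    normal = "sigma",    # standard deviation of the normal initialization
    constant = "value",  # fixed initial value
    stop("unknown initialization method: ", method)
  )
  paste0(term, "_init_", suffix)
}

effective_init_param("bias", "uniform")
effective_init_param("factors", "normal")
```

Setting, for example, bias_init_sigma while bias_init_method is 'uniform' would therefore have no effect according to the descriptions above.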
sagemaker.mlcore::EstimatorBase$latest_job_debugger_artifacts_path()
sagemaker.mlcore::EstimatorBase$latest_job_profiler_artifacts_path()
sagemaker.mlcore::EstimatorBase$latest_job_tensorboard_artifacts_path()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$hyperparameters()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$prepare_workflow_for_training()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$training_image_uri()
new()
Factorization Machines is an :class:'Estimator' for general-purpose supervised learning. Amazon SageMaker Factorization Machines is a general-purpose supervised learning algorithm that you can use for both classification and regression tasks. It is an extension of a linear model that is designed to parsimoniously capture interactions between features within high dimensional sparse datasets.
This Estimator may be fit via calls to :meth:'~sagemaker.amazon.amazon_estimator.AmazonAlgorithmEstimatorBase.fit'. It requires Amazon :class:'~sagemaker.amazon.record_pb2.Record' protobuf serialized data to be stored in S3. There is a utility, :meth:'~sagemaker.amazon.amazon_estimator.AmazonAlgorithmEstimatorBase.record_set', that can be used to upload data to S3 and create a :class:'~sagemaker.amazon.amazon_estimator.RecordSet' to be passed to the 'fit' call. To learn more about the Amazon protobuf Record class and how to prepare bulk data in this format, please consult AWS technical documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/cdf-training.html
After this Estimator is fit, model data is stored in S3. The model may be deployed to an Amazon SageMaker Endpoint by invoking :meth:'~sagemaker.amazon.estimator.EstimatorBase.deploy'. As well as deploying an Endpoint, deploy returns a :class:'~sagemaker.amazon.factorization_machines.FactorizationMachinesPredictor' object that can be used for inference calls using the trained model hosted in the SageMaker Endpoint.
FactorizationMachines Estimators can be configured by setting hyperparameters. The available hyperparameters for FactorizationMachines are documented below. For further information on the AWS FactorizationMachines algorithm, please consult AWS technical documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/fact-machines.html
FactorizationMachines$new( role, instance_count, instance_type, num_factors, predictor_type, epochs = NULL, clip_gradient = NULL, eps = NULL, rescale_grad = NULL, bias_lr = NULL, linear_lr = NULL, factors_lr = NULL, bias_wd = NULL, linear_wd = NULL, factors_wd = NULL, bias_init_method = NULL, bias_init_scale = NULL, bias_init_sigma = NULL, bias_init_value = NULL, linear_init_method = NULL, linear_init_scale = NULL, linear_init_sigma = NULL, linear_init_value = NULL, factors_init_method = NULL, factors_init_scale = NULL, factors_init_sigma = NULL, factors_init_value = NULL, ... )
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
instance_count
(int): Number of Amazon EC2 instances to use for training.
instance_type
(str): Type of EC2 instance to use for training, for example, 'ml.c4.xlarge'.
num_factors
(int): Dimensionality of factorization.
predictor_type
(str): Type of predictor 'binary_classifier' or 'regressor'.
epochs
(int): Number of training epochs to run.
clip_gradient
(float): Optimizer parameter. Clip the gradient by projecting onto the box [-clip_gradient, +clip_gradient]
eps
(float): Optimizer parameter. Small value to avoid division by 0.
rescale_grad
(float): Optimizer parameter. If set, multiplies the gradient by rescale_grad before updating. Often chosen to be 1.0/batch_size.
bias_lr
(float): Non-negative learning rate for the bias term.
linear_lr
(float): Non-negative learning rate for linear terms.
factors_lr
(float): Non-negative learning rate for factorization terms.
bias_wd
(float): Non-negative weight decay for the bias term.
linear_wd
(float): Non-negative weight decay for linear terms.
factors_wd
(float): Non-negative weight decay for factorization terms.
bias_init_method
(string): Initialization method for the bias term: 'normal', 'uniform' or 'constant'.
bias_init_scale
(float): Non-negative range for initialization of the bias term that takes effect when bias_init_method parameter is 'uniform'
bias_init_sigma
(float): Non-negative standard deviation for initialization of the bias term that takes effect when bias_init_method parameter is 'normal'.
bias_init_value
(float): Initial value of the bias term that takes effect when bias_init_method parameter is 'constant'.
linear_init_method
(string): Initialization method for linear term: 'normal', 'uniform' or 'constant'.
linear_init_scale
(float): Non-negative range for initialization of linear terms that takes effect when linear_init_method parameter is 'uniform'.
linear_init_sigma
(float): Non-negative standard deviation for initialization of linear terms that takes effect when linear_init_method parameter is 'normal'.
linear_init_value
(float): Initial value of linear terms that takes effect when linear_init_method parameter is 'constant'.
factors_init_method
(string): Initialization method for factorization term: 'normal', 'uniform' or 'constant'.
factors_init_scale
(float): Non-negative range for initialization of factorization terms that takes effect when factors_init_method parameter is 'uniform'.
factors_init_sigma
(float): Non-negative standard deviation for initialization of factorization terms that takes effect when factors_init_method parameter is 'normal'.
factors_init_value
(float): Initial value of factorization terms that takes effect when factors_init_method parameter is 'constant'.
...
: base class keyword argument values. You can find additional parameters for initializing this class at :class:'~sagemaker.estimator.amazon_estimator.AmazonAlgorithmEstimatorBase' and :class:'~sagemaker.estimator.EstimatorBase'.
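The two optimizer parameters rescale_grad and clip_gradient described above can be pictured with a few lines of base R. This is an illustrative sketch only: `apply_grad_options` is a hypothetical helper, and applying the rescaling before the clipping is an assumption, not something this manual states.

```r
# Apply optional rescaling and box clipping to a gradient vector.
# Assumption: rescaling happens first, then clipping.
apply_grad_options <- function(grad, rescale_grad = NULL, clip_gradient = NULL) {
  if (!is.null(rescale_grad)) {
    grad <- grad * rescale_grad
  }
  if (!is.null(clip_gradient)) {
    # Project each component onto [-clip_gradient, +clip_gradient].
    grad <- pmax(pmin(grad, clip_gradient), -clip_gradient)
  }
  grad
}

batch_size <- 4
raw_grad <- c(-8, 2, 40)
apply_grad_options(raw_grad, rescale_grad = 1 / batch_size, clip_gradient = 5)
```

With rescale_grad set to 1.0/batch_size, a gradient summed over a mini-batch becomes an average before the update, which is why that value is often chosen.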
create_model()
Return a :class:'~sagemaker.amazon.FactorizationMachinesModel' referencing the latest s3 model data produced by this Estimator.
FactorizationMachines$create_model( vpc_config_override = "VPC_CONFIG_DEFAULT", ... )
vpc_config_override
(dict[str, list[str]]): Optional override for VpcConfig set on the model. Default: use subnets and security groups from this Estimator.
* 'Subnets' (list[str]): List of subnet ids.
* 'SecurityGroupIds' (list[str]): List of security group ids.
...
: Additional kwargs passed to the FactorizationMachinesModel constructor.
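In R, the dict[str, list[str]] shape expected by vpc_config_override would most naturally be written as a named list of character vectors. The sketch below assumes that mapping; the subnet and security group ids are placeholders, and `is_valid_vpc_config` is a hypothetical helper rather than a package function.

```r
# Placeholder ids, for illustration only.
vpc_config <- list(
  Subnets = c("subnet-0example1", "subnet-0example2"),
  SecurityGroupIds = c("sg-0example")
)

# Check that a VPC config has exactly the two documented keys,
# each holding a character vector.
is_valid_vpc_config <- function(cfg) {
  is.list(cfg) &&
    setequal(names(cfg), c("Subnets", "SecurityGroupIds")) &&
    all(vapply(cfg, is.character, logical(1)))
}

is_valid_vpc_config(vpc_config)
```

A list that passes this check could then be supplied as the vpc_config_override argument of create_model().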
clone()
The objects of this class are cloneable with this method.
FactorizationMachines$clone(deep = FALSE)
deep
Whether to make a deep clone.
Reference S3 model data created by FactorizationMachines estimator. Calling :meth:'~sagemaker.model.Model.deploy' creates an Endpoint and returns :class:'FactorizationMachinesPredictor'.
sagemaker.mlcore::ModelBase
-> sagemaker.mlcore::Model
-> FactorizationMachinesModel
new()
Initialize FactorizationMachinesModel class
FactorizationMachinesModel$new(model_data, role, sagemaker_session = NULL, ...)
model_data
(str): The S3 location of a SageMaker model data “.tar.gz“ file.
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the estimator creates one using the default AWS configuration chain.
...
: Keyword arguments passed to the “Model“ initializer.
clone()
The objects of this class are cloneable with this method.
FactorizationMachinesModel$clone(deep = FALSE)
deep
Whether to make a deep clone.
The implementation of :meth:'~sagemaker.predictor.Predictor.predict' in this 'Predictor' requires a numpy “ndarray“ as input. The array should contain the same number of columns as the feature-dimension of the data used to fit the model this Predictor performs inference on. :meth:'predict()' returns a list of :class:'~sagemaker.amazon.record_pb2.Record' objects, one for each row in the input “ndarray“. The prediction is stored in the “"score"“ key of the “Record.label“ field. For details on the supported formats, see: https://docs.aws.amazon.com/sagemaker/latest/dg/fm-in-formats.html
sagemaker.mlcore::PredictorBase
-> sagemaker.mlcore::Predictor
-> FactorizationMachinesPredictor
new()
Initialize FactorizationMachinesPredictor class
FactorizationMachinesPredictor$new(endpoint_name, sagemaker_session = NULL)
endpoint_name
(str): Name of the Amazon SageMaker endpoint to which requests are sent.
sagemaker_session
(sagemaker.session.Session): A SageMaker Session object, used for SageMaker interactions (default: NULL). If not specified, one is created using the default AWS configuration chain.
clone()
The objects of this class are cloneable with this method.
FactorizationMachinesPredictor$clone(deep = FALSE)
deep
Whether to make a deep clone.
Handle training of custom HuggingFace code.
sagemaker.mlcore::EstimatorBase
-> sagemaker.mlcore::Framework
-> HuggingFace
.module
mimic python module
new()
This “Estimator“ executes a HuggingFace script in a managed execution environment. The managed HuggingFace environment is an Amazon-built Docker container that executes functions defined in the supplied “entry_point“ Python script within a SageMaker Training Job. Training is started by calling :meth:'~sagemaker.amazon.estimator.Framework.fit' on this Estimator.
HuggingFace$new( py_version, entry_point, transformers_version = NULL, tensorflow_version = NULL, pytorch_version = NULL, source_dir = NULL, hyperparameters = NULL, image_uri = NULL, distribution = NULL, compiler_config = NULL, ... )
py_version
(str): Python version you want to use for executing your model training code. Defaults to “None“. Required unless “image_uri“ is provided. List of supported versions: https://github.com/aws/sagemaker-python-sdk#huggingface-sagemaker-estimators
entry_point
(str): Path (absolute or relative) to the Python source file which should be executed as the entry point to training. If “source_dir“ is specified, then “entry_point“ must point to a file located at the root of “source_dir“.
transformers_version
(str): Transformers version you want to use for executing your model training code. Defaults to “None“. Required unless “image_uri“ is provided. The current supported version is “4.6.1“.
tensorflow_version
(str): TensorFlow version you want to use for executing your model training code. Defaults to “None“. Required unless “pytorch_version“ is provided. List of supported versions: https://github.com/aws/sagemaker-python-sdk#huggingface-sagemaker-estimators.
pytorch_version
(str): PyTorch version you want to use for executing your model training code. Defaults to “None“. Required unless “tensorflow_version“ is provided. List of supported versions: https://github.com/aws/sagemaker-python-sdk#huggingface-sagemaker-estimators.
source_dir
(str): Path (absolute, relative or an S3 URI) to a directory with any other training source code dependencies aside from the entry point file (default: None). If “source_dir“ is an S3 URI, it must point to a tar.gz file. The structure within this directory is preserved when training on Amazon SageMaker.
hyperparameters
(dict): Hyperparameters that will be used for training (default: None). The hyperparameters are made accessible as a dict[str, str] to the training code on SageMaker. For convenience, this accepts other types for keys and values, but “str()“ will be called to convert them before training.
image_uri
(str): If specified, the estimator will use this image for training and hosting, instead of selecting the appropriate SageMaker official image based on framework_version and py_version. It can be an ECR URL or a Docker Hub image and tag. Examples:
* “123412341234.dkr.ecr.us-west-2.amazonaws.com/my-custom-image:1.0“
* “custom-image:latest“
If “framework_version“ or “py_version“ is “None“, then “image_uri“ is required. If “image_uri“ is also “None“, a “ValueError“ will be raised.
distribution
(dict): A dictionary with information on how to run distributed training (default: None). Currently, the following are supported: distributed training with parameter servers, SageMaker Distributed (SMD) Data and Model Parallelism, and MPI. SMD Model Parallelism can only be used with MPI.
To enable parameter server, use the following setup: {"parameter_server": {"enabled": True}}
To enable MPI: {"mpi": {"enabled": True}}
To enable SMDistributed Data Parallel or Model Parallel: {"smdistributed": {"dataparallel": {"enabled": True}, "modelparallel": {"enabled": True, "parameters": {}}}}
compiler_config
(:class:'sagemaker.mlcore::TrainingCompilerConfig'): Configures SageMaker Training Compiler to accelerate training.
...
: Additional kwargs passed to the :class:'~sagemaker.estimator.Framework' constructor.
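The distribution argument above is documented as a Python dict. A plausible R equivalent is a nested named list, sketched below under that assumption. `validate_distribution` is a hypothetical helper that only encodes the documented rule that SMD Model Parallelism requires MPI; it is not the package's own validation.

```r
# Nested named lists mirroring the documented distribution settings.
parameter_server <- list(parameter_server = list(enabled = TRUE))
mpi_only <- list(mpi = list(enabled = TRUE))
data_parallel <- list(smdistributed = list(dataparallel = list(enabled = TRUE)))
model_parallel <- list(
  mpi = list(enabled = TRUE),
  smdistributed = list(modelparallel = list(enabled = TRUE, parameters = list()))
)

# Enforce "SMD Model Parallelism can only be used with MPI".
validate_distribution <- function(distribution) {
  uses_model_parallel <- isTRUE(distribution$smdistributed$modelparallel$enabled)
  uses_mpi <- isTRUE(distribution$mpi$enabled)
  if (uses_model_parallel && !uses_mpi) {
    stop("SMD Model Parallelism requires MPI to be enabled")
  }
  invisible(distribution)
}

validate_distribution(model_parallel)
validate_distribution(data_parallel)
```

Accessing a missing element of a list with `$` returns NULL, so the check works for partially specified lists without extra guards.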
hyperparameters()
Return hyperparameters used by your custom HuggingFace code during model training.
HuggingFace$hyperparameters()
create_model()
Create a model to deploy. The serializer, deserializer, content_type, and accept arguments are only used to define a default Predictor. They are ignored if an explicit predictor class is passed in. Other arguments are passed through to the Model class. Creating model with HuggingFace training job is not supported.
HuggingFace$create_model( model_server_workers = NULL, role = NULL, vpc_config_override = "VPC_CONFIG_DEFAULT", entry_point = NULL, source_dir = NULL, dependencies = NULL, ... )
model_server_workers
(int): Optional. The number of worker processes used by the inference server. If None, server will use one worker per vCPU.
role
(str): The “ExecutionRoleArn“ IAM Role ARN for the “Model“, which is also used during transform jobs. If not specified, the role from the Estimator will be used.
vpc_config_override
(dict[str, list[str]]): Optional override for VpcConfig set on the model. Default: use subnets and security groups from this Estimator.
* 'Subnets' (list[str]): List of subnet ids.
* 'SecurityGroupIds' (list[str]): List of security group ids.
entry_point
(str): Path (absolute or relative) to the local Python source file which should be executed as the entry point to training. If “source_dir“ is specified, then “entry_point“ must point to a file located at the root of “source_dir“. If 'git_config' is provided, 'entry_point' should be a relative location to the Python source file in the Git repo.
source_dir
(str): Path (absolute, relative or an S3 URI) to a directory with any other training source code dependencies aside from the entry point file (default: None). If “source_dir“ is an S3 URI, it must point to a tar.gz file. The structure within this directory is preserved when training on Amazon SageMaker. If 'git_config' is provided, 'source_dir' should be a relative location to a directory in the Git repo.
dependencies
(list[str]): A list of paths to directories (absolute or relative) with any additional libraries that will be exported to the container (default: []). The library folders will be copied to SageMaker in the same folder where the entrypoint is copied. If 'git_config' is provided, 'dependencies' should be a list of relative locations to directories with any additional libraries needed in the Git repo.
...
: Additional parameters passed to :class:'~sagemaker.model.Model'. Tip: see :class:'~sagemaker.model.Model' for the additional parameters accepted by this method.
(sagemaker.model.Model) a Model ready for deployment.
clone()
The objects of this class are cloneable with this method.
HuggingFace$clone(deep = FALSE)
deep
Whether to make a deep clone.
A Hugging Face SageMaker “Model“ that can be deployed to a SageMaker “Endpoint“.
sagemaker.mlcore::ModelBase
-> sagemaker.mlcore::Model
-> sagemaker.mlcore::FrameworkModel
-> HuggingFaceModel
new()
Initialize a HuggingFaceModel.
HuggingFaceModel$new( role, model_data = NULL, entry_point = NULL, transformers_version = NULL, tensorflow_version = NULL, pytorch_version = NULL, py_version = NULL, image_uri = NULL, predictor_cls = HuggingFacePredictor, model_server_workers = NULL, ... )
role
(str): An AWS IAM role specified with either the name or full ARN. The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
model_data
(str): The Amazon S3 location of a SageMaker model data “.tar.gz“ file.
entry_point
(str): The absolute or relative path to the Python source file that should be executed as the entry point to model hosting. If “source_dir“ is specified, then “entry_point“ must point to a file located at the root of “source_dir“. Defaults to None.
transformers_version
(str): Transformers version you want to use for executing your model training code. Defaults to None. Required unless “image_uri“ is provided.
tensorflow_version
(str): TensorFlow version you want to use for executing your inference code. Defaults to “None“. Required unless “pytorch_version“ is provided. List of supported versions: https://github.com/aws/sagemaker-python-sdk#huggingface-sagemaker-estimators.
pytorch_version
(str): PyTorch version you want to use for executing your inference code. Defaults to “None“. Required unless “tensorflow_version“ is provided. List of supported versions: https://github.com/aws/sagemaker-python-sdk#huggingface-sagemaker-estimators.
py_version
(str): Python version you want to use for executing your model training code. Defaults to “None“. Required unless “image_uri“ is provided.
image_uri
(str): A Docker image URI. Defaults to None. If not specified, a default image for PyTorch will be used. If “framework_version“ or “py_version“ is “None“, then “image_uri“ is required. If “image_uri“ is also “None“, a “ValueError“ will be raised.
predictor_cls
(callable[str, sagemaker.session.Session]): A function to call to create a predictor with an endpoint name and SageMaker “Session“. If specified, “deploy()“ returns the result of invoking this function on the created endpoint name.
model_server_workers
(int): Optional. The number of worker processes used by the inference server. If None, server will use one worker per vCPU.
...
: Keyword arguments passed to the superclass :class:'~sagemaker.model.FrameworkModel' and, subsequently, its superclass :class:'~sagemaker.model.Model'.
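The "required unless" rules for the version arguments above can be read as a small pre-flight check. The sketch below is a hypothetical helper, not the package's validation. It assumes that supplying image_uri waives all version requirements, which goes slightly beyond what the argument descriptions state for the TensorFlow and PyTorch versions.

```r
# Return the names of version arguments that are still missing.
check_huggingface_versions <- function(transformers_version = NULL,
                                       tensorflow_version = NULL,
                                       pytorch_version = NULL,
                                       py_version = NULL,
                                       image_uri = NULL) {
  missing_args <- character(0)
  # Assumption: an explicit image makes the version arguments unnecessary.
  if (is.null(image_uri)) {
    if (is.null(transformers_version)) {
      missing_args <- c(missing_args, "transformers_version")
    }
    if (is.null(py_version)) {
      missing_args <- c(missing_args, "py_version")
    }
    if (is.null(tensorflow_version) && is.null(pytorch_version)) {
      missing_args <- c(missing_args, "tensorflow_version or pytorch_version")
    }
  }
  missing_args
}

# The version strings here are placeholders, not a statement of what is supported.
check_huggingface_versions(transformers_version = "4.6.1", py_version = "py36")
```

An empty result means the combination is complete under these rules; the example call above still reports that one of the two framework versions is needed.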
register()
Creates a model package for creating SageMaker models or listing on Marketplace.
HuggingFaceModel$register( content_types, response_types, inference_instances, transform_instances, model_package_name = NULL, model_package_group_name = NULL, image_uri = NULL, model_metrics = NULL, metadata_properties = NULL, marketplace_cert = FALSE, approval_status = NULL, description = NULL, drift_check_baselines = NULL )
content_types
(list): The supported MIME types for the input data.
response_types
(list): The supported MIME types for the output data.
inference_instances
(list): A list of the instance types that are used to generate inferences in real-time.
transform_instances
(list): A list of the instance types on which a transformation job can be run or on which an endpoint can be deployed.
model_package_name
(str): Model Package name. Mutually exclusive with 'model_package_group_name'; using 'model_package_name' makes the Model Package un-versioned. Defaults to “None“.
model_package_group_name
(str): Model Package Group name. Mutually exclusive with 'model_package_name'; using 'model_package_group_name' makes the Model Package versioned. Defaults to “None“.
image_uri
(str): Inference image URI for the container. Model class' self.image will be used if it is None. Defaults to “None“.
model_metrics
(ModelMetrics): ModelMetrics object. Defaults to “None“.
metadata_properties
(MetadataProperties): MetadataProperties object. Defaults to “None“.
marketplace_cert
(bool): A boolean value indicating if the Model Package is certified for AWS Marketplace. Defaults to “False“.
approval_status
(str): Model Approval Status, values can be "Approved", "Rejected", or "PendingManualApproval". Defaults to “PendingManualApproval“.
description
(str): Model Package description. Defaults to “None“.
drift_check_baselines
(DriftCheckBaselines): DriftCheckBaselines object (default: None)
A 'sagemaker.model.ModelPackage' instance.
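The relationship between model_package_name and model_package_group_name described above can be sketched as a simple check. `package_versioning` is a hypothetical helper written for illustration; it only restates the documented rule and is not how register() is implemented.

```r
# Report whether the resulting Model Package would be versioned,
# and reject the combination the documentation marks as exclusive.
package_versioning <- function(model_package_name = NULL,
                               model_package_group_name = NULL) {
  if (!is.null(model_package_name) && !is.null(model_package_group_name)) {
    stop("model_package_name and model_package_group_name are mutually exclusive")
  }
  if (!is.null(model_package_group_name)) {
    "versioned"
  } else if (!is.null(model_package_name)) {
    "un-versioned"
  } else {
    NA_character_
  }
}

# The names below are placeholders.
package_versioning(model_package_group_name = "my-model-group")
package_versioning(model_package_name = "my-model-package")
```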
prepare_container_def()
A container definition with framework configuration set in model environment variables.
HuggingFaceModel$prepare_container_def( instance_type = NULL, accelerator_type = NULL )
instance_type
(str): The EC2 instance type to deploy this Model to. For example, 'ml.p2.xlarge'.
accelerator_type
(str): The Elastic Inference accelerator type to deploy to the instance for loading and making inferences to the model.
dict[str, str]: A container definition object usable with the CreateModel API.
serving_image_uri()
Create a URI for the serving image.
HuggingFaceModel$serving_image_uri( region_name, instance_type, accelerator_type = NULL )
region_name
(str): AWS region where the image is uploaded.
instance_type
(str): SageMaker instance type. Used to determine device type (cpu/gpu/family-specific optimized).
accelerator_type
(str): The Elastic Inference accelerator type to deploy to the instance for loading and making inferences to the model.
str: The appropriate image URI based on the given parameters.
clone()
The objects of this class are cloneable with this method.
HuggingFaceModel$clone(deep = FALSE)
deep
Whether to make a deep clone.
This is able to serialize Python lists, dictionaries, and numpy arrays to multidimensional tensors for Hugging Face inference.
sagemaker.mlcore::PredictorBase
-> sagemaker.mlcore::Predictor
-> HuggingFacePredictor
new()
Initialize a “HuggingFacePredictor“.
HuggingFacePredictor$new( endpoint_name, sagemaker_session = NULL, serializer = JSONSerializer$new(), deserializer = JSONDeserializer$new() )
endpoint_name
(str): The name of the endpoint to perform inference on.
sagemaker_session
(sagemaker.session.Session): Session object that manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the estimator creates one using the default AWS configuration chain.
serializer
(sagemaker.serializers.BaseSerializer): Optional. Default serializes input data to JSON format.
deserializer
(sagemaker.deserializers.BaseDeserializer): Optional. Default parses the response from JSON format.
clone()
The objects of this class are cloneable with this method.
HuggingFacePredictor$clone(deep = FALSE)
deep
Whether to make a deep clone.
Handles Amazon SageMaker processing tasks for jobs using HuggingFace containers.
sagemaker.common::Processor
-> sagemaker.common::ScriptProcessor
-> sagemaker.common::FrameworkProcessor
-> HuggingFaceProcessor
estimator_cls
Estimator object
new()
This processor executes a Python script in a HuggingFace execution environment. Unless “image_uri“ is specified, the environment is an Amazon-built Docker container that executes functions defined in the supplied “code“ Python script. The arguments have the same meaning as in “FrameworkProcessor“, with the following exceptions.
HuggingFaceProcessor$new( role, instance_count, instance_type, transformers_version = NULL, tensorflow_version = NULL, pytorch_version = NULL, py_version = "py36", image_uri = NULL, command = NULL, volume_size_in_gb = 30, volume_kms_key = NULL, output_kms_key = NULL, code_location = NULL, max_runtime_in_seconds = NULL, base_job_name = NULL, sagemaker_session = NULL, env = NULL, tags = NULL, network_config = NULL )
role
(str): An AWS IAM role name or ARN. Amazon SageMaker Processing uses this role to access AWS resources, such as data stored in Amazon S3.
instance_count
(int): The number of instances to run a processing job with.
instance_type
(str): The type of EC2 instance to use for processing, for example, 'ml.c4.xlarge'.
transformers_version
(str): Transformers version you want to use for executing your model training code. Defaults to “None“. Required unless “image_uri“ is provided. The current supported version is “4.4.2“.
tensorflow_version
(str): TensorFlow version you want to use for executing your model training code. Defaults to “None“. Required unless “pytorch_version“ is provided. The current supported version is “2.4.1“.
pytorch_version
(str): PyTorch version you want to use for executing your model training code. Defaults to “None“. Required unless “tensorflow_version“ is provided. The current supported version is “1.6.0“.
py_version
(str): Python version you want to use for executing your model training code. Defaults to “None“. Required unless “image_uri“ is provided. If using PyTorch, the current supported version is “py36“. If using TensorFlow, the current supported version is “py37“.
image_uri
(str): The URI of the Docker image to use for the processing jobs (default: None).
command
([str]): The command to run, along with any command-line flags to precede the code script. Example: ["python3", "-v"]. If not provided, ["python"] will be chosen (default: None).
volume_size_in_gb
(int): Size in GB of the EBS volume to use for storing data during processing (default: 30).
volume_kms_key
(str): A KMS key for the processing volume (default: None).
output_kms_key
(str): The KMS key ID for processing job outputs (default: None).
code_location
(str): The S3 prefix URI where custom code will be uploaded (default: None). The code file uploaded to S3 is 'code_location/job-name/source/sourcedir.tar.gz'. If not specified, the default code location is 's3://sagemaker-default-bucket'.
max_runtime_in_seconds
(int): Timeout in seconds (default: None). After this amount of time, Amazon SageMaker terminates the job, regardless of its current status. If 'max_runtime_in_seconds' is not specified, the default value is 24 hours.
base_job_name
(str): Prefix for processing name. If not specified, the processor generates a default job name, based on the processing image name and current timestamp (default: None).
sagemaker_session
(:class:'~sagemaker.session.Session'): Session object which manages interactions with Amazon SageMaker and any other AWS services needed. If not specified, the processor creates one using the default AWS configuration chain (default: None).
env
(dict[str, str]): Environment variables to be passed to the processing jobs (default: None).
tags
(list[dict]): List of tags to be passed to the processing job (default: None). For more, see https://docs.aws.amazon.com/sagemaker/latest/dg/API_Tag.html.
network_config
(:class:'~sagemaker.network.NetworkConfig'): A :class:'~sagemaker.network.NetworkConfig' object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets (default: None).
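To make the code_location layout described above concrete, the following sketch assembles the upload path 'code_location/job-name/source/sourcedir.tar.gz' from its parts. `sourcedir_uri` is a hypothetical helper, and the bucket and job name are placeholders.

```r
# Build the S3 URI where the packaged source would be uploaded.
sourcedir_uri <- function(code_location, job_name) {
  # Drop any trailing slashes so the pieces join with single separators.
  prefix <- sub("/+$", "", code_location)
  paste(prefix, job_name, "source", "sourcedir.tar.gz", sep = "/")
}

sourcedir_uri("s3://my-example-bucket/processing/", "my-job-name")
```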
clone()
The objects of this class are cloneable with this method.
HuggingFaceProcessor$clone(deep = FALSE)
deep
Whether to make a deep clone.
IP Insights is designed to capture associations between IPv4 addresses and various entities, such as user IDs or account numbers.
sagemaker.mlcore::EstimatorBase
-> sagemaker.mlcore::AmazonAlgorithmEstimatorBase
-> IPInsights
repo_name
sagemaker repo name for framework
repo_version
version of framework
MINI_BATCH_SIZE
The size of each mini-batch to use when training. If None, a default value will be used.
.module
mimic python module
num_entity_vectors
The number of embeddings to train for entities accessing online resources
vector_dim
The size of the embedding vectors for both entity and IP addresses
batch_metrics_publish_interval
The period at which to publish metrics
epochs
Maximum number of passes over the training data.
learning_rate
Learning rate for the optimizer.
num_ip_encoder_layers
The number of fully-connected layers to encode IP address embedding.
random_negative_sampling_rate
The ratio of random negative samples to draw during training.
shuffled_negative_sampling_rate
The ratio of shuffled negative samples to draw during training.
weight_decay
Weight decay coefficient. Adds L2 regularization
sagemaker.mlcore::EstimatorBase$latest_job_debugger_artifacts_path()
sagemaker.mlcore::EstimatorBase$latest_job_profiler_artifacts_path()
sagemaker.mlcore::EstimatorBase$latest_job_tensorboard_artifacts_path()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$hyperparameters()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$prepare_workflow_for_training()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$training_image_uri()
new()
This estimator is for IP Insights, an unsupervised algorithm that learns usage patterns of IP addresses.
This Estimator may be fit via calls to :meth:'~sagemaker.amazon.amazon_estimator.AmazonAlgorithmEstimatorBase.fit'. It requires CSV data to be stored in S3.
After this Estimator is fit, model data is stored in S3. The model may be deployed to an Amazon SageMaker Endpoint by invoking :meth:'~sagemaker.amazon.estimator.EstimatorBase.deploy'. As well as deploying an Endpoint, deploy returns a :class:'~sagemaker.amazon.IPInsightsPredictor' object that can be used for inference calls using the trained model hosted in the SageMaker Endpoint.
IPInsights Estimators can be configured by setting hyperparameters. The available hyperparameters are documented below. For further information on the AWS IPInsights algorithm, please consult AWS technical documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/ip-insights-hyperparameters.html
IPInsights$new( role, instance_count, instance_type, num_entity_vectors, vector_dim, batch_metrics_publish_interval = NULL, epochs = NULL, learning_rate = NULL, num_ip_encoder_layers = NULL, random_negative_sampling_rate = NULL, shuffled_negative_sampling_rate = NULL, weight_decay = NULL, ... )
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
instance_count
(int): Number of Amazon EC2 instances to use for training.
instance_type
(str): Type of EC2 instance to use for training, for example, 'ml.m5.xlarge'.
num_entity_vectors
(int): Required. The number of embeddings to train for entities accessing online resources. We recommend 2x the total number of unique entity IDs.
vector_dim
(int): Required. The size of the embedding vectors for both entity and IP addresses.
batch_metrics_publish_interval
(int): Optional. The period at which to publish metrics (batches).
epochs
(int): Optional. Maximum number of passes over the training data.
learning_rate
(float): Optional. Learning rate for the optimizer.
num_ip_encoder_layers
(int): Optional. The number of fully-connected layers to encode IP address embedding.
random_negative_sampling_rate
(int): Optional. The ratio of random negative samples to draw during training. Random negative samples are randomly drawn IPv4 addresses.
shuffled_negative_sampling_rate
(int): Optional. The ratio of shuffled negative samples to draw during training. Shuffled negative samples are IP addresses picked from within a batch.
weight_decay
(float): Optional. Weight decay coefficient. Adds L2 regularization.
...
: base class keyword argument values.
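The recommendation for num_entity_vectors above can be computed from the training data before the estimator is constructed. The following is a minimal base R sketch; entity_ids is made-up sample data used only for illustration and is not provided by the package.

```r
# Hypothetical sample of entity IDs, e.g. the first column of the training CSV.
entity_ids <- c("user_1", "user_2", "user_1", "user_3")

# Recommended size: twice the number of unique entity IDs.
num_entity_vectors <- 2L * length(unique(entity_ids))
num_entity_vectors
```

The resulting value could then be passed as the num_entity_vectors argument of IPInsights$new().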
create_model()
Create a model for the latest s3 model produced by this estimator.
IPInsights$create_model(vpc_config_override = "VPC_CONFIG_DEFAULT", ...)
vpc_config_override
(dict[str, list[str]]): Optional override for VpcConfig set on the model. Default: use subnets and security groups from this Estimator. * 'Subnets' (list[str]): List of subnet ids. * 'SecurityGroupIds' (list[str]): List of security group ids.
...
: Additional kwargs passed to the IPInsightsModel constructor.
:class:'~sagemaker.amazon.IPInsightsModel': references the latest s3 model data produced by this estimator.
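The dict[str, list[str]] type described for vpc_config_override comes from the Python SDK. In R it presumably corresponds to a named list of character vectors; that mapping is an assumption, not something this page states. The sketch below builds such a list and checks its shape. The subnet and security group IDs are placeholders, and is_valid_vpc_config is a hypothetical helper, not a package function.

```r
# Placeholder IDs; replace with the IDs of your own VPC resources.
vpc_config <- list(
  Subnets = c("subnet-0123456789abcdef0"),
  SecurityGroupIds = c("sg-0123456789abcdef0")
)

# Hypothetical helper: checks that both keys are present and that
# every element is a character vector.
is_valid_vpc_config <- function(cfg) {
  is.list(cfg) &&
    setequal(names(cfg), c("Subnets", "SecurityGroupIds")) &&
    all(vapply(cfg, is.character, logical(1)))
}

is_valid_vpc_config(vpc_config)
```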
.prepare_for_training()
Set hyperparameters needed for training. This method will also validate “source_dir“.
IPInsights$.prepare_for_training( records, mini_batch_size = NULL, job_name = NULL )
records
(RecordSet) – The records to train this Estimator on.
mini_batch_size
(int or None) – The size of each mini-batch to use when training. If None, a default value will be used.
job_name
(str): Name of the training job to be created. If not specified, one is generated, using the base name given to the constructor if applicable.
clone()
The objects of this class are cloneable with this method.
IPInsights$clone(deep = FALSE)
deep
Whether to make a deep clone.
Calling :meth:'~sagemaker.model.Model.deploy' creates an Endpoint and returns a Predictor that calculates anomaly scores for data points.
sagemaker.mlcore::ModelBase
-> sagemaker.mlcore::Model
-> IPInsightsModel
new()
Initialize IPInsightsModel class
IPInsightsModel$new(model_data, role, sagemaker_session = NULL, ...)
model_data
(str): The S3 location of a SageMaker model data “.tar.gz“ file.
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the estimator creates one using the default AWS configuration chain.
...
: Keyword arguments passed to the “FrameworkModel“ initializer.
clone()
The objects of this class are cloneable with this method.
IPInsightsModel$clone(deep = FALSE)
deep
Whether to make a deep clone.
The implementation of :meth:'~sagemaker.predictor.Predictor.predict' in this 'Predictor' requires a numpy “ndarray“ as input. The array should contain two columns. The first column should contain the entity ID. The second column should contain the IPv4 address in dot notation.
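The two-column layout described above can be prepared and checked locally before any request is sent. This is a minimal base R sketch that assumes a data.frame is an acceptable stand-in for the numpy array mentioned in the text. The column names and is_ipv4 are hypothetical and are not part of the package.

```r
# Hypothetical helper: TRUE for strings in IPv4 dot notation
# (four numeric parts, each between 0 and 255).
is_ipv4 <- function(x) {
  parts <- strsplit(x, ".", fixed = TRUE)
  vapply(parts, function(p) {
    length(p) == 4L &&
      all(grepl("^[0-9]{1,3}$", p)) &&
      all(as.integer(p) <= 255L)
  }, logical(1))
}

# First column: entity ID. Second column: IPv4 address in dot notation.
payload <- data.frame(
  entity_id = c("user_1", "user_2"),
  ip_address = c("192.168.0.1", "10.0.0.254"),
  stringsAsFactors = FALSE
)

# Stop early if the payload does not have the expected shape.
stopifnot(ncol(payload) == 2L, all(is_ipv4(payload$ip_address)))
```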
sagemaker.mlcore::PredictorBase
-> sagemaker.mlcore::Predictor
-> IPInsightsPredictor
new()
Initialize IPInsightsPredictor class
IPInsightsPredictor$new(endpoint_name, sagemaker_session = NULL)
endpoint_name
(str): Name of the Amazon SageMaker endpoint to which requests are sent.
sagemaker_session
(sagemaker.session.Session): A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, one is created using the default AWS configuration chain.
clone()
The objects of this class are cloneable with this method.
IPInsightsPredictor$clone(deep = FALSE)
deep
Whether to make a deep clone.
As the result of KMeans, members of a group are as similar as possible to one another and as different as possible from members of other groups. You define the attributes that you want the algorithm to use to determine similarity.
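To make the notion of similarity concrete, the sketch below assigns each point to its nearest cluster center using squared Euclidean distance. It is a plain base R illustration of the assignment step only, not the SageMaker implementation; assign_clusters and the sample data are made up for this example.

```r
# Hypothetical helper: index of the nearest center for every row of x.
assign_clusters <- function(x, centers) {
  # Squared Euclidean distance between each row of x and each center,
  # using ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * a.b
  d2 <- outer(rowSums(x^2), rowSums(centers^2), "+") - 2 * x %*% t(centers)
  apply(d2, 1, which.min)
}

# Two attributes per observation; two well-separated groups.
x <- rbind(c(0, 0), c(0.2, 0.1), c(5, 5), c(5.1, 4.9))
centers <- rbind(c(0, 0), c(5, 5))

assign_clusters(x, centers)
```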
sagemaker.mlcore::EstimatorBase
-> sagemaker.mlcore::AmazonAlgorithmEstimatorBase
-> KMeans
repo_name
sagemaker repo name for framework
repo_version
version of framework
.module
mimic python module
k
The number of clusters to produce.
init_method
How to initialize cluster locations.
max_iterations
Maximum iterations for Lloyd's EM procedure in the local kmeans used in the finalize stage.
tol
Tolerance for change in ssd for early stopping in local kmeans.
num_trials
Local version is run multiple times and the one with the best loss is chosen.
local_init_method
Initialization method for local version.
half_life_time_size
The points can have a decayed weight.
epochs
Number of passes done over the training data.
center_factor
The algorithm will create “num_clusters * extra_center_factor“ centers as it runs.
eval_metrics
JSON list of metrics types to be used for reporting the score for the model.
sagemaker.mlcore::EstimatorBase$latest_job_debugger_artifacts_path()
sagemaker.mlcore::EstimatorBase$latest_job_profiler_artifacts_path()
sagemaker.mlcore::EstimatorBase$latest_job_tensorboard_artifacts_path()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$prepare_workflow_for_training()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$training_image_uri()
new()
A k-means clustering :class:'~sagemaker.amazon.AmazonAlgorithmEstimatorBase'. Finds k clusters of data in an unlabeled dataset. This Estimator may be fit via calls to :meth:'~sagemaker.amazon.amazon_estimator.AmazonAlgorithmEstimatorBase.fit_ndarray' or :meth:'~sagemaker.amazon.amazon_estimator.AmazonAlgorithmEstimatorBase.fit'. The former allows a KMeans model to be fit on a 2-dimensional numpy array. The latter requires Amazon :class:'~sagemaker.amazon.record_pb2.Record' protobuf serialized data to be stored in S3. To learn more about the Amazon protobuf Record class and how to prepare bulk data in this format, please consult AWS technical documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/cdf-training.html. After this Estimator is fit, model data is stored in S3. The model may be deployed to an Amazon SageMaker Endpoint by invoking :meth:'~sagemaker.amazon.estimator.EstimatorBase.deploy'. As well as deploying an Endpoint, “deploy“ returns a :class:'~sagemaker.amazon.kmeans.KMeansPredictor' object that can be used to perform k-means cluster assignments, using the trained k-means model hosted in the SageMaker Endpoint. KMeans Estimators can be configured by setting hyperparameters. The available hyperparameters for KMeans are documented below. For further information on the AWS KMeans algorithm, please consult AWS technical documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/k-means.html.
KMeans$new( role, instance_count, instance_type, k, init_method = NULL, max_iterations = NULL, tol = NULL, num_trials = NULL, local_init_method = NULL, half_life_time_size = NULL, epochs = NULL, center_factor = NULL, eval_metrics = NULL, ... )
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
instance_count
(int): Number of Amazon EC2 instances to use for training.
instance_type
(str): Type of EC2 instance to use for training, for example, 'ml.c4.xlarge'.
k
(int): The number of clusters to produce.
init_method
(str): How to initialize cluster locations. One of 'random' or 'kmeans++'.
max_iterations
(int): Maximum iterations for Lloyd's EM procedure in the local kmeans used in the finalize stage.
tol
(float): Tolerance for change in ssd for early stopping in local kmeans.
num_trials
(int): Local version is run multiple times and the one with the best loss is chosen. This determines how many times.
local_init_method
(str): Initialization method for local version. One of 'random' or 'kmeans++'.
half_life_time_size
(int): The points can have a decayed weight. When a point is observed, its weight with regard to the computation of the cluster mean is 1. This weight decays exponentially as more points are observed. The exponent coefficient is chosen such that after observing “half_life_time_size“ points after the mentioned point, its weight becomes 1/2. If set to 0, there is no decay.
epochs
(int): Number of passes done over the training data.
center_factor
(int): The algorithm will create “num_clusters * extra_center_factor“ centers as it runs and reduce the number of centers to “k“ when finalizing.
eval_metrics
(list): JSON list of metric types to be used for reporting the score for the model. Allowed values are "msd": mean square error, and "ssd": sum of square distances. If test data is provided, the score is reported in terms of all requested metrics.
...
: base class keyword argument values.
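The decay described for half_life_time_size can be written as a short formula: a point's weight halves every half_life_time_size subsequent observations. The base R sketch below follows that description directly. decayed_weight is a hypothetical helper for illustration and is not part of the package or of the SageMaker algorithm's code.

```r
# Hypothetical helper: weight of a point after n_later_points further
# observations, given the half-life described above.
decayed_weight <- function(n_later_points, half_life_time_size) {
  # A half-life of 0 means no decay: every point keeps weight 1.
  if (half_life_time_size == 0) {
    return(rep(1, length(n_later_points)))
  }
  0.5^(n_later_points / half_life_time_size)
}

# Weight when first observed, after one half-life, and after two.
decayed_weight(c(0, 100, 200), half_life_time_size = 100)
```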
create_model()
Return a :class:'~sagemaker.amazon.kmeans.KMeansModel' referencing the latest s3 model data produced by this Estimator.
KMeans$create_model(vpc_config_override = "VPC_CONFIG_DEFAULT", ...)
vpc_config_override
(dict[str, list[str]]): Optional override for VpcConfig set on the model. Default: use subnets and security groups from this Estimator. * 'Subnets' (list[str]): List of subnet ids. * 'SecurityGroupIds' (list[str]): List of security group ids.
...
: Additional kwargs passed to the KMeansModel constructor.
.prepare_for_training()
Set hyperparameters needed for training. This method will also validate “source_dir“.
KMeans$.prepare_for_training(records, mini_batch_size = 5000, job_name = NULL)
records
(RecordSet) – The records to train this Estimator on.
mini_batch_size
(int or None) – The size of each mini-batch to use when training. If None, a default value will be used.
job_name
(str): Name of the training job to be created. If not specified, one is generated, using the base name given to the constructor if applicable.
hyperparameters()
Return the SageMaker hyperparameters for training this KMeans Estimator.
KMeans$hyperparameters()
clone()
The objects of this class are cloneable with this method.
KMeans$clone(deep = FALSE)
deep
Whether to make a deep clone.
Calling :meth:'~sagemaker.model.Model.deploy' creates an Endpoint and returns a Predictor that performs k-means cluster assignment.
sagemaker.mlcore::ModelBase
-> sagemaker.mlcore::Model
-> KMeansModel
new()
Initialize KMeansModel class
KMeansModel$new(model_data, role, sagemaker_session = NULL, ...)
model_data
(str): The S3 location of a SageMaker model data “.tar.gz“ file.
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the estimator creates one using the default AWS configuration chain.
...
: Keyword arguments passed to the “FrameworkModel“ initializer.
clone()
The objects of this class are cloneable with this method.
KMeansModel$clone(deep = FALSE)
deep
Whether to make a deep clone.
The implementation of :meth:'~sagemaker.predictor.Predictor.predict' in this 'Predictor' requires a numpy “ndarray“ as input. The array should contain the same number of columns as the feature-dimension of the data used to fit the model this Predictor performs inference on. “predict()“ returns a list of :class:'~sagemaker.amazon.record_pb2.Record' objects, one for each row in the input “ndarray“. The nearest cluster is stored in the “closest_cluster“ key of the “Record.label“ field.
sagemaker.mlcore::PredictorBase
-> sagemaker.mlcore::Predictor
-> KMeansPredictor
new()
Initialize KMeansPredictor Class
KMeansPredictor$new(endpoint_name, sagemaker_session = NULL)
endpoint_name
(str): Name of the Amazon SageMaker endpoint to which requests are sent.
sagemaker_session
(sagemaker.session.Session): A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, one is created using the default AWS configuration chain.
clone()
The objects of this class are cloneable with this method.
KMeansPredictor$clone(deep = FALSE)
deep
Whether to make a deep clone.
For classification problems, the algorithm queries the k points that are closest to the sample point and returns the most frequently used label of their class as the predicted label. For regression problems, the algorithm queries the k closest points to the sample point and returns the average of their feature values as the predicted value.
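The two prediction modes described above can be sketched in a few lines of base R. This is only an illustration of the voting and averaging idea, not the SageMaker implementation (which works on a sampled, indexed data set). For regression the sketch averages the numeric target values of the neighbors, which is one reading of the description above. knn_predict and the sample data are made up for this example.

```r
# Hypothetical helper: predict for a single query point from its k nearest
# training points (squared Euclidean distance).
knn_predict <- function(train_x, train_y, query, k,
                        predictor_type = c("classifier", "regressor")) {
  predictor_type <- match.arg(predictor_type)
  # Subtract the query from every row, then sum squared differences per row.
  d2 <- rowSums(sweep(train_x, 2, query)^2)
  nearest <- order(d2)[seq_len(k)]
  if (predictor_type == "classifier") {
    # Most frequent label among the k neighbors.
    votes <- table(train_y[nearest])
    names(votes)[which.max(votes)]
  } else {
    # Average of the neighbors' numeric targets.
    mean(train_y[nearest])
  }
}

train_x <- rbind(c(0, 0), c(0, 1), c(1, 0), c(5, 5), c(6, 5))
labels  <- c("a", "a", "b", "c", "c")
targets <- c(1, 2, 3, 10, 12)

knn_predict(train_x, labels, query = c(0.1, 0.1), k = 3, predictor_type = "classifier")
knn_predict(train_x, targets, query = c(0.1, 0.1), k = 3, predictor_type = "regressor")
```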
sagemaker.mlcore::EstimatorBase
-> sagemaker.mlcore::AmazonAlgorithmEstimatorBase
-> KNN
repo_name
sagemaker repo name for framework
repo_version
version of framework
.module
mimic python module
k
Number of nearest neighbors.
sample_size
Number of data points to be sampled from the training data set
predictor_type
Type of inference to use on the data's labels
dimension_reduction_target
Target dimension to reduce to
dimension_reduction_type
Type of dimension reduction technique to use
index_metric
Distance metric to measure between points when finding nearest neighbors
index_type
Type of index to use. Valid values are "faiss.Flat", "faiss.IVFFlat", "faiss.IVFPQ".
faiss_index_ivf_nlists
Number of centroids to construct in the index
faiss_index_pq_m
Number of vector sub-components to construct in the index
sagemaker.mlcore::EstimatorBase$latest_job_debugger_artifacts_path()
sagemaker.mlcore::EstimatorBase$latest_job_profiler_artifacts_path()
sagemaker.mlcore::EstimatorBase$latest_job_tensorboard_artifacts_path()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$hyperparameters()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$prepare_workflow_for_training()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$training_image_uri()
new()
k-nearest neighbors (KNN) is an :class:'Estimator' used for classification and regression. This Estimator may be fit via calls to :meth:'~sagemaker.amazon.amazon_estimator.AmazonAlgorithmEstimatorBase.fit'. It requires Amazon :class:'~sagemaker.amazon.record_pb2.Record' protobuf serialized data to be stored in S3. There is a utility, :meth:'~sagemaker.amazon.amazon_estimator.AmazonAlgorithmEstimatorBase.record_set', that can be used to upload data to S3 and create a :class:'~sagemaker.amazon.amazon_estimator.RecordSet' to be passed to the 'fit' call. To learn more about the Amazon protobuf Record class and how to prepare bulk data in this format, please consult AWS technical documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/cdf-training.html. After this Estimator is fit, model data is stored in S3. The model may be deployed to an Amazon SageMaker Endpoint by invoking :meth:'~sagemaker.amazon.estimator.EstimatorBase.deploy'. As well as deploying an Endpoint, deploy returns a :class:'~sagemaker.amazon.knn.KNNPredictor' object that can be used for inference calls using the trained model hosted in the SageMaker Endpoint. KNN Estimators can be configured by setting hyperparameters. The available hyperparameters for KNN are documented below. For further information on the AWS KNN algorithm, please consult AWS technical documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/knn.html
KNN$new( role, instance_count, instance_type, k, sample_size, predictor_type, dimension_reduction_type = NULL, dimension_reduction_target = NULL, index_type = NULL, index_metric = NULL, faiss_index_ivf_nlists = NULL, faiss_index_pq_m = NULL, ... )
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
instance_count
(int): Number of Amazon EC2 instances to use for training.
instance_type
(str): Type of EC2 instance to use for training, for example, 'ml.c4.xlarge'.
k
(int): Required. Number of nearest neighbors.
sample_size
(int): Required. Number of data points to be sampled from the training data set.
predictor_type
(str): Required. Type of inference to use on the data's labels, allowed values are 'classifier' and 'regressor'.
dimension_reduction_type
(str): Optional. Type of dimension reduction technique to use. Valid values are "sign" and "fjlt".
dimension_reduction_target
(int): Optional. Target dimension to reduce to. Required when dimension_reduction_type is specified.
index_type
(str): Optional. Type of index to use. Valid values are "faiss.Flat", "faiss.IVFFlat", "faiss.IVFPQ".
index_metric
(str): Optional. Distance metric to measure between points when finding nearest neighbors. Valid values are "COSINE", "INNER_PRODUCT", "L2".
faiss_index_ivf_nlists
(str): Optional. Number of centroids to construct in the index if index_type is "faiss.IVFFlat" or "faiss.IVFPQ".
faiss_index_pq_m
(int): Optional. Number of vector sub-components to construct in the index, if index_type is "faiss.IVFPQ".
...
: base class keyword argument values.
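Several of the optional arguments above depend on one another, so it can help to check a combination locally before creating the estimator. The sketch below encodes only the rules stated in the argument descriptions. check_knn_args is a hypothetical helper, not a package function, and the package may perform its own validation differently.

```r
# Hypothetical helper: returns a character vector of problems (empty if none).
check_knn_args <- function(dimension_reduction_type = NULL,
                           dimension_reduction_target = NULL,
                           index_type = NULL) {
  problems <- character(0)
  if (!is.null(dimension_reduction_type)) {
    if (!dimension_reduction_type %in% c("sign", "fjlt")) {
      problems <- c(problems, "dimension_reduction_type must be 'sign' or 'fjlt'")
    }
    # The target dimension is required once a reduction type is given.
    if (is.null(dimension_reduction_target)) {
      problems <- c(problems, "dimension_reduction_target is required")
    }
  }
  valid_index <- c("faiss.Flat", "faiss.IVFFlat", "faiss.IVFPQ")
  if (!is.null(index_type) && !index_type %in% valid_index) {
    problems <- c(problems, "index_type is not a valid value")
  }
  problems
}

# A reduction type without a target is reported as a problem.
check_knn_args(dimension_reduction_type = "sign")
```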
create_model()
Return a :class:'~sagemaker.amazon.KNNModel' referencing the latest s3 model data produced by this Estimator.
KNN$create_model(vpc_config_override = "VPC_CONFIG_DEFAULT", ...)
vpc_config_override
(dict[str, list[str]]): Optional override for VpcConfig set on the model. Default: use subnets and security groups from this Estimator. * 'Subnets' (list[str]): List of subnet ids. * 'SecurityGroupIds' (list[str]): List of security group ids.
...
: Additional kwargs passed to the KNNModel constructor.
.prepare_for_training()
Set hyperparameters needed for training. This method will also validate “source_dir“.
KNN$.prepare_for_training(records, mini_batch_size = NULL, job_name = NULL)
records
(RecordSet) – The records to train this Estimator on.
mini_batch_size
(int or None) – The size of each mini-batch to use when training. If None, a default value will be used.
job_name
(str): Name of the training job to be created. If not specified, one is generated, using the base name given to the constructor if applicable.
clone()
The objects of this class are cloneable with this method.
KNN$clone(deep = FALSE)
deep
Whether to make a deep clone.
Calling :meth:'~sagemaker.model.Model.deploy' creates an Endpoint and returns :class:'KNNPredictor'.
sagemaker.mlcore::ModelBase
-> sagemaker.mlcore::Model
-> KNNModel
new()
Initialize KNNModel Class
KNNModel$new(model_data, role, sagemaker_session = NULL, ...)
model_data
(str): The S3 location of a SageMaker model data “.tar.gz“ file.
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the estimator creates one using the default AWS configuration chain.
...
: Keyword arguments passed to the “FrameworkModel“ initializer.
clone()
The objects of this class are cloneable with this method.
KNNModel$clone(deep = FALSE)
deep
Whether to make a deep clone.
The implementation of :meth:'~sagemaker.predictor.Predictor.predict' in this 'Predictor' requires a numpy “ndarray“ as input. The array should contain the same number of columns as the feature-dimension of the data used to fit the model this Predictor performs inference on. :func:'predict' returns a list of :class:'~sagemaker.amazon.record_pb2.Record' objects, one for each row in the input “ndarray“. The prediction is stored in the “"predicted_label"“ key of the “Record.label“ field.
sagemaker.mlcore::PredictorBase
-> sagemaker.mlcore::Predictor
-> KNNPredictor
new()
Initialize KNNPredictor class
KNNPredictor$new(endpoint_name, sagemaker_session = NULL)
endpoint_name
(str): Name of the Amazon SageMaker endpoint to which requests are sent.
sagemaker_session
(sagemaker.session.Session): A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, one is created using the default AWS configuration chain.
clone()
The objects of this class are cloneable with this method.
KNNPredictor$clone(deep = FALSE)
deep
Whether to make a deep clone.
LDA is most commonly used to discover a user-specified number of topics shared by documents within a text corpus. Here each observation is a document, the features are the presence (or occurrence count) of each word, and the categories are the topics.
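The observation layout described above (one row per document, one column per word, cells holding occurrence counts) can be built with base R as follows. This is a minimal sketch with a made-up three-document corpus and naive whitespace tokenization; it says nothing about how the data must then be serialized for training.

```r
# Made-up corpus: each element is one document.
docs <- c("the cat sat", "the dog sat", "cat and dog")

# Naive tokenization on single spaces.
tokens <- strsplit(docs, " ", fixed = TRUE)
vocab <- sort(unique(unlist(tokens)))

# One row per document, one column per vocabulary word.
counts <- t(vapply(
  tokens,
  function(tok) as.integer(table(factor(tok, levels = vocab))),
  integer(length(vocab))
))
colnames(counts) <- vocab

counts
```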
sagemaker.mlcore::EstimatorBase
-> sagemaker.mlcore::AmazonAlgorithmEstimatorBase
-> LDA
repo_name
sagemaker repo name for framework
repo_version
version of framework
.module
mimic python module
num_topics
The number of topics for LDA to find within the data
alpha0
Initial guess for the concentration parameter
max_restarts
The number of restarts to perform during the Alternating Least Squares (ALS) spectral decomposition phase of the algorithm.
max_iterations
The maximum number of iterations to perform during the ALS phase of the algorithm.
tol
Target error tolerance for the ALS phase of the algorithm.
sagemaker.mlcore::EstimatorBase$latest_job_debugger_artifacts_path()
sagemaker.mlcore::EstimatorBase$latest_job_profiler_artifacts_path()
sagemaker.mlcore::EstimatorBase$latest_job_tensorboard_artifacts_path()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$hyperparameters()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$prepare_workflow_for_training()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$training_image_uri()
new()
Latent Dirichlet Allocation (LDA) is an :class:'Estimator' used for unsupervised learning. Amazon SageMaker Latent Dirichlet Allocation is an unsupervised learning algorithm that attempts to describe a set of observations as a mixture of distinct categories. LDA is most commonly used to discover a user-specified number of topics shared by documents within a text corpus. Here each observation is a document, the features are the presence (or occurrence count) of each word, and the categories are the topics. This Estimator may be fit via calls to :meth:'~sagemaker.amazon.amazon_estimator.AmazonAlgorithmEstimatorBase.fit'. It requires Amazon :class:'~sagemaker.amazon.record_pb2.Record' protobuf serialized data to be stored in S3. There is a utility, :meth:'~sagemaker.amazon.amazon_estimator.AmazonAlgorithmEstimatorBase.record_set', that can be used to upload data to S3 and create a :class:'~sagemaker.amazon.amazon_estimator.RecordSet' to be passed to the 'fit' call. To learn more about the Amazon protobuf Record class and how to prepare bulk data in this format, please consult AWS technical documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/cdf-training.html. After this Estimator is fit, model data is stored in S3. The model may be deployed to an Amazon SageMaker Endpoint by invoking :meth:'~sagemaker.amazon.estimator.EstimatorBase.deploy'. As well as deploying an Endpoint, deploy returns a :class:'~sagemaker.amazon.lda.LDAPredictor' object that can be used for inference calls using the trained model hosted in the SageMaker Endpoint. LDA Estimators can be configured by setting hyperparameters. The available hyperparameters for LDA are documented below. For further information on the AWS LDA algorithm, please consult AWS technical documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/lda.html
LDA$new( role, instance_type, num_topics, alpha0 = NULL, max_restarts = NULL, max_iterations = NULL, tol = NULL, ... )
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
instance_type
(str): Type of EC2 instance to use for training, for example, 'ml.c4.xlarge'.
num_topics
(int): The number of topics for LDA to find within the data.
alpha0
(float): Optional. Initial guess for the concentration parameter.
max_restarts
(int): Optional. The number of restarts to perform during the Alternating Least Squares (ALS) spectral decomposition phase of the algorithm.
max_iterations
(int): Optional. The maximum number of iterations to perform during the ALS phase of the algorithm.
tol
(float): Optional. Target error tolerance for the ALS phase of the algorithm.
...
: base class keyword argument values.
create_model()
Return a :class:'~sagemaker.amazon.LDAModel' referencing the latest s3 model data produced by this Estimator.
LDA$create_model(vpc_config_override = "VPC_CONFIG_DEFAULT", ...)
vpc_config_override
(dict[str, list[str]]): Optional override for VpcConfig set on the model. Default: use subnets and security groups from this Estimator. * 'Subnets' (list[str]): List of subnet ids. * 'SecurityGroupIds' (list[str]): List of security group ids.
...
: Additional kwargs passed to the LDAModel constructor.
.prepare_for_training()
Set hyperparameters needed for training. This method will also validate “source_dir“.
LDA$.prepare_for_training(records, mini_batch_size = NULL, job_name = NULL)
records
(RecordSet) – The records to train this Estimator on.
mini_batch_size
(int or None) – The size of each mini-batch to use when training. If None, a default value will be used.
job_name
(str): Name of the training job to be created. If not specified, one is generated, using the base name given to the constructor if applicable.
clone()
The objects of this class are cloneable with this method.
LDA$clone(deep = FALSE)
deep
Whether to make a deep clone.
Calling :meth:'~sagemaker.model.Model.deploy' creates an Endpoint and returns a Predictor that transforms vectors to a lower-dimensional representation.
sagemaker.mlcore::ModelBase
-> sagemaker.mlcore::Model
-> LDAModel
new()
Initialize LDAModel class
LDAModel$new(model_data, role, sagemaker_session = NULL, ...)
model_data
(str): The S3 location of a SageMaker model data “.tar.gz“ file.
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the estimator creates one using the default AWS configuration chain.
...
: Keyword arguments passed to the “FrameworkModel“ initializer.
clone()
The objects of this class are cloneable with this method.
LDAModel$clone(deep = FALSE)
deep
Whether to make a deep clone.
The implementation of :meth:'~sagemaker.predictor.Predictor.predict' in this 'Predictor' requires a numpy “ndarray“ as input. The array should contain the same number of columns as the feature-dimension of the data used to fit the model this Predictor performs inference on. :meth:'predict()' returns a list of :class:'~sagemaker.amazon.record_pb2.Record' objects, one for each row in the input “ndarray“. The lower dimension vector result is stored in the “projection“ key of the “Record.label“ field.
sagemaker.mlcore::PredictorBase
-> sagemaker.mlcore::Predictor
-> LDAPredictor
new()
Initialize LDAPredictor class
LDAPredictor$new(endpoint_name, sagemaker_session = NULL)
endpoint_name
(str): Name of the Amazon SageMaker endpoint to which requests are sent.
sagemaker_session
(sagemaker.session.Session): A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, one is created using the default AWS configuration chain.
clone()
The objects of this class are cloneable with this method.
LDAPredictor$clone(deep = FALSE)
deep
Whether to make a deep clone.
For input, you give the model labeled examples (x, y). x is a high-dimensional vector and y is a numeric label. For binary classification problems, the label must be either 0 or 1. For multiclass classification problems, the labels must be from 0 to num_classes - 1. For regression problems, y is a real number. The algorithm learns a linear function, or, for classification problems, a linear threshold function, and maps a vector x to an approximation of the label y.
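The difference between a linear function and a linear threshold function can be shown with a small base R sketch. The weights, bias, and threshold below are arbitrary illustrative values, and linear_score and predict_binary are hypothetical helpers; nothing here reflects how SageMaker trains or stores the model.

```r
# Hypothetical helper: linear function w.x + b for every row of x.
linear_score <- function(x, w, b) {
  as.numeric(x %*% w + b)
}

# Hypothetical helper: linear threshold function for binary labels 0/1.
predict_binary <- function(x, w, b, threshold = 0) {
  as.integer(linear_score(x, w, b) > threshold)
}

x <- rbind(c(1, 2), c(-1, -2))  # two examples, two features each
w <- c(0.5, 0.25)               # illustrative weights
b <- 0.1                        # illustrative bias

linear_score(x, w, b)    # regression-style output
predict_binary(x, w, b)  # classification-style output
```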
sagemaker.mlcore::EstimatorBase
-> sagemaker.mlcore::AmazonAlgorithmEstimatorBase
-> LinearLearner
repo_name
sagemaker repo name for framework
repo_version
version of framework
DEFAULT_MINI_BATCH_SIZE
The size of each mini-batch to use when training.
.module
mimic python module
predictor_type
The type of predictor to learn. Either "binary_classifier" or "multiclass_classifier" or "regressor".
binary_classifier_model_selection_criteria
One of 'accuracy', 'f1', 'f_beta', 'precision_at_target_recall', 'recall_at_target_precision', 'cross_entropy_loss', 'loss_function'
target_recall
Only applicable if binary_classifier_model_selection_criteria is precision_at_target_recall
target_precision
Only applicable if binary_classifier_model_selection_criteria is recall_at_target_precision.
positive_example_weight_mult
The importance weight of positive examples is multiplied by this constant.
epochs
The maximum number of passes to make over the training data.
use_bias
Whether to include a bias field
num_models
Number of models to train in parallel
num_calibration_samples
Number of observations to use from validation dataset for doing model calibration
init_method
Function to use to set the initial model weights.
init_scale
For "uniform" init, the range of values.
init_sigma
For "normal" init, the standard-deviation.
init_bias
Initial weight for bias term
optimizer
One of 'sgd', 'adam', 'rmsprop' or 'auto'
loss
One of 'logistic', 'squared_loss', 'absolute_loss', 'hinge_loss', 'eps_insensitive_squared_loss', 'eps_insensitive_absolute_loss', 'quantile_loss', 'huber_loss' or 'softmax_loss' or 'auto'.
wd
L2 regularization parameter
l1
L1 regularization parameter.
momentum
Momentum parameter of sgd optimizer.
learning_rate
The SGD learning rate
beta_1
Exponential decay rate for first moment estimates.
beta_2
Exponential decay rate for second moment estimates.
bias_lr_mult
Allows different learning rate for the bias term.
bias_wd_mult
Allows different regularization for the bias term.
use_lr_scheduler
If true, we use a scheduler for the learning rate.
lr_scheduler_step
The number of steps between decreases of the learning rate
lr_scheduler_factor
Every lr_scheduler_step the learning rate will decrease by this quantity.
lr_scheduler_minimum_lr
The learning rate will never decrease to a value lower than this.
normalize_data
Normalizes the features before training to have standard deviation of 1.0.
normalize_label
Normalizes the regression label to have a standard deviation of 1.0.
unbias_data
If true, features are modified to have mean 0.0.
unbias_label
If true, labels are modified to have mean 0.0.
num_point_for_scaler
The number of data points to use for calculating the normalizing and unbiasing terms.
margin
The margin for hinge_loss.
quantile
Quantile for quantile loss.
loss_insensitivity
Parameter for epsilon insensitive loss type.
huber_delta
Parameter for Huber loss.
early_stopping_patience
The number of epochs to wait before ending training if no improvement is made.
early_stopping_tolerance
Relative tolerance to measure an improvement in loss.
num_classes
The number of classes for the response variable.
accuracy_top_k
The value of k when computing the Top K accuracy metric for multiclass classification.
f_beta
The value of beta to use when calculating F score metrics for binary or multiclass classification.
balance_multiclass_weights
Whether to use class weights which give each class equal importance in the loss function.
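The learning-rate scheduler fields above (use_lr_scheduler, lr_scheduler_step, lr_scheduler_factor, lr_scheduler_minimum_lr) interact with learning_rate. The base R sketch below shows one plausible reading: the rate is multiplied by lr_scheduler_factor once every lr_scheduler_step steps and is never allowed below lr_scheduler_minimum_lr. The multiplicative form is an assumption, since the field descriptions do not spell out the exact rule, and scheduled_lr is a hypothetical helper.

```r
# Hypothetical helper: learning rate in effect at a given step index,
# ASSUMING a multiplicative decay every lr_scheduler_step steps.
scheduled_lr <- function(step_index, learning_rate, lr_scheduler_step,
                         lr_scheduler_factor, lr_scheduler_minimum_lr) {
  n_decays <- step_index %/% lr_scheduler_step
  decayed <- learning_rate * lr_scheduler_factor^n_decays
  # Never go below the configured minimum.
  pmax(lr_scheduler_minimum_lr, decayed)
}

scheduled_lr(
  step_index = c(0, 100, 200, 300),
  learning_rate = 0.1,
  lr_scheduler_step = 100,
  lr_scheduler_factor = 0.5,
  lr_scheduler_minimum_lr = 0.02
)
```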
sagemaker.mlcore::EstimatorBase$latest_job_debugger_artifacts_path()
sagemaker.mlcore::EstimatorBase$latest_job_profiler_artifacts_path()
sagemaker.mlcore::EstimatorBase$latest_job_tensorboard_artifacts_path()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$hyperparameters()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$prepare_workflow_for_training()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$training_image_uri()
new()
An :class:'Estimator' for binary classification and regression. Amazon SageMaker Linear Learner handles both classification and regression problems. It explores different training objectives simultaneously and chooses the best solution from a validation set. This lets the user explore a large number of models and choose the one that best optimizes either a continuous objective, such as mean square error, cross entropy loss or absolute error, or a discrete objective suited for classification, such as F1 measure, precision@recall or accuracy. The implementation provides a significant speedup over naive hyperparameter optimization techniques, and is more convenient than solutions that support only continuous objectives.
This Estimator may be fit via calls to :meth:'~sagemaker.amazon.amazon_estimator.AmazonAlgorithmEstimatorBase.fit_ndarray' or :meth:'~sagemaker.amazon.amazon_estimator.AmazonAlgorithmEstimatorBase.fit'. The former allows a LinearLearner model to be fit on a 2-dimensional numpy array. The latter requires Amazon :class:'~sagemaker.amazon.record_pb2.Record' protobuf serialized data to be stored in S3. To learn more about the Amazon protobuf Record class and how to prepare bulk data in this format, please consult AWS technical documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/cdf-training.html
After this Estimator is fit, model data is stored in S3. The model may be deployed to an Amazon SageMaker Endpoint by invoking :meth:'~sagemaker.amazon.estimator.EstimatorBase.deploy'. As well as deploying an Endpoint, “deploy“ returns a :class:'~sagemaker.amazon.linear_learner.LinearLearnerPredictor' object that can be used to make class or regression predictions, using the trained model. LinearLearner Estimators can be configured by setting hyperparameters. The available hyperparameters for LinearLearner are documented below.
For further information on the AWS LinearLearner algorithm, please consult AWS technical documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/linear-learner.html
LinearLearner$new( role, instance_count, instance_type, predictor_type, binary_classifier_model_selection_criteria = NULL, target_recall = NULL, target_precision = NULL, positive_example_weight_mult = NULL, epochs = NULL, use_bias = NULL, num_models = NULL, num_calibration_samples = NULL, init_method = NULL, init_scale = NULL, init_sigma = NULL, init_bias = NULL, optimizer = NULL, loss = NULL, wd = NULL, l1 = NULL, momentum = NULL, learning_rate = NULL, beta_1 = NULL, beta_2 = NULL, bias_lr_mult = NULL, bias_wd_mult = NULL, use_lr_scheduler = NULL, lr_scheduler_step = NULL, lr_scheduler_factor = NULL, lr_scheduler_minimum_lr = NULL, normalize_data = NULL, normalize_label = NULL, unbias_data = NULL, unbias_label = NULL, num_point_for_scaler = NULL, margin = NULL, quantile = NULL, loss_insensitivity = NULL, huber_delta = NULL, early_stopping_patience = NULL, early_stopping_tolerance = NULL, num_classes = NULL, accuracy_top_k = NULL, f_beta = NULL, balance_multiclass_weights = NULL, ... )
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
instance_count
(int): Number of Amazon EC2 instances to use for training.
instance_type
(str): Type of EC2 instance to use for training, for example, 'ml.c4.xlarge'.
predictor_type
(str): The type of predictor to learn. One of "binary_classifier", "multiclass_classifier" or "regressor".
binary_classifier_model_selection_criteria
(str): One of 'accuracy', 'f1', 'f_beta', 'precision_at_target_recall', 'recall_at_target_precision', 'cross_entropy_loss', 'loss_function'
target_recall
(float): Target recall. Only applicable if binary_classifier_model_selection_criteria is precision_at_target_recall.
target_precision
(float): Target precision. Only applicable if binary_classifier_model_selection_criteria is recall_at_target_precision.
positive_example_weight_mult
(float): The importance weight of positive examples is multiplied by this constant. Useful for skewed datasets. Only applies for classification tasks.
epochs
(int): The maximum number of passes to make over the training data.
use_bias
(bool): Whether to include a bias field
num_models
(int): Number of models to train in parallel. If not set, the number of parallel models to train will be decided by the algorithm itself. One model will be trained according to the given training parameters (regularization, optimizer, loss) and the rest with nearby parameter values.
num_calibration_samples
(int): Number of observations to use from validation dataset for doing model calibration (finding the best threshold).
init_method
(str): Function to use to set the initial model weights. One of "uniform" or "normal"
init_scale
(float): For "uniform" init, the range of values.
init_sigma
(float): For "normal" init, the standard-deviation.
init_bias
(float): Initial weight for bias term
optimizer
(str): One of 'sgd', 'adam', 'rmsprop' or 'auto'
loss
(str): One of 'logistic', 'squared_loss', 'absolute_loss', 'hinge_loss', 'eps_insensitive_squared_loss', 'eps_insensitive_absolute_loss', 'quantile_loss', 'huber_loss', 'softmax_loss' or 'auto'.
wd
(float): L2 regularization parameter i.e. the weight decay parameter. Use 0 for no L2 regularization.
l1
(float): L1 regularization parameter. Use 0 for no L1 regularization.
momentum
(float): Momentum parameter of sgd optimizer.
learning_rate
(float): The SGD learning rate
beta_1
(float): Exponential decay rate for first moment estimates. Only applies for adam optimizer.
beta_2
(float): Exponential decay rate for second moment estimates. Only applies for adam optimizer.
bias_lr_mult
(float): Allows different learning rate for the bias term. The actual learning rate for the bias is learning rate times bias_lr_mult.
bias_wd_mult
(float): Allows different regularization for the bias term. The actual L2 regularization weight for the bias is wd times bias_wd_mult. By default there is no regularization on the bias term.
use_lr_scheduler
(bool): If true, we use a scheduler for the learning rate.
lr_scheduler_step
(int): The number of steps between decreases of the learning rate. Only applies to learning rate scheduler.
lr_scheduler_factor
(float): Every lr_scheduler_step the learning rate will decrease by this quantity. Only applies for learning rate scheduler.
lr_scheduler_minimum_lr
(float): The learning rate will never decrease to a value lower than this. Only applies for learning rate scheduler.
normalize_data
(bool): Normalizes the features before training to have standard deviation of 1.0.
normalize_label
(bool): Normalizes the regression label to have a standard deviation of 1.0. If set for classification, it will be ignored.
unbias_data
(bool): If true, features are modified to have mean 0.0.
unbias_label
(bool): If true, labels are modified to have mean 0.0.
num_point_for_scaler
(int): The number of data points to use for calculating the normalizing and unbiasing terms.
margin
(float): The margin for hinge_loss.
quantile
(float): Quantile for quantile loss. For quantile q, the model will attempt to produce predictions such that true_label < prediction with probability q.
loss_insensitivity
(float): Parameter for epsilon insensitive loss type. During training and metric evaluation, any error smaller than this is considered to be zero.
huber_delta
(float): Parameter for Huber loss. During training and metric evaluation, compute L2 loss for errors smaller than delta and L1 loss for errors larger than delta.
early_stopping_patience
(int): The number of epochs to wait before ending training if no improvement is made. The improvement is training loss if validation data is not provided, or else it is the validation loss or the binary classification model selection criteria like accuracy, f1-score etc. To disable early stopping, set early_stopping_patience to a value larger than epochs.
early_stopping_tolerance
(float): Relative tolerance to measure an improvement in loss. If the ratio of the improvement in loss divided by the previous best loss is smaller than this value, early stopping will consider the improvement to be zero.
num_classes
(int): The number of classes for the response variable. Required when predictor_type is multiclass_classifier and ignored otherwise. The classes are assumed to be labeled 0, ..., num_classes - 1.
accuracy_top_k
(int): The value of k when computing the Top K Accuracy metric for multiclass classification. An example is scored as correct if the model assigns one of the top k scores to the true label.
f_beta
(float): The value of beta to use when calculating F score metrics for binary or multiclass classification. Also used if binary_classifier_model_selection_criteria is f_beta.
balance_multiclass_weights
(bool): Whether to use class weights which give each class equal importance in the loss function. Only used when predictor_type is multiclass_classifier.
...
: base class keyword argument values.
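The loss-specific parameters `huber_delta`, `loss_insensitivity` and `quantile` are easier to read next to the formulas they control. The sketch below uses the standard textbook definitions of Huber, epsilon-insensitive absolute, and quantile (pinball) loss. It is an assumption that Linear Learner uses exactly these forms and scalings; the function names are hypothetical and not part of this package.

```python
def huber_loss(error, huber_delta):
    """Quadratic (L2) for |error| <= delta, linear (L1) beyond it."""
    abs_error = abs(error)
    if abs_error <= huber_delta:
        return 0.5 * abs_error ** 2
    # Linear branch, shifted so both branches meet at |error| == delta.
    return huber_delta * (abs_error - 0.5 * huber_delta)


def eps_insensitive_absolute_loss(error, loss_insensitivity):
    """Errors smaller than loss_insensitivity are treated as zero."""
    return max(0.0, abs(error) - loss_insensitivity)


def quantile_loss(true_label, prediction, quantile):
    """Pinball loss: under-prediction costs `quantile` per unit,
    over-prediction costs `1 - quantile` per unit."""
    diff = true_label - prediction
    return quantile * diff if diff >= 0 else (quantile - 1.0) * diff


# Illustrative values only.
print(huber_loss(0.5, huber_delta=1.0), huber_loss(3.0, huber_delta=1.0))
print(eps_insensitive_absolute_loss(0.05, loss_insensitivity=0.1))
print(quantile_loss(10.0, 8.0, quantile=0.9),
      quantile_loss(10.0, 12.0, quantile=0.9))
```

With `quantile = 0.9`, predicting too low is penalized nine times more heavily than predicting too high, which pushes predictions above the true label about 90% of the time, in line with the description of `quantile` above.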
create_model()
Return a :class:'~sagemaker.amazon.LinearLearnerModel' referencing the latest S3 model data produced by this Estimator.
LinearLearner$create_model(vpc_config_override = "VPC_CONFIG_DEFAULT", ...)
vpc_config_override
(dict[str, list[str]]): Optional override for VpcConfig set on the model. Default: use subnets and security groups from this Estimator. * 'Subnets' (list[str]): List of subnet ids. * 'SecurityGroupIds' (list[str]): List of security group ids.
...
: Additional kwargs passed to the LinearLearnerModel constructor.
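The shape of a `vpc_config_override` value, as described above, can be sketched like this. The subnet and security group IDs are made up, the helper `is_valid_vpc_config` is hypothetical, and it is an assumption that both keys must be present.

```python
def is_valid_vpc_config(vpc_config):
    """Check that a value has the documented dict[str, list[str]] shape.

    Assumption: both 'Subnets' and 'SecurityGroupIds' are required.
    """
    expected_keys = {"Subnets", "SecurityGroupIds"}
    if not isinstance(vpc_config, dict) or set(vpc_config) != expected_keys:
        return False
    # Every value must be a list of string IDs.
    return all(
        isinstance(ids, list) and all(isinstance(i, str) for i in ids)
        for ids in vpc_config.values()
    )


# Hypothetical IDs, for illustration only.
vpc_config_override = {
    "Subnets": ["subnet-0123456789abcdef0"],
    "SecurityGroupIds": ["sg-0123456789abcdef0"],
}
print(is_valid_vpc_config(vpc_config_override))
```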
.prepare_for_training()
Set hyperparameters needed for training. This method will also validate “source_dir“.
LinearLearner$.prepare_for_training( records, mini_batch_size = NULL, job_name = NULL )
records
(RecordSet): The records to train this Estimator on.
mini_batch_size
(int or None): The size of each mini-batch to use when training. If None, a default value will be used.
job_name
(str): Name of the training job to be created. If not specified, one is generated, using the base name given to the constructor if applicable.
clone()
The objects of this class are cloneable with this method.
LinearLearner$clone(deep = FALSE)
deep
Whether to make a deep clone.
Calling :meth:'~sagemaker.model.Model.deploy' creates an Endpoint and returns a :class:'LinearLearnerPredictor'
sagemaker.mlcore::ModelBase
-> sagemaker.mlcore::Model
-> LinearLearnerModel
new()
Initialize LinearLearnerModel class
LinearLearnerModel$new(model_data, role, sagemaker_session = NULL, ...)
model_data
(str): The S3 location of a SageMaker model data “.tar.gz“ file.
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the estimator creates one using the default AWS configuration chain.
...
: Keyword arguments passed to the “Model“ initializer.
clone()
The objects of this class are cloneable with this method.
LinearLearnerModel$clone(deep = FALSE)
deep
Whether to make a deep clone.
The implementation of :meth:'~sagemaker.predictor.Predictor.predict' in this 'Predictor' requires a numpy “ndarray“ as input. The array should contain the same number of columns as the feature-dimension of the data used to fit the model this Predictor performs inference on. :func:'predict' returns a list of :class:'~sagemaker.amazon.record_pb2.Record' objects, one for each row in the input “ndarray“. The prediction is stored in the “"predicted_label"“ key of the “Record.label“ field.
sagemaker.mlcore::PredictorBase
-> sagemaker.mlcore::Predictor
-> LinearLearnerPredictor
new()
Initialize LinearLearnerPredictor Class
LinearLearnerPredictor$new(endpoint_name, sagemaker_session = NULL)
endpoint_name
(str): Name of the Amazon SageMaker endpoint to which requests are sent.
sagemaker_session
(sagemaker.session.Session): A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, one is created using the default AWS configuration chain.
clone()
The objects of this class are cloneable with this method.
LinearLearnerPredictor$clone(deep = FALSE)
deep
Whether to make a deep clone.
Handle end-to-end training and deployment of custom MXNet code.
sagemaker.mlcore::EstimatorBase
-> sagemaker.mlcore::Framework
-> MXNet
.LOWEST_SCRIPT_MODE_VERSION
Lowest MXNet version that can be executed
.module
mimic python module
new()
This “Estimator“ executes an MXNet script in a managed MXNet execution environment, within a SageMaker Training Job. The managed MXNet environment is an Amazon-built Docker container that executes functions defined in the supplied “entry_point“ Python script. Training is started by calling :meth:'~sagemaker.amazon.estimator.Framework.fit' on this Estimator. After training is complete, calling :meth:'~sagemaker.amazon.estimator.Framework.deploy' creates a hosted SageMaker endpoint and returns an :class:'~sagemaker.amazon.mxnet.model.MXNetPredictor' instance that can be used to perform inference against the hosted model. Technical documentation on preparing MXNet scripts for SageMaker training and using the MXNet Estimator is available on the project home-page: https://github.com/aws/sagemaker-python-sdk
MXNet$new( entry_point, framework_version = NULL, py_version = NULL, source_dir = NULL, hyperparameters = NULL, image_uri = NULL, distribution = NULL, ... )
entry_point
(str): Path (absolute or relative) to the Python source file which should be executed as the entry point to training. If “source_dir“ is specified, then “entry_point“ must point to a file located at the root of “source_dir“.
framework_version
(str): MXNet version you want to use for executing your model training code. Defaults to “None“. Required unless “image_uri“ is provided. For the list of supported versions, see https://github.com/aws/sagemaker-python-sdk#mxnet-sagemaker-estimators.
py_version
(str): Python version you want to use for executing your model training code. One of 'py2' or 'py3'. Defaults to “None“. Required unless “image_uri“ is provided.
source_dir
(str): Path (absolute, relative or an S3 URI) to a directory with any other training source code dependencies aside from the entry point file (default: None). If “source_dir“ is an S3 URI, it must point to a tar.gz file. The structure within this directory is preserved when training on Amazon SageMaker.
hyperparameters
(dict): Hyperparameters that will be used for training (default: None). The hyperparameters are made accessible as a dict[str, str] to the training code on SageMaker. For convenience, this accepts other types for keys and values, but “str()“ will be called to convert them before training.
image_uri
(str): If specified, the estimator will use this image for training and hosting, instead of selecting the appropriate SageMaker official image based on framework_version and py_version. It can be an ECR url or dockerhub image and tag. Examples: “123412341234.dkr.ecr.us-west-2.amazonaws.com/my-custom-image:1.0“ or “custom-image:latest“. If “framework_version“ or “py_version“ is “None“, then “image_uri“ is required. If “image_uri“ is also “None“, a “ValueError“ is raised.
distribution
(dict): A dictionary with information on how to run distributed training (default: None). Currently we support distributed training with parameter server and MPI [Horovod].
...
: Additional kwargs passed to the :class:'~sagemaker.estimator.Framework' constructor.
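Two rules stated in the arguments above lend themselves to a short sketch: `image_uri` becomes mandatory when `framework_version` or `py_version` is missing, and hyperparameter keys and values are converted with “str()“ before training. The helper names below are hypothetical and only mirror the documented behaviour; they are not the package's internals.

```python
def resolve_image_config(framework_version=None, py_version=None, image_uri=None):
    """Apply the documented rule: without both versions, image_uri is required."""
    if image_uri is None and (framework_version is None or py_version is None):
        raise ValueError(
            "framework_version and py_version are required "
            "unless image_uri is provided"
        )
    return {
        "framework_version": framework_version,
        "py_version": py_version,
        "image_uri": image_uri,
    }


def stringify_hyperparameters(hyperparameters=None):
    """Convert every key and value with str(), as the docs describe."""
    return {str(k): str(v) for k, v in (hyperparameters or {}).items()}


# Illustrative values only.
print(resolve_image_config(image_uri="custom-image:latest"))
print(stringify_hyperparameters({"epochs": 10, "learning_rate": 0.01}))
```

Because the training code receives only strings, it has to parse numeric hyperparameters back into numbers itself.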
create_model()
Create a SageMaker “MXNetModel“ object that can be deployed to an “Endpoint“.
MXNet$create_model( model_server_workers = NULL, role = NULL, vpc_config_override = "VPC_CONFIG_DEFAULT", entry_point = NULL, source_dir = NULL, dependencies = NULL, image_uri = NULL, ... )
model_server_workers
(int): Optional. The number of worker processes used by the inference server. If None, server will use one worker per vCPU.
role
(str): The “ExecutionRoleArn“ IAM Role ARN for the “Model“, which is also used during transform jobs. If not specified, the role from the Estimator will be used.
vpc_config_override
(dict[str, list[str]]): Optional override for VpcConfig set on the model. Default: use subnets and security groups from this Estimator. * 'Subnets' (list[str]): List of subnet ids. * 'SecurityGroupIds' (list[str]): List of security group ids.
entry_point
(str): Path (absolute or relative) to the local Python source file which should be executed as the entry point to training. If “source_dir“ is specified, then “entry_point“ must point to a file located at the root of “source_dir“. If not specified, the training entry point is used.
source_dir
(str): Path (absolute or relative) to a directory with any other serving source code dependencies aside from the entry point file. If not specified, the model source directory from training is used.
dependencies
(list[str]): A list of paths to directories (absolute or relative) with any additional libraries that will be exported to the container. If not specified, the dependencies from training are used. This is not supported with "local code" in Local Mode.
image_uri
(str): If specified, the estimator will use this image for hosting, instead of selecting the appropriate SageMaker official image based on framework_version and py_version. It can be an ECR url or dockerhub image and tag. Examples: * “123412341234.dkr.ecr.us-west-2.amazonaws.com/my-custom-image:1.0“ * “custom-image:latest“
...
: Additional kwargs passed to the :class:'~sagemaker.mxnet.model.MXNetModel' constructor.
sagemaker.mxnet.model.MXNetModel: A SageMaker “MXNetModel“ object. See :func:'~sagemaker.mxnet.model.MXNetModel' for full details.
clone()
The objects of this class are cloneable with this method.
MXNet$clone(deep = FALSE)
deep
Whether to make a deep clone.
An MXNet SageMaker “Model“ that can be deployed to a SageMaker “Endpoint“.
sagemaker.mlcore::ModelBase
-> sagemaker.mlcore::Model
-> sagemaker.mlcore::FrameworkModel
-> MXNetModel
.LOWEST_MMS_VERSION
Lowest Multi Model Server MXNet version that can be executed
new()
Initialize an MXNetModel.
MXNetModel$new( model_data, role, entry_point, framework_version = NULL, py_version = NULL, image_uri = NULL, predictor_cls = MXNetPredictor, model_server_workers = NULL, ... )
model_data
(str): The S3 location of a SageMaker model data “.tar.gz“ file.
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
entry_point
(str): Path (absolute or relative) to the Python source file which should be executed as the entry point to model hosting. If “source_dir“ is specified, then “entry_point“ must point to a file located at the root of “source_dir“.
framework_version
(str): MXNet version you want to use for executing your model training code. Defaults to “None“. Required unless “image_uri“ is provided.
py_version
(str): Python version you want to use for executing your model training code. Defaults to “None“. Required unless “image_uri“ is provided.
image_uri
(str): A Docker image URI (default: None). If not specified, a default image for MXNet will be used. If “framework_version“ or “py_version“ is “None“, then “image_uri“ is required. If “image_uri“ is also “None“, a “ValueError“ is raised.
predictor_cls
(callable[str, sagemaker.session.Session]): A function to call to create a predictor with an endpoint name and SageMaker “Session“. If specified, “deploy()“ returns the result of invoking this function on the created endpoint name.
model_server_workers
(int): Optional. The number of worker processes used by the inference server. If None, server will use one worker per vCPU.
...
: Keyword arguments passed to the superclass :class:'~sagemaker.model.FrameworkModel' and, subsequently, its superclass :class:'~sagemaker.model.Model'.
prepare_container_def()
Return a container definition with framework configuration set in model environment variables.
MXNetModel$prepare_container_def(instance_type = NULL, accelerator_type = NULL)
instance_type
(str): The EC2 instance type to deploy this Model to. For example, 'ml.p2.xlarge'.
accelerator_type
(str): The Elastic Inference accelerator type to deploy to the instance for loading and making inferences to the model. For example, 'ml.eia1.medium'.
dict[str, str]: A container definition object usable with the CreateModel API.
serving_image_uri()
Create a URI for the serving image.
MXNetModel$serving_image_uri( region_name, instance_type, accelerator_type = NULL )
region_name
(str): AWS region where the image is uploaded.
instance_type
(str): SageMaker instance type. Used to determine device type (cpu/gpu/family-specific optimized).
accelerator_type
(str): The Elastic Inference accelerator type to deploy to the instance for loading and making inferences to the model (default: None). For example, 'ml.eia1.medium'.
str: The appropriate image URI based on the given parameters.
clone()
The objects of this class are cloneable with this method.
MXNetModel$clone(deep = FALSE)
deep
Whether to make a deep clone.
A Predictor for inference against MXNet Endpoints. This is able to serialize Python lists, dictionaries, and numpy arrays to multidimensional tensors for MXNet inference.
sagemaker.mlcore::PredictorBase
-> sagemaker.mlcore::Predictor
-> MXNetPredictor
new()
Initialize an “MXNetPredictor“.
MXNetPredictor$new( endpoint_name, sagemaker_session = NULL, serializer = JSONSerializer$new(), deserializer = JSONDeserializer$new() )
endpoint_name
(str): The name of the endpoint to perform inference on.
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the estimator creates one using the default AWS configuration chain.
serializer
(callable): Optional. Default serializes input data to json. Handles dicts, lists, and numpy arrays.
deserializer
(callable): Optional. Default parses the response using “json.load(...)“.
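The default serializer and deserializer amount to a JSON round trip. The sketch below shows that round trip with the Python standard library only. The function names are hypothetical, the response body is made up, and a numpy array would first need converting to nested lists (for example with its `tolist()` method) before `json.dumps` could handle it.

```python
import io
import json


def serialize(data):
    """Encode lists or dicts as a JSON request body."""
    return json.dumps(data)


def deserialize(stream):
    """Parse a JSON response from a file-like object, as json.load(...) does."""
    return json.load(stream)


# A 2 x 2 input tensor expressed as nested lists.
payload = serialize([[1.0, 2.0], [3.0, 4.0]])

# Made-up response body standing in for an endpoint reply.
response = io.StringIO("[[0.1, 0.9]]")
predictions = deserialize(response)
print(payload, predictions)
```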
clone()
The objects of this class are cloneable with this method.
MXNetPredictor$clone(deep = FALSE)
deep
Whether to make a deep clone.
Handles Amazon SageMaker processing tasks for jobs using MXNet containers.
sagemaker.common::Processor
-> sagemaker.common::ScriptProcessor
-> sagemaker.common::FrameworkProcessor
-> MXNetProcessor
estimator_cls
Estimator object
new()
This processor executes a Python script in a managed MXNet execution environment. Unless “image_uri“ is specified, the MXNet environment is an Amazon-built Docker container that executes functions defined in the supplied “code“ Python script.
MXNetProcessor$new( framework_version, role, instance_count, instance_type, py_version = "py3", image_uri = NULL, command = NULL, volume_size_in_gb = 30, volume_kms_key = NULL, output_kms_key = NULL, code_location = NULL, max_runtime_in_seconds = NULL, base_job_name = NULL, sagemaker_session = NULL, env = NULL, tags = NULL, network_config = NULL )
framework_version
(str): The version of the framework. Value is ignored when “image_uri“ is provided.
role
(str): An AWS IAM role name or ARN. Amazon SageMaker Processing uses this role to access AWS resources, such as data stored in Amazon S3.
instance_count
(int): The number of instances to run a processing job with.
instance_type
(str): The type of EC2 instance to use for processing, for example, 'ml.c4.xlarge'.
py_version
(str): Python version you want to use for executing your model training code. One of 'py2' or 'py3'. Defaults to 'py3'. Value is ignored when “image_uri“ is provided.
image_uri
(str): The URI of the Docker image to use for the processing jobs (default: None).
command
([str]): The command to run, along with any command-line flags to precede the “code“ script. Example: ["python3", "-v"]. If not provided, ["python"] will be chosen (default: None).
volume_size_in_gb
(int): Size in GB of the EBS volume to use for storing data during processing (default: 30).
volume_kms_key
(str): A KMS key for the processing volume (default: None).
output_kms_key
(str): The KMS key ID for processing job outputs (default: None).
code_location
(str): The S3 prefix URI where custom code will be uploaded (default: None). The code file uploaded to S3 is 'code_location/job-name/source/sourcedir.tar.gz'. If not specified, the default “code location“ is 's3://sagemaker-default-bucket'
max_runtime_in_seconds
(int): Timeout in seconds (default: None). After this amount of time, Amazon SageMaker terminates the job, regardless of its current status. If 'max_runtime_in_seconds' is not specified, the default value is 24 hours.
base_job_name
(str): Prefix for processing name. If not specified, the processor generates a default job name, based on the processing image name and current timestamp (default: None).
sagemaker_session
(:class:'~sagemaker.session.Session'): Session object which manages interactions with Amazon SageMaker and any other AWS services needed. If not specified, the processor creates one using the default AWS configuration chain (default: None).
env
(dict[str, str]): Environment variables to be passed to the processing jobs (default: None).
tags
(list[dict]): List of tags to be passed to the processing job (default: None). For more, see https://docs.aws.amazon.com/sagemaker/latest/dg/API_Tag.html.
network_config
(:class:'~sagemaker.network.NetworkConfig'): A :class:'~sagemaker.network.NetworkConfig' object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets (default: None).
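The `command` and `code_location` arguments each describe a small composition rule: the command precedes the code script, and the uploaded archive lands under 'code_location/job-name/source/sourcedir.tar.gz'. A minimal sketch of both, with hypothetical helper names, bucket, job name and script name:

```python
def container_command(code, command=None):
    """Command-line flags precede the code script; ["python"] is the fallback."""
    return list(command or ["python"]) + [code]


def uploaded_code_uri(code_location, job_name):
    """Build the documented S3 key for the uploaded source archive."""
    return f"{code_location.rstrip('/')}/{job_name}/source/sourcedir.tar.gz"


# Hypothetical names, for illustration only.
print(container_command("preprocess.py", ["python3", "-v"]))
print(container_command("preprocess.py"))
print(uploaded_code_uri("s3://my-bucket/code/", "mxnet-job-001"))
```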
clone()
The objects of this class are cloneable with this method.
MXNetProcessor$clone(deep = FALSE)
deep
Whether to make a deep clone.
The resulting topics contain word groupings based on their statistical distribution. For example, documents that contain frequent occurrences of words such as "bike", "car", "train", "mileage", and "speed" are likely to share a topic on "transportation".
sagemaker.mlcore::EstimatorBase
-> sagemaker.mlcore::AmazonAlgorithmEstimatorBase
-> NTM
repo_name
sagemaker repo name for framework
repo_version
version of framework
.module
mimic python module
num_topics
The number of topics for NTM to find within the data
encoder_layers
Represents number of layers in the encoder and the output size of each layer
epochs
Maximum number of passes over the training data.
encoder_layers_activation
Activation function to use in the encoder layers.
optimizer
Optimizer to use for training.
tolerance
Maximum relative change in the loss function within the last num_patience_epochs number of epochs below which early stopping is triggered.
num_patience_epochs
Number of successive epochs over which early stopping criterion is evaluated.
batch_norm
Whether to use batch normalization during training.
rescale_gradient
Rescale factor for gradient
clip_gradient
Maximum magnitude for each gradient component.
weight_decay
Weight decay coefficient.
learning_rate
Learning rate for the optimizer.
sagemaker.mlcore::EstimatorBase$latest_job_debugger_artifacts_path()
sagemaker.mlcore::EstimatorBase$latest_job_profiler_artifacts_path()
sagemaker.mlcore::EstimatorBase$latest_job_tensorboard_artifacts_path()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$hyperparameters()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$prepare_workflow_for_training()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$training_image_uri()
new()
Neural Topic Model (NTM) is an :class:'Estimator' used for unsupervised learning. This Estimator may be fit via calls to :meth:'~sagemaker.amazon.amazon_estimator.AmazonAlgorithmEstimatorBase.fit'. It requires Amazon :class:'~sagemaker.amazon.record_pb2.Record' protobuf serialized data to be stored in S3. The utility :meth:'~sagemaker.amazon.amazon_estimator.AmazonAlgorithmEstimatorBase.record_set' can be used to upload data to S3 and create a :class:'~sagemaker.amazon.amazon_estimator.RecordSet' to be passed to the 'fit' call. To learn more about the Amazon protobuf Record class and how to prepare bulk data in this format, please consult AWS technical documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/cdf-training.html
After this Estimator is fit, model data is stored in S3. The model may be deployed to an Amazon SageMaker Endpoint by invoking :meth:'~sagemaker.amazon.estimator.EstimatorBase.deploy'. As well as deploying an Endpoint, deploy returns a :class:'~sagemaker.amazon.ntm.NTMPredictor' object that can be used for inference calls using the trained model hosted in the SageMaker Endpoint. NTM Estimators can be configured by setting hyperparameters. The available hyperparameters for NTM are documented below.
For further information on the AWS NTM algorithm, please consult AWS technical documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/ntm.html
NTM$new( role, instance_count, instance_type, num_topics, encoder_layers = NULL, epochs = NULL, encoder_layers_activation = NULL, optimizer = NULL, tolerance = NULL, num_patience_epochs = NULL, batch_norm = NULL, rescale_gradient = NULL, clip_gradient = NULL, weight_decay = NULL, learning_rate = NULL, ... )
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
instance_count
(int): Number of Amazon EC2 instances to use for training.
instance_type
(str): Type of EC2 instance to use for training, for example, 'ml.c4.xlarge'.
num_topics
(int): Required. The number of topics for NTM to find within the data.
encoder_layers
(list): Optional. Represents number of layers in the encoder and the output size of each layer.
epochs
(int): Optional. Maximum number of passes over the training data.
encoder_layers_activation
(str): Optional. Activation function to use in the encoder layers.
optimizer
(str): Optional. Optimizer to use for training.
tolerance
(float): Optional. Maximum relative change in the loss function within the last num_patience_epochs number of epochs below which early stopping is triggered.
num_patience_epochs
(int): Optional. Number of successive epochs over which early stopping criterion is evaluated.
batch_norm
(bool): Optional. Whether to use batch normalization during training.
rescale_gradient
(float): Optional. Rescale factor for gradient.
clip_gradient
(float): Optional. Maximum magnitude for each gradient component.
weight_decay
(float): Optional. Weight decay coefficient. Adds L2 regularization.
learning_rate
(float): Optional. Learning rate for the optimizer.
...
: base class keyword argument values.
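The way `tolerance` and `num_patience_epochs` combine into an early stopping rule can be sketched as below. The exact formula NTM uses is not given in these docs, so this is one plausible reading under stated assumptions: the loss from `num_patience_epochs` epochs ago is compared with the best loss seen since, and training stops when the relative improvement falls below `tolerance`. The function name is hypothetical.

```python
def should_stop_early(losses, tolerance, num_patience_epochs):
    """Decide whether to stop, given the per-epoch loss history so far.

    Assumed rule (not confirmed by the docs): stop when the relative change
    in loss over the last num_patience_epochs epochs is below tolerance.
    """
    if num_patience_epochs < 1:
        raise ValueError("num_patience_epochs must be at least 1")
    # Not enough history yet to evaluate the criterion.
    if len(losses) <= num_patience_epochs:
        return False
    reference = losses[-(num_patience_epochs + 1)]
    best_recent = min(losses[-num_patience_epochs:])
    # Fall back to 1.0 to avoid dividing by zero when the reference loss is 0.
    denominator = abs(reference) or 1.0
    relative_change = (reference - best_recent) / denominator
    return relative_change < tolerance


# Illustrative loss histories only.
print(should_stop_early([1.0, 0.5, 0.499, 0.4989], tolerance=0.01, num_patience_epochs=2))
print(should_stop_early([1.0, 0.8, 0.6], tolerance=0.01, num_patience_epochs=2))
```

In the first history the loss has plateaued over the last two epochs, so the rule fires. In the second the loss is still falling quickly, so training continues.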
create_model()
Return a :class:'~sagemaker.amazon.NTMModel' referencing the latest S3 model data produced by this Estimator.
NTM$create_model(vpc_config_override = "VPC_CONFIG_DEFAULT", ...)
vpc_config_override
(dict[str, list[str]]): Optional override for VpcConfig set on the model. Default: use subnets and security groups from this Estimator. * 'Subnets' (list[str]): List of subnet ids. * 'SecurityGroupIds' (list[str]): List of security group ids.
...
: Additional kwargs passed to the NTMModel constructor.
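The vpc_config_override structure described above maps naturally onto a named R list. The sketch below assumes the R port accepts a named list where the Python SDK takes a dict; the IDs are placeholders and `is_valid_vpc_config` is a hypothetical check, not a package function.

```r
# Placeholder IDs; substitute the subnets and security groups of your own VPC.
vpc_config <- list(
  Subnets = list("subnet-0123456789abcdef0", "subnet-0fedcba9876543210"),
  SecurityGroupIds = list("sg-0123456789abcdef0")
)

# Hypothetical sanity check: both keys are present and every ID is a
# single character string.
is_valid_vpc_config <- function(cfg) {
  required <- c("Subnets", "SecurityGroupIds")
  if (!all(required %in% names(cfg))) {
    return(FALSE)
  }
  is_id <- function(x) is.character(x) && length(x) == 1
  all(vapply(required, function(key) {
    all(vapply(cfg[[key]], is_id, logical(1)))
  }, logical(1)))
}

vpc_ok <- is_valid_vpc_config(vpc_config)

# Assumed usage on a fitted estimator (requires AWS access, so not run here):
# model <- estimator$create_model(vpc_config_override = vpc_config)
```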
.prepare_for_training()
Set hyperparameters needed for training. This method will also validate “source_dir“.
NTM$.prepare_for_training(records, mini_batch_size, job_name = NULL)
records
(RecordSet): The records to train this Estimator on.
mini_batch_size
(int or None): The size of each mini-batch to use when training. If None, a default value will be used.
job_name
(str): Name of the training job to be created. If not specified, one is generated, using the base name given to the constructor if applicable.
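When job_name is omitted, a name is derived from the base name. The sketch below shows one way such a name could be built; the function name `default_job_name` and the timestamp layout are assumptions for illustration, and the package's actual format may differ.

```r
# Hypothetical helper: combine a base job name with a UTC timestamp.
default_job_name <- function(base_job_name, time = Sys.time()) {
  stamp <- format(time, "%Y-%m-%d-%H-%M-%S", tz = "UTC")
  paste(base_job_name, stamp, sep = "-")
}

fixed_time <- as.POSIXct("2024-01-02 03:04:05", tz = "UTC")
example_name <- default_job_name("ntm", fixed_time)
```

Passing an explicit job_name is the safer choice when a job has to be found again later, since generated names change on every call.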
clone()
The objects of this class are cloneable with this method.
NTM$clone(deep = FALSE)
deep
Whether to make a deep clone.
Calling :meth:'~sagemaker.model.Model.deploy' creates an Endpoint and returns a Predictor that transforms vectors to a lower-dimensional representation.
sagemaker.mlcore::ModelBase
-> sagemaker.mlcore::Model
-> NTMModel
new()
Initialize NTMModel class
NTMModel$new(model_data, role, sagemaker_session = NULL, ...)
model_data
(str): The S3 location of a SageMaker model data “.tar.gz“ file.
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the estimator creates one using the default AWS configuration chain.
...
: Keyword arguments passed to the “FrameworkModel“ initializer.
clone()
The objects of this class are cloneable with this method.
NTMModel$clone(deep = FALSE)
deep
Whether to make a deep clone.
The implementation of :meth:'~sagemaker.predictor.Predictor.predict' in this 'Predictor' requires a numpy “ndarray“ as input. The array should contain the same number of columns as the feature-dimension of the data used to fit the model this Predictor performs inference on. :meth:'predict()' returns a list of :class:'~sagemaker.amazon.record_pb2.Record' objects, one for each row in the input “ndarray“. The lower dimension vector result is stored in the “projection“ key of the “Record.label“ field.
sagemaker.mlcore::PredictorBase
-> sagemaker.mlcore::Predictor
-> NTMPredictor
new()
Initialize NTMPredictor class
NTMPredictor$new(endpoint_name, sagemaker_session = NULL)
endpoint_name
(str): Name of the Amazon SageMaker endpoint to which requests are sent.
sagemaker_session
(sagemaker.session.Session): A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, one is created using the default AWS configuration chain.
clone()
The objects of this class are cloneable with this method.
NTMPredictor$clone(deep = FALSE)
deep
Whether to make a deep clone.
It can learn low-dimensional dense embeddings of high-dimensional objects. The embeddings are learned in a way that preserves the semantics of the relationship between pairs of objects in the original space in the embedding space.
sagemaker.mlcore::EstimatorBase
-> sagemaker.mlcore::AmazonAlgorithmEstimatorBase
-> Object2Vec
repo_name
sagemaker repo name for framework
repo_version
version of framework
MINI_BATCH_SIZE
The size of each mini-batch to use when training.
.module
mimic python module
epochs
Total number of epochs for SGD training
enc_dim
Dimension of the output of the embedding layer
mini_batch_size
mini batch size for SGD training
early_stopping_patience
The allowed number of consecutive epochs without improvement before early stopping is applied
early_stopping_tolerance
The value used to determine whether the algorithm has made improvement between two consecutive epochs for early stopping
dropout
Dropout probability on network layers
weight_decay
Weight decay parameter during optimization
bucket_width
The allowed difference between data sequence length when bucketing is enabled
num_classes
Number of classes for classification training (ignored for regression problems)
mlp_layers
Number of MLP layers in the network
mlp_dim
Dimension of the output of MLP layer
mlp_activation
Type of activation function for the MLP layer
output_layer
Type of output layer
optimizer
Type of optimizer for training
learning_rate
Learning rate for SGD training
negative_sampling_rate
Negative sampling rate
comparator_list
Customization of comparator operator
tied_token_embedding_weight
Tying of token embedding layer weight
token_embedding_storage_type
Type of token embedding storage
enc0_network
Network model of encoder "enc0"
enc1_network
Network model of encoder "enc1"
enc0_cnn_filter_width
CNN filter width
enc1_cnn_filter_width
CNN filter width
enc0_max_seq_len
Maximum sequence length
enc1_max_seq_len
Maximum sequence length
enc0_token_embedding_dim
Output dimension of token embedding layer
enc1_token_embedding_dim
Output dimension of token embedding layer
enc0_vocab_size
Vocabulary size of tokens
enc1_vocab_size
Vocabulary size of tokens
enc0_layers
Number of layers in encoder
enc1_layers
Number of layers in encoder
enc0_freeze_pretrained_embedding
Freeze pretrained embedding weights
enc1_freeze_pretrained_embedding
Freeze pretrained embedding weights
sagemaker.mlcore::EstimatorBase$latest_job_debugger_artifacts_path()
sagemaker.mlcore::EstimatorBase$latest_job_profiler_artifacts_path()
sagemaker.mlcore::EstimatorBase$latest_job_tensorboard_artifacts_path()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$hyperparameters()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$prepare_workflow_for_training()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$training_image_uri()
new()
Object2Vec is an :class:'Estimator' that learns low-dimensional dense embeddings of high-dimensional objects. This Estimator may be fit via calls to :meth:'~sagemaker.amazon.amazon_estimator.AmazonAlgorithmEstimatorBase.fit'. There is a utility, :meth:'~sagemaker.amazon.amazon_estimator.AmazonAlgorithmEstimatorBase.record_set', that can be used to upload data to S3 and create a :class:'~sagemaker.amazon.amazon_estimator.RecordSet' to be passed to the 'fit' call. After this Estimator is fit, model data is stored in S3. The model may be deployed to an Amazon SageMaker Endpoint by invoking :meth:'~sagemaker.amazon.estimator.EstimatorBase.deploy'. As well as deploying an Endpoint, deploy returns a :class:'~sagemaker.amazon.Predictor' object that can be used for inference calls using the trained model hosted in the SageMaker Endpoint. Object2Vec Estimators can be configured by setting hyperparameters. The available hyperparameters for Object2Vec are documented below. For further information on the AWS Object2Vec algorithm, please consult AWS technical documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/object2vec.html
Object2Vec$new( role, instance_count, instance_type, epochs, enc0_max_seq_len, enc0_vocab_size, enc_dim = NULL, mini_batch_size = NULL, early_stopping_patience = NULL, early_stopping_tolerance = NULL, dropout = NULL, weight_decay = NULL, bucket_width = NULL, num_classes = NULL, mlp_layers = NULL, mlp_dim = NULL, mlp_activation = NULL, output_layer = NULL, optimizer = NULL, learning_rate = NULL, negative_sampling_rate = NULL, comparator_list = NULL, tied_token_embedding_weight = NULL, token_embedding_storage_type = NULL, enc0_network = NULL, enc1_network = NULL, enc0_cnn_filter_width = NULL, enc1_cnn_filter_width = NULL, enc1_max_seq_len = NULL, enc0_token_embedding_dim = NULL, enc1_token_embedding_dim = NULL, enc1_vocab_size = NULL, enc0_layers = NULL, enc1_layers = NULL, enc0_freeze_pretrained_embedding = NULL, enc1_freeze_pretrained_embedding = NULL, ... )
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
instance_count
(int): Number of Amazon EC2 instances to use for training.
instance_type
(str): Type of EC2 instance to use for training, for example, 'ml.c4.xlarge'.
epochs
(int): Total number of epochs for SGD training
enc0_max_seq_len
(int): Maximum sequence length
enc0_vocab_size
(int): Vocabulary size of tokens
enc_dim
(int): Optional. Dimension of the output of the embedding layer
mini_batch_size
(int): Optional. mini batch size for SGD training
early_stopping_patience
(int): Optional. The allowed number of consecutive epochs without improvement before early stopping is applied
early_stopping_tolerance
(float): Optional. The value used to determine whether the algorithm has made improvement between two consecutive epochs for early stopping
dropout
(float): Optional. Dropout probability on network layers
weight_decay
(float): Optional. Weight decay parameter during optimization
bucket_width
(int): Optional. The allowed difference between data sequence length when bucketing is enabled
num_classes
(int): Optional. Number of classes for classification training (ignored for regression problems)
mlp_layers
(int): Optional. Number of MLP layers in the network
mlp_dim
(int): Optional. Dimension of the output of MLP layer
mlp_activation
(str): Optional. Type of activation function for the MLP layer
output_layer
(str): Optional. Type of output layer
optimizer
(str): Optional. Type of optimizer for training
learning_rate
(float): Optional. Learning rate for SGD training
negative_sampling_rate
(int): Optional. Negative sampling rate
comparator_list
(str): Optional. Customization of comparator operator
tied_token_embedding_weight
(bool): Optional. Tying of token embedding layer weight
token_embedding_storage_type
(str): Optional. Type of token embedding storage
enc0_network
(str): Optional. Network model of encoder "enc0"
enc1_network
(str): Optional. Network model of encoder "enc1"
enc0_cnn_filter_width
(int): Optional. CNN filter width
enc1_cnn_filter_width
(int): Optional. CNN filter width
enc1_max_seq_len
(int): Optional. Maximum sequence length
enc0_token_embedding_dim
(int): Optional. Output dimension of token embedding layer
enc1_token_embedding_dim
(int): Optional. Output dimension of token embedding layer
enc1_vocab_size
(int): Optional. Vocabulary size of tokens
enc0_layers
(int): Optional. Number of layers in encoder
enc1_layers
(int): Optional. Number of layers in encoder
enc0_freeze_pretrained_embedding
(bool): Optional. Freeze pretrained embedding weights
enc1_freeze_pretrained_embedding
(bool): Optional. Freeze pretrained embedding weights
...
: base class keyword argument values.
create_model()
Return a :class:'~sagemaker.amazon.Object2VecModel' referencing the latest S3 model data produced by this Estimator.
Object2Vec$create_model(vpc_config_override = "VPC_CONFIG_DEFAULT", ...)
vpc_config_override
(dict[str, list[str]]): Optional override for VpcConfig set on the model. Default: use subnets and security groups from this Estimator. * 'Subnets' (list[str]): List of subnet ids. * 'SecurityGroupIds' (list[str]): List of security group ids.
...
: Additional kwargs passed to the Object2VecModel constructor.
.prepare_for_training()
Set hyperparameters needed for training. This method will also validate “source_dir“.
Object2Vec$.prepare_for_training( records, mini_batch_size = NULL, job_name = NULL )
records
(RecordSet): The records to train this Estimator on.
mini_batch_size
(int or None): The size of each mini-batch to use when training. If None, a default value will be used.
job_name
(str): Name of the training job to be created. If not specified, one is generated, using the base name given to the constructor if applicable.
clone()
The objects of this class are cloneable with this method.
Object2Vec$clone(deep = FALSE)
deep
Whether to make a deep clone.
Calling :meth:'~sagemaker.model.Model.deploy' creates an Endpoint and returns a Predictor that performs inference with the trained Object2Vec model.
sagemaker.mlcore::ModelBase
-> sagemaker.mlcore::Model
-> Object2VecModel
new()
Initialize Object2VecModel class
Object2VecModel$new(model_data, role, sagemaker_session = NULL, ...)
model_data
(str): The S3 location of a SageMaker model data “.tar.gz“ file.
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the estimator creates one using the default AWS configuration chain.
...
: Keyword arguments passed to the “FrameworkModel“ initializer.
clone()
The objects of this class are cloneable with this method.
Object2VecModel$clone(deep = FALSE)
deep
Whether to make a deep clone.
As a result, the number of features within a dataset is reduced, but the dataset still retains as much information as possible.
sagemaker.mlcore::EstimatorBase
-> sagemaker.mlcore::AmazonAlgorithmEstimatorBase
-> PCA
repo_name
sagemaker repo name for framework
repo_version
version of framework
DEFAULT_MINI_BATCH_SIZE
The size of each mini-batch to use when training.
.module
mimic python module
num_components
The number of principal components. Must be greater than zero.
algorithm_mode
Mode for computing the principal components.
subtract_mean
Whether the data should be unbiased both during training and at inference.
extra_components
As the value grows larger, the solution becomes more accurate but the runtime and memory consumption increase linearly.
sagemaker.mlcore::EstimatorBase$latest_job_debugger_artifacts_path()
sagemaker.mlcore::EstimatorBase$latest_job_profiler_artifacts_path()
sagemaker.mlcore::EstimatorBase$latest_job_tensorboard_artifacts_path()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$hyperparameters()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$prepare_workflow_for_training()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$training_image_uri()
new()
A Principal Components Analysis (PCA) :class:'~sagemaker.amazon.amazon_estimator.AmazonAlgorithmEstimatorBase'. This Estimator may be fit via calls to :meth:'~sagemaker.amazon.amazon_estimator.AmazonAlgorithmEstimatorBase.fit_ndarray' or :meth:'~sagemaker.amazon.amazon_estimator.AmazonAlgorithmEstimatorBase.fit'. The former allows a PCA model to be fit on a 2-dimensional numpy array. The latter requires Amazon :class:'~sagemaker.amazon.record_pb2.Record' protobuf serialized data to be stored in S3. To learn more about the Amazon protobuf Record class and how to prepare bulk data in this format, please consult AWS technical documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/cdf-training.html After this Estimator is fit, model data is stored in S3. The model may be deployed to an Amazon SageMaker Endpoint by invoking :meth:'~sagemaker.amazon.estimator.EstimatorBase.deploy'. As well as deploying an Endpoint, deploy returns a :class:'~sagemaker.amazon.pca.PCAPredictor' object that can be used to project input vectors to the learned lower-dimensional representation, using the trained PCA model hosted in the SageMaker Endpoint. PCA Estimators can be configured by setting hyperparameters. The available hyperparameters for PCA are documented below. For further information on the AWS PCA algorithm, please consult AWS technical documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/pca.html This Estimator uses Amazon SageMaker PCA to perform training and host deployed models. To learn more about Amazon SageMaker PCA, please read: https://docs.aws.amazon.com/sagemaker/latest/dg/how-pca-works.html
PCA$new( role, instance_count, instance_type, num_components, algorithm_mode = NULL, subtract_mean = NULL, extra_components = NULL, ... )
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
instance_count
(int): Number of Amazon EC2 instances to use for training.
instance_type
(str): Type of EC2 instance to use for training, for example, 'ml.c4.xlarge'.
num_components
(int): The number of principal components. Must be greater than zero.
algorithm_mode
(str): Mode for computing the principal components. One of 'regular' or 'randomized'.
subtract_mean
(bool): Whether the data should be unbiased both during training and at inference.
extra_components
(int): As the value grows larger, the solution becomes more accurate but the runtime and memory consumption increase linearly. If this value is unset or set to -1, then a default value equal to the maximum of 10 and num_components will be used. Valid for randomized mode only.
...
: base class keyword argument values.
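The default rule for extra_components stated above (unset or -1 resolves to the maximum of 10 and num_components) can be written out as a small R function. The name `resolve_extra_components` is hypothetical; the sketch only mirrors the documented rule and is not the package's own code.

```r
# Hypothetical helper mirroring the documented default for extra_components.
resolve_extra_components <- function(num_components, extra_components = NULL) {
  # num_components must be greater than zero, as documented.
  stopifnot(num_components > 0)
  if (is.null(extra_components) || extra_components == -1) {
    return(max(10, num_components))
  }
  extra_components
}

default_small <- resolve_extra_components(num_components = 5)
default_large <- resolve_extra_components(num_components = 50, extra_components = -1)
explicit <- resolve_extra_components(num_components = 5, extra_components = 20)
```

The value only matters when algorithm_mode is 'randomized'.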
create_model()
Return a :class:'~sagemaker.amazon.pca.PCAModel' referencing the latest S3 model data produced by this Estimator.
PCA$create_model(vpc_config_override = "VPC_CONFIG_DEFAULT", ...)
vpc_config_override
(dict[str, list[str]]): Optional override for VpcConfig set on the model. Default: use subnets and security groups from this Estimator. * 'Subnets' (list[str]): List of subnet ids. * 'SecurityGroupIds' (list[str]): List of security group ids.
...
: Additional kwargs passed to the PCAModel constructor.
.prepare_for_training()
Set hyperparameters needed for training.
PCA$.prepare_for_training(records, mini_batch_size = NULL, job_name = NULL)
records
(:class:'~RecordSet'): The records to train this “Estimator“ on.
mini_batch_size
(int or None): The size of each mini-batch to use when training. If “None“, a default value will be used.
job_name
(str): Name of the training job to be created. If not specified, one is generated, using the base name given to the constructor if applicable.
clone()
The objects of this class are cloneable with this method.
PCA$clone(deep = FALSE)
deep
Whether to make a deep clone.
Calling :meth:'~sagemaker.model.Model.deploy' creates an Endpoint and returns a Predictor that transforms vectors to a lower-dimensional representation.
sagemaker.mlcore::ModelBase
-> sagemaker.mlcore::Model
-> PCAModel
new()
Initialize PCAModel class
PCAModel$new(model_data, role, sagemaker_session = NULL, ...)
model_data
(str): The S3 location of a SageMaker model data “.tar.gz“ file.
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the estimator creates one using the default AWS configuration chain.
...
: Keyword arguments passed to the “FrameworkModel“ initializer.
clone()
The objects of this class are cloneable with this method.
PCAModel$clone(deep = FALSE)
deep
Whether to make a deep clone.
The implementation of :meth:'~sagemaker.predictor.Predictor.predict' in this 'Predictor' requires a numpy “ndarray“ as input. The array should contain the same number of columns as the feature-dimension of the data used to fit the model this Predictor performs inference on. :meth:'predict()' returns a list of :class:'~sagemaker.amazon.record_pb2.Record' objects, one for each row in the input “ndarray“. The lower dimension vector result is stored in the “projection“ key of the “Record.label“ field.
sagemaker.mlcore::PredictorBase
-> sagemaker.mlcore::Predictor
-> PCAPredictor
new()
Initialize PCAPredictor class
PCAPredictor$new(endpoint_name, sagemaker_session = NULL)
endpoint_name
(str): Name of the Amazon SageMaker endpoint to which requests are sent.
sagemaker_session
(sagemaker.session.Session): A SageMaker Session object, used for SageMaker interactions (default: None). If not specified, one is created using the default AWS configuration chain.
clone()
The objects of this class are cloneable with this method.
PCAPredictor$clone(deep = FALSE)
deep
Whether to make a deep clone.
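To make the PCA predictor's input and output shapes concrete, the projection it returns can be imitated locally with base R's `prcomp`. This is a sketch for intuition only: it runs on the local machine, does not call SageMaker, and the random data and variable names are illustrative. `center = TRUE` stands in for subtract_mean, and `rank.` stands in for num_components.

```r
set.seed(42)

# Training data: 200 rows, 5 features.
x <- matrix(rnorm(200 * 5), nrow = 200, ncol = 5)
num_components <- 2

# Fit PCA locally, keeping only the first `num_components` components.
fit <- prcomp(x, center = TRUE, scale. = FALSE, rank. = num_components)

# New rows must have the same number of columns as the training data.
new_rows <- matrix(rnorm(3 * 5), nrow = 3, ncol = 5)

# One projected row per input row, each of length `num_components`.
projection <- predict(fit, newdata = new_rows)

# The same projection computed by hand: subtract the training mean,
# then multiply by the component matrix.
manual <- sweep(new_rows, 2, fit$center) %*% fit$rotation
```

Each row of `projection` plays the role of the “projection“ vector returned for one input row.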
Handles Amazon SageMaker processing tasks for jobs using PySpark.
sagemaker.common::Processor
-> sagemaker.common::ScriptProcessor
-> sagemaker.mlframework::.SparkProcessorBase
-> PySparkProcessor
new()
Initialize a “PySparkProcessor“ instance. The PySparkProcessor handles Amazon SageMaker processing tasks for jobs using SageMaker PySpark.
PySparkProcessor$new( role, instance_type, instance_count, framework_version = NULL, py_version = NULL, container_version = NULL, image_uri = NULL, volume_size_in_gb = 30, volume_kms_key = NULL, output_kms_key = NULL, max_runtime_in_seconds = NULL, base_job_name = NULL, sagemaker_session = NULL, env = NULL, tags = NULL, network_config = NULL )
role
(str): An AWS IAM role name or ARN. The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
instance_type
(str): Type of EC2 instance to use for processing, for example, 'ml.c4.xlarge'.
instance_count
(int): The number of instances to run the Processing job with. Defaults to 1.
framework_version
(str): The version of SageMaker PySpark.
py_version
(str): The version of python.
container_version
(str): The version of spark container.
image_uri
(str): The container image to use for processing.
volume_size_in_gb
(int): Size in GB of the EBS volume to use for storing data during processing (default: 30).
volume_kms_key
(str): A KMS key for the processing volume.
output_kms_key
(str): The KMS key id for all ProcessingOutputs.
max_runtime_in_seconds
(int): Timeout in seconds. After this amount of time Amazon SageMaker terminates the job regardless of its current status.
base_job_name
(str): Prefix for processing name. If not specified, the processor generates a default job name, based on the training image name and current timestamp.
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the processor creates one using the default AWS configuration chain.
env
(dict): Environment variables to be passed to the processing job.
tags
([dict]): List of tags to be passed to the processing job.
network_config
(sagemaker.network.NetworkConfig): A NetworkConfig object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets.
get_args_run()
Returns a RunArgs object. This object contains the normalized inputs, outputs and arguments needed when using a “PySparkProcessor“ in a :class:'~sagemaker.workflow.steps.ProcessingStep'.
PySparkProcessor$get_args_run( submit_app, submit_py_files = NULL, submit_jars = NULL, submit_files = NULL, inputs = NULL, outputs = NULL, arguments = NULL, job_name = NULL, configuration = NULL, spark_event_logs_s3_uri = NULL )
submit_app
(str): Path (local or S3) to Python file to submit to Spark as the primary application. This is translated to the 'code' property on the returned 'RunArgs' object.
submit_py_files
(list[str]): List of paths (local or S3) to provide for 'spark-submit --py-files' option
submit_jars
(list[str]): List of paths (local or S3) to provide for 'spark-submit --jars' option
submit_files
(list[str]): List of paths (local or S3) to provide for 'spark-submit --files' option
inputs
(list[:class:'~sagemaker.processing.ProcessingInput']): Input files for the processing job. These must be provided as :class:'~sagemaker.processing.ProcessingInput' objects (default: None).
outputs
(list[:class:'~sagemaker.processing.ProcessingOutput']): Outputs for the processing job. These can be specified as either path strings or :class:'~sagemaker.processing.ProcessingOutput' objects (default: None).
arguments
(list[str]): A list of string arguments to be passed to a processing job (default: None).
job_name
(str): Processing job name. If not specified, the processor generates a default job name, based on the base job name and current timestamp.
configuration
(list[dict] or dict): Configuration for Hadoop, Spark, or Hive. List or dictionary of EMR-style classifications. https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-configure-apps.html
spark_event_logs_s3_uri
(str): S3 path where spark application events will be published to.
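The configuration argument takes EMR-style classifications, which in R can be written as a list of named lists. The sketch below assumes the R port accepts nested lists where the Python SDK takes dicts; the property values and the script name in the commented call are illustrative.

```r
# One EMR-style classification: a Classification name plus its Properties.
spark_configuration <- list(
  list(
    Classification = "spark-defaults",
    Properties = list(
      "spark.executor.memory" = "2g",
      "spark.executor.cores" = "1"
    )
  )
)

# Collect the classification names for a quick look at what is configured.
classification_names <- vapply(
  spark_configuration,
  function(entry) entry$Classification,
  character(1)
)

# Assumed usage (requires AWS access, so not run here):
# processor$run(submit_app = "preprocess.py", configuration = spark_configuration)
```

The EMR documentation linked above lists the available classifications and their properties.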
run()
Runs a processing job.
PySparkProcessor$run( submit_app, submit_py_files = NULL, submit_jars = NULL, submit_files = NULL, inputs = NULL, outputs = NULL, arguments = NULL, wait = TRUE, logs = TRUE, job_name = NULL, experiment_config = NULL, configuration = NULL, spark_event_logs_s3_uri = NULL, kms_key = NULL )
submit_app
(str): Path (local or S3) to Python file to submit to Spark as the primary application
submit_py_files
(list[str]): List of paths (local or S3) to provide for 'spark-submit --py-files' option
submit_jars
(list[str]): List of paths (local or S3) to provide for 'spark-submit --jars' option
submit_files
(list[str]): List of paths (local or S3) to provide for 'spark-submit --files' option
inputs
(list[:class:'~sagemaker.processing.ProcessingInput']): Input files for the processing job. These must be provided as :class:'~sagemaker.processing.ProcessingInput' objects (default: None).
outputs
(list[:class:'~sagemaker.processing.ProcessingOutput']): Outputs for the processing job. These can be specified as either path strings or :class:'~sagemaker.processing.ProcessingOutput' objects (default: None).
arguments
(list[str]): A list of string arguments to be passed to a processing job (default: None).
wait
(bool): Whether the call should wait until the job completes (default: True).
logs
(bool): Whether to show the logs produced by the job. Only meaningful when wait is True (default: True).
job_name
(str): Processing job name. If not specified, the processor generates a default job name, based on the base job name and current timestamp.
experiment_config
(dict[str, str]): Experiment management configuration. Dictionary contains three optional keys: 'ExperimentName', 'TrialName', and 'TrialComponentDisplayName'.
configuration
(list[dict] or dict): Configuration for Hadoop, Spark, or Hive. List or dictionary of EMR-style classifications. https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-configure-apps.html
spark_event_logs_s3_uri
(str): S3 path where spark application events will be published to.
kms_key
(str): The ARN of the KMS key that is used to encrypt the user code file (default: None).
clone()
The objects of this class are cloneable with this method.
PySparkProcessor$clone(deep = FALSE)
deep
Whether to make a deep clone.
Handle end-to-end training and deployment of custom PyTorch code.
sagemaker.mlcore::EstimatorBase
-> sagemaker.mlcore::Framework
-> PyTorch
.module
mimic python module
new()
This “Estimator“ executes a PyTorch script in a managed PyTorch execution environment, within a SageMaker Training Job. The managed PyTorch environment is an Amazon-built Docker container that executes functions defined in the supplied “entry_point“ Python script. Training is started by calling :meth:'~sagemaker.amazon.estimator.Framework.fit' on this Estimator. After training is complete, calling :meth:'~sagemaker.amazon.estimator.Framework.deploy' creates a hosted SageMaker endpoint and returns a :class:'~sagemaker.amazon.pytorch.model.PyTorchPredictor' instance that can be used to perform inference against the hosted model. Technical documentation on preparing PyTorch scripts for SageMaker training and using the PyTorch Estimator is available on the project home-page: https://github.com/aws/sagemaker-python-sdk
PyTorch$new( entry_point, framework_version = NULL, py_version = NULL, source_dir = NULL, hyperparameters = NULL, image_uri = NULL, distribution = NULL, ... )
entry_point
(str): Path (absolute or relative) to the Python source file which should be executed as the entry point to training. If “source_dir“ is specified, then “entry_point“ must point to a file located at the root of “source_dir“.
framework_version
(str): PyTorch version you want to use for executing your model training code. Defaults to “None“. Required unless “image_uri“ is provided. List of supported versions: https://github.com/aws/sagemaker-python-sdk#pytorch-sagemaker-estimators.
py_version
(str): Python version you want to use for executing your model training code. One of 'py2' or 'py3'. Defaults to “None“. Required unless “image_uri“ is provided.
source_dir
(str): Path (absolute, relative or an S3 URI) to a directory with any other training source code dependencies aside from the entry point file (default: None). If “source_dir“ is an S3 URI, it must point to a tar.gz file. Structure within this directory is preserved when training on Amazon SageMaker.
hyperparameters
(dict): Hyperparameters that will be used for training (default: None). The hyperparameters are made accessible as a dict[str, str] to the training code on SageMaker. For convenience, this accepts other types for keys and values, but “str()“ will be called to convert them before training.
image_uri
(str): If specified, the estimator will use this image for training and hosting, instead of selecting the appropriate SageMaker official image based on framework_version and py_version. It can be an ECR URL or a Docker Hub image and tag. Examples: * “123412341234.dkr.ecr.us-west-2.amazonaws.com/my-custom-image:1.0“ * “custom-image:latest“ If “framework_version“ or “py_version“ is “None“, then “image_uri“ is required. If “image_uri“ is also “None“, a “ValueError“ is raised.
distribution
(list): A dictionary with information on how to run distributed training (default: None). Currently, the following are supported: distributed training with parameter servers, SageMaker Distributed (SMD) Data and Model Parallelism, and MPI. SMD Model Parallelism can only be used with MPI. To enable parameter server use the following setup:
...
: Additional kwargs passed to the :class:'~sagemaker.estimator.Framework' constructor.
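The distribution argument is a nested structure keyed by strategy. The sketch below shows how such lists could look in R, assuming the R port reads the same keys as the SageMaker Python SDK ('parameter_server' and 'mpi', each with an 'enabled' flag); treat the key names as an assumption to check against the package source. `is_strategy_enabled` is a hypothetical helper.

```r
# Assumed layout, following the Python SDK's key names.
distribution_parameter_server <- list(
  parameter_server = list(enabled = TRUE)
)

distribution_mpi <- list(
  mpi = list(enabled = TRUE)
)

# Hypothetical helper: TRUE only if the strategy is present and enabled.
is_strategy_enabled <- function(distribution, strategy) {
  isTRUE(distribution[[strategy]]$enabled)
}

ps_enabled <- is_strategy_enabled(distribution_parameter_server, "parameter_server")
mpi_in_ps <- is_strategy_enabled(distribution_parameter_server, "mpi")

# Assumed usage (requires AWS access, so not run here):
# estimator <- PyTorch$new(entry_point = "train.py", distribution = distribution_parameter_server, ...)
```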
hyperparameters()
Return hyperparameters used by your custom PyTorch code during model training.
PyTorch$hyperparameters()
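The hyperparameters argument is documented as accepting arbitrary value types that are converted to strings before training. A minimal sketch of that conversion in R follows; `stringify_hyperparameters` and the example names are hypothetical, and the package may serialize values differently (for instance, R renders TRUE as "TRUE" where Python's str() gives "True").

```r
# Hypothetical helper: convert every hyperparameter value to a string.
stringify_hyperparameters <- function(hyperparameters) {
  stopifnot(is.list(hyperparameters), !is.null(names(hyperparameters)))
  lapply(hyperparameters, as.character)
}

hp <- stringify_hyperparameters(
  list(epochs = 10L, learning_rate = 0.01, use_cuda = TRUE)
)
```

Because the training script receives strings, it has to parse numeric and logical values back itself.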
create_model()
Create a SageMaker “PyTorchModel“ object that can be deployed to an “Endpoint“.
PyTorch$create_model( model_server_workers = NULL, role = NULL, vpc_config_override = "VPC_CONFIG_DEFAULT", entry_point = NULL, source_dir = NULL, dependencies = NULL, ... )
model_server_workers
(int): Optional. The number of worker processes used by the inference server. If None, server will use one worker per vCPU.
role
(str): The “ExecutionRoleArn“ IAM Role ARN for the “Model“, which is also used during transform jobs. If not specified, the role from the Estimator will be used.
vpc_config_override
(dict[str, list[str]]): Optional override for VpcConfig set on the model. Default: use subnets and security groups from this Estimator. * 'Subnets' (list[str]): List of subnet ids. * 'SecurityGroupIds' (list[str]): List of security group ids.
entry_point
(str): Path (absolute or relative) to the local Python source file which should be executed as the entry point to training. If “source_dir“ is specified, then “entry_point“ must point to a file located at the root of “source_dir“. If not specified, the training entry point is used.
source_dir
(str): Path (absolute or relative) to a directory with any other serving source code dependencies aside from the entry point file. If not specified, the model source directory from training is used.
dependencies
(list[str]): A list of paths to directories (absolute or relative) with any additional libraries that will be exported to the container. If not specified, the dependencies from training are used. This is not supported with "local code" in Local Mode.
...
: Additional kwargs passed to the :class:'~sagemaker.pytorch.model.PyTorchModel' constructor.
sagemaker.pytorch.model.PyTorchModel: A SageMaker “PyTorchModel“ object. See :func:'~sagemaker.pytorch.model.PyTorchModel' for full details.
clone()
The objects of this class are cloneable with this method.
PyTorch$clone(deep = FALSE)
deep
Whether to make a deep clone.
A PyTorch SageMaker “Model“ that can be deployed to a SageMaker “Endpoint“.
sagemaker.mlcore::ModelBase
-> sagemaker.mlcore::Model
-> sagemaker.mlcore::FrameworkModel
-> PyTorchModel
.LOWEST_MMS_VERSION
Lowest Multi Model Server PyTorch version that can be executed
new()
Initialize a PyTorchModel.
PyTorchModel$new( model_data, role, entry_point, framework_version = NULL, py_version = NULL, image_uri = NULL, predictor_cls = PyTorchPredictor, model_server_workers = NULL, ... )
model_data
(str): The S3 location of a SageMaker model data “.tar.gz“ file.
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
entry_point
(str): Path (absolute or relative) to the Python source file which should be executed as the entry point to model hosting. If “source_dir“ is specified, then “entry_point“ must point to a file located at the root of “source_dir“.
framework_version
(str): PyTorch version you want to use for executing your model training code. Defaults to None. Required unless “image_uri“ is provided.
py_version
(str): Python version you want to use for executing your model training code. Defaults to “None“. Required unless “image_uri“ is provided.
image_uri
(str): A Docker image URI (default: None). If not specified, a default image for PyTorch will be used. If “framework_version“ or “py_version“ is “None“, then “image_uri“ is required. If “image_uri“ is also “None“, a “ValueError“ is raised.
predictor_cls
(callable[str, sagemaker.session.Session]): A function to call to create a predictor with an endpoint name and SageMaker “Session“. If specified, “deploy()“ returns the result of invoking this function on the created endpoint name.
model_server_workers
(int): Optional. The number of worker processes used by the inference server. If None, server will use one worker per vCPU.
...
: Keyword arguments passed to the superclass :class:'~sagemaker.model.FrameworkModel' and, subsequently, its superclass :class:'~sagemaker.model.Model'.
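The rule linking “framework_version“, “py_version“ and “image_uri“ in the arguments above can be sketched as follows. This is a minimal Python illustration of the rule as documented, not the package's implementation; the helper name `check_version_args` and its return labels are hypothetical.

```python
def check_version_args(framework_version=None, py_version=None, image_uri=None):
    """Illustrate the documented rule: versions are required unless image_uri is set.

    Hypothetical helper, for illustration only.
    """
    if image_uri is not None:
        # A custom image was supplied, so the version arguments are optional.
        return "custom-image"
    if framework_version is None or py_version is None:
        # Nothing identifies an image to use, which the documentation
        # describes as a ValueError.
        raise ValueError(
            "framework_version and py_version are required when image_uri is None"
        )
    # Both versions are known, so a default framework image can be selected.
    return "default-image"
```

The version string passed in is illustrative; any non-None value satisfies the check in this sketch.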
register()
Creates a model package for creating SageMaker models or listing on Marketplace.
PyTorchModel$register( content_types, response_types, inference_instances, transform_instances, model_package_name = NULL, model_package_group_name = NULL, image_uri = NULL, model_metrics = NULL, metadata_properties = NULL, marketplace_cert = FALSE, approval_status = NULL, description = NULL, drift_check_baselines = NULL )
content_types
(list): The supported MIME types for the input data.
response_types
(list): The supported MIME types for the output data.
inference_instances
(list): A list of the instance types that are used to generate inferences in real-time.
transform_instances
(list): A list of the instance types on which a transformation job can be run or on which an endpoint can be deployed.
model_package_name
(str): Model Package name. Mutually exclusive with 'model_package_group_name'; using 'model_package_name' makes the Model Package un-versioned (default: None).
model_package_group_name
(str): Model Package Group name. Mutually exclusive with 'model_package_name'; using 'model_package_group_name' makes the Model Package versioned (default: None).
image_uri
(str): Inference image uri for the container. Model class' self.image will be used if it is None (default: None).
model_metrics
(ModelMetrics): ModelMetrics object (default: None).
metadata_properties
(MetadataProperties): MetadataProperties object (default: None).
marketplace_cert
(bool): A boolean value indicating if the Model Package is certified for AWS Marketplace (default: False).
approval_status
(str): Model Approval Status, values can be "Approved", "Rejected", or "PendingManualApproval" (default: "PendingManualApproval").
description
(str): Model Package description (default: None).
drift_check_baselines
(DriftCheckBaselines): DriftCheckBaselines object (default: None).
A 'sagemaker.model.ModelPackage' instance.
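The constraints stated for the naming and approval arguments of “register()“ can be checked before the call. The sketch below is Python and only restates the documented rules; `check_register_args` is a hypothetical helper, and raising a “ValueError“ for a conflict is an assumption rather than confirmed package behaviour.

```python
ALLOWED_APPROVAL_STATUSES = ("Approved", "Rejected", "PendingManualApproval")


def check_register_args(model_package_name=None,
                        model_package_group_name=None,
                        approval_status=None):
    """Validate the documented constraints (hypothetical helper)."""
    if model_package_name is not None and model_package_group_name is not None:
        raise ValueError(
            "model_package_name and model_package_group_name are mutually exclusive"
        )
    if approval_status is None:
        # Default stated in the argument description.
        approval_status = "PendingManualApproval"
    if approval_status not in ALLOWED_APPROVAL_STATUSES:
        raise ValueError("unsupported approval_status: %r" % (approval_status,))
    # A group name yields a versioned package. Treating the case where
    # neither name is given as un-versioned is an assumption of this sketch.
    versioned = model_package_group_name is not None
    return {"versioned": versioned, "approval_status": approval_status}
```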
prepare_container_def()
Return a container definition with framework configuration set in model environment variables.
PyTorchModel$prepare_container_def( instance_type = NULL, accelerator_type = NULL )
instance_type
(str): The EC2 instance type to deploy this Model to. For example, 'ml.p2.xlarge'.
accelerator_type
(str): The Elastic Inference accelerator type to deploy to the instance for loading and making inferences to the model.
dict[str, str]: A container definition object usable with the CreateModel API.
serving_image_uri()
Create a URI for the serving image.
PyTorchModel$serving_image_uri( region_name, instance_type, accelerator_type = NULL )
region_name
(str): AWS region where the image is uploaded.
instance_type
(str): SageMaker instance type. Used to determine device type (cpu/gpu/family-specific optimized).
accelerator_type
(str): The Elastic Inference accelerator type to deploy to the instance for loading and making inferences to the model.
str: The appropriate image URI based on the given parameters.
clone()
The objects of this class are cloneable with this method.
PyTorchModel$clone(deep = FALSE)
deep
Whether to make a deep clone.
This predictor can serialize Python lists, dictionaries, and numpy arrays to multidimensional tensors for PyTorch inference.
sagemaker.mlcore::PredictorBase
-> sagemaker.mlcore::Predictor
-> PyTorchPredictor
new()
Initialize a “PyTorchPredictor“.
PyTorchPredictor$new( endpoint_name, sagemaker_session = NULL, serializer = NumpySerializer$new(), deserializer = NumpyDeserializer$new() )
endpoint_name
(str): The name of the endpoint to perform inference on.
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the estimator creates one using the default AWS configuration chain.
serializer
(sagemaker.serializers.BaseSerializer): Optional. Default serializes input data to .npy format. Handles lists and numpy arrays.
deserializer
(sagemaker.deserializers.BaseDeserializer): Optional. Default parses the response from .npy format to numpy array.
clone()
The objects of this class are cloneable with this method.
PyTorchPredictor$clone(deep = FALSE)
deep
Whether to make a deep clone.
Handles Amazon SageMaker processing tasks for jobs using PyTorch containers.
sagemaker.common::Processor
-> sagemaker.common::ScriptProcessor
-> sagemaker.common::FrameworkProcessor
-> PyTorchProcessor
estimator_cls
Estimator object
new()
This processor executes a Python script in a PyTorch execution environment. Unless “image_uri“ is specified, the PyTorch environment is an Amazon-built Docker container that executes functions defined in the supplied “code“ Python script.
PyTorchProcessor$new( framework_version, role, instance_count, instance_type, py_version = "py3", image_uri = NULL, command = NULL, volume_size_in_gb = 30, volume_kms_key = NULL, output_kms_key = NULL, code_location = NULL, max_runtime_in_seconds = NULL, base_job_name = NULL, sagemaker_session = NULL, env = NULL, tags = NULL, network_config = NULL )
framework_version
(str): The version of the framework. Value is ignored when “image_uri“ is provided.
role
(str): An AWS IAM role name or ARN. Amazon SageMaker Processing uses this role to access AWS resources, such as data stored in Amazon S3.
instance_count
(int): The number of instances to run a processing job with.
instance_type
(str): The type of EC2 instance to use for processing, for example, 'ml.c4.xlarge'.
py_version
(str): Python version you want to use for executing your model training code. One of 'py2' or 'py3'. Defaults to 'py3'. Value is ignored when “image_uri“ is provided.
image_uri
(str): The URI of the Docker image to use for the processing jobs (default: None).
command
([str]): The command to run, along with any command-line flags to *precede* the “code“ script. Example: ["python3", "-v"]. If not provided, ["python"] will be chosen (default: None).
volume_size_in_gb
(int): Size in GB of the EBS volume to use for storing data during processing (default: 30).
volume_kms_key
(str): A KMS key for the processing volume (default: None).
output_kms_key
(str): The KMS key ID for processing job outputs (default: None).
code_location
(str): The S3 prefix URI where custom code will be uploaded (default: None). The code file uploaded to S3 is 'code_location/job-name/source/sourcedir.tar.gz'. If not specified, the default “code location“ is 's3://sagemaker-default-bucket'.
max_runtime_in_seconds
(int): Timeout in seconds (default: None). After this amount of time, Amazon SageMaker terminates the job, regardless of its current status. If 'max_runtime_in_seconds' is not specified, the default value is 24 hours.
base_job_name
(str): Prefix for processing name. If not specified, the processor generates a default job name, based on the processing image name and current timestamp (default: None).
sagemaker_session
(:class:'~sagemaker.session.Session'): Session object which manages interactions with Amazon SageMaker and any other AWS services needed. If not specified, the processor creates one using the default AWS configuration chain (default: None).
env
(dict[str, str]): Environment variables to be passed to the processing jobs (default: None).
tags
(list[dict]): List of tags to be passed to the processing job (default: None). For more, see https://docs.aws.amazon.com/sagemaker/latest/dg/API_Tag.html.
network_config
(:class:'~sagemaker.network.NetworkConfig'): A :class:'~sagemaker.network.NetworkConfig' object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets (default: None).
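Two of the arguments above describe simple string rules: where the packaged code is uploaded (“code_location“) and how “command“ combines with the code script. The following Python sketch restates both as documented. The helper names, the job name and the script name are hypothetical, and the real package may build these values differently.

```python
# Default prefix quoted in the code_location description.
DEFAULT_CODE_LOCATION = "s3://sagemaker-default-bucket"


def code_upload_uri(job_name, code_location=None):
    """Build 'code_location/job-name/source/sourcedir.tar.gz' (illustrative)."""
    base = (code_location or DEFAULT_CODE_LOCATION).rstrip("/")
    return "%s/%s/source/sourcedir.tar.gz" % (base, job_name)


def container_entrypoint(code, command=None):
    """Place the command and its flags before the code script."""
    # ["python"] is the documented fallback when no command is given.
    return list(command or ["python"]) + [code]


print(code_upload_uri("my-job"))
print(container_entrypoint("preprocess.py", ["python3", "-v"]))
```

The second call shows the flags from the documented example, ["python3", "-v"], preceding the script name.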
clone()
The objects of this class are cloneable with this method.
PyTorchProcessor$clone(deep = FALSE)
deep
Whether to make a deep clone.
Anomalies are observations that diverge from otherwise well-structured or patterned data. They can manifest as unexpected spikes in time series data, breaks in periodicity, or unclassifiable data points.
sagemaker.mlcore::EstimatorBase
-> sagemaker.mlcore::AmazonAlgorithmEstimatorBase
-> RandomCutForest
repo_name
sagemaker repo name for framework
repo_version
version of framework
MINI_BATCH_SIZE
The size of each mini-batch to use when training.
.module
mimic python module
eval_metrics
JSON list of metrics types to be used for reporting the score for the model
num_trees
The number of trees used in the forest.
num_samples_per_tree
The number of samples used to build each tree in the forest.
feature_dim
The number of features (columns) in the training data.
sagemaker.mlcore::EstimatorBase$latest_job_debugger_artifacts_path()
sagemaker.mlcore::EstimatorBase$latest_job_profiler_artifacts_path()
sagemaker.mlcore::EstimatorBase$latest_job_tensorboard_artifacts_path()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$hyperparameters()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$prepare_workflow_for_training()
sagemaker.mlcore::AmazonAlgorithmEstimatorBase$training_image_uri()
new()
An 'Estimator' class implementing a Random Cut Forest. Typically used for anomaly detection, this Estimator may be fit via calls to :meth:'~sagemaker.amazon.amazon_estimator.AmazonAlgorithmEstimatorBase.fit'. It requires Amazon :class:'~sagemaker.amazon.record_pb2.Record' protobuf serialized data to be stored in S3. The utility :meth:'~sagemaker.amazon.amazon_estimator.AmazonAlgorithmEstimatorBase.record_set' can be used to upload data to S3 and create a :class:'~sagemaker.amazon.amazon_estimator.RecordSet' to be passed to the 'fit' call. To learn more about the Amazon protobuf Record class and how to prepare bulk data in this format, please consult AWS technical documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/cdf-training.html
After this Estimator is fit, model data is stored in S3. The model may be deployed to an Amazon SageMaker Endpoint by invoking :meth:'~sagemaker.amazon.estimator.EstimatorBase.deploy'. As well as deploying an Endpoint, deploy returns a :class:'~sagemaker.amazon.randomcutforest.RandomCutForestPredictor' object that can be used for inference calls using the trained model hosted in the SageMaker Endpoint.
RandomCutForest Estimators can be configured by setting hyperparameters. The available hyperparameters for RandomCutForest are documented below. For further information on the AWS Random Cut Forest algorithm, please consult AWS technical documentation: https://docs.aws.amazon.com/sagemaker/latest/dg/randomcutforest.html
RandomCutForest$new( role, instance_count, instance_type, num_samples_per_tree = NULL, num_trees = NULL, eval_metrics = NULL, ... )
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
instance_count
(int): Number of Amazon EC2 instances to use for training.
instance_type
(str): Type of EC2 instance to use for training, for example, 'ml.c4.xlarge'.
num_samples_per_tree
(int): Optional. The number of samples used to build each tree in the forest. The total number of samples drawn from the train dataset is num_trees * num_samples_per_tree.
num_trees
(int): Optional. The number of trees used in the forest.
eval_metrics
(list): Optional. JSON list of metrics types to be used for reporting the score for the model. Allowed values are "accuracy" and "precision_recall_fscore" (positive and negative precision, recall, and F1 scores). If test data is provided, the score is reported in terms of all requested metrics.
...
: base class keyword argument values.
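The relationship stated for “num_samples_per_tree“, that the total number of samples drawn from the training data is num_trees * num_samples_per_tree, is easy to check when choosing values. A small Python sketch follows; the function name and the numbers are illustrative, not recommended settings.

```python
def total_samples_drawn(num_trees, num_samples_per_tree):
    """Total samples drawn from the training set, per the documented product."""
    if num_trees <= 0 or num_samples_per_tree <= 0:
        raise ValueError("both hyperparameters must be positive integers")
    return num_trees * num_samples_per_tree


# Illustrative values: 50 trees built from 200 samples each.
print(total_samples_drawn(50, 200))  # prints 10000
```

Comparing this product with the size of the training set gives a quick sense of how much of the data the forest actually sees.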
create_model()
Return a :class:'~sagemaker.amazon.RandomCutForestModel' referencing the latest s3 model data produced by this Estimator.
RandomCutForest$create_model(vpc_config_override = "VPC_CONFIG_DEFAULT", ...)
vpc_config_override
(dict[str, list[str]]): Optional override for VpcConfig set on the model. Default: use subnets and security groups from this Estimator. * 'Subnets' (list[str]): List of subnet ids. * 'SecurityGroupIds' (list[str]): List of security group ids.
...
: Additional kwargs passed to the RandomCutForestModel constructor.
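The shape expected for “vpc_config_override“ is a mapping with two keys, each holding a list of IDs. The Python sketch below shows that shape together with a small structural check. The IDs are placeholders and `validate_vpc_config` is a hypothetical helper, not part of the package.

```python
def validate_vpc_config(vpc_config):
    """Check that the mapping has exactly the two documented keys."""
    expected = {"Subnets", "SecurityGroupIds"}
    if set(vpc_config) != expected:
        raise ValueError("expected keys: %s" % sorted(expected))
    for key in sorted(expected):
        ids = vpc_config[key]
        # Each key maps to a list of ID strings.
        if not isinstance(ids, list) or not all(isinstance(i, str) for i in ids):
            raise ValueError("%s must be a list of strings" % key)
    return vpc_config


# Placeholder IDs, for illustration only.
vpc_config = validate_vpc_config({
    "Subnets": ["subnet-aaaa1111", "subnet-bbbb2222"],
    "SecurityGroupIds": ["sg-cccc3333"],
})
```

From R, the same structure would presumably be supplied as a named list with the elements 'Subnets' and 'SecurityGroupIds'.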
.prepare_for_training()
Set hyperparameters needed for training. This method will also validate “source_dir“.
RandomCutForest$.prepare_for_training( records, mini_batch_size = NULL, job_name = NULL )
records
(RecordSet): The records to train this Estimator on.
mini_batch_size
(int or None): The size of each mini-batch to use when training. If None, a default value will be used.
job_name
(str): Name of the training job to be created. If not specified, one is generated, using the base name given to the constructor if applicable.
clone()
The objects of this class are cloneable with this method.
RandomCutForest$clone(deep = FALSE)
deep
Whether to make a deep clone.
Calling :meth:'~sagemaker.model.Model.deploy' creates an Endpoint and returns a Predictor that calculates anomaly scores for datapoints.
sagemaker.mlcore::ModelBase
-> sagemaker.mlcore::Model
-> RandomCutForestModel
new()
Initialize RandomCutForestModel class
RandomCutForestModel$new(model_data, role, sagemaker_session = NULL, ...)
model_data
(str): The S3 location of a SageMaker model data “.tar.gz“ file.
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the estimator creates one using the default AWS configuration chain.
...
: Keyword arguments passed to the “Model“ initializer.
clone()
The objects of this class are cloneable with this method.
RandomCutForestModel$clone(deep = FALSE)
deep
Whether to make a deep clone.
The implementation of :meth:'~sagemaker.predictor.Predictor.predict' in this 'Predictor' requires a numpy “ndarray“ as input. The array should contain the same number of columns as the feature-dimension of the data used to fit the model this Predictor performs inference on.
sagemaker.mlcore::PredictorBase
-> sagemaker.mlcore::Predictor
-> RandomCutForestPredictor
new()
Initialize RandomCutForestPredictor class
RandomCutForestPredictor$new(endpoint_name, sagemaker_session = NULL)
endpoint_name
(str): Name of the Amazon SageMaker endpoint to which requests are sent.
sagemaker_session
(sagemaker.session.Session): A SageMaker Session object, used for SageMaker interactions (default: NULL). If not specified, one is created using the default AWS configuration chain.
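The input requirement described for this predictor, that every row has as many columns as the feature dimension used to fit the model, can be verified before sending a request. The Python sketch below uses nested lists in place of a numpy “ndarray“ to stay dependency-free; `check_feature_columns` is a hypothetical helper.

```python
def check_feature_columns(rows, feature_dim):
    """Raise if any row's length differs from the model's feature dimension."""
    for index, row in enumerate(rows):
        if len(row) != feature_dim:
            raise ValueError(
                "row %d has %d columns, expected %d" % (index, len(row), feature_dim)
            )
    # Number of data points that would be scored.
    return len(rows)


# Two data points with three features each (illustrative values).
print(check_feature_columns([[0.1, 0.2, 0.3], [1.5, 0.0, 2.2]], feature_dim=3))
```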
clone()
The objects of this class are cloneable with this method.
RandomCutForestPredictor$clone(deep = FALSE)
deep
Whether to make a deep clone.
Handle end-to-end training and deployment of custom RLEstimator code.
sagemaker.mlcore::EstimatorBase
-> sagemaker.mlcore::Framework
-> RLEstimator
COACH_LATEST_VERSION_TF
latest version of toolkit coach for tensorflow
COACH_LATEST_VERSION_MXNET
latest version of toolkit coach for mxnet
RAY_LATEST_VERSION
latest version of toolkit ray
.module
mimic python module
new()
Creates an RLEstimator for managed Reinforcement Learning (RL). It will execute an RLEstimator script within a SageMaker Training Job. The managed RL environment is an Amazon-built Docker container that executes functions defined in the supplied “entry_point“ Python script. Training is started by calling :meth:'~sagemaker.amazon.estimator.Framework.fit' on this Estimator. After training is complete, calling :meth:'~sagemaker.amazon.estimator.Framework.deploy' creates a hosted SageMaker endpoint and based on the specified framework returns an :class:'~sagemaker.amazon.mxnet.model.MXNetPredictor' or :class:'~sagemaker.amazon.tensorflow.model.TensorFlowPredictor' instance that can be used to perform inference against the hosted model. Technical documentation on preparing RLEstimator scripts for SageMaker training and using the RLEstimator is available on the project homepage: https://github.com/aws/sagemaker-python-sdk
RLEstimator$new( entry_point, toolkit = NULL, toolkit_version = NULL, framework = NULL, source_dir = NULL, hyperparameters = NULL, image_uri = NULL, metric_definitions = NULL, ... )
entry_point
(str): Path (absolute or relative) to the Python source file which should be executed as the entry point to training. If “source_dir“ is specified, then “entry_point“ must point to a file located at the root of “source_dir“.
toolkit
(sagemaker.rl.RLToolkit): RL toolkit you want to use for executing your model training code.
toolkit_version
(str): RL toolkit version you want to use for executing your model training code.
framework
(sagemaker.rl.RLFramework): Framework (MXNet or TensorFlow) you want to be used as a toolkit backend for reinforcement learning training.
source_dir
(str): Path (absolute, relative or an S3 URI) to a directory with any other training source code dependencies aside from the entry point file (default: NULL). If “source_dir“ is an S3 URI, it must point to a tar.gz file. The structure within this directory is preserved when training on Amazon SageMaker.
hyperparameters
(dict): Hyperparameters that will be used for training (default: NULL). The hyperparameters are made accessible as a dict[str, str] to the training code on SageMaker. For convenience, this accepts other types for keys and values.
image_uri
(str): An ECR url. If specified, the estimator will use this image for training and hosting, instead of selecting the appropriate SageMaker official image based on framework_version and py_version. Example: 123.dkr.ecr.us-west-2.amazonaws.com/my-custom-image:1.0
metric_definitions
(list[dict]): A list of dictionaries that defines the metric(s) used to evaluate the training jobs. Each dictionary contains two keys: 'Name' for the name of the metric, and 'Regex' for the regular expression used to extract the metric from the logs. This should be defined only for jobs that don't use an Amazon algorithm.
...
: Additional kwargs passed to the :class:'~sagemaker.estimator.Framework' constructor. Tip: You can find additional parameters for initializing this class at :class:'~sagemaker.estimator.Framework' and :class:'~sagemaker.estimator.EstimatorBase'.
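Each “metric_definitions“ entry pairs a 'Name' with a 'Regex' that is applied to the training logs. The Python sketch below shows how such a pair could pull values out of log text. The metric names, the log format and the helper `extract_metrics` are all hypothetical; SageMaker performs the actual extraction on its side.

```python
import re

# Hypothetical definitions: one capture group per metric value.
metric_definitions = [
    {"Name": "episode_reward_mean", "Regex": r"episode_reward_mean: ([-+]?\d+\.?\d*)"},
    {"Name": "episode_length", "Regex": r"episode_length: (\d+)"},
]


def extract_metrics(log_text, definitions):
    """Apply each Regex to the log text and collect the captured numbers."""
    found = {}
    for definition in definitions:
        values = [float(v) for v in re.findall(definition["Regex"], log_text)]
        if values:
            found[definition["Name"]] = values
    return found


# Made-up log lines in the format the regular expressions expect.
log_text = (
    "iter 1 episode_reward_mean: -12.5 episode_length: 200\n"
    "iter 2 episode_reward_mean: 3.0 episode_length: 180\n"
)
print(extract_metrics(log_text, metric_definitions))
```

Testing a regular expression against sample log lines in this way, before starting a job, helps avoid metrics that silently never match.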
create_model()
Create a SageMaker “RLEstimatorModel“ object that can be deployed to an Endpoint.
RLEstimator$create_model( role = NULL, vpc_config_override = "VPC_CONFIG_DEFAULT", entry_point = NULL, source_dir = NULL, dependencies = NULL, ... )
role
(str): The “ExecutionRoleArn“ IAM Role ARN for the “Model“, which is also used during transform jobs. If not specified, the role from the Estimator will be used.
vpc_config_override
(dict[str, list[str]]): Optional override for VpcConfig set on the model. Default: use subnets and security groups from this Estimator. * 'Subnets' (list[str]): List of subnet ids. * 'SecurityGroupIds' (list[str]): List of security group ids.
entry_point
(str): Path (absolute or relative) to the Python source file which should be executed as the entry point for MXNet hosting (default: self.entry_point). If “source_dir“ is specified, then “entry_point“ must point to a file located at the root of “source_dir“.
source_dir
(str): Path (absolute or relative) to a directory with any other training source code dependencies aside from the entry point file (default: self.source_dir). The structure within this directory is preserved when hosting on Amazon SageMaker.
dependencies
(list[str]): A list of paths to directories (absolute or relative) with any additional libraries that will be exported to the container (default: self.dependencies). The library folders will be copied to SageMaker in the same folder where the entry_point is copied. If “source_dir“ points to S3, code will be uploaded and the S3 location will be used instead. This is not supported with "local code" in Local Mode.
...
: Additional kwargs passed to the :class:'~sagemaker.model.FrameworkModel' constructor.
sagemaker.model.FrameworkModel: Depending on input parameters, returns one of the following:
* :class:'~sagemaker.model.FrameworkModel' if “image_uri“ is specified on the estimator;
* :class:'~sagemaker.mxnet.MXNetModel' if “image_uri“ isn't specified and MXNet is used as the RL backend;
* :class:'~sagemaker.tensorflow.model.TensorFlowModel' if “image_uri“ isn't specified and TensorFlow is used as the RL backend.
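The selection rule in the return value above reduces to a short decision chain. The Python sketch below returns class names as strings purely for illustration; `select_model_class` and the lowercase framework labels are hypothetical and do not reflect how the package dispatches internally.

```python
def select_model_class(image_uri=None, framework=None):
    """Name the model class implied by the documented rule (illustrative)."""
    if image_uri is not None:
        # A custom image takes precedence over the RL backend.
        return "FrameworkModel"
    if framework == "mxnet":
        return "MXNetModel"
    if framework == "tensorflow":
        return "TensorFlowModel"
    raise ValueError("no image_uri given and unsupported framework: %r" % (framework,))


print(select_model_class(framework="tensorflow"))  # prints TensorFlowModel
```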
training_image_uri()
Return the Docker image to use for training. The :meth:'~sagemaker.estimator.EstimatorBase.fit' method, which does the model training, calls this method to find the image to use for model training.
RLEstimator$training_image_uri()
str: The URI of the Docker image.
hyperparameters()
Return hyperparameters used by your custom RL code during model training.
RLEstimator$hyperparameters()
default_metric_definitions()
Provides default metric definitions based on provided toolkit.
RLEstimator$default_metric_definitions(toolkit)
toolkit
(sagemaker.rl.RLToolkit): RL Toolkit to be used for training.
list: metric definitions
clone()
The objects of this class are cloneable with this method.
RLEstimator$clone(deep = FALSE)
deep
Whether to make a deep clone.
Framework (MXNet, TensorFlow or PyTorch) you want to be used as a toolkit backend for reinforcement learning training.
RLFramework
RLFramework
An object of class Enum (inherits from environment) of length 3.
environment containing [TENSORFLOW, MXNET, PYTORCH]
RL toolkit you want to use for executing your model training code.
RLToolkit
RLToolkit
An object of class Enum (inherits from environment) of length 2.
environment containing [COACH, RAY]
Handle end-to-end training and deployment of custom Scikit-learn code.
sagemaker.mlcore::EstimatorBase
-> sagemaker.mlcore::Framework
-> SKLearn
.module
mimic python module
new()
This “Estimator“ executes a Scikit-learn script in a managed Scikit-learn execution environment, within a SageMaker Training Job. The managed Scikit-learn environment is an Amazon-built Docker container that executes functions defined in the supplied “entry_point“ Python script. Training is started by calling :meth:'~sagemaker.amazon.estimator.Framework.fit' on this Estimator. After training is complete, calling :meth:'~sagemaker.amazon.estimator.Framework.deploy' creates a hosted SageMaker endpoint and returns an :class:'~sagemaker.amazon.sklearn.model.SKLearnPredictor' instance that can be used to perform inference against the hosted model. Technical documentation on preparing Scikit-learn scripts for SageMaker training and using the Scikit-learn Estimator is available on the project homepage: https://github.com/aws/sagemaker-python-sdk
SKLearn$new( entry_point, framework_version = NULL, py_version = "py3", source_dir = NULL, hyperparameters = NULL, image_uri = NULL, ... )
entry_point
(str): Path (absolute or relative) to the Python source file which should be executed as the entry point to training. If “source_dir“ is specified, then “entry_point“ must point to a file located at the root of “source_dir“.
framework_version
(str): Scikit-learn version you want to use for executing your model training code. Defaults to “None“. Required unless “image_uri“ is provided. List of supported versions: https://github.com/aws/sagemaker-python-sdk#sklearn-sagemaker-estimators
py_version
(str): Python version you want to use for executing your model training code (default: 'py3'). Currently, 'py3' is the only supported version. If “None“ is passed in, “image_uri“ must be provided.
source_dir
(str): Path (absolute, relative or an S3 URI) to a directory with any other training source code dependencies aside from the entry point file (default: None). If “source_dir“ is an S3 URI, it must point to a tar.gz file. The structure within this directory is preserved when training on Amazon SageMaker.
hyperparameters
(dict): Hyperparameters that will be used for training (default: None). The hyperparameters are made accessible as a dict[str, str] to the training code on SageMaker. For convenience, this accepts other types for keys and values, but “str()“ will be called to convert them before training.
image_uri
(str): If specified, the estimator will use this image for training and hosting, instead of selecting the appropriate SageMaker official image based on framework_version and py_version. It can be an ECR url or dockerhub image and tag. Examples: 123.dkr.ecr.us-west-2.amazonaws.com/my-custom-image:1.0, custom-image:latest. If “framework_version“ or “py_version“ is “None“, then “image_uri“ is required. If “image_uri“ is also “None“, a “ValueError“ is raised.
...
: Additional kwargs passed to the :class:'~sagemaker.estimator.Framework' constructor.
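The “hyperparameters“ description says keys and values reach the training code as strings, converted with “str()“. The Python sketch below follows that wording to show why the entry-point script has to parse values back into numbers. The hyperparameter names are illustrative and the package's actual encoding may differ in detail.

```python
def stringify_hyperparameters(hyperparameters):
    """Convert keys and values with str(), as the description states."""
    return {str(key): str(value) for key, value in hyperparameters.items()}


# What the caller passes (illustrative names and values).
passed = {"n_estimators": 100, "max_depth": 5, "learning_rate": 0.1}

# What the training script would see under this description.
received = stringify_hyperparameters(passed)

# Inside the entry-point script, values must be parsed back explicitly.
n_estimators = int(received["n_estimators"])
learning_rate = float(received["learning_rate"])
print(received, n_estimators, learning_rate)
```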
create_model()
Create a SageMaker “SKLearnModel“ object that can be deployed to an “Endpoint“.
SKLearn$create_model( model_server_workers = NULL, role = NULL, vpc_config_override = "VPC_CONFIG_DEFAULT", entry_point = NULL, source_dir = NULL, dependencies = NULL, ... )
model_server_workers
(int): Optional. The number of worker processes used by the inference server. If None, server will use one worker per vCPU.
role
(str): The “ExecutionRoleArn“ IAM Role ARN for the “Model“, which is also used during transform jobs. If not specified, the role from the Estimator will be used.
vpc_config_override
(dict[str, list[str]]): Optional override for VpcConfig set on the model. Default: use subnets and security groups from this Estimator. * 'Subnets' (list[str]): List of subnet ids. * 'SecurityGroupIds' (list[str]): List of security group ids.
entry_point
(str): Path (absolute or relative) to the local Python source file which should be executed as the entry point to model hosting. If “source_dir“ is specified, then “entry_point“ must point to a file located at the root of “source_dir“. If not specified, the training entry point is used.
source_dir
(str): Path (absolute or relative) to a directory with any other serving source code dependencies aside from the entry point file. If not specified, the model source directory from training is used.
dependencies
(list[str]): A list of paths to directories (absolute or relative) with any additional libraries that will be exported to the container. If not specified, the dependencies from training are used. This is not supported with "local code" in Local Mode.
...
: Additional kwargs passed to the :class:'~sagemaker.sklearn.model.SKLearnModel' constructor.
sagemaker.sklearn.model.SKLearnModel: A SageMaker “SKLearnModel“ object. See :func:'~sagemaker.sklearn.model.SKLearnModel' for full details.
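The “model_server_workers“ default, one worker per vCPU when the argument is not given, can be written as a one-line fallback. The Python sketch below makes the rule explicit; `effective_worker_count` and its `vcpu_count` argument are hypothetical, since the real count is determined by the inference server on the hosting instance.

```python
def effective_worker_count(model_server_workers=None, vcpu_count=1):
    """Return the worker count implied by the documented default."""
    if model_server_workers is None:
        # Documented fallback: one worker per vCPU.
        return vcpu_count
    if model_server_workers < 1:
        raise ValueError("model_server_workers must be at least 1")
    return model_server_workers


print(effective_worker_count(vcpu_count=4))     # prints 4
print(effective_worker_count(2, vcpu_count=4))  # prints 2
```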
clone()
The objects of this class are cloneable with this method.
SKLearn$clone(deep = FALSE)
deep
Whether to make a deep clone.
A Scikit-learn SageMaker “Model“ that can be deployed to a SageMaker “Endpoint“.
sagemaker.mlcore::ModelBase
-> sagemaker.mlcore::Model
-> sagemaker.mlcore::FrameworkModel
-> SKLearnModel
new()
Initialize an SKLearnModel.
SKLearnModel$new( model_data, role, entry_point, framework_version = NULL, py_version = "py3", image_uri = NULL, predictor_cls = SKLearnPredictor, model_server_workers = NULL, ... )
model_data
(str): The S3 location of a SageMaker model data “.tar.gz“ file.
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
entry_point
(str): Path (absolute or relative) to the Python source file which should be executed as the entry point to model hosting. If “source_dir“ is specified, then “entry_point“ must point to a file located at the root of “source_dir“.
framework_version
(str): Scikit-learn version you want to use for executing your model training code. Defaults to “None“. Required unless “image_uri“ is provided.
py_version
(str): Python version you want to use for executing your model training code (default: 'py3'). Currently, 'py3' is the only supported version. If “None“ is passed in, “image_uri“ must be provided.
image_uri
(str): A Docker image URI (default: None). If not specified, a default image for Scikit-learn will be used. If “framework_version“ or “py_version“ is “None“, then “image_uri“ is required. If “image_uri“ is also “None“, a “ValueError“ is raised.
predictor_cls
(callable[str, sagemaker.session.Session]): A function to call to create a predictor with an endpoint name and SageMaker “Session“. If specified, “deploy()“ returns the result of invoking this function on the created endpoint name.
model_server_workers
(int): Optional. The number of worker processes used by the inference server. If None, server will use one worker per vCPU.
...
: Keyword arguments passed to the “FrameworkModel“ initializer.
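The “entry_point“ argument above must name a file at the root of “source_dir“ when a source directory is given. The Python sketch below checks that constraint on the path strings alone, without touching the file system. `is_at_source_root` is a hypothetical helper, and treating a bare file name as relative to “source_dir“ is an assumption of this sketch.

```python
from pathlib import PurePosixPath


def is_at_source_root(entry_point, source_dir):
    """Return True if entry_point names a file directly under source_dir."""
    entry = PurePosixPath(entry_point)
    if len(entry.parts) == 1:
        # A bare file name is interpreted relative to source_dir (assumption).
        return True
    # Otherwise the file's parent directory must be source_dir itself.
    return entry.parent == PurePosixPath(source_dir)


print(is_at_source_root("inference.py", "src"))            # prints True
print(is_at_source_root("src/utils/inference.py", "src"))  # prints False
```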
prepare_container_def()
Return a container definition with framework configuration set in model environment variables.
SKLearnModel$prepare_container_def( instance_type = NULL, accelerator_type = NULL )
instance_type
(str): The EC2 instance type to deploy this Model to. This parameter is unused because Scikit-learn supports only CPU.
accelerator_type
(str): The Elastic Inference accelerator type to deploy to the instance for loading and making inferences to the model. This parameter is unused because accelerator types are not supported by SKLearnModel.
dict[str, str]: A container definition object usable with the CreateModel API.
serving_image_uri()
Create a URI for the serving image.
SKLearnModel$serving_image_uri(region_name, instance_type)
region_name
(str): AWS region where the image is uploaded.
instance_type
(str): SageMaker instance type.
str: The appropriate image URI based on the given parameters.
clone()
The objects of this class are cloneable with this method.
SKLearnModel$clone(deep = FALSE)
deep
Whether to make a deep clone.
Predictor for inference against Scikit-learn endpoints. It can serialize Python lists, dictionaries, and numpy arrays to multidimensional tensors for Scikit-learn inference.
sagemaker.mlcore::PredictorBase
-> sagemaker.mlcore::Predictor
-> SKLearnPredictor
new()
Initialize an “SKLearnPredictor“.
SKLearnPredictor$new( endpoint_name, sagemaker_session = NULL, serializer = NumpySerializer$new(), deserializer = NumpyDeserializer$new() )
endpoint_name
(str): The name of the endpoint to perform inference on.
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the estimator creates one using the default AWS configuration chain.
serializer
(sagemaker.serializers.BaseSerializer): Optional. Default serializes input data to .npy format. Handles lists and numpy arrays.
deserializer
(sagemaker.deserializers.BaseDeserializer): Optional. Default parses the response from .npy format to numpy array.
clone()
The objects of this class are cloneable with this method.
SKLearnPredictor$clone(deep = FALSE)
deep
Whether to make a deep clone.
Handles Amazon SageMaker processing tasks for jobs using scikit-learn.
sagemaker.common::Processor
-> sagemaker.common::ScriptProcessor
-> sagemaker.common::FrameworkProcessor
-> SKLearnProcessor
estimator_cls
Estimator object
new()
Initialize an “SKLearnProcessor“ instance. The SKLearnProcessor handles Amazon SageMaker processing tasks for jobs using scikit-learn.
SKLearnProcessor$new( framework_version, role, instance_type, instance_count, py_version = "py3", image_uri = NULL, command = NULL, volume_size_in_gb = 30, volume_kms_key = NULL, output_kms_key = NULL, code_location = NULL, max_runtime_in_seconds = NULL, base_job_name = NULL, sagemaker_session = NULL, env = NULL, tags = NULL, network_config = NULL )
framework_version
(str): The version of the framework. Value is ignored when “image_uri“ is provided.
role
(str): An AWS IAM role name or ARN. Amazon SageMaker Processing uses this role to access AWS resources, such as data stored in Amazon S3.
instance_type
(str): The type of EC2 instance to use for processing, for example, 'ml.c4.xlarge'.
instance_count
(int): The number of instances to run a processing job with.
py_version
(str): Python version you want to use for executing your model training code. One of 'py2' or 'py3'. Defaults to 'py3'. Value is ignored when “image_uri“ is provided.
image_uri
(str): The URI of the Docker image to use for the processing jobs (default: None).
command
([str]): The command to run, along with any command-line flags to *precede* the “code script“. Example: ["python3", "-v"]. If not provided, ["python"] will be chosen (default: None).
volume_size_in_gb
(int): Size in GB of the EBS volume to use for storing data during processing (default: 30).
volume_kms_key
(str): A KMS key for the processing volume (default: None).
output_kms_key
(str): The KMS key ID for processing job outputs (default: None).
code_location
(str): The S3 prefix URI where custom code will be uploaded (default: None). The code file uploaded to S3 is 'code_location/job-name/source/sourcedir.tar.gz'. If not specified, the default “code location“ is 's3://sagemaker-default-bucket'.
max_runtime_in_seconds
(int): Timeout in seconds (default: None). After this amount of time, Amazon SageMaker terminates the job, regardless of its current status. If 'max_runtime_in_seconds' is not specified, the default value is 24 hours.
base_job_name
(str): Prefix for processing name. If not specified, the processor generates a default job name, based on the processing image name and current timestamp (default: None).
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker and any other AWS services needed. If not specified, the processor creates one using the default AWS configuration chain (default: None).
env
(dict[str, str]): Environment variables to be passed to the processing jobs (default: None).
tags
(list[dict]): List of tags to be passed to the processing job (default: None). For more, see https://docs.aws.amazon.com/sagemaker/latest/dg/API_Tag.html.
network_config
(sagemaker.network.NetworkConfig): A NetworkConfig object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets (default: None).
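The upload layout described for “code_location“ and the 24-hour default for “max_runtime_in_seconds“ can be made concrete with a few lines of base R. This is a sketch under assumptions: the helper name code_upload_uri is hypothetical, and the bucket and job name are placeholders, not values produced by the package.

```r
# Hypothetical helper: composes the documented upload key
# 'code_location/job-name/source/sourcedir.tar.gz'.
code_upload_uri <- function(code_location, job_name) {
  # Drop trailing slashes so the pieces join with exactly one '/'.
  prefix <- sub("/+$", "", code_location)
  paste(prefix, job_name, "source", "sourcedir.tar.gz", sep = "/")
}

uri <- code_upload_uri("s3://my-bucket/processing/", "sklearn-job-001")
print(uri)

# The documented default timeout of 24 hours, expressed in seconds.
default_max_runtime <- 24 * 60 * 60
print(default_max_runtime)
```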
clone()
The objects of this class are cloneable with this method.
SKLearnProcessor$clone(deep = FALSE)
deep
Whether to make a deep clone.
Handles Amazon SageMaker processing tasks for jobs using Spark with Java or Scala Jars.
sagemaker.common::Processor
-> sagemaker.common::ScriptProcessor
-> sagemaker.mlframework::.SparkProcessorBase
-> SparkJarProcessor
new()
Initialize a “SparkJarProcessor“ instance. The SparkJarProcessor handles Amazon SageMaker processing tasks for jobs using SageMaker Spark.
SparkJarProcessor$new( role, instance_type, instance_count, framework_version = NULL, py_version = NULL, container_version = NULL, image_uri = NULL, volume_size_in_gb = 30, volume_kms_key = NULL, output_kms_key = NULL, max_runtime_in_seconds = NULL, base_job_name = NULL, sagemaker_session = NULL, env = NULL, tags = NULL, network_config = NULL )
role
(str): An AWS IAM role name or ARN. The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
instance_type
(str): Type of EC2 instance to use for processing, for example, 'ml.c4.xlarge'.
instance_count
(int): The number of instances to run the Processing job with. Defaults to 1.
framework_version
(str): The version of SageMaker PySpark.
py_version
(str): The version of python.
container_version
(str): The version of spark container.
image_uri
(str): The container image to use for processing.
volume_size_in_gb
(int): Size in GB of the EBS volume to use for storing data during processing (default: 30).
volume_kms_key
(str): A KMS key for the processing volume.
output_kms_key
(str): The KMS key id for all ProcessingOutputs.
max_runtime_in_seconds
(int): Timeout in seconds. After this amount of time Amazon SageMaker terminates the job regardless of its current status.
base_job_name
(str): Prefix for processing name. If not specified, the processor generates a default job name, based on the processing image name and current timestamp.
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the processor creates one using the default AWS configuration chain.
env
(dict): Environment variables to be passed to the processing job.
tags
([dict]): List of tags to be passed to the processing job.
network_config
(sagemaker.network.NetworkConfig): A NetworkConfig object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets.
get_run_args()
Build the normalized inputs, outputs, and arguments needed when using a “SparkJarProcessor“ in a sagemaker.workflow.steps.ProcessingStep.
SparkJarProcessor$get_run_args( submit_app, submit_class = NULL, submit_jars = NULL, submit_files = NULL, inputs = NULL, outputs = NULL, arguments = NULL, job_name = NULL, configuration = NULL, spark_event_logs_s3_uri = NULL )
submit_app
(str): Path (local or S3) to the Jar file to submit to Spark as the primary application. This is translated to the 'code' property on the returned 'RunArgs' object.
submit_class
(str): Java class reference to submit to Spark as the primary application
submit_jars
(list[str]): List of paths (local or S3) to provide for the 'spark-submit --jars' option.
submit_files
(list[str]): List of paths (local or S3) to provide for the 'spark-submit --files' option.
inputs
(list[sagemaker.processing.ProcessingInput]): Input files for the processing job. These must be provided as sagemaker.processing.ProcessingInput objects (default: None).
outputs
(list[sagemaker.processing.ProcessingOutput]): Outputs for the processing job. These can be specified as either path strings or sagemaker.processing.ProcessingOutput objects (default: None).
arguments
(list[str]): A list of string arguments to be passed to a processing job (default: None).
job_name
(str): Processing job name. If not specified, the processor generates a default job name, based on the base job name and current timestamp.
configuration
(list[dict] or dict): Configuration for Hadoop, Spark, or Hive. List or dictionary of EMR-style classifications. https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-configure-apps.html
spark_event_logs_s3_uri
(str): S3 path where spark application events will be published to.
Returns a RunArgs object.
run()
Runs a processing job.
SparkJarProcessor$run( submit_app, submit_class = NULL, submit_jars = NULL, submit_files = NULL, inputs = NULL, outputs = NULL, arguments = NULL, wait = TRUE, logs = TRUE, job_name = NULL, experiment_config = NULL, configuration = NULL, spark_event_logs_s3_uri = NULL, kms_key = NULL )
submit_app
(str): Path (local or S3) to Jar file to submit to Spark as the primary application
submit_class
(str): Java class reference to submit to Spark as the primary application
submit_jars
(list[str]): List of paths (local or S3) to provide for the 'spark-submit --jars' option.
submit_files
(list[str]): List of paths (local or S3) to provide for the 'spark-submit --files' option.
inputs
(list[sagemaker.processing.ProcessingInput]): Input files for the processing job. These must be provided as sagemaker.processing.ProcessingInput objects (default: None).
outputs
(list[sagemaker.processing.ProcessingOutput]): Outputs for the processing job. These can be specified as either path strings or sagemaker.processing.ProcessingOutput objects (default: None).
arguments
(list[str]): A list of string arguments to be passed to a processing job (default: None).
wait
(bool): Whether the call should wait until the job completes (default: True).
logs
(bool): Whether to show the logs produced by the job. Only meaningful when wait is True (default: True).
job_name
(str): Processing job name. If not specified, the processor generates a default job name, based on the base job name and current timestamp.
experiment_config
(dict[str, str]): Experiment management configuration. The dictionary contains three optional keys: 'ExperimentName', 'TrialName', and 'TrialComponentDisplayName'.
configuration
(list[dict] or dict): Configuration for Hadoop, Spark, or Hive. List or dictionary of EMR-style classifications. https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-configure-apps.html
spark_event_logs_s3_uri
(str): S3 path where spark application events will be published to.
kms_key
(str): The ARN of the KMS key that is used to encrypt the user code file (default: None).
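The arguments to “run()“ that take nested structures can be illustrated as plain R lists. This is a sketch only. It assumes that the package accepts named R lists where the text says dict. 'spark-defaults' is an EMR classification name, but the property value, S3 paths, class name, and experiment names are placeholders.

```r
# EMR-style classification list: each entry names a Classification and
# its Properties. The property value here is only a placeholder.
configuration <- list(
  list(
    Classification = "spark-defaults",
    Properties = list("spark.executor.memory" = "2g")
  )
)

# The three optional experiment keys named above (placeholder values).
experiment_config <- list(
  ExperimentName = "my-experiment",
  TrialName = "my-trial",
  TrialComponentDisplayName = "spark-jar-step"
)

# Arguments for SparkJarProcessor$run(); paths and class name are hypothetical.
run_args <- list(
  submit_app = "s3://my-bucket/jars/app.jar",
  submit_class = "com.example.Main",
  arguments = list("--input", "s3://my-bucket/input"),
  configuration = configuration,
  experiment_config = experiment_config,
  wait = TRUE,
  logs = TRUE
)
str(run_args, max.level = 1)
```

With a configured processor object in hand, the list could be passed on with do.call(processor$run, run_args).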
clone()
The objects of this class are cloneable with this method.
SparkJarProcessor$clone(deep = FALSE)
deep
Whether to make a deep clone.
Model data and S3 location holder for an MLeap-serialized SparkML model. Calling sagemaker.model.Model.deploy creates an Endpoint and returns a Predictor that performs predictions against an MLeap-serialized SparkML model.
sagemaker.mlcore::ModelBase
-> sagemaker.mlcore::Model
-> SparkMLModel
new()
Initialize a SparkMLModel.
SparkMLModel$new( model_data, role = NULL, spark_version = "2.4", sagemaker_session = NULL, ... )
model_data
(str): The S3 location of a SageMaker model data “.tar.gz“ file. For SparkML, this will be the output that has been produced by the Spark job after serializing the Model via MLeap.
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
spark_version
(str): Spark version you want to use for executing the inference (default: '2.4').
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the estimator creates one using the default AWS configuration chain. For local mode, please do not pass this variable.
...
: Additional parameters passed to the sagemaker.model.Model constructor.
clone()
The objects of this class are cloneable with this method.
SparkMLModel$clone(deep = FALSE)
deep
Whether to make a deep clone.
The implementation of sagemaker.predictor.Predictor.predict in this Predictor requires JSON as input. The input should follow the JSON format as documented. “predict()“ returns CSV output, comma separated if the output is a list.
sagemaker.mlcore::PredictorBase
-> sagemaker.mlcore::Predictor
-> SparkMLPredictor
new()
Initializes a SparkMLPredictor, which should be used with SparkMLModel to perform predictions against SparkML models serialized via MLeap. The response is returned in text/csv format, which is the default response format for the SparkML Serving container.
SparkMLPredictor$new( endpoint_name, sagemaker_session = NULL, serializer = CSVSerializer$new(), ... )
endpoint_name
(str): The name of the endpoint to perform inference on.
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the estimator creates one using the default AWS configuration chain.
serializer
(sagemaker.serializers.BaseSerializer): Optional. Default serializes input data to text/csv.
...
: Additional parameters passed to the sagemaker.Predictor constructor.
clone()
The objects of this class are cloneable with this method.
SparkMLPredictor$clone(deep = FALSE)
deep
Whether to make a deep clone.
Handle end-to-end training and deployment of user-provided TensorFlow code.
sagemaker.mlcore::EstimatorBase
-> sagemaker.mlcore::Framework
-> TensorFlow
.module
mimic python module
new()
Initialize a “TensorFlow“ estimator.
TensorFlow$new( py_version = NULL, framework_version = NULL, model_dir = NULL, image_uri = NULL, distribution = NULL, ... )
py_version
(str): Python version you want to use for executing your model training code. Defaults to “None“. Required unless “image_uri“ is provided.
framework_version
(str): TensorFlow version you want to use for executing your model training code. Defaults to “None“. Required unless “image_uri“ is provided. List of supported versions: https://github.com/aws/sagemaker-python-sdk#tensorflow-sagemaker-estimators.
model_dir
(str): S3 location where the checkpoint data and models can be exported to during training (default: None). It will be passed to the training script as one of the command line arguments. If not specified, one is provided based on your training configuration: “/opt/ml/model“ for distributed training with SMDistributed or MPI with Horovod; “s3://output_path/model“ for single-machine training or distributed training without MPI; “/opt/ml/shared/model“ for Local Mode with local sources (file:// instead of s3://). To disable having “model_dir“ passed to your training script, set “model_dir=False“.
image_uri
(str): If specified, the estimator will use this image for training and hosting, instead of selecting the appropriate SageMaker official image based on framework_version and py_version. It can be an ECR url or dockerhub image and tag. Examples: '123.dkr.ecr.us-west-2.amazonaws.com/my-custom-image:1.0' and 'custom-image:latest'. If “framework_version“ or “py_version“ are “None“, then “image_uri“ is required. If also “None“, then a “ValueError“ will be raised.
distribution
(dict): A dictionary with information on how to run distributed training (default: None). Currently, the following are supported: distributed training with parameter servers, SageMaker Distributed (SMD) Data and Model Parallelism, and MPI. SMD Model Parallelism can only be used with MPI. To enable parameter server, use {"parameter_server": {"enabled": True}}. To enable MPI, use {"mpi": {"enabled": True}}. To enable SMDistributed Data Parallel or Model Parallel, use {"smdistributed": {"dataparallel": {"enabled": True}, "modelparallel": {"enabled": True, "parameters": {}}}}.
...
: Additional kwargs passed to the Framework constructor.
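The “distribution“ settings described above are written in Python dict notation. A possible R equivalent is sketched below. It assumes that nested named lists are how this package expects dict-like arguments, which the text does not state explicitly, and the variable names are illustrative only.

```r
# Parameter-server training.
distribution_ps <- list(parameter_server = list(enabled = TRUE))

# MPI (for example, with Horovod).
distribution_mpi <- list(mpi = list(enabled = TRUE))

# SageMaker Distributed data parallelism.
distribution_smd_data <- list(
  smdistributed = list(dataparallel = list(enabled = TRUE))
)

# SMD model parallelism can only be used with MPI, so both keys are set.
distribution_smd_model <- list(
  smdistributed = list(modelparallel = list(enabled = TRUE)),
  mpi = list(enabled = TRUE)
)

str(distribution_smd_model)
```

One of these lists would then be supplied as the “distribution“ argument of TensorFlow$new().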
create_model()
Create a “TensorFlowModel“ object that can be used for creating SageMaker model entities, deploying to a SageMaker endpoint, or starting SageMaker Batch Transform jobs.
TensorFlow$create_model( role = NULL, vpc_config_override = "VPC_CONFIG_DEFAULT", entry_point = NULL, source_dir = NULL, dependencies = NULL, ... )
role
(str): The IAM Role ARN for the “TensorFlowModel“, which is also used during transform jobs. If not specified, the role from the Estimator is used.
vpc_config_override
(dict[str, list[str]]): Optional override for VpcConfig set on the model. Default: use subnets and security groups from this Estimator. Keys: 'Subnets' (list[str]), a list of subnet ids; 'SecurityGroupIds' (list[str]), a list of security group ids.
entry_point
(str): Path (absolute or relative) to the local Python source file which should be executed as the entry point to training. If “source_dir“ is specified, then “entry_point“ must point to a file located at the root of “source_dir“. If not specified and “endpoint_type“ is 'tensorflow-serving', no entry point is used. If “endpoint_type“ is also “None“, then the training entry point is used.
source_dir
(str): Path (absolute, relative, or an S3 URI) to a directory with any other serving source code dependencies aside from the entry point file (default: None).
dependencies
(list[str]): A list of paths to directories (absolute or relative) with any additional libraries that will be exported to the container (default: None).
...
: Additional kwargs passed to sagemaker.tensorflow.model.TensorFlowModel.
sagemaker.tensorflow.model.TensorFlowModel: A “TensorFlowModel“ object. See sagemaker.tensorflow.model.TensorFlowModel for full details.
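The shape of “vpc_config_override“ can be sketched as a named R list handed to “create_model()“. This is an illustration under assumptions. The subnet and security group ids are placeholders. Whether the package expects list() or a character vector for list[str] is not stated in the text. The stand-in estimator exists only so the sketch runs without AWS access.

```r
# Placeholder ids; the keys follow the two documented entries.
vpc_config_override <- list(
  Subnets = list("subnet-0123456789abcdef0"),
  SecurityGroupIds = list("sg-0123456789abcdef0")
)

# Hypothetical wrapper around the documented create_model() method.
create_model_sketch <- function(estimator, vpc_config) {
  estimator$create_model(vpc_config_override = vpc_config)
}

# Stand-in that echoes its arguments; a real call would pass a fitted
# TensorFlow estimator instead.
fake_estimator <- list(create_model = function(...) list(...))
str(create_model_sketch(fake_estimator, vpc_config_override))
```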
hyperparameters()
Return hyperparameters used by your custom TensorFlow code during model training.
TensorFlow$hyperparameters()
transformer()
Return a “Transformer“ that uses a SageMaker Model based on the training job. It reuses the SageMaker Session and base job name used by the Estimator.
TensorFlow$transformer( instance_count, instance_type, strategy = NULL, assemble_with = NULL, output_path = NULL, output_kms_key = NULL, accept = NULL, env = NULL, max_concurrent_transforms = NULL, max_payload = NULL, tags = NULL, role = NULL, volume_kms_key = NULL, entry_point = NULL, vpc_config_override = "VPC_CONFIG_DEFAULT", enable_network_isolation = NULL, model_name = NULL )
instance_count
(int): Number of EC2 instances to use.
instance_type
(str): Type of EC2 instance to use, for example, 'ml.c4.xlarge'.
strategy
(str): The strategy used to decide how to batch records in a single request (default: None). Valid values: 'MultiRecord' and 'SingleRecord'.
assemble_with
(str): How the output is assembled (default: None). Valid values: 'Line' or 'None'.
output_path
(str): S3 location for saving the transform result. If not specified, results are stored to a default bucket.
output_kms_key
(str): Optional. KMS key ID for encrypting the transform output (default: None).
accept
(str): The accept header passed by the client to the inference endpoint. If it is supported by the endpoint, it will be the format of the batch transform output.
env
(dict): Environment variables to be set for use during the transform job (default: None).
max_concurrent_transforms
(int): The maximum number of HTTP requests to be made to each individual transform container at one time.
max_payload
(int): Maximum size of the payload in a single HTTP request to the container in MB.
tags
(list[dict]): List of tags for labeling a transform job. If none specified, then the tags used for the training job are used for the transform job.
role
(str): The IAM Role ARN for the “TensorFlowModel“, which is also used during transform jobs. If not specified, the role from the Estimator is used.
volume_kms_key
(str): Optional. KMS key ID for encrypting the volume attached to the ML compute instance (default: None).
entry_point
(str): Path (absolute or relative) to the local Python source file which should be executed as the entry point to training. If “source_dir“ is specified, then “entry_point“ must point to a file located at the root of “source_dir“. If not specified and “endpoint_type“ is 'tensorflow-serving', no entry point is used. If “endpoint_type“ is also “None“, then the training entry point is used.
vpc_config_override
(dict[str, list[str]]): Optional override for the VpcConfig set on the model. Default: use subnets and security groups from this Estimator. Keys: 'Subnets' (list[str]), a list of subnet ids; 'SecurityGroupIds' (list[str]), a list of security group ids.
enable_network_isolation
(bool): Specifies whether container will run in network isolation mode. Network isolation mode restricts the container access to outside networks (such as the internet). The container does not make any inbound or outbound network calls. If True, a channel named "code" will be created for any user entry script for inference. Also known as Internet-free mode. If not specified, this setting is taken from the estimator's current configuration.
model_name
(str): Name to use for creating an Amazon SageMaker model. If not specified, the estimator generates a default job name based on the training image name and current timestamp.
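The restricted values of “strategy“ and “assemble_with“ lend themselves to a local check before calling “transformer()“. The helpers below are hypothetical and written in base R. They are not part of the package, and they only encode the valid values listed above.

```r
# Hypothetical helper: NULL passes through (the documented default),
# anything else must be one of the allowed choices.
check_choice <- function(value, choices) {
  if (is.null(value)) return(NULL)
  match.arg(value, choices)
}

# Hypothetical helper that assembles a validated argument list.
transformer_args <- function(instance_count, instance_type,
                             strategy = NULL, assemble_with = NULL) {
  list(
    instance_count = instance_count,
    instance_type = instance_type,
    strategy = check_choice(strategy, c("MultiRecord", "SingleRecord")),
    assemble_with = check_choice(assemble_with, c("Line", "None"))
  )
}

t_args <- transformer_args(1, "ml.c4.xlarge", strategy = "SingleRecord")
str(t_args)
```

An unsupported value such as strategy = "Batch" stops with an error from match.arg() before any request is made.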
clone()
The objects of this class are cloneable with this method.
TensorFlow$clone(deep = FALSE)
deep
Whether to make a deep clone.
A “FrameworkModel“ implementation for inference with TensorFlow Serving.
sagemaker.mlcore::ModelBase
-> sagemaker.mlcore::Model
-> sagemaker.mlcore::FrameworkModel
-> TensorFlowModel
LOG_LEVEL_PARAM_NAME
logging level
LOG_LEVEL_MAP
logging level map
LATEST_EIA_VERSION
latest eia version supported
new()
Initialize a “TensorFlowModel“.
TensorFlowModel$new( model_data, role, entry_point = NULL, image_uri = NULL, framework_version = NULL, container_log_level = NULL, predictor_cls = TensorFlowPredictor, ... )
model_data
(str): The S3 location of a SageMaker model data “.tar.gz“ file.
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
entry_point
(str): Path (absolute or relative) to the Python source file which should be executed as the entry point to model hosting. If “source_dir“ is specified, then “entry_point“ must point to a file located at the root of “source_dir“.
image_uri
(str): A Docker image URI (default: None). If not specified, a default image for TensorFlow Serving will be used. If “framework_version“ is “None“, then “image_uri“ is required. If also “None“, then a “ValueError“ will be raised.
framework_version
(str): Optional. TensorFlow Serving version you want to use. Defaults to “None“. Required unless “image_uri“ is provided.
container_log_level
(int): Log level to use within the container (default: logging.ERROR). Valid values are defined in the Python logging module.
predictor_cls
(callable[str, sagemaker.session.Session]): A function to call to create a predictor with an endpoint name and SageMaker “Session“. If specified, “deploy()“ returns the result of invoking this function on the created endpoint name.
...
: Keyword arguments passed to the superclass sagemaker.model.FrameworkModel and, subsequently, its superclass sagemaker.model.Model. Tip: additional parameters for initializing this class are documented under sagemaker.model.FrameworkModel and sagemaker.model.Model.
register()
Creates a model package for creating SageMaker models or listing on Marketplace.
TensorFlowModel$register( content_types, response_types, inference_instances, transform_instances, model_package_name = NULL, model_package_group_name = NULL, image_uri = NULL, model_metrics = NULL, metadata_properties = NULL, marketplace_cert = FALSE, approval_status = NULL, description = NULL )
content_types
(list): The supported MIME types for the input data.
response_types
(list): The supported MIME types for the output data.
inference_instances
(list): A list of the instance types that are used to generate inferences in real-time.
transform_instances
(list): A list of the instance types on which a transformation job can be run or on which an endpoint can be deployed.
model_package_name
(str): Model Package name. Mutually exclusive with 'model_package_group_name'; using 'model_package_name' makes the Model Package un-versioned (default: None).
model_package_group_name
(str): Model Package Group name. Mutually exclusive with 'model_package_name'; using 'model_package_group_name' makes the Model Package versioned (default: None).
image_uri
(str): Inference image uri for the container. Model class' self.image will be used if it is None (default: None).
model_metrics
(ModelMetrics): ModelMetrics object (default: None).
metadata_properties
(MetadataProperties): MetadataProperties object (default: None).
marketplace_cert
(bool): A boolean value indicating if the Model Package is certified for AWS Marketplace (default: False).
approval_status
(str): Model Approval Status, values can be "Approved", "Rejected", or "PendingManualApproval" (default: "PendingManualApproval").
description
(str): Model Package description (default: None).
str: A string of SageMaker Model Package ARN.
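The relationship between 'model_package_name' and 'model_package_group_name' can be summarised in a short check. This is a base R sketch, not the package's logic. The helper name package_mode and the "unspecified" label for the case where neither name is given are assumptions made for illustration.

```r
# Hypothetical helper: reports which kind of Model Package a pair of
# arguments would describe, and rejects the mutually exclusive combination.
package_mode <- function(model_package_name = NULL,
                         model_package_group_name = NULL) {
  if (!is.null(model_package_name) && !is.null(model_package_group_name)) {
    stop("model_package_name and model_package_group_name are mutually exclusive")
  }
  if (!is.null(model_package_group_name)) return("versioned")
  if (!is.null(model_package_name)) return("un-versioned")
  "unspecified"
}

# The approval states listed for approval_status.
approval_statuses <- c("Approved", "Rejected", "PendingManualApproval")

print(package_mode(model_package_group_name = "my-model-group"))
print(package_mode(model_package_name = "my-model-package"))
```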
deploy()
Deploy a TensorFlow “Model“ to a SageMaker “Endpoint“.
TensorFlowModel$deploy( initial_instance_count = NULL, instance_type = NULL, serializer = NULL, deserializer = NULL, accelerator_type = NULL, endpoint_name = NULL, tags = NULL, kms_key = NULL, wait = TRUE, data_capture_config = NULL, update_endpoint = NULL, serverless_inference_config = NULL )
initial_instance_count
(int): The initial number of instances to run in the “Endpoint“ created from this “Model“.
instance_type
(str): The EC2 instance type to deploy this Model to. For example, 'ml.p2.xlarge', or 'local' for local mode.
serializer
(sagemaker.serializers.BaseSerializer): A serializer object, used to encode data for an inference endpoint (default: None). If “serializer“ is not None, then “serializer“ will override the default serializer. The default serializer is set by the “predictor_cls“.
deserializer
(sagemaker.deserializers.BaseDeserializer): A deserializer object, used to decode data from an inference endpoint (default: None). If “deserializer“ is not None, then “deserializer“ will override the default deserializer. The default deserializer is set by the “predictor_cls“.
accelerator_type
(str): Type of Elastic Inference accelerator to deploy this model for model loading and inference, for example, 'ml.eia1.medium'. If not specified, no Elastic Inference accelerator will be attached to the endpoint. For more information: https://docs.aws.amazon.com/sagemaker/latest/dg/ei.html
endpoint_name
(str): The name of the endpoint to create (Default: NULL). If not specified, a unique endpoint name will be created.
tags
(List[dict[str, str]]): The list of tags to attach to this specific endpoint.
kms_key
(str): The ARN of the KMS key that is used to encrypt the data on the storage volume attached to the instance hosting the endpoint.
wait
(bool): Whether the call should wait until the deployment of this model completes (default: True).
data_capture_config
(sagemaker.model_monitor.DataCaptureConfig): Specifies configuration related to Endpoint data capture for use with Amazon SageMaker Model Monitoring. Default: None.
update_endpoint
: Placeholder
serverless_inference_config
(ServerlessInferenceConfig): Specifies configuration related to a serverless endpoint. Use this configuration to create a serverless endpoint and make serverless inferences. If an empty object is passed, the pre-defined values in the “ServerlessInferenceConfig“ class are used to deploy the serverless endpoint (default: None).
callable[string, sagemaker.session.Session] or None: Invocation of “self.predictor_cls“ on the created endpoint name, if “self.predictor_cls“ is not None. Otherwise, return None.
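A typical “deploy()“ call only needs an instance count and an instance type. The sketch below shows the call shape with a stand-in object so that it runs without AWS access. The wrapper name deploy_sketch and the stand-in are hypothetical, and the instance type is an example value.

```r
# Hypothetical wrapper around the documented deploy() method.
deploy_sketch <- function(model) {
  model$deploy(
    initial_instance_count = 1,
    instance_type = "ml.c4.xlarge",
    wait = TRUE
  )
}

# Stand-in that echoes its arguments; a real call would pass a
# TensorFlowModel and return the predictor created for the endpoint.
fake_model <- list(deploy = function(...) list(...))
str(deploy_sketch(fake_model))
```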
prepare_container_def()
Prepare the container definition.
TensorFlowModel$prepare_container_def( instance_type = NULL, accelerator_type = NULL )
instance_type
: Instance type of the container.
accelerator_type
: Accelerator type, if applicable.
A container definition for deploying a “Model“ to an “Endpoint“.
serving_image_uri()
Create a URI for the serving image.
TensorFlowModel$serving_image_uri()
region_name
(str): AWS region where the image is uploaded.
instance_type
(str): SageMaker instance type. Used to determine device type (cpu/gpu/family-specific optimized).
accelerator_type
(str): The Elastic Inference accelerator type to deploy to the instance for loading and making inferences to the model (default: None). For example, 'ml.eia1.medium'.
str: The appropriate image URI based on the given parameters.
clone()
The objects of this class are cloneable with this method.
TensorFlowModel$clone(deep = FALSE)
deep
Whether to make a deep clone.
A “Predictor“ implementation for inference against TensorFlow Serving endpoints.
sagemaker.mlcore::PredictorBase
-> sagemaker.mlcore::Predictor
-> TensorFlowPredictor
new()
Initialize a “TensorFlowPredictor“. See sagemaker.predictor.Predictor for more info about parameters.
TensorFlowPredictor$new( endpoint_name, sagemaker_session = NULL, serializer = JSONSerializer$new(), deserializer = JSONDeserializer$new(), model_name = NULL, model_version = NULL, ... )
endpoint_name
(str): The name of the endpoint to perform inference on.
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the estimator creates one using the default AWS configuration chain.
serializer
(callable): Optional. Default serializes input data to json. Handles dicts, lists, and numpy arrays.
deserializer
(callable): Optional. Default parses the response using “json.load(...)“.
model_name
(str): Optional. The name of the SavedModel model that should handle the request. If not specified, the endpoint's default model will handle the request.
model_version
(str): Optional. The version of the SavedModel model that should handle the request. If not specified, the latest version of the model will be used.
...
: Additional parameters passed to the Predictor constructor.
classify()
PlaceHolder
TensorFlowPredictor$classify(data)
data
:
regress()
PlaceHolder
TensorFlowPredictor$regress(data)
data
:
predict()
Return the inference from the specified endpoint.
TensorFlowPredictor$predict(data, initial_args = NULL)
data
(object): Input data for which you want the model to provide inference. If a serializer was specified when creating the Predictor, the result of the serializer is sent as input data. Otherwise, the data must be a sequence of bytes, and the predict method then sends the bytes in the request body as is.
initial_args
(list[str,str]): Optional. Default arguments for boto3 “invoke_endpoint“ call. Default is NULL (no default arguments).
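Input for “predict()“ is often arranged in the row-oriented "instances" layout used by the TensorFlow Serving REST API. The sketch below assumes that the deployed model accepts that layout and that the default JSON serializer turns a named R list into the matching JSON object. Neither is stated in the text. The wrapper and the stand-in predictor are hypothetical.

```r
# Hypothetical wrapper: puts each row under the "instances" key before
# handing the payload to the documented predict() method.
predict_rows <- function(predictor, rows) {
  predictor$predict(list(instances = rows))
}

# Stand-in that returns the payload it receives; a real call would use a
# TensorFlowPredictor bound to an endpoint.
fake_predictor <- list(predict = function(data, initial_args = NULL) data)

payload <- predict_rows(fake_predictor, list(c(1, 2, 3), c(4, 5, 6)))
str(payload)
```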
clone()
The objects of this class are cloneable with this method.
TensorFlowPredictor$clone(deep = FALSE)
deep
Whether to make a deep clone.
Handles Amazon SageMaker processing tasks for jobs using TensorFlow containers.
sagemaker.common::Processor
-> sagemaker.common::ScriptProcessor
-> sagemaker.common::FrameworkProcessor
-> TensorFlowProcessor
estimator_cls
Estimator object
new()
This processor executes a Python script in a TensorFlow execution environment. Unless “image_uri“ is specified, the TensorFlow environment is an Amazon-built Docker container that executes functions defined in the supplied “code“ Python script.
TensorFlowProcessor$new( framework_version, role, instance_count, instance_type, py_version = "py3", image_uri = NULL, command = NULL, volume_size_in_gb = 30, volume_kms_key = NULL, output_kms_key = NULL, code_location = NULL, max_runtime_in_seconds = NULL, base_job_name = NULL, sagemaker_session = NULL, env = NULL, tags = NULL, network_config = NULL )
framework_version
(str): The version of the framework. Value is ignored when “image_uri“ is provided.
role
(str): An AWS IAM role name or ARN. Amazon SageMaker Processing uses this role to access AWS resources, such as data stored in Amazon S3.
instance_count
(int): The number of instances to run a processing job with.
instance_type
(str): The type of EC2 instance to use for processing, for example, 'ml.c4.xlarge'.
py_version
(str): Python version you want to use for executing your processing code. One of 'py2' or 'py3'. Defaults to 'py3'. Value is ignored when “image_uri“ is provided.
image_uri
(str): The URI of the Docker image to use for the processing jobs (default: None).
command
([str]): The command to run, along with any command-line flags that precede the code script. Example: ["python3", "-v"]. If not provided, ["python"] will be chosen (default: None).
volume_size_in_gb
(int): Size in GB of the EBS volume to use for storing data during processing (default: 30).
volume_kms_key
(str): A KMS key for the processing volume (default: None).
output_kms_key
(str): The KMS key ID for processing job outputs (default: None).
code_location
(str): The S3 prefix URI where custom code will be uploaded (default: None). The code file uploaded to S3 is 'code_location/job-name/source/sourcedir.tar.gz'. If not specified, the default “code location“ is 's3://sagemaker-default-bucket'.
max_runtime_in_seconds
(int): Timeout in seconds (default: None). After this amount of time, Amazon SageMaker terminates the job, regardless of its current status. If 'max_runtime_in_seconds' is not specified, the default value is 24 hours.
base_job_name
(str): Prefix for the processing job name. If not specified, the processor generates a default job name, based on the processing image name and current timestamp (default: None).
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker and any other AWS services needed. If not specified, the processor creates one using the default AWS configuration chain (default: None).
env
(dict[str, str]): Environment variables to be passed to the processing jobs (default: None).
tags
(list[dict]): List of tags to be passed to the processing job (default: None). For more, see https://docs.aws.amazon.com/sagemaker/latest/dg/API_Tag.html.
network_config
(sagemaker.network.NetworkConfig): A “NetworkConfig“ object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets (default: None).
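Two of the parameters above describe string layouts that are easy to misread: where the packaged code lands under “code_location“, and how “command“ is placed before the code script. The Python sketch below illustrates both. The function names, the bucket name, and the in-container script path are hypothetical; only the 'code_location/job-name/source/sourcedir.tar.gz' layout and the ["python"] fallback come from the descriptions above.

```python
from typing import List, Optional


def uploaded_code_uri(code_location: str, job_name: str) -> str:
    """Build 'code_location/job-name/source/sourcedir.tar.gz'."""
    return f"{code_location.rstrip('/')}/{job_name}/source/sourcedir.tar.gz"


def container_entrypoint(code_path: str, command: Optional[List[str]] = None) -> List[str]:
    """Place the command and its flags before the code script.

    Falls back to ["python"] when no command is given.
    """
    return list(command or ["python"]) + [code_path]


# Hypothetical bucket, job name, and script path, for illustration only.
print(uploaded_code_uri("s3://my-bucket/code/", "tf-preprocess-001"))
print(container_entrypoint("/opt/ml/processing/input/code/preprocess.py", ["python3", "-v"]))
```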
clone()
The objects of this class are cloneable with this method.
TensorFlowProcessor$clone(deep = FALSE)
deep
Whether to make a deep clone.
Handles end-to-end training and deployment for XGBoost booster training, or for training with a customer-provided XGBoost entry point script.
sagemaker.mlcore::EstimatorBase
-> sagemaker.mlcore::Framework
-> XGBoost
.module
mimic python module
new()
This “Estimator“ executes an XGBoost-based SageMaker Training Job. The managed XGBoost environment is an Amazon-built Docker container that executes functions defined in the supplied “entry_point“ Python script. Training is started by calling “fit()“ on this Estimator. After training is complete, calling “deploy()“ creates a hosted SageMaker endpoint and returns an “XGBoostPredictor“ instance that can be used to perform inference against the hosted model. Technical documentation on preparing XGBoost scripts for SageMaker training and using the XGBoost Estimator is available on the project home page: https://github.com/aws/sagemaker-python-sdk
XGBoost$new( entry_point, framework_version, source_dir = NULL, hyperparameters = NULL, py_version = "py3", image_uri = NULL, ... )
entry_point
(str): Path (absolute or relative) to the Python source file which should be executed as the entry point to training. If “source_dir“ is specified, then “entry_point“ must point to a file located at the root of “source_dir“.
framework_version
(str): XGBoost version you want to use for executing your model training code.
source_dir
(str): Path (absolute, relative or an S3 URI) to a directory with any other training source code dependencies aside from the entry point file (default: None). If “source_dir“ is an S3 URI, it must point to a tar.gz file. The structure within this directory is preserved when training on Amazon SageMaker.
hyperparameters
(dict): Hyperparameters that will be used for training (default: None). The hyperparameters are made accessible as a dict[str, str] to the training code on SageMaker. For convenience, this accepts other types for keys and values, but “str()“ will be called to convert them before training.
py_version
(str): Python version you want to use for executing your model training code (default: 'py3').
image_uri
(str): If specified, the estimator will use this image for training and hosting, instead of selecting the appropriate SageMaker official image based on framework_version and py_version. It can be an ECR URL or a Docker Hub image and tag. Examples: 123.dkr.ecr.us-west-2.amazonaws.com/my-custom-image:1.0 and custom-image:latest.
...
: Additional kwargs passed to the “Framework“ constructor.
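The “hyperparameters“ description says that keys and values of any type are accepted and converted with “str()“ before training. A minimal Python sketch of that conversion follows. The function name is hypothetical, and the real conversion inside the package may differ in detail.

```python
from typing import Any, Dict, Optional


def stringify_hyperparameters(hyperparameters: Optional[Dict[Any, Any]]) -> Dict[str, str]:
    """Convert every key and value to str, as the training code will see them."""
    if not hyperparameters:
        return {}
    return {str(key): str(value) for key, value in hyperparameters.items()}


converted = stringify_hyperparameters(
    {"max_depth": 5, "eta": 0.2, "objective": "reg:squarederror"}
)
print(converted)
```

Because everything arrives as a string, the entry point script has to parse numeric values back itself, for example with “int()“ or “float()“.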
create_model()
Create a SageMaker “XGBoostModel“ object that can be deployed to an “Endpoint“.
XGBoost$create_model( model_server_workers = NULL, role = NULL, vpc_config_override = "VPC_CONFIG_DEFAULT", entry_point = NULL, source_dir = NULL, dependencies = NULL, ... )
model_server_workers
(int): Optional. The number of worker processes used by the inference server. If None, server will use one worker per vCPU.
role
(str): The “ExecutionRoleArn“ IAM Role ARN for the “Model“, which is also used during transform jobs. If not specified, the role from the Estimator will be used.
vpc_config_override
(dict[str, list[str]]): Optional override for VpcConfig set on the model. Default: use subnets and security groups from this Estimator. The dict has two keys: 'Subnets' (list[str]), a list of subnet ids, and 'SecurityGroupIds' (list[str]), a list of security group ids.
entry_point
(str): Path (absolute or relative) to the local Python source file which should be executed as the entry point to training. If “source_dir“ is specified, then “entry_point“ must point to a file located at the root of “source_dir“. If not specified, the training entry point is used.
source_dir
(str): Path (absolute or relative) to a directory with any other serving source code dependencies aside from the entry point file. If not specified, the model source directory from training is used.
dependencies
(list[str]): A list of paths to directories (absolute or relative) with any additional libraries that will be exported to the container. If not specified, the dependencies from training are used. This is not supported with "local code" in Local Mode.
...
: Additional kwargs passed to the “XGBoostModel“ constructor.
sagemaker.xgboost.model.XGBoostModel: A SageMaker “XGBoostModel“ object. See “XGBoostModel“ for full details.
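The “vpc_config_override“ argument expects a dict with exactly the two list-valued keys described above. The Python sketch below checks that shape before the value is used. The function name and the example ids are hypothetical, and the package may validate differently or not at all.

```python
from typing import Dict, List

REQUIRED_KEYS = ("Subnets", "SecurityGroupIds")


def validate_vpc_config(vpc_config: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return a copy holding only the two expected keys, or raise ValueError."""
    missing = [key for key in REQUIRED_KEYS if key not in vpc_config]
    if missing:
        raise ValueError(f"vpc_config is missing keys: {missing}")
    for key in REQUIRED_KEYS:
        value = vpc_config[key]
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            raise ValueError(f"{key} must be a list of strings")
    return {key: list(vpc_config[key]) for key in REQUIRED_KEYS}


# Hypothetical ids, for illustration only.
print(validate_vpc_config({"Subnets": ["subnet-0abc"], "SecurityGroupIds": ["sg-0def"]}))
```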
attach()
Attach to an existing training job. Creates an Estimator bound to an existing training job. Each subclass is responsible for implementing “_prepare_init_params_from_job_description()“, because this method delegates to it the conversion of a training job description into the arguments that the class constructor expects. After attaching, if the training job has a Complete status, it can be deployed with “deploy()“ to create a SageMaker Endpoint and return a “Predictor“. If the training job is in progress, attach will block and display log messages from the training job until the training job completes.
Examples:
>>> my_estimator.fit(wait=False)
>>> training_job_name = my_estimator.latest_training_job.name
Later on:
>>> attached_estimator = Estimator.attach(training_job_name)
>>> attached_estimator.deploy()
XGBoost$attach( training_job_name, sagemaker_session = NULL, model_channel_name = "model" )
training_job_name
(str): The name of the training job to attach to.
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, the estimator creates one using the default AWS configuration chain.
model_channel_name
(str): Name of the channel where pre-trained model data will be downloaded (default: 'model'). If no channel with the same name exists in the training job, this option will be ignored.
Instance of the calling “Estimator“ Class with the attached training job.
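The blocking behaviour described for “attach()“ amounts to polling the job status until it reaches a terminal state. The Python sketch below shows that loop in isolation. The function name, the injected “get_status“ callable, and the set of terminal status names are assumptions for illustration; the sketch does not call any AWS API and does not stream logs.

```python
import time
from typing import Callable

# Status names assumed to be terminal for this illustration.
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_job(
    get_status: Callable[[], str],
    poll_seconds: float = 30.0,
    on_status: Callable[[str], None] = print,
) -> str:
    """Poll get_status() until it returns a terminal status, then return it."""
    while True:
        status = get_status()
        on_status(status)
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(poll_seconds)


# A canned sequence of statuses stands in for a real training job.
feed = iter(["InProgress", "InProgress", "Completed"])
final_status = wait_for_job(lambda: next(feed), poll_seconds=0)
print("final:", final_status)
```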
clone()
The objects of this class are cloneable with this method.
XGBoost$clone(deep = FALSE)
deep
Whether to make a deep clone.
An XGBoost SageMaker “Model“ that can be deployed to a SageMaker “Endpoint“.
sagemaker.mlcore::ModelBase
-> sagemaker.mlcore::Model
-> sagemaker.mlcore::FrameworkModel
-> XGBoostModel
new()
Initialize an XGBoostModel.
XGBoostModel$new( model_data, role, entry_point, framework_version, image_uri = NULL, py_version = "py3", predictor_cls = XGBoostPredictor, model_server_workers = NULL, ... )
model_data
(str): The S3 location of a SageMaker model data “.tar.gz“ file.
role
(str): An AWS IAM role (either name or full ARN). The Amazon SageMaker training jobs and APIs that create Amazon SageMaker endpoints use this role to access training data and model artifacts. After the endpoint is created, the inference code might use the IAM role, if it needs to access an AWS resource.
entry_point
(str): Path (absolute or relative) to the Python source file which should be executed as the entry point to model hosting. If “source_dir“ is specified, then “entry_point“ must point to a file located at the root of “source_dir“.
framework_version
(str): XGBoost version you want to use for executing your model training code.
image_uri
(str): A Docker image URI (default: None). If not specified, a default image for XGBoost will be used.
py_version
(str): Python version you want to use for executing your model training code (default: 'py3').
predictor_cls
(callable[str, sagemaker.session.Session]): A function to call to create a predictor with an endpoint name and SageMaker “Session“. If specified, “deploy()“ returns the result of invoking this function on the created endpoint name.
model_server_workers
(int): Optional. The number of worker processes used by the inference server. If None, server will use one worker per vCPU.
...
: Keyword arguments passed to the “FrameworkModel“ initializer.
prepare_container_def()
Return a container definition with framework configuration set in model environment variables.
XGBoostModel$prepare_container_def( instance_type = NULL, accelerator_type = NULL )
instance_type
(str): The EC2 instance type to deploy this Model to. This parameter is unused because XGBoost supports only CPU.
accelerator_type
(str): The Elastic Inference accelerator type to deploy to the instance for loading and making inferences to the model. This parameter is unused because accelerator types are not supported by XGBoostModel.
dict[str, str]: A container definition object usable with the CreateModel API.
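As a rough picture of what such a container definition can look like, the Python sketch below assembles a dict with the “Image“, “Environment“, and “ModelDataUrl“ fields used by the CreateModel API. The function names, the example values, and the specific environment variable names are assumptions modelled on the Python SDK; the variables this package actually sets may differ.

```python
from typing import Dict, Optional


def framework_env(
    entry_point: str,
    source_uri: str,
    model_server_workers: Optional[int] = None,
) -> Dict[str, str]:
    """Collect framework settings as string-valued environment variables."""
    env = {
        "SAGEMAKER_PROGRAM": entry_point,          # assumed variable name
        "SAGEMAKER_SUBMIT_DIRECTORY": source_uri,  # assumed variable name
    }
    if model_server_workers is not None:
        env["SAGEMAKER_MODEL_SERVER_WORKERS"] = str(model_server_workers)
    return env


def container_def(
    image_uri: str,
    model_data_url: Optional[str] = None,
    env: Optional[Dict[str, str]] = None,
) -> dict:
    """Build a container definition dict in the CreateModel shape."""
    definition = {"Image": image_uri, "Environment": dict(env or {})}
    if model_data_url:
        definition["ModelDataUrl"] = model_data_url
    return definition


# Hypothetical image and S3 locations, for illustration only.
env = framework_env("inference.py", "s3://my-bucket/source/sourcedir.tar.gz", 2)
print(container_def("custom-image:latest", "s3://my-bucket/model.tar.gz", env))
```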
serving_image_uri()
Create a URI for the serving image.
XGBoostModel$serving_image_uri(region_name, instance_type)
region_name
(str): AWS region where the image is uploaded.
instance_type
(str): SageMaker instance type. Must be a CPU instance type.
str: The appropriate image URI based on the given parameters.
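The Python sketch below illustrates the two ideas in this method: rejecting non-CPU instance types and composing an ECR-style image URI from a region. The account id, repository, and tag are hypothetical inputs (the package resolves the official values itself), and treating instance families that start with "p" or "g" as GPU is a heuristic assumption, not the package's rule.

```python
def is_cpu_instance(instance_type: str) -> bool:
    """Heuristic: 'ml.<family>.<size>' is CPU unless the family starts with p or g."""
    parts = instance_type.split(".")
    if len(parts) != 3 or parts[0] != "ml" or not parts[1]:
        raise ValueError(f"unexpected instance type: {instance_type!r}")
    return parts[1][0] not in ("p", "g")


def serving_image_uri(
    account: str, region_name: str, repository: str, tag: str, instance_type: str
) -> str:
    """Compose '<account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>'."""
    if not is_cpu_instance(instance_type):
        raise ValueError("a CPU instance type is required")
    return f"{account}.dkr.ecr.{region_name}.amazonaws.com/{repository}:{tag}"


# Hypothetical account, repository, and tag, for illustration only.
print(serving_image_uri("123", "us-west-2", "my-custom-image", "1.0", "ml.c4.xlarge"))
```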
clone()
The objects of this class are cloneable with this method.
XGBoostModel$clone(deep = FALSE)
deep
Whether to make a deep clone.
Predictor for inference against XGBoost Endpoints. This is able to serialize Python lists, dictionaries, and numpy arrays to xgb.DMatrix for XGBoost inference.
sagemaker.mlcore::PredictorBase
-> sagemaker.mlcore::Predictor
-> XGBoostPredictor
new()
Initialize an “XGBoostPredictor“.
XGBoostPredictor$new( endpoint_name, sagemaker_session = NULL, serializer = LibSVMSerializer$new(), deserializer = CSVDeserializer$new() )
endpoint_name
(str): The name of the endpoint to perform inference on.
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.
serializer
(sagemaker.serializers.BaseSerializer): Optional. Default serializes input data to LibSVM format.
deserializer
(sagemaker.deserializers.BaseDeserializer): Optional. Default parses the response from text/csv to a Python list.
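For readers unfamiliar with the two wire formats named above, the Python sketch below writes dense rows as LibSVM text ("label index:value ...", with 1-based feature indices and zeros omitted) and parses a text/csv response into a list of rows. Both helpers are hypothetical illustrations of the formats, not the implementation of “LibSVMSerializer“ or “CSVDeserializer“; the placeholder label 0 for unlabeled rows is also an assumption.

```python
import csv
import io
from typing import List, Optional, Sequence


def to_libsvm(rows: Sequence[Sequence[float]], labels: Optional[Sequence[float]] = None) -> str:
    """Render dense rows as LibSVM lines, skipping zero-valued features."""
    lines = []
    for i, row in enumerate(rows):
        label = labels[i] if labels is not None else 0  # placeholder label
        features = " ".join(f"{j}:{v}" for j, v in enumerate(row, start=1) if v != 0)
        lines.append(f"{label} {features}".rstrip())
    return "\n".join(lines)


def parse_csv(body: str) -> List[List[str]]:
    """Split a text/csv body into rows of string fields."""
    return [row for row in csv.reader(io.StringIO(body))]


print(to_libsvm([[1.5, 0, 2], [0, 0, 4]], labels=[1, 0]))
print(parse_csv("0.1,0.9\n0.7,0.3\n"))
```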
clone()
The objects of this class are cloneable with this method.
XGBoostPredictor$clone(deep = FALSE)
deep
Whether to make a deep clone.
Handles Amazon SageMaker processing tasks for jobs using XGBoost containers.
sagemaker.common::Processor
-> sagemaker.common::ScriptProcessor
-> sagemaker.common::FrameworkProcessor
-> XGBoostProcessor
estimator_cls
Estimator object
new()
This processor executes a Python script in an XGBoost execution environment. Unless “image_uri“ is specified, the XGBoost environment is an Amazon-built Docker container that executes functions defined in the supplied “code“ Python script.
XGBoostProcessor$new( framework_version, role, instance_count, instance_type, py_version = "py3", image_uri = NULL, command = NULL, volume_size_in_gb = 30, volume_kms_key = NULL, output_kms_key = NULL, code_location = NULL, max_runtime_in_seconds = NULL, base_job_name = NULL, sagemaker_session = NULL, env = NULL, tags = NULL, network_config = NULL )
framework_version
(str): The version of the framework. Value is ignored when “image_uri“ is provided.
role
(str): An AWS IAM role name or ARN. Amazon SageMaker Processing uses this role to access AWS resources, such as data stored in Amazon S3.
instance_count
(int): The number of instances to run a processing job with.
instance_type
(str): The type of EC2 instance to use for processing, for example, 'ml.c4.xlarge'.
py_version
(str): Python version you want to use for executing your processing code. One of 'py2' or 'py3'. Defaults to 'py3'. Value is ignored when “image_uri“ is provided.
image_uri
(str): The URI of the Docker image to use for the processing jobs (default: None).
command
([str]): The command to run, along with any command-line flags that precede the code script. Example: ["python3", "-v"]. If not provided, ["python"] will be chosen (default: None).
volume_size_in_gb
(int): Size in GB of the EBS volume to use for storing data during processing (default: 30).
volume_kms_key
(str): A KMS key for the processing volume (default: None).
output_kms_key
(str): The KMS key ID for processing job outputs (default: None).
code_location
(str): The S3 prefix URI where custom code will be uploaded (default: None). The code file uploaded to S3 is 'code_location/job-name/source/sourcedir.tar.gz'. If not specified, the default “code location“ is 's3://sagemaker-default-bucket'.
max_runtime_in_seconds
(int): Timeout in seconds (default: None). After this amount of time, Amazon SageMaker terminates the job, regardless of its current status. If 'max_runtime_in_seconds' is not specified, the default value is 24 hours.
base_job_name
(str): Prefix for the processing job name. If not specified, the processor generates a default job name, based on the processing image name and current timestamp (default: None).
sagemaker_session
(sagemaker.session.Session): Session object which manages interactions with Amazon SageMaker and any other AWS services needed. If not specified, the processor creates one using the default AWS configuration chain (default: None).
env
(dict[str, str]): Environment variables to be passed to the processing jobs (default: None).
tags
(list[dict]): List of tags to be passed to the processing job (default: None). For more, see https://docs.aws.amazon.com/sagemaker/latest/dg/API_Tag.html.
network_config
(sagemaker.network.NetworkConfig): A “NetworkConfig“ object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets (default: None).
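Two defaults in the list above are stated only in words: the 24-hour runtime limit and the timestamp-based job name. The Python sketch below makes both concrete. The function names and the exact timestamp layout are assumptions for illustration; only the 24-hour default and the "prefix plus current timestamp" idea come from the descriptions above.

```python
from datetime import datetime, timezone
from typing import Optional

DEFAULT_MAX_RUNTIME_SECONDS = 24 * 60 * 60  # 24 hours, per the description above


def effective_max_runtime(max_runtime_in_seconds: Optional[int] = None) -> int:
    """Return the timeout that applies when the argument is left unset."""
    if max_runtime_in_seconds is None:
        return DEFAULT_MAX_RUNTIME_SECONDS
    return max_runtime_in_seconds


def default_job_name(base_job_name: str, now: Optional[datetime] = None) -> str:
    """Append a millisecond timestamp to the prefix (layout is assumed)."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%d-%H-%M-%S-%f")[:-3]  # trim microseconds to ms
    return f"{base_job_name}-{stamp}"


print(effective_max_runtime())
print(default_job_name("xgboost-processing"))
```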
clone()
The objects of this class are cloneable with this method.
XGBoostProcessor$clone(deep = FALSE)
deep
Whether to make a deep clone.