RemoteBatchMapSubmit

RemoteBatchMapSubmit[env,f,list]

submits an array batch job in which f is applied to each element on the first level of list, using the remote batch submission environment env.

Details and Options

  • RemoteBatchMapSubmit maps a function over a list using batch job submission to providers such as AWS Batch and Charity Engine.
  • RemoteBatchMapSubmit returns a RemoteBatchJobObject representing the resulting array job on the batch computation provider. An array job is a remote batch job consisting of multiple child jobs.
  • The environment env should be a RemoteBatchSubmissionEnvironment.
  • If the batch computation provider being used has no required environment settings, then env may be the name of the provider as a string, such as in RemoteBatchMapSubmit["CharityEngine",f,list], which is equivalent to RemoteBatchMapSubmit[RemoteBatchSubmissionEnvironment["CharityEngine"],f,list].
  • The input array list will be split into contiguous spans (using a strategy specified by the Method option), with each span assigned to be processed independently by a child job of the master array job.
  • If a child job has more than one processor core available to it (e.g. as an effect of the RemoteProviderSettings option), then it will internally use ParallelMap to distribute evaluation of its span across all available cores.
  • Each evaluation of the function f occurs in an isolated environment, with no access to state set by previous evaluations.
  • The RemoteBatchJobObject returned by RemoteBatchMapSubmit will have "Array" as the value of its "JobType" property.
  • The following options are supported by RemoteBatchMapSubmit:
  • ForwardCloudCredentials   Automatic   whether to copy the local session's Wolfram Cloud credentials into the remote session
    IncludeDefinitions        True        whether to automatically include dependencies of the expression
    Initialization            None        expression to evaluate once in the master kernel of each child job
    LicensingSettings         Automatic   licensing settings to use
    Method                    Automatic   strategy with which to pack evaluations into child jobs
    RemoteInputFiles          <||>        association of local files to be uploaded to the provider
    RemoteProviderSettings    <||>        association of provider-specific settings for each child job
    TimeConstraint            Automatic   the timeout for each child job
  • The Initialization expression can be used to import input files and store their content in a variable to be accessed within the job function.
  • The value of the IncludeDefinitions option applies to both the job function and, if specified, the Initialization expression.
  • The default value Automatic of the LicensingSettings option creates a new on-demand license entitlement with kernel count and expiration limits based on the job's configuration. Default kernel count limits are determined with the assumption that all child jobs may run concurrently.
  • When running jobs using the default on-demand licensing configuration, Wolfram Engine license usage is charged against your Wolfram Service Credits balance on a pay-as-you-go basis.
  • The Method option supports the following packing method specifications:
  • "FinestGrained"          assigns the minimum possible number of evaluations to each job (within the job count limit enforced by the provider)
    "EvaluationsPerJob"->n   chooses the number of jobs so that each holds as close as possible to n evaluations, within the job count limit enforced by the provider (if Length[list] is not divisible by n, there will be a single additional job containing the remainder)
    "JobCount"->n            balances the evaluations across exactly n jobs, up to the lesser of Length[list] and the job count limit enforced by the provider
    Automatic                equivalent to "JobCount"->Round[]
  • The default packing method Method->Automatic attempts to minimize the difference between the number of jobs and the number of evaluations per job.
  • The Method option may also be set to a list such as {"JobCount"->10,"FinestGrained"}, where the first element is a packing method from the preceding table and the second element is a supported parallelization method for Parallelize (and related functions).
  • If a parallelization method is specified, it will be supplied to ParallelMap within each child job. The default parallelization method is Automatic. Setting a parallelization method has no effect if the array job is configured to provide only a single core to each child job.
  • The value of the RemoteProviderSettings option and the timeout specified by the TimeConstraint option both apply to each child job individually, not to the master array job.
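
The span-based execution model described in the notes above can be pictured with a small Python model. This is only an illustrative sketch, not Wolfram Language code and not the provider's implementation: the helper names `split_into_spans` and `run_child_job` are hypothetical, and the fixed span size stands in for whatever packing the Method option would select.

```python
from typing import Callable, List, Sequence, Tuple

def split_into_spans(n_items: int, span_size: int) -> List[Tuple[int, int]]:
    """Return contiguous half-open (start, stop) index spans covering range(n_items)."""
    return [(start, min(start + span_size, n_items))
            for start in range(0, n_items, span_size)]

def run_child_job(f: Callable, items: Sequence, span: Tuple[int, int]) -> list:
    """Model one child job: apply f independently to each element of its span."""
    start, stop = span
    return [f(x) for x in items[start:stop]]

items = list(range(1, 11))
spans = split_into_spans(len(items), span_size=4)

# Child jobs share no state, so their outputs can simply be
# concatenated in span order to rebuild the full result list.
results = [y for span in spans
           for y in run_child_job(lambda x: x * x, items, span)]
```

Because each span is contiguous and processed independently, the order of the final results matches the order of the input list regardless of which child job finishes first.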

Examples

Basic Examples  (1)

Submit an array batch job that squares each integer from 1 through 100:

Visualize the distribution of evaluations across child jobs:

View an automatically updating visualization of the state of each child job:

While the array job is running, obtain a sparse array containing only the elements from child jobs that have completed:

After the array job is fully complete, obtain its results as a dense array:

Scope  (1)

Submit an array job using the default remote submission environment:

Options  (5)

Initialization  (1)

Use an initialization expression to assign an imported input file to a variable, which can then be used in the job function:

Method  (4)

The setting Method->Automatic attempts to make the number of child jobs and the number of evaluations per job as close as possible:
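
The stated goal of the Automatic setting can be modeled in Python by a brute-force search. This is a sketch of the goal as described here, not the formula the function actually uses; `automatic_job_count` and its `max_jobs` parameter (standing in for the provider's job count limit) are hypothetical names, and at least one list element is assumed.

```python
import math

def automatic_job_count(n_items: int, max_jobs: int) -> int:
    """Pick the job count n that minimizes |n - ceil(n_items / n)|.

    Assumed reading of the Automatic goal; requires n_items >= 1.
    """
    candidates = range(1, min(n_items, max_jobs) + 1)
    return min(candidates, key=lambda n: abs(n - math.ceil(n_items / n)))

# 100 evaluations with a generous limit: 10 jobs of 10 evaluations each.
jobs = automatic_job_count(100, max_jobs=1000)
```

When the job count limit is smaller than the balanced value, this model simply uses as many jobs as the limit allows.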

The setting Method->"FinestGrained" puts as few evaluations as possible in each child job, within the maximum child job count enforced by the provider:
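
The arithmetic behind this setting can be sketched in Python as follows. The function name `finest_grained` and the `max_jobs` parameter (the provider's job count limit) are hypothetical, and the rounding shown is an assumption about how the limit is honored, not a documented formula.

```python
import math

def finest_grained(n_items: int, max_jobs: int) -> tuple:
    """Return (job_count, evaluations_per_job) for the finest packing the limit allows.

    Assumes n_items >= 1 and max_jobs >= 1.
    """
    per_job = math.ceil(n_items / max_jobs)   # 1 when the limit is not binding
    job_count = math.ceil(n_items / per_job)  # jobs needed at that size
    return job_count, per_job

unconstrained = finest_grained(100, max_jobs=1000)  # one evaluation per job
constrained = finest_grained(100, max_jobs=30)      # limit forces larger jobs
```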

The setting Method->"EvaluationsPerJob"->n attempts to pack exactly n evaluations into each child job, with a single additional job containing any remainder:
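
The resulting child job sizes can be illustrated with a short Python model. The function name `evaluations_per_job_sizes` is hypothetical, and the sketch ignores the provider's job count limit for simplicity.

```python
def evaluations_per_job_sizes(n_items: int, n: int) -> list:
    """Sizes of child jobs when packing n evaluations per job.

    Any remainder goes into a single additional, smaller job.
    """
    full_jobs, remainder = divmod(n_items, n)
    return [n] * full_jobs + ([remainder] if remainder else [])

uneven = evaluations_per_job_sizes(100, 30)  # remainder of 10 in a fourth job
even = evaluations_per_job_sizes(100, 25)    # divides evenly, no extra job
```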

The setting Method->"JobCount"->n spreads the evaluations across exactly n child jobs:
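
One way to balance evaluations across a fixed number of jobs is sketched below in Python. The function name `job_count_spans` is hypothetical, and giving the leftover evaluations to the first few spans is an assumption; the actual distribution of leftovers may differ.

```python
def job_count_spans(n_items: int, n_jobs: int) -> list:
    """Split range(n_items) into contiguous spans whose sizes differ by at most one.

    Assumes n_items >= 1; the job count is capped at the number of elements.
    """
    n_jobs = min(n_jobs, n_items)
    base, extra = divmod(n_items, n_jobs)
    spans, start = [], 0
    for i in range(n_jobs):
        size = base + (1 if i < extra else 0)  # first `extra` spans get one more
        spans.append((start, start + size))
        start += size
    return spans

balanced = job_count_spans(10, 3)
capped = job_count_spans(3, 5)   # cannot have more jobs than elements
```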

Possible Issues  (1)

With the default on-demand licensing settings, RemoteBatchMapSubmit will refuse to submit a job if your Service Credits balance is insufficient to create a license entitlement sized for the number of child jobs, the processor count per child job and the TimeConstraint:

By default, RemoteBatchMapSubmit requests a license entitlement with the assumption that all child jobs will be running simultaneously. If this is not the case, you may reduce the entitlement's concurrent kernel limit with the LicensingSettings option:

Alternatively, you may reduce the number of child jobs, the number of processor cores per child job and/or the child job TimeConstraint:

Finally, you may override the balance check by supplying the "CheckCreditsBalance"->False entitlement setting to the LicensingSettings option:

If your Service Credits balance is exhausted while a child job is running, the job will terminate prematurely:

Wolfram Research (2020), RemoteBatchMapSubmit, Wolfram Language function, https://reference.wolfram.com/language/ref/RemoteBatchMapSubmit.html.
