RemoteBatchMapSubmit
RemoteBatchMapSubmit[env,f,list]
submits an array batch job in which f is applied to each element on the first level of list, using the remote batch submission environment env.
RemoteBatchMapSubmit[f,list]
submits an array job using $DefaultRemoteBatchSubmissionEnvironment.
Details and Options
- RemoteBatchMapSubmit maps a function over a list using batch job submission.
- The currently supported batch computation providers are "AWSBatch", "AzureBatch" and "CharityEngine".
- RemoteBatchMapSubmit returns a RemoteBatchJobObject representing the resulting array job on the batch computation provider. An array job is a remote batch job consisting of multiple child jobs.
- The environment env should be a RemoteBatchSubmissionEnvironment.
- If the batch computation provider being used has no required environment settings, then env may be the name of the provider as a string, such as in RemoteBatchMapSubmit["CharityEngine",f,list], which is equivalent to RemoteBatchMapSubmit[RemoteBatchSubmissionEnvironment["CharityEngine"],f,list].
- The input array list will be split into contiguous spans (using a strategy specified by the Method option), with each span assigned to be processed independently by a child job of the master array job.
- If a child job has more than one processor core available to it (e.g. as an effect of the RemoteProviderSettings option), then it will internally use ParallelMap to distribute evaluation of its span across all available cores.
- In general, individual evaluations of the function f cannot rely upon global state created by specific previous evaluations, as evaluations are distributed across multiple isolated child jobs according to the Method option (and, furthermore, potentially across multiple parallel kernels within each child job). However, techniques such as function memoization that make limited, opportunistic use of global state can sometimes be employed to optimize job performance.
- The RemoteBatchJobObject returned by RemoteBatchMapSubmit will have "Array" as the value of its "JobType" property.
- The following options are supported by RemoteBatchMapSubmit:
  ForwardCloudCredentials    Automatic    whether to copy the local session's Wolfram Cloud credentials into the remote session
  IncludeDefinitions         True         whether to automatically include dependencies of the expression
  Initialization             None         expression to evaluate once in the master kernel of each child job
  LicensingSettings          Automatic    licensing settings to use
  Method                     Automatic    strategy with which to pack evaluations into child jobs
  RemoteInputFiles           <||>         association of local files to be uploaded to the provider
  RemoteProviderSettings     <||>         association of provider-specific settings for each child job
  TimeConstraint             Automatic    the timeout for each child job
- The Initialization expression can be used to import input files and store their content in a variable to be accessed within the job function.
- The value of the IncludeDefinitions option applies to both the job function and, if specified, the Initialization expression.
- The default value Automatic of the LicensingSettings option creates a new on-demand license entitlement with kernel count and expiration limits based on the job's configuration. Default kernel count limits are determined with the assumption that all child jobs may run concurrently.
- When running jobs using the default on-demand licensing configuration, Wolfram Engine license usage is charged against your Wolfram Service Credits balance on a pay-as-you-go basis.
- The Method option supports the following packing method specifications:
  "FinestGrained"             assigns the minimum possible number of evaluations to each job (within the job count limit enforced by the provider)
  "EvaluationsPerJob" -> n    picks enough jobs that each holds as close as possible to n evaluations, within the job count limit enforced by the provider (if Length[list] is not divisible by n, a single additional job contains the remainder)
  "JobCount" -> n             balances the evaluations across exactly n jobs, up to the lesser of Length[list] and the job count limit enforced by the provider
  Automatic                   equivalent to "JobCount" -> Round[]
- The default packing method Method -> Automatic attempts to minimize the difference between the number of jobs and the number of evaluations per job.
- The Method option may also be set to a list such as {"JobCount" -> 10, "FinestGrained"}, where the first element is a packing method from the preceding table and the second element is a supported parallelization method for Parallelize (and related functions).
- If a parallelization method is specified, it will be supplied to ParallelMap within each child job. The default parallelization method is Automatic. Setting a parallelization method has no effect if the array job is configured to provide only a single core to each child job.
- The value of the RemoteProviderSettings option and the timeout specified by the TimeConstraint option both apply to each child job individually, not to the master array job.
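The memoization technique mentioned in the notes above can be sketched locally as follows. The name slowSquare and the Pause delay are hypothetical stand-ins for an expensive function. Each computed value is stored as a new definition, so a repeated input within the same kernel is answered from the cache. Because evaluations may land in different child jobs or parallel kernels, the cache is only an opportunistic speedup, and the function must still be correct when the cache is empty.

```wolfram
(* Hypothetical expensive function: Pause stands in for real work. *)
(* The inner assignment caches each result as a new definition. *)
slowSquare[n_Integer] := slowSquare[n] = (Pause[0.1]; n^2);

(* Local check: the repeated input 3 is computed once, then reused. *)
results = Map[slowSquare, {3, 3, 5, 3}]
(* {9, 9, 25, 9} *)

(* Two cached values plus the general rule. *)
Length[DownValues[slowSquare]]
```

With the default IncludeDefinitions -> True, a definition like this would be expected to travel with the job function, for instance in RemoteBatchMapSubmit["CharityEngine", slowSquare, list].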
Examples
Basic Examples (1)
Submit an array batch job that squares each integer from 1 through 100:
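A minimal sketch of this submission, assuming the "CharityEngine" provider can be used by name without further environment settings, as described in the details. For other providers, the first argument would be a configured RemoteBatchSubmissionEnvironment. The variable name job is illustrative, and running the code submits a real job that may consume Service Credits.

```wolfram
(* Submit one evaluation of #^2 & per element of Range[100]. *)
(* Passing the provider name as a string is the short form from the details. *)
job = RemoteBatchMapSubmit["CharityEngine", #^2 &, Range[100]]

(* The returned RemoteBatchJobObject is documented to report "Array" here. *)
job["JobType"]
```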
Visualize the distribution of evaluations across child jobs:
View an automatically updating visualization of the state of each child job:
While the array job is running, obtain a sparse array containing only the elements from child jobs that have completed:
After the array job is fully complete, obtain its results as a dense array:
Options (5)
Initialization (1)
Method (4)
The setting Method -> Automatic attempts to make the number of child jobs and the number of evaluations per job as close as possible:
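As a local illustration of that goal (not the provider's actual algorithm), the following sketch searches for the job count n that minimizes the gap between n and the number of evaluations per job. The names evaluations, gap and bestJobCount are hypothetical.

```wolfram
evaluations = 100;

(* Gap between the number of jobs n and the evaluations per job. *)
gap[n_] := Abs[n - evaluations/n];

(* Pick the job count with the smallest gap. *)
bestJobCount = First[MinimalBy[Range[evaluations], gap]]
(* 10 *)
```

For 100 evaluations this gives 10 jobs of 10 evaluations each, that is, a job count equal to the square root of the list length.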
The setting Method -> "FinestGrained" puts as few evaluations as possible in each child job, within the maximum child job count enforced by the provider:
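The effect of the provider's job count limit can be sketched with a small local calculation. The helper finestSpanSize and the limit of 1000 jobs are assumptions made for illustration. The real limit depends on the provider, and the actual packing code may differ.

```wolfram
(* Smallest span size that keeps the job count within a limit. *)
finestSpanSize[len_Integer, maxJobs_Integer] := Ceiling[len/maxJobs];

(* Short list: one evaluation per child job. *)
finestSpanSize[100, 1000]
(* 1 *)

(* Long list: spans must grow to stay within the limit. *)
finestSpanSize[2500, 1000]
(* 3 *)
```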
The setting Method -> "EvaluationsPerJob" -> n attempts to pack exactly n evaluations into each child job, with a single additional job containing any remainder:
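A hedged sketch of this setting, again assuming the "CharityEngine" provider can be used by name. The second expression is a purely local illustration of the expected span sizes, built with Partition and UpTo, and is not the provider's implementation.

```wolfram
(* Request roughly 30 evaluations per child job. *)
job = RemoteBatchMapSubmit["CharityEngine", #^2 &, Range[100],
  Method -> "EvaluationsPerJob" -> 30]

(* Local illustration only: span sizes if the list is cut every 30 elements. *)
Length /@ Partition[Range[100], UpTo[30]]
(* {30, 30, 30, 10} *)
```

Since 100 is not divisible by 30, the last span holds the remaining 10 evaluations.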
The setting Method -> "JobCount" -> n spreads the evaluations across exactly n child jobs:
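A minimal sketch of this setting, assuming the "CharityEngine" provider can be used by name. The job count of 5 is an arbitrary illustrative value.

```wolfram
(* Spread 100 evaluations across 5 child jobs. *)
job = RemoteBatchMapSubmit["CharityEngine", #^2 &, Range[100],
  Method -> "JobCount" -> 5]
```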
Possible Issues (1)
With the default on-demand licensing settings, RemoteBatchMapSubmit will refuse to submit a job if your Service Credits balance is insufficient to create a license entitlement. The required entitlement is based on the number of child jobs, the processor count per child job and the TimeConstraint:
By default, RemoteBatchMapSubmit requests a license entitlement with the assumption that all child jobs will be running simultaneously. If this is not the case, you may reduce the entitlement's concurrent kernel limit with the LicensingSettings option:
Alternatively, you may reduce the number of child jobs, the number of processor cores per child job and/or the child job TimeConstraint:
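One way this might look, as a sketch only: the job count and timeout values are illustrative, the numeric TimeConstraint is assumed to be interpreted in seconds, and the "CharityEngine" provider is assumed to be usable by name. Provider-specific core counts are omitted because their setting names depend on the provider.

```wolfram
(* Fewer child jobs and a shorter per-job timeout reduce the entitlement size. *)
job = RemoteBatchMapSubmit["CharityEngine", #^2 &, Range[100],
  Method -> "JobCount" -> 4,   (* fewer child jobs *)
  TimeConstraint -> 600        (* assumed here to mean seconds per child job *)
]
```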
Finally, you may override the balance check by supplying the "CheckCreditsBalance" -> False entitlement setting to the LicensingSettings option:
Text
Wolfram Research (2020), RemoteBatchMapSubmit, Wolfram Language function, https://reference.wolfram.com/language/ref/RemoteBatchMapSubmit.html.