Wolfram Cloud Architecture for Admins

Wolfram Enterprise Private Cloud (EPC) is provisioned through a virtual machine image running CentOS Version 8 Linux. In a typical clustered installation, a master node runs various network services, and a series of compute nodes run the Wolfram Cloud application and Wolfram Engines. The cluster configuration is orchestrated from the master node. It is also possible to run a single-node installation, in which the services, the web application and the Wolfram Engine processes all run on the same node.
The components and services comprising the Wolfram Cloud include the following:
Except for the Wolfram Engine, these services are intended to be used exclusively for the Wolfram Cloud. If you need to use a MySQL database, for example, it is recommended to host it on another machine. Contact Wolfram to discuss alternatives.
Wolfram Engine Kernel Management
The Wolfram Cloud uses a scheme of fixed-sized pools to manage its Wolfram Engine (kernel) processes. There are four pool types, and each is configured to have a certain number of kernels in it on each compute node. The four pool types are:
The deployment kernels can be configured to run in a forked mode or a shared mode. In the forked mode, kernel processes are created from a master copy using the Linux fork system call. Each request therefore starts in exactly the same pristine state, so this mode is recommended for workloads with multiple users whose code might otherwise interfere with each other. In the shared mode, kernel processes are launched once and then reused after each request, with some cleanup performed in between. The shared mode gives slightly better performance, but is only recommended for single-user workloads whose code has no side effects and can be run multiple times in the same kernel without issue.
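The difference between the two modes can be sketched with a toy model in Python. This is only an illustration, not how the Wolfram Cloud is implemented: the `MASTER_STATE` dictionary, `run_request`, `handle_forked` and `SharedKernel` are all hypothetical names, a deep copy stands in for fork, and the sketch assumes that the shared-mode cleanup misses user definitions purely to show why side effects matter (what the real cleanup removes is not specified here).

```python
import copy

# Hypothetical stand-in for the state of an initialized master kernel.
MASTER_STATE = {"definitions": {"builtin": 1}, "scratch": []}


def run_request(state, name, value):
    """Pretend evaluation: defines a symbol and leaves scratch data behind."""
    state["definitions"][name] = value
    state["scratch"].append(name)
    return sorted(state["definitions"])


def handle_forked(name, value):
    # Forked mode: each request works on a fresh copy of the master state,
    # analogous to creating a process with fork() from a master copy.
    state = copy.deepcopy(MASTER_STATE)
    return run_request(state, name, value)


class SharedKernel:
    """Shared mode: one long-lived state, reused across requests."""

    def __init__(self):
        self.state = copy.deepcopy(MASTER_STATE)

    def handle(self, name, value):
        result = run_request(self.state, name, value)
        # Assumed partial cleanup: scratch data is cleared, but
        # definitions made by earlier requests remain visible.
        self.state["scratch"].clear()
        return result


forked_first = handle_forked("x", 1)
forked_second = handle_forked("y", 2)   # does not see "x"

shared = SharedKernel()
shared_first = shared.handle("x", 1)
shared_second = shared.handle("y", 2)   # still sees "x"
```

In the toy model, the second forked request sees only its own definition, while the second shared request also sees the definition left behind by the first. This is the kind of interference the forked mode avoids.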
User Web Sessions
Much of the Wolfram Cloud's functionality is accessed through web (HTTP) requests from a client. The client contacting the cloud can be a web browser, a mobile app, a Wolfram desktop application, a command-line script or even a Wolfram Engine running code within the cloud itself. Whatever the source, a client is either authenticated (logged in) or unauthenticated (anonymous). Some features require authentication, and some permit an anonymous conversation. If authentication is required, the cloud will respond with a request to log in. In either case, once a request has been made from a client, the server will start a web session and send a server cookie back. Some web clients permit cookies to be disabled, but some features may not work without a session, so enabling cookies is recommended.
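The session flow described above can be sketched as follows. This is a simplified model under stated assumptions, not the cloud's actual session handling: the cookie name `session_id`, the in-memory `SESSIONS` table and the `"login_required"` status are hypothetical placeholders.

```python
import secrets
from http.cookies import SimpleCookie

# Hypothetical in-memory session table: session id -> {"user": name or None}.
SESSIONS = {}


def handle_request(cookie_header, requires_auth):
    """Return (status, set_cookie) for one request.

    status is "ok" or "login_required"; set_cookie is a cookie string
    when a new session was started for this request, else None.
    """
    cookie = SimpleCookie(cookie_header or "")
    morsel = cookie.get("session_id")
    sid = morsel.value if morsel else None

    set_cookie = None
    if sid not in SESSIONS:
        # First contact (or unknown cookie): start a session either way,
        # whether or not the client is authenticated.
        sid = secrets.token_hex(16)
        SESSIONS[sid] = {"user": None}
        set_cookie = f"session_id={sid}"

    if requires_auth and SESSIONS[sid]["user"] is None:
        # Stands in for the cloud's request to log in.
        return "login_required", set_cookie
    return "ok", set_cookie


# Anonymous request to a feature that allows anonymous access.
status1, cookie1 = handle_request(None, requires_auth=False)
sid1 = cookie1.split("=", 1)[1]

# Same client, now asking for a feature that requires authentication.
status2, cookie2 = handle_request(cookie1, requires_auth=True)

# After a (simulated) login, the same session is allowed through.
SESSIONS[sid1]["user"] = "alice"
status3, cookie3 = handle_request(cookie1, requires_auth=True)
```

A client that discards cookies would be given a new session on every request in this model, which illustrates why features that depend on session state may not work with cookies disabled.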
Background Task Scheduler
The Wolfram Cloud has a distributed task scheduler. This system uses MySQL database tables to store the list of tasks and their schedules. The cloud application on each compute node periodically polls for tasks to run, and exactly one node "wins" the right to run a particular task, so a task is only ever run on one node in the system. Any node can win a task, however, which distributes the load across the cluster. Each node will only run as many tasks as there are service kernels allocated to it, so the total task capacity is the number of nodes × the number of service kernels per node.
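One common way to let exactly one poller "win" a task stored in a database table is a conditional update that only succeeds if the task is still unowned. The sketch below shows that general technique. It is not the Wolfram Cloud's actual schema or logic: it uses Python's built-in `sqlite3` as a stand-in for MySQL, and the `tasks` table, its columns and the function names are all hypothetical.

```python
import sqlite3


def make_db(task_count=5):
    """Create an in-memory task table with task_count unowned tasks."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, owner TEXT)")
    conn.executemany(
        "INSERT INTO tasks (id, owner) VALUES (?, NULL)",
        [(i,) for i in range(1, task_count + 1)],
    )
    conn.commit()
    return conn


def try_claim(conn, task_id, node):
    # The WHERE clause makes the claim conditional: if another node has
    # already set the owner, no row matches and rowcount is 0.
    cur = conn.execute(
        "UPDATE tasks SET owner = ? WHERE id = ? AND owner IS NULL",
        (node, task_id),
    )
    conn.commit()
    return cur.rowcount == 1


def poll(conn, node, service_kernels):
    """One polling pass: claim at most service_kernels unowned tasks."""
    claimed = []
    rows = conn.execute(
        "SELECT id FROM tasks WHERE owner IS NULL ORDER BY id"
    ).fetchall()
    for (task_id,) in rows:
        if len(claimed) == service_kernels:
            break
        if try_claim(conn, task_id, node):
            claimed.append(task_id)
    return claimed


def total_capacity(nodes, service_kernels_per_node):
    # Capacity as described in the text: nodes x service kernels per node.
    return nodes * service_kernels_per_node


conn = make_db()
node_a_tasks = poll(conn, "node-a", service_kernels=2)
node_b_tasks = poll(conn, "node-b", service_kernels=2)
late_claim = try_claim(conn, 1, "node-b")  # task 1 is already owned
capacity = total_capacity(nodes=3, service_kernels_per_node=2)
```

The sketch treats a polling pass as starting with no tasks running on the node. A real scheduler would also need to release or expire claims when tasks finish or a node fails, which is omitted here.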
System Resources
This section goes through the different types of system resources and discusses the demands on those resources from the various parts of the Wolfram Cloud.

Memory

Memory (RAM) is often the chief resource being stretched in the Wolfram Cloud.
On compute nodes, each Wolfram Engine starts with a significant memory footprint. This is due in part to pre-initializing various areas of the Wolfram Language so that certain functions run faster the first time they are used. A configurable limit on memory use for user evaluations helps keep overall memory consumption in check. You can check the memory use of a Wolfram Language evaluation through functions such as MemoryInUse.
In addition to the basic memory used by the system, the Wolfram Cloud application runs in Tomcat on a Java virtual machine with a fixed heap size. Notebooks served in the Wolfram Cloud have a corresponding model maintained in the application, so more notebooks, larger notebooks and notebooks with bitmapped images will all place demands on Tomcat's Java heap.
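When sizing a compute node, it can help to add these demands up. The following is a rough back-of-the-envelope sketch, not a Wolfram sizing formula: it assumes the evaluation memory limit applies per kernel, the pool names `pool_3` and `pool_4` are placeholders, and all numbers are made up for illustration.

```python
def worst_case_node_memory_gb(pool_sizes, per_kernel_limit_gb,
                              java_heap_gb, other_gb=0.0):
    """Rough upper bound on RAM demand for one compute node.

    pool_sizes maps a pool name to the number of kernels in that pool.
    Assumes every kernel can grow to the configured per-kernel limit.
    """
    kernels = sum(pool_sizes.values())
    return kernels * per_kernel_limit_gb + java_heap_gb + other_gb


# Hypothetical configuration: four fixed-size pools on one node.
pools = {"deployment": 4, "service": 2, "pool_3": 1, "pool_4": 1}
estimate_gb = worst_case_node_memory_gb(
    pools,
    per_kernel_limit_gb=2.0,  # assumed per-kernel evaluation limit
    java_heap_gb=4.0,         # assumed fixed Tomcat JVM heap
    other_gb=1.0,             # assumed allowance for OS and other processes
)
```

On a single-node installation, the allowance for other processes would also need to cover the remaining services running on the same machine.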

CPU

Generally speaking, Wolfram Language evaluations constitute the main source of CPU usage, though the various services (e.g. Tomcat, MySQL) can cause spikes in CPU usage as well. The CPU time consumed by a Wolfram Language evaluation can be measured with the Timing function.

Disk Storage

The chief sources for disk storage use in Wolfram Cloud are:

Network Bandwidth

Sources of network use in the Wolfram Cloud include:

User Isolation Security

Users are isolated from each other via a sandbox mechanism that prevents one user's Wolfram Language code from reaching other users' data or system resources. This also means some Wolfram Language features are not available at all, because they cannot be secured through this mechanism. Because there are tradeoffs between security and capability, and because every EPC use is different, the sandbox is configurable, up to and including disabling it entirely.
The principal ways that the sandbox (also known as the "kernel sandbox") works are through: (a) allowing file read operations only from a specified list of directories; (b) allowing file write operations only to a specified list of directories; (c) allowing execution of programs and loading of dynamic libraries only from specific directories and for specific binaries; and (d) disallowing certain operations altogether. For example, with the tightest sandbox restrictions, only approved precompiled binaries can be used, and the binary compilation feature cannot be used to create new binaries.
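The four kinds of checks listed above can be modeled as a small allowlist policy. This is a conceptual sketch only, not the kernel sandbox's implementation or configuration format: the `SandboxPolicy` class, the directory paths and the operation name `compile_binary` are hypothetical. Paths are normalized lexically, so the sketch does not account for symbolic links.

```python
import posixpath


def _inside(path, directories):
    """True if path is one of the directories or lies below one of them."""
    p = posixpath.normpath(path)  # collapses ".." and "." lexically
    for d in directories:
        d = posixpath.normpath(d)
        if p == d or p.startswith(d.rstrip("/") + "/"):
            return True
    return False


class SandboxPolicy:
    def __init__(self, read_dirs, write_dirs, exec_binaries,
                 denied_ops, enabled=True):
        self.read_dirs = list(read_dirs)
        self.write_dirs = list(write_dirs)
        self.exec_binaries = {posixpath.normpath(b) for b in exec_binaries}
        self.denied_ops = set(denied_ops)
        self.enabled = enabled  # False models disabling the sandbox

    def can_read(self, path):        # (a) reads only from listed directories
        return not self.enabled or _inside(path, self.read_dirs)

    def can_write(self, path):       # (b) writes only to listed directories
        return not self.enabled or _inside(path, self.write_dirs)

    def can_execute(self, path):     # (c) only approved binaries
        return (not self.enabled
                or posixpath.normpath(path) in self.exec_binaries)

    def can_perform(self, op):       # (d) some operations are disallowed
        return not self.enabled or op not in self.denied_ops


policy = SandboxPolicy(
    read_dirs=["/srv/userdata/alice", "/opt/shared"],
    write_dirs=["/srv/userdata/alice"],
    exec_binaries=["/opt/approved/bin/tool"],
    denied_ops=["compile_binary"],
)

own_file_ok = policy.can_read("/srv/userdata/alice/notes.txt")
escape_blocked = not policy.can_read("/srv/userdata/alice/../bob/secret.txt")
shared_write_blocked = not policy.can_write("/opt/shared/data.bin")
```

Separate read and write lists make it possible to expose shared, read-only data to every user while restricting writes to each user's own area.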