This reference architecture document provides guidance for CloudSoda deployments. It includes the reference architecture for CloudSoda controllers and guidance on specifications for agents. For specific use cases, please work with the CloudSoda team on the details of your solution.
CloudSoda Controller
- The CloudSoda controller requires 1 TB of storage on the primary partition. The storage actually consumed depends on the number of files you process. For example, a CloudSoda job that moves, copies, or syncs 1 million files requires 500 MB of storage. If you configure a policy to run that job once per day, it would consume about 182 GB per year (500 MB × 365 days).
- Deployments that handle 100 million files or scan hourly should consider more storage on the controller node. Please contact the CloudSoda team for details.
- The tables below outline the recommended architecture for servers and VMs and include cloud templates that can be used.
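The storage arithmetic in the controller guidance above can be sketched as a small calculation. This is a rough estimate only: it assumes storage grows linearly at the quoted 500 MB per 1 million files per run, that nothing is pruned over the year, and that 1 GB = 1000 MB. The function name `estimate_storage_gb` is hypothetical and is not part of any CloudSoda tooling.

```python
# Rough controller storage estimate, based on the figure quoted above:
# 500 MB of controller storage per 1 million files per job run.
MB_PER_MILLION_FILES = 500


def estimate_storage_gb(files_per_run: int, runs_per_day: float, days: int = 365) -> float:
    """Return estimated storage in GB (assumes linear growth, 1 GB = 1000 MB)."""
    mb_per_run = files_per_run / 1_000_000 * MB_PER_MILLION_FILES
    return mb_per_run * runs_per_day * days / 1000


# 1 million files, one run per day, for one year
print(estimate_storage_gb(1_000_000, 1))   # 182.5
# 1 million files, scanned hourly, for one year
print(estimate_storage_gb(1_000_000, 24))  # 4380.0
```

Under these assumptions, an hourly scan of 1 million files already exceeds the 1 TB baseline within a year, which is consistent with the advice to plan for more storage in that case.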
CloudSoda Agents
- An agent can run on almost any architecture to enable data movement including laptops, servers, VMs, and other platforms. The performance of the data movement is dependent on the number of cores available to the agent. The more cores you have available the faster the agent will perform.
- For typical deployments, we recommend 16 cores for an agent.
- Transferring files to the cloud can result in the agent consuming a lot of memory as a result of the cloud SDKs creating buffers for every file part. The larger the file being transferred to the cloud the more memory the agent will consume. This is not the case for transfers to file-based storage.
- For agents moving data to or from the cloud or an object store, using a processor with integrated SHA extensions gives approximately a 30% increase in performance. For lists of processors that include these extensions, see the links below.
https://en.wikipedia.org/wiki/Intel_SHA_extensions
https://en.wikipedia.org/wiki/Ice_Lake_(microprocessor)
https://en.wikipedia.org/wiki/Zen_3
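To check whether an existing agent host has the SHA extensions mentioned above, one option on x86 Linux is to look for the `sha_ni` flag that the kernel reports in `/proc/cpuinfo`. The sketch below assumes a Linux x86 host; other operating systems and architectures expose CPU features differently. The helper names are hypothetical and are not part of CloudSoda.

```python
from pathlib import Path


def cpu_flags_from_cpuinfo(text: str) -> set:
    """Return the CPU feature flags from the first 'flags' line of cpuinfo text."""
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_sha_extensions(cpuinfo_text: str) -> bool:
    """True if the x86 SHA extensions flag ('sha_ni' on Linux) is present."""
    return "sha_ni" in cpu_flags_from_cpuinfo(cpuinfo_text)


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")  # present on Linux only
    if cpuinfo.exists():
        print("SHA extensions:", has_sha_extensions(cpuinfo.read_text()))
    else:
        print("No /proc/cpuinfo on this host; check the CPU model against the links above.")
```

Only the first `flags` line is read, on the assumption that all cores in the host report the same feature set.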
The table below lists reference platforms for VMs, servers, and cloud templates, including the supported operating systems. Treat these values as a guideline only. Adjust the specification based on the dataset and the storage types involved in the data transfer.