Onedata Documentation

Guides, API references, and examples for building with Onedata.

Prerequisites & requirements

This chapter describes the prerequisites and requirements that must be met before deploying the Oneprovider service on any host, whether physical (bare metal) or virtual.

Follow the table of contents below as a check-list, making sure all points are addressed.

Table of Contents

Hardware & OS

All installation methods assume containerized deployments (using Docker or Kubernetes). In principle, any Linux distribution can be used; however, the recommended choice is Ubuntu 20.04 LTS or newer, as it has been thoroughly tested.

The host intended for a Oneprovider deployment must meet the following requirements:

| Requirement | Minimum | Recommended baseline | Comments |
|---|---|---|---|
| CPU | 4 vCPU | 16 vCPU | Scale proportionally with load. The recommended baseline corresponds to approximately 50-100 concurrent clients. |
| RAM | 16 GB | 64 GB | Scale proportionally with load. The recommended baseline corresponds to approximately 50-100 concurrent clients. |
| Root volume | 30 GB | ≥ 60 GB [1] | Space for files other than service persistence: container images, backups, dependencies, OS, etc. |
| Persistence volume | 20 GB + 8 MB per 1,000 files | 100 GB + 10 MB per 1,000 files | The host setup assumes a separate block device for an LVM volume. Capacity depends primarily on the number of files (metadata and service data), not on the number of clients. |
| Disk type for persistence | SSD | High-speed SSD | The performance of the disk directly impacts the performance of the underlying database (Couchbase) and hence the service’s ability to handle more concurrent requests. |

[1] If you plan to deploy the OpenFaaS Engine for Automation on the same machine, allow more disk space for Docker-based Lambda images. For starters, consider 100 GB of extra disk capacity.

As an example of a cloud-based deployment, one could create a virtual machine with Ubuntu 24.04, 16 vCPU, 64 GB RAM, a 60 GB root disk, and an empty 200 GB block device attached (but not mounted).
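The persistence volume formulas from the requirements table can be turned into a quick estimate. The sketch below is not part of the Onedata tooling: `persistence_gb` is a hypothetical helper, and it assumes 1 GB = 1000 MB and rounds up to whole gigabytes.

```bash
#!/usr/bin/env bash
# Estimate persistence volume size in GB for a given number of files.
# Hypothetical helper; assumes 1 GB = 1000 MB and rounds every step up.
#   $1 = expected number of files
#   $2 = base capacity in GB (20 minimum, 100 recommended)
#   $3 = MB needed per 1,000 files (8 minimum, 10 recommended)
persistence_gb() {
  local files=$1 base_gb=$2 mb_per_1000=$3
  # Number of started 1,000-file blocks times the per-block cost, in MB.
  local extra_mb=$(( (files + 999) / 1000 * mb_per_1000 ))
  # Convert MB to GB (rounded up) and add the base capacity.
  echo $(( base_gb + (extra_mb + 999) / 1000 ))
}

persistence_gb 10000000 20 8     # minimum for 10 million files: prints 100
persistence_gb 10000000 100 10   # recommended for 10 million files: prints 200
```

Under these assumptions, a 200 GB block device covers the recommended capacity for roughly 10 million files.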

Public IP and ports

Make sure your Oneprovider service will be reachable by the target audience:

  • Globally available via the Internet — you will need public external IP(s).
  • Available within an intranet — you will need IP(s) reachable via the local routing.

Assigning the IP addresses:

  • Load balancer / reverse proxy / ingress — the IP(s) should be assigned to the outward-facing network service, which then routes the traffic internally to Oneprovider host(s).
  • Direct routing — the IP(s) should be assigned to the host(s).

The Oneprovider service exposes several ports. The table below will help you decide which ones should be available publicly, or in a restricted manner. Make sure to configure your firewall / security groups accordingly.

| Port | Typical setup | Hardened setup | Comments |
|---|---|---|---|
| 80 | Open publicly | Closed if Let’s Encrypt disabled | Used for automated Let’s Encrypt certificate generation and to automatically redirect clients from HTTP to HTTPS. Opening it is not strictly mandatory. |
| 443 | Open publicly | Open within intranet or from whitelisted IPs/subnets | The main port (SSL-protected) for most Onedata clients and interfaces. Typically open to the Internet, unless access must be limited to certain addresses or networks. |
| 4443 | Open publicly | Open within intranet or from whitelisted IPs/subnets, or closed if S3 disabled | The port (SSL-protected) hosting the S3 endpoint (OneS3 service) that emulates Onedata spaces as S3 buckets. May be closed if S3 is not required. |
| 6665 | Open publicly | Accessible only by other Oneprovider services | The port (SSL-protected) used for data transfers between providers. Typically open, but can be restricted to the other Oneprovider hosts in the Onedata ecosystem. |
| 9443 | Open within intranet or from whitelisted IPs/subnets | Closed | The port (SSL-protected) used for direct emergency access to the Oneprovider administration panel (Onepanel). Useful for managing the installation when the unified Onezone interface is not working correctly, but offers limited functions. Should be accessible only by admins, e.g. via an organizational VPN. |

For security, we strongly recommend closing all other ports to public access. Oneprovider runs internal services on the host, including the Couchbase database and the built-in Erlang port mapper daemon (EPMD); exposing them externally may create attack vectors.
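As an illustration, the "typical setup" column could be expressed as firewall rules. The sketch below assumes `ufw` is the host firewall and that all listed ports are TCP; the helper `print_ufw_rules`, the admin subnet `10.0.0.0/8`, and the SSH rule on port 22 are assumptions made for this example, not part of the Onedata documentation. The function only prints the commands so they can be reviewed before being run as root.

```bash
#!/usr/bin/env bash
# Print (do not apply) ufw rules matching the "typical setup" column.
# Hypothetical helper; $1 is the admin/VPN subnet allowed to reach Onepanel.
print_ufw_rules() {
  local admin_subnet=${1:-10.0.0.0/8}   # assumed example subnet
  local port
  echo "ufw default deny incoming"
  echo "ufw default allow outgoing"
  echo "ufw allow 22/tcp"               # assumption: keep SSH reachable
  # Ports that are open publicly in the typical setup.
  for port in 80 443 4443 6665; do
    echo "ufw allow ${port}/tcp"
  done
  # Onepanel emergency access: restricted to the admin subnet only.
  echo "ufw allow from ${admin_subnet} to any port 9443 proto tcp"
  echo "ufw enable"
}

print_ufw_rules
```

Because ports published by Docker containers can bypass `ufw` rules, it is worth verifying the result from an external host; cloud security groups or an upstream firewall may be a more reliable place to enforce these restrictions.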

DNS domain

The host should be accessible via its FQDN. There are two scenarios for setting this up:

  • supply your own FQDN, in which case your network administrator registers the domain and places the relevant DNS records,
  • use the subdomain delegation feature of Onedata, which will generate an FQDN within the domain managed by the Onezone service.
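If you supply your own FQDN, it may help to confirm that it resolves before starting the deployment. The sketch below uses `getent hosts`, which queries the system resolver; `fqdn_resolves` is a hypothetical helper and `oneprovider.example.com` is a placeholder name.

```bash
#!/usr/bin/env bash
# Return success if the given name resolves through the system resolver.
# Hypothetical helper; it checks resolution only, not that the address
# actually belongs to this host.
fqdn_resolves() {
  getent hosts "$1" >/dev/null 2>&1
}

# Usage (placeholder FQDN):
#   fqdn_resolves oneprovider.example.com && echo "resolves" || echo "does not resolve"
```

To see the resolved address itself, run `getent hosts` with the FQDN directly and compare the output with the IP assigned to the host or load balancer.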

TLS certificate

Onedata services communicate with each other over HTTPS, which requires a TLS certificate for the host. There are two ways to obtain one:

  • having the administrator obtain and install the web certificate (possible for both delegated and non-delegated subdomains),
  • using the Let’s Encrypt (LE) service to obtain the web certificate (also possible for both delegated and non-delegated subdomains).

NOTE: If you use a delegated subdomain together with LE, certificate management happens automatically and is handled by the Onedata software.

Systemd

Systemd should be installed on your system. This is the default for most modern Linux distributions. Oneprovider software uses systemd for service management.
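A quick way to confirm that systemd is the running init system is to test for the `/run/systemd/system` directory, which is the check described in the `sd_booted(3)` manual page. The helper name `systemd_running` is hypothetical; the optional argument exists only to make the function easy to test.

```bash
#!/usr/bin/env bash
# Return success if the system was booted with systemd.
# $1 optionally overrides the directory to check (defaults to the real one).
systemd_running() {
  [ -d "${1:-/run/systemd/system}" ]
}

if systemd_running; then
  echo "systemd: OK"
else
  echo "systemd: not detected"
fi
```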

Access to Onezone

You should have at least user-level access to an existing Onezone instance before deploying Oneprovider. In a common scenario, the Onezone instance has already been set up by your organization, and you get access to it according to the organization’s access policy. If that is not the case, you can use our Onezone service available at demo.onedata.org (see the user quickstart section for details) or deploy your own Onezone (see the Onezone installation chapter).

Host setup

The host should be initially configured before deploying the Oneprovider service. The available methods to do this are described below.

Using Ansible script

The recommended way is to use our proven Ansible script to set up your host.

Clone the repository on your host:

git clone https://github.com/onedata/onedata-deployments.git
cd onedata-deployments

Then, follow the instructions that can be found:

  • in the repository: ./initial-vm-config/ansible/README.md,
  • or online: README.

Manual preparation

Alternatively, you may perform the steps manually. Note that the Oneprovider service is quite sensitive to network settings and depends on nuances that the Ansible playbook captures well. Use the manual approach only as a last resort.