Oneprovider overview

  • Exposes storage resources, providing a virtualized, unified view of data.
  • Deployed within a data center.
  • Benefits from high-throughput interconnects to the storage backends.

Oneprovider architecture

Oneprovider can be deployed on multi-node clusters (for horizontal scaling). Every node hosts at least two of the following internal services:

  • Onepanel — manages a cluster node (present on each one) and offers a Web GUI and REST API for administering the Oneprovider cluster.
  • Cluster Manager — coordinates cluster operation and monitors its health.
  • Worker — implements the data management services.
  • Database — provides persistent storage for the Oneprovider service.

Oneprovider architecture

Oneprovider multi-node deployment example (Onepanel view).


Storage backends

Storage backends store the physical data corresponding to the logical files in a space. Oneprovider accesses the storage backends via "helpers" (drivers) implemented for each supported type of storage. Helpers serve as a POSIX-like abstraction, building a layer over the different storage backend APIs and access methods.

Currently supported storage backends:

  • POSIX — any POSIX-compatible filesystem accessible by Oneprovider via a mount point (directory).
  • NFS — filesystem exported via the NFS protocol — no need to mount it locally.
  • S3 — Amazon S3 compatible storage.
  • Ceph RADOS — versions 14, 15, 16.
  • HTTP — any server exposing data via HTTP or HTTPS (limited to read-only mode).
  • XRootD — CERN's data management protocol for LHC data.
  • WebDAV — experimental.
  • GlusterFS — experimental.
  • Others? Why not!
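The helper idea above can be pictured as a small interface that hides backend specifics behind POSIX-like read/write calls. The following is only an illustrative sketch — the real Oneprovider helpers are native drivers, and the class and method names here are assumptions, not the actual API:

```python
import os
from abc import ABC, abstractmethod


class StorageHelper(ABC):
    """Hypothetical POSIX-like helper interface (illustrative only)."""

    @abstractmethod
    def read(self, file_id: str, offset: int, size: int) -> bytes: ...

    @abstractmethod
    def write(self, file_id: str, offset: int, data: bytes) -> int: ...


class PosixHelper(StorageHelper):
    """Maps logical file ids onto files under a local mount point."""

    def __init__(self, mount_point: str):
        self.mount_point = mount_point

    def _path(self, file_id: str) -> str:
        return os.path.join(self.mount_point, file_id)

    def read(self, file_id: str, offset: int, size: int) -> bytes:
        with open(self._path(file_id), "rb") as f:
            f.seek(offset)
            return f.read(size)

    def write(self, file_id: str, offset: int, data: bytes) -> int:
        # "r+b" fails for files that do not exist yet, so fall back to "w+b".
        path = self._path(file_id)
        mode = "r+b" if os.path.exists(path) else "w+b"
        with open(path, mode) as f:
            f.seek(offset)
            return f.write(data)
```

An S3 or Ceph helper would implement the same interface on top of object-store calls, which is what lets the Worker service treat all backends uniformly.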

Oneprovider communication


Oneprovider hardware requirements

| Requirement          | Minimum (for testing)            | Optimal (production)                 |
|----------------------|----------------------------------|--------------------------------------|
| Cluster size (nodes) | 1                                | 2 + 1 for every 500 concurrent users |
| CPU                  | 4 vCPU                           | 24 vCPU                              |
| RAM                  | 12 GB                            | 128 GB                               |
| Local disk           | SSD                              | SSD                                  |
| Local storage space  | 20 GB + 10 MB per 1000 files     | 40 GB + 10 MB per 1000 files         |
| Open ports           | 80*, 443, 6665, 9443**           | 80*, 443, 6665, 9443**               |
| Other                | Public IP address                | Public IP address                    |
| OS                   | Any Docker / k8s compatible      | Any Docker / k8s compatible          |

* Port 80 is optional; it serves only as a redirector to HTTPS (443).
** Port 9443 (Onepanel GUI/API) is optional; a proxy is available at :443.
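The sizing rules of thumb in the table can be computed directly. A minimal sketch (the interpretation of "+1 for every 500 concurrent users" as a rounded-up increment is an assumption):

```python
import math


def cluster_nodes(concurrent_users: int) -> int:
    """Optimal cluster size: 2 nodes plus 1 for every 500 concurrent
    users (rounding up is this sketch's interpretation)."""
    return 2 + math.ceil(concurrent_users / 500)


def local_storage_gb(file_count: int, production: bool = False) -> float:
    """Local storage space estimate from the table: a fixed base
    (20 GB for testing, 40 GB for production) plus 10 MB per 1000 files."""
    base_gb = 40 if production else 20
    extra_gb = (file_count / 1000) * 10 / 1024  # 10 MB per 1000 files, in GB
    return base_gb + extra_gb
```

For example, a production deployment serving 1000 concurrent users and hosting a million files would need 4 nodes and roughly 50 GB of local space per the table's formula.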

Deployment methods

  • Native packages (.deb, .rpm) — not officially supported, but possible if you like doing things the hard way.
  • Docker approach — docker container running on a host (or hosts).
  • Kubernetes cluster — installation using helm charts.
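For the Docker approach, the single-host invocation can be sketched as below. The port list follows the requirements table; the image name and the volume paths are illustrative assumptions, not the documented Oneprovider container layout:

```python
def oneprovider_docker_cmd(image: str = "onedata/oneprovider",
                           config_dir: str = "/opt/onedata/config",
                           data_dir: str = "/opt/onedata/data") -> str:
    """Assemble a `docker run` command for a single-host Oneprovider.

    Ports follow the requirements table: 80 is the optional HTTP->HTTPS
    redirector, 9443 the optional Onepanel GUI/API. The container-side
    mount paths here are placeholders for illustration.
    """
    ports = [80, 443, 6665, 9443]
    parts = ["docker run -d --name oneprovider"]
    parts += [f"-p {p}:{p}" for p in ports]
    parts += [f"-v {config_dir}:/etc/onedata",
              f"-v {data_dir}:/volumes/storage"]
    parts.append(image)
    return " ".join(parts)
```

Multi-node Docker deployments repeat this per host; with Kubernetes the equivalent wiring is expressed declaratively in the helm chart values instead.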

Docker vs k8s approach



| Docker                         | Kubernetes                                                  |
|--------------------------------|-------------------------------------------------------------|
| cheaper                        | less cheap                                                  |
| arguably simple                | complex; more know-how needed                               |
| smaller environments           | bigger environments (e.g. big clusters, multiple providers) |
| host-level resource management | smoother resource management and scaling                    |
| no failover                    | failover at the infrastructure level                        |
| easier troubleshooting         | more complicated troubleshooting                            |

Possible network setups (1/3)

Private — the Onezone and Oneprovider services are deployed within an internal network of an organization/federation; users access the services from within the internal network or via VPN.


Possible network setups (2/3)

Hybrid — the Onezone service is public, and Oneprovider services are either public or deployed within an internal network of an organization. Bonus: Let's Encrypt can be used for certificate management in private Oneproviders, provided Onezone supports subdomain delegation (DNS challenge).


Possible network setups (3/3)

Public — all services are deployed in the public network.


Note on DB data consistency

  • Oneprovider uses the Couchbase database.
  • A sudden host shutdown or power outage may result in inconsistencies in the DB:
    • Recent changes are buffered in RAM and flushed with a delay.
    • UPS is highly recommended.
  • Couchbase does not handle a full disk well — non-flushed changes may be lost, and recovery procedures (time-costly index rebuilds) may be required.
    • Monitoring of the free space is highly recommended.
    • We are working on a "safety valve" at the Oneprovider level (not yet implemented).
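A minimal free-space check in the spirit of the recommended monitoring — the Couchbase data directory path and the 15% threshold are illustrative choices, not Onedata defaults:

```python
import shutil


def disk_free_ok(path: str = "/opt/couchbase",
                 min_free_fraction: float = 0.15) -> bool:
    """Return True while the filesystem holding the given directory
    (e.g. the Couchbase data dir) keeps at least `min_free_fraction`
    of its capacity free. A full disk risks losing non-flushed changes
    and forcing time-costly index rebuilds."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total >= min_free_fraction
```

Wiring this into a cron job or an existing monitoring stack (alerting well before the disk fills) is the practical takeaway; the function itself only reports the current state.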

Next chapter:

Oneprovider deployment — practice