Glossary
The glossary provides condensed information about Onedata-specific concepts and other key terms in the context of our system, with links to the detailed documentation.
Access control
A set of policies and procedures for granting or denying access to Onedata entities (e.g., spaces) and data. For more information about space access control, refer to this page. In the case of the data access control, Onedata implements a multi-level approach, as described here.
Access-control list (ACL)
A list of permissions associated with a file or directory that is used to precisely grant or deny access to it. Learn more here.
Access token
A token for authentication and authorization, which is used when accessing Onedata interfaces: REST API, CDMI, or Oneclient.
Archive
A snapshot of a dataset created at a certain point in time. Learn more here.
Auto-cleaning
A process that automatically maintains storage usage at a certain level and ensures that there is enough space for new replicas during continuous computations. It uses statistics collected by the file popularity to determine the least popular file replicas in a space and to evict them. Learn more here.
Automation
A system for managing and running workflows. Learn more here.
Automation inventory
An organizational unit for storing workflow schemas, lambdas, and managing their members. Learn more here.
Caveat
A confinement limiting the context in which a token is valid, inscribed in the token itself, e.g., limiting the validity of the token to a certain point in time, or target services. Learn more here.
Cloud Data Management Interface (CDMI)
A standardized interface for managing cloud storage and accessing data held in it. Learn more here.
Cluster
A set of hosts that together run an instance of Onezone or Oneprovider. The cluster can consist of one or more hosts, each running a subset of services, e.g., database, Cluster Worker, and Cluster Manager. Learn more in the Onezone cluster nodes and Oneprovider cluster nodes chapters.
Cluster Manager
A component of Onedata services (Oneprovider, Onezone), which coordinates Cluster Worker instances within a single cluster.
Cluster Worker
A component of Onedata services (Oneprovider, Onezone), which enables them to easily scale on many nodes on a single cluster. Each Cluster Worker can be configured for different tasks depending on the current needs (data access, metadata management, etc.) by the Cluster Manager component.
Couchbase
A highly scalable document-oriented database, used as a crucial component of Onedata services. Learn more about its role in Onedata in the Architecture chapter or visit the official Couchbase website.
Data Discovery
A feature that provides harvesting of the user-defined metadata assigned to files in multiple spaces and submits it to indices, which can be later browsed and queried. Logically divided into separate harvesters that can have different configurations and source spaces. Learn more here.
Data distribution
A layout of the physical data blocks on storage backends for the file. Managed automatically by on-the-fly transfers, and manually using data replication, migration, or eviction. Learn more here.
Data transfer
A process of replicating or migrating data between providers. Learn more here.
Dataset
A file or directory marked by space users as representing data collections relevant to them. They can be used to organize data in a space systematically and provide an ability to create persistent snapshots — archives. Learn more here.
Digital Object Identifier (DOI)
A standardized persistent identifier, defined by the International Organization for Standardization (ISO), used to uniquely identify digital objects such as academic publications, datasets, and official documents.
Direct member
A member who has assigned privileges to the resource (space, group, harvester, etc.) without an intermediate group.
Effective privileges
A sum of all resources’ privileges assigned directly and those inherited via group membership path.
Emergency interface
A Web GUI of Onepanel accessible from the special 9443 port on the service host.
Learn more in the Oneprovider administration panel and
the Onezone administration panel chapters.
Eviction (data)
A user-triggered action that changes the data distribution of files in order to remove replicated data blocks from a specific provider.
Extended attributes
Custom key-value pairs that can be assigned to any file/directory and are compatible with POSIX extended file attributes. Learn more here.
File ID
A unique, global identifier associated with a file or directory. Learn more here.
File metadata
Information that describes a file or directory. Can be roughly divided into filesystem metadata, governed by the system, and user-defined metadata, i.e., extended attributes or custom RDF and JSON documents. Learn more here.
File path
A string specifying the location of a file or directory in the Onedata filesystem. Learn more here.
File popularity
A feature that provides tracking of usage statistics for files in a space. Used by the auto-cleaning process to clean up the least popular file replicas. Learn more here.
File registration
A feature that allows users to register files located on an imported storage in order to reflect external data collections in a Onedata space. Learn more here.
Group
An abstract entity that organizes together a subset of users and other groups. Helps manage users’ access and privileges to resources like spaces. Learn more here.
Handle
A representation of an Open Access persistent identifier and metadata, assigned to the share. Created by registering the share in a handle service and exposing it for discovery by Public Data indexes via the OAI PMH protocol, making the data collection and metadata publicly available and findable. Learn more here.
Handle service
A mediator used to register the share in the Public Data indexing services, which results in creating a handle. Learn more here.
Harvester
An internal service that provides the implementation of data discovery by harvesting the user-defined metadata across the files from the designated spaces. Available to users or groups with appropriate privileges. Learn more here.
Identity provider (IdP)
A system that authenticates users and manages their digital identities. Enables single sign-on (SSO) by allowing trusted applications, such as Onezone, to rely on the IdP for authentication instead of handling credentials themselves. Onedata supports a wide range of IdPs based on OIDC & SAML. Learn more here.
Identity token
A token proving the identity.
Imported storage
A storage backend that enables the storage import feature on supported spaces. Learn more here.
Invite token
A token for gaining access to some Onedata resource, e.g., space, group, harvester, etc.
Lambda (workflows)
A contract between custom-defined operations and the automation system that serves as a bridge for integrating custom logic into workflows. Encapsulates specifications for operation interface, resource requirements, execution details, and metadata.
Lane (workflows)
A distinct processing stage within a workflow that orchestrates data processing by routing items from a source store through parallel boxes containing tasks. Executes sequentially, forming a processing pipeline.
Let’s Encrypt (LE)
A non-profit certificate authority run by the Internet Security Research Group (ISRG) that provides web certificates without charging fees. Using the built-in LE client, Onedata can obtain and renew web certificates on its nodes automatically. Learn more in the Onezone web certificate and Oneprovider web certificate chapters.
Local User Mapping (LUMA)
A database that stores mappings between Onedata user accounts and local user accounts/credentials on storage resources. It establishes a relation between members of a Onedata space and user accounts recognized by different storage providers. Learn more here.
Member
A user or group that has assigned specific privileges for a space, group, harvester, automation inventory, or cluster. Can be direct or indirect (when a user or group gains privileges to the resource by being a member of another group).
Migration (data)
A user-triggered action that changes the data distribution of files in order to move data blocks between specific providers.
Oneclient
A command-line interface based on FUSE for mounting the Onedata distributed virtual filesystem on local machines. Learn more here.
OnedataFileRestClient
A Python client to the Onedata file REST API, offering basic operations on files as a concise, low-level library. Learn more here.
OnedataFS
A PyFilesystem2 plugin that allows accessing the user data programmatically using a Python API. Learn more here.
OnedataRestFS
A PyFilesystem2 plugin that allows accessing the user data programmatically using a Python API, based on Onedata REST API. Learn more here.
Onepanel
A service dedicated to the administration of a cluster and, itself, an integral part of the cluster. Onepanel is accessible through the Web GUI or REST API. Learn more in the Architecture, Onezone administration panel, and Oneprovider administration panel chapters.
Oneprovider
A service dedicated to managing the data, installed at a data provider site as a cluster, and gets registered in a Onezone. Oneproviders cooperate in a peer-to-peer manner, synchronizing information about commonly supported spaces. It is accessible through the various interfaces, i.a., the Web GUI, REST API, and Oneclient. Learn more here.
Oneprovider panel
A Onepanel instance dedicated for administration of a Oneprovider cluster.
Onezone
A service implementing the Onedata zone concept, serving as a center of authority (using identity providers), and an entry point to the system. Onezone allows registration of multiple Oneproviders to provide their storage resources to users. It manages the core resources of Onedata, e.g., spaces, groups, shares, and is accessible through the various interfaces, i.a., the Web GUI and REST API. Learn more here.
Onezone panel
A Onepanel instance dedicated to the administration of a Onezone cluster.
On-the-fly transfer
A transfer triggered by remote data access, which is performed in the background by Oneproviders when they are requested to serve file fragments that reside in a remote location.
Open Access (OA)
A publishing model that provides free, immediate, and unrestricted online access to research outputs without financial, legal, or technical barriers.
Persistent identifier
A long-lasting, globally unique reference to a digital or physical object that remains stable over time, even if the object’s location or metadata changes. It can be, e.g., a DOI. Onedata supports assigning a persistent identifier to the share using a handle.
Provider
An entity that handles data storage as seen by Onedata users. Providers deploy Oneprovider services near physical storage resources, i.e., in computing and data centers or even personal computers. Learn more here.
Public Data
An extended share that has been assigned a persistent identifier (e.g., DOI) and descriptive metadata. Learn more here.
Quality of Service (QoS)
A feature that provides management of file replica distribution and redundancy between providers supporting a space. Learn more here.
Replication (data)
A user-triggered action that changes the data distribution of files in order to copy data blocks to a specific provider.
REST API
An interface to various Onedata services, accessible through the HTTPS protocol, that follows the RESTful API guidelines. You can browse the Onedata REST API documentation here.
Service
A software that realizes certain roles in the Onedata software stack, communicating with other services, to provide a complete ecosystem. There are three main services in Onedata: Onezone, Oneprovider, and Onepanel.
Space
A logical container for data, fundamental for organizing user data in Onedata. It is accessible only to its members — users or groups — that are assigned fine-grained privileges. The actual data storage of a space is realized by the storage backends using Oneproviders. Learn more here.
Space owner
A designated user, who is authorized to perform all operations, regardless of the assigned privileges, in the space. A space must always have at least one owner, but there may be more. Learn more here.
Share
An entity that represents a semi-public link assigned to a file or directory, allowing anyone on the Internet to read the data. Shares in Onedata may have an optional description and can be promoted to the Public Data using a handle. Read more here.
Storage backend
A storage resource recognized by a Oneprovider and used to support Onedata spaces. Storage backends are registered in the Oneprovider panel, using the Web GUI or REST API. Learn more here.
Storage import
A feature dedicated to importing files located on a storage by registering them in a space supported by the storage backend, without copying the data. Learn more here.
Store (workflows)
A container for data used and manipulated during a workflow execution. It is defined by a store schema and created as a store instance. There are several types, such as a list store or a tree forest store.
Support
A storage backend allocation granted to a space by a Oneprovider on a physical storage. Learn more here.
Token
An alphanumeric string, like e.g. MDAxNWxvY2F00aW9uIG9uZXpvbmUKMDAzYmlkZW500H5H...,
acting as a proof of authorization, that can be used across the system to
authenticate, prove identity, or gain access
to some resources. Tokens must be kept secret. Learn more here.
Transfer
See data transfer.
User
An account in the Onedata, managed by a Onezone, with assigned authentication methods (identity providers), mostly created for a single person. A user could become a member of Onedata resources.
View
A result of continuous indexing of file metadata, mapped using a user-defined function, and optionally reduced using a reduce function. Learn more here.
Web GUI
A graphical user interface of Onedata, accessible via a web browser. Learn more here.
Workflow
A user-defined process for orchestrating complex data processing through a series of sequential lanes and shared global stores. Workflow consists of a schema, and an execution (the runtime instance). It is stored in inventories. Learn more here.
Workflow schema
A model that describes how the workflow operates when it gets executed. It specifies sequential processing lanes, shared data stores, and metadata required to orchestrate data processing pipelines. Learn more here.
Xattrs
See extended attributes.
Zone
A central entity of a single Onedata ecosystem instance, which constitutes an independent data management platform, bringing together multiple data centers — providers. Zone serves as a center of authority and an entry point to the system. It is managed by the Onezone service. Learn more here.