# Auto-cleaning
Before reading this chapter, we recommend that you familiarize yourself with the file popularity mechanism.
The purpose of the auto-cleaning mechanism is to automatically keep storage usage at a configured level and ensure that there is enough space for new replicas when performing continuous computations. The mechanism uses the statistics collected by the file popularity mechanism to determine the least popular file replicas and evict them. The process is safe: only redundant replicas (those duplicated on remote providers) are evicted. Eviction of replicas is coordinated among providers using a custom algorithm, which ensures that there is no risk of data loss, even when deletion of replicas of the same file is requested simultaneously.
Each auto-cleaning run produces a report, which shows the number of removed replicas and the amount of released storage space.
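The idea of a single run can be illustrated with a minimal sketch. It is not the actual Oneprovider implementation: the `Replica` and `Report` types, the `popularity` score, and the `target_bytes` parameter are hypothetical names introduced only for this example, and the coordination between providers is not modelled.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Replica:
    path: str
    size: int          # bytes occupied on the local storage
    popularity: float  # hypothetical score; lower means less popular
    redundant: bool    # True if the data is also stored on a remote provider


@dataclass
class Report:
    removed_replicas: int = 0
    released_bytes: int = 0


def run_cleaning(replicas: Iterable[Replica], used_bytes: int, target_bytes: int) -> Report:
    """Evict the least popular redundant replicas until usage drops to the target."""
    report = Report()
    # Only redundant replicas are candidates, so no data can be lost.
    candidates = sorted((r for r in replicas if r.redundant), key=lambda r: r.popularity)
    for replica in candidates:
        if used_bytes - report.released_bytes <= target_bytes:
            break  # enough space has been released
        report.removed_replicas += 1
        report.released_bytes += replica.size
    return report
```

The returned `Report` mirrors the two figures mentioned above: the number of removed replicas and the amount of released storage space.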
# Basic setup
The mechanism can be enabled in the space configuration tab in the Oneprovider panel.
NOTE: The file popularity mechanism must be enabled to turn auto-cleaning on. Disabling file popularity disables auto-cleaning as well.
The user interface allows specifying low and high thresholds, which refer to the amount of data stored on the local storage supporting the given space:
- high threshold — when exceeded, an auto-cleaning run is triggered to evict redundant replicas.
- low threshold — when reached, the current auto-cleaning run is stopped.
The thresholds can be adjusted in the Spaces -> "Space Name" -> Auto-cleaning tab in the Oneprovider panel GUI (as shown below), or using the REST API.
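The interplay of the two thresholds can be sketched as a small decision function. This is an illustration only, assuming that both thresholds and the current usage are expressed in bytes; the function name `should_run` and its parameters are hypothetical and do not come from the Oneprovider code base.

```python
def should_run(running: bool, used_bytes: int, low: int, high: int) -> bool:
    """Return True if an auto-cleaning run should be active for the given usage."""
    if low > high:
        raise ValueError("low threshold must not exceed high threshold")
    if not running:
        # A new run is triggered only when the high threshold is exceeded.
        return used_bytes > high
    # A run in progress continues until the low threshold is reached.
    return used_bytes > low
```

With `low=50` and `high=90`, a usage of 80 does not start a run, but a run that is already in progress keeps going at 80 and stops once usage falls to 50. Keeping a gap between the thresholds prevents runs from starting and stopping repeatedly around a single limit.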
# Selective rules
It is possible to filter the list of files obtained from the file popularity by enabling selective rules.
There are seven rules for which ranges might be provided:
- `maxOpenCount`: Files that have been opened no more than `maxOpenCount` times may be cleaned. The default value is `9007199254740991` (2^53-1).
- `minHoursSinceLastOpen`: Files that have been closed at least this many hours ago may be cleaned. The default value is `0`.
- `minFileSize`: Only files whose size (in bytes) is not less than the given value may be cleaned. The default value is `1`.
- `maxFileSize`: Only files whose size (in bytes) is not greater than the given value may be cleaned. The default value is `1125899906842624` (1 PiB).
- `maxHourlyMovingAverage`: Files whose moving average of open operations per hour is not greater than the given value may be cleaned. The average is calculated over a 24-hour window. The default value is `9007199254740991` (2^53-1).
- `maxDailyMovingAverage`: Files whose moving average of open operations per day is not greater than the given value may be cleaned. The average is calculated over a 30-day window. The default value is `9007199254740991` (2^53-1).
- `maxMonthlyMovingAverage`: Files whose moving average of open operations per month is not greater than the given value may be cleaned. The average is calculated over a 12-month window. The default value is `9007199254740991` (2^53-1).
Disabled rules are ignored. A file replica must satisfy all enabled rules to be evicted.
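The way the rules combine can be illustrated with a short sketch. The rule names are those listed in this section; the `FileStats` type, its field names, and the `may_be_cleaned` function are hypothetical and serve only to show that every enabled rule must hold while disabled rules are skipped.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class FileStats:
    size: int                     # bytes
    open_count: int
    hours_since_last_open: float
    hourly_avg: float             # moving average of opens per hour
    daily_avg: float              # moving average of opens per day
    monthly_avg: float            # moving average of opens per month


# One predicate per rule; each takes the file statistics and the rule value.
CHECKS = {
    "maxOpenCount": lambda f, v: f.open_count <= v,
    "minHoursSinceLastOpen": lambda f, v: f.hours_since_last_open >= v,
    "minFileSize": lambda f, v: f.size >= v,
    "maxFileSize": lambda f, v: f.size <= v,
    "maxHourlyMovingAverage": lambda f, v: f.hourly_avg <= v,
    "maxDailyMovingAverage": lambda f, v: f.daily_avg <= v,
    "maxMonthlyMovingAverage": lambda f, v: f.monthly_avg <= v,
}


def may_be_cleaned(stats: FileStats, enabled_rules: Dict[str, float]) -> bool:
    """Return True if the file satisfies every enabled rule.

    Rules missing from `enabled_rules` are treated as disabled and ignored.
    An unknown rule name raises KeyError.
    """
    return all(CHECKS[name](stats, value) for name, value in enabled_rules.items())
```

For example, a 2048-byte file opened three times passes `{"maxOpenCount": 5, "minFileSize": 1024}` but fails `{"maxOpenCount": 2}`. With no rules enabled, every file on the popularity list remains a candidate.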
# Starting run on demand
It is possible to forcefully start an auto-cleaning run by pressing the green button placed below the space occupancy bar. The run can be forcefully triggered even if the high threshold is not exceeded.
# Stopping run on demand
It is possible to forcefully stop an auto-cleaning run by pressing the red button placed below the space occupancy bar.
# REST API
All operations related to auto-cleaning can be performed using the REST API. Refer to the linked API documentation for detailed information and examples.
| Request | Link to API |
|---|---|
| Get auto-cleaning configuration | API |
| Update auto-cleaning configuration | API |
| Get list of auto-cleaning runs' reports | API |
| Get the report of auto-cleaning run | API |
| Trigger auto-cleaning run | API |
| Get current auto-cleaning status | API |
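As a rough illustration of the "Update auto-cleaning configuration" request, the sketch below only builds an HTTP request object and does not send it. The endpoint path, the `X-Auth-Token` header, and the `enabled`, `target`, and `threshold` field names are assumptions that are not stated on this page; verify them against the linked API documentation before use. The function name `build_update_request` is hypothetical.

```python
import json
import urllib.request


def build_update_request(host: str, space_id: str, token: str,
                         low_bytes: int, high_bytes: int) -> urllib.request.Request:
    """Build (but do not send) a request that updates the thresholds."""
    # ASSUMPTION: path layout; check the API documentation for the real one.
    url = (f"https://{host}/api/v3/onepanel/provider/spaces/"
           f"{space_id}/auto-cleaning/configuration")
    # ASSUMPTION: field names for the low ("target") and high ("threshold") limits.
    body = {"enabled": True, "target": low_bytes, "threshold": high_bytes}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PATCH",
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
    )
```

The resulting object can be passed to `urllib.request.urlopen` once the host, space ID, and access token are replaced with real values.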