
Historian Data Sizing Guide and Best Practices

Estimated sizing and space requirements based on data types and logging frequency

Understanding Data Block Creation

Data blocks are created every hour and are partitioned by the total number of tags stored per block. If more than 20,000 tags are being stored, the data is split into multiple blocks per hour.

If 100,000 tags are being logged, every hour will produce 5 data block files. It is recommended not to exceed 100,000 tags per dataset.
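A quick sketch of this rule in Python (the 20,000-tag threshold comes from the behaviour described above):

    import math

    TAGS_PER_BLOCK = 20_000  # maximum tags stored in a single data block

    def blocks_per_hour(tag_count: int) -> int:
        """Number of data block files produced per hour for a dataset."""
        return math.ceil(tag_count / TAGS_PER_BLOCK)

    print(blocks_per_hour(100_000))  # -> 5 block files per hour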

Blocks are also split based on the data type of the tag: string tags are stored in a block file with a .object extension, while numerical tags are logged to a .data block file.

Depending on the number of blocks to write, a controlled shutdown might need more time to complete. In a Docker environment, this can be controlled through your Docker Compose file:
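For example, the Compose stop_grace_period setting extends how long Docker waits for the container to stop before force-killing it. A minimal sketch, assuming a Compose-based deployment (the service and image names below are illustrative):

    services:
      historian:                     # illustrative service name
        image: timebase/historian    # illustrative image name
        stop_grace_period: 60s       # allow extra time to flush data blocks on shutdown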

On Windows, you might need to adjust the following registry setting:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\WaitToKillServiceTimeout

By default, this is set to 20000 milliseconds (20 seconds).

The recommendation is to increase it to 30 seconds (30000 milliseconds).
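As a sketch, the value can be updated from an elevated PowerShell prompt (the registry value is a string of milliseconds):

    Set-ItemProperty -Path 'HKLM:\SYSTEM\CurrentControlSet\Control' `
                     -Name 'WaitToKillServiceTimeout' -Value '30000'  # 30 seconds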

Growth Estimation

Appropriately accounting for the future growth of a data historian can be complicated and requires the consideration of several variables, including:

  1. Number of tags
  2. Frequency of data changes
  3. Type of data being logged
  4. Sample rates

If data logging has already begun, a better way of predicting disk space is to find the average file size of a week's or month's worth of data for each dataset, then multiply it out over a given time period.
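A minimal sketch of that projection, assuming you have measured one week of on-disk block sizes per dataset (all figures below are illustrative):

    # Project future disk usage from a measured week of historian data.
    measured_week_bytes = {
        "dataset_a": 12 * 1024**3,  # hypothetical: 12 GiB of blocks per week
        "dataset_b": 3 * 1024**3,   # hypothetical: 3 GiB of blocks per week
    }

    weeks_to_project = 52  # one year

    total_bytes = sum(measured_week_bytes.values()) * weeks_to_project
    print(f"Projected storage for {weeks_to_project} weeks: "
          f"{total_bytes / 1024**4:.2f} TiB")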

Number of Tags

Perhaps the most obvious of the factors that influence growth is the total number of tags. A tag, sometimes referred to as a data point, is a uniquely identified point of data collection from a field device such as a sensor. A tag consists of a collection of timestamps, values, and quality scores.

The Frequency of Data Changes

The Timebase Time Series Historian stores values as deltas: the Historian does not log a tag value that is unchanged from the previous value. Understanding how often your tags change helps predict how much storage space is needed. In most real-world environments, the majority of tags do not change as often as they are scanned.
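A minimal sketch of the delta-storage idea (the sample structure and names are illustrative, not the Historian's actual API):

    # Only persist a sample when its value differs from the last stored value.
    def delta_filter(samples):
        """samples: iterable of (timestamp, value, quality) tuples."""
        last_value = object()  # sentinel: nothing stored yet
        for timestamp, value, quality in samples:
            if value != last_value:
                last_value = value
                yield (timestamp, value, quality)

    scanned = [(0, 1.0, 192), (1, 1.0, 192), (2, 1.5, 192), (3, 1.5, 192)]
    print(list(delta_filter(scanned)))  # only 2 of the 4 scans are stored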

Data Type

The type of data stored must be considered when calculating disk space. The Historian supports data types ranging from Booleans and floats to string values. A string value uses the same amount of storage as a numeric data type, plus the size of the characters making up the string.
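As a rough sketch of that rule (the numeric per-sample cost below is an assumed placeholder, not a published figure):

    NUMERIC_SAMPLE_BYTES = 12  # assumed on-disk cost per numeric sample; adjust to your system

    def estimated_sample_bytes(value) -> int:
        """Rough on-disk size of one stored sample."""
        if isinstance(value, str):
            # A string sample costs the numeric overhead plus its characters.
            return NUMERIC_SAMPLE_BYTES + len(value.encode("utf-8"))
        return NUMERIC_SAMPLE_BYTES

    print(estimated_sample_bytes(3.14))       # numeric tag
    print(estimated_sample_bytes("RUNNING"))  # string tag: 12 + 7 bytes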

Estimation Example on Numerical Data

Please see the following link to our sizing estimator tool.

This includes tag data of type Double, Integer, and Boolean.

Example: 10,000 tags changing every second will consume 820 MB of data per day.

Amount of Tags   Frequency of Change   Change Percentage   Duration   Storage Requirements
10,000           1 s                   10%                 1 d        1071 MB

The Real Time Historian only stores a data point when its value differs from the previous one, i.e. delta storage. The size estimate above assumes that every second produces a different value to store.
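A back-of-envelope version of this calculation (the bytes-per-stored-sample constant is an assumption chosen to roughly reproduce the table's figure; use the sizing estimator tool for real planning):

    # Rough daily storage estimate for numeric tags under delta storage.
    BYTES_PER_SAMPLE = 12.4  # assumed average on-disk cost per stored sample

    def daily_storage_mb(tags: int, scan_seconds: float, change_fraction: float) -> float:
        samples_per_day = tags * (86_400 / scan_seconds) * change_fraction
        return samples_per_day * BYTES_PER_SAMPLE / 1_000_000

    # Table row: 10,000 tags, 1 s scan, 10% changing -> roughly 1071 MB/day
    print(f"{daily_storage_mb(10_000, 1, 0.10):.0f} MB/day")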

Performance Benchmarks

The following stress tests were conducted on two AWS EC2 instances:

m6i.xlarge - 4 vCPUs - 16 GiB memory - 3000 IOPS data disk (data volume separate from the OS volume)

  • 10 datasets x 50k tags each = 500k tags total (all tag data types were Double)
  • 100k TVQ writes per second
  • Used 80% CPU consistently
  • Used 105 GB of storage per day

 

m6i.2xlarge - 8 vCPUs - 32 GiB memory - 3000 IOPS data disk (data volume separate from the OS volume)

  • 10 datasets x 100k tags each = 1m tags total (all tag data types were Double)
  • 100k TVQ writes per second
  • Used 50% CPU consistently
  • Used 105 GB of storage per day