

离散化

Introduction¶

Discretization can be seen as a type of hash in essence, which ensures that the data still maintains the original full/partial order relationship after hashing.

In layman's terms, when some data is too large or the data type is not supported, causing the value itself unable to be array subscripts, and the answer is determined only by the relative size of the elements. We can use the original data according to the index from large to small to deal with the problem, that is, discretization.

The data used for discretization can be large integers, floating-point numbers, strings, etc.

Implementation¶

C++ discretization has ready-made STL algorithms:

Discretize array¶

Discretizing and querying an array is a common application scenario:

// a[i] is the initial array, and the subscript range is [1, n]
// len is the effective length of the discretized array
std::sort(a + 1, a + 1 + n);
len = std::unique(a + 1, a + n + 1) - a - 1;

// when discretizing the entire array, find the number of essentially different numbers after discretization.

After completing the above discretization, you can use the std::lower_bound function to find the rank (that is, the new index):

std::lower_bound(a + 1, a + len + 1, x) - a;  // query the corresponding index of x after discretization

Similarly, we can also discretize vector:

// std::vector<int> a, b; // b is a copy of a
std::sort(a.begin(), a.end());
a.erase(std::unique(a.begin(), a.end()), a.end());
for (int i = 0; i < n; ++i)
  b[i] = std::lower_bound(a.begin(), a.end(), b[i]) - a.begin();

本页面最近更新：，更新历史
发现错误？想一起完善？在 GitHub 上编辑此页！
本页面贡献者：GavinZhengOI
本页面的全部内容在 CC BY-SA 4.0 和 SATA 协议之条款下提供，附加条款亦可能应用

离散化

Introduction¶

Implementation¶

Discretize array¶

评论