Section 7
INFORMATIONAL AND COMPUTATIONAL SYSTEMS
Optimal lossless encoding of string and numeric data in arrays
M. P. Bakulina
Institute of Computational Mathematics and Mathematical Geophysics SB RAS
Novosibirsk State University
Email: bakulina@rav.sscc.ru
DOI 10.24412/cl‐35065‐2021‐1‐02‐21
The problem of optimal coding of various types of data in information arrays is considered. Effective compression
of such data not only reduces their physical size but also increases query execution speed and reduces the
amount of memory used. The most common data types in arrays are strings and numeric data. An encoding
algorithm is proposed that efficiently compresses both types of data. The "bitmap" method [1, 2] is used to
encode numeric data, and the Ziv‐Lempel compression method [3] is used to encode string data.
It is shown that the proposed method increases the compression ratio and the encoding and decoding rates
compared with previously known methods.
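The abstract does not give the details of the algorithm, so the following is only a minimal sketch of the two ingredients it names. It assumes that the "bitmap" method means suppressing a frequent constant (here zero) with a bit vector that marks the positions of the remaining values, and it uses DEFLATE from Python's `zlib` (an LZ77 variant combined with Huffman coding) as a stand-in for the Ziv‐Lempel method. The function names and the JSON serialization of the string column are illustrative assumptions, not part of the paper.

```python
import json
import zlib


def bitmap_encode(values, constant=0):
    """Suppress a frequent constant in a numeric column.

    Bit i of the bitmap is 1 where values[i] differs from `constant`;
    only those values are stored, in their original order.
    """
    bitmap = 0
    kept = []
    for i, v in enumerate(values):
        if v != constant:
            bitmap |= 1 << i
            kept.append(v)
    return len(values), bitmap, kept


def bitmap_decode(length, bitmap, kept, constant=0):
    """Rebuild the column from the bitmap and the stored values."""
    stored = iter(kept)
    return [next(stored) if (bitmap >> i) & 1 else constant
            for i in range(length)]


def encode_strings(strings):
    """Compress a string column losslessly.

    Assumption: DEFLATE (zlib) stands in for the Ziv-Lempel coder, and
    the column is serialized as JSON before compression.
    """
    return zlib.compress(json.dumps(strings).encode("utf-8"))


def decode_strings(blob):
    """Invert encode_strings."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))
```

Both encoders are lossless by construction: decoding the output of `bitmap_encode` or `encode_strings` returns the original column. The sketch makes no claim about the compression ratios or speeds reported in the abstract.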
References
1. Li J., Rotem D., Wong H. A New Compression Method with Fast Searching on Large Databases // Proc. 13th
International Conference on Very Large Data Bases. Brighton, 1987. P. 311‐318.
2. Eggers S., Shoshani A. Efficient Access of Compressed Data // Proc. VLDB. Montreal, Oct. 1980. P. 205.
3. Ziv J., Lempel A. A Universal Algorithm for Sequential Data Compression // IEEE Trans. Inform. Theory. 1977. V. 23,
N 3. P. 337‐343.
Integration of geophysical information resources
L. P. Braginskaya, A. P. Grigoruk
Institute of Computational Mathematics and Mathematical Geophysics SB RAS
Email: ludmila@opg.sscc.ru
DOI 10.24412/cl‐35065‐2021‐1‐02‐94
The paper proposes an ontological approach to the integration of information resources in geophysics. An ontology
can form the framework of a knowledge base, provide a basis for describing the basic concepts of the subject
area, and serve as a basis for integrating databases that contain the factual knowledge researchers need to work
effectively. This approach presents data sources to the user at a conceptual level and directs a user query to
several heterogeneous data sources.
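As a rough illustration of what directing one conceptual query to several heterogeneous sources can look like, the sketch below maps a single ontology concept onto two sources that use different field names. Everything in it is hypothetical: the concept name `SeismicEvent`, the source names, the field mappings, and the toy records are invented for the example and do not come from the paper.

```python
# Hypothetical ontology fragment: one concept bound to two data sources,
# each with its own names for the same conceptual attributes.
ONTOLOGY = {
    "SeismicEvent": [
        {"source": "catalog_a",
         "field_map": {"magnitude": "mag", "time": "origin_time"}},
        {"source": "catalog_b",
         "field_map": {"magnitude": "M", "time": "t0"}},
    ],
}

# Invented toy records standing in for heterogeneous databases.
SOURCES = {
    "catalog_a": [
        {"mag": 4.1, "origin_time": "2020-01-05"},
        {"mag": 2.3, "origin_time": "2020-02-11"},
    ],
    "catalog_b": [
        {"M": 5.0, "t0": "2019-07-30"},
    ],
}


def query(concept, min_magnitude):
    """Answer a concept-level query by visiting every bound source.

    Source-specific field names are translated back to the conceptual
    attribute names, so the caller never sees the source schemas.
    """
    results = []
    for binding in ONTOLOGY[concept]:
        fmap = binding["field_map"]
        for row in SOURCES[binding["source"]]:
            if row[fmap["magnitude"]] >= min_magnitude:
                results.append({
                    "source": binding["source"],
                    "magnitude": row[fmap["magnitude"]],
                    "time": row[fmap["time"]],
                })
    return results
```

A call such as `query("SeismicEvent", 4.0)` is phrased only in terms of the concept and its attributes; the per-source field names stay inside the mapping.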
This work was partially supported by the Russian Foundation for Basic Research (RFBR), grant 20‐07‐00861A.