
Data Types

The Virtual DataPort catalog includes a group of predefined data types. These types can be divided into two groups: basic types and compound types.

Virtual DataPort data types
<Virtual DataPort data type> ::=
       blob
     | boolean
     | date
     | decimal
     | double
     | float
     | int
     | intervaldaysecond
     | intervalyearmonth
     | localdate
     | long
     | text
     | time
     | timestamp
     | timestamptz
     | vector<double, <integer> >
     | vector<float, <integer> >
     | vector<int, <integer> >
     | vector<long, <integer> >
     | xml
     | <Virtual DataPort compound data type>

<Virtual DataPort compound data type> ::=
       array
     | register

The basic data types supported are:

  • int. Integer number. The maximum value is 2^31 - 1 (2,147,483,647) and the minimum is -2^31 (-2,147,483,648).

  • long. Long integer number. The maximum value is 2^63 - 1 (9,223,372,036,854,775,807) and the minimum is -2^63 (-9,223,372,036,854,775,808).

  • float. Single-precision 32-bit IEEE 754 floating point. Its range of values is explained in the section Floating-Point Types, Formats, and Values of the Java Language Specification.

  • double. Double-precision 64-bit IEEE 754 floating point. Its range of values is explained in the section Floating-Point Types, Formats, and Values of the Java Language Specification.

  • decimal. Signed decimal number with arbitrary precision.

  • boolean. Logical value: true, false, or unknown (null).

  • text. Character string.

  • date (deprecated). Timestamp with a time zone displacement. Maintained for compatibility reasons.

  • localdate. Date without a time zone (year, month, and day).

  • time. Time without a time zone (hour, minute, second, and millisecond).

  • timestamp. Timestamp without a time zone (year, month, day, hour, minute, second, and millisecond).

  • timestamptz. Timestamp with a time zone displacement.

  • intervaldaysecond. Duration of a period of time with a precision that can include any set of contiguous fields other than YEAR or MONTH.

  • intervalyearmonth. Duration of a period of time with a precision that includes a YEAR field, a MONTH field, or both.

  • blob. Binary value. These values cannot be used in query conditions.

  • vector<double, <integer> >. A sequence of double numbers with a specified length.

  • vector<float, <integer> >. A sequence of float numbers with a specified length.

  • vector<int, <integer> >. A sequence of int numbers with a specified length.

  • vector<long, <integer> >. A sequence of long numbers with a specified length.

  • xml. XML document or XML fragment.
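The integer ranges listed above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: it does not call any Virtual DataPort API, and the helper name smallest_integer_type is hypothetical. It assumes that a value outside the 64-bit range would be stored as decimal, since decimal is described as having arbitrary precision.

```python
# Illustrative only: plain Python arithmetic, not a Virtual DataPort API.
INT_MIN, INT_MAX = -2**31, 2**31 - 1      # range of the int type
LONG_MIN, LONG_MAX = -2**63, 2**63 - 1    # range of the long type


def smallest_integer_type(value: int) -> str:
    """Return the narrowest type named above that can hold value.

    Hypothetical helper: the fallback to "decimal" is an assumption based
    on decimal being an arbitrary-precision type.
    """
    if INT_MIN <= value <= INT_MAX:
        return "int"
    if LONG_MIN <= value <= LONG_MAX:
        return "long"
    return "decimal"


print(INT_MAX)                        # 2147483647
print(LONG_MAX)                       # 9223372036854775807
print(smallest_integer_type(2**31))   # long
```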

Datetime Types

The next section (Data Types for Dates, Timestamps and Intervals) explains the datetime types in detail.

Compound Types

In Virtual DataPort, you can define compound data types to model hierarchical data, such as data obtained from SOAP Web services or XML documents. The section Defining a Data Type explains how to define compound types.

The compound data types are:

  • register. Compound value with an internal, heterogeneous structure (i.e., its fields are not necessarily all of the same type).

  • array. List of elements of the same register type.
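As a rough analogy, a register behaves like a record with typed fields, and an array like a list whose elements all share one register type. The Python sketch below models that relationship. It is not VQL and does not use any Virtual DataPort API; the type and field names (PhoneRegister, CustomerRegister, kind, number, and so on) are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PhoneRegister:
    """Hypothetical register: two fields, both text."""
    kind: str
    number: str


@dataclass
class CustomerRegister:
    """Hypothetical register with heterogeneous fields.

    The phones field plays the role of an array: every element
    has the same register type (PhoneRegister).
    """
    name: str
    age: int
    phones: List[PhoneRegister]


customer = CustomerRegister(
    name="Ada",
    age=36,
    phones=[
        PhoneRegister(kind="home", number="555-0100"),
        PhoneRegister(kind="work", number="555-0101"),
    ],
)

# Navigate the hierarchy: register -> array -> register -> field.
print(customer.phones[1].number)  # 555-0101
```

Hierarchical sources such as XML documents map naturally onto this shape: nested elements become registers, and repeated elements become arrays.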

Vector Types

Virtual DataPort supports vector types designed to handle high-dimensional numeric data, which is essential for Machine Learning (ML) and Generative AI applications.

A vector is an ordered sequence of numbers. Simple vectors can represent coordinates in 2D or 3D space, but in data science vectors often consist of hundreds or thousands of numbers. Virtual DataPort supports vectors defined by a numeric element type and a dimension:

  • vector<double, <integer> >

  • vector<float, <integer> >

  • vector<int, <integer> >

  • vector<long, <integer> >

The dimension of a vector refers to the number of elements in the sequence. For example, vector<float,3> defines a vector of floats with a dimension of 3, such as [0.5, 0.1, 0.9].

Unlike some databases, which strictly enforce vector dimensions, Virtual DataPort does not require all rows in a column to share the same dimension. A single column of type vector<float> can technically store vectors of varying lengths.

However, you should ensure dimensions are consistent within a column. If dimensions vary:

  • Semantic comparisons and distance calculations will be meaningless or return NULL.

  • Underlying data sources may throw errors if they enforce strict dimension limits.
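One way to guard against these problems is to validate dimensions on the client side before loading or comparing vectors. The Python sketch below shows the idea under stated assumptions: the function names are hypothetical, a column is represented as a plain list of lists, and returning None on a dimension mismatch only mimics a NULL result. It is not a description of how Virtual DataPort functions behave.

```python
import math
from typing import List, Optional, Sequence


def column_dimensions(column: Sequence[Optional[Sequence[float]]]) -> List[int]:
    """Return the distinct vector lengths found in a column, sorted.

    None entries (NULL values) are skipped. A consistent column
    yields a list with exactly one element.
    """
    return sorted({len(vector) for vector in column if vector is not None})


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> Optional[float]:
    """Euclidean distance, or None when the dimensions differ.

    Returning None on a mismatch is an assumption made for this sketch.
    """
    if len(a) != len(b):
        return None
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


consistent = [[0.5, 0.1, 0.9], [0.2, 0.4, 0.6]]
mixed = [[0.5, 0.1, 0.9], [1.0, 2.0], None]

print(column_dimensions(consistent))  # [3]
print(column_dimensions(mixed))       # [2, 3]
print(euclidean_distance([0.0, 0.0, 0.0], [3.0, 4.0, 0.0]))  # 5.0
print(euclidean_distance(mixed[0], mixed[1]))                # None
```

A check like column_dimensions can run before data is loaded, so that a mixed-dimension column is detected early rather than surfacing later as a NULL distance or a data source error.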
