API Reference

A complete reference to all the SQL scalar functions, table functions, and virtual tables inside sqlite-vec.

WARNING

sqlite-vec is pre-v1, so expect breaking changes.

Constructors

SQL functions that "construct" vectors with different element types.

Currently, only float32, int8, and bit vectors are supported.

`vec_f32(vector)`

Creates a float vector from a BLOB or JSON text. If a BLOB is provided, the length must be divisible by 4, as a float takes up 4 bytes of space each.

The returned value is a BLOB with 4 bytes per element, with a special subtype of 223.

sql

select vec_f32('[.1, .2, .3, 4]');
-- X'CDCCCC3DCDCC4C3E9A99993E00008040'

select subtype(vec_f32('[.1, .2, .3, 4]'));
-- 223

select vec_f32(X'AABBCCDD');
-- X'AABBCCDD'

select vec_to_json(vec_f32(X'AABBCCDD'));
-- '[-1844071490169864000.000000]'

select vec_f32(X'AA');
-- ❌ invalid float32 vector BLOB length. Must be divisible by 4, found 1

`vec_int8(vector)`

Creates a 8-bit integer vector from a BLOB or JSON text. If a BLOB is provided, the length must be divisible by 4, as a float takes up 4 bytes of space each. If JSON text is provided, each element must be an integer between -128 and 127 inclusive.

The returned value is a BLOB with 1 byte per element, with a special subtype of 225.

sql

select vec_int8('[1, 2, 3, 4]');
-- X'01020304'

select subtype(vec_int8('[1, 2, 3, 4]'));
-- 225

select vec_int8(X'AABBCCDD');
-- X'AABBCCDD'

select vec_to_json(vec_int8(X'AABBCCDD'));
-- '[-86,-69,-52,-35]'

select vec_int8('[999]');
-- ❌ JSON parsing error: value out of range for int8

`vec_bit(vector)`

Creates a binary vector from a BLOB.

The returned value is a BLOB with 1 byte per 8 elements, with a special subtype of 224.

sql

select vec_bit(X'F0');
-- X'F0'

select subtype(vec_bit(X'F0'));
-- 224

select vec_to_json(vec_bit(X'F0'));
-- '[0,0,0,0,1,1,1,1]'

Operations

Different operations and utilities for working with vectors.

`vec_length(vector)`

Returns the number of elements in the given vector. The vector can be JSON, BLOB, or the result of a constructor function.

This function will return an error if vector is invalid.

sql

select vec_length('[.1, .2]');
-- 2

select vec_length(X'AABBCCDD');
-- 1

select vec_length(vec_int8(X'AABBCCDD'));
-- 4

select vec_length(vec_bit(X'AABBCCDD'));
-- 32

select vec_length(X'CCDD');
-- ❌ invalid float32 vector BLOB length. Must be divisible by 4, found 2

`vec_type(vector)`

Returns the name of the type of vector as text. One of 'float32', 'int8', or 'bit'.

This function will return an error if vector is invalid.

sql

select vec_type('[.1, .2]');
-- 'float32'

select vec_type(X'AABBCCDD');
-- 'float32'

select vec_type(vec_int8(X'AABBCCDD'));
-- 'int8'

select vec_type(vec_bit(X'AABBCCDD'));
-- 'bit'

select vec_type(X'CCDD');
-- ❌ invalid float32 vector BLOB length. Must be divisible by 4, found 2

`vec_add(a, b)`

Adds every element in vector a with vector b, returning a new vector c. Both vectors must be of the same type and same length. Only float32 and int8 vectors are supported.

An error is raised if either a or b are invalid, or if they are not the same type or same length.

`vec_sub(a, b)`

Subtracts every element in vector a with vector b, returning a new vector c. Both vectors must be of the same type and same length. Only float32 and int8 vectors are supported.

An error is raised if either a or b are invalid, or if they are not the same type or same length.

`vec_normalize(vector)`

Performs L2 normalization on the given vector. Only float32 vectors are currently supported.

Returns an error if the input is an invalid vector or not a float32 vector.

sql

select vec_normalize('[2, 3, 1, -4]');
-- X'BAF4BA3E8B370C3FBAF43A3EBAF43ABF'

select vec_to_json(
  vec_normalize('[2, 3, 1, -4]')
);
-- '[0.365148,0.547723,0.182574,-0.730297]'

-- for matryoshka embeddings - slice then normalize
select vec_to_json(
  vec_normalize(
    vec_slice('[2, 3, 1, -4]', 0, 2)
  )
);
-- '[0.554700,0.832050]'

`vec_slice(vector, start, end)`

Extract a subset of vector from the start element (inclusive) to the end element (exclusive). TODO check

This is especially useful for Matryoshka embeddings, also known as "adaptive length" embeddings. Use with vec_normalize() to get proper results.

Returns an error in the following conditions:

If vector is not a valid vector
If start is less than zero or greater than or equal to end
If end is greater than the length of vector, or less than or equal to start.
If vector is a bitvector, start and end must be divisible by 8.

sql

select vec_slice('[1, 2,3, 4]', 0, 2);
-- X'0000803F00000040'

select vec_to_json(
  vec_slice('[1, 2,3, 4]', 0, 2)
);
-- '[1.000000,2.000000]'

select vec_to_json(
  vec_slice('[1, 2,3, 4]', 2, 4)
);
-- '[3.000000,4.000000]'

select vec_to_json(
  vec_slice('[1, 2,3, 4]', -1, 4)
);
-- ❌ slice 'start' index must be a postive number.

select vec_to_json(
  vec_slice('[1, 2,3, 4]', 0, 5)
);
-- ❌ slice 'end' index is greater than the number of dimensions

select vec_to_json(
  vec_slice('[1, 2,3, 4]', 0, 0)
);
-- ❌ slice 'start' index is equal to the 'end' index, vectors must have non-zero length

`vec_to_json(vector)`

Represents a vector as JSON text. The input vector can be a vector BLOB or JSON text.

Returns an error if vector is an invalid vector, or when memory cannot be allocated.

sql

select vec_to_json(X'AABBCCDD');
-- '[-1844071490169864000.000000]'

select vec_to_json(vec_int8(X'AABBCCDD'));
-- '[-86,-69,-52,-35]'

select vec_to_json(vec_bit(X'AABBCCDD'));
-- '[0,1,0,1,0,1,0,1,1,1,0,1,1,1,0,1,0,0,1,1,0,0,1,1,1,0,1,1,1,0,1,1]'

select vec_to_json('[1,2,3,4]');
-- '[1.000000,2.000000,3.000000,4.000000]'

select vec_to_json('invalid');
-- ❌ JSON array parsing error: Input does not start with '['

`vec_each(vector)`

A table function to iterate through every element in a vector. One row id returned per element in a vector.

sql

CREATE TABLE vec_each(
  rowid int,    -- The
  vector HIDDEN -- input parameter: A well-formed vector value
)

Returns an error if vector is not a valid vector.

sql

select rowid, value from vec_each('[1,2,3,4]');
/*
┌───────┬───────┐
│ rowid │ value │
├───────┼───────┤
│ 0     │ 1     │
├───────┼───────┤
│ 1     │ 2     │
├───────┼───────┤
│ 2     │ 3     │
├───────┼───────┤
│ 3     │ 4     │
└───────┴───────┘

*/


select rowid, value from vec_each(X'AABBCCDD00112233');
/*
┌───────┬──────────────────────┐
│ rowid │ value                │
├───────┼──────────────────────┤
│ 0     │ -1844071490169864200 │
├───────┼──────────────────────┤
│ 1     │ 3.773402568185702e-8 │
└───────┴──────────────────────┘

*/


select rowid, value from vec_each(vec_int8(X'AABBCCDD'));
/*
┌───────┬───────┐
│ rowid │ value │
├───────┼───────┤
│ 0     │ -86   │
├───────┼───────┤
│ 1     │ -69   │
├───────┼───────┤
│ 2     │ -52   │
├───────┼───────┤
│ 3     │ -35   │
└───────┴───────┘

*/


select rowid, value from vec_each(vec_bit(X'F0'));
/*
┌───────┬───────┐
│ rowid │ value │
├───────┼───────┤
│ 0     │ 1     │
├───────┼───────┤
│ 1     │ 1     │
├───────┼───────┤
│ 2     │ 1     │
├───────┼───────┤
│ 3     │ 1     │
├───────┼───────┤
│ 4     │ 0     │
├───────┼───────┤
│ 5     │ 0     │
├───────┼───────┤
│ 6     │ 0     │
├───────┼───────┤
│ 7     │ 0     │
└───────┴───────┘

*/

Distance functions

Various algorithms to calculate distance between two vectors.

`vec_distance_L2(a, b)`

Calculates the L2 euclidian distance between vectors a and b. Only valid for float32 or int8 vectors.

Returns an error under the following conditions:

a or b are invalid vectors
a or b do not share the same vector element types (ex float32 or int8)
a or b are bit vectors. Use vec_distance_hamming() for distance calculations between two bitvectors.
a or b do not have the same length.

sql

select vec_distance_L2('[1, 1]', '[2, 2]');
-- 1.4142135381698608

select vec_distance_L2('[1, 1]', '[-2, -2]');
-- 4.242640495300293

select vec_distance_L2('[1.1, 2.2, 3.3]', '[4.4, 5.5, 6.6]');
-- 5.7157673835754395

select vec_distance_L2(X'AABBCCDD', X'00112233');
-- 1844071490169864200

select vec_distance_L2('[1, 1]', vec_int8('[2, 2]'));
-- ❌ Vector type mistmatch. First vector has type float32, while the second has type int8.

select vec_distance_L2(vec_bit(X'AA'), vec_bit(X'BB'));
-- ❌ Cannot calculate L2 distance between two bitvectors.

`vec_distance_cosine(a, b)`

Calculates the cosine distance between vectors a and b. Only valid for float32 or int8 vectors.

Returns an error under the following conditions:

a or b are invalid vectors
a or b do not share the same vector element types (ex float32 or int8)
a or b are bit vectors. Use vec_distance_hamming() for distance calculations between two bitvectors.
a or b do not have the same length.

sql

select vec_distance_cosine('[1, 1]', '[2, 2]');
-- 2.220446049250313e-16

select vec_distance_cosine('[1, 1]', '[-2, -2]');
-- 2

select vec_distance_cosine('[1.1, 2.2, 3.3]', '[4.4, 5.5, 6.6]');
-- 0.02536807395517826

select vec_distance_cosine(X'AABBCCDD', X'00112233');
-- 2

select vec_distance_cosine('[1, 1]', vec_int8('[2, 2]'));
-- ❌ Vector type mistmatch. First vector has type float32, while the second has type int8.

select vec_distance_cosine(vec_bit(X'AA'), vec_bit(X'BB'));
-- ❌ Cannot calculate cosine distance between two bitvectors.

`vec_distance_hamming(a, b)`

Calculates the hamming distance between two bitvectors a and b. Only valid for bitvectors.

Returns an error under the following conditions:

a or b are not bitvectors
a and b do not share the same length
Memory cannot be allocated

sql

select vec_distance_hamming(vec_bit(X'00'), vec_bit(X'FF'));
-- 8

select vec_distance_hamming(vec_bit(X'FF'), vec_bit(X'FF'));
-- 0

select vec_distance_hamming(vec_bit(X'F0'), vec_bit(X'44'));
-- 4

select vec_distance_hamming('[1, 1]', '[0, 0]');
-- ❌ Cannot calculate hamming distance between two float32 vectors.

Quantization

Various techniques to "compress" a vector by reducing precision and accuracy.

`vec_quantize_binary(vector)`

Quantize a float32 or int8 vector into a bitvector. For every element in the vector, a 1 is assigned to positive numbers and a 0 is assigned to negative numbers. These values are then packed into a bit vector.

Returns an error if vector is invalid, or if vector is not a float32 or int8 vector.

sql

select vec_quantize_binary('[1, 2, 3, 4, 5, 6, 7, 8]');
-- X'FF'

select vec_quantize_binary('[1, 2, 3, 4, -5, -6, -7, -8]');
-- X'0F'

select vec_quantize_binary('[-1, -2, -3, -4, -5, -6, -7, -8]');
-- X'00'

select vec_quantize_binary('[-1, -2, -3, -4, -5, -6, -7, -8]');
-- X'00'

select vec_quantize_binary(vec_int8(X'11223344'));
-- ❌ Binary quantization requires vectors with a length divisible by 8

select vec_quantize_binary(vec_bit(X'FF'));
-- ❌ Can only binary quantize float or int8 vectors

`vec_quantize_i8(vector, [start], [end])`

sql

select 'todo';
-- 'todo'

NumPy Utilities

Functions to read data from or work with NumPy arrays.

`vec_npy_each(vector)`

xxx

sql

-- db.execute('select quote(?)', [to_npy(np.array([[1.0], [2.0], [3.0]], dtype=np.float32))]).fetchone()
select
  rowid,
  vector,
  vec_type(vector),
  vec_to_json(vector)
from vec_npy_each(
  X'934E554D5059010076007B276465736372273A20273C6634272C2027666F727472616E5F6F72646572273A2046616C73652C20277368617065273A2028332C2031292C207D202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020200A0000803F0000004000004040'
)
/*
┌───────┬─────────────┬──────────────────┬─────────────────────┐
│ rowid │ vector      │ vec_type(vector) │ vec_to_json(vector) │
├───────┼─────────────┼──────────────────┼─────────────────────┤
│ 0     │ X'0000803F' │ 'float32'        │ '[1.000000]'        │
├───────┼─────────────┼──────────────────┼─────────────────────┤
│ 1     │ X'00000040' │ 'float32'        │ '[2.000000]'        │
├───────┼─────────────┼──────────────────┼─────────────────────┤
│ 2     │ X'00004040' │ 'float32'        │ '[3.000000]'        │
└───────┴─────────────┴──────────────────┴─────────────────────┘

*/


-- db.execute('select quote(?)', [to_npy(np.array([[1.0], [2.0], [3.0]], dtype=np.float32))]).fetchone()
select
  rowid,
  vector,
  vec_type(vector),
  vec_to_json(vector)
from vec_npy_each(
  X'934E554D5059010076007B276465736372273A20273C6634272C2027666F727472616E5F6F72646572273A2046616C73652C20277368617065273A2028332C2031292C207D202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020200A0000803F0000004000004040'
)
/*
┌───────┬─────────────┬──────────────────┬─────────────────────┐
│ rowid │ vector      │ vec_type(vector) │ vec_to_json(vector) │
├───────┼─────────────┼──────────────────┼─────────────────────┤
│ 0     │ X'0000803F' │ 'float32'        │ '[1.000000]'        │
├───────┼─────────────┼──────────────────┼─────────────────────┤
│ 1     │ X'00000040' │ 'float32'        │ '[2.000000]'        │
├───────┼─────────────┼──────────────────┼─────────────────────┤
│ 2     │ X'00004040' │ 'float32'        │ '[3.000000]'        │
└───────┴─────────────┴──────────────────┴─────────────────────┘

*/

Entrypoints

All the named entrypoints that load in different sqlite-vec functions and options.

API Reference

Constructors

`vec_f32(vector)`

`vec_int8(vector)`

`vec_bit(vector)`

Operations

`vec_length(vector)`

`vec_type(vector)`

`vec_add(a, b)`

`vec_sub(a, b)`

`vec_normalize(vector)`

`vec_slice(vector, start, end)`

`vec_to_json(vector)`

`vec_each(vector)`

Distance functions

`vec_distance_L2(a, b)`

`vec_distance_cosine(a, b)`

`vec_distance_hamming(a, b)`

Quantization

`vec_quantize_binary(vector)`

`vec_quantize_i8(vector, [start], [end])`

NumPy Utilities

`vec_npy_each(vector)`

Meta

`vec_version()`

`vec_debug()`

Entrypoints

Mozilla Builders

Fly.io

Turso

SQLite Cloud

Dcnvm Spark

API Reference ​

Constructors ​

vec_f32(vector) ​

vec_int8(vector) ​

vec_bit(vector) ​

Operations ​

vec_length(vector) ​

vec_type(vector) ​

vec_add(a, b) ​

vec_sub(a, b) ​

vec_normalize(vector) ​

vec_slice(vector, start, end) ​

vec_to_json(vector) ​

vec_each(vector) ​

Distance functions ​

vec_distance_L2(a, b) ​

vec_distance_cosine(a, b) ​

vec_distance_hamming(a, b) ​

Quantization ​

vec_quantize_binary(vector) ​

vec_quantize_i8(vector, [start], [end]) ​

NumPy Utilities ​

vec_npy_each(vector) ​

Meta ​

vec_version() ​

vec_debug() ​

Entrypoints ​

API Reference

Constructors

`vec_f32(vector)`

`vec_int8(vector)`

`vec_bit(vector)`

Operations

`vec_length(vector)`

`vec_type(vector)`

`vec_add(a, b)`

`vec_sub(a, b)`

`vec_normalize(vector)`

`vec_slice(vector, start, end)`

`vec_to_json(vector)`

`vec_each(vector)`

Distance functions

`vec_distance_L2(a, b)`

`vec_distance_cosine(a, b)`

`vec_distance_hamming(a, b)`

Quantization

`vec_quantize_binary(vector)`

`vec_quantize_i8(vector, [start], [end])`

NumPy Utilities

`vec_npy_each(vector)`

Meta

`vec_version()`

`vec_debug()`

Entrypoints