Configuration

Cayley can be configured using configuration file written in YAML / JSON or by passing flags to the command line. By default. All command line flags take precedence over the configuration file.

Recommended Configuration

By default, Cayley is using the memstore store. memstore works best for datasets that can fit into the memory of the machine and workloads which doesn't require persistency. For large datasets and/or workloads with require persistency it is recommended to use the bolt store.

Configuration Options

Store

store.backend

  • Type: String

  • Default: "memory"

Determines the type of the underlying database. Options include:

  • memstore: An in-memory store, based on an initial N-Quads file. Loses all changes when the process exits.

Key-Value backends

  • btree: An in-memory store, used mostly to quickly verify KV backend functionality.

  • leveldb: A persistent on-disk store backed by LevelDB.

  • bolt: Stores the graph data on-disk in a Bolt file. Uses more disk space and memory than LevelDB for smaller stores, but is often faster to write to and comparable for large ones, with faster average query times.

NoSQL backends

Slower, as it incurs network traffic, but multiple Cayley instances can disappear and reconnect at will, across a potentially horizontally-scaled store.

  • mongo: Stores the graph data and indices in a MongoDB instance.

  • elastic: Stores the graph data and indices in a ElasticSearch instance.

  • couch: Stores the graph data and indices in a CouchDB instance.

  • pouch: Stores the graph data and indices in a PouchDB. Requires building with GopherJS.

SQL backends

  • postgres: Stores the graph data and indices in a PostgreSQL instance.

  • cockroach: Stores the graph data and indices in a CockroachDB cluster.

  • mysql: Stores the graph data and indices in a MySQL or MariaDB instance.

  • sqlite: Stores the graph data and indices in a SQLite database.

store.address

  • Type: String

  • Default: ""

  • Alias: store.path

Where does the database actually live? Dependent on the type of database. For each datastore:

  • memstore: Path parameter is not supported.

  • leveldb: Directory to hold the LevelDB database files.

  • bolt: Path to the persistent single Bolt database file.

  • mongo: "hostname:port" of the desired MongoDB server. More options can be provided in mgo address format.

  • elastic: http://host:port of the desired ElasticSearch server.

  • couch: http://user:[email protected]:port/dbname of the desired CouchDB server.

  • postgres,cockroach: postgres://[username:[email protected]]host[:port]/database-name?sslmode=disable of the PostgreSQL database and credentials. Sslmode is optional. More option available on pq page.

  • mysql: [username:[email protected]]tcp(host[:3306])/database-name of the MqSQL database and credentials. More option available on driver page.

  • sqlite: filepath of the SQLite database. More options available on driver page.

store.read_only

  • Type: Boolean

  • Default: false

If true, disables the ability to write to the database using the HTTP API (will return a 400 for any write request). Useful for testing or instances that shouldn't change.

store.options

  • Type: Object

See Per-Database Options, below.

Per-Store Options

The store.options object in the main configuration file contains any of these following options that change the behavior of the datastore.

Memory

No special options.

LevelDB

write_buffer_mb

  • Type: Integer

  • Default: 20

The size in MiB of the LevelDB write cache. Increasing this number allows for more/faster writes before syncing to disk. Default is 20, for large loads, a recommended value is 200+.

cache_size_mb

  • Type: Integer

  • Default: 2

The size in MiB of the LevelDB block cache. Increasing this number uses more memory to maintain a bigger cache of quad blocks for better performance.

Bolt

nosync

  • Type: Boolean

  • Default: false

Optionally disable syncing to disk per transaction. Nosync being true means much faster load times, but without consistency guarantees.

Mongo

database_name

  • Type: String

  • Default: "cayley"

The name of the database within MongoDB to connect to. Manages its own collections and indices therein.

PostgreSQL

Postgres version 9.5 or greater is required.

db_fill_factor

  • Type: Integer

  • Default: 50

Amount of empty space as a percentage to leave in the database when creating a table and inserting rows. See PostgreSQL CreateTable.

local_optimize

  • Type: Boolean

  • Default: true

Whether to skip checking quad store size.

Connection pooling options used to configure the Go sql connection. Go defaults will be used when not specified.

maxopenconnections

  • Type: Integer

  • Default: -1.

maxidleconnections

  • Type: Integer

  • Default: -1.

connmaxlifetime

  • Type: String

  • Default: "".

Per-Replication Options

The replication_options object in the main configuration file contains any of these following options that change the behavior of the replication manager.

Query

timeout

  • Type: Integer or String

  • Default: 30

The maximum length of time the Javascript runtime should run until cancelling the query and returning a 408 Timeout. When timeout is an integer is is interpreted as seconds, when it is a string it is parsed as a Go time.Duration. A negative duration means no limit.

Load

load.ignore_missing

  • Type: Boolean

  • Default: false

Optionally ignore missing quad on delete.

load.ignore_duplicate

  • Type: Boolean

  • Default: false

Optionally ignore duplicated quad on add.

load.batch

  • Type: Integer

  • Default: 10000

The number of quads to buffer from a loaded file before writing a block of quads to the database. Larger numbers are good for larger loads.

Configuration File Location

Cayley looks in the following locations for the configuration file (named cayley.yml or cayley.json):

  • Command line flag

  • The environment variable $CAYLEY_CFG

  • Current directory

  • $HOME/.cayley/

  • /etc/