Version: v25.1

Live import

Live Loader imports data into a running Dgraph cluster using the dgraph live command. Unlike Bulk Loader, Live Loader can import data into an existing database with prior data and supports upserts for updating existing nodes.

Use Live Loader when:

  • Importing data into a running cluster
  • Updating or adding data to existing nodes
  • Loading smaller datasets (for large initial loads, consider Bulk Loader)

Prerequisites

Before importing, ensure you have:

  • A running Dgraph cluster
  • Data files in RDF (.rdf, .rdf.gz) or JSON (.json, .json.gz) format
  • A schema file (optional but recommended)

Note: Live Loader accepts RDF N-Quad/Triple data or JSON, either plain or gzipped. See data migration for converting other formats.
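
As a rough illustration of the accepted formats, the sketch below writes the same small graph as RDF and as JSON, then gzips both. The predicate names (name, friend) and the blank-node labels are made up for the example and are not taken from any real dataset.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)

# RDF form: one triple per line, each terminated by " ."
# _:alice and _:bob are blank nodes; UIDs are assigned when the data is loaded.
cat > "$workdir/data.rdf" <<'EOF'
_:alice <name> "Alice Smith" .
_:bob <name> "Bob Jones" .
_:alice <friend> _:bob .
EOF

# JSON form of the same data: an array of objects keyed by predicate,
# with the friend edge expressed as a nested object.
cat > "$workdir/data.json" <<'EOF'
[
  {
    "uid": "_:alice",
    "name": "Alice Smith",
    "friend": [{ "uid": "_:bob", "name": "Bob Jones" }]
  }
]
EOF

# Gzipped copies; -c writes to stdout so the plain files are kept.
gzip -c "$workdir/data.rdf"  > "$workdir/data.rdf.gz"
gzip -c "$workdir/data.json" > "$workdir/data.json.gz"

ls "$workdir"
```

Any of the four resulting files could then be passed to --files.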

Quick Start

dgraph live --files ./data.rdf.gz --schema ./schema.txt --alpha localhost:9080

Basic Usage

dgraph live \
--files <path-to-data> \
--schema <path-to-schema> \
--alpha localhost:9080

Key options:

  • --alpha: Dgraph Alpha gRPC endpoint (default: localhost:9080). Specify multiple comma-separated addresses to distribute load.
  • --files: Path to a data file or directory. When a directory is specified, all .rdf, .rdf.gz, .json, and .json.gz files in it are loaded.
  • --schema: Path to the schema file. Give it an extension that differs from the data-file extensions, such as .txt or .schema.
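
To preview which files a directory passed to --files would contribute, a small helper can apply the same extension filter described above. This is only an approximation written for illustration: the function name list_loadable_files is hypothetical, and the sketch assumes only the top level of the directory is considered, which may not match Live Loader's actual behavior.

```shell
# Print the files in a directory whose names end in one of the
# extensions Live Loader is documented to pick up.
# Assumption: top-level files only (no recursion).
list_loadable_files() {
  dir=$1
  for f in "$dir"/*; do
    [ -f "$f" ] || continue          # skip subdirectories and unmatched globs
    case "$f" in
      *.rdf|*.rdf.gz|*.json|*.json.gz) printf '%s\n' "$f" ;;
    esac
  done
}

# Usage (the directory name is a placeholder):
#   list_loadable_files ./export
```

A schema saved as schema.txt in the same directory does not match the filter, so it is not listed as data.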

Upserts: Update Existing Data

Live Loader can update existing nodes using upserts. Use one of these approaches:

Using --upsertPredicate

Specify a predicate that serves as a unique identifier:

dgraph live \
--files ./data.rdf.gz \
--schema ./schema.txt \
--upsertPredicate xid

The upsert predicate must exist in the schema and be indexed.

If you are using xid as the upsert predicate name, make sure your schema contains:

<xid>: string @index(exact) @upsert .

Example: If your data contains:

<urn:uuid:550e8400-e29b-41d4-a716-446655440000> <http://xmlns.com/foaf/0.1/name> "Alice Smith" .

This creates or updates the node where xid = "urn:uuid:550e8400-e29b-41d4-a716-446655440000" and sets its <http://xmlns.com/foaf/0.1/name> predicate to "Alice Smith".
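
The schema requirement above can be checked before starting an import. The sketch below is a plain text check on the schema file, not a Dgraph feature: check_upsert_predicate is a hypothetical helper, it assumes each predicate is declared on a single line, and it treats a missing @upsert directive as a warning because the text only states that the predicate must exist and be indexed.

```shell
# Return 0 if the schema file declares the predicate with an @index
# directive; print a warning if @upsert is absent.
check_upsert_predicate() {
  schema_file=$1
  pred=$2
  # First line declaring the predicate, written as "xid:" or "<xid>:".
  line=$(grep -E "^[[:space:]]*<?${pred}>?[[:space:]]*:" "$schema_file" | head -n 1)
  if [ -z "$line" ]; then
    echo "predicate '$pred' is not declared in $schema_file" >&2
    return 1
  fi
  case "$line" in
    *"@index("*) ;;
    *) echo "predicate '$pred' has no @index directive" >&2; return 1 ;;
  esac
  case "$line" in
    *"@upsert"*) ;;
    *) echo "warning: predicate '$pred' lacks @upsert" >&2 ;;
  esac
  return 0
}

# Usage:
#   check_upsert_predicate ./schema.txt xid && echo "schema looks ready"
```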

Using --xidmap

Store UID mappings in a local directory for consistent imports:

dgraph live \
--files ./data.rdf.gz \
--schema ./schema.txt \
--xidmap ./xid-directory

Live Loader looks up existing UIDs or stores new mappings in this directory.

Loading from Cloud Storage

Amazon S3

Set credentials via environment variables or use IAM roles:

Environment Variable     Description
AWS_ACCESS_KEY_ID        AWS access key with S3 read permissions
AWS_SECRET_ACCESS_KEY    AWS secret key

# Short form (note triple slash)
dgraph live \
--files s3:///<bucket>/<path> \
--schema s3:///<bucket>/<path>/schema.txt

# Long form
dgraph live \
--files s3://s3.<region>.amazonaws.com/<bucket>/<path> \
--schema s3://s3.<region>.amazonaws.com/<bucket>/<path>/schema.txt
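
The two URL forms differ only in whether the regional endpoint host is spelled out. A minimal sketch of that rule follows; s3_url is a hypothetical helper, and the bucket, region, and object names in the usage comments are placeholders.

```shell
# Build an S3 URL in the short form (empty region, hence the triple
# slash) or the long form (explicit regional endpoint).
s3_url() {
  region=$1
  bucket=$2
  object_path=$3
  if [ -z "$region" ]; then
    printf 's3:///%s/%s\n' "$bucket" "$object_path"
  else
    printf 's3://s3.%s.amazonaws.com/%s/%s\n' "$region" "$bucket" "$object_path"
  fi
}

# Usage:
#   s3_url "" my-bucket exports/data.rdf.gz
#     prints s3:///my-bucket/exports/data.rdf.gz
#   s3_url us-east-1 my-bucket exports/data.rdf.gz
#     prints s3://s3.us-east-1.amazonaws.com/my-bucket/exports/data.rdf.gz
```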

IAM Setup

Instead of credentials, configure IAM:

  1. Create an IAM Role with S3 access
  2. Attach it using:

MinIO

Environment Variable     Description
MINIO_ACCESS_KEY         MinIO access key
MINIO_SECRET_KEY         MinIO secret key

dgraph live \
--files minio://<server>:<port>/<bucket>/<path> \
--schema minio://<server>:<port>/<bucket>/<path>/schema.txt
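
Since the credentials come from the environment, it can help to confirm they are set before starting a long import. The helper below is a generic shell sketch, not part of Dgraph; require_env is a hypothetical name, and the same check works for the AWS variables.

```shell
# Return non-zero and name each variable that is unset or empty.
require_env() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"         # indirect lookup of the variable named in $name
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
#   require_env MINIO_ACCESS_KEY MINIO_SECRET_KEY || exit 1
#   require_env AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY || exit 1
```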

Multi-tenancy

When ACL is enabled, provide credentials with --creds. By default, data loads into the user's namespace.

dgraph live \
--files ./data.rdf.gz \
--schema ./schema.txt \
--creds "user=groot;password=password;namespace=0"

Loading into a Specific Namespace

Guardians of the Galaxy can load data into any namespace using --force-namespace:

# Load into namespace 123
dgraph live \
--files ./data.rdf.gz \
--schema ./schema.txt \
--creds "user=groot;password=password;namespace=0" \
--force-namespace 123

To preserve namespaces from export files, use --force-namespace -1:

dgraph live \
--files ./data.rdf.gz \
--schema ./schema.txt \
--creds "user=groot;password=password;namespace=0" \
--force-namespace -1

Note: The target namespace must exist before loading data.
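
The two uses of --force-namespace shown above take either a namespace ID or -1. A wrapper script might validate that value before invoking the loader. The sketch below assumes namespace IDs are non-negative integers, and namespace_args is a hypothetical helper; it does not check that the namespace exists.

```shell
# Print the --force-namespace argument for a namespace ID, or for -1
# (preserve namespaces from the export). Reject anything else.
namespace_args() {
  ns=$1
  case "$ns" in
    -1) ;;                                        # preserve namespaces from export files
    ''|*[!0-9]*) echo "invalid namespace: $ns" >&2; return 1 ;;
  esac
  printf '%s\n' "--force-namespace $ns"
}

# Usage:
#   namespace_args 123    prints --force-namespace 123
#   namespace_args -1     prints --force-namespace -1
```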

Encrypted Data

To load encrypted export files, provide the decryption key:

# Using key file
dgraph live \
--files ./encrypted-data.rdf.gz \
--schema ./encrypted-schema.txt \
--encryption key-file=./encryption.key

# Using HashiCorp Vault
dgraph live \
--files ./encrypted-data.rdf.gz \
--schema ./encrypted-schema.txt \
--vault "addr=http://localhost:8200;enc-field=enc_key;enc-format=raw;path=secret/data/dgraph/alpha;role-id-file=./role_id;secret-id-file=./secret_id"

Note: Encrypted exports can be imported into unencrypted Dgraph instances. The p directory will only be encrypted if the Alpha has encryption enabled.

CLI Options Reference

Flag                     Default          Description
--files, -f              (none)           Data file or directory path
--schema, -s             (none)           Schema file path
--alpha, -a              localhost:9080   Dgraph Alpha gRPC address(es)
--batch, -b              1000             N-Quads per mutation batch
--conc, -c               10               Concurrent requests to Dgraph
--upsertPredicate, -U    (none)           Predicate for upsert matching
--xidmap, -x             (none)           Directory for UID mappings
--new_uids               false            Assign new UIDs instead of preserving existing
--format                 (none)           Force format (rdf or json)
--use_compression, -C    false            Enable gRPC compression
--creds                  (none)           ACL credentials (user=;password=;namespace=)
--force-namespace        (none)           Load into specific namespace (Guardian only)
--encryption             (none)           Encryption key file for decryption
--vault                  (none)           Vault configuration for encryption key
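
When experimenting with --batch and --conc, it can be convenient to print the full command before running it. The sketch below only assembles and prints the command line, using the defaults from the table; build_live_cmd and the LIVE_ALPHA, LIVE_BATCH, and LIVE_CONC variables are hypothetical names introduced for this example.

```shell
# Print a dgraph live command line, filling in the documented defaults
# unless LIVE_ALPHA, LIVE_BATCH, or LIVE_CONC override them.
# This is a dry run: nothing is executed, and paths are printed unquoted.
build_live_cmd() {
  files=$1
  schema=$2
  printf 'dgraph live --files %s --schema %s --alpha %s --batch %s --conc %s\n' \
    "$files" "$schema" \
    "${LIVE_ALPHA:-localhost:9080}" "${LIVE_BATCH:-1000}" "${LIVE_CONC:-10}"
}

# Usage:
#   build_live_cmd ./data.rdf.gz ./schema.txt
#   LIVE_BATCH=500 LIVE_CONC=4 build_live_cmd ./data.rdf.gz ./schema.txt
```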

See dgraph live CLI reference for the complete list of options.