What Morphium Offers

storeList(), cursor-based iteration, and the @WriteBuffer annotation handle high-volume data ingestion efficiently. Morphium batches writes automatically, manages memory through configurable buffer sizes, and provides sequences for generating unique IDs across distributed instances.

The Challenge

Bulk operations in MongoDB require careful batching to avoid memory exhaustion and connection timeouts. Without write buffering, each insert is a separate round-trip. Without cursor-based iteration, reading large collections loads everything into memory.
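
To make the round-trip argument concrete, here is a minimal plain-Java sketch. It does not use Morphium or MongoDB; BatchingSketch and partition are hypothetical names introduced only for illustration. It splits a list into fixed-size chunks and compares the number of sends each approach would need.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: models "one send per batch" without touching a database.
final class BatchingSketch {

    // Splits items into consecutive chunks of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(from, to)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> records = new ArrayList<>();
        for (int i = 0; i < 1_250; i++) {
            records.add(i);
        }
        // One send per record versus one send per batch of 500.
        System.out.println("individual sends: " + records.size());
        System.out.println("batched sends:    " + partition(records, 500).size());
    }
}
```

With 1,250 records and a batch size of 500, the batched approach needs three sends instead of 1,250, and each batch bounds how much data is held in flight at once.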

Morphium Features Used

  • storeList (Batch): morphium.storeList(records) sends all entities to MongoDB in a single bulk write operation. Dramatically faster than individual store() calls because it requires only one network round-trip. Bypasses @WriteBuffer and writes directly.
  • @WriteBuffer: Buffers individual store() calls in memory and flushes them in batches. Parameters: size (max buffered ops), strategy (WRITE_NEW, WRITE_OLD, DEL_OLD, IGNORE_NEW, JUST_WARN, WAIT), timeout (max ms before flush). WAIT is the default strategy. Import: de.caluga.morphium.annotations.caching.WriteBuffer
  • @AutoSequence: Automatically assigns the next value from a named MongoDB-backed sequence when the field is null at store time. Uses atomic findAndModify for thread-safe, distributed counters. For storeList(), calls getNextBatch(n): a single logical batch call (5 MongoDB round-trips internally: lock, read, increment, re-read, unlock) regardless of batch size.
  • push (array add): morphium.push(query, field, value) appends a value to an array field using MongoDB's $push operator. The document is never loaded; the update happens entirely on the server. Efficient for managing list-valued fields.
  • pull (array remove): morphium.pull(query, field, value) removes all occurrences of a value from an array field using MongoDB's $pull operator. Like push(), this operates directly on the server without loading the document.
  • set (atomic): query.set(field, value, upsert, multiple, AsyncOperationCallback) updates a single field using MongoDB's $set operator. Only the specified field is modified on the server; all other fields remain untouched.
  • unset (field remove): query.unset(field) completely removes a field from the BSON document using MongoDB's $unset operator. Different from set(field, null), which stores a null value: unset removes the key entirely.
  • MorphiumIterator: Iterates over large result sets without loading all documents into memory at once. Uses server-side cursors with configurable batch sizes. Ideal for processing millions of documents.

Each feature above is tagged MongoDB, Atlas, CosmosDB.
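
The idea behind iterating without loading everything can be sketched in plain Java. The example below is an assumption-laden illustration, not MorphiumIterator itself: it models the data source as a hypothetical offset/limit function rather than a real server-side cursor, and WindowedIterator is an invented name. What it shows is that only one window of elements is held in memory at a time.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.BiFunction;

// Illustrative only: a windowed iterator over a paged data source.
// fetchPage is a hypothetical (offset, limit) -> page function, not a Morphium API.
final class WindowedIterator<T> implements Iterator<T> {
    private final BiFunction<Integer, Integer, List<T>> fetchPage;
    private final int windowSize;
    private List<T> window = new ArrayList<>();
    private int posInWindow = 0;
    private int offset = 0;
    private boolean exhausted = false;

    WindowedIterator(BiFunction<Integer, Integer, List<T>> fetchPage, int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.fetchPage = fetchPage;
        this.windowSize = windowSize;
    }

    @Override
    public boolean hasNext() {
        if (posInWindow < window.size()) {
            return true;               // still elements left in the current window
        }
        if (exhausted) {
            return false;              // a short page was already seen
        }
        window = fetchPage.apply(offset, windowSize);   // load the next window only
        offset += window.size();
        posInWindow = 0;
        if (window.size() < windowSize) {
            exhausted = true;          // short page: nothing more to fetch afterwards
        }
        return !window.isEmpty();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return window.get(posInWindow++);
    }
}
```

The window size plays the same role as a configurable batch size: a larger value means fewer fetches but more memory per fetch.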

Prerequisites & Key Concepts

  • @WriteBuffer buffers writes client-side. In this showcase, the entity uses size=500 and strategy=WRITE_NEW: when the buffer is full and a new write arrives, the newest write is sent immediately while older buffered writes wait. Writes are also flushed after timeout milliseconds regardless of buffer fill level.
  • @AutoSequence uses a separate MongoDB collection to store atomic counters. Each sequence is identified by name (e.g. "import_number"). For storeList(), Morphium calls getNextBatch(count) once: a single lock, increment, unlock cycle regardless of how many records are in the list.
  • push / pull modify array fields directly on the MongoDB server using the $push and $pull operators. The document is never loaded into Java memory, making these operations efficient and concurrency-safe.
  • set / unset are server-side atomic operations. set changes a field's value; unset removes the field entirely from the document (not the same as setting it to null).
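
The buffering behaviour described in the first bullet can be modelled in a few lines of plain Java. This is a sketch under stated assumptions, not Morphium's implementation: WriteBufferSketch, sink, and pending are hypothetical names, the sink stands in for a batch write to the database, and the timeout is represented by an explicit flush() call rather than a timer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Illustrative only: an in-memory model of a size-capped write buffer.
// This is NOT Morphium's implementation; all names here are hypothetical.
final class WriteBufferSketch<T> {
    private final int size;                 // plays the role of @WriteBuffer(size = ...)
    private final Consumer<List<T>> sink;   // stands in for "send this batch to the database"
    private final List<T> buffer = new ArrayList<>();

    WriteBufferSketch(int size, Consumer<List<T>> sink) {
        this.size = size;
        this.sink = sink;
    }

    // Buffers the write, or sends it straight through when the buffer is already
    // full (the WRITE_NEW behaviour as the text above describes it).
    synchronized void store(T item) {
        if (buffer.size() >= size) {
            sink.accept(List.of(item));
            return;
        }
        buffer.add(item);
    }

    // Sends everything buffered so far as one batch. A real buffer would also
    // trigger this from a timer once the timeout has elapsed.
    synchronized void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(buffer));
        buffer.clear();
    }

    // Number of writes currently waiting in the buffer.
    synchronized int pending() {
        return buffer.size();
    }
}
```

With a size of 2, storing "a", "b", and "c" sends "c" on its own while "a" and "b" wait; a later flush() sends those two together as one batch.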

Entity Source Code

ImportRecord.java Java
import de.caluga.morphium.annotations.AutoSequence;
import de.caluga.morphium.annotations.CreationTime;
import de.caluga.morphium.annotations.Entity;
import de.caluga.morphium.annotations.Id;
import de.caluga.morphium.annotations.caching.WriteBuffer;
import de.caluga.morphium.driver.MorphiumId;
import lombok.Data;
import lombok.Builder;
import lombok.NoArgsConstructor;
import lombok.AllArgsConstructor;
import lombok.experimental.FieldNameConstants;
import java.time.LocalDateTime;
import java.util.List;

@Entity(collectionName = "import_records") // 1
@WriteBuffer(size = 500, strategy = WriteBuffer.STRATEGY.WRITE_NEW, timeout = 5000) // 2
@Data @NoArgsConstructor @AllArgsConstructor @Builder
@FieldNameConstants
public class ImportRecord {

    @Id
    private MorphiumId id;

    @AutoSequence(name = "import_number") // 3
    private Long importNumber;

    private String source;
    private String data;
    private String status;

    @CreationTime // 4
    private LocalDateTime importedAt;

    private List<String> tags; // 5
}
1 Maps to the import_records collection
2 Buffers up to 500 writes in memory, flushing as bulk operations. WRITE_NEW sends new writes immediately when the buffer is full. timeout = 5000 ms ensures a flush even at low throughput.
3 Morphium auto-assigns the next value from the import_number sequence. Uses Long (boxed) so null signals "not yet assigned". For bulk inserts, getNextBatch(n) reserves N values in a single atomic operation.
4 Automatically timestamped on first store() — never overwritten on updates
5 Modified via morphium.push() / morphium.pull() for atomic array operations without loading the full document

WriteBuffer + Sequence Code

ImportRecord.java (annotations) Java
import de.caluga.morphium.annotations.AutoSequence;
import de.caluga.morphium.annotations.CreationTime;
import de.caluga.morphium.annotations.Entity;
import de.caluga.morphium.annotations.Id;
import de.caluga.morphium.annotations.caching.WriteBuffer;
import de.caluga.morphium.driver.MorphiumId;
import java.time.LocalDateTime;
import java.util.List;

@Entity(collectionName = "import_records") // 1
@WriteBuffer(size = 500, strategy = WriteBuffer.STRATEGY.WRITE_NEW, timeout = 5000) // 2
public class ImportRecord {

    @Id
    private MorphiumId id;

    @AutoSequence(name = "import_number") // 3
    private Long importNumber; // 4: Long (not long) so null = "not yet assigned"

    private String source;
    private String data;
    private String status;
    private List<String> tags;

    @CreationTime // 5
    private LocalDateTime importedAt;
}
1 @Entity maps this class to a MongoDB collection with an explicit collection name.
2 @WriteBuffer batches individual store() calls; size=500 caps the buffer, WRITE_NEW flushes the newest entry when the buffer is full.
3 @AutoSequence(name=...) assigns the next value from a named MongoDB-backed counter when the field is null at store time.
4 Using Long (boxed) instead of long is required so the field can be null, which signals that no sequence number has been assigned yet.
5 @CreationTime is automatically populated by Morphium on the first store() call.
Bulk Import with storeList() Java
// Build a list of records — importNumber left null for @AutoSequence
List<ImportRecord> records = new ArrayList<>();
for (int i = 0; i < count; i++) {
    records.add(ImportRecord.builder()
        .source(sources[i % sources.length])
        .data("Record #" + (i + 1))
        .status("PENDING")
        .tags(List.of("bulk", "auto-generated"))
        .build());
}

// storeList() — single bulk write + single getNextBatch() for sequences
morphium.storeList(records); // 1
1 storeList() sends all records in a single bulk write operation and calls getNextBatch(n) once to assign sequence numbers atomically — regardless of batch size.
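
Why one batch call is enough for the whole list can be illustrated with an in-memory counter. The sketch below is only a model: SequenceSketch and reservationCount are hypothetical, getNextBatch borrows its name from the call mentioned above but does not reproduce its implementation, and a synchronized method stands in for the distributed lock.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: an in-memory counter that reserves values in blocks.
// This is NOT Morphium's sequence implementation; it has no MongoDB backing.
final class SequenceSketch {
    private long current;          // last value handed out
    private int reservations = 0;  // how many times the lock was taken

    SequenceSketch(long startValue) {
        this.current = startValue - 1;
    }

    // Reserves a single value: one lock acquisition per value.
    synchronized long getNextValue() {
        reservations++;
        return ++current;
    }

    // Reserves n consecutive values under a single lock acquisition.
    synchronized List<Long> getNextBatch(int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        reservations++;
        List<Long> values = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            values.add(++current);
        }
        return values;
    }

    synchronized int reservationCount() {
        return reservations;
    }

    public static void main(String[] args) {
        SequenceSketch seq = new SequenceSketch(1);
        List<Long> numbers = seq.getNextBatch(1_000);
        System.out.println("first=" + numbers.get(0)
                + " last=" + numbers.get(numbers.size() - 1)
                + " reservations=" + seq.reservationCount());
    }
}
```

Reserving the block up front means the cost of coordinating the counter is paid once per list rather than once per record.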

Array & Field Operations

Morphium API — push / pull / set / unset Java
Query<ImportRecord> query = morphium.createQueryFor(ImportRecord.class)
    .f(ImportRecord.Fields.id).eq(new MorphiumId(id));

// push: append a tag to the tags array ($push)
morphium.push(query, ImportRecord.Fields.tags, "priority"); // 1

// pull: remove a tag from the tags array ($pull)
morphium.pull(query, ImportRecord.Fields.tags, "auto-generated"); // 2

// set: update status without loading the document ($set)
query.set(ImportRecord.Fields.status, "PROCESSED", false, false, null); // 3

// unset: remove the "source" field entirely from the document ($unset)
query.unset(ImportRecord.Fields.source); // 4
1 morphium.push() appends a value to an array field using MongoDB's $push operator — the document is never loaded into Java memory.
2 morphium.pull() removes all matching values from an array field using MongoDB's $pull operator entirely on the server.
3 query.set() updates a single field with $set; parameters are field, value, upsert, multiple, and async callback (AsyncOperationCallback).
4 query.unset() removes the field key entirely from the BSON document, which is different from setting it to null.
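
The difference between these four operations, in particular between unset and setting a field to null, can be shown on a plain Java map standing in for a document. This is a local model of the operator semantics only: FieldOpsSketch and its methods are hypothetical helpers, and nothing here talks to MongoDB or Morphium.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: models $set, $unset, $push and $pull on an in-memory map.
final class FieldOpsSketch {

    // $set: stores the value under the key (a null value keeps the key present).
    static void set(Map<String, Object> doc, String field, Object value) {
        doc.put(field, value);
    }

    // $unset: removes the key itself.
    static void unset(Map<String, Object> doc, String field) {
        doc.remove(field);
    }

    // $push: appends to the array, creating it when the field is missing.
    @SuppressWarnings("unchecked")
    static void push(Map<String, Object> doc, String field, Object value) {
        List<Object> list = (List<Object>) doc.computeIfAbsent(field, k -> new ArrayList<>());
        list.add(value);
    }

    // $pull: removes every occurrence of the value from the array.
    @SuppressWarnings("unchecked")
    static void pull(Map<String, Object> doc, String field, Object value) {
        Object current = doc.get(field);
        if (current instanceof List) {
            ((List<Object>) current).removeIf(element -> element != null && element.equals(value));
        }
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("source", "csv");
        doc.put("tags", new ArrayList<>(Arrays.asList("bulk", "auto-generated", "bulk")));

        push(doc, "tags", "priority");
        pull(doc, "tags", "bulk");
        set(doc, "source", null);
        System.out.println("after set(null): key present = " + doc.containsKey("source"));
        unset(doc, "source");
        System.out.println("after unset:     key present = " + doc.containsKey("source"));
        System.out.println("tags = " + doc.get("tags"));
    }
}
```

After set(doc, "source", null) the key is still present with a null value, whereas unset removes it, which matters for any query that checks whether the field exists.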

Related Documentation