The ggml_tensor struct describes a tensor but does not contain the tensor data itself.

The struct definition

// n-dimensional tensor
struct ggml_tensor {
	enum ggml_type type;
 
	struct ggml_backend_buffer * buffer;
 
	int64_t ne[GGML_MAX_DIMS];
	size_t  nb[GGML_MAX_DIMS];
 
	enum ggml_op op;
 
	int32_t op_params[GGML_MAX_OP_PARAMS / sizeof(int32_t)];
 
	int32_t flags;
 
	struct ggml_tensor * src[GGML_MAX_SRC];
 
	struct ggml_tensor * view_src;
	size_t               view_offs;
 
	void * data;
 
	char name[GGML_MAX_NAME];
 
	void * extra;
 
	char padding[8];
};

ggml/include/ggml.h:660-692

The struct fields

We’ll cover the most important fields.

  • type is the datatype of the values in the tensor, like f16 and Q4_K.
  • buffer is a pointer to the ggml_backend_buffer that owns this tensor’s bytes.
  • ne (number of elements) describes the shape of the tensor.
  • nb (number of bytes) gives the stride, in bytes, along each dimension: nb[i] is the distance between consecutive elements along dimension i.
    • e.g. With FP32 data values and ne = {4, 3, 1, 1}, the corresponding strides are nb = {4, 16, 48, 48}.
  • op describes the GGML operator that produces this tensor.
    • GGML_OP_NONE indicates that this tensor is a leaf node (no parent/producer), meaning it is an input or constant.
  • op_params is 64 bytes of scratch storage.
    • An operator can encode its scalar parameters here.
    • e.g. softmax scale
  • src are the input tensors that feed into the operator to create this tensor.
  • view_src is a pointer to the original tensor.
    • view_src != NULL indicates that this tensor is merely a view of an original tensor.
    • view_src != NULL implies that data points somewhere inside the original tensor’s data.
      • A view commonly covers only a subset of the original tensor.
  • view_offs describes the byte offset from the base data pointer of the original tensor.
    • tensor->data == (char *)tensor->view_src->data + tensor->view_offs
  • data points to the actual data of the tensor.