Writing Label Descriptions & Project Instructions

Your RAG model relies on two sources of guidance to understand what to look for in your documents: label descriptions and project instructions. The better you describe what you need, the more accurate the results will be.

Think of it as explaining to a new colleague how they should read and process a document. If your explanation would leave them guessing, the model will guess too.


Label Descriptions

A label description tells the model what a specific field is. It answers the question: "What am I looking for?"

Each description should cover three things:

  1. What it is — a short, clear definition of the field.
  2. How to find it — where it usually appears on the page, what it looks like, or what it's labeled as.
  3. What it is not — how to tell it apart from similar fields, when that could be confusing.
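
One way to keep the three parts visible while drafting is to store them separately and join them at the end. The sketch below is only an illustration: `LabelDescription` and its field names are hypothetical helpers, not part of any product API.

```python
from dataclasses import dataclass


@dataclass
class LabelDescription:
    """Hypothetical helper that holds the three parts of a label description."""

    what_it_is: str            # short, clear definition (required)
    how_to_find_it: str = ""   # typical labels, format, or position on the page
    what_it_is_not: str = ""   # similar fields it must not be confused with

    def render(self) -> str:
        """Join the non-empty parts into one description, definition first."""
        if not self.what_it_is.strip():
            raise ValueError("A description needs at least a definition.")
        parts = [self.what_it_is, self.how_to_find_it, self.what_it_is_not]
        return " ".join(p.strip() for p in parts if p.strip())


delivery_date = LabelDescription(
    what_it_is="The date when the goods need to be delivered.",
    how_to_find_it='Often labeled "Delivery date" or "Expected arrival".',
    what_it_is_not="Not the document date or the order date.",
)
print(delivery_date.render())
```

The rendered text is what you would paste into the description field. The optional parts can stay empty when a field cannot be confused with anything else.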

Examples

Delivery Date: The date when the goods need to be delivered. Often labeled "Delivery date" or "Expected arrival". Not the document date or the order date.

Gross Weight: The total weight of the shipment. Only the number, without the unit.

Loading Date: The date the goods need to be loaded. Often labeled "Loading date". In tabular rows without column headers, it is the first date in the row.

Guidelines

Start with a clear definition. The first sentence should make it obvious what this field represents. This is the most important part of any description.

Describe how to find it on the page. The model knows what "delivery date" means in general, but it needs your help to tell which value on the page is the delivery date and which is the order date. Mention things like:

  • Common labels it appears near (e.g. "usually next to 'Expected arrival'")
  • Its typical format (e.g. "a 4-digit number", "DD/MM/YYYY")
  • Where it tends to sit on the page (e.g. "in the header area", "after the product name")

Clarify when two fields could be confused. If two labels might match the same piece of text, spell out which one should be used and when. Without this, the model has to make its best guess — and it may guess wrong.

Use emphasis sparingly. If you highlight everything, nothing stands out. Save bold or capitals for the single most important rule in a description.

Keep the language simple and direct. Write the way you would explain it out loud to someone sitting next to you. Short, concrete sentences work best.

Never leave a description empty. A label without a description gives the model nothing to work with. Every label needs at least one sentence.

Common Pitfalls

Issue | Why it's a problem | How to fix it
Definition only, no location hints | "The delivery date" is correct, but not enough to find the right value on the page | Add where to find it: "Often near 'Expected arrival'. Not the order date."
Too much emphasis | "ALWAYS extract. NEVER skip. MUST be present." | Pick the one thing that matters most and emphasize only that.
Two labels claiming the same text | Both "Product name" and "Item description" could match the same text | Add clear rules to both descriptions explaining which one applies when.
Empty description | (nothing written) | Always write at least one sentence defining what the field is.
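
Some of these pitfalls can be caught mechanically before a human review. The following is a rough heuristic sketch, not a feature of the product. The function name `lint_description` and the word list `EMPHASIS_WORDS` are assumptions made for illustration, and the single-sentence check only approximates "definition only, no location hints".

```python
import re

# Assumed list of capitalized emphasis words; adjust it to your own writing style.
EMPHASIS_WORDS = {"ALWAYS", "NEVER", "MUST", "ONLY", "NOT"}


def lint_description(description: str) -> list[str]:
    """Return heuristic warnings for a single label description."""
    text = description.strip()
    if not text:
        return ["empty description: write at least one sentence"]

    warnings = []
    # Count capitalized emphasis words; more than one is treated as too much.
    shouted = [w for w in re.findall(r"[A-Za-z]+", text) if w in EMPHASIS_WORDS]
    if len(shouted) > 1:
        warnings.append("too much emphasis: " + ", ".join(shouted))
    # A single sentence is usually a bare definition without location hints.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if len(sentences) == 1:
        warnings.append("definition only: add where to find it on the page")
    return warnings


print(lint_description("ALWAYS extract. NEVER skip. MUST be present."))
print(lint_description("The delivery date"))
```

Overlap between two labels (the "Product name" versus "Item description" case) still needs a human reader, because it depends on what the documents actually contain.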

Project Instructions

Project instructions tell the model how your documents work. They answer the question: "What do I need to know about these documents before I start?"

This is where you describe your document types, layout differences, and any rules that apply across multiple fields.

What Goes Where

Put in the label description | Put in project instructions
What the field is | What types of documents exist and how they differ
How to find it on the page | Layout rules that depend on the document type or source
How to tell it apart from similar fields | Rules that involve more than one field
Format hints (e.g. "4 digits", "DD/MM/YYYY") | Relationships between fields (e.g. "when X is present, Y is not")

A simple rule of thumb: if the instruction is about a single field, put it in the label description. If it's about the document as a whole or affects multiple fields, put it in the project instructions.
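
The rule of thumb is simple enough to write down as a decision function. This is only a sketch of the reasoning; `where_to_put` and its parameters are hypothetical names.

```python
def where_to_put(fields_affected: list[str], about_whole_document: bool = False) -> str:
    """Suggest where an instruction belongs, following the rule of thumb."""
    # Anything document-wide, or touching more or fewer than exactly one field,
    # goes into the project instructions.
    if about_whole_document or len(fields_affected) != 1:
        return "project instructions"
    return "label description"


print(where_to_put(["Delivery Date"]))
print(where_to_put(["Delivery Date", "Loading Date"]))
print(where_to_put([], about_whole_document=True))
```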

How to Structure Project Instructions

Start with any rules that apply to all your documents, then add sections for specific document types or sources.

[Brief description of what these documents are]

General:
- [Rules that apply to all documents]

[Document type A]:
- [Rules specific to this type]

[Document type B]:
- [Rules specific to this type]

Example:

Transport and logistics documents including CMRs, delivery notes, and warehouse receipts.

General:
- Dates are in DD/MM/YYYY format unless stated otherwise.
- When a field is not present on the document, leave it empty.

CMR documents:
- The sender and receiver information is in boxes 1 and 2.
- The delivery date appears in box 24.

Warehouse receipts:
- The delivery date appears after the word "arrival".
- Weight values include the unit — extract only the number.
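
If you maintain instructions for many document types, it can help to generate this layout from structured data so that every project follows the same order. The sketch below assumes a hypothetical `render_project_instructions` helper; the output is plain text to paste into the project instructions field.

```python
def render_project_instructions(
    summary: str,
    general: list[str],
    by_type: dict[str, list[str]],
) -> str:
    """Render general rules first, then one section per document type."""
    lines = [summary, "", "General:"]
    lines += [f"- {rule}" for rule in general]
    for doc_type, rules in by_type.items():
        if not rules:
            continue  # skip types that add nothing beyond the general rules
        lines += ["", f"{doc_type}:"]
        lines += [f"- {rule}" for rule in rules]
    return "\n".join(lines)


text = render_project_instructions(
    summary="Transport and logistics documents including CMRs, delivery notes, and warehouse receipts.",
    general=["Dates are in DD/MM/YYYY format unless stated otherwise."],
    by_type={
        "CMR documents": ["The delivery date appears in box 24."],
        "Warehouse receipts": ['The delivery date appears after the word "arrival".'],
        "Delivery notes": [],  # no special rules, so no section is written
    },
)
print(text)
```

Skipping empty sections keeps the instructions short: a document type only gets its own block when it behaves differently from the default.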

Be Specific About When Rules Apply

Rules that only apply to certain documents must say so clearly. If you don't specify, the model may apply the rule to every document — even ones where it shouldn't.

Too vague | Clear and scoped
"The delivery date appears after 'arrival'." | "In warehouse receipts only: the delivery date appears after the word 'arrival'. This does not apply to other document types."
"Always extract from the second column." | "In documents from [source name]: the values are in the second column."

A well-scoped rule tells the model three things: when it applies, what to do, and that it does not apply elsewhere.
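
The three parts of a well-scoped rule can be assembled with a small template. A minimal sketch, assuming a hypothetical `scoped_rule` helper:

```python
def scoped_rule(scope: str, rule: str) -> str:
    """Build a rule that states when it applies, what to do, and where it does not apply."""
    if not scope.strip():
        raise ValueError("A scoped rule must name a document type or source.")
    what_to_do = rule.strip().rstrip(".")
    return (
        f"In {scope.strip()} only: {what_to_do}. "
        "This does not apply to other document types."
    )


print(scoped_rule(
    "warehouse receipts",
    "the delivery date appears after the word 'arrival'",
))
```

Refusing an empty scope is deliberate: a rule without a scope is exactly the case where the model may apply it to every document.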

Common Pitfalls

Issue | Why it's a problem | How to fix it
Same rule in two places | A rule appears in both the label description and the project instructions, possibly with slightly different wording. | Pick one place. Field-specific rules go in the description; cross-field or document-wide rules go in project instructions.
Rules without scope | A rule meant for one document type gets applied to all documents. | Always state which document type or source a rule applies to.
Stating the obvious | "Dates should look like dates." | Only write rules that add information someone wouldn't already assume.
Too many small sections | Many tiny source-specific blocks that are hard to follow. | Only create sections for behaviors that genuinely differ from the default.
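
The "same rule in two places" pitfall can be partly checked by comparing sentences. The sketch below uses hypothetical helper names and only finds sentences that are identical after lowercasing and whitespace cleanup. Rules repeated with slightly different wording will not be detected and still need a manual read.

```python
import re


def normalized_sentences(text: str) -> set[str]:
    """Split text into lowercase sentences with bullets and extra spaces removed."""
    found = set()
    for part in re.split(r"[.!?\n]+", text):
        cleaned = " ".join(part.lower().split()).lstrip("- ")
        if cleaned:
            found.add(cleaned)
    return found


def duplicated_rules(descriptions: dict[str, str], project_instructions: str) -> dict[str, set[str]]:
    """Map each label to the sentences it shares with the project instructions."""
    project = normalized_sentences(project_instructions)
    overlaps = {}
    for label, description in descriptions.items():
        shared = normalized_sentences(description) & project
        if shared:
            overlaps[label] = shared
    return overlaps


descriptions = {
    "Gross Weight": "The total weight of the shipment. Extract only the number.",
    "Loading Date": "The date the goods need to be loaded.",
}
instructions = "General:\n- Extract only the number.\n- Dates are in DD/MM/YYYY format."
print(duplicated_rules(descriptions, instructions))
```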

Before You Publish — A Quick Checklist

Use this list to review your setup before going live:

  • Every label has a description — no empty or missing entries.
  • Each description starts with a clear, one-sentence definition.
  • Labels that could be confused with each other include clear disambiguation in both descriptions.
  • No rule is duplicated across label descriptions and project instructions.
  • Every conditional rule states which document type or source it applies to.
  • Emphasis is used sparingly — at most one highlighted term per description.