Data Matrix

Data Matrix (ECC 200) is a 2D symbology suitable for short payloads, capable of encoding up to 174 ASCII characters in pyStrich’s implementation.

See also

Data Matrix on Wikipedia for background on the symbology itself.

Example

Wrap the input in DataMatrixData. The simplest path is auto_encoding=True, which picks the narrowest encoding that fits the input automatically. For control over the encoded byte sequence – to enforce "ascii" for a 7-bit payload (URLs, identifiers, GS1 AI strings) or to require Latin-1 / UTF-8 – pass an explicit encoding instead; see Non-ASCII text below.

from pystrich.datamatrix import DataMatrixData, DataMatrixEncoder

payload = DataMatrixData("https://github.com/mmulqueen/pyStrich", auto_encoding=True)
DataMatrixEncoder(payload).save_svg("datamatrix-example.svg")
Data Matrix encoding the pyStrich GitHub URL.

Sizing and quiet zone

The cellsize argument to save() and get_imagedata() sets the pixel side length of one module (default 5).

The quiet_zone argument to DataMatrixEncoder sets the width (in modules) of the white border applied at render time. The Data Matrix specification requires at least one module of quiet zone on each side; pyStrich defaults to 2 (see DATAMATRIX_DEFAULT_QUIET_ZONE). Reduce to 1 for the most compact symbol; increase if your printing process tends to bleed into the margin.

See also

Printing barcodes for guidance on selecting cellsize for printed output.

payload = DataMatrixData("https://github.com/mmulqueen/pyStrich", encoding="ascii")
DataMatrixEncoder(payload).save("datamatrix-large.png", cellsize=10)
Data Matrix encoding the pyStrich GitHub URL rendered with cellsize=10.
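
As a rough guide to how cellsize and quiet_zone combine, the sketch below estimates the rendered side length of a square symbol. It assumes the quiet zone is applied on all four sides and that every module is exactly cellsize pixels wide; rendered_side_px is a hypothetical helper for illustration and is not part of pyStrich.

```python
def rendered_side_px(modules: int, cellsize: int = 5, quiet_zone: int = 2) -> int:
    """Estimate the pixel side length of a square symbol.

    Assumption: the quiet zone adds `quiet_zone` modules on each side.
    """
    return (modules + 2 * quiet_zone) * cellsize


# A 22x22-module symbol at the default settings.
print(rendered_side_px(22))  # 130
# The same symbol with cellsize=10 and the minimum quiet zone of 1.
print(rendered_side_px(22, cellsize=10, quiet_zone=1))  # 240
```

The actual module count depends on the matrix size the encoder selects for the payload, so treat the figures as estimates only.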

Output formats

SVG output

For embedding in web pages or any workflow that benefits from resolution-independent output, use save_svg() (or get_svg() to receive the SVG as a string).

from pystrich.marks import MarkShape

payload = DataMatrixData("https://github.com/mmulqueen/pyStrich", encoding="ascii")
DataMatrixEncoder(payload).save_svg("datamatrix.svg")
DataMatrixEncoder(payload).save_svg(
    "datamatrix-circles.svg", mark_shape=MarkShape.CIRCULAR_CELLS
)

Figures: SVG Data Matrix with the default rectangular cells (Default), and SVG Data Matrix with circular cells (mark_shape=MarkShape.CIRCULAR_CELLS).

The SVG’s viewBox is in module units, while width and height scale by cellsize. The mark_shape keyword selects how matched cells are drawn – horizontal runs of rectangles (the default) or one filled circle per cell.
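
To make the viewBox and horizontal-run ideas concrete, here is a minimal sketch that turns a small 0/1 matrix into SVG markup. The function matrix_to_svg and the exact markup are assumptions made for illustration; pyStrich's real output may differ, and the quiet zone is left out for brevity.

```python
def matrix_to_svg(matrix, cellsize=5):
    """Build minimal SVG markup with one <rect> per horizontal run of dark cells."""
    rows, cols = len(matrix), len(matrix[0])
    rects = []
    for y, row in enumerate(matrix):
        x = 0
        while x < cols:
            if row[x]:
                start = x
                # Extend the run while the cells stay dark.
                while x < cols and row[x]:
                    x += 1
                rects.append(
                    f'<rect x="{start}" y="{y}" width="{x - start}" height="1"/>'
                )
            else:
                x += 1
    # viewBox is in module units; width and height scale by cellsize.
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" '
        f'viewBox="0 0 {cols} {rows}" '
        f'width="{cols * cellsize}" height="{rows * cellsize}">'
        + "".join(rects)
        + "</svg>"
    )


demo = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
svg = matrix_to_svg(demo, cellsize=10)
print(svg)
```

Grouping adjacent dark cells into a single rectangle keeps the element count low, which is the main appeal of drawing runs rather than one shape per cell.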

Note

Circular cells fall outside the standard module shape and decoder support varies. Test with your target scanner before deploying.

Added in version 0.12.

PNG output

For raster output, use save() to write a PNG file or get_imagedata() to receive the raw PNG bytes.

payload = DataMatrixData("https://github.com/mmulqueen/pyStrich", encoding="ascii")
DataMatrixEncoder(payload).save("datamatrix.png")

EPS output

For embedding in LaTeX (\includegraphics) or other vector print workflows, use save_eps() (or get_eps() to receive the EPS as a string).

payload = DataMatrixData("https://github.com/mmulqueen/pyStrich", encoding="ascii")
DataMatrixEncoder(payload).save_eps("datamatrix.eps")

The cellsize argument is the side length of one module in PostScript points (1 point = 1/72 inch).
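
Because print workflows usually specify module size in millimetres, a quick conversion can help when choosing cellsize. The helper below is a hypothetical sketch and not part of pyStrich; it relies only on the definition of the PostScript point given above.

```python
POINTS_PER_INCH = 72
MM_PER_INCH = 25.4


def module_size_mm(cellsize_pt: float) -> float:
    """Convert a module side length from PostScript points to millimetres."""
    return cellsize_pt * MM_PER_INCH / POINTS_PER_INCH


# The default cellsize of 5 points is roughly 1.764 mm per module.
print(round(module_size_mm(5), 3))  # 1.764
```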

Added in version 0.12.

Terminal output

For quick on-screen display, get_terminal_art() returns a scannable rendering using Unicode half-block characters. Each character represents two matrix rows and one column, so cells appear roughly square in a typical fixed-width terminal font.

print(DataMatrixEncoder(payload).get_terminal_art())
                              
  █ █▄▀▄█▄▀ █ █▄█ █ █▄▀▄█▄▀▄  
  ██▀▄▄█  ███ ▄▀▀▀█▄▀ ▄▀▄▀█▄  
  ██▄▄█ ▄▀ █▄█▀ ▄▀█▀▄▀ █▄▀ ▄  
  █▄ ▀██▀▄▄▀▄  ▀▀▄▄▀█  █▀ ▄▄  
  █ ▄▀▄▀▄▄▀▀▄▀▀ ▀  ▄▀ █▀▀▀ ▄  
  █▀ ▀▄▀▄ ▄█▄█▄█▄▄█ ▀█▄▄▀ ▀▄  
  █ ▄▀█▀▄█▄  ▄ ▄▀▀  █▄█▄▄▀▀▄  
  ██▀▀▄█▄ █  ▄███   █ ██▄█▄▄  
  █▀▄██ ▄▄██▀▄ ▄█▀ █ ██▄██▀▄  
  █▀█ ▀█▀▀ ▀▄▄█ ▀▀▀▄▄▄▀ █▀▄▄  
  ██▄▀█  █ ▀ ▄ ▄██▀▄▀▀█▀▄█ ▄  
  █▄▄▀█▀█▄▄██▀█▀█▀ █▄ ▄ █ ▄▄  
  █▄▄██▄▄█▄▄▄█▄▄▄▄▄▄▄██▄▄█▄▄  
                              

By default the output is wrapped in ANSI escape codes that force a white background and black foreground, so the symbol scans regardless of the terminal’s colour scheme. Pass ansi_bg=False for plain output (correct only on a light-themed terminal).
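
The half-block technique can be sketched in a few lines. The function below is an illustrative assumption, not pyStrich's implementation: it pairs up matrix rows and picks one of four glyphs per column. It omits the quiet zone and the ANSI colour wrapping, so its output is only correct on a light-themed terminal.

```python
def half_block_art(matrix):
    """Render a 0/1 matrix with Unicode half blocks, two rows per text line."""
    # Keys are (upper cell, lower cell); 1 means a dark module.
    glyphs = {(0, 0): " ", (1, 0): "▀", (0, 1): "▄", (1, 1): "█"}
    lines = []
    for top in range(0, len(matrix), 2):
        upper = matrix[top]
        # Pad with a light row when the matrix has an odd number of rows.
        lower = matrix[top + 1] if top + 1 < len(matrix) else [0] * len(upper)
        lines.append("".join(glyphs[(u, l)] for u, l in zip(upper, lower)))
    return "\n".join(lines)


demo = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
art = half_block_art(demo)
print(art)
```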

Added in version 0.12.

DXF (CAD) output

For direct part marking applications – where the symbol is engraved or laser-etched onto a physical part – get_dxf() returns a DXF representation of the symbol. DXF is the file format CAD and CAM tools read; the output can be handed straight to the engraving or etching tool. The cellsize is in your chosen units (default "mm") rather than pixels.

payload = DataMatrixData("A1268172415", encoding="ascii")
encoder = DataMatrixEncoder(payload)
with open("part.dxf", "w") as f:
    f.write(encoder.get_dxf(cellsize=0.5, units="mm"))

The default inverse=True emits geometry for the light modules, including the quiet zone – so the bounding box frames the symbol. Pass inverse=False to emit only the dark modules instead, matching the symbol’s normal appearance; the bounding box then hugs the dark cells and the quiet zone has to be reintroduced downstream.
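
The effect of inverse on the bounding box can be illustrated with a small model. The sketch below is an assumption about the general idea and does not reproduce pyStrich's DXF writer: cells_to_draw and bounding_box are hypothetical helpers that list which cells would receive geometry.

```python
def cells_to_draw(matrix, quiet_zone=2, inverse=True):
    """List (x, y) cells that receive geometry, quiet zone included in the grid."""
    height = len(matrix) + 2 * quiet_zone
    width = len(matrix[0]) + 2 * quiet_zone
    cells = []
    for y in range(height):
        for x in range(width):
            my, mx = y - quiet_zone, x - quiet_zone
            inside = 0 <= my < len(matrix) and 0 <= mx < len(matrix[0])
            # Quiet-zone cells are always light.
            dark = inside and bool(matrix[my][mx])
            # inverse=True draws light cells; inverse=False draws dark cells.
            if dark != inverse:
                cells.append((x, y))
    return cells


def bounding_box(cells):
    """Return (min_x, min_y, max_x, max_y) over the drawn cells."""
    xs = [x for x, _ in cells]
    ys = [y for _, y in cells]
    return min(xs), min(ys), max(xs), max(ys)


demo = [[1, 0], [0, 1]]
print(bounding_box(cells_to_draw(demo, quiet_zone=1, inverse=True)))   # (0, 0, 3, 3)
print(bounding_box(cells_to_draw(demo, quiet_zone=1, inverse=False)))  # (1, 1, 2, 2)
```

With inverse=True the drawn cells reach the outer edge of the quiet zone; with inverse=False the box shrinks to the dark cells alone.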

GS1 / FNC1

See also

GS1 DataMatrix Guideline – the authoritative reference for AI selection, encoding rules and print quality requirements for GS1 Data Matrix.

GS1 Data Matrix uses an FNC1 codeword in the first position of the symbol to signal that the payload is a sequence of GS1 Application Identifiers, and again as a separator after any variable-length AI that is not the final element of the message. Pass FNC1 as the first segment to DataMatrixData to emit codeword 232 directly:

from pystrich.datamatrix import DataMatrixData, DataMatrixEncoder, FNC1

# (01) GTIN-14 -- pad a GTIN-13 with a leading "0" indicator digit.
payload = DataMatrixData(FNC1, "0105050070007664", encoding="ascii")
DataMatrixEncoder(payload).save("gs1.png")
GS1 Data Matrix encoding (01) GTIN 05050070007664.

A typical pharmaceutical / medical-device payload combines a GTIN with an expiry date and a batch number. (01) and (17) are fixed-length, so no separator is required between them; (10) is variable-length, but because it is the last element of the message no trailing separator is required either:

# (01) GTIN + (17) expiry YYMMDD + (10) batch
payload = DataMatrixData(
    FNC1, "0109501234543213", "17261231", "10BF07", encoding="ascii"
)
DataMatrixEncoder(payload).save("gs1-multi-fixed.png")
GS1 Data Matrix encoding (01) GTIN 09501234543213, (17) expiry 261231, (10) batch BF07.

When a variable-length AI is followed by another AI, separate them with a further FNC1:

# (10) batch + (21) serial -- (10) is variable-length and not last,
# so an FNC1 separator is required between the two AIs.
payload = DataMatrixData(
    FNC1, "10BF07", FNC1, "2119890519", encoding="ascii"
)
DataMatrixEncoder(payload).save("gs1-multi.png")
GS1 Data Matrix encoding (10) batch BF07 and (21) serial 19890519 separated by FNC1.
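
The separator rule from the examples above can be written as a small helper. This is a sketch under stated assumptions: FNC1_MARK is a hypothetical stand-in for the real FNC1 marker so the code runs without pyStrich, FIXED_LENGTH_AIS lists only the two fixed-length AIs used in this section, and gs1_segments is not a pyStrich function.

```python
FNC1_MARK = "<FNC1>"  # hypothetical stand-in for pystrich.datamatrix.FNC1

# Illustrative subset only; the GS1 General Specifications define the full list.
FIXED_LENGTH_AIS = {"01", "17"}


def gs1_segments(elements):
    """Build a segment list from (ai, value) pairs, adding FNC1 where needed."""
    segments = [FNC1_MARK]  # leading FNC1 marks the symbol as GS1
    for index, (ai, value) in enumerate(elements):
        segments.append(ai + value)
        is_last = index == len(elements) - 1
        # A variable-length AI needs a separator unless it ends the message.
        if ai not in FIXED_LENGTH_AIS and not is_last:
            segments.append(FNC1_MARK)
    return segments


print(gs1_segments([("01", "09501234543213"), ("17", "261231"), ("10", "BF07")]))
print(gs1_segments([("10", "BF07"), ("21", "19890519")]))
```

With pyStrich installed, the stand-in would be replaced by the real FNC1 and the list unpacked into DataMatrixData(*segments, encoding="ascii").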

Note

GS1 Data Matrix payloads must be ASCII – the GS1 General Specifications restrict AI values to a 7-bit character set (essentially ASCII). Do not combine FNC1 with non-ASCII encodings.

Deprecated since version 0.11: Older code triggered FNC1 by prefixing the payload with chr(231) – originally a bug in this library that downstream users came to rely on (see issue #13). The shim still works but emits a Fnc1WorkaroundCompatWarning. New code should use FNC1.

Non-ASCII text

The Data Matrix ASCII codeword set only covers bytes 0-127. To encode anything outside that range in a non-GS1 symbol, wrap the input in DataMatrixData and either pass auto_encoding=True (the constructor picks the narrowest fitting encoding for you) or specify an encoding explicitly:

Encoding and behaviour:
  • "ascii" – Raises PyStrichInvalidInput on any byte > 127.

  • "iso-8859-1" – Latin-1, the default character set for Data Matrix per ISO/IEC 16022. Bytes 128-255 are emitted via the Upper Shift codeword (235); no ECI prefix is required and conformant decoders pick up the encoding automatically.

  • "utf-8" – Declares ECI 26 once at the start of the symbol and byte-encodes the input. Conformant decoders pick up the encoding automatically.

  • "compat" – Legacy lenient mode. Non-ASCII characters emit DataMatrixNonAsciiWarning and produce output that will not decode correctly. Deprecated; pick one of the above instead.
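
The Upper Shift behaviour described for "iso-8859-1" can be illustrated with a simplified model of Data Matrix ASCII encodation. The sketch assumes the basic rules of that scheme (an ASCII character becomes its value plus one, and a high byte becomes codeword 235 followed by the byte value minus 127) and deliberately ignores digit-pair compaction, padding and error correction. latin1_codewords is a hypothetical helper, not pyStrich's encoder.

```python
UPPER_SHIFT = 235  # Upper Shift codeword


def latin1_codewords(text: str) -> list[int]:
    """Map Latin-1 text to data codewords (simplified: no digit-pair packing)."""
    codewords = []
    for byte in text.encode("iso-8859-1"):
        if byte < 128:
            # ASCII characters are stored as value + 1.
            codewords.append(byte + 1)
        else:
            # High bytes cost two codewords: Upper Shift, then (value - 128) + 1.
            codewords.extend([UPPER_SHIFT, byte - 128 + 1])
    return codewords


print(latin1_codewords("Aä"))  # [66, 235, 101]
```

The extra codeword per high byte is why a Latin-1 payload with many accented characters needs a larger symbol than pure ASCII text of the same length.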

Tip

For the most compact symbol, prefer the most restrictive encoding that fits your data: "ascii" first, then "iso-8859-1", then "utf-8". Each step adds overhead – Latin-1 spends an extra codeword per high byte, and UTF-8 adds a two-codeword ECI prefix and emits multi-byte sequences for anything outside ASCII. auto_encoding=True makes the same choice for you if you’d rather not pick by hand.
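
The "narrowest encoding that fits" rule can be approximated with the standard library alone. The function below is a sketch of the idea and an assumption about how such a choice might be made; it is not the code behind auto_encoding=True.

```python
def narrowest_encoding(text: str) -> str:
    """Return the first of ascii, iso-8859-1, utf-8 that can represent text."""
    for encoding in ("ascii", "iso-8859-1", "utf-8"):
        try:
            text.encode(encoding)
        except UnicodeEncodeError:
            continue
        return encoding
    # Only reachable for text that even UTF-8 rejects, such as lone surrogates.
    raise ValueError("text cannot be encoded as UTF-8")


print(narrowest_encoding("https://github.com/mmulqueen/pyStrich"))  # ascii
print(narrowest_encoding("Ich dachte, Sie wären kräftiger"))        # iso-8859-1
print(narrowest_encoding("€5 親切にしろ"))                            # utf-8
```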

from pystrich.datamatrix import DataMatrixData, DataMatrixEncoder

# Latin-1: smaller symbol if all your input fits in one byte per char.
DataMatrixEncoder(DataMatrixData("Ich dachte, Sie wären kräftiger", encoding="iso-8859-1")).save("latin1.png")
Data Matrix encoding "Ich dachte, Sie wären kräftiger" as Latin-1.
# UTF-8: required for anything outside Latin-1 (€, CJK, emoji).
DataMatrixEncoder(DataMatrixData("€5 親切にしろ 🐻‍❄️", encoding="utf-8")).save("utf8.png")
Data Matrix encoding "€5 親切にしろ 🐻‍❄️" as UTF-8 (ECI 26).

If you pass a string with the wrong encoding, the raised error names the offending character and suggests the encoding that would have worked:

>>> from pystrich.datamatrix import DataMatrixData
>>> DataMatrixData("Ich dachte, Sie wären kräftiger", encoding="ascii")
Traceback (most recent call last):
    ...
pystrich.exceptions.PyStrichInvalidInput: DataMatrix encoding 'ascii' expects ASCII; got 'ä'. Try DataMatrixData('Ich dachte, Sie wären kräftiger', encoding='iso-8859-1') or pass auto_encoding=True to select an encoding automatically.

API

class DataMatrixEncoder(text: DataMatrixData | str, *, quiet_zone: int = 2)

Bases: Matrix2DEncoder[int | None]

Encode text as a Data Matrix (ECC 200) 2D barcode.

The matrix size is selected automatically based on input length. Wrap the input in DataMatrixData and pass an explicit encoding of "ascii", "iso-8859-1" or "utf-8" — or pass auto_encoding=True to let the constructor pick the narrowest encoding that fits. To produce a GS1 Data Matrix, prefix the payload with the FNC1 marker.

Typical use:

encoder = DataMatrixEncoder(DataMatrixData("Hallo", encoding="ascii"))
encoder.save("hallo.png")

# Or, let DataMatrixData pick the encoding:
encoder = DataMatrixEncoder(DataMatrixData("Rausschmeißer", auto_encoding=True))

Plain str input is also accepted but falls back to a deprecated "compat" encoding that warns on non-ASCII bytes and produces output that does not decode correctly. New code should always wrap the input in DataMatrixData.

Variables:
  • matrix – 2D list of ints (0/1, or None for unset cells) describing the symbol prior to rendering.

  • regions – Number of square regions the symbol is divided into.

  • quiet_zone – Width in modules of the white border applied at render time.

  • width – Pixel width of the most recently rendered image. 0 until a render method has been called.

  • height – Pixel height of the most recently rendered image.

get_ascii() → str

Return an ASCII-art rendering of the symbol.

Return type:

str

get_dxf(cellsize: float = 1.0, inverse: bool = True, units: Literal['in', 'ft', 'mi', 'mm', 'cm', 'm'] | None = 'mm', *, mark_shape: MarkShape = MarkShape.SQUARE_CELLS) → str

Return a DXF (CAD) representation of the symbol.

Parameters:
  • cellsize – Side length of one module in units.

  • inverse – If True (the default), light modules are drawn as filled cells. If False, dark modules are drawn, matching the normal appearance of the symbol.

  • units – One of "in", "ft", "mi", "mm", "cm" or "m", or None for Unspecified ($INSUNITS=0).

  • mark_shape – How matched cells are grouped and drawn.

Return type:

str

Added in version 0.9.

Changed in version 0.12: units now supports "in", "ft", "mi", "cm", "m" and None (Unspecified); previously any value other than "mm" was silently treated as unspecified.

get_eps(cellsize: int = 5, *, inverse: bool = False, mark_shape: MarkShape = MarkShape.HORIZONTAL_RUNS) → str

Render the symbol and return EPS markup.

Parameters:
  • cellsize – Side length in PostScript points of one module.

  • inverse – If True, mark the light cells instead of the dark ones.

  • mark_shape – How matched cells are grouped and drawn.

Return type:

str

Added in version 0.12.

get_imagedata(cellsize: int = 5) → bytes

Render the symbol and return PNG bytes.

Parameters:

cellsize – Side length in pixels of one module.

Returns:

PNG-encoded image data.

Return type:

bytes

get_pilimage(cellsize: int = 5) → PIL.Image.Image

Render the symbol and return a Pillow image.

Parameters:

cellsize – Side length in pixels of one module.

Returns:

The rendered symbol.

Return type:

PIL.Image.Image

Added in version 0.11.

get_svg(cellsize: int = 5, *, inverse: bool = False, mark_shape: MarkShape = MarkShape.HORIZONTAL_RUNS) → str

Render the symbol and return SVG markup.

Parameters:
  • cellsize – Side length in user units of one module.

  • inverse – If True, mark the light cells instead of the dark ones.

  • mark_shape – How matched cells are grouped and drawn.

Return type:

str

Added in version 0.12.

get_terminal_art(*, ansi_bg: bool = True) → str

Render the symbol using Unicode half-block characters for terminals.

Each character represents two matrix rows and one column, producing approximately square cells in a typical fixed-width font and yielding a result that is scannable on screen.

Parameters:

ansi_bg – If True (the default), wrap each line in ANSI escape codes that force a white background and black foreground, making the symbol scannable regardless of the terminal’s colour scheme. Set to False for plain output (correct only on a light-themed terminal).

Return type:

str

Added in version 0.12.

save(filename: str | PathLike[str], cellsize: int = 5) → None

Save the symbol as a PNG. Pass a .png filename.

Parameters:
  • filename – PNG output path.

  • cellsize – Side length in pixels of one module.

save_eps(filename: str | PathLike[str], cellsize: int = 5, *, inverse: bool = False, mark_shape: MarkShape = MarkShape.HORIZONTAL_RUNS) → None

Save the symbol as an EPS file. Pass an .eps filename.

Parameters:
  • filename – EPS output path.

  • cellsize – Side length in PostScript points of one module.

  • inverse – If True, mark the light cells instead of the dark ones.

  • mark_shape – How matched cells are grouped and drawn.

Added in version 0.12.

save_svg(filename: str | PathLike[str], cellsize: int = 5, *, inverse: bool = False, mark_shape: MarkShape = MarkShape.HORIZONTAL_RUNS) → None

Save the symbol as an SVG file. Pass a .svg filename.

Parameters:
  • filename – SVG output path.

  • cellsize – Side length in user units of one module.

  • inverse – If True, mark the light cells instead of the dark ones.

  • mark_shape – How matched cells are grouped and drawn.

Added in version 0.12.

init_renderer() → DataMatrixRenderer

Construct a DataMatrixRenderer for the encoded matrix.

Updates width and height with the renderer’s pixel dimensions and returns the renderer.

class DataMatrixData(*segments: str | DataMatrixCodeword, encoding: Literal['compat', 'ascii', 'iso-8859-1', 'utf-8'] | None = None, auto_encoding: bool = False)

Bases: object

Composable encoder input mixing text chunks with raw-codeword markers.

Build values by concatenating marker constants (e.g. FNC1) with plain strings on either side, then pass the result to DataMatrixEncoder in place of a str.

Construction requires either an explicit encoding= (one of "ascii", "iso-8859-1", "utf-8" or the legacy "compat") or auto_encoding=True. With auto_encoding=True the constructor picks the narrowest encoding from ascii, iso-8859-1, utf-8 that represents every segment; any encoding= argument passed alongside is ignored. After construction, encoding is always one of the four concrete charsets.

Added in version 0.11.

Changed in version 0.12: Callers must now pass either an explicit encoding= or auto_encoding=True. Added the auto_encoding flag.

Deprecated since version 0.11: The "compat" encoding is retained only for backwards compatibility and will be removed in a future release. New code should pick "ascii", "iso-8859-1" or "utf-8" explicitly, or use auto_encoding=True.

class DataMatrixCodeword(value: int)

Bases: object

A literal DataMatrix codeword value to emit verbatim.

Concatenation with a plain str or another codeword (e.g. FNC1 + "...") is the modern API path and produces a DataMatrixData tagged with the strict "ascii" encoding. Concatenation with an existing DataMatrixData preserves that object’s encoding instead.

Added in version 0.11.

pystrich.datamatrix.FNC1

The GS1 FNC1 marker (Data Matrix codeword 232). An instance of DataMatrixCodeword; concatenate with strings via + to build a GS1 payload.

Added in version 0.11.