Python API
The public API is intentionally small.
read_file_format(path)
Loads a tab-delimited layout file and returns (title, spec), where spec is
a list of FieldInfo(width, datatype, name) values.
from fixwidth import read_file_format
title, spec = read_file_format('example/data.layout')
print(title)
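To make the shape of spec concrete, the sketch below builds an equivalent list by hand and derives each field's byte range from the widths. The FieldInfo namedtuple is a local stand-in defined for illustration, not the class fixwidth exports, and field_slices is a hypothetical helper that is not part of the library.

```python
from collections import namedtuple
from itertools import accumulate

# Local stand-in mirroring the (width, datatype, name) shape described above;
# it is NOT the FieldInfo class that fixwidth exports.
FieldInfo = namedtuple('FieldInfo', ['width', 'datatype', 'name'])

def field_slices(spec):
    """Hypothetical helper: map each field name to its slice within a record."""
    ends = list(accumulate(field.width for field in spec))
    starts = [0] + ends[:-1]
    return {f.name: slice(s, e) for f, s, e in zip(spec, starts, ends)}

spec = [FieldInfo(2, 'int', 'row_id'), FieldInfo(5, 'str', 'name')]
slices = field_slices(spec)
record = b'02Susan'
print(record[slices['row_id']], record[slices['name']])  # b'02' b'Susan'
```

Because every field's position follows from the widths of the fields before it, the order of entries in spec matters as much as the widths themselves.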
parse_file(path, spec, ...)
Opens a file in binary mode and yields parsed rows as OrderedDict objects.
from fixwidth import parse_file, read_file_format
_, spec = read_file_format('example/data.layout')
for row in parse_file('example/data1.txt', spec=spec, type_errors='ignore'):
    print(row['employee_id'], row['salary'])
Important defaults:
encoding='ascii', type_errors='raise', skip_blank_lines=False
parse_lines(lines, spec, ...)
Parses an iterable of binary lines. This is useful for tests, streams, and in-memory data.
from io import BytesIO
from fixwidth import parse_lines
layout = [(2, 'int', 'row_id'), (5, 'str', 'name')]
rows = parse_lines(BytesIO(b'01Bob  \n02Susan\n'), layout)  # 'Bob' is padded to the 5-character field width
print(next(rows))
Important defaults:
encoding='utf-8', type_errors='raise', skip_blank_lines=False
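As a rough mental model of what parsing binary lines against a layout involves, the sketch below slices each line by cumulative width, decodes each piece, and applies a converter. It is a simplified illustration under stated assumptions, not fixwidth's implementation: sketch_parse_lines and CONVERTERS are hypothetical names, and the real built-in types may convert values differently (for example, whether 'str' strips padding).

```python
from io import BytesIO

# Assumed converters for this sketch only; fixwidth's built-in types
# may behave differently.
CONVERTERS = {'int': int, 'str': str.strip}

def sketch_parse_lines(lines, layout, encoding='utf-8'):
    """Yield one dict per binary line, slicing fields by cumulative width."""
    for raw in lines:
        raw = raw.rstrip(b'\r\n')  # drop the line terminator before slicing
        row, start = {}, 0
        for width, datatype, name in layout:
            text = raw[start:start + width].decode(encoding)
            row[name] = CONVERTERS[datatype](text)
            start += width
        yield row

layout = [(2, 'int', 'row_id'), (5, 'str', 'name')]
rows = list(sketch_parse_lines(BytesIO(b'01Bob  \n02Susan\n'), layout))
print(rows)  # [{'row_id': 1, 'name': 'Bob'}, {'row_id': 2, 'name': 'Susan'}]
```

The sketch slices bytes before decoding, so widths are counted in bytes. With a multi-byte encoding, byte widths and character widths can differ, which is one reason the encoding default deserves attention.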
DictReader(fileobj, fieldinfo, ...)
Provides a csv.DictReader-like interface for binary file objects.
import fixwidth
with open('example/data1.txt', 'rb') as fh:
    reader = fixwidth.DictReader(fh, 'example/data.layout')
    first = next(reader)
    print(first['job_title'])
Notes:
fileobj must be opened in binary read mode. fieldinfo can be a layout path or a sequence of layout tuples. line_num increments as rows are consumed.
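For readers who have not used the interface being mirrored, the standard library's csv.DictReader shows the same pattern of iterating rows as mappings while line_num tracks progress. The example uses only the csv module and made-up sample data. Whether fixwidth.DictReader counts lines in exactly the same way is an assumption, not something confirmed here.

```python
import csv
from io import StringIO

# Standard-library csv.DictReader, shown only as the interface being mirrored.
# Field names are passed explicitly because the sample data has no header row.
data = StringIO('01,Bob\n02,Susan\n')
reader = csv.DictReader(data, fieldnames=['row_id', 'name'])

first = next(reader)
line_after_first = reader.line_num   # 1: one source line consumed
second = next(reader)
line_after_second = reader.line_num  # 2: two source lines consumed

print(first['name'], line_after_first)
print(second['name'], line_after_second)
```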
register_type(name)
Registers a custom converter function in the global converter registry.
from fixwidth import register_type
@register_type('uppercase')
def convert_uppercase(value):
    return value.strip().upper()
After registration, use 'uppercase' as a datatype in a layout, just like any built-in type.