ocrd_utils package
Utility functions and constants usable in various circumstances.
coordinates_of_segment, coordinates_for_segment
These functions convert polygon outlines for PAGE elements on all hierarchy levels below page (i.e. region, line, word, glyph) between relative coordinates w.r.t. a corresponding image and absolute coordinates w.r.t. the top-level image. This includes rotation and offset correction, based on affine transformations. (Used by the Workspace methods image_from_page and image_from_segment.)
rotate_coordinates, shift_coordinates, transpose_coordinates, transform_coordinates
These backend functions compose affine transformations for reflection, rotation and offset correction of coordinates, or apply them to a set of points. They can be used to pass down the coordinate system along with images (both invariably sharing the same operations context) when traversing the element hierarchy top to bottom. (Used by the Workspace methods image_from_page and image_from_segment.)
rotate_image, crop_image, transpose_image
These PIL.Image functions are safe replacements for the rotate, crop, and transpose methods.
image_from_polygon, polygon_mask
These functions apply polygon masks to PIL.Image objects.
xywh_from_points, points_from_xywh, polygon_from_points etc.
These functions have the syntax X_from_Y, where X/Y can be:
bbox: a 4-tuple of integers x0, y0, x1, y1 of the bounding box (rectangle) (used by PIL.Image)
points: a string encoding a polygon: "0,0 100,0 100,100 0,100" (used by PAGE-XML)
polygon: a list of 2-lists of integers x, y of points forming an (implicitly closed) polygon path: [[0,0], [100,0], [100,100], [0,100]] (used by opencv2 and higher-level coordinate functions in ocrd_utils)
xywh: a dict with keys for x, y, width and height: {'x': 0, 'y': 0, 'w': 100, 'h': 100} (produced by tesserocr and image/coordinate recursion methods in ocrd.workspace)
x0y0x1y1: a 4-list of strings x0, y0, x1, y1 of the bounding box (rectangle) (produced by tesserocr)
y0x0y1x1: the same as x0y0x1y1, with the positions of x and y in the list swapped
is_local_filename, safe_filename, abspath, get_local_filename
FS-related utilities
is_string, membername, concat_padded, nth_url_segment, remove_non_path_from_url, parse_json_string_or_file
String and OOP utilities
MIMETYPE_PAGE, EXT_TO_MIME, MIME_TO_EXT, VERSION
Constants
logging, setOverrideLogLevel, getLevelName, getLogger, initLogging
Exports of ocrd_utils.logging
deprecated_alias
Decorator to mark a kwarg as deprecated
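To illustrate what a decorator that marks a kwarg as deprecated could look like, here is a minimal standalone sketch. It is not the ocrd_utils implementation: the name deprecated_alias_sketch, the old=new keyword mapping, the warning text and the example function describe are all assumptions made for illustration.

```python
import functools
import warnings


def deprecated_alias_sketch(**aliases):
    """Hypothetical decorator: map deprecated kwarg names (keys) to new names (values)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for old, new in aliases.items():
                if old in kwargs:
                    if new in kwargs:
                        raise TypeError(
                            f"{func.__name__} received both {old} and {new}")
                    warnings.warn(
                        f"{old} is deprecated; use {new}",
                        DeprecationWarning, stacklevel=2)
                    # forward the value under the new keyword name
                    kwargs[new] = kwargs.pop(old)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@deprecated_alias_sketch(pageId="page_id")
def describe(page_id=None):
    """Example function (hypothetical) whose old kwarg pageId was renamed."""
    return f"page {page_id}"


print(describe(pageId="P1"))  # warns, then prints: page P1
```

Callers using the old keyword keep working but receive a DeprecationWarning, which gives them time to migrate.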

ocrd_utils.adjust_canvas_to_rotation(size, angle)
Calculate the enlarged image size after rotation.
Given a numpy array size of an original canvas (width and height), and a rotation angle in degrees counterclockwise angle, calculate the new size which is necessary to encompass the full image after rotation.
Return a numpy array of the enlarged width and height.
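The enlarged size can be worked out from the bounding box of a rotated rectangle. The following is a standalone sketch using only the standard library (plain tuples instead of numpy arrays); the function name enlarged_size_sketch is hypothetical and the code is not taken from ocrd_utils.

```python
import math


def enlarged_size_sketch(size, angle):
    """Return (width, height) of the canvas needed after rotating by angle degrees."""
    width, height = size
    rad = math.radians(angle)
    sin, cos = abs(math.sin(rad)), abs(math.cos(rad))
    # bounding box of a rotated width x height rectangle
    return (cos * width + sin * height, sin * width + cos * height)


w, h = enlarged_size_sketch((100, 50), 90)
print(round(w), round(h))  # 50 100
```

For multiples of 90° width and height are merely kept or swapped; for any other angle both dimensions grow.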

ocrd_utils.adjust_canvas_to_transposition(size, method)
Calculate the flipped image size after transposition.
Given a numpy array size of an original canvas (width and height), and a transposition mode method (see transpose_image), calculate the new size after transposition.
Return a numpy array of the new width and height.

ocrd_utils.bbox_from_points(points)
Construct a numeric list representing a bounding box from polygon coordinates in page representation.

ocrd_utils.bbox_from_xywh(xywh)
Convert a bounding box from a numeric dict to a numeric list representation.

ocrd_utils.bbox_from_polygon(polygon)
Construct a numeric list representing a bounding box from polygon coordinates in numeric list representation.

ocrd_utils.coordinates_for_segment(polygon, parent_image, parent_coords)
Convert relative coordinates to absolute.
Given…
polygon, a numpy array of points relative to
parent_image, a PIL.Image (not used), along with
parent_coords, its corresponding affine transformation,
…calculate the absolute coordinates within the page.
That is, apply the given transform inversely to polygon. The transform encodes (recursively):
Whenever parent_image or any of its parents was cropped, all points must be shifted by the offset in opposite direction (i.e. coordinate system gets translated by the upper left).
Whenever parent_image or any of its parents was rotated, all points must be rotated around the center of that image in opposite direction (i.e. coordinate system gets translated by the center in opposite direction, rotated purely, and translated back; the latter involves an additional offset from the increase in canvas size necessary to accommodate all points).
Return the rounded numpy array of the resulting polygon.

ocrd_utils.coordinates_of_segment(segment, parent_image, parent_coords)
Extract the coordinates of a PAGE segment element relative to its parent.
Given…
segment, a PAGE segment object in absolute coordinates (i.e. RegionType / TextLineType / WordType / GlyphType), and
parent_image, the PIL.Image of its corresponding parent object (i.e. PageType / RegionType / TextLineType / WordType), (not used), along with
parent_coords, its corresponding affine transformation,
…calculate the relative coordinates of the segment within the image.
That is, apply the given transform to the points annotated in segment. The transform encodes (recursively):
Whenever parent_image or any of its parents was cropped, all points must be shifted by the offset (i.e. coordinate system gets translated by the upper left).
Whenever parent_image or any of its parents was rotated, all points must be rotated around the center of that image (i.e. coordinate system gets translated by the center in opposite direction, rotated purely, and translated back; the latter involves an additional offset from the increase in canvas size necessary to accommodate all points).
Return the rounded numpy array of the resulting polygon.

ocrd_utils.crop_image(image, box=None)
Crop an image to a rectangle, filling with background.
Given a PIL.Image image and a list box of the bounding rectangle relative to the image, crop at the box coordinates, filling everything outside image with the background. (This covers the case where box indexes are negative or larger than image width/height. PIL.Image.crop would fill with black.) Since image is not necessarily binarized yet, determine the background from the median color (instead of white).
Return a new PIL.Image.

ocrd_utils.getLevelName(lvl)
Get (numerical) python logging level for (string) spec-defined log level name.

ocrd_utils.getLogger(*args, **kwargs)
Wrapper around logging.getLogger that respects overrideLogLevel.

ocrd_utils.nth_url_segment(url, n=-1)
Return the last /-delimited segment of a URL-like string.
Parameters
url (string) –
n (integer) – index of segment, default: -1

ocrd_utils.membername(class_, val)
Convert a member variable/constant into a member name string.

ocrd_utils.image_from_polygon(image, polygon, fill='background', transparency=False)
Mask an image with a polygon.
Given a PIL.Image image and a numpy array polygon of relative coordinates into the image, fill everything outside the polygon hull to a color according to fill:
if background (the default), then use the median color of the image;
otherwise use the given color, e.g. 'white' or (255,255,255).
Moreover, if transparency is true, then add an alpha channel from the polygon mask (i.e. everything outside the polygon will be transparent, for those consumers that can interpret alpha channels). Images which already have an alpha channel will have it shrunk from the polygon mask (i.e. everything outside the polygon will be transparent, in addition to existing transparent pixels).
Return a new PIL.Image.

ocrd_utils.parse_json_string_or_file(value='{}')
Parse a string as either the path to a JSON object or a literal JSON object.
Empty strings are equivalent to '{}'.
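The path-or-literal behaviour can be sketched roughly as follows. This is a standalone illustration, not the library's code: the name parse_json_sketch, the order of the checks (existing file first, literal JSON otherwise) and the rejection of non-object JSON are assumptions.

```python
import json
import os


def parse_json_sketch(value="{}"):
    """Parse value as a JSON file if such a file exists, else as literal JSON."""
    if not value.strip():
        # empty strings are treated like '{}'
        return {}
    if os.path.isfile(value):
        with open(value, encoding="utf-8") as handle:
            parsed = json.load(handle)
    else:
        parsed = json.loads(value)
    if not isinstance(parsed, dict):
        # assumption: only JSON objects are acceptable
        raise ValueError("expected a JSON object")
    return parsed


print(parse_json_sketch('{"dpi": 300}'))  # {'dpi': 300}
print(parse_json_sketch(""))              # {}
```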

ocrd_utils.points_from_bbox(minx, miny, maxx, maxy)
Construct polygon coordinates in page representation from a numeric list representing a bounding box.

ocrd_utils.points_from_polygon(polygon)
Convert polygon coordinates from a numeric list representation to a page representation.

ocrd_utils.points_from_x0y0x1y1(xyxy)
Construct a polygon representation from a rectangle described as a list [x0, y0, x1, y1]

ocrd_utils.points_from_xywh(box)
Construct polygon coordinates in page representation from numeric dict representing a bounding box.

ocrd_utils.points_from_y0x0y1x1(yxyx)
Construct a polygon representation from a rectangle described as a list [y0, x0, y1, x1]

ocrd_utils.polygon_from_bbox(minx, miny, maxx, maxy)
Construct polygon coordinates in numeric list representation from a numeric list representing a bounding box.

ocrd_utils.polygon_from_points(points)
Convert polygon coordinates in page representation to polygon coordinates in numeric list representation.

ocrd_utils.polygon_from_x0y0x1y1(x0y0x1y1)
Construct polygon coordinates in numeric list representation from a string list representing a bounding box.

ocrd_utils.polygon_from_xywh(xywh)
Construct polygon coordinates in numeric list representation from numeric dict representing a bounding box.
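The relationship between the page representation (a string of x,y pairs), the numeric list representation, the bounding box and the numeric dict can be illustrated with a few standalone helpers. These are sketches with hypothetical names (suffix _sketch), written from the descriptions above rather than copied from ocrd_utils; corner order is assumed to start at the upper left and proceed clockwise.

```python
def polygon_from_points_sketch(points):
    """Page representation '0,0 100,0' -> numeric list [[0, 0], [100, 0]]."""
    return [[int(v) for v in pair.split(",")] for pair in points.split()]


def points_from_polygon_sketch(polygon):
    """Numeric list [[0, 0], [100, 0]] -> page representation '0,0 100,0'."""
    return " ".join(f"{x},{y}" for x, y in polygon)


def bbox_from_polygon_sketch(polygon):
    """Numeric list of points -> [x0, y0, x1, y1] of the enclosing rectangle."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return [min(xs), min(ys), max(xs), max(ys)]


def xywh_from_bbox_sketch(x0, y0, x1, y1):
    """Bounding box corners -> dict with x, y, w, h."""
    return {"x": x0, "y": y0, "w": x1 - x0, "h": y1 - y0}


def polygon_from_xywh_sketch(xywh):
    """Dict with x, y, w, h -> the four corners as a numeric list."""
    x0, y0 = xywh["x"], xywh["y"]
    x1, y1 = x0 + xywh["w"], y0 + xywh["h"]
    return [[x0, y0], [x1, y0], [x1, y1], [x0, y1]]


points = "0,0 100,0 100,100 0,100"
polygon = polygon_from_points_sketch(points)
bbox = bbox_from_polygon_sketch(polygon)
xywh = xywh_from_bbox_sketch(*bbox)
print(polygon)  # [[0, 0], [100, 0], [100, 100], [0, 100]]
print(bbox)     # [0, 0, 100, 100]
print(xywh)     # {'x': 0, 'y': 0, 'w': 100, 'h': 100}
```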

ocrd_utils.polygon_mask(image, coordinates)
Create a mask image of a polygon.
Given a PIL.Image image (merely for dimensions), and a numpy array coordinates of polygon points relative to the image, create a new image of the same size with black background, and fill everything inside the polygon hull with white.
Return the new PIL.Image.

ocrd_utils.rotate_coordinates(transform, angle, orig=array([0, 0]))
Compose an affine coordinate transformation with a passive rotation.
Given a numpy array transform of an existing transformation matrix in homogeneous (3d) coordinates, and a rotation angle in degrees counterclockwise angle, as well as a numpy array orig of the center of rotation, calculate the affine coordinate transform corresponding to the composition of both transformations. (This entails translation to the center, followed by pure rotation, and subsequent translation back. However, since rotation necessarily increases the bounding box, and thus image size, do not translate back the same amount, but to the enlarged offset.)
Return a numpy array of the resulting affine transformation matrix.
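The three steps described above (translate to the center, rotate, translate back to the enlarged offset) can be sketched with plain nested lists. This is an illustration under assumptions, not the library's code: the helper names are hypothetical, and the sign convention of the rotation matrix assumes image coordinates with the y-axis pointing down.

```python
import math


def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def translation(dx, dy):
    """Homogeneous matrix of a translation by (dx, dy)."""
    return [[1, 0, dx], [0, 1, dy], [0, 0, 1]]


def rotate_coordinates_sketch(transform, angle, orig=(0, 0)):
    rad = math.radians(angle)
    cos, sin = math.cos(rad), math.sin(rad)
    # passive rotation (assumed convention: y-axis points down)
    rot = [[cos, sin, 0], [-sin, cos, 0], [0, 0, 1]]
    # position of the center within the enlarged canvas
    back_x = abs(cos) * orig[0] + abs(sin) * orig[1]
    back_y = abs(sin) * orig[0] + abs(cos) * orig[1]
    result = matmul3(translation(-orig[0], -orig[1]), transform)
    result = matmul3(rot, result)
    return matmul3(translation(back_x, back_y), result)


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
# a 100x50 image rotated by 90° counterclockwise around its center (50, 25)
m = rotate_coordinates_sketch(identity, 90, orig=(50, 25))
x, y = 100, 0  # top-right corner of the original image
px = m[0][0] * x + m[0][1] * y + m[0][2]
py = m[1][0] * x + m[1][1] * y + m[1][2]
print(round(px), round(py))  # 0 0, the top-left corner of the 50x100 result
```

Translating back by the original center instead would leave parts of the rotated content at negative coordinates, which is why the enlarged offset is used.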

ocrd_utils.rotate_image(image, angle, fill='background', transparency=False)
Rotate an image, enlarging and filling with background.
Given a PIL.Image image and a rotation angle in degrees counterclockwise angle, rotate the image, increasing its size at the margins accordingly, and filling everything outside the original image according to fill:
if background (the default), then use the median color of the image;
otherwise use the given color, e.g. 'white' or (255,255,255).
Moreover, if transparency is true, then add an alpha channel which is fully opaque within the original image (i.e. everything outside the original image will be transparent, for those consumers that can interpret alpha channels). (Images which already have an alpha channel are treated this way regardless of the setting used.)
Return a new PIL.Image.

ocrd_utils.safe_filename(url)
Sanitize input to be safely used as the basename of a local file.

ocrd_utils.setOverrideLogLevel(lvl)
Override all logger filter levels to include lvl and above.
Sets the root logger level, then iterates over all existing loggers and sets their log level to NOTSET.
Parameters
lvl (string) – Log level name.

ocrd_utils.shift_coordinates(transform, offset)
Compose an affine coordinate transformation with a translation.
Given a numpy array transform of an existing transformation matrix in homogeneous (3d) coordinates, and a numpy array offset of the translation vector, calculate the affine coordinate transform corresponding to the composition of both transformations.
Return a numpy array of the resulting affine transformation matrix.
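Composing with a translation amounts to left-multiplying the existing matrix by a translation matrix. A minimal sketch with nested lists instead of numpy arrays follows; the function name is hypothetical, and the negative offsets in the usage part are an assumption about how cropping would typically be recorded.

```python
def shift_coordinates_sketch(transform, offset):
    """Return the matrix product of a translation by offset with transform."""
    shift = [[1, 0, offset[0]], [0, 1, offset[1]], [0, 0, 1]]
    return [[sum(shift[i][k] * transform[k][j] for k in range(3))
             for j in range(3)] for i in range(3)]


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
# assumed usage: after cropping at upper left (10, 20), shift by (-10, -20)
cropped = shift_coordinates_sketch(identity, (-10, -20))
# a further crop at (5, 5) within the cropped image accumulates
nested = shift_coordinates_sketch(cropped, (-5, -5))
print(nested)  # [[1, 0, -15], [0, 1, -25], [0, 0, 1]]
```

Because each step returns a new matrix, the accumulated transform can be handed down the element hierarchy together with the corresponding image.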

ocrd_utils.transform_coordinates(polygon, transform=None)
Apply an affine transformation to a set of points.
Augment the 2d numpy array of points polygon with an extra column of ones (homogeneous coordinates), then multiply with the transformation matrix transform (or the identity matrix), and finally remove the extra column from the result.
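The homogeneous-coordinates procedure can be sketched point by point in plain Python. The function name transform_coordinates_sketch is hypothetical and the code uses nested lists rather than numpy arrays; it is meant as an illustration of the technique, not as the library's implementation.

```python
def transform_coordinates_sketch(polygon, transform=None):
    """Apply a 3x3 affine matrix to a list of [x, y] points."""
    if transform is None:
        transform = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # identity
    result = []
    for x, y in polygon:
        homogeneous = (x, y, 1)  # augment the point with a one
        mapped = [sum(row[k] * homogeneous[k] for k in range(3))
                  for row in transform]
        result.append(mapped[:2])  # drop the extra coordinate again
    return result


polygon = [[10, 20], [110, 20], [110, 70], [10, 70]]
shift = [[1, 0, -10], [0, 1, -20], [0, 0, 1]]
print(transform_coordinates_sketch(polygon, shift))
# [[0, 0], [100, 0], [100, 50], [0, 50]]
```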

ocrd_utils.transpose_coordinates(transform, method, orig=array([0, 0]))
Compose an affine coordinate transformation with a transposition (i.e. flip or rotate in 90° multiples).
Given a numpy array transform of an existing transformation matrix in homogeneous (3d) coordinates, a transposition mode method, as well as a numpy array orig of the center of the image, calculate the affine coordinate transform corresponding to the composition of both transformations, which is respectively:
PIL.Image.FLIP_LEFT_RIGHT: entails translation to the center, followed by pure reflection about the y-axis, and subsequent translation back
PIL.Image.FLIP_TOP_BOTTOM: entails translation to the center, followed by pure reflection about the x-axis, and subsequent translation back
PIL.Image.ROTATE_180: entails translation to the center, followed by pure reflection about the origin, and subsequent translation back
PIL.Image.ROTATE_90: entails translation to the center, followed by pure rotation by 90° counterclockwise, and subsequent translation back
PIL.Image.ROTATE_270: entails translation to the center, followed by pure rotation by 270° counterclockwise, and subsequent translation back
PIL.Image.TRANSPOSE: entails translation to the center, followed by pure rotation by 90° counterclockwise and pure reflection about the x-axis, and subsequent translation back
PIL.Image.TRANSVERSE: entails translation to the center, followed by pure rotation by 90° counterclockwise and pure reflection about the y-axis, and subsequent translation back
Return a numpy array of the resulting affine transformation matrix.

ocrd_utils.transpose_image(image, method)
Transpose (i.e. flip or rotate in 90° multiples) an image.
Given a PIL.Image image and a transposition mode method, apply the respective operation:
PIL.Image.FLIP_LEFT_RIGHT: all pixels get mirrored at half the width of the image
PIL.Image.FLIP_TOP_BOTTOM: all pixels get mirrored at half the height of the image
PIL.Image.ROTATE_180: all pixels get mirrored at both half the width and half the height of the image, i.e. the image gets rotated by 180° counterclockwise
PIL.Image.ROTATE_90: rows become columns (but counted from the right) and columns become rows, i.e. the image gets rotated by 90° counterclockwise; width becomes height and vice versa
PIL.Image.ROTATE_270: rows become columns and columns become rows (but counted from the bottom), i.e. the image gets rotated by 270° counterclockwise; width becomes height and vice versa
PIL.Image.TRANSPOSE: rows become columns and vice versa, i.e. all pixels get mirrored at the main diagonal; width becomes height and vice versa
PIL.Image.TRANSVERSE: rows become columns (but counted from the right) and columns become rows (but counted from the bottom), i.e. all pixels get mirrored at the opposite diagonal; width becomes height and vice versa
Return a new PIL.Image.
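The effect of each mode on the pixel grid can be illustrated without PIL by treating an image as a list of pixel rows. This is only a sketch: the function name transpose_grid_sketch is hypothetical, and the modes are passed as plain strings named after the PIL.Image constants rather than as the constants themselves.

```python
def transpose_grid_sketch(grid, method):
    """Apply a transposition mode (named as in PIL.Image) to a list of pixel rows."""
    columns = [list(col) for col in zip(*grid)]  # rows become columns
    if method == "FLIP_LEFT_RIGHT":
        return [row[::-1] for row in grid]
    if method == "FLIP_TOP_BOTTOM":
        return [list(row) for row in grid[::-1]]
    if method == "ROTATE_180":
        return [row[::-1] for row in grid[::-1]]
    if method == "ROTATE_90":
        return columns[::-1]
    if method == "ROTATE_270":
        return [col[::-1] for col in columns]
    if method == "TRANSPOSE":
        return columns
    if method == "TRANSVERSE":
        return [col[::-1] for col in columns[::-1]]
    raise ValueError(f"unknown method: {method}")


grid = [[1, 2, 3],
        [4, 5, 6]]  # 3 pixels wide, 2 pixels high
print(transpose_grid_sketch(grid, "ROTATE_90"))   # [[3, 6], [2, 5], [1, 4]]
print(transpose_grid_sketch(grid, "TRANSVERSE"))  # [[6, 3], [5, 2], [4, 1]]
```

In the last four modes the result is 2 pixels wide and 3 pixels high, i.e. width and height are swapped.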

ocrd_utils.unzip_file_to_dir(path_to_zip, output_directory)
Extract a ZIP archive to a directory.