TotalDepth.DAT.DAT_parser

DAT Files

Parses DAT files. DAT is an informal format with no defined specification and many poor implementations.

The format, as understood by this parser, has three sections:

Section 1: Channel Declaration. A set of lines of the form A B C, whitespace separated, where A is an uppercase word, B is free text, and C is a word. There is no guarantee that channels declared here have any data in the subsequent sections. The order of the lines in this section is ignored.

Section 2: Channel Header. A single line of space-separated uppercase words. Every word must appear as a channel declaration in section 1, but some declarations from section 1 may be absent from the header. The order of the words in the channel header determines how the subsequent values in section 3 are interpreted.

Deciding whether a line is the section 2 header can be done with some (dubious?) heuristics:

  • The words match the full list of channels declared in section 1. This seems sensible but does not work when channels are declared but have no data.
  • The words are some subset of the declared channels.
  • All uppercase?
  • Lots of words?
  • Starts with ‘UTIM DATE TIME …’

This implementation uses the last of these: the header line must start with ‘UTIM DATE TIME’.
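The ‘UTIM DATE TIME …’ heuristic can be sketched as follows. The pattern is the one documented as RE_DATA_HEADER_DEFINITION in the API section; the helper name `find_header_index` and the shortened sample lines are illustrative assumptions, not part of the module.

```python
import re
from typing import List, Optional

# Same pattern as the documented RE_DATA_HEADER_DEFINITION.
RE_DATA_HEADER = re.compile(r'^UTIM\s+DATE\s+TIME\s+.+$')


def find_header_index(lines: List[str]) -> Optional[int]:
    """Return the index of the first line that looks like a channel header,
    or None if there is no such line. Hypothetical helper for illustration."""
    for index, line in enumerate(lines):
        if RE_DATA_HEADER.match(line.strip()):
            return index
    return None


# Shortened sample: four declarations, one header, one row of values.
sample = [
    'UTIM Unix Time sec',
    'DATE Date ddmmyy',
    'TIME Time hhmmss',
    'WAC Wits Activity Code unitless',
    'UTIM DATE TIME WAC',
    '1165665017 09Dec06 11-50-17 0',
]
header_index = find_header_index(sample)
declarations = sample[:header_index]   # section 1
data_rows = sample[header_index + 1:]  # section 3
```

Everything before the header is treated as section 1 and everything after it as section 3. A file whose header does not begin with these three channels would not be recognised by this approach.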

Section 3: Channel Values. Space-separated values, mostly numeric; which columns need date/time conversion can be inferred from the declarations in section 1. The column order is defined by the order of the channel header.
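Because the header fixes the column order, pairing each row of values with the header is enough to recover named values. A minimal sketch, assuming every row has exactly as many values as the header has names; `row_to_dict` is a hypothetical helper, and the sample is the example row from this document with the elided channels left out.

```python
from typing import Dict, List


def row_to_dict(header: List[str], row: str) -> Dict[str, str]:
    """Pair the whitespace-separated values of a row with the header names."""
    values = row.split()
    if len(values) != len(header):
        raise ValueError(
            f'Expected {len(header)} values but found {len(values)}'
        )
    return dict(zip(header, values))


header = 'UTIM DATE TIME WAC BDIA'.split()
record = row_to_dict(header, '1165665017 09Dec06 11-50-17 0 8.50')
# Values stay as text here; numeric conversion is a separate step.
bit_diameter = float(record['BDIA'])
```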

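Note that the declared units often do not describe the actual values:

DATE is declared as ‘Date ddmmyy’ but the values look like 9Dec06, 09-Dec-06, etc.

TIME is declared as ‘Time hhmmss’ but the values look like 11-50-17.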
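One way to cope with this is to parse what is actually in the file instead of what the units promise. The sketch below handles only the ‘11-50-17’ form of TIME and compares it with the UTIM value from the same example row. Treating UTIM as seconds since the Unix epoch in UTC is an assumption, as is the helper name `parse_dat_time`.

```python
import datetime


def parse_dat_time(text: str) -> datetime.time:
    """Parse a TIME value written as hh-mm-ss, for example '11-50-17'."""
    hours, minutes, seconds = (int(part) for part in text.split('-'))
    return datetime.time(hours, minutes, seconds)


time_value = parse_dat_time('11-50-17')

# Assumption: UTIM is seconds since the Unix epoch, interpreted as UTC.
utim_value = datetime.datetime.fromtimestamp(
    1165665017, tz=datetime.timezone.utc
)
```

Under that assumption the UTIM value of the example row decodes to 2006-12-09 11:50:17 UTC, which agrees with its DATE and TIME columns.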

Example, where <…> marks elided content:

UTIM Unix Time sec
DATE Date ddmmyy
TIME Time hhmmss
WAC Wits Activity Code unitless
BDIA Bit Diameter inch
<…>
NPEN n-Pentane ppm
EPEN Neo-Pentane ppm
UTIM DATE TIME WAC BDIA <…> NPEN EPEN
1165665017 09Dec06 11-50-17 0 8.50 <…> 0 0

Performance

No particular effort has been made here for high performance. DAT files are small, typically less than 10 MB, so careful optimisation is not really required.

API

exception TotalDepth.DAT.DAT_parser.ExceptionDAT

General exception for problems with a DAT object.

exception TotalDepth.DAT.DAT_parser.ExceptionDATRead

Exception for reading a DAT file.

TotalDepth.DAT.DAT_parser.RE_CHANNEL_DEFINITION = re.compile('^([A-Z0-9]+)\\s(.+?)\\s(\\S+)$')

Matches ‘UTIM Unix Time sec’. Also need to process: “UTIM Unix Time sec”
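A short sketch of what the three groups capture, using the pattern exactly as documented above and declaration lines from the example in this document. The variable names are illustrative only.

```python
import re

# Pattern copied from the documented RE_CHANNEL_DEFINITION.
RE_CHANNEL_DEFINITION = re.compile(r'^([A-Z0-9]+)\s(.+?)\s(\S+)$')

lines = [
    'UTIM Unix Time sec',
    'WAC Wits Activity Code unitless',
    'NPEN n-Pentane ppm',
]
# Each match yields (name, description, units).
parsed = [RE_CHANNEL_DEFINITION.match(line).groups() for line in lines]

# A channel header line has the same shape as a declaration.
header_match = RE_CHANNEL_DEFINITION.match('UTIM DATE TIME WAC')
```

The first group is the channel name, the last group is the final word (the units), and everything in between is the free-text description. Since a header line such as ‘UTIM DATE TIME WAC’ also satisfies this pattern, a parser would presumably need to test for the header before treating a line as a declaration.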

TotalDepth.DAT.DAT_parser.RE_DATA_HEADER_DEFINITION = re.compile('^UTIM\\s+DATE\\s+TIME\\s+.+$')

Matches ‘UTIM DATE TIME …’

TotalDepth.DAT.DAT_parser.RE_DATE_STYLE_A = re.compile('^(\\d+)(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)(\\d+)$')

Matches ‘12Oct20’ and ‘5Oct20’

TotalDepth.DAT.DAT_parser.RE_DATE_STYLE_B = re.compile('^(\\d+)-(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)-(\\d+)$')

Matches ‘12-Oct-20’ and ‘5-Oct-20’
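The two date patterns can be combined into a single conversion, sketched below. The patterns are the documented ones; `parse_dat_date`, `MONTH_NUMBER`, and the rule that a two-digit year belongs to the 2000s are assumptions made for illustration and may not match what the module does.

```python
import datetime
import re
from typing import Optional

MONTHS = 'Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec'
# Same patterns as the documented RE_DATE_STYLE_A and RE_DATE_STYLE_B.
RE_DATE_STYLE_A = re.compile(r'^(\d+)(' + MONTHS + r')(\d+)$')
RE_DATE_STYLE_B = re.compile(r'^(\d+)-(' + MONTHS + r')-(\d+)$')
MONTH_NUMBER = {
    name: number for number, name in enumerate(MONTHS.split('|'), start=1)
}


def parse_dat_date(text: str, century: int = 2000) -> Optional[datetime.date]:
    """Return the date for text in either style, or None if neither matches."""
    for pattern in (RE_DATE_STYLE_A, RE_DATE_STYLE_B):
        match = pattern.match(text)
        if match:
            day, month, year = match.groups()
            year_number = int(year)
            # Assumption: a year below 100 is a two-digit year in `century`.
            if year_number < 100:
                year_number += century
            return datetime.date(year_number, MONTH_NUMBER[month], int(day))
    return None


style_a = parse_dat_date('09Dec06')
style_b = parse_dat_date('09-Dec-06')
```

Both styles accept one-digit or two-digit days, so ‘9Dec06’ and ‘09Dec06’ give the same date.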

TotalDepth.DAT.DAT_parser.NAME_UNITS_TYPE_MAP = {('DATE', 'ddmmyy'): <class 'object'>, ('TIME', 'hhmmss'): <class 'object'>, ('UTIM', 'sec'): <class 'object'>}

Maps a (channel name, units) pair to the numpy dtype used for that channel.
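A lookup against this map might look like the sketch below. The map contents are copied from the documented value; the helper `dtype_for` and the fallback to `float` for channels not in the map are assumptions, based only on the remark that values are mostly numeric.

```python
from typing import Dict, Tuple

# Contents copied from the documented NAME_UNITS_TYPE_MAP.
NAME_UNITS_TYPE_MAP: Dict[Tuple[str, str], type] = {
    ('DATE', 'ddmmyy'): object,
    ('TIME', 'hhmmss'): object,
    ('UTIM', 'sec'): object,
}


def dtype_for(name: str, units: str, default: type = float) -> type:
    """Return the dtype for a channel. Hypothetical helper; the float
    fallback for unlisted channels is an assumption."""
    return NAME_UNITS_TYPE_MAP.get((name, units), default)


date_dtype = dtype_for('DATE', 'ddmmyy')
bdia_dtype = dtype_for('BDIA', 'inch')
```

Both `object` and `float` are accepted by numpy as dtype specifiers, so the result could be passed straight to an array constructor.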

TotalDepth.DAT.DAT_parser.parse_file(file_object: TextIO, ident: str = '', description: str = 'DAT File') → TotalDepth.common.LogPass.FrameArray

Parse the file object as a DAT file into a FrameArray. Raises an ExceptionDAT on error.

Parameters:
  • file_object – The file to parse.
  • ident – Identification of this DAT file.
  • description – Description of this DAT file.
Returns: The FrameArray built from the file.

TotalDepth.DAT.DAT_parser.parse_path(path: str) → TotalDepth.common.LogPass.FrameArray

Parse the DAT file in the given path.

TotalDepth.DAT.DAT_parser.can_parse_file(file_object: TextIO) → bool

Tries to parse the file with just one row of data. On error returns False.

TotalDepth.DAT.DAT_parser.can_parse_path(path: str) → bool

Tries to parse the file at path with just one row of data. On error returns False.
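A typical calling pattern might combine the cheap check with the full parse, as sketched below. Only the function and exception names come from this page; the wrapper `load_dat` is hypothetical, and the sketch assumes that parse_path reports errors with ExceptionDAT in the same way parse_file is documented to.

```python
import sys


def load_dat(path: str):
    """Return the FrameArray for the DAT file at path, or None if the file
    does not look like a DAT file or fails to parse. Hypothetical wrapper."""
    # Imported lazily so this sketch can be loaded without TotalDepth installed.
    from TotalDepth.DAT import DAT_parser

    # Cheap test first: this parses only one row of data.
    if not DAT_parser.can_parse_path(path):
        return None
    try:
        return DAT_parser.parse_path(path)
    except DAT_parser.ExceptionDAT as err:
        # Assumption: parse_path raises ExceptionDAT like parse_file does.
        print(f'Can not parse {path}: {err}', file=sys.stderr)
        return None
```

Checking with can_parse_path first avoids reading a whole file that is not DAT at all, at the cost of reading the start of a valid file twice.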

TotalDepth.DAT.DAT_parser.main() → int

Read a file and dump the Log Pass.