Initial commit

Yevhenii Kutsenko 2024-12-28 15:35:05 +03:00
commit bd10fad70a
32 changed files with 2400 additions and 0 deletions

48
.gitignore vendored Normal file

@@ -0,0 +1,48 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# PyBuilder
.pybuilder/
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

21
LICENSE Normal file

@@ -0,0 +1,21 @@
The MIT License (MIT)
Copyright (c) 2021 Organized Crime and Corruption Reporting Project
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

132
README.md Normal file

@@ -0,0 +1,132 @@
# cronodump
The cronodump utility can parse most of the databases created by the [CronosPro](https://www.cronos.ru/) database software
and dump them to several output formats.
The software is popular among Russian public offices, companies and police agencies.
# Quick start
In its simplest form, without any dependencies, the croconvert command creates a [CSV](https://en.wikipedia.org/wiki/Comma-separated_values) representation of all the database's tables and a copy of all files contained in the database:
```bash
bin/croconvert --csv test_data/all_field_types
```
By default it creates a `cronodump-YYYY-mm-DD-HH-MM-SS-ffffff/` directory containing a CSV file for each table found. Under this directory it will also create a `Files-FL/` directory containing all the files stored in the database, regardless of whether they are (still) referenced in any data table. All files that are actually referenced (and thus are known by their filename) are stored under the `Files-Referenced` directory. With the `--outputdir` option you can choose your own dump location.
When you get an error message, or just unreadable data, chances are your database is protected. You may need to look into the `--dbcrack` or `--strucrack` options, explained below.
# Templates
The croconvert command can use the powerful [jinja templating framework](https://jinja.palletsprojects.com/en/3.0.x/) to render more output formats such as PostgreSQL and HTML.
The default action for `croconvert` is to convert the database using the `html` template.
Use
```bash
python3 -m venv venv
. venv/bin/activate
pip install jinja2
bin/croconvert test_data/all_field_types > test_data.html
```
to dump an HTML file with all tables found in the database, all files listed and ready for download as inlined [data URIs](https://en.wikipedia.org/wiki/Data_URI_scheme), and all table images inlined as well. Note that the resulting HTML file can be huge for large databases, which puts considerable load on the browser opening it.
The `-t postgres` option will dump the table schemas and records as valid `CREATE TABLE` and `INSERT INTO` statements to stdout. This dump can then be imported into a PostgreSQL database. Note that the backslash character is not escaped, so the [`standard_conforming_strings`](https://www.postgresql.org/docs/current/runtime-config-compatible.html#GUC-STANDARD-CONFORMING-STRINGS) option should be off.
Pull requests for [more templates supporting other output types](/templates) are welcome.
# Inspection
There's a `bin/crodump` tool to further investigate databases. This might be useful for extracting metadata like path names of table image files or input and output forms. Not all metadata has yet been completely reverse engineered, so some experience with understanding binary dumps might be required.
The crodump script has a plethora of options, but in its most basic form the `strudump` subcommand will already provide a rich variety of metadata to look into further:
```bash
bin/crodump strudump -v -a test_data/all_field_types/
```
The `-a` option tells strudump to output ASCII instead of a hexdump.
For a low level dump of the database contents, use:
```bash
bin/crodump crodump -v test_data/all_field_types/
```
The `-v` option tells crodump to include all unused byte ranges, which may be useful for identifying deleted records.
For a somewhat higher-level dump of the database contents, use:
```bash
bin/crodump recdump test_data/all_field_types/
```
This will print a hexdump of all records for all tables.
## Decoding password-protected databases
Cronos v4 and higher can password-protect databases; the protection works
by modifying the KOD sbox. `cronodump` has two methods of deriving the KOD sbox from
a database.
Both methods are statistics-based operations, so they may not always
yield the correct KOD sbox.
### 1. strudump
When the database has a sufficiently large CroStru.dat file,
it is easy to derive the modified KOD sbox from the CroStru file; the `--strucrack` option
will do this:
```bash
crodump --strucrack recdump <dbpath>
```
### 2. dbdump
When the Bank and Index files are compressed, we can derive the KOD sbox by inspecting
the fourth byte of each record, which should decode to a zero.
The `--dbcrack` option will do this.
```bash
crodump --dbcrack recdump <dbpath>
```
# Installing
`cronodump` requires Python 3.7 or later. It has been tested on Linux, macOS and Windows.
There is one optional dependency, the `Jinja2` templating engine, but `cronodump` will install fine without it.
There are several ways of installing `cronodump`:
* You can run `cronodump` directly from the cloned git repository, by using the shell scripts in the `bin` subdirectory.
* You can install `cronodump` in your python environment by running: `python setup.py build install`.
* You can install `cronodump` from the public [pypi repository](https://pypi.org/project/cronodump/) with `pip install cronodump`.
* You can install `cronodump` with the `Jinja2` templating engine from the public [pypi repository](https://pypi.org/project/cronodump/) with `pip install cronodump[templates]`.
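Once installed, `cronodump` can also be used as a library. A minimal sketch, using the test database from this repository and assuming an unprotected database so the default KOD table suffices:
```python
from crodump.Database import Database
import crodump.koddecoder

# open the database directory with the default v3 KOD table
db = Database("test_data/all_field_types", compact=False, kod=crodump.koddecoder.new())
for tab in db.enumerate_tables():
    for rec in db.enumerate_records(tab):
        print(tab.tablename, [field.content for field in rec.fields])
```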
# Terminology
We decided to use the more common terminology for databases, tables, records, etc.
Here is a table showing what Cronos calls these:
| what | cronos english | cronos russian
|:------ |:------ |:------
| Database | Bank | Банк
| Table | Base | Базы
| Record | Record | Записи
| Field | Field | поля
| recid | System Number | Системный номер
# License
cronodump is released under the [MIT license](LICENSE).
# References
cronodump builds upon [documentation of the file format found in older versions of Cronos](http://sergsv.narod.ru/cronos.htm) and
the [subsequent implementation of a parser for the old file format](https://github.com/occrp/cronosparser), but drops the heuristic
approach of guessing offsets and obfuscation parameters in favor of a stricter parser. Refer to [the docs](docs/cronos-research.md) for further
details.

5
bin/croconvert Executable file

@@ -0,0 +1,5 @@
#!/bin/sh
BINPATH=$(dirname "$0")
export PYTHONPATH="$BINPATH/.."
python3 -m crodump.croconvert "$@"

5
bin/crodump Executable file

@@ -0,0 +1,5 @@
#!/bin/sh
BINPATH=$(dirname "$0")
export PYTHONPATH="$BINPATH/.."
python3 -m crodump.crodump "$@"

288
crodump/Database.py Normal file

@@ -0,0 +1,288 @@
from __future__ import print_function, division
import os
import re
from sys import stderr
from binascii import b2a_hex
from .readers import ByteReader
from .hexdump import strescape, toout, ashex
from .Datamodel import TableDefinition, Record
from .Datafile import Datafile
import base64
import struct
import crodump.koddecoder
import sys
if sys.version_info[0] == 2:
sys.exit("cronodump needs python3")
class Database:
"""represent the entire database, consisting of Stru, Index and Bank files"""
def __init__(self, dbdir, compact, kod=crodump.koddecoder.new()):
"""
`dbdir` is the directory containing the Cro*.dat and Cro*.tad files.
`compact` if set, the .tad file is not cached in memory, making dumps about 15% slower
`kod` is optionally a KOD coder object.
by default the v3 KOD coding will be used.
"""
self.dbdir = dbdir
self.compact = compact
self.kod = kod
# Stru+Index+Bank for the components for most databases
self.stru = self.getfile("Stru")
self.index = self.getfile("Index")
self.bank = self.getfile("Bank")
# the Sys file resides in the "Program Files\Cronos" directory, and
# contains an index of all known databases.
self.sys = self.getfile("Sys")
def getfile(self, name):
"""
Returns a Datafile object for `name`.
this function expects a `Cro<name>.dat` and a `Cro<name>.tad` file.
When no such files exist, or only one, then None is returned.
`name` is matched case insensitively
"""
try:
datname = self.getname(name, "dat")
tadname = self.getname(name, "tad")
if datname and tadname:
return Datafile(name, open(datname, "rb"), open(tadname, "rb"), self.compact, self.kod)
except IOError:
return
def getname(self, name, ext):
"""
Get a case-insensitive filename match for 'name.ext'.
Returns None when no matching file was found.
"""
basename = "Cro%s.%s" % (name, ext)
for fn in os.listdir(self.dbdir):
if basename.lower() == fn.lower():
return os.path.join(self.dbdir, fn)
def dump(self, args):
"""
Calls the `dump` method on all database components.
"""
if self.stru:
self.stru.dump(args)
if self.index:
self.index.dump(args)
if self.bank:
self.bank.dump(args)
if self.sys:
self.sys.dump(args)
def strudump(self, args):
"""
prints all info found in the CroStru file.
"""
if not self.stru:
print("missing CroStru file")
return
self.dump_db_table_defs(args)
def decode_db_definition(self, data):
"""
decode the 'bank' / database definition
"""
rd = ByteReader(data)
d = dict()
while not rd.eof():
keyname = rd.readname()
if keyname in d:
print("WARN: duplicate key: %s" % keyname)
index_or_length = rd.readdword()
if index_or_length >> 31:
d[keyname] = rd.readbytes(index_or_length & 0x7FFFFFFF)
else:
refdata = self.stru.readrec(index_or_length)
if refdata[:1] != b"\x04":
print("WARN: expected refdata to start with 0x04")
d[keyname] = refdata[1:]
return d
def dump_db_definition(self, args, dbdict):
"""
decode the 'bank' / database definition
"""
for k, v in dbdict.items():
if re.search(b"[^\x0d\x0a\x09\x20-\x7e\xc0-\xff]", v):
print("%-20s - %s" % (k, toout(args, v)))
else:
print('%-20s - "%s"' % (k, strescape(v)))
def dump_db_table_defs(self, args):
"""
decode the table defs from recid #1, which always has table-id #3
Note that I don't know if it is better to refer to this by recid, or by table-id.
other table-id's found in CroStru:
#4 -> large values referenced from tableid#3
"""
dbinfo = self.stru.readrec(1)
if dbinfo[:1] != b"\x03":
print("WARN: expected dbinfo to start with 0x03")
dbdef = self.decode_db_definition(dbinfo[1:])
self.dump_db_definition(args, dbdef)
for k, v in dbdef.items():
if k.startswith("Base") and k[4:].isnumeric():
print("== %s ==" % k)
tbdef = TableDefinition(v, dbdef.get("BaseImage" + k[4:], b''))
tbdef.dump(args)
elif k == "NS1":
self.dump_ns1(v)
def dump_ns1(self, data):
if len(data)<2:
print("NS1 is unexpectedly short")
return
unk1, sh, = struct.unpack_from("<BB", data, 0)
# NS1 is encoded with the default KOD table,
# so we are not using stru.kod here.
ns1kod = crodump.koddecoder.new()
decoded_data = ns1kod.decode(sh, data[2:])
if len(decoded_data) < 12:
print("NS1 is unexpectedly short")
return
serial, unk2, pwlen, = struct.unpack_from("<LLL", decoded_data, 0)
password = decoded_data[12:12+pwlen].decode('cp1251')
print("== NS1: (%02x,%02x) -> %6d, %d, %d:'%s'" % (unk1, sh, serial, unk2, pwlen, password))
def enumerate_tables(self, files=False):
"""
yields a TableDefinition object for all `BaseNNN` entries found in CroStru
"""
dbinfo = self.stru.readrec(1)
if dbinfo[:1] != b"\x03":
print("WARN: expected dbinfo to start with 0x03")
try:
dbdef = self.decode_db_definition(dbinfo[1:])
except Exception as e:
print("ERROR decoding db definition: %s" % e)
print("This could possibly mean that you need to try with the --strucrack option")
return
for k, v in dbdef.items():
if k.startswith("Base") and k[4:].isnumeric():
if files and k[4:] == "000":
yield TableDefinition(v)
if not files and k[4:] != "000":
yield TableDefinition(v, dbdef.get("BaseImage" + k[4:], b''))
def enumerate_records(self, table):
"""
Yields a Record object for all records in CroBank matching
the tableid from `table`
usage:
for tab in db.enumerate_tables():
for rec in db.enumerate_records(tab):
print(sqlformatter(tab, rec))
"""
for i in range(self.bank.nrofrecords):
data = self.bank.readrec(i + 1)
if data and data[0] == table.tableid:
try:
yield Record(i + 1, table.fields, data[1:])
except EOFError:
print("Record %d too short: -- %s" % (i+1, ashex(data)), file=stderr)
except Exception as e:
print("Record %d broken: ERROR '%s' -- %s" % (i+1, e, ashex(data)), file=stderr)
def enumerate_files(self, table):
"""
Yield all file contents found in CroBank for `table`.
This is most likely the table with id 0.
"""
for i in range(self.bank.nrofrecords):
data = self.bank.readrec(i + 1)
if data and data[0] == table.tableid:
yield i + 1, data[1:]
def get_record(self, index, asbase64=False):
"""
Retrieve a single record from CroBank with record number `index`.
"""
data = self.bank.readrec(int(index))
if asbase64:
return base64.b64encode(data[1:]).decode('utf-8')
else:
return data[1:]
def recdump(self, args):
"""
Function for outputting record contents of the various .dat files.
This function is mostly useful for reverse-engineering the database format.
"""
if args.index:
dbfile = self.index
elif args.sys:
dbfile = self.sys
elif args.stru:
dbfile = self.stru
else:
dbfile = self.bank
if not dbfile:
print(".dat not found")
return
nerr = 0
nr_recnone = 0
nr_recempty = 0
tabidxref = [0] * 256
bytexref = [0] * 256
for i in range(1, args.maxrecs + 1):
try:
data = dbfile.readrec(i)
if args.find1d:
if data and (data.find(b"\x1d") > 0 or data.find(b"\x1b") > 0):
print("record with '1d': %d -> %s" % (i, b2a_hex(data)))
break
elif not args.stats:
if data is None:
print("%5d: <deleted>" % i)
else:
print("%5d: %s" % (i, toout(args, data)))
else:
if data is None:
nr_recnone += 1
elif not len(data):
nr_recempty += 1
else:
tabidxref[data[0]] += 1
for b in data[1:]:
bytexref[b] += 1
nerr = 0
except IndexError:
break
except Exception as e:
print("%5d: <%s>" % (i, e))
if args.debug:
raise
nerr += 1
if nerr > 5:
break
if args.stats:
print("-- table-id stats --, %d * none, %d * empty" % (nr_recnone, nr_recempty))
for k, v in enumerate(tabidxref):
if v:
print("%5d * %02x" % (v, k))
print("-- byte stats --")
for k, v in enumerate(bytexref):
if v:
print("%5d * %02x" % (v, k))

344
crodump/Datafile.py Normal file

@@ -0,0 +1,344 @@
import io
import struct
import zlib
from .hexdump import tohex, toout
import crodump.koddecoder
class Datafile:
"""Represent a single .dat file with its .tad index file"""
def __init__(self, name, dat, tad, compact, kod):
self.name = name
self.dat = dat
self.tad = tad
self.compact = compact
self.readdathdr()
self.readtad()
self.dat.seek(0, io.SEEK_END)
self.datsize = self.dat.tell()
self.kod = kod if not kod or self.isencrypted() else crodump.koddecoder.new()
def isencrypted(self):
return self.version in (b'01.04', b'01.05') or self.isv4()
def isv3(self):
# 01.02: 32 bit file offsets
# 01.03: 64 bit file offsets
# 01.04: encrypted?, 32bit
# 01.05: encrypted?, 64bit
return self.version in (b'01.02', b'01.03', b'01.04', b'01.05')
def isv4(self):
# 01.11 v4 ( 64bit )
# 01.14 v4 ( 64bit ), encrypted?
# 01.13 ?? I have not seen this version anywhere yet.
return self.version in (b'01.11', b'01.13', b'01.14')
def isv7(self):
# 01.19 ?? I have not seen this version anywhere yet.
return self.version in (b'01.19',)
def readdathdr(self):
"""
Read the .dat file header.
Note that the 19 byte header is followed by 0xE9 random bytes, generated by
'srand(time())' followed by 0xE9 times obfuscate(rand())
"""
self.dat.seek(0)
hdrdata = self.dat.read(19)
(
magic, # +00 8 bytes
self.hdrunk, # +08 uint16
self.version, # +0a 5 bytes
self.encoding, # +0f uint16
self.blocksize, # +11 uint16
) = struct.unpack("<8sH5sHH", hdrdata)
if magic != b"CroFile\x00":
print("unknown magic: ", magic)
raise Exception("not a Crofile")
self.use64bit = self.version in (b"01.03", b"01.05", b"01.11")
# blocksize
# 0040 -> Bank
# 0400 -> Index or Sys
# 0200 -> Stru or Sys
# encoding
# bit0 = 'KOD encoded'
# bit1 = compressed
def readtad(self):
"""
read and decode the .tad file.
"""
self.tad.seek(0)
if self.isv3():
hdrdata = self.tad.read(2 * 4)
self.nrdeleted, self.firstdeleted = struct.unpack("<2L", hdrdata)
elif self.isv4():
hdrdata = self.tad.read(4 * 4)
unk1, self.nrdeleted, self.firstdeleted, unk2 = struct.unpack("<4L", hdrdata)
else:
raise Exception("unsupported .tad version")
self.tadhdrlen = self.tad.tell()
self.tadentrysize = 16 if self.use64bit else 12
if self.compact:
self.tad.seek(0, io.SEEK_END)
else:
self.idxdata = self.tad.read()
self.tadsize = self.tad.tell() - self.tadhdrlen
self.nrofrecords = self.tadsize // self.tadentrysize
if self.tadsize % self.tadentrysize:
print("WARN: leftover data in .tad")
def tadidx(self, idx):
"""
Look up an entry in the cached .tad data; when `compact` mode was requested, fall back to seeking in the file.
"""
if self.compact:
return self.tadidx_seek(idx)
if self.use64bit:
# 01.03 and 01.11 have 64 bit file offsets
return struct.unpack_from("<QLL", self.idxdata, idx * self.tadentrysize)
else:
# 01.02 and 01.04 have 32 bit offsets.
return struct.unpack_from("<LLL", self.idxdata, idx * self.tadentrysize)
def tadidx_seek(self, idx):
"""
Memory saving version without caching the .tad
"""
self.tad.seek(self.tadhdrlen + idx * self.tadentrysize)
idxdata = self.tad.read(self.tadentrysize)
if self.use64bit:
# 01.03 and 01.11 have 64 bit file offsets
return struct.unpack("<QLL", idxdata)
else:
# 01.02 and 01.04 have 32 bit offsets.
return struct.unpack("<LLL", idxdata)
def readdata(self, ofs, size):
"""
Read raw data from the .dat file
"""
self.dat.seek(ofs)
return self.dat.read(size)
def readrec(self, idx):
"""
Extract and decode a single record.
"""
if idx == 0:
raise Exception("recnum must be a positive number")
ofs, ln, chk = self.tadidx(idx - 1)
if ln == 0xFFFFFFFF:
# deleted record
return
if self.isv3():
flags = ln >> 24
ln &= 0xFFFFFFF
elif self.isv4():
flags = ofs >> 56
ofs &= (1<<56)-1
dat = self.readdata(ofs, ln)
if not dat:
# empty record
encdat = dat
elif not flags:
if self.use64bit:
extofs, extlen = struct.unpack("<QL", dat[:12])
o = 12
else:
extofs, extlen = struct.unpack("<LL", dat[:8])
o = 8
encdat = dat[o:]
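# follow the chain of extension blocks: the first dword/qword of each block holds the offset of the next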
while len(encdat) < extlen:
dat = self.readdata(extofs, self.blocksize)
if self.use64bit:
(extofs,) = struct.unpack("<Q", dat[:8])
o = 8
else:
(extofs,) = struct.unpack("<L", dat[:4])
o = 4
encdat += dat[o:]
encdat = encdat[:extlen]
else:
encdat = dat
if self.encoding & 1:
if self.kod:
encdat = self.kod.decode(idx, encdat)
if self.iscompressed(encdat):
encdat = self.decompress(encdat)
return encdat
def enumrecords(self):
for i in range(self.nrofrecords):
yield self.readrec(i+1)
def enumunreferenced(self, ranges, filesize):
"""
From a list of used byte ranges and the filesize, enumerate the list of unused byte ranges
"""
o = 0
for start, end, desc in sorted(ranges):
if start > o:
yield o, start - o
o = end
if o < filesize:
yield o, filesize - o
def dump(self, args):
"""
Dump decodes all data referenced from the .tad file.
And optionally print out all unreferenced byte ranges in the .dat file.
This function is mostly useful for reverse-engineering the database format.
the `args` object controls how data is decoded.
"""
print("hdr: %-6s dat: %04x %s enc:%04x bs:%04x, tad: %08x %08x" % (
self.name, self.hdrunk, self.version,
self.encoding, self.blocksize,
self.nrdeleted, self.firstdeleted))
ranges = [] # keep track of used bytes in the .dat file.
for i in range(self.nrofrecords):
(ofs, ln, chk) = self.tadidx(i)
idx = i + 1
if args.maxrecs and i==args.maxrecs:
break
if ln == 0xFFFFFFFF:
print("%5d: %08x %08x %08x" % (idx, ofs, ln, chk))
continue
if self.isv3():
flags = ln >> 24
ln &= 0xFFFFFFF
elif self.isv4():
flags = ofs >> 56
# 04 --> data, v3compdata
# 02,03 --> deleted
# 00 --> extrec
ofs &= (1<<56)-1
dat = self.readdata(ofs, ln)
ranges.append((ofs, ofs + ln, "item #%d" % i))
decflags = [" ", " "]
infostr = ""
tail = b""
if not dat:
# empty record
encdat = dat
elif not flags:
if self.use64bit:
extofs, extlen = struct.unpack("<QL", dat[:12])
o = 12
else:
extofs, extlen = struct.unpack("<LL", dat[:8])
o = 8
infostr = "%08x;%08x" % (extofs, extlen)
encdat = dat[o:]
while len(encdat) < extlen:
dat = self.readdata(extofs, self.blocksize)
ranges.append((extofs, extofs + self.blocksize, "item #%d ext" % i))
if self.use64bit:
(extofs,) = struct.unpack("<Q", dat[:8])
o = 8
else:
(extofs,) = struct.unpack("<L", dat[:4])
o = 4
infostr += ";%08x" % (extofs)
encdat += dat[o:]
tail = encdat[extlen:]
encdat = encdat[:extlen]
decflags[0] = "+"
else:
encdat = dat
decflags[0] = "*"
if self.encoding & 1:
if self.kod:
encdat = self.kod.decode(idx, encdat)
else:
decflags[0] = " "
if args.decompress:
if self.iscompressed(encdat):
encdat = self.decompress(encdat)
decflags[1] = "@"
# TODO: separate handling for v4
print("%5d: %08x-%08x: (%02x:%08x) %s %s%s %s" % (
i+1, ofs, ofs + ln, flags, chk,
infostr, "".join(decflags), toout(args, encdat), tohex(tail)))
if args.verbose:
# output parts not referenced in the .tad file.
for o, l in self.enumunreferenced(ranges, self.datsize):
dat = self.readdata(o, l)
print("%08x-%08x: %s" % (o, o + l, toout(args, dat)))
def iscompressed(self, data):
"""
Check if this record looks like a compressed record.
"""
if len(data) < 11:
return
if data[-3:] != b"\x00\x00\x02":
return
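# walk the chunk headers: each size field covers the flag, crc and compressed data of one chunk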
o = 0
while o < len(data) - 3:
size, flag = struct.unpack_from(">HH", data, o)
if flag != 0x800 and flag != 0x008:
return
o += size + 2
return True
def decompress(self, data):
"""
Decompress a record.
Compressed records can have several chunks of compressed data.
Note that the compression header uses a mix of big-endian and little-endian numbers.
each chunk has the following format:
size - big endian uint16, size of flag + crc + compdata
flag - big endian uint16 - always 0x800
crc - little endian uint32, crc32 of the decompressed data
the final chunk has only 3 bytes: a zero size followed by a 2.
the crc algorithm is the one labeled 'crc-32' on this page:
http://crcmod.sourceforge.net/crcmod.predefined.html
"""
result = b""
o = 0
while o < len(data) - 3:
# note the mix of bigendian and little endian numbers here.
size, flag = struct.unpack_from(">HH", data, o)
storedcrc, = struct.unpack_from("<L", data, o+4)
C = zlib.decompressobj(-15)
result += C.decompress(data[o+8:o+8+size-6])
# note that we are not verifying the crc!
o += size + 2
return result

255
crodump/Datamodel.py Normal file

@@ -0,0 +1,255 @@
# -*- coding: utf-8 -*-
from .hexdump import tohex, ashex
from .readers import ByteReader
class FieldDefinition:
"""
Contains the properties for a single field in a record.
"""
def __init__(self, data):
self.decode(data)
def decode(self, data):
self.defdata = data
rd = ByteReader(data)
self.typ = rd.readword()
self.idx1 = rd.readdword()
self.name = rd.readname()
self.flags = rd.readdword()
self.minval = rd.readbyte() # Always 1
if self.typ:
self.idx2 = rd.readdword()
self.maxval = rd.readdword() # max value or length
self.unk4 = rd.readdword() # Always 0x00000009 or 0x0001000d
else:
self.idx2 = 0
self.maxval = self.unk4 = None
self.remaining = rd.readbytes()
def __str__(self):
if self.typ:
return "Type: %2d (%2d/%2d) %04x,(%d-%4d),%04x - %-40s -- %s" % (
self.typ, self.idx1, self.idx2,
self.flags, self.minval, self.maxval, self.unk4,
"'%s'" % self.name, tohex(self.remaining))
else:
return "Type: %2d %2d %d,%d - '%s'" % (
self.typ, self.idx1, self.flags, self.minval, self.name)
def sqltype(self):
return { 0: "INTEGER PRIMARY KEY",
1: "INTEGER",
2: "VARCHAR(" + str(self.maxval) + ")",
3: "TEXT", # dictionary
4: "DATE",
5: "TIMESTAMP",
6: "TEXT", # file reference
}.get(self.typ, "TEXT")
class TableImage:
def __init__(self, data):
self.decode(data)
def decode(self, data):
if not len(data):
self.filename = "none"
self.data = b''
return
rd = ByteReader(data)
_ = rd.readbyte()
namelen = rd.readdword()
self.filename = rd.readbytes(namelen).decode("cp1251", 'ignore')
imagelen = rd.readdword()
self.data = rd.readbytes(imagelen)
class TableDefinition:
def __init__(self, data, image=b''):
self.decode(data, image)
def decode(self, data, image):
"""
decode the 'base' / table definition
"""
rd = ByteReader(data)
self.unk1 = rd.readword()
self.version = rd.readbyte()
if self.version > 1:
_ = rd.readbyte() # always 0 anyway
# if this is not 5 (but 9), there's another 4 bytes inserted, this could be a length-byte.
self.unk2 = rd.readbyte()
self.unk3 = rd.readbyte()
if self.unk2 > 5: # seen only 5 and 9 for now with 9 implying an extra dword
_ = rd.readdword()
self.unk4 = rd.readdword()
self.tableid = rd.readdword()
self.tablename = rd.readname()
self.abbrev = rd.readname()
self.unk7 = rd.readdword()
nrfields = rd.readdword()
self.headerdata = data[: rd.o]
# There's (at least) two blocks describing fields, ended when encountering ffffffff
self.fields = []
for _ in range(nrfields):
deflen = rd.readword()
fielddef = rd.readbytes(deflen)
self.fields.append(FieldDefinition(fielddef))
# Between the first and the second block there are some byte strings;
# their count is given in the first dword
self.extraunkdatastrings = rd.readdword()
for _ in range(self.extraunkdatastrings):
datalen = rd.readword()
skip = rd.readbytes(datalen)
try:
# Then there's another unknown dword and then a (probably section indicator) 0x02 byte
self.unk8_ = rd.readdword()
if rd.readbyte() != 2:
print("Warning: FieldDefinition Section 2 not marked with a 2")
self.unk9 = rd.readdword()
# Then there's the number of extra fields in the second section
nrextrafields = rd.readdword()
for _ in range(nrextrafields):
deflen = rd.readword()
fielddef = rd.readbytes(deflen)
self.fields.append(FieldDefinition(fielddef))
except Exception as e:
print("Warning: Error '%s' parsing FieldDefinitions" % e)
try:
self.terminator = rd.readdword()
except EOFError:
print("Warning: FieldDefinition section not terminated")
except Exception as e:
print("Warning: Error '%s' parsing Tabledefinition" % e)
self.fields.sort(key=lambda field: field.idx2)
self.remainingdata = rd.readbytes()
self.tableimage = TableImage(image)
def __str__(self):
return "%d,%d<%d,%d,%d>%d %d,%d '%s' '%s' [TableImage(%d bytes): %s]" % (
self.unk1, self.version, self.unk2, self.unk3, self.unk4, self.tableid,
self.unk7, len(self.fields),
self.tablename, self.abbrev, len(self.tableimage.data), self.tableimage.filename)
def dump(self, args):
if args.verbose:
print("table: %s" % tohex(self.headerdata))
print(str(self))
for i, field in enumerate(self.fields):
if args.verbose:
print("field#%2d: %04x - %s" % (
i, len(field.defdata), tohex(field.defdata)))
print(str(field))
if args.verbose:
print("remaining: %s" % tohex(self.remainingdata))
class Field:
"""
Contains a single fully decoded value.
"""
def __init__(self, fielddef, data):
self.decode(fielddef, data)
def decode(self, fielddef, data):
self.typ = fielddef.typ
self.data = data
if not data:
self.content = ""
return
elif self.typ == 0:
# typ 0 is the recno, or as cronos calls this: Системный номер, systemnumber.
# just convert this to string for presentation
self.content = str(data)
elif self.typ == 4:
# typ 4 is DATE, formatted like: <year-1900:signedNumber><month:2digits><day:2digits>
try:
data = data.rstrip(b"\x00")
y, m, d = 1900+int(data[:-4]), int(data[-4:-2]), int(data[-2:])
self.content = "%04d-%02d-%02d" % (y, m, d)
except ValueError:
self.content = str(data)
elif self.typ == 5:
# typ 5 is TIME, formatted like: <hour:2digits><minute:2digits>
try:
data = data.rstrip(b"\x00")
h, m = int(data[-4:-2]), int(data[-2:])
self.content = "%02d:%02d" % (h, m)
except ValueError:
self.content = str(data)
elif self.typ == 6:
# decode internal file reference
rd = ByteReader(data)
self.flag = rd.readdword()
self.remlen = rd.readdword()
self.filename = rd.readtoseperator(b"\x1e").decode("cp1251", 'ignore')
self.extname = rd.readtoseperator(b"\x1e").decode("cp1251", 'ignore')
self.filedatarecord = rd.readtoseperator(b"\x1e").decode("cp1251", 'ignore')
self.content = " ".join([self.filename, self.extname, self.filedatarecord])
elif self.typ == 7 or self.typ == 8 or self.typ == 9:
# just hexdump foreign keys
self.content = ashex(data)
else:
# currently assuming everything else to be strings, which is wrong
self.content = data.rstrip(b"\x00").decode("cp1251", 'ignore')
class Record:
"""
Contains a single fully decoded record.
"""
def __init__(self, recno, tabledef, data):
self.decode(recno, tabledef, data)
def decode(self, recno, tabledef, data):
"""
decode the fields in a record
"""
self.data = data
self.recno = recno
self.table = tabledef
# start with the record number, or as Cronos calls this:
# the system number, in russian: Системный номер.
self.fields = [ Field(tabledef[0], str(recno)) ]
rd = ByteReader(data)
for fielddef in tabledef[1:]:
if not rd.eof() and rd.testbyte(0x1b):
# read complex record indicated by b"\x1b"
rd.readbyte()
size = rd.readdword()
fielddata = rd.readbytes(size)
else:
fielddata = rd.readtoseperator(b"\x1e")
self.fields.append(Field(fielddef, fielddata))

0
crodump/__init__.py Normal file

140
crodump/croconvert.py Normal file

@@ -0,0 +1,140 @@
"""
Commandline tool which converts a cronos database to .csv, .sql or .html.
python3 croconvert.py -t html chechnya_proverki_ul_2012/
"""
from .Database import Database
from .crodump import strucrack, dbcrack
from .hexdump import unhex
from sys import exit, stdout
from os.path import dirname, abspath, join
from os import mkdir, chdir
from datetime import datetime
import base64
import csv
def template_convert(kod, args):
"""looks up template to convert to, parses the database and passes it to jinja2"""
try:
from jinja2 import Environment, FileSystemLoader
except ImportError:
exit(
"Fatal: Jinja templating engine not found. Install using pip install jinja2"
)
db = Database(args.dbdir, args.compact, kod)
template_dir = join(dirname(dirname(abspath(__file__))), "templates")
j2_env = Environment(loader=FileSystemLoader(template_dir))
j2_templ = j2_env.get_template(args.template + ".j2")
j2_templ.stream(db=db, base64=base64).dump(stdout)
def safepathname(name):
return name.replace(':', '_').replace('/', '_').replace('\\', '_')
def csv_output(kod, args):
"""creates a directory with the current timestamp and in it a set of CSV or TSV
files with all the tables found and an extra directory with all the files"""
db = Database(args.dbdir, args.compact, kod)
mkdir(args.outputdir)
chdir(args.outputdir)
filereferences = []
# first dump all non-file tables
for table in db.enumerate_tables(files=False):
tablesafename = safepathname(table.tablename) + ".csv"
with open(tablesafename, 'w', encoding='utf-8') as csvfile:
writer = csv.writer(csvfile, delimiter=args.delimiter, escapechar='\\')
writer.writerow([field.name for field in table.fields])
# Record should be iterable over its fields, so we could use writerows
for record in db.enumerate_records(table):
writer.writerow([field.content for field in record.fields])
filereferences.extend([field for field in record.fields if field.typ == 6])
# Write all files from the file table. This is useful for unreferenced files
for table in db.enumerate_tables(files=True):
filedir = "Files-" + table.abbrev
mkdir(filedir)
for system_number, content in db.enumerate_files(table):
with open(join(filedir, str(system_number)), "wb") as binfile:
binfile.write(content)
if len(filereferences):
filedir = "Files-Referenced"
mkdir(filedir)
# Write all referenced files with their filename and extension intact
for reffile in filereferences:
if reffile.content: # only print when file is not NULL
filesafename = safepathname(reffile.filename) + "." + safepathname(reffile.extname)
content = db.get_record(reffile.filedatarecord)
with open(join("Files-Referenced", filesafename), "wb") as binfile:
binfile.write(content)
def main():
import argparse
parser = argparse.ArgumentParser(description="CRONOS database converter")
parser.add_argument("--template", "-t", type=str, default="html",
help="output template to use for conversion")
parser.add_argument("--csv", "-c", action='store_true', help="create output in .csv format")
parser.add_argument("--delimiter", "-d", default=",", help="delimiter used in csv output")
parser.add_argument("--outputdir", "-o", type=str, help="directory to create the dump in")
parser.add_argument("--kod", type=str, help="specify custom KOD table")
parser.add_argument("--compact", action="store_true", help="save memory by not caching the index, note: increases convert time by factor 1.15")
parser.add_argument("--strucrack", action="store_true", help="infer the KOD sbox from CroStru.dat")
parser.add_argument("--dbcrack", action="store_true", help="infer the KOD sbox from CroIndex.dat+CroBank.dat")
parser.add_argument("--nokod", "-n", action="store_true", help="don't KOD decode")
parser.add_argument("dbdir", type=str)
args = parser.parse_args()
import crodump.koddecoder
if args.kod:
if len(args.kod)!=512:
raise Exception("--kod should have a 512 hex digit argument")
kod = crodump.koddecoder.new(list(unhex(args.kod)))
elif args.nokod:
kod = None
elif args.strucrack:
class Cls: pass
cargs = Cls()
cargs.dbdir = args.dbdir
cargs.sys = False
cargs.silent = True
cracked = strucrack(None, cargs)
if not cracked:
return
kod = crodump.koddecoder.new(cracked)
elif args.dbcrack:
class Cls: pass
cargs = Cls()
cargs.dbdir = args.dbdir
cargs.sys = False
cargs.silent = True
cracked = dbcrack(None, cargs)
if not cracked:
return
kod = crodump.koddecoder.new(cracked)
else:
kod = crodump.koddecoder.new()
if args.csv:
if not args.outputdir:
args.outputdir = "cronodump"+datetime.now().strftime("-%Y-%m-%d-%H-%M-%S-%f")
csv_output(kod, args)
else:
template_convert(kod, args)
if __name__ == "__main__":
main()

296
crodump/crodump.py Normal file

@@ -0,0 +1,296 @@
from .kodump import kod_hexdump
from .hexdump import unhex, tohex
from .readers import ByteReader
from .Database import Database
from .Datamodel import TableDefinition
def destruct_sys3_def(rd):
# todo
pass
def destruct_sys4_def(rd):
"""
decode type 4 of the records found in CroSys.
This function is only useful for reverse-engineering the CroSys format.
"""
n = rd.readdword()
for _ in range(n):
marker = rd.readdword()
description = rd.readlongstring()
path = rd.readlongstring()
marker2 = rd.readdword()
print("%08x;%08x: %-50s : %s" % (marker, marker2, path, description))
def destruct_sys_definition(args, data):
"""
Decode the 'sys' / dbindex definition
This function is only useful for reverse-engineering the CroSys format.
"""
rd = ByteReader(data)
systype = rd.readbyte()
if systype == 3:
return destruct_sys3_def(rd)
elif systype == 4:
return destruct_sys4_def(rd)
else:
raise Exception("unsupported sys record")
def cro_dump(kod, args):
"""handle 'crodump' subcommand"""
if args.maxrecs:
args.maxrecs = int(args.maxrecs, 0)
else:
# an arbitrarily large number.
args.maxrecs = 0xFFFFFFFF
db = Database(args.dbdir, args.compact, kod)
db.dump(args)
def stru_dump(kod, args):
"""handle 'strudump' subcommand"""
db = Database(args.dbdir, args.compact, kod)
db.strudump(args)
def sys_dump(kod, args):
"""hexdump all CroSys records"""
# an arbitrarily large number.
args.maxrecs = 0xFFFFFFFF
db = Database(args.dbdir, args.compact, kod)
if db.sys:
db.sys.dump(args)
def rec_dump(kod, args):
"""hexdump all records of the specified CroXXX.dat file."""
if args.maxrecs:
args.maxrecs = int(args.maxrecs, 0)
else:
# an arbitrarily large number.
args.maxrecs = 0xFFFFFFFF
db = Database(args.dbdir, args.compact, kod)
db.recdump(args)
def destruct(kod, args):
"""
decode the index#1 structure information record
Takes hex input from stdin.
"""
import sys
data = sys.stdin.buffer.read()
data = unhex(data)
if args.type == 1:
# create a dummy db object
db = Database(".", args.compact)
db.dump_db_definition(args, data)
elif args.type == 2:
tbdef = TableDefinition(data)
tbdef.dump(args)
elif args.type == 3:
destruct_sys_definition(args, data)
def strucrack(kod, args):
"""
This function derives the KOD key from the assumption that most bytes in
the CroStru records will be zero, given a sufficient number of CroStru
items, statistically the most common bytes will encode to '0x00'
"""
# start without 'KOD' table, so we will get the encrypted records
db = Database(args.dbdir, args.compact, None)
if args.sys:
table = db.sys
if not db.sys:
print("no CroSys.dat file found in %s" % args.dbdir)
return
else:
table = db.stru
if not db.stru:
print("no CroStru.dat file found in %s" % args.dbdir)
return
xref = [ [0]*256 for _ in range(256) ]
for i, data in enumerate(table.enumrecords()):
if not data: continue
for ofs, byte in enumerate(data):
xref[(ofs+i+1)%256][byte] += 1
KOD = [0] * 256
for i, xx in enumerate(xref):
k, v = max(enumerate(xx), key=lambda kv: kv[1])
KOD[k] = i
if not args.silent:
print(tohex(bytes(KOD)))
return KOD
def dbcrack(kod, args):
"""
This function derives the KOD key from the assumption that most records in CroIndex
and CroBank will be compressed, and start with:
uint16 size
byte 0x08
byte 0x00
So because the fourth byte in each record will be 0x00 when kod-decoded, I can
use this as the inverse of the KOD table, adjusting for record-index.
"""
# start without 'KOD' table, so we will get the encrypted records
db = Database(args.dbdir, args.compact, None)
xref = [ [0]*256 for _ in range(256) ]
for dbfile in db.bank, db.index:
if not dbfile:
print("no data file found in %s" % args.dbdir)
return
for i in range(1, min(10000, dbfile.nrofrecords)):
rec = dbfile.readrec(i)
if rec and len(rec)>11:
xref[(i+3)%256][rec[3]] += 1
KOD = [0] * 256
for i, xx in enumerate(xref):
k, v = max(enumerate(xx), key=lambda kv: kv[1])
KOD[k] = i
if not args.silent:
print(tohex(bytes(KOD)))
return KOD
def main():
import argparse
parser = argparse.ArgumentParser(description="CRO hexdumper")
subparsers = parser.add_subparsers(title='commands',
help='Use the --help option for the individual sub commands for more details')
parser.set_defaults(handler=lambda *args: parser.print_help())
parser.add_argument("--debug", action="store_true", help="break on exceptions")
parser.add_argument("--kod", type=str, help="specify custom KOD table")
parser.add_argument("--strucrack", action="store_true", help="infer the KOD sbox from CroStru.dat")
parser.add_argument("--dbcrack", action="store_true", help="infer the KOD sbox from CroBank.dat + CroIndex.dat")
parser.add_argument("--nokod", "-n", action="store_true", help="don't KOD decode")
parser.add_argument("--compact", action="store_true", help="save memory by not caching the index, note: increases convert time by factor 1.15")
p = subparsers.add_parser("kodump", help="KOD/hex dumper")
p.add_argument("--offset", "-o", type=str, default="0")
p.add_argument("--length", "-l", type=str)
p.add_argument("--width", "-w", type=str)
p.add_argument("--endofs", "-e", type=str)
p.add_argument("--nokod", "-n", action="store_true", help="don't KOD decode")
p.add_argument("--unhex", "-x", action="store_true", help="assume the input contains hex data")
p.add_argument("--shift", "-s", type=str, help="KOD decode with the specified shift")
p.add_argument("--increment", "-i", action="store_true",
help="assume data is already KOD decoded, but with wrong shift -> dump alternatives.")
p.add_argument("--ascdump", "-a", action="store_true", help="CP1251 asc dump of the data")
p.add_argument("--invkod", "-I", action="store_true", help="KOD encode")
p.add_argument("filename", type=str, nargs="?", help="dump either stdin, or the specified file")
p.set_defaults(handler=kod_hexdump)
p = subparsers.add_parser("crodump", help="CROdumper")
p.add_argument("--verbose", "-v", action="store_true")
p.add_argument("--ascdump", "-a", action="store_true")
p.add_argument("--maxrecs", "-m", type=str, help="max nr of records to output")
p.add_argument("--nodecompress", action="store_false", dest="decompress", default="true")
p.add_argument("dbdir", type=str)
p.set_defaults(handler=cro_dump)
p = subparsers.add_parser("sysdump", help="SYSdumper")
p.add_argument("--verbose", "-v", action="store_true")
p.add_argument("--ascdump", "-a", action="store_true")
p.add_argument("--nodecompress", action="store_false", dest="decompress", default="true")
p.add_argument("dbdir", type=str)
p.set_defaults(handler=sys_dump)
p = subparsers.add_parser("recdump", help="record dumper")
p.add_argument("--verbose", "-v", action="store_true")
p.add_argument("--ascdump", "-a", action="store_true")
p.add_argument("--maxrecs", "-m", type=str, help="max nr of records to output")
p.add_argument("--find1d", action="store_true", help="Find records with 0x1d in it")
p.add_argument("--stats", action="store_true", help="calc table stats from the first byte of each record",)
p.add_argument("--index", action="store_true", help="dump CroIndex")
p.add_argument("--stru", action="store_true", help="dump CroStru")
p.add_argument("--bank", action="store_true", help="dump CroBank")
p.add_argument("--sys", action="store_true", help="dump CroSys")
p.add_argument("dbdir", type=str)
p.set_defaults(handler=rec_dump)
p = subparsers.add_parser("strudump", help="STRUdumper")
p.add_argument("--verbose", "-v", action="store_true")
p.add_argument("--ascdump", "-a", action="store_true")
p.add_argument("dbdir", type=str)
p.set_defaults(handler=stru_dump)
p = subparsers.add_parser("destruct", help="Stru dumper")
p.add_argument("--verbose", "-v", action="store_true")
p.add_argument("--ascdump", "-a", action="store_true")
p.add_argument("--type", "-t", type=int, help="what type of record to destruct")
p.set_defaults(handler=destruct)
p = subparsers.add_parser("strucrack", help="Crack v4 KOD encryption, bypassing the need for the database password.")
p.add_argument("--sys", action="store_true", help="Use CroSys for cracking")
p.add_argument("--silent", action="store_true", help="no output")
p.add_argument("dbdir", type=str)
p.set_defaults(handler=strucrack)
p = subparsers.add_parser("dbcrack", help="Crack v4 KOD encryption, bypassing the need for the database password.")
p.add_argument("--silent", action="store_true", help="no output")
p.add_argument("dbdir", type=str)
p.set_defaults(handler=dbcrack)
args = parser.parse_args()
import crodump.koddecoder
if args.kod:
if len(args.kod)!=512:
raise Exception("--kod should have a 512 hex digit argument")
kod = crodump.koddecoder.new(list(unhex(args.kod)))
elif args.nokod:
kod = None
elif args.strucrack:
class Cls: pass
cargs = Cls()
cargs.dbdir = args.dbdir
cargs.sys = False
cargs.silent = True
cracked = strucrack(None, cargs)
if not cracked:
return
kod = crodump.koddecoder.new(cracked)
elif args.dbcrack:
class Cls: pass
cargs = Cls()
cargs.dbdir = args.dbdir
cargs.sys = False
cargs.silent = True
cracked = dbcrack(None, cargs)
if not cracked:
return
kod = crodump.koddecoder.new(cracked)
else:
kod = crodump.koddecoder.new()
if args.handler:
args.handler(kod, args)
if __name__ == "__main__":
main()

84
crodump/dumpdbfields.py Normal file

@@ -0,0 +1,84 @@
"""
`dumpdbfields` demonstrates how to enumerate tables and records.
"""
import os
import os.path
from .Database import Database
from .crodump import strucrack, dbcrack
from .hexdump import unhex
def processargs(args):
for dbpath in args.dbdirs:
if args.recurse:
for path, _, files in os.walk(dbpath):
# check if there is a crostru file in this directory.
if any(_ for _ in files if _.lower() == "crostru.dat"):
yield path
else:
yield dbpath
def main():
import argparse
parser = argparse.ArgumentParser(description="db field dumper")
parser.add_argument("--kod", type=str, help="specify custom KOD table")
parser.add_argument("--strucrack", action="store_true", help="infer the KOD sbox from CroStru.dat")
parser.add_argument("--dbcrack", action="store_true", help="infer the KOD sbox from CroIndex.dat+CroBank.dat")
parser.add_argument("--nokod", "-n", action="store_true", help="don't KOD decode")
parser.add_argument("--maxrecs", "-m", type=int, default=100)
parser.add_argument("--recurse", "-r", action="store_true")
parser.add_argument("--verbose", "-v", action="store_true")
parser.add_argument("dbdirs", type=str, nargs='*')
args = parser.parse_args()
for path in processargs(args):
try:
import crodump.koddecoder
if args.kod:
if len(args.kod)!=512:
raise Exception("--kod should have a 512 hex digit argument")
kod = crodump.koddecoder.new(list(unhex(args.kod)))
elif args.nokod:
kod = None
elif args.strucrack:
class Cls: pass
cargs = Cls()
cargs.dbdir = path
cargs.sys = False
cargs.silent = True
cracked = strucrack(None, cargs)
if not cracked:
return
kod = crodump.koddecoder.new(cracked)
elif args.dbcrack:
class Cls: pass
cargs = Cls()
cargs.dbdir = path
cargs.sys = False
cargs.silent = True
cracked = dbcrack(None, cargs)
if not cracked:
return
kod = crodump.koddecoder.new(cracked)
else:
kod = crodump.koddecoder.new()
db = Database(path, False, kod)
for tab in db.enumerate_tables():
tab.dump(args)
print("nr of records: %d" % db.bank.nrofrecords)
i = 0
for rec in db.enumerate_records(tab):
for field, fielddef in zip(rec.fields, tab.fields):
print(">> %s -- %s" % (fielddef, field.content))
i += 1
if i > args.maxrecs:
break
except Exception as e:
print("ERROR: %s" % e)
if __name__ == "__main__":
main()

95
crodump/hexdump.py Normal file

@@ -0,0 +1,95 @@
"""
Several functions for converting bytes to readable text or hex bytes.
"""
import struct
from binascii import b2a_hex, a2b_hex
def unhex(data):
"""
convert a possibly space separated list of 2-digit hex values to a byte-array
"""
if type(data) == bytes:
data = data.decode("ascii")
data = data.replace(" ", "")
data = data.strip()
return a2b_hex(data)
def ashex(line):
"""
convert a byte-array to a space separated list of 2-digit hex values.
"""
return " ".join("%02x" % _ for _ in line)
def aschr(b):
"""
convert a CP-1251 byte to a unicode character.
This will make both cyrillic and latin text readable.
"""
if 32 <= b < 0x7F:
return "%c" % b
elif 0x80 <= b <= 0xFF:
try:
c = struct.pack("<B", b).decode("cp1251")
if c:
return c
except UnicodeDecodeError:
# 0x98 is the only invalid cp1251 character.
pass
return "."
def asasc(line):
"""
convert a CP-1251 encoded byte-array to a line of unicode characters.
"""
return "".join(aschr(_) for _ in line)
def hexdump(ofs, data, args):
"""
Output offset prefixed lines of hex + ascii characters.
"""
w = args.width
if args.ascdump:
fmt = "%08x: %s"
else:
fmt = "%%08x: %%-%ds %%s" % (3 * w - 1)
for o in range(0, len(data), w):
if args.ascdump:
print(fmt % (o + ofs, asasc(data[o:o+w])))
else:
print(fmt % (o + ofs, ashex(data[o:o+w]), asasc(data[o:o+w])))
def tohex(data):
"""
Convert a byte-array to a sequence of 2-digit hex values without separators.
"""
return b2a_hex(data).decode("ascii")
def toout(args, data):
"""
Return either ascdump or hexdump, depending on the `args.ascdump` flag.
"""
if args.ascdump:
return asasc(data)
else:
return tohex(data)
def strescape(txt):
"""
Convert bytes or text to a c-style escaped string.
"""
if type(txt) == bytes:
txt = txt.decode("cp1251")
txt = txt.replace("\\", "\\\\")
txt = txt.replace("\n", "\\n")
txt = txt.replace("\r", "\\r")
txt = txt.replace("\t", "\\t")
txt = txt.replace('"', '\\"')
return txt

59
crodump/koddecoder.py Normal file

@@ -0,0 +1,59 @@
"""
Decode CroStru KOD encoding.
"""
INITIAL_KOD = [
0x08, 0x63, 0x81, 0x38, 0xA3, 0x6B, 0x82, 0xA6, 0x18, 0x0D, 0xAC, 0xD5, 0xFE, 0xBE, 0x15, 0xF6,
0xA5, 0x36, 0x76, 0xE2, 0x2D, 0x41, 0xB5, 0x12, 0x4B, 0xD8, 0x3C, 0x56, 0x34, 0x46, 0x4F, 0xA4,
0xD0, 0x01, 0x8B, 0x60, 0x0F, 0x70, 0x57, 0x3E, 0x06, 0x67, 0x02, 0x7A, 0xF8, 0x8C, 0x80, 0xE8,
0xC3, 0xFD, 0x0A, 0x3A, 0xA7, 0x73, 0xB0, 0x4D, 0x99, 0xA2, 0xF1, 0xFB, 0x5A, 0xC7, 0xC2, 0x17,
0x96, 0x71, 0xBA, 0x2A, 0xA9, 0x9A, 0xF3, 0x87, 0xEA, 0x8E, 0x09, 0x9E, 0xB9, 0x47, 0xD4, 0x97,
0xE4, 0xB3, 0xBC, 0x58, 0x53, 0x5F, 0x2E, 0x21, 0xD1, 0x1A, 0xEE, 0x2C, 0x64, 0x95, 0xF2, 0xB8,
0xC6, 0x33, 0x8D, 0x2B, 0x1F, 0xF7, 0x25, 0xAD, 0xFF, 0x7F, 0x39, 0xA8, 0xBF, 0x6A, 0x91, 0x79,
0xED, 0x20, 0x7B, 0xA1, 0xBB, 0x45, 0x69, 0xCD, 0xDC, 0xE7, 0x31, 0xAA, 0xF0, 0x65, 0xD7, 0xA0,
0x32, 0x93, 0xB1, 0x24, 0xD6, 0x5B, 0x9F, 0x27, 0x42, 0x85, 0x07, 0x44, 0x3F, 0xB4, 0x11, 0x68,
0x5E, 0x49, 0x29, 0x13, 0x94, 0xE6, 0x1B, 0xE1, 0x7D, 0xC8, 0x2F, 0xFA, 0x78, 0x1D, 0xE3, 0xDE,
0x50, 0x4E, 0x89, 0xB6, 0x30, 0x48, 0x0C, 0x10, 0x05, 0x43, 0xCE, 0xD3, 0x61, 0x51, 0x83, 0xDA,
0x77, 0x6F, 0x92, 0x9D, 0x74, 0x7C, 0x04, 0x88, 0x86, 0x55, 0xCA, 0xF4, 0xC1, 0x62, 0x0E, 0x28,
0xB7, 0x0B, 0xC0, 0xF5, 0xCF, 0x35, 0xC5, 0x4C, 0x16, 0xE0, 0x98, 0x00, 0x9B, 0xD9, 0xAE, 0x03,
0xAF, 0xEC, 0xC9, 0xDB, 0x6D, 0x3B, 0x26, 0x75, 0x3D, 0xBD, 0xB2, 0x4A, 0x5D, 0x6C, 0x72, 0x40,
0x7E, 0xAB, 0x59, 0x52, 0x54, 0x9C, 0xD2, 0xE9, 0xEF, 0xDD, 0x37, 0x1E, 0x8F, 0xCB, 0x8A, 0x90,
0xFC, 0x84, 0xE5, 0xF9, 0x14, 0x19, 0xDF, 0x6E, 0x23, 0xC4, 0x66, 0xEB, 0xCC, 0x22, 0x1C, 0x5C,
]
class KODcoding:
"""
class handling KOD encoding and decoding, optionally
with a user specified KOD table.
"""
def __init__(self, initial=INITIAL_KOD):
self.kod = [_ for _ in initial]
# calculate the inverse table.
self.inv = [0 for _ in initial]
for i, x in enumerate(self.kod):
self.inv[x] = i
def decode(self, o, data):
"""
decode : shift, a[0]..a[n-1] -> b[0]..b[n-1]
b[i] = KOD[a[i]]- (i+shift)
"""
return bytes((self.kod[b] - i - o) % 256 for i, b in enumerate(data))
def encode(self, o, data):
"""
encode : shift, b[0]..b[n-1] -> a[0]..a[n-1]
a[i] = INV[b[i]+ (i+shift)]
"""
return bytes(self.inv[(b + i + o) % 256] for i, b in enumerate(data))
def new(*args):
"""
create a KODcoding object with the specified arguments.
"""
return KODcoding(*args)
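For illustration, a minimal sketch that round-trips data through the default KOD table, using only the API above:
```python
import crodump.koddecoder

kod = crodump.koddecoder.new()
data = b"CroFile\x00"
for shift in range(256):
    # encode followed by decode with the same shift must be the identity
    assert kod.decode(shift, kod.encode(shift, data)) == data
```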

81
crodump/kodump.py Normal file

@@ -0,0 +1,81 @@
"""
This module has the functions for the 'kodump' subcommand from the 'crodump' script.
"""
from .hexdump import unhex, toout, hexdump
import io
import struct
def decode_kod(kod, args, data):
"""
various methods of hexdumping KOD decoded data.
"""
if args.nokod:
# plain hexdump, no KOD decode
hexdump(args.offset, data, args)
elif args.shift:
# explicitly specified shift.
args.shift = int(args.shift, 0)
enc = kod.decode(args.shift, data)
hexdump(args.offset, enc, args)
elif args.increment:
def incdata(data, s):
"""
add 's' to each byte.
This is useful for finding the correct shift from an incorrectly shifted chunk.
"""
return b"".join(struct.pack("<B", (_ + s) & 0xFF) for _ in data)
# dump alternatives for all 256 possible increments.
for s in range(256):
enc = incdata(data, s)
print("%02x: %s" % (s, toout(args, enc)))
else:
# output with all possible 'shift' values.
for s in range(256):
if args.invkod:
enc = kod.encode(s, data)
else:
enc = kod.decode(s, data)
print("%02x: %s" % (s, toout(args, enc)))
def kod_hexdump(kod, args):
"""
handle the `kodump` subcommand, KOD decode a section of a data file
This function is mostly useful for reverse-engineering the database format.
"""
args.offset = int(args.offset, 0)
if args.length:
args.length = int(args.length, 0)
elif args.endofs:
args.endofs = int(args.endofs, 0)
args.length = args.endofs - args.offset
if args.width:
args.width = int(args.width, 0)
else:
args.width = 64 if args.ascdump else 16
if args.filename:
with open(args.filename, "rb") as fh:
if args.length is None:
fh.seek(0, io.SEEK_END)
filesize = fh.tell()
args.length = filesize - args.offset
fh.seek(args.offset)
data = fh.read(args.length)
decode_kod(kod, args, data)
else:
# no filename -> read from stdin.
import sys
data = sys.stdin.buffer.read()
if args.unhex:
data = unhex(data)
decode_kod(kod, args, data)

96
crodump/readers.py Normal file

@@ -0,0 +1,96 @@
import struct
class ByteReader:
"""
The ByteReader object is used when decoding various variable sized structures.
all functions raise EOFError when attempting to read beyond the end of the buffer.
functions starting with `read` advance the current position.
"""
def __init__(self, data):
self.data = data
self.o = 0
def readbyte(self):
"""
Reads a single byte
"""
if self.o + 1 > len(self.data):
raise EOFError()
self.o += 1
return struct.unpack_from("<B", self.data, self.o - 1)[0]
def testbyte(self, bytevalue):
"""
returns True when the current byte matches `bytevalue`.
"""
if self.o + 1 > len(self.data):
raise EOFError()
return self.data[self.o] == bytevalue
def readword(self):
"""
Reads a 16 bit unsigned little endian value
"""
if self.o + 2 > len(self.data):
raise EOFError()
self.o += 2
return struct.unpack_from("<H", self.data, self.o - 2)[0]
def readdword(self):
"""
Reads a 32 bit unsigned little endian value
"""
if self.o + 4 > len(self.data):
raise EOFError()
self.o += 4
return struct.unpack_from("<L", self.data, self.o - 4)[0]
def readbytes(self, n=None):
"""
Reads the specified number of bytes, or
when no size was specified, the remaining bytes in the buffer
"""
if n is None:
n = len(self.data) - self.o
if self.o + n > len(self.data):
raise EOFError()
self.o += n
return self.data[self.o-n:self.o]
def readlongstring(self):
"""
Reads a cp1251 encoded string prefixed with a dword sized length
"""
namelen = self.readdword()
return self.readbytes(namelen).decode("cp1251")
def readname(self):
"""
Reads a cp1251 encoded string prefixed with a byte sized length
"""
namelen = self.readbyte()
return self.readbytes(namelen).decode("cp1251")
def readtoseperator(self, sep):
"""
reads bytes up to a byte sequence matching `sep`.
When no `sep` is found, returns the remaining bytes in the buffer.
"""
if self.o > len(self.data):
raise EOFError()
oldoff = self.o
off = self.data.find(sep, self.o)
if off >= 0:
self.o = off + len(sep)
return self.data[oldoff:off]
else:
self.o = len(self.data)
return self.data[oldoff:]
def eof(self):
"""
return True when the current position is at or beyond the end of the buffer.
"""
return self.o >= len(self.data)
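For illustration, a minimal sketch that decodes a length-prefixed name and a dword with `ByteReader` (the input buffer here is made up):
```python
from crodump.readers import ByteReader

rd = ByteReader(b"\x04Base\x2a\x00\x00\x00")
assert rd.readname() == "Base"   # byte-length-prefixed cp1251 string
assert rd.readdword() == 42      # little endian uint32
assert rd.eof()
```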

322
docs/cronos-research.md Normal file

@@ -0,0 +1,322 @@
# About Cronos databases.
A _cronos database_ consists of these files:
CroBank.dat
CroBank.tad
CroIndex.dat
CroIndex.tad
CroStru.dat
CroStru.tad
and a Vocabulary database with another set of these files in a subdirectory `Voc/`.
`CroIndex.*` can be ignored for most dumping purposes, unless the user suspects there to be residues of deleted data.
Additionally there are the `CroSys.dat` and `CroSys.tad` files in the cronos application directory, which list the currently
known databases.
## app installation
On a default non-russian Windows installation, the CronosPro app exhibits several encoding issues, which can be fixed like this:
reg set HKLM\System\CurrentControlSet\Control\Nls\Codepage 1250=c_1251.nls 1252=c_1251.nls
[from](https://ixnfo.com/en/question-marks-instead-of-russian-letters-a-solution-to-the-problem-with-windows-encoding.html)
Also note that the v3 cronos app will run without problems on a Linux machine using [wine](https://winehq.org/).
## Files ending in .dat
All .dat files start with a 19 byte header:
char magic[8] // always: 'CroFile\x00'
uint16 unknown
char version[5] // 01.XX see, below
uint16 encoding // bitfield: bit0 = KOD, bit1 = ?
uint16 blocksize // 0x0040, 0x0200 or 0x0400
Most Bank files use blocksize == 0x0040
most Index files use blocksize == 0x0400
most Stru files use blocksize == 0x0200
This is followed by a block of 0x101 or 0x100 minus 19 bytes of seemingly random data.
The unknown word seems not to be random; it might be a checksum.
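As a sketch, this header can be parsed with python's `struct` module, mirroring what `Datafile.readdathdr` in this repository does (the filename is just an example):

    import struct
    with open("CroStru.dat", "rb") as fh:
        magic, unk, version, encoding, blocksize = struct.unpack("<8sH5sHH", fh.read(19))
        assert magic == b"CroFile\x00"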
## File versions
* Pre cronos pro used version `01.01`.
* Cronos version 3 introduced version indicators of `01.02`, `01.03`, `01.04` and `01.05`.
* `01.02` and `01.04` are called "small model", i.e. 32 bit offsets,
* `01.03` and `01.05` are called "big model", i.e. 64 bit offsets.
* `01.04` and `01.05` are called "lite".
* Cronos version 4 introduced version indicators of `01.11`, `01.13` and `01.14`.
* `01.11` are called "small model", i.e. 32 bit offsets,
* `01.13` are called "pro".
* `01.14` are called "lite".
* Cronos version 7 introduced the version indicator `01.19`.
## Files ending in .tad
The first two `uint32` are the number of deleted records and the tad offset to the first deleted entry.
The deleted entries form a linked list; their size field is always 0xFFFFFFFF.
Depending on the version in the `.dat` header, `.tad` files use either 32-bit or 64-bit file offsets.
Versions `01.02` and `01.04` use 32-bit offsets:
uint32 offset
uint32 size // with flag in upper bit, 0 -> large record
uint32 checksum // but sometimes just 0x00000000, 0x00000001 or 0x00000002
Versions `01.03`, `01.05` and `01.11` use 64-bit offsets:
uint64 offset
uint32 size // with flag in upper bit, 0 -> large record
uint32 checksum // but sometimes just 0x00000000, 0x00000001 or 0x00000002
where size can be 0xffffffff (indicating a free/deleted block).
Bit 31 of the size indicates that this is an extended record.
Extended records start with plaintext: { uint32 offset, uint32 size } or { uint64 offset, uint32 size }
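A minimal sketch of reading one such entry (names are illustrative; the free-block and bit-31 handling follows the tables above):

```python
import struct

def parse_tad_entry(buf, pos, use64bit):
    # layout from the tables above: offset, size (flag in bit 31), checksum
    if use64bit:
        offset, size, checksum = struct.unpack_from("<QLL", buf, pos)
        pos += 16
    else:
        offset, size, checksum = struct.unpack_from("<LLL", buf, pos)
        pos += 12
    if size == 0xFFFFFFFF:
        return None, pos                        # free / deleted block
    extended = bool(size & 0x80000000)          # bit 31 set -> extended record
    return (offset, size & 0x7FFFFFFF, checksum, extended), pos
```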
## the 'old format'
The original description made it look like there were different formats for the block references.
This was found in previously existing documentation, but no sample databases in this format have been found so far.
If the .dat file has a version of 01.03 or later, the corresponding .tad file looks like this:
uint32_t offset
uint32_t size // with flag in upper bit, 0 -> large record
uint32_t checksum // but sometimes just 0x00000000, 0x00000001 or 0x00000002
uint32_t unknown // mostly 0
The old description also assumed 12-byte reference blocks, but as a packed struct, probably when the CroFile version is 01.01.
uint32 offset1
uint16 size1
uint32 offset2
uint16 size2
with the first chunk read from offset1 with length size1, and potentially more parts with a total length of size2 starting at file offset offset2, where the first `uint32` of each 256-byte chunk is the next chunk's offset and at most 252 bytes are actual data.
However, I have never found .tad files like that. The original description also insisted that those chunks need the decode magic outlined below, but the Python implementation only applies it to CroStru files and still seems to produce results.
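For completeness, a sketch of how such a chunk chain would be followed, if such files exist (untested, since no samples were found):

```python
import struct

def read_chained_record(f, offset1, size1, offset2, size2):
    # First part: size1 bytes at offset1. Continuation: 256-byte chunks at
    # offset2, each starting with the next chunk's offset, then up to 252
    # payload bytes, until size2 bytes have been collected.
    f.seek(offset1)
    out = f.read(size1)
    remaining, offset = size2, offset2
    while remaining > 0:
        f.seek(offset)
        chunk = f.read(256)
        offset = struct.unpack_from("<L", chunk, 0)[0]
        take = min(remaining, 252)
        out += chunk[4:4 + take]
        remaining -= take
    return out
```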
## CroStru
The interesting file is CroStru.dat, which contains metadata about the database in blocks whose offset and size are found in CroStru.tad. Each byte of these blocks is substituted through an sbox found in the cro2sql sources and then adjusted by a one-byte counter initialised with a per-block offset. The sbox looks like this:
unsigned char kod[256] = {
0x08, 0x63, 0x81, 0x38, 0xa3, 0x6b, 0x82, 0xa6,
0x18, 0x0d, 0xac, 0xd5, 0xfe, 0xbe, 0x15, 0xf6,
0xa5, 0x36, 0x76, 0xe2, 0x2d, 0x41, 0xb5, 0x12,
0x4b, 0xd8, 0x3c, 0x56, 0x34, 0x46, 0x4f, 0xa4,
0xd0, 0x01, 0x8b, 0x60, 0x0f, 0x70, 0x57, 0x3e,
0x06, 0x67, 0x02, 0x7a, 0xf8, 0x8c, 0x80, 0xe8,
0xc3, 0xfd, 0x0a, 0x3a, 0xa7, 0x73, 0xb0, 0x4d,
0x99, 0xa2, 0xf1, 0xfb, 0x5a, 0xc7, 0xc2, 0x17,
0x96, 0x71, 0xba, 0x2a, 0xa9, 0x9a, 0xf3, 0x87,
0xea, 0x8e, 0x09, 0x9e, 0xb9, 0x47, 0xd4, 0x97,
0xe4, 0xb3, 0xbc, 0x58, 0x53, 0x5f, 0x2e, 0x21,
0xd1, 0x1a, 0xee, 0x2c, 0x64, 0x95, 0xf2, 0xb8,
0xc6, 0x33, 0x8d, 0x2b, 0x1f, 0xf7, 0x25, 0xad,
0xff, 0x7f, 0x39, 0xa8, 0xbf, 0x6a, 0x91, 0x79,
0xed, 0x20, 0x7b, 0xa1, 0xbb, 0x45, 0x69, 0xcd,
0xdc, 0xe7, 0x31, 0xaa, 0xf0, 0x65, 0xd7, 0xa0,
0x32, 0x93, 0xb1, 0x24, 0xd6, 0x5b, 0x9f, 0x27,
0x42, 0x85, 0x07, 0x44, 0x3f, 0xb4, 0x11, 0x68,
0x5e, 0x49, 0x29, 0x13, 0x94, 0xe6, 0x1b, 0xe1,
0x7d, 0xc8, 0x2f, 0xfa, 0x78, 0x1d, 0xe3, 0xde,
0x50, 0x4e, 0x89, 0xb6, 0x30, 0x48, 0x0c, 0x10,
0x05, 0x43, 0xce, 0xd3, 0x61, 0x51, 0x83, 0xda,
0x77, 0x6f, 0x92, 0x9d, 0x74, 0x7c, 0x04, 0x88,
0x86, 0x55, 0xca, 0xf4, 0xc1, 0x62, 0x0e, 0x28,
0xb7, 0x0b, 0xc0, 0xf5, 0xcf, 0x35, 0xc5, 0x4c,
0x16, 0xe0, 0x98, 0x00, 0x9b, 0xd9, 0xae, 0x03,
0xaf, 0xec, 0xc9, 0xdb, 0x6d, 0x3b, 0x26, 0x75,
0x3d, 0xbd, 0xb2, 0x4a, 0x5d, 0x6c, 0x72, 0x40,
0x7e, 0xab, 0x59, 0x52, 0x54, 0x9c, 0xd2, 0xe9,
0xef, 0xdd, 0x37, 0x1e, 0x8f, 0xcb, 0x8a, 0x90,
0xfc, 0x84, 0xe5, 0xf9, 0x14, 0x19, 0xdf, 0x6e,
0x23, 0xc4, 0x66, 0xeb, 0xcc, 0x22, 0x1c, 0x5c,
};
Given the `shift`, the encoded data `a[0]..a[n-1]` and the decoded data `b[0]..b[n-1]`, the transformation works as follows (all arithmetic modulo 256):
decode: b[i] = KOD[a[i]] - (i+shift)
encode: a[i] = INV[b[i] + (i+shift)]
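In Python, assuming the sbox above has been transcribed into a list named `KOD`:

```python
# KOD is the 256-entry sbox above as a Python list; INV is its inverse permutation.
INV = [0] * 256
for i, v in enumerate(KOD):
    INV[v] = i

def kod_decode(shift, data):
    return bytes((KOD[c] - (i + shift)) & 0xFF for i, c in enumerate(data))

def kod_encode(shift, data):
    return bytes(INV[(c + (i + shift)) & 0xFF] for i, c in enumerate(data))
```

Note that `kod_encode(shift, kod_decode(shift, data)) == data` for any shift, so either direction can be verified against the other.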
The original description of an older database format called the per-block counter start offset 'sistN', which seems to imply that it is constant for certain entries. These offsets correspond to a "system number" of meta entries visible in the database software; for encoded records this is their primary key.
I noticed that the first 256 bytes of CroStru.dat look nearly identical (except for the first 16 bytes) to those of CroBank.dat.
The toplevel table-id for CroStru and CroSys is #3, while referenced records have tableid #4.
## CroBank
CroBank.dat contains the actual database entries for multiple tables, as described in the CroStru file. Each chunk is first re-assembled (and potentially decoded, with the per-block offset being the record number in the .tad file).
The record's first byte defines which table it belongs to. The record is encoded in cp1251 (or possibly IBM866), with actual column data separated by 0x1e bytes.
There is an extra concept of subfields within those columns, indicated by a 0x1d byte.
Fields of field types 6 and 9 start with a 0x1b byte, followed by a uint32 giving the size of the actual field data. They may then contain further 0x1e bytes acting as subfield separators.
For field type 6, the field begins with two uint32 values (the first mostly 0x00000001, the second the size of the following strings), followed by three 0x1e-separated strings containing the file name, file extension and system number of the file record that this record refers to.
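A sketch of splitting one re-assembled record along these rules (names are mine; splitting subfields on 0x1d is left out for brevity):

```python
import struct

def split_bank_record(record):
    # Split a re-assembled CroBank record into (tableid, fields).
    # Framed fields (types 6 and 9) are consumed by their declared size,
    # so embedded separator bytes do not split them.
    tableid, data = record[0], record[1:]
    fields, pos = [], 0
    while pos < len(data):
        if data[pos:pos + 1] == b"\x1b":                  # framed field
            size = struct.unpack_from("<L", data, pos + 1)[0]
            fields.append(data[pos + 5:pos + 5 + size])
            pos += 5 + size
            if data[pos:pos + 1] == b"\x1e":              # trailing separator
                pos += 1
        else:
            end = data.find(b"\x1e", pos)
            if end < 0:
                end = len(data)
            fields.append(data[pos:end])
            pos = end + 1
    return tableid, fields
```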
## structure definitions
Records start numbering at '1'.
Names are stored as: `byte strlen + char value[strlen]`
The first entry contains:
uint8
array {
Name keyname
uint32 index_or_size; // size when bit31 is set.
uint8 data[size]
}
This results in a dictionary with keys like `Bank`, `BankId`, `BankTable`, `Base`nnn, etc.
The `Base000` entry contains the record number for the table definition of the first table.
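A sketch of building that dictionary with a byte reader like the one in this repository (treating entries without bit 31 set as record references is an assumption):

```python
def parse_first_entry(rd):
    # Build the key -> value dictionary from the first CroStru entry.
    entries = {}
    rd.readbyte()                                  # leading uint8, meaning unknown
    while not rd.eof():
        keyname = rd.readname()
        index_or_size = rd.readdword()
        if index_or_size & 0x80000000:             # bit 31 set -> inline data
            entries[keyname] = rd.readbytes(index_or_size & 0x7FFFFFFF)
        else:
            entries[keyname] = index_or_size       # assumed: record reference
    return entries
```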
## table definitions
uint16 unk1
union {
uint8 shortversion; // 1
uint16 version; // >1
}
uint8 somelen; // 5 or 9
struct {
uint8 unk3
uint32 unk4 // not there when 'somelen'==5
uint32 unk5
}
uint32 tableid
Name tablename
Name abbreviation
uint32 unk7
uint32 nrfields
array {
uint16 entrysize -- total nr of bytes in this entry.
uint16 fieldtype // see below
uint32 fieldindex1 // presentation index (i.e. where in the UI it shows)
Name fieldname
uint32 flags
uint8 alwaysone // maybe the 'minvalue'
uint32 fieldindex2 // serialization index (i.e. where in the record in the .dat it appears)
uint32 fieldsize // max fieldsize
uint32 unk4
...
followed by remaining unknown bytes
} fields[nrfields]
uint32 extradatstr // amount of unknown length indexed data strings between field definition blocks
array {
uint16 datalen
uint8[datalen]
} datastrings[extradatstr]
uint32 unk8
uint8 fielddefblock // always 2, probably the number of this block of field definitions
uint32 unk9
uint32 nrextrafields
array {
... as above
} extrafields[nrextrafields]
followed by remaining unknown bytes
...
To obtain field definitions for all the fields of a record in the .dat for that table,
the concatenation of `fields` and `extrafields` must be sorted by `fieldindex2`.
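For example, assuming each parsed field object exposes its `fieldindex2`:

```python
all_fields = sorted(fields + extrafields, key=lambda f: f.fieldindex2)
```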
## field types
The interface offers the following field types for table columns:
* 0 - Системный номер = Primary Key ID
* 1 - Числовое = Numeric
* 2 - Текстовое = Text
* 3 - Словарное = Dictionary
* 4 - Дата = Date
* 5 - Время = Time
* 6 - Файл = File (internal)
* 29 - Внешний файл = File (external)
* 7 - Прямая ссылка = Direct link
* 8 - Обратная ссылка = Back link
* 9 - Прямая-Обратная ссылка = Direct-Reverse link
* 17 - Связь по полю = Link by field
Other unassigned values in the table entry definition are
* Dictionary Base (defaults to 0)
* номер в записи = number in the record
* Длина Поля = Field size
* Flags:
* (0x2000) Множественное = Multiple
* (0x0800) Информативное = Informative
* (0x0040) Некорректируемое = Uncorrectable
* (0x1000) поиск на вводе = input search
* (?) символьное = symbolic
* (?) Лемматизировать = Lemmatize
* (?) поиск по значениям = search by values
* (0x0200) замена непустого значения = replacement of a non-empty value
* (0x0100) замена значения = value replacement
* (0x0004) автозаполнения = autocomplete
* (?) корневая связь = root connection
* (?) допускать дубли = allow duplicates
* (0x0002) обязательное = obligatory
## compressed records
Some records are compressed; the format is like this:
multiple-chunks {
uint16 size; // stored in bigendian format.
uint8 head[2] = { 8, 0 }
uint32 crc32
uint8 compdata[size-6]
}
uint8 tail[3] = { 0, 0, 2 }
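A sketch of walking these chunks (the crc byte order and the compression algorithm are not confirmed; the 0x08 head byte hints at DEFLATE):

```python
import struct

def iter_compressed_chunks(data):
    # Walk the chunk list until the 3-byte tail marker 00 00 02.
    pos = 0
    while data[pos:pos + 3] != b"\x00\x00\x02":
        size = struct.unpack_from(">H", data, pos)[0]      # chunk size, big-endian
        crc = struct.unpack_from("<L", data, pos + 4)[0]   # byte order is an assumption
        payload = data[pos + 8:pos + 2 + size]             # size covers head + crc + payload
        yield crc, payload
        pos += 2 + size
```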
# v4 format
The header version 01.11 indicates a database created with Cronos v4.x.
## .tad
A 4 dword header:
dword -2
dword nr deleted
dword first deleted
dword 0
16 byte records:
qword offset, with flags in upper 8 bits.
dword size
dword unk
flags:
02,03 - deleted record.
04 - compressed { int16be size; int16be flag; int32le crc; byte data[size-6]; } 00 00 02
00 - extended record
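A sketch of decoding one such record (names illustrative):

```python
import struct

def parse_v4_tad_record(buf, pos):
    # 16-byte record: qword offset (flags in the top 8 bits), dword size, dword unknown
    offset, size, unk = struct.unpack_from("<QLL", buf, pos)
    flags = offset >> 56
    offset &= (1 << 56) - 1
    return offset, size, unk, flags   # flags: 02/03 deleted, 04 compressed, 00 extended
```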
## .dat
The .dat file of a 01.11 database has 64bit offsets, like the 01.03 file format.

51
setup.py Normal file

@ -0,0 +1,51 @@
from setuptools import setup
setup(
name = "cronodump",
version = "1.1.0",
entry_points = {
'console_scripts': [
'croconvert=crodump.croconvert:main',
'crodump=crodump.crodump:main',
],
},
packages = ['crodump'],
author = "Willem Hengeveld, Dirk Engling",
author_email = "itsme@xs4all.nl, erdgeist@erdgeist.org",
description = "Tool and library for extracting data from Cronos databases.",
long_description_content_type='text/markdown',
long_description = """
The cronodump utility can parse most of the databases created by the [CronosPro](https://www.cronos.ru/) database software
and dump it to several output formats.
The software is popular among Russian public offices, companies and police agencies.
Example usage:
croconvert --csv <yourdbpath>
will create a .csv dump of all records in your database.
or:
crodump strudump <yourdbpath>
will print details on the internal definitions of the tables present in your database.
For more details see the [README.md](https://github.com/alephdata/cronodump/blob/master/README.md) file.
""",
license = "MIT",
keywords = "cronos dataconversion databaseexport",
url = "https://github.com/alephdata/cronodump/",
classifiers = [
'Environment :: Console',
'Intended Audience :: End Users/Desktop',
'Intended Audience :: Developers',
'License :: OSI Approved :: MIT License',
'Operating System :: OS Independent',
'Programming Language :: Python :: 3.7',
'Topic :: Utilities',
'Topic :: Database',
],
python_requires = '>=3.7',
extras_require={ 'templates': ['Jinja2'] },
)

58
templates/html.j2 Normal file

@ -0,0 +1,58 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Cronos Database Dump</title>
</head>
<body>
{% for table in db.enumerate_tables(files=True) %}
<table>
<caption>{{ table.tablename | e }}</caption>
<thead>
<tr>
{%- for field in table.fields %}
<th>{{ field.name | e }}</th>
{%- endfor %}
<th>Data</th>
</tr>
</thead>
<tbody>
{% for system_number, file in db.enumerate_files(table) %}
<tr>
<td>{{ system_number | e }}</td>
<td><a href="data:application/x-binary;base64,{{ base64.b64encode( file ).decode('utf-8') }}">File content</a></td>
</tr>
{% endfor %}
</tbody>
</table>
{% endfor %}
{% for table in db.enumerate_tables(files=False) %}
{%- if table.tableimage -%}
<img src="data:image;base64,{{ base64.b64encode( table.tableimage.data ).decode('utf-8') }}"/>
{%- endif -%}
<table>
<caption>{{ table.tablename | e }}</caption>
<thead>
<tr>
{%- for field in table.fields %}
<th>{{ field.name | e }}</th>
{%- endfor %}
</tr>
</thead>
<tbody>
{%- for record in db.enumerate_records( table ) %}
<tr>
{%- for field in record.fields %}
{%- if field.typ == 6 and field.content -%}
<td><a download="{{ field.filename }}.{{ field.extname }}" href="data:application/x-binary;base64,{{ db.get_record( field.filedatarecord, True ) }}">{{ field.filename | e }}.{{ field.extname | e }}</a></td>
{%- else -%}
<td>{{ field.content | e }}</td>
{%- endif -%}
{%- endfor %}
</tr>
{%- endfor %}
</tbody>
</table>
{% endfor %}
</body>
</html>

20
templates/postgres.j2 Normal file

@ -0,0 +1,20 @@
{% for table in db.enumerate_tables(files=False) %}
CREATE TABLE "{{ table.tablename | replace('"', '_') }}" (
{%- for field in table.fields %}
"{{ field.name | replace('"', '_') }}" {{ field.sqltype() -}}
{{- ", " if not loop.last else "" -}}
{%- endfor %}
);
INSERT INTO "{{ table.tablename | replace('"', '_') }}" VALUES
{%- for record in db.enumerate_records( table ) %}
( {%- for field in record.fields -%}
'{{ field.content | replace("'", "''") }}' {{- ", " if not loop.last else "" -}}
{%- endfor -%}
)
{{- ", " if not loop.last else "" -}}
{%- endfor %}
;
{% endfor %}

Binary files not shown (12 files).