Download version 1.4.8: Windows or Linux

Persistency Library: Markup Document Processing

ECF: markup-docs.ecf

Overview

Classes for processing documents encoded with various kinds of markup language.

1. Read OpenDocument Flat XML spreadsheets using VTD-XML.

2. Read and export emails from the Thunderbird email client.

Directory: library/persistency

. /thunderbird

. /thunderbird/reader

. /thunderbird/support

. /thunderbird/test

. /xml/open-office-spreadsheet

thunderbird

EL_LOCALIZED_THUNDERBIRD_ACCOUNT_READER

Reads Thunderbird HTML email documents from a selected account whose content folders are organized into sub-folders, each named with a two-letter language code and holding the localized versions of the documents. For example:

foo/en
foo/de
foo/fr

bar/en
bar/de
bar/fr

Each document folder is read and processed by a class conforming to EL_THUNDERBIRD_FOLDER_READER
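The folder layout above can be illustrated with a short Python sketch. It is not part of the Eiffel library: the function name `group_by_language` and the rule for recognizing a language sub-folder (a two-letter alphabetic final path step) are assumptions made for illustration only.

```python
from collections import defaultdict


def group_by_language(folder_paths):
    """Map each document folder to the language codes found beneath it.

    Assumes paths shaped like "<folder>/<two-letter code>"; any other
    path is skipped.
    """
    localized = defaultdict(list)
    for path in folder_paths:
        folder, _, code = path.rpartition("/")
        if folder and len(code) == 2 and code.isalpha():
            localized[folder].append(code.lower())
    return dict(localized)


paths = ["foo/en", "foo/de", "foo/fr", "bar/en", "bar/de", "bar/fr", "drafts"]
for folder, codes in group_by_language(paths).items():
    print(folder, codes)
```

In this sketch a folder such as `drafts`, which has no language sub-folder, is simply ignored.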

Further Information

Click on class link to see client examples.

EL_LOCALIZED_THUNDERBIRD_BOOK_EXPORTER

Merge a localized folder of emails into a single HTML book, with chapter numbers and titles derived from the subject line.

EL_THUNDERBIRD_ACCOUNT_READER

Reads Thunderbird HTML email documents from a selected account, as configured by a Pyxis document.

pyxis-doc:
   version = 1.0; encoding = "UTF-8"

thunderbird:
   account = "<email account name>"; export_dir = "<export path>"
   language = "<optional language code>"
   folders:
      "<folder name 1>"
      "<folder name 2>"

thunderbird/reader

EL_THUNDERBIRD_BOOK_EXPORTER

Merge Thunderbird folder of emails into an HTML book

EL_THUNDERBIRD_EXPORT_AS_XHTML

Export contents of Thunderbird email folder as XHTML files

EL_THUNDERBIRD_EXPORT_AS_XHTML_BODY

Extract all HTML between the <body> and </body> tags and output it as <subject name>.body. Insert a page anchor before each h2 heading:

<a id="Title 1"></a>
<h2>Title 1</h2>

Insert a class attribute into the first h2 element in the page.

<h2 class="first">Title 1</h2>
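The two transformations above can be sketched in Python. This is an illustration of the described output only, not the Eiffel implementation: the function names `extract_body` and `add_anchors` are hypothetical, and the sketch assumes plain `<h2>` tags without existing attributes.

```python
import re

# Assumption: headings are written as plain <h2>...</h2> with no attributes.
H2_PATTERN = re.compile(r"<h2>(.*?)</h2>", re.DOTALL)


def extract_body(html: str) -> str:
    """Return the markup between <body> and </body>, or "" if there is none."""
    match = re.search(r"<body[^>]*>(.*?)</body>", html, re.DOTALL | re.IGNORECASE)
    return match.group(1).strip() if match else ""


def add_anchors(body: str) -> str:
    """Put a page anchor before every h2 and mark the first h2 with class="first"."""
    count = 0

    def replace(match):
        nonlocal count
        count += 1
        title = match.group(1)
        css = ' class="first"' if count == 1 else ""
        return f'<a id="{title}"></a>\n<h2{css}>{title}</h2>'

    return H2_PATTERN.sub(replace, body)


page = "<html><body><h2>Title 1</h2><p>one</p><h2>Title 2</h2></body></html>"
print(add_anchors(extract_body(page)))
```

A regular expression keeps the sketch short; a real exporter working on arbitrary email HTML would more likely use a proper parser.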

EL_THUNDERBIRD_FOLDER_EXPORTER

Export the filtered contents of a Thunderbird email folder as HTML, edited by a class conforming to EL_HTML_WRITER

EL_THUNDERBIRD_FOLDER_READER

Reads a folder of Thunderbird HTML email content, collecting the email headers in field_table and the HTML content in the line list html_lines. The event handler on_email_end is then called before the next email is processed.
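The read loop described here might look roughly like the following Python sketch. The names `field_table`, `html_lines` and `on_email_end` are taken from the description; everything else is an assumption, in particular that the folder file is mbox-style text in which each email starts with a line beginning `From `, followed by headers, a blank line and the body. Folded header continuation lines are skipped for brevity.

```python
class FolderReader:
    """Line-oriented sketch of an email folder reader (assumed mbox-style input)."""

    def __init__(self):
        self.field_table = {}   # headers of the current email
        self.html_lines = []    # body lines of the current email
        self.summary = []       # filled in by the example handler below

    def on_email_end(self):
        # Example handler: record the subject and number of body lines.
        subject = self.field_table.get("Subject", "")
        self.summary.append((subject, len(self.html_lines)))

    def read(self, lines):
        in_email = in_headers = False
        for line in lines:
            line = line.rstrip("\n")
            if line.startswith("From "):
                if in_email:
                    self.on_email_end()  # finish the previous email first
                self.field_table, self.html_lines = {}, []
                in_email = in_headers = True
            elif not in_email:
                continue
            elif in_headers:
                if not line:
                    in_headers = False   # blank line separates headers from body
                elif line[0] in " \t":
                    continue             # folded header continuation, ignored here
                elif ":" in line:
                    name, _, value = line.partition(":")
                    self.field_table[name.strip()] = value.strip()
            else:
                self.html_lines.append(line)
        if in_email:
            self.on_email_end()          # last email in the folder


reader = FolderReader()
reader.read([
    "From - Mon Jan 1\n", "Subject: Hello\n", "\n", "<html>\n", "</html>\n",
    "From - Tue Jan 2\n", "Subject: Bye\n", "\n", "<html></html>\n",
])
print(reader.summary)
```

A subclass would override `on_email_end` to export or transform each email, which mirrors the way the description has the handler called once per email.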

thunderbird/support

EL_BOOK_CHAPTER

Book chapter

EL_HTML_BODY_WRITER

HTML body writer

EL_HTML_WRITER

HTML writer

EL_SUBJECT_LINE_DECODER

Decodes internal Thunderbird subject lines. Examples:

"=?ISO-8859-15?Q?=DCber_My_Ching?=" -> "Über My Ching"

"=?UTF-8?B?w5xiZXLigqwgTXkgQ2hpbmc=?=" -> "Über€ My Ching"

"=?UTF-8?Q?3.Journaleintr=c3=a4ge_bearbeiten?=" -> "Journaleinträge bearbeiten"
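The subject lines above use the RFC 2047 "encoded-word" form (`=?charset?Q-or-B?payload?=`). As an illustration of the decoding step, and not of the Eiffel class itself, the Python standard library can decode the first two examples; the helper name `decode_subject` is hypothetical.

```python
from email.header import decode_header


def decode_subject(raw: str) -> str:
    """Decode an RFC 2047 encoded subject line into plain text."""
    parts = []
    for payload, charset in decode_header(raw):
        if isinstance(payload, bytes):
            # Unencoded fragments come back as bytes with no charset.
            parts.append(payload.decode(charset or "ascii", errors="replace"))
        else:
            parts.append(payload)
    return "".join(parts)


print(decode_subject("=?ISO-8859-15?Q?=DCber_My_Ching?="))
print(decode_subject("=?UTF-8?B?w5xiZXLigqwgTXkgQ2hpbmc=?="))
```

The `Q` form is a quoted-printable variant in which `_` stands for a space, and the `B` form is base64; both are decoded to bytes and then interpreted in the named character set.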

EL_SUBJECT_LIST

Subject list

EL_THUNDERBIRD_CONSTANTS

Thunderbird constants

EL_XHTML_WRITER

XHTML writer

thunderbird/test

EL_SUBJECT_LINE_DECODER_TEST_SET

Subject line decoder test set

xml/open-office-spreadsheet

EL_OPEN_OFFICE

Open office

EL_SPREAD_SHEET

Object representing OpenDocument Flat XML spreadsheets as tables of rows of data strings.

XML namespace

xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
office:mimetype="application/vnd.oasis.opendocument.spreadsheet"
office:version="1.2"
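To illustrate the "tables of rows of data strings" view of such a document, here is a minimal Python sketch. It uses the standard `xml.etree.ElementTree` parser rather than VTD-XML, and the function name `read_tables` is hypothetical. It assumes rows are direct children of `table:table` and ignores repeated-cell attributes, so it is a simplification rather than a full reader.

```python
import xml.etree.ElementTree as ET

NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "table": "urn:oasis:names:tc:opendocument:xmlns:table:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}


def read_tables(fods_xml: str) -> dict:
    """Return {table name: list of rows}, each row a list of cell strings."""
    root = ET.fromstring(fods_xml)
    tables = {}
    for table in root.iterfind(".//table:table", NS):
        name = table.get(f"{{{NS['table']}}}name", "")
        rows = []
        for row in table.iterfind("table:table-row", NS):
            cells = []
            for cell in row.iterfind("table:table-cell", NS):
                # A cell may hold several text:p paragraphs; join them with newlines.
                paragraphs = ["".join(p.itertext()) for p in cell.iterfind("text:p", NS)]
                cells.append("\n".join(paragraphs))
            rows.append(cells)
        tables[name] = rows
    return tables


SAMPLE = """<office:document
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    office:mimetype="application/vnd.oasis.opendocument.spreadsheet"
    office:version="1.2">
  <office:body><office:spreadsheet>
    <table:table table:name="Sheet1">
      <table:table-row>
        <table:table-cell><text:p>name</text:p></table:table-cell>
        <table:table-cell><text:p>price</text:p></table:table-cell>
      </table:table-row>
      <table:table-row>
        <table:table-cell><text:p>tea</text:p></table:table-cell>
        <table:table-cell><text:p>2.50</text:p></table:table-cell>
      </table:table-row>
    </table:table>
  </office:spreadsheet></office:body>
</office:document>"""

for name, rows in read_tables(SAMPLE).items():
    print(name, rows)
```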

EL_SPREAD_SHEET_DATA_CELL

Object representing table data cell in OpenDocument Flat XML format spreadsheet

EL_SPREAD_SHEET_ROW

Object representing table row in OpenDocument Flat XML format spreadsheet

EL_SPREAD_SHEET_TABLE

Object representing table in OpenDocument Flat XML format spreadsheet