ZWI file format

The ZWI file format is used to store wiki articles and their associated data. The format was developed to facilitate exchange between different wiki software. A ZWI file contains the Wikitext of the final revision of an article, the names of contributors, old revisions, embedded media and ready-to-use HTML files, as well as other (optional) file formats derived from the original Wikitext file.

Filename extension: .zwi
Internet media type: wiki
Developed by: S.V. Chekanov
Initial release: March 2021
Type of format: Compressed container format
Open format: Yes

ZWI files are compact (zipped) and can be shared over a network. The format is used for exchange between wikis and has been deployed in the HandWiki encyclopedia.[1] A registered user can use the “ZWI export” button (above the editor area) to download a wiki page. A ZWI file can be unzipped like any zip archive and has the extension .zwi.

ZWI file structure

A ZWI file is a ZIP archive, so it can be manipulated with standard zip compression tools. A typical ZWI file has the following structure:

  • article.wikitext - Wikitext of the latest revision of the article, in MediaWiki syntax. It is the main source of HTML, XHTML and other possible derivations.
  • article.html - HTML file to view in a browser (with all headers). It is a secondary (derived from article.wikitext) format.
  • article.xhtml - HTML portion with the article content (without headers, navigation etc.) (optional)
  • article.tex - article in the LaTeX file format (optional)
  • article.dokuwiki - article in the DokuWiki file format (optional)
  • metadata.json - a JSON file with the information about the articles (editors, revisions, namespaces, abstract etc.)
  • signature.json - a JSON file with the signature of the publisher (optional)
  • plugins.json - a JSON file with information about the plugins used by the software that created the file (used for a consistency check) (optional)
  • media.json - a JSON file with the list of linked media files (images)
  • data/media/[namespace]/ - directory with images associated with the article (only if they are available from the local server)
  • data/attic/[namespace]/ - directory with files with older revisions of article.wikitext. Each file has the name:
[article name].[timestamp].wikitext
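Because a ZWI file is an ordinary ZIP archive, the structure listed above can be inspected with a few lines of Python. The following is a minimal sketch using only the standard zipfile module; the function names are illustrative, and UTF-8 is assumed as the text encoding since the format description here does not state one.

```python
import io
import zipfile


def list_zwi_members(zwi_source):
    """Return the sorted member names of a ZWI archive (path or file-like object)."""
    with zipfile.ZipFile(zwi_source) as archive:
        return sorted(archive.namelist())


def read_zwi_text(zwi_source, member="article.wikitext"):
    """Return one text member of the archive, decoded as UTF-8 (assumed encoding)."""
    with zipfile.ZipFile(zwi_source) as archive:
        return archive.read(member).decode("utf-8")


if __name__ == "__main__":
    # Build a tiny stand-in archive in memory so the sketch runs without a real file.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("article.wikitext", "'''Example''' article text.")
        archive.writestr("metadata.json", "{}")
    print(list_zwi_members(buffer))
    print(read_zwi_text(buffer))
```

With a real export, a file path such as "Example.zwi" (a hypothetical name) can be passed in place of the in-memory buffer.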

The most important file is metadata.json, which describes the ZWI file itself. It gives the version of the ZWI format specification and names the file that is the primary source of derivations (article.wikitext for the MediaWiki software). All other files, such as article.html, article.tex and article.dokuwiki, are secondary conversions, since they are obtained by running converters on the original article.wikitext file.

A typical example of the metadata.json file of a Wikipedia article is given here:

{
    "ZWIversion": 1.3,
    "Title": "Ben Davidson (rugby league)",
    "ShortTitle": "Ben Davidson",
    "Topics": [
        "Rugby league",
        "New Zealand rugby league footballer"
    ],
    "Lang": "en",
    "Content": {
        "article.html": "65c821ccc989721f2fcbeeb69f6b6bed3e32b3de",
        "article.wikitext": "0783919ee5352969be3445d3c9e7a17a79392ea2",
        "article.txt": "3a1c5b32ce22d8e8d5fed9d057adcf04b1019988"
    },
    "Primary": "article.wikitext",
    "Revisions": [],
    "Publisher": "wikipedia",
    "CreatorNames": [],
    "ContributorNames": [],
    "LastModified": "1657454530",
    "TimeCreated": "1657454530",
    "PublicationDate": "2022-08-18",
    "Categories": [
        "1902 births",
        "1961 deaths"
    ],
    "Rating": [0,0],
    "Description": "Benjamin Alfred Davidson (1902 \u2013 1961) was a New Zealand rugby league footballer who represented New Zealand. ",
    "Comment": "",
    "License": "CC BY-SA 3.0",
    "GeneratorName": "MediaWiki",
    "SourceURL": "https://en.wikipedia.org/wiki/Ben_Davidson_(rugby_league)"
}
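The Content field in the example maps file names to 40-character hexadecimal strings. The sketch below assumes these are SHA-1 digests of the raw file bytes; this is a guess based only on their length and is not stated in the description above. Under that assumption, the files in an archive could be checked against metadata.json as follows (the function name is illustrative):

```python
import hashlib
import io
import json
import zipfile


def check_content_digests(zwi_source):
    """Compare each file listed under "Content" with its recorded digest.

    Returns {name: True | False | None}: True if the digest matches,
    False if it differs, None if the file is absent from the archive.
    ASSUMPTION: digests are hex-encoded SHA-1 of the raw file bytes.
    """
    results = {}
    with zipfile.ZipFile(zwi_source) as archive:
        metadata = json.loads(archive.read("metadata.json").decode("utf-8"))
        names = set(archive.namelist())
        for name, expected in metadata.get("Content", {}).items():
            if name not in names:
                results[name] = None
                continue
            actual = hashlib.sha1(archive.read(name)).hexdigest()
            results[name] = actual == expected.lower()
    return results


if __name__ == "__main__":
    # Stand-in archive whose metadata records the correct digest of its article.
    article = "'''Example''' article text.".encode("utf-8")
    metadata = {"Content": {"article.wikitext": hashlib.sha1(article).hexdigest()}}
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("article.wikitext", article)
        archive.writestr("metadata.json", json.dumps(metadata))
    print(check_content_digests(buffer))
```

If real ZWI files use a different hash function, only the hashlib.sha1 call needs to change.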

Note that PublicationDate is used for historic articles. It uses the format "yyyy-mm-dd", unlike the TimeCreated and LastModified fields, which hold timestamps (in seconds since 1970) corresponding to the creation and modification times of the ZWI file itself.
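The two date conventions can be converted to Python objects as in the sketch below. It assumes that the timestamps count seconds from 1970-01-01 in UTC (the usual Unix epoch) and that they are stored as strings, as in the example; the function name is illustrative.

```python
from datetime import date, datetime, timezone


def parse_zwi_dates(metadata):
    """Convert the date fields of a metadata.json dictionary to Python objects.

    ASSUMPTION: TimeCreated and LastModified are Unix timestamps in UTC.
    """
    return {
        "TimeCreated": datetime.fromtimestamp(
            int(metadata["TimeCreated"]), tz=timezone.utc
        ),
        "LastModified": datetime.fromtimestamp(
            int(metadata["LastModified"]), tz=timezone.utc
        ),
        # PublicationDate uses the "yyyy-mm-dd" format.
        "PublicationDate": date.fromisoformat(metadata["PublicationDate"]),
    }


if __name__ == "__main__":
    example = {
        "LastModified": "1657454530",
        "TimeCreated": "1657454530",
        "PublicationDate": "2022-08-18",
    }
    for field, value in parse_zwi_dates(example).items():
        print(field, value.isoformat())
```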

The field "Rating" consists of two numbers: the total score (defined by a publisher) and the number of hits. By default, both numbers are 0. A ZWI file thus carries its own rating.

The file "signature.json" contains a token signed with a private key. The token is computed using metadata.json and media.json as inputs. Therefore, any attempt to modify metadata.json (for example the rating, the publisher, or the entries for the text file or images) leads to a broken signature, and such a file cannot be verified.

If a ZWI file is created using DokuWiki software, it is likely that the primary file is article.dokuwiki while article.wikitext is a result of internal conversion. This should be stated in metadata.json.
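A reader of ZWI files can therefore consult the "Primary" field rather than assume article.wikitext. The following is a minimal sketch; the function name is illustrative, and falling back to article.wikitext when the field is missing is an assumption of this example, not part of the format description.

```python
import io
import json
import zipfile


def read_primary_source(zwi_source, default="article.wikitext"):
    """Return (member name, text) for the primary file named in metadata.json.

    ASSUMPTION: if "Primary" is absent, fall back to `default`.
    Raises KeyError if the named file is not in the archive.
    """
    with zipfile.ZipFile(zwi_source) as archive:
        metadata = json.loads(archive.read("metadata.json").decode("utf-8"))
        primary = metadata.get("Primary", default)
        return primary, archive.read(primary).decode("utf-8")


if __name__ == "__main__":
    # Stand-in archive as a DokuWiki-based exporter might produce it.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("article.dokuwiki", "**Example** article text.")
        archive.writestr("article.wikitext", "'''Example''' article text.")
        archive.writestr("metadata.json", json.dumps({"Primary": "article.dokuwiki"}))
    print(read_primary_source(buffer))
```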

Generally, all article revisions should be stored. In some cases (as for HandWiki), only the first revision is stored.

A ZWI file can include the images linked in the article. They are stored in the directory "data/media/[namespace]/". Images are included only if they are located on the local server (i.e. where the wiki with the article is installed). The ZWI export mechanism does not attempt to extract images that are linked from Wikimedia Commons. However, the ZWI creation mechanism does attempt to identify cached images.

If there are no other (older) revisions of the article, the directory data/attic/[namespace]/ is not created.
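Older revisions can be enumerated from the archive by matching the naming pattern [article name].[timestamp].wikitext under data/attic/. The sketch below returns an empty list when the directory is absent. The function name is illustrative, and because the timestamp format is not specified here, the timestamp is kept as an uninterpreted string.

```python
import io
import zipfile
from pathlib import PurePosixPath


def list_old_revisions(zwi_source):
    """Return sorted (article name, timestamp, member path) tuples from data/attic/.

    The timestamp is kept as a string. Sorting is lexicographic, so it is
    chronological only if all timestamps have the same width.
    """
    revisions = []
    with zipfile.ZipFile(zwi_source) as archive:
        for member in archive.namelist():
            if not member.startswith("data/attic/") or member.endswith("/"):
                continue
            # Split from the right so that dots in the article name are preserved.
            parts = PurePosixPath(member).name.rsplit(".", 2)
            if len(parts) == 3 and parts[2] == "wikitext":
                revisions.append((parts[0], parts[1], member))
    return sorted(revisions)


if __name__ == "__main__":
    # Stand-in archive with two old revisions in a hypothetical "Main" namespace.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("article.wikitext", "current text")
        archive.writestr("data/attic/Main/Example.1650000000.wikitext", "older text")
        archive.writestr("data/attic/Main/Example.1640000000.wikitext", "oldest text")
    for revision in list_old_revisions(buffer):
        print(revision)
```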

The ZWI file format was initially implemented for the SandBox of the HandWiki encyclopedia in March 2021. A proof of principle for creating and inserting ZWI files was demonstrated using the DokuWiki wiki software.[2] In April 2021, ZWI file export was deployed as a standard feature of the HandWiki encyclopedia. In October 2021, ZWI file production was launched by the Encycloreader project.[3]

References

  1. S.V. Chekanov. HandWiki encyclopedia. https://handwiki.org/ (2021)
  2. S.V. Chekanov. EncycloED editor. A wiki editor based on DokuWiki with ZWI file export and import. (retrieved May 2021)
  3. Encycloreader. Search and read online encyclopedias. KSF. (retrieved Oct 2021)