Talk:SiSU

Latest comment: 8 years ago by DavidCary in topic Document Content Certificate

In my opinion, the article on SiSU should stay. Having used SiSU myself for a while and found it very useful, I tried to improve the article a little by adding a section on Usage. I also added a couple of references. It seems to me that the warning "This article has multiple issues" could now be removed. --Mikaelbook (talk) 08:08, 11 August 2011 (UTC)


I have added links to the Wikipedia descriptions of "hash sum" and "regular expression", terms that are perhaps not immediately known to everyone. Given the explanations, most of the other terms used should be understandable, and I would suggest they are justified in explaining succinctly what makes this work (and how and why the source document for a work like "War and Peace" requires almost no markup for this system to produce, say, XML or LaTeX, or to populate an SQL database in the manner described).

"Object" is used to describe an individual unit of text or information identified and used by the system. The term is central to how the system works, what it does, and the output it produces (as in "object numbering system", as opposed to paragraph numbering system), and it is described in a footnote.

It might be possible to expand any of these further if this would be helpful.

Feedback is welcome.

Mod3 16:12, 17 February 2007 (UTC)

The problem is not the specific jargon terms, but that the article leaves it a bit unclear what this program is supposed to do. The first paragraph is probably okay now, but the second one doesn't say enough, or say it clearly. NicM 19:20, 17 February 2007 (UTC).


Thanks. It is possible to set out the basic elements in a few sentences to give a general idea of how it works. SiSU is less about document layout than about finding a way, with little markup, to construct an abstract representation of a document that makes it possible to produce multiple representations of it, which may be rather different from each other and used for different purposes, whether layout and publishing or search of contents. That is, from this minimal preparation it can take advantage of some of the strengths of rather different established ways of representing documents for different purposes: search (a relational database, or indexed flat files generated for that purpose, whether of complete documents or, say, of files made up of objects), online viewing (e.g. HTML, XML, PDF), or paper publication (e.g. PDF).

The solution arrived at is to extract structural information about the document (about the headings within it) and to track objects (which are serialized and also given hash values) in the manner described. This makes possible representations that are quite different from those offered at present. For example, objects could be saved individually and identified by their hashes, with an index of how the objects relate to each other to form a document.

Also, the existence and meaning of the Finnish term sisu influenced the selection of the name, but it was not exactly named after it, so the footnote remains more appropriate. Again, whilst the command line is used and is the interface offered, it need not be restricted to that; the command line is easily placed behind buttons or other interfaces. Mod3 00:37, 18 February 2007 (UTC)

This may be helpful as a visual representation of document structure and objects (with their object numbers and hash values): http://www.jus.uio.no/sisu/the_wealth_of_networks.yochai_benkler/digest.txt which is for the document: http://www.jus.uio.no/sisu/the_wealth_of_networks.yochai_benkler/sisu_manifest.html

or "Free Culture" might be a better example, being available in a few languages, where each document shares the same structure and equivalent objects: http://www.jus.uio.no/sisu/free_culture.lawrence_lessig/sisu_manifest.html 84.9.34.29 11:08, 18 February 2007 (UTC)
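
As a rough illustration of the idea discussed in this thread (objects numbered in sequence, each given a hash value, stored individually, and tied together by an index), here is a minimal Python sketch. It is not SiSU's implementation: the function names, the rule of splitting objects on blank lines, and the in-memory store are all assumptions made for this example, and SHA-256 is chosen only because sha256 digests are mentioned elsewhere on this page.

```python
import hashlib


def split_into_objects(text):
    """Split text into 'objects'. Assumption: an object is a block of
    text separated from its neighbours by a blank line."""
    blocks = [block.strip() for block in text.split("\n\n")]
    return [block for block in blocks if block]


def build_store_and_index(text):
    """Return (store, index).

    store maps a SHA-256 hex digest to the object text.
    index is an ordered list of (object_number, digest) pairs that
    records how the objects fit together to form the document.
    """
    store = {}
    index = []
    for number, obj in enumerate(split_into_objects(text), start=1):
        digest = hashlib.sha256(obj.encode("utf-8")).hexdigest()
        store[digest] = obj
        index.append((number, digest))
    return store, index


def reassemble(store, index):
    """Rebuild the document text from the store by following the index."""
    return "\n\n".join(store[digest] for _, digest in index)


# Hypothetical three-object document: a heading and two paragraphs.
sample = "Chapter 1\n\nFirst paragraph.\n\nSecond paragraph."
store, index = build_store_and_index(sample)
```

Because each object is looked up by its digest, two documents that share an identical object would share one stored copy in this sketch, and the index alone determines the order in which objects appear.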

Document Content Certificate

My understanding is that SiSU is a collection of tools to convert from a relatively easy-to-edit markup language to various output files.

  • Which markup languages does SiSU support? If there is only one, is there a name for that markup language other than "the markup language recognized by SiSU"? What are the standard extensions for those languages?
    • (".sst", ".ssm", ".ssp", ... any others?)
  • What sorts of things does SiSU do that are handled poorly or not at all by other markup-processing tools?
    • SiSU can automatically generate a "concordance" (actually an index (publishing)).
    • SiSU can automatically generate a Document Content Certificate.
    • any others?
  • What is the connection, if any, between the "SiSU Document Content Certificate (Digest/DCC) sha256 digests"[1] generated by SiSU and the hashes used by IPFS? What do people do with those digests?

I think this article would be better if it answered those questions. --DavidCary (talk) 18:20, 10 April 2016 (UTC)