Documentation Costs Avoided using Python and other Open Standards

Andrew Jonathan Fine
Operating Systems Software Organization
Engines, Systems, and Services
Honeywell International

Original Core Data Flow

[Diagram labels: Company Database, Company Document, Generator Application (Translators, Inserter, Formatter), Paragraphs/Tables/Pictures, Raw.doc, Company Template.dot, Final.doc]

Single Python application
• set of front end translators
• content inserter
• post-processing formatter

Front End Translator
• Selected by caller
• Caller specifies input file containing corporate data
• Extracts components from file: Pictures, Tables, Paragraphs
• Saves to Python dictionary

Inserter
• Caller selects components from Python dictionaries made by front-ends for respective documents.
• Inserter creates a Word document
• Inserter uses Python/COM to insert components into document

Back End Formatter
• Scans corporate Word document template
• Scans Word document made by inserter
• Makes final style corrections.

Why?
The flow was designed to cope with changes in requirements!
• New projects
• New teams
• New data source formats
• New standards for existing formats

First front-end translator
Take pictures, tables, and data from a recursive property list constructed by an aerospace industry software visual programming tool called BEACON.
(… actual design of translator outside the scope of this paper…)

Initial Design of Inserter
• Straightforward use of principles demonstrated by Mark Hammond's book, Python Programming in Win32.
• Chapter containing a thorough treatment of how to have Python use the Word 97 COM object model to create and manipulate a Word Document.

Problems!!!
• Must cope with huge amounts of corporate data such as table cells.
• Speed of COM interface for new individual elements.
• Reuse issues for detailed typesetting of elements.

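The three-stage flow described above (front end translator, inserter, back end formatter) can be sketched in plain Python. This is only an illustration under stated assumptions: the toy "plain" input format, the names TRANSLATORS, translate_plain, insert, format_document and generate, and the style names are all hypothetical, and no Word or COM calls are made.

```python
# Hypothetical sketch of the translator -> inserter -> formatter flow.
# None of these names come from the original application.

# Registry of front end translators, keyed by a caller-chosen format name.
TRANSLATORS = {}


def translator(name):
    """Decorator that registers a front end translator under a format name."""
    def register(func):
        TRANSLATORS[name] = func
        return func
    return register


@translator("plain")
def translate_plain(text):
    """Toy front end: blank-line separated blocks.

    A block starting with '|' is a table, a block starting with '!' names
    a picture, and anything else is a paragraph.
    """
    components = {"paragraphs": [], "tables": [], "pictures": []}
    for block in text.strip().split("\n\n"):
        lines = block.splitlines()
        if lines[0].startswith("|"):
            rows = [[cell.strip() for cell in line.strip().strip("|").split("|")]
                    for line in lines]
            components["tables"].append(rows)
        elif lines[0].startswith("!"):
            components["pictures"].append(lines[0][1:].strip())
        else:
            components["paragraphs"].append(" ".join(lines))
    return components


def insert(components, wanted):
    """Inserter: take the component kinds the caller selected, in order."""
    raw = []
    for kind in wanted:
        for item in components[kind]:
            raw.append({"style": "raw-" + kind, "content": item})
    return raw


def format_document(raw, style_map):
    """Formatter: rename raw styles to the names used by the template."""
    return [{"style": style_map.get(entry["style"], entry["style"]),
             "content": entry["content"]} for entry in raw]


def generate(source_format, text, wanted, style_map):
    """Run the whole flow for one input document."""
    components = TRANSLATORS[source_format](text)
    return format_document(insert(components, wanted), style_map)
```

Supporting a new data source format then means registering one more translator function; the inserter and formatter stay unchanged, which is the kind of requirement change the flow was designed to absorb.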
What I wanted:
• Faster conversion
• Existing standard
• Callable from Python

What I found:
• Faster conversion (OpenJade)
• Existing standard (DocBook SGML)

Why Call from Python?
• New scripting language to replace islands of automation (Perl, MSDOS, internal test stand controller language).
• Easier to connect islands after writing in Python.
• Open source, thus continuously peer reviewed.
• Tremendous user base! Plenty of wrappers written in Python around open source libraries supporting open standards.

… so I wrote a Python wrapper around some DocBook rules …

Revised Core Data Flow

[Diagram labels: Company Database, Company Document, Generator Application (Translators), Paragraphs/Tables/Pictures, Typesetting Text in DocBook SGML, DocBook.py, DocBook.sgml, OpenJade, Result.rtf, DocBook SGML definition and default stylesheets, Local DocBook DSSSL stylesheets, \usr\packages\sgml, Local.dsl, Local.dtd, Cleanup.py, Company Template.dot, Final.doc]

• Python wrapper writes DocBook SGML
• OpenJade translates DocBook SGML to Word RTF

A DocBook informal table rendered by OpenJade into Word

    Name      Type
    statex    Integer
    statey    Long

Input to OpenJade as local DocBook SGML

    <!DOCTYPE informaltable SYSTEM "C:\Local.dtd">
    <informaltable frame='all'>
      <tgroup cols='2' colsep='1' rowsep='1' align='center'>
        <colspec colname='Name' colwidth='75' align='left'></colspec>
        <colspec colname='Type' colwidth='64' align='center'></colspec>
        <thead>
          <row>
            <entry><emphasis role='bold'>Name</emphasis></entry>
            <entry><emphasis role='bold'>Type</emphasis></entry>
          </row>
        </thead>
        <tbody>
          <row>
            <entry><phrase role='xe' condition='italic'>statex</phrase></entry>
            <entry>Integer</entry>
          </row>
          <row>
            <entry><phrase role='xe' condition='italic'>statey</phrase></entry>
            <entry>Long</entry>
          </row>
        </tbody>
      </tgroup>
    </informaltable>

    from DocBook import DocBook

    class ItalicIndexPhrase(DocBook.Rules.Phrase):
        "italic indexable text phrase"
        TITLE = DocBook.Rules.Phrase
        def __init__(self, text):
            DocBook.Rules.Phrase.__init__(self, 'xe', 'italic')
            self.data = [text]

    class NameCell(DocBook.Rules.Entry):
        "table row cell describing name of identifier (italic and indexable text!)"
        TITLE = DocBook.Rules.Entry
        def __init__(self, text):
            DocBook.Rules.Entry.__init__(self)
            self.data = [ItalicIndexPhrase(text)]

    class StorageCell(DocBook.Rules.Entry):
        "table row cell describing storage type of identifier (ordinary text)"
        TITLE = DocBook.Rules.Entry
        def __init__(self, text):
            DocBook.Rules.Entry.__init__(self)
            self.data = text

    class TRow(DocBook.Rules.Row):
        "each row in application's informal table body"
        TITLE = DocBook.Rules.Row
        def __init__(self, binding):
            (identifier, storage) = binding
            DocBook.Rules.Row.__init__(self, [NameCell(identifier),
                                              StorageCell(storage)])

    class TBody(DocBook.Rules.TBody):
        "application's informal table body"
        TITLE = DocBook.Rules.TBody
        def __init__(self, items):
            # one TRow per (identifier, storage) pair
            DocBook.Rules.TBody.__init__(self, [TRow(item) for item in items])

    class TGroup(DocBook.Rules.TGroup):
        "application's informal table group"
        COLSPECS = [DocBook.Rules.ColSpec('Name', 75, 'left'),
                    DocBook.Rules.ColSpec('Type', 64, 'center')]
        SHAPE = ['2', '1', '1', 'center']
        TBODY = TBody

    class InformalTable(DocBook.Rules.InformalTable):
        "application's informal table"
        TGROUP = TGroup

    class Example(DocBook):
        'example application of DocBook formatting class'
        SECTION = str(InformalTable)
        def __call__(self):
            self.data = [InformalTable()(self.data)]
            return DocBook.__call__(self)

    if __name__ == '__main__':
        print(Example([('statex', 'Integer'), ('statey', 'Long')])())

Python code to translate data into OpenJade input in local DocBook SGML (based on Python to DocBook sample wrapper class DocBook)

Using class DocBook
• class DocBook from DocBook.py in Appendix F is the top-level interface callable class
• Application inherits from class DocBook
• Contents of application inherit from classes contained by DocBook.Rules
• Use overrides to specify structure, formatting, and text.

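Since DocBook.py itself lives in Appendix F and is not reproduced here, the idea of "classes that override structure and render themselves as SGML" can be illustrated with a much smaller stand-in. The sketch below is not the author's wrapper: Element, render, and name_type_table are hypothetical names, and the output omits the colspec and thead parts of the real table.

```python
from xml.sax.saxutils import escape, quoteattr


class Element(object):
    """Hypothetical base class: one subclass per DocBook element."""
    TAG = None  # subclasses override this with the SGML element name

    def __init__(self, children=(), **attrs):
        self.children = list(children)
        self.attrs = attrs

    def __str__(self):
        # Attributes are sorted so the output is deterministic.
        attrs = "".join(" %s=%s" % (name, quoteattr(str(value)))
                        for name, value in sorted(self.attrs.items()))
        body = "".join(render(child) for child in self.children)
        return "<%s%s>%s</%s>" % (self.TAG, attrs, body, self.TAG)


def render(node):
    """Render a child: nested elements recurse, plain text is escaped."""
    if isinstance(node, Element):
        return str(node)
    return escape(str(node))


class Phrase(Element):
    TAG = "phrase"


class Entry(Element):
    TAG = "entry"


class Row(Element):
    TAG = "row"


class TBody(Element):
    TAG = "tbody"


class TGroup(Element):
    TAG = "tgroup"


class InformalTable(Element):
    TAG = "informaltable"


def name_type_table(bindings):
    """Build a two-column table from (identifier, storage type) pairs."""
    rows = [Row([Entry([Phrase([name], role="xe", condition="italic")]),
                 Entry([storage])])
            for name, storage in bindings]
    return InformalTable([TGroup([TBody(rows)], cols=2)], frame="all")
```

Calling str(name_type_table([('statex', 'Integer'), ('statey', 'Long')])) yields a single-line informaltable fragment of the same shape as the SGML input shown earlier, without the column specifications and header row.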
OpenJade
• OpenJade is an open source DSSSL execution engine available from SourceForge.
• DSSSL is an ISO standard for typesetting specification and document conversion.
• OpenJade reads DocBook DSSSL stylesheets and our local DSSSL stylesheets if any.
• The DSSSL is executed by OpenJade upon SGML source text to write a final document for later loading into a word processor.

DocBook Post-Processing using Word Automation with Python/COM
• DocBook/OpenJade emits RTF with different Word document style identifier names than in corporate Word DOT file.
• Much faster to change document using Python/COM than to create document!
• Cannibalized Python code from inserter first draft to create post-processor.
• Reads RTF, changes, saves as final DOC.

Return on Investment

5 projects ranging from 30 BEACON files to 150, average about 75 files.
Each project has 2 releases per year where each file must generate hard copy.

Previously (cut/paste by hand):
  Each project release:
    1/5 * 75 *  4 hours                  =     60 hours
    3/5 * 75 *  8 hours                  =    360 hours
    1/5 * 75 * 16 hours                  =    240 hours
                                           ------------
                                              660 hours
  Two releases per year:           * 2   =  1,320 hours
  Five projects needing releases:  * 5   =  6,600 hours
  Two year period (2002-2003):     * 2   = 13,200 hours
                                           ------------
  Total effort avoided:                    13,200 hours

Automated:
  Automated releases over 2 year period:                    160 hours
  My effort (12 * 140 hours per labor month):             1,680 hours
  Total investment:                                       1,840 hours

  Net effort avoided, 2002-3:                            11,360 hours
  Net avoided by customers 2002-3 at $100/hour:       1,136,000 dollars
  Net labor years avoided 2002-3 at 1680 hours/year:       6.76 years
  Headcount avoided per year:                              3.38 people
  ROI (Total effort avoided / total invested) 2002-3:      7.17

Python and DocBook together
• Python connects our department’s engineering specific islands of automation.
• Python with DocBook created Word documents from engineering data.
• The combination of an open language with an open standard eliminated a real-world business process bottleneck.
• The return on investment was substantial.
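
The figures on the Return on Investment slide can be re-derived with a few lines of Python. This is only a check of the arithmetic: the input numbers are taken from the slide, while the variable names and the roi_summary function are illustrative.

```python
from fractions import Fraction

FILES_PER_PROJECT = 75       # average BEACON files per project
RELEASES_PER_YEAR = 2
PROJECTS = 5
YEARS = 2                    # 2002-2003
HOURS_PER_LABOR_YEAR = 1680
RATE_DOLLARS_PER_HOUR = 100

# (share of files, manual hours per file) for cut/paste by hand
MANUAL_MIX = [(Fraction(1, 5), 4), (Fraction(3, 5), 8), (Fraction(1, 5), 16)]


def roi_summary():
    """Recompute the slide's totals from its stated inputs."""
    per_release = sum(share * FILES_PER_PROJECT * hours
                      for share, hours in MANUAL_MIX)
    avoided = per_release * RELEASES_PER_YEAR * PROJECTS * YEARS
    investment = 160 + 12 * 140   # automated releases + development effort
    net = avoided - investment
    labor_years = Fraction(net, HOURS_PER_LABOR_YEAR)
    return {
        "per_release_hours": int(per_release),
        "total_avoided_hours": int(avoided),
        "investment_hours": investment,
        "net_avoided_hours": int(net),
        "net_avoided_dollars": int(net) * RATE_DOLLARS_PER_HOUR,
        "labor_years": round(float(labor_years), 2),
        "headcount_per_year": round(float(labor_years / YEARS), 2),
        "roi": round(float(Fraction(avoided, investment)), 2),
    }
```

Using Fraction keeps the per-release hours exact, so the totals match the slide without floating-point rounding noise.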