Python Forum - Discussion Area

Subject: [zeuux-python] Fwd: Python Module of the Week

Tuesday, February 3, 2009, 22:18

仨儿 zoomquiet+sns at gmail.com
Tue Feb 3 22:18:39 CST 2009

This issue's module write-up is really practical... well worth trying out first...

---------- Forwarded message ----------
From: Doug Hellmann <doug.hellmann+feedburner at gmail.com>
Date: Tue, Feb 3, 2009 at 22:02
Subject: Python Module of the Week
To: zoomquiet+sns at gmail.com


   Python Module of the Week
<http://blog.doughellmann.com/search/label/PyMOTW>

Writing Technical Documentation with Sphinx, Paver, and Cog
<http://feedproxy.google.com/%7Er/PyMOTW/%7E3/JOnbNzmhSNU/writing-technical-documentation-with.html>

Posted: 02 Feb 2009 06:55 AM PST
I've been working on the Python Module of the Week
<http://www.doughellmann.com/PyMOTW/> series since March of 2007
<http://blog.doughellmann.com/2007/03/pymotw-python-module-of-week.html>.
During the course of the project, my article style and tool chain have both
evolved. I now have a fairly smooth production process in place, so the
mechanics of producing a new post don't get in the way of the actual
research and writing. Most of the tools are open source, so I thought I
would describe the process I go through and how the tools work together.

Editing Text: TextMate

I work on a MacBook Pro, and use TextMate <http://macromates.com/> for
editing the articles and source for PyMOTW. TextMate is the one tool I use
regularly that is not open source. When I'm doing heavy editing of hundreds
of files for my day job I use Aquamacs Emacs <http://aquamacs.org/>, but
TextMate is better suited for prose editing and is easier to extend with
quick actions. I discovered TextMate while looking for a native editor to
use for Python Magazine <http://www.pythonmagazine.com/>, and after being
able to write my own "bundle" to manage magazine articles (including
defining a mode for the markup language we use) I was hooked.

Some of the features that I like about TextMate for prose editing are
as-you-type spell-checking (I know some people hate this feature, but I find
it useful), text statistics (word count, etc.), easy block selection (I can
highlight a paragraph or several sentences and move them using cursor keys),
a moderately good reStructuredText mode (emacs' is better, but TextMate's is
good enough), paren and quote matching as you type, and very simple
extensibility for repetitive tasks. I also like TextMate's project
management features, since they make it easy to open several related files
at the same time.

Version Control: svn

I started out using a private svn repository for all of my projects,
including PyMOTW. I'm in the middle of evaluating hosted DVCS options for
PyMOTW <http://blog.doughellmann.com/2008/12/moving-pymotw-to-public-repository.html>,
but still haven't had enough time to give them all the research I think is
necessary before making the move. The Python core developers are considering
a similar move (PEP 374 <http://www.python.org/dev/peps/pep-0374/>) so it
will be interesting to monitor that discussion
<http://mail.python.org/pipermail/python-dev/2009-January/085347.html>.
No doubt we have different requirements (for example, they are hosting their
own repository), but the experiences with the various DVCS tools will be
useful input to my own decision.

Markup Language: reStructuredText

When I began posting, I wrote each article by hand using HTML. One of the
first tasks that I automated was the step of passing the source code through
pygments to produce a syntax colorized version. This worked well enough for
me at the time, but restricted me to producing only HTML output. Eventually
John Benediktsson contacted me with a version of many of the posts converted
from HTML to reStructuredText
<http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html>.


When reStructuredText was first put forward in the '90s, I was heavily into
Zope development. As such, I was using
StructuredText <http://www.zope.org/Documentation/Articles/STX> for
documenting my code, and in the Zope-based wiki that we ran at
ZapMedia.
I even wrote my own app <http://happydoc.sourceforge.net/> to extract
comments and docstrings to generate library documentation for a couple of
libraries I had released as open source. I really liked StructuredText and,
at first, I *didn't* like reStructuredText. Frankly, it looked ugly compared
to what I was used to. It quickly gained acceptance in the general community
though, and I knew it would give me options for producing other output
formats for the PyMOTW posts, so when John sent me the markup files I took
another look.

While re-acquainting myself with reST, I realized two things. First,
although there is a bit more punctuation involved in the markup than with
the original StructuredText, the markup language was designed with
consistency in mind so it isn't as difficult to learn as my first
impressions had led me to believe. Second, it turned out the part I thought
was "ugly" was actually the part that made reST *more powerful* than
StructuredText: It has a standard syntax for extension directives that users
can define for their own documents.
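For illustration, registering a directive with docutils looks something like
this. The directive here is made up for the example and is not part of
PyMOTW:

from docutils import nodes
from docutils.parsers.rst import Directive, directives

# A made-up directive that expands to a paragraph of text.
class HelloDirective(Directive):
    required_arguments = 1

    def run(self):
        return [nodes.paragraph(text='Hello, %s!' % self.arguments[0])]

directives.register_directive('hello', HelloDirective)

A document could then invoke it as ".. hello:: world", using the same
directive syntax reST uses for its built-ins.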

Markup to Output: Sphinx

Before I made a final decision on switching from hand-coded HTML to reST, I
needed a tool to convert to HTML (I still had to post the results on the
blog, after all, and Blogger doesn't support reST). I first tried David
Goodger's docutils <http://docutils.sourceforge.net/> package. The scripts
it includes felt a little too much like "pieces" of a tool rather than a
complete solution, though, and I didn't really want to assemble my own
wrappers if I didn't have to -- I wanted to write text for this project, not
code my own tools. Around this time, Georg Brandl had made significant
progress on Sphinx <http://sphinx.pocoo.org/>, which turned out to be a more
complete turn-key system for converting a pile of reST files to HTML or PDF.
After a few hours of experimentation, I had a sample project set up and was
generating HTML from my documents using the standard templates.
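For reference, a minimal conf.py for a sample project along those lines
might contain little more than this (illustrative values, not the actual
PyMOTW configuration):

# conf.py -- a minimal Sphinx configuration sketch.
project = 'PyMOTW'
copyright = '2009, Doug Hellmann'
version = '1.0'
release = '1.0'
master_doc = 'index'
source_suffix = '.rst'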

I decided that reStructuredText looked like the way to go.

HTML Templates: Jinja

My next step was to work out exactly how to produce all of the outputs I
needed from reST inputs. Each post for the PyMOTW series ends up going to
several different places:


   - the PyMOTW source distribution (HTML)
   - my Blogger blog (HTML)
   - the PyMOTW project site (HTML)
   - O'Reilly.com <http://python.oreilly.com/> (HTML)
   - the PyMOTW "book" (PDF)



Each of the four HTML outputs uses slightly different formatting, requiring
separate templates (PDF is a whole different problem, covered below). The
source distribution and project site are both full HTML versions of all of
the documents, but use different templates. I decided to use the default
Sphinx templates for the packaged version; I may change that later, but it
works for the time being, and it's one less custom template to deal with. I
wanted the online version to match the appearance of the rest of my site, so
I needed to create a template for it. The two blogs use a third template
(O'Reilly's site ignores a lot of the markup due to their Movable Type
configuration, but the articles come out looking good enough so I can use
the same template I use for my own blog without worrying about a separate
custom template).

Sphinx uses Jinja <http://jinja.pocoo.org/> templates to produce HTML
output. The syntax for Jinja is very similar to Django's template language.
As it happens, I use Django for the dynamic portion of my web site that I
host myself. I lucked out, and my site's base template was simple enough to
use with Sphinx without making any changes. Yay for compatibility!
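For a taste of the syntax, here is a toy template rendered with the jinja2
package (a made-up example, not one of the actual site templates):

from jinja2 import Template

# Variables inside {{ }} are substituted; {% %} blocks hold logic.
page = Template('<h1>{{ title }}</h1><ul>'
                '{% for m in modules %}<li>{{ m }}</li>{% endfor %}</ul>')
print page.render(title='PyMOTW', modules=['anydbm', 'whichdb'])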

Cleaning up HTML with BeautifulSoup

The blog posts need to be relatively clean HTML that I can upload to Blogger
and O'Reilly, so they could not include any html or body tags or require any
markup or styles not supported by either blogging engine. The template I
came up with is a stripped down version that doesn't include the CSS and
markup for sidebars, header, or footer. The result was *almost* exactly what
I wanted, but had two problems.

The easiest problem to handle was the permalinks generated by Sphinx. After
each heading on the page, Sphinx inserts an anchor tag with a ¶ character
and applies CSS styles that hide/show the tag when the user hovers over it.
That's a nice feature for the main site and packaged content, but they
didn't work for the blogs. I have no control over the CSS used at O'Reilly,
so the tags were always visible. I didn't really care if they were included
on the Blogger pages, so the simplest thing to do was stick with one
"blogging" template and remove the permalinks.

The second, more annoying, problem was that Blogger wanted to insert extra
whitespace into the post. There is a configuration option on Blogger to
treat line breaks in the post as "paragraph breaks" (I think they actually
insert br tags). This is very convenient for normal posts with mostly
straight text, since I can simply write each paragraph on one long line,
wrapped visually by my editor, and break the paragraphs where I want them.
The result is I can almost post directly from plain text input.
Unfortunately, the option is applied to *every* post in the blog (even old
posts), so changing it was not a realistic option -- I wasn't about to go
back and re-edit every single post I had previously written.

Sphinx didn't have an option to skip generating the permalinks, and there
was no way to express that intent in the template, so I fell back to writing
a little script to strip them out after the fact. I used
BeautifulSoup <http://www.crummy.com/software/BeautifulSoup/> to find
the tags I wanted removed, delete them from the parse tree, then
assemble the HTML text as a string again. I added code to the same script to
handle the whitespace issue by removing all newlines from the input unless
they were inside pre tags, which Blogger handled correctly. The result was a
single blob of partial HTML without newlines or permalinks that I could post
directly to either blog without editing it by hand. Score a point for
automation.

def clean_blog_html(body):
    """Clean up the generated HTML so it can be posted to a blog."""
    import re
    from BeautifulSoup import BeautifulSoup
    from cStringIO import StringIO

    soup = BeautifulSoup(body)

    # Remove the permalinks after each header, since the blog does not
    # have the styles to hide them.
    links = soup.findAll('a', attrs={'class': "headerlink"})
    for link in links:
        link.extract()

    # Get BeautifulSoup's version of the string
    s = soup.__str__(prettyPrint=False)

    # Remove extra newlines.  This depends on the fact that code blocks
    # are passed through pygments, which wraps each part of the line in
    # a span tag, so only newlines following "</span>" are preserved.
    pattern = re.compile(r'(?<!span>)\n', re.IGNORECASE)
    s = ''.join(pattern.sub('', line) for line in StringIO(s))

    return s
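A hypothetical driver for the function; the command-line handling here is
an assumption for illustration, not part of the original script:

if __name__ == '__main__':
    # Read the Sphinx-generated page named on the command line and print
    # the cleaned blob, ready to paste into the blog editor.
    import sys
    print clean_blog_html(open(sys.argv[1]).read())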



Code Syntax Highlighting: pygments

I wanted my posts to look as good as possible, and an important factor in
the appearance would be the presentation of the source code. I adopted
pygments <http://pygments.org/> in the early hand-coded HTML days, because
it was easy to integrate into TextMate with a simple script.

pygmentize -f html -O cssclass=syntax $@



Binding the command to a key combination meant with a few quick keypresses I
had HTML ready to insert into the body of a post.
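The same conversion is also available without shelling out; a minimal
sketch using the pygments library API:

from pygments import highlight
from pygments.lexers import PythonLexer
from pygments.formatters import HtmlFormatter

def colorize(source):
    # cssclass matches the -O cssclass=syntax option shown above.
    return highlight(source, PythonLexer(), HtmlFormatter(cssclass='syntax'))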

When I moved to Sphinx, using pygments became even easier because Sphinx
automatically passes included source code through pygments as it generates
its output. Syntax highlighting works for HTML and PDF, so I didn't need any
custom processing.

Automation: Paver

Automation is important for my sense of well-being. I hate dealing with
mundane repetitive tasks, so once an article was written I didn't want to
have to touch it to prepare it for publication to any of the final
destinations. As I have written
before<http://blog.doughellmann.com/2009/01/converting-from-make-to-paver.html>,
I started out using make to run various shell commands. I have since
converted the entire process to Paver.

The stock Sphinx integration that comes with Paver didn't
quite meet my needs, but by examining the source I was able to create my own
replacement tasks in an afternoon. The main problem was the tight coupling
between the code to run Sphinx and the code to find the options to pass to
it. For normal projects with a single documentation output format (Paver
assumes HTML with a single config file), this isn't a problem. PyMOTW's
requirements are different, with the four output formats discussed above.

In order to produce different output with Sphinx, you need different
configuration files. Since the base name for the file must always be conf.py,
that means the files have to be stored in separate directories. One of the
options passed to Sphinx on the command line tells it the directory to look
in for its configuration file. Even though Paver doesn't fork() before
calling Sphinx, it still uses the command line options to pass instructions.


Creating separate Sphinx configuration files was easy. The problem was
defining options in Paver to tell Sphinx about each configuration directory
for the different output. Paver options are grouped into bundles, which are
essentially a namespace. When a Paver task looks for an option, it scans
through the bundles, possibly cascading to the global namespace, until it
finds the option by name. The search can be limited to specific bundles, so
that the same option name can be used to configure different tasks.
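To make the lookup concrete, here is a small sketch of the cascading
behavior (import path per current Paver; older releases exposed the same
names from paver.defaults):

from paver.easy import options, Bunch

options(
    sphinx=Bunch(docroot='.', builder='html'),
    html=Bunch(builddir='docs'),
)

# Search the *html* bunch first, then *sphinx*, with no global fallback.
options.order('html', 'sphinx', add_rest=False)
# options.get('builddir', '.build')  -> 'docs'  (found in *html*)
# options.get('builder', 'html')     -> 'html'  (falls through to *sphinx*)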

The html task from paver.doctools sets the options search order to look for
values first in the *sphinx* section, then globally. Once it has retrieved
the path values, via _get_paths(), it invokes Sphinx.

def _get_paths():
    """look up the options that determine where all of the files are."""
    opts = options
    docroot = path(opts.get('docroot', 'docs'))
    if not docroot.exists():
        raise BuildFailure("Sphinx documentation root (%s) does not exist."
                           % docroot)
    builddir = docroot / opts.get("builddir", ".build")
    builddir.mkdir()
    srcdir = docroot / opts.get("sourcedir", "")
    if not srcdir.exists():
        raise BuildFailure("Sphinx source file dir (%s) does not exist"
                            % srcdir)
    htmldir = builddir / "html"
    htmldir.mkdir()
    doctrees = builddir / "doctrees"
    doctrees.mkdir()
    return Bunch(locals())

@task
def html():
    """Build HTML documentation using Sphinx. This uses the following
    options in a "sphinx" section of the options.

    docroot
      the root under which Sphinx will be working. Default: docs
    builddir
      directory under the docroot where the resulting files are put.
      default: build
    sourcedir
      directory under the docroot for the source files
      default: (empty string)
    """
    options.order('sphinx', add_rest=True)
    paths = _get_paths()
    sphinxopts = ['', '-b', 'html', '-d', paths.doctrees,
        paths.srcdir, paths.htmldir]
    dry("sphinx-build %s" % (" ".join(sphinxopts),), sphinx.main, sphinxopts)



This didn't work for me because I needed to pass a separate configuration
directory (not handled by the default _get_paths()) and different build and
output directories. The simplest solution turned out to be re-implementing
the Paver-Sphinx integration to make it more flexible. I created my own
_get_paths() and made it look for the extra option values and use the
directory structure I needed.

def _get_paths():
    """look up the options that determine where all of the files are."""
    opts = options

    docroot = path(opts.get('docroot', 'docs'))
    if not docroot.exists():
        raise BuildFailure("Sphinx documentation root (%s) does not exist."
                           % docroot)

    builddir = docroot / opts.get("builddir", ".build")
    builddir.mkdir()

    srcdir = docroot / opts.get("sourcedir", "")
    if not srcdir.exists():
        raise BuildFailure("Sphinx source file dir (%s) does not exist"
                            % srcdir)

    # Where is the sphinx conf.py file?
    confdir = path(opts.get('confdir', srcdir))

    # Where should output files be generated?
    outdir = opts.get('outdir', '')
    if outdir:
        outdir = path(outdir)
    else:
        outdir = builddir / opts.get('builder', 'html')
    outdir.mkdir()

    # Where are doctrees cached?
    doctrees = opts.get('doctrees', '')
    if not doctrees:
        doctrees = builddir / "doctrees"
    else:
        doctrees = path(doctrees)
    doctrees.mkdir()

    return Bunch(locals())



Then I defined a new function, run_sphinx(), to set up the options search
path, look for the option values, and invoke Sphinx. I set add_rest to False
to disable searching globally for an option, to avoid namespace pollution from
option collisions, since I knew I was going to have options with the same
names but different values for each output format. I also look for a
"builder", to support PDF generation.

def run_sphinx(*option_sets):
    """Helper function to run sphinx with common options.

    Pass the names of namespaces to be used in the search path
    for options.
    """
    if 'sphinx' not in option_sets:
        option_sets += ('sphinx',)
    kwds = dict(add_rest=False)
    options.order(*option_sets, **kwds)
    paths = _get_paths()
    sphinxopts = ['',
                  '-b', options.get('builder', 'html'),
                  '-d', paths.doctrees,
                  '-c', paths.confdir,
                  paths.srcdir, paths.outdir]
    dry("sphinx-build %s" % (" ".join(sphinxopts),), sphinx.main, sphinxopts)
    return



With a working run_sphinx() function I could define several Sphinx-based
tasks, each taking options with the same names but from different parts of
the namespace. The tasks simply call run_sphinx() with the desired namespace
search path. For example, to generate the HTML to include in the sdist
package, the html task looks in the *html* bunch:

@task
@needs(['cog'])
def html():
    set_templates(options.html.templates)
    run_sphinx('html')



while generating the HTML output for the website uses a different set of
options from the *website* bunch:

@task
@needs(['webtemplatebase', 'cog'])
def webhtml():
    set_templates(options.website.templates)
    run_sphinx('website')
    return



All of the option search paths also include the *sphinx* bunch, so values
that do not change (such as the source directory) do not need to be
repeated. The relevant portion of the options from the PyMOTW pavement.py
file looks like this:

options(

    # ...

    sphinx = Bunch(
        sourcedir=PROJECT,
        docroot = '.',
        builder = 'html',
        doctrees='sphinx/doctrees',
        confdir = 'sphinx',
    ),

    html = Bunch(
        builddir='docs',
        outdir='docs',
        templates='pkg',
    ),

    website=Bunch(
        templates = 'web',
        #outdir = 'web',
        builddir = 'web',
    ),

    pdf=Bunch(
        templates='pkg',
        #outdir='pdf_output',
        builddir='web',
        builder='latex',
    ),

    blog=Bunch(
        sourcedir=path(PROJECT)/MODULE,
        builddir='blog_posts',
        outdir='blog_posts',
        confdir='sphinx/blog',
        doctrees='blog_posts/doctrees',
    ),

    # ...
)



To find the sourcedir for the html task, _get_paths() first looks in the
*html* bunch, then the *sphinx* bunch.

Capturing Program Output: cog

As an editor at Python Magazine, and reviewer for several books, I've
discovered that one of the most frequent sources of errors with technical
writing occurs in the production process where the output of running sample
code is captured to be included in the final text. This is usually done
manually by running the program and copying and pasting its output from the
console. It's not uncommon for a bug to be found, or a library to change,
requiring a change in the source code provided with the article. That
change, in turn, means the output of commands may be different. Sometimes
the change is minor, but at other times the output is different in some
significant way. Since I've seen the problem come up so many times, I spent
time thinking about and looking for a solution to avoid it in my own work.

During my research, a few people suggested that I switch to using doctests
for my examples, but I felt there were several problems with that approach.
First, the doctest format isn't very friendly for users who want to copy and
paste examples into their own scripts. The reader has to select each line
individually, and can't simply grab the entire block of code. Distributing
the examples as separate scripts makes this easier, since they can simply
copy the entire file and modify it as they want. Using individual .py files
also makes it possible for some of the more complicated examples to run
clients and servers at the same time from different scripts (as with
SimpleXMLRPCServer <http://www.doughellmann.com/PyMOTW/SimpleXMLRPCServer/>,
for example). But most importantly, using doctests does not solve the
fundamental problem. Doctests tell me when the output has changed, but I
still have to manually run the scripts to generate that output and paste it
into my document in the first place. What I really wanted to be able to do
was run the script and insert the output, whatever it was, *without*
manually copying and pasting text from the console.

I finally found what I was looking for in cog
<http://nedbatchelder.com/code/cog/>, from Ned Batchelder. Ned describes
cog as a "code generation tool", and most
of the examples he provides on his site are in that vein. But cog is a more
general purpose tool than that. It gives you a way to include arbitrary
Python instructions in your source document, have them executed, and then
have the source document change to reflect the output.
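A classic cog block, using cog's default [[[cog markers rather than the
customized ones described later, looks like this; running cog over the file
rewrites everything between ]]] and [[[end]]] in place:

# [[[cog
# import cog
# for name in ('width', 'height'):
#     cog.outl('%s = 0' % name)
# ]]]
width = 0
height = 0
# [[[end]]]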

For each code sample, I wanted to include the Python source followed by the
output it produces when run on the console. There is a reST directive to
include the source file, so that part is easy:

.. include:: anydbm_whichdb.py
    :literal:
    :start-after: #end_pymotw_header



The include directive tells Sphinx that the file "anydbm_whichdb.py" should
be treated as a literal text block (instead of more reST) and to only
include the parts following the last line of the standard header I use in
all my source code. Syntax highlighting comes for free when the literal
block is converted to the output format.
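For context, a source file following that convention might start like this
sketch (the real PyMOTW header differs; the marker line is the part that
matters):

#!/usr/bin/env python
"""Sample module demonstrating whichdb."""
#end_pymotw_header

import anydbm

Everything after the marker is what the include directive pulls into the
document.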

Grabbing the command output was a little trickier. Normally with cog, one
would embed the actual source to be run in the document. In my case, I had
the text in an external file. Most of the source is Python, and I could just
import it, but I would have to go to special lengths to capture any output
and pass it to cog.out(), the cog function for including text in the
processed document. I didn't want my example code littered with calls to
cog.out() instead of print, so I needed to capture sys.stdout and sys.stderr.
A bigger question was whether I wanted to have all of the sample files
imported into the namespace of the build process. Considering both issues,
it made sense to run the script in a separate process and capture the
output.
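The core idea, stripped of the Paver integration, is just a subprocess with
merged output streams; run_script(), shown below, wraps Paver's sh() around
the same thing:

import subprocess

def capture(script_name, interpreter='python'):
    # Run the example in its own process and collect stdout and stderr
    # together, exactly as they would appear on the console.
    proc = subprocess.Popen([interpreter, script_name],
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    output, _ = proc.communicate()
    return output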

There is a bit of setup work needed to run the scripts this way, so I
decided to put it all into a function instead of including the boilerplate
code in every cog block. The reST source for running anydbm_whichdb.py looks
like:

.. {{{cog
.. cog.out(run_script(cog.inFile, 'anydbm_whichdb.py'))
.. }}}
.. {{{end}}}



The .. at the start of each line causes the reStructuredText parser to treat
the line as a comment, so it is not included in the output. After passing
the reST file through cog, it is rewritten to contain:

.. {{{cog
.. cog.out(run_script(cog.inFile, 'anydbm_whichdb.py'))
.. }}}

::

	$ python anydbm_whichdb.py
	dbhash

.. {{{end}}}



The run_script() function runs the python script it is given, adds a prefix
to make reST treat the following lines as literal text, then indents the
script output. The script is run via Paver's sh() function, which wraps the
subprocess module and supports the dry-run feature of Paver. Because the cog
instructions are comments, the only part that shows up in the output is the
literal text block with the command output.

def run_script(input_file, script_name, interpreter='python'):
    """Run a script in the context of the input_file's directory,
    return the text output formatted to be included as an rst
    literal text block.
    """
    from paver.runtime import sh
    from paver.path import path
    rundir = path(input_file).dirname()
    output_text = sh('cd %(rundir)s; %(interpreter)s %(script_name)s 2>&1' % vars(),
                     capture=True)
    response = '\n::\n\n\t$ %(interpreter)s %(script_name)s\n\t' % vars()
    response += '\n\t'.join(output_text.splitlines())
    while not response.endswith('\n\n'):
        response += '\n'
    return response
# Stuff run_script() into the builtins so we don't have to
# import it in all of the cog blocks where we want to use it.
__builtins__['run_script'] = run_script



I defined run_script() in my pavement.py file, and added it to the
__builtins__ namespace to avoid having to import it each time I wanted to
use it from a source document.

A somewhat more complicated example shows another powerful feature of cog.
Because it can run any arbitrary Python code, it is possible to establish
the pre-conditions for a script before running it. For example,
anydbm_new.py assumes that its output database does not already exist. I can
ensure that condition by removing it before running the script.

.. {{{cog
.. from paver.path import path
.. from paver.runtime import sh
.. workdir = path(cog.inFile).dirname()
.. sh("cd %s; rm -f /tmp/example.db" % workdir)
.. cog.out(run_script(cog.inFile, 'anydbm_new.py'))
.. }}}
.. {{{end}}}



Since cog is integrated into Sphinx, all I had to do to enable it was define
the options and import the module. I chose to change the begin and end tags
used by cog because the default patterns ([[[cog and ]]]) appeared in the
output of some of the scripts (printing nested lists, for example).

cog=Bunch(
    beginspec='{{{cog',
    endspec='}}}',
    endoutput='{{{end}}}',
),



To process all of the input files through cog before generating the output,
I added 'cog' to the @needs list for any task running sphinx. Then it was
simply a matter of running "paver html" or "paver webhtml" to generate the
output.

Paver includes an "uncog" task to remove the cog output from your source
files before committing to a source code repository, but I decided to
include the cogged values in committed versions so I would be alerted if the
output ever changed.

Generating PDF: TeX Live

Generating HTML using Sphinx and Jinja templates is fairly straightforward;
PDF output wasn't quite so easy to set up. Sphinx actually produces LaTeX,
another text-based format, as output, along with a Makefile to run
third-party LaTeX tools to create the PDF. I started out experimenting on a
Linux system (normally I use a Mac, but this box claimed to have the
required tools installed). Due to the age of the system, however, the tools
weren't compatible with the LaTeX produced by Sphinx. After some searching,
and asking on the sphinx-dev mailing list, I installed a copy of
TeX Live <http://www.tug.org/texlive/>, a newer TeX distro. A few tweaks
to my $PATH later and I was in business building PDFs right on my Mac.

My pdf task runs Sphinx with the "latex" builder, then runs make using the
generated Makefile.

@task
@needs(['cog'])
def pdf():
    """Generate the PDF book.
    """
    set_templates(options.pdf.templates)
    run_sphinx('pdf')
    latex_dir = path(options.pdf.builddir) / 'latex'
    sh('cd %s; make' % latex_dir)
    return



I still need to experiment with some of the LaTeX options, including
templates for pages in different sizes, logos, and styles. For now I'm happy
with the default look.

Releasing

Once I had the "build" fully automated, it was time to address the
distribution process. For each version, I need to:


   - upload HTML, PDF, and tar.gz files to my server
   - update PyPI
   - post to my blog
   - post to the O'Reilly blog



The HTML and PDF files are copied to my server using rsync, invoked from
Paver. I use a web browser and the admin interface for
django-codehosting <http://www.doughellmann.com/projects/codehosting/> to
upload the tar.gz file containing the source distribution manually. That
will be automated, eventually. Once the tar.gz is available, PyPI can be
updated via the builtin task "paver register". That just leaves the two blog
posts.

For my own blog, I use MarsEdit <http://www.red-sweater.com/marsedit/> to
post and edit entries. I find the UI easy to use, and I like the ability to
work on drafts of posts offline. It is much nicer than the web interface for
Blogger, and has the benefit of being AppleScript-able. I have plans to
automate all of the steps right up to actually posting the new blog entry,
but for now I copy the generated blog entry into a new post window by hand.

O'Reilly's blogging policy does not allow desktop clients (too much of a
support issue for the tech staff), so I need to use their Movable Type web
UI to post. As with MarsEdit, I simply copy the output and paste it into the
field in the browser window, then add tags.

Tying it All Together

A quick overview of my current process is:

1. Pick a module, research it, and write examples in reST and Python.
Include the Python source and use cog directives to bring in the script
output.

2. Use the command "paver html" to produce HTML output to verify the results
look good and I haven't messed up any markup.

3. Commit the changes to svn. When I'm done with the module, copy the
"trunk" to a release branch for packging.

4. Use "paver sdist" to create the tar.gz file containing the Python source
and HTML documentation.

5. Upload the tar.gz file to my site.

6. Run "paver installwebsite" to regenerate the hosted version of the HTML
and the PDF, then copy both to my web server.

7. Run "paver register" to update PyPI with the latest release information.

8. Run "paver blog" to generate the HTML to be posted to the blogs. The task
opens a new TextMate window containing the HTML so it is ready to be copied.

9. Paste the blog post contents into MarsEdit, add tags, and send it to
Blogger.

10. Paste the blog post contents into the MT UI for O'Reilly, add tags,
verify that it renders properly, then publish.

Try It Yourself

All of the source for PyMOTW (including the pavement.py file with
configuration options, task definitions, and Sphinx integration) is
available from the PyMOTW web site <http://www.doughellmann.com/PyMOTW/>.
Sphinx, Paver, cog, and BeautifulSoup are all open source projects. I've
only tested the PyMOTW "build" on Mac OS X, but it should work on Linux
without any major alterations. If you're on Windows, let me know if you get
it working.



-- 
http://zoomquiet.org
'''Process improvement is what builds organizations that can grow reliable people!'''
Time is unimportant, only life important!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.zeuux.org/pipermail/zeuux-python/attachments/20090203/7d30a89d/attachment-0001.html>

[Imported from the Mailman archive: http://www.zeuux.org/pipermail/zeuux-python]

Wednesday, February 4, 2009, 12:33

Xia Qingran qingran.xia at gmail.com
Wed Feb 4 12:33:57 CST 2009

Sphinx has been on a roll lately~

On Tue, Feb 3, 2009 at 10:18 PM, 仨儿 <zoomquiet+sns at gmail.com> wrote:

> 这一期模块自研非常的实用哪,,,值得优先体验,,,
>
> ---------- Forwarded message ----------
> From: Doug Hellmann <doug.hellmann+feedburner at gmail.com2Bfeedburner at gmail.com>
> >
> Date: Tue, Feb 3, 2009 at 22:02
> Subject: Python Module of the Week
> To: zoomquiet+sns at gmail.com 2Bsns at gmail.com>
>
>
>    Python Module of the Week<http://blog.doughellmann.com/search/label/PyMOTW>
>
> Writing Technical Documentation with Sphinx, Paver, and Cog<http://feedproxy.google.com/%7Er/PyMOTW/%7E3/JOnbNzmhSNU/writing-technical-documentation-with.html>
>
> Posted: 02 Feb 2009 06:55 AM PST
> I've been working on the Python Module of the Week<http://www.doughellmann.com/PyMOTW/>series since March
> of 2007<http://blog.doughellmann.com/2007/03/pymotw-python-module-of-week.html>.
> During the course of the project, my article style and tool chain have both
> evolved. I now have a fairly smooth production process in place, so the
> mechanics of producing a new post don't get in the way of the actual
> research and writing. Most of the tools are open source, so I thought I
> would describe the process I go through and how the tools work together.
>
> Editing Text: TextMate
>
> I work on a MacBook Pro, and use TextMate <http://macromates.com/> for
> editing the articles and source for PyMOTW. TextMate is the one tool I use
> regularly that is not open source. When I'm doing heavy editing of hundreds
> of files for my day job I use Aquamacs Emacs <http://aquamacs.org/>, but
> TextMate is better suited for prose editing and is easier to extend with
> quick actions. I discovered TextMate while looking for a native editor to
> use for Python Magazine <http://www.pythonmagazine.com/>, and after being
> able to write my own "bundle" to manage magazine articles (including
> defining a mode for the markup language we use) I was hooked.
>
> Some of the features that I like about TextMate for prose editing are
> as-you-type spell-checking (I know some people hate this feature, but I find
> it useful), text statistics (word count, etc.), easy block selection (I can
> highlight a paragraph or several sentences and move them using cursor keys),
> a moderately good reStructuredText mode (emacs' is better, but TextMate's is
> good enough), paren and quote matching as you type, and very simple
> extensibility for repetitive tasks. I also like TextMate's project
> management features, since they makes it easy to open several related files
> at the same time.
>
> Version Control: svn
>
> I started out using a private svn repository for all of my projects,
> including PyMOTW. I'm in the middle of evaluating hosted DVCS options for
> PyMOTW<http://blog.doughellmann.com/2008/12/moving-pymotw-to-public-repository.html>,
> but still haven't had enough time to give them all the research I think is
> necessary before making the move. The Python core developers are considering
> a similar move (PEP 374 <http://www.python.org/dev/peps/pep-0374/>) so it
> will be interesting to monitor that discussion<http://mail.python.org/pipermail/python-dev/2009-January/085347.html>.
> No doubt we have different requirements (for example, they are hosting their
> own repository), but the experiences with the various DVCS tools will be
> useful input to my own decision.
>
> Markup Language: reStructuredText
>
> When I began posting, I wrote each article by hand using HTML. One of the
> first tasks that I automated was the step of passing the source code through
> pygments to produce a syntax colorized version. This worked well enough for
> me at the time, but restricted me to producing only HTML output. Eventually
> John Benediktsson contacted me with a version of many of the posts converted
> from HTML to reStructuredText<http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html>.
>
>
> When reStructuredText was first put forward in the '90's, I was heavily
> into Zope development. As such, I was using StructuredText<http://www.zope.org/Documentation/Articles/STX>for documenting my code, and in the Zope-based wiki that we ran at ZapMedia.
> I even wrote my own app <http://happydoc.sourceforge.net/> to extract
> comments and docstrings to generate library documentation for a couple of
> libraries I had released as open source. I really liked StructuredText and,
> at first, I *didn't* like reStructuredText. Frankly, it looked ugly
> compared to what I was used to. It quickly gained acceptance in the general
> community though, and I knew it would give me options for producing other
> output formats for the PyMOTW posts, so when John sent me the markup files I
> took another look.
>
> While re-acquainting myself with reST, I realized two things. First,
> although there is a bit more punctuation involved in the markup than with
> the original StructuredText, the markup language was designed with
> consistency in mind so it isn't as difficult to learn as my first
> impressions had lead me to believe. Second, it turned out the part I thought
> was "ugly" was actually the part that made reST *more powerful* than
> StructuredText: It has a standard syntax for extension directives that users
> can define for their own documents.
>
> Markup to Output: Sphinx
>
> Before I made a final decision on switching from hand-coded HTML to reST, I
> needed a tool to convert to HTML (I still had to post the results on the
> blog, after all, and Blogger doesn't support reST). I first tried David
> Goodger's docutils <http://docutils.sourceforge.net/> package. The scripts
> it includes felt a little too much like "pieces" of a tool rather than a
> complete solution, though, and I didn't really want to assemble my own
> wrappers if I didn't have to -- I wanted to write text for this project, not
> code my own tools. Around this time, Georg Brandl had made significant
> progress on Sphinx <http://sphinx.pocoo.org/>, which turned out to be a
> more complete turn-key system for converting a pile of reST files to HTML or
> PDF. After a few hours of experimentation, I had a sample project set up and
> was generating HTML from my documents using the standard templates.
>
> I decided that reStructuredText looked like the way to go.
>
> HTML Templates: Jinja:
>
> My next step was to work out exactly how to produce all of the outputs I
> needed from reST inputs. Each post for the PyMOTW series ends up going to
> several different places:
>
>
>    - the PyMOTW source distribution (HTML)
>    - my Blogger blog (HTML)
>    - the PyMOTW project site (HTML)
>    - O'Reilly.com <http://python.oreilly.com/> (HTML)
>    - the PyMOTW "book" (PDF)
>
>
>
> Each of the four HTML outputs uses slightly different formatting, requiring
> separate templates (PDF is a whole different problem, covered below). The
> source distribution and project site are both full HTML versions of all of
> the documents, but use different templates. I decided to use the default
> Sphinx templates for the packaged version; I may change that later, but it
> works for the time being, and it's one less custom template to deal with. I
> wanted the online version to match the appearance of the rest of my site, so
> I needed to create a template for it. The two blogs use a third template
> (O'Reilly's site ignores a lot of the markup due to their Moveable Type
> configuration, but the articles come out looking good enough so I can use
> the same template I use for my own blog without worrying about a separate
> custom template).
>
> Sphinx uses Jinja <http://jinja.pocoo.org/> templates to produce HTML
> output. The syntax for Jinja is very similar to Django's template language.
> As it happens, I use Django for the dynamic portion of my web site that I
> host myself. I lucked out, and my site's base template was simple enough to
> use with Sphinx without making any changes. Yay for compatibility!
>
> Cleaning up HTML with BeautifulSoup
>
> The blog posts need to be relatively clean HTML that I can upload to
> Blogger and O'Reilly, so they could not include any html or body tags or
> require any markup or styles not supported by either blogging engine. The
> template I came up with is a stripped down version that doesn't include the
> CSS and markup for sidebars, header, or footer. The result was *almost*exactly what I wanted, but had two problems.
>
> The easiest problem to handle was the permalinks generated by Sphinx. After
> each heading on the page, Sphinx inserts an anchor tag with a ¶ character
> and applies CSS styles that hide/show the tag when the user hovers over it.
> That's a nice feature for the main site and packaged content, but they
> didn't work for the blogs. I have no control over the CSS used at O'Reilly,
> so the tags were always visible. I didn't really care if they were included
> on the Blogger pages, so the simplest thing to do was stick with one
> "blogging" template and remove the permalinks.
>
> The second, more annoying, problem, was that Blogger wanted to insert extra
> whitespace into the post. There is a configuration option on Blogger to
> treat line breaks in the post as "paragraph breaks" (I think they actually
> insert br tags). This is very convenient for normal posts with mostly
> straight text, since I can simply write each paragraph on one long line,
> wrapped visually by my editor, and break the paragraphs where I want them.
> The result is I can almost post directly from plain text input.
> Unfortunately, the option is applied to *every* post in the blog (even old
> posts), so changing it was not a realistic option -- I wasn't about to go
> back and re-edit every single post I had previously written.
>
> Sphinx didn't have an option to skip generating the permalinks, and there
> was no way to express that intent in the template, so I fell back to writing
> a little script to strip them out after the fact. I used BeautifulSoup<http://www.crummy.com/software/BeautifulSoup/>to find the tags I wanted removed, delete them from the parse tree, then
> assemble the HTML text as a string again. I added code to the same script to
> handle the whitespace issue by removing all newlines from the input unless
> they were inside pre tags, which Blogger handled correctly. The result was
> a single blob of partial HTML without newlines or permalinks that I could
> post directly to either blog without editing it by hand. Score a point for
> automation.
>
> def clean_blog_html(body):
>     # Clean up the HTML
>     import re
>     import sys
>
>     from BeautifulSoup import BeautifulSoup
>     from cStringIO import StringIO
>
>     # The post body is passed to stdin.
>
>     soup = BeautifulSoup(body)
>
>     # Remove the permalinks to each header since the blog does not have
>     # the styles to hide them.
>
>     links = soup.findAll('a', attrs={'class':"headerlink"})
>
>     [l.extract() for l in links]
>
>     # Get BeautifulSoup's version of the string
>
>     s = soup.__str__(prettyPrint=False)
>
>     # Remove extra newlines.  This depends on the fact that
>
>     # code blocks are passed through pygments, which wraps each part of the line
>     # in a span tag.
>     pattern = re.compile(r'([^s][^p][^a][^n]>)\n$', re.DOTALL|re.IGNORECASE)
>
>     s = ''.join(pattern.sub(r'\1', l) for l in StringIO(s))
>
>
>     return s
>
>
>
> Code Syntax Highlighting: pygments
>
> I wanted my posts to look as good as possible, and an important factor in
> the appearance would be the presentation of the source code. I adopted
> pygments <http://pygments.org/> in the early hand-coded HTML days, because
> it was easy to integrate into TextMate with a simple script.
>
> pygmentize -f html -O cssclass=syntax $@
>
>
>
> Binding the command to a key combination meant with a few quick keypresses
> I had HTML ready to insert into the body of a post.
>
> When I moved to Sphinx, using pygments became even easier because Sphinx
> automatically passes included source code through pygments as it generates
> its output. Syntax highlighting works for HTML and PDF, so I didn't need any
> custom processing.
>
> Automation: Paver
>
> Automation is important for my sense of well being. I hate dealing with
> mundane repetitive tasks, so once an article was written I didn't want to
> have to touch it to prepare it for publication of any of the final
> destinations. As I have written before<http://blog.doughellmann.com/2009/01/converting-from-make-to-paver.html>,
> I started out using make to run various shell commands. I have since
> converted the entire process to Paver.
>
> The stock Sphinx integration provided with that comes with Paver didn't
> quite meet my needs, but by examining the source I was able to create my own
> replacement tasks in an afternoon. The main problem was the tight coupling
> between the code to run Sphinx and the code to find the options to pass to
> it. For normal projects with a single documentation output format (Paver
> assumes HTML with a single config file), this isn't a problem. PyMOTW's
> requirements are different, with the four output formats discussed above.
>
> In order to produce different output with Sphinx, you need different
> configuration files. Since the base name for the file must always be
> conf.py, that means the files have to be stored in separate directories.
> One of the options passed to Sphinx on the command line tells it the
> directory to look in for its configuration file. Even though Paver doesn't
> fork() before calling Sphinx, it still uses the command line options to
> pass instructions.
>
> Creating separate Sphinx configuration files was easy. The problem was
> defining options in Paver to tell Sphinx about each configuration directory
> for the different output. Paver options are grouped into bundles, which are
> essentially a namespace. When a Paver task looks for an option, it scans
> through the bundles, possibly cascading to the global namespace, until it
> finds the option by name. The search can be limited to specific bundles, so
> that the same option name can be used to configure different tasks.
>
> The html task from paver.doctools sets the options search order to look
> for values first in the *sphinx* section, then globally. Once it has
> retrieved the path values, via _get_paths(), it invokes Sphinx.
>
> def _get_paths():
>     """look up the options that determine where all of the files are."""
>     opts = options
>
>     docroot = path(opts.get('docroot', 'docs'))
>
>     if not docroot.exists():
>         raise BuildFailure("Sphinx documentation root (%s) does not exist."
>
>                            % docroot)
>     builddir = docroot / opts.get("builddir", ".build")
>
>     builddir.mkdir()
>     srcdir = docroot / opts.get("sourcedir", "")
>
>     if not srcdir.exists():
>         raise BuildFailure("Sphinx source file dir (%s) does not exist"
>
>                             % srcdir)
>     htmldir = builddir / "html"
>     htmldir.mkdir()
>
>     doctrees = builddir / "doctrees"
>     doctrees.mkdir()
>     return Bunch(locals())
>
>
> @task
> def html():
>     """Build HTML documentation using Sphinx. This uses the following
>     options in a "sphinx" section of the options.
>
>     docroot
>       the root under which Sphinx will be working. Default: docs
>     builddir
>       directory under the docroot where the resulting files are put.
>       default: build
>     sourcedir
>       directory under the docroot for the source files
>       default: (empty string)
>     """
>
>     options.order('sphinx', add_rest=True)
>     paths = _get_paths()
>
>     sphinxopts = ['', '-b', 'html', '-d', paths.doctrees,
>
>         paths.srcdir, paths.htmldir]
>     dry("sphinx-build %s" % (" ".join(sphinxopts),), sphinx.main, sphinxopts)
>
>
>
> This didn't work for me because I needed to pass a separate configuration
> directory (not handled by the default _get_paths()) and different build
> and output directories. The simplest solution turned out to be
> re-implementing the Paver-Sphinx integration to make it more flexible. I
> created my own _get_paths() and made it look for the extra option values
> and use the directory structure I needed.
>
> def _get_paths():
>     """look up the options that determine where all of the files are."""
>     opts = options
>
>
>     docroot = path(opts.get('docroot', 'docs'))
>
>     if not docroot.exists():
>         raise BuildFailure("Sphinx documentation root (%s) does not exist."
>
>                            % docroot)
>
>     builddir = docroot / opts.get("builddir", ".build")
>
>     builddir.mkdir()
>
>     srcdir = docroot / opts.get("sourcedir", "")
>
>     if not srcdir.exists():
>         raise BuildFailure("Sphinx source file dir (%s) does not exist"
>
>                             % srcdir)
>
>     # Where is the sphinx conf.py file?
>     confdir = path(opts.get('confdir', srcdir))
>
>
>     # Where should output files be generated?
>     outdir = opts.get('outdir', '')
>
>     if outdir:
>         outdir = path(outdir)
>     else:
>         outdir = builddir / opts.get('builder', 'html')
>
>     outdir.mkdir()
>
>     # Where are doctrees cached?
>     doctrees = opts.get('doctrees', '')
>
>     if not doctrees:
>         doctrees = builddir / "doctrees"
>     else:
>
>         doctrees = path(doctrees)
>     doctrees.mkdir()
>
>     return Bunch(locals())
>
>
>
> Then I defined a new function, run_sphinx(), to set up the options search
> path, look for the option values, and invoke Sphinx. I set add_rest to
> False to disable searching globally for an option to avoid namespace
> polution from option collisions, since I knew I was going to have options
> with the same names but different values for each output format. I also look
> for a "builder", to support PDF generation.
>
> def run_sphinx(*option_sets):
>     """Helper function to run sphinx with common options.
>
>     Pass the names of namespaces to be used in the search path
>     for options.
>     """
>     if 'sphinx' not in option_sets:
>
>         option_sets += ('sphinx',)
>     kwds = dict(add_rest=False)
>
>     options.order(*option_sets, **kwds)
>     paths = _get_paths()
>
>     sphinxopts = ['',
>                   '-b', options.get('builder', 'html'),
>
>                   '-d', paths.doctrees,
>                   '-c', paths.confdir,
>
>                   paths.srcdir, paths.outdir]
>     dry("sphinx-build %s" % (" ".join(sphinxopts),), sphinx.main, sphinxopts)
>
>     return
>
>
>
> With a working run_sphinx() function I could define several Sphinx-based
> tasks, each taking options with the same names but from different parts of
> the namespace. The tasks simply call run_sphinx() with the desired
> namespace search path. For example, to generate the HTML to include in the
> sdist package, the html task looks in the *html* bunch:
>
> @task
> @needs(['cog'])
> def html():
>     set_templates(options.html.templates)
>
>     run_sphinx('html')
>
>
>
> while generating the HTML output for the website uses a different set of
> options from the *website* bunch:
>
> @task
> @needs(['webtemplatebase', 'cog'])
> def webhtml():
>
>     set_templates(options.website.templates)
>     run_sphinx('website')
>
>     return
>
>
>
> All of the option search paths also include the *sphinx* bunch, so values
> that do not change (such as the source directory) do not need to be
> repeated. The relevant portion of the options from the PyMOTW pavement.py
> file looks like this:
>
> options(
>
>     # ...
>
>     sphinx = Bunch(
>         sourcedir=PROJECT,
>
>         docroot = '.',
>         builder = 'html',
>         doctrees='sphinx/doctrees',
>
>         confdir = 'sphinx',
>     ),
>
>     html = Bunch(
>         builddir='docs',
>
>         outdir='docs',
>         templates='pkg',
>     ),
>
>     website=Bunch(
>
>         templates = 'web',
>         #outdir = 'web',
>         builddir = 'web',
>
>     ),
>
>     pdf=Bunch(
>         templates='pkg',
>         #outdir='pdf_output',
>
>         builddir='web',
>         builder='latex',
>     ),
>
>     blog=Bunch(
>
>         sourcedir=path(PROJECT)/MODULE,
>         builddir='blog_posts',
>
>         outdir='blog_posts',
>         confdir='sphinx/blog',
>         doctrees='blog_posts/doctrees',
>
>     ),
>
>     # ...
> )
>
>
>
> To find the sourcedir for the html task, _get_paths() first looks in the *
> html* bunch, then the *sphinx* bunch.
>
> Capturing Program Output: cog
>
> As an editor at Python Magazine, and reviewer for several books, I've
> discovered that one of the most frequent sources of errors with technical
> writing occurs in the production process where the output of running sample
> code is captured to be included in the final text. This is usually done
> manually by running the program and copying and pasting its output from the
> console. It's not uncommon for a bug to be found, or a library to change,
> requiring a change in the source code provided with the article. That
> change, in turn, means the output of commands may be different. Sometimes
> the change is minor, but at other times the output is different in some
> significant way. Since I've seen the problem come up so many times, I spent
> time thinking about and looking for a solution to avoid it in my own work.
>
> During my research, a few people suggested that I switch to using doctests
> for my examples, but I felt there were several problems with that approach.
> First, the doctest format isn't very friendly for users who want to copy and
> paste examples into their own scripts. The reader has to select each line
> individually, and can't simply grab the entire block of code. Distributing
> the examples as separate scripts makes this easier, since they can simply
> copy the entire file and modify it as they want. Using individual .py files
> also makes it possible for some of the more complicated examples to run
> clients and servers at the same time from different scripts (as with
> SimpleXMLRPCServer<http://www.doughellmann.com/PyMOTW/SimpleXMLRPCServer/>,
> for example). But most importantly, using doctests does not solve the
> fundamental problem. Doctests tell me when the output has changed, but I
> still have to manually run the scripts to generate that output and paste it
> into my document in the first place. What I really wanted to be able to do
> was run the script and insert the output, whatever it was, *without*manually copying and pasting text from the console.
>
> I finally found what I was looking for in cog<http://nedbatchelder.com/code/cog/>,
> from Ned Batchelder. Ned describes cog as a "code generation tool", and most
> of the examples he provides on his site are in that vein. But cog is a more
> general purpose tool than that. It gives you a way to include arbitrary
> Python instructions in your source document, have them executed, and then
> have the source document change to reflect the output.
>
> For each code sample, I wanted to include the Python source followed by the
> output it produces when run on the console. There is a reST directive to
> include the source file, so that part is easy:
>
> .. include:: anydbm_whichdb.py
>
>     :literal:
>     :start-after: #end_pymotw_header
>
>
>
> The include directive tells Sphinx that the file "anydbm_whichdb.py"
> should be treated as a literal text block (instead of more reST) and to only
> include the parts following the last line of the standard header I use in
> all my source code. Syntax highlighting comes for free when the literal
> block is converted to the output format.
>
> Grabbing the command output was a little trickier. Normally with cog, one
> would embed the actual source to be run in the document. In my case, I had
> the text in an external file. Most of the source is Python, and I could just
> import it, but I would have to go to special lengths to capture any output
> and pass it to cog.out(), the cog function for including text in the
> processed document. I didn't want my example code littered with calls to
> cog.out() instead of print, so I needed to capture sys.stdout and
> sys.stdin. A bigger question was whether I wanted to have all of the sample
> files imported into the namespace of the build process. Considering both
> issues, it made sense to run the script in a separate process and capture
> the output.
>
> There is a bit of setup work needed to run the scripts this way, so I
> decided to put it all into a function instead of including the boilerplate
> code in every cog block. The reST source for running anydbm_whichdb.py looks
> like:
>
> .. {{{cog
> .. cog.out(run_script(cog.inFile, 'anydbm_whichdb.py'))
> .. }}}
> .. {{{end}}}
>
>
>
> The .. at the start of each line causes the reStructuredText parser to
> treat the line as a comment, so it is not included in the output. After
> passing the reST file through cog, it is rewritten to contain:
>
> .. {{{cog
> .. cog.out(run_script(cog.inFile, 'anydbm_whichdb.py'))
> .. }}}
>
> ::
>
> 	$ python anydbm_whichdb.py
> 	dbhash
>
> .. {{{end}}}
>
>
>
> The run_script() function runs the Python script it is given, adds a
> prefix to make reST treat the following lines as literal text, then indents
> the script output. The script is run via Paver's sh() function, which
> wraps the subprocess module and supports the dry-run feature of Paver.
> Because the cog instructions are comments, the only part that shows up in
> the output is the literal text block with the command output.
>
> def run_script(input_file, script_name, interpreter='python'):
>     """Run a script in the context of the input_file's directory,
>     return the text output formatted to be included as an rst
>     literal text block.
>     """
>     from paver.runtime import sh
>     from paver.path import path
>     rundir = path(input_file).dirname()
>     output_text = sh('cd %(rundir)s; %(interpreter)s %(script_name)s 2>&1' % vars(),
>                      capture=True)
>     response = '\n::\n\n\t$ %(interpreter)s %(script_name)s\n\t' % vars()
>     response += '\n\t'.join(output_text.splitlines())
>     while not response.endswith('\n\n'):
>         response += '\n'
>     return response
>
> # Stuff run_script() into the builtins so we don't have to
> # import it in all of the cog blocks where we want to use it.
> __builtins__['run_script'] = run_script
>
>
>
> I defined run_script() in my pavement.py file, and added it to the
> __builtins__ namespace to avoid having to import it each time I wanted to
> use it from a source document.
>
> A somewhat more complicated example shows another powerful feature of cog.
> Because it can run arbitrary Python code, it is possible to establish
> the pre-conditions for a script before running it. For example,
> anydbm_new.py assumes that its output database does not already exist. I can
> ensure that condition by removing it before running the script.
>
> .. {{{cog
> .. from paver.path import path
> .. from paver.runtime import sh
> .. workdir = path(cog.inFile).dirname()
> .. sh("cd %s; rm -f /tmp/example.db" % workdir)
> .. cog.out(run_script(cog.inFile, 'anydbm_new.py'))
> .. }}}
> .. {{{end}}}
>
>
>
> Since cog is integrated into Paver, all I had to do to enable it was
> define the options and import the module. I chose to change the begin and
> end tags used by cog because the default patterns ([[[cog and ]]])
> appeared in the output of some of the scripts (printing nested lists, for
> example).
>
> cog=Bunch(
>     beginspec='{{{cog',
>     endspec='}}}',
>     endoutput='{{{end}}}',
> ),
>
>
>
> To process all of the input files through cog before generating the output,
> I added 'cog' to the @needs list for any task running sphinx. Then it was
> simply a matter of running "paver html" or "paver webhtml" to generate the
> output.
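>
> A minimal sketch of what such a task declaration might look like,
> following the same pattern as the pdf task shown below (the body here is
> illustrative, not my exact code):
>
> @task
> @needs(['cog'])
> def html():
>     """Generate the HTML documentation.
>     """
>     run_sphinx('html')
>     return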
>
> Paver includes an "uncog" task to remove the cog output from your source
> files before committing to a source code repository, but I decided to
> include the cogged values in committed versions so I would be alerted if the
> output ever changed.
>
> Generating PDF: TeX Live
>
> Generating HTML using Sphinx and Jinja templates is fairly straightforward;
> PDF output wasn't quite so easy to set up. Sphinx actually produces LaTeX,
> another text-based format, as output, along with a Makefile to run
> third-party LaTeX tools to create the PDF. I started out experimenting on a
> Linux system (normally I use a Mac, but this box claimed to have the
> required tools installed). Due to the age of the system, however, the tools
> weren't compatible with the LaTeX produced by Sphinx. After some searching,
> and asking on the sphinx-dev mailing list, I installed a copy of TeX Live<http://www.tug.org/texlive/>,
> a newer TeX distro. A few tweaks to my $PATH later and I was in business
> building PDFs right on my Mac.
>
> My pdf task runs Sphinx with the "latex" builder, then runs make using the
> generated Makefile.
>
> @task
> @needs(['cog'])
> def pdf():
>     """Generate the PDF book.
>     """
>     set_templates(options.pdf.templates)
>     run_sphinx('pdf')
>     latex_dir = path(options.pdf.builddir) / 'latex'
>     sh('cd %s; make' % latex_dir)
>     return
>
>
>
> I still need to experiment with some of the LaTeX options, including
> templates for pages in different sizes, logos, and styles. For now I'm happy
> with the default look.
>
> Releasing
>
> Once I had the "build" fully automated, it was time to address the
> distribution process. For each version, I need to:
>
>
>    - upload HTML, PDF, and tar.gz files to my server
>    - update PyPI
>    - post to my blog
>    - post to the O'Reilly blog
>
>
>
> The HTML and PDF files are copied to my server using rsync, invoked from
> Paver. I use a web browser and the admin interface for django-codehosting
> <http://www.doughellmann.com/projects/codehosting/> to upload the tar.gz
> file containing the source distribution manually. That will be automated,
> eventually. Once the tar.gz is available, PyPI can be updated via the
> built-in task "paver register". That just leaves the two blog posts.
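>
> The rsync step lives in its own Paver task; a sketch of the idea, with a
> placeholder destination rather than my real server, might look like:
>
> @task
> @needs(['webhtml', 'pdf'])
> def installwebsite():
>     """Rebuild the hosted HTML and PDF, then copy them to the server.
>     """
>     # Placeholder destination; -a preserves permissions and times,
>     # -v lists what is transferred.
>     sh('rsync -av %s/html/ example.com:/var/www/PyMOTW/' % options.pdf.builddir)
>     return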
>
> For my own blog, I use MarsEdit <http://www.red-sweater.com/marsedit/> to
> post and edit entries. I find the UI easy to use, and I like the ability to
> work on drafts of posts offline. It is much nicer than the web interface for
> Blogger, and has the benefit of being AppleScript-able. I have plans to
> automate all of the steps right up to actually posting the new blog entry,
> but for now I copy the generated blog entry into a new post window by hand.
>
> O'Reilly's blogging policy does not allow desktop clients (too much of a
> support issue for the tech staff), so I need to use their Movable Type web
> UI to post. As with MarsEdit, I simply copy the output and paste it into the
> field in the browser window, then add tags.
>
> Tying it All Together
>
> A quick overview of my current process is:
>
> 1. Pick a module, research it, and write examples in reST and Python.
> Include the Python source and use cog directives to bring in the script
> output.
>
> 2. Use the command "paver html" to produce HTML output to verify the
> results look good and I haven't messed up any markup.
>
> 3. Commit the changes to svn. When I'm done with the module, copy the
> "trunk" to a release branch for packging.
>
> 4. Use "paver sdist" to create the tar.gz file containing the Python source
> and HTML documentation.
>
> 5. Upload the tar.gz file to my site.
>
> 6. Run "paver installwebsite" to regenerate the hosted version of the HTML
> and the PDF, then copy both to my web server.
>
> 7. Run "paver register" to update PyPI with the latest release information.
>
> 8. Run "paver blog" to generate the HTML to be posted to the blogs. The
> task opens a new TextMate window containing the HTML so it is ready to be
> copied.
>
> 9. Paste the blog post contents into MarsEdit, add tags, and send it to
> Blogger.
>
> 10. Paste the blog post contents into the MT UI for O'Reilly, add tags,
> verify that it renders properly, then publish.
>
> Try It Yourself
>
> All of the source for PyMOTW (including the pavement.py file with
> configuration options, task definitions, and Sphinx integration) is
> available from the PyMOTW web site <http://www.doughellmann.com/PyMOTW/>.
> Sphinx, Paver, cog, and BeautifulSoup are all open source projects. I've
> only tested the PyMOTW "build" on Mac OS X, but it should work on Linux
> without any major alterations. If you're on Windows, let me know if you get
> it working.
>
>
>
> --
> http://zoomquiet.org
> '''Process improvement is what creates organizations that foster reliable people!'''
> Time is unimportant, only life important!
>
>


-- 
夏清然
Xia Qingran
qingran.xia at gmail.com
Calvin Trillin  - "Health food makes me sick."

[Imported from the Mailman archive: http://www.zeuux.org/pipermail/zeuux-python]
