Difference between revisions of "Fetching scripts"

From PyMOLWiki
Revision as of 11:56, 4 May 2011

Overview

I'm considering building in a mechanism for automatically fetching scripts from PyMOLWiki. The goal is to let users type:

fetch findSurfaceResidues, type=script
findSurfaceResidues doShow=True, cutoff=0.5

The convenience benefits are obvious, especially for new users. I think that lowering the barrier to script usage will greatly increase both the number of people who use scripts and the incentive to place scripts on the wiki, especially if the fetch mechanism makes it easy for script authors to provide a citation, DOI, etc.
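To make the idea concrete, here is a minimal sketch of what the fetch step could look like. Nothing here is an existing PyMOL API: the URL template, `download`, and `fetch_script` are made-up names, and the sketch assumes the server returns plain Python source for a given script name.

```python
import urllib.request

# Hypothetical URL template; where the wiki would actually serve raw
# script text is an open question on this page.
RAW_URL = "https://example.org/scripts/{name}.py"


def download(url):
    """Return the text at `url` using a plain standard-library HTTP GET."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def fetch_script(name, downloader=download, namespace=None):
    """Download the script called `name` and execute it in `namespace`.

    Returns the namespace so the caller can look up what the script defined.
    """
    source = downloader(RAW_URL.format(name=name))
    namespace = {} if namespace is None else namespace
    # Executing downloaded code is exactly the security concern raised
    # under "Issues"; a real version would need the checks discussed there.
    exec(compile(source, name, "exec"), namespace)
    return namespace
```

The `downloader` argument is only there so the network call can be swapped out, for example with a function that returns a fixed string during testing.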

Issues

Security

Running untrusted code is risky. Some ideas:

  • We could keep a page, editable only by administrators, that links to approved scripts.
  • Even so, we should print a warning each time a new script is fetched.
  • Can fetched scripts persist across saved sessions? Perhaps not.
  • Plugins? This is probably better left for a future version, but it would be nice to be able to load plugins as well. Since plugins are (now) installed permanently, we have to think carefully about the implications.
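The first two ideas could be combined roughly as follows. This is only a sketch: `check_before_fetch` is a hypothetical helper, and it assumes the administrator-only page has already been reduced to a set of approved script names.

```python
def check_before_fetch(name, approved, seen):
    """Refuse unapproved scripts and warn the first time a script is fetched.

    `approved` is the set of names taken from the (assumed) admin-only page;
    `seen` is a set of names already fetched in this session and is updated
    in place. Returns True if a warning was printed, False otherwise.
    """
    if name not in approved:
        raise PermissionError(f"'{name}' is not on the approved list")
    if name in seen:
        return False
    print(f"Warning: fetching '{name}' from the wiki; it will run as ordinary Python code.")
    seen.add(name)
    return True
```

Raising an error for unapproved names is one possible policy; asking the user for confirmation would be another.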

Convenience options

The main benefit is to make things as convenient and easy as possible, especially for new users.

  • Local cache. This would make reloading scripts with each new session easier and faster. You could then stick a bunch of "fetch" lines in your pymolrc.
  • A command to list all available scripts?
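A local cache could be as simple as one file per script. The sketch below assumes a hypothetical `cached_fetch` helper, a caller-chosen cache directory, and a `downloader` callable that returns the script source for a name; none of these exist in PyMOL today.

```python
import os


def cached_fetch(name, cache_dir, downloader):
    """Return the source of script `name`, downloading it only on a cache miss.

    Assumes `name` is a plain script name with no path separators.
    """
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, name + ".py")
    if os.path.exists(path):
        # Cache hit: no network access needed.
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    source = downloader(name)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(source)
    return source
```

With something like this in place, a line such as `fetch findSurfaceResidues, type=script` in a pymolrc would only hit the network the first time it runs.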

Validation

  • How will users know that their script is doing the correct thing?
  • Perhaps we should have two classes of scripts: approved and validated

Format

My guess is that we'll require fetchable scripts to follow a certain format on the wiki pages. That format should include some metadata, such as:

  • Version number. This makes debugging easier and makes smart caching possible.
  • Citation. Script authors should be able to provide a preferred citation, DOI, etc. One of the benefits is to get script authors more credit.
  • Documentation. Or should this be handled in the doc string?
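As one possible shape for this metadata, the sketch below assumes a made-up convention: `# Version:` and `# Citation:` comment lines at the top of the script, with documentation taken from the module doc string. The function names `read_metadata` and `is_stale` are likewise hypothetical, and `is_stale` assumes versions are dotted integers.

```python
import ast
import re

# Matches header comments such as "# Version: 1.2" (an assumed convention).
FIELD = re.compile(r"^#\s*(Version|Citation):\s*(.+?)\s*$", re.MULTILINE)


def read_metadata(source):
    """Collect version, citation, and doc string from a script's source text."""
    meta = {key.lower(): value for key, value in FIELD.findall(source)}
    meta["documentation"] = ast.get_docstring(ast.parse(source))
    return meta


def is_stale(cached_version, wiki_version):
    """Return True when the wiki copy is newer than the cached copy."""
    def as_tuple(version):
        return tuple(int(part) for part in version.split("."))
    return as_tuple(wiki_version) > as_tuple(cached_version)
```

Comparing versions as integer tuples rather than strings means "1.10" is correctly treated as newer than "1.2", which is what a cache would need in order to decide when to download a script again.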

Implementation