Fetching scripts
Jump to navigation
Jump to search
Overview
I'm considering building in a mechanism for automatically fetching scripts from PyMOLWiki. The goal is to allow users to say
fetch findSurfaceResidues, type=script
findSurfaceResidues doShow=True, cutoff=0.5
The convenience benefits are obvious, especially for new users, and I think that lowering the barrier to script usage will greatly increase both the number of people who use various scripts and the incentive to place scripts on the wiki (especially if the fetch mechanism makes it easy for script authors to provide a citation/DOI/etc.).
Issues
Security
Running untrusted code is trouble. Some ideas
- MediaWiki allows us to protect pages so that only administrators can edit them. We could protect all approved scripts.
- Alternately, we could have a protected page that links to the approved scripts. I lean towards the first option from a security standpoint, although it's obviously less convenient for script authors, as it requires them to get an administrator involved when making changes. Maybe a hybrid system where scripts have a development version and a release version? I don't want to make too much overhead, though.
- We should print a warning each time a new script is fetched anyway
- Can fetched scripts persist across saved sessions? Perhaps not.
- Plugins? This is probably more worth considering for a future version, but it would be nice to be able to load plugins as well. Since plugins are (now) installed permanently, we have to think carefully about the implications.
Convenience options
The main benefit is to make things as convenient and easy as possible, especially for new users.
- Local cache. This would make reloading scripts with each new session easier and faster. You could then stick a bunch of "fetch" lines in your pymolrc.
- A command to list all available scripts?
Validation
- How will users know that their script is doing the correct thing
- Perhaps we should have two classes of scripts: approved and validated
Format
My guess is that we'll require fetchable scripts to follow a certain format on the wiki pages. That should include some metadata like
- Version number. This makes debugging easier and it makes smart caching possible
- Citation. Script authors should be able to provide a preferred citation, DOI, etc. One of the benefits is to get script authors more credit.
- Documentation. Or should this be handled in the doc string?
Implementation
- This will obviously be written in Python.
- We'll probably make use of some screen scraping library. I don't know the state of the art here, but I've seen at least the following, and would love some comments:
- Generic interfaces
- scrapy (looks reasonable, this page may also be useful.
- Beautiful Soup
- lxml
- mechanize
- webscraping
- urllib(2?) with regular expressions for that old-school feel?
- Specific interfaces
- PyWikipedia
python-wikitools (this looks good to me!)can never be bundled with PyMOL due to GPL v3.- Other pages listed here
- Generic interfaces