Fetching scripts
Latest revision as of 11:09, 30 November 2011
Overview
I'm considering building in a mechanism for automatically fetching scripts from PyMOLWiki. The goal is to allow users to say:
fetch findSurfaceResidues, type=script
findSurfaceResidues doShow=True, cutoff=0.5
The convenience benefits are obvious, especially for new users, and I think that lowering the barrier to script usage will greatly increase both the number of people who use various scripts and the incentive to place scripts on the wiki (especially if the fetch mechanism makes it easy for script authors to provide a citation/DOI/etc.).
Issues
Security
Running untrusted code is trouble. Some ideas:
- MediaWiki allows us to protect pages so that only administrators can edit them. We could protect all approved scripts.
- Alternatively, we could have a protected page that links to the approved scripts.
- I lean towards using both of those options. Requiring only that the page be protected could cause trouble if an unrelated page happens to be protected and, because it was never intended to be a fetchable script, contains a security hole.
- Maybe a hybrid system where scripts have a development version and a release version? I don't want to add too much overhead, though. Current idea: setting secure=0 means you get the development version.
- We should print a warning each time a new script is fetched anyway.
- Can fetched scripts persist across saved sessions? It's easy enough to save them in fetch_path.
- Plugins? This is probably more worth considering for a future version, but it would be nice to be able to load plugins as well. Since plugins are (now) installed permanently, we have to think carefully about the implications.
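A minimal sketch of the "use both options" check from the list above. It assumes the fetch mechanism has already obtained two sets of page titles: those linked from the protected index page and those that are themselves protected. The names is_fetchable, warn_on_fetch, approved_index and protected_pages are hypothetical, and how the sets would be read from MediaWiki is left out.

```python
import warnings


def is_fetchable(title, approved_index, protected_pages, secure=True):
    """Return True if the script page `title` may be fetched.

    With secure=True the page must be listed on the protected index page
    AND be protected itself. With secure=False (the development version)
    any page is allowed.
    """
    if not secure:
        return True
    return title in approved_index and title in protected_pages


def warn_on_fetch(title, already_fetched):
    """Warn the first time a script is fetched.

    Returns True if a warning was issued, False if the script was seen before.
    """
    if title in already_fetched:
        return False
    warnings.warn("Fetching and running script '%s' from the wiki" % title)
    already_fetched.add(title)
    return True
```

A protected page that is not on the index (for example the wiki's main page) is rejected, which is the case the "both options" idea is meant to cover.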
Convenience options
The main benefit is to make things as convenient and easy as possible, especially for new users.
- Local cache. This would make reloading scripts with each new session easier and faster. You could then stick a bunch of "fetch" lines in your pymolrc. The cache could live in fetch_path.
- A command to list all available scripts?
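The local cache idea could look roughly like this. It is only a sketch: load_script_source and cached_script_path are hypothetical names, the cache directory is passed in as a plain path (standing in for fetch_path), and the actual download is delegated to any callable so that the caching logic can be tried without a network connection.

```python
import os


def cached_script_path(name, fetch_path):
    """Return where script `name` would be stored inside `fetch_path`."""
    return os.path.join(fetch_path, name + ".py")


def load_script_source(name, fetch_path, download):
    """Return the script source, downloading it only if it is not cached.

    `download` is any callable that takes the script name and returns its
    source as a string (for example, a function that reads the wiki).
    """
    path = cached_script_path(name, fetch_path)
    if os.path.exists(path):
        # Cache hit: no network access needed.
        with open(path) as handle:
            return handle.read()
    source = download(name)
    with open(path, "w") as handle:
        handle.write(source)
    return source
```

This sketch never refreshes a cached file; a version number in the script metadata would be one way to decide when to download again.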
Validation
- How will users know that their script is doing the correct thing?
- Perhaps we should have two classes of scripts: approved and validated.
Format
My guess is that we'll require fetchable scripts to follow a certain format on the wiki pages. That should include some metadata such as:
- Version number. This makes debugging easier and it makes smart caching possible.
- Citation. Script authors should be able to provide a preferred citation, DOI, etc. One of the benefits is to get script authors more credit.
- Documentation. Or should this be handled in the doc string?
I think all of these things should be function attributes:
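from pymol import cmd

def myScript(a, b, c):
    """ Documentation for myScript goes here """
    pass  # the body of the script goes here

# Set the metadata after the definition, so that it can be read
# without having to call the function first.
myScript.citation = "I. Coder, A Journal, 2011, Vol. 2, Issue 1, pages 64-66"
myScript.version = 1.2

# Register the function as a PyMOL command.
cmd.extend('myScript', myScript)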
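As a rough illustration of how a fetch mechanism might read such function attributes, here is a sketch with a hypothetical helper, script_metadata. It assumes the attributes are set after the function definition; attributes assigned inside the function body would only exist once the function has been called at least once.

```python
def script_metadata(func):
    """Collect the metadata a fetch mechanism might read from a script function.

    Missing attributes are reported as None rather than raising an error.
    """
    return {
        "name": func.__name__,
        "version": getattr(func, "version", None),
        "citation": getattr(func, "citation", None),
        "documentation": (func.__doc__ or "").strip(),
    }


def example(a, b, c):
    """Documentation for example goes here"""


# Metadata attached as function attributes, outside the function body.
example.version = 1.2
example.citation = "I. Coder, A Journal, 2011, Vol. 2, Issue 1, pages 64-66"
```

Calling script_metadata(example) returns a dictionary with the name, version, citation and doc string, which could be printed when the script is fetched.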
Implementation
- This will obviously be written in Python.
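- We'll probably make use of some screen scraping library. I don't know the state of the art here, but I've seen at least the following, and would love some comments:
  - Generic interfaces
    - scrapy (http://scrapy.org) looks reasonable; this page may also be useful: https://github.com/clofresh/couch-crawler/blob/master/python/couchcrawler/spiders/wiki.py
    - Beautiful Soup (http://www.crummy.com/software/BeautifulSoup/)
    - lxml (http://lxml.de)
    - mechanize (http://wwwsearch.sourceforge.net/mechanize/)
    - webscraping (http://code.google.com/p/webscraping/)
    - urllib(2?) with regular expressions for that old-school feel?
  - Specific interfaces
    - PyWikipedia (http://pywikipediabot.sourceforge.net/)
    - python-wikitools (http://code.google.com/p/python-wikitools/) looked good to me, but it can never be bundled with PyMOL because of its GPL v3 license.
    - Other pages listed here: http://pypi.python.org/pypi?%3Aaction=search&term=wikipedia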
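For the "old-school" option, a sketch of pulling the Python code out of a wiki page with a regular expression. It assumes the raw wikitext has already been downloaded (for instance through MediaWiki's action=raw view) and that scripts are wrapped in source tags with lang="python", as on the existing script pages. The name extract_python_blocks is hypothetical.

```python
import re

# Matches <source lang="python"> ... </source> blocks in raw wikitext.
# DOTALL lets "." span line breaks; the match is non-greedy so that each
# block ends at its own closing tag.
SOURCE_BLOCK = re.compile(
    r'<source\s+lang="python"\s*>(.*?)</source>',
    re.DOTALL | re.IGNORECASE,
)


def extract_python_blocks(wikitext):
    """Return the contents of every Python source block, in page order."""
    return [block.strip("\n") for block in SOURCE_BLOCK.findall(wikitext)]
```

A regular expression like this is fragile (it ignores other attribute orders and nested tags), which is one argument for the libraries listed above.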
Github repository
To download and update the GitHub scripts, make a text file "github.sh" and make it executable:
chmod u+x github.sh
Put this in the file, and modify the two path variables at the top (pymolscripts and pymoldir):
#!/bin/bash -e
# Modify these two paths to match your own setup.
pymolscripts=/home/tlinnet/Software/pymol/Pymol-script-repo
pymoldir=/home/tlinnet/Software/pymol

if [ -d "$pymolscripts" ]; then
    echo "### Script library exists, updating it ###"
    cd "$pymolscripts"
    git pull
else
    echo "### Script library does not exist, downloading it ###"
    # apt-get is for Debian/Ubuntu; use your own package manager elsewhere.
    sudo apt-get install git
    git clone https://github.com/tlinnet/Pymol-script-repo.git "$pymolscripts"

    # Make the scripts importable from PyMOL by extending sys.path in ~/.pymolrc.
    # Each line is only appended if it is not already present.
    touch ~/.pymolrc
    t="'"
    if grep -Fxq "import sys" ~/.pymolrc
    then
        echo "# 'import sys' already exists in ~/.pymolrc #"
    else
        echo "# Adding 'import sys' to ~/.pymolrc #"
        echo "import sys" >> ~/.pymolrc
    fi
    if grep -Fxq "sys.path.append($t$pymolscripts$t)" ~/.pymolrc
    then
        echo "# sys.path.append($t$pymolscripts$t) already exists in ~/.pymolrc #"
    else
        echo "# Adding sys.path.append($t$pymolscripts$t) to ~/.pymolrc #"
        echo "sys.path.append($t$pymolscripts$t)" >> ~/.pymolrc
    fi
fi
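The "append the line only if it is not already there" step of the script above could also be done from Python, which may be handy if the setup is ever folded into PyMOL itself. This is a sketch with a hypothetical helper, ensure_line; like grep -Fx, it only treats an exact whole-line match as "already present".

```python
import os


def ensure_line(path, line):
    """Append `line` to the file at `path` unless an identical line exists.

    Returns True if the line was added, False if it was already present.
    The file is created if it does not exist.
    """
    content = ""
    if os.path.exists(path):
        with open(path) as handle:
            content = handle.read()
    if line in content.splitlines():
        return False
    with open(path, "a") as handle:
        # Start on a fresh line if the file does not end with a newline.
        if content and not content.endswith("\n"):
            handle.write("\n")
        handle.write(line + "\n")
    return True
```

For example, ensure_line(os.path.expanduser("~/.pymolrc"), "import sys") would add the import once and do nothing on later runs.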
Changes to Github repository
You can make changes online at https://github.com/tlinnet/Pymol-script-repo
Make yourself a user on https://github.com/
Locate your token at: Account settings -> Account admin -> API Token
Configure git
git config --global user.name "Your Name"
git config --global user.email you@example.com
git config --global github.token 0123456789yourf0123456789token
cat ~/.gitconfig
git remote show origin
git remote set-url origin https://GITUSERNAME@github.com/GITUSERNAME/Pymol-script-repo.git
git remote show origin
See also the cheat sheet at http://help.github.com/git-cheat-sheets/
Staging all new and modified files for the next commit
git add .
Checking the status of your repository
git status
Committing files
git commit -m "First import"
Pushing the changes to the remote repository
git push origin master
Read more here
http://learn.github.com/p/intro.html
http://gitref.org/branching/#merge
https://github.com/features/projects/codereview
Test to fetch script from github
<include src="https://github.com/tlinnet/Pymol-script-repo/blob/master/README" />