Bundle Example: Fetch from Network Database¶
This example describes how to create a ChimeraX bundle that retrieves data from a network source. In this example, the network source is HomoloGene from NCBI.
The steps in implementing the bundle are:
Create a bundle_info.xml containing information about the bundle,
Create a Python package that interfaces with ChimeraX and implements the data-fetching functionality, and
Install and test the bundle in ChimeraX.
The final step builds a Python wheel that ChimeraX uses to install the bundle. So if the bundle passes testing, it is immediately available for sharing with other users.
Source Code Organization¶
The source code for this example may be downloaded as a zip-format file containing a folder named tut_fetch. Alternatively, one can start with an empty folder and create source files based on the samples below. The source folder may be arbitrarily named, as it is only used during installation; however, avoiding whitespace characters in the folder name bypasses the need to type quote characters in some steps.
Sample Files¶
The files in the tut_fetch folder are:
tut_fetch - bundle folder
    bundle_info.xml - bundle information read by ChimeraX
    src - source code to Python package for bundle
        __init__.py - package initializer and interface to ChimeraX
        fetch.py - source code to retrieve HomoloGene entries
The file contents are shown below.
bundle_info.xml¶
bundle_info.xml is an eXtensible Markup Language format file whose tags are listed in Bundle Information XML Tags. While there are many tags defined, only a few are needed for bundles written completely in Python. The bundle_info.xml in this example is similar to the one from the Bundle Example: Add a Tool example with changes highlighted. For explanations of the unhighlighted sections, please see Bundle Example: Hello World, Bundle Example: Add a Command and Bundle Example: Add a Tool.
1<!--
2ChimeraX bundle names must start with "ChimeraX-"
3to avoid clashes with package names in pypi.python.org.
4When uploaded to the ChimeraX toolshed, the bundle
5will be displayed without the ChimeraX- prefix.
6-->
7
8<BundleInfo name="ChimeraX-TutorialFetch"
9 version="0.1" package="chimerax.tut_fetch"
10 minSessionVersion="1" maxSessionVersion="1">
11
12 <!-- Additional information about bundle source -->
13 <Author>UCSF RBVI</Author>
14 <Email>chimerax@cgl.ucsf.edu</Email>
15 <URL>https://www.rbvi.ucsf.edu/chimerax/</URL>
16
17 <!-- Synopsis is a one-line description
18 Description is a full multi-line description -->
19 <Synopsis>Example for fetching sequence alignment from HomoloGene</Synopsis>
20 <Description>Example code for implementing ChimeraX bundle.
21
22Implements capability for fetching and displaying sequence alignments
23from HomoloGene.
24 </Description>
25
26 <!-- Categories is a list where this bundle should appear -->
27 <Categories>
28 <Category name="General"/>
29 </Categories>
30
31 <!-- Dependencies on other ChimeraX/Python packages -->
32 <Dependencies>
33 <Dependency name="ChimeraX-Core" version="~=1.1"/>
34 <Dependency name="ChimeraX-Alignments" version="~=1.0"/>
35 </Dependencies>
36
37 <!-- Register HomoloGene as a fetch source. The downloaded file
38 will (almost) be in FASTA format and should be displayable
39 using ChimeraX alignment tools. If we were using a format
40 not supported by ChimeraX, we would need to supply
41 "DataFormat" and "Open" ChimeraXClassifiers as well. -->
42 <Providers manager="open command">
43 <Provider name="homologene" type="fetch" format_name="fasta" example_ids="87131" />
44 </Providers>
45
46 <Classifiers>
47 <!-- Development Status should be compatible with bundle version number -->
48 <PythonClassifier>Development Status :: 3 - Alpha</PythonClassifier>
49 <PythonClassifier>License :: Freeware</PythonClassifier>
50 </Classifiers>
51
52</BundleInfo>
The BundleInfo, Synopsis and Description tags are changed to reflect the new bundle name and documentation (lines 8-10 and 17-24).
The Providers section on lines 42 through 44 uses the Manager/Provider protocol to inform the “open command” manager that this bundle supports fetching data from a database named homologene (really HomoloGene, but the user will type “homologene”). The attributes usable with the “open command” manager (with type="fetch") are described in detail in Fetching Files.
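To make the declaration concrete, the sketch below parses only the Providers fragment with Python's standard xml.etree.ElementTree module and prints the attributes that the tag carries. It illustrates the data in the tag and is not how ChimeraX itself reads bundle_info.xml; the helper name list_providers is hypothetical.

```python
import xml.etree.ElementTree as ET

# The Providers fragment from bundle_info.xml, reproduced as a string.
FRAGMENT = """
<Providers manager="open command">
  <Provider name="homologene" type="fetch" format_name="fasta" example_ids="87131" />
</Providers>
"""


def list_providers(xml_text):
    """Return (manager, attributes) pairs for each Provider element.

    Hypothetical helper for illustration; ChimeraX does its own parsing.
    """
    root = ET.fromstring(xml_text)
    manager = root.get("manager")
    return [(manager, dict(provider.attrib)) for provider in root.iter("Provider")]


providers = list_providers(FRAGMENT)
for manager, attrs in providers:
    print(manager, attrs)
```

The name attribute is what the user types as the database name, and it is also the value later passed to run_provider() as its name argument.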
src¶
src is the folder containing the source code for the Python package that implements the bundle functionality. The ChimeraX devel command, used for building and installing bundles, automatically includes all .py files in src as part of the bundle. (Additional files may also be included using bundle information tags such as DataFiles as shown in Bundle Example: Add a Tool.) The only required file in src is __init__.py. Other .py files are typically arranged to implement different types of functionality. For example, cmd.py is used for command-line commands; tool.py or gui.py for graphical interfaces; io.py for reading and saving files, etc.
src/__init__.py¶
As described in Bundle Example: Hello World, __init__.py contains the initialization code that defines the bundle_api object that ChimeraX needs in order to invoke bundle functionality. ChimeraX expects the class of the bundle_api object to be derived from chimerax.core.toolshed.BundleAPI, with methods overridden for registering commands, tools, etc.
# vim: set expandtab shiftwidth=4 softtabstop=4:

from chimerax.core.toolshed import BundleAPI


# Subclass from chimerax.core.toolshed.BundleAPI and
# override the method for fetching from databases,
# inheriting all other methods from the base class.
class _MyAPI(BundleAPI):

    api_version = 1

    @staticmethod
    def run_provider(session, name, mgr):
        # 'run_provider' is called by a manager to invoke the
        # functionality of the provider. Since we only provide
        # a single provider to a single manager, we know this method
        # will only be called by the "open command" manager to
        # fetch HomoloGene data, and customize this routine accordingly.
        #
        # The 'name' arg will be the same as the 'name' attribute
        # of your Provider tag, and mgr will be the corresponding
        # Manager instance
        #
        # For the "open command" manager with type="fetch", this method
        # must return a chimerax.open_command.FetcherInfo subclass instance.
        from chimerax.open_command import FetcherInfo
        class HomoloGeneFetcherInfo(FetcherInfo):
            def fetch(self, session, identifier, format_name, ignore_cache, **kw):
                from .fetch import fetch_homologene
                return fetch_homologene(session, identifier, ignore_cache=ignore_cache, **kw)
        return HomoloGeneFetcherInfo()


# Create the ``bundle_api`` object that ChimeraX expects.
bundle_api = _MyAPI()
The run_provider() method is called by a ChimeraX manager when it needs additional information from a provider or it needs a provider to execute a task. The session argument is a Session instance, the name argument is the same as the name attribute in your Provider tag, and the mgr argument is the manager instance. These arguments can be used to decide what to do when your bundle offers several Provider tags (to possibly several managers), but since this bundle only declares one provider to one manager, we know it will be called by the “open command” manager to fetch HomoloGene data and don’t need to check the run_provider() arguments.
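For a bundle that does declare several providers, one possible arrangement is to dispatch on the name argument. The sketch below is only an illustration under stated assumptions: the FetcherInfo class here is a local stand-in so the code runs outside ChimeraX (a real bundle would import chimerax.open_command.FetcherInfo), and the second provider, other_db, is hypothetical.

```python
class FetcherInfo:
    """Stand-in for chimerax.open_command.FetcherInfo so the sketch runs anywhere."""

    def fetch(self, session, identifier, format_name, ignore_cache, **kw):
        raise NotImplementedError


class HomoloGeneFetcherInfo(FetcherInfo):
    def fetch(self, session, identifier, format_name, ignore_cache, **kw):
        # A real bundle would call fetch_homologene() here.
        return [], "fetched HomoloGene %s" % identifier


class OtherFetcherInfo(FetcherInfo):
    # Hypothetical second provider, present only to show the dispatch.
    def fetch(self, session, identifier, format_name, ignore_cache, **kw):
        return [], "fetched other %s" % identifier


# Map each Provider tag's name attribute to the class that handles it.
_FETCHERS = {"homologene": HomoloGeneFetcherInfo, "other_db": OtherFetcherInfo}


def run_provider(session, name, mgr):
    """Pick the FetcherInfo subclass that matches the Provider tag's name."""
    try:
        return _FETCHERS[name]()
    except KeyError:
        raise ValueError("unknown provider: %r" % name) from None


info = run_provider(None, "homologene", None)
models, status = info.fetch(None, "87131", "fasta", ignore_cache=False)
print(status)
```

A bundle that also serves other managers could extend the lookup key to include the manager, since mgr is passed in as well.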
When called by the “open command” manager (that was given the type="fetch" Provider tag), run_provider() must return an instance of a subclass of chimerax.open_command.FetcherInfo. The methods of that class are documented in detail in its API reference, but briefly:
The fetch() method is called to actually fetch the data and should return a (models, status message) tuple. Do not add the models to the session; that will be done by the calling function.
The ignore_cache argument indicates whether your routine should use locally cached data (if any) or instead ignore the cache and fetch the data again. Some types of data fetches may not be amenable to caching at all, but for those that are, caching is usually implemented automatically by having the fetching function use the chimerax.core.fetch.fetch_file() routine, which takes an ignore_cache keyword argument.
If there are fetch-specific keyword arguments that the open command should handle, then a fetch_args property should be implemented, which returns a dictionary mapping Python keyword names to Annotation subclasses. Such keywords will be passed to your fetch() method, along with format-specific keywords. Note that format-specific keywords are known from the open_args property of the bundle that opens the data’s format and should not be included in the dictionary returned by fetch_args, so it is rarely necessary to actually implement the fetch_args property.
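The effect of ignore_cache can be illustrated with a small standalone sketch. This is not the implementation of chimerax.core.fetch.fetch_file(); it is a simplified model, using only the standard library, of the behavior described above. The names cached_fetch and fake_download are hypothetical.

```python
import tempfile
from pathlib import Path


def cached_fetch(cache_dir, save_name, download, ignore_cache=False):
    """Return the path of a cached file, downloading only when needed.

    Hypothetical helper: 'download' is any callable returning bytes.
    """
    path = Path(cache_dir) / save_name
    if ignore_cache or not path.exists():
        path.write_bytes(download())
    return path


calls = []


def fake_download():
    # Stands in for a network request; records how often it is invoked.
    calls.append(1)
    return b">seq1\nACGT\n"


with tempfile.TemporaryDirectory() as cache_dir:
    cached_fetch(cache_dir, "87131.fa", fake_download)  # no cached copy yet
    cached_fetch(cache_dir, "87131.fa", fake_download)  # served from cache
    cached_fetch(cache_dir, "87131.fa", fake_download, ignore_cache=True)  # forced
    download_count = len(calls)

print(download_count)
```

Only the first and third calls reach the download function; the second is satisfied from the saved file.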
For this example, the format_name argument is not passed on because the bundle only supports FASTA format. All other arguments are passed through to fetch.fetch_homologene to actually retrieve and process the data.
src/fetch.py¶
1# vim: set expandtab shiftwidth=4 softtabstop=4:
2
3_URL = ("https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"
4 "?db=homologene"
5 "&id=%s"
6 "&rettype=fasta"
7 "&retmode=text")
8
9
10def fetch_homologene(session, ident, ignore_cache=True, **kw):
11 """Fetch and display sequence alignment for 'ident' from HomoloGene.
12
13 Use Python library to download the FASTA file and use ChimeraX
14 alignment tools for display.
15 """
16 # First fetch the file using ChimeraX core function
17 url = _URL % ident
18 session.logger.status("Fetching HomoloGene %s" % ident)
19 save_name = "%s.fa" % ident
20 from chimerax.core.fetch import fetch_file
21 filename = fetch_file(session, url, "HomoloGene %s" % ident, save_name,
22 "HomoloGene", ignore_cache=ignore_cache, uncompress=True)
23
24 session.logger.status("Opening HomoloGene %s" % ident)
25 models, status = session.open_command.open_data(filename, alignment=False, name=ident)
26 return models, status
The fetch_homologene function performs the following steps:
create a URL for fetching content for the given identifier and an output file name where the content will be saved (lines 17-19),
call chimerax.core.fetch.fetch_file() to retrieve the actual contents (lines 20-22),
update the status line (line 24),
open the saved file using the “open command” manager’s open_data() method (line 25), which returns a (models, status message) tuple, and
return the list of models created and the status message (line 26).
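The first of these steps can be tried outside ChimeraX. The sketch below reuses the _URL template from fetch.py and wraps the string formatting in a hypothetical helper, homologene_request, to show the URL and file name produced for the example identifier 87131. Note that the identifier is interpolated as-is, without URL encoding.

```python
_URL = ("https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"
        "?db=homologene"
        "&id=%s"
        "&rettype=fasta"
        "&retmode=text")


def homologene_request(ident):
    """Return the (url, save_name) pair that fetch_homologene builds first."""
    return _URL % ident, "%s.fa" % ident


url, save_name = homologene_request("87131")
print(url)
print(save_name)
```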
Building and Testing Bundles¶
To build a bundle, start ChimeraX and execute the command:
devel build PATH_TO_SOURCE_CODE_FOLDER
Python source code and other resource files are copied into a build sub-folder below the source code folder. C/C++ source files, if any, are compiled and also copied into the build folder. The files in build are then assembled into a Python wheel in the dist sub-folder. The file with the .whl extension in the dist folder is the ChimeraX bundle.
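When scripting around the build, it can be handy to locate the resulting wheel. The following sketch uses only the standard library; the helper name find_wheels is hypothetical, and the wheel file name in the demonstration is a placeholder rather than the name ChimeraX would actually generate.

```python
import tempfile
from pathlib import Path


def find_wheels(source_dir):
    """Return the .whl files in the dist sub-folder, sorted by name."""
    dist = Path(source_dir) / "dist"
    return sorted(dist.glob("*.whl")) if dist.is_dir() else []


# Demonstrate on a throwaway folder laid out like a built bundle.
with tempfile.TemporaryDirectory() as source_dir:
    dist = Path(source_dir) / "dist"
    dist.mkdir()
    # Placeholder file name, invented for this demonstration.
    (dist / "example_bundle-0.1-py3-none-any.whl").touch()
    wheel_names = [wheel.name for wheel in find_wheels(source_dir)]
    # A folder that was never built has no dist sub-folder.
    missing = find_wheels(Path(source_dir) / "nowhere")

print(wheel_names)
```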
To test the bundle, execute the ChimeraX command:
devel install PATH_TO_SOURCE_CODE_FOLDER
This will build the bundle, if necessary, and install the bundle in ChimeraX. Bundle functionality should be available immediately.
To remove temporary files created while building the bundle, execute the ChimeraX command:
devel clean PATH_TO_SOURCE_CODE_FOLDER
Some files, such as the bundle itself, may still remain and need to be removed manually.
Building bundles as part of a batch process is straightforward, as these ChimeraX commands may be invoked directly by using commands such as:
ChimeraX --nogui --exit --cmd 'devel install PATH_TO_SOURCE_CODE_FOLDER exit true'
This example executes the devel install command without displaying a graphics window (--nogui) and exits immediately after installation (exit true). The initial --exit flag guarantees that ChimeraX will exit even if installation fails for some reason.
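A batch script written in Python might assemble the same command line as an argument list, which avoids shell quoting. The sketch below is a minimal example under stated assumptions: the executable is assumed to be reachable as ChimeraX on the search path, the helper name devel_install_argv is hypothetical, and paths containing whitespace are rejected rather than quoted.

```python
import subprocess


def devel_install_argv(source_dir, executable="ChimeraX"):
    """Build the argument list for a non-interactive 'devel install'."""
    if any(ch.isspace() for ch in source_dir):
        # Quoting rules inside the ChimeraX command are not covered here.
        raise ValueError("source folder path must not contain whitespace")
    command = "devel install %s exit true" % source_dir
    return [executable, "--nogui", "--exit", "--cmd", command]


argv = devel_install_argv("/path/to/tut_fetch")
print(argv)

# To actually run the build (requires ChimeraX on the search path), uncomment:
# subprocess.run(argv, check=True)
```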
Distributing Bundles¶
Because ChimeraX bundles are packaged as standard Python wheel-format files, they can be distributed as plain files and installed using the ChimeraX toolshed install command. Thus, electronic mail, web sites and file sharing services can all be used to distribute ChimeraX bundles.
Private distributions are most useful during bundle development, when circulation may be limited to testers. When bundles are ready for public release, they can be published on the ChimeraX Toolshed, which is designed to help developers by eliminating the need for custom distribution channels, and to aid users by providing a central repository where bundles with a variety of different functionality may be found.
Customizable information for each bundle on the toolshed includes its description, screen captures, authors, citation instructions and license terms. Automatically maintained information includes release history and download statistics.
To submit a bundle for publication on the toolshed, you must first sign in. Currently, only Google sign-in is supported. Once signed in, use the Submit a Bundle link at the top of the page to initiate submission, and follow the instructions. The first time a bundle is submitted to the toolshed, it is held for inspection by the ChimeraX team, which may contact the authors for more information. Once approved, all subsequent submissions of new versions of the bundle are posted immediately on the site.
What’s Next¶
Bundle Example: Save a New File Format (previous topic)
Bundle Example: Fetch from Network Database (current topic)
Bundle Example: Define a Chemical Subgroup Selector (next topic)